Prediction of potential genes in microbial genomes Time: Thu May 19 22:44:07 2011 Seq name: gi|224461401|gb|ACDC01000001.1| Fusobacterium sp. 2_1_31 cont1.1, whole genome shotgun sequence Length of sequence - 11457 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 18.5 1 1 Op 1 1/0.500 + CDS 102 - 815 672 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 2 1 Op 2 . + CDS 818 - 3508 2610 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 3 1 Op 3 . + CDS 3520 - 5304 1918 ## FN1385 hypothetical protein 4 1 Op 4 . + CDS 5314 - 8715 3566 ## COG0587 DNA polymerase III, alpha subunit + Term 8730 - 8765 5.3 - Term 8718 - 8753 5.3 5 2 Tu 1 . - CDS 8783 - 10060 2384 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Prom 10241 - 10300 15.2 + Prom 10220 - 10279 20.9 6 3 Tu 1 . + CDS 10432 - 11442 1478 ## COG1052 Lactate dehydrogenase and related dehydrogenases Predicted protein(s) >gi|224461401|gb|ACDC01000001.1| GENE 1 102 - 815 672 237 aa, chain + ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 1 237 1 237 237 356 82.0 2e-98 MIYYIYHSGFVLELEKSILIFDFYRIPIDKKNEEESFISKFIKRTDKKVYVFSSHSHSDH FNKEILKWLNLNENIKYILSDDIKIHKHKNFYFTKEGDSFELDNLKISTFGSTDLGSSFY VNVEDKNIFHSGDLHLWHWEDDTPEEEKTMYNAYMSELEKIKKLDRIDIAFVPVDPRLGV NTLEGVELFYKVLKPKLIVPMHFSDDYSQMKNFIENFKNIKDVEVIEIDESMKKILE >gi|224461401|gb|ACDC01000001.1| GENE 2 818 - 3508 2610 896 aa, chain + ## HITS:1 COG:FN1386 KEGG:ns NR:ns ## COG: FN1386 COG0553 # Protein_GI_number: 19704721 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 8 896 1 892 892 1259 82.0 
0 MVDVSFYMLVEEEGSFSLALYDSEKNVIGNYSNLRQNVVNDYIENLENEREFFISWDEKK SEYLSIDSTLLKYLLEHGNFVNSDFEKIEKSELTNISILIRENSEIEDKLDIFIEINDNL LSKKNIIDNYIYSQGVFYEVKGLGEFTLDELFQKIDKYELETYCSLILKNYSNVELKYED YETINAEEKLAIPQIIIEKISFDNSLYLKINSIISTMDYDFFKKNNLENIVTVNEVEKKL EISKINLENLSSDMLEIVKVLVKLQKNTGLKSSYYIDNENFIILNEEIAKEFVKKELLQL ANKYSIIGTDKLRKYNIKAVKPRLSGKFSYHLNYLEGEVDIEIEGEKFSIQELLNKYRKD EYIVLSDGTNALINREYIEKLQRIFKDEDENKVKISFFDMPIVQDILDEKTFNNEFAGNK DFFEGINKINENDIVFPKLNATLRDYQKYGYKWLKYLTDNRLGACLADDMGLGKTLQAIA LISKTHEEKKKRSMVIMPKSLIFNWESEIKKFAPNLKIAVYYGINRELSILKKADVVLTT YGTIRNDIENLLKEKFDLLVLDESQNIKNINSQTTKAVLLLNAEKRVALSGTPVENNLLE LYSLFRFLNPEMFGTVQNFTNNYIIPIQKYSDTSTIEELRKKIYPFLLRRVKKEVLADLP DKIEKLVYVDMNDEHRKYYEEKRRYYYSLLENNTSSQGNFDKFFVLQAINELRHIVSSPE LDNNKIISSKKEVLIENVIEAIENDHKVLIFVNYLSSIESICNSLKENKIKFLKMTGQTK DRQSLVDKFQSDNRYKVFVMTLKTGGVGLNLVSADTIFIYDPWWNKTVENQAIDRAYRLG QDKTVFAYKMIMRNTIEEKILKLQEIKDKLLDDLISEDNLSTKNLSKNDIEFILGN >gi|224461401|gb|ACDC01000001.1| GENE 3 3520 - 5304 1918 594 aa, chain + ## HITS:1 COG:no KEGG:FN1385 NR:ns ## KEGG: FN1385 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 18 594 1 578 578 805 81.0 0 MMNELEHILSKRYKQEALFGLFQRYFVDWIADAYIAKDMNIFEISAINEKTDKEVLVKFL AEFYGNEEIFKKIFETLPEEVKEIFKVVVWEEKFPIKKEDLKKYLESYVDKFEKEAYAPR EEYLFFDLDEFDKDMNTSFSMKDDIARFIRNYIDTKPKDYYLHKAEEENIVFKLYKDNNE NEFINNMNFYLDFYNSGENPISSSGKILKDFKKNMQKHCGITEYYNDVKGLEFLKTETLC LILTLLEKKYRVNTYFNNKNIKNILNDFMTAETFDKSDNYIYTNLFLNFLKGTRNIWEHP ENIKEVLKSLVELLKEMPENEVVSIDNILKAFVYRGKNIELITFKDVKDYIYINEANGER AKITDYSQYKDYIIEPFIKSYIFLLGIFGVFEIFYEKPFFKKGLYLKNNYLSKYDGLKYV RLTNLGRFIFGHTERYELPKINEKAEIELDDKRQFVTIVGEAPARMMFFEKIGTKVKDNM FKLTYDSFIKGIKTYDELMERIEKFKENIDNKKFTPNWEDFFENLEKKFNSVKIEDDYIV LKLKNNKELIQTVIRDKRFKTLALKGEEYHLLVKRENLKELIKIFSEYGYYIVE >gi|224461401|gb|ACDC01000001.1| GENE 4 5314 - 8715 3566 1133 aa, chain + ## HITS:1 COG:FN1383 KEGG:ns NR:ns ## COG: FN1383 COG0587 # Protein_GI_number: 19704718 # Func_class: L Replication, recombination and repair # Function: DNA 
polymerase III, alpha subunit # Organism: Fusobacterium nucleatum # 1 1133 1 1133 1133 1801 83.0 0 MENNFVHLNLHTEYSLLEGVNSIDSFLTRAKELGMNSLAVTDYANMFCAIEFYEKAKKMG IKPIIGLELPLYEKEEQNIFTLTLLAKDYEGYKNLVKLASELYKKKDNGELRISKGILKE YSKGLIALSSSMKGEIGKAILMNFPSEKLDSIVDEYIEIFSKENFYLEIQANELPETKVI NDKFYDLAKLKNIELVATNNVHYVDRDGYELQDIVICIQSGWKLKDKNRKRAVSKELYLK SKEEMQRSLDERFHKAIENTNYIASLCNLEIEFGNLQFPYYEVPNQYSGMDEYLKSICYE NIKKIYKENLTKDILERLEYELSVIIKMGYSGYFIVVWDFIAYAKRNGIPVGPGRGSAAG SLVAYCLGITMIDPIRYNLLFERFLNPERISMPDIDIDICRERRDELIDYVVHKYGRERV AHIITFGRMKARAAIRDIGRVLDIDLKKIDRLSKLVSSFQTLEKTLKENVEVAKLYTTDI ELQKVIDLSIRIENKIRHVSTHAAGILITKEDLDRTVPIYLDEKEGVIATQYQMKELEDL GLLKIDFLGLKNLSNIQRTIDYIKKYKNIDIELYKIPLDDKKVFEMLSQGDSTGVFQLES TGIRKIMKRLKPNKLEDIVALLALYRPGPLQSGMVDDFINRKNGKEKIEYPHKNLEIILK ETYGVILYQEQVMKIASYMANYSLGEADLLRRAMGKKNFTIMRENREKFIQRAVENNYTE EKADEIFELIDKFAGYGFNKSHSVAYAMISYWTAYLKAHYPAFYFAAIMTSEISETGDVA YYFNDAKEHRISIYPPNVNTPSAYFEIKNDGISYSLAAIKNLGLNLAKKIVEDYEKYGAY TKLDEFVLRNKKNGMNKRALEALILSGALDELEGNRKEKFLSIDKVLDFVSKAPKTDEIQ QMNLFGAASKTIDKFVLTNSEDFSLDEKLTKEKEFLGFYLSSHPLDKYRDIVTIFSINKL SEIDMEETKVLKTFGTIMGLKKLLTKKDEQMALFSILCYDRVISCIAFPKTYEKFLEEII EKKTVYVEGKIQIDEYKGEKTTKLLIEKIISLDKLYDYPAKKLFVLIEEEDRYKYSRLRE LINSNKGNTDFIFAIKNKNEKRIQNTGTKVKLNREFLEQLVELMGIEKIKIQM >gi|224461401|gb|ACDC01000001.1| GENE 5 8783 - 10060 2384 425 aa, chain - ## HITS:1 COG:FN0488 KEGG:ns NR:ns ## COG: FN0488 COG0334 # Protein_GI_number: 19703823 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Fusobacterium nucleatum # 1 425 15 439 439 781 95.0 0 MSKETLNPLASGQQQVKKACDALGLDPAVYELLKEPQRIIEITIPVRMDDGSIKTFKGYR SAHNDAVGPFKGGIRFHQNVNSDEVKALSLWMSIKCQVTGIPYGGGKGGITVDPSELSQR ELEQLSRGWVRGMWKYLGEKVDVPAPDVNTNGQIMAWMQDEYNKLTGEQTIGVFTGKPLS YGGSQGRNEATGFGVAVTMREAFTALGKDLKGATVAVQGFGNVGKYSVKNIMKLGGKVVA VAEFEKGKGAFAVYKAEGFTFEELEAAKAAGSLTKVPGAKELTMDEFWALDVEAIAPCAL ENAITNHEAELIKAGIICEGANGPITPEADEVLYKKGIVVTPDVLTNAGGVTVSYFEWVQ 
NIYGYYWTEKEVEEKEERAMVDAFKPIWALKKEFDEKGQPISFRQATYMKSIKRIAEAMK IRGWY >gi|224461401|gb|ACDC01000001.1| GENE 6 10432 - 11442 1478 336 aa, chain + ## HITS:1 COG:FN0487 KEGG:ns NR:ns ## COG: FN0487 COG1052 # Protein_GI_number: 19703822 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 1 336 1 338 338 606 89.0 1e-173 MKVLFYGVREVEVPLFHEQNKKFGFDLELIPDYLNSKETAEKAKGFECVVLRGNCFATKE VLDMYKEYGVKYLFTRTVGTNHIDVKYAKELGFKLAYVPFYSPNAIAELAVSLAMSLLRH LPYTAEKFNKKDFTVDAKMFSREIRNCTVGVVGLGRIGFTAAKLFKGLGANVIGYDMFPK TGVEDIVTQVSMEELIAKSDIITLHAPFIKENGKIVTKEFLSKMKENSILINTARGELMD LEAVVAALESGHLAAAGIDTIEGEVNYFFKNFSNDEAKFKLEYPLFNKLIELYPRVLVTP HVGSYTDEAASNMIETSLENLKEYLDTGACKNDIKA Prediction of potential genes in microbial genomes Time: Thu May 19 22:44:17 2011 Seq name: gi|224461400|gb|ACDC01000002.1| Fusobacterium sp. 2_1_31 cont1.2, whole genome shotgun sequence Length of sequence - 4147 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 23 - 646 991 ## COG0035 Uracil phosphoribosyltransferase - Term 662 - 693 1.1 2 1 Op 2 . - CDS 704 - 949 424 ## PROTEIN SUPPORTED gi|237739934|ref|ZP_04570415.1| LSU ribosomal protein L31P - Prom 980 - 1039 11.0 3 2 Op 1 . - CDS 1052 - 1450 490 ## FN0481 hypothetical protein 4 2 Op 2 . - CDS 1467 - 1769 395 ## FN0480 hypothetical protein 5 2 Op 3 1/0.000 - CDS 1770 - 2219 471 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 6 2 Op 4 1/0.000 - CDS 2291 - 3346 1443 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis - Term 3353 - 3387 2.0 7 2 Op 5 . 
- CDS 3397 - 4146 1199 ## COG0739 Membrane proteins related to metalloendopeptidases Predicted protein(s) >gi|224461400|gb|ACDC01000002.1| GENE 1 23 - 646 991 207 aa, chain - ## HITS:1 COG:FN0483 KEGG:ns NR:ns ## COG: FN0483 COG0035 # Protein_GI_number: 19703818 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 207 8 214 214 399 98.0 1e-111 MSVIEINHPLIEHKMTILRSVETDTKSFRENLNEIAKLMTYEATKNLKLETTEVTTPLMK TQAYTLQDKVALVPILRAGLGMVDGILDLIPTAKVGHIGVYRNEETLEPVYYYCKLPTDI ASRKVILVDPMLATGGSAVYAIDYLKEQGVTDIIFMCLVAAPDGIAKLLNKHPDVPIYTA KIDQGLNGDGYIYPGLGDCGDRIFGTK >gi|224461400|gb|ACDC01000002.1| GENE 2 704 - 949 424 81 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739934|ref|ZP_04570415.1| LSU ribosomal protein L31P [Fusobacterium sp. 2_1_31] # 1 81 1 81 81 167 100 1e-41 MKKGIHPEFNVVVFEDMAGNQFLTRSTKVPKETTTFEGKEYPVIKVAVSSKSHPFYTGEQ RFVDTAGRVDKFNKKFNLGKK >gi|224461400|gb|ACDC01000002.1| GENE 3 1052 - 1450 490 132 aa, chain - ## HITS:1 COG:no KEGG:FN0481 NR:ns ## KEGG: FN0481 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 132 1 132 132 187 84.0 1e-46 MKKFFYILFFVISIFTFASDENGLGIVEDSDLRAAGVKVENIKKAKDLMKQVSSNYELRL LERKQLELQINKYILDNPEKYLKKIDELFDRIGAIEATIMKERLRSQIQMKKYITTDQYM KAKEIALKRLSK >gi|224461400|gb|ACDC01000002.1| GENE 4 1467 - 1769 395 100 aa, chain - ## HITS:1 COG:no KEGG:FN0480 NR:ns ## KEGG: FN0480 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 100 1 101 101 135 76.0 7e-31 MSPKEKVRANIYKALLEEEKRKNKRMSIFSVGLFFVGVVTMSTYNSFVNTVPSPEINTAS VITADDREALVTSIYDNPSVIEKKTTTLNPDELFIFNTQI >gi|224461400|gb|ACDC01000002.1| GENE 5 1770 - 2219 471 149 aa, chain - ## HITS:1 COG:FN0479 KEGG:ns NR:ns ## COG: FN0479 COG1595 # Protein_GI_number: 19703814 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Fusobacterium nucleatum # 1 149 1 149 
149 229 95.0 2e-60 MDFDNIYEEYFDRVYYKVLSVVKNDDDAEDICQETFISVYKNLSKFREESNIYTWIYRIA INKTYDFFKKRKVEFEINDDVLSLPEDVNFDTKVILQEKLKLISEKEREIVILKDIYGYK LKEIAEIKNMNLSTVKSVYYKALKDMGGN >gi|224461400|gb|ACDC01000002.1| GENE 6 2291 - 3346 1443 351 aa, chain - ## HITS:1 COG:FN0478 KEGG:ns NR:ns ## COG: FN0478 COG0821 # Protein_GI_number: 19703813 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Fusobacterium nucleatum # 2 350 5 353 354 583 91.0 1e-166 MTRIVKVGNLSIGGNNPIIIQSMTNTNSADVEATVKQINELEKAGCQLVRMTINNVKAAE AIKEIKKRVNLPLVADIHFDYRLALLAIENGIDKLRINPGNIGSDENVKKVVEAAKEKNI PIRIGVNSGSIEKEILEKYGKPCVEALVESALYHVRLLEKYNFFDIVISLKSSNVKMMVE AYRKISSLVNYPLHLGVTEAGTKFQGTVKSAIGIGALLVDGIGATLRVSLTENPVEEIKV AKEILKVLDLSDEGVEIISCPTCGRTEIDLIGLAKQVEEEFQNEKNKFKVAVMGCVVNGP GEAREADYGVAAGRGIGILFKKGEVVKKVSEENLLEELKKLIAEDLKNEKK >gi|224461400|gb|ACDC01000002.1| GENE 7 3397 - 4146 1199 249 aa, chain - ## HITS:1 COG:FN0477 KEGG:ns NR:ns ## COG: FN0477 COG0739 # Protein_GI_number: 19703812 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Fusobacterium nucleatum # 1 249 82 321 321 350 74.0 1e-96 ITFPSIDGLYYKLEKNDTLAKIAKKYGISVVDIVDYNNINPKKLKAGSTIFLKGVTLQKY KDVEGRLIAAQQAKEDKNKNKNKEKEKEKPEKPPKGAKGSAPPPPPPPQDDDDGGRSAAY SGAGFAYPVRYAGVSSPFGNRFHPVLKRYILHTGVDLVAKYVPLRAAKSGVVTFAGNMSG YGKIIIIRHDNGYETRYAHLSVISTNVGEHVNQGDLIGKTGNSGRTTGAHLHFEIRQNGV PKNPMKYLR Prediction of potential genes in microbial genomes Time: Thu May 19 22:44:29 2011 Seq name: gi|224461399|gb|ACDC01000003.1| Fusobacterium sp. 
2_1_31 cont1.3, whole genome shotgun sequence Length of sequence - 15903 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 4, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 441 338 ## COG0739 Membrane proteins related to metalloendopeptidases 2 1 Op 2 1/0.000 - CDS 488 - 1726 1551 ## COG1158 Transcription termination factor 3 1 Op 3 . - CDS 1747 - 3054 495 ## PROTEIN SUPPORTED gi|229879795|ref|ZP_04499292.1| SSU ribosomal protein S12P methylthiotransferase - Prom 3209 - 3268 13.3 + Prom 3257 - 3316 10.2 4 2 Tu 1 . + CDS 3342 - 3845 1030 ## COG0716 Flavodoxins + Term 3858 - 3919 14.4 - Term 3913 - 3946 2.1 5 3 Op 1 50/0.000 - CDS 3965 - 4315 575 ## PROTEIN SUPPORTED gi|237739944|ref|ZP_04570425.1| LSU ribosomal protein L17P 6 3 Op 2 26/0.000 - CDS 4343 - 5323 1496 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 7 3 Op 3 36/0.000 - CDS 5352 - 5939 978 ## PROTEIN SUPPORTED gi|237739946|ref|ZP_04570427.1| SSU ribosomal protein S4P 8 3 Op 4 48/0.000 - CDS 5982 - 6371 660 ## PROTEIN SUPPORTED gi|237739947|ref|ZP_04570428.1| SSU ribosomal protein S11P 9 3 Op 5 . - CDS 6417 - 6773 591 ## PROTEIN SUPPORTED gi|237739948|ref|ZP_04570429.1| SSU ribosomal protein S13P - Prom 6793 - 6852 10.6 10 4 Op 1 . - CDS 6969 - 7082 200 ## PROTEIN SUPPORTED gi|197735973|ref|YP_002164751.1| hypothetical protein FNP_0496 11 4 Op 2 . - CDS 7101 - 7349 269 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 12 4 Op 3 . - CDS 7414 - 9204 1764 ## FN1289 hypothetical protein 13 4 Op 4 . - CDS 9214 - 10764 1425 ## FN1291 hypothetical protein 14 4 Op 5 . - CDS 10748 - 12250 1343 ## FN1292 hypothetical protein 15 4 Op 6 . - CDS 12237 - 13676 1383 ## FN1292 hypothetical protein 16 4 Op 7 . - CDS 13709 - 15271 1478 ## FN1293 hypothetical protein 17 4 Op 8 . 
- CDS 15281 - 15832 242 ## PROTEIN SUPPORTED gi|229255399|ref|ZP_04379326.1| acetyltransferase, ribosomal protein N-acetylase Predicted protein(s) >gi|224461399|gb|ACDC01000003.1| GENE 1 3 - 441 338 146 aa, chain - ## HITS:1 COG:FN0477 KEGG:ns NR:ns ## COG: FN0477 COG0739 # Protein_GI_number: 19703812 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Fusobacterium nucleatum # 44 146 1 92 321 104 58.0 4e-23 MAYLLILAIVVFSFRLYMISSKEVVDTTQFTDYFQLDEADNGGLELTTSNFTTFEKEYNF VKEEKVEEDKKEGEKEKEKEKPAPPPPPKRAEQITYKVKKKDTIPAIAKRYGVKQDTILM NNKDALNNKMKVGDTITFPSIDGLYY >gi|224461399|gb|ACDC01000003.1| GENE 2 488 - 1726 1551 412 aa, chain - ## HITS:1 COG:FN0476 KEGG:ns NR:ns ## COG: FN0476 COG1158 # Protein_GI_number: 19703811 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Fusobacterium nucleatum # 1 412 1 413 413 677 85.0 0 MDILNKFLLKELQEIARIMDIEVSGQKKEELKALIIETLEENNTVLAYGVLDTAPEGFGF LKETTLGKNIYMSASQVKKFKLRRGDLILGEVRNPIGEEKNFAIRRVLRVNDDDLNKIAD RVPFEDLIPTYPREQIKLGLDHDNISGRILDLIAPIGKGQRSLIIAPPKAGKTTFISSIA NAIIKGEKDIDVWILLIDERPEEVTDIKENVEGATVFASTFDDDPKNHIKVTEEIIERAK MKVEDGENVVILLDSLTRLSRAYNIVIPSSGKLLSGGIDPMALYHPKNFFGAARNIKNGG SLTIIATILVDTGSKMDEVIYEEFKSTGNCDIYLDRQLAEFRVFPAIDITRSGTRKEELL LKKNQIEEIWNLRRLLNDYDNKVSSTAALIKAIKTTKNNDELLRQLPKVLYK >gi|224461399|gb|ACDC01000003.1| GENE 3 1747 - 3054 495 435 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229879795|ref|ZP_04499292.1| SSU ribosomal protein S12P methylthiotransferase [Slackia heliotrinireducens DSM 20476] # 1 435 18 444 446 195 28 2e-49 MKKASIITYGCQMNVNESAKIKKIFQNLGYDVTEETDDADAVFLNTCTVREGAATQIFGK LGELKALKEKKGTIIGVTGCFAQEQGEELVRKFPIIDIVMGNQNIGRIPQAIEKIENNES THEVYTDNEDELPPRLDAEFASDQTASISITYGCNNFCTFCIVPYVRGRERSVPLEEIVK DVEQYVNKGAKEIVLLGQNVNSYGKDFKNGDNFAKLLEEICKVEGDYIVRFVSPHPRDFT DDVIDVIAKNDKISKCLHLPLQSGSSQVLRKMGRGYTKEKYLALVDKIKSKIPDVALTAD IIVGFPGETEEDFLDTIDVVEKVSFDNSYMFMYSIRKGTKAATMDNQIDENVKKERLQRL 
MEVQNKCSFNESSKYKDKIVRVLVEGPSKKNKEVLSGRTSTNKIVLFKGDMNLKGQFVNV KINECKTWTLYGELV >gi|224461399|gb|ACDC01000003.1| GENE 4 3342 - 3845 1030 167 aa, chain + ## HITS:1 COG:FN0472 KEGG:ns NR:ns ## COG: FN0472 COG0716 # Protein_GI_number: 19703807 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 167 1 167 167 275 96.0 3e-74 MKTVGIFFGTTGGKTQEVVDILAAQLGDAQVFDVANGVDEMEMFDNIILASPTYGMGELQ DDWASVIDEVADMDFSGKVVAFVGVGDAAIFGGNYVESMKHFYDAVEPKGAKIVGFTSTD GYDFEASEAVIDGDKFMGLAIDASFDTDEITSKVEDWLENKVKDELL >gi|224461399|gb|ACDC01000003.1| GENE 5 3965 - 4315 575 116 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739944|ref|ZP_04570425.1| LSU ribosomal protein L17P [Fusobacterium sp. 2_1_31] # 1 116 1 116 116 226 100 1e-58 MNHNKSYRKLGRRADHRKAMLKNMTISLVKAERIETTVTRAKELRKFAERMITFGKKNTL ASRRNAFAFLRDEEAVAKIFNELAPKYADRNGGYTRIIKTSVRKGDSAEMAIIELV >gi|224461399|gb|ACDC01000003.1| GENE 6 4343 - 5323 1496 326 aa, chain - ## HITS:1 COG:FN1283 KEGG:ns NR:ns ## COG: FN1283 COG0202 # Protein_GI_number: 19704618 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Fusobacterium nucleatum # 1 326 17 342 342 563 96.0 1e-160 MLKIEKQAKQISITEVKESNYKGQFVVEPLYRGYGNTLGNALRRVLLSSIPGAAIKGMRI EGVMSEFTVMDGVKEAVTEIILNVKEIVVKAESSGERRMTLSVKGPKVVKAADIVADIGL EIVNPEQVICTVTTDRTLDMEFLVDTGEGFVVSEEIDKKDWPVDYIAVDAIYTPIRKVSY EIQDTMFGRITDFDKLTLNVETDGSIEIRDALSYAVELLKLHLDPFLEIGNKMENLRDEI EEIIEEPIDIQVIDDKSHDMKIEELDLTVRSFNCLKKAGIEDVSQLASLSLNELLKIKNL GKKSLDEILEKMKDLGYDLEKNGSPE >gi|224461399|gb|ACDC01000003.1| GENE 7 5352 - 5939 978 195 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739946|ref|ZP_04570427.1| SSU ribosomal protein S4P [Fusobacterium sp. 
2_1_31] # 1 195 1 195 195 381 100 1e-105 MARNRQPVLKKCRALGIDPVILGVKKSSNRQIRPNANKKPTEYAIQLREKQKAKFIYNVM EKQFRKIYEEAARKLGVTGLTLIEYLERRLENVVYRLGFAKTRRQARQIVSHGHIAVNGR RVNIASFRVKVGDVVSVIENSKNVELIKLAVEDATPPAWLELDRAAFSGKVLQNPTKDDL DFDLNESLIVEFYSR >gi|224461399|gb|ACDC01000003.1| GENE 8 5982 - 6371 660 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739947|ref|ZP_04570428.1| SSU ribosomal protein S11P [Fusobacterium sp. 2_1_31] # 1 129 1 129 129 258 99 1e-68 MAKKTVAKIKKKSKNIPNGVAHIHSTFNNTIVTITDVDGKVISWKSGGTSNFKGTKKGTP FAAQIAAEQAAQIAMENGMRKIEVKVKGPGSGREACIRSLQAAGLEVTKITDVTPVPHNG CRPPKRRRV >gi|224461399|gb|ACDC01000003.1| GENE 9 6417 - 6773 591 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739948|ref|ZP_04570429.1| SSU ribosomal protein S13P [Fusobacterium sp. 2_1_31] # 1 118 1 118 118 232 99 1e-60 MARIAGVDIPRNKRVEIALTYIYGIGRPTSQKILKEAGINFDTRVKDLTEEEVNKIREII KDIKVEGDLRKEVRLSIKRLMDIKCYRGLRHKMNLPVRGQSSKTNARTVKGPKKPIRK >gi|224461399|gb|ACDC01000003.1| GENE 10 6969 - 7082 200 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|197735973|ref|YP_002164751.1| hypothetical protein FNP_0496 [Fusobacterium nucleatum subsp. 
polymorphum ATCC 10953] # 1 37 1 37 37 81 100 3e-15 MKVRVSIKPICDKCKIIKRHGKIRVICENPKHKQVQG >gi|224461399|gb|ACDC01000003.1| GENE 11 7101 - 7349 269 82 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 10 81 1 73 73 108 71 3e-23 MNFYSMGGKMSKKDVIELEGTIVEALPNAMFKVELENGHTILGHISGKMRMNYIKILPGD GVTVQISPYDLSRGRIVYRKKN >gi|224461399|gb|ACDC01000003.1| GENE 12 7414 - 9204 1764 596 aa, chain - ## HITS:1 COG:no KEGG:FN1289 NR:ns ## KEGG: FN1289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 596 1 571 571 571 62.0 1e-161 MKVNGEDLSNIEKNVNEIEYYNSFTKIFKKISIIIIIILMVFSLFELISLYKMKSEYNLN QKILNNGQKYEKSIYIKYEGKIYCNSFGDIYQLKDVDIDSFKTFDTGDYRDNYIATDKNN VYLGNNIIPDLNPNRLKSLGSNYYSDGVNYYFLSDVYIQNEDISTWSIVKEYIIHFKKKQ LYFYPFKKIETTKALKGIENFRYLASDGEKVYYKGELIENADLYTLKAVDKYNDDYFYDK NNIYYKTKALNLSSNDNLNLVSVEQGERTYLYDGLNGNVSLEEYIFDKKYIPYQILGIDS AHVKDLLFVSKDGIFFYNPETKEQERAGDNIFKGKVENILSSVISDDKNIYYLHSYNIYR KKRTKHGYIDILVSKNIGIFSLGEKKDWEKIKDIDSGTTGQVWKKGNKYYYFDDLGVYQL IDDVVYEIVDNASLKYLLETNNINDDTIREFVRDKKLIAFKGEEVSTASIKYKESHVAEI FLAVFLTTFFGISILMISLKWKAQKKDREKLEEERKKIEKQMEFWDNYYNNNEEEKKEDE KIPTSYKSYDDEEEIKKEIDKIKPIVKNSDDIEGLKKREKKINSVIKNFNLDEEEK >gi|224461399|gb|ACDC01000003.1| GENE 13 9214 - 10764 1425 516 aa, chain - ## HITS:1 COG:no KEGG:FN1291 NR:ns ## KEGG: FN1291 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 72 479 2 407 411 382 53.0 1e-104 MQKNSRISFIKFIFIIYVIILIFLSLSYTLLLMKKLGSNSDEIESHGQKYGNTQFIKYQE KISIPVPSGGRYFLEKVDVDSFRVLDSQDYSDRSTLIVGLDKNSVYFGNIRIPDLDPNKL EVIGNGYYTDGINTYYCSDMSERNKNLSSPMEIFQTLIYAFSKTKRPQSYIYPYKKVETD KRLKAVDNLLFFATDGDNVYYKGEVLENADLNTLKPIDGQYTYFADKENVYYKSKLLPIK NNSNLKTVSLNPDDKFLYDEINGYVFIGDYSFDRKKAPYKIIGSNGTHLYSLIFVSDDGI YFYDSENKKQVKLKDNIFIGNIEEISPNVFIDDENMYYFQNYEIWKKYRNMVFLASRNTG VYSLGKKESWKKLADVGNENIGSIWQKDSEYYYFDNLENSSQTDDYRATIFKITDKKTLE 
NLLSYPEYISAEKIDEFILNKNFEEFKGEKLFIATIKFHSVFKIFLGFLLVLGFIFIVFF LYLAILNKKDVKNIDKMLLEKYRNIKPLSKDYNNKE >gi|224461399|gb|ACDC01000003.1| GENE 14 10748 - 12250 1343 500 aa, chain - ## HITS:1 COG:no KEGG:FN1292 NR:ns ## KEGG: FN1292 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 46 465 3 415 453 449 61.0 1e-124 MKKNNSDETLKFKKKRSSDTIFTFKIISAIILAIVFLIFLLVSSKTTSSDSYEIEEKGQR YGNSEFIKYQGKISVAIPSGGRYILENVDINSFRVLNSGDRNTRVIGLDKNSIYFGNIPI SDLDPNKLEVIGNGYYTDGTNTYFFSGVSEKNKNLSVPMKVFQSVIYAFSKTKKPQTYIY PYKKIDTDKRLKPISDFLSFATDGNNIYYEGEILENVDLSTLKAVDPYHEYFTDKENVYY KSKLLPIKNSGKLKIVSSEQGDEFLYDEANGYVFMENYSFDREKAPYKVLGNEGNHLYNL VFINNEGIYYYDNQKKKQLRVGDNIFVGNVEELSPNVFTDDENIYYFHAYEVQKKLKHSS GYVLASRNTVIYSLGKKDAWEKVKDIRSGTVGSIWKKENKYYYFDNLGIFQLIDNTIYEI RDKETLEYLLNYNEGSSKIRELIENEKLIKIEGEKKIEIRVKYTTFFLPFKISGLLAFIL GIIIAKVSHYFREKKNAKKF >gi|224461399|gb|ACDC01000003.1| GENE 15 12237 - 13676 1383 479 aa, chain - ## HITS:1 COG:no KEGG:FN1292 NR:ns ## KEGG: FN1292 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 45 473 2 443 453 591 78.0 1e-167 MKTNDLDDLFKKKKTSNTSFYLKIAMIVFVIFPLFFIPFFISIWINADNNTYEIKTNGEQ YEKGNFFKYQGKIYVFTLNDGMQELKNADIATFKPFEPEDYFTKNIALDKNSVYFENVII PDLNPNKLKVIGNGYYTDGTNTYFYSPFSELDKDSSKYIFPYKKIEGAKNLKALDNFGLF AVDGDNVYYKGEILNNADLNTLEIIDKNTEYFADKENVYYKSNLLPIKNSGKLKIVSSEH GDKFLYDEVNGYVFIGDYSFDREKAPYKVIGNNGTTLYNLIFIAKDGIYYYDNQKKQQLK AGDNIFIGNIEEITPNVFTDDENIYYFHAYDVSTATKKSIGELISKNTDICYLDKKEAWE KVADIKEASVASIWKKEDKYYYFNNLGIFPFMDNTIYEISDKETLNYLLSKADDKTDDIE ELIKDGKLIAVSGEKKMTITVKYKTDIVDKIFKYSIRIFLVAYLIFFIFKEFRSKNEKK >gi|224461399|gb|ACDC01000003.1| GENE 16 13709 - 15271 1478 520 aa, chain - ## HITS:1 COG:no KEGG:FN1293 NR:ns ## KEGG: FN1293 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 117 520 1 393 393 589 82.0 1e-166 MKINNQKNANIMIKAAGLSAIILIFLCFIVIFYIAFSGDNTSEIQENGERYGTSDFYKYK DKIYALVYGNGLLEVEGVDIPTFKVFDTEDDNGNVAYDKNRVYFGNIAVSDLDTDKLYYV 
GNNYYSDGTDSYFCSTSSEYNEELSAKSTIIQNISHFFFKTKRPQYYFYPYKKLETNKRL EKVEELKNSATDGEEVYYAGEKLVNADIYTIKTIEDALFYFADKENVYYKSKLLSFKNNG KLKVFHENDYNVYYLYDEESKNVYANDYLFETVNAPYKVVGVDGTHHFSLLFISKNGVYF YDPLKRKQERIGDNIFKGEIKEIYPDIFSDDENVYYLDVYEDWAKRSGNNPFSLMKKPLN GQLISRNTRIRYLDKKTAWENDWKKVADINFGGDGSIWKKGNKYYYFDIYGFNQNINRTI YEIVDKEVLDYLLNFSNLKDGYSINLPDKIRDFISEKKLIAFNGEVIMTATIHFIEDPYA YSIPKIIFISIAFLIGLYARYRFDIANFLKKRKKSKFSKK >gi|224461399|gb|ACDC01000003.1| GENE 17 15281 - 15832 242 183 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229255399|ref|ZP_04379326.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 12 163 4 152 175 97 36 4e-20 MNTEINISNVILETDRLILRAWEIRDLDDFFEYASINGVGEKAGWKHHKSKDESLEILKM FIDEKKVFAIVLKENQKVIGSIGIEECRQDLDKNLENLLGRELGYVLSKDYWNKGIMTEA VSKVIEYCFKTLKLNYLVATYFNYNIESKKVLEKLNFKFYKDIIIETRYNTKEESTLMLL KYN Prediction of potential genes in microbial genomes Time: Thu May 19 22:45:10 2011 Seq name: gi|224461398|gb|ACDC01000004.1| Fusobacterium sp. 2_1_31 cont1.4, whole genome shotgun sequence Length of sequence - 15471 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 9, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 47 - 571 581 ## FN1296 hypothetical protein - Prom 634 - 693 6.2 - Term 640 - 684 5.1 2 1 Op 2 . - CDS 697 - 1227 776 ## FN1296 hypothetical protein - Prom 1353 - 1412 19.5 + Prom 1348 - 1407 11.0 3 2 Tu 1 . + CDS 1591 - 1791 442 ## COG1278 Cold shock proteins + Prom 2027 - 2086 13.4 4 3 Tu 1 . + CDS 2116 - 3528 1762 ## Lebu_0877 hypothetical protein + Prom 3530 - 3589 6.2 5 4 Op 1 1/0.000 + CDS 3651 - 5114 2323 ## COG0516 IMP dehydrogenase/GMP reductase 6 4 Op 2 . + CDS 5130 - 5966 1264 ## COG2849 Uncharacterized protein conserved in bacteria 7 4 Op 3 . + CDS 5968 - 6390 336 ## FN1229 hypothetical protein + Term 6413 - 6461 6.3 + Prom 6392 - 6451 2.8 8 5 Op 1 . 
+ CDS 6472 - 7143 925 ## BCB4264_A2363 SMI1 / KNR4 family 9 5 Op 2 1/0.000 + CDS 7156 - 7833 739 ## COG0692 Uracil DNA glycosylase 10 5 Op 3 1/0.000 + CDS 7839 - 9296 2046 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 11 5 Op 4 . + CDS 9286 - 10122 1317 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase 12 6 Tu 1 . - CDS 10297 - 11079 646 ## COG0388 Predicted amidohydrolase - Prom 11235 - 11294 9.9 + Prom 11014 - 11073 9.2 13 7 Tu 1 . + CDS 11193 - 11960 935 ## gi|237739969|ref|ZP_04570450.1| predicted protein + Term 11993 - 12050 3.2 + Prom 12000 - 12059 5.7 14 8 Op 1 3/0.000 + CDS 12119 - 13747 2283 ## COG0281 Malic enzyme 15 8 Op 2 . + CDS 13771 - 14736 970 ## COG0679 Predicted permeases + Term 14831 - 14869 -0.9 + Prom 14763 - 14822 7.2 16 9 Tu 1 . + CDS 14912 - 15439 926 ## COG0778 Nitroreductase Predicted protein(s) >gi|224461398|gb|ACDC01000004.1| GENE 1 47 - 571 581 174 aa, chain - ## HITS:1 COG:no KEGG:FN1296 NR:ns ## KEGG: FN1296 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 164 8 162 169 90 43.0 2e-17 MKKIIIFFLMILSITIFSEEYKPYLKKNTNNKNLVFSAQIKDSKKVISIYKENKKLIYVY GSEGEKAEKIIIGTANKNLFKNENEIPLNENNNNKLTENFILFKVKNYTYLISFYNNYGV KENSYTLTVAKNDEEILFDKELDISIVYDNLFNTNFFKKLPYDNGVVAYYVTYD >gi|224461398|gb|ACDC01000004.1| GENE 2 697 - 1227 776 176 aa, chain - ## HITS:1 COG:no KEGG:FN1296 NR:ns ## KEGG: FN1296 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 21 176 9 169 169 98 39.0 7e-20 MKKLVLLLLVIVSAFSFGANFKPYLKGNTSNPDAKKILFSAQMESTKKVVTLYKDREKVV YVFGLEGKKPEITLEGVIGENLFFNADDTETYVGRFLVFMNEEHRYIVSFYNVNGKTRSY LLEAYKGTNPRPLYKKQLNNKTVYDKVFNDPNNAEGFNGLFYDQNYLDDESFYINY >gi|224461398|gb|ACDC01000004.1| GENE 3 1591 - 1791 442 66 aa, chain + ## HITS:1 COG:FN0528 KEGG:ns NR:ns ## COG: FN0528 COG1278 # Protein_GI_number: 19703863 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Fusobacterium nucleatum # 1 66 6 71 71 102 100.0 
2e-22 MKGTVKWFNKEKGFGFITGEDGKDVFAHFSQIQKEGFKELFEGQEVEFEITEGQKGPQAS NIVVIK >gi|224461398|gb|ACDC01000004.1| GENE 4 2116 - 3528 1762 470 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0877 NR:ns ## KEGG: Lebu_0877 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 470 1 454 454 139 25.0 2e-31 MRKILFAILCLLFVSCSNLYKANKAYERGDYVENVVLTFKYFDEKPENFKELNEKKKNEI NSKFSNIFEYYSKQKNSEKLEDINKANIELFTIYIASDNSQYAKEFQAEREFLASNNVKT LFNQALKTNKELFLQNIGLRDDHTYALKVIDHTINMNIAIDAVVESNKSLDRNKVELYNY FKKEIAKHRADGYISLAEVEEKEGSNKYLRSAQNLYYKANEIYSKYQKNYRNSYSKYESA KYKADLNDAEDNYNKGITEYRNAGSSKAKYRAANYYFKEAQKYIYNYKDTNKLLNETREK GYFKYSLNSNNVDVKNKISNDLNSIAYPVTNGIELFIDYRDGDYNYNTSSNTNTEQLKKE IQTGVDSTGKPIMKVYNFTKITTTIEEIGTIRYTFSVRGAYYNNNISNDITIKNTVNNVK YSGEVPPSSEYRNSDNKALGSNELKKKVEEKLKKEVNEHIDSMIKDLKRI >gi|224461398|gb|ACDC01000004.1| GENE 5 3651 - 5114 2323 487 aa, chain + ## HITS:1 COG:FN1231_3 KEGG:ns NR:ns ## COG: FN1231_3 COG0516 # Protein_GI_number: 19704566 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Fusobacterium nucleatum # 203 487 1 285 285 508 92.0 1e-144 MNGKILKEGITFDDVLLIPAKSDVLPNEVSLKTRLTKKITLNLPILSAAMDTVTESDLAI ALARQGGIGFIHKNMSIEEQAAEVDRVKRSESGMITNPITLNKDSRVYQAEELMSRYKIS GLPVIEDDGKLIGIITNRDIKYRKDLDQPVGDIMTSKGLITAPVGTNLEQAKEILLANRI EKLPITDQNGYLKGLITIKDIDNIVQYPNSCKDELGKLRCGAAVGVAPDTLDRVAALVKA GVDIITVDSAHGHSQGVINMIKEIKKHYPDLDIIGGNIVTAEAAEELIEAGASAVKVGIG PGSICTTRVVAGVGVPQLTAVNDVYEYCKTRDIGVIADGGIKLSGDIVKALAAGADCVML GGLLAGTKEAPGEEIILEGRRFKIYVGMGSIAAMKRGSKDRYFQAGEVDNSKLVPEGIEG RIAYKGSVKDVIFQLAGGVRAGMGYCGTKTIKDLQVNGKFVKITGAGLIESHPHDITITK EAPNYSK >gi|224461398|gb|ACDC01000004.1| GENE 6 5130 - 5966 1264 278 aa, chain + ## HITS:1 COG:FN1230 KEGG:ns NR:ns ## COG: FN1230 COG2849 # Protein_GI_number: 19704565 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 117 278 1 162 162 248 82.0 1e-65 
MNKINKFIILAGLLLSFSATAAEIKELESLETISKQILGETTSTKTKKEKAKETVKKEVT KKENKEEVKEESKKETEVKSENKASENEETVVNDIPDETATRVINKSEIVDFYEREVRDK IAYKEGSNTPFTGVFGIVIDDKIESYEEYKDGLLDGETAYFSKDKEVKLLSEMYSKGKLN GPQKTYYENGKLKSIVYYKNDRIDGIVEYDKSGKLLHKSIFENGTGDWKLYWSNGKVSEE GRYVSWKRDGVWKKYREDGSLDTILKYDNGRLLSEKWQ >gi|224461398|gb|ACDC01000004.1| GENE 7 5968 - 6390 336 140 aa, chain + ## HITS:1 COG:no KEGG:FN1229 NR:ns ## KEGG: FN1229 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 140 7 146 146 194 83.0 1e-48 MLISRVKQVYQYIFSKFDESNNSEIKKILSEEEFLIFSTMSNYDKVHSYSLYQKVKEEKT LSSEKLYLKLALLHDSGKGKVGLFRRIKKVLVGDKILEQHPNIAFEKLKNINFDLAELCL NHHNKDVDQKMKIFQELDDK >gi|224461398|gb|ACDC01000004.1| GENE 8 6472 - 7143 925 223 aa, chain + ## HITS:1 COG:no KEGG:BCB4264_A2363 NR:ns ## KEGG: BCB4264_A2363 # Name: not_defined # Def: SMI1 / KNR4 family # Organism: B.cereus_B4264 # Pathway: not_defined # 13 216 10 209 216 162 42.0 9e-39 MTEFNWDSFINELEKFQKGIENIGGHSRETIIETPAKEEEILEVEKKLGYTLPKDFRDIL LNYSSHFEYFWSTYRDEEEEQIEFPEKFCAIFAGNLHWGLKFLLDFEESRQGWVDVCYPD YNNEYDKVWHNKLAFYKVANGDHFAIELEKENYGKIVYLSHDGGDGHGHYIADNFKDLLS NWSKVGAVGGDDWQWEVFYTEGKGIDPDCENAKEWREYIFSKI >gi|224461398|gb|ACDC01000004.1| GENE 9 7156 - 7833 739 225 aa, chain + ## HITS:1 COG:FN1226 KEGG:ns NR:ns ## COG: FN1226 COG0692 # Protein_GI_number: 19704561 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Fusobacterium nucleatum # 1 225 1 225 226 383 89.0 1e-106 MSKINNDWKEILEEEFQKDYFVELKSILEKEYENYTVYPLKKDILNAFFLTPYSEVKVVL LGQDPYHQKGQAHGLAFSVNYGIKTPPSLLNMYKELHDDLGLYIPNNGFLEKWAKQGVLL LNTSLTVRDSEANSHSKIGWQTFTDNVIKKLNEREKPIIFILWGNNAKAKEKFIDTNKHY ILKGAHPSPLSANRGFFGCKHFSEVNRILKELKEKEIDWQIENKE >gi|224461398|gb|ACDC01000004.1| GENE 10 7839 - 9296 2046 485 aa, chain + ## HITS:1 COG:FN1225 KEGG:ns NR:ns ## COG: FN1225 COG0769 # Protein_GI_number: 19704560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # 
Organism: Fusobacterium nucleatum # 1 485 1 485 485 856 90.0 0 MNIFSGVEYEVLRDVDLNRKYDGIEYDSRKVKENYIFVALEGANVDGHDYIDSAVKNGAT CIIVSRKVEMKHKVSYVLIDEIRHKLGYIASNFYEWPQRKLKIIGVTGTNGKTSSTYMIE KLMGDTPITRIGTIEYKIGDEVFEAVNTTPESLDLIKIFDKTLKKKIEYVVMEVSSHSLE IGRVDVLDFDYALFTNLTQDHLDYHVTMENYFQAKRKLFLKLKDINNSVFNIDDKYGKRL YDEFIEDNPEIISYGIDGGDLEGEYLDDGYIDIKFKGKVEKVKFALLGDFNLYNTLGAVA IAIKMGISWEDILKRVSNIKAAPGRFEALNCGQDYKVIVDYAHTPDALVNVIVAARNIRN GNRIITIFGCGGDRDRTKRPIMAKAAEDLSDIVILTSDNPRTESPEQIFADVKIGFTKND DYFFEPDREKAIKLAINMAEKNDIILITGKGHETYHIIGTKKWHFDDKEIARREIVRRRM VENVN >gi|224461398|gb|ACDC01000004.1| GENE 11 9286 - 10122 1317 278 aa, chain + ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 1 278 9 286 286 548 97.0 1e-156 MLINDVNKVKVGNIVFGGKKRFVLIAGPCVMESQELMDEVAGGIKEICDRLGIEYIFKAS FDKANRSSIHSYRGPGLEEGMKMLAKTKEKFNVSVITDVHEAWQCKEVAKVADILQIPAF LCRQTDLLIAAAETGKAVNIKKGQFLAPWDMKNIVVKMEESGNKNIMLCERGSTFGYNNM VVDMRSLLEMRKFNYPVVFDVTHSVQKPGGLGTATSGDREYVYPLLRAGLAIGVDAIFAE VHPNPTEAKSDGPNMLYLKDLEEILKTAIEIDKIVKGV >gi|224461398|gb|ACDC01000004.1| GENE 12 10297 - 11079 646 260 aa, chain - ## HITS:1 COG:AF0115 KEGG:ns NR:ns ## COG: AF0115 COG0388 # Protein_GI_number: 11497735 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Archaeoglobus fulgidus # 4 260 5 254 257 123 35.0 4e-28 MKKKKFKIALAQIKIEQKNIEENCKKIFEKIEEAAKENVDIICFPELATIGYTITTDELK KLPEDFNNTFIEKLQEKAKFFKIHILVGYLESKTTKKSRDFYNSCIFIDDNGKILANARK FYLWKKEKTKFKAGNKFVVKNTKFGKIGILLCYDLEFPEPARIECLKGAEIIFVPSLWSF NAENRWHIDLAANSLFNLLFIAGCNAVGDSCCGKSKIVEPDGSTLIEASGTNEELLMATI DLEKISEVRAKIPYLTDLKK >gi|224461398|gb|ACDC01000004.1| GENE 13 11193 - 11960 935 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739969|ref|ZP_04570450.1| ## NR: gi|237739969|ref|ZP_04570450.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 255 1 255 255 468 100.0 1e-130 MERVNSENSHCNKNLNNYREERMIKKIFMCLMLVFAFTACQSLNYIKEKNETIQLVVKGN DNNIYMLGNNYDYQFSGKDADRLLRLSNFPKELNFSREQLKNASVNIHVDARDGSVGLDF GSRITINKKSRNNANYEKEQKVFYENLKNELNRRKVRYKIEENSEEWVIVLLDVPYFEGK VVKLQNRSEFLEKGKGQYIDVPSKLYLTDPPSQAVEGVVGGLMGAVVVPVKVVLAIPALV VLPFLIPFMKIGNTP >gi|224461398|gb|ACDC01000004.1| GENE 14 12119 - 13747 2283 542 aa, chain + ## HITS:1 COG:L121483 KEGG:ns NR:ns ## COG: L121483 COG0281 # Protein_GI_number: 15672882 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Lactococcus lactis # 4 542 2 540 540 710 64.0 0 MAKKSYEVLNDPFLNKGTAFTKEERKELELTGLLPPQIQTIEEQAEQVYAQYKSKEPLIN KRRFLMEIFDTNRTLFYYLFSQHVVEFMPIVYDPVIAENIENYSELFVNPQNAVYLSIDS PETIEESLRNATKDREIRLIVATDAEGILGIGDWGTNGVDISVGKLMVYTAAAGIDPKSV LPVVLDAGTNRETLLEDKLYLGNRHKRVYGDKYYDFVDKFVQTAEKLFPRLYLHFEDFGR SNAANVLHKYWKTYPVFNDDIQGTGIITLAGILGALKISGEKLTDQRYICFGAGTAGAGI ADRIYQEMLQQGLSENEARNRFYLVDKQGLLFDDMDDLTPEQKPFARKRTEFNNANELTN LEAAVKAVRPTILVGTSTQPNTFTETIVKEMASYTARPIIFPLSNPTKLAEATAENLIKW TEGKALVATGIPADPVEYNGVTYEIGQANNALIYPALGLGAIASTAKLMTNEMISKAAHS LGGIVDTTKPGAATLPPVSKLTEFSQRVAEAVGQCALDQKLNREDITDIKEAIEKIKWTP KY >gi|224461398|gb|ACDC01000004.1| GENE 15 13771 - 14736 970 321 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 11 318 7 299 301 73 24.0 6e-13 MEAFITSIGSILSIVLLIALGYILKEKQWFSDSFSGNISKLIMNIALPASIFVSVLKYLT LKSLLSLTGALVYTFLSVIIGYIFAYILVKIINVPVGRKGTFINTVVNANTIFIGLPLNI ALFGNQSLPYFLVYYVTNTVSTWAFGAILIGNDTNDKNKQGTTFDWKKLFPPPLLGFIVA LVFLFLSIPVPAFVNSTLGYLGGIVTPLSLIYIGIVLHNAGLKSIKFDRDTIFALIGRFI FSPIVMLILIKFASDILPLKELSAIEVKTFIVQSAAPALAVLPILVNEAKGDVEYATNVV TTSTLLFVIVIPIITTLLGGI >gi|224461398|gb|ACDC01000004.1| GENE 16 14912 - 15439 926 175 aa, chain + ## HITS:1 COG:FN1223 KEGG:ns NR:ns ## COG: FN1223 COG0778 # Protein_GI_number: 19704558 # Func_class: C Energy production 
and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 175 1 175 175 310 82.0 6e-85 MNEVLKAIKERRSIRKYKSDMLPKEIIDQVIESGLYAASGKGQQSPIIISVTNKELRDKL SRMNCEIGGWKEGFDPFFNAPVVLVVLAPKDWANKAYDGSLVMGNMMLAAHALNIGSCWI NRARQEFETEEGKEILKSLGIEGEYEGIGHCILGYVDGEYPSVPARKANRVYYVE Prediction of potential genes in microbial genomes Time: Thu May 19 22:45:41 2011 Seq name: gi|224461397|gb|ACDC01000005.1| Fusobacterium sp. 2_1_31 cont1.5, whole genome shotgun sequence Length of sequence - 1973 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 88 - 957 1123 ## COG3878 Uncharacterized protein conserved in bacteria - Prom 981 - 1040 11.2 + Prom 982 - 1041 6.7 2 2 Op 1 . + CDS 1086 - 1595 522 ## gi|237739974|ref|ZP_04570455.1| conserved hypothetical protein + Prom 1597 - 1656 3.4 3 2 Op 2 . + CDS 1676 - 1825 153 ## + Term 1934 - 1971 0.4 Predicted protein(s) >gi|224461397|gb|ACDC01000005.1| GENE 1 88 - 957 1123 289 aa, chain - ## HITS:1 COG:FN1221 KEGG:ns NR:ns ## COG: FN1221 COG3878 # Protein_GI_number: 19704556 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 289 1 288 288 379 69.0 1e-105 MDFKELLTKILSEVKKDEITIFTEPNEDNEILNKSKIGGKPYLPKDFVWPYYQELPLSFL AQINLEEVKSLDKDNLLPDKGMLYFFYELETEEWGFKPENKGCSKVLYFEDTTNFSLIDF PEDMEDYNIVPEFKVNFKSNISYPAYENFEKLNENDVLLEKYETFEGYDELNDNFFDNYY DFYEEYMDGLESHTKLLGYPDVVQNSMEEECVEVTRDFDMEAVKASPKKYKEEIKKAAEN WILLFQMDTVETDDYELMFGDSGHIYFWIKKEDLKNKNFDNVWLILQSC >gi|224461397|gb|ACDC01000005.1| GENE 2 1086 - 1595 522 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739974|ref|ZP_04570455.1| ## NR: gi|237739974|ref|ZP_04570455.1| conserved hypothetical protein [Fusobacterium sp. 
2_1_31] # 15 169 1 155 155 260 100.0 2e-68 MREYNFKASDSTGEMLMGLGFPFSFLGVAGIILTLRLILFPKIKYSSYVNNIYLENFLIV VPALIITAYIMKLIKKYAIKNYHIYEDKEILKIENDKKIIELAYTAIKDIKIDKKGNKIS KCYKLVIKTNSKDFKFFVRPKRNYFGGADDIDFDNLENFYFFLREKISK >gi|224461397|gb|ACDC01000005.1| GENE 3 1676 - 1825 153 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKKILLPVFLFILIVFVVVETLKIGVLIQNKVSKEMISTSMERNMFFN Prediction of potential genes in microbial genomes Time: Thu May 19 22:45:57 2011 Seq name: gi|224461396|gb|ACDC01000006.1| Fusobacterium sp. 2_1_31 cont1.6, whole genome shotgun sequence Length of sequence - 14618 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 5, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 255 277 ## CLK_A0269 putative IS transposase - Prom 378 - 437 17.3 + Prom 597 - 656 14.2 2 2 Op 1 . + CDS 688 - 1050 137 ## gi|237739977|ref|ZP_04570458.1| predicted protein 3 2 Op 2 . + CDS 1066 - 1407 200 ## gi|237739978|ref|ZP_04570459.1| predicted protein + Prom 1416 - 1475 8.5 4 3 Op 1 . + CDS 1512 - 1895 78 ## gi|237739979|ref|ZP_04570460.1| conserved hypothetical protein 5 3 Op 2 . + CDS 1914 - 2267 136 ## gi|237745328|ref|ZP_04575809.1| predicted protein 6 3 Op 3 16/0.000 + CDS 2326 - 3153 1280 ## COG0207 Thymidylate synthase 7 3 Op 4 1/0.000 + CDS 3153 - 3647 682 ## COG0262 Dihydrofolate reductase 8 3 Op 5 1/0.000 + CDS 3660 - 5018 1788 ## COG0569 K+ transport systems, NAD-binding component 9 3 Op 6 . + CDS 5053 - 6408 1825 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 10 3 Op 7 . + CDS 6430 - 6825 391 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 6854 - 6913 13.4 11 4 Op 1 1/0.000 + CDS 6939 - 10490 4461 ## COG1196 Chromosome segregation ATPases 12 4 Op 2 . + CDS 10523 - 11527 1083 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 13 4 Op 3 . + CDS 11533 - 11769 245 ## FN1131 hypothetical protein 14 4 Op 4 . 
+ CDS 11678 - 12307 539 ## FN1131 hypothetical protein 15 4 Op 5 . + CDS 12304 - 12885 694 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase + Term 13132 - 13183 5.1 16 5 Tu 1 . + CDS 13245 - 14594 885 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 Predicted protein(s) >gi|224461396|gb|ACDC01000006.1| GENE 1 3 - 255 277 84 aa, chain - ## HITS:1 COG:no KEGG:CLK_A0269 NR:ns ## KEGG: CLK_A0269 # Name: not_defined # Def: putative IS transposase # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 1 84 1 85 480 79 55.0 5e-14 MANYVLTLALKTELWQEHILEKRLNIARMIYNSCLSEILKRHRKMINSSEYKEISNLDKK EQSKRYKELDKKYSISKFELNKYV >gi|224461396|gb|ACDC01000006.1| GENE 2 688 - 1050 137 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739977|ref|ZP_04570458.1| ## NR: gi|237739977|ref|ZP_04570458.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 120 1 120 120 125 100.0 1e-27 MYLFFLTFFLLLLISAIFFTIFEINFVKHFLKIENTKYIVLLRILETMTPFVTLLIASGP RAILKSVFPVFCSLCFLYIIILIIEIFRKKMNMKELIVNSVLCFIDVALVTIGLIMIFGF >gi|224461396|gb|ACDC01000006.1| GENE 3 1066 - 1407 200 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739978|ref|ZP_04570459.1| ## NR: gi|237739978|ref|ZP_04570459.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 113 1 113 113 102 100.0 6e-21 MQEFLFGSIFLVLIISGIFSFFEIAFIRKFFEIKSTKYIKLLKILEILFFLMIFFSEILF IALTFLYFLVLISDFKKKIISKEELIINTLFYFIDILLIILAMLLILRNLPSI >gi|224461396|gb|ACDC01000006.1| GENE 4 1512 - 1895 78 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739979|ref|ZP_04570460.1| ## NR: gi|237739979|ref|ZP_04570460.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 127 1 127 127 108 100.0 1e-22 MDFLVLLFFILFFFWVILTIFEVTIISGMKVSTFKYIKLLKFLEFFYVILTIISIVFYLY INVEIFSYFYYLLSIIIYFWILIHDFWKKKITKKDFIIYFLYFFIDIILIIILLHLIMIL MSDFPSV >gi|224461396|gb|ACDC01000006.1| GENE 5 1914 - 2267 136 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237745328|ref|ZP_04575809.1| ## NR: gi|237745328|ref|ZP_04575809.1| predicted protein [Fusobacterium sp. 
7_1] # 1 117 1 117 117 110 88.0 3e-23 MIAFIFTVLLLIATIGGLFTFFEICILKLFFKIENLKYIKFLKILEIMIIIISCITFISL KILIIFLSLIYFIILIYDFYKKKIDIKNFIINFVFLFGDFYVMNLAIKITSQKLPNF >gi|224461396|gb|ACDC01000006.1| GENE 6 2326 - 3153 1280 275 aa, chain + ## HITS:1 COG:FN0240 KEGG:ns NR:ns ## COG: FN0240 COG0207 # Protein_GI_number: 19703585 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Fusobacterium nucleatum # 1 275 1 275 275 532 93.0 1e-151 MKAKFDKIYKEIVDTIAEKGIWSEGNVRTKYADGTAAHYKSYIGYQFRLDNSDDEAHLIT SRFAPSKAAIRELYWIWILQSNNVNVLNDLGCKFWDEWKQEDGTIGKAYGYQIAQETYGQ KSQLHYVINELRKNPNSRRIMTEIWVPNELSEMALTPCVHLTQWSVIGNKLYLEVRQRSC DVALGLVANVFQYAVLHKLVALECGLEAADIIWNIHNMHIYDRHYDKLIKQVNGETFEPA KIKINNFKSIFDFKPDDVEIVDYKYGEKVNYEVAI >gi|224461396|gb|ACDC01000006.1| GENE 7 3153 - 3647 682 164 aa, chain + ## HITS:1 COG:FN0241 KEGG:ns NR:ns ## COG: FN0241 COG0262 # Protein_GI_number: 19703586 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Fusobacterium nucleatum # 1 164 1 164 164 273 88.0 8e-74 MEKKYYKNLKMIVCVGKDNLIGDRTPDENSNGMLWHIKEELMYFKERTMGNTVLFGGTTA KYVPVELMRKNREVIVLHRTMDVPKLIEDLTQENKTIFVAGGYSIYKYFLDNFEIDEIFL STIKDSVEVKDAVEPLYLPNIEEYGYKVVEKKEYDEFIAYVYKK >gi|224461396|gb|ACDC01000006.1| GENE 8 3660 - 5018 1788 452 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 452 1 452 452 708 94.0 0 MKIVIVGAGKVGELLCRDLSLEGNDIILIEQDAKILEKILANNDIMGFVGSGVSYDAQME AEVPKADVFIAVTEKDEINIIASVIAKKLGAKYTIARVRSTDYSSQLNFMTESLGIDLVI NPELEAAKDIKQNIDFPEALNVENFLDGRLKLVEFHIDEDSILDNVSLFDFKQKFFPNLL VCIIKRGDEVIIPSGNSVIKGDDRIYITGSNSEIIKFQDALGKDRRKIKSAFIIGAGIIS HYLAEELLKDKIAVKIVEMNPKKANKFSEYLPNATIINADGSNEEILKEENFQNYDSCIS ITGIDEVNMFISIYAKKIGIKKIITKLNKLSFVDILGENSFQSIITPKKIIADKIVRVVR SIANKKKNLIENFYRLENNTVEAIEILVNSDSKINNIPLKDLKIKKNLIIAYIVRNNVAI 
FPKGTDVINEGDRVIIITKESFFDDINNIVAE >gi|224461396|gb|ACDC01000006.1| GENE 9 5053 - 6408 1825 451 aa, chain + ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 1 451 1 451 451 710 85.0 0 MNKVSINNFSEIEIEILRKLNKYGKGYIVGGSIRDILLGLKPKDIDFTTNIPYETLKDLF SEYNPKETGKAFGVLRIRVNEIDYEIAKFREDNYEEKDGLKIVPEDNKVDFVDDIKEDLA RRDFSINAMAYNEADGIVDLYNGQKDIENKVINFVGNAEERIIEDPLRILRAFRFMSRLG FSSSENTIEAIKKQKDLLKSIPEERITMEFSKLLLGENVKNTLTAMKDTGVLELIIPEFK ATYDFEQHNPHHNLDLFNHIISVVSKVPADLELRYTALLHDIAKPLVQTFDENGIAHYKT HEIVGADMARDILTRLKLPVKLIETVEDIIKKHMVLYRDVTDKKFNKLLSEMGYDNLLRL IEHCNADNGSKNNEVVNPENDLHERLKRAVEKQMQVTVNDLTLNGKDLIDMGFKGTEIGK IKGELLDKYLSEEIPNEKEAMLAYVKEKYLK >gi|224461396|gb|ACDC01000006.1| GENE 10 6430 - 6825 391 131 aa, chain + ## HITS:1 COG:PM1553 KEGG:ns NR:ns ## COG: PM1553 COG0454 # Protein_GI_number: 15603418 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Pasteurella multocida # 1 125 1 127 130 101 40.0 3e-22 MITYKIAKSFDVEKIIEVFESSGIVRPTKEKERIKSMFENANLVYFAYDNGELIGLARCV TDFSYCCYLSDLAVKKDYQKQGIGKMLIEKVKEHIGEKVALILLSASSAMNYYPKVNFEK ADNAFIIKRKS >gi|224461396|gb|ACDC01000006.1| GENE 11 6939 - 10490 4461 1183 aa, chain + ## HITS:1 COG:FN1129 KEGG:ns NR:ns ## COG: FN1129 COG1196 # Protein_GI_number: 19704464 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Fusobacterium nucleatum # 1 1183 11 1193 1193 1444 84.0 0 MYLKAVEINGFKSFGERVYIDFNRGITSIVGPNGSGKSNILDAVLWVLGEQSYKNIRAKE SQDVIFSGGKEKKAATKAEVSLIIDNSDRYLDFDNDIVKITRRIHITGENEYLINDSKSR LKEIGNLFLDTGIGKTAYSVIGQGKVERIINSSPKEIKNIIEEAAGIKKLQANRLEAQKN LGNIEVNLDKVEFILNETRENKNKIEKQAELAQKYIDLKDEKSSLAKGIYITELEQKEKN LVENEDIKVKSEEECSILQEKFDKTLNRLTTIDLEKEEVKKQKILIDSRNKELKDVISTK 
ETEQAVTRERLDNFKKDKLLKEEYSLHLENKIEKKLEEINILIAKKEELSKNILEMEAAN KEFERKINELEAIKVEKTDLIESRNKKIRDLELEKQLSSNEIENNERKLKSSLDEVEILK KELDEIAKKEIANNEEKDLLNSQIKAKQEELAKTEERNEFLVNQLSEISKTINKLSQDIR EYEYQEKTSSGKLEALIRMEESNEGFFKSVKEVLNSGISGIDGVLISLIKFDDKLAKAIE AAVSGNLQDIIVEDKEVAKKCIAFLTERKLGRASFLALDTIKVSRREFKGNMPGVLGLAA DLVSAEDKYKKVVDFVFGGLLIVENIDVATDILNKNLFAGNIVTVNGELVSSRGRITGGE NQKSSINQIFERKKEIKVLEEKVSNLKSKIVEESKRREDLSIKLENYENEIDKIDSLEDN IRKKMELLKKDFENLSEKSEKISKELRNIKFNIDDAEKYKTSYQDRINSSVSNIEEIEKH INSLRKDLEADELTLKETLTSIDELNKQFSDTRIIFLNNKNSIEQYERDIISKENENSDL KEEKEKNSNVVMELSQNIEELEKNEEQLQKEIEEHIKIYNSENRDIEVLNERENNLSNEE RELSKDKSKLETDLLHSNDRLEKITEVIEKIKTDIENINEKLTELTDVTAKAVEIEKLKG SKDYLRSLENKINNFGDVNLLAINEFKELKEKYDYLARERDDVVKSRKQVMDLIQEIDER IHEDFHTTYENINENFNKMCEETIRNTEGRLNIINPEDFDNCGIEIFVKFKNKKKQPLSL LSGGEKSMVAIAFIMAIFMYKPSPFTFLDEIEAALDEKNTKNLLAKLRDFTDKSQFILIT HNKETMKESDSIFGVTMNKEIGISKIVSPDKITKILDSNKESN >gi|224461396|gb|ACDC01000006.1| GENE 12 10523 - 11527 1083 334 aa, chain + ## HITS:1 COG:FN1130 KEGG:ns NR:ns ## COG: FN1130 COG1663 # Protein_GI_number: 19704465 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Fusobacterium nucleatum # 10 334 1 325 325 590 92.0 1e-168 MKLLSYIYLLITTIRNFLYDEKILPIRKVPDVEVICIGNVSVGGTGKTPAVHFFVKKLLA KGRKVAVVSRGYRGKRKRDPLLVSDGMVIFATAQESGDESYLHALNLKVPVIVGADRYKA CMFAKKHFDIDTIVLDDGFQHRKLYRDRDVVLIDATNPFGGGYVLPAGLLREDFRRAARR AYEFIITKSDLVNERELRRIKNYLRKKFKKEVSVAKHGISCLCDLKGNMKPLFWVKGKKV LIFSGLANPLNFEKTVISLAPSYIERIDFKDHHNFKPKDIALVKKKAEKMDADYIITTEK DLVKLPDNLNINNLYVLKIEFTMLEDNTLKDMKG >gi|224461396|gb|ACDC01000006.1| GENE 13 11533 - 11769 245 78 aa, chain + ## HITS:1 COG:no KEGG:FN1131 NR:ns ## KEGG: FN1131 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 78 1 78 257 101 76.0 9e-21 MKKDVKVEFLKEKNLDACIELIKEKGKFNILSEYANFYDRRTYFKVNENGDIFQKTYNPI TLLYLFCDDEKKIGRLSF >gi|224461396|gb|ACDC01000006.1| GENE 14 11678 - 12307 539 209 aa, chain + ## HITS:1 COG:no 
KEGG:FN1131 NR:ns ## KEGG: FN1131 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 209 51 257 257 209 66.0 4e-53 MEIYSKKHIILSLYFIYFVMMKKKLADYLFKYSYVEEKQNIKKIDRASNLDVETLKKNLM KTLTNSHLDFSKIFAKELFLRDKKAFFELMYNFSFMGNPKDLKVLFVYALEEIFNQINYD ENIFYTIIAYFTKFRDDYSVYMNSTDDSIKFDIENYNEDKKIYLNIVEKIFTRYNLKNEN KFKVSLCRYFENDFELNPDLKDLLKGKDI >gi|224461396|gb|ACDC01000006.1| GENE 15 12304 - 12885 694 193 aa, chain + ## HITS:1 COG:FN1132 KEGG:ns NR:ns ## COG: FN1132 COG1057 # Protein_GI_number: 19704467 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Fusobacterium nucleatum # 1 193 1 193 193 285 87.0 3e-77 MRIAIYGGSFNPMHIGHEKIVDYVLNNLNMDKIIIIPVGIPSHRENNLEQSDTRLKICKE IFKGNKKIEVSDIEIKSEGKSYTYDTLLKLIDLYGENNEFFEIIGEDSLKSLKTWKNYEE LLKICKFIVFRRKDDKNIQIDEDFLNNKNIIILENEYYDISSTEIRNMVKNNEDISAFVN KKVKKLIEKEYLD >gi|224461396|gb|ACDC01000006.1| GENE 16 13245 - 14594 885 449 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 3 443 5 440 456 345 41 1e-94 MVNLIASINSLFWGSLLILLLVGTGIFFTIRLRFVQVRKFRKGITQLTGDFDLNGKDADH NGMSSFQALATAIAAQVGTGNLAGAATAIVSGGPGAIFWMWVSAFFGMSTIYAEAILSQL FKKKVEGEVTGGPAYYIEELFNKGILAKVLAVFFSLSCILALGFMGNGVQANSIGEAVQN AFNISPYITGAVVALLGGFVFFGGLKRIASFTEKVVPVMAGLYILICVVIIVINYANILT AFESIFVNAFSTKSILGGFLGMGVKKAIRYGVARGLFSNEAGMGSTPHAHAIAKVKNPVE QGNVALITVFIDTFVVLTLTALVILTANVGDGTLTGITLTQKSFEAALGYSGNIFIAVAL FFFAFSTIIGWYFFGEANIKYLFGKKAINIYRVLVMISIFIGSTQKVDLVWELADLFNGL MVIPNLIALLLLNKLVLETSDEYDKIHNL Prediction of potential genes in microbial genomes Time: Thu May 19 22:46:31 2011 Seq name: gi|224461395|gb|ACDC01000007.1| Fusobacterium sp. 2_1_31 cont1.7, whole genome shotgun sequence Length of sequence - 7697 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 90 - 284 503 ## FN1309 hypothetical protein 2 1 Op 2 4/0.000 - CDS 352 - 1158 746 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 3 1 Op 3 2/0.000 - CDS 1160 - 1759 463 ## COG0558 Phosphatidylglycerophosphate synthase 4 1 Op 4 . - CDS 1761 - 3461 2223 ## COG0500 SAM-dependent methyltransferases - Prom 3554 - 3613 17.6 + Prom 3540 - 3599 10.8 5 2 Tu 1 . + CDS 3720 - 4934 1623 ## COG1171 Threonine dehydratase + Term 4940 - 4975 4.7 - Term 4926 - 4963 5.1 6 3 Op 1 . - CDS 4971 - 6353 1829 ## COG1262 Uncharacterized conserved protein - Prom 6390 - 6449 11.0 7 3 Op 2 . - CDS 6483 - 7358 893 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 7558 - 7617 79.6 + TRNA 7541 - 7617 72.9 # Arg CCT 0 0 Predicted protein(s) >gi|224461395|gb|ACDC01000007.1| GENE 1 90 - 284 503 64 aa, chain - ## HITS:1 COG:no KEGG:FN1309 NR:ns ## KEGG: FN1309 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 64 1 64 64 109 100.0 3e-23 MVTGDMNIMEAVEKYPVIVEVLQRNGLGCVGCMIASGETLAEGIEAHGLDTKAILDEINS LIKE >gi|224461395|gb|ACDC01000007.1| GENE 2 352 - 1158 746 268 aa, chain - ## HITS:1 COG:FN1308 KEGG:ns NR:ns ## COG: FN1308 COG4589 # Protein_GI_number: 19704643 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: Fusobacterium nucleatum # 55 266 1 212 213 243 81.0 2e-64 MLVAMFFVDILALIILFLIKNKISEKKFTNIKQRIFTWFVIIILFYLATMSRIYLLLLFG LISTLAFKEFLQFAYIKYNSELMITSVVVNLAFYLGIYFKNLYVLLILFILIALRFYKRA FIIFAFFITTYLIGSISYIENLNFIVNYMILIELNDVFQYISGNIFGERKITPNISPNKT VEGLIGGMILTTLTAALLKFVFHINYQIKFIPYLALIGFFGDIFISSLKRKVHLKDSGTL LLGHGGILDRVDSLIFTAPIILFIFKYS >gi|224461395|gb|ACDC01000007.1| GENE 3 1160 - 1759 463 199 aa, chain - ## HITS:1 COG:FN1307 KEGG:ns NR:ns ## COG: FN1307 COG0558 # Protein_GI_number: 19704642 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # 
Organism: Fusobacterium nucleatum # 1 198 1 198 199 288 85.0 3e-78 MDISIYKLKTKFQNLLMPICEKLVKLKVSPNQITITTVLLNIVFAGLIYKFNNYRFIYLT VPVFLFLRMALNALDGMIANKFNQKTKMGVFYNEAGDVVSDTIFFYVFLRVIGISEIHNL LFVFLSILSEYVGVTAMMVDNKRHYEGPMGKSDRAFLISLLAIIYYFIGNQYFDYILILA IVLLIFTIFNRVRSSVKGG >gi|224461395|gb|ACDC01000007.1| GENE 4 1761 - 3461 2223 566 aa, chain - ## HITS:1 COG:FN1306_2 KEGG:ns NR:ns ## COG: FN1306_2 COG0500 # Protein_GI_number: 19704641 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 265 566 1 302 302 521 89.0 1e-147 MENLYFTSFDNNKLFYRKWNFEQGKKTLILIHRGHEHSERLNSLAQDEKFLKYNIFAYDL RGHGYTETKTSPNAMDYVRDLDAFVKHIKNEYKIKEEDIFIVANSIGGVILSAYVHDFAP NLAGMALLAPAFEIKLYVPFAKQLVTLLTKIKKDAKVMSYVKAKVLTHDVEEQNKYNSDK LINKEINARLLIDLANMGQRLIEDSMAIELPTIIFSAQKDYVVKNSAQKKFYLNLSSKKR EFIELENFYHGIIFEKERQTVYKMLDDFIQDVFKNQKIELDDSPREFSRKEYERIGLEEY PLSEKIYYSIQKFSMKTFGFLSKGMSLGLKYGFDSGISLDYIYKNKASGKLLIGKLIDRF YLNQVGWAGVRVRKKNLLALIEKKINSLGEENVKILDVAGGTGNYLFDIKEKYPKLKILI NEFKKSNIEVGEEVIKRNNWQDISFVNYDCFDKETYKKINYKPNIVIISGVFELFEDNKM LENTISGVTEILDKDAAVIYTGQPWHPQLKQIALVLNSHKGSGKSWLMRRRSEKELDSLF EKYNLKKEKMLIDNEGIFTVSLAEMR >gi|224461395|gb|ACDC01000007.1| GENE 5 3720 - 4934 1623 404 aa, chain + ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 404 1 404 404 612 91.0 1e-175 MAKLEDFVKAKEKLSKVLLETHLIYSPIFSKESGNEVYIKPENLQKTGSFKIRGAYNKIS NLTEEEKKRGVIASSAGNHAQGVAYGARELGIKAVIVMPKSTPLIKVESTKQYGAEVVLH GDVYDDAYKKAKELEEKESYVFVHPFNDEDVLDGQGTIALEILDELPETDIILVPIGGGG LISGIACAAKLIKPDIKIIGVEPEGAASAREAIKENKVVELKEANTIADGTAVKRIGDLN FEYIKKYVDEIITVSDYELMEAFLLLVEKHKIIAENSGILSIAATKKLKEKNKKVVSVIS GGNIDVLMISSMINKGLIRRDRIFSFSVDISDKPGELAKVVDLIAELGANVVKLEHNQFK NLSRFRDVEVQITVETNGTEHIQNLIETFEQKGYEIIKIKTKIN 
>gi|224461395|gb|ACDC01000007.1| GENE 6 4971 - 6353 1829 460 aa, chain - ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 182 457 10 284 286 157 35.0 5e-38 MKEKILNFLNEGKPLLWIKGQNFHEIENIIVEGLNAFENKRYYIYEKGTTINRQNNSVEV GMGNLFTTLDELYPQGIRKVPVFLLIKDSLAEIVDENNLEYIKEIVETKTANPKYNFTLI VVDQQNTVPEDLREIASLVDDDEQKRTAEMALKKAILDITKIEKIELDLAKLEKIELDLD SIEKIVQSLKDDIKKITIGEKPVELKPTFEDMVFVKGGKYQPSFTDEEKEVSNLEVCKYL TTQKLWQELIRNNPANFKGDENRPIEYISWWHALEFCNRLSEKYGLRPVYNLGKSDQGLL MINQLDGTVVSPDVADFNKTEGFRLPTEVEWEWFARGGQVALDNGTFDYTYSGSNNIDDV AWYTGNSKDTTQSVGLKMPNVLGLYDCNGNVWEWCYDTTESIESGKSYVYKAYDHSNVYR RLKGGSWCNNTEVCAVAVRGNSQATYAYSNAGFRIVRTVL >gi|224461395|gb|ACDC01000007.1| GENE 7 6483 - 7358 893 291 aa, chain - ## HITS:1 COG:FN0354 KEGG:ns NR:ns ## COG: FN0354 COG0697 # Protein_GI_number: 19703696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 68 291 1 224 224 291 82.0 1e-78 MNFSQIFKKLTAKHCAFIGIFFWATAFVITKVVLKEVDAMSLGVLRYFFASIIVIFILIK KKIPFPNLRDIPAFIFAGFSGYAGYIVLFNIATVLSSPSTLSVINALAPAITAIIAYFMF NEKIKLIGWIAMGISFCGILVLTLWNGTITINKGVLYMLLGCFLLSTYNISQRYLTKKYS SFSVSMYSLLIGGILLVIYSPHSIANIPNISSSSLILIIYMAIFPSIISYFFWTKAFELA KSTTEVTSFMFATPVLATILGMIILGDIPKLSTIIGGVIIISGMILFNKTK Prediction of potential genes in microbial genomes Time: Thu May 19 22:46:34 2011 Seq name: gi|224461394|gb|ACDC01000008.1| Fusobacterium sp. 2_1_31 cont1.8, whole genome shotgun sequence Length of sequence - 2775 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 20 - 1822 2553 ## COG0481 Membrane GTPase LepA 2 1 Op 2 . 
- CDS 1850 - 2773 997 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|224461394|gb|ACDC01000008.1| GENE 1 20 - 1822 2553 600 aa, chain - ## HITS:1 COG:FN0777 KEGG:ns NR:ns ## COG: FN0777 COG0481 # Protein_GI_number: 19704112 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Fusobacterium nucleatum # 1 600 5 604 604 1125 96.0 0 MLQKNKRNFSIIAHIDHGKSTIADRLLEYTGTVSERDMKDQILDSMDLEREKGITIKAQA VTLFYKAKDGEEYELNLIDTPGHVDFIYEVSRSLAACEGALLVVDAAQGVEAQTLANVYL AIENNLEILPIINKIDLPAAEPEKVKREIEDIIGLPADDAVLASAKNGIGIENILEAIVQ RIPAPNYDENAPLKALIFDSFFDDYRGVITYIKVLDGSIKKGDKIKIWSTEKELEVLEAG IFSPTMKSTDILTSGSVGYIITGVKTIHDTRVGDTITTVKNPALFPLAGFKPAQSMVFAG VYPLFTDDYEELREALEKLQLNDASLTFVPETSIALGFGFRCGFLGLLHMEIIVERLRRE YNIDLISTTPSVEYKVRIDNQEERIIDNPCEFPEPGRGKITIQEPYIRGKVIVPKEYVGN VMELCQEKRGIFLSMDYLDETRSMLSYELPLAEIVIDFYDKLKSRTKGYASFEYELSEYR ESNLVKVDILVSGKPVDAFSFIAHNDNAFYRGKAICQKLSEVIPRQQFEIPIQAALGSKI IARETIKAYRKNVIAKCYGGDITRKKKLLEKQKEGKKRMKSIGNVEIPQEAFVSVLKLND >gi|224461394|gb|ACDC01000008.1| GENE 2 1850 - 2773 997 307 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 2 307 103 412 412 454 87.0 1e-128 NFSRKEKESNLVKSCNEHNKKKQYILNEGDKIDFLIELGLMSVEGKILKSSYNKFKQINK YLEFIDDVIEELKAKKLITNHINVLDFGCGKSYLTFALYYYLKNYRKDLTFSIVGLDLKK DVIEFCNKLAKKLNYENLEFLNGNIKDYDKSKEVDLVFSLHACNNATDYSLEKALSLDAK AILAVPCCHHEFFEKIQKNKNSEFYNTLKIMADNGVVLDKFATLATDSFRSLSLELCGYK TKMIEFIDMEHTPKNILIKAIKSKSSNLKEKLVEYNKLKEFLGIKPLLEDLIKKYFSIDT NTEIPYN Prediction of potential genes in microbial genomes Time: Thu May 19 22:46:47 2011 Seq name: gi|224461393|gb|ACDC01000009.1| Fusobacterium sp. 
2_1_31 cont1.9, whole genome shotgun sequence Length of sequence - 38919 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 10, operones - 8 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.286 - CDS 86 - 301 184 ## COG0500 SAM-dependent methyltransferases 2 1 Op 2 1/0.286 - CDS 312 - 1196 1179 ## COG0523 Putative GTPases (G3E family) 3 1 Op 3 12/0.000 - CDS 1214 - 1705 529 ## COG3610 Uncharacterized conserved protein 4 1 Op 4 1/0.286 - CDS 1723 - 2481 587 ## COG2966 Uncharacterized conserved protein 5 1 Op 5 . - CDS 2493 - 3224 597 ## COG4123 Predicted O-methyltransferase - Prom 3301 - 3360 13.6 + Prom 3293 - 3352 13.7 6 2 Op 1 2/0.000 + CDS 3425 - 4570 2039 ## COG1960 Acyl-CoA dehydrogenases 7 2 Op 2 29/0.000 + CDS 4594 - 5382 1320 ## COG2086 Electron transfer flavoprotein, beta subunit 8 2 Op 3 . + CDS 5402 - 6577 2022 ## COG2025 Electron transfer flavoprotein, alpha subunit + Term 6582 - 6626 6.3 - Term 6570 - 6614 6.3 9 3 Op 1 . - CDS 6630 - 7049 561 ## FN0788 hypothetical protein 10 3 Op 2 1/0.286 - CDS 7067 - 7912 942 ## COG1284 Uncharacterized conserved protein 11 3 Op 3 . - CDS 7930 - 9048 1043 ## COG1940 Transcriptional regulator/sugar kinase - Prom 9171 - 9230 12.1 + Prom 9148 - 9207 11.2 12 4 Op 1 6/0.000 + CDS 9345 - 10880 2496 ## COG2986 Histidine ammonia-lyase 13 4 Op 2 . + CDS 10907 - 12928 3207 ## COG2987 Urocanate hydratase - Term 12936 - 12983 9.5 14 5 Tu 1 . 
- CDS 12997 - 14676 2440 ## COG1164 Oligoendopeptidase F - Prom 14849 - 14908 13.7 + Prom 14887 - 14946 14.2 15 6 Op 1 44/0.000 + CDS 15068 - 16078 1183 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 16 6 Op 2 6/0.000 + CDS 16075 - 17010 722 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 17 6 Op 3 49/0.000 + CDS 17035 - 17997 1378 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 18 6 Op 4 5/0.000 + CDS 18010 - 18912 1142 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 19 6 Op 5 3/0.000 + CDS 18949 - 20787 2785 ## COG0747 ABC-type dipeptide transport system, periplasmic component 20 6 Op 6 3/0.000 + CDS 20833 - 22662 2762 ## COG0747 ABC-type dipeptide transport system, periplasmic component 21 6 Op 7 . + CDS 22691 - 24517 2446 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 24528 - 24562 4.0 22 7 Tu 1 . + CDS 24573 - 25298 772 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 23 8 Op 1 . - CDS 25432 - 26502 1551 ## COG0136 Aspartate-semialdehyde dehydrogenase 24 8 Op 2 19/0.000 - CDS 26504 - 27388 1074 ## COG0083 Homoserine kinase 25 8 Op 3 . - CDS 27376 - 28833 1811 ## COG0498 Threonine synthase - Prom 28857 - 28916 7.5 + Prom 28821 - 28880 10.0 26 9 Op 1 . + CDS 28928 - 30058 1779 ## COG0460 Homoserine dehydrogenase 27 9 Op 2 . + CDS 30058 - 31374 1983 ## COG0527 Aspartokinases 28 9 Op 3 1/0.286 + CDS 31387 - 32478 1150 ## COG2849 Uncharacterized protein conserved in bacteria 29 9 Op 4 1/0.286 + CDS 32514 - 33554 1079 ## COG2849 Uncharacterized protein conserved in bacteria 30 9 Op 5 1/0.286 + CDS 33590 - 34483 771 ## COG2849 Uncharacterized protein conserved in bacteria 31 9 Op 6 . + CDS 34552 - 35349 1059 ## COG2849 Uncharacterized protein conserved in bacteria + Term 35366 - 35414 3.3 - Term 35354 - 35402 7.1 32 10 Op 1 . 
- CDS 35428 - 37476 1603 ## COG1479 Uncharacterized conserved protein - Term 37491 - 37536 10.2 33 10 Op 2 . - CDS 37548 - 38873 1830 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|224461393|gb|ACDC01000009.1| GENE 1 86 - 301 184 71 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 65 1 65 412 81 78.0 3e-16 MEKENILFELKKNIQEDKLIKIVFSDRKSGDFNKVIIKPIILKSAKNIQIESFKDNKAFH KNIDLKITYKN >gi|224461393|gb|ACDC01000009.1| GENE 2 312 - 1196 1179 294 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 1 294 1 294 294 447 82.0 1e-126 MKILLVSGFLGAGKTTFIKEMAKNINLEFVVLENEYADIGVDKDFLDEKNLDVWEMSEGC ICCSMKGDFKSSIKRIYSEINPEYLLIEPTGLGMLSSIIENIKELNNEDIEILRPISLID VTSFDEYLESFNNFFLDNLKNTGKVILTKLESINPLEIENIKNRILELNADLEIETNDYR NFPKKWFAELLNRNLDNKVIDKNFSMGTHINLRTFSKENINLKTMDELGLLLNRLVNGDF GKVYRAKGIIKIDGYWGKFNLVYKNFEMEAIENAKITKIVVIGNNLDIENLKNI >gi|224461393|gb|ACDC01000009.1| GENE 3 1214 - 1705 529 163 aa, chain - ## HITS:1 COG:FN0780 KEGG:ns NR:ns ## COG: FN0780 COG3610 # Protein_GI_number: 19704115 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 163 1 163 163 212 80.0 2e-55 MNYIEVFAAAFSTLFFGIIFNLTGRKLIYSSFAGGLGWYTYLLLYKEMGYSKTAAYLFSA IVITVFSEIIGRLKRTTVTTTLIPALIPLVPGGGIYYTMSFFVENKFQEALEKGRETIIL TMALSVGILLVSTFSQILDRTIKYTKVLKKYRKFKQYKKSHKI >gi|224461393|gb|ACDC01000009.1| GENE 4 1723 - 2481 587 252 aa, chain - ## HITS:1 COG:FN0781 KEGG:ns NR:ns ## COG: FN0781 COG2966 # Protein_GI_number: 19704116 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 252 5 256 256 387 83.0 1e-107 MQNDAFIIKVLSTANTIGKILLTSGAETYRVEEAITLVCRRFDLKSESFVTMTCVLTSAK KKDGEVITEVNRIYSVSNNLNKIDRIHKILLDIHKYEIDDLEKEIKKLQIQTVYKKKVLL ISYCFSAAFFSLLFDGKFRDFLVAGVGGVLIFYMAYFANKLKLNNFFINTLGGFLVTIFS SFATKLGIISTPSYSAIGTLMLLVPGLALTNAIRDLINGDLLAGTSRSIEAALVGSALAI GTGFALFTMSYF >gi|224461393|gb|ACDC01000009.1| GENE 5 2493 - 3224 597 243 aa, chain - ## HITS:1 COG:FN0782 KEGG:ns NR:ns ## COG: FN0782 COG4123 # Protein_GI_number: 19704117 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 243 1 243 243 348 84.0 4e-96 MNKNLESLIPLLNKNLKIIQRSDYFNFSIDSLLISEFVNLTKNTKKILDIGTGNAVIPLF LSKRTSAKIYGVEIQEISYQLALRNISINNLNEQIYIIYDNIKNYLKYFTVGSFDIVLSN PPFFKVTENKELLNDLEQLSIARHEIELNLDELIEISSKLVKDRGYFYLVHRADRLSEIL VTLQKYNFEAKKIKFCYTTKHKNAKIVLIEAIKNGKVGLTILPPLVINKDNGEYTDEVLK MFE >gi|224461393|gb|ACDC01000009.1| GENE 6 3425 - 4570 2039 381 aa, chain + ## HITS:1 COG:FN0783 KEGG:ns NR:ns ## COG: FN0783 COG1960 # Protein_GI_number: 19704118 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 381 1 381 381 727 98.0 0 MEFNVPKTHELFRQMIREFVEKEVKPIAAEVDENERFPMETVEKMAKIGIMGIPIPKQYG GAGGDNLMYAMAVEELSKACGTTGVIVSAHTSLGTWPILKFGNEKQKQKYLPKMASGEWI GAFGLTEPNAGTDAAGQQTMAVQDPETGEWILNGAKIFITNAGYAHVYVVFAMTDKSKGL KGISAFIVEAGTPGFSIGKKEMKLGIRGSATCELIFENCRIPKENLLGDKGKGFKIAMMT LDGGRIGIASQALGIAAGALEEAINYAKERKQFGRSLAQFQNTQFQIANLDVKVEAARLL VYKAAWRESNNLPYSLDAARAKLFAAETAMEVTTKAVQIFGGYGYTREYPVERMMRDAKI TEIYEGTSEVQRMVIAANIIK >gi|224461393|gb|ACDC01000009.1| GENE 7 4594 - 5382 1320 262 aa, chain + ## HITS:1 COG:FN0784 KEGG:ns NR:ns ## COG: FN0784 COG2086 # Protein_GI_number: 19704119 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Fusobacterium nucleatum # 1 262 1 262 262 470 96.0 1e-132 
MRIVVCIKQVPDTTEVKIDPVKGTIIRDGVPSIMNPDDKGGLEEALKLKDLYGAEVIVIT MGPPQAEAILREAYAMGADRAILITDRKFGGADTLATSNTIAAAIRKIENIDLIVAGRQA IDGDTAQVGPQIAEHLDLPQVSYVKEMKYNEASKSFEIKRATEDGYFLLELPTPGLVTVL AEANQPRYMNVGAIVDVFERPIETWTFDDIEIDPAKIGLAGSPTKVNKSFTKGVKEPGVL HEVDPKEAANIILEKLKEKFII >gi|224461393|gb|ACDC01000009.1| GENE 8 5402 - 6577 2022 391 aa, chain + ## HITS:1 COG:FN0785 KEGG:ns NR:ns ## COG: FN0785 COG2025 # Protein_GI_number: 19704120 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Fusobacterium nucleatum # 1 391 1 391 391 663 90.0 0 MNLNDYKGILVYAEQRDGVLQNVGLELLGKATELAYEINKQIALKDAGDELAEYASKQAA AIKSIDAVAATLEEDDEKVKEKVAEVKANNPDAAKVTALLIGHNVKALADELVKAGADKV LVVDQPKLEVYDTEAYTQVLTAAINAEKPEIVLFGATTLGRDLAPRVSSRIATGLTADCT KLELLKDKERQLGMTRPAFGGNLMATIVSPDHRPQMATVRPGVMKKLPKSDDRKGEIVDF PVTLDESKMKVKLLNVVKEGGNKVDISEAKILVSGGRGVGAKQNFELLEDLAAEIGGIVS SSRAQVDAGNMPHDRQVGQTGKTVRPEVYFACGISGAIQHVAGMEESEFIIAINKDRFAP IFSVADLGIVGDLHKILPILTEEIKKYKANK >gi|224461393|gb|ACDC01000009.1| GENE 9 6630 - 7049 561 139 aa, chain - ## HITS:1 COG:no KEGG:FN0788 NR:ns ## KEGG: FN0788 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 139 1 139 139 210 78.0 2e-53 MNNNEFINKYTDGHCLSYLEFQVVAKKYGIYFEKINNDIIVCYDGDEDPKLAAFKFYKTF FPKTTLTPSDFDLITHLNNFHMKFLRDKINEISQKYGMPPVYKASMSIKENALLLLNTLK TRHAIYREDMEFIKYTLNF >gi|224461393|gb|ACDC01000009.1| GENE 10 7067 - 7912 942 281 aa, chain - ## HITS:1 COG:FN0789 KEGG:ns NR:ns ## COG: FN0789 COG1284 # Protein_GI_number: 19704124 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 278 1 278 280 340 76.0 2e-93 MSNKYLQFFKEYIIVALACMVMAFNTSYFFIGNKLAQGGVSGLSLIIHYLSNIDMSYLYF ALNIPLIILAYIFLGKNFLLKTLFATFVLSVFLKIFASFSEPLEDILLAAIFGGAINGIA IGIVFYAGGSTGGIDIIAKIINKYTGIPISRILLTTDFIVLSMVAVIFGKVIFMYTLISL VISSKMIDIIQVGIYSAKGVTIITTKEDEIRKRIMEDTGRGITLIDARGGYTQKEVGMLY CVVGQYQLIKVKTIVKEVDPSAFMIVADVHEVIGNGFLVNK 
>gi|224461393|gb|ACDC01000009.1| GENE 11 7930 - 9048 1043 372 aa, chain - ## HITS:1 COG:FN0790 KEGG:ns NR:ns ## COG: FN0790 COG1940 # Protein_GI_number: 19704125 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Fusobacterium nucleatum # 1 372 16 387 387 578 83.0 1e-165 MYQKEIKQGNENIIFHSIYFTEDSFSIPDLTKITNMTFPTVKRVVNEFLEKDIIVEWTLS TGGVGRRAVKYKYNPDFCYSIGVSINEEKIKFILINTIGKIFQSKIIDTENENFIDFLTK NLKDFIKEIDEKYLAKVIGVGISIPGIYNKEDHFLEFNNTDRYESAVIKEIEQDINLPIW VENEANMSILAEAIINKYKELEDFTVINISNKVTCSTFHKFGNKSEDYFFKASRVHHMIV DYENQKKVGDCISFKVLKNEILRAFPKLNSLEDFFSNKTYRESKKGKEILNKYLTYMGII LKNLLFTYNPKKLIICGDLSQFGSYLLDDVLNIVYEKNHIFYRGKETIIFSNFKGSSSII GAALFPIVDNLM >gi|224461393|gb|ACDC01000009.1| GENE 12 9345 - 10880 2496 511 aa, chain + ## HITS:1 COG:FN0791 KEGG:ns NR:ns ## COG: FN0791 COG2986 # Protein_GI_number: 19704126 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 1 511 6 516 516 915 91.0 0 MEIVLGSKRITLEDLINVTRRGYKVKISDEAYEKIDKARALVDKYVDEARVSYGITTGFG KFAEVSISKEQTGELQKNIVMSHSCSVGNPMPIDIARGVVFLRAVNLAKGHSGARRIVVE KLVELLNKDVTPWIPEKGSVGSSGDLSPLAHMSLVLIGLGKAYYKGELLEGKEALERAGI EPIPALSSKEGLALTNGTQALTSTGAHVLYDAINLSKHLDIAASLTMEGLHGIVDAYDPR ISEVRGHLGQINTAKNMRNILAGSKNVTKQGVERVQDSYVLRCIPQIHGASKDTLEYVKQ KVEIELNAVTDNPLIFVETDEVISGGNFHGQPMALPFDFLGIALAEMANVSERRIEKMVN PAINHGLPAFLVEKGGLNSGFMIVQYSAAALVSENKVLAHPASVDSIPTSANQEDHVSMG SIAAKKSKDILENVRKVIGMELITACQAIDLKGAKDKLSPATKVAYDEVRKVIPYVAEDR PMYIDIHAAEEIVKNNKLVEDVEKAIGQLEF >gi|224461393|gb|ACDC01000009.1| GENE 13 10907 - 12928 3207 673 aa, chain + ## HITS:1 COG:FN0792 KEGG:ns NR:ns ## COG: FN0792 COG2987 # Protein_GI_number: 19704127 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Fusobacterium nucleatum # 1 673 1 673 673 1400 98.0 0 MLNNKTIYDAMTIKLTAEDIPMEIPKLDPSIRRAPKRIVKLSDHDIELALRNALRYIPEE 
FHEMLAPEFLQELEERGRIYGYRFRPEGNIYGKPIDEYKGKCTEAKAMQVMIDNNLDFDI ALYPYELVTYGETGQVCQNWMQYRLIKKYLENMTQDQTLVVASGHPTGLFRSNPYAPRAI ITNGLMIGLFDNYEDWARGAAIGVANYGQMTAGGWMYIGPQGIVHGTYSTILNAGRLFCG VPADGDLRGKLFITSGLGGMSGAQGKACEIAKGVAIVAEVDLSRINTRLEQGWVNVIANT PEEAFKIAEEKMASKTPYAIAYHGNIVEILEYAIEHNKHIDLLSDQTSCHAVYDGGYCPV GTSFEERTKLLGTDRAKFRELVNEGLKRHYKAIKTLHDRGVYFFDYGNSFLKSIYDVGIT EISKNGKDDKEGFIFPSYVEDILGPELFDYGYGPFRWVCLSRKKEDLLKTDKAALELVDP NRRYQDRDNYVWIQDADKNGLVVGTQARIFYQDAMSRTRIALKFNEMVRNGEIGPVMLGR DHHDVSGTDSPFRETSNIKDGSNIMADMATQCFAGNAARGMTMIALHNGGGVGIGKSING GFGMVLDGSKRVDEILWQAMPWDVMGGVARRAWARNPHSIETVVEYNHDNRGTDHITLPY IVSDELVKKVLKK >gi|224461393|gb|ACDC01000009.1| GENE 14 12997 - 14676 2440 559 aa, chain - ## HITS:1 COG:FN1145 KEGG:ns NR:ns ## COG: FN1145 COG1164 # Protein_GI_number: 19704480 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 1 559 1 559 559 934 86.0 0 MKFNDIPYQRPNMEEVKKYFKDLTKNLEVANSGAEQIKLIEEFANFKKDLNTTRELANAR HSIDTSDKFYEAEMDFFDENDPIIATLNTEVSRAIFNSKFRTELEERFGKHYFKLLECKL VLNEKAIPFMQKENALSTKYDKIIANSKIKFRGKEYTVSQMPPLLQNPDREFRKEAYQAR AKFFEEHQEEFDSIYDEMVKVRTEMAKALGYENYVELQYKLLNRTDYDHNDVARYREKVL KTLTPLAVKIKKLQAERLGIKDFKYYDEACDFKDGNSNPNGDVDFIVKNAQKMYRELSPE TGKFFDFMVENELMDLVAKPKKRVGGFCTSFDKYKEPFIFSNFNGTNGDIDVITHEAGHA FQCYMSQYQLLPDYIWPTYDAAEIHSMSMEFLTWPWMELFFGENANKYRYSALKGALTFI PYGVTIDHFQHYVYENPDATPEERRKKYHELELMYKPDLDYDNDFYNSGAFWFAQGHVFW APFYYIDYTLAQVCAFQYLLKYLENKEETLKEYITLCKAGGSESFFKLLDIGNLKNPMTT NVLEEIAPKLEELLNSIKI >gi|224461393|gb|ACDC01000009.1| GENE 15 15068 - 16078 1183 336 aa, chain + ## HITS:1 COG:BH3640 KEGG:ns NR:ns ## COG: BH3640 COG0444 # Protein_GI_number: 15616202 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus halodurans # 4 335 2 332 340 426 61.0 1e-119 MEENKPILCEMKNLCTAFRIKDDYFNAVENVNLSLYQNEVLAIVGESGCGKSTLATTIMG 
LHNFNFTKVSGEVVFEGKNILNSTEDEYNKIRGGKIGMIFQDPLSALNPLQRIGQQIEEG LIYHTNLNAEQRKERAFELLKRVGIEKPERIYRQFPHQLSGGMRQRVVIAIALSCKPKIL IADEPTTALDVTIQAQILDLIADLQEEIKAGIILITHDLGVVAQIADRVAVMYAGEIVEL ATSKEIFTNPLHPYTRSLLKSIPQLDTNENDELHVIKGMVPSLKNLPRKGCRFSARIPYI PKEAHEENPGFHEAFPGHFVRCTCWKTFKFQEEDKK >gi|224461393|gb|ACDC01000009.1| GENE 16 16075 - 17010 722 311 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 267 11 275 329 282 51 2e-75 MSLLEIKNLKVHYPIRGGFFNKVVDHVYAVDGVSMVIEQGKTYGLIGESGSGKSTIGKTI IGLEKATAGEILYNGKNILDPKVRKEMKFNREVQMIFQDSMSSLNPKKRVLDILAEPIRN FEKLSKEAEKEKIYELLEIVGMPKDSIYKYPHEFSGGQRQRLGIARAIACKPKLIIADEP VSALDLSVQAQVLNYLKNIQRELNLSYIFISHDLGVVRHMCDYIYIMHRGKFTETGTRED IYKDARHIYTKRLIASIPQINPEAREELKKKREDVEKEYEKLYSQFYDKNGKVFDLEKIS ETHSVASSTKI >gi|224461393|gb|ACDC01000009.1| GENE 17 17035 - 17997 1378 320 aa, chain + ## HITS:1 COG:BH3638 KEGG:ns NR:ns ## COG: BH3638 COG0601 # Protein_GI_number: 15616200 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 1 320 1 322 322 328 49.0 7e-90 MWKTILRRVLLMIPQLFVLSLLIFILAKLMPGDALSGMIDPTVDAETIEKIRLQLGYYDP WYIQYFRWVKNAFHGDLGISYTYKLPVLTVIGARAMNSFSLSILALTIMYCIALPVGIFA GKNQGSKFDKGVILFNFFTYAIPSFVMYLFAILLFGYKLKWFPTIGSVDAGLVKGTFAYY MSRLHHMILPAMCIAILSTTGTIQYLRNEVIDAKTADYVKTARSKGVPMRKVYTKHIFRN SLLPIAAFFGFQISGLLGGSVIAETIFNYQGMGKFFIESILTRDYSVVTTLILLYGLLFL LGSLLSDITMAIVDPRIRIE >gi|224461393|gb|ACDC01000009.1| GENE 18 18010 - 18912 1142 300 aa, chain + ## HITS:1 COG:BH3637 KEGG:ns NR:ns ## COG: BH3637 COG1173 # Protein_GI_number: 15616199 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 6 300 11 302 302 290 50.0 3e-78 
MEKNIKDPVKAENPTGFSVIVREFKKDKLALFSFFAVTIFIIAVFVASSFINLQQLQTVD IFRKYEVPSFNNFWNFFGRDSGGRSVMGYVIVGARNSITIGVIITIVTTFIGLFVGLCMG YYGGKVDALGMRIVDFISIMPSVMIIIVFVSIVPKYGIFQFILIFSMFYWTRTTRLARSK TLSETRRDYVNASKTMGTSDLKIMFGEILPNISSIIIVNGTLALASNIGIEVALSFLGFG LPAATPSLGTLISYASKPEIIQYKAYVWLPAALVLLFMMLGINYIGQALRRAADAKQRLG >gi|224461393|gb|ACDC01000009.1| GENE 19 18949 - 20787 2785 612 aa, chain + ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 62 610 77 610 610 386 38.0 1e-107 MKTKWLKVFGLFTGLLLLASCGDVNGGSKDAGNEKEVVDVSAIEKKYPSYFKNDAEAVQV DTLKVAIVSDSPFKGIFNGFLYSDAIDNKFMKYTMNGAFPIDNDLKLILDSDETPIKVTI NPEEKTVTYKINPKFKWSNGDPVTTKDIVKTYEIFANQDYIVSSKSLRFSKNRKAIVGIE EYNEGKADKISGLEVIDDSTMKIHLKEVTPSTYWGGNFAGELINAKQFEGIPMNKIAESD ALRKNPLSYGPYVIKQIVQGEKVIFEANPYYYKGEPKIKRIEMEVLPSSQQVAAMKAGKY DIIFGASNDVFPEVEKLDNINILTKKASYMNYIAFKLGKWDAEKNEVVTDPNSKMYDINL RKAMAYAIDNDAIGEQFYHGLATTAKSQLSPLFPSLHDPSINGYRIDIEKAKKLLDDAGY KDVDGDGIREGKDGKPIKFTLAMMSGGDIAEPLSQYYLQQWKSIGLNVELVDGRLLDINN FYDRVEADDPAIDFCLAAIGFGSDPQQVSLFGKTAGFNISRYTSETLEKALANTVSPEAI DDQKRAEFYKEYERVFMDEIPVVPQLNKYEYLVVNKRVKMFDWTESMRAFGEEFDWSKLE VTAKDPLVSETK >gi|224461393|gb|ACDC01000009.1| GENE 20 20833 - 22662 2762 609 aa, chain + ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 63 575 81 583 610 355 37.0 2e-97 MKFKKALALISGMLLLASCGGINDGGAKDAQKEAVDVSTVESQYPSYVENEGNPVETSVL KVAVVSDSPFRGIFNGFLYSDSLDGSFMASTMNGAFPIDADLKIILDSDETPIKVSVNPE EKTVTYKINPNFKWSNGEPVTTKDIVKTYEIMANQEYITSSKSLRYNKNRKAIVGIEEYN EGKADKISGLEVIDDSTMKIHLKEMTPSVYWGGNFVPEFVNAKQFEGIPMDKITESDALR KNPLSYGPYVIKEIVQGEKVIFEANPYYYKGEPKIKRLEMEILPPSQQVAAIKAGKYDIV 
LKVSPEIFPELEKLDNINILTKKAGSMNYIAFKLGKWDEEKNEVVTDPNSKMYDLNLRKA IAYAIDMDAVSKQFYHGLSTPAKSQLSPLFPSLHNPEINGFKQDVEKAKQLLDEAGFKDV DGDGIREGKDGKPVKYTLAMMSGGEIAEPLAQYYIQQWKAIGLDVELLDGRLLDSKNFYN RVNGDDPAIDFCIAGIGFGTDPQQLAIFGKNAKFNISRYISDNLEAALDATVSKDAMNEE YRVKAYKDYEKVFMEEIPAVPILNKLDILVVNKRIKKYDWRPNVDGKPNTFKWSMIEVVA PQPIVDSKN >gi|224461393|gb|ACDC01000009.1| GENE 21 22691 - 24517 2446 608 aa, chain + ## HITS:1 COG:BH3636 KEGG:ns NR:ns ## COG: BH3636 COG0747 # Protein_GI_number: 15616198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 23 576 40 583 610 379 36.0 1e-105 MNSKLKKLISLFAGMMLLVSCGDVNGGKADTAKQEVNLNEIEQKYPVAYKNEGEVVPVDT LKVAVVSSSPYKGIFNGFLYSSGIDNEFMQYTMNGAFPTNPDFTLVLDSDETPIKVTVNP EEKTVTYKINPKFKWSNGEPVTTKDIVKTYEIVANQKYIESSSSSRFNKNRKKIVGIQEY NEGKADKISGLEVIDDSTMKIHLTEVTPSVYWGGNFVSEFVNAKQFEGVPMDKIIESPAL RKNPLSYGPYYIKDIVQGEKVIFEANPYYYRGEPKIKTIEMEILPPSQQVAAIKSGKYDI VFSPELNIFPEIENLDNINILARKAMYFSYLGFHVGKWDAEKNEVITDPNSKMYDINLKR AMAYAIDNDSIAKQFYHGLAMRAPSPIAPVFTKLRNPEVDGFKIDLEKAKKLLDDAGYKD VDGDGIREGKDGKPFKINLAMMSGSEIQEPLSQYYIQQWKLIGLNVELVDGRLLDFNNFY DRLKADDPAIDCFFAAFGYGTDPQQMSLFGKNSQFNKSRYTSETFEKALEAQISPEALDE AKRIEIYHNYDKIFMEELPVAPQLNKMEYIVVNKRVKEYDWKYDTDMKEFDWSKIEVTAK EPISDSKN >gi|224461393|gb|ACDC01000009.1| GENE 22 24573 - 25298 772 241 aa, chain + ## HITS:1 COG:TM0742 KEGG:ns NR:ns ## COG: TM0742 COG0639 # Protein_GI_number: 15643505 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Thermotoga maritima # 24 241 3 203 209 90 30.0 2e-18 MKKGTIIRKGQIKYINEDDYKRIFVISDLHGYYNLFLKFLEKVNLQKDDLLINLGDSCDR GTQSYELYMKYNEMIKEGYNVLHLLGNHEDMLLTAVNTLDESSIDHWYRNNGETTIESFK NVTGLAKEDFYDKEKNKFLIDFLSTFPTLIVSDKTIFAHAAYNPDLSPEEQEEYFLIWNR ENFWDKNFTGKAIYFGHTPSKKEDHTIVYYSNNCTCIDLGTYKYQKMVGVEIKSKKEYYI D >gi|224461393|gb|ACDC01000009.1| GENE 23 25432 - 26502 1551 356 aa, chain - ## HITS:1 COG:CAC0568 
KEGG:ns NR:ns ## COG: CAC0568 COG0136 # Protein_GI_number: 15893858 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 17 355 18 358 359 421 60.0 1e-117 MERTKIAVVGATGMVGQRLLVLLENHPYFEVVKLAASKNSAGKRYGDLMANKWKLDMKIP EYTKDFIVEDAMDVKNVANGVKLIFCAVNLDKKELVALEEAYAKEEVVVVSNNSANRMKA DVPMIIPEINAKHLDIVDVQRKRLGTKKGFIVVKPNCSIQSYVPVFAAIKEFGIKEASIC TYQAISGSGRTFEEWPEMVGNIIPYIGGEEEKSEIEPLKIFGNIEDGEIKLNDTMKFSAQ CIRVPVLDGHLACVSFNLENNPGKEVLIEKIKNFKSDITDLPLAPKEFIHYYEENDRPQP LLDRDNEKGMQITVGRLREDNLFDYKFVGLSHNTLRGAAGGAVLTAELVKKLGYLD >gi|224461393|gb|ACDC01000009.1| GENE 24 26504 - 27388 1074 294 aa, chain - ## HITS:1 COG:CAC1235 KEGG:ns NR:ns ## COG: CAC1235 COG0083 # Protein_GI_number: 15894518 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Clostridium acetobutylicum # 1 286 1 292 296 192 38.0 4e-49 MFEVKVPMTSANVACGFDTLGLALQTHSIFHFELNDKLDFVGFEKEFCNEDNLVYIAFKK TLNFLNKNIDGVKISLIEQAPIARGLGSSATCVVAGIFGAYLLTGTEINKNDILKIATEL EGHPDNVAPAIFGNLCASCLVNDEAISVQYNVDERFNFMALIPNFETKTADARKALPKDL PLKDAIFSLSRLGIVLRAFETYDIQTLKKVLADKIHEPYRKNLIHEYDEVRSICENIESY GFFISGSGSTLINILVDETKLELIKEQLKNLKYNWKVLFTKVDKEGTTWKERNV >gi|224461393|gb|ACDC01000009.1| GENE 25 27376 - 28833 1811 485 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 3 477 6 490 496 466 52.0 1e-131 MNYRSTRNNTIRKNDKTALLQGLSEDGGLFVLENFNEKKIDLKNLLDKSYTDIAFEVLKL FFSFDESKLKSIIEKAYSKFSTSKVTPLVELKDAHVLELFHGPTSAFKDVALTLLPYLIQ LALEGSDQEILILTATSGDTGKAALEGFKDVDQTEIIVFYPKNGVSKIQELQMRTQEGKN TKVCAIEGNFDDAQTAVKNIFLDEELQKKLGKKKFSSANSINIGRLTPQIVYYIVAYIDL VKNKKINLGDKINFVVPTGNFGDILAGYYAKKLGLPVNKLVCASNKNNVLYDFLTTGIYD RNREFLKTISPSMDILISSNLERLLYDLSGSDDKYIKSLMDELKQSGKYQVNADILAKLK EEFGSGYASDEETSQVIKKVWDEEKYLLDPHTAVAYKVMLEQNLEGETVVLSTASPYKFC 
TSVANAVLNITDEDEFKLMEKLYEFTKVPVPENLKNLNSKEIRHSDLVKREDMAKYILEV EKCLK >gi|224461393|gb|ACDC01000009.1| GENE 26 28928 - 30058 1779 376 aa, chain + ## HITS:1 COG:sll0455 KEGG:ns NR:ns ## COG: sll0455 COG0460 # Protein_GI_number: 16331527 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Synechocystis # 1 302 3 323 433 215 39.0 1e-55 MKIAILGFGTVGSGVYEIAKTLKNIEVKKVLEKDLSKINIATDNYDEIINDKEIELVVEC MGGLHPAYEFIMQALKSKKSVVSANKAVIAKYLDEFLQAAKENNVEFRFEASVGGGIPCL AGIQKVRRVENIDKFYGIFNGTSNFILDNMYRFENEFFTTLKTAQELGYAEADPSADIDG YDVTNKVIISFALAYDGFIKNEFPCFTMRNITKEDILYFKKNGYIAKYIGEATTVGNEYE ASVMLNLFPTNALEGNVLSNYNIVTVQSHTMGEVKFYGQGAGKLPTANAIIQDILDIQAN ISFNPISIEKKYSYSAKLFKHRYVLRSNEELKGEFDKIEKDGNNFYHYTKEITQADLLKV IEGKDCLVTKLSEVLA >gi|224461393|gb|ACDC01000009.1| GENE 27 30058 - 31374 1983 438 aa, chain + ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Clostridium acetobutylicum # 4 436 5 437 437 396 47.0 1e-110 MLKVAKFGGSSVASAEQFKKVKEIVKMDSSRKFVVVSAIGKADKEDNKITDLLYLCYAHI KYNMNCDAVFSIIEKKFCDIAKELNLQFNIKGELAQLKEKLDQKSVSEEYLVSRGEYFTA LLMAEYLGYRFIDAKDVIFYNYDNTFDYIKSEKAFQEITKTGENFIIPGFYGSFPNRDVK LMTRGGGDVTGAIVASLANADVYENWTDVSGVLMADPRIIPNPLPIEVINYNELRELSYM GASVLHEEAVFPVALKKIPIQIRNTNRPEDVGTIINNSDEGAFKHVITGIAGKKDFSIIT IRKVHMSNEVGLIRKALSVFEDYNVSIEHIPSGVDSFSVVVETKAVKPFVHELMGKLKKA TSAGEVTLTTEISLIATVGLGMKNYKGLSGRLFSAIGKAGINIVVISQTSDEINIIVGVH NSDYERTIRTIYYEFNPQ >gi|224461393|gb|ACDC01000009.1| GENE 28 31387 - 32478 1150 363 aa, chain + ## HITS:1 COG:FN0738_1 KEGG:ns NR:ns ## COG: FN0738_1 COG2849 # Protein_GI_number: 19704073 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 149 1 149 149 202 70.0 9e-52 MKKNFIVYTLIIFIFSSFSIFAEREVNYEDLKYNEKTELIYANDEKEPFTGIAKDYYEDK SLKVEFPYKNGRIEGKAKAYYPSGKFKSEAFFVDDLLQGKSVGYYESGNLQYEDNYKDDE 
LDGLIKEYYENGQIKSEMYYKSGNLDGPATVYYENGQVYIQESYKNGELDGESFNFNEDG SLKSKAVYQNGELVGDIVQGGVGSVVAGDVPDTEEIFVSTENENIENKVKYYTTIFAFGT VIIGLIIYTIFKIFTAFPKTKYLTDEQRNRIFKILMKYDEGKKELFSAYRLNGVGTGYYR VRSMMVDNQKVYIYAKMFSFLYIPTPITLGYLLCYNKDQILASYSNATFKEVKKEIEDTV LHM >gi|224461393|gb|ACDC01000009.1| GENE 29 32514 - 33554 1079 346 aa, chain + ## HITS:1 COG:FN0738_1 KEGG:ns NR:ns ## COG: FN0738_1 COG2849 # Protein_GI_number: 19704073 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 149 1 149 149 206 76.0 6e-53 MKKSFIIYTFIIFIFSSFSIFAEREVDFEKLEYNEETELVYVEGEKEAFTGIAKYYSKDE SSIFEFPYKNGKKEGRGKEYYLNGKFKSDAFFIDGLLQGKSIGYYENGNLEYEENYKDGK LDGLIKDYYENGQLKSELNYKNGQLDGLARAYHENGQLHIEENYKDGKLEGESTNYDENG NLTSKAIYKDDEMVENLFGDTEEDTSSKNNKLKSYTGSIILCGLIGLYVFLTAYKMLKSF PKTSHLTDEQRSRIFKILMKHDEGNKELFSSYTLNGVGSSYYRVASMMVDNEKMYIYAKM LSFVYLPTPITFGYLFGYSKDHILASYSNETFKEVKKEIEDTVLHM >gi|224461393|gb|ACDC01000009.1| GENE 30 33590 - 34483 771 297 aa, chain + ## HITS:1 COG:FN0738_1 KEGG:ns NR:ns ## COG: FN0738_1 COG2849 # Protein_GI_number: 19704073 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 146 1 146 149 164 61.0 1e-40 MKKSFIIYTFIIFIFSSFSIFAEREAKLEELKYNEETELMYVNDEKEPFTGIAKDFYEDN SLKVEYPYKNGKIEGLAKEYYPNGKLKSEENFVDGLLNGKAITYYENGNIEYEENYKDGK LEGEIVFYDENGNMELKAIYKDDKVDKVIDVKTGEEISDEKNKFEDYLEYSIFGVIIALY IFFIIFKFKSFPKTSNLSDEQREKILKILIQNDENKKEIFSASKFYGIGTGYSKVASLVV DSQEVYIKAKIFSFLFIPVPIVLGYLLCYDEDQILASYSKTTFKEVKKKIQETILHL >gi|224461393|gb|ACDC01000009.1| GENE 31 34552 - 35349 1059 265 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 17 224 31 244 245 122 39.0 7e-28 MGTTEKEMNILELNQDEKTGLVYIKDSKKLFTGVGKTYYESGKLESIFRFKDGVLEGNGI 
GYYESGKISFTFNFTKGNINGITKSYYESGKIRTEKKFIDGKLDGSSKGYYENGKIAYEE NYLNGKLNGNSKFYYENGNLKADLFYKNDMLDGTVIEYLEDGKKTSVSNYKNGKLEGEKL SYYKNGSLYIEANYSNDELDGDIKVYKKNGDLDYIAPYKNGKPLTAKRLKASDDMIDEIS QDLKEILGDDIKITVKEENSLENKK >gi|224461393|gb|ACDC01000009.1| GENE 32 35428 - 37476 1603 682 aa, chain - ## HITS:1 COG:MA2417_1 KEGG:ns NR:ns ## COG: MA2417_1 COG1479 # Protein_GI_number: 20091248 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 563 1 573 597 250 32.0 6e-66 MQPYKARVIRDLIGKSRRVFKIPVYQRNYDWKNSQCKKLFQDILKAYDKDKDHFMGTMVY LKITDSSQLEEVLIIDGQQRITSIYILLKSLYDYAKKEKNFKLEDIIKDYLFNKKYEESH KLKLKPIKSDNEQLNYLMKNEFDELSKESNIYQNYILFESLIEEAISKNYTIEDILEGIE KLEVVEIVLDSSQGDEAQVIFESINSTGLELSVADLIRNYLLMDDKKQEFLFEEYWLKIE KMVGYDNLEEFFLHFLNSKLSDITLYKNIYPKFKTYYEKNFVTHEEILILLKKYAYYYSA FIGRNNKYSIEIMECLNNFNIINQTTIYPFLLLLFDDFETQKINEVEMLKILKFLLSYTV RRIICEIPSNSLRGFYKSLYSRTIKSNQKDYYYKIVSFFENIKTRDKIIGDKEFKNALIY KPLYKKPICKFILSSIENSTKEKIDILNLTIEHILPQKENSIVWKEEIGSNYKNIYELYL HTLGNLTITGLNSELGAKPFLEKKKIISENSKANILNQLILKSNVWNEEKILKRAEFLAN KILKIFEYEKVKLKLNSDEEEIYNLDSNINVTNTKPECFIFRGEKINVKSYSQMLEKFLE LIYDLDFKILIKLAKNNFSLPQAKNTYITYNKEKLRQPREIVKTGIFFETNLNSTLIIYF IRQVIQDSTEFDTSEFEFILKQ >gi|224461393|gb|ACDC01000009.1| GENE 33 37548 - 38873 1830 441 aa, chain - ## HITS:1 COG:FN0598 KEGG:ns NR:ns ## COG: FN0598 COG1132 # Protein_GI_number: 19703933 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Fusobacterium nucleatum # 1 440 143 582 583 749 94.0 0 MFKELLTVLILTGRMFQVDYILALVSLILLPLIIRVVRKYTKKIRKYGRERQDTTGKVTA FTQETLSGIFVIKAFNNTDFVIDKYKDLTKEEFEQAYKTTKIKAKVSPINEVITTFMVLL VVLYGGYQILVTKKITSGDLISFVTALGLMHQPLKRLISKNNDLQDSLPSADRVVEIFDE NIETDVFGEAVKFDEKIQNIKFENVNYKYDDSNEYVLKNVNLDVKAGEIVAFVGKSGSGK TTLVNLLARFFNTDEGSVTVNGVNIKNIPLGIYRNKFAIVPQETFLFGGTIKENISFGKE VTDEEIITAAKMANAYNFIQEDLPNKFETEVGERGALLSGGQKQRIAIARALIKNPEIMI 
LDEATSALDSESEKLVQDALDSLMEGRTTFVIAHRLSTIVRADKIVVMDNGEIKEMGTHS ELIAMNGIYKNLYDIQFNENK Prediction of potential genes in microbial genomes Time: Thu May 19 22:46:56 2011 Seq name: gi|224461392|gb|ACDC01000010.1| Fusobacterium sp. 2_1_31 cont1.10, whole genome shotgun sequence Length of sequence - 8331 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/1.000 - CDS 1 - 418 269 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 5/0.000 - CDS 415 - 1485 1340 ## COG0763 Lipid A disaccharide synthetase 3 1 Op 3 5/0.000 - CDS 1495 - 2298 981 ## COG3494 Uncharacterized protein conserved in bacteria 4 1 Op 4 25/0.000 - CDS 2298 - 3071 1103 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 5 1 Op 5 4/0.000 - CDS 3088 - 3513 609 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 6 1 Op 6 1/1.000 - CDS 3533 - 4366 1132 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 7 1 Op 7 . - CDS 4377 - 6590 2592 ## COG0210 Superfamily I DNA and RNA helicases - Prom 6625 - 6684 9.8 + Prom 6463 - 6522 6.7 8 2 Op 1 . + CDS 6735 - 7919 1704 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 9 2 Op 2 . 
+ CDS 7986 - 8189 370 ## gi|237740040|ref|ZP_04570521.1| predicted protein Predicted protein(s) >gi|224461392|gb|ACDC01000010.1| GENE 1 1 - 418 269 139 aa, chain - ## HITS:1 COG:FN0598 KEGG:ns NR:ns ## COG: FN0598 COG1132 # Protein_GI_number: 19703933 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Fusobacterium nucleatum # 1 139 1 139 583 239 90.0 1e-63 MKILNFKNKSLNVFLGYSYRYKWHMIAVIILSTIASAMSAVPAWLSKKFVDDVLIKQNKE MFLWIIGGIFAATVIKVISSYYSEITSNFVTETIKREIKIDIFSHLEKLPINYFKKNKLG DTLSKLTNDTTSLGRIGFI >gi|224461392|gb|ACDC01000010.1| GENE 2 415 - 1485 1340 356 aa, chain - ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 356 1 356 356 602 92.0 1e-172 MKFFVSTGEASGDLHLSYLVKSVKSRYKDVDFVGVAGEKSKKEGVEILQDISELAIMGFT EAIKKYKFLKQKAYEYLQYIKDNQIENVILVDYGGFNVKFLELLKNEIMDIKIFYYIPPK VWIWGEKRVKKLRLADYIMVIFPWEVDFYKKHNIDAVYFGNPFTDFYKKVERTGDKILLL PGSRRQEIKAMLPVFEEIINDLKDDKFILKLNSEQDLVYTENLKKYTNLEIIIDKKLKDI VGDCKLSVATSGTVTLELALFALPSIVVYKTSLINYLIGKYILKIGYISLPNLVLDDEIF PELIQKDCEAKNIEKHMKKILENLPEIEEKIENMRKKIEGKAVVESYADFLVKEGK >gi|224461392|gb|ACDC01000010.1| GENE 3 1495 - 2298 981 267 aa, chain - ## HITS:1 COG:FN0596 KEGG:ns NR:ns ## COG: FN0596 COG3494 # Protein_GI_number: 19703931 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 267 1 267 267 452 90.0 1e-127 MEKIGLIVGNGKLPLYFIEEAKNSNISVYPIGLFPSVDEEIKKSDNYAEFNVGHIGEIIK YLLLNDITKIVMLGKIEKKLIFENLILDKYGEKIMEIVPDKKDETLLFAIIGFLRLNGIK VLPQNYSMKRLIFEAKCYTERHPDADDEKTISMGIEAARLLSRVDVGQTVVCRDKAVIAV EGIEGTDETLKRAGQYSDKDNILIKMSRPQQDMRVDVPVIGLHTVETAIQNGFKGIVAQA KKMIFLNQKECIELANKNNIFIIAKKI >gi|224461392|gb|ACDC01000010.1| GENE 4 2298 - 3071 1103 257 aa, chain - ## HITS:1 COG:FN0595 KEGG:ns NR:ns ## COG: FN0595 COG1043 # 
Protein_GI_number: 19703930 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Fusobacterium nucleatum # 1 257 1 257 257 462 91.0 1e-130 MVDIHKTAIIEEGAIIEDGVTIGPYCIVGKDVIIKKGTVLQSHVVVEGITEIGENNTIYS FVSIGKANQDLKYKGEPTKTIIGNNNSIREFVTIHRGTDDRWETRIGSGNLLMAYVHVAH DVIIGDDCILANNVTLAGHVVVDSHAIIGGLTPIHQFTRIGSYSMIGGASGVNQDICPFV LAEGNKAVIRGLNSIGLRRRGFTDDEISNLKKAYRILFRQGLQLKDALEELERDFSEDKN VKYLVDFIKSSDRGIAR >gi|224461392|gb|ACDC01000010.1| GENE 5 3088 - 3513 609 141 aa, chain - ## HITS:1 COG:FN0594 KEGG:ns NR:ns ## COG: FN0594 COG0764 # Protein_GI_number: 19703929 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Fusobacterium nucleatum # 1 141 1 141 141 259 93.0 9e-70 MLDVLEIMKRIPHRYPFLLVDRILEMDKENQTIKGKKNVTINEEFFNGHFPGHPIMPGVL IVEGMAQCLGVMVMENFSGKVPYFAAIESAKFKNPVKPGDTLIYDVKVDKVKRNFVKATG KTYVDDAVVAEANFTFVIADL >gi|224461392|gb|ACDC01000010.1| GENE 6 3533 - 4366 1132 277 aa, chain - ## HITS:1 COG:FN0593 KEGG:ns NR:ns ## COG: FN0593 COG0774 # Protein_GI_number: 19703928 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Fusobacterium nucleatum # 1 277 7 283 283 491 91.0 1e-139 MKRKTLKNVVEYDGIGLHKGEVIKMKLIPNKSTGIVFRMMNMPEGKNEILLDYRNTFDLT RGTNLKNEHGAMVFTIEHFLSALYVVGITDLIIELSGNELPICDGSAIKFLDLFHESGIV ELDEDVEEIVVKEPIFLSKGDKHIIALPYENGYKLTYAIRFEHTFLKSQLAEFEITEEVY KKEIAPARTFGFDYEVEYLKQNNLALGGTLENAIVIKKDGVLNPEGLRFEDEFVRHKMLD IIGDLKILNRPIRAHIIAVKAGHLIDIEFAKILDNIK >gi|224461392|gb|ACDC01000010.1| GENE 7 4377 - 6590 2592 737 aa, chain - ## HITS:1 COG:FN0592 KEGG:ns NR:ns ## COG: FN0592 COG0210 # Protein_GI_number: 19703927 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Fusobacterium nucleatum # 3 737 1 735 735 1139 85.0 0 
MNLNLLEKLNEKQREAASQIDGSILILAGAGSGKTRTITYRIAHMIENIGISPYSILAVT FTNKAAKEMRERVEDLVGEVAKSCTISTFHSFGMRLLRMYAAEVGYSPNFTIYDTDDQKR IIKAILKGQNITVNGNKLTERDLISIISKIKEEIKTIEEYSVMNKQIIEVYEKYNRSLIE SNAMDFSDILLNTYKLLQNSSILEKIQKKYKYIMIDEYQDTNNLQYKIIDLIARKSANLC VVGDENQSIYGFRGANILNILNFENNYKNAKIIKLEENYRSTSTILDAANELIKNNKSSK DKRLWTQNGKGDLIKVLVCDNARDEVSKIIDIIKENHQNGIPYKDMTILYRTNMQSRVFE EGLLRYNIPHKVFGGISFYSRAEIKDIIAYLSIIVNPQDELNLQRIVNVPKRKVGEKGIE KIIAFARENNLNLLDALSHIKDISGLTATGKEKLSEMYDIIKELKDLSYSETASYIVETL LDKIKYIDYVKETYDDADARIENIEEFKNSILELENVVGVLRLSEYLENVSLVSATDDLE DEKDYIKLMTIHNSKGLEFPIVFLVGFENEIFPGARASFDEKEMEEERRLCYVALTRAEK KLYLSHTAIRFVYGQDRLATPSIFLKEIPEKLLDVEVKKERLYFEDDEFSDTRHSEKFKR FEKKKTEINTKNTIVIPDDVKKVLDTLGFKIGDKVKHKKFGLGVIKKMDAKKIYVQYVDE TREMAIILADKLLTKLD >gi|224461392|gb|ACDC01000010.1| GENE 8 6735 - 7919 1704 394 aa, chain + ## HITS:1 COG:FN0590 KEGG:ns NR:ns ## COG: FN0590 COG1473 # Protein_GI_number: 19703925 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 1 392 1 392 393 738 89.0 0 MEEKIKKLSEKYLERVMELRRELHKYPEIGFDLFKTAEIVKKELDRIGIPYKSEIAKTGI VATIKGGKPGKTVLLRADMDALPLTEESRCDFKSTHEGKMHACGHDGHTAGLLGVGMILN ELKDELSGNIKLLFQPAEEEPGGAKPMIDEGVLENPKVDAAFGCHIWPSIKAGHVAIKDG AMMSHPTTFEIIFQGKGGHASQPEKTVDTVMVACQTVVNFQNIISRNISTLRPAVLSCCS IHAGEAHNIIPDKLFLKGTIRSFDEGITDQIVNRMDEILKGITSAYGASYEFLVDRMYPV LKNDHELFKFSKNALENILGKDNVEVMEDPVMGAEDFAYFGKHIPSFFFFVGVNDEQLEN ENMLHHPKLFWDEKYLITNMKTLSQLAVEFLNFN >gi|224461392|gb|ACDC01000010.1| GENE 9 7986 - 8189 370 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740040|ref|ZP_04570521.1| ## NR: gi|237740040|ref|ZP_04570521.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 67 1 67 67 98 100.0 1e-19 MDEFSVSVATISNWVRQFHDECQINEKANDEYNYMKENLRLRKELEEVKKENEFLKKAAA FFTKEID Prediction of potential genes in microbial genomes Time: Thu May 19 22:47:03 2011 Seq name: gi|224461391|gb|ACDC01000011.1| Fusobacterium sp. 
2_1_31 cont1.11, whole genome shotgun sequence Length of sequence - 5238 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 8.3 1 1 Op 1 . + CDS 139 - 291 149 ## gi|237740047|ref|ZP_04570528.1| predicted protein 2 1 Op 2 . + CDS 291 - 800 591 ## FN2112 hypothetical protein + Term 816 - 852 2.8 + Prom 806 - 865 4.0 3 2 Op 1 . + CDS 890 - 1372 644 ## gi|237740043|ref|ZP_04570524.1| predicted protein 4 2 Op 2 . + CDS 1375 - 1905 491 ## FN2112 hypothetical protein + Term 1924 - 1960 5.9 5 3 Op 1 . + CDS 1967 - 2452 761 ## gi|237740045|ref|ZP_04570526.1| predicted protein 6 3 Op 2 . + CDS 2452 - 2964 610 ## FN2112 hypothetical protein + Term 2983 - 3019 4.3 7 4 Op 1 . + CDS 3029 - 3223 219 ## gi|237740047|ref|ZP_04570528.1| predicted protein 8 4 Op 2 . + CDS 3223 - 3744 627 ## FN2112 hypothetical protein + Term 3760 - 3795 3.7 + Prom 3746 - 3805 3.6 9 5 Op 1 . + CDS 3844 - 3996 158 ## gi|237740047|ref|ZP_04570528.1| predicted protein 10 5 Op 2 . + CDS 3996 - 4505 484 ## FN2112 hypothetical protein 11 5 Op 3 . + CDS 4570 - 4827 336 ## SSA_0394 hypothetical protein + Term 4830 - 4887 12.3 - Term 4822 - 4871 11.8 12 6 Tu 1 . - CDS 4876 - 5199 435 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|224461391|gb|ACDC01000011.1| GENE 1 139 - 291 149 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740047|ref|ZP_04570528.1| ## NR: gi|237740047|ref|ZP_04570528.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 49 15 63 64 90 97.0 3e-17 MIHFLDTNNVKEMAVPNYNYKIGEMDGKDMRSIFERWKDKRTQEKRYGGK >gi|224461391|gb|ACDC01000011.1| GENE 2 291 - 800 591 169 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 168 1 164 164 148 51.0 6e-35 MKKIFISLVSLLVFTSCVLHVYRFTSVNYNNSRISISAGLVNSEDEKSPVEYIGVSDVRS NVNTPHKVKILSSTIKIIDSNNKEYIAKTNSNSGYIHIYKQGVVITDDFKAYIGKVQLDD GTIIDIPPLSFKKTVYVERYSVISDTINAGGRGKEIFSGTVEDYKKQKK >gi|224461391|gb|ACDC01000011.1| GENE 3 890 - 1372 644 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740043|ref|ZP_04570524.1| ## NR: gi|237740043|ref|ZP_04570524.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 160 1 160 160 250 100.0 2e-65 MKQQQGTIRYNAAINSLYKSDNGKELLAYNLENSSLVGIAITPNMVNDLNSKFTFINLMK KRLNKENEEKKGWKPLPLINFSTQINPDDNVPKYLGNKPNVEGWQNHPVANYNLKDYNYN EKKGIYERKTDKNKKNIPNTTVIIEEMIDKDKKFNYNPRR >gi|224461391|gb|ACDC01000011.1| GENE 4 1375 - 1905 491 176 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 39 172 23 164 164 62 28.0 9e-09 MKKIIFLFVVIFILTNCTSIKVWNNYYPVIIGQNEKEINETIELHGNVQNNSDLNSYLNY ISLFERKETTNKGIKFLEPEIIIEGEQKYKIKNQEKSSNLEILSQGIKINSDTFTIYIGK VRLKDGRIINLPPIKFKRYISVYKINKFLDTLNQDTREDLFSGTIDEYREWKKKNK >gi|224461391|gb|ACDC01000011.1| GENE 5 1967 - 2452 761 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740045|ref|ZP_04570526.1| ## NR: gi|237740045|ref|ZP_04570526.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 161 1 161 161 292 100.0 4e-78 MINWHSQGSIYGAAAMIHFLDTDEGKEIAKKNLGVIGIKGIAITALGYGKLDKRMKGLDT GAQLYYMRNIGDWVSSIAAQGMPINTNGHKHDNIGYNDDNYELEEGMYKRKIIKIIRNGK EEKMVVPNYNYKIGEMDGKDMRSIIERWEDKRTQEKWDGGK >gi|224461391|gb|ACDC01000011.1| GENE 6 2452 - 2964 610 170 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 168 1 164 164 143 47.0 2e-33 MKKILISLISLIILTACVSTRYSYYPVSSYGINKISISAGLISEEDENSPVEYIGISDVR SNVDEPHKVKILSSSIKIIDKNKEYIVKTNPKSGYIYIYKQGVVITDDFKTYIGKVQLDD GTIIDIPPLSFKKNVYVESYNPVTDTINAGARTKRLFNGTIDEYREWKNK >gi|224461391|gb|ACDC01000011.1| GENE 7 3029 - 3223 219 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740047|ref|ZP_04570528.1| ## NR: gi|237740047|ref|ZP_04570528.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 64 1 64 64 120 100.0 2e-26 MNLHSQGLIYEVAAMIHFLDTNNVKEMAVPNYNYKIVEMDGKDMRSIFERWKDKRTQEKR YGGN >gi|224461391|gb|ACDC01000011.1| GENE 8 3223 - 3744 627 173 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 169 3 164 164 134 47.0 1e-30 MKKIFISLLFLIILTACVSARYSYYPVSSYRSDKISISAGLVNAEDENSPVDYIWVSDKR GYVGNSHYAKILSPTIKIVDKKNKEYIIKNDFYNEHIYIYKQGVIITDDFKAYIGKVQLD DGTIINIPPLSFRKNVYEESYNPVTDTINAGRRTKRLFNGTIEEYKEYKNQKK >gi|224461391|gb|ACDC01000011.1| GENE 9 3844 - 3996 158 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740047|ref|ZP_04570528.1| ## NR: gi|237740047|ref|ZP_04570528.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 49 15 63 64 89 95.0 5e-17 MIHFLDTNDVKEMAVPNYNYKIGEMDGKDMRSIFERWKDKRTQEKRYGGK >gi|224461391|gb|ACDC01000011.1| GENE 10 3996 - 4505 484 169 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 167 1 164 164 155 54.0 5e-37 MKKIFISLVSLLVFTSCVLHIYNFSSINYRNDKISIDTNLLNSQKENSPLDYIWISDKRS HVGNNHRIKILSPTIKIISNSKEYILNTNPNSEVISVYKQGVIITDDFKAYIGKVQLDDG TIIDIPLVSFKKNVYVERYSVISDTINAGGRGKEIFSGTVEDYKKQKNN >gi|224461391|gb|ACDC01000011.1| GENE 11 4570 - 4827 336 85 aa, chain + ## HITS:1 COG:no KEGG:SSA_0394 NR:ns ## KEGG: SSA_0394 # Name: not_defined # Def: hypothetical protein # Organism: S.sanguinis # Pathway: not_defined # 1 85 7 91 92 122 68.0 5e-27 MSDKQTKILGWLGTTLSILMYVSYIPQIMGNLNGNKTSFIQPLVAAINCTIWVCYGFFKK NRDLPLALANLPGIIFGLIAAFTAL >gi|224461391|gb|ACDC01000011.1| GENE 12 4876 - 5199 435 107 aa, chain - ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 1 104 340 443 445 134 61.0 3e-32 MLERVFLNILMNAVKFTKTNIEVSLTREDKTAVLKIRDNGIGISEENKKFIWERFFQVND SRNKEENKGSGLGLSMVKKIVDLHSATIDLESELEQGTCFTIKFNMQ Prediction of potential genes in microbial genomes Time: Thu May 19 22:47:50 2011 Seq name: gi|224461390|gb|ACDC01000012.1| Fusobacterium sp. 2_1_31 cont1.12, whole genome shotgun sequence Length of sequence - 8664 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 - CDS 1 - 886 952 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . - CDS 864 - 1538 969 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 1 Op 3 . 
- CDS 1563 - 2765 1191 ## COG0019 Diaminopimelate decarboxylase - Prom 2926 - 2985 7.4 4 2 Op 1 23/0.000 - CDS 3182 - 3877 274 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 5 2 Op 2 1/0.000 - CDS 3852 - 5021 1792 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 6 2 Op 3 . - CDS 5025 - 7271 2002 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC - Prom 7494 - 7553 9.9 7 3 Tu 1 . - CDS 8046 - 8483 532 ## FN0145 hypothetical protein Predicted protein(s) >gi|224461390|gb|ACDC01000012.1| GENE 1 1 - 886 952 295 aa, chain - ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 1 295 1 295 445 404 75.0 1e-112 MFLKKMNRFLSRIPVSIRVTVWFSSVIVILFLIILSSLILIEDKVVNDLSQKELVEAVEE IYEDPEKFENFNDGIYYIKYNEQNEIIAGKFPKDFDIALAFSIEDINIYQVENKKFLYYD TRLQDEDDWIRGIYPLGKVQKEIETFWNIAIALSVLFIIFVVIVGYRIIKNAFKPVKQIS NTALEIKRSKDFSNRIELGDSNDDEIHKMASTFNEMLDTVEEVFIHEKQFSSDVSHELRT PITVILAQSDYALQYSDTLEETKESLEVINRHAKRMTNLINQIMELSKLERQKEI >gi|224461390|gb|ACDC01000012.1| GENE 2 864 - 1538 969 224 aa, chain - ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 1 224 1 224 224 345 84.0 5e-95 MRILVVEDEKDLNNIITKHLKKNNFSVDSVFNGEEALEYLDYGTYDLIVLDIMLPKVNGY EVIKKLRENKNETAVLMLTARDSIEDKIKGLDLGADDYLIKPFDFGELLARIRALVRRKY GNTSNTMEIDDLCIDIAKKTVVRGGKNIELTGKEYEVLEYLIQNKGHVLSRDKIRDSVWD YGYEGESNIIDVLIKNIRKKIDIGNSKPLIHTKRGLGYVLKEDE >gi|224461390|gb|ACDC01000012.1| GENE 3 1563 - 2765 1191 400 aa, chain - ## HITS:1 COG:AGc3079 KEGG:ns NR:ns ## COG: AGc3079 COG0019 # Protein_GI_number: 15888979 # Func_class: E Amino acid transport and 
metabolism # Function: Diaminopimelate decarboxylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 399 60 437 440 146 26.0 6e-35 MELKNKICEISKKYDSFYLYDEKIIKNSISNLRKVFPEIEFLYSVKCNSNVNVLKSIFSE GFGADAASLAEVLMARKLDLDKNKIYYSAPGKTSKDIEVAINESNLIADSIEEIRRINMV SKSLNKVTEIGVRLNPDFSGKASKFGVDEDIFYDFLHNNSCQNIKIVGIHVHLKSQELNV ETLANYYKNMFLLVEKVQNTLSYKLKYVNMGSGMGIQYSKSDVPLDLDRLKNLVKDNLSE FKKHNPDITIFIETGRYVTAKSGFYIMKVLDKKVSYGTTYLILKNTLNGFIKPSVIKLVS KYEKENPVSWEPLFTSKDAFEILTFKEETDKKEKVTLVGNLCTATDVIAEDIVLPSMDCG DIIVINNAGSYAAVLSPMQFSSQEKPVEVFLSVDGSIKIN >gi|224461390|gb|ACDC01000012.1| GENE 4 3182 - 3877 274 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 218 1 225 329 110 32 5e-24 MWRHLDMNNMIIKLEDVDKFYMETGNKLHILKKLNLEVKRGEFVSILGKSGSGKSTLLNI MGLLDKIDGGKIWIDDKEVSSLNETERNNIKNHFLGFVFQFHYLMSEFTALENVMIPALL NNFKNKSEIEKEAKELLEIVGLAERMKHKPNQLSGGEKQRVAIARAMINKPKLILADEPT GNLDEDTGEMIFSLFRKINKERNQSIVVVTHARDLSQVTDRQIYLKRGVLE >gi|224461390|gb|ACDC01000012.1| GENE 5 3852 - 5021 1792 389 aa, chain - ## HITS:1 COG:FN0581 KEGG:ns NR:ns ## COG: FN0581 COG4591 # Protein_GI_number: 19703916 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Fusobacterium nucleatum # 1 389 1 389 389 580 90.0 1e-165 MIEFFIAKKQMLERKKQSILSIVGVFIGITVLIVSLGVSNGLDKNMINSILSLTSHINVY SPENIPNYEELVKNIEEVKGVKGAVPTIETQGIIKYEGHGEPYVAGVKVVGYDLDKAIKV MKLDDYIIDGKIDVEDKKSILIGKELAASMGAMVGDKIKLITSEETDLEMTVGGIFQSGF YEYDVNMVLIPLQTAQYITYSDETVGRLSVRLDNPYDAQELIYDVARKLPTDLYIGTWGE QNKALLSALTLEKTIMLVVFSLIAIVAGFLIWITLNTLVREKTKDIGIMRAMGFSKKNIM LIFLIQGIILGIIGIILGIIVSLILLYYIKNYAVDLVSNIYYLKDIPIEISLKEIAIIVG ANFIVILISSIFPAYRAAKLENVEALRYE >gi|224461390|gb|ACDC01000012.1| GENE 6 5025 - 7271 2002 748 aa, chain - ## HITS:1 COG:FN0580 KEGG:ns NR:ns ## COG: FN0580 COG4953 # Protein_GI_number: 19703915 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Fusobacterium nucleatum # 25 748 1 724 724 1206 89.0 0 MFKNINLKKVAIFIITLFILLFIYLIKIYVSYEPKKLVENINYSKIVLDRNGEILSVFLN KDEEFHLKYEGDIPETLKLAVLNYEDKKFYSHSGVDYPRILKSFFNNMTGGKKMGASTIT MQVVKLLEPKKRTYFNKLIEIVKAYKLESQFSKEDILKIYLNNVPYGSNIVGYSAAIKMY FNKDVKDLSYAEASLLAVLPNSPGILNLKKNNDKLEEKRNRLLKTLLDKGLIDERQYKFS LLEKFPNKIYYYEKKAPQFSIFLKNRYKEKIIRSTLDYKLQKKLEKIVHDYSNTMKDTGI NNAAVLVVNNKTKEVLAYVASQDFYDKKNNGEIDGLQAKRSPASLLKPFLYALSIDEGLI VPDSIYPDVPIYFGNFYPKNSTGTFSGMVKMEDALIKSLNIPFVKLLSDYGIDKFYYFLE NNDNYPEDRFDKYGLSLILGTREMRPVDIVKLYVGLANYGKVSNLKYTLTEDVPKEYEQF SKGASYLTLETLSKVVRPGNEKLYSEERPISWKTGTSYGLKDAWSVGVSPDYTVLVWLGN FNQKSIFSLSGVETAGNLLFKVFNIVDINSKPFSKPMEDLKEIEIDEKTGYRKMYDVESK KVLYPKNAKLLRTSPYYKKIFVDENDIEIDSRSEKFDKRKEKIVIEYPVEVSNYFFLNGV RENKKVKIAYPVENLNIFVPKDFEGYNKIAIKLYNPNNEYVYWYIDEEYMGFSNESERFF ELDMGKHKLTIVTEAGAREEVKFKINKR >gi|224461390|gb|ACDC01000012.1| GENE 7 8046 - 8483 532 145 aa, chain - ## HITS:1 COG:no KEGG:FN0145 NR:ns ## KEGG: FN0145 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 4 144 14 157 162 111 55.0 1e-23 MTKNTDFKLIILPLYFFFSILSFTFQHYEYTFTDINNKKEYEVFSEGKKVYIRLENEMLI EGQFPDFEIQNVYEDKNYFIGCDFNNNSEDNEDEKYYIVVDKNKAIMKIYKEQNFKEIYK NIDNKKFINIYTFLKRKGTKFGRYR Prediction of potential genes in microbial genomes Time: Thu May 19 22:47:55 2011 Seq name: gi|224461389|gb|ACDC01000013.1| Fusobacterium sp. 2_1_31 cont1.13, whole genome shotgun sequence Length of sequence - 8713 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 214 141 ## gi|237740061|ref|ZP_04570542.1| hypothetical protein FSAG_00135 - Prom 275 - 334 8.2 - Term 311 - 358 7.2 2 2 Op 1 1/0.000 - CDS 379 - 5226 6704 ## COG2373 Large extracellular alpha-helical protein 3 2 Op 2 . - CDS 5223 - 7043 1670 ## COG0514 Superfamily II DNA helicase 4 2 Op 3 . 
- CDS 7046 - 7789 1084 ## FN0577 hypothetical protein 5 2 Op 4 . - CDS 7820 - 8539 815 ## FN0577 hypothetical protein 6 2 Op 5 . - CDS 8596 - 8712 59 ## Predicted protein(s) >gi|224461389|gb|ACDC01000013.1| GENE 1 1 - 214 141 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740061|ref|ZP_04570542.1| ## NR: gi|237740061|ref|ZP_04570542.1| hypothetical protein FSAG_00135 [Fusobacterium sp. 2_1_31] # 1 71 1 71 71 115 100.0 1e-24 MKIKLKVEPNWEIWIYESYSTVPIIINGTMIFGAYNVDFKMTANIWKQLPEEYKRKIYRY NWKKVIKSMLI >gi|224461389|gb|ACDC01000013.1| GENE 2 379 - 5226 6704 1615 aa, chain - ## HITS:1 COG:FN0579 KEGG:ns NR:ns ## COG: FN0579 COG2373 # Protein_GI_number: 19703914 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Fusobacterium nucleatum # 9 1615 2 1611 1611 2245 77.0 0 MKKFLKLFFALSLLMIALVACQKDKEKAQTEQGQTEQEQNYDYQEMLYVNNAGFNISGDL VIMFSDEIDKNQEFNKLIEVEGLDGDITIMPFGRKIIIKGDFQKEVPYSVKVSKGIKSVS GNELNEDYTRYNLYVGKKQPALAFADYGNVLPSVNNKKINFNSVNIKKVKLEIVKIYTNN ITQYLKLSSNEYSLDWSVKEDIGDVVFSKEYEIESQEDEVVKNSIDLNGVIDTKGIYYVK LTSVGEESIDYDIAKYGEPLSFGYEDQPIYAKATKTIILSDIGIVANSNDSKLDIKLLNL NTLNPIGSAKLEFINSKNQTLEEGTTNSNGEYRSKVNLENVYYVLVKSGNEFNVLYLSDS KINYADFDIGGSLEGSDLKLYTYTDKGYYRPGDEINVSLIARSKEKMNDEHPFEYSFTAP DGSNKINNEVVKESKNGFYTFKIKTDVNDLNGAWTLTIKFGGKEVTQKVFIESKVANSIA IEADEDKIYSKADIKDGLMRFKFDFKYLSGAKLDKDSNVNLDYNVIEREPRSKKYKNFVF VNPSNYKYQFRNFAETKTDGSGELELRLEMPQALQNKNLYLSTTVNVQDASGRYSTENKV FTIINRENSVGVQKLDQNGNEASVKYILLNEKTDSPVAGKKLKYRVYNKQNNWWYDYYED DEKSFKENMETTLLEEGEITSASDAEILKVSSLADGVNFIEIEDEETGHSSGVFVYNYHY GDKKTGTIENLKASTDKEKYDIGDIAKIKYTGSIGSKALVTIEKDGKIIKEYWKTLTSTE NEETIVIEKDFFPNAYVSISVFQKYVDKQNDRPLRLYASLPLMVEDKSKMLTIDIDTKTE VLPAGDLNIKLSNKEKKKMYYEVFLVDEGVLRKTDYKKPDPYKFFYEKRAKLVQNFDNFS NIIEKYSDKVMNRLKTGGGDYEELAAEATDRAKVASDQKDELQLQGEAQRFKNLTIFRGV AESDENGNAELNIKVPNFFGQMRVFVVAVSDESYGSAEKSISVKAPVIVDSSAPRVLKVG DKFTVPVTLFPIEKAIGDSEVTLTYNGKTYSKKVNVKDGQNEKLLFELDAPDTVGTTKID IDFKSSKYSFKDSIDLNVDTNYPYQYVEKSLVLEPNQEFTLSMDEYKEFINGSIKSNISL 
SSYPKLGIEKLIKSLMDYPYICLEQISSKGLSMLYIDKLTTDLVEKNDAKNEINAIIAKL NNNYQLRNGAFAYWPGSQEESMSTIYAIEFLIEAKERGYYIPEAMFENAQAYLNSIAMRV DIPKADVLYLLASLNDPNVSEMNIFFDRYYNDASLVDKWTLLGAYAKIGEKDFARKEAEK LPKKAETKDGIYYADQNAKILRYYTEIYGSPEPSLYSSVLGTAKSDEWLTTFEKAHIVQA LAEGEKVSPEKKNLSFKLIVDGKEQNLELKDGEYTLKNLGIKENAKKIVIKNTSTSKLYV NSFVKGKPVKYEEKDESKNITITRRFVDMSGKEIDVKNLKAGTRFRMILSSKVDNNNLDD ISLLQILPSGWEFDNSQAGAPQNSDPQVVPMNTADIDNAEYGGEMNIADNSSYTDMRDDR VAYFFPLYAGEDKEIEINLIAVTPGSYRLPGTKVESMYNKDFRAYLKGFEVKVSQ >gi|224461389|gb|ACDC01000013.1| GENE 3 5223 - 7043 1670 606 aa, chain - ## HITS:1 COG:FN0578 KEGG:ns NR:ns ## COG: FN0578 COG0514 # Protein_GI_number: 19703913 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Fusobacterium nucleatum # 1 605 9 613 614 1020 87.0 0 MKAEALRILKEYYGYDNFREGQEKIIDAILQKRNVLGIMTTGAGKSICYQVPALVFNGLT IVISPLISLMKDQVDSLKLIGIDASYLNSTLTSDEYNKILFKIKKGQTKLLYISPERLEN KAFLNFIKTIKIAMVVVDEAHCVSQWGENFRKSYLRIADFIRYITDGVKIQTLAFTATAT PKIKVDIIDKLKIENPFVFVDNFNRDNIYFKVIDNTGLDKDLNIDSKPFIIDYLRKHKGK SGIIYCSTRKNVDDIYNYLVSFDRSVTKYHGGMTKEEREKNQNLFLNDDVEIMVATNAFG MGINKSNIRYVIHANIPADLESYYQEAGRAGRDGGKSEAILIYNEKDRDIQRFLMEKESE GRKDKDYLTKKLKSFNKMIEYAELKTCYREFILKYFGEKMIRNYCGFCENCKKEKNIKDF SLEAKKIISAVGRTKESLGISTLANMLMGKADTKMLNKGLNKISTFGIMREDKQEWIESF INYMISEKYLVQSAGSFPVLKLGKNYKEILNDNIKIIRKENEKIDFDYYENALFKELNSL RKEISKKENIAPYIIFSDMTLIEMAEKKPTNRWEMLKIKGIGNQKFTNYGERFLERINAY NMEEKK >gi|224461389|gb|ACDC01000013.1| GENE 4 7046 - 7789 1084 247 aa, chain - ## HITS:1 COG:no KEGG:FN0577 NR:ns ## KEGG: FN0577 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 247 1 241 241 320 73.0 3e-86 MSNIENFKELYNFEFEEIKAESFEEIEKKYLAAYKDGKEKGYTPVFLVLDDNLLETFEIN MEDEDTDNMMELVKSNLEKAKSINPIEFLEKFQGQNTDDLKENIDEYFSEIDYEFDDDDK SNLELSTVFDYDGNFKDNVILVKVPTTKPYEVLGYFGMGGYNECPFPAEQVAVAKYWYEK YGAVPAAITYDEIEFYVERPPQTLEEAKKLAVEHYAFCYDLVLQCCGTFEALVDGLYKNI QWYFWWD >gi|224461389|gb|ACDC01000013.1| GENE 5 7820 - 8539 815 239 aa, chain - ## 
HITS:1 COG:no KEGG:FN0577 NR:ns ## KEGG: FN0577 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 238 1 240 241 253 57.0 6e-66 MSNIDEFKKLYNFEFEEIKVDSFEEVSEKYLATYKDGKEKGYTPVFLTVDDYLLKTFEIN MKDENTDNMIDIFNKNLEKAKNINPIELFNKFIEQNADSIKSNVNEDFTKNNYEINDSNK NNLKFLTIFNNEGNLKDNVILVKIPTTKPYEILAYFGMGSEGIATVKYWYEKYGAVPAAI TYDEIEFYVERPVQTFEEARKLAIEQYAFCYGLLWECYDTLDELASAIYKNVQWYFWWS >gi|224461389|gb|ACDC01000013.1| GENE 6 8596 - 8712 59 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no ADINKDLIEKYGKYLIIIKNSFDFEKKIKNNLEQINLF Prediction of potential genes in microbial genomes Time: Thu May 19 22:48:12 2011 Seq name: gi|224461388|gb|ACDC01000014.1| Fusobacterium sp. 2_1_31 cont1.14, whole genome shotgun sequence Length of sequence - 1255 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 163 - 966 790 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake - Prom 1063 - 1122 11.6 Predicted protein(s) >gi|224461388|gb|ACDC01000014.1| GENE 1 163 - 966 790 267 aa, chain - ## HITS:1 COG:FN0571 KEGG:ns NR:ns ## COG: FN0571 COG0758 # Protein_GI_number: 19703906 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 1 266 1 266 304 362 77.0 1e-100 MYSKEELLIFSIINSKYDIGTQNLVNKIFKFSNKENLNFFNLNREDKIEFLSNFLSEENI EKILDIFDKDNFYKFEIEKIFKICKEKSINIFYHSYENYPKSLMNIKESPYVIFVKGTLP IDKELEKAFAIVGTRKASQEGINFAKDIGTYLAKNNIYNISGLALGIDTVGHETCLHRTG AILGQGLDLEIYPRENINLAEKILENNGFLLSELIPQQELSMFSLIKRDRLQSALTSGII IAESGIKGGTVNTFKYAKEQKKKIFIK Prediction of potential genes in microbial genomes Time: Thu May 19 22:48:17 2011 Seq name: gi|224461387|gb|ACDC01000015.1| Fusobacterium sp. 
2_1_31 cont1.15, whole genome shotgun sequence Length of sequence - 14451 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 62 - 340 457 ## gi|237740067|ref|ZP_04570548.1| conserved hypothetical protein - Prom 435 - 494 7.8 - Term 433 - 478 1.5 2 2 Op 1 1/0.667 - CDS 512 - 2071 1271 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family 3 2 Op 2 . - CDS 2147 - 3331 281 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family 4 2 Op 3 8/0.000 - CDS 3267 - 4604 1877 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 5 2 Op 4 . - CDS 4608 - 5645 1220 ## COG0451 Nucleoside-diphosphate-sugar epimerases 6 2 Op 5 1/0.667 - CDS 5648 - 7201 880 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family 7 2 Op 6 2/0.000 - CDS 7201 - 7482 168 ## COG3952 Predicted membrane protein 8 2 Op 7 . - CDS 7488 - 8198 734 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 8376 - 8435 13.4 + Prom 8391 - 8450 11.2 9 3 Op 1 40/0.000 + CDS 8495 - 9169 874 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 10 3 Op 2 . + CDS 9172 - 10476 1213 ## COG0642 Signal transduction histidine kinase + Term 10582 - 10646 2.2 11 4 Tu 1 . - CDS 10646 - 11239 642 ## PROTEIN SUPPORTED gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415 - Prom 11334 - 11393 15.4 12 5 Op 1 11/0.000 + CDS 11523 - 12566 1547 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component + Term 12604 - 12643 1.2 + Prom 12569 - 12628 7.8 13 5 Op 2 11/0.000 + CDS 12660 - 13130 519 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 14 5 Op 3 . 
+ CDS 13145 - 14434 667 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 Predicted protein(s) >gi|224461387|gb|ACDC01000015.1| GENE 1 62 - 340 457 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740067|ref|ZP_04570548.1| ## NR: gi|237740067|ref|ZP_04570548.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 92 15 106 106 168 100.0 1e-40 MKEKYIYPCVVYEEDGIYYANFKDFDACFTDGESIEEVIINAKDVLEGTIFSLLKNNLEV PEPTLTKPDLENNEFLVYIDIWLTPIIDKVNN >gi|224461387|gb|ACDC01000015.1| GENE 2 512 - 2071 1271 519 aa, chain - ## HITS:1 COG:FN1262 KEGG:ns NR:ns ## COG: FN1262 COG1807 # Protein_GI_number: 19704597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Fusobacterium nucleatum # 1 519 1 519 519 701 75.0 0 MFSTRRKDIFVLVVLSLFAYLSIIAIREIDSAEARNFIAAREMLENSSWWSPTVNGHFYF ENPPLPTWLTAIVMMITRSHSEAILRIPNMLCCIFTILFLYRSMIRIKKDRLFAFLCSFV LLTTFMFIKLGAENTWDIYTYTFAFCASLAFYLYVRDGQRKNLYRMAILVFLSFLSKGPI GFYSVFIPFLLAHYIIFPKEIFKKRTFFVFLTLIISIALSLVWAFSMYFNHGDYFLTIVK DEVSAWATKHHRSFIFYTDYFVYMGSWLFFSIFVIFKVPEKKEEKVFWLWTILSLIFISI IQMKKKRYGLPMYLTSSITIGQLCIYYFRKTYAELKKREKTLLIIQQLFLLFVIFVSLIF LTYFGYIKKEISFGLFFLYAALHLLFLFLFAVGYTEISYAKRVIIFSGLTMLLVNFSSSW ILESKFMQNNLLKFRMPIDEEILKSSDSIYAERYDIEDVWKLGKQIKTLNKNMPDEREII FLGKEEPKSLSKVYEVKKVYEYQKVTHDMERLYILERIY >gi|224461387|gb|ACDC01000015.1| GENE 3 2147 - 3331 281 394 aa, chain - ## HITS:1 COG:aq_1220 KEGG:ns NR:ns ## COG: aq_1220 COG1807 # Protein_GI_number: 15606454 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Aquifex aeolicus # 47 350 29 317 499 63 25.0 6e-10 MKKLENVLNITRLVKNMLQVLKKNRILIAILVLFLILPFLRVPDLRNEMKYLNIVQEIVD KKSYWVLYYQGELYPDKPPLYFWLLTVIYKIFGKNLLFPLSLIFLSYLPFLSILSLACWQ LNYLKKEWKDIFLLYSFTIPYLMGLSIFLRMDMLMTFFVTLSLSLFIYFYFNRNKINNIK 
LFFLYLSIFLGIFTKGALGGILPILIIYIFLYLENDLNFFNNLHWKKGILFLVFFFSIWL VILYFQPNGTEYIKLLLGKQTIGRAYKSYSHARPFYYYFIYLPLTFFPYGFFYVYGFFKY LINLDRKKTWTLFEKWAFSWSIPPFILLSIISGKLQIYLLPLYIGMIFLSLIVRDKLIKK YEFLVSVEKNIQKCVYILYFFLPIGFLIYSKYFI >gi|224461387|gb|ACDC01000015.1| GENE 4 3267 - 4604 1877 445 aa, chain - ## HITS:1 COG:AGl1413 KEGG:ns NR:ns ## COG: AGl1413 COG1004 # Protein_GI_number: 15890827 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 445 1 439 443 447 51.0 1e-125 MKITVIGTGYVGLVQGIIMSEFGSEVICVDKSQEKIEILNKGELPIYEPGLAELLHKNQK AKRISFTTDIEKAIKKSEVIFIAVGTPALEDGSADLSAVLNVAEEIGEYINSYKVIVDKS TVPVGTGRKVINIIKEKIKSRNENIDFDVVSNPEFLREGKAVNDCLRPNRIVIGTESEKA KEIMSKVYNVLYINATPFLFTNLETAEMIKYASNAFLAVKISFINEIALLAEKVGANTQE IARGMGMDGRISPKFLHCGAGYGGSCFPKDTKAIVEIGKEYGEDMYVISAAIAANEKQKK KMVEKIKNTIGNLSGKVIGVLGLSFKPDTDDMRDAPSIDIIEGLIKEGAKIQAYCPEGIK EAIWRLQDYEDNIIYCADEYSIANGADAIVLMTEWNQFRGMDLKRLRKRMKDNYYFDLRN INIKNEKIRECFKYYPTGEKYVTSA >gi|224461387|gb|ACDC01000015.1| GENE 5 4608 - 5645 1220 345 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 1 341 1 330 343 227 35.0 3e-59 MNVLITGGAGFIGSHLVEKFLKEKHRVIVVDNFDSFYSMDIKVLNVLESINKKELKEKIL ALKDDEKLISLIKYTESDNYKLYVEDICNLENLKEIFIKENIDFIVNLAALAGVRPSILR PFDYERVNVKGFLNILEICKELKINKLIQASSSSVYGNSKADIFTEDIRVDFPISPYAAT KKAGEEFGNVYSHLYNIDMIQLRFFTVYGERQRPDLAIHKFIEKIENNEEVTIYGDGNTS RDYTYIKDIVDGIFKSFEYLNNHQNIYEIINLGSSRKIKLIDMIKIIENKLNKKAKLKFI DKQAGDVDKTFACIDKAKKILNYKVSTKFEDGIENFVNWYRQRGV >gi|224461387|gb|ACDC01000015.1| GENE 6 5648 - 7201 880 517 aa, chain - ## HITS:1 COG:FN1262 KEGG:ns NR:ns ## COG: FN1262 COG1807 # Protein_GI_number: 19704597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Fusobacterium nucleatum # 1 516 1 517 519 523 58.0 1e-148 MLFSNKKDIFILLILSIFAFLSIIWINEVDIMEARNFITAREMLQNSDWWTTTLNGQFRF EKPPFPTWLTAFTMMITNSKLEFILRIPNMLVSVFTILFLYKTMFRIKKDRLFAFLSSFV LLTTFMFIKIGAENTWDIYTYAFAFCASLSLYIYMQENLKKDLFLTIIFLILSFLSKGPV GFYAIFIPFIIAYIFNNPKEKWKRKIKFIFLAVIVAFAISSIWAISMYLNHNDYFLKVMK KESSTWSTKHSHSIFFYLDYFVYMGTWIFFSVFAFLKVPKNKEDKIFYIWNIISLIFISI IQMKKKRYGLPIYLISSLNIAQLCVYYFRIPYEYLSKIEKFLLRFQQYFIMFVIFLSLEF LTYFGYIKKEISFILFLLYIFIHIIFLILINIKYTKNNYAKRIILFSGLTMLVLNFSSSW ILENNFMKDKMLKFRVPISQEVVVSNYPIYSNSFDIEEVWRLGKNIKELVNIPIEKNIFY LGDKEPQILMNDYRIIKIYKYQKINHKFTNLYYLERK >gi|224461387|gb|ACDC01000015.1| GENE 7 7201 - 7482 168 93 aa, chain - ## HITS:1 COG:mlr0010 KEGG:ns NR:ns ## COG: mlr0010 COG3952 # Protein_GI_number: 13470337 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mesorhizobium loti # 3 93 16 106 110 100 58.0 1e-21 MNFLNNLNFFMVLGFIGQFFFSMRFIVQWVASEKHKKSVVPLAFWVFSVLGSFLLLIYAI YRKDPVFILGQAPNLLIYFRNIWLIKTSKKGEL >gi|224461387|gb|ACDC01000015.1| GENE 8 7488 - 8198 734 236 aa, chain - ## HITS:1 COG:aq_1899 KEGG:ns NR:ns ## COG: aq_1899 COG0463 # Protein_GI_number: 15606924 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Aquifex aeolicus # 4 225 2 224 322 193 45.0 2e-49 MKRISVIAPIYNEKENIEILVEKIKTTLKDRFTSYEIILVDDGSTDGSSELIEKIASTDP HIKNYHFTKNNGQTAALSVGFKYCTGDIVVTMDSDLQTDPEDIYLMLPYLDKYDMVNGKR TTREDGFKRKISSLIGNGVRNFITKDNIKDTGCPLKLFKKEVVKSFYLYEGMHRFLPTLA KINGFSVIEVPVQHFDRMYGKSKYGVWNRLWKGLKDAFAVRWMSKRKINYVLKESR >gi|224461387|gb|ACDC01000015.1| GENE 9 8495 - 9169 874 224 aa, chain + ## HITS:1 COG:FN1261 KEGG:ns NR:ns ## COG: FN1261 COG0745 # Protein_GI_number: 19704596 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # 
Organism: Fusobacterium nucleatum # 1 224 10 233 233 348 83.0 6e-96 MIKILLVEDDKNIQRLLTLELRHKEYLVDSAYDGEQGMELFERNYYDVVLLDLMLPKKSG KELCQEFRKLNNTPIIVITAKDSILDKVELLDLGANDYICKPFAMEELLARIRVATRNKE NFVDKQFYLEKDIKLDLSAKRVYLREEEINLTKTEFLILEYFMKNKGLSCSREKIIIDVW GYDFDGEEKIVDVYINSLRKKIDLDNRYIHTIRGFGYMFQYKED >gi|224461387|gb|ACDC01000015.1| GENE 10 9172 - 10476 1213 434 aa, chain + ## HITS:1 COG:FN1260 KEGG:ns NR:ns ## COG: FN1260 COG0642 # Protein_GI_number: 19704595 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 8 432 1 425 427 567 77.0 1e-161 MKKISNELLKTYYWIMLLFTAFSISALVSFSIFLWRENEKDIQTVEKFIDYELSEFEYKE ELKNLSDEELFKTALEEAPKIQDVYVEFFYRGKKYTRPPYLPNREHNFLDYYSVTKTYQL NGYEPIEVKITKRFAKNRLLILYTFGSFIFFLLVCLFIISRIQKRFSKKFENSLDKLKMF TQDYNLDSEIRIHNEENFIEFSILQKAFKNMLIRLKEQSQMQIDFVNNASHELKTPIFVL KGYVDMLNDWGKNDKEVLDESLVILKKEIQNMQDLTEKLLFLAKSKNLVVEKNNISLDNV LKEVIDNLSFAYPKQKINYISSEIFIDSDIALLKLLFKNLIENAIKYGKDNPINIELKKE KKVKVIIEDFGVGISEKALPHIFERFYREDEARNREIKSYGLGLSIVKEIIALLNIDIQI ESQINKGTKITLQL >gi|224461387|gb|ACDC01000015.1| GENE 11 10646 - 11239 642 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988990|ref|ZP_01820390.1| hypothetical protein CGSSp6BS73_02415 [Streptococcus pneumoniae SP6-BS73] # 6 195 3 192 192 251 58 2e-66 MTRDEKLKKLIEDIKNDEENKKYTEQGIDPLFSAPKEARIVIVGQAPGLKAQENKLYWKD KSGDKLRLWTGIDEKTFYSSNLLAIIPMDFYYPGKGKSGDLPPRKDFGEKWHNKILELLP DVELFILIGKYAQEFYLKGRTKENLTETVHSYKEYLPKFFPIVHPSPLNIRWLKKNPWFE KEVVPELKEMVTKIMEK >gi|224461387|gb|ACDC01000015.1| GENE 12 11523 - 12566 1547 347 aa, chain + ## HITS:1 COG:FN1258 KEGG:ns NR:ns ## COG: FN1258 COG1638 # Protein_GI_number: 19704593 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 15 347 1 333 333 629 93.0 1e-180 MKKILSLIFLSLFTLLLVACGGKKEEATKEGGEAKKEARVIKVTTKFVDDEQTAKSLVKV VEAINARSNGSLELQLFTSGTLPIGKDGMEQVANGSDWILVDGVNFLGDYIPDYNAVTGP 
MLYQSFDEYLRMVKTPLVQDLNAQALEKGIKVLSLDWVFGFRNIEAKKPIKTPEDMKGLK LRVPTSQLYTYTIEAMGGNPVAMPYPDTYAALQQGVIDGLEGSILSYYGTKQYENVKEYS LTRHLLGVSAVCISKKCWDSLTDEERTIIQEEFDKGAQDNLTETQRLEEEQAQALKDNGV TFHEVDAEAFNKAVAPVYDKFPKWTPGIYNKIMENLTQIREDIKNGK >gi|224461387|gb|ACDC01000015.1| GENE 13 12660 - 13130 519 156 aa, chain + ## HITS:1 COG:FN1257 KEGG:ns NR:ns ## COG: FN1257 COG3090 # Protein_GI_number: 19704592 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Fusobacterium nucleatum # 10 156 1 147 147 184 88.0 7e-47 MKDFLKKFELYVGSVFISVTTVVVIMNVFTRYFLKFTYFWTEEIAVGCFVWTIFLGTAAA YREKGLIGVEAIVVLLPEKIRNVVEFLTYILLTVLSGLMCLFSFTYVMSSSKITAALELS YGYINISIVISFALMTLYSIIFTIESFKKAFLSKGN >gi|224461387|gb|ACDC01000015.1| GENE 14 13145 - 14434 667 429 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 426 2 427 432 261 34 2e-69 MEALYPVIVLFVLFFLNIPIAFALMGSALFYFIFLNTTMSMDMVIQQFVTSVESFPYLAV PFFIMVGSVMNYSGISEELMNMAEVLAGHMKGGLAQVNCLLSAMMGGISGSANADAAMES KILVPEMIKKGFSKEFSAAVTAASSAVSPVIPPGTNLILYALIANVPVGDMFLAGYTPGI LMTLSMMITVYIISKKRGYNPSRERMARPSEILRQAIKSIWALAIPFGIIMGMRIGIFTP TEAGGVAVFFCFLVGFFVYKKLKLHHIPVILMETVKSTGAVMIIIASAKVFGYYMTLERI PQFITNSLMDFTDNKFVLLMVINLLLLFVGMFIEGGAALVILAPLLVPAVKALGVNPLHF GVIFIVNIMIGGLTPPFGSMMFTVCSIVGVRLEGFIKEVWPFIVALLVVLFVVTYSESIA LFIPNLFLK Prediction of potential genes in microbial genomes Time: Thu May 19 22:48:26 2011 Seq name: gi|224461386|gb|ACDC01000016.1| Fusobacterium sp. 2_1_31 cont1.16, whole genome shotgun sequence Length of sequence - 5679 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 44 - 838 1190 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 2 1 Op 2 . 
+ CDS 900 - 1415 750 ## COG0778 Nitroreductase + Term 1429 - 1476 8.1 - TRNA 1533 - 1609 82.0 # Arg TCG 0 0 3 2 Op 1 7/0.000 + CDS 1818 - 2588 1126 ## COG1540 Uncharacterized proteins, homologs of lactam utilization protein B 4 2 Op 2 1/0.000 + CDS 2606 - 3793 1621 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family 5 2 Op 3 21/0.000 + CDS 3807 - 4556 996 ## COG2049 Allophanate hydrolase subunit 1 6 2 Op 4 . + CDS 4549 - 5562 1651 ## COG1984 Allophanate hydrolase subunit 2 + Term 5576 - 5636 18.7 Predicted protein(s) >gi|224461386|gb|ACDC01000016.1| GENE 1 44 - 838 1190 264 aa, chain + ## HITS:1 COG:FN1255 KEGG:ns NR:ns ## COG: FN1255 COG0647 # Protein_GI_number: 19704590 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 264 12 275 275 455 89.0 1e-128 MKELKDIKCYLLDMDGTIYLGNELIDGAKEFLKKLKEKNIRYIFLTNNSSKNKDKYVEKL NNLGIEAHREDVFSSGEATTIYLSKKKKGAKVFLLGTKDLEDEFEKAGFELVRERNKDID FVVLGFDTTLTYEKLWIACEYIANGVEYIATHPDFNCPLENGKFMPDAGAMMAFIKASTG KEPTVIGKPNRHIIDAIIEKYDLKKSELAMVGDRLYTDIRTGIDNGLTSILVMSGETDKK MLEETIFVPNFVFNSVKEIKETIE >gi|224461386|gb|ACDC01000016.1| GENE 2 900 - 1415 750 171 aa, chain + ## HITS:1 COG:FN1254 KEGG:ns NR:ns ## COG: FN1254 COG0778 # Protein_GI_number: 19704589 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 171 1 171 171 328 91.0 3e-90 MELLKLMSDRYACRRYSTEDVKEEDILKILEAAKIAPTAHNEQPQRIYVVKSEEGKAKLM KDFKFDFKAPCYLVCGYNEEEAWKNPLDNNKDSGEVDISIVMTHMMLMAEELGLGTCWIG FFDPLAVKKNLEIPDNIKIVGILSLGYHREDDRPSKLHTIRRSNEELVKFL >gi|224461386|gb|ACDC01000016.1| GENE 3 1818 - 2588 1126 256 aa, chain + ## HITS:1 COG:FN0439 KEGG:ns NR:ns ## COG: FN0439 COG1540 # Protein_GI_number: 19703777 # Func_class: R General function prediction only # Function: Uncharacterized proteins, homologs of lactam utilization protein B # Organism: Fusobacterium nucleatum # 1 256 1 256 257 472 91.0 1e-133 
MKFYVDLNSDIGEGYGAYKLGMDEEIMKCVTSVNCACAWHAGDPLIMDKTIKIAKENNVA VGAHPGFPDLLGFGRRKMVISPEEARAYMLYQLGALDAFAKANGVKLQHMKLHGAFYNMA AVEKNLADAVLDGIEEFNKDIIVMTLSGSYMAKEAKRRGLKVAEEVFADRGYNADGTLVN RTLPGAFVKDPDEAIARVIKMVKTKKVTAVNGEEIDIAADSICVHGDNPKAIEFVERIRK ALIENGIEVKSLHEFI >gi|224461386|gb|ACDC01000016.1| GENE 4 2606 - 3793 1621 395 aa, chain + ## HITS:1 COG:FN0438 KEGG:ns NR:ns ## COG: FN0438 COG1914 # Protein_GI_number: 19703776 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Fusobacterium nucleatum # 1 395 1 395 395 595 95.0 1e-170 MEKKNNLSVLLGAAFLMATSAIGPGFMTQTAVFTKDMGATFAFVILVSVIMSFVAQLNVW RVLAVSKMRGQDIANSVLPGLGYFITFLVCLGGLAFNIGNVGGAALGFQVLFDLDLKIAA LVSGALGVIIFSFKSASKLMDKLTQVLGAMMILLIGYVAFSTNPPVGTAVKETFVPSSIN LMAIITLIGGTVGGYIMFSGGHRLIDAGIVGEENLPQVNKSAILGMSVATIVRVFLFLAV LGVVSLGNQLDAGNPAADAFKIAAGTVGYKIFGLVFLAAALTSIVGAAYTSVSFLKTLFK VVKDNENFFIIGFIVVSTLILIFLGKPVKLLVLAGSLNGLILPITLAITLIASKKEGIVG KYKHSNILFYLGWVVVLVTAYIGVQSLSKLAELFA >gi|224461386|gb|ACDC01000016.1| GENE 5 3807 - 4556 996 249 aa, chain + ## HITS:1 COG:FN0437 KEGG:ns NR:ns ## COG: FN0437 COG2049 # Protein_GI_number: 19703775 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 1 # Organism: Fusobacterium nucleatum # 1 249 14 262 262 448 91.0 1e-126 MENSVRFLFSGDSALVIEFGNEISVDINKKIRKMMDDIKKENIDGIVELVPTYCSLLINY DVLKIDYNTLVEKLKTFLNNDLETAEGEEVTLVEIPTLYNDEVGPDLSYVAEHNKLSKEE VIKIHTGTDYLVYMLGFMPGFTYLGGMSEKIATPRLESPRLQIYPGSVGIAGKQTGMYPS MSPGGWRIIGRTPLKLYNPDSDTPVYISSGDYVRYVSISEEEYNDILKKVENNEYKLNIR KIKRGELNA >gi|224461386|gb|ACDC01000016.1| GENE 6 4549 - 5562 1651 337 aa, chain + ## HITS:1 COG:FN0436 KEGG:ns NR:ns ## COG: FN0436 COG1984 # Protein_GI_number: 19703774 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 2 # Organism: Fusobacterium nucleatum # 1 336 1 336 336 603 89.0 1e-172 MPSIKVHKPGLCTTVQDIGRIGYQQFGIPVSGVMDEFAFTVANYLVESDKNNAVLEIPFL 
GPTLEFDFDVTIAITGGEIQAKINNQDVKMWESINVKKGDNLSFGNLKSGMRAYLAFSAE IDVPVVMGSKSTLLKSKLGGFEGRQLKMGDVLNFKNVKVLSKKNILDKKYIPAYSHNQNI RIVLGPQDDYFEESSIKTMLENKYQVTKDADRMGMRLAGEVIKHKDKADIISDAAVFGSI QVPGNGQPIILLADRQTTGGYTKIATVIKADLPKLAQMLPNDTIEFSLINIEEAQKAYRE FYRILDEIKESFVVKPRVYTEKQLYVLKKLFGNRRKK Prediction of potential genes in microbial genomes Time: Thu May 19 22:48:30 2011 Seq name: gi|224461385|gb|ACDC01000017.1| Fusobacterium sp. 2_1_31 cont1.17, whole genome shotgun sequence Length of sequence - 13556 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 2, operones - 2 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 3.6 1 1 Op 1 1/0.000 + CDS 101 - 994 1048 ## COG1032 Fe-S oxidoreductase 2 1 Op 2 1/0.000 + CDS 1005 - 2804 1979 ## COG0438 Glycosyltransferase 3 1 Op 3 . + CDS 2801 - 3541 938 ## COG3713 Outer membrane protein V 4 1 Op 4 . + CDS 3557 - 3973 646 ## Sterm_1566 hypothetical protein 5 1 Op 5 . + CDS 4045 - 5025 1385 ## Sterm_1566 hypothetical protein 6 1 Op 6 . + CDS 5064 - 5891 879 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 5898 - 5934 -0.9 - Term 5875 - 5930 2.2 7 2 Op 1 . - CDS 5934 - 7121 686 ## COG3307 Lipid A core - O-antigen ligase and related enzymes 8 2 Op 2 . - CDS 7171 - 7782 681 ## FN1239 hypothetical protein 9 2 Op 3 . - CDS 7792 - 8514 826 ## FN1240 lipopolysaccharide core biosynthesis protein RfaY 10 2 Op 4 1/0.000 - CDS 8516 - 9247 626 ## COG3774 Mannosyltransferase OCH1 and related enzymes 11 2 Op 5 3/0.000 - CDS 9257 - 10342 1103 ## COG0726 Predicted xylanase/chitin deacetylase 12 2 Op 6 26/0.000 - CDS 10355 - 11212 941 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 13 2 Op 7 5/0.000 - CDS 11220 - 12317 1127 ## COG0438 Glycosyltransferase 14 2 Op 8 . 
- CDS 12332 - 13420 1029 ## COG0859 ADP-heptose:LPS heptosyltransferase - Prom 13495 - 13554 5.3 Predicted protein(s) >gi|224461385|gb|ACDC01000017.1| GENE 1 101 - 994 1048 297 aa, chain + ## HITS:1 COG:FN0392 KEGG:ns NR:ns ## COG: FN0392 COG1032 # Protein_GI_number: 19703734 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 1 297 1 297 297 540 94.0 1e-153 MYDLYDFPLYRPPSEAYSLIIQITLGCSHNRCTFCSMYKDKKFVIKPIEDIKADIDAFRA LYKNRAVEKIFLADGDALVVPTDILVQVLDYIKEVFPECKRVSIYGTAIAIHQKSVEDLK KLYEKGLTLVYLGVESGDDEALKFIKKGIKAEKVVELSKKIMSVGIDLSITLIAGLLGKY QDNKMHAINTAKIITDISPKYASILNLRLYEGTELYDLMQQGKYDYMEGIEVLKEMKLIL SSMDVSKITRPIIFRANHASNYLNLKGNLPEDIPRMIKEIDYAIENEAINVNNYRFL >gi|224461385|gb|ACDC01000017.1| GENE 2 1005 - 2804 1979 599 aa, chain + ## HITS:1 COG:FN0393_1 KEGG:ns NR:ns ## COG: FN0393_1 COG0438 # Protein_GI_number: 19703735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Fusobacterium nucleatum # 1 350 1 350 350 599 90.0 1e-171 MNILMALSQLEITGAEVYATTIADELIERGNKVYIVSDTLTTPTKAEYIKLEFNKRSLIK RIEHIKFLYKLIKEKDIQIVHAHSRASSWSCQVACKLAGIPLITTTHGRQPIHFSRKLIK AFGDYSIAVCENIKKHMVNDIGFSENKTSVILNPVNYKELNLEKKLNDKKIISIIGRLSG PKGDVAYDLLSILSDDELLKKYKVRLIGGKELPERFVKFKEKDIEFIGYVPNIQEKIFES DIVIGAGRVAFEALLNKSSLIAVGETEYMGFINKESLDKSLASNFGDIGSMKYPKIEKDI LLNDIKKALELSETEKEELKNIIFNETNLHNIVDRIEKKYFELYVDKTKYEVPVIMYHRV INNSEDEGVHGTYIYENIFREHMQYLKDKNYTVITFRDLDKISWRNRFEKDKKYIILTFD DGYKDNYDLAFPILKEFGFKATIFLMGSSTYNEWDVKASGEKEFPLMSVDMIKEMQDYGI EFGAHTFNHPKINTLSNDEIEHQIIDVKKPLEEKIGREIITFAYPYGILNDYAKEMAKKA GYTFALATDSGSVCLSDDLYQIRRIAIFPNTNLFSFKRKVAGNYNFIKIKREEKNRSKK >gi|224461385|gb|ACDC01000017.1| GENE 3 2801 - 3541 938 246 aa, chain + ## HITS:1 COG:FN0394 KEGG:ns NR:ns ## COG: FN0394 COG3713 # Protein_GI_number: 19703736 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Fusobacterium nucleatum # 1 246 1 246 246 340 71.0 1e-93 
MKKYLLTMLALFSVAAVANDDFKASVTAAYGTRTSIYKGREENAIPIFPNLSYQNLYLKG TEVGFKFLDYNRFNSTLYVDLLDGHSIKGSRMDTGYESINRRRYQQAIGLKADVKLNEIS ENLTLTPSFSIGNRGSKTGLSLSYLYMPKENIIISPSVNVKYLSKKYTDYYFGVDRDELG GSITNEYNPDGAFEFGAGLYGEYYFTKNISALGYVNMKQYSSEVTKSPITEDRIITNVGA GLKYTF >gi|224461385|gb|ACDC01000017.1| GENE 4 3557 - 3973 646 138 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1566 NR:ns ## KEGG: Sterm_1566 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 138 1 139 324 88 34.0 6e-17 MDKKELVNKISYLVSKKNHDQAYAIIREFEKKNNFEMICASAQGFINAYHYRSALKILES IKKEYSKNAEFCARYAIALFNSEKEDKSLQWFEKAKEKGLEDLSEISNDFFSKSIDDWIK KAKFWGPIRVEENNYKEE >gi|224461385|gb|ACDC01000017.1| GENE 5 4045 - 5025 1385 326 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1566 NR:ns ## KEGG: Sterm_1566 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 30 322 31 322 324 317 52.0 4e-85 MEKDELIGKLSNFIRKEKFQEINEIIKKFKNEKNYDMVCFSSQAFINMSEYKEALEILDS IKKEYSENGEFCIRYAMALYNSNREDEALEWFKRAKEKGIKEIDETSGRYYPKSVDDWIK RAGAWAPRRIEKNKFERELREKRDKKPMLNVSFDEEVLKGLWYHDEFSIREYLGKPATDE DFEKVEKELGYRLPDSYKALMRIQNGGELRKNNFEGPLKRNWARKNFDVIGVYGVDSSRK YSLCGEFGSKFWIEEWKYPDIGVVICGTSSGGHDMIFLDYSDCGPEGEPCVVNIDQEGGY EITYLADNFKDFVEGLFPSFDDEDDD >gi|224461385|gb|ACDC01000017.1| GENE 6 5064 - 5891 879 275 aa, chain + ## HITS:1 COG:FN0395 KEGG:ns NR:ns ## COG: FN0395 COG0697 # Protein_GI_number: 19703737 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 274 13 286 286 405 87.0 1e-113 MLVSVLGFTFMGIAVKYLPRIPTYEKVFFRNSVSLMLSAFILFKEKESIKVEKANIPFVF GRSFFGFIGMVANFYALEHLTMAEANMLNKLSPVFVTICACIFLKERVDKKQVIGIILML LAVVFVIKPSFSPEVIPSLAGLFSAVLAGFSYTIIRYLNGKVKSEINVFYFSLLSVVCTF PLMMMNFVKPTLDEFLILLGGIGISAAMGQFGLTYAYTFAPASEVSIYNYVIIITSMLMD YVLFSTIPDLFSFIGGFMIMTTAIYLYIHNKKKDN >gi|224461385|gb|ACDC01000017.1| GENE 
7 5934 - 7121 686 395 aa, chain - ## HITS:1 COG:FN1238 KEGG:ns NR:ns ## COG: FN1238 COG3307 # Protein_GI_number: 19704573 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A core - O-antigen ligase and related enzymes # Organism: Fusobacterium nucleatum # 33 395 1 363 365 458 74.0 1e-129 MLYKDKKETISFLGELISYTYIFSAFFSSKLNMKVGYLLLIVSFLYICINKDIIKLVNKK IYGMFLLILILGCFWNYISAGMIGMSKFININTKFFYGIAMFPFLVNIKKDKFNILIFLI TNLLSTIFLYNESYLYTLLDDLGRIRVILLMAWMYTLIYSFEKISENFKKYIFLLLASIL PFIALGRSGSRMGALSLLVTIFLYLLFKILKDKKNIKLISIVVAIIFFTGIFLPKEYMVK LKTSFQTSQNISNEDRIVMWKAGKHIFKENPIFGIGTYKKNIYPHVKKYVDENVQDEHLR QEFLNEDRFAMLHNMYVDFFVQNGILTFLYLFLFFILIPFIFFKYNKNNECISAFFTLIF YLFYGFTWSLWASLSISQVLFQIFLIWMLVNLKKE >gi|224461385|gb|ACDC01000017.1| GENE 8 7171 - 7782 681 203 aa, chain - ## HITS:1 COG:no KEGG:FN1239 NR:ns ## KEGG: FN1239 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 200 1 200 214 285 80.0 5e-76 MNYQFFDKYLEEHGDFGECISNHKDKTLVYKEKFQGKDFYIKKYIPYGKRRKRIAFGLYQ DRALHYEKVCKYLAKLNIPYVNLEYKKIKRISFFDRVSIIVTKDCGQTFENFVNDFEKNK DLITKFYDFFILLIKNKIYPTDYNTGGMLIDNEEILRLTDFDDYKINSFLTTNLKKRLIR NLSRIYLEESRTKECEEFLKKSN >gi|224461385|gb|ACDC01000017.1| GENE 9 7792 - 8514 826 240 aa, chain - ## HITS:1 COG:no KEGG:FN1240 NR:ns ## KEGG: FN1240 # Name: not_defined # Def: lipopolysaccharide core biosynthesis protein RfaY # Organism: F.nucleatum # Pathway: Lipopolysaccharide biosynthesis [PATH:fnu00540]; Metabolic pathways [PATH:fnu01100] # 1 240 1 240 240 355 88.0 9e-97 MLKLNLEKYKELNIYYYEKEFLDLALKVIDSDYSTYQILKNTKRNYVSIIEVNNKKYVYK EPRNEFRIPQRQVTTLFKKGESLTTLVNINNLINMGFKEFVKPLVAVNKRKYGFIVSSFF IMEYVEGEDNRENLDMIVEKMQEIHSKGYYHGDFNPGNFLVENKQIHILDTQGKKMFFGN YRAHYDMITMKYDSYEEMMYPYKKNLFYYLAYSMKRFKRLTFIEKIKYFKKKLRDKGWKI >gi|224461385|gb|ACDC01000017.1| GENE 10 8516 - 9247 626 243 aa, chain - ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 1 243 1 243 243 398 88.0 1e-111 MIEKKIHYVWFGNAKPEKVLKCIESWKKNLPDYEIIEWNETNFDVEEELKKNKFFRECYK RQLWAFVADYVRVKILYNYGGIYLDTDMEIIKDISPLLDTDMFLGYENENTMSFGIVGVI PKHKVFEKMYEFYQDEIWKSSLHIITNILTDILEEEYQGKYKQNNINIYPREYFYPFNHD EEFTETCIKKNTYAIHWWGKSWKKNPKVYFLKYKHLPWWKKHPKHIAKLINYYFKNLFNF RKE >gi|224461385|gb|ACDC01000017.1| GENE 11 9257 - 10342 1103 361 aa, chain - ## HITS:1 COG:FN1242 KEGG:ns NR:ns ## COG: FN1242 COG0726 # Protein_GI_number: 19704577 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Fusobacterium nucleatum # 1 360 1 360 360 528 86.0 1e-150 MNILILYKDNENKNIIKDSNNNNLYFFKQKEYSYKKIKNLKNEKDIQIILYIGKNNFLLN IYSSFLNIPVVYTENSKNTEDIEVLLQNKLAYKDRKDLPVLMYHRVIDDKNEIGFYDTYV TKENFEMQMKYLSENSYTSITFKDIQNGEYKRRFDKDKKYVIITFDDGYKDNLKNALPIL KKYNMKMVLFLITSETYNKWDTDVENREKEKKFNLMTREEVKELIASDLVEIGGHTTKHL DMPNVDLKTIEEDLNISNKIIEEITGYKPISFAYPWGRSTKESRDIVKKVGYKFAVSTED GPACFSDDLFEIVRVGIYSDDDIEKFKLRISGKYPFIREKRNEMKAFRNKIRKFFGIKIK Q >gi|224461385|gb|ACDC01000017.1| GENE 12 10355 - 11212 941 285 aa, chain - ## HITS:1 COG:FN1243 KEGG:ns NR:ns ## COG: FN1243 COG0463 # Protein_GI_number: 19704578 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Fusobacterium nucleatum # 1 285 1 286 286 445 88.0 1e-125 MTDKITVIVTLYNRLEYARNMILALQQQTKQIDELIFVDDGSSEKLMDFISDLLVDCKFK IKHIYQDDIGFRLARSRNNGAREASGDYLIFLDQDVIFNNDFIESIYNSRRKKRMIFSEA LSSSLEERNKIQELINQKFDYEKIYKLIDNTKKIEQNKIVNKEKLYGILYKLKLRTRGAK IVGLIFSLFKEDFININGLDEKYIGYGYEDDDFGNRFFKYGGETYVFKMKMYPIHMYHKA AILGESPNEDYYRQRKNEISKKNYRCEYGYDKTFGEDKYKVIEIK >gi|224461385|gb|ACDC01000017.1| GENE 13 11220 - 12317 1127 365 aa, chain - ## HITS:1 COG:FN1245 KEGG:ns NR:ns ## COG: FN1245 COG0438 # Protein_GI_number: 19704580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Fusobacterium nucleatum 
# 2 282 3 283 381 394 84.0 1e-109 MKILFKSGSTMMGGLKKVQIEYINFLVKQEKYQIKIVIENDNGKDNALEKYITSNVTYLK NYNYILEIKNLRENRKKSLWSRIKYNLAITKEKKYADNKFLQIYKEYKPDIVIDFDSSLT KIIDKLDLSKNLVWIYSSIENWKKKKSKIDRFVDRISKYNKIICICKEMKEDLINLKNES KNKVDFLYNPIDFNRIKKLSNEDFFEEDKKLLKDKFLLSIARLDCVPKDFETLFKAYEIA KKNGYDGKLYIIRDGPDKDKVEKLKEANLYKEDILLLGRKENKIYLEKTLGEKLYSFAYP YGIFNETSKKIGKELGFNYGIATDSGKFYIEDDLYQVRRIGIFSDITMSKFKRRVKGNYN LKYTK >gi|224461385|gb|ACDC01000017.1| GENE 14 12332 - 13420 1029 362 aa, chain - ## HITS:1 COG:FN1247 KEGG:ns NR:ns ## COG: FN1247 COG0859 # Protein_GI_number: 19704582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 362 16 377 379 541 78.0 1e-154 MREKRLKIGKAIWDKKEKTNIIKGNNFIEDNNIKSILFLRYDGKIGDMIVNSLMFREIKK VYPDIKIGLIARGAAIDIIKDNPNVDEIYEYHKDRKKIKELALKIREENYDLLIDFSEML RVNQMMLINLCGARFNLGLDRNNWSLFDLSVESNKDFKWSEHITKRYLAYLLKLGLKSEE INLSYDIYIKDNSKYTEFLNKIKESKRLVLNPYGASKHKSFSIDTLENIITFLKDKDIAI ILVYFGDKFKELETLEKKYTNVYIPKSITNILDTALLIKESDFVISPDTSIVHIASTFNK NIIAVYPPNGGKYGVDHLVWAPNSDYTRVIFCKDKSGTYDEIDINTFELEEMEAEILKMI NN Prediction of potential genes in microbial genomes Time: Thu May 19 22:48:50 2011 Seq name: gi|224461384|gb|ACDC01000018.1| Fusobacterium sp. 2_1_31 cont1.18, whole genome shotgun sequence Length of sequence - 13002 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 36 - 1835 2451 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 2 1 Op 2 . - CDS 1847 - 3586 1894 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 3690 - 3749 9.7 + Prom 3695 - 3754 10.7 3 2 Tu 1 . 
+ CDS 3781 - 5103 1777 ## COG0457 FOG: TPR repeat + Term 5287 - 5322 -0.1 4 3 Op 1 17/0.000 - CDS 5255 - 6292 1301 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 5 3 Op 2 24/0.000 - CDS 6307 - 7083 1104 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 6 3 Op 3 . - CDS 7096 - 7848 962 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 7 3 Op 4 . - CDS 7860 - 9659 2842 ## Sterm_0484 thioredoxin domain protein 8 3 Op 5 . - CDS 9660 - 11234 1858 ## COG0155 Sulfite reductase, beta subunit (hemoprotein) - Prom 11477 - 11536 11.6 + Prom 11434 - 11493 11.2 9 4 Op 1 . + CDS 11522 - 12427 1001 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 10 4 Op 2 . + CDS 12443 - 12991 796 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|224461384|gb|ACDC01000018.1| GENE 1 36 - 1835 2451 599 aa, chain - ## HITS:1 COG:CAC3281 KEGG:ns NR:ns ## COG: CAC3281 COG1132 # Protein_GI_number: 15896526 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 85 599 186 699 706 443 46.0 1e-124 MSKKKNQNEDSIKNLKKAVSNLLSLLGERKVPFLISVVANIISTVLVVAIPWTSAVAIDD IVKILNDNTIIDKWSAVFGFLIKPVSLLGIIAVSIFALSYLQEYISAILGEEVAQSLRVK LSRKFTKLPMNFFDTNQVGDILSKLTTDIEKVAEVIGSSFTRFVYSFLIMILVIIMLFTI NVKLTLLVLAILLISIVVTYYVSKLTQKIFSQDVKSLSDLSSLTEEALTGNLVVQAFNKQ EDIITSIDESIEKQYVAAKTLEFTIFSIYPSIRFITQIAFVTSAVMSAILVINGHLTLGL AQAFLQYITQISEPVTTSAYIINSLQNALVSVERVYDILELPEEKELSEDTHLLDNTKGQ IVFENVSFGYSKDKLLMKNVNFTAKAEQMVAIVGPTGAGKTTLINLLMRFYDVNGGRILF DGVDISKVTRKELRANFGMVLQDTWLFKGTIAENIAYGKPDATREEIIEAAKLAKCDSFI RKLPQGYDTIITSENGMVSQGEQQLLTIARTILPNPKVMILDEATSSIDTKTEKDIQAVI SQLMKGRTSFVIAHRLSTIRNADLILVMKDGDIVEQGNHDELMTVNGIYANLYNTQFSS >gi|224461384|gb|ACDC01000018.1| GENE 2 1847 - 3586 1894 579 aa, chain - ## HITS:1 COG:CAC2393 KEGG:ns NR:ns ## COG: CAC2393 COG1132 # 
Protein_GI_number: 15895659 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 575 5 571 577 356 36.0 6e-98 MKILRTYIKENIGILSLGALFLTLNTFATLAIPFQISNIINLGIMKKDIDMVYSTSIKMV IILIIGTTTGIIANHFVALFATNFTKKNRKLLIRNLESLTVDQVNDFGVASLVTRMGNDN NNAQRLIVAFFQMILPSPIMAVISIFMTIKLSPTLALIPLFTILIFAFAIVLTLFKSLPY ILKVQKKLDRMTLVLRERFIGAKIIRAFDNSKKERDKFNDVAQEYTDNYIIINKKFALLS PMAFALMSVVITLIIFFGAMKVLNNTLEIGSITAIVEYSLTTIAALIMSSMVLVQMPKAV VSIERIEEVLNVTSEIKDKEELKDNSYYEDILKQNPISLTFDNVCFRYKGAEKQILKNIS FSVKAGERFAIVGATGSGKSTIAKVLLRLNDIESGRILINGVNTLDLPLNCLRNQISYTP QKAYIFSGKIKDNFRFTNKDMTDEEMIKIAKIAQSYDFIDSLPDKFDSFVAQGGINFSGG QKQRLSIARALSKDANIYLFDDSFSALDYATDAKLRKELKTFLKDKITIIIAQRLNTIAD ADKIIVLKDSEITGMGTHQELLESNQEYIELAKSQGILE >gi|224461384|gb|ACDC01000018.1| GENE 3 3781 - 5103 1777 440 aa, chain + ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 3 426 2 422 665 209 36.0 7e-54 MKKDTLLKKIKDLSDLDKHQEIIDMIEALPVEQLNNEIIGQLARAYINVQNYEKAIEVLK SIEKDEKNTMLWNYRMGCSYFYLKDYEKAEEYFLKAYNLEPEDENIKDFLMDIYINLSKQ VKFGEDDLDNQKKALNYALKAKDYMTTDDKKIECYSYLAFLYNKFTDYHTAEDLLKRAIS LGRDDLWIHSELGYCLGELNKLEESLEHYLKVIELEPNNIWALSQIAWTYRCLGRYEEAL KENFKALDLGEKSEWVYVEIGYCYKGLNNYDKAIKYYLEANKISKDRNVWLLSELVWLYD GIGKYENALEYLKKLEKLGRDDSWFNSEYGFCLIGLKKYNEAIEKYKHALEKEKNVKETI HYNCQIGFCYRLLEKYEEAIENLKKVLEIINGDKTNDNTDEKIFLNSQIGWIYGKIENDE KLKKELEELKDMVNMLYHPS >gi|224461384|gb|ACDC01000018.1| GENE 4 5255 - 6292 1301 345 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 42 340 30 321 332 86 24.0 6e-17 MEGKSRKIKILIGLIALIVLAFGPFKPKDKNNENVNTAEVNLKKVIIGLPGISNQTLEAT 
GIAVNKGYIAEELKKVGYEPEFIYFQQAGPAVNEALATNKIDIAMYGDFPITILKSNGGD VKVFAVDNSRFMYGVLVQNDDNIKSIKDLEGKKVLYRKGTVEQKFFKEILKKYNLDEDKF VSVNAGGADGQSIFSAKEAEAIFTFYYTVLYMESKGLGKVIDSTLDKPEVGTQSLAVGRT KFLEENPDAAVAIIKALERAKDFAKENPEEVFNIYAQSGIPAEVYKKAYSADLTFSNFDP AITDDTKEKMKKLIDFLYDNQIVKNKITVDDIITTEYYDRYKSSK >gi|224461384|gb|ACDC01000018.1| GENE 5 6307 - 7083 1104 258 aa, chain - ## HITS:1 COG:MJ0412 KEGG:ns NR:ns ## COG: MJ0412 COG1116 # Protein_GI_number: 15668588 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Methanococcus jannaschii # 6 243 17 256 267 229 49.0 3e-60 MSENIIKIKNISKKFQKNNEEVQILNDVNLDIKKGEFITIVGKSGCGKSTLLKLISGMVP ITEGEILINGKSVNGVSKDCSMIFQDARLFPWLKIKDNVAIGLKNISSEEKNRIVLEYLE LVGLKGVENSYPDQLSGGMAQRASIARGLALNSQIMLFDEPFSALDAMTKVQLQEELLKI HQEKGKTVILVTHDIEEAVYLGDRVVVMAANPGVIKDIINIDIEGRKDRTNTEFLSYKNK IYDYFFEDRNKNAVEYNI >gi|224461384|gb|ACDC01000018.1| GENE 6 7096 - 7848 962 250 aa, chain - ## HITS:1 COG:AGpT116 KEGG:ns NR:ns ## COG: AGpT116 COG0600 # Protein_GI_number: 16119871 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 248 75 307 313 168 40.0 9e-42 MKNKSEYIKFILPLLIIFFWFIFTYTGKVPPTSLPSLSAVKDTFIEMLKSGQLSNDLSLS LRRVLAGFFISSVLGISLGIFMGISSKAKEFFQLTLTAIRQIPMIAWIPLIILWAGIGEV SKIVVILFAATFPIVVNTMGGVDSTSETYLEVAKMYGLSKKDTFFKVYLPSALPNIFTGL RLGLGASWMAVVASELIASSSGIGYRLNDARSLMRSDVVIVCMIIIGLVGLLMDKLIVLI SHELTPWKKN >gi|224461384|gb|ACDC01000018.1| GENE 7 7860 - 9659 2842 599 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0484 NR:ns ## KEGG: Sterm_0484 # Name: not_defined # Def: thioredoxin domain protein # Organism: S.termitidis # Pathway: not_defined # 1 593 1 595 600 738 58.0 0 MGLPSIYPTGVTIYNPEKCWNGYNLVQTIESGALLFDMNGNEVRRWDQFHGFPNKLLPNG NLIGYSGDRNPKYGMQDGLDLVQIDYDGNIVWKFEKFEFVEDEGEEPRWMARTHHDYQRE 
GNPVGYYVPGQIPEVNKGNTLILAHQTLYNKKISDKKLLDDVFYEVDWEGNILWQWNANE HFEEIGFSEDAKKTLYENPNVRAADGGVGDWLHINCMSYLGPNKHYDNGDERFHPENIIF DSREANFIAIISKKTGKIVWKIGPNWNDDDVKHIDFIIGPHHAHLIPQGLPGAGNILVFD NGGWGGYGLPNPSSKNGLKNALRDYSRVLEIDPITLEIVWEFTPESIKAAIPTDAAKFYS PYVSSAQRLPNGNTLIDEGSDGRVFEVTVEKEVVWEWISPYFTDGGKTTNNMIYRAYRYP YEWVPQEEKPIEKEIKPLDIKTYRLENAGKFGAKTVVKVEGTIPYSVSDALCVAKIDESK KLNSEKLFTVNRNLFEEIVEDNKKVEKLELILFGAERCRHCKALHPVIEKVLENDLAKSI KAKYVDVDKNPEITEKYKVQGIPVIIITDGEKELSRKAGEKTYSELYSWLEELISKNVK >gi|224461384|gb|ACDC01000018.1| GENE 8 9660 - 11234 1858 524 aa, chain - ## HITS:1 COG:CAC0094 KEGG:ns NR:ns ## COG: CAC0094 COG0155 # Protein_GI_number: 15893390 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, beta subunit (hemoprotein) # Organism: Clostridium acetobutylicum # 10 520 9 514 516 246 32.0 8e-65 MEKLEGLENIDKVEEFIKLSKAALKDEEKYKLWNASKSMYGIYGERDKGTYMVRPRFTES KISLDNLIFFLDLAKRYGDKRLHLTTRQDIQLHGNKKEDLVNLLKELKSKGFLTKATGGD AARAVIAPPTTGFEEEIINVAPYSKAVTRLILETADFMFLPRKFKVAFSNKEENNLYVKI ADVGFEAIEKDGVKGFKVFGGGSLGINPREAIVLKDFIKPEEALYYVVAMRNLFNEHGDR KIRGKARLRFILIRLGEEEFLKLFNNYLDDLYKKVGDKYRNILLEEIEKYKNPYEVKPVK EKEKFIKKFNIVKGKIEGRYGYYIRLVKGDISLKEGEKLVEFLKNLNYKVEIRLTSHQEL FIANLKRADVYALENLSSKYSKKRFFSSLSCIGNTICNPGILDTPPILEMILNYFKNKQR LASYLPKIQLSGCPNSCSAHQIAELGFQGKRKKDGAYFNVFVGGRFKTDDTITLNSSVGE LKAETIPLFLEEMAKILKERKITYEDYSKQDEFIELVKKFEGVI >gi|224461384|gb|ACDC01000018.1| GENE 9 11522 - 12427 1001 301 aa, chain + ## HITS:1 COG:FN1038 KEGG:ns NR:ns ## COG: FN1038 COG0697 # Protein_GI_number: 19704373 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 301 2 302 303 461 85.0 1e-129 MNKKSYFGDLMLFLAAFIWGTAFVAQVAGMDRIGPFTFNMARSIVAVISLGAYLIFTKAK LPKDMSFLLKGGLVCGFFIFVGTSLQQIGLQYTTAGKTGFITSFYILILPFLTMIFLKHK IDVLTWISVIIGFIGLYLLAIPSLSDFSMNKGDFIVFLGSFCWAGHILVIDYYSKKVNPV 
ELSFLQFVVLSILSGICALIFENETATLSNIFSSWKPIMYAGFFSSGVAYTLQMVGQKYT KPVVASLILSLEAVFAALAGYLLLDEVMTSREFLGSFIVFLAMIFSQIPKDLFKKKYIGL K >gi|224461384|gb|ACDC01000018.1| GENE 10 12443 - 12991 796 182 aa, chain + ## HITS:1 COG:FN1085 KEGG:ns NR:ns ## COG: FN1085 COG0693 # Protein_GI_number: 19704420 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Fusobacterium nucleatum # 1 182 1 182 182 301 82.0 5e-82 MKTYIFLANGFEILETFSPVDVLKRCGAEVITVSTEKDLFVSSSQNNIVKADVMLNEIDY KDADLVVIPGGYPGYINLRENKEVVDIVKYFLENDKYVASICGGPTIFSYNKIANGAKIT AHSSVRKEIEENHIYVDVSTHVDGKIITGVGAGLALNFAFKIAEQFFTKEKIEEVKKGME LI Prediction of potential genes in microbial genomes Time: Thu May 19 22:49:04 2011 Seq name: gi|224461383|gb|ACDC01000019.1| Fusobacterium sp. 2_1_31 cont1.19, whole genome shotgun sequence Length of sequence - 20258 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 6, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 209 - 264 12.0 1 1 Op 1 22/0.000 - CDS 283 - 1068 999 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 2 1 Op 2 32/0.000 - CDS 1084 - 1785 989 ## COG2011 ABC-type metal ion transport system, permease component 3 1 Op 3 . - CDS 1775 - 2782 1255 ## COG1135 ABC-type metal ion transport system, ATPase component - Prom 2988 - 3047 7.8 - Term 2810 - 2845 1.1 4 2 Tu 1 . - CDS 3052 - 3846 1074 ## COG0796 Glutamate racemase - Prom 3885 - 3944 7.7 - Term 3910 - 3972 7.7 5 3 Tu 1 . 
- CDS 3987 - 5204 1715 ## COG0786 Na+/glutamate symporter - Prom 5259 - 5318 13.1 - Term 5279 - 5329 7.1 6 4 Op 1 1/0.000 - CDS 5392 - 6510 1204 ## COG0053 Predicted Co/Zn/Cd cation transporters 7 4 Op 2 2/0.000 - CDS 6497 - 9412 3355 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 8 4 Op 3 1/0.000 - CDS 9405 - 10922 1698 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Term 10935 - 10971 3.1 9 4 Op 4 4/0.000 - CDS 10989 - 11735 911 ## COG2099 Precorrin-6x reductase - Prom 11765 - 11824 8.2 10 4 Op 5 6/0.000 - CDS 11866 - 12615 1319 ## COG1010 Precorrin-3B methylase 11 4 Op 6 . - CDS 12608 - 13618 1297 ## COG2073 Cobalamin biosynthesis protein CbiG 12 4 Op 7 . - CDS 13631 - 14362 595 ## FN0953 hypothetical protein 13 4 Op 8 . - CDS 14399 - 15649 1177 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Prom 15675 - 15734 6.8 14 5 Op 1 . - CDS 15736 - 16638 880 ## gi|237740123|ref|ZP_04570604.1| predicted protein 15 5 Op 2 . - CDS 16667 - 17194 704 ## FN0955 hypothetical protein 16 5 Op 3 . - CDS 17197 - 17424 318 ## FN0956 hypothetical protein - Prom 17477 - 17536 6.7 - Term 17463 - 17495 1.4 17 6 Op 1 . - CDS 17555 - 17941 546 ## COG0346 Lactoylglutathione lyase and related lyases 18 6 Op 2 . - CDS 17959 - 18732 1418 ## COG2875 Precorrin-4 methylase 19 6 Op 3 . - CDS 18770 - 19462 1014 ## FN0958 hypothetical protein 20 6 Op 4 . 
- CDS 19477 - 20199 1027 ## COG2243 Precorrin-2 methylase Predicted protein(s) >gi|224461383|gb|ACDC01000019.1| GENE 1 283 - 1068 999 261 aa, chain - ## HITS:1 COG:FN0658 KEGG:ns NR:ns ## COG: FN0658 COG1464 # Protein_GI_number: 19703993 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Fusobacterium nucleatum # 1 261 1 261 261 451 94.0 1e-127 MKFTKLIGNVGAFLLLSVGALAGTIKVGATPVPHAEILELIKPDLKKQGVELKIVEFTDY VTPNLALADKEIDANFFQHKPYLDKFVEERKLNLVSLGNVHVEPLGLYSKKIKSINDLKK GDTIAIPNDPSNGGRALILLHNKGVITLKDPKNLFATEFDIVKNPKKIKFKPTEVAQLPR ILPDVTAAIINGNYALQANLSPAKDSIILEGKESPYANILVVRKGDEKKEDIQKLLKALR SQKVKDYIKKKYSDGSVVPAF >gi|224461383|gb|ACDC01000019.1| GENE 2 1084 - 1785 989 233 aa, chain - ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 1 233 1 233 233 333 92.0 1e-91 MDISSLIEPLFENFENPIISMLAVSTVETLYMVFLSTLFSLLLGFPIGVLLVITKEGGIY EMKKFNAILGVIINALRSFPFIILMIILFPLSRFVVGTTIGATAAVVPLSIGAAPFVARI VEGALLEVDPGLVEASQSMGASNSKIIFKVMLPECYPTLVHGIVVTIISLIGYSAMAGTI GAGGLGDLAIRFGYLRFKLDIMIYAIIIIIILVQIIQSVGNYIVNRRLKKIGK >gi|224461383|gb|ACDC01000019.1| GENE 3 1775 - 2782 1255 335 aa, chain - ## HITS:1 COG:FN0660 KEGG:ns NR:ns ## COG: FN0660 COG1135 # Protein_GI_number: 19703995 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 334 1 334 335 569 91.0 1e-162 MITLEKVNKVYSNGLHAVKDVSLKVNKGDIFGIIGLSGAGKSSLIRLINRLEEPTSGKIF INGENILEFNKKQLLERRKKIGMIFQHFNLLSSRTVEENVAFALEIANWNKNEIKERVAM LLDIVGLSDKAKYYPSQLSGGQKQRVSIARALANNPDILLSDEATSALDPKTTKSILELI KKIQQKFSLTVVMITHQMEVVKEVCNRVAIMSDGRIVEEGGVHHIFADPKNEITKELISY VHQQTDTEIDYLHHKGKKIVKVKFLGTSTQEPIISKVIKEYGIDISVLGGTIDKLATMNI GHLYLELDGDLSAQDKAIELMGTMDVIVEVIYNGY 
>gi|224461383|gb|ACDC01000019.1| GENE 4 3052 - 3846 1074 264 aa, chain - ## HITS:1 COG:FN1161 KEGG:ns NR:ns ## COG: FN1161 COG0796 # Protein_GI_number: 19704496 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Fusobacterium nucleatum # 1 264 1 264 264 454 85.0 1e-128 MADKRQRIGIFDSGLGGTTVLKEMMKALPNEDYIYYGDNGNFPYGSGKTKNEIQKLTERI LDFFVKNNCKLVIVACNTASTAAIDYLREKFPLPILGIVEAGIKIARKNTKTKNIAVIST KFTAESHGYKNKAKMIDTELNVKEIACVEFPMMIETGWDTFDNREELLNKYLAEIPKNVD TLVLGCTHYPLIRKDIEDRTKLKVVDPAVQIVDKVKQTLGSLDLLNDKKAKGKKIFFVTG ETYHFKPTAEKFLGEEIEIYRIPK >gi|224461383|gb|ACDC01000019.1| GENE 5 3987 - 5204 1715 405 aa, chain - ## HITS:1 COG:FN0793 KEGG:ns NR:ns ## COG: FN0793 COG0786 # Protein_GI_number: 19704128 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 7 404 1 398 399 605 85.0 1e-173 MLRGLVMFEYQLNMAETVGFAIILLLLGRWIKKKVNFFERFFIPAPVIGGTLFSIILLIG HQTESFTFTFNNDIKNLLMIAFFTTVGFSASLKILAKGGVGVALFLLAATILVILQDIVG PVLAKALGIDPLLGLAAGSIPLTGGHGTSGAFGPYLEELGASGATVVAVASATYGLISGC LIGGPIARRLMIKNNLKPTEGKAGFDSSLLNNESEMTEESLFSAVVYVGIAMGIGATINI ILEKYGIKFPAYLMGMVVAAIMRNVIDASQKPLPFNEIGVIGNISLSLFLSMALMSMKLW ELVELAGPLSIILIVQTIVMALFAYFVTFNIMGRDYDAAVIATGHCGFGLGATPNAIANM ETFTATNGPSVKAFFIIPIVGSLFIDFVNAMVIKGFASWIVANFR >gi|224461383|gb|ACDC01000019.1| GENE 6 5392 - 6510 1204 372 aa, chain - ## HITS:1 COG:FN0948 KEGG:ns NR:ns ## COG: FN0948 COG0053 # Protein_GI_number: 19704283 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Fusobacterium nucleatum # 1 372 1 372 372 497 79.0 1e-140 MKKNNEEKRENIIIKTSIIGILVNILLVFFKAIVGLLSNSIAIILDAVNNLSDALSSIVT IIATKIADSEPDKKHPLGHGRVEYLSAMIVAGIIFYAGITSLVESIKKIINPEKVEYSKI TLLVLLVSIILKLVLGKYVKTKGENFNSPSLVASGSDAMSDAILSLSVLISAILYIFTNI NIEAYVGVLISIFIIKAGLEIFMDAVNEILGKRVNKDIKNKIKKTICEIENVHGVYDLVL HDYGPDKYIGSVHIEIPDSMTAEQIDPLERHITDVVLAKHNVYLSGITIYSMNTKNEEII 
KIHSDIMKTVMSNEGVLEFHGFYIEEKNKSIRFDIIIDYSVKNREEIYNKILNDVKKKYP DYTISIKVDMDI >gi|224461383|gb|ACDC01000019.1| GENE 7 6497 - 9412 3355 971 aa, chain - ## HITS:1 COG:FN0949 KEGG:ns NR:ns ## COG: FN0949 COG1112 # Protein_GI_number: 19704284 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 1 970 454 1424 1425 1435 80.0 0 MDRIPGIVSSGVTCPINNENLTFLKSGYKKSVSKEEEKEIELGLNKLSDFWTLEEFKEML ENKKEIMSRLDLLLKNKKYHINDNLFYVDDKILIDLDKFKNYSGIDKIIPEDLKSIEDWK KDVCIAGTENSGDRKIWLEFIKDIRRLYDLTNMTKDQLFKKEVVYKDIDVSTAKKLIIGL KKGIERPGFFFKHRLRKARKQISDKVTINNRILETLYDCNVALEYTTLIELKENTKNTWN ILMTGNSLLENSNNKNLYKQLYSYADQMEYLLNWYDREKKTFLHKIENAGFEKLNINKTE GNPIYVDEVNQIFDFIPSLEELIAIGKIALEYREVDIKRSEYLVKIENIIKENSHLGREI KNAILNENIDKYSETLEKLRVLSEKEVLYKKYKDLLHNVKAVANSWGEELENGLFNEKIE NIYNVWRYKQISQKLKELAEKPYFNLQADILEKTEELKKLTIDLVTKKTWYNIIKFLEEK DNLAISQALKGWKQTVQKIGKGTGKNTNIHKKNAKEKMLLCQKVVPAWIMPLNKVFDTLN PVENKFDIIIIDEASQSDISSLILLYMAKKVIIVGDDKQVSPSDVGVNIDKINMFRRKYI KGKVANDDLYGIRASLYSIVSTTFQPISLREHFRSVPEIIGYSNKTSYDNQILPLRDSNS SILKPAIIDYKVNGRRDEKSKINRIEAETIVSLIEACLAMKEYKNSTFGVISLLGDEQAE LIQDLIVKRIPATEIENHKILCGNSASFQGDERDIMFISLVDSSEENKSLRLVGEGVEGA IRKRYNVAISRAKDQLWIVHSIDKNNLKEGDLRKELFEYIDSLKENVFDKTAIENITASD FENEVARHLLEKNYTIKQKWRVGSYDIDMVAIYDDKKIAIECDGKTLNHTEEEVIANLEE QEILERCGWKFIRVRASEYFRNPEKAIKDLIIQLDDKGVYPNHKEVYNDKNELLNNIKSE ALELMEKYEEE >gi|224461383|gb|ACDC01000019.1| GENE 8 9405 - 10922 1698 505 aa, chain - ## HITS:1 COG:FN0949 KEGG:ns NR:ns ## COG: FN0949 COG1112 # Protein_GI_number: 19704284 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 51 504 1 454 1425 671 81.0 0 MDKRGNIIALYQYIAEVVKSIKTEKKDIHNEEWCYFLEELPKYSGITLNYLDNKNNLANQ KILQVEKLPFLEPLAIDKELLEWISGDWGDYKSPIKLLSEKIIKENDASKVVNISKKEKE ILDKLLKDRKLWVEEQKKIEVVRNLFDTLYNKYLVLDRDSDTLELVVANGLVKVPNEDIC 
YPILLKKVNFSIDTEKNIISITDASDNDFITQELYLNFLAEVENINLDKVFYLEDKIVEN NIHPISKNDTIKDFFREFIHNLNPRAQFIEDLDKKNKESVITIEWKPILFIRKKDDGKVE AINNIIKDIENGGEIPEYLSELVGVIGSDKRAVEPIPNILFTKETNNEQIEIIKSLYSHR AVVVQGPPGTGKTHTIANLLGHFLAEGKNVLITSQTKKALDVLKEKIPTDIQDLCISMLD DDSSDLGNSVESISEKLGYLNLETLKNEYEEIENQRNELKEDIKNIKRKIFNIKYQESHP IIYNNESITLREAGEFFKKKSKRIG >gi|224461383|gb|ACDC01000019.1| GENE 9 10989 - 11735 911 248 aa, chain - ## HITS:1 COG:FN0950 KEGG:ns NR:ns ## COG: FN0950 COG2099 # Protein_GI_number: 19704285 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6x reductase # Organism: Fusobacterium nucleatum # 1 248 19 266 266 403 88.0 1e-112 MIWVIGGTKDSRDFLEKFVKYENDIIVSTATEYGAKLIENLPVKTSSEKMDKEAMLKFVE NNKITKVIDTSHPYAFEVSKNAMDVAEEKNIQYFRFEREKVDILPKKYKNFEEIKDLIEY VENLEGNILVTLGSNNVPLFKDLKNLSNMYFRILSRWDMVKRCEDNNILPKNIIAMQGPF TENMNIAMMEQFNIKYLITKKAGDTGGEREKVSACDKLDVEIIYLDKKEMSYRNCYTDID VLIKNLIK >gi|224461383|gb|ACDC01000019.1| GENE 10 11866 - 12615 1319 249 aa, chain - ## HITS:1 COG:FN0951 KEGG:ns NR:ns ## COG: FN0951 COG1010 # Protein_GI_number: 19704286 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Fusobacterium nucleatum # 1 249 1 249 249 466 93.0 1e-131 MSNGKIYVVGIGPGNMEDISIRAYNVLKNINVIAGYTTYVDLVKDEFPDKEFLVSGMKRE IERCREVLEVAKTGKNVALISSGDAGIYGMAGIMLEVAMGSGIEVEVVPGITSTIAGAAL VGAPLMHDQAIISLSDLLTDWEVIKKRIDCASQGDFAISLYNPKSKGRTEQIVEAREIML KHKLPTTPVALLRHIGRKEENYTLTTLEDFLNFDIDMFTIVLVGNSNTYVQDGKMITPRG YEKKSNWGK >gi|224461383|gb|ACDC01000019.1| GENE 11 12608 - 13618 1297 336 aa, chain - ## HITS:1 COG:FN0952 KEGG:ns NR:ns ## COG: FN0952 COG2073 # Protein_GI_number: 19704287 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Fusobacterium nucleatum # 1 322 1 322 337 529 93.0 1e-150 MKLAFWTVTKGAGNIAREYKEKLQEHLKEDSIDVFTLKKYDVENTIQIEDFTANINEKFS QYDGHIFIMASGIVIRKIASLIGTKDKDPAVLLIDEGKHFVISLLSGHLGGANELTHSLA NILKLVPVITTSSDVTGKIAVDTISQKLNAELEDLKSAKDVTSLIVNGQKVNILLPKNVK 
VADNISADGFILVSNKKNIEYTRIYPKNLILGIGCKKDTKAEDILRAIEDCLDKNNLDIK SVKKVATVDVKENEQGLIDAVKFLNLDLEIISRDEIKKVQDQFEGSDFVEKNIGVRAVSE PVALLSSTGNGKFLVMKEKYNGITISIYEEEIDKYE >gi|224461383|gb|ACDC01000019.1| GENE 12 13631 - 14362 595 243 aa, chain - ## HITS:1 COG:no KEGG:FN0953 NR:ns ## KEGG: FN0953 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 243 1 243 243 403 88.0 1e-111 MANYYYDGSFDGLLTVIYMAYEDRENKMLRVNANTEQLILSLEGIHIATDFSKARRVEKA ICEKLSYNFLNNIRTCFLSYDKNKDTVIIHTVYKALKQGEEILNSLDEHAFYLNKLVKQV LNERHKYLGLVRFKEMKDGTMFSTIEPKNNVLPILISHFKNRMKREKFAIYDKGRKMIVY YDGEKAEIFFVESLEIEWSDEEIEYSKLWKTFHKTISIKERENKKLQQSNLPKYYWKYLV EDM >gi|224461383|gb|ACDC01000019.1| GENE 13 14399 - 15649 1177 416 aa, chain - ## HITS:1 COG:FN0954 KEGG:ns NR:ns ## COG: FN0954 COG4277 # Protein_GI_number: 19704289 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Fusobacterium nucleatum # 1 414 1 414 415 775 94.0 0 MSKSIEEKLRILSDAAKYDVSCSSSGSSRKNTNNGLGNAAINGICHSWSADGRCISLLKI LMTNYCIYDCKYCINRKDNDIERAILSPDEIVKLTINFYRRNYIEGLFLSSGIIKSADYT MELMIAVAKKLRLEEKFNGYIHMKVIPGASRQLINEIGLYVDRVSVNIEFAENTALKLLA PDKKATDISTSMGLIRKNMIENAEDKKIFKSTPSFTPAGQTTQMIIGASGESDYAILARS ENLYKNFDLKRVYYSGYVPVNKSGILVSTEQAVPMIREHRLYQADWLLRFYDFKADEILD EKDPFVDPLLDPKTNWAIKNSHFFPIEINKASYKDLLRVPGIGVTSAKRIVMTRKYSTIR YEHLKKLGIVIKRAKYFIVVNGEFLGFKKENPELLRNALMEKEKMVTEQLRLFNGL >gi|224461383|gb|ACDC01000019.1| GENE 14 15736 - 16638 880 300 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740123|ref|ZP_04570604.1| ## NR: gi|237740123|ref|ZP_04570604.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 300 1 300 300 472 100.0 1e-131 MKNKIEESYNKCLNLFKEGKRDTEEYRKELENVIELAKDNNEFKLCYFNAKFRLAQFYNE KHKYDLSKKHFLELINDKNMEEFKLDAIMHHAYNLRILKKYDEATFWYEKLSELSTSKYY DEVVLEGLAKCATMVNNLEKERENYRILLSSCLNKEDFKGLAEKILNLKSQLLSTVDQKQ KEKINTEIIYLNNDLDTAYYKLIDLKMKIAKSYFNEKKYEDCRKEVRTIFEFLEYSISDM QDYAITNANMLLGKTYFEEANFEKAREYFEPIANTPKEDKYYKYMISDIHAARNFLAKMK >gi|224461383|gb|ACDC01000019.1| GENE 15 16667 - 17194 704 175 aa, chain - ## HITS:1 COG:no KEGG:FN0955 NR:ns ## KEGG: FN0955 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 173 1 173 175 290 89.0 2e-77 MKRKFINVTKEYIENLAPTDFCVELIQPAWETVNIYGSYKEYEESLKPYTTEQRYLLAMH WLGAEVANGGFQQFLSNSTGIVWEDAYKGYQAIGSEKLAYLIEELIKIYGRDIPFDREER GNILESFNQEKLEEIDAITDLYYEIEEPEWRKVTLWVKANSEKFFIQAEINDYSR >gi|224461383|gb|ACDC01000019.1| GENE 16 17197 - 17424 318 75 aa, chain - ## HITS:1 COG:no KEGG:FN0956 NR:ns ## KEGG: FN0956 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 75 1 82 82 85 58.0 6e-16 MKTLNKKNWQYEKHGIDGEVELFGVNIFDYKWQDTKEIAKECDFPIYKVIIDGKEHEFAT KETSNNVWCFYLPKE >gi|224461383|gb|ACDC01000019.1| GENE 17 17555 - 17941 546 128 aa, chain - ## HITS:1 COG:CAC2466 KEGG:ns NR:ns ## COG: CAC2466 COG0346 # Protein_GI_number: 15895731 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 128 1 130 132 103 45.0 8e-23 MKYNDLIPELVVSNINISRDFYVNMLGFKVEYEREEDKFIFLSLGNIQLMLEEGSEEELS QMEYPFGKGINFTFGVNNVDELYSKFKIKKDLLKRDIEIREFRVNDEIIYTKEFSILDPD GYFIRISE >gi|224461383|gb|ACDC01000019.1| GENE 18 17959 - 18732 1418 257 aa, chain - ## HITS:1 COG:FN0957 KEGG:ns NR:ns ## COG: FN0957 COG2875 # Protein_GI_number: 19704292 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Fusobacterium nucleatum # 1 257 1 257 257 451 94.0 1e-127 MEKYKEKVYFIGAGPGDPELITIKGQRIVKEADVIIYAGSLVPKEVIDCHKEGAEIYNSA 
SMSLDEVIDVTVKAIKANKKVARVHTGDPAIYGAHREQMDMLDEYRIEYEVIPGVSSFLA SAAALKKEFTLPTVSQTVICTRIEGRTPVPEKESLESLAKHRASMAIFLSVHMIDKVVET LATSYPMTTPVAVVQRASWPDQKIVLGTLETIEQKVKEAGINKTAQILVGDFLGDEYEKS KLYDKYFTHEYREAVKK >gi|224461383|gb|ACDC01000019.1| GENE 19 18770 - 19462 1014 230 aa, chain - ## HITS:1 COG:no KEGG:FN0958 NR:ns ## KEGG: FN0958 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 230 1 226 226 282 72.0 6e-75 MKKLLAILFLIIAVQGIAETVVKGTYETKRKRYVELTQTLEKEIFLNYTLPDNKKEIVTY RDDDYDILVNKNDLLEVYNRGRKEPITDIKSKITYKDTSFKKLGSIYNDFAELIENNKAV VYDRKNEKEINYLIKVKYGNAIYYDNGGGSYYDGYKFYKDKDWTEPVLKFDIVTEYGIAI HSSLGDNPYNRELTEKEKERIEKYDEKFNEEKELYQRAMQTPDVTQSFSY >gi|224461383|gb|ACDC01000019.1| GENE 20 19477 - 20199 1027 240 aa, chain - ## HITS:1 COG:FN0959 KEGG:ns NR:ns ## COG: FN0959 COG2243 # Protein_GI_number: 19704294 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Fusobacterium nucleatum # 1 240 9 248 248 429 97.0 1e-120 MNNKFYGIGVGVGDPEEITIKAINTLKKLDVVILPEAKKDDGSVAYEIAKQYMKEDVEKV FVEFPMLKSLEDRENARKENAKIVQKLLDEGKNVGFLTIGDTMTYSTYVYILEHLPEKYL VETVPGVSSFVDMASRFNFPLMIGDETLKVVSLNKKTNIEFELENNDNIVFMKVSRNFEN LKQALIKTGNIDKIIMVSDCGKESQKVYYDIKDLTEDDIPYFTTLIVKKGGFEKWRKFSI Prediction of potential genes in microbial genomes Time: Thu May 19 22:49:33 2011 Seq name: gi|224461382|gb|ACDC01000020.1| Fusobacterium sp. 2_1_31 cont1.20, whole genome shotgun sequence Length of sequence - 19801 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 3, operones - 3 average op.length - 6.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 563 574 ## FN0960 hypothetical protein 2 1 Op 2 . - CDS 597 - 1061 500 ## FN0961 hypothetical protein 3 1 Op 3 . - CDS 1102 - 1974 1084 ## FN0962 putative cytoplasmic protein 4 1 Op 4 . 
- CDS 2001 - 3239 1387 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 3272 - 3331 8.4 5 2 Op 1 1/0.000 - CDS 3391 - 3960 891 ## COG2242 Precorrin-6B methylase 2 6 2 Op 2 1/0.000 - CDS 3984 - 4946 1256 ## COG1052 Lactate dehydrogenase and related dehydrogenases 7 2 Op 3 6/0.000 - CDS 4936 - 5589 875 ## COG2241 Precorrin-6B methylase 1 8 2 Op 4 . - CDS 5576 - 6703 1597 ## COG1903 Cobalamin biosynthesis protein CbiD 9 2 Op 5 . - CDS 6719 - 7456 591 ## FN0968 hypothetical protein 10 2 Op 6 . - CDS 7453 - 8199 531 ## FN0969 hypothetical protein 11 2 Op 7 . - CDS 8196 - 8948 575 ## FN0968 hypothetical protein 12 2 Op 8 . - CDS 8945 - 9727 531 ## FN0969 hypothetical protein 13 2 Op 9 1/0.000 - CDS 9752 - 10402 1013 ## COG2082 Precorrin isomerase 14 2 Op 10 1/0.000 - CDS 10430 - 11422 1127 ## COG3177 Uncharacterized conserved protein 15 2 Op 11 3/0.000 - CDS 11422 - 12756 1567 ## COG1797 Cobyrinic acid a,c-diamide synthase 16 2 Op 12 . - CDS 12832 - 13911 1255 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase - Prom 13941 - 14000 12.7 + Prom 13968 - 14027 12.2 17 3 Op 1 13/0.000 + CDS 14178 - 15620 1772 ## COG1538 Outer membrane protein 18 3 Op 2 27/0.000 + CDS 15630 - 16700 1291 ## COG0845 Membrane-fusion protein 19 3 Op 3 . 
+ CDS 16697 - 19765 3623 ## COG0841 Cation/multidrug efflux pump Predicted protein(s) >gi|224461382|gb|ACDC01000020.1| GENE 1 2 - 563 574 187 aa, chain - ## HITS:1 COG:no KEGG:FN0960 NR:ns ## KEGG: FN0960 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 160 1 159 164 166 59.0 5e-40 MQIILLIFKLLGFLIPEKKNFYIADKWTIELPDKWDTTSKEIELDVESDNHPIIQTIFFQ PGSYLNIKAYYLDISKDDIYKKVEADIPDVIAVFENIISKIENKKEYHIPNYRSSKFKSY EYTYNEYDKNFYAITTGIFMKGRLLKIDVSSTIEKEVKTAISYLLSIKEADPKKTAFLKK VDSYRKH >gi|224461382|gb|ACDC01000020.1| GENE 2 597 - 1061 500 154 aa, chain - ## HITS:1 COG:no KEGG:FN0961 NR:ns ## KEGG: FN0961 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 154 1 154 154 216 75.0 2e-55 MKIFNIDNGWNIKLPENWKEERDNVDGYHIYYPTDSDLTIRVISFHFFRNIENDWKVLAP VDVLSEIFNESIKKIEPGNNTKVEEKKLNLNEFKIEDFKVECFESEYYENNEKVYNISCG IMITGYLLVINLYSASKEEVENAIKYIYSIEKTE >gi|224461382|gb|ACDC01000020.1| GENE 3 1102 - 1974 1084 290 aa, chain - ## HITS:1 COG:no KEGG:FN0962 NR:ns ## KEGG: FN0962 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 69 290 1 227 227 292 74.0 8e-78 MIKFTDFIKSFTEGKDSKIKFKFHMSTDFLGLKKSPYDCLMEDSKDWQNLNNYRNEKGKS SRLDEYDYLVSFAQYNIYGRNFFVFGGIYKVEIAKPEHYKIGGYNISLLDNNDPIGKFLN KYRKRLVIKLDENLGINFELTYETVAKKNIEVFEVFPNIASEKFTGYQNVSLMYKDLQVA LNDPTWIGALKNIKAVYVIVDTSNGKLYIGSAYGSDGLLNRWNKYVTNLTGENKEFEALI KEKGESYIQNNFKYSILEIFDTKTKDEYILERESYWKNVFETKKFGMNWN >gi|224461382|gb|ACDC01000020.1| GENE 4 2001 - 3239 1387 412 aa, chain - ## HITS:1 COG:SP0298 KEGG:ns NR:ns ## COG: SP0298 COG1373 # Protein_GI_number: 15900232 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Streptococcus pneumoniae TIGR4 # 1 410 1 400 402 343 49.0 4e-94 MIRIDRKEYLDFLIKSKDKQIIKVVSGVRRCGKSTLFEIYKDYLLKNKIEKKQIISINFE DMDYEELTNYKKLYEYIKSKMLDDKKNYIFLDEIQHVDKFEKVVDSLFIKDNVDLYITGS 
NAYFMSSELATLLSGRYIELKMLPLSFKEYYQARLKYEELEKKEPKILKTLMQYYNEYIV NSSFPYTLQLNNDLKNIYEYLNGIYNSVLLKDIVTRLKISDVMKLESVIKYIFDNIGNLT SISKIANTLTSMGRKTDTKTIEKYIKGLVDGLLIYEVNRYNIKGKEFLSTLSKYYVSDLG LRQMILGNRNIDMGHILENIIYLELLRRKVNVYVGQFDKNEIDFVVINSNEVEYYQVALT ILDGNTLKRELDAFKNIKDNYPKYLITLDDVLPNTDYDGIKVINALDWLLGE >gi|224461382|gb|ACDC01000020.1| GENE 5 3391 - 3960 891 189 aa, chain - ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 1 189 1 189 189 353 96.0 1e-97 MHIYDKEFTQTELPMTKQEIRAISIAKLMLKPNSILIDVGAGTGTIGIEAATYMPQGKVY AIEKEEKGLDTIKLNAEKFNLDNFELIHGKAPDAIPNIAYDRMFIGGSTGGLEEIINHFL TYAKDEAILVINCITLETQSKSLEILKEKGFKDIEVITVTVGRAKRVGPYTMMFGENPIC IIKVIKRNK >gi|224461382|gb|ACDC01000020.1| GENE 6 3984 - 4946 1256 320 aa, chain - ## HITS:1 COG:FN0965 KEGG:ns NR:ns ## COG: FN0965 COG1052 # Protein_GI_number: 19704300 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 2 320 3 321 321 520 84.0 1e-147 MENKLKIIFLDRNTVGPFELKDIFSKYGEYTEFNLTNDDDVASYLKDYDVIILNRIRLGK KEFEQAKHLKLVLLTGTGFNHIDLVAAKEHGVSIANVAGYSTNSVSQLTMTFLLNELTKV EKLSQKVKENKWNELSINMDNYYHVDTEDKILGILGYGNIGQKVAEYAKSFGMKVMVAKI PERKYTDNSDNRYDLDEVLEKCDIFSIHAPLTDLTKDLINLDRMKKMKKSAIILNLGRGP IINEEDLYYALKNNIIASAATDVMTTEPPKNDCKLLKLDNFTVTPHLAWKSQKSLERLFA AIENNLNLFLENKLIGVESK >gi|224461382|gb|ACDC01000020.1| GENE 7 4936 - 5589 875 217 aa, chain - ## HITS:1 COG:FN0966 KEGG:ns NR:ns ## COG: FN0966 COG2241 # Protein_GI_number: 19704301 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 1 # Organism: Fusobacterium nucleatum # 1 217 12 228 229 358 92.0 3e-99 MQSNKINVVGLGPGNIKYLSTAGIECIKEAEIIVGSTRQLSDLKTIISEKQEIYTLGKLA ELITYLKENIERKITIIVSGDTGYYSLVPYLSKNLSKDILNIIPNISSYQYLFSKLGENW 
QNFRLASVHGREFDYVKNINDEDIAGLVLLTDDIQNPYEVSKNLYNNGLRNLTVIVGENL SYDNEKITILEIEDYEKLNRKFDMNVLVLKKGENYGK >gi|224461382|gb|ACDC01000020.1| GENE 8 5576 - 6703 1597 375 aa, chain - ## HITS:1 COG:FN0967 KEGG:ns NR:ns ## COG: FN0967 COG1903 # Protein_GI_number: 19704302 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Fusobacterium nucleatum # 1 375 1 375 375 691 93.0 0 MEEKELKNGYTTGTCATAAVKVALEALVYGKKATEVEVTTLNHINLKIPVQKLRVRNNFA SCAIQKYAGDDPDVTDGISICAKVELVKELPKIDRGAYYDNCVIIGGRGVGLVTKKGLQI AVGKSAINPGPQKMITTVVNEILSGSDEKAIITIYIPEGRAKALKTYNPKMGVIGGISVL GTTGIVKAMSEDALKKSMFAEMKVLREDKNRDWVIFAFGNYGERHCQKIGLDTEQMIIIS NFVGFMIESAVKLGFKKIIMLGHIAKAIKVAGGIFNTHSRVADGRMETMASCAFLVDEKP EIIRKILFSNTIEEACDYIENNEIYHLIANRVAFKMQEYARADIEVSAAIFSFKGETIGE SDNYQRMVGECGAIK >gi|224461382|gb|ACDC01000020.1| GENE 9 6719 - 7456 591 245 aa, chain - ## HITS:1 COG:no KEGG:FN0968 NR:ns ## KEGG: FN0968 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 245 1 247 247 322 79.0 1e-86 MKIISKFKDFYDYKVVKYGVDEKLVYVRKTYCEYFQVLIGNISNINIDYRISEDDFNKNL KDDTKPIDKKNIHKILFIGEKLIHLFSTENGIYTHFDIKNENDLRKLNDFQYKKEVTFKN EKKFSIFSKFGSDWDNLLSFNRKKLITYDIDKDDIILNEPMLLIELIGTSKSSRYLYTYK FTYNPNLSQMGVYIDADFVWQSLVEFLSNKRSEKEISPEVSNENKILSKGFDLKTSFRPN MKKKK >gi|224461382|gb|ACDC01000020.1| GENE 10 7453 - 8199 531 248 aa, chain - ## HITS:1 COG:no KEGG:FN0969 NR:ns ## KEGG: FN0969 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 248 1 258 258 181 51.0 2e-44 MKIISKFKDFYDYKVAKYGVDEKLVYTRVTKNFRNSPRLFSINKTQPDYNNKILFVGDKI VLIFKTEEKLYTQFDLKDIELLKSKNSNVQIKNFFHHTNDSEITFLDGNTIFVNSFINID LYDLLKMNRKTFYNFFIKNKKDFFDIDEENNFFNEPIVLIEFLENVTDHDNRRATSIYKK TYNPNLSQMGIYIDEDFVWQSLVEFLSNKRSEKEISPEVSNENKILSKGFDLKTSFRPNM KKKHKGDI >gi|224461382|gb|ACDC01000020.1| GENE 11 8196 - 8948 575 250 aa, chain - ## HITS:1 COG:no KEGG:FN0968 NR:ns ## KEGG: FN0968 # Name: not_defined # Def: hypothetical protein # Organism: 
F.nucleatum # Pathway: not_defined # 1 245 1 246 247 271 68.0 2e-71 MKIISKFRDFYDYKVAKYGVDEKLVYTRKTYCEYFESFVIDVYTASDDRISEENFNKNLK ENFEYFKGINFHKILILGEKLIHLFFTENAVYTHFDAKKLDVSKGTYQSYYSKEITFNDG RNFEITTDFGWDKLFSYNRKKLFSSMRIDKSDIIFNEPMILIEYFGKSYNKNLKYHRPLY KFTYNPNLSQMGVYIDADFVWQSLVEFLSNKRSEKEISPEVSNENKILSKGFDLKTSFRP NMKKKHKGDI >gi|224461382|gb|ACDC01000020.1| GENE 12 8945 - 9727 531 260 aa, chain - ## HITS:1 COG:no KEGG:FN0969 NR:ns ## KEGG: FN0969 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 260 1 258 258 330 72.0 4e-89 MKIISKFKDFYDYKVAKYGVDEKLIYNRKTCCDYYKMKFQYLNLHKNVPEKVSVEDFDNI LKEHIKFFDKTNHNKILIVGEKIVHLFFTEDGVYTHFDIKNPKDIGGETIYKYWAYYDGT KEITFNDGKKFDIHITFNELWDDFFNYDRKRFLSYLNISKEEVLFNEPIILVEYIGGIDR KIARYDNSVYKFTYNPNLSQMGVYIDEDFVWQSLVEFLANKRSEKEISPEVSNENKILSK GFDLKTSFRPNMKKKHKGDI >gi|224461382|gb|ACDC01000020.1| GENE 13 9752 - 10402 1013 216 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 1 216 4 219 219 384 94.0 1e-107 MSYIKVPGDIEKRSFEIIEEELGDKAKKFSESEMPIVKRIIHTSADFEYADLIEFQNNAI ESGLKALEKGCKIYCDTNMIVNGLSKPALSKYNCSAYCLVSDKEVIEEAKKEGLTRSIVG MRKAGKDPETKIFILGNAPTALYQLKEMIENGEIEKPALVIGVPVGFVGAAESKEEFKKL GIPYITINGRKGGSTIGVAILHGIIYQIYKREGFHA >gi|224461382|gb|ACDC01000020.1| GENE 14 10430 - 11422 1127 330 aa, chain - ## HITS:1 COG:FN0971 KEGG:ns NR:ns ## COG: FN0971 COG3177 # Protein_GI_number: 19704306 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 330 1 330 330 545 90.0 1e-155 MKKELSPPFKITNDILNLVYEIGELVGKISAEKEFEKNLTLRRENRIKTIYSSLAIEQNT LTLEQVTDVINGKRVLAPLKDIKEVQNAYEIYERLEELNENSMKDLLLAHKIMTSELIKE SGRFRSKNAGVYQGDKLIHMGTLPEYIPELINNLFLWLKNSKEHPLIKAAVFHYEFEFIH PFQDGNGRIGRLWHSLILSKWKKFFAWLPIESLVQKYQKEYYIAINNSNKDGESTEFILF MLEIIKKTLIELIDTQEVTDKTTDKMTDKNKERVKLVLKYLGQNDSISNKEAQSLLGISE 
ATARRFLNSLVKENLLVAVGEYKARKYIKK >gi|224461382|gb|ACDC01000020.1| GENE 15 11422 - 12756 1567 444 aa, chain - ## HITS:1 COG:FN0972 KEGG:ns NR:ns ## COG: FN0972 COG1797 # Protein_GI_number: 19704307 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Fusobacterium nucleatum # 1 444 1 444 444 758 82.0 0 MKAFMLAGVSSGIGKTTISMALMSAFANVSPFKVGPDYIDPGFHEFITNNKSYNLDLYMM GEQGVRYSFYKHHKDISIVEGVMGLYDGIDNSLDNNSSAHVARFLGIPVILVVDGVGKST SIAAQILGYKMLDPRVNIAGVIINKVSSEKTYAIFKEAIEKYTSVKCLGFIEKNEALNIS SRHLGLLQADEVEDLRDKLFILKNLVLKNIDLEALEKIATEETRTINIDKDEIEYPLYLS SLKDKHKGKVIAIARDRAFSFYYNDNIEFLEYMGFRMTYFSPIKDKKVPDCDAIYLGGGY PENFAEELSNNKEMIESIKENYEQGKNILAECGGFMYLSHAIEQKDETLHQMCGLVPCTV VMNNRLDISRFGYISIRDKNDIEVAKGHEFHYSKLKTVLEDTREFKAVKKDGRNWECIFH EKNMYAGYPHIHFFGSYKLLEELF >gi|224461382|gb|ACDC01000020.1| GENE 16 12832 - 13911 1255 359 aa, chain - ## HITS:1 COG:FN0973 KEGG:ns NR:ns ## COG: FN0973 COG0079 # Protein_GI_number: 19704308 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Fusobacterium nucleatum # 3 358 2 357 357 595 87.0 1e-170 MSKDLHGGNIYKFQREGKNDILDYSSNINPLGVPQKFINIAKESFDKLVNYPDPYYIDLR KKIAEFNSLDLSNIIVGNGATEILFLYLKALKPKKVLILAPCFAEYERALKSVSAEITYF ELKESDNFYPNIENLKKEIETNNYDLLLFCNPNNPTGQFIKLEDIKEIVVACENKNTKIF VDEAFIEFIENWQEKTVSLFKNKNIFIMRAFTKFFAIPGLRLGYGIGFDDEILNKMWDEK EPWTVNTFANLAGLVMLDDKEYIEKSEKWILEEKKFMYKELSEFQYLKAYRTECNFILLK IQNISSASLRDKMIEKNILIRDASNFKFLDYHFVRLAIKDRESNIKVLEALADIIEYRG >gi|224461382|gb|ACDC01000020.1| GENE 17 14178 - 15620 1772 480 aa, chain + ## HITS:1 COG:FN0517 KEGG:ns NR:ns ## COG: FN0517 COG1538 # Protein_GI_number: 19703852 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 32 480 1 449 449 655 81.0 0 MTNRSSRWRNRMKIRSSLIFVSLILLVSCSKVNIENENNDMMSRLREKKESTEKLRVEKE 
GMIDLEEAIDLALKNNTQIKLKEIESQIAKIDKNISFGNFLPRISAMYSISELDRYMSAT IPAPDVTVGVLGGITLPSLPVTLTSRMVDKDFRNYALSAQLPIFVPATWFLYSAREKGEN ISLYTENLTKKMIKLKVISEYYYTLALTSEKNVLEKEYAYAKKLNKNAKLALKTGSILKW QEEETELLVKQKENALKNNERDLKIAKMNLMNEIGLDPYAEFRLVIPEDTVYKLPPLEDV VYDALVNSEVIKINHNLVAISKDKIKIAMSRFLPQISLDAGLVGTSVSYLNPQNILFGAI TGFLSLFNGFKDVNEYKKAKLQSEAAYIQREEAIMNTIISAVNSYNNVQKSIEDKELADL NYNVAKKKFKQKELEVEVGSATDTDLLKALSELEKAESIKLKTEYKYSVSIETLKMLIEK >gi|224461382|gb|ACDC01000020.1| GENE 18 15630 - 16700 1291 356 aa, chain + ## HITS:1 COG:FN0516 KEGG:ns NR:ns ## COG: FN0516 COG0845 # Protein_GI_number: 19703851 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 15 356 16 357 357 513 80.0 1e-145 MKKYILVFLLIFLFTACKKEAKEEMIRAVKIQEINSMQDENFNIDFPAQISPSQKTVLAF KYAGKIKSINFESGDFVKKGQIIAVIDDTDYKVNLDAFSKKYEAAKAVAQNAEQQFARAE KLYRGDALAKKDYDNALMQRNVAISTFKEASAGLQNARNTLNDTKIVAPYDGYIDKKVVD VGTVVPEGGPVVSFISNEITDISVNASVRDIEYIKNAENINFKDSTSDKIYSLKIKSVAQ NPDSINLTYPVVFTFSELSENDKFLSGQTGTVTIAVKNKGKEEILIPINAIFEDKGSNVY LFKDGKAVKTPIELGELRETDKISVVKGLKTGDKVIVAGVSKLADGDKVKLLGGNK >gi|224461382|gb|ACDC01000020.1| GENE 19 16697 - 19765 3623 1022 aa, chain + ## HITS:1 COG:FN0515 KEGG:ns NR:ns ## COG: FN0515 COG0841 # Protein_GI_number: 19703850 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 1022 1 1022 1022 1731 87.0 0 MKIIEYSIKNRIVVLFATLVLTLAGVISYFRLGKLEDPEFKVKEAIVVTLYPGASPESVE QEVTDKIEMALRKIPNADVDSVSKASYSEVHIKIDESTPSDKVDQEWDVVRKKINDVKTS LPLGALPPVVLDDYGDVYGMFFAITSEGFSKEELYNYAKEIRKELEKTDGVAKTTLFANS DTVIEVLVDRDKIASLGINEKMIALAFTGQNIPAYANSVLHGDKNLRFDIDQSFESIEDI ENLVIYSTPAVLNIQKPTTVLLKDIAEVRRTEVKPYTTKMRYNGKESIGLMLSPVSGTNV VETGKEISKKIELLKEDLPHGIEIEKVYYQPELVSTAINQFIINLIESVVVVVGVLLITM GIKSGLIIGSGLILSILGTLIAMLAMKIDLQRVSLGAFIIAMGMLVDNSIVVVDGVLDSL DNGDNKYTALTKPTSKTAIPLLGATFIAVIAFLPMYMMPTTAGEYIKSLFWVVAISLGLS WIISLTQTTVFCDIYLSENNIKGGDNKGKLFHNKFVVILEKILIYKKLSMIILLGAFFLS 
LLLFIKVPLSFFPDSDKKGFVINLWNPEGTDIEYTNKINQVVESEVLKQEGVVSVTSAIG GSPSRYYISSIPELPNTALSQLIISVEKLEDINKIGQDVKDFVDNNFPDTRVEIRKYTNG IPTRYPIQLRIIGEDLNVLREYSKKFENILRNIDGAENIQTDWKEKQLVIKPEIDKVKER ESLVTALDIATSLNRTTNGIKIGTFKDGEENIPVLFKEKNNGREFNINNLGQVPVWGLGP RSIPFRELIKKENLVWENPIIIRKDGFRAIQIQADVKNGYRVEAVRKEFAKAIKESEIEL PKGYKLEWSGEFYEQEKNTEEIISYIPLQLIIMFMTCVLLFGNLRDPFIIFGVLPLSFIG ILPGLFITGRTFGFMAIIGTISLSGMMIKSGIVLIDQIRYEIYTLNKEPFKAIIDSSASR IRAVILAAGTTVLGMIPLMFDPLFSDMAITIVFGLTVATLLILFVVPLLYSIFYKIDKPK EN Prediction of potential genes in microbial genomes Time: Thu May 19 22:50:05 2011 Seq name: gi|224461381|gb|ACDC01000021.1| Fusobacterium sp. 2_1_31 cont1.21, whole genome shotgun sequence Length of sequence - 21824 bp Number of predicted genes - 23, with homology - 21 Number of transcription units - 10, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 21 - 527 572 ## COG0602 Organic radical activating enzymes 2 1 Op 2 3/0.000 - CDS 532 - 2718 2720 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 2759 - 2818 13.3 - Term 2777 - 2836 12.4 3 2 Op 1 8/0.000 - CDS 2862 - 3932 1523 ## COG3839 ABC-type sugar transport systems, ATPase components 4 2 Op 2 . - CDS 3947 - 5629 1815 ## COG1178 ABC-type Fe3+ transport system, permease component - Prom 5660 - 5719 8.0 + Prom 5605 - 5664 8.2 5 3 Tu 1 . + CDS 5721 - 5786 58 ## + Term 5887 - 5937 1.2 6 4 Tu 1 . - CDS 5838 - 6893 1724 ## COG1840 ABC-type Fe3+ transport system, periplasmic component - Prom 6997 - 7056 11.0 + Prom 6999 - 7058 10.9 7 5 Op 1 . + CDS 7088 - 7786 205 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 5 Op 2 . + CDS 7783 - 8559 378 ## Teth514_0330 ABC-2 type transporter + Term 8586 - 8651 5.2 - Term 8581 - 8630 -0.0 9 6 Tu 1 . - CDS 8635 - 9144 571 ## Clos_2471 FMN-binding domain-containing protein - Prom 9195 - 9254 2.6 10 7 Op 1 . 
- CDS 9279 - 9407 123 ## gi|262066988|ref|ZP_06026600.1| putative lipoprotein 11 7 Op 2 . - CDS 9368 - 9613 205 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 12 7 Op 3 17/0.000 - CDS 9648 - 10262 271 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 13 7 Op 4 44/0.000 - CDS 10292 - 11122 247 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 14 7 Op 5 49/0.000 - CDS 11127 - 11987 1154 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 15 7 Op 6 38/0.000 - CDS 11977 - 12954 243 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 16 7 Op 7 . - CDS 12983 - 14620 2234 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 14657 - 14716 6.2 - Term 14645 - 14681 -0.5 17 8 Tu 1 . - CDS 14818 - 14898 71 ## - Prom 15007 - 15066 10.3 - Term 14988 - 15059 12.9 18 9 Op 1 13/0.000 - CDS 15102 - 16880 2679 ## COG0173 Aspartyl-tRNA synthetase 19 9 Op 2 1/0.000 - CDS 16898 - 18139 1804 ## COG0124 Histidyl-tRNA synthetase 20 9 Op 3 . - CDS 18151 - 19374 1206 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Prom 19395 - 19454 13.0 + Prom 19347 - 19406 13.1 21 10 Op 1 . + CDS 19507 - 20112 819 ## CLH_2545 hypothetical protein 22 10 Op 2 . + CDS 20130 - 21002 824 ## COG4296 Uncharacterized protein conserved in bacteria 23 10 Op 3 . 
+ CDS 21034 - 21801 913 ## FN0296 putative cytoplasmic protein Predicted protein(s) >gi|224461381|gb|ACDC01000021.1| GENE 1 21 - 527 572 168 aa, chain - ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 168 1 168 168 304 92.0 5e-83 MNYSGIKYADMINGKGIRVSLFVSGCTHCCKNCFNEETWKETYGNKFTEKEEDEIIEYFK KYGKTIRGLSLLGGDPTYPKNIKPLLKFIKKFKENLPDRDIWIWSGFTWEEILEDENRFS LVKECDILIDGKYMDNLKDLNLKWRGSSNQRVIDIKKSLEKNEIIEYI >gi|224461381|gb|ACDC01000021.1| GENE 2 532 - 2718 2720 728 aa, chain - ## HITS:1 COG:FN0311 KEGG:ns NR:ns ## COG: FN0311 COG1328 # Protein_GI_number: 19703656 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Fusobacterium nucleatum # 1 728 1 728 728 1413 95.0 0 MKRVIKRDGSVVEFSKDRIINAISKTFIQASREPNMKLIEKIATQVEELPSKVLSVEEIQ DIVVKKLMASSEKDIAMSYQSYRTLKAEIRDREKGIYRQISELVDASNEKLLSENANKDA KTISVQRDLLAGISSRDYYLNKIVPEHIKLAHIKGEIHLHDLDYLLFRETNCELVNIETM LRGGCNIGNAKMLEPNSVDVAVGHIVQIIASVSSNTYGGCSIPYLDRALVRYIKKTFKKH FLRGAKYIDDLKEEEIEELKKEDLEYSNQFIKNKYPKTYEYSVDMTEESVKQAMQGLEYE INSLSTVNGQTPFTTIGIGTETSWEGRLVQKYVLKTRMAGFGAKKETAIFPKIVYAMCEG LNLNEEDPNWDISQLAFECMTKSIYPDILFITDEQLKNETVVYPMGCRAFLSPWKDENGK EKYAGRFNIGATSINLPRIAIKNRGNEEGFYKELDRILEICKDNCLFRAKYLENTVAEMA PILWMSGALAEKNQKDTIKDLIWGGYSTVSIGYIGLSEVSQLLYGKDFSESEEVYEKTFN ILKYMADKVLEYKQKYNLGFALYGTPSESLCDRFARVDKQEFGDIKGITDKGYYDNSFHV SSRINISPFEKLRLEALGHKYSAGGHISYIETDSLTKNLEAIPEILKYAKMVGIHYMGIN QPVDKCHICGYKGEFTATKEGFTCPQCGNHDSNEMSVIRRVCGYLSQPNARPFNKGKQEE IMHRVKHN >gi|224461381|gb|ACDC01000021.1| GENE 3 2862 - 3932 1523 356 aa, chain - ## HITS:1 COG:FN0310 KEGG:ns NR:ns ## COG: FN0310 COG3839 # Protein_GI_number: 19703655 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Fusobacterium 
nucleatum # 1 356 1 356 356 658 95.0 0 MASVTIKGVTKSFGKVTVLQEFNQKFEDGEFITLLGPSGCGKTTMLRLIAGFEKPSSGEI YIGDKLVSSENEFLPPEKRGIGMVFQSYAVWPHMNVFDNIAYPLKIQKIAKNEIEERVSQ VLKIVHLEQYKDRFPSELSGGQQQRVALGRALVAQPEILLLDEPLSNLDAKLREEMRYEI KEITKKLKITVIYVTHDQIEAMTMSDRIVLINKGEVQQVAPPQEIYSNPKNMFVANFVGK VDFIKGKVEGNKILLDNSNGQTLPNTSSFKEKVVVAIRPENAILSDDGEITAKVYSKFYL GDCNDLRVDIGNGNILRIIARASTYNTLNEGDEVKVKILDYFVFEDDGKNQIKIMT >gi|224461381|gb|ACDC01000021.1| GENE 4 3947 - 5629 1815 560 aa, chain - ## HITS:1 COG:FN0309 KEGG:ns NR:ns ## COG: FN0309 COG1178 # Protein_GI_number: 19703654 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 1 560 1 560 560 928 97.0 0 MVGQKKWRIDIKWIVILAIVAFLLIFEVFPLFYLLIKSLFSGGSFSLEAYRRVYTYDLNW IALKNTMITAGFTTIFGVALAFPLAFLVGRTDMYGKKFFRTLFVVTYMVPPYVGAMAWLR LLNPNAGVLNKFLMKIFGLGSAPFNIYTTSGIVWVLTCFFYPYAFITISRAMEKMDPSLE EASRISGASPLKTLFKVTIPMMTPSIIAAGLLVFVASASAYGIPSIIGAPGQIYTVTMRI IDFVHIGSEEGLTDAMSLAVFLMLISNVILYISTFVVGKRQYITMSGKSTRPNIVELGKW RLPITIIISIFSFFVIILPFITVAITSFTVNMGKPLTFSNLSLKAWEKVFSRASIISSTT NSFLTATAAAFFGILISCVMAYLLQRTNVKGKRIPDFLITLGSGTPSVTIALALIISMSG KFGINIYNTLTIMVVAYMIKYMLMGMRTVVSAMSQVHPSLEEAAQISGANWLRMLKDVTL PLIGASIVAGIFLIFMPSFYELTMSTLLYSSNTKTIGYELYIYQTYHSQQVASALATAIL LFVILVNYILSKLTKGQFSI >gi|224461381|gb|ACDC01000021.1| GENE 5 5721 - 5786 58 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRANLAVFEQSEFSEFAANS >gi|224461381|gb|ACDC01000021.1| GENE 6 5838 - 6893 1724 351 aa, chain - ## HITS:1 COG:FN0308 KEGG:ns NR:ns ## COG: FN0308 COG1840 # Protein_GI_number: 19703653 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 351 1 352 352 611 95.0 1e-175 MKKKFFWLVLSLMALVLVACGGEKKEEATEATNPNSEISGKIVIYTSMYEDIIDNVKEKL EKEFPNLEVEFFQGGTGTLQSKIIAELQANKLGCDMLMVAEPSYSLELKEKGILHPYLSK NAENLALDYDKEGYWYPVRLLNMVLAYNPDKYKKEDLALTFEDFAKREDLAGKISIPDPL 
KSGTALAAVSALTDKYGEEYFQNLAKLKVVVESGSVAVTKLETGEAAEIMILEESILKKR EEENSTLEVIYPEDGIISIPSTIMTVKEDMSANKNIKAAEALTDWFLSPAGQEAIVAGWM HSVLKNPEKAPYDAKATDEILKAAMPINWEKTYKDREELRKMFEKFITKAN >gi|224461381|gb|ACDC01000021.1| GENE 7 7088 - 7786 205 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 226 245 83 25 1e-15 MLVLNNINKSFKNIKVLKNISFTLQENKIFAFLGPNGVGKTTLIKIISGLISADSGTILL DDKKISMDKISTMFDGSRNLYWNISVRENFYYFTALKGRFKKEVDYLLEKNKEIFQIDNL LYKKYGELSLGQKQIVAVINTLLSSPELACFDEPSNGLDIYYEEKLIQIISNYIKNDSNK IIISSHDINFLYKVVDNFIVINKGEIIGEFSKNNLSLEEVTAKYLELLEGKK >gi|224461381|gb|ACDC01000021.1| GENE 8 7783 - 8559 378 258 aa, chain + ## HITS:1 COG:no KEGG:Teth514_0330 NR:ns ## KEGG: Teth514_0330 # Name: not_defined # Def: ABC-2 type transporter # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 1 258 1 259 259 109 35.0 1e-22 MKKYLNAFKSEIITNLIIAKNYKFSFLMDVGIFISILSFLILSKSGYKYTLYYSKDFDFR ELVLIAYIMWIISLSAINTICSEIRSENVQGTLELKFMSILPFQILLLGKIISTLIIQIL EIIVVLLFTKFVFNLSIGVNFKIIGIMLLTYIGMYGFSLVVGLLILSKKKIGQLNMIIQI LLLVFSNVFTISNIGFFSYLIPLGIGNHLIHLSYLKEEISSSKLLIFIFVCLLWIIIGQY LFNKAINYVKEKGTLSLY >gi|224461381|gb|ACDC01000021.1| GENE 9 8635 - 9144 571 169 aa, chain - ## HITS:1 COG:no KEGG:Clos_2471 NR:ns ## KEGG: Clos_2471 # Name: not_defined # Def: FMN-binding domain-containing protein # Organism: A.oremlandii # Pathway: not_defined # 4 165 144 312 322 94 31.0 2e-18 MSYLEWQVLKNQKLDFEYKTLFGSSNSARNGFVPLLKEMSKEVQGKTSNKRYVGITQPYD SGISTRLEVIYENGKIVDLKYDEIFADDKKDIKNKTLQEFYRQSKLESIEYNRITNKSFR TFVNTLRREVLRSQSLTDFPTDALKLDMPHIKEAYENYLFVAEKIKDIK >gi|224461381|gb|ACDC01000021.1| GENE 10 9279 - 9407 123 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262066988|ref|ZP_06026600.1| ## NR: gi|262066988|ref|ZP_06026600.1| putative lipoprotein [Fusobacterium periodonticum ATCC 33693] # 1 40 197 236 453 85 97.0 9e-16 MYWSVQAPKGIIVGDYYSGQKVFDGGYEAYAEVVVNNGELFI >gi|224461381|gb|ACDC01000021.1| GENE 11 9368 - 9613 205 81 aa, chain - ## 
HITS:1 COG:FN0300 KEGG:ns NR:ns ## COG: FN0300 COG0614 # Protein_GI_number: 19703645 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 14 63 1 50 331 89 88.0 1e-18 MKKFIKKFMLFLFLFISSLSFAEIRFKDDVDREIVLEKPLTRVVVASRYNNELIRAIGNI KKVKKKLMLCIGLYKHLKELL >gi|224461381|gb|ACDC01000021.1| GENE 12 9648 - 10262 271 204 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 198 278 495 563 108 30 2e-23 MLEIKNVSFNYLKGMPILNDISLNVKAGEIVGIFGSSGCGKSTLAKIIAGILKPSKGTIL VDGKELEKEGYCSVQLIYQHPEKATNPKWKMDKILNEGWMVDNETIKKMGIKNYWLTRWP QELSGGELQRFCITRALGPKTKYLIADEITTMLDALTQVKIWKNLIIAAKEKNIGMIVVT HNKFLADRICDRIISLENLNSMDC >gi|224461381|gb|ACDC01000021.1| GENE 13 10292 - 11122 247 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 28 252 38 269 329 99 28 1e-20 MNILEVKNLNIGFNMYDKLLNQKLYQMVFDLNVTIKKGEILAIAGSSGSGKSLMAHAILG ILPKNTVVSAEIKFKNEIVDEDRLSQLRGKEITFVPQSIAYLDPLMTIEDQLMRKDINKQ DFFKVMDTLGFTKADLGKYPFQLSGGMARRVLIANTILSKADLIIADEPTPGLSLDLAIE VLNHFRNMANDGKGILLISHDIDLVCNIADRMSIFYGGHILETLNTKDFLKGEKYIRHPL TKAFWKALPQNDFEETDIEDIRLQCKKLNLELPILE >gi|224461381|gb|ACDC01000021.1| GENE 14 11127 - 11987 1154 286 aa, chain - ## HITS:1 COG:MA1912 KEGG:ns NR:ns ## COG: MA1912 COG1173 # Protein_GI_number: 20090761 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 1 286 1 285 285 280 53.0 2e-75 MIDNVLLQKKSNYKKLNLRQKTLLYLVVSVVFLLIILIWGGFMDKNSYGVDFTVRNNPPS LKHIFGTDWMGRDMLTRTIKGLSTSLIIGVVASLVSCVFAVIIGSACAIFGKKIDKFFLW LIDLFQGMPHLILLVLISILTGKGTVGILVGVALTHWTTLSRIIRAEILSIKGEPYIVIA KKLGTSNVKIASKHILPHIIPQFIVGAVLLFPHAILHEAGITFLGFGLPPEEPAIGVILS 
ESMRYITSGYWWLAFFPGLALMLMVFAFDAIGHNLEKIINPNTGQE >gi|224461381|gb|ACDC01000021.1| GENE 15 11977 - 12954 243 325 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 66 322 43 317 320 98 28 4e-20 MKKIIWQIAKLFFLLFAISIITFLLMKLTPIDPVSSYLGADSNVTQEQYDYLVKVWGLNQ PSIVQYFNWVKGALTGNMGDSFIYNKPVSELISKAASNTFMLMLTSWLLSGIIGFILGIV AAAYHGRIIDQIIQKISYLFASMPTFWFAILMLMFFAVKLGWFPVGMSAPIGVIEKDISL WDRIHHMILPAMTLSILGISKIALHTREKLIDVLNSEYFLYSKINGEKLWEFIRKHGIRN ILLPAVTIQFASISELFGGSVLVENVFSYAGLGNITKIAGVKGDMPLLLAITLVSAIFVF VGNLCANILYPIIDPRIREGMYNDR >gi|224461381|gb|ACDC01000021.1| GENE 16 12983 - 14620 2234 545 aa, chain - ## HITS:1 COG:MA1915 KEGG:ns NR:ns ## COG: MA1915 COG0747 # Protein_GI_number: 20090764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 9 541 8 548 553 379 39.0 1e-105 MKKILKIGISVVAVTLMFSFLGCEKNSSEIKEASKIERLENKEEIIVGVGQSMLEKGFDP CKGWGNYGVTLIQSKLLDFDFENNIIKDLAENYEVSEDGKTWIFKIREDVKFSDGQKLTA KDVAFTFNKTKEIGTTFDFKLLEKAEALDDKTVKFTFSAPSSTFIYNAANLGIVAEHAYK DTNTYSSNPIGSGPYKLVSYTQGQQLILDRNEEYYGTKPKFKRLTLVAMTPDTALASIKA GDIDIVNVSEAMAQEKIENYSILATKTMDFRAISMPTIKKSEKLTEKGNPMGNDVTSDIA IRKAINYGVDRQEIIENVLYGYGEVIFDFFDSLPWGIKDEIRKEFKNGDIAKANEILDKA GWKMKDDGIREKDGIKAEFRLLYPASDDTRQSCAEAFAVQCKKIGINVIPEGSDWTEMEK RQSSDACVIGGGQYTPEVVARFYFSERIGGPWSNIVRENNPIVDEHIRAAYLATDEKVAI KNWQKALWDGKEGGSVLGDAPYCTICYLEHLYFVRDGLDLGRQKLHTHARDLSLMANIEE WDFKK >gi|224461381|gb|ACDC01000021.1| GENE 17 14818 - 14898 71 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMSIMFLIFKAKREFGEIPKRARRCN >gi|224461381|gb|ACDC01000021.1| GENE 18 15102 - 16880 2679 592 aa, chain - ## HITS:1 COG:FN0299 KEGG:ns NR:ns ## COG: FN0299 COG0173 # Protein_GI_number: 19703644 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 592 1 592 592 1121 95.0 0 
MIYRTHNLAELREKNIGETVTLSGWVDTKRNVSTSLTFIDLRDREGKTQIVFNNELLSEK VLEEVQKLKSESVIRVVGEVKERSNKNPNIPTGDIEVFAKEIEILNACDTLPFQISGIDD NLSENMRLTYRYLDIRRSKMINNLKMRHRMIMSIRNYMDQAGFLDVDTPILTKSTPEGAR DFLVPSRTNPGTFYALPQSPQLFKQLLMIGGVEKYFQIAKCFRDEDLRADRQPEFTQLDI EMSFVEKEDVMNEIEGLAKYVFKNVTGEEANYTFQRMPYAEAMDRFGSDKPDLRFAVELK DLSDIVKNSSFNAFSSTVQNGGLVKAIVAPSANEKFSRKIISEYEEYVKTYFDAKGLAYI KLGADGISSPIAKFLSEDEMKAIIEKTEAKTGDVIFIVADKKKVVAAALGALRLRIGKDL DLINKDDFKFLWVVDFPMFDYDEEEQRYKAEHHPFTSIKAEDLDKFLAGQTEDIRTNTYD LVLNGSEIGGGSIRIFNPKIQSMVFDRLGLSQEEAKAKFGFFLDAFKYGAPPHGGLAFGI DRWLMVMLKEESIRDVIPFPKTNKGQCLMTEAPNTVDDKQLEELFIKSTFEK >gi|224461381|gb|ACDC01000021.1| GENE 19 16898 - 18139 1804 413 aa, chain - ## HITS:1 COG:FN0298 KEGG:ns NR:ns ## COG: FN0298 COG0124 # Protein_GI_number: 19703643 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 413 1 413 413 707 85.0 0 MELIRKPKGTKDIIGEDAVKYIYISNVTQEMFENYGYKFAKTPIFEETDLFKRGIGEATD VVEKEMYTFKDKGDRSITLRPENTASMVRCYLENSIYAKEDVSRFYYNGSMFRYERPQAG RQREFNQIGVEVFGEKSPILDAEVIAMGYNFLTKLGITDLEVKINSVGSKGSRTIYREKL VEHFQSHLDDMCEDCKDRINRNPLRLLDCKVDGDKDFYKSAPSIIDYLFEDERKHYEEVK KYLTIFGVKFTEDPTLVRGLDYYSSTVFEIVTNKLGSQGTVLGGGRYDNLLKELGDKDIP AFGFAAGVERVMMLIGDNYPKDVPDVYIAWLGDDTIETAMKIADSLRKENVKVYVDYSSK GMKSHMKKADKLETKYCIILGEDELNKGIVLLKDFSTREQKEVKIEEIINHIK >gi|224461381|gb|ACDC01000021.1| GENE 20 18151 - 19374 1206 407 aa, chain - ## HITS:1 COG:FN0297 KEGG:ns NR:ns ## COG: FN0297 COG2256 # Protein_GI_number: 19703642 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Fusobacterium nucleatum # 1 407 1 407 407 745 92.0 0 MNLFQNNYKNVEPLAYKLRPKNLDDFVGQEKLLGKDGVIRRLILNSALSNSIFYGPPGCG KSSLGEIISNTLDCNFEKLNATTASVSDIRTVVETAKRNIELYNKRTILFLDEIHRFNKN QQDALLSYTEDGTLTLIGATTENPYYNINNALLSRVMVFEFKALTNEDISKLIDKGLKFL NISMSGKIKEIIIDIAQGDSRIALNYVEMYNNIHSQMTEDEIFSIFKERQVSFDKKQDKY 
DMISAFIKSVRGSDPDAAVYWLARLLDGGEDPKYIARRLFIEASEDIGMANPEALLIANA AMNACERIGMPEVRIILSHATVYLAISSKSNSVYEAIDGALADIKNGELQEVPINICHDN VGYKYPHSYSDNFVKQKYMNKKKKYYKPGNNKNEKMIAEKLNKLWNE >gi|224461381|gb|ACDC01000021.1| GENE 21 19507 - 20112 819 201 aa, chain + ## HITS:1 COG:no KEGG:CLH_2545 NR:ns ## KEGG: CLH_2545 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 191 1 191 193 248 67.0 9e-65 MDILKKILNNPELTEKIRLKCDIELYPELQDLYDEDGHITWNIEGKAFGADGSGGEFVLL SDGTIGFNSSEGETGRIAENMKELFSLLVNCPCFFDFLMPDIYVDKALLKKYADKIEKEY REEFNDITDYDWDEIKREIAKELDLSVDDNIAENTLMKFYEIATREPQYQATYHEDDGTL TPSEPLISRPMGEWIRKKIGE >gi|224461381|gb|ACDC01000021.1| GENE 22 20130 - 21002 824 290 aa, chain + ## HITS:1 COG:all0924 KEGG:ns NR:ns ## COG: all0924 COG4296 # Protein_GI_number: 17228419 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 211 290 65 145 145 68 47.0 1e-11 MEKIQEIHKEILEGNTDILKDFPLPYCLSENKKDFVVLRKGRIVKEENHIKYFFPNSESN ETNGIYCLIWGRRNEESYGIGGTPIPDDFPIKEMKFEGDKLCLVSTKDEKIVASLKQFNK ALQKIWSNFSMEELSIAFREAPDTVLDEIKQEEMPKTVTIKNFGKFTYKKDDKAYKLVKE DIEYYFSADNKAELKKVKDIFSNIEIINLVEKAKEYTVKKLLKLKNDLWLEEDEKEVTKK EFKARMKFTSLYVFSESANFYFDDGDLFWGHTIEVNVNQNLEFTDANIVG >gi|224461381|gb|ACDC01000021.1| GENE 23 21034 - 21801 913 255 aa, chain + ## HITS:1 COG:no KEGG:FN0296 NR:ns ## KEGG: FN0296 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 255 1 255 255 410 81.0 1e-113 MKQNFNEIKQNWNFYMCTVDEKAASIRLNFALTEIAPVEDYAHRLTIFIKMNNPTEDGLS SNEEYPILCDIEDEVVDRLETLEDIFAGTVKTQGRLELYLFTKNPEKSEELCKEALAKFP DYLWKTYIDEDKEWDFYYNFLYPDVYSYQAIMNRSVIENLLKNEDKLEKEREIDHWLYFK TEENANLAIKKFEELGYEILSSKKLEDKSEHKYQVNISRMDNAIYAHVNEIVWELVEIAE SLDGYYDGWGCNITK Prediction of potential genes in microbial genomes Time: Thu May 19 22:50:32 2011 Seq name: gi|224461380|gb|ACDC01000022.1| Fusobacterium sp. 
2_1_31 cont1.22, whole genome shotgun sequence Length of sequence - 6909 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 8 - 65 13.0 1 1 Op 1 1/0.000 - CDS 79 - 3648 5043 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 2 1 Op 2 . - CDS 3725 - 4921 1959 ## COG0282 Acetate kinase 3 1 Op 3 . - CDS 4936 - 5454 847 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 4 1 Op 4 . - CDS 5473 - 6477 1531 ## COG0280 Phosphotransacetylase - Prom 6541 - 6600 15.7 + Prom 6572 - 6631 10.5 5 2 Tu 1 . + CDS 6724 - 6907 175 ## FN1061 hypothetical protein Predicted protein(s) >gi|224461380|gb|ACDC01000022.1| GENE 1 79 - 3648 5043 1189 aa, chain - ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 410 1 410 410 808 96.0 0 MTKKMQTMDGNQAAAYASYAFTEVAGIYPITPSSPMAEYTDEWAAKGMKNIFGVPVKLVE MQSEGGAAGTVHGSLQAGALTTTYTASQGLLLKIPNMYKIAGELLPGVIHVSARSLSAQA LSIFGDHQDIYAARQTGFAMLATNSVQEVMDLAGVAHLAALKSRVPFLHFFDGFRTSHEI QKVEVMEYDDLKKLVDWKALEEFRKRALNPEHPVTRGTAQNDDIYFQAREVQNKFYDAVP DIVADYMKEISKITGREYKPFNYYGAPDAERIIIAMGSVCEAAQEVIDYLVEKGEKVGLI SVHLFRPFSAKYFFDVLPKTVKRISVLDRTKEPGSLGEPLLLDIKALFYNKENAPLIVGG RYGLSSKDTTPAQILAVFDNLKKDEPKDAFTVGIVDDVTHTSLEVGSAIALADPSTKACL FYGLGADGTVGANKNSIKIIGDKTDLYAQGYFAYDSKKSGGVTRSHLRFGKKPIRSTYLV SKPTFVACSVPAYLHQYDMTSGLKEGGKFLLNCVWTKEEALEHIPNNVKRDLAKNKARLF IINATALAHEIGLGQRTNTIMQAAFFKLAEIIPFEEAQQYMKDYAKKSYAKKGDEIVQLN YNAIDRGANDIVEIEVDPAWANLEVTPLNEPKETAGCGGCCSTLENFVEKIAKPINAIKG YDLPVSAFLGYEDGTFENGTAAFEKRGVAVDVPIWNIDKCIQCNQCSYVCPHAVIRPFLI NEEELKASPVELATKKPTGKGLDGLAYRIQVSTLDCVGCGSCAHVCPAKALDMMPIADSL 
NDKEDIKADYLFNNVKYRSDLMPVDTVKGSQFSQPLFEFHGACPGCGETPYIKLITQLYG DRMMVANATGCSSIYSGSAPATPYTTNENGEGPSWASSLFEDNAEYGFGMHIGVEALRSR IQHTMEENMDKVDEETATLFKDWIANRQYSVRTREIRDILVPKLEALNTEFAKEILDLKQ YLVKKSQWIIGGDGWAYDIGYGGLDHVLASNEDVNILVMDTEVYSNTGGQASKATPTGAV AKFAASGKPVKKKDLAAIAMSYGHIYVAQVSMGANQQQFIKAVKEAEAHQGPSIIIAYSP CINHGIKKGMSQSQTEMKLATECGYWPIFRYNPSLEKLGKNPLQLDSKEPKWEKYEEYLT GEVRYQTLTKSNPEEAKVLFESNKKEAQKRWRQYKRMAALDYTEEKEEE >gi|224461380|gb|ACDC01000022.1| GENE 2 3725 - 4921 1959 398 aa, chain - ## HITS:1 COG:FN1171 KEGG:ns NR:ns ## COG: FN1171 COG0282 # Protein_GI_number: 19704506 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Fusobacterium nucleatum # 1 398 1 398 398 773 96.0 0 MKILVINCGSSSLKYQLINPETEEVFAKGLCERIGIDGSKMEYEVPAKDFEKKLEAPMPS HKEALELVMSHLTDKEIGVIASVDEVDALGHRVVHGGEEFAQSVLINDEVLKAIEANNDL APLHNPANLMGIRTCMELMPGKKNVAVFDTAFHQTMKPEAFIYPLPYEDYKELKVRKYGF HGTSHLYVSGIMREIMGNPEHSKIIVCHLGNGASITAVKDGKSIDTSMGLTPLQGLMMGT RCGDIDPAAVLFVKNKRGLTDAQMDDRMNKKSGILGLFGKSSDCRDMENAVKEGDERAIL AESVSMHRLRSYIGAYAAVMGGVDAICFTGGIGENSSMTREKALEGLEFLGVELDKEINS VRKKGNVKLSKDSSKVLVYKIPTNEELVIARDTFRLAK >gi|224461380|gb|ACDC01000022.1| GENE 3 4936 - 5454 847 172 aa, chain - ## HITS:1 COG:BS_yrkL KEGG:ns NR:ns ## COG: BS_yrkL COG2249 # Protein_GI_number: 16079700 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Bacillus subtilis # 3 168 2 171 174 113 33.0 2e-25 MKKTLIILAHPDLTRSMANKKLKEEAEKNTDIIVHDIYKEYPNGKIDLEKELNLIKETGT LILQFPMQWFNCPSLLKEWIDTVFMAAHFTESGEKILANKKIGLAVTTGAPKEVYEGKLE GILAPFVLSIDYLNAKNIPIFSVHGVMPGKISETEIEENAKKYVEYLKNNIE >gi|224461380|gb|ACDC01000022.1| GENE 4 5473 - 6477 1531 334 aa, chain - ## HITS:1 COG:FN1172 KEGG:ns NR:ns ## COG: FN1172 COG0280 # Protein_GI_number: 19704507 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Fusobacterium nucleatum # 1 334 4 337 337 607 94.0 1e-173 
MSFLGQVRKKALQANRRIVLPETSDERVIRAASLILKEGLAQVVLVGNQEAIMNSAKAYE VSLSGAKIVDPYNFERMNDYVNKLVELRSKKGMTPEEAKKLLLNDPNFFGAMLIRMGDAD GMVSGSASPTANVLRAAIQVIGTQPGVKTVSSVFIMELSQFKDLFGSILVFGDCSVIPFP TSEQLADIATSAAETAVKIAGINPRVALMTFSTKGSAKHECVDRVIEAGRILRERKVSFR FDDELQADAALVKSVGEIKAPLSDVSGNANVLIFPTLSAGNIGYKLVQRLAGANAYGPII QGLNAPVNDLSRGCSVEDIVVLTAITSAQACTEC >gi|224461380|gb|ACDC01000022.1| GENE 5 6724 - 6907 175 61 aa, chain + ## HITS:1 COG:no KEGG:FN1061 NR:ns ## KEGG: FN1061 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 61 1 61 184 80 78.0 1e-14 MIYKLNLLGFLLIVVAFFLGIKLPDWDFKLKLRHRNILTHSPFVTIILIALYEIDTSYFF K Prediction of potential genes in microbial genomes Time: Thu May 19 22:50:36 2011 Seq name: gi|224461379|gb|ACDC01000023.1| Fusobacterium sp. 2_1_31 cont1.23, whole genome shotgun sequence Length of sequence - 7229 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 9.0 1 1 Tu 1 . + CDS 221 - 6571 10061 ## FN0254 hypothetical protein + Term 6592 - 6642 9.2 + Prom 6691 - 6750 9.7 2 2 Tu 1 . 
+ CDS 6791 - 7189 551 ## FN2052 hypothetical protein Predicted protein(s) >gi|224461379|gb|ACDC01000023.1| GENE 1 221 - 6571 10061 2116 aa, chain + ## HITS:1 COG:no KEGG:FN0254 NR:ns ## KEGG: FN0254 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 577 2116 157 1677 1677 1639 67.0 0 MRNNNLHDVEKNLRSIAKRYENVKYSVGLAVLFLMKGTNAFSDNNMIQEAEKQKEVITND QVTKSMAKKQEEKSQKAKQGLKASWTNMQFGANDLYSNFFVTPKAEVEKTSIVKTEKAVL VASADNSVTLPMFAKLSSDIDTTDTNTPTMEEIKTSKENLRGSVGNLKDKIDVARRENNK EINGLRLELIQLMEQGNQVVKSPWASWQFGVNYFYDNWGSSYKGRGDKSEKYAYEGVLER DTNLFNRYVPVSSKNYSSLAKSSNPRSAASNQREGLKSYGLASTRLAEEPKIVTQVNAGI NPKTINKQALNISAKAANAVTLPETVSFSPISPKTPVIAPPNVVIKDVTLSPLWNADDRL PDTQYIYVSGTNVVKPGSTYNLTTNYKNPNIPGYNRTKTFSADEVLKTSDGTDYTTVPGY TARGGMRDYTYGAIFEFIGGDHTIEGLTLEVDTVRERNGNIIKGVRAISTDGNNNLKVRN KSNIKLSADNVVGFATDEDPNGAGNLRQLINDTDGLIYSTGKQNVSFILVKEQDNPAATH EHINNGKIIMDGDESYAFAFSKTNGVHPNVYITHNINNGEIVMAGKENYGFALGAGRDYK VADSYIDNTAGGTISMLGNKSMAIITQSNMDHANNAGTINIAGSESYGMYTESATSMTNT GDINITDYTKNFTSNNAGGKWLDNKEYSASNAQKSIGIASGKAGSTITNSGNINISTGEN NIGAYTNIGNVVNSGNINVTNGQNIGMYVAGTGTGTNTNTGKVTVTSDGSIGLMTTGTAT LTNNGELKVTGGNNATNTTGTIGMSAGVGSTIDSTNGKATIDVTGDKSIGLYSEGTLKIG ESTIKTNNGAINYYAKDNGKIEVVTGKTSTATTGQSSLLFYTKGTGKIILNGNINATIKG GSTPSTRGTAFYYEATPGAYGTFNTAAIQNYFNTSFGNGSSTLNNLTLNMEDNSRLFIAS NVEMNLSDTAATSLMTGITNAPTITGSNYKTFMLYLSKLNINQSVDLNDPNNAYNKLEIS NSTIENANSMTGSLNRQVAIAQENGNDTNGNGYNANKVTLTNTASGVINLTGDETTGIYA KRGIITNDGQISVGKKSTGIYIVEDDRSPATATLGAKATNSSTGVITLGEDSTGMYYKIE PDNADGKGTNTAIPGGLTNAGRIESTADNIIAMSFDSPYGTKTFENLSTGVIDLQGQNST GMFATGAGAYTAKNDGTIKLASSTNVNTPNIGMYAEEVQITPAPAPKTTLVTLENNGTIE GGDKTVGIYGHNVNLGASSLTKVGAGGTGVYSKGGSVIINGGTLSVGENGTTGSNDAVGV YYVGGTGGAITSNATDVKIGNSAYGFVVQNENGAAVTLSTNTPNVTLGEDAVYAYSNNKA GSITNGTTLTSTGDGNYGIYGAGTVTNNGQMNFGTGVGNVGIYSILGGTATNNSTITVGA SDAANEKYAIAMAAGYKSSDSGNIINSPTGVINVTGKDSIGMYATGPLSTATNKGTINLS GENSVGMYLDNGATGVNDGSITTIGAPKGVKGVVLSNNSKLINNAGATININSADGFAVY RSNTPKTNVTIVNYGDITVSGGAQADGEYDATGGKELEKTVGGVTLKSPKGTNDINITAN 
GQAVTNIEKVTEPVGTRGDALISNLGMYVDTLRGTNPINGLGYLNIEEADLLYGVEAAEN STSKYFEVSGNILKPYQDAMRTAPQTLKWSHNSAALTWMALPTLDANGIPSKVAMAKIPY TAFAGNESSPVAVTDTYNFLDGLEQRYGVEALGTRENKVFQKLNSIGNNEETLFYQAVDE MMGHQYANTQQRIQATGDILDKEFNYLRSEWQTVSKDSNKVKVFGAKGEYNSDTAGIINY KNNAYGVAYVHEDETVKLGDTIGWYAGIVHNTFKFKDIGNSKEEQLQGKIGIFKSIPFDH NNSLNWIISGDIFAGYNKMNRKFLVVDEMFGAKGRYHTYGLGLKNEISKDFRLSESFSLR PYAALGLEYGRVSKIREKSGEIKLDVKSNDYFSVKPEIGADLIFKQYFGRKTLKVGVTVA YENELGKVANGKNKARVAGTDADWFNIRGEKEDRRGNVKTDLNIGIDNQRIGLTGNVGYD TKGHNVRGGVGLRVIF >gi|224461379|gb|ACDC01000023.1| GENE 2 6791 - 7189 551 132 aa, chain + ## HITS:1 COG:no KEGG:FN2052 NR:ns ## KEGG: FN2052 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 132 1 119 119 112 85.0 5e-24 MKIKYLLASMLVLGSLSYSAEVTDTVAQEVITEVRNIEAEYQALMQKEAERKEEFIQEKA NLEKEVKELKEKQLGREELYAKLKEDSKIRWHRDEYKKLLKRFDEYYNKLEKKIADKEQQ IAELTKLLEVLN Prediction of potential genes in microbial genomes Time: Thu May 19 22:51:04 2011 Seq name: gi|224461378|gb|ACDC01000024.1| Fusobacterium sp. 2_1_31 cont1.24, whole genome shotgun sequence Length of sequence - 3982 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 3982 5745 ## FN0387 hypothetical protein Predicted protein(s) >gi|224461378|gb|ACDC01000024.1| GENE 1 1 - 3982 5745 1327 aa, chain + ## HITS:1 COG:no KEGG:FN0387 NR:ns ## KEGG: FN0387 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 744 1327 248 812 1724 296 38.0 6e-78 YLIYELSMRRKQKEMGNNNLYTIEKNLRSIAKKYRGIKYSLGLAILFLMMGLSAFSEEVM STQEIAASKENLRSSVETLQNKVEAARKENSKEINGLRLELIQLMEQGNQVVKSPWASWQ FGANYMYNDWKGTYKGRGDKAEKYPYEGVLERDSNEFNRYVAPNSSMYSSLETSTNARSA STNSRKGLNNYGLSSNRMQQEPIVSLEVNAGITPRVINKKSPDTSPAAPDVTLPTFEPKL ITPPVPPEKPEEPVLNPPELVVNVGSSGNGGGNVIIGNGSNSRIQSVAIAEGDFKVVRTD SSSWTYEYSGYKGSNVFSVGNATSETGLSIAADGTWGGTTGISGNGTGGGLGFQSVIGSW GSTDGDAGFLSNANVLYSRAHENASDLLGEFVHQDVHGSETITTQRARFAKAVTLATGTS LVAKGPAMLTAYDDAMSFGTQTTSHTWVNTGKIVIEGGNTSLTNTYTHGSASKQASINTG EVIFQPYKTTAGQEYNKYTAVFVISNDTHAQTENVAYNGATGKLKNYTMNAVGFVIDPTS NRKVYMVNRGDFEFYGEKSAGVYVKRASDINLQTVSKDFAFDVGKNEVTAGSFKPIKIFG DQSIGYYNIPSNGNSTTVGNFAVDIGAEGKGNQNFSTVTVSNSTAGTNITDLNINPTNGT NDNIQNSFGILSNKSINLTSHQIRIFDKTEGNVGVYPEANVALNIGGGSIELNGGTGTTS KNNIGIYIGPKPATPPAPAPTTGQGTVKSTGDIKVNGGVGNLAIFAVGGAVAAGETNNVE VREVKATDTKNSVLIYGSKGAKVLLSDGTGLPTKATYGLNISGATVEADASTVNKKDSGA VFATDTGTVITINRLTKETTPNISITGTKLTDADRYAGFGLMAKDGGVINAEKNYVKVSD GSTAVASVGTNAKVNMTKGTVEYKGHGYALYAANGGNIDMTDAKLILDGSAIGYEKDLSV ATLPITTTGMSIHIKSKDVTVLSLKNATTLNVSNISNTLNTWAGLAATPTHDAGAENYKM AAIDGLTAYNIDQDIDKKAVAAGTADANSNMYVRNLLVQRAKVNLAASKNVTAYLNTADL NSLDTSTVVGLDMNSSANAVGRSDTQINLAAGSSVNADRVDAGSGAVGLFINYGEANIGS GAKVNVEKSGLNDANAKAVGVYAVNGSTVNNEGEINVGGEGSIGVLGISYRKDSNGVLKR DEFGTKPNAGDVGVVNKGKIELDGKKAVGIYIENNDSNTSAPHTIEATNDTNGTINMSGQ EAIGMAAKLGNLVNKGTINITADKGTGMFVETDGVRPATMTNDSTGTISLGDSTSESVLR TGMFTKN Prediction of potential genes in microbial genomes Time: Thu May 19 22:51:21 2011 Seq name: gi|224461377|gb|ACDC01000025.1| Fusobacterium sp. 
2_1_31 cont1.25, whole genome shotgun sequence Length of sequence - 4116 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1870 2909 ## FN1554 hypothetical protein + Term 1896 - 1928 2.0 + Prom 1916 - 1975 13.1 2 2 Op 1 1/0.000 + CDS 2000 - 3304 1846 ## COG0001 Glutamate-1-semialdehyde aminotransferase 3 2 Op 2 . + CDS 3294 - 3758 592 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) - Term 3757 - 3790 2.1 4 3 Tu 1 . - CDS 3797 - 4114 390 ## gi|237740179|ref|ZP_04570660.1| conserved hypothetical protein Predicted protein(s) >gi|224461377|gb|ACDC01000025.1| GENE 1 2 - 1870 2909 622 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 622 959 1582 1582 709 64.0 0 ATNNAGHTINLSGDGSMGMYLDNGAIGVNNGTITTVGNPKEAVGIVVRNGAEFTNNGTIN INSNGGFAFFKANGGIIRNYGTFHIGGGAVKEYTPGSKPTGKELVVNGVKVLDINAPGGA ATATITANGQVQTPVVTNVSGNRNMLSSNIGLYIDTLRGTNPITGSLGVLGDAADLIIGS EAAQVTTSKYIQVPQQIIAPYNTTIAANPTIKNWNIYSGALTWISTATLDKTTGLINNVY LAKVPYTAFAGDEATPVAVTDTYNFLDGLEQRYGVEELGTRENRVFQKLNSIGKNEEALF YQATDEMMGHQYANVQQRIQATGDILNKEFDYLRSEWQTVSKDSNKVKVFGTRGEYNTDT AGVIDYRSHAYGVAYVHEDETVKLGDTLGWYAGIVHNKFKFKDIGGSREEMLQGKVGLFK SVPFDDNNSLNWTISGDISVGYNKMHRRFLVVDEVFGAKGRYRTYGIGIKNEISKDFRLS ESFSLKPYAALGLEYGRFSKIKERSGEIRLDVKSKDYFSIRPEIGAELGFKQYFGRKTLR VGVSVAYENELGRVANGKNKARVSHTSADWFNIRGEKEDRRGNVKTDLNIGVDNQRIGLT GNVGYDTKGKNIRGGLGLRVIF >gi|224461377|gb|ACDC01000025.1| GENE 2 2000 - 3304 1846 434 aa, chain + ## HITS:1 COG:FN0540 KEGG:ns NR:ns ## COG: FN0540 COG0001 # Protein_GI_number: 19703875 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Fusobacterium nucleatum # 1 434 1 434 434 798 90.0 0 MVFKNSIDLYKKALNLIPGGVNSPVRAFKSVNREAPIFVKKGQGARIYDEDDNEYIDYIC 
SWGPLILGHNHPKVIEEVKKIIENGSSYGLPTKYEVDLAELIVEIVPSIEKVRLTTSGTE ATMSAVRLARAYTGRNKILKFEGCYHGHSDALLVKSGSGLLTDGYQDSNGITDGVLKDTL TLGFGDLEKVENLLRNEEIACVIVEPIPANMGLIETHKEFLQGLRRITEETKTVLIFDEV ISGFRLALGGAQEFFGITPDLTTLGKIIGGGYPVGAFGGKREIMDLVAPVGRVYHAGTLS GNPIASKAGFATISYLKENPNIYKELAENTNYLLDNVEKLAEKYGVDVCINSMGSLFTIF FVDLEKVENLEDSLKANTENFSIYFNTMLDNGIVVPPSQFEAHFLSIAHTKKELDRTLEV MEMAFKKIGEKNAK >gi|224461377|gb|ACDC01000025.1| GENE 3 3294 - 3758 592 154 aa, chain + ## HITS:1 COG:FN0539 KEGG:ns NR:ns ## COG: FN0539 COG1648 # Protein_GI_number: 19703874 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Fusobacterium nucleatum # 1 152 1 152 152 179 75.0 2e-45 MPNKFFPVSIDLNNKNILVIGAGKIALRKVKTLLDYNCNITVITKEISEEEFLELEKENK IKILKNQEFEEKFLEDTFLVVSATDNKELNDKISKLCISKNILVNNITSQDNMNLRFMSI LSNDDIQISITANGNPKKAVEVKNKIKEFFEKML >gi|224461377|gb|ACDC01000025.1| GENE 4 3797 - 4114 390 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740179|ref|ZP_04570660.1| ## NR: gi|237740179|ref|ZP_04570660.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 105 1 105 105 176 100.0 5e-43 SVSFSIFAGITTDGKPHFDKMIGRKIDYPDTADSFKIVKKGNSYKLIYYGYDPETQKSST ETSTLKVYKNIYLIDKYGIVYGYDTAKKKVAFLRENLEVIYYEGQ Prediction of potential genes in microbial genomes Time: Thu May 19 22:51:40 2011 Seq name: gi|224461376|gb|ACDC01000026.1| Fusobacterium sp. 2_1_31 cont1.26, whole genome shotgun sequence Length of sequence - 15965 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 2, operones - 2 average op.length - 8.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 65 - 811 1016 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 2 1 Op 2 . 
+ CDS 825 - 1328 723 ## COG0716 Flavodoxins
+ Term 1333 - 1365 3.6
+ Prom 1839 - 1898 9.0
3 2 Op 1 8/0.000 + CDS 1921 - 2442 673 ## COG2065 Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase
4 2 Op 2 15/0.000 + CDS 2530 - 3420 1168 ## COG0540 Aspartate carbamoyltransferase, catalytic chain
5 2 Op 3 7/0.000 + CDS 3434 - 4714 2092 ## COG0044 Dihydroorotase and related cyclic amidohydrolases
6 2 Op 4 24/0.000 + CDS 4730 - 5806 1678 ## COG0505 Carbamoylphosphate synthase small subunit
7 2 Op 5 . + CDS 5821 - 8997 4834 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
8 2 Op 6 . + CDS 9011 - 9676 731 ## Vpar_1790 hypothetical protein
9 2 Op 7 13/0.000 + CDS 9711 - 10523 1334 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
10 2 Op 8 . + CDS 10535 - 11449 1449 ## COG0167 Dihydroorotate dehydrogenase
11 2 Op 9 . + CDS 11446 - 12123 882 ## FN0425 putative cytoplasmic protein
12 2 Op 10 . + CDS 12160 - 12915 783 ## Mmol_0394 hypothetical protein
13 2 Op 11 . + CDS 12932 - 13645 1050 ## COG0284 Orotidine-5'-phosphate decarboxylase
14 2 Op 12 . + CDS 13635 - 14510 1203 ## gi|237740193|ref|ZP_04570674.1| predicted protein
15 2 Op 13 1/0.000 + CDS 14521 - 15138 955 ## COG0461 Orotate phosphoribosyltransferase
16 2 Op 14 . + CDS 15138 - 15692 576 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
17 2 Op 15 .
+ CDS 15704 - 15910 244 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family Predicted protein(s) >gi|224461376|gb|ACDC01000026.1| GENE 1 65 - 811 1016 248 aa, chain + ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 15 248 1 234 234 383 85.0 1e-106 MLLFSKLVNNQGEHMFLQRTELLIGSDNLEKLKNSNIIVFGLGGVGGAAVESLVRAGIGN LSIVDFDTVDKTNLNRQIITTQSTIGRAKVEVAKERILAINPEINLTVYHEKFLKENVDL FFKDKKYDYIVDAIDLVTAKLDLIEFAIKSKTPIISCMGTGNKLDPSRFQVADIKKTSVC PLAKVIRKELKNRRISKLKVVYSDEVPRKPLNLDGGREKFKNVGSISFVPPVAGMLLASA VIKDICEL >gi|224461376|gb|ACDC01000026.1| GENE 2 825 - 1328 723 167 aa, chain + ## HITS:1 COG:FN0724 KEGG:ns NR:ns ## COG: FN0724 COG0716 # Protein_GI_number: 19704059 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 166 1 166 167 272 89.0 2e-73 MKTIGIFYATLTKTTVGVVDELEFFLKHDDFKTFNIKNSVKEIENYENLIFVTPTYQVGE AHAAWMNNLKKLEEIDFTGKVVGLVGLGNQFAFGESFCGGIRHLYDVIVKKGAKVVGFTS TDGYHYEETSIIEDGKFIGLALDEENQANLTPKRIENWIAEVKKEFN >gi|224461376|gb|ACDC01000026.1| GENE 3 1921 - 2442 673 173 aa, chain + ## HITS:1 COG:FN0418 KEGG:ns NR:ns ## COG: FN0418 COG2065 # Protein_GI_number: 19703760 # Func_class: F Nucleotide transport and metabolism # Function: Pyrimidine operon attenuation protein/uracil phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 173 5 177 177 280 93.0 1e-75 MKILLDENGIQRSITRISYEIIERNKTVENIVLVGIKNRGDILAERIKEKLMELENVDIP LETIDITYYRDDIDRKNFDLDIKDTEFKSNLTGKVVVMVDDVLYTGRTIRAGLDAILSKS RPAKIQLACLIDRGHRELPIRADFIGKNIPTSHSENIEVYLKEIDGKEEVVIL >gi|224461376|gb|ACDC01000026.1| GENE 4 2530 - 3420 1168 296 aa, chain + ## HITS:1 COG:FN0419 KEGG:ns NR:ns ## COG: FN0419 COG0540 # Protein_GI_number: 19703761 # Func_class: F Nucleotide transport and metabolism # Function: 
Aspartate carbamoyltransferase, catalytic chain # Organism: Fusobacterium nucleatum # 1 296 9 304 304 520 93.0 1e-147 MKNLLSMEDLTNEEILSLVKRALDLKNGAENKKRNDLFVANLFFENSTRTKKSFEVAEKK LNLNVIDFEVSTSSVQKGETLYDTCKTLEMIGIDMLVIRHSENEYYKQLENLKIPVINGG DGSGEHPSQCLLDIMTIYENYGKFEGLDIIIAGDIKNSRVARSNKKALTRLGAKVSFVAP EIWKDETLGEFVNFDDVIDKVDICMLLRVQHERHTDSKEKSEFSKENYHKNYGLTEERSK KLKEGAIIMHPAPVNRDVEIADSLVESEKSRIFEQMKNGMFMRQAILEYIIEKNNL >gi|224461376|gb|ACDC01000026.1| GENE 5 3434 - 4714 2092 426 aa, chain + ## HITS:1 COG:FN0420 KEGG:ns NR:ns ## COG: FN0420 COG0044 # Protein_GI_number: 19703762 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Fusobacterium nucleatum # 1 425 1 425 425 775 90.0 0 MLLKNCKILKNGKFEKVDILIKDDKIERISKNIDVVDENTIEVKNKFVTAGFIDAHVHWR EPGFSKKETVYTASRAAARGGFTTVMTMPNLNPVPDNLETLNKQLEIIEKDSGIRAIPYG AITKEEYGRELSDMEDIADKVFAFTDDGRGVQSANVMYEAMLMGSKLNKAIVAHCEDNSL IRSGAIHEGKRSAELGIKGIPSICESTQIARDILLAEAADCHYHVCHISAKESVRAVREG KKNNIRVTCEVTPHHLLSCDEDIKEDNGMWKMNPPLRGREDRNALIAGILDGTIDIIATD HAPHTMEEKVRGIEKSSFGIVGSETAFAQLYTKFVKTDIFSLEMLVKLMSENVAKIFNLP YGKLEENSFADIVVIDLEKEMTINPEEFLSKGKNTPYANEKVSGIPVLTISSGKIAYVDK EEINLL >gi|224461376|gb|ACDC01000026.1| GENE 6 4730 - 5806 1678 358 aa, chain + ## HITS:1 COG:FN0421 KEGG:ns NR:ns ## COG: FN0421 COG0505 # Protein_GI_number: 19703763 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Fusobacterium nucleatum # 1 358 1 358 358 716 98.0 0 MYNRQLILEDGTVYKGYAFGADVENVGEVVFNTSMTGYQEILSDPSYNGQIVTLTYPLIG NYGINRDDFESMKPCIKGMIVKEVCTTPSNFRSEKTLDEALKEFGIPGIYGIDTRALTRK LRSKGVVKGCLVSIDKNVDEVVAELKKTVLPTNQIEQVSSKSISPALGRGRRVVLVDLGM KIGIVRELVSRGCDVIVVPYNTTAEEVLRLEPDGVMLTNGPGDPEDAKESIEMIKGIIGK VTIFGICMGHQLVSLACGAKTYKLKFGHRGGNHPVKNILTGRVDITSQNHGYAVDIDSLK DTDLELTHIAINDRSCEGVRHKKYPVFTVQFHPEAAAGPHDTSYLFDEFIKNIDKNMK >gi|224461376|gb|ACDC01000026.1| GENE 7 5821 - 8997 4834 1058 aa, chain 
+ ## HITS:1 COG:FN0422 KEGG:ns NR:ns ## COG: FN0422 COG0458 # Protein_GI_number: 19703764 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Fusobacterium nucleatum # 1 1058 6 1063 1063 2040 97.0 0 MPKRKDIKTILVIGSGPIIIGQAAEFDYAGTQACLSLREEGYEVILVNSNPATIMTDKEI ADKVYIEPLTVEFLSKIIRKEKPDALLPTLGGQVALNLAVSLHESGVLDECGVEILGTKL SSIKQAEDRELFRDLMNELNEPVPDSAIVHTLEEAEKFVKEIGYPVIVRPAFTMGGTGGG ICYNDEDLQEIVPNGLNYSPVHQCLLEKSIAGYKEIEYEVMRDSNDTAIVVCNMENIDPV GIHTGDSIVVAPCLTLTDRENHMLRDVSLKIIRALKIEGGCNVQIALDPNSFKYYIIEVN PRVSRSSALASKATGYPIAKIAAKIAVGMTLDEIINPVTNSSYACFEPAIDYVVTKIPRF PFDKFGDGDRYLGTQMKATGEVMAIGRTLEESLLKAIRSLEYGVHHLGLPNGEEFSLEKI IKRIKLAGDERLFFIGEALRRDVSIEEIHKYTKIDLFFLNKMKNIIDLEHLLKDNKGNIE LLRKVKTFGFSDRVIAHRWEMTEPEITELRHKHNIRPVYKMVDTCAAEFDSNTPYFYSTY EFENESTRSDKEKIVVLGSGPIRIGQGIEFDYATVHAIMAIKKLGYEAIVINNNPETVST DFSISDKLYFEPLTQEDVMEILDLEKPLGVVVQFGGQTAINLADKLVKNGIQILGSSLDS IDTAEDRDRFEKLLIELKIPQPLGKTAFDVETALKNANEIGYPVLVRPSYVLGGRAMEIV YNDEDLKKYMEKAVHINPEHPVLIDRYLIGKEIEVDAISDGENTFIPGIMEHIERAGVHS GDSISIYPPQSLSEKEIETLIDYTKKLASGLKVKGLINIQYVVSKGEIYVLEVNPRASRT VPFLSKVTGVPVANIAMQCILGKKLRDLGFTKDIADTGNFVSVKVPVFSFQKLKNVDTTL GPEMKSTGEVIGTDVNLEKALYKGLTAAGVKIKDYGRVLFTIDDKNKEAALNLAKGFSDV GFSIVATEGTGTYFEGHGLKVKKVGKIDNSDYSVLDAIQNGDVDIVINTTTKGKSSEKDG FKIRRKATEHGVICFTSLDTANALLRVIESMSFRVQSL >gi|224461376|gb|ACDC01000026.1| GENE 8 9011 - 9676 731 221 aa, chain + ## HITS:1 COG:no KEGG:Vpar_1790 NR:ns ## KEGG: Vpar_1790 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 8 212 15 226 240 76 29.0 6e-13 MSIRNLEEKLFDEWQEKLGYNLENKKIFVRDGLVDEKSYFKAPVKILYLLKEVNGGDRDW DLREYIENGGRAATWDNITRWTKGILKYKEELEWSSLENINEDSRKEILRYIVAVNLKKI PGGYTTDCKKIEDFLEKPSNINYLKKQISLYNPDIIICCGTGWWYSNYIEKGMKWEKTKR GILYNKENNKIIISYSHPAARVSSNLLCYGLIDAVKEIYKK >gi|224461376|gb|ACDC01000026.1| GENE 9 9711 - 10523 1334 270 aa, chain + ## HITS:1 COG:FN0423 KEGG:ns NR:ns ## COG: 
FN0423 COG0543 # Protein_GI_number: 19703765 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Fusobacterium nucleatum # 1 270 1 259 259 462 88.0 1e-130 MKMEDCTVEENVQIAKDTYKMKIKGNFVKECRTPGQFVNIRIGDGREYMLRRPISISEID RGENLVTIIYRIVGEGTKFMADIKKGSEIDIMGPLGRGYDVLSLKKGQTALLVGGGIGVP PLYELAKQFNQRGIKTIAILGFNTKDEVFYEEEFKKFGETYVSTVDGSLGTKGFVTDVIK KLQAENNLVFDKYYSCGPVPMLKALISTVGEDGYVSLENRMACGIGACYACVCKKKKKDK DIIAYDEKKVEYTRVCYDGPVYLASDVEIE >gi|224461376|gb|ACDC01000026.1| GENE 10 10535 - 11449 1449 304 aa, chain + ## HITS:1 COG:FN0424 KEGG:ns NR:ns ## COG: FN0424 COG0167 # Protein_GI_number: 19703766 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Fusobacterium nucleatum # 1 304 1 304 304 556 96.0 1e-158 MSERLRVQIPGLDLKNPIMPASGCFAFGIEYAELYDISKLGAIMIKAATKEARFGNPTPR VAETSSGMLNAIGLQNPGVDEIISNQLKKLEAYDVPIIANVAGSDIEDYVYVADKISKAP NVKALELNISCPNVKHGGIQFGTDPDVARNLTEKVKAVSSVPVYVKLSPNVTDIVAMAKA VEAGGADGLTMINTLVGIVLDRKTGKPIIANTTGGLSGPAIKPVAIRMVYQVAQAVNIPI IGMGGVMDEWDVIDFISAGASAVAVGTANFTDPFVCPKIIDNLESALDKLGINHILDLKG RAFK >gi|224461376|gb|ACDC01000026.1| GENE 11 11446 - 12123 882 225 aa, chain + ## HITS:1 COG:no KEGG:FN0425 NR:ns ## KEGG: FN0425 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 220 1 220 221 299 85.0 4e-80 MKLIFKDYLDIFEKYPKDDYLTREERKERYKLLQEYEKRNYQDEISVDEFKDFISSYIDK IDVSSQFIGKFLKVIKKDIDDGGTFAIKFLIGDKDENDYYLKFFSLLYDEFGDKINLINK LLEKEPDYLPAIKQKYTILSNYIDFSIHEMPWELLLDKASSEKEAKTEALADLDDFLELS KKLGKDNKEYIEECRIYYNAWFDFLDNKDKYKSYEEYLEKNNIEY >gi|224461376|gb|ACDC01000026.1| GENE 12 12160 - 12915 783 251 aa, chain + ## HITS:1 COG:no KEGG:Mmol_0394 NR:ns ## KEGG: Mmol_0394 # Name: not_defined # Def: hypothetical protein # Organism: M.mobilis # Pathway: not_defined # 15 247 23 263 267 79 28.0 8e-14 
MIDKKAKKLFLKYMENKSSLNHEEVEYIKEKDLLREDIVITEKEFITNLEKILTKISLEE VSNAFLYSLSTRDLDYRYILASYIYARSWLKYDRGKEYKIPKKITPTFFNWVKYCSGGIW GEIAKPYYYLSEFLNMEKKIPKEEDYQILKKILLFADKFDEEKTATMLRNELAKEKIFPS NKDEVTGLLETLGICGILETKEHRGFWDSFTPMFERDSGDLRQYFSYPFHWWKGKDRVNY ENVKNIFKITV >gi|224461376|gb|ACDC01000026.1| GENE 13 12932 - 13645 1050 237 aa, chain + ## HITS:1 COG:FN0426 KEGG:ns NR:ns ## COG: FN0426 COG0284 # Protein_GI_number: 19703768 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Fusobacterium nucleatum # 1 235 1 235 237 421 97.0 1e-118 MKKEVIIALDFPTLEKTLEFLDKFKEEKLFVKVGMELYLQNGPVVIDEIKKRGHKIFLDL KLHDIPNTVYSAAKGLAKFNIDILTVHAAGGSEMLKGAKRAMTEAGVNTKVIAITQLTST SEEDMRKEQNIQTSIEESVLNYARLAKESGIDGVVSSVLETKKIREQSGEDFIIINPGIR LAEDSKGDQKRVATPIDANRDGASYIVVGRSITGNENPEERYRLIKNMFELGDKYEK >gi|224461376|gb|ACDC01000026.1| GENE 14 13635 - 14510 1203 291 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740193|ref|ZP_04570674.1| ## NR: gi|237740193|ref|ZP_04570674.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 291 1 291 291 439 100.0 1e-121 MRSKIIKILLFLFIGLECLAITNREKIEKDLKRLNITDPTMIAQTILIDEKMGSDRLSRE EKEIYLEDLKKLADENPKNFYLSYPIARYYLDFEDDIEEVKKNRKYFDNYVDNVFQDEEK YLLNISYYRKIGDKEKAKKYYDDFMKKYANKWSGKIKLAGEYETDQEKVKKYVKEAFELL KKDIKNGNKDEVTDEEFFIAQIAYEQMMIQEILEKKEYQKAVDYYLDNIANKDYYTQSVL NMNGRKLAFQLHLIIQINEKHLNKNKENIKKIRDSKVYKELEKMGEIIKKI >gi|224461376|gb|ACDC01000026.1| GENE 15 14521 - 15138 955 205 aa, chain + ## HITS:1 COG:FN0427 KEGG:ns NR:ns ## COG: FN0427 COG0461 # Protein_GI_number: 19703769 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 205 7 211 211 395 98.0 1e-110 MLDREIINALLDIKAVELRVDKENWFTWASGIKSPIYCDNRLTMSYPKIRKQIAEGFVKK IKELYPNVDYIVGTATAGIPHAAWISDIMDLPMLYVRGSAKDHGKTNQIEGKYEKGKKVV VIEDLISTGKSSVLAAQALQEEGLEVLGVIAIFSYNLNKAKEKFDEAKIPFSTLTNYDVL LELAKETGLIGDKENQILVDWRNNL >gi|224461376|gb|ACDC01000026.1| GENE 16 15138 - 15692 576 184 aa, chain + ## HITS:1 COG:FN0428 KEGG:ns NR:ns ## COG: FN0428 COG1387 # Protein_GI_number: 19703770 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Fusobacterium nucleatum # 1 184 2 185 258 251 77.0 4e-67 MFDQHVHSNFSFDSNEDLENYINVSNKNDIVTTEHLDFANPIINYEDSSIEYLKYIEEID NLNKKYSNKFFSGIEIGYTPNSEKRIEDFLKDKNFNLKLLSIHQNGLYDYMCVNKKLISL EAFIQEYFEQMIQALESSIEFNVLAHFEYGIRIVDISVTDFDSLARKFLNKIIELIIKKK LPLK >gi|224461376|gb|ACDC01000026.1| GENE 17 15704 - 15910 244 68 aa, chain + ## HITS:1 COG:FN0428 KEGG:ns NR:ns ## COG: FN0428 COG1387 # Protein_GI_number: 19703770 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Fusobacterium nucleatum # 1 68 191 258 258 98 79.0 3e-21 MYKYKKESLYSYMIEKYLKKGGKLFTLGSDAHNIKDYAYRFDDARKFLLTRNVKEIILFK DKIKMEKI Prediction of potential genes in microbial genomes Time: Thu May 19 
22:52:07 2011 Seq name: gi|224461375|gb|ACDC01000027.1| Fusobacterium sp. 2_1_31 cont1.27, whole genome shotgun sequence Length of sequence - 11769 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 12 - 878 1148 ## gi|237740196|ref|ZP_04570677.1| predicted protein - Prom 927 - 986 5.3 2 2 Op 1 . - CDS 1008 - 1094 81 ## - Prom 1160 - 1219 4.5 3 2 Op 2 . - CDS 1353 - 1796 627 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 1830 - 1889 15.8 - Term 1891 - 1936 9.3 4 3 Tu 1 . - CDS 1970 - 3661 2601 ## COG0405 Gamma-glutamyltransferase - Prom 3701 - 3760 8.8 + Prom 3648 - 3707 10.1 5 4 Op 1 . + CDS 3879 - 4154 234 ## FN1082 hypothetical protein 6 4 Op 2 . + CDS 4151 - 4747 682 ## COG2431 Predicted membrane protein + Prom 4776 - 4835 9.3 7 5 Tu 1 . + CDS 4872 - 5120 408 ## FN1084 hypothetical protein + Term 5141 - 5190 6.4 - Term 5121 - 5184 7.1 8 6 Op 1 . - CDS 5258 - 6790 2425 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 6811 - 6870 2.1 9 6 Op 2 . - CDS 6874 - 7461 465 ## gi|237740203|ref|ZP_04570684.1| conserved hypothetical protein 10 6 Op 3 . - CDS 7470 - 8054 676 ## gi|237740204|ref|ZP_04570685.1| conserved hypothetical protein 11 6 Op 4 1/0.000 - CDS 8079 - 9005 1343 ## COG1186 Protein chain release factor B - Prom 9138 - 9197 2.9 12 6 Op 5 1/0.000 - CDS 9204 - 9572 538 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 13 6 Op 6 . - CDS 9569 - 10327 1076 ## COG0084 Mg-dependent DNase 14 6 Op 7 . - CDS 10391 - 11689 1256 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 11709 - 11768 7.2 Predicted protein(s) >gi|224461375|gb|ACDC01000027.1| GENE 1 12 - 878 1148 288 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740196|ref|ZP_04570677.1| ## NR: gi|237740196|ref|ZP_04570677.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 288 5 292 292 454 99.0 1e-126 MCEQKQLKQIYFRSNLIRNEELCKLYSVDGRLKLDVVYTNPTKKEKEKGTVAGINTIDIE TFNNNSMEVVGKYKNTENATDIKVTIVPNDQIGNEQQVNVGASPENVQRYNVEADPVVVA KDTRNVKLPSNELFSSVDYVVASKDSTLESKIEKPKRHSVRIMPMGTRNTAKKEESIKVG AVQEKVSLQPETPVVEAAKTNNAPKDYRVIPDQPKVEKMQEEPVIEKEVVTSNTNVHYEE VSPRGQRKNSFLLPILFIGIGILLGGFLGLKSSFMFNAPKTVETAQNK >gi|224461375|gb|ACDC01000027.1| GENE 2 1008 - 1094 81 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGNQPPRPKGKVQRLFLNGKYTKAGGSE >gi|224461375|gb|ACDC01000027.1| GENE 3 1353 - 1796 627 147 aa, chain - ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 25 146 76 188 245 91 45.0 5e-19 MSEKEIYSGPNFRYRPDKNNFTEKEGSEFFYYESGQLKAEYNYKNGKLDGFAREYYENGQ LIAEGNYSNGKLEGISKMYYESGQLRSENSYKNNLLDGISKTYYENGQLKEEVNYKDGQV VQENLETELKDFCNIAYEDDKLKLEFD >gi|224461375|gb|ACDC01000027.1| GENE 4 1970 - 3661 2601 563 aa, chain - ## HITS:1 COG:FN0941 KEGG:ns NR:ns ## COG: FN0941 COG0405 # Protein_GI_number: 19704276 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyltransferase # Organism: Fusobacterium nucleatum # 22 563 34 579 579 567 56.0 1e-161 MKKFLMSVVLGLSLVSSLVYADDWKPYDKNGNVLRTDRDATGKNAVVSTARYEASKIGLD ILENGGNAIDAAVGVGFALGVCEPQSSGIGGGGFMIVRFAKTGETKFIDFREIAPKGANP DMWDVDKKGEVISDDKEFGGKSIGVPGTVKGFLYVLEKYGNLDRKAVIQPAIDLANNGYR VSAIMNMDMKNQINNILKYPATAKIYLKNGKPYNVGDLLKNPDLAKTMEKIVKDGEKAFY EGEVAEAIVKATVAAKGKMTLEDLKNYKIKISDPVKGTYRGYEIYTAAPPSSGGAHIIQI LNILENYDMKNIPAGSARYYHLLSESMKMAFADRAKFMGDTDFIKIPLNGVINKEYAKTL KNKIDETKSQEYTEGDPWKYESNETTHYSIIDKEGNIVAVTFTVNGVFASGVVADGTGIL LNNEMDDFDTGHDKANSIQEYKKPLSSMSPTIILKDGKPVASLGGLGAQKIITGITQVII QMVDYDKDIQEAINFPRIHDAYGELTYEGRIDKNVIDQLQKMGHKVKNGGEWLEYPCIQG VTIKDGVLRGGADPRRDGKALGF >gi|224461375|gb|ACDC01000027.1| GENE 5 3879 - 4154 234 91 aa, chain + ## HITS:1 COG:no KEGG:FN1082 NR:ns ## KEGG: FN1082 # Name: not_defined # Def: hypothetical 
protein # Organism: F.nucleatum # Pathway: not_defined # 1 91 1 91 91 106 87.0 3e-22 MLDILIYICIILFAVFLVRKKLFPEKLLKKISLLQSLSLYFLLGAMGYKIGSDDRLISNL HILGIKALIISIFAIVFSIVFVKFFYWGDKK >gi|224461375|gb|ACDC01000027.1| GENE 6 4151 - 4747 682 198 aa, chain + ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 198 1 198 198 249 84.0 3e-66 MIAVSCAVIIGMLLGYFTKSHFEFDIGIVIQFGLYFLLFFIGIDIGKNENIIGDLKKLNK KVLFLPFITILSSLAGGAVASIFLSLTMPETIAVSAGMGWYSFSAIELSKVSVELGGIAF LSNIFRELLAIIFIPIIAKKVGALESVSVAGATAMDSVLPIINRSTSAEISIISFYSGLV ISIVVPILIPILVNIFSL >gi|224461375|gb|ACDC01000027.1| GENE 7 4872 - 5120 408 82 aa, chain + ## HITS:1 COG:no KEGG:FN1084 NR:ns ## KEGG: FN1084 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 82 4 85 85 136 90.0 2e-31 MFESWAENLYDETFSDMFDALVAEYKNGEITVEQLKVNLAEQQQILLNAFTEGEVKSTYC NAMVDAHQYVIALISNGKIVKE >gi|224461375|gb|ACDC01000027.1| GENE 8 5258 - 6790 2425 510 aa, chain - ## HITS:1 COG:FN1340 KEGG:ns NR:ns ## COG: FN1340 COG0008 # Protein_GI_number: 19704675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Fusobacterium nucleatum # 1 508 7 514 516 964 93.0 0 MCVDCKKRVRTRVAPSPTGDPHVGTAYIALFNIAFAHVNNGDFILRIEDTDRNRYTEGSE QMIFDALKWLDLDYAEGPDVGGDYGPYRQSERFDLYGKYAKELVEKGGAYYCFCDQERLE NLRERQKAMGLPPGYDGHCRSLTKEEIEEKLKAGVPYVIRLKMPYEGETVIHDRLRGDVV FENSKIDDQVLLKADGYPTYHLANIVDDHLMGITHVIRAEEWIPSTPKHIQLYKAFGWEA PEFIHMPLLRNDDRSKISKRKNPVSLIWYKEEGYLKEGLVNFLGLMGYSYGDGQEIFSLQ EFKDNFNIDKVTLGGPVFDLVKLGWVNNQHMKMKDLGELTRLTIPFFVQEGYFENENVSE KEFETLKKIVAIEREGAKTLKEIAKNSKFFFIDEFTLPEVKEDMDKKERKSIEKLLNSLQ DEVGLKAIKLLIDKLEKWESNEFTAEEAKDLLHSLLDDLQEGPGKVFMPIRAVLTGEPKG ADLYNVLYVIGKERALKRIKDTVKKYNIQL >gi|224461375|gb|ACDC01000027.1| GENE 9 6874 - 7461 465 195 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|237740203|ref|ZP_04570684.1| ## NR: gi|237740203|ref|ZP_04570684.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 195 1 195 195 347 100.0 2e-94 MKRKELKISPVPEDSKFYFIYAHILLLWPFTPFIFAAIYLFFIGDKTRSIIIEQFMKEKV LLTLVIIAFLVSVLNMFRELFNYLIVEEVCYVDRKTFFYQKFRRAFGTRKLMTNLEIPLS DISEVKEGNKASFLYYFFSPIAHRNSVELITTDGKKYQIMNSVLFGSRNSLKPNSKVTDE RTTKIYNEVKNMISK >gi|224461375|gb|ACDC01000027.1| GENE 10 7470 - 8054 676 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740204|ref|ZP_04570685.1| ## NR: gi|237740204|ref|ZP_04570685.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 194 9 202 202 323 100.0 3e-87 MKEIKARPLDENNKEGKMTAYMFFFQPFFALAMAIFIISFNKEVFKNNLAAQLFMGVFFL LVISSFFTNMPYIANGVFAEEVCYVKNKVFYYTKTRNFLGSKKIIKSFEIPIREITDVKE NEKKLKVNMFSIFKPRNSVEIETRDGIKYAIMNDFRLGSKNDTNTETRQERAKRIFNEVK DLITEAKNENTFNI >gi|224461375|gb|ACDC01000027.1| GENE 11 8079 - 9005 1343 308 aa, chain - ## HITS:1 COG:FN1341 KEGG:ns NR:ns ## COG: FN1341 COG1186 # Protein_GI_number: 19704676 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Fusobacterium nucleatum # 1 308 1 308 308 540 94.0 1e-153 MNFEKNIVSRYEKLATEVEDEEVLIDFVESGESSFENELIEKHKTLKYDIEEFEVNLLLD GEYDMNNAIVTIHSGAGGTEACDWADMLYRMYLRWCNLKNYKVSELDFMEGDSVGVKSVT FLVEGINAYGYLKSEKGVHRLVRISPFDANKKRHTSFASVEVVPEVDDNVEVEINPADIR IDTYRASGAGGQHVNMTDSAVRITHFPTGVVVTCQKERSQLSNRETAMKMLKSKLLEIEL KKKEEEMKKIQGEQTDIGWGNQIRSYVFQPYALVKDHRTNTEIGNVKAVMDGSIDDFINS YLRWIKNN >gi|224461375|gb|ACDC01000027.1| GENE 12 9204 - 9572 538 122 aa, chain - ## HITS:1 COG:FN1342 KEGG:ns NR:ns ## COG: FN1342 COG0736 # Protein_GI_number: 19704677 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Fusobacterium nucleatum # 1 122 1 122 122 180 85.0 7e-46 MIVGIGNDIIEIERVEKAISKEGFIAKVYTQREIENIVKRGNRTETYAGIFSAKEAISKA IGTGVREFALTDLEILNDDLGKPYVIVSDKLNKIIQSKKENYQIEIAISHSKKYATAMAI II >gi|224461375|gb|ACDC01000027.1| GENE 13 9569 - 10327 
1076 252 aa, chain - ## HITS:1 COG:FN1343 KEGG:ns NR:ns ## COG: FN1343 COG0084 # Protein_GI_number: 19704678 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Fusobacterium nucleatum # 1 252 7 258 258 457 92.0 1e-128 MKIIDSHVHLNLHQFDSDREDVFKRIEEKLDFVVNIGFDLESSEKSVEYADKYPFIYAVI GFHPDEIEGYSDEAEKRLEELAKNPKVLAIGEIGLDYHWMTRPKEEQFKIFRRQLELARR VNKPVVIHTREAMEDTINILNEYPDIKGILHCYPGSVESAKRMIDRFYLGIGGVLTFKNA KKLVDVVKDIPIEHLVIETDCPYMAPTPYRGQRNEPIYTEEVAKKIAELKNMSYEDVVRI TNENTRKVFKML >gi|224461375|gb|ACDC01000027.1| GENE 14 10391 - 11689 1256 432 aa, chain - ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 429 23 450 470 292 40.0 8e-79 MYRRIFEYLKEWKNSPYRKPLILQGARQVGKTYSILNFGKSEYENIAYFNFETNPKLKET FEENIEPSYLIPILSRLVDQTIVKEKTLIFFDEIQLCERALTSLKYFQEQAPEYHIIVAG SLLGVAVNRENFSFPVGKVDIKTLYPMDIEEFLLAMGEDKLIEQIKTSFNKNSPLPTILH DLAMEYYRKYLLIGGMPECVAKFKETENYTLIRHTQEMILLSYLNDMSKYNTNNEIKKTR LVYDNITVQLSRENTRFQYKLVKTGGRASEFENAIEWLNLSGIISKIYCVQDIKKPLENY RNIDAFKIYISDVGLLCAKKQIVPEDILYLSDELNDFKGGMTENYVNIHLDINSYTPYFW KNEKGTSEIDFVIARDGKIIPIEVKSSNNTRSKSRDYYIKTYKPEYSIRISSKNFGLENN IKSIPLYAVFCL Prediction of potential genes in microbial genomes Time: Thu May 19 22:52:45 2011 Seq name: gi|224461374|gb|ACDC01000028.1| Fusobacterium sp. 2_1_31 cont1.28, whole genome shotgun sequence Length of sequence - 12053 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 5, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 8 - 838 927 ## gi|237740209|ref|ZP_04570690.1| predicted protein - Prom 926 - 985 20.2 + Prom 931 - 990 13.2 2 2 Tu 1 . + CDS 1021 - 1467 576 ## gi|237740210|ref|ZP_04570691.1| predicted protein + Term 1484 - 1527 0.7 + Prom 1499 - 1558 12.4 3 3 Op 1 . 
+ CDS 1718 - 4258 2897 ## COG1353 Predicted hydrolase of the HD superfamily (permuted catalytic motifs) 4 3 Op 2 . + CDS 4262 - 4630 484 ## TherJR_2021 CRISPR-associated protein, Csm2 family 5 3 Op 3 7/0.000 + CDS 4667 - 5374 918 ## COG1337 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) 6 3 Op 4 . + CDS 5374 - 6378 978 ## COG1567 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) 7 3 Op 5 . + CDS 6375 - 7544 1266 ## Vpar_1802 CRISPR-associated RAMP protein, Csm5 family 8 3 Op 6 . + CDS 7528 - 8250 644 ## COG5551 Uncharacterized conserved protein 9 3 Op 7 . + CDS 8243 - 9634 1512 ## TherJR_2016 CRISPR-associated protein Csm6 10 3 Op 8 13/0.000 + CDS 9648 - 10655 1123 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 11 3 Op 9 . + CDS 10660 - 10989 453 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair + Term 10997 - 11036 4.8 12 4 Tu 1 . - CDS 11199 - 11609 406 ## gi|237740220|ref|ZP_04570701.1| predicted protein - Prom 11801 - 11860 4.8 13 5 Tu 1 . - CDS 11889 - 12053 220 ## Predicted protein(s) >gi|224461374|gb|ACDC01000028.1| GENE 1 8 - 838 927 276 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740209|ref|ZP_04570690.1| ## NR: gi|237740209|ref|ZP_04570690.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 276 1 276 276 407 100.0 1e-112 MNDNNDDFKLDLSNVKIEVPKTDYSPTTEETTTETKEDAKKETRKETKNVNSSKNLNNNS FVKKKSNMKAGTKTQLIILGVTLIFWIFLASGVISFIRSIGRGLKRPKNYYTQVNYDTNQ VAPQAQTTVVEPVTTTEVAPPVENTPVTTVPEPVNNTAVPQSIPETNNNVAPQQNTVNQV QNNQQYSAYDDYDLQVLEQVYDEVINRGNEAYLYNFSSSELAIIRNTLYARRGYRFKKKK YQQYFGSKPWYTPTTDSQNILPKNEERLANIIKKYE >gi|224461374|gb|ACDC01000028.1| GENE 2 1021 - 1467 576 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740210|ref|ZP_04570691.1| ## NR: gi|237740210|ref|ZP_04570691.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 148 1 148 148 263 100.0 3e-69 MKKIVLFVFLSLSLLCFGEWEYIEGNNSVRLVQENTMITLADGGGGVKTPIFHYTLEEIL DSKKPINTYGIEITVDENPAIKANAMVMYRVMRFPVKGVTTDEEVFNELLPQMQNGKMMK IKFLAKKYSDILIEIPLDNFNESYQQMK >gi|224461374|gb|ACDC01000028.1| GENE 3 1718 - 4258 2897 846 aa, chain + ## HITS:1 COG:MT2890 KEGG:ns NR:ns ## COG: MT2890 COG1353 # Protein_GI_number: 15842364 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HD superfamily (permuted catalytic motifs) # Organism: Mycobacterium tuberculosis CDC1551 # 1 846 6 814 817 427 33.0 1e-119 MDEKLICLQLGALLHDIGKIVRRAGLDKNEHSIAGSNYLKENNLLVEKYKEVYDIINYHH AKYLSSAKLKEDSLAYMVYEADNIASGIDRVKYEDKVTKGNEMDNLNSIFNVVKAEKNNI KKTFKLFDFDKNNFNMPTSHDIKLTNSDYKKVLEYIKNNLSSFKENINPEKLAVVLEACC SYFPSSSYVDTPDVSYYDHVKLTAAVAACFYLYDKEKEISNYKKEYFSKADRNTKKFLLV SGEFSGIQNFIYTISSKMAMKSLRGRSFYLELFAEHIIDEVLSELELSRVNLLYSGGSHF YLLLPNTEKSKEVLKQYKEKINNFILEKIGATIYFEMVYTETSAEELGNGLSKDIKDENR IGELFRKTSAKVSKAKLNRYSLDQLKELLDENSSINEVKSYTRECNICKKPEDEKILRRN AKDFDEESEVELCSSCKSYIDLGKDISKLYHSSNENFIVEENCEENQNGLIFPKYSEAYV NVVIKSKEYVLRNIKRIHRYYAVNSNSVGDKLCKNIWVGNYNVMNKDETGGENLIEFKNL VKKSKGIERLAVFRADVDNLGTLFQSGFENKDSKEPYKNVTLSKSVVLSRYLSDFFKRKI NLILEKKDAIKDTNELFKKYCDIICEDNSNPRDIVIVYSGGDDVFAIGTWNDIIEFSIDL RTAFKEFTNDKITLSAGIGFFYENYPIHQMAEKTGNLESLAKANKDSSGKIIKDSVALFG EISPELNHIYTWDIFIDKVLQEKYKFIKSVTILNEETKEKYKDKILIGRSKWYKLMDLIV SRLTRNDNKLDIARFAYVLGRINHTANNKENYDKFKKNLLLWLKNKDDAKQILTAINILI YQERGE >gi|224461374|gb|ACDC01000028.1| GENE 4 4262 - 4630 484 122 aa, chain + ## HITS:1 COG:no KEGG:TherJR_2021 NR:ns ## KEGG: TherJR_2021 # Name: not_defined # Def: CRISPR-associated protein, Csm2 family # Organism: Thermincola_JR # Pathway: not_defined # 8 116 11 121 125 69 41.0 4e-11 MNNINVQEKIEKYQEDKKNTVTTTQLRLLLSNAVIIKNKIQVETRTKKGDEISEKLENEI KYLLVKHIYQCGREPKVKRFDNEFHISEKIKSIGKSAKKFNEFYRYLEEIVAYMKYYESD NK >gi|224461374|gb|ACDC01000028.1| GENE 5 4667 - 5374 918 235 aa, chain + ## HITS:1 COG:TM1809 KEGG:ns NR:ns ## COG: TM1809 COG1337 # Protein_GI_number: 15644553 # 
Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Thermotoga maritima # 6 235 7 246 247 160 40.0 3e-39 MYTLKGKLLIKGTIKLITGLHIGTSGDFSAIGAVDTIVIRDSVTNKPMIPGSSLKGKMRY LLARTKYHSSLELDDIKKEDVCIKRLFGSSEPIMSSRLQFQDILLSDKSIEEFKEFEFDL PHTEIKYENTIDRTTGIANPRQLERVPAGSEFDFQIVYNVEDAEEVKEDMENILLMMDVL EDDYLGGHGTRGYGRIKFKNLSLELKTYTEENKKALAKVEKEIEKIRKELESKVE >gi|224461374|gb|ACDC01000028.1| GENE 6 5374 - 6378 978 334 aa, chain + ## HITS:1 COG:MT2887 KEGG:ns NR:ns ## COG: MT2887 COG1567 # Protein_GI_number: 15842361 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 1 332 1 297 302 120 26.0 5e-27 MSYLLYKLRFPNGIHVGTASGNTLEETMMSVYSDTFYSAIFNEYMKIYNDDELYKISEAG EFLVSDLLPFKEKEDMSTDFYLPKPFISVQRQEMGKNEEEVVDRKKVKATNFIPADKLGE YLTFLKTGKNFPEIDDDFGKKELYTKNKVSLQNEDTKLYNIEVFKFNEKSGLYFIVKLPE DNEWQEIFENILESLSLTGIGGKRNSGFGQFISEDPMFFDGEDFDAIESESDAYINKALY SDEEKYLSLSSYSPKIEEIEKIKKSENYYQLIKRSGFVNSSLYSEQAEKRKQVYMLSSGS VLSFKPEGKILDLNLHGKHSIYRMGKPIVLGVKI >gi|224461374|gb|ACDC01000028.1| GENE 7 6375 - 7544 1266 389 aa, chain + ## HITS:1 COG:no KEGG:Vpar_1802 NR:ns ## KEGG: Vpar_1802 # Name: not_defined # Def: CRISPR-associated RAMP protein, Csm5 family # Organism: V.parvula # Pathway: not_defined # 1 378 1 389 391 106 28.0 1e-21 MSNIIRNKMKLEVLTPLHIGGADYKSKLDKKEYIFDKDKKTLTLIDNEKFIAFLIKKNLF EKYIAYIENNVNAKVMVQNRNINLLNFLKANNIDKDIQDFRKKAPIKLDMNIENMNDIKL MLRDVQGKPYIPGTSIKGALINLLLVDYIIKNREKFSKEKRIILSECKKTNDDRSIRGLK NDIKKIVNQIEKSIIYSDNKSLEKSKKFGISVSDSYSYSNTRTNFYQDIDEKRTNKSGED KSRPMPVAREYIIANSIFDFDITLDIDLLEESKLKIKNIDDLIDSIENAMSYLIDVLEDK NSPRTENLVLGANTGFLQKTIVYALFEDEKERLEVVKKLLHKNQKNVIGNHLNDKFAPRV LNRIKINNKNLLAGLVKIMKVEEKNVGTN >gi|224461374|gb|ACDC01000028.1| GENE 8 7528 - 8250 644 240 aa, chain + ## HITS:1 COG:MT2891 KEGG:ns NR:ns ## COG: MT2891 COG5551 # 
Protein_GI_number: 15842365 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 5 240 62 302 314 80 21.0 3e-15 MLAQINMELESKELNMNMASLFHGYLMENIDPAYAEYFHYNTTNPFTSCIFKDTKEDKFF WRVTTFSQKAYDMLMSYFSKGIPEKIYLKNKDLEINVKSFSIQKKSYEDLFLEATERKRI KLISPTSFKSDGITHIFPNISTLISGVIAKINQHSETAELEDKKIVNELLEKVYIKDYNL RTKIFHLESIKIKGFIGTMDLAIKGEDRTLANILNFLILMSEYTGLGIKTSLGMGGVKVE >gi|224461374|gb|ACDC01000028.1| GENE 9 8243 - 9634 1512 463 aa, chain + ## HITS:1 COG:no KEGG:TherJR_2016 NR:ns ## KEGG: TherJR_2016 # Name: not_defined # Def: CRISPR-associated protein Csm6 # Organism: Thermincola_JR # Pathway: not_defined # 1 458 1 452 460 152 26.0 3e-35 MSKKVLLTFAGNTDPTRGQHDGPIIHICRYYKPEKIYLILTKEMEERDEAPYNIYEKAIK ENLKDYNPEIIRIKTGIKDAHHFDVYFDTIYQTFEKIKKEEKDAEVYLNMTSGTSQITTN LLMYYIDSVDLKLIPIQVETFTGQSNKTEADNKTVDKYYAVEEEAICNLDNEENSKKRIV VPDLKKYSRILTKNQIEKLLEQYKYEAISELLKRDIFDKNLELNTLVNFAIERTNLKGLE CNKKLNSLNNKDYNRLYHFTKDKNVTRIPSWYQIVDCFALANIKQKAEDISSYTLMLEPI IVKLYLSILKDIMKKNLDELFRKDSHGYKIELKRLEEDLKEMIKEDLKREYLKDDVYISA QTLASTIKYYLKKEKKLASIMDVDYFISLAETLAKMKNVRNTLAHELKSISREDFNRESE TTVEQINSKILDFFNKFYTPLGYKKEMVEVYDNINKEIVKLLK >gi|224461374|gb|ACDC01000028.1| GENE 10 9648 - 10655 1123 335 aa, chain + ## HITS:1 COG:alr1468_2 KEGG:ns NR:ns ## COG: alr1468_2 COG1518 # Protein_GI_number: 17228961 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Nostoc sp. 
PCC 7120 # 1 333 2 332 335 215 33.0 8e-56 MSNLYIYEQGIVLRYKENRLLITYTNDDSKSIPIENIDNIVIFGGIQLSTTCMHNLLAKG IHVTFLSKNGSYFGRLESTSNINIDRQREQFRKSDDKEFCLEIAKKFIKGKATNQRTILI RANKELKNDVLSSTITTMFGIIKDINNAKTIEELMGVEGYLAKLYFNALNQIIDKKYSFK TRTKRPPKDPFNAVISFGYTLLHYEIFTILVTKGLNPYAAFLHSDRHKHPALCSDLMEEW RAILVDSLAIALLNNNKITYEDFDFDEKSGGVFLNKKACEKFVEQFEKRLRQEVSYIKEV PYKMSFRRIIEYQVMLLIKAFEANDADIYNPVLIR >gi|224461374|gb|ACDC01000028.1| GENE 11 10660 - 10989 453 109 aa, chain + ## HITS:1 COG:MT2883 KEGG:ns NR:ns ## COG: MT2883 COG1343 # Protein_GI_number: 15842357 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Mycobacterium tuberculosis CDC1551 # 22 107 28 113 113 69 41.0 2e-12 MENWDFLDEDFEKEIFEDNFTVIVIYDIISNKRRTQLSKLLSAFGFRIQRSAFECLLTRE KYKLLVERINRYAKPEDLIRIYRLNQNVITEIYGENSEAENENKAYYFF >gi|224461374|gb|ACDC01000028.1| GENE 12 11199 - 11609 406 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740220|ref|ZP_04570701.1| ## NR: gi|237740220|ref|ZP_04570701.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 136 1 136 136 212 100.0 5e-54 MNRLLQGFFHKISVSLRSIIHSYMKETENFNIFHGFKISVSLRSIIHSYYYKDVLYNFLT SLFPSPYGVSFILIQMLYFWLVDNGYINFRLLTEYHSFLLKVSPSPNSSLILFPSPYGVS FILILWSKKSFVAIIF >gi|224461374|gb|ACDC01000028.1| GENE 13 11889 - 12053 220 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no EKLDNETLKYIKFPSPYGVSFILIFDEYLRKVLKRKEISVSLRSIIHSYSHPKI Prediction of potential genes in microbial genomes Time: Thu May 19 22:53:33 2011 Seq name: gi|224461373|gb|ACDC01000029.1| Fusobacterium sp. 2_1_31 cont1.29, whole genome shotgun sequence Length of sequence - 17898 bp Number of predicted genes - 23, with homology - 14 Number of transcription units - 8, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 62 66 ## - Prom 204 - 263 7.3 - Term 234 - 271 0.1 2 2 Op 1 . - CDS 428 - 523 83 ## 3 2 Op 2 . 
- CDS 538 - 891 582 ## gi|237740221|ref|ZP_04570702.1| predicted protein
4 2 Op 3 . - CDS 845 - 973 248 ##
- Prom 1007 - 1066 9.7
5 3 Op 1 . - CDS 1457 - 1537 76 ##
6 3 Op 2 . - CDS 1602 - 1793 212 ##
- Prom 2002 - 2061 4.1
- Term 1802 - 1852 -0.4
7 4 Tu 1 . - CDS 2075 - 2362 541 ##
- Prom 2389 - 2448 3.0
- Term 2683 - 2738 5.0
8 5 Tu 1 . - CDS 2754 - 2924 63 ##
- Prom 3150 - 3209 6.9
- Term 3097 - 3134 -0.8
9 6 Op 1 . - CDS 3225 - 3446 285 ##
10 6 Op 2 . - CDS 3382 - 3552 166 ##
- Prom 3602 - 3661 3.0
+ Prom 3761 - 3820 4.0
11 7 Tu 1 . + CDS 3888 - 4250 352 ## Pecwa_1101 hypothetical protein
+ Term 4251 - 4301 10.0
+ Prom 4295 - 4354 11.4
12 8 Op 1 32/0.000 + CDS 4374 - 5066 913 ## COG0020 Undecaprenyl pyrophosphate synthase
13 8 Op 2 15/0.000 + CDS 5059 - 5940 1084 ## COG0575 CDP-diglyceride synthetase
14 8 Op 3 1/0.000 + CDS 5958 - 7121 1503 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase
15 8 Op 4 1/0.000 + CDS 7109 - 7783 876 ## COG0125 Thymidylate kinase
+ Term 7801 - 7834 0.8
16 8 Op 5 1/0.000 + CDS 7839 - 8858 1310 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1
+ Prom 8860 - 8919 3.6
17 8 Op 6 1/0.000 + CDS 8949 - 10334 2019 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
18 8 Op 7 1/0.000 + CDS 10360 - 12051 2430 ## COG0760 Parvulin-like peptidyl-prolyl isomerase
+ Term 12054 - 12100 8.2
19 8 Op 8 31/0.000 + CDS 12110 - 13921 1837 ## COG0358 DNA primase (bacterial type)
20 8 Op 9 1/0.000 + CDS 13952 - 15340 2176 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32)
21 8 Op 10 1/0.000 + CDS 15356 - 16180 928 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32)
22 8 Op 11 . + CDS 16177 - 16953 836 ## COG0327 Uncharacterized conserved protein
23 8 Op 12 .
+ CDS 16973 - 17542 661 ## FN1315 hypothetical protein Predicted protein(s) >gi|224461373|gb|ACDC01000029.1| GENE 1 2 - 62 66 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKFKSGSIFPSPYGVSFIL >gi|224461373|gb|ACDC01000029.1| GENE 2 428 - 523 83 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKNNGGLFPSPYGVSFILMKYTKIYNRLSI >gi|224461373|gb|ACDC01000029.1| GENE 3 538 - 891 582 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740221|ref|ZP_04570702.1| ## NR: gi|237740221|ref|ZP_04570702.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 117 3 119 119 203 100.0 2e-51 MRFRPLAEYHSFLLSELKEMVKNNGGLFPSPYRVSFILIGNNIRIINSHIFKEKFPSPYG VLFILMNSNFLFQDLRFGKGFPSPYGVSFILIKNIKFYGGKNEYKVSVSLRSIIHSY >gi|224461373|gb|ACDC01000029.1| GENE 4 845 - 973 248 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATIPSPYEVSFILIVIAGTSCSRDNVNAFPSPCGVSFILIK >gi|224461373|gb|ACDC01000029.1| GENE 5 1457 - 1537 76 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPKTDGLSTEEVIKCFRLLTEYHSFL >gi|224461373|gb|ACDC01000029.1| GENE 6 1602 - 1793 212 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEYYSFLSVNWWIKLNGLNLVMFPSPYGVSFILIYNRGIREKEKRVIRISVSLRSIIHYY QWH >gi|224461373|gb|ACDC01000029.1| GENE 7 2075 - 2362 541 95 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMKILLKFPSPYGVSFILMQLMNYLAGINELQSFRLLTEYHSFLSKELKKYIAEFMGFRL LTEYHSFLFYLVQLNNNLNKNILGFRLLTEYHSFL >gi|224461373|gb|ACDC01000029.1| GENE 8 2754 - 2924 63 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNIKSNGFRLLTEYYSFLLEKIFGVNYKEYQFPSPYEVIFTFILYSKKNCIIINF >gi|224461373|gb|ACDC01000029.1| GENE 9 3225 - 3446 285 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGFRLLTEYHSFLFVFTGTINEIINFFVSVSLQSIIHSYAFNINYNINKQHWEFPSPYGV SFILMKKRELMIT >gi|224461373|gb|ACDC01000029.1| GENE 10 3382 - 3552 166 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVLIFTNKLKFPSPYGVSFILMHFKGGFKMIKFKSGFPSPYRVSFIPICIYWYYQ >gi|224461373|gb|ACDC01000029.1| GENE 11 3888 - 4250 352 120 aa, chain + ## HITS:1 COG:no KEGG:Pecwa_1101 NR:ns ## KEGG: Pecwa_1101 # Name: not_defined # Def: hypothetical 
protein # Organism: P.wasabiae # Pathway: not_defined # 9 108 11 114 124 65 38.0 5e-10 MKNKGENILTFKDEDNGDQITMYYDIWLTVIREILMKYADKTYAESLKILNKHYYKKPAN YFECIYLSHELEYHWAMIGAYGEGYWLNDGCSELLPTDYYEWYKKFLDENNFNEPFEFYE >gi|224461373|gb|ACDC01000029.1| GENE 12 4374 - 5066 913 230 aa, chain + ## HITS:1 COG:FN1326 KEGG:ns NR:ns ## COG: FN1326 COG0020 # Protein_GI_number: 19704661 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 1 230 1 230 230 408 93.0 1e-114 MEKNIPQHIAIIMDGNGRWAKKRGLARSFGHMEGAKSLRRALEYFTEIGVKYLTVYAFST ENWSRPKDEVSTLMKLFLKYIKSERKNMMKNKIRFFVSGRKNNIPEKLLNEIEKLKEETK NNDKITLNIAFNYGSRAEIIDAVNNIIKDGKENITEEDFSKYLYNDFPDPDLLIRTSGEM RISNFLLWQIAYSELYITDTLWPDFDEKEIDKAIESYNQRDRRFGGVKNV >gi|224461373|gb|ACDC01000029.1| GENE 13 5059 - 5940 1084 293 aa, chain + ## HITS:1 COG:FN1325 KEGG:ns NR:ns ## COG: FN1325 COG0575 # Protein_GI_number: 19704660 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Fusobacterium nucleatum # 1 293 1 294 294 378 77.0 1e-105 MFKWNRVLVALIGVPLLFFVYMGEAFFHMNLQGLPMLIFTNLVVAIGTYEFYKMVKISGK EVYDKFGILVSIIIPNLIYLANRSKYLDQSMVGLVIIIATMSLLIYRVFRNQIKGTLEKV SFTILGIVYVSVFFSQIINLYFIGAIFPFILQVLVWISDTAAGIVGVAIGRKFFKNGFTE ISPKKSVEGALGSIIFTAIAFVGIVSYFEKIKDVSLEEGVVAFLIGAFISVVAQIGDLIE SLFKRECGVKDSGTILMGHGGILDRFDSMILVLPFVTVVIYFFHLYISYQYGI >gi|224461373|gb|ACDC01000029.1| GENE 14 5958 - 7121 1503 387 aa, chain + ## HITS:1 COG:FN1324 KEGG:ns NR:ns ## COG: FN1324 COG0743 # Protein_GI_number: 19704659 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Fusobacterium nucleatum # 1 387 4 390 390 634 84.0 0 MKKILILGSTGSIGTSALELIRNNREEYQVVAISGNRNIELLKKQIEEFRPLAIYVGAEE EAIKIKNEYPFIEDIYFGENGLAELAKNSNYDIILTAVSGAIGIDATVEAIKREKRIALA NKETMVSAGTYINRLLKEYPKAEIIPVDSEHSALFQSLQGFKKENVKKLIITASGGTFRG KTLEFLENVTVEEALKHPNWSMGKKITIDSSTLVNKGLEVIEAHELFNVPYDDIEVVVHP 
QSIIHSMVEYVDGSIIAQMGVPSMKTPILYAFSYPEKEFNASIDFLDLIKTKTLTFEEAD RKVFKGIDLAYRAGRTGETMPTVFNAANEVAVELFMKKKIKFLDIYRIIEEAMDNHKLIS LDTDETLSIIKEVDKETRKKVREQWEK >gi|224461373|gb|ACDC01000029.1| GENE 15 7109 - 7783 876 224 aa, chain + ## HITS:1 COG:FN1323 KEGG:ns NR:ns ## COG: FN1323 COG0125 # Protein_GI_number: 19704658 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Fusobacterium nucleatum # 1 223 1 223 225 365 91.0 1e-101 MGKIIVIEGTDSSGKETQTKLLYERVKKIYDKTIKISFPNYDSPACEPVKMYLAGKFGTD ATKVNPYPVSTMYAIDRYASFKQDWEKYYLDDYLIITDRYVTSNMIHQASKIKDAEAKDE YLNWLVDLEYKKNEIPEPDIVIFLKMPIDKAKELMENRKNKIDGSEKKDIHEVNEDYLKK SYDNATAISKKYSWCEIECVEDNKIKTIERINDEIFSKIEELIK >gi|224461373|gb|ACDC01000029.1| GENE 16 7839 - 8858 1310 339 aa, chain + ## HITS:1 COG:FN1322 KEGG:ns NR:ns ## COG: FN1322 COG0750 # Protein_GI_number: 19704657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Fusobacterium nucleatum # 1 339 1 339 339 518 82.0 1e-147 MTFLIAVAMLGLIIFVHELGHFLTAKFFKMPVSEFSIGMGPQVFSLDTKETTYSFRAIPI GGYVNIEGMEVGSQVENGFNSKPAYQRFIVLFAGVFMNFLTAFLIIFSIAQVSGRMEYEE KAVIGALVKGGANEQILKVDDKILELDGKKINLWADIPEVTKEAIDKEEIPALIERDGKE QKLVLKLTKDEENKRVVLGISPKSKKTNLSFTESLVFAKNSFVSILKDTVGGLFTLFSGK ANLKEISGPVGILKVVGEVSKFGWTSIASLAVILSINIGVLNLLPIPALDGGRIIFVLLE IFRIRINKKWEENLHKFGMVMLLFFILVISVNDVWKLFN >gi|224461373|gb|ACDC01000029.1| GENE 17 8949 - 10334 2019 461 aa, chain + ## HITS:1 COG:FN1321 KEGG:ns NR:ns ## COG: FN1321 COG2204 # Protein_GI_number: 19704656 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Fusobacterium nucleatum # 1 461 9 469 469 766 94.0 0 MKNAILAISEKKEILKQIRKELAEKYEVITFNNLLDAIDMVRESDFDLVLLDNALEGVTV GEAKKKLASIGKEFVTIALVDEVNVETTKELENSGIFAYLLKPIKVEDLDAIILPSLSGL ELIKENKRLEEKLAVLEEDTDIIGQSAKIKDVRNLIEKIADNDLPVLIVGETGTGKDIIA 
KEIHKKSERNKGRYAQISCALYPGELIERELFGYERGAFMGANASKKGLLEEIDGGTIYI EDISKMDIKIQSRFLKAIEYGEFKRVGGTKVRKTNVRFLVGTDIDLKQETEKGKFRKDLY HRLTALTIEVPPLRERKEDIPVLANYFLNKIVRILHKETPVISGEAMKFLMEYYYPGNIM ELKNLIERMALLSKDKILDVDQLPLEIKTKSDIVENKTVVGVGPLKEILEQEIYSLEEVE RVVIAIALQKTRWNKQETSKILGIGRTTLYEKIRKYGLDTK >gi|224461373|gb|ACDC01000029.1| GENE 18 10360 - 12051 2430 563 aa, chain + ## HITS:1 COG:FN1320 KEGG:ns NR:ns ## COG: FN1320 COG0760 # Protein_GI_number: 19704655 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Fusobacterium nucleatum # 208 563 1 356 356 484 78.0 1e-136 MSIRKFRKQMKPFIIVLTVVFILSLAYGGYESYRTSRANKKAQEAMLLNKDYIQKIDIER AKQEVSRAYAETVDKDIVDIIAFNDVIDKKLTLDLAKSLKVKVPSSEVNAQYEELESSMG DKEQFRRMLQVQGLTKDSLKNKIEENLLMQKTREEFAKNINPTDEEINAYMSLYSIPSDK KEDAINLYKMQKGEEAFKLALIKARKEMQIKDLAPEYENLVEKVSYEEDGFKVTNLDLAK IMATFMINQKATKEQAEELAKNMIAKQIKVAKMAKEKGVKVNEELDLMSQLQDYAIGLSE KVREEIKPTDAELESFFNTNKSRYNIPETADAKLIFITVKSTKEDDAVAKAEAEKLLAEL TPENFSEKGKTLGNNQDIIYQDLGTFGTKAMVKEFEEALKDVPSNTVINKVIKTKFGYHV VYVKKNDNNQQWSAEHILIVPYPSDKTVAEKLEKLEKLKADIEAGTLALNDKIDEDVIQS FDAKGITPDGIIPDFVYSPEIAKAVFETPLNKVGIINPNKATIIVFQKTKEVKAEEANFT KLKEEVRKDYINKQVGEYMSKLF >gi|224461373|gb|ACDC01000029.1| GENE 19 12110 - 13921 1837 603 aa, chain + ## HITS:1 COG:FN1319 KEGG:ns NR:ns ## COG: FN1319 COG0358 # Protein_GI_number: 19704654 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Fusobacterium nucleatum # 1 603 1 603 603 792 75.0 0 MYFRNEDIEKLLDSLRIEEVVGEFVDLKKSGSSYKGLCPFHADTNPSFSVKPEKKICKCF VCGSGGNAINFYSKIKNIPYMEAVKELAQKYRVNIKEYNAKNTDIDNEKFYQIMEDSHNF FMDKMFAQESRTALNYLSNRGLDTDLIKEHRLGYAPAKWSELYDFLKEKNYSDEDLLTLG LIKKNEEGRIYDTFRNRIIFPIYSISNRIIAFGGRSLEKDDTIPKYINSPDTPIFKKGKN IYGIERAVNIRNKNYSILMEGYMDVLSANIFDFDTSIAPLGTALTVEQAQLIKRYSSNIL LCFDMDKAGKSATERASFILKSQGFNIRVLQFDDAKDPDEYLKKNGREAFLEVVQKSLEI FDFLYELYSSEYDLTNIIAKQNFIERFKEFFAYLTTDLEKEMYLKNLSEKIDISIDILRK 
TLVEENKKKFIVKDYIDEIEEKETEKKEFKKANNLELSIVEMLLKKPEYYEFFKDEKFES DIANKTLKFFEEKIKENFNFESNNLMREFENYIRNDNESHSEYINSNIARIILNYVIDTE VKIEEKNFLKLFKDYFRVKVKLRDKTNDDFQKIVYFSKFKDKIEKSRSVEEFIEIYNSFK YLF >gi|224461373|gb|ACDC01000029.1| GENE 20 13952 - 15340 2176 462 aa, chain + ## HITS:1 COG:FN1318 KEGG:ns NR:ns ## COG: FN1318 COG0568 # Protein_GI_number: 19704653 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Fusobacterium nucleatum # 154 462 23 331 331 470 96.0 1e-132 MKEIIRTKKGREFVNEVREKKKTTYEEINKCFSKDYTEEQINELVKIFLEEGIEVLSETE TKAKSKTKAKTKSKAKAKEEAKAELDEEKDLNEKEKVEIEETEEKELDEEEDEEKDDDEE IEEKELDDEYVEEETDDSLDDEEKDDDDVDTDTFIGFEDEFNPDYIEDISEEELSNEKLL NLGNSAKVDEPIKMYLREIGQVPLLTHDEEIEYAKRAYEGDEEASQKLIESNLRLVVSIA KKHTNRGLKLLDLIQEGNIGLMKAVEKFEYTKGYKFSTYATWWIRQAITRAIADQGRTIR IPVHMIETINKIKKESRIYLQETGKDASPEILAERLGMEVEKIKAIQEMNQEPISLETPV GSEEDSELGDFVEDQKTTSPYEATNRAILREELDGVLKTLSPREEKVLRYRYGLDDSSPK TLEEVGKIFNVTRERIRQIEVKALRKLRHPSRKKKLEDFKVD >gi|224461373|gb|ACDC01000029.1| GENE 21 15356 - 16180 928 274 aa, chain + ## HITS:1 COG:FN1317 KEGG:ns NR:ns ## COG: FN1317 COG0568 # Protein_GI_number: 19704652 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Fusobacterium nucleatum # 1 274 1 270 270 317 77.0 2e-86 MKLFSLERYLLKNLDMTEEDFKKLVLDISEPLELELPEDRKLTDEEIDYEYIDLLVTETL ENLKDDVCTCETDCGVADCCGTRVEKNLKKVYQIALYMLRDGILYEDLTQEGVIGLIRAH ELFEDDKDFKLYKDYYIARAMFNYIESYANYRKTAFKEYAEYEIHKENHPKISLKDKSKS EELKKLEKENKEKHIEEMKQLEKRAEYQFDYLNLKYRLGEREIEAISLYFGLDGHKRKNF SEIQNIMKIDSDSLDKIVKDALFKLSVVDEKVEL >gi|224461373|gb|ACDC01000029.1| GENE 22 16177 - 16953 836 258 aa, chain + ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 257 1 257 258 370 82.0 1e-102 MITRDIINILEKKFPKINAEEWDNVGLLVGDYDKEVKKIQFSIDASLEVIENAIKEKVDM 
IITHHPFIFKAIKSINEQDILSKKIRMLIRNDINIYSIHTNLDSSVSGLNDYVLEKLGYT DYKFLDYDEEKNCGIGRIFKLDEEKDLKKFIEELKLKLQISNLRVISNDLNKKIKKVALI NGSAMSYWRKAKKEKIDLFITGDVGYHDALDARESGLAVIDFGHYESEHFFHEVLIKELK DTNLEFLVYNPEPVFKFC >gi|224461373|gb|ACDC01000029.1| GENE 23 16973 - 17542 661 189 aa, chain + ## HITS:1 COG:no KEGG:FN1315 NR:ns ## KEGG: FN1315 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 188 1 177 177 247 84.0 2e-64 MKKIIITFLFLISSVFSFATINDNLNILKDEEKVEINEKVDEIKKEKDLTVFVNTLSMDV GFAVSDPERALILNIKKGDKETYKVELSFSKDIDVDDYQDDINTTLTDAAPLLERKEYGK YILTVLDGASSVLQEVNIETLNQMTMTKEQENGTSTPIMIAAFVIIILFIVYKMYAAYKD KSNQEEDDD Prediction of potential genes in microbial genomes Time: Thu May 19 22:54:35 2011 Seq name: gi|224461372|gb|ACDC01000030.1| Fusobacterium sp. 2_1_31 cont1.30, whole genome shotgun sequence Length of sequence - 50728 bp Number of predicted genes - 54, with homology - 51 Number of transcription units - 20, operones - 13 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 12 - 482 462 ## COG1309 Transcriptional regulator 2 1 Op 2 . - CDS 547 - 1056 563 ## COG0716 Flavodoxins - Prom 1098 - 1157 6.3 3 2 Op 1 8/0.000 - CDS 1280 - 2962 188 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 2 . - CDS 2952 - 4697 1789 ## COG4988 ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components - Prom 4787 - 4846 9.4 + Prom 4459 - 4518 8.7 5 3 Op 1 1/0.000 + CDS 4760 - 5386 510 ## COG0703 Shikimate kinase 6 3 Op 2 . + CDS 5400 - 7202 535 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 7 3 Op 3 . + CDS 7235 - 8101 763 ## FN0824 DeoR family transcriptional regulator 8 4 Tu 1 . - CDS 8076 - 8207 182 ## - Prom 8259 - 8318 11.8 + Prom 8208 - 8267 5.3 9 5 Op 1 . 
+ CDS 8293 - 9588 1347 ## FN0825 putative cytoplasmic protein
10 5 Op 2 24/0.000 + CDS 9601 - 10740 1406 ## COG0845 Membrane-fusion protein
11 5 Op 3 36/0.000 + CDS 10755 - 11417 321 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
12 5 Op 4 . + CDS 11414 - 12640 353 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9
+ Term 12654 - 12690 4.1
- Term 12633 - 12688 5.2
13 6 Op 1 . - CDS 12696 - 12914 435 ## FN1302 hypothetical protein
14 6 Op 2 2/0.000 - CDS 12988 - 15375 2633 ## COG0210 Superfamily I DNA and RNA helicases
15 6 Op 3 . - CDS 15397 - 16479 1084 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
- Prom 16514 - 16573 5.9
16 7 Op 1 . - CDS 16593 - 16976 490 ## COG3654 Prophage maintenance system killer protein
17 7 Op 2 . - CDS 16973 - 17104 211 ##
- Prom 17279 - 17338 18.1
+ Prom 17172 - 17231 11.5
18 8 Op 1 . + CDS 17349 - 17549 327 ## gi|262068197|ref|ZP_06027809.1| putative flagellar protein
+ Term 17560 - 17609 5.4
19 8 Op 2 . + CDS 17618 - 17755 163 ##
20 8 Op 3 . + CDS 17812 - 18039 337 ## FN1099 hypothetical protein
+ Term 18086 - 18141 -0.1
+ Prom 18189 - 18248 11.5
21 9 Op 1 1/0.000 + CDS 18269 - 19588 1566 ## COG1373 Predicted ATPase (AAA+ superfamily)
22 9 Op 2 . + CDS 19581 - 20120 786 ## COG1859 RNA:NAD 2'-phosphotransferase
23 9 Op 3 1/0.000 + CDS 20131 - 20748 648 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
24 9 Op 4 1/0.000 + CDS 20735 - 21199 686 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase
25 9 Op 5 12/0.000 + CDS 21212 - 21673 645 ## COG0802 Predicted ATPase or kinase
26 9 Op 6 . + CDS 21654 - 22298 173 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase
+ Prom 22356 - 22415 8.4
27 10 Tu 1 . + CDS 22446 - 24467 1873 ## COG1479 Uncharacterized conserved protein
+ Term 24546 - 24591 2.6
28 11 Tu 1 . - CDS 24475 - 25695 1216 ## COG1373 Predicted ATPase (AAA+ superfamily)
- Prom 25721 - 25780 9.9
29 12 Op 1 . - CDS 25832 - 27421 1613 ## GYMC10_2788 hypothetical protein
30 12 Op 2 . - CDS 27414 - 28721 1208 ## Lebu_0718 hypothetical protein
- Prom 28759 - 28818 8.5
+ Prom 28688 - 28747 9.1
31 13 Op 1 . + CDS 28824 - 29492 287 ## PROTEIN SUPPORTED gi|241889384|ref|ZP_04776685.1| 30S ribosomal protein S8
32 13 Op 2 . + CDS 29564 - 29791 418 ## gi|237740264|ref|ZP_04570745.1| predicted protein
33 13 Op 3 . + CDS 29809 - 31671 1674 ## COG1533 DNA repair photolyase
+ Prom 31687 - 31746 5.1
34 14 Op 1 . + CDS 31837 - 32553 766 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity
35 14 Op 2 1/0.000 + CDS 32559 - 33338 996 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I
36 14 Op 3 1/0.000 + CDS 33353 - 33940 814 ## COG1573 Uracil-DNA glycosylase
37 14 Op 4 1/0.000 + CDS 33933 - 34475 870 ## COG0212 5-formyltetrahydrofolate cyclo-ligase
+ Term 34605 - 34635 -0.6
+ Prom 34592 - 34651 6.5
38 14 Op 5 1/0.000 + CDS 34704 - 35675 1373 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation
39 14 Op 6 . + CDS 35682 - 37265 1682 ## COG2509 Uncharacterized FAD-dependent dehydrogenases
40 14 Op 7 . + CDS 37336 - 37578 276 ## gi|237740272|ref|ZP_04570753.1| predicted protein
+ Term 37591 - 37634 5.4
+ Prom 37613 - 37672 10.1
41 15 Tu 1 . + CDS 37692 - 38021 378 ## FN0737 hypothetical protein
+ Prom 38035 - 38094 9.3
42 16 Tu 1 . + CDS 38125 - 38274 214 ## gi|237740274|ref|ZP_04570755.1| predicted protein
+ Term 38299 - 38345 3.0
- Term 38285 - 38332 8.2
43 17 Tu 1 . - CDS 38338 - 40926 3652 ## COG0474 Cation transport ATPase
- Prom 41103 - 41162 9.4
+ Prom 41083 - 41142 12.3
44 18 Op 1 7/0.000 + CDS 41228 - 42004 1139 ## COG1024 Enoyl-CoA hydratase/carnithine racemase
45 18 Op 2 . + CDS 42020 - 42859 1407 ## COG1250 3-hydroxyacyl-CoA dehydrogenase
+ Term 42860 - 42902 4.7
- Term 42846 - 42890 5.1
46 19 Op 1 1/0.000 - CDS 42893 - 43897 1364 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
47 19 Op 2 14/0.000 - CDS 43881 - 44327 826 ## COG1799 Uncharacterized protein conserved in bacteria
48 19 Op 3 2/0.000 - CDS 44349 - 45020 873 ## COG0325 Predicted enzyme with a TIM-barrel fold
49 19 Op 4 1/0.000 - CDS 45042 - 46145 1136 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases
50 19 Op 5 . - CDS 46129 - 47817 2754 ## COG1109 Phosphomannomutase
51 19 Op 6 . - CDS 47846 - 48568 750 ## FN0558 TraT complement resistance protein precursor
52 19 Op 7 . - CDS 48598 - 49314 1194 ## FN0558 TraT complement resistance protein precursor
53 19 Op 8 . - CDS 49391 - 49855 641 ## COG3467 Predicted flavin-nucleotide-binding protein
- Prom 49929 - 49988 9.3
+ Prom 49893 - 49952 15.1
54 20 Tu 1 .
+ CDS 49983 - 50711 1142 ## FN0557 hypothetical protein Predicted protein(s) >gi|224461372|gb|ACDC01000030.1| GENE 1 12 - 482 462 156 aa, chain - ## HITS:1 COG:FN1823 KEGG:ns NR:ns ## COG: FN1823 COG1309 # Protein_GI_number: 19705128 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 154 1 154 156 194 79.0 5e-50 MARKCAYTKEMILEAAIKLFKKEGSDAVTAKNIAKELGCSVAPIYSVYMSLDDLKRDLTF EIEKNILEEKEIHPLLSKMLAKLEVSENDEEFSKKLKEFKLKIHNKENQINIFSQFSDFV SLIYKSRRTKFSKIKILELIAKHKRYITEFRNSKIN >gi|224461372|gb|ACDC01000030.1| GENE 2 547 - 1056 563 169 aa, chain - ## HITS:1 COG:FN1822 KEGG:ns NR:ns ## COG: FN1822 COG0716 # Protein_GI_number: 19705127 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 168 1 168 169 283 91.0 1e-76 MKTLIIYSSETGNTKMVCEKAFEYINGEKVIIPIKEKDNINLDEFDNIIVGTWIDKANAN AEARKFVNTLANKNLFFIGTLAASLTSEHAKKCFNNLRKLCSKKNNFVDGVLARGRVSED LQEKFTKFPLNIIHKFVPNMKEIILEADAHPNETDFLLIKDFIDKNFNN >gi|224461372|gb|ACDC01000030.1| GENE 3 1280 - 2962 188 560 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 352 546 2 197 245 77 31 2e-13 MKNRSTFNIVFNLLKLLDSLWKFMTIAVSTGVIGFIFSFCITLFGAYAFLSVIPSTKDSL KYVFLGGYSTQTYFYAMMFCGFFRAILHYLEQFANHYIAFHILANIRVKLFKIMRKLAPA KMENKNQGNLISMITSDIELLEVFYAHTISPILIATITSIFLFLYFFQLNYLYALYMLFA QFIVGIVVPYIAHKRSAKSGVEVRAKLGKLNDEFLDKLKGIREIIQYSQGKKVLKKIDEI TSSLGENQKDLRNKASEVQMMVDSAIIILSIAQLLLSISLVSKGLVSIEASILAGVLQVG SFAPYINLAALGNILAQTFASGERVLNLMDEKPAVIDNIALLSEDISERDDISIDNISYS YENTDNKILKDFSLKIKKGQLTGIMGASGCGKSTLLKLLMRFWDVDSGKIILDRKDVKSV PLKELYQKFNYMTQSTSLFIGNIRDNLLVAKADATDEEIYIALKKASFYDYVMSLPDKLD SIVEEGGKNFSGGERQRIGLARAFLANREFFLLDEPTSNLDILNEAIILKSLADEAKDKT VILVSHRESTLSICNQIFKI >gi|224461372|gb|ACDC01000030.1| GENE 4 2952 - 4697 1789 581 aa, chain - ## HITS:1 COG:FN1819 KEGG:ns NR:ns ## COG: FN1819 COG4988 # Protein_GI_number: 19705124 # Func_class: C Energy production and conversion; 
O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components # Organism: Fusobacterium nucleatum # 1 581 1 581 581 959 90.0 0 MIDKRLYNFSGNIKKYISITTFLSCVKLIANIFFYFIFAFLLVSLINRDFSFSYKYIIIS ILIIVLIRQFSTIKISHMLGNLVVDVKRNLRKLIFEKTLKLGLAYSQLFKTQELIHLSVD NVEQLEVYFGGFLTQFFYCVVSSFILFFSIAYFNLKIAFILLIFSLAIPMSLYIILNKVK KIQKKYFAKYMNVGTLFLDSLQGLTTLKIYGTDEKREEEIAKMSEEFRVETMRVLKMQLL SIAVINWIIYAGTILAIITSIKLFIDGSLGLFPMLFIFMLAPEFFIPMRTLTSLFHVAMT GVSAAENIISFIDSPERNSVGEKEFKNEREFKVSNLSFTYPDGTQSLKDINMTFKKGSLT AVVGHSGCGKSTLVSVLAGELKSNENEIFVDDIDIHNIKLEDKVKNILKITHDSHIFSGT VRDNLTMANEKLTDETMVEVLKTVKLWDIFSKAKGLDTVLESQGKNLSGGQAQRVALARA LLYDASVYIFDEATSNIDIESEEIILNIIHFLSKEKTVIYISHRLPAIKNADCIYVMDKG RVVENGKHNDLYAKKELYYNMYKHQEELESYLTKRGETNEK >gi|224461372|gb|ACDC01000030.1| GENE 5 4760 - 5386 510 208 aa, chain + ## HITS:1 COG:FN0822 KEGG:ns NR:ns ## COG: FN0822 COG0703 # Protein_GI_number: 19704157 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Fusobacterium nucleatum # 37 208 1 172 172 260 86.0 1e-69 MSIEIFEILFFKYKSNVNSFFYDIISKYKTNIQRDNMKDNIALIGFMGSGKTTIGKLLAK TMEMKFVDIDKIIEATEKKSINDIFKEKGQIYFRDLEREIILQESSRNNCVIATGGGSIL DNENVKSLQETSFIVFLDASIECLYLRLKDNTTRPILNDAEDKKQLIEELLEKRKFLYQI SANFIIHIDENTSIYETVDKIKESYINS >gi|224461372|gb|ACDC01000030.1| GENE 6 5400 - 7202 535 600 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 183 596 24 424 425 210 34 1e-53 MVNGNTSGLKEHILNSLDELYNSKIEKGKIINQEIIDYIAEVSNKINREINIAMDRSGNV IDISIGDSSTVSLPVVPVYDRRLSGVRIVHTHPGGNPHLSSVDISALIKLKLDCIVSIGV SDEGVTGYEVAVCSILNDELTYDRTLVKNLDDFDYLDAIKEVEEALRKRNITEDDKEYAL LIGIDDEIYLDELEELASACDVEVVGKFFQKRSKPDPLFLIGSGKIQELALFRQIRKANL LIFDEELSGLQLKMIEEVTGCKVIDRTTLILEIFARRARTREAKLQVELAQLKYRSNRLI GFGITMSRLGGGVGTKGPGEKKLEIDRRVIKKNIAYLNNELENIKKVRNTQREKREESGM PRVSLVGYTNVGKSTLRNVLVDMFPNDKTLKKEEVLSKDMLFATLDTTTRTIELKDKRVV SLTDTVGFIQKLPHDLVESFKSTLEEVIFSDLIIHVADASAKDVIEQIDAVENVLTELNC MDKTKILLLNKIDNATKDNTYAMIEQKIDEIKAKYTNYQILIISSKNRFNIDELMTLIKD NLAVKTYDCKVLVPYSKMDVSAKLHRNVIVKSEDFVDEGVVMEVILNEKQYNQFKEYIME >gi|224461372|gb|ACDC01000030.1| GENE 7 7235 - 8101 763 288 aa, chain + ## HITS:1 COG:no KEGG:FN0824 NR:ns ## KEGG: FN0824 # Name: not_defined # Def: DeoR family transcriptional regulator # Organism: F.nucleatum # Pathway: not_defined # 1 283 1 283 283 418 88.0 1e-115 MSKKIKVTLPQNIYEIIKNDISDFNMTSNYFMNYIFLNLNDKYKNFKGNPAIAEQSKEKS SIQFNLNKESSLIYYDVLRDNNAQNESEFMRSLLIRYATNPKNKRELFIFKESVERINLA IKDKKNVYITFNDDRKVKVSPYHIGSSDLEIANYIFCYDFSEEKYKNYKLSYLKQVYTTS EVAKWEDNDYIKDVIKNFDPFLSKGQIIKVRLSENGKKLLKTIKINRPKLISEDGDLFEF EASDEQIKRYFSYFFDEATVIEPIELKEWFIEKYENALKKLKKINKIN >gi|224461372|gb|ACDC01000030.1| GENE 8 8076 - 8207 182 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRVKSGVFERSEFPVLQRILNFYPLRNLASNEPFINLSYLFF >gi|224461372|gb|ACDC01000030.1| GENE 9 8293 - 9588 1347 431 aa, chain + ## HITS:1 COG:no KEGG:FN0825 NR:ns ## KEGG: FN0825 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 22 430 1 409 410 513 79.0 1e-144 MKKILMFFLLMASISSFSQESLTIDEALSRVGNNRESYEFKSFENTKEATDIRIKDNKLG DFNGVTISSSYNITENNFEDRDRKYDKTFQNKASYGPFFVNYNFVERDRSYVSYGVEKNL KDVFYSKYKSNIKVYDYQQELNKISYDKTIENKKINLVNLYNDILNTKNELEYRKKAYEH YKVDLDKFKKSYELGASPKINLESAELEAEDSKLQIDILETKLKSLYEIGKTDYNIDFEN YKLVDFIDNNESIEKLLASYMEKDIAELKLNLSVAEERKKYSNYDRHMPDLYLAYERVDR NLRGDRYYRDQDIFSIRFSKKLFSTDSDYKLSELEVENLKNDLNEKIRVINAEKIKLKAE 
YYELSKLLSIASKKSQLAYKKYLIKEKEYELSRASYLDVIDEYNKYLSLEIENKRAKNTL NSFIYKLKIKG >gi|224461372|gb|ACDC01000030.1| GENE 10 9601 - 10740 1406 379 aa, chain + ## HITS:1 COG:FN0826 KEGG:ns NR:ns ## COG: FN0826 COG0845 # Protein_GI_number: 19704161 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 43 379 1 338 338 476 82.0 1e-134 MKNIFKGKLKFVILLLVLIFILFYYVTHRGKKEEVYVDEYSYMKVEQTDEIGTINLNGYV KANNPIGIFVDKKLKVKEVFIKNGDFVEKGQILMTFDDDETNKLNRSIEKERINLQKIQR DLNTTRELYKLGGASRDEVKNLEDSARISQLNIDEYVEVLSKTATEVRSPVDGVVSNLKA QENYLVDTDSSLLEIIDANDLRIIVEIPEYNSQTVKIGQSIRVRQDISDDDKVYDGEITK ISRLSTTSSMTGENVLEADVKTNETIPNLVPGFKIKAVLQLKSDVKNIIIPKIALQNEEG KYFVYTLDEKNTIRKKIITIKNIVGDNIIVLSGLNPGEEIVLTPDNRLRDGLVLAGGDNH NSSEEVTSVPADKAKVIVN >gi|224461372|gb|ACDC01000030.1| GENE 11 10755 - 11417 321 220 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 2 220 1 218 245 128 34 7e-29 MIITVDKVNKTYKNGSLELQVLKNISFKVNKGEFLAIMGSSGSGKSTMMNILACLDSQYE GTYILDGIDISKLTENQLSEIRNKKIGFIFQSFNLLPRLSALENVELPLVYSSVPKAERH KRAAELLEMVGLKDRMHHKPNELSGGQRQRVAIARALVNDPSIILADEPTGNLDSKSEEE IIEILQELNRTGKTIVIVTHEPNIGDIAQRKIVFKDGEII >gi|224461372|gb|ACDC01000030.1| GENE 12 11414 - 12640 353 408 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 7 408 9 413 413 140 28 1e-32 MSFFDILKGSLATLKANKLRTLLTMLGIIIGISSVIAMWAIGNGGRDSILGDLKKVGYGK FTVTIDYKNENFKYKDYFTMENIDMLKNSHKFKAVSINVEDAFRMLKDNEPYYSYGTVTT EDYEKISPVTITSGRNFLPFEYTSNERVIILDSMSARKLFADEKLSLGQTVEITKDRKKA GHSYKIVGVYKSPYETLDSLFGDGDNYPILFRMPYKAYSIAFNDDSDVFSSLIIEAKNAD EITDSMREAKNILEFNKNVKDLYLTQTVSSDIESFDKILSTLSLFVTMAASISLLVGGIG VMNIMLVTVVERTKEIGIRKALGAKNRDILKQFLFESIILTVFGGLVGMGVGVLFGFLAG AVMGIKPIFSLTSIIVSLSISVIVGVIFGVSPARRAAKLNPIDALRTE >gi|224461372|gb|ACDC01000030.1| GENE 13 12696 - 12914 435 72 aa, chain - ## HITS:1 COG:no KEGG:FN1302 NR:ns ## KEGG: FN1302 # Name: 
not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 72 1 72 72 112 81.0 5e-24 MKALLEKLAWKKCHIATVNHKFKDATILEVADGFALIETDEKEKALINLDFIRIVVEAKE GALPPVFVPHDL >gi|224461372|gb|ACDC01000030.1| GENE 14 12988 - 15375 2633 795 aa, chain - ## HITS:1 COG:ECs5264 KEGG:ns NR:ns ## COG: ECs5264 COG0210 # Protein_GI_number: 15834518 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 4 483 238 693 704 105 25.0 4e-22 MSEIILSNEQISVARYPENGVIRVNGGPGSGKTLVAVKRAIFLAKDYKYAEKDDKILFLF YNKSLKKTIKNLFEAEKDYENVKDKIEFESIDSFFVREYLNKNNHEFFEYLKKANNDKNF VRTFMKERKERIEKILIAKKEELKKFSPEDAEFVLSEIDWLRNCCYMKKEEYLEITRYGR GNQKKLQKEEKEEIYRILNLYRGGKKDTLRYTDFYDIAFLFLFYFEKEENRNKVKKYNHI IVDEAQDLSKIHFRFINLICEISKTSGNTISLFMDKNQSIYPEQAWIFGNRTLKQVGINI NKSFTLNRAYRNVKEIFDVAKKLNPEIEVGDIPDTKNQNLTLTFSVDRGIKPFFIKYSDS EDRLNNLCKDIKTLVNEFNYKYDDISIISLKDQSIKDIKSSLQKEHISYVSKNEIGEGIN ITTYHSAKGTENKVIFIPNIDELNADELTDLYPDKTREEILDELKKLLYIGMTRATEVLI VSSLKIEPSEYQKRLLEVFDFENDFINIDTDSNDFYSVFNKEINKNENIEKNHSKFFEIK EVVEEEKTSDTAIQKEIENLKPDKNEKSMDNKEIENEIETKFPSAHKSTKIGLLKAEKLF LRADKNDDYFGSESFEYLKSLECEIRTYYATIQEKVLNESYSKSEKLYTILNKLKDYSEF KTPVHDCYKHKVFNERNDLAHDYSDYTYNDLLETRELVKEKLLPKFIKAFKKFKTNKGID EFIIIGKLETSYNKIDIKKKKYYTYYIKDEINNTEFPALSENRYEQNINYKLTVNKLMLK GNEYYRILEANNFSD >gi|224461372|gb|ACDC01000030.1| GENE 15 15397 - 16479 1084 360 aa, chain - ## HITS:1 COG:FN1094 KEGG:ns NR:ns ## COG: FN1094 COG0463 # Protein_GI_number: 19704429 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Fusobacterium nucleatum # 1 360 1 360 360 626 89.0 1e-179 MKKTLVLIPALNPPKQLIDYVKSLLDNNLKDILLVDDGSKEEFKEIFETIEKFPDANIKV FRHAKNFGKGRALKNAFNYFLTLPNLDEYNGVVTADSDGQHRVEDVIKLAKEVEENPDTL ILGCRDFDLEQVPPKSKFGNKITNGAFKLFYGKNISDTQTGLRGFPTAIIKDFLDIAGER FEYETKMLIFCFQKEIPIKEVVIETIYFDDNSETHFNPIIDSIKIYKVTLSPFLKYIASA VSSFILDILSFKWILALLLAFRNIEGAAVITIATVVARIISSTFNFYLNKKFVFKYEKNT 
KKSLLKYYSLCAVQMLISAFFVTLVWKHTKYPETSIKIVVDSILFLLSYFIQQRWVFKRK >gi|224461372|gb|ACDC01000030.1| GENE 16 16593 - 16976 490 127 aa, chain - ## HITS:1 COG:alr9029 KEGG:ns NR:ns ## COG: alr9029 COG3654 # Protein_GI_number: 17227494 # Func_class: R General function prediction only # Function: Prophage maintenance system killer protein # Organism: Nostoc sp. PCC 7120 # 4 127 5 128 128 86 39.0 1e-17 MIILSKEQILNLHSQLINKFGGIDGVRDDGLLESALNNAYGVYFGLEKYPTVEEKAARLA YSLTKNHPFLDGNKRIGVLIMLVFLEINKIELTCNDDELTDLGLKIAASLKTYEEILEFV NIHKRNI >gi|224461372|gb|ACDC01000030.1| GENE 17 16973 - 17104 211 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIEETNEKILDTNENKLAKDDELKKIALKIMEKFEKTFEVLAK >gi|224461372|gb|ACDC01000030.1| GENE 18 17349 - 17549 327 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262068197|ref|ZP_06027809.1| ## NR: gi|262068197|ref|ZP_06027809.1| putative flagellar protein [Fusobacterium periodonticum ATCC 33693] # 1 66 1 66 66 81 95.0 2e-14 MSYLLTSMEEVRKENEKKQRILELKEAIKKAEAEWNTSDVEKLKKELKGLTNESFLTKIF KSDGRY >gi|224461372|gb|ACDC01000030.1| GENE 19 17618 - 17755 163 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGVLEKTEKYNDNNEIVEIKKINRHIRVVNVGFGDYLPRTEFDD >gi|224461372|gb|ACDC01000030.1| GENE 20 17812 - 18039 337 75 aa, chain + ## HITS:1 COG:no KEGG:FN1099 NR:ns ## KEGG: FN1099 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 75 1 75 75 97 92.0 9e-20 MSVISIRFNDEEEEIVKNYVKSKGTNLSQYIKNIIFEKIEEEYDLKLVQEYLKAKSEETL NLIPFEEAIKEWDIE >gi|224461372|gb|ACDC01000030.1| GENE 21 18269 - 19588 1566 439 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 439 23 470 470 773 91.0 0 MERFILNDLIKWKNSKYRKPLILKGVRQVGKTWILKEFGNKYYENIAYFNFDENPEYKQF FQTTKDINRILQNLMLISGYKIIAEKTLIIFDEIQDAPEVINSLKYFYENAPEYHIACAG SLLGITLAKPSSFPVGKVDFLNIYPMNFSEFLLANGDENLKLFLDSLNSIENIPDAFFNP 
LYEKLKMYYVTGGMPEAVYMWTQERDIELVRKTLNNILEAYERDFAKHPNIYEFPKISMI WKSIPSQLSKENKKFIYKVVKEGARAREYEDALQWLVNANLVTKVFKWENDLSAFKIYLV DVGLLARLSQLSPSTFGEGNRLFTEFKGALTENYILQGLSPQFEVSPRYWSENNYEVDFI IQNENNIIPIEVKAETNIKSRSLQKFKEKFKDDIKLRVRFSFENLKLDDDLLNIPLFMVD YTEKIINIAMNKLKEKNNG >gi|224461372|gb|ACDC01000030.1| GENE 22 19581 - 20120 786 179 aa, chain + ## HITS:1 COG:FN1102 KEGG:ns NR:ns ## COG: FN1102 COG1859 # Protein_GI_number: 19704437 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Fusobacterium nucleatum # 1 179 1 179 179 299 92.0 2e-81 MDNDVKLGKFISLILRHKPETIDLKLDENGWADTKELIEKISKSGREIDFTTLERIVNEN NKKRYSFNEDKTKIRAVQGHSIEVNLELKEVVPPAVLYHGTAFKNVESIKIEGIKKMERQ HVHLSADPETAKNVATRHSSKYVILEIDTEAMLKENYKFYLSENKVWLTDFVPSKFIKF >gi|224461372|gb|ACDC01000030.1| GENE 23 20131 - 20748 648 205 aa, chain + ## HITS:1 COG:FN0931 KEGG:ns NR:ns ## COG: FN0931 COG0494 # Protein_GI_number: 19704266 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 205 1 205 205 310 80.0 1e-84 MNKNRILLRERYFESAVMLCIANIDGKDCFILEKRAKNIRQAGEISFPGGKKDKTDKTFK ETAIRETMEELQIKRNKISNVSKFGLLVAPLGVLIECYICKLNIENLDEINYNRDEVEKL LAVPIEFFMETEAIKGEVEICNKAKFDIKKYNFPKRYENDWRIPNRFVYIYMFEEEPIWG MTAEIICDFIKILKNEGKVGFYEYK >gi|224461372|gb|ACDC01000030.1| GENE 24 20735 - 21199 686 154 aa, chain + ## HITS:1 COG:FN0930 KEGG:ns NR:ns ## COG: FN0930 COG2870 # Protein_GI_number: 19704265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 1 154 7 160 160 263 91.0 9e-71 MNINRKLATELVEEAKKNGKKVVFTNGCFDILHAGHVTYLTEAKRQGDILIVGVNSDASV KRLKGETRPINSEYDRAFVLDALKSVDYTVIFEEDTPEELIACLKPSIHVKGGDYKKEDL PETKIVESYGGEVIILNFVEGKSTTNIIEKINKK >gi|224461372|gb|ACDC01000030.1| GENE 25 21212 - 21673 645 153 aa, chain + ## 
HITS:1 COG:FN0929 KEGG:ns NR:ns ## COG: FN0929 COG0802 # Protein_GI_number: 19704264 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Fusobacterium nucleatum # 1 153 1 153 153 233 91.0 9e-62 MEKVLTFSQIDELAKKLANYVEENTVIALIGDLGTGKTTFTKTFAKEFGVKENLKSPTFN YVLEYLSGRLPLYHFDVYRLCSSEEIYEIGYEDYINNGGVALIEWANIISEDLPKEYIRI EFKYAEKEDERIVDISYVGNKEKEEKFNVAFGN >gi|224461372|gb|ACDC01000030.1| GENE 26 21654 - 22298 173 214 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 43 214 1 183 380 71 28 1e-11 MLLLGIDTSTKICTCSIYDSEAGVIAETSLSVKKNHSNIVMPIVDNLFKISDLNIKDIDK IAVAIGPGSFTGVRIALGIAKGLAMALNKGLVAVNELDILEAMASDNENEIIPLIDARKE RVYYKYQGKCQDDYLINLLSSLDKNKKYVFVGDGAINYADILKENLGENAIIVPRYNSFP RASVLCELSLNREDANIYTVEPEYISKSRAEKNF >gi|224461372|gb|ACDC01000030.1| GENE 27 22446 - 24467 1873 673 aa, chain + ## HITS:1 COG:Z5943m_1 KEGG:ns NR:ns ## COG: Z5943m_1 COG1479 # Protein_GI_number: 15804980 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 EDL933 # 1 556 1 556 592 317 35.0 4e-86 MKASERKITKLFSESDTVFSIPVYQRDYNWQEKQCQRLFKDILQTGKNEKVSSYFLGSIV YIHDGIYGVGEKEFHVIDGQQRMTTLTLLFLAIYFKLKGTILAKDADKIYNQYVVNPYSE KEIKLKLLPPEENLYILNKISHNKFNELEAFQDRNMLKNYLFFEKELENLSFDDMKHLSN GIEKLIYIDIALEKGKDDPQKIFESLNSTGLDLSQGDLIRNYILMDLERGEQNRIYKEIW IPIENNCKVSDGSEITSYVSDFIRDYLTLKTEKISSKPKVFETFKAYYEKENDEKLEDMK KYSEAYSYIIKPSLEKDRDIQRELDYLKSLDKTVINTFLIGILKDYKDNILEKDELVNML ILLQSYLWRRYITEKPTNALNKIFQGMYGKISRAGNYYENLVDILMAEDFPTDEELESAL KLKNVYKDKEKLNYVFKKLENYNHNELIDFDNEKITIEHIFPQKPNKAWKENYSDNELEQ MISFKDTISNLTLTGSNSNLSNKAFHEKRDDEVHGYRNSKLYMNKYLGRLEEWNLLSMEA RFESLYDDIIKIWKRPEDKATNDMEKITFVLKGKGTSGKGRLLSNEKFEILKGTSIVLEV KSDNPSTFRRNKNLIEDLIRKNLIEKLEDRYVFKENYIATSPSAAAILVLGRSANGWTEW KTYEGKLLSDYRK >gi|224461372|gb|ACDC01000030.1| GENE 28 24475 - 25695 1216 406 aa, chain - ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # 
Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 402 1 401 402 372 53.0 1e-103 MIKRDLYLEEIKKYMNKPLIKVITGMRRSGKSMILKLIHEELKKQGVPEKNIIYINFESL IFMDIKDFETLYKYIIDKTTNISGRFYILLDEIQEVKAWEKAINSFLVDLDADIYITGSN ANLLSSELATYIAGRYIEIKIYPLSFQEYIDFATENNKEKPLSLDEYFNQYLNFGGLPGI HILNYSKEEIYQYLADVYNSILLRDVIARNNIRDIELLERVVLYIMDNIGNTFSAKNISD FLKNQGRKLSIETIYNYLKALENAFIISKVQRYDIKGKNILETQEKYYLSDLGFRHAKLG YQSNDISGYLENIVYLELLRRKYKANIGKQGNKEIDFIASFRDEKLYLQVTYLLASPETI EREFSVLNSIKDNYPKMVLSMDNLPESNIEGIKRKKIIDFLLEKRG >gi|224461372|gb|ACDC01000030.1| GENE 29 25832 - 27421 1613 529 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_2788 NR:ns ## KEGG: GYMC10_2788 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 39 520 37 521 551 226 34.0 1e-57 MDKILKEKIIDTTFEGLDKIIESEHKNHPNEKSYSCCRIQEGYNDYLKIVFRKGKINYFR SNFEWSSTPDERINCEKLKEIQRDDFVKEIVPEIKVKFEDLFFKYEDSFLFRYKFLLVLE FEGEKGLAKDRTYKEEFYFENKKRKEELKNRMEEYIKEVFLEEKKSIKDERECVIFAGNL LDFNLMGYSEKYIIELIEKILQVMKSVKNRRFDTTLKNDIKYYLDKWTREIFLKLDPEKV TEEQIDLYIYSALLKIKYRTYSFDVKNACDDLENAMNNYHSLKAKQYLEKGSGTLADELI HYKDKDLECKANDILSIVDIKIKNEVASSYEKALDFIIALLSNGFPHSYSIKFSSKSEKI FLDIKGLAKSSTHRFFRRILDFPELYDKLEVYAKTAMKEFEWYQDVEEGEKSLLPGSYAV FALGLYDEKYFPLIKEYYSKLDDEHQLAHQNFITALIDRYGVSEKSLPIFLDGFLSGQFD KVFKNLTSLLEDEENKKLLIKELENYGKHERQTILYSIWGNKWQQKIKI >gi|224461372|gb|ACDC01000030.1| GENE 30 27414 - 28721 1208 435 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0718 NR:ns ## KEGG: Lebu_0718 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 431 1 439 439 370 49.0 1e-101 MFFFKKKENLFVDILDLKVDCSKIINIKEAKLVYINGKGKLTVETGKSEPPNWEAPAKIK LNGIPLVQAQIPDCPTCSSLLATGYGIENTNCKELLEIQEKINSDYVNLETSIENMKALL TLLKSGFYLIADAICYPTDGENFFWNVPNKLKEFSSAGPAYLGEGTYVFSQPVYLYPTQT TNSYNKDRVEYYIEKFKNSADNKPRAIVYNFKDFINFIVDGHHKACASTILKEPVSCILI IPAKIYEDYYKNTRLNFSGILMDYKNIPKEYTRYIKKERFSPSHEKIEIKDGIVNNREWE 
KEYINSAKHYPSIIDYANIIDIMQDNEIEVNDIFIENCLENFDEDSQLKMKKLLYLLEFT NIKKAQEIALKYARKTLREEEIDKELKQLVYRILLSAKNNEEVEKIFIDYLVYYSENRED PILKIINLYWEETNG >gi|224461372|gb|ACDC01000030.1| GENE 31 28824 - 29492 287 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|241889384|ref|ZP_04776685.1| 30S ribosomal protein S8 [Gemella haemolysans ATCC 10379] # 8 222 6 216 216 115 33 6e-25 MREIKDFIKDKKIDLKRLEKFGFKLKDNSYYYDIFLLNNQFKMTVKINLDNSIFTEIIDV ETNEPYVLHLLEMKRSGYSEKVYKAYSEVLDKIKKECFENERFKANYTKEIIEYVKNKYG DELEFLWEKSPKTAVVRRKTSKKWYALILTLSKRKLNLDSDETAEIINLHNNPEEIKKLI DNKKYFPAYHMNKKHWCTICLDGTVELEKIYRLIDISYELAK >gi|224461372|gb|ACDC01000030.1| GENE 32 29564 - 29791 418 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740264|ref|ZP_04570745.1| ## NR: gi|237740264|ref|ZP_04570745.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 75 1 75 75 125 100.0 7e-28 MATLTINTDEKTAENFYAFCEELGLDMSTAITLYMKACLREQKIPFELKVAKKEVVQNVR TTPATIEELLENYDI >gi|224461372|gb|ACDC01000030.1| GENE 33 29809 - 31671 1674 620 aa, chain + ## HITS:1 COG:FN0898_2 KEGG:ns NR:ns ## COG: FN0898_2 COG1533 # Protein_GI_number: 19704233 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Fusobacterium nucleatum # 292 620 2 330 330 520 90.0 1e-147 MLYIVTALYIEAKPLISLFNLKKDNSYTKFQVFSNEDVKLIISGTGKVKSATALTYLVSK EDIKKNDYIVNIGFVASNKNSQLGDIVYISKIQNAYSDFDFYPEMIYKHNFLEGSLTTFD SIVEKKLENIEYIDMEAYGFFQTASIFFKKDKIIVLKIVSDILKDKLEDRVLVDFKDENL FSESYNNIYKFLVNFKTVNDDSDFTITEQELIKKVLENLRLSDTMTYELFNILRYLKIKY GNINILKKYENIEVTSKVQAKKLFEEIKNISLQKNSLEKTISPEINKKKISLNNRFSHIY VEKKILDNKNTLEILSKFRDAKIIEIDNYKEVFSSNNQDFHLQKLGQNLILASNKPNMIY EGAVVCEDFENDNFYYTSSIINCVYDCEYCYLQGVYSSGNIVIFVDIEKVFEEVEELYNK LKSLYLCVSYDTDLLAIENICSFSEKWYHFIKDKKDLKIELRTKSGNIDKFLNLDVLDNF IIAFTLSPEEIALKNEKYTASFKNRVKAIKELQNKGWKVRICIDPLIYTDDFEKNYSEMI EYLFSEIDKNKVIDVSIGVFRTSKEYLKKMRNQNKKSEILYYPFECIDGVYTYSDKLKSY MIDFIKEKFLKYIDNEKIYI >gi|224461372|gb|ACDC01000030.1| GENE 34 31837 - 32553 766 238 aa, chain + ## HITS:1 COG:AGl1487 KEGG:ns NR:ns ## COG: AGl1487 
COG4221 # Protein_GI_number: 15890864 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 226 6 235 249 95 30.0 6e-20 MKIALVTGASSGIGYEIAKTLLNMDYEVYGVARNFIKNETKIFEEYENFFPIVCDLAKLD ELDKTLHSLKKIKFDLIVNSAGLAYFGLHEEINIAKIKNMISVNLQAPLVISQYFLRTLK ENKGIIINISSVTANKESPLACVYSATKAGLSQFSKSLFEEVRKNDVKVITIYPDMTKTN FYQNNTYFECDDDEKAYIKSEDIAKTIEFILNQSNNIVFTDVTIKPQRHKIKKIKRKE >gi|224461372|gb|ACDC01000030.1| GENE 35 32559 - 33338 996 259 aa, chain + ## HITS:1 COG:FN0900 KEGG:ns NR:ns ## COG: FN0900 COG1235 # Protein_GI_number: 19704235 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Fusobacterium nucleatum # 1 259 1 259 260 454 91.0 1e-128 MNISILGSGSSGNSTFVEIEDYKLLVDTGFSCKKTEEKLEMIGKKLSDISAILITHEHSD HINGAGVIARKYDIPIYITSESYKAGVSKLGEIDKSLIKFIDGSFILDDKVKVSPFDVMH DAERTIGFKLESQLNKKIAISTDIGYITNIVREYFKDVDAMVIESNYDFNTLMNCSYPWN LKERVKSRNGHLSNNECAKFIKEMYTDKLKKVFLAHVSKDSNHLSIIKETLEDEFTGMLR KPNCEITSQDKVTKLFTIE >gi|224461372|gb|ACDC01000030.1| GENE 36 33353 - 33940 814 195 aa, chain + ## HITS:1 COG:FN0901 KEGG:ns NR:ns ## COG: FN0901 COG1573 # Protein_GI_number: 19704236 # Func_class: L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Fusobacterium nucleatum # 1 195 1 195 195 275 73.0 3e-74 MEEISELWEELKFELGSVGIETLPKDKQEIYIGMGNRNADVLFIGNDPKLYLSEDYKVEA QSSGEFLIRLFDLAGIVPEAYYITTLTKREVKIKNFDEEEKKILLDLLNMQIALISPKII VFLGKEVAQMIENREVNLEKERGKFKKWKGDIDCYLTYDVETVIKARNESGKKAAVATNF WLDIKNIKERLDHNE >gi|224461372|gb|ACDC01000030.1| GENE 37 33933 - 34475 870 180 aa, chain + ## HITS:1 COG:FN0902 KEGG:ns NR:ns ## COG: FN0902 COG0212 # Protein_GI_number: 19704237 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Fusobacterium nucleatum # 1 180 2 181 181 250 82.0 1e-66 
MNKKDARNLIKERRMNLSMEYIETASDKIFEKLLENEDFKNAKVIMSYMDFKNEVKTDKI NEYIKKAGKTLVLPKVITKEKMIAIEDKNKYIVSPFGNSEPDGEEYIGEIDVIITPGVAF DRDKNRVGFGRGYYDRFFAIHKNAKKIAIAFEKQIIEEGIETTEYDMKVDILITEDNIIK >gi|224461372|gb|ACDC01000030.1| GENE 38 34704 - 35675 1373 323 aa, chain + ## HITS:1 COG:FN0903_1 KEGG:ns NR:ns ## COG: FN0903_1 COG0794 # Protein_GI_number: 19704238 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Fusobacterium nucleatum # 1 206 1 206 206 372 94.0 1e-103 MLDQEIIEIAKNIYDTEIKSLEKRMNKLSENFVKVVRKIFDCKGKVVVTGIGKTGIIGKK ISATFASTGTTSIFMNSTEGLHGDLGIINPEDIVLAISNSGESDEILAIMPAIKNIGAFV IAMTGNINSRLAKASDLYINTHVDEEGCPLNLAPMSSTTNALVMGDAIAGCLMKLRDFTP QNFAMYHPGGSLGRKLLTKVGNLMKTGEALALCKADTSMEDIVILMSEKKLGVVCVMNDD NSLLVGIITEGDIRRALSHKEKFFSLKASDIMTTNYTKVDKEEMATQALSIMEDRPHQIN VLPVFDDNNFVGVIRIHDLLKVR >gi|224461372|gb|ACDC01000030.1| GENE 39 35682 - 37265 1682 527 aa, chain + ## HITS:1 COG:FN0904 KEGG:ns NR:ns ## COG: FN0904 COG2509 # Protein_GI_number: 19704239 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 1 527 1 527 527 926 92.0 0 MKVNISNIIVSINKNQEKEIYKELEKNGISRDNIENLKYLKKSIDSRKKNDIKFIYTLEI SLKKNINLEKYSKLSLAKDESYDKRVALYPQREVAVVGTGPAGLFSALRLAELGYIPIVF ERGEEVDKRNITTNNFIKTFILNPNSNIQFGEGGAGTYSDGKLNTRIKSEYIEKVFKEFI ECGAQEEIFWNYKPHIGTDVLRIVVKNLREKIKSLGGKFYFSSLVEDIEVKNNEIKSLKI LEVDSGKRYNYDIDKVIFAIGHSSRDTYKMLYSKGIAMENKPFAIGVRIEHLRKDIDKMQ YGEAVSNPLLEAATYNMAFNNKKETRGTFSFCMCPGGEIVNASSEIGASLVNGMSYSTRN GKFSNSAIVVGVSERDYGSQIFSGMYLQEELEKKNYEIVGNYGAIYQNVIDFMKNQKTSF EIESSYKMKLFSYDINNFFPDYIIRNLHSAFENWSKNKLFISNKVNLIGPETRTSAPVKI LRDLKGESISVKGIFPIGEGAGYAGGIMSAAVDGIKIVDLAFSKKIV >gi|224461372|gb|ACDC01000030.1| GENE 40 37336 - 37578 276 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740272|ref|ZP_04570753.1| ## NR: gi|237740272|ref|ZP_04570753.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 80 1 80 80 118 100.0 1e-25 MKKLFIFVILLLVVGVSSMAATFYHRPRHMGYMMNGESQYYNFGMYERTYNRDYCSNNYY YDSYYNGENRGCCHSGSRWY >gi|224461372|gb|ACDC01000030.1| GENE 41 37692 - 38021 378 109 aa, chain + ## HITS:1 COG:no KEGG:FN0737 NR:ns ## KEGG: FN0737 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 109 1 109 109 192 91.0 2e-48 MPHLKIRGIEKNLIVENSKEIIDGLTEIIGCDRTWFTIEHQNTEYIFDGKIVDGYTFVEV YWFARDEKIKKDTADFLTKLIKRINNNKDCCIIFFTLTGDNYCDNGEFF >gi|224461372|gb|ACDC01000030.1| GENE 42 38125 - 38274 214 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740274|ref|ZP_04570755.1| ## NR: gi|237740274|ref|ZP_04570755.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 49 4 52 52 63 100.0 4e-09 MKKLIILTALVSIFSISAIAATYCYGYDYSRDSNNNNYSNNVPSCCSRY >gi|224461372|gb|ACDC01000030.1| GENE 43 38338 - 40926 3652 862 aa, chain - ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 862 1 862 862 1449 90.0 0 MKHFTKSKKHLFEEFKTSSTGLIEEEVVARRKKYGENKFVEKEKDGLIKIFFNQFKDSLV IILLIAAVISFFSGNKESALVIVLVLILNSILGAYQTIKAQKSLDSLKKMSSPKCKVIRD HEQLEVDSVELVPGDIVIVEAGDIVPADGRIIENFSLLVNENSLTGESNSIEKTDEVLEY EDLALGDQVNMVFSGSLVNYGRSKILVTETGMSTQLGKIATLLDQTEENVTPLQKSLDIF GKRLTLGIVVLCVLIFGIYVYHGNTVLNSLLLAVALAVAAIPESLNPIITIVLSMETEKL SKENAIVKELKSIEALGSISVICSDKTGTLTQNKMTVKKIFINGKLDNEYSLDKNKKIDK LLLDSFILCTDATDTIGDPTETALIHLTQKYDMSFRDERKDSKRISEIPFDSVRKLMTVL YETKNGKHIIFTKGAFDSLVTRFKYYLDENGNVQNVNEEFIKKIEKVNNELAEEGLRVLT FAYKYIDGEKELSNEDENDYIFHALVGMIDPPREESKLAVQECIRGGIKPVMITGDHKIT ARTIAKNIGIFKDGDIALEGVELEKMTDEELEKNVANISVYARVSPEHKIRIVNAWQKLG KIVAMTGDGVNDAPALKKANIGIAMGITGTEVSKNAASMILADDNFSTIVKAIITGRNVY RNIKNAIGFLLSGNTAAILAVLYSSLANLPVIFSAVQLLFINLLTDSLPSIAVGVEPKNE DILDEKPRDPNEAILTKRFSAKLLIEGVLIAIFIIIAFYIGLKDSALKGSTMAFATLCLA RLFHGIDYRGQRNVFAIGFFKNKFSLIAFALGFILLNAVLLCPPLYNMFGITKLETANFI QIYVLSLIPTVLIQIYKAIKYR 
>gi|224461372|gb|ACDC01000030.1| GENE 44 41228 - 42004 1139 258 aa, chain + ## HITS:1 COG:FN1020 KEGG:ns NR:ns ## COG: FN1020 COG1024 # Protein_GI_number: 19704355 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Fusobacterium nucleatum # 1 258 1 258 258 440 86.0 1e-123 MSVVSYKQEDFIGIVTIERPEALNALNTAVLNELNSTFANINLETTRVVILTGAGTKSFV AGADISEMAPLNNSEAARFSNKGNEVFRKIETFPLPVIAAINGFALGGGCELAMSCDFRV CSENAVFGQPEVGLGITPGFGGTQRLARLIGLGKAKEMIYTANAIKADEALNVGLVNHIY PQETLLEETKKLAAKIAKNAPFAVRASKKAINEGIDTDMDRAIIIEEKLFGSCFTTEDQK VGMKAFLEKVKGVEYKNK >gi|224461372|gb|ACDC01000030.1| GENE 45 42020 - 42859 1407 279 aa, chain + ## HITS:1 COG:FN1019 KEGG:ns NR:ns ## COG: FN1019 COG1250 # Protein_GI_number: 19704354 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Fusobacterium nucleatum # 1 279 1 279 279 504 93.0 1e-143 MKVGIIGAGTMGAGIAQAFAQTEGFTVALCDINNEFAANGKNKIAKGFEKRIAKGKMEQA EADTILSRITTGTKEICADCDLIIEAAIENMEIKKQTFKELDEICKADAIFATNTSSLSI TEIGAGLKRPMIGMHFFNPAPVMKLVEIIAGLHTPTEIVEKIKKVSEDIGKVPVQVEEAP GFVVNRILIPMINEAVGIYAEGVASVEGIDAAMKLGANHPIGPLALGDLIGLDVCLAIMD VLYHETGDSKYRAHTLLRKMVRGKQLGQKTGKGFYDYTK >gi|224461372|gb|ACDC01000030.1| GENE 46 42893 - 43897 1364 334 aa, chain - ## HITS:1 COG:FN0563 KEGG:ns NR:ns ## COG: FN0563 COG0482 # Protein_GI_number: 19703898 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 3 334 2 333 333 585 89.0 1e-167 MKEKIKALALFSGGLDSALAIKLVQDQGVEVIALNFVSHFFGGKNEKAEKMAEQLGIKLE YIDFKKRHMFVVEDPVYGRGKNMNPCIDCHSLMFKIAGELLEEYGAHFVISGEVLGQRPM SQNAQALEKVKKLSGMEDLVLRPLSAKLLPPSRAELMGWVDREKLLDINGRSRHRQMELM NSYGLVEYPSPGGGCLLTDPGYSSRLKVLEDDGLLKDEHSWLFKLIKEARFFRFSKGRYL FVGRDKESNMKIDEYRKEKNLKFYIHSAEVPGPHLIANTDLSDEEIEFAKNLFSRYSKVK GNEKINLNNSGNIETVDVVDLKKLDEEIKKYQQL >gi|224461372|gb|ACDC01000030.1| GENE 
47 43881 - 44327 826 148 aa, chain - ## HITS:1 COG:FN0562 KEGG:ns NR:ns ## COG: FN0562 COG1799 # Protein_GI_number: 19703897 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 38 148 1 111 111 175 91.0 2e-44 MGILKDIKELVGINTEEEYDEEEVIEETSRTLSKREQMEMDTVDDFRYDDYSTIFIDPKQ FEDCKKIANYIEKEKMITINLENIGPNVAQRIMDFLAGAMEIKNASFAQIAKNVYTIVPE NMKVYYEGKRREKKLIDLEKGERFEGEN >gi|224461372|gb|ACDC01000030.1| GENE 48 44349 - 45020 873 223 aa, chain - ## HITS:1 COG:FN0561 KEGG:ns NR:ns ## COG: FN0561 COG0325 # Protein_GI_number: 19703896 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Fusobacterium nucleatum # 1 223 1 223 223 347 91.0 7e-96 MSIQASVEEILEDIKKYSPYPEKVKLIAVTKYSTVEDIEEFLKTGQNICGENKVQVVKDK IEYFKNKNTDIKWHFIGNLQKNKVKYIIDDVVAIHSVNKLSLAQEINKKAEQSGKTMDVL LEINVYGEESKQGYSLDELKCDIMELKNLKNLNIIGVMTMAPFTDDEKILRMVFSELRKI KDELNKEYFDNNLTELSMGMSNDYKIALQEGSTYIRVGTKIFK >gi|224461372|gb|ACDC01000030.1| GENE 49 45042 - 46145 1136 367 aa, chain - ## HITS:1 COG:FN0560 KEGG:ns NR:ns ## COG: FN0560 COG0635 # Protein_GI_number: 19703895 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 365 1 365 365 576 84.0 1e-164 MLKIYNTYIHIPFCERKCNYCDFTSLKGTDNQIEKYVNYLLKEIDIYSKNYDLSEKQDTI YFGGGTPSLLPIDSLKRILSRFSYDENTEITIEVNPKTVDINKLKEYRNLGINRLSIGIQ TFNDENLKVLGRIHNSEEAIEVYNIAREVGFKNISLDIMFSLPNQTLKMLKVDLEKLILL NPEHISIYSLIWEEGTKFFRDLKAGKLKETDNDLEATMYEYIIDYLKSKGYEHYEISNFS KKDFEARHNSIYWENKNYLGLGLSAAGYLGNLRYKNFFHLKDYYDKLDKNILPIDEREEL TANDIEQYRYLVGFRLLNKPLIPSKEYLEKCEILEKEAYLVKKENGYILSSKGLMLFNDF IANFIDD >gi|224461372|gb|ACDC01000030.1| GENE 50 46129 - 47817 2754 562 aa, chain - ## HITS:1 COG:FN0559 KEGG:ns NR:ns ## COG: FN0559 COG1109 # Protein_GI_number: 19703894 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: 
Fusobacterium nucleatum # 1 562 19 580 580 999 90.0 0 MYLDEYKKWLDSTMLSEDEKEELRSIANDEKEIENRFYTNLSFGTAGMRGVRGIGKNRMN KYNIRKATQGLANYIIEATGETGKKKGVAIAYDSRLDSVENAINTAMTLAGNGIKVYLFE GIRSTPELSFAVRELKAQAGVMITASHNPKEYNGYKVYWEDGAQIVDPQATGIVSAVEAV DIFNGVKLMDEKEAIEKGLLVYVGKKLDDRFIEEVKKNAINPDVENKDKIKIVYSPLHGV AARPVERILKEMGYTSVYPVKEQEQPDGNFPTCDYANPEDTSVFKLSTELADKVGAEICI ANDPDGDRMGLAVLDNNGKWFFPNGNQIGILFAEYILNHKKDIPANGTMITTVVSTPLFD TIVKKNGKKALRVLTGFKYIGEKIRQFENKDLDGTFLFGFEESIGYLVGTHVRDKDAVVA SMIIAEMATTFKNNGSSIYNEIIKIYEKYGWRLETTIPITKKGKDGLEEIQKIMKSMREK THTEIAGIKVKEYRDYQKGVEDLPKSDVIQIVLEDETYLTVRPSGTEPKIKFYISVVDSD KKVAEEKLAKLEKEFLNYAENL >gi|224461372|gb|ACDC01000030.1| GENE 51 47846 - 48568 750 240 aa, chain - ## HITS:1 COG:no KEGG:FN0558 NR:ns ## KEGG: FN0558 # Name: not_defined # Def: TraT complement resistance protein precursor # Organism: F.nucleatum # Pathway: not_defined # 23 240 1 216 216 215 56.0 1e-54 MKKILKTVFILTIILTIVLSSTIHTIISKRNLEVQTKMSNTIWLEPVDTDQKTIFVKISN TSDKDLDIESKVINALKTKGYKIVKEPSEAKYSLQVNILNVEKSNLNDADGSGFSEVFMA AGIGSILATQSAEDRSDIVGLGMASATLAKISSAFVKDVVYAMITDVLVSEKIGKNVQVT TVNSVSQGISGTRTSTSSETSNIEKYSTRVLSTANKVNLKFEKAMPVLEDELVKVITGIF >gi|224461372|gb|ACDC01000030.1| GENE 52 48598 - 49314 1194 238 aa, chain - ## HITS:1 COG:no KEGG:FN0558 NR:ns ## KEGG: FN0558 # Name: not_defined # Def: TraT complement resistance protein precursor # Organism: F.nucleatum # Pathway: not_defined # 23 238 1 216 216 328 86.0 1e-88 MKKIWKSIIFLGLLLTMVSCSTMHTVISKRNLDVQTKMSDTIWLEPAAANQKTVFVKVSN TSGKNLNIEQKIISILSAKGYRIVNDPAEAKYWLQANILKVDKVNLNNENGFSDAVLGAG IGGVLGAQRSGGAYTALGWGLAGAAIGTIADALVSDTAYAMVTDILISEKTGKNVQSSTR NSVKQGNSGTMTSKTSSSSNMEKYSTKVLSTANQVNLNFDSAIPILEDELGKVISGIF >gi|224461372|gb|ACDC01000030.1| GENE 53 49391 - 49855 641 154 aa, chain - ## HITS:1 COG:FN1023 KEGG:ns NR:ns ## COG: FN1023 COG3467 # Protein_GI_number: 19704358 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Fusobacterium nucleatum # 1 154 3 156 156 226 
74.0 1e-59 MRKANREVKDRNEIIEIMKRCDVCRLVFNNGDYPYIVPLNFGLDADEEKVIIYFHSALEG TKVDIMKREMKATFEMDCNHELQYYEDRGYCTMAYESVIGRGKIRILSEDEKMEALKKLM AQYHKDKEAYFNPAAIPRTLVYCLEVEEMTAKRK >gi|224461372|gb|ACDC01000030.1| GENE 54 49983 - 50711 1142 242 aa, chain + ## HITS:1 COG:no KEGG:FN0557 NR:ns ## KEGG: FN0557 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 242 3 244 244 345 80.0 7e-94 MKFLMALLVTVFAFSFSAEIQAKSVSKNKEVVDVIFILDRSGSMGGLESDTIGGFNSVLE KQRKEEGKAYITTVLFDDQYELLHDRVDITKVKNITEKEYYVRGSTALLDAIGKTIAKEK AIQDTLSKGEKATKVLFIIITDGLENASKEYNSATVKRLIETQKEKYGWEFLFLGANIDA IETASAIGISAERAVNYNSDSVGTQLNYKSLNNAVSEVRSGKELKKEWKADIEADYQQRS KK Prediction of potential genes in microbial genomes Time: Thu May 19 22:55:47 2011 Seq name: gi|224461371|gb|ACDC01000031.1| Fusobacterium sp. 2_1_31 cont1.31, whole genome shotgun sequence Length of sequence - 6529 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 64 - 693 772 ## gi|237740287|ref|ZP_04570768.1| predicted protein + Term 733 - 773 2.8 + Prom 720 - 779 4.7 2 2 Op 1 . + CDS 827 - 1666 896 ## gi|237740288|ref|ZP_04570769.1| conserved hypothetical protein 3 2 Op 2 . + CDS 1590 - 1979 492 ## gi|260494392|ref|ZP_05814523.1| conserved hypothetical protein + Prom 2003 - 2062 3.9 4 3 Tu 1 . + CDS 2085 - 3263 1362 ## Shew_2461 MORN repeat-containing protein + Prom 3266 - 3325 10.0 5 4 Op 1 . + CDS 3437 - 4324 1073 ## Lebu_0491 hypothetical protein 6 4 Op 2 . + CDS 4337 - 4894 874 ## gi|260494389|ref|ZP_05814520.1| predicted protein + Prom 5158 - 5217 6.2 7 5 Tu 1 . + CDS 5334 - 5876 570 ## FN0142 hypothetical protein + Term 5894 - 5933 5.4 + Prom 6136 - 6195 7.7 8 6 Tu 1 . 
+ CDS 6215 - 6527 363 ## gi|294781936|ref|ZP_06747268.1| conserved hypothetical protein Predicted protein(s) >gi|224461371|gb|ACDC01000031.1| GENE 1 64 - 693 772 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740287|ref|ZP_04570768.1| ## NR: gi|237740287|ref|ZP_04570768.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 209 1 209 209 347 100.0 3e-94 MSLKIRKGDKMEYRYKNIYLEETIEEIFPELNNSNTEYEPSTFSLAYRPYEYVEVTIYLK FGEVLLIKIFDENFQIDNTLKVGIALTDEIINRYDLYYDDFEEVYLSKKYKELVVIVDLA DNIIGFSFVKEDGKDFSFPKDKIKNYLECKNLLDIYGSLRNNDTLDVNIEKREIYGQLNN YKFTFDIITRNIKSIQNLETGEFIKTSLE >gi|224461371|gb|ACDC01000031.1| GENE 2 827 - 1666 896 279 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740288|ref|ZP_04570769.1| ## NR: gi|237740288|ref|ZP_04570769.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 279 1 279 279 442 100.0 1e-123 MKIYKFYWNFDDEIPSFEFKITKDVNYFFDKKEIEDLIKRIKHDEEIANRYLTGKLEISR FEILKMYRLKYGTYEGLEKYIKLKIDGKKLKISDIMKVWSIGTGFILSKKAKDYIERKYS DYFKYIEVFYKDIPLYIVTELTKFELSSVEDTYKNKILDFKKISGNNPIFAANYWIITKE SSGSFYCLEEFKDYIEASDLKNYIFNEIQDSNTFIPEPEPEIEEIEEKEYYENGNLKYEG LTRLGLRVKEWKFYYENGKTSICRRLQIWRTRWNLENLL >gi|224461371|gb|ACDC01000031.1| GENE 3 1590 - 1979 492 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260494392|ref|ZP_05814523.1| ## NR: gi|260494392|ref|ZP_05814523.1| conserved hypothetical protein [Fusobacterium sp. 
3_1_33] # 5 129 259 383 383 202 96.0 5e-51 MRMEKLQFVGDYKYGEQDGIWKIYYENGNIKNIANYDYGKLVGLVRNYEEDGKFSSTTYY EEGSNLTKWQFFYKDGKSIKKEGMAYDMGEEAKKRWITTGEWKYYSKAGKLQKIETYENG EIIKVEKFK >gi|224461371|gb|ACDC01000031.1| GENE 4 2085 - 3263 1362 392 aa, chain + ## HITS:1 COG:no KEGG:Shew_2461 NR:ns ## KEGG: Shew_2461 # Name: not_defined # Def: MORN repeat-containing protein # Organism: S.loihica # Pathway: not_defined # 220 378 593 770 772 66 29.0 2e-09 MKIYEIGFDYANYNVIFTFKINKDASFFFSKEELDRYFRKDRFYEEANLKRYVEGEAKIL DVTLLDIYKDKYGTYEGLEKYVELIPDGKKSNVKDIISIPGFGMKVLLSRKAKEYIEKKY SGKLEYLKVSYDKKDFYIVTDIKNIEYCYSLKLPPNIIDVYDFSKVSGKNDIFKIGTIEK KDFLKERFFCIKKFKDYIEESDLKGYKFEEMKDINDIEIFKEEKQEETQFTEIEEKGYYK SGKLKYIGTIWKGFRIKQWKSWYENGNLESDGEFNMKGEEEGEWRYYHQNGKIKNVANYE NGKLVGLVKNFDENGKFYSSTYYEKSSNLTKWKFFYKDGKSIKKEGMAYDMGEEAKKRWI ATGEWKYYSKAGKLQKIETYENGEIIKVEKFK >gi|224461371|gb|ACDC01000031.1| GENE 5 3437 - 4324 1073 295 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0491 NR:ns ## KEGG: Lebu_0491 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 278 3 289 296 179 39.0 1e-43 MLVSKKKFNEKLEHLIKRVDLYKEVENDYLKRIEEKNGDCQYCMRSLGRIYITVASKELV VDKDIESFRKNIYVYSKLNLMGTDTRAYLAWKKMNLFCVLMSNNKDFLDFILRTFDIIGH EKEKYKKSEADFYLMRTILLALKGDWEEVIKRADFYSANPSKETGFKYFPLEFGFLKALA EKNIEKMKENINAMLEPKVARQMMHDESIFFYLHVYVLLYLKIASYYGFDLEIESDIVPK ELIDNTPAKEYPEPYEFMKKFDLKTITPEEWKAWIYEYYPKPEILKEFEEKGSFI >gi|224461371|gb|ACDC01000031.1| GENE 6 4337 - 4894 874 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260494389|ref|ZP_05814520.1| ## NR: gi|260494389|ref|ZP_05814520.1| predicted protein [Fusobacterium sp. 
3_1_33] # 1 185 1 185 185 229 92.0 6e-59 MADITDVYVRIKYKKNGKIYEDDLLEDIDMYDALSDMEYNGIYYEIDDKLIMRAYGRNYY ALCFQNESELKDYLFEISQEKGIENIYYIYCEYSYMMEVIRYGIINIDIINKKVTVDIEK EEKYIEIFEKIAKKSYPKLLENYEKYIDDELEDDEVEEYEDKMDEIMGEYSLKEFEKFLN KVKLK >gi|224461371|gb|ACDC01000031.1| GENE 7 5334 - 5876 570 180 aa, chain + ## HITS:1 COG:no KEGG:FN0142 NR:ns ## KEGG: FN0142 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 88 180 71 160 160 116 66.0 4e-25 MIVEPDVLAHLLTDVKLWKKYGTGPNDSTTRKEREKIGDSNLGLRIQNSYDEIRNELEAR EKDKASLNELTEILEKIATEKMVKKALKRVKEIEDFAEPYIDEETKEKIIRKIYPLQKEF LRILGRKNEKYKLGTGTMEGRFYIDIYIKDLKTNEIFIIKRDNIHIYYESAGPKVFLPSI >gi|224461371|gb|ACDC01000031.1| GENE 8 6215 - 6527 363 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294781936|ref|ZP_06747268.1| ## NR: gi|294781936|ref|ZP_06747268.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 1 104 1 104 127 172 92.0 8e-42 MSELQIVRKKIHGDFFLEKSYREGKLFYEVLRYKDDYIGINYGYLEDELREETYINDNRI GMIVVNEKDRIYICTLDEKRREVGITVTYRNKSGRLAHEIDYLD Prediction of potential genes in microbial genomes Time: Thu May 19 22:56:44 2011 Seq name: gi|224461370|gb|ACDC01000032.1| Fusobacterium sp. 2_1_31 cont1.32, whole genome shotgun sequence Length of sequence - 4819 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 333 - 392 10.2 1 1 Tu 1 . + CDS 416 - 805 423 ## gi|237740293|ref|ZP_04570774.1| predicted protein + Term 819 - 860 1.7 + Prom 828 - 887 6.7 2 2 Op 1 . + CDS 1127 - 1267 114 ## gi|237740294|ref|ZP_04570775.1| predicted protein 3 2 Op 2 . + CDS 1230 - 1490 302 ## gi|237740295|ref|ZP_04570776.1| predicted protein 4 2 Op 3 . + CDS 1495 - 2091 720 ## gi|237740296|ref|ZP_04570777.1| predicted protein + Prom 2161 - 2220 7.9 5 3 Tu 1 . + CDS 2247 - 2534 494 ## FN0038 hypothetical protein + Term 2562 - 2601 5.4 + Prom 2583 - 2642 10.5 6 4 Tu 1 . 
+ CDS 2741 - 3136 590 ## gi|237740298|ref|ZP_04570779.1| predicted protein + Term 3184 - 3240 -0.2 + Prom 3267 - 3326 3.8 7 5 Op 1 . + CDS 3393 - 4202 1019 ## gi|237740299|ref|ZP_04570780.1| hemolysin 8 5 Op 2 . + CDS 4213 - 4819 531 ## gi|237740300|ref|ZP_04570781.1| LOW QUALITY PROTEIN: hypothetical protein FSAG_00374 Predicted protein(s) >gi|224461370|gb|ACDC01000032.1| GENE 1 416 - 805 423 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740293|ref|ZP_04570774.1| ## NR: gi|237740293|ref|ZP_04570774.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 129 1 129 129 209 100.0 4e-53 MKFKFYYYKSFEKSIHEKIDSMCSSINETQNEDIISCYISDISVFEIYLDSICERLEKKE PSAMDGQVWGGDFIEDRVYIYWVFDPDNEEGKAEISRKGMLKLMKRWIEFRKKKIPENYE EYEEIIEVD >gi|224461370|gb|ACDC01000032.1| GENE 2 1127 - 1267 114 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740294|ref|ZP_04570775.1| ## NR: gi|237740294|ref|ZP_04570775.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 46 1 46 46 65 100.0 9e-10 MIDEGIAKLKNMTTEEIKELFEKEGLNLKSFDKVNGKGSGQKFFNT >gi|224461370|gb|ACDC01000032.1| GENE 3 1230 - 1490 302 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740295|ref|ZP_04570776.1| ## NR: gi|237740295|ref|ZP_04570776.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 86 1 86 86 149 100.0 4e-35 MEKEVVKNFSIPNNKFMIPADFEKGKRKMVTIGSVRTNSGGTHGSPYILFGTNDGKYKIV FGDPKDYKYNITNKENANIIFVNPKR >gi|224461370|gb|ACDC01000032.1| GENE 4 1495 - 2091 720 198 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740296|ref|ZP_04570777.1| ## NR: gi|237740296|ref|ZP_04570777.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 198 1 198 198 312 100.0 8e-84 MKYKYKDIYLEETIEKIFSELNNSNTEYERSTFSLAYRPYENIEVYIYLEFSKVRLIKIF DENFQIDNTLKVGVKLTNDIIDKYSLYYDDFEEIYLSKKYKELVVIVDLANNIIGFSFSK GLEGEEKSPKDKIKNYLECKNLRDIFGSLYNSYTLDADIEKREIYGQLDNYKFTFDIITR DIKSIQNLETGEFIKTYN >gi|224461370|gb|ACDC01000032.1| GENE 5 2247 - 2534 494 95 aa, chain + ## HITS:1 COG:no KEGG:FN0038 NR:ns ## KEGG: FN0038 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 95 6 100 100 113 88.0 2e-24 MNATEKKELMGKYAKKLENAIKREATVMKEIENDKELIKYLEGQKTSGAAFDNTVYESYD AWIETIRKQIKKSESTLTNIEFKKVELEAIQKYIA >gi|224461370|gb|ACDC01000032.1| GENE 6 2741 - 3136 590 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740298|ref|ZP_04570779.1| ## NR: gi|237740298|ref|ZP_04570779.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 131 1 131 131 192 100.0 4e-48 MIWEELKSRKNFVEEDFIELRDSVEGLISVIEKYKDMRKDSDEYIMELKEFLEEVNLTLE EKKITDKELKNLNFLREDYFNSHTNSISEYGVYDKNDLEKTHKVNEEITVAVSRFGKILY KITEKVMYHMI >gi|224461370|gb|ACDC01000032.1| GENE 7 3393 - 4202 1019 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740299|ref|ZP_04570780.1| ## NR: gi|237740299|ref|ZP_04570780.1| hemolysin [Fusobacterium sp. 2_1_31] # 1 269 1 269 269 473 100.0 1e-132 MGISQKNLTTGEISLQKINPLGQRISETSFTPYEANTLIGANSSSKMLVGNRAVNQSVIS RVSYQTGNNSLVLYDKTPVPVATNGALVPPLTTNRALATVPLLTNGANQVSQVVTPPNYL TKPEEGGIMVVGAGTEAISGAYSIDINPIVKGVNKGDAENLVGIPDNFLSLVIIDNPTFN PVNTEILRVVKPGGEIRITGVISNSHFSKLFDKKRNEVKVPEGFELIEKGEIPENLRKQG YRNNGDPIGQKNGVGVPKKTDRIIRLRKK >gi|224461370|gb|ACDC01000032.1| GENE 8 4213 - 4819 531 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740300|ref|ZP_04570781.1| ## NR: gi|237740300|ref|ZP_04570781.1| LOW QUALITY PROTEIN: hypothetical protein FSAG_00374 [Fusobacterium sp. 
2_1_31] # 1 197 1 197 197 367 100.0 1e-100 MKIKLRIDKDRWEMTPYEGYITMPIIIGRTMIFGAYNVEFEMTANIWKQLPEEYKRKIYR YNWKKMIKSMLITVTDITAYSFCFGSNIKLKSSITMEEFYKNFDENKRIEILTPGCDFPR SSMSVYFQYLGEVYAEVDLDELVAISDEENSFQYSIMPKLMEASNYRKRRENNTGKLEQI YNEQLIVKSLVDKNIDDLSKEE Prediction of potential genes in microbial genomes Time: Thu May 19 22:57:37 2011 Seq name: gi|224461369|gb|ACDC01000033.1| Fusobacterium sp. 2_1_31 cont1.33, whole genome shotgun sequence Length of sequence - 3393 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 407 335 ## gi|237740301|ref|ZP_04570782.1| predicted protein + Term 452 - 499 7.2 + Prom 474 - 533 8.2 2 2 Op 1 . + CDS 568 - 1053 592 ## FN0932 hypothetical protein 3 2 Op 2 5/0.000 + CDS 1065 - 2321 1387 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 4 2 Op 3 . + CDS 2302 - 3375 1217 ## COG0082 Chorismate synthase Predicted protein(s) >gi|224461369|gb|ACDC01000033.1| GENE 1 3 - 407 335 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740301|ref|ZP_04570782.1| ## NR: gi|237740301|ref|ZP_04570782.1| predicted protein [Fusobacterium sp. 
2_1_31] # 33 134 1 102 102 150 99.0 2e-35 YEKEAKEEWKLKEIERRKIREIYKPESISLEKLEEKEERFQIFICNVHMSGKISKEDFEK IFSTYPLILTYLSLDFLILILKKAEELGIEFPENIKYDIGYCLADLQTEIMTEEEKEAIE EIRKKWNLKKVYED >gi|224461369|gb|ACDC01000033.1| GENE 2 568 - 1053 592 161 aa, chain + ## HITS:1 COG:no KEGG:FN0932 NR:ns ## KEGG: FN0932 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 161 7 167 167 189 73.0 3e-47 MLFYEDLVKKIETGKIEDIKKIEKFGLNKAKNISGYGIAIPLILIGLFEVYSYTIYHKWY LLLIGALFFALGLKQAKTVFTYSIKVDTEAKNIKFKNLNLNFDDVESGTLKEMKLGKKVL PVIDMITKDRKQVIIPLYMDKQERFILLVKEILAERFSIEK >gi|224461369|gb|ACDC01000033.1| GENE 3 1065 - 2321 1387 418 aa, chain + ## HITS:1 COG:FN0933 KEGG:ns NR:ns ## COG: FN0933 COG0128 # Protein_GI_number: 19704268 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Fusobacterium nucleatum # 2 417 7 424 424 625 80.0 1e-179 MKIIKADKLVGELSPPPSKSVLHRYIIASSLAKGTSKIENISFSEDIIATIEAMKKLGAK IEKKDNYLLIDGSDTFKNLNENIEIDCNESGSTLRFLFPLSIVKENKVLFKGRGKLFKRP MTPYFQNFEKYKIKHLYIDENAILLEGKLKAGIYEIEGNISSQFITGLLFSLPLLDGESK IIINGKLESSNYIDISLDCLNKFGIKIINNSYQEFIIEGNQSYRAGNYRTEADYSQAAFF LVANAIGSKIKINDLSEDSLQGDKKIIDYISEIDNWNSKDTLVLDGSETPDIIPILSLKA AVSGKKIEIVNIERLRIKESDRLKATVEELSKLNFDLIEKKDSILINSRENFEVNKNEKV VSLSAHSDHRIAMMIAIAATCYDGEILLDNLDCVKKSYPNFWEVFLSLGGKIYEYLGN >gi|224461369|gb|ACDC01000033.1| GENE 4 2302 - 3375 1217 357 aa, chain + ## HITS:1 COG:FN0934 KEGG:ns NR:ns ## COG: FN0934 COG0082 # Protein_GI_number: 19704269 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Fusobacterium nucleatum # 1 357 1 357 357 617 86.0 1e-177 MNTWGTKIRLSIFGESHGEALGIVIDGLKAGTKLNLENINKFIDRRRAGKSSFTTSRKEK DEYRILSGYKDGHTTGAPLCVIFENTNTQSKDYEDLKVLLRPNHADYPAAIKFKGFNDIR GGGHFSGRITLALTFAGAVATDILEEKGIKIFSHIKKILDIKDKSFLDFKEVDIDKFKNL KESSLAFIEDDLEIKAKELLEKIKLSGNSVGGEIECACYNLPVGLGSPFFDSLESKISHL AFSVPAVKGIQFGIGFDFSNILGSEANDLYYLEDDKIKTKTNNNGGILGGLSTGMPLVFS 
VVIKPTPSISIEQETVNIKEMKNDILKISGRHDACIVPRVMPVIEAITALAILDEIL Prediction of potential genes in microbial genomes Time: Thu May 19 22:57:49 2011 Seq name: gi|224461368|gb|ACDC01000034.1| Fusobacterium sp. 2_1_31 cont1.34, whole genome shotgun sequence Length of sequence - 6638 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1367 1220 ## COG0534 Na+-driven multidrug efflux pump - Prom 1389 - 1448 8.1 + Prom 1365 - 1424 9.0 2 2 Tu 1 . + CDS 1481 - 2188 818 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 2224 - 2263 6.3 + Prom 2203 - 2262 11.7 3 3 Op 1 6/0.000 + CDS 2362 - 3429 1500 ## COG1145 Ferredoxin 4 3 Op 2 2/0.000 + CDS 3433 - 4236 1188 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 5 3 Op 3 . + CDS 4256 - 5230 1552 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits + Term 5242 - 5285 8.2 + Prom 5232 - 5291 7.4 6 4 Tu 1 . 
+ CDS 5312 - 6499 1590 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities Predicted protein(s) >gi|224461368|gb|ACDC01000034.1| GENE 1 2 - 1367 1220 455 aa, chain - ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 455 1 455 455 689 87.0 0 MDEEIKTANPLGYQKISKLLRSLAIPAIIANLVNALYNVVDQIFIGQGIGYLGNAATNIA FPITTICLAIGLTLGIGGASNFNLELGKGNPEKSKHTAGTAASTLIIIGIILCISIRVFL EPLMISFGATDKILQYAMEYTGITSYGIPFLLFSIGVNPLVRADGNARYSMIAIIVGAVL NTILDPLFMFVFHWGIAGAAWATVISQVISASLLLIYFPRFKTVKFSLNDFIPQVHYLKR IISLGFASFIYQFSNMIVLVTTNNLLKFYGAKSIYGSDIPIAVFGIVMKINVIFIAIVLG LVQGAQPIFGFNYGAKNYHRVRETMRLLLKVTFSIASILFVIFQVFPKQIISLFGEGDEL YFSFATKYMRTFLLFISLNSIQVSIATFFPSIGKAIKGAIVSLAKQILFLFPLLLILPRF FGLEGVIYATPVTDLLAFSVAIIFLIHEFKHMPKE >gi|224461368|gb|ACDC01000034.1| GENE 2 1481 - 2188 818 235 aa, chain + ## HITS:1 COG:CAC1511 KEGG:ns NR:ns ## COG: CAC1511 COG0664 # Protein_GI_number: 15894789 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 15 234 8 226 228 134 37.0 1e-31 MSEGDIMKAKNSDIEKIEVFSGISKNSIVEIKNSADVIELKKNKALYSDRQQLDYVYFLI SGNVSLIKSSESGENRVIFLLNDGSMINEPLMRKNTSGIECWGFEDSKILRIGLKTFDKI MSKDYILARNCMLEMEKRIRRLYRQLKNLTSSNIEKKLAAKLYRLGTQYGLKENEIEDYT YINLNLTVTYIAKMLGYQRETVSRSLKLLAQKEIILQKDRKFYVNIEKARQFFKK >gi|224461368|gb|ACDC01000034.1| GENE 3 2362 - 3429 1500 355 aa, chain + ## HITS:1 COG:CAC1513 KEGG:ns NR:ns ## COG: CAC1513 COG1145 # Protein_GI_number: 15894791 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 336 1 338 338 347 51.0 2e-95 MKLRLSVEEFDKGLEELSKKYLILAPRTFEKRGTYSDTDVVRYAKVNSFSEMNWEDKSHF PAKEALLPVNEVLFYFTEDEYKVAAEDTRERLVFLRACDMNAVKRIDQIYLGNGASNDFF 
YTRTRKKTKFVVVGCTKSFRNCFCVSMGTNKADNYDAAMNIRGNEIQLELRDDDLKVFLG REVDFDIDYVSKNDFEVELPDKVDFMYMQNHKMWDEYDTRCIACGRCNYSCPTCTCFSMQ DIHYKENKNMGERRRVWASCQVDGYTNIAGGHSFRVKHGQRMRFKTLHKIHDYRKRFGEN MCVGCGRCDDMCPQYISISEAYEKVARAMKEKDNEELISEVYEKVVKAMKEKREE >gi|224461368|gb|ACDC01000034.1| GENE 4 3433 - 4236 1188 267 aa, chain + ## HITS:1 COG:CAC1514 KEGG:ns NR:ns ## COG: CAC1514 COG0543 # Protein_GI_number: 15894792 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Clostridium acetobutylicum # 6 267 4 264 264 323 56.0 3e-88 MCNCDNPYIPCPAEIIEITKHTDIEWTFRVKADTSKTKPGQFYEISLPKFGESPISVSGI GPNFIDFTIRAVGRVTNEIFEYKIGDKLFIRGPYGNGFDLNEYVGKDLVIVVGGSALAPV RGIIQFVYNNPEKVKSFKLIAGFKSPKDVLFAKDLEEWSKKLDVVLTVDGAEEGYKGNIG LVTKYIPELKFNDLSNVSAVVVGPPMMMKFSVAEFLKLNVAEKNIWVSYERNMHCGIGKC GHCKMDATYICLDGPVFDYEFAKNLVD >gi|224461368|gb|ACDC01000034.1| GENE 5 4256 - 5230 1552 324 aa, chain + ## HITS:1 COG:CAC1515 KEGG:ns NR:ns ## COG: CAC1515 COG2221 # Protein_GI_number: 15894793 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Clostridium acetobutylicum # 4 322 2 320 320 438 62.0 1e-123 MIRDLNIRKVMKNAFRITKTKYKTALRVRVPGGLIDPECLMLVSEIASKYGDGQVHITTR QGFEILGIDMEDMPAVNEMAQPLIDKLNINQDEKGKGYSAAGTRNVSACIGNKVCPKAQY NTTAFAKRIEKVIFPNDLHVKVALTGCPNDCIKARMHDFGIIGTCLPEYEMDRCVTCGAC VKKCKKVSVEALRIENNKIVRDENKCIGCGECVINCPMSAWTRSPKKYYKLMIMGRTGKQ NPRLAEDWLRWVDEDSIVKIIENTYKYAKEFISKDAPNGKEHVGYIVDRTGFKVFREWAL KDVNLPKETIEREPIYWSGPKYNY >gi|224461368|gb|ACDC01000034.1| GENE 6 5312 - 6499 1590 395 aa, chain + ## HITS:1 COG:FN0625 KEGG:ns NR:ns ## COG: FN0625 COG1168 # Protein_GI_number: 19703960 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Fusobacterium nucleatum # 1 394 1 394 398 676 
88.0 0 MQKEKFLKEYLVERKGTYSLKWDALDKRFGNADLISMWVADMEIKAPKEVIEALKERCEH GVFGYSYVSDEYYNSVINWLKEKHNYEIKKEWLRFTNGVVTAIYCFVNIFTKVDDAILIL TPVYYPFHNAVKDNNRKLITCDLKNTDGYFTIDYDEVEKKIVENNVKLFIQCSPHNPAGR VWKEEELAKILEICKKHNVLVISDEIHQDIIMKGYKHIPSAIVANGKYADNLITVSAASK TFNLAGLIHSNIIISNAELRKKYDDEIKKINQTEISILGMLATQVAYEKGSEWLENVKEI IEDNFNYLKTELNKHIPEITITNLEGTYLVFLDLRKIIPIDKVKEFIQDKCNLAIDFGEW FGASFKGFIRINLATDPEIVKKAVENIITEYKKLE Prediction of potential genes in microbial genomes Time: Thu May 19 22:58:03 2011 Seq name: gi|224461367|gb|ACDC01000035.1| Fusobacterium sp. 2_1_31 cont1.35, whole genome shotgun sequence Length of sequence - 41290 bp Number of predicted genes - 43, with homology - 43 Number of transcription units - 10, operones - 8 average op.length - 5.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 9 - 929 651 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 979 - 1038 15.6 2 1 Op 2 . - CDS 1092 - 1547 567 ## FN1219 hypothetical protein - Prom 1589 - 1648 10.3 + Prom 1567 - 1626 8.0 3 2 Op 1 1/0.000 + CDS 1743 - 2342 726 ## COG4399 Uncharacterized protein conserved in bacteria 4 2 Op 2 1/0.000 + CDS 2353 - 3369 1469 ## COG2255 Holliday junction resolvasome, helicase subunit 5 2 Op 3 1/0.000 + CDS 3347 - 3772 549 ## COG1959 Predicted transcriptional regulator 6 2 Op 4 5/0.000 + CDS 3782 - 4489 920 ## COG1385 Uncharacterized protein conserved in bacteria 7 2 Op 5 . + CDS 4476 - 5795 397 ## PROTEIN SUPPORTED gi|229207303|ref|ZP_04333755.1| SSU ribosomal protein S12P methylthiotransferase 8 2 Op 6 . 
+ CDS 5785 - 6765 1290 ## FN1213 hypothetical protein + Term 6820 - 6863 4.0 + Prom 7099 - 7158 7.8 9 3 Op 1 1/0.000 + CDS 7188 - 9167 2281 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 10 3 Op 2 1/0.000 + CDS 9106 - 11028 2374 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 11 3 Op 3 1/0.000 + CDS 11037 - 11336 242 ## PROTEIN SUPPORTED gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein 12 3 Op 4 1/0.000 + CDS 11350 - 13152 1856 ## COG1154 Deoxyxylulose-5-phosphate synthase 13 3 Op 5 1/0.000 + CDS 13136 - 13975 984 ## COG3481 Predicted HD-superfamily hydrolase 14 3 Op 6 1/0.000 + CDS 13938 - 14750 921 ## COG1189 Predicted rRNA methylase 15 3 Op 7 1/0.000 + CDS 14751 - 16097 1917 ## COG0793 Periplasmic protease 16 3 Op 8 1/0.000 + CDS 16108 - 16806 985 ## COG0313 Predicted methyltransferases 17 3 Op 9 1/0.000 + CDS 16787 - 17665 1281 ## COG1161 Predicted GTPases 18 3 Op 10 . + CDS 17668 - 18444 1259 ## COG0171 NAD synthase 19 3 Op 11 . + CDS 18451 - 18822 527 ## FN1201 hypothetical protein 20 3 Op 12 . + CDS 18856 - 19662 833 ## FN1200 hypothetical protein + Term 19688 - 19721 4.0 - Term 19670 - 19714 9.2 21 4 Tu 1 . - CDS 19725 - 19949 324 ## COG1314 Preprotein translocase subunit SecG - Prom 19975 - 20034 11.9 + Prom 19999 - 20058 14.1 22 5 Op 1 1/0.000 + CDS 20091 - 20378 395 ## COG1862 Preprotein translocase subunit YajC 23 5 Op 2 . + CDS 20445 - 21473 1336 ## COG0860 N-acetylmuramoyl-L-alanine amidase 24 5 Op 3 . 
+ CDS 21478 - 21903 618 ## FN1333 hypothetical protein 25 5 Op 4 32/0.000 + CDS 21919 - 22992 1648 ## COG0216 Protein chain release factor A 26 5 Op 5 1/0.000 + CDS 23031 - 24140 1405 ## COG2890 Methylase of polypeptide chain release factors 27 5 Op 6 1/0.000 + CDS 24124 - 25155 1264 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 28 5 Op 7 1/0.000 + CDS 25167 - 25715 322 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 + Prom 25892 - 25951 9.7 29 6 Op 1 22/0.000 + CDS 25972 - 26199 418 ## COG1722 Exonuclease VII small subunit 30 6 Op 2 . + CDS 26201 - 27097 1263 ## COG0142 Geranylgeranyl pyrophosphate synthase + Term 27120 - 27153 5.1 31 7 Op 1 9/0.000 + CDS 27168 - 28184 1599 ## COG2984 ABC-type uncharacterized transport system, periplasmic component + Prom 28197 - 28256 4.4 32 7 Op 2 13/0.000 + CDS 28281 - 29165 1100 ## COG4120 ABC-type uncharacterized transport system, permease component 33 7 Op 3 . + CDS 29165 - 29932 172 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Term 29935 - 29997 -0.9 34 7 Op 4 . + CDS 30008 - 30700 1154 ## FN0602 hypothetical protein 35 7 Op 5 1/0.000 + CDS 30710 - 32014 1504 ## COG0144 tRNA and rRNA cytosine-C5-methylases 36 7 Op 6 . + CDS 32008 - 32652 723 ## COG4122 Predicted O-methyltransferase + Prom 32673 - 32732 6.5 37 8 Tu 1 . + CDS 32794 - 33570 956 ## COG2116 Formate/nitrite family of transporters + Term 33589 - 33622 4.0 - Term 33577 - 33610 4.0 38 9 Op 1 1/0.000 - CDS 33617 - 34081 714 ## COG2606 Uncharacterized conserved protein 39 9 Op 2 . - CDS 34103 - 35419 1438 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 35452 - 35511 9.9 40 10 Op 1 . - CDS 35566 - 38949 3634 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 41 10 Op 2 2/0.000 - CDS 39012 - 40001 1148 ## COG0582 Integrase 42 10 Op 3 . 
- CDS 40047 - 40754 692 ## COG3177 Uncharacterized conserved protein 43 10 Op 4 . - CDS 40741 - 41289 474 ## COG0732 Restriction endonuclease S subunits Predicted protein(s) >gi|224461367|gb|ACDC01000035.1| GENE 1 9 - 929 651 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 303 5 302 308 255 47 3e-67 MLANSVIDLIGNTPLVKINNIDTFGNEIYIKLEGSNPGRSTKDRIALKMIEAAEKEGLID KDTVIIEATSGNTGIGLAMICAIKNYKLKIVMPNTMSVERIQLMRAYGTEVILTDGSLGM KACLDKLEELKKEEKKYFIPNQFTNPNNPKAHYENTAEEILRDMDNKVDVYICGTGTGGS FSGTAKKLKEKLPNIKTFPVEPASSPLLSKGYIGPHKIQGMGMSIGGIPVVYDGSLADGI LVCDDEDAFKMMRELSFKEGILAGISSGATFKAALDYSKENANKGLRIVVLSTDSGEKYL SNAYNY >gi|224461367|gb|ACDC01000035.1| GENE 2 1092 - 1547 567 151 aa, chain - ## HITS:1 COG:no KEGG:FN1219 NR:ns ## KEGG: FN1219 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 151 1 151 151 233 87.0 2e-60 MSTLYIKILTDYFHHIIGDLEENRKIFLGKFYSYLLEKDEYGFAPVFEGELGRIEYLLKQ ISIEAKGMSLDEFLKLMSWYNEDAWANGEIFEYFLHHKKEKEIKLITDIHSLSENELQFI KDLDNFLNTKGRILKFFNVHNGKYQSLKEIL >gi|224461367|gb|ACDC01000035.1| GENE 3 1743 - 2342 726 199 aa, chain + ## HITS:1 COG:FN1218 KEGG:ns NR:ns ## COG: FN1218 COG4399 # Protein_GI_number: 19704553 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 2 198 3 199 200 275 79.0 3e-74 MKLVIMVIISAAIGWITNWVAIKMLFRPHNEINLGLFKIQGLIPKRRAEIGIGIADVIQN ELISIKDVIANIDREEFSKRLNDLIDDVLEKNLKTKVKEKFPVMQMFFSDKMAKDVSNTI KGIVMENQEKIFEIFSNYAEENINFSTIITDKISNFSLDKLEEIINGLAKKELKHIEVIG AILGAFIGLVQYFITLFVK >gi|224461367|gb|ACDC01000035.1| GENE 4 2353 - 3369 1469 338 aa, chain + ## HITS:1 COG:FN1217 KEGG:ns NR:ns ## COG: FN1217 COG2255 # Protein_GI_number: 19704552 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Fusobacterium nucleatum # 1 331 1 331 332 591 95.0 1e-169 
MDRIISELEMPNEIEIQKSLRPKSFDEYIGQENLKEKMSISIKAAQKRNMTVDHILLYGP PGLGKTTLAGVIANEMQANLKITSGPILEKAGDLAAILTSLEENDILFIDEIHRLNNTVE EILYPAMEDGELDIIIGKGPSAKSIRIELPPFTLIGATTRAGLLSAPLRDRFGVSHKMEY YNIDEIKAIIIRGAKILGVKISEEGAIEISKRSRGTPRIANRLLKRVRDYCEIKGNGTID VLSAKNALDMLGVDSSGLDELDRNIINSIIENYDGGPVGIETLSLLLGEDRRTLEEVYEP YLVKIGFLKRTNRGRVVTPKAYQHFKKDEVKDEDKHEG >gi|224461367|gb|ACDC01000035.1| GENE 5 3347 - 3772 549 141 aa, chain + ## HITS:1 COG:FN1216 KEGG:ns NR:ns ## COG: FN1216 COG1959 # Protein_GI_number: 19704551 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 141 1 141 143 227 83.0 6e-60 MKINTKVRYGLKALAYIAENSSDKKLVRIKEISEDQDISIQYLEQILFKLKNENIIEGKR GPTGGYKLTLKPSQINLYTIYKILDDEERVIDCNENAEGKAHNCNEEACGETCIWSRLDN AMTKILSETSLEDFIKNGKKI >gi|224461367|gb|ACDC01000035.1| GENE 6 3782 - 4489 920 235 aa, chain + ## HITS:1 COG:FN1215 KEGG:ns NR:ns ## COG: FN1215 COG1385 # Protein_GI_number: 19704550 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 235 1 235 235 340 85.0 1e-93 MLSVLVTEVYDEYILVIDTNDINHIKNVFRKEKGDIVRAVDGSNEYLCEIEEINDKEIKL KIIEKKADKFSLDIELDAAISILKGDKMDLTIQKLTELGINKIIPIAVKRCVVKLDKKKD RWDTIAKEALKQCQGVVPTVVDEIKKIDKLNLKDYDLVLVPYENEEEVFLKDILRNLKVK PSKILYVIGAEGGFEKEEIDFLKSQGAKIISLGKRILRAETAAIVTGGVIINEFF >gi|224461367|gb|ACDC01000035.1| GENE 7 4476 - 5795 397 439 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229207303|ref|ZP_04333755.1| SSU ribosomal protein S12P methylthiotransferase [Nocardiopsis dassonvillei subsp. 
dassonvillei DSM 43111] # 1 399 1 432 480 157 25 1e-37 MSFSKKVAFHTLGCKVNQYETESIKNQLIKRGYEEVPFEDKSDIYIINSCTVTSIADRKT RNMLRRAKKINPEAKVIVTGCYAQTNSREILEIEDVDFVIDNKNKSNIVNFVGAIEDISF EREKNGNIFQEKEYQEYEFATLREMTRAYVKIQDGCNHFCSYCKIPFARGKSRSRKKENI LKEIEKLVEDGFKEVILIGIDLSAYGEDFEEKDSFESLLEDILKIKDLKRVRIGSVYPDK ITDKFIDLFKNKNLMPHLHISLQSCDDTVLKNMRRNYGSALIRESLLKLKSKVKNMEFTA DVIVGFPKEDDSMFQNTRNVIKEIEFSGLHIFQYSDREGTIASNMDGKVDAKTKKQRADS LDQLKQEMILESREKYLGKVLEVLVEEEKEGEYFGYSQNYLRVKFKSEEKNLINQLINTK IKSIEDDILIGEKENFYGN >gi|224461367|gb|ACDC01000035.1| GENE 8 5785 - 6765 1290 326 aa, chain + ## HITS:1 COG:no KEGG:FN1213 NR:ns ## KEGG: FN1213 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 326 11 327 327 462 82.0 1e-129 MATKKKKKRGRAPVLVLVLTAILLVLLFLNFRGNNIKLSKDEKVLIIGKQNLFAIYEDRL AVKIPYELYIDSDETVEDLVSTRNYEQVLEKINSIVPEKLTRYIVIKSGEIKLDVENQRN IPETNIGDKRFILTSSVYAMFKDLYHEKNSVDEQNENILVDVLNANGVGGYARKTGELIK TSLGMKYNAANYETTQDQSYVILNDISKEKAAEILDKLPEKYFKIKTKSTIPTLANIVVI IGSEKNINFKIDIYGADSVLKDATDKVKKLGYTDINTSAAKEGTEQSVIEYNKEDYFIAL RIAKELGITDMIENNELVNRIGVTIK >gi|224461367|gb|ACDC01000035.1| GENE 9 7188 - 9167 2281 659 aa, chain + ## HITS:1 COG:FN1211 KEGG:ns NR:ns ## COG: FN1211 COG0768 # Protein_GI_number: 19704546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 1 631 1 630 657 1041 88.0 0 MKLNKYRDNDVVLGDKRNTREIWFKVIVFLCFFVLFLRLLYLQVLQGNEFSYLAERNQYK LIKIDSPRGKILDSKGKLVVTNGTGYRLIYSLGREENEEYIKEIAKLTDKTEEVVRKRIK YGEIFPYTKDNVLFEDLEEEKAHKLMEIINNYPYLEVQVYSKRKYLYDTVASHTIGYVKK ISEKEYENLKEAGYTPRDMIGKLGIEKTYDDLLRGRNGFKYIEVNALNKIEREVEKVKSP IVGKNLYMGINMELQQYMEEEFEKDGRSGSFVALNPKTGEIITIVSYPTYSLNTFSSQIS PEEWNKISNDPRKILTNKTIAGEYPPGSTFKMISAMAFLKSGIDPKQIYNDYNGYYQVGN WKWRAWKRGGHGPTDMKKSLVESANTYYYKFSDQIGYAPIVKVARDFSLGQKSGIDVPGE KTGIIPDPDWKKKKTKTVWFRGDTILLSIGQGFTLVTPIQLAKAYTFLANKGWAYDPHVV SRIEDVQTGKIETVVTQKTVLTDYPASFYETINDALIATVDQNNGTTKIMKNPYVKVAAK 
SGSAQNPHSKLTHAWVAGYFPADTEPEVVFVCLLEGAGGGGVMAGGMAKRFLDKYLEVEK GIEVVKKTPQTETKKVNTSTTQRNVNNNNSEQGRGEETVNEERETETTSTTSTSEGEEN >gi|224461367|gb|ACDC01000035.1| GENE 10 9106 - 11028 2374 640 aa, chain + ## HITS:1 COG:FN1210 KEGG:ns NR:ns ## COG: FN1210 COG0595 # Protein_GI_number: 19704545 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Fusobacterium nucleatum # 28 638 2 608 608 1014 89.0 0 MKKEKPKQQVQQVQVKEKKTSIKERLKSIKDDVLSLKTKKTKVKDENKNEKPKKKKELKT VKVTEVTQVIESKVKKSKKSKNDLEKMYVIPLGGLEEVGKNCTIVQYKDEIIIIDAGAIF PDENLPGIDLVIPDYSFLENNKSKVKGLFVTHGHEDHIGGIPYLYEKIEKDTVIYGGKLT NALIKSKFENFGVKKNLPKMVEVGSRSKISVGKYFTVEFVKVTHSIADSYSLSIKTPAGH VFVTGDFKIDLTPVDNEKVDFVRLSELGEEGVDLMLSDSTNSEVEGFTPSERSVGDAFRQ EFQKATGRIVVAVFASHVHRIQQIIDTAAQFGRKIAIDGRSLLKVFEIAPSVGRLNIPKN ILIPISAVEQFQDDEVVILCTGTQGEPLAALSRIAKNMHKHIVLREGDTVIISSTPIPGN EKAVSTNINNILRYDVDLVFKKLAGIHVSGHGSKEEQKLMLNLINPKNFMPVHGEYRMLK AHMKSAIETGVPKDKILITQNGDKVEVTKEYAKINGKVNSGEILVDGLGVGDIGSKVIKD RQQLSEDGIVIVAYSIDKQTGKILSGPEMSTKGFVYYKDSEDTMKEAQDLLLKKIRKEET YLGRDWQDLKGDVRDLLSRFFYEKLKRNPIIVPMLLEVES >gi|224461367|gb|ACDC01000035.1| GENE 11 11037 - 11336 242 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Anoxybacillus flavithermus WK1] # 1 94 2 95 97 97 47 9e-20 MNSKKRAFLKKKAHNLEAIVRIGKDGLNQNIIQSILDAIESRELIKVKILQNCEEEKTVI YSKLMDNKEFEVVGMIGRTIIIFKENKEHPTISLEWKNI >gi|224461367|gb|ACDC01000035.1| GENE 12 11350 - 13152 1856 600 aa, chain + ## HITS:1 COG:FN1208 KEGG:ns NR:ns ## COG: FN1208 COG1154 # Protein_GI_number: 19704543 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Fusobacterium nucleatum # 1 600 1 600 600 1033 85.0 0 MSMELTEKCKEIRKQLIEVVSKNGGHLGPNLGVVELTVCLNEVFNFKEDIVLFDVGHQAY VYKILTDRDDKFHTIRTRGGLSPFLDPSESTYDHFISGHAGTALAAGVGFATANPDKKVV 
VIVGDASISNGHSLEALNYISYKKLDNILVIVNDNDMSIGENVGFISKFLKRVISSGKYQ NFREDVKTFINKIKANRLKNTLERMERSLKGYVTPFYALESLGFRFFNISEGNNIEKLLP MFRKVKDLKGPIILLVKTEKGKGYCFAEENKEKFHGIAPFNIETGNTYKSSVSYSEIFGN KILDLAREDTEIYTLSAAMIKGTGLDKFSKEFPERCIDTGIAEGFAVTFAAGLARSQKKP YVCIYSTFIQRAISQLIHDISIQNLPVRFIIDRSGIVGEDGKTHNGIYDLSFFLTIQNFT VLCPTTAKELEEALELSKDFNSGPLVIRIPRDSVFNIEDDKPLEIGKWKEIKKGSKNLFI ATGTMLKIILEIHEELKNRGIDATIVSAASVKPLDENYLLNYIKEYDNIFVLEENYVKNS FATSILEFLNDNGINKLIHRIALDSAIIPHGKRDELLAEERLKGESLIERIEEFVYGRKK >gi|224461367|gb|ACDC01000035.1| GENE 13 13136 - 13975 984 279 aa, chain + ## HITS:1 COG:FN1207 KEGG:ns NR:ns ## COG: FN1207 COG3481 # Protein_GI_number: 19704542 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Fusobacterium nucleatum # 1 274 1 274 274 432 85.0 1e-121 MGEKNNKSKKFIDCLLNFQDVKDLELCDDQGVKVSTHTYDVLNISINKIKEKYVDYEFAS QKIDFFAITVGIIIHDISKSSLRRNEENFSHSQMMIKNPEYIKAEVYSVLELIEKESGYK LIDSVKQNIAHIVESHHGKWGKVQPETEEANLVYMADMESAKYHRINPIQANDILKYSAR GLGLSDIEKKLNCSAAVIKDRIKRAKKELNLRTFSELLDVYKEKGRVPIGDKFFVLRSEE TKKLKKYVDKNGFYNLFMKNPLMEYMIDDKIFKKENEIR >gi|224461367|gb|ACDC01000035.1| GENE 14 13938 - 14750 921 270 aa, chain + ## HITS:1 COG:FN1206 KEGG:ns NR:ns ## COG: FN1206 COG1189 # Protein_GI_number: 19704541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Fusobacterium nucleatum # 1 265 1 266 266 331 75.0 7e-91 MTKFLKKKMRLDEYLCDNEYFEDLEVAKKQIMAGNVIINEQKIDKPGIIISLDKIKTVRI KEKNIPYVSRGGLKLKKAIDVFDLNFKDKIVLDIGASTGGFTDCSLQNGAKLVYAVDVGT NQLDWKLRNHNQVVSIENKHINDLEKSEIKDEIDIIVMDISFISIKKVLYKIKEFLSENS YAVFLIKPQFEAEKEYIDKGIVKDLEIHKKVITDIIEDAKNYDLFLENLTISPIKGTKGN TEYLAKFSKKNIFSDKEIENMINNNIREEK >gi|224461367|gb|ACDC01000035.1| GENE 15 14751 - 16097 1917 448 aa, chain + ## HITS:1 COG:FN1205 KEGG:ns NR:ns ## COG: FN1205 COG0793 # Protein_GI_number: 19704540 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Fusobacterium nucleatum # 
13 441 1 426 427 677 89.0 0 MKVSLRKAAMVLMIAISGLSFSDDDRTGFLSNMRELKEISDIMDVIQDSYVENANAHKNK EEKNKKTPQAAQKSTKVTKKSLMQGALKGMLESLDDPHSVYFTREELRSFQEDIKGKYVG VGMVIQKKVGEPLTVVSPIEDGPAYKAGIKPKDQIVEIDGESTYNLTSEEASKRLKGKAN TSVKVKVYREANKLTKVFELKRETIELKYVKSKMLEGGIGYLRLTQFGDNVYPDMKKALE GLQAKGMKALILDLRSNPGGELGQSIKIASMFIEKGKIVSTRQKKGEETVYSREGKYFGN FPMVVLINGGSASASEIVSGALKDYKRATLIGEKSFGKGSVQTLLPLPDGDGIKITIAKY YTPNGISIDGTGIEPDKKVEDKDYYLISDGTITNIDENQQKENKKEIIKEVKGEKAAKEV DTHKDIQLEAAIKFLNTPTQKNIPSPKK >gi|224461367|gb|ACDC01000035.1| GENE 16 16108 - 16806 985 232 aa, chain + ## HITS:1 COG:FN1204 KEGG:ns NR:ns ## COG: FN1204 COG0313 # Protein_GI_number: 19704539 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Fusobacterium nucleatum # 1 232 1 235 235 410 94.0 1e-115 MLYIVATPIGNLEDMTFRAIRTLKEVDYIFAEDTRVTRKLLDHYEIKNTVYRYDEHTKQH QVANIINLLKEEKNIALVTDAGTPCISDPGYEVVDEAHKNNIKVVAIPGASALTASASIA GISMRRFCFEGFLPKKKGRQTLLKQLAEEKERTIVIYESPFRIEKTLRDIETFMGKREVV IVREITKIYEEVLRGSTTELIEKLEKNPIKGEIVLLVEGQQKGGNKYVDDTD >gi|224461367|gb|ACDC01000035.1| GENE 17 16787 - 17665 1281 292 aa, chain + ## HITS:1 COG:FN1203 KEGG:ns NR:ns ## COG: FN1203 COG1161 # Protein_GI_number: 19704538 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 289 1 289 289 476 93.0 1e-134 MSMTQINWYPGHMKKTKDLIEENLKLIDVVLEIVDARIPLSSKNPNIASLSKNKKRIIVL NKSDLLEKKELEVWKKYFKEQDFADEVVEMSAETGYNLKKLYEAIEFVSKERKEKLLKKG LKKVSTRIIVLGIPNVGKSRLINRIVGKNSAGVGNKPGFTRGKQWVRIKEGIELLDTPGI LWPKFESETVGVNLAISGAIRDEILPIEDIACSLIRKMLTQGRWTSLKDRYKLLEEDRDD EIMENILSKIALRMAMLNKGGELNVLQAAYTLLRDYRAAKLGKFGLDEIKEV >gi|224461367|gb|ACDC01000035.1| GENE 18 17668 - 18444 1259 258 aa, chain + ## HITS:1 COG:FN1202 KEGG:ns NR:ns ## COG: FN1202 COG0171 # Protein_GI_number: 19704537 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Fusobacterium nucleatum # 1 258 1 258 258 441 89.0 1e-124 
MDKLDLNMKEVHKELVDFLKENFKKNGFSKAILGLSGGIDSALAAYLLRDALGKENVLAI MMPYKSSNPDSLNHAKLVVEDLGINSKVIEITDMIDAYFKNEKDSTSLRMGNKMARERMS ILYDYSSKENALVIGTSNKTEIYLGYSTQFGDSACAFNPIGDLYKTNVWELSRYLNIPKE LIEKKPSADLWEGQTDEQEMGLTYKEADQVLYRMLEENKTVEEILNEGFDKSLVENIVRR MNRSEYKRRMPLIAKIKR >gi|224461367|gb|ACDC01000035.1| GENE 19 18451 - 18822 527 123 aa, chain + ## HITS:1 COG:no KEGG:FN1201 NR:ns ## KEGG: FN1201 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 122 3 124 124 147 84.0 2e-34 MQNQLKEDLNDFLKEKEELREVIGKIGGSNNSQAKIITSLFMGIVLVIFVTGIILKQLSP MTTLLLLLLIISFKIIWMLQQMQKSMHFQFWVLNSIEIRINELDKRQKKIEKILEGLEDK KEE >gi|224461367|gb|ACDC01000035.1| GENE 20 18856 - 19662 833 268 aa, chain + ## HITS:1 COG:no KEGG:FN1200 NR:ns ## KEGG: FN1200 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 268 1 259 259 450 89.0 1e-125 MKKKLVLLSMLALSVSSFAAKPSLPKSYTVRYTHNFGRIDGFVQIPKGGQFNTTTDRRPT FDELDIKNINYPELFVGAKWDNFGVYYGMKYKSFKGSATLNEDLKTHDIQLRKGDKISSK HLYAFYNLGFSYDFKVNHKFTLTPKIEFSLFQFSYKFSSSGSNNVSNDERKFNAGGVRVG GEANYQFTEDFGLRFDIMTHIPHDSIKSSLDTSLTASYNLYRSGNTEINAIAGIGYDSFK YRDRQKDMQNFMDSKTKPVYKLGVELKF >gi|224461367|gb|ACDC01000035.1| GENE 21 19725 - 19949 324 74 aa, chain - ## HITS:1 COG:FN0538 KEGG:ns NR:ns ## COG: FN0538 COG1314 # Protein_GI_number: 19703873 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Fusobacterium nucleatum # 1 74 1 74 74 111 95.0 3e-25 MSTLLNVLLFLSAFILIVLVLIQPDRSHGMTASMGMGASNTIFGINKDGGPLAKATEVVA TLFIVCSLLLYLTR >gi|224461367|gb|ACDC01000035.1| GENE 22 20091 - 20378 395 95 aa, chain + ## HITS:1 COG:FN1335 KEGG:ns NR:ns ## COG: FN1335 COG1862 # Protein_GI_number: 19704670 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Fusobacterium nucleatum # 1 94 1 94 94 124 84.0 6e-29 
MQEIFAKYGSTIGLVVLWIGVFYFLLIRPNKKRQKEQQNLLNSLKEGTEVITIGGIKGTI AFVGEDYVELRVDKGVKLTFRKSAIANVISNNNQQ >gi|224461367|gb|ACDC01000035.1| GENE 23 20445 - 21473 1336 342 aa, chain + ## HITS:1 COG:FN1334 KEGG:ns NR:ns ## COG: FN1334 COG0860 # Protein_GI_number: 19704669 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Fusobacterium nucleatum # 5 342 1 338 338 505 79.0 1e-143 MKKKLITAFFFFLLSVLSFSAQVKDVRFRNNTCSISLNAREGEYLVSADEESRLIYIEIQ NLDSSSCEKFTKNLEYDIRDSNLFEDVVIDKTRDSVSITLQVAPKVGYVMDATNNRIDVN FHRTTKNKHLIVIDPGHGGKDSGAIRGSVVEKKIVLSVGTFLKEELSKDFNVVMTRDSDV FVVLSQRPKMANKSNAKLFVSIHANASESKNANGVEVFYFSKKSSPYAERIANFENTIGE QYGDSSDKIIQISGELAYKKNQENSIRLARKIAENISSGLALKNGGVHGANFAVLRGFNG TGVLIELGFVSNSYDAAILVDRDSQQKMAEEIAKSIKEYLTR >gi|224461367|gb|ACDC01000035.1| GENE 24 21478 - 21903 618 141 aa, chain + ## HITS:1 COG:no KEGG:FN1333 NR:ns ## KEGG: FN1333 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 141 9 138 138 142 61.0 3e-33 MSRNKKVTFKSTAILLGILIILVAIKILMPSKDKIGEIEVRKVEVKAEELVKIPAYAVDK DSDSPRKYAISTKEAATSDLLQVAVQDMTKNYSEDLELKNIYFSDSAVYYEFNKKDLSEG FMQALQMVTEEIMGISEINFI >gi|224461367|gb|ACDC01000035.1| GENE 25 21919 - 22992 1648 357 aa, chain + ## HITS:1 COG:FN1332 KEGG:ns NR:ns ## COG: FN1332 COG0216 # Protein_GI_number: 19704667 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Fusobacterium nucleatum # 1 357 9 365 365 590 98.0 1e-168 MFDKLEEVVARYEELNQMLVSPEVLADSKKMIECNKAINEITEIVEKYKEYKKYVDDIEF IKESFKTEKDPDMKEMLNEELKEAEEKLPSLEEELKILLLPKDKNDDKNVIVEIRGGAGG DEAALFAADLFRMYSRYAERKKWKIEIIEKQDGELNGLKEVAFTIIGLGAYSRLKFESGV HRVQRVPKTEASGRIHTSTATVAVLPEVEDVQEVTVDPKDLKIDTYRSGGAGGQHVNMTD SAVRITHLPTGIVVQCQDERSQLKNREKAMKHLLTKLYEMEQEKQRSEVESERRLQVGTG DRAEKIRTYNFPDGRITDHRIKLTVHQLEAFLDGDIDEMIDALITFHQAELLSASEQ >gi|224461367|gb|ACDC01000035.1| GENE 26 23031 - 24140 1405 369 aa, chain + ## HITS:1 COG:FN1331 KEGG:ns NR:ns ## 
COG: FN1331 COG2890 # Protein_GI_number: 19704666 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methylase of polypeptide chain release factors # Organism: Fusobacterium nucleatum # 17 369 1 353 354 523 84.0 1e-148 MKKYSFSKPRLEAEKLVSYVLNLDRIALYIHYERELTEEEKTSIKQFLKQMVEEKKSFDE IKGEKKDYKTENLDIFNKSVEYLKKNGVPSALVDTEYIFSEALKVSRNTLKYSMSREIKE EDKNKIREMLMLRAKSRKPLQYILGEWEFYGLPFKVRENVLIPRPDTEILVEQCIQLMRE IEEPNILDIGSGSGAISIAIANELKSSSVTGLDINEDAIRLANENKVLNKVENVNFMKSD LFEKLDEDFKYDLIVSNPPYITKEEYETLMPEVKNFEPKNALTDLGDGLHFYREISKKAE SYLKDTGYLAFEIGYKQAKEVSKILEDNNFAILSVVKDYGGNDRVVLAKKAIKADNFEEI EEEENVDLS >gi|224461367|gb|ACDC01000035.1| GENE 27 24124 - 25155 1264 343 aa, chain + ## HITS:1 COG:FN1330 KEGG:ns NR:ns ## COG: FN1330 COG0809 # Protein_GI_number: 19704665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Fusobacterium nucleatum # 1 343 9 351 351 601 92.0 1e-172 MSTYLSDYDYFLPEELIGQKPREPRDSAKLMLINRKTGEIEHKHFYNIIDYLQKGDVLVR NATKVIPARIYGHKESGGVLEVLLIKRISIDTWECLLKPAKKLKLGQKLYIGENKELIAE LLEIKEDGNRILKFYYEGSFEEVLDKLGSMPLPPYITRKLENKDRYQTVYAQRGESVAAP TAGLHFTEELLKKISEKGIEIIDIFLEVGLGTFRPVQTENVLEHKMHEESFEISEKAAKA INEAKAQGRRIISVGTTATRALESSVDENGKLIAQKKDTEIFIYPGYRFKIVDALITNFH LPKSTLLMLVSALYDREKMLEIYNLAVKEEYHFFSFGDSMFIY >gi|224461367|gb|ACDC01000035.1| GENE 28 25167 - 25715 322 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 180 13 192 199 128 36 5e-29 MRIIAGEAKNRIIKTRKGFDTRPTLESVKESLFSIIAPYVENSVFLDLFSGSGSISLEAV SRGAKRAVMIEKDGEALKYIIENIDNLGFTDRCRAYKNDVVRAVEILGRKKEKFDIIFMD PPYQDNITTKVLKAIDKADILADDGLIICEHHLFEDLEDNIASFRKTDERKYNKKILTFF TK >gi|224461367|gb|ACDC01000035.1| GENE 29 25972 - 26199 418 75 aa, chain + ## HITS:1 COG:FN1328 KEGG:ns NR:ns ## COG: FN1328 COG1722 # Protein_GI_number: 19704663 # Func_class: L Replication, recombination and repair # Function: 
Exonuclease VII small subunit # Organism: Fusobacterium nucleatum # 6 75 1 70 70 73 85.0 1e-13 MKGVEMAKNTFEENLENLDEIIEKLESGELSLDDAIKEYENAMKLIKTASKMLNEAEGRL IKVIEKNGEIETEEI >gi|224461367|gb|ACDC01000035.1| GENE 30 26201 - 27097 1263 298 aa, chain + ## HITS:1 COG:FN1327 KEGG:ns NR:ns ## COG: FN1327 COG0142 # Protein_GI_number: 19704662 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 3 297 2 296 297 471 85.0 1e-133 MNSDFQVYLKEKTNFFETELKKELKELSYPETIAKGMEYALLNGGKRLRPFLLFTTLELL NQDIQKGVKSAIGIEMIHSYSLVHDDLPALDNDDYRRGKLTTHKVFGEAEAILIGDALLT YAFYMLSEKNLNILSFEQITKIISKTSAYSGVNGMIGGQMIDIESENKKINLETLKYIHK HKTGKLIKLPIEIACIIADVSEDKRLVLEEYAELIGLAFQVKDDILDIEGTFEDLGKPVG SDDDLHKATYPSILGMEESKKILNETVERAKKIIHNMFGEEKGKILISLADFIRERKS >gi|224461367|gb|ACDC01000035.1| GENE 31 27168 - 28184 1599 338 aa, chain + ## HITS:1 COG:SP1069 KEGG:ns NR:ns ## COG: SP1069 COG2984 # Protein_GI_number: 15900938 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 41 338 47 344 344 260 48.0 3e-69 MKKSVLFFGALLIIVLGYYFLNNKKDNSQEQVAQEKAQVTEEKVINVGVLQLLSHPALDS IYKGMVEELARQGYEDGKNIRIDLQNAQGEQSNLALMSEKLVSEKNDILVGITTPATLSL ANATKDIPIIMAGITYPVEAGLIASEEKPGNNITGVSDRTPIKQQLELMKEIIPNLKKIG LLYTSSEDNSIKQIEEAKKYAAELGLEVKLASIANSNDIQQVTESLASEVQAIFVPIDNT IASAMATVVKVTDKFKIGVFPSADTMVADGGVLGLGVDQYQIGVETAKVIVDVINGKKPA DTPIVLANEGVIYLNEAKAQELGIEIPATIKEKAQIVK >gi|224461367|gb|ACDC01000035.1| GENE 32 28281 - 29165 1100 294 aa, chain + ## HITS:1 COG:SP1070 KEGG:ns NR:ns ## COG: SP1070 COG4120 # Protein_GI_number: 15900939 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 3 294 1 288 288 213 49.0 2e-55 MDLIISAISQGLLWSLLSLGLFISFRVLNIADMTTEGSYPLGAAVCVMLIQSGYPPLTAT 
IIAILVGSLAGLVTAIFINICKIPSLLAGILTMTALLSVNLRIMKRPNLSLLNKETIFDN LSKLNLPPYFDIILLGLIVISIVILAMHLFFDTELGQALIATGDNPKMATSLGISTKKMT TLGLMLSNSLIALTGAILSQNNGYADVNSGLGVIVVALAAIIIAEVIFTDVNFLTRLVCI VFGSMIYRLLLVFVLKLNVIQANDFKLVSALLIALFLSVPELKKFSLKLGKGDK >gi|224461367|gb|ACDC01000035.1| GENE 33 29165 - 29932 172 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 24 237 28 245 563 70 26 1e-11 MPYIELKNINKVFNPNSNREHHALKNINLVINKGDFITIIGGNGAGKSTLFNAISGVFPL DSGSISINDVEISSTKEFERAKYISRVFQNPLDNTAPRMTVAENMALALNRGERRTLKFS KNKDNIALFENLLKNLNLGLEQKLNTEMGVLSGGQRQAIALLMATMKAPELILLDEHTAA LDPKTQKKIMLLSEEKVKEKNLTALMITHNLQDALTYGNRMLLLHQGEIVRDFSEEEKKK LSVTDLYKIMVDLDE >gi|224461367|gb|ACDC01000035.1| GENE 34 30008 - 30700 1154 230 aa, chain + ## HITS:1 COG:no KEGG:FN0602 NR:ns ## KEGG: FN0602 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 230 1 231 236 375 82.0 1e-103 MRIRSVETAIRADVSRNIPNGVDALGIFDNLVQPIFPFPVESLSIILSFSEMEGPTMFQV RINAPNDDLVSKGDFGVLPDQFGYGRKVINLGGILISERGKYTIDIFELGVDKKLKFIKT RRLFFADYPPQREFTEAEKKAILEDESLIRVVKTEFKPFEFANDDTVKPIKLQISLDDSV PLEEGYIAVPEDNTILVKGKKFDLTGMRRHVEWMFGKPIPKQEEEPDEEK >gi|224461367|gb|ACDC01000035.1| GENE 35 30710 - 32014 1504 434 aa, chain + ## HITS:1 COG:FN0313 KEGG:ns NR:ns ## COG: FN0313 COG0144 # Protein_GI_number: 19703658 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Fusobacterium nucleatum # 1 434 1 435 435 627 82.0 1e-179 MSVKYVAMKLISFVDKGSYSNIVLNDAFKEFYLTAKEKAFITEIFYGVLRNKNFLDYMIE KNTKVIKKEWIRNLLRISIYQLTFMSSDAKGVVWEATEIAKKHGIAISKFINGTLRNYLR NKDLEIKKLHDEKNYEILYSIPQYFCDILEKQYGSENLNQAIISLKKIPYLSVRVNKLKY SEEEFEEFLKEKDIQIIKKVDSVYYVNSGLIINSKEFKEGKIIAQDASSYLAAKNLGVKP NDLVLDICAAPGGKTAVLAEEMKNKGEIIAIDIHQHKKKLIEENMKKLGIDIVKATVLDA RNVNKQGRKFDKILVDVPCSGYGVIRKKPEILYTKNRENIEELASLQLEILNSAADILKD GGELIYSTCTIISQENTENVEKFLNERKEFKVKALNIPENVSGEYDKLGGFSINYKEEIM DNFYIIKLVKEEKC 
>gi|224461367|gb|ACDC01000035.1| GENE 36 32008 - 32652 723 214 aa, chain + ## HITS:1 COG:FN0314 KEGG:ns NR:ns ## COG: FN0314 COG4122 # Protein_GI_number: 19703659 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 213 1 213 215 311 86.0 5e-85 MLEELKEANSYISSKIDKYRSRSLLIKEIEEDAEINNVPIISKEIREYLKFIIKSNKNIK NILEIGTATAYSGIIMAEEIQDRNGCLTTIEIDEDRFKIAKSNFEKANLKNIEQILGDAT EEIEKLNKNYDFIFIDAAKGQYKKFFEDSYKLLNQGGLVFIDNILFRGYLYKESPKRFKT IVKRLDEFIEYLYENFEDVTLLPISDGVMLVNKN >gi|224461367|gb|ACDC01000035.1| GENE 37 32794 - 33570 956 258 aa, chain + ## HITS:1 COG:FN1141 KEGG:ns NR:ns ## COG: FN1141 COG2116 # Protein_GI_number: 19704476 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Fusobacterium nucleatum # 1 257 1 256 256 377 83.0 1e-105 MADGHKTPTELVDYIIKVGIDKATKPLFKLMLLGIFGGAFIALGGAGNIISSSTLVKTDP GFAKFLGAAVFPVGLILVVTLGAELFTSNCLLSVAFVNKKISFAQMIRNLVTVYLFNYVG SFIVAYITVKGGSFNADSLAYLQNIATHKVDASAYALFIKGILCNVLVCGAVIQSYTSRD TIGKLVGAWLPIMLFVLIGYDHSIANMFYLTAAKLTDTSLFGVSGILYNLFYVTLGNILG ALAIGLPLYFSYYKKSDN >gi|224461367|gb|ACDC01000035.1| GENE 38 33617 - 34081 714 154 aa, chain - ## HITS:1 COG:FN0673 KEGG:ns NR:ns ## COG: FN0673 COG2606 # Protein_GI_number: 19704008 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 154 1 154 154 260 94.0 8e-70 MSIEAVRKHLEKYGLDSKIKEFTESTATVEEAAKVNSCEPARIAKSLSFIINDIPTIIVV AGDAKINNQKFKAKFKTKAKMIAGSDVENLIGHPIGGVCPFGIKDNVKVYLDESMKRFET MLPACGTPNSAIELTLEELEKASNYIEWIDVCQI >gi|224461367|gb|ACDC01000035.1| GENE 39 34103 - 35419 1438 438 aa, chain - ## HITS:1 COG:FN0672 KEGG:ns NR:ns ## COG: FN0672 COG1373 # Protein_GI_number: 19704007 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 438 1 438 438 742 91.0 0 MLKKENELKNRSHYLEKLIEFKDTDFVKIITGIRRCGKSSLMKLMIKHLLDNGIEKNQII 
QINFESMEFKRMTVEDLYNYVKSNLPKDKKAYLFFDEIQKVSEWQDAINSFRVDFECDIY ITGSNAFLLSSEYATYLAGRSIEIKVYPLSFIEFIDFHGYKIIEKKSLTGGISRKVENEN GEAYEIKEFFDAYITFGGMPSLTELPLEIDKALTILDGIYSSVVIRDILEREKQKDRRQV TDSSLLRKIIMFLADNIGNNTSINSISNVLLNEKLIETKPAVQTVQSYVATLLEAYVFYE IKRFDIKGKEFLKTLGKYYIVDIGLRNYLLGFRNRDIGHIIENIVYFELLRRGYDVAIGK IGDNEINFIATNANTKIYIQVTENIANSSTRERELTPFYKIQDNFEKIVITNDESYLGVH DGIKIIRLVDFLLDENIL >gi|224461367|gb|ACDC01000035.1| GENE 40 35566 - 38949 3634 1127 aa, chain - ## HITS:1 COG:MA2418 KEGG:ns NR:ns ## COG: MA2418 COG4096 # Protein_GI_number: 20091249 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Methanosarcina acetivorans str.C2A # 1 1081 1 1093 1146 673 37.0 0 MSNFDFLKDEFIDLYELCLEAEKNCYIKPRTSAFYSRLALEFCVGLVYKFEKIQTSYNEM SLNDLINKKEFKDLFQDESQIAGLNLIRKFGNDAAHMLKNIISNADRNLSLNKDIALNCL KGIFDFTVWIAYCYGSTLKTDDIKFDEKYILHSSSEEENINDIKLTDNDVKNNIEKIEVV PTKKHNTRINNNNFSEKETRKLFIDFLLMKAGWNLNDKNMFEYEVEGLKSTSSGKGNIDY VLWGDSAYPLAIIEAKKASYNAKKGEFQALEYAEALERKFNFFPIRFVTNGFEIFIYENK NSIPRRIYGFYRKEELLKIIARRNEKITSNDISINKKIIDRYYQERAVKKAIENYISGNR KSLLVMATGSGKTRVAISLVDCLSRLNMVKRTLFLADRVALVKQALNSFKNSLPDYTLVD LVAEKDRDNAKIVFSTYQTMMTESEKSREDGTNKYGVGAFDLIIVDEAHRSIYQKYGDLF EYFDSLILGLTATPKNEIDRNTFKVFDMNSKEPTDSYDLFEAAKDEFLVLPKIKEVSLNY PENGIVYSKLSEEEKEKYETLFDEEDSMPEEISGDSLNSWFFNEGTTSKVLTTLMEEGYK IESGDKLGKTIIFAKNDKHAEHIVETFNKLYKNLDGEFCQKITTKVEKAQTLIERFVDPN SLPQIAVSVDMLDTGIDVPQILNLVFYKKVKSKAKFWQMIGRGTRKCKDVYGPGQDKKDF LILDFCRNFSYFEMYGSFDEDNTKLGKSLSSRIFENKVKMIYKLQNLEYQMDENYKKLWE DLVNEIYDLIASLNEENISVRTKISYVKKYKNIDVLRNLEEKNVDEIIKNLSSLPFPVTE KTEMEKKFENLILKIQLKLFDNKKVENEKMEIFDIAKGLAKKGTIKEIQKNTDYIMKLIK DENYLKNIDILELKNLKDIIEPLTIFIDADGKHLNYVIGDFEDTCISTEVKDINTFASAY INSKAKFQKYLDKNKELLSIKKLRNNIELDEEDLKELKQLLYSNEEVSLESLKNENNTEI EKISSLYGKKESFGIFIRSLIGLDREAINKEFSEFLNKEKFNSNQIELINLIIENIVKYG AYSKSEIPKLSNDILGTSIFNLFTDENDLQKIVNIIDKINSNVPKLL >gi|224461367|gb|ACDC01000035.1| GENE 41 39012 - 40001 1148 329 aa, chain - 
## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 9 328 8 320 321 386 65.0 1e-107 MVDTLILDIKQAMSSTLTNGQMEKLHKVLAHYLYDLEIVKKEGADRDEKQNIEYLEAFLS AKHVEGCSRKSLKYYKATIENLFKKIDKSIKHITTNDLREYLDNYQKEGNASKITIDNIR RIFSSFFAWLEEEDYILKSPVRRIHKVKTGTVVKETYSDEAMEIMRDNCKSLRDLAIIDI LASTGMRVGELVKLNIEDIDFEGRECVVFGKGDKERKVYFDARTKIHLHNYLKTRDDDNS ALFVSLLKPHKRLQISGVEIMLRELGKKLNITKVHPHKFRRTLATKAIDKGMPIEQVQQL LGHQKIDTTLQYAMVSQNNVKISHRKYIG >gi|224461367|gb|ACDC01000035.1| GENE 42 40047 - 40754 692 235 aa, chain - ## HITS:1 COG:pli0008 KEGG:ns NR:ns ## COG: pli0008 COG3177 # Protein_GI_number: 18450294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 8 235 5 242 254 182 41.0 7e-46 MRKIKILEQFTEEYIDDLNVRMTHHSNAIEGNTLTLNETATIILDDTIPNAMSKREFLEV LNHSDALKFLLAELQNNTVDIYMIKEINKILLSRLNHNAGNFKTDYNYIRGANFETASPS ETPYKMNEWFENMNFQLKNSNSDIEKIKIILEYHIKFERIHPFSDGNGRTGRLIMLALML ENNLTPFVITVENRAKYMDILRNQDIENFVSLVEPLIEEEKKRIIAFKKSASLQI >gi|224461367|gb|ACDC01000035.1| GENE 43 40741 - 41289 474 182 aa, chain - ## HITS:1 COG:MJ1531 KEGG:ns NR:ns ## COG: MJ1531 COG0732 # Protein_GI_number: 15669726 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 17 159 251 394 425 130 53.0 2e-30 KFNKNGVEKQLNDVADIIMGQSPLSQSYNKDKKGLPFYQGKTEFSDIYIKEPTVYCNSPI KVVEENDILMSVRAPVGDVNIATQKSCIGRGLASIKPKKIDYLYLFYLLKEQKSKIEKIG VGSTFKAINKNNISTLKISIVEKDKQNKIRNYLSSIEKLKFTIMTIILKAYKTMKKRGVE KN Prediction of potential genes in microbial genomes Time: Thu May 19 22:58:35 2011 Seq name: gi|224461366|gb|ACDC01000036.1| Fusobacterium sp. 
2_1_31 cont1.36, whole genome shotgun sequence Length of sequence - 33942 bp Number of predicted genes - 35, with homology - 34 Number of transcription units - 14, operones - 9 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 27/0.000 - CDS 3 - 564 442 ## COG0732 Restriction endonuclease S subunits 2 1 Op 2 . - CDS 551 - 2047 2008 ## COG0286 Type I restriction-modification system methyltransferase subunit - Prom 2151 - 2210 13.4 - Term 2231 - 2272 6.7 3 2 Tu 1 . - CDS 2281 - 2721 534 ## Vpar_0716 hypothetical protein - Prom 2958 - 3017 7.8 + Prom 3147 - 3206 8.5 4 3 Op 1 1/0.000 + CDS 3250 - 3699 629 ## COG3682 Predicted transcriptional regulator 5 3 Op 2 . + CDS 3719 - 5893 4208 ## COG2217 Cation transport ATPase + Term 5974 - 6041 17.1 - Term 6551 - 6607 5.7 6 4 Tu 1 . - CDS 6665 - 7087 376 ## COG0394 Protein-tyrosine-phosphatase - Prom 7144 - 7203 2.5 - Term 7244 - 7280 2.6 7 5 Tu 1 . - CDS 7332 - 7490 258 ## CHAB381_0336 hypothetical protein - Prom 7585 - 7644 6.7 - Term 7609 - 7647 4.1 8 6 Op 1 10/0.000 - CDS 7650 - 8507 777 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 9 6 Op 2 42/0.000 - CDS 8504 - 9421 1075 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 10 6 Op 3 25/0.000 - CDS 9429 - 10115 271 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 11 6 Op 4 . - CDS 10138 - 11013 1306 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin - Prom 11074 - 11133 11.1 12 7 Tu 1 . - CDS 11152 - 12354 1307 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog - Prom 12428 - 12487 14.6 + Prom 12392 - 12451 7.9 13 8 Op 1 . + CDS 12476 - 12889 600 ## COG2510 Predicted membrane protein 14 8 Op 2 . + CDS 12952 - 13272 280 ## COG0534 Na+-driven multidrug efflux pump + Term 13303 - 13350 1.3 - Term 13291 - 13338 1.3 15 9 Op 1 . - CDS 13361 - 14950 2388 ## COG3653 N-acyl-D-aspartate/D-glutamate deacylase 16 9 Op 2 . 
- CDS 14964 - 16436 2012 ## COG0591 Na+/proline symporter 17 9 Op 3 4/0.000 - CDS 16495 - 17679 1863 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 18 9 Op 4 . - CDS 17692 - 18900 1612 ## COG1171 Threonine dehydratase - Prom 19055 - 19114 14.6 + Prom 19126 - 19185 15.9 19 10 Tu 1 . + CDS 19207 - 20922 1454 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 20971 - 21030 10.7 20 11 Op 1 . + CDS 21076 - 22266 1599 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 21 11 Op 2 . + CDS 22291 - 22620 480 ## FN1153 hypothetical protein 22 11 Op 3 1/0.000 + CDS 22696 - 23886 799 ## COG1295 Predicted membrane protein 23 11 Op 4 1/0.000 + CDS 23870 - 26032 2512 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 24 11 Op 5 4/0.000 + CDS 26033 - 28348 2229 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 25 11 Op 6 . + CDS 28358 - 28882 751 ## COG0242 N-formylmethionyl-tRNA deformylase 26 11 Op 7 . + CDS 28875 - 29186 291 ## FN1158 hypothetical protein 27 11 Op 8 . + CDS 29183 - 30223 1785 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins 28 12 Op 1 . - CDS 30331 - 30624 523 ## gi|237740382|ref|ZP_04570863.1| predicted protein 29 12 Op 2 . - CDS 30639 - 30929 464 ## gi|237740383|ref|ZP_04570864.1| predicted protein 30 12 Op 3 . - CDS 30984 - 31151 320 ## - Prom 31185 - 31244 10.3 + Prom 31150 - 31209 10.8 31 13 Op 1 . + CDS 31275 - 31514 324 ## gi|237740385|ref|ZP_04570866.1| predicted protein 32 13 Op 2 . + CDS 31516 - 32325 1070 ## SSUBM407_p004 toxin of epsilon-zeta postsegregational killing system + Prom 32487 - 32546 14.2 33 14 Op 1 . + CDS 32686 - 33207 625 ## gi|237740387|ref|ZP_04570868.1| predicted protein 34 14 Op 2 . + CDS 33229 - 33672 635 ## gi|237740388|ref|ZP_04570869.1| predicted protein 35 14 Op 3 . 
+ CDS 33752 - 33941 221 ## gi|294782520|ref|ZP_06747846.1| lipoprotein Predicted protein(s) >gi|224461366|gb|ACDC01000036.1| GENE 1 3 - 564 442 187 aa, chain - ## HITS:1 COG:SP0508 KEGG:ns NR:ns ## COG: SP0508 COG0732 # Protein_GI_number: 15900422 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Streptococcus pneumoniae TIGR4 # 9 167 354 511 522 87 37.0 1e-17 MKIFNKNEWKKVKLGDVFDLQMGKTPLRENKLYWDKGEYHWISISDMNFSEKYISSTKEK ITELAVKKSGIKIIPKNTVIMSFKLSIGKVKIVNEDIYSNEAIMAFIPKTNNFIDENFLY YSLKGVRWNEGINKAVKGLTLNKALISQKEIFLPNLAIQKEIASNLDSIADFLNLRRKQL NYLEELS >gi|224461366|gb|ACDC01000036.1| GENE 2 551 - 2047 2008 498 aa, chain - ## HITS:1 COG:SP0886 KEGG:ns NR:ns ## COG: SP0886 COG0286 # Protein_GI_number: 15900769 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pneumoniae TIGR4 # 1 494 1 496 497 535 57.0 1e-151 MITGEIKNKVDKMWEYFWVGGLTNPVDVIEQLTYLIFMKRLDQEEQRKEKEQKLGNIFGN FDEKFIFGENHQDIRWSNLIQLGDPKQLYDKVRNEAFEFIKNLDEDKDSVFSQYMENAIF KVPTPAVLQNTMDTIEEIFNNPQMVEDKDTKGDLYEYLLSKLSTSGKNGQFRTPKHIINM MVELMKPTVEDKIIDPACGTSGFLVSSIEYIKRNFKDILATSPEIYKYFSTSMIHGNDTD ATMLGISAMNLLLHDMKTPKLKRIDSLSTDYSEESDYTLILANPPFKGSVDEALLSNTLT RVVKTKKTELLFIALFLRLLKIGGRGAVIVPDGVLFGASNAHKNLRKELIENNQLEAVIS MPSGVFKPYAGVSTGILIFTKTGKGGTDNVWFYDMTADGYSLDDKRNPVEENDIPDIIER FSNLENEKDRKKTDKSFFVPKQEIVDNDYDLSINKYKEIIYEKVEYEEPKVILEKLEELS KSIDEKLKELKVMLDEDI >gi|224461366|gb|ACDC01000036.1| GENE 3 2281 - 2721 534 146 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0716 NR:ns ## KEGG: Vpar_0716 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 146 1 146 146 172 56.0 3e-42 MVWLNSSSAGKQNYLELRLNAPKGERILLDFNPLKTSDIPDWEAKWKDWHCYSNPLRIYL QDYEILLPYFKKIYPFVDASDGTLRQELDLCFDNWIEKNDWLKIINEIENNLKHISDSEK KFLSDFIEWLKEALKHTTIIVVEGNL >gi|224461366|gb|ACDC01000036.1| GENE 4 3250 - 3699 629 149 aa, chain + ## HITS:1 COG:SPy1717 KEGG:ns NR:ns ## COG: SPy1717 COG3682 # 
Protein_GI_number: 15675568 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 10 145 4 139 144 110 38.0 1e-24 MSTIEFKTYITDAEWEVMRVVWANDRVTSKKIISVLKEKMDWTQSTIKTILGRLVEKGVL NTEQEGRKFIYTANIEEKEAVRDYVEDIFNRICNKKVGNVIGSIIEDHVLSFDDIDRLEK ILEMKKSFAVEEVDCNCPEGQCECHLHHR >gi|224461366|gb|ACDC01000036.1| GENE 5 3719 - 5893 4208 724 aa, chain + ## HITS:1 COG:AF0152 KEGG:ns NR:ns ## COG: AF0152 COG2217 # Protein_GI_number: 11497769 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Archaeoglobus fulgidus # 74 716 43 689 690 517 45.0 1e-146 MNNDNRYSSHNHHDYDDMDKLKHEHGITDEHGCHKDKHEEHHAGHDHSGHSGHDHSGHEG HDHSGHGGHNHSGHSGHAHHHHGSFKELFLKSLPLGIIIMFFSPLHGFKLPFQFTFPYSD IVVAILSTILIIYGGRPFYQGAVDEFKQKKPGMMALVSLGISVSYLYSIYAVIITYVTGE HVMDFFFEFASLLLIMLLGHWIEMKAIGEAGDAQAALAKLVPKDAHVVLEDDSIETRPVA DLKAGDLIRVQAGENVPADGIIERGESRVNEALLTGESKAVKKGPGDEVIGGSTNGEGVL YIKVLETGDKSFISQVQTLISQAQSQPSRAENIAQKVAGWLFYIAVIVALIAFVVWMIIG DIPTAVIFTITTLVIACPHALGLAIPLVTARSTSLGASRGLLVKDRQALEIAQDADVIIL DKTGTLTTGEFKVLDVKLLNDKYTKEEIIALLAGIEGGSSHPIAQSIIRFAEQQGIRPAS FDSIDVISGAGVEGKAGGHRYQLVSQKAYGRNLDMDIPKGATLSVLVENDDAIGAVALGD ELKPTSKELIKALKKNNIQPIMATGDNEKAAQGAAEVLGIEYRSNQSPQDKYELVKTLKD EGKKVIMVGDGVNDAPSLALADVGIAIGAGTQVALDSADVILTQSDPGDIESFIELAHKT TRKMKQNLFWGAGYNFIAIPIAAGILAPIGITLSPALGAILMSVSTVIVAINAMLLSLDP KNNG >gi|224461366|gb|ACDC01000036.1| GENE 6 6665 - 7087 376 140 aa, chain - ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 3 137 2 136 136 160 57.0 9e-40 MRKAKVAFICVHNSCRSQIAEALGKKYASDVFDSYSAGTQIKNQINQDAVRLMKDIYNID MEKTQYSKLISDLPDIDISITMGCDVVCPIVENQYTEDWNLEDPTGQDDAFFREIISKIE KNIKNLSLRIKKNEISLNEK >gi|224461366|gb|ACDC01000036.1| GENE 7 7332 - 7490 258 52 aa, chain - ## HITS:1 COG:no KEGG:CHAB381_0336 NR:ns ## KEGG: CHAB381_0336 # 
Name: not_defined # Def: hypothetical protein # Organism: C.hominis_BAA-381 # Pathway: not_defined # 1 39 1 39 133 62 87.0 7e-09 MRLFGKKEKRENDCTCGGNCEPVNNKEVKETSCCDVNYSPENIEKAEEKKKE >gi|224461366|gb|ACDC01000036.1| GENE 8 7650 - 8507 777 285 aa, chain - ## HITS:1 COG:FN0671 KEGG:ns NR:ns ## COG: FN0671 COG1108 # Protein_GI_number: 19704006 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 280 1 280 280 322 87.0 5e-88 MSAGLTIQLIAILISVACSLLGVFLVLRSMSMLTDAISHTVLLGIVLSFFITHKLDSPLL IVGATLTGLLTVYFVEVLSDSKLVKEDAAIGIVLSILFSIAVILISKYTANIHLDIDAVL LGEIAFAPFHTTEIFGFKIATGLVNGFAILIVNLLFITIFFKEIKISIFDKALALTLGLL PEVFHYLLMTLVSVTSVISFDIVGATLMISFMVGPATTAYMISKNLKTMLVYSSLIGVIS SIIGYHLAVFLDVSISGSIAVVIGVIFFIVLFGKRFKKYVKMEEN >gi|224461366|gb|ACDC01000036.1| GENE 9 8504 - 9421 1075 305 aa, chain - ## HITS:1 COG:FN0670 KEGG:ns NR:ns ## COG: FN0670 COG1108 # Protein_GI_number: 19704005 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 303 1 303 305 398 86.0 1e-111 MNEILKLFLSSYTFKVVTLGCTLLGIVSAIIGTFAVLKKESLLGDGISHSALAGICLAFL ISGKKELYILLTGALVIGFLCIFLIHYIERNSKVKLDSAIALLLSTFFGLGLVLLTYLKK VPGAKKAGLNRFIFGQASTLIAKDIYLIIIVGLVLIFLVILFWKEIKISIFQADYAKTLG IQSNKINFLVSTMIVVNVIIGIQIAGVILMTAMLVLPSVAARQWSKKLSIVTLLAAIIGG ISGAMGSIISTLDASLPTGPLIILVSGIFVLISFLFSKKGIIARNYRIYTRNRKLRLQEN KGDNI >gi|224461366|gb|ACDC01000036.1| GENE 10 9429 - 10115 271 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 3 211 4 218 305 108 31 3e-23 MNAIEIKNLTVAYGENIALEDLNLNIEVGSLMALVGPNGAGKSTLIKTILKFLKQITGEI KINAKTLAYVPQRNSVDWDFPTTLFDVVEMGCYGRVGLFKRVSKEEKQKVLKAIEQVGML EFKDRQISELSGGQQQRAFIARALVQEADIYLMDEPFQGVDSTTEKSIVEILKQLKAEGK TIIVVHHDLQTVPAYFESVALINKAVIVSGKVSEVFTQENIDVTYRKI >gi|224461366|gb|ACDC01000036.1| GENE 11 10138 
- 11013 1306 291 aa, chain - ## HITS:1 COG:FN0668 KEGG:ns NR:ns ## COG: FN0668 COG0803 # Protein_GI_number: 19704003 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 1 291 15 312 312 481 89.0 1e-136 MMISLLVIACGDKKETGKIKVTTTLNYYTNLIEEIGGDKVEVTGLMKEGEDPHLYVATAG DVDKLQNADLVVYGGLHLEGKMTEIFDNLSNKYILNLGEQLDKNLLHKENENTYDPHVWF NTKFWAIQAQAVKDKLAEISPENKEYFESNLQAYLKSLDEATEYIQAKINEIPEESRYLI TAHDAFAYFAEQFGLQVKAIQGVSTDSEIGTKQIEDLATFIVEHKIKAIFVESSVNHKSI EALQEAVKAKGGNVEIGGELYSDSMGDKENNTETYIKTIKANADTIANALK >gi|224461366|gb|ACDC01000036.1| GENE 12 11152 - 12354 1307 400 aa, chain - ## HITS:1 COG:CAC0707 KEGG:ns NR:ns ## COG: CAC0707 COG1508 # Protein_GI_number: 15893995 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Clostridium acetobutylicum # 9 397 13 462 464 183 32.0 5e-46 MILEQKLNQSLKLSQSMKMSLNILEMSMLNLNNFMKNEFSNKFGVEVNYSKQETYNDDDR LEFSFPNEEENFFQILEGQLSYFNINQKIKDICIFIINNLNAKGYLEISKVEIKDILSTS DRELEEAFNIIHNLEPYGVGAYSLEECLKIQLEKKNIIDKKLNLLIDNFLYPLADKKYNL IKDKLNIDETTLTEYIDIIKSLNPIPSRGYNIGKIRKIIPDIFVKQINNEITYEINQDLI PQINIKNNINDKEYKKLNEIIYCIEKRFNTLDKIIKIVLRKQKDFFITEGKKMNVLKISE IASELDLSSSTVSRAIKEKYIKSDFGIISLRKLFNLSSTIFLCQEKIAQYIENEDREKPY SDQDIVKLLENDGIKIARRTVSKYRTDLGYKSSSERKTSF >gi|224461366|gb|ACDC01000036.1| GENE 13 12476 - 12889 600 137 aa, chain + ## HITS:1 COG:AGl3039 KEGG:ns NR:ns ## COG: AGl3039 COG2510 # Protein_GI_number: 15891634 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 135 59 176 180 111 53.0 3e-25 MWFIFAILSAIFAALTSILAKIGIEGVNSNLATAVRTLVVVLMAWLMVFVTGSQNGFMDI SKKSWIFLILSGLATGASWLCYYKALQIGEASKVVPIDKLSIVITVALAFLFLGEQITLK TLIGCSLIVAGTFVMIL >gi|224461366|gb|ACDC01000036.1| GENE 14 12952 - 13272 280 106 aa, chain + ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # 
Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 76 346 421 426 125 94.0 2e-29 MRIFIKNEDIIYLGSQIAFTTFPFYWLYSILEVLGSSLRGMGYSIVSMYVTTICLCAVRI SLLYLISKFNFDFKSVAYVYPMTWFITASIFIIAFLKIINKKIKSN >gi|224461366|gb|ACDC01000036.1| GENE 15 13361 - 14950 2388 529 aa, chain - ## HITS:1 COG:PAB0090 KEGG:ns NR:ns ## COG: PAB0090 COG3653 # Protein_GI_number: 14520359 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-acyl-D-aspartate/D-glutamate deacylase # Organism: Pyrococcus abyssi # 4 526 6 523 526 415 42.0 1e-115 MSTILIKNGTLIDGSGSKRYLADILIENEKIKKIGKLDLTADKVIDASGKIVSPGFIDTH SHSDLKVLIEPFVEPKIRQGITTEILGQDGISMAPLPEEYVSSWRKNLAGLDGDSDELKW DWKNTDGYLNLISKTGSGPNELYLVPHGNIRMEAMGLEARVATKEELEKMKEITRREMEA GVAGLSTGLIYIPCAYAETEELIEICKVAAEYGRPLVIHQRSEADTMLESMQEVIRIAKE SGVKIHFSHFKICGQKNWKLIEPVIALLDKCKEEGINVSYDQYPYVAGSTMLGVILPPWA HAGGTDKLIERLKDKSLREKMKEDIIKGIPGWDNFIDFAGFDGIYVTSVKTNKNQDCIGK NLTEIAEMRGKEKFDAVFDLLMEEENAVGMYDYYGKDEHIVTFMRRPESNICTDGLLGGK PHPRVYGSFPRVLGRFVREMQTMSLEEAIYKMTHKPAVTFKIENRGLLKEDYFADIVIFD ESKIIDKGTFIEPTQFPDGIDYVLVNGNFAVKEGKSTYELGGKVIRIKK >gi|224461366|gb|ACDC01000036.1| GENE 16 14964 - 16436 2012 490 aa, chain - ## HITS:1 COG:PA1418 KEGG:ns NR:ns ## COG: PA1418 COG0591 # Protein_GI_number: 15596615 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 9 482 5 462 463 142 27.0 2e-33 MNRTQIIALIIILLYMGATVLIGLIASKKKEEKKQSNDDFLMAGKSLGPVVLAGTLFAAN TGGASTTGIATNVFKYGLSASWYVIAGGIGFILVSFIAPYFRRAQANTVPEIISKRYGKA SHIFTAFTSILALFMATGAQIIATASIINVVTGFNFKTAAIVSTIVVIVYTMFGGFKSVT AANLMHVLFITIGMAIAMFIMVNNEAVGGFQALFEKAKNMQDADGNDMNFLSMTKIGATT ILGYIAMYFMTFPTGQEIVQTYCSAKDGKSAKIGSVLAGLVSAVYAIVPAIIGLLAYVCI DGYILEGAQKNALAQATITFAPPIVAGIVLAAIVAATMSSASGNMIGTATMFTNDIFTPY INKGIKDDKKEIWISKIAMLVVGVAGLFIALEASNVISVMMGAFALRSAGPFAAFICGIF 
YKNVTKRAGFLSIIAGTIVAAIWIYILKTPLGLNAMVPGGIVAFIVIFVGSYIERKMGVE AAPEIEFENI >gi|224461366|gb|ACDC01000036.1| GENE 17 16495 - 17679 1863 394 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 4 393 15 401 403 166 32.0 7e-41 MLTKERKEKVVAALQKAIQAKSYSGEEKGVVEYLEKLFKDIGYDTVHIDHYGNIIGCIKG KKNGLKVLMDGHMDTVPVQEEKWKENAFGGEVKDGKIYGRGTSDMKGALISMVLAGAYFA EDTKKDFSGEIYVAGIVHEECFEGVAAREVSKYVKPDIVIIGEASELNLKIGQRGRAEIV VETFGKPAHSANPEKGINAVYKMMKLIEKIKTLPMTYQDKLGYGILELTDIKSLPYPGAS VVPDYCRATYDRRLLVGETMEGVLKPIQDCINELKKEDLEFEAKVSYAIGKEKCWTGEEI KGERFFPGWLYDEKEEYIQKAYSGLKAIGQNPIITYYNFCTNGSHYAGEAKIHTIGYGPS KENLAHVIDEYVEVEQIEKVTEGYYAILDAYLGK >gi|224461366|gb|ACDC01000036.1| GENE 18 17692 - 18900 1612 402 aa, chain - ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 30 397 33 401 404 413 54.0 1e-115 MKTIEWIENKKITNSEKEILGFDKKSIKSVLQFHQTLPNYQKTPLIDLKELAKYCGVKKI WVKDESKRFGLNAFKVLGGSYSIGKCLSEILNEDISKLPFFVLTSKEIKKKLGDLTFITA TDGNHGRGVAWMAKHLQQKSVVYMPKGSSQMRLKAIQDEGADASITDFNYDDAVRLANEK AKENNWIMVQDTAWKGYEKIPLWIMQGYSTIMSEIIEELEKVKEKPTHVFLQAGVGSFAG AMQALLAEIYGKEKPITIICEPHGANCIYKSFKADDGNPHNVSGDLKTIMAGLACGEPNT ISWEILRDYSNFALSCDDSISAKGMRVLSSPLGEDKRIISGESGAVGIGAFVSLYENQNN YRDLWKKLNINNDSCILCISTEGDTDVDGYRNIVWNGHYENN >gi|224461366|gb|ACDC01000036.1| GENE 19 19207 - 20922 1454 571 aa, chain + ## HITS:1 COG:ygeV KEGG:ns NR:ns ## COG: ygeV COG3829 # Protein_GI_number: 16130771 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 11 571 17 589 592 351 35.0 2e-96 
MNFLNFILSDVKKYSEVVSKVVNIDVEVMDSSFVRIAGTGNLEEKVGLSMRKESHIYHQV LKYKKTIIILNPRKDPHCSNCPNKSLCKEELEISTPIIYQDEVIGVIGLICFEKNKKYDF FIKKDLYIQFLEQIAEFISYKVYVYFSSLQLKRDNEILNSIIDRVQDIIILTNRKNQIEL INKKGLNVLGKIDKNEKIILKNSSSFLNQKEFNFLYSEKEISAIGDIFSFSLEKNEELNK KLFVFKEISEFKKYVLSFHENSSIILLESPQMQDIYSRISKVSKNDTSVIITGESGTGKE IIARHIHNLSTYSKGPFIIVNCGAIPESLIESELFGYTKGAFTGADPKGKIGFFEKANNG TIFLDEIGELPLQIQVKLLRVLQDKTITPIGSRTEKQINVRIIAATNKKLEEEIEKKNFR EDLFYRLNVFPINIPPLRERKKDIKILIDFFVKKYYISFQKERKDISENVYKLLMDYSWP GNIRELRNTIEYCMNMIEENETRIEPKHLPPKFLKNKEDLDDKKTLKEFEKENIINLIKI YGDNLEAKKIIAKSLGIGIATLYRKLKKYNL >gi|224461366|gb|ACDC01000036.1| GENE 20 21076 - 22266 1599 396 aa, chain + ## HITS:1 COG:FN1152 KEGG:ns NR:ns ## COG: FN1152 COG0436 # Protein_GI_number: 19704487 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Fusobacterium nucleatum # 1 396 1 396 396 698 87.0 0 MRISEKALNMKYSAVRKLVPLASEAESKGVKVYRLNIGQPNIETPELFFEGLRNIPDHVI RYADSRGIKELLDQVIEVYSRDGHVLKKEDIIVTQGGSEALTMAMLAICNPDDEVLVPEP FYSNYKSFIDIAGAKIVPIATDITNDFALPKKEEIQKLISPRTKAILYSNPCNPTGKVYT KEEVELMADLAVENDLFIVADEPYREFIYDDNDKHYSLLDIEKARENTIIIDSVSKHYSA CGARVGFLISRNEEFMKYIMKFCQARLAAPTVEQYAVANLMKAPKEYFKEIKEIYNRRRD IIVNSLNKIEGVTCSAPKGAIYAFAKLPVDSSEEFCKWLLTDFRYDNSTVMLAPGEGFYE TEGLGKQEVRFSFCVGEEDIEKAMKVLEEALKVYKK >gi|224461366|gb|ACDC01000036.1| GENE 21 22291 - 22620 480 109 aa, chain + ## HITS:1 COG:no KEGG:FN1153 NR:ns ## KEGG: FN1153 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 109 12 109 109 123 76.0 2e-27 MKKILAMLVLLSITSNATEVFSEYYVMEKVLPLLTNAESYTLNGEEVKAVKVDRKVLKAL GTTDDPFYYTNSNQEKKMVRVGDYMVTPITFSSIDSASSKEFNSDFIKK >gi|224461366|gb|ACDC01000036.1| GENE 22 22696 - 23886 799 396 aa, chain + ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 395 1 395 396 567 83.0 1e-161 
MLKRAYEKYRGANSSFWVTSLSFYTILAIVPIFAILVSLSSWFGAKDYIINQIKDIAPLK SDTLELLTDFSNNLLMDARSNVLAGVGFIFLGWTFIKMFSLIEDAFNQIWHIKKSRSLIR KISDYISFFIFLPLVFITLNGISLFFLAKIKDIGFLYYIIKNILPLFSMTVFFTALFLVM PNTIVKVFPAFVASIIVSVAFLIFQYIFFLLQFLLIGYSTVYGGFSVIFIFIIWIRISWF IIILGVHICYLIQNANFDINIENDGINISFNSKLYITFKVLEEMVKRYLNNLPPVNIEEL RKITTSSPFLIGNILDELIRGGYIVSSLDYSEKVFCLTKNIDEIYLKEIYDFIANTGEEI FILQDGRITDDVEKIIIDKDYNRTLKSLGGEGAEKI >gi|224461366|gb|ACDC01000036.1| GENE 23 23870 - 26032 2512 720 aa, chain + ## HITS:1 COG:FN1155 KEGG:ns NR:ns ## COG: FN1155 COG0768 # Protein_GI_number: 19704490 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 13 720 1 711 711 1049 78.0 0 MRKKFKIGRLLSLLIVLSISGYLIYIKSFFVLITFWFLLIYILFSYAVVKNWKKKELFDK RRSIMLLIILAFLIVYGLQLFRIQFLLKSKYVGLMNKQLISVNKEVGQRGAIYDSNGKKL AFNKRLYTVSINPSLLNDEKIHDDILKDITAIKESGIIPLSENIEEELLEMAKENVKYKR IARNIDDEQKKEIVDLIANIERKKVKGRAKYKSVLVFERSIDRKYYKSEEYDKLVGMVKE TEETNDEKIGISGLEKQYQNYLVERKRDITKLYGLNKKNTLALSKETLFSDLNGKNIYLT IDADLNFILNDEIKAQFENVNAYEAYGLIMDPNSGKILAVAAFSKDKDLLRNNIFQSQYE PGSIFKPLIVAAAMNEGFITANTQFNVGDGRIVRSKKTIRESSRSTRGVITTREVVMKSS NVGMVLISDYFTDALFEEYLKKFGLYDKTGVDFPNELKPYTTPHENWDGLKKNNMAFGQG IAITPIQMITAFSAVVNGGTLYKPYLVEKITDGEGIVIRRNTPTVVRKVISESVSESMRS ILADTVDKGTGKRARIEGYAVGGKTGTAQLSGGKTGYIRNEYLSSFIGFFPADKPKYVIM AMFMRPQSEIQSNRFGGVVAAPVVGNVIRRIIKEEEGFAKDIEKINVNSETAGAHKSSLE AVNYEDVMPDLEGMSPQEVLSVFKETDIDIEVVGTGLVVEQKPAAGDSLKDVKKVKIILK >gi|224461366|gb|ACDC01000036.1| GENE 24 26033 - 28348 2229 771 aa, chain + ## HITS:1 COG:FN1156 KEGG:ns NR:ns ## COG: FN1156 COG1198 # Protein_GI_number: 19704491 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Fusobacterium nucleatum # 6 771 1 766 766 1104 81.0 0 MFEVNMQYFDIYIDSTKGIYTYSDKNDEFEIGDNVIVPFRNIKKTGFIIRKSLKENFDFK VLNISSKVKNSLKLSEEQIKLIEWINDYYLASYDSIIKAMIPKNVKIKYNNIYCINFEKN 
NLLIENSTNKIIKHIISLATISYSTAKTKFKKKIIDSLVEKEFLSLEDNNIQVKIEKFFE LKEENKDVFEYLYKKTFIKKEKLEEKFKRNDIKELEEKKILKVEASLNEKKEYSSEEVEK IQRNGSLLNEEQLAVKDKIINSDKKYFLLKGVTGSGKTEIYIELIKSAFFEGYGSIFLVP EISLTPQIIERFQSEFKNNIAILHSALSDVERAKEWESIYTGEKKIVLGVRSAIFSSVKN LKYIILDEEHEATYKQDSSPRYNAKYVAIKRCLDEGAKLILGSATPSIESYYYAKSGIYE LLNLDKRFANAELPDIEIVDMKQEDDLFFSKTLLEEIKNTLLRDEQVILLLNRKGYSTYI QCKDCGYVEECDSCSIKMSYYKSLNKYKCNYCGRQIHYTGKCSKCGSTNLIHSGKGIERV EEELRKYFDVPMVKVDSDLSKNKDNFSKIYKDFLNKKYSILIGTQIIAKGLHFPDVTLVG VINSDIILNFPDFRSGEKTFQLLTQVSGRAGRAGKKGKVIIQTYEPENNVIKDSKEENYE LFYNREINSRKVFSYPPFSKILNIGFSSEDEKRLIEVSREFYEEIKNQDIELYGPMPSMV YKVQKRYRMNIFAKGSRAKIDIFKRYLKKKLDEFNDGKVRIVVDIDPINLM >gi|224461366|gb|ACDC01000036.1| GENE 25 28358 - 28882 751 174 aa, chain + ## HITS:1 COG:FN1157 KEGG:ns NR:ns ## COG: FN1157 COG0242 # Protein_GI_number: 19704492 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Fusobacterium nucleatum # 1 174 1 174 174 266 84.0 1e-71 MVFEIRKYGDDVLKQIAKEVELSEINDEFRKFLDDMVETMYKTDGIGLAAPQVGVSKRVF VCEDGTGKIRKLINPVIEPLTEETQEFEEGCLSVPGIYKKVERPKKVMLKYLNENGEAVE EIAEELLAVVVQHENDHLNGILFVEKISPMAKRLIAKKLANMKKETKRIMEENE >gi|224461366|gb|ACDC01000036.1| GENE 26 28875 - 29186 291 103 aa, chain + ## HITS:1 COG:no KEGG:FN1158 NR:ns ## KEGG: FN1158 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 18 90 18 90 104 76 73.0 3e-13 MSKRLFWLIGIMVLVLMTFNVMSQIKHNMSKKNSIQEEIKIVNKKIEETRANIEKYDRKI DSLDDDFEKERVARNMFQMVKDNEVIYKYVEKDNNPNNIKEEK >gi|224461366|gb|ACDC01000036.1| GENE 27 29183 - 30223 1785 346 aa, chain + ## HITS:1 COG:FN1159 KEGG:ns NR:ns ## COG: FN1159 COG1494 # Protein_GI_number: 19704494 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Fusobacterium nucleatum # 1 346 1 346 346 619 98.0 1e-177 MKRELALEFARVTEAAALAAHKWVGRGKKESADQAGVDAMRTMLNRLAIDGEIVIGEGEI 
DEAPMLYIGEKVGLIYNEEEKDSVTYVDPVDIAVDPVEGTRMTAQGQPNAITVLAVGKKG SFLKAPDMYMEKIIVGPEAKGKIDLSKPLEDNIHAVAKALNKELKDLMIVILDKPRHKEL IKDLQAMGIKVYALPDGDVAGSILTCMIDSDVDMLYGIGGAPEGVISAAVIRALGGDMQA RLKLRSEVKGTSLENDKISKFEKLRCEEQGLKVGEILKLEDLAKDDEIIFSATGITGGDL LEGVKRKGSIARTQTLVVRGLSKTVRYINSIHNLDFKDEKITHLVK >gi|224461366|gb|ACDC01000036.1| GENE 28 30331 - 30624 523 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740382|ref|ZP_04570863.1| ## NR: gi|237740382|ref|ZP_04570863.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 97 1 97 97 155 100.0 6e-37 MKKILLALSVVFLLVACGKPKAYTLPEKEKESIFAIAENNQQKLDELHKNMEEWKKLAEK GDEQGKKEYQEWQIVETLVSDPSYVEVNYKALKADGK >gi|224461366|gb|ACDC01000036.1| GENE 29 30639 - 30929 464 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740383|ref|ZP_04570864.1| ## NR: gi|237740383|ref|ZP_04570864.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 96 1 96 96 163 100.0 3e-39 MGLFGKTKELPFVSNGRLIEVINQNDAYLDEGIEEKKSYKALERQLERRFLYRNVESITP TGTFGIVIVKYKDLKVRSEAEVAEMRKQLRKEAGLE >gi|224461366|gb|ACDC01000036.1| GENE 30 30984 - 31151 320 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMSNINITVDEEIDDFTYFNAETIEAIEETERNLKNSNRKRYSSIQELREALENN >gi|224461366|gb|ACDC01000036.1| GENE 31 31275 - 31514 324 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740385|ref|ZP_04570866.1| ## NR: gi|237740385|ref|ZP_04570866.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 79 1 79 79 125 100.0 6e-28 MKLYDLTLKKEVARECAWGVMGTITRIENKKGESPVLSLIEKEFWEEVRKIPRMTFEEVE ALNVKINFIMKVLSKLEEI >gi|224461366|gb|ACDC01000036.1| GENE 32 31516 - 32325 1070 269 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_p004 NR:ns ## KEGG: SSUBM407_p004 # Name: not_defined # Def: toxin of epsilon-zeta postsegregational killing system # Organism: S.suis_BM407 # Pathway: not_defined # 4 269 6 271 287 167 40.0 3e-40 MEKNYTDKELELVFEKILKMYKSSYSPKEKPKVFLLGGQPGAGKTGLENMINAKDEYISI SGDDFREYHPKFKEINLEHGREASKYTQQWCGAITEKLIETLGKEKYNLIIEGTLRTAEL PIKEATRFKKLGYEVGLNVVVVKGEKSRLGTIQRYEEMIKQGKTPRMTPKEHHDLVVNSI GNNLETIYNSKLFDDIRLFDRENNLLYSYKESPDVSPKNILEKEFSRKWEKEEIEEYNER WNNLVKIMENRSASAAEISKVIIEKEENL >gi|224461366|gb|ACDC01000036.1| GENE 33 32686 - 33207 625 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740387|ref|ZP_04570868.1| ## NR: gi|237740387|ref|ZP_04570868.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 173 1 173 173 284 100.0 2e-75 MLKKESEKVNIKALVPIYIREILDEDIKHFRMAKYTLCNQILIKFSRCSDNNFSKITPFE KKEYLQFAVQKENITRYSELRELNKDKTESEMIREIFASYTTMPPFLREINLFEEKIVFL MTAKKEYKKLKLHTDDGFIEGKIEDIRRNEENNYLEVIINSKSYYISRLTIIS >gi|224461366|gb|ACDC01000036.1| GENE 34 33229 - 33672 635 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740388|ref|ZP_04570869.1| ## NR: gi|237740388|ref|ZP_04570869.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 147 1 147 147 234 100.0 9e-61 MAFEKRVIDFREVMFKDNELFTGIYYEYHENGLDKYECSYRDGLKHGMEWMFDEYGMAIE VRTYKKGEMTTFEEYYPSGALKQKIELKDEMKNGIEMAFEENHNILYYGLNKDDKRYGEW QFYKNGKLEKYVSYKNGEIIGEEKVEY >gi|224461366|gb|ACDC01000036.1| GENE 35 33752 - 33941 221 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294782520|ref|ZP_06747846.1| ## NR: gi|294782520|ref|ZP_06747846.1| lipoprotein [Fusobacterium sp. 1_1_41FAA] # 1 63 1 63 460 115 98.0 8e-25 MKESISKFITEGKALLWIKTNDFQEVERAMIETLNSLENKKFYIYEKGKTINFLNDSIES GMD Prediction of potential genes in microbial genomes Time: Thu May 19 22:59:36 2011 Seq name: gi|224461365|gb|ACDC01000037.1| Fusobacterium sp. 
2_1_31 cont1.37, whole genome shotgun sequence Length of sequence - 25094 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 13, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 855 1184 ## COG1262 Uncharacterized conserved protein + Term 896 - 932 4.1 + Prom 859 - 918 2.2 2 2 Op 1 . + CDS 939 - 1097 145 ## 3 2 Op 2 . + CDS 1105 - 1287 275 ## gi|237740390|ref|ZP_04570871.1| predicted protein + Term 1295 - 1335 5.4 + Prom 1341 - 1400 9.4 4 3 Op 1 8/0.000 + CDS 1432 - 2028 637 ## COG2452 Predicted site-specific integrase-resolvase 5 3 Op 2 . + CDS 2021 - 3181 1004 ## COG0675 Transposase and inactivated derivatives + Prom 3183 - 3242 5.4 6 4 Op 1 . + CDS 3385 - 5016 1890 ## Lebu_0003 protein of unknown function DUF1703 7 4 Op 2 . + CDS 5046 - 5675 896 ## COG3339 Uncharacterized conserved protein + Term 5761 - 5816 10.1 - Term 6112 - 6154 1.3 8 5 Tu 1 . - CDS 6165 - 7355 2091 ## COG0133 Tryptophan synthase beta chain + Prom 7690 - 7749 12.3 9 6 Tu 1 . + CDS 7781 - 8692 1164 ## EUBREC_2750 hypothetical protein + Term 8700 - 8748 12.4 - Term 8695 - 8728 4.0 10 7 Op 1 . - CDS 8771 - 9028 338 ## FN0980 hypothetical protein 11 7 Op 2 . - CDS 9041 - 9448 529 ## FN0979 hypothetical protein - Prom 9498 - 9557 13.7 + Prom 9541 - 9600 13.2 12 8 Tu 1 . + CDS 9627 - 10934 1207 ## COG1757 Na+/H+ antiporter + Prom 10944 - 11003 14.5 13 9 Op 1 . + CDS 11023 - 11940 1385 ## FN0976 hypothetical protein 14 9 Op 2 1/0.500 + CDS 11980 - 12903 1101 ## COG1242 Predicted Fe-S oxidoreductase 15 9 Op 3 . + CDS 12900 - 13721 1079 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 16 9 Op 4 8/0.000 + CDS 13792 - 14496 765 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 17 9 Op 5 1/0.500 + CDS 14489 - 14812 180 ## COG1687 Predicted branched-chain amino acid permeases (azaleucine resistance) 18 9 Op 6 . 
+ CDS 14824 - 16002 1041 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 19 9 Op 7 . + CDS 16021 - 16548 553 ## gi|237740406|ref|ZP_04570887.1| predicted protein + Term 16699 - 16755 9.1 + Prom 16550 - 16609 5.7 20 10 Op 1 . + CDS 16765 - 18366 1766 ## Athe_2404 hypothetical protein 21 10 Op 2 . + CDS 18363 - 19208 261 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains + Term 19400 - 19437 1.6 22 11 Op 1 . - CDS 19342 - 19740 445 ## gi|237740409|ref|ZP_04570890.1| predicted protein 23 11 Op 2 . - CDS 19758 - 20543 443 ## Plim_3624 endonuclease/exonuclease/phosphatase 24 11 Op 3 . - CDS 20586 - 20930 393 ## COG1733 Predicted transcriptional regulators - Prom 21008 - 21067 12.0 + Prom 20963 - 21022 10.9 25 12 Op 1 . + CDS 21057 - 21860 1022 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 26 12 Op 2 . + CDS 21935 - 22675 829 ## Lebu_1563 hypothetical protein 27 12 Op 3 . + CDS 22702 - 24105 1563 ## Acfer_1552 hypothetical protein 28 12 Op 4 . + CDS 24171 - 24392 188 ## gi|237740415|ref|ZP_04570896.1| predicted protein + Term 24400 - 24438 7.2 29 13 Tu 1 . 
- CDS 24415 - 25002 392 ## FN1044 hypothetical protein - Prom 25028 - 25087 3.3 Predicted protein(s) >gi|224461365|gb|ACDC01000037.1| GENE 1 1 - 855 1184 284 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 25 252 19 254 286 152 38.0 9e-37 QKIELGEEDLKKIIKELKGTIRKYSAKKASNIESKFENMIFVQGGKHQPSFANEEKEVFD IEVCKYPTTQKMWTEVMENNPSEFKGDNKPIESITWWEALEFCNKLSEKYGLEPVYDLSK SREGILMIKELGGQKVYPDVANFKNTEGFRLPTEVEWEWFARGGQIAIEQGTFNYTYSGS NNIDEVAWYIENSGETYNVGLKKPNQLGLYDCTGNVWEWCYDTTENIEEGKNYIYKAFDA SNKSRRLKGGSWNDISKFCAVLNREINQATDAFYAIGFRIVRTV >gi|224461365|gb|ACDC01000037.1| GENE 2 939 - 1097 145 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFESLLYGAVGILIILCASILILKSLTESCLGCLATILWIIFLIWLFRKCTT >gi|224461365|gb|ACDC01000037.1| GENE 3 1105 - 1287 275 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740390|ref|ZP_04570871.1| ## NR: gi|237740390|ref|ZP_04570871.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 60 1 60 60 67 100.0 4e-10 MKSLKELFKDYDKNNKKTVRESFSEMDRKDIRTNREIFKDMENVGRSVKEIFEDMSEKKK >gi|224461365|gb|ACDC01000037.1| GENE 4 1432 - 2028 637 198 aa, chain + ## HITS:1 COG:MJ0014 KEGG:ns NR:ns ## COG: MJ0014 COG2452 # Protein_GI_number: 15668185 # Func_class: L Replication, recombination and repair # Function: Predicted site-specific integrase-resolvase # Organism: Methanococcus jannaschii # 1 198 4 203 213 133 38.0 3e-31 MKKIYKPKEFSELVNRSVNTLQRWDREGILIAHRTPTNRRYYTLEDYNKVMGIEVTQNQV YEVIIYARVSNHSQKDDLQNQIKFLRDYANAKGYIVSEVITDIGSGLNYQRKGFNSILYS DKKQKILISYKDRFVRFGFDWFDKFLKSKGSEIEIVNNEDLSPQKEMVQDLISIIDMFSC HIYGLRKYKKQIKEDKDV >gi|224461365|gb|ACDC01000037.1| GENE 5 2021 - 3181 1004 386 aa, chain + ## HITS:1 COG:Z3664 KEGG:ns NR:ns ## COG: Z3664 COG0675 # Protein_GI_number: 15802939 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 3 379 5 373 402 156 29.0 8e-38 MYKALKIELKLTVAQKIQVCQTIGTERFIYNEYIKYNKEQYELGNKFVSANDFSKYINNV YLPNNPDKKWIKDVSSKSVKQAMIYGERAFKNFFKGLSAFPVFKKKGKNELGAYFVKNNK TDFEFYRHKIKIPTLKFVRVKEYGYIPKNAIIKSGTITKTVDRYFLSLIMEVEDTVKATN TSSEGLGVDLGIKDTAICSNGKVFKNINKTKKVKKLKKKLKREQRKMSRSVEYSKSKKIK LKECKNFNKKKLKVQKLFYRLNCIRDDYNNKIVDEITRAKLKYITIENLKVSNMMKNKHL SKAIQEQNFYAIRTKLINKCKKRNIELRLVDTFYPSSKTCSCCGEIKKDLKLNDRLYKCC NCGLEIDRDYNASINLEKAKIYKVIA >gi|224461365|gb|ACDC01000037.1| GENE 6 3385 - 5016 1890 543 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 543 3 545 545 664 67.0 0 MKKIPIGIDDFKKIRENNYYYIDKTNFIEEIGKNVGKTLLFTRPRRFGKTLNMSMLKYFF DIKNKEENKKLFQNLYIEKSDFFEEQGAYPVVYISLKGIKADTWENCLSGIKTLLRELYE EYSFIKEKLSLSEQIEYDKIWLKKEDGEYRNALKNLTSFLYEYYKKEVILLIDEYDSPLI NAYEYGFYDEAVLFFKVFYGEALKTNPYLRMGIMTGIIRVIKAGIFSDLNNLKVYSILEK EYSDFYGFTQEEVEKALKNFNIEYELPEVKAWYDGYRFGNSDVYNPWSILNFIQSEELRP 
YWIETSGNFLINDILKNVSTETIEILEHLFNGISMEENISGNSDLSVLMREDEIWELLLF SGYLTIDEKIGESYEDVYTLRLPNREVKEFFRKKFIDVNFGESTFRKSMEALKKNNIKDF EKYLQNILLKSTSFHDTKSEVFYHGLILGMMFYLDRDYIVKSNEESGLGRYDVSIEPRNK NNRAYILEFKVTKNEEDLEKESKEAIEQIISKKYDTSLKERGLKEIVFLGIAFCGKLLKV NYR >gi|224461365|gb|ACDC01000037.1| GENE 7 5046 - 5675 896 209 aa, chain + ## HITS:1 COG:mlr4351 KEGG:ns NR:ns ## COG: mlr4351 COG3339 # Protein_GI_number: 13473675 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 98 172 25 97 120 65 42.0 9e-11 MDRKYFECMTELGLEPGFTEKDLRKKWLELLKKYHPDKYQTEDESIIKFAEEKIIKINEA YEYLKENFLESKEEDVDTDVDTMDYDYEKYTDDFSDGKFWDKIKEVAKKIGLKTTSYALI LYYVLQKKEVPFKDKMLITGCLGYFILPIDLIPDFIPIAGYTDDVAGMIFAIKKCMDYVD DEIKQNVSSKLVAWFDVEKDYVDDLLKDI >gi|224461365|gb|ACDC01000037.1| GENE 8 6165 - 7355 2091 396 aa, chain - ## HITS:1 COG:FN0317 KEGG:ns NR:ns ## COG: FN0317 COG0133 # Protein_GI_number: 19703662 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Fusobacterium nucleatum # 1 395 1 395 395 732 94.0 0 MTTENKKGYFGEFGGSYVPEVVQKALDELEIAYNKYKDDEEFLKEYHHYLKDYSGRETPL YFAESLTNYLGGAKIYLKREDLNHLGAHKLNNVIGQILLAKRMGKKKVIAETGAGQHGVA TAAAAAKFGMQCDIYMGALDVERQRLNVFRMEMLGATVHAVEAGERTLKEAVDAAFEAWI NNIEDTFYVLGSAVGPHPYPSMVKDFQKVISQEARRQILEKENRLPDMVIACVGGGSNAI GAFAEFIPDKDVKLVGVEAAGKGIDTDRHAATLTLGTVGVIDGMNTYALFNEDGSVKPVY SISPGLDYPGVGPEHAFLRDSKRAEYVPATDDEAVNALLLLTKKEGIIPAIESSHALAEV IKRAPKLDKDKIIIVNISGRGDKDVAAIAEYLKNKN >gi|224461365|gb|ACDC01000037.1| GENE 9 7781 - 8692 1164 303 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2750 NR:ns ## KEGG: EUBREC_2750 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 302 1 306 307 397 65.0 1e-109 MSNIIAVIWDFDKTLVDGYMQDPIFEKYGVDSKEFWEEVNSLPKKYWEEQQVKVNRDTIY LNHFINKTKEGVFKGLNNKVLFELGKELKFYKGIPEIFGKTKELIEKNSIFQEYNIKVEH YIVSTGMVEMIKGSIIKEYVEDIWGCELIQAKDENGNFEISEIGYTIDNTSKTRAIFEIN KGVNKNTGYDVNAKIKEGNRRVLFKNMIYIADGPSDVPAFSLIKKGGGSTFAIYPKSDLK 
AFKQVEKLREDNRVDMYAEADYSEGTTTYMWIMSKIQELAQNIVDEEKSRLAASISDSPK HLN >gi|224461365|gb|ACDC01000037.1| GENE 10 8771 - 9028 338 85 aa, chain - ## HITS:1 COG:no KEGG:FN0980 NR:ns ## KEGG: FN0980 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 83 1 83 85 87 67.0 1e-16 MRIRLSGAVGGVVLVVITGVILASIVDGILSFIEKYVVKEDESGKKFISLLKKINWGFFI IFIILDLIGVFPLFRTILFALFSHF >gi|224461365|gb|ACDC01000037.1| GENE 11 9041 - 9448 529 135 aa, chain - ## HITS:1 COG:no KEGG:FN0979 NR:ns ## KEGG: FN0979 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 14 135 1 122 122 200 83.0 2e-50 MFGLFGGKKKKEFMSDNTKAYLHIYCAKNIIVDEQKFSELEHIKGDDLEDVIKVSTDKHI VTANYDLPSNSVFNSRIKAKDISISCPLLEAGKHYVISIYEVTPEVAATEESTFMDYVHA EEIEKGYSICLYRKK >gi|224461365|gb|ACDC01000037.1| GENE 12 9627 - 10934 1207 435 aa, chain + ## HITS:1 COG:FN0978 KEGG:ns NR:ns ## COG: FN0978 COG1757 # Protein_GI_number: 19704313 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 1 430 1 430 431 554 82.0 1e-157 MGSVIAILLFSVSLILCLLLNFSVVYALIVGYIIFITYGLIKGYDLKVLVKKSFEGVLTV KNILLVFVLIGMITALWRASGTIAFIVYMGSKLILPSILILLTFLLCSILSFLIGTSLGT AATMGVICVSIGKAMGLNPYYLGGAVLSGIYFGDRCSPMSTSALLITELTKTNLYTNIKL MLKTSIIPFIATCLFYLFLGLKSSTSPVSIDATNIFKENYNLNIVVIVPAILIIILSLFK VNVKKTMLLSIFISFIIAMFFQKESVTSLINYCVYGFHHSNEKLNSMMKGGGILSMLNVG LIVAISSSYSGIFKETKILVLMKKYLKEFSKKTSNYFVIFLSSIISGAIACNQSLGTILT YELCEELEEKQNMAIILENTIVLLAGLIPWNIAMAVPLKTIDIGLMSGLFAFYLYFLPLW NLFLGIVKEKKKINR >gi|224461365|gb|ACDC01000037.1| GENE 13 11023 - 11940 1385 305 aa, chain + ## HITS:1 COG:no KEGG:FN0976 NR:ns ## KEGG: FN0976 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 305 1 305 305 441 75.0 1e-122 MSISFYVKNKKKFLGYEPVLNVEEALSLLDKELNTYGTDGIDINDLLLSPLSKYPCLLVG AEEESARGFELSYDNKNKVYGVRIFTPSSREDWLLALEYIKALAKKMGTEIINERGETFT VDNIEKFDYEPDILYGIKVITENIKSGESSNYIIFGTTRPVSFDEKMIDEINNSDSPIDT 
FSRIVRDIQNLDAYSANQQFYQNREDGKIMGAYTITESVRTIIPYKPSVEFHNSDMLKND DIAFWNMGFVVINGDENDPNSYQGVGQLDYDDFIKKLPKEKYKFIDASYIMVEPLTKEEI SDFLK >gi|224461365|gb|ACDC01000037.1| GENE 14 11980 - 12903 1101 307 aa, chain + ## HITS:1 COG:FN1142 KEGG:ns NR:ns ## COG: FN1142 COG1242 # Protein_GI_number: 19704477 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 1 302 1 302 304 522 85.0 1e-148 MIRKIYMLNDFLKEKFNEKIYKVSLDGDFTCPNRDGKVSRGGCIFCSENGSGDFTATKLK SIHEQIEEQIDLVSKKYKGDKYIAYFQNFTNTYAEVSYLRKIYQEALSHEKIVGLAIATR PDCLGDDVLELLAELNKKTFLWVELGLQTVNDDVAKYFNRAYETEIYKNASEKLNRLNIK FVTHIIIGLPKEEEDDYLKTAIFAQNCGTWGIKLHLMYVVKNTPLEKLYLNGDLKVNTKE EYVEKVINILENISPEIVVHRLTGDGDRETLVAPLWSIKKIDVLNSIHKELKRRNTYQGK LYYGGLK >gi|224461365|gb|ACDC01000037.1| GENE 15 12900 - 13721 1079 273 aa, chain + ## HITS:1 COG:FN1143 KEGG:ns NR:ns ## COG: FN1143 COG0363 # Protein_GI_number: 19704478 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Fusobacterium nucleatum # 1 273 1 273 274 444 78.0 1e-124 MRFVITDNKRVGDWAAVYVANKIREFNPTAERKFVLGLPTGSTPLQMYKRLIEFNKAGII SFKNVVTFNMDEYLGLEATHDQSYHYYMYNNFFNHIDIEKENINILNGKAKNYEEECKRY EEKILELGGIDLFLGGVGVDGHIAFNEPGSSFKSRTRKVQLTENTIIANSRFFDNDITKV PRFALTVGIETITSAKEVLIMAEGENKARALHKGIESGINHMWAISSLQLHENAIIVADE AACSELKVGTYRYYKDIESENCDVNKLLEKVQK >gi|224461365|gb|ACDC01000037.1| GENE 16 13792 - 14496 765 234 aa, chain + ## HITS:1 COG:FN1039 KEGG:ns NR:ns ## COG: FN1039 COG1296 # Protein_GI_number: 19704374 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Fusobacterium nucleatum # 1 232 18 249 250 285 70.0 8e-77 MEEFKFAFKNYILIAFPYLFIGITCGFLMKEAGFGAIWSLLSCLLVYGGTIQLLMVGLLK ANTPIISMGLISLIVNSRHMFYGLSFLQEFKKIRKESFLKFFYLAFSLTDEVYSIYAAIK IPERLNKTKTMLYINLLAQFTWTFGCVVGNLAFNFIKFDLKGIDFIITEFFCIVVISQLI 
GDKSYISTSVGIISSIIAFLIMGSNFIVLAIFLSMLTLLILKKKLTAKEVDKHE >gi|224461365|gb|ACDC01000037.1| GENE 17 14489 - 14812 180 107 aa, chain + ## HITS:1 COG:FN1040 KEGG:ns NR:ns ## COG: FN1040 COG1687 # Protein_GI_number: 19704375 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permeases (azaleucine resistance) # Organism: Fusobacterium nucleatum # 1 105 1 105 107 134 74.0 5e-32 MNNNLYLFLAILSAGVGMAICRLLPFIIFANGKLPKLVKFYEKYLPYSLMAILFCYCFAS VKFSVYPHGFPEIITLIVITLLHIWKKNVMLSLFLGTVVFLILSRVF >gi|224461365|gb|ACDC01000037.1| GENE 18 14824 - 16002 1041 392 aa, chain + ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 1 392 1 391 391 518 81.0 1e-147 MKIRYAKKSEKEMAIKFWKDSFKDSEEQIKFYFDNIYNEKNYLVLEDNSKIISSLHENDY IFNFNNESVKSKYIVGVSSDITMRNKGYMSKLLISMLENSKKKSMPFVFLTPINPKIYRK FGFEYFSNIEYYNFSIEELANFKLPDNDYSYIEINEENKKIYLKDLIKVYNSNMEDKFCY LERDNFYFDKILKEAISDEMKIFILYKNKVASAYIIFGLYEENIEIRECMALDGVSYKEI LALIYGYRDYYKNVSLASSNNSNLEFLFENQLSIEKIVKPFMMMRILDPLAIFKNLKLEN HNIYIYIEDKILKENTGLYYFLNKKFTFYALPVEKSIYHLRIDIADLVFLITGYFSIDDL VKMGKIDISNKETLRKLKRIFSKKNSYLYEFI >gi|224461365|gb|ACDC01000037.1| GENE 19 16021 - 16548 553 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740406|ref|ZP_04570887.1| ## NR: gi|237740406|ref|ZP_04570887.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 175 1 175 175 285 100.0 9e-76 MNIENIIKNQDILDCWKEIQKSNIDKNISKEVFEYDIEEYHTFLLDEIVEASQYMNMSTD TLIQEMLLFAKDSKSLVINFSNERLNKKIPFSSPLTYEEMSGGYTEEELGIPYQDLEDET DAIIDIGTLVSYLIDLIFLFKEEKNYMKYLTQRLYYSEIHAKEFIDYEKKIIEDL >gi|224461365|gb|ACDC01000037.1| GENE 20 16765 - 18366 1766 533 aa, chain + ## HITS:1 COG:no KEGG:Athe_2404 NR:ns ## KEGG: Athe_2404 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophilum # Pathway: not_defined # 11 511 15 486 491 216 32.0 2e-54 MNFFRKTLEILKRTWNNAVTNESTLNFNLDMFAYSSRDPKQDYEKNKNRFKNALIENDEI ALANLLYTLDIRNGKGERALFKSYFSALIEMNKDYAIQILPYIPELGRWDYVFEGIGTEI EETIYELVRAYFMMDIKNYNENKPVSLLAKWLPSIKTHNKKNYFAVKLAKKLNLTEKEYR KILSKLRDSLNIVEKHITNKEYEKIEYISVPSKAMVKYKNLFFTKDEVRFKEFIEELKDS KKAKYDNLFMNDFAKMYLDNLMKIGINYFYERTIKEACRLLFNNFFLKDLEENSQILLQN FKNEKNLINTMWKKQSKIEFDKNVLVIADTSGSMEGTPFETAISLAIYISQNNKSEEWRN KFIIFSSDCIEYSYDKDAEFTDIIDNIPLIAENTNIDKVFKKILNDSIEKNLPQLDEVII ISDMEFDMVQDKKDMSNFKHWKSEFAKYNYELPKIIFWNVARNVESFPVTKLDYGTCLVS GYSKNILKSIIDIENFDPIDIMLKTLEKKNYFKMVKEIKENLSRKEFEHVEEK >gi|224461365|gb|ACDC01000037.1| GENE 21 18363 - 19208 261 281 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 30 280 24 280 285 105 30 4e-22 MIKVGKRQKLVINNFASVGAYLFAGTDDDKDNILLPNNELEGRDLKEGDEVEVLIYRDSE DRLIATFRKTEALVGTLAKLEVVDDNPRLGAFLDWGLNKDLMLPNSQKETKVEIGKKYLV GLYEDSKGRVSATMKIYKFLMPSSDIKKGDIVNATVYRINDEIGTFVAVEDRYFGLIPKS ECFEEYSAGDELTLRVTRVREDKKLDLSPRKLLSDQMESDAELVLGKMRLLKEHFRFNDN SSAEDIKDYFGISKKAFKRAIGSLLKNGLIEKSGDYFILKK >gi|224461365|gb|ACDC01000037.1| GENE 22 19342 - 19740 445 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740409|ref|ZP_04570890.1| ## NR: gi|237740409|ref|ZP_04570890.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 132 8 139 139 248 100.0 1e-64 MSTWKFDDNFGYMESREYLDQIMKEFNDLFKTDGLLLNKTSVGIRFDSFKGYDISIEAFF IKVPRLDYKIEIFRIEQPLVKEYPITVFNSIVKQNEECWDIESLKEAVERITTSRRLNDI INNLRSRVLYAY >gi|224461365|gb|ACDC01000037.1| GENE 23 19758 - 20543 443 261 aa, chain - ## HITS:1 COG:no KEGG:Plim_3624 NR:ns ## KEGG: Plim_3624 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: P.limnophilus # Pathway: not_defined # 1 259 2 267 283 119 30.0 9e-26 MKFLFWNINRKDLSNTIIEICQENDIDILGVCEGQNLNIKNLIGSLDYEEISVIHNLEDN GIKLFCRENISIIANEENTDYHKEYTLILESIKYKLFLVHLPSKLHLTENEQSLLVPRFE CTKQAMQRTLIFGDFNMNPFENGIISSETFHALSSKEITLKKKRRVNKKERYFFYNPTWF LYARKHNEIIGSYYYNKSSPNNLFWNMFDQVIISPDLINVFEFNKFKIITDTKQKNLLYE DKTPNKKRYSDHLPILFELNI >gi|224461365|gb|ACDC01000037.1| GENE 24 20586 - 20930 393 114 aa, chain - ## HITS:1 COG:FN1904 KEGG:ns NR:ns ## COG: FN1904 COG1733 # Protein_GI_number: 19705209 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 11 110 38 137 148 105 54.0 3e-23 MSKTLTASEMCPMDLGINILSGKWKLKILWNIQRKNIIRFNELQKLLGSITTKTLTNQLR ELEEQKIIKRNVYPEVPPKVEYSLTELGESIIPILKMLCEWGKEYQKAIKKLIV >gi|224461365|gb|ACDC01000037.1| GENE 25 21057 - 21860 1022 267 aa, chain + ## HITS:1 COG:CAC3337 KEGG:ns NR:ns ## COG: CAC3337 COG2159 # Protein_GI_number: 15896580 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 263 1 260 262 229 45.0 4e-60 MIIDSHEHLILPIEMQIKKLEEAGVDKAILFTTTPHPEKANTMQEFKNEMSVLFKVLSGE KNHKNDMKRMENNINDLMKVLKKYPDKFYGFGPVPLGLNLDETISWIEKYIVSKNLKGVG EFTPGNDEQVKQLEIIFQALENYNYLPIWIHTFYPVTLNGINILMEFTKKYPKVSVIFGH IGGYNWMNVIDFVKETPNAYLDLSASFSSIVVKMAIAEVPNKCLYSSDVPYGEPLLNKQM IEYLCKDEKVRKNILGGNILKLLKENI >gi|224461365|gb|ACDC01000037.1| GENE 26 21935 - 22675 829 246 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1563 NR:ns ## KEGG: Lebu_1563 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: 
not_defined # 1 246 1 251 251 270 54.0 3e-71 MNGEVAQIRDIAIYAKHALKTKSKISYKPDKYENKIEFLFTENFEAKDVSEWYEHCIEKG LEDIKLSMPIAVKDPSLLAFSNTSQAGLICYFKDNLVTYFIPKWEVKDKGWNVTYKEYRW ENPPKEKVQFEDNTEDFKNTLSKIATLADKIDFQNFANIFTEAYDMLDGKEVESYYHKKY FSLMPERNARLLCSAGISDVFGGMGSWNDSPSWYAYEKGLESEYKKLSSELLTQIRVALL YSVNEW >gi|224461365|gb|ACDC01000037.1| GENE 27 22702 - 24105 1563 467 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1552 NR:ns ## KEGG: Acfer_1552 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 56 409 10 374 506 157 29.0 1e-36 MEIFKKFGLILILVLEIIAFLKCIKKPKKTSKENLIENEKEDFENKKEDLENKNLSIFEL IEISIQPNGELPEDFKLPPKDPNGVPWADGALDGVCIYHLVENEEDIEPLKNIVFQISEG KFEEAQNNLENLDFFMISRRDSLLNWIIQEQKQINIDNLCEFTISQLSTSKNIEVIKFCL CVLEIIKLETEKDTIEKVKILALSDEFTLYCLNILKNLKNSNEEIFEIAKKVKGWGRIYS IKYLKVTNDEIKEWILEEGCHNYIIPAYTAYTCAKKINLVEILNEDKISSKKFNDISYLM NALLDEEAITGISNLEDRELLIERYLEKAKTLASAEEDYYAIITLKEYIKNNKEINNELI KICDEILNSEKTRNIVKELLKEGYGYNIAKYLGIDIDKYILEYLQDNPLKNPYIVFNISE RENMEKLVSLIEKKLTLEKLEGVPTDKFYSKNEKNKEYIFLDTIIKN >gi|224461365|gb|ACDC01000037.1| GENE 28 24171 - 24392 188 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740415|ref|ZP_04570896.1| ## NR: gi|237740415|ref|ZP_04570896.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 73 1 73 73 112 100.0 6e-24 MDEPENYIGIGENLIICALNSPYVDIRYNAVNTLESWKEKGYILSNEIIENIKKLEKLEV DEELKIKLNELLK >gi|224461365|gb|ACDC01000037.1| GENE 29 24415 - 25002 392 195 aa, chain - ## HITS:1 COG:no KEGG:FN1044 NR:ns ## KEGG: FN1044 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 6 195 53 242 242 239 78.0 5e-62 MAPIFEIKKIILFSLIISASHFFINVIKNKLEKIFPQRRLQFLFFSFNQLLHFIAIVGFY YILNIENFTSQLYIVLKDCEYFKIFILYITVFSIILDPASVLIRKLFISISPKTYPKAYS EELKAGNIIGKLERTIIAILLLNNQFGLIGFVLTAKSIARFKQMEDKNFAEKYLIGTLTS FLIVLIAILILKGLL Prediction of potential genes in microbial genomes Time: Thu May 19 23:00:49 2011 Seq name: gi|224461364|gb|ACDC01000038.1| Fusobacterium sp. 
2_1_31 cont1.38, whole genome shotgun sequence
Length of sequence - 10210 bp
Number of predicted genes - 14, with homology - 14
Number of transcription units - 2, operones - 1 average op.length - 13.0
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . - CDS 38 - 850 925 ## FN1045 hypothetical protein
- Prom 870 - 929 6.0
2 2 Op 1 . + CDS 1081 - 2235 1323 ## TDE0809 hypothetical protein
3 2 Op 2 . + CDS 2235 - 2921 292 ## FN1046 hypothetical protein
4 2 Op 3 . + CDS 2970 - 3722 646 ## FN1047 hypothetical protein
5 2 Op 4 . + CDS 3725 - 4447 413 ## FN1047 hypothetical protein
6 2 Op 5 . + CDS 4466 - 5533 868 ## FN1048 hypothetical protein
7 2 Op 6 . + CDS 5555 - 5881 540 ## FN1049 hypothetical protein
8 2 Op 7 . + CDS 5896 - 6279 669 ## COG0346 Lactoylglutathione lyase and related lyases
9 2 Op 8 . + CDS 6293 - 7075 538 ## FN1051 hypothetical protein
10 2 Op 9 . + CDS 7098 - 7814 486 ## FN1051 hypothetical protein
11 2 Op 10 . + CDS 7865 - 8203 349 ## FN1052 hypothetical protein
12 2 Op 11 . + CDS 8207 - 8623 477 ## gi|237740428|ref|ZP_04570909.1| predicted protein
13 2 Op 12 . + CDS 8646 - 9416 768 ## FN1058 hypothetical protein
14 2 Op 13 .
+ CDS 9463 - 10210 585 ## FN1058 hypothetical protein Predicted protein(s) >gi|224461364|gb|ACDC01000038.1| GENE 1 38 - 850 925 270 aa, chain - ## HITS:1 COG:no KEGG:FN1045 NR:ns ## KEGG: FN1045 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 270 1 269 269 377 78.0 1e-103 MKNYSVLMIDLKNSRSYSIQDRNNLQNSILNSIKILNKVFKNSIKKEVEFSAGDEIQGLF TSPQSAYLYYRLFSIIIFPIEIHSGIGYGTWDIVMDNESSTAQDGTVYHNARKAIDEAKK SLEYSVLFYSSNKNDLIVNSLINSCNLLAFKQSKYQNNLMLLTEILYPIVYDNTILETET LKELLEFIQFEKKENLILDTNFSIEPAQIEKESFYITEGKKRGLSTQFSKLLGVSRQSVE KAIKTGNIYDLRNLTIAILKAMDNTQGESL >gi|224461364|gb|ACDC01000038.1| GENE 2 1081 - 2235 1323 384 aa, chain + ## HITS:1 COG:no KEGG:TDE0809 NR:ns ## KEGG: TDE0809 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 380 1 377 381 352 53.0 1e-95 MKLIFELTNEQRKYLGLIPVEEHWELVKFDNGIYYYFEDDTIKKEIKVSKNYYHEAELNE KTAENRTMILPKTKRGKIKKFNYTATQSFSPFGTYFTFSTNGVIIANYTTQRTYYSEIFS EKEKISLDNLKKWLDKWMKETTEEDLEEIEEFKNAKRKHCKFNEGDFFAFKISRREWCFG RILLDVSKLKKNENFKKNKNYGLTNLMGKPLIIKVYHKISDNKNINLKELSKCLALPSQA IMDNIFYYGEAIILGNLPLKPEENDMFISVSESISGIDKNIAYLQYGLIYREIPLSDYEK LIKELKIGPQTLRREGIGFVIDTYKLKECIEAKSNSPFWEKYKKRNVPDLKNPDHIELKR KIFKAFGLDADKTYEENLKMLEVK >gi|224461364|gb|ACDC01000038.1| GENE 3 2235 - 2921 292 228 aa, chain + ## HITS:1 COG:no KEGG:FN1046 NR:ns ## KEGG: FN1046 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 69 227 4 161 245 200 74.0 3e-50 MFFLGYLVFYSALSLLIDVSIFFSILSFVLGATLYFNSNSPLQGIAICVFGLLILISSLC YHSKGKCARGSSFVNYNTSCNFLSISIATVIFSIPIWYIIVKTNIIEIKSSPLYIFIPSL LISWAILFKIVDRIFIHNRETKEVVLENYFTIHRSRRDLTHIYIFKFKNSSDLYSTGMLR QRIFIDKIGSKFSCTFGKGIFGTNYITSIKLIEDAGIDTSENQTSHSF >gi|224461364|gb|ACDC01000038.1| GENE 4 2970 - 3722 646 250 aa, chain + ## HITS:1 COG:no KEGG:FN1047 NR:ns ## KEGG: FN1047 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 17 248 1 236 238 168 47.0 2e-40 
MNENKKWGYIYDKKLDMYIPNLPRLKKFTTIIFILLILSIFSGIVLSFLDLSSYDKIKIF VYNAMLVFIFLILWIFLLINTHYTEKILQELNELEVPREFEIKALKRRIIPQIIMTVIIL ISMFAFEQKKLSFDYLFKLILVIGICAFMFYRSLRKFQNSKYSLNIKGNTVKIFYENNEK EIITTENINYVSFFALRRGKRGKERKPTLQFFDLEERILAEMTIEVIDYFRLKRYLKKYN VEIVDNYEWS >gi|224461364|gb|ACDC01000038.1| GENE 5 3725 - 4447 413 240 aa, chain + ## HITS:1 COG:no KEGG:FN1047 NR:ns ## KEGG: FN1047 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 240 1 238 238 222 64.0 1e-56 MYIPNLPRQKKFAKVLLFLSIISFVALLIQIYFFDKTSEVKILFLAGTSVIVFLFLAIYL LSKINIHLLEKRLQEIEKIELSDKFEIKSLKKNTLLFSYVILFVILIFILFFLINILLKE FTYKYIFYIIFLIGIIIFNYYNFLKELKSRKYFLTINGKTIKIYYKNNEKEFITTDNISQ VRFYVIDTGKGIGKKNPSLQIFDNEEKILVEMTISANDYYLLKKYFEKYNVRIDNRYKEF >gi|224461364|gb|ACDC01000038.1| GENE 6 4466 - 5533 868 355 aa, chain + ## HITS:1 COG:no KEGG:FN1048 NR:ns ## KEGG: FN1048 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 353 1 365 367 380 72.0 1e-104 MELLEKDEEYISSLLKQGKKVEAIAFVKNKTGMNLKEAKDYIDKKDIAISEEDEQYLASL INENKKLEAVAFLHKNKDMSLEEAKNYTDSLILKKNVKTNKKSSHKWGYIFDEKLNIYVP NLPRQKKLLKIMLSIFLVLLVTTLISLMFLDRSSDIKMIILRYSVLGTLVLIITLPLIIL NIHITENKLKKLENLEVSNQFEVKAFVSNFHLSLHVLLIIIFIIIIPIFLLKIDYKDYKG IFYFFVLIAITVAGIYELLKMLKYKKYSLNIDSREITLLYNKNEIKSIKIEKINFIKFYD KKTKRGGRTNIPSIVIFDMEKNIFIEMEVKISDYVLLKMYFEKHKIMVKDEFKKI >gi|224461364|gb|ACDC01000038.1| GENE 7 5555 - 5881 540 108 aa, chain + ## HITS:1 COG:no KEGG:FN1049 NR:ns ## KEGG: FN1049 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 108 1 108 108 149 86.0 4e-35 MKAKEFAEMCYAEKEIQLKEYMNGKESLVAKLKNDLALSSEQEKILYKLVDTVLIDTYIT LLYALDGTASLGNGKQENFKLYGEDGELVFESGELEMATYEAFYENKK >gi|224461364|gb|ACDC01000038.1| GENE 8 5896 - 6279 669 127 aa, chain + ## HITS:1 COG:FN1050 KEGG:ns NR:ns ## COG: FN1050 COG0346 # Protein_GI_number: 19704385 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # 
Organism: Fusobacterium nucleatum # 1 127 1 127 127 226 93.0 8e-60 MYIEHIAMYVNDLEKTKEFFIKYFGASSNEIYHNKKTDFKSYFLTFDSGCRLEIMTKPEL VDDIKDLKRTGFIHIAFSVGSKEKVDELTEILKIDGYEVISGPRTTGDGYYESCIVGIEG NQIEITV >gi|224461364|gb|ACDC01000038.1| GENE 9 6293 - 7075 538 260 aa, chain + ## HITS:1 COG:no KEGG:FN1051 NR:ns ## KEGG: FN1051 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 251 1 235 239 105 34.0 1e-21 MNKDFDNNIPIITKKKEISKVILVLFFFFLAFLGFGKYYKAPIPLLIGIALLLFILLLYL FILLSDIYFIKKMIKYRDNTVVPDEFQIRPPKHIFMLIIAIILLAYLIVYIIPKSIMTLD KNIFKLIIVLIIFFIGIYRIYKAGRYSIDVMEKNIRILFKNQEISSFNVENVAFVKFSGV KNKVNVYLLGIIRFIFCKHRYHFRSVDESRSSPFMRLFDFEGKEFFKISLSIKDYWVIKK YFLKYNVKTEDISDFLNDDL >gi|224461364|gb|ACDC01000038.1| GENE 10 7098 - 7814 486 238 aa, chain + ## HITS:1 COG:no KEGG:FN1051 NR:ns ## KEGG: FN1051 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 236 1 237 239 238 66.0 1e-61 MNKDFNNNIPDFARRKKALKIILSLLVISLIILAVQLFYIDDLSYLQGNLTIINSFIAFI LLFLYITLTADMYVTIKRIKEREKVEVPNEFRVDAFKQTYFIILYTIILIIFIFIFVLSI VFKIGIFGIIFSILGIGIFSYFLSIMIKSRKYSLEVRNRNIKVLYKNQEIEVLEIKDIPF VAFFGSGKEKVKKGDYPIMEICNIKGEILRIPLSLRNYWLMKKYFLKCAVEISDTYEN >gi|224461364|gb|ACDC01000038.1| GENE 11 7865 - 8203 349 112 aa, chain + ## HITS:1 COG:no KEGG:FN1052 NR:ns ## KEGG: FN1052 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 111 46 155 155 143 75.0 2e-33 MENLSQYEFESTQQNANNKKFRFREYLYSGDYVEVIKEFKDYYGFTHQVGEKFYFACVYF LPHEDGYTLYISKDKINISNIFLQDRAETQKEICYNLKEYFKVIEQGRFKRD >gi|224461364|gb|ACDC01000038.1| GENE 12 8207 - 8623 477 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740428|ref|ZP_04570909.1| ## NR: gi|237740428|ref|ZP_04570909.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 138 1 138 138 269 100.0 4e-71 MNIELAKELLSFHSCRNENIDDPRWENGFLGILRPFQGELNEKNFIEVMECLKVLVPEIQ KENIDKNIVSDIMNIIHFTRNWVSEGGMLTRNNKVTTEQTKYLLAWSDIIETCFIYLLED ASDIAFTDYTAYCNNEYF >gi|224461364|gb|ACDC01000038.1| GENE 13 8646 - 9416 768 256 aa, chain + ## HITS:1 COG:no KEGG:FN1058 NR:ns ## KEGG: FN1058 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 34 251 34 251 255 177 51.0 3e-43 MEVLVYIVLMPFLFIFLFVMAYLFRKRKVKKILFSEFDEGEKDLETREFFNRIFKLERLS KPFFYAQVIFLIIDTLFILFGGYKTYLEEVEFVKEFPRIIMSPLSPPLIKFMIPIVMWVF VFFSFIYVMILKNKENRRIAEMLDNLEKVKHLKFAKEDFLRSDRILATGVVGGDIKLGDR YLFSFYPVCIIPYIFIQKMKVKISRRGKNGIIYHLDITLKRPFQNIKIDFAKEDIAEKVR EFSLERKKDLNEKIEY >gi|224461364|gb|ACDC01000038.1| GENE 14 9463 - 10210 585 249 aa, chain + ## HITS:1 COG:no KEGG:FN1058 NR:ns ## KEGG: FN1058 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 34 246 34 245 255 208 61.0 1e-52 MGDLAYMMFMPFLFFFLFFLAYLFRKRKVEKILFAELNESEKDLEARDFFYNILKMERSA KSFYYLEVIFLIINTFFILFGGYKTYLEEVEFIKEFPSFSISPLSSVLIKFMAPIIMWVL VFFLLIFAMIMKKKENKRITEMLDNLEKVKHLKFAKEDFLKSDRILATGVVSMSDIKLGD RYLFSVYPAYIIPYIYIQKIKVERFYRRHGGSIYYLDITLKRTFQNIKIYFAKEDVAEKV KEFILEKIK Prediction of potential genes in microbial genomes Time: Thu May 19 23:01:39 2011 Seq name: gi|224461363|gb|ACDC01000039.1| Fusobacterium sp. 2_1_31 cont1.39, whole genome shotgun sequence Length of sequence - 2734 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 53 - 430 360 ## FN1054 hypothetical protein 2 1 Op 2 . + CDS 457 - 1191 531 ## FN1058 hypothetical protein + Prom 1212 - 1271 9.6 3 2 Tu 1 . 
+ CDS 1300 - 2586 1608 ## COG1114 Branched-chain amino acid permeases Predicted protein(s) >gi|224461363|gb|ACDC01000039.1| GENE 1 53 - 430 360 125 aa, chain + ## HITS:1 COG:no KEGG:FN1054 NR:ns ## KEGG: FN1054 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 5 121 7 124 125 131 71.0 7e-30 MYTSMIFIMIIIMIIMLVVTISLLKRKNWECFYIENEILYLPSLFVKEIPLSNIRNIEFE TFHSRGSYSGIIRVYQKDAKVVKRYFQTSQMAFFVSKEMVLAEIEKITPLLKKYYIPYTI NDRKY >gi|224461363|gb|ACDC01000039.1| GENE 2 457 - 1191 531 244 aa, chain + ## HITS:1 COG:no KEGG:FN1058 NR:ns ## KEGG: FN1058 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 34 241 34 245 255 149 49.0 8e-35 MDLMFTTFFIFILFIFLMCSVFIIRKRLIKKAMLLEIDEDLKNLAPKEFFLNILKREKFS KIINYVEFFFFLLVTILIVFQGYQEYILSKEESDSSINLISFILDKFKIPIFIWFVVSTS LLLALLIKKRENKRIYEMLDSLEQSKLLKSAQVDFMIPNKIVERGLLGNDIKFGSKFLFV VYPGYIIPYCWLDDVKVEKISGRYGSKSHYVNIILKTSSKPINITFAKKEICEKIRELLL KKIK >gi|224461363|gb|ACDC01000039.1| GENE 3 1300 - 2586 1608 428 aa, chain + ## HITS:1 COG:FN1059 KEGG:ns NR:ns ## COG: FN1059 COG1114 # Protein_GI_number: 19704394 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Fusobacterium nucleatum # 1 425 1 425 425 617 88.0 1e-176 MYRTKDVLLTGFALFAMLFGAGNLIFPPMLGYETNSSWIMTMLAFTITGVGFPFLGILSV SIAGNGIKDFANRVSPKFSIIFAIISILAIGPMLAIPRTGATAYEITFLYNGMDSSIYKY IYLIAYFGIVILFSLRANKVIDRVGKILTPILLILLFLIIVKGAFFTDLSVKPDIYPHAF KRGFLEGYQTMDTIASIAYAGIILTAIKSGRTLTQKQEFSFLVKSGLVAIVSLALIYGGF AFVGAKMHSVLNTQDKIKLLVRTTSYLLGSYGNLVLAVCVAGACLTTAIGLVATVGEFFS SITSFKYEKIVIFTVLISFALSVLGVESIIRISVPILIFIYPVTISLILLNLFGKYIKND YVYKGVVFFTGIVGLIESLDSLGIENYYTKSVLEILPFSDYGLTWLFPGLIGYILCSLIF RRTEKKED Prediction of potential genes in microbial genomes Time: Thu May 19 23:01:52 2011 Seq name: gi|224461362|gb|ACDC01000040.1| Fusobacterium sp. 
2_1_31 cont1.40, whole genome shotgun sequence
Length of sequence - 25450 bp
Number of predicted genes - 23, with homology - 23
Number of transcription units - 6, operones - 4 average op.length - 5.2
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . - CDS 1 - 757 1098 ## COG4884 Uncharacterized protein conserved in bacteria
- Prom 895 - 954 9.9
+ Prom 809 - 868 12.1
2 2 Op 1 1/0.000 + CDS 892 - 1479 777 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism
3 2 Op 2 10/0.000 + CDS 1494 - 3611 1215 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1
4 2 Op 3 . + CDS 3625 - 4071 666 ## COG0691 tmRNA-binding protein
5 2 Op 4 . + CDS 4141 - 7644 5004 ## FN0610 hypothetical protein
+ Prom 7770 - 7829 8.2
6 3 Op 1 . + CDS 7862 - 9775 2720 ## COG0441 Threonyl-tRNA synthetase
7 3 Op 2 . + CDS 9787 - 10302 792 ## FN0612 hypothetical protein
+ Term 10310 - 10353 6.0
- Term 10300 - 10339 -0.8
8 4 Tu 1 . - CDS 10352 - 10861 760 ## FN0691 hypothetical protein
- Prom 10893 - 10952 14.1
+ Prom 10837 - 10896 12.2
9 5 Op 1 1/0.000 + CDS 11019 - 12470 2025 ## COG2067 Long-chain fatty acid transport protein
10 5 Op 2 . + CDS 12489 - 13061 788 ## COG1309 Transcriptional regulator
+ Term 13073 - 13113 5.1
+ Prom 13136 - 13195 8.2
11 6 Op 1 . + CDS 13238 - 14011 661 ## FN0760 hypothetical protein
12 6 Op 2 1/0.000 + CDS 14008 - 14586 660 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation
13 6 Op 3 1/0.000 + CDS 14602 - 15648 1444 ## COG1077 Actin-like ATPase involved in cell morphogenesis
14 6 Op 4 12/0.000 + CDS 15664 - 16209 581 ## COG1386 Predicted transcriptional regulator containing the HTH domain
15 6 Op 5 1/0.000 + CDS 16199 - 16903 624 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases
16 6 Op 6 31/0.000 + CDS 16913 - 17203 380 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit
17 6 Op 7 21/0.000 + CDS 17212 - 18666 449 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4
18 6 Op 8 1/0.000 + CDS 18682 - 20127 2062 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog)
19 6 Op 9 1/0.000 + CDS 20171 - 21121 1009 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
20 6 Op 10 . + CDS 21138 - 22148 1235 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D
21 6 Op 11 . + CDS 22162 - 22902 744 ## FN0750 hypothetical protein
22 6 Op 12 . + CDS 22899 - 24233 1460 ## FN0749 hypothetical protein
23 6 Op 13 .
+ CDS 24246 - 25424 734 ## COG0658 Predicted membrane metal-binding protein Predicted protein(s) >gi|224461362|gb|ACDC01000040.1| GENE 1 1 - 757 1098 252 aa, chain - ## HITS:1 COG:FN1060_2 KEGG:ns NR:ns ## COG: FN1060_2 COG4884 # Protein_GI_number: 19704395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 75 251 1 177 180 281 87.0 7e-76 MIQALHFKDEKSDKFWFIETLDCELMVNYGKTGVTGKYEIKEFDTVEECEKEALKLINSK KKKGYQDFPEFDRDNHYYFDDEECGLHILTSHINFRKYFTDEFYYDCGDEEAPFGSDEGN DALYELQEAIQNKKKINFFDFPKVIIEKIWEMDYISPDVEKTDEELKEQAKTKFNGLLGD QIILQSDQVILAVTFGQAKITGKIDNDLLELALKSLTRIDRLNRLIWNWDKEEATYYIET MRKDLIKFKEDC >gi|224461362|gb|ACDC01000040.1| GENE 2 892 - 1479 777 195 aa, chain + ## HITS:1 COG:FN0607 KEGG:ns NR:ns ## COG: FN0607 COG1713 # Protein_GI_number: 19703942 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Fusobacterium nucleatum # 1 179 1 179 193 266 83.0 1e-71 MKYNFNQLKEIVKSKMSLKRFTHTLGVVEMAGKLAEINKADVEKCKLAALLHDICKEMDM EDIKNICKNNFLNELSDEDLENNEILHGFAGAYYVNKEFGIEDSEVLNAIKYHTIGSKDM TLVEKIIYIADAIEYGRNYPSVTEIREETFKNLNKGILMEIEHKEKYLESIGKKSHPNTS QLKENILTELSKTYL >gi|224461362|gb|ACDC01000040.1| GENE 3 1494 - 3611 1215 705 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 1 696 1 697 730 472 38 1e-133 MNLEKDLEKIKEILKTVKYLSFDQITSLLEWSPKKRKDNKAIILSWVDAGELLLDKKNRI TAIEDSSLYAKGIFRIIKNKFGFVDSENSEERNGIYIARENFNSALDGDRVLVKITNEGY DSKGRPGAEGEIIKIIERRKNTVVGILEKNKKFSFVLPTSAFGSDIYIPNSQVGNADNKD IVVAEITFWGDENRKPEGKIIKILGSSTNSKNMIEALIYREGLSEHFSDEAMQEVREVIK KKIDYTDRKDLTELPIITIDGADAKDLDDAVYVEKLKNSNYRLIVAIADVSYYVKKDSTL DLEARNRGNSVYLVDRVLPMFPKEISNGICSLNEREDKATFACEMEIDLKGDVVNYEVYK SVIKSVHRMTYKDVNAILDGDEKLIDKYSDIYEMLKEMLELSKILRNKKHTRGSIDFELP ELKVVLDEENNKVKEVLLRERGEGEKIIEDFMIAANETVAERIYWLELASIYRTHEKPDR 
EKVFKLNEMLAKFGYKIPNFDNLHPKQFQEIIERSKNQETSMLVHKTILTSLKQARYTVD DIGHFGLASSHYTHFTSPIRRYADLMVHRVLFSSINNSIKQLKLSDLDEIAHHISKTERV AMKVEDESVRIKLVEYMKKYVGKELELMVTGFASRKVFFETSEHIECSWDVTISGNFYNF DEENYCMIDYYNGTVFSLGEKVKALVEKADLLTLEIGVVPLKDIF >gi|224461362|gb|ACDC01000040.1| GENE 4 3625 - 4071 666 148 aa, chain + ## HITS:1 COG:FN0609 KEGG:ns NR:ns ## COG: FN0609 COG0691 # Protein_GI_number: 19703944 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Fusobacterium nucleatum # 1 148 1 148 148 231 95.0 3e-61 MIIANNKKAFFDYFIEEKYEAGIELKGSEVKSIKAGKVSIKESFVRIINDEIFIMGMSVV PWEYGSIYNPEERRVRKLLLHRKEIKKIHEKVKIKGYTIVPLDVHLSKGYVKVQIAIAKG KKTYDKRESIAKKDQERNLKRDIKINNR >gi|224461362|gb|ACDC01000040.1| GENE 5 4141 - 7644 5004 1167 aa, chain + ## HITS:1 COG:no KEGG:FN0610 NR:ns ## KEGG: FN0610 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 14 1167 1 1155 1155 1715 79.0 0 MRRKFMFLFILSAMIVNNAYSETITADSDEVVIDLNDNTLTSDHGVAVTNGNMKGLFYKF RRNPETGEISFEDNAIMNIAQPTGNIKIETEGGKISQANEEGEFYNSFAYVNVAKMTGAE APNDKIYFGSPLIKYSDEKINAKDAWVTTDFNIVNFQKEPEKAGYHIFSSDVLIEPDKQI TLKKSDLFIKGKDVMPFNFPWFRANIRSGSTVPLFVTIQSSDDYGAATSMGFLYGNRKDK FRGGFAPKFADKMGILVGRWENWYKFDKIGETRLNIDDWLIYAKNKEKPTASNELPEYEK RRKRYKVELSHDYEGDNGNFHFLSQNSTRSMVGSLADVMEKFDDNNVYNSLGLDRYKFDK NIGFYTLDSNLYNLGEKKDISFSGKMSLVSDKKAYGLLVYDSIDDISYGSTIDHDLYTNL SLTKDNDKYKFNARYDYLYDMDPGSTASDLMARNERIGADLLLKENGASISYDKRKGDDY RRFSFWEEDINTSAKKRNILGIDFSYTPTTVAKYDYNNFENIKLSLGNYKMGRYTFTPTF AYNFLDRKLDEARDTYRKIVLGDNRLAEFNRFENTIYENTLERRADLNLYNDNEIYRVGF GKNNSEIWSRDGLFDGTYRSYENKSSFYELELGRKNIELADKGTLGLDATFRHDEFDASS DKTNLLNLKLDNDLFLYKGTGLDVTNKFRIEGQKYSFSGNKNNEERRLINKSDFLKFDDT LIFDGKSTVTTYNIGYKTSKNPYGTKSKNGEELHTGLNIKFDEDTNLDFKYSDDKRYTTK TRSEKKVNDLSTRQYSVKFETKKYDLGFSNTDIDFVGDDFYTTNNFREDINEHRITGGYK FDNSRLSFTYAQGRDKLKVDGGGYLNRKNRMYSATYNIYGDVEQDFTAAYKTYRYGNTRI EDDIRNTDTYSFAYAYRDKRFEKEELMKYATLEYEKPENEITANDIDQIRAILDRKSDFY 
NQFELTRIKDETFRIGNYKKAFNFYVNIERNNKRYSQTGNLRNSMSKFTGGLTYTYNRVG IGYQFTEKASWKNSSGNYYWGKDSKEHEFSLFAKIGKPSQGWKVKTYAMFYENKNDTTGT RYRKKSLDSLGIEIGKEMGFYEWAVSYENRYKASSKDYEWRVGVHFTLLTFPNNSLLGVG AKNTGGNTSTRPDGYLLDRPSQLKNSY >gi|224461362|gb|ACDC01000040.1| GENE 6 7862 - 9775 2720 637 aa, chain + ## HITS:1 COG:FN0611 KEGG:ns NR:ns ## COG: FN0611 COG0441 # Protein_GI_number: 19703946 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 18 637 1 620 620 1218 95.0 0 MLVKYNGENKEYDSNINMFEIAKGISNSLAKKSVGAKVDGKNIDMSYVLDHDAEVEFIDI DSPEGEDIVRHSTAHLMAQAVLRLYPETKVTIGPVIENGFYYDFDPVEQFTEEDLEKIEA EMKRIVKENIKLEKYVLPRDEAIDYFRDVDKNKYKVEIVEGIPQGEQVSFYKQGDFTDLC RGTHVPSTGYLKAFKLRTVAGAYWRGNSKNKMLQRIYGYSFSNEDRLKKHLKFMEEAEKR DHRKLGKDLELFFISEYGPGFPFFLPKGMVFRNVLIDLWRKEHEKAGYLQLETPIMLNKE LWEISGHWFNYRENMYTSEIDELEFAIKPMNCPGGVLSFKHQLHSYKDLPARLAELGKVH RHEFSGALHGLMRVRSFTQDDSHIFMTPDQVQDEIIGVVNLIDRFYSKLFGFEYEIELST KPEKAIGSQEIWDMAESALAGALDKLGRKYKINPGDGAFYGPKLDFKIKDAIGRMWQCGT IQLDFNLPERFDVTYIGEDGEKHRPVMLHRVIYGSIERFIGILIEHYAGAFPMWLAPVQV KVLTLNDECIPYAKEIMDKLQELGIRAELDDRNETIGYKIREANGRYKIPMQLIIGKNEV ENKEVNIRRFGSKDQFSKSLDDFYDYVVDEAAIKFDK >gi|224461362|gb|ACDC01000040.1| GENE 7 9787 - 10302 792 171 aa, chain + ## HITS:1 COG:no KEGG:FN0612 NR:ns ## KEGG: FN0612 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 171 1 166 166 180 59.0 2e-44 MKKILLMLLLCLAVVSCGKKDEVTDEVTEASTTQAQDYGVPNPFEIVDTLDEAAKIAGFS LEAPTEYADYKSIVIQAIADDMIEVIYFDAEKTHEGLRIRKAVGTDDISGDYNEYKEENV VKVGELEVTEKGNDGNISVASWTDGTHSYSINVDEALLNADDIAKLVETIK >gi|224461362|gb|ACDC01000040.1| GENE 8 10352 - 10861 760 169 aa, chain - ## HITS:1 COG:no KEGG:FN0691 NR:ns ## KEGG: FN0691 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 169 5 181 181 132 50.0 5e-30 MYKLFLSIFLIVSFSLFALESNEIKEVIPIHNENIATEEETLVEEVEKPKNNYNSEELKA SNVISTNKDLKENKKEEYASDYKSEEISDKTSRVTALGSAMGAVDLGKIEERKFRIGAGV 
GSSNNNQAVAVGVGYAPTDRFRVNTKFSTSSTSKRASAISIGASVDLDW >gi|224461362|gb|ACDC01000040.1| GENE 9 11019 - 12470 2025 483 aa, chain + ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 211 483 1 273 273 420 81.0 1e-117 MKKLLFLIGILSSGLYGASIDHIQTYSPDYLSNQAQTGMIDEVSPYYNPAGLSRLDKGKY IHLGLQFANGHEKMSYKGKEHKAHLTQLIPNVSLTSVDDNGAYFFTFGGLAGGGKLEYDG VSGIDVLSDLDQFKPLGVYDKGSTLTGKNLYEQATLGRAFTINDQLSVSVAGRIVHGSRN LSGTLNIGTNPTTAYKQAKARQVAQEVSKAVDAATQGSGLSAAQIAAIKQQKTTQALTLL QTKMNALQQNGLSGDLDSKREAWGYGFQLGINYKVNDKLNLAARYDSRIKMNFKAKGSEN QLQTADIIGSNIGLSTFYPQYAINSKIRRDLPAILSVGASYKVTDNYLVSTSANYYFNRH AKMDRVTTFGGHEHGRDYKNGWEIALGNEYKLNDKFTLIGSVNYARTGAKNSSFNDTEYA LNSVTLGAGLRYKYDETLSITGSVAHFIYDKEDGNFKEKYKVNDNQKYHKEITAFGLSVT KKF >gi|224461362|gb|ACDC01000040.1| GENE 10 12489 - 13061 788 190 aa, chain + ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 188 1 188 188 251 70.0 4e-67 MPKKVLFSKEVILDTAFRLFKEEGYDAISARNVAKALDSSPAPIYKSIGSMEVLKAELVA RTKKLFIEYLLKERTGIKLFDIGMGVCVFAREEKQLFLQIFSRHTVKSPLIDEFLNVIRE ELKTDERIISIDKDKQEELLHTCWVFAHGLSTLIAIDFFKDSSDEFIERSLKNGPARLFY EYLSRYSKKQ >gi|224461362|gb|ACDC01000040.1| GENE 11 13238 - 14011 661 257 aa, chain + ## HITS:1 COG:no KEGG:FN0760 NR:ns ## KEGG: FN0760 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 256 1 270 270 202 49.0 1e-50 MKRIKYVYFYFLFLLYLIVGFFGDVPLIKKGIYENLYNYMGLMLIPTLLFFVLYGFVFML ENKKKRFFWELRLYYIYILFFIIAYIFILANLGINIGTAPGFEINADFIRNLINKSLFEY KIGYLPTYLLYEFINLSLRFKQIPFHYFYYGLYGLAFFLFLLMVFGPLIRSINRAKEKRK SERKRAGMNSGLMEQFEIQEKLEKGEKLSTVKSERKTSSPQKKNKKAKIENEVKVKEKID KKGIVFRRTVTIEDEEE >gi|224461362|gb|ACDC01000040.1| GENE 12 14008 - 14586 660 192 aa, chain + ## HITS:1 COG:FN0759 KEGG:ns NR:ns ## 
COG: FN0759 COG0424 # Protein_GI_number: 19704094 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Fusobacterium nucleatum # 1 192 1 192 192 283 79.0 2e-76 MILASNSKRRQEILRDMGFNFKVLTSDIEEISDKEEISEMILDIAEKKLDKIAKENVNEF VLAADTVVELEGRIFGKPKSREEAESFLKILSGKTHKVITAYVLKNISKNIIIKDVVISK VKFFDLDDETINWYLDTSEPFDKAGAYGIQGQGRALVEKIEGDYFAIMGFPISNFLKNLR KNGYKISQIDRI >gi|224461362|gb|ACDC01000040.1| GENE 13 14602 - 15648 1444 348 aa, chain + ## HITS:1 COG:FN0758 KEGG:ns NR:ns ## COG: FN0758 COG1077 # Protein_GI_number: 19704093 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 1 348 6 353 353 591 91.0 1e-169 MKKFMGNILGVFSDDLGIDLGTSNTLIYMKNKGIILREPSVVTISSKTKELFEVGEKAKH MIGRTPNIYETIRPLRNGVIADYEVTEKMLRCFYKRIKSGTIFNKPRVIICVPAGITQVE KRAVIEVTREAGAREAYLIEEPMASAIGVGINIFEPEGSMIVDIGGGTSELAVVSLGGVV KKSSFRVAGDRFDMAIVDYVRQKHNLLIGEKSAEDIKIKIGTVDPEEEELQIEVSGKYVL NGLPKDITLTSSELIETLSALVQEIIEEIRVIFEKTPPELAADIKKKGIYISGGGALLRG IDKKIASGLNLKVTVAEDPLNAVINGIGVLLNDFSTYSRVLVSTETEY >gi|224461362|gb|ACDC01000040.1| GENE 14 15664 - 16209 581 181 aa, chain + ## HITS:1 COG:FN0757 KEGG:ns NR:ns ## COG: FN0757 COG1386 # Protein_GI_number: 19704092 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Fusobacterium nucleatum # 1 181 1 181 181 283 87.0 1e-76 MSIKNQVEAIIFLGGDENKIKDLARFFKISVEDMLKIILELKDDRKDSGINIEVDADLVY LATNPIYGEVINSYFEQETKPKKLSSASIETLSIIAYKQPITKSEIESIRGVSVDRIISN LEERKFVRNCGRQESGRKANLYEVTDKFLSYLGIRDIRELPDYDLFKDKIKDMENISTDE N >gi|224461362|gb|ACDC01000040.1| GENE 15 16199 - 16903 624 234 aa, chain + ## HITS:1 COG:FN0756 KEGG:ns NR:ns ## COG: FN0756 COG1187 # Protein_GI_number: 19704091 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate 
synthase and related pseudouridylate synthases # Organism: Fusobacterium nucleatum # 1 234 1 234 234 360 88.0 2e-99 MRINKFLSSLGIASRRAIDKYIEEGRIKVNGVITSTGIDVTEDDEIYIDNKKIETKRIEE KVYFMLNKPLEVLSASSDDRGRKTVVDLIKTDKRIFPIGRLDYMTSGLILLTNDGELFNR LVHPKSEIYKKYYIKVFGEVKKEEIEELKKGVLLEDGKTLPAKISGIKYDKNKTSMYISI REGRNRQIRRMIEKFGYKVLMLRREKIGELSLGDLKEGKYRELTKEEIEYLYSV >gi|224461362|gb|ACDC01000040.1| GENE 16 16913 - 17203 380 96 aa, chain + ## HITS:1 COG:FN0755 KEGG:ns NR:ns ## COG: FN0755 COG0721 # Protein_GI_number: 19704090 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Fusobacterium nucleatum # 1 96 1 96 96 117 86.0 5e-27 MSLTKEEVLKIAKLSKLSFEEAEIEKFQVELNDILKYIDMLNEVDTSEVQPLVHINDVVN NFREKEEKASIEIEKVLLNAPESAENAIVVPKVVGE >gi|224461362|gb|ACDC01000040.1| GENE 17 17212 - 18666 449 484 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 20 483 21 463 468 177 31 6e-44 MFIYELTAKELRDKFLSGEISAEEIVNSFYERIEKIEDKVKSFVSLRKELALEEAKKLDE KRKNGEKLGKLAGIPLAIKDNILMEGQKSTSCSKILENYVGIYDATVVKKLKEEDAIILG VTNMDEFAMGSTTKTSYHHKTANPWDLDRVPGGSSGGAAASVAAQEVPISLGSDTGGSVR QPASFCGVVGLKPTYGRVSRYGLMAFASSLDQIGTLAKTVEDVAICMNVIAGADDYDATV SKNEVPDYTEFLNKDIKGLKVGLPKEYFIEGLNPEIKKIVDNSVNALKELGAEIVEVSLP HTKYAVPTYYVLAPAEASSNLARFDGIRYGYRAKDYTDLESLYVKTRTEGFGAEVKRRIM MGTYVLSAGFFDAYFKKAQKVRNLIKQDFENVLAKVDVILTPVAPSVAFKLSDVKTPIEL YLEDIFTISANLAGIPAISLPGGLLDNLPVGVQFMGRPFDEGTLIKVSSALENKIGRLNL PKLD >gi|224461362|gb|ACDC01000040.1| GENE 18 18682 - 20127 2062 481 aa, chain + ## HITS:1 COG:FN0753 KEGG:ns NR:ns ## COG: FN0753 COG0064 # Protein_GI_number: 19704088 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Fusobacterium nucleatum # 1 480 1 480 481 827 91.0 0 MIKEWESVIGLEVHLQLKTGTKVWCGCKSDYDETGINTHVCPICLGHPGALPKLNKKVVD 
YAVKAALALNCQINNESAFDRKNYFYPDAPKNYQITQFEKSYAEKGYIEFKLNSGREVKI GITKVQIEEDTAKAIHGKNESYLNFNRASIPLIEIISEPDMRNSEEAYEYLNTLKNIIKY TKVSDVSMETGSLRCDANISVMEKGSKVFGTRVEVKNLNSFKAVARAIDYEIARQIELIE NGGKVDQETRLWDEENQITRVMRSKEEAMDYRYFNEPDLLKLLITDEEIEEIKKDMPETR LAKVERFKNAYSIDEKDALILTEEMELSDYFEEVVKVSNNPKLSSNWILTEVLRVLKHQN IDIEKFTISSGNLAKIITLIDKNIISSKIAKELFEIALTDNRDPEVIVKEKGMVQLSDSS EIEKMVEEVLANNQKMIEDYKAADEGRKPRVLKGIVGQVMKLSKGKANPEIVNELIMSKL N >gi|224461362|gb|ACDC01000040.1| GENE 19 20171 - 21121 1009 316 aa, chain + ## HITS:1 COG:FN0752 KEGG:ns NR:ns ## COG: FN0752 COG0596 # Protein_GI_number: 19704087 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Fusobacterium nucleatum # 1 313 1 313 319 566 87.0 1e-161 MENYDFYPAIEPFKSYMLQVSDIHSIYVEECGNPNGEPIIFLHGGPGAGCGKKARRFFDP EYYHIILFDQRGCGRSLPFVELKENNIFYSVEDMEKIRLHIGIDKWTIFAGSYGSTLGLT YAIHHPERVKRMVLQGIFLANESDVKWYFQEGISEIYPAEFKVFKDFIPKEEQDDLLKAY HKRFFSDDIKLRNEAIKIWSRFELRTMESEYTWSLEEDIQNFEISLALIEAHYFYNKMFW EDRNYILNRVDKIKDIPIQIAHGRLDFNTRVSSAYRLSEKLNNCEFVIVESVGHSPFTEK MSKVLIKFLEDNKNSN >gi|224461362|gb|ACDC01000040.1| GENE 20 21138 - 22148 1235 336 aa, chain + ## HITS:1 COG:FN0751 KEGG:ns NR:ns ## COG: FN0751 COG0252 # Protein_GI_number: 19704086 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Fusobacterium nucleatum # 1 336 1 336 336 607 92.0 1e-173 MENKVLIINTGGTIGMVGKPLRPAYNWAEITKGYSVLEKFPTDYYQFEKLIDSSDVTTDF WVKLAEVIEENYDKYLGFVILHGTDTMAYTGSMLSFLLKNLAKPVVLTGAQAPMVNPRSD GLQNLINSIYIAGHKLFDIPLIPEVTICFRDSLMRANRSKKTDSNNYYGFSSPNYQPLAE IATEIKVIKDRILKLPTEKFYVEKNIDANVLLLELFPGLNPNYISSFIESNKNIKALILK TYGSGNTPTSEDFINTLKSIVEKGIPILDITQCISGSVRMPLYESTDKLSKLGIINGSDI TSEAGLTKMMYLLGKKLNLKEIKEAFSTSICGEQTV >gi|224461362|gb|ACDC01000040.1| GENE 21 22162 - 22902 744 246 aa, chain + ## HITS:1 COG:no KEGG:FN0750 NR:ns ## KEGG: 
FN0750 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 239 1 242 243 219 57.0 1e-55 MKKWILLILAIILLGVFSLIRSCQRSSREVINVYTDKEIEYFIGKFAKKFERNENKVQVK INDLKNMSDYDIIITDEKENVKNLKKDFKSKDLFKDELVVIGRRRIENISQVVNSSIAIP NYKTNIGKTGLDILAKLDNFSEISKKIEYKDDAISSLQSVDLYEVDYAFIPRNSLAFAKN SEICYRFPPTMEANKILYRIYLDTNSSDNSKNFYNFLEEEFAEKVQEKPKSEKNKAIITK DVEGKS >gi|224461362|gb|ACDC01000040.1| GENE 22 22899 - 24233 1460 444 aa, chain + ## HITS:1 COG:no KEGG:FN0749 NR:ns ## KEGG: FN0749 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 440 1 405 405 369 56.0 1e-100 MKKTLIFLSLLIISSLSFSEGTNLESINQNTTITTETDIKPQKVILDVKSVYDSLNIKGK IDYSIFQKAYLGYVQIPNKNPGVLVIIDYTKPSNEERFYVLDLNKKKLVYSTRVAHSKNS GLEIPLEFSDDPNSYQSSLGFFLTLGEYNGAYGYSLRLKGLEENINANAESRAIVIHGGD IVNDEYIKKYGFAGRSLGCPVLPAALTKEIVNYIKHGRVLFIYGNDEEYIEESYYLSKLA PVFEGKPQNIVELEKPRETTKVVTTTSPTSDSKVPTALNTPSASTPTVENPDQKNISIML DVIKKEAEYKQHLSFRKSEKFVDYFAVMKDVVEDSNTPKEPEAIVTNTKIEDSKNTEILE KSKKEDIKQEEVKQEEVKTEELKKEEPKKEDVKQEEIKKEEVKTEELKKEEPKKEEVKKV NRKYSEEVIRKSLGLGVKLKSKTK >gi|224461362|gb|ACDC01000040.1| GENE 23 24246 - 25424 734 392 aa, chain + ## HITS:1 COG:FN0223 KEGG:ns NR:ns ## COG: FN0223 COG0658 # Protein_GI_number: 19703568 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Fusobacterium nucleatum # 15 386 1 372 378 442 75.0 1e-124 MKKMFLLTFLAVVLMLRIATGVRITEIFQKEVYRMSFNLVDGKAKDLKINNKYPLKKIYG KLGYKEDGKYKGYFLVKSIKKYKDVYFIELEDIKSEKIENNFLENYLQVLFNRAEEGYLY EIKNLNRAILLGDNSRIKKSLQEKIRYIGLSHVFAMSGLHIGLVIAIFYFILKRIIKNKI VLEVSLILLLSLYYFSIKESPSFTRAYIMALVYLLGKLVYEKIDLGKSLIISAYLSILIK PTVVFSLSFQLSYGAMIAIIYIFPYVRKINYKKIKVLDYFLFTTTIQIFLIPITVYYFST IQFLSLISNLILLPLASFYISINYIALFLENFYLSFLLKPIIKISYNFLIYLIDFFSKFS YLSIEYENQKLIYIYSLVIILILIHKKSLLKK Prediction of potential genes in microbial genomes Time: Thu May 19 23:02:28 2011 Seq name: gi|224461361|gb|ACDC01000041.1| Fusobacterium sp. 
2_1_31 cont1.41, whole genome shotgun sequence Length of sequence - 5544 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 1302 1992 ## COG2873 O-acetylhomoserine sulfhydrylase - Prom 1482 - 1541 8.8 - Term 1338 - 1389 3.8 2 2 Op 1 11/0.000 - CDS 1555 - 2286 643 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 2306 - 2336 3.6 3 2 Op 2 . - CDS 2348 - 4579 3405 ## COG1882 Pyruvate-formate lyase - Prom 4752 - 4811 17.6 + Prom 4632 - 4691 13.7 4 3 Tu 1 . + CDS 4898 - 5543 942 ## COG0760 Parvulin-like peptidyl-prolyl isomerase Predicted protein(s) >gi|224461361|gb|ACDC01000041.1| GENE 1 16 - 1302 1992 428 aa, chain - ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 8 425 1 418 422 418 48.0 1e-116 MSADLKNLEIETQLVQSLEEFEEGESRTVPLVQSTTFNYTNPDTLAELFDLKKLGYFYSR LSNPTVAAFENKMAILEKGVGALAFASGQAAITAAILTICKAGDHIVAVSTLYGGTITLL ASTLKNYGIETTFLNPEASEEEFKAAFRENTKILYGETLGNPEMNTLDFEKIVKVAKEKD VPTIVDNTLASPYLCNPISYGVNIVVHSATKYIDGQGSVLGGVIVDGGNYNWDNGKFPML VEPDASYHNMSYYKTFGNLAYIIKARANILRDMGAALSPFNAFILLRGLETLHLRMERHS ENALALAIALEKNPNITWVKYSKLPSHYSYKNAEKYLTKGGSGVILVGVKGGREGAEKFI KGLEWIRAVVHVGDSRTCVLHPASTTHRQLSEEDLIKCGVLPEAVRINVGIENINDIIAD IEQALAKI >gi|224461361|gb|ACDC01000041.1| GENE 2 1555 - 2286 643 243 aa, chain - ## HITS:1 COG:FN0261 KEGG:ns NR:ns ## COG: FN0261 COG1180 # Protein_GI_number: 19703606 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Fusobacterium nucleatum # 1 243 1 243 243 459 89.0 1e-129 MQGYINSFESFGTKDGPGIRFVVFMQGCPLRCLYCHNVDTWELKDKNYIYTPNEILAELN KVKAFLTGGITASGGEPLMQASFILELFKLCKENGIHTALDTSGFIFNDQAKKVLEYTDL 
VLLDIKHIDKDMYKKITSVDLEPTLKFIQYLQEINKPVWLRYVLLPGYTDDIKDLNDWAK YVSQFDVVKRVDILPFHQMAIYKWEKTNRDYKLKDVSTPTKEQIQKAEEIFKKYDLPLYK ERS >gi|224461361|gb|ACDC01000041.1| GENE 3 2348 - 4579 3405 743 aa, chain - ## HITS:1 COG:FN0262 KEGG:ns NR:ns ## COG: FN0262 COG1882 # Protein_GI_number: 19703607 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Fusobacterium nucleatum # 1 743 1 743 743 1479 96.0 0 MDAWRGFKSGDWQNNINVSDFIKHNYTEYKGDESFLEGPTENTKKLWDILSGMLKIEREK GIYDAETKIPSKIDAYGAGYIDKDLEKIVGLQTDAPLKRAIFPNGGLRMVENSLEAFGYK LDPTTKEIYEKYRKSHNAGVFSAYTPAIKAARHTGIITGLPDAYGRGRIIGDYRRVALYG VDRLIAERKREFDAYDPAEMTEDVIRDREEMFEQLEALKALKRMAAAYGFDIGRPAETAQ EAVQWTYFGYLGAIKDQNGAAMSLGKTAGFLDVYIERDLKEGRITERDAQEFIDHFIMKL RIVRFLRTPEYDQLFSGDPVWVTESIGGMNNEGNSWVTKNAFRYLNTLYNLGTAPEPNLT ILWSERLPENWKRFCSKVSIDTSSLQYENDDIMRPQFGEDYGIACCVSPMAIGKQMQFFG ARANLPKALLYAINGGKDELKKEQVTPAGEFEKITSEYLDFDEVWEKYDKMLTWLASTYV KALNIIHYMHDKYSYEALEMALHSLDIKRTEACGIAGLSIVADSLAAIKYGKVRVIRDED GDAVDYVVEQPYVPFGNNDDRTDELAVKVVRTFMNKIRSHKMYRDAEPTQSVLTITSNVV YGKKTGNTPDGRRAGAPFGPGANPMHGRDTKGAVASLASVAKLPFEDANDGISYTFAITP ETLGKTDDEKKNNLVGLLDGYFKQTGHHLNVNVFGRELLEDAMEHPENYPQLTIRVSGYA VNFIKLTKEQQLDVINRTISSKM >gi|224461361|gb|ACDC01000041.1| GENE 4 4898 - 5543 942 215 aa, chain + ## HITS:1 COG:FN0263 KEGG:ns NR:ns ## COG: FN0263 COG0760 # Protein_GI_number: 19703608 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Fusobacterium nucleatum # 1 208 5 211 231 207 66.0 1e-53 MEDDKVLHNILLKKAKEAEYSSFEIEQINLQTETLFIRYFLEREAAKVVEGTKIEDEVLK KIYDENKEFYTFPEKVKLDTIFIREQDKAEKLLKEVTVANFNEIKEKNDEKTDVTQKNVD DNFIFITDIHPAIAEEIFNENKKDVILANLVPVQEGFHIVYLKDKEDKRQATFEEAKETI LNDVKRNLFGQAYNQLIADIANEKVTLETNETKEE Prediction of potential genes in microbial genomes Time: Thu May 19 23:02:35 2011 Seq name: gi|224461360|gb|ACDC01000042.1| Fusobacterium sp. 
2_1_31 cont1.42, whole genome shotgun sequence Length of sequence - 20396 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 6, operones - 4 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 64 - 465 674 ## FN0264 hypothetical protein + Term 498 - 557 16.4 + Prom 523 - 582 11.0 2 2 Op 1 1/0.000 + CDS 616 - 1551 1146 ## COG2177 Cell division protein 3 2 Op 2 1/0.000 + CDS 1523 - 3013 2410 ## COG4942 Membrane-bound metallopeptidase 4 2 Op 3 17/0.000 + CDS 3026 - 3829 1071 ## COG0061 Predicted sugar kinase 5 2 Op 4 1/0.000 + CDS 3823 - 5484 2170 ## COG0497 ATPase involved in DNA repair 6 2 Op 5 1/0.000 + CDS 5486 - 6241 649 ## COG0582 Integrase 7 2 Op 6 . + CDS 6241 - 7134 1410 ## COG1159 GTPase 8 2 Op 7 1/0.000 + CDS 7161 - 8081 1149 ## COG4874 Uncharacterized protein conserved in bacteria containing a pentein-type domain + Term 8190 - 8225 1.1 + Prom 8154 - 8213 5.3 9 2 Op 8 21/0.000 + CDS 8240 - 9013 610 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 10 2 Op 9 17/0.000 + CDS 9000 - 10004 1496 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 11 2 Op 10 . + CDS 10014 - 10742 253 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 12 2 Op 11 1/0.000 + CDS 10815 - 11264 720 ## COG0629 Single-stranded DNA-binding protein 13 2 Op 12 . + CDS 11277 - 11852 732 ## COG2096 Uncharacterized conserved protein + Term 11857 - 11900 7.1 - Term 11843 - 11890 5.6 14 3 Op 1 . - CDS 11893 - 13353 2130 ## COG2195 Di- and tripeptidases 15 3 Op 2 41/0.000 - CDS 13366 - 14985 1592 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 16 3 Op 3 . - CDS 15001 - 15273 540 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 15339 - 15398 11.5 - Term 15379 - 15415 4.0 17 4 Op 1 . - CDS 15423 - 16124 653 ## COG5522 Predicted integral membrane protein 18 4 Op 2 . 
- CDS 16137 - 16787 867 ## FN0997 hypothetical protein - Prom 16954 - 17013 13.5 + Prom 16888 - 16947 11.2 19 5 Op 1 1/0.000 + CDS 17051 - 18853 2110 ## COG1164 Oligoendopeptidase F 20 5 Op 2 . + CDS 18872 - 19342 648 ## COG2849 Uncharacterized protein conserved in bacteria 21 5 Op 3 . + CDS 19416 - 19916 749 ## FN0600 hypothetical protein + Term 20089 - 20129 1.5 - Term 20075 - 20117 1.1 22 6 Tu 1 . - CDS 20133 - 20396 239 ## gi|294782434|ref|ZP_06747760.1| nitrite/sulfite reductase-like protein Predicted protein(s) >gi|224461360|gb|ACDC01000042.1| GENE 1 64 - 465 674 133 aa, chain + ## HITS:1 COG:no KEGG:FN0264 NR:ns ## KEGG: FN0264 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 133 2 129 129 116 92.0 2e-25 MGGTTNEKFLLLAVLAVSASAFAANTADLVGELQALDAEYQNLASQEEARFNEERAQADA ARQALAQNEQVYNELSQRAQRLQAEANTRFYKSQYEDLASKYEDALKKLEAEMEQQKQVI SDFEKIQALRAGN >gi|224461360|gb|ACDC01000042.1| GENE 2 616 - 1551 1146 311 aa, chain + ## HITS:1 COG:FN0265 KEGG:ns NR:ns ## COG: FN0265 COG2177 # Protein_GI_number: 19703610 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Fusobacterium nucleatum # 1 278 1 278 308 392 74.0 1e-109 MYKLFGYGLKDIPYINRLKNRVFYIIVITIVSLNIFISFSLNLKKVSKETLINSFIIVDL QNNLDEEKRNDIEKYILTIDGVRSVRFMDKSESFKNLQNELNISIPEASNPLTDSLIVSV KSAELMNGVQELIEAREEVKEVYKDEPYLKQSQEQSDIIHIAQIGSAIFSFLIALVTIVI FNLGVAIEFLNNANTGLDYAENIKESKFKNLIPFSMASVVATLIFFNIYVFFRKYVTNAN FDSSLLSLREIFLWHIGAIAILNFLVWLIPANLGRIEYEEEKDDDLDYEFYEEEIEDKKD EFYDEFEDDDI >gi|224461360|gb|ACDC01000042.1| GENE 3 1523 - 3013 2410 496 aa, chain + ## HITS:1 COG:FN0266 KEGG:ns NR:ns ## COG: FN0266 COG4942 # Protein_GI_number: 19703611 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Fusobacterium nucleatum # 12 234 6 230 403 240 76.0 5e-63 MMNLKMTTFNKLFLFFLISANVNSATVKDMNKRIKNIDQEIEQKNTRIKAIDTETSQIEK 
MIKDTEVEIEKMVQERKEIEEEITVVKKNIDYGRKNLEISEDEHDRKESEFIAKIIAWDK YSKVHRRDLPEKVILMKNYREVLYGDLQRMGYIEKVTGNIKESQDKIEAEKIKLDKLENQ LKENARRMDAKKEEQNKLKEKLQVEKKGHQSSIEKLKKEKQRISKEIERIIIENARKAAE KAAREKAAREKAAREKAARERAEREKAIREKAAREKAAREKAAREKAAREKAAREKAAQE AEAKKNSAKPSENKPKTPTKPVEVPIVVDTSDIELEEKREIEKIREEEKQELREIKAAAT VDMQKISNPEAYKRTGKTMKPLNGPIVVYFRQKKAGVVESNGIEIRGKVGNPIVAAKAGT VIYASNFEGLGKVVMIDYGGGMIGVYGNLLAIKVGYNSRVSAGQTIGVLGLSSEKEPNLY YELRANLRAIDPLPTF >gi|224461360|gb|ACDC01000042.1| GENE 4 3026 - 3829 1071 267 aa, chain + ## HITS:1 COG:FN0267 KEGG:ns NR:ns ## COG: FN0267 COG0061 # Protein_GI_number: 19703612 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Fusobacterium nucleatum # 1 267 1 267 267 390 79.0 1e-108 MIKLSIIYNSEKESAINIYKELLEFLKNKKEFEILDEENLHKASYIVTIGGDGTLLRAFR NIKNKKAKIIAINSGTLGYLTEIRKDMYKEIFENILKNKVNIEERFFFMVNIGNRRYKAL NEVFLTRDTIKRNIVASEIYVNDKFLGKFKGDGVIISTPTGSTAYSLSAGGPIVTPEQKL FIITPIAPHNLNTRPIILSGDVKLVLTLSEPSQLGLVNIDGHTHKTIKLEDKVEIFYSKE SLKIVIPEARNYYDVLREKLKWGENLC >gi|224461360|gb|ACDC01000042.1| GENE 5 3823 - 5484 2170 553 aa, chain + ## HITS:1 COG:FN0268 KEGG:ns NR:ns ## COG: FN0268 COG0497 # Protein_GI_number: 19703613 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 552 6 557 558 758 89.0 0 MLRELKIENLAIIDELDIEFEKGFIVLTGETGAGKSIILSGINLLIGEKASVDMIRDGEE NLVAQGVFDIDEEQKKKLEAMGIDTDGDEIIIRRYYNRNGKARAFVNNVRITLADLKEIA STLVDIVGQHSHQMLLNRNNHIKLLDSFLSKEDKDIKEKLLTLLSQHREIKSKIEKIESD KKETLEKKEFYEYQLEEIEKLKLKDGEDEILEAEYKKVFNAEKIREKVYESLEYLKYDDD SALGFILESIRNIEYLGKYDERYLELAKRMESAYYELEDCVGEIEDISKNIEVTESDLDK IAGRMNTLKRIKEKYKRTLAELIEYREDLKEKLSDMNSGDFKTRELQKELDKIKAEYDKL AEKLSESRKEIALKIEDELLNELKFLNMEDAKLKVQMNKIDRMTNDGYDEIEFFISTNVG QDLKPLNKIASGGEVSRVMLALKVIFSKVDNIPILIFDEIDTGIGGETVRKIALKLKEIG DSTQIISITHSPVIASKASQQFYIEKYVENSKTISRVKKLSANERIKEIGRMLVGEKIND EVLEIANKMLNEG >gi|224461360|gb|ACDC01000042.1| GENE 6 5486 - 6241 649 251 aa, chain + ## 
HITS:1 COG:FN0269 KEGG:ns NR:ns ## COG: FN0269 COG0582 # Protein_GI_number: 19703614 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Fusobacterium nucleatum # 12 250 1 240 241 238 65.0 7e-63 MDILEKYIENLVIKKNLLQSTVEAYKLDISEYLSFLENKEKDILASNEDLFSEYFKEIED KYSVASFKRKYSTIRNFYKFLLKNRYIDKIFEYKLTKKINDKVSKENRTEAFKKNEYEAY INSLSDNFNEVRLKLISRMIAEAKISLINIFEIEIKDLLKYNFEKIIVFRNSKIVTYEIS TEISKELKEYYEKYAIEKRYLFGSYKKSSLISDLKRYNLDFKTLKNCLQEDEEEINKKIR EIYFKIGIGDN >gi|224461360|gb|ACDC01000042.1| GENE 7 6241 - 7134 1410 297 aa, chain + ## HITS:1 COG:FN0270 KEGG:ns NR:ns ## COG: FN0270 COG1159 # Protein_GI_number: 19703615 # Func_class: R General function prediction only # Function: GTPase # Organism: Fusobacterium nucleatum # 1 296 1 296 296 484 92.0 1e-137 MKAGFIAIVGRPNVGKSTLINKMVAEKVAIVSDKAGTTRDNIKGILNVKDNQYIFIDTPG IHKPQHLLGEYMTNIAINILKDVDVILFLIDASKTIGTGDMFVMDRINENSNKPKILLVN KVDLISDEQKEEKLKEIEEKLGKFDKIIFASAMYSFGIAQLLEALDPYLEEGVKYYPDDM YTDMSTYRIITEIVREKILLKTRDEIPHSVAVEIIDVERKEGKKDKFNINIYVERDSQKG IIIGKNGKMLKDIGMEARQEIEDLLGEKIYLGLWVKVKDDWRKKKPFLKEMGYVEEK >gi|224461360|gb|ACDC01000042.1| GENE 8 7161 - 8081 1149 306 aa, chain + ## HITS:1 COG:FN0238 KEGG:ns NR:ns ## COG: FN0238 COG4874 # Protein_GI_number: 19703583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria containing a pentein-type domain # Organism: Fusobacterium nucleatum # 1 306 5 310 310 500 84.0 1e-141 MKKNITNKILMVRPALFAFNEETAVNNYYQKRDNKTVQEIQNSALIEFDKMVEKLKNIGI DVKVIQDTKEPHTPDSIFPNNWFSTHYSNTVVLYPMFAENRRLERTDRIYDFFDNVDDLN VVDYSSLEKENIFLEGTGALVLDRKNKKAYCSLSQRADEKLLDIFCEDAGYKKIAFHSYQ TINEERKAIYHTNVMMAMGENYAILCADSIDNLEERAAVINELEKDNKEIVYITEKQVES FLGNAIELVNNEGVNVCVMSATAYSALTDEQKNIIEKYDVILPVDVHTIEKYGGGSARCM IAELFI >gi|224461360|gb|ACDC01000042.1| GENE 9 8240 - 9013 610 257 aa, chain + ## HITS:1 COG:FN0237 KEGG:ns NR:ns ## COG: FN0237 COG0600 # Protein_GI_number: 19703582 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type 
nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Fusobacterium nucleatum # 17 257 1 241 241 406 94.0 1e-113 MKKFLNRNISFISIIILIAVWQVCGNLGLLPKFIFPTPLEIANAFVRDRALFLFHFKITM LEALIGLALGIFFACLLAIIMDSFEIINKIVYPLLIFTQTIPTIALAPILVLWLGYDMTP KIVLIVINTTFPIIISILDGFRHCDKDAIQLLKLMNASRWQILYHLKIPTALTYFYAGLR VSVSYAFISAVVSEWLGGFEGLGVFMIRAKKAFDYDTMFAIIILVSAISLISMELVKRSE KKFIKWKYLEEEENEKD >gi|224461360|gb|ACDC01000042.1| GENE 10 9000 - 10004 1496 334 aa, chain + ## HITS:1 COG:FN0236 KEGG:ns NR:ns ## COG: FN0236 COG0715 # Protein_GI_number: 19703581 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Fusobacterium nucleatum # 1 334 1 334 334 586 93.0 1e-167 MKKIKYLLFGIFTIFMLAACGEKKEEAKTEAPVELKKVDFLLDWVPNTNHTGLFVAKEKG YFAEEGIDLDIKQPANESTSDLIINNKAPMGIYFQDYMASKLAKGAPITAIAAIIENNTS GIITNKNLNINSPKELAGHKYGTWDIPIELNMLQFIMEKDGGDYSKVELVPNTDDNSITP LSNGVFDAAPVYYAWDKIMGDSLGIETNFFYYKDYAPELNFYSPVIIANNDYLKDNKEEA IKILRAIKKGYQYAIEHPEEAAEILIKYAPELENKKAMIIESQKYLASQYATDKDKWGYI DPARWNAFYNWLNEKGLTKNPIPENTGFSNDYLE >gi|224461360|gb|ACDC01000042.1| GENE 11 10014 - 10742 253 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 205 1 216 223 102 28 3e-21 MKKTLEIKNLSYSFGDNHILKDINIYVKENEMVAIVGSSGVGKSTLFNLIAGVLKKQSGE ITIDGSDDYIGKVAYMLQKDLLFEHKTIINNVILPLIIAKIDKKVALEEGRKILKQFNLE KYVDKYPKQLSGGMRQRVALIRTYMFKRNIFLLDEAFSALDAITKKELHKWYLNLKKEFN LTTLLITHDIEEAIFLSDRIYILANKPGEIIKEIKIEINPNEDIDVQRLFYKKEILNIMN IE >gi|224461360|gb|ACDC01000042.1| GENE 12 10815 - 11264 720 149 aa, chain + ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 149 1 154 154 189 69.0 2e-48 
MNLVVLNGRLVRDPELKFGQSGKAYSRFSIAVDRPFQTSTDSQTADFINCVAFGKTAEFI GEYFRKGRKILLKGSLQMNQYESEGKKLTTYVVIAENVEFGEAKANTNAGGNDYKAPSNA VMETSNFEEFHSGDDNIESAPVADDEFPF >gi|224461360|gb|ACDC01000042.1| GENE 13 11277 - 11852 732 191 aa, chain + ## HITS:1 COG:FN1303 KEGG:ns NR:ns ## COG: FN1303 COG2096 # Protein_GI_number: 19704638 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 191 1 192 192 327 91.0 1e-89 MEDKKYVNITKVYTKRGDKGETDLLGGSAARKDSLKVESYGCIDETSSFIGLARYYTKNK VIKERLKEIQNKLLVLGGFLASDDKGKEMMKDQIKEEDIKLLEEYIDEYNQKLPPLTHFI LPGDDEVAAYFHVARTVVRRAERRIVSLAAQEDLNPLIQKYVNRLSDLMFVLARYSEEVE NKKWKSTNLNI >gi|224461360|gb|ACDC01000042.1| GENE 14 11893 - 13353 2130 486 aa, chain - ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 1 486 1 486 486 763 80.0 0 MSNKLVNLKPERVFYYFEELSKIPRESGNEKAVSDFLVDTAKKLGLEVYQDKMNNIVIKK AASKNYENSPGVILQGHMDMVCEKDLDSNHNFKTDGIDLIVDGNYLRANKTTLGADNGIA VAMGLAVLEDNTIEHPQIELLVTVEEETTMGGALGLEDNVLTGKMLINIDSEEEAWVTVG SAGGRTIRAIFDDKKEKLNITNPEFFRLEVKNLFGGHSGAEIHKNRLNANKVISELMTQL KKEFDIRLCDIKGGTKDNAIPRECYFDIAIDKDASESFTLKVKEVFENFKNKYKAQDENI TFEITKLENSFNEAFSNDVFERLLSLISTLPTGVNTWLKEYPDIVESSDNLAIVKLIDDK ITIITSLRSSEPSVLDSLEEKIVNIIKEHKVNYEVGEGYPEWRFRPVSHLRDTAVKTYKD LFNEDMQVTVIHAGLECGAISTHYPDLDMISIGPNIYDVHTPKEKMEIASVEKYYKYLVE LLKNLK >gi|224461360|gb|ACDC01000042.1| GENE 15 13366 - 14985 1592 539 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 539 3 547 547 617 59 1e-176 MAKIINFNDEARKKLETGVNILADAVKVTLGPRGRNVVLEKSYGAPLITNDGVTIAKEIE LEDPFENMGAALVKEVAIKSNDVAGDGTTTATILAQAIVKEGLKMLSAGANPIFLKKGIE LAAKEAIEVLKDKAKKIESNEEISQVASISAGDEEIGNLIAQAMEKVGETGVITVEEAKS LETTLETVEGMQFDKGYVSPYMVTDSERMTAELDNPLILLTDKKISSMKELLPLLEQTVQ MSKPVLIVADDIEGEALTTLVINKLRGTLNVVAVKAPAFGDRRKAILEDIAILTGGEVIS 
EEKGMKLEEASIEQLGRAKTVKVTKDLTVIVDGAGEQKDISARVNLIKSQIEETTSDYDK EKLQERLAKLSGGVAVIKVGAATEVEMKDKKLRIEDALNATRAAVEEGIVAGGGTILLDI IDSMKEFNETGEIAMGIEIVKRALEAPIKQIAENCGLNGGVVLEKVRMSPKGFGFDAKNE KYVNMIESGIIDPAKVTRAAIQNSTSVASLLLTTEVVIAHKKEEEKASMGAGGMMPGMM >gi|224461360|gb|ACDC01000042.1| GENE 16 15001 - 15273 540 90 aa, chain - ## HITS:1 COG:FN0676 KEGG:ns NR:ns ## COG: FN0676 COG0234 # Protein_GI_number: 19704011 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Fusobacterium nucleatum # 1 89 1 89 90 125 84.0 2e-29 MNIRPIGERVLIKPIKKEEKTKSGILLSSKTAPAEKPNQAEVIALGKGEKLEGIKVGDKV IFNRFSGNEIEDGEEKYLVVNAEDILAVIE >gi|224461360|gb|ACDC01000042.1| GENE 17 15423 - 16124 653 233 aa, chain - ## HITS:1 COG:FN0996 KEGG:ns NR:ns ## COG: FN0996 COG5522 # Protein_GI_number: 19704331 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Fusobacterium nucleatum # 1 233 1 232 232 296 74.0 2e-80 MGDKFVLFSDPHLITMGIGFGVCFLLIFLGFFTERKQAFAKIIAVLVLGVKIAELIYRHK YYGESVAQLLPLHLCPMVIIISIFMMFFHSEVLFQPVYFWCMGAFFAIIMPDIKEGMHDF ASQSFFITHFFILFSAAYAFIHFRFRPTKTGFIMSFLLLVSLAFAMYFVNIKLGTNYLFV NRPPSSAAKLIDYVGPWPYYLYSIVGIYILLSFILYLPFKRNKKSKYGSWRKY >gi|224461360|gb|ACDC01000042.1| GENE 18 16137 - 16787 867 216 aa, chain - ## HITS:1 COG:no KEGG:FN0997 NR:ns ## KEGG: FN0997 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 216 1 216 216 332 74.0 7e-90 MAKQKYYAYFFDDKRNGIVESWTECEKIVKGTKARYKSFIDKAVAQNWLDSGANYERKIS STTPINTKLEKGIYFDSGTGRGIGVEVRITDENKVSFLETLPKETVKKLLKGRNWTVNEY GNIYLGANRTNNFGELVGLYFALEIAKIIDCSLISGDSRLVIDYWSLGHFHENNLELDTI FYINKVILMRKEFEKNKGVIKHISGDINPADLGFHK >gi|224461360|gb|ACDC01000042.1| GENE 19 17051 - 18853 2110 600 aa, chain + ## HITS:1 COG:FN0887 KEGG:ns NR:ns ## COG: FN0887 COG1164 # Protein_GI_number: 19704222 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 1 600 1 
600 600 1001 89.0 0 MKDRKTIEQKYKWNLNDIYENYDMWESDLEKFEKLTKEVPKYKGQIKNNSEKFVELELLM EKIARLLDRLYLYPYMLKDLDSTDEITSIKMQEIEMIYTKFGTKTAWIAPEMLEIPEETM NEWIKKHPELEERRFGLSEMYRLRKHVLSEDKEQLLSHFSQFMGSSSDIYGELSISDMKW NTVKLSTGEEIAVSNGVYSKILATNRNQEDRKLAFEALYKSYENSKNTFAAIYRAIIQQN VASCNARSYESCLDRALENKNIPKEVYFSLVNSAQENTAPLRRYIELRKKALKLKEYHYY DNSINIVDYNKVFKYDDAKEIVLNSVKPLGEDYQAKMKRAISEGWLDVFETKNKRSGAYS INIYDVHPYMLLNYQETMDAVFTLAHELGHTLHSMHSSETQPYSTADYTIFVAEVASTFN ERLLLDYMLENSDDSLEKIALLEQALGNIVGTYYIQTLFASYEYEAHKMIEEYKAVTPDI LSDIMYNLFKKYFGESITIDELQKIIWSRIPHFFRSPFYVYQYATSFASSAKLYENLKTN PESREKYLTLLKSGGNNHPMEQLKLAGVDLTKKESFDSVAKEFDRLLDVLEEELKKINLI >gi|224461360|gb|ACDC01000042.1| GENE 20 18872 - 19342 648 156 aa, chain + ## HITS:1 COG:FN0601 KEGG:ns NR:ns ## COG: FN0601 COG2849 # Protein_GI_number: 19703936 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 16 154 1 139 141 214 84.0 7e-56 MKRKLLLVAFALLFSVSAISNSQEIRKKDLRIVEKLYYLKDSDVPFTGKVSEGKDRLYYL NGKQDGKWISFYKNGNIKSIINWKDGKLNGKYIIYENNGMKSTETIYKDGKENGDYFLYN ANGTYRTKGAYIMGRPVGLWEYYDKDGKLKDTVIVN >gi|224461360|gb|ACDC01000042.1| GENE 21 19416 - 19916 749 166 aa, chain + ## HITS:1 COG:no KEGG:FN0600 NR:ns ## KEGG: FN0600 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 166 1 166 166 232 89.0 3e-60 MKLTLQQAIFTISNLSKKQRRLLDLIRDSYVVPLKVNGKEVFEQAQADEMLKNLSELDLI NQDIVTLKDGINVANTENFIENKSLFALLEEVRLKRNILFDLEYLLKRDSTTVENGVGVV QYGVLNKKELAEKFNKLENEVNSLSEKIDSVNAKTEIEVKLFSSID >gi|224461360|gb|ACDC01000042.1| GENE 22 20133 - 20396 239 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294782434|ref|ZP_06747760.1| ## NR: gi|294782434|ref|ZP_06747760.1| nitrite/sulfite reductase-like protein [Fusobacterium sp. 
1_1_41FAA] # 3 87 8 92 92 147 95.0 2e-34 GSLAGNYLCGKDFIAGTYDIELVKNYGYITIREKKNVSNIKFRKYLGENIGELKDFKNCS IEIEEKLEISGGLEVKLTPSKSTYLYN Prediction of potential genes in microbial genomes Time: Thu May 19 23:02:55 2011 Seq name: gi|224461359|gb|ACDC01000043.1| Fusobacterium sp. 2_1_31 cont1.43, whole genome shotgun sequence Length of sequence - 22597 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 406 - 465 12.4 1 1 Tu 1 . + CDS 496 - 4248 4760 ## CHY_2654 hypothetical protein + Term 4255 - 4293 6.3 + Prom 4288 - 4347 16.9 2 2 Op 1 . + CDS 4393 - 5505 1102 ## Pnuc_1118 SMC domain-containing protein + Term 5541 - 5582 -0.6 + Prom 5556 - 5615 8.2 3 2 Op 2 . + CDS 5650 - 8526 2737 ## Cag_1611 putative ATPase involved in DNA repair + Term 8533 - 8575 7.7 + Prom 8534 - 8593 6.3 4 3 Op 1 . + CDS 8613 - 9944 1530 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 5 3 Op 2 . + CDS 9969 - 13571 3899 ## COG1002 Type II restriction enzyme, methylase subunits 6 3 Op 3 . + CDS 13576 - 13830 287 ## gi|237740487|ref|ZP_04570968.1| predicted protein 7 3 Op 4 . + CDS 13896 - 15143 873 ## gi|237740488|ref|ZP_04570969.1| predicted protein 8 3 Op 5 . + CDS 15166 - 16344 1149 ## Lebu_1380 protein of unknown function DUF1703 + Term 16355 - 16403 6.2 + Prom 16350 - 16409 17.0 9 4 Op 1 . + CDS 16438 - 18951 2948 ## Swol_2489 hypothetical protein 10 4 Op 2 . + CDS 18960 - 20975 2214 ## COG4930 Predicted ATP-dependent Lon-type protease 11 4 Op 3 . + CDS 20993 - 21130 149 ## gi|237740492|ref|ZP_04570973.1| predicted protein + Prom 21132 - 21191 5.1 12 5 Op 1 . + CDS 21258 - 21578 278 ## Msm_1749 hypothetical protein 13 5 Op 2 . + CDS 21633 - 22148 472 ## Exig_0313 hypothetical protein 14 5 Op 3 . 
+ CDS 22168 - 22597 609 ## FN0832 hypothetical protein Predicted protein(s) >gi|224461359|gb|ACDC01000043.1| GENE 1 496 - 4248 4760 1250 aa, chain + ## HITS:1 COG:no KEGG:CHY_2654 NR:ns ## KEGG: CHY_2654 # Name: not_defined # Def: hypothetical protein # Organism: C.hydrogenoformans # Pathway: not_defined # 4 1128 1 1116 1187 538 33.0 1e-151 MSDLTIKSILQKDIERKINGVVKADSNEKDTIITELNEYVVTEEIRKRLIKFFDKYVDSI NSPTEDMGVWISGFFGSGKSHFLKMIGHILENNTYDGKTVVDFFKEKIDDAILMGNIEKA AEIPTDVILFNIDNVSDQDTHQNKDSIALAFLKKFNEYLGFTRDDIEIAEFERRLWEDGK LEEFKKAFEEESGKTWKDANRNLDFHSDDFIDVVEKLEIMTRESAERWLERDIVRSINAE SFRDIIENYLKMKDPKHRVVFLVDEIGQYIGDNSKLMLNLQTLVETLGVKFKGRVWVGVT SQQDLGSILNNSEHRKNDFSKIQDRFKTILSLSSGNIDEVIKKRLLIKKKIEGEDLEKLF DKNRVEIENLINFETQMTLPLYDSAEDFSETYPFVAYQFNLLQKVFEKVRNMGHSGQHMS RGERSLLSSFQEAGIKVKDKNIRILVPFNYFYESIEQFLEDNVRRPFIHARNEKKVDDFG LEVLKLLFLLKGINGINPTLNNLTSFMIDSIDCDRIELEKKIKKALEKLEREVLIQKDGD NYYFLTNEEQDINREIEREDIDFKKIDEKIDSYIFKEIFTKNSIIMEDTGNKYGFSRTLD ETPFSKAGEELAITIFTERAEDYDNVSIVGTRPESDLIIRLPNDDETYRNEIKLFLKVES YIRNKQKDNERESIIRILEIKQRENSIRNRRIKNELERIIGEAEVFIYGQKQDIKTKDAA KKIEESLKALANHRFHKAKLVKKPYDEAEIRKTLSYVFDTNKNGILFDIKKNIEANVNCE AINEVLDRVKLLEKRGDTPITLKNISDYYLRTPYGWGQLTINGLVGELWKYRLINLEESK VLVTDENVATNLLTKLQNKNLEKIVISLREELDPELVKKVNNLLKEIKTLKEETGEVTID SPKEDLLEILNRKIGIAKRYKIECEHSKYPGKKELSDWIELLEEIILSRDNAEKTLKNFL EMKDEISKEYDKVDRVFDFFTSSKKDRYDEVIQKINKIEEYKDYIGGLKETPAYKTIEEI RNDKNIYERIREFDELITELDKEKDKLIELEKESLRAKVEKYREDFSEKLKDNTEILKKL EEKFNNFLENEVNNSDSSNDMAIFMKSKKLENIVSKFEDEYKNYARKEIEKLESYLNEVA EDKTDIDKLRQSIKSTYNNYKNEIAKSDIRNISATITKAIKDKDDFYAELNGKAKKKERV ILRKVSISSKTNIETEDQVNDYISRIEKDIEKLKNEMLEAIRSNKIIDIG >gi|224461359|gb|ACDC01000043.1| GENE 2 4393 - 5505 1102 370 aa, chain + ## HITS:1 COG:no KEGG:Pnuc_1118 NR:ns ## KEGG: Pnuc_1118 # Name: not_defined # Def: SMC domain-containing protein # Organism: Polynucleobacter # Pathway: not_defined # 1 368 1 366 368 263 43.0 9e-69 MLIKFKVEGFKNFEKELVFDLSKTRNYNFNESAIKDGIVKVGLIYGINGSGKSNLGSAIF 
DIILHLTEKEKLINLYDHYLNLSNSNIIAKFYYKFEFEDNILEYEYQKDKPQNLVKEVLK INNKLISDYNYITNEAELNLEGTENLNKNLNGNNISFIKYINNNINPKEDSEQFKIKKII EKFISFVDNMLFFASLNGNFYQGFKKGNGTISDEIINNGKLKDFENFLNEAGVPYKLIEK KIGKDKRIYCKFKSGEVDFFEIASKGTCSLILFYSWLISLDKVSFVFIDEFDAFYHVDLA KRVVEELLKLNIQAILTTHDTTIMTNDLLRPDCYFVLSEGKIKSLPDLTEKELRQAHNLE KMYRAGAFNE >gi|224461359|gb|ACDC01000043.1| GENE 3 5650 - 8526 2737 958 aa, chain + ## HITS:1 COG:no KEGG:Cag_1611 NR:ns ## KEGG: Cag_1611 # Name: not_defined # Def: putative ATPase involved in DNA repair # Organism: C.chlorochromatii # Pathway: not_defined # 2 957 4 963 966 414 33.0 1e-114 MYTEYKEGSKWIKCDFHIHTPCSVLNNQFGDNFEEYVKKMLRKALEHDIKIIAITDYYSI DGYKKLKEEYLEREFKLKELGFLDEEILRIGQILFLANVEFRLDILVNRAKVNFHIILSD KIKISDIEENFIKRIEFPFQGTEKRTLTRSNIESLGKKLKEEQNNLRGSEYEIGIGQLAV DSSQVLNILENSDIFKNKYLVVLPSDEDLSNIRWDGQDHNIRKILIQQAHCLFSSNKSTI SWGLGEKSENKEEYVKEFFSLKPCIHGSDAHCYEKLFRPDKNRYCWIKSIPTFEGIRQML FEPKERVYIGETFPNKKQPYNIIKRVKFIDSKNEFQNDWIYFNEGLNSIIGGKSSGKSLL LYYIAKTIISKKITNLKMEIGSDLNFLGYDFEKELKFDFIVEWADGVKINLKSEESKRKI TYIPQMYINYIAENKNNKNELNNILLGILNENKEFKDNIENINEKINQKSIEINEEISIF FRNKIKLTELENEKIDIGDLEGIQKNIDRLEKESQEIAYISILNDNEKEEYLDISNSIKK KKEELEKYKQNVNIRQLYVNKLIEKIKETSVVLNEIFFEDFNKIDDNEVKENLKSLNENV ESKILEVKNYLQNDNFIFKVSEKMKLLEVEIVNSQNNIKKYEEKMGNMEKQLNLVKLLEQ EKQKKVLIEEKEKEIILLKNDLTNKKILEKYQELLALYENKILEHLKFKNISEDIELVVK LKFDIDSFKEKFSEKISKKMVLEKQFGENIFTGNEFKFTKDCHLENIKNIYDKLINNKEE IKINQSYSLEEVLEGLFKDYFSIEYDLVQNGDSLLKMSPGKRGIVLFQLFLQLSNSDTPI LIDQPEDNLDNRTIYQELNTFIKNRKLKRQIILVSHNANLVVSTDSENVIVANQEIKKGN NYEYNEKYKFEYINGALEETFTLNNGKKFHEKGIREHVCEILEGGEKAFKIREKKYGF >gi|224461359|gb|ACDC01000043.1| GENE 4 8613 - 9944 1530 443 aa, chain + ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 6 437 11 442 458 110 26.0 5e-24 
MKKYIESEKLELKEKYTDIICKEIVSFLNGNGGAILIGVRDDGTVVGVDKIDETLRKISD IITTQIEPNPQDEISSELRFEEGKTIIILNINKGRKHIYCQKKYGFSSHGCTIRIGTTCK EMTIEQIKIRYEKKFIDTEYMLKKRASLADLSFRELKIYYSEKGYHLDEKSFETNLNLRN EDGEYNLLAELLSDRNNIPFIFVKFQGKNKASISERNDYGYGCLLTTYQKIKNRLEAENI CISDTTTRPRKDIYLFDYDCVNEAILNAFVHNDWTITEPQISMFNDRLEILSHGGLPSGM TRKQFFDGISKPRNITLMRVFLNMGLTEHTGHGVPTIINRYGEKVFEIGNNYICCTIPFD EKVINQKNEKNVGLNVGLNVGLNKTEKKVIEFLMENPSFTSDDLAEKIGVTKRTIERTFK KLQEKKMIERIGSKRDGNWIVIK >gi|224461359|gb|ACDC01000043.1| GENE 5 9969 - 13571 3899 1200 aa, chain + ## HITS:1 COG:MA2372 KEGG:ns NR:ns ## COG: MA2372 COG1002 # Protein_GI_number: 20091204 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Methanosarcina acetivorans str.C2A # 1 1199 1 1160 1161 639 35.0 0 MNKSNLKLFAIEARNELMEKMRTRLDILGITKKGIEKAKVVGREVEINGSLYPKESYNSL VRKYKQIGYEELVEESAYTWFNRLTALAFMEANEYIEEKMIFNNGLKNEPGIIDNYYDFE FFKNLDSELQKELHNLRDENTANSIEKLYSILVEEKCEELSAIMPFMFKKKGTYSDILFP TGLLLENSLLIRIREEIGKEAPIELIGWLYQFYNSEKKDEIFAKKEKISKENVPAVTQLF TPDWIVKYMVENSLGKLALESTGINESLKTNWKYYIESELDESSEKIKIEDVKILDPAMG SGHILVYAFDLLFEIYENLGWSTKDSVLSILKNNLYGLEIDERAGQLASFALMMKAREKF SRLFSVLKREETFKLNTLIIEESNNLSEKIKNKVKDNNLNNLSKIIEDFEDAKEYGSILK LETIDKETLEKEFNLLKESLDNEQGTLIFNEDELDINIEEDLELIESLIVQHIALTNQYE AVVTNPPYMGGKGFSTKLKAYTEKNYKDSKSDLFAVFIERCNEFTKKNCYTSIITMQSWM FLSSFETLRKNIIEKTEIKSLNHLGTRAFSEIGGEVVSTVAWISQKKSPKNDGIYLRLVD YNNADLKEEEFFNKANYFQAKQKDFEKIPGSPIAYWVSDKVREIFEKNQKLGEVGEVISG MTIGKNDLYLRKWYEVNNYKINLRKLKIDEINLEKNPWIPYSKGGEIRRWYGNNEWIVNW RCSNKFNRAKTTLKHLYLKEALAWNFISSSNFSMKYLENGYLWDIAGSPCFFKKELVNYV LGFLLTNFSQNILNIMNPTINYQAVDIQNMPFLYAENKANEINNLVQQNIDISKEEWDSR ETSWDFEKLSLVDGKDLRTAFENYCSHWRDNFVQLHKNEEELNRLFIEIYELQDEMDEKV SFDDITILKKEAKIVEIDNSKAREFSSESERYLYDRGVSLEFNKDELVKQFLSYAIGCIM GRYSTNKPDLIMANSDDVLELSSNKFLVKDTNGDIRQEVETEFLPDEFGILPITAEKDFS NDIVERVKEFVKFVYGEESLKDNLNFIAEALGNKDNKSAEEILRAYFITAFYKDHLQRYQ NRPIYWLMNSGKKNAFSCLFYMHRYEPLTVARVRADYLIHYQEMLENKRKFIERQLYAED 
ITAKEKKNVEKELKDLDVLLKELREYANEVKHIAEQKIVLDLDDGVNVNYERLGAILKKR >gi|224461359|gb|ACDC01000043.1| GENE 6 13576 - 13830 287 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740487|ref|ZP_04570968.1| ## NR: gi|237740487|ref|ZP_04570968.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 84 1 84 84 125 100.0 1e-27 MENIKISDKFKDLIDEKYRNNYIFHTTTQSSDIFYKYFYLGKYDKEDILKELELETEPLA ILFLNLIKFEIGEKKKNIITQKYV >gi|224461359|gb|ACDC01000043.1| GENE 7 13896 - 15143 873 415 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740488|ref|ZP_04570969.1| ## NR: gi|237740488|ref|ZP_04570969.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 415 1 415 415 613 100.0 1e-174 MIDILKEIKNIRKGDTNFLDNFVFSIYNTIKVFDNHNLIIKVLNFLGKGNKNGIDIINTQ RKKITNITRITFYDSLLELKEFKILDESLYQEILVEYIEKLIDFCQNKKEVENSLTYLNF LNIGMKKLNNIKNSNIRNSLKEKTKSLSEQLVEASKICNNVIEYIDNEKIKIIKDNYKYI SNEKSLEKNIKNLEILLLEELNSLPNINKNPSILKKLAKPLYLDENRVIANKEDENKDAY DVITCIFFYLLILFYENPIFNSCETKKLIYILKNEEIKGFLDEIIENYFCGNYFTCCSAI SPTIERLIRNTYFKLGKSDIELYGKDNSLQKRKNLENLLKDNDIKKIYSEEFINYFSWLF NNDISYYNYRNSVCHGYKNYEEYNKIETTLQLFILILFLKKFYEYFDKIEESKKE >gi|224461359|gb|ACDC01000043.1| GENE 8 15166 - 16344 1149 392 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1380 NR:ns ## KEGG: Lebu_1380 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 392 10 398 552 369 50.0 1e-100 MKKIPIGINDFKTLIENNYYYIDKTKHIEDILNDGSEVILFTRPRRFGKTLNMSMIKYFF DIENKKENKKLFSGLYIEKSKYIGEQGKYPVIYISFKDLKSKNWEGTIFKLKNQLKDLYK EFLYLKDSLDEISQEDFNKIIYMKEDANYEFSLKYLTEYLYKYYKQKVIVIIDEYDSPIV NSYENNYYDEAILFFRNFYSSVLKDNQYLERGFLTGILRVAKEGIFSGLNNLEICSILDN KYSSFYGLTEDEVLKSLDFFNMEYKLNEVKDWYDGYKFGNKEIYNPWSILNYINKKEIGA YWVGTSNNFLINNILENAEASIFEEVELLFSDKETIKTIYSDPDFSDIKKPQNIWQLLVH SGYLTVKKRIDRNLYSLSIPNKEIEEFLKEFF >gi|224461359|gb|ACDC01000043.1| GENE 9 16438 - 18951 2948 837 aa, chain + ## HITS:1 COG:no KEGG:Swol_2489 NR:ns ## KEGG: Swol_2489 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 837 1 873 873 336 30.0 
2e-90 MEIDKIKELLEYRFSLTTELPQKRHIIFWYDSKKEFKDLIDELNLTDVKIIKLTKSLDKK GESIYTNIFKTKYTLEVIDTESNYLIYSEHPRSIDSENYLLDIEKYSEFFEADKSAMIVE ELKLDRTNYKLGEIIKDYPSFFANKERREKLAKLIEQPESIDEEEFKLSILTVISGAKTV DILEIIKNIILNKSKLQDIEKWLNLDFLFFEIKKKFDVEVTSFEQFLKILMVTHFYFQLQ KKAHTNLEKYFTGRKNELYIFAESLLQNKQISEIIREEFYDLAKDLNIKDRIDELELDYS ISGTAFEYFDKVIIKDIIDIFNSEVIDYEKYKKYIEIRLDNSLWIEKYQYFYKALLALND FFRLKDSLIIEDRKVLKDIFKDYTEHYFLIDKLYRDFYFYHDKIKNDELAPLFDILKAKI DKFYEVDYLEKLLALWSSKVYERENLAQQRDFYKNNIANADVRTVVIISDALRYEVGYEI SQQLRKEANVKEIKLDAMLTDLPSRTFLGMANLLPCKIERNIDLVSAKVIVDGIDSQGTE NREKILKLSCEESSAISYDNFNKMNRGKQEEYIKGKKVIYIYHDSIDAIGDKAKTESNTF NACKDAVEDIVGLSKLLSSLGVVNVYITSDHGFLYEKKEVEEYNKLELKNTKYEAIGKRY AIYKKEAEEKACITLKLDSLYGVFPEKNQRIKASGSGLQFVHGGASPQEMIIPLINYKSG ANSKKISKVDLRIRESVGKITSNLTKFSIYQIEAVSIKDKFIERDVSVALYDENVRVSDE KKLKLNSTEENTIHDFRLTLSGEHKKVTLKVIDIESGDILDSKEYIVSIGIASDFDF >gi|224461359|gb|ACDC01000043.1| GENE 10 18960 - 20975 2214 671 aa, chain + ## HITS:1 COG:STM4491 KEGG:ns NR:ns ## COG: STM4491 COG4930 # Protein_GI_number: 16767735 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent Lon-type protease # Organism: Salmonella typhimurium LT2 # 2 670 23 693 694 791 55.0 0 MDINEIGSKVFEGKIVRKDLVSKIKGGANVPVYVLEYLLGMYCNQTDEDSIEEGMLKVKK ILAENYVRPDEAEKVKSKIREIGVYNVIDKVTVVLNEKKDRYEGHLSNLGVSNIEIHKNY IKDYEKLLSGGIWCILTLSYQYDELNISESPFKLNKLKPIQIASLDMNEVFEARKHFTKD EWIGFLLRSSGMEFENFDKDAIWHLLARMMPLVENNYNLCELGPRGTGKSYIYKEISPNS ILLSGGQTTVANLFYNMSKRQVGLVGYWDVVAFDEIAGIKFKDKDGIQIMKDFMASGSFA RGKEEKNANASMVFIGNINQSVDVLLKTAHLLVDFPPEMNNDSAFFDRMHCYIPGWDIPK LSPKSFTKEYGLIVDYMAEIFRELRKTSYGDALDKYFSLGRDLNQRDTIAVRKTVSALIK LVYPDGVFTKEEVEEILVRALEYRRRIKEQLKKMAGMEFFATNFSYIDKESGEEKYVGLK EQGGSKLIPEGPLKAGSLYTIAISANSVKGLYKVESQISAGKGKITVSDNKYRKTFENAF NYLKINSKRISGAINISEKEFYLSVADEKNVGNTEAITLGGFIAMCSIALNRQLMPQTVV LGEMALSGSINAVSDLVSMLQIAREAGAKKALIPILNAVDMSTLPPDILMDIQPIFYQDP IDATQKALGLI >gi|224461359|gb|ACDC01000043.1| GENE 11 20993 - 21130 149 45 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|237740492|ref|ZP_04570973.1| ## NR: gi|237740492|ref|ZP_04570973.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 45 1 45 45 69 100.0 5e-11 MQYRGITSENFYLTEIRNTCRFILENGIQENLKELLKKNKYFRNI >gi|224461359|gb|ACDC01000043.1| GENE 12 21258 - 21578 278 106 aa, chain + ## HITS:1 COG:no KEGG:Msm_1749 NR:ns ## KEGG: Msm_1749 # Name: not_defined # Def: hypothetical protein # Organism: M.smithii # Pathway: not_defined # 3 95 92 184 197 64 38.0 1e-09 MCNERFILEFLEEVVKEKYDNYDYNIKESDFLIYLATKSEQSEIINNWTEAGKRKMLVKI KNFLTEGGFLEKNKDGYNIIKPVVESAVIDEIKENGNRKILKIMFY >gi|224461359|gb|ACDC01000043.1| GENE 13 21633 - 22148 472 171 aa, chain + ## HITS:1 COG:no KEGG:Exig_0313 NR:ns ## KEGG: Exig_0313 # Name: not_defined # Def: hypothetical protein # Organism: E.sibiricum # Pathway: not_defined # 1 171 15 186 188 133 42.0 3e-30 MNSDEFFNNRGLANEVPFYIFDYNPKYELEIRDFVKNSLLINLENNTRLKAVEIDLFELL LESMKNDNILESAFELEEKKGTKFLYEKLKKSFNTEIIMKYIAEKTKDKNFLILTGVGKV FPIVRTHTILNNLQNIFDHTKVLLFFPGEYTSTDLRLFGFQDNNYYRAFKI >gi|224461359|gb|ACDC01000043.1| GENE 14 22168 - 22597 609 143 aa, chain + ## HITS:1 COG:no KEGG:FN0832 NR:ns ## KEGG: FN0832 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 143 1 143 154 222 78.0 4e-57 MGLMDLVKKAFLGATDEENQKNKARMREIFNESVPNGDDYKLIYCHMENFTNAVVVKVTK HANYIVGYKEGEVIVIPVNPDLLDYDKPYVFNKKNESETKTSLGYCIVANPEIKFQFIPI TYEPALAGKKDYSVAITQSSAEV Prediction of potential genes in microbial genomes Time: Thu May 19 23:04:16 2011 Seq name: gi|224461358|gb|ACDC01000044.1| Fusobacterium sp. 2_1_31 cont1.44, whole genome shotgun sequence Length of sequence - 14594 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 5, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 60 - 1007 1300 ## FN0833 hypothetical protein 2 1 Op 2 . + CDS 1007 - 2632 1895 ## FN0833 hypothetical protein 3 1 Op 3 . 
+ CDS 2633 - 4132 1840 ## FN0834 hypothetical protein + Term 4138 - 4182 8.1 - Term 4128 - 4165 4.0 4 2 Tu 1 . - CDS 4176 - 4439 295 ## SGO_1740 integral membrane protein - Prom 4584 - 4643 10.7 + Prom 4413 - 4472 7.0 5 3 Op 1 . + CDS 4618 - 5157 757 ## Sterm_0139 hypothetical protein 6 3 Op 2 . + CDS 5175 - 5843 922 ## FN0835 hypothetical protein 7 3 Op 3 . + CDS 5862 - 6158 531 ## FN0836 hypothetical protein 8 3 Op 4 . + CDS 6171 - 7481 1329 ## Lebu_0718 hypothetical protein 9 3 Op 5 . + CDS 7498 - 8592 924 ## Lebu_0718 hypothetical protein 10 3 Op 6 . + CDS 8594 - 8842 303 ## gi|237740505|ref|ZP_04570986.1| predicted protein 11 3 Op 7 . + CDS 8856 - 9842 1077 ## COG0582 Integrase + Prom 9955 - 10014 18.9 12 4 Tu 1 . + CDS 10257 - 10796 584 ## COG1335 Amidases related to nicotinamidase + Prom 10804 - 10863 13.2 13 5 Op 1 4/0.000 + CDS 10949 - 11950 977 ## COG0373 Glutamyl-tRNA reductase 14 5 Op 2 6/0.000 + CDS 11966 - 12871 1247 ## COG0181 Porphobilinogen deaminase 15 5 Op 3 . + CDS 12887 - 14347 1821 ## COG0007 Uroporphyrinogen-III methylase 16 5 Op 4 . 
+ CDS 14348 - 14594 126 ## gi|294782421|ref|ZP_06747747.1| conserved hypothetical protein Predicted protein(s) >gi|224461358|gb|ACDC01000044.1| GENE 1 60 - 1007 1300 315 aa, chain + ## HITS:1 COG:no KEGG:FN0833 NR:ns ## KEGG: FN0833 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 17 283 17 282 539 262 56.0 1e-68 MKKFLLFLFVIAIFAIGGLYGYKKLSADERKNEIIQMFNKELLNDFVESKKSVIERLKTA KDKEEGNKIYNEYVATNKLMIEKINEAHSELLENVFMADSKYNFTPEEWKTVNNYLKDYD LELIDMGEGNAMIAQVPNFYYDIFKDYVTDDYRDYLELVTKEYTEPYFGIEEILVSHEKI ADRLLAWEDFQKKYPDSDFLAEADIEANVYRRAYILGAYNLHTREGGSENPELYYIPDNI LKEFNRFIQANPDSPTVEYIKFYLENYKNPNIEEILYDKFEKEIVKDYESENSNEPVIKD TLEVITEEDKESKGE >gi|224461358|gb|ACDC01000044.1| GENE 2 1007 - 2632 1895 541 aa, chain + ## HITS:1 COG:no KEGG:FN0833 NR:ns ## KEGG: FN0833 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 17 537 17 536 539 471 57.0 1e-131 MKKILLILFTVAIFIIGGIFGYKKILSIEKENKIIQLFNKDSLENFSKNKNEMLEKLKTL NKEEADKLYEQYLESNNTILENLNIEHDKLLSGGINGIYNKDIAENFTDEEWKIANKFLN KYDLELWYLARGTCIIKEVPDFYYKTFKDYVTDDYKEYLKITSKENEEHYVADSGLCITL EELGDRIVTWENFLEKYPNSKLNDKVNNICNSYRRDYILGVPGGIYDYKESAEEYNRFIK KYPDSPTTELLGYYLEEVNLDKPENNDSEALSKMIDEYIEKYFYLGSLENRKKGNLFSEQ TNNLLEEFNKNKKEVINKLKTLNKEEANKFYEDYLESNNEILEKMNENDYIMLDNAFYIG KGDIDKEKLNKQNKYLDNYGLEVVEIEEGFMLTEKKDFYYNIFKNYVSDDYRDFIKLCSE DIDYIDYFSSLEEHPEIIADKVINWEKFLEKYPDSKLRKKANDICYSYRDDYILALTSSQ TTEVLKNGKINEDVKELNRFKNKYPNSPTTEIIKYYLENYKNEDIRDMLADKNEEIYNKG E >gi|224461358|gb|ACDC01000044.1| GENE 3 2633 - 4132 1840 499 aa, chain + ## HITS:1 COG:no KEGG:FN0834 NR:ns ## KEGG: FN0834 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 497 1 505 511 534 59.0 1e-150 MKKIGLIVLLTLSFLLLTSCNKGKNEEVKNEKIKFSEESYSLFEKFATDKKETMEKLKSL NKEEANNLYEEYQAQNNNTLYDIDEASASFLDSIYGDTNGENFTDKDWADANKILNKYDL ELWDIGEGMVTIRELPHLYYDVFKDYVTDDYKEYLKIWAKDDEELYQADAGLSISFEELG DRIARWENFLNKYPNSTLKPKVTALLNSYREDYLLGMENTPTLDGGYDNIPITVDEEAKK 
EYDRFMKKYPNSPTVELIKYFLENYQNNDIYDLIRNKILNEFELDLTKEALSENLGRVLA IQDNFNEKIFTSADWTVNLDDNTFSNAKEKYPIEFIGTAILKENGETIWIWEDSSLATEI QATAGNNAIPILTYNSFELPKNMSANAFVSLACGILHDKIAFSGIDYTEKGGMYYFVVSK LPETVFSPVAIKKFADITELAIKNYDIDHKIFVENFLEWNKTKYEWQGDKIIADFGNEDK LEIQFEKIEDKYRIKEIIL >gi|224461358|gb|ACDC01000044.1| GENE 4 4176 - 4439 295 87 aa, chain - ## HITS:1 COG:no KEGG:SGO_1740 NR:ns ## KEGG: SGO_1740 # Name: not_defined # Def: integral membrane protein # Organism: S.gordonii # Pathway: not_defined # 1 87 1 87 87 128 81.0 6e-29 MSKAKFNAIVGSIGAFIGIFVFISYIPQIIANLNGAKSQPLQPLFAAVSCLIWVIYGWTK EPKKDYILIAPNLAGVILGTITFLTAL >gi|224461358|gb|ACDC01000044.1| GENE 5 4618 - 5157 757 179 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0139 NR:ns ## KEGG: Sterm_0139 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 31 149 56 166 171 66 36.0 4e-10 MDKVYNEIKEFLKDPIKNREDFFNSKVITWIDWREYDEDIIGYFNGLLAHEDIIELETKE IDLGRGIDLILKKDNKVLIIPYEDDETDRDITIKTLDEFISPKYQIRLFSESLGDDTLAF IVLNSNEWKDLENEFGKEKLEFFFTPVSQFKGIFNMSMKEVKKIYTEREVLRDKIFKNN >gi|224461358|gb|ACDC01000044.1| GENE 6 5175 - 5843 922 222 aa, chain + ## HITS:1 COG:no KEGG:FN0835 NR:ns ## KEGG: FN0835 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 26 222 1 198 198 287 73.0 2e-76 MDKWNDVFSANLGKIMAIQTACAKYVVKNRNWNVDFDRGIISFGNDEYPLQFLGSEANSS NTWLWAWENINGFDENIISLAREIKAKGEKLNLEALTTAEIEITDELNGHILSIVACGLA DKKYCYYRGPHSGGAIFVAFDGVDERVFAPINAKDFADIVVNSIQQFPLNHKLFVESFLE WNKTKYEWKENTLIANFKDSKLEIDFEEKVELARITNIRLNS >gi|224461358|gb|ACDC01000044.1| GENE 7 5862 - 6158 531 98 aa, chain + ## HITS:1 COG:no KEGG:FN0836 NR:ns ## KEGG: FN0836 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 98 1 98 98 162 89.0 3e-39 MLNHIVMWKIKEDVEDKEKVKLDIKNGLEGLFGKIKELREIRVETFMETTSTHDIALFVK IDNEETLKNYATNPLHVDVIKNYIKPFVYDRVCIDFFE >gi|224461358|gb|ACDC01000044.1| GENE 8 6171 - 7481 1329 436 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0718 NR:ns ## KEGG: 
Lebu_0718 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 435 1 439 439 360 48.0 5e-98 MLFKKDEKNLSVEIIDIKLDTSDVPSIKEAKLVHINGKAKLVKDMGKYDDNYTSPYQIKL NNIPIVQAKIPECSTCCSVLATGYGIENANCKELLDIQENINSNYISLEKSIRDIEPLLT LLETGFYLVADAISYPTDGDKNFFWNIPNEEIETLATGPVAIYDDEDSYFNYIYGEPVYL YPTQTTDSYDENRVKYYIDKFKELADSSPRAIVYYLDNFMNFVIDGHHKTCASALLGEPL RCLLIIPGVVARYPNEIKIFFSSSIIINKNDIPYNYSSFVKAEIPSLSSKEIVIKDGIVN KRKWEKKYLDSAKKYLTQKEYARMVDILINNKIEVTDNLIEDCLINFDIKSQKKMEKIIY KLKLLDIEKAQDIALKYAKNSLRYEINKNLREFIYKILVSIKDNNEVEQIFVDYYTYYSE NKEDPVLEIISTYWED >gi|224461358|gb|ACDC01000044.1| GENE 9 7498 - 8592 924 364 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0718 NR:ns ## KEGG: Lebu_0718 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 354 1 366 439 310 51.0 7e-83 MEIKKNIENLSVEIIDIKLDTSTIPSIKEAKLVYINGKSKLVTDIGRYDSNYRSPYQIKL NDVPLLQAKIPNCPTCCSLLATGYGIENANCKELLDIQEKINSNYISLEKSIEDIEPLLT LFETGFYLIADAICYPTDGDKNFFWNIPNKLEKENYFEYIYGQPVYLYPTQTTDSYDKNR VEYYIDKFIELDDSSPRTIVYNFTDYINFIIDGHHKACASALLGEPLRCILIIPAIVTKY YNVLEEKNETYLDFSSIKVSQAEIPEKYLPFVKEKRFKSKKKEIIIEDGSLSKREWEKEY LDSVKNYINLYNYAKIIDILRKEKININNNLMEEYLSNFDLVTQNKMKKIIYKLKLFDME KSEA >gi|224461358|gb|ACDC01000044.1| GENE 10 8594 - 8842 303 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740505|ref|ZP_04570986.1| ## NR: gi|237740505|ref|ZP_04570986.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 82 1 82 82 146 100.0 5e-34 MSFWYKPCPICDGQGRLEIMKNIDEDYLFFCCDECMSCWKNENDLEKRINKFMEYSIKFD FILATEEDIRKYKWEKYKLNIK >gi|224461358|gb|ACDC01000044.1| GENE 11 8856 - 9842 1077 328 aa, chain + ## HITS:1 COG:FN0837 KEGG:ns NR:ns ## COG: FN0837 COG0582 # Protein_GI_number: 19704172 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Fusobacterium nucleatum # 1 328 1 328 328 513 92.0 1e-145 MEIKKIDERDLVVNQRKKRNQDKKKTIFEIYKSEKTVKDYMFHLKDFLHFVYEGENDFSI SEVIPLMQDIEKEDVEAYIVHLFEDRKLKKTSVNTILSALKSLYKELESNGLKNPVKYIK LFKVNRNIENVLKVSIDDIRKIIGLYKIDSEKKYRNITILYTLFYTGMRSKELLTLQFKH FLRREDEYFFKLVQTKSGKDVYKPIHKSLVKKLEEYRSYLMNMYSLDSKDLDEHYIFATS VSNNSPLSYRSLNVIIQDMGKLIEKDISPHNIRHAIATELSLNGADILEIRDFLGHSDTK VTEVYINVRSVLEKKVLEKLPEINLDKE >gi|224461358|gb|ACDC01000044.1| GENE 12 10257 - 10796 584 179 aa, chain + ## HITS:1 COG:BS_yrdC KEGG:ns NR:ns ## COG: BS_yrdC COG1335 # Protein_GI_number: 16079729 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Bacillus subtilis # 2 178 5 181 187 140 45.0 1e-33 MEALIIIDMQKGFFKNILGKRNNPQAESNILKILENFRKENKEIIHIQHLSTDEKGILFR NEDREFLKSLEPLPNETIFQKSVNSAFIGTNLENYLRNKSIDKLIIAGMTLPHCVSTTVR MASNLGFKVILIEDATITFEMKDYYSDKLLSADEIHKYHISALNEEFCEILSTKNFLNL >gi|224461358|gb|ACDC01000044.1| GENE 13 10949 - 11950 977 333 aa, chain + ## HITS:1 COG:FN0646 KEGG:ns NR:ns ## COG: FN0646 COG0373 # Protein_GI_number: 19703981 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Fusobacterium nucleatum # 1 329 1 329 329 508 87.0 1e-144 MLDLDKIVVIGVSHENLSLLEREDFMRTRPKYIIEKLYKEKEINAYINLSTCLRTEFYIE LNSNISIDEIKKLFSVKMLVKTGIEAIEYLFKVSCGFYSVIKGEDQILAQVKSSHSEALE NEHSSKFLNIIFNKAIELGKKFRTKSMIAHNALSLEAISLKFIKNKFPNIEDKNIFILGI GELAQDILTLLSKEQLKNIYITNRTYHKAEQIKKQFEMVNVVDYKEKYKEMFEADVIISA TSAPHIVVEYDKFVPKMKEDKDYLFIDLAVPRDVDERLANFKNIEIYNLDDIWKVYNEHS MNRDKLLEDYSYLIDEQMEKLIKSLDYYKENTL >gi|224461358|gb|ACDC01000044.1| GENE 14 
11966 - 12871 1247 301 aa, chain + ## HITS:1 COG:FN0645 KEGG:ns NR:ns ## COG: FN0645 COG0181 # Protein_GI_number: 19703980 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Fusobacterium nucleatum # 1 298 1 298 298 508 85.0 1e-144 MKRHVVIGSRGSILALAQASLVKNSLEANYPDLTFEIKEIVTSGDKDLKSNWENSNASLK SFFTKEIEQELLDGQIDIAVHSMKDMPAVSPKGLICGAIPDREDARDVLISKNGFLVTLP QGAKIGTSSLRRVMNLKAIRPDFEIKHLRGNIHTRLKKLETEDYDAIILAAAGLKRTGMA DKITEYLSGEAFPPAPAQGVLYIQCRENDEEIKGILKSIHNEDIAKIVEIEREFSKIFDG GCHTPMGCYSQVDEDKIKFIAAYSHDGKQIRVVIEDDLTKGKEIAHMAAEEIKAKINKGN L >gi|224461358|gb|ACDC01000044.1| GENE 15 12887 - 14347 1821 486 aa, chain + ## HITS:1 COG:FN0644_1 KEGG:ns NR:ns ## COG: FN0644_1 COG0007 # Protein_GI_number: 19703979 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 461 92.0 1e-129 MKKGKAYIIGAGPGDFELLTIKAKRIIENADCIIYDRLISEDILRLPKKDAELIYLGKGN TEGGLIQDEINQTLVKKCLEGKSVARVKGGDPFVFGRGGEEVEALFQNEIEFEIIPGITS SISVPAYAGIPVTHRGIARSFHIFTGHTMENGKWHNFENIAKLEGTLVFLMGVKNLDLIV SDLIKYGKDSKTPIAIIEKGATKNQRVTVGNLENILELVEKNKILPPAITIIGEVVNSRE TFKWFESDKLAKRILVTRDKKQAVEMSENISKRGGIPVELPFIEIENLKIDLDNLSKYKA ILFNSPNGVKAFFENIKDIRSLANIQIGAVGVKTKEALEKNKIVPDFVPEEYLVDKLAED VVKYTEENDNILIVTSDISPCDTDKYNSLYKRNYEKVVAYNTKKLRVDREKVLETLKDID IITFLSSSTVEAFYESLDGDFFILGDKKIASIGPMTSETIRRLGMKVDYEAEKYTADGIL DEIFGA >gi|224461358|gb|ACDC01000044.1| GENE 16 14348 - 14594 126 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294782421|ref|ZP_06747747.1| ## NR: gi|294782421|ref|ZP_06747747.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 1 82 1 82 215 133 90.0 4e-30 MFEKLNPRSAEIIEQSSAVYNLKWKGNIEFLLYGHENSSSGWYYILKNNEQISSTYHYSE INDIFLKNLQRIIDDIESGKYN Prediction of potential genes in microbial genomes Time: Thu May 19 23:05:19 2011 Seq name: gi|224461357|gb|ACDC01000045.1| Fusobacterium sp. 
2_1_31 cont1.45, whole genome shotgun sequence Length of sequence - 46342 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 16, operones - 10 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 250 316 ## gi|237740511|ref|ZP_04570992.1| predicted protein 2 1 Op 2 . + CDS 269 - 706 698 ## COG3708 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 725 - 1609 824 ## COG1266 Predicted metal-dependent membrane protease 4 1 Op 4 . + CDS 1602 - 1841 172 ## FN0638 hypothetical protein 5 1 Op 5 . + CDS 1726 - 1974 169 ## gi|262067562|ref|ZP_06027174.1| conserved hypothetical protein 6 1 Op 6 . + CDS 1990 - 2964 1149 ## COG2849 Uncharacterized protein conserved in bacteria + Term 2972 - 3001 -0.3 - Term 2960 - 2989 -0.3 7 2 Tu 1 . - CDS 3015 - 4172 1953 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase - Prom 4303 - 4362 9.2 + Prom 4206 - 4265 13.9 8 3 Tu 1 . + CDS 4292 - 4747 612 ## COG2731 Beta-galactosidase, beta subunit + Prom 4794 - 4853 8.3 9 4 Op 1 15/0.000 + CDS 4886 - 5767 1281 ## COG3221 ABC-type phosphate/phosphonate transport system, periplasmic component + Term 5783 - 5815 1.6 10 4 Op 2 9/0.000 + CDS 5833 - 6576 234 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 11 4 Op 3 1/0.200 + CDS 6545 - 8113 1429 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component + Term 8166 - 8228 0.8 + Prom 8154 - 8213 10.0 12 5 Tu 1 . + CDS 8246 - 8668 908 ## COG3576 Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase + Term 8688 - 8743 0.7 + Prom 8740 - 8799 10.7 13 6 Op 1 2/0.000 + CDS 8870 - 11800 3124 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 14 6 Op 2 . + CDS 11802 - 13013 1397 ## COG3581 Uncharacterized protein conserved in bacteria - Term 13015 - 13053 2.1 15 7 Tu 1 . 
- CDS 13204 - 14211 1244 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases - Prom 14287 - 14346 11.1 + Prom 14188 - 14247 12.8 16 8 Tu 1 . + CDS 14486 - 15661 694 ## COG3547 Transposase and inactivated derivatives + Prom 15791 - 15850 14.0 17 9 Op 1 1/0.200 + CDS 15878 - 16924 1146 ## COG3053 Citrate lyase synthetase 18 9 Op 2 . + CDS 16938 - 17459 663 ## COG3697 Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) + Term 17525 - 17571 5.0 - Term 17785 - 17828 0.1 19 10 Tu 1 . - CDS 17854 - 18744 675 ## Acfer_1552 hypothetical protein - Prom 18947 - 19006 10.3 - Term 18966 - 19033 8.2 20 11 Op 1 . - CDS 19052 - 20332 2065 ## COG0151 Phosphoribosylamine-glycine ligase 21 11 Op 2 . - CDS 20356 - 21429 1219 ## TDE0552 ankyrin repeateat-containing protein 22 11 Op 3 10/0.000 - CDS 21449 - 22963 2139 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 23 11 Op 4 21/0.000 - CDS 23004 - 23588 736 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 24 11 Op 5 13/0.000 - CDS 23576 - 24595 820 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 25 11 Op 6 2/0.000 - CDS 24643 - 25992 1884 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 26 11 Op 7 4/0.000 - CDS 26039 - 26752 1172 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 27 11 Op 8 1/0.200 - CDS 26793 - 27266 763 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 28 11 Op 9 . - CDS 27279 - 31007 4988 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain - Term 31043 - 31075 -0.0 29 11 Op 10 . - CDS 31083 - 31178 75 ## - Prom 31202 - 31261 7.9 30 12 Op 1 . - CDS 31396 - 32439 1149 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 31 12 Op 2 . 
- CDS 32473 - 34464 2623 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 34523 - 34582 15.0 + Prom 34520 - 34579 11.4 32 13 Op 1 1/0.200 + CDS 34605 - 34952 180 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 + Term 34954 - 35003 6.6 33 13 Op 2 1/0.200 + CDS 35009 - 35632 609 ## COG2121 Uncharacterized protein conserved in bacteria 34 13 Op 3 2/0.000 + CDS 35642 - 37555 2732 ## COG0143 Methionyl-tRNA synthetase 35 13 Op 4 2/0.000 + CDS 37579 - 38196 954 ## COG0457 FOG: TPR repeat 36 13 Op 5 . + CDS 38208 - 39095 1067 ## COG1210 UDP-glucose pyrophosphorylase 37 13 Op 6 . + CDS 39133 - 39957 1052 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase + Term 40021 - 40069 -0.8 38 14 Op 1 . - CDS 40258 - 40626 477 ## Lebu_1626 hypothetical protein 39 14 Op 2 . - CDS 40610 - 41014 284 ## Lebu_1625 protein of unknown function DUF1722 - Prom 41093 - 41152 8.4 + Prom 41220 - 41279 7.1 40 15 Op 1 . + CDS 41382 - 43196 1972 ## COG0457 FOG: TPR repeat 41 15 Op 2 . + CDS 43212 - 43826 539 ## FN0848 hypothetical protein + Prom 43837 - 43896 6.6 42 16 Op 1 . + CDS 43929 - 45062 1203 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 43 16 Op 2 . + CDS 45063 - 45677 731 ## FN0850 putative cytoplasmic protein 44 16 Op 3 . + CDS 45667 - 46335 477 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|224461357|gb|ACDC01000045.1| GENE 1 2 - 250 316 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740511|ref|ZP_04570992.1| ## NR: gi|237740511|ref|ZP_04570992.1| predicted protein [Fusobacterium sp. 
2_1_31] # 10 82 1 73 73 128 100.0 1e-28 FQCWCYSYPMSTCLTTNTLTSARVTSVLTHFIKEIKHRERLIKDKVIIYDKKAEIYEVLK NFNIPYEYDEIENAFLIYGYKS >gi|224461357|gb|ACDC01000045.1| GENE 2 269 - 706 698 145 aa, chain + ## HITS:1 COG:FN0643 KEGG:ns NR:ns ## COG: FN0643 COG3708 # Protein_GI_number: 19703978 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 145 1 145 145 250 88.0 7e-67 MAYRLKAVTIRTNNSEEGIRKIGELWEDVLTGKLSLLVDGIIPVSQYSNYESDEKGDYDI SIVGVEHNFFEDMEKKVEKGLYKKYEAVDENGSVELCTKKAWENVWNDSHSGVLKRAFSV DYESSVPKEFSKDGKAHCYLYIAVK >gi|224461357|gb|ACDC01000045.1| GENE 3 725 - 1609 824 294 aa, chain + ## HITS:1 COG:FN0640 KEGG:ns NR:ns ## COG: FN0640 COG1266 # Protein_GI_number: 19703975 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Fusobacterium nucleatum # 1 293 1 293 293 350 76.0 2e-96 MTNKFQSYVDSIQSKSKLKLLLVPILVTILIIVLNQLLIIPLIFIFNDSFKEVLSFSGTS NLVSEAISLFLAIFLMTKISKLSTEQLGFSKDNIAASYLKGAFFGTLQVLSVFLMIFCLN AIEVYYVANIPILIFIKILIFFVFQGLFEEILFRGYLMPMFTKVIGIKFTIILLSFLFTC IHLLNPNLSMIALTNVFLAGVTFSLIYYYTGNLWLVGAMHTLWNFILGFVVGSYVSGIPT IYSIFFSVPIEGKDLISGGEFGFEASIVETILELGVSLFVIYLIKKEKKGENYE >gi|224461357|gb|ACDC01000045.1| GENE 4 1602 - 1841 172 79 aa, chain + ## HITS:1 COG:no KEGG:FN0638 NR:ns ## KEGG: FN0638 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 72 1 68 119 73 65.0 2e-12 MNKEILELVTKIFTFLKLEDYTKLKNILSMIEKEFPNYYKFFEKFKDKSMGEKASDILSN VFDSLTLGGNSISFTREKS >gi|224461357|gb|ACDC01000045.1| GENE 5 1726 - 1974 169 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262067562|ref|ZP_06027174.1| ## NR: gi|262067562|ref|ZP_06027174.1| conserved hypothetical protein [Fusobacterium periodonticum ATCC 33693] # 24 82 65 123 123 70 74.0 3e-11 MRNLKIKVWVKKHLIFYPMFLILLLWGETPLALLGKKAEKEEKEREITSQKCLLKNGIKE ILKNYSDSSEEKRFLEFLSEKI >gi|224461357|gb|ACDC01000045.1| GENE 6 1990 - 2964 1149 324 aa, chain + ## HITS:1 
COG:FN0637 KEGG:ns NR:ns ## COG: FN0637 COG2849 # Protein_GI_number: 19703972 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 152 322 1 171 172 180 57.0 3e-45 MKRKNFILTTLVFLFISILGLAVENPNPTTQESVIAALNPDFAEVVKEYKPNLENIDKMF NYIEKNIKQKGRAIFYSKIDQEKKELIVTDENNKLIYTEKLPEKLVNSISYYEAKQTYSL KNGKTLEYSEVNLETLGKRFKMKDETLRKNRINKKDAIQVLSLIGDMNKASQTVFSKIEY SNIETFDENDNLILTIKFKNKKMVIEQEVEGDKVKMITYFDNLNTMNGKMEAYVNDTLVS TMQIKNSIPEGESKIFYPSGKVLTVTNVKNGIMDGTMKVFYEDGKIRMSGHFKNGKKDGE FIEYDEDGNIINKALYKNDEMVTQ >gi|224461357|gb|ACDC01000045.1| GENE 7 3015 - 4172 1953 385 aa, chain - ## HITS:1 COG:FN1133 KEGG:ns NR:ns ## COG: FN1133 COG1820 # Protein_GI_number: 19704468 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Fusobacterium nucleatum # 1 385 1 385 386 622 82.0 1e-178 MKKILLKNANLVLENKIEKATVLLCEDKIEKIFSKDSDLSQITYDELIDLEGKYLGPAFV DVHVHGADGADAMDIDEEALRRISKYLAKEGTANFLVTTLTSTKDELKNVLEIAGKLQNK EIDGANIFGVHMEGPYFAVEYKGAQNEKYIKPAGIEELEEYLSVKDGLIKLFSISPHTQE NLEAIKYLSDRGVVVSVGHSNATYEAVIKAVDYGLSHATHTFNGMKGFTHREPGVVGAVF NSDNIMAEIIFDKIHVHPEAVRTLIKIKGIDKVVCITDAMSATGLAEGKYKLGELDVNVK DGQARLVSNNALAGSVLRMDIAFKNLIDLGYSITDAFKMTSTNAAKEFKLNSGLIKENKD ADLVVLDKDYNVCMTIVKGRVKYTN >gi|224461357|gb|ACDC01000045.1| GENE 8 4292 - 4747 612 151 aa, chain + ## HITS:1 COG:FN1134 KEGG:ns NR:ns ## COG: FN1134 COG2731 # Protein_GI_number: 19704469 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Fusobacterium nucleatum # 1 151 5 155 155 191 70.0 5e-49 MIYGELKDIKNYKGLNKNLDKAIDFIVDKKYLNANFGKNLIEGNSIYFDYPEKVMTRENK DIESEYHKKYADIHIVLEGEEIIGYTSFEDCVETKAYNSEKDIAFVKGENQAEVLLNGKN FALFFPEEVHLPLLKVGEIKEIKKVVFKIEI >gi|224461357|gb|ACDC01000045.1| GENE 9 4886 - 5767 1281 293 aa, chain + ## HITS:1 COG:FN1135 KEGG:ns NR:ns ## COG: FN1135 COG3221 # Protein_GI_number: 19704470 # Func_class: P Inorganic ion transport and 
metabolism # Function: ABC-type phosphate/phosphonate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 12 293 1 282 282 486 91.0 1e-137 MKLKKVWKLLALVSLIFLLISCGKKKEEKPLVMGLSPIANSEKLLEDAAPLYKMLGDDIG RPVEGYIATNYIGVVEALGTGTIDFALIPPFAYILANKKNGSEALLTSIGKNDEPGYYSV LLVRTDSGIEKVEDLKGKKVAFVDPSSTSGYIFPAVILMDHGIDVEQDVTYQFAGGHDKA LQLLINGDVDAIGTYESAITKFAKEFPEVTEKVKVLEKSDLIPGITLTVSSKLDDATKQK IKDAFIKVTNSKEGQELTLKLFGIKGFEDAKVDNYKLIEDKLNKMGIDIEKVK >gi|224461357|gb|ACDC01000045.1| GENE 10 5833 - 6576 234 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 226 1 219 223 94 25 9e-19 METIIEVKNLKKNYGNREILKDISFSIEKGEIISIIGESGAGKSTLMRCINGLEGINSGS IKFYDTDITKLKEKERNSIKKQMAYVFQDLNIIDNMYVIENVLIPFLNRKNFIQVLLNRF SRAEYERALYCLEKVGISKLAYTKAKYLSGGEKQRVAIARSIAPNVDLILADEPISSLDE KNSFQIMEIFKRINAKKNKTIILNLHNVEIAKKFSDKILALKNGEIFFFKKSSEVNEDDI RQVYQSS >gi|224461357|gb|ACDC01000045.1| GENE 11 6545 - 8113 1429 522 aa, chain + ## HITS:1 COG:FN1137 KEGG:ns NR:ns ## COG: FN1137 COG3639 # Protein_GI_number: 19704472 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Fusobacterium nucleatum # 1 521 1 521 522 671 88.0 0 MTLDKFIKVHNTKTFLKILTIVIVLLLFFFTLNLDFQDYIDGFSRLKGLIAGMMRIETED KKIVLFKMFETIITAFASSFIGVLLAVLCSPFLATNISNKYLAKFLTVCFSIFRTVPALV MAAILVSLIGIGSFTGFISLLIITFFSATKLLKEYLEEINPAKIQSFRSFGFSKFTFLRS CIYPFSKPYIISLFFLTLESSIRGASVLGMVGAGGIGEELWKNLSFLRYDKVSFIIVILL GFIFLTDTLSWFFRKKDNLIKITTSKGYKRSKFISNFVIIGVLILLVFSLNILYEDTNKI SAPVFFERLFTFLKKFRNLDFTYSGKALLALWQSFLVAFFATVFAAPSAIIVSYFANSVT SNKIIAFLIKIFINFIRTFPPVIVAILFFSGFGPGLISGFFALYLYTTGVITKVYVDVLE SVEVDYGLYGKSLGLRNFYIYLKLWLPSTYTNFVSIFLYRFESNMKNSSVLGMVGAGGIG QLLMNHIAFRNWEKVWVLLIFLIITIILIENLSEYIRNKVNK >gi|224461357|gb|ACDC01000045.1| GENE 12 8246 - 8668 908 140 aa, chain + ## HITS:1 COG:FN1138 KEGG:ns NR:ns ## COG: FN1138 COG3576 # 
Protein_GI_number: 19704473 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase # Organism: Fusobacterium nucleatum # 1 140 4 143 143 266 96.0 6e-72 MAKLTDAIKDLILNPVKEGAWTAQLGWIATVREDGAPNIGPKRSCRIYDDATLVWNENTA GEIMKDIERGSKVAIAFANWDKLDGYRFVGTAEVHKEGKYYDEAVEWAKGKMGVPKAAVV FHIEEVYTLKSGPTAGTRID >gi|224461357|gb|ACDC01000045.1| GENE 13 8870 - 11800 3124 976 aa, chain + ## HITS:1 COG:FN1139_1 KEGG:ns NR:ns ## COG: FN1139_1 COG1924 # Protein_GI_number: 19704474 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Fusobacterium nucleatum # 1 640 1 640 640 1204 93.0 0 MYYKIGIDVGSTTLKTVILNEKDEIVEKSYQRHFSKVREMTLEHFKSLKDLLNGKKFKLA ITGSAGLGISKDYGIPFVQEVFSTAGAVKKCYPQTDIVIELGGEDAKILFLKGAIEERMN GTCAGGTGAFIDQMASLLDMEVSELDKISFEHERIYPIASRCGVFAKTDVQPLLNQGAKK SDIAASIYQAVVEQTITGLAQGRPIKGTVIFLGGPLYFLKGLQERFVEVLKLSKEEAIFP ELAPYFVALGSAYFADTTEEIFDYDGVVNLLSQKKEKKVEHLVNPLFTSEEEFETFFKRH QKVTVPTRDITTYSGKAYLGLDSGSTTIKVVLLDEDENILYRYYSSSKGNPVSLFLEQLK KIRELCGDRIEIVSSTVTGYGEELMQVAFGVDIGIVETIAHYTAAKHFNPDVDFIIDIGG QDIKCFHIKDGAIDSIVLNEACSSGCGSFLETFAKSLGYSTQDFAKKAIFSKSPAELGSR CTVFMNSSVKQAQKDGAEVEDISAGLARSVVKNAIFKVIRARDINDLGENIVVQGGTFLN NAVLRSFEQELGREVLRPEISELMGAYGAALYGKKVQKEKSKLLNLEELENFQHNSSPGM CKLCTNHCQLTINTFTNGQKFISGNKCERGAGKKLQSDLPNMVAYKNQLFNSIPLKAGGR AKIGLPRALNIYEMLPFWAELFCSLDCDVVLSKVSNRNLYMKGQNTIPSDTVCYPAKLVH GHIIDLLEKNVDAIFYPCMSYTFDEGISDNCYNCPVVAYYPELIQANISEVEKTNFLYPH LGIENHKLFAEQMYEEFKNIIPKLTKKEMEQATEKAFKTYHEYRETVRQEGSRVLKFAEE NNYPVIILASRPYHIDPEINHGLDRLLNSLQFVIVTEDALYPVEGKLTTKTLNQWGYHAR MYNAAKYVSQHKNMELVHLVSFGCGIDAITTDEVQDILRSKNKLYTQLKIDEVSNLGAAK IRLRSLQATMKEREMY >gi|224461357|gb|ACDC01000045.1| GENE 14 11802 - 13013 1397 403 aa, chain + ## HITS:1 COG:FN1140 KEGG:ns NR:ns ## COG: FN1140 COG3581 # Protein_GI_number: 19704475 # Func_class: S Function unknown # Function: Uncharacterized protein 
conserved in bacteria # Organism: Fusobacterium nucleatum # 1 403 1 403 407 744 91.0 0 MDKNCKVLIPMMMDIHFDLIAGVLKNEGYDVEVLKTDHRGVIEEGLKSVHNDMCYPALLV IGQFIDALKSGKYDTNNVALLLTQTGGGCRASNYIHLLRKALEINGFHKVKVLSLNFEGL DKKNEFSLSFKGYFNLFYSILYGDLLMSIYHQSVAYEENPGDSKSILAYWKEKLISEVGK KPFKKLKENYKKIIEHFLTIPKNLSKKKIRVGIVGEIYMKYSPLGNNHLTDYLEKEGVEA VNTGLLDFLLFNLYDTIFDRKIYGRKGLKYYFVKYVVGYIEKKQKEMIDVIKQYKSFIPP SPFAKVREMTKGYLGHGVKMGEGWLLTAEMLEFIEMGVKNIVCAQPFGCLPNHIIAKGMI RKIKDNHPEANIIAVDYDPGASSVNQENRIRLMLENARMLATE >gi|224461357|gb|ACDC01000045.1| GENE 15 13204 - 14211 1244 335 aa, chain - ## HITS:1 COG:BS_yddN KEGG:ns NR:ns ## COG: BS_yddN COG2141 # Protein_GI_number: 16077571 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Bacillus subtilis # 4 333 2 329 339 273 45.0 4e-73 MENKKVKVSALNLVPQFQGETTIEAINRAVDLAKILEDLDYYRYWVAEHHNFRGVVSSAT ALLIQHILANTKKIKVGSGGVMLPNHSPLQVAETYGTLETLYPHRVDLGVGRAPGTDAET ASLIYRQKYANIHNFMEDILQLERYFGSEEEQGVVIANPGINTNVPIIILGSSTSSAYVA AELGLPYSFATHFAPAMAEEALSIYRKHFKASKYLDKPYFILGVLAHGADTDEEAEKLYT IAQQGSIRLLREEKGLYPLADEKFEENLNLSSAEKIFLKSRMGINLMGSKETMTKIWKEV KAKFDPDEVIAVSYMPKLEELEKSYRILKEVVENK >gi|224461357|gb|ACDC01000045.1| GENE 16 14486 - 15661 694 391 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 391 1 391 391 661 96.0 0 MFLLGIDIAKLNHVASCIDSSTNEVVFSNFKFKNDFKGFSALLNKIKSFDAKNLIIGLES TSHYGENLINFLFIHDFKVALINPLQTSHLRKANIRDAKNDNLDSLNIAKSLVFTKLNFV SKKNIECFSLKKLTRFRSNLIKQRSKAKIQLTSLLDLLFPELQYLFKSKIHSKAIYSLLK KHPSTEEIAALKDDEISNLLYASSKGHFKKEKSIELKSLAKTTVGIKDTSISLHLIQLIE LIELYTKQIKDIEIKITDIVNNLDTTLLSVPGISVIACAIILGETNNFENFSSSKKLLAF AGLDPKIRQSGNFNASSCRMSKKGSPYLRYALIFTAWNIVRHSEKFNKYYSLKRSQGKSH YNALGHVAHKLVRILFTLIKKNISYQEEKLE 
>gi|224461357|gb|ACDC01000045.1| GENE 17 15878 - 16924 1146 348 aa, chain + ## HITS:1 COG:FN0319 KEGG:ns NR:ns ## COG: FN0319 COG3053 # Protein_GI_number: 19703664 # Func_class: C Energy production and conversion # Function: Citrate lyase synthetase # Organism: Fusobacterium nucleatum # 6 348 3 345 345 604 91.0 1e-173 MSEYNISKIYENDKRSLKLIDDLLAKEEIRRDKNLDYTCAMFDDDMNIIATGSCFKNTLR CLAVDNSHQGEGLMNQIVTHLVDYEFSKGLSHLFLYTKNKSMKFFKDLGFYEIINIENQI VFMENKRTGFSDYLNNLKKDMREGKNIASLIMNANPFTLGHQYLVEKAASENDILHLFIV SDDSSLVPFKVRKKLVIEGTKHLKNICYHETGDYIISSATFPSYFQKDEVAVIESQANLD IEIFSRIAKALNINRRYVGEEPNSLVTNIYNQTMLKKLPENNIECVVVPRKKYSDKVISA STVRQIIKDGNLEDLKNLVPETTYNYFLSDEAKPVIDKIRSQADVIHY >gi|224461357|gb|ACDC01000045.1| GENE 18 16938 - 17459 663 173 aa, chain + ## HITS:1 COG:FN0318 KEGG:ns NR:ns ## COG: FN0318 COG3697 # Protein_GI_number: 19703663 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) # Organism: Fusobacterium nucleatum # 1 167 1 167 171 230 82.0 8e-61 MQGVEVGIEEVLICRERRVDIQNEMIKKYNMPLISFTMNIPGPIKTNQQIKKAFDIGKTL ILEKLKENNIEVLEIKELDENTGNELFISVDSKAEKIKDITIAIEESSLLGRLFDIDVID VNFEKLSRKSFRRCLICEEQAQECGRSRKHSIEELQNKVEEILEKELLQNNKK >gi|224461357|gb|ACDC01000045.1| GENE 19 17854 - 18744 675 296 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1552 NR:ns ## KEGG: Acfer_1552 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 9 264 68 321 506 80 23.0 8e-14 MIEADIKGLLDIIFKIEVNNYDKIEKELEIYFENYRDTILIYREPLLKYFSRNEISISSQ NNIFNFFKKMLTKSRNIFVIKISIIILNSLNLEYNIELLEIIKILALCSEFTLLGVLFIK TLKNIDINKEIYELAKKVYTWGKMACIFYLEANSNEIKDWILNESTEENILYNFVAITYS DKADIRKRLKKISFKKNEFSKISFLIYSLLFLDEEKGIIFLDYKEELLINYLERAKSIEL SETDYLTIEEVSSYMEDDIYYMEELGREMREDEYFFPLEISNKLLKECKEILNNRN >gi|224461357|gb|ACDC01000045.1| GENE 20 19052 - 20332 2065 426 aa, chain - ## HITS:1 COG:FN0981 KEGG:ns NR:ns ## COG: FN0981 COG0151 # Protein_GI_number: 19704316 # Func_class: F Nucleotide 
transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Fusobacterium nucleatum # 1 425 1 425 426 745 93.0 0 MKVLIVGSGGREHAIAWKISQNPKVEKIFAAPGNAYNKVIKNCENVNLKTSDDILNFALN EKIDLTIVGSEELLVDGIVDKFQENNLTIFGPNKEAAMLEGSKAFAKDFMQKYGVKTAKY QSFTDKEKAIKYLDEMSYPVVIKASGLAAGKGVVIAQNRKEAEDTLNDMMTNKVFATAGD TVVIEEFLDGVEISVLSITDSEVIIPFISAKDHKKISEKETGLNTGGMGVIAPNPYYTKT IEEKFIQNILNPTLKGIKEEKMNFAGIIFFGLMVANGEVYLLEYNMRMGDPETQAVLPLM KSDFLDVINSALNKELKNIKIDWENKSACCVVIAAGGYPVKYEKGNLISGLEKFDINNSD NKVFFAGVKEENDKFYTNGGRVLNVVSIQDNLEKAIEVAYKNVKEISFKDNYCRKDIGTL YVPVKD >gi|224461357|gb|ACDC01000045.1| GENE 21 20356 - 21429 1219 357 aa, chain - ## HITS:1 COG:no KEGG:TDE0552 NR:ns ## KEGG: TDE0552 # Name: not_defined # Def: ankyrin repeateat-containing protein # Organism: T.denticola # Pathway: not_defined # 1 354 1 354 354 483 66.0 1e-135 MIKLKDIGSFKSIPEILDDIIKENISKLDEHLAKAWDINKNISISKYTDLSPLDCALIME AFESVKWLVEHGVNLNMKDNPSFLTAVRYCDEKIIQYIVSNGAKINLTNNVKSDAFMEAI YGKNYKYLQLIHDLGHTVEKYGGKAFREAVSDRNYVVLNFFIKNGVDINYNEADMVYPFK PTPLCVAARYVDLAMCKFLVENGADVTLTEKDGMRPYSIALEKGDIEMAEYFKSLEPLEY HNLQNKLDELKSFKLPKNLIEFLQGDKLHFELDDCDFKWIEFFSLIDTIPMKVGRQKLLR ISKATGDYEDIYIVWNPKTKKIAFYDMEHEELKDITDFVDFIENTSSYMQKIIEGEL >gi|224461357|gb|ACDC01000045.1| GENE 22 21449 - 22963 2139 504 aa, chain - ## HITS:1 COG:FN0982 KEGG:ns NR:ns ## COG: FN0982 COG0138 # Protein_GI_number: 19704317 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Fusobacterium nucleatum # 1 504 1 504 504 918 95.0 0 MKKRALISVYDKTGILDFAKFLVSKGIEIISTGGTYKYLKENNIEVIEVSKITNFEEMLD GRVKTLHPNIHGGILALRDNEEHMRTLKERNIDTIDYVIVNLYPFFEKVKEDLSFEEKIE FIDIGGPTMLRSAAKSFKDVVVISDVKDYELIKEEINNSDDVSYETRKRLAGKVFNLTSA YDAAISQFLLDEDFPEYLNVSYKKFMEMRYGENSHQKAAYYTDNMSDGAMKNFKQLNGKE LSYNNIRDMDLAWKVVSEFDEICCCAVKHSTPCGVALGDNVEEAYRKAYETDPVSIFGGI VAFNREVDEASAKLLNEIFLEIIIAPSFSKSALEILSKKKNIRLIECKNKPSDKKELIKV 
DGGILVQDTNDRLYENLEVVTKAKPTSQEEKDLIFALKVVKFVKSNAIVVAKNLQTLGIG GGEVSRIWAAEKALERAKERFNTTDVVLSSDAFFPFKDVVELAAKNGVKAIIQPSGSVND KDSIEECDKNNISMIFSKLRHFKH >gi|224461357|gb|ACDC01000045.1| GENE 23 23004 - 23588 736 194 aa, chain - ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 8 191 3 186 204 173 53.0 2e-43 MSEINKKKIAVLVSGSGSNLQSIIDNVENGNLNCEITYVIADRECYGLQRAEKHGIETLL LDRKIIDNKLANEIIDSTLEGCKTDYIVLAGYLSILTEKFIKKWDKRVINIHPSLLPKFG GKGMYGIKVHEAVIKAGEKESGCTVHFVTNEIDAGEIITNVKVPVLEDDTPETLQKRVLE QEHKLLIKGIKKIL >gi|224461357|gb|ACDC01000045.1| GENE 24 23576 - 24595 820 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 4 331 13 339 356 320 49 1e-86 MINSYKDSGVDKEEGYKAVELMKKNVLKTHNKSVLTNLGSFGAMYELGQYKNPVLISGTD GVGTKLEIAMKQKKYDTVGIDCVAMCVNDVLCHGAKPLFFLDYLACGKLDAEVAAQLVSG VTEGCLQSYAALVGGETAEMPGFYKEGDYDIAGFCVGIVEKENLIDGSKVKEGNKIIAVA SSGFHSNGYSLVRKIFTDYNEKVSLKEYGENVTMGDVLLTPTKIYVKPILKVLEKFNVNG MAHITGGGLYENLPRCMGKDLSPVVFREKVRVPEIFKLIAERSKIKEEELFGTFNMGVGF TLVVEEKDVEPIIELLTSLGETAYEIGHIEKGDHNLCLK >gi|224461357|gb|ACDC01000045.1| GENE 25 24643 - 25992 1884 449 aa, chain - ## HITS:1 COG:FN0987 KEGG:ns NR:ns ## COG: FN0987 COG0034 # Protein_GI_number: 19704322 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Fusobacterium nucleatum # 1 448 1 448 448 845 94.0 0 MGILALHSKKVRKDLVGIAYYGMYALQHRGQEGAGYTICDSKTNNEVRIKTVKNVGLVSD VFKVEDFQKYIGNILIAHTRYGSKNTVSIRNCQPIGGESAMGYISLVHNGDISNREELKQ ELLNNGSLFQTAIDTEIILKFLSINGKYGYKEAVLKTVEKLKGCFALGIIINDKLIGVRD PEGLRPLCLGRIAEDDMYVLASESCALDAIGAEFVRDIEAGEMVVIDDNGVESIKYKEST KKASSFEYIYFGRPDSVIDGISVYDFRHQTGKCLYEQNPIEADIVIGVPDSGVPAAIGYA 
EASGIPYSAALLKNKYVGRTFIAPVQELRERAVRVKLNPIKELIKDKRVVVIDDSIVRGT TSKKLIDVLFEAGAKEVHFRSASPVVIEESYFGVNIDPNNKLMGSYMSIEEIRQAIGATT LDYLSLKNLKKILNGGEDFYTGCFKEDEE >gi|224461357|gb|ACDC01000045.1| GENE 26 26039 - 26752 1172 237 aa, chain - ## HITS:1 COG:FN0988 KEGG:ns NR:ns ## COG: FN0988 COG0152 # Protein_GI_number: 19704323 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Fusobacterium nucleatum # 1 237 1 237 237 426 97.0 1e-119 MEKGKFIYEGKAKQLYETDDKDLVIVHYKDDATAGNGAKKGTIHNKGVMNNEITTLIFNM LEEHGIKTHFVKKLNDRDQLCQRVKIFPLEVIVRNIIAGSMAKRVGIKEGTKINNTIFEI CYKNDEYGDPLINDHHAVAMGLATYDELKEIYDITAKINNLLKEKFDKIGITLVDFKIEF GKNSKGEILLADEITPDTCRLWDKETGEKLDKDRFRRDLGNIEEAYIEVVKRLTEAK >gi|224461357|gb|ACDC01000045.1| GENE 27 26793 - 27266 763 157 aa, chain - ## HITS:1 COG:FN0989 KEGG:ns NR:ns ## COG: FN0989 COG0041 # Protein_GI_number: 19704324 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Fusobacterium nucleatum # 1 157 1 157 157 279 97.0 2e-75 MKVGIIFGSKSDVDVMKGAADCLKKFGIEYSAHVLSAHRVPELLEETLEKFEKEDYGVII AGAGLAAHLPGVIASKTVLPVIGVPIKAAVEGLDALFSIVQMPKSIPVATVAINNSYNAG MLAVEILAVGNKDLRSKLLEFRKEMKEDFKKNIHVEL >gi|224461357|gb|ACDC01000045.1| GENE 28 27279 - 31007 4988 1242 aa, chain - ## HITS:1 COG:FN0990_1 KEGG:ns NR:ns ## COG: FN0990_1 COG0046 # Protein_GI_number: 19704325 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Fusobacterium nucleatum # 1 976 5 983 983 1722 91.0 0 MSDLRFFVEKKKGFDLDAKRLEKQLREELGIDIKNLRLINCYDIFNLSADKENVKKMILS EPVTDSITEELDLKGKKYFAVEFLPGQFDQRADSAIQCIDIVSTVKQNVDVLTSKIIILN DEITDEELNRIKKFYINPIEMREKDLSVLKKEEILFNSEVITYDNFTSLNDAEIEKMRTD LGLSMSFEDLKFVQDHYKEIGRNPTETEIKVLDTYWSDHCRHTTFETKINKVTFPNSEFG KQMEKEFNEYLKLKEDVSKKRAVSLMDMATIVAKYLKKEGKLDNLEVSEENNACSVYVDV EVEDFEGKKSIEKWLLMFKNETHNHPTEIEPFGGASTCLGGAIRDPLSGRAYVYQAIRVT 
GSGNPLETVEETLKGKLPQKKITTGAASGYSSYGNQIGIATTLVSEIYHDGYKAKRMEVG AVVAAAPVENVVRKSPTPGDSIIIIGGKTGRDGCGGATGSSKEHNDKSLLLCEAEVQKGN APEERKIQKLFRNPNATKLIKKCNDFGAGGVSVAIGELADGVEVNLDLVPVKYEGLNGTE LAISESQERMAVIISKEDTEKFLKFVDEENLLGTVVGYVTDKNRLTLNWKGKAIVDISRD FLNTNGVQQSIDIEVRDYENENVFEKFKTSDSSLEKKWLHNIKKLNVASQKGLVEMFDSS VGAGTILAPFGGKYQMSPTDVSIMKFPVLDKNTDTASAITWGFNPYISEWSTYHGAIYAV VESLAKLVAAGVDYKTARLSFQEYFEKLGKDAYKWSKPFLALLGAMKAQKDFDVAAIGGK DSMSGTFNDISVPPTLISFAVSPVNIHDVISTEFKKAKNKLYLVENKIDEKDYLFNSEEL KENFEFVLKNIKNKKIVSAMVIKMGGLAEALSKMSFGNRLGFDIDNKEVDFFSLKPASIL IETTEELSYKNAIYLGEVSDKFEGKVNGENINLENVESVWLNKLKPIFPYKLEEEIETYD IKNKISEKKIYKSSITIAKPRVVIAAFPGTNSEYDMYNRFNENGAEAKITLLRNLTQNHL AESVDQMCKDLRNSQIFALPGGFSAGDEPDGSGKFIAAVLQNPKLMDEIKAFLDRDGLIL GVCNGFQALVKSGLLPYGEIGNVHENSPTLTFNKIGRHISQLVKTKIVTNNSPWLSSFEI GETFDIPVSHGEGRFYASDEVLKELFENGQIATQYVDFDLNATSEFRFNPNGSSLAIEGI ISPDGKIFGKMGHSERYSRDAFKNIPGNKDQNLILNGIKYFK >gi|224461357|gb|ACDC01000045.1| GENE 29 31083 - 31178 75 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLITKNNINRIRFKIAVFKGIVANSYKCGL >gi|224461357|gb|ACDC01000045.1| GENE 30 31396 - 32439 1149 347 aa, chain - ## HITS:1 COG:FN1199 KEGG:ns NR:ns ## COG: FN1199 COG0389 # Protein_GI_number: 19704534 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 345 6 350 350 527 84.0 1e-149 MHYDMDAFYASIEINRNPKLKNKPLVVGENIVTTASYEARKYGIHSAMKVSDAKLLCPKL IAIPVDKKEYIRISNEIHNLILKITNKVEFIATDEGYIDLTGIVKPENKKQFALKFKERI KELTNLTCSVGIGFNKLSAKIASDINKPFGIYIFENEKDFVEYISDKKIKIIPGVGKKFF EILKHDKIFLVKDVFKYSLDYLVKKYGKSRGENLYCSVRGINHDEVEYEREIHSIGNEET YSIPLQTNSELEREFNSLFEYTYQRLLKNNVFSQSVTVKIRYSSFKTYTKSKKLKFATKD KDFLYNEMLELLNSFELEDEIRLLGIYFGDIKRNTLIQLSINESLKK >gi|224461357|gb|ACDC01000045.1| GENE 31 32473 - 34464 2623 663 aa, chain - ## HITS:1 COG:FN0224 KEGG:ns NR:ns ## COG: FN0224 COG0556 # Protein_GI_number: 19703569 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of 
the DNA excision repair complex # Organism: Fusobacterium nucleatum # 1 653 1 653 663 1130 94.0 0 MENNLFKIHSEYKPMGDQPTAIESIVKNIERGVKDQVLLGVTGSGKTFTIANVIERLQRP ALIIAPNKTLAAQLYSEYKKFFPENAVEYFVSYYDYYQPEAYIKTTDTYIEKDSSVNDEI DKLRNAATAALIHRRDVIIVASVSSIYGLGSPDTYRKMTIPIDKQTGIQRKELMKKLIAL RYERNDIAFERGKFRIKGDVIDIYPSYMNNGYRLEYWGDDLEEISEINTLTGQKIKKNLE RIVIYPATQYLTADDDKDRIIEEIKDDLRVEVKSFEDEKKLLEAQRLRQRTEYDLEMINE IGYCKGIENYSRYLSGKRPGETPDTLFEYFPKDFLLFIDESHITVPQVRGMYNGDRARKE ALVENGFRLKAALDNRPLKFEEFREKSNQTVFISATPGDFEIEVSDNNIAEQLIRPTGIV DPEIEIRPTKNQVDDLLDEIRKRVAKKERVLVTTLTKKIAEELTEYYIELGVKVKYMHSD IDTLERIEIIRALRKGEIDVIIGINLLREGLDIPEVSLVAIMEADKEGFLRSRRSLVQTI GRAARNVEGRVILYADIMTDSMKEAIIETERRRKIQKEYNAYNNIDPKSIVKEIAEDLIN LDYGIEDKKFENDKKVFRSKADIEKEITKLEKKIKKLVEELDFEQAIVLRDEMLKLKELL LDF >gi|224461357|gb|ACDC01000045.1| GENE 32 34605 - 34952 180 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. AzwK-3b] # 6 112 3 107 114 73 36 2e-12 MVRKLKGAKPAGNQADIVKQAQVMQQQMLEIQEELKSKEVSSSVGGGAVSVKVNGQKELV EVKLSDEIIKEAATDKEMLEDLILTAVKNAMAEAEELAEKEMAKVTGGINIPGLF >gi|224461357|gb|ACDC01000045.1| GENE 33 35009 - 35632 609 207 aa, chain + ## HITS:1 COG:FN1269 KEGG:ns NR:ns ## COG: FN1269 COG2121 # Protein_GI_number: 19704604 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 206 3 208 209 290 84.0 1e-78 MEENKKYRILGTILYYILRIISFTLRVEIVNKYNIDMQKAHIYGFWHSKLFITPIFFKDV EKKLAMSSPTKDGELISVPLEKMGYLLVRGSSDKKSISSTISLLKYLKKGYSIGTPLDGP KGPKEKAKKGLLYLCQKTSVPLVPVGISYTNKWILKKTWDKFEIPKPFSKVRIVLGEAMI IDENEDLDKYTEIVENTINDLNKIYEG >gi|224461357|gb|ACDC01000045.1| GENE 34 35642 - 37555 2732 637 aa, chain + ## HITS:1 COG:FN1268_1 KEGG:ns NR:ns ## COG: FN1268_1 COG0143 # Protein_GI_number: 19704603 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 527 1 526 526 1003 91.0 0 
MKKNFFVSTPIYYVNGDPHVGSAYTTIAADVINRYNKSMGMDTHFVTGLDEHGQKVEQAA EQHGFTPQAWTDKMTPNFKNMWAALNIKYDDFIRTTEERHKKAVKKILEIVHEKGDIYKG EYEGKYCVSCETFFPENQLNGSNKCPDCGKELTVLKEESYFFKMSKYADALLKHIDEHPD FILPHSRRNEVISFIKQGLQDLSISRNTFTWGIPIEFAPGHITYVWFDALTNYITSAGFE NDDKKFDKFWNDARVVHLIGKDIIRFHAIIWPCMLLSAGIKLPDSIVAHGWWTSEGEKMS KSKGNVVDPYNEIKKYGVDAFRYYLLREANFGTDGDYSTKGIVGRLNSDLANDLGNLLNR TLGMYKKYFNGVVVASSTSEEIDDVIKAMFDETIKDVEKYMYLFEFSRALETIWKFISRL NKYIDETMPWTLAKDETKKSRLAAVMNILCEGLYKIAFLIAPYMPESAQKISNQLGIDKD ITSLKFDDIKEWNIFKEGHQLGEASPIFPRIEIDKEEVVEEVKKELKIENPIAIDDFNKV QIKVVEILDVDKVKGADKLLKFKVFDGEFERQIISGLAKFYPDYKALVGEKVLAVANLKF AKLKGELSQGMLLTTEDKNGVSLIKIDKSVQAGAIVS >gi|224461357|gb|ACDC01000045.1| GENE 35 37579 - 38196 954 205 aa, chain + ## HITS:1 COG:FN1267 KEGG:ns NR:ns ## COG: FN1267 COG0457 # Protein_GI_number: 19704602 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 205 1 205 205 342 92.0 4e-94 MGIISKKDEKFFENVEYFSEIIDRINEIQANNNYSDEEMNNDLDVALWRAFVYINLWSYK GYARAEKILKKVENKGIKNPIWCYRYAVSISRLRKYEEALKYFFIGTEADPTYPWNWLEL GRLYYKFGELDKVYKCIEKGLELVPNDYEFLTLKDDVKNDRGYFYSINHYINEEVDKTED RELDYSDDKEWEKFKKETHYGEKCL >gi|224461357|gb|ACDC01000045.1| GENE 36 38208 - 39095 1067 295 aa, chain + ## HITS:1 COG:FN1266 KEGG:ns NR:ns ## COG: FN1266 COG1210 # Protein_GI_number: 19704601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 1 294 8 301 301 504 88.0 1e-143 MKKVTKAVIPAAGLGTRLLPATKALPKEMLTIVDKPSLQYIVEELVASGITDIVIITGRN KNSIEDHFDFSYELENTLKNEHKSELLDKVSHISTMANIYYVRQNMPLGLGHAILKAKSF IGDDPFVIALGDDIIYNPEKPVIKQMIEKYELYGKSIIGCQEVATEDVSKYGIAKLGDKF DEATFQMLDFLEKPSIEDAPSRIACLGRYLLSGKVFKYLEETKPGKNAEIQLTDGILAML KDGEDVLSYNFIGKRYDIGSKAGLLKANIEFGLRNEETKDNIKEYLKNLDIDKIY >gi|224461357|gb|ACDC01000045.1| GENE 37 39133 - 39957 1052 274 aa, chain + ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport 
and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 274 1 279 290 166 32.0 3e-41 MSTQDTKVLLINDIAGYGKVALSAMLPILSYKGFNLYNLPTAIVSNTLNYEKFRIEDTTE YLEETLKIWKDLNFSFDVISTGFIFTKKQMEIISKFCEEQSKKGVLIFNDPIMADNGELY SGISPDTVDYMKNIIAVSDVTMPNYTESCLLTNTKYKEGISTEEINSIINKIRGIGAKSV IVTSIPSVETRMVAGFDSKINEYFYLPYEEIPTYFPGTGDIFSSVIISETLEGKSLKVAT EKAMKIVKEIVFENKDQEDKKKGIHIEKYLNLFD >gi|224461357|gb|ACDC01000045.1| GENE 38 40258 - 40626 477 122 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1626 NR:ns ## KEGG: Lebu_1626 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 122 1 122 122 195 82.0 3e-49 MRLWHEEIIPLLPKNQLLGQHRECCALRGNGWGKKHKTVDYVFLYSPYHLFMYHLLVMDE MEKRGYKVSIEWRDKNYRGKTAEKYDSLEEKTVDKPIYKEHNAEYMIECIENLQKKGIEL EV >gi|224461357|gb|ACDC01000045.1| GENE 39 40610 - 41014 284 134 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1625 NR:ns ## KEGG: Lebu_1625 # Name: not_defined # Def: protein of unknown function DUF1722 # Organism: L.buccalis # Pathway: not_defined # 1 134 1 134 134 167 75.0 1e-40 MNSKKIRKDCEELWAKNKYYVLSKSHKVYLEIREYLKKKEVDFLFLNEKIQRVRNIEESK KDFSNAILHIWGYFKNEATEIEKQGLCNLLEEYMSGKNDQKSVIEYINILLKKYPNEYLE KSTLLKGEEDETLA >gi|224461357|gb|ACDC01000045.1| GENE 40 41382 - 43196 1972 604 aa, chain + ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 604 1 599 599 964 85.0 0 MKLDELIKKREQYQTEGNILKEIEILRVILNETEKQYGLESDEYIKALNELGGTLKYVGY YDEAEANLLKSLEIIKKKYGDNNLPYATSLLNLTEVYRFAQKFNLLEENYKKIVKIYQDN SADTSFSYAGLCNNFGLYYQNVGNMKAAYDLHLKSLDILKNYDSEEYLLEYAVTLSNLFN PCYQLGMKEKAIEYLYKAIEIFEKNVGKEHPLYSASLNNMAIYYYNERQLEKAIEFFEKA AEISKKTMGLDSDNYKNILSNIEFIKEELEKKSDTSSSQKSKVDNNEVGENPKKEDLKNI KGLELSKRYFYDLVLPEFEKNLNDILPLCAFGLVGEGSECYGYDDKISQDHDFGPSVCIW LRKDDYLKHKDKINGVLKVLPKTYLGFQELKESEWGSDRRGLLNIEDFYFKFLGSSKAPE 
TIADWQKIPETALATVTNGEVFLDNLGEFTKIRKDLLNYYPEPMRQNKIATRLMNISQHG QYNYTRCLKRNDLVAANQCLYLFVDEVIHLVFLLNRRYKIFYKWSNRALLDLKILGEEIH KLLEDMVFAQNKIPYVRKICKVLAEELRNQKLTNCDSEFLGDLGVDIQKNIDDDFFKNYS PWLD >gi|224461357|gb|ACDC01000045.1| GENE 41 43212 - 43826 539 204 aa, chain + ## HITS:1 COG:no KEGG:FN0848 NR:ns ## KEGG: FN0848 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 204 1 204 204 275 83.0 6e-73 MEKEKIIEEILEKEWKYFSNLNNIGGRADCQDNREDFIIMRKSQWETFNEETLLSYLEDL NSKNNPLFQKYAQMMKYNSPEEYEKIKDILEKASEEKNDLVNKIMFIYMEWEKEFFERYP IFSSMGRPLYSSEDDDIETSIETYLRGELLSYSEKTLKLYLNYVIDNKEKNINLAIKNMD NLARMQGFNDSNDVEEYYKNFSKN >gi|224461357|gb|ACDC01000045.1| GENE 42 43929 - 45062 1203 377 aa, chain + ## HITS:1 COG:FN0849 KEGG:ns NR:ns ## COG: FN0849 COG0156 # Protein_GI_number: 19704184 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Fusobacterium nucleatum # 1 377 5 381 381 558 81.0 1e-159 MLKENIIKELEGFKTENRFRTIKTNDKSLYNFSSNDYLGLANDKSLSQKFYENYTFDNYK LSSSSSRLIDGSYQTVMRLEKKVEEIYGKPCLVFNSGFDANSSVIETFFDKNSLIITDRL NHASIYDGCINSNAKVLRYNHLDVDALEKLLKKYSKTHDDILVVTESIYSMDGDCADLKK ICALKDEYNFTLMVDEAHSYGVYNYGIAYNEKLIDKIDFLIIPLGKAGASVGAYVICDEI YKNYLINKSRKFIYSTALPPVNNLWNLFILENLTLFHDKIEKLKDLVNFSLTTLKKANIE TSSTSHIISIIIGDNLKTINLSEALKEKGYLIYPIKEPTVPKDTARLRISLTANMKKEDL DAFFKILKAEMKKLGVM >gi|224461357|gb|ACDC01000045.1| GENE 43 45063 - 45677 731 204 aa, chain + ## HITS:1 COG:no KEGG:FN0850 NR:ns ## KEGG: FN0850 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 196 1 196 196 258 73.0 1e-67 MSKIYFFNGWAMDQNLLSPLKNSTEYEIKVINFPYNIDKTSINKEDIFIAYSFGVYYLNK FLSENQDLVYEKAIAINGLPETIGKFGINEKMFNMTLETLNQENLEKFLLNMDIDESFGR ADKTLEEAKYELQYFKDNYKTIPNYINFYYIGKNDRIIPASKVEKYCQNNNIAYELIACG HYPFSYFTDFKDIINIREENKNEF >gi|224461357|gb|ACDC01000045.1| GENE 44 45667 - 46335 477 222 aa, chain + ## HITS:1 COG:FN0851 KEGG:ns NR:ns ## COG: FN0851 COG0500 # 
Protein_GI_number: 19704186 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 222 1 222 222 298 72.0 4e-81 MNFDKHYSTYEKNSLAQKQVAEHLLSYMKDADILKSDVNSIFEIGCGTGIFTREYRKFFP KSSLILNDIFDVKSFIKDINYNIFIKENIEEIDIPKSDLVVSSSVFQWIDGLENLIRNIA ENTNILCFSTYVFGNLLEIKNHFDISLNYLKIEEIEKIIAKYFQKFKTYKETIKIDFENP LSVLRHLKYTGVTGFQRAPFSKIKSFKDNCLTYEVAYFICQK Prediction of potential genes in microbial genomes Time: Thu May 19 23:05:59 2011 Seq name: gi|224461356|gb|ACDC01000046.1| Fusobacterium sp. 2_1_31 cont1.46, whole genome shotgun sequence Length of sequence - 4237 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 132 165 ## + Prom 160 - 219 4.6 2 2 Tu 1 . + CDS 375 - 848 484 ## COG2801 Transposase and inactivated derivatives + Term 939 - 983 -0.1 + Prom 1017 - 1076 13.2 3 3 Op 1 . + CDS 1103 - 1624 669 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 4 3 Op 2 . 
+ CDS 1621 - 4237 1940 ## FN1150 hypothetical protein Predicted protein(s) >gi|224461356|gb|ACDC01000046.1| GENE 1 1 - 132 165 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KTLSEKEKIKQLEDEILYLKAENEYLKKLRALVQERELKEKKK >gi|224461356|gb|ACDC01000046.1| GENE 2 375 - 848 484 157 aa, chain + ## HITS:1 COG:FN1447 KEGG:ns NR:ns ## COG: FN1447 COG2801 # Protein_GI_number: 19704779 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 157 5 161 207 274 96.0 5e-74 MKKFNLQSIIRKKRKYSSYKGRIGKIADNHIKRDFEATAPNQKWFTDVTEFNLRGEKLYL SPILDAYGRYIVSYDISRSANLEQINHMLSLAFKENENYENLIFHSDQGWQYQHYSYQEK LKEKKITQSMSRKGNSLDNGLMECFFGLLKLEMFYEQ >gi|224461356|gb|ACDC01000046.1| GENE 3 1103 - 1624 669 173 aa, chain + ## HITS:1 COG:FN0874 KEGG:ns NR:ns ## COG: FN0874 COG0494 # Protein_GI_number: 19704209 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 170 1 170 171 263 81.0 9e-71 MKFKHISKKQVFKNDVITVFEETLALPNDNVVTWTFTGKKEVVAIIAELENEIFFVKQYR PAIKKELIEIPAGLVEKGENILDAAKREFEEEIGYRANKWEKICTYYNSAGINAGQYHLF YATDLVKTQQSLDENEFLEVIKIPFNDIDIFSFEDSKTMLALSYLKIKKEGAL >gi|224461356|gb|ACDC01000046.1| GENE 4 1621 - 4237 1940 872 aa, chain + ## HITS:1 COG:no KEGG:FN1150 NR:ns ## KEGG: FN1150 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 872 1 879 903 953 68.0 0 MKQIKYSYLNYNQAANNELLSTIEKLPEDNLIIVENELAKKQYFAYVNKGQLRVKTNLIS FEDFLDKIFISDKKILKDIKRFFLFYSYLKDDIKKKLNITNYFDCIEIADDFFEFFSYIR NKEDLESLNLSKWQEEKFELFFEIKNEMDKFLKENSYLPSDWLYSITNLKLDFLKKYKKL VFFDIVDFPHNFSKILETLKNYYDIEFILQMEDKDFNRDKLKLNKVSLIDKKMDIELAKY SNELELYTMILSRQYDNYYTTDANKKDRYSIFTKSNKYYLNDTKFYKIIETYLNLLNGID HKNKNLIDIFLVKENIFNSAFMEFYGLDVEDYKCFEKIISKDYRYISLNLLREDYYSHFL NDDENLKIKLNLIFETLDSIDNINNIDDLNNFLCTNFFSSKTDIDFFMENKFDSLYDKIY 
EILGLLKSNENIEFFNNFDSFFKTNIGKNIFTLFFNYLNKIEIYSLEKNQNKDKELKNLN LIKYSLKNLENSALVYADSQSLPKIKANNNLFTEQQKIKLALKTNEDEILIQKYRFFQNI LNLDKITVYSLVNQDINIDFSPFVYELVNKYSAKEHNTNDLKGFFEACYLQNKTEDFKKD PVFFRAFSKKNTDFINNTLTIGAYDYILLKKNETFFFLDKICGIESISETSPVNGMSPKV LGNILHKTLEDIFKTNWKNILKDSTNLILSKEEIKEYLERHIWKEKLKIENFMELYLDEV LFPRLINNIENFLKVLYEELKDSKIQRIEAEKESTTKNVAYLEHKGIQVILNGRADLLIE TDKARYIIDFKTGSYNKDQLEFYSIMFYGSDNSLPVYSAAYNFWEEEKDFDFSKHLIAQL DEKDNNFKTFLKEFLETKYYTPPNKSSLKEND Prediction of potential genes in microbial genomes Time: Thu May 19 23:06:15 2011 Seq name: gi|224461355|gb|ACDC01000047.1| Fusobacterium sp. 2_1_31 cont1.47, whole genome shotgun sequence Length of sequence - 5638 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 56 - 3187 3160 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 2 1 Op 2 1/0.000 + CDS 3261 - 4550 1756 ## COG1362 Aspartyl aminopeptidase 3 1 Op 3 . 
+ CDS 4619 - 5602 1244 ## COG2502 Asparagine synthetase A Predicted protein(s) >gi|224461355|gb|ACDC01000047.1| GENE 1 56 - 3187 3160 1043 aa, chain + ## HITS:1 COG:FN1149 KEGG:ns NR:ns ## COG: FN1149 COG1074 # Protein_GI_number: 19704484 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Fusobacterium nucleatum # 1 1042 1 1056 1056 1225 73.0 0 MNKIKNLVVSASAGTGKTYRLSLEYIAALSKKANTEAIDYKNILVMTFTRKATAEIKEGI LKKLSEFIELYDICKYSKLSVRETILNNENLDEKKKTNYINLIESIEKIEKDLIVDREFL DNLSNVYKDIIRNKEKLKIYTIDAFLNIIFKNIVINLMKIKSYSLIDEAENSAYYKKVLE SIFTNKKLFNDFKNFFTENSEKNIDNYISVVNELISSRWKYILSLNDNKEYIKKEKLSID EKPVEILRELFSYLENDAKKDLNDVLKNDCKIYLGKTPETQRELLVRNANFFFQDGTAGL IYNGNKLKKATDKEYKEYLISRQEILKENLAKEVYNEILIPYEEKIFELSLEIFRLYDMF KIRDKNFTFNDIAIYTYIAIFNKENGLMNENGLTDIFFESLDMNIETIFIDEFQDTSILQ WKILYEFTKKAKIVVCVGDDKQSIYGWRDGEKRLFENLETILKANPDTLKKSYRSDINIV SYCNEFFSAISRKDNWAFKPSEINSKNQGYVKAICMSDLDKEANIYSVLLEELKAFEPYD NVAIIARTNNELNEIAQLLENEKIPYILNNEKNISEYPGIFECFELLKYLIYENELALFN FISSPLSNIGTKDIEVLLKNKKEVLSYINFSQDNNFILSLENKKIINFLNKIIDIKKNFK NFKVQNLIHEIIKKFQFLDYFSKENEVKNIYDFYLLTNSYLSVLDLLNDYNDNKLILADL NSNKKGIELVTIHKSKGLEFKTTFVIKNDKKSKGFDINFLFEMNETYDKTTFSLFAKKGY KNILKSCFEDKVLEYIKKIQEEETNNFYVALTRPKNNLVILYNDRLFEEKPLENSNLKDF FSCEIGEIHRNIEIIEVETPKEISYNSSSYFLNSNIENEEIDNFEVSNSKFLLETEEKRM IGILVHYFFENLKYTSEEEVAFAKNLCYKKYLSYFGKKKLDEIFSKENIEMFLNKDKEIF SKKWNYIYNEYVLYDAIEKQEYRIDRLMIKDNDDGTGEIYIVDYKTGGKNENQLKTYASV LKKTFKELKDYEIKTNFLEFDVF >gi|224461355|gb|ACDC01000047.1| GENE 2 3261 - 4550 1756 429 aa, chain + ## HITS:1 COG:FN0775 KEGG:ns NR:ns ## COG: FN0775 COG1362 # Protein_GI_number: 19704110 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Fusobacterium nucleatum # 1 428 1 428 429 695 83.0 0 MNKQKLAKDLIKFIDESPSNYFACINAKEILNKNGFTELSEAEEWKLKKGEKYYVTINDS GIIAFTIGTDKIYKSGYRIAASHTDSPGFLIKPNPEMNKKDYDILNTEVYGGPILSTWFD 
RPLSFSGRVFVEGDSAFKPKKYFINYDKDIFIIPSLCIHQNRGVNDGVAINAQKDTLPLV SISKDKNKFSLTALLAKELKVKENEILSYDLSLHSREKGCILGANDEFVSVGRLDNLAAF HASLNSLIDNKDKKNTCIVVGYDNEEIGSHTIQGADSPTLANILGRISNAMDLTLEEHEQ ALAKSFVISNDAAHSIHPNYLEKADPTNEPKINCGPVIKMAANKSYITDGYSKAVIEKIA KDAKIPLQIFVNRSDVRGGSTIGPIQQSQIRIQGIDIGSPLLSMHSVRELGGVEDHYNLY KLISELFKN >gi|224461355|gb|ACDC01000047.1| GENE 3 4619 - 5602 1244 327 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 1 327 1 327 327 628 94.0 1e-180 MAYTSSLDILETEIAIKKVKDFFESHLSKELDLLRVSAPLFVIPESGLNDNLNGTERPVS FDTKNGERVEIVHSLAKWKRMALYRYNIENHKGIYTDMNAIRRDEDTDFIHSYYVDQWDW EKIISKEDRNEEYLKEVVRKIYSVFKATEDYITKEYPKLTKKLPEEITFITSQELEDKYP TLTPKNREHAAAKEYGAIFLMKIGGKLTSGERHDGRAPDYDDWDLNGDIIFNYPLLGIGL ELSSMGIRVDENSLEEQLKISHCEDRRSMPYHQMILNKVLPYTIGGGIGQSRICMFFLDK LHIGEVQASIWSQEVHEICRQMNIKLL Prediction of potential genes in microbial genomes Time: Thu May 19 23:06:26 2011 Seq name: gi|224461354|gb|ACDC01000048.1| Fusobacterium sp. 2_1_31 cont1.48, whole genome shotgun sequence Length of sequence - 36332 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 16, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 10 - 822 1282 ## COG5266 ABC-type Co2+ transport system, periplasmic component - Prom 863 - 922 14.2 2 1 Op 2 . - CDS 952 - 1575 531 ## COG1451 Predicted metal-dependent hydrolase - Prom 1688 - 1747 9.4 + Prom 1547 - 1606 5.7 3 2 Tu 1 . + CDS 1664 - 2269 168 ## COG0671 Membrane-associated phospholipid phosphatase + Term 2334 - 2387 1.2 - Term 2102 - 2144 1.5 4 3 Tu 1 . - CDS 2228 - 3244 1026 ## FN0917 hypothetical protein - Prom 3270 - 3329 13.1 + Prom 3308 - 3367 14.0 5 4 Tu 1 . 
+ CDS 3391 - 3465 81 ## + Term 3515 - 3554 3.7 + Prom 3479 - 3538 5.6 6 5 Op 1 2/0.000 + CDS 3588 - 5114 1971 ## COG0747 ABC-type dipeptide transport system, periplasmic component 7 5 Op 2 2/0.000 + CDS 5124 - 6296 1623 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 8 5 Op 3 . + CDS 6318 - 7838 1919 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Prom 7840 - 7899 4.3 9 6 Op 1 1/0.250 + CDS 7920 - 8765 975 ## COG0668 Small-conductance mechanosensitive channel 10 6 Op 2 1/0.250 + CDS 8832 - 9860 1465 ## COG0687 Spermidine/putrescine-binding periplasmic protein 11 6 Op 3 . + CDS 9880 - 10974 1090 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) + Term 10993 - 11027 5.3 - Term 10980 - 11013 5.1 12 7 Op 1 . - CDS 11022 - 11915 966 ## FN0821 hypothetical protein 13 7 Op 2 . - CDS 11979 - 13346 469 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 13380 - 13439 8.1 - Term 13359 - 13418 2.0 14 7 Op 3 . - CDS 13441 - 14790 1315 ## COG0534 Na+-driven multidrug efflux pump - Prom 14877 - 14936 17.2 + Prom 14907 - 14966 11.4 15 8 Tu 1 . + CDS 15084 - 15362 437 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 15403 - 15452 11.0 - Term 15240 - 15280 0.1 16 9 Op 1 . - CDS 15450 - 16448 1266 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 17 9 Op 2 . - CDS 16451 - 17377 820 ## FN0976 hypothetical protein 18 9 Op 3 . - CDS 17397 - 18887 1841 ## COG1492 Cobyric acid synthase - Prom 18967 - 19026 8.6 + Prom 18926 - 18985 12.7 19 10 Tu 1 . + CDS 19016 - 19612 620 ## Lebu_0573 hypothetical protein - Term 19608 - 19656 2.6 20 11 Op 1 . 
- CDS 19667 - 20260 913 ## COG3291 FOG: PKD repeat 21 11 Op 2 2/0.000 - CDS 20270 - 21418 1712 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB 22 11 Op 3 4/0.000 - CDS 21440 - 22765 1954 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Prom 22788 - 22847 5.2 23 11 Op 4 1/0.250 - CDS 22887 - 23681 1229 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 24 11 Op 5 1/0.250 - CDS 23710 - 24951 1838 ## COG0786 Na+/glutamate symporter - Prom 24977 - 25036 7.6 - Term 25006 - 25062 8.1 25 12 Op 1 3/0.000 - CDS 25078 - 26832 2977 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 26 12 Op 2 21/0.000 - CDS 26848 - 27651 1357 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit 27 12 Op 3 1/0.250 - CDS 27654 - 28619 1459 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit - Prom 28643 - 28702 6.4 - Term 28639 - 28683 8.5 28 13 Op 1 9/0.000 - CDS 28728 - 29855 1459 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 29 13 Op 2 . - CDS 29870 - 30274 701 ## COG0511 Biotin carboxyl carrier protein 30 13 Op 3 . - CDS 30315 - 30644 472 ## FN0199 hypothetical protein - Prom 30665 - 30724 7.3 31 14 Tu 1 . - CDS 30754 - 32760 1560 ## COG3711 Transcriptional antiterminator - Prom 32810 - 32869 10.6 32 15 Op 1 . + CDS 33093 - 33839 997 ## COG1262 Uncharacterized conserved protein 33 15 Op 2 . + CDS 33896 - 35236 1667 ## COG0534 Na+-driven multidrug efflux pump + Term 35393 - 35426 3.1 + Prom 35330 - 35389 4.6 34 16 Tu 1 . 
+ CDS 35438 - 36304 1089 ## COG0685 5,10-methylenetetrahydrofolate reductase Predicted protein(s) >gi|224461354|gb|ACDC01000048.1| GENE 1 10 - 822 1282 270 aa, chain - ## HITS:1 COG:FN0947 KEGG:ns NR:ns ## COG: FN0947 COG5266 # Protein_GI_number: 19704282 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 270 1 270 270 489 90.0 1e-138 MLSKKLLIGALVATMSMSAFAHFQMVYTPDSDISGKSSVPFELIFTHPADGVEAHSMDMG KDEKGTIQPVVEFFSVHNGEKKDLKANLKASKFGPASKQVTSYKFNFDKNSGLKGGGDWG LVVVPAPYYESAEDVYIQQITKVLVNKDELATDWNKRLANGYPEIIPLSNPITWKGEIFR GQVVDKDGKAVANAEIEIEYLNSNIKNSKFVGELQKDKTATVIYADENGYFSFVPVHKGY WGFAALGAGGELKHNGKELSQDAVLWIEAK >gi|224461354|gb|ACDC01000048.1| GENE 2 952 - 1575 531 207 aa, chain - ## HITS:1 COG:FN0946 KEGG:ns NR:ns ## COG: FN0946 COG1451 # Protein_GI_number: 19704281 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Fusobacterium nucleatum # 1 207 14 227 229 245 70.0 3e-65 MEYTITKKKIKNFILRIYPDLTIAVSAPLSATSKDIENFILSKKDWIEKTLEKLNKLKDD SIKILGKKVEKKVIQSDLERISLTDRNIFIYTKNIEEIEVEKKFLEWKYNKLKEIIDEAI EKYTKLLNTEINYYRIKKLSSAWGIYHRRENYISFNLDLIEKEIESIDYVVLHEICHIFY MDHQKKFWSLVEKYMPDYKIRRKKLKS >gi|224461354|gb|ACDC01000048.1| GENE 3 1664 - 2269 168 201 aa, chain + ## HITS:1 COG:FN0945 KEGG:ns NR:ns ## COG: FN0945 COG0671 # Protein_GI_number: 19704280 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Fusobacterium nucleatum # 3 197 2 197 199 191 68.0 7e-49 MKDNLQRLKIKYIIFITIFFTILYKGAEFYTRTLDYVPSYFMAWEKKIPFLTIFMLPYMT SAPFFFGTFLTIKDEKKLNFYVKQAIFLTVVSIAIFFIVPMKFYFPKPEIANPIFNFFFY VLGQLDSSFNQCPSLHVSFAFLSIGIYCKEMKTKLKYLVSLWGFLIAISVHFVYQHHFID FVGGFIMFLITWYIFPKFLKK >gi|224461354|gb|ACDC01000048.1| GENE 4 2228 - 3244 1026 338 aa, chain - ## HITS:1 COG:no KEGG:FN0917 NR:ns ## KEGG: FN0917 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # 
Pathway: Purine metabolism [PATH:fnu00230]; Pyrimidine metabolism [PATH:fnu00240]; Metabolic pathways [PATH:fnu01100]; DNA replication [PATH:fnu03030]; Mismatch repair [PATH:fnu03430]; Homologous recombination [PATH:fnu03440] # 11 332 1 322 322 424 78.0 1e-117 MFYFLYGNSPMIEFETEKKTEEILEKYPNISAKYYDCALKEEDEFLSALQINSIFKTVDF LVLKRAETLKSSGIQKLFKTLKNYDLNEKNIIIIYNVPIQYGKIVTEYEITKTSIKAIED IATFLDCTLIKENNIILNYVKDNLNITERDAKDLIELLGSDYYHIKNETNKIAAFLDRQP YSFEKIKNLISIDKEYNMKDLVENFFKTKNFTDILNFLETNKDSYLGIVYMLADELIVFL KLTSLINSGKISQHMNYNVFKELYNDFSDLFIGRNFKAQHPYTIFLKLNSLTYFSEEFLE NKLKELLYIEYGLKTGEREINIELNLFFKKFWKDVPSY >gi|224461354|gb|ACDC01000048.1| GENE 5 3391 - 3465 81 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNLHKINTTNQYWWGYTNENCIS >gi|224461354|gb|ACDC01000048.1| GENE 6 3588 - 5114 1971 508 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 508 1 510 511 292 35.0 1e-78 MKKKFIYLCLIVTIVLLGCFGKEKETEEVVKTNEEKIITIAEKAEIKTLDPQNTVDSASL SVIQMINQRLFKIDNNGNIIPEIAEEATKVDEKTTLIKIKKDLFFSNGEPVTVDDVLFSL NRAKESPRMTQDFYMIESFEKVDDSTIKVKTFYEAGNLLHKLASMGASIMSKKALEENET NIVGSGMFKLKEWVAGDRLVLERNTYFKDANSNIKEIVIKFIPEANSRMIMLETGEVDIA ESLLPLDFQKISKEDDKFVSVEMQSSSNMFIGFDLRDKHLADKRVRQAIAYAINNQDIVD SIYNGSATVATSPIPKITTGHNENSNPYTQNIEKAKELLAEAGYADGFNIVLNVNEDNQR VDTAVVIQDNLKAIGINVEIKTYQWASYVAFVENPAQEKGMFLMAWNIANDDPDELLYPL YHSSQIDAHTNVVFYKNEEFDNLISKARETTDKEKRIELYKKAQDIIQEELPHYAILYPM QNFAYKKSIKGIEVSKRGYFNFQNTIVE >gi|224461354|gb|ACDC01000048.1| GENE 7 5124 - 6296 1623 390 aa, chain + ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 3 390 5 386 387 295 42.0 9e-80 
MDIKEKAENIKDYIIEMRRHFHQNPELSLEEFETTKKIVNELEKMGIEVSTFKDGLTGCV GTIKGAKEGKTLLLRADIDALSVHEKTNLEFASRVDGKMHACGHDCHAAMLLGIAKILSE MKDKFSGNIKLFFQAAEEIGLGAKLSIEQGVMDNVDACYGVHVTPRFESPKINMQYGERM AATDVFKLTVEGTSSHGSSPHLGHDAIVASAAIITALQTIVSRINNPLKPAVVTIGTIKG GQRFNIIANEVIMEGNVRTFDEIFRKEIEIHIREIAESVAKAHSCTAKLEYRYGTGVVLN KDKNLVDIAQNAVKKLYGEDSLVEMEKITGGEDFSLLMEKAPGIFGYIGTRNPKVPGSEK INHHECFTVDEDALIRGTAVAVQFALDYLN >gi|224461354|gb|ACDC01000048.1| GENE 8 6318 - 7838 1919 506 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 493 1 497 511 290 35.0 3e-78 MKRKLIYLSILVLLVLFACSQKENKKETVEKEEKKILTMAQKAEIKTLDPQKATDSVSRS IIKLINQTLVYIDNEGNIVPELAQEITKVSPKETLIKIKNDIKFSNGETLTIDDVLFSLE RAKASPKMSQDLYMIESFEKVDDRTLKINTLYDAGNLLHKLASGGVAIINKKAFEKDENN IVGTGMFKLKEWVAGEKLVLERNEFFKDSKSNIDTLVVKFVPEANSRMIMLETGEIDLAR DLLPLDFKKISEDTKFTTVEIETPSNMFLGFDLRNELLADKRVRQAIAYAINNEDLVKTV FNGSASVATSPVPKITTGHNENSNNYPQNIEKAKQLLAEAGYPNGFNIELFVSEDNQRID MAVIIQDNLKKIGINAEIKTFQWAAYVSTIENPNIIKPLFIMSWNISNDDPDEVLYPLYH SSQIDAHTNVVFYKNEKFDNLISEARETTDKEKRMKLYEEAQDIIQEDLPHYTLVYPKQN FAYKASIKNIKYNKRAYLDFQDTIIE >gi|224461354|gb|ACDC01000048.1| GENE 9 7920 - 8765 975 281 aa, chain + ## HITS:1 COG:FN0619 KEGG:ns NR:ns ## COG: FN0619 COG0668 # Protein_GI_number: 19703954 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Fusobacterium nucleatum # 9 280 1 272 281 414 77.0 1e-116 MNKTFFEKMLEKLLIDLENYLPLLAGKLVAFLLVCFIWPKISKFMLKLLDKSRTLKNNDP LLLSFLKSLLKAIMYVIEAFLLIGIIGIKATSLVTILGTAGVAVGLALQGSLANLASGIL ILFFKQVSKGDFVSSLDKTIEGTVESIHILYTVIKQANGPLIFVPNNQIANASIINYSRN PYRRLDLVYSSSYDVPVDKVISVLHEVANDEKRIIKDNPDMPITITLNKHNASSLDYIFR AWVKKEDYLDAMFACNANVKKYFDKNNIEIPYNKLDLYMKK >gi|224461354|gb|ACDC01000048.1| GENE 10 8832 - 9860 1465 342 aa, chain + ## HITS:1 COG:FN0618 KEGG:ns NR:ns 
## COG: FN0618 COG0687 # Protein_GI_number: 19703953 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Fusobacterium nucleatum # 1 342 1 342 342 605 91.0 1e-173 MKKLFLLFLATIMLVSCGDSKDENTLYVYSWADYIPQFVYEDFEAETGIKVVEDIYSSNE EMYTKIKAGGEGYDIIMPSTDYYEIMMKEDMLAKLDKSQLENTKYIDDAYMAKLREFDPE NDYGVPYMRGITCIAVNTKFVKDYPRDYTIYDREDLAGRMTLLDDMREVFVPALALNGYK QDADSEEAMEKAKAKVLAWKKNIAKFDAESYGKGFANGDFWVVQGYPDNIYRELSEEDRK NVDFIIPPGDQGYSSIDSFVILKDSKNIENAMKFINYIHRPDVYAKISDFIEIPSINLEA DKLVTKKPLYDVSKTKDAQLLIDIGDKLNIQNKYWQEILIAN >gi|224461354|gb|ACDC01000048.1| GENE 11 9880 - 10974 1090 364 aa, chain + ## HITS:1 COG:FN0617 KEGG:ns NR:ns ## COG: FN0617 COG0592 # Protein_GI_number: 19703952 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Fusobacterium nucleatum # 1 364 1 364 364 513 79.0 1e-145 MKFSINKENVIGIISEYTNILKDNPVKPSLAGLFIEVKNNQVVFKGANTEVELIRYANCN IEVEGQVLIKPSLLLEYIKLVESEDINFEKKDGYLIVNNAEFSILDDTTYPEIKELPSTT IAKENSLQFAMLLEKVKFLTNSSSNVDTLFNSIKLIFQDNFIELASTDSYRLIYLKKSLE NMINKDILVPADSMSVIYKILKDLNEDVTLATIEDKLIVTWKDAYFSCKLLSLSFPDFRP LITNSTHDKKFEFNRDELNSSLKKVISVTKNSNDSKNVATFNFKVNQLLISGMSSNAKIN QKVNMIKTGEDLKLGINCKYIKEFIDNTDKNIIIEATNSSSMLKIIEEANENYIYLVMPV NIRV >gi|224461354|gb|ACDC01000048.1| GENE 12 11022 - 11915 966 297 aa, chain - ## HITS:1 COG:no KEGG:FN0821 NR:ns ## KEGG: FN0821 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 297 1 288 288 245 52.0 1e-63 MKKIFLSLSLLLFVSCVNLDKLNVFDKKDSKVAEKNTANSNKNVASSKKDKQKKSAPIVP TKGTKSKNLLRDAEAMPEDNYANRVKKYKAYNSLVAFNPSYKPNVEAKMGDLKSKIESTY TVKVSVTDLILQNLTKKEEFNNVGSKVFNYTKTNPDLNLLVDISSVNYSKPTINVKTAPK EYSEEYVNSEGNKVLNVVKYYENETTKTTALSFVVTYKLVSNTTGEVLFHYKKTIDKSYK ESWKNYYMSSFRMNKRKQIPNDEPEKSVPTKEQLYQIAYQEMYDMIQKEINNLPSIK >gi|224461354|gb|ACDC01000048.1| GENE 13 11979 - 13346 469 455 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 
30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 2 450 4 444 458 185 29 4e-46 MYDLIVIGWGKAGKTLAAKLAAKGKKIAVVEENPKMYGGTCINVGCLPTKSLVHSAKLIS QVKNYGIDGDYEFKNNFFKEAMKKKDEMTTKLRNKNFSILDTNENVDIYNGKGSFISNNE VKVTTKDGEVVLKADKIVINTGSVSRNLDIEGANNKNVLTSEGILELKELPKKLLIIGAG YIGLEFASYFRNFGSEVSVFQFDDSFLAREDEDEAKIIKEILENKGVKFYFNTSVKKFED LGDSVKATYVKDNEELVEEFDKVLVAVGRKANTENLGLENTSVELGKFGEVVVDDYLKTN APNIWAAGDVKGGAQFTYVSLDDFRIIFPQILEGTKGRKLSDRVLIPTSTFIDPPYSRVG INEKEAQRLGIAYTKKFALTNTIPKAHVINETDGFTKILINENNEIIGASICHYESHEMI NLLSLAINQKIKANVLKDFIYTHPIFTESLNDILG >gi|224461354|gb|ACDC01000048.1| GENE 14 13441 - 14790 1315 449 aa, chain - ## HITS:1 COG:FN1151 KEGG:ns NR:ns ## COG: FN1151 COG0534 # Protein_GI_number: 19704486 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 449 3 448 448 522 67.0 1e-148 MEIINNYFVENRKLIKNIFQITLPAVFDLLAQTLIMALDMKMVSSLGPSAISSVGVGTAG MYALIPALIAVATGTTALLSRAYGADNKLDGKKAFAQSFFIAVPLGIILTIIFLIFSEQI INLVGNAKDMNLSDAILYQNMTVIGFPFLGVSIATFYAFRAMGENKIPMIGNTLALVLKV ILNFLLIYLFKWGIFGAALSTTLTRLFSAIFSIYLVFWSKKNWISLELKDLKFDYFTSKR ILKVGIPAAVEQLGLRIGMLIFEMMVISLGNLSYAAHKIALTAESISFNLGFAFSFAASA LVGQELGKGSSQKALKDGYICTIIAMIVMSTFGLLFFIMPQFLVSLFTNDKDVIELSTMA LKIVSICQPFSGASMVLAGALRGAGDTKSVLLITYLGIFLVRIPITYLFLDVLNFGLAGA WIVMTIDLVIRSSLAFYIFRRGKWKYLQV >gi|224461354|gb|ACDC01000048.1| GENE 15 15084 - 15362 437 92 aa, chain + ## HITS:1 COG:FN0818 KEGG:ns NR:ns ## COG: FN0818 COG0776 # Protein_GI_number: 19704153 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Fusobacterium nucleatum # 1 91 1 91 91 138 90.0 3e-33 MTKKEFAKLLFEKGVFTTRTEAEKKVDIIFETMEKTLLDGEDISIINWGKLEVVERAPRL GRNPKTGEEVNIGERKSVKFRPGKAFLEKLNK >gi|224461354|gb|ACDC01000048.1| GENE 16 15450 - 16448 1266 332 aa, chain - ## HITS:1 COG:FN0975 KEGG:ns NR:ns ## COG: FN0975 COG1270 # Protein_GI_number: 19704310 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin 
biosynthesis protein CobD/CbiB # Organism: Fusobacterium nucleatum # 1 320 1 320 325 511 87.0 1e-145 MFNYFFIKFGIAYILDLILADPRWLYHPVIIIGKLISFLEKILYKAKNKIFSGAILNILT LSVTFIVSLLLVRTNYVVEIFFLYTTLATKSLANEGNKVYKILKSGDIEKAKKELSYLVS RDTNTLSLDKIIMSVVETIAENTVDGFISPAFYAFVGSFFHIELFGQMVSLALPFAMTYK AINTLDSMVGYKNEKYIDFGKVSARVDDVANFIPARLTGLIFVPLSTLILGYDFKNSLRI FFRDRNKHSSPNSGQSESAYAGALGIQFGGKISYFGKDYEKPTIGDKLKAFDYEDIKKAV NILYLVSFIATLVIISCSMMYNIPDLRVFFHL >gi|224461354|gb|ACDC01000048.1| GENE 17 16451 - 17377 820 308 aa, chain - ## HITS:1 COG:no KEGG:FN0976 NR:ns ## KEGG: FN0976 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 305 1 304 305 351 63.0 2e-95 MSVSFYVKNKRKIISYEPVLTVKEALALSDKELNVFAISDMDINELLLSPLSNYECLLIG VKNESARGFELSYDKKNKDYIVRIFTPSSREDWLLALNYIKTLAKKFNSEIENSRGEIYT IKELDKFDYESDILYGISSISAKINDREGAQYIILGLNRLVVFNKKILDKIYSSGNTIDA FSTTVREIQYLDASSAPQNFFKNNDDGKIMGNYTLVEGIRTILPYIPNVEFENSSIVRNE DISVWNITLLIIELNKNDGKNYYCLAGNLEYDKFIKKLSTDKYKFIDGAYIMLEPLTKEE ILKLLDGE >gi|224461354|gb|ACDC01000048.1| GENE 18 17397 - 18887 1841 496 aa, chain - ## HITS:1 COG:FN0977 KEGG:ns NR:ns ## COG: FN0977 COG1492 # Protein_GI_number: 19704312 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Fusobacterium nucleatum # 1 496 1 496 496 834 90.0 0 MKKANLMVVGTSSGAGKSLFVTALCRIFYKDKYKVSPFKSQNMALNSYITKDGKEMGRAQ VVQAEASGLEPEVEMNPILLKPSSMNKIQIIVCGKSIGNMSGVEYNQYKKNLIPILKETY SKIEAKNDIVIIEGAGSPAEINIKEEDISNFAMARIADAPVILVADIDRGGVFASIYGTI MLLKEEDRKRVKGIVINKFRGNREVLKPGFEIIENLTGVKTLGVIPYADIDIEDEDSLSE KYKSFKLNKNSNKIKVSVIKLKHISNVTDIDALSIHDDVEIQFVSERSQIGDEDLIIIPG SKNTIDDLKWLKESGIAEEIIKKARTKTIIFGICGGFQILGNKVKDPYHIEGDIEELNGL GLLDLETTMENEKTLVQYRGKLIAEEGLFKPLNNFEIKGYEIHQGLTEGNEKNLTSDNRT ILVNKNNIIATYLHGIFDNKDFTNNLLNEIRRRKGLEEVNSNISYEEYKIQEFDKLEKLV RENIDIEEIYKIIGLK >gi|224461354|gb|ACDC01000048.1| GENE 19 19016 - 19612 620 198 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0573 NR:ns ## KEGG: Lebu_0573 # Name: not_defined # Def: hypothetical 
protein # Organism: L.buccalis # Pathway: not_defined # 1 195 1 195 205 225 66.0 7e-58 MNFLGHSLISLEIDESTDKNTLYGNFTGDYYKGLVDRIELPEALKEGIKLHRIIDKVSDR KENYLNELLVDKFGIFKGIVSDMFIDHFLSKNFHKLFNKDIKLIEKKILNAIEENRNIFP KDFERMFKWLNDRNVMSNYKDIDFLERAFEGLARNIRKGEILNLATTELKKNYNLFEEKS IEEFFYVKDKSIEEFLNK >gi|224461354|gb|ACDC01000048.1| GENE 20 19667 - 20260 913 197 aa, chain - ## HITS:1 COG:MA4285_2 KEGG:ns NR:ns ## COG: MA4285_2 COG3291 # Protein_GI_number: 20093074 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 31 168 654 795 1325 73 35.0 2e-13 MDQNIWEYDDFIFKGDELKGMTAKGKDKVKTGGQTDLVIPAVTPDGLPLKKIADNAFYRR GLTSVVIPDTVESIGYDAFGVCKLKEVKLPEALVNIEGFAFYRNKLTKVEFGSKVRRIEP SAFAMNELSEITLPETLEYIGASAFYKNAFETITFPKALTKIDMYAFRKNNIHKVEVANS VDLHKFAFESFTTVERV >gi|224461354|gb|ACDC01000048.1| GENE 21 20270 - 21418 1712 382 aa, chain - ## HITS:1 COG:FN0208 KEGG:ns NR:ns ## COG: FN0208 COG1775 # Protein_GI_number: 19703553 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Fusobacterium nucleatum # 1 382 1 382 382 730 95.0 0 MAEIKELLEQFKYYAENPRKQLDKYLAEGKKAVGIFPYYAPEEIVYAGGMVPFGVWGGQG PIEKAKDYFPTFYYSLALRCLEMALDGTLDGLSASIITTLDDTLRPFSQNYKVSAGRKIP MVFLNHGQHRKEEFGKQYNARIFRNAKEELEKICDVKITDENLKNAFKVYNDNREEKRRF IKLAAKHPQSIKASDRSNVLKSSYFMLKDEHTALLRKLNQELEAIPEEQWDGVRVVTSGV ITDNPGLLEVFDNYKVCVVADDVAHESRALKVDIDLSIADPMLALADQFARMDEDPILYD PDIYKRPKYVLDLVKENNADGCLLFMMNFNDTEEMEYPSLKQAFDAAKVPLIKMGYDQQM VDFGQVKTQLETFNELVQLSRF >gi|224461354|gb|ACDC01000048.1| GENE 22 21440 - 22765 1954 441 aa, chain - ## HITS:1 COG:FN0207 KEGG:ns NR:ns ## COG: FN0207 COG1775 # Protein_GI_number: 19703552 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Fusobacterium nucleatum # 2 441 3 442 442 893 97.0 0 
MGKMEKLPNKTPRPIEGHKPAAAILRGVVDKVYANAWEAKKRGELVGWSSSKFPIELAKA FDLNVVYPENHAASAAAKKDGLRLCQAAEDMGYDNDICGYARISLAYAAGEPTDARRMPQ PDFLLCCNNICNMMTKWYENIARMHNIPLIMIDIPFSNTVDVPEEKIDYLIGQFNHAIKQ LEELTGKKFDEKKFEDACARANRTASAWLKSCKYMGYKPSPLSGFDLFNHMADIVAARCD EEAAMGFELLAEEFEQSIKEGTSTWEYPEEHRILFEGIPCWPGLKPLFEPLKDNGVNVTA VVYAPAFGFRYENIREMAAAYCKAPCSVCIETGVEWRETMAKENGISGALVNYNRSCKPW SGAMPEIERRWKEDLGIPVVHFDGDQADERNFSTEQYNTRVQGLVEIMQERKEERLANGE EVYTNFENTKETDWSKETIKH >gi|224461354|gb|ACDC01000048.1| GENE 23 22887 - 23681 1229 264 aa, chain - ## HITS:1 COG:FN0206 KEGG:ns NR:ns ## COG: FN0206 COG1924 # Protein_GI_number: 19703551 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Fusobacterium nucleatum # 2 264 3 265 265 478 97.0 1e-135 MSIFTMGIDVGSTASKCIILKDGKEIVAKAVISVGTGTSGPARAMKEALEQVGLSSVSEL QGAVATGYGRNSLAEVPAQMSELSCHAKGAYFLFPNVHSIIDIGGQDSKALKIGDNGMLE NFVMNDKCAAGTGRFLDVIAKVLEVNLEDLEKLDEKSTVDVAISSTCTVFAESEVISQLA KGTKIEDIVKGIHTAIASRVGSLAKRIGIKDDVVMTGGVALNKGMVRALERNLGFKLHTN EYCQLNGAIGAALFAYQKYTMTHQ >gi|224461354|gb|ACDC01000048.1| GENE 24 23710 - 24951 1838 413 aa, chain - ## HITS:1 COG:FN0205 KEGG:ns NR:ns ## COG: FN0205 COG0786 # Protein_GI_number: 19703550 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 1 412 1 418 419 515 75.0 1e-146 MEEISVFKVSMFETLMLAVLAIYFGEFLRKKINILVKYCLPASVVGGTIFAIVFYVLYSM KIVELEFDYKAVNQLFYCIFFAASGAAASMALLKQGGKLVLIFAVLAAVLAACQNAVALA VGKFMNVDPLISMMTGSIPMTGGHGNAASFAPIAVDAGAPAAMEVAIAAATFGLISGSII GGPLGNFMVKRHKLEDPLLDGKEEKAELTGEESTGILMGKSHIVQAVFLMCIAIGIGQII TNGLASINVKFPIHVSCMFGGILVRLFFDAKKGNHDVLYEAIDSVGEFSLGLFVSMSIIT MKLWQLSGLGMSLVVLLMAQVVFIIIFCYLLTFRLLGKNYDAAVMTVGHIGFGLGAVPVA MTTMQTVCKKYRYSKLAFFVVPVIGGFISNISNAIIITKFLDIAKSLHAVWVG >gi|224461354|gb|ACDC01000048.1| GENE 25 25078 - 26832 2977 584 aa, chain - ## HITS:1 COG:FN0204 KEGG:ns NR:ns ## COG: FN0204 COG4799 # Protein_GI_number: 19703549 # 
Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Fusobacterium nucleatum # 1 584 1 584 584 1135 96.0 0 MNYSMPKYFQNMPQVGNSLANIDEANENAVREVEAAIADSIAAMQDAGTPDEKIHDKDQM TALERIAELVDEGTWYPLNTLYNPEDFETGTGIVKGLGRIGGKWAVVVASDNKKIVGAWV PGQADNLLRASDTAKCLGIPLVYVLNCSGVKLDEQEKVYANRRGGGTPFFRNAELQQLGV PVIVGIYGTNPAGGGYHSISPTILIAHKDANMAVGGAGIVGGMNPKGYIDMEGAIQIAEA TMAAKQVEVPGTIHVHYDKTGFFREVYDDEIGVIDGIKKYMDYLPAYDLEFFRVDEPTEP ALDPNDLYSIIPMNQKKIYNIYDIIGRLFDNSEFSEYKKGYGPEVVTGLAKVDGLLVGVV ANAQGLLMNYPEYREKSVGIGGKLYRQGLIKMSEFVTLCSRDRLPIVWLQDTTGIDVGNP AEEAELLGLGQSLIYSIENSHVPQIEITLRKGSAAAHYVLGGPQGNNTNAFSLGTAATEV YVMNGETAASAMYSRRLAKDYKAGKDLQPTIDKMNQLINEYTAKSRPAYCAKTGMVDEIV PLYDLRGYISAFANAVYQNPKSICAFHQMILPRAIREFETYTKK >gi|224461354|gb|ACDC01000048.1| GENE 26 26848 - 27651 1357 267 aa, chain - ## HITS:1 COG:FN0203 KEGG:ns NR:ns ## COG: FN0203 COG2057 # Protein_GI_number: 19703548 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Fusobacterium nucleatum # 1 267 1 267 267 531 98.0 1e-151 MAKNYKNYTNKEMQAITIAKEIKDGQIVIVGTGLPLIGATVAKNKFAPNCKLIVESGLMD CSPIEVPRSVGDLRLMGHCAVQWPNVRFIGFETNEYLNGNDRMIAFIGGAQINPYGDLNS TIIGDDYIKPKTRFTGSGGANGIATYSNTVIMMQHEKRRFIEKIDYVTSVGWAGGPGGRE KLGLPGNRGPLAVVTDKGILRFDEVTKRMYLAGYYPGVTIEDIVENTGFELDTSRAVQLE APTEEIIKMIREDIDPGQAFIKVPVEE >gi|224461354|gb|ACDC01000048.1| GENE 27 27654 - 28619 1459 321 aa, chain - ## HITS:1 COG:FN0202 KEGG:ns NR:ns ## COG: FN0202 COG1788 # Protein_GI_number: 19703547 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: Fusobacterium nucleatum # 1 321 1 321 321 644 95.0 0 MSKVMSLHDAIAKYVESGDSLCFGGFTTNRKPYAAVYEIIRQGQTDFIGYSGPAGGDWDM LIGCGRIKAFINCYIANSGYTNVCRRFRDAVEKKHNLLLEDYSQDVIMLMLHASSLGLPY LPVKLMEGSDLEYKWGISAEIRKTIPKLPDKKLERIPNPFKEGEDVIAVPVPRLDTAIIS 
VQKASINGTCSIEGDEFHDVDIAIAARKVIVIAEEIVTEEEIRRDPSKNSIPEFCVDAVV HVPYGCHPSQLYNYYDYDPAFYKMYDSVTKTDEDFEKFIQEWVIDVKDHDGYLAKLGLPR VSKLRVVPGFQYAAKLVKDGE >gi|224461354|gb|ACDC01000048.1| GENE 28 28728 - 29855 1459 375 aa, chain - ## HITS:1 COG:FN0201 KEGG:ns NR:ns ## COG: FN0201 COG1883 # Protein_GI_number: 19703546 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Fusobacterium nucleatum # 1 375 1 375 375 560 96.0 1e-159 MNFFNVLAELLEASGFAALTWQNIAMILVSFVLFYLAIVKKFEPLLLLPISFGMFLVNLP LAGLMDEGGIIYLMSYGVKSNLFPCLVFMGVGAMTDFSPLIANPISLLLGAAAQLGIYVA FIFATQIGFTPAEAAAIGIIGGADGPTSIYIANNLAPHLLAPIAVAAYSYMALIPLIQPP IMKLLTTKKERAVKMGQLRKVSKTEKIVFPIAVVLFCSLLLPSVAPLLGLLMMGNLFKES GVVQRLSDTAQNAMINIITIMLGLSVGAKADGSTFLDVSTLKIIAMGLAAFCFSTAGGVI LGKILYYVTGGKINPLIGSAGVSAVPMAARVSQTVGAKENPTNFLLMHAMGPNVAGVIGS AVAAGFFMMIFKGTM >gi|224461354|gb|ACDC01000048.1| GENE 29 29870 - 30274 701 134 aa, chain - ## HITS:1 COG:FN0200 KEGG:ns NR:ns ## COG: FN0200 COG0511 # Protein_GI_number: 19703545 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Fusobacterium nucleatum # 1 134 1 134 134 156 86.0 8e-39 MKYVVTVNGKKFEVEVEKVGGAGKSLSRQPAERREAVKSEPVVETKAAVAPAPVETAPAA TTTGGTTITSPMPGTILDVKVNVGDKVKFGQTLAILEAMKMENDIPATGDGEVAEIRVKK GDAVETDAVLIVLK >gi|224461354|gb|ACDC01000048.1| GENE 30 30315 - 30644 472 109 aa, chain - ## HITS:1 COG:no KEGG:FN0199 NR:ns ## KEGG: FN0199 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 109 1 100 100 97 72.0 2e-19 MWTSDTMTLSESIITFLIGFSIVFAALIALALFIIISSKVINALVKEEEVVAPKPVANVS NNNANTASAKAVAEKDNQEAENLAVIISAISEELREPVENFTIVSVTEI >gi|224461354|gb|ACDC01000048.1| GENE 31 30754 - 32760 1560 668 aa, chain - ## HITS:1 COG:FN0198 KEGG:ns NR:ns ## COG: FN0198 COG3711 # Protein_GI_number: 19703543 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Fusobacterium nucleatum # 9 
665 1 658 660 788 77.0 0 MLKKQHFEILKIIENERKLSKVAELLNLTERSVRYKIHEINEELGSKKIEIKKREFFSSI TENDMDKLFDNIEENNYIYSQKEREELIILYTLMKKDNFLLKELADKLSTSKSTIRNDLK NLKKILLKYNIKLLQDDKLKYYFDYSEEDYRYFIAIYLYKYVSFDKKYDKIFFADLSYFR KIIYKEIKEEYINEIDSISKRIKKAELDFMDETLNILVILMVISQKREKKNSNLILDNIE ILEKREEYSQLKKNFSDFSNTNLLFFTDYLFKISRDEKDVFIKFRNWLDITVAVIKMVRA FEIESKTNLKNMDVFLDEIFYYIKPLIFRTKRKIKLKNSILRDVENLYPSIFNFLKKNFY YLEDIIDEKVSEEEIAYLVPFFHKALQNNNKMNKKAVLVTTYKENVALFLKEDIETEFLV DIDKILTLKNFEQIKDQLNQYDYILTTFNVEEDFMKEIKLAKVIELNPILTEKDIKKLED SGLIKNKKIKMTNLLKVILENSSEVNVKNLVHSLDEAFPEKIYNDIDRNKFSIANFLKEE NIFRTNLDSFEKILNKFFDLSFLQKNDINDIINKALSNNFYSYLGLKTAIIFHKFNTKNS QDSMIIAVNEKELYINSQKINTIILINSTCEIKFRGIIYNFVKLFFQNNNFNFNEQTDIY NFLITMDN >gi|224461354|gb|ACDC01000048.1| GENE 32 33093 - 33839 997 248 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 10 245 33 284 286 141 35.0 1e-33 MKKLEDFQKEYMIFVRGGKYKKKVFNLEVCKYPVTQSMWENIMGYNPSGFKGVNKPVEIV NWWEVLKFCNKLSEKYNLKPVYDLSQEEKGILKIIHLDGEIVEENKSDFKKTEGFRLPTE AEWEWFARGGQKAIDEGTFDYKYSGSNNIDEVAWYYENSGAKNKEGRTQNVGLKEANQLG LYDCSGNVWEWCYDMPDDESIEDGIVYRKLKGGAWVSNLELCQNFFCTSENAIFEDVDIG FRIVRTIH >gi|224461354|gb|ACDC01000048.1| GENE 33 33896 - 35236 1667 446 aa, chain + ## HITS:1 COG:FN0162 KEGG:ns NR:ns ## COG: FN0162 COG0534 # Protein_GI_number: 19703507 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 446 1 446 446 726 90.0 0 MDSIGNVGKKNLISLIIPIFFELLLVTIVGNIDTIMLGYYSDEAVGAIGGITQLLNIQNV IFSFINMATAILTAQFLGAKDYKRVKQVISVSLVLNVLLGLILGGIYLFFWESLLQKINL PAELIGIGKYYFQMVGGLCILQGIILSCGAILKSHGRPTETLIINVGVNILNIIGNAFFI FGWLGMPVLGPTGVGISTVISRGIGCVAAFYMMCKYCDFTFKKKYIKPFPFKIVKNILSI GLPTAGENLAWNVGQLMIVAMVNTMGTTIIASRTYLMLISSFTMTLSIALGQGTAIQVGH LVGAGEIKEVYHKCLKSLKIAFIFAFVTTSLVFLFRKPIMSIFTTNPDILKASLKIFPLM 
ILLEMGRVFNIVIINSLHAAGDIKFPMFMGITCVFTVAVLFSYLFGISLGWGLAGIWLAN AMDEWIRGLAMYFRWKSKKWQNKSFV >gi|224461354|gb|ACDC01000048.1| GENE 34 35438 - 36304 1089 288 aa, chain + ## HITS:1 COG:Cj1202 KEGG:ns NR:ns ## COG: Cj1202 COG0685 # Protein_GI_number: 15792526 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Campylobacter jejuni # 15 282 5 271 282 241 47.0 1e-63 MKIADIYKGKSLTTSFEVFPPNDKVGLEQVYNCLDVLSLEKPDYISVTYGAGGNTKGRTV EIADRIKNQNGVESVAHLTCIGAKKEEIDKVLEDLEKHNIENILALRGDYPVDRELEVGD FSYARDLINYIHEKKGDKFSIGAAYYVEGHRETNDLLDLFHLKEKVNAGVDFLISQIFLD NEFFYSFRDKLEKLQINVPLVAGIMPVTNAKQIKKITSLCSCTIPKKFLKILEKYEDNPS ALKEAGLAYAIEQVVDLVASDINGIHLYTMNRPETAKKIIDATGIIRK Prediction of potential genes in microbial genomes Time: Thu May 19 23:06:57 2011 Seq name: gi|224461353|gb|ACDC01000049.1| Fusobacterium sp. 2_1_31 cont1.49, whole genome shotgun sequence Length of sequence - 18972 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 5, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 63 - 122 10.0 1 1 Tu 1 . + CDS 194 - 3439 4410 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain + Term 3462 - 3498 5.0 + Prom 3443 - 3502 12.5 2 2 Op 1 . + CDS 3525 - 4301 894 ## COG1262 Uncharacterized conserved protein 3 2 Op 2 . + CDS 4329 - 4613 496 ## FN0165 hypothetical protein 4 2 Op 3 . + CDS 4627 - 4992 292 ## FN0166 hypothetical protein 5 2 Op 4 . + CDS 5007 - 5537 347 ## FN0167 hypothetical protein 6 2 Op 5 . + CDS 5552 - 6355 1067 ## COG1262 Uncharacterized conserved protein 7 2 Op 6 . 
+ CDS 6379 - 8004 1243 ## FN0289 hypothetical protein + Term 8064 - 8122 16.2 + Prom 8148 - 8207 9.3 8 3 Op 1 4/0.000 + CDS 8262 - 9149 963 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 9 3 Op 2 2/0.000 + CDS 9210 - 11375 2967 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 10 3 Op 3 35/0.000 + CDS 11391 - 12356 802 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 11 3 Op 4 1/0.000 + CDS 12356 - 13132 198 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 12 3 Op 5 1/0.000 + CDS 13146 - 14381 983 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 13 3 Op 6 . + CDS 14378 - 14896 759 ## COG0716 Flavodoxins + Term 14904 - 14944 2.1 - Term 14892 - 14930 6.3 14 4 Tu 1 . - CDS 14960 - 15781 1170 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase - Prom 15824 - 15883 7.4 + Prom 15755 - 15814 10.6 15 5 Op 1 14/0.000 + CDS 16015 - 16326 506 ## PROTEIN SUPPORTED gi|237740607|ref|ZP_04571088.1| LSU ribosomal protein L21P 16 5 Op 2 14/0.000 + CDS 16330 - 16659 477 ## PROTEIN SUPPORTED gi|197736146|ref|YP_002164924.1| possible ribosomal protein 17 5 Op 3 1/0.000 + CDS 16660 - 16944 489 ## PROTEIN SUPPORTED gi|237740609|ref|ZP_04571090.1| LSU ribosomal protein L27P + Term 16965 - 17023 -0.9 + Prom 17068 - 17127 14.3 18 5 Op 4 . 
+ CDS 17304 - 18887 2347 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 18931 - 18964 2.3 Predicted protein(s) >gi|224461353|gb|ACDC01000049.1| GENE 1 194 - 3439 4410 1081 aa, chain + ## HITS:1 COG:FN0163 KEGG:ns NR:ns ## COG: FN0163 COG0646 # Protein_GI_number: 19703508 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Fusobacterium nucleatum # 8 315 1 308 309 552 88.0 1e-156 MFEFEKELRERILVLDGAMGTVLQKYELTPEDFNGAKGCYEVLNETRPDIIFEVHKKYIE AGADIIETNSFNCNAISLKDYHLEDKVYDLAKKSAEIARDAVKQSGKKVYVFGSVGPTNK SLSFPVGDIPYKRAVSFDEMKEVIKIQVAGLIDGGVDGILLETIFDGLTAKAALLATEEV FEEKSVKLPISISATVNRQGKLLTGQSMESLIVALDRDSVTSFGFNCSFGAKDLVPLILK IKELTTKFVSLHANAGLPNQNGDYVETAQKMRDDLLPLIENQAINILGGCCGTNYDHIRA IAELVKGQKPRVLPEENLLETCLSGNEIYNFNDKFTCVGERNNISGSKLFRTMIEEHNYL KALEVARQQIDAGAKVLDINVDDGILDSVEEMKNFLRVLQNDSFIAKVPIMIDSSDFAVI EEGLKNTSGKAIVNSISLKEGTEEFLRKAKIIRKFGASIIVMAFDEKGQGVSAERKIEIC QRAYDLLKSIGVKNSDIVFDPNILSVGTGQEADRYHAREFIKTIDYIHENLKGCGVVGGL SNLSFAFRGNNVLRAAFHHIFLEEAVPRGFNFAILNPKEKAPQWTDDEREKIKSFIFGES TDMEALLSLNLIKRKEEAQIFAETPEDKIRKALIQGGSESLQEVIGDLLKKYKALEILEN ILMSAMQEIGRLFEQGELYLPQLIRSASVMNNCVDILTPYLDKVDKTSSKGKILMATVDG DVHDIGKNIVGTVLECNGYEVIDLGVMVPRDKIVEKAKEINADIVTLSGLISPSLKEMER VADLFQKVGMQVPVLIAGAATSKLHTGLKVLPNYDYSLHVTDAMDTITVISQLLSTKRKD FLETKQNQLRKIAKRYIDNNNGTEEKKVFPEVKKTVSYIPKVLGKQFLSLPVEIFKDTLK WDIALYALRVKNTPEEEKTLNDLKKIYEKLIEEKVEFRAAYGYFRCKKTETFLEMEGMTF EVSPNLAQYIEKEDYVGGFVISVGSKIFKDDKYLGLLETLLCNAIAETASEYMETRVSED IVPTFLRPAVGYPILPDHSLKKVVFDLIDGERTGAKLSPAFAMTPLSTVCGFYLCNDNAK Y >gi|224461353|gb|ACDC01000049.1| GENE 2 3525 - 4301 894 258 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 8 256 32 284 286 134 33.0 1e-31 MELQNFKDKYMVKVKGGIYKPSFEEIEKIVFDIEVCKYPVTKKMWLDIMGEISLETERNN 
EPAENITWWKALEFCNKLSEKYGLEPVYDLSKSKQGILAIRELKGKTIKTVDPKMANFKS TEGFRLPTEIEWEWFARGGQIAIEQGTFDYKYSGSNNIDEVAWYAENSNYLIQNVGLKKS NQLGLFDCSGNIWEWCYDTEEMENIKSLHFNFDPSSAYRRIRGGSWLHSAESCTTFYRIF ETAAYVVLNTGFRIVRTI >gi|224461353|gb|ACDC01000049.1| GENE 3 4329 - 4613 496 94 aa, chain + ## HITS:1 COG:no KEGG:FN0165 NR:ns ## KEGG: FN0165 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 94 1 75 75 118 97.0 5e-26 MLEEKLLKKLKTINENFINLGFDLEEDLVELVTQREDIKDRIENTKYKKMTFSKDEEANS YILNLEDCQISFDIIEGEDEEGPWFEVECNIIFF >gi|224461353|gb|ACDC01000049.1| GENE 4 4627 - 4992 292 121 aa, chain + ## HITS:1 COG:no KEGG:FN0166 NR:ns ## KEGG: FN0166 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 121 1 121 121 83 70.0 2e-15 MELIILTSISIILIVFLVLSFINEHLGFEKFNLKSKITTSIIILLIVNLIYFFDSYHEDD IILSSNIIIVGIDILFVLTNFFLLIFKRKGFYFIFFLLGLLFLVLPFFSIMMFALRGLPH G >gi|224461353|gb|ACDC01000049.1| GENE 5 5007 - 5537 347 176 aa, chain + ## HITS:1 COG:no KEGG:FN0167 NR:ns ## KEGG: FN0167 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 72 3 74 76 68 51.0 1e-10 MEIEIKEKIDSLEITKNCKQELRKITCIIIFLILLIYFLYIYYNPSMFLLFPIYEAPFAY IIFTIICQRYKYEVIYINSDKIKFSSSYTEKNSKTSMTSFFKIKNLKEIYVMEYHELPPK KFFGITKYKNIPHYMIHFIFSGEEDYKCWGYKISIEEATKIVRRIEEFLGKEKIFF >gi|224461353|gb|ACDC01000049.1| GENE 6 5552 - 6355 1067 267 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 13 265 32 284 286 145 37.0 9e-35 MENKEIELKNFEDEYMIKVKGGKYIPSFANELKEVFDIEVCKYPTTQLMWLEIMENNPSE VKALYKPVETVNWWQALEFCNKLSEKYGLEPVYELSKSLEETLMIKELGGKIVSPDKANF ENTEGFRLPTEIEWEWFASGGQKAIEQGTFEYKYSGSNNIDEVAWYLGNSDFKNNDVSIK EVGLKKPNQLGLFDCSGNIWEWCYDTTGEIENGKLYTYKTFEPYNIYRRIKGGSGAYSAK SSLIISRSETIATYSYKNFGFRIVRTI >gi|224461353|gb|ACDC01000049.1| GENE 7 6379 - 8004 1243 541 aa, chain + 
## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 290 540 53 306 308 149 40.0 2e-34 MERKEIKLNHFSFYSVFAIIALIYFSIKCIQFYLVVRIPQEIWETITTEKNSFISNESEL EAQLIVNFLTLLAIFLPPFLVYLFIKKIYKICNYFLSEEKIVISDEYFSYTRKLAMINFE KFEINLNEIKRITKIPMKVPTRFSTNIPALAILWYFKEQERILIKDKNGKEYKIWNIPAR KFSPSTYFGTPKDDADLYIKELREYLKLEEENIEDEQETESLNIERKKLIYRHPDLSEKK KSFLILLFFQLFLIFIFLEILSEGIRAFYRGGIEILIFIVFGIACIGISYFLIKAIKNAI IYFFPYEEYEIIEDRLYYKKKLKLFGKSFVMERFDVALKDIDSISSLAPKISYMGIKPLD DFKPFKRIYIKLKNGEGYEVCNWGKISYNYTYFSGNNDKVLEIEFKEVFNKIKFFIENGE KKYNFETQLDETKSNYNLEESERYNFILNKIIEEEKLYLYKDEEKFIVNAEEIAIKNLAI FSTINFEEMDFYVFYVNYLSKKEYEDKRVLVGFNGIDGKEVTVSRLKDDINEIRDSKSTF I >gi|224461353|gb|ACDC01000049.1| GENE 8 8262 - 9149 963 295 aa, chain + ## HITS:1 COG:FN0767 KEGG:ns NR:ns ## COG: FN0767 COG0614 # Protein_GI_number: 19704102 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 22 295 1 274 275 390 82.0 1e-108 MKKLILLISFILFSFTSFAVKIENNEIVDKYGNKIEAREYKRIIVTDPGVIEILFKIGGE KSIVAIGKTTKSKIYPYEKVEKLQSIGSISNLNLEKIVEYKPDLVVVSSMMLKNVESIKK LGYKVIVSNASNLNEILELISVVGVISGRKNEAEDLRKVSSLKLEKIIKENHKKSSNLKG AILFSTSPMTAFSDDSLPGDVLKHLGVTNIASNVPGQRPILSPEYILKENPDFLAGAMSL DSPQQIIEASNVIAKTKAGKNKNIFILDSSLMLRSSYRIFDEMEVLKSKLEKIKN >gi|224461353|gb|ACDC01000049.1| GENE 9 9210 - 11375 2967 721 aa, chain + ## HITS:1 COG:FN0768 KEGG:ns NR:ns ## COG: FN0768 COG1629 # Protein_GI_number: 19704103 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 6 721 1 715 715 949 71.0 0 MKKYLMSLSLFIFCATSYGETIDLGEKSIYSETGFKNSLRSSTTSPFILKSKDIEGKGYT SVSEVLDSIPGVNVKEGAHPAIDLRGQGFQKAKATVQLLVDGIPANMLDTSHQNVPINVV NIDEIERIEVIPGGGAVLYGSGTSGGVINIITKKYKNKNIRGGVGYQISSFRNNKFDVST GTSVGNFDFDVNYSKNRKYGYRDYDFTNSDYFSGRINYNINKTDNIAFKYSGYRSKYTYP 
ASLTETQLDRDRRQSGLGSDDKNDNNKIKKDEFSLTYNSKITNNNDLNVVAFYQKTEIPS ESISDGTGMYKGILAGQVAGLSSALRNPSLPTSARLAMTNRLNALIAQLRNPSRVDFSSH SNFEDRKISIKAKDKYTYDNNGSNVIIGLGYTDDNMIRASKMELVGKMKLVDTHMDLTKK TFESFALNTYKVNNFEFIQGLRYEKSKFDGSRRNLDDVSTVKRDMNNWAGTLAVNHLYSD TGNVYLKYERAFTSPSPSQLSDKVRTSSGAFDYVTNNLKSEKTNQFEVGWNDYLLGSLLS ADIYYSETKDEIATIFDGGRAHPTNGFKTTNLGKTRRYGFDLSAEQKLENFTFKESYSFV KTKILKDNDKNIEGKEIAEVPNHKLLLSVDYNISSKFTVGAEYEYKAAAFVDNANKYGKD KAKSVFNLRANYQVNDSLDIYAGVDNVFGTKYYNSVTLSSGDRLYDPAPRTTYYTGFKYK F >gi|224461353|gb|ACDC01000049.1| GENE 10 11391 - 12356 802 321 aa, chain + ## HITS:1 COG:FN0769 KEGG:ns NR:ns ## COG: FN0769 COG0609 # Protein_GI_number: 19704104 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 1 321 1 321 322 463 89.0 1e-130 MKKIFFLISLLISFILITLSLSIGSVIIPIKNLLFLSPMDDYMKMIVFELRLPRIIMAFL VGMLLASSGNIVQIIFQNPLADPYIIGIASSATFGAVIAYLLKLPEFYYGVIAFISCMLS TLLIFKISKRGNKIEVNTLLIVGITLSSFLGGFTSFSIYMIGEDSFKITMWLMGYLGNAS WRQILFLIFPLVFSSIYFYSKRNELDILMLGDEQAHSLGINISKLKFHLLIVSSFVVAYS VAFTGMIGFVGLIVPHIMRSIIGPLNSRLIPFVLIYGGVFLLFCDTVGRIILSPVEVPIG VITSILGAPFFLYLALKARRK >gi|224461353|gb|ACDC01000049.1| GENE 11 12356 - 13132 198 258 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 4 243 12 244 318 80 26 6e-15 MSIVSIENLNYFYGKKQILKELKLDIDENKLTGIIGPNGCGKSTLAKNIIKYLNGDFKKL EIMDIDIKKLSHKKIAQLISYIPQHSTIISNISVFDYILLGRFPLLKNSWNNYCEKDFEI VNYNINLLNIEFLKDRNIETLSGGELQKVLLARALVQETKILLLDEPTSALDLNNAVEFM KILKCISTKKNISVIIIIHDLNLASLFCDNLIVLKDGKFIKKGSPYEVINEQNIKDVYNL DCKVLYNEDNKPYIIPKT >gi|224461353|gb|ACDC01000049.1| GENE 12 13146 - 14381 983 411 aa, chain + ## HITS:1 COG:FN0771 KEGG:ns NR:ns ## COG: FN0771 COG0635 # Protein_GI_number: 19704106 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 411 1 411 
411 639 80.0 0 MFKIRYKSHHDVGNVISKFTKNFKASKEDFISLLEGVNVSKQLALYFHTPYCDKICSFCN MNRKQLDNDLEDYTKYICDEIKKYGAYKFCKTSEVDVVFFGGGTPTIFKKEQLERILKTL RENFIFSKDYEMTFETTLHNLSFEKLEIMEKYGVNRISVGIQTFSNRGRKLLNRTYDKNY VVERLKEIKKRFSGLICIDIIYNYAGQTDEEILEDARLLSEIKVDSTSFYSLMIHDGSDI SKEREKDKSIYTYSLKRDEELHNLFYEACIENSYELLELTKLSNGKDKYKYIRNNNSLKN LLPIGVGAGGHIQDIEIYNMNQQVSFYSRTSELNYKLSMISGLMQFEKFSLLEIQKYCDE KIYREIFKRLKEFEDKGYIKIENNFAIYQLKGIFWGNSLVADIIELIGRNL >gi|224461353|gb|ACDC01000049.1| GENE 13 14378 - 14896 759 172 aa, chain + ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 168 1 168 169 291 85.0 5e-79 MKTLIVYSTISGNTKAVCERIYNALNVEKEIINVKDSKNIKPSDYENIIIGFWCDKGTMD KDSIDFLKTLNNKNLYFLGTLGARPDSEHWNDVFENAKKLCSENNIFKDGLLIWGRISKE MMDVMKKFPAGHPHAVTPERLARWEAASTHPDENDFKKAEEFFSDLLNKILY >gi|224461353|gb|ACDC01000049.1| GENE 14 14960 - 15781 1170 273 aa, chain - ## HITS:1 COG:FN1263 KEGG:ns NR:ns ## COG: FN1263 COG4822 # Protein_GI_number: 19704598 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Fusobacterium nucleatum # 1 273 11 283 283 473 89.0 1e-133 MSKKALFMVHFGTTHNDTRELTIDKMNKKFADEFKDYDLFTAYTSRIVLKRLKDRGENYS TPLRVLNALADQGYEELLIQTSHVIPGIEYENLVKEVNSFSNKFKTVKIGKPLLYYIDDY KKCVEALADEYVPKNKKEALVLVCHGTDSPLATSYAMIEYVFSDCGYDNVFVVCTTAYPL MDSLIKKLKKAGIEEITLAPFMFVAGEHAKKDMAVTYKEELEENGFKVNQVILKGLGEFD AIQNIFLSHLKLAIEKDDEDIADFKKEYTNKYL >gi|224461353|gb|ACDC01000049.1| GENE 15 16015 - 16326 506 103 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237740607|ref|ZP_04571088.1| LSU ribosomal protein L21P [Fusobacterium sp. 
2_1_31] # 1 103 1 103 103 199 100 1e-50 MYAVIKTGGKQYKVTEGDVLRVEKLNAEVNATVELTEVLLVAGGDNVKVGKPLVEGAKVV VEVLSQGKAAKVINFKYKPKKASHRKKGHRQLFTEVKVTSIIV >gi|224461353|gb|ACDC01000049.1| GENE 16 16330 - 16659 477 109 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|197736146|ref|YP_002164924.1| possible ribosomal protein [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 109 1 109 109 188 86 3e-47 MTKVEIFRKNGNIIGYKASGHSGYSEQGSDIICSAISTSLQMTLIGIQEVLKLKVDFKIN DGFLDVDLKNISLDKLTQTNILTEAMAIFLKELTKQYPKYIRLVEKEDK >gi|224461353|gb|ACDC01000049.1| GENE 17 16660 - 16944 489 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237740609|ref|ZP_04571090.1| LSU ribosomal protein L27P [Fusobacterium sp. 2_1_31] # 1 94 1 94 94 192 100 1e-48 MQFLLNIQLFAHKKGQGSVKNGRDSNPKYLGVKKYDGEVVKAGNIIVRQRGTKFHAGNNM GIGKDHTLFALIDGYVKFERLGKNKKQVSVYSEK >gi|224461353|gb|ACDC01000049.1| GENE 18 17304 - 18887 2347 527 aa, chain + ## HITS:1 COG:FN1120 KEGG:ns NR:ns ## COG: FN1120 COG1866 # Protein_GI_number: 19704455 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Fusobacterium nucleatum # 1 526 1 526 527 994 89.0 0 MKMYGLEKLGIDNVLTVHYNLSPAELTEKALANGEGKLNNTGALVIETGKYTGRAPDDKF FVDTPSVHEHIDWSRNKPIESEKFDAILGKLIAYLQKKEIYLFDGRAGANSEYTRRFRFI NEMPSQNLFIHQLLIRTDEEYNENNKIDFTVISAPNFHCVPEIDGVNSEAAIIINFEKKM AIICGTRYSGEMKKSVFSIMNYIMPHENILPMHCSANMDPVTHETAIFFGLSGTGKTTLS ADPNRKLIGDDEHGWCDTGVFNFEGGCYAKCINLKEESEPEIYHAIKFGSVVENVTMDEK TRKINYEDASITPNTRVGYPIHYIPNAELAGVGGIPKVVIFLTADSFGVLPPISRLSQEA AMYHFVTGFTAKLAGTELGVKEPVPTFSTCFGEPFMPMDPSVYAKMLGERLEKHNTKVYL INTGWSGGAYGTGKRINLKYTRAMVTAVLSGYFDNAEYKHDEIFNLDIPQSCPNVPSEIM NPIDTWEDKEQYIIAAKKLANLFYKNFKEKYPNMPENITNAGPRYNG Prediction of potential genes in microbial genomes Time: Thu May 19 23:07:26 2011 Seq name: gi|224461352|gb|ACDC01000050.1| Fusobacterium sp. 
2_1_31 cont1.50, whole genome shotgun sequence Length of sequence - 51065 bp Number of predicted genes - 54, with homology - 52 Number of transcription units - 24, operones - 15 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 1 - 1513 1950 ## COG4868 Uncharacterized protein conserved in bacteria - Prom 1540 - 1599 11.0 2 1 Op 2 . - CDS 1609 - 4083 2919 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 4161 - 4220 13.0 - Term 4190 - 4255 8.4 3 2 Op 1 1/0.000 - CDS 4272 - 5306 1516 ## COG1363 Cellulase M and related proteins - Term 5327 - 5365 6.6 4 2 Op 2 . - CDS 5375 - 6886 2207 ## COG0747 ABC-type dipeptide transport system, periplasmic component 5 2 Op 3 . - CDS 6915 - 7394 627 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis 6 2 Op 4 1/0.000 - CDS 7407 - 8630 849 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Term 8651 - 8681 1.3 7 2 Op 5 1/0.000 - CDS 8694 - 9077 699 ## COG5496 Predicted thioesterase 8 2 Op 6 . - CDS 9141 - 9764 862 ## COG1564 Thiamine pyrophosphokinase 9 2 Op 7 . - CDS 9764 - 10624 1194 ## FN0891 DNAse I homologous protein DHP2 precursor (EC:3.1.21.-) - Prom 10864 - 10923 12.7 + Prom 10702 - 10761 12.7 10 3 Op 1 1/0.000 + CDS 10810 - 11541 964 ## COG0560 Phosphoserine phosphatase 11 3 Op 2 . + CDS 11569 - 11991 443 ## COG1959 Predicted transcriptional regulator + Term 11995 - 12044 8.7 - Term 11986 - 12028 7.6 12 4 Tu 1 . - CDS 12033 - 12749 790 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 12786 - 12845 11.8 + Prom 12773 - 12832 9.9 13 5 Op 1 . + CDS 12858 - 13103 392 ## COG4443 Uncharacterized protein conserved in bacteria 14 5 Op 2 . + CDS 13122 - 14099 957 ## COG2849 Uncharacterized protein conserved in bacteria + Prom 14151 - 14210 5.5 15 6 Op 1 . + CDS 14230 - 14868 540 ## gi|237740625|ref|ZP_04571106.1| predicted protein 16 6 Op 2 1/0.000 + CDS 14943 - 15806 1184 ## COG0130 Pseudouridine synthase 17 6 Op 3 . 
+ CDS 15830 - 17650 3081 ## COG1217 Predicted membrane GTPase involved in stress response + Term 17664 - 17705 -1.0 18 6 Op 4 . + CDS 17729 - 19300 1952 ## FN0616 hypothetical protein + Term 19310 - 19357 -0.9 - Term 19296 - 19344 3.1 19 7 Tu 1 . - CDS 19355 - 19861 711 ## FN0688 hypothetical protein - Prom 19911 - 19970 14.0 - Term 19932 - 19970 2.1 20 8 Tu 1 . - CDS 19981 - 21495 2297 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 21645 - 21704 14.8 + Prom 21797 - 21856 9.8 21 9 Tu 1 . + CDS 21885 - 22088 376 ## + Term 22114 - 22146 -0.9 - Term 22095 - 22139 8.9 22 10 Op 1 . - CDS 22146 - 22991 1017 ## FN0331 hypothetical protein - Prom 23021 - 23080 8.0 23 10 Op 2 . - CDS 23175 - 24227 1296 ## COG3177 Uncharacterized conserved protein - Prom 24257 - 24316 7.1 - Term 24263 - 24299 -0.2 24 11 Op 1 . - CDS 24322 - 25074 1300 ## FN0728 hypothetical protein 25 11 Op 2 . - CDS 25087 - 25809 897 ## COG3177 Uncharacterized conserved protein - Prom 25832 - 25891 14.4 + Prom 25806 - 25865 16.4 26 12 Op 1 . + CDS 25938 - 26624 1001 ## COG0588 Phosphoglycerate mutase 1 27 12 Op 2 . + CDS 26641 - 27174 852 ## FN0731 hypothetical protein 28 12 Op 3 . + CDS 27192 - 28385 1333 ## COG1323 Predicted nucleotidyltransferase - Term 28299 - 28341 -0.1 29 13 Tu 1 . - CDS 28393 - 29208 1215 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 29273 - 29332 11.2 + Prom 29258 - 29317 12.1 30 14 Op 1 1/0.000 + CDS 29419 - 30651 1898 ## COG2195 Di- and tripeptidases 31 14 Op 2 . + CDS 30658 - 32427 1716 ## COG1032 Fe-S oxidoreductase - Term 32355 - 32394 5.0 32 15 Op 1 . - CDS 32402 - 32611 382 ## gi|237740642|ref|ZP_04571123.1| conserved hypothetical protein - Prom 32647 - 32706 6.8 33 15 Op 2 . - CDS 32708 - 34051 1394 ## COG2211 Na+/melibiose symporter and related transporters + Prom 34022 - 34081 7.4 34 16 Op 1 . + CDS 34215 - 34958 919 ## FN1144 hypothetical protein 35 16 Op 2 . 
+ CDS 34982 - 35728 948 ## FN1144 hypothetical protein 36 16 Op 3 . + CDS 35752 - 36519 1029 ## FN1144 hypothetical protein + Term 36531 - 36572 5.2 - Term 36518 - 36560 9.2 37 17 Op 1 . - CDS 36569 - 37657 723 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 38 17 Op 2 . - CDS 37675 - 38166 358 ## FN1073 hypothetical protein 39 17 Op 3 1/0.000 - CDS 38168 - 39268 1438 ## COG1161 Predicted GTPases 40 17 Op 4 5/0.000 - CDS 39281 - 40123 815 ## COG4974 Site-specific recombinase XerD 41 17 Op 5 6/0.000 - CDS 40129 - 41433 1920 ## COG1206 NAD(FAD)-utilizing enzyme possibly involved in translation - Prom 41616 - 41675 11.8 42 18 Op 1 13/0.000 - CDS 41683 - 43953 2740 ## COG0550 Topoisomerase IA 43 18 Op 2 5/0.000 - CDS 44007 - 44858 877 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 44 18 Op 3 1/0.000 - CDS 44883 - 45683 1205 ## COG0457 FOG: TPR repeat 45 18 Op 4 . - CDS 45664 - 46878 1293 ## COG1570 Exonuclease VII, large subunit - Prom 46949 - 47008 14.8 + Prom 46921 - 46980 11.8 46 19 Tu 1 . + CDS 47097 - 47267 257 ## gi|237740656|ref|ZP_04571137.1| predicted protein + Prom 47365 - 47424 4.1 47 20 Op 1 . + CDS 47472 - 47546 59 ## 48 20 Op 2 . + CDS 47566 - 47814 347 ## gi|237740657|ref|ZP_04571138.1| predicted protein 49 20 Op 3 . + CDS 47847 - 48071 354 ## gi|237740658|ref|ZP_04571139.1| predicted protein + Prom 48426 - 48485 9.8 50 21 Tu 1 . + CDS 48575 - 48889 458 ## gi|237740659|ref|ZP_04571140.1| predicted protein 51 22 Op 1 . - CDS 48933 - 49142 322 ## gi|237740660|ref|ZP_04571141.1| predicted protein - Prom 49176 - 49235 3.6 52 22 Op 2 . - CDS 49237 - 49509 228 ## gi|237740661|ref|ZP_04571142.1| predicted protein - Prom 49560 - 49619 6.8 53 23 Tu 1 . - CDS 49697 - 50176 396 ## gi|237740662|ref|ZP_04571143.1| conserved hypothetical protein - Prom 50345 - 50404 12.3 - Term 50181 - 50223 6.2 54 24 Tu 1 . 
- CDS 50407 - 50673 418 ## EUBREC_2584 hypothetical protein - Prom 50699 - 50758 10.2 Predicted protein(s) >gi|224461352|gb|ACDC01000050.1| GENE 1 1 - 1513 1950 504 aa, chain - ## HITS:1 COG:FN1121 KEGG:ns NR:ns ## COG: FN1121 COG4868 # Protein_GI_number: 19704456 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 504 1 506 506 949 95.0 0 MKIGFDHGKYLEEQSKYILERVNKHDKLYIEFGGKLLGDLHAKRVLPGFDENAKIKVLNK LKDQIEVIICVYAGDIERNKIRGDFGITYDMDVFRLIDDLRENELKVNSVVITRYEDRPS TDLFITRLERRGIKVYKHYATKGYPSDVDTIVSDEGYGKNAYIETTKPIVVVTAPGPGSG KLATCLSQLYHEYKRGKNVGYSKFETFPVWNVPLKHPLNIAYEAATVDLNDVNMIDPFHL EEYGEIAVNYNRDIEAFPLLKRIIEKITGKKSIYQSPTDMGVNRVGFGITDDEVVREASQ QEIIRRYFKTGCDYKKGNTDLETFKRAEFIMHSLGLKEEDRKVVTFARKKLELLNNEEKS DKQKTLSAIAFEMPDGQIITGKKSSLMDAPSAAILNSLKYLSNFDDELLLISPTILEPII KLKEKTLKNKHIPLDCEEILIALSITAATNPMAELALSKLSQLAGVQAHSTHILGRNDEQ SLRKLGIDVTSDQVFPTENLYYNQ >gi|224461352|gb|ACDC01000050.1| GENE 2 1609 - 4083 2919 824 aa, chain - ## HITS:1 COG:FN1122_1 KEGG:ns NR:ns ## COG: FN1122_1 COG1022 # Protein_GI_number: 19704457 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Fusobacterium nucleatum # 1 600 1 600 600 946 86.0 0 MQIVTDKNKVALYFKDNAVSYKEFILNTKKIKQYANIKEFTNNMIYMENRPELLYSFFSI WDSRATCVCIDASSTAEELSYYIDNSEVEKIFTSRGQLEKVEEALTILNKKIELVIVDDV EFNKIKIDENIESNLVINSPEKEDTALILYTSGTTGKPKGVMLTFDNILANVDSLDVYKM YEETDVTIALLPLHHILPLLGTGVMPLLYSATIVFLDDMSSVALIDAMKKYKVTMLIGVP KLWEVMHKKIMDTINSKGITRFIFKLAKKINSLSFSKKIFKKVSEGFGGHIKFFVSGGSK LNPQITEDFLTLGIKICEGYGMTETSPIIAYTPKDDIMPNSAGRVIKDVEVKIADDNEIL VKGRNVMKGYYKNPEATAEIIDKDGWLHTGDLGTLKDGYLYVTGRKKEMIVLSNGKNINP IDIEAKLMSMTNLIAEVVVTEYNSILTAVIHPDFNKVKEEKVDNIYEVLKWSVVDKYNQK SSDYKKILDVKIVNEDFPKTKIGKIKRFMIADMLEGKIEKKERKPEPDFEEYNKIKKYLV TAKEKEVYFDSHIEIDLGMDSLDMVEFQHFLDLNFGVKEENLISKHPSLLELANYVKENR NQEKIGNLNWKEIINKDTDAKLPSSSFLAIILKFISCILFNTFFRVKVKGKEKIEMDKPT 
IYVANHQSFLDGFLFNYAVPSKLVKKTYFLATVAHFKSSMMKSFANSANVVLVDINKDIA EVMQILAKVLKENKNVAIYPEGLRTRDGKMNKFKKSFAILAKELNVDVQPYVISGAYELF PTGKKFPKPGKISVEFLDKIKVEDLNYDEIVSKSYKAIEEKLAK >gi|224461352|gb|ACDC01000050.1| GENE 3 4272 - 5306 1516 344 aa, chain - ## HITS:1 COG:FN0999 KEGG:ns NR:ns ## COG: FN0999 COG1363 # Protein_GI_number: 19704334 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Fusobacterium nucleatum # 1 344 4 347 347 608 87.0 1e-174 MNIDLKYTLKKTVELLAIPSPVGYTHNAIEWVRKELESLGVKKYNITKKGALIAYVKGKD SNYKKMISAHVDTLGAVVKKVKKNGRLEITNVGGFAWGSVEGEHVTIHTLSEKTYTGTIL PIKASVHVYGDVAREMPRTEETMEIRIDEDVKTDQDVFKLGILQGDFVSLDPRTRVLENG YIKSRYLDDKLCVAQILAYLKYLKDNKLKPRTDLYIYFSNFEEIGHGVSVFPEDLDEFIA VDIGLVAGEDAHGDEKKVNIIAKDSRSPYDYTLRKKLQEAADKNKIQYTIGVHNRYGSDA TTAILQGFDFKYACIGPNVDATHHYERCHNDGIVETIKLLIAYL >gi|224461352|gb|ACDC01000050.1| GENE 4 5375 - 6886 2207 503 aa, chain - ## HITS:1 COG:FN0998 KEGG:ns NR:ns ## COG: FN0998 COG0747 # Protein_GI_number: 19704333 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 503 1 500 500 813 80.0 0 MKKIVYVLMLVALFLVGCGEKQEEKVAATEDKIVTVAQGAKPKTLDPHMYNAIPELLVSR QFYNTLFSREKDGTIVPELAESYEYKNDKELDIVLKKGVKFHDGSELTADDVIFSIERMK EKPGAGVMVEEIDKVEKVNDYEIKILLKNPSSPLLFNLAHPLTSILSKKYVEAGNDISIA PMGTGAYKLVAYNDGEKIEMEAFKDYFEGTPKIQKLIIKSIPEDTSRLAALETGEIDIAL GLSPISTQTVEANDKLTLISEPTTATEYICLNVEKAPFNNKDFRVALNYAIDKQSIVDSI FSGRAKVAKSIVNPNVFGYYDGLEGYPFDIEKAKELVAKSGVKDTSFSLYVNDNPVRLQV AQIIQANLKEIGIDMKIETLEWGTYLQKTGEGDYQAFLGGWISGTSDADIVLYSLLDSKL IGLAGNRARYSNPEFDKEVETARVVLTPEERKEHFKNAQIIAQNDSPLVVLYNKNENIGI NKRITGFNYDATTMHKFKDLDVK >gi|224461352|gb|ACDC01000050.1| GENE 5 6915 - 7394 627 159 aa, chain - ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: 
Clostridium acetobutylicum # 1 159 1 158 158 201 58.0 5e-52 MERIASFQVDHKKLNRGIYVSRLDEINGNYLTTFDIRMKLPNREPVINIAELHTIEHLGA TFLRNHPTRKDDIIYFGPMGCRTGLYLILKGKLESKEVVELIKELFEFISKFEGDIPGAS AIECGNYLDQNLPMARYEAEKFLKETLNNLKEENLVYPE >gi|224461352|gb|ACDC01000050.1| GENE 6 7407 - 8630 849 407 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 2 406 9 421 447 331 43 4e-90 MEKLSLQTKLVLGIQHVLAMFGATVLVPFLTGLNPSIALICAGVGTLIFHSVTKGIVPVF LGSSFAFIGATALVFKEQGIAILKGGIISAGLVYVIMSFIVLKFGVERIKSFFPPVVVGP IIMVIGLRLSPVALSMAGYSNNTFDKDSLIIALVVVVTMISISILKKSFFRLVPILISVV IGYVVAYFMGDVDLSKVHEASWLGLPTGAFETITTLPKFTFTGVIALAPIALVVFIEHIG DITTNGAVVGKDFFKDPGVHRTLLGDGLATMAAGLLGGPANTTYGENTGVLAVTKVYDPA ILRIAACFAIVLGLIGKFGVILQTIPQPVMGGVSIILFGMIAAVGVRTIVEAQLDFTHSR NLIIAALIFVLGIAIGDITIWGTISVSGLALAALVGIVLNKILPEDK >gi|224461352|gb|ACDC01000050.1| GENE 7 8694 - 9077 699 127 aa, chain - ## HITS:1 COG:FN0889 KEGG:ns NR:ns ## COG: FN0889 COG5496 # Protein_GI_number: 19704224 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 1 127 1 127 127 181 81.0 2e-46 MLEVGMKYEIDRVVTENDTAAKAASGSVEVLATPVMIAWMEEASLRLAQKELEEGLTTVG TEVNIKHLKGTLVGKTVKVLSTLKEIDRKRLVFDVEVIEDGVAVGTGSHTRFIIDTTKFY EKLKNTK >gi|224461352|gb|ACDC01000050.1| GENE 8 9141 - 9764 862 207 aa, chain - ## HITS:1 COG:FN0890 KEGG:ns NR:ns ## COG: FN0890 COG1564 # Protein_GI_number: 19704225 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Fusobacterium nucleatum # 1 207 1 207 209 256 70.0 1e-68 MKIAYLFFNGQLRGSKKFYSNLIEKQEGDIYCADGGANIAYQLNLIPKEIYGDLDSIKDE VKDFYAKKNVKFIKFNVEKDYTDSELVLNEIEKKYDKIYAIAALGGSIDHELTNINLLNR YSNLIFVSQKEKMFKIEKSYNFSNMKNKKISFIIFSDKVKDLTLKGFKYDVENLDLTKGE TRCVSNIIEKTEARVTLKSGALLCIVK >gi|224461352|gb|ACDC01000050.1| GENE 9 9764 - 10624 1194 286 aa, chain - ## HITS:1 COG:no KEGG:FN0891 NR:ns ## KEGG: FN0891 # Name: not_defined # Def: DNAse I homologous 
protein DHP2 precursor (EC:3.1.21.-) # Organism: F.nucleatum # Pathway: not_defined # 9 286 2 279 279 446 83.0 1e-124 MQGSINLKKKLSLFIVSVLMIFTMFSTISSADEAYIASFNILRLGAAEKDMVQTAKLLQG FDLVGLVEVINKKGIEELVDELNRQSPNTWEYHISPFGVGSSKYKEYFGYVYKKDKVKFI KSEGFYKDGKSSLLREPYGATFKIGNFDFTLVLVHTIYGNNESQRKAENFKMVDVYDYFQ DKDKKENDILIAGDFNLYALDESFRPMYKHRDKITYAIDPAIKTTIGTKGRANSYDNFFF SQKYTTEFTGSSGALDFSEKDPQLMRQIISDHIPVFIVVETSKDDD >gi|224461352|gb|ACDC01000050.1| GENE 10 10810 - 11541 964 243 aa, chain + ## HITS:1 COG:FN0892 KEGG:ns NR:ns ## COG: FN0892 COG0560 # Protein_GI_number: 19704227 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Fusobacterium nucleatum # 1 243 5 247 247 427 90.0 1e-120 MIAAFFDIDGTIYRNALLIEHFKKMIKYELFKDVQYRLKVEEAYQLWDTRKGDYDDYLLD LAQLYVVAIKGLPLKYNDFISDQVLLLKGNRVYTYTREMIEWHKKEGHKVFFISGSPSFL VSRMAKKMGVDDFCGSVYEIDEETQTFSGKITKPMWDSVHKQEAIEDFIKKYDIDLSKSY AYGDTNGDYSMLSSVANPRAINPSKELIQKIKNDEDLKSKIQIIIERKNVIYKLDSNVEL IEF >gi|224461352|gb|ACDC01000050.1| GENE 11 11569 - 11991 443 140 aa, chain + ## HITS:1 COG:FN0893 KEGG:ns NR:ns ## COG: FN0893 COG1959 # Protein_GI_number: 19704228 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 140 1 140 140 223 92.0 8e-59 MKIKNEVRYALQIVYYLTLHRDKDIISSNEISAEENIPRLFCLRIIKKLEKAGVVKIFRG AKGGYVLTRDPKRLTFRDIIEIIDDDIVLQPCIDSSTICSTRGANCSIRLALKKIQDELL DDFDKINFHDLVEKNTGLYV >gi|224461352|gb|ACDC01000050.1| GENE 12 12033 - 12749 790 238 aa, chain - ## HITS:1 COG:FN1185 KEGG:ns NR:ns ## COG: FN1185 COG0846 # Protein_GI_number: 19704520 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Fusobacterium nucleatum # 2 238 6 242 252 387 78.0 1e-107 MENKIEKLADIIKNSKHLVFFTGAGVSTESGLKSFRGKDGLYSSLYKGKYRPEEVLSSDF FCSHRKIFIEYVEEELNINGIKPNKGHLALAELEKMGILKAVITQNIDDLHQMAGNKNVL ELHGSLKRWYCLSCGKTSNKNFSCDCGGIVRPDVTLYGENLNQDVVNEAIYQIEQADTLI VAGTSLTVYPAAYYLRYFRGKNLVIINNESTQYDGEASLVLKTNFADTMEKVLNIIKK 
>gi|224461352|gb|ACDC01000050.1| GENE 13 12858 - 13103 392 81 aa, chain + ## HITS:1 COG:CAC0545 KEGG:ns NR:ns ## COG: CAC0545 COG4443 # Protein_GI_number: 15893835 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 6 72 5 71 74 60 49.0 5e-10 MERKELKFEVLNDLGTISESTKGWSKKLTCIIWNEDEPKYDIRAWDSEFKKMGKEITLTE KELRSLKYLIDKELEFLDNEN >gi|224461352|gb|ACDC01000050.1| GENE 14 13122 - 14099 957 325 aa, chain + ## HITS:1 COG:FN0637 KEGG:ns NR:ns ## COG: FN0637 COG2849 # Protein_GI_number: 19703972 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 152 325 1 172 172 161 58.0 2e-39 MRRKNFILTVLIFLFINILSMAVESTNFIMPNTNMTGSSTNFQEALKDYKPNLENIDKIF NYIEKNIKEKGRAVFYSKLEKGKNEIIVTDENNNIIYTEKISEKLINVAPYFEAKEMYQL KEGKTFSYIDYSTEMLGKNVSIKSENLLKKKMNKKDAIEILNKLRDPNSFTKNSISNIEY AKSECYDEEGNLLFTMQIKDSKVITETQKTINENIIKMIYIVNDIDTDSGLMETYINGKL SAIMRMKNSLPNGEAKIFYPSGKLLAIFTLENGKTNGIVKVYYENGKIQAIHNFKDNVLN GEAIEYDENGNVVKKVLYKNGKIVR >gi|224461352|gb|ACDC01000050.1| GENE 15 14230 - 14868 540 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740625|ref|ZP_04571106.1| ## NR: gi|237740625|ref|ZP_04571106.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 212 1 212 212 350 100.0 4e-95 MRNLLIPFLPKYGGFKLAKIIKFFLTLIFTIFLILTLLFIIVMTTYVKEKYYVIQLSVIR ITKIRNNYSLDFNSYGDTKKTLNLEITSKGKLPENIVIKNINFYNRKVSIYNQAIKAFEI NKATAEKKEFKINQSIAKLKHNDGYYKDEIIYLYPLDKEIIDTFIYFEENFNNFYVEIII EDTETGEEYSDFTDYIYMMPVRKGFHFGPIVK >gi|224461352|gb|ACDC01000050.1| GENE 16 14943 - 15806 1184 287 aa, chain + ## HITS:1 COG:FN0635 KEGG:ns NR:ns ## COG: FN0635 COG0130 # Protein_GI_number: 19703970 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Fusobacterium nucleatum # 1 287 1 287 287 443 85.0 1e-124 MEGIILVNKPKGISSFDVIRKLKKILKTKKIGHTGTLDPLATGLMLICVGKATKLASDLE AKNKVYLADFEIGYATDTYDIEGKRIAENLIDISKDNLELSLKKFIGDIKQVPPMYSAIK IDGNKLYHLARKGIEIERPERDVTIEYIKLLDFKDNKAKIETKVSKGCYIRSLIYDIGLD LGTYATMTELQRINVGEYSLTNSYTLEQMEEMAQNNDFSFLNSVEEVFSYEKYNLETEKE FTLFKNGNTVKIKDNLENKKYRVYYQDEFLGLANIENNNLLKGYKYY >gi|224461352|gb|ACDC01000050.1| GENE 17 15830 - 17650 3081 606 aa, chain + ## HITS:1 COG:FN0634 KEGG:ns NR:ns ## COG: FN0634 COG1217 # Protein_GI_number: 19703969 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Fusobacterium nucleatum # 1 604 1 604 605 1169 97.0 0 MKIKNIAIIAHVDHGKTTLVDCLLRQGGVFKTHELEKVEERVMDSDDIERERGITIFSKN ASARYKDYKINIVDTPGHADFGGEVQRIMKMVDSVLLLVDAFEGPMPQTKYVLKKALEQG HRPIVVVNKVDKPNARPEDVLYMVYDLFIELNANEYQLEFPVVYASGKAGFARKELTDEN TDMQPLFETILEHVQDPDGDATKPTQFLITNIAYDNYVGKLAVGRIHNGTLKRNQDVMLI KRDGKQVKGKVSVLYGYEGLKRVEIEEAEAGDIVCVAGIDDIDIGETLADINDPVALPLI DIDEPTLAMTFMVNDSPFVGKEGKFVTSRHIWDRLQKEIQTNVSMRVEATDSPDSFIVKG RGELQLSILLENMRREGFEVQVSKPRVLFKEKDGKKLEPIELALIDVDDSFTGTVIEKMG VRKAEMVSMVPGQDGYTRLEFKVPARGLIGFRNEFLTDTKGTGILNHSFFDYEEYKGDIP TRNKGVLIATEPGVTVPYALNNLQDRGTLFLDPGIPVYEGMIVGEHNRENDLVVNVCKTK KLTNMRAAGSDDAVKLATPRKFTLEQALDYIAEDELVEVTPTNVRLRKKILKEGDRRKNW SALNNK >gi|224461352|gb|ACDC01000050.1| GENE 18 17729 - 19300 1952 523 aa, chain + ## HITS:1 COG:no KEGG:FN0616 NR:ns ## KEGG: FN0616 # Name: not_defined # Def: 
hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 31 523 2 495 495 807 84.0 0 MEVDMKKFKFLMLFCLFGSMAFATPKTTKAVKNEYDLKFNPNKYVSKETEVNGKKVKYRA YENIVYVKNPVDKEYQNMNIYIPEEYFNNSSIGNYNSSNAPIFLPNSVGGYMPGKADKVG VGRDGKANSLSYALSKGYVVAAPGARGRTLTDKNGAYTGKAPAAIVDLKAAVRYLYFNDE VMPGDANKIISNGTSAGGALSALLGASGNSQDYLPYLTEIGAADTRDDIYAVSSYCPITN LENADSAYEWMYNGVNTFSRMEFTRNTSAQEYNDRSLSRTTVQGSLTEDEIKISNRLKNI FPAYLNSLKLKDDKGNLLTLDKNGNGTFKSYLSLMIKNSANKALAEGKDISEFKKAFTIE NGKVVAVDLDVYTHIGDRMKSPPAFDSLDASSGENNLFGDKKTDNKNFTKFSFDIANKEA IEYYQKGKFNDKSIKIVIPKMADKTIIKMMNPMNYIDSTPTKYWRIRHGAIDKDTSLAIP AILAIKLKNSGKVVDFAAPWGQGHGGDYDLDELFNWIDTIVNK >gi|224461352|gb|ACDC01000050.1| GENE 19 19355 - 19861 711 168 aa, chain - ## HITS:1 COG:no KEGG:FN0688 NR:ns ## KEGG: FN0688 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 18 164 1 148 153 176 63.0 3e-43 MKKMFRYVLLVFVFLMLVACGKPDSQKAFEKNFKQTITDISKKMKDGNETSEMLAKIFEK GSCKVNNVKEDGKVAELDVTIKAADFVKYMSEYFVALKPLFDSNMGEEAFQKKSLEYFEN LTKKELDYTETDVKVHMEKIDGEWKVINTEDVLTAIFGGLTDAVTDFN >gi|224461352|gb|ACDC01000050.1| GENE 20 19981 - 21495 2297 504 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 503 1 510 511 735 69.0 0 MKKKILFLLMIFSAFFIISCNKEAEKKETKDTIVIAQGADAKSLDPHASNDSPSTKIRMQ IFDPLLKLDADANPQPCLAESWEREDETSIIFHIRKGVKFHNGDEMKASDVKFSIDRALA SPEFHEVLGGITKVEVLDDYTIKLTTEKPMAAILNNLSHDCIVVLSEKYVKENGDKIGQK PMGTGPYKFVSWESGDKVVLEAFPDYWRGEAPVKNVIFRNVVEETNRTIGLETGEIDIIY DVGSMDKNKIKEDGRFNLIEAPQARVEYLGFNVKKKPYDNPKVREAISYALDQKPIIDTV YLGAAEPATSIIGPKILYSVEVEKFTQDLEKAKELLKEAGYPDGFKAKLWTSDNPARRDM AVIIQDQLKQIGIDVTIETLEWGAYLDGTGRGDQEMYLLGWTTVTRDPDYGIAELTSTET QGNAGNRSFYSNPKLDKLLKAGKIEMDPEKRRAIYKEAQEIIRKDIPMYMILYPTQAVAT QKNIKGFKLGTMNSYEIYEVSIEN >gi|224461352|gb|ACDC01000050.1| GENE 21 21885 - 22088 376 67 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MKKISLVILVLAGILVGCTHTEKTATGGALAGAAVGAMLGNDVRGTAVGAAIGGALGAGA GELTKNK >gi|224461352|gb|ACDC01000050.1| GENE 22 22146 - 22991 1017 281 aa, chain - ## HITS:1 COG:no KEGG:FN0331 NR:ns ## KEGG: FN0331 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 279 20 308 329 271 57.0 2e-71 MQEQEIIALINSKGAFILDDSKANAELIAYVDCKTNELLETSAKIEWKISKKLSLKDIKR FKIYHLKVKNLGENTFLLIDIVKKDVKNNLLEKILKECEQNASVTVEEPDLGKFVLDKAT KSLHSKLKWLNEKEEIDVYLNIDEDNRINTLKKVGAFFITLEKVFKEKKDWDKKLKTFAA EHLADLATELRKNSKSLFKFLKVWKWYFIAKMKLVSLAIENDGEIVATFDDRKLFSGHKI IVKANTNNNEISSAIVENFNIDDYKKIEVPESNIETKEDKE >gi|224461352|gb|ACDC01000050.1| GENE 23 23175 - 24227 1296 350 aa, chain - ## HITS:1 COG:NMA1635 KEGG:ns NR:ns ## COG: NMA1635 COG3177 # Protein_GI_number: 15794529 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 10 347 21 358 375 217 37.0 2e-56 MYKLPIESLDLNRIDIFEQLVNATESLGILKGTLNKLPNPNIILNVITLKEAKESSEIEN IITTYDELYKEMILKDKSNPNAKEVLNYRSAINLGNRLVQEKNMITTNMINEIHHLIEPN KGDIRKQKGTVIMNTKTGEILHTPPQTYEEIMEYLKNLEEFINLENKVNPLIKMALIHLQ FEMIHPYYDGNGRTGRVLNLMYLKLSDKLDIPILYLSKYIIENRNEYYNLLNRAGKSEEN ILEFIIYMLKAIEKTSKYTLTLIDNIIEAMENTKKTLKEKLPKIYSKDLLELLFFEFYTK NEYIRNKLDISRQTATAYLKQLEKVGILSSEKIGKEIIYKNISLFKIAEN >gi|224461352|gb|ACDC01000050.1| GENE 24 24322 - 25074 1300 250 aa, chain - ## HITS:1 COG:no KEGG:FN0728 NR:ns ## KEGG: FN0728 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 41 250 1 210 211 379 90.0 1e-104 MEATKEWLEKWEKVKNKLQPNSNLLDYFTLKEIAGKEIDVMDIGPCSIPTGEFLVADPLV YLVSKHEKEYFQKIPTGEFRTEVCVVKATDGDCDRYAAVRLKFNDNEVSYFEEAVTGTEN LEDINEGDFFGFNVDAGLACICDKKLHELYCEFDKKWCDENPDGNAYDDYFADLFKKSYE DNPKYQRDGGDWINWTIPGTDYHLPMFQSGFGDGAYPVYLAYDKDGNVCQLIVELIDIEL AYSDIDEDEE >gi|224461352|gb|ACDC01000050.1| GENE 25 25087 - 25809 897 240 aa, chain - ## HITS:1 COG:VNG6349C KEGG:ns NR:ns ## COG: VNG6349C COG3177 # Protein_GI_number: 16120251 # 
Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Halobacterium sp. NRC-1 # 27 181 107 270 424 75 31.0 7e-14 MNKILEILLEEKETKLKGSLYHLTQIKFSYNSNHIEGSKLTEDETRYIYEINSFIGDKEK VVSIDDINETINHFKCFDYILENINILDEKLIKNLHKILKNNTSDSQREWFKVGDYKLKA NFIGDTKTTSPSNVKKEMKKLLEEYNSKNNITFDDIVDFHYKFETIHPFQDGNGRVGRLI MFKECLKNNIIPFIIDEEHKLFYYRGLKNYKEDKAYLIETCLSAQDRYIKLLDELEINVK >gi|224461352|gb|ACDC01000050.1| GENE 26 25938 - 26624 1001 228 aa, chain + ## HITS:1 COG:FN0729 KEGG:ns NR:ns ## COG: FN0729 COG0588 # Protein_GI_number: 19704064 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Fusobacterium nucleatum # 1 228 1 228 228 431 96.0 1e-121 MKLVLIRHGESAWNLENRFTGWKDVDLSPKGIEEAKAAGKILKEMNLVFDVAYTSYLKRA IKTLNIVLEEMDELYIPVYKSWRLNERHYGALQGLNKAETAKKYGDEQVHIWRRSFDIAP PSIDKDSEYYPKSDRRYADLPDSEIPLGESLKDTIARVLPYWHSDISKSLQEGKNVIVAA HGNSLRALIKYLLNISNEDILNLNLVTGKPMVFEIDKDLKVISAPELF >gi|224461352|gb|ACDC01000050.1| GENE 27 26641 - 27174 852 177 aa, chain + ## HITS:1 COG:no KEGG:FN0731 NR:ns ## KEGG: FN0731 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 177 1 177 177 223 77.0 3e-57 MRKLIFISCFALSALSFAAEKNLPENVEKKIRYAVSTTSGADRRETYDWYKDSYLEMVER LDKAGIPQTDKEIIIKRLEAMYGANYPKQLARVNDEINDYKGLVNRIREEQNANQQKAEA QNKKSKEEIASILSSSSIPKAELNRIEENAKAEYPDDYTLQKAFIKGAIKTYNDLKK >gi|224461352|gb|ACDC01000050.1| GENE 28 27192 - 28385 1333 397 aa, chain + ## HITS:1 COG:FN0732 KEGG:ns NR:ns ## COG: FN0732 COG1323 # Protein_GI_number: 19704067 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Fusobacterium nucleatum # 1 393 1 393 396 594 82.0 1e-169 MFKNVIGLVVEYNPFHNGHLHHIQEIDKLFEDNIKIAVMSGDFVQRGEPSLINKFEKTKI ALSQGIDIVIELPVFYSSQSAEIFAKGSVSLLDKLSCSHMVFGSESNDLDKLKKITSLSL TDEFTKALKEFLDKGFSYPTAFSKAISDEKFGSNDILALEYLKAIETIDSKIEACCIKRE KTGYYDDEKDNFASASYIRKVLLDSNETKENKLNKIKNLVPEFSYKILEENFGVFSCLND 
FYDLIKYNIIKNYSNLRNIQDLEVGLENRLYKYSLENLSFSDFFDKILSKRLTISRLQRI LLHTLLDLTEELTNKVKNKIPYVKILGFSNKGQEYLNYLKKLDDYNERKILTSNRNLKEI LSEEELELFNFNELASQIYRIKSNYNNIGYPIMNSKK >gi|224461352|gb|ACDC01000050.1| GENE 29 28393 - 29208 1215 271 aa, chain - ## HITS:1 COG:YGR231c KEGG:ns NR:ns ## COG: YGR231c COG0330 # Protein_GI_number: 6321670 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Saccharomyces cerevisiae # 28 219 59 253 315 81 25.0 2e-15 MEGKKYFKMVLSGAIGVFILLLILTNCYTVDTGEVVIISTFGKITRVENEGLHFKIPFVQ GKTFMETREKTYIFGRTDEMDTTMEVSTKDMQSIKLEFTVQASITDPEKLYRAFNNKHEQ RFIRPRVKEIIQATIAKYTIEEFVSKRAEISKLIFEDLKDDFSQYGMSVSNVSIVNHDFS DEYERAIESKKVAEQEVEKAKAEQEKLKVEAENRVRLAEYSLQEKELQAKANAVESNSLS PQLLRKMAIEKWDGKLPQVQGNNGSTLINLD >gi|224461352|gb|ACDC01000050.1| GENE 30 29419 - 30651 1898 410 aa, chain + ## HITS:1 COG:FN0733 KEGG:ns NR:ns ## COG: FN0733 COG2195 # Protein_GI_number: 19704068 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 2 410 4 412 412 724 92.0 0 MEKYSTLKERFLRYVKFNTRSDEKSETIPSTPSQMEFAKMLKKELEDLGLSNVFINKACF VNATLPSNIDKKVATVGFIAHMDTADFNAEGINPQIIENYDGSDIVLNKEQNIVLKVEEF PNLKNYISKTLITTDGTTLLGSDDKSGIVEIIEAVKYLKEHPEIKHGDIKMAFGPDEEIG RGADYFDVKEFAADYAYTMDGGPVGELEYESFNAAQATFKIKGVSVHPGTAKGKMINAGL IASEIIQMFPKDEVPEKTEGYEGFYYLVETNTSCESGEVIYILRDHDKAKFLAKKEFVKE LVKKVNEKYGKEVVELELKDEYYNMGEIIKNHMYVVDIAKQAMENLGIKPLIKAIRGGTD GSKISFMGLPTPNIFAGGENFHGKYEFVALESMEKATDVIVEIAKLNAER >gi|224461352|gb|ACDC01000050.1| GENE 31 30658 - 32427 1716 589 aa, chain + ## HITS:1 COG:FN0734 KEGG:ns NR:ns ## COG: FN0734 COG1032 # Protein_GI_number: 19704069 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 1 566 1 566 568 1140 97.0 0 MKFLPTTKEEMKSLGWDSIDVLLISGDTYLDTSYNGSALVGKWLVEHGFKVGIIAQPEVD VPDDITRLGEPNLFFAVSGGCVDSMVANYTATKKRRQQDDFTPGGENNKRPDRAVLVYSN 
MIRRFFKGTTKKIVISGIESSLRRITHYDYWTNKLRKPILFDAKADILSYGMGEMSMLQL ANALKNGEDWQNIRGLCYLSKEPREDYLSLPSHSDCLADKDKFIEAFHTFYLNCDPITAK GLCQKCDDRYLIQNPPSESYSEEIMDKIYSMEFARDVHPYYKKMGAVRALDTIKYSVTTH RGCYGECNFCAIAIHQGRTIMSRSQNSIVEEVKNIAETPKFHGNISDVGGPTANMYGLEC KKKLKLGACPDRRCLYPKKCPHLQVNHNNQVELLKKLKKIPNIKKIFIASGIRYDMILDD NKCGQMYLKEIIKDHISGQMKIAPEHTEDKILGLMGKDGKSCLNEFKNQFYKINNELGKK QFLTYYLIAAHPGCKDKDMMDLKKYASQELRVNPEQVQIFTPTPSTYSTLMYYTEKDPFT NQKLFVEKDNGKKQKQKDIVTEKRNNNNYKKGSKIYYLFFLLIYQTLFV >gi|224461352|gb|ACDC01000050.1| GENE 32 32402 - 32611 382 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740642|ref|ZP_04571123.1| ## NR: gi|237740642|ref|ZP_04571123.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 69 1 69 69 73 100.0 3e-12 MENLEKETLIQKIKDLEVILKEMDLKIETAKKEVKMLENNKENLTDLLDLYTRQLEYGKK DFKQRASDK >gi|224461352|gb|ACDC01000050.1| GENE 33 32708 - 34051 1394 447 aa, chain - ## HITS:1 COG:FN0222 KEGG:ns NR:ns ## COG: FN0222 COG2211 # Protein_GI_number: 19703567 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Fusobacterium nucleatum # 1 443 1 443 448 677 92.0 0 MKKLTTKVQVLYALGVSYAIVDQIFAQWILYFYLPSESSGLKPFMAPVLVSIALAVSRLV DMITDPLVGFLSDKYNSKYGRRIPFVAVGTIPLIIVTIAFFYPPTSSEKASFYYLMLIGS LFFTFYTIVGAPYNALIPEIGRTPEERLNLSTWQSVFRLSYTAIAIILPGILIKMIGGND VLFGIRGMIMFLCVIVFIGLATTVFTVRERDYSTGEVSNVSFKETIGIIIKNKNFILYLF GMMFFFIGFNNLRAIMNYYVEDIMGYGKKEITIASALLFGAAAICFYPTNKLSKKYGYRK IMLYCLAMLIVSTSMLFFLGKIFPVKFGFALFAIIGIPLAGAAFIFPPAMLSEISTQISE DSGARIEGLSFGIQGFFMKTSFLISIVTLPIILVMGNDVSILSAISSGVSKVEKNGIYLA SLSSVFFFIISFIFYYKYSDSKKVDKK >gi|224461352|gb|ACDC01000050.1| GENE 34 34215 - 34958 919 247 aa, chain + ## HITS:1 COG:no KEGG:FN1144 NR:ns ## KEGG: FN1144 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 245 1 243 249 144 40.0 2e-33 MKKSLKKILFTILTVFAVFFVVACGNKEDAKINKEEVLQKSVEANSNIKSGNKLVNAKIE VEGEGNVEFTIDTSIIKEPLAMKAIIEQKNENTQMTTYIKDGMVYATTGNNDWQKEALSE 
NNAENFKDSLDASIEMNEILKDHLDKVTIKEEGGNYIVSIDKDADFLKEFLNKEVSNIVG QNIDFDPKNATVEYVIDKETYFIKSSSFSVETKIQDKKLKIMTEVTLSNINSVEEITVPE EALNSNN >gi|224461352|gb|ACDC01000050.1| GENE 35 34982 - 35728 948 248 aa, chain + ## HITS:1 COG:no KEGG:FN1144 NR:ns ## KEGG: FN1144 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 246 1 243 249 158 39.0 2e-37 MKKSLKKILFTILTVFALFFIVACGNKEDAKINKEEVLKKAAEVANDIKSGNKLVNTIME IKGGVTVEYIIDSSIIIEPFSMKLTLEQKGQDAKVTTFVKDGIMYMSNPVNNTWEKQEAT FEAIEQFKNSLDTSTEIYNTLKDHLDKVDIKEKDGNYVITVPKNSDFIKESLKEQMNSIV GQNPDFNPDNVTWEYVIDKETYFPKVLSLSFEAKLDEQDVKVTTTNTLSNINSVGEITVP EEALNSNN >gi|224461352|gb|ACDC01000050.1| GENE 36 35752 - 36519 1029 255 aa, chain + ## HITS:1 COG:no KEGG:FN1144 NR:ns ## KEGG: FN1144 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 253 1 247 249 221 52.0 2e-56 MKKSLKKILFTVLTVFAVFFVVACGNKEDTKINKEEVLQKSAEAYKNIKSADMLINLKME PKKSGQPMELTMDVSITLEPIAMKMSMDIKDQNVKLNSFIKDDIMYIQNPVDNSWIKQAL PEDVSNQFKNMVNSDDETYEVLKNNLDKVNIKEEGGNYIISVTKDTDFLKEAIKKQNSKV NMAGQNLDLNVDNVTLEYVVDKETYDTKSSVVAFDTNIQGQDVRMTVDSAFSNVNNIKEI TVPEEALNATATPAN >gi|224461352|gb|ACDC01000050.1| GENE 37 36569 - 37657 723 362 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 70 361 26 320 336 283 45 2e-75 MGIFDKLFRRNKNVETEEVEKVEEKKEEIKEEVKVESTQNTENIEKVENEVVEEATKIEE PVKVNISQRLTKSKEGFFSKLKNIFTSKSKIDDSIYEELEDLLIQSDVGLGMTTNLINDL EKKVKANKISETSEVYEILKGLMSEFLLSQDSKVHLKDNRINVILIVGVNGVGKTTTIGK LALKYKKLGKKVLLGAGDTFRAAAVEQLEEWAKRADVDIVKGREGADPASVVYDTLSKAE ATKADVVIIDTAGRLHNKANLMRELEKINNIIKKKIGEQEYESLLVIDGTTGQNGLNQAK EFNSVTDLTGFIVTKLDGTAKGGIVFSVSEELKKPIKFIGLGEKIEDLIEFNAKDFVEAI FN >gi|224461352|gb|ACDC01000050.1| GENE 38 37675 - 38166 358 163 aa, chain - ## HITS:1 COG:no KEGG:FN1073 NR:ns ## KEGG: FN1073 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 163 6 168 168 212 65.0 3e-54 
MKNKKKKKSLFEKISTLSFLTLIPFVIFLIYLLTSLFRETNDEVELPKIMIKDIKNVRIA IDEYYRATGTFPNLELVNTDEKLEQIFFEQDGERIYFKDFLKENSMPSTPSYKDLPETNK VTIVTNFRKPTNDGGWNYNIKTGEIHANLPENFFGQGIDWNSY >gi|224461352|gb|ACDC01000050.1| GENE 39 38168 - 39268 1438 366 aa, chain - ## HITS:1 COG:FN1072 KEGG:ns NR:ns ## COG: FN1072 COG1161 # Protein_GI_number: 19704407 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 366 1 366 366 665 88.0 0 MTKKCVGCGIELQNTDKDLQGYTPKSIDSKEDMYCQRCFQLKHYGKYSTNKMTREDYKKE VGKLLDDVKLVIAVFDIIDFEGSFDVEILDILREKDSIVVVNKLDLIPDEKHPSEVANWV KDRLAEESIAPLDIAIVSTKNGYGVNGIFKKIKHFYPDGVNAMVIGVTNVGKSSVINRLL GKRIATVSKYPGTTIKNTLNMIPFTNIGLYDTPGLIPEGRASDLLCDSCAQKIIPAGEIS RKTFKAKYDRIIMIDNLVKIRVLNDEEVKPIFAIYAAKDVKFHETTIERAKELEEGNFFD IPCECCRDEYNKHKKITKTLTIKTGEELVFKGLGWVSVKRGPLKIEVTLAEEIEISIRKA FIKPRR >gi|224461352|gb|ACDC01000050.1| GENE 40 39281 - 40123 815 280 aa, chain - ## HITS:1 COG:FN1071 KEGG:ns NR:ns ## COG: FN1071 COG4974 # Protein_GI_number: 19704406 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Fusobacterium nucleatum # 2 280 7 285 290 341 74.0 9e-94 MIEKSIKNFIYYLEFEENKKHNTVISIRKDLNQFLTYLNEHDIIDFNKLDELLIKEYFTK LKTEKISASTFNRRLSSIKKFYKYLVDKGLKEKGSEILIESEKNDEKKIEYLTPEEINLV RTTMEGESFNILRDRLMFELLYSSGMTVAELLSLGEVNFNLEKREIYILKNKLSKTMYFS ETCKEFYIKFLNSKKEKFKEDYNPNIIFTNNSNERLTDRSVRRLINKYAEMANLNKEISP YTLRHSFCIYMLKNGMPKEYLARLLDLKVVGLLDVYEGLC >gi|224461352|gb|ACDC01000050.1| GENE 41 40129 - 41433 1920 434 aa, chain - ## HITS:1 COG:FN1070 KEGG:ns NR:ns ## COG: FN1070 COG1206 # Protein_GI_number: 19704405 # Func_class: J Translation, ribosomal structure and biogenesis # Function: NAD(FAD)-utilizing enzyme possibly involved in translation # Organism: Fusobacterium nucleatum # 1 434 1 434 434 735 94.0 0 MEKEVIVVGAGLAGSEAAYQLAKRGIKVKLYEMKAKQKTPAHSKDYYSELVCSNSLGSDS LENASGLMKEELRILGSMLIEVADRNRVPAGQALAVDRDGFSEEITKILKNMENIEIIEE 
EFTEIPEDKIVIIASGPLTSDKLFEKISEITGEESLYFYDAAAPIVTFESINMDIAYFQS RYGKGDGEYINCPMNKEEYYNFYNELIKAERAELKNFEKEKLFDACMPIEKIAMSGEKTM TFGPLKPKGLINPKTDKMDYAVVQLRQDDKEGKLYNIVGFQTNLKFGEQKRVFSMIPGLE NAEFVRYGVMHRNTFINSTKLLDKTLKLKNKDNVYFAGQITGGEGYVTAIATGMYAAINV ANRLNGEKEFVLEDISEIGAIVNYITEEKKKFQPMGANFGIIRSLDENIRDKKEKYRRLS QRAIEYLKKSIKGV >gi|224461352|gb|ACDC01000050.1| GENE 42 41683 - 43953 2740 756 aa, chain - ## HITS:1 COG:FN1069_1 KEGG:ns NR:ns ## COG: FN1069_1 COG0550 # Protein_GI_number: 19704404 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Fusobacterium nucleatum # 1 684 1 684 684 1070 89.0 0 MLELARKLDKNKLVIVESPAKAKTIEKILGSSYKVISSYGHIIDLPKTKIGVDVKDNFKP SYLTIKGKGEVIKKLKEAAKKADEIYLASDPDREGESIAWHIANTLKLDHNEKNRIEFNE ITEKAIKEAVKNPRKINISRVNSQQARRILDRLVGYEISPFLWKLISPNTSAGRVQSVAL KIICELEDKIKSFVPEKYWDVKGIFDDKYNLNLYKIDDKKIDKLKDEKLLERVKKDLKKK YEVVSSKISNKIKNPPLPLKTSTLQQLASSYLGFSASKTMMVAQKLYEGISINGEHKGLI TYMRTDSTRISEEAKEMARKYITKEYGKEYLGSVSPKTKKNDKNVQDAHEGVRPTDINLT PQKIMEFLDKDQFKLYNLIWQRFLISQLAAMKYEQFEYILEKDKIQYRGSINKIIFDGYY KVFKEEEDLPVGDFPEIKEGDKFTLDKLDIKEDYTKPPARLTESSLVKTLESEGIGRPST YASIIDTLKKREYVELQNKSFVPTEIGYEVKTQLDKFFPNIMNIKFTAKLEDELDEVDSG DKDWIDLLKTFYTELQKYEEKCKASVEKELEKLVESDIIGKDGKPLIMKIGRFGRYLTSQ DEDSKENISLKGIEISLEEIKSGKIYVKDKIKELLKKKEGEKTDIILENGARLILKYGRF GAYLESEKFKEDNVRKTIPKDIKTKIENNTIKRENGILCLKEIFEKIEAENAAILKKAGK CEKCGKPFEIKNGRWGKFLACTGYPECKNIKKIEKK >gi|224461352|gb|ACDC01000050.1| GENE 43 44007 - 44858 877 283 aa, chain - ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 1 283 5 288 288 449 84.0 1e-126 MNYDFITINDDIYPECLKEISDPPEKLYYKGNLELLKSERIIAVVGTRNPSSYGKLCCEY MIKKMSKADITIVSGFAKGIDSIAHKTSLITGTKTIAVIASGLDIVYPASNLSLYKEIEE 
KGLILSEYEAGTKPFKGNFPRRNRIIAGLSKGIIVVESKDRGGSLITADLALEYNRDVYA VPGDIFSEYSKGCNYLIRDAKAKSLSNIKELLEDYNWEIKEEAKLKVSKNQQLILDSLSS EKSLDKILEETKIDQTEILSELINLEIMGLIKSIAGGRYKKIL >gi|224461352|gb|ACDC01000050.1| GENE 44 44883 - 45683 1205 266 aa, chain - ## HITS:1 COG:FN1067 KEGG:ns NR:ns ## COG: FN1067 COG0457 # Protein_GI_number: 19704402 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 55 266 26 237 237 307 78.0 1e-83 MKKIMISLFILVSMLGFAEGENEGSAIREVPILGNQEAPVENTRPVSNGGGESQTPDDGG ETVENPETPKEATGVREYRPQSLIQLDEQMKKGTRSSIIQLNARYEQELNAYLQSVSYNS DVIFYLANEYMMLNNYSRANKIFLKDNKDLRNVFGAATTYRFMGQHRNAIDKYSQAISMN SGFAESYLGRGLSYRNLNEYDNAVSDLKTYISKTGAHDGYVALADVYFKMGKNKEAYAIA SQGIAKYGNSGILKVLANNILKNKID >gi|224461352|gb|ACDC01000050.1| GENE 45 45664 - 46878 1293 404 aa, chain - ## HITS:1 COG:FN1066 KEGG:ns NR:ns ## COG: FN1066 COG1570 # Protein_GI_number: 19704401 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Fusobacterium nucleatum # 1 402 1 402 404 622 82.0 1e-178 MEKIYSVSEFNRMVKSYIDDIDDFQDFYIEGEISNITYYKSGHLYFSVKDSKSQIKCAAF NYKMKRIPEDLKEGDAIKLFGDVGFYEVKGEFQVLVRHIEKQNALGALFAKLEKVKEKMA EKGYFDESHKKELPRFPKNIGVVTALTGAALQDIIKTTRKRFNSINIYIYPAKVQGLGAE QEIIKGIETLNKIEEIDLIIAGRGGGSIEDLWAFNEEEVAMAFFNSEKPIISAVGHEIDF LLSDLTADKRAATPTQAIELSIPEKESLIKSLDDKKIYLTKLLKSYLEDMKRELSIRMDN YHLKNFPSTINNYRELMVEKEEILTKSIKDFLEQKRHLFEVKIDKVSVLNPINTLKRGYS VSQVKNKRIDVLEDVEVNDEMTTILKNGRLISIVKEKIYEKNND >gi|224461352|gb|ACDC01000050.1| GENE 46 47097 - 47267 257 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740656|ref|ZP_04571137.1| ## NR: gi|237740656|ref|ZP_04571137.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 56 4 59 59 72 100.0 1e-11 MTIFYDPEYEKVSELVSKYMIYDEEKKEFIIPKDAPKEVHEAYKRKKEIWGKYQEY >gi|224461352|gb|ACDC01000050.1| GENE 47 47472 - 47546 59 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKYQIDGIDFSREEMFKIEKSIF >gi|224461352|gb|ACDC01000050.1| GENE 48 47566 - 47814 347 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740657|ref|ZP_04571138.1| ## NR: gi|237740657|ref|ZP_04571138.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 82 1 82 82 111 100.0 1e-23 MTDLEKAQKSIWKIYKEYCLECKKLETPYEVGLDGFKNYKEKKELTSKMLSDVNNIKKKY NIENLEISAKDLFEFEKKLFEK >gi|224461352|gb|ACDC01000050.1| GENE 49 47847 - 48071 354 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740658|ref|ZP_04571139.1| ## NR: gi|237740658|ref|ZP_04571139.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 74 1 74 74 110 100.0 2e-23 MFWGLSEDLINKNNYDEVNKLLDFIFNDIEIVSTKDGKEINLSKIEKEKSLKRIEELGEI KVVENYKAGKYFEV >gi|224461352|gb|ACDC01000050.1| GENE 50 48575 - 48889 458 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740659|ref|ZP_04571140.1| ## NR: gi|237740659|ref|ZP_04571140.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 104 4 107 107 185 99.0 1e-45 MLKYSKFKKALFGVSGFVFLELEDGMGADVDIENKAIELRPLADLRVYKNVYTGEITKPT KEEIEKAREVLENPDFVMKGPFYDDFYDKDSDIYKSVQRGERLI >gi|224461352|gb|ACDC01000050.1| GENE 51 48933 - 49142 322 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740660|ref|ZP_04571141.1| ## NR: gi|237740660|ref|ZP_04571141.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 69 1 69 69 106 100.0 4e-22 MKFSKLTKNETERLFELMKIPCYKMTDEEYDEYYNLELKIGTAKEDIPERNFKHRSLKEK WKSDALIFE >gi|224461352|gb|ACDC01000050.1| GENE 52 49237 - 49509 228 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740661|ref|ZP_04571142.1| ## NR: gi|237740661|ref|ZP_04571142.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 90 18 107 107 143 98.0 4e-33 MEYAKNIENAMEKFELKENIKVYRDTQKKYYELLGVGDTFEVNMFFSISTNKSIAEEFSE VDMSDDGIIFEIDVSINTKCIYIGKKFYFK >gi|224461352|gb|ACDC01000050.1| GENE 53 49697 - 50176 396 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740662|ref|ZP_04571143.1| ## NR: gi|237740662|ref|ZP_04571143.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 159 1 159 159 302 100.0 4e-81 MEINGIYFAKREFYQIIRNIGGVWNDSKERPIVCLLKIDDTDIYWAIPMGNLNHRSEKAK ERLDFYLNIEESDIRSCFYHIGKTTTDTIFFISDVVPIKEIYIDREYLGFNNIHYVIKNK KLISELERKLKRILYFEDSTPNYFRQHITDLKNKLLIME >gi|224461352|gb|ACDC01000050.1| GENE 54 50407 - 50673 418 88 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2584 NR:ns ## KEGG: EUBREC_2584 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 75 7 76 88 90 60.0 2e-17 MSMKLVNIRMDEDLKKEMEIVCNDLGINITTAFTIFAKKLTREKRIPFSVSIDPFYSNEN IKALENSINEVKDGKIIMKTIEELETME Prediction of potential genes in microbial genomes Time: Thu May 19 23:09:22 2011 Seq name: gi|224461351|gb|ACDC01000051.1| Fusobacterium sp. 2_1_31 cont1.51, whole genome shotgun sequence Length of sequence - 47307 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 17, operones - 12 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 50 - 1264 819 ## COG0477 Permeases of the major facilitator superfamily - Prom 1300 - 1359 3.8 2 2 Op 1 . - CDS 1445 - 2056 679 ## COG3340 Peptidase E 3 2 Op 2 . - CDS 2089 - 2982 1071 ## gi|237740666|ref|ZP_04571147.1| predicted protein 4 2 Op 3 . - CDS 2999 - 3790 739 ## COG2215 ABC-type uncharacterized transport system, permease component 5 2 Op 4 . - CDS 3817 - 3921 108 ## - Prom 3956 - 4015 5.5 6 3 Tu 1 . 
- CDS 4034 - 4588 365 ## COG3683 ABC-type uncharacterized transport system, periplasmic component - Prom 4616 - 4675 12.3 + Prom 4594 - 4653 14.4 7 4 Op 1 49/0.000 + CDS 4782 - 5723 1075 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 8 4 Op 2 5/0.000 + CDS 5720 - 6532 870 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 9 4 Op 3 5/0.000 + CDS 6568 - 8130 2444 ## COG0747 ABC-type dipeptide transport system, periplasmic component 10 4 Op 4 44/0.000 + CDS 8143 - 8922 253 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 11 4 Op 5 . + CDS 8940 - 9701 568 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Prom 9746 - 9805 9.8 12 5 Op 1 1/0.625 + CDS 9927 - 10892 1642 ## COG3643 Glutamate formiminotransferase + Term 10916 - 10957 3.0 + Prom 10897 - 10956 2.2 13 5 Op 2 . + CDS 10976 - 12217 1801 ## COG1228 Imidazolonepropionase and related amidohydrolases 14 5 Op 3 . + CDS 12233 - 12778 629 ## COG3236 Uncharacterized protein conserved in bacteria 15 5 Op 4 . + CDS 12801 - 13439 1141 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase + Term 13454 - 13501 7.9 + Prom 13476 - 13535 17.1 16 6 Op 1 . + CDS 13624 - 14970 1287 ## FN0748 hypothetical protein 17 6 Op 2 . + CDS 14967 - 15482 715 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 15483 - 15509 -0.7 - Term 15471 - 15497 -0.7 18 7 Op 1 . - CDS 15508 - 16089 580 ## gi|294782166|ref|ZP_06747492.1| conserved hypothetical protein 19 7 Op 2 . - CDS 16121 - 16612 558 ## gi|237740681|ref|ZP_04571162.1| predicted protein 20 7 Op 3 . - CDS 16600 - 17262 703 ## COG1309 Transcriptional regulator 21 7 Op 4 . - CDS 17294 - 18241 900 ## COG0679 Predicted permeases 22 7 Op 5 . - CDS 18263 - 18706 600 ## SEN0273 rhs-associated protein 23 7 Op 6 . 
- CDS 18703 - 19356 827 ## COG1059 Thermostable 8-oxoguanine DNA glycosylase - Prom 19565 - 19624 8.4 + Prom 19288 - 19347 14.7 24 8 Tu 1 . + CDS 19553 - 20857 1592 ## COG0427 Acetyl-CoA hydrolase + Term 20865 - 20903 6.0 - Term 20853 - 20891 4.4 25 9 Tu 1 . - CDS 21018 - 21656 1033 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 21755 - 21814 14.3 - Term 21785 - 21832 6.0 26 10 Tu 1 . - CDS 21853 - 23277 2200 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 23439 - 23498 11.8 + Prom 23373 - 23432 15.2 27 11 Op 1 9/0.000 + CDS 23534 - 25213 1881 ## COG3275 Putative regulator of cell autolysis 28 11 Op 2 . + CDS 25206 - 25928 742 ## COG3279 Response regulator of the LytR/AlgR family + Term 25941 - 25997 -0.2 - Term 25670 - 25735 8.2 29 12 Op 1 1/0.625 - CDS 25953 - 26822 1109 ## COG2071 Predicted glutamine amidotransferases 30 12 Op 2 . - CDS 26819 - 27472 508 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 27604 - 27663 13.3 + Prom 27526 - 27585 12.6 31 13 Op 1 1/0.625 + CDS 27650 - 28414 1138 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 32 13 Op 2 1/0.625 + CDS 28439 - 29191 286 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 33 13 Op 3 1/0.625 + CDS 29191 - 29763 766 ## COG0817 Holliday junction resolvasome, endonuclease subunit 34 13 Op 4 . + CDS 29773 - 30279 690 ## COG1778 Low specificity phosphatase (HAD superfamily) 35 13 Op 5 . + CDS 30309 - 30851 706 ## FN0212 hypothetical protein - Term 30842 - 30890 11.5 36 14 Op 1 2/0.125 - CDS 30897 - 31919 1449 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 37 14 Op 2 2/0.125 - CDS 31928 - 32479 845 ## COG1704 Uncharacterized conserved protein 38 14 Op 3 . 
- CDS 32521 - 34281 2057 ## COG4907 Predicted membrane protein - Prom 34348 - 34407 8.7 39 15 Op 1 . - CDS 34414 - 36219 2280 ## COG4907 Predicted membrane protein - Prom 36262 - 36321 6.8 40 15 Op 2 . - CDS 36371 - 38353 2589 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Prom 38384 - 38443 8.8 + Prom 38323 - 38382 10.3 41 16 Op 1 17/0.000 + CDS 38607 - 39266 833 ## COG0765 ABC-type amino acid transport system, permease component 42 16 Op 2 34/0.000 + CDS 39247 - 39927 688 ## COG0765 ABC-type amino acid transport system, permease component 43 16 Op 3 16/0.000 + CDS 39924 - 40679 259 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 44 16 Op 4 . + CDS 40711 - 41592 1513 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 41610 - 41662 11.1 - Term 41597 - 41648 8.4 45 17 Op 1 7/0.000 - CDS 41669 - 42154 821 ## COG0319 Predicted metal-dependent hydrolase 46 17 Op 2 . - CDS 42169 - 44241 2319 ## COG1480 Predicted membrane-associated HD superfamily hydrolase 47 17 Op 3 1/0.625 - CDS 44258 - 46720 2447 ## COG1199 Rad3-related DNA helicases 48 17 Op 4 . 
- CDS 46742 - 47233 686 ## COG4807 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|224461351|gb|ACDC01000051.1| GENE 1 50 - 1264 819 404 aa, chain - ## HITS:1 COG:FN1168 KEGG:ns NR:ns ## COG: FN1168 COG0477 # Protein_GI_number: 19704503 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Fusobacterium nucleatum # 1 273 1 273 302 325 78.0 1e-88 MQSKESNIKLLLLGRAVSLFGNTIYLIVLPLYILNITQNLKITGIFFAMVNLPTVVISIF IGTIIEKFNKKNVILTCDFLTSILYFILFLYFRNSNSLILLFFISLFINIISTFFIIASK VIFSELNTPETLEKYNGLQSFLENITIIIGPVIGTYLFSIFDFNFILLIVSLAYFLSFLQ ELLIKYEKDSNLVKEDSNFIKDFKEGIIYIKNNKIVFNFFILVMFLNFFIANNDEIINPG ILIQKYKISEKLFGFSATAYGVGSVFAGIFIYYNEKFRFLKKLKLLFVLNSSLMCLLGLL SIILFEYNHYIYFAVFIFFQFLIGMITTFVNVPLISSFQKNVEIEYQSRFFSLLSFFSGG LIPLGVLYAGYLSSYIGADITYIINNIAIIVIVFLVFRKNKKYL >gi|224461351|gb|ACDC01000051.1| GENE 2 1445 - 2056 679 203 aa, chain - ## HITS:1 COG:FN1116 KEGG:ns NR:ns ## COG: FN1116 COG3340 # Protein_GI_number: 19704451 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Fusobacterium nucleatum # 1 203 1 203 203 317 90.0 1e-86 MKNLFLCSYFTGVKDIFKDFMSNDTEGKKVLFIPTANIDEETKFLVDEAKEVFKSLGMEV ENLEISKLDEKTIKNKIEKTNYLYIGGGNTFYLLQELKRKNLIDFIKNRVNFGMTYIGES AGAIITSKDIEYNDLMDDKTIAKDLKEYSGLNLVDFYIVPHLNEFPFEESAKQTVEKYKD NLNIIAINNSQAIILKDDKFEIK >gi|224461351|gb|ACDC01000051.1| GENE 3 2089 - 2982 1071 297 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740666|ref|ZP_04571147.1| ## NR: gi|237740666|ref|ZP_04571147.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 297 1 297 297 481 100.0 1e-134 MYLIIENIQEQFELYFNHEKNIELIKKWAIRYIGYGEDLCFLSDEKYIIKWLEIFKNISD EIKDTDMRKLYNEFLEDLKKINIEYDKNVDELTKKYKEENLEIYNYKGVTLGDNIKKIYP LMKIYNTEYSEHGIEEEYSLITKIENSYIFIDIYSRKVVKIEIYDESYSLGEFKIGSEIT TELCDKYELLDLDDVDTGEICYFPQKNYMHAVIYVNPEDDVSKITKIVFSINGENPSKNN VKDILKAKKIEDIYYSLYNFGKIEIDIKNKEIIGRLEGNTFIFDLFNGNLIDIKFKE >gi|224461351|gb|ACDC01000051.1| GENE 4 2999 - 3790 739 263 aa, chain - ## HITS:1 COG:FN1115 KEGG:ns NR:ns ## COG: FN1115 COG2215 # Protein_GI_number: 19704450 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 20 263 1 244 244 341 90.0 7e-94 MKKIIKYLVGIIAIALVYLLISNFNLIMYKIAIYQQEIVEKISELTENSNNKVVYTILFF TFLYGVVHSLGPGHGKTLVLTYSVKEKLNFPKLLLVSALIAYLQGLSAYLLVKFIINLSD KASMLLFYDLDNRTRMIASILIILIGLYNIYSILRNKSCEHCHETKVKNILGFSIVLGLC PCPGVMTVLLFLESFGLSENLFLFTLSMSTGIFLVILFFGILANTFKNTLVEDENLRLHK ILALVGAILMILFGIFQILILGE >gi|224461351|gb|ACDC01000051.1| GENE 5 3817 - 3921 108 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEFIKNIMFALSFAFIGYIIFYLYAMYATIGMM >gi|224461351|gb|ACDC01000051.1| GENE 6 4034 - 4588 365 184 aa, chain - ## HITS:1 COG:FN1114 KEGG:ns NR:ns ## COG: FN1114 COG3683 # Protein_GI_number: 19704449 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Fusobacterium nucleatum # 7 184 23 196 196 216 75.0 2e-56 MKRIFFILFFIFSFNIFSHPHVFFETALTLKTDNKKMEGVEIQLILDELNTKLNRKVLKP DKDMNVEKGNIVFLKHLYKHIRIKYNNKTYKENDIIFEQAKLEDDSLEIYFFVPIDEKID KNSKLTIALYDTKYYYNYDYDLSSLRMDKSNKNDLKAKVKFFTNDKIKFYFNLVSPDEYE VTFE >gi|224461351|gb|ACDC01000051.1| GENE 7 4782 - 5723 1075 313 aa, chain + ## HITS:1 COG:FN1113 KEGG:ns NR:ns ## COG: FN1113 COG0601 # Protein_GI_number: 19704448 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # 
Organism: Fusobacterium nucleatum # 1 312 1 312 312 517 92.0 1e-147 MIRYTIKRLLYLIPILFGVTFLTFLMLYLAPSDPISMKYSSMATVGDSKYIEEKKEEMGL NDSFIKQYTRWSKNVLSGDFGISTKYNVPVKDEIAKRLPKTLALTGTSILITIFLAFPLG IISAKYKNKWIDYIIRFFSFTGISIPSFWLGLMLMYIFSVRFKLLPIVGSKGIKSLILPS VTLSVWLVAVYIRRIRACILEEINKDYVVALESKGISSSKIMLFHILPNSLLTIITMFGM SIGSILGGTTIIETIFEYRGLGKMAADAITNRDYFLMQGYVIWTAIIYVVINLLVDILYK YLNPKIKIGDDSL >gi|224461351|gb|ACDC01000051.1| GENE 8 5720 - 6532 870 270 aa, chain + ## HITS:1 COG:FN1112 KEGG:ns NR:ns ## COG: FN1112 COG1173 # Protein_GI_number: 19704447 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 270 1 270 270 409 92.0 1e-114 MINKKFNYKFSIILTLAVIIIFITFFANYLAPFNPDYQNYEAISQAPNSTYLMGTDYVGR DIFSRILYGGRYSLLIALLVTLLVAFIGIVIGLISGYLGGIVDIVIMRIVDMIMAFPYIV FVIAVVTIFGGGLKNLILAMTLISWTNYARVTRAMVISLKNNDFINQAKLSGASNIRIMY KYLAPNVLPYLIVLTTQDIANNLLTLSSLSLLGIGVQPPTAEWGLMLSEGKKFIQTAPWI LFFPGLAIFICVVVFNLLGDSLRDILDPKK >gi|224461351|gb|ACDC01000051.1| GENE 9 6568 - 8130 2444 520 aa, chain + ## HITS:1 COG:FN1111 KEGG:ns NR:ns ## COG: FN1111 COG0747 # Protein_GI_number: 19704446 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 520 19 538 538 911 90.0 0 MKKKVLLGIFLALISIGVLTACGTKKEKEEVSTEAQATGGHMNIALYWFGETLDPALDWD GWTLTRAAVGETLVTVDENLQLVGQLADSWENVDETTWKFHIRQGVTFQNGNPLTPEAVK ASIERTVKMNERGESALKLASIDVDGEYVVIKTKEPYGAFLANISDPMFIIVDTSVDTSK FKETPVCTGPYMVTSFKPATSFEVVAYENYWGGKPALDSITVFNIEDDNTRALALQSGDV DMAQGIRAGDIALFTDNKDYIVKTTTGTRIEFLTMNTVKSVLKDKNLRLAVNSAVDYDTI AKVVGGGAVAARAPFPASAPYGYDELNKQTFDLEKAKTLLAEAGYKDTDNDGYVDKDGKN LELNIYGTAGGNTRANSTVAELLESQLKTAGIKANIKIAENLEDIKKNLEFDLLFQNWQT VSTGDSQWFLDNAFKTDGSGNYGKYSNKELDNLINKLATTFDVKERQKITKEASQLIIDE AYGTYIVSQANVNVSNNKVENMHNFPIDYYFLTADTKITK >gi|224461351|gb|ACDC01000051.1| GENE 10 
8143 - 8922 253 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 232 1 221 223 102 31 5e-21 MKPLLEIKNLNINYKNSIKAVKNVSLTLKDNQIISIVGESGSGKSTLIRAILKLLPTGGK IESGNIFFLEKDILTLNKKELNKLRGKDIGMIFQDPNSTMDPIKTIEKQFIEYILEHNNI SKKEAIDLAKEYLLKLSLTDVDRILKSYPFELSGGMKQRVAIAMSMAQSPRLLLADEPTS ALDVTVQAQVIQELKKIRENFNTAIILVTHNMGVASYISDKIAVMKDGELIEFGNKEQII NNPQKEYTKLLLNAVINLK >gi|224461351|gb|ACDC01000051.1| GENE 11 8940 - 9701 568 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 247 8 265 329 223 46 2e-57 MSEDLLIIENISKSFTVDKNKELKALKNINIRLKKGECIGIVGESGCGKSTLARIIVGIE KKTSGKIIFDDKEIDGISKTKDIQMIFQSPLSSFNPRMKIIDYMWEPLRNYFKLSKKESI PLIIKSLVDVGLDETALEKYPHEFSGGQLQRITIARAIIIKPKLIVCDEITSALDVSVQK QILELLKKLQKDLALSYLFIGHDLAVVQNISQKIVVMYMGEIVEELNSIDLKTKAKHPYT NLLLNSVFEVNKA >gi|224461351|gb|ACDC01000051.1| GENE 12 9927 - 10892 1642 321 aa, chain + ## HITS:1 COG:FN0741 KEGG:ns NR:ns ## COG: FN0741 COG3643 # Protein_GI_number: 19704076 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Fusobacterium nucleatum # 1 321 1 321 321 625 97.0 1e-179 MAKIVECIPNYSEGKDLAKIERIVAPYKNNPKVKLLGVEPDANYNRTVVTVLGDPEEVKK AVIESIGIATKEIDMNVHKGEHKRMGATDVVPFLPIQEMTTEECNEISREVAKAVWEQFQ LPVFLYESTATAPNRVSLPDIRKGEYEGMAEKLKQPEWAPDFGERAPHPTAGVTAIGCRM PLIAFNINLATTDMDIPKEIAKAIRFSSGGFRFIQAGPAEILDKGFVQVTMNIKDYTKNP IYRIMETVKMEAKRWGVKVTGCEIIGATPFASLTDSLKYYLACDGIKDDVDAMSMEKVVE LMVKYLGLTDFDVKKVLEANI >gi|224461351|gb|ACDC01000051.1| GENE 13 10976 - 12217 1801 413 aa, chain + ## HITS:1 COG:FN0740 KEGG:ns NR:ns ## COG: FN0740 COG1228 # Protein_GI_number: 19704075 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Fusobacterium nucleatum # 1 413 1 413 413 781 97.0 0 
MQADLVLYNIGQLVTSRELDNSKKMDNIEVIENNGYIIIEKDKIVAVGSGEVPKEYLSPA TEMVDLSGKLVTPGLIDSHTHLVHGGSRENEFAMKIAGVPYLEILEKGGGILSTLKSTRN ASEQELIEKTLKSLRHMLELGVTTVEAKSGYGLNLEDELKQLEVTKILGYLQPVTLVSTF MAAHATPPEYKDNKEGYVQEVIRMLPIVKERNLAEFCDIFCEDKVFSVDESRRILTAAKE LGYKLKIHADEIVSLGGVELAAELGATSAEHLMKITDSGINALANSNVIADLLPATSFNL MEHYAPARKMIEAGIQIALSTDYNPGSCPSENLQFVMQIGAAHLKMTPKEVFKAVTINAA KAVDKQDTIGSIEVGKKADITVFDAPSMAYFLYHFGINHTDSVYKNGKLVFKR >gi|224461351|gb|ACDC01000051.1| GENE 14 12233 - 12778 629 181 aa, chain + ## HITS:1 COG:PA4580 KEGG:ns NR:ns ## COG: PA4580 COG3236 # Protein_GI_number: 15599776 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 4 181 6 184 184 186 47.0 2e-47 MKYNLENLIKNFNSKKKLKFLFFWGHTQNGDEITKACFSQWYNCKFVVDEITYHTAEQYM MAQKALLFGDNEIFHKIMNSKHPKEYKELGRKIKNFSDSKWNENKYQIVLKGNLAKFSQN EKLKTFLLNTGTRVLVEASPYDKIWGIGLSADQENIENPLTWNGENLLGFALMEVRDLIS E >gi|224461351|gb|ACDC01000051.1| GENE 15 12801 - 13439 1141 212 aa, chain + ## HITS:1 COG:FN0739 KEGG:ns NR:ns ## COG: FN0739 COG3404 # Protein_GI_number: 19704074 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 1 212 1 212 212 355 98.0 5e-98 MKLVELDVLKFLDVVDSNSPAPGGGSVSALASSLGASLARMVAHLSFGKKNYEALADDVK AKFVANFDELLKIKNELNDLIDRDSEAYNTVMAAYKLPKETDEEKAARSAEIQKSLKYAI QTPYDIVVLSGKAISLLGEILANGNQNAITDIGVGTMLLMVGLEGGILNVKVNLSSIKDA EYVEKITKEIYDIKATAEKEKERIMGIVNAAL >gi|224461351|gb|ACDC01000051.1| GENE 16 13624 - 14970 1287 448 aa, chain + ## HITS:1 COG:no KEGG:FN0748 NR:ns ## KEGG: FN0748 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 448 1 429 430 616 81.0 1e-175 MDNIKQRIRQIEILALTLFMIILVCFLTYIINESENIFLGLYRIITSPAILVTDFIKVGG IGAAFLNALLILSFNYFLVKLFKIKITGVVIAMFFTVFGFSFFGKNILNILPFYLGGILY SVYTSTDFSEHLISIAFSSALAPFISSVAFYGEVAYETSYINAILIGVLIGFIVVPLAKS LYDFHEGYDLYNLGFTAGILGSVIMAVLKLYHFEINPQFLVSSEYDMALKIICSSVFVAF 
IIVGFYINNNSFTGYFKLMRDDGYKSDFVQKYGYGLTYINMGMMGLISVAFVTFTGQTFN GPILAGLFTVVGFSANGKTIFNTIPIFIGVLLASFGSKGNTFTVAISGLFGTALAPISGV FGPVAGIIAGWLHLAVVQNVGLVHGGLNLYNNGFSAGIVAGFLLPIFNMITDNKNQRKMN IQKKHMNFLKAVQKNIKNKMKEEEGEDK >gi|224461351|gb|ACDC01000051.1| GENE 17 14967 - 15482 715 171 aa, chain + ## HITS:1 COG:FN0747 KEGG:ns NR:ns ## COG: FN0747 COG0494 # Protein_GI_number: 19704082 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 171 1 171 171 293 88.0 1e-79 MKLLDIPNLKFLKVGVDSDPLNNNNLEYLEKQNAIAALIVNHAGDKVLFVNQYRPGVHNY IYEVPAGLIDEGEEPIHALEREVREETGYRREDYDIIYDSNTGFLVSPGYTTEKIFIYII KLKSDDIVPLELDLDETENLYTRWIDIRDAGKLTLDMKTIFSLHIYANIIR >gi|224461351|gb|ACDC01000051.1| GENE 18 15508 - 16089 580 193 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294782166|ref|ZP_06747492.1| ## NR: gi|294782166|ref|ZP_06747492.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 18 193 1 176 198 259 94.0 6e-68 MSVRINLEKDGKKESGFMGFSWTLLFWGFWVPLFRGRKKDFGLFFLFFLVKIGIIVLTVK AVFRAQRSALMFGFYKPSYILLIPTLIFVIIEVIEIWLTYYYNRHCTNSLLADGYYPEEN DEYSIALLKEFTYIPYTKEELEDKSIREKYKKFSDFARKEERDKFKTFFAICLIISVIIL IFWGVQYLRFYNF >gi|224461351|gb|ACDC01000051.1| GENE 19 16121 - 16612 558 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740681|ref|ZP_04571162.1| ## NR: gi|237740681|ref|ZP_04571162.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 163 1 163 163 278 100.0 9e-74 MATIIRLEKDGYMKDAFVGYSYTTALFNIFVPAARQDLKSFLFMGGIYFLSAFILNFYKI YVQRNLIQYKYGGLISFIALMISWVIAFFYNKYYTQKMLAEGWKPLKDDEYSNVLLKKYN YFEYTDNDLISDERTKEILDEVKKTEKKKALMFVVAAIIQILL >gi|224461351|gb|ACDC01000051.1| GENE 20 16600 - 17262 703 220 aa, chain - ## HITS:1 COG:FN1803 KEGG:ns NR:ns ## COG: FN1803 COG1309 # Protein_GI_number: 19705108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 214 1 214 217 263 68.0 2e-70 MEEMNIKKKRVMMYFIEATQELILDEGLEKLSIKKIAEKAGYNSATIYNYFENLEVLILY ASINYLKDYLNDLKNEITADMKAIEVYETVYKIFTKHSFERPEIFHTLFFGKYSYKLENI IKKYYEIFPDEIEGHIDLTKAMLIQGNIYDRDLPIINKMVKEGSIKEEEAEFIMETIIRV HQSYLSDLLYKNDDSLIEKYTEGFFKIFNFLLKKGDTWQQ >gi|224461351|gb|ACDC01000051.1| GENE 21 17294 - 18241 900 315 aa, chain - ## HITS:1 COG:FN0623 KEGG:ns NR:ns ## COG: FN0623 COG0679 # Protein_GI_number: 19703958 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 315 4 318 318 410 77.0 1e-114 MENFLLALNVVLPIFLTMALGFYLKRLKMVDESSLNTMNKLVFRVFMSTLLFLNVYNIGD LSKLSIDNLKLLGYAFIIIFVVVFLAWLIYMPKVKEKKKLSVLIQGVYRGNFVLFGLAIV DSIYGKEGLATVSLLTIVVIPTFNVLAVIILEYYSGREISKLKLVKQVFKNPLIIATLLG IIFILLRINIPKPIYKTLSDISKISTPLAFIVLGAELQFGNMLKNIKYLISVNLLRLIVN PLITIGIGKLIGFQGIELVALLSMSACPTAVASYTMAKEMKADGDLAGEIVATTSMFSIL TIFCWVLILKNMAWI >gi|224461351|gb|ACDC01000051.1| GENE 22 18263 - 18706 600 147 aa, chain - ## HITS:1 COG:no KEGG:SEN0273 NR:ns ## KEGG: SEN0273 # Name: not_defined # Def: rhs-associated protein # Organism: S.enterica_Enteritidis # Pathway: not_defined # 7 141 3 141 148 85 35.0 8e-16 MTFGEEIENEIKEFINEIGNIRYYPDSNYGVEYLNNNFSFLGTKIDLSKENNYTSHDFKE NDFLDMMKFFEFKDIKEKILESNEIHYVGDGITDSELVFSGKEFFKLLEFLFVNVPEHHY FFNEDKKWCLLIATEGWIDYGEKSIKR >gi|224461351|gb|ACDC01000051.1| GENE 23 18703 - 19356 827 217 aa, chain - ## HITS:1 COG:FN0622 KEGG:ns NR:ns ## COG: FN0622 COG1059 # Protein_GI_number: 19703957 # Func_class: L Replication, recombination and 
repair # Function: Thermostable 8-oxoguanine DNA glycosylase # Organism: Fusobacterium nucleatum # 1 217 1 217 217 339 85.0 3e-93 MKKNEYFKEIEKIYKEIKVDIKKRLEEFKNTWEKGSNKDIHLELSFCILTPQSKALNAWQ AITNLKKDDLIFKGSAEELVEYLNIVRFKNNKAKYLVELREQMTKKGKIITKDFFNSLPT VYEKRDWIVKNIKGMSYKEAGHFLRNVGFGADVAILDRHILKNLVKLEVIDELPKTLSPK LYLEIEEKMRKYCKFVKIPMDEMDLLLWYKEAGVIFK >gi|224461351|gb|ACDC01000051.1| GENE 24 19553 - 20857 1592 434 aa, chain + ## HITS:1 COG:FN0621 KEGG:ns NR:ns ## COG: FN0621 COG0427 # Protein_GI_number: 19703956 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Fusobacterium nucleatum # 1 431 1 431 434 746 85.0 0 MEKWQKKYKAKICSPDEAIQKIKSAKRISFGHICSESSVLTEALIRNKKLFKKLEIDHLL SIGKCEYAKEENSEYFHHNALFIGPKTREAANSSYGDYTPIFFYETAKIFGKDGDLSPDA MLLQVSTPDEHGYCSYGLSCDHTKSATESAKIVVAQINKFVPRTLGNCFIHIDDIDYIIL EDTPIPEIPAPVVGELEEKIGANCASLINDGDTLQLGIGAIPFAVLNFLKDKKDLGIHSE MISDGIVDLIQAGVITNKKKNFNPNKVIATFLLGTKKLYDYANNNPAIELHPVDYVNNPM IIAQNNNMISINSAIQVDLMGQVNAEYINTKQFSGPGGQVDFVRGATMSNGGKSIIALPS TTADEKISRIVFTFEEGVPVTTSRNDVDYIITEYGIAHLKGKTLRERAKLLIEIAHPKFR EELRKRAIEKFEIL >gi|224461351|gb|ACDC01000051.1| GENE 25 21018 - 21656 1033 212 aa, chain - ## HITS:1 COG:FN1265 KEGG:ns NR:ns ## COG: FN1265 COG2885 # Protein_GI_number: 19704600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 11 212 1 202 202 271 84.0 7e-73 MKNRKIIASCMLALSLVGCTGFEAGNGGYTTAGGAGGAAVGALAGQIIGKDTKGTLIGAA VGSLLGMGWGAYKDNQARELRAALKGTQAEVRNDGNALVVNLPGGVTFASDSANISSGFY SALNGVAQTLVRYPETRIQVNGYTDSTGGDAHNQELSQRRANSVAQYLISQGVSSSRIVA NGFGSSNPIASNATPEGRQANRRVEVRILPAQ >gi|224461351|gb|ACDC01000051.1| GENE 26 21853 - 23277 2200 474 aa, chain - ## HITS:1 COG:FN0221 KEGG:ns NR:ns ## COG: FN0221 COG1966 # Protein_GI_number: 19703566 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Fusobacterium 
nucleatum # 1 474 1 474 474 775 91.0 0 MYSFIGSIIALVLGYLIYGKFVEGVFGIDSSKATPAERLADGVDYMEMSWTKAFLIQFLN IAGTGPIFGAVAGALWGPAAFIWIVFGCIFAGSVHDFLLGMMSLRRDGASVSEIVGENLG NGAKQIMRVFSVVLLLLVGVVFIMSPAQILKDITGISYEIWLAVIIIYYLCATVLPIDAI IGKIYPIFGLSLLVMAVGIGGGLIITNANIPEIAFVNMNPTGRSIFPYLCITIACGAISG FHATQSPMMARCLRTEKDGRKVFYGAMISEGIIALIWAAAAMSFFGGIPQLAEAGTAAVV VNKISVGILGKVGGALALLGVVACPITSGDTAFRSARLTIADSLKYKQGPIVNRFVVAIP LFVLGIALCFIPFNVIWRYFGWANQTLATIALWAAVKYLANRGKNFWIALIPAMFMTVVV TSYILAAPEGFVRFFGDKDIKVIEHIAIVIGCVVSLGCTAAFFMSNKKTNLITE >gi|224461351|gb|ACDC01000051.1| GENE 27 23534 - 25213 1881 559 aa, chain + ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 18 559 1 541 541 852 86.0 0 MNIQFISHLISNIGCSAIIAFFFIKIDKANIIIKSKAKSKKDVIALSFFFSLLSISGTYI GLNFNGAILNTRNMGVVAGGLLGGPYVAALTGLIAGIHRAIVNLGRETAIPCAIATIIGG FLTAYVSRFAKNKDRMFFAFLLAFVVENLSMALILLIQKDKALAQSIVKNFYIPMVFMNS VGAAVLILLVEDIIQKSELIAGSQAKLALEIANKTLPYFRNTENLNEVCKIIANSLGARA TVITDTKEIIAGFSTDKSVINRSNIRSNNTREVLKTGEVMLVIKDDEDEIIEDFFYISPH IKSCIILPLKEKNDVSGTLKIFFDTAEKITEKNRYLMIGLSHLISTQMEISKVENLISLL KYSELKALQSQINPHFLFNVLNTMTSLIRTNPEKAREVTIDLSNYLRYNLDNNLKSVELI KELNQIDTYIKIEKARFGEKLNIIYDVDESLYNFQIPSLIIQPLVENSIKHGILKKRDKG FVKIIVKKIDKDIEVAIEDDGVGIEQAVIDNLDKKIEENIGLKNVHQRLKLLYGEGLNIT KLEQGTRIKFKILGGVKYD >gi|224461351|gb|ACDC01000051.1| GENE 28 25206 - 25928 742 240 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 1 238 1 238 240 371 87.0 1e-103 MISCIIVEDELPAREELKYFIDEEKEIKLIAEFDNPLDTLTFLEKNAVDAIFLDINMPDM NGISLGKIITKMYPDTKIIFITAYKDYAVDAFEIKAYDYLLKPYSESRIRNLLKSLVNIK NETVNTVKNNNLKKITINMDERLYVISLNDVDYIEADEKETLIFSNQKKYISKIKISKWE 
EMLKGNNFYRCHRSFIINLDKITEIEQWFNSSWIIKIKNYPTAIPVSRNNIKELKELFLG >gi|224461351|gb|ACDC01000051.1| GENE 29 25953 - 26822 1109 289 aa, chain - ## HITS:1 COG:FN0218 KEGG:ns NR:ns ## COG: FN0218 COG2071 # Protein_GI_number: 19703563 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 1 289 1 289 289 459 82.0 1e-129 MKKPIIGISASMIFEEKDELFLGDKYSCVAHSYVDAIYKSGGIPVVLPILKDVSAIREQV KLLDGIVLSGGRDVDPHFYGEEPLEKLEAIFPERDVHETALIKAATDLKKPIFAICRGMQ ILNVVYGGTLYQDISYAPGEHIKHYQIGTPYQATHSIKIDKSSTLFRMADKLEVERVNSF HHQALKKLADGLKVVATAPDGIIEAVEGTNEDGMFILGVQFHPEMMYDKSTFARSMFKRF ITICLESRPTDVILKDGLHHEEEYKAKEIADRIKELEEEEKKEFFKGDL >gi|224461351|gb|ACDC01000051.1| GENE 30 26819 - 27472 508 217 aa, chain - ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 1 217 1 217 217 293 81.0 2e-79 MISKEDIKQLEIIFPFWSELNQNDRAKIILSSRVLSLKKEAIFFNSHELDGLLFLKSGRL RFFLSSLEARDLPLYYLKDNEVEFFEDFNNKLISPILDIAFVVERNSEVLLIPYTILNLF RKKYSIMERFLHDLTREKLSKSLLSLQNILLIPLKERLLNFLYGLKKNEISLTHEEIAKK LGSSREVISRNLKILEKENFLKMNRKKIIIIGRGEVL >gi|224461351|gb|ACDC01000051.1| GENE 31 27650 - 28414 1138 254 aa, chain + ## HITS:1 COG:FN0216 KEGG:ns NR:ns ## COG: FN0216 COG1028 # Protein_GI_number: 19703561 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Fusobacterium nucleatum # 1 249 1 249 250 395 83.0 1e-110 MKVFIIGGSSGIGLSLAKRYLSLGNEVAICGTNEEKLKKIEEVNKGLKLYKVDVRNKSDL KSAIEDFSQGNLDLIINSAGIYTNNRTTKLTNDEAFAMIDINLTGVINTFEAVRDMMFTN NKGHIAIVSSIAGLIDYPKASVYARTKLTIMGVCETYRAFFRDYNINITTIVPGYIATDK 
LKSLSKEDITNRPTVLSEEKSTDIIVKAINDKKEKVIYPLSMRILIAIITKLPKKLLTYL MIKQANWGESNTRK >gi|224461351|gb|ACDC01000051.1| GENE 32 28439 - 29191 286 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 22 250 11 232 236 114 31 8e-25 MMHIFDILDKFLKIKFTGELTVEIVCFRLILAILFGGIVGYEREKNNRPAGFRTHILVCF GAAIVSMVQDQLRLNILDLASTEGSAVASVIKTDLGRLGAQVISGVGFLGAGSIMKEKGE TVGGLTTAAGIWATACVGLGIGWGFYNIAAVAVVFMIIIMVTLKKLESKLVKKTRLLKFE VKFFDSEDFANGLIEAYEVFRQRSIKITEIDKYQDDALVTFTVSMRGRNNISDVVVSLSS IQNVEYVRDV >gi|224461351|gb|ACDC01000051.1| GENE 33 29191 - 29763 766 190 aa, chain + ## HITS:1 COG:FN0214 KEGG:ns NR:ns ## COG: FN0214 COG0817 # Protein_GI_number: 19703559 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Fusobacterium nucleatum # 1 190 1 190 190 313 88.0 1e-85 MRVIGIDPGTAIVGYGIIDYNKNKYSIVDYGVILTSKDLSNEERLEIVYNELDKILKKYK PEFMAIEDLFYFKNNKTVISVAQARGVILLAGKQNNIPMSSYTPLQVKIGITGYGKAEKK QVQLMVQKFLGLSEIPKPDDAADALAICITHINSLSSNISFTGTSNLKKITLSSDTNKIS LEEYKKLLKK >gi|224461351|gb|ACDC01000051.1| GENE 34 29773 - 30279 690 168 aa, chain + ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 1 168 1 168 168 271 87.0 5e-73 MKDIKILVLDVDGTLTDGKIYVDDKDNSFKAFNVKDGFALVNWLKLGGEVAILTGKKSNI VERRAKELGIKYVIQGSKNKTQDLKKLLDELDITFENTAYMGDDLNDIGVMKKVGLTACP KDSVAEVLEICDFISTKNGGDAAVREFLEFIMKKNGMWQEVLNKYSNE >gi|224461351|gb|ACDC01000051.1| GENE 35 30309 - 30851 706 180 aa, chain + ## HITS:1 COG:no KEGG:FN0212 NR:ns ## KEGG: FN0212 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 180 1 180 180 258 83.0 9e-68 MLNFEKINNMIDLIEKNEIMPGLSFNEFAIAFYQEVKLVPLSRYLKTNNRAKRMPKIMTM KKAGELLLFTKTDDETLSFLKRKGYNEIPELDYKTMMLLRRLDPIDNWKKILAFFDGDKT 
VEEINLSTKPILFPQEIKKLEEFIKDELSIDDEEFEKFMKLSSLAIKNKELTKAIRKLTR >gi|224461351|gb|ACDC01000051.1| GENE 36 30897 - 31919 1449 340 aa, chain - ## HITS:1 COG:FN1124 KEGG:ns NR:ns ## COG: FN1124 COG2885 # Protein_GI_number: 19704459 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 82 340 1 315 315 315 62.0 6e-86 MKKVYLVAAVLALIFGFSYCYKKDKTEKTETATEKEAVVNEVKNEDMVIPGYALGEIPSI TIPEIQNLSVSENPDAKITLDMTKKISSVPGITISPVKVENGNIVDGSYSMQIGKNGDGQ YINKVTGVILQVDKDGTGLYTDNKNNIKIYVGEINARYESPNVEIINNGDGSGTYTDKSK NLVIENDGKGKAKITFNGQTTEVDAKPLEKPGKLEMVPPVPSIEANSLLITSDSGILFDV DKYDVRPEDKEVLKNLATVLKEMNVKNFEIDGYTDSDGSDEHNQVLSEKRANSVKNFLVS QGVTAEITTKGYGESKPVASNDTAEGRQKNRRVEIIIPTI >gi|224461351|gb|ACDC01000051.1| GENE 37 31928 - 32479 845 183 aa, chain - ## HITS:1 COG:FN1125 KEGG:ns NR:ns ## COG: FN1125 COG1704 # Protein_GI_number: 19704460 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 18 183 18 183 183 285 93.0 3e-77 MVVLGIVLGIVVVLALLAISYKNKFVVLDNRVKNAWSQIDVQMQNRFSLVPNLVETVKGY AKHEKETFEGIANAKAKYMSANTAAEKMEANNQLSGFLGRLFAISEAYPELKANTGFENL QGQLVEVENKIRFARQFYNDTVTEYNQAIQMFPGSLFAGFFNYHNAELFKANDMAREEVQ VKF >gi|224461351|gb|ACDC01000051.1| GENE 38 32521 - 34281 2057 586 aa, chain - ## HITS:1 COG:FN1127 KEGG:ns NR:ns ## COG: FN1127 COG4907 # Protein_GI_number: 19704462 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 12 547 28 567 606 671 65.0 0 MSFAANYRIEKLDIEANLQKDGSMVVSEAVTYDIDEINGVYFDIDAKGFGELEDLQVFED DPNTSSFKEVDASNYEVSASDELYRIKLYSKNQNNIRTFKFVYKLPEAITVYDDVAQFNR KMVGQEWQQGIKHITAKVIVPVPTDYDNSNILVFGHGPLTGEVDKEGNTVVYKLDDYYPG DFLEAHILMEPEIFSEYNKSKIVHKDMKQELLDMEAKLSEEANTERDKASSQQKISKKQG VILGVLGSIWGVLMFYIYGIYRRKNRVKNSVGKYLRELPDDSSPALVGSFMTDSISGNEI LATIVDLIRRKVLMLETSGEKSIITLVGNTEKLSAQERVIVDIYINDFGNGKSLDLKDFD 
LFQEVPMSTARKFEKWKTIIQSEMDRKDLVFEGFKGMGENLFYTSLGGIILGIKFFKNIL EKAMESKMFLIIVIMGFILLISLTKARYPRKELAEAKDKWQAFKNFLSDYSQLEEAKITS VHLWEQYFVYAVALGVSDKVVKAYKKALDMGVINDVQGVNSLAYSPIFNPMFSRSFSNLN GMVSRTNSGASSAIASSRRSSSSGGGGGFSSHSSGGGGSRGGGGGF >gi|224461351|gb|ACDC01000051.1| GENE 39 34414 - 36219 2280 601 aa, chain - ## HITS:1 COG:FN1127 KEGG:ns NR:ns ## COG: FN1127 COG4907 # Protein_GI_number: 19704462 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 562 1 567 606 728 67.0 0 MKKNILRIFLFFLISIVSFAASFRIEKLDIEANLQKDGSMVVSEAVTYDIDEINGVYFDI DAKGFGELEYIQVFEDDSTGGFKEVDSSNYEVSVSDELYRIKLYSKNHNNRRTFKFVYKL PEAITVYDDVAQFNRKMVGQEWQQGINYITAKVIIPVSASYDNSNILVFGHGPLTGEVDK EGNTVVYKLNNYYPGDFLEAHILMEPEIFSEYNKSKIVHKDMKQKLLDMEAKLADEANAE RDKAIRQQEMINKVFEKPGLIFGVLSSIWGALMYYIHVIFKRKNKVKNSVGKYLRELPDN SSPALVGGFMTNSINDNEILATIVDLVRRKVLTLENSDKNSIIILTGSTENLSAQEKAIV DIYINDFGDGKSLDLKSFGFFQKVPMSVARKFEKWRAMVQSEMDRKNLTYQGLGCLGVIF FAFFPMIFTFAGLVIGMITGNKMFLLIVVMGIILFVSGAKARYPRKELAEAKDKWQAFKN FLSDYSQLEEAKITSVHLWEQYFVYAVALGVSEKVVKAYKKALDMGVINDVQGVNSLAYS PIFNPMFSRSFSNLNGMVSRTNSGASSAIASSRRSSSSGGGGGFSSHSSGGGGSRGGGGG F >gi|224461351|gb|ACDC01000051.1| GENE 40 36371 - 38353 2589 660 aa, chain - ## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 1 660 1 660 660 1125 85.0 0 MENLHLKSFLEYKFLSNLDFNPEGKNLAFSLSESDYEKNSYKHYIYSLNTETKEVRKLTH FGKEKNSLWLNNDIILFTSDRDTDIEEKKKLGETWTLFYALDIKNGGEAYEYMKLPLDVS NIKIVDENNFILLADYDNNSYNLNDLKGEEREKAIKEIEENKDYEVLDEIPFWSNGHGFR NKKRDRLYHYDKLNNKVTPISDEYTNVELINVKDNKVIFAGRTFTDKQGLTSGLYVYDVK SQNLEVIVDKDLYDISYANFIEDKIICALSDMKAYGVNENHKLYLIDSNKNITLLNDNDT WLSCTVGSDCRLGGGKSFKVIGNKLYFLATIAERVYLESIDTNGKVEILSDKDGTIDFFD IANGEIYYVGMRDYTLQEIYKLENNESTKLTSFNEEINKKYKISKPEVFDFTTNGATTKG FVIYPIDYDKTKTYPAILDIHGGPKTVYGNVFYNEMQVWANMGYFVFFTNPHGSDGYGNE 
FADIRGKYGTIDYEDLMNFTDYVLEKYPIDKSRVGVTGGSYGGYMTNWIIGHTDRFRCAV SQRSISNWISKFGTTDIGYYFNADQNQATPWINHDKLWWHSPLKYADKAKTPTLFIHSEQ DYRCWLAEGIQMFTALKYHGVEARLCMFRGENHELSRSGKPKHRLRRLTEITNWFEKYLK >gi|224461351|gb|ACDC01000051.1| GENE 41 38607 - 39266 833 219 aa, chain + ## HITS:1 COG:SP0711 KEGG:ns NR:ns ## COG: SP0711 COG0765 # Protein_GI_number: 15900609 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 27 219 7 199 206 200 54.0 2e-51 MDWEFIAKYTPEFINAGILTLKIGGIGIVLSIIVGILGSWILYENFKFFKQIVIGYVELS RNTPLLVQLFFLYFGLPKVGLRLSPELCGIIGLTFLGGSYMIETFRSALETIDKIQKESA LSLGMTNWQTMRYVILPQSFVISLPGLTANIIFMLKETSVFSAISLMDMMFVTKDLIGLY YKTEESLFMLVVGYLIILLPLSLLGVWLERKLKYVGYSN >gi|224461351|gb|ACDC01000051.1| GENE 42 39247 - 39927 688 226 aa, chain + ## HITS:1 COG:SP0710 KEGG:ns NR:ns ## COG: SP0710 COG0765 # Protein_GI_number: 15900608 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 6 223 6 223 225 224 57.0 8e-59 MLATVIDLLSKGTNFERLLYGLWITIKLSLISAILSIIFGILFGLFMVIKNPITRIISQI YLQTIRIMPPLVLLFIAYFGVTRMYGVHISPEASAIIVFTIWGTAEMGDLVRGAIESIPK IQIESATALALDKKQIYLYVIIPQIIRRLIPLSVNLITRMIKTTSLVVLIGIVEVLKVGQ QIIDTNRFQYPNGAIWIYGVIFLLYFLSCWPLSMLAKFLEKRWSRI >gi|224461351|gb|ACDC01000051.1| GENE 43 39924 - 40679 259 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 238 1 253 563 104 28 1e-21 MKQLDKVVLSAKDVVKNYGELEVLKGINLDIHQGEVVVIIGSSGCGKSTFLRCLNGLEDI QAGDIILDNEIKFSDAKNNMTKVRQKIGMVFQSYELFPHLTILDNILLAPLKVQKRNKEE VKEQALKLLERVNLLDKQNSYPRQLSGGQKQRVAIVRALCMNPEIMLFDEVTAALDPEMV REVLDVMLELAREGMTMVIVTHEMQFARAVADRVIFMDNGNIAEQGEAEEFFSNPKTERA QKFLNTFTFKK >gi|224461351|gb|ACDC01000051.1| GENE 44 40711 - 41592 1513 293 aa, chain + ## HITS:1 COG:Cj0982c KEGG:ns NR:ns ## COG: Cj0982c COG0834 # Protein_GI_number: 15792309 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Campylobacter jejuni # 5 292 2 279 279 265 47.0 1e-70 MKIWKKILKLATVGIAVFALAACGNKTEEKTEAQAPAQETAVAKARTVQEIKDSGVIRIG VFTDKAPFGYIDENGKNQGYDVYFTDRLAKDLGVKVEYISLDPASRVEYAETGKADIVAA NFTVTPERAEKVDFSLPYMKVSLGVVSPDGAVIKSVEELKDKTLIVSKGTTAEYYFSKNH PEVKLQKYDSYADAYNALLDGRGDAFSTDNTEVLAWAKSNPGFTVGIDSLGDVDTIAVAV QKGNTDLLNWINNEIKELGKENFFHEAYKATLEPIYGDSADPDSIVVEGGEVK >gi|224461351|gb|ACDC01000051.1| GENE 45 41669 - 42154 821 161 aa, chain - ## HITS:1 COG:FN0746 KEGG:ns NR:ns ## COG: FN0746 COG0319 # Protein_GI_number: 19704081 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Fusobacterium nucleatum # 1 161 1 161 162 235 85.0 2e-62 MELVLDFSCELDNEKYNEFIDKLYEDSYLENYIKKVLEIEEVKAERPLYLSVLLTDNKNI QVINREYRDKDAPTDVISFAYHETDDFNIGPYDTLGDIIISLERVEEQSSEYNHSFEREF YYVLTHGILHILGYDHIEEEDKKVMREREEAILSSFGYTRD >gi|224461351|gb|ACDC01000051.1| GENE 46 42169 - 44241 2319 690 aa, chain - ## HITS:1 COG:FN0745 KEGG:ns NR:ns ## COG: FN0745 COG1480 # Protein_GI_number: 19704080 # Func_class: R General function prediction only # Function: Predicted membrane-associated HD superfamily hydrolase # Organism: Fusobacterium nucleatum # 62 690 1 629 629 997 86.0 0 MKKFTIFGFKFLFEVKKKDNSDEEKYSDTYFLKEKVFYLILALFLITISAKIPILFRNNN YMIGDVVKSDIYSPKTIVFRDKIGKDKIIQDMINQLDKDYIYSSDAADIYTNEFDNFHKE 
IIAIKKGNLQTFDYNGFERKMGKAMPETIVKKLLEEDEDKINSTFEKLSEHLKNAYTAGI YKEKNSIRINEPAKSEIENLDAFERDLINYFLIPNYIYDEAKTKNTINEKVSQINDQYIE IKAGTLIAKTGEILTERKIDILDKLGIYNYKMSIFIITLNIIFLLVISSVFNVVTMRFYS RDVLEKKKYKAVMLLMIVTLLVFRIVPNSMIYLVPIDTMLLLLMFIVRPRFSIFLTMMLI SYLLPITDYDLKYFTIQSIAILATGFLSKNIGTRSSVIAIGIQLAIMKILLYLILSFFSM EESFGVALNTIKLFVSGLFSGMFAIALLPYFERTFNILTVFRLIELADLSQPLLRKLSIE APGTFQHSMMVATLSENAVIEIGGDPIFTRVACYYHDIGKTKRPQYYVENQTDGKNLHNN ISPFMSKMIILAHTKEGAEMGKKYKIPKEIRDIMFEHQGTTLLAYFYNKAKEIDPNVQEE EFRYSGPRPQTKESAVILLADSIEAAVRSLDVKDPIKVEEMVRKIVNAKIADNQLSDANI TFKEIEIIINSFLKTFGAIYHERIKYPGQK >gi|224461351|gb|ACDC01000051.1| GENE 47 44258 - 46720 2447 820 aa, chain - ## HITS:1 COG:FN0743 KEGG:ns NR:ns ## COG: FN0743 COG1199 # Protein_GI_number: 19704078 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Fusobacterium nucleatum # 80 820 1 741 741 1180 85.0 0 MDIKDRFSEESLQTIEEYLIENDNKSIIFKATFDENEVIQEPFFLSLYKKKTFEETLTKV KRDEVVIRVTKPNQLYPNDLELELSEELFNRRNIAYCLLSSDLDDFYFIQDIDRTNLEKI DIEDYFSEDGILVNEIKGFEHRHEQEEMAKNIQNAINDNRKIIVEAGTGTGKTLAYLIPA IKWAITNKKKVIIATNTINLQEQLLLKDIPLAKSVIKDEFTYALVKGRSNYLCKRLFTEL SLGKSIDIESFSVEAREQIEYILKWGNKTKTGDKAELPFEVYPDVWELVQSTTELCLGKK CPFRKECFYMKTRMKKMEADILISNHHVFFSDLNVRAETDFDSEYLILPRYDMVIFDEAH NIESVARSYFSVEVSKISFTRLLHRIYQKKSKKKKEKSALTRVEETIDEKYLEKPGDYLE LLKTMKSEIYSLQTIGDEYFDEIRKMFETNTEAPIRKSLNNFEMTKSNFLENLRDKKEFF QAKLAEFLNLMMAFNNVIDEEKDKNPEVINFNNHLKIFKKYIDSFKFINNFSDADYVYWL DINSKRTNVVLTATPLNIAQKLSSVLFENLNRLVFASATIMANGNFEYFKKSLGLDEEEC IECFIESPFDYEHQMSVYIPTDIQDSENLNAFVTDASKFILDILKKTKGKAFILFTSYTM LNQIYYSVVNKLKNSNFEIFLHGEKPRSQLIKEFKEAKNPVLFGTTSFWEGVDVQGENLS NVIITKLPFLVPTDPIVSAISKKIEEDGGNSFSDFQLPEAIIKFKQGVGRLIRKKTDRGN IFILDSRIIKKRYGSAFIKALPSQKNIKILEKDDIIKEIE >gi|224461351|gb|ACDC01000051.1| GENE 48 46742 - 47233 686 163 aa, chain - ## HITS:1 COG:FN0742 KEGG:ns NR:ns ## COG: FN0742 COG4807 # Protein_GI_number: 19704077 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in 
bacteria # Organism: Fusobacterium nucleatum # 8 163 1 156 156 235 79.0 2e-62 MRESEVIMTNNDFLRRLRYALNLKDNVTVQIFKKGGLTVTKEDVVNYLKKDIDEGFKKLS NADLMTFLDGLITFKRGEKKEASPAPKIKITKNNLNNILLRKLRIALAFKSYDMIEVFKL GGVDISEAELNALFRSEDHRNYKECGDKYIRVFLKGLIEYCRD Prediction of potential genes in microbial genomes Time: Thu May 19 23:10:20 2011 Seq name: gi|224461350|gb|ACDC01000052.1| Fusobacterium sp. 2_1_31 cont1.52, whole genome shotgun sequence Length of sequence - 26812 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 6, operones - 5 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 - CDS 18 - 1403 1854 ## COG0297 Glycogen synthase 2 1 Op 2 7/0.000 - CDS 1419 - 2582 1256 ## COG0448 ADP-glucose pyrophosphorylase 3 1 Op 3 6/0.000 - CDS 2620 - 3753 1457 ## COG0448 ADP-glucose pyrophosphorylase 4 1 Op 4 4/0.000 - CDS 3756 - 5594 2176 ## COG0296 1,4-alpha-glucan branching enzyme 5 1 Op 5 7/0.000 - CDS 5622 - 7988 3223 ## COG0058 Glucan phosphorylase 6 1 Op 6 . - CDS 8002 - 9498 1494 ## COG1640 4-alpha-glucanotransferase - Prom 9529 - 9588 9.8 - Term 9618 - 9671 5.9 7 2 Op 1 . - CDS 9685 - 10446 1088 ## PFLU4248 hypothetical protein 8 2 Op 2 . - CDS 10459 - 11235 1215 ## FN0865 hypothetical protein 9 2 Op 3 1/0.500 - CDS 11254 - 11949 804 ## COG0670 Integral membrane protein, interacts with FtsH 10 2 Op 4 . - CDS 11974 - 14484 3291 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 11 2 Op 5 . - CDS 14496 - 14645 200 ## 12 2 Op 6 1/0.500 - CDS 14654 - 15487 1168 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 15508 - 15567 9.0 13 3 Op 1 1/0.500 - CDS 15794 - 16588 1085 ## COG0561 Predicted hydrolases of the HAD superfamily 14 3 Op 2 1/0.500 - CDS 16598 - 17461 1236 ## COG0607 Rhodanese-related sulfurtransferase 15 3 Op 3 . 
- CDS 17442 - 19454 2134 ## COG0337 3-dehydroquinate synthetase - Prom 19631 - 19690 11.0 - Term 19641 - 19691 3.2 16 4 Op 1 35/0.000 - CDS 19699 - 20454 240 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 4 Op 2 33/0.000 - CDS 20456 - 21493 914 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 18 4 Op 3 . - CDS 21505 - 22365 1047 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component - Prom 22395 - 22454 10.8 + Prom 22338 - 22397 11.7 19 5 Tu 1 . + CDS 22542 - 24524 3008 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 24533 - 24598 4.3 20 6 Op 1 . - CDS 24654 - 25010 286 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 21 6 Op 2 . - CDS 25023 - 25844 985 ## FN0872 hypothetical protein 22 6 Op 3 . - CDS 25862 - 26812 1271 ## COG0616 Periplasmic serine proteases (ClpP class) Predicted protein(s) >gi|224461350|gb|ACDC01000052.1| GENE 1 18 - 1403 1854 461 aa, chain - ## HITS:1 COG:FN0853 KEGG:ns NR:ns ## COG: FN0853 COG0297 # Protein_GI_number: 19704188 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Fusobacterium nucleatum # 1 459 1 459 461 827 87.0 0 MKILFATGEAFPFIKTGGLGDVSYSLPKALVQKEKLDVRVILPKYSKISNELLKDARHLG HKEIWVAHHNEYVGIEEVELEGVIYYFVDNERYFRRLNVYGEYDDCERFLFFSKAVVETM DITDFKPDIIHCNDWQTALIPIYLKERGIYDVKTVFSIHNLRFQGFFYNNVIEDLLEIDR AKYFQDDGIKYYDMISFLKAGVVYSDYITTVSDSYAEEIKTPEFGEGIHGLFQKYGYKLS GIVNGIDKASYPLSKKSHKTLKANLQAKLGLEIEEATPLVAIITRLDRQKGLDFILEKFD EMMSLGIQFVLLGTGEKSYEDFFRYKESQYRGYVCSYIGFNQELSTEIYAGADIFLMPSV FEPCGLSQMIAMRYGCIPVVRETGGLKDTVKPYNEYTGEGDGFGFKQANGDDMIKALRYA VTMYRRPEVWKEIIANAKKRDNSWKEPAKRYKEIYQKLLGN >gi|224461350|gb|ACDC01000052.1| GENE 2 1419 - 2582 1256 387 aa, chain - ## HITS:1 COG:FN0854 KEGG:ns NR:ns ## COG: FN0854 COG0448 # Protein_GI_number: 19704189 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: 
Fusobacterium nucleatum # 1 387 1 387 387 664 87.0 0 MIRNYMAIIYLGDNKQNISPLTKVRSLASIPVGGSYRIIDFALSNVVNSGIRNVGLFCGN EELNSLTDHIGMGAEWDLARKKDGIFIFKRMLDDDLSLNQSRISKNMEYFFRSTQEHVVV LNGHMICNLDISDLIEKHKESGKEITMVYKKVKKANEHFNNCSSVKIDENNRVIGIGQNL FFREEENISLDAFVLSKELMLKLLIDSIQEGKYNVLSEIIARKLPSLNINAYEFKGYLQC INSTKEYFNFNMNILKKEIREDVFGLKSGRRILTKVKDTPPTIFKETAEVENSLISNGCI IEGKVINSVLSRGTIVEKDVVLEECVILQDCHIKAGSHLKNVIVDKNNIIHENEKLSASE EYPLVIEKGMKWNTKEYKDLMDYIKNK >gi|224461350|gb|ACDC01000052.1| GENE 3 2620 - 3753 1457 377 aa, chain - ## HITS:1 COG:FN0855 KEGG:ns NR:ns ## COG: FN0855 COG0448 # Protein_GI_number: 19704190 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 1 376 3 378 384 686 89.0 0 MKRKKMIAMILAGGQGSRLKQLTEDLAKPAVAFGGKYRIIDFTLTNCSHSGIDTVGVLTQ YEPHILNNHIGRGSPWDLDRMDGGVTVLQPHTRKNDEKGWYKGTANAIYQNIKFIEEYNP EYVLILSGDHIYKMNYDKMLQYHIEKKADVTIGVFRVPLKDAPSFGIMNTRDDMTIYEFE EKPKEPKSDLASMGIYIFNWEELKKYLEEDEHNPNSDNDFGKNIIPNMLNDGKKLVAYPF EGYWRDVGTIQSFWDAHMDLLSENNELDLFDKNWRINTRQGIYTPSYFETGSKIKNSLID KGCLVEGDIEHSVIFSGVKIGKNSKIIDSIIMADTEIGDNVTIRKAIIANDVKVADNVVI GDGKEIAVVGEKKVIDK >gi|224461350|gb|ACDC01000052.1| GENE 4 3756 - 5594 2176 612 aa, chain - ## HITS:1 COG:FN0856 KEGG:ns NR:ns ## COG: FN0856 COG0296 # Protein_GI_number: 19704191 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Fusobacterium nucleatum # 1 607 4 611 611 1084 87.0 0 MSGQMEHYLFHRGEYRQAYEYFGAHPNRSSTIFRIWAPAAKSVAVVGDFNNWNAREEDYC QKITNEGIWEVEIKKVKKGAIYKFQIETSWGQKILKADPYAFYSELRPQTASVVNGIPKF RWGDKKWLNNREIGYAKPINIYEVHLGSWKKKEDGTYYNYREIAELLVEYMLEMNYTHIE IMPITEYPFDGSWGYQATGYYSVTSRYGTPEDFMYFVNYFHKNNLGVILDWVPGHFCKDA HGLYRFDGSACYEYEDQNLGENEWGTANFNVSRNEVRSFLVSNLYFWIKEFHIDGVRMDA ISNMIYHKDGVSENRASIEFLQYLNQSLHENHPDIMLVAEDSSAWPLVTKYQADGGLGFD FKWNMGWMNDTLKYIEQDPFFRRSHHGKLTFSFMYAFSENFILPLSHDEIVHGKNAILNK MPGYYEDKLAHVKNLYSYQMAHPGKKLNFMGNEFVQGLEWRYYEQLEWQLLKDNKGSKDI 
QNYVKALNTLYLEEKALWHDGQNAFEWIEHENIDENMLIFLRKDPDTDDFIIAVFNFSGK DQDKYPVGVNLEGEYECILDSNEKRFGGSYQGRKRNYKTIKGAWHNREQHIEVKIAKNST IFLKHKKGNEED >gi|224461350|gb|ACDC01000052.1| GENE 5 5622 - 7988 3223 788 aa, chain - ## HITS:1 COG:FN0857 KEGG:ns NR:ns ## COG: FN0857 COG0058 # Protein_GI_number: 19704192 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Fusobacterium nucleatum # 1 788 1 788 789 1437 91.0 0 MEFNKEKWKKKLEEKLLEKFSVSLKEATSFEVYRALGETVISFIARDWYETKEKYSKTKQ AFYLSSEFLMGRALGNNLINLGIDKEVREFLEEIGIDYNQVEDEEEDPALGNGGLGRLAA CFMDSLATLNLPGQGYSIRYRNGIFNQYLRDGYQVEKPETWLKYGDVWSIMRPEDEVIVN FGNSSVRALPYDMPVIGYGTKNVNTLRLWEAHSINDLDLGVFNQQDYLHATQDKTLAEDI SRVLYPNDSTDEGKKLRLRQQYFFVSASLQDIIKNFKKVHGREFTKIPEFIAIQLNDTHP VIAIPELMRILVDIEGVLWEDAWEIVKKTFSYTNHTILAEALEKWWVGLYQEVVPRIFQI TEGIHNQFKNELAQLYPNDQDKQNRMQIIQGNMIHMAWLAIYGSHKVNGVAELHTEILKE RELRDWYELYPEKFLNKTNGITQRRWLLKSNPQLASYITELIGDAWIKDLSELKKLEQFL DDKNVLDRIWDIKIEKKKELVEYLRETQGIDINPNSIFDVQVKRLHEYKRQLLNIFQVYN LYQQLKQNPSMDFTPTTYIFGAKAAPGYKVAKGIIRLINDVAQIINGDNDVKDKLKVVFV ENYRVTVAEKIFPAADISEQISTAGKEASGTGNMKFMLNGALTLGTLDGANVEIAKEAGE ENEYIFGMRVEDIDALMKKGYDPRFPYNNVSGLKQVVDALIDGSLSDLGSGIYREIHSLL MERGDQYFVLEDFEDYRKTQRTINREYKDKYSWAKKMLKNIANAGKFSSDRTILEYANEI WDIKETKI >gi|224461350|gb|ACDC01000052.1| GENE 6 8002 - 9498 1494 498 aa, chain - ## HITS:1 COG:FN0858 KEGG:ns NR:ns ## COG: FN0858 COG1640 # Protein_GI_number: 19704193 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Fusobacterium nucleatum # 1 498 1 498 506 804 83.0 0 MKRECGVLLAISSLPSAYGIGDFGKEAYRFVDFLEASGQSLWQILPLCPVEYGNSPYQSP STFAGNFLYLDLEELVHNEYLTQEDIDILKSEVSSVDYEYIKSQKESLLKKASQAFFCKN AEESEFKKFQSENQFWLEDYSLFLSLNKKFKGKMWNTWEKGYKFRERKSIEEAKKEFEED YKYESFIQYYFYKQWKKLKDYANSKGIKIIGDLPIYVANNSADTWQHPKLFCFNKHLKIK AVAGCPPDYFSKNGQLWGNVLYGWEAMKKENYSWWEQRIKHSFLLYDVLRLDHFRGFASY WAIRYGEKTAINGRWKIGPRIQFFRNLERKLKNIDIIAEDLGTLTADVFKLLRQTNYPNM KVLQFGLTEWDNMYNPKNYTENSVAYTGTHDNMSMVEWYSTLNKNEKFICDENLKNFLKD 
YNTNMWEPIQWRAIEALYASKSNRVIVPLQDILGLGADSRMNTPSTVGDNWAWRVYWEYR HGDLENKLYNLAKRYQRI >gi|224461350|gb|ACDC01000052.1| GENE 7 9685 - 10446 1088 253 aa, chain - ## HITS:1 COG:no KEGG:PFLU4248 NR:ns ## KEGG: PFLU4248 # Name: not_defined # Def: hypothetical protein # Organism: P.fluorescens_SBW25 # Pathway: not_defined # 11 253 3 231 231 176 42.0 8e-43 MEKSDALKKKIREIKEKIARPCTEFETKNFDYDDENKVSWIGRVFLCKEDEVEERPKDDK GETMYPLAQFYLSNLPYLPESLKKFEYITVFMGEDFPEYNDIDGLVSRNGKGWILRTYTK DDVLVKNEYLRDDNFCPKAYPLEAKFHAEDYPIWDGGGLDEDLEIEICDLEEEFDDEVSY YQDIGNDHTYLHKFGGYPSYCQPGLGLEVEEGYNFVFQISSDDVAQYNVVDSGSLMFFYN ENEDKWMMYFDFY >gi|224461350|gb|ACDC01000052.1| GENE 8 10459 - 11235 1215 258 aa, chain - ## HITS:1 COG:no KEGG:FN0865 NR:ns ## KEGG: FN0865 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 258 1 241 241 369 84.0 1e-101 MKKILLLMFSVLCVNSFSYVERNDQVGNRGLELMRESNINQNMGLSKESGSTQIIDAYAG NGKFSKTKGFMIGTTSNLVAYPNITAGVTVAYDKYKYKPESNDYWGRDYDLNTYFSYKLD KNLFTLGLGYSQSRHVEKRGYTGNLEYGRFLTPSTYLYAGIEGQNRNYKNSEDLNFVNYK VGVLRQDTWKKMKFVNGVEVNMDNKKYDREERGRGNVTFVSRASYYIYDDLLFDVQYRGT KNSKFYDNVVGVGFTHYF >gi|224461350|gb|ACDC01000052.1| GENE 9 11254 - 11949 804 231 aa, chain - ## HITS:1 COG:FN0866 KEGG:ns NR:ns ## COG: FN0866 COG0670 # Protein_GI_number: 19704201 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Fusobacterium nucleatum # 8 231 1 224 224 206 61.0 4e-53 MYYDMDNMNDIDVRSSNNFLRKVFLYMVLGVAISFGTGIYLYLYNQELLFSLARYFNIMG IAGLGMVLILNFFLEKMSAGMARILFILYSLVIGTIFSTVGFAYSPLAILYAFASALTIF VVMSIYGFFTKEDLSSYRTFLIVGLISLIVMGLINIYLGVGVLYWIETIVGIVIFTGFTA YDVNRIKHISYQLANEEGENVEKLGIRWALELYLDFINLFLYALRIFGRRK >gi|224461350|gb|ACDC01000052.1| GENE 10 11974 - 14484 3291 836 aa, chain - ## HITS:1 COG:FN0867_1 KEGG:ns NR:ns ## COG: FN0867_1 COG1022 # Protein_GI_number: 19704202 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Fusobacterium 
nucleatum # 1 606 1 606 606 996 84.0 0 MSIKFLYDRQKIAITYGEQKISYADVIKYVNFYSDFLDIEKGDRSALMMENRPESIFSFF SIWARKGIAISLDAGYTVDQLAYVLGDSEPKYLFVSNKTKEVAEAANSKLNNAIKIINVD ELQMPADYKIKQEEFSNDSNEDVAVLVYTSGTTGNPKGVMITYENIETNMAGVRAVDLVN ENDVILAMLPYHHIMPLCFTLILPMYLGVPIVLLTEISSASLLKTLQENRVTVIVGVPRV WEMLDKAIMTKINQSSVAKFMFKLASKTNSMSIRKMLFSKVHKQFGGHIRLMVSGGAKID KNILEDFRTMGFRAIQGYGMTETAPIIAFNVPGRERSDSVGEVIPNVEVKIADDGEVLVK GKNVMKGYYKNEAATKEAFDAEGWFHTGDLGKMDGKYLIIIGRKKEMIVLPNGKNIDPND IEAEIMKNTDLIKEIAVTEYNEQLVAIIYPDFEKLQAQQIVNIKDAIKWEVIDKYNVTAP NYKKIHDIKIIKQELPKTRLGKIRRFMLKDLLEDKVEAPEKKVEKKVIEVPSEIKEKYDI INKYITERYNKDIDLDSHIELDLGFDSLDIVEFMNFLNSTFDIEIVEQDFVDHKTISDII KLVEEKSGITNEKVVEKVDKNENLKKIIDSDSDVKLPPSAKYAKVLKFLFSPLFKFYFRY KYSGKENLGEGAGIIVGNHQSYLDAFMLNNAFTYKELSNNYYIATALHFKSKTMKYLAGN GNIILVDANRNLKNTLQAAAKVLKSGKKLLIFPEGARTRDGQLQEFKKTFAILAQELNVP IYPFVLKGAYEAFPYNKKFPKRYDISVQFLEKIDPQNKTVEELVEETKDKIAKNYY >gi|224461350|gb|ACDC01000052.1| GENE 11 14496 - 14645 200 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFYLKLLIKILERSMTAKDSEILKKLKSGYDLSSEEKKELEELIDNLI >gi|224461350|gb|ACDC01000052.1| GENE 12 14654 - 15487 1168 277 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 1 277 1 277 277 494 92.0 1e-140 MENIITNEQINEAIFLNKKEKIEESLRTTYRKKIWKNFIKAIKEFDLIKDGDKIAVGVSG GKDSLLLCKLFQELKKDRSKNFEVKFISMNPGFEALDVDKFKENLIEMGIDCELFDANVW QIAFEEAPDSPCFLCAKMRRGVLYKKVEELGFNKLALGHHFDDIVETTMINMFFAGTVKT MLPKVPSTSGKMDIIRPLAYVREKDIINFMKYNEIQAMSCGCPIEAGKVDSKRKEVKFLL QELEEKNPNIKQSIFNAMKNINLDYVLGYTNGNKSKK >gi|224461350|gb|ACDC01000052.1| GENE 13 15794 - 16588 1085 264 aa, chain - ## HITS:1 COG:FN0869 KEGG:ns NR:ns ## COG: FN0869 COG0561 # Protein_GI_number: 19704204 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # 
Organism: Fusobacterium nucleatum # 1 264 7 270 270 437 88.0 1e-123 MKLVVSDLDGTLLNDDSEVSLETIQAIKRLKEKGIEFAIATGRSFNSANKIRKKIGLEIY LICNNGANIYNKNGELIKNNVMPADLIRKVVRFLTENKIGYFGFDGSGANFYVPYGTEID DEFLKEHIPHYIKNSEDIDRLPALEKILIIEEDSERIYEIKDLIHDNFDDELEIVISADD CLDLNIKGCSKRGGVEYISQELEINPKEIMAFGDSGNDYKMLKYVGHPVAMKDSFMSKRD FENKTDFTNDESGVAKYLQQYFNL >gi|224461350|gb|ACDC01000052.1| GENE 14 16598 - 17461 1236 287 aa, chain - ## HITS:1 COG:FN0870 KEGG:ns NR:ns ## COG: FN0870 COG0607 # Protein_GI_number: 19704205 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Fusobacterium nucleatum # 48 287 1 240 240 389 86.0 1e-108 MIDIVSNISGYFDRDFENIIYKDLRTNGLSDEEVEKILSDKYRDLPIMEENIFKLNNYKL GSIGFTSRELENLKIDFCEEKLLSNDYNGENPTNQIVYLKVLFDKESKKILGCQIANERN IEARLKAVKAIMEKGGDLKDLMKYKVNPTDNEWNPDILNLLALTALGKDKEVSTDVEAKD IETLSKNKEFLLDVREEYEYQAGHVKGAINLPLREILSQKDSLPKDRDIYVYCRTAHRSA DAVNFLKSLGFDKVHNIEGGFIDISFNEYHKDKGNLENSIVTNYNFD >gi|224461350|gb|ACDC01000052.1| GENE 15 17442 - 19454 2134 670 aa, chain - ## HITS:1 COG:FN0871_1 KEGG:ns NR:ns ## COG: FN0871_1 COG0337 # Protein_GI_number: 19704206 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Fusobacterium nucleatum # 1 350 1 350 350 587 90.0 1e-167 MKKIFDDIYVGSNIISKLNDYTKDFDKVLVFSNEIIADLYFEKFKSILIEKDKIFYFTIK DGEEYKNIESILSVYDFMLENNFSRKSLVISLGGGVICDMGGYISATYMRGIEFIQVPTS LLAQVDASVGGKVAINHPKCKNMIGSFKSPYRVLIDVEFLKTLAEREFKSGMGELLKHSF LTKDKKYLEYVENNVEKIKALDNEVLENIVEQSIRIKKHYVDIDPFEKGERAFLNLGHTY AHALESFFAYKAYTHGEAVAKGIIFDLELSLLRGQIDEAYLERARNIFNLFNIDTDLIYL DSDKFIPLMRKDKKNSFDKIITIILDAQGNLSKTEVKEDEIIKIIAKYENNFLRASIDIG TNSCRLFIAEVKEIDNEIIFKKEIHKDLEIVKLGEDVNKNKFLKEEAIERTLKCLKKYKE LIDEYSIEEKNIICFATSATRDSSNRDYFIKKVYDEAKIKINCISGDEEAYINFKGVISS FDKNFKENILVFDIGGGSTEFTLGNTNGIEKKISLNIGSVRITEKFFLENGRYNYSEENR NKAKEWIKENLEKLEEFKNESFILVGVAGTTTTQVSVREKMEVYDSEKIHLSNLTTEEIS DNLDLFIKNIKNDKNVKGLDVKRRDVIIGGTIILKEILKYFKKDSLVVSENDNLMGAILE GVNENDRYSK 
>gi|224461350|gb|ACDC01000052.1| GENE 16 19699 - 20454 240 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 223 1 227 245 97 27 1e-19 MIEVKNLSFSIENRKIIQDISIVVNKSQFVGIIGANGSGKSTLLKNIYRFFKYDSGDIKL KNIELYDYSSKDLAKEMAVLAQKQNMNFDFSVEEIVEMGRYAYKNSIFEVEKNKNSEFVG NALNAVGMYNMKDRSFLSLSGGEMQRVLIARALAQNTEILILDEPTNHLDIKYQIQIMKL VKETKKTILAVIHDMNIASSYCNYIYALKDGKVYYQGSPEEIFTKEKIKNIFDVEADVLI HPKNKKPLIVF >gi|224461350|gb|ACDC01000052.1| GENE 17 20456 - 21493 914 345 aa, chain - ## HITS:1 COG:FN0884 KEGG:ns NR:ns ## COG: FN0884 COG0609 # Protein_GI_number: 19704219 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 1 345 1 345 345 504 91.0 1e-143 MKRCIKNYKSLSLLLFIILILISTFSITIGSVSLKNLDVWKIIINKFFDYNFFNVTWEES SEIIVWTLRAPRIVTAILAGASLSFVGILMQALTKNPLASPYILGISSGASTGAVLVILI FSGSYIFVSIGAFILGTLTAFLVFYFANSNGFSSTKLVLVGAAISAIFSGLTSLIIAITP NERALRSALFWMSGSLAGSTWEYIPFLFISLIVVFILVYPKYDELNILVTGDENAISLGV DVKKIRFLIMITSTFLTGIVVANTGIIGFVGLVIPHITRGIVGGNHKKVIPIAIFLGAIF LVLTDTLTRTLMPSQEIPIGVITSLLGAPFFLSMLRGKSYRFGGE >gi|224461350|gb|ACDC01000052.1| GENE 18 21505 - 22365 1047 286 aa, chain - ## HITS:1 COG:FN0885 KEGG:ns NR:ns ## COG: FN0885 COG0614 # Protein_GI_number: 19704220 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 286 1 286 286 473 94.0 1e-133 MKKILFCFLLLSSLCFAKVPKRAVSAAHFTTEILLSIGAEKQMVGTAYPDNPILPSLKEK YDKIPILSMKNPTKEQFYAVKPDFLTGWDSTVQDKNLGPIKELEKNGVQVYIMKSLHSSD INLVFEDILNYGKIFNLEDNAKKVVNKMKADLAAVQKKLPKNKVKVFTYDSGDKTPFVVG GDSIGNTIITLAGGDNIFKNIKKAWADGNWEKVLVENPDIILIIDYGDQSAESKIKFLKE KSPIKDLKAVKNNKFVVIELADITAGIRNVDAIKKLAKAFHNITIK >gi|224461350|gb|ACDC01000052.1| GENE 19 22542 - 24524 3008 660 aa, chain + ## HITS:1 COG:FN0886 KEGG:ns NR:ns ## COG: FN0886 COG1629 # Protein_GI_number: 
19704221 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 1 660 1 660 660 1109 87.0 0 MFKRIIVLSLILATAIYANDEAEVRLNESVITAQNFKTTVRNTASNVTIVTAKDIEEKGA QNLVDALRMVPGVMVKNYYGNITFDIGGYSSVHAERNSIITYDGVRISSKEATNIPISSI ERVEVIPNGGGILYGDGANGGVINILSKNIYGKDSNKKVSGNVRTEYGSRGSYKYGLSTN AKATDRLTFQVDYSKDKYRSERNSDKNGKIVSRSQEVSVDAKYKFDNADLVVKYTRNEKH RADGGDLEEADYYKDRKMVSWAARDFTRSNDWYINYRQNIGDNTELLTYVDLYDSKENDD VTKILDRDYARKTVKLQLKHKYFNNHYFIVGADYMNEKLKGLDSNGGYTGRNTEKTDYGV FTINELKFGKFTFAQGLRFNKAKYDFYWREKYPVPRNIRGEHGEQEYKNYAANLELRYDY SNTGMVYGKWSRDFRTPLAREMYYTLDGSKLKAQTQNTFEIGVKDYIAGSYVSLSTFYKK TNGEIYYEGTPNKESTRPGAVVFPYYNMGDTRRIGVELLTQQYLNNFTFTESVSYLNHKI VDSDFESRKGKEIPMVPNWKLGFGIGYKFNDKLNLNADIIYYGKFYDSDDPENIRPKDKG NYATVSVSANYKFENGFALNARVNNLFDRKYEDYVGYWDGTRQYSPATGRYYSIGVSYTF >gi|224461350|gb|ACDC01000052.1| GENE 20 24654 - 25010 286 118 aa, chain - ## HITS:1 COG:HP1225 KEGG:ns NR:ns ## COG: HP1225 COG0239 # Protein_GI_number: 15645839 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Helicobacter pylori 26695 # 4 118 5 126 130 66 40.0 1e-11 MLKFLYVGLGGALGAILRYSFSFLPIASNKTIFINIIGAIVIGFVSFFSKNIKVLDHRLV LFLTTGLCGGFTTFSTFSLETVQLIEKNEYFLALLYSLGTVSLSLVGIYAGYYLAKLF >gi|224461350|gb|ACDC01000052.1| GENE 21 25023 - 25844 985 273 aa, chain - ## HITS:1 COG:no KEGG:FN0872 NR:ns ## KEGG: FN0872 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 273 1 257 257 442 85.0 1e-123 MKNRIYKILVVFLLFSLQSFLYAEMKYLNKKGMTVETRYSVPNGYKRVSVEKGSFAEFLR NQKLKPYGEKALYYNGKEKASRGIYDSVFDVEIGNQDLHQCADAIMLLRAEYFYSKKEYN KINFHFTSGFEAKYSKWMEGYRINVQGKGSYVKKANPSNTYKDFKSYMNMVFAYCGTLSL EKEMKLQSLDKMKIGDAFIKGGSPGHVVLIVDMAENDKGEKIFMLAQSYMPAQQTQILIN PSDRNLGVWYSLKGKDVLITPEWDFPVNQLRTF >gi|224461350|gb|ACDC01000052.1| GENE 22 25862 - 26812 1271 316 aa, chain - ## HITS:1 
COG:FN0873 KEGG:ns NR:ns ## COG: FN0873 COG0616 # Protein_GI_number: 19704208 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 1 316 179 494 494 521 90.0 1e-148 ANSEKAKELGLIDGVSTYEKIGVDYDEDTVDFGEYISAYKRKKNKSKNTIAVINLEGEID TRESREAVINYDNVVEKLETLEDIKNLKGLVLRINSPGGSALESEKIYQKLKKLEIPIYI SMGDFCASGGYYIATVGKKLFATPVTLTGSIGVVILYPEFSETINKLKVNMEGFSKGKGF DIFDVFSKLSEESKEKIVYSMNEVYSEFKAHVMEARNISEEDLEKIAGGRVWLGSQAKEN GLVDELGTLNDCIDSLAKDLELKDFKLAYIRGRQSIAEIVSAMKPQFIKSDIVEKMEMLK SYSNKILYYDESLENL Prediction of potential genes in microbial genomes Time: Thu May 19 23:10:50 2011 Seq name: gi|224461349|gb|ACDC01000053.1| Fusobacterium sp. 2_1_31 cont1.53, whole genome shotgun sequence Length of sequence - 43421 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 12, operones - 9 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 30 - 1367 1910 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase 2 1 Op 2 4/0.000 - CDS 1380 - 2039 721 ## COG0132 Dethiobiotin synthetase 3 1 Op 3 . - CDS 2029 - 3111 1228 ## COG0502 Biotin synthase and related enzymes - Prom 3155 - 3214 14.0 - Term 3251 - 3304 9.7 4 2 Tu 1 . 
- CDS 3309 - 3617 671 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 3686 - 3745 7.9 - Term 3643 - 3686 0.0 5 3 Tu 1 1/1.000 - CDS 3761 - 5053 1637 ## COG2252 Permeases - Prom 5073 - 5132 3.5 - Term 5083 - 5130 7.0 6 4 Op 1 1/1.000 - CDS 5135 - 6022 184 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 7 4 Op 2 1/1.000 - CDS 6022 - 7122 968 ## COG0772 Bacterial cell division membrane protein 8 4 Op 3 1/1.000 - CDS 7142 - 7582 839 ## COG0756 dUTPase 9 4 Op 4 1/1.000 - CDS 7583 - 8809 1477 ## COG0612 Predicted Zn-dependent peptidases 10 4 Op 5 22/0.000 - CDS 8828 - 9919 1328 ## COG0795 Predicted permeases 11 4 Op 6 . - CDS 9919 - 10896 931 ## COG0795 Predicted permeases - Prom 10925 - 10984 4.9 12 5 Op 1 . - CDS 11007 - 11546 495 ## FN1032 hypothetical protein 13 5 Op 2 . - CDS 11562 - 12743 607 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative - Prom 12851 - 12910 15.1 - Term 12858 - 12910 12.1 14 6 Op 1 . - CDS 12935 - 14218 722 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 15 6 Op 2 . - CDS 14220 - 14693 350 ## Amet_0559 tripartite ATP-independent periplasmic transporter DctQ 16 6 Op 3 . - CDS 14706 - 15731 274 ## PROTEIN SUPPORTED gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 17 6 Op 4 . - CDS 15750 - 16472 1060 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 16603 - 16662 12.8 - Term 16483 - 16544 0.1 18 7 Op 1 3/0.000 - CDS 16773 - 17936 1085 ## COG4924 Uncharacterized protein conserved in bacteria 19 7 Op 2 . - CDS 17909 - 21283 3060 ## COG4913 Uncharacterized protein conserved in bacteria 20 7 Op 3 . - CDS 21267 - 21860 674 ## Acid_1049 hypothetical protein 21 7 Op 4 . 
- CDS 21853 - 23331 1433 ## Pcar_1811 hypothetical protein 22 7 Op 5 1/1.000 - CDS 23357 - 24412 1085 ## COG0598 Mg2+ and Co2+ transporters 23 7 Op 6 1/1.000 - CDS 24429 - 24989 740 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) 24 7 Op 7 1/1.000 - CDS 25001 - 26248 1586 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase - Prom 26268 - 26327 5.0 - Term 26263 - 26318 12.1 25 8 Op 1 . - CDS 26343 - 26891 173 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 26 8 Op 2 . - CDS 26923 - 28092 1641 ## FN0336 hypothetical protein - Prom 28153 - 28212 7.2 27 9 Tu 1 . - CDS 28284 - 28604 270 ## FN0337 hypothetical protein - Prom 28651 - 28710 7.7 - Term 28712 - 28746 6.2 28 10 Op 1 10/0.000 - CDS 28752 - 29771 1616 ## COG4211 ABC-type glucose/galactose transport system, permease component 29 10 Op 2 16/0.000 - CDS 29789 - 31291 192 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 31314 - 31373 6.1 - Term 31318 - 31356 5.5 30 10 Op 3 1/1.000 - CDS 31378 - 32406 1663 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 32427 - 32486 12.6 31 11 Op 1 . - CDS 32612 - 33559 458 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 32 11 Op 2 . - CDS 33576 - 34715 1202 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 33 11 Op 3 2/0.000 - CDS 34730 - 35653 387 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 34 11 Op 4 . - CDS 35675 - 36298 855 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 36332 - 36391 13.6 + Prom 36386 - 36445 9.1 35 12 Op 1 . 
+ CDS 36475 - 37824 192 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 36 12 Op 2 5/0.000 + CDS 37839 - 38711 1014 ## COG1660 Predicted P-loop-containing kinase + Term 38718 - 38753 1.1 37 12 Op 3 1/1.000 + CDS 38773 - 40467 1801 ## COG0322 Nuclease subunit of the excinuclease complex 38 12 Op 4 1/1.000 + CDS 40532 - 42058 1486 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 39 12 Op 5 1/1.000 + CDS 42073 - 42984 1037 ## COG3872 Predicted metal-dependent enzyme 40 12 Op 6 . + CDS 43007 - 43421 370 ## COG1959 Predicted transcriptional regulator Predicted protein(s) >gi|224461349|gb|ACDC01000053.1| GENE 1 30 - 1367 1910 445 aa, chain - ## HITS:1 COG:FN1002 KEGG:ns NR:ns ## COG: FN1002 COG0161 # Protein_GI_number: 19704337 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Fusobacterium nucleatum # 1 443 7 449 452 845 90.0 0 MINNLSELQKKDLKYVFHPCAQMKDFEKNPPLVIKKGEGLYLIDENGNRYMDCISSWWVN LFGHCNPRINKVISEQVNTLEHIIFANFAHEPAAELCEELTKVLPRGLNKFLFSDNGSSC IEMALKLSFQYHLQTGNPQKTKFLSLENAYHGETIGALGVGDVDIFTETYRPLIKEGRKV RVPYVNSKLSNEEFTKLEDECIKELEEIIEKNHNELACMIVEPMVQGAAGIKIYSARFLK AVRDLTKKHNIHLIDDEIAMGFGRTGKMFACEHAGIEPDMMCIAKGLSSGYYPIAMLCIT TDIFNAFYADYKEGKSFLHSHTYSGNPLGCRIALEVLRIFKEDNVLNTINEKGKYLKEKM NEIFKGKSYIEDIRNIGLIGAIELKDNLLPNVRVGKEIYNLALKKGVFVRPIGNSVYFMP PYVITYEEIDKMLEVCKEAIEELCL >gi|224461349|gb|ACDC01000053.1| GENE 2 1380 - 2039 721 219 aa, chain - ## HITS:1 COG:FN1001 KEGG:ns NR:ns ## COG: FN1001 COG0132 # Protein_GI_number: 19704336 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Fusobacterium nucleatum # 1 219 1 219 219 373 87.0 1e-103 MNFKDFFVIGTDTDVGKTYVSTLLYKALRKHNFQYYKPIQSGCFLRDNKLIAPDVDFLTK FVDIPYDDSMVTYTLKEEVSPHLASEMEGTVIEIENVKKHYEELKKQYSNIIVEGAGGLY VPLIRDKFYIYDLIKMWNLPVVLVCGTKVGSINHTMLTLNALNTMGIKLEGLVFNNYKGQ FFEDDNIKVILELSKVKNYLIIKNGQKEISDEEIETFFN >gi|224461349|gb|ACDC01000053.1| GENE 3 2029 
- 3111 1228 360 aa, chain - ## HITS:1 COG:FN1000 KEGG:ns NR:ns ## COG: FN1000 COG0502 # Protein_GI_number: 19704335 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Fusobacterium nucleatum # 1 360 1 360 360 620 88.0 1e-177 MLKEKNSAGGGKFNFFNLSKEKESELAESINVKEFISYLKDKIINEKYEITREEAIFLSR IPNNDMETLNLLFDAADQIREAFCGKYFDLCTIINAKSGKCSENCKYCAQSSHFKTGAEV YGLVSKELALCEAKKNEVEGAHRFSLVTSGRGLRGSEKELDKLVEIYKYIGENTDKLELC ASHGICTKEALQKLADAGVLTYHHNLESSRRFYPNVCTSHTYDDRINTIKNAKAVGLDVC SGGIFGLGETIEDRIDMALDLRELEICSVPINVLTPIPGTPFENNDAVEPLEILKTISIY RFIMPETYLRYGGGRIKLGDYVKTGLRCGINSALTGNFLTTTGTTIEKDKKMIEELGYEL >gi|224461349|gb|ACDC01000053.1| GENE 4 3309 - 3617 671 102 aa, chain - ## HITS:1 COG:FN1024 KEGG:ns NR:ns ## COG: FN1024 COG0776 # Protein_GI_number: 19704359 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Fusobacterium nucleatum # 1 92 1 92 102 103 89.0 6e-23 MTKKEFVDAFAKKGELKIKDSERLVAAFLETVEEALLKGEGVRFIGFGSWEVKERAAREV TNPQTKKKIKVEAKKVVKFKVGKPLADKVAEQKVAKKATKKK >gi|224461349|gb|ACDC01000053.1| GENE 5 3761 - 5053 1637 430 aa, chain - ## HITS:1 COG:FN1025 KEGG:ns NR:ns ## COG: FN1025 COG2252 # Protein_GI_number: 19704360 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 1 430 6 435 435 618 86.0 1e-177 MEFLDSYFKISERKSTISHEVMGGITTFLAMAYIIIVNPSVLSLSGMDKGALITVTCLAS FIGTIIAGVWANSPIALAPGMGLNAFFTYTLTLERQVPWQTALGIVFLSGCFFLILSIGG IREKIASSIPVSLRLAVGGGIGLFIAFIGLKGMGIVVANQATFVGIGEFTKTTCVSIIGL LIIIVMEVKKKKGGILIGIIITTILGIVIGDVAIPSKILSLPPSPAPILFKLDIMSAFKL SLIGPIFSFMFVDLFDSLGTLMSCSKEMGLIDDSGEVKNLGRMLYTDAGSTIIGATMGTS TVTAYVESAAGIMLGARTGLAATVTALGFLLSLFFTPLISIVPGYATAPALIVVGIFMFR QVASLEFGDLKILFPAFITIFTMPLTYSISTGLALGFLSYILVHLLTFDFKKLNITLFFI GAICLLHLLV >gi|224461349|gb|ACDC01000053.1| GENE 6 5135 - 6022 184 295 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 
[Lactobacillus helveticus DPC 4571] # 90 272 83 265 285 75 28 5e-13 MIEFIIDEEYETVRIDRFLRKHLKNIALSEIYKMLRKAKIKVNNKKVSQDYRLVLGDIIF VFLPESFEEKNEETFIELNEIRKEELKSMIAYENENLFIINKNLGDVIHKGSGHDISLLE EFRSYYSNNKINFVNRIDKLTSGLVIGAKNIKTAREIAKEIQLGNILKKYYILVYGKIEK EEFILENYLKKDEEKVIVSDVEKEDYKKSITHYKRINGDNDYTLLEAELKTGRTHQLRAQ LNHLGHTIVGDTKYGKNIKEDIMYLFSYYLKIDLYDLELELRIPNFFLKKYNIQK >gi|224461349|gb|ACDC01000053.1| GENE 7 6022 - 7122 968 366 aa, chain - ## HITS:1 COG:FN1027 KEGG:ns NR:ns ## COG: FN1027 COG0772 # Protein_GI_number: 19704362 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Fusobacterium nucleatum # 1 366 1 366 366 561 89.0 1e-160 MQNSTYLKKLSKFSVFFIANIILLFVISLSTIYSATITKSEPFFIKEIIWFILGLIVFIV VSLIDYRKYYKYSMAIYIFNIIMLLSVLVIGTSRLGAKRWIDLGPLALQPSEFSKLLLIF TFSAYLINNYSDKYTGFKAMFMCFLHIFPVFFLIAIEPDLGTSLVIILIYGMLLFLNKLE WKCIITVFASIAGLIPIAYKFLLKEYQKDRIDTFLNPESDALGTGWNITQSKIAIGSGKI FGKGFLNNTQGKLKYLPESHTDFIGSVFLEERGFIGGSMLLLIYIVLLAQILYIADTTQD KFGKYVCYGVATIFFFHIFVNMGMIMGIMPVTGLPLLLMSYGGSSLVFSFLILGVVQSVK IHRGNK >gi|224461349|gb|ACDC01000053.1| GENE 8 7142 - 7582 839 146 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 1 146 1 146 146 248 92.0 3e-66 MKKIQVKVVREEGVQLPKYETEGSAGMDVRANIKEAITLKSLERVMIPTGLKVAIPEGYE IQVRPRSGLAIKHGITMLNTPGTVDSDYRGELKVIVVNLSNEAYTIEPNERIGQFVLNKV EQIEFVEVEELDDTSRGEGGFGHTGK >gi|224461349|gb|ACDC01000053.1| GENE 9 7583 - 8809 1477 408 aa, chain - ## HITS:1 COG:FN1029 KEGG:ns NR:ns ## COG: FN1029 COG0612 # Protein_GI_number: 19704364 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Fusobacterium nucleatum # 1 408 1 408 408 654 91.0 0 MENIKLKKLDNGITLITEHLPNVSTFSMGFFIKTGAINETKKESGISHFIEHLMFKGTKN RTAKEISEFVDFEGGILNAFTSREVTCYYIKLLSSKMDVALDVLTDMLLNSNFDEESIEK 
ERNVIIEEIRMYEDIPEEIVHEKNIEFALKGIHSNSISGTIASLKKINRKAILKYLEEHY VAENLVIVACGNIDEKYLYKELNKRMKDFRKAKKEEVLDLTYQIKKGKKVVKKPSNQIHL CFTTRGVSNKSELRYPAAIISNILGEGMSSRLFQKIREERGLAYSVYTYLTRFANCGLLS VYVGTTKEDYKEVIKLIKEEFKNIKENGISERELRKAKNKYESAFTFSLESTSSRMNRLA STYLTYGEIISLDKVREDIEKVSLKDIKKAAEFLFDEEYYSQTIVGDI >gi|224461349|gb|ACDC01000053.1| GENE 10 8828 - 9919 1328 363 aa, chain - ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 363 1 363 363 587 86.0 1e-167 MIKKLDIYISKYFIKYFLMNIIGFMGVFLLAQTFKIIKYINQGKLVGGEIFDYIINLLPK MFVETAPLSVLLAGLITISIMASNLEIVSLKTSGIRFLRIVRAPLIIAFIISLFVFFVNN SIYTKSLAKINFYRKGEIDASLRLPTTKENAFFINNTDGYIYLMGNINRETGNAEKIEIV VYDTEISKPVEIITAQSGKYDKDNKKWMLSGVNIYNVETKKNITKVEYDSDRFREDPNNF IRAAAEDPRMLTIKELKKTIKEQKNIGEDTRIYLAELAKRYSFPFASFIVAFIGLSVSSK YVRGGRTTLNLVICVVAGYGYYLVSGAFEAMSLNGILNPFISSWIPNILYFIIGMYFMNR AEY >gi|224461349|gb|ACDC01000053.1| GENE 11 9919 - 10896 931 325 aa, chain - ## HITS:1 COG:FN1031 KEGG:ns NR:ns ## COG: FN1031 COG0795 # Protein_GI_number: 19704366 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 325 35 359 359 517 90.0 1e-146 MMEHIIVKGISVFDVLRLLSFYIPPILTQTIPIGMFLGIMICFTKFSRNSESVAMVSTGM SIRAILKPILAIAIGAAIFIVFLQESIIPRSFIKLKYVGTKIAYENPVFQLKEKTFIDNL DQYSIYVDKVESDGKAKNIIAFEKPQDKTKFPMVLTGEEAFWKDNSIVLKQSQFISFDEN GKKNLTGTFDEKRVVLTPYFENLNLKIKDVEALSITDLIKNIRKVDAEEVLKYKIEIFRK LALIFSTVPLAVIGFCLSLGHHRISKKYSFVLAMIIIFAYIIFLNIGIVMASAGKLHPFI ATWTPNVLLYILGYKLYKAKEVRGI >gi|224461349|gb|ACDC01000053.1| GENE 12 11007 - 11546 495 179 aa, chain - ## HITS:1 COG:no KEGG:FN1032 NR:ns ## KEGG: FN1032 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 179 1 179 179 232 88.0 5e-60 MYLDILILIIFIFGILSGIRNGIFIEVISVFGFAINLLITKIYTPVVLRFLKRSDASFAN NYVITYIVTFITVYLVVSMILVFVKKAFKGLKKGFFNKLMGGIAGFVKAFIASLVIILIY 
TYSSKLAPSLEKYSQGSSAIEIFYEILPNFESYIPDILVEDFNKNATKKIIEKNINTML >gi|224461349|gb|ACDC01000053.1| GENE 13 11562 - 12743 607 393 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 392 1 394 396 238 34 4e-62 MSKIIVKKDKEQKILNFYPNVYKDEIKDMIGTVKTGDIVDIITSDMKFLARAYVTEGTSA FARVLTTKDEKIDKKFIFERIKNAYEKRKHLLEETNSLRAFYSEADYIPGLIIDKFDKYV SIQFRNSGVEVFRQDVIEAVKKYLKPKGIYERSDVENRVIEGVETKTGIIFGEIPERTIM LDNGVKYSIDIVDGQKTGFFLDQRDSRKFIAKYINNQTRFLDVFSSSGGFSMAALKNGAK EVVAMDKDSHALELCYENYKLNEFTADFSTVEGDAFLMLNTLATRNKKFDIITLDPPSLI KKKTDIYKGRDFFLDLCDKSFKLLENGGILGVITCAYHISLQDLIEVTRMAASKNNKLLS VIGVNYQPEDHPWILHIPETLYLKALWVRVEER >gi|224461349|gb|ACDC01000053.1| GENE 14 12935 - 14218 722 427 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 1 426 1 428 431 282 37 2e-75 MEKLLPIILLFVLFFFNVPICFALFTSTFFYFIFINTNTYPDLILQVFVNSAQSFPLLAI PFFIMAGAVMNYSGISSRLMGVAEVLTGHMKGGLAQVNVLLSTLMGGISGSANADAAMEC KILVPEMTKRGYSKEFSAAITAASSAITPVIPPGINLIIYSLIANVSVAKMFIAGYVPGL AMCISLMITVYFIAKKRGYKPIREKRASSKEILKVLKDSFWALFLPFGIIMGMRMGFFTP TEAGAIAVVYCIVIGFFIYKELKIRYFVDIIKETVYGTSTVMFIIIGATVFGQYLNWERI PHLIGEFLTNFTDNKYMFLVIVNLILLFVGMFIEGGAAMIILAPLLIPTAVSLGIDPVHF GIVMIVNIMIGGLTPPFGSMMFLTCSIVRVEIKDFVKECMPFIITLLIVLVIVTFLPQLI LFLPNLI >gi|224461349|gb|ACDC01000053.1| GENE 15 14220 - 14693 350 157 aa, chain - ## HITS:1 COG:no KEGG:Amet_0559 NR:ns ## KEGG: Amet_0559 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ # Organism: A.metalliredigens # Pathway: not_defined # 1 151 3 153 170 122 49.0 4e-27 MKKIFYNLEELIAGFFLIITVTSVVLNVFCRAAGFGTISTSEEIATISFVWSVYIGAVAC YKRKMHIGVDMLVQMFSDKGKKIFTIFLDVFLIVINSVILYLCVIFIMNSQEKPTPVLGI SSNYLNIALLISFFLMVVHSLNFLYQDIKALKKVGEE >gi|224461349|gb|ACDC01000053.1| GENE 16 14706 - 15731 274 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 [Lentisphaera araneosa 
HTCC2155] # 9 308 14 313 340 110 25 2e-23 MKKFRVLSLLSVLAFIMLFTACGGEKKAAEEKKAEPLEIKVSYIFKENEPTHIAMKEATD AINQRLEGQVKFVLFPNGQLPVYKDGLQQVVRGADFIDVDDLSYIGDYVPEFTALAGPML YQNYDEYVKLMHSDLVTDLKKKAEEKGIKIISLDFIFGFRSIISDKEIKEPADLKGMKIR VPASKLFIDTLNAMGASAVPMSFSETISALQQNVIDGLEGSYATNYLTKTYELRKNMSLT KHFLGTAGVYISTKVWDKLTDEQKAIIQEEFDKAAENNNKNLVELDKELVKNLEDAGVKI NEVNLPEFAKLVEPIYKNIGITEEFYKQLMDEMEKIRTEQK >gi|224461349|gb|ACDC01000053.1| GENE 17 15750 - 16472 1060 240 aa, chain - ## HITS:1 COG:FN1891 KEGG:ns NR:ns ## COG: FN1891 COG0584 # Protein_GI_number: 19705196 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Fusobacterium nucleatum # 1 239 22 260 261 400 88.0 1e-112 MKVFAHRGASGYAPENTLVAIKKAIEMKADGIEIDIQLTKDGKIVVMHDWKVDRTTTGRG YVYELDYDYIKTLDAGQWFTKDFIGETVPTLEEVLDILPKDMMLNIEIKDTARHHTNIEE KMLEVLKKYPDKFENIIVSSFHHDKIKKLQVLEPKLKLALLTDSEFIEIEKYLSNNGLSS YSYHPEINLISKEDVEKLHDRGVKIFVWTVNKEEDLNYLVKLGVDGVITNYPDIMKELLS >gi|224461349|gb|ACDC01000053.1| GENE 18 16773 - 17936 1085 387 aa, chain - ## HITS:1 COG:XF2735 KEGG:ns NR:ns ## COG: XF2735 COG4924 # Protein_GI_number: 15839324 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 381 1 393 401 176 27.0 8e-44 MGWTDFKEIKNIILKKWEKGDIARGKLLFPWKIKLKVPTSKEILSDISKFLVWRKELKKT DKSSLGYGYNIIEKEIKNKILGKNPIPAYIEIVSIDDALKLIKKENEMKRILENSSFILE KYPKLQKWIEKNIFKILIEKGEIKKILAVVEYILKNPVKDTYLREISIENIDTKFIENHK KMIFSMLDEILSDDSSLLEEKLSIKKKPQLIRFRILDEKYYIHGFSDISVPIEEFNEWEN NFSNVFFTENEISFLSFPSYKNALIIFSKGYNVHAFKENKWLNKKKLYYWGDMDTHGFNI LGIAREIFPDLKSFLMNEEIFFKHKEFWVVEDKPFLAEVKGLNYEESLFCKKLQENIWGK NLRLEQERINFSYLKEYLKKLEADSNS >gi|224461349|gb|ACDC01000053.1| GENE 19 17909 - 21283 3060 1124 aa, chain - ## HITS:1 COG:XF2734 KEGG:ns NR:ns ## COG: XF2734 COG4913 # Protein_GI_number: 15839323 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 4 1103 9 1129 
1144 669 35.0 0 MEILNDFSEIKKNGFRLHNLEGLNWGTFQGKWNFNPKEETTLLTGESGSGKSTLVDALTT ILVPPYKIKFNQAADVDSKERTILSYVRGYYGQKSNAEGKGNIEALRDYNSYSVILANFY EQTLKQEVGLAVVFYFKETNSTPSKFYIVSEKFLSIDNDFIFRFNDIKILRDSLKQKSYS VFNDYSPYFNYFSKRLGNLTEQSIQLFQKTISMKKMGELSDFIRKNMLEKLDIEVSIEKL IQHYNDLNRAYEATVKARKQLEMLNPISEKGKIYLSEREEKLKFQSAIDRLEIWLAKKKD KLYEEKISNLMEQKKEKEISYDSEDKKFKECRKELENLRLEILQNGGDKLREYNTKLKHE KENFKQKEKRVNDYSSFAKNINLNIPNRIEEFNNNRKILERIAKENGELIKKTNEIRDEK VIENSKIRDERKNIEAELISLRERTSNIPLKYVELKEKIQRALNLEETDISFVGELIEVK SEEKDWEGAIERFLHSFSLTLLIKKEDYAKVVEYVNKNFLNLKLVYYYVDVKKTKYELSY IETFSVLNKIDIKADSIFSAWIKKQLYDRYNYICCDNLEDFRVNKKALTKTGQIKSDSRH EKDDRADINDRRNYIMGFSNKNKIKIWESYLQEKENLLKNLSCDLDKLDEELKKMNNIKN SIDSLERFKDFEDLDILTSKNKIEELNEIIEKLKKDNNILKDLESRAAFLAKEEVKLENK VKNLSSSIIELEKDLSYINKEYQQNKFVSEEDPHFLYLDDYDFLKKHSQFWNDFPLTLEN ISSFQNKYSLYLNRRILDLEEKLKLLEKNIESLMREFKREYPIESQDFDDNVEALKEYNT FLEKILKDELPKYEEKFKTELQENIFRHIIHFKTNLDIQERTIKDKIKEINDSLEGIDYS KGRYIKLIQKRTVDKDIIDFRNSLNDITTNSIDDNNLIEEKFLEIKKIIDRFKGRANETD RDKKWTAYVTDVRNWFHFSATEYWRDNNEEYEHYTDSGGKSGGQKEKLAYTILGASLAYN YGINTKNRTSFRLIIIDEAFLKSSDESARFGLELFKKLDFQFIIVTPLLKINTMEPYIRH VGFVSYNDNSHISTLTNISLSEYIEMKEKMQGRKLEWDGLTLKK >gi|224461349|gb|ACDC01000053.1| GENE 20 21267 - 21860 674 197 aa, chain - ## HITS:1 COG:no KEGG:Acid_1049 NR:ns ## KEGG: Acid_1049 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 8 187 4 185 194 126 41.0 5e-28 MDKINIDLSNAYISLLKGIVIKENKEKIWDTILNYRNQIEDYFRHLGLKLRIYEEEGYCY LQQDEDEELKDLFKLVPKVQLSLHLSLLLATLRKYMYESTANGDEKIVISKEEIFLRMKS YLKETSNEVKQEKEIESYLKKIEEMKFIRKLKHSDDKYEVLRLISSFVDAQWLDDLNKHL VEYKDFIKNGGTDGDSE >gi|224461349|gb|ACDC01000053.1| GENE 21 21853 - 23331 1433 492 aa, chain - ## HITS:1 COG:no KEGG:Pcar_1811 NR:ns ## KEGG: Pcar_1811 # Name: not_defined # Def: hypothetical protein # Organism: P.carbinolicus # Pathway: not_defined # 10 481 3 478 481 340 38.0 6e-92 MEKVDLGNFMSYSYLTSLRKNHPAWKLLTSHQAPFIASFFYAVFLKPKQREIPEGKLVSQ 
LENFIDEINFEEKSSFTPQEILTQWSNSDYAYLRKYYPKDKDEIHYDLTPAAQKAVEYLI SFEQKSFMATESRLRTVFNLLKEIVEKSNENPEFRIQELEKQKKDIENKIEGVKAGKIEI LSPIQIKDRFLQAMNTSHEILSDFRIVEQNFRNLERKLREKIVTWDSGKGELLHNFFEEE EGIQESEQGKSFKAFSDFLSSNESQKDFKNLIEQVLSMKEIIPIQGSLSFERIKDDWMKG SEHVLDTLALLSKQLNLFINENSYAEERRIKEIIKNIEATALQLDGKNIKNFIEMKTFYA DIKFPMNKKLYSIPNKIFLDELALNNEEIDIDFSALYLQLFMDKNKLKDRINNLLLDREE ISLKELVEVYPIKSGLTELLTYFVLAKREINAEIYYDDLFELSWVYENGDSQSAKVPMII FKNSENEDDRDG >gi|224461349|gb|ACDC01000053.1| GENE 22 23357 - 24412 1085 351 aa, chain - ## HITS:1 COG:FN0332 KEGG:ns NR:ns ## COG: FN0332 COG0598 # Protein_GI_number: 19703675 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Fusobacterium nucleatum # 1 351 1 351 351 561 87.0 1e-160 MSNSNRKLGLMPGSVVYTGENPNYNITITVIYYSKDFHKRETFSSTDKIDIDLKFKGNIW INIDGINDVNLIKDIGKMFDIDTLSMEDIANPEQRVKIDDRDTYILIILKMLQMEILTKD VQYEQLSLIIKKNILITFQETPYDPFESIRARLEIASARLRTQDVSYLAYILIDIIVDNY LLILDEVENEIDEIESQLIESADRDDLENILALKQNIAILKKFISPVRELISKLQTRSML NYFHEDMKYYLGDLNDHGIIVFDTVDMLNNRATELIQLYHSMISNTMNEIMKILAIISTI FMPLSFIVGLYGMNFEYMPELKWHYGYYITLGLMASLVILMIFYFKKKKWF >gi|224461349|gb|ACDC01000053.1| GENE 23 24429 - 24989 740 186 aa, chain - ## HITS:1 COG:FN0333 KEGG:ns NR:ns ## COG: FN0333 COG1954 # Protein_GI_number: 19703676 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Fusobacterium nucleatum # 1 186 1 186 186 272 83.0 3e-73 MKIKSILERNPIIPAIKDNITLEKALNSNSEIVFIILANIVNIKEYCDKLRDKNKIIYVH IDMIDGLNSTNNGIDYIMNTIKPDGILTTKSNVVAHAYKNSISVIQRFFILDTLSYEKAL LNIKENKIVAAEIMPGLMPKVIKKLSQKTHIPIITGGLIKEKEDVINAIKAGALSVSTTE TSLWEE >gi|224461349|gb|ACDC01000053.1| GENE 24 25001 - 26248 1586 415 aa, chain - ## HITS:1 COG:FN0334 KEGG:ns NR:ns ## COG: FN0334 COG1448 # Protein_GI_number: 19703677 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Fusobacterium nucleatum # 1 415 1 414 415 694 86.0 0 
MLAKRYTGKKLVDNIFTTSKKAKQAIKKYGKENVINATIGSLYDEEEKFAIYNVVEKVYR NLPSEDLYAYSTNVIGEDDYLEEVIKAIFYDDYKEELKDLLYIASVATTGGTGAISNTIK NYMDTGDKVLLPNWMWGTYKNIVIENGGKIETYQLFDENGNFNFEDFRSKVLELAKTQKN VVLILNEPSHNPTGFRMTYEEWVNLMDFFKSIKDTNLIVIRDVAYFEYDDRTEEETKALR KLLIGLPKNVLFMYAFSLSKSLSIYGMRIGAQIAVSSSEEVIQEFKDAISFSCRTTWSNV PKGGMKLFETIMKNPELKADFLKEKQAYIDLLKERADIFLNEAKEVNLDILPYKSGFFVT IPIGETVDKVIEELESQNIFVIKFDKGIRIGICSVPKRKIVGLAKKIKEAIEKSK >gi|224461349|gb|ACDC01000053.1| GENE 25 26343 - 26891 173 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 77 176 244 343 347 71 35 1e-11 MKKKIFAVLMLALLATACSSKKVIKNTGVGVDSANKYAIEDTEASKKPLEDIIVFDQEGV TIRREGNNLILSMPELILFDFDKYAVKDGIKPSLATLAKALGENKDIHIKIDGYTDFIGT EAYNLELSVKRARAIKDFLISKGAIGSNISIEGYGEQNPADTNKTEAGRSRNRRVEFIIS RG >gi|224461349|gb|ACDC01000053.1| GENE 26 26923 - 28092 1641 389 aa, chain - ## HITS:1 COG:no KEGG:FN0336 NR:ns ## KEGG: FN0336 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 150 389 1 240 240 447 92.0 1e-124 MIATSLNINAAQATSFAETVNDDKVEVIATYDNEMPQEIKNIYNPKHKGEGVNYFDYVFV TARSANLREKPDPKSKVIGKFTYDVKLKLLEKVRYQGNIWYLVEDTKGNKGYIAGSQTKK RDFRFQMALDKIGDLEYFINKSIDEGATLMSVNTYAPNPSNINPQREKDKYGTSLDQNLL GISKKGERIIIPDRSIVKIIENRGDKALVRALSIPEEVEVSKAKLSTYPSIKKGFRKVIA IDIENQNFMVFEKSKQTNEWELISYVYTKTGIDSQLGYETPKGFFTVPVVKYVMPYTDET GQKQGSAKFAIRFCGGGYLHGTPINVQEEVNKEFFLRQKEFTLGTTTGTRKCVRTSEGHA KFLFDWLVNNPNKDSNEQRLSEDAYFIVF >gi|224461349|gb|ACDC01000053.1| GENE 27 28284 - 28604 270 106 aa, chain - ## HITS:1 COG:no KEGG:FN0337 NR:ns ## KEGG: FN0337 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 106 7 112 112 128 78.0 6e-29 MRRFQNATMEYNVAKNELVIRDVNNNGIFFAIDFFENSKQIKRVFSLYPVSVEINKNRIL ELKFSVQNQNGEQSVLNLLLELDQLVSDKRTVINISNDDLSNITLN >gi|224461349|gb|ACDC01000053.1| GENE 28 28752 - 29771 1616 339 aa, chain - ## HITS:1 COG:FN1167 KEGG:ns NR:ns ## COG: FN1167 COG4211 # Protein_GI_number: 
19704502 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Fusobacterium nucleatum # 1 339 1 339 339 540 96.0 1e-153 MFARNNEGKLDYKKIIIESGLYLVLFCMLIAIIIKEPTFLSLRNFKNILTQSSVRTIIAL GVAGLIVTQGTDLSAGRQVGLSAVISGTLLQSMTNVNKAFPKLGEFSIFTTILIVVLVGV IIASINGIVVATLNVHPFIATMGTMTIVYGINSLYYDKAGAAPISGFVDKYSKFAQGYIQ IGSYTIPYLIIYAAIATLIMWILWNKTKFGKNVFAVGGNPEAAKVSGVNVVLTLIGIYAL SGAYYAFGGFLEAGRIGSATNNLGFMYEMDAIAACVIGGVSFYGGVGRISGVITGVIILT IINYGLTYTGVSPYWQYIIKGIIIVTAVAFDSIKYAKKK >gi|224461349|gb|ACDC01000053.1| GENE 29 29789 - 31291 192 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 273 478 17 217 245 78 26 6e-14 MENLKYVLEMENITKEFPGVKALDNVQLKLKPGTVHALMGENGAGKSTLMKCLFGIYEKD NGKILLDGVEVNFKSTKEALENGVSMVHQELNQVLQRNVLDNIWLGRYPMKGFFVDEKKM YEDTINIFKDLDIKVDPRKKVADLPIAERQMIEIAKAVSYKSKVIVMDEPTSSLTEKEVD HLFRIINRLKQSGVAIVYISHKMEEIKMISDEITILRDGKWISTNDVSKISTEQIISMMV GRDLTERFPKKDNTVKEMILEVKNLTALNQPSIQDVSFELYKGEILGIAGLVGSKRTEIV ETIFGIRPKEKGEIILNGKTVKNKNPEDAIKNGFALVTEERRSTGIFSMLDIAFNSVISN LDRYKNKFRLLKNKDMEKDTKWIVDSMRVKTPSYTTKIGSLSGGNQQKVIIGRWLLTEPE VLMLDEPTRGIDVLAKFEIYQLMIDLAKKDKGIIMISSEMPELLGVTDRILVMSNGRVAG IVKTSETNQEEIMELSAKYL >gi|224461349|gb|ACDC01000053.1| GENE 30 31378 - 32406 1663 342 aa, chain - ## HITS:1 COG:FN1165 KEGG:ns NR:ns ## COG: FN1165 COG1879 # Protein_GI_number: 19704500 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 342 1 341 341 580 95.0 1e-165 MKKFGMILGSIILASALVACGEKKEEAKTEAAPAAEKLSIGLTAYKFDDNFIALFRKAFE AEAAAKADTVEVTAIDSQNSVATEKEQIEAVLEKGVKAFAINLVDASAADGIINLLKEKN IPVVFYNRKPSDEAIASYDKLYYVGIDPNAQGIAQGELIEKLWKENPDLDLNKDGVIQYV MLTGEPGHPDAVARTKYSISTLNDHGIKTEELHQDTAMWDTATAKDKMDAWLSGPNGSKI EVVICNNDGMALGAIESMKATGKILPTFGVDALPEALVKIEAGEMAGTVLNDAKGQASAT FNMVVNLAAGKEPTEGTDLKLDNKIILIPSIGIDKSNVADFK 
>gi|224461349|gb|ACDC01000053.1| GENE 31 32612 - 33559 458 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 5 315 6 318 319 181 34 9e-45 MKHYIGIDLGGTNTKIGVVDLEGNLIISKIIKTHSKQKVDKTLERIWETSKELLVKCDIP LFSVLGIGIGIPGPVKEQSIVGFFANFDWEKNMNLKEKMEKLTGIETRIENDANIIAQGE AIFGAAKGKKTSITIAIGTGIGGGIFLNGNLLTGMSGVAGEIGHMKVVKDGKTCGCGQNG CFEAYASASALVKEAKERLKLNEDNLLYKEINGDLEELEAKNIFDAARKGDEFSKDLLEY ESDYLALGIGNLLNIFNPECIVISGGISLAGDEILIPVKEKLKKYTLLPALENLEIKTGV LGNEAGVKGAVALFI >gi|224461349|gb|ACDC01000053.1| GENE 32 33576 - 34715 1202 379 aa, chain - ## HITS:1 COG:XF1739 KEGG:ns NR:ns ## COG: XF1739 COG2220 # Protein_GI_number: 15838340 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Xylella fastidiosa 9a5c # 25 376 26 370 385 313 41.0 3e-85 MKKLFKILFYLLIIVIVFAISIYLFMKTPVFGALPSGKSLEKVKNSKNYIDGEFRNKEKT ELLTDTKKTPIKRLLEFAFEKDPEGTVPKIALPSVKTDLKTLDPNEDLIVWFGHSSLFIQ IAGKKILVDPVFSKYASPVPFSNKAFEGTNIYTVDDLPEIDILLITHDHYDHLDYPTVKK LKDKVAKVIVPLGVDAHLLRWGFDEEKIVTVDWDDEVTIDDNLKIYSLETRHFSGREFSN RNQSLWVSYLIEEKYNDNLYRLFLSGDGGYSPRFKAFKEKFQNIDLAVMEAGQYNEEWAL IHSLPEDIIKEVRDMEVTKLFPIHNSKFKLSKHPWDEPLRKLDDFTINTNIQLLTPMIGE KLYLHKENSFKKWWENLEK >gi|224461349|gb|ACDC01000053.1| GENE 33 34730 - 35653 387 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 294 1 296 306 153 33 1e-36 MEKIYDVVIVGAGPAGLTAGIYTGRGNLSTLILEKEGIGSMIMTHQIDNYPGSHVGASGK EIYDTMKKQALEFGCEIKSATVLGFDPYDEVKIVKTDAGNFKTKYIIIATGLGKIGAKKV KGENKFLGAGVSYCATCDGAFTKGRTVSLVGKGDELIEESLFLTRYAKEVNIFLTSDDLD CSEELKEAILSKENVKIVKKVKLLEIKGEEFVTELDLEVDGNKETVATDFVFLYLGTKNN LELYGEFVSLSDAGYIVTDETMKTRTDKMYAIGDIREKDIRQVATATSDGVIAASFIMKE ILKTKKK >gi|224461349|gb|ACDC01000053.1| GENE 34 35675 - 36298 855 207 aa, chain - ## HITS:1 COG:FN1162 KEGG:ns NR:ns ## COG: FN1162 COG0491 # Protein_GI_number: 19704497 # Func_class: R General 
function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 1 207 1 207 207 356 85.0 2e-98 MKVKCFHLGAYGTNCFLVYDDNNLAYFFDCGGRNLNKLYSYIEEHNLDLKYIVLTHGHGD HIEGLNDLAEKYPEAKVYVGEEEKDFLYNSELSLSDRIFGEFFKFKGELHTVREGDMIGD FKVIDTPGHTIGSKSFYDEKAKILISGDTLFRRSYGRYDLPTGDLNMLCNSLEKLSKLPE ETVVYSGHTEETTIGEEKKFLERVGIL >gi|224461349|gb|ACDC01000053.1| GENE 35 36475 - 37824 192 449 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 103 399 127 407 458 78 25 6e-14 MNKKIIIIGGVAAGMSAASKAKRIDKNLDITVYEMTDAISWGACGLPYYVGDFYPNASLM IARTYEEFEKEGINVKIKHKVENIDFKNKKVFVRNLNENKVFEDNYDELVIATGASSTSP KDIKNLDAEGVYHLKTFNEGLEVKKEMMKKENENIIIIGAGYIGVEIAEAALKLGKNVRI FQHSSRILNKTFDKEITDLLENHIREHEKISLHLNESPIEVRTFENKVIGLKTNKKEYSA NLIIVATGVKPNTEFLKDSGLELFENGAIIIDRFGKTNIPNVYAAGDCATVYHSVLEKNV YIALATTANKLGRLIGENLTGANKEFIGTLGSAGIKVLEFEAARTGITEQEAKDNNINYR TILVDGEDHAAYYPGGEDVYIKLIYHADTKILLGAQVAGKRGAALRADSLAVAIQNKMTT QELANMDFLYAPPFATTWDIMNVAGNVAK >gi|224461349|gb|ACDC01000053.1| GENE 36 37839 - 38711 1014 290 aa, chain + ## HITS:1 COG:FN1089 KEGG:ns NR:ns ## COG: FN1089 COG1660 # Protein_GI_number: 19704424 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Fusobacterium nucleatum # 1 290 1 290 290 501 90.0 1e-142 MKTKHIIIVTGLSGAGKTTALNILEDMNYYTIDNLPLGLEKSLLDTEIEKLAVGIDIRTF KNTKDFFKFINYIKESGVKMDIVFIEAHEAIILGRYTLSRRAHPLKENTLLKSILKEKEI LFPIREIADLIIDTTEIKNVELEKRFKKFLSGKDELNIDINMNIHIQSFGYKYGIPTDSD LMFDVRFIPNPYYIEKLRDMNGYDEEVKDYVLSQKESKDFYSKLLPLIEFLIPQYIKEGK KHLTISIGCSGGQHRSVTFVNKLAEDLKNSKVLNHINIYASHREKELGHW >gi|224461349|gb|ACDC01000053.1| GENE 37 38773 - 40467 1801 564 aa, chain + ## HITS:1 COG:FN1090 KEGG:ns NR:ns ## COG: FN1090 COG0322 # Protein_GI_number: 19704425 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Fusobacterium nucleatum # 1 564 26 
589 589 933 90.0 0 MGKAKNLKNRVSSYFNRVHESEKTNELVKNIEDIEFFLTNTEIDALLLENNLIKKYSPKY NILLKDEKTYPFIKISKEDFPSIKIVRTTKALDIKTGEYFGPYPYGAWRLKAVLMKLFKI RDCNRDMKKKSQRPCLKYYMKSCTGPCVYKDIKEDYDNDIESLKQVLRGNSSKLISELSI LMNKSAEEMDFEKSIIYREQIKELKNIANSQIIQYERELDEDIFVFKTILDKAFICVLNM RDGKILGKTSTSLDLKNKITDNVFEAIFMSYYSKHILPKSLVLDAEYENELAVVVEALTL EDSKKKEFHFPKIKSRRKDLLEMAYKNLERDIETYFSKKDTIEKGIKDLHDILNLKRFPR RIECFDISNIQGKDAVASMSVSIEGRAAKKEYRKFKIRCKDTPDDFSMMREVIERRYSKL ADKDFPDVILIDGGLGQINSAGEVLKKLGKIHLSELLSLAKRDEEIYKYGESVPYSLSKD MEALKIFQRVRDEAHRFGITYHRKLRSKRIISSELDRIEGIGEVRRKKLLTKFGSVSAIK KASIEELKEIVPEKVALEIKNHIK >gi|224461349|gb|ACDC01000053.1| GENE 38 40532 - 42058 1486 508 aa, chain + ## HITS:1 COG:FN1091 KEGG:ns NR:ns ## COG: FN1091 COG2208 # Protein_GI_number: 19704426 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Fusobacterium nucleatum # 61 507 1 447 447 709 87.0 0 MIIAFYMIVAFLIFIFFTYIYIKKLVNHYINEELKIISGLNDKERLNDLPDNIKTEYTQT LEKIIKQENELNNSIDELKVYRDELDVTYSTLVSKSSQLEYTNSLLEKRVRNLSNLNHIS RVALSMFNIDKIVETLADAYFVLTATSRISIYLWEGDTLVNKKIKGSIDYTESISYPMNL LSKFTNEDFSKIYSDLSRKITILNDEKVIITPLKVKERQLGVIFLVQNKDQLLEINNEMV SALGIQASIAIDNAISYAELLEKERISQELELASSIQKQILPKGFEKIKGMDIATYFSPA KEIGGDYYDLALKDNILSITIADVSGKGVPASFLMALSRSMLKTINYVSSFKPAEELNLF NKIVYPDITEDMFITVMNTELDLNSSIFTYSSAGHNPLVVYRKESDTVELYGTKGVAVGF IENYSYKENSFELKNGDIVVFYTDGIIECENKRRELFGTQRLLDVIYKNKNLSSKEIKGK ILEAIEDFRKDYEQNDDITFVILKSVKK >gi|224461349|gb|ACDC01000053.1| GENE 39 42073 - 42984 1037 303 aa, chain + ## HITS:1 COG:FN1092 KEGG:ns NR:ns ## COG: FN1092 COG3872 # Protein_GI_number: 19704427 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Fusobacterium nucleatum # 3 303 4 304 304 543 93.0 1e-154 MSNNNRPSIGGQAVIEGVMMRGTECLATAVRKPSGEIVYKKTKIIGKNSNFAKKPFIRGV LMLFESLVIGVKELTFSANQAGEEDEKLSHKEAVFTTLFSLALGIGIFIVLPSLVGSFIF PENKMYANLTEAILRLIIFIGYIWGISFSKEVGRVFEYHGAEHKSIYTYENGLELTPENA 
KKFTTLHPRCGTSFLFIVMFIAIIVFSVIDYVLPIPTNLFSKFLLKVVVRIVLMPVIASL AYELQKYSSCHLNNPLIKLISLPGLALQKITTREPDLDELEVAIVAIKASLGQEVNNATE VFE >gi|224461349|gb|ACDC01000053.1| GENE 40 43007 - 43421 370 138 aa, chain + ## HITS:1 COG:FN1093 KEGG:ns NR:ns ## COG: FN1093 COG1959 # Protein_GI_number: 19704428 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 138 1 139 142 223 89.0 6e-59 MKLKNEIEYVFRILNYLSLQDKDRIVTSTEIAENENIPHLFSIRVLKKMEKKGLLKIFKG ANGGYKLNKDPKDITLRDAVETIEDEIIIKDRSCVAGQTSCSVIFKALESVENNFLNNLD KVNFKELTCPHVDLKIDD Prediction of potential genes in microbial genomes Time: Thu May 19 23:11:20 2011 Seq name: gi|224461348|gb|ACDC01000054.1| Fusobacterium sp. 2_1_31 cont1.54, whole genome shotgun sequence Length of sequence - 23300 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 4, operones - 3 average op.length - 7.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 22 - 1644 2431 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 1671 - 1730 10.2 + Prom 1853 - 1912 8.7 2 2 Op 1 1/0.000 + CDS 1946 - 2878 1134 ## COG0451 Nucleoside-diphosphate-sugar epimerases 3 2 Op 2 12/0.000 + CDS 2936 - 3571 1071 ## COG0563 Adenylate kinase and related kinases 4 2 Op 3 . + CDS 3611 - 4375 1155 ## COG0024 Methionine aminopeptidase + Term 4383 - 4419 3.5 - Term 4370 - 4407 7.5 5 3 Op 1 15/0.000 - CDS 4414 - 6726 2997 ## COG2217 Cation transport ATPase 6 3 Op 2 . 
- CDS 6761 - 6955 459 ## COG2608 Copper chaperone - Prom 7158 - 7217 11.9 + Prom 7114 - 7173 10.2 7 4 Op 1 1/0.000 + CDS 7210 - 9960 3710 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 8 4 Op 2 1/0.000 + CDS 9973 - 10869 741 ## COG1481 Uncharacterized protein conserved in bacteria 9 4 Op 3 1/0.000 + CDS 10882 - 11838 390 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 10 4 Op 4 1/0.000 + CDS 11813 - 12490 701 ## COG1354 Uncharacterized conserved protein 11 4 Op 5 . + CDS 12478 - 13953 576 ## PROTEIN SUPPORTED gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 12 4 Op 6 . + CDS 14020 - 14697 794 ## FN0710 hypothetical protein 13 4 Op 7 3/0.000 + CDS 14694 - 15590 1002 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 14 4 Op 8 1/0.000 + CDS 15538 - 15915 357 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 15 4 Op 9 7/0.000 + CDS 15902 - 16474 509 ## COG2059 Chromate transport protein ChrA 16 4 Op 10 1/0.000 + CDS 16471 - 17001 534 ## COG2059 Chromate transport protein ChrA 17 4 Op 11 . + CDS 17012 - 17959 1152 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 18 4 Op 12 . + CDS 18039 - 18788 885 ## FN0715 hypothetical protein 19 4 Op 13 . + CDS 18825 - 19691 914 ## FN0715 hypothetical protein 20 4 Op 14 . + CDS 19675 - 20589 1173 ## FN0716 phophatidylinositol-4-phosphate 5-kinase (EC:2.7.1.68) 21 4 Op 15 . + CDS 20589 - 21269 845 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 22 4 Op 16 . + CDS 21282 - 21695 596 ## gi|294782057|ref|ZP_06747383.1| glycosidase CRH1 23 4 Op 17 4/0.000 + CDS 21695 - 22735 901 ## COG4394 Uncharacterized protein conserved in bacteria 24 4 Op 18 . 
+ CDS 22747 - 23299 822 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) Predicted protein(s) >gi|224461348|gb|ACDC01000054.1| GENE 1 22 - 1644 2431 540 aa, chain - ## HITS:1 COG:FN1301 KEGG:ns NR:ns ## COG: FN1301 COG0488 # Protein_GI_number: 19704636 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 539 1 539 539 1033 97.0 0 MIATANLGMRFSGRKLFEDVNLKFTPGNCYGVIGANGAGKSTFVKILSGELEATEGEVIF DKNKRMSVLKQDHFQYEEEEVLNVVLMGNKKLWDIMVEKNAIYAKTDFTDEDGIRAAELE GEFAELNGWEAETEAETLLMGLKIGADLHHKLMKELTEPEKVKVLLAQALFGEPDVLLLD EPTNGLDVKAISWLENFIMGLENSTVIVVSHDRHFLNKVCTHITDIDYGKIKMYVGNYDF WYESNELMKTLINNKNKKLEQKRQELQEFIARFSANASKSKQATSRKKQLEKLQLEDMQM SNRKYPFVEFKPEREAGNNLLKVENLSKTIEGVKVLDNVSFTIETGDKVVFLAKNDLVKT TLLSILAGEIEPDSGSYTWGVTTSQAYMPRDNSAYFNNTDVNLIEWLRPYSPDEHEAFIR GFLGRMLFSGDETLKKVSVLSGGEKVRCMLSKLMLSGANVLLFDNPSDHLDLESITSLNK ALIKFKGTILFGAHDHEFIQTVANRIIEITPKGIVDKVTTYDEYLEDETIQARLEEMYAE >gi|224461348|gb|ACDC01000054.1| GENE 2 1946 - 2878 1134 310 aa, chain + ## HITS:1 COG:FN1299 KEGG:ns NR:ns ## COG: FN1299 COG0451 # Protein_GI_number: 19704634 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 310 1 309 309 437 75.0 1e-122 MKKILVMGGNQFVGKEVAKKLLEKNYKVYVLNRGIRKNLDNVIFLKADRKNISEMKNILK NIEVDVIIDISAYTEEQVEILQRVMKNKFKQYILISSASVYTDITESPAKEDDPTGENPA WSDYAKNKYLAEMRTIENSRLYNFKYTIFRPFYIYGIGNNLDRENYFFSRIKYNLPIYIP NKGNNIVQFGYIEDLASAIELAVENSDFYGQVFNISGDEYVAITEFAEICGKIMNKKSII KHIDTEEKNIKARDWFPFREVNLFGDISKLENTGFRNKYSLIKGLEKTYKYNEEHDLIIE PNLNEIEKEN >gi|224461348|gb|ACDC01000054.1| GENE 3 2936 - 3571 1071 211 aa, chain + ## HITS:1 COG:FN1298 KEGG:ns NR:ns ## COG: FN1298 COG0563 # Protein_GI_number: 19704633 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # 
Organism: Fusobacterium nucleatum # 1 211 1 211 211 369 92.0 1e-102 MVDVNLVLFGAPGAGKGTQAKFIVDKYGIPQISTGDILRVAVANQTKLGLEAKKFMDAGQ LVPDEVVNGLVEERLAEKDCEKGFIMDGFPRTVVQAKALDEILTRLGKQIEKVIALNVPD ADIIERITGRRTSKVTGKIYHIKFNPPVDEKEEDLVQRADDTEEVVVKRLETYHNQTAPV LDYYKAQNKVTEIDGTKKLEDITQDIYRILG >gi|224461348|gb|ACDC01000054.1| GENE 4 3611 - 4375 1155 254 aa, chain + ## HITS:1 COG:FN1297 KEGG:ns NR:ns ## COG: FN1297 COG0024 # Protein_GI_number: 19704632 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Fusobacterium nucleatum # 1 254 1 254 254 449 87.0 1e-126 MRLIKTLDEIKGIRKANQIIAKIYTDIIPPYLKPGITTREIDRIIDEYIRSCGARPACIG VEGIYGPFPAATCISVNEEVVHGVPGDRVIKEGDIVSLDTVTELDGYYGDSARTFPIGII DDESRKLLEVTEKAREIGIQTAVAGNRLGDVGHAIQTFVEQNDFSVVRDFAGHGVGLALH EEPMIPNYGRKGRGLKIENGMVLAIEPMVNAGTYKIAMLPDGWTIITRDGKRSAHFEHSI AIIDGKPVILSELD >gi|224461348|gb|ACDC01000054.1| GENE 5 4414 - 6726 2997 770 aa, chain - ## HITS:1 COG:FN0245 KEGG:ns NR:ns ## COG: FN0245 COG2217 # Protein_GI_number: 19703590 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 770 1 769 769 1244 88.0 0 MENDIKSGAELDNKQERDNKKLELKIDGISCQACVAKIERKLSRTEGVEKALVNISNNMA DIEYDEKEIKASEIMKIIEKLGYTPKRREDLKDKEEAIRAEKMLKSELTKSKIAIVLSLI LMYISMSHMFGLPVPHIIYPVDHIFNYVAIQFIIAVTVMIIGKRFYKVGFRQLFMLSPNM DSLVAVGTSSAFIYSLYISYKIFADNNIHLMHSLYYESAAMIIAFVMLGKYLETLSKGKA SAAIKKLVNFQAKKANIIRNGEIVEIDINEVSKGDIVFIKPGEKIPVDGTIIEGHSTIDE AMITGESIPVEKLENDKVYSGSINKDGALKVVVNATEGETLISKIAKLVEDAQMTKAPIA RLADKVSLIFVPTVIFIAIFAALLWWFLIKYNVVSVSQNHFEFVLTIFISILIIACPCSL GLATPTAIMVGTGKGAELGILIKSGEALEKLNEIDTIVFDKTGTLTEGTPKVIDIVSIDN VLSKDEILKIAASMEVNSEHPLGKAVYDEAKEKNVELYDVKKFLSISGRGVIGEVEEKKY LLGNKKLLIDNGISNLHEEEIHKYELEGKTTILLADQEKLIAFITLADVVRNESIELIEK LKKENIKTYMLTGDNERTAKVIAKKLGIDDVIAEVSPEDKYKKVKDLQEQGRKVVMVGDG VNDSPALAQADVGMAIGSGTDIAIESAGIVLMSKDIETILTAIRLSKATIKNIKENLFWA FFYNSCGIPIAGGLLYLFTGHLLNPMLAGLAMGLSSVSVVTNALRLKRFK 
>gi|224461348|gb|ACDC01000054.1| GENE 6 6761 - 6955 459 64 aa, chain - ## HITS:1 COG:FN0244 KEGG:ns NR:ns ## COG: FN0244 COG2608 # Protein_GI_number: 19703589 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 10 64 1 55 56 80 87.0 5e-16 MKLNLKIDGMGCEHCIKSVREALEGISGVKVIDVKIGSAEVEAENDSVLNEIREKLDDAG YDLV >gi|224461348|gb|ACDC01000054.1| GENE 7 7210 - 9960 3710 916 aa, chain + ## HITS:1 COG:FN0705_2 KEGG:ns NR:ns ## COG: FN0705_2 COG0749 # Protein_GI_number: 19704040 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Fusobacterium nucleatum # 416 916 1 501 501 815 88.0 0 MKRAVLLDVSAIMYRAYFANMNFRTKNEPTGAVYGFINTLLSIIKEFNPDYMAAAFDVKR SSLKRTEIFGDYKSNRQSTPEDLVAQIPRIEEVLDAFNINRYRIESYEADDVLGSIAKKI AKDDLEVIIVTGDKDLSQLVEKNITIALLGKGTEGEKFGMLRTAEDVVNYLGVVPEKIPD LFGLIGDKSDGIPGVTKIGEKKALAIFSKYDSLEKIYENIDDLKNIEGIGPSLIKNLTNE KDIAFLSRELAKIFTNLDINLEEENLKYSMDKEKLYELCKTLEFKMFIKKLNLEEKTQTS NFDHKPVLLSLFDKVEEVEKTEKVEKEIVYEKELNINFSNRELVIIDNETLLNEQKEYLN NYKKIASIYYEELGIILSTEEKDLYFPLNHGGLLSKNIDKNTLIKFIAELDVKFISYNFK TLLNLGFTFKSMYMDMMIAYHLISSQTKMDVFIPITEYSNVDAKDFKTTFGKAHIETLLV GEFAGYLSKIGLGILAIYDEINHILHKEELYDILIQNEMPLIPVLSLMERKGIKIDVSYF KNYSLELEKELAKIEKAIYEEAGEEFNINSPKQLGDILFVKMNLPSGKKTKTGYSTDVMV LEDLESYGYNIARLLLDYRKLNKLKTTYVDTLPNLVDSNSRIHTSFNQIGTATGRLSSSE PNLQNIPVKTDDGIKIREGFVAGEGKVLMSIDYSQVELRVLTSMSKDENLIEAYREEKDL HDLTARRIFNLSDSDDVTREQRTIAKIINFSIIYGKTAFGLAKELKIPVKDASEYIKKYF EQYPRVTTFEKEVIEFGEEHGYVKTLFGRKRYISGIDSKNKTIKAQAERMAVNTVIQGTA AEVLKKVMLKVYETLKDKDDIALLLQVHDELIFEVEESSVEKYSEILADIMKNTVKLEDV NLNININIGKNWAEAK >gi|224461348|gb|ACDC01000054.1| GENE 8 9973 - 10869 741 298 aa, chain + ## HITS:1 COG:FN0706 KEGG:ns NR:ns ## COG: FN0706 COG1481 # Protein_GI_number: 19704041 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 298 1 298 299 432 87.0 1e-121 
MSYSSNVKQEITQKIPVTNLECLAEISSIFENKANLVKEGIEIKMENSILAKRLYSLIKA TSSLQFGIKYSITKKFTEHRIYVITLYKQKGLKEFLESFKFSFLDIIQNDEIFRGYLRGF FLSCGYIKDPKKEYSLDFFVDNKELADKIYNILLSKKKKIFKTIKKNKILVYLRNSEDIM DILVSMNALKYFFEYEEITIIKNLKNKTIREMNWEVANETKTLNTGNYQIKMIKYIDEKL GLNTLTDVLKEAAMLRLNNPEDSLQSLADMINISKSGIRNRFRRIEEIYNNLLEEENS >gi|224461348|gb|ACDC01000054.1| GENE 9 10882 - 11838 390 318 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 19 305 20 311 317 154 33 4e-37 MIVVNDILTSNIEFEDTYVAIGNFDGVHYGHKKLINETIKAARENSKKAVVFTFEKHPLE FLFSERKFDYINTNEEKLYLLESLGVDVVIMQKLDKNFLEYTPLEFVRILKNKLKVKEIF VGFNFSFGKGGLGTAEDLEYLAEIHNIKVNELPPVTLDGELVSSSAIRKKIANSDFDGAI KLLDHPMIVIGEVIHGKKIARQLGFPTTNIKMDNRLYPPSGIYGAFLQVSDKNSKVLYGV VNIGYNPTLKQEMSLEVHILDFDREVYGEKLYIQIVKFMREEKKFSSIDELKATIQADVD RWKLFKREMKYGRTSSKS >gi|224461348|gb|ACDC01000054.1| GENE 10 11813 - 12490 701 225 aa, chain + ## HITS:1 COG:FN0708 KEGG:ns NR:ns ## COG: FN0708 COG1354 # Protein_GI_number: 19704043 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 225 1 225 225 311 89.0 6e-85 MEELVVKVNNFEGPFDLLLNLIEKKKMMISDINISQLIDEYLEVLKLSERENIEIKSDFI IIASELIEIKTLNLLNLDSDKEKETNLKRRLEEHKLFKELTPKVANLEKEFNISYSRGES KRTIKKIAKDYDLTSLTTDDIFDVYKKYFDSVDMSEFMELNLIKQYDIKEIMDNILIKIY FKNWLIDDLFLEAENKLHLIYIFLAILELYKDAKINIDDGEIRKC >gi|224461348|gb|ACDC01000054.1| GENE 11 12478 - 13953 576 491 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 [Vibrio campbellii AND4] # 1 421 3 434 520 226 31 1e-58 KKMLKKSINTMVITMVSRVLGLFRGTLVAYFFGASVLTDAYYSAFKISNFFRQLLGEGAL GNTFIPLYHKKKKEEGEERSREYIFSVLNITFLFSFVISVLMIIFSSYIIDFIVVGFSDE LKMVASRLLKIMSFYFLFISLSGMMGSILNNFGYFAIPASTSIFFNLSIIFSAMWLTKYF SIDALAYGVLIGGVLQFLVVFFPFLKLLKSYSFKIDFKDIYLKLLGIKLIPMLVGVFARQ VNTIVDQFFASFLVAGSITALENASRVYLLPVGVFGVTISNVLFPSISRAAANGDKEDTN RRLVSAINFLNFLTIPSLFVLTFFSKDVIRLIFSYGKFNEDAVKITSECLLYYSLGLIFY 
VGVQLVSKGYYAMGDNKRPAKFSIIAIIMNIVLNYLFIKNFQHKGLALATSISSGVNFFL LLFVYVKLYVKLDLKNIIATAIKICISSVIATAFAFYINNVILKLVIFSAVFLLQWAYPI YKYRERVFYKK >gi|224461348|gb|ACDC01000054.1| GENE 12 14020 - 14697 794 225 aa, chain + ## HITS:1 COG:no KEGG:FN0710 NR:ns ## KEGG: FN0710 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 224 1 224 225 203 59.0 5e-51 MGLADLLFKEKEEKYLKQIEDLQNYLKIKDDEISYLTTQLEEVTKEKDARINSKQLEIFE KNFKHNIEVAKKYRSILDSYNLDTEKKSYKYRVDLKYFYSEKKFEEVIKFLNENNKFFVD ELSEEIFDNMTKEIKNANKAKQRFIDFKNGQMEWAITTLINKGEELSKLYSKSRKLMTIF SDLYFEYLDDIANFDFMNLKSQGFDISEIEEFISKRDNYYKERRR >gi|224461348|gb|ACDC01000054.1| GENE 13 14694 - 15590 1002 298 aa, chain + ## HITS:1 COG:FN0711 KEGG:ns NR:ns ## COG: FN0711 COG0452 # Protein_GI_number: 19704046 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Fusobacterium nucleatum # 1 287 1 287 404 472 83.0 1e-133 MKNILVGVTGGIAAFKSASIVSLLKKKGYNVKVVMTENATNIIGPLTLETLSKNRVYVDM WDKNPHYEVEHISLADWADIVLIAPATYNIIGKVANGIADDMLSTILSAVSLRKPVFFAL AMNVNMYENPILNENINRLKTYGYRFIDTNEGLLACNYEAKGRMKEPEEIVDIIVRHDIA SKIDNFRDALKGKRLLITSGRTREDIDPIRYLSNKSSGKMGYSLAQAAVDLGAEVTLVSG PTNLNVPDGLKEFISVDSAIHMYEKVDEKFKDTDIFIACAAVADYRPKEYQDKKNKKI >gi|224461348|gb|ACDC01000054.1| GENE 14 15538 - 15915 357 125 aa, chain + ## HITS:1 COG:FN0711 KEGG:ns NR:ns ## COG: FN0711 COG0452 # Protein_GI_number: 19704046 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Fusobacterium nucleatum # 17 123 298 404 404 141 78.0 2e-34 MQTTDLKNIKIKKIKKSDLNLTIELVRNPDILFEMGKKKENQLLVGFAAETNNIIENALK KLEKKNLDMIVANNASTMGTDTNSIEIIRKDRSSTVINQKSKIELAYDILKEVILDLKKA KDEEK >gi|224461348|gb|ACDC01000054.1| GENE 15 15902 - 16474 509 190 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport 
protein ChrA # Organism: Fusobacterium nucleatum # 1 186 1 186 186 264 82.0 8e-71 MKKNKIIDIFILFFKIGAFTIGGGYAMLSLIEDEIVNKKNWLEKEEFVDGMAIAQSIPGV LAVNISLITGYKIAGFLGMFAGMLGAVLPSFFIVLFLSQILLAIGNHPIVVAIFNGIKPA IAALILISVYRIAKSANINRYTFIFPIIIAILINYFGISPIIIIIATMILGNIYFLFKEK SKKEKEDDVQ >gi|224461348|gb|ACDC01000054.1| GENE 16 16471 - 17001 534 176 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 175 1 175 176 210 80.0 1e-54 MIYLKLFFVFFKVGLFSFGGGYAILPLMRHEVVDVNKWISFHEFMEIVAISQITPGPISI NLATHAGYRIAQTMGSTIATFSVVLPSIIIMTIIVVFLKKFSNLPVVKRTFAALRITVVG LILAAAVALFVKDNFIDYKSYIIFASVLIAGLFFKIGSITLIISSGLAGLLLYYIF >gi|224461348|gb|ACDC01000054.1| GENE 17 17012 - 17959 1152 315 aa, chain + ## HITS:1 COG:FN0714 KEGG:ns NR:ns ## COG: FN0714 COG1902 # Protein_GI_number: 19704049 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Fusobacterium nucleatum # 1 314 1 314 314 585 92.0 1e-167 MEKINIFTDFKIKNIHIKNRIVLPPMVRFSLVKDDGYVTQDLIDWYGMIARSGVGLIIVE ASAVEESGKLRENQIGIWNDSFIEGLTKVANEIHKYDVPCMIQIHHAGFKDKITEVPEEE LDRILKLFEEAFIRAKKCGFDGIEIHGAHTYLISQLNSKLWNKRTDKYGERLYFSRKLIE NTRYLFDDNFILGYRMGGNEPELEDGIENAKELESYGLDILHVSSGVPNPEYKRQVKISN FPEDFPLDWIIYMGTEIKKHVKIPVIGVSKIKKESQASWLVENNLLDFVAVGKAMISQDK WMEKARKDFMSKNRH >gi|224461348|gb|ACDC01000054.1| GENE 18 18039 - 18788 885 249 aa, chain + ## HITS:1 COG:no KEGG:FN0715 NR:ns ## KEGG: FN0715 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 245 1 264 290 179 43.0 6e-44 MTTKNYIAVAKYLEDNTILLSFPDFEGLTTTADSEENIQNIAAKAIKSKLAELKNSNIEA PEPKKITEVSKNLQEGEFTTYIPVTETPSFNTLKDNETLKDVSNKVDNFINKDIKKSVPE GKEHFLGIGGAILAILNTLLFPVYTITGFFGFGGGGANFFQMNALYMLFGLAFLAFAGAN IYASLNRDMKILQISTLGILGTFALCYVLVFITALTNAYLSVGIIKFILYAISVAVIYSG YRILNSLND >gi|224461348|gb|ACDC01000054.1| 
GENE 19 18825 - 19691 914 288 aa, chain + ## HITS:1 COG:no KEGG:FN0715 NR:ns ## KEGG: FN0715 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 278 1 274 290 306 61.0 6e-82 MSMTNYIIVMKALEDGKFLITFPDFEGLTATADSEESIQSVATETIKTKLAELKKDNLVI PEAKKMKDVSSTLNEGEFTTYIPVKEEFDFKATMNSTMANLKDKESFKKGTEDLKNKATE LTNNIPKGSENLFGIIGGVIAIINTFLLAVFSVKIPIFGSYSIGFFKGLGILADFSKEVK NIQAILLFSGILFIAFAGLLIYSSVIKNKNILLYSIIGNAIFLVIFYIILLVKLPGGEAG KYISVSFFKILLYLISLALAFVTYFTLNKVEQNQISLNNGDDRNEEGL >gi|224461348|gb|ACDC01000054.1| GENE 20 19675 - 20589 1173 304 aa, chain + ## HITS:1 COG:no KEGG:FN0716 NR:ns ## KEGG: FN0716 # Name: not_defined # Def: phophatidylinositol-4-phosphate 5-kinase (EC:2.7.1.68) # Organism: F.nucleatum # Pathway: not_defined # 1 304 1 314 314 278 53.0 2e-73 MKKDFKQFLILLIVSIFVAFTVSFGYSVYQNYQREKKINEVKNLFNFGGTSEDEKEETKE EIKTEETTKPEEVNSKESWNNLIISEIEKDYVLDDVRPFYKRLYDKIRGKKIYNFKSINN ENETLVVEMNDNKITEKFFNDGKEVLEKELIANDDFSSYDLKAKNIAEEYTATFKDMLGK DTYLNTKNGLIEYQDGRKIEFIHKNAIMNGPAIEYLANGDKIEFNYVNGKRYGEAQKFYA NGDKEDFFYGNNEKKNGASIYYFANGEREEVAYKDGVLEGPAIYIFNDGVAEHYEYKNGK RVEE >gi|224461348|gb|ACDC01000054.1| GENE 21 20589 - 21269 845 226 aa, chain + ## HITS:1 COG:FN0717 KEGG:ns NR:ns ## COG: FN0717 COG1187 # Protein_GI_number: 19704052 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Fusobacterium nucleatum # 1 226 1 226 226 345 85.0 4e-95 MRLDRFLVECGIGSRKEVKKIISAKEIKVNGSYDISAKDNIDEYSDVIEYNGERLEYKEF RYYIMNKKAGYITATEDIREATVMDLLPEWVIRKDLAPVGRLDKDTEGLLLLTNDGKLNH KLLSPKNHVDKTYYVEIENNISQEDILKLEEGVDIGNYITLPAKVEKISDTKIYLTIKEG KFHQVKKMLEAVNNKVNYLQRTTFAKLSLDGLALGEVKEVNLEDII >gi|224461348|gb|ACDC01000054.1| GENE 22 21282 - 21695 596 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294782057|ref|ZP_06747383.1| ## NR: gi|294782057|ref|ZP_06747383.1| glycosidase CRH1 [Fusobacterium sp. 
1_1_41FAA] # 1 137 1 137 137 81 84.0 1e-14 MKKNLILIISTLFLTTACTASLGLGTGFGLGGSSSGVSVGTGISVEKKIPTKKETKKKVE TKTNGTSHTNSNTKTTVKKTTDHSTNTTKKAVEDKTQVKTEKNEVTSSTTTFETNTSTKS TETSVTTPKRVKQERQE >gi|224461348|gb|ACDC01000054.1| GENE 23 21695 - 22735 901 346 aa, chain + ## HITS:1 COG:FN0719 KEGG:ns NR:ns ## COG: FN0719 COG4394 # Protein_GI_number: 19704054 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 346 1 346 350 473 80.0 1e-133 MLIDNIDIFCEVIDNYGDVGVAYRLARELKRIYPNKELRFIINQTEELNLIKKNDDIIVI DYKDVDKIESPADLIIETFACNIPEIYMGKALKNSKLMINLEYFSSEDWVDDFHLQESFL GGNLKKYFFIPGLSEKSGGIILDKEFLDRKNKVQKNREYYLKQFNIDEKYDLIISVFSYE KNFDNFLKTLQKLDKKVLLLLLSEKTQKNFIKYFDNNDYYDKIKAVKLPFFTYDKYEELL ALCDINLVRGEDSFVRALLLGKPFLWHIYPQDENTHIIKLESFLEKYCPNNKELKETFIN YNINKDDFSYFFENLDEIKKYNEKYTDYLIENCNLMNKLINFIEKI >gi|224461348|gb|ACDC01000054.1| GENE 24 22747 - 23299 822 184 aa, chain + ## HITS:1 COG:FN0720 KEGG:ns NR:ns ## COG: FN0720 COG0231 # Protein_GI_number: 19704055 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Fusobacterium nucleatum # 1 184 1 184 187 346 100.0 2e-95 MKIAQELRAGSTIKIGNDPFVVLKAEYNKSGRNAAVVKFKMKNLISGNISDAVYKADDKM DDIKLDKVKAIYSYQNGDSYIFSNPETWEEIELKGEDLGDALNYLEEEMPLDVVYYESTA VAVELPTFVEREVTYTEPGLRGDTSGKVMKPARINTGFEVQVPLFVEQGEWIKIDTRTNE YVER Prediction of potential genes in microbial genomes Time: Thu May 19 23:11:48 2011 Seq name: gi|224461347|gb|ACDC01000055.1| Fusobacterium sp. 2_1_31 cont1.55, whole genome shotgun sequence Length of sequence - 12406 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 4, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 48 - 797 755 ## FN0721 hypothetical protein
- Prom 895 - 954 14.9
+ Prom 931 - 990 15.0
2 2 Op 1 16/0.000 + CDS 1017 - 2015 1462 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme
3 2 Op 2 14/0.000 + CDS 2015 - 3001 1366 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III
4 2 Op 3 6/0.000 + CDS 3031 - 3924 1430 ## COG0331 (acyl-carrier-protein) S-malonyltransferase
5 2 Op 4 27/0.000 + CDS 4002 - 4229 485 ## COG0236 Acyl carrier protein
+ Term 4256 - 4293 3.1
+ Prom 4239 - 4298 7.8
6 3 Op 1 1/0.000 + CDS 4326 - 5567 1868 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase
7 3 Op 2 3/0.000 + CDS 5582 - 6286 874 ## COG0571 dsRNA-specific ribonuclease
8 3 Op 3 1/0.000 + CDS 6273 - 7319 1036 ## COG1243 Histone acetyltransferase
9 3 Op 4 . + CDS 7294 - 8673 1226 ## COG1530 Ribonucleases G and E
10 3 Op 5 1/0.000 + CDS 8672 - 9166 302 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19
11 3 Op 6 7/0.000 + CDS 9169 - 10551 1474 ## COG1066 Predicted ATP-dependent serine protease
12 3 Op 7 . + CDS 10544 - 11593 682 ## PROTEIN SUPPORTED gi|163764769|ref|ZP_02171823.1| ribosomal protein L18
+ Term 11726 - 11771 -0.5
+ Prom 11686 - 11745 12.1
13 4 Tu 1 .
+ CDS 11788 - 12333 908 ## COG2849 Uncharacterized protein conserved in bacteria + Term 12364 - 12406 8.2 Predicted protein(s) >gi|224461347|gb|ACDC01000055.1| GENE 1 48 - 797 755 249 aa, chain - ## HITS:1 COG:no KEGG:FN0721 NR:ns ## KEGG: FN0721 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 246 1 239 239 322 71.0 7e-87 MKRKLFFILSLFLICSFYAFAENFPQKAKSINDFVPKGWKILKDENGSNFIAKGDLNKDK LEDVAIIIEKNDKKNIKKNDNFGPNELNLNPRILLILFKEKDGTYSLVAKNDKGFIKSEG SEDNPALMDTLSDICIKKNVLKITFNYFMSAGSWNTSSDTYIFRFQNNVFELIGYESDSY MRNSGDEEKISINFSTNKVKSTTGGNMFEGTKDKPKDKWRNIKFEKKYILDEMTESTMDE ILDTIYYIE >gi|224461347|gb|ACDC01000055.1| GENE 2 1017 - 2015 1462 332 aa, chain + ## HITS:1 COG:FN0147 KEGG:ns NR:ns ## COG: FN0147 COG0416 # Protein_GI_number: 19703492 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Fusobacterium nucleatum # 1 332 1 332 332 539 91.0 1e-153 MKIALDAMSGDFAPISTVKGAVEALNEIENLEVILVGKESIIKEELKKYKYDTKRIEIKN ANEIIEMTDDPVKAVREKKDSSMNVCIDLVKDKVAQASVSCGNTGALLASSQLKLKRIKG VLRPAIAVLFPNKKDQGTLFLDLGANSDSKPEFLNQFATMGSKYMEIFLNKKNPKVALLN IGEEETKGNELTRETYSLLKQNKDIDFQGNIESTKIMDGEVDVVVTDGYTGNVLLKTSEG VGKFIFHVVKESVMESWISKIGALLMKGAIKKVKKKTEASEYGGAIFLGLSELSLKAHGN SDSRAIMNALKVASKFIELNFIEELRKTMEVE >gi|224461347|gb|ACDC01000055.1| GENE 3 2015 - 3001 1366 328 aa, chain + ## HITS:1 COG:FN0148 KEGG:ns NR:ns ## COG: FN0148 COG0332 # Protein_GI_number: 19703493 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 1 328 1 328 328 583 87.0 1e-166 MQSIGIKGIGYYAPENVFTNFDFEKIIDTSDEWIRTRTGITERRFATKEQATSDLACEAS LKAIESAKIKKEDIDLIILATVTPDYLAQGAACIVQHKLGLSNIPCFDLNAACTGFIYGL EVGYSMVKSGLYKNVLVIGAETLSRIIDMQNRNTCVLFGDGAAAAVVGEVEEGYGFLGFS IGAEGEDDMILKIPAGGSKKPNDDETIKNRENFVVMKGQDVFKFAVNILPKVTLDALEKA KLDVSDLSMVFPHQANSRIIESAAKRMKFPIEKFYMNLSRYGNTSSASVGLALGEAVEKG LVKKGDNVALTGFGGGLTYGSAIIKWAF 
>gi|224461347|gb|ACDC01000055.1| GENE 4 3031 - 3924 1430 297 aa, chain + ## HITS:1 COG:FN0149 KEGG:ns NR:ns ## COG: FN0149 COG0331 # Protein_GI_number: 19703494 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Fusobacterium nucleatum # 1 297 1 297 299 489 89.0 1e-138 MGKIAFVYPGQGTQFVGMGKELYENNLKAKELFDKIFSSLDIDLKKVMFEGPEDLLKRTD YTQPAIVSLSLVLTELLKETGVKPDYVAGHSVGEFAAFGGANYLSVEDAVKLVAARGRIM KEVAEKVNGSMAAVLGMDAEKIKEVLKSVDGVVEAVNFNEPNQTVIAGEKEAVEKACVAL KDAGAKRALPLAVSGPFHSSLMKEAGEQLKVEAQNYNFNIADVKIVANTTAELLETDAEV KEEIYKQSFGPVKWVDTINKLKALGVTKIYEIGPGKVLAGLIKKIDKEIEVENIEII >gi|224461347|gb|ACDC01000055.1| GENE 5 4002 - 4229 485 75 aa, chain + ## HITS:1 COG:FN0150 KEGG:ns NR:ns ## COG: FN0150 COG0236 # Protein_GI_number: 19703495 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Fusobacterium nucleatum # 1 75 1 75 75 105 94.0 3e-23 MLDKVREIIVEQLGVEPDQVKPESNFVDDLGADSLDTVELIMSFEEEFGVEIPDTEAEKI KTVQDVINYIEANKK >gi|224461347|gb|ACDC01000055.1| GENE 6 4326 - 5567 1868 413 aa, chain + ## HITS:1 COG:FN0151 KEGG:ns NR:ns ## COG: FN0151 COG0304 # Protein_GI_number: 19703496 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Fusobacterium nucleatum # 1 413 1 413 413 741 95.0 0 MKRVVVTGLGLISSLGIGLEESWKKLIAGETGIDLITSYDTTDQPVRIAGEVKGFEPTDY GIEKKEVKKLSRNTQFALVATKMALDDANFKIDETNADDVGVLVSSGVGGIEIMEEQYGA MLSKGYKRISPFTIPAMIENMAAGNIAIYYGAKGPNKSIVTACASGTHSIGDGFDLIRHG RAKAMIVGGTEASVTQFCINSFANMKALSTRNETPKTASRPFSKDRDGFVMGEGAGILIL EELESALARGAKIYAEMVGYGETCDANHITAPIETGEGATKAMRIALKDANLSPDDVTYI NAHGTSTPTNDVVETRAIKALFGDKAKDLYISSTKGATGHGLGAAGGIEGVIIAKAIADG VIPPTINLHETDEECDLNYVPNQAIKTDVKVAMSNSLGFGGHNSVIVMKKFEK >gi|224461347|gb|ACDC01000055.1| GENE 7 5582 - 6286 874 234 aa, chain + ## HITS:1 COG:FN0152 KEGG:ns NR:ns 
## COG: FN0152 COG0571 # Protein_GI_number: 19703497 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Fusobacterium nucleatum # 1 234 1 234 234 374 87.0 1e-103 MKNLLDLEHKLNYYFNNRNLLKTALLHKSLGNEKKEYKNQNNERLELLGDAVLDLIVAEY LYRNYKSASEGTIAKLKAMIVSEPILAKISRQIDLGKFLMLSKGEILSGGRNRESILADA FEAVLGAVYMDSNLEEARSFALGHIEQYITHIEEDEDILDFKSILQEYVQKNFKTVPTYE LISEKGPDHMKEFEIQVVVGNYKEKAVAKNKKKAEQLSAKALCVKLGVKYHEAL >gi|224461347|gb|ACDC01000055.1| GENE 8 6273 - 7319 1036 348 aa, chain + ## HITS:1 COG:FN0153 KEGG:ns NR:ns ## COG: FN0153 COG1243 # Protein_GI_number: 19703498 # Func_class: K Transcription; B Chromatin structure and dynamics # Function: Histone acetyltransferase # Organism: Fusobacterium nucleatum # 1 348 1 348 348 605 89.0 1e-173 MKHYNIPVFISHFGCPNACVFCNQKKINGRETDVSLDDLKNIIDSYLKTLPKNSIKEVAF FGGTFTGISMELQKQYLEVVKKYIDNADVEGVRISTRPECIDDEILTQLKKYGVKTIELG IQSLDDEVLKATGRHYDYEIVKKSCDLIKKYGFTLGVQLMIGLPKSDFKSDLMSAVKSLD LNPDIARIYPTLVIKGTELEFMYKRNLYNSLTLEEAVDRTVPIYSLLELKDINVIRVGLQ PAEDLTADGVIISGPFHPAFRDLVENKIYFNFLSKIYEKEKKLDIEVNERNISKIVGQKA STKKTFYPNFKITINNNLAINELIINSKKYERKEILKGELNEQMPDFI >gi|224461347|gb|ACDC01000055.1| GENE 9 7294 - 8673 1226 459 aa, chain + ## HITS:1 COG:FN0154 KEGG:ns NR:ns ## COG: FN0154 COG1530 # Protein_GI_number: 19703499 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Fusobacterium nucleatum # 1 457 1 457 458 533 73.0 1e-151 MSKCLILSKNTYETKLALLEDDKLEEIYIEREKEKEISGNIYKGKIIDILNKGEIIFVDI GLEKNAFLSFENKKNIPKFNIFDSLIVQVETEARDGKGARLTLDYSINGENLVLLPNSKN LSISKKIENLETIKKLKDIFLNIDKGLILRTKSVEKSPEALLEEYKKLEAINNQIQKDFK EKNTTLLYDNNSILKKALTLLDEDIDEFIIDDEDSFNKIKNTLEENKRNYLLKKFKKYFK DEDIFSYYKLDTQIERALDRKVYLESGASIVIEKTEALVSIDVNTGQNTGNQSSQNLIFN TNLEACKEIARQIKLRNLAGIIIIDFIDLKKISDRKKILEELKRYLKKDRMEINSLDFSH LGLVQFTRKRQGKELSFYYREKCHYCEGTSYLLSKDRIILNLFAELNSQIKYNDLNKIVI KTKKDIIKEIKKLISNPKIEYEEDSNFYKEGYKIELYRR >gi|224461347|gb|ACDC01000055.1| GENE 10 8672 - 9166 302 164 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 1 147 2 147 164 120 43 4e-27 NMKIGVYAGSFDPITKGHQDIIERALKIVDKLIVVVMNNPKKNYWFNLDERKNLISKIFE GSENIKVDEHAGLLVDFMAKNSCGILIKGLRDVKDFSEEMTYSFANKKLSNGEVDTVFIP TSERYTYVSSTFVKELAFYNQSLEGYVDGKIIEEVLNRAKEYRG >gi|224461347|gb|ACDC01000055.1| GENE 11 9169 - 10551 1474 460 aa, chain + ## HITS:1 COG:FN0157 KEGG:ns NR:ns ## COG: FN0157 COG1066 # Protein_GI_number: 19703502 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Fusobacterium nucleatum # 1 460 1 452 452 757 93.0 0 MAKGSVYYCSECGYKSVKWAGKCPQCGAWSSFEEVEEIPRDVKKATSSVSVASRASDIKV YEFKDVEYSKEDRYKTKYEEFDRLLGGGLLKGEVVLVTGNPGIGKSTLLLQVANSYKDYG DVLYISGEESPAQIKNRGERLKISGDGIYIMAEMDILNIYEYVVSKKPKVVIVDSIQTLY NSSMDSISGTPTQIRECTLKIVEIAKKYNISFFIVGHITKDGKVAGPKLLEHMVDAVFNF EGDEGLYYRILRSEKNRFGSTNEIAVFSMEENGMKEIKNSSEYFLSEREEKNIGSMVVPI LEGTKVFLLEVQSLITDSGIGIPKRVVQGYDRNRIQILTAIAEKKLYLPLGMKDLFVNVP GGLAIEDPAADLAVLISILSVYKGVSISQKIAAIGELGLRGEIRKVFFLERRLKELEKLG FTGVYVPESNRKEIEKKKYKLKIIYLKNLDELLERMNKND >gi|224461347|gb|ACDC01000055.1| GENE 12 10544 - 11593 682 349 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 [Bacillus selenitireducens MLS10] # 9 349 16 358 360 267 39 3e-71 MTKQDLMDIIIKVAPGSPLRDGIDYILDAGIGALIIIGYDDDVEEVKDGGFCINCDYTPE KIFELSKMDGAIIINDDCSKILYANVHIQPDTSFTTTESGTRHRTAERVAKQLKREVVAI SERKKNVTLYKGNLKYRLKNFDELNIEVGQVLKTLESYRHVLNRSLDNLTILELDDLVTV LDVANTLQRFEMVRRISEEITRYLLELGARGRLVNMQVSELIWDIDDEEESFLKDYIDTD TKPETVRRYLHTLSDSELLDIENIVVALGYTKSSSVFDNKVAARGYRVLEKISKLTKKDI EKITSTYKDISEIQELTDEDLGAIKISKFKIKALRAGINRLKFTIEMQR >gi|224461347|gb|ACDC01000055.1| GENE 13 11788 - 12333 908 181 aa, chain + ## HITS:1 COG:FN1078 KEGG:ns NR:ns ## COG: FN1078 COG2849 # Protein_GI_number: 19704413 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 
181 1 181 181 270 95.0 1e-72 MNNQYNKDGKKEGLWVKIYDNGVVQEERNYVNGVREGIYKSYYMNGEVEIIKNYKNGNLH GKYQTFYSDGKLNSEYNLVDGRKVGEYKEFYPNGILKRETVYVNDGTTSKNIKYFPNGKI KLEVNFVDGHMEGPYKEYHSNEKLFKECFYNEKGKLEGNYKEYDVEGNLLKEVTYKNGVE V Prediction of potential genes in microbial genomes Time: Thu May 19 23:11:54 2011 Seq name: gi|224461346|gb|ACDC01000056.1| Fusobacterium sp. 2_1_31 cont1.56, whole genome shotgun sequence Length of sequence - 3726 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 29 - 1237 2053 ## COG0183 Acetyl-CoA acetyltransferase 2 1 Op 2 . - CDS 1296 - 2015 271 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 2117 - 2176 9.6 + Prom 2059 - 2118 7.7 3 2 Op 1 . + CDS 2279 - 3214 1252 ## FN0493 hypothetical protein 4 2 Op 2 . + CDS 3237 - 3467 338 ## gi|237740812|ref|ZP_04571293.1| predicted protein Predicted protein(s) >gi|224461346|gb|ACDC01000056.1| GENE 1 29 - 1237 2053 402 aa, chain - ## HITS:1 COG:FN0495 KEGG:ns NR:ns ## COG: FN0495 COG0183 # Protein_GI_number: 19703830 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Fusobacterium nucleatum # 1 402 1 402 402 693 90.0 0 MSKVYVVAAKRTAIGSFLGTLSPLKPGELGAKVVKNIIEETGIDPANIDEVIVGNVLSAG QAQGVGRQVAIKAGIPYEVPAYSINIICGSGMKSVITAFSNIKAGEADLVIAGGTESMSG AGFILPGAVRGGHKMADLTMKDHMILDALTDAYHNIHMGITAENIAEKYNITREEQDEFA LESQKKAIAAVDAGKFKDEIVPVVIPNKKGDITFDTDEYPNRKTDLEKLAKLKPAFKKDG SVTAGNASGLNDGASFLLLASEEAVKKYNLKPLVEIVSTGTGGVDPLIMGMGPVPAIRKA LKKANLKLQDMQLIELNEAFAAQSLGVIKELCTEHGVTADWFKDKTNVNGGAIAIGHPVG ASGNRITVTLIHEMKKTGVEYGLASLCIGGGMGTALVLKNVK >gi|224461346|gb|ACDC01000056.1| GENE 2 1296 - 2015 271 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 235 4 238 242 108 32 5e-24 
MNRLEGKIAVVTGSARGIGRAIVEKLAAHGAKMVISCDMGESSYEQANVVHKILNVTDRE AIKTFVDEVEKEYGKIDILVNNAGITKDGLLMRMTEDQWDAVINVNLKGVFNMTQAVSKS MLKARKGSIITLSSVVGLHGNPGQTNYAATKGGVIAMSKTWAKEFGARNVRANCVAPGFV QTPMTDVLPEETIKGMLDATPLGRLGQVEDIANAVLFLASDESAFITGEVLSVSGGLML >gi|224461346|gb|ACDC01000056.1| GENE 3 2279 - 3214 1252 311 aa, chain + ## HITS:1 COG:no KEGG:FN0493 NR:ns ## KEGG: FN0493 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 311 1 309 309 426 77.0 1e-118 MKDMRLLKLYNRLLKNDDIDVEEYAKENGVSTRTVERDIKCIKDFLADNEDKSRELIRNK IKKKYQLSYSEDSVNLTKSEILAISKILLASRAFLKDEISLIVDKIAKQCGSEDDLKSIQ KLVNNEKFHYIELQHKKSFINYIWDLGQAIKDKKKIEIAYKKMDGNTVRRVIDPVGLMFS EYYFYLLAHIENIDKEKYFCNKDDEYPTIYRLDRIEDFEVLNEKYIPTLYKNRFQEGLFR KQVQFMTGGKLRKLKFIYRGNSIEALLDKIPTAKAKEIDKNVYEIKAEVFGNGIDRWILS QGDAIEIIEDN >gi|224461346|gb|ACDC01000056.1| GENE 4 3237 - 3467 338 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740812|ref|ZP_04571293.1| ## NR: gi|237740812|ref|ZP_04571293.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 76 1 76 76 137 100.0 3e-31 MATLNPRAQIALVLAQIEREYSKGMEFFLEDLSTVQNCVSYSNYQTFFNLLRNNADLTKL VMRVGSVSGKNKYKRK Prediction of potential genes in microbial genomes Time: Thu May 19 23:12:05 2011 Seq name: gi|224461345|gb|ACDC01000057.1| Fusobacterium sp. 2_1_31 cont1.57, whole genome shotgun sequence Length of sequence - 7585 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 34 - 93 9.7 1 1 Op 1 . + CDS 116 - 1705 2338 ## COG0513 Superfamily II DNA and RNA helicases 2 1 Op 2 . 
+ CDS 1721 - 2377 737 ## COG2184 Protein involved in cell division 3 1 Op 3 1/0.000 + CDS 2387 - 3322 1073 ## COG1559 Predicted periplasmic solute-binding protein + Prom 3502 - 3561 13.2 4 2 Op 1 14/0.000 + CDS 3601 - 4959 1002 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 5 2 Op 2 1/0.000 + CDS 4949 - 7129 1323 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 + Prom 7132 - 7191 5.7 6 2 Op 3 . + CDS 7241 - 7498 422 ## PROTEIN SUPPORTED gi|19705275|ref|NP_602770.1| SSU ribosomal protein S15P + Term 7511 - 7547 4.1 Predicted protein(s) >gi|224461345|gb|ACDC01000057.1| GENE 1 116 - 1705 2338 529 aa, chain + ## HITS:1 COG:FN1975 KEGG:ns NR:ns ## COG: FN1975 COG0513 # Protein_GI_number: 19705271 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 529 1 528 528 915 94.0 0 MEQLEKLKEFRELGLGEKVLKVLSKKGYESPTPIQKLTIPALLKNDKDIIGQAQTGTGKT AAFSLPIIENFETSDHHIQAIVLTPTRELALQVAEEMNSLSTSKKMKVIPVYGGQSIDIQ RKLIKTGVDVVVGTPGRVIDLIERKLLKLNSLKYFVLDEADEMLNMGFIEDIEKILTFTN DDKRMLFFSATMPPEIMKIAKTHMKEYEVLAVKSRELTTDLTEQIYFEVNERDKFEALCR IIDLTKEFYGIIFCRTKTDVNEIVGRLNDRGYDAEGLHGDIGQNYREVTLKRFKTKKINI LVATDVAARGIDINDLSHVINYAVPQEVESYVHRIGRTGRAGKEGTAITFITPQEYRRLL QIQKAVKKEIRKESLPDVKDVIQAKKFRIIDDIGQILIDNDYDKFKKLAKDLLNMEEAEN IVASLLKLTYSDVLDESNYNEISPVKMEDTGKTRLFIAMGRKDGMTPKKLVDFIVKKAKV KQAYIKNAEVYDAFSFVSVPFKEAEIIVEAFAEIRKGKKPLIEKAKSKK >gi|224461345|gb|ACDC01000057.1| GENE 2 1721 - 2377 737 218 aa, chain + ## HITS:1 COG:HI0977 KEGG:ns NR:ns ## COG: HI0977 COG2184 # Protein_GI_number: 16272915 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Haemophilus influenzae # 21 195 12 186 191 174 50.0 9e-44 MNKYNFTETDKTILKRLVDEKEEEYLSKKRAKDLFEKDILSKVDLGTFKSLQAIHKYLFQ 
DCFETAGLVRKHDIRKGDTLFCKAMYLEDNLRTVSNMKEDTFEDIIEKYVEMNMMHPFYE GNGRATRIWLDFLLIKRLGKCVDWKKIDKEDYLSVMRRSVINSLELKTLLRDNLTDDINN RDLYMSNINQSYSYENMTNYDANNLDEETELKEKYSKK >gi|224461345|gb|ACDC01000057.1| GENE 3 2387 - 3322 1073 311 aa, chain + ## HITS:1 COG:FN1976 KEGG:ns NR:ns ## COG: FN1976 COG1559 # Protein_GI_number: 19705272 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Fusobacterium nucleatum # 17 309 17 309 310 488 89.0 1e-138 MKKLLAIVSIVIIILAGTSAYQLSKKDKYNLVLEIDKDKPLKESLSTLPVSNNPFFKLYL KFRNSGRNIKAGSYELRGKYNIVELISMLESGKSKVFKFTIIEGSTVKNVIDKLVANGKG TRENYIKAFKEIDFPYPTPEGNFEGYLYPETYFIPESYDEKAVLNIFLKEFLKRFPVEKY TDKEEFYQKLIMASILEREAALDSEKPLMASVFYNRIAKNMTLSADSTVNFVFNYEKKRI YYKDLEVQSPYNTYKNKGLPPGPICNPTVSSVDAAYNPADTEFLFFVTKGGGAHFFSKTY KEHLDFQKNNK >gi|224461345|gb|ACDC01000057.1| GENE 4 3601 - 4959 1002 452 aa, chain + ## HITS:1 COG:FN1977 KEGG:ns NR:ns ## COG: FN1977 COG0037 # Protein_GI_number: 19705273 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 1 448 1 447 448 567 79.0 1e-161 MELFREILKLNEEYNLIENNDTIVVGFSGGPDSVFLVEMLKKLQDFFKFKIYLVHINHLL RGEDADADESFSYDYAKRNNLEIFVKRIPVKEIAKKTGKTLEEVGREERYNFFSEIYDKV GANKIATAHNKDDQIETFLFRLVRGTSLQGLEGIKLKYNNIIRPISEIYKKDILEYLNKN EIQYKIDKTNFENEFTRNSIRLDLIPFIEKRYNIKFKDKLFSLIEEIRENNKKNFLDLDE YVDEENRLTLEKIKTLSLFERKNLLVHFLNKKNIKINRNKIDEINSLIKSDGTKKIDLDL NFRIVKDYHHLYIEKKEEEPFSYLNETLQLKIPSETYFDKYKIKVEFVENKEKTKYKNQY LLYAMNNDIIEVRYRREGDRILLDENHSKKLKEVLINQKVPRDVRDRIPLFLYKNNIFWI YGIKKAYVPKENKNINELRQVLITVEEVMNEG >gi|224461345|gb|ACDC01000057.1| GENE 5 4949 - 7129 1323 726 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 108 721 13 594 636 514 46 1e-145 MKDNQFEDEDLKNDSQVPENQENKINEEEKINEEVKQEEEKQEDKKEEEPKQEEPKPEEN SEKEEDKKENKQEEKKEEKRYNNNKREEERKRVVGKAVRVNFNLKGLLMLVFIITLFAVA PKIMEESKTQDYVDISYSDFIKNIESKKIGVVEEKDGYVYGYKANETKYLDNKSNSLKSK LGFDSKTGVQGLKARLITNRLGEDSNLVAVINENGALIQSTEPPQPSLFLSIVLSLLPYV IMIGLLVFMMNRMGKGSGGGGPQIFNMGKSKAKENGEDISDVTFADVAGIDEAKQELKEV VDFLKEPEKFKKIGAKIPKGVLLLGEPGTGKTLLAKAVAGEAKVPFFSMSGSEFVEMFVG VGASRVRDLFNKARKNAPCIVFIDEIDAVGRKRGTGQGGGNDEREQTLNQLLVEMDGFGT DETIIVLAATNRADVLDKALKRPGRFDRQVVVDMPDIKGREEILKVHAKNKKFAPDVDFK IIAKKTAGMAGADLANILNEGAILAARAGRTEITMADLEEASEKVQMGPEKRSKVVSDTD KKIVAYHESGHAIVNFVVGGEDKVHKITMIPRGQAGGYTLSLPAEQKLVYSKKYFMDEIA IFFGGRAAEEIIFGKDNITSGASNDIQVATGMAQQMVTKLGMSEKFGPILLDGTREGDMF QSKYYSEETGKEIDDEIRSIINERYQKALSILNENRDKLEEVTRILLEKETIMGDEFEAI MKNEHI >gi|224461345|gb|ACDC01000057.1| GENE 6 7241 - 7498 422 85 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19705275|ref|NP_602770.1| SSU ribosomal protein S15P [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 85 1 85 85 167 100 3e-41 MRTKAEIIKEFGKSEADTGSTEVQIALLTEKINHLTEHLRVHKKDFHSRLGLLKMVGQRK RLLAYLTKKDLEGYRALIAKLGIRK Prediction of potential genes in microbial genomes Time: Thu May 19 23:12:10 2011 Seq name: gi|224461344|gb|ACDC01000058.1| Fusobacterium sp. 2_1_31 cont1.58, whole genome shotgun sequence Length of sequence - 13512 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 47 - 1135 1337 ## COG5438 Predicted multitransmembrane protein
+ Term 1208 - 1246 5.4
+ Prom 1137 - 1196 9.1
2 2 Op 1 32/0.000 + CDS 1259 - 1729 490 ## COG0779 Uncharacterized protein conserved in bacteria
3 2 Op 2 22/0.000 + CDS 1756 - 2817 639 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA
4 2 Op 3 15/0.000 + CDS 2810 - 3340 810 ## PROTEIN SUPPORTED gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae
5 2 Op 4 32/0.000 + CDS 3354 - 5552 3456 ## COG0532 Translation initiation factor 2 (IF-2; GTPase)
6 2 Op 5 1/0.000 + CDS 5567 - 5929 559 ## COG0858 Ribosome-binding factor A
7 2 Op 6 1/0.000 + CDS 5937 - 7601 1685 ## COG0608 Single-stranded DNA-specific exonuclease
8 2 Op 7 29/0.000 + CDS 7618 - 8907 2030 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor)
+ Term 8956 - 8996 5.2
+ Prom 8926 - 8985 5.2
9 3 Op 1 24/0.000 + CDS 9049 - 9630 869 ## COG0740 Protease subunit of ATP-dependent Clp proteases
10 3 Op 2 18/0.000 + CDS 9641 - 10933 1863 ## COG1219 ATP-dependent protease Clp, ATPase subunit
+ Prom 10978 - 11037 7.3
11 3 Op 3 4/0.000 + CDS 11075 - 13381 3379 ## COG0466 ATP-dependent Lon protease, bacterial type
12 3 Op 4 .
+ CDS 13396 - 13510 199 ## COG0218 Predicted GTPase Predicted protein(s) >gi|224461344|gb|ACDC01000058.1| GENE 1 47 - 1135 1337 362 aa, chain + ## HITS:1 COG:FN1980 KEGG:ns NR:ns ## COG: FN1980 COG5438 # Protein_GI_number: 19705276 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Fusobacterium nucleatum # 1 362 1 369 369 511 78.0 1e-145 MKKFFVLIIFLLSSVLIFAEGTKEEYLSGKIIELVSEEKSDEEGVAKLQKFNVKLLEGVD KGEVVEIDFPVYTAKEYNIDVKVGDRVVVFKTFDDYGNDEMQMQYYISDVDKRMEIYIMG IIFVALVLVIARKNGLKALFALIVTVAFIVKIFIPAVFNGYSPILFAVITAIFSSLVTIY YTVGMNKKFFVSLLGVIGGVVVAGILSYIFTYRMRLNGYLDPELLASASILKNINLKEVI PAGVIIGSLGAVMDVAVSIASSINELHETDPNMSQKSMFKSVINIGTDIIGTMINTLILA YIASSVFTLLLVYAQAGEYPIIRLLNFQDIAVEIMRSVCGSIGILISVPLTAYIGTLIYK QK >gi|224461344|gb|ACDC01000058.1| GENE 2 1259 - 1729 490 156 aa, chain + ## HITS:1 COG:FN2023 KEGG:ns NR:ns ## COG: FN2023 COG0779 # Protein_GI_number: 19705319 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 156 1 156 156 237 87.0 5e-63 MEDNSQIIEKITKIVNPFVEEMNLSLVDVEYLQDGGYWYVRVFIENLNGDLSIEDCSKLS SKIEDKVEELIEHKFFLEVSSPGLERALKKLEDYIRFTGEKITLHLKHKLNDKKQFKAVI KEVKGDNIVFLIDKKEVEIEFKEIRKANILFEFNDF >gi|224461344|gb|ACDC01000058.1| GENE 3 1756 - 2817 639 353 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 10 351 11 350 537 250 39 3e-66 MKAKDSKIFLEALDELEKEKGISKESVLEAIELALLAAYKKNYGEDENVEVIVDRESGEI KVLASKTVVDADDLLDPNEEISLEDAKEIKKRAKIGDVLKFEVSCDNFRRNAVQNGKQIV IQKVREAEREHIYEKFKERENDIVSGIIRRIDNKKNIFIEIDGIELILPPAEQSYSDIYR VGERIKVFVYNVEKTNKFPKILISRKNEGLLKKLFEIEIPEISAGIIEIKSVAREAGSRA KVAVYSEVPNIDTVGACIGQKGTRIKNIVDELNGERIDIVEWKESMEQFVSAVLSPAVVS SVEILEDGTAKVLVEPSQLSLAIGKNGQNARLAARLTGTRVDIKVLEKEDDDE >gi|224461344|gb|ACDC01000058.1| GENE 4 2810 - 3340 810 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae [Fusobacterium sp. 
4_1_13] # 1 176 1 176 176 316 86 5e-86 MSNTHIPERTCVLCRAKKDKSKLFRLAKVKEGFYEFDKEQKKQTRAVYVCKSLTCLGRLA KHSKVKLDSQDLMAMLSIINKANKNYLNILNSMKNSGELVFGINLLFENIEHIHFIVLAQ DISKKNEEKILRRISELKIPYVTAGTMEELGKIFNKEEITVIGIKDKKMARGLIED >gi|224461344|gb|ACDC01000058.1| GENE 5 3354 - 5552 3456 732 aa, chain + ## HITS:1 COG:FN2020 KEGG:ns NR:ns ## COG: FN2020 COG0532 # Protein_GI_number: 19705316 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Fusobacterium nucleatum # 1 732 1 737 737 1149 92.0 0 MKVRVHELAKKYELKNKEFLEILKKDIGVSVTSHLSNLDEDQVKKIDDYFAKMNMLKVET VEPVKMYKEKKEEKPIRKIIDEDEIEEGQKNNKKPKIQQKIKKNNNITFDEDGNSHKNKS KKKKGRRTDFVLKTVEATPDVVEEDGIKIIKFRGELTLGDFAEKLGVNSGEIIKKLFLKG QMLTINSPITLEMAEELAGEYDALVEEEQEVELDFGEKFALEIEDREADLKERPPVITIM GHVDHGKTSLLDAIRTTNVVEGEAGGITQKIGAYQVVKDGKRITFIDTPGHEAFTDMRAR GAQVTDIAILVVAADDGVMPQTVEAISHAKVAKVPIIVAVNKIDKPEANPMKVKQELMEH GIVSVEWGGDVEFVEVSAKKKINLDGLLDTILITSEILELKGNVKKRAKGVVLESRLDPK IGPIADILVQEGTLKIGDVIVAGEVQGKVKALLNDKGERVNTAIVSQPVEVIGFNNVPDA GDTMYVIQNEQHAKRIVEEVRKERKIQETTKKTISLESLSDQLKHEDLKELNLILRADSK GSVDALRDSLLKLSNDEVAVNIIQAASGAITESDIKLAEAAGAIIIGFNVRPTTKALKEA ETNKVEIRTSGIIYHIIEDIEKALAGMLDPEFKEEYQGRIEIKKVFKVSKVGNVAGCVVI DGKVKNDSNIRILRDNVVIYEGKLASLKRFKDDAKEVVAGQECGLGVENFNDIKEGDVVE AFEMVEIKRTLK >gi|224461344|gb|ACDC01000058.1| GENE 6 5567 - 5929 559 120 aa, chain + ## HITS:1 COG:FN2019 KEGG:ns NR:ns ## COG: FN2019 COG0858 # Protein_GI_number: 19705315 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Fusobacterium nucleatum # 1 119 1 119 120 176 92.0 7e-45 MKKQRLEGIGKEMMRVISKVLLEEVKNPKIKGLVSVTEVNVTEDLKFADTYFSILPPLNN EEKQYDNEEILEALNEIKGFLRKRVAEEVDIRFTPEIRVKLDNSMENAMKITKLLNDLKA >gi|224461344|gb|ACDC01000058.1| GENE 7 5937 - 7601 1685 554 aa, chain + ## HITS:1 COG:FN2018 KEGG:ns NR:ns ## COG: FN2018 COG0608 # Protein_GI_number: 19705314 # Func_class: L Replication, recombination and repair # Function: 
Single-stranded DNA-specific exonuclease # Organism: Fusobacterium nucleatum # 1 554 3 556 556 879 85.0 0 MLDEKSTEELIKDLLEKRGQESQHQIEKFMNPEYKDFKNPFDFENMEKIVNRIILARENK EKIFIYGDYDVDGISGTAFLTRFFNEIGIDTNYYIPSRNETDYGVSKKSIDYFHKRQGKL VITVDTGYNTIEDVRYAKSLGMEVIVTDHHKTVKEKFDDEILYLNPKLSKSYKFQYLSGA GVAFKLAQGLCMSLGLDMGIIYKYLDIVMIGTIADVVPMIDENRLIIKKGLKIIKNTKVK GLSYLLNYLRLNKKTLTTTDVSYYISPLINSLGRVGISRMGADFFLKEDEFDLYNIIEEM KEQNRQRRTLEKYIYDDAMRKIKNLKLPLDKLSVIFLSSSKWHPGVIGVVSSRLTIKFNV PVILVAIDGDYGKASCRSVGNISIFNLLSNVKHLLERYGGHDLAAGFVVHKEKLNELREY FIRTIPRLKEEDNKAKKDYEKSFDFELSVKDLGEKAFDFMEKMGPFGSSNPHPLFFDSNL KLDNIKRFGVDFRHFNGIIYKDNVSYNAVGFELADEIKEDYINKTYNIVYYPEKIILNNE EVTQIILKSIKENK >gi|224461344|gb|ACDC01000058.1| GENE 8 7618 - 8907 2030 429 aa, chain + ## HITS:1 COG:FN2017 KEGG:ns NR:ns ## COG: FN2017 COG0544 # Protein_GI_number: 19705313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Fusobacterium nucleatum # 1 429 1 429 429 645 87.0 0 MKYEVKKLEKSAVEVKLHLDAAEVSPLVDKVLKHVGEHAEVAGFRKGHAPKEALMANYKD HIESDVANDAINAYFPEIVEKEKLEPVSYVRLKEIALKDDLDLTFDIDVYPEFTLGNYKG LEAEKKTFEMTDDLLNTELEMMQRNHSKLVEVEDPSYKAQLEDTVDLAFEGFMDGVPFPG GKAESHLLKLGSKSFIDNFEEQLVGYTKGQEGEITVKFPEEYHAPELAGKPAQFKVKINA IKQLKEPELNDEFAKELGYESLEDLKNKTKEETIKRENDRIENEYIGALLDKLMETTTID VPVSMVQAEIQNRLKELEYQLSMQGFKMDDYLKMMGGNIDTFAAQLTPAAEKKVKVDLIL DKIARENKFEASEEELNGRMEEIAKMYGMDVPTLEGELKKNNNLDNFKASVKYDIVMKKA IDEVVKNAK >gi|224461344|gb|ACDC01000058.1| GENE 9 9049 - 9630 869 193 aa, chain + ## HITS:1 COG:FN2016 KEGG:ns NR:ns ## COG: FN2016 COG0740 # Protein_GI_number: 19705312 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Fusobacterium nucleatum # 1 193 1 193 193 359 95.0 2e-99 MYNPTVIDNNGKSERAYDIYSRLLKDRIIFVGTAIDENVANSIIAQLLYLESEDPEKDII 
MYINSPGGSVTDGMAIYDTMNYIKPDVQTVCVGQAASMGAFLLSSGAKGKRFALENSRIM IHQPLISGGLKGQATDISIHANELLKIKDRLAELLARNTGKTKEQILNDTERDNYLSSEE AVRYGLIDSVFKR >gi|224461344|gb|ACDC01000058.1| GENE 10 9641 - 10933 1863 430 aa, chain + ## HITS:1 COG:FN2015 KEGG:ns NR:ns ## COG: FN2015 COG1219 # Protein_GI_number: 19705311 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent protease Clp, ATPase subunit # Organism: Fusobacterium nucleatum # 1 422 1 423 423 750 91.0 0 MSKKLDKCSFCGRTEREVAQLFQGPGDVFICDNCVESCHNLLREDMYSLAREYDMLKDGK SSKRNYKDKVELLKPIEIKAKLDEYVVGQDEAKKVLSVAVYNHYKRILNGGQDEDGVELQ KSNVLLIGPTGSGKTLLAQTLARILNVPFAIADATTLTEAGYVGDDVENVLVRLIQACNY DIPNAERGIIYIDEFDKIARKSENVSITRDVSGEGVQQALLKIIEGTKSQVPPEGGRKHP NQELIEIDTKNILFIVGGAFEGLEKVIKSRTNKKVIGFGAEVQKQEMAGAEGEFFKKVLP EDLVKQGIIPELVGRLPVITTLDNLDEQTLINILTKPKNAIVKQYQKLCRLEGAKLEFTE EALTEIARRALKRKMGARGLRAIIEHTMLDIMFELPSNNKIKEITITKDAIDNYKEAKIE YKAEEQIITN >gi|224461344|gb|ACDC01000058.1| GENE 11 11075 - 13381 3379 768 aa, chain + ## HITS:1 COG:FN2014 KEGG:ns NR:ns ## COG: FN2014 COG0466 # Protein_GI_number: 19705310 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Fusobacterium nucleatum # 1 768 1 768 768 1336 93.0 0 MSKAPFLPIRDLVIFPNVVTPIYVGRANSIATLEKAIANKTKLVLGLQKDASEENPTFDG DIYEVGVIANIVQIIRMPNNNIKVLVEAESRVKIKDIETEDKEYFATYTVIKETLKDSKE TEAIYRKVFTRFEKYISMIGKFSSELILNLKKIEDYSNGLDIMASNLNISAEKKQEILEI SNVKDRGYKILDDIVAEMEIASLEKTIDEKVKTKMNEAQRAYYLKEKISVMKEELGDFSQ DDDVIEIVDRVKDADIPKEVREKLEAEIKKLTKMQPFSAESSVIRNYIEAVLDLPWNKET KDVLDLKKASEILERDHYGLKDAKEKVLDYLAVKTLNPSMNGAILCLSGPPGIGKTSLVK SIAESMGRKFVRVSLGGVRDEAEIRGHRRTYVGSMPGKIMKAMKEAGTKNPVILLDEIDK MSNDYKGDPASAMLEVLDPEQNKSFEDHYIDMPFDLSKVFFVATANDLRTVSAPLRDRMD ILQLSSYTEFEKLHIAQNFLLKQAQKENGLADVEIKIPDKVMFKLIDEYTREAGVRNLKR EIINICRKLAREVVEKKVKKFNLKASDLEKYLGKAKFRPEKSRKADGKVGVVNGLAWTAV GGVTLDVQGVDTAGKGDVTLTGTLGNVMKESASVAMTYVKANLKKYPPKDENFFKDRAIH 
LHFPEGATPKDGPSAGITITTAIVSVLTNRKVRQDIAMTGEITITGDVLAIGGVREKVIG AHRAGIKEVILPEDNRVDTDEIPDELKSTMKIHFAKTYDDVSKLVFVK >gi|224461344|gb|ACDC01000058.1| GENE 12 13396 - 13510 199 38 aa, chain + ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 38 1 38 194 70 100.0 6e-13 MKIKKADFVKSAVYEKDYPEQLDKMEFAFVGRSNVGKS Prediction of potential genes in microbial genomes Time: Thu May 19 23:12:15 2011 Seq name: gi|224461343|gb|ACDC01000059.1| Fusobacterium sp. 2_1_31 cont1.59, whole genome shotgun sequence Length of sequence - 12345 bp Number of predicted genes - 16, with homology - 14 Number of transcription units - 6, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 301 393 ## FN0111 hypothetical protein 2 1 Op 2 . - CDS 345 - 821 110 ## FN0146 hypothetical protein 3 1 Op 3 . - CDS 818 - 1261 367 ## FN0145 hypothetical protein - Prom 1370 - 1429 16.7 + Prom 1324 - 1383 16.6 4 2 Op 1 . + CDS 1442 - 2767 1503 ## COG1106 Predicted ATPases 5 2 Op 2 . + CDS 2773 - 3366 639 ## FN1087 hypothetical protein + Term 3367 - 3432 6.8 + Prom 3406 - 3465 10.1 6 3 Tu 1 . + CDS 3514 - 3579 111 ## + Prom 3854 - 3913 4.3 7 4 Tu 1 . + CDS 3936 - 4487 660 ## COG1971 Predicted membrane protein + Term 4505 - 4542 3.1 + Prom 5154 - 5213 12.8 8 5 Op 1 . + CDS 5304 - 5378 63 ## 9 5 Op 2 30/0.000 + CDS 5350 - 6480 1343 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 10 5 Op 3 36/0.000 + CDS 6467 - 7324 519 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 11 5 Op 4 1/0.000 + CDS 7314 - 8099 777 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 12 5 Op 5 . 
+ CDS 8157 - 8984 1368 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 9005 - 9058 13.5 + Prom 9011 - 9070 7.9 13 6 Op 1 . + CDS 9288 - 9776 606 ## FN1814 hypothetical protein 14 6 Op 2 1/0.000 + CDS 9786 - 10700 1148 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 15 6 Op 3 25/0.000 + CDS 10713 - 11621 1400 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 16 6 Op 4 . + CDS 11621 - 12310 234 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein Predicted protein(s) >gi|224461343|gb|ACDC01000059.1| GENE 1 1 - 301 393 100 aa, chain - ## HITS:1 COG:no KEGG:FN0111 NR:ns ## KEGG: FN0111 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 100 1 91 91 106 83.0 2e-22 MNQKERIDKMEKILNNSTKLLEELEEILNKLDKDSKNYNELVKYYYSKNWAKDKEDFEKD LLPDVEAAGVLTEDSIYDMMTTSSGLAIQMLELATKMLKR >gi|224461343|gb|ACDC01000059.1| GENE 2 345 - 821 110 158 aa, chain - ## HITS:1 COG:no KEGG:FN0146 NR:ns ## KEGG: FN0146 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 30 151 2 123 128 84 67.0 1e-15 MRFSTAKFTEIAKTSILIILILNFFIIFFEIKYIEYIYLIIFFIFNFLVKIIEIKNYKNI YEKLKEMLKAKRNFFVAVNILIFFYIFKEPYFFLFKHKILILLGSVIIGYFSLIFIQKIK IKLSLNFFKEIVKIFLISAIIYYLPILSVQLGKFFYIF >gi|224461343|gb|ACDC01000059.1| GENE 3 818 - 1261 367 147 aa, chain - ## HITS:1 COG:no KEGG:FN0145 NR:ns ## KEGG: FN0145 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 142 14 157 162 135 63.0 4e-31 MRKNTKLNFMYILVVIPIYFFSTIFVNHFQQYEYIFKDINNIKKYQVTSNNDASYAYIKL DNNLYTEGEFSTFEIKKVYEDEYYFIAYYFKEKNYIVVDKKLASMKIYNEKEFKEKYRDI NDEKFVDIYNFLKRKGTKIGIHREVLL >gi|224461343|gb|ACDC01000059.1| GENE 4 1442 - 2767 1503 441 aa, chain + ## HITS:1 COG:FN1086 KEGG:ns NR:ns ## COG: FN1086 COG1106 # Protein_GI_number: 19704421 # Func_class: R General function prediction only # Function: 
Predicted ATPases # Organism: Fusobacterium nucleatum # 1 441 1 441 441 671 91.0 0 MFTYIKLKNYKSLIELEVDLTKKENTPKKLISIYGENGVGKSNFVDSFYTLKRIVSTRTI NEKIRILTEKQKELQTDDFDKALYFFGQLGSIIKNGFFSNSIDIINECKTIDSKDNMIIE VGFKIKSKSGVYCIETDDTDIISEKLDFTLNKNKVNFFEITKKEKYLNESVFIDNEYKKE IFSIIEKYWGKHSLLSLIAYEIEDKKENYVKKKIFNGIFEVINFFSSLSILSRNKMEVFK DIEEEKLFYGTLSVSEEKKLTKIENVINTFFTSLFSDIKQAYYKKKFDNDKINYILYFKK NIHNKLIDIEYNIESTGTKNILKILPYLISAAKGKTVIIDEIDNGIHDLLMLKILENLSE DLKGQLIITTHNTLLLEEEFIKDSIYIFKVDENANKKLLALNKFEGRVHPNLSIRKRYLK GLYGGIPFPMDIDFNELIEGV >gi|224461343|gb|ACDC01000059.1| GENE 5 2773 - 3366 639 197 aa, chain + ## HITS:1 COG:no KEGG:FN1087 NR:ns ## KEGG: FN1087 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 196 5 192 192 287 87.0 2e-76 MSKSSYSYSFLKGIVICHGKSEKLICEFLRSNLRIQIEIDSDKKGKKSIQITSIMKFLSG EKYKNIASFKNKFDDIEQIKDKKKLPSYFKIFIIMDTDDCNENQKKSFKDKSMFKEHWLY DYIVPIYNDSNLEEVLVDVGIKFQKSGNERKSEYPKVFPMNGISDIESIKKFGNRLKKSK KTNMEEFIDFCLELIEK >gi|224461343|gb|ACDC01000059.1| GENE 6 3514 - 3579 111 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTQRFRRGYEVIGSKSGLSL >gi|224461343|gb|ACDC01000059.1| GENE 7 3936 - 4487 660 183 aa, chain + ## HITS:1 COG:FN1615 KEGG:ns NR:ns ## COG: FN1615 COG1971 # Protein_GI_number: 19704936 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 183 1 183 183 243 86.0 2e-64 MSTIAVLITALALAMDAMSLSIYQGIASTENQRKQNFIKIILTFGIFQFAMALVGSLSGS LFVHYISLYSKYISFAIFLFLGLMMLKEALKKEEMEYDEKYLDIKTLIIMGVATSLDALL VGLTYSILPFHKVLVYTVEIGIITAIISGLGFVVGNKFGDILGQKSHFLGAALLIFISIN TLI >gi|224461343|gb|ACDC01000059.1| GENE 8 5304 - 5378 63 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKYKNNIGIGGKKGIGKKRYKNS >gi|224461343|gb|ACDC01000059.1| GENE 9 5350 - 6480 1343 376 aa, chain + ## HITS:1 COG:FN1797 KEGG:ns NR:ns ## COG: FN1797 COG3842 # Protein_GI_number: 19705102 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, 
ATPase components # Organism: Fusobacterium nucleatum # 1 376 1 376 376 685 93.0 0 MEKKDIKIVNVNKSFDGVQILKDINLTIEQGEFFSIIGPSGCGKTTLLRMIAGFISPDSG AIYLGDENIVDLPPNLRNVNTIFQKYALFPHLNVFENVAFPLRIKKTDEKTINEEVMKYL KLVGLDEHSTKKVSQLSGGQQQRVSIARALINKPGVLLLDEPLSALDAKLRQNLLIELDL IHDEVGITFIFITHDQQEALSISDRIAVMNAGKVLQVGTPAEVYEAPADTFVADFLGENN FFSGKVTGIINEELAKIDLEGIGEIIIEQDKKVQIGDKVTVSLRPEKIRLSKNEITKSKN CINSVAVYVDEYIYSGFQSKYYVHLKNNKDLKFKIFLQHAAFFDDNDEKAIWWDEDAYIT WDAFDGYLVEVESEKK >gi|224461343|gb|ACDC01000059.1| GENE 10 6467 - 7324 519 285 aa, chain + ## HITS:1 COG:FN1798 KEGG:ns NR:ns ## COG: FN1798 COG1176 # Protein_GI_number: 19705103 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Fusobacterium nucleatum # 1 284 1 284 284 463 94.0 1e-130 MKKNSKLGLGYSLPINIWLTLFFLIPILIILSYSFLKRSTYGGVEFKLSFETFNIFVDKV FLTILVNTIYISVLITIFTVLIAIPISYYIARSRHKQELLFLIIIPFWTNFLVRIYSWIA LLGNNGFINHFLMKFHLINEPIKMLYNVPAVVVISVYTSLPFAILPLYAVVEKFDFSLLD AARDLGATNFQAFRKVFLPNIKAGIITSTIFTLIPALGSYAVPKLVGGTNSLMLGNVIAQ HLTVTRNWPLASTISGALIVLTSIVLWVFSKYEEKENKVGEKNVK >gi|224461343|gb|ACDC01000059.1| GENE 11 7314 - 8099 777 261 aa, chain + ## HITS:1 COG:FN1799 KEGG:ns NR:ns ## COG: FN1799 COG1177 # Protein_GI_number: 19705104 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Fusobacterium nucleatum # 1 261 4 264 264 395 92.0 1e-110 MSNKLDRRKTSFVIFILTMIFFYLPLAVLVIYSFNNGKGMAWQGFSLRWYKELFRHSSNI WKAFYYSIFIALISSFVSTVIGTFGAIALKWFDFKGKKYLKNISVLPLVVPDIIIGVSLL IMFATVKFKLGITTIFIAHTTFNIPYVLFIILSRLDEFDYSVVEAAYDLGATNRQTLTKV IIPMLLPAIVSAFLMALTLSFDDFVITFFVSGPGSSTLPLRIYSMIRLGVSPVVNALSVL LIAISILLTLSTKKLQKNFIK >gi|224461343|gb|ACDC01000059.1| GENE 12 8157 - 8984 1368 275 aa, chain + ## HITS:1 COG:FN1800 KEGG:ns NR:ns ## COG: FN1800 COG0652 # Protein_GI_number: 19705105 # Func_class: O Posttranslational modification, protein turnover, chaperones # 
Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium nucleatum # 1 275 1 274 274 442 82.0 1e-124 MKKIIKLLSIIGLSLMFLVSCSSVKSTMKSITSVFKDPVKYNNVTATFVTTQGEITFYLY PEAAPITVANFINLAKRGFYDNTKFNRSVENFMVQGGDPTGTGMGGPGYTIPDEFVEWLD FYQPGMLAMANAGPNTGGSQFFMTFAPADWLNGVHTIFGEVRSEGDAIKVRKLEMGDVIK EVRISDNGDFILALFKPQVEEWNRILDREYPNLRKYPVRDVTAQEVEAYKEELENLYTKK EKKNQDTFEYPITKFIRGVFNKAGGYTPREPVISN >gi|224461343|gb|ACDC01000059.1| GENE 13 9288 - 9776 606 162 aa, chain + ## HITS:1 COG:no KEGG:FN1814 NR:ns ## KEGG: FN1814 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 158 31 188 192 227 72.0 9e-59 MAGIALKIRHRTDYKINLQENEIISYEVLDNIELGLYSDIKNSLVDIAQLKEENGALPDI EVLAEEEIPPYYKDVTWEQRGAMEWKKIKHDNEDYYVGIGNERIGTFVVKFNDENIDESD IFYMKDKVSFKDIEKNFEKYEHIMKKIVPYTGNDERQKYIAK >gi|224461343|gb|ACDC01000059.1| GENE 14 9786 - 10700 1148 304 aa, chain + ## HITS:1 COG:FN1813 KEGG:ns NR:ns ## COG: FN1813 COG0803 # Protein_GI_number: 19705118 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 1 301 1 301 302 473 87.0 1e-133 MKKIILLMFLVLNVFVMAGEKLKIGITLLPYYSFVANIVKDRAEVIPIVKAEGFDSHTYQ PKVEDIERASKVDAIVVNGVGHDEFVYKIIDAVDKKDRPVIINANKDVPLMPVAGTLGNE KIMDSHTFISITAAIQQVHNITKEIIKLDPKNKDFYLANSREYVKKLRKLKTDALKEVQD VNGTDVRVATFLGGYNYLLSEFGIDVKAVLEPTHGSQISMSSLQKMIEKIKKEKIDIIFG EKNYSDEYVSIIKNETGIEVRKLEHLTTGAYRADSFEKFIKVDLDEVVSAIKYVKNKNKN RTKK >gi|224461343|gb|ACDC01000059.1| GENE 15 10713 - 11621 1400 302 aa, chain + ## HITS:1 COG:FN1812 KEGG:ns NR:ns ## COG: FN1812 COG0803 # Protein_GI_number: 19705117 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 1 302 1 302 302 518 93.0 1e-147 MYKKLLAILMLIFSFSAMAKDKLKIGVTLQPYYSFVVNIVKDKAEVIPVVRLDKYDSHSY 
QPKPEDIKRMNELDVLVVNGVGHDEFIFDILNAADRKKEIKVIYANKNVSLMPIAGSIRK EKVMNPHTFISITTSIQQVYNIAKELGELDPANKEFYLKNSREYAKKLRKLKADALNEVK KLGNIDIRVATLHGGYDYLLSEFGIDVKAVIEPSHGAQPSAADLEKVIKIIKNEKIDIIF GEKNFNNKFVDTIHKETGVEVRSLSHMTNGAYEPDSFEKFIKVDLDEVVKAIKDVAAKKG KK >gi|224461343|gb|ACDC01000059.1| GENE 16 11621 - 12310 234 229 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 16 228 16 223 311 94 30 3e-19 MNGLEIQIKDLNLVLSGNEILENINLTVKAGEIHCLVGPNGGGKTSLLRCILGQMPFTGS IEMKYEKDRVIGYVPQVLDFERTLPITVEDFMAMTNQTRPCFLGISKKHKETVDNLLKKL GVYEKKKRLLGNLSGGERQRVLLAQALFPKPNLLILDEPLTGIDKAGEEYFKEIIKELKE EGITILWIHHNLAQVKELADTVTCIKKRMIFSGDPKEELKEDKIMRIFE Prediction of potential genes in microbial genomes Time: Thu May 19 23:12:40 2011 Seq name: gi|224461342|gb|ACDC01000060.1| Fusobacterium sp. 2_1_31 cont1.60, whole genome shotgun sequence Length of sequence - 18607 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 8, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 + CDS 73 - 966 1123 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 2 1 Op 2 . + CDS 968 - 1846 1263 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 3 1 Op 3 . + CDS 1859 - 2278 802 ## FN1808 hypothetical protein 4 1 Op 4 . + CDS 2353 - 3147 1292 ## COG5266 ABC-type Co2+ transport system, periplasmic component + Term 3174 - 3227 5.1 - Term 3162 - 3213 8.5 5 2 Tu 1 . - CDS 3221 - 4087 705 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 4115 - 4174 9.1 6 3 Op 1 . 
- CDS 4189 - 5385 1686 ## FN1805 hypothetical protein 7 3 Op 2 1/0.500 - CDS 5422 - 6882 2169 ## COG2195 Di- and tripeptidases 8 3 Op 3 1/0.500 - CDS 6913 - 7602 533 ## COG1309 Transcriptional regulator - Prom 7625 - 7684 10.4 - Term 7730 - 7791 11.1 9 4 Op 1 8/0.000 - CDS 7795 - 8934 1077 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 10 4 Op 2 . - CDS 8921 - 9184 251 ## COG1396 Predicted transcriptional regulators - Prom 9206 - 9265 11.0 11 5 Tu 1 . - CDS 9268 - 9798 217 ## gi|237739263|ref|ZP_04569744.1| predicted protein - Prom 9935 - 9994 8.3 - Term 10295 - 10349 5.5 12 6 Op 1 . - CDS 10367 - 11044 728 ## COG1738 Uncharacterized conserved protein 13 6 Op 2 . - CDS 11071 - 11784 740 ## FN1995 hypothetical protein - Prom 11963 - 12022 11.3 - Term 12100 - 12134 2.0 14 7 Op 1 . - CDS 12339 - 12773 516 ## FN1994 hypothetical protein 15 7 Op 2 1/0.500 - CDS 12778 - 13431 776 ## COG0009 Putative translation factor (SUA5) 16 7 Op 3 11/0.000 - CDS 13431 - 14381 1425 ## COG0462 Phosphoribosylpyrophosphate synthetase 17 7 Op 4 1/0.500 - CDS 14383 - 15726 1957 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 18 7 Op 5 . - CDS 15749 - 16552 945 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain - Prom 16658 - 16717 9.7 + Prom 16680 - 16739 12.0 19 8 Op 1 . + CDS 16863 - 17954 1746 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 20 8 Op 2 . + CDS 18020 - 18181 93 ## gi|237739272|ref|ZP_04569753.1| predicted protein 21 8 Op 3 . 
+ CDS 18168 - 18606 403 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|224461342|gb|ACDC01000060.1| GENE 1 73 - 966 1123 297 aa, chain + ## HITS:1 COG:FN1810 KEGG:ns NR:ns ## COG: FN1810 COG1108 # Protein_GI_number: 19705115 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 297 1 297 297 400 92.0 1e-111 MLETFRNFLINLAEQGSIPASFKYGFVINAMICALLIGPILGGIGTMVVTKKMAFFSEAV GHAAMTGIAIGVLLGEPFSAPYISLFTYCILFGLIINYTKNRTKMSSDTLIGVFLSISIA LGGSLLIYVSAKVNSHALESILFGSILTVNDTDIYILVVSAIIIGFVLVPYLNRMLLASF NPNLAIVRGVNVKLIEYIFIIIVTVITIASVKIVGSILVEALLLIPAAAAKNLSKSIKGF VSYSVLFALISCLLGVYLPIHFDISIPSGGAIIMISSAIFIITVIVRMLFRNFAEGE >gi|224461342|gb|ACDC01000060.1| GENE 2 968 - 1846 1263 292 aa, chain + ## HITS:1 COG:FN1809 KEGG:ns NR:ns ## COG: FN1809 COG0803 # Protein_GI_number: 19705114 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 10 292 1 283 283 465 87.0 1e-131 MKKILLFIFMLVLGTVSFAENIVITSIQPLYSLTSYLTKGTDIKVYTPFGSDISMTMSKE AIREEGFDLSVAKKAQAVVDIAKVWSEDVIYGKARMNKINIVEIDASYPYDEKMTTIFFS DYSNGEVNPYIWTGSKNLVRMVNIISRDLIRLYPQNKAKIEKNVNKFTNDLLKLENEANE KLLSVDNPSVISLSENLQYFLNDMNIYAEYVDYDSITAENIANLVRDKGIKVVISDRWLK KNVIKALKDAGGEFVIINTLDIPMDKDGKMDPEAILKVFKENTDNLIEALKK >gi|224461342|gb|ACDC01000060.1| GENE 3 1859 - 2278 802 139 aa, chain + ## HITS:1 COG:no KEGG:FN1808 NR:ns ## KEGG: FN1808 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 15 139 1 125 125 217 91.0 1e-55 MKKFLVLVIGVLMSVVAFAHAPLISVDDNGDGTVYIEGGFSNGASGEGVEIIIVKDKAYN GPEETFKGKEIIYKGKLDAKGSITMPKPATEKYEVYFNAGEGHVTSKKGPALTAAEKANW DKATASFDFGEWKELMLEK >gi|224461342|gb|ACDC01000060.1| GENE 4 2353 - 3147 1292 264 aa, chain + ## HITS:1 COG:FN1807 KEGG:ns NR:ns ## COG: FN1807 COG5266 # Protein_GI_number: 19705112 # Func_class: P 
Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 264 1 264 264 487 93.0 1e-137 MKKSLVLIGSILLAANLFAHDHFLYTSNLDASNQKEVKMKAVLGHPAEGPEAEPISIATV DGKTSLPKAFFVVHDGVKTDLLSKVKVGTIKTAKGQYVALDAVYSMEDGLKGGGSWVFVM DSGNTKDEGYTFNPVEKLIITKDSAGSDYNQRVAPGYNEIVPLVNPVNAWKENVFRAKFV DKDGNPIKNARIDVDFINGKLDMTNNTWVANKEAPKTSLRVFTDDNGVFAFVPSRSGQWV IRAVASMDRQNKVVHDASLVVQFE >gi|224461342|gb|ACDC01000060.1| GENE 5 3221 - 4087 705 288 aa, chain - ## HITS:1 COG:FN1806 KEGG:ns NR:ns ## COG: FN1806 COG0697 # Protein_GI_number: 19705111 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 287 1 287 287 443 91.0 1e-124 MKNNSKAYFLIIAAVIIWSLSGLLVKAVNVDPIWISLIRCLGGGIFLLPYIFKEKIYPIK NILFGGIFMAIFLLSLTITTRISSSAMAISMQYTAPMYVIGYGFYKSKEIKFEKFIVFLF IFAGIIFNSITSMNGGNWWAIVSGITIGLAFVFYSYNLQKVKKGNLLGIVALINIISAIF YGILLLFRYSPPPSSFNEIIILSISGIVISGISYALYGEGLREISMEKAMIICLAEPVLN PLWVYLGKGEIPSMTTVIGSTLILLSAILDIAFSIKNNKKAKKLSTHN >gi|224461342|gb|ACDC01000060.1| GENE 6 4189 - 5385 1686 398 aa, chain - ## HITS:1 COG:no KEGG:FN1805 NR:ns ## KEGG: FN1805 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 398 1 388 388 548 93.0 1e-154 MHPAMAFFILMAFLALGEFISVKTRAIVPSILIFLILLLSGVWGGILPKEIIDLGGFSEA MTEVIMVVIVVNMGSSLSLDSLKKEWKTVLIGIGAIVGIAAVILPVGSMIYDWQTAVVAA PPIAGGFVAAFEMSKTSLAKGLPHLSTIALLLLALQEFPVYFILPGLLRSETLRRLDLFR KGELKAVSAAEEENKKRLIPPIPEKYMDTSTYLFLLGLVGMIAILCSMLSKTIFNGFGVD FKISPTIFALLFGIVAGEIGLLERKSLQKANCFGFFVVASVVGVMGGLVNSSMEEILKLI VPLVVLIFLGIIGMAIGGIIVGKLLKLTWQMSFAIALNCLIGFPVNFLLTNEAINVLAKT EEEKDFLTNTMVPTMLVGGFTTVTLGSVVFAGILTNFL >gi|224461342|gb|ACDC01000060.1| GENE 7 5422 - 6882 2169 486 aa, chain - ## HITS:1 COG:FN1804 KEGG:ns NR:ns ## COG: FN1804 COG2195 # Protein_GI_number: 19705109 # 
Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 1 486 1 486 486 841 88.0 0 MAYKCVEDLREHRVFYHFLEISKIPRQTFFEKEISDFILNWAKGLGLEVYQDEKYNLLIK KPATLGYENKKPIIIQAHIDMVCEKRPEVEHDFRKDPINLVLEGDILSTGNRTTLGGDDG IGVAMAMAVLEDKNLKHPPVEVILTTCEEEDMSGALSVDKSWFNTNRVINIDHVVDTEII AGSCGGIGVDLRFPVEYTNKADNYKGYKIKITGLRGGHSGEDIHKGRANANVLLANLLNL LREKVNFLISDIKGGNFRLAIPREAHVTLALEEKDADTLKDIAKHFEYEAKKIYEETAVN LKIEVSEDILADKLLSKNTVDKIIDAIILSPNGVSSMIGSLNVVESSSNLGEVYVKDEYV YLVTEIRATFEKNRDYLYNKIALIGKYLGGELRGFSAYPSWVYKPHSSLRDTANKVYSEL FGEEIKTLAVHAGLECGCFVDKIQGDMDAISIGPNAWDLHSPNERLSVSSTEKVYKFLTH ILENLD >gi|224461342|gb|ACDC01000060.1| GENE 8 6913 - 7602 533 229 aa, chain - ## HITS:1 COG:FN1803 KEGG:ns NR:ns ## COG: FN1803 COG1309 # Protein_GI_number: 19705108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 15 229 1 215 217 333 88.0 2e-91 MYIDYLWLNVRGAHMDKLDIKKKRVMMYFIEATQELILNEGIENLSIKKIAEKAGYNTAT IYNYFEDLEELILYSSVDYLKIYLKDLRSEINPDMKAIEMYETIYKVFVHHSFEKPEIFH TLFFGKYSYKLEKIIKKYYEIFPDDITGQSDITKSVLIEANIHNRDIPVMKQMIKESSIS EEEAPYIMETIVRVHQSYLENILQQRDKISLDEHKNKFFKIFNFLLRKR >gi|224461342|gb|ACDC01000060.1| GENE 9 7795 - 8934 1077 379 aa, chain - ## HITS:1 COG:FN2000 KEGG:ns NR:ns ## COG: FN2000 COG3550 # Protein_GI_number: 19705296 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Fusobacterium nucleatum # 241 378 1 138 145 245 91.0 9e-65 MNKSKSLQVFYNEKKVGTLALMKNNIVAFEYDSNWITNGFSISPFSLPLKKQVFIPRIDP FDGLYGVFSDSLPDGWGRLLVDRMLNSQNINPREISQIDRLAIVGETGMGALSYKPEYNL LEDKDYQEDYDNLALSCKKILNTEYSADLDKLFKLGGSSGGARPKILTKIDNEDWIIKFP SSLDESNIGKLEYLYSVCAKKCKIDIPETKLFPSKISSGYFGIKRFDRKKLSTGAIRKLH MISVSGLLETSHRIPNLDYNDLMQLTLNLTKSFEEVEKLFRLMCFNVFSHNRDDHSKNFS FIYNEDLNKWELSPAYDLTYSYSINGEHATTINGNGVNPDLNDILKVAEKIGLDKKKAEK IAIEIRETVRKDLEIFLSK >gi|224461342|gb|ACDC01000060.1| GENE 10 8921 - 9184 251 87 aa, chain - ## HITS:1 
COG:FN1997 KEGG:ns NR:ns ## COG: FN1997 COG1396 # Protein_GI_number: 19705293 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 87 20 106 106 127 93.0 4e-30 MKTPKEIQLEIAKNIRKRRKELKLTQEEFSKKSGVSFGSIKRFENTGEISLFSLIKIAIV LGCEDEFLNLFQQKQYSSIEEIINEQE >gi|224461342|gb|ACDC01000060.1| GENE 11 9268 - 9798 217 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739263|ref|ZP_04569744.1| ## NR: gi|237739263|ref|ZP_04569744.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 176 1 176 176 196 100.0 4e-49 MTKRYYNPHNETAIVAAIIMILTNIYVFLNDKLTLSIGIFSNIFIYIEVFYAILLIFYNI LLLAKDFFKFKDITKYKLKKYFPLANLIILTFYICKEPLFFYSTKKIILVLTFLLIEIFF TFFIHKIKFKFRGWKKILDEIFAIVSVSIIIYYLQYISAVIGYLFYRFIIIVTFLL >gi|224461342|gb|ACDC01000060.1| GENE 12 10367 - 11044 728 225 aa, chain - ## HITS:1 COG:FN1996 KEGG:ns NR:ns ## COG: FN1996 COG1738 # Protein_GI_number: 19705292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 225 11 235 235 384 93.0 1e-107 MLVINFSCILFAYRKFGKIGLYIWVPISTILANIQVVILVNLFGMEATLGNILYAGGFLI TDILSENYGKKAANTAVKIGFFSLVATTLIMQCAIHFKPLDVPEGLAIFESVKSIFSLLP RLAIASLIAYLISQFHDVWLYEKIREKFPAKKFIWIRNNGSTMLSQLIDNLVFTTIAFYG VYPIDVMVNIFLSTYIIKFIVAICDTPFIYLADKMFRDKKIPEDV >gi|224461342|gb|ACDC01000060.1| GENE 13 11071 - 11784 740 237 aa, chain - ## HITS:1 COG:no KEGG:FN1995 NR:ns ## KEGG: FN1995 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 237 1 237 237 382 90.0 1e-105 MGIRYSKVEGKFQREIVLLKSFPCAYGKCSFCNYIEDNSNNEEEINEVNLEVLKEITGEF GILEVINSGSVFEIPKKTLEKIREVVYEKDIKILYFEIFYSYLSRLDEIINYFNEKKKVE IRFRTGIESFDNDFRRNVYKKNILLDEKKIKELSEKIYSVCLLIATQGQTKEMIKNDIEM GLKYFKAITINVFVDNGTVVKRDAELVKWFVQDMKHLFDNDRVEILIDNKDLGVYEQ >gi|224461342|gb|ACDC01000060.1| GENE 14 12339 - 12773 516 144 aa, chain - ## HITS:1 COG:no KEGG:FN1994 NR:ns ## KEGG: FN1994 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: 
not_defined # 1 144 1 144 144 194 90.0 7e-49 MKGTRVNPTALSPMEMNNMSSMMGMMSSIQKIGKGKRKYTIQLDKNDKKLLVRFINEAKK QFSDTASNSQYAGVYNFLNYITDVASKKESTEIKMSYEEQDFVKRMLQDSVRGMEKMQFF WYQFIRKFTVKTLAKQYRELLKKF >gi|224461342|gb|ACDC01000060.1| GENE 15 12778 - 13431 776 217 aa, chain - ## HITS:1 COG:FN1993 KEGG:ns NR:ns ## COG: FN1993 COG0009 # Protein_GI_number: 19705289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Fusobacterium nucleatum # 1 216 1 216 217 326 87.0 2e-89 MEKYLKIDNISDISDDKWTELADELKEGSLIIYPTDTVYGLASIVTNEQSINNIYLAKSR SFTSPLIALLSSVDKVEEVATISDENREILEKLAHAFWPGALTVILKRKEHIPNIMVSGG DTIGVRIPNLDLAIKIIDLAGGILATTSANISGEATPKSYNELSEAIKSRVDILVDGGEC KLGEASTIIDLTSDVPKILRNGAISTDEITKIIGRVR >gi|224461342|gb|ACDC01000060.1| GENE 16 13431 - 14381 1425 316 aa, chain - ## HITS:1 COG:FN1992 KEGG:ns NR:ns ## COG: FN1992 COG0462 # Protein_GI_number: 19705288 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Fusobacterium nucleatum # 1 315 1 315 316 554 91.0 1e-157 MINFNNVKIFSGSSNVELASKIAEKIGFPLGKAEIQRFKDGEVYIEIEETVRGRDVFVVQ STSEPVNENLMELLIFVDALRRASAKTINVIIPYYGYARQDRKSKPREPITSKLVANLLT TAGVNRVITMDLHADQIQGFFDIPVDHMQGLPLMAKYFKDKGFYGDDVVVVSPDVGGVKR ARKLAEKLDCKIAIIDKRRPKPNIAEVMNLIGEVEGKIAIFIDDMIDTAGTITNGADAIA ARGAKEVYACCSHAVFSDPAIERLEKSALKEVVVTDSIALPERKKIDKVKIISVDSVLAA AIDRITNNKSVSELFE >gi|224461342|gb|ACDC01000060.1| GENE 17 14383 - 15726 1957 447 aa, chain - ## HITS:1 COG:FN1991 KEGG:ns NR:ns ## COG: FN1991 COG1207 # Protein_GI_number: 19705287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Fusobacterium nucleatum # 1 447 1 446 446 749 91.0 0 MKAIIMAAGKGTRMKSDLPKVVHLTHSKPMIIRIIDALNALNTEENVLILGHKKEKVLEV LGPDVSYVVQEEQLGTGHAVKQAVPKLENYQGDVLIINGDIPLIRKETLIDFYNEYKKEN 
ADAIILSAVFENPFSYGRVLKDGNKVLKIVEEKEANEEQKKIKEINAGVYIFKSQDLVKA LAQINNNNEKGEYYITDVIEILSNDNKKVISYSLEDSMEIQGVNSKVELALVSRVLRERK NTALMEEGVILIDPANTYIEDEVKIGRDTTIYPNVTLQGNTEIGENCEILSGTRIIDSKV FDNVRIESSVIEESIVENGVTIGPYAHLRPKSHLKENVHIGNFVETKKSTLEKGVKAGHL TYLGDAHVGEKTNIGAGTITCNYDGKNKFKTEIGKEVFIGSDTMLVAPVSIGDNSLIGAG SVITKDVPSDSLSVERSKQIIKEGWKK >gi|224461342|gb|ACDC01000060.1| GENE 18 15749 - 16552 945 267 aa, chain - ## HITS:1 COG:FN1990 KEGG:ns NR:ns ## COG: FN1990 COG0484 # Protein_GI_number: 19705286 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Fusobacterium nucleatum # 50 265 4 173 175 113 44.0 5e-25 MEAILLPLVVMFFILVLTLGIEKASKAIIPLAIIEIFVYFFGWDIFKYIFIFILFIVFLI FFLIFKLLKKAGTSSNTYRRTRTQNDDFFGGYRNNTSNNNSNGTRGNNTYNDTRYYGNFR TREEAEEFFRNIFGRDFGQNGTYNNTRSSGTFTQEEAEEFFRNIFGDNFGGTYGGTTYGG NTYGNSSGSYRQGGSYQRTGTYTSNRSRYYRILGLKDGASQEEIKKAYRQLAKEHHPDKF VNASDSEKKFHESKMKEINEAYENLKI >gi|224461342|gb|ACDC01000060.1| GENE 19 16863 - 17954 1746 363 aa, chain + ## HITS:1 COG:FN1586 KEGG:ns NR:ns ## COG: FN1586 COG4948 # Protein_GI_number: 19704907 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Fusobacterium nucleatum # 1 363 13 375 375 668 93.0 0 MKITKVKLGIISVPLRVPFKTALRTVNSVEDVIVEIHTDTGNIGYGEAPPTGAITGDTTG AIIGALKDHIIKTIVGRDVDDFENLMKDLNSCIVKNTSAKAAVDIALWDLYGQLHKIPVH KLLGGSRKKLITDITISVNPPEEMARDAINAIKRGYNTLKVKVGIDPTLDVARLSAIREA IGKDYRIRIDANQAWTPKQAIKLLNQMQDKGLDIELVEQPVKAHDFEGLAYVTKYANVPV LADESVFSPEDAFKILEMKAADLINIKLMKCGGIYNALKIISMAEVLGVECMIGCMLEAK VSVNAAVHLACAKQIITKIDLDGPVLCSEDPVVGGAIFNEKEIIVSDDYGLGIKAINGIK YID >gi|224461342|gb|ACDC01000060.1| GENE 20 18020 - 18181 93 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739272|ref|ZP_04569753.1| ## NR: gi|237739272|ref|ZP_04569753.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 53 1 53 53 87 100.0 3e-16 MKLFQKYDIIKSVKGNKKEYFWSRIFFVLSVGGASIEVIKKYIQGQGGENENS >gi|224461342|gb|ACDC01000060.1| GENE 21 18168 - 18606 403 146 aa, chain + ## HITS:1 COG:BBH40 KEGG:ns NR:ns ## COG: BBH40 COG0675 # Protein_GI_number: 11496700 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Borrelia burgdorferi # 6 132 5 132 155 94 47.0 8e-20 MKIVKKAYKFRIYPTLEQIIFFSKNFGCVRKVYNLMLDDRKKDHEEYKSTGIKTKYPTPA KYKEDYPYLKEVDSLALANAQLNLEKAFKNFLKNKDFGFPKYKCKSNPVQSYTTNNQNTI YIKDSYIKLPKLKSLVKIKLHRKIEG Prediction of potential genes in microbial genomes Time: Thu May 19 23:13:09 2011 Seq name: gi|224461341|gb|ACDC01000061.1| Fusobacterium sp. 2_1_31 cont1.61, whole genome shotgun sequence Length of sequence - 2279 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 584 698 ## COG1739 Uncharacterized conserved protein - Prom 766 - 825 9.6 + Prom 517 - 576 9.3 2 2 Tu 1 . 
+ CDS 794 - 2263 2486 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific Predicted protein(s) >gi|224461341|gb|ACDC01000061.1| GENE 1 2 - 584 698 194 aa, chain - ## HITS:1 COG:FN1907 KEGG:ns NR:ns ## COG: FN1907 COG1739 # Protein_GI_number: 19705212 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 194 1 194 195 306 84.0 1e-83 MEKLKTIKRECSVEFEEKKSKFIASVKPVFSKEEAEEYINYIKSLHPNATHNCSAYKINN KGLEFFKVDDDGEPSGTAGKPMGDIINYMEVTNLVVIATRYFGGIKLGAGGLVRNYAKTA KLGITEAEIIDFVNKVDLLFEIPYEKLGEIEKLLKDYEAEVIDKSFLEKIIFKVRINEEF LTNLENYPYVNLID >gi|224461341|gb|ACDC01000061.1| GENE 2 794 - 2263 2486 489 aa, chain + ## HITS:1 COG:FN1547_1 KEGG:ns NR:ns ## COG: FN1547_1 COG1263 # Protein_GI_number: 19704879 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Fusobacterium nucleatum # 1 411 1 411 411 673 90.0 0 MFSYLQKIGKALMVPVAVLPAAAIMLGLGYWIDPTGWGANSQLAAFLIKAGAAVIDNMPI LFAVGVAYGISKDKDGAAALAGLVAFEIVTTLLSKGAVAQIMGIDPEQVHAAFGKVNNQF IGILCGVISGELYNKFHKTELPKFLAFFSGKRFVPIITSVVMIIVSFILTYIWPAIFGAL VTFGTSIAKLGPIGAGIYGFFNRLLIPVGLHHAVNSVFWFNVAGINDIGRFWGAPEMAYA DLPEILQGTYHVGMYQAGFFPIMMFGLLGACLAFIQTSKPENRAKIVSIMVAAGFTSFLT GVTEPIEFAFMFVAPLLYLVHALLTGLALFLAASFNWMAGFSFSGGFIDFFLSLRNPNAH NPFMLIVLGLIFFVIYYFVFLFVIKAFNLKTPGREESEEEKEEAVRVNTSNAALAESLAT YLGGADNVVEVDNCTTRLRLKVKDSDKIQDSEIKKLVPGLLKPSKEAVQVIIGPHVEFVA TELKRILNK Prediction of potential genes in microbial genomes Time: Thu May 19 23:13:19 2011 Seq name: gi|224461340|gb|ACDC01000062.1| Fusobacterium sp. 2_1_31 cont1.62, whole genome shotgun sequence Length of sequence - 38632 bp Number of predicted genes - 33, with homology - 31 Number of transcription units - 12, operones - 8 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 28 - 765 528 ## gi|237739276|ref|ZP_04569757.1| predicted protein
2 1 Op 2 . - CDS 781 - 2121 1726 ## COG1032 Fe-S oxidoreductase
3 1 Op 3 . - CDS 2118 - 3665 1405 ## Hoch_4770 GH3 auxin-responsive promoter
- Prom 3788 - 3847 6.8
4 2 Op 1 . - CDS 3849 - 4550 708 ## PTH_2268 hypothetical protein
5 2 Op 2 . - CDS 4566 - 5471 671 ## Lebu_0283 hypothetical protein
6 2 Op 3 . - CDS 5464 - 6189 588 ## COG0491 Zn-dependent hydrolases, including glyoxylases
- Prom 6285 - 6344 6.4
+ Prom 6441 - 6500 2.6
7 3 Tu 1 . + CDS 6543 - 7406 415 ## COG1560 Lauroyl/myristoyl acyltransferase
+ Term 7442 - 7479 0.1
8 4 Op 1 . - CDS 7562 - 9715 2473 ## BCG9842_B2017 putative cytoplasmic protein
9 4 Op 2 . - CDS 9740 - 11074 1504 ## COG1032 Fe-S oxidoreductase
- Prom 11106 - 11165 9.3
10 5 Op 1 . - CDS 11318 - 12640 1462 ## COG1032 Fe-S oxidoreductase
11 5 Op 2 . - CDS 12637 - 13941 1615 ## COG1032 Fe-S oxidoreductase
12 5 Op 3 . - CDS 13955 - 14353 560 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
- Prom 14382 - 14441 9.2
- Term 14456 - 14496 7.2
13 6 Tu 1 . - CDS 14551 - 21306 10510 ## FN0387 hypothetical protein
- Prom 21401 - 21460 8.0
- Term 21500 - 21534 4.0
14 7 Op 1 . - CDS 21546 - 22109 596 ## FN1008 hypothetical protein
15 7 Op 2 . - CDS 22131 - 22502 512 ## FN1009 hypothetical protein
16 7 Op 3 . - CDS 22504 - 23460 986 ## EFER_3822 hypothetical protein
17 7 Op 4 . - CDS 23469 - 23942 535 ## COG1683 Uncharacterized conserved protein
18 7 Op 5 . - CDS 23956 - 24480 447 ## FN1008 hypothetical protein
19 7 Op 6 . - CDS 24501 - 26684 2522 ## COG5324 Uncharacterized conserved protein
- Prom 26761 - 26820 9.3
- Term 26807 - 26857 -0.8
20 8 Op 1 1/0.000 - CDS 27024 - 28301 2091 ## COG0104 Adenylosuccinate synthase
- Prom 28332 - 28391 4.0
- Term 28450 - 28496 1.1
21 8 Op 2 1/0.000 - CDS 28517 - 30439 2000 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase
22 8 Op 3 . - CDS 30459 - 31115 915 ## COG0283 Cytidylate kinase
- Prom 31302 - 31361 10.2
+ Prom 31479 - 31538 8.3
23 9 Op 1 . + CDS 31568 - 31861 323 ## gi|237739298|ref|ZP_04569779.1| predicted protein
24 9 Op 2 . + CDS 31936 - 32202 282 ## gi|237739299|ref|ZP_04569780.1| predicted protein
25 9 Op 3 . + CDS 32279 - 32548 254 ## gi|237739300|ref|ZP_04569781.1| predicted protein
+ Term 32551 - 32592 4.0
- Term 32537 - 32579 4.2
26 10 Op 1 1/0.000 - CDS 32583 - 33440 1305 ## COG1281 Disulfide bond chaperones of the HSP33 family
27 10 Op 2 1/0.000 - CDS 33440 - 33964 582 ## COG1555 DNA uptake protein and related DNA-binding proteins
28 10 Op 3 1/0.000 - CDS 33961 - 35379 1390 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases
29 10 Op 4 . - CDS 35372 - 36325 1005 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT
30 10 Op 5 . - CDS 36347 - 36517 242 ##
31 10 Op 6 . - CDS 36426 - 36641 561 ##
- Prom 36663 - 36722 1.7
- TRNA 36431 - 36515 70.5 # Tyr GTA 0 0
- TRNA 36534 - 36608 66.8 # Glu TTC 0 0
- TRNA 36611 - 36686 81.3 # Thr TGT 0 0
32 11 Tu 1 . - CDS 36759 - 37613 999 ## COG0731 Fe-S oxidoreductases
- Prom 37639 - 37698 8.5
+ Prom 37972 - 38031 13.5
33 12 Tu 1 . + CDS 38064 - 38631 897 ## COG0457 FOG: TPR repeat
Predicted protein(s)
>gi|224461340|gb|ACDC01000062.1| GENE 1 28 - 765 528 245 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739276|ref|ZP_04569757.1| ## NR: gi|237739276|ref|ZP_04569757.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 245 1 245 245 367 100.0 1e-100 MKIDKIAILNDISSDNINLINFLDIFAKFSQNTKDMAEFIYLNENISQSFFKLTDLKKED LEDILDILKLIKDKSKKEDLDIYGEEVERGINEVNWLIEEKNLYQNIFQEFDNKKVLDKN SIVNELYRNEDASQSQYLIRTFSNKLWKELDEETIVNFLNGLDFYYLSDEAYFFILPACI RYGLEKFEDNEQLDYLTFFLSDKERVKNANEKIKILVVSYLNLLKKLNFLGFFEKEEKEC LDLWK >gi|224461340|gb|ACDC01000062.1| GENE 2 781 - 2121 1726 446 aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 20 403 28 414 473 213 32.0 6e-55 MKIAFLAPAGAMHRFNGSFGKSLHYAPLTLTTLAALIPESLNAEAKIYDETIEKIPLDLE ADIIVMTSITGTSQRCYAYADYFRQRGITVVLGGVHPSLMPEEASQHADVVMVGFAEQTF PQMLLDFKNGRLKRMYIQDKEFNLDNKVIPRRELLQKDKYITTATVEVVRGCSLPCTFCA YPTAFGRKIYKRPIKEVLSEIEMFSEKIILFPDVNLIADREYAMRLFKEMKSLNKYWMGL VTSSVGIDENMIKTFADSGCKGLLIGFESITQESQSYINKGINKVADYAELMKKLHDYGI LVQGCFAFGSDEEDTSVFERTVEAVVKAKIDLPRYSILTPFPKTQFYAQLEAENRIFEKN WAMYDVEHCVFTPKKMTVEELEKGTAWAWRETYSMKNIFKRLAPFTHSPWISLPLNIAYR KYADKYEHFTREVMCDNSDIPLIFEK >gi|224461340|gb|ACDC01000062.1| GENE 3 2118 - 3665 1405 515 aa, chain - ## HITS:1 COG:no KEGG:Hoch_4770 NR:ns ## KEGG: Hoch_4770 # Name: not_defined # Def: GH3 auxin-responsive promoter # Organism: H.ochraceum # Pathway: not_defined # 35 507 49 547 581 280 32.0 1e-73 MLSKLYLYIVHSIFLLFYKKEYRKYMNSRNILEIQENKLKEILENNKNSLYGKKYNFDKI KTIEDFQREVPLTKYEDYLAYIEKIKNGEEHVLTYEKVKMFELTSGSTSASKLIPYTDSL KKEFQAGIKVWLYSLYKKYPSLKFGKSYWSITPKIDFQHKEKSLIPIGFEEDSEYFGNLE KYLIDSIFVNPKDIKNEKDMDRFYFKTLSALVAEENIRLFSFWSPSLLLLLIEYLEKNSE KILKTLKEKRKQEVRKYIEAKEYHKIWKNLILISCWGDMNSTEYLKKIQEFFPKTIIQEK GLLATEGFISFPDAEKNLSKLSFYSHFFEFLSLDDNKIYDTSEIEANKKYELIITTSGGL YRYCIGDIIEVISIENNVPYIKFIGRKGAVSDLFGEKLEESFLKNIIQTYKQKIDFYMFA PNRNHYILFIKTDKKIDVKDLENKLRENFHYDYCRKLGQLKAIKVFTLTGQPEKEYIQAC QNKNQKLGNIKMIALSKESGWENIFNGYFQESEDK >gi|224461340|gb|ACDC01000062.1| GENE 4 3849 - 4550 708 233 aa, chain - ## HITS:1 COG:no KEGG:PTH_2268 NR:ns ## KEGG: PTH_2268 # Name: not_defined # Def: 
hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 3 225 2 223 230 153 31.0 5e-36 MELKGDIVKINEISQSEIEEMYILMTEFYNDVNKDVFLKDLKEKDYCIILKDDKNKVKGF STQKIMNFTLGNEEIYGVFSGDTIIDKENWGNLTLFKVFANFFFPFGEKYKNFYWFLIVK GYKTYKFLPTFYKEFYPNYKAETPEKFKNIMDLFGEIKYPNEYNKENGVIEYKGIKDSLK KGVADITEKELKDKNVQFFLENNPDYEKGNDLVCITSLKIENLKEKTLKILFT >gi|224461340|gb|ACDC01000062.1| GENE 5 4566 - 5471 671 301 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0283 NR:ns ## KEGG: Lebu_0283 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 301 1 314 314 295 56.0 1e-78 MNKIEFKIIRGNERSEEISEILNEEDMESSIQIKYIKYPNLFESLKLDGVREPLIVPGID TTNNKMVGLGACTIFEDNIAYLNSFRIRTEYRNKVNFGNGYKKIIEELEKEGIDTIITTI LDDNKMAKEILTKQRRNMPIYEFYKNITFYSIKNIKKNSLVIDDLYVTEYKNFRIEIKNK TNKKYFVEDYKGIYKFLYKMRKVISFFGYPELPEKNTEMKFLYVDIEAKDNDYSNTLEAI KHLQSMGCSCDFFMIGTYENSSLDIQLKKIKSFKYKSKLYKVYYGEDKNKGKDIRFKFWN L >gi|224461340|gb|ACDC01000062.1| GENE 6 5464 - 6189 588 241 aa, chain - ## HITS:1 COG:CAC3686 KEGG:ns NR:ns ## COG: CAC3686 COG0491 # Protein_GI_number: 15896918 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Clostridium acetobutylicum # 1 222 2 227 245 70 27.0 3e-12 MINKVKLGINNLYLFKNNNDDYLLLDTALSCKKDVILDEINKLIDDYNKIKVIVITHSHS DHIGNLKLLLDKIKREDKIVIAHNNAKDIMLTGEKVIPNGFYKFSKYISKKLKAKSSGNF QKGFENLTEEYFKYVNFLDFKDYKEFSLDKYGFGNLKLIYTPGHSKDSISLVYNNDYLFC GDMVQNLCFKYPLIPLFGDDIEELINSWKKAIEKGYSRFYPATSKSYILREDLIKKLEKY E >gi|224461340|gb|ACDC01000062.1| GENE 7 6543 - 7406 415 287 aa, chain + ## HITS:1 COG:FN1016 KEGG:ns NR:ns ## COG: FN1016 COG1560 # Protein_GI_number: 19704351 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Fusobacterium nucleatum # 83 285 2 211 226 79 27.0 8e-15 MKLIFDFIIYLIFLIFISIFKILPSKIKLKFSEFLGLFLYYLISKGRKLSLRNLNLILNE QYNYNFTDKQIKDIAIKSYRNTMKSFLLPFWIYEYAEKYPPVIHNIELLEKLKETNDRIV 
LATLHYGFFHMSMYPIIDEQMFIIIRPVPNRFIEAYMNKIRFKKNMLSFTEQNIKALFKH KKSKGFFIMLNDVRKPNGEKVNFFNLPTTASGFTAFFSMREKLPIIVIHNEVDSNNICNI YIDEIIHPENYTNESDLTDRLLKIYEKIILNKPEQWYWFQDRWINKK >gi|224461340|gb|ACDC01000062.1| GENE 8 7562 - 9715 2473 717 aa, chain - ## HITS:1 COG:no KEGG:BCG9842_B2017 NR:ns ## KEGG: BCG9842_B2017 # Name: not_defined # Def: putative cytoplasmic protein # Organism: B.cereus_G9842 # Pathway: not_defined # 14 717 3 698 703 192 25.0 3e-47 MNNSINSIALRHLNGIYIAKNTDNNINETLSMAELATLIKKFEGYGYIFSKELAIAISKE ERNIIIDKLKAVIKVIEDFKSDKNYTVFYKNFPDEVINMSEVDLYINQILHYWIGYLPSN NENIIKEDVEPSKLVKARELNLVDDEMIEKLFIDLLSSNVTLSEQYLDDVCVLTNNKSIK ELEKYMEYIQMKETLTTVSNYILQKEGVLIGDFKTATDILRLIAKISGVELNNKHIHFAY FSRTVLSQLMNKLENLKNIMPDIKRYSKPWHSFFKLYAKKINFNKYPKVRNAVDMLFGDI SYMTERGKINEQINRLPTMSEEELDNFIKEYTVFYGDYIREILSLLNKANENQYEKLLLG LENCVTKVNTRILFQLYDRIINLKANNKTVPRLVNSKGKWRILQESINLSDELLNRVLQI VEDGIKTQLKEKGSLGKVYIDRSYKDIMLTTSEKDSNVSLRPMTRGSRIKFNPNTEVLRF FVAWKNLDEKTLKELNAYSDRVDVDLSALTFDANLEFNDVVAYYNQKKSYFAFSGDITNA PEGALEYIDILDLEKLKKKGDRYVLMEIRSYNGYTFKEINTVYAGVMELTSKEAKEKKNM YSTAITEGFQIVSSERTTNTILVDLEKFEYIWLDTNMASYMLGVMYGNALSNEEIPYLND LLRYFSRKQYVTMYDLLKLNADVRGVLTKDKKEADIIFEKVDNKNNLALADILSNYL >gi|224461340|gb|ACDC01000062.1| GENE 9 9740 - 11074 1504 444 aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 1 399 13 412 473 199 31.0 1e-50 MKITFILPAIGKKKGQRYIKTWKHMEPLMIAVLKSLTPNDIETNFMDDRNELINYDEKTD LVVISVETYTAKRAYEIAKKFREKGIKVLAGGYHPTVEPEECLENFDSIIVGNAENVWLK MLEDCKNNNLQEKYFGTSTSFAMPDRSIYKDRKYSPLALIETGRGCNFSCEFCAIHSYYE KKYYRRPVEEVVQDIKNSGKKYVFFIDDNFVADHNYALEICKAIAPLKIKWVTQGAITMA KNDELLYWMKKSGCKMVLIGYESMNPNILKDMGKGWRSSVGEINELTNKIHSYGIGIYAT FVFGFGDDSQEVFDETVKFAKKHSFFFAAFNHLVPFPKTGVYRRLKEEKRLLSDKWWLDS RYPYGRISFLPLDQTPDELSKKCANARKKFFEWGSILKRALVQFKRSFDLGMFFIFLTQN FNLKNEVLEKYDLPYADNLDEMPK >gi|224461340|gb|ACDC01000062.1| GENE 10 11318 - 12640 1462 440 
aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 12 410 22 416 473 200 30.0 6e-51 MRIMLVLAKDNIYRFDSLHQRKYYPQITLITLESLIDKKYNTEIILVDEGVEEYDATSSK YSDDKFDLICISAVISASRRAKEISKFWKDRGAYTQIGGHYATVLSDEALEFFDTVIKGP AEIAFPTFIKDFVEEKPKREYFELVGNDFEYKPLNRKLLTNKKYYKSFGTIVANNGCPNK CTYCSVTKMYSGKNQLKNIDFVVSEIKSNKHKKWVFYDPNFLADKSYAINLMNELKKLKI KWTASATINIGNDIKMLQLMKEAGCIGLVIGLESFIQENLNGVNKGFNNVKEYKRLVSTI QSYGISVLSTLMIGMETDTVESIRQIPDIIEEIGVDVPRYNILTPYPGTPFYEQLKAENR LLTTDWYYYDTETVVFQPKNMSPATLQEEFYKLWQDTFTYKRIFKRLKTSKNKGLKLILE IFSRQHAKKFKKYTKLDFIN >gi|224461340|gb|ACDC01000062.1| GENE 11 12637 - 13941 1615 434 aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 19 410 34 411 473 173 29.0 8e-43 MKIAFLRPNLGGQRSNDAIEPLGFAVLSGLTDRNKHEVLLFDERIENIPMDLEVDLVVIT TFTLTAKRAYTIADNYRKKGIYVVIGGYHASLMPEEVQEYADTVFVGSAEGNWEKFLIEL ENGHPQKVYEEIKLPDISEVVYDRSLFKDKKYSFVVPVQFGRGCMHQCEFCTIGSVHKGD YAHRKVELVIEEIKEIFRTNKRAKVIYFVDDNIFANKKKALHLFNELKKLKIKWACQGSI DIAKDEDLVKLMSESGCIEMLLGFENINIMNIKKMNKKSNYDFDYENIIRIFKKHRILVH ASYVIGYDYDTKDYFQEILDFSNKHKFFLAGFNPALPIPGTPFYDRLKNEGRLLYDKWWL DDDFRYGKAAYTPHNMTVEEFEAGILRCKVEYNTHKNIWLRLFDGAANFRHALIFLAVNY INRKEIYNKKGIKL >gi|224461340|gb|ACDC01000062.1| GENE 12 13955 - 14353 560 132 aa, chain - ## HITS:1 COG:FN1006 KEGG:ns NR:ns ## COG: FN1006 COG0454 # Protein_GI_number: 19704341 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 1 132 1 132 132 209 93.0 7e-55 MECKIIKNDTNYNLDDLTKLLNTSYWAKDRKKETVKKTVENSLCYFAYDTDKNKLIGFAR AITDYTTNYYICDVIVDEEYRGEGIGKKLVETLTNDEDLVHVRGLLITKDAKKFYEKFGF YNKEDVMQKDKK >gi|224461340|gb|ACDC01000062.1| GENE 13 14551 - 
21306 10510 2251 aa, chain - ## HITS:1 COG:no KEGG:FN0387 NR:ns ## KEGG: FN0387 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 517 2251 1 1724 1724 1269 48.0 0 MANYLPTVEKNLRSAARRYENVKYSLGLAILFLMKGTSAFSDDNKIQEAERKKDILTNDQ TKKSIAKETKAVTQANKKLKASWATMQFGANDLYSNFFVTPKTKIDKASIVKSENTILLA SADNNVSLPTFSKISSDIEETYVPTTKEINTSKENLRNSIGNFQEKISRARAKNTKEVEG LKLELVQLMEQGDQVVKSPWSSWQAGANYIYSDWNGAYKGRGDKVVDQVLSRDSSGSINR FVASSSTTTSYGSTDLALVEEPFAKIEITPGINPKIINRQAPSYRPSTPEVSYPTFDPRF IRTPVKPSAPAEITPTTFDPPALNFVGGGFIQNQIIGMTQTNIIVQNYERYSTEGVFKIT AGAAPGWLGTKWEGKLNVETTLDPTKNGSLTTGQTGNGPQFGMNTFINDLRDHNTMFTGE YELTDLGGRSWTKSFISYNPAGVGGDYSGYYTPGNRTSTFAGKLTLHGIPVPVNSNVLVG IEHQLWATSRNHPQSSSTLLNTGEIILASGNNVVGIMIDIERMVDPDRIPHKTINDGKII INNQNSIGMDFGQYIYGYSGVFKVDVSLGNIIVNGKNNYGARMKNIFVKPQTDPLYPTWS KYYDMVTVTSGTGKKITVNGEENVGMAIGKSLSAVARESAPGANDTNPIANISDLNIEVA GEKNIGFLRLKDYSDNNTNDMILDSTTMGTFTFGNGAKNSSLVRTDKHGIQVKKDISITG KDADGNDYTGSGNTVLHSNGETQHIYNYNTITVGKGFTKTVGMAATGTKASTIDNIINEG TIALQAKQSIGMYTDKFSQGKNTGSIKLSGVGDTDPSGNIGDAENIGISNKGKFTFLGDL EVNGKKSSGIYNTGTTTIAVGTNPNDKTNIKATNGATALYSKGTGSTITSNAGNKLNITV EAGATKEGLAVYAEDKGQITLHNANINVVGGSAGVAAYDANTKIDLTGSTLKYDGDGYAA YSDGNGQIILNNADIELRGKATLMNVDWSVPAANRPIKTSATNVTVFSNNVVGININNLG TQNVSNLSAIKNSLGVVLNPGTEGGQTFNKFKELAIDDGVINFDVATDKNEGDSTPGGFF FKKVLGQRLKLNVNENLTARLSSATADEFYNSKNDPHEPNNGQVVGLEANSSDKATTNTE AQVNIAAGKIVDVARTDGTDKGGVGVFVNYGQVNNSGTINVEKDSSANSNAVGIYAVNGS EVTNNGSVNVSGDKSIGLLGMAYRVTPPKVDPTTGKIIVPSKIVVDEFGAGAIGQGMVNI VNKGSIDLDGQGAIGMFAKNNKAGTTFTNAVALNDTTGVITTTGNKAVGMAGEEATLTNR GTINVNGQTGTGMFAKSNSKIENDGTINIASSTSATETNIGIFTEDQNTVIENNKDIVGG NNTYGIYGKTINMGTNGKIQVGDNSVGIYSNGQYASSTTPNVTLAAGSSITVGNNQSVGI FVTGQNQNISSQADMRIGDNSFGYVVKGTGTKLTTNATNPVTVGNDTTFIYSTDTTGNIE NRTKLTSTGSKNYGIYAAGNVTNLADMDFSSGIGNVGMYSIAGGTIVNGSPTVNSIIKVG SSDKPNKLYGIGMVAGYTDDNGNVIQTGTVENYGTIKVEKDNGIGMYATGSGSKAINRGT IELSGKNTTGMHLDNNAVGENYGTIKTVPNPTNDGIVGVSVQNGAVIKNYGSIIIDGANN TGIYLSRGKNEGTTPTATNGAVAVRNKVQSDTSKKVAGIEIKAPGNGTATVSRDGKLETP 
TFVDTTVASPLASRVIVGATELDLTSTKLGDTPSGGMASEIGMYVDTSGINYTNPIQGLQ HLTAVKDVNLIFGTEASRYTTSKDIKIGENILKPYNDEISALTASASGKNFYVSSGSLTW IATGTQNPDDTFNAVYLSKIPYTAFAKDKDTYNFMDGLEQRYGVEGVNSREKALFDKLNA IGKGEPRLFAQAVDQMKGHQYANTQQRVQATADILNKEFNYLRNEWSNLTKDSNKIKTFG ARGEYNTKTAGIEDYKSNAYGVAYVHEDETVRLGESTGWYAGMVHNKLKFKDFGRSEEEM LQGKLGIFKSVPFDENNSLNWTISGDISVGYNKMHRKYLVVDEIFNAKSRYNTYGVGLKN EISKEFRLTEGFSVKPYAALGLEYGRISKIKEKSGEMRLEVKGNDYLSVKPEIGSELAYK HYFGAGAFKAAVGVAYENELGRVANAHNKARVANTSADWYDLRGEKEDRRGNAKLDLNVG LESERYGVTANVGYDTKGENLRGGVGLRVKF >gi|224461340|gb|ACDC01000062.1| GENE 14 21546 - 22109 596 187 aa, chain - ## HITS:1 COG:no KEGG:FN1008 NR:ns ## KEGG: FN1008 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 177 6 179 179 150 50.0 2e-35 MIELLTSGAMSNRMLLLMFTVFFSIFPVMFLIFGIMMRRGQQNNISSYDGEVKGEIIEVL NSEKSAMFATYPVYQYEVNKHKYIVKPNFIFFNSSLDKKYHDSENVTCITYLNKHGGNTR TKYKVGESIIIKYDIDNPKSHEILNDKDKKFAYDSRRIATLLLMIFPLIFFIASFFIKGQ AVFTPAN >gi|224461340|gb|ACDC01000062.1| GENE 15 22131 - 22502 512 123 aa, chain - ## HITS:1 COG:no KEGG:FN1009 NR:ns ## KEGG: FN1009 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 123 1 114 114 172 81.0 2e-42 MKKILVLLLMLILGIVSYAKEDDILGTWLIKENGKIVEIYKNENGEYTGKIKENNFIFLK QANDLTYDKEKNSLAYFNLKFPEDKFSWYIWINIEKDGNLFIKGTGNTEVGKYVRELHFI RQK >gi|224461340|gb|ACDC01000062.1| GENE 16 22504 - 23460 986 318 aa, chain - ## HITS:1 COG:no KEGG:EFER_3822 NR:ns ## KEGG: EFER_3822 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 9 316 9 324 324 201 38.0 4e-50 MENLYNEAYSLLPMEEKKEILENLAKKYNMELLRFETFSKYSKSTFTAIFKYKESEFVFV PGDTVTLGYEGLPKNLSDETLKGLKYCLDETEDLDTVLGEYIRENFSKVRKATIKPMLVE RDLQTVAWKKSNLEELKDFDSDLLKDYNEFKSSNYNRLTLDETARFTKIGNDIEIELYDD ISYEKLCKNLKDEGFSLANLDEWEYLCGGGCRTLFPWGDDLDYNMNLLYFSKEGNDKYDL EEPNFFGLSIAYDPYKMEIIDNKSFSKGGNGGCNICGGYGDFLGYLPCSPYFNQVIDYEE EDLNGDFNFYRRIIKIGE >gi|224461340|gb|ACDC01000062.1| GENE 17 23469 - 23942 535 157 aa, 
chain - ## HITS:1 COG:FN1602 KEGG:ns NR:ns ## COG: FN1602 COG1683 # Protein_GI_number: 19704923 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 156 1 156 156 241 80.0 3e-64 MKKEIKVLISACLLGDNVKYSGGNNLTPELVTLLEKYNVDIVKVCPECFGGLPIPRVPSE IRENKVFSKDNRDITEEFLAGAEETLKVAKENEVNFVILKERSPSCGSTHIYDGSFSGNV IPGQGITAKRLTEEKIKVFSEENLEEIEKYLVELDKN >gi|224461340|gb|ACDC01000062.1| GENE 18 23956 - 24480 447 174 aa, chain - ## HITS:1 COG:no KEGG:FN1008 NR:ns ## KEGG: FN1008 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 169 6 179 179 181 62.0 1e-44 MSNRVLFLLIAGVFFVFASIFLIIGIIYEKIYQNNMKGYDKEVEGKVLEVIKPRLAGRLT DTYVIYQYIVNNHKYIVKPYILLKNADINLKYTDSENVTCITYMGRHGMSGQTKYHTGED IIVKYNSNNPKRHEILNDKDKTFASKVFKIVGKILMIIPLIFLIISFFVKGQVQ >gi|224461340|gb|ACDC01000062.1| GENE 19 24501 - 26684 2522 727 aa, chain - ## HITS:1 COG:FN1603_3 KEGG:ns NR:ns ## COG: FN1603_3 COG5324 # Protein_GI_number: 19704924 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 411 676 2 267 273 441 86.0 1e-123 MRTLLLLRGIQASGKSTWIKENNLEPYTLSADNIRLNIANPVLTEDGSYEISQKYNKVTW ELLYKYLEMRMQNGDFTIIDATHSDLKLLNKYKDLANTYKYTMYCLEFDVPLEEALKRNK ERDNYKYVPERVIERTYETIKNNEKFPSALKKIESINEIINFYTADVNQYEKVVIIGDIH SCAEPLKEVLKDFNEETLYIFVGDYFDRGIQPVETFNIMLDLLEKPNVILIEGNHEEKSM KKFIYDEEKYTKSFEETTLLLLLKEYDVDYVRTSLKKIYKKLRQCFAFEFRGKKFLCTHG GLPLVPNLALVSAKEMIHGVGKYETEIGEIYSENYKKGLCQGFIQVHGHRGVNDGQFSYC LEDRVEFGGELKVLTIDNEGKIKKTGIKNSVYNKGLKLPMSGAVEKVEFNTANELINEMI RHQFITVKECEHNLISLNFNREAFNKKKWNDLTIKARGLFVDKDSGEVKIRSYNKFFNFG ERHVNLGYLKKYATYPIRAFKKYNGFLGLASVVNGEVVLTSKSVTSGKYKDIFQDIWNKV ESEVRELLKKTMIENNCTAVFEVVSPEYDPHIIKYDKEHLYLLDFIENKLDLDTHNIDLE FSENLMKKVEFSSDLLTKKEELTRLENYDELYNFLVEKEKSLEEFEGYVLCDNSGFMFKF KLPYYNLWKTRRAWLERYRSALVKGKKVEVTEKDEHRHFKKFLLKLGKDKLQELSIIDAR ELYEKEN >gi|224461340|gb|ACDC01000062.1| GENE 20 27024 - 28301 2091 425 aa, chain - ## HITS:1 COG:FN1605 KEGG:ns 
NR:ns ## COG: FN1605 COG0104 # Protein_GI_number: 19704926 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Fusobacterium nucleatum # 1 425 1 425 425 814 96.0 0 MAGYVVVGTQWGDEGKGKIIDVLSEKADYVVRFQGGNNAGHTVVVDGEKFILQLLPSGVL QAGTCVIGPGVVVDPKVFLDEIDRIEKRGARTDHVIISDRAHVIMPYHIEMDKIRESVED RIKIGTTKKGIGPCYADKISRDGIRMADLLDLKQFEEKLRANLKEKNEIFTKIYGIEPLD FDTIFEEYKGYIEQIKHRIVDTIPIVNKALDENKLVLFEGAQAMMLDINYGTYPYVTSSS PTLGGVTTGAGISPRKIDKGIGVMKAYTTRVGEGPFVTELKNEFGDKIRGIGGEYGAVTG RPRRCGWLDLVVGRYATEINGLTDIVMTKIDVLSGLGKLKICTAYEIDGVIHEYVPADTK SLDRAIPIYEELDGWDEDITQIKKYEDLPVNCRKYIERVQEILDCPISVVSVGPDRNQNI YIKEI >gi|224461340|gb|ACDC01000062.1| GENE 21 28517 - 30439 2000 640 aa, chain - ## HITS:1 COG:FN1606_1 KEGG:ns NR:ns ## COG: FN1606_1 COG1519 # Protein_GI_number: 19704927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Fusobacterium nucleatum # 1 426 1 426 426 672 91.0 0 MYNLLRKIGLTLYRPFMKEKMKTFIDKRLSQDFSDLKDEEYIWIHCSSVGEVNLSEDLVK KFYSISRKNILISTFTDTGYENAVKKYSDKKKIKVIYFPIDDKKKINEILNKIKLKLLVL VETELWPNLINEVNKKNSRIIIVNGRISDRSYPRYKKLKFLLKSMLQKIDYFYMQSEIDR ERIVSLGADEKKTENVGNLKFSISLEKYSDDEKDEYRKFLNIGDRKVFVAGSTRTGEDEV ILDVFKKIKNYVLIIVPRHLDRLPKIEELIKENNLTYVKYSDLENNISTGKEDIILVDKM GVLRKLYSISDIAFVGGTLVNIGGHNLLEPLFYRKAVIFGKYTQNVVDIAKEILRRKIGF QVNDTEEFIEAIKNIESGKISDEEINSFFEENKMIALNIVKKENLIMNNIKDEAKDLWKH FFHSEKSNYNIYMYKLLDYPEYIMYDNDVMKAKKSKWNEYFGNSNPIAVEIGTGSGNFMY QLAERNPNKNFIGLELRFKRLVLATQKCQKRNIKNVAFLRKRGEELEDFLADNEISEMYI NFPDPWEGTEKNRIIQERLFETLDKIMKKDGVLYFKTDHDTYYSDVLELVKTLKNYEVVY HTSDLHNSEKAENNIKTEFEQLFLHKHNKNINYIEIKKLV >gi|224461340|gb|ACDC01000062.1| GENE 22 30459 - 31115 915 218 aa, chain - ## HITS:1 COG:FN1607 KEGG:ns NR:ns ## COG: FN1607 COG0283 # Protein_GI_number: 19704928 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Fusobacterium nucleatum # 1 218 1 218 218 291 83.0 8e-79 
MKNLIVAIDGPAGSGKSTIAKLLAKKYDLTYIDTGAMYRMITLYLLENNIDINDLKEVER VLNTVNLDMQGDKFYLDNVDVSTKIREKRINDNVSKVASIKIVRSNLVDLQRKISNNKDV ILDGRDVGTVIFPNAQVKIFLIASPEERARRRYNEFLEKKTEITYDEVLKSIKERDHIDS TRDESPFVKADDAIELDSTNLTIEDVINFISKEIEKAK >gi|224461340|gb|ACDC01000062.1| GENE 23 31568 - 31861 323 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739298|ref|ZP_04569779.1| ## NR: gi|237739298|ref|ZP_04569779.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 97 1 97 97 149 100.0 7e-35 MSIGKKIYMILYFGIYAIINLTCLGYVVDGLFFKLGFEDVFSSINENIYGAIGGTVIICC IINNRKIEEMLKDKKRNIYFSIILIVLAIGTYFFVYN >gi|224461340|gb|ACDC01000062.1| GENE 24 31936 - 32202 282 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739299|ref|ZP_04569780.1| ## NR: gi|237739299|ref|ZP_04569780.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 88 1 88 88 140 100.0 3e-32 MRNDVWKKIKIATYYFIGIICFGYIIETLLYKYGFEQTAPKIYDNISGALGGATASIFFK KDGNKKFDIYFLIIVIVLAIGTYFFVYN >gi|224461340|gb|ACDC01000062.1| GENE 25 32279 - 32548 254 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739300|ref|ZP_04569781.1| ## NR: gi|237739300|ref|ZP_04569781.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 89 6 94 94 150 100.0 2e-35 MNTNVWKKIGYVICGSIGLSLSWYVFYSLLYKYGFEQTAPKIYNILCYTILNLVLLGLIF KKNEIKKTDTYFILVLMVLGIGAQFFIHN >gi|224461340|gb|ACDC01000062.1| GENE 26 32583 - 33440 1305 285 aa, chain - ## HITS:1 COG:FN1610 KEGG:ns NR:ns ## COG: FN1610 COG1281 # Protein_GI_number: 19704931 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Fusobacterium nucleatum # 1 285 1 285 285 492 90.0 1e-139 MGRLIRGLSKNARFFVADTTDVVQKALDIHKYDEYSMKTFGKFCTLAAIMGATLKGEDKL TIRTDTDGYIKNIVVNSDANGDIKGYLINTSEENFEGLGKGTMRIIKDMGLKEPYVAITN VDYSSLPDDISAYFYNSEQIPTIISLACEDTNDGKILCAGAFMVQLLPGADEDFITKLER KAEAIRPMNELMKGGMSLEQIINLLYDDMDTADDSLVEEYEILEEKELKYNCDCNSERFQ RGIMTLGKEELKHIFEEEKEIEAECQFCGKKYKFTENDFEDILKK >gi|224461340|gb|ACDC01000062.1| GENE 27 33440 - 33964 582 174 aa, chain - ## HITS:1 COG:FN1611 KEGG:ns NR:ns ## COG: FN1611 COG1555 # Protein_GI_number: 19704932 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Fusobacterium nucleatum # 18 174 2 158 159 208 78.0 4e-54 MKKIISFLLFSCLFANSYAVPALNNNDYRLIMSSQNMQNEKEELLDINKASEQDMLGRKI SKSYVAKIMEYREITGGFDKLEDLKRIKGIGDATYQKLSKFLKVGSAPTKKVLNINSADE LTLKYYGFSKKEIKKIQTYLDKNDRITDNIEFQKLVKKKTYEELKDLINYGGKK >gi|224461340|gb|ACDC01000062.1| GENE 28 33961 - 35379 1390 472 aa, chain - ## HITS:1 COG:FN1612 KEGG:ns NR:ns ## COG: FN1612 COG0635 # Protein_GI_number: 19704933 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 4 472 1 469 469 781 91.0 0 MIKLLIETNVEINLRSIEEFTRVMASELLEDKILFDIQREENLIKIKVSSENLNKNTEFS YIDLENKIEDQILTMCKISLLKLLNKNYAWGSLMGVRPTKVLRRLLINGCDYEKARKILK DFYLVTDDKINLMETVVKKELEFLDKEHINLYLGIPFCPTKCKYCSFASYEIGGGVGRFY NDFVEALLKEIQIIGDFLKTYNKKVSSIYFGGGTPSTLTEIDLERVLKKLLENIDMSDVK EFTFEAGREDSLNIEKLEIMKKYSVDRISLNPQSFNLETLKRVNRRFNRENFDLIFKEAK 
KLGFIINMDLIIGLPEETTEEILDTLAQLNAYDIDNLTIHCLAFKRASKLFKESQERNSI DRALIEEHIQEIVKEKEMKPYYMYRQKNIIEWGENIGYSKEGKESIFNIEMIEENQNTMA LGGGGISKIVIEERNGIDYIERYVNPKDPALYIRELDKRCKEKIEMFRKEKI >gi|224461340|gb|ACDC01000062.1| GENE 29 35372 - 36325 1005 317 aa, chain - ## HITS:1 COG:FN1613 KEGG:ns NR:ns ## COG: FN1613 COG2805 # Protein_GI_number: 19704934 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Fusobacterium nucleatum # 1 317 3 316 316 447 81.0 1e-126 MEKIFEYARKNNISDIHIIEGERIYFRKDGEIVAYEDSQSLSREDILKICSGKFEEDFAY TDSKNQRYRINSFFTKGKLALVIRVINDDAIKLKGEFINKVIDEKILALKDGLVLISGIT GSGKSTTLANIIEKFNENKSIKILTIEDPIEYIFENKKSLIIQRELGTDVESFEKALKSS LRQDPDVIVLGEIRDEESLFSALKLAETGHLVFSTLHTMNAVESINRLISMAKSDKRDFI REQLASVLRFIFSQELYRDKKTKKVKAIFEILNNTKAVANLIANNKLNQIPSLIESGIEN YMITKEKYFKKIEIESD >gi|224461340|gb|ACDC01000062.1| GENE 30 36347 - 36517 242 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVGFPSGQRDQTVNLTAQPSKVRILPQPPSKNNQLHFGVAFLFFLLCDIIIKKAKK >gi|224461340|gb|ACDC01000062.1| GENE 31 36426 - 36641 561 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIGSTPISGTSAPFVQWLGHQIFTLETGVQFPYGVPLQLNTDGWVPERSKGSDCKSDGSA FEGSNPSPTTI >gi|224461340|gb|ACDC01000062.1| GENE 32 36759 - 37613 999 284 aa, chain - ## HITS:1 COG:FN0127 KEGG:ns NR:ns ## COG: FN0127 COG0731 # Protein_GI_number: 19703472 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 284 1 284 284 442 87.0 1e-124 MYKHVFGPVPSRRLGISLGVDLVVSKSCNLNCIFCECGATKKIQLERKRFKDMNEILEEI SAVLKDIKPDYITFSGSGEPTLSLDLGNISKAIKEDLKYEGKICLITNSLLLADENLMKE LEYIDLIVPTLNTLTQDIFEKIVRPDYRTSVEEIRKGFINLNKSNYKGKIWIEIFILENV NDSDKNFVDIANFLKSENIRYDKIQLNTIDRVGAERDLKAISFEKISRAKKILEENGLDN IEIIKSLGELEEDKKIQINQELLDNMKQKRLYQEEEINKIFKKN >gi|224461340|gb|ACDC01000062.1| GENE 33 38064 - 38631 897 189 aa, chain + ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 
19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 189 1 197 665 112 41.0 3e-25 MKEKLLEKIESLNEIEKYQEIIDLIEALPAEQLNTELIGELGRAYNNIEEYEKGLEILKS IENEVGDTALWNWRIGYSYFFLEDYTKAEKHFLKVYEQEPDNEKACDFLVGTYIALGEIE EENGNSEKAIEYALEAKKYAKTRENKIDTEIFLASLYNRHMRYTEAEEILRPILAKNKRD IEGNYELGY Prediction of potential genes in microbial genomes Time: Thu May 19 23:14:59 2011 Seq name: gi|224461339|gb|ACDC01000063.1| Fusobacterium sp. 2_1_31 cont1.63, whole genome shotgun sequence Length of sequence - 4513 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 3 - 1829 2562 ## COG0457 FOG: TPR repeat + Term 2020 - 2058 -0.9 + Prom 1915 - 1974 12.9 2 1 Op 2 . + CDS 2109 - 4512 3240 ## COG0457 FOG: TPR repeat Predicted protein(s) >gi|224461339|gb|ACDC01000063.1| GENE 1 3 - 1829 2562 608 aa, chain + ## HITS:1 COG:FN1964 KEGG:ns NR:ns ## COG: FN1964 COG0457 # Protein_GI_number: 19705260 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 606 102 708 709 681 61.0 0 KQERYEEALEYFLKLEKLNDADDWLYQKIGICYKNLDKKEEALKYYLKAVELDEEDTYSI SDIAWLYNFLGKYEEGLKYLERLEEFGQDDAWTNGEFGYCLSRLERHEEAIKKLNHALEV EDEEKDIAHIHSLLGWSYRQLEDYDKALEHYIQSKENGRDSAWINDEIGYCYKKKSDLKK ALEFYLLAEKYDKNDLDLVSEIAWVYSILGEYKEGLKYTERAVKLGRNDAWINIQYGACL ANSNRFQEAIEKLEYALSLDEEKDLAFAYSQLGWCYRLLGDYEKALGYHIKSQEEGRNDA WINFEIAICYENLNDYEKALEYALISYELDKDEVNVLSEIGWLYNYMGKYEDALPFLLRA QELGRNDEGINTEIAVSLGRSGNVKEAIEKLKKSLTMINKNEINQRIFINSELAWLYGSL EDPQPEEALKYLNVAKELGREDEWIHSQIGYQLGYNPDKSEEALEHFEKAIELGRDDAWI FEMRGIILLNLKRYEEALESFKKAYDKDNNGWYLYSMGRCLRGLERYEEAIEILLKSRQI SLDEEDVVDGEDFELAHCYIGIGDKENAQKYLDSARDSITERGILNDYFKEEIEKIEKGI SSLDTLFN >gi|224461339|gb|ACDC01000063.1| GENE 2 2109 - 4512 3240 801 aa, chain + ## HITS:1 COG:FN1964 KEGG:ns NR:ns ## COG: FN1964 COG0457 # 
Protein_GI_number: 19705260 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 93 797 2 705 709 825 64.0 0 MKKEILEKIEKLHDLEKYQEIIDLIEALPAEQLNTDLIGQLARAYNNVENYEKGLELLKT IEFEEGDSFLWNWRAGYSYFFLEDYTNAEKCFLKAYELDPSDNATCDFLKATYTSLSKLE GRNGNSEKAIEYALEAKKYVYDEEGRIEADSFLASRYDRYGKYEEAEEILRNLLAINKED DWTLSELGYCLSAQGKYEEALEYLLAAEKINKNDVWTYRQIGICYKNLDNKEEALKYYLK AVDLDEEDKYSIADIAWIYDVFGNYEEALKYLERLDELGEENDVWTNIEFGLCLSRLGRY EEAIERINHALEIEDEEKNTGYIYSQLGFCKRKLEEYDEAIEAFKQAKKWGRNDAWINVE LGYCYRLKNEIKKALECNLEAEKFDKKDPYIMSDIAWFYDNLDQYKEGLKYVKKAIKLGR NDAWINVEYGACLAGLGKYEEAIEKFEYALSLKDEEKDLAFIYNQLGWCYRLLGDYEKAL ECYIKSKEEGKNDAWTNVEIAMCYENLNDYEKALEYALIAYDLDRDDIRSLSEVGWIYNC KEKYEDALPFLLRAEELGRDDEWLNTEIGINLGRSGKINEGIERLKKSLTMVDKDNISQK IFINSELAWLYGRLEDPQPEEALKYLNAAKELGRDDEWIHSQIGYQLGYNPDKSEEALEH FEKAIELGRDDAWIFEVKGIILLDLKRYEEALESFKKAYDKDNNGWYLYSMGRCLRGLER YEEAIEILLKSRQISLDEEDVVDGEDFELAHCYIGIGDKENAQKYLDSARDAIIERGILN DYFKEKIDEIEKGISSLDILF Prediction of potential genes in microbial genomes Time: Thu May 19 23:15:12 2011 Seq name: gi|224461338|gb|ACDC01000064.1| Fusobacterium sp. 2_1_31 cont1.64, whole genome shotgun sequence Length of sequence - 42541 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 13, operones - 11 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 3.5 1 1 Op 1 2/0.000 + CDS 97 - 1275 1937 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 2 1 Op 2 4/0.000 + CDS 1275 - 1967 904 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase + Prom 2096 - 2155 9.8 3 1 Op 3 1/0.000 + CDS 2183 - 3034 1359 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 3052 - 3104 3.7 + Prom 3067 - 3126 14.8 4 2 Op 1 13/0.000 + CDS 3150 - 4199 1582 ## COG0457 FOG: TPR repeat 5 2 Op 2 . 
+ CDS 4221 - 6629 3334 ## COG0457 FOG: TPR repeat
+ Term 6643 - 6693 11.5
+ TRNA 6773 - 6849 82.1 # Pro TGG 0 0
+ TRNA 6857 - 6932 93.2 # Gly TCC 0 0
+ TRNA 6947 - 7022 75.9 # His GTG 0 0
+ TRNA 7026 - 7101 94.1 # Lys TTT 0 0
+ TRNA 7108 - 7191 68.7 # Leu TAG 0 0
+ TRNA 7227 - 7301 72.4 # Gln TTG 0 0
- Term 7345 - 7391 9.9
6 3 Op 1 . - CDS 7423 - 7710 494 ## FN0038 hypothetical protein
- Prom 7765 - 7824 5.2
7 3 Op 2 . - CDS 7849 - 8421 710 ## gi|237739315|ref|ZP_04569796.1| predicted protein
- Prom 8529 - 8588 13.7
- Term 8591 - 8635 9.1
8 4 Op 1 . - CDS 8654 - 8866 414 ## FN1966 hypothetical protein
9 4 Op 2 27/0.000 - CDS 8910 - 10409 1795 ## COG0732 Restriction endonuclease S subunits
10 4 Op 3 5/0.000 - CDS 10411 - 11835 1916 ## COG0286 Type I restriction-modification system methyltransferase subunit
11 4 Op 4 . - CDS 11849 - 15106 4220 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases
- Prom 15137 - 15196 5.6
12 5 Op 1 1/0.000 - CDS 15234 - 15611 693 ## COG0251 Putative translation initiation inhibitor, yjgF family
- Prom 15633 - 15692 7.2
13 5 Op 2 . - CDS 15703 - 18531 2327 ## COG1061 DNA or RNA helicases of superfamily II
- Prom 18773 - 18832 14.6
+ Prom 18886 - 18945 11.3
14 6 Tu 1 . + CDS 19102 - 19563 438 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
+ Prom 19566 - 19625 7.9
15 7 Op 1 1/0.000 + CDS 19761 - 21047 1877 ## COG0536 Predicted GTPase
16 7 Op 2 . + CDS 21040 - 21942 935 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase
17 7 Op 3 . + CDS 21990 - 22382 381 ## FN1916 hypothetical protein
18 7 Op 4 1/0.000 + CDS 22387 - 22797 511 ## COG3920 Signal transduction histidine kinase
19 7 Op 5 1/0.000 + CDS 22799 - 23146 381 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor)
20 7 Op 6 . + CDS 23177 - 24733 2297 ## COG1418 Predicted HD superfamily hydrolase
+ Term 24751 - 24800 3.2
+ Prom 24818 - 24877 13.2
21 8 Op 1 . + CDS 24904 - 25278 707 ## COG1302 Uncharacterized protein conserved in bacteria
22 8 Op 2 . + CDS 25297 - 25887 772 ## FN1618 hypothetical protein
23 8 Op 3 . + CDS 25887 - 26114 278 ## FN1617 prolipoprotein diacylglyceryltransferase
24 8 Op 4 . + CDS 26119 - 26577 700 ## COG0781 Transcription termination factor
+ Term 26582 - 26625 5.5
- Term 26574 - 26610 4.2
25 9 Op 1 . - CDS 26626 - 26751 173 ## gi|294783684|ref|ZP_06749008.1| hypothetical protein HMPREF0400_01678
- Prom 26780 - 26839 7.1
26 9 Op 2 . - CDS 26841 - 28025 1504 ## Lebu_1366 VWA containing CoxE family protein
27 9 Op 3 . - CDS 28034 - 30301 2401 ## Lebu_1367 hypothetical protein
28 9 Op 4 . - CDS 30315 - 31415 1654 ## COG0714 MoxR-like ATPases
- Prom 31468 - 31527 5.4
29 10 Op 1 . - CDS 31568 - 32227 591 ## Lebu_1369 hypothetical protein
30 10 Op 2 . - CDS 32249 - 34159 2321 ## Lebu_1369 hypothetical protein
31 10 Op 3 . - CDS 34173 - 35621 1818 ## Lebu_1370 zinc finger SWIM domain protein
- Prom 35648 - 35707 7.7
32 11 Op 1 . - CDS 35725 - 37359 2060 ## Lebu_0003 protein of unknown function DUF1703
33 11 Op 2 . - CDS 37407 - 38150 940 ## COG0863 DNA modification methylase
34 11 Op 3 . - CDS 38169 - 39026 1153 ## EUBELI_01767 type II restriction enzyme
35 11 Op 4 . - CDS 39039 - 39965 965 ## COG0338 Site-specific DNA methylase
- Prom 39992 - 40051 14.9
+ Prom 39970 - 40029 12.2
36 12 Tu 1 . + CDS 40148 - 41356 1223 ## COG1373 Predicted ATPase (AAA+ superfamily)
+ Term 41362 - 41410 12.2
37 13 Op 1 . - CDS 41428 - 41883 515 ## gi|237739344|ref|ZP_04569825.1| predicted protein
38 13 Op 2 . - CDS 41899 - 42351 647 ## gi|237739345|ref|ZP_04569826.1| predicted protein
39 13 Op 3 . 
- CDS 42392 - 42541 108 ## Predicted protein(s) >gi|224461338|gb|ACDC01000064.1| GENE 1 97 - 1275 1937 392 aa, chain + ## HITS:1 COG:FN0128 KEGG:ns NR:ns ## COG: FN0128 COG1840 # Protein_GI_number: 19703473 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 79 392 1 314 314 607 96.0 1e-173 MYISKSMSIKSIVEKYPETIPVFTNIGFKGLDNPAVLQKLEEQGITLEKAMMIKKEDVDA FIPMLQQAIASVEREDEGVKEASLMGLLPCPVRIPLLEGFEKYLADNKDIKVKYELKAAY SGLGWIKDEVIDKNDIDKLADMFISAGFDLFFDKDLMGKFKEQGIFKDMTGIEKYNTDFD NENIHLKDPHGDYSMIGVVPAIFIVNKAALNGREVPRSWGDLLKPEFAKSVSLPIADFDL FNSILIHIYKLYGFEGVRSLGHSLLSNLHPAQMVEAKEPVVTIMPYFFSKMVPEKGPKEV IWPKEGAIISPIFMLTKASKAKELEKVIKFMSGKAVGDTLANQGLFPSVHPEVKNPVNGR PMLWVGWDFIYSNDMGELIKKCEETFKEGAAE >gi|224461338|gb|ACDC01000064.1| GENE 2 1275 - 1967 904 230 aa, chain + ## HITS:1 COG:FN0129 KEGG:ns NR:ns ## COG: FN0129 COG0378 # Protein_GI_number: 19703474 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Fusobacterium nucleatum # 1 230 1 230 231 445 96.0 1e-125 MKLITVSGPPSSGKTSLIIKTIESLKAQNIKVGIVKFDCLYTDDDVLYEKAGILVKKGLS GSVCPDHFFASNIEEVVQWGQANGVDLLITESAGLCNRCSPYLKDIKAVCVIDNLSGINT PKKIGPMLKLADIVVITKGDIVSQAEREVFASRVQTVNPKAAIIHINGLTGQGTYEFGSL IMDNNEEIDTVLERKLRFPLPSAVCSYCLGETRIGNEYQLGNIRKINFEE >gi|224461338|gb|ACDC01000064.1| GENE 3 2183 - 3034 1359 283 aa, chain + ## HITS:1 COG:FN0130 KEGG:ns NR:ns ## COG: FN0130 COG1136 # Protein_GI_number: 19703475 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 16 283 1 268 268 482 98.0 1e-136 MSIDNNELDEMDFDLLDILGVTEQKVESITLLPGYNKKGEKEGYEELVIKSGEIVAIVGP TGSGKSRLLADIEWGAQGDTPTKRTVLVNGELMDAKKRFSPSYKLVAQLSQNMNFVMDLT VREFIDLHAESRLVLDRESVIEKIFNQANELAGEKFTIDTPITSLSGGQSRALMISDTAI 
LSTSPIVLIDEIENAGIDRKKALDLLVGNNKIVLMATHDPILALMGDRRIVIKNGGINKV IESTTEEKNILGALTELDDVVQGMRNKLRYGERLELDFEIKKK >gi|224461338|gb|ACDC01000064.1| GENE 4 3150 - 4199 1582 349 aa, chain + ## HITS:1 COG:FN1965 KEGG:ns NR:ns ## COG: FN1965 COG0457 # Protein_GI_number: 19705261 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 349 1 345 345 425 67.0 1e-119 MDQKFWDKIDSFGQNGEYDKIVREIKKLPADKMDMELINVLGRAYMYLGDLGNALDTYLS FIGKAEEDTLNADIWLYSEAGWTCNEFEDFEQGLKYLLEAEKLGRDDEWLNTEIGQCLGR LERYEEAIKRLEKSLKLIEADEAENGDEKVNEKIFVVSELGYLHGVQGKNEEALKYFYTA KDLGRNDNWIYLHLYHNLKTTKGEEEALKYFEEQAKIEDKNPVLLEALGNIYMLEAANYE KAEKTFQKAFALSGDGQQLFNRGRALAALKKYKEAIEVLLQSRRISEQEGDVTDGEDMEL VRCYIGLKDKKNAEKYLELAREGADNVADEFIDEYEEELDKLEDMIDEL >gi|224461338|gb|ACDC01000064.1| GENE 5 4221 - 6629 3334 802 aa, chain + ## HITS:1 COG:FN1964 KEGG:ns NR:ns ## COG: FN1964 COG0457 # Protein_GI_number: 19705260 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 93 800 2 705 709 825 65.0 0 MKETLLEEIERLHDLEKYQEIIDLIESLPAERLNTELIGKLASAYNNIENYEKALEILKT IEFEEGNSFQWNRRAGYSYFFLEDFVNAEKCFLKAYKLDPDDNDTCDFLIAIYRNLAKLE NENRNSEKSIEYALKAKKYAYDEEGRIETDSFLAWTYNRYGKYEEAEEILRNQLARNKDN QWAYSELAYSLSEQGKFEEALENCFKAEDLGRNDDWLFTRIGICYKNLGKNEEALEYYLK ALELSEDDIFIMSDIAWLYDITDRYEEALKYLERLEELGQDDAWTNTEFGYCLSKLGRYE EAIEKLNHALEVEAENDDKDVSYIYARLGWCKRKLGMCDEAIEDFNKAKKWGRNDAVINA EIGHCYKAKDEYEEALKYYLQAEKFDKKDPLIMSEIAWHYGALGLYSESIKYIKKAMRLG RNDAWINIEYGACLAGLDKYEEAIEKFEYALSLNDEDKDKDLAFTHGQLGCCHRNLENYE EAIKYLMLSKEEGRNDVWTNLELAMCYENLEDYEKALEYALVAYELDKDDIGTISEVATI YDNLENYEEALPFLLRAEELGRNDEWINTEIALNLGRSGKTHEALERLEKSLTLVDEDRI NQKIFINSEIAWNYGRLEEPHPEEALKYLNIAKELGREDAWIYSQIGYQLGYNLETRKEA LEHFDKALELGQNDAWIFEMRGTILLDFKRYEEALDSFKKAYDLNDDSWYLYSIGRCLRG LERYEEAIEVLLESRQLSLDEDDVVDGEDLELAHSYLGIGDKDNAQKYLNSARDSIDKQG TLNDEIKEEIEKIEKGILSLDN >gi|224461338|gb|ACDC01000064.1| GENE 6 7423 - 7710 494 95 aa, chain - ## HITS:1 COG:no KEGG:FN0038 NR:ns ## 
KEGG: FN0038 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 95 6 100 100 102 78.0 5e-21 MNQNEKKEIMGKFAKKLENAIKREVAVTKEIENDKALIKYLEAQKAAGAALDTTAYESYD AWIDTIKKQIKKSESTLTNIEFKKVELEAVKQYLA >gi|224461338|gb|ACDC01000064.1| GENE 7 7849 - 8421 710 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739315|ref|ZP_04569796.1| ## NR: gi|237739315|ref|ZP_04569796.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 190 1 190 190 299 100.0 6e-80 MKKFFKVFLFGLMLVSGVIFISCGKKELTGLHKEFDIIFNGIKEEVTTEFKNNLDALEKE VKDSSRSELETKIQLFGIEVLVEAFNKASYDVVSINDMGDKAELKIKVKAVDFFEALQQI ITNTTKDKSNLVDEVEGLLKKIKKGKAPVIEQEMDIEMIKENDTWTIPERQKYVLMKRMM GIKKGTIFDN >gi|224461338|gb|ACDC01000064.1| GENE 8 8654 - 8866 414 70 aa, chain - ## HITS:1 COG:no KEGG:FN1966 NR:ns ## KEGG: FN1966 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 70 1 70 70 87 82.0 2e-16 MNRDAKFINFSEVHELDYILKKYGKETSKENRDLLKEFGKQAKELLGKTMLGHQDLYKYI EDNSLAEKLK >gi|224461338|gb|ACDC01000064.1| GENE 9 8910 - 10409 1795 499 aa, chain - ## HITS:1 COG:Cj1051c_2 KEGG:ns NR:ns ## COG: Cj1051c_2 COG0732 # Protein_GI_number: 15792378 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Campylobacter jejuni # 159 467 137 453 454 133 34.0 9e-31 MAKNKNVEISLEEKLKKALVPVDEQPYTIPSNWVWTRYDVLFSDISKNEKKIEEKNYLEN GEIAIVSQGKDKIVGYSDILEVKPYKEELPLIIFGDHTLNVKYIEFPFYIGADGVKVLKT TDIIIPKFLFYLLNNLKTFSLINTGYRRHYPILKKLFFPLSPLNEQKRIVEKLDFLFEKI KRAKEIIEEIKIDIENRKISILDRAFKGTLTSKWRSENKISDVKELLKSINEEKIKKWEE DCLQAEKDGNKKPKKPTITEVKDMIVPVDEQPYKLPDSWVWVRLGDIVEINPNKIKINID ENELVDFIPMKNVSENSPEIIENNFEKFKNLQKGYSQFIENDILFAKITPCMENGKTAIV SNLKEKIGYGSTEFHVLRSTKIISNKLLYNFLKQQRFRKDAKYNMTGSVGFRRVPTEFMR SYPFPLPPLEEQQEIVRVLDEVLENENKVKELLELEERIDILEKSILHKAFKGELGTQNS SDESAMELLKNSIAINTKI >gi|224461338|gb|ACDC01000064.1| GENE 10 10411 - 11835 1916 474 aa, chain - ## HITS:1 COG:STM4525 KEGG:ns NR:ns ## COG: STM4525 COG0286 # Protein_GI_number: 16767769 # 
Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Salmonella typhimurium LT2 # 1 465 1 507 529 394 43.0 1e-109 MTNNEIVQKLWNLCNVLRDDGITYHEYVTELTYILFLKMLAEQDNEAEVGIPEEYRWNTL IKLDGLELKTTYQKALIDLAQKENNLAIIYRNAKTNIEEPANLKKIFSEIDKMDWYSVDK EDFGDLYEGLLEKNASEKKSGAGQYFTPRVLIDTIVKVTKPQLKERICDPASGTLGFIIS ANRYIKEKNDDYYGISEEDYAFQKKEAFSACELVPDTHRLGIMNALLHGVEGNFLQGDTL SATGTQLKNFDLILSNPPFGTKKGGERATRDDLVFSSSNKQLNFLEIIYRSLNLTGRARA GVVLPDNVLFEGGIGKDIRQDLLNKCNVHTILRLPTGIFYAQGVKTNVLFFDRAKTDIGN TKDIWFYDLRTNMPNFGKTTPLTEKYFEEFISTFDNDEEKEKLERWTKISIDEVIKKDYS LDLGLIKDESLLDIEALPNPILNTNETIEKLEEAIDLLKLVVNELKNCGLSEED >gi|224461338|gb|ACDC01000064.1| GENE 11 11849 - 15106 4220 1085 aa, chain - ## HITS:1 COG:hsdR KEGG:ns NR:ns ## COG: hsdR COG4096 # Protein_GI_number: 16132171 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Escherichia coli K12 # 3 1082 23 1183 1188 634 34.0 0 MYSNFNFLQNDWQGLAKIGEMAEYILYKDPNTAIMKLRQFGEELINTMIKIENFSCDKTT LAVDKILILKRAGLIPDDIDNILHSLRKKGNKAAHGAYGDEKTAETLLSLAVKLGAWFQE IYGTDMSFHSETIEYKKPENIDYEKEYQKLVERTDEIQKELEDIKTNPHLTTREDRKKAI SKKREIEFTEEETRLIIDKQLAEAGWEVDTKVLNYKTNKTLPEKKKNLAIAEWPCIKENG RKGFADYALFLGEKLYGILEAKRLNTDIPAALNADSRIYSKGVEIFENAQLCEGSPFGEY KAPFLFSSNGRGYNKDLPEKSGIWFLDARKESNLPKVLKGFYSPKDLKELLEKDDELANQ KLKEESFEYLESKFGLGLRYYQKDAIKSVEESLISGKKKVLLTMATGTGKTRTALGLIYR LLKTNKFKRILFVVDRTLLGEQAKETFDDVKIEQLLSLGGIYGVKGLNDKSTDKDERVHI ATVQSLIKRILYPNTEEKEKLTVGEYDCIIVDEAHRGYILDRNMKEEQKDFFDEKDFESK YKAVIDYFDAVKIALTATPALHTQEIFGEPVYSYSCSQAVIDGYLVDVEPPYEIITKLSE EGIHYQKGAMVKVYDVEAQKVKEREVLADELDFDIEKFNKSVITESFNREVCSALVDYIN PEGPEKTLIFTASDEHADMVVRILREEYQKQAMFDMNIEMIAKMTGYVKDVDHLVKKFKN EAYPTIGVTVDLLTTGVDVPRITNLVFLRKVKSRILYHQMLGRATRKCDEIDKTSYKVFD AARNYVDMKDFSDMNPVVNNPQIDMEKLLDSYSKDVPDESKKYFIEQVIARLQRKKKRIK DLGENKFEINSKIYRKNEDIKNIDDYIEYIKNINPDDMEKEEDFLIFLDSIQNPKKDRII 
SEHEDEVRTVRQIYGKNEKPEDYLENFEKYIRENQDKVEALKLLKENPKLFKRKDLKELR YILDENGYKETELNSAYGKVENVNITADILSYIKKVLKGSTILDKEEKIQDIEKRIKRLK NWNPIQLKIIEKIISQLRENSYLTEEDFSSGIFKDNFGGYNKINQKLEDKLADIVSIINE EIILN >gi|224461338|gb|ACDC01000064.1| GENE 12 15234 - 15611 693 125 aa, chain - ## HITS:1 COG:FN1973 KEGG:ns NR:ns ## COG: FN1973 COG0251 # Protein_GI_number: 19705269 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Fusobacterium nucleatum # 1 125 4 128 128 223 97.0 7e-59 MKRVINTTNAPAALGPYSQAIEANGVLYVSGQIPFVPATMTLVSEDVEEQTKQSLENIGA ILKEAGYDFKDVVSTTVYIKDMNDFTKINGVYDKYLGEVKPARACVEVARLPKDVKVEIG VIAVK >gi|224461338|gb|ACDC01000064.1| GENE 13 15703 - 18531 2327 942 aa, chain - ## HITS:1 COG:FN1974_2 KEGG:ns NR:ns ## COG: FN1974_2 COG1061 # Protein_GI_number: 19705270 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Fusobacterium nucleatum # 182 942 1 761 761 1250 92.0 0 MENILLEALKTSSIDFNIDSDEKYQYELIANGEEKIVTRLRKYFEDCDEFVISVAFITMG GISLFLEELKNLENKGIKGKILTGDYLTFTEPKALKKLLSYKNIDLKVATNRKHHTKAYF FRKGNIWTLIVGSSNLTQGALTVNFEWNIKVNSLENGKIVKSVLETFNKEFDNLKTLTEE DIENYQKRYEQLKKLIEANNQNIDLNEIKPNSMQVQALKNLEETRTENDRALLISATGTG KTYLSAFDVKQAKAKKILFVAHRKVILERSKVSYQRILKDKKLEIFDSNFQINDKDEVVF AMVQTLNKEKNLNIFPKGYFDYIIIDEVHHGGAKTYQSIFEYFKPKFLLGITATPERTDD FNIYQLFNYNVAYEIRLQDAMKEDLLCPFHYFGISDIVIDGENIDEKTSIKNLTSDERVR HILEKSKYYSYSGEKLHCLVFVSKVEEAKILVEKFLEQGVKALALSSENSDNEREEAIRK LEEGEIEYIISVDIFNEGVDIPCVNQVILLRPTTSAIVYIQQLGRGLRKHKNKAYTVVLD FIGNYEKNFLIPIAISQNNSYDKDFMKRFLMNATDFLAGESSISFDEISKERIFENINKV NFSNRKLIEEDFKLLESQLGRIPYLYDFYIKNMLSPTVILKYKKDYDEVLKNIAPKYRAG SLNSIEKKFLIFLSTFFTPAKRVHEMLILKELFVKEKLNLEEVEKILKNKYSLINQENNI KNAFEHLSKEIFITLSTTKSFEPVLYRKDDYYFLDENFKNSYSSNPYFKILIDDLIKYNL AFAENNYNNFVKESIKLFGEYTKQEAFWYLNLNFNNGFQVSGYTPFENERKLLIFITMDN LSERADYSNEFYDAQTFSWFSKSSRYLKKDNKLTIEGKIAENFYEINVFVKKNNGENFYY LGDVEKVLSAKEIKDSQEKSMVKYIFKLKKDVKKELLDYFNM 
>gi|224461338|gb|ACDC01000064.1| GENE 14 19102 - 19563 438 153 aa, chain + ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 152 1 152 158 247 86.0 6e-66 MITTLCYLEKENKYLMLHRTKKENDINKNKWLGVGGKLEKEETPEQCLIREVKEETGLDL IDYVHRGIVIFNYNDDEPLDMYLYTSKNFSGEIQECSEGDLKWIDKSEIYKLNLWEGDRI FLELLEKDAPFFHLILNYENDNLLSSELKFVEK >gi|224461338|gb|ACDC01000064.1| GENE 15 19761 - 21047 1877 428 aa, chain + ## HITS:1 COG:FN1918 KEGG:ns NR:ns ## COG: FN1918 COG0536 # Protein_GI_number: 19705223 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 428 1 428 428 707 96.0 0 MFIDEVIITVKAGNGGDGSAAFRREKFIQFGGPDGGDGGKGGDVVFVADSNINTLIDFKF KKLFKAQNGENGQKKQMYGKKGEDLIIKVPVGTQVRDFTTGKLILDMNVNGEQRVLLKGG KGGYGNVHFKNSVRKAPKIAEKGGEGAEIKVKLELKLLADVALVGYPSVGKSSFINKVSA ANSKVGSYHFTTLEPKLGVVRLEEGKSFVIADIPGLIEGAHEGVGLGDKFLRHIERCKMI YHIVDAAEIEGRDCIEDFEKINEELRKFSEKLANKKQIVIANKMDLIWDMEKFEKFKSYL AEKGIEVYPVSVLLNEGLKEILYKTYDMLSKIEREPLEEEVDITKLLKELKIEKEDFEIT RDEEDAIVVGGRIVDDVLAKYVIGMDDESLITFLHMMRNLGMEEALQEFGVQDGDTVKIA DVEFEYFE >gi|224461338|gb|ACDC01000064.1| GENE 16 21040 - 21942 935 300 aa, chain + ## HITS:1 COG:FN1917 KEGG:ns NR:ns ## COG: FN1917 COG0324 # Protein_GI_number: 19705222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Fusobacterium nucleatum # 1 300 4 303 303 448 85.0 1e-126 MNKAIVIAGPTGVGKTKISIDLAKLLNAEIISSDSAQVYKGLNIGTAKISEKEKQGVEHH LIDIVEPIAKYSVGNFEKDVNKILNQNPEKNFMLVGGTGLYINSVTNGLSVLPEADKKTR EYLASLDNQTLLELALKYDEEATKEIHPNNRVRLERVVEVFMLTNKKFSELSKKNIKNNN FKFLKIALERNREELYDRINKRVDIMFEEGLVKEVENLYKIYGEKLYSLNIIGYNELIDY FNGLSSLEEASYKIKLNSRHYAKRQFTWFKADKEYVWFNLSEVSEDEVVKRVHTLFNIKS 
>gi|224461338|gb|ACDC01000064.1| GENE 17 21990 - 22382 381 130 aa, chain + ## HITS:1 COG:no KEGG:FN1916 NR:ns ## KEGG: FN1916 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 35 129 1 95 96 138 85.0 5e-32 MKKIILLIAMVFLLISCSNNNYVQKGFSQNEKQALILFKDKIKSNLSENNLAYIKENTKD SYRNRYILEKLQNIDFTKLNIFVSQPSYTTEYPSSILALNMNEDTYYFDLIFIYDKQNKK WLIFDLKEKE >gi|224461338|gb|ACDC01000064.1| GENE 18 22387 - 22797 511 136 aa, chain + ## HITS:1 COG:FN1915 KEGG:ns NR:ns ## COG: FN1915 COG3920 # Protein_GI_number: 19705220 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 28 136 1 109 109 164 83.0 5e-41 MNNFDEEINKIKIFIPSFLEGLSTVRAMIRVYLREHNISKLDEIQLLSVVDELTTNAVEH AYSDSQGEIEVVLNYYNNTIFLTVEDFGRGFDESLDSKEDGGFGLSIARKLVDVFEIEKK RKGTIIKVEKKIKEAV >gi|224461338|gb|ACDC01000064.1| GENE 19 22799 - 23146 381 115 aa, chain + ## HITS:1 COG:FN1914 KEGG:ns NR:ns ## COG: FN1914 COG1366 # Protein_GI_number: 19705219 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Fusobacterium nucleatum # 1 115 1 115 115 179 95.0 9e-46 MENNFEILERMKDDIQIIEINGELDAFVAPKLKETFNKLIEKDINKYIVDFKGLIHINSL AMGILRGKLQTVREMGGDIKIVNLNKHIQTIFETIGLDEIFEIYKNEEEALKSFK >gi|224461338|gb|ACDC01000064.1| GENE 20 23177 - 24733 2297 518 aa, chain + ## HITS:1 COG:FN1913 KEGG:ns NR:ns ## COG: FN1913 COG1418 # Protein_GI_number: 19705218 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Fusobacterium nucleatum # 11 518 1 508 508 735 94.0 0 MIFLGLAILALALVFTVFFKKSVIDRQIEKLNDLEDEVEKAKLKAKEIVEEAEKDASSKA KEIELKAKEKAYQIKEEVEKEARNLKNEIAQKEARIVKKEEILDGKIEKAENKSLELEKI NNELEAKRKEIDELKVKQEEELSRVSELTKADAREILLHKIREELTHDMAVTIREFETKL DEEKEKISQKILSTAIGKAAADYVADATVSVINLPNDEMKGRIIGREGRNIRTIEALTGV DVIIDDTPEAVVLSCFDGVKREIARLTIEKLITDGRIHPGKIEEIVNKCRKEVEKEIIAA 
GEEALIELSIPSMHPEIIKTLGRLKYRTSYGQNVLTHSIEVAKIASTMAAEIGANVELAK RGGLLHDIGKVLVNEIETSHAIVGGEFVKKFGEKQEVVNAVMAHHNEVEFETVEAILVQA ADAVSASRPGARRETLTAYIKRLENLEEIANSFDGVESSYAIQAGRELRIVINPDKVSDD EATLMSREVAKKIEDTMQYPGQIKVTILRETRAVEYAK >gi|224461338|gb|ACDC01000064.1| GENE 21 24904 - 25278 707 124 aa, chain + ## HITS:1 COG:FN1619 KEGG:ns NR:ns ## COG: FN1619 COG1302 # Protein_GI_number: 19704940 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 124 1 122 122 152 85.0 2e-37 MSELGNIRIADEVVKTIAAKAAGDVEGVYKLAGGVVDEVSKMLGKKRPTNGVKVEVGEVE CSIEVYLVIKYGYQIPKVAEAVQKAVLEEVSKLSGLKVVEVNVYIQNIKVEEVTEEETTE EYED >gi|224461338|gb|ACDC01000064.1| GENE 22 25297 - 25887 772 196 aa, chain + ## HITS:1 COG:no KEGG:FN1618 NR:ns ## KEGG: FN1618 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 19 183 1 167 179 200 70.0 3e-50 MLKKLIFFFAWIGIFLISLVALNYILLPGQIVYDNPYAEKITSFQYKMIILVLAALYLFI CIIKFFSLFERKKDYERKTENGLLKISKATINNYVMDLLRRDPDITGIKTVSELKGNKFF INIKCELLAKMNIANKISYLQNLIKTDLMENLGVDVNKVVVNIAKIEAREKEKTNDEASN EVPAVNVEGDNVEVNN >gi|224461338|gb|ACDC01000064.1| GENE 23 25887 - 26114 278 75 aa, chain + ## HITS:1 COG:no KEGG:FN1617 NR:ns ## KEGG: FN1617 # Name: not_defined # Def: prolipoprotein diacylglyceryltransferase # Organism: F.nucleatum # Pathway: not_defined # 2 75 1 74 74 110 78.0 2e-23 MLPDNILEVLLEKIINNWRKVYGSILGFIVGLTVVNYGILKAIVIFAFAFIGYKLGDSSF TKKMKKTIINRLKED >gi|224461338|gb|ACDC01000064.1| GENE 24 26119 - 26577 700 152 aa, chain + ## HITS:1 COG:FN1616 KEGG:ns NR:ns ## COG: FN1616 COG0781 # Protein_GI_number: 19704937 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Fusobacterium nucleatum # 1 152 1 152 153 187 71.0 9e-48 MKEIFGEETKKKKAGIRLVREELFKIVFGVEATESTSEELEKAFDIYLSNNEDFVSTLSE SQLKFLQTSVKGISENYDNIKDTIKANTQNWAYERIGLVERTLLIIATYEFLKANTPIEV VANETVELAKEYGNEKSYEFVNGILANIGKTK >gi|224461338|gb|ACDC01000064.1| GENE 25 26626 - 26751 173 41 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294783684|ref|ZP_06749008.1| ## NR: gi|294783684|ref|ZP_06749008.1| hypothetical protein HMPREF0400_01678 [Fusobacterium sp. 1_1_41FAA] # 1 41 18 58 58 62 100.0 7e-09 METYRNYNREDFNKILSAKITRKLLRTATVLAALGILNKTF >gi|224461338|gb|ACDC01000064.1| GENE 26 26841 - 28025 1504 394 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1366 NR:ns ## KEGG: Lebu_1366 # Name: not_defined # Def: VWA containing CoxE family protein # Organism: L.buccalis # Pathway: not_defined # 1 394 1 392 393 676 84.0 0 MDYKEDIKRWRLILGKDTQDTFSSMNSEAISSLSEEDWLMDRALDAIYNPTGKFMGEAAL GAGRGPSNPQISKWLGDVRDLFDKELVKIIQTDAMDRCGLKQLIFEPEILEQVEPDISLA STIMLLKDQIPKHSKESVRAFIKKIVEEINKLLESDIRRAVRAALNKRQHSPIPSASALD FKRTIQRGIKNYNKELKKIIPEHYYFFERASTNPSSKFTVILDIDQSGSMGESVIYSSVM ACILASMAALKTRIVAFDTNIVDLTEKSDDPVDLLYGFQLGGGTDINKSITYCMNYIENP KKTIFFLISDLMEGGNRGGMLRHLQEMKDSGVIVVCLLAISSDGQPYYDSQMAEKISSMG IPCFACNPEKLPLLLERVLKGLDLNSFQEEFKKK >gi|224461338|gb|ACDC01000064.1| GENE 27 28034 - 30301 2401 755 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1367 NR:ns ## KEGG: Lebu_1367 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 755 1 759 759 1129 73.0 0 MKKQNENKPHIFGVRHFSPAGAYYVRKYLDEVQPKVVLIEAPSDFTDLMDKITAKEVVPP IAIMAYTLEAPIQTIIYPFAEFSPEYQAILWAKENKVECRFCDLPSSVFLAIQNKGENPS EESLNSYIHRKIDEFSEDSDSEVFWERVMEQAADHQAYRSGARDYGTNLRELTLANTKSD AENIIREAYMCKQVAELCEEGFKINEIAMVVGAFHIEGIEKGNFLSDEEFNLLKKVETKK TLMPYSYYKLSTYSNYGAGNKAPGYYELLWKGLNKEDIYYAVYGYLSRLADFQRTSGNMV SSAQIIEAVQLAISLANIHNSKIPTLKDMQDAAITCMAQGSHSEIILAMANTEVGKKIGK IPQDSIQTSIQSDFYSILKELKLEKYQTLTATELRLDLRENIRVKSEKLAFLDLERSYFF HRLRVLKISFVNFLDKVQDNKTWAEDWVLQWTPEAEIEIVEAILKGDTIEFATAFELNQR IENSSSISQIAEIVKDAFYCGLPKSLEQAFQALQSCMADDIPINEIAKTSTTLSIMLRYG DIRKLDRDVLIPILEQLFLRACLILPTEAFCDANAAIELAEAIIALHNVVENHDFLDRER WYALLTEVAKRDNLNTKISGLAMAILLETGKISNDELGLEVERRLSKAIPADLGASWFEG LSMKNHYTLIARLGLWEKLQDYISALDEDEFKRALVFLRRAFADFSSNEKHDIAENMAEI WGLNKIAVSEAMNKDLKEEEAEIISSLDDFDFDDI >gi|224461338|gb|ACDC01000064.1| GENE 28 30315 - 
31415 1654 366 aa, chain - ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 3 356 26 378 384 235 38.0 9e-62 MSKKEEVQRLTAEQLFQEEIDALIKAEKNPIPTGWKMSPKSVLTYICGGKVGKKTIVPKY IGNKRLVEIAISTLVTDRALLLIGEPGTAKSWLSEHLTAAINGNSTRVIQGTAGTTEEQI RYSWNYAMLIAEGPTKEALIPSPIYRAMEDGAIARVEEISRCASEVQDALISLLSEKRLS VPELNLEIPAKKGFSIIATANTRDKGVNEMSAALKRRFNIVVLPSPNSLEAEIDIVRTRV EQLASNLDLNAKLPEDEVIEKVCTVFRELRQGLTLDGKQKIKTTTNVLSTAEAISLLANS MALAGSFGDGEISDYDLAAGLQGAIVKEDSKDGQIWTEYLENIMKKRGSEWLNLYKECKE LNKTSK >gi|224461338|gb|ACDC01000064.1| GENE 29 31568 - 32227 591 219 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1369 NR:ns ## KEGG: Lebu_1369 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 189 1 193 638 181 59.0 2e-44 MNFEPLYELKNRLENVAVVGINLAKDDFRLKRAVEHLKEYSTIAKVFKQIYDMGNELIST NDEDKCDLFLDLVALVDAVLCTQVTTYLGNEPQEIKTIAKSKDYYKELHYSELSPLVYAF TEGNLFIIQDAVNNNADIFDDFRLKSYMIKGLSNKYSKVINLATKKLKKQRKEIVPLQKD EFSPGIEKKSLLDWILFPVLLKKLKIIFTNIALKIGLKK >gi|224461338|gb|ACDC01000064.1| GENE 30 32249 - 34159 2321 636 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1369 NR:ns ## KEGG: Lebu_1369 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 636 1 638 638 690 60.0 0 MNFEPLYELKNRLENVAVVGINLAKDDFRLKRAVEQLKEYSTAAKVFKQIYDMGNELIST DDEDKCDLFLDLLALLDAVLCTQATTYSGDEPQEINTITKNKDFYKELHYSELSPLVSAF TREGGGRLNIIMDTIESNPEIMNDFRVKACMIHGLSDKYSEIADRMVDELKKQGKDIVPI LKDGFDPQGKRNMVSRLEIIASICKEEENDFYKYCIENGSKEVKEDAIGYLSFDQSNIDY LLDLTKTEKGKLKNKAFEALSYMSDNRAAEEWGKFLKKKTLDNLEYLRGTDQQWAMDYIN DFIENYVTETKNKTLKTAEEKRTVEYDILKVSPFVLKNRNEKSLLFCKELYPFNKYEIKR ILNFYIAKDLDKDVIDVVKELAKEYEGEFLQQEFLISLIKDKAETVYKNFSKYAGAGKER EEIRALFNTFIRGNYSKKKEERKVQEDFRDIFQVILRMHYNEENKEYILEWPDTIVGYPI QIKLDGFDKKWYDIILGTSTEITGNWDYYASSHGDFRYLYNPNIKGLKEKFGEFYYNITL LRTPYLSDIEFLNKLDWTNYKDFLVGKMDVGKNIYLISYRLNYISDFINKIPISEEDLKA 
QIEELLEKYKTIQKSTRDLCQTWLDKLNSGVKVKEL >gi|224461338|gb|ACDC01000064.1| GENE 31 34173 - 35621 1818 482 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1370 NR:ns ## KEGG: Lebu_1370 # Name: not_defined # Def: zinc finger SWIM domain protein # Organism: L.buccalis # Pathway: not_defined # 1 482 1 479 479 645 72.0 0 MKKLDKEKILALAPNSSAVANAKKICSSGSFVKLAHSSDDTFYMGECKGSGKSNYIVSAD FVDEENPVMRCTCPSRQFPCKHGLALLFEIADGKTFEECEIPEDILAKREKKEKTKAKKE KESAEGTVKEKKAPSKVSKAARTKKINKQIEGLDLIKNISTQLLKLGLSTIGTVSLKEYK DVVKQLGDYYLPGPQILFQKLILEIQEYKEDQDTVHYQQALECLKRLRAIEKKGREYLKE ELEKENLEMSDNTLYEDLGGVWKLEQLNDLGLKKENAKLIQLAFEVTYDEASKIYTDYGY WIDIESGEISYTANYRPLSALKYIKQDDSNFSLVTVPTLTYYPGGLNKRIRWASANFEEK DKTSFKKIKTYAKNIEEATKIAKNELKNILTDNHVALLLEFEKIMFVEEEGSKKYILVDK NQKMIELRNNGSKELTKVFYELLPNECLENQVMFVKLFQKDRTIYAEAHSIITDNKIVRL GF >gi|224461338|gb|ACDC01000064.1| GENE 32 35725 - 37359 2060 544 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 544 3 545 545 572 57.0 1e-161 MKRLAIGLSDFKHLIEEDFYYFDKTKFIEEVIKDGSQVKLFARPRRFGKTLNMSMLKYFF DIENKEENKEIFKDLYIEKTEAFKEQGQYPVIFLSLKDLKALTWEQMEKAIKSAISRLFS EYKYLLNDLDKFDTLAFENILLKNTELEDLKEALKFLTRILYEKYNKKVVVLIDEYDSPL VSAYINGYYEKAKDFFKTFYSTVLKDNSYLQMGVLTGIIRVIKAGIFSDLNNLSTYTILS DVYTDSYGLTEEEVEKSLKYYGIEQEISNVKDWYDGYKFGDSEVYNPWSILNFLQYKELR AYWVDTSGNDLINDVLKKITKNTIEALERLFYGEGLKQNISGTSDLSKLLSEEELWELML FSGYLTVEEKIDQDNYVLRLPNKEVRTLYRKTFFERYFGRGSKLLYLMEDLTENRIYEYE ERLQEILLTSVSYNDTKKGNEAFYHGLIMGMGLYLEGEYITKSNIESGLGRYDFVIEPKN KTKRAFIMEFKATDSIENLEEISKEALRQIEDKKYDVSLKQNGVKDITYMGIAFCGKQIK ISYK >gi|224461338|gb|ACDC01000064.1| GENE 33 37407 - 38150 940 247 aa, chain - ## HITS:1 COG:TVN0442 KEGG:ns NR:ns ## COG: TVN0442 COG0863 # Protein_GI_number: 13541273 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Thermoplasma volcanium # 5 247 96 346 347 134 31.0 2e-31 
MKKKIKKWEPDNFELELNSVWSFKERGDWATHDAKWRGNWSPYIPRNLILRYTQEKDLIL DQFAGGGTTLVEAKLLNRNIIGIDVNDIAIERCREKIDFEFENSGKVYIHKGDARNLDFI KNETIDFICTHPPYANIIEYSEDIEEDLSHLKIPEFLKEIEKVASESYRVLKKDKFCAIL MGDTRIKGHIQPLGFEVMKVFEKVGFKLKEIIIKEQHNCKATGYWKTNSIKYNFFLIAHE YLFIFKK >gi|224461338|gb|ACDC01000064.1| GENE 34 38169 - 39026 1153 285 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01767 NR:ns ## KEGG: EUBELI_01767 # Name: not_defined # Def: type II restriction enzyme # Organism: E.eligens # Pathway: not_defined # 2 284 4 286 287 401 71.0 1e-110 MRNFETWLKNFKNSIATYDYYIDFEKVIKNINNIKIELNILNSLIGSKNIEEDFVNIIKK YPETLKCIPILLAVRDTEIYAQDEEGSFLYNFKTQNYSVEQYKIFMRKTGLFNLIQNHII NNLVDYVLGVETGLDSNGRKNRGGHLMEDLVEKYIVKAGFIKGVNYFKEMKISEIEEKFK IDLSHISNKGKTVKRFDFVVKTENMIYAIETNFYASSGSKLNETARSYKQIAQEAKNING FTFVWFTDGKGWIDARNNLEETFEILETIYNIHDMENDIMRTLFR >gi|224461338|gb|ACDC01000064.1| GENE 35 39039 - 39965 965 308 aa, chain - ## HITS:1 COG:slr1803 KEGG:ns NR:ns ## COG: slr1803 COG0338 # Protein_GI_number: 16330320 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Synechocystis # 16 307 9 307 309 286 50.0 4e-77 MENFSLFEKEEKIECKPFIKWVGGKGQLLPEISKLYPVELGKTINKYAEIFIGGGAVLFD ILSKYRLDDVYISDKNLELINTYKTIRDNVDILIKSLKEMENQYIPMNNEDRKVYYYEKR SEYNNLKINIEENNIRKAALFIFLNKTCFNGLYRVNKKGEYNVPMGAYKNPKICDEENLK NVSLALKNVKITYADYRESKDFIDEKTFIYIDPPYRPLNITSSFTSYTENDFCDKEQIEL ANYIDSLNEKGAKVVISNSDPKNTNKNDNFFDDLYKNYNINRVKANRMLNSKANSRGEIN ELLITNYK >gi|224461338|gb|ACDC01000064.1| GENE 36 40148 - 41356 1223 402 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 401 1 401 402 659 92.0 0 MTKRELYIEKIKPFIDKDIIKVLTGIRRSGKSVMLKLIMEELKQNKIDEKQFININFENL INIELTTADKLHEYILKKASEIKKKCYIFLDEIQEVKDWEKCINSLRVNEEYDFDIYITG SNAKLLSGELSTYLAGRYVEFVIYPFSFKEFLETLKSIQQDVSTREAFQKYVKFGGMPFL 
YNLAFEEEASLQYLKDIYSSIILKDITQRNKIRDTDLLERVISYLTMNVGNNFSATSISK FFKSENRKVSVETILNYIKAAEESFLIYKVSRDDLIGKKVLNVNEKYYIADHGMREAILG SNQRDINQIFENIIYLELLRKGYNVRVGKVDNLEVDFVCTKGNKKIYIQVAYLLASSETI EREFTSLEKIDDNYPKYVISMDEFDMSRNGIIHINIIDFLMK >gi|224461338|gb|ACDC01000064.1| GENE 37 41428 - 41883 515 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739344|ref|ZP_04569825.1| ## NR: gi|237739344|ref|ZP_04569825.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 151 1 151 151 291 100.0 1e-77 MKIIKKIFLAVVMLFAFASCMNGPIKLTGSAAPINPSQKKVLVAYFPEYSAKWRDDLELS FESRKWKVNEIDFWEVEKANLRKRNETFLIVVDKMIKEDYKSFLGGTFFSGNISVYDLRT GKKIINYNVHTEESFDVTTRLAKALGELVRK >gi|224461338|gb|ACDC01000064.1| GENE 38 41899 - 42351 647 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739345|ref|ZP_04569826.1| ## NR: gi|237739345|ref|ZP_04569826.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 150 1 150 150 288 100.0 8e-77 MKIVKKVLLGIFMLLAFTSCSLLFPNSGPEITTISTPASFTRAQRSAYVEGATVGVEKAI RSKLLQRNWKVSSRATGNETFAIVFDQLNIDKYSDGGFISSTYYEYTGYVSIFDVRNNER LCVYNFTKESLGDLLEGIEKAVIEVEKSMR >gi|224461338|gb|ACDC01000064.1| GENE 39 42392 - 42541 108 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no APTSTCLTANTLTSARATSVLTHLIKEMVKRDLEIERRRAVADSRNNPY Prediction of potential genes in microbial genomes Time: Thu May 19 23:16:36 2011 Seq name: gi|224461337|gb|ACDC01000065.1| Fusobacterium sp. 2_1_31 cont1.65, whole genome shotgun sequence Length of sequence - 2103 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 30 - 500 605 ## Lebu_0879 hypothetical protein - Prom 538 - 597 6.8 - Term 517 - 567 6.2 2 2 Tu 1 . - CDS 674 - 1063 391 ## COG2832 Uncharacterized protein conserved in bacteria - Prom 1083 - 1142 14.8 + Prom 1026 - 1085 8.1 3 3 Tu 1 . 
+ CDS 1218 - 1991 1145 ## COG0489 ATPases involved in chromosome partitioning + Term 1997 - 2043 9.1 Predicted protein(s) >gi|224461337|gb|ACDC01000065.1| GENE 1 30 - 500 605 156 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0879 NR:ns ## KEGG: Lebu_0879 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 155 1 156 159 76 34.0 3e-13 MKKILLFLFLILGAYSFSAPSYVDLNKIQRDGYQIDVNDVDTLAFSQEESDMNLVVTMYF TNDGNPQTLRIAFKTMFAPAFGLEYTDEIQTNRAYIQKSFGKNRNGIVYGYNIVPKSQKR KGCFLNVFLITEQELPDELLKDIANTALNEIESYIK >gi|224461337|gb|ACDC01000065.1| GENE 2 674 - 1063 391 129 aa, chain - ## HITS:1 COG:FN2099 KEGG:ns NR:ns ## COG: FN2099 COG2832 # Protein_GI_number: 19705389 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 124 1 124 125 177 72.0 5e-45 MRNLKKKLYITFGFLAVALAIVGVFIPGLPTVPFLLVALFCFERSSKKYHDMIMNNKYSG SVLQDYYSGKGLTSSVKIKAILFLSCGMAFSIYKIQNLHARIALAIVWLGVAIHIILLKT KKTKNKSNK >gi|224461337|gb|ACDC01000065.1| GENE 3 1218 - 1991 1145 257 aa, chain + ## HITS:1 COG:FN2098 KEGG:ns NR:ns ## COG: FN2098 COG0489 # Protein_GI_number: 19705388 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Fusobacterium nucleatum # 1 257 1 257 257 455 94.0 1e-128 MIQKEAPKVKDDKNIKNVIAVMSGKGGVGKSTVTTLLAKELRKKGYSVGVMDADITGPSI PRLMNVSEQKMATDGKNMYPVVTEDGIEIVSINLMIDENEPVVWRGPVIAGAVMQFWNEV VWSDLDYLLIDMPPGTGDVPLTVMKSFNIKGLIMVSIPQDMVSMIVTKAIKMARKMNVNV IGLIENMSYITCDCCDNKIYLTDENDIQTFLKENDVELLGELPMTKQIARMTKGESAYPE ETFSKIADRVIEKVKEL Prediction of potential genes in microbial genomes Time: Thu May 19 23:17:45 2011 Seq name: gi|224461336|gb|ACDC01000066.1| Fusobacterium sp. 
2_1_31 cont1.66, whole genome shotgun sequence Length of sequence - 40741 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 8, operones - 4 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 82 - 255 258 ## gi|294783658|ref|ZP_06748982.1| conserved hypothetical protein 2 1 Op 2 . + CDS 255 - 851 763 ## COG4185 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 867 - 1538 968 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain + Prom 1642 - 1701 12.8 4 2 Op 1 28/0.000 + CDS 1779 - 3608 2332 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 5 2 Op 2 28/0.000 + CDS 3619 - 4704 1209 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 6 2 Op 3 4/0.000 + CDS 4704 - 6002 1842 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 7 2 Op 4 26/0.000 + CDS 6012 - 7076 1414 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 8 2 Op 5 11/0.000 + CDS 7081 - 8472 2142 ## COG0773 UDP-N-acetylmuramate-alanine ligase 9 2 Op 6 6/0.000 + CDS 8459 - 9304 1363 ## COG0812 UDP-N-acetylmuramate dehydrogenase 10 2 Op 7 . + CDS 9320 - 10183 1335 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 11 2 Op 8 . + CDS 10197 - 10904 663 ## FN1453 hypothetical protein 12 2 Op 9 35/0.000 + CDS 10901 - 12199 1749 ## COG0849 Actin-like ATPase involved in cell division 13 2 Op 10 . + CDS 12221 - 13306 1693 ## COG0206 Cell division GTPase + Term 13307 - 13356 7.8 + Prom 13322 - 13381 9.4 14 3 Tu 1 . + CDS 13463 - 25138 17437 ## FN1449 hypothetical protein + Term 25154 - 25206 10.2 + Prom 25205 - 25264 16.8 15 4 Tu 1 . + CDS 25411 - 26829 1993 ## COG2985 Predicted permease + Term 26847 - 26888 3.2 - Term 26835 - 26876 -0.6 16 5 Tu 1 . - CDS 26894 - 27346 530 ## FN1647 hypothetical protein - Prom 27387 - 27446 18.8 + Prom 27324 - 27383 11.1 17 6 Op 1 . 
+ CDS 27566 - 27877 508 ## PROTEIN SUPPORTED gi|237739365|ref|ZP_04569846.1| SSU ribosomal protein S10P 18 6 Op 2 . + CDS 27861 - 27914 74 ## PROTEIN SUPPORTED gi|34764924|ref|ZP_00145269.1| SSU ribosomal protein S10P + Term 27948 - 27997 1.0 + Prom 27916 - 27975 2.2 19 7 Op 1 58/0.000 + CDS 28023 - 28658 1077 ## PROTEIN SUPPORTED gi|237739366|ref|ZP_04569847.1| LSU ribosomal protein L3P 20 7 Op 2 61/0.000 + CDS 28678 - 29307 1044 ## PROTEIN SUPPORTED gi|237739367|ref|ZP_04569848.1| LSU ribosomal protein L1E 21 7 Op 3 61/0.000 + CDS 29307 - 29594 480 ## PROTEIN SUPPORTED gi|34764036|ref|ZP_00144922.1| LSU ribosomal protein L23P 22 7 Op 4 60/0.000 + CDS 29637 - 30467 1447 ## PROTEIN SUPPORTED gi|237739369|ref|ZP_04569850.1| LSU ribosomal protein L2P 23 7 Op 5 59/0.000 + CDS 30492 - 30767 492 ## PROTEIN SUPPORTED gi|237739370|ref|ZP_04569851.1| SSU ribosomal protein S19P 24 7 Op 6 61/0.000 + CDS 30796 - 31128 524 ## PROTEIN SUPPORTED gi|237739371|ref|ZP_04569852.1| LSU ribosomal protein L22P 25 7 Op 7 50/0.000 + CDS 31147 - 31806 1111 ## PROTEIN SUPPORTED gi|19704960|ref|NP_602455.1| SSU ribosomal protein S3P 26 7 Op 8 50/0.000 + CDS 31809 - 32240 748 ## PROTEIN SUPPORTED gi|237739373|ref|ZP_04569854.1| LSU ribosomal protein L16P 27 7 Op 9 50/0.000 + CDS 32240 - 32422 291 ## PROTEIN SUPPORTED gi|34764030|ref|ZP_00144916.1| LSU ribosomal protein L29P 28 7 Op 10 50/0.000 + CDS 32456 - 32707 411 ## PROTEIN SUPPORTED gi|237739375|ref|ZP_04569856.1| SSU ribosomal protein S17P 29 7 Op 11 57/0.000 + CDS 32736 - 33104 600 ## PROTEIN SUPPORTED gi|197736521|ref|YP_002165299.1| ribosomal protein L14 30 7 Op 12 48/0.000 + CDS 33129 - 33470 567 ## PROTEIN SUPPORTED gi|237739377|ref|ZP_04569858.1| LSU ribosomal protein L24P 31 7 Op 13 50/0.000 + CDS 33489 - 34040 923 ## PROTEIN SUPPORTED gi|237739378|ref|ZP_04569859.1| LSU ribosomal protein L5P 32 7 Op 14 50/0.000 + CDS 34061 - 34348 478 ## PROTEIN SUPPORTED gi|237739379|ref|ZP_04569860.1| SSU ribosomal protein S14P 33 7 Op 15 
55/0.000 + CDS 34377 - 34775 658 ## PROTEIN SUPPORTED gi|237739380|ref|ZP_04569861.1| SSU ribosomal protein S8P 34 7 Op 16 46/0.000 + CDS 34800 - 35333 910 ## PROTEIN SUPPORTED gi|237739381|ref|ZP_04569862.1| LSU ribosomal protein L6P 35 7 Op 17 56/0.000 + CDS 35360 - 35728 590 ## PROTEIN SUPPORTED gi|237739382|ref|ZP_04569863.1| LSU ribosomal protein L18P 36 7 Op 18 50/0.000 + CDS 35753 - 36247 808 ## PROTEIN SUPPORTED gi|237739383|ref|ZP_04569864.1| SSU ribosomal protein S5P 37 7 Op 19 48/0.000 + CDS 36260 - 36445 300 ## PROTEIN SUPPORTED gi|237739384|ref|ZP_04569865.1| LSU ribosomal protein L30P 38 7 Op 20 53/0.000 + CDS 36445 - 36924 807 ## PROTEIN SUPPORTED gi|237739385|ref|ZP_04569866.1| LSU ribosomal protein L15P 39 7 Op 21 . + CDS 36949 - 38229 866 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 + Term 38247 - 38292 5.0 + Prom 38266 - 38325 13.9 40 8 Tu 1 . + CDS 38367 - 40718 3082 ## COG1982 Arginine/lysine/ornithine decarboxylases Predicted protein(s) >gi|224461336|gb|ACDC01000066.1| GENE 1 82 - 255 258 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783658|ref|ZP_06748982.1| ## NR: gi|294783658|ref|ZP_06748982.1| conserved hypothetical protein [Fusobacterium sp. 
1_1_41FAA] # 1 55 1 55 77 68 98.0 9e-11 MEVKNENLKEMILKLTQKDIDELMEKTEKEEDKIFYNKLFNLILETKQEELIKKGVY >gi|224461336|gb|ACDC01000066.1| GENE 2 255 - 851 763 198 aa, chain + ## HITS:1 COG:CAC1491 KEGG:ns NR:ns ## COG: CAC1491 COG4185 # Protein_GI_number: 15894770 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 5 189 4 187 187 137 41.0 1e-32 MKKVFYLFAGVNGAGKSTLYNSESLNNDIKNTIRINTDEIVREIGDWKNNSDQLKAAKIA INLRNECFLYGKSFNEETTLTGKTILKTIERAKELGYELQLFYVGVNSTEIAKERIKSRV EKGGHHIENDIVEKRYYESLKNLKEILLKFDKVYLYDNSKKYKNIFSFSNNKILFKDNKS ISWAKEAIEIIENNIKNK >gi|224461336|gb|ACDC01000066.1| GENE 3 867 - 1538 968 223 aa, chain + ## HITS:1 COG:FN1305 KEGG:ns NR:ns ## COG: FN1305 COG1917 # Protein_GI_number: 19704640 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Fusobacterium nucleatum # 112 223 1 111 111 191 89.0 8e-49 MVKIEVAKAINFNELINSKEAEVVSMRILNETNSYISLFSLAKNEEITAEAMLGNRYYYC FNGNGEISVENNKKSIKTGDFLEVLANNNYSVKSLDTLKLIEIGEKIGDETMENQTLKML ESASAFSLADCVDYKEGQIVSKNLVAKANLVITVMSFWKGESLDPHKAPGDALVTVLDGE GKYIVDGKAFVVKKGESAVLPANIPHAVEAETQNFKMMLTLVK >gi|224461336|gb|ACDC01000066.1| GENE 4 1779 - 3608 2332 609 aa, chain + ## HITS:1 COG:FN1461_2 KEGG:ns NR:ns ## COG: FN1461_2 COG0770 # Protein_GI_number: 19704793 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Fusobacterium nucleatum # 194 606 1 413 416 677 90.0 0 MNKAIFLDRDGTINVEKDYIYKCEDLVFEEGSVEALKTFKNLGYILIVVSNQSGIARGYF TEEDLKAFNNNMNEKLKEEAVEITEFYCCPHHPDGLAEYKKVCDCRKPNNKMLEDAIEKY NIDREKSYMIGDKASDIGAGLKSKLKTVLVKTGYGLKDMEKIDKNETLVCENLKDFSEVL KREKLNELIFEEFSKKVQIKNVVMDSRKVTEGSLFFAINNGNSYVKDVLDKGASLVIADN TDIADERIVKVADTIATMQDLATKYRNKLDIQVIGITGSNGKTSTKDIVYSLLSKKAKTL KTEGNYNNHIGLPYTLLNVTDEEKFVVLEMGMSSLGEIRRLGEISNPDYAIITNIGDSHI EFLKTRDNVFKAKTELLEFVNKENTFVCGDDVYLAKLDVNKIGFNEDNNFIIESYEFSDK 
GSKFTLDGKEYEMSLLGKHNISNTAIAIELAKKIGLSEEEIKEGLKDIKISSMRFQEIRV GEDIYINDAYNASPTSMKAAIDTLNEIYNDKYKIAILGDMLELGEDEVKYHVEVLNYLLD KKIKLIYLYGERMKKAYDIFMKNKSEEYRFWYYPTKEGIVESLKNIRMEKVILLKASRGT ALEDIIVKE >gi|224461336|gb|ACDC01000066.1| GENE 5 3619 - 4704 1209 361 aa, chain + ## HITS:1 COG:FN1459 KEGG:ns NR:ns ## COG: FN1459 COG0472 # Protein_GI_number: 19704791 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Fusobacterium nucleatum # 1 361 1 361 361 551 89.0 1e-157 MLYFLAEYFAELEFLKSIYLRTFLAFVISFCIVLFAGKPFIKYLKVKKFGEEIRDDGPSS HFSKKGTPTMGGVLIIASVLLTSLLINDLANKLILLVLISMLMFAAIGFIDDYRKFTVSK KGLAGKKKLLFQGTIGLMVWAYLYYIGFTGRPMIDFSLINPISAHPYYIGAIGMFVLIQL ILMGTSNAVNITDGLDGLAIMPMIICSTILGVVAYFTGHIELSSHLHLFYTVGSGELSVF LAAVTGAGLGFLWYNCYPAQIFMGDTGSLTLGGILGVIGIILKQELLLPILGFIFVLEAL SVILQVGSFKLRGKRIFKMAPIHHHFELMNIPESKVTLRFWIATLIFGIIALGTIKMRGI L >gi|224461336|gb|ACDC01000066.1| GENE 6 4704 - 6002 1842 432 aa, chain + ## HITS:1 COG:FN1458 KEGG:ns NR:ns ## COG: FN1458 COG0771 # Protein_GI_number: 19704790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Fusobacterium nucleatum # 1 432 23 454 454 681 87.0 0 MKKVMIYGMGISGTGAKALLETEGYEVILVDDKKAMTSEEAMQHLDNIEFFIKSPGIPYN DFVKEVQKRGIKVLDEIEVAYNYMVEKNLKTKIIAITGTNGKSTTTAKISDLLNHAGYKA CYAGNIGRSLSEALLHEKDLDFVSLELSSFQLENVENFRPYISMIINMGPDHIERYKSFD EYYDTKFNIAKNQNENQYFIENIDDVEIEKRAKQIKAKRISVSKSKEANVYVEDNKIYVG KDFIIEADKLSLKGIHNLENTLFMVATAEILNIDREKLKEFLMVATPLEHRTELFFNYGK VKFINDSKATNVDSTKFAIQANKDSILICGGYDKGVDLAPLAEMIKENIKEVYLIGVIAD KIETELKKVGYEASKIHKLETIENSLLDMKKRFTKDSDEVILLSPATSSYDQFNSFEHRG KVFKELVLKIFG >gi|224461336|gb|ACDC01000066.1| GENE 7 6012 - 7076 1414 354 aa, chain + ## HITS:1 COG:FN1457 KEGG:ns NR:ns ## COG: FN1457 COG0707 # Protein_GI_number: 19704789 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS 
N-acetylglucosamine transferase # Organism: Fusobacterium nucleatum # 1 354 4 357 357 590 82.0 1e-168 MRKVILTTGGTGGHIYPALAVADKLKLKGVETIFVGSTERMEHEIVPESGHRFIGLDISV PKGFKNIRKYLKAIRAAYKIIKEEKPEAIIGFGNYISVPTIIAAILLRKKIYLQEQNVNI GSANRLFYKMAKLTFLAFDKTYDDIPIKSQDRFKVTGNPLRIGIEDLRYATERQKLGVEP NEKVLLITGGSLGAQDINNTIMKYWEKICTEKNLRVYWATGNNFTELKKVLKTKKENDRI EPYFNDMLNIMAAADLVVCRAGALTISELIELEKPSIIIPYGSIKVGQYENAKVLKDYNA AYVYTKDELDEAIKKALEVIRNDEKLKKMRIRLKPLRKPNAAEEIIAYLDIWRN >gi|224461336|gb|ACDC01000066.1| GENE 8 7081 - 8472 2142 463 aa, chain + ## HITS:1 COG:FN1456 KEGG:ns NR:ns ## COG: FN1456 COG0773 # Protein_GI_number: 19704788 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Fusobacterium nucleatum # 1 460 9 468 468 803 88.0 0 MEKIYFIGINGIGMSGLAKIMKCKGYDVKGADICSNYVTEELLSMGITVYNEHDEENVKG ADYVIASTAIKETNPEYAYAKENGIKILKRGELLAKLLNRETGIAIAGTHGKTTTSSMLS AVMLKKDPTIVVGGILPEIKSNARPGKSEYFIAEADESDNSFLFMNPEYSVITNIDADHL DVHGNLDNIKKSFIEFILHTQKESIICMDSKNLMDAILKLPEGKSVTTYSIKDENADIYA KNIRIVDRKTIFEVYVNKELKGEFSLNIPGEHNIQNSLPVIYLALKFGLNKDEIQEALNQ FKGSKRRYDVLYDQELENGYGSKTKRVRIVDDYAHHPTEIKATLKAIKSVDSSRLVAIFQ PHRYSRVHFLLDEFKDAFADVDKVILLPIYAAGEKNEFNVSSETLKEHINHSNVELMNEW KDVKRYVTRVKKDSTYIFMGAGDISTLAHEIAEELEGMSDENF >gi|224461336|gb|ACDC01000066.1| GENE 9 8459 - 9304 1363 281 aa, chain + ## HITS:1 COG:FN1455 KEGG:ns NR:ns ## COG: FN1455 COG0812 # Protein_GI_number: 19704787 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Fusobacterium nucleatum # 1 281 1 281 281 478 90.0 1e-135 MKIFDNQEMKNYSNMRVGGKAKRLIILESKEEIIDVYKNEENTNIFILGNGTNVLFTDNF MDKTFVCTKKLNKIEDLGSNLVRVETGANLKDLTDFIRDKNYSGIESLFGIPGSIGGLVY MNGGAFGTEIFDKIVSIEVFDENHQIREIKKEDLKVAYRKTEIQDKNWLVLSATFKFDDG FDDARVKEIKELRECKHPLDKPSLGSTFKNPEGDFAARLISECGLKGTIIGNAQIAEKHP NFVLNLGGATFEDITNILTLVKKSVFEKFGVKLEEEIIIVK >gi|224461336|gb|ACDC01000066.1| GENE 10 9320 - 10183 1335 287 aa, chain + ## HITS:1 COG:FN1454 KEGG:ns NR:ns ## COG: 
FN1454 COG1181 # Protein_GI_number: 19704786 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Fusobacterium nucleatum # 1 287 1 287 287 491 88.0 1e-139 MKIAVFMGGTSSEKEISLRSGEAVLESLQRQGYDAYGVVLDENNQVTAFLENDYDLAYLV LHGGNGENGKIQAVLDILGKKYTGSGVLASALTMDKNKTKQIAESIGIRVPKSYRDLDSI ERFPVIIKPVDEGSSKGLFLCNNKEEAGEALKKLRKPIIEDYIVGEELTVGVLNGKALGV LKIIPQADVLYDYDSKYAKGGSIHEFPAKIEDKSYKEAMKIAERIHKEFKMKGISRSDFI LSEGKLYFLEVNSSPGMTKTSLIPDLATLQGYTFDDVVRLTVETFLK >gi|224461336|gb|ACDC01000066.1| GENE 11 10197 - 10904 663 235 aa, chain + ## HITS:1 COG:no KEGG:FN1453 NR:ns ## KEGG: FN1453 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 46 235 1 191 191 240 80.0 4e-62 MKVRLLILNIIMYLVYMLPQNFFRLDYFNINKVNIQESAKMLQPELTKLSQKLYNKNIIY IDSNEIKEFLEKDVRVEDVTITKKSLGEISIDVKEKDLSYYAVIGKNIYLVDKAGEIFAY LNEKDVEEVPFIVANSEDEIKEITEFLNELSDLAIFKNISQIYKINDKEFVIILTDGVKI KTNRIEEKDEVNKEKQNKRYLIAQQLYFNMSKERKIDYIDLRFNDYIIKYLGDNK >gi|224461336|gb|ACDC01000066.1| GENE 12 10901 - 12199 1749 432 aa, chain + ## HITS:1 COG:FN1452 KEGG:ns NR:ns ## COG: FN1452 COG0849 # Protein_GI_number: 19704784 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Fusobacterium nucleatum # 1 432 1 447 447 532 64.0 1e-151 MRDDVIRKVALDIGNNSIKLLIGEMSSDFTKIAITDYVKVKHGGLRKSEIYDINALSEAL RTAISKVESVESPITRLSLALGGPGVGSTTVNVKISFPEKEIEESDMDKLLRKAKRQIFG ENEDKFKILYKEVYNKKVDGPNIIKQPIGMVGKELQADVHFVYVEEAYVKKFREVLYGLG VDIDKIYLDSYASAKGTLDEETRKMGVAHVDIGYGSTSVIIMKNGKVLYAKTKSLGELHY VSDLSIILKVPREGAEEILLKLKNKEFESDETIKYGAKRIPLKSIKDIIAARTDDIIEFI NTTIDESGFNGLLARGIVLTGGAVDIDGVAEQVSNKSGYLVRKMLPIPLKGIKNSFYSDA TVVGIFLEDMEREYKRSTERSKEENIQVTRRDAVRNNSVREEVDVFLESIEEERVKEKKE KFDFFRWLKELF >gi|224461336|gb|ACDC01000066.1| GENE 13 12221 - 13306 1693 361 aa, chain + ## HITS:1 COG:FN1451 KEGG:ns NR:ns ## COG: FN1451 COG0206 # Protein_GI_number: 19704783 # 
Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Fusobacterium nucleatum # 1 361 1 360 360 498 84.0 1e-140 MSEGIKDLVKIKVIGVGGGGGNAINDMLYSGVTGVEYIAANTDKQDLEKSLADRKLQIGE KLTKGQGAGAEPEIGRLAAEEDIEKIQELLKGTDMLFITAGMGGGTGTGAAPVIAKVAKE LDVLTVAVVTRPFNFEGEKRRRNSESGIELLRQNVDSLVIIPNDKLFDLPDKNITMLNAF KEANNILRIGIKAVVDLVLGQGFINLDFADIKSVLKNSGIAVLGYGEGEGENRAIKAAEK ALESPLLEKSIQGADKILINLRTSEDVGLNESQTVTEVIRQATGKKVEDVLFGITIVPEF SDKIEITIMANNFKDEMETNNETFIKVEPVKTTEPIRESERKKEVPDDIIDIPPWMRTNR R >gi|224461336|gb|ACDC01000066.1| GENE 14 13463 - 25138 17437 3891 aa, chain + ## HITS:1 COG:no KEGG:FN1449 NR:ns ## KEGG: FN1449 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 639 3891 36 3165 3165 2585 56.0 0 MENNLHRIEKDLRSIAKRYKSVKYSIGLAILFLMLGVSAFSEEVNTKTQVAQIATREELK TSVGDVQTKLNVLRDNNKKEIKNLNLELVQLMEQGSQVVKSPWASWQFGMNFFYENWGGT YKGSGDKPQKYIYNSVHTRGDWKVKNAMDALDARRVSSTPLTPGNDSLASWIDILNTLSG AGKVEKDSALYSSTNGRRTWGLVDLLNIKEPTNEVEILARISPKEVNKQAIPLTIDEPVV EGIEAPKVEPNVNTPLKAPEITLPKISEVVISSLEINPPDAPNAPNAPSIDISIDAPKAP EAPTAPTVPNISVSPKEPGTIEAPTISITVTPPTIAALNITTPPTVSVTPPTVSTIEPVA FSVAPTIDSKKHKVQTTNQAGINSKFPEGKETTYTVNNTTKDNSNFFTFNAATGKAVLPK TGTVNVEVNDTRAMVIDEPKNGSEFVMDGTINLYGTKNMGIDLQGSASTQNIVAKIVNSG LIEGHFKDKTGTNTNKQQIAFGFSNVDASYNNTMSHIINKGTISLNAPESAAMQLKPEDP HNWTPDWTDLELVGGKYYVKIGANSIPSGTPNKGRVLMKADNQKDININSKGSFGIITVF NPGISTLDSVRMANLSAAQKTNANLRAQRNLAGAQILPGGEIGRSALADSTYTSGVYNTG TININGDESVGVGILHEIQEVKVGGTINIGTGTVNQVKNVSDTKSTATSDQTKVVNNAVG VFAGVPTLPVKAGENDTMGNKNTTGVTIGTETTEANGVINLGKNATGSIGLLVGDSAEDL NKGSLNGVANQVRQLLRSGSITYKASTDGKINVDGNENYGFVVNSESYSSEFKGALDDLT QSKNNKTTHGRGINEGKIEVVGTKSIGFALLKGGNSENTGSITVGGTAENSIGFYGEQDK FTNKGTIAVNTPSKTGNKAVLLKGNTNGINFDNTGTISVEGNSNIGVYAEGKYTFNHEKN AAGDNKISVGSNAIGIYAKNATGALNIKAPIDIAASGTKTTIGIYTDGDAKVKFGDNSKL TIGEGAVGLYSSDATKFNNTFEIETGKKLEVELGKKSTFALLNGNKTVTNSPLLSKYLNN GTTDKIEITQFGEGASLFYATLKAKAILDENYTVTNGDAASTSVLVANDGANVEIASGKT 
LTTNTNVGLIATRGEGASASTSVAENKGTLESTRTEKGIGIYTTFAIGKNSGTITMNNKE AVGMLATIGSTLINSSKIELKGISSAGMYGENSDLTNSGNASNITVNKEKSAGMYAKMSG ASSVSKTSKNEGKIEIKADGTGKSAAMYSLMESGTTKVMTTQNTKNIEVAQKASAGIYVK NESAQDKNNSLAENTGSIKMTGESSVGIIAEKSKATNSGTGANGIEISGNSSAGILATKE SEVTNSGRIEGNTGTKLVGISVDETSTVTNTASGTIIMNTAQNTGIASKGGKVTNDGTII LAKNNSTGISAENADVTNTSTIKAKDKESVGIYAKMSGNVDKKVTNTGTITLESPTGTTP NKSAAIYSLVDGGTGTLTTENTGTINVEQKESVGIFAQNNGTANTRSVVKNTKIINVSKE GSAGILGDKSTITNSGSGTDGIVLTANKTAGIIGNNNSVVSNTGRIETKTATPSGSSEGL VGISLNASTGTNSGSIILDTAHSTGMNGVAKSTLTNSGTIEGKKNNAVGMAVNDSTATNT TTGKINLSGLTSTGMFGAVGSTVTNAGTIETKTAIPSKIEEGLVGIALNASTGINVATGK IILGTKFSTGMFGEATSILTNKGIITGTKENSVGMAGKASTVTNENIITLEEKGSTGMFG EDNSTLLNDLPGVITVKKEKSVGIYSNSNTNLAVNKGNIIAEGEKSAGMLGLKGQIENTG TITTKNKESAVMYAENTDATNKKALIAEGEKSAGIYAKVNEAAGNINIVGANIANATITI TGTQSAGMLGLMDKTTGANSSLKLENSGKIDIKNKESVGMMVTNKSGATKDEVVATNKTG GVINLESTTATDDKNIGILANEKATGINEGTIQVKTLESVGMLGQAASEVINRNTINLIA EKGIGMLAKDATSTALNDGDINVTGKKSSGMLAKEAGKAENKKNINVTAENGVGIFVSDT GTGTNLNGAKITLDSKNAVGIFAQNNGAGHTAQNTGKIILGKEDGTSANESLIGMFAQSE TGKTSSVKNTGTIDINTKKSVGMYAKNDAANVGDVDLQNIGTININNKSSAGIYAPNATI SKVGTINFKNTADSDGSSAVYISKGGKVSDTSTATIDLGKTNQNRVAYYVNGAGSALAGA NIGKISGYGVGVYLQGDATVGTAKIDKNTPTLDFTTGNDGGNGIIGLYLNGNTEIKDYTK GITVGDSVGTTYAIGVYAKAQGTVGSYNITTPIKVGADGVGIYADKNSDITYTGNMEVGD GTKSGTGIYITKKDGTGTATGTVTLGSNTIKLKGTGGVAVIASEGTKFDAKSATIELIGK DVTGVGVYGLKGSDVKTGGWTFKNNGNQAEEVRLEEGKVHITSSKSLNPRMVLTHVINGE TSVGTGATVTSVSDGSYNAKENIGLMAEGVKNPAPQPPFTWNEADFEAVNKGTIDFSAAE KSTAIYVNSARAKNDGTIKVGKNSIAIYGFYDKNTRKYDGATGNPNKLEIETTGNSKISL GDQSTGMYLINAGKINNTGGEITSTTGAKRNVGIYAVNGQDADNDNNKVLTMTTATNMTL GDGAVGIYSKGQSNTVRNTVTNTGDITVGDKIVASKTENYPAVAVYAENTNLTNTSAVRV GNDGIAFYGKNSNITADGTVNFSNKGVLAYLENSIFISKLGNLGATQNTMIYLKDSTAQL DGAGTKVDIDVADGYTGAYISGNSHLTGVRTIKLGQGSTGLYLENTSPNFVSTAESISGT KDNARGIVAKNANFTNNSKISLSGKESVGIYSNADATKNVVNNGELTLSGKKTLGVFLRG GQTFENKANINIADSANSLEPTIGIYTAEGTSNIKHTSGTIEVGEKSIGIYSKTPSSVEM NGGKIHVKDQGIGIYKEDGTVVVKGELDIDKHVATAKDTEPTGVYAVNGATVVDQASKIT 
VGEKSYGFILNNTDPNKTNVYTNTNAGPVSLGNDSVFLYSKGKANITNNRNINSNNSDHL IGFYIKDGGDFVNNGIIDFSTGKGNIGVYAPNGKATNRGSIVVGPTDDIDPATGKVYSDV SKIVYGIGMAADNGGHIVNEGDIRISTNKSIGMYGAGIGTIVENTGRILLDGSQATATNK IQSLTGVYVDEGATFKNSGLITTTDSYAGRNGKINENVTGLTGVAVMNGSTLINEATGKI LIDADNSTGVVIRGKRDAAGNLIRPAVIKNYGEIRVRGKGGSAISWKDVSAADIAELERQ INSKITTDPSGNEITQASGTSKDYQGITITVKNGQATFLRNGVPVSDSEVEKINKLIGNE PNLAMSDIGFYIDTLGRTRPVTFDGAAPPVNSQLIIGTEFSEMTNKKEWIVSGDVIKPFL DQIQGRNFKITTMAGSLTWMATPILDNYGQIVGMAMSKLAYTSFVRPEDNAYNFADGLEQ RYDMNALDSAEKRLFNKLNGIGKNEDVLLTQAYDEMMGHQYANVQQRIQATGRILDKEFS YLRNSWSNPSKDSNKVKTFGMKGEYKTDTAGIIDYKNNAYGVAYVHEDETVKLGESTGWY TGIVHNTFKFEDIGKSKEEQLQAKVGLFRSVPFDDNNSLNWTISGDIFVGHNKLNRRFLV VDEIFQAKSKYYTYGIGVRNEVGKEFRLSEGLSVRPYGAVRVEYGRMSKIKEKSGEMKLE VKSNDYLSIRPEVGTELAYKLHLGNKTLRAALAVAYENELGRVANGKNKARVAGTSADWF NIRGEKEDRRGNVKTDLNVGVDNQRIGVTGNVGYDTKGRNVRGGLGLRVIF >gi|224461336|gb|ACDC01000066.1| GENE 15 25411 - 26829 1993 472 aa, chain + ## HITS:1 COG:FN1450 KEGG:ns NR:ns ## COG: FN1450 COG2985 # Protein_GI_number: 19704782 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 5 472 1 468 468 770 91.0 0 MHFDLVGFIFNSLVLLFFTMTLGNLFGDIKFKKFNFGITGTLFIGLFVGYFLTKYAVTIP EDSKFFSKAQNVLKGNIIDSSIMNLSLLIFIVGTGLLAAKDMKYAITKFGKQFVILAIFI PFVGAVASYGFSKALKNMSPYQITGTYTGALTSSAGLAAATESSESESKHSAANFANLDE GTKVKILAIINNAKERDAKLKNEAIPEKMTVENTTTLSAEDTEIYVTEAKAGVGVGHSIG YPFGVLFLILGINFIPRMFRFDVEKEKEKYFAQKKIDLSNDKDAGKSTIPEVKMDFVGFS IAAFLGYFLGSIKIAMGPLGTFSLGSIGGAIIVALILGSIGKIGPINFRMDSVVLGKMRT YFLSIFLAGTGLNYGFRVVEAVTGDGIMIAVVSALVAILSVLFGFLLGHYVFHVNWTLLS GAITGGMTSAPGLGAAIDALDSDEPAISYGATQPLATLCMVIFSIIIHKLPI >gi|224461336|gb|ACDC01000066.1| GENE 16 26894 - 27346 530 150 aa, chain - ## HITS:1 COG:no KEGG:FN1647 NR:ns ## KEGG: FN1647 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 150 1 139 139 192 68.0 4e-48 MKKIFLFVLMLFAFAACSNTQFVHDVQPISKSEKTVLIQYFPSDFEIPLEKALEDNFWKV 
SVVANKDSASPSVKSNIAVTCDGLYLAHFGTYQGTIKFSDLRTGKRIAIYKFNMATRGAI IENIVKTLEALPGASNTIPATSTTSTQAVK >gi|224461336|gb|ACDC01000066.1| GENE 17 27566 - 27877 508 103 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739365|ref|ZP_04569846.1| SSU ribosomal protein S10P [Fusobacterium sp. 2_1_31] # 1 103 1 103 103 200 100 1e-50 MASNKLRIYLKAYDYTLLDESAKRIAEAAKKSGATVAGPMPLPTKIRKYTVLRSVHVNKD SREQFEMRVHRRMIELVNSTDKAISSLTSVHLPAGVGIEIKQV >gi|224461336|gb|ACDC01000066.1| GENE 18 27861 - 27914 74 17 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34764924|ref|ZP_00145269.1| SSU ribosomal protein S10P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 17 61 77 78 33 82 3e-26 KLNKFNSILGIYLWEIM >gi|224461336|gb|ACDC01000066.1| GENE 19 28023 - 28658 1077 211 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739366|ref|ZP_04569847.1| LSU ribosomal protein L3P [Fusobacterium sp. 2_1_31] # 1 211 1 211 211 419 100 1e-116 MSGILGKKIGMTQIFEDGKFVPVTVVEAGPNFVLQKKTEEKDGYVALQLGFDEKKEKNTT KPLMGIFNKAGVKPQRFVRELAVESVEGYELGQEIKVDVLAEVGYVDITGTSKGKGTSGV MKRHGFGGNRASHGVSRNHRLGGSIGMSSWPGKVLKGKRMAGQHGNATVTVQNLKVVKVD VEHNLLLIKGAVPGAKNSYLVIKPAVKKVIG >gi|224461336|gb|ACDC01000066.1| GENE 20 28678 - 29307 1044 209 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739367|ref|ZP_04569848.1| LSU ribosomal protein L1E [Fusobacterium sp. 2_1_31] # 1 209 1 209 209 406 100 1e-113 MAVLNVYNLEGNQIDTLEVKDTVFGIEPNKVVLHEVLTAELAAARQGTASTKTRAMVRGG GRKPFKQKGTGRARQGSIRAPHMVGGGVTFGPHPRSYEKKVNKKVRNLALRSALSAKVAA GNVLVLDYEGIDTPKTKVIVNLVNKVDAKQKQLFVVGDLIKDYNLYLSARNLENAVILQP NEIGVYWLLKQEKVILTKEALAVVEEVLG >gi|224461336|gb|ACDC01000066.1| GENE 21 29307 - 29594 480 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34764036|ref|ZP_00144922.1| LSU ribosomal protein L23P [Fusobacterium nucleatum subsp. 
vincentii ATCC 49256] # 1 95 1 95 95 189 100 2e-47 MNVYDIIKKPVVTEKTELLRKEYNKYTFEVHPKANKIEIKKAIETIFNVKVEDVATINKK PITKRHGMRLYKTQAKKKAIVKLAKENTITYFKEV >gi|224461336|gb|ACDC01000066.1| GENE 22 29637 - 30467 1447 276 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739369|ref|ZP_04569850.1| LSU ribosomal protein L2P [Fusobacterium sp. 2_1_31] # 1 276 1 276 276 561 100 1e-159 MAIRKMKPITNGTRHMSRIVNDELDKVRPEKSLTVPLKSAYGRDNYGHRTCRDRQKGHKR LYRIIDFKRNKLDVPARVATIEYDPNRSANIALLFYVDGEKRYILAPKGLKKGDIVSAGS KADIKPGNALKLKDMPVGVQIHNIELQKGKGGQLVRSAGTAARLVAKEGTYCHVELPSGE LRLIHGECMATVGEVGNSEHNLVSIGKAGRARHMGKRPHVRGAVMNPVDHPHGGGEGKNS VGRKSPLTPWGKPALGIKTRGRKTSDKFIVRRRNEK >gi|224461336|gb|ACDC01000066.1| GENE 23 30492 - 30767 492 91 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739370|ref|ZP_04569851.1| SSU ribosomal protein S19P [Fusobacterium sp. 2_1_31] # 1 91 1 91 91 194 100 9e-49 MARSLKKGPFCDHHLMAKVEEAVASNNNKAVIKTWSRRSTIFPNFIGLTFGVYNGKKHIP VHVTEQMVGHKLGEFAPTRTYHGHGVDKKKK >gi|224461336|gb|ACDC01000066.1| GENE 24 30796 - 31128 524 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739371|ref|ZP_04569852.1| LSU ribosomal protein L22P [Fusobacterium sp. 2_1_31] # 1 110 1 110 110 206 99 2e-52 MEAKAITRFVRLSPRKARLVADLVRGKSALEALDILEFTNKKAARIIKKTLMSAVANATN NFKMDEEKLVVSTIMINQGPVLKRVMPRAMGRADIIRKPTAHITVAVSEK >gi|224461336|gb|ACDC01000066.1| GENE 25 31147 - 31806 1111 219 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19704960|ref|NP_602455.1| SSU ribosomal protein S3P [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 219 1 219 219 432 99 1e-120 MGQKVDPRGLRLGITRAWDSNWYADKKEYVKYFHEDVQIKEFIKKNYFHTGISKVRIERT SPSQVVVHIHTGKAGLIIGRKGAEIDALRAKLEKLTGKKVTVKVQEIKDLNGDAVLVAES IAAQIEKRIAYKKAMTQAISRSMKSPEVKGIKVMISGRLNGAEIARSEWAVEGKVPLHTL RADIDYAVATAHTTYGALGIKVWIFHGEVLPSKKEGGEA >gi|224461336|gb|ACDC01000066.1| GENE 26 31809 - 32240 748 143 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739373|ref|ZP_04569854.1| LSU ribosomal protein L16P [Fusobacterium sp. 
2_1_31] # 1 143 1 143 143 292 100 2e-78 MLMPKRTKHRKMFRGRMKGAAHKGNFVAFGDYGLQALEPSWITNRQIESCRVAINRTFKR EGKTYIRIFPDKPITARPAGVRMGKGKGNVEGWVSVVRPGRILFEVSGVTEEKAKAALRK AAMKLPIRCKVVKREENENGGEN >gi|224461336|gb|ACDC01000066.1| GENE 27 32240 - 32422 291 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34764030|ref|ZP_00144916.1| LSU ribosomal protein L29P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 60 1 60 60 116 100 2e-25 MRAKEIREMTSEDLVVKCKELKEELFNLKFQLSLGQLTNTAKIREVRREIARINTILNER >gi|224461336|gb|ACDC01000066.1| GENE 28 32456 - 32707 411 83 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739375|ref|ZP_04569856.1| SSU ribosomal protein S17P [Fusobacterium sp. 2_1_31] # 1 83 1 83 83 162 98 2e-39 MRNERKVREGIVVSDKMEKTIVVAIETMILHPIYKKRVKRTTKFKAHDEENVAQVGDKVR IMETRRLSKDKNWRLVEIIEKAR >gi|224461336|gb|ACDC01000066.1| GENE 29 32736 - 33104 600 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|197736521|ref|YP_002165299.1| ribosomal protein L14 [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 122 1 122 122 235 100 3e-61 MVQQQTILNVADNSGAKKLMVIRVLGGSKKRFGRIGDIVVASVKEAIPGGNVKKGDVVKA VIVRTRKETRRDDGSYIKFDDNAGVVINNNNEPRATRIFGPVARELRARNFMKILSLAIE VI >gi|224461336|gb|ACDC01000066.1| GENE 30 33129 - 33470 567 113 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739377|ref|ZP_04569858.1| LSU ribosomal protein L24P [Fusobacterium sp. 2_1_31] # 1 113 1 113 113 223 100 2e-57 MARPKIKFVPESLHVKTGDIVYVISGKDKKKTGKVLRVFPKKGKIIVEGINIVTKHLKPS QVNPQGGVVQKEAAIFSSKVMLFDEKTKQPTRVGYEVRDGKKVRVSKKSGEII >gi|224461336|gb|ACDC01000066.1| GENE 31 33489 - 34040 923 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739378|ref|ZP_04569859.1| LSU ribosomal protein L5P [Fusobacterium sp. 
2_1_31] # 1 183 1 183 183 360 99 1e-98 MDKYVSRYHKFYNEVVVPKLMKELEIKNIMDCPKLEKIIVNMGVGEATQNSKLMDAAMAD LTLITGQKPLLRKAKKSEAGFKLREGMPIGAKVTLRKERMYDFLDRLVNVVLPRVRDFEG VPSNSFDGRGNYSVGLRDQLVFPEIDFDKVEKLLGMSITMVSSAKTDEEGRALLKAFGMP FKK >gi|224461336|gb|ACDC01000066.1| GENE 32 34061 - 34348 478 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739379|ref|ZP_04569860.1| SSU ribosomal protein S14P [Fusobacterium sp. 2_1_31] # 1 95 1 95 95 188 100 4e-47 MAKKSMIARDVKRAKLVDKYAEKRAELKKRIAAGDMEAMFELNKLPKDSSVVRKRNRCQL DGRPRGFMREFGISRVKFRQLAGAGLIPGVKKSSW >gi|224461336|gb|ACDC01000066.1| GENE 33 34377 - 34775 658 132 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739380|ref|ZP_04569861.1| SSU ribosomal protein S8P [Fusobacterium sp. 2_1_31] # 1 132 1 132 132 258 100 5e-68 MYLTDPIADMLTRVRNANAVMHEKVDIPHSKMKERIAEILKEQGYISNFKIVTDEENKKS IRVYLKYAGKERVIKGLKRISKPGRRVYSSVEDMPRVLSGLGIAIVSTSKGVITDKVARA EKVGGEVLAFVW >gi|224461336|gb|ACDC01000066.1| GENE 34 34800 - 35333 910 177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739381|ref|ZP_04569862.1| LSU ribosomal protein L6P [Fusobacterium sp. 2_1_31] # 1 177 1 177 177 355 100 3e-97 MSRVGKKPIAVPSGVDFSVKDNVVTVKGPKGTLTKEFNKNITIKLEDGHITFERPNDEPF IRSIHGTTRALINNMVKGVSEGYRKTLTLVGVGYRAAAKGKGLEISLGFSHPVIIDEIPG ITFTVEKNTTIHIDGIEKELVGQVAANIRAKRPPEPYKGKGVKYADEHIRRKEGKKS >gi|224461336|gb|ACDC01000066.1| GENE 35 35360 - 35728 590 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739382|ref|ZP_04569863.1| LSU ribosomal protein L18P [Fusobacterium sp. 2_1_31] # 1 122 1 122 122 231 99 4e-60 MFKKVDRKASRQKKQMSIRNKISGTPERPRLSVFRSNTNIFAQLIDDVNGVTLVSASTID KALKGSIANGGNVEAAKAIGKAIAERAKEKGINAIVFDRSGYKYTGRVAALAEAAREAGL SF >gi|224461336|gb|ACDC01000066.1| GENE 36 35753 - 36247 808 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739383|ref|ZP_04569864.1| SSU ribosomal protein S5P [Fusobacterium sp. 
2_1_31] # 1 164 1 164 164 315 99 2e-85 MLNREDNQYQEKLLKISRVSKTTKGGRTISFSVLAAVGDGEGKIGLGLGKANGVPDAIRK AIAAAKKNIVKISLKNNTIPHEITGRWGATTLWMAPAYEGTGVIAGSASREILELVGVHD ILTKIKGSRNKHNVARATVEALKLLRTAEEIAALRGLEVKDILS >gi|224461336|gb|ACDC01000066.1| GENE 37 36260 - 36445 300 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739384|ref|ZP_04569865.1| LSU ribosomal protein L30P [Fusobacterium sp. 2_1_31] # 1 61 1 61 61 120 100 2e-26 MARLRIELVKSIIGRKPNHIATAKSLGLKKMHDVVEHNETPELKGKLAQISYLLKIEEVQ A >gi|224461336|gb|ACDC01000066.1| GENE 38 36445 - 36924 807 159 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739385|ref|ZP_04569866.1| LSU ribosomal protein L15P [Fusobacterium sp. 2_1_31] # 1 159 1 159 159 315 99 3e-85 MKLNELTPSVPKKNRKRIGRGNSSGWGKTAGKGSNGQNSRAGGGVKPYFEGGQMPIYRRV PKRGFSNAIFKKEYTVISLSLLNDNFEDGEEVTLETLFNKFLIKKVRDGIKVLGNGELNK KLTVKVHKISKSAQAAIEAKGGTVEIVEVKGFERAESNK >gi|224461336|gb|ACDC01000066.1| GENE 39 36949 - 38229 866 426 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 16 426 19 437 447 338 41 4e-92 MTLMEKFSSRLSSIVKIPELRERIIFTLLMFLVARVGTLIPAPGVDVDRLASMASKSDVL SYINMFSGGAFTRISIFSLGIIPYINASIVVSLLVSIIPQLEEIQKEGESGRNRITQWTR YLTIALAIIQGAGVCLWLQSVGLVYNPGISFFVRTITTLTAGTVFLMWVGEQISVKGIGN GVSLIIFLNVISRAPSSVIQTVQKMQGDKFLIPLFVLVAFLATVTIAGIVLFQLGQRKIP IHYVGKGFSSKSGIGEKSFIPLRLNTAGVMPVIFASVFMLIPGVIVNALPSDLQLKTTLS IIFGQNHPVYMILYALVIMFFSFFYTALVFDPEKVAENLRQSGGTIPGIRPGEETVEYLE GVASRITWGGGLFLAVISILPYVIFTSLGLPVYFGGTGIIIVVGVALDTIQQIDAHLVMR DYKGFI >gi|224461336|gb|ACDC01000066.1| GENE 40 38367 - 40718 3082 783 aa, chain + ## HITS:1 COG:FN0501_1 KEGG:ns NR:ns ## COG: FN0501_1 COG1982 # Protein_GI_number: 19703836 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Fusobacterium nucleatum # 1 503 1 503 503 975 95.0 0 MSKLDQNKTPLFTVLKDEYVRRNILPFHVPGHKRGKGVDKEFFNFMGEAPFSIDVTIFKM VDGLHHPKSCIKEAQELLADAYGVKHSFFAVNGTSGAIQAMIMSVIKAGEKILVPRNVHK 
SVSAGIILSGSEPVYMNPEIDENLGIALGVKPQTVENMLKQDPDIAAVLLINPTYYGVAT DLKKIADIVHSYDIPLIVDEAHGPHLHFHDELPISAVDAGADICTQSTHKILGSMTQMSV IHVNSDRVDVEKVKQILSLLHTTSPSYPLMASLDCARRQIATEGQELLTKTIELAKYFRR EANRIPGIYCFGEELVGKEGFFAFDPTKITISAKELGLKGGELESLLVDDYNIQMELSDY YNTLGLVTIGDTEESIDRLLDALRDISKRFFGKGKTLEKNNIKLPETPELVLMPREAFYS EKNKVPFKESVGKISGEMIMAYPPGIPIIIAGERISQDIIDHIEELKEADLHIQGMEDPE LETINVIEEEDAVYLYTEKMKNVLIGVQTNLGVNKTGTEFGPDDLIQAYPDTFDEMELIT VERQKEDFNDKKLKFKNTVLDTCEKIAKRVNEAVIDGYRPILIGGDHSISLGSVAGVSLE KEIGILWISAHGDMNTPESTLTGNIHGMPLALIQGLGDRELVNCFYEGAKVDSRNIVIFG AREIEVEERKIIEKTGVKIVYYDDILRKGIDNVLEEVKDYLKVDNLHISIDMNVFDPEIA PGVSVPVRNGMSSDEMFKSLKFAFKNYSVTSADITEFNPLNDINGKTAELVDDIVQYMMN PDY Prediction of potential genes in microbial genomes Time: Thu May 19 23:18:44 2011 Seq name: gi|224461335|gb|ACDC01000067.1| Fusobacterium sp. 2_1_31 cont1.67, whole genome shotgun sequence Length of sequence - 4701 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 6.0 1 1 Tu 1 . + CDS 104 - 799 801 ## COG3177 Uncharacterized conserved protein + Prom 809 - 868 3.8 2 2 Op 1 38/0.000 + CDS 960 - 1703 1258 ## PROTEIN SUPPORTED gi|237739389|ref|ZP_04569870.1| SSU ribosomal protein S2P 3 2 Op 2 24/0.000 + CDS 1739 - 2632 530 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts + Term 2638 - 2677 6.2 4 2 Op 3 33/0.000 + CDS 2699 - 3418 1260 ## COG0528 Uridylate kinase 5 2 Op 4 . + CDS 3443 - 4015 939 ## COG0233 Ribosome recycling factor + Term 4037 - 4074 4.8 + Prom 4053 - 4112 14.9 6 3 Tu 1 . 
+ CDS 4251 - 4649 588 ## FN2052 hypothetical protein Predicted protein(s) >gi|224461335|gb|ACDC01000067.1| GENE 1 104 - 799 801 231 aa, chain + ## HITS:1 COG:pli0008 KEGG:ns NR:ns ## COG: pli0008 COG3177 # Protein_GI_number: 18450294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 230 5 241 254 179 40.0 4e-45 MNNIVEKMTNEYIDDLLVRAAHHSTAIEGNTLTLGDTISILIHNYIPKGMTEREYYEVKN YKKAFELLLKADRVISTDLIKNYHRYIMENLREDNGEFKKIQNIILGSVIETTKPYLVPT VIEDWCQNLEYRLNNAKTDEEKIEAILDQHIKFEKIHPFGDGNGRTGRLLIIHSCLKENL APILIPKEEKGKYINFLTSENIKEFVKWGIELENKEKERIELFHNKEKEEK >gi|224461335|gb|ACDC01000067.1| GENE 2 960 - 1703 1258 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739389|ref|ZP_04569870.1| SSU ribosomal protein S2P [Fusobacterium sp. 2_1_31] # 1 247 1 247 247 489 100 1e-138 MSVVSMKQLLEAGVHFGHQAKRWNPKMKKYIFTERNGIHVIDLHKSLKKIEEAYEEMRKI AEDGGKVLFVGTKKQAQEAIKEQAERSGMYYINNRWLGGMLTNFSTIKKRIERMKELERM DADGTLDSDYTKKEAAEFRKELSKLSKNLSGIRDMEKVPDAIYVVDVKMEELPVREAHLL GIPVFAMIDTNVDPDLITYPIPANDDAIRSVKLITSVIANAIVEGNQGHEHVEPQSEEVN VEEGSVE >gi|224461335|gb|ACDC01000067.1| GENE 3 1739 - 2632 530 297 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 292 1 276 283 208 45 6e-54 MATITAALVKELRERTGAGMLDCKKALETNDGDIEKAIDYLREKGITKAVKKAGRIAAEG LIFDAVTPDHKKAVILEFNSETDFVAKNEEFKEFGRKLVKLALERNAHHLEELNEAQIEG DKKVSEALTELIAKIGENMSLRRLAVVVAKDGFVQTYSHLGGKLGVIVEMSGEATEANLE KAKNIAMHVAAMDPKYLSEEEVTASDLEHEKEIARKQLEEEGKPANIIEKILTGKMHKFY EENCLVDQVYVRAENKETVKQYAGDIKVLSFERFKVGEGIEKKEEDFAAEVAAQIKG >gi|224461335|gb|ACDC01000067.1| GENE 4 2699 - 3418 1260 239 aa, chain + ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 1 239 1 239 239 431 95.0 1e-121 MESPFYKKILLKLSGEALMGDQEFGISSDVIASYAKQIKEIVDLGVEVSIVIGGGNIFRG 
ISGAAQGVDRVTGDHMGMLATVINSLALQNSIEKLGVPTRVQTAIEMPKIAEPFIKRRAQ RHLEKGRVVIFGAGTGNPYFTTDTAAALRAIEMGTDVVIKATKVDGIYDKDPVKFADAKK YEKVTYNEVLAKDLKVMDATAISLCRENKLPIIVFNSLVEGNLKKVIMGENIGTTVVAD >gi|224461335|gb|ACDC01000067.1| GENE 5 3443 - 4015 939 190 aa, chain + ## HITS:1 COG:FN1623 KEGG:ns NR:ns ## COG: FN1623 COG0233 # Protein_GI_number: 19704944 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Fusobacterium nucleatum # 1 190 1 190 190 283 90.0 1e-76 MSIASDKLVKECEDKMLKTIESVKERFTSIRAGRANVAMLDAVKVENYGSEVPLNQVGSV SAPEARLLVVDPWDKTLIPKIEKALLAANLGMTPNNDGRVIRLVLPELTADRRKEYVKLA KNEAENGKIAVRNIRKDINNHLKKLEKDKENPISEDELKKEEANVQTLTDKYIKEIDELL AKKEKEITTV >gi|224461335|gb|ACDC01000067.1| GENE 6 4251 - 4649 588 132 aa, chain + ## HITS:1 COG:no KEGG:FN2052 NR:ns ## KEGG: FN2052 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 132 1 119 119 117 85.0 1e-25 MKIKFILGAMLVVGAVSYSAEATDAVAQEVINEVRNIEAEYQALMQKEAERKEEFIQEKA NLEKEVKELKEKQLGREELYAKLKEDSKIRWHRDEYKKLLKRFDEYYNKLEQKIADKEQQ IVELTKLLEVLN Prediction of potential genes in microbial genomes Time: Thu May 19 23:18:51 2011 Seq name: gi|224461334|gb|ACDC01000068.1| Fusobacterium sp. 2_1_31 cont1.68, whole genome shotgun sequence Length of sequence - 15538 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 46 - 8727 13435 ## Lebu_0887 autotransporter beta-domain protein + Term 8785 - 8825 7.0 + Prom 8836 - 8895 10.0 2 2 Op 1 20/0.000 + CDS 8991 - 10145 1822 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 10178 - 10216 -0.9 + Prom 10159 - 10218 4.6 3 2 Op 2 24/0.000 + CDS 10260 - 11147 1229 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 4 2 Op 3 19/0.000 + CDS 11147 - 12127 1302 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 5 2 Op 4 18/0.000 + CDS 12117 - 12899 271 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 6 2 Op 5 . + CDS 12899 - 13612 284 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 13622 - 13665 3.1 + Prom 13614 - 13673 11.0 7 3 Tu 1 . + CDS 13704 - 15338 2081 ## Lebu_0003 protein of unknown function DUF1703 Predicted protein(s) >gi|224461334|gb|ACDC01000068.1| GENE 1 46 - 8727 13435 2893 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0887 NR:ns ## KEGG: Lebu_0887 # Name: not_defined # Def: autotransporter beta-domain protein # Organism: L.buccalis # Pathway: not_defined # 739 2893 624 2831 2831 1721 53.0 0 MVKNNLYMVEKNLRSIAKRYKGVKYSLGLAILFLMMGLSAFSQEVMSTQEIAASKENLRS SVGTLQNKVEAARKENQKAIDGLKLELIQLMEQGDQVVKSPWASWQFGANYMYNDWRGTY KGRGDKAEKYPYEGIFTRSNGNERYISTSSKQYSKLAMNDSITSASTNRRTNIKNGYGLV GVRTVQEPIVAFDVNAGIRPKQVVKGAITITEKNPIAIAQPEAIVFNSPTINIVPPSPVS VTATVPTVNAPNVTPPGVNVPALPTALSFSPVTPSVTAPTAPTVVVSTPPSISFQAQGFG QANNSIIHQVSQGIHLMNYASYNSTNGVANFDVTNNTSTLTSGSVAYTTPPVGTGISITN PSGTISTGDTMSASPMNAVFSHVAPTNVSMTGDYSLNSTAGHNILFISVNPYNYYLTPGN KTFSFNGNVTLNSNGGSSVVGIEHQLLNGGGGGYYSPVNQVVTTVENNGTITLGNGQNMI GIMLDTEYYGSPGNYKFAKSPQTNNNGSIIISAGATNSVGFDYGYYVTGSSNEGPNGTVK VGTITINGNSNYGYRQKYYSATPGYYDGMGTVDGSNGIITLNGSNNVGYSIAHGKTSGDP ISNIANMQIEVNGSNNVGFLRNSDTSTTLNSTAITLNAAKLGTQFNFGTTATGGALIRSD VHEVILDKDITVGATGVKNSLMQAGNDGKVTLASGKTITSTTANEFYGMTAGNFAGADGK 
KATAKNLGTLAIGGNKSLGMAIDVDDEGINEGTINFTGTNGAGVYNTGTFTAKLGSSINV NSPGQNSIGAFNSGTLTIDNGATITGTAKDATGIYATGGTATNNGTISMTAEKAKGLVTD KAGTTAGTIINNGTVSVTGTESVGAAALDGNINVASGSISASGLSGITLYTGGTTGGTIT ATGGTINATSGAINVYANKGTINFNGATINTGASSLAFMKGGGAVNFNSPTTANIATNGT AFYIPPTVSPIPTTPTYSAFTGLTGAALTGFNNLNNLTLNMAANSNLAVASYVQTKISDL ATSSIGALGATVTGTNYNDYLLYKSELTADAGTSYAQFKKIALSNSSIINNTNLSTADNN VVLMAQENNEANKDWVKLTNNGTISLTGKGSLAMYASNGKITNSANSNITVGDSGTAIYG KNKGVGDTLIENSGTITVGKASTGIYAKDYETTGVENKGTINLADDDGIAISYEPNLTSA IVVENKGKITSTAKKGTGIYAAKDANSIQYSALNSGIISLGEGGVGLYTNATSTTTSPKL LNTGTIKVGKNGIGLYGYEETTTGNITVGDTGIALYSQGGDVNIGSATDSPTITVGNTNA TAVYTTGSGQTVTSTKANYDIGTESYGFVNKGTGNALNISGGTATLKDKGVFVYSNDTTG NITNSNPISSTGTIGSNIGVYSAGTVNNSGNISFTGGTGNVGVYVIDNGNITNSGKINVG ASTTAKRSIGVVANTGTINNTGNIVVNGQYGLGLYSTGAASTINNGANITLTGDETIGAY GANGSNINLTSGTIASTGNKTTGYYLGGGTASTIASGAKINGIGNEANGIYVNSGANLTY TGETKVTGDAAYGLIVDGASTVNATGGKVTIGGTSGVNGSSSGANSNRGAAGLVVTSGST LSGNNLNVIADVAGENSVGVYSAGSLAIDSANVSAYDSAVNFYTNGGTLSIGNNGGTSTV LTGTGANKGALMFYTPSGNILLNGPVNATVQGSTDTLKRGTAFYYTGGGTLGTISSYTQL NPTAIASWARNRFGNGTTSTLGNLNLTMNQDSRLFLTEQVDVALSNTSATSLFSGLSSTE RPNVTGAGSGYRTFMLYHSHLNVDNAVNLDNTNDPYNLMEISSSSITNNSTITGTQAGQI ALAQANDTTPKSVVTLTNNGTITLSGANSAGIFAKNGIVKNTNDITVGNSSSGIYGLNNT EVSNTGTITTGGSSTGIFYSDVEKDSAGNVTAINNTTTGLKNDGTINLNGDDSVALTYEP GNITSTVAFENTGDISSTGDKNVGMYAKLAKNNAAYTTKNTGNITLGNSASLSNPNVAIY TNASSAGTNPLENTGNITVGDNAVGMYGYEENSTGNITVGNGSIAMYSKGGNVNVAGSIT TGNSGESVGVYTVGNGQTITNNGATFNLGDTSFGFVNVGSGNNITSNGGSATLNNDGVFI YSSDKANTITNNTNITGTGSIGKNYGIYASSTVNNSGNINFSNGVGNVGIYMINGATGRN TGNITVGASDTSNDLYSIAMAAGYIGDSTTPATTGKIENSGIINVNGRYGIGMYGAGNGT TVTNNNNIVLNANNTMGIYVENGAKAVNNGSIRTGVSGLLSVIGVVLGQNSILENNGTIN ISGTQSKGVLLKGGRIDNYGNITVSGAGSKETDSLNSSPTSKQVGSVVINAPAGATTATI TAGGVVQTPTVVNTTARNPISVAADSIGLYVNTSGTHFTNSITGLGNLTTNADLIIGTEA AQSTRSRYILVNDNRILNPYNTAILTSGVSKWDIYSGSLTWMTTPTLDSTTGAITNLYMA KIPYTEWANDRDTYNFTDGLEQRYGVEELGTRENKVFQKLNSIGNNEETLLYQAYDEMMG HQYANVQQRIQATGDILNKEFDYLRSSWSNPSKDSNKIKTFGTRGEYNTDTAGVIDYKNH 
AYGVAYVHEDETVRLGESTGWYAGIVHNTFKFKDIGNSKEEMLQGKIGIFKSIPFDHNNS LNWTISGDIFAGYNKMNRRFLVVDEVFGAKGRYHTYGLGLKNEISKEFRLSEDFTLKPYA ALGLEYGRVSKIREKSGEIRLDVKSNDYFSVRPEVGAELGFKHHFDRKTVRVGVTVAYEN ELGRVANGKNKARVAYTSADWFNIRGEKEDRRGNIKTDLNIGVDNQRIGLTGNIGYDTKG HNIRGGVGLRVIF >gi|224461334|gb|ACDC01000068.1| GENE 2 8991 - 10145 1822 384 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 1 384 1 383 383 644 90.0 0 MKKKLVTTLLGASLLLAACGGEKAAEKPASTEAETIKIGAIGPLTGSVAIYGISATNGLK LAVDEINANGGILGKQIELNLLDEKGDSTEAVNAYNKLVDWGMVALIGDITSKPSVAVAE VAAQDGIPMITPTGTQLNITEAGSNIFRVCFTDPYQGEVLAKFTKDKLAAKTVAIMSNNS SDYSDGVANAFAKEAEAQGIKIVAREGYSDGDKDFKAQLTKIAQQNPDVLFVPDYYEQDG LIAIQAREVGIKSVIVGSDGWDGVVKTVDPSSYAAIEDVFFANHYSTKDSNEKVQNFIKN YKEKYNDEPSAFSALSYDAAYILKAAIEKAGTTDKEAVAKAIKELEFDGITGHLTFDEKN NPVKSITIIKIVNGDYTFDSVVSK >gi|224461334|gb|ACDC01000068.1| GENE 3 10260 - 11147 1229 295 aa, chain + ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 1 295 14 308 308 450 89.0 1e-126 MEFLLQIINGLQIGSIYALVSLGYTMVYGIAQLINFAHGDIIMIGAYVSLFSIPTLSSMG LPVWVSVIPAIIICAIVGCLTERIAYRPLRNSPRISNLITAIGVSLFLENVFMKVFTPNT RSFPKIFIQDSIKLGDSIQISFGAVVTIVVTVILSIALQLFMKKTKYGKAMIATSQDYAA SALVGINVDRTIQLTFAIGSGLAAVAAVLYVSAYPQIQPLMGSMLGIKAFVAAVLGGIGI LPGAVIGGFILGIVESLTRAYLSSQLADAFVFSILIIVLLFKPTGILGKNVKEKV >gi|224461334|gb|ACDC01000068.1| GENE 4 11147 - 12127 1302 326 aa, chain + ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: 
Fusobacterium nucleatum # 42 326 1 285 285 430 95.0 1e-120 MDKNKKLSYIATYAILLVLYFILFSLINSGFISRYQIGIIILILINVILAASLNITVGCL GQITLGHAGFMSIGAYTAALLTKSGFLSGYPGYIVALIVGGIVAGIIGFIIGIPALRLTG DYLAIITLAFGEIIRVLIEYFKFTGGAQGLTGIPRVNNFTLIYFITIFSVIFMYSIMTSR HGRAVLAIREDEIASGASGINTTYYKTFAFVLSAIFAGIAGGIYAHNLGILGAKQFDYNY SINILVMVVLGGMGSFTGSILSAIVLTILPEVLRSFAEYRMIVYPLILIIMMLFRPEGLL GRKEFQISKVISYFTNKSKRGEEDGK >gi|224461334|gb|ACDC01000068.1| GENE 5 12117 - 12899 271 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 18 248 33 254 329 108 28 2e-23 MENKKPLLVAKDISISFGALKAVDKFNLEIKSGELIGLIGPNGAGKTTVFNILTGVYNAS SGEYTLDGEDVIKTSTSALVKKGLARTFQNIRLFKYLSVLDNVVAAYNFRMKYGIFTGMF RLPSFWKEEKLAKEKAMELLKIFDLDKYANMRAGNLPYGEQRKLEIARAMATEPKILLLD EPAAGMNPKETEDLMNTIKLIRDKFGIAVLLIEHDMKLVLGICERLVVLNYGQILASGDP QEVINNPKVVEAYLGKEEDE >gi|224461334|gb|ACDC01000068.1| GENE 6 12899 - 13612 284 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 232 1 237 245 114 28 5e-25 MAMLEVKDLQVFYDNIQALKGISLEINEGEVVSIIGANGAGKTTTLQTISGLITPKSGSI IFEGKDLLKEKAHNICKLGIAQVPEGRRIFSQLAVKDNLKLGQFTIKDSAEKKEEDRANF YKVFPRMSERKNQLAGTLSGGEQQMLAMGRALMSRPKLLILDEPSMGLSPLFVKEIFEVI KQLKEKGTTILLVEQNAKMALSISDRAYVIETGEIVLEGNAKDLLHNDRVKKAYLGG >gi|224461334|gb|ACDC01000068.1| GENE 7 13704 - 15338 2081 544 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 544 3 545 545 569 57.0 1e-160 MKRLAIGLSDFKHLIKEDFYYFDKTKFIEEVIKDGSQVKLFARPRRFGKTLNMSMLKYFF DIKNREENKKIFKDLYIEKTEAFKEQGQYPVIFLSLKDLKALTWEEMEEKITVIVSELFS EYNYLINELVETDSDKFKRIINENANLSNLGRSLKFLTKILYEKYNKKVVVLIDEYDSPL VSAYINGYYEKAKDFFKTFYSTVLKDNSYLQMGVLTGIIRVIKAGIFSDLNNLSTYTILS DVYTDSYGLTEEEVEKSLKYYGIEQEISNVKDWYDGYKFGDSEVYNPWSILNFLRFKELR AYWVDTSGNDLIKDVLKKITKNTIEALERLFNGEGLKQNISGTSDLSKLLSEDELWELML 
FSGYLTVEEKIDHKNYVLRLPNKEIKELFRDTFLEKYFGRGSKLLYLMEALTENRIDEYE ERLQEILLTSVSYNDTKKGNEAFYHGLIMGMGLYLEGEYITKSNIESGLGRYDFVIEPKN KTKRAFIMEFKATDSIENLEEISKEALRQIEDKKYDISLKQNGVKDITYMGIAFCGKQIK IEYK Prediction of potential genes in microbial genomes Time: Thu May 19 23:19:44 2011 Seq name: gi|224461333|gb|ACDC01000069.1| Fusobacterium sp. 2_1_31 cont1.69, whole genome shotgun sequence Length of sequence - 19108 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 15 - 878 1032 ## COG0384 Predicted epimerase, PhzC/PhzF homolog - Prom 904 - 963 8.5 + Prom 853 - 912 13.9 2 2 Op 1 . + CDS 1046 - 4213 4802 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain + Prom 4216 - 4275 1.6 3 2 Op 2 . + CDS 4297 - 4554 429 ## PROTEIN SUPPORTED gi|237739403|ref|ZP_04569884.1| LSU ribosomal protein L28P + Term 4620 - 4669 6.4 + Prom 4924 - 4983 15.3 4 3 Op 1 38/0.000 + CDS 5047 - 6585 2749 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 6599 - 6631 2.0 5 3 Op 2 49/0.000 + CDS 6654 - 7580 284 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 6 3 Op 3 44/0.000 + CDS 7590 - 8459 1205 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 7 3 Op 4 44/0.000 + CDS 8478 - 9485 629 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 8 3 Op 5 . + CDS 9478 - 10452 827 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 10464 - 10509 7.5 + TRNA 10553 - 10629 75.9 # Arg ACG 0 0 + TRNA 10636 - 10722 72.5 # Leu CAA 0 0 - Term 10723 - 10762 -0.0 9 4 Op 1 . - CDS 10932 - 11897 885 ## MGAS2096_Spy0376 hypothetical protein 10 4 Op 2 . 
- CDS 11967 - 12890 852 ## FIC_00852 hypothetical protein - Prom 12982 - 13041 10.3 + Prom 13013 - 13072 9.9 11 5 Op 1 1/1.000 + CDS 13282 - 15105 2762 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 12 5 Op 2 1/1.000 + CDS 15138 - 16892 2588 ## COG0006 Xaa-Pro aminopeptidase 13 5 Op 3 1/1.000 + CDS 16922 - 17482 563 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 17485 - 17538 8.4 + Prom 17498 - 17557 13.4 14 6 Tu 1 . + CDS 17602 - 19077 2398 ## COG1012 NAD-dependent aldehyde dehydrogenases Predicted protein(s) >gi|224461333|gb|ACDC01000069.1| GENE 1 15 - 878 1032 287 aa, chain - ## HITS:1 COG:FN1427 KEGG:ns NR:ns ## COG: FN1427 COG0384 # Protein_GI_number: 19704759 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Fusobacterium nucleatum # 1 286 8 293 293 464 82.0 1e-131 MKIFICDAFSSEIFKGNQAGVVILDEKENYPDENFMKNIAAELKHSETAFVRKIDTKVFK IKYFTPTDEVELCGHATISVFSTLRSLKIIEPGKYTAETLAGNLEIIVDKDFIWMDMSLP KVEYIFNLDEIKELYSAFNLDTSHAPKSLVPKIVNTGLSDIIIPIENKEILDNFVMNKEK VIELSKKYNVVGAHLFSLDKEKNFTAFCRNIAPLVGIDEECATGTSNGALTHYLKEYNII SVKDINSFRQGEAMQRASTILSRYKEDGVTIQVGGNAVISFECKLYK >gi|224461333|gb|ACDC01000069.1| GENE 2 1046 - 4213 4802 1055 aa, chain + ## HITS:1 COG:FN1950_2 KEGG:ns NR:ns ## COG: FN1950_2 COG4625 # Protein_GI_number: 19705252 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: Fusobacterium nucleatum # 624 1055 1 432 432 587 69.0 1e-167 MKKEILKSKMMLIALSSILFVSCGGGGGGGGGASNLPVNPGINPGTPSTPTTIEGTYPTV DTGLDKSNMSALKTNLYAAQRSSGASIPKDTSTVDGSGVKVAILDTNFVDAVRTGANSAE DGDGNSITARRNKKLTDIYTSIEIVNPNPNHPYIEDTDGKIVNHGTERPTGIEHGEEVLE IIGDLYAAPNSQATSIGLTNQADNKIGAILGSIGWNYQYTEGTSTKKRIGGMFPTKEVYE AAMAKFGNQSVKIFNQSFGSDDPYDDPQYRTYRGEGNLPLPFAKMNSGDQPNFMLPYFRD AVENKGGLFIWAAGNKANQNASLEAGLPYFDKRLEAGWISVVGVSTEKGSQYNVIDTLSK 
AGSEAAYWSISADEESVLKIISLTPKQGTVGVGSSYAAPKVTRAAALVYDKFDWMTANQI RQTLFTTTDKTELTQNPATMSEDNLRNVTMFPDSTYGWGMLNEKRALKGPGAFMNVTKYG DTSIFKANLPAGKVSYFDNDIYGNGGLEKLGAGTLHLTGNNSFSGGSTVTAGTLEIHQIH SSPITVGSGGTLVLNPKAIVGYDNSGFSLIGTVDPQKITDSGIKVKNYGNVKFNGNTAII GGDYVAYNGSNTQVGFKNSVKVLGTIRIENANISILSNDYVTKNEIATVMEGQSVEGNIA TVETNGMRTASAEIKDGKIVASLSRQNVVDYVGEDASASTKNVAGNVEKVFEDLDNKIEK GIATEKEVLAARTLQTMATSTFTSATEVMSGEIYASAQALTLSQAQDVNRDLSNRFSRID NLKNSKDDTEVWFSALGGAGKLRRDGYASADTRIVGGQAGIDKRFTPTTTLGLALNYSYA HADFNKYAGESKSDMVGLSLYGKQDLGNDFYLAGRLGIANISSKVERELLTATGDRIEGK INHHDKMLSTYLELGKKFNWFTPFVGISQDYLRRGSFDESTATWGIKADKKTYRATNFLV GARAEYIADKYKLYASLSHSINTDKRDLAYEGRFTGSSVAQKYYGVKQAKNTTWIGFGAF REITPAFGVYGNIDFRIESNKGRDSVISTGIQYRF >gi|224461333|gb|ACDC01000069.1| GENE 3 4297 - 4554 429 85 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739403|ref|ZP_04569884.1| LSU ribosomal protein L28P [Fusobacterium sp. 2_1_31] # 1 85 1 85 85 169 100 1e-41 MQRCEITGTGLISGNQISHSHRLTRRVWKPNLQVTTLNVNGSPIKVKVCARTLKTLKGAS EVEVMRILKANIATLSERLLKHLNK >gi|224461333|gb|ACDC01000069.1| GENE 4 5047 - 6585 2749 512 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 512 1 510 511 912 85.0 0 MKKKFGLLVTIILSMLFLVACGGNGDKKEGADAGASTGKDTLVIAQGADAKSLDPHASND NPSSRIRVQIYDRLMDLDDNGVPQPMLAESWERPDDKTIIFHLRKGVKFHNGDEMKASDV KFSLERALASPEVSHILTGINGVEVLDDYTVKVTTEKPMAAILNNLAHTTIAILSEKATK EAGDKFGQNPVGSGPYKFVSWQSGDRVTLEAFPEYWQGEAPVKNVVFRNIVEETNRTIGL ETGELDIIYDIQGMDKNKLKDDDRFVVIEGPQVSMTYLGFNMKKAPYDNPKVREAISYAI DQKPIIDTVFLGAGEPANSIIGPNVWGYYDVEKYTQDIEKAKALLAEAGYPNGFKAKIWV NDNPVRRDTAVILQDQLKQIGIDLAIETVEWGAFLDGTARGDHEMFLLGWGTVTRDPDYG MYELISTATMGAAGNRSFYSNPTVDKLLEEGRTELDPEKRKAIYKEIQEIIRKDIPMYMI IYPLQNVVTQKNIKNFKLDPANSHKIYGVTKE >gi|224461333|gb|ACDC01000069.1| GENE 5 6654 - 7580 284 308 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 62 307 40 316 320 114 27 7e-25 MYKYILKRLVLLIPVMLGVTLLVFAIMYLTPGDPAQLILGESAPKEAVAALREKMGLNDP FFMQYLRFVKNALVGDFGRSYTTGREVFEEIFARFPNTVVLAVLGVIISIVIGIPVGIIS ATKQYSLTDSFSMVLALLGVSMPVFWLGLMLILLFSVKLGIFPSGGFDGFRSVILPSVAL GVGSAAIVTRMTRSSMLEVIRQDYIRTARAKGVAEKVVINKHALKNALIPIITVVGLQFG GLLGGAVLTESVFSWPGVGRLMVDAIRQKDTPTVLASVVFLAVVYSVVNLLVDLLYAFVD PRIKSQYK >gi|224461333|gb|ACDC01000069.1| GENE 6 7590 - 8459 1205 289 aa, chain + ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 289 1 289 289 493 96.0 1e-139 MEKSKNKKQSQWAEVFRMLKKNKMAMLGLIILVVLVLLALFADVIANYDTVVIKQNLAER LMPPNGKHWLGTDEFGRDIFARLIHGARVSLKVGILAISISVVVGGILGAVSGYFGGVID NVIMRVVDIFLAVPSILLAIAIVSALGPSMLNLMISISVSYVPNFARIVRASVLSIRDQE FIEAAKAIGASNTRIILKHIIPNSLAPVIVQGTLGVAGAILSTAGLSFIGLGIQPPAPEW GSMLSGGRQYLRYAWWVTTFPGVAIMITILSLNLLGDGLRDALDPRLKQ >gi|224461333|gb|ACDC01000069.1| GENE 7 8478 - 9485 629 335 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 23 321 35 328 329 246 42 6e-65 MENRNLLEIRDLVIQYVKDDETVHAVNSISVDIAEGETLGLVGETGAGKTTTALGIMRLI TGPTGKIKSGSIKFNDKSILEIPEEEIRKIRGNDISMIFQDPMTSLNPVMTVGEQIAEVI EIHEHISKEEAMNKAAEMLELVGIPGARKNDFPHQFSGGMKQRVVIAIALACNPKLLIAD EPTTALDVTIQAQVLDLMTDLKNKFRTSMLLITHDLGVVAQVCDKVAIMYAGEIVEYGSL EDVFENPKHPYTLGLFGSIPSLDEEKTRLVPIKGLMPDPTNLPTGCKFNPRCPHATELCS QRAPIVSEISKGHKVQCLIAEGLVKFKENWEEENE >gi|224461333|gb|ACDC01000069.1| GENE 8 9478 - 10452 827 324 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 312 8 324 329 323 52 7e-88 MSKVLLEVKNLKKYFQTPKGQLHAVDNVNFAIEEGKTLGVVGESGCGKSTTGRTILRLLE 
ATDGEIIFEGKNIRDYSKAEMKKLREEMQIIFQDPFASLNPRMTVSEIIAEPLIIHNKCK TKEELNNRVKELMDTVGLSQRLVNTYPHELDGGRRQRIGIARALALNPKFIVCDEPVSAL DVSIQAQVLNLMKDLQEKLGLTYMFITHDLSVVKYFSNDIAVMYLGELVEKAPSKDLFKN PIHPYTKALLSAIPTINIRKKMERIKLEGEITSPINPGVGCRFAKRCVYATEICSKESPK LEKVGEAHFFACHRAKELGFVDEK >gi|224461333|gb|ACDC01000069.1| GENE 9 10932 - 11897 885 321 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy0376 NR:ns ## KEGG: MGAS2096_Spy0376 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 2 321 3 322 322 443 73.0 1e-123 MNLAKLLVDKSIEAFIMGLEIYNKPTIKYRVEGFSFFICNAWELMLKAELINNGKTIYYP DKPDRTISLELAVKRIYPDENTRIRLNLLKIIDLRNISTHFITEDYEIKYAPLFQACVLN FIFEINRFHKRDVTEYIAHNFLTISANYDPLTNEEIKLKYPPEIAEKFIKQANEIDVLSN EYNSDKFSINIKQNLYITKRKDEVDFVVSISNNSKNKVALVKELKDPSDTHKYSFINVIA AVKERLKKQNIKLGYQNGFNSHVLSLIINFYDIKKDKKYSYRHILGNNETYTYSQQFIEF IINEVKKNPSKFVESLKNKKR >gi|224461333|gb|ACDC01000069.1| GENE 10 11967 - 12890 852 307 aa, chain - ## HITS:1 COG:no KEGG:FIC_00852 NR:ns ## KEGG: FIC_00852 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 3 293 2 294 294 315 52.0 2e-84 MDKIHKATHRGYLNLGDKQIPCAVLENGMRIVSQTGLFLSFDRPRKGEKRLEDLPSVVGA KNLLPYVSNELYEKSKPIEYYHTNGKIATGYNAEIVPLICELYISAFEENILKPSQYKLY ARSMILIRALAKVGITALIDEATGFQNDRQAQALQELLKSYISEDLLKWQKRFPNKFYKE IFRLYGWKYDENSSKRPGWVGTFTRKYVYDLFPEAVIKEIETINPVVTKNNKSYRKNRHH QFLTTDIGLPQLDHYISKLLGVMALSKNNEDFDKNFNIAFSEEINLKSKEKELYNLQLEL YPKKDKI >gi|224461333|gb|ACDC01000069.1| GENE 11 13282 - 15105 2762 607 aa, chain + ## HITS:1 COG:FN0452 KEGG:ns NR:ns ## COG: FN0452 COG0449 # Protein_GI_number: 19703787 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Fusobacterium nucleatum # 1 607 1 607 607 1073 93.0 0 MCGIIGYSGTNTNAVEVLLEGLEKVEYRGYDSAGIAFVTDKGIQIEKKEGKLDNLRNHMK QFEVLSCTGIGHTRWATHGVPTDRNAHPHYSENRDVALIHNGIIENYVEIKKELLEQGVK 
FSSDTDSEVVAQLFSKLYDGDLYSTLKKVLKRIRGTYAFAIIHKDFPDRMICCRSHSPLI VGLGEHQNFIASDVSAILKYTRDIIYLEDGDVALVTKDNVTIYDKDEKEVKREVKKVEWN FEQASKGGYAHFMIKEIEEQPEIIEKTLNVYTDKEKNVNFDEQLEGINLHNIDRIYIVAC GTAYYAGLQGQYFMKKLLGIDVFTDIASEFRYNDPVITDKTLAIFVSQSGETIDTLMSMK YAKEKGAKTLAISNVLGSTITREADNVIYTLAGPEISVASTKAYSSQVLVLYLLSLYMGA KLGKLEEKDYVKYISDITLLKENISGLITEKEKIHEIAKRIKDVKNGFYLGRGIDEKVAR EGSLKMKEINYIHTEALAAGELKHGSIALIEQGVLVVAISTNLEMDEKVVSNIKEVKARG AYVVGVCKEGSLVPEVVDDVIQIKDSGELLSPVLAVVALQYLAYYTSLEKGFDVDKPRNL AKSVTVE >gi|224461333|gb|ACDC01000069.1| GENE 12 15138 - 16892 2588 584 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 584 1 584 584 977 87.0 0 MEINKRIEEARKSMKKHKVDAYIVTSSDYHQSEYIGGYFQGREYLSGFTGSAGILVIFND EACLWTDGRYHIQAENQLKGSEIKLFKQGNTGVPTYKEYIVSKLAENSKIGIDAKILLSS DVNEILSKKKFKIVDFDLLAEVWKKRPALAAEKIFILEDKYTGKSYKEKVKEIRASLKEK NADYNIISSLDDIAWIYNFRGDDVQHNPVALSFTVISEKKASLYIDKNKLNEGAKKYFKD NKVEVKGYFEFFEDIKKLKGNILVDFNKTSYAIYEAISKNNLINAMNPSTYLKAHKNETE IANTKDIHVQDGVAIVKFMYWLKNNYKKGNITEFSAEEKINSLREKIEGYIDLSFHTISA FGKNAAMMHYSAPEKNSTKIEDGVYLLDSGGTYLKGTTDITRTFFLGKVGKQEKIDNTLV LKGMLALSRAKFLFGATGTNLDILARQFLWNVGIDYKCGTGHGVGHILNVHEGPHGIRFQ YNPQRLEVGMIVTNEPGAYIEGSHGIRIENELLVKEACETEHGQFLEFETITYAPIDLDG IVKSLLTKEEKEQLNTYHKEVYEKLKPYLTKAEQAFLKEYTKEI >gi|224461333|gb|ACDC01000069.1| GENE 13 16922 - 17482 563 186 aa, chain + ## HITS:1 COG:lin1042 KEGG:ns NR:ns ## COG: lin1042 COG1853 # Protein_GI_number: 16800111 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Listeria innocua # 1 178 1 180 183 95 34.0 5e-20 MRKNYETSKLYYGFPVILLGYKDVNFKYNFTTNSSSYTLGDMMVIGLHCRSNAAKQIMNF KEFTVNIPSENLMDEIEIGGFFHKVDKIQLSKLDYEIGEFIDAPIFTACPVSMECKVENV VMYGETANIIASIKKRVVNPILIEDGKLNSDKLNSVLFFGDDNEKIYRYLRNLSDKAGKF YKNKFE 
>gi|224461333|gb|ACDC01000069.1| GENE 14 17602 - 19077 2398 491 aa, chain + ## HITS:1 COG:FN0454 KEGG:ns NR:ns ## COG: FN0454 COG1012 # Protein_GI_number: 19703789 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Fusobacterium nucleatum # 1 491 1 491 491 962 95.0 0 MENILKKSYRMFINGEWVNSSNGVMVKTYAPYNNELLSEFPDASESDVDLAVKSAKEAFK TWRKTTVKERAKILNKIADIIDENKELLATVETMDNGKPIRETTLVDIPLAASHFRYFAG CILADEGQATVLDEKFLSLILREPIGVVGQIIPWNFPFLMAAWKLAPALAAGDTVVLKPS STTTLSLLVLMELIQDVIPKGVVNLITGKGSTAGEFLKNHPDLDKLAFTGSTAVGRDIAL AAAEKLIPATLELGGKSANIILDDADIEKALEGAQLGILFNQGQVCCAGSRIFVQEGIYD EFVEKLVKKFENIKIGNPLDPTTVMGSQIDARQVKTILDYVEIAKQEGGVVLTGGVKYTE NGCDKGNFVRPTLITNVNNGCRISQEEVFGPVAVIIKFKTDDEVIAQANDSEYGLGGAVF TKNINRALRLAREIQTGRVWVNTYNQIPEHAPFGGYKKSGIGRETHKVILEHYTQMKNIL IDLEEGTSGLY Prediction of potential genes in microbial genomes Time: Thu May 19 23:19:57 2011 Seq name: gi|224461332|gb|ACDC01000070.1| Fusobacterium sp. 2_1_31 cont1.70, whole genome shotgun sequence Length of sequence - 6325 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 8.7 1 1 Op 1 1/0.000 + CDS 141 - 680 1019 ## COG1592 Rubrerythrin + Term 697 - 738 6.6 + Prom 701 - 760 6.2 2 1 Op 2 . + CDS 789 - 2198 1914 ## COG1306 Uncharacterized conserved protein + Term 2216 - 2275 6.5 + Prom 2244 - 2303 15.5 3 2 Tu 1 . + CDS 2381 - 2809 905 ## COG0716 Flavodoxins + Prom 2874 - 2933 13.7 4 3 Op 1 7/0.000 + CDS 3069 - 4448 1889 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins 5 3 Op 2 . + CDS 4468 - 5907 1983 ## COG0531 Amino acid transporters + Term 5954 - 6002 6.1 + Prom 6000 - 6059 9.2 6 4 Tu 1 . 
+ CDS 6109 - 6325 296 ## COG0426 Uncharacterized flavoproteins Predicted protein(s) >gi|224461332|gb|ACDC01000070.1| GENE 1 141 - 680 1019 179 aa, chain + ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 1 179 1 179 179 306 92.0 1e-83 MDLKGSKTEKNLMTAFAGESQARNKYNFYAKVAKEEGYEQIAELFDITANNEKEHAKLWF KALHGDTIPETIVNLADAAAGENYEWTDMYAKFAEEAREEGFMKLAKQFEMVGQIEKEHE ERYRKLLENIKNGTVFHSEEKVAWECMDCGHLHYGNDAPGKCPVCGADKAKFKRRAVNY >gi|224461332|gb|ACDC01000070.1| GENE 2 789 - 2198 1914 469 aa, chain + ## HITS:1 COG:FN0456 KEGG:ns NR:ns ## COG: FN0456 COG1306 # Protein_GI_number: 19703791 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 89 468 1 380 380 653 90.0 0 MKFTKKIFLLIMFLFLGLVSSKETYSKEKKSKPNYNYVVEKASIYSDMNKKENIGTLIKG TRVNVYETKEVTKKIKNKQGKEIDATIIMKKITYKDVDKTKVAWIEDTYLVSTLNEAVDE RFKNLDFTEKEKKEYKDNKRVKVRGIYVSAHSVALKGRLDELIELAKKNNINAFVIDVKG DYGELTFPMSESINKYTKSANKNPIIKEIEPVIKKLKDNGIYTIARIVSFKDTIYAKENP DKIIVYKDGGKAFTNSDGLVWVSAYDKNLWEYNVTVAKEAAKAGFNEIQFDYVRFPASNG GKLDKVLNYRNPDNMTKAEAIQKYLNYAKKELSPYNVYISADIYGQVGSSSDDMSLGQFW EAVSSEVDYVLPMMYPSHYGKGVYGLEIPDANPYKTIYHSTKDSINRNNNISSPAIIRPW IQAFTATWVKGHIHYGSKEVKEQIKAMKDLGVDEYILWSATNKYENFFK >gi|224461332|gb|ACDC01000070.1| GENE 3 2381 - 2809 905 142 aa, chain + ## HITS:1 COG:FN0513 KEGG:ns NR:ns ## COG: FN0513 COG0716 # Protein_GI_number: 19703848 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 142 1 142 142 225 88.0 2e-59 MSKISLVYYSATGNTEKMAKAIEEGIVEAGGAVTVYKSNAMDKDAILSSDVIVMGSSATG AEVIDENDLLPFMEEAGDKFKGKKVYIFGSYGWGGGEYADNWKAQLEGFGATIVDMPILA NEEPSDEELAQLKEVGKKLAAI >gi|224461332|gb|ACDC01000070.1| GENE 4 3069 - 4448 1889 459 aa, chain + ## HITS:1 COG:sll1641 KEGG:ns NR:ns ## COG: sll1641 COG0076 # Protein_GI_number: 16329656 # Func_class: E Amino acid transport 
and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Synechocystis # 12 459 18 467 467 497 51.0 1e-140 MTKKNFTKTIEINPIFARPGEITSAPVDRFPTDSMLPETAYQIVHDESMLDGNARLNLAT FVSTWMDENANKLYSETFDKNAIDKDEYPATARVETNCWHMLADLWHAPDPDNAIGCSTT GSSEACMLGALALKRRWQEKMRKLGKSTARPNLIMSSAVQVCWEKFCNYFDVEPRYVPIS LDHKVLDGYDLEKYVDENTIGVVAIMGVTYTGMYEPVKDIAKALDKIEKETGLDIPIHVD AASGGMIAPFIQPDLEWDFRIPRVYSINTSGHKYGLVYPGLGWVVWRSTAHLPESLIFKV SYLGGEMPTFALNFSRPGAQILLQYWAFLRYGFNGYKTVQQSTMDVANHLANEISKMDMF TLWNHPTDIPVFAWMLKESPNRKWTLYDLSDRLRMKGWQVPAYPMPVDLTNITVQRIVVR NGLSMDLADRFLDDIKSQVEYLENLEHEMPKTNAGGFHH >gi|224461332|gb|ACDC01000070.1| GENE 5 4468 - 5907 1983 479 aa, chain + ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 17 465 24 482 510 303 39.0 5e-82 MNEDKLKNNDSIQKNTLSIFQIAVMTTICVASLRTLPPMAEEGRASILMYIIPAILFLIP TSLVSAEFATTYKGGIYVWIREAFGNRMGFVAIWLQWIQNVVWYPVQLAFVAAALAFTIN RGDLSNSGLFTAVVIIVVYWFSTFLAFKGGNLFAKVSSIGGMIGVLIPGAILIILGLLWI AQGQPISESYLQSSYIPKITGISSLVLIVSNVLSYAGMEMNAVHAGQMENPKKDFTKAIA LAFILILCVFVFPTLAIAIAIPADKLGMANGIMVAFQEFFEKLNISWMSNVMSGAMFFGA ISSVVTWVAGPSKGLLDAGKTGLLPPILQKVNKNNVQVNILVFQGIIVTILAMIYVLFPD VSDVFIALIGMAAALYVIMYMLMFAAIIVLRKKEPSIERGYKVPAVNIVSGIGFISCALA FVMSFIPTTSEAAIPRNMYPIVVAIVVFLLGIPPFIFYMFKKPSWDMRTAQEKEEKPIH >gi|224461332|gb|ACDC01000070.1| GENE 6 6109 - 6325 296 72 aa, chain + ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 72 1 72 403 146 91.0 8e-36 MYCCTKINDDIIWIGVNDRKTQRFENYIPLDNGVTYNSYLILDEKICIIDGVEEGENGNF LSKIEAMIGTAP Prediction of potential genes in microbial genomes Time: Thu May 19 23:19:58 2011 Seq name: gi|224461331|gb|ACDC01000071.1| Fusobacterium sp. 
2_1_31 cont1.71, whole genome shotgun sequence Length of sequence - 1820 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 988 1501 ## COG0426 Uncharacterized flavoproteins + Term 1002 - 1041 5.4 + Prom 995 - 1054 5.9 2 2 Op 1 . + CDS 1096 - 1278 179 ## COG1724 Predicted periplasmic or secreted lipoprotein 3 2 Op 2 . + CDS 1316 - 1714 502 ## SAG1835 hypothetical protein + Term 1751 - 1789 -0.9 Predicted protein(s) >gi|224461331|gb|ACDC01000071.1| GENE 1 2 - 988 1501 328 aa, chain + ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 328 76 403 403 641 92.0 0 IIVNHVEPDHSGSIKNLLKIYPELKVVGNAKTIMMLRLLGVELPDERTMVVKEKDVLDLG KHKLTFYLMPMVHWPESMATYDITDKILFSNDAFGSFGALDGAVFDDEVNTDFFTDEMRR YYSNIVGKFGAPVNAVLKKLSSLEISCICPSHGLIWRKYIKELIERYQKWANMEPTKEGV VIVYGSMYGHTAEMAEYLGRELGNRGIKDVIIFDSSKTDHSYIFSTIWKYKGLMLGSCAH NNDVYPKMEPLLHKLQNYGLKNRYLGIFGNMMWSGGGVKKIKEFADSLPGLEQIGEPIEI KGHVTPIERDRLIELANLMADKLIADRK >gi|224461331|gb|ACDC01000071.1| GENE 2 1096 - 1278 179 60 aa, chain + ## HITS:1 COG:CC3184 KEGG:ns NR:ns ## COG: CC3184 COG1724 # Protein_GI_number: 16127414 # Func_class: N Cell motility # Function: Predicted periplasmic or secreted lipoprotein # Organism: Caulobacter vibrioides # 1 60 1 60 62 72 58.0 2e-13 MSSKEIIKMLEADGWILRAVEGSHHHFKHPSKKGKVTVPHPNKDLHIKTVNSILKQAGLK >gi|224461331|gb|ACDC01000071.1| GENE 3 1316 - 1714 502 132 aa, chain + ## HITS:1 COG:no KEGG:SAG1835 NR:ns ## KEGG: SAG1835 # Name: not_defined # Def: hypothetical protein # Organism: S.agalactiae # Pathway: not_defined # 1 131 1 134 134 119 49.0 3e-26 MKYHYYAVFEKDEDGYSISFPDLPGCLTCAKDIEEALKMAKDVLEGYILISEEDNDPIVP ASSYKELNKNLEDNQVLQLITADTDFVRMRKKNKSVNKMVTLPKWLIDLGKEKKINFSQL LQEAIKRELNID Prediction of potential genes 
in microbial genomes Time: Thu May 19 23:20:03 2011 Seq name: gi|224461330|gb|ACDC01000072.1| Fusobacterium sp. 2_1_31 cont1.72, whole genome shotgun sequence Length of sequence - 7733 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1006 1388 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 1046 - 1105 14.9 + Prom 1025 - 1084 9.2 2 2 Tu 1 . + CDS 1127 - 1972 1152 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily + Term 2018 - 2064 -1.0 + Prom 1988 - 2047 6.7 3 3 Op 1 1/0.000 + CDS 2284 - 3993 2076 ## COG0018 Arginyl-tRNA synthetase 4 3 Op 2 1/0.000 + CDS 4010 - 4738 956 ## COG2071 Predicted glutamine amidotransferases 5 3 Op 3 4/0.000 + CDS 4756 - 6144 1693 ## COG0531 Amino acid transporters + Prom 6150 - 6209 9.2 6 3 Op 4 1/0.000 + CDS 6242 - 7129 1052 ## COG0583 Transcriptional regulator 7 3 Op 5 . + CDS 7143 - 7727 899 ## COG0279 Phosphoheptose isomerase Predicted protein(s) >gi|224461330|gb|ACDC01000072.1| GENE 1 2 - 1006 1388 334 aa, chain - ## HITS:1 COG:FN0511 KEGG:ns NR:ns ## COG: FN0511 COG1052 # Protein_GI_number: 19703846 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 1 334 1 334 335 543 88.0 1e-154 MEKTKIIFFDIKDYDKEFFKKYADNFNFDMTFLKGKLTEETVHLTKGYDVVCAFTNDVIN KANIDVMANNGIKLLAMRCAGFNNVSLKDIHNRFKVVRVPAYSPHAIAEYTVALILAVNR KIHKAYVRTREGNFSINGLMGFDLNGKTAGIIGTGKIGQILIKILKGFNMKVIAYDLYPN QKAAEELGFEYVSLDELYAQSDIISLNCPLTKETQYMINRKSMLKMKDGVILVNTGRGML IDSADLVEALKDKKIGAVALDVYEEEEDYFFEDKSTQVIEDDILGRLLSFYNVLITSHQA YFTQEAVDAITLTTLNNIKDFVEGKELVNEVPQN >gi|224461330|gb|ACDC01000072.1| GENE 2 1127 - 1972 1152 281 aa, chain + ## HITS:1 COG:FN0508 KEGG:ns NR:ns ## COG: FN0508 COG4667 # Protein_GI_number: 19703843 # Func_class: 
R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Fusobacterium nucleatum # 1 281 1 281 281 440 80.0 1e-123 MKIGLVLEGGGMRGLFSAGVLDALLELKELSVNGIVGVSSGALFGVNYVSKQKERAVRYN KKYADDKRYMGLHSWITTGNAVNKDFAFYELPYKLDVFDNETFKKAKTDFYVVMTNVESG KPEYVLIEDAFAQMEYLRATSALPFASKIIEINGKKYLDGGISDSIPIDFCESLGYDKII AVLTRPEGTYKEDKLGFLYKLVYRKYPNLVNSLLNMATDYEKVLAKIKDLENKGEIFVVR PPEVLKIGRLEKNRDKIQKVYDTGLNTGLKELDNIVKYLNK >gi|224461330|gb|ACDC01000072.1| GENE 3 2284 - 3993 2076 569 aa, chain + ## HITS:1 COG:FN0506 KEGG:ns NR:ns ## COG: FN0506 COG0018 # Protein_GI_number: 19703841 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 569 1 569 569 1009 91.0 0 MKIIIKELTDIFQNLVDNLFPNKELKPVEITVATNENFGDYQCNFAMINSKIIGDNPRKI AEEIKAKFPYGEIVEKLEVAGPGFINIFLTDKYISDSIKKIGEAYDFSFLNRKGKVIIDF SSPNIAKRMHIGHLRSTIIGESIARIMRYLGYDVVADNHIGDWGTQFGKLIVGYRKWLNR EAYEKNAIEELERVYVKFSEEAEKDPSLEDLARAELKKVQDGEEENTKLWKEFITESLKE YNKLYKRLDVHFDTYYGESFYNDMMADVVKELEEKKLAVDDDGAKVVFFDEKDNLFPCIV QKKDGAYLYSTSDIATVKFRKDNYDVNKMIYLTDARQQDHFKQFFKITDMLGWDIEKYHI WFGIIRFADGILSTRKGNVIKLEELLDEAHSRAYDVVNEKNPNLSEEEKQNIAEVVGVSS VKYADLSQNKQSDILFEWDKMLSFEGNTAPYLLYTYARIQSILRKIDEQNIELNDSVEIK IENKIERSLATHLLTFPISVLKAAETFKPNLIADYLYDLSKKLNSFYNNCPILNQDIDTL KSRALLIKKTGEVLKEGLSLLGIPVLNKM >gi|224461330|gb|ACDC01000072.1| GENE 4 4010 - 4738 956 242 aa, chain + ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 1 241 1 241 243 433 87.0 1e-121 MSKKPIIGISSSVIVDDSGNFAGYKRAYVNKDYVDAVVRAGGVPLIIPFTTDKEVIISQV QVIDALILSGGHDVSPYNYGQEPNPKLGETFPERDTYDMLLLEESKKRNIPILGICRGSQ IINVAAGGTLYQDLSLIPGNVLKHNQVSKPTLKTHKIQIEENSIISEIFGKETMVNSFHH QALDKVADDFKVVARASDGIVEAIQHKTYKFLVGVQWHPEMLAVECDEARELFKRLIEEA KK >gi|224461330|gb|ACDC01000072.1| GENE 5 4756 - 6144 
1693 462 aa, chain + ## HITS:1 COG:FN0504 KEGG:ns NR:ns ## COG: FN0504 COG0531 # Protein_GI_number: 19703839 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Fusobacterium nucleatum # 10 462 1 453 453 607 89.0 1e-173 MGNNQNNEKMKFWSIVLLTINGIIGTGIFLSPGAVAKLVGDKAAMIYLAAAAFAAVLAVS FAAASKYVVKSGAAYAYSKAAFGDEVGSYVGITRVVSASIAWGVLATGVVKTALSIFGKD SSDMKTVTIGFITLMVVLLIINLIGTKLLTIISNISTIGKIGALGITIIAGIFILVFSDG ANLQDLTILKDADGKNIIPEFTTSVFVTALVGAFYAFTGFESVASGSADMEKPEKNLPRA IPLAIGIIACIYFGIVFVSMYIDPVAMVTSKEPVVLASVFKNQILQKIIIIGALMSMFGI NVAASFLTPRVFEAMAQEKQVPEFFTKRTKGGLPLTSFILTAGIAVIIPLAFNYNMAGII IISSISRFIQFIIVPLAVITFYFGKSKEDILNANKNVITDVIIPIIGLFLTILLLVKFNW AQQFSTKLDDGTTTLNIKAVVSMVIGYVILPICLRIYMRGKK >gi|224461330|gb|ACDC01000072.1| GENE 6 6242 - 7129 1052 295 aa, chain + ## HITS:1 COG:FN0503 KEGG:ns NR:ns ## COG: FN0503 COG0583 # Protein_GI_number: 19703838 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 295 8 302 302 422 86.0 1e-118 MDLHYLEIFYEVAKAKSFTKAAEKLFINQSAVSIQVKKFEDILKVKLFDRSSKKIKLTYT GETLYKMAEDIFEKVKRAEKEISRVIEFDRARIAIGASAIIAEPLLPSLMKEFSSIHEEI EYNITMSNKEHLLKLLKEGELDVIIIDSQHITDSNLEIIPIEKGPYVLISSKHYDNIKDI EKDAIITRDVIQNNNKAIEYIEDKYGVNFTKKINVLGNLEVIKGLVREGIGNVILPYYSV YKDIRKGTFKVIAKIDEIKDGYELIITKDKKDLSQITKFIDLVKSHKIVMEGSRN >gi|224461330|gb|ACDC01000072.1| GENE 7 7143 - 7727 899 194 aa, chain + ## HITS:1 COG:FN0502 KEGG:ns NR:ns ## COG: FN0502 COG0279 # Protein_GI_number: 19703837 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Fusobacterium nucleatum # 1 194 1 194 194 311 92.0 5e-85 MNLITSYKTELELLKKFIEEEEERKETEKVAKKLADIFTKGNKVLICGNGGSNCDAMHFI EEFTGRFRKERRALPAISISDPSHITCVANDYGFDYIFSKGVEAYGKEGDMFIGISTSGN SANVIKAVEQAKAQGLVTVALLGKDGGKLKGQCDYEFIVPGKTSDRVQEIHMMILHIIIE GVERIMFPENYEGE Prediction of potential genes in microbial genomes Time: Thu May 19 23:20:04 2011 Seq name: gi|224461329|gb|ACDC01000073.1| Fusobacterium sp. 
2_1_31 cont1.73, whole genome shotgun sequence Length of sequence - 2434 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 2 - 412 241 ## COG2856 Predicted Zn peptidase 2 1 Op 2 . - CDS 372 - 839 701 ## COG1396 Predicted transcriptional regulators - Prom 934 - 993 13.7 + Prom 916 - 975 9.9 3 2 Tu 1 . + CDS 1019 - 1522 569 ## FN2064 hypothetical protein + Term 1578 - 1616 -0.9 + Prom 1625 - 1684 8.1 4 3 Op 1 . + CDS 1725 - 2123 617 ## FN2052 hypothetical protein 5 3 Op 2 . + CDS 2137 - 2434 508 ## Predicted protein(s) >gi|224461329|gb|ACDC01000073.1| GENE 1 2 - 412 241 136 aa, chain - ## HITS:1 COG:FN2066 KEGG:ns NR:ns ## COG: FN2066 COG2856 # Protein_GI_number: 19705356 # Func_class: E Amino acid transport and metabolism # Function: Predicted Zn peptidase # Organism: Fusobacterium nucleatum # 1 136 1 136 138 199 75.0 8e-52 MTERRKKEILKLIDNLYFEFGTKNPISICKGLGIEIVSANIEMKGLYTVVLNSKLIVVQS LLEGFAKLFVIGHELFHALEHDCDEIRFFREHTGFKTSIYEEEANFFSVQLLKDYIEYHQ DEVADLEIAEEIEKFI >gi|224461329|gb|ACDC01000073.1| GENE 2 372 - 839 701 155 aa, chain - ## HITS:1 COG:FN2065 KEGG:ns NR:ns ## COG: FN2065 COG1396 # Protein_GI_number: 19705355 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 155 1 155 155 176 70.0 2e-44 MKLVSNFAERLKLALKLRNMKATKLSELTNVNKSTISQYLSGEYEAKKDRIELFAEVLNV NELWLRGYDLPMENEDDKEKDILIKEYQLSADEIREYENIAMTTSTLMFNGKPVSEEDKN ELEKVLKEFFIRALLKKRADENNDGKKKKRNSKID >gi|224461329|gb|ACDC01000073.1| GENE 3 1019 - 1522 569 167 aa, chain + ## HITS:1 COG:no KEGG:FN2064 NR:ns ## KEGG: FN2064 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 164 1 164 167 265 85.0 3e-70 MKNIHTNFLAEYILKLSGEYASANRIHDILNISLSYTYTLVKNNKVRSRVKNGRTEYNME DFIRSLELSYNNNIVETPLTKEEFDANNFHNWEAKNDIEKYLERLLLDELGQFTCIKDLV 
ELFKVSKTMWYDALDEGKIMYFTISSRKIIITRSLLPFLREALSMQD >gi|224461329|gb|ACDC01000073.1| GENE 4 1725 - 2123 617 132 aa, chain + ## HITS:1 COG:no KEGG:FN2052 NR:ns ## KEGG: FN2052 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 132 1 119 119 115 85.0 6e-25 MKIKFILATMLALGSLSYSAEVTDTVAQEVISEVRNIEAEYQALMQKEAERKEEFIQEKA NLENEVKELKEKQLGREELYAKLKEDSKIRWHRDEYKKLLKRFDEYYNKLEQKIADKEQQ ILELTKLLEVLN >gi|224461329|gb|ACDC01000073.1| GENE 5 2137 - 2434 508 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKFLKTILFLCALSSIAYAEDDAMAILNKKRAEIEKAEKAKAKLAKEAEEKARKEAEEQ AKLAEKEAKEQARLAEEQAKSQVVEVVEAPSEAVVATEN Prediction of potential genes in microbial genomes Time: Thu May 19 23:20:17 2011 Seq name: gi|224461328|gb|ACDC01000074.1| Fusobacterium sp. 2_1_31 cont1.74, whole genome shotgun sequence Length of sequence - 6638 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 6552 10503 ## FN1554 hypothetical protein + Term 6583 - 6629 6.6 Predicted protein(s) >gi|224461328|gb|ACDC01000074.1| GENE 1 1 - 6552 10503 2183 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 622 2183 5 1582 1582 1480 58.0 0 NKEINGLRLELIQLMEQGNQVVKSPWSSWQFGANYFYEDWGGAYKGRGDKKEKYPFEGVF TRSTDPFERYTSPISSNYSLLPTSTDPYSATTSSRKGLRSTYGLASTTKKQEPLAEMNVD ASIRPKEVYRAPIIAPTVNITAPILQALNVPNLLPPSLDIKTPDAVEAPNKVPTINAQPV VSFQFGGRSYIQGEGTMMRDLVDGNNQVFWSGWNPATNTIEANSSYQLNAYGGTPVLNNV RVPNVFYLNTGKRLPHNPNSTWLFKDATVHVAGKQGWNPIGTIAIHTVWNGKIENVHGYL HGHSTLISSETWHSGKVQLDNVDVTVTGEYNSVFYGYPSSYYSLGINDGDYNSWHQRGEY NGSLTANINEKNNYIYTVLGVQGAFKIKSTGDYNINSDSDIVYLGLGYSPNWENLRGSGS VKDQPYAKVDADMTPSIQLSKTNVNGNKNAILYFGNRYGSAATPHFDGNRTPVGNNDNWA KSIIGIYQGEIDVNANIGTSSISVGNVGVYSKSGQRIGIVPSEDLGAPNAAARAAVSNLH PDYDLDKVHNLEIGAAKIFFGKYSTDGVMFASENGTVMDIGKSKKADKNATGETVYVTEV KDAVSAQTEIRDAATNVISSYNDTANKAATGTVIAYSTGVWNNTDMPGGTNAGLAGKGSE INLYKPVVMTGRAKLDSSNKLNRSVALIGDNKGIVNAEKNVTALGYSSIVALAQNNGIVN AKGNIIAKDGAAATDVATKPYLYNNIGIYAGKDGTVNVTGNADIYGIGAMASGANAVANL NGNANTIKTGKSGALVATNGGVVKFGGGTITHSENFTGDHDSSTPFTADSSSHINFTGTT TLNISDGILIPGTKVDYAAATGTTAKYNGMSNVTVNLTGDNVVLASNNGINKVWDGTTIS NLVKNTMKVSAFNENGHSYKIYYINGTFVIDTNVNVGNATDDFNKVGLSREVVTINAGKT VSSTVGKGLAMGSNDKSIADGDNSKTQYINNGKVDITGGSLSTGTIGLNISYGQIHNNNI INVANGIGAYGINGSTLINDTTGKVNITTQGVGMAAFTSASPLQTYGTDKKINDGTLTAS DKTFEIINKGQITVNGDKSIGLYGNTNGTSTLLSASNGYITNSGKLTLTGDETVGIVSKR ATVTLNGTGSSDIVVGKKGIGVYAEKSPVTMSSNYGVEVKDGGTGIFVKNDGSTLSSSSN ELELKYSGSNMGTGVGIYYEGAIGSNIINGTKVKLVDTSGTTAGLVGLYTAGGGKLTNTA IISGDKGYGIITNGTEIINTGTVTLTNPLTSSKPSVGILTQAGDDITNTGTVSLGDNSVG IFGKKVLNTGIITVGNGGTGIYSEGGNVDLSATSKINIGANKAVGVFTKGNGQVITANAG STMTIGDSSFGFLNEGKGNTINSNIVSQTLGNDVTYVYSSDRTGIINNNTTLTSTGSYNY GLYSAGTVTNNADVNFGTGLGNVGIYSTYGGTARNLAGRSVTVGASYVDPNNSLNNRYAV AMAAGFNGDGIPSKAYTGNVVNEGTINVTGPYSIGMYGTEAGTKVYNGTSVGSTATINLG ADNTTGIYLDNGAYGYNYGTIRSVGSGLKKVVGVVVKNGSTIENHGRIEITAEDAVGILS 
KGNAAGANPGIIKNYGTFNINGKSDPNDPTVIKKASGGQDLGKTMGNVTINAPAGSTVGT ITIAGKPVVPTLATTSAEEYKDMAVSKIGMYIDTSNKRFTNPISGLSALSGLTTSDLIMG NEATENTTSKAIQVDQRILSPYNTMIIQNPQIKKWNIYSGSLTWMATITQNQIDGTMQSA YLAKIPYTQWAGNEPTPVDKKDTYNFLDGLEQRYGVEKIGTRENRVFQKLNSIGNNEEIL FYQATDEMMGHQYANVQQRIQATGDILNKEFNYLRNEWSNPSKDSNKIKTFGTNGEYKTS TAGVIDYKNNAYGVAYVHEDETVRLGESTGWYAGIVHNRFRFKDIGNSREEMLQAKFGLF KSVPFDENNSLNWTISGDIFAGYNKMHRKFLVVDEVFNAKGRYHTYGLGLKNEISKKFRL SESFTLKPYAALGLEYGRVSKIKEKSGEIKLDVKSNDYFSVRPEIGAELGFKHYFDRKTI KVGVSVAYENELGRVANGKNKARVAGTNADWFNIRGEKDDRTGNVKADLNIGIDNQRVGV TANVGYDTKGHNVRGGVGFRVIF Prediction of potential genes in microbial genomes Time: Thu May 19 23:20:50 2011 Seq name: gi|224461327|gb|ACDC01000075.1| Fusobacterium sp. 2_1_31 cont1.75, whole genome shotgun sequence Length of sequence - 27175 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 9, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 43 - 6201 9443 ## FN0387 hypothetical protein + Term 6219 - 6261 9.8 + Prom 6230 - 6289 7.2 2 2 Op 1 . + CDS 6330 - 7052 547 ## FN1870 hypothetical protein 3 2 Op 2 . + CDS 7074 - 7832 735 ## FN1870 hypothetical protein + Term 7843 - 7882 3.2 + Prom 7966 - 8025 11.3 4 3 Tu 1 . + CDS 8164 - 8280 98 ## + Term 8287 - 8326 1.9 + Prom 8349 - 8408 7.4 5 4 Op 1 . + CDS 8447 - 8833 696 ## FN1869 hypothetical protein 6 4 Op 2 . + CDS 8857 - 9672 1373 ## COG3246 Uncharacterized conserved protein 7 4 Op 3 . + CDS 9688 - 10725 1842 ## FN1867 Zn-dependent alcohol dehydrogenase and related dehydrogenase + Prom 10819 - 10878 6.8 8 5 Op 1 . + CDS 10928 - 12205 1792 ## COG1509 Lysine 2,3-aminomutase 9 5 Op 2 . + CDS 12209 - 13225 1110 ## FN1865 hypothetical protein 10 5 Op 3 . + CDS 13231 - 14691 1711 ## COG1193 Mismatch repair ATPase (MutS family) 11 5 Op 4 . + CDS 14693 - 16249 2584 ## FN1863 L-beta-lysine 5,6-aminomutase alpha subunit (EC:5.4.3.3) 12 5 Op 5 . 
+ CDS 16249 - 17046 1487 ## COG5012 Predicted cobalamin binding protein + Term 17093 - 17144 -0.9 + Prom 17079 - 17138 10.7 13 6 Op 1 . + CDS 17268 - 18845 2439 ## COG1757 Na+/H+ antiporter + Prom 18884 - 18943 7.4 14 6 Op 2 . + CDS 19025 - 20101 1699 ## FN1859 major outer membrane protein + Term 20142 - 20181 7.7 + Prom 20157 - 20216 11.4 15 7 Op 1 1/0.000 + CDS 20341 - 21720 1899 ## COG2031 Short chain fatty acids transporter + Prom 21730 - 21789 9.2 16 7 Op 2 21/0.000 + CDS 21812 - 22465 1074 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit 17 7 Op 3 . + CDS 22483 - 23145 1217 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit + Term 23153 - 23193 7.8 + Prom 23172 - 23231 10.3 18 8 Tu 1 . + CDS 23381 - 24484 1375 ## FN1859 major outer membrane protein + Term 24513 - 24549 3.1 19 9 Op 1 . - CDS 24527 - 25984 1502 ## COG4865 Glutamate mutase epsilon subunit 20 9 Op 2 . - CDS 25998 - 27173 1496 ## FN1854 methylaspartate mutase (EC:5.4.99.1) Predicted protein(s) >gi|224461327|gb|ACDC01000075.1| GENE 1 43 - 6201 9443 2052 aa, chain + ## HITS:1 COG:no KEGG:FN0387 NR:ns ## KEGG: FN0387 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 472 2052 159 1724 1724 1122 47.0 0 MEQGDQVVKSPWSSWQFGANYFYEDWGGSYKGRGDKAKEKLDLTKKTDPLERFKASSQMS STYGTTSLDLVYEPPREVEVSAGIRPKEVNKRAPGFVPPEPSGSLPPFEPKIIQPPKAPE VTPPRIDEPAVISYPGFGSSIQNYYSFWNGSPGATGNSSTMMDQIALTAGKFKVTSNEGA YIKGYQGVGGYKATSTTVEREVPTGSIPADGYHALARRFFFGIRNSAFIQWEPEVEISWF NKGTNNNQLIYMETSNISQQNLETLNAGGAFANSPGLYDAVKAYKTQTALRDDGDGATGI NLHNNKGTIYLGGNNTRYQATTTIGGTRLSVFNNEGTIVGMNVHEGNTGENGSTKQVVYF NTPDTSSDQRWIYANGTNGKINLYGSESIAMYYTSSSGSQQQYVKTGFVNEGLINLYGRN SSGVMVRDSETLKAGSGFYLNKPVNIYADGARGVYIAGNISNMQSANAIVRANIGFDKND TTGFDWTDIEDGTTKHFDSNGNITGKSTDFVDNAVGVMYTHANTNTKIQTPDITMEKFSK ESLGVYVSNGKLTVESGTDRSKININGGEGNVALYAKGGEIEYNGDITMGTSSLVTEGGN SSGKGNIGIVAENGHKVTLNGLLKTYNGNGSTRDGIGVYADNSDVSVTQAVDMKFVAGTT 
GENVGIYSKGPGTGVAGKTVTILGNGSKINIDGGTTGKGTALYSGAGGKIIADSTATTGG LQVTVTNGAAAIASDGTNSNVSAKFSNIDYSGEGYALYTSNGGKIDVTGSKISLRGNATG FERDVASATSPITFKDITGSNAATITAYSNNVTIMNLRNVPTLNLSTLNTNLQTYTGGVT HLAGTDPVTGEVYNKYKTAAVDGLSAYNIDTDLDKSVATDDANAATNDYVFTRRMAVQRG IINLKAGKNVKAILSTTDLSKIGETSVVGLSMNSSSYATSNAEAGINLEANTTVTADRTT AGNGAVGLYINYGKINTDASSIINVEKETSNPANDSAVGIYSVNGSEVTNAGQVNVGGQN SIGILGLAYRIDSATNNPILNEFGATALGQGKATVLNKGQVSLDGANATGIYIKNNNASA TRATAVGTNDTTGVITLTGDSSTGMSGDKATLENKGTININGQKSVGMFAKNSSELINSG TINLAAGLSENEPNIGIYAKDADTTVTNSKDIIGGNNTYGIYGKTVSLTGTGKIELGDAS VGIFSDAEYTSTPTVATIDLANGSKLKVGAKESVGVFATGKNQNISSKGDIEVGDTSFAF VVRGTGTTLKTDNTNGVTLGNDATFIYSNDTTGNIENKTALTATGSKNYGIYAAGTATNL ADMNFGTGVGNVGMYSIGGGTVTNGSATVSPTITVSASDVVNKLYGIGMAAGYVNDAGTL VSTGNVVNYGTIKVEKDNGIGMFATGSGSTATNRGTIELSGKKTTGMYLDNNAIGYNYGT ITTVPNPSNDGIVGVVASNGAIIKNYGTINIVDGSNLTGVFINKGTEAANYDNQIPGGGT GVLNGKIEVKKQSATGKTVAGIDIKAPGDGTATIYRDGAQVTPIAVDTVTATPQPLSVNV GTTSLDLSATDLATPSLGQASSIGMYVDTSGVNYTNPIQGLNLLTGLQKVDLIFGTEASK YTNEKDIEVGQNILKPYNDVITSLSGGTSMKFSFTSGSLTWIATATQNTDDTLKALYLSK IPYTAFAKDKNTGNFLAGLEQRYGVEGLGTREKALFDKLNGIGKGEAALFAQAVDQMKGH QYSNVQQRIQATGNILDREFDYLKGSWSNPTKDSNKIKTFGARGEYNTDTAGVEDYKNHA YGVAYVHEDETVRLGESTGWYAGVVENKLKFKDLGNSKEEQLQGKIGMFKSVPFDENNSL NWTISGDVFVGYNKMNRRFLVVDDIFGAKSRYYTYGLGVKNEISKSFRLSEGFSFTPHAG LNLEYGRFSKIREKSGEMRLEVKANDYFSIRPEVGADLAYKYSFGNNNLKVSVGVAYENE LGKVANANNKARVAYTTADWYDLRGEKEDRRGNVKTDLNIGWDNQRVGVTANVGYDTKGE NVRGGVGLRVIF >gi|224461327|gb|ACDC01000075.1| GENE 2 6330 - 7052 547 240 aa, chain + ## HITS:1 COG:no KEGG:FN1870 NR:ns ## KEGG: FN1870 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 240 6 242 242 267 68.0 2e-70 MYYHIVKQEGLKYKNLDLKEVEEVVKKEKLNFNNSRIYSSTEALEHTVERFKQNENSLEE VVDRNEIRITDIIPCKNKSEYLHNCSYDFFMKILNSNSQPRWKVFFNKTFIIWIMLLSTV LNRYLFYFLYKYYYKFSTYKLLFGFNVKPIWYVIFIYVGIPLSFILLIWYRDKYYKKNYL LMFIVLMVIINQIVDYILGDTINKLVVNFGGLLAIFIGLILTQLFYNLFRRLSYNRYKDF >gi|224461327|gb|ACDC01000075.1| GENE 3 7074 - 7832 735 252 aa, chain + ## HITS:1 COG:no 
KEGG:FN1870 NR:ns ## KEGG: FN1870 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 238 6 242 242 99 34.0 9e-20 MYYHLITKDGLKYKDLNIEEIEKILREEEVSYNNARIYSSEKKLNEVYENYKEELIEEGE DTSFENIKIEKLISMAEAKNDLQNKSYKFFLDIKNPSGKINFMNFLNKTFIIWLMVISSV VKWVLVNRYYLAGEMEFSRVTSIEKLLDSGMLLIVILVFLEDKYFKKEYFVICILANIVT NLLVYISIKLVTKLLIFISARSEIMLLILLLISIIKAIIIQFIYNYAREKSYREYKDLNT IHINANRIAKLF >gi|224461327|gb|ACDC01000075.1| GENE 4 8164 - 8280 98 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKNRGAYMRGSMKMLRYKPPMLHTMSIVAERNLVAYD >gi|224461327|gb|ACDC01000075.1| GENE 5 8447 - 8833 696 128 aa, chain + ## HITS:1 COG:no KEGG:FN1869 NR:ns ## KEGG: FN1869 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 128 1 128 128 250 97.0 1e-65 MKSLIRLRMSSHDAHYGGNLVDGARMLQLFGDVATELLIQLDGDEGLFKAYDSVEFMAPV FAGDYIEAEGEIINVGNSSRKMKFEARKVIVPRPDLSDSAADVLAEPIVVCRATGTCVTP KDKQRGKK >gi|224461327|gb|ACDC01000075.1| GENE 6 8857 - 9672 1373 271 aa, chain + ## HITS:1 COG:FN1868 KEGG:ns NR:ns ## COG: FN1868 COG3246 # Protein_GI_number: 19705173 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 271 2 272 272 531 96.0 1e-151 MEKLIITAAICGAEVTKENNPAIPYTVEEIAREAESAYKAGASIIHLHVREDDGTPTQDK ERFRKCMEAIREKCPDVIIQPSTGGAVGMSDLERLQPTELHPEMATLDCGTCNFGGDEVF VNTENTIKNFGKILIERGVKPEIEVFDKGMVDYAIRFQKQGFIQKPMHFDFVLGVQMTAS ARDLVFMVESIPEGSTWTVAGVGRHQFQMAALAIVMGGHVRVGFEDNVYIDKGVLAKSNG ELVERVVRLAKELGREIATPDEARQILSLKK >gi|224461327|gb|ACDC01000075.1| GENE 7 9688 - 10725 1842 345 aa, chain + ## HITS:1 COG:no KEGG:FN1867 NR:ns ## KEGG: FN1867 # Name: not_defined # Def: Zn-dependent alcohol dehydrogenase and related dehydrogenase # Organism: F.nucleatum # Pathway: not_defined # 1 345 1 345 345 602 94.0 1e-171 MKKGCKYGTHRVIEPAGVLPQPAKKISNDMEIFSNEILIDVIALNIDSASFTQIEEEAGH DVEKIKAKIKEIVAEKGKMQNPVTGSGGMLIGTIEKIGDDLVGKTDLKVGDKIATLVSLS 
LTPLRIDEIKDIKPDIDRVEIKGKAVLFESGIYAVLPKDMSETLALAALDVAGAPAQVAK LVKPCQSVAILGSAGKSGMLCAYEAVKRVGPTGKVIGVVRNEKEKALLQRVSDKVRVVIA DATKPMDVLHAVLEANDGKEVDVAINCVNVANTEMSTILPVKEFGIAYFFSMATSFTKAA LGAEGVGKDITMIVGNGYTVDHAAITLEELRESAALREIFNELYL >gi|224461327|gb|ACDC01000075.1| GENE 8 10928 - 12205 1792 425 aa, chain + ## HITS:1 COG:FN1866 KEGG:ns NR:ns ## COG: FN1866 COG1509 # Protein_GI_number: 19705171 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Fusobacterium nucleatum # 1 425 1 425 425 856 97.0 0 MNTVNTRKKFFPNVTDEEWNDWTWQVKNRIEKIDDLKKYVELSAEEEEGVVRTLETLRMA ITPYYFSLIDMNSDRCPIRKQAIPTIQEIHQSDADLLDPLHEDEDSPVPGLTHRYPDRVL LLITDMCSMYCRHCTRRRFAGSSDDAMPMDRIDRAIEYIAKTPQVRDVLLSGGDALLVSD KKLESIIQKLRAIPHVEIIRIGSRTPVVLPQRITPELCNMLKKYHPIWLNTHFNHPQEVT PEAKKACEMLADAGVPLGNQTVLLRGINDSVPVMKRLVHDLVMMRVRPYYIYQCDLSMGL EHFRTPVSKGIEIIEGLRGHTSGYAVPTFVVDAPGGGGKTPVMPQYVISQSPHRVVLRNF EGVITTYTEPENYTHEPCYDEEKFEKMYEISGVYMLDEGLKMSLEPSHLARHERNKKRAE AEGKK >gi|224461327|gb|ACDC01000075.1| GENE 9 12209 - 13225 1110 338 aa, chain + ## HITS:1 COG:no KEGG:FN1865 NR:ns ## KEGG: FN1865 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 19 338 1 320 320 529 90.0 1e-149 MLDTYKFIEKYKRISIIGMEKNVGKTTLLNKLIADIGTNKKLGLTSIGRDGEDIDVVTNT DKPRIYVRRGSIIATGRNCLTKCDITKEILYVTDFTTPMGSIVIVRALSDGYVDIAGPSY NKQVKIVVELMEKFGSEISIVDGALGRKSTAISDVSEATILSTGAALSLDMLKVVEETKK TVYFLKLDEAEKNIKEKVEKLRNEKAVLFYKNGEVAILEVDNSIDLSNILKEYLKKDLEY FYIRGAITPKIIETFINNRGSYEKITLLAEDGTKFFLSSSLLDKAKLSGMEFQVLNKINL LFVTINPHSPLGVDFNKEEFKNRLQNEVSVPVINVLGD >gi|224461327|gb|ACDC01000075.1| GENE 10 13231 - 14691 1711 486 aa, chain + ## HITS:1 COG:FN1864 KEGG:ns NR:ns ## COG: FN1864 COG1193 # Protein_GI_number: 19705169 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 486 1 486 487 672 81.0 0 MKFIDENSLNRLNFKDLLARVDVFSAYGKNKLNNLENFLVGEEEKLEEEFERMQKIYDFI 
SENKKEEMEIEIVLHRFKDIKKLIENAEAGIILDTVDIFEIKAQLMAMVDLNSYLLKNKE VFSNFVLKDMNELFKILDPNDEKIATFYIYEAYSVILKEIRRQKKEVENRLFNETDYEIV KRLKDERLSILVDEEKEEFKIRKNLTKAIKSYAEDFITNVEKISNLDFIIAKVKFAKEYN GIRPEVSKKKEIILEDAINLEVKELLEAKNKKYTPISINLNVGTTMITGANMGGKSVALK TIAENVLLFQMGFFVFAKYASIPLLDFIFFVSDDMQDISKGLSTFGAEIIKLKEINSYVK NGTGLIVFDEFARGTNPKEGQKFVKALAKYLNDKSSISIITTHFDSVVENNMKHYQVVGL KNLDFEKLKTKLKVNNSLETIQDNMDFTLEESTDTEVPKDALNIAKLIGLDDEISEMIYK EYEMEE >gi|224461327|gb|ACDC01000075.1| GENE 11 14693 - 16249 2584 518 aa, chain + ## HITS:1 COG:no KEGG:FN1863 NR:ns ## KEGG: FN1863 # Name: not_defined # Def: L-beta-lysine 5,6-aminomutase alpha subunit (EC:5.4.3.3) # Organism: F.nucleatum # Pathway: Lysine degradation [PATH:fnu00310] # 1 518 1 518 518 1019 96.0 0 MGKLDLDWGLVKEARESAKKIAADAQVFIDAHSTVTVERTICRLLGIDGVDEFGVPLPNV VVDYIKDNGNITLGVAKYIGNAMIETKLQPQEIAEKIAKKELDITKMQWHDDFDIKLALK DITHATVDRIKANRQARENYLEQFGGDKKGPYLYVIVATGNIYEDVTQAVAAARQGADVV AVIRTTGQSLLDFVPYGATTEGFGGTMATQENFRIMRKALDDVGVELGRYIRLCNYCSGL CMPEIAAMGALERLDMMLNDALYGILFRDINMKRTLVDQFFSRIINGYAGVIINTGEDNY LTTADAIEEAHTVLASQFINEQFALVAGLPEEQMGLGHAFEMEPGTENGFLLELAQAQMA REIFPKAPLKYMPPTKFMTGNIFKGHIQDALFNIVTITTGQKVHLLGMLTEAIHTPFMSD RALSIENAKYIFNNLKDFGNDIEFKKGGIMNTRAQEVLAKAADLLKTIETMGIFKTIEKG VFGGVRRPIDGGKGLAGVFEKDSTYFNPFIPLMLGGDR >gi|224461327|gb|ACDC01000075.1| GENE 12 16249 - 17046 1487 265 aa, chain + ## HITS:1 COG:FN1862 KEGG:ns NR:ns ## COG: FN1862 COG5012 # Protein_GI_number: 19705167 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Fusobacterium nucleatum # 1 262 1 262 263 500 96.0 1e-141 MSSGLYSTEKRDFDTTLDLTKLRPYGDTMNDGKVQMSFTLPVPCNEKGVEAALELARKMG FVNPAVAFSEALDKEFSFYVVYGATSYNVDYTAIKVQALEIDTMDMHECEKYIEENFGRE VVMVGASTGTDAHTVGIDAIMNMKGYAGHYGLERYKGVRAYNLGSQVPNEEFIKKAIELK ADALLVSQTVTQKDVHIENLTNLVELLEAEGLRDKIILIAGGARITNDLAKELGYDAGFG PGKYADDVATFVLKEMVQRGMNTKK >gi|224461327|gb|ACDC01000075.1| GENE 13 17268 - 18845 2439 525 aa, chain + ## HITS:1 COG:FN1860 KEGG:ns NR:ns ## COG: FN1860 
COG1757 # Protein_GI_number: 19705165 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 1 523 1 523 525 850 90.0 0 MFLKLWKGIQGFSELSLLGKAGVVIGVLIVLGILFGIVTKKKFREGFLKLSPVIVLAELM MDEFDALLAAPIATIFACIIAMIFSNQKFSTVIDHAIDNVKEIQVALFILMAAYAMAEAF MSTGVGASLILIALKVGITAKTVAVVGAIVTSILSIATGTSWGTFAACAPIFLWLNHIVG GNLLLTTAAIAGGACFGDNIGLISDTTIVSSGIQRVEVIRRIRHQGVWSALVLLSGIILF AVAGFTMGLPSTAGDPVEAINSIPADVWTALAEKREAAVKLLEQVKNGVPLYMAIPLVIV LVLAFMGTQTFICLFAGLFFAYLFGMMAGTVTSTMDYLNMMMGGFASAGSWVIVMMMWVA AFGGIMKSMNAFEPVSKLLSKISGSVRQLMFYNGLLCVFGNATLADEMAQIVTIGPIIRE MVEENVEGSEEDLYTLRLRNATFSDAMGVFGSQLIPWHVYIAFYMGIATVVYPLHEFVAI DIIKYNFIAMIAVASILILTLTGLDRLIPLFKLPSEPAVRLKKNI >gi|224461327|gb|ACDC01000075.1| GENE 14 19025 - 20101 1699 358 aa, chain + ## HITS:1 COG:no KEGG:FN1859 NR:ns ## KEGG: FN1859 # Name: not_defined # Def: major outer membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 358 1 368 368 497 72.0 1e-139 MKKLALVLGSLLVVGSVASAKEVMPAPAPAPEKVIEYVEKPVIVYRDREVTPAWRPNGSV DVQYRWYGETENKTKKEDTDGDWAAGRNNAGRLQTLTKVNFTEKQSLEVRTRNYQALRGT KGDSSDDQIRLRHFYNFGNLGSSKVNATSRLEYKQNGRDGGKKAEASVFFDFADYIYSNN FFKVEKFGLRTGYAHKWAGHDNDNTVERALVNFESEYTLPLGFSAELNVYNYYDWHHDKL AYADKKHEYNGELEAYIYQHTPLYKANNVELSFDFEGGYDPYAWHQHKVVDNGSGKGERA SYSVYMLPTFQVAYKPTEFVKLYAAAGAEYRNWAVEANSTAKNWRWQPTAWAGMKVSF >gi|224461327|gb|ACDC01000075.1| GENE 15 20341 - 21720 1899 459 aa, chain + ## HITS:1 COG:FN1858 KEGG:ns NR:ns ## COG: FN1858 COG2031 # Protein_GI_number: 19705163 # Func_class: I Lipid transport and metabolism # Function: Short chain fatty acids transporter # Organism: Fusobacterium nucleatum # 1 459 1 458 458 755 89.0 0 MESVKEKKGIFKRFTSMCVRVMERWLPDPFIFCALLTFLVFGGALIFTKASFMGVIGYWV DGFWSLLAFSMQMALVLVTGHALASSRLFKKMLETFASGIKGPKQAILIISLVSGIACVL NWGFGLVIGALFAKEIAKKVKGVDYRLLIASAYTGFLVWHGGLSGSIPLQLASGNPESLA QQTAGAVTAAIPTSQTMFSPMNIFIVVGLLIIVPLLNVAMFPSKDEVVEVNQELLVEAKE EVLDKSKMTPAERIENSRVVSILLSIMGFAYIGQYLYTKGFALNLNLVNFIFLFLGILLH 
GTPRRYLNALAEAIKGAGGILLQFPFYAGIMGIMVGADADGMSLAKLMSNFFVNISTEKT FPVFSFISAGLVNFFVPSGGGQWAVQAPIVMPAGQAIGVSAAKSAMAIAWGDAWTNMIQP FWALPALGIAGLGAKDIMGYCLIVTIVSGLFICTGFLLF >gi|224461327|gb|ACDC01000075.1| GENE 16 21812 - 22465 1074 217 aa, chain + ## HITS:1 COG:FN1857 KEGG:ns NR:ns ## COG: FN1857 COG1788 # Protein_GI_number: 19705162 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: Fusobacterium nucleatum # 1 217 1 217 217 362 86.0 1e-100 MRKKIVSMEEAISHIKDGMTVHFGGFLACGTAENIVTALIEKGVKDLTIVCNDTAFVDKG VGRLVVNNQVKKVIASHIGTNPETGKKMHDGTMEVELVPQGTLAERVRAAGYGLGGVLTP TGLGTVVQEGKSIVNVDGKDYLLEKPIKADVAVLFGSKVDEQGNVICEKTTKNFNPLMAT AADVVIVEALEIVPAGSLSPEHLDISKIFVDYIVESK >gi|224461327|gb|ACDC01000075.1| GENE 17 22483 - 23145 1217 220 aa, chain + ## HITS:1 COG:FN1856 KEGG:ns NR:ns ## COG: FN1856 COG2057 # Protein_GI_number: 19705161 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Fusobacterium nucleatum # 1 216 1 216 217 392 94.0 1e-109 MEMDKNLVREVIAKRVAQEFHDGYVVNLGIGLPTLVANYVGDMDVIFQSENGCIGVGPAP EKGKEDPYLINAGGGHITAAKGAMFFDSAYSFGIIRGGHVDATVLGALEVDEKGNLANWM IPGKKVPGMGGAMDLVVGAKHVIVAMEHTSNGGIKILKECKLPLTAVGVVNLIITEKAVF EVTDKGLVLKEITPYSSLEDIRATTEADFIVSDELKNKIK >gi|224461327|gb|ACDC01000075.1| GENE 18 23381 - 24484 1375 367 aa, chain + ## HITS:1 COG:no KEGG:FN1859 NR:ns ## KEGG: FN1859 # Name: not_defined # Def: major outer membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 367 1 368 368 394 56.0 1e-108 MKRFALLLGSLLVVSSIASAKEVMPAPTPEPEKVIEYVEKPVIVYRDREVKPAWKPNGYL DMRYTWYGETENKNVGEDKDRDWAASIANAGRLQSVLNLNFTEKQTLYVRTRNYNTLRGT DKKQSTIGSDTLRVRHYYNFGTLENTKVKAKSRLSYDQSNGDAGRKSLSASVFFDFADYL PSNEYVKVTSFGLRPGYTHSWTGHTNDSTYNKYSLDFESSYKLPAGFSAELNLYSNYTRY RDSFEVGTSGETKKGQFNGAMEAYLYHTLPLYKNDKLTVSLKSEGGYDAYDFHQYKTVKN RQGVRTDRRSYSLYLLPTLNVNYKVTDNVNLFAAAGAEYRNWKVSTESEAKNWRWRPTAW AGMKVSF >gi|224461327|gb|ACDC01000075.1| GENE 19 24527 - 
25984 1502 485 aa, chain - ## HITS:1 COG:FN1855 KEGG:ns NR:ns ## COG: FN1855 COG4865 # Protein_GI_number: 19705160 # Func_class: E Amino acid transport and metabolism # Function: Glutamate mutase epsilon subunit # Organism: Fusobacterium nucleatum # 1 485 1 485 485 793 84.0 0 MSITFKKIDKEDFLEMRKKFLENYKGLDDFDLETAFRFHKSLPYYKNFQKMLEKSIQDNR TVTEAYSKETLLEDLIKNLNSLHRVGQADFLSIIIDSHTRENHYENARTILNDSIKSNKS LLNGFPLITYGTKLARKIVNDVEVPLQIKHGSADARLLAEFAFLGGFSAFDGGGISHNIP FSKSVPLKDSLENWKYVDRLVGLYEENGIKINREIFSPLTATLVPPAISNSIQILESLLA VEQGVKNISIGVAQYGNITQDIASLLALQEQIQFYLDKFSFKDIHISTVFNQWIGGFPED ELKAYSLISYSATVSLFTKSNRIFVKNIDEYTKNSLGNTMINSLVLTKTILDIGNSQNLT NYEEVNLEKEQIKKETAQIIEKVFSICNGDLRKAIAEAFEDGIIDIPFAPSKYNIGKMMP ARDKEGMIRYLDIGNLPFETTIEEFHHNKIKERAQNENRKIDFQMTIDDIFAMSQGKLIN KKSRE >gi|224461327|gb|ACDC01000075.1| GENE 20 25998 - 27173 1496 391 aa, chain - ## HITS:1 COG:no KEGG:FN1854 NR:ns ## KEGG: FN1854 # Name: not_defined # Def: methylaspartate mutase (EC:5.4.99.1) # Organism: F.nucleatum # Pathway: Metabolic pathways [PATH:fnu01100] # 1 391 72 462 462 649 83.0 0 CSSAAGGLKIIAIGLVPELTTEAAKKAALSSGGRVVKTYAFRLSPEDMEEISSLDYDILL LTGGTNGGNREYILDNARTLAENNIKKPIIIAGNEEVKEEIAEIFRTHNIEYYSSENVMP VVNKINVLPVKEVIREVFMNNIIKAKGMESIQKIVGNIIMPTPTAVMMAAEVFSQDNDDT IVIDIGGATTDVHSIGAGLPKANNIQLKGMEEPYSKRTVEGDLGMRYSALALYEATSLNK IREYLGSKDSKINIRENFEFRHENPDFVAETKDDIIFDEMMAMLCTEIAIDRHVGTLESI FSPMGTLFVQSGKDLTDVKYVIGTGGIINNSRNPKKILDLCLYDENNPLDLKPKYPKFLV DKTYIMSAMGLLASDYPDIAYRIMKKYLVEI Prediction of potential genes in microbial genomes Time: Thu May 19 23:22:01 2011 Seq name: gi|224461326|gb|ACDC01000076.1| Fusobacterium sp. 2_1_31 cont1.76, whole genome shotgun sequence Length of sequence - 2421 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 211 255 ## FN1854 methylaspartate mutase (EC:5.4.99.1) - Prom 242 - 301 5.2 - Term 292 - 322 -0.6 2 1 Op 2 . 
- CDS 491 - 901 523 ## COG2185 Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) - Prom 992 - 1051 14.9 - Term 1012 - 1066 14.5 3 2 Tu 1 . - CDS 1090 - 1518 695 ## FN1852 hypothetical protein - Prom 1545 - 1604 8.4 4 3 Tu 1 . - CDS 1631 - 2413 332 ## PROTEIN SUPPORTED gi|162456259|ref|YP_001618626.1| putative ribosomal protein Predicted protein(s) >gi|224461326|gb|ACDC01000076.1| GENE 1 1 - 211 255 70 aa, chain - ## HITS:1 COG:no KEGG:FN1854 NR:ns ## KEGG: FN1854 # Name: not_defined # Def: methylaspartate mutase (EC:5.4.99.1) # Organism: F.nucleatum # Pathway: Metabolic pathways [PATH:fnu01100] # 1 70 1 70 462 100 78.0 2e-20 MSSRIYLSIDFGSTYTKLTAIDLDKEEIISTARATTTVKTNVLTGFNMAFEELTKDLKDK LKDYEIVKKV >gi|224461326|gb|ACDC01000076.1| GENE 2 491 - 901 523 136 aa, chain - ## HITS:1 COG:FN1853 KEGG:ns NR:ns ## COG: FN1853 COG2185 # Protein_GI_number: 19705158 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) # Organism: Fusobacterium nucleatum # 1 136 1 136 136 244 91.0 3e-65 MTKKKIVIGVIGSDCHTVGNKIIHNKLEESGFEVVNIGALSPQIDFINAALETNSDAIIV SSIYGYGELDCQGIREKCDEYGLKDILLYVGGNIASSNEDWEKTEKRFKDMGFNRIYKPG TPIEETISDLKKDFAL >gi|224461326|gb|ACDC01000076.1| GENE 3 1090 - 1518 695 142 aa, chain - ## HITS:1 COG:no KEGG:FN1852 NR:ns ## KEGG: FN1852 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 139 1 124 126 202 80.0 4e-51 MKKFLLALMLVGAVSAVAATKSAPKSQYPDGTYRGVYISGQETQVEVQFNLKNDVITDAK YRTLFYKGHDWLKEDDFIAKNDGYMKLLERITNKKIQDVMPTMYNSEEIEKGGATVREMK VRSALQYGLNVGPFKLPKKEAK >gi|224461326|gb|ACDC01000076.1| GENE 4 1631 - 2413 332 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|162456259|ref|YP_001618626.1| putative ribosomal protein [Sorangium cellulosum 'So ce 56'] # 62 259 1 204 207 132 40 3e-31 MKYSEDSAAEVDMNVIMNKKGEFIEVQGTGEESTFTRTELNGLLDLAEASIKRIINLQDK VIEQENLKIFLATGNKHKIEEISDIFSDIENVEILSIKDGVEIPEVIEDGTTFEENSKKK 
AVEIAKFLNMITIADDSGLCVDALNGEPGVYSARYSGTGDDLKNNEKLIENLKGLENRKA KFVSVITLAKPNGETFSFEGEILGEIVDNPRGNTGFGYDPHFYVEEYQKTLAELPELKNK ISHRAKALEKLKKELKNILL Prediction of potential genes in microbial genomes Time: Thu May 19 23:22:07 2011 Seq name: gi|224461325|gb|ACDC01000077.1| Fusobacterium sp. 2_1_31 cont1.77, whole genome shotgun sequence Length of sequence - 2785 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 513 796 ## COG0689 RNase PH 2 1 Op 2 1/0.000 - CDS 530 - 1459 1054 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 3 1 Op 3 . - CDS 1456 - 2730 1159 ## COG1541 Coenzyme F390 synthetase Predicted protein(s) >gi|224461325|gb|ACDC01000077.1| GENE 1 3 - 513 796 170 aa, chain - ## HITS:1 COG:FN1851_1 KEGG:ns NR:ns ## COG: FN1851_1 COG0689 # Protein_GI_number: 19705156 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Fusobacterium nucleatum # 1 170 1 170 242 292 94.0 2e-79 MLREDGRKFNEERKIKITKDVNIYAEGSVLIEVGNTKVICTASVSEKVPPFLRGTGKGWV TAEYSMLPRATNERNQREASKGKLTGRTVEIQRLIGRALRSAIDLEKLGERLITIDCDVI QADGGTRTTSITGGYVALALAIKKLLKEEILEENPLIANVAAISVGKIDS >gi|224461325|gb|ACDC01000077.1| GENE 2 530 - 1459 1054 309 aa, chain - ## HITS:1 COG:FN1850 KEGG:ns NR:ns ## COG: FN1850 COG0332 # Protein_GI_number: 19705155 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 1 309 1 309 309 522 82.0 1e-148 MRRIKFKGYGVVLPKNTVSFKNHIRYRISEGETQLQLAVAACEKALKNSNISINDIDCIV SASAVGVQPIPCMAALIHEKIAKGTSIPALDINTTCTSFITALDTMSYLLEAGRYERVLI ISCDVASSALNPNQKESFQLFSDGAVAFVVEKSDEEIGIIDSTLKTWSEGAHSTEIRGGL SNFHPKYYSESTKEEYMFDMSGKSILALCIKEIPKMFREFLENNKMKVSDIDMVVPHQAS VAMPIVMQKLGVAKGQFIDEVKEFGNMVSASVPMTLAHGLEQQKIKNGDIILLTGTAAGL TTNMMLIKI >gi|224461325|gb|ACDC01000077.1| GENE 3 1456 - 2730 1159 424 aa, chain - ## HITS:1 
COG:FN1849 KEGG:ns NR:ns ## COG: FN1849 COG1541 # Protein_GI_number: 19705154 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Fusobacterium nucleatum # 1 422 1 422 424 694 87.0 0 MNKILKIVSTFIKVRYFSKWTSRDKLLKYQDEQVEKHLKFLKENSPYFKTHQITDDFTMN KAFMMENFDELNTLGVKKDEAMEIALNSEKTRNFSQKYKDISVGLSSGTSGHRGMFITTP EEQGTWAGTILAKMLPKNDIFGHKIAFFLRADNDLYKAINSFLISLEYFDTFKDIDEHIE RLNKYLPTMIVAPPSLLLVLAKKIEEEKLNISPKRLISVAEILEKADEEYIKKQFNLKII HQIYQATEGFLACTCEYGHLHLNEDLIKFEKQYIDEKRFYPIITDFRRTSQPFVKYYLND ILVENSEPCQCGSVLQRIEKIEGRSDDIFKFTNKFGKEIVVFPDFIRRTILFVENIREYQ VFQINDKLLEVAILNISDEQKELVKNEFNKLFTSLNIENIEIKFINYEIDKTKKLKRIVR KVEK Prediction of potential genes in microbial genomes Time: Thu May 19 23:22:16 2011 Seq name: gi|224461324|gb|ACDC01000078.1| Fusobacterium sp. 2_1_31 cont1.78, whole genome shotgun sequence Length of sequence - 32180 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 6, operones - 5 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 4 - 825 826 ## COG0491 Zn-dependent hydrolases, including glyoxylases 2 1 Op 2 . - CDS 788 - 1771 1167 ## COG0451 Nucleoside-diphosphate-sugar epimerases 3 1 Op 3 . - CDS 1768 - 3045 1516 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 4 1 Op 4 . - CDS 3020 - 3661 679 ## FN1846 hypothetical protein 5 1 Op 5 . - CDS 3636 - 4832 845 ## Lebu_1741 ceramide glucosyltransferase 6 1 Op 6 . - CDS 4829 - 5602 686 ## COG0300 Short-chain dehydrogenases of various substrate specificities - Prom 5848 - 5907 16.8 + Prom 5831 - 5890 15.7 7 2 Tu 1 . + CDS 5964 - 6449 877 ## COG3212 Predicted membrane protein + Term 6499 - 6544 9.1 + Prom 6576 - 6635 14.8 8 3 Op 1 1/0.000 + CDS 6659 - 7186 570 ## COG1852 Uncharacterized conserved protein 9 3 Op 2 . + CDS 7199 - 10009 3593 ## COG0457 FOG: TPR repeat + Prom 10025 - 10084 3.2 10 4 Op 1 . + CDS 10111 - 10185 105 ## + Prom 10187 - 10246 2.9 11 4 Op 2 . 
+ CDS 10273 - 10653 606 ## FN1835 hypothetical protein 12 4 Op 3 30/0.000 + CDS 10681 - 11292 555 ## COG0811 Biopolymer transport proteins 13 4 Op 4 11/0.000 + CDS 11305 - 11745 460 ## COG0848 Biopolymer transport protein 14 4 Op 5 1/0.000 + CDS 11754 - 12764 1520 ## COG0810 Periplasmic protein TonB, links inner and outer membranes 15 4 Op 6 1/0.000 + CDS 12778 - 14172 1930 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 16 4 Op 7 . + CDS 14176 - 15633 1463 ## COG2812 DNA polymerase III, gamma/tau subunits 17 4 Op 8 . + CDS 15647 - 16501 728 ## FN1829 hypothetical protein 18 4 Op 9 16/0.000 + CDS 16523 - 16972 722 ## PROTEIN SUPPORTED gi|237739477|ref|ZP_04569958.1| LSU ribosomal protein L9P 19 4 Op 10 1/0.000 + CDS 16983 - 18326 1829 ## COG0305 Replicative DNA helicase 20 4 Op 11 . + CDS 18349 - 19572 1733 ## COG0826 Collagenase and related proteases + Term 19623 - 19674 -0.3 + Prom 19762 - 19821 11.4 21 5 Op 1 12/0.000 + CDS 19873 - 21627 1761 ## COG2831 Hemolysin activation/secretion protein 22 5 Op 2 . + CDS 21640 - 30222 11478 ## COG3210 Large exoproteins involved in heme utilization or adhesion 23 5 Op 3 . + CDS 30225 - 30767 778 ## gi|237739482|ref|ZP_04569963.1| predicted protein + Term 30782 - 30833 7.8 + Prom 30937 - 30996 8.3 24 6 Op 1 . + CDS 31026 - 31751 1127 ## FN1358 hypothetical protein + Term 31752 - 31785 2.4 25 6 Op 2 . 
+ CDS 31829 - 32180 415 ## COG2849 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|224461324|gb|ACDC01000078.1| GENE 1 4 - 825 826 273 aa, chain - ## HITS:1 COG:FN1848 KEGG:ns NR:ns ## COG: FN1848 COG0491 # Protein_GI_number: 19705153 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 8 270 1 263 263 434 82.0 1e-121 MSNTVKKMIEKVDYFACGYCTNDLKRVFKGFDKTIVNFNAGVFLIKHREKGYILYDTGYS MDILKNNLKYFLYRFANPITLKREDMIDYQLKEKGISPDEIKYIIISHLHPDHIGGLKFF PNSYLILTKTCYNDFKLKRDSLLIFDELLPEDFEKRLIIIDDFKENTQFPYRESCDLFSD SSMFLVEVSGHTKGQACLFLPEDNLFLAADVCWGTDFLPLTEKMRWLPRKIQNNFEEYKK GTSLLEKLIEDKISVIVSHDKKEKIINVLKTIE >gi|224461324|gb|ACDC01000078.1| GENE 2 788 - 1771 1167 327 aa, chain - ## HITS:1 COG:FN1847 KEGG:ns NR:ns ## COG: FN1847 COG0451 # Protein_GI_number: 19705152 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 327 2 328 328 581 90.0 1e-166 MKILLTGATGFLGKYVIDELKNNSYQVVAFGRNEKIGKTLIDENVEFFKGDIDNLDDLYK ASQDCSAVIHAAALSTVWGLWEDFYNVNVIGTKNIVQVCEEKKLKLVFVSSPSIYAGAKD QLDVKEDEAPKENDLNYYIKSKIMAENIIKASNLDYIIIRPRGLFGIGDTSIIPRLLELN KKMGIPLFVDGKQKIDITCVENVAYSLRLALENKEHSREIYNITNGEPIEFKEILTLFFN EMGTEGKYLKWNYNLVLPLVSFLEKVYKLFRIKKEPPITKYTLYLMKYSQTLNIDKARKE LGYSPKMSILEGVKNYVEHSKKNDRKS >gi|224461324|gb|ACDC01000078.1| GENE 3 1768 - 3045 1516 425 aa, chain - ## HITS:1 COG:YPO1985 KEGG:ns NR:ns ## COG: YPO1985 COG1819 # Protein_GI_number: 16122227 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Yersinia pestis # 10 364 15 367 395 203 34.0 5e-52 MNTYKEKIKIAVVAPPFSGHLYPILELVLPLLNKKDKYDICVYTGFKKKEVVERLGFPVK ILLEDRPDVFENISDTDKKTNPIIAYKQFKENLGLMPKIIKEMEDYFSEEKPDIIVADFI AVPVCFVSKKLNIPWITSIPTPFAIENKTTTPAYVGGLYPRDSFSFKLRDKFARGFIRAF 
KKLLCFILRKQLKELDFTLYNEKGEENIYSPYSILALGMKEIEFRDDFPSQFSWAGPCCS SLFKDSAKFEVETKFEKIILLTKGTHLKWAKNSIIDIARELSQKYPNYLFVVSLGSYLER EKEIIKENNLQVYHYLDYDEILPKVDYVIHHGGAGILYSCIKHNKPAVIIPHDYDQFDYG VRADLAEIAYVANLKSRKSILKAFDKMLERKEWKNLEKLSKAFNNYSPSALLEKEIDRIL KGVKK >gi|224461324|gb|ACDC01000078.1| GENE 4 3020 - 3661 679 213 aa, chain - ## HITS:1 COG:no KEGG:FN1846 NR:ns ## KEGG: FN1846 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 6 210 1 205 208 333 87.0 2e-90 MGRYTMKFKEYLEKLEILDVSKTLLKEDKIVFVISGSSNLKTAALEPDRFEILNIFKEFG YKVIKSNFPYNEDFPYDEFEDINILKASLSNITYYPHTLFNKRFEKEILRHLEPIKSLKD VIIISQSSGLNVWKKFMELSDFNNENIKMFALGPVGKGYGKLNNVVVLKGIFDIYSWLLD FHKFDKIVNCGHLGYFKDRKVKEIIYEYLQRKN >gi|224461324|gb|ACDC01000078.1| GENE 5 3636 - 4832 845 398 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1741 NR:ns ## KEGG: Lebu_1741 # Name: not_defined # Def: ceramide glucosyltransferase # Organism: L.buccalis # Pathway: Metabolic pathways [PATH:lba01100] # 1 398 1 392 392 539 76.0 1e-151 MIDLFYVLLTITIILLILKLFFSFAYFYKIDKFEKTEINEKKYTVLQPILSGDPRLEEDL IANLKNTTDMKFIWLVDKSDKVAINTVENILKDKNYSNRIEVYYLDDVPQELNPKIFKLA QVVDKIKTEYSIILDDDAVIDRKKLDELSVYEKDKSEWIATGIPFNYNIKGFYSKLISAF INSNSIFSYFSMSFLNENKTINGMFYILRTDILKKYSAFDEIKYWLCDDLALATYLLSKK VKIIQSTIFCNVRNTVPSLKRYILLMKRWLLFSNVYMKNAFSTKFLFIILLPTLLPTILL FFSFYLGLDYLVIILNLFIGKIALFHIVRLFIYQGNYEENSFKKSLFVFSPQTTELLYEL LSEFLLPFMLIYTLLTPPVILWRNKKIRVKDGKIHYEI >gi|224461324|gb|ACDC01000078.1| GENE 6 4829 - 5602 686 257 aa, chain - ## HITS:1 COG:FN1844 KEGG:ns NR:ns ## COG: FN1844 COG0300 # Protein_GI_number: 19705149 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Fusobacterium nucleatum # 1 257 1 257 257 396 83.0 1e-110 MKKILITGASSGIGKELAINLTNKTDELFLLARSIDKLELLKKDLEEKNPSLKCECIKYD LSDIENLDKIIENYDIDLLINCAGFGKITDFSKLSDKEDLDTINVNFISPMLLTKKYSEK FLQKGQGIILNVCSTAALYQHPYMAIYSSTKSALLHYSLALDEELHNKNKNVRVLSVCPG 
PTASNFFDKDIQVKFGSSQKFMMSSEDVAKRIIKIIEKKKRFSIIGFRNKLSMFLLNLLP ASLQLRLVGSVLKKVIK >gi|224461324|gb|ACDC01000078.1| GENE 7 5964 - 6449 877 161 aa, chain + ## HITS:1 COG:FN2085 KEGG:ns NR:ns ## COG: FN2085 COG3212 # Protein_GI_number: 19705375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 161 1 161 161 211 81.0 4e-55 MKKILIVGAIILGSIGFSTSALAAISEQQAKDIALKEAQGGQITKFKLDREKGRMVYEVE VMDGNIEKDYEIDAETGAIVKFEQEQKGAGKAKSVNEPKISYDKAKEIALKNSKNGKFKE IELKHKNGVLVYDVEIAEGFADREFLIDANTGEILKNKKDF >gi|224461324|gb|ACDC01000078.1| GENE 8 6659 - 7186 570 175 aa, chain + ## HITS:1 COG:FN1837 KEGG:ns NR:ns ## COG: FN1837 COG1852 # Protein_GI_number: 19705142 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 175 1 175 176 245 79.0 3e-65 MEKFYINLLKSLLYILFMMTTKFKNPKLNNYFSQKFLEINNNYVLKKIKKKTNDKILILL PHCIQLYDCEYKITADINNCRVCGKCVVYNFVDIKNKYKNIDVKIATGGTLARKYVKDLR PDLIIAVACKRDLISGIRDAESFLVYGVFNKIKNESCINTTVAIEDIYAILEEIS >gi|224461324|gb|ACDC01000078.1| GENE 9 7199 - 10009 3593 936 aa, chain + ## HITS:1 COG:FN1836 KEGG:ns NR:ns ## COG: FN1836 COG0457 # Protein_GI_number: 19705141 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 936 1 936 936 1188 81.0 0 MKKILIISLLASSSIIFAGEAEDFKTVNELYKEKNFKSALVESEKFLEKYPESKHQKSMR DKIAKIYFLEKEYKKAEEVFKKLFVMEEKKSEKDEYASYLARINALQNKTDEARFYLREI KNEKTYQRALFAVGQDFLSKDNNEAAKDIYREIIDKKYENDKEAMMGLGIVNYNLKDYDK AIYWFSEFQKSKPKENKDMVSYLKASALYRKGNTEQAIVDFEELANANPANDYSKKAILY LIEIYSNKKDEQKVSFYLEKIKGTKEYNTAMTMIGDLYVTKENYDKALQYYDQSDDKSNP RLIYGEAYSLYKSGKYEAALKKFQSLKNSDYYNQSIYHIFAINYKLKNFDEIIRDREIIR KVVVSQVDTDNIIRIIANSAYQVGNYKLAKDYYGRLFAVSPDKDNLFRVILLDSQMLDME DLQIRFNQYNKLYSDDTEYKKDVYLYTGDAYYKAGQVERAEQIYKAYLSQDTNTEIISSL MSSLLDQQKYDEMNQYLSSVSDDNSLSYLKGVAAMGLKKYDEAETHFQKVLANGDKGLST KVYLNRVRNFFLAERYNEAIQAGEQYLSRINPDKEKAIYSEMLDKIGLSYFRVGKYDQAR 
SYYSKIASMKGYEVYGKFQIADSYYNEKNYAKAGELYKSIYNNYGETFYGEQAYYKYITT LSLLGNTEAFEREKNNFLSVYPNSTLRTTISNLSTNFYIESGDTEKAIEALDKSKANTDD ADVKENNTIKIIGLKLEKKDYKDMEKYLGEISDPEERAYYSAQYYAQKKDPKLVKEYETL LKSEKYKAYASKALGDYYFDKKDLAKSKKYYGTHVSVNKNPDEYVLYRLGQANEKENNLK MALADYKLVYEKKGKLAEDAMLRAAEIYDRQENNVEAEKLFTKLYATKGNKDLKAYSIEK LIYYKLLNEKTKEAKKYYDELKKLDAKRAEKFKAYF >gi|224461324|gb|ACDC01000078.1| GENE 10 10111 - 10185 105 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRVIVGVSEANLLASFAEITAKR >gi|224461324|gb|ACDC01000078.1| GENE 11 10273 - 10653 606 126 aa, chain + ## HITS:1 COG:no KEGG:FN1835 NR:ns ## KEGG: FN1835 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 5 126 1 121 121 131 73.0 8e-30 MSKKLLAIFLLLGVLTYAEDNDTTVIINDNAQKATDNGEVITTEVTKKVVGENNQQLDVK EIDTEELILQNQNLESSSINITGENLKENGDKVKVNRENTATIEEELSQGVEKKGFFRRI IDKLFG >gi|224461324|gb|ACDC01000078.1| GENE 12 10681 - 11292 555 203 aa, chain + ## HITS:1 COG:FN1834 KEGG:ns NR:ns ## COG: FN1834 COG0811 # Protein_GI_number: 19705139 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 17 203 1 187 187 308 87.0 5e-84 MQILKAGGILMYFILLMGIVGLYAILERFFYFASKERNNYSKLPSEAKQLIKEGKIKEAI IFFNSNKSSTSTVLKEILIYGYKENKETLSALEEKGKEKAIEQIKHLERNMWLLSLAANA SPLLGLLGTVTGMITAFNSIALNGTGDAGILAKGISEALYTTAGGLFVAIPCMIFYNYFN KRIDLVVADIEKTCTELLNYFRE >gi|224461324|gb|ACDC01000078.1| GENE 13 11305 - 11745 460 146 aa, chain + ## HITS:1 COG:FN1833 KEGG:ns NR:ns ## COG: FN1833 COG0848 # Protein_GI_number: 19705138 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Fusobacterium nucleatum # 34 142 1 109 114 148 77.0 4e-36 MKLDRIKRRSGGTLILEITPLIDVVFLLLIFFMLATTFDERSAFKIELPKSTVAKTKSTL KEVQVLVDKDKNVYIKYTNNSGKTETEELELSAFVDFVSEKLDTSESKDVVVSADKDIDY GFIVEIMSLLKEAGASGINIDTNSTK >gi|224461324|gb|ACDC01000078.1| GENE 14 11754 - 12764 1520 336 aa, chain + ## 
HITS:1 COG:FN1832 KEGG:ns NR:ns ## COG: FN1832 COG0810 # Protein_GI_number: 19705137 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Fusobacterium nucleatum # 191 336 89 234 234 222 85.0 1e-57 MKKSDYICLFLSIIINIGIILALTVFSKDTTQEILDAEQIKIGLVAVENDASTKFRGEKN VDAKKQNLDADSIEKKEEKTEKPEKPAEKKVEEIKTEKTVEKITEKPEKKAAEKPAEKPK EKTPEKPKEKPVEKEKPVEKKIEKLAEKGEKVVEKKDEKKAPEKPATKENSKKSSSESSD SKGTSKQEKPSLADLKKQISGSQPKTSNGGYSPTADPDGEEVVDRVLQNVTYSNGLVSGS KMGNSSDGRVVDWNAKNKAPEFPQAAKSSGKHGKLKIKLKVDKMGNVLSFVIVEGSGVPE IDAAVERVVGTWRVKLMKNGKPVNGTFYLNYNFDFK >gi|224461324|gb|ACDC01000078.1| GENE 15 12778 - 14172 1930 464 aa, chain + ## HITS:1 COG:FN1831 KEGG:ns NR:ns ## COG: FN1831 COG2204 # Protein_GI_number: 19705136 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Fusobacterium nucleatum # 1 464 1 464 464 756 89.0 0 MLLLGLRLDNDLKLEFENNFENDLVFVENMISFMDAIKNRKYEAIVIDERNSKEEALISL ITKITELQKKVVIIILGEASNWRVIAGSIKAGAYDYILKPEIPKNIVKVVEKSVKDYKGL VEKVDKTKSTGEKLIGRSKLMIDLYKVIGKVANNSAPVLVTGERGTGKTSVAKAIHQFSN VHDKPIISVNCNSYRANLLERKLFGYEKGSFEGAAFSQYGELEKAEGGILHLANIESLSL DMQSKILFLLEENRFFRLGGMEPINAFVRIIASTSVNLEELIDKGLFIDELYRKLKVLEI NIPNLRDRKDDIPFIIDHYMPECNREMEKNIRGVTKMALKKLMRYDWPGNVNELKNAIKY AVAMCRGSSILIEDLPPNVIGEKAITSKEEIRALSIENLIKNEISQLKSKNKKSDYYFEI ISKIEKELIKQILEITNGKKVETAEILGITRNTLRTKMNYYDLE >gi|224461324|gb|ACDC01000078.1| GENE 16 14176 - 15633 1463 485 aa, chain + ## HITS:1 COG:FN1830 KEGG:ns NR:ns ## COG: FN1830 COG2812 # Protein_GI_number: 19705135 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Fusobacterium nucleatum # 1 485 1 484 484 720 81.0 0 MHITLYRKYRPSSFSEVSGENEIVKSLKLSLKNKSMAHAYLFSGPRGVGKTTIARLIAKG VNCLNLGEDGEPCNECKNCKAINEGRFSDLIEIDAASNRSIDEIRSLKEKINYQPVEGLK KVYIIDEAHMLTKEAFNALLKTLEEPPSHVMFILATTELDKILPTIISRCQRYDFKALDI 
EDMKSGLKHILKEENLSMSDEVYPLIYENSSGSMRDSISILERLIVTANGNEINLKIAED TLGVTPSSRIKIFLDKLLNESEYNIINELEALANESFDIELFFKDLAKYCKNSIVKNELD IDKGLKIISTIYDVINKFKFEDDKKLVGYVIVADILANSTQTIVRTVTKVQRVVEDTDDT SNTVVEAVKEKPKVQITIADVKSNWNSILDEAKKKRISYKVFLMGANPVRVEDNIIFITY DKKYLFPKEQMESEEYNREFTEIIRKFFNENSLELKYEVIGQKKEEESGEEEFFKKIENY FKGNS >gi|224461324|gb|ACDC01000078.1| GENE 17 15647 - 16501 728 284 aa, chain + ## HITS:1 COG:no KEGG:FN1829 NR:ns ## KEGG: FN1829 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 284 1 279 279 192 53.0 2e-47 MPIVSILMSAATIVLYFFLSLFLPFLAYLIPYYKITRVNLYKKKYSLAVNIIVALILVFI NPGYLMLYLIFPYAMEFMFYLFNKIAKRMQVFNRIVLMSIVPTILISLYLYANMDMINYT MNYMITNLPRMKDIVEQVGIETVVAVQKSIQESMALVSNYYIFGAFFVVIVSYFFLFLNL IPSTYKLWKISCYWLIPYMLILWAHKYNISSNLLIENNILECIKWMYVLYGIKVIYSLLD RIGVKVNIIKHAISMMIGLQYAPFVFILGALVSFEFIEVKEIKI >gi|224461324|gb|ACDC01000078.1| GENE 18 16523 - 16972 722 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739477|ref|ZP_04569958.1| LSU ribosomal protein L9P [Fusobacterium sp. 
2_1_31] # 1 149 1 149 149 282 100 2e-75 MAKIQVILLEDVAGQGRKGEIVTVSDGYAHNFLLKGKKGVLATPEELQKIESRKKKEAKK LEEERNKSLELKKILEAKTLNLSVKAGENGKLFGAITSKEIASHIKDELGLDIDKKKIEA NIKALGPDEVVIKLFTDVKAVVKINVVAK >gi|224461324|gb|ACDC01000078.1| GENE 19 16983 - 18326 1829 447 aa, chain + ## HITS:1 COG:FN1827 KEGG:ns NR:ns ## COG: FN1827 COG0305 # Protein_GI_number: 19705132 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Fusobacterium nucleatum # 1 447 1 446 446 659 80.0 0 MEFEELNRIPYSLEAERALIGGIFFDVNSLDEIKYIIKSDDFYKKEHAEIYKAIENLFSE SRGVDPILVVEEIKKSDLKNEEEILQVLTEIIDENTSSYNLLEYAELIKEKAMLRRLGQV GMEITKTAYTDVRTAEEIMDEAEAKVLNLSKNILKNSIIDMKTASLDEMKRIDNVTANRG KTLGIPTGFVDLDRMTSGLNNSDLIILAARPAMGKTAFALNLALNAAKEKKNVLIFSLEM PVQQLYQRLLAMESGISQNKLRNVYLEEDEWNKLTLATTSLSNLGIYVADLPHTNVLEIR SYARNMKTQGLLDLIVIDYLQLINGTGKGRGSEASRQQEISDISRALKGLARELDVPVIA LSQLSRAVESRVDRRPMLSDLRESGAIEQDADIVAFLYREEYYIPDTENKGITELIIGKH RNGATGTVKLNFLSEFTKFTSYTDEVK >gi|224461324|gb|ACDC01000078.1| GENE 20 18349 - 19572 1733 407 aa, chain + ## HITS:1 COG:FN1826 KEGG:ns NR:ns ## COG: FN1826 COG0826 # Protein_GI_number: 19705131 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 1 407 4 410 410 786 95.0 0 MKKAELLAPAGNMEKFKMALHYGADAVFMGGKMFNLRAGSNNFSDDELEEAVNYAHERGK RVYVTLNIIPHNDELDALPDYVKFLERIGVDGVIVADLGVFQVVKENSDLNISISTQASN TNWRSVKMWKDMGAKRVVLAREISLENIKEIREKVPDIELEVFVHGAMCMAISGRCLLSN YMTGRDANRGDCAQACRWKYSLVEETRPGETMPVYEDEHGTYIFNSKDLCTIEMIDKILD AGVDSLKIEGRMKGIYYVSNCVKVYKDALNSYYSGNFEYNPEWRNELESISNRSYTEGFY HGKAGKESLNYNNRNSYSQTHKLVAKIEKKLSDNEYLVAIRNKLFVGQAVQIVSPEIKVR DFVMPEMILLDKMGRETESVESANPNSFVKIKTDIPMNELDMLRIVL >gi|224461324|gb|ACDC01000078.1| GENE 21 19873 - 21627 1761 584 aa, chain + ## HITS:1 COG:FN0131 KEGG:ns NR:ns ## COG: FN0131 COG2831 # Protein_GI_number: 19703476 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin 
activation/secretion protein # Organism: Fusobacterium nucleatum # 1 553 1 565 566 743 73.0 0 MKKVITYIFLVFSILSFSNTFNENEDERTILRKEQRLEQERIQEKYENSKDSYQKDTKIE VDKNELKFYISKINLFDDENLLNEIEKENILSKYENKKLGSTDISNILVELTNRLIEKGY VTSTASLSENNNLNSETLNLKIISGKIEKIILNEDDSLDKLKKYFLVSTKEEKVLNVRDL DTTTENFNYLEANNMTMEIIPSDKENYSIIKLKNEMKDKFTISFLTNNYGEDNQNGIWRG GTSVNIDSPLGIGDRVYFSYMTVHKKKADRSWKRTTESLKPGEILPIGPKGYDPAKDTLP YKRELDLYNFRYTMKFRDYTLSLGSSRSENISSFYTPTTIYDMETMSSNFSVNLDKILWR NQKSKLSLGVGLKRKHNKSYIEDTLLSDRVLTIGDISLNGTTVFYGGIFGITLDYERGLR ALGASNTPKAEFKKYSLNLNYYKPLTKKLVYRFNTLTSHSKDVLYASEKQSIGGVGSVPG YHRRGNIMGDRAIEIENELSYKIIDSEKIGRLSPYISYSYGAVRNNKNPSVYGKGYISGA SIGLRYSMKYLDIDLAYAKALSHSSYIKPRDREIYFSTSLKIRF >gi|224461324|gb|ACDC01000078.1| GENE 22 21640 - 30222 11478 2860 aa, chain + ## HITS:1 COG:FN0132 KEGG:ns NR:ns ## COG: FN0132 COG3210 # Protein_GI_number: 19703477 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 185 2327 2 2260 2462 2305 68.0 0 MRNRILKRVITVIFLLLHTLEIFGANLVVDPNSTYNTKIDESRNGVPIVNISTPNDRGVS INEFKEYNVDEKGQILNNADNIGRSYLGGLINANPNLAPNQAANLIILQVNGSNRSQIEG YLEALSRQKVDVILANENGLYINNSGTINIKNFTATTGKLNLKDGDFVGIDVEKGNVLIG PKGFNGNNTDYVDIIAKTLELRGNIVANNLNIKTGSNDKNSSNILAIDASELGGMYAGVI KIVSTDKGVGVNSDSFIVSKDKKLEITADGQIKINKVQAKGVDIKGKEYVQKDLTYSDGD ISIKADKIKLAGTGMSGNKVSLNGDVENSSDISVKENLNTKNFSNTGLVQVNDKIEVLGN VNNTGEILTNNSFTAKDVKTIGKLISKDNINVSNLENSGVITSNNKLNIDGKLNNAGEIQ ITDNIVVNGNVENIGEILTNGSFTSKDIKNKKELSANKDIRVSKLENTGNVLTNSKISIN GDLTNTGELKALDSISVTENTTNDGSILTNKNFSTSDLTNNKKIIVKEKIDTKNLKNTGT IASGDKFTINGNFENNNNIETAALDLTGNKLTNSGSIKADNISANVTNITNSGKILSSNN IVFSNAQKLQNTNDILAIKNIQANNTIIENDGKIASNNKIMLNNSSIKNTKKITSDTIEM KNNRSFDNTGEIIGNNVVLTSENNLDFYGKVQGNQNLSIIGKNIENNGEIIGTGSANITS TNFTNNGELTAQVLTVDAKNGKLTNNNIISGEEVTLSAKNIENNDLISSAKNITLKADEK ILNNSNKTIYTSGKLSISGKEIENKKNAEFLATDIELKADKVKNEVGTIKASNNIIIKAD KFENIGEVKDLDRYESYYETWDGKVLTESEIGNWKRIGGDYSKNKRKRANKAHVGDYIRR 
KQKDAYEEITKKVEEDKYKSLLFPKYTKYMRGYLGNRGTFTEKTGSARIKDIPLKEKLRS LSETEYAKVIAGNNIIIEGKDGGKSRETLNKDAIISAGNTVKIDTNKLENIVSIGDEKIK VKTGQESMEIKFERTGHRLNKHVQMNVTYRRDFTNDYITKKVPVLDEHGNPVLNFRGRPK YEYVKEYVGRYSYVTGSPSIIEGKNVIIDRANLVVNGIEEANGKINQGISKNNVVLDKKK ISVGTREDISNSVSNPIKGNIEISTNSRVFEDILRNGVINIDVTTPSALFIKNVNPDSKY LLETRVKYINQKEFYGSDYFLKRIGYEDKWTRVRRLGDAYYENQLIERNIIEKLGTRFIN GKELSIKELIDNGTDIAKKNALTIGQGLTKEQIAKLDKDIVWYEYQNVDGIQVLAPKVYL SQNTLKNLNSDSRTKIVGLDNTYIKTNKLENTALISGRGNTFIEADEVNNRTLGNQLAEI SGENTQIIATNNINNIGARISAKQNLNLIAVNGDILNKSTVEKVEFNNGEFDRSKFTKIA SVGEIISDGNLNIIANNYTSEGAVTQAKNVNINVTNDVNISSQKVSGEQKFGKNDGQYNY YGFERNLGSVVKTENLNVTAKNVNISGSVVTTQTADLNVDKLNIESKVDKEDEIKKSSYK SFLKSGSKKEIIHNEENSAGSLYVENKGTIKGDVNLVGSNLVLGDNSIINGKLTTDSNEL HSSYSLEEKKKGFSSSIGSGGFSVGYGKSQSKLKEKDLTNAKSNLVLGDNVTLNKGAEIT ATNFTHGKVTVNNGDVKFGARKDTRDVETSSKSSGVNLSVRIKSEALDRAKQGVDSVNQL KSGDILGGLASTTNTVTGLVQGLSSNITKKDGSKATLKDIKDGDFKVNNNFYANAGVNLG FNKSSSKSNSHSESAVVTTIRGKDENSSITYNNVKNVEYVGTQAKDTKFIYNNVENINKT AVELNNSYSSTGKSSGISAGATINYNNGFQAEANAVSISASQSKMNSNGTTYQNGRFVNV DEVHNNTKNMTLSGFNQEGGTVTGNIENLTIESKQNTSTTKGSTKGGSLSVSANGLPSGS ANYSKTNGERKVVDNASTFIIGDGSNLKVGKLENTASAIGTTENGKLSIDEYVGHNLENV DKLKTAGGSVGVSTSGITSIGVNYSDKKQEGITKNTVIGNVEIGKSSGDEINRDLDTMTE ITEDRDFKTNINVESQTISYIKNPEKFKEDLQIAIIEGKATGRTVVKTIDNVINGDKSQD IGDAERRSLIEIKEAIVRVQTAPAMDIIAEKDLADKNIQARLGVEIEKFDPNDPTLSEKV RERIDELKAEGKELVAFYDKVTKKIFINQNAKDEEVRASIAREYKIKEDLELGRGKENDK GQLRSTVAGEIAYDEIKDRLKKGDKNPISASSFDVAKMDKDSEVTADNYRFEDRKLRKEY KEIVYFTDERKEQMSEGGLDMMAGGGTALAGGAVIGASIGVDYVSFGTTIVPSTAAKAGG ATAVVVGGNRFITGGLKFLNGLYGSGKKETLNPIRDSIPKKYQGYYDAFEVTVEVGILHL APHAQPYYSQPNKSVINISQQQRELNKQMNLNASVGQQKVPISSTRSSTDGKNYNAIKEI SQGKVFINNEKTGETTMAYQKRITYSNGSMRIFQENLTTGEISLQKINPLGQRVFEKTLT PYKSNTLIEASSSSKMLVGDRAVSQVASKVGQVQNNRLPYKPVLALPVKYPIVTKSGVTL YQDYAIGSRGAKYIEVGVSNNKEIVYKNNSGSYFKLTDKGLENISSKDVIKKEYFPTVKV QKNNGSNPSAGKNYYVSKEGNFKVEKYIEAGQTVEDYDKIYLIENVRARGINVGRSKEHT NTTISHWEKAVDIADEAKVDPNVKAVYVDETLKNISDKFKDSDARPDVTIEYKDGTFKLI EVQSKTDYRPKLENKLKNLQTKYGEDVITNYKVEKPKGGK 
>gi|224461324|gb|ACDC01000078.1| GENE 23 30225 - 30767 778 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739482|ref|ZP_04569963.1| ## NR: gi|237739482|ref|ZP_04569963.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 180 1 180 180 299 100.0 4e-80 MERIVYIRGKFKGNNKEMKKAYFYAKEMMTKYDLKPQYIGVIGEKGWTGDKLLTIKRKEK QLLEDLEKDIIIKNIGLETKEMEGKEIINDKSFFLIEKEDGTIAFWTNTNIEEVNFEEIL EEMKKYVEPGIEEICDWESGSSPIVYVLRGEKTLEKTGKFQDKITIIYKKVTPLDIPIEV >gi|224461324|gb|ACDC01000078.1| GENE 24 31026 - 31751 1127 241 aa, chain + ## HITS:1 COG:no KEGG:FN1358 NR:ns ## KEGG: FN1358 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 241 1 233 233 269 62.0 7e-71 MTKVSEFLEKMYEEPSQELENEMLEEIIMKANFLSYINRPDNGNTDVFSINMLLTDDKKL YLPIFTDIEELAKWGIPEEMDTIELNFDNYSEIILDHPHDIEGLVINPFGKSYIISEEWL SELRTMKEERLKVRELKIPVNSKILLSEPERFPTMLAEEVTKCCDEIGTINRLWLLEMTT EKDESWLLVVDFKGDKNEIFREINDAARNYLGMRYLDMIAYDDEFAKKSVENHKPFYDKI K >gi|224461324|gb|ACDC01000078.1| GENE 25 31829 - 32180 415 117 aa, chain + ## HITS:1 COG:FN0774 KEGG:ns NR:ns ## COG: FN0774 COG2849 # Protein_GI_number: 19704109 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 26 114 3 92 248 87 53.0 5e-18 MKKIMIFLFAFCSLLAYSAKTVDYQEVDKYIRQKLDKDKEITFIYKVNQADFTLEGYSDG KLTAVTDLKSNPGQAAMDGMKSVVSEKNGKLNPEYKIFAADGKLLSEQKFKLNKSIR Prediction of potential genes in microbial genomes Time: Thu May 19 23:22:55 2011 Seq name: gi|224461323|gb|ACDC01000079.1| Fusobacterium sp. 2_1_31 cont1.79, whole genome shotgun sequence Length of sequence - 23404 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 7, operones - 4 average op.length - 6.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 464 624 ## COG2849 Uncharacterized protein conserved in bacteria + Prom 714 - 773 11.5 2 2 Op 1 . + CDS 802 - 1386 646 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 3 2 Op 2 . 
+ CDS 1390 - 1887 925 ## FN1105 hypothetical protein 4 2 Op 3 . + CDS 1916 - 3142 1758 ## COG1760 L-serine deaminase + Term 3164 - 3210 2.1 - Term 3152 - 3194 8.0 5 3 Op 1 . - CDS 3212 - 3814 1313 ## gi|237739489|ref|ZP_04569970.1| predicted protein 6 3 Op 2 . - CDS 3852 - 4439 793 ## FN1479 hypothetical protein 7 3 Op 3 2/0.000 - CDS 4460 - 5791 1922 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 8 3 Op 4 1/0.000 - CDS 5775 - 6917 1430 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 9 3 Op 5 9/0.000 - CDS 6932 - 9109 2639 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 10 3 Op 6 1/0.000 - CDS 9128 - 9640 819 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 11 3 Op 7 1/0.000 - CDS 9689 - 10471 721 ## COG0457 FOG: TPR repeat 12 3 Op 8 1/0.000 - CDS 10487 - 11176 590 ## COG2928 Uncharacterized conserved protein 13 3 Op 9 1/0.000 - CDS 11173 - 12456 1752 ## COG1253 Hemolysins and related proteins containing CBS domains 14 3 Op 10 1/0.000 - CDS 12488 - 12949 595 ## COG4492 ACT domain-containing protein 15 3 Op 11 1/0.000 - CDS 12962 - 13813 1093 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 16 3 Op 12 1/0.000 - CDS 13807 - 14739 1040 ## COG0223 Methionyl-tRNA formyltransferase 17 3 Op 13 1/0.000 - CDS 14758 - 15207 558 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains 18 3 Op 14 1/0.000 - CDS 15224 - 15694 543 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 19 3 Op 15 . - CDS 15688 - 16383 525 ## COG1381 Recombinational DNA repair protein (RecF pathway) 20 3 Op 16 . - CDS 16385 - 16936 606 ## FN1493 hypothetical protein 21 3 Op 17 . - CDS 16933 - 17820 1130 ## COG1792 Cell shape-determining protein - Prom 17948 - 18007 9.9 + Prom 17717 - 17776 10.4 22 4 Tu 1 . + CDS 17868 - 18041 90 ## - Term 17936 - 18003 16.1 23 5 Op 1 . 
- CDS 18028 - 19167 1678 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 24 5 Op 2 . - CDS 19191 - 19682 474 ## FN0663 hypothetical protein - Prom 19816 - 19875 17.4 + Prom 19750 - 19809 11.8 25 6 Tu 1 . + CDS 19835 - 21571 2363 ## COG0616 Periplasmic serine proteases (ClpP class) + Prom 21575 - 21634 12.8 26 7 Op 1 1/0.000 + CDS 21662 - 22141 639 ## COG3187 Heat shock protein 27 7 Op 2 . + CDS 22172 - 22666 696 ## COG2190 Phosphotransferase system IIA components 28 7 Op 3 . + CDS 22702 - 23404 731 ## FN0914 hypothetical protein Predicted protein(s) >gi|224461323|gb|ACDC01000079.1| GENE 1 3 - 464 624 153 aa, chain + ## HITS:1 COG:FN0519 KEGG:ns NR:ns ## COG: FN0519 COG2849 # Protein_GI_number: 19703854 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 28 153 220 343 343 61 34.0 8e-10 ANIMAYLDGDIPYDERLMELFNAVDSLETIGYHPNNVKYIKKIYVNHKNNTVKIEVKDYR EDPMLLQITNVDIKTLSGKIQYFYNSGKLFSTMNVKNGILDGEAKLYYENGKLKLVATNK NGKMNGIVTTYSEDGKVIKKIEVKDGEVVREIQ >gi|224461323|gb|ACDC01000079.1| GENE 2 802 - 1386 646 194 aa, chain + ## HITS:1 COG:FN1104 KEGG:ns NR:ns ## COG: FN1104 COG0632 # Protein_GI_number: 19704439 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Fusobacterium nucleatum # 1 193 1 193 194 294 88.0 6e-80 MFEYLYGTVEYKKMDYIAIDINGVGYKVYFPLREYEKIDLGNKYKFYIYNHIKEDTYKLI GFLDERDRKIYEMLLKINGIGPSLALAVLSNFSYDKIVEIISKNDYTSLKKVPKLGEKKA QIIILDLKAKLKNLTYTEVETISIDMLEDLVLALEGLGYTKKEIDKTLEKVDLSAYSSLE EAIKGILKNMKIGG >gi|224461323|gb|ACDC01000079.1| GENE 3 1390 - 1887 925 165 aa, chain + ## HITS:1 COG:no KEGG:FN1105 NR:ns ## KEGG: FN1105 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 165 1 165 165 273 84.0 1e-72 MRGLEEIYLKGFSLDKYFGIASADELEKLEELYKNIVISDEFINRIKGINKKVPVLASVE TWCPYARVFLTTMRKINEINHIFDLSLITYGRGVSELAGYLKIHEDDFVVPTAVFLGEDF 
SKLRVFNGFPEKYHNDSTLDTIDGTRNYLKGKFANDILEDVLSIF >gi|224461323|gb|ACDC01000079.1| GENE 4 1916 - 3142 1758 408 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 408 1 408 408 793 93.0 0 MDTLKEFFKIGAGPSSSHTIGPERATKRVKEKFPDADSYIVELWGSLAATGKGHYTDKII IETFKPIPVEIIWKPEFVHELHTNGMKFIALDKDKEEIGEWVVFSVGGGTIRDYDELTDK SPKKEVYPLNSMKEIIKWCKENKKHLWQYVEECEGPTIWQHLKFIDQAMTDAVKRGLEKE GDVPGPFKYPKRAREMYEKALSKRASLVFTNKIFAYALAVSEENASMGQVVTAPTCGASG VVPGVLRAMKEEYELVEKHILRGLAIAGLVGNLVKYNATISGAEAGCQAEVGTACSMAAA MATYFMGGSIDQIEYAAESAMEHHLGMTCDPVGGYVIIPCIERNAICAVRAVNTAIYCMS TDGKHTISFDEVVKTMKETGKDMCSAYKETSDGGLAKYYDKILVGNEE >gi|224461323|gb|ACDC01000079.1| GENE 5 3212 - 3814 1313 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739489|ref|ZP_04569970.1| ## NR: gi|237739489|ref|ZP_04569970.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 200 1 200 200 112 100.0 9e-24 MKKRIFAMFILAASMAMVACSSSSTVNEGATSGDNQALQTLEKKREFYKAQDREKAKLEA EAKKAEENAKKEAEEKARLEAKRAQEEAKKAEEEARKAEENARLEAARAEEEARKAEENA KLEAARAEEEAKKAEENARLEAKRIEEEAKKAEEQAKLQAARAEEEARKAQEKAIEDAKK AEEQAKLEALKVLEKKRKTN >gi|224461323|gb|ACDC01000079.1| GENE 6 3852 - 4439 793 195 aa, chain - ## HITS:1 COG:no KEGG:FN1479 NR:ns ## KEGG: FN1479 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 194 1 187 188 266 81.0 3e-70 MKKNLEILEKIYDLRYKSGKVHIFHSINKLVGRFGNVVSLDKIYVSKEYLSYLSEKLFKD RERLTSFFGGNNKFVRLSLVQEFMQDFGRDIAQDVKDDFLEIKQYNSSVFKAVKERMIAL KDNENEEITKEDIDLIQGYLTNWKKLQDKIKHFIPEEFYGQKNNYFYTSLLSYVKFLDKL NPNYEVGMKYLEEIK >gi|224461323|gb|ACDC01000079.1| GENE 7 4460 - 5791 1922 443 aa, chain - ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 1 443 7 
449 449 716 92.0 0 MEEIVELLEQNKLAELKEILINENPIDIADVFEDFPKEKYLIIFKLLPKDFSSEVFSYLS PEKQQDVIENITDDEIKFIVEDMYLDDTVDFIEEMPANIVDKILKNTSSDKRKLINQMLK YPENSAGSVMTVEYISFKDNYTVKQAIEYYRKVAIDKEETDICFVTDTKKKLVGIISLKT LILSKDDSYIQDEMDTNFVSVLTLDDQEEIAALFRKYDLTTMPVVDHEDRLVGVITVDDI VDVIDQENTEDIQKMAAMNPSDEEYLKESVVSLAKHRILWLLVLMISATFTGLVIKKYED ILQSAVYLATFIPMLMDTGGNAGSQSATLVIRGIALEEIEFSDIFKVIWKELRVSILVGF ILSAVNFIRIYYFTRSGLETSLVVAISMFLTVIMAKVIGGVLPLVAKSLKIDPAIMASPL ITTIVDTAALIIFFKLSVIFLHI >gi|224461323|gb|ACDC01000079.1| GENE 8 5775 - 6917 1430 380 aa, chain - ## HITS:1 COG:FN1481 KEGG:ns NR:ns ## COG: FN1481 COG0343 # Protein_GI_number: 19704813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Fusobacterium nucleatum # 1 372 1 372 373 734 95.0 0 MKLAVTYKVENKDGKARAGIITTPHGEIETPVFMPVGTQATVKTMSKEELIDIGSEIILG NTYHLYLRPNDELIARLGGLHKFMNWDRPILTDSGGFQVFSLGSLRKIKEEGVYFSSHID GSKHFISPEKSIQIQNNLDSDIVMLFDECPPGLSTREYIIPSIERTTRWAKRCVEAHQKK DTQGLFAIVQGGIYEDLRQKSLDELSEMDEHFSGYAIGGLAVGEPREDMYRILDYIVEKC PEDKPRYLMGVGEPVDMLNAVESGIDMMDCVQPTRLARHGTVFTKKGRLIIKSERYKEDT APLDEECDCYVCKNYSRAYIRHLIKVQEVLGLRLTSYHNLHFLIKLMKDAREAIKEKRFK EFKENFIKKYEGGDRNGRDS >gi|224461323|gb|ACDC01000079.1| GENE 9 6932 - 9109 2639 725 aa, chain - ## HITS:1 COG:FN1482 KEGG:ns NR:ns ## COG: FN1482 COG0317 # Protein_GI_number: 19704814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Fusobacterium nucleatum # 1 725 1 725 725 1293 92.0 0 MMNYWEQLLEKAKANHLNYDFDKLKLALAFAEESHQGQYRKSGDDYIIHPVEVAKILMDM KMDTDTVVAGLLHDVVEDTLIPIADIKYNFGDTVAVLVDGVTKLKALPNGTKNQAENIRK MILAMAENIRVILIKLADRLHNMRTLKFMKPEKQQAISKETLDIYAPLAHRLGMAKIKSE LEDLSFSYLHHEEYLEIKRLVENTKEERKDYIENFIRTMKRTLVDLGLKAEVKGRFKHFY SIYKKMYQKGKEFDDIYDLMGVRVIVEDKAACYHILGIVHSQYTPVPGRFKDYIAVPKSN NYQSIHTTIVGPLGKFIEIQIRTKDMDDIAEEGIAAHWNYKENKKSSKDDNIYGWLRHII EFQNESDSTEDFIEGVTGDIDKGTIFTFSPKGDIIELPVGATALDFAFMVHTQVGCKCVG 
AKVNGRMVTIDHKLRSGDKVEIITSKNSKGPSIDWLDIVITHGAKGKIRKFLKDENKEIV SKLGKDSLEKEAAKIGMTLKEIENDSTLKKHMERNNIPNMEEFYFYLGEKRSRLDILINK IKVNLEKERAASTITIEEVLKKKEEKRKEGKNDFGIVIDGINNTLIRFAKCCTPLPGDEI GGFVTKLTGITVHRKDCPNFHAMVEKDPSREILVKWDENLIETKLNKYNFTFTIVLNDRP NILMEIVNLIGNHKINITSLNSYEVKKDGDKVMKVKISVEIKGKAEYDYLINNILKLKDV IAVER >gi|224461323|gb|ACDC01000079.1| GENE 10 9128 - 9640 819 170 aa, chain - ## HITS:1 COG:FN1483 KEGG:ns NR:ns ## COG: FN1483 COG0503 # Protein_GI_number: 19704815 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 1 170 1 170 170 316 92.0 1e-86 MNLKDYVASIENYPKEGIIFRDITPLMNNGEAYKYATEKIVEFAKNHDIDIVVGPEARGF IFGCPVSYALGVGFAPVRKPGKLPREVVEYAYDLEYGSNKLCLHKDAIKPGQKVLVVDDL LATGGTVEATIKLVEELGGVVAGLAFLIELVDLKGRDKLNNYPMITLMQY >gi|224461323|gb|ACDC01000079.1| GENE 11 9689 - 10471 721 260 aa, chain - ## HITS:1 COG:FN1484 KEGG:ns NR:ns ## COG: FN1484 COG0457 # Protein_GI_number: 19704816 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 19 260 2 245 245 272 84.0 5e-73 MKKMMTIVILSFLFLACFNSQKEKNYNFIKGLNEYQKNDKVSALENYKKAYEMDKNNIVL LNEIAYLYVDLGNYEEAEIYYKKALEIKPNDENSLKNLLQLLYFQDKRIEMKKYIPFIID KNSFTYNLSNFRVAVLENDEMEVEKSLLRISSNNKFLEEYNESFYTELASIAGLSKNTIK YSNIIFEKAYKKYANKEIVDTYANFLIEIKEYRKAEDILMKYIVNNENNLDEYALLKTLY TKENNKEKLENLKKILKNKI >gi|224461323|gb|ACDC01000079.1| GENE 12 10487 - 11176 590 229 aa, chain - ## HITS:1 COG:FN1485 KEGG:ns NR:ns ## COG: FN1485 COG2928 # Protein_GI_number: 19704817 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 210 1 210 223 319 82.0 2e-87 MKLKKNFYTGLLMILPVVITYYIFNWLFNLAFRIINNTIIIKILKRLVDFGFGEKADTFY MQVSVYIAAFLIIFLSITMLGYMTKVVFFSKIIRRAINILERIPIIKTVYSTSKQIIGIV YSDNGESVYKKVVAVEFPRKGLYAIGFLTADKNTALKEILPDKEIVNVFVPTAPNPTSGF LLCLPKEEVYYLNMSVEWAFKLIVSGGYITEDVVKHNEQKAEQKTEENN 
>gi|224461323|gb|ACDC01000079.1| GENE 13 11173 - 12456 1752 427 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 1 426 1 426 426 662 90.0 0 MDTYLNVLILVILILLSGFFSASETALSAYRSNYLEKLDEEKHPKRYAVLKKWLKDPNSM LTGIVIGNNVVNILASSLATVVIVNYFGNKGSSVALATAIMTILILIFGEISPKLMARNN SAKIAEGVSVVIYILSIIFTPLVYCLIFISRFVGRILGVNMESPQLLITEEDIISYVNVG NAEGIIEEDEKEMIHSIVTLGETSAKEVMTPRTSMFSLEGEKTINEIWDEITENGFSRIP VYEETIDNIIGILYVKDLMEHVKNNELDIPIKQFVRSAYFVPETKSIIEILKEFRTLKVH IAMVLDEYGGVVGLVTIEDLIEEIVGEIRDEYDDEEDSFFKKIADNEYEVDAMTDIETIN KELELELPISEDYESLGGLIVTTTGKICEVGDEVQIDNIYLKVLEVDKMRVSKVFIKILE EENKEEE >gi|224461323|gb|ACDC01000079.1| GENE 14 12488 - 12949 595 153 aa, chain - ## HITS:1 COG:FN1487 KEGG:ns NR:ns ## COG: FN1487 COG4492 # Protein_GI_number: 19704819 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Fusobacterium nucleatum # 1 153 1 153 153 222 92.0 2e-58 MAVKSKDKDNKEFYIVDKRILPKSIQNVIKVNDLILKTKMSKYSAIKKVGISRSTYYKYK DFIKPFYEGGEDKVYSLHLSLKDRVGILSDVLDVIAKEKISILTVVQNMAVDGIAKSTIL IKLTQSMLKKVDKIISKIGKVEGIADIRISGSN >gi|224461323|gb|ACDC01000079.1| GENE 15 12962 - 13813 1093 283 aa, chain - ## HITS:1 COG:FN1488 KEGG:ns NR:ns ## COG: FN1488 COG0190 # Protein_GI_number: 19704820 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 1 283 1 283 283 447 86.0 1e-125 MLMDGKDLAKDIKIKLKNEIDDIKRIYGVTPAVASILVGDDPASQVYVNSQIKSYQDLGI AVHKYSFSKEISEAYLLNLIDKLNKDTEVDGIMINLPLPPQINATKVLNRIKLIKDVDGF KAENLGLLFQNSEDFISPSTPAGIMALIEGYKIDLEGKDVVVVGRSNIVGKPVAALVLNN HGTVTICNSHTKNLAEKTKNADVLISAVGKPKFITEDMVKEGAVVIDVGINRVNGKLEGD VDFENVQKKTSYITPVPGGVGALTVAMLLSNILKSFKANRGII >gi|224461323|gb|ACDC01000079.1| GENE 16 13807 - 14739 1040 310 aa, 
chain - ## HITS:1 COG:FN1489 KEGG:ns NR:ns ## COG: FN1489 COG0223 # Protein_GI_number: 19704821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Fusobacterium nucleatum # 1 310 8 317 317 498 83.0 1e-141 MKIIFMGTPTFALPSLEKLNARYELLSVFTKIDKVNARGNKIIYSPIKDFALANNLKIYQ PENFKDNALIDEIRAMEPDLIVVVAYGKILPKEVLDIPKYGVINLHSSLLPRFRGAAPIN AAIIHGDSKSGVSIMYVEEELDAGPVILQKETEISDEDTFLTLHDRLKDMGADLLVEAIE LIKDNKVEPKVQDKNLVTFVKPFKKEDCKIDWTKTSREIFNFIRGMNPAPTAFSMLDKSI IKIYETIIYDKTYDNASCGEVVEYIKGKGPVVKTADSSLIISSAKPENKKQISGVDLING KFLKIGEKLC >gi|224461323|gb|ACDC01000079.1| GENE 17 14758 - 15207 558 149 aa, chain - ## HITS:1 COG:FN1490 KEGG:ns NR:ns ## COG: FN1490 COG1327 # Protein_GI_number: 19704822 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Fusobacterium nucleatum # 1 149 1 149 149 223 92.0 9e-59 MKCPFCSSEDTKVVDSRTMIDGSTKRRRECNHCLKRFSTYERFEESQIYVVKKDNRRVKY DREKLLRGLTFATVKRNISREELDKIISDIERGLQNSLVSEISSKELGEKVLEKLRDLDQ VAYVRFASVYKEFDDIKSFIEIVEQIKKD >gi|224461323|gb|ACDC01000079.1| GENE 18 15224 - 15694 543 156 aa, chain - ## HITS:1 COG:FN1491 KEGG:ns NR:ns ## COG: FN1491 COG1762 # Protein_GI_number: 19704823 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Fusobacterium nucleatum # 1 156 6 161 162 240 85.0 7e-64 MVNSIKITDYITEDLIDLDLKSKNRDEILVELSKLLEKSDNIVGKENDILKALVDREKLG STGIGKGVAIPHAKTESAKSLTVGFGVSREGIDFNSLDEEDVHLFFVFASPSKDSHIYLK VLARISRLIREEDFRDALFNCKTAKEIIECIKEKEE >gi|224461323|gb|ACDC01000079.1| GENE 19 15688 - 16383 525 231 aa, chain - ## HITS:1 COG:FN1492 KEGG:ns NR:ns ## COG: FN1492 COG1381 # Protein_GI_number: 19704824 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Fusobacterium nucleatum # 1 231 1 233 233 
306 77.0 2e-83 MIFLRGKGIIISKKDVEEADRYIDIFMEDYGKVSTLIKGIRKSKKRDKTAVDILSLTDFQ FYKKNDSLIISNFSTVKDYLAIKSDIDKINMAFYIFSILNQILVENGRNRKLYEVLEKTL DYLNNSDNTRKNYLLLLYFLYIVIKEEGISIEGDIDELQFEIPEQKKIDLDKTSRKILEY LFEEKLKIVINDENFELNSVKKAILVLENYINFNLDTNINAKKMLWGALLW >gi|224461323|gb|ACDC01000079.1| GENE 20 16385 - 16936 606 183 aa, chain - ## HITS:1 COG:no KEGG:FN1493 NR:ns ## KEGG: FN1493 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 182 1 182 192 221 74.0 1e-56 MKKFLIVLFILVQGLIFAAGKNLADIKTLKFDVVEKTIIKSKKRELTYKIDFILPNKIKK EVIAPKLNKGEIYMYDYSANKKSVYLPMFNETKESEIVDDENRIIKAINKIIEEEKTNKN FKQKYNAKKAQNLNIDKQISITISTYLEVDGYIFPETVQIKDSGTKIADVKISNLQINPK IEM >gi|224461323|gb|ACDC01000079.1| GENE 21 16933 - 17820 1130 295 aa, chain - ## HITS:1 COG:FN1496 KEGG:ns NR:ns ## COG: FN1496 COG1792 # Protein_GI_number: 19704828 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Fusobacterium nucleatum # 80 283 1 204 210 271 75.0 9e-73 MKKDKESKLKILLPILAIIIVIVLIFNRLLFKLKDQVDKVALPVQSKVYNAANRAVGIKD IIFSYEDIMAENENLKKENMALKLGKIRDEKIYEENERLLKLLAMKENNLYKGELKFARV SFSDINNLNNKVYIDLGEEDNIKVNMIAVYGDSLVGKVSKVYDNYSELELITNPNSIVSA KTEDDVLGIVRGSDEENGLLYFQPSVYEDNLTVGDEIFTSGVSDIYPEGIKIGKIEKVND KENYAYKMIILKPDFENRDLKELIIIGRENKVNRPIVKEKELEEGKEGKEGETKE >gi|224461323|gb|ACDC01000079.1| GENE 22 17868 - 18041 90 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYLIIILYCAIFVKQLMKSILFFWTNKKMIKQVKRELLTVLPFFKWIFMTDKFKNL >gi|224461323|gb|ACDC01000079.1| GENE 23 18028 - 19167 1678 379 aa, chain - ## HITS:1 COG:FN0664 KEGG:ns NR:ns ## COG: FN0664 COG2070 # Protein_GI_number: 19703999 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Fusobacterium nucleatum # 1 379 4 382 382 679 86.0 0 MKGIKIGKYYIEKPIVQGGMGVGVSWNNLAGTVSKNGALGTISGICTAYYDNLKYCKKVV NGRPVGAEALNSKEAMMEIFKNARKICGDKPLACNILHAMNDYAKVVEFAIEAGANIIVT 
GAGLPLELPKLVENHPDVAIVPIVSSARALKIICKKWKAAGRLPDAVIVEGPKSGGHQGA KAEDLFLPEHQLESVVPEVKEERDKWGDFPIIAAGGIWDNDDIQKFMALGADAVQLGTRF IGTYECDASDVFKNILINAKKEDIVIVKSPVGYPGRAIKTNLIKNLVADDQTVKCYSNCV APCNLGEGARKVGFCIANCLSDSYNGKAETGLFFSGENGYRVNKLVSVEELINELMTPNT NENILNIKSENIVENVINF >gi|224461323|gb|ACDC01000079.1| GENE 24 19191 - 19682 474 163 aa, chain - ## HITS:1 COG:no KEGG:FN0663 NR:ns ## KEGG: FN0663 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 156 1 140 143 111 59.0 1e-23 MISRIRENFAQFIESMNIKKEEILKQNKFISLENLLSFYEENKKILLDKKENLLAILNKY FPNINLNINLKFNLDLSFLEKLEIDNIDEIVEKLEQFYEANYIEPVESNLRKKVVEKFKK IIKFTKNIFIDYSDVFLNYTSINLNKKIERAPPYNFDLYLEQK >gi|224461323|gb|ACDC01000079.1| GENE 25 19835 - 21571 2363 578 aa, chain + ## HITS:1 COG:FN1271 KEGG:ns NR:ns ## COG: FN1271 COG0616 # Protein_GI_number: 19704606 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 14 578 1 565 565 764 76.0 0 MKILHYLKKFILFIIKEIFSFFIKLFLFLLIVGVIISLIFKSIEEKPKVVIKDNAYVVID LANSYKERSLSSNLFEDDSINFYNLLTNIKNLSFDDKVSGVVLKLNSNSLSYAQSEELAQ ELSMLRGADKKVIAYFENVNRKNYYLASYADEIYMPSANSTSVNIYPYFREEFYTKKLSD KFGVKFNIIHVGDYKSYQENLAKDTMSKEAREDSTRILNLNYENFLDIVSLNRKLNRDEL DKIIKDGDLVAASSIDLFSNKLIDKYSYWDNLVTLLGGKDKLISIQDYAKNYYQEATLDD SDNIVYVIPLEGDIVESQTEIFSGEANINVNETIAKLNTAKENKKIKAVVLRVNSPGGSA LTSDIIAEKVKELASEKPVYVSMSSVAASGGYYISANANKIYVDRNTVTGSVGVVSVLVD YSSLLKDNGVNVEKISEGEYSDLYSVDTFTEKKYNKIYNSNLKVYEDFLNVVSKGRKIDK EKLKELAEGRVWTGTEAVKNGLADEIGGLYSTIYAITDDNNIDDYTVVLAKDKVEIGNIY KKYSRYIKMDKKDLIKTTVFKDYLYNKPVTYLPYDMLD >gi|224461323|gb|ACDC01000079.1| GENE 26 21662 - 22141 639 159 aa, chain + ## HITS:1 COG:FN0916 KEGG:ns NR:ns ## COG: FN0916 COG3187 # Protein_GI_number: 19704251 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Heat shock protein # Organism: Fusobacterium 
nucleatum # 1 152 1 149 149 201 75.0 6e-52 MKKLLILGIAALALTACTDTKVPFLSSKSNNTNSSSSSSSTGIFANLKEQLNGREFIIVT EGYNSKTSIGFKGDRVYGFSGINRYFGTYQVSGGKFVFGEFGLTRMAGSESEMTLELKFL DILKNNKSVKLSGDTLTLTSTEGIELVFKDPKAAVTQSK >gi|224461323|gb|ACDC01000079.1| GENE 27 22172 - 22666 696 164 aa, chain + ## HITS:1 COG:FN0915 KEGG:ns NR:ns ## COG: FN0915 COG2190 # Protein_GI_number: 19704250 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Fusobacterium nucleatum # 1 164 1 164 164 276 90.0 1e-74 MGLFDIFKKKEKTVVTIYSPMNGKVIELKDVPDEAFAQKMVGDGCAIEPDKGVICSPVDG QLMNIFPTNHALIFETVDGLEMIVHFGIDTVKLDGKGFQKLREAGTIKIGDEIIKYDLDQ ISSGVPSTRSPIIINNMEKVEKIEVLSLSKVVKIGEPIMKVTLK >gi|224461323|gb|ACDC01000079.1| GENE 28 22702 - 23404 731 234 aa, chain + ## HITS:1 COG:no KEGG:FN0914 NR:ns ## KEGG: FN0914 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 231 7 240 243 276 63.0 3e-73 MFMLLFYFVLSSIVFADFTTIELPSDFSISYKKDFSDKEIRNMYYDLKLNDKVSFTCFNN AVSGLEKISYATNELLVLVDYTKPSTEERLFVVDLSKKRVVFSSLVSHGKGNGGLYATTF TDRNNSYASSSGFYLTGNIYNGKHGRSLVLYGLEEGKNDNAERRTIVMHSADYVSEEFIQ KNGSLGRSKGCLALPVELNAKIIDLIHDGVVIYVHTNFDENNEYDFSKLSSNRI Prediction of potential genes in microbial genomes Time: Thu May 19 23:23:35 2011 Seq name: gi|224461322|gb|ACDC01000080.1| Fusobacterium sp. 2_1_31 cont1.80, whole genome shotgun sequence Length of sequence - 11321 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 6, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 333 - 392 14.1 1 1 Op 1 5/0.000 + CDS 439 - 1752 1899 ## COG0672 High-affinity Fe2+/Pb2+ permease 2 1 Op 2 . + CDS 1803 - 2483 1181 ## COG3470 Uncharacterized protein probably involved in high-affinity Fe2+ transport + Term 2492 - 2549 12.2 + Prom 2518 - 2577 11.9 3 2 Op 1 . + CDS 2614 - 4107 1710 ## COG0606 Predicted ATPase with chaperone activity + Prom 4113 - 4172 3.2 4 2 Op 2 . 
+ CDS 4193 - 4624 680 ## COG0716 Flavodoxins + Term 4636 - 4694 10.3 + Prom 4638 - 4697 8.3 5 3 Op 1 . + CDS 4742 - 4855 151 ## 6 3 Op 2 . + CDS 4797 - 5813 1254 ## gi|237739516|ref|ZP_04569997.1| predicted protein 7 4 Op 1 . - CDS 6091 - 7986 2124 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 8 4 Op 2 . - CDS 8006 - 8449 387 ## gi|237739518|ref|ZP_04569999.1| predicted protein - Prom 8536 - 8595 6.3 - Term 8461 - 8496 1.1 9 5 Op 1 2/0.000 - CDS 8660 - 9565 1136 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 10 5 Op 2 . - CDS 9575 - 11068 1808 ## COG1404 Subtilisin-like serine proteases - Prom 11122 - 11181 8.0 - Term 11130 - 11167 4.2 11 6 Tu 1 . - CDS 11203 - 11319 225 ## COG2323 Predicted membrane protein Predicted protein(s) >gi|224461322|gb|ACDC01000080.1| GENE 1 439 - 1752 1899 437 aa, chain + ## HITS:1 COG:FN1251 KEGG:ns NR:ns ## COG: FN1251 COG0672 # Protein_GI_number: 19704586 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Fusobacterium nucleatum # 8 423 1 416 433 736 91.0 0 MKRCFKSLFAFILVFGLFFSLSSIDIEAAEKKTYNTWQDVAKDMNIEFQAAKKFIEEGNN DEAYNAMNRAYFGYYEVQGFEKNVMVNIAAKRVNEIEATFRRIKHTLKGNIQGNVAELDK EIDTLAMKVYKDAMVLDGVASKDDPDDLGNKVFSNEAVSVGDETAIKLKSFGASFGLLLR EGLEAILVVVAIIAYLVKTGNQKLCKQVYIGMGFGVICSFILAYLIDILLGGVGQELMEG ITMFLAVAVLFWVSNWILSRSEEQAWSRYIKSQVQKSIDQNSGRALIFSAFLAVLREGAE LVLFYKAMLTGGQTNKLFAFYGFLVGAVVLVIIYLIFRYSTVRLPLKPFFTFTSILLFLL CISFMGKGVVELTEAGVISGSTTIPAMNGYQNTWLNIYDRAETLIPQIMLVIASVWMLLN NYLKERKMKKEAVEESK >gi|224461322|gb|ACDC01000080.1| GENE 2 1803 - 2483 1181 226 aa, chain + ## HITS:1 COG:FN1252 KEGG:ns NR:ns ## COG: FN1252 COG3470 # Protein_GI_number: 19704587 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein probably involved in high-affinity Fe2+ transport # Organism: Fusobacterium nucleatum # 1 226 1 228 228 340 92.0 2e-93 
MKNFKLLLGALLVLGLVACGEKKEEAKPAEQPAATTEAPKEEAKTEAPAEKPGESGFAEV PIDETVVGPYQVAAVYFQAVDMIPEGKQPSAAESDMHLEADIHLLPDAAKKYGFGDGEDI WPAYLTVNYKVLSEDGKTEITSGTFMPMNADDGAHYGINVKKGLIPIGKYKLQLEIKAPT DYLLHVDSETGVPAAKDGGVAAAEEFFKTQTVEFDWTYTGEQLQNK >gi|224461322|gb|ACDC01000080.1| GENE 3 2614 - 4107 1710 497 aa, chain + ## HITS:1 COG:FN1614 KEGG:ns NR:ns ## COG: FN1614 COG0606 # Protein_GI_number: 19704935 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Fusobacterium nucleatum # 1 497 1 497 497 815 86.0 0 MKNKIFTSSYLGLESYLVEVEVDISRGLPMFSIVGMGDTAILESKFRVKAALKNSDYEVR PQKIVVNLSPAGIKKEGAQFDLAIAIGIILEMKLLRDLREIVKDYLFIGELSLDGEVKGV TGTINTVILAKEKGFKGVILPYENRNEASLIDGIDIVVVKNITDVVNFIENGVKIPFEKI KIEKDENSILDFSDVKGQYFAKRAMEISAAGGHNILLIGSPGSGKSMLAKRMIGILPEMS ENEIIESTKIYSVAGELSEKNPIISKRPVRMPHHSSTLPAMVGGGKKAIPGEISLASNGI LVLDEMSEFKHSVLEALRQPLEDGFVSITRAMYRVEFKTNFLLVGTSNPCPCGMLYEGNC KCSNIEIERYTKKLSGPILDRIDLIVQIKRLNEEELVNSKKGESSAEIRERVIKAREIQY RRFKEIRTNSTMTQEELKKYCDIKDEDKRFLISALENLKISARVYDKILKIARTIADLEG KEELERKHLLEAISFKK >gi|224461322|gb|ACDC01000080.1| GENE 4 4193 - 4624 680 143 aa, chain + ## HITS:1 COG:FN0029 KEGG:ns NR:ns ## COG: FN0029 COG0716 # Protein_GI_number: 19703381 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 143 1 143 143 244 84.0 3e-65 MNKVNIVYYSFTGNTLRMVKAFEKGLQEAGVPFKSYSVVELKNDDEAFDCEILALASPAN QTEAIEKEYFQPFMKRNAERFKDKKIYLFGTFGWGTGMYMSHWIKEVEELGAKIVELPMA CKGSPNSETREKLQGLAKKIATM >gi|224461322|gb|ACDC01000080.1| GENE 5 4742 - 4855 151 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIKLREYDKLSKKGEILNEEIIRIFINNFSYAHACF >gi|224461322|gb|ACDC01000080.1| GENE 6 4797 - 5813 1254 338 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739516|ref|ZP_04569997.1| ## NR: gi|237739516|ref|ZP_04569997.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 338 1 338 338 605 100.0 1e-171 MKKLFVFLLIISVMLMLVFNSSYKKNPHLSKINDDVSIAKMKIKEKSYFDDSKVEEKIYG DEEEKLLDHILHIKYDDLPIYQIVVTISEQLNTSCEFIIVYNEEYKSGDVSFVWETKEKE VSFEIPITKRNKKYCVMELSELTSSTMNDIDEDEELTSEEKESLKAKTYREAWSPDLFIR FNGEGNFFTLEDIKSLDEIRDLVGASNQNSDIIAEKNILDFAEGNYEISEYASAEFLAEI MKANKSHALPFVYTGELSIESLTDAIYSNLGADRAIIDGAAGNKIGAYLSVTYYKNDKQL AVLYFMLDEKLVGTPDIRLEFSNGKELKSWDVINYIQK >gi|224461322|gb|ACDC01000080.1| GENE 7 6091 - 7986 2124 631 aa, chain - ## HITS:1 COG:FN2102 KEGG:ns NR:ns ## COG: FN2102 COG0488 # Protein_GI_number: 19705392 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 631 1 631 631 1037 96.0 0 MAILQVNDIYMGFSGETLFKEISFSVDEKDKIGLIGVNGAGKTTLIKLLLGLENSEINPA TNERGTISKKSNLKVGYLAQNTQLNKENTVFNELMTVFNNLLEDYNRMQEINFLLTVDLD NFDKLMEELGEVSERYERHEGYSIEYKIKQILNGLNIPENLWTMKIGNLSGGQNSRVALA KILLEEPDLLILDEPTNHLDLTSIEWLEKILKDYNKAIILISHDVYFLDNVVNRVFEIEG KRLKDYKGNYTDFLIQKEAYLSGEVKAYEKEQDKIKKMEEFIRRYKAGVKSKQARGREKI LNRMEKMENPVVTTQKIKLKFDIKAQSVDLVLDIKNLSKTFEDKLLFKDLNLKVYRGERI GLIGKNGTGKSTLLKIINNLEKASSGEFKIGERVSIGYYDQNHQGLGLNNNIIEELMYYF TLSEEEARNICGAFLFREDDIYKKISSLSGGEKARVAFMKLMLEKPNFLILDEPTNHLDI YSREILMDALEDYPGTILVVSHDRNFLDTVVTKIYELKTDGVETFDGDYENYKQERDNVK VKNEEAVKSYEEQKKAKNRIASLEKKLVRLEEEIQKIEEEKEEVNKKYLLAGEKNDVDKL MSLQEELDNLDNKILEKYQEYEETEIELKSL >gi|224461322|gb|ACDC01000080.1| GENE 8 8006 - 8449 387 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739518|ref|ZP_04569999.1| ## NR: gi|237739518|ref|ZP_04569999.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 147 41 187 187 241 99.0 1e-62 MKFCFLTDNFFELYKDCEEIEKKNNRPYATVCLLKYKDLYFAIPIRHHIKHQYAIFTDKE KTKGLDLSKTLIIKDLKYITQNKTAFISQSEYSQLITKETFIISKLSSYIKKYIKALEHQ EIKKNYLLCSMSCLKYFHSELNIKTSY >gi|224461322|gb|ACDC01000080.1| GENE 9 8660 - 9565 1136 301 aa, chain - ## HITS:1 COG:FN2101 KEGG:ns NR:ns ## COG: FN2101 COG0697 # Protein_GI_number: 19705391 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 301 1 301 301 424 85.0 1e-118 MKKNDNTGMLSTFVGGTLWGINGVMGNYLFLNKNVTTPWLIPYRLILAGFLLLGYLYYKK GSKIFDILKNPKDLVQIVLFGFIGMLGTQYTYFSAIQFSNAAIATVLTYFGPTLVLIYMC LREKRKPLKYEIVSICLSSFGVFLLATHGDITSLQISFKALVWGILSALSVVFYTVQPES LLKKYGASIVVAWGMMIGGIFIAFVTKPWNINVTFDFITFLVLMLIIVFGTIIAFILYLT GVNIIGPTKASIIACIEPVAATICAILFLGVTFDFLDVIGFLCIISTIFIVAYFDKKAKK K >gi|224461322|gb|ACDC01000080.1| GENE 10 9575 - 11068 1808 497 aa, chain - ## HITS:1 COG:FN2100 KEGG:ns NR:ns ## COG: FN2100 COG1404 # Protein_GI_number: 19705390 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Fusobacterium nucleatum # 82 497 1 416 416 699 86.0 0 MRVRLAKEDVNSNYKVSLNDVSREKDFVKILEDYNIKYKRTEYFKDLFMYKLIDINSKFI MILQEKASNYIKYIEPVSIYSLPLQIEDEDGEIPVVYPEENKDYVTLGVIDNGIAHIKHL DPWIKRVHTRFLREETSTTHGTFVSGIALYGDKLENREIVKNEPFYLLDATVLSATTIEE DDLLKNIALAIEENYKRVKIWNLSLSVRLGIEEDTFSDFGVVLDHLQKTYGVLIFKSAGN GGNFMKQLPKGKLYHGSDSLLSLVVGSITNEGYASNYSRVGLGPKGTIKPDIASYGGDLL RGDNGEMIMKGVNSFSRNGNVASSSGTSFATARISSLATIIYQNICKDFKDFSDFNPILL KALIIHSAKNTDKNLSVEEIGYGIPSTSTEILSYFKNENIKIFNGVMEKNKEIELDAAFF NYKKDIKIKLTLVYDTEFDYLQKSEYIKSDIKIKNISENGKNLTRKFEGILERNKKIELY SDNDIKKNYTLIIEKLN >gi|224461322|gb|ACDC01000080.1| GENE 11 11203 - 11319 225 38 aa, chain - ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: 
Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 38 173 210 210 62 71.0 2e-10 IDKDTEWLETRLKEMGYDNISDIFLAEYDNGKITVVTY Prediction of potential genes in microbial genomes Time: Thu May 19 23:23:59 2011 Seq name: gi|224461321|gb|ACDC01000081.1| Fusobacterium sp. 2_1_31 cont1.81, whole genome shotgun sequence Length of sequence - 2451 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 207 281 ## COG2323 Predicted membrane protein 2 1 Op 2 . - CDS 207 - 659 560 ## FN0037 hypothetical protein 3 1 Op 3 . - CDS 726 - 959 58 ## - Prom 993 - 1052 5.1 + Prom 724 - 783 11.6 4 2 Op 1 11/0.000 + CDS 1003 - 1836 860 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 5 2 Op 2 . + CDS 1846 - 2449 797 ## COG0352 Thiamine monophosphate synthase Predicted protein(s) >gi|224461321|gb|ACDC01000081.1| GENE 1 3 - 207 281 68 aa, chain - ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 68 1 68 210 108 94.0 2e-24 MGLSYLDIAIKLTMGLLSLVLVINISGKGNLAPSSAMDQVLNYVLGGIVGGVIYNPSITV LQYFIILM >gi|224461321|gb|ACDC01000081.1| GENE 2 207 - 659 560 150 aa, chain - ## HITS:1 COG:no KEGG:FN0037 NR:ns ## KEGG: FN0037 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 150 1 150 150 234 80.0 5e-61 MRFYSYNYLLEQIAKFDWWGAVFTLFLIICLIFTLFKYNKGHKESKFRELAIIFTLTIIV VISIKITQYQESHINDNRYRQAVHFIEVVAEDLKTDKENIYINTSASIDGALVRIGTLYF RVISGDNGENYLLEKIDLENPKVELIEVRK >gi|224461321|gb|ACDC01000081.1| GENE 3 726 - 959 58 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLCNLKTNFLLPTLALPKSGQKVKAQHPFLSQFIDSLVTSFYYLIFFIFDIIAHILFLYN TIFTLLFFTCKHTKNVI >gi|224461321|gb|ACDC01000081.1| GENE 4 1003 - 1836 860 277 aa, chain + ## HITS:1 COG:FN1759 KEGG:ns NR:ns ## COG: FN1759 
COG0351 # Protein_GI_number: 19705080 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Fusobacterium nucleatum # 1 277 13 289 289 426 87.0 1e-119 MKNVLSIAGSDCSAGAGIQADLKTFVANGVYGMTVITSLTAQNPQKVKMVEDVSIEMLKN QLEAILDVIEVSAIKIGMINSKENAELIYDSLLKYKVKNIVLDPIMISTSGKSLIKNETK DFLVNKLFKLVDIITPNLDETTEIVKMILNNENIENIDSVEKMQSYGKIIADFTKKWVLV KGGHLSNSAVDILLNSDEIYILEGEKIPNNKTHGTGCSLSSAIASNLAKGYSMLDSVKKA KNFVLYSIKNSIDFGEIGGTVNQMGEIYKNIDIEKLY >gi|224461321|gb|ACDC01000081.1| GENE 5 1846 - 2449 797 201 aa, chain + ## HITS:1 COG:FN1758 KEGG:ns NR:ns ## COG: FN1758 COG0352 # Protein_GI_number: 19705079 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Fusobacterium nucleatum # 1 199 1 199 206 319 87.0 3e-87 MELKACKIYLVTDEKACLGKGFYVCIEEAIKGGVKIVQLREKNISTKDFYEKALKVKEIC ENYGALFIINDRLDIAQAVGADGVHLGQSDMPIEKAREILKDKFLIGATARNIEEAKRAE LLGADYIGSGAIFGTNTKDNAKKLEMEELKKIVTSVKIPVFAIGGINIDNVSSLKNIGLQ GICAVSGILSEKDCKKAVDMM Prediction of potential genes in microbial genomes Time: Thu May 19 23:24:10 2011 Seq name: gi|224461320|gb|ACDC01000082.1| Fusobacterium sp. 2_1_31 cont1.82, whole genome shotgun sequence Length of sequence - 8299 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 32 - 1333 1946 ## COG0422 Thiamine biosynthesis protein ThiC 2 1 Op 2 5/0.000 + CDS 1343 - 1537 369 ## COG2104 Sulfur transfer protein involved in thiamine biosynthesis 3 1 Op 3 5/0.000 + CDS 1541 - 2158 985 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 4 1 Op 4 5/0.000 + CDS 2155 - 2928 1118 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 5 1 Op 5 3/0.000 + CDS 2928 - 4058 1174 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 6 1 Op 6 . 
+ CDS 4061 - 4678 565 ## COG0352 Thiamine monophosphate synthase + Term 4912 - 4961 -0.2 + Prom 4964 - 5023 15.2 7 2 Op 1 . + CDS 5059 - 5385 543 ## FN1742 V-type sodium ATP synthase subunit G (EC:3.6.3.15) 8 2 Op 2 16/0.000 + CDS 5372 - 7285 1882 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I + Prom 7426 - 7485 7.3 9 2 Op 3 11/0.000 + CDS 7513 - 7995 772 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 10 2 Op 4 . + CDS 8011 - 8299 252 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E Predicted protein(s) >gi|224461320|gb|ACDC01000082.1| GENE 1 32 - 1333 1946 433 aa, chain + ## HITS:1 COG:FN1757 KEGG:ns NR:ns ## COG: FN1757 COG0422 # Protein_GI_number: 19705078 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Fusobacterium nucleatum # 1 433 1 433 433 801 95.0 0 MYKTQMEAAKKGILTKEMKSIAESEAMDEKILMQRVASGEIAIPANKNHSSLVAKGVGSG LSTKINVNLGISKDCPDVDKELEKVKVAIDMKADAIMDLSSFGKTEEFRKKLIAMSTAMV GTVPVYDAIGFYDKELKDIKAEEFLDVVRKHAEDGVDFVTIHAGLNREAVELFKRNERIT NIVSRGGSLMYAWMELNNAENPFYENFDKLLDICEEYDMTISLGDALRPGCLNDATDACQ IKELITLGELTKRAWKRNVQIIIEGPGHMAIDEIEANVKLEKKLCHNAPFYVLGPLVTDI APGYDHITSAIGGAIAAAAGVDFLCYVTPAEHLRLPNLDDMKEGIIASRIAAHAADISKK VPKAIDWDNRMAKYRADINWEGMFAEAIDEEKARRYRKESTPENEDTCTMCGKMCSMRTM KKIMSGEDLNILK >gi|224461320|gb|ACDC01000082.1| GENE 2 1343 - 1537 369 64 aa, chain + ## HITS:1 COG:FN1756 KEGG:ns NR:ns ## COG: FN1756 COG2104 # Protein_GI_number: 19705077 # Func_class: H Coenzyme transport and metabolism # Function: Sulfur transfer protein involved in thiamine biosynthesis # Organism: Fusobacterium nucleatum # 1 64 1 64 64 94 90.0 3e-20 MAKINGKYEEINDVNLLDYLIENKYRVDRVVVDYNGDIVKKAEFSKINIKNTDKIEIVCF VGGG >gi|224461320|gb|ACDC01000082.1| GENE 3 1541 - 2158 985 205 aa, chain + ## HITS:1 COG:FN1755 KEGG:ns NR:ns ## COG: FN1755 COG0476 # Protein_GI_number: 19705076 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in 
molybdopterin and thiamine biosynthesis family 2 # Organism: Fusobacterium nucleatum # 1 161 1 161 165 229 79.0 2e-60 MDLKEENLLKRNVKGISEKLKKAKVCILGLGGLGSNVAILLARAGIGYLKLVDFDIVEAS NLNRQQYRISHIGMKKTEAIRTIIKEINPFVEVKTLDIKVDRENILSIVGDVEIVVEAFD RAETKAMAIEELLINGDKILVSASGMAGLGSANEIITRKIRDNFYLVGDNYSDYEEYSGI MSTRVMICAAHQANIVLRIILGEEK >gi|224461320|gb|ACDC01000082.1| GENE 4 2155 - 2928 1118 257 aa, chain + ## HITS:1 COG:FN1754 KEGG:ns NR:ns ## COG: FN1754 COG2022 # Protein_GI_number: 19705075 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Fusobacterium nucleatum # 1 255 1 255 257 459 94.0 1e-129 MSDSFKLGNKEFNSRFILGSGKYSNELINSAINYAGAEIVTVAMRRAISGVQENILDYIP KNITLLPNTSGARSAEEAVKIARLARECTQGDFIKIEVIKDSKYLLPDNYETIKATEILA KEGFIVMPYMYPDLNVARTLKDVGASCIMPLAAPIGSNRGLITKEFIKILIDEIDLPIIV DAGIGKPSQACEAMEMGVTAIMANTAIATASDIPRMAQAFKYAIQAGRDAYLAKLGRVLE NGASASSPLTGFLNGED >gi|224461320|gb|ACDC01000082.1| GENE 5 2928 - 4058 1174 376 aa, chain + ## HITS:1 COG:FN1753 KEGG:ns NR:ns ## COG: FN1753 COG1060 # Protein_GI_number: 19705074 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Fusobacterium nucleatum # 1 376 1 376 376 655 89.0 0 MELENINSDIMDRVISEMNNYDYNSFTDEDIREALNKVYLSVRDFQALLSPKAMNYLEEM AKKAKECRERYFGNSVYIFTPLYISNYCDNYCVYCGFNSHNKIKRARLDFEQIEAELKEI SKTGLEEILILTGESERYSSIEYIGEACKLARKYFNNVGIEIYPVNVEDYKYLHSCGVDY VTIFQETYNNEKYKKLHLEGHKKVFSYRFNSQERALMGNMRGVAFGALLGLDDFRKDAFS TGYHAYLLQKKYPHTEISISCPRLRPVINNIKIEEEFVSEKELFQIICAYRLFLPFANIT ISTRENSKFRDNVIKIAATKISAGVDTGIGAHSEHSNKKGDEQFEIADRRTVSEIFEKIK TEDLQPVMNDYIYLKD >gi|224461320|gb|ACDC01000082.1| GENE 6 4061 - 4678 565 205 aa, chain + ## HITS:1 COG:FN1752 KEGG:ns NR:ns ## COG: FN1752 COG0352 # Protein_GI_number: 19705073 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Fusobacterium nucleatum # 
3 205 4 206 206 304 79.0 6e-83 MINKIKLNIISNRKLCENENLEKQIEKIFLAYERKIILKNFDIVAFTLREKDLNKNEYLK LIEKVYPICQKYKINLILHQNYDLNLDDKYKIDGIHLSYNIFKSLNENIKAELIKKYKRI GVSIHSLDEAKEVENLGASYVIAGHIFETDCKKGLKPRGLKFVEDLSSALTIPIFAIGGI DEKNSQSVIDSGAFSVCMMSNLMKY >gi|224461320|gb|ACDC01000082.1| GENE 7 5059 - 5385 543 108 aa, chain + ## HITS:1 COG:no KEGG:FN1742 NR:ns ## KEGG: FN1742 # Name: not_defined # Def: V-type sodium ATP synthase subunit G (EC:3.6.3.15) # Organism: F.nucleatum # Pathway: Oxidative phosphorylation [PATH:fnu00190]; Metabolic pathways [PATH:fnu01100] # 1 108 1 108 108 83 67.0 3e-15 MATDAILKVKDAELKAKEIIEKANQEIALLKEETREQIKKFQKDAIETAIKNAEILKTKY KTEGEAIASPIFKEAEQKVLAIKDVKEDKLESVIELIVERIVNSNGNS >gi|224461320|gb|ACDC01000082.1| GENE 8 5372 - 7285 1882 637 aa, chain + ## HITS:1 COG:FN1741 KEGG:ns NR:ns ## COG: FN1741 COG1269 # Protein_GI_number: 19705062 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Fusobacterium nucleatum # 1 637 1 638 638 846 73.0 0 MAIVKMKKFKLFALEKDRKSLLKELQKFSYVHFVKTKEEDKSLKDIEFNQDMTVIKEKSQ KVKWMLDYFLKLFPKDTKKEIDESSVKETLFSLLEQQASKYDFSNDYENLANISREMDSN KEEIANLETYRKELSKWLNIKESLGNLKAFNTAKFFLGTVAKKNFEPLKDNLRNFDHTYI EEISDESSQVNIMLLTSNTEEKKLKNELKTYSFTETNFDFDTSFTDEYEKTKNREEELKK ANEKLKEKVEKLLKLIPKLLIQKEYLDNALMRETVVSNFKATDTVNVIEGYIPLDMEEEF KKIINKNSNKSNYLEITEVDKDDEEVPILLKNSGITGLFASITQMYALPKYNEIDPTAIL SIFYWIFFGMMVADFAYGLILFILSGLALMIGKFDENKRKFLKFFFALSFSTMIWGLLYG SAFGDLIKLPTQVLDSSKDFMSIFILSIIFGAIHLVIALGIKAYILIKNGHFMDVIYDVF LWYLTLTSLIILLLAGRFGLSEFTKNIFIACAVIGMLGIVVFGARDAKTLVGRIGGGLYS LYGITSYIGDFVSYLRLMALGLAGGFIASAINIIVKMLVSKGILGIILGVVVFTLGQSFN IFLSFLSSYVHTSRLTYVEFFSKFYEGGGKAFKKFRV >gi|224461320|gb|ACDC01000082.1| GENE 9 7513 - 7995 772 160 aa, chain + ## HITS:1 COG:FN1740 KEGG:ns NR:ns ## COG: FN1740 COG0636 # Protein_GI_number: 19705061 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium 
nucleatum # 1 160 1 160 160 216 90.0 1e-56 MENIMTIFQQYGGVVFGVLGAALAVLLSGIGSARGVGIAGEAAAGLIIDEPEKFGKAMVL QLLPGTQGLYGFVIGLLIMFKLSPDMTIAEGLYLLMAGLPVGFVGLRSALYQGQVAVAGI NILAKNETHQTKGIVLAVMVETYAVLAFVMSLLLLNQVQF >gi|224461320|gb|ACDC01000082.1| GENE 10 8011 - 8299 252 96 aa, chain + ## HITS:1 COG:FN1739 KEGG:ns NR:ns ## COG: FN1739 COG1390 # Protein_GI_number: 19705060 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Fusobacterium nucleatum # 1 96 1 96 183 91 71.0 4e-19 MSNLDKLVAEILQQAQKEANRMLTKAKTENSEFSEKENKKIQKEVDAINDKAQEEAQALK ERVISNANLKSRDMILQAKEELADDILEKVLERLKN Prediction of potential genes in microbial genomes Time: Thu May 19 23:24:15 2011 Seq name: gi|224461319|gb|ACDC01000083.1| Fusobacterium sp. 2_1_31 cont1.83, whole genome shotgun sequence Length of sequence - 9446 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 5 - 304 327 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E 2 1 Op 2 13/0.000 + CDS 316 - 1317 1129 ## COG1527 Archaeal/vacuolar-type H+-ATPase subunit C 3 1 Op 3 12/0.000 + CDS 1310 - 1618 480 ## COG1436 Archaeal/vacuolar-type H+-ATPase subunit F 4 1 Op 4 16/0.000 + CDS 1636 - 3405 2855 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 5 1 Op 5 16/0.000 + CDS 3398 - 4774 2207 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 6 1 Op 6 . + CDS 4786 - 5421 620 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D + Term 5656 - 5717 12.1 - Term 5655 - 5692 6.4 7 2 Tu 1 1/0.500 - CDS 5724 - 7040 1721 ## COG0733 Na+-dependent transporters of the SNF family - Prom 7066 - 7125 6.8 - Term 7077 - 7134 4.0 8 3 Tu 1 . - CDS 7175 - 8557 2286 ## COG3033 Tryptophanase - Prom 8613 - 8672 10.1 + Prom 8707 - 8766 10.4 9 4 Tu 1 . 
+ CDS 8792 - 9446 571 ## COG1802 Transcriptional regulators Predicted protein(s) >gi|224461319|gb|ACDC01000083.1| GENE 1 5 - 304 327 99 aa, chain + ## HITS:1 COG:FN1739 KEGG:ns NR:ns ## COG: FN1739 COG1390 # Protein_GI_number: 19705060 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Fusobacterium nucleatum # 13 98 97 182 183 95 65.0 2e-20 MTLALVGYQSWEVQTKKYLKFVENILKNLNLSKNAEVMVSKDMKLALGDKILDYKISDKT VESGCSIKDGNLIYNNEFSNLIEFNREELEREILNKIFE >gi|224461319|gb|ACDC01000083.1| GENE 2 316 - 1317 1129 333 aa, chain + ## HITS:1 COG:FN1738 KEGG:ns NR:ns ## COG: FN1738 COG1527 # Protein_GI_number: 19705059 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit C # Organism: Fusobacterium nucleatum # 1 333 2 334 334 489 80.0 1e-138 MDREKFVQASVRIRNLEKKLLTKIQFEKLYEAENLEEAVRHLNETAYSEDLAKIDRAENF EIALSNSLNRTYSEVLKLSPVKELVDVLTYRFAFHNIKLAVKEKILQENFEHIYSKVHYE DLPKLKKQFETEKGEKGTWYEDTVIQAYKVFEDTKDPEKIEFFVDKRYFEKVLEVSKNLG LDLIEEYFKNMIDFLNIRTFIRCKRDEQDISILKAALIQDGYIDTEDISSYFYKDIEDLI NSYKNSRIGKSLILALKGYNDTGRLLLFEKYMENFLTNLLKEKVQRMPYGPEIIFAYVHA KEVEIKNLRICLVGRANGLSADFIKERLREIYV >gi|224461319|gb|ACDC01000083.1| GENE 3 1310 - 1618 480 102 aa, chain + ## HITS:1 COG:FN1737 KEGG:ns NR:ns ## COG: FN1737 COG1436 # Protein_GI_number: 19705058 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit F # Organism: Fusobacterium nucleatum # 1 102 4 105 105 157 87.0 5e-39 MYKIAIVGDKDSVLAFKILGVDVYISLDAQEARKIIDRISKEGYGIIFVTEQVAKDIPET IKRYNSELIPAIILIPSNKGSLNIGLANIDKNVEKAIGSNIL >gi|224461319|gb|ACDC01000083.1| GENE 4 1636 - 3405 2855 589 aa, chain + ## HITS:1 COG:SPy0154 KEGG:ns NR:ns ## COG: SPy0154 COG1155 # Protein_GI_number: 15674362 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Streptococcus pyogenes M1 GAS # 1 585 1 590 591 739 62.0 0 
MKEGKIIKVSGPLVVAEGMEEANVYDVVEVSDNKLIGEIIEMRGDKASIQVYEETTGIGP GDVVVTTGSPLSIELGPGMLEQMFDGIQRPLLKIQEAVGDFLLKGVSVPALDREKKWQFN PTVKVGEEVEPGKVIGTVQETEIVLHKIMVPNGVYGKVKEIKEGEFTVEETICKLETENG VKELNMIQKWPVRKGRPYLKKLNPVKPLITGQRIIDTFFAVTKGGTAAIPGPFGSGKTVI QHQLAKWADAEVVVYVGCGERGNEMTDVLMEFPEIIDPKTGQSLMKRTVLIANTSNMPVA AREASIYTAITIGEYFRDMGYSVALMADSTSRWAEALREMSGRLEEMPGDEGYPAYLSSR IAEFYERAGLVECLGNGEEGALTVIGAVSPPGGDISEPVSQSTLRIAKVFWGLDYALSYR RHFPAINWLNSYSLYQAKMDKYKEEHVDRDFPKFRIEAMALLQEEAKLQEIVRLVGRDSL SEFDQLKLEVTKSLREDFLQQNAFHEVDTYCSLDKQFKMLKLILFFYDEAQRAIKEGVYL NEILALPSREKITRAKNISEKELDTFDKIEEEIKEAVSKLIKEGGTTNA >gi|224461319|gb|ACDC01000083.1| GENE 5 3398 - 4774 2207 458 aa, chain + ## HITS:1 COG:FN1734 KEGG:ns NR:ns ## COG: FN1734 COG1156 # Protein_GI_number: 19705055 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Fusobacterium nucleatum # 1 458 1 458 458 878 97.0 0 MLKEYKSVQEIVGPLMIVEGVEGIKYEELVEIQTQTGEKRRGRVLEIDGDRAMIQLFEGS AGINLKDTTVRFLGKPLELGVSEDMIGRIFDGLGNPIDKGPKIIPEKRVDINGSPINPVS RDYPSEFIQTGISTIDGLNTLVRGQKLPIFSGSGLPHNNVAAQIARQAKVLGDDAKFAVV FGAMGITFEEAQFFIDDFTKTGAIDRAVLFINLANDPAIERISTPRMALTCAEYLAFEKG MHVLVILTDLTNYAEALREVSAARKEVPGRRGYPGYLYTDLSQIYERAGKIKGKPGSITQ IPILTMPEDDITHPIPDLTGYITEGQIILSRELYKSGIQPPIFVIPSLSRLKDKGIGKGK TREDHADTMNQIYAAYASGREARELAVILGDSALSDADKAFAKFAENFDREYVSQGYETN RNIEETLNLGWKLLKVIPRTELKRIRTEYIDKYLNDKD >gi|224461319|gb|ACDC01000083.1| GENE 6 4786 - 5421 620 211 aa, chain + ## HITS:1 COG:FN1733 KEGG:ns NR:ns ## COG: FN1733 COG1394 # Protein_GI_number: 19705054 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Fusobacterium nucleatum # 1 211 1 211 211 306 95.0 2e-83 MAKLKVNPTRMALSELKLRLVTAKRGHKLLKDKQDELMRQFINLIKENKKLRVEVEKELS ESFKSFLLASATMSPLFLESAVSFPKEKLSVEIKSKNIMSVNVPEMKFVKEEMEGSIFPY GFVQTSAELDDTVIKLQKVLDNLLSLAEIEKSCQLMADEIEKTRRRVNALEYSTIPNLEE TVKDIRMKLDENERATITRLMKVKQMLEKNA >gi|224461319|gb|ACDC01000083.1| GENE 7 5724 - 7040 1721 
438 aa, chain - ## HITS:1 COG:FN1989 KEGG:ns NR:ns ## COG: FN1989 COG0733 # Protein_GI_number: 19705285 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 1 438 1 438 438 698 90.0 0 MDNSERKFQSKLGFILTCVGSAVGMANIWAFPYRVGKYGGAVFLLIYFMFIALFSYVGLS AEYLIGRRAGTGTLGSYEYAWNEKGKGKLGYTLAYIPLLGSMSIAIGYAIISAWVLRTFG AAVTGKILEVDTAQFFGEAVQGNFVILPWHIAVIVITLLTLFAGASSIEKTNKIMMPAFF VLFFILAVRVAFLPGAIEGYKYLFVPDWSYLFNVETWVNAMGQAFFSLSITGSGMIVCGA YLDKKEDIVNGALQTGIFDTLAAMIAAFVVIPASYAFGYPAGAGPSLMFMTIPAVFKQMP FGHVLAILFFISVVFAAVSSLQNMFEVVGESIITRFKMSRKAVIFLLAIVSLVIGIFIEP ENKVGPWMDVVTIYIIPFGAVLGAISWYWILKKESFMEELNEGSKVKRSDLYFTVGRYVY VPLVLVVFVLGLIYHGIG >gi|224461319|gb|ACDC01000083.1| GENE 8 7175 - 8557 2286 460 aa, chain - ## HITS:1 COG:FN1988 KEGG:ns NR:ns ## COG: FN1988 COG3033 # Protein_GI_number: 19705284 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Fusobacterium nucleatum # 1 460 1 460 460 924 97.0 0 MRFEDYPAEPFRIKSVETVKMIDKATREEVIKKAGYNTFLINSEDVYIDLLTDSGTNAMS DKQWGGLMQGDEAYAGSRNFFHLESTVQEIFGFKHIVPTHQGRGAENILSQIAIKPGQYV PGNMYFTTTRYHQERNGGIFKDIIRDEAHDATLNVPFKGDIDLNKLQKLIDEVGAENIAY VCLAVTVNLAGGQPVSMKNMKAVRELTNRYGIKVFYDATRCVENAYFIKEQEEGYQDKTI KEIVHEMFSYADGCTMSGKKDCLVNIGGFLCMNDEELFLKAKEMVVVYEGMPSYGGLAGR DMEAMAIGLKESLQYEYIRHRVLQVRYLGEKLKEAGVPILEPVGGHAVFLDARRFCPHIP QEEFPAQALAAAIYVECGVRTMERGIISAGRDVKTGENHKPKLETVRVTIPRRVYTYKHM DIVAEGIIKLYKHKEDIKPLEFVYEPKQLRFFTARFGIKK >gi|224461319|gb|ACDC01000083.1| GENE 9 8792 - 9446 571 218 aa, chain + ## HITS:1 COG:FN1987 KEGG:ns NR:ns ## COG: FN1987 COG1802 # Protein_GI_number: 19705283 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 3 218 1 216 216 315 85.0 4e-86 MKVVKDLLSEQIYKILKEDIINSRINFGEVLVNKNLQERFEVSSTPIRDAILRLKEDGIV EEVTRSGAKLIDFDPHFACEVNQLIMTITLGVIEYSLKNPENRNEILANLKKYVELEEDN LSTDLYYDYDYHFHKTFFDYSNNKLLKDLFKKYNLINEILVKAYHKGAISLKNRKACLED 
HENIIKSIEENNISLTLDLTKKHYLSAEKIFKKLIKIN Prediction of potential genes in microbial genomes Time: Thu May 19 23:24:18 2011 Seq name: gi|224461318|gb|ACDC01000084.1| Fusobacterium sp. 2_1_31 cont1.84, whole genome shotgun sequence Length of sequence - 7875 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 255 - 4286 5083 ## FN0498 hypothetical protein 2 1 Op 2 . - CDS 4312 - 6555 2989 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 6588 - 6647 15.6 + Prom 6723 - 6782 8.7 3 2 Tu 1 . + CDS 6853 - 7779 914 ## COG2342 Predicted extracellular endo alpha-1,4 polygalactosaminidase or related polysaccharide hydrolase Predicted protein(s) >gi|224461318|gb|ACDC01000084.1| GENE 1 255 - 4286 5083 1343 aa, chain - ## HITS:1 COG:no KEGG:FN0498 NR:ns ## KEGG: FN0498 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 766 1343 1 583 583 560 57.0 1e-157 MKKNMLMLMMLIAISSSTYSNTKYIYKNPNYRVPLTNEYVVTIEQGAYNDFLRVVDEKNR RDGVFSNYSNKATKKAQRLHDNVDLIPVAHYTDEYDSYIVKYPIKSPPTDSDVARSFHSI HSIAGGATDGITQEDLNKLSYTRSSYQKRFYFGNGNSINDIIFVNQSNFDNEVTKAKKDN NERYLVEGVYQTIGSRNNNIGASLDEKNTNLLDISMDEYKSKIEGKSRTDVAAFLKEKME EKGVTGLIQKGDELYTKDSRNTEWKVLWTLEPVSLHDTLSGKFGDTIFTKIYTYKEFDTN SSTDSKGRLLYTKNNNIYLEDKVNYSNELSLKDDPSKNISDLIKEAKEKAERGETPSTPL EKYYSDKKTLSDADFNAKWVDPFKNGDFDRDLANMKREVDIAREKLVKATKEKDDADSHK KAIEQDSDWDDTLKSEAFWWKYKSESEKNTYLANLSEKQKNLFLEYVKYSAIFSEKDDEV DTLNSDIQYDIPGRYGFNQYSWNPNDKRWLDKVIKDSTIIRSLIGKDIEFRGRGRIDGTI DLGEGNNTITITEQFTGRYGTNIILGPYAKIKNVHTIFVGGQLGSDQGVSISGKSSLTMD IDPSIKNSEGNLVQHALKDSDPNIIFRSVSSTLNLNNRNQFQIELMASRIANDSTIDIGR KIDYEYYDVEKGKFDMKLSFVSDSIAHTLSDNKKFSKNGTSLIDVKVKDSIKRLTNEENE VYKSIKNAQKLGILFPTLTTTNKRTTFTVLEDEKQENKVKNLVSYLKSKSSEELINDLSE FNLSEENKKEVISLVEKIKETSIKEYDKKNKKNKDDYDAFLNDKKSLNEKLKIVEGLKSE EDYKRLNLEGFGTNLETLKISQLIDEGDYVLKTKMDSDLIKISNEISKINVNDLEKLKDK 
YTNLKFKDIFDKLNVVKTKLANLNDTNYPDEDDFRVAYQYFLDDLNDLQKATNEQLSYSP DKVNKEIEALDKKLLALSEKLKNPDKDLIDSLEYYSGNEGSDAYQKLKNLIYYTMREEEV LTELKNMLNQLSDRNIYSKLNKISKNEISTYTNIPFEVTHALTDKKHIARGGFISNRTVQ DNFKGNIYTAYGLYEKTAESGTKYGFMIGGANTKHNEVYQRSLTTVATESDIKGVSAYVG GYFNKPIVNNLNWITGVGAQYGTYKVKREMRNNYQDLHSQGKVSTNALNTYSGLIMNYPI QEDVFVQLKALLAYTMVKQSKVNESGDLPLDISAKTYHYVDGEAGISFNKIFYGDNLRSS ISAGAYGITGLAGYKNGDMDAKIDGSTSSFGIKGDRIKKDAVKINLDYNVQTDIGYNYGL EGTYISNSKENNVKIGIKAGYTF >gi|224461318|gb|ACDC01000084.1| GENE 2 4312 - 6555 2989 747 aa, chain - ## HITS:1 COG:FN0499 KEGG:ns NR:ns ## COG: FN0499 COG1629 # Protein_GI_number: 19703834 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 7 747 1 743 743 1158 81.0 0 MKKLLVLLSIISSIASFSEDVIELGQTTVKGSKTSDYTAPPKEQKNTFVITQERIREKNY KNVEDILRDAPGVVVQNTAFGPRIDMRGSGEKSLSRVKVLVDGVSINPTEETMASLPINA IPVESIKKIEIIPGGGATLYGSGSVGGVVSISTNSNVTKDNFFMDLNYGSFDNRNFGFAG GYNFNKHLYVNYGFSYLNSEDYREHEEKENKIYLLGFDYKINAKHRFRFQTRFSDIKQDS SNQIPVEELKNDRRKAGLNMDIDTKDRSYTFDYEYRPTQNLTLSSTLYKQKQERDIETES IDDIKIIASDRAHTWHKEEMNFYDIKSKMHADFEENKDGVKLKAKYDYNLIGNLPSETIV GYDYQSATNKRNSLVQSETLKTYNNGYMDVTLTESERLPVINRVNMEMKRKSQGIYVFNK WGLANWLDVTLGGRMEKTKYNGYRENGPNVMPYVEPEVKRIETNRKLDNYAGELGFLFKY NDTGRFYTRYERGFVTPFGNQLTDKVHDTTLKNPNTGFIIPPTVNVASKYVDNNLNAEKT DTFEIGFRDYILGSTVSTSFFLTDTKDEITLISSGVTNPAVNRWKYRNIGKTRRMGIEFE AEQNVGKFRFNQSLTLVRTKVLVANEEARLERGDEVPMVPRLKATLGLRYNFTDKLAGFV NYTYLAKQQSRELRENEDLNKDDVVVKHTIGGHGVVDAGFSYKPDAYSDIKIGAKNLFSK KYNLRETSLEALPAPERNYYLELNVRF >gi|224461318|gb|ACDC01000084.1| GENE 3 6853 - 7779 914 308 aa, chain + ## HITS:1 COG:FN0386 KEGG:ns NR:ns ## COG: FN0386 COG2342 # Protein_GI_number: 19703728 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted extracellular endo alpha-1,4 polygalactosaminidase or related polysaccharide hydrolase # Organism: Fusobacterium nucleatum # 70 307 1 238 254 277 59.0 2e-74 
MIIIKKYIILLFACIYYFSFSITNEIYRDRMRDFVKEIRNNTSSDKIIISQNGNELYFKN DKIDEEFFKITDGTTQESLYYGDILKFDVATSKEANNELLKLLLPIRKKGKPIFVINYGK GEKKRNFLKQESLKTNFINELLPSFSLNDFYKPINDYNTNDIHNLNEVKNYLCLLNPEKF SSMDEYYQALKNTNYDLLLIEVSYDNIFFNREQIEGLKVKKNGGKRIVIAYLSIGEAEDY RFYWKKEWNKNKPDWIVSENENWSGNYIVKYWKPEWKEIIKEYQKKLDEIEVDGYLLDTL DSYSYFEK Prediction of potential genes in microbial genomes Time: Thu May 19 23:24:46 2011 Seq name: gi|224461317|gb|ACDC01000085.1| Fusobacterium sp. 2_1_31 cont1.85, whole genome shotgun sequence Length of sequence - 46353 bp Number of predicted genes - 42, with homology - 42 Number of transcription units - 7, operones - 6 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 18 - 221 215 ## FN1516 hypothetical protein 2 1 Op 2 1/1.000 - CDS 236 - 2815 3870 ## COG0495 Leucyl-tRNA synthetase 3 1 Op 3 1/1.000 - CDS 2828 - 3415 605 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 4 1 Op 4 1/1.000 - CDS 3426 - 4130 344 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 5 1 Op 5 . - CDS 4140 - 5495 2042 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 6 1 Op 6 . - CDS 5559 - 9047 3756 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 7 1 Op 7 . - CDS 9050 - 10255 1358 ## gi|237739552|ref|ZP_04570033.1| predicted protein 8 1 Op 8 . - CDS 10257 - 10955 771 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 9 1 Op 9 . - CDS 10955 - 12706 1620 ## VCM66_1702 Restin-related protein - Prom 12907 - 12966 14.4 10 2 Op 1 . + CDS 12998 - 13990 1008 ## COG3950 Predicted ATP-binding protein involved in virulence 11 2 Op 2 . + CDS 13980 - 14144 182 ## gi|237739556|ref|ZP_04570037.1| predicted protein 12 3 Op 1 3/0.000 - CDS 14293 - 15243 393 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 15273 - 15332 9.5 13 3 Op 2 . - CDS 15502 - 16272 941 ## COG0730 Predicted permeases 14 3 Op 3 . 
- CDS 16292 - 18574 2296 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 18607 - 18666 10.4 15 4 Op 1 1/1.000 + CDS 18736 - 19734 1504 ## COG0451 Nucleoside-diphosphate-sugar epimerases 16 4 Op 2 . + CDS 19746 - 20561 1001 ## COG1968 Uncharacterized bacitracin resistance protein 17 4 Op 3 9/0.000 + CDS 20638 - 21201 647 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 18 4 Op 4 . + CDS 21198 - 22094 1113 ## COG1091 dTDP-4-dehydrorhamnose reductase 19 4 Op 5 . + CDS 22106 - 23104 1134 ## FN1697 hypothetical protein 20 4 Op 6 7/0.000 + CDS 23116 - 24927 1748 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 21 4 Op 7 5/0.000 + CDS 24930 - 26090 1456 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 22 4 Op 8 . + CDS 26090 - 26623 615 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 23 4 Op 9 . + CDS 26706 - 28490 1784 ## COG3882 Predicted enzyme involved in methoxymalonyl-ACP biosynthesis 24 4 Op 10 . + CDS 28492 - 28722 341 ## Gmet_2339 putative acyl carrier protein 25 4 Op 11 1/1.000 + CDS 28733 - 29113 333 ## COG0346 Lactoylglutathione lyase and related lyases 26 4 Op 12 8/0.000 + CDS 29115 - 29747 853 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 27 4 Op 13 4/0.000 + CDS 29761 - 30930 1051 ## COG0438 Glycosyltransferase 28 4 Op 14 3/0.000 + CDS 30942 - 31964 1342 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 29 4 Op 15 3/0.000 + CDS 31964 - 33070 1139 ## COG0451 Nucleoside-diphosphate-sugar epimerases 30 4 Op 16 . + CDS 33074 - 34207 1442 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 31 4 Op 17 . + CDS 34272 - 35243 999 ## COG0859 ADP-heptose:LPS heptosyltransferase 32 4 Op 18 . + CDS 35253 - 36296 678 ## gi|237739577|ref|ZP_04570058.1| predicted protein 33 4 Op 19 . + CDS 36309 - 37796 804 ## COG0728 Uncharacterized membrane protein, putative virulence factor 34 4 Op 20 . 
+ CDS 37849 - 38820 1304 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 35 4 Op 21 . + CDS 38842 - 40011 1116 ## Rru_A0261 hypothetical protein + Term 40123 - 40169 2.3 + Prom 40183 - 40242 10.9 36 5 Op 1 . + CDS 40278 - 41246 826 ## gi|237739581|ref|ZP_04570062.1| predicted protein 37 5 Op 2 . + CDS 41271 - 41804 627 ## gi|237739582|ref|ZP_04570063.1| predicted protein + Prom 41813 - 41872 1.7 38 6 Op 1 . + CDS 41899 - 42528 750 ## gi|237739583|ref|ZP_04570064.1| predicted protein 39 6 Op 2 . + CDS 42491 - 42988 427 ## gi|237739584|ref|ZP_04570065.1| predicted protein 40 6 Op 3 . + CDS 42994 - 43488 524 ## gi|237739585|ref|ZP_04570066.1| predicted protein 41 6 Op 4 . + CDS 43505 - 44704 1819 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Term 44714 - 44760 5.6 + Prom 44790 - 44849 10.4 42 7 Tu 1 . + CDS 44887 - 46239 1314 ## EUBELI_20456 hypothetical protein Predicted protein(s) >gi|224461317|gb|ACDC01000085.1| GENE 1 18 - 221 215 67 aa, chain - ## HITS:1 COG:no KEGG:FN1516 NR:ns ## KEGG: FN1516 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 64 1 64 64 75 73.0 7e-13 MNGKLKVFLTQILALLSLVIAINLFAFIAIKFGFLNSEYSMAGCTVIGVGAYLIYLYTLY KDKKRKK >gi|224461317|gb|ACDC01000085.1| GENE 2 236 - 2815 3870 859 aa, chain - ## HITS:1 COG:FN1517 KEGG:ns NR:ns ## COG: FN1517 COG0495 # Protein_GI_number: 19704849 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 859 1 859 859 1649 92.0 0 MREYDYKEIEKKWQEKWAKDNIFKTENEVAGKENYYVLSMLPYPSGKLHVGHARNYTIGD VISRYKRMKGYNVLQPMGWDSFGLPAENAAIQNGIHPAIWTKSNIENMRRQLKLIGFSYD WEREIASYTPEYYKWNQWLFKRMYEKGLIYKKKSLVNWCPDCQTVLANEQVEDGMCWRHS KTHVIQKELEQWFFKITDYADELLEGHEEIKDGWPEKVLTMQKNWIGKSFGTELKLKVVE TGEDLPIFTTRIDTIYGVSYAVVAPEHPIVDKILKVNPSIKDKVTEMKNTDMIERGAEGR EKNGIDSGWHIENPVSKEIVPLWIADYVLMNYGTGAVMGVPAHDERDFAFAGKYNLPVKQ VITSKKADEKVELPFVEEGIMINSGDFNGLSSKDALIKIAEYVEEKNLGQRTYKYRLKDW 
GISRQRYWGTPIPVLYCEKCGEVLEKDENLPVILPDDIEFSGNGNPLETSNQFKEATCPC CGGKARRDTDTMDTFVDSSWYFLRYCDPKNLNLPFAKEIVDKWTPVDQYIGGVEHAVMHL LYARFFFKVLRDLGLLTANEPFKRLLTQGMVLGPSYYSEKENRYLLPKDVVLKGDKAYSE SGEELQVKVEKMSKSKNNGVDPEEMLDKYGADTTRLFIMFAAPPEKELEWNENGLAGAYR FLTRVWRLIFENSELVKNAHDDIDYDKLSKEDKALLIKLNQTIKKVTDAIENNYHFNTAI AANMELINEVQSYVTNSMSSEQAPKILAYTLKKILLMLSPFVPHFCDEIWEELGETGYLF NEKWPEYDEKMLSSDEVTIAVQVNGKIRGSFEIEKDSDKAVVEKAALELPNVTKHLEGMN IVKVIVIPNRIVNIVVKPQ >gi|224461317|gb|ACDC01000085.1| GENE 3 2828 - 3415 605 195 aa, chain - ## HITS:1 COG:FN1518 KEGG:ns NR:ns ## COG: FN1518 COG1595 # Protein_GI_number: 19704850 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Fusobacterium nucleatum # 3 194 2 200 204 254 80.0 7e-68 MEENINTILKKAQTGDSEAIDWILKEYSKILSFNAQKYYLIGAEQEDLLQEGILGLLKAI KFYDETKSSFSSFAFLCIRREMISAIRKANTQKNSVLNEALTTSSMIEDSSDVDNYISLE NNPEEAYLLKEEIKEFKNFSDKNFSKFEKEVLKYLIRGYSYREIAKILSKNLKSIDNTIQ RIRKKSEDWINKEEI >gi|224461317|gb|ACDC01000085.1| GENE 4 3426 - 4130 344 234 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 4 234 9 245 255 137 32 2e-31 MERIIGVNPVTEALLNKEKNIEKLELYNGLKGETVQKLKELASKRNIKIFYTNKKIDNSQ GIAVYISNFDYYKDFDEAYEELASKDKSVVLILDEIQDPRNFGAIIRSAEVFKVDLILIP ERNSVRINETVVKTSTGAIEYVNISKVTNLSDTINKLKKLDYWVYGAAGEASINYNEEDY PNKIVLVLGNEGSGIRKKVREHCDKLIKIPMFGQINSLNVSVASGILLSRIVNK >gi|224461317|gb|ACDC01000085.1| GENE 5 4140 - 5495 2042 451 aa, chain - ## HITS:1 COG:FN1520 KEGG:ns NR:ns ## COG: FN1520 COG0766 # Protein_GI_number: 19704852 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Fusobacterium nucleatum # 29 451 1 423 423 777 96.0 0 MVFLFFINTNLRKYVIIFKYKKVKGDKRMVEAFKIIGGNKIAGELKVDGSKNSTLPIMIA TLVEKGTYILRNVPDLRDIRTLVALLESLGLEVEKLDANSYKIINNGLSGAEASYDLVKK MRASFLVMGGMLAIEKRGKVALPGGCAIGARPVDLHLKGFEALGAKINIEHGYVEATTEN 
GLVGGNIVLDFPSVGATENIIMAAVKAKGKTILENAAKEPEIEDLCNFLIKMGAKISGVG TSRLEIDGVEKLTACEYTIIADRIVAGTYIIASILFDGSIKVSGIVPEHLSSFLLKLEEM GAKFKIEGDKLEVLSKLSDLKPVKVTTMPHPGFPTDLQSPMMTLMCLVNGVSEIKETIFE NRFMHVPELNRMGAKIEIDSSTAKVTGVENFSSAEVMASDLRAGASLILAALKANGESIV NRIYHVDRGYENFEEKFKALGANIERIKTQA >gi|224461317|gb|ACDC01000085.1| GENE 6 5559 - 9047 3756 1162 aa, chain - ## HITS:1 COG:VC1760 KEGG:ns NR:ns ## COG: VC1760 COG0553 # Protein_GI_number: 15641763 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Vibrio cholerae # 516 1144 267 932 940 275 30.0 6e-73 MGIIERLFKKTKETDKKLKLSLEYEEKYIKIILKIGDKTISLKDLKDEIDISSLSKSDIF EVDENGNTLLLDYDEIYSLDRSTLKLLKLPSFFPGIVYIDNKGYFGSSKVEFSYKISFGL DEYHIVNANYVESISSSERYILTKEQYDLIKLINQYNNDDSKNKEANEQYKMLNAIKDVS HKTNLLLNETIKKEDDLVLLENIELDFLESDEDYLEVVPQSSQLSEKQNKSLREAFKKAN LSQNFYLLNIDNKKVKVVVNRELKDALKVVKSNEKISKKDFVKRESPIFEDIDSEIVEFN YGPRVIGLGYLNYRPSPAQNMSEMDWFTKEFPKIMTDTPITLKPEHLNYMQDKFNNLDEF EETELKFNLEGEEKKLFISKENLANEIKKLENSIKDITDYNKSKALDEIIELAEADNYSQ DYIAYKGNYIKKFDKNVAEQYRDDLRAIEIEKREEKKNTTKEQEKVLIPKDNIEKLDYIE DMEKITEEEVELPSSLRYSDGIELKEHQKEGLLRMQSLYKKSNVNGLLLCDDMGLGKTIQ ILSFLAWLKEKEALRPSLLIMPTSLITNWYDEKNIGEIQKFFLDNTFKVKILDGKKSRDE ISELRNYDLVLTSYESLRINHKETGYIEWKVVVCDEAQKMKNPKTLLTTAVKTQNALFKI ACSATPIENTVVDLWCLTDFVKPGLLETQKDFEKKYMKPLSANDINDEKRQEINNKLSDL LGEFYLRREKEKVLTSDFPKKIVIYDKIKPSSQQEDIIEKLKNTGKAALAIIQGMIMTCS HPQLVDRDVDEVPLGSEESLIEEAYKLEHIYTILTEVKKKNEKAIIFTKYKKMQKILWNV IKYWFDIEVGIVNGDADKTSRRRILDDFRKKEGFNVIILSPEAAGVGLNIVEANHVIHYT RHWNPAKEEQATDRAYRIGQKKDVYVYYPIISNVERIERDEYRTVDEWIRKQLEIDMTDS SPEEKLNRIIVKKKRMLKDFFLTCGGEFDDDMTKEFAAMSNEVEKDLSIEVIDNIDHMEF EKLAVVLLEKEVNSKYGLVTVKSGDKGIDGVIFSERGNILIQTKHTKRLDSNAAGDLFRG EKFYSDELNKDFPKLIVFTSASKNNISEDIKQLEKMGKVEIYYREKVTELLNKYPTKITE LVDRDKRYSIEDIKNYIIERHI >gi|224461317|gb|ACDC01000085.1| GENE 7 9050 - 10255 1358 401 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739552|ref|ZP_04570033.1| ## NR: gi|237739552|ref|ZP_04570033.1| predicted 
protein [Fusobacterium sp. 2_1_31] # 1 401 1 401 401 635 100.0 1e-180 MNEINFKKFYPINLEKKKDEINDFWNQIFKEVRDEKIFELIEKVLRNKSLKVDEVIILLY NIKKVDEYLIRKNTELSDFIISIKFDLSEKKEKRKECLMDIYQAIVSYYNKNEVETALNF LVEDNIDLKKEESEIAQVYQEYSSSKKDYILFLYDETIKAIKNNKLKETLEKFYVSEERE VFSYIMEKVLLDIVYYVKLEKPYIEQILADFFDRVKHEVRIESFKRILSYYVEEYEEDED ISICSRVIMEKIHEYLKDPHKNSPKWQWGDFSEAQIEIMRIWLVSADLEKYFSIEVNDKI RLQFWRRYIKYIKEVRYFERLKQAIVMLTDEHIFIEFGEKGNAAYCYKKDYISFSEINNL STNFRLKNRASAEFFMSHSGNWEVKLKSTLYRLGYSVKVWR >gi|224461317|gb|ACDC01000085.1| GENE 8 10257 - 10955 771 232 aa, chain - ## HITS:1 COG:Cj0599_2 KEGG:ns NR:ns ## COG: Cj0599_2 COG2885 # Protein_GI_number: 15791959 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Campylobacter jejuni # 47 226 15 186 191 97 35.0 3e-20 MFFINKKKKEIQIEETNYWLSIGDLMASILMIFMLLFIVKTIETGRELRKKEEIIEGFTG LKKNIISKIQKDFEERGIKVDIDPQTGTIKIDDKILFNTGEYLLKPEGKKYLNDFVPIYI NLLLNDEEVKKELSQIIIEGHTDDVGSYIYNMELSQKRAFEVIKYIYDEMPNFKGKEELK SYITANGRSNIKVLRDNSGNIDRDKSRRVEFQFKLKEDETLMKIEKTLKEGN >gi|224461317|gb|ACDC01000085.1| GENE 9 10955 - 12706 1620 583 aa, chain - ## HITS:1 COG:no KEGG:VCM66_1702 NR:ns ## KEGG: VCM66_1702 # Name: not_defined # Def: Restin-related protein # Organism: V.cholerae_M66-2 # Pathway: not_defined # 40 396 28 378 706 85 21.0 8e-15 MKDEKKFIYFLFIATCILGLVGQLFLMNFKILQLFNFFNINVVFFWLIFGIFIYFTLFNK KIKNQERTIKELDDINTFFETEKLLDKNITEKEILKIKNEFFSKEEDIEKYPLLSKVWKE YSSSFLKTDENSYYQIIDAEDLFNENSLVKEKMNMKILNYIPQLFVGLGIFGTFLGLSLG LSQINLRDTGDLGQISNLIEGVQTSFYTSLYGMFFSISITLLFNNYMSQIEKRIFILKNK LNNLFFLNNGGKIIQDMKNELKEIRAYNSDMASQITNGINKELIQMTSVLDNKIGGISSG ITGTFQQTMSENLEKIFSEDFIKNFENIKDELLETSRENNKFIAEYKDEMKEIVTTTKSL KDEFLIFSDEINQRYSDTNENLKENFEKISIVLNDIKEIHSSINEFTGDVQFIATENKKI ISDFKDVSLNLKEFAKGQDTILELWEGYKDSFAGFEDSINTNFENYQSILEDVSDKYGNT IDKLTTEYVKTMNMGMEDVFRGYDNHLTEIIEKFQGVLRSFKENLELSDENLKVNIDLLQ ENLENQSILGELNKDISEKNRILLEKIQKTQQHLEFLEKKGDQ >gi|224461317|gb|ACDC01000085.1| GENE 10 12998 
- 13990 1008 330 aa, chain + ## HITS:1 COG:STM3753 KEGG:ns NR:ns ## COG: STM3753 COG3950 # Protein_GI_number: 16767037 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 260 318 316 374 396 60 44.0 3e-09 MKIEKVHIKNIKGIKDLELSFRKDNKILDVIVLAGVNGSGKTTILESIKDFFYNKNIDYD ILEKSKAKLKIFFEDFEEQKIKEAENFSDANKNKLSDFFNALKLYSFDKNEKRASYEIQI AKRFENPPKIIYVPANNSFEEVETETSTLLRNYEFINVVNSELMEDIPSYIVTRRNYLAT IEEDLTMKEITNKVVNEINSIFDILELDVKLKGFSKDEKTMPIFENSAGEEFNINDLSSG EKQLFLRTLSIKMLEPKNSIILIDEPELSLHPKWQQRIIEVYKKIGENNQIIIATHSPHI LGSVSNENIFILYRDEKGKIEAKTGEKYGY >gi|224461317|gb|ACDC01000085.1| GENE 11 13980 - 14144 182 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739556|ref|ZP_04570037.1| ## NR: gi|237739556|ref|ZP_04570037.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 54 1 54 54 75 100.0 1e-12 MDIKTFILANKEIIEKFAIKKDEEKVIKKDDDWYNEDCWDNFHKLVGDKIKISI >gi|224461317|gb|ACDC01000085.1| GENE 12 14293 - 15243 393 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 2 293 15 316 345 155 32 3e-37 MEEIKVYKLENEFLKVELLNLGATIKKLEVKDKNGNFRNVVLGFDDIEKYRENPAYFGAV IGRTAGRIKNAELKIGDKVYSLDSNNNGNTLHGGKSSISHRFWTVEKIENGLVFSIKSSH LDNGYPANVEIKASYVLNKNELEVRYFTKADSLTYLNLTNHSYFNLSGNPENTIYEDILK INSDYLVGIDENSIPCETIALDNNIFDFRKSKKLKDFFMASDVQKTIANDGIDHPFIFNE KIGRLEIENLESGIKMSVETDNPAVVIYTGNYLQDIGFKKHSAICFETQEVPNLYLNPSF IDENKAYERYTKFIFN >gi|224461317|gb|ACDC01000085.1| GENE 13 15502 - 16272 941 256 aa, chain - ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 4 256 2 254 254 380 93.0 1e-105 MFQDFDVMKFLILAVCCFIASVVDAISGGGGLISLPAYFAVGFPPHMALGTNKLSAFLST FASAFKFWKAKKVNVEIVSKLFLFSLAGAVLGVKTAVSIDTKYFKPISFAILIVVFLYAL KNKSMGEVNYYKGTTPKTILLGKIMAFCLGFYDGFLGPGTAAFLMFCLIKIFKLDFSSAS 
GNTKILNLSSNFASLVVFAFLGKLNWLYGIPIALVMTVGAIIGARLAILKGNKFIKPVFL VVTIVLILKMSVEIFF >gi|224461317|gb|ACDC01000085.1| GENE 14 16292 - 18574 2296 760 aa, chain - ## HITS:1 COG:FN1704_1 KEGG:ns NR:ns ## COG: FN1704_1 COG1752 # Protein_GI_number: 19705025 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Fusobacterium nucleatum # 1 374 1 374 375 567 86.0 1e-161 MKKIIFLTYIFLIFNFTYAEEIQLKTKEDVEIEKMEEQIKNLQDKIENTKKLKSAKDNKN LKVALVLSGGGVKGYAHLGVLRVLERENIKIDYITGTSIGALIGTLYSIGYSVDEIEKFL DDINVSSFLETVTDNTNLSLEKKESLKKYSAYLSFDNELNFSFPKGLKGTGEEYLFLKKM LGKYEYMDSFDNFPIPLRIVATNLNTGETKAFSKGDVAKVLIASMSIPSIFEPMKIDGEI YVDGLVTRNLPVEEAYEMGADIVIASDIGAPVVEKDDYNILSVMSQANTIQASNVTKISR EKASILISPDVKDISALDSSKKEELMKLGKVAAEKELDKIKLLSKADNKKKKENFVSDND VKITINKIEYNEKFSDNTIVVLNDIFKDLLNKTISKNDIDKKIIDIYSSKYMDKVYYTID GNTLIIDGEKPHSNKIGLGFNYLTGYGTTFNIGSDLVFNGKFKNNINFNFKFGDYLGVDF ATLSYYGVKNRFGFLTNIGYNENPFFLYDNKRKIAKFISREAYFKLGLFNQPTNNTVLSY GILSKFSSLKQDTGGNETKSLEYSENSTKTYLSFKYDSLDSISNPMKGVKTNFVYNFSSS FGKSKSNLYGPTFTLKGYVPINPKFSLIYGLNYSSLRGDNIRADRRIKLGGMYTNIDNND FEFYGYNYQEKQMKDLINLTLGFKHKIVYSLYFNTKFNIATFNEDNPMQRYNSRMWKDYS QGLGFSLTYDSPIGPIEFSVSSDLKNIKPIGSISIGYKFD >gi|224461317|gb|ACDC01000085.1| GENE 15 18736 - 19734 1504 332 aa, chain + ## HITS:1 COG:FN1703 KEGG:ns NR:ns ## COG: FN1703 COG0451 # Protein_GI_number: 19705024 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 332 1 332 332 610 95.0 1e-175 MIIVTGGAGMIGSAFVWKLNEMGIKDILIVDKLRTEDKWLNIRKREYYDWMDKDNLKEWL SHKENADKIEAVIHMGACSATTETDGDFLMDNNYAYTKFLWNFCAEKNIKYIYASSAATY GMGELGYNDDVSPEELQKLRPLNKYGYSKKFFDDWAFKQESQPKQWNGLKFFNVYGPQEY HKGRMASMVFHTYHQYMENGYVKLFKSYKEGFKDGEQLRDFVYVKDVVDIMYFMLTNDVK SGIYNIGTGKARSFMDLSMATMRAASHNDNLDKNEVVKLIEMPKDLQGKYQYFTEAKINK LREIGYTKEMHSLEEGVKDYVQNYLAKEDSYL >gi|224461317|gb|ACDC01000085.1| GENE 16 19746 - 20561 1001 271 aa, 
chain + ## HITS:1 COG:FN1702 KEGG:ns NR:ns ## COG: FN1702 COG1968 # Protein_GI_number: 19705023 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Fusobacterium nucleatum # 6 259 1 254 266 373 86.0 1e-103 MNALILVVILAIVEGITEFLPVSSTGHMILVNKLIGGEYLSPTFTNSFLIIIQLGAILSV VVYFWKDLTPFVGTKEKFVLRFRLWVKIIVGVIPAMVIGLFLDDIIDKYFMDNVTTIAIT LIVYGIIFIAIEVIYKIKNVKARVRKFSELKYSTAFLIGFFQCLAMIPGTSRSGATIIGA LLLGLSRPLAAEFSFYLAIPTMFGATALKLLKNGLVFTEREWAYLALGSAIAFVVAYIVI KWFMDFIKKRSFASFGLYRIILGIIVLVLLR >gi|224461317|gb|ACDC01000085.1| GENE 17 20638 - 21201 647 187 aa, chain + ## HITS:1 COG:PH0416 KEGG:ns NR:ns ## COG: PH0416 COG1898 # Protein_GI_number: 14590334 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Pyrococcus horikoshii # 7 182 12 180 188 190 54.0 1e-48 MKSTETKIKNLLLIEPKIFEDSRGFFMESYNYNTFKELGIDNVFVQDNISKSSKGVLRGL HFQRDEYAQAKLVYVLRGAVLDITVDLRKDSETFGRYEAVELNDKNKRMLFIPRGFAHGF LTLEDNTEFVYKCDNFYNPKSEVGIIWNDTDLNTDWNLDKYNIKEEELIISEKDKKNITF KEYRRGK >gi|224461317|gb|ACDC01000085.1| GENE 18 21198 - 22094 1113 298 aa, chain + ## HITS:1 COG:FN1698 KEGG:ns NR:ns ## COG: FN1698 COG1091 # Protein_GI_number: 19705019 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Fusobacterium nucleatum # 1 298 1 298 298 472 80.0 1e-133 MKLIFGANGKLGTDFKELLDSIGEKYIASDKDEIDITNGDFLRAYVQTMHQNYKIDTIIN CAAYNYVDRAETEKELCYKLNAEAPATLANIAAEIGANYITYSSDFVFNGLLTSYLYGDT TGYTEEDEPHPLSTYAKAKYEGELLVSQVMNNPELSSKMYIVRTSWVFGKATMNFVDKII ELSKEKNEIKVTDDQISSPTYSKDLAYYSWELLKSSAENGIYHFTNDGIASKYEEAKYIL DKISWQGNLIAVKREDLGLPAERPKFSKLSCKKIKEKLGITIPDWKNAIDRYFKDNNK >gi|224461317|gb|ACDC01000085.1| GENE 19 22106 - 23104 1134 332 aa, chain + ## HITS:1 COG:no KEGG:FN1697 NR:ns ## KEGG: FN1697 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 331 1 327 328 327 58.0 5e-88 
MSNKLVKVEEDFYEEDEISIYEILNIFLKNIKIFIIVTIIGLIATCLFVAKKIIFDKNNT TYINYTLNYEEIKSYIGKDIYYPQKSPKEILLDDKYLALLFENPELKSLYEEKVKENKDD ISTKRDFLTENKILETISLQEMAKTKEEQDLISPDLYRTTVRVNKKYDKNRTVSDSIMKT YLNILNQYYKENMFDYLEERKTYLEKSLPVLKRQLEENAVDGKISISSGGSGTNDNNYFK YIYPIQVSNIDTYYEKYKTFESEYQSIKTLMDLELNKAENFIKYDSSIINVKEKSGNAIK LVIGLVLSLCLGVFATFVKEFIEGYKKNKANN >gi|224461317|gb|ACDC01000085.1| GENE 20 23116 - 24927 1748 603 aa, chain + ## HITS:1 COG:FN1696 KEGG:ns NR:ns ## COG: FN1696 COG1086 # Protein_GI_number: 19705017 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Fusobacterium nucleatum # 1 603 5 607 607 998 87.0 0 MNTIRKLVKFLIDIFLLNISLVVSIFLKYDQLQITNKNINILIYFNLSFCIIYFILKIYN NSWRFSGTSEYMSLVALSSSTTILSYIFRIFLRLDTKSSLYFEAWIIFTFLLIVSRFLMF LTRMKGIGRSDANSENVLIYGAGEAGVLLVKESRINPNFSYKIVGFLDDNPNKKGGKVYG LKVLGGLEDVEKIIEKNDVSKIIISMPSVEQNKISNILKELNKLKDISVKILPNVDNLIE EGNLSTQLRNIKLEDLLGREEIKINTKEVFDFIQDKIVFVTGGGGSIGSELINQIAKYNP KKIINIEINENASYLMELELKRKYPYLDYKTEIASVRDFDKLDMLFNKYKPEILFHAAAH KHVPLMENNPEEAIKNNIFGTKNVAECCLKYKLESVVLISTDKAVNPTNVMGATKRVCEM IFQKYSEKDSNTKFMAVRFGNVLGSNGSVIPIFSKLIEEGKNLTLTHKDIIRYFMTIPEA AQLVIEATTIGKGGEILILDMGEPVKIYDLAKNMIKLSGSNVGIDIVGLRPGEKLFEELL YDVNSSEKTSNNKIFITNMENEKVQVDIDDYYTILKDLIKNNDTVGMRRTLASIIGTFKG RVE >gi|224461317|gb|ACDC01000085.1| GENE 21 24930 - 26090 1456 386 aa, chain + ## HITS:1 COG:MTH334 KEGG:ns NR:ns ## COG: MTH334 COG0399 # Protein_GI_number: 15678362 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Methanothermobacter thermautotrophicus # 1 384 1 361 363 242 35.0 9e-64 MINLSVPNLSMDILDNLKECLESGWVSTGGRFIPEFETKVKNYMKTKFAAGVQSGTAGLH MSLQVLGVQRDEEVFVPTLTFIAAVNPTTYLGASPIFIDCDDSLCMDPIKLEKFCSEECD FKEGVLVNKKTNKKIRALVIVHVFGNMADMEKIMDIAKKYNLRVLEDATEALGTYYTEGR 
YKGKYAGTIGDIGVLSFNANKIITTGGGGMVVGDNEELVEKVRFLSSQAKKDTLYFIHDE IGYNYRMLNLQAALGTSQIDQLESFIETKIKNYNIYKEELEKIEGLEILPFVEGIRANHW FYSLKIDKEKYGIGRDELLQKLVDAGIQTRPIWGLIHQQKPYSACQNYEIEKALYYYDRI LNLPCSSNLTEKEVYQVIEKIKEFKK >gi|224461317|gb|ACDC01000085.1| GENE 22 26090 - 26623 615 177 aa, chain + ## HITS:1 COG:NMA0639_1 KEGG:ns NR:ns ## COG: NMA0639_1 COG2148 # Protein_GI_number: 15793627 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Neisseria meningitidis Z2491 # 4 174 3 173 223 213 57.0 1e-55 MYRKFFKRFLDIIISLIFILCFWWLYIVIAILVRIKLGSPVLFKQDRPGLNEKIFKMYKF RTMTDEKDKNGNLLPDAERLTKFGKFLRSTSLDEIPELWNVLKGEMSLVGPRPLLVSYLT KYNEYEKRRHEVRPGITGWAQINGRNNTTWEERFKNDIYYVENISFKLDLKIIIKQF >gi|224461317|gb|ACDC01000085.1| GENE 23 26706 - 28490 1784 594 aa, chain + ## HITS:1 COG:SMc01554 KEGG:ns NR:ns ## COG: SMc01554 COG3882 # Protein_GI_number: 15966058 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted enzyme involved in methoxymalonyl-ACP biosynthesis # Organism: Sinorhizobium meliloti # 193 571 224 603 651 280 37.0 8e-75 MDFSCLEYPFDYEFLNRKKKSLKRELENQNIKFLEKKIAILGGSTTNEIKLNLELFLLKE GIKPIFFESEYNKYYEDAIFSKELEDFNPEIVYIHTTIKNLENFPSLSFSKEKNQSILEN EIKKFKSIWNSLFSKFSCVIIQNNFEYPQYRLTGNSSVYLDGGNVKFINNLNNFFVKATE DYKNLYINDINYLSSLVGLEKWYDNFLWFNYKYALSFEAIPYLSHSVSSIIKGTYGKNKK AIAVDLDNTLWGGVIGDDGIAGIKIGKDNPIGEAHIEFQNYLKKLKELGIVLTVASKNDE ENAIEGLEYNNMLLHKNDFLVIKANWEPKSNNILESANEINIGVDSFVFLDDNPVERELV KNQLQGVAVPNIDNKVENYQNIVDRNNYFEFLSVSEEDLQRLKYYEDNQSREREKLNYEN YQEYLKSLDMLAEIQEAKDIYLERIHQLINKTNQFNLTTKRYTKAEVEEVFHDENSVLLY GRLQDKFGDNGLVSIIIGEKENKVLNIPLWIMSCRVLKRDMEKAMLDSLVEICKTKDIEK IVGKYIPSSKNSMVKNHYIDLGFKLIEDNDNITTWELDVCEYKNKNEIIKIGEY >gi|224461317|gb|ACDC01000085.1| GENE 24 28492 - 28722 341 76 aa, chain + ## HITS:1 COG:no KEGG:Gmet_2339 NR:ns ## KEGG: Gmet_2339 # Name: not_defined # Def: putative acyl carrier protein # Organism: G.metallireducens # Pathway: not_defined # 1 
76 1 76 79 70 48.0 2e-11 MDILNKLQEIFRDVFDDETIVLTNETTQDDIEDWDSLAQINIIVAIKKDFKIDFSMEEIG KLKNVGEIVKKIEEKL >gi|224461317|gb|ACDC01000085.1| GENE 25 28733 - 29113 333 126 aa, chain + ## HITS:1 COG:CAC2192 KEGG:ns NR:ns ## COG: CAC2192 COG0346 # Protein_GI_number: 15895460 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 125 1 128 128 59 31.0 2e-09 MKFHHIGICCKNIEKKINSIEKIHKIIEKTEIIYDPLQDANLCMLTLEDGTNLELVSGKV VEIFLKKKIDYYHICYEVANIEEELEKICSNGGVQISEVKPAILFNNRRVVFLKVDYGII ELLEEK >gi|224461317|gb|ACDC01000085.1| GENE 26 29115 - 29747 853 210 aa, chain + ## HITS:1 COG:BS_yvfD KEGG:ns NR:ns ## COG: BS_yvfD COG0110 # Protein_GI_number: 16080477 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus subtilis # 1 208 1 207 216 101 30.0 9e-22 MKKIVIYGAGGFAKEIIWLIEEINNVNKEWELLGLIEDNEENFGKEINGYKILGGKNYLE TLSDDIFITIAIGDGNIRKKIYENFPYKKYATLIHPSVKISSTNEIGKGSIICAGCNLTV NVVIGEHSNINLNCTVAHDCKIGDFVSIFPQVAISGNVKIGSNTTIGTGSAIIQKLKVGE NVTIASMSNVTKNISDNSIALGNPIKIIKK >gi|224461317|gb|ACDC01000085.1| GENE 27 29761 - 30930 1051 389 aa, chain + ## HITS:1 COG:TM0631 KEGG:ns NR:ns ## COG: TM0631 COG0438 # Protein_GI_number: 15643396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 44 384 73 424 434 118 28.0 2e-26 MNILYLSAVPFKFDKNPSIYTDLIQELTGFGDKVTILSITSDLKPFQIKKISKKNIDLIY IGSFQLYNVNIFRKGLSILSLPFFMRRAIKGLDLKKFEVVLYETPPITWAGIVKEIKKKN KIKSFLMLKDIFPQNAVDIGLMKKEGVIFKYFKRKEKLLYEISDYIGCMSKGNMDYILKN NPGISQEKVYYFPNTKKDTGNRSMDFEKEKLQFVYGGNMGLPQGVLNIAPAITYFKNDKD IEFIFVGKGTEWNKINEYFKEQKNVKVLESLPREEYEKLLSSCDAGFIFLDSRFTIPNYP SRTLAYLEKGIPIIAATDKNTDIRNLVQDNNVGLWSCSDDIASLIENIKIMKENKEIRKE FSKNARELFLKEFQVERSVELLHKYINND >gi|224461317|gb|ACDC01000085.1| GENE 28 30942 - 31964 1342 340 aa, chain + ## HITS:1 COG:PM1007 KEGG:ns NR:ns ## COG: PM1007 COG1086 # Protein_GI_number: 
15602872 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Pasteurella multocida # 1 336 1 337 344 478 68.0 1e-135 MFRDKILLITGGTGSFGNAVLRRFLKTDIKEIRIFSRDEKKQDDMRKIYNDSKIKFYIGD VRDYNSISDAMRGVDFVFHAAALKQVPSCEFYPIQAVYTNILGTENVLNAAIANKVKRVV CLSTDKAAYPINAMGMSKALMEKVIVAKGRNLDENETMICLTRYGNVMASRGSVIPLFFD QIRSGKPMTITNPNMTRFMMSLDQAVDLVLFAFENGHNGDLFIQKSPAATIELLANTIRN LVGKPDYEIKNIGIRHGEKLYEVLMTKEEKVRAIDMGNYFRVPADSRDLNYSQYFDNGQP IEKVEEYNSDNTYQLNEQELKEMLLNLYEIQDDLKGFGVK >gi|224461317|gb|ACDC01000085.1| GENE 29 31964 - 33070 1139 368 aa, chain + ## HITS:1 COG:SA0149_1 KEGG:ns NR:ns ## COG: SA0149_1 COG0451 # Protein_GI_number: 15925858 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 249 1 250 251 219 47.0 8e-57 MEVLVTGSSGFIGKNLLERLSRIENIKVHTFDIEDKLEDLVKNIDKIDFIFHLAGINRPQ NVDEFYKGNRDTIKDLISIIEKKELKIPILVTSSIQVERDNDYGKSKLEGENLLREYSAK NNIPIYIYRLPNVFGKWCRPNYNSVIATWCNNIANDSEITVSDRAVKLSLVYIDDVVNTF SKHLTEKIESKEYYSIPIIYEKTLGEILDLLYSFKNNRNDLIINKVGTGFERALYSTYLS YLPKDKFSYELTEHKDNRGAFVEIIKTLDSGQFSISTSKPGITRGNHYHNTKNEKFLVIK GEAIIRFRHIYSDEVIEYPVSDKKLEVVDIPVGYTHNITNTGDSEMILVIWANELFDKEN PDTYFLEV >gi|224461317|gb|ACDC01000085.1| GENE 30 33074 - 34207 1442 377 aa, chain + ## HITS:1 COG:PM1009 KEGG:ns NR:ns ## COG: PM1009 COG0381 # Protein_GI_number: 15602874 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Pasteurella multocida # 1 374 1 375 376 507 67.0 1e-143 MKKLKVMTVVGTRPEIIRLSAVINKLDKSEAIEHILVHTGQNYDYELNEVFFEDFNLKKP DYFLNSAVGTAIETIGNILINIEKVIDKEKPDAFLILGDTNSCLTAIAAKRRHIPIFHME AGNRCFDQRVPEETNRKIVDHIADINLTYSDIAREYLLREGLLPDRVIKTGSPMYEVIKS KLDDINNSDVLNKLNLEKGKYFVVSAHREENINSEKNFMNLVESLNAIADKYNFPVIIST HPRTRKMIEEKGVKFNPLVNLLKPLGFNDYVKLQIESKAVLSDSGTISEESSILKFKALN 
LREAHERPEAMEEASVMMVGLKKERILQGLEILETQEKDTLREVYDYSMPNVSDKVLRII LSYTDYINRNVWRKLNI >gi|224461317|gb|ACDC01000085.1| GENE 31 34272 - 35243 999 323 aa, chain + ## HITS:1 COG:HI1105 KEGG:ns NR:ns ## COG: HI1105 COG0859 # Protein_GI_number: 16273031 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Haemophilus influenzae # 9 323 24 346 346 62 23.0 2e-09 MFTPALNLLKKNFPNAKIDILITNKNIKQFIEAQNIFNNVFVSDLNLKNLIKIAFKLRKK YDLSFFTIGNKIWKTKIFSFLLNNKYMIGESKNLSNKYPFDKVIIRDDYAHLVDKNIELV QLIVKNKEIEDRSPRIKVKELSLIKTEKFLKDKGIENLKLFGVHPGSQKAFASRRWPEEY FAEVINNIDGKNLIKCLVFIGPEDQNIRDYLKERTNAIFIENWDIDDTIAMISKCSYFFN SDSGLGHIFTCFNKKIFSIFGPNQLGENQELRTGPYSDERVILKIENMPKEYYLELTERG IFRCLVDLKPEVVIERIEKELNK >gi|224461317|gb|ACDC01000085.1| GENE 32 35253 - 36296 678 347 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739577|ref|ZP_04570058.1| ## NR: gi|237739577|ref|ZP_04570058.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 347 1 347 347 449 100.0 1e-124 MNIFPLLSLYIYILVVCILFNKKNIYIYILSGVVLILFATFRSNMINIDTNTYINYYKTV PSFKYLFEYNNYYFEKGYVFLNMLVSSIGLNYRFFLFITASLSISLIMTSIYKYTKYCFI TLFIYFSNFYFLNELLIMRTGIAFSILFFAIRYLKYDKKKYILLVILGSFFHRISLVALL PILLFKIEFIRKRKLVLISLAVAFILGRGEIIPFIANNLITFLPHKIIIYFENYEPKEAS YRQLILFLPIFLYFLKNFYKYKNVRFFEESILFLFLMIISKFIFIKHETLDRISHLFLMG ILFLPDIYLKSIKEKEVRFFTKFMIIVFFGLLLIWYLRGDNVITIVW >gi|224461317|gb|ACDC01000085.1| GENE 33 36309 - 37796 804 495 aa, chain + ## HITS:1 COG:CAC3047 KEGG:ns NR:ns ## COG: CAC3047 COG0728 # Protein_GI_number: 15896298 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, putative virulence factor # Organism: Clostridium acetobutylicum # 3 445 11 454 520 177 30.0 6e-44 MRKVFFGVGLIAIIAKISGFARELALSYFFGASAITDAYIIALTIPTVIFNFVGVGFNSG YIPIYSMIKKRYGQEEAIKFTTNFLNLLLVVCTVIYIFGMFFTAEIVKLFASGFSIETLD MATNFTRICFVGIYIVVIISIFSAFLQANGSYYVVAFLSVPMNLVYIIGTYVAYKKGIEY LPIFSVLAISIQLVLLYFPLKTNNYRYRFYLKINDNNIKRILYLSVPAIIGGSLEQINYL 
IDKTIASRVMVGGISILNYASRLNIAIIGILLSTVISILFPKISLLVSERKINELKLYIK KTVNLVIIFCLPLSLWIMVYSKEIIAVVFGRGKFDENMIYITSKCLFYYTSGFIFMVLRE VITKIYYSFKDTKTPVINSGIGIILNIVLNIVLSIYMGISGIAFATSISLVVTTILLTYK LKKKYGDFYIQEIIFTFLKVFVISIILVSLIYLIKHFLIEFNIFVQIIVPSVVVGILYLI SIFFYFSEIKEIVKK >gi|224461317|gb|ACDC01000085.1| GENE 34 37849 - 38820 1304 323 aa, chain + ## HITS:1 COG:FN1786 KEGG:ns NR:ns ## COG: FN1786 COG2870 # Protein_GI_number: 19705091 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 1 323 1 323 323 531 90.0 1e-151 MLKIEILMESFKNIKIAVIGDLMLDEYIMGKVERISPEAPVPVVKVTEEKFVLGGAANVI NNLAALGANVYCGGLVGNDNNAEKLINAFPKNVDCNLILKADNRPTIVKKRVIAGHQQLL RLDWEEEFSINEEEENIIIENLKNHIKELDAIILSDYNKGLLTKSLSQKIINLCRENNVI VTVDPKPKNITNFVGASSITPNKKEAYLAVDANSREDIDIVGKKLKEQYKLDTVLITRSE EGMTLYDGGIHNIPTYAKEVYDVTGAGDTVISVFTLARAAGATWEEAAKIANAAGGIVVG KIGTSTVSEKELIETYNNIYSNN >gi|224461317|gb|ACDC01000085.1| GENE 35 38842 - 40011 1116 389 aa, chain + ## HITS:1 COG:no KEGG:Rru_A0261 NR:ns ## KEGG: Rru_A0261 # Name: not_defined # Def: hypothetical protein # Organism: R.rubrum # Pathway: not_defined # 9 165 13 136 303 74 36.0 8e-12 MSKSEEKIIYHYCSVEVFISIIENQELWASDIFKMNDSSEEKYLEDLLRFYLKRIHGELL KDKKLKEYLDKKGKNDKESTKKERENTEKLKKELFKEYYNKIFSASIKKNIDNYIKINKF LTINDLLKEKSKKIKRYILCFSGDGDLLSQWRAYADDGKGISIGFKKSGIKEFLKGIEFE TIDIEHKEKIRKILKKIEVDLFDIEYIKKIKESSYNEELYKIFKYFPKIEVIMGGTGFNE MIIDSLLFDPEMEIYSDLSPFLRYFTFMITKRYLSDEELKPQIMKRLKNFIKVKSSTFSE EKESRIVIIEDETGENQDINNIHFRSKNNDELVSYKKLKLENIVDFISHIVLGPKNKITV EELKRFLIKQFSEEKIKHIVIEKSDIPYV >gi|224461317|gb|ACDC01000085.1| GENE 36 40278 - 41246 826 322 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739581|ref|ZP_04570062.1| ## NR: gi|237739581|ref|ZP_04570062.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 322 1 322 322 555 100.0 1e-156 MANSVRLHTIAFEKDKKPIKNISILDFFKKLNKYLEKSSKIREILNKTITCSKFYFDDNN SNRVVIPFGKLKEGLSYMINNESAFEEINTDIFNINSLEYDEEEKLLAFTTSGDGPTINH IEAYLNSFIPKDKDFSIKIHPIFKNKDLNTIRRAKYIRRVEFILDMSQPSTTLFNHNLEL NESDLLRKIINLYNSFQEELSPKSFSFSIGVGRAGKNSTLEFENMVYLLEQINLNSEIVK EIIVHYKNNSQESVEWSKLKNSNVIVEHYFDIDTKNISPEYLRDNWIEILNEERIYFRRE IEAYYREQIQVDRVEYQLKEGE >gi|224461317|gb|ACDC01000085.1| GENE 37 41271 - 41804 627 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739582|ref|ZP_04570063.1| ## NR: gi|237739582|ref|ZP_04570063.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 177 1 177 177 216 100.0 3e-55 MLKKICEFLKNSKLEILIYTIFIVILISSYLGNFDFSLTIEAEKRADIISFFSIIIGIYI AVITIIATSIIGITKEMLKKNLDTQLIDTIIFGMIETILTIGIIIFLNPTTKLSRVILVA LICNSIISFFKFTIILTLIFKANMNAMAKEIDSKDEYENRLLTTLDEIKNKLKNIEK >gi|224461317|gb|ACDC01000085.1| GENE 38 41899 - 42528 750 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739583|ref|ZP_04570064.1| ## NR: gi|237739583|ref|ZP_04570064.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 209 1 209 209 310 100.0 5e-83 MIVLNLELDNLFGFEDFKVNFSYSDNIKNSSIKDEFLKDRANFKYKKVNILLGANATGKT SVGKAMMAIFNFLKRKEISKITQHIRDLKKEMSFSIDFILDEKNILYRVNFKYKKEKEKE KINLDLYKADILEEDSYETTLDKFEKVNLENENYIEALEKLGKISGWLFTYPDKDSNVLG KNKKVMDKKILRKYFKNIGSINRKSKKVR >gi|224461317|gb|ACDC01000085.1| GENE 39 42491 - 42988 427 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739584|ref|ZP_04570065.1| ## NR: gi|237739584|ref|ZP_04570065.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 165 1 165 165 293 100.0 3e-78 MDPSIEKVRKSDEVENTYILTLKEEDLIIQDREFIKENNILSSGTKAGLDIAYITSAIKK NTHGFYYCDEKFSYIHSDIEKAILSLMIDFLKPNTQLFFTTHNMEVLNMDLPIHCFTFLK KREKIEVVYASEYIDEDNIFLMEAIKNDVFNVAPYLDLIYELEEA >gi|224461317|gb|ACDC01000085.1| GENE 40 42994 - 43488 524 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739585|ref|ZP_04570066.1| ## NR: gi|237739585|ref|ZP_04570066.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 164 1 164 164 236 100.0 5e-61 MILHYFVEGENERKLIETIKNKYLYSGKIKIINTIQNKVPNSILRTLERETVVVLVFDTD VEKIDILDENIKLIMSSNNVKDVICIPQIKNLEDELIYSTNINKIVDLLESKSKKDFKND FNNCKNLLKKLEEKEFKISKLWSRNAVDIYKKYKNDSEVIKKKV >gi|224461317|gb|ACDC01000085.1| GENE 41 43505 - 44704 1819 399 aa, chain + ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 399 1 399 399 715 92.0 0 MKTYLLTGAAGFIGANFLKYILKKYEDINVIVVDALTYAGNLGTIKEELKDSRVKFEKVD IRDRKEIERIFSENKVDYVINFAAESHVDRSIENPQIFLETNILGTQNLLDNAKKAWTVS KDENGYPVYREGIKYLQVSTDEVYGSLSKDYDEAIELVIDDEDVKKVVKNRKNLKTYGNK FFTENSPVDPRSPYSASKTGADHIVIAYGETYKLPINITRCSNNYGPYHFPEKLIPLMIK NILEGKKLPVYGKGDNVRDWLYVEDHCKGIDLVLRNAKVGEVYNIGGFNEEKNINIVKLV IDILKEEITNNAEYKKVLKTDLSNISYDLITYVQDRLGHDMRYAIDPSKIAKDLGWYPET DFETGIRKTVKWYLENQEWVNEVASGDYQKYYEEMYGNK >gi|224461317|gb|ACDC01000085.1| GENE 42 44887 - 46239 1314 450 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20456 NR:ns ## KEGG: EUBELI_20456 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 446 12 469 477 240 37.0 7e-62 MKNIIVRLLEIGIINLKNVKNGKINFLNSNIKETLTLQIGDILGVYGQNGSGKTTVIEAL SILKSILVGMPLDDDLKNLITYGEKNIELLFKFYIEIEDKKYIVEYKIVIGKTENNTIEI LNEIIKYSIYIEEDKRWKNTQTLIETPFSEDIIKLKKYNKNFSNEIDLLKILVIQGISKK MRVSSIFSEEIKAFLKDNTDLIYIIEALEYYGNLNLFIVSNKEIGMITLNLLLPLKIKNL NSCGDLPIQIDKNESIIVNQLIYPSVENAINEINIVLKRIIPDLQLKIEEQRKETLPDGN TNIIADLRSIREGKPISLRYESEGIKRIISILGVLIAGYNQPSVCLAIDELDSGIFEYLL GEILEVLSSEIKGQLIFTSHNLRILEKIDKKNIVFSTTNPENRYIRFKYIKPNNNLRDMY LRELIIQEQAEQLYKETKQSDIKRAFYKVR Prediction of potential genes in microbial genomes Time: Thu May 19 23:26:40 2011 Seq name: gi|224461316|gb|ACDC01000086.1| Fusobacterium sp. 
2_1_31 cont1.86, whole genome shotgun sequence Length of sequence - 28739 bp Number of predicted genes - 31, with homology - 27 Number of transcription units - 12, operones - 9 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 97 - 156 2.7 1 1 Op 1 . + CDS 217 - 420 190 ## gi|237739588|ref|ZP_04570069.1| predicted protein 2 1 Op 2 . + CDS 436 - 2070 1882 ## FN1654 hypothetical protein + Term 2082 - 2133 8.0 + Prom 2384 - 2443 8.5 3 2 Op 1 . + CDS 2470 - 2958 733 ## SH1747 hypothetical protein 4 2 Op 2 . + CDS 2955 - 3335 451 ## Smon_1286 hypothetical protein 5 3 Op 1 . - CDS 3734 - 3850 57 ## 6 3 Op 2 . - CDS 3904 - 5193 760 ## COG4325 Predicted membrane protein - Prom 5225 - 5284 5.6 7 4 Op 1 . - CDS 5424 - 5507 67 ## 8 4 Op 2 . - CDS 5525 - 5686 373 ## + Prom 5594 - 5653 1.5 9 5 Tu 1 . + CDS 5690 - 5851 118 ## - Term 5755 - 5788 5.4 10 6 Op 1 . - CDS 5806 - 7344 2223 ## COG0519 GMP synthase, PP-ATPase domain/subunit 11 6 Op 2 1/0.000 - CDS 7408 - 7974 810 ## COG0778 Nitroreductase - Prom 8004 - 8063 8.4 - Term 8031 - 8069 1.3 12 6 Op 3 . - CDS 8090 - 8362 417 ## PROTEIN SUPPORTED gi|237739595|ref|ZP_04570076.1| SSU ribosomal protein S20P - Prom 8392 - 8451 6.4 13 7 Tu 1 . - CDS 8510 - 8806 459 ## FN1878 hypothetical protein - Prom 8833 - 8892 8.3 + Prom 8840 - 8899 9.1 14 8 Op 1 1/0.000 + CDS 8932 - 9534 653 ## COG0693 Putative intracellular protease/amidase 15 8 Op 2 1/0.000 + CDS 9598 - 10083 176 ## PROTEIN SUPPORTED gi|225085052|ref|YP_002656490.1| ribosomal protein S2 16 8 Op 3 1/0.000 + CDS 10090 - 10533 805 ## COG0698 Ribose 5-phosphate isomerase RpiB 17 8 Op 4 . + CDS 10546 - 10884 517 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 18 8 Op 5 . + CDS 10920 - 11186 403 ## FN1871 hypothetical protein + Term 11188 - 11251 8.1 + Prom 11217 - 11276 9.0 19 9 Tu 1 . + CDS 11413 - 12738 2042 ## COG1160 Predicted GTPases + Term 12846 - 12881 3.4 + Prom 12881 - 12940 9.2 20 10 Op 1 . 
+ CDS 12971 - 13729 274 ## PROTEIN SUPPORTED gi|227512216|ref|ZP_03942265.1| ribosomal protein S4e 21 10 Op 2 . + CDS 13740 - 15227 1745 ## FN0173 hypothetical protein + Term 15258 - 15292 -0.9 + Prom 15229 - 15288 11.0 22 11 Op 1 1/0.000 + CDS 15407 - 16363 1601 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Prom 16365 - 16424 8.7 23 11 Op 2 22/0.000 + CDS 16451 - 17140 893 ## COG0850 Septum formation inhibitor 24 11 Op 3 22/0.000 + CDS 17142 - 17936 1043 ## COG2894 Septum formation inhibitor-activating ATPase 25 11 Op 4 . + CDS 17942 - 18211 384 ## COG0851 Septum formation topological specificity factor + Term 18221 - 18291 15.2 + Prom 18223 - 18282 8.5 26 12 Op 1 1/0.000 + CDS 18325 - 19248 498 ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase 27 12 Op 2 . + CDS 19299 - 21932 2808 ## COG0249 Mismatch repair ATPase (MutS family) 28 12 Op 3 . + CDS 21925 - 24636 3862 ## FN0694 S-layer protein 29 12 Op 4 . + CDS 24666 - 25391 243 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 30 12 Op 5 . + CDS 25409 - 25984 707 ## FN0696 hypothetical protein 31 12 Op 6 . + CDS 26000 - 28639 3906 ## COG0013 Alanyl-tRNA synthetase Predicted protein(s) >gi|224461316|gb|ACDC01000086.1| GENE 1 217 - 420 190 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739588|ref|ZP_04570069.1| ## NR: gi|237739588|ref|ZP_04570069.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 67 4 70 70 106 100.0 6e-22 MKKKAKKVVLFLVEGASDLTSLEFIDFINNKDFKVLGDYKATWDFIKKDLNSVNRYSNFW LFFENLK >gi|224461316|gb|ACDC01000086.1| GENE 2 436 - 2070 1882 544 aa, chain + ## HITS:1 COG:no KEGG:FN1654 NR:ns ## KEGG: FN1654 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 543 28 570 571 759 74.0 0 MKKGIGLGIDDFRQIIKEDCYYFDKTNWIEELLKDRSQIKLFTRPRRFGKTLNMSTLKYF FDVKNAEENRKLFKDLYIEKSEYFKEQGQYPVIFISLKDLKKNTWEDAFFELKALLREVY EEHSYVKEKLSDIEKEEYDKILMKTEDAEYGRALRNLTKYLHTYYQKEVVLLIDECDNPL IVANTFNYYKEAINFFRDFFSTALKTNPYLKTAVLTGIVQVAKEGIFSGLNNVITYNILE KGFETFFGLSEEEVEEALKYFEMEYQIEEVKKWYDGYKFGGKEIYNPWSILNYLRTKELR AYWVNTSDNALIYENLSVANMDVFNCLEKLFEGKEIKKEISPFFTFEELERYNGIWQLMV YNGYLKLNQKLEDDEYLLTIPNYEIQTFFKKGFIDKYLIGSNYFNPIMRTLLEGNIDEFG RMLEEIFLINTSFHDLKAENIYHTFLLGMLIWLRDKYEVKSNGERGQGRYDILLLPLDKK KPAFVFEFKVSKTIKGLESKAEEALNQIKEKQYDIGIKESGIDKIYRIGLAFKGKKVKIK YELA >gi|224461316|gb|ACDC01000086.1| GENE 3 2470 - 2958 733 162 aa, chain + ## HITS:1 COG:no KEGG:SH1747 NR:ns ## KEGG: SH1747 # Name: not_defined # Def: hypothetical protein # Organism: S.haemolyticus # Pathway: not_defined # 7 153 13 161 177 128 47.0 6e-29 MEKILNVAEYIFKEYQRVTGEYIDEMKLQKLLYFSQRESLAILNKPMFSEKFEGWKYGPV SREVRTYFTQEDGIQTYTEDIKSENKYIVNNVILEYGSLASWKLSEMTHKEISWLNSRKG LSENENGNKKIELEDIREDAKKVRPYDYIWDMYYDEFEDVIQ >gi|224461316|gb|ACDC01000086.1| GENE 4 2955 - 3335 451 126 aa, chain + ## HITS:1 COG:no KEGG:Smon_1286 NR:ns ## KEGG: Smon_1286 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 1 121 1 119 125 102 42.0 6e-21 MIGKIYTSMTEFFDSKTNSTRIKARPVLILTDTRNNDYTVLPISTITIKTNIDTYYDIKI DPISYPKLKLKKISYVRTHKRTSIHQASIDKSNIIGDLKTDYEELFLEILKKVEEFDNEV IESALR >gi|224461316|gb|ACDC01000086.1| GENE 5 3734 - 3850 57 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWSISLVFVDHFSIITLGPFSVDIYMMKYLRLSLMHLI >gi|224461316|gb|ACDC01000086.1| GENE 6 3904 - 5193 760 429 aa, chain - ## HITS:1 COG:alr5319 KEGG:ns NR:ns ## COG: alr5319 COG4325 # 
Protein_GI_number: 17232811 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Nostoc sp. PCC 7120 # 60 380 62 390 449 85 24.0 2e-16 MLSKIKLWFFNHKNFLRMSRFLILTVLLLLVTWIFDHQYITIKSYIPKQLLLSVEVSVDF LSNISGIFLTISTFSFTIIVTVLNKYSSSISPRMLQSFIDRTGVLGLYGIFVSGFFYSVI SILLLQDIAPDQHVVAGSFGIAYSIIAMLSFIAFSRQVLDNLKVSNIIENIFNDCEKLIN KEVELRKKAKHYEEGDESTKLSIVAESSGYLFEIKSDDIFKELNGIKAEFVINKRIGEYT TKGESVGELNIFQCKLDDNDKKELKEKLSPLFIINAHNNREEDYHHGIVNLTEIANMALS PGTNDPNTAIMCINKMSSLLGKLLSTGNHFIILKEDEDVKIIYQSYSVKDELYLGFSQII SYSAGDPLVTKAILQGIYIIYIMADMDAKKTIKRFFDDSYEILMENFTHEIHLDIFKKIK NNMEEHVSL >gi|224461316|gb|ACDC01000086.1| GENE 7 5424 - 5507 67 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRPNILVSNIRHCGVTRVRLISDFCT >gi|224461316|gb|ACDC01000086.1| GENE 8 5525 - 5686 373 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLRQPHISSTKCEDRWMETLPQGRKADATNRSVGFSDSLNGIENYVIIFKYT >gi|224461316|gb|ACDC01000086.1| GENE 9 5690 - 5851 118 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRCPLENSREFSRLVFWTQLCFVSKMTDAKASVTLRWRDYSHSIVAGGLDEMS >gi|224461316|gb|ACDC01000086.1| GENE 10 5806 - 7344 2223 512 aa, chain - ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 195 512 1 318 318 627 98.0 1e-179 MKKGGIIILDFGSQYNQLIARRVREMGVYAEVVPFHEDVDKILAREPKGIILSGGPASVY TEGAPSLDIKLFEQNIPILGLCYGMQLITHLHGGKVARADKQEFGKAELELDDKDNCLYK NIPNKTTVWMSHGDHVTEMAPNFKIIAHTDSSIAAIENKDKNIYAFQYHPEVTHSQHGFD MLKNFVFGIAKAEQNWSMENYIESTVKQIKETVGNKQVILGLSGGVDSSVAAALINKAIG KQLTCIFVDTGLLRKDEAKQVMEVYAKNFNMNIKCVNAEERFLSKLAGVTDPETKRKIIG KEFVEVFNEEAKKIEGAEFLAQGTIYPDVIESVSVKGPSVTIKSHHNVGGLPEDLKFELL EPLRELFKDEVRKVGRELGIPDYMVDRHPFPGPGLGIRILGEVTKEKADILREADAIFIE ELRKADLYNKVSQAFVVLLPVKSVGVMGDERTYEYTAVLRSANTIDFMTATWSHLPYDFL EKVSNRILNEVKGINRLTYDISSKPPATIEWE >gi|224461316|gb|ACDC01000086.1| GENE 11 7408 - 7974 810 188 aa, chain - ## HITS:1 COG:FN1880 
KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 188 2 189 192 264 72.0 8e-71 MIEKIKNTRSHRKFTDKKISKEEILKILEGARYSSSAKNSQFLRYSYTIDDEKCKKLFSA IALGGLLKLEDKPTLEERPRAYILISAKKDVNIPDFLQYFDVGIASQNIALLANELGYGA CIVMSYNKNVFKEVLELPEDYESKAVIVLGEAKDIVKLTDSKDENDTKYFIENGTHYVPK LPLDKILL >gi|224461316|gb|ACDC01000086.1| GENE 12 8090 - 8362 417 90 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739595|ref|ZP_04570076.1| SSU ribosomal protein S20P [Fusobacterium sp. 2_1_31] # 1 90 1 90 90 165 100 4e-40 MANSKSAKKRVLVAERNRVRNQAVKTRVKTMAKKVLATLELKDVEAAKTALSVAYKELDK AVSKGILKKNTASRKKARLAAKVNSLVNSL >gi|224461316|gb|ACDC01000086.1| GENE 13 8510 - 8806 459 98 aa, chain - ## HITS:1 COG:no KEGG:FN1878 NR:ns ## KEGG: FN1878 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 97 1 97 99 135 83.0 6e-31 MKPEVRDVINNINRFIQEQKYINVSSNLKMEENVVARNLNGKDPDVVAEVMENLELIFKE ISEVHNAGQADEYTERYYYLSDKFYTDMKQFKIDFFIK >gi|224461316|gb|ACDC01000086.1| GENE 14 8932 - 9534 653 200 aa, chain + ## HITS:1 COG:FN1876 KEGG:ns NR:ns ## COG: FN1876 COG0693 # Protein_GI_number: 19705181 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Fusobacterium nucleatum # 1 200 1 200 200 300 79.0 2e-81 MKKIAIFLFEGAELFEIASFTDIFGWNNIVGLKEFRDIKVETISYKEEIKCTWGGVLKAE KLVTENNIEEIFSYDALVIPGGFGGANFFKDKENEIFKKLVKYFSENNKIIVAICTAVIN LIETREIKNRKVTTYLLDNKRYFNQLKKFDIIPEEKEIVVDENLFTCSGPANALDLSLLI LEKMTSKENVEIVKKNMFLK >gi|224461316|gb|ACDC01000086.1| GENE 15 9598 - 10083 176 161 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225085052|ref|YP_002656490.1| ribosomal protein S2 [gamma proteobacterium NOR51-B] # 3 144 7 147 150 72 31 3e-12 MKIGENKVVALDYKVYDADTKELLEDTAELGPYYYIQGMGLFLPKIEAALDSRSKGYKTT IEIPMEEAYGDYDEELVEELTKADFADFEDIYEGMEFVVELEDGSEMVAVITEIDGDKVY TDSNHPFSGRNLLFEVEVADVREATDEELDHGHVHEYENEE 
>gi|224461316|gb|ACDC01000086.1| GENE 16 10090 - 10533 805 147 aa, chain + ## HITS:1 COG:FN1874 KEGG:ns NR:ns ## COG: FN1874 COG0698 # Protein_GI_number: 19705179 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Fusobacterium nucleatum # 1 147 1 149 149 262 91.0 2e-70 MKIALGADHGGFELKEKIKQHLSKKEGIEVIDFGTNSTESVDYPKYGHLVANSVVNKEVD FGILVCGTGIGISIAANKIKGIRAANCTNTTMAKLTRQHNDANILALGARIVGDVLALDI VDEFLAASFEGGRHQKRIDEIEACNLF >gi|224461316|gb|ACDC01000086.1| GENE 17 10546 - 10884 517 112 aa, chain + ## HITS:1 COG:FN1873 KEGG:ns NR:ns ## COG: FN1873 COG0537 # Protein_GI_number: 19705178 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Fusobacterium nucleatum # 1 112 1 112 112 202 92.0 9e-53 MATLFTKIINREIPADIVYEDDDVIAFKDIAPVAPIHVLVVPKKEIPTINDISDEDALLI GKVYRVIGKLAKEFGIDKNGYRVVSNCNEHGGQTVFHIHFHLIGGNQLGTMV >gi|224461316|gb|ACDC01000086.1| GENE 18 10920 - 11186 403 88 aa, chain + ## HITS:1 COG:no KEGG:FN1871 NR:ns ## KEGG: FN1871 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 6 88 1 83 84 76 71.0 3e-13 MALGLVACGEKFPYTSQSTKEKMIKELKVAMEKAEETKSEKDAQVLLEKMGEIIKISTEL EKRSSEGDEKAKEELDKWDKMITEIKPQ >gi|224461316|gb|ACDC01000086.1| GENE 19 11413 - 12738 2042 441 aa, chain + ## HITS:1 COG:FN0170 KEGG:ns NR:ns ## COG: FN0170 COG1160 # Protein_GI_number: 19703515 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 440 1 440 440 829 95.0 0 MKPIIAIVGRPNVGKSTLFNNLIGDKIAIVDDLPGVTRDRLYRDTEWSGSEFVIVDTGGL EPRNNDFLMTKIKEQAEVAMNEADVILFVVDGKAGLNPLDDEIAYILRKKNKPVILCVNK IDNYFEQQDDIYDFYGLGFEYLVPISGEHKVNLGDMLDIVVEIIGRMDFPEEDEDVLKLA VIGKPNAGKSSLVNKLSGSERTIVSDIAGTTRDAIDTLIEYKDNKYMIIDTAGIRRKSKV EESLEYYSVLRALKSIKRADVCILMLDAKEGLTEQDKRIAGIAAEELKPIIIVMNKWDLV 
ENKNNVTMKKMKEELYAELPFLSYAPIEFISALTGQRTTNLLEISDRIYEEYTKRISTGL LNTVLKDAILMNNPPTRKGRLIKINYATQVSVAPPKFVLFCNYPELIHFSYARYIENKFR EAFGFDGSPIMISFEAKSKDM >gi|224461316|gb|ACDC01000086.1| GENE 20 12971 - 13729 274 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227512216|ref|ZP_03942265.1| ribosomal protein S4e [Lactobacillus buchneri ATCC 11577] # 1 251 14 260 264 110 28 1e-23 DRISIMKSIDKTNNNLEKIENCIELAEKTDMIVYSKQFFPISQLNKLKHHELNFSFKGLN EDCEKKLLAVYPKDFTEEDLFFPVKYFKIEKKSKFIDLEHKHYLGNILALGLKRESLGDL IVKNGHCYGIILENMFDFLKENLLRVNSSPVEIIEIDESEVPQNEYQELNITLASLRLDS LVAELTNLSRTLGTNYIDLGNVQLNYEVEREKSTKIAVGDTIIIKKYGKFKIVEENGLTK KEKIKLIIRKYI >gi|224461316|gb|ACDC01000086.1| GENE 21 13740 - 15227 1745 495 aa, chain + ## HITS:1 COG:no KEGG:FN0173 NR:ns ## KEGG: FN0173 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 35 495 1 461 461 612 77.0 1e-173 MKKFIGFLLLFIVISVTILFFARDILLKAYLERKMSQANNAEVTIGSLDLDYFERYITLK DVKIMSNLNEEEVFISIDKLKSYYNINFRKKIITFDDAEVEGISFFGDAKYEYNSEEDMV VFENKVTEAEEKAKREKVLTELKNLYLNKIEENHLNLNEIFSRNLSNGKDLSELEKIKQS IKNIKESTEKNLNISEVVGEISNIGKSTKKLGQDLDIKDLNKSEAELKEGMTLEESLDRV VRNFLNRNKLVLFDLDGYINMYLNLVYEQKIYNLSLKYRNILDEIRVRKEKDSKLDDEDV WELFFNSISITSNVYGISFNGEVKNFSTRLSKDTDNTEFKLFGEKGNTIGEFKGFINFDT ELTESTLNIPEADLKDLGSDLLQGGQGVLFQNLKTDGSHLVINGSIHLKDMKLDVEKIIE TMKIEDEVTREIIAPLLKELNTGEIYYSYDTDSRTLQIKTNIVEIFDEILNGENSSLKSK IREQIKEDFLNKIGA >gi|224461316|gb|ACDC01000086.1| GENE 22 15407 - 16363 1601 318 aa, chain + ## HITS:1 COG:FN0174 KEGG:ns NR:ns ## COG: FN0174 COG2070 # Protein_GI_number: 19703519 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Fusobacterium nucleatum # 1 318 1 318 318 518 92.0 1e-147 MKNNKICELLGIKYPIFQGAMAWVSGGELAGAVSRDGGLGIIAGGGMEPELLRQHIRKAK EITSNPFGVNLMLLRPDVEQQMNVCIEEGVKVITTGAGNPGAFMEKLKAANIKVIPVIPT VKLAERMEKIGADAVIVEGMESGGHVGTLTTMALLPQIVNAVSIPVIAAGGIASGKQFLA ALAMGADAIQCGTIFLTAKECIIHQNYKDIILKAKDRSTVVTGTSTGHPVRVIDNKLAKE 
MIELERSGAPKEEIEKLGTGSLRLAVVEGDTERGSFMSGQVAAMVNDEKTTKEILEYLMN DLKIEVEQLRRRLENWNI >gi|224461316|gb|ACDC01000086.1| GENE 23 16451 - 17140 893 229 aa, chain + ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 1 229 1 216 216 305 72.0 5e-83 MSNQVIIKGKNDRLVIALNPKSDFLELCDILKTKILEAKNFIGNSRMAIEFSGRKLTSEE EDILIGILTENSNIVISYIFTDKNEKNEKNEKKPKDKKSKDQAMDLSKFNPLMEEGKTHF YRGTLRSGAKIESDGSVVVIGDVNPSSIIRARGNVIVLGRLNGTVYAGLNGDEQAFVTAI YFNPIQLTIGMKTKTDMQKEILDSSRVNKKDKFRIARIKNQEIVVEELI >gi|224461316|gb|ACDC01000086.1| GENE 24 17142 - 17936 1043 264 aa, chain + ## HITS:1 COG:FN0176 KEGG:ns NR:ns ## COG: FN0176 COG2894 # Protein_GI_number: 19703521 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Fusobacterium nucleatum # 1 264 1 264 264 432 90.0 1e-121 MGARVIVITSGKGGVGKTTTTANIGAALADKGHKVLLIDTDIGLRNLDVVMGLENRIVYD LIDVIEGRCRVSQALIKDKRCPNLVLLPAAQIRDKNDVNTDQMKELIHSLKESFDYILID CPAGIEQGFKNAIVAADEAIVVTTPEVSATRDADRIIGLLEAAGIKSPRLVVNRLRIDMV KDKNMLSVEDILDILAVKLLGVVPDDENVVISTNKGEPLVYKGDSLAAKAFKNIASRIEG VEVPLLDLDVKMSILEKIKFVFKR >gi|224461316|gb|ACDC01000086.1| GENE 25 17942 - 18211 384 89 aa, chain + ## HITS:1 COG:FN0177 KEGG:ns NR:ns ## COG: FN0177 COG0851 # Protein_GI_number: 19703522 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological specificity factor # Organism: Fusobacterium nucleatum # 2 89 1 88 99 120 80.0 8e-28 MLNMLSGLFKKESSKDEAKNRLKLVLIQDRAMLPSGVLENMKDDILKVLSKYVEIEKSKL NIEVSPCDDDPRKIALVANIPIIKAGNRK >gi|224461316|gb|ACDC01000086.1| GENE 26 18325 - 19248 498 307 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 3 307 38 344 353 196 35 1e-49 
MKKIYIAPIAGVTDYTFRGILEDFKPDLIFTEMVSVNALSVLNDKTISKILKLRDGNAVQ IFGEDIEKIKLSAKYIQNLGVKHINLNCGCPMKKIVNCGYGAALVKDPEKIKRILSEIKS VLNDDVKLSVKIRIGYKEPENYVQIGKIAEEVGCDHITVHGRTREQLYSGKADWSYIKEV KDNISIPVIGNGDIFTAEDALEKISYSNVDGVMLARGIFGNPWLIRDIREILEYGEVKNP VTKDEKINMAIEHLKRIRVDNDDQFIFDVRKHISWYLKGLENCAEAKRKINTLSDYDKII KLLEDLY >gi|224461316|gb|ACDC01000086.1| GENE 27 19299 - 21932 2808 877 aa, chain + ## HITS:1 COG:FN0693 KEGG:ns NR:ns ## COG: FN0693 COG0249 # Protein_GI_number: 19704028 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 877 20 896 896 1412 90.0 0 MSTDTPLMQQYKKIKEEYQNEILMFRLGDFYEMFFEDAKIASKELGLTLTKRNKEKGQDV PLAGVPYHSVASYIAKLVEKGYSVAICEQVEDPKAATGIVKREVTRVITPGTIIDVDFLD KNNNNYIACVKINTIENILAIAYADITTGEFSVFEIKDKNFFEKGLAEINKIQASEILLD EKTYSEYISILEERISFSGVKFTEIKNVKKAEDYLTSYFDIMSVEAFSLKSKDLAISAAA NLLHYIDDLQKGNELPFSKIEYKNINNIMELNISTQNNLNLVPKRNEESKGTLLGVLDGC VTSIGSRELKKIIKNPFLDIDKIKERQFYVDYFFNDVLLRENVREKLKDIYDIERIAGKI IYGTENGKDLLSLKDSIRKSLETYKLLKEHQELKKIFELDIEILLDIYNKIELIIDAEAP FSVREGGIIKDGYNSELDELRRISKLGKDFILEIEQRERERTGIKGLKIKYNKVFGYFIE VTKANEHLVPEDYIRKQTLVNSERYIVPDLKEYEEKVITAKSKIEALEYDLFKSLSSEIK EHIESLYKLANRIANLDIVSNFAHIATKNSYVKPEISEENILEIKGGRHPIVESLIASGS YVKNDIVLDEKNNLIILTGPNMSGKSTYMKQVALNIIMAHIGSYVAADYAKIPIVDKIFT RVGASDDLLTGQSTFMLEMTEVASILNNATEKSFIVLDEIGRGTSTYDGISIATAITEYI HNNIGAKTIFATHYHELTELEKELERAINFRVEVKENGKNVVFLREIVKGGADKSYGIEV ARLSGVPKDVLNRSRKILKKLENRKNLIESKMKAEQMMLFGNNFEEEEEEIETELINENE MKVLEMLKVMDLNSLSPLESLLKLSELKKILLGGNND >gi|224461316|gb|ACDC01000086.1| GENE 28 21925 - 24636 3862 903 aa, chain + ## HITS:1 COG:no KEGG:FN0694 NR:ns ## KEGG: FN0694 # Name: not_defined # Def: S-layer protein # Organism: F.nucleatum # Pathway: not_defined # 258 899 1 642 643 877 79.0 0 MTKKKIAYIGAGIVALVLGYFNYFGSDKETGDIRKLVETINAVYENDDLRIEAEKETDYI DEKESKFEKAKAFIKGMFLSGDNAFLDKDKNLTLDSNILGKSANGWEIKASQLKYNKETE ELESTKPMYAKNEEKGIEVLGNKFKTNVSMDNITLEDGVVIKNKLFSIVADKANYNNEAK 
TITLEGNISLSNKIGEIGDINTLTDVRNLQVGEVEKGKEMSGTFSKVYFNLNERNLYATD GFDMKYGEVGLKGRDIVLNEADQSFKVTGDVKFTYQDYVFDVNYIEKEANSDTINVYGQI KGGNPEYSVLADRAEYNINDKKFKILGNVVVTSTKGENLKADTFVYSSETKEADIYGNKI LYTSPTNNLEAEYIHYNSETKEVTTDKPFDSWNDKGEGIKGTSIVYNLGTKDFYSKEEIT VKSKDYGLTTKNVTYKEETGILSAPEPYVIKSKDESSVINGNSITYNKKTGELTSPGNIV MNNKGTIMKGHDLVFNNITGVGKLQGPIPFENKEDKMSGTAKEIIIKRGEYVDLMGPVRV KQDTTNMVVDKARYSYKDELVHVNTPVKFDDPVRSMVGSVSSATYSPKDGILRGSDFNMR EPNRTAKAQNVVIYNKENRRLELIGNAYLSSGADSITGPKIVYYLDTKDAETPTNSVIKY DQYTIKSSYGKVNKESGEIFVKNADVKSVDGNEFYSNQAKGNINDVVHFTGNVKGKSKQK EGDVYFSGDKADLYMAKIDDKYQAKKVIVNTKSTFTQLNRKIVSNYMELDLIKKEVYAKD KPVLTIDDGPKGNTLVKADDVTGYIDQDLIKLNKNVYVKNVNEKKEETVLTADRGTVTKK MADVYDKVKVVTKDSVTTANEGHYDMENRKIRAKGNVHVEYQTDKSAGNVFDNMTSNTKT TKK >gi|224461316|gb|ACDC01000086.1| GENE 29 24666 - 25391 243 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 4 214 5 219 223 98 26 5e-20 MITLSADNLVKAYKGRKVVDRVSLEVNKGEIVGLLGPNGAGKTTTFYMITGIVRPDDGEV LCAEEDITNLPMYKRADMGIGYLAQEPSVFRNLTVEENIEVVLEMKDMSKKEQKETVHRL LEEFKLTHVKDSLGYALSGGERRRIEIARTIANNPSFILLDEPFAGVDPIAVEDIQNIIR HLKKRGLGILITDHNVRETLSITDRSYIMAKGKVLIEGTAREIANNPEARRIYLGEKFKL D >gi|224461316|gb|ACDC01000086.1| GENE 30 25409 - 25984 707 191 aa, chain + ## HITS:1 COG:no KEGG:FN0696 NR:ns ## KEGG: FN0696 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 24 69 1 46 50 67 71.0 2e-10 MGKNRLLIFFGLLLICGVLSSTSLFAKKKESHYKAEENVYFLMGDFVNEWANTDFNPTDE KWKKEGVSLPPSVSKDEIKTLLEIEKNGKVVPATSKNHDPKMCIEYAPFDGNIYVFYGIS SEKSDTKTLIVYADEKDDKIFAMTNVGGTPYEFVGGFSLPEEYQKKWSHLISTLKKRFEE YKKNEQNIKDK >gi|224461316|gb|ACDC01000086.1| GENE 31 26000 - 28639 3906 879 aa, chain + ## HITS:1 COG:FN0697 KEGG:ns NR:ns ## COG: FN0697 COG0013 # Protein_GI_number: 19704032 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 13 878 1 866 867 
1546 90.0 0 MWKNSSSIIDKKMLTGNEIREKFIEFFMQKQHKHFESASLIPDDPTLLLTVAGMVPFKPY FLGQKEAPYPRVTTYQKCIRTNDLENVGRTARHHTFFEMLGNFSFGDYFKEEAIAWSWEF VTEVLKLDKDKLWVTVFTTDDEAERIWIEKCNFPKERIVRMGESENWWSAGPTGSCGPCS EIHVDLGVQYGGDENSKIGDEGTDNRFIEIWNLVFTEWNRMEDGSLEPLPKKNIDTGAGL ERIAAVVQGKPNNFETDLLFPILEEAARITGSQYGKNPETNFSLKVITDHARAVTFLVND GVIPSNEGRGYILRRILRRAVRHGRLLGYKDLFMYKMVDKVVERFEVAYPDLKKNLENIR KIVKIEEEKFSNTLDQGIQLVNQEIDNLLANGKNKLDGEVSFKLYDTYGFPYELTEEIAE ERGVTVLREEFEAKMEEQKEKARSAREVVMEKGQDSFIEDFYDKHGVTKFTGYEKTEDEA TLLSSREAKDGKYLLIFDKTPFYAESGGQVGDQGRIYSDNFSAKVLDVQKQKDIFIHTVE IEKGSAEENKTYKLEVNLLRRLDTAKNHTATHLLHKALREVVGTHVQQAGSLVDPDKLRF DFSHYEAVTAEQLAKIENIVNEKIREGIDVVVSHHSIEEAKNLGAMMLFGDKYGEVVRVV DVSGFSTELCGGTHIDNIAKIGLFKIVSEGGIAAGVRRIEAKTGYGAYLAEKEEADTLKE IEKKLKASNTNVVEKVEKTLESLKDAEKALESLKQKIALFETKAALSGMEEINGAKVLIA TFKDKTADDLRTMIDTIKDNNEKAIVVLASTQDKLSFAVGVTKTLTDKVKAGDLVKQLAE MTGGKGGGRPDFAQAGGKDESKLLDAFKEIRAIIEAKLS Prediction of potential genes in microbial genomes Time: Thu May 19 23:27:43 2011 Seq name: gi|224461315|gb|ACDC01000087.1| Fusobacterium sp. 2_1_31 cont1.87, whole genome shotgun sequence Length of sequence - 22548 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 10, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.500 + CDS 2 - 499 531 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 2 1 Op 2 31/0.000 + CDS 524 - 1759 1873 ## COG0342 Preprotein translocase subunit SecD 3 1 Op 3 1/0.500 + CDS 1759 - 2715 1161 ## COG0341 Preprotein translocase subunit SecF 4 1 Op 4 . 
+ CDS 2747 - 4291 2128 ## COG0500 SAM-dependent methyltransferases + Term 4299 - 4339 7.1 - Term 4286 - 4326 7.1 5 2 Op 1 1/0.500 - CDS 4330 - 4800 672 ## COG0716 Flavodoxins 6 2 Op 2 1/0.500 - CDS 4824 - 6002 1837 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 7 2 Op 3 1/0.500 - CDS 6052 - 6567 721 ## COG0350 Methylated DNA-protein cysteine methyltransferase - Prom 6588 - 6647 8.7 8 3 Op 1 29/0.000 - CDS 6798 - 8609 2930 ## COG0443 Molecular chaperone 9 3 Op 2 21/0.000 - CDS 8644 - 9306 1154 ## COG0576 Molecular chaperone GrpE (heat shock protein) 10 3 Op 3 . - CDS 9242 - 10264 1196 ## COG1420 Transcriptional regulator of heat shock gene - Prom 10322 - 10381 10.1 11 4 Tu 1 . - CDS 10421 - 10840 619 ## COG1598 Uncharacterized conserved protein - Prom 10894 - 10953 3.6 - Term 10910 - 10947 2.8 12 5 Tu 1 . - CDS 10957 - 11979 865 ## COG0457 FOG: TPR repeat - Prom 12006 - 12065 9.9 + Prom 12052 - 12111 7.7 13 6 Op 1 1/0.500 + CDS 12159 - 12950 1255 ## COG1692 Uncharacterized protein conserved in bacteria 14 6 Op 2 . + CDS 12925 - 13866 1580 ## PROTEIN SUPPORTED gi|237739628|ref|ZP_04570109.1| ribosomal protein L11 methyltransferase 15 7 Op 1 34/0.000 - CDS 13849 - 14637 420 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 16 7 Op 2 15/0.000 - CDS 14649 - 15467 279 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 17 7 Op 3 7/0.000 - CDS 15451 - 16239 574 ## COG1122 ABC-type cobalt transport system, ATPase component 18 7 Op 4 . - CDS 16249 - 16797 169 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 + Prom 17084 - 17143 12.8 19 8 Op 1 2/0.000 + CDS 17183 - 18991 1200 ## COG4984 Predicted membrane protein 20 8 Op 2 . + CDS 18984 - 19547 883 ## COG4929 Uncharacterized membrane-anchored protein 21 9 Tu 1 . - CDS 19800 - 20972 1468 ## FN1986 hypothetical protein - Prom 21107 - 21166 6.4 + Prom 20942 - 21001 11.1 22 10 Tu 1 . 
+ CDS 21083 - 22519 1650 ## COG4452 Inner membrane protein involved in colicin E2 resistance Predicted protein(s) >gi|224461315|gb|ACDC01000087.1| GENE 1 2 - 499 531 165 aa, chain + ## HITS:1 COG:FN0698 KEGG:ns NR:ns ## COG: FN0698 COG0816 # Protein_GI_number: 19704033 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Fusobacterium nucleatum # 28 165 1 138 138 204 92.0 9e-53 ASNELFFITLINLTNIELIFMKLIRGRMKRYLALDIGDVRIGVARSDLMGIIATPLETIN RKKVKSVKRIAELCKENNTTSIVVGIPKSLDGEEKRQAEKVREYIEKLKKEIENLEIIEI DERFSTVIADNILKDLNKNGAIEKRKVVDKVAASIILQTYLDMKK >gi|224461315|gb|ACDC01000087.1| GENE 2 524 - 1759 1873 411 aa, chain + ## HITS:1 COG:FN0699 KEGG:ns NR:ns ## COG: FN0699 COG0342 # Protein_GI_number: 19704034 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Fusobacterium nucleatum # 1 406 1 406 411 653 91.0 0 MNSKLFIRLLIVIAIFITAVYYSIRKPIKLGLDLKGGVYVVLEAVEDKNSNVKIDSDAMN RLIEVLNRRINGIGVAESSIQKAGDNRVIVELPGLQNAEDAINLIGKTALLEFKIMNEDG TLGETLLTGSALQKAEVSYDNLGRPQISFQMTPDGAHVFAKITRENIGRQLAITLDGEVQ TAPKINTEIAGGSGAITGNYTVEEATATATLLNAGALPIKAEVVETRTVGATLGDESIAQ SKNAGMVAIGLIWLFMIIFYRLPGIIADLAIIIFGFITFACLNFIDATLTLPGIAGFILS LGMAVDANVIIFERIKEELRFGNSIRNSIESGFGKGFVAIFDSNLTTLIITAILFVFGTG PIKGFAVTLALGTLASMFTAITATKVLLLTFVNVFGFRSPKLFGVTEGGEN >gi|224461315|gb|ACDC01000087.1| GENE 3 1759 - 2715 1161 318 aa, chain + ## HITS:1 COG:FN0700 KEGG:ns NR:ns ## COG: FN0700 COG0341 # Protein_GI_number: 19704035 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Fusobacterium nucleatum # 1 318 1 317 317 498 83.0 1e-141 MKVNLHIIRKMKYYLSVSIILVVLSIIVFFAKGLNYGIDFTGGNLFQLKYNDKKVTLTEI NDNLDKLSEKLPQVNSNSRKVQISEDGTVILRVPELKEEDKKEVLNSLQELGAFNLDKED 
KVGASIGDDLKKSAIYSLGIGAILIVLYITLRFEFSFAIGGILSLLHDIIIAVGFIALMG YEVDTPFIAAILTILGYSINDTIVIYDRIRENLKRRHTKNWTLEDCMDESVNQTAIRSLN TSITTLFSVIALLIFGGASLKTFIMTLLIGILAGTYSSIFIATPIVYILNKRKGNNMEDM FKDDDENNDGKRVEKILV >gi|224461315|gb|ACDC01000087.1| GENE 4 2747 - 4291 2128 514 aa, chain + ## HITS:1 COG:FN0701_1 KEGG:ns NR:ns ## COG: FN0701_1 COG0500 # Protein_GI_number: 19704036 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 3 246 5 248 248 292 56.0 1e-78 MDQNAEKNEKLESSYDENPYISKTYYHTQPEKLKSNLRLLDFISPDLKNAKVLEIGCSFG GNIIPFAIENPDATVVGVDLSKVQVDEGNKIIDFLGLKNIRIYHKNILDYNEHFEQFDYI ICHGVFSWVDENVQKGILKFIKKHLTKNGLAMISYNTYPGWKSLEVSKDAMKFRNRMLAK QNKDVTGKNQIAYGKGILEFLDEYSGLNKRIKDNFAYVAQKNDYYLLHEYFEVYNTPFYV YDFNELLETEGLAHIVDSYLQKSFPFLSNEILDKIENDCQGDYIGKEQYYDYLTDCQFRS SIITHKDNIKDINISRNIKIDSIKALNYRGFYLKNEDGKYVIGEDKEVVEDEKKALFLET VAKHYPNTVTVEDLEKELENKLTTVEICEILLVLIYQRKIEVYNDKLTVNKEEKIKISDR YRKYVEYFAETKFPVISSYGLSGINDLGLDLLRANVFLLFDGTRTDDYIVEILKAKHARD EIKVDNTDSKAVETILKEYVATMRTIIEENFLNK >gi|224461315|gb|ACDC01000087.1| GENE 5 4330 - 4800 672 156 aa, chain - ## HITS:1 COG:FN0119 KEGG:ns NR:ns ## COG: FN0119 COG0716 # Protein_GI_number: 19703467 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 154 10 163 164 239 77.0 1e-63 MAKSLIVYYSLSGLTKKVVDVLKNFTDADIYEIELEKPYSKLTAYTIGLAHCKTDYEPAI KNEIDLSNYDKIFIGGPAWCFTYAPPIHSFIKKYNLGDKVIYPFGTATSNFGNYFERFAK ECKAKEMKKPLKVFKSTFKDGLEDTVKVWLDEEYIK >gi|224461315|gb|ACDC01000087.1| GENE 6 4824 - 6002 1837 392 aa, chain - ## HITS:1 COG:FN0118 KEGG:ns NR:ns ## COG: FN0118 COG0484 # Protein_GI_number: 19703466 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Fusobacterium nucleatum # 1 392 1 392 392 563 90.0 1e-160 
MEKRDYYEVLGVDKGASEGDIKKAYRKAAMKYHPDKFANASDAEKKDAEEKFKEINEAYQ ILSDPQKKQQYDQFGHAAFEAGAGAGGGGFNANGFDFGDIFGDIFGGGGFGGFEGFSGFG DSSRRSYAEAGHDLRYNLEITLEEAAKGVEKTIKYKRNGKCEHCHGTGGEDSKMKTCPTC NGQGTVKTQQKTILGIIPSQTVCPDCHGKGEVPEKKCKHCHGTGIAKETVEKKINVPAGV DDGQKLKYAGLGEASQSGGPNGDLYIVIRIKSHDIFVRDRENLYCEVPISYSTAVLGGEV EIPTLNGKKTIRVPEGTESGRLLKVKGEGIKSLRGYGQGDIIVKITIETPKKLTDKQKEL LQKFEESLNEKNYEQKSSFMKKVKKFFKDIID >gi|224461315|gb|ACDC01000087.1| GENE 7 6052 - 6567 721 171 aa, chain - ## HITS:1 COG:FN0117 KEGG:ns NR:ns ## COG: FN0117 COG0350 # Protein_GI_number: 19703465 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Fusobacterium nucleatum # 2 171 1 170 170 238 82.0 5e-63 MVRSIKGISFLYNKEIGYLEIIEEKDGISEISFLGNINIEERKKLYNISTESPLTKKCSK QLEEYFSGKRKEFNIKLDVIGTKFQKECWNSLLKIPYGETISYSDEAKIIGKDKAVRAVG SANGKNSIPIIIPCHRVVSKDGSLGGYSGGEGGNKGIKIKKYLLELEKNFK >gi|224461315|gb|ACDC01000087.1| GENE 8 6798 - 8609 2930 603 aa, chain - ## HITS:1 COG:FN0116 KEGG:ns NR:ns ## COG: FN0116 COG0443 # Protein_GI_number: 19703464 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Fusobacterium nucleatum # 1 603 1 607 607 962 92.0 0 MSKIIGIDLGTTNSCVAVMEGGNVTIIPNSEGARTTPSVVNIKDNGEVVVGEIAKRQAVT NPTSTVSSIKTHMGSDYKVEIFGKKYTPQEISAKILQKLKKDAEAYLGEEVKEAVITVPA YFTDSQRQATKDAGTIAGLDVKRIINEPTAAALAYGLEKKKEEKVLVFDLGGGTFDVSVL EISDGVIEVISTAGNNHLGGDNFDDEIIKWLVAEFKKENGIDLSNDKMAYQRLKDAAEKA KKELSTLMETSISLPFITMDATGPKHLEMKLTRAKFNDLTRHLVEATQGPTKTALQDANL NASQIDEILLVGGSTRIPAVQEWVENFFGKKPNKGINPDEVVAAGAAIQGGVLMGDVKDI LLLDVTPLSLGIETAGGVFTKMIEKNTTIPVKKSQVYSTYADNQTAVTINVLQGERARAI DNHSLGNFNLEGIPAAPRGVPQIEVTFDIDANGIVHVSAKDLGTGKENNVTISGSSNLSK ADIERMTKEAEANAEEDKKFQELVEARNKADQLISATEKTLKENPDKVSEGDKKNIEDAI EELKKAKDGDDRGAIEAAIEKLSQASHKFAEDLYREAQAQQQAGANASSDNKADDVAEAE IVD >gi|224461315|gb|ACDC01000087.1| GENE 9 8644 - 9306 1154 220 aa, chain - ## HITS:1 COG:FN0114 KEGG:ns NR:ns ## COG: FN0114 COG0576 # 
Protein_GI_number: 19703462 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Fusobacterium nucleatum # 26 220 1 199 199 236 81.0 3e-62 MLVGKLINSLIQWKEKKIKKFRRQSMQDKDIKDEVLQEDINKEEVKEEAHEHEHEHKHGG HSCCGKHGHKHEEEIGKLKAEIENWKNDYLRKQAEFQNFTKRKEKEVEELKKFASEKIIT QFLGSLDNFERAIESSTESKDFDSLLQGVEMIVRNLKDIMTSEGVEEISTEGAFNPEYHH AVGVEVCEDKKEDEIVKVLQKGYMMKGKVIRPAMVIVCKK >gi|224461315|gb|ACDC01000087.1| GENE 10 9242 - 10264 1196 340 aa, chain - ## HITS:1 COG:FN0113 KEGG:ns NR:ns ## COG: FN0113 COG1420 # Protein_GI_number: 19703461 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Fusobacterium nucleatum # 1 340 12 351 351 536 90.0 1e-152 MRISEREKLVLNAIVDYYLTVGDTIGSRTLVKKYGIELSSATIRNVMADLEDMGFIEKTH TSSGRIPTDMGYKYYLTELLKVEKITQEEIENINNVYNRRVDELENILKQTSTLLSKLTN YAGIAVEPKPDNTKVDRVELVYIDEYLIMAVIVMEDRRVKTKNIHLPYPITKDEVDKKVV ELNDKIKNNEIAINDIEKFFTESSDIIYEHDDEDELSKYFINNLPGVLKDRDIEEVTDVI EFFNERKDIRDLFEKLIEQKAKENSKTNVNVILGDELGIKELEDFSFVYSIYNLGGAQGI IGVMGPKRMAYSKTMGLINHVSREVNKLINSMEREKNKKV >gi|224461315|gb|ACDC01000087.1| GENE 11 10421 - 10840 619 139 aa, chain - ## HITS:1 COG:SP1786 KEGG:ns NR:ns ## COG: SP1786 COG1598 # Protein_GI_number: 15901615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 131 1 143 150 60 28.0 7e-10 MNLTYPAIISHEDDVFYIGFPDIEENIEDCFYVTYGDSFNDAIEMGKEYLILKLEDYENN KKNFPKASSISDLKKKLKNNQEIVYITLNYEYEKSLIKLAYVKKTLTIPNYLDILAKNKN INFSQVLQNALKKELGIKK >gi|224461315|gb|ACDC01000087.1| GENE 12 10957 - 11979 865 340 aa, chain - ## HITS:1 COG:FN0819 KEGG:ns NR:ns ## COG: FN0819 COG0457 # Protein_GI_number: 19704154 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 1 340 1 342 665 327 54.0 1e-89 MKDDLIKLINELEKEREYQKIITMIEALSDEDKNSKIKLSLAKAYSHINEFDKTIEILES 
IKASESNTSIWNYCMGHSYYYLDNPSEAERYFLKALEINPKDKPSNFLLALLYHELGDIE EAEEAIYYLNKSLNYFNAYSKLNAEEDITEDLISIEQKLAWNYDKLKNHKEAEIHLRKAI SLGDNEEWVYSQLAYNLRSQERYEEALENYQKVIELGRKDTWLYSEIAWTYFLIKKPQLA LDYMKKAKELSPVEVDLALITRTASILLALAEHKKAIKMIEEVISKEEYKNDRNLLSNLA YIYIDMKDYNSALTYLQRLKELGRNDEWLNKNLEFVYSKL >gi|224461315|gb|ACDC01000087.1| GENE 13 12159 - 12950 1255 263 aa, chain + ## HITS:1 COG:FN1609 KEGG:ns NR:ns ## COG: FN1609 COG1692 # Protein_GI_number: 19704930 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 263 1 263 263 468 88.0 1e-132 MKVLVVGDVVGRPGRNTLQAFLEKYKENYDFVIVNGENSAAGFGITVKIADEFLSWGTDV ISGGNHSWDKKEIYEYLDNSDRMVRPANYPSEVPGKGYTILEDKNGNKIALISLQGRVFM SAVDCPFRTAKKLIEEISKTTKNIIIDIHAEATSEKIALGKYLDGEVSLVYGTHTHVQTA DERILANGSGYISDVGMTGSQNGVIGTNAETIIKKFLTSLPQKFEVAEGEEQLSGIEVEI DEKTGKCKKIKRINWSENEGFRS >gi|224461315|gb|ACDC01000087.1| GENE 14 12925 - 13866 1580 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739628|ref|ZP_04570109.1| ribosomal protein L11 methyltransferase [Fusobacterium sp. 
2_1_31] # 1 313 1 313 313 613 99 1e-175 MKMKVLEAKVIYESDNIEKYKKIISDIFYNFGVTGLKIEEPLLNKDPLNFYKDEKQFLLS ENSVSAYFPLNIYSEKRKKVLEETFKEKFSEDEEIVYNLDFYEYDEEDYQNSWKKYLFVE KVSEKFVVKPTWREYEKQDDELVIELDPGRAFGTGSHPTTSLLLKLMEEQDFTNKTIIDI GTGSGILMIAGKLLGAGEVYGTDIDEFSMEVAKENLLLNNISLDEVKLLKGNLLEVIENK KFDIVVCNILADVLVKLLDEIKYILKEDSIVLFSGIIEDKLAEVISKAELVGLEVAEVKE DKEWRSCRLLVKK >gi|224461315|gb|ACDC01000087.1| GENE 15 13849 - 14637 420 262 aa, chain - ## HITS:1 COG:FN2006 KEGG:ns NR:ns ## COG: FN2006 COG0619 # Protein_GI_number: 19705302 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Fusobacterium nucleatum # 1 247 1 247 266 310 77.0 1e-84 MNIILGEYINRDSVLHHLDPRTKLIGSFSLILSFLFANNLSIYLIYSVLALILIFLSKIP LTAFLKSLKYLSYILIFSSFFHIFSKQEGELLFKVWKYSVYDSGIFSAIKMMGRIILLLI FSSLLTLTTKPLDIALALETLLSPLKKIGLPIQDFSIMLSITLRFIPTILQEFNTIKMAQ QARGGNFETRNPFKKLSQYSLILLPLLMSVIKKVDNLTLAMEARAFHCGLERTNFHRLKF QKIDYLAFIILFSIIIFLFFYQ >gi|224461315|gb|ACDC01000087.1| GENE 16 14649 - 15467 279 272 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 4 243 131 375 398 112 29 3e-24 MKISIKNLCYSYSVFNDEKNAIKDVSLEINSNKRIAIVGHTGSGKSTLLKLIKGLLKNQT GEISIDGKIEDIGYIFQYPEHQIFETTIFKDVAFGLRKLKLCEKDLTERVEKSLQLVGLN KDYLHRSTLNLSGGEKRKVALAGVFIMENQLLLLDEATVGLDPESKNELFKILLNWQKEN NSGFIFSSHDMNDVLNYAEEVIVMSEGKVLYHTKPSELFEKYSDSLESLGLVLPKSIDFL NRLNKNLKNPLTFENEIREEDILKAIEERLAK >gi|224461315|gb|ACDC01000087.1| GENE 17 15451 - 16239 574 262 aa, chain - ## HITS:1 COG:FN2004 KEGG:ns NR:ns ## COG: FN2004 COG1122 # Protein_GI_number: 19705300 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 261 1 263 264 388 82.0 1e-108 MIEVENLSFSYQNNKVLKNVSFSIEKGEYLCIIGKNGSGKSTLAKLLAALIYQQEGAIKI SGYDTKNQKNLLNIRKIVGIIFQNPEEQIISTTVFDEIIFALENLAIPREDIKEIAEKSL 
KNLNLLEYKDRLTYQLSGGEKQRLAIASILAMGTEILIFDEATSMLDPVGKKEVLRIMKE LNSQGKTIIHITHDRDDILEASKVMLLSEGEIKYLGNPYKVFDDDIAFLLKIKNILEKHN IKVEDENINMEDLVKIVYENIY >gi|224461315|gb|ACDC01000087.1| GENE 18 16249 - 16797 169 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 2 180 4 182 190 69 27 2e-11 MKIKNMLYAAMFAAIVAVLGLMPPIPLPFIPVPITLQTMGVMLAGSFLGKRLGFISMLLV VVIVLLGLPILSGGRGGLAVLTGPTGGFFIVWPFAAFLIGFLAEKFWKNINIGKYIVVNI IGGIVLVYLVGAIYLSYITKMPIDKAFLATMAFIPGDVLKAIVVSVLCYKLKEISPINEV VR >gi|224461315|gb|ACDC01000087.1| GENE 19 17183 - 18991 1200 602 aa, chain + ## HITS:1 COG:FN2002 KEGG:ns NR:ns ## COG: FN2002 COG4984 # Protein_GI_number: 19705298 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 32 600 1 569 570 518 66.0 1e-146 MFEKIKKFFLYFSVIFLIAGVTSFTAYNWATMSSIEKLAVPSALIIAGLGAYLFLKKDIY KNLALFFSSFTIGTLFAVYGQVYQTGADTWILFRNWAIFLIIPMIATGYYSIVTLFTIVV AVGTNFYLELYLSGSIIPFLSSLIFGIVLLVYPFIQKRFNFKFNNIFYNIMTGIFYISFI ASGFAAINDHHNGLIAIVLYLLFVAAVYFVGYKQLKKITIKILSITALGCFGVAIIIKMI SSIIYTDATVYIFFSLMVIIGTIVAVVKSSNEIENENIKKFTNVVVGFLKVFAFFLLMIF VFSLLGLMGLGEKAFIVVAILLIIFSYFAAKMLGLKNDKIEIVAFIAGLICLGIYLSVSL EMSSLSVIFIITIIFDLFWFFMPTRALDLLLFPVNYLLLGFFLSEKAPSINYYYSIITIT LIVEAYFYFLYDKKELLNEKLKRVLIGNEAALILLPLSWLSTRIGIFIDDYELMFKYAQY YRIVDIALTTLIGAFVIFKTIKNQKLQIVLCILWLGLNYFAYSQILSLIFVMLIMLIYAS KNSKWGILVPTLAACYVIYTYYFTTYRSLLDKSIALSISGGLLLVAYLVLKYGFKGVDEN NE >gi|224461315|gb|ACDC01000087.1| GENE 20 18984 - 19547 883 187 aa, chain + ## HITS:1 COG:FN2001 KEGG:ns NR:ns ## COG: FN2001 COG4929 # Protein_GI_number: 19705297 # Func_class: S Function unknown # Function: Uncharacterized membrane-anchored protein # Organism: Fusobacterium nucleatum # 1 187 1 186 186 224 74.0 9e-59 MSNKMKKILIVVNIVLLFVITGFSAQKEESYKKLDSYFYLELRPVDPRSLLQGDYMTLNY DILDQTTEFIYQNKSYDYYEEEKKEETEQERERRELAEAKKAYIAIRLDGNKVAKFVKLA KEKTDEKDLLFVAYKSDGYNVDINANSYLFQEGTGDKYENARYAKVVLVDNKLRLIDLRD KDFKEIK 
>gi|224461315|gb|ACDC01000087.1| GENE 21 19800 - 20972 1468 390 aa, chain - ## HITS:1 COG:no KEGG:FN1986 NR:ns ## KEGG: FN1986 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 55 390 1 336 336 417 68.0 1e-115 MKKILLLCLFSILSIFSFANDWEFGSEGEHIIPLKGSAVAIKKEKITLKLTEDGMLVNVK FTFDSPNAENKIIGFVTPESGNNEEYEEDYSKAKRKAEPLKIKNFKTVVNGKEVKSNVEL LSKLLSRGVLDNNVIKEYIEEEKNFYNYVYYFNADFKQGENVVEHSYYYTGSYGIFQRDF AYVVTTIAKWKNKTVEDFEIEVIPGKYFVKLPYTFWKNGKKIDWQIAGKGKMVSIAPTNP NSDDSYGIDKYGAVYLNLDNGSVKYKTKNFSPDTDFYMTRIDNIPGFDYEFPAGKVQGYR FKDGDYMFDTSLASLLNSDADDLKGLSNLQLDILRNYPYAIAGYDFARKDLKDYFSEFIW YRPTSKNVKINPNYNDLIKTIDKIKASRKK >gi|224461315|gb|ACDC01000087.1| GENE 22 21083 - 22519 1650 478 aa, chain + ## HITS:1 COG:FN1985 KEGG:ns NR:ns ## COG: FN1985 COG4452 # Protein_GI_number: 19705281 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Fusobacterium nucleatum # 19 474 1 452 454 640 71.0 0 MDNNSYKIPSNKKPFSPVMKKLIFLVVFAIILIIPLLFVGKLVERRGRLFKETVTEIGNE WGKSQKIIAPVISLSYTDSSLSKDDSIRNEKNVVVQPVQRRLAILPEELNATIEMKDELR HRGIYNATVYTANIKLTGYFSPKDFPDKNDMVAYLSIGLSDTKALVKVNKFKLGNVEKDL EAMSGTMANPLFTSGISGQIGPEYDGMMKEDKIPFEIDIDIRGSRKISILPLGKKNNFDI KSNWKSPSFSGVLPTERNIDDNGFTAKWEISNLIRDYPQVLDINQDVYDDFKDSYSEADL EVYRDSEEYEYYNSDDSKIVKVLLYNSVTDYTQIYRACNYGFLFILMSLVIVYIFEIVSK KVAHYVQYIVVGFSLVMFYLLLLSLSEHLGFEMSYLVASLAIVIPNSLYVASMTDNKKFG IGMFIFLSGIYAILFSILRMEQYALLTGTLLILAVLYVVMYLTKKADIFFKLEEGNNQ Prediction of potential genes in microbial genomes Time: Thu May 19 23:27:51 2011 Seq name: gi|224461314|gb|ACDC01000088.1| Fusobacterium sp. 2_1_31 cont1.88, whole genome shotgun sequence Length of sequence - 2650 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 23 - 1420 1159 ## FN0687 hypothetical protein 2 1 Op 2 . 
+ CDS 1482 - 2630 1861 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases Predicted protein(s) >gi|224461314|gb|ACDC01000088.1| GENE 1 23 - 1420 1159 465 aa, chain + ## HITS:1 COG:no KEGG:FN0687 NR:ns ## KEGG: FN0687 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 465 1 464 467 543 66.0 1e-153 MYVAITGKGKAKVIQFCEQHRIPKTNKKKTVVIKTIGNYETLLKENPNIIEELKEEAKRL TIEKKEKIPKTNLFRFGHSLVNALWKELSLDDILGEDLSKSLFALVVYRLGSSYSTFLEN RKTPFISLNSLSHSEFYDVLLQLDKKTKDLIKCFNKFFDKKIKRDKNIVYYHRGNYIYNS YWKVLYGLESNNFQKGEKDLPFNMNLFFDSYGIPISYQLSLKEDNSKNRLEDFKKNFKNS KLILVLTKESEIQEKSSISSISFEDLSEDIQNEILKDNKWKILERDIKTNEVLEKEKVLD IKDSKLYVYWNKKRAYKDYLENNLKNGYICLKTDENLEDYEISNIFQHSWNIEDKFKITD VDFSKRHIQGHFTLCFICLCIIRYFQYLLGDNGKVFIPMIYANKAISNPMIFMKKVGNDS SLYPIHLTNSYIKLSKILGLDELKEEINFEKFQDKIKMDLEKVNN >gi|224461314|gb|ACDC01000088.1| GENE 2 1482 - 2630 1861 382 aa, chain + ## HITS:1 COG:CAC0390 KEGG:ns NR:ns ## COG: CAC0390 COG0626 # Protein_GI_number: 15893681 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Clostridium acetobutylicum # 3 380 7 382 384 447 56.0 1e-125 MNKNVGTVCVHGKKQRRNVDNTGAVSFPIYQSATFVHPAFGESTGFDYSRLQNPTREELE RVVNDLEEGVDALAFSTGMTAVTVLLDILEPGDHIVATDDLYGGTIRLMESICKKNGIKT TFVETDKVENVEKAIEKNTKMIYIETPTNPMMKIADIEEISKIAKKNNCILVVDNTFLTP YFQKPLKLGADVVLHSATKYLAGHNDTLAGFLVTNSQEISEKLRFITKTIGACLSPFDSW LVLRGIKTLHIRMEQHQKNAKKIVEWLKTQKAVVSVYYPGLEENESIEVSKKQGTGFGGM VSFHVDTPERAKKILKDIKLIQFAESLGGVESLITYPMFQTHADVPLEERLARGINECLL RMSVGIEDVNDLIEDLDQAINK Prediction of potential genes in microbial genomes Time: Thu May 19 23:28:00 2011 Seq name: gi|224461313|gb|ACDC01000089.1| Fusobacterium sp. 2_1_31 cont1.89, whole genome shotgun sequence Length of sequence - 11123 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 24 - 5084 5632 ## FN0033 hypothetical protein 2 1 Op 2 . - CDS 5093 - 5506 617 ## FN0794 hypothetical protein - Prom 5696 - 5755 7.9 + Prom 5660 - 5719 8.6 3 2 Op 1 1/0.000 + CDS 5740 - 6339 558 ## COG0517 FOG: CBS domain 4 2 Op 2 . + CDS 6352 - 8901 3625 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Prom 8903 - 8962 3.6 5 3 Op 1 . + CDS 8982 - 9056 110 ## 6 3 Op 2 . + CDS 8995 - 9129 75 ## 7 3 Op 3 . + CDS 9126 - 11063 2122 ## COG3855 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|224461313|gb|ACDC01000089.1| GENE 1 24 - 5084 5632 1686 aa, chain - ## HITS:1 COG:no KEGG:FN0033 NR:ns ## KEGG: FN0033 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 111 1685 1 1606 1607 1501 54.0 0 MFDFYSNKFTSDINHFIKKAKDRVRSLDRDNQRFIEDIFTKGNYRNYGEIFENNLINKFS RRENVKFEDIFPENIHPALETLIGESNLKNFIKIGEKITKTPYTIGYTRRMVRSSNCRNY IDKLFSILKTFVHYKFFDINTKKLLLGNCNFKGLEGWNLKNLITSLENKYIIANDIDNGN QDVIDFINEALTSGSSKNISYGTLAAIFVSENKSLVEMTGKLLLAAQRQEGLRQQICETM DEGSQENFEYMFKIIYDNDLIRFSSVKRALGVWTGLLGQNYNNPETVGKKELEIINKLID NPKYADELLKSDDNVEVYLALWYKASQDVKIALEAIQELLKVSKLHTKLLVAYNLDIFQD IKYQRTVAKDIIKEYSEKDDNDFLKIVACYWEHLAYNAYTNTSIKTNRGLFDTTDEAKEF FEIFKKVFALIDGKDKAFNPIIFPWVSRYIYTHNIAALLFTIAASYPELNLKNEVLTYFK ALEPYSRSGYLKSLFNKPENKDEELFVVKMLADATVTNEANKIIRVNNLASKYTKEIEDI LRLKTADVRKNAIALILSLESSQLLEATENLVQDKNANKRLAGLDILTKIKDKQDFAKEK IEKIVATIKEPTDPEKILIDGLVGKVETTESSDLYDKTYKFELPYEVKEVKKLSKNVKKN KDGVYILEKSIDAKDIFTKTEDELFELVKKFNALIVNNGTYEYTNGYTGEKILLRDNFLP IVKRANYYYSVNEHLDEYPLADTWREFYKNEIKDFSTLYQLYLLTQSHLRIENFNNVINK ILRTTPGIILKKIIHHFKTFSDNEIMERIIYLLYKEYREENKEYLFETSKAFFIELLKEN PTNLIHRRNKNDNYNSIFDLEYSIPTVVFKNLSEYWDERTFTENLILKLNFEKKVSSYKT RENFYSLIDIANAVELGLIEKDLLIKSIFSEDIDKMDTNFRNLYNFLEIKNPNNYYYYNN YDDNEKIKNSWNYENAIKVLKKYGLEVVNYVVDNELKRGDSKTKYSKLITSINRIEGVDY LIKILQALGNEKLVRSDYWYGDNTSKKEVLSHLLKVCFPSEKDDLKTFKEKIKKTNITEE RLVEVAMYASQWIELIDKFLKWKGFISGCYYFQAHMSDVSKDKEGIIAKYSPISIEDFQA 
GAFDIDWFKDAYKQLGKEHFDVLYDSAKYITDGTKHSRARKFADAVLGKMKVKDVEKEIS AKRNKDLVASYSLIPLAKNKIKDALSRYKFLHNFLKESKQFGAQRRASEAKAFEVSLENL SRNMGYSDVTRLTWAMESEMMAEMKKYFEPKKIQDYSVYIEIDDLGQSSIKYEKDGKVLK SLPTKIKNEKYIEEIKEVHKNLKEQYSRSRKMLEQSMEDGIKFYAYEIKTLSTNPVVAPL IKDLVFKVDGILGYYEDNKLIGFDKKSKKATLIEDIDKDTLLSIAHPFDLFNSKQWPLYQ QDILEREVKQVFKQVFRELYIKTKDELKMDKSRRYAGHQIQPTKSIALLKTRRWVVDDYE GLQKVYYKENIIAKMYAMTDWYSPAEVEAPTIEDIVFYDRKTFELMTIEDVPDLIFSEVM RDIDLVVSVAHVGDVDPEASQSTIEMRRAIVEFNAKLFKLKNVTFTESHALIRGTRAEYS IHLGSGVIHQKAGATIEVLPIHSQHRGRIFLPFIDEDPKTAEIMAKVLLFAQDEKIKDIF ILDQIL >gi|224461313|gb|ACDC01000089.1| GENE 2 5093 - 5506 617 137 aa, chain - ## HITS:1 COG:no KEGG:FN0794 NR:ns ## KEGG: FN0794 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 137 1 137 137 214 89.0 9e-55 MKLSINIKGLSRKKVIHQEEIEIINEISTTKDLIKELVTINVEKFNKKIDDKDILSIMTN EYIAEAARSGKIGDEVHGDKKANLEKALDTAYLAFEDGLYCIFINDEQTEKLDDSLNLKD GDVLTFIKLTMLAGRMW >gi|224461313|gb|ACDC01000089.1| GENE 3 5740 - 6339 558 199 aa, chain + ## HITS:1 COG:FN0795 KEGG:ns NR:ns ## COG: FN0795 COG0517 # Protein_GI_number: 19704130 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Fusobacterium nucleatum # 1 199 1 198 198 308 89.0 3e-84 MILTERQKKILKMLKEKSLLSGDEIAKNLNVTKSALRTDFSILTALKLVTSKQNKGYSYN NKCTIIRVKDCMSPQNSIDVKTSVYDAIIHLFNYDLGTLVVVENEKLVGIISRKDLLKAT LNKKNIEKTPVSMIMTRMPNIVHCFEDDNIMDAIEKLIKHEIDSLPVLRKENGKLSLVGR FTKTNVTKLFYQELKNKSI >gi|224461313|gb|ACDC01000089.1| GENE 4 6352 - 8901 3625 849 aa, chain + ## HITS:1 COG:FN0796 KEGG:ns NR:ns ## COG: FN0796 COG0574 # Protein_GI_number: 19704131 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Fusobacterium nucleatum # 1 848 1 848 851 1542 91.0 0 MKQVYEFRDGGKEMMALLGGKGANLAEMAKIDLPIPKGIIISTTACNEYFKNDKKLSPVL EEEILRNIRVLEYETGKKFQSPKPLLVSVRSGAPVSMPGMMDTILNLGFNDYVAEKMLEI TKDEKFVYTSYLRFVQMFSEIAKGIDRRKFVHLKATDYKAQILESKKIYRDECGEIFPEN 
YRDQILIAVKSIFDSWNNDRAILYRKLHNIDNNMGTAVVIQEMVFGNFNEKSGTGVLFTR NPSTGEDKIFGEVLLNAQGEDIVAGIRTPDNIELLQNSMPDIYNQLVETAKKLEKHNRDM QDIEFTIENSKLFILQTRNGKRTAEASLKIAMDLVKEEIITKEEAVMKVEPASINKLLNG DFEEKYLKEATLLTKGLAASSGVAVGRIMFDAKRVKIREKTILVREETSPEDLQGMALAQ GIVTLKGGATSHGAVVARGMGKCCVTGCSEIKIDEINKTMTIGDHVLKEGDFISVSGHTG EIFLGKIPLKENSFSDELKEFVSWASEVKRMNVRMNADTVEDVEQGKSFGAKGIGLCRTE HMFFKNDKIWTIREFILSDRGEEKERALKKLHNLQKEDFLNIFEVLDGDEANIRLLDPPV HEFLPKTTDDKKKMAEILLISLEEIEKRIYKLKDENPMLGHRGCRLGVSYPELYRIQARA IIEAAYECEKKGIKVHPEIMIPFIMEAKELAYLRKEIEEEIEDLFKELGARVEYKLGTMI EIPRACLLADEIAEYADFFSFGTNDLTQMSMGLSRDDSVKFLDDYREKGIWEGEPFYSID RKAVSQLVELGVKNGKSRKTNLKIGVCGEHGGDPKSIEFFEEQNLDYISCSPFRVPTAIL AAAQAYLKK >gi|224461313|gb|ACDC01000089.1| GENE 5 8982 - 9056 110 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRVIVGVFEANLLASFAEITAKR >gi|224461313|gb|ACDC01000089.1| GENE 6 8995 - 9129 75 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSACLKPTCWQVLPKLQRNVNFLSLRNLASNEQFFYCSKEWGII >gi|224461313|gb|ACDC01000089.1| GENE 7 9126 - 11063 2122 645 aa, chain + ## HITS:1 COG:FN0798 KEGG:ns NR:ns ## COG: FN0798 COG3855 # Protein_GI_number: 19704133 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 645 1 645 645 1207 93.0 0 MNTEIKYLELLSKTFKNIAETSTEIINLQAIMNLPKGTEHFMTDIHGEYEAFNHVLRNGS GTIRNKIEEVYKDKLTESEKKELAAIIYYPKEKIEIMQNTANFNVDRWMINIIYRLIEVC KIVCSKYTRSKVRKAMPKDFQYILQELLYEKKELANKREYFDSIVDTIISIDRGKEFIIA ISNLIQKLNIDHLHIVGDIYDRGPFPHLIMDTLAEYNNLDIQWGNHDILWIGAALGNKAC IANVIRICCRYNNNDILEEAYGINLLPFATFAMKYYGNDPCKRFRPKEGVDSDLIAQMHK AMSIIQFKVEGLYSERNPELEMSSRESLKFINYEKGTITLDGVEYPLNDTNFPTVNPENP LELLDEEAELLDKLQALFLGSEKLQKHMQLLFSKGGMYLKYNSNLLFHACIPMEPNGEFS EMYVVDGYYKGKALLDKIDNVVRQAYYDRKNVEVNKKHRDLIWYLWAGRLSPLFGKDVMK TFERYFIDDKSTHKEVKNPYHKLVNDEKICDKIFEEFGLNPRTSHIINGHIPVKVKEGES PIKANGKLLIIDGGFSRAYQSTTGIAGYTLTYNSYGIKLASHLKFISKEAAIKDGTDMVS SHIIVETKSKRMKVKDTDIGRSIQSQINDLKKLLKAYRIGLIKSN Prediction of potential genes in microbial 
genomes
Time: Thu May 19 23:28:33 2011
Seq name: gi|224461312|gb|ACDC01000090.1| Fusobacterium sp. 2_1_31 cont1.90, whole genome shotgun sequence
Length of sequence - 16272 bp
Number of predicted genes - 19, with homology - 19
Number of transcription units - 7, operones - 5 average op.length - 3.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 63 - 1997 2152 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases
- Prom 2090 - 2149 13.5
+ Prom 2070 - 2129 10.2
2 2 Op 1 1/0.000 + CDS 2162 - 2944 961 ## COG1183 Phosphatidylserine synthase
3 2 Op 2 1/0.000 + CDS 2946 - 4022 861 ## COG0859 ADP-heptose:LPS heptosyltransferase
4 2 Op 3 . + CDS 4026 - 5477 1582 ## COG0168 Trk-type K+ transport systems, membrane components
5 2 Op 4 . + CDS 5506 - 6288 931 ## FN0994 hypothetical protein
6 2 Op 5 . + CDS 6304 - 6921 626 ## FN0995 hypothetical protein
+ Term 6991 - 7026 -0.2
+ Prom 7026 - 7085 9.5
7 3 Op 1 36/0.000 + CDS 7164 - 7655 344 ## PROTEIN SUPPORTED gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35
8 3 Op 2 46/0.000 + CDS 7728 - 7934 359 ## PROTEIN SUPPORTED gi|19703669|ref|NP_603231.1| 50S ribosomal protein L35P
9 3 Op 3 . + CDS 7970 - 8320 567 ## PROTEIN SUPPORTED gi|237739652|ref|ZP_04570133.1| LSU ribosomal protein L20P
+ Term 8359 - 8414 14.8
+ TRNA 8417 - 8505 67.4 # Ser GCT 0 0
+ Prom 8430 - 8489 80.4
10 4 Op 1 . + CDS 8730 - 9530 982 ## COG2849 Uncharacterized protein conserved in bacteria
11 4 Op 2 . + CDS 9549 - 10370 874 ## FN0458 hypothetical protein
12 4 Op 3 1/0.000 + CDS 10383 - 11354 1313 ## COG0113 Delta-aminolevulinic acid dehydratase
+ Prom 11367 - 11426 7.9
13 4 Op 4 . + CDS 11453 - 11998 849 ## PROTEIN SUPPORTED gi|34763431|ref|ZP_00144379.1| PROBABLE SIGMA(54) MODULATION PROTEIN; SSU ribosomal protein S30P
+ Term 12022 - 12069 11.5
+ Prom 12148 - 12207 15.9
14 5 Op 1 . + CDS 12273 - 12680 759 ## gi|291460986|ref|ZP_06026273.2| putative general stress protein
15 5 Op 2 .
+ CDS 12700 - 13191 423 ## gi|294783208|ref|ZP_06748532.1| conserved hypothetical protein 16 5 Op 3 . + CDS 13214 - 13465 386 ## COG2261 Predicted membrane protein + Term 13475 - 13528 11.4 + Prom 13559 - 13618 10.2 17 6 Tu 1 . + CDS 13652 - 15166 1868 ## COG1288 Predicted membrane protein + Term 15319 - 15364 0.1 18 7 Op 1 . - CDS 15295 - 15696 417 ## FN0351 hypothetical protein 19 7 Op 2 . - CDS 15790 - 16047 314 ## FN0350 hypothetical protein - Prom 16108 - 16167 11.8 Predicted protein(s) >gi|224461312|gb|ACDC01000090.1| GENE 1 63 - 1997 2152 644 aa, chain - ## HITS:1 COG:FN0799 KEGG:ns NR:ns ## COG: FN0799 COG1523 # Protein_GI_number: 19704134 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Fusobacterium nucleatum # 1 644 1 645 645 1094 81.0 0 MYYNYNQYVNLGAFLDKNACTFAIYAKNVSSLILNIFHSSEDVIPYIQYKLSPVEHKLGD IWSISLDNIQEGTLYTWEINGFSVLDPYALAYTGNENVKNKKSIVVKRVGTETKHILIPK KDMLIYESHIGLFTKSSNSQTSTKGTYSAFEEKIDYLKELGINVVEFLPVFEWDDRTGNL NREVGLLKNVWGYNPINFFALTKKYSSSTDINSFDEIKEFKELVSKLHQNGMEVILDVVY NHTAEGGTGGEKYNFKIMAEDVFYTKDREGNFTNYSGCGNTLNCNHKVVKDMIIQSLLYW YLEVGVDGFRFDLAPILGRDADSQWTRYSLLYELVEHPILAHAKLIAESWDLGGYFVGAM PSGWSEWNGAYRDTVRCFIRGDFGQVPELIKKIFGSVDIFHSNKSGYQASINFICCHDGF TMWDLVSYNVKHNLLNGENNQDGENNNHSYNHGEEGLTENPKIIALRKQQIRNMLLILYI SQGIPMLLMGDEMGRTQLGNNNAYCQDNVTTWVDWDRKKEFEDVFLFTKNMINLRKKYSI FRKESPLTEEEITLHGIELFKPDLTFHSLSIAFQLKDIETNTDFYIALNSYSEQLCFELP KLENKSWYILTDTANPRTFTFEEIKHEGDHYCVLPKSAIILISK >gi|224461312|gb|ACDC01000090.1| GENE 2 2162 - 2944 961 260 aa, chain + ## HITS:1 COG:FN0991 KEGG:ns NR:ns ## COG: FN0991 COG1183 # Protein_GI_number: 19704326 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Fusobacterium nucleatum # 1 259 1 261 261 384 80.0 1e-106 MVKKKYIAPNLITAGNMFLGYLSITESIKGNYTMAILFILLAMVCDGLDGKTARKLDAFS EFGKEFDSFCDAVSFGLAPSMLIYSILMSRVPGSPFVVPVSFLYALCGVMRLVKFNIINV 
ASSEKGDFSGMPIPNAAAMVVSYIMFCEAIYETFGVQLFHINIFIAVSVISASLMVSTIP FRTPDKTFAFIPKKLAVILILALLASMYWTLDYSVFIISYTYVVLNLLAYFYKRFGNAGG DDTSVEEYVEVEEDTNEREG >gi|224461312|gb|ACDC01000090.1| GENE 3 2946 - 4022 861 358 aa, chain + ## HITS:1 COG:FN0992 KEGG:ns NR:ns ## COG: FN0992 COG0859 # Protein_GI_number: 19704327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 358 1 358 358 549 82.0 1e-156 MFSQNDNINILVVRFKRIGDAILSLPLCHSLKLTFPNAKLDFVLYEEASPLFEDHPYIDN VITISKKEQKNPFSYIKKVYKITRKKYDIIIDIMSTPKSELFCMFSRKTPFRIGRYKKKR GIFYNHKMKEKDSLNKVDKFLNQLLPPLEEAGFDVKRDYDFKFFAKPEEKEKYRKKMIEA GVDFSKPIIAFSIYSRVMSKIYPIEKMKILVQHLIDKYSAQIIFFYSADQKDEIQKIHKE LGDNKNIFSSIETPTIKDLVPFFENCDYYIGNEGGARHLAQGVGIPSFAVFNPSAELKEW LPFPSDKNMGISPIDMLEKKGISREEYDKLSFEEKFSLIDVETLIEMSDKLIEKNKRK >gi|224461312|gb|ACDC01000090.1| GENE 4 4026 - 5477 1582 483 aa, chain + ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 483 1 483 483 687 81.0 0 MNTRIISYVISNLFKLMMFLLLFPLAVSVYYQEGLKLSMAYIIPIIILGISSYFLSNKAP ENQSFFSKEGLVIVALSWLLISFFGALPFVISGDIPNMIDAFFESVSGFTTTGATILPEV ESLNKSIIFWRSFTHLVGGMGVLVLVLAILPKGNNQALHIMRAEVPGPTVGKLVAKMSYN SRILYIIYIAMTIIMIILLLAGGMSFYDACIHAFGTAGTGGFSSKNTSIGYYNSAYIDYV ISVGMLVFGLNFNLFYLLLLGNIKQIFKSEEAKYYLLIIFGMTALICVNIYPTYTSISRL IRDVFFTVTSVITTTGYSTVDFNTWPTFSKTLILFLMFSGGCAGSTAGGFKVSRVVILAK KVVREFKKIGHPNKVVNINFEGKTLDKEMLDGIDSYFILYSFTILILLLITSLESDTFLT AVGSVFGTFNNIGPGLDATGPTSNFSIFSPFLKFILSLGMLLGRLEIIPLLILVSPRIYR KRD >gi|224461312|gb|ACDC01000090.1| GENE 5 5506 - 6288 931 260 aa, chain + ## HITS:1 COG:no KEGG:FN0994 NR:ns ## KEGG: FN0994 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 37 259 17 240 241 276 62.0 8e-73 MENNGKPKKIVGRSFRLNFILYSCLILACYALFSVNKFFVKDFERGYSATGTGYVLGIVA 
INFFILLFLLLPYILLKTFPKFYFYDEGFTRGKNGEFIYYEKMDYFFIPGFVKGKEFLEI RYTNNEGEWKAIPGQGYPTNGFDLFQQDFVNLNYPKAMKCLENNEKIEFLFNDPKKKIIA FGRKNYMKKKLEQAMKIIVTRESITFDNEVYEWDKYKIFVNLGNIIVKEQDGTNILSLGP TALIHRPNLLEVIVSTLGKK >gi|224461312|gb|ACDC01000090.1| GENE 6 6304 - 6921 626 205 aa, chain + ## HITS:1 COG:no KEGG:FN0995 NR:ns ## KEGG: FN0995 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 204 11 206 208 126 38.0 5e-28 MSQVRGFEFKKEENKLKMSIAMFIVFLLTTLILYIFGILNWSSIYVSFIALGIPIGCATF IDNMLEKSKEKQTNAEDSWATNPDELVKTKKTRFSKFKGTESKDISKFAVALSIIVSYVS VYISEVFIWTKAVLENYPDNTFSDVFTYLLKNILTEEWSRKYLVMYWIFMTGFIIFIAIG YFWNKRKMAKMQKKDEEQNNNIRKS >gi|224461312|gb|ACDC01000090.1| GENE 7 7164 - 7655 344 163 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 [Vibrio campbellii AND4] # 1 161 1 165 166 137 41 6e-32 INEKIRGKEFRIISFDGEQLGIMSAEQALNLASSQGYDLVEIAPGANPPVCKIMDYSKYK YEQTRKLKEAKKNQKQVVVKEIKVTARIDSHDLETKLNQVNKFLEKENKVKITLVLFGRE KMHANLGVTTLDEIAEKFAETAEVEKKYADKQKHLILSPKKAK >gi|224461312|gb|ACDC01000090.1| GENE 8 7728 - 7934 359 68 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19703669|ref|NP_603231.1| 50S ribosomal protein L35P [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 68 1 68 68 142 100 1e-33 MPKMKTHRGAKKRIKVTGTGKFVIKHSGKSHILTKKDRKRKNHLKKDAVVTETYKRHMQG LLPYGEGR >gi|224461312|gb|ACDC01000090.1| GENE 9 7970 - 8320 567 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739652|ref|ZP_04570133.1| LSU ribosomal protein L20P [Fusobacterium sp. 
2_1_31] # 1 116 1 116 116 223 100 9e-58 MRVKTGIIRRKRHKRVLKAAKGFRGASGDAFKQAKQATRKAMAYSTRDRKVNKRKMRQLW ITRINSAARMNGVSYSVLINGLKKAGIELDRKVLADIALNNAAEFTKLVETAKSAL >gi|224461312|gb|ACDC01000090.1| GENE 10 8730 - 9530 982 266 aa, chain + ## HITS:1 COG:FN0637 KEGG:ns NR:ns ## COG: FN0637 COG2849 # Protein_GI_number: 19703972 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 143 263 26 146 172 75 40.0 1e-13 MFSSLVSYSAEVYETASEDFMNAVKTLVISEAKSSKNLKAYIEKGVEEKNLIFSVKIEKE KMIVKDKTGKLLHEKVLSKNVSNSFLPFEMKYQEIHKKEGAFEYADITYLEDNEKFRIKY ESKIKKTSAKTKNSDFVEVSPKDVEYKTLDLYDKNGKLLTKQEDIGNKTIVTNYLDEGHK LKIIYNFDSNLTTGNIETWIDKTLLSKGKMKDGLPHGEMKIFNEKGKVISIINYKNGIQD GVTKDFNEKGKLIKETLYKNGVEVKR >gi|224461312|gb|ACDC01000090.1| GENE 11 9549 - 10370 874 273 aa, chain + ## HITS:1 COG:no KEGG:FN0458 NR:ns ## KEGG: FN0458 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 265 1 265 275 342 80.0 1e-92 MKKNFLFIFLIFLVNSIFSYSSVKALNGEKLQKAFDKNKEIIVVYRTNIKDGLPKKYIEN IIPKEKFNISNDNRIKNTIRYVQKNKLDILAEIYTPSGDIIVKTEIKLKKEISLNEIEKL VQEIKDNEASNQSDILNNKFSENFEENVKSFISYSYYNDGSLNSKTEYDFERKNIAMITY SEGKISSESIAKYKGSIQDENIDIDFYENLSKTYTKMKVKKVETGQEVRTFYPNGKLKSV GVYKGNILNGDYKEYDKSGKLIKEVKNNGFIEE >gi|224461312|gb|ACDC01000090.1| GENE 12 10383 - 11354 1313 323 aa, chain + ## HITS:1 COG:FN0460 KEGG:ns NR:ns ## COG: FN0460 COG0113 # Protein_GI_number: 19703795 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Fusobacterium nucleatum # 1 320 1 320 322 584 89.0 1e-167 MFVRTRRLRRNALTREMVKNISIETSSLIYPLFICEGENIKSEIESMPEQFRYSLDRLNE ELDDLLKLGINNILLFGIPAHKDEVGSQAYDKEGIVQKAIRHIRKNYSDKFLIVTDVCMC EYTSHGHCGILHHHDVDNDETLEYIAKIALSHAEAGADIIAPSDMMDGRIAKIREILDEN NFKDIPIMAYSVKYSSAYYGPFRDAADSAPSFGDRKTYQMDFRSTNNFYAEVEADTQEGA DFIMVKPAMAYLDVIKSVSEVTHLPIVAYNVSGEYSMVKAAAKNNWIDEKKIVMENIFAI KRAGADIIITYHAKDIAKWLITK >gi|224461312|gb|ACDC01000090.1| GENE 13 11453 - 
11998 849 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34763431|ref|ZP_00144379.1| PROBABLE SIGMA(54) MODULATION PROTEIN; SSU ribosomal protein S30P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 180 1 180 181 331 95 2e-90 MKLSIHGRKITLTDAIKKYAEEKISRVEKFNDSILKIDATLAASKLKTGNAHVTEILAYL SGSTLKATATETDLYASIDKAVDIMENQLKKHKEKRSRAKVQDDTRKKSYSFDYIVEPEE KISDEKKLVRVYLPLKPMEISEAILQLEYLNRVFFAFTNSETGKMAVVYKRKDGDYGVIE E >gi|224461312|gb|ACDC01000090.1| GENE 14 12273 - 12680 759 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291460986|ref|ZP_06026273.2| ## NR: gi|291460986|ref|ZP_06026273.2| putative general stress protein [Fusobacterium periodonticum ATCC 33693] # 1 100 57 156 180 159 88.0 6e-38 MGLINYIHEKRLEKERAERNEKIVGTLKVLAGVGAGFTLGVLFAPKSGKETRKNISDATK KGLNYVGENLANAKNYIEEKTSDIREALAEKYDELTDEIISEKVEEIEEEIEEEVEEVAK KVEEKAKEVKEKAKK >gi|224461312|gb|ACDC01000090.1| GENE 15 12700 - 13191 423 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783208|ref|ZP_06748532.1| ## NR: gi|294783208|ref|ZP_06748532.1| conserved hypothetical protein [Fusobacterium sp. 
1_1_41FAA] # 1 163 1 163 163 190 96.0 2e-47 MVTINLDMILKVLLGISLAIFLILLFIILIKIISIVSKINSLLEKNKEQLENSISQIPNL VKNSEKILENTNANLEKVNILVEDVTDILKASKRNIVNTSSSVSTTLENIKNVSSNVAES SRYIANNFAGKTSGSSNSGGIMSTIDTILDCWDIFKTLLKKKK >gi|224461312|gb|ACDC01000090.1| GENE 16 13214 - 13465 386 83 aa, chain + ## HITS:1 COG:BMEI1501 KEGG:ns NR:ns ## COG: BMEI1501 COG2261 # Protein_GI_number: 17987784 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Brucella melitensis # 1 83 1 82 86 57 50.0 6e-09 MGIIAWLILGAFSGWIASIIMGKNASMGAIANIVTGIIGAFIGGVVFNFFGAQKVTGLNL HSALVSIVGACILLWILSAISKK >gi|224461312|gb|ACDC01000090.1| GENE 17 13652 - 15166 1868 504 aa, chain + ## HITS:1 COG:FN0257 KEGG:ns NR:ns ## COG: FN0257 COG1288 # Protein_GI_number: 19703602 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 503 1 503 503 698 79.0 0 MKRKNWEFPTAYTVLFLILILVTVLTHIIPAGKYNRLSYQENSKEFVIESYGKDDEKLPA TQETLDKLNINIDVEKFTNGTIKKPMAIPNTYTKVSGQAQGVDDLILAPISGLADSIDII IFVLILSGIVGIVNKTGTFSLAMKAISQKTKGKEFLLVVISFIFFAAGGTIFGAWEETIP FYSILIPLFLVNGFDPLVPMATIFLGSAVGCMFSTVNPFSTIIASNAAGISFNEGLKFRF GALVVFSIIALTYLYRYIKKVKENPENSIAFEEKDEINERYLKNYQEETGIKFNWSKKLI LFLFVVQFAIMIWGVASQGWWFQEAAALFFFVSIIIMLVSGLSEKEAVNAFIAGASEVVG VALIIGLARAINIVMENGMISDTLLFYSSNLVSEMGKGLFAVVLLFIFVFLGIFIPSTSG LAVLSMPILAPLADTLGLSRSIVVDAFSWGQGLILFIAPTGLIFVVLQIVGIPYNKWLKF VMPLLIVITILTTIMIYILSVFFR >gi|224461312|gb|ACDC01000090.1| GENE 18 15295 - 15696 417 133 aa, chain - ## HITS:1 COG:no KEGG:FN0351 NR:ns ## KEGG: FN0351 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 133 19 144 144 146 67.0 3e-34 MAVIMCVLTMTKKRKKDVQEWLEQNPKAAKVYIGSTSSNLLSYILTPSSISLIAIDNEKS KTFTMEGTKQVFYLTPGKHTITSSFQKSRPGILSKRVITEYEPTTQEVEVEAEKTYIYSF DKKNEQYTFTEKN >gi|224461312|gb|ACDC01000090.1| GENE 19 15790 - 16047 314 85 aa, chain - ## HITS:1 COG:no KEGG:FN0350 NR:ns ## KEGG: FN0350 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # 
Pathway: not_defined # 1 85 56 139 139 133 78.0 2e-30 MTDKELIVKNGKKEDVYEFSKYHFRARTISSSGDTECTLYAIDENENRTHIDCELIGIGQ FKHLLADLKLTGEQVNKIKTIKKEK Prediction of potential genes in microbial genomes Time: Thu May 19 23:29:09 2011 Seq name: gi|224461311|gb|ACDC01000091.1| Fusobacterium sp. 2_1_31 cont1.91, whole genome shotgun sequence Length of sequence - 16655 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 7, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 70 - 525 701 ## COG1490 D-Tyr-tRNAtyr deacylase - Prom 597 - 656 10.0 + Prom 519 - 578 10.5 2 2 Op 1 1/1.000 + CDS 647 - 2152 2104 ## COG1488 Nicotinic acid phosphoribosyltransferase 3 2 Op 2 1/1.000 + CDS 2145 - 3119 1068 ## COG0688 Phosphatidylserine decarboxylase 4 2 Op 3 1/1.000 + CDS 3100 - 3498 351 ## COG5341 Uncharacterized protein conserved in bacteria 5 2 Op 4 1/1.000 + CDS 3520 - 4614 1126 ## COG0628 Predicted permease 6 2 Op 5 . + CDS 4611 - 5750 1226 ## COG0116 Predicted N6-adenine-specific DNA methylase 7 2 Op 6 . + CDS 5734 - 6393 624 ## FN0343 hypothetical protein 8 2 Op 7 1/1.000 + CDS 6444 - 6947 743 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 9 2 Op 8 . + CDS 6994 - 8322 1971 ## COG2056 Predicted permease + Term 8335 - 8370 -0.8 - Term 8522 - 8553 1.1 10 3 Tu 1 . - CDS 8686 - 9129 777 ## FN0121 hypothetical protein - Prom 9317 - 9376 13.1 - Term 9379 - 9424 5.2 11 4 Op 1 . - CDS 9486 - 11603 2650 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 11629 - 11688 12.9 12 4 Op 2 . - CDS 11696 - 12388 831 ## COG2964 Uncharacterized protein conserved in bacteria - Prom 12440 - 12499 11.4 + Prom 12451 - 12510 12.0 13 5 Op 1 . + CDS 12531 - 12629 91 ## 14 5 Op 2 1/1.000 + CDS 12626 - 14263 2430 ## COG3033 Tryptophanase + Prom 14274 - 14333 5.2 15 6 Tu 1 . 
+ CDS 14376 - 15710 1913 ## COG0733 Na+-dependent transporters of the SNF family + Term 15728 - 15774 11.1 - Term 15907 - 15958 13.2 16 7 Tu 1 . - CDS 15987 - 16454 539 ## FN1264 hypothetical protein - Prom 16489 - 16548 13.2 Predicted protein(s) >gi|224461311|gb|ACDC01000091.1| GENE 1 70 - 525 701 151 aa, chain - ## HITS:1 COG:FN0349 KEGG:ns NR:ns ## COG: FN0349 COG1490 # Protein_GI_number: 19703692 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Fusobacterium nucleatum # 1 151 4 154 154 253 92.0 7e-68 MRTVIQRVKYAKVNVDGKTIGEIDKGLLVLLGITHEDTIKEVKWLANKTKNLRIFEDEEE RMNLSLEDVKGKVLIISQFTLYGNSIKGNRPSFIDAAKPDYAKDLYLKFIEEFKSFGIET QEGEFGADMKVELLNDGPVTIIIDTKDANIK >gi|224461311|gb|ACDC01000091.1| GENE 2 647 - 2152 2104 501 aa, chain + ## HITS:1 COG:FN0348 KEGG:ns NR:ns ## COG: FN0348 COG1488 # Protein_GI_number: 19703691 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 501 1 501 501 902 91.0 0 MNNDIILTEFARVINSDRYQYTESDIFLMENMQNKIAVFDMFFRKTEDGGFAVVSGIQEV IHLIEVLNTTSEEEKRKYFSKVLEEEHLVDFLSKMKFTGDLYAIQDGEIVYPNEPIITIK APLIQAKILETPILNIMNMNLGIATKASMVTRAADPVKVLAFGSRRAHGFDSAVEGNKAA VIGGCFGHSNLITEYKYGIPSNGTMSHSYIQAFGVGAEAEKEAFVTFIKHRRQRKSNSLI LLVDTYDTIHIGIENAIKAFKECGIDDNYEGIYGVRLDSGDLAYQSKKCRKRFDEEGFTK AKITLTNSLDEQLIRSLREQGACVDMYGVGDAIAVSKSYPCFGGVYKIVELDEEPLIKIS GDVIKISNPGFKEVYRIFDKDGYAYADLISLVKNDKDKEKLLNNEDFTIRDEKYDFKSSL IEKDKYTYTKLTKQYIKDGVIDRDLYDELFDIMKSQKHYFDSLAKVSVERKRLENPHSYK VDLSSDLIELKYGLINKIKNV >gi|224461311|gb|ACDC01000091.1| GENE 3 2145 - 3119 1068 324 aa, chain + ## HITS:1 COG:FN0347 KEGG:ns NR:ns ## COG: FN0347 COG0688 # Protein_GI_number: 19703690 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Fusobacterium nucleatum # 25 324 1 300 300 484 88.0 1e-136 MSKKKILIFLLILFFIIMYSKESTMKFEQIKYIERKTGEIKTEKVMGEGALKFLYYNPFG 
KLALNAVVKRKFVSDWYGSKMSKPKSKEKIKGFVEEMGIDMSEYKRSIDEYTSFNDFFYR ELKEGARDIDYDEKAIVSPADGKILAYQNIKEVDKFFVKGSEFTLEEFFNDKDLAKKYED GTFVIIRLAPADYHRFHFPTDGEISEVKKISGDYYSVSTHAIKTNFRIFCENKREYAILK TKNFGDIAMFDVGATMVGGIVQTYKANSFVKKADEKGYFLFGGSTCILVFEKGKVEIDKD ILENTQNKIETRIYMGEKFGNEKN >gi|224461311|gb|ACDC01000091.1| GENE 4 3100 - 3498 351 132 aa, chain + ## HITS:1 COG:FN0346 KEGG:ns NR:ns ## COG: FN0346 COG5341 # Protein_GI_number: 19703689 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 3 132 1 130 130 228 94.0 3e-60 MEMKKTKYFKIGDLVIYGFLIIFFSILTLKIGSFKNVKGAKAEIWVDGELKYVYPLQEEE KNIFVDTNLGGCNVQFKDNMVRVTTSNSPLKIAVKQGFIKSPGEVIIGIPDRLVVKVVGD SEDDSELDFVAR >gi|224461311|gb|ACDC01000091.1| GENE 5 3520 - 4614 1126 364 aa, chain + ## HITS:1 COG:FN0345 KEGG:ns NR:ns ## COG: FN0345 COG0628 # Protein_GI_number: 19703688 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 35 363 3 331 331 390 77.0 1e-108 MNLKNIMKITGIILIFVILQSYFTNPESFSTIIGRWTGYFMTLIMAIFIAILLEPIEKYL KKKSKINDVLAISLSIVFVVLIVIIMSLIVIPEIISSLKVLNDMYPAISEKVLTIGKDVT NYLAEKNIYTVNTEELNDSITNFISNNTSNIKEFVFAFVGGLVNWTLGFTNLIIAFTLAF LILLDKKNLMKTLENLIKIIFGVKNTPYVMNKLKLSKDIFISYISGKIIVSSIVGLCVYI ILLITGTPYAALSAILLGVGNMIPYVGSIFGGIVAFFLILLVAPIKTLILLIAIIISQLV DGFIVGPKIIGNKVGLSTFWVMVSMIIFGNLFGIVGMFLGTPILSIIKLFYVDLLKRAEQ GGKE >gi|224461311|gb|ACDC01000091.1| GENE 6 4611 - 5750 1226 379 aa, chain + ## HITS:1 COG:FN0344 KEGG:ns NR:ns ## COG: FN0344 COG0116 # Protein_GI_number: 19703687 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Fusobacterium nucleatum # 1 379 1 379 379 657 90.0 0 MIFIASATMGLESVVKEECLALGFKNIKVFDGRVEFEGDFKDLVKANIYLRCSDRVFIKM AEFKALSYEELFQNVKAIEWQDFIDENGEFPISWVSSVKSKLYSKSDIQRISKKAIVEKL KEKYKREIFLENGALYSIKIQCHKDIFIVMLDSSGEALTKRGYRAVKRLAPIKETLAAAL VYLSKWKPDEVLLDAMCGTGTIAIEAAMIARNIAPGANRNFAAEKWSVIDEKLWTDIRDE 
AFSSEDLSKELKIYASDIDEKSIEVAKENAEKAGVEEDIIFEVKDFKDIESPAKYGAVIV NPPYGERLMNDEDIEELYRDFGKFCKKNLTKWSYYIITSYEDFEKAFGKVATKNRKLYNG GIKCYYYQYFGDRKNGYRN >gi|224461311|gb|ACDC01000091.1| GENE 7 5734 - 6393 624 219 aa, chain + ## HITS:1 COG:no KEGG:FN0343 NR:ns ## KEGG: FN0343 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 219 6 224 224 232 75.0 6e-60 MDIETKIKNFIDYAREVCLQSLLLADNIKVDLKSQDNLYEVERIDNEVISKYENIYLLLD ETTLLDIYKKDAKVFEKIEEAIKKMAEDNKIKDEYIKSQIKKRKELKGNSGSEVVERFFK YKIKELKKIKGDLIQKINKVLDKEEKLNLDLSNAIQEVEQMEIIEKLQPVRAEFRSLSLQ FDKYQKELEETENKLSKKWYYEIYGTTDKETLLEAYNTK >gi|224461311|gb|ACDC01000091.1| GENE 8 6444 - 6947 743 167 aa, chain + ## HITS:1 COG:FN0342 KEGG:ns NR:ns ## COG: FN0342 COG0652 # Protein_GI_number: 19703685 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium nucleatum # 1 167 1 167 167 280 91.0 1e-75 MSLQAIIKTNKGEINLNLFSDVAPVTVLNFVTLAKSGYYNGLKFHRVIEDFMIQGGDPTG TGAGGPGYQFGDEFKRGVEFTKKGLLAMANAGPNTNGSQFFITHVPTEWLNYKHTIFGEV VSPKDQDVVDSIKQGDTMNEIVVVGDVDKLIEENKEFYTQLKNFLKI >gi|224461311|gb|ACDC01000091.1| GENE 9 6994 - 8322 1971 442 aa, chain + ## HITS:1 COG:FN0341 KEGG:ns NR:ns ## COG: FN0341 COG2056 # Protein_GI_number: 19703684 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 1 442 1 442 442 619 90.0 1e-177 MILLNPVVLSVIVMTVLCLLKLNVLLALMVSALVAGFSAGMPINDIMSTLIGGMGGNSET ALSYILLGTLAVAINSTGVTSIVSKKIASLVDGKKKLLLLLIAFFACFSQNLIPVHIAFI PILIPPLLKLMNSLKLDRRAMACSLTFGLKAPYIALPVGFGAIFQGIIAGEMTNNGMTVA QGDVWKSTWILGLFMIIGLLLAIFVTYNKDREYKDLPLIGIEEVKAEKMEAKHWLALLAA SAAFVIQALSAIEVINISKGALPLGALVALAIMLVFGVIKWKSLDEFINGGVGLMGLIAF IMLVAAGYGSIIRETGAVGELVDSIHGLIGGSKALGISVMLLVGLLITMGIGTSFGTIPV VAAIYIPLCIKLGISIPGSIVVLAAAAALGDAGSPASDSTLGPTSGLNADGQHDHIMDTC VPTFIHYNILLLIGGFIGGMFF >gi|224461311|gb|ACDC01000091.1| GENE 10 8686 - 9129 777 147 aa, 
chain - ## HITS:1 COG:no KEGG:FN0121 NR:ns ## KEGG: FN0121 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 15 147 1 125 130 87 64.0 1e-16 MKRFKGFFLFLVIMLLSLGIASTTYARERDGSSREDRDWGNYGSRGYRSRSDYDSGWGGP KLNYDGKTSGIGRELGGHIVGSAAGAIGGAIGGSAGGVAGGLAGGMASGIAGAYYGGRAG DAWERRTNRYAHDRWNSRGSRGHSWRF >gi|224461311|gb|ACDC01000091.1| GENE 11 9486 - 11603 2650 705 aa, chain - ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 8 705 8 696 696 646 46.0 0 MNNLLDNFGVNCFSEKNLKNRVPDYVFKKFLQIKNGKAELTLEIADTIANAIKMWALEKG ATHYTHWFQPLTELTAEKHESFISINSDGTSMAKFSGKDLMKGESDTSSFPNGGLRSTFE ARGYTAWDISSPMFLKGEEGCKTLYIPTAFVGYNGEALDKKVPLLRSINLIKEQALKIQR LLGDTETENINVTLGVEQEYFLVDKKFFYKRQDLVLSGKTVFGCLPPKGQEMNDHYYGTI KERIESFMAELDNELWKVGVMSKTKHNEVAPNQFEIALMFNTANVSVDQNQITMDMIKKV ANRHNMVALLHEKPFKNVNGSGKHCNWSLSTDKGMNLYDPETLSENNLSFLVYLLAMIEA VDRYAPALRATTATSGNDYRLGGHEAPPAIISIFLGEQLEDILENIENTNFNNNSSSHLD EITIDKNISRIPKDISDRNRTSPMAFTGNKFEFRMPGSSASPATPMFVLNTIVADVLKEY CEYFEKELKNKTVKEVVIALVKDRYNKHKRIIFDGNGYEEKWVEEAKKRGLSNLKNTVEG LPALIEEEVIQLFERNSVLSRSESLSRFHVYVERYNKQCNLEISTGIKIVRNQVYPFVIK YISNLSKSIHRSRKIFPDEDLFQYDIGILKDIILLKNDMLILTDKLEENLEKAIKIQDLY QRAKFYSNEVLPTLENLREKVDKLEEKIATDAWPIPSYYDLLFNL >gi|224461311|gb|ACDC01000091.1| GENE 12 11696 - 12388 831 230 aa, chain - ## HITS:1 COG:FN1942 KEGG:ns NR:ns ## COG: FN1942 COG2964 # Protein_GI_number: 19705247 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 228 1 228 229 318 71.0 5e-87 MKNEILNQYKLLVNFLGKSLGPSFEVVLHEVKGEEVKMIAIANGEVSDRVLEDTVSSETL NILKNKSSHNEESMVNHTVLLKNGKKVRSSSMLIKENQKVVGMLCINFDDSKFHELNCQL LRIIHPDMFVKNYLSDVSYNVLYDDFKKEADEDNEDEDIDAYMKKVYYEVNTKLNFPIGR PTRQEREQTIYALYERGFFNLKDSIDFVSKKLFCSTSTVYRYIALAEKNK >gi|224461311|gb|ACDC01000091.1| GENE 
13 12531 - 12629 91 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIIFSILFSIKECCPLIKKMTTGVLKFLGGLQ >gi|224461311|gb|ACDC01000091.1| GENE 14 12626 - 14263 2430 545 aa, chain + ## HITS:1 COG:FN1943 KEGG:ns NR:ns ## COG: FN1943 COG3033 # Protein_GI_number: 19705248 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Fusobacterium nucleatum # 1 545 1 545 545 1095 96.0 0 MKEYLLNVPVPRSFSYVKRNIPEVTVEQRERALKATHYNEFAFPAGMLTVDMLSDSGTTA MTDQQWAAMFLGDESYGRNKGYYVLLDAMRDCFERGDNQKKIINLVRTDCQDIEKMMNEM YLCEYEGGLFNGGAAQLERPNAFLMPQGRAAESILFEIVRKVLAAREPGKVFTIPSNGHF DTTEGNIKQMGSVPRNLYNKELLYEVPEGGRYEKNPFKGDMDIKKLEKLIEVVGVENIPM IYTTITNNTVCGQAVSMKSIRETSKIAHKYEIPFMLDAARWAENCYFIKMNEEGYRDKSI AEIAKEMFSYCDGFTASLKKDGHANMGGILAFRDKGYFWKKFSEFNEDGSVKTDVGILLK VKQISCYGNDSYGSMSGRDIMALAAGLYECSNFNYLHERVEQCNYLAEGFYKAGVKGVVL PAGGHGVYINMDEFFDGKRGHESFAGEGFSIELIRRYGIRVSELGDYSMEYDLKTPEQQA EVANVVRFAINRSVYSQEHLDYVIAAVKALYEDRESIPNMRIVSGHNLPMRHFHAFLEPY PNEEK >gi|224461311|gb|ACDC01000091.1| GENE 15 14376 - 15710 1913 444 aa, chain + ## HITS:1 COG:FN1944 KEGG:ns NR:ns ## COG: FN1944 COG0733 # Protein_GI_number: 19705249 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 1 444 16 459 459 710 89.0 0 MSEVEKRDGFSTKWGFILACIGSAVGMGNIWRFPVLVSAMGGMTFLIPYFIFVIFIGSTG VIEEFALGRSAGAGPVGAFGMCTEMRGNRSIGEKIGIIPILGSLALAIGYSCVMGWVFKY AWMSIDGSMYAMASNMDVIGSTFGQTASAWGANFWIVVALIVSFIIMSMGIASGIEKANK IMMPVLFILFVLLGIYIVFQPGSSGGYKYIFTVDLKGLADPKVWIFAFGQAFFSLSVAGN GSVIYGSYLSKKEDIPNSAKNVAFFDTLAALLAAFVIIPAMAVGGAELSSGGPGLMFIYL INIMNNMAGGRIIEVIFYLCVLFAGVSSIINLYEAPVAFLQEKFKVKRIAATAIIHILGC AVAICIQGIVSQWMDVVSIYICPLGALLAAVMFFWIGGKKFAEESVNMGANKPIGSWFYP AGKYIYCLLALVALIAGALLGGIG >gi|224461311|gb|ACDC01000091.1| GENE 16 15987 - 16454 539 155 aa, chain - ## HITS:1 COG:no KEGG:FN1264 NR:ns ## KEGG: FN1264 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 35 155 1 120 120 173 72.0 
1e-42 MKKVGSFFTLLTSRGYKKVILIPLAFCLGFFLYSLYSNFTGGKAEKTTYDDGTTRISAQS DLGSVKLPKILDGLNIPIHDELKIRNYDVFLDKDENITSIDIYCKSNKDANEIIDWYKEK LNATDDRAKGVWNGFDMDVSYSEGSKLFSINLKKQ Prediction of potential genes in microbial genomes Time: Thu May 19 23:29:27 2011 Seq name: gi|224461310|gb|ACDC01000092.1| Fusobacterium sp. 2_1_31 cont1.92, whole genome shotgun sequence Length of sequence - 4353 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 - CDS 16 - 744 1090 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 2 1 Op 2 34/0.000 - CDS 773 - 1501 605 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 1 Op 3 . - CDS 1494 - 2204 885 ## COG0765 ABC-type amino acid transport system, permease component - Prom 2224 - 2283 14.2 - Term 2506 - 2546 -0.5 4 2 Op 1 . - CDS 2563 - 3441 938 ## FN0289 hypothetical protein 5 2 Op 2 . - CDS 3413 - 3913 222 ## gi|237739682|ref|ZP_04570163.1| predicted protein 6 2 Op 3 . 
- CDS 3955 - 4353 277 ## gi|237738969|ref|ZP_04569450.1| predicted protein Predicted protein(s) >gi|224461310|gb|ACDC01000092.1| GENE 1 16 - 744 1090 242 aa, chain - ## HITS:1 COG:FN0800 KEGG:ns NR:ns ## COG: FN0800 COG0834 # Protein_GI_number: 19704135 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Fusobacterium nucleatum # 13 242 1 230 230 362 88.0 1e-100 MKKVFKLMLMSLLSIVISVSAFAKNKVVYVGTNAEFAPFEYLEKNKVVGFDVDLLDAISK ETGLEFKVQDMAFDGLLPALQTKKVDMVIAGMSATPERKKAVAFSKPYFKAKQVVITKGV DKSLKSFKDLSGKKVGVMLGFTGDTVVSEIKGVKVERFNASYAAIMALSQNKVDAVVLDS EPAKKYTANNKQFVIASIPAEEEDYAIAVRKNDKELLDKINAALDKIKANGEYDKLLKKY FK >gi|224461310|gb|ACDC01000092.1| GENE 2 773 - 1501 605 242 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 237 49 1e-62 MINITNLHKNFGDLEVLKNISTEIKKGEIISIIGPSGSGKSTFLRCINKLEEPSSGHIYI DGMDLMDKNTDINKVRERVGMVFQHFNLFPNMTVLDNLTLSPIMVKKESKEEAEKYALSL LEKVGLSDKANSYPTQLSGGQKQRIAIARALAMKPEVILFDEPTSALDPEMIKEVLDVMR DLAKEGMTMLIVTHEMGFARNVGNRILFMDKGEIIEDCSPKEFFENPTNERIKDFLNKVL NK >gi|224461310|gb|ACDC01000092.1| GENE 3 1494 - 2204 885 236 aa, chain - ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 1 236 1 236 236 366 94.0 1e-101 MEYLEILKDTFLTDDRYMYIVDGVIFSIGITLFSAILGIVLGLLLAVMKLSHWYPFKRIK FLENFNPLSKIAYIYIDVIRGTPVVVQLMILANLIFVGALRETPILVIGGIAFGLNSGAY VAEIIRAGIEGLDKGQMEAGRALGLSYSQTMRKIIVPQAIKNILPALVSEFITLLKETSI IGFIGGIDLLRSASIITSQTYRGVEPLLAVGFIYLILTSIFTVFMRKVERGLKVSD >gi|224461310|gb|ACDC01000092.1| GENE 4 2563 - 3441 938 292 aa, chain - ## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: 
F.nucleatum # Pathway: not_defined # 143 292 144 308 308 166 65.0 9e-40 MMKVITKKINNTDEYSITRYSISGIILKTIIFLILYIICAYLGIYHLEEISLSKIIRFVI GSFPFFMFFYLEAILGSSKEVLCIKENNLILKKYILFFCYYSKILKVEDIREIYYEKIPC KDYPILFFPIDLLKNIKFRVKENEFEDKIYAFGYKLSEHEIAEITNEIEEHIKVENVEKE NLSEKYNYSLDERYSYILNKILDEEKLFISEKDNNFIINGDSEAIKDLEISKDMNFEEID FYVFYVNYLSKKEYENKKVLVGYNGIDGKEVTMSKFKEDINEIRDSRSTFKN >gi|224461310|gb|ACDC01000092.1| GENE 5 3413 - 3913 222 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739682|ref|ZP_04570163.1| ## NR: gi|237739682|ref|ZP_04570163.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 166 12 177 177 260 100.0 2e-68 MKTVTKREDFIYIVYDEELKRKCDIVLSLHVILWILIPMYILLFRANDKIYNYLWIIFYV FFIISYKLEPYLSKIKIILYEDKIEIKKRKKTRLFLYSEIKKIEYSEKDRGREGTIYFVK IMKNDGSICKQIKGELKSDIIEIFTIIKNSYEEWRIKNDESYNKEN >gi|224461310|gb|ACDC01000092.1| GENE 6 3955 - 4353 277 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738969|ref|ZP_04569450.1| ## NR: gi|237738969|ref|ZP_04569450.1| predicted protein [Fusobacterium sp. 2_1_31] # 2 132 60 190 190 163 85.0 3e-39 VITLYLYNLPFLLIFVVIALIVCSKEILLVDNNELVIEKYFLFYLYERKIIDVENIRSIF FADEYEKIFPLFLPLDIVKNLKIRVKESDIEDKIYTFGVCLNEEKYKEIIGEILKYSEIK GYLQNLINITNF Prediction of potential genes in microbial genomes Time: Thu May 19 23:29:46 2011 Seq name: gi|224461309|gb|ACDC01000093.1| Fusobacterium sp. 2_1_31 cont1.93, whole genome shotgun sequence Length of sequence - 2058 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 156 82 ## gi|262066007|ref|ZP_06025619.1| putative membrane protein 2 1 Op 2 . - CDS 149 - 622 310 ## gi|237738976|ref|ZP_04569457.1| predicted protein 3 1 Op 3 . - CDS 636 - 956 304 ## gi|237739685|ref|ZP_04570166.1| predicted protein - Prom 982 - 1041 1.8 - Term 998 - 1034 3.4 4 2 Tu 1 . 
- CDS 1057 - 1812 906 ## gi|237739686|ref|ZP_04570167.1| LOW QUALITY PROTEIN: hemolysin - Prom 1979 - 2038 2.1 Predicted protein(s) >gi|224461309|gb|ACDC01000093.1| GENE 1 3 - 156 82 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262066007|ref|ZP_06025619.1| ## NR: gi|262066007|ref|ZP_06025619.1| putative membrane protein [Fusobacterium periodonticum ATCC 33693] # 1 51 1 51 184 77 88.0 3e-13 MFENLIKSKVNIKKGTNLLVIEYRKWSFKLPIYLIIFYILHTWLGYKMREI >gi|224461309|gb|ACDC01000093.1| GENE 2 149 - 622 310 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738976|ref|ZP_04569457.1| ## NR: gi|237738976|ref|ZP_04569457.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 157 1 157 157 219 92.0 5e-56 MQTITRIEEDTYIMKNYEFLFKAGYFAVYFMINLLNTFSMIFLNTSYKYGYVILIFSSLL LFLLKKYVFKKLLFIFKKDKLEIQEIWSNKLIKKIILNYEDILDLKITEISAGNGTSYYI KIITPPIEKSYYYDLFKEEAYKVVKIYNLYKNGDTDV >gi|224461309|gb|ACDC01000093.1| GENE 3 636 - 956 304 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739685|ref|ZP_04570166.1| ## NR: gi|237739685|ref|ZP_04570166.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 106 1 106 106 190 100.0 2e-47 MPTNSSKLIKSLDFEKYSTEACKLKTYSWEANASIDYSYVLWHMDMDEKFCKEYAEVQKA NYQPEAKVIKSITSFLIKNHSNLRRTISFGEAMDRLNKMYEDGYME >gi|224461309|gb|ACDC01000093.1| GENE 4 1057 - 1812 906 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739686|ref|ZP_04570167.1| ## NR: gi|237739686|ref|ZP_04570167.1| LOW QUALITY PROTEIN: hemolysin [Fusobacterium sp. 2_1_31] # 1 251 80 330 330 438 100.0 1e-121 MEGVLLDKFKDEHQKEFNLIKEENLSLEDKQKLAQNLIERYLRENGYEGEIPEVLLTDEA HSFTVDSKDKETGAKRREKIYFSINDIANPDLAFSKLFAHEKAHMNTYDEGKDGEETSIH TREKVGSENKNKVFSEEEKADYLNNLRNKYKDQKSIEQQFAEAKLVDEKDKEHWAVILSE NASIAYGLRFNGGSSIAFIVDPKTKKIIIAETLDGGVGIGTPSGGISPSIAYFPNINTVE DLKGAIGTAQY Prediction of potential genes in microbial genomes Time: Thu May 19 23:30:17 2011 Seq name: gi|224461308|gb|ACDC01000094.1| Fusobacterium sp. 
2_1_31 cont1.94, whole genome shotgun sequence Length of sequence - 12943 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 32 - 190 76 ## - Prom 224 - 283 13.7 + Prom 179 - 238 8.3 2 2 Op 1 . + CDS 326 - 1636 1275 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family 3 2 Op 2 . + CDS 1677 - 2243 653 ## gi|237739689|ref|ZP_04570170.1| predicted protein 4 2 Op 3 . + CDS 2253 - 2834 571 ## gi|294783182|ref|ZP_06748506.1| conserved hypothetical protein + Term 2975 - 3021 6.6 - Term 2902 - 2968 2.3 5 3 Op 1 5/0.000 - CDS 3079 - 3612 725 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 6 3 Op 2 . - CDS 3624 - 4691 1656 ## COG2252 Permeases - Prom 4764 - 4823 11.5 + Prom 4447 - 4506 4.8 7 4 Op 1 1/0.000 + CDS 4734 - 6287 1985 ## COG1492 Cobyric acid synthase + Prom 6313 - 6372 9.7 8 4 Op 2 . + CDS 6400 - 7758 687 ## PROTEIN SUPPORTED gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 + Prom 7761 - 7820 2.5 9 4 Op 3 . + CDS 7841 - 8371 968 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Term 8188 - 8233 -0.9 10 5 Tu 1 . - CDS 8390 - 9103 477 ## COG3619 Predicted membrane protein - Prom 9218 - 9277 11.5 + Prom 9069 - 9128 14.6 11 6 Op 1 . + CDS 9349 - 9939 716 ## FN2083 hypothetical protein + Prom 10012 - 10071 10.2 12 6 Op 2 . + CDS 10099 - 11733 2520 ## COG2759 Formyltetrahydrofolate synthetase + Term 11769 - 11809 8.3 - Term 11821 - 11859 7.2 13 7 Tu 1 . 
- CDS 11870 - 12829 1704 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family - Prom 12883 - 12942 7.9 Predicted protein(s) >gi|224461308|gb|ACDC01000094.1| GENE 1 32 - 190 76 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFYIRRTFIIKRVLFFFLLIYSLSFSVDEIDITSIKLERNTVFEDFQIETIL >gi|224461308|gb|ACDC01000094.1| GENE 2 326 - 1636 1275 436 aa, chain + ## HITS:1 COG:FN0185 KEGG:ns NR:ns ## COG: FN0185 COG3593 # Protein_GI_number: 19703530 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Fusobacterium nucleatum # 45 436 1 392 400 611 81.0 1e-174 MLLGGYMELLKVQIKNWQTFSNISLECKEFLVFIGESSTGKSSFMKALLYFFQARNLHKG DIKNPELPLEIIGTLKGEKGHVFQLRILNNPYQDTRYFIKNHISKHEKDNRNWEEIDEKE YKKHIFGVSVFYVPSYMKISHLNFLVEKLFKNENLRRYHKYYRRFKNSMNKKMSFGFYRH LFIELLNEIIEKEKNHNFWNNTILLWEEPEFYLNPQQERACYEALSESTKLGLMSVVSTN SSRFIEIENYQSLCIFKRVKEEIEIYQYSGNLFSGDEVTVFNMNYWINPDRSELFFAKKV ILVEGQTDKIVLSYLAKYLGVFKYEYSIIECGSKSSIPQFIRLLNAFHIPYVAVYDKDNH YWRNETELMNSTLKNKTIQKLISKNLGTWIEFENDIEEEIYNESRDKKNYKNKPFYALET VIKSGYVLPEKLKEKL >gi|224461308|gb|ACDC01000094.1| GENE 3 1677 - 2243 653 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739689|ref|ZP_04570170.1| ## NR: gi|237739689|ref|ZP_04570170.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 188 1 188 188 325 100.0 8e-88 MAVKVKLEKDGFKKDAYVGFSWTTLFLGIWVPSFRLDLKGFLVFLGIMLFQTTTILFVII NALKTGEFYILTIFIFSYIGINYIISFLLAIYYNKIYTKNMLLDGWKPMTNDEYSLAILG SYGYIDYEIDAQDEEKIARCKGYIRETKREERRKWLIFLIPMFMTVLSIVVSIIGLIALI KVLSKVGY >gi|224461308|gb|ACDC01000094.1| GENE 4 2253 - 2834 571 193 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783182|ref|ZP_06748506.1| ## NR: gi|294783182|ref|ZP_06748506.1| conserved hypothetical protein [Fusobacterium sp. 
1_1_41FAA] # 1 192 1 192 192 310 96.0 2e-83 MAIKVRLEKEGQLKNAFVGFSWTTFFFGFWVPLFRGKLKDFAYFFMFFLCKIIIFAVLAK EIFDIVYIGIEESKFEISYYIIVPFILMTALYPIDIFLAYTYNKYSTTNMFKEGFYLIEN DEYSAAILKDYTYLPYTEEEFADEELLKRYEQHVKKARKSEKNKCVVAIIIMASYQVFLG VVSSVPTIFSFFR >gi|224461308|gb|ACDC01000094.1| GENE 5 3079 - 3612 725 177 aa, chain - ## HITS:1 COG:FN2073 KEGG:ns NR:ns ## COG: FN2073 COG0503 # Protein_GI_number: 19705363 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 1 177 1 177 177 265 85.0 5e-71 MKTYTLNIAGLKRELPIIKLSYDLSIASFVILGDTEIVRKTAPMIAKKLPDVDFIITAEA KGIPLAYEISRILNLNEYIVARKSIKAYMEAPIEVEVDSITTNGSQKLYLNSIDAQKIKG KRVALVDDVISTGQSLKALETLVEKAGANIVAKAAILAEGEAKDRKDIIFLEALPLF >gi|224461308|gb|ACDC01000094.1| GENE 6 3624 - 4691 1656 355 aa, chain - ## HITS:1 COG:FN2072 KEGG:ns NR:ns ## COG: FN2072 COG2252 # Protein_GI_number: 19705362 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 1 355 1 355 355 485 93.0 1e-137 MTLSDVLAALGVVLNGIPQALLAATYGFASVPTAFGFVVGAVACLLYGSVIPISFQAETI ALAGMLGKDIRERLSIILFSGITMVILGLTGTLSIIVDFAGSTIINAMMAGVGIMLARIA LGGLKESRIVTASSIASAFITYFFFGQNLVYTIVVCVIFSSLVANIFKIDFGGGIVENYK KIEIKKPILNFNVIRGSLALACLTIGANIAFGNITASMTGKYEANIDHLTIYSGLADAVS SLFGGGPVEAIISATAAAPNPLNSGVLMMVIMAVILFFGLLPKISKYIPGHSVHGFLFIL GAIVTVPTNASLAFSGGTPQDYVVAATAMTVTAANDPFIGLLVALVVKYIFVFIG >gi|224461308|gb|ACDC01000094.1| GENE 7 4734 - 6287 1985 517 aa, chain + ## HITS:1 COG:FN2070 KEGG:ns NR:ns ## COG: FN2070 COG1492 # Protein_GI_number: 19705360 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Fusobacterium nucleatum # 25 514 1 490 491 804 88.0 0 MGEMINCQENLCYNKNKKISFGGYMKKHKNIMIQGTGSSVGKTLMVAGLCRIFAQDGYRT TPFKSQNMALNSFVDIEGLEMSRGTVIQAEAAYEIPRAFMNPILLKPNSDNNSQVIINGK VAYTVDAKTYFSNSKDLKKIALDSYKNNIEANFDIGVLEGGGSPAEINLREYDLVNMGMA 
ELVDSPVILVGNIDIGGVFASIYGTVMLLDEQDRKRIKGYIINKFRGDSDLLKPAIEILD KKFKDEGLDIKFLGVLPYADLRIEEEDSLSDEDKRVYSDNKEYINISVIKTKKMSNFTDF HAFKQYDDVRLKYVYDAKDLGNEDIIIFPGSKNTITDLEDLKERGIFEKVKELKEKGKII VGICGGLQMLGKKIYDPKHLESDILETEGFNFFDYETTFDEIKKTEQVTKKLELTEGILK DFNNYEVKGYEIHQGISTFDSPVICKNRVFATYIHGIFDNSKFTNDFLNIVRREKNMPEQ KEIFSFNEFKEKEYDKLAELLRKNLDMKKIYEILERK >gi|224461308|gb|ACDC01000094.1| GENE 8 6400 - 7758 687 452 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 [Haemophilus influenzae 3655] # 3 452 2 445 456 269 33 9e-72 MESIYKIVDAVNGLLWGKNILVFMLIGAALYFSFKTKFMQFRLFHKIVKVLFKNEKGKKG DISSLETFFLGTACRVGAGNIAGVVAAISVGGPGAIFWMWLVAMLGSATAFIESSLAVIY RKKEKDGSYTGGTPFIIEKRLNMRWLGIIYALASVVCYFGVTQVMSNSITSSITSVYTWG AGNKFLNLQNISSIAVAIMVAYVIFFSSSKKDSIIESLNKIVPFMAIIYVVAVIYILVTN LTNIPSMIGTIFSQAFGGKEIFGGTFGAVVMNGVRRGLFSNEAGSGNSNYAAAAVHIDNP SKQGMVQAFGVFIDTLVICSATAFIVLLVPESTIAGLSGMGLFQAAMTYHLSSIGAPFVV ILMFFFCVSTILAVAFYGRSAVNFIHESKYLNIGYQAVLILMIYIGGIKQDIFIWSLADF GLGIMTVINILVIIPIAKPALDALKNYEKELK >gi|224461308|gb|ACDC01000094.1| GENE 9 7841 - 8371 968 176 aa, chain + ## HITS:1 COG:FN2067 KEGG:ns NR:ns ## COG: FN2067 COG0526 # Protein_GI_number: 19705357 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 14 176 1 163 164 257 79.0 6e-69 MKRKLIMVLMFALMSFSLFAAKSNKNEDVKVPNIVLQDQYGKKHNLADYKGKVVVINFWA TWCGYCVREMPDFEKVYKEFGSNSKDVIIIGIAGPKSKLNANNVDVSKEEITAFLKKKNI TYPTLMDETGKTFDDYGVRAFPTTYVINKKGFLEGYVSGAITADQLKKAINETLKK >gi|224461308|gb|ACDC01000094.1| GENE 10 8390 - 9103 477 237 aa, chain - ## HITS:1 COG:FN2084 KEGG:ns NR:ns ## COG: FN2084 COG3619 # Protein_GI_number: 19705374 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 236 1 236 239 360 85.0 1e-99 MNKKKLHKFFNNKEEFAPNERLWLFCMLMLVAGFFGGFTFSLRGRVFVNAQTGNLVLLSL 
GFATWDTALIKNALATFLAYFCGIITAELISKKINKTSFLIWERILLIFSIIVTICLGFI PEAAPYEFTNFPIAFTAAMQFNTFEKAHGMGMATPFCTNHVKQASANFVRFLRTRDDNKL RISLSHLSMILSFIIGATLSIFLGRFLFGKVIWLSTIFLIITFYFFSKSIKEYKKKL >gi|224461308|gb|ACDC01000094.1| GENE 11 9349 - 9939 716 196 aa, chain + ## HITS:1 COG:no KEGG:FN2083 NR:ns ## KEGG: FN2083 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 196 1 196 196 309 84.0 3e-83 METKINNIDLFKVKDNENTYYGFSQEWYKDEWQRRAGCGATVASSIINYYNQRDNFKEVG ISDALKIMEELWNYLLPTEQGLNSIKLFYDGIKSYYDDKEVTIDYINVDIKNKVSLEEII KFICKELTEDKPLAFLNLCNGEENNLDKWHWVVVVEMFEENGEHFLNIIDDKEIIKINLS LWYRTIKNDGGFITFK >gi|224461308|gb|ACDC01000094.1| GENE 12 10099 - 11733 2520 544 aa, chain + ## HITS:1 COG:FN2082 KEGG:ns NR:ns ## COG: FN2082 COG2759 # Protein_GI_number: 19705372 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Fusobacterium nucleatum # 1 544 1 544 544 1002 96.0 0 MTDIQIAQAAKKENIVEIAKKIGLTEDDIEQYGKYKAKVNLDVLQKINRPNGKLILVTAI TPTPAGEGKSTVTIGLTQALNKIGKLSAAAIREPSLGPVFGMKGGAAGGGYAQVVPMEDI NLHFTGDMHAIGIAHNLISACIDNHINSGNALGIDITKITWKRVVDMNDRALRNIVIGLG GKANGYPRQDSFQITVGSEIMAILCLSNSITELKEKIKNIVFGTSLEGKLLRVGDLHIEG AVAALLKDAIKPNLVQTLENTPVFIHGGPFANIAHGCNSILATKMALKLTDYVVTEAGFA ADLGAEKFIDIKCRLGGLKPDCAVIVATVRALEHHGKGDLKAGLENLDKHIDNIKNKYKL PLVVAINKFVTDTDEQIDMIEKFCNERGAEVSLCEVWAKGGEGGIDLAEKVLKAIDNNKV EFDYFYDINLTIKEKIEKICKEIYGADGVIFAPATKKVFDTIAAEGLENLPVCMSKTQKS ISDNPALLGKPSGFKVTINDLRLAVGAGFVIAMAGDIIDMPGLPKKPSAEVIDIDENGVI SGLF >gi|224461308|gb|ACDC01000094.1| GENE 13 11870 - 12829 1704 319 aa, chain - ## HITS:1 COG:FN0662 KEGG:ns NR:ns ## COG: FN0662 COG0010 # Protein_GI_number: 19703997 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Fusobacterium nucleatum # 4 318 3 317 318 591 89.0 1e-169 MEYWSGRVDGNDSDILRIHQVIQVKTLDELMQDEYNGKKVCFVSYNSNEGIRRNNGRLGA 
ADGWKHLKSALSNFPIFDTDIKFYDLKDPIDVVDGKLEEAQMKLADVVAKLKSKDYFVVC MGGGHDIAYGTYNGILSYAKTKTKDPKIGIISFDAHFDMREYGKGANSGTMFYQIADDCQ KNNIKFDYTVIGIQRFSNTKRLFERAQKFGVTYYLAEDILKLSDLNITPILERNDYIHLT ICTDVFHITCAPGVSAPQTFGIWPNQAIGLLNYIAKTKKNLTLEVAEISPRYDYDDRTSR LVANLIYQAILTHFGCEIK Prediction of potential genes in microbial genomes Time: Thu May 19 23:30:43 2011 Seq name: gi|224461307|gb|ACDC01000095.1| Fusobacterium sp. 2_1_31 cont1.95, whole genome shotgun sequence Length of sequence - 3446 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 181 240 ## FN2112 hypothetical protein - Prom 241 - 300 2.3 - Term 255 - 290 -1.0 2 2 Tu 1 . - CDS 311 - 979 784 ## gi|237739700|ref|ZP_04570181.1| predicted protein - Prom 1050 - 1109 6.1 3 3 Tu 1 . - CDS 1456 - 1989 627 ## gi|237739701|ref|ZP_04570182.1| predicted protein 4 4 Op 1 . - CDS 2634 - 3170 708 ## FN2112 hypothetical protein 5 4 Op 2 . - CDS 3201 - 3350 238 ## gi|237739703|ref|ZP_04570184.1| predicted protein - Prom 3378 - 3437 7.1 Predicted protein(s) >gi|224461307|gb|ACDC01000095.1| GENE 1 1 - 181 240 60 aa, chain - ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 60 1 53 164 72 66.0 4e-12 MKSFLALIILIFLTACTNTRYYYYPENYKNSDISISASLVEFGKEDSALDYIWVLDLRDH >gi|224461307|gb|ACDC01000095.1| GENE 2 311 - 979 784 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739700|ref|ZP_04570181.1| ## NR: gi|237739700|ref|ZP_04570181.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 222 3 224 224 400 100.0 1e-110 MIKNSFKFIILTILVIIANACSSNSNSFWGFKPHFSTGTYIHSYAIIEDGKVNRMGIPKK DIDKMDSIINDKYGIQFIDDNRIYALKGGGENYKIKFYNDFKMTVNGKEYIMPKEKIRYS AYDYDLELPIKITNTNYNEYILDIGEIEIIDTDGKIIRPRTKIPPILFKKTIYRTFVNDI TGSDYDVYYRGWAEDYPKDPSTLKKMYNNLEKKFGKLKNIKK >gi|224461307|gb|ACDC01000095.1| GENE 3 1456 - 1989 627 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739701|ref|ZP_04570182.1| ## NR: gi|237739701|ref|ZP_04570182.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 177 1 177 177 297 100.0 2e-79 MKKLIFLFIMILTLTGCKTVEISTSYGYKVTNQKEKVIFDRVQIDGNIINNLKGEKEPLE SISIISKDKNNGIKETPKKIKIISNNKEYLVSVDFKYNTIYPVYNKGIIIDSDSFILEIG NIKFKDGTTLYLHPLLFKKYVYAYKINKILDTLNQDTREDLFSGTIDEYREWKKKNK >gi|224461307|gb|ACDC01000095.1| GENE 4 2634 - 3170 708 178 aa, chain - ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 172 2 162 164 76 37.0 4e-13 MAKKMIVIIFSVLIFNACSMIDKQYKYYPIIENNKIHIVGYLNNQYDENSPLSVLKIEDK KNGTDVNHKIKLLDSTIKIVKGGKEYIIQYSKLEDYDDVYIYIRVLKNGVNITDDEFVVY LGKIELDTGEIIKLPPLRFKKYVYITKGSILNTINPNGKFDQYYNTVEEYKKNGWKEE >gi|224461307|gb|ACDC01000095.1| GENE 5 3201 - 3350 238 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739703|ref|ZP_04570184.1| ## NR: gi|237739703|ref|ZP_04570184.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 49 1 49 49 74 100.0 2e-12 MNTIVGEKTNLIGSVIGGGNTTLRTAKLEYSDIHDKDKGYNFGINGNVS Prediction of potential genes in microbial genomes Time: Thu May 19 23:31:13 2011 Seq name: gi|224461306|gb|ACDC01000096.1| Fusobacterium sp. 2_1_31 cont1.96, whole genome shotgun sequence Length of sequence - 7646 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 6 - 419 503 ## FN1276 hypothetical protein 2 1 Op 2 27/0.000 - CDS 419 - 3481 4145 ## COG0841 Cation/multidrug efflux pump 3 1 Op 3 13/0.000 - CDS 3484 - 4584 1673 ## COG0845 Membrane-fusion protein - Prom 4613 - 4672 7.5 4 1 Op 4 . - CDS 4737 - 6014 1722 ## COG1538 Outer membrane protein 5 1 Op 5 . - CDS 6033 - 6671 704 ## FN1272 TetR family transcriptional regulator - Prom 6824 - 6883 14.8 + Prom 6813 - 6872 16.1 6 2 Tu 1 . + CDS 7004 - 7594 858 ## COG2815 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|224461306|gb|ACDC01000096.1| GENE 1 6 - 419 503 137 aa, chain - ## HITS:1 COG:no KEGG:FN1276 NR:ns ## KEGG: FN1276 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 137 1 131 131 216 87.0 2e-55 MNEIRNMNDTDSENLKLIILHINDTQVRLLEKFFEKIGIYYYTVEDNVKRAIDKSIKHQQ TKVWPGSDALVTLPLGDKKIDEFLIKLKTFRMVLPKGLFLSVGILPFERVIRSMYEEDIP VDEELMEELQNDKDYNI >gi|224461306|gb|ACDC01000096.1| GENE 2 419 - 3481 4145 1020 aa, chain - ## HITS:1 COG:FN1275 KEGG:ns NR:ns ## COG: FN1275 COG0841 # Protein_GI_number: 19704610 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 1020 1 1020 1020 1680 91.0 0 MSLAGISIRRPVATTMVMLSFIFIGLLAMFSMKKELIPNINIPVVTISTTWSGAVSEDVE AQVTKKIKDSLSNVEAIDKIQTVSAYSSSTVVVNFEYGVDTDEKVTQIQREVSKITNNLP SDANTPLVRKVEAGSGNMTAVIAFNADSKTALTTFIKEQLKPRLESLPGIGQVDIFGNPD KQLQIQVDSDKLASYNLSPMELYNIVRTSVATYPIGKLSTGNKDMIIRFMGDLDYIDQYK NILISSNGNTLRLKDVADVVLTTEDATNIGYLNGKESVVVLLQKSSDGDTITLNNAAFKV IEEMRPYMPAGTEYSIEMDSSENINNSISNVSSSAVQGLVLATIILFVFLKSFRTTILIS LALPVAIVFTFAFLSMRGTTLNLISLMGLSIGVGMLTDNSVVVVDNIYRHITELNSPVRE AAENGTEEVTFSVIASALTTIVVFLPVLFVPGLAREFFRDMSYAIIFSNLAAIIVAITLI PMLASRFLNRKSMKSEDGKLFKKVKAFYLKVINSAVSHKGLTVLIMVGLFFFSILVGPKL LKFEFMPKQDQGKYSLTAELQKGTDLAKAERIAKELEEIVKNDPHTESYLMLVSTSSISI NANVGKKNTRKDSVFTIMDDIRKKASNVLDARVSMTNQFSGGQTQKDVEFLLQGSNQDEI KQFGKQLLEKLQKYDGMVDISSTLDPGIIELRLNIDRDKIASYGISPAVIAQTVSYYMLG 
GDKANTATLKTDSEEIDVLVRLPKEKRNDINTLSSLNIKVGDNKFVKLSDVATLQYAEGT SEIRKKNGIYTVTISGNDGGVGLGKIQSKIIEEFNNLEPPSTISYSWGGQSEKMQKTMSQ LSFALSISIFLIYALLASQFESFILPFIIIGSIPLALIGVIWGLVVLRQPIDIMVMIGVI LLAGVVVNNAIVLIDFIKTMRTRGYDKEYAIIYSCETRLRPILMTTMTTVFGMIPMALGL GEGSEFYRGMAITVIFGLSFSTILTLVLIPILYSVVDSFTVKVAAKLKGVFGGLKKKGAK >gi|224461306|gb|ACDC01000096.1| GENE 3 3484 - 4584 1673 366 aa, chain - ## HITS:1 COG:FN1274 KEGG:ns NR:ns ## COG: FN1274 COG0845 # Protein_GI_number: 19704609 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 1 365 1 368 370 550 87.0 1e-156 MKKLLTILLATSLLVVACGKDKEAKDAKNEAAVTETQTVAKPVEVSAVTTRQMSKLFESS AVWEPLSKVDFSTNKGATVEKIYKRNGEYVNKGEIIVKLSDAQTEADFLQAKANYQSATA NYNIARNNYQKFKTLYDKQLISYLEFSNYEATFTSAQGNLEVAKAAYMNAQNSYSKLVAK ADISGIVGNLFIKEGNDIAAKETLFTILNDKQMQSYVGITPEAISKVKLGDEIDVKIDAL GKEYKAKITELNPIADSTTKNFKVKLTLDNSDGEIKDGMFGNVVIPVGESSVLSVEDEAI VTRDLVNYVFKYEDGKAKQVEVTVGATNLPYTEISSPELKEGDKIIVKGLFGLQNNDTVE IKNEVK >gi|224461306|gb|ACDC01000096.1| GENE 4 4737 - 6014 1722 425 aa, chain - ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 11 425 1 413 413 580 78.0 1e-165 MKKLLTFFVLLANVALARDLTLDQAIDLSLNNSKEMKISEKSLEISKLNVSKAFKEALPS VTYSGAVTLGEHERNILTQSGGNYVSKKKGYTQTLKVTQPLFTGGAISAGIKGAKAYENI ASYSYLQSKIQNRLETIKIYSDIINAERNLTALKNSEEILLKRHYKQEEQLKLRLITKPD ILQTEYSLEDIRAQIINLQNLADTNKEKLYIRTGINKSEPLNLVSFDIPNNLSDSLNLNT DLNQALNQSLSAKIADEQVNVASASRMAAAGDLLPQVSAYVSYGTGGQERASFSRSYRDA EWVGGVQVSWKVFSFGKDLDNYKVAKLEEEQQVLKNTSAKENIEINVKSAYLNVVSLEKQ VAAQRKAVEAAKANFEMNQEKYDAGLISTIDYLDFENTYRQARIAYNKVLLDYYYAFETY RSLLI >gi|224461306|gb|ACDC01000096.1| GENE 5 6033 - 6671 704 212 aa, chain - ## HITS:1 COG:no KEGG:FN1272 NR:ns ## KEGG: FN1272 # Name: not_defined # Def: TetR family transcriptional 
regulator # Organism: F.nucleatum # Pathway: not_defined # 1 211 1 211 211 225 72.0 8e-58 MNFDNDKKLLILEKAKDMIITEGYSNLSINKLTSELGISKGSFYTYFPSKDNMLTEILDE YSENAKVFSENLASNSNSIDECLNYYVNSMLNLNDRNLKLELVMTSLKRNYEVFNEENFI KLKNTARKTIDFIKSILKKYKKSINIKEKDIEKCSKIIFSITEVFLMAENINFETNKFSS KTLDEVKDLYRSQEMKENLEFIKESIKKILYR >gi|224461306|gb|ACDC01000096.1| GENE 6 7004 - 7594 858 196 aa, chain + ## HITS:1 COG:FN0678 KEGG:ns NR:ns ## COG: FN0678 COG2815 # Protein_GI_number: 19704013 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 196 1 200 200 286 77.0 1e-77 MKKFRNDNDEDDFEDTEVKATSAKQPEKDNRRLIKIILNIILIIAIIKVGLGVFERYYFN EFYYKAPNLTGLSIEEAKKTISKSPLNIREMGEVYSDLPYGTVALQEPAEGTIVKRSRNM KVWVSKESPSVFLDDLVGMNYIEASSLLNKNGMKVGEVKKMRSDLPINQIIATSPKSGEP ISRGQKFDFLISNGLE Prediction of potential genes in microbial genomes Time: Thu May 19 23:31:34 2011 Seq name: gi|224461305|gb|ACDC01000097.1| Fusobacterium sp. 2_1_31 cont1.97, whole genome shotgun sequence Length of sequence - 53546 bp Number of predicted genes - 58, with homology - 57 Number of transcription units - 15, operones - 11 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 10 - 69 3.0 1 1 Op 1 10/0.000 + CDS 96 - 899 670 ## COG1162 Predicted GTPases 2 1 Op 2 1/0.000 + CDS 892 - 1539 831 ## COG0036 Pentose-5-phosphate-3-epimerase 3 1 Op 3 1/0.000 + CDS 1554 - 2231 952 ## COG1846 Transcriptional regulators 4 1 Op 4 . + CDS 2233 - 3858 1764 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP 5 2 Tu 1 . - CDS 4009 - 4404 551 ## COG0824 Predicted thioesterase - Prom 4464 - 4523 9.9 + Prom 4788 - 4847 13.9 6 3 Tu 1 . + CDS 4929 - 5792 604 ## PROTEIN SUPPORTED gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily + Term 5847 - 5902 5.6 - Term 5829 - 5898 18.4 7 4 Op 1 . - CDS 5908 - 6087 343 ## FN1884 hypothetical protein - Prom 6137 - 6196 12.6 8 4 Op 2 . 
- CDS 6314 - 6727 528 ## gi|237739717|ref|ZP_04570198.1| predicted protein - Prom 6818 - 6877 15.4 - Term 6865 - 6916 5.4 9 5 Op 1 . - CDS 6920 - 7096 381 ## gi|237739718|ref|ZP_04570199.1| conserved hypothetical protein 10 5 Op 2 . - CDS 7153 - 7242 62 ## - Prom 7335 - 7394 12.1 + Prom 7376 - 7435 7.4 11 6 Op 1 . + CDS 7462 - 8961 2105 ## COG1288 Predicted membrane protein 12 6 Op 2 . + CDS 8995 - 9423 475 ## gi|237739720|ref|ZP_04570201.1| predicted protein + Term 9440 - 9479 4.6 + Prom 9466 - 9525 8.4 13 7 Op 1 1/0.000 + CDS 9562 - 10287 1153 ## COG2849 Uncharacterized protein conserved in bacteria + Term 10299 - 10340 5.1 + Prom 10334 - 10393 6.2 14 7 Op 2 1/0.000 + CDS 10475 - 11200 1078 ## COG2849 Uncharacterized protein conserved in bacteria 15 7 Op 3 1/0.000 + CDS 11224 - 12045 991 ## COG2849 Uncharacterized protein conserved in bacteria + Term 12063 - 12103 4.2 + Prom 12091 - 12150 9.5 16 8 Op 1 1/0.000 + CDS 12185 - 12682 817 ## COG2849 Uncharacterized protein conserved in bacteria 17 8 Op 2 1/0.000 + CDS 12705 - 13199 781 ## COG2849 Uncharacterized protein conserved in bacteria + Term 13215 - 13251 3.4 + Prom 13244 - 13303 13.9 18 8 Op 3 . + CDS 13330 - 14046 1064 ## COG2849 Uncharacterized protein conserved in bacteria + Term 14055 - 14095 5.2 + Prom 14049 - 14108 10.1 19 9 Tu 1 . + CDS 14255 - 15052 1178 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis + Term 15055 - 15104 11.5 - Term 15043 - 15092 10.2 20 10 Op 1 17/0.000 - CDS 15099 - 16745 1619 ## COG1178 ABC-type Fe3+ transport system, permease component 21 10 Op 2 7/0.000 - CDS 16735 - 17847 1597 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 22 10 Op 3 1/0.000 - CDS 17862 - 18893 1785 ## COG1840 ABC-type Fe3+ transport system, periplasmic component - Prom 18968 - 19027 8.1 23 10 Op 4 . - CDS 19036 - 21594 2688 ## COG0608 Single-stranded DNA-specific exonuclease - Prom 21627 - 21686 13.0 24 11 Op 1 . 
- CDS 21694 - 22458 969 ## FN0371 hypothetical protein 25 11 Op 2 . - CDS 22489 - 23232 823 ## FN0371 hypothetical protein 26 11 Op 3 . - CDS 23252 - 24016 766 ## FN0371 hypothetical protein - Prom 24137 - 24196 15.7 + Prom 24072 - 24131 8.1 27 12 Op 1 1/0.000 + CDS 24207 - 25232 1187 ## COG0681 Signal peptidase I + Prom 25319 - 25378 7.4 28 12 Op 2 1/0.000 + CDS 25398 - 25829 198 ## PROTEIN SUPPORTED gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 29 12 Op 3 1/0.000 + CDS 25822 - 27255 1982 ## COG0015 Adenylosuccinate lyase 30 12 Op 4 1/0.000 + CDS 27292 - 27807 257 ## COG4769 Predicted membrane protein 31 12 Op 5 . + CDS 27829 - 29187 2284 ## COG1109 Phosphomannomutase 32 12 Op 6 . + CDS 29201 - 29419 123 ## gi|262066577|ref|ZP_06026189.1| putative ATP synthase protein I 33 12 Op 7 . + CDS 29444 - 29818 208 ## FN0365 ATP synthase protein I, sodium ion specific 34 12 Op 8 40/0.000 + CDS 29850 - 30599 961 ## COG0356 F0F1-type ATP synthase, subunit a 35 12 Op 9 37/0.000 + CDS 30632 - 30901 659 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 36 12 Op 10 38/0.000 + CDS 30946 - 31437 714 ## COG0711 F0F1-type ATP synthase, subunit b 37 12 Op 11 41/0.000 + CDS 31434 - 31958 646 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 38 12 Op 12 42/0.000 + CDS 31983 - 33485 2493 ## COG0056 F0F1-type ATP synthase, alpha subunit 39 12 Op 13 42/0.000 + CDS 33497 - 34345 1113 ## COG0224 F0F1-type ATP synthase, gamma subunit + Prom 34353 - 34412 8.8 40 12 Op 14 42/0.000 + CDS 34589 - 35977 2191 ## COG0055 F0F1-type ATP synthase, beta subunit 41 12 Op 15 1/0.000 + CDS 35988 - 36383 495 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 42 12 Op 16 . + CDS 36393 - 36764 177 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 + Term 36779 - 36826 7.2 + Prom 36807 - 36866 5.5 43 13 Op 1 . 
+ CDS 36913 - 37824 1063 ## Lebu_0491 hypothetical protein 44 13 Op 2 . + CDS 37861 - 39003 618 ## gi|237739751|ref|ZP_04570232.1| predicted protein + Term 39245 - 39291 6.6 45 14 Tu 1 . + CDS 39398 - 39709 435 ## FN0134 hypothetical protein + Prom 39735 - 39794 8.1 46 15 Op 1 1/0.000 + CDS 39827 - 40354 844 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 47 15 Op 2 . + CDS 40364 - 41158 868 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 48 15 Op 3 . + CDS 41165 - 41413 304 ## FN0286 hypothetical protein 49 15 Op 4 12/0.000 + CDS 41424 - 41663 341 ## COG1837 Predicted RNA-binding protein (contains KH domain) 50 15 Op 5 30/0.000 + CDS 41672 - 42187 753 ## COG0806 RimM protein, required for 16S rRNA processing 51 15 Op 6 2/0.000 + CDS 42184 - 42912 871 ## COG0336 tRNA-(guanine-N1)-methyltransferase 52 15 Op 7 1/0.000 + CDS 42921 - 43484 728 ## COG4752 Uncharacterized protein conserved in bacteria 53 15 Op 8 . + CDS 43536 - 47867 5327 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 54 15 Op 9 . + CDS 47879 - 49162 1560 ## FN0280 hypothetical protein 55 15 Op 10 1/0.000 + CDS 49152 - 49907 1107 ## COG2853 Surface lipoprotein 56 15 Op 11 1/0.000 + CDS 49935 - 51293 2170 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 57 15 Op 12 1/0.000 + CDS 51306 - 52178 1044 ## COG4866 Uncharacterized conserved protein 58 15 Op 13 . 
+ CDS 52214 - 53546 1457 ## COG1283 Na+/phosphate symporter Predicted protein(s) >gi|224461305|gb|ACDC01000097.1| GENE 1 96 - 899 670 267 aa, chain + ## HITS:1 COG:FN0679 KEGG:ns NR:ns ## COG: FN0679 COG1162 # Protein_GI_number: 19704014 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 264 22 285 285 439 86.0 1e-123 MRGILKKTNNKYNCVVGDRVEISEDNAIVEIFERDNMLIRPIVANVDYLAIQFAAKHPNI DYERINLLLLTAFYYKVKPLVIINKIDYLSEEELTELKERLAHLKSIGVPTFLISCQENV GLQEVEDFLKDKTTVIGGPSGVGKSSLINFLQSERVLKTGEISERLQRGKHTTRDSNMIR MKAGGYIIDTPGFSSIEVPKIENREELISLFPEFSNIDSCKFLNCSHIHEPNCNVKKAVE ENRISQDRYNFYKKTLEILSERWNRYD >gi|224461305|gb|ACDC01000097.1| GENE 2 892 - 1539 831 215 aa, chain + ## HITS:1 COG:FN0680 KEGG:ns NR:ns ## COG: FN0680 COG0036 # Protein_GI_number: 19704015 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Fusobacterium nucleatum # 1 213 1 213 215 379 92.0 1e-105 MTKGIKIAPSILSSDFSKLGEELVAIDKAGADYIHIDVMDGEFVPNLTFGPPVIKCIRKC TELVFDVHLMIDRPERYIEDFVKAGADIVVVHAESTIHLHRVIQQIKSFGVKAGVSLNPS TSEDVLKYVINDIDMVLVMSVNPGFGGQKFIPAVVEKIKAIKKMRADIDIEVDGGITDET IKVCADAGANIFVAGSYVFSGDYKERIDLLKSKVN >gi|224461305|gb|ACDC01000097.1| GENE 3 1554 - 2231 952 225 aa, chain + ## HITS:1 COG:FN0681 KEGG:ns NR:ns ## COG: FN0681 COG1846 # Protein_GI_number: 19704016 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 225 1 225 225 376 92.0 1e-104 MTVNIQRVNDVLEEYYKLFYKTEDMALKRGIKALTHTELHIIESVGQDTQLTMNELADKI GITMGTATVAISKLSDKGYIDRARSTTDRRKVFVSLTKKGVDALTYHNNYHKMIMASITE SIPEKDLQKFVETFEIILDSLRNKTDYFKPMTITDFKEGTKVSIVEIKGTPIVQNYFLSH GIENFTLLKVLKSGDKSLFKIEKEDGEVLTLDILDAKNLIGVKAD >gi|224461305|gb|ACDC01000097.1| GENE 4 2233 - 3858 1764 541 aa, chain + ## HITS:1 COG:FN0682 KEGG:ns NR:ns ## COG: FN0682 COG1293 # Protein_GI_number: 19704017 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # 
Organism: Fusobacterium nucleatum # 1 541 1 541 541 768 86.0 0 MLYMDGISLSKIKEELKKTLEGKRINRIFKNNEYTISLHFGKIELLFSCIPALALCYISK NKEQAILDISSSLISNLRKHLMNAMLTDIEQLGFDRILAFHFSRINELGEIKKYKIYFEC LGKLSNVIFTDEEDKVLDTLKKFHISENIDRTLFLGETYSRPKYDKKILPTELNKDKFDN LLASGNVFSNEIEGVGKYLNNIKSFEEFTNILNSPVKAKIYFKDKKIKLATVLDLDFKDY DEVKEFSSYDEMINFYIDYEHTTTSYMLLKNRLESFLEKKLKKLNKILTLIKKDIEDSET MESIKEKGDILASVLYNVKKGMNSVKAYDFYNNKEIEIELDSLISPKENLDRIYKKYNKV KRGLTNAIRRDKEIREEISYIESTLLFIESSTDVNSLREIEEELIKLNYIKSLHNKKKTK LKKEVKYGLIEGEDYLILYGRNNLENDNLTFKISEKNDYWFHVKDIPSSHIILKATKLTD ELIVKAAQVSAYYSKANLGEKVTVDYTLRKNVSKPNGAKPGFVIYVSQKSVVVEKVELDK I >gi|224461305|gb|ACDC01000097.1| GENE 5 4009 - 4404 551 131 aa, chain - ## HITS:1 COG:FN1881 KEGG:ns NR:ns ## COG: FN1881 COG0824 # Protein_GI_number: 19705186 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 1 129 1 129 129 202 83.0 1e-52 MFTFNYTIKQEDLNYGNHVGNERALLFFQWAREEFLRANNLSETDIGDGSGFIQTEATVQ YKKQLFLNQEIKINITKIEIKGLRIIFEHEIFCGEDLAITGTATVLAYNYEEQKVKKVPT TFKTLVENYNS >gi|224461305|gb|ACDC01000097.1| GENE 6 4929 - 5792 604 287 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Haemophilus influenzae R2866] # 1 284 1 284 290 237 42 1e-61 MIYLLIAAFLWGTSFIAGKIAYDMLDPSLVVAIRYILASIILLPMTFSFMKKKEESFTKK DFILLVILGILTYPLTSMLQFIGLSFTSASSATTIIGIEPVMITIVGFIFFKEKASPIVF LLGIIAFFGVALTVGVSALENVSFFGCFLVFLSTIVVSFWVRLSKKILTKMNSNYYTALT IQLGTLFALPIMLFLVKSWEIHYSLKGIIALLYLVVGCSIGAGWFWNKGLERSEASKSGV FLALEPVFGILLAVLVLGEKLNFLSIIGIILVILSAAICMILPKQES >gi|224461305|gb|ACDC01000097.1| GENE 7 5908 - 6087 343 59 aa, chain - ## HITS:1 COG:no KEGG:FN1884 NR:ns ## KEGG: FN1884 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 57 1 57 59 68 91.0 7e-11 MASCENMKKGEVYKCQCCDFEIEVKNACDCGTNDNCETHDASHECCEFTCCGKPLVKKG >gi|224461305|gb|ACDC01000097.1| GENE 8 6314 - 6727 528 137 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|237739717|ref|ZP_04570198.1| ## NR: gi|237739717|ref|ZP_04570198.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 137 1 137 137 234 100.0 2e-60 MGLTACKEKERILETTKDIPINENIVFNDYSVETVEDLAAFLVTVTEVENNKPVTITKVK KTFDWKVEEQEKDSYIVSAKYRDSTFKIPVTLSNNRVYTDIGYASVERNDEVYPLGSILP DLITEVQNDPKYQDYLK >gi|224461305|gb|ACDC01000097.1| GENE 9 6920 - 7096 381 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739718|ref|ZP_04570199.1| ## NR: gi|237739718|ref|ZP_04570199.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 58 1 58 58 68 100.0 9e-11 MTKKIENFIDNIIEEKKAEFKGLIGKENRVENMIEDLKTLNLSNDKLEEVIKVAKKHM >gi|224461305|gb|ACDC01000097.1| GENE 10 7153 - 7242 62 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVNVCNKLFIASMTDVHICLGNIYFYSLT >gi|224461305|gb|ACDC01000097.1| GENE 11 7462 - 8961 2105 499 aa, chain + ## HITS:1 COG:FN0023 KEGG:ns NR:ns ## COG: FN0023 COG1288 # Protein_GI_number: 19703375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 499 1 499 499 852 96.0 0 MKKIKMPDTFVIIFFVVLFASLLTYIVPVGKFEMQEVTYVTNTGAEKTRNVPVPGSFSYE LDDQGNELKKGIKIFEPGGEVGVTNYIFEGLASGDKWGTAVGIVAFLLVVGGAFGIILKT GAVESGIYSMISKSKGSELVLIPVIFILFSLGGAVFGMGEEAIPFAMLIIPIVIDMGYDS VTGILITYISTQIGFATSWMNPFSVAVAQGVSGIPVLSGAGFRMFMWTFFTAFGVIYTIF YARRVKRNPESSIAYKTDAYFRDNFKSEEQGNREFKLGHKLIILVLILGMAWVVYGVIKE GYYLPEIATQFVIMGLIAGVIGVVFKLNNMSVNDIATSFRKGAEDMVGAALVIGMAKGIV LILGGTSADTPTILNTILNYVASGLSNMSAAFCAWVMYIFQSLFNFFVVSGSGQAALTMP IMAPLSDLVGVTRQVAVLAFQLGDGFTNMIVPTSGILMAVLGIAKIEWGVWAKYQIKFQL ILFALGSCFVFFAVFTNFS >gi|224461305|gb|ACDC01000097.1| GENE 12 8995 - 9423 475 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739720|ref|ZP_04570201.1| ## NR: gi|237739720|ref|ZP_04570201.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 142 1 142 142 206 100.0 5e-52 MKIRKYFLLVMTLVFINFLNLNASQKRTGEMEATGSLVSSTKLNLVQKNNKKIFTIEVYR SNGKLSTKSEYELEDKDKNIEKNEIKKLYEEAKSGKIDYSSKIIEEYHENGNLKTRLIDT HVKEKLEEYDENGKLIRVENGE >gi|224461305|gb|ACDC01000097.1| GENE 13 9562 - 10287 1153 241 aa, chain + ## HITS:1 COG:FN0024 KEGG:ns NR:ns ## COG: FN0024 COG2849 # Protein_GI_number: 19703376 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 15 241 2 228 228 284 74.0 8e-77 MKKLLVGLLLVGSALSFGATQRVPLEKLGPRGNGRELYLEGQAKPYSGEVERKYPNGKLL GVATMKDGKLEGKAYEYYESGKVFKEEIYVNGTANGVAKSYYENGKVQYETKFVNGKREG IEKGYTNTGVLVSEIPYKNGEATGLAKFYNEQTGKLEYETNAINGLRNGLSKEYYPSGKV ANEVNFKNDIEDGITKIYYESGKLKGEATYKNGQLDGLAKIYDENGKVVEQATYKNGQKI K >gi|224461305|gb|ACDC01000097.1| GENE 14 10475 - 11200 1078 241 aa, chain + ## HITS:1 COG:FN0024 KEGG:ns NR:ns ## COG: FN0024 COG2849 # Protein_GI_number: 19703376 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 14 241 1 228 228 264 68.0 9e-71 MKKLLLGLLLISSVLSFGATQRVNIEKLVTNESRDTLYLEETKKPYSGEVERKYPDGKLL GLTTVKDGKLNGKSYEYYENGKLKIEENYVDGKSEGVSKSFYPNGKVESEVQFKNNKKEG VSKLYSENGILTSEIPFKNDVAIGVAKLYNAQTGKLEYEENLVNGKRNGLSKKYYPSGKV LNEVNFKDDKEEGIMRVYYETGKLQGEIPYKNGQVDGVVKAYDENGKLIEQAIYKNGEEV K >gi|224461305|gb|ACDC01000097.1| GENE 15 11224 - 12045 991 273 aa, chain + ## HITS:1 COG:FN0024 KEGG:ns NR:ns ## COG: FN0024 COG2849 # Protein_GI_number: 19703376 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 15 237 2 223 228 199 54.0 3e-51 MKKILLGLLLLSSALSFGATQRVSSEKLETNESKDILYLAGTKTPYSGEAEIRYPSGQLL SVATFKNGKINGKAYEYYPSGQLKLEENYINGKLNGLSKSYYENGQLKDETPYKNDKKEG IVKTYIENGTLISEVTFKNGVVVGKSKLYNTKTGKLAAESNINNMKIEGVSEEYYPSGKL LSEVRYKNGRVDGIAKVYNEQTGKLEREIPYKNGKIEGIEKSYDEKGKLIGTITFENDQI MEETTYKDGKIIDRKVYPLGKDLENELLKSETP >gi|224461305|gb|ACDC01000097.1| GENE 16 12185 - 12682 817 165 aa, 
chain + ## HITS:1 COG:FN0025 KEGG:ns NR:ns ## COG: FN0025 COG2849 # Protein_GI_number: 19703377 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 165 1 166 166 91 39.0 6e-19 MKKLLVGLLLVASSVLSFGAQRVPYEKLSFPGGYISYNDEKFTGEFERKDPRTGKINMVG SVKNGELHGTSYSYDENGKVTEEITFKKGMKEGASKTYYPSGAVAAKLNYKNDRYEGLQK YYYENGKLQAEIEMSKGQLDGVTKMYDENGKLKEEIMYKNGKKVK >gi|224461305|gb|ACDC01000097.1| GENE 17 12705 - 13199 781 164 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 158 1 142 245 77 41.0 9e-15 MKKLLVALLLIVSSVLSFGAEKVPYEKLSFSNGYIYYNNQEFTGEFEKKDPNTGIVKMVA SVKNGKLDGMSYTYDEIGRLIEETPYKNGLREGNGKVYYKSGVLSAKLTYKNDEYEGVQK YYYENGKLQTEIPTSQGVVTGAVKLYDKRGRFEGEMYHMMRKKL >gi|224461305|gb|ACDC01000097.1| GENE 18 13330 - 14046 1064 238 aa, chain + ## HITS:1 COG:FN0024 KEGG:ns NR:ns ## COG: FN0024 COG2849 # Protein_GI_number: 19703376 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 14 238 1 228 228 244 65.0 1e-64 MKKLLVGLFLVASVLSFGAQRMQVEKMVMNGDLFYVQGEQKPYSGEIEKKYPSGKTLGLA TIKAGKLEGKVYEYYENGKVKSEGNYVNGKAEGVEKYYYKSGKLESEVPFKNAKREGVAK YYSENGILVAEVPYKNDVTSGLGKEYNEKTGKLEYEITLANGVRNGSSKNYYPSGKLLSE VTYKNDIQDGPVRAYYENGKLQAEGTYKNGEIDGVATIYDENGKILQQVTYKNGKEVK >gi|224461305|gb|ACDC01000097.1| GENE 19 14255 - 15052 1178 265 aa, chain + ## HITS:1 COG:FN0388 KEGG:ns NR:ns ## COG: FN0388 COG3315 # Protein_GI_number: 19703730 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Fusobacterium nucleatum # 1 265 5 269 269 470 92.0 1e-132 MKIKLDGVAETLLITLNARAKDYENPKSVLHDKKSFEIASQLDYDFKKFDAAWASYYGIL ARAYIMDEEVKKFIERYPDCVIVSIGCGFDTRFERVDNGKITWYNLDLPEVMESRKLLFK 
ENNRVKNISKSVFESDWTKEVVTDGKELLIISEGVLMFFTEDEVKKVLEILVNNFEKFEL HLDLLYKGTIRMTAKHDTLKKMNNVKFKWGVKDGSEVVKLEPKLKQIGLINFTKKMAKIL PLSKKIFIPIFWLMNNRLGMYTYNK >gi|224461305|gb|ACDC01000097.1| GENE 20 15099 - 16745 1619 548 aa, chain - ## HITS:1 COG:FN0377 KEGG:ns NR:ns ## COG: FN0377 COG1178 # Protein_GI_number: 19703719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 1 548 3 550 550 849 94.0 0 MLSKKKDIWIVISLCVLAFYIVFMIYPLGILFKNAVIENNGDFTFAYFAKFLSKNYYFST IFNSFKVSLAATALTLIIGTPLAYFYNMYKIKGKTFLQIIIILCSMSAPFIGAYSWILLL GRNGLITNTIRNLTGFNFPSIYGFGGILLVLCMQLYPLVFLYVSGALRNIDNSLLEASEN MGCTGAKRFFKIIIPLCIPTILAAALMVFMRAFADFGTPLFIGEGYRTFPVEIYNQFMNE TGSDKNFASAVSIIAIIITSLIFLLQRYINGKYKFTMNALHPIEAKEIKGIKSVLIHLFC YLIVFVSYAPQLYVIYTSFQNTSGKLFKKGYSLKSYTEAFDKLGNAIQNTFFIGGLSLIL IIVISILIAYLVVRRNNFINRTIDTLSMVPYVIPGSVVGIALVSAFNKKPFVLVGTFLIM VISLIIRRNAYTIRSSVAILQQIPLSIEEASISLGASRMKSFFKITTPMMMNGIISGALL SWITIITELSSSIILYNYKTITLTLQIYVYVSRGSYGIAAAMSTILTLMTIISLLVFMRV SKNKNIMM >gi|224461305|gb|ACDC01000097.1| GENE 21 16735 - 17847 1597 370 aa, chain - ## HITS:1 COG:FN0376 KEGG:ns NR:ns ## COG: FN0376 COG3842 # Protein_GI_number: 19703718 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 370 1 371 371 646 90.0 0 MSVNIIIKNAQKRYGDNIIIEDLSLDIRQGEFFTLLGPSGCGKTTLLRMIAGFNSIENGD FYFNEKRINDLDPSKRNIGMVFQNYAIFPHLTVEQNVEFGLKNRKVSKDVMKVETDKFLK LMQIDEYRDRMPDRLSGGQQQRVALARALVIKPDVLLMDEPLSNLDAKLRVEMRTAIKEI QNSIGITTVYVTHDQEEAMAVSDRIAVMKDGAIQHLGQPKDIYQRPANLFVATFIGKTNV LKGTLDGSALKIAGKYDINLTNIKDKNVKGNVTISIRPEEFVIDESQAKDGMKAFIDSSV FLGLNTHYFAHLENGEKLEIVQESKIDNIIPKGTEVYLKVKQDKINVFTEDGSKNILEGV NNDIGVAYAK >gi|224461305|gb|ACDC01000097.1| GENE 22 17862 - 18893 1785 343 aa, chain - ## HITS:1 COG:FN0375 KEGG:ns NR:ns ## COG: FN0375 COG1840 # Protein_GI_number: 19703717 # Func_class: P Inorganic ion transport and 
metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 343 10 352 352 630 93.0 0 MAIGMVMFVACGGEKEKTEAAAPEAQGSNELVIYSPNADDEVNKIIPAFEKATGIKVTLQ SMGSGDVLARIAAEKENPQADINWGAISMGVLATTPDLWESYTSENEKNVPDAYKNTTGF FTNYKLDGSAALLVNKDVFAKLGLDPEKFTGYKDLLWPELKGKIAMGDPTASSSAIAELT NMLLVMGEKPYDEKAWEFVEKFVAQLDGTILSSSSQIYKATADGEYAVGVTYENPAVTLL QDGATNLKFVYPEEGSVWLPGAAAIVKNAPHMENAKKFIDFLISDEGQKIVAETSTRPVN TSIKNTSEFIKPFEEIKVAYEDIPYCAEHRKEWQERWTNILTK >gi|224461305|gb|ACDC01000097.1| GENE 23 19036 - 21594 2688 852 aa, chain - ## HITS:1 COG:FN0374 KEGG:ns NR:ns ## COG: FN0374 COG0608 # Protein_GI_number: 19703716 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Fusobacterium nucleatum # 1 852 1 844 844 1207 77.0 0 MKKNTKWLLENKINYERIFKNKGEKKLDFVIESLIENRNLSLDTNFDFNPFDLKDIDIAV KRIFEAIENNEKIYIYGDYDVDGITSVSLLYLALSELGGNIDYYIPLRDEGYGLNKDAIQ SLKEEEANLVISVDCGINSIEEINFANELGLDFIITDHHEIIGDLPKAFAVINPKREENI YSYKYLAGVGTAFMLVYALYSKLDRLNDLEKFLDIVAIGTVADIVPLTSDNRKFVKRGLK LLNNTKWIGIKQLLRKVFPEDWDTREYCSYDVGYLIAPIFNAAGRLEDAKQAVSLFIEED GFECLTIIEQLLENNNERKNIQKKILEASIAEIEKKQLYNKNLILVANKSFHHGVIGIVA SKILDKYYKPSIIMEIKESEGVATASCRSIDGINIVECLNSVSDILVKYGGHSGAAGFTI KIENIEEFYQRVDKYIGENFPKELFVKTIKIENILAPYKVNYEFLRELEILEPYGAKNHT PIFAFKNCEYENLRFTKNSTEHLMLDIKKDNYYFKNCIFFGGGDYYDIIANSKKIDVAFK LKLETFKDRYMCKLQLEDVKNSMENTDFNDNYLELNGRDISFPIRTVVYPKRPDIENPLN LVFNDYGLAITKDRTIIENIDVNLANILKVLKNEFNYNFSVEIEKKYLKTENINLHLKID IDKDIILKTFPVKDALIFQEIKKELISDFDYNSIQKKVLASIFKDKKATLAIIEKGRGIK TIIETIKKYYLYKGKTISINDNFKKADFHIFTFDFENEVDLKNVMQTLEKINSNNILVIS NKEFELSNFNLIKDEYTIAKNIEYITYDEIDKIKKSDNFYYPFLTNEEKIKILALLNKEE KIFSTKEIIVHF >gi|224461305|gb|ACDC01000097.1| GENE 24 21694 - 22458 969 254 aa, chain - ## HITS:1 COG:no KEGG:FN0371 NR:ns ## KEGG: FN0371 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 251 1 252 255 234 58.0 3e-60 
MKKKGLFFSFLLFIVCSFNLLAENFPQKASKVEDFIPKGWKKLIVEKGDLNKDKIDDVVL VIEKNDPKNFKKIEDSPRSNPVNFNPRIILVLFKDKNSKYTLVAKNDKNFIVSPGYASEE GLETLDSPDYNDNLSKAVTIKNNTLRIFTLADYIKAATSTTYIFRYQNNRFELIGLDAQN ISGDTEYVDTTNYSLNLSTKKLIIHNMSEKLESNVKKEEKIEKNLNITEIYALDTMSETS GVDILDKYVYEIKK >gi|224461305|gb|ACDC01000097.1| GENE 25 22489 - 23232 823 247 aa, chain - ## HITS:1 COG:no KEGG:FN0371 NR:ns ## KEGG: FN0371 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 245 1 249 255 216 55.0 4e-55 MKKKGLFFSFLLFILCSFNLLAENFPQKASKVEDFIPKGWKSIVVKKGDLNKDKIDDVVL VIQKDDAKNFEKSEDNTIFNYNPIAILVLFKDKNSQYNLISKNENGFIVSKDKALVEELE TLSSPDLDDDLSKSINIKNDTLRLLTRSEYVKGARVTEYIFRYQNKKFELIGLEYKYWHT STDYAVDIAYSINFSTKKLIGTKDISGVRTDETKIEKVEKSIDVKDKYILDTMAQDTGIK ILEKYDN >gi|224461305|gb|ACDC01000097.1| GENE 26 23252 - 24016 766 254 aa, chain - ## HITS:1 COG:no KEGG:FN0371 NR:ns ## KEGG: FN0371 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 251 1 252 255 248 60.0 1e-64 MKKNFFILSLILFIFCSFNLFAVNFPKKANKIEDFIPKGWKSIIIKKGDLNKDKIDDVVL IIEKNDPKNFKKNEESYQTSPENYNPRIILVLFKDKNSKYVLVAKNDKGFIISPGEAYES GLQNLESPDFDNDLSKSVTIKNNTLRIFTFAELTRSSGSSIYIFRYQNNRFELIGLENQN IFANAEYIDTYNYSFNFSTKKLKIHNLREKLESNMRKEEKIEKKLNIKESYVLDTMLETT GIDILDTYAHEIKK >gi|224461305|gb|ACDC01000097.1| GENE 27 24207 - 25232 1187 341 aa, chain + ## HITS:1 COG:FN0370 KEGG:ns NR:ns ## COG: FN0370 COG0681 # Protein_GI_number: 19703712 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Fusobacterium nucleatum # 1 320 1 283 286 357 61.0 2e-98 MKTILYGIFYFFLTLFFAYIFIKEKDLAKKFDTRREKFVNKIVKNENKAKRYKKILYYVE TIGTALILVVVIQRFYIGNFKIPTGSMIPTIEIGDRVFADMVSYKFTGPKRNSIIIFDEP MRDEDSYTKRAMGLPGETIKIQDGALYVNGEKTDFRRYSNDGIGDQEWRIPKKGDKLEII PAGKYRDVLENAGINVDAVVKEAFYKEPFEFFKNLYYGLKHKIFDKLKIKYDINEYINHR NDYRKQGALTIVEMIMPNLKFVVNGEETGPILDFISDEKVRNKLLNGETVEIILEDDYYL ALGDNTDNSKDSRYIGFIKKSRMKGRVLVRFWPLNRIGLVK >gi|224461305|gb|ACDC01000097.1| 
GENE 28 25398 - 25829 198 143 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Anaerococcus prevotii DSM 20548] # 1 140 1 144 146 80 35 1e-14 MIKKLTINDVDYIEQIFNLEKNIFKNSAFSKESTENLVKADNSFIYAYLVDEKVCGYLMV LDSIDVYEILAIATIEEYRNKGIAQELLDKIKTKDIFLEVRKSNEKAINFYKKNNFKQIS IRKGYYSDPTEDAIIMKMEVNNE >gi|224461305|gb|ACDC01000097.1| GENE 29 25822 - 27255 1982 477 aa, chain + ## HITS:1 COG:FN0368 KEGG:ns NR:ns ## COG: FN0368 COG0015 # Protein_GI_number: 19703710 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Fusobacterium nucleatum # 1 477 1 477 477 882 94.0 0 MNNEIYSNPLCERYSSKEMMYNFSPDKKFSTWRKLWIALAESEKELGLDISQEQIDEMKK NIHNIDYELAAKKEKEFRHDVMAHVHTFGTQAPLAMPIIHLGATSAFVGDNTDLIQIKDG LEIIKAKLVNVMNNLSKFALENKDVATLGFTHFQAAQLTTVGKRATLWLQSLLLDLEELE FRENTLRFRGVKGTTGTQASFKDLFNGDFSKVEELDVLVSKKMGFDKRFAITGQTYDRKV DSEIMNLLANIAQSAHKFTNDLRLLQHLKEVEEPFEKSQIGSSAMAYKRNPMRSERISSL AKFVIALQQSTAMVASTQWFERTLDDSANKRLSLPQAFLAVDAILIIWNNIMEGLVVYNK IIEKHIMSELPFMATEYIIMECVKAGGDRQELHERIRVHSMEAGKQVKVEGKDNDLIDRI VNDDYFKLDKAKLLSILEPKNFIGFAAEQTEKFVNIEIKPILDKYKALLGMDSELKV >gi|224461305|gb|ACDC01000097.1| GENE 30 27292 - 27807 257 171 aa, chain + ## HITS:1 COG:FN0367 KEGG:ns NR:ns ## COG: FN0367 COG4769 # Protein_GI_number: 19703709 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 171 1 171 172 211 91.0 7e-55 MIKKEYREEIYLIALVLLGLYLSLIENIIPKPFPWMKIGLSNISVLIALEKFNSKMALQT ILLRVFIQALMLGTLFTPNFIISFSAGLVSTLFMIFLYKFRKYLSLLSISCISAFMHNLL QLTVVYFLMFRNISLNSKSIIIFIIFFLGLGVIMGLVTGIIATRLNLKRNR >gi|224461305|gb|ACDC01000097.1| GENE 31 27829 - 29187 2284 452 aa, chain + ## HITS:1 COG:FN0366 KEGG:ns NR:ns ## COG: FN0366 COG1109 # Protein_GI_number: 19703708 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Fusobacterium nucleatum # 1 452 1 452 452 760 89.0 0 
MGRYFGTDGIRGEANKELTVEKALRLGYALGYYLKNTYKNEEKIKVVMGSDTRISGYMLR SALTAGLTSMGIYIDFVGVIPTPGVAYITKLKKAKAGIMISASHNPAKDNGIKIFNSDGF KFSDEIENKIEDYMDDLDSILVDPLAGDKVGKFKYAEDEYFLYRDYLSHCVKGNFKDMKI VLDTANGAAYRAAKDVFLDLRAELVVINDAPNGRNINVKCGSTHPEILAKVVVGYEADLG LAYDGDADRLIAVDKFGNIIDGDKIIGILALGMKNAGTLKNDKVVTTVMSNIGFEKYLKE NNIELLRANVGDRNVLEMMQKEDVAIGGEQSGHIILRNYATTGDGILSSLKLVEVIRDTG KDLHELVSAIKDAPQTLINVKVDNAKKNTWDKNEKITSFIDEINKKHSDEVRILVRKSGT EPLIRVMTEGENKQLVHKLAEDIAKLIETELN >gi|224461305|gb|ACDC01000097.1| GENE 32 29201 - 29419 123 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262066577|ref|ZP_06026189.1| ## NR: gi|262066577|ref|ZP_06026189.1| putative ATP synthase protein I [Fusobacterium periodonticum ATCC 33693] # 1 72 1 72 72 77 95.0 2e-13 MKIFDKDFYRYLALFTEIGLTLFINVFIAIYLYYLFEKYLFRSFIFLIFMILLGIVNGFY SVYKLIFPKNKK >gi|224461305|gb|ACDC01000097.1| GENE 33 29444 - 29818 208 124 aa, chain + ## HITS:1 COG:no KEGG:FN0365 NR:ns ## KEGG: FN0365 # Name: not_defined # Def: ATP synthase protein I, sodium ion specific # Organism: F.nucleatum # Pathway: not_defined # 20 124 1 105 105 122 80.0 4e-27 MEDIKNLFKKTIITTIICFLLGLVFQNKYLFFGIGGGCAISVIALYLISVDSKAITYSKD VKVAKRIAYIGYAKRYFLHLLFFVALFYFFNDFRLFLCGFIGTLNVKLTIYCMNILKKIK SFFK >gi|224461305|gb|ACDC01000097.1| GENE 34 29850 - 30599 961 249 aa, chain + ## HITS:1 COG:FN0364 KEGG:ns NR:ns ## COG: FN0364 COG0356 # Protein_GI_number: 19703706 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Fusobacterium nucleatum # 32 248 1 217 218 323 88.0 1e-88 MRLGPIEFTTGELVSGPSEIFSIFGFPITSTVVTTWFILLCFFVFFKLGTRNLQLIPGKF QSILEGIYEFLDGTIGQILGTWKKKYYTFFATLFLFIFLSNIITFFPIPWFGVKNGVFEI FPAFRSPTADLNTTVCLALIVTFLFISINIKNNGILGYLKGFGDPTPVMVPLNIVGEFAK PLNISMRLFGNMFAGMVIMGLIYMAVPYFIPAPLHLYFDLFAGLVQSFVFVTLSMVYVQG SLGDAEYTE >gi|224461305|gb|ACDC01000097.1| GENE 35 30632 - 30901 659 89 aa, chain + ## HITS:1 COG:FN0363 KEGG:ns NR:ns ## COG: FN0363 COG0636 # Protein_GI_number: 19703705 # Func_class: C Energy production and 
conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 1 89 1 89 89 126 100.0 9e-30 MDLLTAKTIVLGCSAVGAGLAMIAGLGPGIGEGYAAGKAVESVARQPEARGSIISTMILG QAVAESTGIYSLVIALILLYANPFLSKLG >gi|224461305|gb|ACDC01000097.1| GENE 36 30946 - 31437 714 163 aa, chain + ## HITS:1 COG:FN0362 KEGG:ns NR:ns ## COG: FN0362 COG0711 # Protein_GI_number: 19703704 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Fusobacterium nucleatum # 1 163 1 163 163 186 87.0 2e-47 MPIISIDATFFWQIINFFVLLFIVKKYFKEPISKIINERKQKIEAELVEATKNREEAEKL HKEAEAQVLNSRKEASEIVKNAQRKAEEEAHLLIKEARENRENILRATELEVTKIKNDTK DELGREVKNLAAELAEKIIKEKVDDNQETSLIDKFIAEVGEDK >gi|224461305|gb|ACDC01000097.1| GENE 37 31434 - 31958 646 174 aa, chain + ## HITS:1 COG:FN0361 KEGG:ns NR:ns ## COG: FN0361 COG0712 # Protein_GI_number: 19703703 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Fusobacterium nucleatum # 1 174 1 174 174 230 90.0 1e-60 MIKSQVGRRYSKAIFDIAEEKNQVKEIYEMLNSAMVLYRTDKEFKNFIRNPLIENEQKKA VLTEIFGKDNSENLNILLYILDKGRINCIKYIVAEYLKIYYRKNRILDVKATFTKELSEE QRTKLINKLSQKTGKEINLEVKVDKSILGGGIIKIGDKIIDGSIRRELDNWKKS >gi|224461305|gb|ACDC01000097.1| GENE 38 31983 - 33485 2493 500 aa, chain + ## HITS:1 COG:FN0360 KEGG:ns NR:ns ## COG: FN0360 COG0056 # Protein_GI_number: 19703702 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Fusobacterium nucleatum # 1 500 1 500 500 920 95.0 0 MNIRPEEVSSIIKKEIDNYKKTLEIKTSGTVLEVGDGIARIYGLSNVMSGELLEFPHGVM GMALNLEEDNVGAVILGYASLIKEGDEVRATGKVVSVPAGDNLLGRVVNALGEPIDGKGE IIADKYMPIERKASGIISRQPVSEPLQTGIKSIDGMVPIGRGQRELIIGDRQTGKTAIAI DTIINQKGQNVKCIYVAIGQKRSTVAQIYKKLSDLGCMEYTTIVAATASEAAPLQYMAPY SGVAIGEYFMDRGEHVLIIYDDLSKHAVAYREMSLLLRRPPGREAYPGDVFYLHSRLLER 
AAKLSDELGGGSITALPIIETQAGDVSAYIPTNVISITDGQIFLESQLFNSGFRPAINAG ISVSRVGGAAQIKAMKQVASKVKLDLAQYTELLTFAQFGSDLDKATKAQLERGHRIMEIL KQPQYHPYTVEKQVVSFYAVINGHLDDIEISKVRRFEKELLEYLKGNTDILTEIADKKAL DKDLEERLKESIANFKKSFN >gi|224461305|gb|ACDC01000097.1| GENE 39 33497 - 34345 1113 282 aa, chain + ## HITS:1 COG:FN0359 KEGG:ns NR:ns ## COG: FN0359 COG0224 # Protein_GI_number: 19703701 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Fusobacterium nucleatum # 1 282 1 282 282 440 86.0 1e-123 MPGMKEIKSRIKSVQSTRQITNAMEIVSTTKFKKYSKLVSESRPYEESMRKILSHIAAGT KNERHPLFDGREEVKSIAIIVITSDRGLCGSFNSSTLKELEKLVKQNEGKKISIIPFGRK AIDFATKRNYDFSESFSKFSAEEMNKIARDVSEDIVVKYANHEYDEVYLIYNKFISALRY DLTCEKIIPIARMEGEVNSEYIFEPSTEYILSSLLPRFINLQVYQAILNNTASEHSARKN SMGSATDNADEMIKTLNIQYNRNRQTAITQEITEIVGGASAL >gi|224461305|gb|ACDC01000097.1| GENE 40 34589 - 35977 2191 462 aa, chain + ## HITS:1 COG:FN0358 KEGG:ns NR:ns ## COG: FN0358 COG0055 # Protein_GI_number: 19703700 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Fusobacterium nucleatum # 1 462 1 462 462 864 98.0 0 MNRGTITQIISAVVDVAFKDELPAIYNALKVKLEDKELVLEVEQHLGNNVVRTVAMDSTD GLKRGMEVIDTGKPITVPVGKAVLGRILNVLGEPVDNQGPVNAETVLPIHREAPEFDDLE TETEIFETGIKVIDLLAPYIKGGKIGLFGGAGVGKTVLIMELINNIAKGHGGISVFAGVG ERTREGRDLYNEMTESGVITKTALVYGQMNEPPGARLRVALTGLTVAENFRDKDGQDVLL FIDNIFRFTQAGSEVSALLGRIPSAVGYQPNLATEMGALQERITSTKSGSITSVQAVYVP ADDLTDPAPATTFSHLDATTVLSRNIASLGIYPAVDPLDSTSKALSEDIVGKEHYEIARK VQEVLQRYKELQDIIAILGMDELSDEDKLTVSRARKIERFFSQPFSVAEQFTGMEGKYVP VKETIRGFREILEGKHDDIPEQAFLYVGTIEEAVAKSKDLVK >gi|224461305|gb|ACDC01000097.1| GENE 41 35988 - 36383 495 131 aa, chain + ## HITS:1 COG:FN0357 KEGG:ns NR:ns ## COG: FN0357 COG0355 # Protein_GI_number: 19703699 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Fusobacterium nucleatum # 3 131 6 134 134 173 75.0 6e-44 
MLVSVVTQIKKVLEQEAGYLRLRTSEGDIGIMPNHAPLVAELSAGKMEIESPSKDRRDVY FLTGGFLEISNNQATIIADEIFPLDEINIENEQLELEKLRKELELDLTEEEKQKIQKRIK ISSAMIDAKTN >gi|224461305|gb|ACDC01000097.1| GENE 42 36393 - 36764 177 123 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 1 122 1 126 126 72 36 4e-12 MKFHFLHENFNVLDLEKSIKFYEEALGLKVEREKFAEDGSYKIVYLGDGITNFQLELTWL ADRTEKYDLGDEEFHLAFEVDDYEGAFKKHTEMGCVVFVNEKMGIYFITDPDGYWIEILP PKK >gi|224461305|gb|ACDC01000097.1| GENE 43 36913 - 37824 1063 303 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0491 NR:ns ## KEGG: Lebu_0491 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 285 7 289 296 190 40.0 6e-47 MLVGKKKFDEKFKHVLGIIEYSQDRLEGEKVSLELITNKEGDPLSCMTMLGHDYHQMASV NLLVDKDIEKFRQNMYLSTKLCLLGSDTRSYLIPHVEDFFSALMSNNQDILEFFKKYSDI LAYEKEKNFYKKSYSGSFLSRTVLLALKGEWEDVILRANLYLANPSKNTKDKYYYLQFEF LKALAEKNVEKMKESINTMLDLKVARTMVYNMDLYFDFYLQIYVLIYAKIALYHGIDLEV DSDIAPKELIDNTPAKEYPEPYEFMKKFDFKTITAEEWKAWIYEYHPEPEELKESEERGY IFV >gi|224461305|gb|ACDC01000097.1| GENE 44 37861 - 39003 618 380 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739751|ref|ZP_04570232.1| ## NR: gi|237739751|ref|ZP_04570232.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 380 1 380 380 569 100.0 1e-160 MKIKNFFHIILLALIIFFLAKVPRYENTLLQLSKNTKIERDYSIFNDDIALFYLESTNLK YIIYVEGLKKIDNIWVGNAYSYKEAYEKNSGFKWLEDDSKSFNPEYNREQKEIEYNKSTG YFIIDDKKEVYGLSEDEIKEMLNISSLKLENPEKYINKNGERARLTQFSQDLFREMLNIS DYYDSDKFEKENTTTAKEVNKLIFLRNIMLIYLGINLILCIVFLFKKNIKNLKNYIKVKE KRIWILYGLDFFTLFIYCFLCIFSELNKIIEMNKFVFYYIVIKNFLFYYYLKLKKEKFKK LIVVKYLILILILVSGIFIKSFEINSIFYYLTYLFSILYFPILSLRKKENLKVTVLINSL YLVTQFIYLAIVFLWLYIIF >gi|224461305|gb|ACDC01000097.1| GENE 45 39398 - 39709 435 103 aa, chain + ## HITS:1 COG:no KEGG:FN0134 NR:ns ## KEGG: FN0134 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 103 1 103 103 143 86.0 2e-33 MSKRKKNLEKVIQQCQKTLDRIDEELTKPEPKLTPYDIEMGNFDEVPRLILKEAKKQIKI MMQVLDKNEYMPDYIYPLIDSYLIDTELCDLLFETESIYKKYT >gi|224461305|gb|ACDC01000097.1| GENE 46 39827 - 40354 844 175 aa, chain + ## HITS:1 COG:FN0288 KEGG:ns NR:ns ## COG: FN0288 COG0634 # Protein_GI_number: 19703633 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 175 1 175 175 288 88.0 3e-78 MNYRIENLIDRKTVENRIKELAKQIEKDYAGEEVYCVGLLKGSVVFLSDLVKEINTPVII DFMSVSSYGSETVSSGDVKILKDTDLDLRGKHVLIVEDIIDTGLTLEHVIRYFKESKGVK TLKTCTLLSKPERRKVNIDIDYVGFDVPDKFVIGYGLDYDQKYRNLPYIAVVVFE >gi|224461305|gb|ACDC01000097.1| GENE 47 40364 - 41158 868 264 aa, chain + ## HITS:1 COG:FN0287 KEGG:ns NR:ns ## COG: FN0287 COG0030 # Protein_GI_number: 19703632 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Fusobacterium nucleatum # 1 264 1 264 264 426 90.0 1e-119 MDFKHKKKYGQNFLNNKDEILNKIIEVSNIDENDEILEIGPGQGALTNLLVERAKKLTCV EIDKDLEAGLRKKFSSKENYTLVMGDVLEVDLTKYLNKGTKVVANIPYYITSPIINKLIE NKELIDEAYIMVQKEVGERICAKAGKERSILTLAVEYYGEADYLFTIPREFFNPVPNVDS AFISIKFYKDDRYKNKISEDLFFKYIKAAFSNKRKNIVNNLATLGYSKDKIKEILNRVEI SENERAENISIDKFIELIDIFEGR >gi|224461305|gb|ACDC01000097.1| GENE 48 41165 - 41413 304 82 
aa, chain + ## HITS:1 COG:no KEGG:FN0286 NR:ns ## KEGG: FN0286 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 82 1 80 80 124 85.0 7e-28 MLMKKSYEFIIQSKKEDIDFINKIVEAYEGAGVVRTLDSTNGIISVISTDDYKDMMREVL IDLGNRWVDLKIIEEGAWKGTL >gi|224461305|gb|ACDC01000097.1| GENE 49 41424 - 41663 341 79 aa, chain + ## HITS:1 COG:FN0285 KEGG:ns NR:ns ## COG: FN0285 COG1837 # Protein_GI_number: 19703630 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Fusobacterium nucleatum # 1 79 1 79 79 120 92.0 7e-28 MENLESLLNFIIKQLVETEDKVNITYEVLDSDVTFKVSVAKGEMGKIIGKNGLTANAIRG VMQAAGVKDKLNVSVEFLD >gi|224461305|gb|ACDC01000097.1| GENE 50 41672 - 42187 753 171 aa, chain + ## HITS:1 COG:FN0284 KEGG:ns NR:ns ## COG: FN0284 COG0806 # Protein_GI_number: 19703629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Fusobacterium nucleatum # 1 171 3 173 173 251 88.0 4e-67 MIVAGKVLGSHHLKGEVKVISDLQNIEMLVGNKVILELEDKQQKLLTVKKIAPLVANKWI FTFEEIKNKQDTIEIRNAAIKVRRDIVGIGEDEHLVSDMLGFKVYDVKDNEYLGEITEIM DTAAHDIYVIESEDFETMIPDVDVFIKNIDFENKKMLVDTIEGMKEPKVKK >gi|224461305|gb|ACDC01000097.1| GENE 51 42184 - 42912 871 242 aa, chain + ## HITS:1 COG:FN0283 KEGG:ns NR:ns ## COG: FN0283 COG0336 # Protein_GI_number: 19703628 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Fusobacterium nucleatum # 1 236 1 236 238 422 92.0 1e-118 MKINILTLFPKMFEGFVSESIISRAIKFGAVEVNIIDIRDYCFDKHKQADDMPFGGGNGM VMKPEPLFLALENLSGKVIYTSPQGKTFNQEIAKELAKEEELTIIAGHYEGVDERVVENK VDMELSIGDFVLTGGELPAMVISDTIIRLLPDVIKKDSYENDSFYNGLLDYPHYTRPAEY KGLRVPDVLISGNHKKIDEWRLKESLKRTYLRRRDLIEKRELTKLEKKLLDEIKEEIKKE EV >gi|224461305|gb|ACDC01000097.1| GENE 52 42921 - 43484 728 187 aa, chain + ## HITS:1 COG:FN0282 KEGG:ns NR:ns ## COG: FN0282 COG4752 # Protein_GI_number: 19703627 # Func_class: S 
Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 186 1 186 187 366 96.0 1e-101 MRNKVYLSLVHYPVYNRNRDVVCTSVTNFDIHDISRSCGTYEIKGYRLVVPVDAQKKLTE RIIAYWQDGTGGQYNKDREQAFRVTDVAESIEAVVEEIERIEGQKPLIITTSARIFNNSI SYENLSKQIFEDDKPYLLLFGTGWGLTDEVMAMSDHILEPIRANSKYNHLSVRAAVAIIL DRLFGER >gi|224461305|gb|ACDC01000097.1| GENE 53 43536 - 47867 5327 1443 aa, chain + ## HITS:1 COG:FN0281 KEGG:ns NR:ns ## COG: FN0281 COG2176 # Protein_GI_number: 19703626 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Fusobacterium nucleatum # 1 1443 12 1454 1454 2534 89.0 0 MIEPNMEVFEKLGVKNIEIKNILLNTRTKRITFNCSVSCMGCIDDIDTIYKDVLSKFGRE IEIEFVTENKELKLEDEEIKTIAIRAIERLKSRNTTSKSFLCFYKVYVKNNYIIIELNDE HIKFMLEEVKISSKIESILAEYGLKDYKIMFSVGDFSKELSNVEEKIKADMEKQQNIISS EREKIIKENSVTETQVYKAKNDFKRGSKTKDIKGDVISIKDFYDLYDGEPCIVQGEIFSI EGMVLKSGKTLKTIRITDGESSLTSKIFLDENDNLDISEGKILKLSGKVQMDTYAGNEKT LMINTVNIIEKEVIKKEDTAEEKMVELHTHTKMSEMVGVTDVEDLIKRAKEYGHKAIAIT DYSVVHSYPAAYKTAKKLSKDDDKMKVIFGCEMYMIDDEALMITNPKDKKIDEEEFVVFD IETTGLNSHTNKIIEIGAVKIKAGRIIDRYSQLINPGISIPYHITEITSITNEQVANQPK IDEVIGKFVDFIGDAVLVAHNAPFDMGFIKRDIKEYLNIDLENSVIDTLQMARDLFPDFK KYGLGDLNKSLGLALEKHHRAVDDSQATANMFIIFLEKYKEKGIEYLKDINKGFEVNVKK QSLKNIMVQVKTQEGLKNMYKLVSEGHIKYFGNKKARIPKSVLKENREGLIVGSSLSAHF MNSGELVELYLRHDLEKLEETAKFYDYIELLPKSTYNELIEKEGTGSLASYDEVEKMNKY FYDLGKRLGILVTASSNVHYLDENEDIIRSILLYGSGTVYSPRQYRVNNGFYFRTTDEML KEFSYLGEQEAKEVVITNTNKIADMVEEGIKPIPEGFYPPKMDNAEEIVRTMTYEKAYRI YGDPLPNIVSARLERELNAIINNGFSVLYLSAQKLVKKSLDNGYLVGSRGSVGSSLVAFM MGITEVNALYPHYICDNPECKHSEFIEKEGVGIDLPDKICPNCGAPLRKDGYSIPFEVFM GFKGDKVPDIDLNFSGEYQSEIHRYCEELFGKENVFKAGTISTLAEKNAEAYVRKYFEDN NLNAVRAEIIRLGRLCQGAKKTTGQHPGGMVIVPQGNSIYEFCPVQRPANDETSESTTTH YDYHVMDEQLVKLDILGHDDPTTIKLLQEYTNMEIKDIPLADKDTLKIFSSTESLGVSPE EIGTEIGTYGIPEFGTGFVRQMLIDTRPTTFAELVRISGLSHGTNVWLNNAQEFVRNGQA TLSQIITVRDDIMNYLIDQGLDNSDAFKIMEFVRKGKPKKEPENWENYSNMMKEKNVPDW 
YIESCRRIEYMFPKGHAVAYVMMAMRIAYFKVHKPLAFYAAFLSRKADDFDMEVMSKGVL AKQKLEELSKEPKLDPKKKNEQAICEIVVELEARGIELLPVDIYLSEGRKFKIEDGKIRI PLIGISGLGGAVIENILKEREEAKFISVEDLKRRTKMSQTVADKLKSIGAISSLSETNQI SLF >gi|224461305|gb|ACDC01000097.1| GENE 54 47879 - 49162 1560 427 aa, chain + ## HITS:1 COG:no KEGG:FN0280 NR:ns ## KEGG: FN0280 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 427 1 427 427 692 85.0 0 MKKLLSKVVLFLLLSLTAFSYNFPIEDPYSATIIGSSTMMTEGIMENIPLKVYEIQIKDP KDIPDAFWYANKFKFSFSKQSNKKAPLIFVLAGTGSDYNASRVKFMQRIFHTAGYHTIAI SSQMSQQFMISASSNTVPGLLMEDNKDIYKAMKLAYDKIKDQVEVTDFYIMGYSLGGTNA AVLSYIDETEKAFNFKRVFMVNPPVELYDSAVKLDKYLDDYTGGKTENIEKLLNTTLARL KNGLTNEYANIGADTIYNIVKGDFLSDEEKKAYIGLAFRLTSNDLNFLSDLLTKSGVYTK PTAKLTKFTNMKPYLKAVNFANFEDYVNKVGLPYYQKQNKASSIDDLKKASSLRVIEDYL RTSPKIAAVTNADELILSQKDIAFLKDVFKDRLVVYPRGGHCGNMFYKENVDVMVNFVNK GVLKYEN >gi|224461305|gb|ACDC01000097.1| GENE 55 49152 - 49907 1107 251 aa, chain + ## HITS:1 COG:FN0279 KEGG:ns NR:ns ## COG: FN0279 COG2853 # Protein_GI_number: 19703624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Surface lipoprotein # Organism: Fusobacterium nucleatum # 1 251 1 260 260 374 77.0 1e-104 MKIKNLLLLSILSLSLVSCANTDEVKTTNTETSDVVVTAPEKNFIAEEYDPWEPFNKRMY YFNYQIERLVITPIVNTYRFITPDFVENRVSNFFKNAKVLNTMANSAFQLKGRKSMRALG RFTMNTVLGLGGLFDVASKMGMPKPYEDFGLTLAHYGVGRGPYLVLPLLGPTYLRDAFGT GVDSTLAGKIDIYNRMELFSTSSVPVTALRGIDMRKNIDFHYYQTNSPFEYEYVRYLYGK YRGIQEAASEK >gi|224461305|gb|ACDC01000097.1| GENE 56 49935 - 51293 2170 452 aa, chain + ## HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 1 452 1 452 452 833 93.0 0 MDLKEKVLGYKDEVVKEIQNAVRVKSVKEAPLPGMPFGEGPAKALDHFMDLAKKLGFKAE KFDNYAMHIDMGEGDETLGILAHVDVVPEGDNWTYPPYSGTIADGKIFGRGTLDDKGPAV 
ISLFAMKAIADAGIKLNKKVRMILGADEESGSACLKYYFGELKMPQPTIGFTPDSSFPVT YAEKGSVRVKIKKKFNTLQDVVIKGGNAFNSVPNKANGEIPVDMLGEVRNKNKVEFEREG NTYKVVSAGIPAHGAYPSKGYNAVSALFEVLKDFEVKNEELKSIVTFFDNFVKMETDGES FGVKCTDGETGELTLNLGKINLENNELEIWLDMRIPVKIKNEQIIETIKKNTEDFGYEFV LHSNTQPLYVPKDSFLVSTLMDIYKELTGDKDAEPVAIGGGTYAKYANNTVAFGALLPEQ EDRMHQRDEYLEISKIDKLLQIYVEAIYKLAK >gi|224461305|gb|ACDC01000097.1| GENE 57 51306 - 52178 1044 290 aa, chain + ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 290 1 290 290 439 84.0 1e-123 MWKKLTIESKDTIEEYTKNRFEICDLSFSNLFLWSFGENTEYEIENDVLTIRSEYMGEAY YYMPIPKNDTPENITAMKEKIKNIIEENVPIHYFTEYWYEKLKDDFNLQEKRDYEDYIYS YESLSTLKGRHYAKKKNRVANFRKNYEYTYGSISKDNIGEVIAFQEKWYKLHSEFGGEIL KNENEGIMQLLKNYDSLDIKGGFLKVNNQIIAYSLGEALNDKIVLVHTEKALIDYIGSYQ AINMIYLQEEWQGYELVNREDDFGDEGLREAKMSYKPLYLLKKYSIEKNV >gi|224461305|gb|ACDC01000097.1| GENE 58 52214 - 53546 1457 444 aa, chain + ## HITS:1 COG:FN0276 KEGG:ns NR:ns ## COG: FN0276 COG1283 # Protein_GI_number: 19703621 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Fusobacterium nucleatum # 15 444 1 430 525 716 91.0 0 MYIKIILQLIGGLGLFLYGMEHMSTSMQKIAGPKLKKILASLTNNRILGILVGIVITALV QSSSVSTVMTVGFVNASLLTLKQALGVILGANIGTTITGWLLVLDIGKYGLPIVGVAAIL YMFMKKEKARTNLSAIIGVGLIFFGLQLMSQALSPLKDMPEFIEMFKMFKVDSYFGLLKV TAVGAIITALIQSSAATIGITIALATQGLIDYQAAVALVLGENVGTTVTAFLASLGAKPN AKRAAFAHTLINLIGVFWVTSIFRFYLKFLNNFVDPVHHMGAAIAAAHTIFNISNVIILT PFVGLLDKLLLYIVKDTGEDEQRVTKLASLKMTLPNVIIDQTKIEVSSMVTMIDDVFLKL EESLKEKEKIAKYNEEIVAAEDKLDLYEKEIYDSNFSLLSKSLSKSLIEDTRMNLLACDE YETIGDYQNRIANRLYMLYENSID Prediction of potential genes in microbial genomes Time: Thu May 19 23:32:47 2011 Seq name: gi|224461304|gb|ACDC01000098.1| Fusobacterium sp. 
2_1_31 cont1.98, whole genome shotgun sequence Length of sequence - 3087 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 278 272 ## COG1283 Na+/phosphate symporter 2 1 Op 2 . + CDS 314 - 760 421 ## gi|237739767|ref|ZP_04570248.1| predicted protein + Term 766 - 818 14.4 3 2 Op 1 . + CDS 1111 - 1674 570 ## gi|237739768|ref|ZP_04570249.1| predicted protein 4 2 Op 2 . + CDS 1711 - 2634 1115 ## gi|237739769|ref|ZP_04570250.1| predicted protein + Prom 2673 - 2732 7.1 5 3 Tu 1 . + CDS 2874 - 3087 197 ## gi|237740061|ref|ZP_04570542.1| hypothetical protein FSAG_00135 Predicted protein(s) >gi|224461304|gb|ACDC01000098.1| GENE 1 3 - 278 272 91 aa, chain + ## HITS:1 COG:FN0276 KEGG:ns NR:ns ## COG: FN0276 COG1283 # Protein_GI_number: 19703621 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Fusobacterium nucleatum # 1 91 435 525 525 162 87.0 1e-40 RAKMIFKLHSLSVELFNDISRAVKTGEKELYSTGLKKYQELKSYYKEVKREHFSRSENIP ARLNTGYLDIINYYKRIADHTYNIIEYVMKI >gi|224461304|gb|ACDC01000098.1| GENE 2 314 - 760 421 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739767|ref|ZP_04570248.1| ## NR: gi|237739767|ref|ZP_04570248.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 148 1 148 148 275 100.0 6e-73 MDNLDIFENACKYFLEKMTEFKEILSSKVDSLNNKDWILSKGSTKTCKADETGKKKRCKV GLNYGLKIELSVDDVDIIWNEFSSFFTKTYVENIERISNNEEIARYEFLARSSIGDEIRC SIYLANEWNIPQISLSGFVVPRYKKSDY >gi|224461304|gb|ACDC01000098.1| GENE 3 1111 - 1674 570 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739768|ref|ZP_04570249.1| ## NR: gi|237739768|ref|ZP_04570249.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 187 1 187 187 332 100.0 5e-90 MRTKLRIQSLDEWELDGMEFECTLPIIKNKTIIFGAYNMNFEITDNIWYQLPEEYKRKLY NNYRKKLIKNMLVTITNITAYSFNIKFHDKLKDEIIMEEIYENFDRNRKIESFLPGCDFP YSSMSIYFQYLGEIYVEFDLDELIAYSEEERILGYSKLIKDVNRRKEREKSIGKLEQIHN EQSNLNL >gi|224461304|gb|ACDC01000098.1| GENE 4 1711 - 2634 1115 307 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739769|ref|ZP_04570250.1| ## NR: gi|237739769|ref|ZP_04570250.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 307 1 307 307 499 100.0 1e-139 MEIRIKLQPLHRNELNFSEAFCTVPIIMDKTMIWGAYNVDVNMSECIWYQLPEKFKERIR NDKGYTIKSMIITVTDITAYSVSVSNHEKLKDTITMEEIYEDFSESKKIERFLCKCDFPY SNIEVYFQNLGEIYVEFELEDWVCYEKEAKEEWKIKEIERRKLREVYKVEPRVIKVSEEI KERVIIQSLLEKIHDDEQSKFEKILKKSYKESSIPKESLLLIFEKFTLLIDREIDSLLFN LKYVMLIIKSAEELRIAIPENIKNEIGYWLTDMQAKIRKEEEEKLFKEIRDKLNIKKVYK SGTYEFF >gi|224461304|gb|ACDC01000098.1| GENE 5 2874 - 3087 197 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237740061|ref|ZP_04570542.1| ## NR: gi|237740061|ref|ZP_04570542.1| hypothetical protein FSAG_00135 [Fusobacterium sp. 2_1_31] # 1 71 1 71 71 106 90.0 4e-22 MKIKLKVEPNWEIWIYESYSTVPIITNGTMIFGAYNVEFKLTDNIWRQLPEEYKKKIYKY NWKKVIKSMLI Prediction of potential genes in microbial genomes Time: Thu May 19 23:33:23 2011 Seq name: gi|224461303|gb|ACDC01000099.1| Fusobacterium sp. 2_1_31 cont1.99, whole genome shotgun sequence Length of sequence - 15340 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 9, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 598 659 ## gi|237739770|ref|ZP_04570251.1| predicted protein 2 1 Op 2 . + CDS 630 - 839 315 ## gi|237739771|ref|ZP_04570252.1| predicted protein + Prom 893 - 952 2.8 3 2 Op 1 . + CDS 1147 - 1524 431 ## gi|237739772|ref|ZP_04570253.1| predicted protein 4 2 Op 2 . + CDS 1535 - 2191 741 ## gi|237739773|ref|ZP_04570254.1| predicted protein + Prom 2193 - 2252 9.8 5 3 Tu 1 . 
+ CDS 2329 - 2610 396 ## gi|237739774|ref|ZP_04570255.1| predicted protein + Term 2773 - 2835 -0.9 + Prom 3366 - 3425 10.6 6 4 Tu 1 . + CDS 3453 - 3827 420 ## FN0142 hypothetical protein + Term 3847 - 3887 7.8 + Prom 4084 - 4143 4.0 7 5 Op 1 . + CDS 4179 - 4682 644 ## GbCGDNIH1_1661 hemolysin 8 5 Op 2 . + CDS 4706 - 5104 532 ## gi|237739778|ref|ZP_04570259.1| predicted protein 9 5 Op 3 . + CDS 5188 - 6273 1264 ## GbCGDNIH1_1661 hemolysin 10 5 Op 4 . + CDS 6297 - 6692 485 ## gi|237739780|ref|ZP_04570261.1| predicted protein + Term 6801 - 6848 7.2 + Prom 6836 - 6895 14.6 11 6 Op 1 2/0.000 + CDS 6941 - 7327 656 ## COG0640 Predicted transcriptional regulators 12 6 Op 2 15/0.000 + CDS 7329 - 7544 401 ## COG2608 Copper chaperone 13 6 Op 3 . + CDS 7574 - 9415 2285 ## COG2217 Cation transport ATPase 14 6 Op 4 . + CDS 9489 - 10403 1421 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Term 10408 - 10439 3.1 - Term 10389 - 10431 5.1 15 7 Op 1 . - CDS 10433 - 10810 495 ## gi|237739785|ref|ZP_04570266.1| predicted protein 16 7 Op 2 . - CDS 10823 - 11974 2053 ## COG0192 S-adenosylmethionine synthetase - Prom 11995 - 12054 9.7 - Term 12066 - 12108 0.4 17 8 Op 1 1/0.000 - CDS 12193 - 13059 1005 ## COG0682 Prolipoprotein diacylglyceryltransferase 18 8 Op 2 1/0.000 - CDS 13079 - 13861 739 ## COG2035 Predicted membrane protein 19 8 Op 3 . - CDS 13865 - 14941 1218 ## COG0787 Alanine racemase - Prom 14973 - 15032 14.0 + Prom 14929 - 14988 12.3 20 9 Tu 1 . + CDS 15096 - 15339 337 ## COG4545 Glutaredoxin-related protein Predicted protein(s) >gi|224461303|gb|ACDC01000099.1| GENE 1 2 - 598 659 198 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739770|ref|ZP_04570251.1| ## NR: gi|237739770|ref|ZP_04570251.1| predicted protein [Fusobacterium sp. 
2_1_31] # 16 198 1 183 183 244 100.0 3e-63 SVSLSEHEKFKEPITMEEFYKDFSENKKIERFLCECDFPYSNMSVYFQSLGEIYAEFELE DWVSYEKEAKEEWKIKERERRKIREIYKSESISLEKIEEEQESFLIFIGNVYRTGKLSEE YFERFFSTYPLILRHLDAEELVLIEKNAEELGIEISDNIKYEIGYSLTNTQTEIMIEEEK EAIEEIRKKWNLKKVYED >gi|224461303|gb|ACDC01000099.1| GENE 2 630 - 839 315 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739771|ref|ZP_04570252.1| ## NR: gi|237739771|ref|ZP_04570252.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 69 1 69 69 119 100.0 7e-26 MMYYFSDYTRVMAEREAAKRAPATKISTNQGIKIGNLTVYKDYVIRERGATYKLRGLTVN YSRLTPYEC >gi|224461303|gb|ACDC01000099.1| GENE 3 1147 - 1524 431 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739772|ref|ZP_04570253.1| ## NR: gi|237739772|ref|ZP_04570253.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 125 1 125 125 184 100.0 2e-45 MISKIVEQAKLPNMTMQEIYDLAKKEGLNVGELKKAGETGIGQKFNIVGNRKEKIEIKIN GKKIKKRNVLSIRSNSGGMHNSSYILIGTDDGKIKVIFGSPEKYEYNSTNIEKAELIFVE QPKSK >gi|224461303|gb|ACDC01000099.1| GENE 4 1535 - 2191 741 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739773|ref|ZP_04570254.1| ## NR: gi|237739773|ref|ZP_04570254.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 218 1 218 218 360 100.0 3e-98 MKYEYCGISLGDDIKDIIEKFDISKIEYRDSMKRLYFKLGNFSKKTNLECFLSIPIETGK VIYIIIFDENFKLFNELEIWQELTNEIKEKYELYYDEDDDGIYLSKKYKYLKIGVDEGYG RIEGFKDYKERIFSFIFDAQEDIRWILQQDKITNYLECQNLQDIYNSLYDSKTLDVDIEK REIYGQLDNYKFIFSLLTRDIKSIQNLETGEFIKTSLE >gi|224461303|gb|ACDC01000099.1| GENE 5 2329 - 2610 396 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739774|ref|ZP_04570255.1| ## NR: gi|237739774|ref|ZP_04570255.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 93 19 111 111 162 100.0 6e-39 MVTYKSIFSKDAYLHAKIDMKLWKKFGTGSDDKSTREIRERFGGSVFGLSVMLFYDILKI ESMKNNQNILSEEEINKKLHEIYFSNEKENNGY >gi|224461303|gb|ACDC01000099.1| GENE 6 3453 - 3827 420 124 aa, chain + ## HITS:1 COG:no KEGG:FN0142 NR:ns ## KEGG: FN0142 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 122 36 156 160 172 81.0 4e-42 MYNTDLYIEFTRPINLKLEKINFRFNDEVIGTAEINKNLSELEDFAEPYIDEKTKEKIIR KIYPLQKEFLRILGRNAEVYDLLEDGRFYIDIYIKDLKTNKTFIIKRDNIHIYYESVGIK LFSM >gi|224461303|gb|ACDC01000099.1| GENE 7 4179 - 4682 644 167 aa, chain + ## HITS:1 COG:no KEGG:GbCGDNIH1_1661 NR:ns ## KEGG: GbCGDNIH1_1661 # Name: not_defined # Def: hemolysin # Organism: G.bethesdensis # Pathway: not_defined # 23 167 588 730 730 102 37.0 4e-21 MKEYKNKPISGKITSKYNDVNKYEYNATTNPGPLAEGKDPPINNFYGGMYNDASNESGIF VRMGDKIKPYGSWYTKVSKNSEAQARVDLAIKKWWVKPNGEIKIRGFETEKSILDTMYYI KFPEGIPKYKGPVGYQGGPFLGGLDQEQYFIPDSWEYGEIIETYPVK >gi|224461303|gb|ACDC01000099.1| GENE 8 4706 - 5104 532 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739778|ref|ZP_04570259.1| ## NR: gi|237739778|ref|ZP_04570259.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 132 1 132 132 201 100.0 9e-51 MICEELKSRKNFVEEDFIELRDSVEELISVIEKYKDMGYNSSLYIDDLKDFLADVNLTLE EKKITDKELKNLNSLGESYFNSRVDNSIYSYYVYDKNNLEKTHKANDEIKIAKKRFGKIL YKITEKVMYHMI >gi|224461303|gb|ACDC01000099.1| GENE 9 5188 - 6273 1264 361 aa, chain + ## HITS:1 COG:no KEGG:GbCGDNIH1_1661 NR:ns ## KEGG: GbCGDNIH1_1661 # Name: not_defined # Def: hemolysin # Organism: G.bethesdensis # Pathway: not_defined # 204 361 574 730 730 91 36.0 4e-17 MWLTSVPSLVSASGVTNEEQLLLENKVNSVQGKVLINDGTPGGTTIAYQKRITYPDGSMS ISQKNLTTGEVSFQGINSSGQRISEASLTPHEANTLIGANSSSKMLVGNGAVSQSVISKV SYQTGNNSLVLYDKTPVPVTNGTLVAPLTTNRALATVPLLTDGAVSQVASKLPYNPVLKS PVKYPNLTDSENKFLNDYMNKEMPGKIVSKYNDVNKYEYNATTNPGPLSIGDDPPINNFY GGMYNDASDESGIFARMGDETYPYGSWYTRVAKNSEVEARVDLAIKKWWVKPNAEIRITE YGGDKSILDTVYYIEFPEGIPKYKGPVGYQGGPFLGGLNQEQYFIPNSKSFGKVIKSYPV K >gi|224461303|gb|ACDC01000099.1| GENE 10 6297 - 6692 485 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739780|ref|ZP_04570261.1| ## NR: gi|237739780|ref|ZP_04570261.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 131 1 131 131 189 100.0 3e-47 MIWEELKSRKNFVEEDFIELRDSVEEIIKIFEKYKDMRKNSKGYIEEMKRFLGEINITLK EKKLTDKELINLVELRRTYFNFHDNSLSEYGVYDKDDLEKTHRVNREITVVIERLKKILY KITEKIDYHIS >gi|224461303|gb|ACDC01000099.1| GENE 11 6941 - 7327 656 128 aa, chain + ## HITS:1 COG:FN0260 KEGG:ns NR:ns ## COG: FN0260 COG0640 # Protein_GI_number: 19703605 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 4 128 1 125 125 213 87.0 7e-56 MKAIKSVKPVNSCDCDSVNKEIIEKVKKEFPNDEILGDLSDFFKVIGDGTRIRILWALDV SEMCVCDIANVLNMTKSAVSHQLRALREADLVKFRKSGKEVLYSLADNHVKEIFEQGLVH IQEEKGED >gi|224461303|gb|ACDC01000099.1| GENE 12 7329 - 7544 401 71 aa, chain + ## HITS:1 COG:FN0259 KEGG:ns NR:ns ## COG: FN0259 COG2608 # Protein_GI_number: 19703604 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 1 71 1 73 73 82 80.0 1e-16 MKKVFKLEGLNCADCASKIEEKVAKLEGVKSVMVNFMTTKMTLESENMEEVVEKVKKIVN EVEPDVNMVKA >gi|224461303|gb|ACDC01000099.1| GENE 13 7574 - 9415 2285 613 aa, chain + ## HITS:1 COG:FN0258 KEGG:ns NR:ns ## COG: FN0258 COG2217 # Protein_GI_number: 19703603 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 12 612 12 612 614 918 84.0 0 MKKKKEIIIAISAILFTLTLFIRMPQALQLILILVAYVLVGKDTVLLAVKKIERGDFLDE NFLMTVATLGAILIGEYPEAVAVMLFYEIGELFQGYAINKSRKSIAAMMDIKPEYANVIR DDKTQRVEPDEVGLGEIIEIRPGERVPLDATIIKGETSLDTSALTGESVPVEAREGANIL SGCININGLITAKVTKEYFDSTVNKVLDLVENAAAKKSKSERLITRFAKVYTPIVIGLAI LLALLPPIISGEYNFRLWVFRALSFLVISCPCAFVLSVPLSFFSGIGAASKAGVLIKGGN YLEALAKVDTVVFDKTGTLTKGVFNVQKVVVHNKNIDENEFMFYVASAESGSNHPISKSI QKYYNKEIDNSSINSIKEISGKGIEAIINSKKVLVGNEKLINLPKDISVTDVGTILYVEI DNVFSGYIVISDEIKEDAKRAIKELKNIGIKKNIMLTGDLEKVAKKVGEDLELDETYSNL LPQDKVSKFEEIIKNKTSKGSVIFVGDGINDAPVLARADVGIAMGAMGSDAAIEAADVVI MTDEPSKIVTAIKSSKKTMKIAMQNMALAFGIKVIALILSALGIADMWMAVFADIGVTIL AVLNSFRALKVEK >gi|224461303|gb|ACDC01000099.1| GENE 14 9489 - 10403 
1421 304 aa, chain + ## HITS:1 COG:MTH1430 KEGG:ns NR:ns ## COG: MTH1430 COG0115 # Protein_GI_number: 15679429 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Methanothermobacter thermautotrophicus # 6 303 32 330 330 372 60.0 1e-103 MINTEKIWMNGKLVGHDDANIHILSHVVHYGSSVFEGIRIYKTENGPAIFRLREHVKRLF DSAKIYRMEIPYTVEEIEQAIIETVKANKLEQGYIRPIAYRGYFELGVTPSRCPVEVAIA AWAWGAYLGEEALNKGIRVQVSSWRRPALNTLPSLAKAGGNYLSSQLIRLEALNNGYEEG IALDYLGNVSEGSGENLFVVLNGKIITPTLASSALGGITKDTVIQLAKKLGYEVVEQAIP RELLYICDELFLTGTAAEVTPVYSVDDIVVGNGDKTITKALQKEFFDLAHGRHELSEKFL AYVK >gi|224461303|gb|ACDC01000099.1| GENE 15 10433 - 10810 495 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739785|ref|ZP_04570266.1| ## NR: gi|237739785|ref|ZP_04570266.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 125 1 125 125 208 100.0 8e-53 MLKKFSEAQEKGYNYMLFIELGYLTSKNDLSSFQVKAVTIEGYFETIKQIYDYVENIDFE ETEEKDGRYECEVSNIYDVSRKIYFIKNEGLTFTEVDDTDIVDRIVNKGPKEIVGKSKEF LEARL >gi|224461303|gb|ACDC01000099.1| GENE 16 10823 - 11974 2053 383 aa, chain - ## HITS:1 COG:FN0355 KEGG:ns NR:ns ## COG: FN0355 COG0192 # Protein_GI_number: 19703697 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Fusobacterium nucleatum # 1 383 1 383 383 711 90.0 0 MKKFTYFTSEFVSPGHPDKVSDQISDAILDACLADDPNSRVACEVFCTTGLVVVGGEITT TTYIDVQEIVRKKIEEIGYRPGMGFDSNCGTLSCIHSQSPDIAMGVDVGGAGDQGIMFGG AVKETEELMPLALVLSREILVRLTKMMKAGEITWARPDQKSQVTLAYDENGNVDHVDSIV VSVQHDEEVSHAEIEKTIIEKVVNPVLEKYKLNTENIKYYINPTGRFVIGGPHGDTGLTG RKIIVDTYGGYFRHGGGAFSGKDPSKVDRSAAYAARWVAKNVVAAGFADKCEIQLSYAIG VDKPVSIKVDTFGTAKVDEDKISEAISKVFDLSPRGIEKALELREGKFKYQDLAAFGHIG RTDIDTPWERLNKIEELKKAINL >gi|224461303|gb|ACDC01000099.1| GENE 17 12193 - 13059 1005 288 aa, chain - ## HITS:1 COG:FN0489 KEGG:ns NR:ns ## COG: FN0489 COG0682 # Protein_GI_number: 19703824 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Prolipoprotein diacylglyceryltransferase # Organism: Fusobacterium nucleatum # 1 288 1 288 288 466 88.0 1e-131 MNPVFLKLGPIELHYYGLMYAIAFYVGITLGKKIAKERNFDVELVENYAFVAIISGLIGG RLYYVLFNLPYYLRNPLEIPAVWHGGMAIHGGIIGGIIGTLIYAKIKKVNPLTLGDFAAG PFILGQAIGRIGNFMNGEVHGVPTFTPFSVIFNLKPKFYEWYSYYQNLDLVEKSKYKELV PWGVVFPESSPAGSEFPNLPLHPAMLYEMVLNLIGFFIIWFILRKKENKAPGYMWWWYII IYSINRIIISFFRVEDLMFFNFRAPHVISFILIAISIFFLKKDNKKIF >gi|224461303|gb|ACDC01000099.1| GENE 18 13079 - 13861 739 260 aa, chain - ## HITS:1 COG:FN0490 KEGG:ns NR:ns ## COG: FN0490 COG2035 # Protein_GI_number: 19703825 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 260 1 260 260 361 86.0 1e-99 MILLFLKSIIIGIANIIPGVSGGTLAVMLNVYDPITEKIGNFFLVDRKTKLSYFWYLLIV LVGAATGIFLFANIIKYSITNYPKITVSVFTLLILPSIPYIVKGLDYKKKKNILAFCCGA ALMIIFILLGLKYGDKTTGAVTIQIAKGVCFTRAYRLKLFICGIIAAGAMIIPGISGSLL LMMLGEYYNVVYLISSLASSLKEKSFSILLPLITLAVGVGIGLVAFSKAINYLLKNHKEF TLFFIEGIITFSIIQMWLSI >gi|224461303|gb|ACDC01000099.1| GENE 19 13865 - 14941 1218 358 aa, chain - ## HITS:1 COG:FN0491 KEGG:ns NR:ns ## COG: FN0491 COG0787 # Protein_GI_number: 19703826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Fusobacterium nucleatum # 1 358 1 359 359 585 84.0 1e-167 MNTSFFVSLDKKALYHNIEYLREYKQKELLPVLKANAYGHDILLIAKALYDYDVKVWAVA RYSEAVSICEYFKTLSIDDFKILIFESLIDDYSLLEKYPQICPTLNSIKDLKNALANNIS IDRLSLKIDFGFGRNGIKAEEVDELKNLIKYNSLKFLSIFSHLFSASYTDGLEVIKKFTD LVNMLGRNNFEMVHLQNAAGIYNYDVDIVTHIRTGMLTYGLQEAGFYDLDMKPVFTGLIG YVDSVRYVNELDYVAYQELSSIDPGTKKIAKIKIGYGDGFSKANNKTTCLIKKKEYVISQ VTMDNTFIEVDDRVNVGDEVHLYHRPNEIKTKTGFSMLELLIAISPLRVKRIFKGEEN >gi|224461303|gb|ACDC01000099.1| GENE 20 15096 - 15339 337 81 aa, chain + ## HITS:1 COG:FN1077 KEGG:ns NR:ns ## COG: FN1077 COG4545 # Protein_GI_number: 19704412 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin-related protein # Organism: Fusobacterium nucleatum # 1 81 1 81 83 130 
79.0 5e-31 MPKVYGSMLCPDCVEAKEYFEKVNYKYEFINITESMKNLKEFLSLRDNRKEFDDVKKLGY IGIPAILTDDNKIILGDEVLQ Prediction of potential genes in microbial genomes Time: Thu May 19 23:34:34 2011 Seq name: gi|224461302|gb|ACDC01000100.1| Fusobacterium sp. 2_1_31 cont1.100, whole genome shotgun sequence Length of sequence - 16267 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 3, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 906 472 ## gi|237739791|ref|ZP_04570272.1| conserved hypothetical protein - Prom 932 - 991 6.8 2 2 Op 1 . - CDS 1013 - 1660 921 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 3 2 Op 2 1/0.000 - CDS 1707 - 2732 1387 ## COG2849 Uncharacterized protein conserved in bacteria 4 2 Op 3 . - CDS 2764 - 3834 1243 ## COG2849 Uncharacterized protein conserved in bacteria 5 2 Op 4 . - CDS 3807 - 4424 487 ## FN0520 hypothetical protein 6 2 Op 5 1/0.000 - CDS 4417 - 5082 887 ## COG1636 Uncharacterized protein conserved in bacteria 7 2 Op 6 . - CDS 5091 - 6599 1575 ## COG0419 ATPase involved in DNA repair - Prom 6700 - 6759 4.0 8 3 Op 1 28/0.000 - CDS 6788 - 7855 1261 ## COG0419 ATPase involved in DNA repair 9 3 Op 2 1/0.000 - CDS 7845 - 9011 1060 ## COG0420 DNA repair exonuclease 10 3 Op 3 1/0.000 - CDS 9008 - 11797 2752 ## COG0210 Superfamily I DNA and RNA helicases 11 3 Op 4 1/0.000 - CDS 11799 - 14036 3296 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 12 3 Op 5 1/0.000 - CDS 14040 - 15116 1163 ## COG0820 Predicted Fe-S-cluster redox enzyme 13 3 Op 6 . - CDS 15109 - 16236 1574 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain Predicted protein(s) >gi|224461302|gb|ACDC01000100.1| GENE 1 34 - 906 472 290 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739791|ref|ZP_04570272.1| ## NR: gi|237739791|ref|ZP_04570272.1| conserved hypothetical protein [Fusobacterium sp. 
2_1_31] # 1 290 1 290 290 427 100.0 1e-118 MALYQIYTEKDGFNRGMALIFLIVFSVFLSLTMCNLLKTFLTLCSKKVNITEKLFERKYF KSLDKFEFILKIVLKVMLRVLTYLFVFFLIGINIVSVADENTRDRGTFPLEAIQFLTVAA LIALLIMLFKDLKKIYLFLSEKHKALPAFNKKVQENAIKVKTKIKEQLKKIKDKKYNFKD FIAKLKIEFIAKNINKFTEFLEDRTNFLKEKLFQERYEIFLNEKADKFLIGAYQVLSALC ILAFCSIFISISGILIYKTLVFLFYLFGVIFTLLIGAIASFPYILFLFFL >gi|224461302|gb|ACDC01000100.1| GENE 2 1013 - 1660 921 215 aa, chain - ## HITS:1 COG:FN1075 KEGG:ns NR:ns ## COG: FN1075 COG0596 # Protein_GI_number: 19704410 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Fusobacterium nucleatum # 1 215 1 215 215 360 92.0 1e-99 MNYRIALIHGFFRNYKDMEELENNLMNMGYTVDNLNFPLTFPSIDMSINILKKYLLSLKE KKINKQNEIVLIGFGFGGVLIRETLKLEEVSGIVDKVILLSSPINDSTLHRRLKRTFPFI DLIFKPLAIYSKTRRDRRRFDKDIEVGLIIGRESSGFFGKWLGDYNDGYIEMKDVAFPAA KDKILIPITHNELNKRIGTARYIHNFIAKGKFRLE >gi|224461302|gb|ACDC01000100.1| GENE 3 1707 - 2732 1387 341 aa, chain - ## HITS:1 COG:FN0519 KEGG:ns NR:ns ## COG: FN0519 COG2849 # Protein_GI_number: 19703854 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 14 338 14 343 343 360 67.0 1e-99 MKKIFIVLTFVFISLLGFADNTNVGVRIPLGEQKVELDPSLKMLMGLTKVENDPKYKKLV DYVEENLAKKGIVKYSGAINLKKGAMEIFSENGVLLSEEKLPDEFMTMISFPLNFEDDKE KVKKMLKEAYENPAYVTISKNNGKPKIYMERTMGPAKEKIKIIDEVILKRELTEAEKKEL LSLENDKLIEKYKSYIESEISKGYQNNKLIMTKEFKNLTETAVMYDKDNSSVKMEIKYKD NSLKNGTAKSYTNNKLVQEIVFENSTATLLREYHDNGNLATELPANGEAKIYYENGKIKE SVPVKNGKREGIAREYDETGKVIKETLYRNDKEIKKAKKLK >gi|224461302|gb|ACDC01000100.1| GENE 4 2764 - 3834 1243 356 aa, chain - ## HITS:1 COG:FN0519 KEGG:ns NR:ns ## COG: FN0519 COG2849 # Protein_GI_number: 19703854 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 26 356 14 343 343 395 73.0 1e-110 MLWSGNARIGGMMKKILIVLTFVFISLLSFADNTNTEITIMSLDEQTIELDFNVKLLLGM 
IKAENNPKYKKLLDYIDDNLAKKGEVKYSANINLKKASSEVFSESGELLYEEKLPEEFMD LLNYSLVTADDREKTKKFIKGLYENTAYMLISKNNGKPKFFKEVNLESDGHKIKNTIEVI LKRELTEAEKKELLSLKNEKLITKYRTYVDNEISKVYTDNNLTILKEFKNSTETLTAYDK NRNSLKREKRYTDSSYSNGILKSYKNDKLVDEIVFENSMPKLKKMYYDDGNLAFEFPLKD GKIHGEAKKYYRSGKIREVYSFKNGKREGIGREYSETGEIIKESLYKDNEEIKKIK >gi|224461302|gb|ACDC01000100.1| GENE 5 3807 - 4424 487 205 aa, chain - ## HITS:1 COG:no KEGG:FN0520 NR:ns ## KEGG: FN0520 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 205 1 205 205 337 90.0 2e-91 MTSFSEKTVRGVSLVFLVIFSYLTYKNYYYSPLIVLTIMIFFSTKGVQMFENRIFLSTRA IFWVLFSTLLFLRIYFNESTHLDMKNTKTLMTIAIISLCIGTWIGDFFAKYIYIRIKFCI NRLFSKSNKGTYRIVKMENTQQNYMKSLGKKMGIMFYHITLDVNGEERKFLLEKELFEKL QGKSEININIKKGCLGICYGVGMQE >gi|224461302|gb|ACDC01000100.1| GENE 6 4417 - 5082 887 221 aa, chain - ## HITS:1 COG:FN0521 KEGG:ns NR:ns ## COG: FN0521 COG1636 # Protein_GI_number: 19703856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 221 1 222 222 335 90.0 3e-92 MKVNYDLKMEEILKEITESGKKKRLLIHSCCGPCSSSVLEYLKEFFQIDIYFYNPNITFD YEYLARMDEQKEMLEKLDYDMNVIEGVYNPKEDFFEKIKGLENEKEGGQRCYSCYDIRIG ETAKKAKEEGYDFFSTVLSISPMKNVNYINEIGEKYSKEYDIPFLFADFKKKNRYLRSVQ ISKELNMYRQEYCGCVFSKVEKEQRDKEKAEKEKQEETKND >gi|224461302|gb|ACDC01000100.1| GENE 7 5091 - 6599 1575 502 aa, chain - ## HITS:1 COG:FN0522 KEGG:ns NR:ns ## COG: FN0522 COG0419 # Protein_GI_number: 19703857 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 3 502 422 921 921 499 78.0 1e-141 MAINIEDIEKQLTVFKDLEKEIKSLEEDKIAFEIEIKTLRNASNELSSKICPYLKENCEN LKDKEADDYFSSKISIRTETIEDLKKKIEEKKVILAEKSVFEEKKKQYFELDKTIKNLEL SLKTEEVNLKEIEVNIKSLDISIQQLIENQEFQDSTSLKEHKKGLEVELKNLNLDEKREN LKNLIESLEFEKEKILKNQSSIENNLKEIDEYCKKIKADTDKNIKNVTSEIKIFENKLTE LKTSYNEYIKNNVLAKDLENLLLKVEKSIKELYSLRFNKNSLKEKVFNLENKIKNIKIDE 
LREKFNILKEELNEISKKLGSSQEKIENYKKILEKIASQEEKQKKLLDELKKLEDKSNRA NLIRNEVGKMGRAISKYMLSGISNIASLNFNKITGRTERIEWSNDEKDKYVLYLVGQERK IAFEQLSGGEQVSVAIAIRGTMTEYFTNSRFMILDEPTNNLDTERKKLLAEYMGEILKNL DQSIIVTHDDTFREMAEKIIEL >gi|224461302|gb|ACDC01000100.1| GENE 8 6788 - 7855 1261 355 aa, chain - ## HITS:1 COG:FN0522 KEGG:ns NR:ns ## COG: FN0522 COG0419 # Protein_GI_number: 19703857 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 353 1 353 921 367 75.0 1e-101 MIIKRVKLENYRSHSNTTVNFSKGVNLILGKNGKGKTSILEAISSVMFNTKDRSGKETGK NFIKFNEKSGKIEIEFTANDGRDYILKTEFFKTKPKKQSLTDINGLDCEGDIQENLEELC GIKKGFEETYENIVIAKQNEFINIFKAKPKDREEIFNKIFNTQIYKEMYDGFLKEATDKY TKQIDYLSKDINSLKDNMEDKEEISNFLKEEEVLKESLNTEFSKTTEISTKLSNEIKDYE ADEINLKNLISNIEDEENKIEKYSNLLKDNILEAKKAKKAKTIVKENEKPYLEYLEIENK LKDFREIHTNLLQEQKLNIQYQNNIEKLELSNKTLKTDIANLEESISKNSEKKIV >gi|224461302|gb|ACDC01000100.1| GENE 9 7845 - 9011 1060 388 aa, chain - ## HITS:1 COG:FN0523 KEGG:ns NR:ns ## COG: FN0523 COG0420 # Protein_GI_number: 19703858 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Fusobacterium nucleatum # 1 284 1 284 291 422 78.0 1e-118 MKIVHCSDLHLGKKVSGNREYVKKRYEDFFSAFENFIAKVKEINPDVCIIAGDIFDKKEI SPDILSKTENLFKELRSYVKKEVIAIEGNHDNSKALEDSWLEYLHEQSLLKVFYFNKNFE EENYLKIEDINFYPIGYPGFMIDEALKKLSKKLNSDEKNIVIVHTGISAGENTLPGLVST SILDLFKDKAIYVAGGHIHSFSTYPKEKPFFFVPGSLEFSNVQNEKSDRKGFILFDTDSL EHEFIELEHRKRVQKNFIYNDSTNIEAEFEAFVKELNLTGEEILVISIGIKNNEYINLEN LENIAENNGALKTHILIKNILSIENSEEGTSDLSIEELEKNLISNWNISNIEKFSASFSE LKELFSNNDRDSFLELFDKTLEVNEDDN >gi|224461302|gb|ACDC01000100.1| GENE 10 9008 - 11797 2752 929 aa, chain - ## HITS:1 COG:FN0524 KEGG:ns NR:ns ## COG: FN0524 COG0210 # Protein_GI_number: 19703859 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Fusobacterium nucleatum # 9 929 2 919 919 1315 82.0 0 
MSIVNEKKSELNERQLEAVNTVKGPVVIIAGPGTGKTKTLVERTVNILVNEKVEAKKIMI TTFTNKAARELELRINESLEKANVNIDISDMYIGTMHSIWTRLIEENITYSDFFDNFELM SGDYEQHFFIYSRLKEYKKLEDYQKFFDNLSNNTGKYQGDWARSSFLKNKINDLNENAID IENIQTSDVYINFIKEAYKLYVKQLYEANIVDFSYLQVEFFNMLVKNKEFLEKINHDFEY IMVDEYQDSNKIQEKILLLISKTRKNICVVGDEDQSIYRFRGASSENILNFPKHFDEDEC KIIILEENYRSVTDIVEFNNKWISSIDWQGNRFDKNIVSMRDTDILGKNVFHISGKTMDE NIKNTVIFIKKLKQHNKITNYNQIAVLFANFKNNSAKKLEVALKKENIEVYSPRTKVFFE MYEIKLTLGIILACFKKYFPEDSIDQYLTECIDFARLEIKKDSEFLAWIKEKIENISEES FDSLNEIFYEFLNFTYYKNVLNEETPVDSRANHNLAILSKIFKNFQKYVHYRKITAEDDF SVVKYFFTGYLDILKESRVDEIFSEEDYPNECIPFLTIHQSKGLEFPVVIVFSLNSKPNR YEDDDISRQTSIDRLINSSSKLSENDKEKFDFYRKFYVAFSRAKNLLVLSSYEMGISENF KPFFYSIRGVNSLQFDINEVNLDEVTKKDERKILSYTTDIAPYRHCPMKYYLVREKEYST FSKKIFNLGIITHKAIEHINKLFLQKKNPLFDDEYIENLLKNIYKFQNIDLDDNFERIMS IVKKYIEDEKDNFEYIKKVEASEFRIEDDYILYGQIDLILEDENEIQIIDFKTGKYNELE YSSNYRQQLSLYKLLLQKKYDKDIKTYLYYLEEDEPKKEILITDEELEEDFKNINKTTQD ILDNKFPKIPYNHNICGICEFKNYCWGLE >gi|224461302|gb|ACDC01000100.1| GENE 11 11799 - 14036 3296 745 aa, chain - ## HITS:1 COG:FN0525 KEGG:ns NR:ns ## COG: FN0525 COG0744 # Protein_GI_number: 19703860 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Fusobacterium nucleatum # 21 745 2 731 731 1164 82.0 0 MKKLLVILLKLIAVLFVVGALAVFAIIIKYRLELPNIQSMVEDYKPRMATTIYDKNNNVV DVLEAESRDAVKLEDVSPYVKEAFLAIEDKKFYSHHGLHFKGIIRAVLTNFLKGKATQGG SSITQQLAKNAFLTPERTFSRKVKEAILTYQIERTYTKDEILERYLNEIYFGSGSYGIKN AADQYFRKDPKDLNIAEAALLAGIPNRPTKYDPNRSLENALHRQQIILKEMFEDGRITKE EYEEALAYKFELENEENVKNVPKNTSIIYNRRPKKAYNNPELTTIVENYLAEIYDDEQIY SSGLKIYTTIDLDYQKVARDTFNAYPYFKNKEINGAMVTLDPFTGGIVSIVGGKNFKAGN FDRATMARRQLGSSFKPFVYLKALEEGYEPYSVVVNDFVAYGKWAPKNFDGRYTFNSTLV NSLNLSLNIPAVKLMDAVTVDAFKEEMTDKIKLSSEIQNLTTALGSVDSTPVNTAANFSI FVNGGYIVKPNIIREIRDNQDILIYVADIEKVKAFDSVDVSVITAMLKSVVSNGTATKAR VVDKSGRPIQQGGKTGTTSEHRTAWFVGITPEYVTVCYIGRDDNKPMYGKMTGGSAVAPM WARYYQTLINKGLYTPGKFEFLENYLETGDLVKQNIDIYTGLLDGPNSKEMVIRKGRLQV 
ESAAKYKNGIASLFGLEASAGGGVYVESSSDGMIIDSASGEGGSSEGGSSENSGGDNVSP STQSGQVETNKEKDGDSLTDRLLGD >gi|224461302|gb|ACDC01000100.1| GENE 12 14040 - 15116 1163 358 aa, chain - ## HITS:1 COG:FN0526 KEGG:ns NR:ns ## COG: FN0526 COG0820 # Protein_GI_number: 19703861 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Fusobacterium nucleatum # 1 358 1 358 358 622 91.0 1e-178 MNNEKVNILNLTQEELTEFLVSLGLKKFYGKEVFIWLHKKIIRNFDDMTNLSLKDREILK ENAYIPFFNLLKHQVSKLDKTEKFLFELEDKGTIETVLLRHRDSKNKEIRNTLCVSSQVG CPVKCSFCATGQGGYMRNLSVSEILNQVYTVERRLRKKDESLNNLVFMGMGEPLLNIDNL STALSIISNENGINISKRKITISTSGIVSGIEKILLEKIPIELAVSLHSAINDKRDQIIP INKNFPLEDLSAVLVEYQKQTKRRITFEYILIDNFNISEADANALADFIHQFDHVVNLIP YNEVEGVEHKRPSMKKIDRFYNYLKNVRKVNVTLRQEKGSDIDGACGQLRQRNKKGDN >gi|224461302|gb|ACDC01000100.1| GENE 13 15109 - 16236 1574 375 aa, chain - ## HITS:1 COG:FN0527 KEGG:ns NR:ns ## COG: FN0527 COG2872 # Protein_GI_number: 19703862 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Fusobacterium nucleatum # 1 371 1 371 373 530 83.0 1e-150 MENKKINLKKISDMTYEVLNSPFYVDGKGGQLGDRGTIAEANIVEVKENIVILDKNLEDG EYTYSINEKRQEDIRQQHTAQHIFSAEAYNNFGLNTVGFRMAEEYTTVDLDQKDISKEVI EKLEELVNKDIKADILVEEEIYTNEEAHKFENLRKAIKEKIKGDVRFIKIGDVDICACAG FHVSRTSEIEIFKIINHENIKGNYTRFYFLAGDRAKNDYNKKHDIIKKLTNTFSCKDDEI LEMLDKSLKEKASVTAELKSLGMRYAELMAKDFENTFIDYKDFKILIYNENENLVGILPK FINLDKFLLLIGYNTSYTLMSNIYDCKEIIINIVKNFPNIKGGGGKNKGNIKLDKAYNRN ELIEIIKKGIDSNNE Prediction of potential genes in microbial genomes Time: Thu May 19 23:34:49 2011 Seq name: gi|224461301|gb|ACDC01000101.1| Fusobacterium sp. 2_1_31 cont1.101, whole genome shotgun sequence Length of sequence - 3187 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 43 - 89 4.1 1 1 Tu 1 . 
- CDS 112 - 1446 2402 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Prom 1513 - 1572 15.1 - Term 1664 - 1702 0.5 2 2 Tu 1 . - CDS 1771 - 2022 234 ## - Prom 2042 - 2101 7.7 Predicted protein(s) >gi|224461301|gb|ACDC01000101.1| GENE 1 112 - 1446 2402 444 aa, chain - ## HITS:1 COG:SPy1150 KEGG:ns NR:ns ## COG: SPy1150 COG0446 # Protein_GI_number: 15675127 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 2 444 3 455 456 573 65.0 1e-163 MKIVVVGANHAGTACINTMLDNYKGNEVVVFDSNSNISFLGCGMALWIGGQIAGSDGLFY SSKEKLEAKGAKIHMETGVTNIDFDKKIVYATGKDGKKYEESYDKLVLSTGSLPIDLPIV GKELENVQYVKLFQNAQEVIDKLNANKSIEKVAVVGAGYIGVELAEAFKRWGKEVYLVDA AEGCLSTYYDKLFREKMDAQLEGHGIKLEYGQLVKEIQGNGKVEKIITNKGEFPADMVVL CAGFRPNTDLGKDKLELFRNGAYIVDRTQKTSIDDVYAIGDCATVYDNSIGGTNYIALAT NAVRSGIVAAHNVCGTKLESIGVQGSNGISIFGLNMVSTGLTFEKAEKLGIEVLETTFHD LQKPEFMEHNNEEVYIRIVYRKDNRKIIGAQIASKYDISMAMHVFSLAIQEGVTIDRFKL LDILFLPHFNKPYNYITMAALGAK >gi|224461301|gb|ACDC01000101.1| GENE 2 1771 - 2022 234 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLETINKNQVNVILNKNTDEIATLIEKHRNTMITSKKIIKLKKLFKYTLVIIALSLLIIA IPLYLVTLVILIISSISILKSII Prediction of potential genes in microbial genomes Time: Thu May 19 23:35:02 2011 Seq name: gi|224461300|gb|ACDC01000102.1| Fusobacterium sp. 2_1_31 cont1.102, whole genome shotgun sequence Length of sequence - 30990 bp Number of predicted genes - 37, with homology - 33 Number of transcription units - 13, operones - 8 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 49 - 108 6.0 1 1 Tu 1 . 
+ CDS 143 - 418 382 ## gi|237739806|ref|ZP_04570287.1| predicted protein 2 2 Op 1 13/0.000 - CDS 1254 - 1532 246 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair 3 2 Op 2 12/0.000 - CDS 1537 - 2529 732 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 4 2 Op 3 6/0.000 - CDS 2541 - 3035 664 ## COG1468 RecB family exonuclease 5 2 Op 4 . - CDS 3093 - 5504 2418 ## COG1203 Predicted helicases 6 2 Op 5 . - CDS 5516 - 6595 1320 ## CTC01145 hypothetical protein 7 2 Op 6 . - CDS 6606 - 7490 1303 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair 8 2 Op 7 . - CDS 7503 - 9047 1602 ## FN1182 hypothetical protein 9 2 Op 8 . - CDS 9037 - 9789 719 ## FN1183 putative cytoplasmic protein - Prom 9880 - 9939 10.6 + Prom 9920 - 9979 16.3 10 3 Tu 1 . + CDS 10061 - 10612 522 ## Coch_0117 hypothetical protein + Prom 10952 - 11011 8.6 11 4 Op 1 . + CDS 11198 - 11590 622 ## COG3728 Phage terminase, small subunit + Prom 11648 - 11707 8.6 12 4 Op 2 . + CDS 11792 - 11929 93 ## + Term 11977 - 12011 1.1 13 5 Tu 1 . - CDS 13036 - 13497 556 ## M6_Spy1806 Cro/CI family transcriptional regulator - Prom 13622 - 13681 13.0 + Prom 13544 - 13603 8.7 14 6 Tu 1 . + CDS 13687 - 13890 304 ## gi|237739819|ref|ZP_04570300.1| predicted protein 15 7 Op 1 . + CDS 14001 - 14732 1017 ## gi|237739820|ref|ZP_04570301.1| predicted protein 16 7 Op 2 . + CDS 14745 - 15077 246 ## gi|237739821|ref|ZP_04570302.1| predicted protein 17 7 Op 3 . + CDS 15074 - 15172 113 ## 18 7 Op 4 . + CDS 15162 - 15428 411 ## gi|237739822|ref|ZP_04570303.1| predicted protein + Prom 15439 - 15498 8.0 19 8 Op 1 . + CDS 15568 - 15720 66 ## 20 8 Op 2 . + CDS 15746 - 16051 355 ## gi|237739823|ref|ZP_04570304.1| predicted protein 21 8 Op 3 . + CDS 16029 - 16289 282 ## gi|237739824|ref|ZP_04570305.1| predicted protein 22 8 Op 4 . + CDS 16279 - 18336 2045 ## COG3378 Predicted ATPase + Prom 18691 - 18750 16.7 23 9 Op 1 . 
+ CDS 18908 - 19405 571 ## gi|237739826|ref|ZP_04570307.1| predicted protein 24 9 Op 2 . + CDS 19435 - 19599 287 ## gi|237739827|ref|ZP_04570308.1| predicted protein 25 9 Op 3 . + CDS 19602 - 20660 1000 ## COG0582 Integrase + Prom 20809 - 20868 11.0 26 10 Op 1 . + CDS 20907 - 21401 634 ## FN1188 hypothetical protein 27 10 Op 2 . + CDS 21415 - 21801 356 ## FN1189 hypothetical protein 28 10 Op 3 . + CDS 21805 - 24012 2961 ## COG2217 Cation transport ATPase 29 10 Op 4 . + CDS 24025 - 24777 797 ## FN1191 hypothetical protein 30 10 Op 5 . + CDS 24837 - 25109 520 ## FN1192 hypothetical protein + Term 25123 - 25180 5.3 - Term 25110 - 25169 6.2 31 11 Op 1 . - CDS 25170 - 26699 1199 ## gi|237739834|ref|ZP_04570315.1| predicted protein 32 11 Op 2 . - CDS 26705 - 28096 1665 ## A1S_0304 hypothetical protein 33 11 Op 3 . - CDS 28111 - 28755 777 ## FN1197 hypothetical protein 34 11 Op 4 . - CDS 28758 - 30020 1299 ## COG1106 Predicted ATPases - Prom 30047 - 30106 8.4 + Prom 30061 - 30120 6.5 35 12 Tu 1 . + CDS 30158 - 30391 179 ## FN1193 hypothetical protein + Term 30424 - 30482 4.2 - Term 30391 - 30436 9.1 36 13 Op 1 . - CDS 30486 - 30587 76 ## 37 13 Op 2 . - CDS 30636 - 30938 332 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|224461300|gb|ACDC01000102.1| GENE 1 143 - 418 382 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739806|ref|ZP_04570287.1| ## NR: gi|237739806|ref|ZP_04570287.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 91 2 92 92 152 100.0 9e-36 MEKASAYQEFKFQYDNTLSVWNDIYDSSQMTFKFQYDNTLSEGLTRLYMDWFLFKFQYDN TLRQRPDRLIAGEVRFKFQYDNTLSVYPSSL >gi|224461300|gb|ACDC01000102.1| GENE 2 1254 - 1532 246 92 aa, chain - ## HITS:1 COG:FN1176 KEGG:ns NR:ns ## COG: FN1176 COG1343 # Protein_GI_number: 19704511 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 92 15 106 106 163 94.0 7e-41 MYVVAVYDISLDEKGNRNWRKVFGICKRYLHHIQKSVFEGELSEVDIQRLKYEVSKYIRN DLDSFIIFKSRNERWMEKEMLGLQEDKTDNFL >gi|224461300|gb|ACDC01000102.1| GENE 3 1537 - 2529 732 330 aa, chain - ## HITS:1 COG:FN1177 KEGG:ns NR:ns ## COG: FN1177 COG1518 # Protein_GI_number: 19704512 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 330 9 338 338 573 93.0 1e-163 MKRSYFLYTNGTLKRKDNTITFINEQDEKRDIPIEMIDDFYVMSEMNFNTKFINYISQFG IPIHFFNYYTFYTGSFYPREMNVSGQLLVKQVEHYTNPQKRIEIAREFIEGASFNIYRNL RYYNGRGKDLKFYMEQIEELRRQLNEVTNVEELMGYEGNIRKIYYEAWNIIVNQEIDFEK RVKNPPDNMINSLISFINTLFYTRVLGEIYKTQLNPTVSYLHQPSTRRFSLSLDISEVFK PLIVDRLIFSLLNKNQITEKSFVKDFNYLRLKEDASKLIVQEFEERLKQVITHKDLNRKI SYQYLVRLECYKLIKHLLDEKKYQAFQMWW >gi|224461300|gb|ACDC01000102.1| GENE 4 2541 - 3035 664 164 aa, chain - ## HITS:1 COG:FN1178 KEGG:ns NR:ns ## COG: FN1178 COG1468 # Protein_GI_number: 19704513 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Fusobacterium nucleatum # 1 164 1 164 164 256 92.0 2e-68 MDKDITGLMVYYYEVCKRKLWYFTNEIQLEENNSNVILGKLLEENTYTRDEKKINIDGVI NIDFIRSKKILHEIKKSNSIEPASILQVQYYLYYLEKKGLIGLKGVLDYPLLKKTVEVNL TDNDRKNLENIIIGIKEILGKESPPILEKKNICKKCAYFDLCFV >gi|224461300|gb|ACDC01000102.1| GENE 5 3093 - 5504 2418 803 aa, chain - ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R General function prediction only # Function: Predicted 
helicases # Organism: Fusobacterium nucleatum # 20 793 10 799 812 992 74.0 0 MENYKENSKIKIYDNIKNIYYAKPDKTLAQHNEELHIQKKKLIELGYINDDKIIELLEYS IEFHDIGKINPEFQVRLKENKKFDVSKEVAHNILSIHFIDKKDYEDKNDYESIAYAIFYH HRFGNGDNDSIRADENTKKIIENLLSKLEEKGIKVIKKLSPSLKFPNLHTDRNLKLLGLL MKCDHSASGGYEIEYPNDFLEVALNELLNEFKEKDKSADWNDMQKFCKENSDKNIIAIAD TGMGKTEGGFLWGGNNKIFFVLPLRTAINAMYKRFNEVIIKGENKEERVGLLHSDSLEYY LNNKKELVIDDNDEKEMDILEYNKRGKHLSLPVTICTPDQIFNFILKYKGYESKLATLSY SKIILDEMQMYDANLLAAVIFGITKIIEMGGKIAIVTATFPPIIEYFLNKYLMKNNQNVI KDLDKPNEIVGEEIFIKKKFTNNEKLRHNIVLIDNEIGIEEILWKFKDNRDKKKSSNKIL VICNTIKKAQEIYLKLKEYSDLENKINMLHSNFIREDRESKEKEILDFGRTDFNGEGIWI STSLVEASLDIDFDYLFTELQDLNSLFQRFGRCNRKGKKSVDETNCFIYLKIENKYLKEK DSRYGFIDKDIYENSKKGLENYCKVVSKNELDNSEDYNELFKHFSKKITEGDKITLIEEN LSFENLKDSPFVNEFEKAYDKYQRVLNSDKNSQDALKLRDIQSVTVIPYNIYEENEELIK ELIKKIEDANLGLEERQKAKTELLKKTLSIQYYQLNEYFNILDSKSYSSKSINKFEKIIV TISEYDKELGFRATKNSKNFVFI >gi|224461300|gb|ACDC01000102.1| GENE 6 5516 - 6595 1320 359 aa, chain - ## HITS:1 COG:no KEGG:CTC01145 NR:ns ## KEGG: CTC01145 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 359 1 360 360 308 54.0 2e-82 MEALRIILKQSSANYRKAGTIDNKMTYPLPIPSTIIGALHNICEYTEYHPMDISIQGKFT SLSRKVYTDYCFLNSALDDRGNLIKVVDPNAFSGAFVKVASAKKSQGNSFKDRITIQVHN EELLQEYCNLKEKSKEIEELKNSEYKKKLEEFKTLKKEIADKKKKEDKKSEVFKQLSEEE KKIKLEEEKYKEEFKKFEYENYTKPYSHFQNLVTSLKSYEVLNDIFLILHVKADDETLKD IENNIFNLQSLGRSEDFVEVIECKIVELQEVEDIIESNFSMYINAKDFYEEKIFTETVDG EHSSGGTKYYLDKNYEIKKGKREFKKVPVIYSTSVLAEDSSENVKADIYNEETILVNFI >gi|224461300|gb|ACDC01000102.1| GENE 7 6606 - 7490 1303 294 aa, chain - ## HITS:1 COG:FN1181 KEGG:ns NR:ns ## COG: FN1181 COG1857 # Protein_GI_number: 19704516 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 293 1 299 300 489 91.0 1e-138 MKKNALTVTIVANMTSNYSEGLGNISSVQKIYRDRNVYAIRSRESLKNAIMVQSGMYEDL ETEANGATQKKVDENLNATNCRALEGGYMNTKESTYVRNSSFYLTDAISTESFINETRFH 
NNLYLATNYANANNLNVQKDAGKVGLMPYQYEYEKSLKVYSLTIDLEKVGKDPNFPDKEA DNKEKFERVKSILEAIENLSLVVKGNLDNAEPVFAIGGLSLRKTHYFENVVRVEQGALVL GETLKEKKEDGFNCALLKGDIFTNEAEIIKELQPASMKEFFKSLTEDVKNYYGV >gi|224461300|gb|ACDC01000102.1| GENE 8 7503 - 9047 1602 514 aa, chain - ## HITS:1 COG:no KEGG:FN1182 NR:ns ## KEGG: FN1182 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 514 1 517 517 728 85.0 0 MKYDVDKNEYGFDTAISASDWKYSAAITGLIYYFKELEKKYEIKNLTIDEITDSFLLYNK EDITEDNYLDFIERFYPEDTLVHKKIETQLKYTKEFTPEIIKNIKENMSANTVLKKVFSK IKFDGTNKEEALKLLNENRNLIVKETFRNKKDLYDNYCQTSRLLEKGNNSPCRLKGYYFD PNRKSKATGYNFASSSVDYFDDEVFDFIPFAFTGNSFETIFLNDNLDLEILENMNYKLRE YFSEEKEREITNIRTLKQEKAIKEGKNEPIEETSVSVSLKKIFLNILQKKTDYIKYGMEI IYKNRDKEYFETWYLRNDSIEVLKIVEDFSRLDIRIKITDKYYFNLLDEVFSAILNLSLL TNSILYLLKDRENFIKLDVSKENLSKIFKYNYAIEQLIKINQTIRNGGKGMDKNLKNSIK ACASEVMKKFIKDNSLNKLVSYRQKLLSSVVAKNHKRILDVLTQLSVYSGVYFSFSFDYI ENPTQNEDIIHYFILELDQSRLESKKNKENEDKE >gi|224461300|gb|ACDC01000102.1| GENE 9 9037 - 9789 719 250 aa, chain - ## HITS:1 COG:no KEGG:FN1183 NR:ns ## KEGG: FN1183 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 250 1 250 250 423 89.0 1e-117 MRFILNFELDTVIIPVEIRRTMISFFKKSLTEAHNSKYYPEFFTGTQIKDYSFSVIFPLD KYLREEIYLKKPEMKVIVSCSEKNNIGFLLVNVFLSQRNKNFPLPKNTHMILRDVRIVEE KNISGEEAIFQTTIGGGIVVRDHSKEDNKDICYSVGDEKFEEVLNWLMKERFKRLGYPED IFKDFSCKLLQGRKIIVKHFDLKFPITTGRFKIKAPKILLEEIYRTGMGSRLSQGFGLLE YLGGEIKDEV >gi|224461300|gb|ACDC01000102.1| GENE 10 10061 - 10612 522 183 aa, chain + ## HITS:1 COG:no KEGG:Coch_0117 NR:ns ## KEGG: Coch_0117 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 4 171 6 171 175 149 39.0 4e-35 MLEIEKITLKNKIVDKDNYFEIGYCEELKIYMMHVFVSWIASYYRYYKIDEEDYNLYKNS PQSFYKKYENEIKQNNNVYTENFIGSESLRDYDGVKDFQHSYPTKNEIINPFQNYIYIEG ILFARIIWEMGEFLIPPFQKIISKDGSYKFPLREICELKNNSSGNPICYYLPFDEKKYLH KIN >gi|224461300|gb|ACDC01000102.1| GENE 11 11198 - 11590 622 130 aa, chain + ## HITS:1 
COG:L36274 KEGG:ns NR:ns ## COG: L36274 COG3728 # Protein_GI_number: 15672009 # Func_class: L Replication, recombination and repair # Function: Phage terminase, small subunit # Organism: Lactococcus lactis # 1 126 1 140 147 72 37.0 2e-13 MTHRQELFIQEYIKTGNATEAAKKAGYSEKTAYSIGQRLLKNVEVKDAIDKLSKNIAINN IMTAKERQEFLTSLILNNDVKVSDKLKAVDLLNKMTGEYIQKVEVNGNVKTDDPFKNLTT EELKRIIFDN >gi|224461300|gb|ACDC01000102.1| GENE 12 11792 - 11929 93 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIRKKPFSYSEHRTVYIKYIIRYLKTIKRYKIDIINYIIYNRIVI >gi|224461300|gb|ACDC01000102.1| GENE 13 13036 - 13497 556 153 aa, chain - ## HITS:1 COG:no KEGG:M6_Spy1806 NR:ns ## KEGG: M6_Spy1806 # Name: not_defined # Def: Cro/CI family transcriptional regulator # Organism: S.pyogenes_MGAS10394 # Pathway: not_defined # 1 70 1 70 226 69 45.0 4e-11 MIKSNLAIVMAEKKIKISELSRKTGISRVTLTSLYYNNSGGIQFDTLNNLCNFLSVKPSD ILVYYPFDYKIKDLYPHIDGINNFKIEYIINNKTFSCSLEIELFVEKKIEPEDDAGGIII TDVFISVYLSEQFDFADSEIELSESARHFQKIF >gi|224461300|gb|ACDC01000102.1| GENE 14 13687 - 13890 304 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739819|ref|ZP_04570300.1| ## NR: gi|237739819|ref|ZP_04570300.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 67 1 67 67 103 100.0 4e-21 MINQNLLKSKIALSGLSSKEVAEKIGIPYQSFNNRKAGKIEFNSSEIKALKEVLNLTNDD VEAIFLS >gi|224461300|gb|ACDC01000102.1| GENE 15 14001 - 14732 1017 243 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739820|ref|ZP_04570301.1| ## NR: gi|237739820|ref|ZP_04570301.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 243 1 243 243 366 100.0 1e-100 MEFLSKESINSKELLKQINYFREMEYKEKEASNTLTKAQKKRGRYIELTHDNLLKIIRDE FNMKVNAVNKNAVKNNNHYNGPVETTYRDDKGELRPMFILTIDQAKQVLLRESKVVRKAV IEYLNLLEKRIRQLERKKGITARKEETEAIKMLLEYGNIPKEKHNLYYMTYSKLPFVALG VKRVERDKLPADDLELIKELENIIEKTILKGIVKKLDVKAIYKECKKACIDSLVDDIEQK LIS >gi|224461300|gb|ACDC01000102.1| GENE 16 14745 - 15077 246 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739821|ref|ZP_04570302.1| ## NR: gi|237739821|ref|ZP_04570302.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 110 1 110 110 166 100.0 6e-40 MIKKVYIDVLFMVLSYEATKVFINKDNVIISFKKEDQEENKEILELIENLGIKEVIGNYS IELNFEFMILEIHQQYKFKMLRKLGKDDIDKIWTIAMVDIDTLMTREVEA >gi|224461300|gb|ACDC01000102.1| GENE 17 15074 - 15172 113 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIFIRGTMDEVIQKLKELAAQGNKGVITNGK >gi|224461300|gb|ACDC01000102.1| GENE 18 15162 - 15428 411 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739822|ref|ZP_04570303.1| ## NR: gi|237739822|ref|ZP_04570303.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 88 1 88 88 147 100.0 2e-34 MENKTDKKISSLVLDLENIIMDIEGFKGMLLALEEALFQACNWDKENYRYMVSHMYTFIY DINNNLKDTFNGLKEKALINSNSDQAKN >gi|224461300|gb|ACDC01000102.1| GENE 19 15568 - 15720 66 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYDRSLTIPLKLRRTLKEAFTYMRGVVKKVLVIRFEYRYPFIFLGKCQK >gi|224461300|gb|ACDC01000102.1| GENE 20 15746 - 16051 355 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739823|ref|ZP_04570304.1| ## NR: gi|237739823|ref|ZP_04570304.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 101 1 101 101 149 100.0 7e-35 MVDSTDKKKEAKKKLYRKHHLIFKSKDLYDGFNLLIFMAQYEDDIIIDDTIIEYTGDFFI YHIKCSYNLPLNLDVCKDQNGTVIKVFFTKGVEAIEASDIF >gi|224461300|gb|ACDC01000102.1| GENE 21 16029 - 16289 282 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739824|ref|ZP_04570305.1| ## NR: gi|237739824|ref|ZP_04570305.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 86 1 86 86 130 100.0 3e-29 MKQVIYFENYDYYEHINSLIEELESHNIKVIDIIISSRVIKSGSKVTHTLIVESLNKIKV EIEKIEPYPDIQGIVIKFNGGGVIEV >gi|224461300|gb|ACDC01000102.1| GENE 22 16279 - 18336 2045 685 aa, chain + ## HITS:1 COG:L37667 KEGG:ns NR:ns ## COG: L37667 COG3378 # Protein_GI_number: 15672011 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Lactococcus lactis # 306 670 143 494 542 115 26.0 4e-25 MKFEKAFRGFYLSKGDKGKEPILKYKTDDQLKDFNNRLVGYNEADKQFNSFVGYLADDYV LVDLDNKDSQGEYDSNKTESKKLIEILEHYGINTPIVETPHGHHFYFKVNDYQKENLKSV SGVYSLIGLKVDYKLGSKKGCACMKALGEVRKIVNDTSGISELPAFLLNNKVLNNTLKEL EDIKNSTGSRNSFISKYKYQLLKNGYDELTTYQVLEIINNFIFVDPLPMQELTVLMRQEH IEAEKDTSGLNNDSFLYHTNNGKLKVNTYKMAQKLINDFSIINIDNFLYSYNGQYYKKCE KEDIERAILRLHKDITMNELKEVLKKIQLGADKKKEDLNYIALNNGVFNLDTRKLEPYSK DKITMVHMDIIYTDDVDIITGEPTGTIIKNYMLDLVQNDYNLFCVLCEFLGQALYRKENI LQKCLIIKGDKSNGKSKFLEILIKFFGTENVSTLDLKRFEQRFDLFSIVGKMVNIGDDIS GQYIPDSSNIKKIITSEMLPIERKGQDLFDYKPRIICIFSCNNLPRFDDSTKAVKRRLCI LPFENTYRPELNNINPFIVHEMTTPENLSELFSWSVWGLDRVLRNHRLTESPKIMELVEE FDKDNDPIRAFIEDMAGDTEIGLKGYFNMKDTGIIYTDYQIWCNNNGYKEMNASNFGKQL KQHIPKLDKKPYRNDFQKVKNRYLL >gi|224461300|gb|ACDC01000102.1| GENE 23 18908 - 19405 571 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739826|ref|ZP_04570307.1| ## NR: gi|237739826|ref|ZP_04570307.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 165 7 171 171 272 100.0 4e-72 MKGIDKIMRLVINNKTFDSKEFKGTEAELLEQFVYEFLNINSIVMMERLAVVYEMLIGYI KDVLGIQENPPFKFDDIESDREKLEIVIEQYKFAKFLSSRYKGSYESYLDLLEQYEVFSK DKAIMTLIDYKLARFGDEIFKEMGIEIIDRIDQGFIVKDNSKYIN >gi|224461300|gb|ACDC01000102.1| GENE 24 19435 - 19599 287 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739827|ref|ZP_04570308.1| ## NR: gi|237739827|ref|ZP_04570308.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 54 1 54 54 95 100.0 9e-19 MRMLNVDDVKNLLQVSKPKAYEIIRTLNAELKDKGYLTVQGKIREDYLFERMGA >gi|224461300|gb|ACDC01000102.1| GENE 25 19602 - 20660 1000 352 aa, chain + ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 39 349 35 367 368 121 30.0 2e-27 MPAEKEKLKGEYTGRWNARFTIKLINGNSKRIYKRGFKSKKEALEYEKNMILDNSLGSNI QFRKIVDKYLEYKKMRVKESSFLNLTSIFKNITFFDNFILSEITPKIISDFQNELIKTYK PSTVKIININLRMLFTWCVRYRNLSNNPFDMVERLKIETSKKINIITVDEFNQIVEQVNN EDMKLMFKLLFWTGLRIGEARALKIDDIDFENKTISVTKNYTKLHGKESLTSPKTKRSNR IIRIDDKLVEDIKEYLDKALYIIDDDFIFRHQKTSYTAKFKKIVLKVLKKDLRIHDLRHS HASFLINNGVDILLISQRLGHSNIAMTLNVYSHLYPSKENEAIELINKLKGV >gi|224461300|gb|ACDC01000102.1| GENE 26 20907 - 21401 634 164 aa, chain + ## HITS:1 COG:no KEGG:FN1188 NR:ns ## KEGG: FN1188 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 163 1 163 165 213 74.0 2e-54 MKNNMLLPNFYGVFEVKSATKNRLRMEIEKLKNNKVEIANLKENLKKIEVIKNFKVIESL GSLTVEFDDKEIDTQFMIGIVLKLLNLDEELLKGREAKAKTLFKTVAQIADITIYNKTKG LFDTKTLLGTGLLIYGLKKFKADMILPGGATLIWWSYRLLSKKS >gi|224461300|gb|ACDC01000102.1| GENE 27 21415 - 21801 356 128 aa, chain + ## HITS:1 COG:no KEGG:FN1189 NR:ns ## KEGG: FN1189 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 127 1 116 116 184 89.0 1e-45 MFKNMLKKTYLMFNKVKVVHSIPGRIRLLIPSLDKFPEEMKKHEHYISAIIKLKDGIKSI EYSYLTSKILIEYDKTKLKEQDIVNWLNKIWKIIVDNEDVYHGMSVDEVEKNVKRFYEML KGELEGRK >gi|224461300|gb|ACDC01000102.1| GENE 28 21805 - 24012 2961 735 aa, chain + ## HITS:1 COG:FN1190 KEGG:ns NR:ns ## COG: FN1190 COG2217 # Protein_GI_number: 19704525 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 735 1 735 735 1231 94.0 0 MKNDNLLACEIVHRIRGRIRIKSKAFKYIGASLKTEIEKQLVQVRYIESVEISLITGTIL 
IYFEDVSLSEQNLINLIQNTLNSHIFEICKNEKIEKSSKYVIERKLQEETPGEIIKKIIT TAGLLGYNLFFKSKQEVVATGIRRFLNYNTLSTLALAMPVLKNGINSLVKNKRPNADTLS SSAIISSILLGKESAALTIMFLEEVSELLTVYTMEKTRGAIKDMLSVGESYVWKEISEDN VKRVPIEEIQKDDIIVVQTGEKISVDGKIIKGEALIDQSSITGEYMPLKKAEGETVYAGT IVKNGNISILAEKVGDDRTVSRIIKLVEDANFNKADIQNYADTFSAQLIPLNFILAGIVY ASTRSITKAMSMLVIDYSCGIRLSTAVAFSAAINTAAKNGILVKGSNFIEELSKAETVIF DKTGTITEGKPKVQSIEVFDNSMSENEMIGLAGAAEEQSSHPLATAIMTEIKDRGIEIPK HSKIKTVVSRGVETKVGKGKEAKVIRVGSKKYMLENNVNLTAAIDAERGIISRGEIGLYI AQDDKIIGLIGVSDPPRENIKKAINRLRNYGVDDIVLLTGDLRQQAETIASRMSIDRYES ELLPEDKAKNILKFQSKGSNVIMIGDGVNDAPALSYANVGVALGSTRTDVAMEAADITIT QDNPLLVPGIIGLSKNTVKTIKENFAMVIGLNTFALVLGATGILAPIYASVLHNSTTILV VLNSLKLLKYDIKTN >gi|224461300|gb|ACDC01000102.1| GENE 29 24025 - 24777 797 250 aa, chain + ## HITS:1 COG:no KEGG:FN1191 NR:ns ## KEGG: FN1191 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 250 1 243 243 408 90.0 1e-113 MKKLTITIVHILPNRVRLKLSAPVKDTKTFYSNIKNSLKFLEMRYNSRLKTVTLNFSTSE IFLQEIIYRVAISFSIENGLLPVKLVEENVYKSISPLSMYALASIMVSYLNGVINKNDTN LQSSMNVFSMGLTVGSVFEHAYGEVKKRGMFDIEILPALYLLKSFFTEQKLSTVLIMWLT TFGRHLTVSHKMTKLIKVFRVKTEKGYQYTATIVDDNTIENFSDFIHQIFFKKHSDYCQF NEKYVTLSKN >gi|224461300|gb|ACDC01000102.1| GENE 30 24837 - 25109 520 90 aa, chain + ## HITS:1 COG:no KEGG:FN1192 NR:ns ## KEGG: FN1192 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 90 1 90 90 116 87.0 2e-25 MFGNTITKDHLVGAAVGVGVAAIAFYLYKKNQAKVDDFLRKQGINIKTSSCSNLEGLDIE GLTEMKEHIEDLIAEKSATGSAEEIIVEAE >gi|224461300|gb|ACDC01000102.1| GENE 31 25170 - 26699 1199 509 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739834|ref|ZP_04570315.1| ## NR: gi|237739834|ref|ZP_04570315.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 509 1 509 509 751 100.0 0 MPSWLKPSGNFGIGFQSIFMLTEHVNLKSKSLFSQESIDVDLIKPASKNYNAGSIYFKKT KFYYKQETGTVISFRYKTKKVSNSYTTSGEFINNYIYTFDPLIDKEFDIEIHSLLDMIRE VNSYSLIDITLNKEDKNIRLNKLLNGDLKLFSKEKYQIFIDIDDEPINLNKHLFTKFFYK NQYVEKIGYLSNSFFNITINALNFKANEILELSRNRIKNSFINKEMDSILNYVFKTLYEK YNDLFDKKEKKEENKKRKISYFWLAYRNSLNLKIEDMLDFEKKLENFIYQDEIVEGISVI SLEKKEKIEIEIKHQKNKDSFYKIENKEINEYFLRFLIKEISKKNYYILKGKKKNEKLYK VILRKVKDNKEEYIDIEYNIKNLIDSRNWVNPIGRFYIHSKKDNRNLKINYKKIDLKSNK KWIEIKRTSIFYNIFSNLISRNIILFPYYVKSENEVIWNEKMKEEYIEFCYKNRFLTDLS KEKIREEVEKIVKNLKKKFLDENIQILEE >gi|224461300|gb|ACDC01000102.1| GENE 32 26705 - 28096 1665 463 aa, chain - ## HITS:1 COG:no KEGG:A1S_0304 NR:ns ## KEGG: A1S_0304 # Name: not_defined # Def: hypothetical protein # Organism: A.baumannii # Pathway: not_defined # 8 460 28 495 992 219 33.0 2e-55 MNFFEKILEEKSKQENTIDYFTQWNYDKELYTDILLGVRDYYSNYTDHGKKHSETILTNI LRILGEESIKKFSTLDLWLILEAAYLHDCGMYITREEAKKVIQDDNFKSYYSNILNNPEH PMYSYTQFFSQDKNGFSYNQIHYNVDFDYAMRFIISSYKRSSHAADFRKVIGNSKKLLHD RIYRILFSISESHGKSFEDVMKLPKKENGIGNEIGHPIFIACLLRMGDLLDIDNKRFSEK LIENIEDIIPIDSKEHLEKHKSITHFWIDQERIEITATVNSGEESYNVAEVIGKWFSYIE DEYNNQLHNWNDIIPKNIEASLPILGELKIDIKDYEYIDSKNKPKFSLDINNTLSLLMGT SIYDKKEKAIREILQNAIDTTYLRVFEENKEQLALKNEISLEEVNELFENKKIEIVINKI SEDKEYNQWDIIIKDKGIGIDKEHLKYIIEAGSSYKDSKKMIL >gi|224461300|gb|ACDC01000102.1| GENE 33 28111 - 28755 777 214 aa, chain - ## HITS:1 COG:no KEGG:FN1197 NR:ns ## KEGG: FN1197 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 211 1 211 213 300 85.0 3e-80 MKKANRLTEKRSERKKVFLKSGAYLIVTDAEKTEKNYFEGIKNIIPDSLKNDLQIKIYSN KPLAKIIDFATEQRNKDERFRDIWLVFDRDEVKNFDKLIEEAKESKMNVAWSNPCFEIWL MSYFQIPKNIDDSQKCCETFEKVFQENTNKKYKKSEEKIYNILCEKGNENRAIERSREKY HQVRKEYSKPSKMIGCTTVYKLVEELKKKMEMKK >gi|224461300|gb|ACDC01000102.1| GENE 34 28758 - 30020 1299 420 aa, chain - ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted 
ATPases # Organism: Fusobacterium nucleatum # 1 420 1 420 420 706 94.0 0 MLLQFYFSNYRSFEGEGILDMRASGSNELSSHVRNTLNERVLPVTAIYGANASGKSSVFE AFQFMAFCVLESLSFSDENKKNPYKLKVDSFKFSESREKPSEFEINYIDKKGKKELYYNY GFKIDNSGILEEYLTYNTKTGVKRNEDYTYIFKRERNQKLHLNSSIEKFRENLEISLKDK TLLVSLGAKLNIDEFIRVRTWFINTEVINFSNSLYGVLLENTLPNNIFESEEVRKNLVNF INSFDDSIIDIEVEKISAIDENDNDNYRVFTIHKSDKETSTARISMNEESSGTKKMFSLY QTLLDALENGGVFFADELDIKLHPLLMRNILLTFTNKEKNPNNAQLIFTTHNTIYMDMDL LRRDEIWFVEKDNGVSNLYSLDDITNEKGEKVRKDSNYEKHYLLGNYGAIPNLKNLLGRK >gi|224461300|gb|ACDC01000102.1| GENE 35 30158 - 30391 179 77 aa, chain + ## HITS:1 COG:no KEGG:FN1193 NR:ns ## KEGG: FN1193 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 76 4 79 89 125 82.0 4e-28 MKDIDVIYKGEVLKLTRFWGNNKLCLWIKNSNQITMPKMEFVGGYPNEYCIFLENLSTEE LKEIKTIDGKVLNFEEF >gi|224461300|gb|ACDC01000102.1| GENE 36 30486 - 30587 76 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWLVKAHTSQETCDSSTRRVLVIGGSVTPLRLI >gi|224461300|gb|ACDC01000102.1| GENE 37 30636 - 30938 332 100 aa, chain - ## HITS:1 COG:TM1044 KEGG:ns NR:ns ## COG: TM1044 COG0675 # Protein_GI_number: 15643802 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 1 95 286 376 405 119 58.0 2e-27 MIKNHRLAHRLARNIADVSWSEFSRILEYKAKWYGKTVVRVDRFFASSQICNCCGYRNKE VKDLSVREWTCPVCGAVHNRDINAAKNILKEGLRILGISA Prediction of potential genes in microbial genomes Time: Thu May 19 23:37:25 2011 Seq name: gi|224461299|gb|ACDC01000103.1| Fusobacterium sp. 2_1_31 cont1.103, whole genome shotgun sequence Length of sequence - 41369 bp Number of predicted genes - 36, with homology - 34 Number of transcription units - 12, operones - 8 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 210 - 269 8.5 1 1 Op 1 . + CDS 303 - 1040 927 ## COG1349 Transcriptional regulators of sugar metabolism 2 1 Op 2 . 
+ CDS 1092 - 1157 77 ## + Prom 1185 - 1244 3.6 3 2 Op 1 19/0.000 + CDS 1275 - 2204 1281 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 4 2 Op 2 . + CDS 2194 - 4068 2677 ## COG1299 Phosphotransferase system, fructose-specific IIC component 5 2 Op 3 . + CDS 4086 - 4583 452 ## gi|237739843|ref|ZP_04570324.1| predicted protein + Term 4613 - 4669 9.1 + Prom 4590 - 4649 7.1 6 3 Tu 1 . + CDS 4794 - 5114 399 ## COG3070 Regulator of competence-specific genes + Term 5328 - 5370 11.4 + Prom 5390 - 5449 10.8 7 4 Op 1 . + CDS 5474 - 5653 144 ## 8 4 Op 2 1/1.000 + CDS 5715 - 6419 696 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 9 4 Op 3 . + CDS 6419 - 7450 1230 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 10 4 Op 4 1/1.000 + CDS 7515 - 9605 2799 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 11 4 Op 5 . + CDS 9664 - 12297 3982 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 12 4 Op 6 . + CDS 12337 - 13053 849 ## FN1719 hypothetical protein + Term 13064 - 13107 7.2 - Term 13042 - 13103 15.7 13 5 Tu 1 1/1.000 - CDS 13120 - 16686 4775 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 16727 - 16786 9.7 14 6 Op 1 1/1.000 - CDS 16788 - 18125 1448 ## COG1757 Na+/H+ antiporter 15 6 Op 2 . - CDS 18167 - 19354 2061 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 19422 - 19481 10.3 + Prom 19344 - 19403 9.1 16 7 Tu 1 . + CDS 19470 - 20906 1424 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Prom 20915 - 20974 9.6 17 8 Op 1 . 
+ CDS 21087 - 22145 1021 ## COG3180 Putative ammonia monooxygenase 18 8 Op 2 3/0.000 + CDS 22205 - 23296 1155 ## COG0726 Predicted xylanase/chitin deacetylase 19 8 Op 3 3/0.000 + CDS 23313 - 24092 852 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 20 8 Op 4 11/0.000 + CDS 24082 - 25101 1216 ## COG0859 ADP-heptose:LPS heptosyltransferase 21 8 Op 5 . + CDS 25098 - 26135 1101 ## COG0859 ADP-heptose:LPS heptosyltransferase 22 8 Op 6 . + CDS 26122 - 26724 611 ## FN0545 lipopolysaccharide core biosynthesis protein RfaY 23 8 Op 7 1/1.000 + CDS 26726 - 27736 868 ## COG0859 ADP-heptose:LPS heptosyltransferase 24 8 Op 8 14/0.000 + CDS 27741 - 28880 2078 ## COG0468 RecA/RadA recombinase 25 8 Op 9 1/1.000 + CDS 28861 - 29424 649 ## COG2137 Uncharacterized protein conserved in bacteria 26 8 Op 10 1/1.000 + CDS 29421 - 30446 680 ## PROTEIN SUPPORTED gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 27 8 Op 11 . + CDS 30486 - 30953 805 ## COG0662 Mannose-6-phosphate isomerase + Term 30982 - 31028 6.6 - Term 31018 - 31062 6.4 28 9 Op 1 . - CDS 31106 - 31435 361 ## Vpar_0189 hypothetical protein 29 9 Op 2 . - CDS 31503 - 31853 547 ## SSUBM407_1036 hypothetical protein 30 9 Op 3 . - CDS 31882 - 32652 1083 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor - Prom 32689 - 32748 8.6 + Prom 32772 - 32831 10.8 31 10 Tu 1 . + CDS 32871 - 33413 733 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Prom 33420 - 33479 13.3 32 11 Op 1 . + CDS 33559 - 35382 2302 ## COG0326 Molecular chaperone, HSP90 family + Term 35385 - 35434 -0.5 + Prom 35458 - 35517 7.8 33 11 Op 2 . + CDS 35577 - 36464 1292 ## COG3588 Fructose-1,6-bisphosphate aldolase + Term 36483 - 36535 12.3 + Prom 36503 - 36562 15.7 34 12 Op 1 13/0.000 + CDS 36617 - 39043 3106 ## COG0457 FOG: TPR repeat 35 12 Op 2 1/1.000 + CDS 39054 - 40562 1799 ## COG0457 FOG: TPR repeat 36 12 Op 3 . 
+ CDS 40585 - 41364 1103 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity Predicted protein(s) >gi|224461299|gb|ACDC01000103.1| GENE 1 303 - 1040 927 245 aa, chain + ## HITS:1 COG:FN1439 KEGG:ns NR:ns ## COG: FN1439 COG1349 # Protein_GI_number: 19704771 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Fusobacterium nucleatum # 1 245 1 245 245 367 91.0 1e-101 MLFEDRISLILKLIETQGSIENSKIIKDLKISEATLRRDLAYLEKENKIKRVRGGAVLRK VARKEIEIKEKNTNKDSKKKIAQMAAQFISDGDYIYLDAGTTTYEIIDYIKGKDIKVVTN GIIHLERLIANDIETYLIGGRIKKSTLAIVGVKALRDLSEFRFDKAFIGINGINENGYST HDVEEALIKKQAIENSNKAFILADSTKFDMIYFANVAKLEEATIITDKKEINKDIEKHTK IINIY >gi|224461299|gb|ACDC01000103.1| GENE 2 1092 - 1157 77 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRVNVGVFEANLLASFAEFS >gi|224461299|gb|ACDC01000103.1| GENE 3 1275 - 2204 1281 309 aa, chain + ## HITS:1 COG:FN1440 KEGG:ns NR:ns ## COG: FN1440 COG1105 # Protein_GI_number: 19704772 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Fusobacterium nucleatum # 1 309 6 314 314 488 85.0 1e-138 MIYSVTLNPSIDFIVRVKDFQIGETNRAYEDNFFAGGKGIMVSKLLKNVGTECINLGFLG GFTGAFIEKNLKKLNIPSDFVNVEENTRINVKLKTEEETEINCPGPKISEKEKEEFLDKI RKIKSDDFVILSGSVPSNLGNDFYINIIEILNENSVKFTLDSSGETFKKSLKYKPFLIKP NKDELKEYAKREFKDNKEIIDYVRTNLVGMAENVIISLGGEGALYIAKDFSLFAQPFKAK ESVVNTVGAGDSVVAGFVNYMLKENDVEKAFRFAVACGTATSFSEDIGELEFIEEISKKL VIEREHYGN >gi|224461299|gb|ACDC01000103.1| GENE 4 2194 - 4068 2677 624 aa, chain + ## HITS:1 COG:FN1441_3 KEGG:ns NR:ns ## COG: FN1441_3 COG1299 # Protein_GI_number: 19704773 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Fusobacterium nucleatum # 296 622 1 327 328 545 93.0 1e-155 MEIKDLLKKDLMIMDLKANTKMEAIDEMIARLKEKNIVSDADVFKDLILKREERSSTGLG 
EGIAMPHAKTSVVNSPSVLFARSNKGVDYDALDGEPVHIFFMIAASEGAHDLHIETLAKL SKMLLNDDFTKGLLTCGSPDEVYALVDKYSEKPQESSKEEVKETQVTNKKKILAVTACPT GIAHTYMAEAALKEAGEKLGVDVKVETNGADGIKNNLTTNDINDAVGIIVAADKKVETAR FNDRKVIVTSTADAIKNAEALIKKVLNNEAPVFKAEASDNTEEDSQANDSIGRIIYKSIM SGVSNMLPFVIGGGILLALSFIVERFMGQNELFKLLYGIGGGAFHFLIPVLAGFIAMSIA DKPGFMPGAVAGYMASQGAGFLGGLIGGFIAGYSVIFLKKMTKNMSKQFDGMKSMGIYPI FSLLITGVLMYFIIGPVFTKINVIVANWLNNMGTANAVLLGAILGGMMSVDMGGPINKAA YAFSIGVFTDTNNGAFMAAVMAGGMVPPLAIALAMTLFKDRFDEKEQQSKISNFILGLSF ITEGAIPFAAKEPLKVISSCVIGAAIAGGLTQFWGVSAPAPHGGIFVIPAMPSVHSAIFF VVSIAIGAIISGVIFGVIRGKKNN >gi|224461299|gb|ACDC01000103.1| GENE 5 4086 - 4583 452 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739843|ref|ZP_04570324.1| ## NR: gi|237739843|ref|ZP_04570324.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 165 1 165 165 197 100.0 1e-49 MARFLNILFGVVFILFGIYMWNNPTETFVTYSFYLGLLYVIWTIITIFYIFRKKIRPVPY GNIIVSIIISIAILALPMFSIAMVLWTFVFIFLISAVYYLRNVIKNGLKSHLLQFILACI AVIYGFVMLFNPIVAGNTIAKILAFFVIMNGISYIFSSIIDVKIE >gi|224461299|gb|ACDC01000103.1| GENE 6 4794 - 5114 399 106 aa, chain + ## HITS:1 COG:SP0951 KEGG:ns NR:ns ## COG: SP0951 COG3070 # Protein_GI_number: 15900829 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Streptococcus pneumoniae TIGR4 # 1 75 1 75 75 102 65.0 2e-22 MASSKEYLDFILEQLSELEEITYKTMMGEYIFYYRGKIIGGIYDDRFLVKPVKSAIACIP NAKYELPYDGAKKMLLVDDVDNKEFLTSLFNSMYNELPAPKPKKKK >gi|224461299|gb|ACDC01000103.1| GENE 7 5474 - 5653 144 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLSNKVYVDNSYFLIYCKNRFIFNMLFILKKLFHTILRTKKFETLFFRNRVIVLMYKI >gi|224461299|gb|ACDC01000103.1| GENE 8 5715 - 6419 696 234 aa, chain + ## HITS:1 COG:FN1921 KEGG:ns NR:ns ## COG: FN1921 COG0340 # Protein_GI_number: 19705226 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Fusobacterium nucleatum # 1 234 1 234 234 386 86.0 1e-107 MKFLKFNEIDSTNNYMKENISSFENYDIVSAKVQTAGRGRRGNSWLSPEGMALFSFLLRP 
ERSLSIVEATKLPFIAGISTLNALKKIKDGAYSFKWTNDVFFNSKKLCGILIERVKDDFV VGIGINVANKIPEDIKNIAISLESDHDIDKLILKVVEEFSLYYEKFMAGKWQEIIEEINA NNFLKDKKIRVHIGEQIFEGTAKNIAEDGRLEIEMNGEIKLFSVGEITIEKDYY >gi|224461299|gb|ACDC01000103.1| GENE 9 6419 - 7450 1230 343 aa, chain + ## HITS:1 COG:FN1920 KEGG:ns NR:ns ## COG: FN1920 COG0482 # Protein_GI_number: 19705225 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 1 343 1 343 343 580 88.0 1e-165 MKKVVIGMSGGVDSSVSAYLLKEQGYEVIGVTLNQHLEENSKDIEDAKKVCDRLGIIHEV VNIRKDFENIVIKYFLDGYKSGKTPSPCVICDDEIKFKILFEIADKYKADYVATGHYTSV EYSETFSKYLLKSVHSIIKDQSYMLYRLAPEKLERLIFPLKPYSKQEIREIALKIGLEVH DKKDSQGVCFAKEGYKEFLKENLKDEIVKGKYIDKEGNILGEHEGYQLYTIGQRRGLGIN LSKIVFITEIRAKTNEIVLGEFSELFTDEIELINYKFAVEFEKIKDLNLLARPRFSSTGF YGKLIKNNDKIYFKYNEENAHNAKGQHVVFFYDNFVVGGGEIK >gi|224461299|gb|ACDC01000103.1| GENE 10 7515 - 9605 2799 696 aa, chain + ## HITS:1 COG:FN1717 KEGG:ns NR:ns ## COG: FN1717 COG0272 # Protein_GI_number: 19705038 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Fusobacterium nucleatum # 1 696 1 696 696 1129 90.0 0 MKIKERIEELKNSNAGLTLYSSQELKDLERIVKLKEDLDKYRDSYYNDNESLISDYEFDI LLKELESLEEKYPEYKEASSPTASVGASLKENKFKKVEHAHPMLSLANSYNIGEVVDFIE RIKKRISKEQELKYCLEVKLDGLSISLTYIQGKLVRAVTRGDGFIGEDVTENILQIASVV KTLPQALDIEIRGEIVLPLASFEKLNKERLEKGEELFANPRNAASGTLRQLDPKIVKERA LDAYFYFLVEADKLGLKSHSESMKFLESMGIKTTGIFELLENSKDIEQRIDYWEKERENL PYETDGLVIKVDEINLWDEIGYTSKTPRWAIAYKFPAHQVSTVLNDVTWQVGRTGKLTPV AELEEVELSGSKVKRASLHNISEIQRKDIRIGDRVFIEKAAEIIPQVVKAIKEERTGNEK TIEEPINCPVCNHKLEREEGLVDIKCVNEECPAKIQGEIEYFVSRDALNIMGLGSKIVEK FIDLGYIKTVVDIYDLKNHREDLENIDKMGKRSIENLLNSIEESKNREYDKVIYALGIPF IGKVASKVLAKASKNIDKLMSMTFEELTSIEGIGEIAANEIIAFFKKEKTQKLVAALKEK GLKFEITESETKVENLNPNFAGKNFLFTGTLKHFTREQIKEEIEKLGGKNLSSVSKNLDY LIVGEKAGSKLKKAQEIPTIKILTEEEFIELKDKFD 
>gi|224461299|gb|ACDC01000103.1| GENE 11 9664 - 12297 3982 877 aa, chain + ## HITS:1 COG:FN1718 KEGG:ns NR:ns ## COG: FN1718 COG0653 # Protein_GI_number: 19705039 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Fusobacterium nucleatum # 1 877 1 869 869 1434 86.0 0 MIGGLLKKIFGTKNDREVKALRKIVDQINALEPEYEKLSDEDLRHKTDIFKERLANGETL DDILVEAFATVREASKRILGLRHYDVQLIGGIVLHQGKITEMKTGEGKTLVATCPVYLNA LAGKGVHVITVNDYLAKRDRDQMSRLYGFLGLSSGVILNGLPTEQRKRSYEADITYGTNS EFGFDYLRDNMVSDMKNKVQRELNFCIVDEVDSILIDEARTPLIISGAAEDKIKWYQVSF QVVSMLTRSFETEKIKNIKEKKAMNIPDEKWGDYEVDEKSRTVVLTEKGVKRVEKILKID NLYSPEHVELTHFLNQALKAKELFKRDRDYLVRENGEVVIIDEFTGRAMEGRRYSDGLHQ AIEAKEGVNIAAENQTLATITLQNYFRMYKKLSGMTGTAETEATEFMHTYGLEVIVIPTN LPVIRKDNADLVYKTKNGKIKSIIDRIEALYEKGQPVLVGTISIKSSEELSELLKKRGVP HNVLNAKFHAQEAEIVAQAGRYKAVTIATNMAGRGTDIMLGGNPEFMALDEVGSRDDERF PEVLAKYQEQCKEEKEKVLALGGLFILGTERHESRRIDNQLRGRSGRQGDPGESEFYLSL EDDLMRLFGSERVSVWMERLKLPEDEPITHGMINSAIEKAQKKIEARNFGIRKSLLEFDD VMNLQRKAIYENRDEALSTDNLKDKILGMLKDTITAKVYEKFAAEHKEDWDIDGLNEYLE DFYVYEEQDDKAYLKNTKESYAERVYEALVSQYNKKEEEIGSGLLRNLEKYILFEVVDNK WREHLKALDGLRESIYLRAYGQRDPVTEYKIISSQIFEEMISNIKEQTTSFLFKVAVKTE EERQSVEEFEEEDVKKVNSEDSCPCGSGKPYNKCCGR >gi|224461299|gb|ACDC01000103.1| GENE 12 12337 - 13053 849 238 aa, chain + ## HITS:1 COG:no KEGG:FN1719 NR:ns ## KEGG: FN1719 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 238 1 239 239 350 75.0 3e-95 MIFTLVSCSSTTNKKDLIQKYSLDKESAHNWETVMPNVMMAEATNPDWYGEDNPLVSLRK QGKMSEREYYFLDYLGKTPANQITDEEFDRFAKILTSFVNRTPRNFILEETNIKDPKGLV DFMVKEANSSQLDNPSKYIKEVVADKEEWAEIVALSEKADLNSKDVRKLRKLLAAFVKRE NFFNEQVWLQVEVSDRVLQLAQMSRKVPKTKRELNNVNAKALYLAYPQFLSKIDRWSR >gi|224461299|gb|ACDC01000103.1| GENE 13 13120 - 16686 4775 1188 aa, chain - ## HITS:1 COG:FN1421_1 KEGG:ns NR:ns ## COG: FN1421_1 COG0674 # Protein_GI_number: 19704753 # Func_class: C Energy production and conversion # Function: 
Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 410 3 412 412 821 97.0 0 MKRVMQTMDGNQAAAYASYAFTEVAGIYPITPSSPMAEYVDEWAAKGMKNIFDVPVKLVE MQSEGGAAGTVHGSLEAGALTTTYTASQGLLLKIPNMYKIAGELLPGVIHVSARSLSVQA LSIFGDHQDIYATRQTGFTMMASGSVQEVMDMGTVAHLTAIKSRVPVLHFFDGFRTSHEI QKIELMDFDVCKKLVDYDEIQKFRDRALNPEHPVTRGTAQNDDIYFQTREAQNKFYDAVP DIAAYYMEEISKETGREYKPFKYRGAADADRVIIAMASVCQTAEETVDYLVEKGEKVGLI TVHLYRPFSEKYFFNVLPKTVKKIAVLERTKEPGAPGEPLLLDVKSIFYDKENAPVIVGG RYGLSSKDTTPAQIKAVFDNLSQDKPKTNFTVGIVDDVTFTSLEVGERLSVADPSTKACL FFGLGADGTVGANKNSIKIIGDKTDLYAQGYFAYDSKKSGGVTRSHLRFGKKPIRSTYLV SSPSFVACSVPAYLKQYDMTSGLKKGGKFLLNCVWDKDEVLENIPDNIKYDLAKAEAKFY IINATKLAHEIGLGQRTNTIMQSAFFKLAEIIPYEEAQKYMKEYALKSYGKKGDDVVQLN YKAIDVGASGLIEIEVNPEWINLKVSAQEKVDKNNDTSNCKTELLTSFVKNIVEPINAIK GNDLPVSAFMGREDGTFENGTAAFEKRGVAVDVPIWNLDKCIQCNQCSYVCPHAAIRSFL ITDEEKAASPIEFSTLKANGKGLENLSYRIQVTPLDCTGCGSCANVCPAKALDMNPIAVA LENQEDKKASYIYSKVSYKNDKLPTNTVKGSQFSQALFEFNGACPGCGETPYLKVISQMF GDRMMVANASGCSSVYSGSAPSTPYTKNCCGEGPAWASSLFEDNAEYGFGMHVGVEALRD RIQHIMEVSMDKVTPALQGLFREWIENRCFAAKTREITPKILAALEGNNESYAKDIIGLK QYLIKKSQWVVGGDGWAYDIGYGGLDHVLASKEDINVIVMDTEVYSNTGGQSSKATPTAA VAKFAAAGKPLKKKDLAAICMSYGHIYVAQVSMGANQQQFLKAIQEAESYNGPSIIIAYS PCINHGIKKGMSKSQTEMKLATECGYWPIFRYNPLLESQGKNPLQLDCKEPKWELYQDYL MGETRYMTLKKTNPDEANELFEKNMWDAQRRWRQYKRLASLDFSDEKR >gi|224461299|gb|ACDC01000103.1| GENE 14 16788 - 18125 1448 445 aa, chain - ## HITS:1 COG:FN1420 KEGG:ns NR:ns ## COG: FN1420 COG1757 # Protein_GI_number: 19704752 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 5 444 1 440 445 701 91.0 0 MENKIENKASFKGLIPFLVFILLYLGTGIFLNIQGVELAFYQLPGPVAAFAGIVVAFIIF RGTITEKFNTFLEGCGHPDIITMCIIYLLAGAFAIVSKAMGGVDSTVNLGITYIPPHYIA VGLFIIGAFISTATGTSVGAIVALGPIAVGLGEKSGVPMALILAAVMGGAMFGDNLSVIS DTTIAATKTQGVEMKDKFRINSYIALPAAILTIILLFIFARPDVVPEAVSHEYNLLKVLP YVFVLVMALVGVNVFVVLTSGILLSGIIGFIYGDFTLLSYGKEIYNGFTNMTEIFVLSLL 
TGGMAQMVTREGGIDWVINTVQKFIVGKKSAKLGIGLLVSLADIAVANNTVAIIITGGIS KKISENNKVDLRESAAFLDIFSCVFQGMIPYGAQMLILLGFAGDKVSPTQLIPLLWYQLL LAVFTLIYIFFPQISNKTLSFIDKK >gi|224461299|gb|ACDC01000103.1| GENE 15 18167 - 19354 2061 395 aa, chain - ## HITS:1 COG:FN1419 KEGG:ns NR:ns ## COG: FN1419 COG0626 # Protein_GI_number: 19704751 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Fusobacterium nucleatum # 1 395 1 395 395 737 91.0 0 MEIKKCGLGTTAIHAGTLKNLYGTLAMPIYQTSTFIFDSAEQGGRRFALEEAGYIYTRLG NPTTTVLEDKIAALEEGEAAVATSSGMGAISSTLWTVLKAGDHIVTDKTLYGCTFALMCH GLTRFGIDVTFVDTSNLDEVKNAMKENTRVVYLETPANPNLKIVDIEALAKLAHTNPNTL VIVDNTFATPYMQKPLTLGADVVVHSVTKYINGHGDVIAGLVITNKALADQIRFVGLKDM TGAVLGPQDAYYIIRGMKTFEIRMERHCKNARKVVEFLNNHPKIERVYYPGLETHPGYEI AKKQMKDFGAMISFELKGGFEAGKTLLNNLKLCSLAVSLGDTETLIQHPASMTHSPYTKE EREAAGITDGLVRLSVGLENVEDIIADLEQGLEKI >gi|224461299|gb|ACDC01000103.1| GENE 16 19470 - 20906 1424 478 aa, chain + ## HITS:1 COG:FN1418 KEGG:ns NR:ns ## COG: FN1418 COG1167 # Protein_GI_number: 19704750 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Fusobacterium nucleatum # 1 472 1 472 475 770 81.0 0 MNKKLVRNSDTTVSTQLFEILKQDILENRWKENDKFFSVRQISIKYGLNPNTVLKVVKAL EEEGYLYSVKGKGCFIKKGYNLDISQRMTPILNTFRFGQISKDMEINFSNGGPPKEYFPV QEYKEILSEILLDKDGSRHLMAYQNIQGLESLRETLVEFIKRYGIRREKEDIIICSGTQI ALQLISTAFGLVPKKTVLLSDPTYQNAVNILKNYCNVENIDMKNDGWDMNEFENLLKNKK IDFVYIMTNFQNPTGVSWSFEKKKKMIELSIKYDFYIIEDECFSDFYYKSQDCPRSIKAL DKDERVFYIKTFSKIVMPALALTMLIPPKKYTESFSLNKYFIDTTTSGINQKFLELYIKR GLLDKHLEKLRVNLKEKMEYMIEKLKKIKHLEIMHVPQGGFFIWIKLANYINSEKFYYKC RLRGLSILPGFVFYSNSEEVSSKIRISTVSSSIEEVERGLEIIQDVLNNCDFSEINLK >gi|224461299|gb|ACDC01000103.1| GENE 17 21087 - 22145 1021 352 aa, chain + ## HITS:1 COG:FN0532 KEGG:ns NR:ns ## COG: FN0532 COG3180 # Protein_GI_number: 19703867 # 
Func_class: R General function prediction only # Function: Putative ammonia monooxygenase # Organism: Fusobacterium nucleatum # 5 352 2 349 351 444 82.0 1e-124 MNGNEIIFLILTLAIGILGGYLADKKKVPAAFMIGALFAVAIFNIVTNRAFLPTSFKFIT QVATGTFIGSKFRTEDVKMLRKVIIPGMVMVVLMIAFSFVLSFIMSHFLGIDYMTSFFAT APGGIMDISLIAYDFKANTSQVALLQLIRLISVISFVPFFTKKCYERSKNKKESFEKEIE NEIDEEKRTLNKSEKSLTFTLIIGIIGGIIGYFSHLPAGTMSFAMAFVAFFNVKTQKAYM PLPLRKIIQTFGGALIGARVTLADVVALKTLVLPIILIIVGFCLMNVLVGFFLYKTTKFS LSTALLSASPGGMSDISLMAEDLGANGPQVASMQFLRAIFIVGVYPLIIKLL >gi|224461299|gb|ACDC01000103.1| GENE 18 22205 - 23296 1155 363 aa, chain + ## HITS:1 COG:FN0541 KEGG:ns NR:ns ## COG: FN0541 COG0726 # Protein_GI_number: 19703876 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Fusobacterium nucleatum # 14 363 1 351 351 523 82.0 1e-148 MIITLIILTIIIFLIIIFNKRAVPAFLYHQVNPISNVSPELFEEHLKVIKEYKMNTITIS EFYNKEVPTNSILLTFDDGYFDNYKYVFPLLKKYNMKATIFLNTLYIMDKRETEPEIKDN NTVNLEAMKEYIKSGKATINQYMSWEEIKEMYDSSLIDFQAHSHKHMAMFVDTKIEGLTN KNRMEAPELYLYGELEDNFPSFPKRGEYTGKAILIKKEFFKIFKEFYEKNIENKITDKNE ILKKSQEFIDENKEYFSIESEAEYRKRIEEDFSENKKIIEKNLGNEVKFFCWPWGHRSKE TIEVLKELGVVGFISTKKGTNSMKANWNMIRRIELRNYNVKKFKINLLIARNLILGKIYG WIS >gi|224461299|gb|ACDC01000103.1| GENE 19 23313 - 24092 852 259 aa, chain + ## HITS:1 COG:FN0542 KEGG:ns NR:ns ## COG: FN0542 COG0463 # Protein_GI_number: 19703877 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Fusobacterium nucleatum # 1 259 5 263 263 439 88.0 1e-123 MTLTVSIITLNEEKNLERTLKSVQDFADEIVIVDSGSTDKTEEIAKKFGAKFVYQEWLGY GAQRNKAIDLATSDWVLNIDADEEISPELAKRIKAIKENSRYKVYKINFMSVCFNKKIKH GGWSNSYRIRLFRKDAGRFNENTVHEEFKTTQEIAKLHKYIYHHTYSDLADYFDRFNKYT TLGAIEYYKKGKKASIISIVLSPIYKFLRMYIVRLGFLDGLEGLLLATTSSLYTMVKYYK LREIYKNKSYIEKEGNNGN >gi|224461299|gb|ACDC01000103.1| GENE 20 24082 - 25101 1216 339 aa, chain + ## HITS:1 COG:FN0543 KEGG:ns NR:ns ## COG: FN0543 COG0859 # Protein_GI_number: 
19703878 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 339 7 345 345 557 85.0 1e-158 MEIKRILVSRTDKIGDLILSIPSFFMLKKMYPNAELVAIVRKYNMDIVKNLPYIDRFVVL DDYTKNELLEKIAYFKADVFIALYNDAYIAALARASKAKIRIGPISKLNSFFTYNKGVLQ KRSRSIKNEAEYNLDLIAKLDKKKFSILYELNTKLVLTDDNRKVADTFFKENSIEGKCLV VNPFIGGSAKNITDEQYVSILKKVKEEMPDLNIIVTSHISDEERNEKFCKDIGKDKVFSF SNGASILNTASIIDRADVYFGASTGPTHIAGALGKRIVAIYPNKKTQSITRWGVYGNSNV EYIVPDENNPNEDYKNPYFDNFTEEMEDKAVKYILEGLK >gi|224461299|gb|ACDC01000103.1| GENE 21 25098 - 26135 1101 345 aa, chain + ## HITS:1 COG:FN0544 KEGG:ns NR:ns ## COG: FN0544 COG0859 # Protein_GI_number: 19703879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 342 1 341 342 595 90.0 1e-170 MNILIIHTAFIGDIVLSTALVSKVKEKYPDSDIYYLTTPLGKEILKNNPKIKEIIAYDKR GKDKGFKAFVSFVRKIRKLKIDVCLTPHRYLRSSVLSFLSGAKIREGYDIANLAFLFNKK IKYDKTKHEVEKLLSFVEDNNTKRYELEMYPDKQDKIRIDSLLKDLSDNKKIILIAPGSK WFTKKWPEEYFRTLIQNLVKRDDLLIVITGGKEEKEINLELDSKVLDLRGEISLLELAEL TKRATLVVSNDSAPIHITSAFPNTRIIGIFGPTVKEFGFFPWSQNSKVFEIDNLYCRPCA IHGGNSCPEKHFRCMREITPDLIENEIYNYIASTDDKKVKTNEQK >gi|224461299|gb|ACDC01000103.1| GENE 22 26122 - 26724 611 200 aa, chain + ## HITS:1 COG:no KEGG:FN0545 NR:ns ## KEGG: FN0545 # Name: not_defined # Def: lipopolysaccharide core biosynthesis protein RfaY # Organism: F.nucleatum # Pathway: not_defined # 9 200 8 198 198 283 87.0 2e-75 MSKNKIPEKIIKVLKDDHRSYVYVFELEPYGDKKFVYKESREKNKRKWQRFLNFFRGSES KREYYQMKKINSLGLKTAKPIFYNKEYLMYEYIEGNEPTIDDIDLVVKELKKIHSMGYLH GDSHINNFLISPEREVYIIDSKFQKNKYGKFGEIFEMMYLEDSVGIEIDYDKKSFYYKGA MLLRKYLTFFSKLKNIIRGK >gi|224461299|gb|ACDC01000103.1| GENE 23 26726 - 27736 868 336 aa, chain + ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium 
nucleatum # 1 335 1 335 335 556 87.0 1e-158 MENRKRILVIRLSSIGDVILTTPVLKAFKEKYPESIIDFLVIDKFKDAISLSPYVDNLLL FNKEKNDGLSNLIKFTKELSKNKYDYVFDLHSKFRSKIITFILSKFYNVKSYTYKKRAFW KSILVNMKLIKYEVDNTIVKNYFSAFKDFELKYQGEDLNFSFEPELKNKFEEYKGHIAFA VGASKETKKWTVEGFGKLAKKLYETYGKKIILVGGKEDYERCDTIEKISENSVINLAGKL SLKETGALLSQAKFLLTNDSGPFHIARGVGCKTFVIFGPTSPGMFDFGKNDILVYNKIEC TPCSLHGDKVCPKKHFKCMKELSYEKVFKIIESKEW >gi|224461299|gb|ACDC01000103.1| GENE 24 27741 - 28880 2078 379 aa, chain + ## HITS:1 COG:FN0547 KEGG:ns NR:ns ## COG: FN0547 COG0468 # Protein_GI_number: 19703882 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Fusobacterium nucleatum # 8 379 11 381 381 554 91.0 1e-157 MAAKKDKNTPDSKITDKEGKQKAVNDAMAAITKGFGAGLIMKLGEKSSMNVESIPTGSIN LDIALGIGGVPKGRIIEVYGAESSGKTTLALHIIAEAQKQGGTVAFIDAEHALDPVYAKA LGVDIDELLISQPDYGEQALEIADTLVRSGAIDLIVIDSVAALVPKAEIDGEMSDQQMGL QARLMSKGLRKLTGNLNKYKTTMIFINQIREKIGVTYGPTTTTTGGKALKFYASVRLEVK KMGTVKQGDDPIGSEVVVKVTKNKVAPPFKEAAFEILYGKGISRVGEIIDAAVARDIIVK AGSWFSFRDQSIGQGKEKVRIELESNPELLAQVEADLKEAISKGPVDKKKKKSKKELASD DVDTDDDELDEDSSEDSND >gi|224461299|gb|ACDC01000103.1| GENE 25 28861 - 29424 649 187 aa, chain + ## HITS:1 COG:FN0548 KEGG:ns NR:ns ## COG: FN0548 COG2137 # Protein_GI_number: 19703883 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 57 186 1 130 130 177 89.0 9e-45 MKTVTIKGNKLFLENDKIIYLTKEMIAKFDLNGKTCLDDETFYSLIYFRIKLSAYNMLAK RDYFKKEIKNKLIEKIGFADIVEDVVEDFEEKGYLDDYEKAKSYASQHSNYGAKKLSFIF YQMGVDRESISKILEDDKDNQIEKIKQLWYKLGNKEKQKKIESILRKGFLYGDIKKAISS IEEEEEE >gi|224461299|gb|ACDC01000103.1| GENE 26 29421 - 30446 680 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Slackia heliotrinireducens DSM 20476] # 2 316 439 763 781 266 43 1e-70 MIILGIESSCDETSIAVVRDGKEILSNNISSQIEIHKEYGGVVPEIASRQHIKNIATVLE 
ESLEQAKITLDDVDYIAVTYAPGLIGALLVGLSFAKGLSYARNIPIIPVHHIKGHMYANF LEHEVELPCISLVVSGGHTNIIHIDENHKFTNIGETLDDAVGESCDKVARVLGLGYPGGP VIDKMYYKGDRNFLKITKPKVSRFDFSFSGIKTAIINFDNNMKMKNQEYKKEDLAASFLG TVVDILCDKTLDAAIEKNVKTIMLAGGVAANSLLRSQLTEKAAEKGIKVIYPSMKLCTDN AAMIAEAAYYKLKNAKNEEDCFAGLDLNGVASLMVSDEKAI >gi|224461299|gb|ACDC01000103.1| GENE 27 30486 - 30953 805 155 aa, chain + ## HITS:1 COG:FN0550 KEGG:ns NR:ns ## COG: FN0550 COG0662 # Protein_GI_number: 19703885 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Fusobacterium nucleatum # 15 155 1 141 141 259 94.0 1e-69 MKKLLFSTLLIGICLTGVASAKEKNPILLKQVYKKEELIALDKQNVAGGNGTLHGKFAFT RDMATEDEAIKEIGWMTLNKGESVGVHPHKNNEDTYIIISGEGIFTDGSGKETVVKAGDV TIARPNQSHGLRNEKDEPLVFLDIIAQNHALKAEK >gi|224461299|gb|ACDC01000103.1| GENE 28 31106 - 31435 361 109 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0189 NR:ns ## KEGG: Vpar_0189 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 101 1 101 107 148 69.0 5e-35 MNNLDFTLICSVTFFSKKRKTPPNLTNGKYNPHLVIKGDTEYLGVTFIDGEEVIFDKEIM ASALPLYEEIDYSGLTEGTKFMIMEGGNIVGEGIVDEVFQHISTKELKN >gi|224461299|gb|ACDC01000103.1| GENE 29 31503 - 31853 547 116 aa, chain - ## HITS:1 COG:no KEGG:SSUBM407_1036 NR:ns ## KEGG: SSUBM407_1036 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 9 116 6 113 117 147 64.0 8e-35 MENLKVHKYEPPRHIVDFHVAGFAYYDGLDVINELSLGQAVTLVVETDNPYDNEAVANYY KDKKLGYVPREKNSFLSTLLYYGYGDILEARIQYVNVENHPERQFRVVVKIKDNRK >gi|224461299|gb|ACDC01000103.1| GENE 30 31882 - 32652 1083 256 aa, chain - ## HITS:1 COG:FN0761 KEGG:ns NR:ns ## COG: FN0761 COG1521 # Protein_GI_number: 19704096 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Fusobacterium nucleatum # 1 256 1 256 256 449 94.0 1e-126 MIIGIDIGNTHIVTGVYDDKGELISTFRLATNDKMTEDEYFSYFNNITKFNNISIEKVDA ILVSSVVPNIIITFQFFARKYFKVEAIIVDLEKKIPFTFAKGINYTGFGADRIIDITEAM 
QKYPDKNLVIFDFGTATTYDVLKKGVYIGGGILPGIDMSINALYGNTAKLPRVKFTTPSS VLGTDTMKQIQAAIFFGYAGQIKHIIKKINEELGEEIFVLATGGLGRILSAEIDEIDEYD ANLSLKGLYTLYMLNK >gi|224461299|gb|ACDC01000103.1| GENE 31 32871 - 33413 733 180 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 180 1 180 180 298 87.0 4e-81 MTKKKIDVLDYSSKILKALSKGVLLTVKDDEKVNTMVISWGALGIEWNKVLFTTYIRENR YTKAILDKALNFTINIPLEKMDTKVFGIAGTKSGRNIDKIKEANLTLVDSEIVSSPAIKE LPITLECKVLYKQKQVLENLPEDIVKKDYPQDVDGTFVGANRDPHTAYYAEIVAAYIIEE >gi|224461299|gb|ACDC01000103.1| GENE 32 33559 - 35382 2302 607 aa, chain + ## HITS:1 COG:FN0321 KEGG:ns NR:ns ## COG: FN0321 COG0326 # Protein_GI_number: 19703666 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Fusobacterium nucleatum # 1 606 1 606 607 948 89.0 0 MRKEEKIFKAETKELLNLMIHSIYTNKEIFLRELISNANDAIDKLKFQSLTNNDLLKGDD KFKIEITVDKDNRTLTIKDNGIGMTYDEVDENIGTIAKSGSKVFKEQLEAAKKADIDIIG QFGVGFYSAFIVADKVTLETRSPYSENGVRWVSSGDGNYEIEEISKENRGTEITLHLKDG EEYSEFLEEWKIKELVKKYSNYIRYEIYFKDEVINSTKPIWKRDKKELKDEDYNEFYKAT FHDWNDPLFHINLKVQGNIEYNALLFIPKKLPFDYYTKNFKRGLQLYTKNVFIMEKCEDL IPEYFNFISGLVDCDSLSLNISREILQQNSELQAISKNLEKKIISELEKVLKNDREKYIE FWKEFGRSIKGGVQDMFGMNKEKLQDLLIFVSSHDDKYTTLKEYVDRMGETKEILYVPAE SIDAVKALPKMEKLKEQGREVLILTDKIDEFTLMAMRDYSGKEFKSINSSDFKLSDDKEK EEEVKKIADENKTLIEKAKEFLKDKVNEVELSNNIGNSASALLAKGGLSLEMEKTLSEMT NNNDAPKAEKILAINPEHVLFDKLKAAEGTDNFNKLVDVLYNQALLLEGFSIENPVEFIK NLNDLLA >gi|224461299|gb|ACDC01000103.1| GENE 33 35577 - 36464 1292 295 aa, chain + ## HITS:1 COG:FN0322 KEGG:ns NR:ns ## COG: FN0322 COG3588 # Protein_GI_number: 19703667 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphate aldolase # Organism: Fusobacterium nucleatum # 1 295 1 295 295 
539 94.0 1e-153 MNEKLEKMRNGKGFIAALDQSGGSTPKALKLYGINEDQYSNDAEMFDLIHKMRTRIIKSP AFNEQKILGAILFEQTMDSKIDGKYTADFLWEEKRVLPFLKIDKGLNDLDADGVQTMKPN PGLADLLKKANERHIFGTKMRSVIKKASPAGIARVVDQQFEVAAQIVAAGLVPIIEPEVD INNVDKVECEEILRDEIRKHLNALPETSNVMLKLTLPTVENFYEEFTKHPRVVRVVALSG GYSREKANDILSKNKGVIASFSRALTEGLSAQQTDDEFNKTLANTIEGIYEASVK >gi|224461299|gb|ACDC01000103.1| GENE 34 36617 - 39043 3106 808 aa, chain + ## HITS:1 COG:FN1434 KEGG:ns NR:ns ## COG: FN1434 COG0457 # Protein_GI_number: 19704766 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 154 808 1 657 657 766 66.0 0 MKTVEEILTKIESLDNLEKYQEIIDMIEELPVEQLNSQIISEQGRAYNNIGEYNKAIEIL KTIETEEKDTRRWNYRIGFSYYYLDDYENAEKYFLRADEIGPDDDEIKNYLLNIYIELSK NKLEEDQEKAIEYALKAKNYIKTEDNKVQVYSYLAWMYDRIEAYDIAEDLLKSIINSKTD PRNEIWIYSELGYCLGEQHRYEESLEALIKASEMGRDDIWINSQIGWTFRILGRYEEALE HLFKAKELGRDDDWINAELGICYKEIDKFEEALESYLLANERNGEKSIWILSEIAWIYGV LDKFDEELNYLAKVKKLGRKDEWINAEYGKVYARTEKYEEALKYFKKAKKLGQDDAWINI QMAICFKRLENLKKALEHYLLAEKFKDYKKDIWLLSEIAWTYDGLGKYKDGLKYLKKIDR LGRKDCWFYTEYGFCLMRLEKYKEAITKFKKGLQVKEELNEEIYLNSQIGFCYRLLGSEK TALKYHLKAKELGRNDAWINSEIGICYKDLDKYEEALEYYLLAYEEDKEEIWLLSDIGWL YNELDRYEEALEFLLEAEKLGRDDSWINAEIGQCLGRLEKYDEGIERLKKALELLEKDKR RNNTGEKIFINSEIGWLIGRKDDSNAEEALYYLNAAKALGRDDMWINSEIAWELAYNDDK AKESIKYFEKAIKLGRKDEWIWSRVANIYFDLERYKEALDAYSKAYKLAKNSWYICNIGR SLRRLGKYEEAVKKLIQSRKLSLKEGDVVDLEDLELAYCYAALGDKKKAEKHMKLSIDSL GSRAENEEHLKEQFDEIKEMISVLSKPS >gi|224461299|gb|ACDC01000103.1| GENE 35 39054 - 40562 1799 502 aa, chain + ## HITS:1 COG:FN1434 KEGG:ns NR:ns ## COG: FN1434 COG0457 # Protein_GI_number: 19704766 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 40 432 10 430 657 300 45.0 6e-81 MDKISKEYQVIIHEVKILPVEQLDINRVEKLIRAYIADKNYEKALEVLRVVEDREKDNPS INSEFGYCLVELKQFDEAAKYFLKAKNQGREDAWIYSQLGWAYRNAEKYKEALEAYLKAQ QLGDKVAWKNAEIGMCYKELGKYDEALKYYLIIINSGELDNDIYKKIWVLSEIAYIYQNI 
DKYEEAIEYFKKVESLGRKDSWLYANMFNCLKALKNNEEALKYSLMLENFEELKDSIYLL SNIANLYEEKQDYKEQLKYLEKIEKIGIDDPQFYIEYGYCLMFLEYYREAISKFEKSLGA GKDTYCISQIAFCYRNLGEYEKALEYFQKARSLGRNDAWISLEFGLCYRDLNEYEKALKY FLEAYEKEERYKTDTYLLSSIGKMYDLLGKYENGEEFLRKSYDLGERDRWINMELGECLT RLGKYEEAIKKLLEARRLYMAEGKAPYSEDLELAYCYAALGDKNKAKYHMDSSIEALGAY AESEEYLKKRFTEIKEMINSLK >gi|224461299|gb|ACDC01000103.1| GENE 36 40585 - 41364 1103 259 aa, chain + ## HITS:1 COG:FN1433 KEGG:ns NR:ns ## COG: FN1433 COG4221 # Protein_GI_number: 19704765 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Fusobacterium nucleatum # 2 259 3 260 260 442 86.0 1e-124 MESNIKGKIAFISGASSGIGKATAEKLAEMGANLIICARRENILNELKEKLEKQYGIKVK TLVFDVRSYSDVLKNINSLDDEWKKIEILVNNAGLAVGLEKLYEYNMEDVDRMVDTNIKG FTYIANTILPLMIATDKVCTVINIGSVAGEIAYPHGSIYCATKFAVKAISDSMRSELIDK KIKVTNIKPGLVDTEFSLVRFKGDKERADGVYGGIEPLYAEDIADTIAYVVNLPDKIQIT DLTVTPLHQANAIHIHREK Prediction of potential genes in microbial genomes Time: Thu May 19 23:37:56 2011 Seq name: gi|224461298|gb|ACDC01000104.1| Fusobacterium sp. 2_1_31 cont1.104, whole genome shotgun sequence Length of sequence - 991 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 88 - 990 1113 ## COG4166 ABC-type oligopeptide transport system, periplasmic component Predicted protein(s) >gi|224461298|gb|ACDC01000104.1| GENE 1 88 - 990 1113 300 aa, chain - ## HITS:1 COG:FN1313 KEGG:ns NR:ns ## COG: FN1313 COG4166 # Protein_GI_number: 19704648 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 300 171 474 474 440 77.0 1e-123 EWVSNPIFYPIREENAKLSLDKKIVNGAFKVSTYNEDSIVLVRNENYWDNVNTKLKEVNI ALVENDIMAYEMFPRNEIDYFGEPFYSIPFDRLGQVNILPEKLVFPSTRYWYISIPNETN EKIFEKAELRKLMYAVSDPEFMGKVIIENNSPSIFEHPHPSSEVLNKAKEDFEKLNIKFS DTPYIAYFSADKLLEKKLLLSTVKEWVGNFKIPIRVSSSTDSPITFKIENYLVGTNNKND LYYYINYKYNTKIKTDEEFLNSLVVIPLLQEYNTVLSRSSVRGLNLTPSGDLYLKYINMQ Prediction of potential genes in microbial genomes Time: Thu May 19 23:37:59 2011 Seq name: gi|224461297|gb|ACDC01000105.1| Fusobacterium sp. 2_1_31 cont1.105, whole genome shotgun sequence Length of sequence - 7510 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 502 618 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 2 1 Op 2 . - CDS 513 - 1481 275 ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B 3 1 Op 3 . - CDS 1500 - 3347 2141 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism 4 1 Op 4 . - CDS 3365 - 4540 1679 ## gi|237739877|ref|ZP_04570358.1| predicted protein 5 1 Op 5 1/1.000 - CDS 4533 - 5768 1712 ## COG0285 Folylpolyglutamate synthase 6 1 Op 6 . - CDS 5793 - 6494 992 ## COG0775 Nucleoside phosphorylase - Prom 6520 - 6579 13.2 + Prom 6528 - 6587 9.6 7 2 Tu 1 . 
+ CDS 6611 - 7498 938 ## COG1560 Lauroyl/myristoyl acyltransferase Predicted protein(s) >gi|224461297|gb|ACDC01000105.1| GENE 1 1 - 502 618 167 aa, chain - ## HITS:1 COG:FN1313 KEGG:ns NR:ns ## COG: FN1313 COG4166 # Protein_GI_number: 19704648 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 167 1 168 474 236 73.0 2e-62 MKKIKILVILILSLLLISCGEKEETVEKVEQIFYTAMPKQEYNLNPQSYTGNERALITQI FEGLTELKDEGTRYVEVLNIEHSDDFKEWIFTLRDDLKWSDNQKITAETYLESWLNTLEN SNSDEIYRMFVIKGAEDFAKKKVDRSSVGIKAQENKLIVTLNSSVKN >gi|224461297|gb|ACDC01000105.1| GENE 2 513 - 1481 275 322 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 5 321 1 308 311 110 29 3e-24 MKEFIEKFKEIKAIIEENQNIILTAHVNPDGDAVGSGLGLFLTLKENYKDKNIRFVLQDS IPYTTKFLKGSEEIETYNSEEKYSTDLLIFLDSATRERTGETGRNIEAKLTINIDHHMSN PSYGDVNCVITYSSSTSEIVYHFIKYMGYPISLATAEALYLGLVNDTGNFSHSNVKVETM MMATDLISLGVNNNYIVTNFLNSNSYQTLKMLGDALTKFEFYPEKKLSYYYLDQTTMQKY GAKKEDTEGVVEKILSYYEASVSLFLREEADGKIKGSMRSKYETNVNKIAALFGGGGHYK AAGFSSDLSPKEILDIVLKNLD >gi|224461297|gb|ACDC01000105.1| GENE 3 1500 - 3347 2141 615 aa, chain - ## HITS:1 COG:FN1012 KEGG:ns NR:ns ## COG: FN1012 COG1493 # Protein_GI_number: 19704347 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Fusobacterium nucleatum # 1 615 1 615 615 1060 88.0 0 MYTYTTVREIADSLNLEILNEGNLDLKIDIPNIYQIGYELVGFLDKDSDELNRYINICSL KESRFMATFSKERKEKVISEYMALGFPALIFSKDAIIAEEFYYYAKKYNKNILLSNEKAS VTVRKLKFFLSRALSVEEEYEDYSLMEIHGVGVLMTGYSNARKGVMIELLERGHRMITDK NLIIRRVGENDLLGYNGKKKVKLGHFYLEDIENGSVDVTDHFGVKSTRIEKKINILIVLE EWKEKEFYDRLGLDTQYETFVGEKIQKFVIPVRKGRNLAVIIETAALSFRLKRMGHNTPL EFLNKSQEIIQKKKKEREENMNTNSLAVTKLINEFDLEVKYGRDKVTSTYIKSSNVYRPS LSLIGFFDLIEEVSNIGIQIFSKIEFNFLEKLCPTERINNLKKFLSFDIPMIVLTEDANA 
PDYFFELVQKSGHILAIAPYKKSSQIIANFNNYLDSFFSETISVHGVLVELFGFGVLLTG KSGIGKSETALELIHRGHRLIADDMVKFYRDTQGDIVGKSAELPFFMEIRGLGIIDIKTL YGMSSVRLSKRLDMIIELKALDNSDYMSAPTTHLYEDVLGKPIKKRILEISSGRNAAAMV EVMVMDYMSGLLGQK >gi|224461297|gb|ACDC01000105.1| GENE 4 3365 - 4540 1679 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237739877|ref|ZP_04570358.1| ## NR: gi|237739877|ref|ZP_04570358.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 391 1 391 391 241 100.0 5e-62 MDKKKNTKSSNREKPVNKKQEASKKKESSNANNTKNKVNTNTNPKVKPKVKKKTSINFQK FLNFIVFLVFIIFTFFMYKKVSNQEKLEQAMVENTTKQVLAAMDLRNNEFYGGTKKEIKK EEKVQEIKEEKKLEEDVVETPKELPKETTIEKKTEAPKEEKAPVVNDKKELKAEAKTEEK KTVAKVEETKKEEKKVEKPKPKETTTEKKTEVPKEEKAPVVNDKKEVKAEAKTEEKKSTV KAEETKKEEKKVEKPKPKEATSEKKAEISKEEKATKKVEEAKKIEEGRETVKKVMLEKEA KKEVKKEEKKPEEKAKTEEKKSTVKAEEAKKEVKKEEKATPKKVEETKKEVKKEEIKTIK TKKEPAEHLSNEQVKVKLNKEIKEVEGTYTP >gi|224461297|gb|ACDC01000105.1| GENE 5 4533 - 5768 1712 411 aa, chain - ## HITS:1 COG:FN1014 KEGG:ns NR:ns ## COG: FN1014 COG0285 # Protein_GI_number: 19704349 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Fusobacterium nucleatum # 1 411 5 415 415 712 91.0 0 MNIDALLEELYAYSMFSIRLGLDNIKEICEHLGNPQNSYKVIHITGTNGKGSVSTTVERI LIDAGYKVGKYTSPHILEFNERISFDDKYISNEDVAKYYEKVKKIIEEYKIQATFFEVTT AMMFDYFKDMKAEYVILEAGMGGRYDATNICNNTVSVITNVSLEHTEYLGDTIYKIATEK AGIIKNCPYTIFADNNPDVKKAIEEVTDKYVNVLDKYKDSTYKLDFNTFTTNININGNIY EYSLFGDYQYKNFLCAYEVVKYLGIDENIIKEAIKKVVWQCRFEVFSKNPLVIFDGAHNA AGVEELIKIVKQHFSKDEVTVLVSILKDKDRVSMFRKLNEISSSIILTSIPDNPRASTAR ELYDYVENKKDFEYEEDPIKAYNLALSKKRKLTICCGSFYILIKLKEGLNG >gi|224461297|gb|ACDC01000105.1| GENE 6 5793 - 6494 992 233 aa, chain - ## HITS:1 COG:FN1015 KEGG:ns NR:ns ## COG: FN1015 COG0775 # Protein_GI_number: 19704350 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Fusobacterium nucleatum # 1 233 5 237 237 379 89.0 1e-105 MKIGIIGAMHEEIVELKSSMTDINEIEISNLKFYEGKLCSKDVVLVESGIGKVNAAISTT 
LLISNFKVDKIIFTGVAGAVNPDIKVTDIVIATDLVESDMDVTAGGNYKLGEIPRMKSSN FKADPYLFTLADSVATKLFGSEKVHKGRIISRDEFVASSEKVKKLREIFEAECVEMEGAA VAHVCEVLNIPFIVLRSISDKADDEAGMTFDEFVKIAAKNSKSIVEGILSIIK >gi|224461297|gb|ACDC01000105.1| GENE 7 6611 - 7498 938 295 aa, chain + ## HITS:1 COG:FN1016 KEGG:ns NR:ns ## COG: FN1016 COG1560 # Protein_GI_number: 19704351 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Fusobacterium nucleatum # 73 294 1 222 226 379 88.0 1e-105 MYFIQYIVARFFIFLLLLLPEKLRFKFGDFLGNLTYKLIKSRRMTALMNLKMAFPEKSDE EIEKIARKSFRIMIKAFLCSLWFDKYLKNPKNIKIINQESMLNACKKDKGVMAATMHMGN MEASTVCTGENKIITVAKKQRNPYINAYITKLRGKANYMEVIEKNERTSRVLISKLREKK VIALFSDHRDKGAIVNFFGKETKAPSGAVSMALKFDLPFLLVYNTFNDDNTITIYVTDEI ELKKTGNFKEDVQNNVQYLINIMEDVIRKHPEQWMWFHDRWNSFREYKRSLKNKK Prediction of potential genes in microbial genomes Time: Thu May 19 23:38:28 2011 Seq name: gi|224461296|gb|ACDC01000106.1| Fusobacterium sp. 2_1_31 cont1.106, whole genome shotgun sequence Length of sequence - 1876 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 28 - 1722 1761 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 1753 - 1812 8.7 Predicted protein(s) >gi|224461296|gb|ACDC01000106.1| GENE 1 28 - 1722 1761 564 aa, chain - ## HITS:1 COG:FN1080 KEGG:ns NR:ns ## COG: FN1080 COG1132 # Protein_GI_number: 19704415 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Fusobacterium nucleatum # 1 563 1 563 564 959 89.0 0 MFKKFISYYKPHKKMFFLDLLAAFLISICDLFYPILTRSILYDFIPNRKLKTIFLFLFIL ALIYIFKMLSNYFVGYYGHIVGVKIQADMRRDLFKHIQNMPISYFDKNQTGDIMSRIVND LVDISELAHHGPEDVFISGVLVLGSFFYLINLNPLLTCIVFIFIPILALLTIFLRKRMMR AFAETRTTVGAINANLSNSISGIRVSKSFNNSKFEFKKFEEGNSKYIIARKAAYFWLAVF QGGVYYIIDTLYLVMLLSGTLFTYYNKITVVDFVTYMLFVNLLITPIKRLINSVEQFQNG MSGFRRFYEVITVPQEEEGKIEVGKLNGDIVFDEVTFRYEENENVFENFSLNIKAGTNVA LVGESGVGKSTICHLIPRFYEILSGKITIDDIDIKDMTLSSLRKNIGIVSQDVFLFTGTI KENIAYGKLDATDEEIYKAAKYANIHDYIMTLEKGYDTQVGERGIRLSGGQKQRISIARV FLANPPILILDEATSALDSITERNIQKSLDELSEGRTTLVVAHRLTTVRKADVIIVITKD GIAEMGNHDELMKLQGIYYKLNQV Prediction of potential genes in microbial genomes Time: Thu May 19 23:38:35 2011 Seq name: gi|224461295|gb|ACDC01000107.1| Fusobacterium sp. 2_1_31 cont1.107, whole genome shotgun sequence Length of sequence - 22037 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 8, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 542 656 ## COG0386 Glutathione peroxidase - Prom 574 - 633 7.5 2 1 Op 2 12/0.000 - CDS 637 - 1359 359 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 1 Op 3 2/0.000 - CDS 1359 - 2900 1996 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 4 1 Op 4 . - CDS 2900 - 3343 582 ## COG1846 Transcriptional regulators - Prom 3390 - 3449 12.1 - Term 3423 - 3462 4.3 5 2 Tu 1 . 
- CDS 3504 - 3596 63 ## - Prom 3646 - 3705 6.4 - Term 3784 - 3827 10.5 6 3 Op 1 . - CDS 3834 - 6497 3758 ## COG0525 Valyl-tRNA synthetase 7 3 Op 2 . - CDS 6525 - 7436 1126 ## CCC13826_0034 hypothetical protein 8 3 Op 3 . - CDS 7451 - 8290 946 ## CCC13826_0034 hypothetical protein - Prom 8378 - 8437 10.6 + Prom 8339 - 8398 13.0 9 4 Op 1 1/0.000 + CDS 8446 - 9090 756 ## COG2039 Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) 10 4 Op 2 1/0.000 + CDS 9116 - 10681 2117 ## COG0038 Chloride channel protein EriC + Prom 10742 - 10801 4.6 11 4 Op 3 1/0.000 + CDS 10852 - 12204 1403 ## COG0534 Na+-driven multidrug efflux pump 12 4 Op 4 17/0.000 + CDS 12219 - 13565 1125 ## COG0168 Trk-type K+ transport systems, membrane components 13 4 Op 5 1/0.000 + CDS 13575 - 14231 898 ## COG0569 K+ transport systems, NAD-binding component 14 4 Op 6 24/0.000 + CDS 14240 - 16141 2621 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 15 4 Op 7 . + CDS 16143 - 16841 953 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Prom 16910 - 16969 9.6 16 5 Tu 1 . + CDS 16992 - 17426 679 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) + Term 17436 - 17466 4.3 - Term 17424 - 17454 4.3 17 6 Tu 1 . - CDS 17464 - 17820 319 ## FN0762 hypothetical protein - Prom 17844 - 17903 16.7 + Prom 17747 - 17806 9.3 18 7 Op 1 . + CDS 17967 - 18569 600 ## FN0764 amino acid transporter LysE 19 7 Op 2 1/0.000 + CDS 18581 - 19669 1515 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 20 7 Op 3 . + CDS 19727 - 20137 585 ## COG1970 Large-conductance mechanosensitive channel + Term 20147 - 20194 8.1 + Prom 20174 - 20233 7.2 21 8 Op 1 1/0.000 + CDS 20296 - 20844 681 ## COG0344 Predicted membrane protein 22 8 Op 2 . 
+ CDS 20862 - 22007 1398 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) Predicted protein(s) >gi|224461295|gb|ACDC01000107.1| GENE 1 2 - 542 656 180 aa, chain - ## HITS:1 COG:FN2007 KEGG:ns NR:ns ## COG: FN2007 COG0386 # Protein_GI_number: 19705303 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Fusobacterium nucleatum # 1 179 17 195 199 316 91.0 2e-86 MKIYDFTVKNRKGEDVSLENFKGKVLLIVNTATRCGFTPQYDELEALYSKYNKDGFEVLD FPCNQFGNQAPESDDEIHTFCQLNYKVKFDQFAKVEVNGENAIPLFKYLKEQKGFTGFDP KHKLTSILNDMLSKNDPDFAKKPDIKWNFTKFLVDKSGNVVARFEPTTGAEEIEKEIKKY >gi|224461295|gb|ACDC01000107.1| GENE 2 637 - 1359 359 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 235 1 239 245 142 33 1e-33 MIEFKNISKSYGKQEIIKNFNLTIECGTFLTIIGSSGSGKTTILKMINGLIKADKGEVLI NNKNIQDEDLIELRRKIGYVIQGNILFPHLTVFDNIAYVLNLKKYNKKEIEKIVNEKMDM LNLSRDLKDRLPDELSGGQQQRVGIARALAANPDIILMDEPFGAVDAITRYQLQKDLKEL HKKTEATIVFITHDITEALKLGTKVLVLDKGEIQQYDMPKNICSNPKNEFVKQLLKMAEM >gi|224461295|gb|ACDC01000107.1| GENE 3 1359 - 2900 1996 513 aa, chain - ## HITS:1 COG:FN2009_2 KEGG:ns NR:ns ## COG: FN2009_2 COG1732 # Protein_GI_number: 19705305 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Fusobacterium nucleatum # 207 513 1 307 307 564 94.0 1e-160 MVNQLMKLLTEDFKFFTNLTIEHVLISLLAISIASVLGIILGIIISEYRRFSGLILGTVN ILYTIPSIALLGFFITITGVGNTTALIALIIYALLPIIRSTYTGIVNINPLIIEASEGMG STKLQQLFKVKLPLALPVLMSGIRNMVTMTIALAGIASFVGAGGLGVAIYRGITTNNSAM TFLGSLLIALLALIFDFILGFMEKRLTNYKRTKYKVNFKFIILGLFIIIFGAYFSLNSKK DKTINIATKPMTEGYILGQMLTELIEQDTDLKVNMTTGVGGGTSNIQPAMVKGEFDLYPE YTGTSWEAVLKKEGSYDESKFDELQKEYKEKYNLEYVNLYGFNNTYGLAVNKDIAEKYNL KTYSDLAKVSNNLIFGAEYDFFEREDGYKELQKVYNMNFKKQIDMDIGLKYQAMKDKKID 
VMVIFTTDGQLAISDVVVLEDDKKMYPSYRAGTVVRSEILSEYPELKPVLEKLNNILDDK TMADLNYQVESEGKKPEDVAREYLQEKGLLEAK >gi|224461295|gb|ACDC01000107.1| GENE 4 2900 - 3343 582 147 aa, chain - ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 144 1 144 160 206 85.0 9e-54 MQRLGGFLITKIKQLHSRALAQCISDKGIDAFSGEQGKILFVLWRKDKITQKELATETGL AKNTITIMLEKMEKNNLIRKITDENDKRKSLVILTDYAKSLKKPFDEISDEMLKKVYKGF SEEEIDKYEEYLHRIIRNLEEKEESDR >gi|224461295|gb|ACDC01000107.1| GENE 5 3504 - 3596 63 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKLPIIALIIIVLLILLSKSVYIVFNFNF >gi|224461295|gb|ACDC01000107.1| GENE 6 3834 - 6497 3758 887 aa, chain - ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 887 1 887 887 1749 97.0 0 MNELDKNYSPNEIEEKWYKTWEESKFFAASLSSEKENYSIVIPPPNVTGILHMGHVLNNS IQDTLIRYNRMRGKNTLWMPGCDHAGIATQNKVERKLAEEGLKKEDIGREKFLEMTWDWK EKYGGIITQQLRKLGASLDWDRERFTMDEGLSYAVRKIFNDLYHDGLIYQGEYMVNWCPS CGTALADDEVDHEEKDGHLWQIKYPVKDSDEYIIIATSRPETMLADVAVAVHPEDERYKH LIGKTLILPLVNREIPVIADEYVDKEFGTGALKITPAHDPNDYNLGKKYNLPVINMLTPD GKIVNDYPKYAGLDRFEARKKIVEDLKEQGFFIKTEHLHHAVGQCYRCQTVIEPRVSPQW FVKMKPLAEKALEVVRNGEIKILPKRMEKIYYNWLENIRDWCISRQIWWGHRIPAWYGPD RHVFVAMDEAEAKEQAKKHYGHDVELSQEEDVLDTWFSSALWPFSTMGWPEKTKELDLFY PTNTLVTGADIIFFWVARMIMFGMYELKKIPFKNVFFHGIVRDEIGRKMSKSLGNSPDPL DLIKEFGVDAIRFSMIYNTSQGQDVHFSTDLLGMGRNFANKIWNAARFVIMNLEGFDVKS VDKTKLDYELVDKWIISRLNETAKDVEDCLEKFELDNAAKAVYEFLRGDFCDWYVEIAKI RLYNDDEDKKISKLTAQYMLWTILEQGLRLLHPFMPFITEEIWQKIKVDGETIMLQQYPV ADNNLIDVKIEKSFEYIKEVVSSLRNIRAEKGISPAKPAKVVVSTSNSEELETLEKNELF IKKLANLEELTCGANLEAPAQSSLRVAGNSSVHMILTGLLNNEAEIKKINEQLAKLEKEL EPVNRKLSDEKFTSKAPQHIIDRELRIQKEYLDKIEKLKESLKSFEE >gi|224461295|gb|ACDC01000107.1| GENE 7 6525 - 7436 1126 303 aa, chain - ## HITS:1 COG:no 
KEGG:CCC13826_0034 NR:ns ## KEGG: CCC13826_0034 # Name: not_defined # Def: hypothetical protein # Organism: C.concisus # Pathway: not_defined # 3 302 17 298 299 112 31.0 2e-23 MKKYFKLFMLMLLVFSYSYAGVMPETEWAKRGLKGKVKSMIKTEYGYENSGKIKFTSLVK TEFNEKGYITRESFTRDGVEYKIVQYQFDKNGFIARRIEEVPQASINNYKYSYKYSKDGN LIEKAELVERVRGYYPMYDIITYNKLGKEINELKYVEGKLEGDVSTFYNERGDATEVKNN LNPDYPYILIYYDYYKDGGYEKTVDGSGRRSFVVVDKNGFQRELAYVLFFGSRNPVVQLD IYEKNINEKRDKYGNITEFISVRYDVLENNKAKAEDIYKQLREQKIKKIGVSGKVEITYE YYN >gi|224461295|gb|ACDC01000107.1| GENE 8 7451 - 8290 946 279 aa, chain - ## HITS:1 COG:no KEGG:CCC13826_0034 NR:ns ## KEGG: CCC13826_0034 # Name: not_defined # Def: hypothetical protein # Organism: C.concisus # Pathway: not_defined # 3 279 17 299 299 191 44.0 2e-47 MKKYFKLFILMLLVFSYSYSGVMPETDWKIKNLKGKVKSMVKTEYEYDSSGKVEKTWVTE TYFNEQGYITDEVQYVDNRLNQSIIYKNNSDGLPIKKDEVSRVYSYKYEETKDGNLLVTI KEEYVDKKHFPSLEKITYNKNGKKVHHLVYSGEELITNDTYIYNKKGNLIEIKDNTFPEN SMKITYNYKANGDSEKTTEVATAKWTYLYDKNGNEQEYISMIKQGSQGKTKISIYLKFKD IARDEHGNLTRSTSVRYDYSKKKESSVYKKLENKYEYYK >gi|224461295|gb|ACDC01000107.1| GENE 9 8446 - 9090 756 214 aa, chain + ## HITS:1 COG:FN1728 KEGG:ns NR:ns ## COG: FN1728 COG2039 # Protein_GI_number: 19705049 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) # Organism: Fusobacterium nucleatum # 1 214 1 214 214 349 85.0 2e-96 MKKILVTGFDPFGGEKVNPALEVIKLLPKKIGENEVRILEIPTVYKKSVEKIEKEIESYK PDYVLSIGQAGGRASISIERVAINIDDFRIKDNEGNQPIDENIFEDGENAYFSTLPIKSI QEELSKNNIPSSISNTAGTFVCNHVFYGVRYLIEKKYKGIKSGFVHIPYIPEQVIGKANT PSMSLDNILKGIIIIIETIFNVETDIKKSGGTIC >gi|224461295|gb|ACDC01000107.1| GENE 10 9116 - 10681 2117 521 aa, chain + ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 1 521 1 521 521 855 87.0 0 
MNDAKSMVEKLYKGNGKLYLACLCVGLITGAIVSCYRWGLGKIGIIRREYFSEVNLNNPM ALLKAWALFIGIGLIVNYLFKKFPKTSGSGIPQVKGLILGRIDYKNWFFELISKFVAGVL GIGAGLSLGREGPSVQLGSYVGYGVSKLFKKDTVERNYLLTSGSSAGLSGAFGAPLAGVM FSIEEIHKYLSGKLLICAFVSSIAADFVGRRVFGVQTSFDIVIKYPLDINPYFQFFLYII FGVIIAFFGKLFTVSLVKFQDIFNGVKLPREIKVCFVMTVSFILCFVLPEVTGGGHDLVE SLIHQKAIIYTLIIIFIAKLFFTSISYATGFAGGIFLPMLVLGAIIGKIFGECLDLFAAT GADFTVHWIVLGMAAYFVAVVRAPITGVILILEMTGSFHLLLALTTVSVVSFYVTELLGQ QPVYDILYDRMKKDDNLVDEENQEKVTIELPVMAESLLDGKAISEIIWPEEVLIIAIIRN GVEKIPKGRTVMMAGDILVLLLPEKIVGEVKESLMKHTSVE >gi|224461295|gb|ACDC01000107.1| GENE 11 10852 - 12204 1403 450 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 450 8 457 457 714 90.0 0 METESITKLLIKFSIPAIVGMFVNALYNVVDRIYIGNIKGTGYLGITGVGLVFPVVILIF AFSLLIGIGSAASVSLKLGMKDREEAERFLGVAVFLSLVISAILMIIIYFNMDRIIYFIG GSKETFSYAKNYLFYINLGVPAAILGLVLNSVIRSDGSPKIAMGTLLIGAITNIVLDPIF IFMFGMGVKGAAIATIISQYVSMIWTIHYFMSKRSKIKLIKKDIRYDFYKSKEICLLGSS AFAIQIGFSLVTYILNTVLKKYGGDTSIGAMAIVQSFMTFMAMPIFGINQGIQPILGYNY GAKKYKRVKEALYKGIFAATIICLIGYTSVRLFSDSLIHIFTNKPELKEIAKYGLKAYTL VFPIVGFQIVSSIYFQAVGKPKMSFFISLSRQIIVMIPCLIILPKFFGLNGIWYAAPTAD SIATLITFILVRREIKKLDKLEEMLEKRDV >gi|224461295|gb|ACDC01000107.1| GENE 12 12219 - 13565 1125 448 aa, chain + ## HITS:1 COG:FN1725 KEGG:ns NR:ns ## COG: FN1725 COG0168 # Protein_GI_number: 19705046 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 448 1 448 448 617 87.0 1e-176 MQKLSLLKKWNNLSPYRKLIFGFLVAIFIGVILLKMPFSLRENQNISVLDSLFTIVSAIC VTGLSVVDVSQVFTSTGQLIILFFIQLGGLGVMTVSIIVFLLVGKKMSFETRELLKEERN SNSNGGITKFIKQLLLTVFVIEISGALILTYGFSKYYSLKRSLFYGLFHSVSAFCNAGFS LFTNNLEIFKYDRLINLTISFLIILGGIGFVTINSLVIIKKKKLQNLSITSKFALLITFF LLSIGTILFLVFEYNNLSTLKNMNFIDKLINSFFQSVTLRTAGFNTVPLGNIRPATIFIS 
YIFMFIGASPGSTGGGIKTTTFGVLILYALGVLKRKEYVEVFKRRIDWELINKALAIVII SLFYIIVITTIILSIESFPTEKIIYEVLSAFSTTGLSMGITAGLGIISKLILVITMFIGR LGPMTVALAFTSNKRSSIKYPKEEILIG >gi|224461295|gb|ACDC01000107.1| GENE 13 13575 - 14231 898 218 aa, chain + ## HITS:1 COG:FN1724 KEGG:ns NR:ns ## COG: FN1724 COG0569 # Protein_GI_number: 19705045 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 218 1 218 218 331 85.0 7e-91 MKQYLVIGLGRFGTGVAKTLYEAEKNVLAIDIDEELVQEKIDANILKNAVIGDPSDEKVL KDIGAENFDVAFICIADIEASVMIALNLKELGVKTIIAKAVNKKHGKILTKVGATEIVYP EEHMGKRIAELIIDTDIKEHLKFSDNFVLVEVKAPSIFWNNSLINLDVRNKYNINIVGIK KAQKEFIPNPTANVIIEEGDILVIITDKKTVESFNKLI >gi|224461295|gb|ACDC01000107.1| GENE 14 14240 - 16141 2621 633 aa, chain + ## HITS:1 COG:FN1723 KEGG:ns NR:ns ## COG: FN1723 COG0445 # Protein_GI_number: 19705044 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 1 633 1 633 633 1137 92.0 0 MQEFDIIVVGAGHAGCEAALASARMGMKTAVFTISLDNIGVMSCNPSLGGPAKSHLAREI DALGGEMGRNIDKTFIQIRVLNTKKGPAVRSLRAQADKMTYANEMKKTLEHTDNLTVIQG MVSELVVEEENGKKVIKGIKIREGLEYRAKIVILATGTFLRGLIHIGEVNFSAGRMGELS SEELPLSLEKVGLKLGRFKTGTPARIDGRTIDFSVLEEQPGDKSQVLKFSNRTTDEEALS RRQISCYIAHTNDKVHEIIKNARERSPMFNGKIQGLGPRYCPSIEDKVFRYPDKNQHHLF LEREGYETNEIYLGGMSSSLPVDVQEEMIRNVKGFENAKVMRYAYAIEYDYVPPEEIKYT LESRTVENLFLAGQINGTSGYEEAGAQGLMAGINAVRKLRNEEAIILDRADSYIGTLIDD LVSKGTNEPYRMFTARSEYRLYLREDNADLRLSKLGYELGLIPEEEYQRVEKKRRDVELI TEILTKTNVGPSNLRVNETLLKRGENPIKDGSTLLELLRRPEVTFEDIVYISEEIKGVDL KGYDHDTSYQVEITVKYQGYINRALKMIEKHKSMENKKIPADIDYDALKTIPKEAKDKLK RIKPINIGQASRISGVSPADIQAILIYLKMRGN >gi|224461295|gb|ACDC01000107.1| GENE 15 16143 - 16841 953 232 aa, chain + ## HITS:1 COG:FN1722 KEGG:ns NR:ns ## COG: FN1722 COG0357 # Protein_GI_number: 19705043 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted 
S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Fusobacterium nucleatum # 1 232 1 232 232 358 92.0 6e-99 MKEYFKEGLEKIKVSYDENKIEKALKYLEILLDYNSHTNLTAIREEKAIIEKHFLDSLLL QNLLKEEDKTLIDIGTGAGFPGMMLAIFNEDKNFTLLDSVRKKTDFLELVKNELALNNVE IINGRAEEIIKDKREKYDVGLCRGVSNLSVILEYEIPFLKVNGRFLPQKMTGTDEIENSS NALKVLNSKIIKEYNFKLPFSNEDRLIIEILKTKSTDKKYPRKTGIPLKKPL >gi|224461295|gb|ACDC01000107.1| GENE 16 16992 - 17426 679 144 aa, chain + ## HITS:1 COG:FN1079 KEGG:ns NR:ns ## COG: FN1079 COG0783 # Protein_GI_number: 19704414 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Fusobacterium nucleatum # 1 144 1 144 144 236 90.0 1e-62 MKNKENLNRYLSNLAILVTKTHNLHWNVVGARFKAIHEYTESLYDYYFEKFDDVAEIFKM KGEYPLAKVADYLKHATVKELDVKDFTIPEVVASIKEDMELMLADARKIREVANEEDDFL VANMMEDHIEYFVKQLWFIQAMSK >gi|224461295|gb|ACDC01000107.1| GENE 17 17464 - 17820 319 118 aa, chain - ## HITS:1 COG:no KEGG:FN0762 NR:ns ## KEGG: FN0762 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 118 1 118 118 134 83.0 8e-31 MTELEVKIIKFLLSSAVYSENAIMKNFGIDRETLDKSFKILEDNGYLESYEEYMKRESLN EEGDCCKTIKNSSCSSCSSCSSHSCSSGSSCCDHNIFSDLEDFSKIKVITMKAVDNFS >gi|224461295|gb|ACDC01000107.1| GENE 18 17967 - 18569 600 200 aa, chain + ## HITS:1 COG:no KEGG:FN0764 NR:ns ## KEGG: FN0764 # Name: not_defined # Def: amino acid transporter LysE # Organism: F.nucleatum # Pathway: not_defined # 48 198 1 148 150 149 68.0 4e-35 MGFILSLPFGPVGIYCMELTIIEGRWKGYITALGMVTIDMVYSTVALLFLSGVKEYIVKY ENYLSLIIGLFLLIISLRKLLTKVELKDINVDFKSMLQNYLTGAGFAIVNISSILVIATV FTVLKVLDDGNTSPPTTYMEAILGVGLGGTSLWFLTTYVMSHFRRLFGKEKLIKIIKIAN VTIFILALAIIFYTIKKITS >gi|224461295|gb|ACDC01000107.1| GENE 19 18581 - 19669 1515 362 aa, chain + ## HITS:1 COG:FN0765 KEGG:ns NR:ns ## COG: FN0765 COG0482 # Protein_GI_number: 19704100 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted 
tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 1 361 1 361 362 637 89.0 0 MIDVKNVAEEFSKYIEFDSDKKGIKVGVAMSGGVDSSTVAYLLKQQGYDIFGVTMKTFKD EDSDAKKVCDDLGIEHYVLDVRDEFKEKVMDYFVNEYMNGRTPNPCMVCNRHIKFGKMLD FILSKGASFMATGHYTKLKNGLLSVGDDSNKDQVYFLSQIQKDRLSKIIFPVGDLEKPKL RELAKQIGVRVYSKKDSQEICFIDDGKLKEFLIENTKGKAEKPGNIVDKNGKILGKHKGF SFYTIGQRKGLGISSEEPLYVLAFDKDNNNIIVGENEDLFKDELVATRLNLFSVPSLESL DNLECFAKTRSRDILHKCVLKKNGDNFQVKFIDNKVRAITPGQGIVFYNNDGNVIAGGFI ES >gi|224461295|gb|ACDC01000107.1| GENE 20 19727 - 20137 585 136 aa, chain + ## HITS:1 COG:FN0766 KEGG:ns NR:ns ## COG: FN0766 COG1970 # Protein_GI_number: 19704101 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Fusobacterium nucleatum # 1 136 7 142 142 199 83.0 1e-51 MKLVDEFKAFVMRGNVVDLAVGVIIGAAFGKIVTSLVNDIFMPIIGMIVGNVDFTTLEIK IGEPVEGVEQAAIKYGMFIQEIINFLIIALCIFMFIKLIAKIQKKKDEEPAPAPEPTKEE VLLTEIRDALKKMSDK >gi|224461295|gb|ACDC01000107.1| GENE 21 20296 - 20844 681 182 aa, chain + ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 182 13 194 194 256 82.0 2e-68 MGAIPSGVWLGKIFKNIDVRDYGSKNSGATNSYRVLGAKLGTTVLIMDVLKGFLPLYIAS KFDLEYNDLVLIGLVAILAHTYSCFISFRGGKGVATSLGVFLFLIPTITLILLAIFMLIV YFTRYISLGSITAAFLLPIFTFFSDKGSYLFVLSLIIGIFVIYRHRSNISRLLSGTESKF KF >gi|224461295|gb|ACDC01000107.1| GENE 22 20862 - 22007 1398 381 aa, chain + ## HITS:1 COG:FN0536 KEGG:ns NR:ns ## COG: FN0536 COG0592 # Protein_GI_number: 19703871 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Fusobacterium nucleatum # 1 381 1 381 381 545 85.0 1e-155 MHIKVNRQNFLTAVRIVEKSIKDNKIKPILSCVYAKVKDNKVYFTGTNLDTTIKTSIDVN EVIREGEVAFSPSIIDEYLKEIKDEFVVLRVENGNILFIETEDSTTEYDVFTTEDYPNTF 
ENINLNENNFKFEMPSQELVEIFEKVLFSADTPDNIAMNCIRIESNNKTLNFVSTNTYRL TYLKKDVEKEINDFAVSVPADTISSIVKIVKGLDNELIKIYKEDAHLYFKYKETTIITKL IELRFPNYVDILSNITYDKKLSINNEKFTNLLKRVLIFSRSNMESKYSSTYQFKHGDNGE SKLIISALNDIARINEELNISFEGEDLKISLNSKYLLEFIQNIPKEKELVLEFMYANSAV KVYEKDNEDYIYILMPLALRD Prediction of potential genes in microbial genomes Time: Thu May 19 23:38:57 2011 Seq name: gi|224461294|gb|ACDC01000108.1| Fusobacterium sp. 2_1_31 cont1.108, whole genome shotgun sequence Length of sequence - 12800 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 49 - 627 910 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 2 1 Op 2 . - CDS 652 - 1080 253 ## FN0534 hypothetical protein - Prom 1317 - 1376 9.7 + Prom 1209 - 1268 9.5 3 2 Op 1 . + CDS 1425 - 2225 939 ## FN1933 hypothetical protein 4 2 Op 2 1/0.000 + CDS 2234 - 2806 850 ## COG0237 Dephospho-CoA kinase 5 2 Op 3 1/0.000 + CDS 2803 - 4998 2406 ## COG0826 Collagenase and related proteases 6 2 Op 4 1/0.000 + CDS 4998 - 5507 734 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 7 2 Op 5 1/0.000 + CDS 5518 - 6714 1722 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 8 2 Op 6 1/0.000 + CDS 6733 - 7287 780 ## COG1396 Predicted transcriptional regulators 9 2 Op 7 . + CDS 7299 - 8735 2040 ## COG1461 Predicted kinase related to dihydroxyacetone kinase 10 2 Op 8 . + CDS 8753 - 9829 1282 ## COG1307 Uncharacterized protein conserved in bacteria 11 2 Op 9 1/0.000 + CDS 9908 - 10657 1341 ## COG0217 Uncharacterized conserved protein + Term 10668 - 10708 8.6 12 2 Op 10 . 
+ CDS 10716 - 12788 2600 ## COG1200 RecG-like helicase Predicted protein(s) >gi|224461294|gb|ACDC01000108.1| GENE 1 49 - 627 910 192 aa, chain - ## HITS:1 COG:FN0535 KEGG:ns NR:ns ## COG: FN0535 COG1611 # Protein_GI_number: 19703870 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Fusobacterium nucleatum # 1 192 1 192 192 362 89.0 1e-100 MKKKNVTVYCGASFGVDKSYQDITRKLGEWIGKNNYNLVYGGGRSGLMGLIADSVLENGG KVTGIITHFLSEREIAHDGITKLIKVDTMSERKKKMADLADIFIALPGGPGTLEEITEVV SWAVLALHPCPCIFFNYDNYYNHIRAFYDLMVEKGYMKKEAREKLFFTDSFEEMEKFIAT YVPPKAREYHGE >gi|224461294|gb|ACDC01000108.1| GENE 2 652 - 1080 253 142 aa, chain - ## HITS:1 COG:no KEGG:FN0534 NR:ns ## KEGG: FN0534 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 142 1 142 142 171 70.0 6e-42 MDINGFVLLARETATLYRKLFPTWAIYSLPDGLWLFSAGAVFLIARKRFFLHVIWFFFIY LFVILGEFVQKFFGGHGTPVGTFDKSDIVAFTYAYISINVVAIILRLFQNKDKYIFKNSK EILENICYTIIISIVGLLANMF >gi|224461294|gb|ACDC01000108.1| GENE 3 1425 - 2225 939 266 aa, chain + ## HITS:1 COG:no KEGG:FN1933 NR:ns ## KEGG: FN1933 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 28 266 109 347 347 295 70.0 8e-79 MNYKKYLLVFLSIFILIACKKLPENNTRSSYQKVKAMKENTFREKVAKYLAYAKILNDDK ELRATVLAERDRMEAKIEKTMDYRIIGGNDELLNIMIAENYRGYIEHNPFTKTKDNPDTI LEITAKLVEYSPNKIEEKKEQNDFEEIYTDENGKEATNKVKYVKTHFFKSSYAKVEVKYK LISTLTGEVILQGNKIVDNEEYTYWDTYDVVSGTLKDKSQFYRNDREEELSNKEKFFRET AVKILKEINLELKKLPDYDIWKLYIS >gi|224461294|gb|ACDC01000108.1| GENE 4 2234 - 2806 850 190 aa, chain + ## HITS:1 COG:FN1932 KEGG:ns NR:ns ## COG: FN1932 COG0237 # Protein_GI_number: 19705237 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Fusobacterium nucleatum # 1 190 4 193 193 251 82.0 6e-67 MIVGLTGGIASGKSTVSKYLAEKGFKVYDADKIAKDISEKKLVQEEIILNFGDKILTEDG KVDRKKLKEIVFADKDKLKKLNAIIHPKVIDFYRELKEKNTDETIIFDVPLLFESGIDKF 
CDKILVVISDYDVQLSRIIERDNIDKELASKIIKSQISNEERIKKADIVIENNTSLEELY EKVERFCEKI >gi|224461294|gb|ACDC01000108.1| GENE 5 2803 - 4998 2406 731 aa, chain + ## HITS:1 COG:FN1931 KEGG:ns NR:ns ## COG: FN1931 COG0826 # Protein_GI_number: 19705236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 1 731 1 720 720 1165 89.0 0 MKIVAPAGNMERFYSAISATADEIYLGLKGFGARRNAENFTVEELKKAIDYAHLRGSRIF LTLNTIMTNREIELLYPTLKELYNYGLDAIIVQDLGYAEYLHKNFPSIEIHGSTQMTVAN HYEINYLKELGFKRIVLPRELSFEEIKEIRENTDMELEIFVSGSLCISFSGNCYMSSFIG GRSGNRGMCAQPCRKEYKTSCGEKSYFLSPKDQLYGFDEIKKLQEIGVESIKVEGRMKDV SYVYETVSYFRSLINGIDKEENTHKLFNRGYSKGYFYNNDKAIMNRDYSYNMGEKIGEVL GKNIRLDEDIVSGDGVTFVSKDYKNLGGTYIGKINVVNIKEDRKIAYKNEKLIFNFPEGT KYIFRNYNKRLNDEILKKLKNTDKKLEVNFDFTAKLNEKLNLKIYLEDENGNRILNLEEI SETLTQKAQKRAISEEDIKEKLSEIGDSEFTVKNIEVDIDEDIFIPLSELKNLKRTAIEK FREEILSYFRRDLDSELKANNQEYFKLEIEKDEPKDVEIRVIVSNDEQRSFLEKVKDEYN ISEIYDRTYDIAKQSKLSQHNLDNKLASNLYELLENKNSSVMLNWNMNIVNSYTISVLER IKNLESFIVSPEINFAKIRELGKTRLKKALLVYSKLKGMTIDVDIAENKDEVITNKENDR FNIIRNEYGTEIFLDKPLNIINIEEDIKKLNVDIIVLEFTTETIDEIKKVLKQLKTRKGE YREYNYKRGVY >gi|224461294|gb|ACDC01000108.1| GENE 6 4998 - 5507 734 169 aa, chain + ## HITS:1 COG:FN1930 KEGG:ns NR:ns ## COG: FN1930 COG1267 # Protein_GI_number: 19705235 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Fusobacterium nucleatum # 4 169 6 171 171 266 93.0 1e-71 MSGHNHKLIKNLATCFGLGEMSFMPGTFGTLGGIPIFLFLTYIKRFFINVMVYNSFYLIF LVTFFAIAVYVSDICEKEIFKKEDPQAVVIDEVLGFLTTLFLINPVGVKATLIAMGLAFV IFRILDITKIGPIYKSQNFGNGVGVVLDDFLAGIIGNFILVIIWTKFFY >gi|224461294|gb|ACDC01000108.1| GENE 7 5518 - 6714 1722 398 aa, chain + ## HITS:1 COG:FN1929_1 KEGG:ns NR:ns ## COG: FN1929_1 COG1058 # Protein_GI_number: 19705234 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # 
Organism: Fusobacterium nucleatum # 1 237 1 237 237 373 90.0 1e-103 MKAGIFLVGTELLNGATIDTNSIYIAEELNKYGIEIEFKMTVRDVMDEIVKALKYAKKNV DLVILTGGLGPTDDDITKEAMAKFLKKKLIIDEKEKAELLKKYKSYGNLNKTNFKEIEKP EGAISFKNDVGMAPAVYVDGLVAFPGFPNELKNMFPKFLKHYVKENNLKTQIYIKDIITY GIGESTLENTVKDLFTEEGIFYEFLVKDYGTLIRLQTSSENKKNVEKIVKKLYNRISEFI IGEDTDRLENSIYECLNSGKKPLTISTAESCTGGMIASKLIEVPGISENFMESIVSYSNE AKIKRLKVKKETLEKYGAVSEEVAREMLAGLKTDVAISTTGIAGPGGGTKEKPVGLVYIG IRVKDEVKIFRRELKGDRNKIRQRAMMHALYNLLKILK >gi|224461294|gb|ACDC01000108.1| GENE 8 6733 - 7287 780 184 aa, chain + ## HITS:1 COG:FN1928 KEGG:ns NR:ns ## COG: FN1928 COG1396 # Protein_GI_number: 19705233 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 184 1 184 184 308 92.0 5e-84 MTIGEKLKKSRNDKGMSLRELATKVDLSASFLSQIEQGKASPSIENLKKIAHTLDVRVAY LIEDEEDDIRNIEYVKAANVRYIESIDSNIKMGILLASNKEKNMEPIIYEIGVDGESGRD YYSHGNSEEFIYILEGELEVYVANKKYKLAKGDSLYFKSSLNHRFKNTSKKEVKALWVVS PPTF >gi|224461294|gb|ACDC01000108.1| GENE 9 7299 - 8735 2040 478 aa, chain + ## HITS:1 COG:FN1927_1 KEGG:ns NR:ns ## COG: FN1927_1 COG1461 # Protein_GI_number: 19705232 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 478 1 481 560 766 87.0 0 MKIEIKVLTPLRLTKLFIAASRWLLKYADVLNDLNVYPVPDGDTGTNMSMTLQSVENALI GLQTEPKMEELVDIISEAVLLGARGNSGTILSQIIQGFLDEVRDTEEITIPKAARAFVSA KERAYMAVSQPVEGTILTVIRKVSEAAIAYEGPKDNFIPFLVHLKNAAAEAVDDTPNLLP KLKEAGVVDAGGKGIFYVLEGFEKSVTDPEMLKDLARIANSQVNRKQKLEYVNKNEIKFK YCTEFIIESGDFDLEEYKEKIQQLGDSMVVAQTRKKTKTHIHTNHPGQVLEIAGALGNLN NMKIENMEIQHNHVLVKEEELNGGKALVVEEEETVKLLFNEKNIENNVAVYAVVDNKNIA ELFLKDGAAATLIGGQTKNPSVADIEDGLKKISAKTIYILPNNKNIIASAKLAAQRDKRD IIVIDTKTMLEGYYFTKNRKMNLQSLLRQLKFNNSIEITKAVRDTKVNNIEIKVGEII >gi|224461294|gb|ACDC01000108.1| GENE 10 8753 - 9829 1282 358 aa, chain + ## HITS:1 COG:FN1927_2 KEGG:ns NR:ns ## COG: FN1927_2 COG1307 # Protein_GI_number: 19705232 # Func_class: S Function unknown # 
Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 74 358 1 285 285 492 90.0 1e-139 MTERAETLEDLIKIVCDKYINDKTLSLTVVKGKTATEEANKIITAKNLKKFYMYDGEQDN YSYYIYLEQRDPSLSKIAILTDSASDLTHEMIEGLDITIIPVRLRIGENNYKDGVDITKK EFWHKLITEKVVPKTAQPSPAEFRDYYEELFNKGYEKIISIHMSSKMSGTQQVAKVAREM IKREKDIIIVDSKSVTFGQAYQVLEAAKMAKEDAKLETILARLYEIADKMKVYFAVSDLT YLEKGGRIGRASSMIGSLLKLRPVLKIEDGEVTLETKTFGERGAISYMEKIIKNEGKNSI YLYTAWGGTNQELQSTDILKKTADTMRKIEYKGRFEIGATIGSHSGPVFGIGIISKIR >gi|224461294|gb|ACDC01000108.1| GENE 11 9908 - 10657 1341 249 aa, chain + ## HITS:1 COG:FN1661 KEGG:ns NR:ns ## COG: FN1661 COG0217 # Protein_GI_number: 19704982 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 249 1 249 249 437 95.0 1e-123 MSGHSKWNNIQHRKGAQDKKRAKLFTKFGRELTIAAKEGGSDPNFNPRLRLAIEKAKAGN MPKDILERAIKKGSGELEGVDFTEMRYEGYGPAGTAFIVEAVTDNKNRTASEMRMTFSRK DGNLGADGAVSWMFKKKGIITVKSEGIDTDEFMMAALEAGAEDVEESDGVFEVTTEYTEF QTVLENLKNAGYQYEEAEITMIPENTVEITDLETAKKVMALYDALEDLDDSQNVYSNFDI SDEILEQLD >gi|224461294|gb|ACDC01000108.1| GENE 12 10716 - 12788 2600 690 aa, chain + ## HITS:1 COG:FN1660 KEGG:ns NR:ns ## COG: FN1660 COG1200 # Protein_GI_number: 19704981 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Fusobacterium nucleatum # 1 690 1 689 689 1064 83.0 0 MIEAYKKMYTKLEDLPSKYITAKQVVNLKSLGIDTIYDLIYYFPRAYDNRSNVKNIGDLT FNEYVVVKASVMSVLNMPNRSGKKIVKAMITDGTGIMEVLWFGMPYISKSLKVGEEYIFI GQTKKSNLFQFINPEYKLYKGQEKETAKEILPIYSSNKSITQNNLRKIIKKFLENFLKYF EENIPNDLVKGYKEIFERTQAIKNIHFPESVQAIEAANLRFATEELLILELGILKNRFII DSLNTKKYEIEGKKEKVKKFLELLPFELTRAQKKVIKEIYDEISDGKIVNRLVQGDVGSG KTAVATVMLIYMAENGYQGALMAPTEILANQHYLGMKERLEKIGLRVGLLTSSIKGKKKT EILEAIANGDIDIVIGTHSLIEDNVVFKKLGLIVIDEQHRFGVNQRNKLREKGFLGNLLV MTATPIPRSLALSIYGDLDLSIIDELPPGRTPIKTKWIANDKDLSIMYDFIYKKVNSGNQ AYFVAPLIETSDKMALKSVDKVSEEIERRFSDKKIGIIHGKMKAKEKDEVMLKFKNKEYD ILIATTVIEVGIDVPASTIMTIYNAERFGLSALHQLRGRVGRGSKQSYCFLISESTTENS 
KQRLSIMEKTEDGFVIAEEDLKLRNSGEIFGLRQSGFSDLKFIDIIYDSKTIKDVRDLCI AYLKKNKGKIKNEFLKYDIERKFSDLQSGN Prediction of potential genes in microbial genomes Time: Thu May 19 23:39:05 2011 Seq name: gi|224461293|gb|ACDC01000109.1| Fusobacterium sp. 2_1_31 cont1.109, whole genome shotgun sequence Length of sequence - 2209 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 17 - 994 1357 ## COG0180 Tryptophanyl-tRNA synthetase - Prom 1028 - 1087 9.3 - Term 1033 - 1079 4.2 2 1 Op 2 . - CDS 1112 - 2176 1215 ## COG0787 Alanine racemase Predicted protein(s) >gi|224461293|gb|ACDC01000109.1| GENE 1 17 - 994 1357 325 aa, chain - ## HITS:1 COG:FN0405 KEGG:ns NR:ns ## COG: FN0405 COG0180 # Protein_GI_number: 19703747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 325 1 325 325 602 94.0 1e-172 MKRSLSGIQPSGILHLGNYFGAMKQFVDLQSDYDGFYFIADYHSLTSLTSPETLRENTYN IVLDYLAIGLDPSKSTIFLQSNVPEHTELTWLLSNITPIGLLERGHSYKDKTAKGIPANT GLLTYPVLMAADILIYDSDVVPVGKDQKQHLEMTRDIAMKFNQQYGVEFFKLPEPLILDD SAIVPGTDGQKMSKSYNNTINMFVTKKKLKEQVMSIVTDSTPLEEPKNPDNNIAKLYALF NNIDKQNELKEKFLAGNFGYGHAKTELLNSILEYFAVAREKREELVKDMDYVKDVLNEGS KKARAIAIEKIQKAKEIVGLVGNIY >gi|224461293|gb|ACDC01000109.1| GENE 2 1112 - 2176 1215 354 aa, chain - ## HITS:1 COG:FN0406 KEGG:ns NR:ns ## COG: FN0406 COG0787 # Protein_GI_number: 19703748 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Fusobacterium nucleatum # 1 354 1 354 354 592 84.0 1e-169 MRTWVEIDKENLKYNILKLKELADNREVLGVVKANAYGLGSVEIAKILQEVGVNFFGLAN LEEAIELQDAGIKANFLILGASFEDELVEATKRGIHTAISSIQQLRFLVENNLNPNIHLK FDTGMTRLGFEVDEAEEVINFCKTNNLNLVGIFTHLSDSDGNTIDTKNFTLEQIEKFKNI VKGLDLKYIHISNSAGITNFHENILGNLVRAGIAMYSFTGNKKTPCLKNVFTIKSKVLFT KKVNKDSFVSYGRHYTLPADSTYAVIPIGYADGLKKYLTKGGYVLINNYRCEIIGNICMD 
MTMVRIPKELEKTIKISDEVTVINADIIDNLNIPELCVWEFMTGIGRRVKRIIV Prediction of potential genes in microbial genomes Time: Thu May 19 23:39:08 2011 Seq name: gi|224461292|gb|ACDC01000110.1| Fusobacterium sp. 2_1_31 cont1.110, whole genome shotgun sequence Length of sequence - 8328 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 5, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 33 - 78 2.5 1 1 Tu 1 . - CDS 94 - 615 544 ## FN0407 hypothetical protein - Prom 637 - 696 13.6 + Prom 582 - 641 10.5 2 2 Op 1 . + CDS 681 - 773 72 ## 3 2 Op 2 10/0.000 + CDS 847 - 1761 1275 ## COG0777 Acetyl-CoA carboxylase beta subunit 4 2 Op 3 5/0.000 + CDS 1774 - 2715 1518 ## COG0825 Acetyl-CoA carboxylase alpha subunit 5 2 Op 4 1/0.000 + CDS 2726 - 3694 1555 ## COG0205 6-phosphofructokinase 6 2 Op 5 1/0.000 + CDS 3713 - 4009 457 ## COG2926 Uncharacterized protein conserved in bacteria 7 2 Op 6 . + CDS 4020 - 4613 725 ## COG0353 Recombinational DNA repair protein (RecF pathway) 8 2 Op 7 . + CDS 4642 - 5250 665 ## FN0429 hypothetical protein + Term 5254 - 5294 4.1 - Term 5257 - 5312 0.1 9 3 Tu 1 . - CDS 5353 - 6807 1473 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 6827 - 6886 15.7 + Prom 6833 - 6892 9.5 10 4 Tu 1 . + CDS 6992 - 7342 574 ## PROTEIN SUPPORTED gi|237739925|ref|ZP_04570406.1| LSU ribosomal protein L19P + Term 7349 - 7390 4.0 - Term 7338 - 7378 5.1 11 5 Tu 1 . 
- CDS 7383 - 8156 834 ## COG0796 Glutamate racemase - Prom 8181 - 8240 14.1 Predicted protein(s) >gi|224461292|gb|ACDC01000110.1| GENE 1 94 - 615 544 173 aa, chain - ## HITS:1 COG:no KEGG:FN0407 NR:ns ## KEGG: FN0407 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 171 1 171 174 214 70.0 1e-54 MNKNKIFLFLFSLLTLTACSSIDSYIPSFLTEGSTPAAIQEAVASRVNPDKELYSVASSQ LSKSGSTLAQSRANKSASESLRRKVKSEVEAQLRGYLEDMDAFSKSIVNPAFSDLANYST DLSMKKSTQKGAWEDSEKVYSLLTVDRSEVMKITDTVFKDFIKTASKNLGNVK >gi|224461292|gb|ACDC01000110.1| GENE 2 681 - 773 72 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKGEEGRKVEGFTHINKEKSNKFRNKNKNI >gi|224461292|gb|ACDC01000110.1| GENE 3 847 - 1761 1275 304 aa, chain + ## HITS:1 COG:FN0408 KEGG:ns NR:ns ## COG: FN0408 COG0777 # Protein_GI_number: 19703750 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Fusobacterium nucleatum # 1 304 1 304 304 519 88.0 1e-147 MSIFKDLVKNLGLTNITQSKKKYVTVSENNSEEEKEKAKYKVKNIDNLKEEEITKCPTCG VLSHKSEIKENLKKCPNCNHYFNMSARERIELLIDEGTFKEEDSNLTAGNPIDFPEYTEK HEKAERDSGMKEGVISGLGEINGLKVSIACMDFNFMGGSMGSVVGEKITAALERAIEHKI PAVVVAISGGARMQEGLFSLMQMAKTSAAAKKMRLAGLPFISVPVNPTTGGVTASFAMLG DIIISEPNARIGFAGPRVIEQTIRQKLPENFQKSEFLQECGMVDIIAKREDLKETIFKVL NNII >gi|224461292|gb|ACDC01000110.1| GENE 4 1774 - 2715 1518 313 aa, chain + ## HITS:1 COG:FN0409 KEGG:ns NR:ns ## COG: FN0409 COG0825 # Protein_GI_number: 19703751 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Fusobacterium nucleatum # 1 313 1 313 313 544 92.0 1e-155 MQFEFQIDELEHKIEELKKFSEEKEVDLTEEINKLKDQRDIALKVLYEDLTDYQRVIVSR HPERPYTLDYIENITTDFIELHGDRLFRDDPAIVGGLCKIDGKRFMIIGHQKGRTMQEKV FRNFGMANPEGYRKALRLYEMAERFRIPILTFIDTPGAYPGLEAEKHGQGEAIARNLMEM SGIKTPIISVVIGEGGSGGALGLGVADKVFMLENSVYSVISPEGCAAILYKDPSRVEEAA NNLKLSSQSLLKVGLIDGIIDEALGGAHRGPKETAINLKRVVLETLEELEKLPLDELVEK RYEKFRQMGVFNR >gi|224461292|gb|ACDC01000110.1| GENE 5 2726 - 3694 1555 322 aa, 
chain + ## HITS:1 COG:FN0410 KEGG:ns NR:ns ## COG: FN0410 COG0205 # Protein_GI_number: 19703752 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Fusobacterium nucleatum # 1 322 8 329 329 597 95.0 1e-170 MEKKLAILTSGGDAPGMNAAIRATAKIAESYGFEVYGIRRGYLGMLNDEIFPMTGRFVSG IIDKGGTVLLTARCEEFKEARFREIAANNLRKKGINYLVVIGGDGSYRGANLLFKEHGIK VVGIPGTIDNDICGTDFTLGFDTCLNTILDAMSKIRDTATSHERTILVQVMGRRAGDLAL HACIAGGGDGIMIPEMDNPIEMLALQLKERRKNGKLHDIVLVAEGVGNVFDIEEKLRGHI NSEIRSVVLGHIQRGGTPSGRDRVLASRMAAKAVEVLNKGEAGVMVGIEKNEMVTHPLEQ ACSVDRRKSIEKDYDLAILLSR >gi|224461292|gb|ACDC01000110.1| GENE 6 3713 - 4009 457 98 aa, chain + ## HITS:1 COG:FN0411 KEGG:ns NR:ns ## COG: FN0411 COG2926 # Protein_GI_number: 19703753 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 98 1 98 98 103 91.0 8e-23 MDTVLELVRKERRKNQIKREIEDNDRKIRDNRKRVELLLNLKDYLKESMSYSEIIDIIEN MESDYEDRVDDYIIKNAELGKERREISKTIKEFKKSLS >gi|224461292|gb|ACDC01000110.1| GENE 7 4020 - 4613 725 197 aa, chain + ## HITS:1 COG:FN0412 KEGG:ns NR:ns ## COG: FN0412 COG0353 # Protein_GI_number: 19703754 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Fusobacterium nucleatum # 1 197 1 197 197 361 92.0 1e-100 MPTKSLERLILEFNKLPGVGQKSATRYAFHILNQSEEDVKNFAEALLAVKDNVKRCSICG NYCESDICNICSDNTRNHNIICVVEESKDIMILEKTTKYRGVYHVLNGRLDPLNGITPNE LNIKSLIERLGKEDIEEIILATNPNIEGETTAMYLAKLIKNFGIKITKLASGIPMGGNLE FSDTATISRALDDRVEI >gi|224461292|gb|ACDC01000110.1| GENE 8 4642 - 5250 665 202 aa, chain + ## HITS:1 COG:no KEGG:FN0429 NR:ns ## KEGG: FN0429 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 96 201 1 106 107 143 75.0 4e-33 MFTKRNIYEYDIHKANISCLLEMGEISQEQYDMLKDIPKDERSKFVAAFEKLEKDTAEDY RKYVAIALEKFKALNDIKEKDIIEIAFDAIWLDKEVSNLQVTENIRFICKRKASSILEIK KVKFYFNSADNTFFQRGLGQKESPWFEIIKEYMRLSELGDNKSLTEFINDFKEKYINKAL 
DEEFYKRLIPKMDNLKIIEILS >gi|224461292|gb|ACDC01000110.1| GENE 9 5353 - 6807 1473 484 aa, chain - ## HITS:1 COG:BS_rocR KEGG:ns NR:ns ## COG: BS_rocR COG3829 # Protein_GI_number: 16081087 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus subtilis # 9 482 11 456 461 366 43.0 1e-101 MNQDIYIKLLEELLANINEGIHFVDQKNITQIYNHNMEKIEGMDAKVIIGKNFRNIFKDI PEEESTLLKALKGISIKDNIQRYQNKNGKEIIALNTTLPIKVNGKILGALEISKDITKIK KLSDELIKLQTVNNEINENLSCCKKNKYSFDDIIGECPKIKRTIELAKKATESDATVFIY GETGTGKELLSQAIHYGSSRKDKPFIAINCATLPETLFESILFGTEKGGFTGATNKMGLF EQANGGTLLLDEINSIPIELQAKLLRVLQEKTVRRIGEVKDIPVDVRVISTTNENPKDII KNGKMRLDLYYRLNLIYLELSPLREREEDILLLSQKFLNYYNKKLNKNIKGLSKKVEEVF MQYLWPGNIRELENVIQSSTILTNEDFLTKEFLNINWDEVFFKKRVKKEEKEFFIKIAPD SDIKIDDNIDENDPNLLNNLMAKMEEKYIREAVDSYPYNLSKAAAYLDISRQALQYKMKK YNIK >gi|224461292|gb|ACDC01000110.1| GENE 10 6992 - 7342 574 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739925|ref|ZP_04570406.1| LSU ribosomal protein L19P [Fusobacterium sp. 2_1_31] # 1 116 1 116 116 225 100 8e-59 MKEKLIELVEKQYLRTDIPQFKAGDTIGVYYKVKEGNKERVQLFEGVVIRVNGGGVAKTF TVRKVTAGIGVERIIPVNSPNIDRIEVLKVGRVRRSKLYYLRGLSAKKARIKEIVK >gi|224461292|gb|ACDC01000110.1| GENE 11 7383 - 8156 834 257 aa, chain - ## HITS:1 COG:aq_325 KEGG:ns NR:ns ## COG: aq_325 COG0796 # Protein_GI_number: 15605845 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Aquifex aeolicus # 5 256 3 243 254 87 27.0 3e-17 MNKAIAVFDAGLGSYAIVEAIKKAYPKQDIIYFADRKSFPYGTKTTDELKTIIEDSVDFL LKKGASFIVLASNAPSITVLDKIKNKDNVIGIYPPLKNVINDKKKNTLIIGAKVMIDSFE LQEYIKKEVGDFQKQFHVENASPLIQLIESGDFINNIEKTENTIKNFIKTCEEKYGKLDS ITLSSTHLPWLSSYFQKIIPEAKLYDPADSLVKAIKNHTSIGSGEIHSIISESEKYPADE FLKILDILKIKLDYEII Prediction of potential genes in microbial genomes Time: Thu May 19 23:39:22 2011 Seq name: gi|224461291|gb|ACDC01000111.1| Fusobacterium sp. 
2_1_31 cont1.111, whole genome shotgun sequence Length of sequence - 13108 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 5, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 111 83 ## 2 1 Op 2 1/0.000 + CDS 153 - 728 817 ## COG0193 Peptidyl-tRNA hydrolase 3 1 Op 3 12/0.000 + CDS 804 - 2111 1888 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 4 1 Op 4 12/0.000 + CDS 2138 - 3082 1485 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 5 1 Op 5 13/0.000 + CDS 3072 - 3605 943 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 6 1 Op 6 3/0.000 + CDS 3605 - 4222 866 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 7 1 Op 7 12/0.000 + CDS 4219 - 4803 836 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 8 1 Op 8 . + CDS 4828 - 6081 2020 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB + Term 6099 - 6152 7.4 - Term 6090 - 6135 2.2 9 2 Tu 1 . - CDS 6146 - 6805 1027 ## COG2932 Predicted transcriptional regulator - Prom 6917 - 6976 14.5 + Prom 6716 - 6775 14.2 10 3 Tu 1 . + CDS 7004 - 7168 128 ## gi|237738943|ref|ZP_04569424.1| predicted protein - Term 7262 - 7312 2.1 11 4 Op 1 9/0.000 - CDS 7325 - 8050 583 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 12 4 Op 2 35/0.000 - CDS 8044 - 9387 1425 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 13 4 Op 3 1/0.000 - CDS 9371 - 9982 768 ## COG0512 Anthranilate/para-aminobenzoate synthases component II - Prom 10007 - 10066 9.8 14 4 Op 4 . - CDS 10154 - 10930 925 ## COG0253 Diaminopimelate epimerase - Prom 10960 - 11019 11.7 - Term 10982 - 11026 2.0 15 5 Op 1 . - CDS 11036 - 11617 605 ## FN1716 hypothetical protein 16 5 Op 2 . 
- CDS 11614 - 12819 1027 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 13048 - 13107 8.5 Predicted protein(s) >gi|224461291|gb|ACDC01000111.1| GENE 1 1 - 111 83 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLSLFLYYSKVRLNVIIFLKTEENYKKDYVNINNI >gi|224461291|gb|ACDC01000111.1| GENE 2 153 - 728 817 191 aa, chain + ## HITS:1 COG:FN1597 KEGG:ns NR:ns ## COG: FN1597 COG0193 # Protein_GI_number: 19704918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Fusobacterium nucleatum # 1 188 1 188 191 307 88.0 9e-84 MKVVIGLGNPGKKYEKTRHNIGFIVVDSLRKKFNLTDEREKFQALISEKNIDGEKVIFFK PQTFMNLSGNALIEIVNFYKLDPKKDIIVIYDDMSLDFGDIRIREKGSSGGHNGIKSIIS HIGEEFIRIKCGIGAKKEDAVEHVLGEFSQSEQKELVEFLEKLNECVIEMLTVHNLDRTM QKYNKKKEKLK >gi|224461291|gb|ACDC01000111.1| GENE 3 804 - 2111 1888 435 aa, chain + ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 435 7 441 441 805 94.0 0 MKFFGFRGGVHPPENKIQTEHLPIEKLESPNEIFVPLLQHIGAPLNPIVNVGDRVLKGQK IADAEGLAVPVHAPVSGTVTKIENRVYPLSGKVMTIFIENDKKEEWAELTKIANWETADK KELLDIIREKGIVGIGGATFPTHVKLNPPPNTKLDSLILNGAECEPYLNSDNRLMLENPS SIIEGIKIIKKILNVPDVYVGIEDNKPEAIESMRKAAEGTGINIVPLKTKYPQGGEKQLI KSILDRQVPSGQLPSAVGVVVQNTGTTAAIYEAVVNGKPLIEKVVTVTGKAIKNPKNLKV AIGTPFSYILDHCGINRDEMERLVMGGPMMGLAQMTEEATVVKGTSGLLALTNEEMRPYK TKACISCSKCVSACPMGLAPLMFDRLAAAKEYEAMAGHNLMDCIECGSCAYICPANRPLA EAIKTGKAKLRAKKK >gi|224461291|gb|ACDC01000111.1| GENE 4 2138 - 3082 1485 314 aa, chain + ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 1 314 1 314 314 539 90.0 1e-153 MSTILKTGPAPHIRTKETVESVMYDVVIALVPAFLMAIYSFGVRALILTSVSVLTCVVTE 
YLCQKALKRDIEAFDGSAILTGILFSFVVPAIMPLQYVVIGNIIAITLGKMVYGGLGHNI FNPALIGRAFVQASWPVAITTFAYDGKAGATVLDAMKRGLPLADSLLQNGDQYVNAFLGN MGGCLGETSSLALLIGGAYLIYKKQIDWKVPATMIGTVFILTWAFGADPIMQIFSGGLFL GAFFMATDMVTSPTTSKGRVVFAFGIGLLVSLIRMKGGYPEGTAYAILIMNGVVPLIDRY IRPKKFGGVSTNGK >gi|224461291|gb|ACDC01000111.1| GENE 5 3072 - 3605 943 177 aa, chain + ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 1 177 1 177 177 290 88.0 8e-79 MENRYIHFGIVLGLIAAISAGLLGGVNGFTSKVIAANTIKIVNEARKQVLPAAASFKEDE AKEAEGIQYIPGFNEAGEVVGYVASVAEPGYGGDINFVVGIDNDAKVTGLNVVTSSETPG LGAKINEKDWQDHWIGKDATYEFNKSTDAFAGATISPKAVYTGVIKALNTYQNEVSK >gi|224461291|gb|ACDC01000111.1| GENE 6 3605 - 4222 866 205 aa, chain + ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 1 191 1 191 205 308 97.0 5e-84 MKKLGILTAGIFKENPVFVLMLGLCPTLGVTSSAINGFSMGLAVIAVLACSNGLISLFKK FIPDEVRIPAFIMIIATLVTVVDMVMNAYTPDLYKVLGLFIPLIVVNCIVLGRAESFASK NGVIDSILDGIGSGIGFTLSLTFLGSIREILGNGSIFGISLVPANFTPALIFILAPGGFI TIGMIMACINIKKERDAKKKKVTKK >gi|224461291|gb|ACDC01000111.1| GENE 7 4219 - 4803 836 194 aa, chain + ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 1 194 1 194 194 279 96.0 2e-75 MSIGGLFSIIVTSIFINNIIFAKFLGCCPFMGVSKKVDSSLGMGMAVTFVITIASGVTWL AYRMILEPLGLGYLQTIAFILIIASLVQFVEMAIKKTSPSLYKALGVFLPLITTNCAVLG VAIINIQVGYNFIETIVNGFGVAVGFSLALLLLAGIRERLEFANTPKNFKGVPIAFITAG LLAMAFMGFSGMQI >gi|224461291|gb|ACDC01000111.1| GENE 8 4828 - 6081 2020 417 aa, chain + ## HITS:1 COG:FN1591 KEGG:ns NR:ns ## COG: 
FN1591 COG2878 # Protein_GI_number: 19704912 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Fusobacterium nucleatum # 1 369 1 381 385 454 78.0 1e-127 MEAIMMPVVVLGITGILMGLFLAYASKKFEVEVDPKVEAILAILPGANCGACGFPGCAGY ASGVALEGAKMTLCAPGGPKVIEKIGEIMGVAVEVPVKKKPVKKTVEKKVVAQTGDPISA SAEFIEKNKRMLNKFKDAFDAGDKEAYEKLENLAKTAGKDELLKYYEEIKTGKIIPDGSA PAAPTGDPISASAEFIEKNKRMLNKFKDAFDAKDKEAYEKLENLAKTAGKDELLKCFEEI KAGKIIASGSAPAAAAVKLEPITATKEFVEKNKRMLNKFKDAFDAKDKEAYEKLEGLAKS TGKDDLLKCFEEIKAGKVVPDPATMTDAPAPKAEDSKKQEASYCSVLGDGLCVPEQNEKA KEEIVKQAEPPKTAEELEKDKQAASYCSILGDGLCVPEENEQMVKQNLTQELDKEVK >gi|224461291|gb|ACDC01000111.1| GENE 9 6146 - 6805 1027 219 aa, chain - ## HITS:1 COG:FN1589 KEGG:ns NR:ns ## COG: FN1589 COG2932 # Protein_GI_number: 19704910 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 219 1 219 219 338 82.0 4e-93 MSFGTTLKKIRLKHKDSLRGLAKKINLHFTFVDKVEKGTAPISNNFIERIVEVYPDEEKI LKKEYLKENLPKVFSKDESIKILEDSEVLNLPVYGKASAGRGYLNMDKPDYYMPITKGDF SLNSFFVEITGNSMEPTLEDGEYALVDPNNTAYVKNKIYVVTYNDEGYIKRVEVKDKKKV ITLKSDNPDYDDIDISEEMQEYFKINGRVVEVISKKRVL >gi|224461291|gb|ACDC01000111.1| GENE 10 7004 - 7168 128 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738943|ref|ZP_04569424.1| ## NR: gi|237738943|ref|ZP_04569424.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 54 1 54 54 80 100.0 4e-14 MTIERVYINTYKKGDVIMHFLEFKRRFSLLNEEEKEFIYKLKLKDAIDFLRTIY >gi|224461291|gb|ACDC01000111.1| GENE 11 7325 - 8050 583 241 aa, chain - ## HITS:1 COG:FN1729 KEGG:ns NR:ns ## COG: FN1729 COG0115 # Protein_GI_number: 19705050 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Fusobacterium nucleatum # 1 237 1 237 249 343 78.0 1e-94 MLIELDDGFSFGLGLFETILLYKGEAVFLDEHLARINQSIIDLGLNIDKLEKDEVYRYLE TNKSELEHEVLKIVLTEKNRLFIKRAYTYTDEDYKRAFSLNISKVQRNESSIFTFHKTLN YADNIFEKKKSKKLGYDEPIFLNSRSLVTEGATSNIFIIIDNKIYTPKLDSGLLNGIIRQ YIISNYPVIETDIDLEFLNKADEIFLTNSLFGIMPVSSLENKKLKSQKISREILSKYLDQ K >gi|224461291|gb|ACDC01000111.1| GENE 12 8044 - 9387 1425 447 aa, chain - ## HITS:1 COG:FN1730 KEGG:ns NR:ns ## COG: FN1730 COG0147 # Protein_GI_number: 19705051 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Fusobacterium nucleatum # 1 447 1 453 453 687 85.0 0 MQIELKKLEKYIDIYDIFRILKKENNNKIAFLDSSLKNKYGRYSIIGIDPYLELKENNKK FYINGMLSEENFEEYLAKFLKENKQENNSILPLISGGIVYFSYDYGRKFENIATRHKKDL DIPEAIVTFYKTYIVEDIEKQEIYISYQDKKDYDNLVNLLEKTNLEKENLVKKDSLANFK SNFEKEEYLKAIKSTIDYIIEGDIYIMNLTQRLMIESQKSPLEVFSYLRKFNPAPFSAYL DFQDFQLVSASPERFIKMKDRLIETRPIKGTRKRGATEEEDLALKNELANSEKDKSELLM IVDLERNDLNRICELKSVVVDELFEVETYSTVFHLVSTIRGKLRKDYDFVDLIRATFPGG SITGAPKIRAMEIIDELENSRRDAYTGSIGYISFNGDCDLNIVIRTAIHKDNKYYLGVGG GITCESELDFEYEETLQKAKAILEALC >gi|224461291|gb|ACDC01000111.1| GENE 13 9371 - 9982 768 203 aa, chain - ## HITS:1 COG:FN1731 KEGG:ns NR:ns ## COG: FN1731 COG0512 # Protein_GI_number: 19705052 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Fusobacterium nucleatum # 1 203 1 203 203 353 86.0 8e-98 
MFLMIDNYDSFVYNLVSYFLEENIEMEIIRNDLVDLKRIEDLIKQDKLEGIIISPGPKSP KDCGLCNEIVKNFYKQVPIFGVCLGHQIIGYTFGAEVKKGKSPVHGKVHKIKTSSSNIFK DLPKELNVTRYHSLVVEKEHLLEEFNIDAETEDGVLMALSHKKYPLYSVQFHPEAVLTEY GHEMLRNFLDLAREWRLKNANRA >gi|224461291|gb|ACDC01000111.1| GENE 14 10154 - 10930 925 258 aa, chain - ## HITS:1 COG:FN1732 KEGG:ns NR:ns ## COG: FN1732 COG0253 # Protein_GI_number: 19705053 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Fusobacterium nucleatum # 3 258 8 263 265 397 79.0 1e-111 MKLDFIKINPAGNITILIDNFNIYDKDIAKISEELMREDNLHAEQVGFIKDNHLQMMGGE FCGNASRAFASLLAFRDKTFSKQKIYKITCSGEDEVLAVDVREGQTENSFLAKIKMPKFK SLEELKIDNYKLGLVKFSGIGHFIFDIAKNKEDNFEKVIDSVKNYLSDKDFSAFGIMFFD RENLSMKPYVYVKEVESGIFENSCASGTTALGYYLKKYKNLDRAKIIQPNGWLEYIIEND EIYIDGSVEIVAEGKVYI >gi|224461291|gb|ACDC01000111.1| GENE 15 11036 - 11617 605 193 aa, chain - ## HITS:1 COG:no KEGG:FN1716 NR:ns ## KEGG: FN1716 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 189 1 189 189 253 68.0 2e-66 MKNFKIKIAILLSLILFTACSNIPGGPYEKKSDFTWRKVNEETFITNLQPGDIVIKEKEV NPIGMFGHVAIMINERTLFDYPKFGYKSYYIDINFWLEDGRDILVLRYKDMTDEFREKLI KNMKKYFGKSYSISSNRENTDAFYCSQYIWYVYYITAKEMGFELDLDSNGGNFVFPYDFI NSPYLEIVNIEKY >gi|224461291|gb|ACDC01000111.1| GENE 16 11614 - 12819 1027 401 aa, chain - ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 4 395 2 401 402 303 45.0 3e-82 MDYTSREKYINKIKALINKPVIKILTGIRRTGKSSLLHIIKDEILKDVADENKIYINFET SMFLGINNAYSLLEYLKSLLEDIEGKVYFFFDEIQIVDGWEQVISDLKLNRDCDFYLTSS NAKLISNTSLSEEYVEFEIQPFTFSEFKKTFENMELSKENLFYKFIQLGGLPFLKYFDLD ETPSFEYLNDIYNTVLVKDVLQYNNIRDVDLFNHIFSYVIANVGQSFSASSIKTCLKNKN KNISVDTILNYLEYCNVAFLIKKVPRYDVLSKKTLKIDEKYYLTDHGFRQAIGFPITQDI ERILENIVYIELLSRGYEVKVGKVKDKEINFIAKKEKDLSYYQISYKIRDEKTRERIFET YNLVTDNFPKYVLSMDHSNFSQDGVIHKNIIDFLLEDEGVK Prediction of 
potential genes in microbial genomes Time: Thu May 19 23:39:41 2011 Seq name: gi|224461290|gb|ACDC01000112.1| Fusobacterium sp. 2_1_31 cont1.112, whole genome shotgun sequence Length of sequence - 22996 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 6, operones - 5 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 29 - 208 295 ## gi|237738950|ref|ZP_04569431.1| predicted protein 2 1 Op 2 . - CDS 210 - 1157 1095 ## gi|237738951|ref|ZP_04569432.1| predicted protein - Prom 1236 - 1295 8.4 - Term 1285 - 1330 -0.9 3 2 Tu 1 . - CDS 1379 - 5905 3322 ## GYMC10_1051 hypothetical protein - Prom 5939 - 5998 7.9 4 3 Op 1 . - CDS 6013 - 7374 1662 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 5 3 Op 2 . - CDS 7376 - 7633 419 ## FN1712 hypothetical protein 6 3 Op 3 1/0.000 - CDS 7635 - 8597 1313 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis - Prom 8665 - 8724 9.8 7 4 Op 1 1/0.000 - CDS 8735 - 9013 273 ## COG0762 Predicted integral membrane protein 8 4 Op 2 1/0.000 - CDS 9027 - 9590 299 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 9 4 Op 3 . - CDS 9612 - 11711 1711 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase 10 4 Op 4 . - CDS 11781 - 12251 661 ## COG2606 Uncharacterized conserved protein - Prom 12398 - 12457 12.5 + Prom 12356 - 12415 13.5 11 5 Op 1 1/0.000 + CDS 12473 - 13849 2011 ## COG3493 Na+/citrate symporter 12 5 Op 2 1/0.000 + CDS 13928 - 15274 1985 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 13 5 Op 3 1/0.000 + CDS 15261 - 16109 752 ## COG1767 Triphosphoribosyl-dephospho-CoA synthetase 14 5 Op 4 6/0.000 + CDS 16120 - 16404 546 ## COG3052 Citrate lyase, gamma subunit 15 5 Op 5 6/0.000 + CDS 16413 - 17303 1375 ## COG2301 Citrate lyase beta subunit 16 5 Op 6 . 
+ CDS 17306 - 18856 2411 ## COG3051 Citrate lyase, alpha subunit + Term 18879 - 18928 9.2 + Prom 18895 - 18954 11.5 17 6 Op 1 . + CDS 19136 - 19684 559 ## gi|237738966|ref|ZP_04569447.1| predicted protein 18 6 Op 2 . + CDS 19710 - 19841 69 ## FN1599 hypothetical protein 19 6 Op 3 . + CDS 19866 - 20339 444 ## gi|237738968|ref|ZP_04569449.1| predicted protein 20 6 Op 4 . + CDS 20332 - 20892 351 ## gi|237738969|ref|ZP_04569450.1| predicted protein 21 6 Op 5 . + CDS 20933 - 21427 259 ## gi|237738970|ref|ZP_04569451.1| predicted protein 22 6 Op 6 . + CDS 21405 - 22064 556 ## gi|237738971|ref|ZP_04569452.1| predicted protein 23 6 Op 7 . + CDS 22065 - 22634 244 ## FN0289 hypothetical protein 24 6 Op 8 . + CDS 22666 - 22996 190 ## gi|262066001|ref|ZP_06025613.1| conserved hypothetical protein Predicted protein(s) >gi|224461290|gb|ACDC01000112.1| GENE 1 29 - 208 295 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738950|ref|ZP_04569431.1| ## NR: gi|237738950|ref|ZP_04569431.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 59 1 59 59 92 100.0 7e-18 MEDKICKKCGKSMDIEDSYDTCENCRTKDAENKRTIGMIALGIVTVISGIALKILKKND >gi|224461290|gb|ACDC01000112.1| GENE 2 210 - 1157 1095 315 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738951|ref|ZP_04569432.1| ## NR: gi|237738951|ref|ZP_04569432.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 315 19 333 333 536 100.0 1e-151 MNNDFIELNDVDILEVAKPLKDKMSLLLSNIKDKILLVPGFVRVVKSFIPIKTLQAVLTN EQKEKIASGILKIMSKNDGTLIANLVDPQSGKIITNIPLKEIELTPELNKAMTDFALQLQ LLQISAEIKSIKKAVEEVRKGQEYDRLATAYSCQQKFLQATLIKDIKLKKETLLRIALDA EDSRNFLMLSQKANIDFIKNLPDGYWKKMLSLTTSSEIDSRMNEIREGFSTINMVSLVEA LAYHELEEYGSEQQSLIYYADFIQKTYLDDSKLLKRLDSIDPSTERYWTTKVPIIETKIR KQKELYNKLELLEVK >gi|224461290|gb|ACDC01000112.1| GENE 3 1379 - 5905 3322 1508 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_1051 NR:ns ## KEGG: GYMC10_1051 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 1 1508 1 1516 1516 1261 43.0 0 MPFEIGGRADKRGNSYEIDCIIYEMLKVLDEKNYSICIEPLGTDEIGTDILVTNFDGQKE HQQCKARNASKEFWNISDLKEKNIFSTWKVHLDRDCFRKVALVSPVACTFLSDLHDRASN TNGKAEDFYSIQIMKSSKPFQEFYEKFCDEMGLNSNDNNDIFKSIDYLKRIYYKQISEYQ LKEMINQYIQFLFSTKVETVYNAFISLIVKEDIFGQEITQIILTNYFQKQEINFRLQDND ERISPRIQEINQEYRENFNPLLSGFVHRKVFDDCIEAITNEKSIIISGNAGYGKSGCTEA ILNYCEEEKIPHIAIKLDRRIPHKNCESWGHDLGFPSSIAYSIHRVSRNENAVIVLDQLD ALRWTQANSSEALTVCIELIRQVEYLNHERNKKIITVFVCRTYDLENDNNINSLFKKQDT LKNDWKTIRVDVFEDDEVKEIIGKNYENFLPKLKNLLRIPSNLYIWQHLDKEEVYEKCLT TYHLIDKWFKQICRKSTTAGLQERAINEVIKRIVDFLDKTGRLYIPKQNLNIEEAELDYL ISSEIIILQNNKVSFVHQSILDYFISKSMFEKYFNDSNQPIEEIIGEKNKQDPKRRYQIQ MFLQNMLELDSSDFLLIGEKMLTSNNIRYYVKHIFYEILRQIEEPDENITQFILTNYENN IFGKYLMNNTILGKKQYISILMAHGILEKWYLENKKNIVFNLLWSITPNFDDKIISFIKK YAFKNMNDDKQFMRCFIHDITEESEEMFELRMMFYEKYPEYIKEMFIDIKKIMKHFGKRL IQLISFCLKNKIETHNIYLSSYEELDLDNDFFIENTEFILNELLPYISKENLSNIKYSNW NEKHINKYGIERTTVKLVKKATIELINKSPQNFWTYYNPYLGKGYHIFNEIILTSLLYMP PQYSDQIIDYLITDFDKNIFDYTSGAEDELGLVEEVLKIHGNTCTKKRLLILEDKIYKYI SPYATEWYKQRIEQNKTKKSSPVYWSFWGDLQYKLLQCLPEKKVSKKTKGLLNVLHRRFY KVPLHYSNSNIHSGWVTSPVSGKNISKAQWLQIITNSKLKKQKRPKLVEVKDGFIESSYE AYARDFQIVVQQNPQEMIEIILKNKEYVLPIFIDSLFLGIELSEKLEIIDFTILEELFLE FPCDMKSYRASYFCGIVKKLKNVSWSPEIIQQLMNISLNHFNPELSSNIANQKDSCQTLR NNALNSVKGRAAMAIGHLLQENKDFFSQFKDIIEKMSTDKNPEVCFVAMYALYPSYNINK EWAEKKILHLYESDIRTASFFNSNNMLFHLYYSYKEKVIKIVENCFESEEQELIKIGGHT 
ICEFYIHFKEFEKIVLSIGDKSEEQIKSILEMAILYLDISKYKEIAQKIILAYKNTDIDL QFPLSKIFNKRYIDSIHDKEFLKELMESKVSRKLVRAFVDYLEENNNSIIFYKDIILQLC ENILQMKLEDLKQEWGLEDHISKLIISLYDKTINTDKQIADKCMDLWDIMFERQWRSVRE ISKKLMER >gi|224461290|gb|ACDC01000112.1| GENE 4 6013 - 7374 1662 453 aa, chain - ## HITS:1 COG:FN1713 KEGG:ns NR:ns ## COG: FN1713 COG2265 # Protein_GI_number: 19705034 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Fusobacterium nucleatum # 1 452 13 464 464 725 92.0 0 MLKVADIIQIKIDKIVFGGEGLGYYNGFAVFVPMSIPEDELEIEIISVKKTYARGLIKNI IKASPERIDSHKFTFEDFYGCDFAMLKYESQLKYKKLMVEEVMRKIAGLPDIEISDVLAS EDVYNYRNKIIEPFSVYGNKIITGFFKRKSHEVFEVDENILNSKLGNRIIKELKEILNKN KISVYNEITHKGLLRNVMIRTNSNNEAMLVLIINSNKITENIKNLLFRLREKIEEIKSIY ISLNSKKTNTVIGEKNIFIYGEESIKENLNGIEFHISPTSFFQINVKQAKRLYDIAINFF DNIDDKYIVDAYSGTGTIGMIMAKKAKKVYAIEIVKSASEDGERTAKENGIENIEFINGA VEKELVNLINANKRIDTIIFDPPRKGLEASIIDKVAELNLKEVVYISCNPSTFARDVKLF SEKGYVLKKLQAVDMFPQTSHIECVGLIEKVIY >gi|224461290|gb|ACDC01000112.1| GENE 5 7376 - 7633 419 85 aa, chain - ## HITS:1 COG:no KEGG:FN1712 NR:ns ## KEGG: FN1712 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 85 4 88 88 79 67.0 4e-14 MKYLALLTFIAVVFIWLFNIQTLREVTELEKQLKAANETLEELDKDLDKKIIYYDSKLDL DKIKRDMEAKGMKVTEEVVYFEIEE >gi|224461290|gb|ACDC01000112.1| GENE 6 7635 - 8597 1313 320 aa, chain - ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 7 320 1 314 314 540 89.0 1e-153 MKNGDFMEKIGNDYHIPVLYYETLDNLVINPDGVYIDCTLGGGSHSEGILERLSDKGLLL SIDQDSNAIEYSKKRLEKYASKWKVLKGNFENIDTLAYMAGIDKVDGILMDIGVSSKQLD EAERGFSYRYDVKLDMRMNTDQKLSAYDVVNTYSEEELSRIIFEYGEERFARKIAKLICE 
NRKIKPITTTFELVALIRRAYPERASKHPAKKTFQAIRIEVNRELEVLENAMSKAVELLK VGGRLGIITFHSLEDRIVKNKFKDLATACKCPKDIPICMCGGVKKFEIITRKPIIPVEDE LKNNNRAHSSKLRILERILD >gi|224461290|gb|ACDC01000112.1| GENE 7 8735 - 9013 273 92 aa, chain - ## HITS:1 COG:FN1710 KEGG:ns NR:ns ## COG: FN1710 COG0762 # Protein_GI_number: 19705031 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Fusobacterium nucleatum # 1 90 1 90 91 87 65.0 4e-18 MPLLTYSLITILDRMIWCVYILIMIRIFLSWIPMDNNFTDLIYNLTDPILKPFKDFLDKF IDLPIDFSPMLLVLTLEALQKILVKIIIALTW >gi|224461290|gb|ACDC01000112.1| GENE 8 9027 - 9590 299 187 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 180 484 665 904 119 36 1e-26 MNLPNRLTMIRFLLAIPFIMFLQASDSSKFGFIFRMISLVIFVVASLTDFFDGYIARKYN LITDFGKIMDPLADKILVISALVIFVQLEYIPGWMSIIVLAREFLISGIRILAAAKGEII AAGNLGKYKTTSQMLVIVIALAIGPIGFTLAGHFFTIAEVLMLIPVILTIWSGWEYTFKA KHYFLEQ >gi|224461290|gb|ACDC01000112.1| GENE 9 9612 - 11711 1711 699 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 
9-941] # 1 699 1 694 714 663 50 0.0 MFDEKIMELELAGRTLKVSTGKISRQSSGAIVIQYGDTVILSTANRSKEARKGADFFPLT VDYVEKFYSTGKFPGGFNKREGRPSTNATLIARLIDRPIRPMFPEGFNYDVHIVNTVLSY DEVNTPDYLGIIGSSLALMISDIPFLGPVAGVTVGYKNGEFILNPSPAELEESELDLSVA GTKDAVNMVEAGAKELDEETMLKAIMFAHDNIKKICEFQEVFAKLYGKENIEFTKEEVLP LVKDFIDTNGHKRLQEAVLTTGKKNREEAVDSLEEELLNKFIGENYPDVPEEELPEDVIT EFKTYYHDLMKKLVREAILYHKHRVDGRTTTEIRPLDAQINVLPIPHGSALFTRGETQSL AITTLGTKADEQLIDDLEKEYYKKFYLHYNFPPYSVGEVGRMGSPGRRELGHGSLAERAL RYVIPSEEEFPYTIRVVSEITESNGSSSQASICGGSLSLMSAGVPIKEHVAGIAMGLIKE GEEFTVLTDIMGLEDHLGDMDFKVAGTKSGITALQMDIKITGITEEIMRIALNQAHEARQ QILELMNNTISKPAELKSNVPRIQQITIPKDKIAVLIGPGGKNIKSIIDQTGATVDITDD GLVSVFAQDAEVLEKTLKLIDSFVREVEYNEVYEGRVVSIMKFGAFMEILPGKEGLLHIS EISPERVEKVEDVLSVGDVFKVRVISMEGGKISLSKKKV >gi|224461290|gb|ACDC01000112.1| GENE 10 11781 - 12251 661 156 aa, chain - ## HITS:1 COG:FN1373 KEGG:ns NR:ns ## COG: FN1373 COG2606 # Protein_GI_number: 19704708 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 156 1 156 162 234 83.0 6e-62 MKKTNAIRELETHKIEHIVREYEVDEDHLDALSVAIKTNEDITRIFKTLVLLNEKKEMIV ACIPGLEKLDLKKLAKISGSKKVEMLPMKDLFSMTGYLRGGCSPIAIKKRHTAFIHNSAT DNETILISGGLRGLQIEISPQKLIDYLNLKVADIIE >gi|224461290|gb|ACDC01000112.1| GENE 11 12473 - 13849 2011 458 aa, chain + ## HITS:1 COG:FN1375 KEGG:ns NR:ns ## COG: FN1375 COG3493 # Protein_GI_number: 19704710 # Func_class: C Energy production and conversion # Function: Na+/citrate symporter # Organism: Fusobacterium nucleatum # 1 458 1 454 454 718 84.0 0 MAKKNFKELFDPKESKWGGISLPMFLAALVVVVIMVYVPFGLDKEGNPGSFLRPNFLIMF SALAIFGLLFGEIGDRIPIWDEYVGGGTILVFFVAAVFGTYGLVPEKFMSAVDVFYNKQP VNFLEMFIPALIVGSVLTVDRKTLIKSISGYIPLIIVGVFGASVCGMAVGFIFGKTPQDV MMNYVLPIMGGGTGAGAIPMSEMWSSKTGRPASEWFAFAISILSIANVFAIITGALLKKL GEVKPNLTGNGELIIDNSKEAIRDKEVEVKPELTDTTAAFILTGILLMTAHILGELWSAF AKSHNIDFELHRLVFLILLTMFLNIANVVPDRLKAGAKRMQTFFSKHTIWILMAAVGFTT DVKEIGAALLPSNVLIALAIVFGAVGFIMLVARKMKFYPIEAAITAGLCMANRGGAGDVA VLGAADRMDLMSFAQISSRIGGAMMLVFGSVLFSIFAS 
>gi|224461290|gb|ACDC01000112.1| GENE 12 13928 - 15274 1985 448 aa, chain + ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 1 448 1 448 448 845 94.0 0 MNKIKIMETCLRDGHQSLMATRMTTAEMLPIIEKLDSVGYHSLEMWGGATFDAALRFLNE DPWERLREIKKRAKNTKLQMLLRGQNLLGYRNYPDDIVERFVKKSIQNGIDIVRIFDALN DVRNLKTACEATKKYGGHAQLAMSYTISPVHTIEYYKNLALEMQEIGADSIAIKDMSGIL LPEVAYELVKTLKSVLRVPVEVHTHATAGLASMTYLRAIEAGADIVDTAISPLSGGTSQP ATESLVRTLQGTERETGFDLNLLKEIAEYFKPIRAKYLQEGILNPQALMTEPSIVEYQLP GGMLSNFLSQLKMQKAEHKYEDVLREIPRVRADLGYPPLVTPLSQMVGTQAIFNILTGQR YKLIPNEIKNYVRGLYGKSPVPITDEIKNTIIGTEEVFTGRPADKLAPEYDKLVEESREF ARSEEDVLSYALFPQVAKDFLIKKYENE >gi|224461290|gb|ACDC01000112.1| GENE 13 15261 - 16109 752 282 aa, chain + ## HITS:1 COG:FN1377 KEGG:ns NR:ns ## COG: FN1377 COG1767 # Protein_GI_number: 19704712 # Func_class: H Coenzyme transport and metabolism # Function: Triphosphoribosyl-dephospho-CoA synthetase # Organism: Fusobacterium nucleatum # 1 278 1 278 279 390 75.0 1e-108 MKMNNKNIATLATKALLYEVSISPKAGLVSRLSNGSHKDMNFYTFIDSSLALHNYFLNCF DYGQEKLFSCPNFFKDLREVGKVAEKEMYEATKGINTHKGTIFSMGILLAVLGVYLKENK KIDLKVLSEKIKEMCKPLLNELEATKNISTYGEKAYKEYHFTGARGLAISGYEIVLLDGI NKLKDFCKTLDFETACILLLFYYMSVLDDTNIVNRASITTLKEVQILSKELFEENKKTLE KENIKNSMSKLNDIFIEKNISAGGSADLLILTIFIHFLTCEN >gi|224461290|gb|ACDC01000112.1| GENE 14 16120 - 16404 546 94 aa, chain + ## HITS:1 COG:FN1378 KEGG:ns NR:ns ## COG: FN1378 COG3052 # Protein_GI_number: 19704713 # Func_class: C Energy production and conversion # Function: Citrate lyase, gamma subunit # Organism: Fusobacterium nucleatum # 1 94 1 94 94 146 92.0 9e-36 MVLKTVGIAGTLESSDAMITVEPANEGGIVIDVSSSVKRQFGRQITETVLNTIKELGVEN ASVKVVDKGALNYALIARTKAAVYRAAESKEYKF >gi|224461290|gb|ACDC01000112.1| GENE 15 16413 - 17303 1375 296 aa, chain + ## HITS:1 COG:FN1379 KEGG:ns NR:ns ## COG: FN1379 COG2301 # Protein_GI_number: 19704714 # 
Func_class: G Carbohydrate transport and metabolism # Function: Citrate lyase beta subunit # Organism: Fusobacterium nucleatum # 1 296 1 296 296 534 92.0 1e-152 MAIRDRLRRTMMFLPGNNPSMITDAHIYKPDSIMIDLEDAVSVNQKDAARFLVSEALKAI DYKTTERVVRVNGLDTPFGADDIRAIVKAGVDVIRLPKTDNPDEIVAVDKLITEVEREIG KEGETLLMAAIESAAGIMNVKEIALASKRLMGIALGAEDYVTNLKTSRSKHGWELYYARE AIVLAARNAGIYCFDTVYSDVNNIEGFRDEVQFIKDLGFDGKSCIHPKQVRIVHEIYTPS QKEIEKSIRIINGAKEAEAKGSGVISVDGKMVDNPIIMRAQRVLDLAKASGIYKED >gi|224461290|gb|ACDC01000112.1| GENE 16 17306 - 18856 2411 516 aa, chain + ## HITS:1 COG:FN1380 KEGG:ns NR:ns ## COG: FN1380 COG3051 # Protein_GI_number: 19704715 # Func_class: C Energy production and conversion # Function: Citrate lyase, alpha subunit # Organism: Fusobacterium nucleatum # 1 516 1 516 516 904 92.0 0 MKFNKNAVGREIPEYLEGIGELVPFKGVDAIKPTKSKAGAKLRMRIQDEKKLVASIEEAI KKSGLKDGMTISFHHHMRNGDTVVNRVLDIISKMGIKDLTLAPSSLSPCHAPVVDHIKSG VVTGIQSSGLREPLGDEISKGILKKPVIIRSHGGRARAIEDGELHIDVAFIAAPSCDEMG NMNGRTGKSACGSMGYAIVDAQYADYVIAITDNLVPFPNSPASIDQTLVDAVVVVDEIGD PKKIVSGAIRFSDNPRDLLIAQNAVKVIINSGYFKDGFVYQTGAAGASLAVTSLLREEMI KQNIKASLGLGGITSQLVGLLEEGLMSALYDTQCFDLDAVRSIKENERHIEITASQYANP NTSGPAVNDLTFVMLGALEIDKDFNVNVMTKSDGTINQAVGGHQDTAAGAKISVILAPLM RARIPIIVDKVTTVCTPGEAVDVICTDYGIVVNPRRKDLIETLTKAGVELKTIEEMKEMA EQLTGKPDPVEFTDEIVGVVEYRDGSIIDVIKKVKE >gi|224461290|gb|ACDC01000112.1| GENE 17 19136 - 19684 559 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738966|ref|ZP_04569447.1| ## NR: gi|237738966|ref|ZP_04569447.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 182 1 182 182 327 100.0 2e-88 MDIKIEETEDTLKIIKSCKEELRWAFIIGFICNACILYPLLRETPGEFYFGFLLFYTPIF LMIQAVFCHRFRYELIFIKEGRVYLLQSFIKPDIINAEKFLIKNVVEIFAKKFNGSVLLM SHNIFKERKTIKNHPNYKIHFYFKNETEEYYGWGYEIPMEEAEKVEKKVKEFLKKHNDIE LK >gi|224461290|gb|ACDC01000112.1| GENE 18 19710 - 19841 69 43 aa, chain + ## HITS:1 COG:no KEGG:FN1599 NR:ns ## KEGG: FN1599 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 4 43 129 168 168 64 85.0 9e-10 MITLEDFKNNISPQEYYTEENYIYLYNRHLDWIRDKSDYLNGK >gi|224461290|gb|ACDC01000112.1| GENE 19 19866 - 20339 444 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738968|ref|ZP_04569449.1| ## NR: gi|237738968|ref|ZP_04569449.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 157 1 157 157 235 100.0 9e-61 MQTITRIEEDTYIMKNDELVYKRGYYTGYWLINFLSLAYLLASNSKYKYVYIILIVSSIL LFFLIKNALEKVLFIFKKNELEIQIIRKNKIIRNNIFNYREILDLKVREFTGKTGSTYNI EIIFSNKKKSYYYSESKEEAYKVVKIYNLYKNGDTDV >gi|224461290|gb|ACDC01000112.1| GENE 20 20332 - 20892 351 186 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738969|ref|ZP_04569450.1| ## NR: gi|237738969|ref|ZP_04569450.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 186 5 190 190 283 100.0 3e-75 MFENLIKNKLNIKKEANLLIVEYKKWSFKLPLYLIIFYILYTWLGCITKDIARTDITLYL YNLAFLLIFIVVALAVCSKEILIVDNNELVIEKYFLFYLYERKVIDVLNIRSISWTEEYE KHFPVFLPLDIVKNLKIRAKESEIEDRLYTFGVCLNEEKYKEIIEEILKYSETKGYLQNL INITNF >gi|224461290|gb|ACDC01000112.1| GENE 21 20933 - 21427 259 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738970|ref|ZP_04569451.1| ## NR: gi|237738970|ref|ZP_04569451.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 164 1 164 164 248 100.0 1e-64 MKTVTKKEDFTCIAYDEELLNKYLTILNFHVFLWIFIFLYIFLFIANDKIYNYLWIIFYV LFIISYKLEPYLYKIKIILYEDKIEIKKGKKTRLFLYSEIKEIKYNKKWINRVGEISFIK IMKNDGKIYEVMRGQLEKEIIEIFTTIKNSYEEWRIKNVESCNK >gi|224461290|gb|ACDC01000112.1| GENE 22 21405 - 22064 556 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738971|ref|ZP_04569452.1| ## NR: gi|237738971|ref|ZP_04569452.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 219 1 219 219 325 100.0 2e-87 MLKAVTNKIKNTDEYSITRYSISGIILKTIIFLILYIICAYLGMYNFREISLSKIIKFII GSFPFFIFLYLEAISGSSKEILCIKENNLILKKYILFFCYYSKVIKLEDIRKIYCEKNKN YLFFPTDLLKNIKFRVKENEFEDKIYAFGYKLSEHESAEIIEEIIEHIKVENVEKEYENK KVLVEYNGVDGKEVTMSKFKEDINEIRDSRSIYGREVEK >gi|224461290|gb|ACDC01000112.1| GENE 23 22065 - 22634 244 189 aa, chain + ## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 188 1 194 308 97 39.0 2e-19 MKKVKNDSQPFISYIVPMFILPFAGFFVGLCFYAILSSFDKINFGSIYGILLSVIFFYVI VRGLKNGMLYFIPREECYIEDDNLIYRRIFLCKFIFKELRIRILDIENIIDKGCKIPKVS TRSLVLATFFKPYERILIKTKSGKEYKIFVDADPYTFRNDDDNKFIRTYNELKEMVIEEQ NKLSFNKKN >gi|224461290|gb|ACDC01000112.1| GENE 24 22666 - 22996 190 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262066001|ref|ZP_06025613.1| ## NR: gi|262066001|ref|ZP_06025613.1| conserved hypothetical protein [Fusobacterium periodonticum ATCC 33693] # 1 104 1 100 156 87 64.0 2e-16 MISITEKNNKIYIIQDSGENEKSSGSGMLLYLFLTLIGRLVLLENPDSIFFIYCFIFLFL IHLILVNSILRKKTQIILDLDERNIITKKETFNFKNIIKIDIKENYFKES Prediction of potential genes in microbial genomes Time: Thu May 19 23:41:02 2011 Seq name: gi|224461289|gb|ACDC01000113.1| Fusobacterium sp. 2_1_31 cont1.113, whole genome shotgun sequence Length of sequence - 1955 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 33 - 500 229 ## gi|237738973|ref|ZP_04569454.1| predicted protein 2 1 Op 2 . + CDS 527 - 1180 719 ## FN1721 hypothetical protein + Prom 1197 - 1256 3.2 3 2 Tu 1 . + CDS 1333 - 1806 402 ## gi|237738976|ref|ZP_04569457.1| predicted protein Predicted protein(s) >gi|224461289|gb|ACDC01000113.1| GENE 1 33 - 500 229 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738973|ref|ZP_04569454.1| ## NR: gi|237738973|ref|ZP_04569454.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 155 1 155 155 213 100.0 4e-54 MKSKFFILLYSIYLIYVRVYIYKFNLDISLMKKVSLETSFHFLVLYFIILLVKSKEIMIL EKEEITIKKFFIFICYQTNKIKVSDIKSIYYETNSLTGKFNIFVDMTKNLKIRTKFKEFE DKIYYFGINLSEEEYKEIAGKILSCTDKINVLFIE >gi|224461289|gb|ACDC01000113.1| GENE 2 527 - 1180 719 217 aa, chain + ## HITS:1 COG:no KEGG:FN1721 NR:ns ## KEGG: FN1721 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 99 1 91 94 82 51.0 8e-15 MLNNKEKLIELIELIEFGNEIKEIINLWDPMGLMDFCPEDEYETEVKGIRNLVVNNKNMD KKSLAQEIRNIFEYYFSNEYKSKQEIEEDIASKIIEKSKEYKLNFTLPNYYDTKKTIFKN QKEADIYINLYIKINKIINLWDPLKIMDISFHNEYSYEINRIIEELSKNISVQDLAEKIN KIFKNSYNELYEIGKNEEIKIARKILEVYNIGEVRGI >gi|224461289|gb|ACDC01000113.1| GENE 3 1333 - 1806 402 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738976|ref|ZP_04569457.1| ## NR: gi|237738976|ref|ZP_04569457.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 157 1 157 157 232 100.0 5e-60 MQTITRIEEDTYIMKNYEFLFRAGYFAVYFMINLLNTFSMIFSNTSYKYGYVILIFSSLL LFVFNKYVFKKLLFIFKKDKLEIQEIWSNKLIKKIILNYEDILDFKITEISARNGTSYYI KIITPTVEKSYYYDLFKEDAYKVVEIYNLYKNGDTDV Prediction of potential genes in microbial genomes Time: Thu May 19 23:41:22 2011 Seq name: gi|224461288|gb|ACDC01000114.1| Fusobacterium sp. 2_1_31 cont1.114, whole genome shotgun sequence Length of sequence - 8617 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 398 351 ## gi|262066007|ref|ZP_06025619.1| putative membrane protein + Prom 415 - 474 7.2 2 2 Op 1 . + CDS 496 - 957 189 ## gi|237738978|ref|ZP_04569459.1| predicted protein 3 2 Op 2 . + CDS 1024 - 1548 304 ## gi|237738979|ref|ZP_04569460.1| predicted protein + Prom 1553 - 1612 6.1 4 3 Op 1 . + CDS 1736 - 2464 626 ## FN0289 hypothetical protein 5 3 Op 2 . + CDS 2525 - 3154 536 ## gi|237738981|ref|ZP_04569462.1| predicted protein 6 3 Op 3 . + CDS 3155 - 4081 643 ## FN0289 hypothetical protein + Prom 4133 - 4192 5.5 7 4 Op 1 . + CDS 4227 - 4544 370 ## gi|237738983|ref|ZP_04569464.1| predicted protein 8 4 Op 2 . + CDS 4547 - 4870 240 ## gi|237738984|ref|ZP_04569465.1| predicted protein + Prom 4935 - 4994 2.6 9 5 Op 1 . + CDS 5017 - 5280 321 ## gi|237738985|ref|ZP_04569466.1| predicted protein 10 5 Op 2 . + CDS 5277 - 5801 395 ## gi|237738986|ref|ZP_04569467.1| predicted protein 11 5 Op 3 . + CDS 5832 - 6050 224 ## gi|237738987|ref|ZP_04569468.1| predicted protein 12 5 Op 4 . + CDS 6128 - 6481 424 ## gi|237738988|ref|ZP_04569469.1| conserved hypothetical protein 13 5 Op 5 . + CDS 6493 - 6795 230 ## gi|237738989|ref|ZP_04569470.1| conserved hypothetical protein + Prom 6815 - 6874 2.5 14 6 Tu 1 . + CDS 6921 - 7253 441 ## gi|237738990|ref|ZP_04569471.1| predicted protein + Prom 7269 - 7328 7.3 15 7 Tu 1 . + CDS 7367 - 7468 126 ## + Prom 7532 - 7591 4.8 16 8 Op 1 . + CDS 7637 - 7753 90 ## 17 8 Op 2 . 
+ CDS 7776 - 8492 786 ## gi|237738991|ref|ZP_04569472.1| predicted protein Predicted protein(s) >gi|224461288|gb|ACDC01000114.1| GENE 1 3 - 398 351 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262066007|ref|ZP_06025619.1| ## NR: gi|262066007|ref|ZP_06025619.1| putative membrane protein [Fusobacterium periodonticum ATCC 33693] # 1 130 54 183 184 181 95.0 2e-44 ITLYLYNLPFLLIFVVVALAVCSKEILLVDNNKFVIEKYFLFYLYERKVIDVLNIRSISW TEEYEKYFPVFLPLDIVKNLKIKVKESEIEDKIYTFGVCLNEEKYKEIIGEILKYSETKG YLQNLINITNF >gi|224461288|gb|ACDC01000114.1| GENE 2 496 - 957 189 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738978|ref|ZP_04569459.1| ## NR: gi|237738978|ref|ZP_04569459.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 153 1 153 153 166 100.0 3e-40 MNSFSGNFMYNLKYFLTLITTVNVSISDFFLIYYIIFSFCIINFLQIKLCKKLYNEVKIE IILLPKLSYLIPGIYRIHMYILGIFILIILKIKYKKNIKEVFFFFLIYYETLLVNAVVVL IELIYYLFNQELFIDTINKNFELYKNMPLEFYF >gi|224461288|gb|ACDC01000114.1| GENE 3 1024 - 1548 304 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738979|ref|ZP_04569460.1| ## NR: gi|237738979|ref|ZP_04569460.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 174 1 174 174 260 100.0 3e-68 MKIKIIKREKYLRIEKILKKEFIIRTLIFFLIVLYFFITSFIKYGILTVIMSIFCFPGIF LFYLRFILKCSYEILIIKKDIISSYISKNYCIDKSKFKNLNKKFEISNLEKIYFKEYPIW AIVRGVKYEENPYFKLHFKLKDGEQFDFGLMLNNNEAKEILREVKEFLNINKLT >gi|224461288|gb|ACDC01000114.1| GENE 4 1736 - 2464 626 242 aa, chain + ## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 112 242 178 308 308 170 77.0 4e-41 MGASFYLAFISNYPYEVLIIKNGKIIIYVSLFYRHLKFCKLFNFLQVYDIDNLKHIYFKN TTEILVSKVLKRNESPYHKIHLTFKDKSYTAFGVKLKDEVAKDIVLTINKFLEKYKKENK IKRLTLAEKENLSEKYNYPLNERYSYILNKIIDEEKLFISEKDNNFIINGDSEAVKDLEI FKDMNFEEIDFYVFYVNYLSKKEYENKEVLVGYNGIDGKEVTMSKFKEDINEIRDNRSTF KN >gi|224461288|gb|ACDC01000114.1| GENE 5 2525 - 3154 536 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738981|ref|ZP_04569462.1| ## NR: gi|237738981|ref|ZP_04569462.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 209 1 209 209 330 100.0 3e-89 MEIENTKEATVIELKNNRVFIGVYIVPLFTLIGLIISKITIYSILVCLTIFIFLNIGNYI AIRNIGSIETLTLKNVSLTIRRLKKNKKVTYEKEIFYDEIFKIYYQEYFLGFYKRDFTFD MKRNMKIKTYFNTYSFGYKMSYGDFKKINSIIEEKIKEHKNYIKKENQDKKVLVGYNGTD EKEVTMSKFKEDINEIRDSRSIYGREVEK >gi|224461288|gb|ACDC01000114.1| GENE 6 3155 - 4081 643 308 aa, chain + ## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 306 1 307 308 278 57.0 2e-73 MKKVKNDSQPFIIYIIPMFVLPFAAFFMGLCFYISLLVFDGINFWSIYGTLLPAIILYLI IRAIKNGILYFIPREECYVEEKKLTYKRILFNKLILKEIKAPLLDIQDIIDKGSKIPKAY ANSSNPLNYITIFFKPYERILIKMKSGKEYKIFVDADPYSFSQNYDNNKFIKNYNQLKEM VIEEQNKLFFNQKIENLNEKYNSSLDERYDFVLNKIIDEEKLFISEKDNNFIINGDSEAI KNLEIFKDTNFEEIDFYVFYVNYLSKKEYENKKVLVGYNGTDGKEVTMSKFKEDINEIRD SRSTLKKS >gi|224461288|gb|ACDC01000114.1| GENE 7 4227 - 4544 370 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738983|ref|ZP_04569464.1| ## NR: gi|237738983|ref|ZP_04569464.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 105 11 115 115 152 100.0 6e-36 MEDNEFDFIELKLSSNSLNKKEIQVLKENGIEHKYLENNEIILIIKSFYSYFVEIDGENR GVFFEKTEHYSKYLKDISSLRSEAEVNKISNLVISKEGVRVEIEV >gi|224461288|gb|ACDC01000114.1| GENE 8 4547 - 4870 240 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738984|ref|ZP_04569465.1| ## NR: gi|237738984|ref|ZP_04569465.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 107 1 107 107 151 100.0 1e-35 MKLDNIVIDSLEYKNSLLYIFYKNYYGYEKYFNKVLILKDVKNFTYTIEELAYHTEMYEF LNELELEAFTRVFYRNKKLNKLYIFDANYSVTIIEFSKKMSWRNQKR >gi|224461288|gb|ACDC01000114.1| GENE 9 5017 - 5280 321 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738985|ref|ZP_04569466.1| ## NR: gi|237738985|ref|ZP_04569466.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 87 1 87 87 152 100.0 5e-36 MGTIGVGGTGRTLKAAPPLDGYFSVDSTIITHNEDMTKFRKEIEKNIEYNLDEISKLAKD PETAIKNSYKLNKISKKLIDDVNRLRK >gi|224461288|gb|ACDC01000114.1| GENE 10 5277 - 5801 395 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738986|ref|ZP_04569467.1| ## NR: gi|237738986|ref|ZP_04569467.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 174 1 174 174 257 100.0 2e-67 MIEMKTSNNKEIEIIISHHKYIRKVQFSLIIWLIIPSIVIVLIVTPFLLMFIAEIFILLT SIINYYYFRKSIVALEKLIIKEDTLIIQDINNYGVIIFNKEINFNNIKEIFFKDSLSAYF SSGGGMRENRKFLKIRTFNELFSFGILIDRNEYNQIKAIIQKEILKKNTKFNNI >gi|224461288|gb|ACDC01000114.1| GENE 11 5832 - 6050 224 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738987|ref|ZP_04569468.1| ## NR: gi|237738987|ref|ZP_04569468.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 72 1 72 72 135 100.0 6e-31 MNCFFTSSELAGEDYTRLEDKEQLKKYRDRIYYNGYRVVSISEDGWAKLDFNDTLPDENE IALIAGISRVEQ >gi|224461288|gb|ACDC01000114.1| GENE 12 6128 - 6481 424 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738988|ref|ZP_04569469.1| ## NR: gi|237738988|ref|ZP_04569469.1| conserved hypothetical protein [Fusobacterium sp. 
2_1_31] # 1 117 1 117 117 175 100.0 7e-43 MEVKSKLLFGLDEYQNEDSLAIIEFSFLNITKNEKKILEKFDNHYVSLRNDENKIIIKNY TFYFVEIDGKNNGIFLEKTKNYYEFLKRDLVYKETKKTKNLVISKDGIRAEFIFFME >gi|224461288|gb|ACDC01000114.1| GENE 13 6493 - 6795 230 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738989|ref|ZP_04569470.1| ## NR: gi|237738989|ref|ZP_04569470.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 100 1 100 100 125 100.0 1e-27 MKFQNILTTNLVYKNEILYVYFRHYYIKDKYYNKILKLKNVKKFTHFLSEFYITFLREFS EVEEELRIHFFSKPFYKNKKRKQLYIFDRTETFVMIEFKD >gi|224461288|gb|ACDC01000114.1| GENE 14 6921 - 7253 441 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738990|ref|ZP_04569471.1| ## NR: gi|237738990|ref|ZP_04569471.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 101 14 114 123 160 100.0 3e-38 MIDFFNAKTFIGKKIKLKFTKESLEKSEKNKEIIEENNEWLDKVLKKINFFKRHGLSSQK DVDFSTGVVTSVGKDFDYEENDDGKEFYYIELDSYKWINLNEIEEIIEIE >gi|224461288|gb|ACDC01000114.1| GENE 15 7367 - 7468 126 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGIGVPDSKEDNKTGNQTIANYTGAKLEDRTGS >gi|224461288|gb|ACDC01000114.1| GENE 16 7637 - 7753 90 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MILIRKLEWLDQKKTIILKHTMNLIKMGNILEQKKKEK >gi|224461288|gb|ACDC01000114.1| GENE 17 7776 - 8492 786 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738991|ref|ZP_04569472.1| ## NR: gi|237738991|ref|ZP_04569472.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 238 1 238 238 390 100.0 1e-107 MIYRISLEDIREQIEDVLILDKWKMTKNEEQIKYLTEKRHYRKGKTREEVHEWAKPRYET LIDKASKDKVVFYPERDFILIRNWLKFLYYAINIRFEDGSYDYKYEVLNFYLKKLLMREK SDVEDLFDVSESIIGITLEDIREQFQLWMEGKITRTEVEYWSDRRNTCFHHYDDVEFYPI EKEDLYQKWIETLSMFVLGGEGDYFYTDEELKKMYYELCEDIKKADEKFEQEIKKIYK Prediction of potential genes in microbial genomes Time: Thu May 19 23:43:05 2011 Seq name: gi|224461287|gb|ACDC01000115.1| Fusobacterium sp. 
2_1_31 cont1.115, whole genome shotgun sequence Length of sequence - 2072 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 21 - 80 2.1 1 1 Op 1 . + CDS 247 - 1470 1518 ## gi|237738992|ref|ZP_04569473.1| predicted protein 2 1 Op 2 . + CDS 1475 - 1858 196 ## gi|237738993|ref|ZP_04569474.1| predicted protein Predicted protein(s) >gi|224461287|gb|ACDC01000115.1| GENE 1 247 - 1470 1518 407 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738992|ref|ZP_04569473.1| ## NR: gi|237738992|ref|ZP_04569473.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 407 10 416 416 656 100.0 0 MEGVLLDKFKDEHQREFNLIKDENLSLEDKQKLAQNLVERYLRENGYQGVIPEVLLTDEA HSFTVDSKDKATGTKRREKIYFSINDIADPNLAFSRLFGHEKAHMNTYDEGKDGEETSIH TREKIGSENKNKVFTEEEKTDYLNNLRNKYKNQKSIEKQFAEAKLVPEKDKEHYVDWKNF NKYLEAEKFVKDNTYLLSKMYKVAKEKILNNPKKYTYYYIQDTNNVVFKDKEDFEKEQAK LRAKIASEKDDKKRRELEKELYYKLPDSMAQLHNIEIDENGEVYINIKFPNEKYVNKKGK EVVINKKTGEIINDGINNGTYNMGVWEGSYNFLDFDVAYENIIHLGIDVPAWIDYGVNSK DKLTKAQREELYEISKLYFKDYLFREVVDKNEKLNYKIYQEYMNKWR >gi|224461287|gb|ACDC01000115.1| GENE 2 1475 - 1858 196 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738993|ref|ZP_04569474.1| ## NR: gi|237738993|ref|ZP_04569474.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 127 1 127 127 128 100.0 1e-28 MIKFLNFLNNLHSIIKITQSSLPYGIIFYFMILLNLIFLFIEINLSKKIFGMINYSFFLA TKILYFCPANFRKYIYILGIIVLLILKIKYKKNKLEIMMGILIFYEYLIGNIILDIFTLL FILPEVI Prediction of potential genes in microbial genomes Time: Thu May 19 23:43:29 2011 Seq name: gi|224461286|gb|ACDC01000116.1| Fusobacterium sp. 2_1_31 cont1.116, whole genome shotgun sequence Length of sequence - 4682 bp Number of predicted genes - 9, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 742 901 ## COG3210 Large exoproteins involved in heme utilization or adhesion 2 1 Op 2 . 
+ CDS 748 - 1167 100 ## gi|237738995|ref|ZP_04569476.1| predicted protein 3 1 Op 3 . + CDS 1178 - 1327 142 ## + Prom 1334 - 1393 4.6 4 2 Tu 1 . + CDS 1430 - 1510 77 ## + Term 1566 - 1615 4.5 5 3 Tu 1 . - CDS 1751 - 1888 102 ## - Prom 1915 - 1974 8.6 + Prom 2209 - 2268 13.1 6 4 Op 1 . + CDS 2517 - 2660 172 ## gi|237738997|ref|ZP_04569478.1| predicted protein 7 4 Op 2 . + CDS 2632 - 2886 166 ## gi|237738998|ref|ZP_04569479.1| predicted protein 8 4 Op 3 . + CDS 2891 - 4429 1304 ## FN0289 hypothetical protein 9 4 Op 4 . + CDS 4502 - 4682 211 ## FN2112 hypothetical protein Predicted protein(s) >gi|224461286|gb|ACDC01000116.1| GENE 1 2 - 742 901 246 aa, chain + ## HITS:1 COG:FN0290 KEGG:ns NR:ns ## COG: FN0290 COG3210 # Protein_GI_number: 19703635 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 3 49 499 545 727 83 87.0 4e-16 AKLEDRTKAMVAKEATEDTLAAIRNNPNVITGEEGKKLAESVPMDRREYSIYLGERPVKG VKFAAHSFIPVFPNIQSDFFEKDGTVKEEFKFLGEPIQLKNGKRGWIIAGYNGEKDKGEA GGRLVLRINDPSNIRAFKLALETNNNSEEGYIVQQITSDIENDTEPAKKIFLNSKRYIEN GNYVDYNMFIRNCHSISFSLIEPIEKKVITNRILPISRGREEVEYLKFKRFAPGTEQRIR IEKSSR >gi|224461286|gb|ACDC01000116.1| GENE 2 748 - 1167 100 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738995|ref|ZP_04569476.1| ## NR: gi|237738995|ref|ZP_04569476.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 139 1 139 139 138 100.0 1e-31 MKIYIIIFFDFIFYFLLIDLEILNKIGINKELEIATVYSYFTVFKPSLLYFYIEKRNIFR NYIYRVLILLLSIPIFYYFISLPDGLIYRTFIYSIFFFPFMFICVRFIKLSKQYIFTKIC LTFLYEFIMFYFILFRFFE >gi|224461286|gb|ACDC01000116.1| GENE 3 1178 - 1327 142 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWKIFFKNKSNLLFIAFLALIGILLHLKGINSNINFFILIMLGIEYFFL >gi|224461286|gb|ACDC01000116.1| GENE 4 1430 - 1510 77 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGLDPGSNTIIDDKYLKKIKLDRQK >gi|224461286|gb|ACDC01000116.1| GENE 5 1751 - 1888 102 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSILELKKVIDLKIKVFLYIFRLNNKNNLLKKNLILMKIKNKNLI >gi|224461286|gb|ACDC01000116.1| GENE 6 2517 - 2660 172 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738997|ref|ZP_04569478.1| ## NR: gi|237738997|ref|ZP_04569478.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 47 75 121 121 83 100.0 5e-15 MPNLAGIPPELGEYKLSILHIFIACIIVSYLYIPLFRKNEKYRSKEI >gi|224461286|gb|ACDC01000116.1| GENE 7 2632 - 2886 166 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738998|ref|ZP_04569479.1| ## NR: gi|237738998|ref|ZP_04569479.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 84 1 84 84 126 100.0 5e-28 MKNIEAKKFNKVLILKDVKNFTYTTEELAYHTEMYKFLNDLELEAFTRVFYRNKKLNKLY IFDANYSVTIIEFSKKMNWRNQKG >gi|224461286|gb|ACDC01000116.1| GENE 8 2891 - 4429 1304 512 aa, chain + ## HITS:1 COG:no KEGG:FN0289 NR:ns ## KEGG: FN0289 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 205 506 1 306 308 332 66.0 2e-89 MEYRYKNIYLGETIEEIFPELNNSNTEYECSIFVLLYKPYENIEVYIYLKFGKVRLIKIF DENFQIDNTLKVGIALTDEIINRYDLYYDDFEEVYLSKKHKELVVIVDLADNIIGFSFHL ELVGDEAFATDRIKNYLECKNLLDIYGSLYWSKTLDANIEKREIYGQLDNYKFTFDIITR DIKSIQNLETGEFIKIMKGVKDNELKKIEYSQHCILYYYALLTLIIFDFIAGVFFIINLI RFELLIFIIVFGIMTYFFTRAIVNCLKFFISKEECYCKNENLIYKRILFKKFLLKELKIP LLDIEEVLDKGHVYSENGGGNYVSPTDFVFLFFKPYKRVLLNLKRGIKYDIFTYTYPYPY IEKEIYDDTDFLKSFNELKEMIEAEQKKILFNQKVENLMEKYNFPLDERYSYILNKILDE EKLFISEKDNNFIINGDSEAIKDLEIFKDMNFEEIDFYVFYVNYLSKKEYENKKVLVGYN GIDGKEVTISKLKEDINEIRDSRSRYGRKEKS >gi|224461286|gb|ACDC01000116.1| GENE 9 4502 - 4682 211 60 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 60 1 53 164 72 66.0 6e-12 MKSFLVLIILIFLTACTNTRYYYYPENYKNSDISISASLVEFAKEDSALDYIWVLDLRDH Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:05 2011 Seq name: gi|224461285|gb|ACDC01000117.1| Fusobacterium sp. 2_1_31 cont1.117, whole genome shotgun sequence Length of sequence - 7785 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 85 57 ## + Term 104 - 138 3.0 + Prom 181 - 240 8.3 2 2 Op 1 12/0.000 + CDS 277 - 1881 1867 ## COG2831 Hemolysin activation/secretion protein 3 2 Op 2 . 
+ CDS 1891 - 7783 7056 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|224461285|gb|ACDC01000117.1| GENE 1 2 - 85 57 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no DDAFSKGVPRKEIFSGTVEDYKNKRNK >gi|224461285|gb|ACDC01000117.1| GENE 2 277 - 1881 1867 534 aa, chain + ## HITS:1 COG:FN0292 KEGG:ns NR:ns ## COG: FN0292 COG2831 # Protein_GI_number: 19703637 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin activation/secretion protein # Organism: Fusobacterium nucleatum # 192 534 8 350 350 541 81.0 1e-153 MIKRILFFFLSIYSLSFSVDEIDIEKRREEQQNFDRLIKSQNFERKEPSIENEGKDIILD ITSIKLDGNTVFEDFQIETILRKYTGRNKNIYELMNVLENKYIERGYITTKVGLNIEESD FENGKISFFVSEGKIDKVFYDGKENDFKTFITFPQRENNLLNIRDLDQGIDNLADNSTLD IKASDKNEYSDIYIKRDNKPIGFGINYNDLGQFETSRHRVRYFLNTHNIFGLNESLDFSY QQKLQRQYKERDAKNFSFSLNIPFKYWSLSYSYDSSEYLRTINALGRKYKATGNTETQTF GLRKMLHRNEDHKIDIGARISLKDSKNYIDDIRLISSSRKLSVLTIDSSYTGRVFSGLLS ANGSVSFGLKRFGANKDHKEPFRDEYTPKAQFRKYNVNISWYRPIHKFYYKIFIGGQYSK DILYSQEKIGIGDDTSVRGFKDESTQGDKGFYIRNEIGYKGSEILEPYIAYDYGRVFNNK VNENKVETLQGVAVGVRGYFKGFEGSFSIAKPIDKPTYFRNNKPVVYTTLTYRF >gi|224461285|gb|ACDC01000117.1| GENE 3 1891 - 7783 7056 1964 aa, chain + ## HITS:1 COG:FN0291 KEGG:ns NR:ns ## COG: FN0291 COG3210 # Protein_GI_number: 19703636 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 1671 1 1865 1881 954 45.0 0 MRGNLKRLIAIFMLFLHIGSLANGIVPDSAASKDLKVDKASNGVPLVNIEAPDKNGTSHN VYKDYNVDGKGVILNNSKDLTQSQLGGLIYGNPNLQNGKEASTIINEVSGVNRSKIEGYQ EIAGKKADYILANPNGIYVNGAGFINTGNVTLTTGSGDNILNPEKGKIEIDGKGIDLRNV NKAELVARVAEISGPVYGGEEVNLKLGSKGQSNKPEYALDARALGSIYAGRINVIVNEEG VGVKTEAPTYATKGDVVISSKGKVYLKDTQAKGDIKISSTETEIGDKLVAENAINIESKK TKNSGQIQANKDITINGNIDSSNLISTNKDISISGDLKNSGNISSTNLNAKDIENSNKVA VEEKVFSNKIINSGNFSAKDIMTTDISNSGKLLSKNINVNSLNNLGEVSSENLITINLEN 
SNKINIKESINSNSVVNKTINSEILSKNLNSNNLDNKGSIIVINNVNSGVITNNGKLLAG NTINSQNLTNTSIVQGEILDIKNKINSSGKILSDNILTKGIFNSGNISAKVITTQELTNS GEIISNNLFSNNINNLKNIFVNENLKANSLINSNKIESNNIEVNSLNNVGDIKVSENLTT KDITNSKNLQVGKNIRTDDIKNSDFIQAKNIIIEKSLDNSGKLKSVDININTSNIENNGE ILSSNNINITTANDVNLSGKYIANNTLKIDVKSLINNIDLENSGKIKLNLKENLINNKKI NSSNDLNITAKKIVNIGEIGSMAKLDIDAGGLENNGALLFGNNDENKLTIKENTQNTGII NSLGKLEINTQDFENKGQIASIKDLKINSKNFTNNEDSLIFSNENVKFDLKNDFLNKGEI YSGKNIDIKALGAVTNNTATISAQNNIDINAKKITNIGKVEKKNPTIERKSALDSDTNIT EKQREEAKKYFQELVNRKNNEKGIENPGGYTEGSYLFKDFKAEWIEKVKSNDVSKLSYIT ADNNISLTSKEDILNEEGNILANNDITITASKITNRNSIRDVDIKIGWKAPIYQVTYTEG HEYGGDNENHNGSGGGGHYSTSPVGTEEYIGKTTQTIGADKASKIAAGKNLTINAQKVGN GFKDNETNDIKNKNTVINKVNLAENNVNIDNVNLNRKKIEDKKVIDTKDYIFIPKGDKGL FKINKNENNKPGFSYLIETNIKYIDKSMFLGSDYFFKRIGFNPDRDIRLLGDSMYETRLI NRAILEGTGRRYLGDYRSEKEQMQALYDNAISEKEDLKLSIGIALSKNQIDKLKKDVIWY VEEEIEGQKVFVPKVYLTKNSLNKLKNNVTSLEAGENFNISVNEITNTGAINAKNLNIKT NNLINKSSSENLATINADNIDIEAKETLSNIGATIKADKDLKVTANKVENISTHHINTNI VEIIKDNFENFASIEAGNNIDIKTKDVANTAASIKANNDITIKSDNVKLNTLALKDSENR KNYENIVINNVGSEITGKNIAIEANNDIGIVGSDIKTKEKLTLNTKNIDISSAESSLYER SSNKNGYTINEIKRNTESNIEGNNININAKNDINIKASNIVAKEEANIKADGDINIVSAT DSEYFAHKESRKKKFGRSRSEETINYKTSNVASNIVGEKLNITSGKDVNILGGNIQADTE GQINAKGNITQAGVKDINYSYHKKTKKGFMGLTSKSVTDENYAEKAILSATLAGDKGLTY DSKNNLLLEGVKVVSSGSINLKGQNVEINPLETTAYSKHKEEKKGFSGSLSPKGVSLSYG KDKLSSDTDIVNQTASQIVSNKDINIEATNKVKAKSVDIYAKDDINISGDKGVEISTANN SYDNTTKQSSSRIGANFGVNPAIVNTVENIKDIKNLTDFSGNSYDILNNASKVVGAIKDG AKATIAIADTSYKGSTDAGYDNLKIMNNVFTAGVSYNKSKSKSLVHNESAEKSSIEAGRN MNIKSKDGSISISGTDVKVGNDLSLTAKKDIDIKAAEEKFTSSS Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:10 2011 Seq name: gi|224461284|gb|ACDC01000118.1| Fusobacterium sp. 2_1_31 cont1.118, whole genome shotgun sequence Length of sequence - 2283 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 865 1194 ## COG3210 Large exoproteins involved in heme utilization or adhesion + Term 1008 - 1070 3.5 + Prom 1215 - 1274 14.2 2 2 Op 1 . + CDS 1298 - 1450 97 ## + Prom 1546 - 1605 4.2 3 2 Op 2 . + CDS 1625 - 1699 89 ## 4 2 Op 3 . + CDS 1683 - 2246 348 ## gi|237739004|ref|ZP_04569485.1| predicted protein Predicted protein(s) >gi|224461284|gb|ACDC01000118.1| GENE 1 2 - 865 1194 287 aa, chain + ## HITS:1 COG:FN0290 KEGG:ns NR:ns ## COG: FN0290 COG3210 # Protein_GI_number: 19703635 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 3 49 499 545 727 79 85.0 6e-15 AKLEDRTKAMVAKEATEDTLAAIRNNPNVITGEEGRLLAESIPMDRREYDAYIYERDLNL NFTIPLIKKDINVAKGVIAHSGVTLIPSIQSDFFDEDGNLKEKYIKRGYSKPHKFPNGRL GWVKAGFKGEESKGEPKDKLVIRINPFEDVVVNDNLLGGNNQTYLTGRIVGKLKPPKNVS DTQMIENIMEKGIDGNSVDYSALPQLYINLPGGVKLQLTNGYNCHSVTGTLSQGFEPYGD ERNNLPKQRGGEISDEDYNKYRYSPMKRLAPGLGNIIPKKYLRRMEE >gi|224461284|gb|ACDC01000118.1| GENE 2 1298 - 1450 97 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYLKNLIFLIYLIFIGFLFTKYTSLHWVFLIPYSLLTLIEYALLIKKDK >gi|224461284|gb|ACDC01000118.1| GENE 3 1625 - 1699 89 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKFIKLVDQLKKDIEVLSDAYEK >gi|224461284|gb|ACDC01000118.1| GENE 4 1683 - 2246 348 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739004|ref|ZP_04569485.1| ## NR: gi|237739004|ref|ZP_04569485.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 187 2 188 188 192 100.0 9e-48 MPMKNKALHFLCDSRHFLKDLKKDYKKFLIFFFLGILLLLYQQRSSIINIILILFFSLLL PLLMLIDCNRCEKYKYIIEELFIKEDEIIIFHINKKERIEKHKIKFDEITDLEYKDPFFL SPYKPDTFFHKNIEKCRLLKIKLKSKKVISFGFFLEEEEAKKIIKAIKESKINYEKLQEE IKEFKIK Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:27 2011 Seq name: gi|224461283|gb|ACDC01000119.1| Fusobacterium sp. 
2_1_31 cont1.119, whole genome shotgun sequence Length of sequence - 2757 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 + CDS 58 - 621 701 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 2 1 Op 2 6/0.000 + CDS 633 - 1466 900 ## COG0368 Cobalamin-5-phosphate synthase 3 1 Op 3 2/0.000 + CDS 1476 - 2051 519 ## COG0406 Fructose-2,6-bisphosphatase 4 1 Op 4 . + CDS 2067 - 2757 969 ## COG2038 NaMN:DMB phosphoribosyltransferase Predicted protein(s) >gi|224461283|gb|ACDC01000119.1| GENE 1 58 - 621 701 187 aa, chain + ## HITS:1 COG:FN0913 KEGG:ns NR:ns ## COG: FN0913 COG2087 # Protein_GI_number: 19704248 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Fusobacterium nucleatum # 1 187 1 187 187 316 84.0 2e-86 MGKIIFFTGGSRSGKSKFAEEYIYEQKYKNKIYFATAIAFDDEMQDRIERHIKRRGNTWK TIEGFKNLISLVKNDIDSTDVILFDCITNFVSNFMIMDRDIDWDKVDLSVVQEIEDQIEE EMSNFLEFIRSKKTDCVFVTNEIGSGLVPDYPLGRHFRDICGRINQLVAKNSDEAYLAVS GIKVKIK >gi|224461283|gb|ACDC01000119.1| GENE 2 633 - 1466 900 277 aa, chain + ## HITS:1 COG:FN0912 KEGG:ns NR:ns ## COG: FN0912 COG0368 # Protein_GI_number: 19704247 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Fusobacterium nucleatum # 1 277 1 277 278 327 71.0 1e-89 MKGFLLLLSFMTRIPMPKIDYDEEKLGKSMKLFPLVGIVIGFILLFFSIVFSYVLSNLSF SAFLPIIILVVILTDLISTGALHLDGLADTFDGIFSYRSKHKMLEIMKDSRLGSNGALAL ILYFLIKFVLLYSLLMEDQGETVFAVLTYPVVARLCSVISCASAPYARGSGMGKTFVDNT KAKEVVIASLITVVYSSAMLFYMINSSLSLELPLDFVMMRLGINLLIIAILGLFAFSFSK LIERKIGGITGDTLGALLEISSLVYLFLIIVVPTFFL >gi|224461283|gb|ACDC01000119.1| GENE 3 1476 - 2051 519 191 aa, chain + ## HITS:1 COG:FN0911 KEGG:ns NR:ns ## COG: FN0911 COG0406 # Protein_GI_number: 19704246 # Func_class: G Carbohydrate transport and 
metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 190 1 190 191 301 78.0 7e-82 MGKLILIRHGQTEMNAQNLYFGKLNPPLNELGIEQAYMAKEKLSNIAYDCIYSSPLERTR ETAEICNYLDKEIIYDSRLEEINFGIFEGLTFKEISEQYPNEVKEMEKNWKSFNYITGES LEELYQRAVSFLETLDYTKDNLIISHWGIINCIISYFVSGTLDTYWKFKVDNCSIVIFEG DFNFSYLTKLY >gi|224461283|gb|ACDC01000119.1| GENE 4 2067 - 2757 969 230 aa, chain + ## HITS:1 COG:FN0910 KEGG:ns NR:ns ## COG: FN0910 COG2038 # Protein_GI_number: 19704245 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 230 1 230 354 382 81.0 1e-106 MKDINFLFDLISKIEPVDSSAIKAAQTELDRKMKPKDCLGVLEEICKKVASIYGYPIKKL DRKCHILVSADNGVIEEGVSSCPIEYTPIVSEAMLNNIACIGIFTKTLGVDLNVVDIGMK NDIKREYPNLIHRKVKRGTNNFYKEKAMSMEECLQAIFTGIDLINERANDYDIFSNGEMG IANTTTSSALLYSVTRENIDIVVGRGGGLSDEGLSKKKKIIVEACERYGT Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:30 2011 Seq name: gi|224461282|gb|ACDC01000120.1| Fusobacterium sp. 2_1_31 cont1.120, whole genome shotgun sequence Length of sequence - 8252 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 367 625 ## COG2038 NaMN:DMB phosphoribosyltransferase 2 1 Op 2 1/0.000 + CDS 380 - 1108 773 ## COG2003 DNA repair proteins 3 1 Op 3 1/0.000 + CDS 1122 - 2051 974 ## COG1774 Uncharacterized homolog of PSP1 4 1 Op 4 1/0.000 + CDS 2053 - 2721 578 ## COG4123 Predicted O-methyltransferase 5 1 Op 5 . + CDS 2737 - 3744 1487 ## COG0240 Glycerol-3-phosphate dehydrogenase 6 1 Op 6 . + CDS 3769 - 4089 379 ## FN0905 hypothetical protein 7 2 Op 1 11/0.000 - CDS 4445 - 5245 648 ## COG0810 Periplasmic protein TonB, links inner and outer membranes 8 2 Op 2 30/0.000 - CDS 5254 - 5643 508 ## COG0848 Biopolymer transport protein 9 2 Op 3 . 
- CDS 5646 - 6254 602 ## COG0811 Biopolymer transport proteins - Prom 6358 - 6417 16.0 + Prom 6374 - 6433 19.3 10 3 Op 1 . + CDS 6468 - 6773 434 ## COG1799 Uncharacterized protein conserved in bacteria + Term 6788 - 6823 4.1 11 3 Op 2 . + CDS 6830 - 7519 1030 ## COG0860 N-acetylmuramoyl-L-alanine amidase 12 3 Op 3 . + CDS 7519 - 8223 1140 ## COG0813 Purine-nucleoside phosphorylase Predicted protein(s) >gi|224461282|gb|ACDC01000120.1| GENE 1 2 - 367 625 121 aa, chain + ## HITS:1 COG:FN0910 KEGG:ns NR:ns ## COG: FN0910 COG2038 # Protein_GI_number: 19704245 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 120 233 352 354 217 90.0 4e-57 MNPIEMMAAVGGFDLACMLGMYIGAALNKKLMLVDGFISSVAALLACKLNKNIQDYLLFT HKSEEPGVNIILDYLKEKTFLNMNMRLGEGTGAVLAYPIIACAIEMINTMKSPEEVYKLF F >gi|224461282|gb|ACDC01000120.1| GENE 2 380 - 1108 773 242 aa, chain + ## HITS:1 COG:FN0909 KEGG:ns NR:ns ## COG: FN0909 COG2003 # Protein_GI_number: 19704244 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Fusobacterium nucleatum # 15 242 1 232 232 265 63.0 4e-71 MDKSDNKNVKNKEDLKKAEYKGHRERIRKRFLDKGIHSFSDYEILEFLLFYCNAQEDTKP IAKAILKEFKSLNRVFKADIKELEKIKGIGPISAILIKFVGELLTELYGENLKIETKADK ITDKEALLKFLRNKIGYEDVEKFYVIYLSSSNEVIAFEESSSGTLDRSSIYPREIYKRVI MENAKSIIIAHNHPSGNTCPSKCDIDITNEIAKGLKNFGALLLEHIIITRDSYFSFLEEG LI >gi|224461282|gb|ACDC01000120.1| GENE 3 1122 - 2051 974 309 aa, chain + ## HITS:1 COG:FN0908 KEGG:ns NR:ns ## COG: FN0908 COG1774 # Protein_GI_number: 19704243 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Fusobacterium nucleatum # 1 309 1 312 312 525 90.0 1e-149 MENNIIDENTQIVSTDPERIHKVLIVTFETTKKRYYFEVLGDETYKKNDKVIVETIRGTE LGIASNSPLPMKEKDLVLPIKPVLKLASEKEIEIYNKQRKEADEAFIACKEKIRKHQLEM KLITCEYTFDKSKLIFYFTANGRIDFRELVKDLAVMFKTRIELRQIGVRDEARILGNIGP CGKELCCKTFINKFDSVSVKMARDQGLVINPTKISGVCGRLLCCINYEYSQYEEALKNFP 
AVNQSVKTEIGEGKVVSISPLNNFLYVDVKDKGISRFSIDDIKFNRKEASILKNMKTKEE IENKILEKE >gi|224461282|gb|ACDC01000120.1| GENE 4 2053 - 2721 578 222 aa, chain + ## HITS:1 COG:FN0907 KEGG:ns NR:ns ## COG: FN0907 COG4123 # Protein_GI_number: 19704242 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 222 1 223 223 303 77.0 2e-82 MLKDDEIIEELDKKYKIIQKKGGYKYAEDTILLFNYLKKSLSKRNIKLLDIGTGNGILPI LLSDNDMIEEIVGIDIQNENIERANKALELNKIEKNINFTCLDVKEYKNANYFDVVISNP PYMEDNGKKINENEHRALSRHEIKLNLDEFIQNAKRLLKPIGTLYFVHRTHRLIEIIKTL DKNKFSIKKIIFVFSKNNTSSMMIIEALKGKKIKLEIENYYV >gi|224461282|gb|ACDC01000120.1| GENE 5 2737 - 3744 1487 335 aa, chain + ## HITS:1 COG:FN0906 KEGG:ns NR:ns ## COG: FN0906 COG0240 # Protein_GI_number: 19704241 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 335 1 335 335 583 92.0 1e-166 MAKISVIGSGGWGIALTILLHKNGHELTVWSFDKKEAEELKITRENKAKLANILLPEDIV VTDDLKEAVTDKDILVLAVPSKAVRSVSKSLKDIVKEKQIIVNVAKGLEEDTLATMTDII EEELKDKNPQVAVLSGPSHAEEVGKGIPTTCVVSAHNKELTLYLQNIFINPAFRVYTSPD MLGVEVGGALKNVIALAAGIADGLNYGDNTKAALITRGIKEIASLGVAMGGEQSTFYGLT GLGDLIVTCASMHSRNRRAGILLGQGKTLDEAIKEVNMVVEGVYSAKSALMAARKYNVEI PIIEQVNAVLFENKNAAEAVNELMIRDKKLEIQSW >gi|224461282|gb|ACDC01000120.1| GENE 6 3769 - 4089 379 106 aa, chain + ## HITS:1 COG:no KEGG:FN0905 NR:ns ## KEGG: FN0905 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 106 1 101 101 131 77.0 8e-30 MKKIFLMFIAILLINACTNTSVPFNEVESSLNQKYISLSNEYYRMLENPIVEKDRRAVLS KFESFRTEVRDIKKTRKKASSNELRVLNSFIDKASINIQYLNDLAE >gi|224461282|gb|ACDC01000120.1| GENE 7 4445 - 5245 648 266 aa, chain - ## HITS:1 COG:FN1310 KEGG:ns NR:ns ## COG: FN1310 COG0810 # Protein_GI_number: 19704645 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Fusobacterium nucleatum # 34 266 12 242 242 204 59.0 2e-52 
MKKYVLISLIVHLAILFLFATIKTDEVEKEKLVKNEVVPIAFVTKQTSDNPGAKTLDTQE REKTKEEKPKSEPKIEKKVEEKKVEEKKPVEKPIEKKAEKTEEKKIESNIPSKNEPSHSD TSSKSSSESSSTSSSDKSSNHSSDGGSPNGNSSGKDLGPNFIVDGDGTNIALTSEGINYQ IINEVEPDYPSQAESIGYSNQVKVTVKFLVGLKGNVEKAEIIKSHKDLGFDAEVMKAIKK WRFKPIFHKGKNIKVYFTKTFVFEPQ >gi|224461282|gb|ACDC01000120.1| GENE 8 5254 - 5643 508 129 aa, chain - ## HITS:1 COG:FN1311 KEGG:ns NR:ns ## COG: FN1311 COG0848 # Protein_GI_number: 19704646 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Fusobacterium nucleatum # 30 129 1 100 100 155 89.0 2e-38 MSKYKKSRESAKLDLTPLIDVVFLLIIFFMVTTTFNNFGSVQIDLPSSTIQQTDKTKSIE IIIDKDGNYHISEDGKITQIQFSEIDSYLKTAKEATVSADKNLKYQVIMDVITKIKENGV DNLGLSFYE >gi|224461282|gb|ACDC01000120.1| GENE 9 5646 - 6254 602 202 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 1 202 1 202 202 322 83.0 4e-88 MLHYLQVGGPILWVLTIISIAAFAVILERIAFFSRNEKAVGNTFKEEILSLIANKKIDEA RNLCASKKSCVASAVKKFLEKAEKGMEVQDYEFILKEITIQETSPYESRLNLLSSIISIS PMLGLLGTVTGMIKAFTNISKYGAGDAAIVADGIAEALLTTAAGLMIAIPVIVVYNYLNR RLEKMENEIDDIVTNIINIFRR >gi|224461282|gb|ACDC01000120.1| GENE 10 6468 - 6773 434 101 aa, chain + ## HITS:1 COG:FN1010 KEGG:ns NR:ns ## COG: FN1010 COG1799 # Protein_GI_number: 19704345 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 6 100 4 97 98 133 75.0 1e-31 MADNSNVDIVFLKPSKFEDCVVCAKYIKEDKIVNMNLSQLDDKDSRRVLDYVAGAIFITK ADIINVGNRIFCSVPVNKSFLNETDRESVRDYETEEEIIRR >gi|224461282|gb|ACDC01000120.1| GENE 11 6830 - 7519 1030 229 aa, chain + ## HITS:1 COG:CT268 KEGG:ns NR:ns ## COG: CT268 COG0860 # Protein_GI_number: 15604989 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # 
Organism: Chlamydia trachomatis # 23 226 60 239 259 84 29.0 1e-16 MKKILLILFILFSVTVLSKEKYIVCLDPGHQTKGNPALEEIAPNSDKKKAKVTTGTRGVV TKKYESELMLEIALKLKTSLESKGYKVIMTRTKNDVDISNKERAIFANDNKADVYIRLHA DGSENKNAAGASVLTSSPKNKYTTKVQKESEKFSKILLEEYVKATGAKNRGLIYRDDLTG TNWATVPNTLIELGFMSNAEEDKKLSEKDYQDKIVKGLVNGIDRYLGGK >gi|224461282|gb|ACDC01000120.1| GENE 12 7519 - 8223 1140 234 aa, chain + ## HITS:1 COG:FN0435 KEGG:ns NR:ns ## COG: FN0435 COG0813 # Protein_GI_number: 19703773 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Fusobacterium nucleatum # 1 234 7 240 241 383 83.0 1e-106 MSVHIAAKNGEIADTVLLPGDPKRAKWIAENFLENAVCYTDIRGMLGFTGTYKGKRISVQ GTGMGIPSMSIYITELMKDYGVKTLIRVGSAGSYQEDVKIRDIVVALSTSTDSNINNRRF KGASFAPTVNFDLLSKVLKTAEEKNIKIKAGNILTSDEFYNDDPSYFKKWAEFGVLAVEM ETAALYTLASKYKAKALSILTISDSLVSPEITSSEEREKTFNEMIELALETAIK Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:39 2011 Seq name: gi|224461281|gb|ACDC01000121.1| Fusobacterium sp. 2_1_31 cont1.121, whole genome shotgun sequence Length of sequence - 16792 bp Number of predicted genes - 21, with homology - 19 Number of transcription units - 9, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 54 - 1085 1206 ## COG4145 Na+/panthothenate symporter - Prom 1139 - 1198 5.6 2 2 Op 1 . - CDS 1216 - 1509 421 ## COG4145 Na+/panthothenate symporter 3 2 Op 2 . - CDS 1523 - 1795 366 ## FN0686 integral membrane protein - Prom 1929 - 1988 11.0 + Prom 1904 - 1963 7.6 4 3 Tu 1 . 
+ CDS 2014 - 2856 845 ## FN1720 hypothetical protein + Prom 2917 - 2976 4.8 5 4 Op 1 38/0.000 + CDS 3008 - 4495 1860 ## COG0747 ABC-type dipeptide transport system, periplasmic component 6 4 Op 2 49/0.000 + CDS 4505 - 5422 655 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 7 4 Op 3 4/0.000 + CDS 5422 - 6189 465 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 8 4 Op 4 1/0.500 + CDS 6189 - 6890 358 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 9 4 Op 5 . + CDS 6890 - 7624 507 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 7640 - 7692 11.2 - Term 7626 - 7679 15.2 10 5 Tu 1 . - CDS 7692 - 7871 314 ## PROTEIN SUPPORTED gi|237739029|ref|ZP_04569510.1| LSU ribosomal protein L32P - Prom 7905 - 7964 12.3 + Prom 7951 - 8010 9.9 11 6 Op 1 4/0.000 + CDS 8142 - 9428 1431 ## COG4393 Predicted membrane protein 12 6 Op 2 10/0.000 + CDS 9430 - 10710 1993 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 13 6 Op 3 36/0.000 + CDS 10720 - 11922 1812 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 14 6 Op 4 . + CDS 11926 - 12624 309 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 6 Op 5 . + CDS 12634 - 12762 58 ## 16 6 Op 6 . + CDS 12759 - 13181 727 ## COG4939 Major membrane immunogen, membrane-anchored lipoprotein 17 7 Tu 1 . - CDS 13161 - 13391 89 ## - Prom 13522 - 13581 12.3 + Prom 13506 - 13565 9.7 18 8 Op 1 . + CDS 13631 - 14068 571 ## FN1350 integral membrane protein 19 8 Op 2 36/0.000 + CDS 14072 - 15277 1532 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 20 8 Op 3 1/0.500 + CDS 15286 - 15957 307 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 15996 - 16055 3.6 21 9 Tu 1 . 
+ CDS 16079 - 16780 1091 ## COG1359 Uncharacterized conserved protein Predicted protein(s) >gi|224461281|gb|ACDC01000121.1| GENE 1 54 - 1085 1206 343 aa, chain - ## HITS:1 COG:FN0685 KEGG:ns NR:ns ## COG: FN0685 COG4145 # Protein_GI_number: 19704020 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Fusobacterium nucleatum # 1 343 142 484 484 528 89.0 1e-150 MAQFIGGARLFEAVTGLSYTTGLIIFSSVVIIYTTFGGFRAVTLTDAIQAVVMFAATIVL FFVILKHGNGMENIMMKIKEIDPNLLKPDSGGNIAKPFIMSFWILVGIGILGLPATTIRC MAFKDAKAMHNAMIIGTSLVGVLVLGMHLVGVMGRAIIPDLQEVDKIIPILALKNLYPIL AGVFIGGPLAAVMSTVDSLLIISSSTLIKDLYVTYLDKNASENKIKKISMWTSFLIGLLV FILSVKPISLITWINLFALGGQEIVFFCPLILGLYWKRANATGAIASIFFGIVAYLYLEI TKTKIFSLHNIVPSLVVALTAFVIFSYLGKKSDEKTIETFFEY >gi|224461281|gb|ACDC01000121.1| GENE 2 1216 - 1509 421 97 aa, chain - ## HITS:1 COG:FN0685 KEGG:ns NR:ns ## COG: FN0685 COG4145 # Protein_GI_number: 19704020 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Fusobacterium nucleatum # 1 93 1 93 484 132 86.0 2e-31 MDKILIIVPILLYLSAMLFIAYKVNKIKNSSESFTNEYYIGGRSMGGFVLAMTIVATYVG ASSFIGGPGIAYKLGLGWVLLACIQVPTAFFYLRSSW >gi|224461281|gb|ACDC01000121.1| GENE 3 1523 - 1795 366 90 aa, chain - ## HITS:1 COG:no KEGG:FN0686 NR:ns ## KEGG: FN0686 # Name: not_defined # Def: integral membrane protein # Organism: F.nucleatum # Pathway: not_defined # 3 90 17 104 104 120 80.0 1e-26 MKISKQINKEVLITIALYLIYFVWWYYFAYEYGSDNVEEYKYILGLPEWFFYSCVVGLVL INVLVYVCIKLFFKDVDFEEYDNKDKKLDK >gi|224461281|gb|ACDC01000121.1| GENE 4 2014 - 2856 845 280 aa, chain + ## HITS:1 COG:no KEGG:FN1720 NR:ns ## KEGG: FN1720 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 45 280 2 241 242 193 51.0 5e-48 MIEITKENDEIKIVISYKKLIKVYQVCFLFFLILIFILFDFEFPAMILNPLSAMFFIYLI LISFFGISYEKITIKENYILLEVIRNNKRICYSQKISLDEINKTYFKSSFLRGRSRDLLT YIFPFDRYLKIETNKKTYSFGKEIDYEDYLKINKILIEKVREYKAEKIILDKERNREEEL 
EAMYKLGVEERYIEILNAIIDEEKLFISKKEENFLIDAINKSKDSQETDFYVFYVNYLSK KEYANQKVLVGYDGVDGKEVTMSKLKEDINKLRDDRSTFK >gi|224461281|gb|ACDC01000121.1| GENE 5 3008 - 4495 1860 495 aa, chain + ## HITS:1 COG:FN1359 KEGG:ns NR:ns ## COG: FN1359 COG0747 # Protein_GI_number: 19704694 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 495 1 495 495 853 88.0 0 MKRKLFFGKILLSILLTFVFVACQKEESKEESIRTVSTVDIDSLNPYQVVSSNSDQILLN VFEGLVMPGTDGTVIPALAESYEVSEDGKTYTFTIRKGVKFHNGNDMDIKDVEFSLNYMS GKLGNAPTEALFENIEKIEVLDDSHIAIHLSEPDSSFIYYMKEAIVPDENKDHLNDTAIG TGPYKISEYQRDQKLVLSKNEEYWGEKAKIPTVSILISPNSETNFLKFLSGEINFLSGID SKRIPELDKYQILNSPSNLCLILALNPKEKPFDDIEVRKAINLAIDKNKIIQLAMNGKGT PIYTNMSPVMSKFLWAAPEEKADPQKAKQILEEKKLLPIKFTLKVPNSSKIYLDTAQSIK EQLKEVGITVDLEIIEWATWLSDVYTNRKYEASLAGLAGKMEPDAILRRYTSTYAKNFTN FNNARYDALIEEAKRTSNEAKQIENYKEAQKILAEEQAAIFLMDPNTIIATEKGLEGFEF YPLPYFNFAKLYFKK >gi|224461281|gb|ACDC01000121.1| GENE 6 4505 - 5422 655 305 aa, chain + ## HITS:1 COG:FN1360 KEGG:ns NR:ns ## COG: FN1360 COG0601 # Protein_GI_number: 19704695 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 305 1 305 305 487 94.0 1e-137 MYYIKKIFRMLLSVFSIGTFSFLLLELIPGDPETTILGLEASAKDLENLREQLGLNLSFG TRYWNWLCGVFQGDLGISFKYKEPVFKLILERLPLTLKIAFISIFIIFLVSIPLSFFLHN TKSKRIKKIGESIFSFFISIPSFWLGIIFMYLFGIILKWTSTGYNNSWQSLILPCIVISI PKIGWISMHLYSNLYKELREDYIKYLYSNGMKKIYLNFYILKNAFLPIIPLTGMLLLELI TGVVIIEQIFSIPGIGRLLVQSVLMRDIPLIQGLIFYTSTFVVLLNFVIDILYSLLDPRI QVGDQ >gi|224461281|gb|ACDC01000121.1| GENE 7 5422 - 6189 465 255 aa, chain + ## HITS:1 COG:FN1361 KEGG:ns NR:ns ## COG: FN1361 COG1173 # Protein_GI_number: 19704696 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport 
systems, permease components # Organism: Fusobacterium nucleatum # 1 255 1 255 255 402 92.0 1e-112 MKKWQYFILILVGIIFCISFYQNPYKISENFTLLKPSFQHILGTDNLGRDIFSRLLLGTF HSIFLAFSAILLAAIVGSILGAVAGYFGGYIDEFFLFISEIFMSIPVILITLGIIVLLNN GFHSIILALFVLYMPRTLSYVRGLVKREKHKNYIKIARIYGVSNFRIMRRHIAPNIILPI LVNFSTNFAGAILTEASLGYLGFGIQPPYPTLGNMLNESQSYFLLAPWFTILPGLMILFL VYKINQISRKYQEKK >gi|224461281|gb|ACDC01000121.1| GENE 8 6189 - 6890 358 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 12 221 31 261 329 142 37 2e-33 MEILKIKNLNLKIREKEILKNVSLEIKEGEVIGLIGESGSGKTIFTKYILGILPLAAQYT QETFEVVPKVGAIFQNAFTSLNPTMKIGKQLKHLYVSHYGTQENWKEKIESLLEDVGLDK NRNFLDKYPYELSGGEQQRIVIMGALIGEPSFLIADEVTTALDVETKIEVVKFFKRLREK FKISILFITHDLSTLKNFADKIYVMYHGEIVDEDHPYRKQLFQLSQDVWRRTK >gi|224461281|gb|ACDC01000121.1| GENE 9 6890 - 7624 507 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 230 11 257 329 199 39 8e-51 MLLTVENLSKEYIKKKTLNNVSFSMEKGEILGMLGKSGAGKSTIGKILLQLSRPTTGTIL FEGKALSEVPRRDIQAIFQDPYTALNPSLKIGEILEEPLIANGKFTKEERRKKVEETLVK VGLLESDYEKYPEELSGGQQQRVCIAGAIILSPKLIICDEPIASLDLAIQVQILDLIQKI NQEEGISFIFITHNLPAVYRIADRILLLYRGEVQEIQEVEEFFKNPKSEYGKKFLKTSNL IKDF >gi|224461281|gb|ACDC01000121.1| GENE 10 7692 - 7871 314 59 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739029|ref|ZP_04569510.1| LSU ribosomal protein L32P [Fusobacterium sp. 
2_1_31] # 1 59 1 59 59 125 100 2e-28 MAVPKKKTSKAKKNMRRSHHALTAIGLVTCEKCGAPKRQHRVCLECGDYKGSQVLETAE >gi|224461281|gb|ACDC01000121.1| GENE 11 8142 - 9428 1431 428 aa, chain + ## HITS:1 COG:FN1355 KEGG:ns NR:ns ## COG: FN1355 COG4393 # Protein_GI_number: 19704690 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 131 428 1 298 298 544 90.0 1e-154 MLKFYIDVINYLAIFAFLLGIITALLVKYKKLYLNIVVGLISLVGLACSVTMTVFKQLYP QKMVKISLQYNRWALAIGMLFMLVALVLQIIKTFRKCENDKLCIASAISIIFSTVAVWFL GFTIIPQVYAMTKEFVAFGENSFGTQSLLRLGGFLLGLLTIFLIALSVQKVYFRLKPCLA KVFALAIFFVGSMDFFLRGVSALARLRLLKSSNPFVFNVMILEDKSTTYIAILFAIVACI FSFLLFKDSRKVVGTFKNNALLRLEKARLKNNKHWLSSLAFFSILSVFAITVIHSHITKP VALTPPQPYQEEGNMIVIPLTDVEDGHLHRFSYTATGGNNVRFIVVKKPKGGSYGIGLDA CDICGLAGYYERNDEVVCKRCDVVMNKSTIGFKGGCNPVPFEYEIKDKKIYIDKATLEKE KDRFPVGD >gi|224461281|gb|ACDC01000121.1| GENE 12 9430 - 10710 1993 426 aa, chain + ## HITS:1 COG:FN1354 KEGG:ns NR:ns ## COG: FN1354 COG0577 # Protein_GI_number: 19704689 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 426 3 428 428 759 92.0 0 MFWRMVKGTLFRQRSKMLMIAFTVALGVSLATAMMNVMLGVGDKVNKELKTYGANITVMH KDASILDDLYGISGETVSNKFLLESEIPKIKQIFWGFAILDFAPYLERTGEIKGVSDKVK IYGTWFEKHLVMPTGEETDAGIKNLKTWWEVKGEWLNDDDLDGVMVGSLIAGKNNLKVGD TIEVKGTNETKKLTIRGIINSGGNDDEAIYTALKTTQDLFGLEGKITMIDVSALTTPDND LARKAAQDPNSLTISEYETWYCTAYVSSISYQLQEVLTDSVAKPNRQVAESEGTILNKTE LLMLLICILSSFASALGISNLITASVIERSQEIGLIKAIGGTNRRIILLILTEVVLTGIL GGIFGYLAGIGFTQIIGKTVFSSYIEPAVIVVPIDIALVFAVTIIGSIPAIRYLLTLKPT EVLHGR >gi|224461281|gb|ACDC01000121.1| GENE 13 10720 - 11922 1812 400 aa, chain + ## HITS:1 COG:FN1353 KEGG:ns NR:ns ## COG: FN1353 COG0577 # Protein_GI_number: 19704688 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 400 1 400 400 660 97.0 0 
MTKKQMYIKLVVSSLIRRKARMIVALLAVAIGATIMSGLVTIYYDIPRQLGKEFRSYGAN FVVLPSGNEKITETEFDKIKTEMSTQKVVGMAPYRYETTKINQQPYILTGTDMIEVKKNS PFWYIEGEWATNDDENNVMIGKEISKKLNLQIGETFIIEGPKAGAKVVASKQSDSAEESK KKDLNSDFYSKKLKVKGIITTGGAEESFIFLPISLLNEILEDDTKIDSIECSIEADSKQL ESLATKLKSADENITARPIKRVTQSQDIVLGKLQALVLLVNIVVLILTMISVSTTMMAVV AERRKEIGLKKALGAYDSEIKKEFLGEGSALGFIGGLLGVGLGFVFAQEVSLSVFGRAIE FQWLFAPITIIVSMIITTLACLYPVKKAMEIEPALVLKGE >gi|224461281|gb|ACDC01000121.1| GENE 14 11926 - 12624 309 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 218 1 218 245 123 30 7e-28 MDNREVLLEVKNVSKIYGDLHALKEVSFQVRKGEWVAIMGSSGSGKSTMMNIIGCMDKPS VGEVILDGQDITKESQNSLTKIRREKIGLIFQQFHLIPYLTALENVMVAQYYHSIPDEQE ALQALERVGLKDRAKHLPSQLSGGEQQRVCIARALINSPEIILADEPTGNLDEVNEKIVI DILTQLHEEGSTIIVVTHDLEVGDVAERKIILEYGKIVNDIDQKQFGKKKQS >gi|224461281|gb|ACDC01000121.1| GENE 15 12634 - 12762 58 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTACLPLLFQELHKGSFNNNGRRSSLNLYLILFLVTVDRRAK >gi|224461281|gb|ACDC01000121.1| GENE 16 12759 - 13181 727 140 aa, chain + ## HITS:1 COG:FN1351 KEGG:ns NR:ns ## COG: FN1351 COG4939 # Protein_GI_number: 19704686 # Func_class: S Function unknown # Function: Major membrane immunogen, membrane-anchored lipoprotein # Organism: Fusobacterium nucleatum # 1 140 1 140 140 220 87.0 7e-58 MKKYLLVGMVVALSLLTACGKKDFSKMTFNDGEYQGHFNNDDKDHPSTADVVLTIQDGKI VACTAEFRDGKGNIKGDDYGKEAGDEKYKKAQIAVEGFSTYADKLVEVQDPNEVDAVSGA TVSNKEFKEAVWDALDKAKK >gi|224461281|gb|ACDC01000121.1| GENE 17 13161 - 13391 89 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRVIVGVSEANLLAILESLNELVHLEFLTDPKLQRILDFSSLKNLLSNELFLSFIYFYH LFIFIIYLFLLFFSFV >gi|224461281|gb|ACDC01000121.1| GENE 18 13631 - 14068 571 145 aa, chain + ## HITS:1 COG:no KEGG:FN1350 NR:ns ## KEGG: FN1350 # Name: not_defined # Def: integral membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 145 1 145 145 221 88.0 5e-57 MKKNILEKLALILSVILFLVPKYIAPVCGPKEDGSHMACYFSGNAVMKIAVAIFIITLVM 
ILLSRVKIVKIIGAVATIVLSAYVYLVPHGMSGLQNEMGKPFGVCKIDTMQCHIHHTFEI ATGIAVVIGLLMVFSLISTFLKKED >gi|224461281|gb|ACDC01000121.1| GENE 19 14072 - 15277 1532 401 aa, chain + ## HITS:1 COG:FN1349 KEGG:ns NR:ns ## COG: FN1349 COG0577 # Protein_GI_number: 19704684 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 401 1 401 401 670 92.0 0 MSKRIDANSLAMENIRQRKTRSTCMILLVALFSIIVYMGSMFSLSLSKGLESLSDRLGAD VIVVPAGYKAEIESVLLKGEPSTFYLPADTMDKLKKFDEIEKMTAQTYVATLSASCCSYP VQIIGIDIDTDFLIYPWITHNIDKELKDGEAIVGSHVIGEKGETVHFFNEELKIVGRLKQ TGIGFDATVFVNQNTAKKLARASERITANKVAEEDVISSVMIKVKPGVDSVKLASKISKE LSKEGIFAMFSKKFVNSISSNLKVLATSVLILVVAIWLLSVIILSISFTAIFNERKKEMA VLRVLGASKKMLRNIIIKEAVILSLIGAGIGSFLGFILSIIELPLIASKFSMPFLSPSIM QYIGIFVLSFVLAVIIGPLSTVRVVKKLTDKDSYLSLREEM >gi|224461281|gb|ACDC01000121.1| GENE 20 15286 - 15957 307 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 220 1 221 245 122 35 1e-27 MLEIKNISKSYNRQGKDFFAVKDVNLNISHGDFIHIIGRSGSGKSTFLNIVAGLLSADKG SLSLDRTNYMELPDEEKSEFRNKNIGFIPQSPALLSYLNVLENIRLPYDMYEKEGDSEGK ARYFLNELGLEHLAKSYPKELSGGELRRIIIARALMTEPKILIADEPTSDLDIEATKEVM DLLKKINEKGTTVLVVTHELDTLKYGKKVYTMSEGILEDGKKL >gi|224461281|gb|ACDC01000121.1| GENE 21 16079 - 16780 1091 233 aa, chain + ## HITS:1 COG:FN1347 KEGG:ns NR:ns ## COG: FN1347 COG1359 # Protein_GI_number: 19704682 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 17 233 1 218 218 306 75.0 2e-83 MLKKLLMGLAMLTSVSMYAVPTLNVYNFEVKNDKEASYKSITEDYVNKTAIEQGVLGLFA TTDDRDKLNSYVIEIYNDYLAFSNHTKNQTSADFKAMIPQIAEGNLNATDVEVQFAKDKK IEQNENTFAVYTVIEVKPENNTEFAEFIKNRAEASFNENGTLLVYIGTDRRSPNKWCVFE VFTDMDSYLNQRAASYSKNFITETKDMVISQKRAELQALKLINKGGLDYKKLY Prediction of potential genes in microbial genomes Time: Thu May 19 23:44:59 2011 Seq name: gi|224461280|gb|ACDC01000122.1| Fusobacterium sp. 
2_1_31 cont1.122, whole genome shotgun sequence Length of sequence - 8538 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 642 814 ## FN1346 putative cytoplasmic protein + Term 700 - 752 6.0 + Prom 822 - 881 10.5 2 2 Op 1 1/0.000 + CDS 912 - 1640 952 ## COG0584 Glycerophosphoryl diester phosphodiesterase 3 2 Op 2 . + CDS 1653 - 3011 799 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Term 3037 - 3084 9.1 + Prom 3015 - 3074 11.2 4 3 Tu 1 . + CDS 3255 - 4682 1575 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 4699 - 4728 -0.2 - Term 4739 - 4792 9.2 5 4 Op 1 . - CDS 4795 - 8061 3769 ## COG1002 Type II restriction enzyme, methylase subunits 6 4 Op 2 . - CDS 8082 - 8537 664 ## FN0231 hypothetical protein Predicted protein(s) >gi|224461280|gb|ACDC01000122.1| GENE 1 1 - 642 814 213 aa, chain + ## HITS:1 COG:no KEGG:FN1346 NR:ns ## KEGG: FN1346 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 14 212 1 199 200 266 73.0 5e-70 KNKFCNSLFFGGSMSIVEKYLKELKRAYYKNGGKEMWDNFEKIKEGASEEDIKKLKEEYP EVTDSLIELLKNVDGTYFRKYKGETIAFYFLGSDIDEYPYYLLSSSQILETKDKAYKYYA DYVDRKYEEVEIDEEIISDSKKMRWLHFSDCMNNGGTSQLFIDFSPSEKGVKGQVVRFLH DPDEIAVIADSFDEYLEKLMEYELDFINEDTME >gi|224461280|gb|ACDC01000122.1| GENE 2 912 - 1640 952 242 aa, chain + ## HITS:1 COG:BS_yqiK KEGG:ns NR:ns ## COG: BS_yqiK COG0584 # Protein_GI_number: 16079474 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 1 235 1 237 239 195 41.0 6e-50 MTKNFAHRGFSGKYPENTMLAFQKAVEIGADGVELDVQLTKDGEVVIIHDETIDRTTDGK GYVVDYTYEELSKFDASYIYTGKMGFNKIPTLKEYFELVKDLDFVTNIELKTGINQYLGI EEKVYKLIKEYKLEKKVIISSFNHFSVLRMKKIAPELKCGFLSEDWIIDAGAYTASHGIE 
CFHPRFNNLIPEVVEELKKNNIEINTWTVNKEEDIKDLINKGIDILIGNYPDLVKKIINE NK >gi|224461280|gb|ACDC01000122.1| GENE 3 1653 - 3011 799 452 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 3 447 2 445 456 312 38 6e-85 MESLELFLTTVNKWLWGRWLVYVLLALGILYTFANGFIQVRHFKFIMKKTLVDSFKARND EKGSGSISTFKAMMVTLAGNVGGGNVVGVATAVAAGGMGAVFWMWVAAFFGMALKYGEIV LSQLYRGKDSEGNLLSGPMYYIRDGLKAPWLGIVIAVLMCTKMMGANLVQSNTISGVLKS NYNVPTWLTGVILICCLMAVVLGGLKRLANIATSLVPIMSIFYVAVGLLVILLNIQAVPG VFKEIFTQAFSMKAAAGGTGGYIIARAMQYGITRGMYSNEAGEGTAPFAHGSAIVDHPCE EGITGVTEVFLDTIIICSITAIVIGVTGIYQSDLSPAVMAIESFGTVWEPLKHLATFALL LFCFTTLMGQWFNAAKSFTYAFGPKVTDKVRFVFPFLCIIGAITKISLVWTIQDVAMGLV IIPNLIALIILFPQVSKQTKDYFSNPKFYPKK >gi|224461280|gb|ACDC01000122.1| GENE 4 3255 - 4682 1575 475 aa, chain + ## HITS:1 COG:FN0191 KEGG:ns NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 1 475 1 476 477 707 76.0 0 MTIDEIKKLIQNGEKIDVEFKESKNALTKDVFDTVCSFNNRNGGHILLGINDKRDIVGVS EDRVDKVMKEFITSINNPQKIYPPLYLLPEVFEIDSKKIIYIKVPEGYQVCRHNGKIWDR SYEGDINITNHAELVYKLYARKQGSYFVNKVYPNFDIEFLDTTVIDKARKMAINRNKNHV WGNMSDEELLRSANLILIDPETKCEGITLAAILLFGKDNSIMSVLPQHKTDAIFRVENKD RYDDRDVVITNLIDSYDRLIAFGQKHLNDLFVLDGIININARDRILREIVSNTLAHRDYS SGFPAKMIIDNEKIMIENSNLAHGMGVLSLQKFEPFPKNPTISKVFREIGLADELGSGMR NTYKYTKLYSGADPLFEERDVFRTIIPLKKIATQKVGGNDVAQDVAQDVAQDVAQDKIAL AEFIKEKIRGNDKITRKMIANEAGVSIKTIERVIKEIDNLKYVGRGSNGHWELIE >gi|224461280|gb|ACDC01000122.1| GENE 5 4795 - 8061 3769 1088 aa, chain - ## HITS:1 COG:Ta1336 KEGG:ns NR:ns ## COG: Ta1336 COG1002 # Protein_GI_number: 16082323 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Thermoplasma acidophilum # 22 574 9 494 496 155 28.0 3e-37 
MNNLFNQKLLIQKAQEEINLSDYFEKRKILNNWINSLEKGILSKSKEEEFQGEFLNDIFS LILGAVNKSSGNDEWNLQRESKTRIDGQKADGVIGFFDVNGKDDVRAVIELKGPTISLDQ RQKRSGDTRTPVEQAFNYAPKYGKNCQWVIVSNYKEIRLYRSNDMTEYEVFFLENLKDDL EFQKFIYILSFEALVGTANKKAKALELSEEYQKNQIEIEKKFYNEYKNIRLHIFENMKEN NPETDENTLLEKVQKLLDRFLFICFCEDKGLLEKDFFNTILKKGKDFGSIFDIFKVFCNW INLGNPKENISHFNGGLFKNDDVLNSLNIDDKVFEELKKISDYDFDSDLNVNILGHIFEQ SISDIEELKKSISGEEFDQKKSKRKKDGIFYTPQYITKYIVENSIKNWLDDKRKELGEDD LPKLNEKDYIFDIAKKNYTKNYRKHIEFWQQYREAVRNIKIIDPACGSGAFLITAFEFLL NYNKYLDDKIFDLVGTSDLFSDRTKEILQNNIFGVDLNKESVEITKLSLWLKTADKNKTL ASLENNIKCGNSLIDDPEIAGNLAFNWEKEFPEIFANGGFDIVVGNPPYVKEDIGKNAFN GLHQHLCYQGKMDLWYFFGWLALTISKKEFAYISFIAPNNWITNDGASKFRNKINDCATI FEYIDFNNFMIFEEAQIQTMIYIMKNDNKLEKYKFKYSKILNNKIAKEEIMHFLQKLENN NFEYFDVDINRVDYKDKLFNFNSEKNRNVLNKIKANANFYLKKEEIFSGIDIGQDFINAK SLEILGDDFKIGDGIFNLSEEEYNSYNFFNNEKEIIKPFYTTKEVNRYYFNEKNKYWVIY TTSKFKNPQEIIDYPNIKKHLDKFSKVITSDNKPYGLHRARNEEIFKGEKILSIRKHERP AFSYVTLDTYVNRTFNIIKTDRVNLKYLLVLLNSKLTKFWLKEKGKMQGDIFQVDITPII SIPLIIVSKDQEAFISLSEKMLSLNRELQDLSQKFQRMLLRKFDLEKLSTKLQEWYLLDF SDFIKELKRLKVKLSLSQESEWEEYFLEEKSKAIAIDSEIKNTDKEIDSMVYRLYDLTDE EIKIIEEE >gi|224461280|gb|ACDC01000122.1| GENE 6 8082 - 8537 664 151 aa, chain - ## HITS:1 COG:no KEGG:FN0231 NR:ns ## KEGG: FN0231 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 115 151 1 37 37 78 91.0 7e-14 YLHSEENRKILDRTIPIGKYEVELAILNSKTISKRVAGARLKIKNDKIIRYEQTQNKSSK LNGFGVDAGLASFCDATVAEEYTKFYSNNDYFIKLLQGKQFIDWEIPGTNHKIAMFETGF GDGYYMSLYGLNEKDEVCELVIPFINPELID Prediction of potential genes in microbial genomes Time: Thu May 19 23:45:12 2011 Seq name: gi|224461279|gb|ACDC01000123.1| Fusobacterium sp. 2_1_31 cont1.123, whole genome shotgun sequence Length of sequence - 16724 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 6, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 491 606 ## COG4859 Uncharacterized protein conserved in bacteria 2 1 Op 2 . 
- CDS 528 - 1076 689 ## FN0234 hypothetical protein 3 1 Op 3 . - CDS 1130 - 1597 675 ## FN0234 hypothetical protein - Prom 1627 - 1686 11.8 + Prom 1659 - 1718 13.4 4 2 Op 1 . + CDS 1847 - 2761 1190 ## COG1897 Homoserine trans-succinylase 5 2 Op 2 1/0.000 + CDS 2783 - 4153 1391 ## COG2849 Uncharacterized protein conserved in bacteria 6 2 Op 3 1/0.000 + CDS 4170 - 5765 1714 ## COG2849 Uncharacterized protein conserved in bacteria 7 2 Op 4 1/0.000 + CDS 5790 - 7130 1676 ## COG2849 Uncharacterized protein conserved in bacteria 8 2 Op 5 1/0.000 + CDS 7155 - 8084 1103 ## COG2849 Uncharacterized protein conserved in bacteria 9 2 Op 6 . + CDS 8109 - 8882 1043 ## COG2849 Uncharacterized protein conserved in bacteria 10 3 Tu 1 . - CDS 9000 - 10337 947 ## COG0534 Na+-driven multidrug efflux pump - Prom 10436 - 10495 14.4 + Prom 10385 - 10444 11.7 11 4 Tu 1 . + CDS 10529 - 12106 2071 ## COG2461 Uncharacterized conserved protein + Term 12108 - 12164 2.8 - Term 12096 - 12144 6.7 12 5 Op 1 23/0.000 - CDS 12154 - 12417 434 ## PROTEIN SUPPORTED gi|237739055|ref|ZP_04569536.1| SSU ribosomal protein S16P 13 5 Op 2 8/0.000 - CDS 12468 - 13802 1946 ## COG0541 Signal recognition particle GTPase 14 5 Op 3 . - CDS 13813 - 14124 334 ## COG2739 Uncharacterized protein conserved in bacteria - Prom 14205 - 14264 10.3 + Prom 14180 - 14239 10.3 15 6 Op 1 1/0.000 + CDS 14283 - 15986 2682 ## COG0442 Prolyl-tRNA synthetase 16 6 Op 2 11/0.000 + CDS 16052 - 16369 527 ## PROTEIN SUPPORTED gi|237739059|ref|ZP_04569540.1| SSU ribosomal protein S6P 17 6 Op 3 . 
+ CDS 16421 - 16639 351 ## PROTEIN SUPPORTED gi|197736537|ref|YP_002165315.1| ribosomal protein S18 + Term 16659 - 16696 5.1 Predicted protein(s) >gi|224461279|gb|ACDC01000123.1| GENE 1 2 - 491 606 163 aa, chain - ## HITS:1 COG:FN0232 KEGG:ns NR:ns ## COG: FN0232 COG4859 # Protein_GI_number: 19703577 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 18 82 1 65 65 107 76.0 7e-24 MKKYVENAGSCIVTKSLLNGESNFRWLFREEPLDNIDTGWMAFGDSDNDEYVNDPKNLSV VDLNTLINIEPTILNVYEMPVGTDLIFIEEDGEKYFINAKTNEQIREKVKSPFMIAFEKN LDFLRKDEYSKEFIENLFTESDRISLDTIGEVDFPTGQVIIAD >gi|224461279|gb|ACDC01000123.1| GENE 2 528 - 1076 689 182 aa, chain - ## HITS:1 COG:no KEGG:FN0234 NR:ns ## KEGG: FN0234 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 53 182 15 147 147 108 48.0 6e-23 MKKITMFFLAVLALTFVACGKGEGGIVDKIKSLDNTTSQSEASSDSLDHGDKAEYLDIDK IVSEFDITEEDDEHIEFQDRDQEKAVYRIFIFEKMNKLDFKNPDRMDILENFYIEKNCDI VYKDDETIIIKLEQDGTLAYNIHNFDDSKTELAVIVSIGSDRELSETELFDILKEAKSFI KK >gi|224461279|gb|ACDC01000123.1| GENE 3 1130 - 1597 675 155 aa, chain - ## HITS:1 COG:no KEGG:FN0234 NR:ns ## KEGG: FN0234 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 155 1 146 147 117 51.0 2e-25 MKKFLVFFLAILALAFVACGKDKDIRDILDKEKISSEFNIVEEREKYFEFKDKDDNRDVF RIFMYEKISSIDFKNPKKIDSLEEGYIEQGCDIIYKDKDTIMIGIFDPEVGYGYNIHNFD NSKTTLEIIVAIGSQDELSEKDLFEILKEAKSFIK >gi|224461279|gb|ACDC01000123.1| GENE 4 1847 - 2761 1190 304 aa, chain + ## HITS:1 COG:BH2280 KEGG:ns NR:ns ## COG: BH2280 COG1897 # Protein_GI_number: 15614843 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Bacillus halodurans # 1 299 1 301 303 332 53.0 5e-91 MPIRVANDIPAKNQLTEEGIIFMEETRANTQDIRPLNILILNLMPKKEETETQLLRLIGN SPLQINVEFLMVKNHESKNTNLSHIEKFYQYFDDIKDNFYDALIVTGAPVEQMEYEEVDY WNELQKIFEWSKTHVFSCLHICWAAQARLYNDYKIAKTIQPAKVFGVFQHEITESSNPLI 
RGFSDVFLAPHSRHTHIDENKLASTKELEILAKSEVGSLLISTEDLRKIFITGHLEYDRE TLLGEYRRDKDKGLEIQVPVNYFPNDDDTKKPLQTWKTTAHLFYHNWLNAVYQLTPYDLK DLAK >gi|224461279|gb|ACDC01000123.1| GENE 5 2783 - 4153 1391 456 aa, chain + ## HITS:1 COG:FN1514 KEGG:ns NR:ns ## COG: FN1514 COG2849 # Protein_GI_number: 19704846 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 44 456 1 503 503 468 61.0 1e-131 MKKAFTILILMLFIFSIANAHSFKNEKELNNFFSKIDQLIKEELKKDYREEMTKRKGTAN NEYSFEIEDDRTVLITRSIAGIKPETEITQYFNSEGKLYMISSLTSETEKDLYALYRKYD SNGNLFIYSYAIDGKNIDRGYYSDGKLAYIQELKIIKGQPPIPNGKYIEYYKNGQIKVQG NNKDGKRDGEFKAFLRNGKSAGSVFYKDGKIIKSTLVNSMKDNASFSLTTDINYNLNSNE IITDEFLNGLLKQYFIYNKNGLLNGESREYYEEGDIKSISHFKNDIPDGVFISYYQNGNI ENKYAYVNGQANGECFSYYENGKLEERYFLKNGEIDGEAFAYYPSGKLRGKEVQKLGKRE GESIIYHENGNIKQKSTFKNDKREGDLFIYFPSGKLRQTEKYINGKIEGEVIEYYESGTV KEKAYFINDKQEKEHFFYDKKGKLIKTDIYKNGVKQ >gi|224461279|gb|ACDC01000123.1| GENE 6 4170 - 5765 1714 531 aa, chain + ## HITS:1 COG:FN1515 KEGG:ns NR:ns ## COG: FN1515 COG2849 # Protein_GI_number: 19704847 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 531 1 555 555 583 65.0 1e-166 MRKNFLVLIFLFFTFSILNAKPLKTEAELQNIRNKADKIVQEELKNDYKKEYLKRKNNLE KIEGIEESVFSDGEFVFELKNGVVDNISKNIEKNPDIFVTKAFDKDGNLLKIAFLAKLDE GTFLYREFNTDLSPIIETYSINEKCIQKAYYSNKKLAYTREGKLVEGYNLLPNGKYTEYY KNGQIKVQGSYKEGKRDGEFKAFLKNGKSAGFIIYKDGKIIKSTLVKAMKDNASFSPVTD IYYKLEDSHTLRKVDYENGLLKTYFIYNKDGIPDGESVEYYEEGSIESIVHFRNNIVEGL TITYYENGNIDEEVNYKNNKMNGEAKSYDENGKLNGRTIFKDNIKLEEDVYKENEILKNT FKNGELVKQDICTLNGTLKERRILNGDEMEYSTFYPNGNVKQKILAKDKIIIKEQIYARS GNIMYNSFFSDGKPVTEYFEYYPDGKLFRKIVSVDGKLNGDSIEYYPSGNIKEKIFLVDD KMNGEDIEYYENGVIKEKAYFINDKQEKEHFFYDEKGKLIKTDIYKNGVKQ >gi|224461279|gb|ACDC01000123.1| GENE 7 5790 - 7130 1676 446 aa, chain + ## HITS:1 COG:FN1512 KEGG:ns NR:ns ## COG: FN1512 COG2849 # Protein_GI_number: 19704844 # Func_class: S Function unknown # Function: Uncharacterized 
protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 355 42 396 396 500 79.0 1e-141 MKRFLITLILMLSIFSIASAHPFKSEKELYNYYAEIDKKINEELKNSPEKILKDRKNTLK PLSMNVFGADKVLGDNSYLFGFDKNGKIMSMMKRNVLDGPSMIARIYYSNGNLKELYLMD DDFVTGIVRTYYESGKKHEEIPYYKGKKEGLRKIYFENGNLSNEVHYFDDSREGKTTDYY SNGKVLRVKNYKNNMIDGEFAEYYRNGQIKVKGTYKDDLRDGEFKFYSENNKYLGSVFYK DKEIIKSTLSEEDKDELEDSFEFADMSSFLRAATGDIVGARTDKYPNGKTRISMPYNVNG ELHGKFKEFYESGKISSETTYENGIRQGKSLEYLENGKIVEEKNYIDGKKEGKALETFEG MIQMKANYKNNKIDGDMFLYYPSGKLLQKRSFINGKAEGELVEYYENGVVKEKAYFINDK QEKGHFFYDEKGKLIKTDIYKNGIKQ >gi|224461279|gb|ACDC01000123.1| GENE 8 7155 - 8084 1103 309 aa, chain + ## HITS:1 COG:FN1514 KEGG:ns NR:ns ## COG: FN1514 COG2849 # Protein_GI_number: 19704846 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 27 309 224 503 503 154 39.0 3e-37 MKRFFTILILMLSIFSITNTYSEEMHKEYYDSGKILKESHFSNDKKNGLEKIYYENGKIS SIKNYKDGKADGEYVEYYTDGELKLKGSYKNGLRNGEFKTYLMNSKSAGSMFYKDGKEIK STLTPYMKEDVFFNFPDEIESLINTVSKKSKELIKLRDEDDGYHILGVANYPNGRVCSAV QVNDLGEYDGERKEYYESGQLEQKGYYKDDLGQGEYIWYYEEGSIKQKAFYKDDKIEGTL FIFFPGGKIAQTNNYVNGKKEGELIEYYENGQIKEKRFYINDKEEGKSLFYDEKGKLIKT EVYKNGVKQ >gi|224461279|gb|ACDC01000123.1| GENE 9 8109 - 8882 1043 257 aa, chain + ## HITS:1 COG:FN1512 KEGG:ns NR:ns ## COG: FN1512 COG2849 # Protein_GI_number: 19704844 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 250 42 285 396 129 36.0 5e-30 MKRFLTILILMLSVFSIANAHPFKSDKELYDFYAEIDKKIFAEKNKPIVRKKFPRKLTKE EESKVPLDNRNYTVEEVIGNDKLVYSAFDNHLMYIFQLNQKGKEEGVVRFFDEDENLVKI CYGNDLNGLMGILREYYPNGKLKNEMPYYAKKLNGNGKIYYESGALREDYHYYNDKEDGQ GIIYFENGQKMQVENYKNGLKVGDYYEYFEDGTLATKGFFVNGKEEGVFELYNREGKKFK ELVFKKGKKIEEREIKQ >gi|224461279|gb|ACDC01000123.1| GENE 10 9000 - 10337 947 445 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven 
multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 443 1 443 445 542 69.0 1e-154 MKTMFKNNDLTQGKIWKVILNFTLPIFLGTLFQSLYTTIDAIIVGKFAGKDAFAAIESVM SFQRLPVSFFIGLSSGATIIISQYFGAKEKEDVSKASHTAMLFAIVGGLILSILSCILSP YFIGLIRVPQKIFHEAYIYTFICFSGMVFSMIYNIGSGILRALGNSKTPFHILIFANILN IVLDLIFVINFNLSVVGVGLATLISQIVSAILVFVVLMRTNLDCRIYIKKLTFYKKYLKK IFVLGLPIAIQSVIYPIANTTIQSKINMFGVNSIAAWAISGKLDFLIWSVSDAFCISSST FVAQNYGAKKHHRVKKGIISSVIMSISMILVISITLFIWSKDLAPFLIEDREVIELTSEI LSILAPFYFIYTIGDVLAGAIRGLGDTFYPMLINIFAICIVRLLWIFFVFPLNPTFFMIL YGYLISWTVNTIAFLIYIYFKRKKI >gi|224461279|gb|ACDC01000123.1| GENE 11 10529 - 12106 2071 525 aa, chain + ## HITS:1 COG:FN1655 KEGG:ns NR:ns ## COG: FN1655 COG2461 # Protein_GI_number: 19704976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 4 525 1 512 512 810 84.0 0 METMAKHLPALDEEKLKFVIELKEKYNAGKISLEEARKLLKERVKTLTPYEIAYAEQKIV PFVEDECIKENIQNMMLLFNEVMDTSRPTDLPSDHPIMCYYRENDDMRELLKEVENLIQF PVIKNQWYELYDKLDLWWKLHLPRKQNQLYSLLEKKGFTRPTTTMWVLDDFVRDELKENR KMLDDGNEEEFIASQTSVAADIIDLIQKEETVLYPTSLAMITEEEFEDMKSGDKEIGFTF GELEEVSPKKEINQSESSNISGQGNLAKDLAQLLGKYGFNSDANSSELDVAMGKMTLEQI NLVFKHLPVDITYVDENEIVKFYSDTAHRIFPRSKNVIGRDVKNCHPRKSVHIVEEIIEK FRNGEQDFAEFWINKPGLFIYICYSAVKDKDGNFRGILEMMQDCTRIRSLQGSQTLLNWE NGTMNAEEVKEEKVEEKTEESPKEESLNSQIPLDSINKDTYLKDLIKVYPNLKKDMIKIS ERFKILQGPLAAVMLPKATLEKVSEKGDIDLNTLIEKIKELIKTY >gi|224461279|gb|ACDC01000123.1| GENE 12 12154 - 12417 434 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739055|ref|ZP_04569536.1| SSU ribosomal protein S16P [Fusobacterium sp. 
2_1_31] # 1 87 1 87 87 171 100 2e-42 MLKLRLTRLGDKKRPSYRLVAMEALSKRDGGAIAYLGNYFPLEDSKVVLKEEEIIKFLQN GAQPTRTVKSILVKAGVWAKFEESKKK >gi|224461279|gb|ACDC01000123.1| GENE 13 12468 - 13802 1946 444 aa, chain - ## HITS:1 COG:FN1393 KEGG:ns NR:ns ## COG: FN1393 COG0541 # Protein_GI_number: 19704725 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Fusobacterium nucleatum # 1 444 1 444 444 765 97.0 0 MLENLGNRFQDIFKKIRGHGKLSDSNIKDALREVKMSLLEADVNYKVVKDFTNRISEKAI GTEVIRGVNPAQQFIKLVNDELVELLGGTSSKLTKGLRNPTIIMLAGLQGAGKTTFAAKL AKFLKKQNEKLLLVGVDVYRPAAIKQLQVLGQQIGVDVYSEEDNKDVVGIATRAIEKAKE INATYMIVDTAGRLHVDETLMNELKELKKAIKPQEILLVVDAMIGQDAVNLAESFNNALS VDGVILTKLDGDTRGGAALSIKAVVGKPIKFIGVGEKLNDIEIFHPDRLVSRILGMGDVV SLVEKAQEVIDENEAKSLEEKIKSQKFDLNDFLKQLQTIKRLGSLGGILKLIPGMPKIDD LAPAEKEMKKVEAIIQSMTIEERKKPDILKASRKIRIAKGSGTDVSDVNKLLKQFEQMKS MMKMFSSGKMPNLGGMGKGGKFPF >gi|224461279|gb|ACDC01000123.1| GENE 14 13813 - 14124 334 103 aa, chain - ## HITS:1 COG:FN1394 KEGG:ns NR:ns ## COG: FN1394 COG2739 # Protein_GI_number: 19704726 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 103 1 103 103 127 81.0 5e-30 MILDEFIEIANLLEIYSPLLSEKQREYLEDHFENDLSISEIAKNNNVSRQAIFDNIKRGV ALLYEYENKLKFHQIKQDIREKLIDLKEDFTEEKLENIIEDLV >gi|224461279|gb|ACDC01000123.1| GENE 15 14283 - 15986 2682 567 aa, chain + ## HITS:1 COG:FN1658 KEGG:ns NR:ns ## COG: FN1658 COG0442 # Protein_GI_number: 19704979 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 567 1 567 567 1068 94.0 0 MRFSKAYIKTLKETPKEAEIASHKLMLRAAMIKKLASGIYAYLPLGYRTIRKIENIIREE MDRAGALELLMPVVQPAELWQESGRWDVMGAEMLRLKDRHERDFVLSPTQEEMITSIVRS DISSYKSLPLNLYHIQTKFRDERRPRFGLMRGREFTMKDGYSFHTSQESLDEEFLNMRDA YTRIFTRCGLKFRPVDADSGNIGGSGSQEFQVLAESGEDEIIYSDGSEYAANIEKAVSEL INPPKEDLREVELVHTPDCPTIESLAKYLDIPLERTVKALTYKDMGTDEIYMVLIRGDFE 
VNEVKLKNILNAVEVEMATDEEIEKIGLTKGYIGPYKLPAEIKIVADLSVIEVTNHVVGS HQKDYHYKNVNYGRDYKADIVADIRKVRVGDNCITGGKLHSARGIECGQIFKLGDKYSKA MNATYLDENGKTQYMLMGCYGIGVTRTMAAAIEQNNDENGIIWPVSIAPYIVDVIPANIK NEGQVSLAEKIYNELQAEDIDVMLDDRDEKPGFKFKDADLIGFPFKVVVGKRADEGIVEV KIRKTGETLEVSESEVVAKIKELMKLY >gi|224461279|gb|ACDC01000123.1| GENE 16 16052 - 16369 527 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739059|ref|ZP_04569540.1| SSU ribosomal protein S6P [Fusobacterium sp. 2_1_31] # 1 105 1 105 105 207 99 4e-53 MGKNQREEVNAMKKYEIMYIINPTVLEEGRDELINQINSLLTANGATIAKTEKWGERKLA YPIDKKKSGFYVLTTFEMDGTKLAEVEAKINIMEAVMRHIVVRLD >gi|224461279|gb|ACDC01000123.1| GENE 17 16421 - 16639 351 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|197736537|ref|YP_002165315.1| ribosomal protein S18 [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 72 1 72 72 139 100 1e-32 MAEFRRRRAKLRVKAEEIDYKNVELLKRFVSDKGKINPSRLTGANAKLQRKIAKAVKRAR NIALIPYTRIEK Prediction of potential genes in microbial genomes Time: Thu May 19 23:45:20 2011 Seq name: gi|224461278|gb|ACDC01000124.1| Fusobacterium sp. 2_1_31 cont1.124, whole genome shotgun sequence Length of sequence - 4014 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 35 - 94 6.2 1 1 Tu 1 . + CDS 168 - 971 1102 ## COG0501 Zn-dependent protease with chaperone function + Term 981 - 1032 2.1 + Prom 1014 - 1073 6.6 2 2 Op 1 . + CDS 1093 - 1332 368 ## Lebu_1593 protein of unknown function DUF172 3 2 Op 2 . + CDS 1326 - 1604 300 ## COG4115 Uncharacterized protein conserved in bacteria + Term 1605 - 1652 5.4 - Term 1435 - 1487 1.2 4 3 Op 1 . - CDS 1612 - 1695 70 ## 5 3 Op 2 . - CDS 1667 - 1951 471 ## FMG_1585 hypothetical protein - Prom 2038 - 2097 9.6 + Prom 2002 - 2061 8.7 6 4 Op 1 . + CDS 2169 - 2549 568 ## COG0789 Predicted transcriptional regulators 7 4 Op 2 . 
+ CDS 2562 - 4014 2236 ## COG1154 Deoxyxylulose-5-phosphate synthase Predicted protein(s) >gi|224461278|gb|ACDC01000124.1| GENE 1 168 - 971 1102 267 aa, chain + ## HITS:1 COG:RSc0153 KEGG:ns NR:ns ## COG: RSc0153 COG0501 # Protein_GI_number: 17544872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Ralstonia solanacearum # 23 265 38 274 314 130 32.0 3e-30 MKKIKNIILMLFVSLTLISCSTAPLTGRRQLKMVSDEAVAQSSISQYNQMIAELRQNNLL ANNTADGQRINQIGRRISRAVEQYLTANGMQDKIKSLQWEFNLIKSKDINAFALPGGKIA FYTGILPVLKTDAAIAFVMGHEIGHVIGGHHAESASNQNLAGFLMIGKKLIDAVTGVPVI SDDLAQQGLSLGLLKFNRTQEYEADKYGMIFMAMAGYNPQEAIAAQQRMMDLGGSQQAEI LSSHPSTQNRIEELKRFLPEAMKYYKK >gi|224461278|gb|ACDC01000124.1| GENE 2 1093 - 1332 368 79 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1593 NR:ns ## KEGG: Lebu_1593 # Name: not_defined # Def: protein of unknown function DUF172 # Organism: L.buccalis # Pathway: not_defined # 1 79 1 79 79 112 84.0 6e-24 MTNTNATNLRKNLFSYLDSTIEYNDIINVNTKKGNVIIISESEYNGLLETLYLLSDPTMK EKLETVKNATDEDYEVFEW >gi|224461278|gb|ACDC01000124.1| GENE 3 1326 - 1604 300 92 aa, chain + ## HITS:1 COG:RC0291 KEGG:ns NR:ns ## COG: RC0291 COG4115 # Protein_GI_number: 15892214 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Rickettsia conorii # 6 91 2 86 86 67 46.0 5e-12 MVDEEYKIYILKKANKDKENIKQFPALKNNVDKLIKLIKKNPFQTPPPYEVLIGDLKGYY SRRINKQHRLVYEVIEDEKRINIISMWKHYEF >gi|224461278|gb|ACDC01000124.1| GENE 4 1612 - 1695 70 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLLRFWMIKCQTLGNNKRVAINNMTL >gi|224461278|gb|ACDC01000124.1| GENE 5 1667 - 1951 471 94 aa, chain - ## HITS:1 COG:no KEGG:FMG_1585 NR:ns ## KEGG: FMG_1585 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 10 88 3 83 86 97 58.0 2e-19 MKAITTDKEKTNINFRLDKNLKEELDILCDEIGITVTTAFTIFAKKFVRERRLPFTVDAD PFYSAKNLERLKKSIEQLENKGGTIHEITEVLDD >gi|224461278|gb|ACDC01000124.1| GENE 6 2169 - 
2549 568 126 aa, chain + ## HITS:1 COG:BS_yraB KEGG:ns NR:ns ## COG: BS_yraB COG0789 # Protein_GI_number: 16079754 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 113 1 113 140 114 46.0 5e-26 MTIKEVSEELGLTQDTLRYYEKIGMIPPVTRTEGGIRDYQKEDLEWVKLATCMRSAGLPV KVMIDYLSLYKQGDSTIQQRCNLLKEQREKLLEQRKQIEETLEKLNYKIAKYEIAVETGK LTWDKE >gi|224461278|gb|ACDC01000124.1| GENE 7 2562 - 4014 2236 484 aa, chain + ## HITS:1 COG:FN1464 KEGG:ns NR:ns ## COG: FN1464 COG1154 # Protein_GI_number: 19704796 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Fusobacterium nucleatum # 1 484 1 484 583 870 89.0 0 MYLEKINSPEDVKKLNIEEMKVLAEEIREAIIKRDAIHGGHFGPNLGMVEATIALHYVFD SPKDKFVFDVSHQTYPHKMLTGRREAFTDEAHYDDVTGYSNQHESEHDHFILGHTSTSIS LALGLAKARDVKGEKGNVIAIIGDGSLSGGEALEGLDFAGELKTNFIVIANDNDMSIAEN HGGLYKNLKLLRETEGKAECNLFKAMGLEYIFVKDGNNIEELIEAFKKVKDIDHPITVHI HTQKGKGYKLAEENKEPWHYVMPFNIEDGKPLNNDDSEDYTDVTKEYLMKKMKEDKTVVT ITAGTPGSFSFSRKEREELGEQFVDVGIAEQTAVALASGMASKGAKPVFTVVNSFVQRTY DQLSQDLCINNNPATIVVSYGGAIGMTDVTHLGWFDIAMMSNIPNLVYLAPTTKEEHLAM LEWSIEQQEHPVAIRIPGGKMVSTGEKVTKDFSKLNTYEVKQKGEKVAILGLGTFYQLGE KAAK Prediction of potential genes in microbial genomes Time: Thu May 19 23:45:31 2011 Seq name: gi|224461277|gb|ACDC01000125.1| Fusobacterium sp. 2_1_31 cont1.125, whole genome shotgun sequence Length of sequence - 14743 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 7, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 293 442 ## COG1154 Deoxyxylulose-5-phosphate synthase + Term 364 - 404 -0.5 + Prom 361 - 420 4.2 2 2 Tu 1 . + CDS 443 - 694 401 ## gi|237739068|ref|ZP_04569549.1| predicted protein + Term 788 - 833 9.3 + Prom 773 - 832 14.3 3 3 Op 1 . + CDS 953 - 1657 570 ## COG3619 Predicted membrane protein 4 3 Op 2 . 
+ CDS 1734 - 2018 498 ## FN1972 hypothetical protein - Term 2018 - 2075 14.1 5 4 Tu 1 . - CDS 2124 - 2966 1544 ## COG0214 Pyridoxine biosynthesis enzyme - Prom 3086 - 3145 10.4 + Prom 2975 - 3034 8.5 6 5 Op 1 . + CDS 3074 - 4483 1104 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Prom 4490 - 4549 8.6 7 5 Op 2 . + CDS 4571 - 11368 10369 ## FN0387 hypothetical protein + Term 11424 - 11463 6.1 8 6 Tu 1 . - CDS 11538 - 12710 1535 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 12741 - 12800 9.9 + Prom 12805 - 12864 15.6 9 7 Op 1 1/1.000 + CDS 12891 - 14174 1906 ## COG3681 Uncharacterized conserved protein 10 7 Op 2 . + CDS 14198 - 14686 742 ## COG2849 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|224461277|gb|ACDC01000125.1| GENE 1 3 - 293 442 96 aa, chain + ## HITS:1 COG:FN1464 KEGG:ns NR:ns ## COG: FN1464 COG1154 # Protein_GI_number: 19704796 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Fusobacterium nucleatum # 1 96 488 583 583 172 92.0 9e-44 EKTGVKATVINPMYITGVDEKLLEELKKDHSVVITLEDGILNGGFGEKIARFYGNSDVKV LNYGLKKEFLDRYNIGKVLTENRLKADLIVEDLLKF >gi|224461277|gb|ACDC01000125.1| GENE 2 443 - 694 401 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739068|ref|ZP_04569549.1| ## NR: gi|237739068|ref|ZP_04569549.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 83 4 86 86 122 100.0 6e-27 MYSTSFDEIFDKMIGNKKEVVIKRKNKAEDLILLTATRYKEILEKIEELKYYNEIRRRAE DLDAGNGKVHTIAEMEKMLEVIK >gi|224461277|gb|ACDC01000125.1| GENE 3 953 - 1657 570 234 aa, chain + ## HITS:1 COG:SPy0421 KEGG:ns NR:ns ## COG: SPy0421 COG3619 # Protein_GI_number: 15674550 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 7 234 10 233 235 165 41.0 5e-41 MEKIKEEVPEKLRIAVLLSFISGYINAFTYNNAGELFAGAQTGNVIFMALHFAKGNFEKA VEFLIPIISFMIGQIFIYCFRNFFQRRGHKGYIHSSLLMLFIMIMLIVLLPFFDYHFIVV TLAFFAAIQSDTFQRLRGFSYATIMMTGNVKNAPRLLIEGLVQRDRELLVRGFLLFLIIF SFMLGVGISTYFTQFVKKSALVPLILPLSYINYVLFKEEHSVIDVVKSKIRKIK >gi|224461277|gb|ACDC01000125.1| GENE 4 1734 - 2018 498 94 aa, chain + ## HITS:1 COG:no KEGG:FN1972 NR:ns ## KEGG: FN1972 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 91 28 117 122 94 53.0 1e-18 MLGKVITEHGQVVNNQDIMVVHLHLKEGETIPAHNHPGRQIFFTVVEGEVEVYLDEKETY PLVPKKVLEFDGEVRISVKALKESDIFVYLVVKR >gi|224461277|gb|ACDC01000125.1| GENE 5 2124 - 2966 1544 280 aa, chain - ## HITS:1 COG:FN1463 KEGG:ns NR:ns ## COG: FN1463 COG0214 # Protein_GI_number: 19704795 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Fusobacterium nucleatum # 1 280 1 280 280 501 96.0 1e-142 MDTRFNGGVIMDVTSKEQAIIAEEAGAVAVMALERIPADIRAAGGVSRMSDPKLIKEIMS AVKIPVMAKVRIGHFVEAEILQAIGIDFIDESEVLSPADSVHHVNKRDFTTPFVCGARNL GEALRRICEGAQMIRTKGEAGTGDVVQAVSHMRQIMKEINLVKALRDDELYVMAKDLQVP YDLVKYVHDNGRLPVPNFSAGGVATPADAALMRRLGADGVFVGSGIFKSGDPRKRAKAIV EAVKNYDNPEIIARVSEDLGEAMVGINENEIKIIMAERGV >gi|224461277|gb|ACDC01000125.1| GENE 6 3074 - 4483 1104 469 aa, chain + ## HITS:1 COG:FN1462 KEGG:ns NR:ns ## COG: FN1462 COG1167 # Protein_GI_number: 19704794 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic 
orthologs # Organism: Fusobacterium nucleatum # 1 469 1 469 469 754 89.0 0 MIILNLDNKSKIPLYIQIYTEIKKLIQTKILKANEKLPSKKDFIDYYNISQNTIQNALYL LLEEGYIFSIERRGYFVSDIENLIIQNVKVENKAKFKEKEKIHYDFSYSGVDKKSLARTI FKRITKDVYDEENEDLLFQGHIQGDLLLRKSICEYLSQSRGFKVDAEQVVISSGTEYLFY IIFKLFNNKIYGLENPCHKMFKELFLTNNISFKAISLDENGIVIDDLKKNNVNIAYVTPS HQFPTGAIMSISRRTELLNWANENPDRYIVEDDYDSEFKYTGRPIPALKANDINDKVIYL GSFSKSISPAIRVSYLVLPKVLLNIYQRELPYFICPVPTLNQKILYRFIKDGYFVKHINK MRTLYKKKREFLVNTIKTYSSKILNKEIQIQGADAGLHMVIKLNQKINEKLFLNECLENS LKLYSLEEYNIEEIYRENSYFLLGYANLTNKEIEEGILLLLKILKKYYV >gi|224461277|gb|ACDC01000125.1| GENE 7 4571 - 11368 10369 2265 aa, chain + ## HITS:1 COG:no KEGG:FN0387 NR:ns ## KEGG: FN0387 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 577 2265 7 1724 1724 1269 48.0 0 MNNNLYNVEKNLRSIAKRYENVKYSVGLAVLFLMKGTNAFSDDNKIQELEKQKDILTDVK KEKVEVKETKKVAKTTQKLKASWTNMQFGANDLYSNFFVTPKNKVEKTSIVKSEKAVLVA SADNSTSLPMLAKITSDIEETSTPTTEEINTSKENLRNSIGNLQNKIDTARKENSKEIQG LKLELVQLMEQGNQVVKSPWSSWQFGANYFYDNWGSAYKGRGDKQTNKILTRDKSATLNR FLETSADSTSYGTTNLRLVKEPSVEIKISAGVRPKNINKQAPSYRPNAPTVSLPVFEPKL LSTPGKPSAPVEVTPATFDPPDINFKGKGFFQHAHIDISRGNGLSYGPHQGQWKGIVIQN YDNYNTVDKNNDTVEGIINIEVGNLADGKKAKWWGSNLDGTANSNIQLKVHTDVPKTAAS GYNFYGPGDFYLDDGSTIVPLYTFINELRDHNATISGNYIMTNKGGENDTSRNVIFLSHN PAGPGTPGYDGKNTAGPRTATFNGTLTLNGTATPFTGATASSDVTIGVEHQLYSNGHQSA YSIFDNKGTIDLASGNNLVGILLDVEVWGDNTNNNIANNTNRLPHKTVNNGKIIINSQNS IGIDYGEYYNRFFKSDLTVGDIIVNGSHNYGLRMANIFSGNAGYFDKGTTIKSGGADKKI LVQGTENVGVSIAKFLSSAKDPNPIANITDLNVEVAGQKNIGFLRHKTYANNTGDMLFNA TTMGTFTFGNGAKDSTLIRTDKYGIQVKKDINATGKDNAGKDFTGSGNTVLHSNGQTQHI DNYNTITVGKGYTQTVGMAATGTAASKIVNILNEGTIDLQGKKSIGMYVDKFTQGKSTGS IKLSALGDLDKDGKLGDTENVGISNKGKFTFLGDIEVNGKKSSGIYNTGTTTIEVGTNPN DRTNINVANGATAIFSKGTGSSISSTAGNKLNINVNAGTTKEGLAVYAEDKSVVSLQNAN INVVGGSAGVAAYDSGTKIDLKDATLKYNGNGYATYSDGQGKIDLTNSKIELRGSSTLMD IDLSIPVSNRPITTAGTEVKVFSNDVIAINVSNLGTKNIGDLTTLMSTLGVNKIEAGTEA GQTFNKYKELAIDNGTINFNVTTDKTEVDTNPGGFFFKKVVGQRLKLNVNENLTARLSSA 
IANEFYGGQVVGLEANSSKKATSNTETQVNIASGKVVDVARTDGTDKGGVGVFVNYGQVN NSGTIKVEKDSDANSNAVGIYVVNGSEATNNGSVNVSGEQSIGLLGMAYRVDTSGNLVVD EFGTGAIGQGKVNIVNKGSVDLDGQGAVGMFAKNNKAGTTFTNATALNDTTGAITTTGTK SVGMAGEKANIINRGTINVNAEKGTGMFAESASRIENSGTINIVAASSTTEPNIGIFTED QDTEIHNNKDIIGGNNTYGIYGKTINMGANGKVKVGENSVAIYSNGQYASSATPNITLAS GSSIEVENNQSVGVFVTGQNQNISSQADIRIGDDSFGYVIKGTGTKLIANTTNPVTVGND TTFIYSTDTTGNIENRTRLISTGNSNYGIYAAGNVTNLADMDFSSGIGNVGMYSIAGGKI VNGSTLVNPVIKVGSSDKSNKLYGIGMVAGYTDDKGNVIQTGTVENYGTIKVDKTDSIGM YATGSGSKAINRGTIELSGKNTTGMYLDNNAVGENYGTIKTVPNPTNTGIVGVAALNGAV IKNYGSIIIDGANNTGIYLARGRREGVDPTITNGAVAVQTKKQSDTTKKVEGIEIKAKPD EPVAVTRQGNLVTPTFVDTTVAVPTASKVTVGATELDLTASELGDIPSIGMASDLGMYVD TSGVNYTNPIQGLQYLTGVNKVNLIFGTEASRYTTSKDIKVGENILKPYNDEIISLFSGG NEKKFNIASSSLTWIATGTQNSDDTFNAVYLSKIPYTAFAKDKNTYNFMDGLEQRYGVEG VNSREKALFDKLNAIGKGEPILFAQAVDEMMGHQYANTQQRIEATGNILDKEFNYLRSEW SNPSKDSNKIKTFGTKGEYKTNTAGVIDYQNNAYGVAYVHEDETVKLGESVGWYAGIVHN TFKFKDIGNSKEEMLQGKLGLFKSVPFDHNNSLNWTISGDIFAGYNKMNRRFLVVDEVFN AKGRYHTYGLGVKNEISKEFRLSEGFSVRPYAALGLEYGRVSKIREKSGEMKLEVKANDY FSVKPEIGTELAYRHHFGTSAIKVAVGVAYENELGKVANGKNKAKVAGTDADYFNIRGEK EDRTGNVKTDLNIGWDNQRIGVTANIGYDTKGHNVRGGVGLRVIF >gi|224461277|gb|ACDC01000125.1| GENE 8 11538 - 12710 1535 390 aa, chain - ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 1 390 1 390 390 586 93.0 1e-167 MEKEKKGDTLIIKLVLGVIAGIIIGLVANEKLISVILPIKFFLGELIFFVVPFIIIGFIA PAITQLKSNASKMLLTMLGLSYLSSIGAAFFSATAGYALIPKLNIVSSVEGLKELPPILF KVQIPPAISVMGALVLALLMGLAVVWTNSKRTEELLNEFNNIMLMIVNKIIIPILPIFIA TTFATLAYEGSITKQLPVFLKVILIVLVGHYIWIAILYTIGGIVSGKNPWSLLKHYGPAY MTAVGTMSSAATLPVSLKCVRKSGVLDEEITNFAIPLGATTHLCGSVLTETFFVMVVAKI LYGAVPPVGTMVLFILLLGIFAVGAPGVPGGTVLASLGLIISVLGFDETGTALMITIFAL QDSFGTACNITGDGALALILNGIFKKKQAN >gi|224461277|gb|ACDC01000125.1| GENE 9 12891 - 14174 1906 427 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 
# Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 17 427 1 411 411 709 90.0 0 METKIEKVLKILEEEIVAAEGCTEPIALSYAAAKAKRILGTIPNKVDVFLSGNIIKNVKS VTIPNSDGMVGIEAAIAMGLIAGDDIKELMVISDVTSEQLTEVKEFLDKGIIKTHVHPGD IKLYIRLEISNDKDNVVLEIKHTHTNVTQILKNGKVLLSQVCNDGDFNSSLTDRKVLSVK FIYDLAKTIDIDLIRPIFQKVVNYNSAIAEEGLKGKYGVNIGKMILDNIEKGIYGNDVRN KAASYASAGSDARMSGCALPVMTTSGSGNQGMTASLPIIKFAAEKNLSEEELIRGLFVSH LITIHVKTNVGRLSAYCGAICAAAGVAASLTYLHGGSYEMVCAAITNILGNLSGVICDGA KASCAMKISSGIYSAFDATMLALNKDVLKSGDGIVGVDIEETIRNVGELAQSGMKGTDET ILDIMTK >gi|224461277|gb|ACDC01000125.1| GENE 10 14198 - 14686 742 162 aa, chain + ## HITS:1 COG:FN1146 KEGG:ns NR:ns ## COG: FN1146 COG2849 # Protein_GI_number: 19704481 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 48 162 14 128 128 183 76.0 2e-46 MGKKFKLLLLTLGALTLFSACSSVYTDTEMKARGVVMFLSKTMGPTSFEKRWQTTNKAAI VENFKNGVRDGELRRYYLNGNLLMRFYFEEGNVEGPWEDYYPNGKLLMSGQMKANKEVGN WKYYDENGNLLGEAPYDQIPKAIRDAKEKNIDQFWKDIKAGK Prediction of potential genes in microbial genomes Time: Thu May 19 23:46:04 2011 Seq name: gi|224461276|gb|ACDC01000126.1| Fusobacterium sp. 2_1_31 cont1.126, whole genome shotgun sequence Length of sequence - 7749 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 19 - 1512 2460 ## COG0554 Glycerol kinase 2 1 Op 2 4/0.000 - CDS 1553 - 1897 481 ## COG3862 Uncharacterized protein with conserved CXXC pairs 3 1 Op 3 6/0.000 - CDS 1897 - 3162 2204 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 4 1 Op 4 . 
- CDS 3175 - 4605 2165 ## COG0579 Predicted dehydrogenase - Prom 4832 - 4891 17.2 + Prom 4764 - 4823 11.4 5 2 Op 1 13/0.000 + CDS 4909 - 5562 746 ## COG0785 Cytochrome c biogenesis protein + Prom 5789 - 5848 12.7 6 2 Op 2 3/0.000 + CDS 5878 - 6504 861 ## COG0526 Thiol-disulfide isomerase and thioredoxins 7 2 Op 3 3/0.000 + CDS 6573 - 7457 1087 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase 8 2 Op 4 . + CDS 7488 - 7749 270 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|224461276|gb|ACDC01000126.1| GENE 1 19 - 1512 2460 497 aa, chain - ## HITS:1 COG:FN1839 KEGG:ns NR:ns ## COG: FN1839 COG0554 # Protein_GI_number: 19705144 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Fusobacterium nucleatum # 1 497 1 497 497 949 95.0 0 MKYIVALDQGTTSSRAILFDESQTIVGVAQKEFTQIYPNEGWVEHDPMEIWASQSGVLSE VIARAGVSQHDIIALGITNQRETTIVWDKNTGKPVYNAIVWQCRRTAKICDELKKIEGFS DYIKDNTGLLVDAYFSGTKIKWILDNVEGAREKAEKGDLLFGTVDTWLIWKLTNGKVHAT DYTNASRTMLYNIKELKWDEKILETLNIPKSMLPEVKDSSGTFGYANLGGKGGHRVPIAG VAGDQQSALFGQACFEEGESKNTYGTGCFLLMNTGEKFVKSNNGLITTIAIGLNGKVQYA LEGSVFVGGASVQWLRDELKLISESSDTEYFARKVKDNGGVYVVPAFVGLGAPYWDMYAR GAILGLTRGANKNHIIRATLESIAYQTKDVLKAMEEDSSIKLNGLKVDGGAAANNFLMEF QADILGEVVKRPTVLETTALGAAYLAGLATGFWENKEEIKQKWVLDKEFSPNMSQEERDK KYAGWLKAVERSKNWED >gi|224461276|gb|ACDC01000126.1| GENE 2 1553 - 1897 481 114 aa, chain - ## HITS:1 COG:FN0181 KEGG:ns NR:ns ## COG: FN0181 COG3862 # Protein_GI_number: 19703526 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Fusobacterium nucleatum # 1 114 1 114 114 186 88.0 1e-47 MEKEMICIVCPVGCHISVNTETYEVKGNACPRGAVYGKEELTAPKRVVTSTVKIKNALDH RCPVKTERAIPKELNFKLMEELKKVELTAPVKRGDIVLENIFNTGVNVVVTKDM >gi|224461276|gb|ACDC01000126.1| GENE 3 1897 - 3162 2204 421 aa, chain - ## HITS:1 COG:FN0182 KEGG:ns NR:ns ## COG: FN0182 COG0446 # Protein_GI_number: 
19703527 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 1 421 1 421 421 738 94.0 0 MNMKYDLVVVGGGPAGLAAAVEAKKNGIDSILVIERAKELGGILQQCIHNGFGLHEFKEE LTGPEYAQRFMDQLFELNIEYKLDTMVLEVSENKIVQAINSVDGYMIIEAKSIVLTMGCR ERTRGAIAIPGDRPAGIFTAGAAQRYINMEGYMVGKRVVILGSGDIGLIMARRLTLEGAK VLAVAELMPFSGGLMRNIVQCLEDYDIPLYLSHTVVDIIGKDRVEKIIIAKVDENKKAIP GTEIEYECDTLLLSVGLIPENDISRATGIKIDPRTSGPVVNELMETSIEGIFASGNVVHV HDLVDFVSIESRKAGKSAAKYIKGEVADGEYIEIETGNGIGYTVPQKFRIENIEKNLELS MRVRQIYKNVKIVVKSNDFVIHSVKKNHMAPGEMEKITLSKTVLGKIDAKKIVVEVVEED K >gi|224461276|gb|ACDC01000126.1| GENE 4 3175 - 4605 2165 476 aa, chain - ## HITS:1 COG:FN0183 KEGG:ns NR:ns ## COG: FN0183 COG0579 # Protein_GI_number: 19703528 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Fusobacterium nucleatum # 1 476 23 498 498 874 90.0 0 MFDVVVIGAGIMGAAVSRELSRYELKTLLLDKENDVSCGTTKANSAIVHAGYDAKEGSLM AKYNVLGNAMYGKLCEEVDAPFRKVGSYVLAFSEKEKEHLEMLYQRGLNNGVPEMEIIDA AEIQRREPHVSKEAVAALYAGTAGITGPWELATKLVENAMENGVELKLNAEVANIKKEND VFKIELKNGEIIEAKAIVNAAGVYADFINNMLSNKKFKITPRIGEYYLLDKIQGYLTDSV IFQCPTEMGKGILVSKTAHGNIIVGPTASDVDNKDDVGNTQAGLDTVRQFATKSIKDINF RDNIRNFAGLRAEADTGDFILGEAEDVKGLFNIAGTKSPGLTSAPAMAIDLAKMIVESFG GVKEKTNFVKNRKMIHFITLSPEEKAEVIKKDPRYGRIICRCENITEGEIVDAIHRKCGG RTLNGIKRRVRPGAGRCQGGFCGPRVQEILARELGEDLEEIVMEQKNSYILTGKTK >gi|224461276|gb|ACDC01000126.1| GENE 5 4909 - 5562 746 217 aa, chain + ## HITS:1 COG:FN0186 KEGG:ns NR:ns ## COG: FN0186 COG0785 # Protein_GI_number: 19703531 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Fusobacterium nucleatum # 1 217 1 217 217 295 98.0 5e-80 MLNTELFIGAVYVAGLLSFFSPCIFPLLPVYIGMLSTSGKKSIIKTVVFVIGLSTSFVLL GFGAGSIGSFLMSKTFRIISGVIVIIFGIIQMEIVKIPFLERTKLVDIKGKENDSIWGAF LLGFTFSLGWTPCVGPILASILFISSGGGNPYYGALMMFIYVLGLATPFVILSLSSKYVL AKVSAIKKHLGIMKKIGGLLIIIMGILLLTDKLSIFL 
>gi|224461276|gb|ACDC01000126.1| GENE 6 5878 - 6504 861 208 aa, chain + ## HITS:1 COG:FN0187 KEGG:ns NR:ns ## COG: FN0187 COG0526 # Protein_GI_number: 19703532 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 1 208 1 209 209 290 87.0 1e-78 MKGLKKLFFGIMMLLMGAVAFGAEMDLSKVTLKDVNGMNYSFGKDGKPTYVKFWASWCPI CLSGLEDIDNLSKEKKDFEVVTVVSPGLVGEKKTEDFKKWYKSLGYKNIKVLLDEKGELT KMLNVRVYPTSAVLNKSGKVEKVLPGHLEKAEIKKLFSSKMMMNDKGMKDTMMNDGKMKD SMMKDDKMMNDKNMMKNDKMSMEKKTSM >gi|224461276|gb|ACDC01000126.1| GENE 7 6573 - 7457 1087 294 aa, chain + ## HITS:1 COG:FN0188_2 KEGG:ns NR:ns ## COG: FN0188_2 COG0229 # Protein_GI_number: 19703533 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Fusobacterium nucleatum # 147 294 1 148 148 300 98.0 2e-81 MEKIYGVIDVTSGYANGKTKNPKYQDLHSSGHAETVHVKYDINKVNLSTLLKYYFKIIDP TSVNKQGNDRGSQYRTGIYYVNQNDKSVIQNEIKEQQKKYSQKIVVEVLPLKEYYLAEEY HQDYLKKNPNGYCHIDLSKADDIIVDEKKYPKLSEKELRMKLNSKQYEVTQNGDTERAFQ NDYWDFFDKGIYVDITTGEPLFSSTDKYASQCGWPSFVKPIVPEVVTYHNDTSFNMLRTE VRSRSGKAHLGHVFDDGPRDRGGKIYCINSAAIQFIPYAEMEAKGYGYLLPLVK >gi|224461276|gb|ACDC01000126.1| GENE 8 7488 - 7749 270 87 aa, chain + ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 87 1 87 261 153 97.0 9e-38 MYKLMIADDEPLIRRGIKQLIDLSSLQIGEIHEASTGEEALKVFKEFKPEIVLMDINMPK IDGLSVAKKIKSINPDTKIAIITGYNY Prediction of potential genes in microbial genomes Time: Thu May 19 23:46:05 2011 Seq name: gi|224461275|gb|ACDC01000127.1| Fusobacterium sp. 
2_1_31 cont1.127, whole genome shotgun sequence Length of sequence - 2195 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 2 - 517 542 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 2 1 Op 2 . + CDS 504 - 2189 1607 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|224461275|gb|ACDC01000127.1| GENE 1 2 - 517 542 171 aa, chain + ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 171 91 261 261 257 98.0 8e-69 AQTAIKIGVEDYILKPISKSDVSEIIVKLVSSLQKERKDKEIEKVLEKITTVDIQDNIAK NNYKELIQNIIEESYTDSQFTLSVLSEKLDLSSGYLSIMFKKNFGIPFQDYLLQKRMEKA KLLLLTTELKNYEIAEQIGFEDVNYFITKFKKYYQITPKQYREMVLKNENE >gi|224461275|gb|ACDC01000127.1| GENE 2 504 - 2189 1607 561 aa, chain + ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 1 550 1 550 552 958 98.0 0 MKMNNKPLNIKIGFYFLITNLVLVLLLGSIFYFSSSSLLIQKEISAKTEAIEKSGNYIEL YMSKLTTLSQVISHDKGVYDYLKNKDETEKNRILNIIDNTLSTDPYIKSIILIRKDGAVI SNEKNVNMEVSSDMMKEEWYVNSLMNPMPVLNPLRKQNFSVDGMDDWVISVSREIADTNG ENLGVLLIDVKYQALHEYLQNQETGKNSDIVILDEDNRIVYYKEIPYDISQEKYLKNLKN IEEGYNRKENTVTVKYPIKNTHWTLIEISYMQEIESLKNHFFEMIVISCLASLLITVLIS ISVLRRITKPIKELEQHMNNFNNDLSKINLKGDVSVEILSLQNHFNEMIDKIKYLREYEI NALYSQINPHFLYNTLDTIIWMAEFQDTEKVISITKALSNFFRISLSNGKEKIPLKEEIN HIKEYLYIQKQRYEDKLEYKISIQEELENIEVPKIILQPFVENAIYHGIKNLDTTGIISI YSQIIENKIELIIEDNGIGFEAAKKQALMKMGGVGIKNVNKRIQYYYGNEYGAKIDSSFK AGARIIITLPCELLTTNTNEC Prediction 
of potential genes in microbial genomes Time: Thu May 19 23:46:10 2011 Seq name: gi|224461274|gb|ACDC01000128.1| Fusobacterium sp. 2_1_31 cont1.128, whole genome shotgun sequence Length of sequence - 13382 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 201 - 254 3.2 1 1 Op 1 . - CDS 274 - 831 526 ## FN0184 hypothetical protein 2 1 Op 2 . - CDS 856 - 1401 398 ## FN0184 hypothetical protein - Prom 1533 - 1592 14.4 3 2 Tu 1 . - CDS 1600 - 2589 1563 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 2619 - 2678 17.5 - Term 2659 - 2697 7.2 4 3 Op 1 1/0.000 - CDS 2723 - 4261 2180 ## COG2978 Putative p-aminobenzoyl-glutamate transporter - Prom 4371 - 4430 12.4 5 3 Op 2 1/0.000 - CDS 4455 - 5063 870 ## COG3142 Uncharacterized protein involved in copper resistance 6 3 Op 3 23/0.000 - CDS 5073 - 5765 817 ## COG1346 Putative effector of murein hydrolase 7 3 Op 4 1/0.000 - CDS 5765 - 6121 277 ## COG1380 Putative effector of murein hydrolase LrgA - Term 6153 - 6185 2.5 8 3 Op 5 . - CDS 6200 - 7681 2115 ## COG1190 Lysyl-tRNA synthetase (class II) 9 3 Op 6 . - CDS 7703 - 9043 1532 ## FN0465 hypothetical protein 10 3 Op 7 . - CDS 9059 - 9334 193 ## FN0464 hypothetical protein - Prom 9403 - 9462 5.0 11 4 Op 1 1/0.000 - CDS 9464 - 9931 520 ## COG1576 Uncharacterized conserved protein 12 4 Op 2 . - CDS 9941 - 11866 2255 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) - Prom 11942 - 12001 17.1 + Prom 11901 - 11960 11.1 13 5 Tu 1 . + CDS 11992 - 12888 1233 ## COG3023 Negative regulator of beta-lactamase expression + Term 13029 - 13067 7.3 14 6 Tu 1 . 
- CDS 13039 - 13380 329 ## COG0566 rRNA methylases Predicted protein(s) >gi|224461274|gb|ACDC01000128.1| GENE 1 274 - 831 526 185 aa, chain - ## HITS:1 COG:no KEGG:FN0184 NR:ns ## KEGG: FN0184 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 94 168 52 126 143 63 48.0 3e-09 MSVKIQLEKNGEVTDAFTGFSWTTFIFGFWVPAFRKKSKGFGLFFLFFIIKIIILYTLSK QNNEIDLNLLIYGTFEPSYGMITPVVLAAAIYPLETWIAYFYNNYYTNNLLAEGYRPIEN DDYSVAILKDYSYLPYSKEELDDNVKMERYREISTLARKEERKKIYIFVGIWAIFIIIFW FSNLF >gi|224461274|gb|ACDC01000128.1| GENE 2 856 - 1401 398 181 aa, chain - ## HITS:1 COG:no KEGG:FN0184 NR:ns ## KEGG: FN0184 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 54 163 19 126 143 71 44.0 1e-11 MAIKVKLEKDGFIKDGFVGYSYTSAIFDFWVPAFRLDFNAFVFFFGLYMLEKFLSEFFTI YSMLNYYSIENEWFFYILNASVPIFTLLIPFIIAFFYNKHYTKKMLKEGWSPLENDEYSN AILKGYRYLDYTDVEIKDEDKMQRYQNYIDKAKSNEVKKCLCFIIFLIIIFVSFYFYYFR A >gi|224461274|gb|ACDC01000128.1| GENE 3 1600 - 2589 1563 329 aa, chain - ## HITS:1 COG:FN1279 KEGG:ns NR:ns ## COG: FN1279 COG0491 # Protein_GI_number: 19704614 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 1 324 1 324 326 597 90.0 1e-170 MLNEIAKNIYLIEVPLPKNPLKALNCYFIKNGENILVVDSGFDHEESEKVFFEALEELGA QVGKTDMFLTHLHADHSGLALKFKNKYQGKVYCSQIDTEYINKMKHELYADRFVPTLKVM GIEPDFKFFETHPGLVYCIKGKLDTTIVKDGDKIDFGYYNFEVIDLSGHTPGQVGIYDKN HKILFSGDHILNKITPNISFWDFKYEDILGTYLKNLDKVYNMDVDTIYSAHRGIIDNPKL RIDELKKHYADRNAEVYNLLKEVEENSAAQMAAKMHWDYRAKNFEEFPNNQKWFATGEAL ANLEHLRAIGKADYEFKDGVAYYRIKERT >gi|224461274|gb|ACDC01000128.1| GENE 4 2723 - 4261 2180 512 aa, chain - ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 1 512 1 512 512 852 91.0 0 
MEKEKKKGIQRFLDFVERGGNKLPHPLTLFWIFCVIIAIISAIAANSGASVTYEAFDRKE NIIKETTLTIKSLLNAEGIRYIFSSMVKNFTGFAPLGTVLVALIGIGVAEGSGLMSATMK KVVTATPKKLLTAIVVLAGVMSNIASDAGYVVLIPLGAVIFLSFGRHPIAGLAAAFAGVS GGFSANLLLSTTDPLLSGITTEAAKLLNPDYFVNPASNYYFMAASTFLITILGTFITEKI IEPRLGEYKGEVVVDHNELTDKERKALRWAGISVLIFCAVIAFLILPENAILRVDGTLKQ WTHDGLVPTLMMFFLVPGIVYGKVAGTIKNDKDVAKMMGSSLATMGGYLALAFAASQFVA YFSYTNLGTFVAVKGADFLQSIGLTGLPLIVLFVLVSAFINLFMGSASAKWAIMAPIFVP MLMRLGYTPEFTQLAYRIGDSSTNIITPLMTYFAMIVAFMQKYDKESGMGTLISVMLPYS MCFLVGWTIFLIIWFMTGLPIGIEGAIHLAGM >gi|224461274|gb|ACDC01000128.1| GENE 5 4455 - 5063 870 202 aa, chain - ## HITS:1 COG:FN0469 KEGG:ns NR:ns ## COG: FN0469 COG3142 # Protein_GI_number: 19703804 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Fusobacterium nucleatum # 1 202 1 202 202 337 88.0 1e-92 MIKEACVESFEKSLEAQNNGANRIELCENLAVGGTTPSYGTVKVCLEKLNIPIFPMIRAR GGNFVYSKDEIEIMKEDIRIFKELGVKGVVFGFLTSDNKIDLELTKELVELAAPMEVTFH KAIDEISNPLDYIEDLINIGVKRILTSGGKATALEGSDLINQMIKKANNRLKIVVAGKVT KENLNELKNLIPADEFHGKLIV >gi|224461274|gb|ACDC01000128.1| GENE 6 5073 - 5765 817 230 aa, chain - ## HITS:1 COG:FN0468 KEGG:ns NR:ns ## COG: FN0468 COG1346 # Protein_GI_number: 19703803 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Fusobacterium nucleatum # 1 230 1 230 230 338 87.0 7e-93 MKEIIVSNLFFGLILSYFALEIGKWVFKKTQTPLCNPFLIGTIIVIIILKVFNISTDDYY KGAGMILFLLGPATVALAIPLYKKWELFKKFFVPVMTGAIVGSFVGIISVIVLGKLFGMD DKLIFSLMPKSITTPFGIEVSSMLGGIPAITVVSIMLTGIAGNVTAPLISKIFRVKHSVA VGIGIGVSSHAVGTSKAMEIGEVEGSMSALSIVFAGILTLVWAPLLKLLV >gi|224461274|gb|ACDC01000128.1| GENE 7 5765 - 6121 277 118 aa, chain - ## HITS:1 COG:FN0467 KEGG:ns NR:ns ## COG: FN0467 COG1380 # Protein_GI_number: 19703802 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Fusobacterium nucleatum # 1 115 1 115 118 119 80.0 2e-27 
MLREFMIIFLINYVGMLLSKILHLPLPGTIVSLLLLFFMLQFKVLKLEKIENAGNFLLLN MTIFFMPPTVKIIDSYELLEKDLFKIIVIILVSTFLTMGITGKVVQLMIDLKERKEKK >gi|224461274|gb|ACDC01000128.1| GENE 8 6200 - 7681 2115 493 aa, chain - ## HITS:1 COG:FN0466 KEGG:ns NR:ns ## COG: FN0466 COG1190 # Protein_GI_number: 19703801 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Fusobacterium nucleatum # 1 493 1 493 493 934 93.0 0 MEKYFDRLEKEPLIAERWKKIEELESYGIKAFGSKYDKQIMIGDILKHNPEENLKFKTAG RIMSLRGKGKVYFAHIEDQSGKIQIYIKKDELGENEFDHIVKMLNVGDIIGVEGELFVTH TEELTLRVKSISLLTKNVRSLPEKYHGLTDVEIRYRKRYVDLIMNPEVRDTFIKRTQIIK AVKKYLDDRGFLEVETPLMHPILGGAAAKPFVTHHNALNLDLFLRIAPELYLKKLIVGGF ERVYELGRNFRNEGISTRHNPEFTMIELYQSHANFHDMMDLCEGIISSVCQEVNGTTDIE YDGVQLSLKNFQRVHMVDMIKDVTGVDFWQEMTFEEAKKLAKEHHVEVADHMDSVGHVIN EFFEQKCEERVVQPTFVYGHPVEISPLAKRNEKNPNFTDRFELFINKREYANAFTELNDP ADQRGRFEAQVEEALRGNEEATPEIDESFVEALEYGLPPTGGMGIGIDRLVMLLTGAPSI RDVILFPQMKPRD >gi|224461274|gb|ACDC01000128.1| GENE 9 7703 - 9043 1532 446 aa, chain - ## HITS:1 COG:no KEGG:FN0465 NR:ns ## KEGG: FN0465 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 446 2 410 410 445 60.0 1e-123 MLKKLAITLVAIVFVGCYNLDNVGGKSSGGSIREIEIAGSQQTGGTTAPTPNTTTGETVE TRPQQEEKIVSVDVNDENVNDYLTIIKSNLRTSAKKVDDNIKNQYTVPIGETLVFPVDNE KAIKLSTSPKNANPKVSLTNGKVTFRTVYQGQYVLSTYVNGSVNRKITVSAISKYDFNEK DLYKLILQDSEKRDKDVENAVTLYKMLYPAGRYSKEVNYLFLKYAYDIRNNSLINEALAG VKNDFSSYSDSEKATILRAAKLANKSIFIPSEVYNTNNSDLKNALNEYNNSSKAPVDRAP SAPVDNRTVTTEKNKTKTQTTENETSIVDYAREKVRSVVGGISGTTTTVTTVGSVKSKTT NTTESYYEKGMKNLNSNPKVAIDNLKKSLSSEKIQDKKPEIYYNIASSYAKLGNRVEVTK YIRLLKQEFPNSSWAKKSEALSNLIK >gi|224461274|gb|ACDC01000128.1| GENE 10 9059 - 9334 193 91 aa, chain - ## HITS:1 COG:no KEGG:FN0464 NR:ns ## KEGG: FN0464 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 91 20 110 110 89 71.0 3e-17 MENAEYEYFISLIEKEIFLIRNSGTNLNIHLKFNGEKNNFFYEVGNFKQNFLVLGEKYSY 
NIKNKTFQFSYILFDENNNEINKIEIAIRHV >gi|224461274|gb|ACDC01000128.1| GENE 11 9464 - 9931 520 155 aa, chain - ## HITS:1 COG:FN0463 KEGG:ns NR:ns ## COG: FN0463 COG1576 # Protein_GI_number: 19703798 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 155 1 155 155 222 86.0 2e-58 MNINIICIGKIKDKYINEGIAEFSKRMTSFANLNIIELKEYNKEDSMNISIDKESQDILK QLSKSNSYNILLDLNGKELSSEDMSEYIEDLKNKGTSSINFIIGGSNGVNKELKNSVDMK LKFSHFTFPHQLMRLILLEQIYRWFAISNNIKYHK >gi|224461274|gb|ACDC01000128.1| GENE 12 9941 - 11866 2255 641 aa, chain - ## HITS:1 COG:FN0462 KEGG:ns NR:ns ## COG: FN0462 COG0323 # Protein_GI_number: 19703797 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Fusobacterium nucleatum # 1 641 7 643 643 939 81.0 0 MSRIRILDESVSNAIAAGEVVENPTSMIKELIENSLDAGSKEIKLEVWNGGLDICISDSG CGMSKEDLLLSIERHATSKIITKDDLFNIRTYGFRGEALSSIASVSKMILSSRTEDSSNG TQMNVLGGKVTNLKDIQKNVGTQIEIKDLFYNTPARKKFLRKDTTEYLNIKDIFLREALA NPNVKFILNIEGKESIRTSGNGIENAILEIFGKNYLKNFSKFSLGYLGNANLFKANKDSI FVFINGRSVKSKIVEEAVIAAYHTKLMKGKYPSALIFLDIDPAEIDVNVHPSKKIVKFAN QSEIYDLVKGEIERFFSDDENFISPHIEVEDEEVETFEEKEEKIEYPSNNFLDINDFKDE KESLSQLSVVQKEDYLKKDYSEIKVEKPNISHIENTVKASSNEIKENIETFKKVDNDFDL IEKEVETERTNEKYIFDTKDTSRGKIFDDFSSLKNIDFRVIGQVFDSFILVERNNLLEIY DQHIIHERILYEKLKQEYYNQSMTKQNLLVPIRFELDPREKQLALENTEIFSSFGFDIDD FEKNEILLRTTPTMDLRDSYENIIKEILDNISKNKDKDIRENIIVSMSCKGAIKANHKLT IEEMYSMVAKLHEVGEYTCPHGRPIIVKMSLLDLEKLFKRK >gi|224461274|gb|ACDC01000128.1| GENE 13 11992 - 12888 1233 298 aa, chain + ## HITS:1 COG:FN0164 KEGG:ns NR:ns ## COG: FN0164 COG3023 # Protein_GI_number: 19703509 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Fusobacterium nucleatum # 14 297 1 287 288 422 73.0 1e-118 MKKILALLSLLIFMVACSSSDTPVKETKGISTSRRTSSSSIGSMGKFKVDSDTYVSLGRN ERIQFVVVHYTATNNEYSIKELISNRVSAHFLVLDEDDNMIYNLVPLDQRAWHAGASSFR 
GRTNLNDTSIGIEIVSDGIARDRRNDPNRYPPYDAYLEYKPIQIEKVAQIIKYVAARYNI PARNIVAHSDIAPSRKKDPGAKFPWKELYEKYDIGAWYNESDKQAFMNEEKFNATSISDI KEELRKYGYEINRTNEWDRDSRDVVYAFQLHFNPKNATGNMDLETFAILKALNKKYPN >gi|224461274|gb|ACDC01000128.1| GENE 14 13039 - 13380 329 113 aa, chain - ## HITS:1 COG:FN0875 KEGG:ns NR:ns ## COG: FN0875 COG0566 # Protein_GI_number: 19704210 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Fusobacterium nucleatum # 1 111 149 259 261 163 82.0 8e-41 AYNEKVIRATMGSILNVNLFYLEKQEIIKLLKENNYSIIATYLDKEALPYNKIKLKEKNA VIFGNEGRGICDEFVSISDCKTVIPILSNTESLNVAVASAIILYKFREIEGLI Prediction of potential genes in microbial genomes Time: Thu May 19 23:46:27 2011 Seq name: gi|224461273|gb|ACDC01000129.1| Fusobacterium sp. 2_1_31 cont1.129, whole genome shotgun sequence Length of sequence - 12175 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 71 - 2905 3705 ## COG0178 Excinuclease ATPase subunit - Term 2891 - 2925 5.4 2 2 Op 1 . - CDS 2931 - 3845 1261 ## COG0501 Zn-dependent protease with chaperone function 3 2 Op 2 1/0.000 - CDS 3861 - 4784 1016 ## COG2334 Putative homoserine kinase type II (protein kinase fold) 4 2 Op 3 . - CDS 4797 - 6236 1047 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes - Prom 6260 - 6319 20.3 + Prom 6289 - 6348 10.6 5 3 Tu 1 . + CDS 6391 - 7194 833 ## FN0924 hypothetical protein + Term 7269 - 7315 -0.5 + Prom 7224 - 7283 7.2 6 4 Op 1 . + CDS 7440 - 8168 584 ## FN0925 hypothetical protein 7 4 Op 2 . + CDS 8183 - 8962 1029 ## COG2357 Uncharacterized protein conserved in bacteria + Term 8965 - 8996 3.4 - Term 8953 - 8984 3.4 8 5 Op 1 . - CDS 8990 - 9370 623 ## FN0656 hypothetical protein 9 5 Op 2 . 
- CDS 9387 - 9740 569 ## FN0655 hypothetical protein 10 5 Op 3 26/0.000 - CDS 9786 - 10982 1946 ## COG0126 3-phosphoglycerate kinase - Prom 11006 - 11065 4.4 - Term 10997 - 11030 4.0 11 5 Op 4 . - CDS 11073 - 12080 1785 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase - Prom 12115 - 12174 8.4 Predicted protein(s) >gi|224461273|gb|ACDC01000129.1| GENE 1 71 - 2905 3705 944 aa, chain + ## HITS:1 COG:FN1103 KEGG:ns NR:ns ## COG: FN1103 COG0178 # Protein_GI_number: 19704438 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Fusobacterium nucleatum # 1 943 16 958 960 1774 95.0 0 MIDKITIKGARQHNLKNIDIELPKNEFIVITGVSGSGKSSLAFDTIYSEGQRRYVESLSA YARQFIGQMNKPEVDSIEGLSPAISIEQKTTNRNPRSTVGTITEVYDYLRLLFAHIGIAH CPICHTAVEKQSVDEIVESIMSKFDEGSKIILLSPVVKDKKGTHKNIFLNLFKKGFVRAR VNGEVLYLEDEIELDKNKKHNIEVVVDRLVLKKDDKDFESRLTQSIEAAIELSNGKLIVN DGKTDYLYSENYSCPNHEDVSIPELNPRLFSFNAPYGACPECKGLGKKLEVDENKLIENP DLSIEDGGMYIPGAMARKGYSWEIFRAMAKAAKIDLTKPVKDLTKKELDIIFYGYDEKFK FDYTGGDFDFHGYKEYEGAVKNLERRYYESFSEAQKEEIENRYMVERICKVCKGKRLKDE VLAVTVNDKNIMEICDMSIKNSLDFFMNLSLTEKQEKIAKEILKEIRERLTFMTNVGLDY LTLSRETKTLSGGESQRIRLATQIGSGLTGVLYVLDEPSIGLHQKDNDKLLATLNRLKEL GNTLIVVEHDEDTMMQADKILDIGPGAGTFGGEIVAFGSPKEIMKNKNSITGKFLSGKEE IEIPKKRRKWNKTLKLFGAKGNNLKNIDVEFPLGVMTVVTGVSGSGKSTLVNSTLYPILF NKLNKGKLYPLEYDKIEGLEELEKVINIDQTPIGRTPRSNPATYTKLFDDIRDIFAETQD AKLHGFKKGRFSFNVKGGRCEACQGAGILKIEMNFLPDVYVECEVCKGKRYNKETLDVYY KGKNIYDVLEMSVLEAYDFFKNIPTLERKLKVLIDVGLDYIKLGQPATTLSGGEAQRIKL ATELSKMSKGNTVYILDEPTTGLHFQDIKKLLEVLNRLLEKGNTVIIIEHNLDVIKTADH IIDIGVDGGENGGTVVATGTPEEIAKSKKSYTGKYIAKILKKKK >gi|224461273|gb|ACDC01000129.1| GENE 2 2931 - 3845 1261 304 aa, chain - ## HITS:1 COG:FN0920 KEGG:ns NR:ns ## COG: FN0920 COG0501 # Protein_GI_number: 19704255 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Fusobacterium nucleatum # 1 304 1 305 309 461 78.0 1e-129 
MKGLAELRNKVVNAPHLNIFKVATWATMGVFATYLLIYIFAGEEMLKYYPLLIVFAFGAP LVSLMTSKASVKRAYNIRMIDNGRARTEKEQLVVDTVTLLSEKLNLQKLPEIGIYPSNDI NAFATGASKNSAMVAVSQGLLNNMNETEIIGVLAHEMSHVVNGDMLTSSILEGFVSAFGL IISYIILNSRRNNRSGGAAASMASFYMIKNGINFLGRIVASWYSRRREFGADRLAAQITD PSYMKSALLRLQEISEGRVNLQPNDREFAAFKITNNFSMGGFANLFATHPSLERRIAAIE RMEK >gi|224461273|gb|ACDC01000129.1| GENE 3 3861 - 4784 1016 307 aa, chain - ## HITS:1 COG:FN0922 KEGG:ns NR:ns ## COG: FN0922 COG2334 # Protein_GI_number: 19704257 # Func_class: R General function prediction only # Function: Putative homoserine kinase type II (protein kinase fold) # Organism: Fusobacterium nucleatum # 1 307 1 307 312 365 71.0 1e-101 MGVFTKILDKEREFIEEQYQIKILDIKNISNGILNSNFQIDCEDIKYILRIFEADRTLNE EEQELILLNKIASFVPVSEAIKNKDNEYISVFENKKFALFNYVDGKVIKKIDTHIIREIA TYLGKLHAFTKDISPEKYNRKTRLDFNYFYDKIFQTDIDFQDKDKLLNLAYEIKDYDFSQ LECGIIHGDIFPDNVLFDEDNNIKAILDFNESYYAPFIFDIAVVINFWIKINKYDFFTEN NFIRDFLNYYSKQRKITNQELKVLDLACKKVALTFIFLRLYREKIENSYQKAFSIEEKSY VSLLELM >gi|224461273|gb|ACDC01000129.1| GENE 4 4797 - 6236 1047 479 aa, chain - ## HITS:1 COG:FN0923 KEGG:ns NR:ns ## COG: FN0923 COG1502 # Protein_GI_number: 19704258 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Fusobacterium nucleatum # 1 479 1 479 479 751 81.0 0 MQNIQNLIITFVNLFLQYVWIANLFFIIVIITVEKKNPLYTILWTFILTLLPYVGFFIYL FFGLTFKKKRVANKIYKLKKLRSIKNVTNADRKELRRWKGLITYLEMSTDNHISANNNIE LYFTGKDFFENLKKEIKNAREVINMEYFVFKFDNIGKEIADLLIEKAKEGLEVNLIIDGV NTSNFKLKRYFKDTGVSLHFFFKTYIPLFNIRLNYRDHRKLTIIDNKVAFVGGMNIGDEY LGKGKIGYWRDTSVKVFGDVVATFEKEFYFALSIVKNKFLKDEKLPVEPTLKYEEEKSIY MQLISSGPNYEFPVIRDNHIKLIQEAKKSVFIQTPYFVPDDLLLDTLKTAILSGIDVKIM IPNKADHLFIYWVNQYYIADLLRLGANIYRYENGFIHSKTLLIDEEVISVGTCNLDYRSF YLNFEVNLNVYNKEVANAFKVQYYKDIAISKKLTFNDFAKRSIFTKIKESVFRLLSPVL >gi|224461273|gb|ACDC01000129.1| GENE 5 6391 - 7194 833 267 aa, chain + ## HITS:1 COG:no KEGG:FN0924 NR:ns ## KEGG: FN0924 # Name: not_defined # Def: hypothetical 
protein # Organism: F.nucleatum # Pathway: not_defined # 57 267 1 209 209 254 68.0 2e-66 MLSNNTKFNLLLGDNFNKLVSLPTKQVIMRSILSVIDRDFIVSSNNSSLAELVQKLLDKV LNEKQEIVDIISDLFSMENKCDLSFYKEIFESDMFSSIITTNFDYTIEENFLNSIKINTP FDINNDESAKIAFYKIYGDYKDKDIDKFVLSSQDIKRIKVLGFYAKFWEKLRIEFNKRAT IILGANLEDKEFLDILDFIMSKTDRLQTTYLYINDEIDKYMTDKNITNFINKYSIEIIKG EAKDFIPNLKERFFDERKSGDALQNFA >gi|224461273|gb|ACDC01000129.1| GENE 6 7440 - 8168 584 242 aa, chain + ## HITS:1 COG:no KEGG:FN0925 NR:ns ## KEGG: FN0925 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 242 52 292 292 251 61.0 1e-65 MELQNFPIKYRNFSKDLEPLKTNFLGMTDVDFGNIRLEGVSIKILDFLDFKLIEFRKKDF RIAIDEKDSLFEYEIPKDIKNKRLEEILNFFANFFKATTIKFKIANDKYEYYFHNNIEYY KFITLKQILTQYTNLISNLRLYRYKNLSSAKNTFFELDLLDKSNSIEETNTWINAEIKSV VDANIGDSLTIKRLHKMKFNDFPYDVEEIITLVHPLTKEEVKDNIIKLTRKSVKIKLRRV HK >gi|224461273|gb|ACDC01000129.1| GENE 7 8183 - 8962 1029 259 aa, chain + ## HITS:1 COG:FN0926 KEGG:ns NR:ns ## COG: FN0926 COG2357 # Protein_GI_number: 19704261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 259 1 259 259 449 93.0 1e-126 MDKLIKEEFFKEFSINEDYFLSTGLDWTELEKIYEDYVRLVPLLEKEAEYVVSKLIDVPS VHSVRRRVKKPSHLIEKIIRKGKKYQERNICVDNYKEIVTDLIGIRVLHLFKDDWQTIHH EILNLWDIKETPQVNIRRGDYNLSQFKETIKDINCDVIVREHGYRSVHYLVSIDITKVLN ISVEIQVRTVFEEAWSEIDHIMRYPYDVDNPIITEYLGIFNRIVGSADEMGTFLKKVKEN FGNTKNADEVQRELDLKFK >gi|224461273|gb|ACDC01000129.1| GENE 8 8990 - 9370 623 126 aa, chain - ## HITS:1 COG:no KEGG:FN0656 NR:ns ## KEGG: FN0656 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 126 1 126 126 167 73.0 1e-40 MTIKDLGIREWLVIAFISIGLLAFVFEDKFKPKIYEAEGTGIGYAGDITLKVKAYKKKDK SLRVTEIQVIHEDTDVIGGVCCTKLVDDIKARQRLDKIDMVAGATFTSEGFKEAFTEAIE NIKNQE >gi|224461273|gb|ACDC01000129.1| GENE 9 9387 - 9740 569 117 aa, chain - ## HITS:1 COG:no KEGG:FN0655 NR:ns ## KEGG: FN0655 # Name: not_defined # Def: hypothetical protein # 
Organism: F.nucleatum # Pathway: not_defined # 6 116 1 111 112 152 81.0 4e-36 MKKSFLAICFAVLSLGSFAEDKIYEAKAEARGYNEDGVPIVLTVKATKKDGKVVIKDIVA QHKETDKIGGVAIEQLIKQVKDKQNYNKVDGVSGATSTSAGFRRALRNAVKDIEKQA >gi|224461273|gb|ACDC01000129.1| GENE 10 9786 - 10982 1946 398 aa, chain - ## HITS:1 COG:FN0654 KEGG:ns NR:ns ## COG: FN0654 COG0126 # Protein_GI_number: 19703989 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Fusobacterium nucleatum # 1 398 1 398 398 692 95.0 0 MKKIITDLDLNNKKVLMRVDFNVPMKDGKITDENRIVQALPTIKYALEQNAKLILFSHLG KVKTEEDKASKSLKAVAEKLSELLGKNVTFIPETRGEKLEAAINNLKPGEVLMFENTRFE DLDGKKESKNDPELGKYWASLGDVFVNDAFGTAHRAHASNVGIAENIGAGNSAVGFLVEK ELKFIGEAVNNPKRPLIAILGGAKVSDKIGVIENLLTKADKILIGGAMMFTFLKAEGKNI GTSLVEDDKLDLAKDLLTKSNGKIVLPVDTVVAAEFNNDAEFSTVDVDNIPDNKMGLDIG EKTVKLFDSYIKTAKTVVWNGPMGVFEMSNFAKGTIGVCESIASLADAVTIIGGGDSAAA AISLGYADKFTHISTGGGASLEFLEGKVLPGVEAISNK >gi|224461273|gb|ACDC01000129.1| GENE 11 11073 - 12080 1785 335 aa, chain - ## HITS:1 COG:FN0652 KEGG:ns NR:ns ## COG: FN0652 COG0057 # Protein_GI_number: 19703987 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 335 1 335 335 608 96.0 1e-174 MAVKVAINGFGRIGRLALRVMSKNKDFDVVAINDLTDAKTLAHLFKYDSAQGRFDGTIEV TDDGFVVDGDSIKVFAKANPEELPWGELGIDVVLECTGFFTSKEKAEAHIKAGAKKVVIS APATGDLKTIVYNVNDNVLDGSETVISGASCTTNCLAPMAKVLNDKFGIVEGLMTTIHAY TNDQNTLDAPHKKGDLRRARAAAENIVPNTTGAAKAIGLVIPELKGKLDGAAQRVPVITG SITELVTVLEKETSVEEINAAMKAASNESFGYTEEELVSSDIIGISFGSLFDATQTKVLS VGGKQLVKTVAWYDNEMSYTSQLIRTLKKFVEISK Prediction of potential genes in microbial genomes Time: Thu May 19 23:46:47 2011 Seq name: gi|224461272|gb|ACDC01000130.1| Fusobacterium sp. 
2_1_31 cont1.130, whole genome shotgun sequence Length of sequence - 25975 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 72 - 938 249 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 2 1 Op 2 . - CDS 947 - 1978 1328 ## COG2008 Threonine aldolase - Prom 2005 - 2064 10.9 3 2 Op 1 . - CDS 2118 - 2951 679 ## COG2990 Uncharacterized protein conserved in bacteria 4 2 Op 2 1/1.000 - CDS 2963 - 3415 443 ## COG0219 Predicted rRNA methylase (SpoU class) 5 2 Op 3 . - CDS 3438 - 4061 897 ## COG0406 Fructose-2,6-bisphosphatase - Prom 4081 - 4140 8.6 6 2 Op 4 . - CDS 4143 - 4706 638 ## COG1396 Predicted transcriptional regulators - Prom 4727 - 4786 10.3 + Prom 4795 - 4854 11.1 7 3 Op 1 4/0.000 + CDS 4986 - 6335 1809 ## COG2610 H+/gluconate symporter and related permeases 8 3 Op 2 1/1.000 + CDS 6361 - 7686 1929 ## COG3048 D-serine dehydratase 9 3 Op 3 . + CDS 7708 - 8817 1310 ## COG3616 Predicted amino acid aldolase or racemase + Term 8825 - 8887 19.2 + Prom 9069 - 9128 12.2 10 4 Tu 1 . + CDS 9371 - 9643 295 ## gi|254304146|ref|ZP_04971504.1| possible ESS family glutamate:sodium (Na+) symporter + Term 9658 - 9698 3.1 + Prom 9776 - 9835 13.4 11 5 Op 1 1/1.000 + CDS 9893 - 10630 1206 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 12 5 Op 2 . + CDS 10685 - 12235 2071 ## COG2385 Sporulation protein and related proteins 13 5 Op 3 . + CDS 12265 - 13350 1210 ## COG0270 Site-specific DNA methylase 14 5 Op 4 . + CDS 13352 - 14164 794 ## MHO_0360 cytosine-specific DNA methyltransferase/type II site-specific deoxyribonuclease + Term 14175 - 14223 1.1 + Prom 14166 - 14225 4.1 15 6 Tu 1 . + CDS 14314 - 15057 757 ## COG4912 Predicted DNA alkylation repair enzyme + Term 15107 - 15149 1.6 + Prom 15276 - 15335 13.2 16 7 Tu 1 . 
+ CDS 15375 - 22922 10728 ## FN2058 hypothetical protein + Term 22979 - 23019 7.2 - Term 23013 - 23057 1.0 17 8 Op 1 34/0.000 - CDS 23069 - 23911 335 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 18 8 Op 2 8/0.000 - CDS 23913 - 24692 484 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 19 8 Op 3 4/0.000 - CDS 24693 - 25010 435 ## COG1930 ABC-type cobalt transport system, periplasmic component 20 8 Op 4 . - CDS 25022 - 25735 811 ## COG0310 ABC-type Co2+ transport system, permease component - Prom 25832 - 25891 12.9 Predicted protein(s) >gi|224461272|gb|ACDC01000130.1| GENE 1 72 - 938 249 288 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 74 277 82 278 285 100 34 1e-20 MKKYIVEHEFDGYEIGTYLKETKGYSSRGLRNLEIYLNGKRIKNNAKKIKKLNRIVIIEK EKSTGIKAMDIPIDIAYEDENLLIVNKEPYIIVHPTQKKVDKTLANAVVNYFEKTLGKTL VPRFYNRLDMNTSGLIIIAKNAYTQAFLQDKTEVKKTYKVIVSGIIEEDDFFIELPIGKI GDDLRRIELSEENGGKSAKTHIKVLERNREKNITFLEARLYTGRTHQIRAHLSLIGHPLV GDELYGGDMNLAKRQMLHAYKLEFQNPKTLENLKVEIEIPLDMKELLK >gi|224461272|gb|ACDC01000130.1| GENE 2 947 - 1978 1328 343 aa, chain - ## HITS:1 COG:FN0810 KEGG:ns NR:ns ## COG: FN0810 COG2008 # Protein_GI_number: 19704145 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Fusobacterium nucleatum # 1 340 1 340 340 616 90.0 1e-176 MISFKNDYSEGACPEVLEALVKTNYEQTIGYGEDAYCEEAKNLIKENINCPNADIYFLVG GTQVNTTVISHCLKPYEAVIASKTGHISIHETGAIEATGHKIIEVEPVDGKLTPDLILNE LRKHEDHHMVKPKMVYISNTTEIGTVYTKDELEAISKVCKDNNLYLFLDGARLASALASE KCDINLEDYPKYCDVFYIGGTKCGLLFGEAVVIINEDIKKEFNFSIKQKGGLFAKGRLLG VQFATLFKNDLYYRIGVHSNKMALKIKNAFVEKGIKLATDSYTNQVFVDLSQKQIKELEK EVIFSVEFFGIGESQSSRFVTSWATKEEDVDKLVELIKNLNVD >gi|224461272|gb|ACDC01000130.1| GENE 3 2118 - 2951 679 277 aa, chain - ## HITS:1 COG:YPO1363 KEGG:ns NR:ns ## COG: YPO1363 COG2990 # Protein_GI_number: 16121643 # Func_class: S Function 
unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 24 273 37 293 315 122 29.0 6e-28 MKKELNFYFSAMKNSTNKGEINTFKKKVKYIFRNLIFYKYSKKLANFILNDKFLKENIYK YPALCSKIHRPYLANSIKLEDKVNIIISSYIFLNNYFKDSFLAELYEKGIYKICEIEGKN EEQLFFYLKVYTDFEKEGEFNLICTDKSGNQLVKLTFAVDNNKIVIAGLQGMKKDENLEK IKYVTKNFYGIFPKKITLEVLYLLFSNFQKKAVSNNGHVYLSLRYKFKKYRKINVDYDEF WESLGAKKENETFWLLPEKLTRKSIEDIPSKKKISIH >gi|224461272|gb|ACDC01000130.1| GENE 4 2963 - 3415 443 150 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 1 150 1 150 150 296 93.0 7e-81 MNIVLFQPEIPYNTGNIGRSCVLTNTTLHLIKPLGFSLDEKQVKRSGLDYWHLVDLKIWE SFEDFLEANRNIRLFYATTKTKQKYSDVKYEENDFIMFGPESRGIPEEILNKNPERCITI PMIPMGRSLNLSNSAVVILYEAYRQLGFNF >gi|224461272|gb|ACDC01000130.1| GENE 5 3438 - 4061 897 207 aa, chain - ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 205 1 205 206 332 85.0 3e-91 MEIYFVRHGQTIWNVEKRFQGLSDSPLTELGITQAKLLGKKLKDIKFDKFYSTSLKRAND TANYIKGDRDQEVEIFDDFIEISMGDMEGMGHEKFKELYPVQLKNFFFNQIEYDPREYNG ESFLEVRERVIKGLNKFVELNKNYERVLVVSHGATLKTLLHYISGKDISTLSDEEIPKNT SYTIVEYKDGKFEITDFSNTSHLDEIK >gi|224461272|gb|ACDC01000130.1| GENE 6 4143 - 4706 638 187 aa, chain - ## HITS:1 COG:FN0555 KEGG:ns NR:ns ## COG: FN0555 COG1396 # Protein_GI_number: 19703890 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 187 13 199 199 305 88.0 5e-83 MEMNKEVNVGMTIKNIRKSKKLLLKEVASKCGISSSMLSQIEKGNANPSLNTIKSIAQVL EVPLFKFFIDSDKEKYEFHLLKKNERKIISTEYVTYELLSPDVETNIECMQMTLIGKNAE TSVKPMSHKGEEIAVLLDGKVKLTIGKFSVILSSGDSIHIPAMAPHKWTNLNDTKSVIIF SVTPPEF >gi|224461272|gb|ACDC01000130.1| GENE 7 4986 - 6335 1809 
449 aa, chain + ## HITS:1 COG:FN0554 KEGG:ns NR:ns ## COG: FN0554 COG2610 # Protein_GI_number: 19703889 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 1 449 1 449 449 697 95.0 0 MNVTFTITAILLSILLLVLLTIKVKLHPFFALTISAFFFGIISGHSIPDIIGAYSDGLGG TIAGIGVVIAIGTVMGSLLENSGAAETMAETILKITGKKNADIGLAVTGYFVSIPVFCDS AFVLLSPLAKRVSKDTGGSMTTMAVALAMGLHATHMLVPPTPGPLAVAGILGSNLGLVIL CGMLVSIPVTIVAIIAGRIFGKKYYFLPEIEEAHVDEKNKKLPSAFMSFAPIIVPIILML LKTVGSLEAKPFGTGALYNVFDSLGQTIVALFIGLIIAFFTYRSVYPDDKNVWTFDGIFG ESLKTAGQIVLIVGAGGAFATVLKLSNLQEIVMNLFAGISIGIIVPYIIGAIFRTAIGSG TVGMITAASMLLPLLDVLGFNSPMGLVIAMLACAAGGFMVFHGNDDFFWVVVSTSGMKPE VAYKTFPIISVLQSVTALICVFILKIIFL >gi|224461272|gb|ACDC01000130.1| GENE 8 6361 - 7686 1929 441 aa, chain + ## HITS:1 COG:FN0553 KEGG:ns NR:ns ## COG: FN0553 COG3048 # Protein_GI_number: 19703888 # Func_class: E Amino acid transport and metabolism # Function: D-serine dehydratase # Organism: Fusobacterium nucleatum # 1 441 1 441 441 780 91.0 0 MDIKNIIANSPLIKDMVDKKEVVWINPKEVNYSEYEKRLPINDEELKEAEERLKRFAPFI KKVFPETEETNGIIESPLEEISNMQKELERKYNTEIPGKLYLKMDSHLPVAGSIKARGGV YEVLKHAEELAIEAGLLKLEDDYSILADKKFKDFFSKYKIQVGSTGNLGLSIGITSAALG FQVIVHMSADAKKWKKDMLRSKGVQVVEYESDYGKAVEEGRKNSDADPMSYFVDDEKSMN LFLGYTVAASRIKKQFDERGIVIDKEHPLIVYIPCGVGGAPGGVAYGLKRIFKENIYIFF VEPILAPCMLLGMETGLHEKISVYDVGINGITHADGLAVARPSGLVGRLMDPILSGIFTV DDYKLYDYLRILNETENKRIEPSSCAAFEGVTSLLKYDESKKYIENKIGKNINNAYHVCW ATGGRMVPKEDMERFLNTYLK >gi|224461272|gb|ACDC01000130.1| GENE 9 7708 - 8817 1310 369 aa, chain + ## HITS:1 COG:FN0552 KEGG:ns NR:ns ## COG: FN0552 COG3616 # Protein_GI_number: 19703887 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid aldolase or racemase # Organism: Fusobacterium nucleatum # 1 369 1 369 369 630 90.0 1e-180 MKKKELKTPTILLNIEALKNNIKNYQKLCTEYKKELWPMIKTHKSMEIVGMQIKEGATGV LCGTLDEAEACCEKGIQKIMYAYPVASEENIKRIIEITKKTDFIIRLDSLEAAIKINKIA 
EAENVIISYTIIVDSGLHRFGLSLKNLLTFAEELKKLKNLKLRGISSHPGHVYSSTCEAD IHKYVLDECETLKEAKEILEKEGYYLEYITSGSTPTFTEAVKDLNINVYHPGNYVFLDSI QLSINKAKVKDCALTVLATIISHPSENLFICDAGAKCLGLDQGAHGNSSIIGYGTVIDHP EVIVSSLSEEVGKLKVEGETSLKVGDKIEIIPNHSCSTANLCTYYTVVDGDDVIKSIKVD ARGNSIKRI >gi|224461272|gb|ACDC01000130.1| GENE 10 9371 - 9643 295 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|254304146|ref|ZP_04971504.1| ## NR: gi|254304146|ref|ZP_04971504.1| possible ESS family glutamate:sodium (Na+) symporter [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 90 364 453 453 130 93.0 4e-29 MIATFGMATGVFLTGILLLRVCDPDFKSPVLANYSLSYTITSVVYFVFLNIILTLLLTKG LFFGMSFTFIVGFLFMFAAIISSKILLKNK >gi|224461272|gb|ACDC01000130.1| GENE 11 9893 - 10630 1206 245 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 245 1 245 245 417 90.0 1e-117 MKFLGIIPARYSSTRLEGKPLKLIEGHTMIEWVYKRAKKSNLDSLIVATDDERIYNEVLN FGGQAIMTSTEHTNGTSRIAEVCEKIKDYDVIINIQGDEPLIEYEMINSLIETFKENKDL KMATLKHKLTEKEEIENPNNVKVICDKNDYAIYFSRSVIPYPRKADNISYFKHIGIYGYK RDFVIDYSKMPATALEVAESLEQLRVLENGYKIKVLETTHSLIGVDTQENLEQVINFVKK NNIRI >gi|224461272|gb|ACDC01000130.1| GENE 12 10685 - 12235 2071 516 aa, chain + ## HITS:1 COG:FN0806 KEGG:ns NR:ns ## COG: FN0806 COG2385 # Protein_GI_number: 19704141 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Fusobacterium nucleatum # 184 516 1 333 333 556 85.0 1e-158 MIVSAFLLLACSNNSGKKVKPVKPNGDYKVGTVVEENTNNTNIERGNREKITLENTVFKK LGLPLPYNTFGAAIPYLVPVNDNHKESFSVFGEYDENKALKYFKNLSSRGHGDNSPYWRW KTSIKKSDLYNKVESRIVSIYKTNPRNVLTLVNGEWQQAPIRSVGNVQDIIVAARGESGI ITHMLIITSNGKYLVAKEFNVRKLLATNNALYGSKGEEGAYASKPIMPNVSSLPSAYLAL EEDGGYIHIYGGGYGHGVGMSQFAAGTLAKSGENYKNILKRYYTNVKISTVESVLGNNKE IKVGITTNGSLEHGRLNIFSSENKVQIYNEDFDVTVGANERIDVRNSSGSVTVTLENGKE 
YKTRNPLNFYAKGEYLTISPVKKAHTSSPKYRGILTIIPRGSSLRVINTIDIEKYLLQVV ASEMPRSFGVEALKVQAVAARTYAVSDILKGKYAKDGFHIKDTVESQVYNNQVENEDATR AIKETAGEIMTYDGMPIDAKYFSTSSGFTSHASNVW >gi|224461272|gb|ACDC01000130.1| GENE 13 12265 - 13350 1210 361 aa, chain + ## HITS:1 COG:all0934 KEGG:ns NR:ns ## COG: all0934 COG0270 # Protein_GI_number: 17228429 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Nostoc sp. PCC 7120 # 10 354 53 425 477 241 37.0 2e-63 MNTENLTLWNNNEKNLKVDKKKVNSEVELKAIELFAGAGGLALGVEKAGFNTIGLIEIDK NACNTLKLNRPNWNVINENIANISLKNLEDFFSIKKGELDLLSGGAPCQAFSYAGKRLGL EDTRGTLFYHYALFLKQLQPKMFLFENVKGLLSHDKGKTHETIVNVFKEEGYTIYEKVLN AWNYGVAQKRERLIIVGIRNDLINKLNFLFPTPHKYKPVLRDILLDCPESKGIAYSEYKK KIFEMVPPGGYWKDIPKEIAKEYMKSCWDMEGGRTGILRRLSLDEPSLTVLTSPSQKQTD RCHPIEARPFTIRENARCQSFPDEWIFSGSVTDQYRQVGNAVPVNLAYEVALEIRKALEM L >gi|224461272|gb|ACDC01000130.1| GENE 14 13352 - 14164 794 270 aa, chain + ## HITS:1 COG:no KEGG:MHO_0360 NR:ns ## KEGG: MHO_0360 # Name: dcm # Def: cytosine-specific DNA methyltransferase/type II site-specific deoxyribonuclease # Organism: M.hominis # Pathway: not_defined # 4 224 343 550 553 231 52.0 2e-59 MWKLKFISKENFYKHIQDTIEKYGEKLESYDLKKFNKNIIDPIKLIFDKTVYSSSWNEII NSEIFRQRDKSNNNDIGYFHQRIFQYIDNCKVPENGEDGGWDVIYEDKDGITLPEGTTVH KIYVEMKNKHNTMNSSSASKTFIKMQNQLLNDDDCACFLVEAIAQHSQNIKWETTVDKQK VSHKLIRRVSMDQFWSLVTGEEDAFYKICMLLPEVIKEVIQDTKAFSFPDDNVCEEIEEK SKLYPKLSNDEAIAMAFYMLAFSEYLGFKK >gi|224461272|gb|ACDC01000130.1| GENE 15 14314 - 15057 757 247 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 1 247 5 251 251 372 90.0 1e-103 MEIESLEFKTEKEYKEFLDYLFSIRDIEYKDFNTKIIVPVDCEIIGIRTPILRDIAKKIA KTSSENFLNFFEKLFLKKKIKYYEEKVLYGFLIGYSKMDYQDRLKRIDFFIDIIDNWAVC DIVDSSFNFINKNKEDFYKYLNSKLSATNLWEQRFIFVMLLAYYVEDKYLKDIFKICEKI 
KSEEYYVNMAKAWLLSVCYVKYREETYKFLEKTKLDAWTVNKSIQKIRESLRVTKEEKEK ILILKRK >gi|224461272|gb|ACDC01000130.1| GENE 16 15375 - 22922 10728 2515 aa, chain + ## HITS:1 COG:no KEGG:FN2058 NR:ns ## KEGG: FN2058 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 699 2515 4 1794 1794 1955 66.0 0 MNNNLYNVEKNLRSIAKRYENVKYSVGLAVLFLMKGTSAFSDDNMIQETEKQKDILIDVK KGKSQLKEKKKITQKTQKLKASWTSMQFGANDLYSNFFVAPKTKVEKTSIVKNEKTVLVA SADSSTSLPMFAKITSDIEETSTPTTEEINTSKKNLRNSIGNLQNKIDTARKENSKEIQG LKLELVQLMEQGNQVVKSPWSSWQFGANYMYDDWNGSYKGSGDKTEKYPFEGVFTRSSNV FGRSTSARTADQKAILGSIIATSGGFTPNDTGLSYGLIRRAQINENPISIEVSAGIRPKH IQKGAITLSVPPVNIIQPTPSVTLGITNTPAAPNINIPSFSPVAPKVEAPALPVPPTFAV VLGADCNVGCNSSGSTPRQNTKAGFNLAGRAAGNIENILHYTWPAGSGAYALPGIRASLA FKMYADTKRDFTLGTDIPKNHSNWGATTAAPNNVYFNSYNFGDEYANAVRTSANGGNPNK NDQYFFVGGSRFIESDDVGAAGNTLTIPNGYTVNLGGIFTLGLVSQGHKTTELNAGTITD KEEKNDKWIKDMPYDTSGSGFGKYLTIKGPTEEYHIKRSADGYVGYKVALALIQEDAVQG GAIINDTTGVIDFRGERSIGLYTYLPNPTTSKVYSNRPMTNKGNILLSGIESYGMKYAAT ENAGAVTFINDSTGTIALRKNPNGNDKADNSAAMALMKDGSVTTKVTLTRGKAINKGNIN LEDNISNALGMFVNINSDMTNEGTIKVSAVAPKVSNKYQFNVAMRADQADIAYEGTNTKD TEVINKNSIKLTGQGAIGMIANGSSTSGTNTKHAIATNNAGATIDIDKEGTSESKDNFGM LATNQAEVINKGTINIGASTGSVGMAALKDGATHSTAKNEGTISVNGAESTGVYNTGHFL MDNINAKINVKGSQSIALYAKETDPTHTKTELKKGTVKSEDGAVALYSDQADITLDNTSG NLKLVAGNGGLLFYNYKSSNPNDYAGQFTINGPVVADIESGGYGFYLKNATINNVSGQVQ GVPAFLDAMFNITANKLKVKMQSGGTFMVLHKPTGGSMKLSSVSNLAAINSALGSKVELE APTTGSYKVYSVYRGKLEINQDVNLDNDESTATPDAFYKVDFRSSNMLVEAGKTVSGTKQ GQVALFQGNFNEGAGGDVGTVGDVSIVNNGTIKLTGNSITTGPVASRKTTTAMAGDFITL TNNKTIEVTGDNGIGIYGAGGSKILNNAGASLTVGQEGVALYGANKLNSSTLGDGTISVT NAGDIQGVNGKTKAFGIFAENTSSTVTDSNLTNSGTIDFSSSQESIGIHSINSTVSNTGN IKMGLKGVAINSKNSDINSTGDIVLAGNGIAFNLGGTFTGRTLNFSSKVTLNGDGNSIFN LKDMAFSSVGASLTENVNIVPNGKSFAYFSMDNSSLVYDKDKTFAGDKITLVSAKNSTVD WRSNVILNGQENVTFYLNGRKAGAALELKTAAGKTIILGNKSVGAYGVNGARIENDSDIV VGSDGAALYSTGATGSLKNTGKLTIGKNSVGMFMKDGTALTNTGEIVSTAEGAKGLVINR TTAGTYTNTGKIKLTGTSSIGIHAEGAAHNINSGADVEVGNTTGTSQSVAIHLKDGGEVN 
VLANTSVKAGNGSIGIYGSTVSTTVDNNAKVEVGDGAVAIYAKSGNVNLNAGSKMKIGET LGTNKEAVGVYYVGNAGTINNNLTSFDIGKGSIGIADAGTGATTINNNSATVALKGDSIY TYTFNTSSNVIGNTTITSTGNGNYGYYVAGNLSNYGTMDLSSGNGNVGIYSAYGAGTGNG VARNYANIKVGKTDLENELYSIGMAAGYTNNNRPSENKVGHIVNMAGSTITVGNENSIGM YASGAGSTAENYGTIHVTAKKGIGMYLENGATGYNRAGGLIEIDPSAQNAIAVYSTGGTT VFKNYGTIRLKAPDSKGIVTANNAQGTNETGGIIDVQHSSAEATKKIEGTAGGDKKFGDK TLSVPRGGLTDSKVQDSTGNIITPTVIDATSATATANIQVSNDPIAKATYNRDILKEHQD FGSISKIGMYVDTSGVNFTNPIEGLNNLTGLRKADLIIGAEAAEYTNAKTLIVGTNILKR YNTALLNSGVDKWDILSGSLTWAAVPLRLSRNGEIQGVLMTKVDYKEYAKDSTTPYNFLD GLEQRYDKNALDSKEKKLFNKLNSIGKNEPILLSQAFDEMLGQQYANTQQRVQATGNILD KEFNYLRNEWSNPSKDANKIKTFGTRGEYSTGTAGIKDYKNHAYGVAYVHEDETVKLGES TGWYTGIVHNTFDFKDIGKSKEEQLQAKLGIFKSVPFDHNNSLNWTISGDIFAGYNKMNR RFLVVDEIFNAKSRYSTYGVGLKNELSKEFRLSEDFSVRPYAALSLEYGRVSKIREKSGE IKLEVKANDYFSIKPEIGTELAYKHYFGANTMKVGVSVAYENELGRVADSKNKARVGYTS AGWYDLRGEKEDRTGNVKTDLNIGWDNQRVGVTANVGYDTKGNNVRGGVGLRVIF >gi|224461272|gb|ACDC01000130.1| GENE 17 23069 - 23911 335 280 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 10 272 133 392 398 133 31 1e-30 MNGEYILEVKDLEYTYGDGTHALKGINLKIEKGKKIAIIGVNGSGKSTLFLNMNGVLKAT KGEIYFKGEKIAYDKKSLMELRKKIGIVFQNPETMLFSSNVYQEVSFGAMNLKLDNNIVH ERVQAALEDVNMTEFSEKSIHFLSYGQKKRVSIADILVMEPELIIFDEPTSSLDPKHTQK IEEIFDELNKKGMTVVISTHDMNFTYSWADYIFIIDDGKIKKQGIPEEIFADNKILEECY LEKPFLFDMFEKLKEKNLITNTESYPKTKEELYKLLKTIS >gi|224461272|gb|ACDC01000130.1| GENE 18 23913 - 24692 484 259 aa, chain - ## HITS:1 COG:MJ1089 KEGG:ns NR:ns ## COG: MJ1089 COG0619 # Protein_GI_number: 15669277 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanococcus jannaschii # 4 256 9 258 268 120 31.0 4e-27 MVSLDKVAYTSKLLKNNPSEKLLFSILTLFSCIFFNNIIISLIVLFIMYLSITYLGGIEN KLFFKLISIPLVFLVIGVLTIIIVKLAPNQESLFKINIFSNEYGITIKTFLQGLTLMLKS 
LAAVSCLYFLILTTPVLDIFYCLEKLKLPKLLVEIIGLVYRYIFLFIDVAQMIYISQDAR LGYSTLKASFNSSGRLISSLFLTALKQANESFVCLEARCYTGELKFIDKSYISSKRNIIL ILLVNIILIAIYFLTRRVI >gi|224461272|gb|ACDC01000130.1| GENE 19 24693 - 25010 435 105 aa, chain - ## HITS:1 COG:alr3944 KEGG:ns NR:ns ## COG: alr3944 COG1930 # Protein_GI_number: 17231436 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 9 103 10 100 100 101 55.0 3e-22 MEKNLWKKNFILIILVVVLAAFPLIFVADGEFGGSDDQAEQLITDIDSNYKPWFESLWEP PSGEVESLLFSLQAAIGAGIVCGYIGYLMGKNKGKKENSDKKEGE >gi|224461272|gb|ACDC01000130.1| GENE 20 25022 - 25735 811 237 aa, chain - ## HITS:1 COG:STM2023 KEGG:ns NR:ns ## COG: STM2023 COG0310 # Protein_GI_number: 16765353 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Salmonella typhimurium LT2 # 11 224 19 234 245 225 55.0 7e-59 MKFNKKNLIFLWFFMLGANSFSMHIMEGFLPPLWCGIWGALCVPFILVGFSRIKKKIEVD SKMKMLIAVVGAFAFVLSALKIPSVTGSSSHPTGVGLAAILFGPAITSVLGLIVLIFQAI LLAHGGITTLGANVFSMGIVGPIVSYFIYKGLKKTNRSFAIFLAAALGDLLTYVTTSIQL GLAFPDPNGGFLLSASKFMGIFAVTQVPLAISEGLLTVVVFNILMNYNKETLNELEI Prediction of potential genes in microbial genomes Time: Thu May 19 23:47:26 2011 Seq name: gi|224461271|gb|ACDC01000131.1| Fusobacterium sp. 2_1_31 cont1.131, whole genome shotgun sequence Length of sequence - 9774 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 6, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 1 - 654 561 ## COG0785 Cytochrome c biogenesis protein 2 1 Op 2 . + CDS 678 - 2192 1969 ## COG0225 Peptide methionine sulfoxide reductase + Term 2201 - 2240 4.6 + Prom 2215 - 2274 6.0 3 2 Tu 1 . + CDS 2298 - 3218 943 ## Lebu_2020 hypothetical protein + Prom 3235 - 3294 6.1 4 3 Tu 1 . 
+ CDS 3395 - 4159 860 ## COG0500 SAM-dependent methyltransferases + Term 4265 - 4307 1.1 + Prom 4390 - 4449 6.3 5 4 Op 1 . + CDS 4504 - 4626 93 ## 6 4 Op 2 . + CDS 4610 - 5722 1201 ## COG2849 Uncharacterized protein conserved in bacteria 7 4 Op 3 . + CDS 5732 - 5965 225 ## gi|296329140|ref|ZP_06871643.1| conserved hypothetical protein + Prom 6009 - 6068 8.1 8 5 Op 1 . + CDS 6099 - 6827 641 ## gi|237739138|ref|ZP_04569619.1| predicted protein 9 5 Op 2 . + CDS 6830 - 7333 392 ## CCC13826_1945 carbon monoxide dehydrogenase 1 (CODH 1) (EC:1.2.99.2) 10 5 Op 3 . + CDS 7342 - 7716 529 ## gi|294783866|ref|ZP_06749188.1| conserved hypothetical protein 11 5 Op 4 . + CDS 7724 - 8392 567 ## gi|237739141|ref|ZP_04569622.1| predicted protein + Term 8403 - 8461 11.2 + Prom 8770 - 8829 7.7 12 6 Tu 1 . + CDS 8949 - 9599 600 ## COG4804 Uncharacterized conserved protein Predicted protein(s) >gi|224461271|gb|ACDC01000131.1| GENE 1 1 - 654 561 217 aa, chain + ## HITS:1 COG:FN0804 KEGG:ns NR:ns ## COG: FN0804 COG0785 # Protein_GI_number: 19704139 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Fusobacterium nucleatum # 1 217 2 218 218 285 88.0 4e-77 FTQEVAYSTAYLAGIASFFSPCIFPIIPVYISILSNGEKKSVSKTLAFVLGLSVTYIVLG FGAGFIGELFLNSKVRVIGGILVVILGLFQMDILKLKFLEKTKVMNYEGEEQSLFSTFLL GLTFSLGWTPCVGPILASILILAGSSGDTGNSVMLMVLYLLGMATPFVIFSLASKTLFKK MSFIKKHLPLIKKIGGFLIIVMGFLLIFDKLNIFLTV >gi|224461271|gb|ACDC01000131.1| GENE 2 678 - 2192 1969 504 aa, chain + ## HITS:1 COG:FN0803_2 KEGG:ns NR:ns ## COG: FN0803_2 COG0225 # Protein_GI_number: 19704138 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Fusobacterium nucleatum # 193 357 1 165 165 310 93.0 3e-84 MKKFILPLIFIFLIGTFVFAKMLSHNVSKETEAEKDLLESIQLVDMNGNDYTFSRGKNIY IKFWASWCPTCLAGLEELDRLAGENNSNFEVITVVFPGINGEKNPAKFKEWYDGLGYKNI KVLYDTDGKLLQIFKIRALPTSAIIYKDLKIDNVIVGHISNGQIKDYYEGKGENEVMEES 
KNTTVNNVNKENIKEIYLAGGCFWGVEEYFARIDGVIDSVSGYANGSFDNPTYENVCNNS GHAETVHITYDSSKVSLDTLLKYYFRIIDPTSVNKQGNDRGIQYRTGIYYQNDEDKQIAI NAIKEEQKKYSKPIVIEVEKLKRFDKAEEEHQDYLKKNPNGYCHINLNKANEAIIDEKKY QKPSDEVLKEKLTDLEYQVTQNAATERAFTHEYDKNQEDGIYVDITTGEPLFSSKDKYDA GCGWPSFTKPIATEVVNYKQDNSHGMSRVEVRSRAGKAHLGHVFEDGPRAEGGLRYCING ASLRFIPYDKMDEEGYGEFKKYVK >gi|224461271|gb|ACDC01000131.1| GENE 3 2298 - 3218 943 306 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2020 NR:ns ## KEGG: Lebu_2020 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 302 1 312 321 131 37.0 3e-29 MKEILKNLKLGQIIGIYHFEDSCFTVGKILKIDSKYLYLFSYDVNFKEDGIKIFLINSIK RIILKSDYIRSLEKNQKKIVRFNRENKDIFQELIKNRIKVSVELADESIEEVYLIEKGED YFSFQILNDNENITSEEIITKDYLKRIKISNYIEREEYKNFKVITTKNDDEYIAYDLSYN KDYLIFSEKEEFYDMGKINIIPKNMIENISEIEVKLDTKKENFYELIDFEKDLEIVEILR KCLENKFLVFIDNVDFFETKVGVITNLENNKIKMKEIDKYGNFYKNSEIYLDEIQLLAIK NYKLGV >gi|224461271|gb|ACDC01000131.1| GENE 4 3395 - 4159 860 254 aa, chain + ## HITS:1 COG:FN1919 KEGG:ns NR:ns ## COG: FN1919 COG0500 # Protein_GI_number: 19705224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 10 253 12 249 249 184 41.0 1e-46 MNNYIKVNEDRWNNVKNDYTEPLTHEELEEVKNNPISVALTVGKKVPKEWFEKTNGKKIL GLACGGGQQGPVFAIKGYDVTIMDFSKSQLQKDDMVAKREGLKINTVQGDMTKPFPFENE TFDIVFNPVSNVYVEDLENIYKEASRVLKKGGLLMVGFMNPWIYMYDADIVWDKPDEELL LKFSIPFNSKELEEEGKITINPEYGYEFSHTLETQIRGQLKNGFAMIDFYESCDKRHRLS HYGNDYIATLCIKL >gi|224461271|gb|ACDC01000131.1| GENE 5 4504 - 4626 93 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEIDKYGNFYKNSEIYFDEIQLLAIKNYKFMEGNYEEKF >gi|224461271|gb|ACDC01000131.1| GENE 6 4610 - 5722 1201 370 aa, chain + ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 9 335 4 333 338 206 
41.0 7e-53 MKKNFRFCILCVLIFLFNTLIVKAEREIKYVDSEIRNGIIYSKNEKIPYNGLIKDYYKNG NVKIEWTIVNGAQNGVAKSYYEDGTLKSDSIFKNNKKTGIEKMYSLDGKLVAEVPYKDEI RDGIEKQYSKNGKIIVEISFKNGIEEGAFKQYYENGVLEIEAFYKNGKLEGIWKDYSKDG KIENETSFKNGIEDGTKKTFYKNGNLKYSVEIKNGIKEGAFKQYYENGVLEIEAFYKNDK LEGVKRDYYKSGKIENETSFKNGIEDGTKKSFYKNGNLKYSVEMKNGVENGVFKQYYENG VLEIEAFYKNGKLEGIRRDYYKSGKLEVEGLQKNGEPDGWTYVYNEDGTIKREIFFVEGK AYEKDNNKKK >gi|224461271|gb|ACDC01000131.1| GENE 7 5732 - 5965 225 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|296329140|ref|ZP_06871643.1| ## NR: gi|296329140|ref|ZP_06871643.1| conserved hypothetical protein [Fusobacterium nucleatum subsp. nucleatum ATCC 23726] # 1 77 1 77 360 123 84.0 3e-27 MKKFELRPIYYPKGSYLNYILEIWVDGVNISQFYEGDKLRIDVGYIFHIYNYFDNYLEDI MKEEVLPYEDVEGKNNF >gi|224461271|gb|ACDC01000131.1| GENE 8 6099 - 6827 641 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739138|ref|ZP_04569619.1| ## NR: gi|237739138|ref|ZP_04569619.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 242 1 242 242 380 100.0 1e-104 MLLSGPFLCIPDIIFRKIGDKIEISWDTIWDITYQQRKYENENIKFISTKGVSYIDADEF YLEIKNFLKKIDDISKIQNEKFHIIEKTGKLIYAKDPYNNIEFKEEKNFLKDLEKIDYKF FTIYELVLITEKDKKIVPIILKYLSKIEDENIKTHLAYFLAVKNYKEASEKLIKEFYNAK TNEYRIALSKALSTIYNKDILNELLEIAKNKEYKDVNFPIILTLNKYKNKRVKMFFEKNR ME >gi|224461271|gb|ACDC01000131.1| GENE 9 6830 - 7333 392 167 aa, chain + ## HITS:1 COG:no KEGG:CCC13826_1945 NR:ns ## KEGG: CCC13826_1945 # Name: not_defined # Def: carbon monoxide dehydrogenase 1 (CODH 1) (EC:1.2.99.2) # Organism: C.concisus # Pathway: not_defined # 1 167 1 160 168 84 35.0 1e-15 MKTYVLDVLENILNEEQANQYYCKAFNEMNKREKIPYIVNENRYLKFLLRLYKMDKNVVY KFRFFEKWCFDFFSNSEKLHYKNSIKKLRRKALDKKKFLNKDKDILEMIFKMSFRDVFGF QKNYKIYFSNLKILITSLTDYYYFITFLDKDEEKVKNLVKKSKLFLR >gi|224461271|gb|ACDC01000131.1| GENE 10 7342 - 7716 529 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783866|ref|ZP_06749188.1| ## NR: gi|294783866|ref|ZP_06749188.1| conserved hypothetical protein [Fusobacterium sp. 
1_1_41FAA] # 1 124 1 122 122 149 93.0 6e-35 MEVRYDVYLEDENENENEDDFDAPRKRLELIFSHITEEEKEILEKYDFKYEYTEDNKIKL IDEDYAIYYTVEIDDEDKGIYLEKTKTYYNYFKYDFISRENERTKNLVISKEGVRVEIIF NERR >gi|224461271|gb|ACDC01000131.1| GENE 11 7724 - 8392 567 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739141|ref|ZP_04569622.1| ## NR: gi|237739141|ref|ZP_04569622.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 222 1 222 222 332 100.0 1e-89 MVDYRSILVERMEYKDSILYLYCRTFYKVVGDGEYNKYDYRLYHRKVLKFKNVKRFEYYS DEVYDNFLNELEDLRAELGVPYFRKIFNKSKKRNKLFISGMGYFDNFIAIEFKDDKKEKI VIDEKEKYLEIKKELLKILQSKKEKFEKNNIKMKVIEEKEDSYIINLEKGKRIATLSLRM PDSTRYYYIHYKQDFYRYDWYDEEYYTISEIAEQLNIILDRF >gi|224461271|gb|ACDC01000131.1| GENE 12 8949 - 9599 600 216 aa, chain + ## HITS:1 COG:RC0367 KEGG:ns NR:ns ## COG: RC0367 COG4804 # Protein_GI_number: 15892290 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Rickettsia conorii # 9 158 5 155 159 88 33.0 8e-18 MKNKDTFIINNEKYQKFLLRLKERVKNNQLKAAIKVNYELLDLGKEIVKKQSEYFWGEVF LKSLSNDLQKEFIGMKGFSLTNLKYIRKFYLFYKKSQQAVDQLDYIFSIPWGHHILLITR CKNEEEALFYVEKIIKNGWSRAMLLNFLDTNLYSAQGKAITNFSRLLPDTKSDLAKETLK DPYNFDFLTLTEGYKEKELEEALTSNITNFLLELGQ Prediction of potential genes in microbial genomes Time: Thu May 19 23:48:21 2011 Seq name: gi|224461270|gb|ACDC01000132.1| Fusobacterium sp. 
2_1_31 cont1.132, whole genome shotgun sequence Length of sequence - 38899 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 7, operones - 5 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 62 - 2263 1691 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein 2 1 Op 2 1/0.000 + CDS 2283 - 4700 2089 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 16/0.000 + CDS 4697 - 7498 4142 ## COG0060 Isoleucyl-tRNA synthetase 4 1 Op 4 1/0.000 + CDS 7507 - 7965 411 ## COG0597 Lipoprotein signal peptidase 5 1 Op 5 19/0.000 + CDS 7967 - 8839 1502 ## COG0752 Glycyl-tRNA synthetase, alpha subunit + Prom 9014 - 9073 10.8 6 2 Op 1 1/0.000 + CDS 9105 - 11168 3016 ## COG0751 Glycyl-tRNA synthetase, beta subunit 7 2 Op 2 2/0.000 + CDS 11179 - 11730 707 ## COG0302 GTP cyclohydrolase I + Prom 11842 - 11901 6.9 8 2 Op 3 5/0.000 + CDS 11924 - 12742 625 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 9 2 Op 4 1/0.000 + CDS 12729 - 13556 1220 ## COG0294 Dihydropteroate synthase and related enzymes 10 2 Op 5 4/0.000 + CDS 13558 - 13899 546 ## COG4810 Ethanolamine utilization protein 11 2 Op 6 1/0.000 + CDS 13902 - 14339 591 ## COG4917 Ethanolamine utilization protein 12 2 Op 7 3/0.000 + CDS 14329 - 14907 693 ## COG3707 Response regulator with putative antiterminator output domain 13 2 Op 8 2/0.000 + CDS 14907 - 16295 1567 ## COG3920 Signal transduction histidine kinase + Term 16339 - 16386 10.1 + Prom 16314 - 16373 6.2 14 3 Op 1 5/0.000 + CDS 16402 - 17832 2011 ## COG4819 Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition 15 3 Op 2 8/0.000 + CDS 17860 - 19227 2265 ## COG4303 Ethanolamine ammonia-lyase, large subunit 16 3 Op 3 6/0.000 + CDS 19239 - 20123 1296 ## COG4302 Ethanolamine ammonia-lyase, small subunit + Prom 20130 - 20189 5.2 17 3 Op 4 4/0.000 + CDS 20229 - 20882 1131 ## COG4816 Ethanolamine 
utilization protein 18 3 Op 5 5/0.000 + CDS 20893 - 21342 841 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 19 3 Op 6 2/0.000 + CDS 21374 - 21658 581 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein + Term 21666 - 21702 3.2 20 3 Op 7 1/0.000 + CDS 21735 - 23183 486 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P + Prom 23234 - 23293 6.3 21 3 Op 8 . + CDS 23420 - 24184 944 ## COG4812 Ethanolamine utilization cobalamin adenosyltransferase 22 3 Op 9 . + CDS 24184 - 24774 632 ## FN0086 hypothetical protein 23 3 Op 10 . + CDS 24776 - 25021 475 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein 24 3 Op 11 . + CDS 25034 - 25369 392 ## FN0088 hypothetical protein 25 3 Op 12 2/0.000 + CDS 25369 - 26451 2014 ## COG3192 Ethanolamine utilization protein 26 3 Op 13 . + CDS 26462 - 26911 724 ## COG4766 Ethanolamine utilization protein 27 3 Op 14 . + CDS 26933 - 28033 1596 ## FN0091 phosphoserine phosphatase (EC:3.1.3.3) 28 3 Op 15 . + CDS 28052 - 29194 1684 ## FN0091 phosphoserine phosphatase (EC:3.1.3.3) 29 3 Op 16 . + CDS 29213 - 30328 1464 ## COG1454 Alcohol dehydrogenase, class IV + Term 30341 - 30399 9.1 + Prom 30463 - 30522 11.0 30 4 Op 1 . + CDS 30557 - 31603 585 ## gi|237739172|ref|ZP_04569653.1| predicted protein 31 4 Op 2 . + CDS 31664 - 32839 682 ## gi|237739173|ref|ZP_04569654.1| predicted protein + Term 32972 - 33010 2.1 - Term 32960 - 32996 4.2 32 5 Tu 1 . - CDS 33014 - 34714 2627 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 34826 - 34885 16.4 + Prom 34811 - 34870 10.5 33 6 Tu 1 . + CDS 34898 - 35143 431 ## FN0683 hypothetical protein + Prom 35150 - 35209 10.8 34 7 Op 1 5/0.000 + CDS 35255 - 36190 1176 ## COG0517 FOG: CBS domain 35 7 Op 2 2/0.000 + CDS 36232 - 37506 1586 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 36 7 Op 3 . 
+ CDS 37586 - 38863 1307 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases Predicted protein(s) >gi|224461270|gb|ACDC01000132.1| GENE 1 62 - 2263 1691 733 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 11 730 2 720 764 655 48 0.0 MIINKKCKGYAMEKIYKIVAEELKIPVDKVENTIKLLDDGATIPFVARYRKEITGNLDEV QIGDILQKVEYLRNLEERKEEVIRLIEEQGKLTDELRNSIVEAKILQEVEDIYFPYRKKK KTKADIAKERGLEPLAEKFYTVNNLEEIQNLAKDFITEEVPTVEDAIEGAMLIIAQNISE KAEYRERIREIYLKFSIIEAKASKKAAELDEKKVYNDYYEYSEKIDKMASHRILAVNRGE KEDILTVHLRLEDSDREKIENMILKEFPKNDLVATYKEIIKDSLDRLIIPSIEREVRNAL TERAEIESIAVFKDNLKNLLLQAPLKEKNVLALDPGYRTGCKVAVIDKYGFYRENTVFFL VEAMHNPKQIEDAKKKFLALVKKYEIDIVSIGNGTASRETETFVANIIKENKLNLKYLIV NEAGASVYSASKIAAEEFPDLDVTVRGAISIGRRIQDPLAELVKIDPKSIGVGMYQHDVN QSKLDESLDNVISHVVNNVGANINTASWALLSHISGIKKTVAKNIVEYRKENGNFKNRKE ILKVKGVGPKAYEQMAGFLVIPEGENILDNTVIHPESYAIAEALLEKIGFSLEKYNNELN EARERLKSFDYKKFAEENNFGAETVKDVYEALLKDRRDPRDDFEKPLLKSDILNIDNLEV GMELEGTVRNVVKFGAFVDIGLKNDALLHISEISNKYIDDPSKVLAVGQIIKVRIKDVDK DRGRVGLTKKEQN >gi|224461270|gb|ACDC01000132.1| GENE 2 2283 - 4700 2089 805 aa, chain + ## HITS:1 COG:FN0066 KEGG:ns NR:ns ## COG: FN0066 COG0642 # Protein_GI_number: 19703418 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 69 805 1 737 737 1046 81.0 0 MFIKKDSLLLRIISYNGIAIIIVASIMATLFGIMIFNELNMRLLDKSRERTLLVNKAYLY YIDKSREHLYDASNDAVNLILVDSNDKLIQNRLASAVKNQLGIESYSLYGKSFIQILSPQ RIILGESGDRDIKYDLYKNNNIIPSKEFLETQKFEYISTKDALYIRLVQPYRLYNSTERN YIILTFPITNYSLTEIKDYAYLSAEDKIFILSKDGFTFGEISLEKTDNFFKNFKSNKVGR ELSDNKYYFSEKKIDNDYYYLGMLALQNDKGNDYVGDIGVAISKNEFVVVKYMLATIILV VCLLAVVLSTALCARIFTKLLAPLNALAGKTEKIGVDSKKDKGGIDFGEENIFEIRSISN SLKFMTERIEENENLLIQKNNKLNTNLNRLIAVEKLLTTISLRDNFSEGLDEVLRTLTSE EGLGYSRAFYLGYDEDKEELSVTKYAINPHIEMNMEKYTEGINGFKFQVNSIKELMPLLN IEYEPGGMFWESMENSKIIYHNDKGFKYSYGNKLYLTLGLNNFMILPIADKDIKIGCILV 
DYFGKNNLISEEEVEVNSLLLMNLLTRIKNIILGESKLMKERYLTMSKVSDKFIKDNKRL IHNVESFIEKLENNRYNSKDIEKIKKYLKDEKKKNIVIKDSLDNSKSHFKVFNFEKLIEK IVNNSEKILRKYGINISLFIDFSGNMYGDKKRIYQMFIQILRNSINAILTRNKLDKKINI VVVGDKNNRIILEIIDNGVGMTPEEVKAVMKPYSEVTGNSIMGTGLITIYKIVKEHNGFM SISSELDVGTKIRIIFNEYREETNQ >gi|224461270|gb|ACDC01000132.1| GENE 3 4697 - 7498 4142 933 aa, chain + ## HITS:1 COG:FN0067 KEGG:ns NR:ns ## COG: FN0067 COG0060 # Protein_GI_number: 19703419 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 933 1 933 933 1831 94.0 0 MNEKEYTSTLHLPKTDFQMKANLPNKEPKYITKWTEEKIYEKGLEKNKNGESFILHDGPP YANGNTHIGHALNKILKDIIIKYKTFRGFKSPYVPGWDTHGLPIELQVVKEVGVAKAREM SPLEIRKRCEEYARKWVGIQKEQFIRLGVLGDWDHPYLTLDPRFEAKQLELFGEIYEKGY IFKGLKPVYWSPATETALAEAEIEYYDHVSPSIYVRMQANKDLLDKIGFNEDAYVLIWTT TPWTLPANVAICLNENFDYGLYKTEKGNLILAKDLAESAFKNIGIENAELLKEFKGKDLE YTTYQHPFLERTGLVILGDHVTADAGTGAVHTAPGHGQDDYVVGLNYKLPVISPIDHRGC LTEEAGDLFKGLVYSEANKAIIKHLTETGHILKMQEINHSYPHDWRSKTPVIFRATEQWF IRMEGGDLREKTLKVIDEINFIPAWGKNRIGSMMETRPDWCISRQRVWGVPIPIFYNDET NEEIFHKEILDRICGLVREHGSNIWVEKTPEELIGEELLVKYNLKGLKLRKETNIMDVWF DSGSSHRGVLEVWEGLHRPCDLYLEGSDQHRGWFHTSLLTSVASTGDSPYKSVLTHGFVN DGEGKKMSKSLGNTVAPSDVIKVYGADILRLWCGSVDYRDDVRISDNIIKQMSEAYRRIR NTARYILGNSYDFNPKTDKVAYKDMLEIDKWALNKLEVLKRSVTESYDKYEFYNLFQGIH YFAAIDMSAFYLDIIKDRLYTEKKDSVARRAAQTVMYEILMTLTKMVAPILSFTAEEIWE NLPAEAREAESVFLADWYVNNDEYLNPELDEKWQQIIKLRKEVNKKLEKARQGENKIIGN SLDAKVSLYTEDNALKEFIKENLELLETVFIVSDIEVTDSSDDNFTAAEEIENLKIKITH ADGEKCERCWKYDDLGTDPEHPTLCPRCTGVLK >gi|224461270|gb|ACDC01000132.1| GENE 4 7507 - 7965 411 152 aa, chain + ## HITS:1 COG:FN0068 KEGG:ns NR:ns ## COG: FN0068 COG0597 # Protein_GI_number: 19703420 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Fusobacterium nucleatum # 1 152 14 165 165 215 89.0 3e-56 MIYIFLFLILLIIDQYSKFIVHSTLYVGDTIPIIDNFFNLTYVQNKGVAFGLFQGKIDIV 
SILALIAIGLILFYFCKNFKKISFLERIAYTMIFSGAVGNMIDRLFRGFVIDMLDFRGIW SFIFNFADVWINIGVILIIIEHLIFNRKKRVK >gi|224461270|gb|ACDC01000132.1| GENE 5 7967 - 8839 1502 290 aa, chain + ## HITS:1 COG:FN0069 KEGG:ns NR:ns ## COG: FN0069 COG0752 # Protein_GI_number: 19703421 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Fusobacterium nucleatum # 1 290 1 290 290 590 96.0 1e-168 MTFQEIIFSLQQYWSSKGCIIGNPYDIEKGAGTFNPNTFLMALGPEPWNVAYVEPSRRPK DGRYGDNPNRVYQHHQFQVIMKPSPTNIQELYLESLRVLGIEPEKHDIRFVEDDWESPTL GAWGLGWEVWLDGMEITQFTYFQQVGGLELDIVPVEITYGLERLALYIQNKENVYDLEWT KGVKYGDMRYQFEFENSKYSFELASLDKHFKWFDEYEDEAKKVLDQGLVLPAYDYVLKCS HTFNVLDSRGAISTTERMGYILRVRNLARRCAEVFVENRRALGYPLLNKK >gi|224461270|gb|ACDC01000132.1| GENE 6 9105 - 11168 3016 687 aa, chain + ## HITS:1 COG:FN0070 KEGG:ns NR:ns ## COG: FN0070 COG0751 # Protein_GI_number: 19703422 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Fusobacterium nucleatum # 1 687 1 686 686 1053 83.0 0 MKLLFEIGMEEIPARFLSQALTDLKSNFEKKLKNNRIKYEGIKTYGTPRRLVLVVDEVAE MQEDLNELNIGPSKERAYKDGELSKAGEGFLNAYKIDESQIEIVKNDKGEYIAFKRFAKG EATEKLLPEILKELVLEETFPKSMKWSDKTIRFARPIEWFLALYGNNVVEFEIEGIKSSN KSKGHRFFGKEFEVSSVEDYLKKIRENNVIIDISERRKMIEEMINKALLEDEKADIDEGL LDEVTNLVEHPYAIVGNFSEDFLEVPQEVLIISMKVHQRYFPILDKKGKLLPKFIVIRNG IDFSQNVKEGNEKVLSARLADARFFYQEDLKIPLDQNVEKLKTVVFQKDLGTMFNKVKRT EKIAEFLIGKLKYNYMKADILRTVKLAKADLVSNMIGEKEFTKLQGLMGSKYAMERGEEI GVAIGIKEHYYPRFQGDLLPSGIEGIITGLSDRIDTLVGCFGVGLIPTGSKDPFALRRTA LGIVNIIINANINISLKELVNVSLDALQADQVLKADRAKVEADVLDFLKQRMINVFTDMK YRKDIVLAVLDRDADNITNALEIVKVISEKLALNKLEALLQVAKRVTNIITKGNNNVTVK EKLFKEEIEKTLYAEAKRIGEEAEKSIKENEYADYFEKMISLVPTIDKYFEAVIVMDEDK NIRENRINQLTYIKNLFDRIAYLNKID >gi|224461270|gb|ACDC01000132.1| GENE 7 11179 - 11730 707 183 aa, chain + ## HITS:1 COG:FN0071 KEGG:ns NR:ns ## COG: FN0071 COG0302 # Protein_GI_number: 19703423 # Func_class: H Coenzyme transport and metabolism # Function: 
GTP cyclohydrolase I # Organism: Fusobacterium nucleatum # 1 183 5 187 187 325 90.0 2e-89 MDSKRIENAFLEVVEALGDVEYKAELKDTPKRIADSYKEIFYGIGIDPKEVLTRTFDINN NELIMEKNIDFYSMCEHHFLPFFGTICIAYVPNKKIFGFGDILKLIEILSRRPQLQERLT EEIARYIYELLDCQGVYVVVEAKHLCMTMRGQKKENTKILTTSAKGIFETDINKKLEVLA LLK >gi|224461270|gb|ACDC01000132.1| GENE 8 11924 - 12742 625 272 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 268 1 269 278 245 45 3e-64 MDKIYIRDLEFIGYHGVFEEEKKLGQKFYLSLELSTDLREANDDITKTTHYGEVAETVKK VFFQKKYDLIETLAEDIAREVLLSFPLIKEVKLEIKKPWAPVGLPLKDVAVEITRKWNEV YLSLGSNMGNKKENLEKAIKEVSKIRDTFIIKESKIIETEPFGYKEQDDFLNSCIGIKTL LTAREVLTELLAIEIRMGRERKIKWGPRIIDLDIIFYNKEVIEEDDLIVPHPYMEYRDFV LKPLEEIIPNFVHPLLSKRITALRKELENEKN >gi|224461270|gb|ACDC01000132.1| GENE 9 12729 - 13556 1220 275 aa, chain + ## HITS:1 COG:FN0073 KEGG:ns NR:ns ## COG: FN0073 COG0294 # Protein_GI_number: 19703425 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Fusobacterium nucleatum # 1 275 3 277 277 461 88.0 1e-130 MKKISCGKKEIILGERTLIMGILNVTPDSFSDGGKYNNLDAAMKQAEKLIADGADIIDIG GESTRPGHTQITVEEEISRVVPIVEKISKELNTIISIDTYKHEVAKEAVKAGADIINDIW GLQYDKGEMAKFVKECNLPLIAMHNQNDEVYNKDIMLVLREFFEKTYKIADEYGIDRNKI ILDPGLGFGKNSEQNIEVLSRLDELNDMGPILLGASKKRFIGKLLNDLPFDERVEGTVAT TVIGIQKGVDIVRVHNVLENKRASLVADGIYRKRG >gi|224461270|gb|ACDC01000132.1| GENE 10 13558 - 13899 546 113 aa, chain + ## HITS:1 COG:FN0074 KEGG:ns NR:ns ## COG: FN0074 COG4810 # Protein_GI_number: 19703426 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 113 10 122 122 192 97.0 1e-49 MEKQRTIQEYVPGKQVTLAHLIANPDRDMCVKLGLDEEKTNAIGILTITPGEAAIISADI AIKSGSIELGFLDRFSGTLLLTGDFASVESSLKAVLAFLQETLKFYICEITRS >gi|224461270|gb|ACDC01000132.1| GENE 11 13902 - 14339 591 145 aa, chain + ## HITS:1 COG:FN0075 KEGG:ns NR:ns ## COG: FN0075 COG4917 # Protein_GI_number: 
19703427 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 145 1 145 145 228 86.0 4e-60 MKKTMLIGRTGCGKTTLTQKLMNEEVKYKKTQSVTYKSKIIDTPGEYVENKMYYKSLLVL SADAKLIVLVQSAIDGATLFPPKFSTMFPRKEVIGVITKIDLENANIERSRKFLVEAGVT EVFTIGLDDSEGLEEIRKRLVADES >gi|224461270|gb|ACDC01000132.1| GENE 12 14329 - 14907 693 192 aa, chain + ## HITS:1 COG:FN0076 KEGG:ns NR:ns ## COG: FN0076 COG3707 # Protein_GI_number: 19703428 # Func_class: T Signal transduction mechanisms # Function: Response regulator with putative antiterminator output domain # Organism: Fusobacterium nucleatum # 1 192 1 192 192 296 92.0 2e-80 MSLRVVVVEDETLTRIDLIEILKENGYDVVGEATDGIEAVEICKKLQPDIVLLDIKIPYI SGLKVANILKEDGFKGCIIILTAYNIAEYIQEASNTIVMGYILKPIDEPIFLERLKLIYK NYKLYDDLKKEVEETKKKLEERKVIERAKGIVMAKYTLSEEEAYKKMRDLSMQKRISMSK LAEIIIMTGGLE >gi|224461270|gb|ACDC01000132.1| GENE 13 14907 - 16295 1567 462 aa, chain + ## HITS:1 COG:FN0077 KEGG:ns NR:ns ## COG: FN0077 COG3920 # Protein_GI_number: 19703429 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 1 462 1 462 462 729 90.0 0 MLKLLCKICATLSPADIEIVEQMSNVATILGNILDMDVFLDCPTKKEDEAMVVFHARPEK NSLYTKNIAGEIAYRIDEPAVFRTFETGLPSRNYKAVTQEKANVLQNILPIFNSLDEVIC TVIIEYNEQQREFFEKEYNKKSTGILIGQIDSLKDRVTEYINDGIIIFNRNGHATYANKV AKILYEKLGVPSIVGESFENLYFERAKYNDIVEEPEKYKQKEVRILDFILNVQCLVSKIN EDIKRVTLIIKDITEEKKYEEELKLKTVFIKEIHHRVKNNLQTVASLLRIQKRRVKNLEM KKILDETINRILSIAITHEILSTTGIDTISIKHILEILCQNYFKNNIDKSKKIEFNIVGD EFSISSDKATSVALVVNEIVQNATEHAFTTKDSGNINIKILKGETFSKIIISDNGVGMEV KKERDSMGLLIISSLVKDKLKGNLEIRSKKDKGTTIEFDFKN >gi|224461270|gb|ACDC01000132.1| GENE 14 16402 - 17832 2011 476 aa, chain + ## HITS:1 COG:FN0078 KEGG:ns NR:ns ## COG: FN0078 COG4819 # Protein_GI_number: 19703430 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition # Organism: 
Fusobacterium nucleatum # 1 476 1 476 476 818 92.0 0 MREEINSVGIDIGTSTTQVVFSKIVLENMSSGARVPQIKIVSKDVVYRSQIYFTPLVSQT EIDAQAVKKIVEEEYRKAGMSPAAISTGAVIITGETARKSNANEVLNALSGMAGDFVVAT AGPDLESIIAGKGSGAMDFSEKRNTQIFNLDIGGGTTNICYFDKGKVMDTTCLDIGGRLI KINTATMTVDYISDKFTKLIENLGLNIRVGSKVEKSEIVKLCKEVADILLQAVYYKPKTK NYELLVTYKDFHNKDNKLKYVSFSGGVADLIYDFYSGDEFKYGDIGIILGKEIKKAFDVA GVEYVRVGETIGATVVGAGNYTTEISGSTITYTDEDILPIKNIPVIKMNKEDEENLFEFK ERLEQRLDWFRNNEGRQDVAIGVVGENNMKYKKIVGIAESISQVFKSVSRIIVVVESDIG KVLGQCLMLNTGGKVQIICVDSIKVNDGDYIDIGKPLGMGSVLPVVVKTLVLKNYR >gi|224461270|gb|ACDC01000132.1| GENE 15 17860 - 19227 2265 455 aa, chain + ## HITS:1 COG:FN0079 KEGG:ns NR:ns ## COG: FN0079 COG4303 # Protein_GI_number: 19703431 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, large subunit # Organism: Fusobacterium nucleatum # 1 455 1 455 455 880 93.0 0 MILSVKLFDHVYNFSSLKEVMAKANERKSGDTLAGIAASSSKERVAAKVVLSKITLKDLK ENPAVPYEEDEVTRIIIDDLNLQVYDEIKDWTVSDLREWLLSYEATPEKINWIRRGLTSE MIAAVTKLMSNMDLIVAANKIEVYAHCNTTIGGKETLAVRLQPNHTTDDPDGIMISTLEG LTYGMGDAVIGLNPVDDSVDSVMAVMERLHKVKTDYDIPTQTCVLAHVTTQMEAIKRGAK VDLIFQSIAGSEKGNEAFGINGTMIEEARKLALKQGTAAGPNVMYFETGQGSELSSDAHN GADQVTMEARCYGFAKRFQPFLVNTVVGFIGPEYLYDSKQVIRAGLEDHFMGKLHGLPMG VDVCYTNHMKADQSDVEVLATLLTTAGCNYFMGIPAGDDIMLNYQTTGFHDNQSLRELFG KHPIKEFKEWLVKYGFMTEDGKLTEKAGDPSVFLK >gi|224461270|gb|ACDC01000132.1| GENE 16 19239 - 20123 1296 294 aa, chain + ## HITS:1 COG:FN0080 KEGG:ns NR:ns ## COG: FN0080 COG4302 # Protein_GI_number: 19703432 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, small subunit # Organism: Fusobacterium nucleatum # 1 294 1 295 295 494 89.0 1e-140 MVSELELKEIIGKVLKEMAVEGTSVNNEVKKPSASASVIENGIIDDITKEDLREVIELKN PANREEFLKYKRKTPARLGISRAGSRYTTHTMLRLRADHAAAQDAVLTDVSEDFLKANNL FTVKSRCQDKDQYITRPDLGRRLDEESVKILKEKCIQNPTVQVFVADGLSSTAIEANIED CLPALLNGLKSYGISVGTPFFAKLARVGLADDVSEVLGAEVTCVLIGERPGLATAESMSA YIMYKAYVGMPEAKRTVVSNIHIKGTPAAEAGAHIAHIIKKVLDAKASGQDLKL 
>gi|224461270|gb|ACDC01000132.1| GENE 17 20229 - 20882 1131 217 aa, chain + ## HITS:1 COG:FN0081 KEGG:ns NR:ns ## COG: FN0081 COG4816 # Protein_GI_number: 19703433 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 217 1 217 217 365 96.0 1e-101 MINDALRASVLSVKLIPNVDAKMAEELNLPNGYRSIGIITADSDDVTYTALDEATKMAEV VIVYAKSFYGGAANANTKLAGEVIGIMAGPNPAEVKSGLNAAVDFIENGACFYSANEDDT VPYYAHCVSRTGSYLSKTAGIEEGEALAYLIAPPLEAMYALDAALKAADVRLAAFFGPPS ETNFGGGLLTGSQSACKSACDAFAEAVKFVAQNPKKI >gi|224461270|gb|ACDC01000132.1| GENE 18 20893 - 21342 841 149 aa, chain + ## HITS:1 COG:FN0082 KEGG:ns NR:ns ## COG: FN0082 COG4577 # Protein_GI_number: 19703434 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 1 93 1 93 148 131 92.0 5e-31 MKALGLIETRGMVGAIVAADIALKTAQVELINREHTKGGLVCIEFEGDVAAVKASVEAAV MAIKDMGVYVGSHVIPRPDDSVEKIIKRKLGASEQKEEVVEETKEEVKEEPKETEVEEEI SEMKNIEEEIEEINEILKVSKNKKTKHKK >gi|224461270|gb|ACDC01000132.1| GENE 19 21374 - 21658 581 94 aa, chain + ## HITS:1 COG:FN0083 KEGG:ns NR:ns ## COG: FN0083 COG4577 # Protein_GI_number: 19703435 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 1 94 1 94 94 123 100.0 9e-29 MSTLNALGMIETKGLVAAVEAADAMVKAANVTLVGKELVGGGLVTVMVRGDVGAVKAATD AGAAAADRVGELISVHVIPRPHSEVELILPKSNN >gi|224461270|gb|ACDC01000132.1| GENE 20 21735 - 23183 486 482 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 11 408 43 434 477 191 31 4e-48 MDKDLLSIQQVRDLIKAAKVAQKLYSTFTQEQIDRVVYAIVQEMKNHYVDLAKKANEETG FGKWEDKVIKNKFANEFVYDYIKDMKTVGILNETDTVTEVGVPMGIVAALTPSTNPTSTA 
IYKTLISLKAGNAVIVSPHPNAKNCVIDTVKLMQKAAVAAGAPEGLIGVIEIPTLEGTNE LMRSKDTSIILATGGEAMVRAAYSSGRPAIGVGPGNGPAFIEKTANVKEAVRKIIESKTF DNGVICASEQSVIVEPCNKEAVMDEFRRQGGFFLSKEESDKLGKFILRPNGTMNPQIVGK DAQTLAKLAGLNIPSNVKVLLSEQNTVSKTNPYSREKLTTILAFYVEENAEKACERAIEL LENEGEGHTLIIHSENKDIIREFALRKPVSRMLVNVGGSLGGVGATTNLAPAFTLGCGAV GGSSTSDNVSPMNLINIRRVAVGVRELSDFKKGSDNSNCCSGTTVNSEVEDMIRRIIAEY RR >gi|224461270|gb|ACDC01000132.1| GENE 21 23420 - 24184 944 254 aa, chain + ## HITS:1 COG:FN0085 KEGG:ns NR:ns ## COG: FN0085 COG4812 # Protein_GI_number: 19703437 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization cobalamin adenosyltransferase # Organism: Fusobacterium nucleatum # 1 254 1 255 255 348 84.0 5e-96 MVLSEDILKIKYRKEAFDVFEIEKGTLLTPSAKQFLNEKGIRLVIKGEEAPVSTKQNEFG EETEEKIIYEKPKYVGKNGECYFEKPEYMTVVDGNVLISKNSKLISLRGKIDTFLAELLL NTKEIEQSSNNKLIKDIETVIKFIQNIIVAEKLDKILENQILLDSKTIKDIKEIIDNPKE YFKKGHLLEVSLNSDLTIHKLNRLRFLARELEIQAIDYFVEDYKVNRKDLLEAFNVLSDV IYIIILKYDNGDYR >gi|224461270|gb|ACDC01000132.1| GENE 22 24184 - 24774 632 196 aa, chain + ## HITS:1 COG:no KEGG:FN0086 NR:ns ## KEGG: FN0086 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 196 2 195 195 208 79.0 1e-52 MNNNFDEKYIIELVKKELSKYLTEQGIEMKKQISFLGDDKDIEEKLSQKFEISENAGTLV VSKLSLKNLYNISNAIYENDYEEKIIKFLLENKEIIIIKEGIEYSKYENIPVAVLKRYEE YIEKIKTYGIKIENKDFYINSLEKKEEVYSKKLLDLNSLRELEIKGIKRLVIENSIVTSS AQEYAKDKNIEIIKRR >gi|224461270|gb|ACDC01000132.1| GENE 23 24776 - 25021 475 81 aa, chain + ## HITS:1 COG:FN0087 KEGG:ns NR:ns ## COG: FN0087 COG4576 # Protein_GI_number: 19703439 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Fusobacterium nucleatum # 1 81 1 81 82 133 95.0 1e-31 MLIGEVIGNVWATKKYDGLDGLKFLIVKTEDNKRMVAFDSVGAGIGEKVIISTGSSARNV LNMKDIPVDAAIIGIIDGMEE >gi|224461270|gb|ACDC01000132.1| GENE 24 25034 - 25369 392 111 
aa, chain + ## HITS:1 COG:no KEGG:FN0088 NR:ns ## KEGG: FN0088 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 111 1 111 111 157 80.0 1e-37 MAKKFITLKDVEQQISNGKIYLDEKAILASSLQDYIREHNIEVVYGGETCSVKPSVADCA CLKEEVAATSSKAEDFTEVARLIVKILKNDYGIQDEEKIMQVIKIIREVLK >gi|224461270|gb|ACDC01000132.1| GENE 25 25369 - 26451 2014 360 aa, chain + ## HITS:1 COG:FN0089 KEGG:ns NR:ns ## COG: FN0089 COG3192 # Protein_GI_number: 19703441 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 360 1 360 360 525 95.0 1e-149 MGINEIIIYVMVFFMAVGAIDKCIGNKFGYGEKFEEGIMAMGALALSMVGIVSLAPVLAN ILKPIVGPVYSALGADPAMFATTLLANDMGGYPLAMSLAQDPMVGKFAGLILGSMMGATV VFTIPVALGIIEKEDRPYLAKGVLAGMVAIPFGCLVGGLVAGFPLMTVLRNLVPIIIFAV LIIIGLWLIPEKMTTGFTYFGTGVVVVITIGLAAAIIENLTGIVVIPGMAPIDEGMGIIW SIAIVLAGAFPLVHFITKVFKKPLEKIGEKLGMNEIGAAGLVASLANNIPMFGMMKDMDP NGKVMNVAFAVCAAFVFGDHLGFTGGVDKAMIAPMIAGKLAGGILAIIIAKVLFTTKKAK >gi|224461270|gb|ACDC01000132.1| GENE 26 26462 - 26911 724 149 aa, chain + ## HITS:1 COG:FN0090 KEGG:ns NR:ns ## COG: FN0090 COG4766 # Protein_GI_number: 19703442 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 1 149 1 149 149 265 92.0 2e-71 MNKELLEELIRKVIQEELGKAEQPESEYKEMDKSGVGVVKLNKMRKRVKMDTGNPKDQVT TTDLFTLQESPRLGAGLMEMRETTFPWTLTYDEIDYIIEGRLEILIDGRKVVGEAGDVIL IPKNSKIEFSAPNYAKFMYFVYPANWSEL >gi|224461270|gb|ACDC01000132.1| GENE 27 26933 - 28033 1596 366 aa, chain + ## HITS:1 COG:no KEGG:FN0091 NR:ns ## KEGG: FN0091 # Name: not_defined # Def: phosphoserine phosphatase (EC:3.1.3.3) # Organism: F.nucleatum # Pathway: Glycine, serine and threonine metabolism [PATH:fnu00260]; Metabolic pathways [PATH:fnu01100] # 1 366 1 366 366 585 82.0 1e-166 MSIENSCVRLDEGRWNPKNREVLEKLIEKYRNTNSYAVFDWDNTSIQGDTQQNLFIYQIE NLKYKLSPEKFNEVIRKNVPTTDFDERFKNSEGEVLNVTKLANDIYKSYIFLYENYISTK 
KISLEEIRETEEFKDFRAKMHYLHNALPSNFSSKIACLWEFYLLSGMTRTEVKSLAKESN DAKLGESLGDVIVESSRVLRGEAGIVKGIYDNGLRVRSEMANLYHELKRNGIDVYVISAS MQELIEVFATDKSYGYNLDEEKIYAMRLRKTVDDVLIDEFNEDYAFTQKEGKSETIERFI RDKYEGKGPILVGGDALGDESMLTKFKDTEVLLIMKREGKLDNLVNDERALIQHRNLQTG LLDPQN >gi|224461270|gb|ACDC01000132.1| GENE 28 28052 - 29194 1684 380 aa, chain + ## HITS:1 COG:no KEGG:FN0091 NR:ns ## KEGG: FN0091 # Name: not_defined # Def: phosphoserine phosphatase (EC:3.1.3.3) # Organism: F.nucleatum # Pathway: Glycine, serine and threonine metabolism [PATH:fnu00260]; Metabolic pathways [PATH:fnu01100] # 25 380 10 366 366 397 56.0 1e-109 MNSILKNMARVSLFLAISVGAMANLDEGRWVPKNREVLDEVISENKNQGNYAVFDWDYTS IYQDTQENLFRYQIDNLRFAMTPEQFSKAIRKDIPLDNFSDDYKNVKGQAINIEKIGSDL DKDYAFLYKNYIKDKKMSLEKIKKTEQFKDFRGKLAFLYEAIGGSFSHDISYPWVLYLFE GMTVNEVKALAKEANDFGIGNKLDSYVIESSDVLTGKAGKVSHKYKSGLRTQPETANLFR ELQANGIKVYIISASLQDIVEVFATDKSYGYNLEDGSVYGMRLEMNGDKYRAEYKAGYPQ TQTKGKVEIIETYLKPKHGGKAPILVAGDSSGDANMLTEYKDTKVLLLMKREGKLDDVAK DGRALIQKRNAQTGLLDPKN >gi|224461270|gb|ACDC01000132.1| GENE 29 29213 - 30328 1464 371 aa, chain + ## HITS:1 COG:FN0092 KEGG:ns NR:ns ## COG: FN0092 COG1454 # Protein_GI_number: 19703444 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Fusobacterium nucleatum # 1 371 1 372 372 628 85.0 1e-180 MKEFRLQPKILFGEDSLDYLKTLEYKKVMIVTDEVMTQLKLTDFITNNLSSSTEVKIFNK VEPNPSMQTIENGLKDFIDFEPQCVIALGGGSPIDACKAILYFSYELYKKLKVNKKVFFI AVPTTSGTGSEVTSYSVVTKGEHKIALADEKMLPDVALLNTAFLSGLPAKVVADTGMDVL THSIEAYVSTNANPFSSSFAMKSIKLIFENLVAHYNDRKIQGPKENVQFASCLAGIAFDN SSLGINHSIAHTVGAKFHIAHGRANAIIMPYVIEVNTEANRKYFEISRELGLPSDTIEEG KYSLLSFVRILKEKLAVEKSLKDYGVDFEAFKREIPSMLEDIKKDICTQYNPNKLTDEEY VRLLLKIYFGE >gi|224461270|gb|ACDC01000132.1| GENE 30 30557 - 31603 585 348 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739172|ref|ZP_04569653.1| ## NR: gi|237739172|ref|ZP_04569653.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 348 1 348 348 514 100.0 1e-144 MSTLRLREDITVVRDAGSINDYNSIFLIQTKDKGIIAVKEIQKINDNWVGHIQPLENLIK DSFSFRSILNLYGDRDRIEKMYKEKEEYFIIFPNKEIYGLSLEEVKNQLDIEKIDFIDMN KYMNKNGEEKALTIRYQIYLDKLSNYNYYDKEFELSEIKKVFIFKGTIIIFLILTFFINI NYFIKCYFKKENFLSNKDLSNGDKVVIASTVLFFDFIVLIFCIFSSWYKNYVSENYIGLL FVFHLIFKNIFISFYFSKENKIKKFLGNFYNSRILLYLLPILSLILVSNIEYKIPSIFYF LFYIYSIIIVGLELIKKVGFYKGNYYFSYYFIYQIVYILLLFWLYITF >gi|224461270|gb|ACDC01000132.1| GENE 31 31664 - 32839 682 391 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739173|ref|ZP_04569654.1| ## NR: gi|237739173|ref|ZP_04569654.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 391 1 391 391 615 100.0 1e-174 MKKIIIVEIILALTIFYIIKNIPGYKNTELILRNNTKIERQEPFHNSEEDLFFLRGEQYI NKYVKELSNINGIWIGNTYSYDELKVRSSYFNELIAETGVKKENYDKETGYFIIKSDNEF YSLTEQEVKQKLNINDLKLKSVESYMKKYGKKPIFSDFYQNYLSTIKSVKGNLPFYKENV DDENFEKRELNYTILFKNIILVILILNFCSYPYLLKKNKLKIDLEKLMLIIVFFFTDSIV LVLSFFFTKPWDYINNNYIPIYVVFHIIFRNITTALYFVKIEKVLKNFNNISQNDILEKF KRNETIMLEEIKKFLMIKVALLYLVPLFFTVILSVIGTALMTIFYFFTFLISLLYCFYSF VKIEDNPLNSTFYISVYLLQYILFIFIILHF >gi|224461270|gb|ACDC01000132.1| GENE 32 33014 - 34714 2627 566 aa, chain - ## HITS:1 COG:FN0684 KEGG:ns NR:ns ## COG: FN0684 COG1151 # Protein_GI_number: 19704019 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Fusobacterium nucleatum # 1 566 1 566 566 1112 95.0 0 MDKMFCYQCQETAKGTGCTSIGVCGKDAETSGLQDLLIHTDKGVAAYSSVLRKNGKAKEL IEGKVNRYLVNSLFITITNANFDDDAILDEIKAGLKLREELKALATDEEKKEAEKYGADL VNWYYESNEDLIKFSENQSVVGVLRTENEDVRSLRELIVYGLKGLAAYAEHAFNLGKTSE EIFAFVEEALLGTMDDSLTAEQLVALTMKTGEYGVKVMALLDEANTSVLGTPEITKVKIG AGKRPGILISGHDLWDLKQLLEQSKDSGVDIYTHSEMLPGHAYPELKKYPHFYGNYGNAW WDQRKDFTNFNGPIVFTTNCIVPPVKNATYKDRVFTTNAAGYPGWKRIKVNADGTKDFSE IIELAKTCQPPVEVESGEIVVGFAHNQVLSLADKVVENIKSGAIKRFVVMSGCDGRMAQR HYYTDFAENLPKDTIILTSGCAKYKYNKLNLGDINGIPRVLDAGQCNDSYSWAVVALKLK EVFGLNDINELPLVFNIAWYEQKAVIVLLALLYLGVKNIHVGPTLPGFLSPNVAKVLVEN FGIAGITTVEEDLKKFGLYEGSGLAN >gi|224461270|gb|ACDC01000132.1| GENE 33 
34898 - 35143 431 81 aa, chain + ## HITS:1 COG:no KEGG:FN0683 NR:ns ## KEGG: FN0683 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 81 1 81 81 152 97.0 4e-36 MHDGCSGKFDDGMQVLAKLRMMGFSKQEMPFPMTFTCKECGEEITMTTFEYECPHCSMIY AVTPCHAFDVENILSAGKAKK >gi|224461270|gb|ACDC01000132.1| GENE 34 35255 - 36190 1176 311 aa, chain + ## HITS:1 COG:FN1926_2 KEGG:ns NR:ns ## COG: FN1926_2 COG0517 # Protein_GI_number: 19705231 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Fusobacterium nucleatum # 146 311 2 167 167 283 92.0 4e-76 MKFSSYLNTDYIFPNLEANSKEEIIRKIVNKVAEDDRAVGEQKEEIIKNIIKREEEISTC IGGGIFLPHTRMIDFSDFIIAVATVKDKIVSDIGGTNQKDEIKVVFLIVSDVLKNKNLLK AMSVISKIGLKQPEVIEKIKKSNSPKEIYELLAANDIEIEHKIIAEDVLSPEIRPAKEND TLEEIAKRLILEQKSALPVLSDDNVLLGEITERELIGFGMPEHLSLMSDLNFLTVGEPFE EYLLNESTMTIKDIYRKDIKHLMIDKDTPIMEICFKMVYKGMHRLYVVNPKNNKYLGIIN RSDIIKKVLHI >gi|224461270|gb|ACDC01000132.1| GENE 35 36232 - 37506 1586 424 aa, chain + ## HITS:1 COG:FN1925 KEGG:ns NR:ns ## COG: FN1925 COG1055 # Protein_GI_number: 19705230 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Fusobacterium nucleatum # 1 424 1 424 424 673 93.0 0 MLYAGILIFIAVFYCIITEKIPNAWATMAGGLLMTMIGIINQEEVLETVYNRLEILFLLV GMMMIVLLVSETGVFQWFAIKVAQLVRGEPFKLIILLACVTALCSAFLDNVTTILLMAPV SILLAKQLKLDPFPFVITEVMSANIGGLATLIGDPTQLIIGAEGKLTFNEFLVNTAPVAI LSMISLLATVYFMYAKNMKVSNELKAKIMELDSSRSLKDIKLLKQSIVIFSLVIIGFILN NFVDKGLAMIALSGAVCLSLIAKKSPKEMFEGVEWETLFFFIGLFMMIKGIENLEIIKFI GDKMITITEGHFGGAVLSTMWISALFTSVIGNVANAATFSKIINIMTPSFAGVAGVKALW WALSFGSCLGGNLSLLGSATNVVAVGAADKAGCKINFVQFLKFGGIIAIENLIIASVYIY FRYL >gi|224461270|gb|ACDC01000132.1| GENE 36 37586 - 38863 1307 425 aa, chain + ## HITS:1 COG:FN1924 KEGG:ns NR:ns ## COG: FN1924 COG1055 # Protein_GI_number: 19705229 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # 
Organism: Fusobacterium nucleatum # 1 425 1 425 425 639 91.0 0 MLLSLGILIFVIVFYCMITEKVASAYATMLGALAMAFLGIVNEEQILETIHSRLEILLLL IGMMIIVSLISETGVFQWFAIKVVKIVRGDPLKLLILLSIVTATCSAFLDNVTTILLMAP VSILLAKQLKLDPFPFVMTEVLSSDIGGMATLIGDPTQLIIGSEGKLSFNEFLINTAPMT VIALVILLTVVYFTNIRKMKVPNRLRAQIMELESDRILTNKKLLKQSIIILTAVIIGFVL NNFVNKGLAVISLSGGILLAFLTEREPKKIFGAVEWDTLFFFIGLFVMIRGIENLGVIKF IGDKIIELSTGNFKVASISIMWLSSIFTSIFGNVANAATFSKIIKTVIPNFQSVADIKVF WWALSFGSCLGGSITMIGSATNVVAVSASAKADCKIDFMKFFKFGSKIAILNLIAATVYM YLRYL Prediction of potential genes in microbial genomes Time: Thu May 19 23:49:10 2011 Seq name: gi|224461269|gb|ACDC01000133.1| Fusobacterium sp. 2_1_31 cont1.133, whole genome shotgun sequence Length of sequence - 2021 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 34 - 630 735 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 2 1 Op 2 . 
- CDS 633 - 1964 965 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|224461269|gb|ACDC01000133.1| GENE 1 34 - 630 735 198 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 185 1 185 197 304 89.0 5e-83 MKKRNLKGSVVLNPVPAVLVTCKNSEGKDNVFTVAWIGTICSRPPMLSISIRPERLSYDY IKETMEFTINLPSKKQTKVVDFCGVRSGRQIDKIKECAFTLHDGLKVKSSYIEECPINIE CKVKDIIKLGSHDMFIAEVLTSHINEDLFDEKDKIHFEKADLISYSHGEYFALSKDAIGK FGYSVAKKKKKINKKSKK >gi|224461269|gb|ACDC01000133.1| GENE 2 633 - 1964 965 443 aa, chain - ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 16 443 13 440 440 551 80.0 1e-157 MTITSNKPILKTIFKYAIPNVISMWIFTLYTMVDGIFISRFVGSTALAGVNLVLPLINFI FSISIMIGVGSSTLIAINFGENKYDEGNKIFTLATLLNLFLAIFISLLILLNLERVINIL GANKSQEVYQYVKDYLSVIVFFSVFYMSGYAFEIYIKIDGKPSYPTICVLVGGITNLILD YLFVVVFHYGVTGAAIATGISQVTCCSMLLSYIIFKAKKIKFKKSFRFDFDRIIKIFKTG FSEFLTEISSGILILIYNLVILKRIGVTGVSIFGTISYISSFITMTMIGFSQGIQPIISY NLGKKHYKNLRDILKISITSLGILGIVCFILITSSAEYIGRIFFKEKDMILRVKDVLRVY SLSYLLIGINIFISAYFTALKRVTYSAFITFPRGILFNSILLLILPTIFGNRSIWFVTFL SEALSVFICLFLLKKLKREGILN Prediction of potential genes in microbial genomes Time: Thu May 19 23:49:15 2011 Seq name: gi|224461268|gb|ACDC01000134.1| Fusobacterium sp. 2_1_31 cont1.134, whole genome shotgun sequence Length of sequence - 20186 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 8, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 291 - 326 2.1 1 1 Tu 1 . - CDS 347 - 679 316 ## Cthe_0307 hypothetical protein - Prom 798 - 857 12.4 + Prom 743 - 802 14.5 2 2 Op 1 . 
+ CDS 868 - 5373 5827 ## FN1912 hypothetical protein 3 2 Op 2 . + CDS 5445 - 7538 2926 ## COG4775 Outer membrane protein/protective antigen OMA87 4 2 Op 3 . + CDS 7579 - 8052 762 ## FN1910 hypothetical protein 5 2 Op 4 . + CDS 8076 - 9074 1468 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase + Term 9094 - 9128 0.4 + Prom 9101 - 9160 16.7 6 3 Op 1 1/0.000 + CDS 9188 - 10264 1248 ## COG2404 Predicted phosphohydrolase (DHH superfamily) + Term 10338 - 10378 2.1 + Prom 10351 - 10410 4.3 7 3 Op 2 . + CDS 10458 - 11198 732 ## COG0101 Pseudouridylate synthase + Term 11354 - 11396 -0.9 + Prom 11578 - 11637 14.1 8 4 Op 1 2/0.000 + CDS 11681 - 13591 2999 ## COG1960 Acyl-CoA dehydrogenases 9 4 Op 2 . + CDS 13617 - 14834 1762 ## COG0426 Uncharacterized flavoproteins + Term 14844 - 14888 10.9 - Term 14835 - 14873 6.2 10 5 Tu 1 . - CDS 14878 - 15177 441 ## FN1395 hypothetical protein - Prom 15197 - 15256 12.1 + Prom 15288 - 15347 11.6 11 6 Op 1 2/0.000 + CDS 15420 - 16334 1191 ## COG2066 Glutaminase 12 6 Op 2 . + CDS 16375 - 17790 783 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Term 17829 - 17893 16.5 + Prom 18154 - 18213 13.6 13 7 Tu 1 . + CDS 18257 - 19681 1964 ## COG1288 Predicted membrane protein + Term 19701 - 19745 2.1 - Term 19692 - 19728 2.5 14 8 Tu 1 . 
- CDS 19736 - 19972 276 ## FN1825 hypothetical protein - Prom 20092 - 20151 7.7 Predicted protein(s) >gi|224461268|gb|ACDC01000134.1| GENE 1 347 - 679 316 110 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0307 NR:ns ## KEGG: Cthe_0307 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 4 109 13 125 125 85 45.0 7e-16 MKQRSVFLGVILSFLTCGIYATVWIWILNNELRVANGKDKNSFLNFILSIVTCGIFYLVW NYKLGQEVEDFGGKDDGVLYLFLAFFSFGIISIALAQSQVNEICERNGIS >gi|224461268|gb|ACDC01000134.1| GENE 2 868 - 5373 5827 1501 aa, chain + ## HITS:1 COG:no KEGG:FN1912 NR:ns ## KEGG: FN1912 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 327 1501 1 1175 1175 1421 68.0 0 MSKKEASLMFKNFSLNKIPKKVSIPLIAATVLGLTTTVALSNLEKIVEKVSSRFINGRLH IEDIDLSLSEPVIKNITLYDNENNVMFNSDKVVAKISFKNLLGGRIDELNVDSASVNVVR DKDGVINFTKLSKKKSDKKPSNPIDKLVVTSANINYEDYTFPNKLEKKIENINATILADK EKLVKNADVSIDDENIKLNTSFKDESEKELSSLEMKLKIDKFLLDKDLLKSLAKNNEKLE FSDVNISSDLTIKTDKTVKNTNIVGNLDVESPLFRYADVESDIKNIKLSGVFNGRDGKAN LDLNVFDKDRNIAVTYRDEELNSVINIDKIDESILNKIKPIKDKKLDLKNINIEDIKTIV HYSDERGFIVKTTMKPNNSEFKGIELNDFNLYADSKDGKKRANAKISAKIKGMAENLTVN LENQAENTDIIVALKSQEKDSIIPDINLKANLENKKDILKAKITSNIVNFNMDYQKEEKL AKIYDEKFKINYDVNKKNLTDGDGRIAFKIYDTDNYLDFKAKDNQVEIKELKLMDKLNKN NTLIAKGNADLNKKEFNIDYDAKLNSVSRKFKDKDIVLSFDAKGKAESKNNIISSQGQIN DLSLEYMAKIEKINGTYDFKKSDSGMEANLKTKIASIGYDKYKFDNFNLLATYSGKEVKI RDFSNNLLSFKADYNTEANKLNGDLNIKRLTDEDIGLDKVDFVLENLKAKLEGDIKTPKA KIDLGTTVVTLPSKDLAKISGKLNLVGDKIIIEGVNVDNNLITGQYDIKEKLLNLKASLS ENHLEKYYGGKDLGYMLYGDLVLTGVAGNLDAKLKGRVINLQSSFPDLAYNIDYSAENYS DGIISINDLDIIDKNYGSILGLTGIVDLKEKNLNIKNKNDKIDLTKLQNILKNPNVKGIV NTDIIINGQLSNPNYSLNMSSSEVSIKNFKINDIVLNLAGDKEKANLNKLSLDVYKNLIV GSGSYDIKNKTYNVNMKSNNKIDLSKFQTFFNSYGINNPSGKIGFNVQIDQNDEKAYLSL ENINLESSKLKLKFSNFSGPITLSGRRIEIGELNAKLNNSPVTIDGFVDLVDIAKIDKED IIRSLPYKLHIKSKELNYVYPEVIKLKASTDITLTNEELYGNLIIKEATINDIPNNYYRD FFSLIKEQLRKRRTDVTPKKKVDKNSREAQEKAARMRAFLNKLMPVDLVIKTEKPILIDM 
DNFNILVPEVYGKLDIDLNINGKKGKYYLEGETELKDGYFVIGTNEFKVDRALAIYNDNT PLPEINPNIFFESTIEMDDAEYYFTTMGKLNQLRYEITSKTSKVGGDLSALIVNPDSNEH IYSYGDGSQIFIVFMKNLIAGQIGQIVFGKTARYVKRKLKLTRFVIRPEIKIYNSEDSVI NRYGTTDNRALSPQIYNVNIKMEAKDNIYKEKLFWKASARLIGTGKDTIRNQTFKVNSQN IREYDVGLEYKIDDSKTLGIGVGTVPYKYRTDENKDYKKPNYYIEYKFRKRYKDFSEIFS F >gi|224461268|gb|ACDC01000134.1| GENE 3 5445 - 7538 2926 697 aa, chain + ## HITS:1 COG:FN1911 KEGG:ns NR:ns ## COG: FN1911 COG4775 # Protein_GI_number: 19705216 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Fusobacterium nucleatum # 20 697 1 678 678 1165 84.0 0 MKRLLIALMFVISLVSFSTMTNQSVKSIEVVNNNQVPASLIKNTLKLKEGSKFSTDALVA DFNALKGTGYFEDVMLQPISYDGGVKIVVDVIEKQNVASLLKERGVSVNTVREDTDKSVV ISSIKFNGNKKYSAAELQKITQLKTGEYFSRSRVEEAQRNLLATGKFAEVKPDAKVTNGK MELSFDVVENSVVKNVVITGNKAVPTSAIMSVLSTKTGAVQNYNNLREDRDKILGLYQAQ GYTLVNITDMSTDENGTLHIAIVEGIVRKIEVKKMVTKQKGNRRTPNDDVLKTQDYVIDR EIEIQPGKIFNVKEYDATVDNLMRLGIFKNVKYEARSIPGDPEGIDLILLIDEDRTAELQ GGVAYGSETGFLGTLSLKDSNWRGKNQELGFTFEKSNKDYTSFSLDFFDPWIRNTDRVSW GWGLYKTSYGDSDSILFHDIDTLGFKVNIGKGFSKYFRLSLGAKVEYIKEKHENGKLQKA PNGRWYYNDSGSWREIEGVDDKYVLWSIYPYISYDTRNNYLNPTSGTYAKFQVEGGHAGG YKAGNFGNVTLELRKYHKGLFKNNTFAYKVVGGVASDSTKESQKFWVGGGNSLRGYDGGF FKGSQKLVATIENRTQLNDIVGFVVFADAGRAWKQNGRDPSYTRDNKDFGHNIGTTAGVG LRLNTPIGPLRFDFGWPVGNKMDDDGMKFYFNMGQSF >gi|224461268|gb|ACDC01000134.1| GENE 4 7579 - 8052 762 157 aa, chain + ## HITS:1 COG:no KEGG:FN1910 NR:ns ## KEGG: FN1910 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 157 1 157 157 159 86.0 3e-38 MKKLLLIASVLLATSAFADKIGVVDSQKAFFQFSETKKAQQALEGQAKKVENEARQREVA LQKEFVSLQAKGDKLTDAEKKAFEKKSQDFQSFLNASQNNLNKEQMTKLKRIEDIYEKAV KKVAADGKYDYVFEAEALKVGGEDITDKVLKQMEALK >gi|224461268|gb|ACDC01000134.1| GENE 5 8076 - 9074 1468 332 aa, chain + ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 1 332 1 332 332 585 96.0 1e-167 MEYKVTDIITLLNAEYKGEVIENVSKLSPFFHSDEKSLTFAADEKFLKNLAQTKAKVIIV PDIELPLIEGKGYIVVKDSPRVIMPKLLHFFSRTLKKIEKMREDSAKIGENVDIAPNVYI GHDVVIGNNVKIFPNVTIGEGVKIGEGTVIYSNVTIREFVEIGKNCVIQPGAVIGSDGFG FVKVNGNNTKIDQIGTVIVEDEVEIGANTTIDRGAIGDTIIKKYTKIDNLVQIAHNDIIG ENCLIISQVGIAGSTIVGNNVTLAGQVGVAGHLEIGDNTMIGAQSGVPGNVEANKILSGH PLVDHREDMKIRVAMKKLPELLKRVKALEEKK >gi|224461268|gb|ACDC01000134.1| GENE 6 9188 - 10264 1248 358 aa, chain + ## HITS:1 COG:FN1601 KEGG:ns NR:ns ## COG: FN1601 COG2404 # Protein_GI_number: 19704922 # Func_class: R General function prediction only # Function: Predicted phosphohydrolase (DHH superfamily) # Organism: Fusobacterium nucleatum # 1 358 1 358 358 617 85.0 1e-176 MADILYDTRLKSEEAPKVIILTHGDADGLVSAMIVKSFEEMENKNKTFLIMSSMDVTSEQ TDKTFDYICKYTSLGSQDRVYILDRPIPSIDWLKMKYLAYTNVINIDHHLTNKPTLYKDE CCCENIFFHWNDKWSAAYLTLEWFKPLVEKAECYKNLYKKLEDLAIATSYWDIFTWKNLG NSPEDTLLKKRALSINSAEKILGSGAFYNFITKKINSKNYTEEVFDYFFLLDEAYSLKID NLYDFAKRVISDFDFKGYKLGVIYGIEGDYQSIIGDKILVDKKLNYDAVAFLNVYGTVSF RSKDNVDVSEIAQKLGMLVGYSGGGHKHAAGCRICDKDEMKKKMFEIFEHSMDKIRIL >gi|224461268|gb|ACDC01000134.1| GENE 7 10458 - 11198 732 246 aa, chain + ## HITS:1 COG:FN1600 KEGG:ns NR:ns ## COG: FN1600 COG0101 # Protein_GI_number: 19704921 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Fusobacterium nucleatum # 1 245 3 247 247 398 85.0 1e-111 MERKNIKIEFRYDGSRYYGFQRQPNKETVQGEIEKILKIVTKEDINLISAGRTDRGVHAN HQVSNFYTSSTIPVEKYKYLLTRALPKDIDILSVEEVDEKFNARHDAKMREYVYIISWEK NPFEARYCKFVKDKIDAERLEKIFSSFLGIHDFRNFRLSDCMSKVTVREIYSIDVKYFSE NKLKICIRGSAFLKSQVRIMVGTALEVYYKNLPKNHIDLMLNDFSKEYKKSLVEAEGLYL NRINYS >gi|224461268|gb|ACDC01000134.1| GENE 8 11681 - 13591 2999 636 aa, chain + ## HITS:1 COG:FN1424_1 KEGG:ns NR:ns ## COG: FN1424_1 COG1960 # Protein_GI_number: 19704756 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA 
dehydrogenases # Organism: Fusobacterium nucleatum # 1 377 1 377 377 724 98.0 0 MLFKTTEEHEALRMQVREFVETEVKPIAAMLDKENKFPHEAIEKFGKMGFMGLPYPKEYG GAGKDILSYAIAVEELSRVDGGTGVILSAHVSLGSYPIFAYGTEEQKKKYLTPLAKGEKL GAFGLTEPNAGSDAGGTETTAVKEGDYYILNGEKIFITNADVAETYVVFAVTTPDIGTKG ISAFIVEKGWEGFTFGDHYDKLGIRSSSTCQLLFNNVKVPKENLLGKEGDGFKIAMSTLD GGRIGIAAQALGIAQGAFEHALEYAKEREQFGKPIAFQQAVSFKLADMATKLRTARFLIY SAAELKEHHEPYGMESAMAKQYASDIALEVVNDALQIFGGSGYLKGMEVERAYRDAKITT IYEGTNEIQRVVIAAHLIGKPPKSDAVAVAKKKKGPVTGPRKNIIFKDGSAKEKVAALVA ALKADGYDFTVGIPLNTPIGKSERVVSAGKGIGDKKNMKLIEKLATQAGASVGCSRPVAE TLQYLPLDRYVGMSGQKFVGNLYIACGISGALQHLKGIKDATTIVAINTNANAPIFKNAD YGIVGDIAEILPLLTKELDNGEAKKDAPPMKKMKRVLPKVMYSPHVYVCSGCGHEYNPEI GDEDSDIKPGTRFKDLPEDWTCPDCGDPKSGYIDAK >gi|224461268|gb|ACDC01000134.1| GENE 9 13617 - 14834 1762 405 aa, chain + ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 405 1 405 405 792 93.0 0 MHNVRNITENLYWIGANDRRLALFENIHPIPEGVSYNSYMLLDEKTVVFDTVDWSVTRQY VENIEYLLNGRELDYLVVHHMEPDHCGSIEELALRYPNLKIISSEKGFMFMRQFGYKSIN GHELIEVKEGDKFKFGKHEIVFLEAPMVHWPEVLVSFDTTNGALFSADAFGSFKSLDGRL FNDEVNWDRDWLDEGRRYLTNIVGKYGPHIQHLLKKAGPIVDKIKFICPLHGVVWRNDFG YIIDKYDKWSRYEPEEKGVLIAYASMYGNTENAVEIIAKKLAEKGVTNIKMYDVSNTHVS YLISDLFKYSHLVIASPTYNLGIYPVIHNFVMDIKALNLQNRTVAIVENGSWARKSGDLL QEFFETQVKDIAVLNEKVGLTSSANNVNLDEMDALVEVLVESLKK >gi|224461268|gb|ACDC01000134.1| GENE 10 14878 - 15177 441 99 aa, chain - ## HITS:1 COG:no KEGG:FN1395 NR:ns ## KEGG: FN1395 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 99 1 99 99 109 78.0 4e-23 MDKENEGFDIMSFLFNNKSFIEGLIENLKKELMEVIFSENLNIFKKSIFIQGVFTYANLI LSNNESLSKEEKTKIMEEIVEISNLLTDETLEDIKKYAN >gi|224461268|gb|ACDC01000134.1| GENE 11 15420 - 16334 1191 304 aa, chain + ## HITS:1 COG:FN1397 KEGG:ns NR:ns ## COG: FN1397 COG2066 # Protein_GI_number: 19704729 # Func_class: E Amino acid 
transport and metabolism # Function: Glutaminase # Organism: Fusobacterium nucleatum # 1 304 1 304 304 554 94.0 1e-158 MEELLKELVEKNRKFAVDGNVANYIPELDKADKNALGIYVTTLDGQEFFAGDYNTKFTIQ SISKIISLMLAILDNGEEYVFSKVGMEPSGDPFNSIRKLETSSRKKPYNPMINAGAIAVA SMIKGKNEKERFTRLLDFAKLITEDDSLDVNYKIYCGEADTGFRNFSMAYFLKGEGIIEG NVEEALTVYFKQCSIEGTAKTISTLGKFLANDGVLSNGERIITTRMAKIIKTLMVTCGMY DSSGEFAVRVGIPSKSGVGGGICSVVPGKMGIGVYGPALDKKGNSLAGGHLLADLSEELS LNIF >gi|224461268|gb|ACDC01000134.1| GENE 12 16375 - 17790 783 471 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 448 3 445 456 306 37 9e-83 MEFLNSIVGQINNILWSYVLIALLILSGLFYTIRTGFAQGRLLGDMVALITGKLSSLRDG EKKVAGQVTGFQAFCIAVASHVGTGNLAGVAIAVAVGGPGALFWMWVIALLGGATSLIEN TLAQTYKVKDGKGGFRGGPSYYMEKALGQKTLGYIFSVIVIVTFAFVFNTVQANTIAQAF ETSFNMSSAVAGIILAALTALIIFGGLNRIANVVSFMVPIMAIGYVVVALYVLVVNAVHI PRLFMDIIEAAFGLKQVVGGTIGVAMLQGIKRGLYSNEAGMGSAPNAAATSNVSHPVKQG LLQAFGVFVDTILICSATGFIVLLYPEYNTIGEKGIKLTQLALSHSVGAWGAGFITLCIF LFAFSSLVGNYYYGEANLEFLTKSKTSMLVFRVLTVACVYLGSVASLGLVWDIADVSMGI MALMNIVVIAILSPKAVAIINDYIKQRKEGKNPVFRAKDIPGLKNTECWDD >gi|224461268|gb|ACDC01000134.1| GENE 13 18257 - 19681 1964 474 aa, chain + ## HITS:1 COG:FN1409 KEGG:ns NR:ns ## COG: FN1409 COG1288 # Protein_GI_number: 19704741 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 6 246 3 243 247 430 94.0 1e-120 MSKNEKTEEKKQLKALNPMLLLVCMILIIAVLSYIVPAGVYDRVMDEKLGKELVDPNSFH YVERNPITIFGLLVSITLGIQNAAYIISFLLIIGGMFAILNATGAINTGMANVVRSMKGR ELLMIPVCMIVFGCGSAFCANFEEFLAFVPLVLACCYAMGFDSLTAVGIIFCAAASGYAG AITNAFTTGVAQSIAGLPMFSGMGLRIPLFISLITVSILYVMYHAHKVKKNPKSSAVYEN DLEQKKHINIDTENVEKLTGRQKLVLATLILGIAYTVYCVIKKGYYIDELSGIFLAIGII GGFIGGLKPSKMCDEFLQGCINMLFPCIMIGLANSVVIILKDTSILDTIIHALASLLKGL PASIAAIGMFIVQDLFNVVVPSGSGQAAITMPIMAPLADMIGITRQTAVLAFQLGDAFTN VMAPTGGEILAALAMCGTISFKTWMKYLAPLFALWWLVSFIFLTIATQIQYGPF >gi|224461268|gb|ACDC01000134.1| GENE 14 19736 - 19972 276 78 aa, chain - ## 
HITS:1 COG:no KEGG:FN1825 NR:ns ## KEGG: FN1825 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 78 68 145 145 134 84.0 1e-30 MEVDQYTAVITGTLTHNGSTKKNLRLSILCFDKKGNRVGDAIATIDELEKGKKWKFRAVL NEENVAACKIKDAYITVE Prediction of potential genes in microbial genomes Time: Thu May 19 23:49:41 2011 Seq name: gi|224461267|gb|ACDC01000135.1| Fusobacterium sp. 2_1_31 cont1.135, whole genome shotgun sequence Length of sequence - 3414 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 68 - 1684 2080 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase + Term 1693 - 1737 7.1 + Prom 1732 - 1791 5.4 2 2 Tu 1 . + CDS 1846 - 2370 605 ## FN0142 hypothetical protein + Term 2384 - 2430 7.1 + Prom 2379 - 2438 6.6 3 3 Tu 1 . + CDS 2620 - 3015 586 ## gi|237739198|ref|ZP_04569679.1| predicted protein + Term 3019 - 3080 1.5 + Prom 3029 - 3088 2.9 4 4 Op 1 . + CDS 3144 - 3272 228 ## 5 4 Op 2 . 
+ CDS 3235 - 3412 102 ## gi|294783822|ref|ZP_06749144.1| conserved hypothetical protein Predicted protein(s) >gi|224461267|gb|ACDC01000135.1| GENE 1 68 - 1684 2080 538 aa, chain + ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 1 538 1 538 538 875 88.0 0 MEEILVFGHKNPDTDSICSSIAMSNLRKQQGFNAIPCRLGEINKETKFVLDKLGVKSPKL LKTVSAQITDLNYVEKSTISTEDSIKEALDLMTKENFSSLPVIDTEGYFKTMLSISDIAN TYLEIDYSDLFSKYSTTFENLKEALEGEVISGNYPEGEIASNLKEASELESLKKGDIVIT TSLTDGIDKSIQAGARVVIVCCRKGDFISPRVTSECAIMLVRHSFFKAISLISQSISVGG ILNTNKVLFNFNKEDFLNEIRGIMKDANQTNFPVLEDDGKVYGTIRTKHLIDFHRKKVIM VDHNEFSQSVEGIQDAHILEVVDHHKFANFQTNEATKIRTEPVGCTSTIIYGLYKEAKIE PDEKTALLMLSAILSDTLLFKSPTCTSRDVEVAKELAKLAKVDNISEYGMEMLVAGTSMA KSSMKEIINQDKKIFPIGDMEIAVAQINTVQIGELVARKEEIAKEIEHEIGKYGYSLFLF VVTDIINSNSLVFTYGKEIEIVENAFKKEVVNNEILLENVVSRKKQIIPFLMTAAQNM >gi|224461267|gb|ACDC01000135.1| GENE 2 1846 - 2370 605 174 aa, chain + ## HITS:1 COG:no KEGG:FN0142 NR:ns ## KEGG: FN0142 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 15 174 1 160 160 249 86.0 2e-65 MILITIFSLIIIFLLQFIYLVPENHYSILDQTGKVKLKDYPELKDMSFEYNADLSVEYTE LINLELEKINFRFNGEIIGTVEINRNINDLKDFAEPYIDEETKDKIIRKIYPLQNEFLRI LGRNAEVYDSLEDGRFYIDIYIKDLKTNKTFIIKRDNISIYYESGGPKIFIPSV >gi|224461267|gb|ACDC01000135.1| GENE 3 2620 - 3015 586 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739198|ref|ZP_04569679.1| ## NR: gi|237739198|ref|ZP_04569679.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 131 1 131 131 238 100.0 9e-62 MKFRFGYLKLIDKSIYKKDKIILVCDSVSNNQDENIIGCYINDISNFESNLTEICGELDK RKYSGLDGQVWGADFLKDKVHIFWLFDPDNEEGKAEISRKGMLKLMKKWIEFRKKKIPEN YEKYEEIIEVD >gi|224461267|gb|ACDC01000135.1| GENE 4 3144 - 3272 228 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEEGHNRLSVSASKSNMNSNGTRLNEEQFVNEREIGERYERK >gi|224461267|gb|ACDC01000135.1| GENE 5 3235 - 3412 102 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783822|ref|ZP_06749144.1| ## NR: gi|294783822|ref|ZP_06749144.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 1 59 1 59 119 82 100.0 7e-15 MKERLEKDMKENKIIVHKNILNINNLIREFPTKILEIIEFKNFIIIRIGYNSQISDNIF Prediction of potential genes in microbial genomes Time: Thu May 19 23:49:59 2011 Seq name: gi|224461266|gb|ACDC01000136.1| Fusobacterium sp. 2_1_31 cont1.136, whole genome shotgun sequence Length of sequence - 2470 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 49 - 108 2.1 1 1 Op 1 . + CDS 206 - 298 103 ## 2 1 Op 2 . + CDS 301 - 516 255 ## Lebu_0273 hypothetical protein + Prom 518 - 577 6.2 3 2 Tu 1 . + CDS 693 - 995 489 ## gi|237738865|ref|ZP_04569346.1| predicted protein + Term 1037 - 1078 3.4 + Prom 1046 - 1105 6.6 4 3 Tu 1 . 
+ CDS 1242 - 2075 975 ## gi|237739202|ref|ZP_04569683.1| predicted protein Predicted protein(s) >gi|224461266|gb|ACDC01000136.1| GENE 1 206 - 298 103 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIPIEKMIIKYDRLDLLPNSRTDIILKEVK >gi|224461266|gb|ACDC01000136.1| GENE 2 301 - 516 255 71 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0273 NR:ns ## KEGG: Lebu_0273 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 71 1 71 71 67 64.0 1e-10 MKYWRDEYLVLKNLIEKYCETEDRNRLMKILETEDRFLFKYFINEFSKLKIPSKMTSKEL EEYEKKIMVYI >gi|224461266|gb|ACDC01000136.1| GENE 3 693 - 995 489 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738865|ref|ZP_04569346.1| ## NR: gi|237738865|ref|ZP_04569346.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 100 5 104 104 143 100.0 3e-33 MNFLKLKDAANKLLEFMEKYDLDDYNERLVKKFLNELIYVIDTDEIDDIKKYQEVKEIIV GLYPPRGGLTEMYVADEDREKMNKINRELKELKKKITLLD >gi|224461266|gb|ACDC01000136.1| GENE 4 1242 - 2075 975 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739202|ref|ZP_04569683.1| ## NR: gi|237739202|ref|ZP_04569683.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 277 1 277 277 419 100.0 1e-116 MLPNRLIITEKSKRKAIYENSKDKWIIDFEDKIKSWSDFYDIIQKEMDFWNYNEKFRKDN YTYSDIVGDLIVFEKMKERKKEGMVFILDYTEDFRKIKDCDKKDYDKGTIYYDLVYNLLV EWYRDNRIMYKEWNASIDIEIYILIDDNSIKDKNIDFDNELIIATENDRNDVRQQYKNYA KTKIRFFDYDEIKDLPDIFLDNKRGFEAENFIFFYQLEKIKADNSKQIKVEISNSMGIFH SLSIYLLVYIIDKILIEKFIEGKEIKMFMIFANELAE Prediction of potential genes in microbial genomes Time: Thu May 19 23:50:25 2011 Seq name: gi|224461265|gb|ACDC01000137.1| Fusobacterium sp. 2_1_31 cont1.137, whole genome shotgun sequence Length of sequence - 10660 bp Number of predicted genes - 16, with homology - 12 Number of transcription units - 8, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 11 - 70 2.1 1 1 Tu 1 . + CDS 160 - 489 364 ## gi|294783816|ref|ZP_06749138.1| hypothetical protein HMPREF0400_01813 + Prom 491 - 550 7.4 2 2 Tu 1 . 
+ CDS 704 - 1123 461 ## gi|237739204|ref|ZP_04569685.1| predicted protein + Prom 1163 - 1222 4.0 3 3 Tu 1 . + CDS 1243 - 1785 846 ## gi|237739205|ref|ZP_04569686.1| predicted protein + Prom 1796 - 1855 7.1 4 4 Op 1 . + CDS 1879 - 2031 222 ## 5 4 Op 2 . + CDS 2043 - 2408 401 ## Lebu_1174 hypothetical protein 6 4 Op 3 . + CDS 2408 - 2479 93 ## 7 4 Op 4 . + CDS 2485 - 2586 62 ## + Term 2599 - 2638 -0.3 8 5 Tu 1 . - CDS 2651 - 2776 59 ## - Prom 2913 - 2972 8.0 + Prom 2841 - 2900 8.0 9 6 Tu 1 . + CDS 2928 - 3827 980 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 3659 - 3695 4.3 10 7 Op 1 17/0.000 - CDS 3846 - 4553 355 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 11 7 Op 2 44/0.000 - CDS 4554 - 5336 249 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 12 7 Op 3 49/0.000 - CDS 5333 - 6136 717 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 13 7 Op 4 38/0.000 - CDS 6129 - 7058 694 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 7 Op 5 . - CDS 7073 - 8650 2462 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 8720 - 8779 10.8 + Prom 8989 - 9048 6.2 15 8 Op 1 6/0.000 + CDS 9078 - 9539 848 ## COG0054 Riboflavin synthase beta-chain 16 8 Op 2 . + CDS 9542 - 10651 1558 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis Predicted protein(s) >gi|224461265|gb|ACDC01000137.1| GENE 1 160 - 489 364 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294783816|ref|ZP_06749138.1| ## NR: gi|294783816|ref|ZP_06749138.1| hypothetical protein HMPREF0400_01813 [Fusobacterium sp. 
1_1_41FAA] # 1 109 19 127 127 181 100.0 1e-44 MRYIRVPKDIKAMKDYDYGVQKDEQMEELILSESQYSVFYTLKVFQLINEECDVLIDDYE EEVLSLEKIPLALRIVNKIIQNSNDINLIKFKNMLELAIKYRTIVGFDF >gi|224461265|gb|ACDC01000137.1| GENE 2 704 - 1123 461 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739204|ref|ZP_04569685.1| ## NR: gi|237739204|ref|ZP_04569685.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 139 9 147 147 228 100.0 9e-59 MVLKFHYIKNKNNIFQSCEVNEKYKFISFYLHNPIDCKNFLKFTKKALEENLKKDISGEA VAAELDIEENKIIIYDIDTYFDGDEPDELLEIKKEDLIYILDRWIKFLEKPITDENYEEI FEMEDPVVKVLKDGKYVII >gi|224461265|gb|ACDC01000137.1| GENE 3 1243 - 1785 846 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237739205|ref|ZP_04569686.1| ## NR: gi|237739205|ref|ZP_04569686.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 180 1 180 180 293 100.0 2e-78 MERIVYIRGKFKGNNKEMKEAYFYAKEMMKKYDLEPQYIGVIASEGWESPGILTIKRKEK QLLEDLEKNKKIESIEVITKEMEGKEIIDNKSYFLIDKEDGIIVFWTNTNIEKVNFEEIL EKMKKYVEPGIEEICDWESGSSPIVYVYEGEKALERTGKFQDKITIIYKKVTPLDIPIEV >gi|224461265|gb|ACDC01000137.1| GENE 4 1879 - 2031 222 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTEEREMILNLSYEELIEKFKNEPRKVIKFLQDEQKKILEMIQNISLRY >gi|224461265|gb|ACDC01000137.1| GENE 5 2043 - 2408 401 121 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1174 NR:ns ## KEGG: Lebu_1174 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 7 118 67 178 178 82 43.0 6e-15 MIIIEDYYLEDDSFNEFLIELAYDKRHRQHEDLAFLLEKKHSPKIINRVYDLAVMELDYK KEDEFFNIARKCTYALGYTNTPKAKEKLELLAKNENELIREYAIKQLNRHDFTDKDEEEQ D >gi|224461265|gb|ACDC01000137.1| GENE 6 2408 - 2479 93 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYEEQERKIYAKYDDKIADDRK >gi|224461265|gb|ACDC01000137.1| GENE 7 2485 - 2586 62 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDITDDVKRIKKSIDNGTFKESLLPEEKEYIVK >gi|224461265|gb|ACDC01000137.1| GENE 8 2651 - 2776 59 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSWTIPPVGNFTLPRRPYILFSSITILLFSQFVNIYFFNF >gi|224461265|gb|ACDC01000137.1| GENE 9 2928 - 3827 980 299 aa, chain + ## HITS:1 
COG:FN1498 KEGG:ns NR:ns ## COG: FN1498 COG0697 # Protein_GI_number: 19704830 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 285 1 285 299 438 89.0 1e-123 MDKHIKGALLVCLAATMWGFDGIALTPRLFNLHVPFVVFILHLLPLILMSVIFGREEIKN IRKLDKNDLFFFFCVALFGGSLGTLSIVKALFLVNFKHLTVVTLLQKLQPIFAILLARIL LKEKLKKDYLFWGFLALLGGYFLTFEFNIPEVVEGDNLLAASLYSLLAAFSFGSATVFGK RVLKAASFRTALYVRYLMTTCIMFVIVAFTSGFGDFSQATAGNWLIFVIIALTTGSGAIL LYYFGLRYITAKVATMCELCFPISSVIFDYFINGNVLSPVQIASAILMIISIIKISRLK >gi|224461265|gb|ACDC01000137.1| GENE 10 3846 - 4553 355 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 11 208 30 242 329 141 36 2e-33 MKAIELINIVKKYGQQEVLNSFSLDIEKGKCLAVMGESGSGKSTIAKIIIGLEKPNSGEV KIFDKDIEFLFQDSYNALNPRMTVEDLIYEPLQFSTDIDIKDKREFILELLKQVELAPEL LTRRRDELSGGQLQRVCLARALSTKPQIMIFDESLSGLDPLVQDKILDLLYKIQKEYNLT YIFISHDFRLCYFLADRIILIDNGKIIEDFKDLDKEIIPKTEIGKILLEDIIKLK >gi|224461265|gb|ACDC01000137.1| GENE 11 4554 - 5336 249 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 12 228 31 242 329 100 30 5e-21 MNILEIKNLSLKISDKKILNSINFALKEKEIVSIIGQSGSGKTMLSKMIMGLKNKNMQVE GEILFKDKNIFDFSEEDLRKYRGEGIGYITQNPLNVFLPFQKIKTTFLETYLSHKNVSKK EIIEFAKKNLKQVNLDNADEILNKYPFELSGGMLQRVMIALIVGLDSKIIIADEVTSALD SYNRHEIIKIFKELNNIGKSIILITHDYYLMKAISDRCLVMENGEVIEEFNPKLKSELIK ESSNFGAKLLETTIYRRKGS >gi|224461265|gb|ACDC01000137.1| GENE 12 5333 - 6136 717 267 aa, chain - ## HITS:1 COG:FN1502 KEGG:ns NR:ns ## COG: FN1502 COG1173 # Protein_GI_number: 19704834 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 267 5 271 
271 427 92.0 1e-119 MAKNIKFYFAIFLLFFWIALAIFAPIIAPYDPQYVDLSLKLLPPNKTYILGTDALGRDIF SRIIYGARLSISISLSIQVILLLVSVPIGLFIGWKQGKEEKFFDWLTMIFSTFPSFLLAM VFVGMLGAGISNMIISVVAVEWIYYARILKNSVISQKQNEYVKYAILKGMPTKYILKKHI FPFVYGPILTASLMNIGSIILMISSFSFLGIGVQPNISEWGNMIHDSRTFFRNHPNLMIY PGMMILFAVGSFRFIASQIEEKFRGTK >gi|224461265|gb|ACDC01000137.1| GENE 13 6129 - 7058 694 309 aa, chain - ## HITS:1 COG:FN1503 KEGG:ns NR:ns ## COG: FN1503 COG0601 # Protein_GI_number: 19704835 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 4 312 312 519 95.0 1e-147 MKKKIFDIISALFVISILAFIFIQLTPGDPAENYLRASHLPITDELLKQKREELGLNSPL IIQYLKWLKNVLLGNFGSSFLRKEPAIYLTFKALYATFQLTIFSTFLIILISLPIGILTA IKTGTWIDKLIISITTIFVSMPVFWLGFSLILLFSVKLNWLPVSGRGGFLNFILPSITLA VPFIGQYIEFVKKSILENIQNNLLENAILRGLKKRYIIFNYLLKGAWIPILSGFSFTFVS ILTGSILVEEIFSWPGIGFLFTKAIQAGDVPLIQACIMVFGLLFIIATHFMNSILKYLDP RIKGEKNNG >gi|224461265|gb|ACDC01000137.1| GENE 14 7073 - 8650 2462 525 aa, chain - ## HITS:1 COG:FN1504 KEGG:ns NR:ns ## COG: FN1504 COG0747 # Protein_GI_number: 19704836 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 5 525 2 522 522 990 95.0 0 MLVFSLLFLVACGEKAKEEVKTEGTEKKTLTISWNEDIGFLNPHAYLPDQFVTQGMVYEG LVNYGENGEILPSLAESWEISEDGKTYTFHLRKGVKFSDGSDFNANNVKKNFDSIFLNKE RHSWFGLTDHIKSYRAVDENTFELILDEAYTPTLYDLAMIRPIRFLADAGFPDDGDTYKG IKASIGTGPWILKEHKKDEYAIFEKNPNYWGKKPILDEVIIKIIPDAETRALQFEAGELD MIYGNGLISYDTFKSYEEDPKYKTAISEPMSTRLLMFNTTSGVLNDINLRYALTYATDKK AISEGILNGIEKPADTIFAPNMPHSKQDLKPFEYNLDKAKEYIEKAGYKMGKEFYEKDGQ VLTLVFPYIATKTLDKQIAEYIQGQWKKIGVNVEIKALEEKNFWEETDDLKYNVMLNYSW GAPWDPHAYINAMATVAENGNPDYEAQLGLPMKKELDEKIHQVLVEANPEKVEQLYKEIL TTLHEQAVYVPLTYQSLIAVYRDNLTGVRFMPQEYELPLSFIDKK >gi|224461265|gb|ACDC01000137.1| GENE 15 9078 - 9539 848 153 aa, chain + ## HITS:1 COG:FN1505 
KEGG:ns NR:ns ## COG: FN1505 COG0054 # Protein_GI_number: 19704837 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Fusobacterium nucleatum # 1 153 5 157 157 271 93.0 3e-73 MRVFEGKFNGEGIKIAIVAARFNEFITSKLIGGAEDILRRHNVEDDNINLFWVPGAFEIP LIAQKLAKSKKYDAVITLGAVIKGSTPHFDYVCAEVSKGVAHVSLESEIPVIFGVLTTNS IEEAIERAGTKAGNKGADAAMTAIEMINLIKGI >gi|224461265|gb|ACDC01000137.1| GENE 16 9542 - 10651 1558 369 aa, chain + ## HITS:1 COG:FN1506_2 KEGG:ns NR:ns ## COG: FN1506_2 COG1985 # Protein_GI_number: 19704838 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Fusobacterium nucleatum # 147 369 1 223 223 382 87.0 1e-106 MDKTVDKKFMARAIELAFRGLGGVNPNPLVGAVVVKDGKIIGEGWHKKYGGPHAEVWALN EAGEEAKGATIYVTLEPCSHQGKTPPCAKRIVEAGIKRCVIACIDPNPLVAGKGIKIIED AGIKVDLGILEKEAKEVNKVFLKYIENKIPYLFLKCGITLDGKIATRRGKSKWITNELAR EKVQFLRTKFSAIMVGINTVLKDNPSLDSRLDEEKFGIEKRNPFRVVVDPNLESPIDSKF LHFNDNKAIIVTSNDNRNLEKVKEYENLGTRLIYLEGKIFKMKDILKELGKLNIDSVLLE GGSGLISTAFKENVIDAGEIFIAPKIIGDNSSIPFINGFNFDSMEEVFKLSNPKFNIYGD NISVEFENL Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:06 2011 Seq name: gi|224461264|gb|ACDC01000138.1| Fusobacterium sp. 2_1_31 cont1.138, whole genome shotgun sequence Length of sequence - 5961 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 10 - 738 845 ## COG0206 Cell division GTPase 2 1 Op 2 . - CDS 755 - 2491 1810 ## COG1479 Uncharacterized conserved protein + Prom 2844 - 2903 5.6 3 2 Tu 1 . + CDS 2929 - 3717 830 ## Acfer_0728 hypothetical protein + Term 3735 - 3790 9.2 + Prom 3894 - 3953 17.7 4 3 Op 1 6/0.000 + CDS 4000 - 4914 804 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 5 3 Op 2 . + CDS 4949 - 5485 626 ## COG1045 Serine acetyltransferase - Term 5498 - 5544 6.9 6 4 Tu 1 . 
- CDS 5552 - 5824 550 ## COG2388 Predicted acetyltransferase - Prom 5900 - 5959 12.4 Predicted protein(s) >gi|224461264|gb|ACDC01000138.1| GENE 1 10 - 738 845 242 aa, chain - ## HITS:1 COG:FN1451 KEGG:ns NR:ns ## COG: FN1451 COG0206 # Protein_GI_number: 19704783 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Fusobacterium nucleatum # 33 241 77 315 360 73 27.0 3e-13 MQEKLKVITIDDYSISILKNYFKDNENIEFLELALHEKLEDLNTNFSKRDIVFLRTNAEN LEKLLEVGKALKEKEIITITVLEEKLAIENKKVLEEAINTIFPVNKKDDMENLFLEVIKM IDNIIFGRCYINLDVEDVKYMLKDSGISVFGRLNINKTISKEDIIKNINYPFYNKTLKDS KKILIFLDTLEGFVLTEGELIIDTLRNKNGKTIEDILFSVRIGNNLKNRIECSFIAGLFK ER >gi|224461264|gb|ACDC01000138.1| GENE 2 755 - 2491 1810 578 aa, chain - ## HITS:1 COG:Cj0008 KEGG:ns NR:ns ## COG: Cj0008 COG1479 # Protein_GI_number: 15791407 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 3 575 4 580 583 313 37.0 5e-85 MSFKNPITIEEALEKIQRNKYLLPAIQREFVWKSWQIEKLFDSLMNEYPIGSFLFWEVTS TKDFKFYEFIKKYDQRDDNHNKKIPLVDGDEIIAVLDGQQRLTALYIGLLGTYTEKLPYH RWDNPKAFPEKYLCLNLLDKDNKDIEEETSKYELKFCTKEEIEYKDGKKSWFKLNNILKY KTEGDILLEPLEMLKDCSDETKREAYKKLMQLFKIVNRDKIINYYLEKENNLDKVLNIFI RMNSMGTKLNNSDLLLSIATDQWKNRDAREEINKLLDELNKEGFAIDKDFILKAALYLTK KIKNIRFKVDNFKKENMDLIEKNWDNISDSLKLSFILLKSFGYNKDNLSANTPVLPIALY ILENNFDRKIVTNSSFSEDRKLIKEWIIKSTLKGIFSGRDISANVREVILKSKSNNFPLD EIKEELKKIPGKSINFSEDEIDNLLWSEYGANTTFSILQLLYPTLDYKNNFHIDHIYPRA KFNEKYLKKQGINIADLWYWDCLPNLQLLDGSQNLEKTDKEFKDWLEEQDFSKEERKDYF KKNYIPENIELTFKNYNEFFEKREELLLKQLKKILLEK >gi|224461264|gb|ACDC01000138.1| GENE 3 2929 - 3717 830 262 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0728 NR:ns ## KEGG: Acfer_0728 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 68 262 7 205 212 78 29.0 3e-13 MVRKLKLKNYLNTEKSNDFLEMAYNAETQTKAKSYLKKALELDPDNLDAELALADISSKS QLDFLKKTEDIIAHGNKLMEEQGYFEKDCIGDFWLIFETRPYMRVRYRYLMLLLECKMLK 
KAIHECEEMLKLCENDNLGVRYILIHLYTFWEDEKSVLNLCKKLKMLKTTQLLFPLSILY YKLLDFKKAEKYLLELAETNRYTKEFFKAFLEERIDEFELEDYGYRPFTIDELIVTFMEN LYLYCNLIEYFSWGYDILKKKR >gi|224461264|gb|ACDC01000138.1| GENE 4 4000 - 4914 804 304 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 2 297 3 300 308 314 54 1e-85 MIYNNLLDLIGNTPVVKVDFKDENIADVYVKLEKFNLSGSVKDRAALGMIEAAERDGLLK EGSVIIEPTSGNTGIALSLIGRLKGYKVVIVMPDTMSIERRSTLKAYGAELILTDGSKGI GEAIAVAEKLVAENPNYFMPQQFNNKANPEKHYETTGKEILDDFKVVDAFVAGVGTGGTL VGIGKRLKERSKDTKVIGVEPSTSAVLSGEAPGKHSIQGIGTGFVPENYDATVVDEVIKI SSEEALEYAKKASHDFGLFVGISSGANIAAAYQVAKRLGKGKTVVTIAPDGGEKYLSIEA FLTK >gi|224461264|gb|ACDC01000138.1| GENE 5 4949 - 5485 626 178 aa, chain + ## HITS:1 COG:BS_cysE KEGG:ns NR:ns ## COG: BS_cysE COG1045 # Protein_GI_number: 16077161 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Bacillus subtilis # 4 177 3 171 217 203 55.0 1e-52 MNIFKWLKDEFLNIQQKDPAVKSKLEIILYASFHAVLYHKLAHFLYKCKLYFLARLISQI ARFLTGIEIHPGATLGRRVFFDHGMGIVIGETAIIGDDCVIFHGVTLGGLSSKKPNQTNS SKRHPTIKNNVMLGAGAKLLGDITIGENVKVGANAVVLTDVPDNAVAVGIPARIIVKE >gi|224461264|gb|ACDC01000138.1| GENE 6 5552 - 5824 550 90 aa, chain - ## HITS:1 COG:FN1391 KEGG:ns NR:ns ## COG: FN1391 COG2388 # Protein_GI_number: 19704723 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Fusobacterium nucleatum # 3 90 2 89 89 142 87.0 1e-34 MNDIVHYEGNGFYIYDDNKEILARLEYKRNGNTLIFDHTVVSDKLKGQGIAGKLLDVAVD YARKNNFKVHPVCSYVVKKFESGNYDDIKI Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:11 2011 Seq name: gi|224461263|gb|ACDC01000139.1| Fusobacterium sp. 2_1_31 cont1.139, whole genome shotgun sequence Length of sequence - 5361 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 113 - 787 999 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 - Prom 812 - 871 9.4 + Prom 933 - 992 10.3 2 2 Op 1 . + CDS 1044 - 1280 318 ## FN0101 glutaredoxin 3 2 Op 2 24/0.000 + CDS 1280 - 3547 3036 ## COG0209 Ribonucleotide reductase, alpha subunit 4 2 Op 3 . + CDS 3528 - 4574 1016 ## COG0208 Ribonucleotide reductase, beta subunit + Term 4580 - 4626 4.8 + Prom 4587 - 4646 9.5 5 3 Op 1 . + CDS 4678 - 4869 309 ## MGAS9429_Spy0565 phage protein 6 3 Op 2 . + CDS 4906 - 5331 584 ## COG1598 Uncharacterized conserved protein Predicted protein(s) >gi|224461263|gb|ACDC01000139.1| GENE 1 113 - 787 999 224 aa, chain - ## HITS:1 COG:FN0100 KEGG:ns NR:ns ## COG: FN0100 COG1018 # Protein_GI_number: 19703448 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Fusobacterium nucleatum # 9 224 1 215 215 345 83.0 3e-95 MKKIYDLNLVERNDVAENTIELIFTKPSDYEFKIGQYTFLNVGEDPQDKNFARALSIASH PDEDLLRFVMRTSDSEFKQRCLAMKKGDSATITKATGSFGFKFSDKEIVFLISGIGIAPI IPMLMELEKIDYQGKVSLFYSNRTLAKTTYHERLGSYNIKNYNYNPVFTGIQPRINIDLL KEKLDDIYDAHYYIIGTGEFIKTMKTLLEENNINKDNYLVDNFG >gi|224461263|gb|ACDC01000139.1| GENE 2 1044 - 1280 318 78 aa, chain + ## HITS:1 COG:no KEGG:FN0101 NR:ns ## KEGG: FN0101 # Name: not_defined # Def: glutaredoxin # Organism: F.nucleatum # Pathway: not_defined # 12 78 1 67 67 108 91.0 7e-23 MENNEFKECLEMIKVYGKENCSKCTSLKGILTDRNIEFEYIEDVKTLMIIASKARIMSAP VIEYNDTVYSMEAFLKVI >gi|224461263|gb|ACDC01000139.1| GENE 3 1280 - 3547 3036 755 aa, chain + ## HITS:1 COG:FN0102 KEGG:ns NR:ns ## COG: FN0102 COG0209 # Protein_GI_number: 19703450 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Fusobacterium nucleatum # 1 755 1 755 755 1421 94.0 0 MTNERRKVINRDNIVEDLNIEKIREKLLRACDGLEVNMVELESNIDSIYEENITTQKIQA SLINTAVTMTSFEESDWAYVAGRLLMMEAEREVYHSRKFSYGDFAKTIKHMVELGLYDER LLTYTEEELNQISQLIDLSRDMVYDYAGANMLVNRYLIKHDGKTYELPQETFMAISMMLA 
LNEKEGETRVNIVKEFYNALSLRKLSLATPILANLRIPNGNLSSCFITAIDDNIESIFYN IDSIARISKNGGGVGVNVSRIRAKGSMVNGYYNASGGVVPWIRIINDTAVAVNQQGRRAG AVTVALDTWHLDIETFLELQTENGDQRGKAYDIYPQVVCSNLFMKRVKNNESWTLFDPYE IRKKYGVELCELYGYEFENLYEKLEKDNDIKLKRVLSAKELFKSIMKTQLETGMPYIFFK DRANEVNHNSHMGMIGNGNLCMESFSNFKPTINFVEEEDGNTSIRRSEMGEIHTCNLISL NLAELTSDELEKHVALAVRALDNTIDLTVTPLKESNKHNLMYRTIGVGAMGLADYLAREY MIYEESINEINELFERIALYSIKASALLAKDRGAYKAFKGSKWDQAIFFGKKREWYEANS KFKDEWNEAFYLVEANGLRNGELTAIAPNTSTSLLMGSTASVTPTFSRFFIEKNQRGAIP RTVKHLKDRAWFYPEFKNVNPISYVKIMAKIGSWTTQGVSMEMVFDLNKDIKAKDIYDTL MTAWEEGCKSVYYIRTIQKNTNNISEKEECESCSG >gi|224461263|gb|ACDC01000139.1| GENE 4 3528 - 4574 1016 348 aa, chain + ## HITS:1 COG:FN0103 KEGG:ns NR:ns ## COG: FN0103 COG0208 # Protein_GI_number: 19703451 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Fusobacterium nucleatum # 1 348 1 348 348 652 97.0 0 MKAVVDRKKLFNPEGDDTLNARKIIKGNSTNLFNLNNVRYQWANQLYRTMMANFWIPEKV DLTQDKNDYENLTLPEREAYDGILSFLIFLDSIQTNNIPNISDHVTAPEVNMLLAIQTFQ EAIHSQSYQYIIESILPKQSRDLIYDKWRDDKVLFERNSFIAKIYQDFIDEQSDENFAKV IIANYLLESLYFYNGFNFFYLLASRNKMVGTSDIIRLINRDELSHVVLFRSIVKEIKNDY PEFFSAETIYSMFKTAVEQEINWTEHIIGNRVLGITSQTTEAYTKWLANERLKSLGLEPL YSGFNKNPYKHLERFADTEGEGNVKSNFFEGTVTSYNMSSSIDGWEDF >gi|224461263|gb|ACDC01000139.1| GENE 5 4678 - 4869 309 63 aa, chain + ## HITS:1 COG:no KEGG:MGAS9429_Spy0565 NR:ns ## KEGG: MGAS9429_Spy0565 # Name: not_defined # Def: phage protein # Organism: S.pyogenes_MGAS9429 # Pathway: not_defined # 1 63 28 88 88 67 60.0 2e-10 MPMTSTEMIKLLLKNGFKQIPGGKGSHKKFFQESTGKFTVVPDHKQELGKGLEYKILKQA GLK >gi|224461263|gb|ACDC01000139.1| GENE 6 4906 - 5331 584 141 aa, chain + ## HITS:1 COG:SP1786 KEGG:ns NR:ns ## COG: SP1786 COG1598 # Protein_GI_number: 15901615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 134 2 143 150 80 33.0 8e-16 MLIYPAIFHRTIEGGYIVVFPDFDNGATEGQTLEQAMEMAEDYIGTYLYDDFIKGKDLPK 
ASNINEISIEIPEDEKEFYIEGKSFKTLVSLDMMKYVNECKSATIRKNVTIPSWLNEMGK NHNLNFSNLLQEAIKKELDIE Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:20 2011 Seq name: gi|224461262|gb|ACDC01000140.1| Fusobacterium sp. 2_1_31 cont1.140, whole genome shotgun sequence Length of sequence - 11264 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 8 - 316 356 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Term 332 - 370 -0.9 2 1 Op 2 . - CDS 396 - 1145 915 ## COG0500 SAM-dependent methyltransferases 3 1 Op 3 . - CDS 1169 - 1693 274 ## PROTEIN SUPPORTED gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 - Prom 1724 - 1783 11.0 + Prom 1720 - 1779 8.0 4 2 Tu 1 . + CDS 1810 - 2373 908 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 2382 - 2418 -0.7 - Term 2368 - 2403 5.1 5 3 Op 1 59/0.000 - CDS 2429 - 2830 656 ## PROTEIN SUPPORTED gi|237738729|ref|ZP_04569210.1| SSU ribosomal protein S9P 6 3 Op 2 . - CDS 2846 - 3280 742 ## PROTEIN SUPPORTED gi|237738730|ref|ZP_04569211.1| LSU ribosomal protein L13P - Prom 3308 - 3367 5.6 7 4 Op 1 1/0.000 - CDS 3384 - 4304 1115 ## COG2849 Uncharacterized protein conserved in bacteria - Term 4311 - 4358 9.7 8 4 Op 2 1/0.000 - CDS 4370 - 5200 1206 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 5223 - 5282 9.9 - Term 5269 - 5312 6.2 9 5 Op 1 1/0.000 - CDS 5324 - 6154 1121 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 6206 - 6265 13.4 10 5 Op 2 . - CDS 6268 - 7101 753 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 7351 - 7410 16.3 + Prom 7384 - 7443 15.2 11 6 Tu 1 . 
+ CDS 7644 - 11262 5912 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461262|gb|ACDC01000140.1| GENE 1 8 - 316 356 102 aa, chain - ## HITS:1 COG:BS_ydfQ KEGG:ns NR:ns ## COG: BS_ydfQ COG0526 # Protein_GI_number: 16077618 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 11 100 13 104 112 68 35.0 2e-12 MEKIKTYNDLLEKIKNEEKFLLYIKSEGCSVCEADFPKVKEITDKNNYLSYYIQADEMAE AVGQLNLYTAPVVILFYNGKEIHRQARFIDFSELDYRIKQTL >gi|224461262|gb|ACDC01000140.1| GENE 2 396 - 1145 915 249 aa, chain - ## HITS:1 COG:FN1919 KEGG:ns NR:ns ## COG: FN1919 COG0500 # Protein_GI_number: 19705224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 249 1 249 249 451 86.0 1e-127 MNYQDINAATIDRWIKEEDWEWGRAISHEEYIKALNGEWDVKLTPVKFVPHEWFGDFKGK KLLGLASGGGQQIPIFTALGAECTVLDYSDEQLASEKMVAEREKYKVNIVKADMTKALPF EDESFDIIFHPVSNCYIESVEPVFKECHRILKKGGILLCGLDTIINYVLDENEEKIVFSM PFNPLKNEEHKEFLKKMDCGYQFSHNLSEQLGGQLKAGFILTNIEDDTDGEGRLHEMNIP TFIMTRAIK >gi|224461262|gb|ACDC01000140.1| GENE 3 1169 - 1693 274 174 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|50365462|ref|YP_053887.1| acetyltransferase of 30S ribosomal protein L7 [Mesoplasma florum L1] # 3 171 2 169 170 110 40 6e-24 MDKIILVKPDLSYADEIIKYKEESLAESPIINGSAGLDRFSSIEIWFEELKKRSCEDTVP KGLVPSSTYLGVREKDNYIVGMIDIRHYLNEYLTQVGGHIGYGVRKTERNKGYAKQMLKL ALEKCKELKIKKVLITCDEDNIASEKVILSANAKLEDIRNIDGENKKRFWIDLQ >gi|224461262|gb|ACDC01000140.1| GENE 4 1810 - 2373 908 187 aa, chain + ## HITS:1 COG:MA2295 KEGG:ns NR:ns ## COG: MA2295 COG1853 # Protein_GI_number: 20091133 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Methanosarcina acetivorans str.C2A # 1 185 1 187 188 
131 36.0 7e-31 MRKTFSKKAVLLPLPVYIIGTYDENGKANAMNLAWGTQCGYHEVSLSIAKEHKTMKNILL KKEFTISLATKATKDIADYFGIESGNKVDKIEKSGVHVVKSENIDAPIIEEFPLTLECKV IEIQEELGDYRVIAEIINTLADESVLNEKGQIDVDKLELITFDSITNSYRVLGEKVGQAF KDGAKIK >gi|224461262|gb|ACDC01000140.1| GENE 5 2429 - 2830 656 133 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738729|ref|ZP_04569210.1| SSU ribosomal protein S9P [Fusobacterium sp. 2_1_31] # 1 133 1 133 133 257 99 3e-68 MAEKITQFLGTGRRKTSVARVRLIPGGQGVEINGKGMDEYFGGRAILSRIVEQPLALTET LDKYAVKVNVVGGGNSGQAGAIRHGVARALVLADDSLKAALREAGFLTRDSRMVERKKYG KKKARRSPQFSKR >gi|224461262|gb|ACDC01000140.1| GENE 6 2846 - 3280 742 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738730|ref|ZP_04569211.1| LSU ribosomal protein L13P [Fusobacterium sp. 2_1_31] # 1 144 1 144 144 290 99 3e-78 MKKYTFMQRKEDVVREWHHYDAEGQILGRLAVEIAKKLMGKEKITFTPHIDGGDYVVVTN VEKLVVTGKKLNDKVYYNHSGFPGGIRARKLGEILAKKPEELLMLAVKRMLPKNKLGRQQ LTRLRVFVGTEHSHTAQKPNKVEL >gi|224461262|gb|ACDC01000140.1| GENE 7 3384 - 4304 1115 306 aa, chain - ## HITS:1 COG:FN0248 KEGG:ns NR:ns ## COG: FN0248 COG2849 # Protein_GI_number: 19703593 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 306 1 308 308 450 82.0 1e-126 MKKGIILLALIFTACVNLDNIGGNSGGEVKEVNTTNVSTSKNYERKNGSLYVDNVLANGK QEYKEKNGVIIKGNFKDGLADGLQERYYPSGKLYGKINIVNNKVEGTETTYYENGKIISE LNYTQGKLISGKIYYENGDLLSQIEGKKMTIFYSSGKKLFTMDKTDLAVYHENGKEVFSN SAEGIKINGEPAKKSLLDMFSKENLVKTALYLLTSDTIQAEYKSGKPSIQLKGATAVMYY PSGKILLELSPSIDGSVSSKIYYENGQLMQVEDRDKNGRAVKVYDKAGNLIAENNFNKEH EIKQIY >gi|224461262|gb|ACDC01000140.1| GENE 8 4370 - 5200 1206 276 aa, chain - ## HITS:1 COG:FN0247 KEGG:ns NR:ns ## COG: FN0247 COG2849 # Protein_GI_number: 19703592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 14 276 1 263 263 307 64.0 1e-83 MKKILSALLLIFAMLLSACGGVKYELKDGLMFADGKEATGNFEFKTGKYKVKGNFVNGLP DGLFEEYYEDGSIMAKETFVNGEMTSKELYYKNGNLLGNFAENGDIKLYYDDGSLILSYD 
AEKEEYTYYHENGNPFMIGNSNETTLYNENNEVVSKLRDDDLTDIGATLKKLDDGTFELV KGDVVIAKIDANGEVINYLYSTGETLLTVNDSTGETEFFFKNGNTFMKEKEGGSILNYRD GKPLYEIDGDSENIYNEEGDKIVGGFDLVTDIKKLD >gi|224461262|gb|ACDC01000140.1| GENE 9 5324 - 6154 1121 276 aa, chain - ## HITS:1 COG:FN0247 KEGG:ns NR:ns ## COG: FN0247 COG2849 # Protein_GI_number: 19703592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 14 276 1 263 263 299 61.0 3e-81 MKKILSALLLLFAMLLSACGGVKYEFKDGILYADGKKASGTFEFNLNGFKAKGTFVNGLA DGLFERYYSDGSIMVKDTFSNGIYLKDEIYYKNGQLMANFSSETGLNLFYDDGQLVMASN PQTGETITYHENGNPLLVIGGATSTLYNENNEVLFKIENGQSTDIGATLNKLEDGSFELV KDGKVIAKIDANGQILTYLYSTGEILMVSNTASENTEIFFKNGQTFFKGDGVNSRFNYKN GVPLYESNGGEWKFFDREGKQIISNFDNITDVKKLN >gi|224461262|gb|ACDC01000140.1| GENE 10 6268 - 7101 753 277 aa, chain - ## HITS:1 COG:FN0247 KEGG:ns NR:ns ## COG: FN0247 COG2849 # Protein_GI_number: 19703592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 23 276 10 262 263 228 48.0 1e-59 MKKILLVLLLLFLIPVSISAKEKYEFKGGILYNDGKKVTGTFELIFEKYKAKGSFVNGLP DGIFERYYPDGSIMLKNTFVAGIRMTEETYYKSGKLFIKFSKKDNSLKVFYEDGNLVLSR NIKTDSYIIYHENGKPLMVFDSNVSTLYNENNEILFKLNSDESLDSQGDLKELKDGSYQL VKNNKVIATLDASGTIVTFLYSTGEPLMRLNDNNELFEVFFKNGNVFFETNANNFKINYK DGKPLYKTNRITEIFFNRDGEEIPNDLEKVIGIRKIK >gi|224461262|gb|ACDC01000140.1| GENE 11 7644 - 11262 5912 1206 aa, chain + ## HITS:1 COG:BMEI1873 KEGG:ns NR:ns ## COG: BMEI1873 COG5295 # Protein_GI_number: 17988156 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Brucella melitensis # 679 908 128 354 505 85 35.0 4e-16 MNNKERDEKFLKSWLKKKISITTSTVVSFLITGAIGGGAAYGAGNKPGTGGNNNSVAWGQ GSSVDNDSVAVGANASAVSASGGNNPTVTSKGVAIGNNAKAEGSVSIGDSASSTHFGVAV GYKAGVQNTTTGASFSNVTIGANTRVGVQGTQAAQGIAIGSGIQAHEGAWAKGDQSISIG ANTTASGDSSIAIGGDDLLSVSAKTSSYSDAKFDKNGNKIGGNSNNTASINNIFNTLTGR 
GTILHQGIGTDGKFYTAWRNTESGQGGVALGVKSISGDIALAIGTFSEAKGTNSVAIGTG AQTPQSGAVAIGGGSTTYGLQGRQITDADITLTDGTTMALTNFAGASGVLEGSMVSFGKA GNERQLKHIAPGEVSATSTDGINGSQLFAVAKKLGDDISKFKYVSIKSNDAGNKLNDGAT ANNAIAIGPNASTKVESAVSLGDGANVLPGPTKDKAGTLLQTVISSGSGVAIGKNATATQ AGVAIGDTSSTVTSGIAIGREAKVTNKYEAASGPYVVGDSQDGYLNYDRVQNPDNLQYSK TEATGNNIFSPDRYNGQGIAIGYKAESNMFGTSLGNSAVAKQGGLALGTFSRAEGATATA IGLGANSSGARGISMGRQASATTADSVAIGTGARGGATSAGGSVAIGGGAAATGTQAIAI GGLYGNDLYSSSATKDGAGNLTKNTQASGAGSIAMGVNAVANKDNALALGGSTQVFKNED VALGYGSKVTSAPTSVTSATVGGVTYGGFAGTNPNSSLSIGSAGNERQIKNVAAGEISAT STDAINGSQLYSVANKLSQGWTATADGNKIGAATPTAVKPGNTVVYSAGSNLQVKQTIDA TNGKQTYEYSLNKDLTGLDSVTTKKLTVPGTGGKDTVIDNNGINAGNNKITNVAPGVNGT DAVNKNQLDQKIGDNTIKLGGDKGTTGTQNLSQATGLQFNIKGGDGLETSASGTDVTVQL DTVTKQKLNKAVLPLKFSGDDYDPFDEVSTVVSKELGQKLEIVGGADTTDPTKLSNNNIG TMVDGTGKINIKLAKELKDLTSAEFKTPAGNKTVINGDGLTITPSAPGATPISVTKDGIN AGDKKVTNVAPGTISATSKDAINGSQFHGLAKNTIQLGGDKGTTTDTQTLDKTGGIKFDI VGANGITTEAKDGKVTVSVDPSKLSASNSKLSYTANGDTTKQEVKLSDGLNFTDGKLTTA SVAANG Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:21 2011 Seq name: gi|224461261|gb|ACDC01000141.1| Fusobacterium sp. 2_1_31 cont1.141, whole genome shotgun sequence Length of sequence - 534 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 533 753 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461261|gb|ACDC01000141.1| GENE 1 2 - 533 753 177 aa, chain + ## HITS:1 COG:PM0714 KEGG:ns NR:ns ## COG: PM0714 COG5295 # Protein_GI_number: 15602579 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Pasteurella multocida # 1 171 1208 1379 2712 63 37.0 1e-10 DAVNKNQLDTATNNLINKGMNFSADDYDPATPNTTVSKKLGERLEIVGGADKTKLSNDNI GSVVDNTGKINVKLAKELTGLTSAEFKTPAGDKTVINGDGLTVTPVAPGAAPISVTKDGI NAGNKTITNVAPGVNGTDAVNKNQLDQKIGDNKIKLGGDTGTTATQDLSQAGGLQFN Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:22 2011 Seq name: gi|224461260|gb|ACDC01000142.1| Fusobacterium sp. 2_1_31 cont1.142, whole genome shotgun sequence Length of sequence - 722 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 720 965 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461260|gb|ACDC01000142.1| GENE 1 3 - 720 965 239 aa, chain + ## HITS:1 COG:HI1731a KEGG:ns NR:ns ## COG: HI1731a COG5295 # Protein_GI_number: 16273668 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Haemophilus influenzae # 2 179 757 937 1020 87 43.0 2e-17 GLSVSPDGKAGLPNPATPGATTPNGLVTAQDVADALNNVGWKATADSTGTGIKTGTPSAQ LVKNGSTVSYVAGDNLTVAQDVTSGDHKYTYSLNKVLKDLTSAEFKTPAGDKTVINGDGL TVSPATPTTSPISVTKDGISAGDKKVTNVAPGTISKTSTDAINGSQLYNLASNTIQLGGD NASTTDKQTLDKSGGIKFDIVGANGITTEAKDGKVTVKVDSSTIGANSKLKYTANGDAP Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:22 2011 Seq name: gi|224461259|gb|ACDC01000143.1| Fusobacterium sp. 
2_1_31 cont1.143, whole genome shotgun sequence Length of sequence - 1299 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1299 1973 ## NT05HA_0523 autotransporter adhesin Predicted protein(s) >gi|224461259|gb|ACDC01000143.1| GENE 1 3 - 1299 1973 432 aa, chain + ## HITS:1 COG:no KEGG:NT05HA_0523 NR:ns ## KEGG: NT05HA_0523 # Name: not_defined # Def: autotransporter adhesin # Organism: A.aphrophilus # Pathway: not_defined # 117 344 1374 1597 2065 104 39.0 9e-21 TLSVSPDGKAGLPNPATPGATTPNGLVTAQDVADALNNVGWKATADSTGTGIKTGTPSAQ LVKNGSTVSYVAGNNLTVAQAVDTNGNHKYTYSLNKDLKDLDSVTTKTITIPGAPGTNDV VIGKDGISAGNKVIKDVAPGVNGTDAVNKNQLDTATNNLIDKGMNFSADDYNAATPNTTI SKKLGERLEIVGGADKTKLSDNNIGSVVDNTGKINIKLAKELKDLTSAEFKTPAGDKTVI NGDGLTVSPATLGTAPISITKDGISAGDKKVTNVAPGTISSTSTDAINGSQFHKLATNTI QLGGDNASTTDKQTLDKTGGIKFDIVGANGITTEAKDGKVTVKVDSSTIGANTKLKYKSN SDAATAQEVKLSDGLDFKNGNFTTATVGANGEVKYDTVTQGLTVTDGKAGLPNPVTPGAT TPNGLVTAQDVA Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:29 2011 Seq name: gi|224461258|gb|ACDC01000144.1| Fusobacterium sp. 2_1_31 cont1.144, whole genome shotgun sequence Length of sequence - 3377 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 3220 4857 ## COG5295 Autotransporter adhesin + Term 3242 - 3294 13.3 Predicted protein(s) >gi|224461258|gb|ACDC01000144.1| GENE 1 2 - 3220 4857 1072 aa, chain + ## HITS:1 COG:PM0714 KEGG:ns NR:ns ## COG: PM0714 COG5295 # Protein_GI_number: 15602579 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Pasteurella multocida # 293 1008 1261 1963 2712 134 29.0 1e-30 SAVGTGVASGSPSAQLVKNGSTVSYVAGDNLTVVQDVTAGDHKYTYSLNKVLKDLTSAEF KTAAGDKTVINGDGLTVSPATPGTAPISITKDGISAGDKKVTNVAPGTISSTSTDAINGS QFHKLATNTIQLGGDKGATATQQLDKTGGIKFNIVGENGITTTATGDKVTVGVDTNTIGA NIKLKYKSNSDAGTTQEVKLSDGLDFKNGNFTTATVGANGEVKYDTVTQGLSVSPDGKAG LPNPATPGATTPNGLVTAQDVADALNNVGWNATASAVGTGVASGTPSAQLVKNGSTVSYV AGDNLAVAQDVDASGNHKYTYSLNKQLKDLTSAEFVNPTSGNKTVVNGDGLTITPSTPGA KNISITKDGISAGDKKITDVADGDITPTSKDAINGSQLYKLASNTISLGGDNSTVTATQQ LNKNGGIKFNIVGDNGIITEAKDDKVIVRVNPATIGSNITLKYAANGANGQTVKLSDGLN FQDGNFTKASVDTAGKVKYDTVTQAIAPTADGTAQVAPGSTPGLATSADVVNAINNSGWK ATAGGNVTGTATPTVVKNGQEVEFNAGDNLKVKQTIDPTTGKQTYEYSLNKDLTGLNSAE FTNAAGDKTKITAGNTEYTNAAGDKTVVNAGGLTISSSTPGAKDISVTKDGISAGDKVIK NVAAGVNDTDAVNVSQLKDVDNKITNVNNTINKGLNFKGNTGATVNKQLGDTLEIVGEGT KADSEYSGQNLKVVEDGGKLVVKMDKNLKSDTVTADTVNTNSVTVGAPGKDGVITVKDAN GKDGVSINGKDGSIGLNGKDGSSATISTVQGNPGIAGTPGTTMDRIQYTDKAGTPHQVAT LDDGMKYGGDTGTVINKKLNQQVNVVGGITDTNKLSTKDNIGVVSDGSNNLKVRLAKDLD GLESVTVRDTSGNSTVVKGDGVTITSSSGDTVSLTDKGLDNGGNVITNVAAGKDGTDAVN VDQLNQTVSNVVNAAGDAIAHVNNKVDKLGDRVNKGLAGAAAMAGLEFMDIGINQATVAA AVGGYRGTHAVAVGVQAAPTENTRVNAKVSMTPGSRSETMYSVGASYRFNWR Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:34 2011 Seq name: gi|224461257|gb|ACDC01000145.1| Fusobacterium sp. 
2_1_31 cont1.145, whole genome shotgun sequence Length of sequence - 11171 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 3, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/1.000 - CDS 57 - 554 743 ## COG2849 Uncharacterized protein conserved in bacteria 2 1 Op 2 1/1.000 - CDS 578 - 1147 823 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 1184 - 1243 6.9 - Term 1234 - 1273 7.0 3 2 Op 1 . - CDS 1287 - 1856 857 ## COG2849 Uncharacterized protein conserved in bacteria 4 2 Op 2 1/1.000 - CDS 1870 - 2418 839 ## COG1658 Small primase-like proteins (Toprim domain) 5 2 Op 3 1/1.000 - CDS 2430 - 3818 2041 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases 6 2 Op 4 1/1.000 - CDS 3830 - 4018 344 ## COG4224 Uncharacterized protein conserved in bacteria 7 2 Op 5 . - CDS 4070 - 5308 810 ## COG0772 Bacterial cell division membrane protein - Prom 5334 - 5393 22.4 + Prom 5332 - 5391 9.3 8 3 Op 1 1/1.000 + CDS 5422 - 5961 523 ## COG2849 Uncharacterized protein conserved in bacteria 9 3 Op 2 1/1.000 + CDS 5927 - 6184 401 ## COG1605 Chorismate mutase 10 3 Op 3 4/0.000 + CDS 6181 - 6984 789 ## COG0169 Shikimate 5-dehydrogenase 11 3 Op 4 1/1.000 + CDS 6965 - 7408 588 ## COG0757 3-dehydroquinate dehydratase II 12 3 Op 5 . + CDS 7411 - 8172 1079 ## COG0708 Exonuclease III 13 3 Op 6 . 
+ CDS 8223 - 11168 4413 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|224461257|gb|ACDC01000145.1| GENE 1 57 - 554 743 165 aa, chain - ## HITS:1 COG:FN0026 KEGG:ns NR:ns ## COG: FN0026 COG2849 # Protein_GI_number: 19703378 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 11 165 1 155 155 197 70.0 9e-51 MKKILLALFVMCSALSFSAKVIKSDRIETKGNVVYEIGQKTPYTGVLENYNEKGIVDARA EFKNGVMDGYSKLYYPSGKLSSEATFKNGVQVGLQKDYYEDGKLKMELNYKNGKPEGLGR SYYPNGKVFIEENYKNGERDGIAKAYDENGKLLQQATFKNGQQIK >gi|224461257|gb|ACDC01000145.1| GENE 2 578 - 1147 823 189 aa, chain - ## HITS:1 COG:FN0026 KEGG:ns NR:ns ## COG: FN0026 COG2849 # Protein_GI_number: 19703378 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 11 189 1 155 155 177 59.0 8e-45 MKKILLALFVMCSALSFSAKVIKATDIDVKGNIVYEAGQNAPYTGFIDTYNEKNVLLART EFKNGIQDGSSKIYFPSGKLSSEATFKNGVQVGIQKDYYENGKVKIETTYKNGQKTGPAK IYDENGKLDTEVNLVNGKAEGLVKSYYPNGKIRTEENYKNNERDGIAKAYDENGKLVQQA TFKNGQQVK >gi|224461257|gb|ACDC01000145.1| GENE 3 1287 - 1856 857 189 aa, chain - ## HITS:1 COG:FN0026 KEGG:ns NR:ns ## COG: FN0026 COG2849 # Protein_GI_number: 19703378 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 11 189 1 155 155 171 59.0 6e-43 MKKLLLALFVMCSALSFSAKVIKATNAEVKGDVVYEAGQNAPYTGIIETYNEKGVLEAKT EFKNGIQDGSSKLYYPSGKLYSEATFKNGKQVGVQKDYYENGKVAAETTYKNSQPNGPAK IYDENGKLAVEFNLVNGKAEGLLKTYYPSGKVRTEENYKNDERNGVAKAYDENGKLVQQA NFKNGKQVK >gi|224461257|gb|ACDC01000145.1| GENE 4 1870 - 2418 839 182 aa, chain - ## HITS:1 COG:FN0039 KEGG:ns NR:ns ## COG: FN0039 COG1658 # Protein_GI_number: 19703391 # Func_class: L Replication, recombination and repair # Function: Small primase-like proteins (Toprim domain) # Organism: Fusobacterium nucleatum # 1 180 1 180 183 306 93.0 2e-83 
MKKKIKEVIVVEGKDDISAVKNAVDAEVFQVNGHAVRKNRSIELLKLAYENKGLIILTDP DYAGEEIRKYLCKHFPNAKNAYISRVSGTKDGDIGVENASPEDIITALEKARFSLDNSEN IFDLNLMMDYGLIGKNNSSDLRAELGSELGIGYSNAKQFMAKLNRYGISLEEFKKAYEKI IK >gi|224461257|gb|ACDC01000145.1| GENE 5 2430 - 3818 2041 462 aa, chain - ## HITS:1 COG:FN0040 KEGG:ns NR:ns ## COG: FN0040 COG0017 # Protein_GI_number: 19703392 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Fusobacterium nucleatum # 2 462 1 461 461 867 95.0 0 MMITVKDIFRHGEEYLNKEIELFGWVRKIRDQKKFGFIELNDGSFFKGVQIVFEEGLENF DEVSRLSIASTIKVKGTLVKSEGSGQDLEVKAKEIEVFQKADLEYPLQNKRHTFEYLRTK AHLRPRTNTFSAVFRVRSVLAYAIHKFFQENNFVYVHTPIITGSDAEGAGEMFRITTFDL NKVPKKENGEIDFSKDFFGKSTNLTVSGQLNVETYCAAFRNVYTFGPTFRAEYSNTARHA SEFWMIEPEIAFGDLGANMELAEAMVKYIIKYVMDNCPEEMEFFNSFIEKGLFDKLNNVL NNDFARVTYTEAIEILEKSGKKFEFPVKWGIDLQSEHERYLAEEYFKKPVFVTDYPKDIK AFYMKLNDDNKTVRAMDLLAPGIGEIIGGSQREDNYDLLVKRMDELGLNKEDYEFYLDLR RFGSFPHSGYGLGFERMMMYLTGMQNIRDVIPFPRTPNNAEF >gi|224461257|gb|ACDC01000145.1| GENE 6 3830 - 4018 344 62 aa, chain - ## HITS:1 COG:FN0041 KEGG:ns NR:ns ## COG: FN0041 COG4224 # Protein_GI_number: 19703393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 62 1 62 62 71 95.0 4e-13 MEMKDIIEKVNYYAKLSKERKLTEEEIKDREIYRRMYLDQFKAQVRKHLDNIEIVDEKDF KN >gi|224461257|gb|ACDC01000145.1| GENE 7 4070 - 5308 810 412 aa, chain - ## HITS:1 COG:FN0042 KEGG:ns NR:ns ## COG: FN0042 COG0772 # Protein_GI_number: 19703394 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Fusobacterium nucleatum # 35 410 39 415 417 441 64.0 1e-123 MKRKIDFNNRATDQVAFHNMMNEKDEEKRKEKKSKRRNNIISFFFILVMIGSLNFISSIS RFDNAKVMDKAFKQSIILGLSLVIFILMCSKKFGGLFNKSVSGPLFRFTFLVGSLILFIV VALGPSSIFPTVNGGKGWIRLGSLSLQIPELLKVPFVISIAGIFARGKDTNEKISYKKNF WTAFIYTGAFAAFITFALRDMGTAIHYVMIASFMLFLSDIPNRWLYPFFFGGILFSPVLL 
SLAAKLTSGYKQHRIKVYLEGILHNNYDRVDAYQIYQSLIAFGTGGIFGKGIGNGVQKYN YIPEVETDFAIANLAEETGFVGMFIVLFLFFTLFVLIMNIAGKSKNYFYKYLVSGIAGYI ITQVIINIGVAIGLIPVFGIPLPFISAGGSSILALSLSMGYIIYINNNHTTD >gi|224461257|gb|ACDC01000145.1| GENE 8 5422 - 5961 523 179 aa, chain + ## HITS:1 COG:FN0043 KEGG:ns NR:ns ## COG: FN0043 COG2849 # Protein_GI_number: 19703395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 24 179 3 158 158 178 57.0 4e-45 MKIKKISFLLVLLFSFNLFGANSKNISNISKLNVSKKSVSNGPVKTYYKNGKIKSKEYYT GNRKTGIWHYYHENGKIKTEVMFNALSKDEEAIVKTYDEKGIIISSGKVVNGEMVDIWTY YDEMGRKLNTYDLTKGVIVTYSEKGKVILQVSEKALLNRLEEIMVEVNNDRTRANEEKN >gi|224461257|gb|ACDC01000145.1| GENE 9 5927 - 6184 401 85 aa, chain + ## HITS:1 COG:FN0044 KEGG:ns NR:ns ## COG: FN0044 COG1605 # Protein_GI_number: 19703396 # Func_class: E Amino acid transport and metabolism # Function: Chorismate mutase # Organism: Fusobacterium nucleatum # 1 85 2 86 86 80 71.0 1e-15 MTELELMRKKIDEIDEKLLVLFKERLEVSKQIGILKKKYKMSIFDPEREKQIISEATEAM SDNEKKYTESFLHNLMDISKEVQSE >gi|224461257|gb|ACDC01000145.1| GENE 10 6181 - 6984 789 267 aa, chain + ## HITS:1 COG:FN0045 KEGG:ns NR:ns ## COG: FN0045 COG0169 # Protein_GI_number: 19703397 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Fusobacterium nucleatum # 19 267 1 249 249 355 81.0 3e-98 MRKFGLLGKKLSHSLSPLLHKVFFEEFGVEAEYKLYEVTETEIDNFKSYMLENSIEGVNI TVPYKKVFLDKLDFISDEAKAIGAINLLYIKDNKFYGDNTDYYGFKQTLISNQINPSGKK IAIIGRGGASASVYKVLKNMGAEDITFYFRKDKLSKIEFPENIEGDIIINTTPVGMYPNI EDNIVDEQILKKFKIAIDLIYNPLETKFLKIARENGLKSINGMEMLIEQALKTDEILYNI VLSDQLREKIIKKIIKRVKEFYENNGN >gi|224461257|gb|ACDC01000145.1| GENE 11 6965 - 7408 588 147 aa, chain + ## HITS:1 COG:FN0046 KEGG:ns NR:ns ## COG: FN0046 COG0757 # Protein_GI_number: 19703398 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Fusobacterium nucleatum # 1 147 1 147 147 248 82.0 3e-66 
MKIMVINGPNLNMLGIREKNIYGTFTYEDLCKYIETYPNYKEKDIDFTFLQTNHEGEIVD FIHKAYTEKYDGIVLNAGGYTHTSVAIHDAIKAVSIPTVEVHISNIHAREEFRKVCMTSP ACVGQITGLGKLGYVLAVVYLTEERKK >gi|224461257|gb|ACDC01000145.1| GENE 12 7411 - 8172 1079 253 aa, chain + ## HITS:1 COG:FN0047 KEGG:ns NR:ns ## COG: FN0047 COG0708 # Protein_GI_number: 19703399 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Fusobacterium nucleatum # 1 253 1 253 253 480 93.0 1e-135 MKLISWNVNGIRAAIKKGFLDYFNEQNADIFCLQETKLSAGQLDLELKGYHQYWNYAEKK GYSGTAIFTKEEPLSVSYGLGIEEHDKEGRVITLEFEKFYMITVYTPNSKDELLRLDYRM VWEDEFRKYLKNLEKKKPVVVCGDLNVAHKEIDLKNPKTNRRNAGFTDEERGKFTELLES GFIDTFRYFYPDLEHAYSWWSYRANARKNNTGWRIDYFVVSKALEKYLVDAEIHAQTEGS DHCPVVLFLDFKK >gi|224461257|gb|ACDC01000145.1| GENE 13 8223 - 11168 4413 981 aa, chain + ## HITS:1 COG:YPO3984_2 KEGG:ns NR:ns ## COG: YPO3984_2 COG3468 # Protein_GI_number: 16124111 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Yersinia pestis # 676 981 169 476 476 139 32.0 3e-32 MKNSKKVFGLFAFLLVCSNANAGSVGMPEVYPGIRYDYNNVNVSLPAFNFTENLTDGGNY IRVEGNSTLDIASNLDINLTSNIPSTGAYGNLVGFGAYGTNNGAPTINAKDVKIKVEAGA ADYHNEPFGMLIRDGANYSGGNIDINLITNSNHSASDITALSYGLDNENVTREYNSTMKV KDVNIKIVNNQVLTNGIDDNNLIGLDQMGEGNQNTNFVSTGNLNIDIDDKSNSAPYHIAA GIVIEGDSGTKMTLNNSNIKIKSKTNNDYLGGAIILGFPDYEATTTGQGATLESKGKMVL DTTEAPDVATLNLHGHGSLFKADFENSSAEIKSGGTAIRFAGISQALNADGEDTKPGRDL TISLKNAKITTSATAPHSAPLIVVEDGVKNATFNLSGPGSEAIAADQNELLLIKGNDTDV TLNIDNGAKIKGTIYRATSGEITTNITNNAVWSSPAGSGSTYSSNLTLKDGGTLDLTDET HPDTRGINYYEIKIFNRAEDGGKILNDNGIITMANTSYNDEVEIYGNYEGKNGAKIKMNT LWNAPGDADGANSQSDILKILGGGTSKYGFATGVTEIIPIALDGRVNIIEGNIQKVAQAV NTVPVVVADKAAAGTFVGTAQTTGAGEVQLTSKLNSNGQRVFFWTLNAIDGTNPYDDGTS RNYRLGKARTILNSSVAGYINTAKVNMDSGFTSLSTLHERRGENALDVNNKKGQAWARII GKHSKDEGKERFNYETDIYGVQAGYDFNIKNSEDGNRYTGFYFTNTTASTDFYDRYRAQN GIIASDKYTGKVKTKDFSLGLTTTKYYNNGFYLDLVGQLSFINNKYNSRDGVSAKQKGNA 
LALSVEGGKNYSLGSNWAIEPQAQLIYQYLNLKDFNDGVREVHHGNDSALRARLGFRTTY KKAFYSIANVWHDFSNTTEANIGSDRIKEKYSATWGEIGLGVQLPITNSAYVYSDIRYER SFTSNPKHKGYRGTVGFKYTF Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:37 2011 Seq name: gi|224461256|gb|ACDC01000146.1| Fusobacterium sp. 2_1_31 cont1.146, whole genome shotgun sequence Length of sequence - 7280 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 5 - 766 1072 ## COG0647 Predicted sugar phosphatases of the HAD superfamily - Term 792 - 824 3.3 2 2 Op 1 1/0.000 - CDS 835 - 1920 1621 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 1943 - 2002 5.5 3 2 Op 2 . - CDS 2022 - 2960 1413 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 2998 - 3057 10.0 + Prom 3071 - 3130 11.9 4 3 Tu 1 . + CDS 3192 - 4139 1133 ## HH0050 hypothetical protein + Term 4151 - 4189 1.1 - Term 4135 - 4180 9.0 5 4 Tu 1 . - CDS 4214 - 6790 1809 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 6812 - 6871 3.2 + Prom 6963 - 7022 15.6 6 5 Tu 1 . 
+ CDS 7058 - 7280 165 ## gi|294781981|ref|ZP_06747313.1| conserved hypothetical protein Predicted protein(s) >gi|224461256|gb|ACDC01000146.1| GENE 1 5 - 766 1072 253 aa, chain - ## HITS:1 COG:FN0048 KEGG:ns NR:ns ## COG: FN0048 COG0647 # Protein_GI_number: 19703400 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 252 1 252 252 380 81.0 1e-105 MKTYLIDLDGTMYSGNTNIDGAREFIAYLQKKGLPYIFLTNNATRTKTQAKEHMLNLGFK NIKEDDFFTSAIATAKYIAKNYSERKCFMIGESGLEEALKEENFIFVEDKADFVVVGLDR KANYTKYSEALHHILAGAKFIATNSDRLLANNGLFDLGNGATVNMLEYASGVEAIKVGKP YQTILNILLEDKNLKKEDIILLGDNLETDIKLGYEGNIETIMVCSGVHDENDIERLKVYP TKVVKNLRELIKN >gi|224461256|gb|ACDC01000146.1| GENE 2 835 - 1920 1621 361 aa, chain - ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 361 1 337 338 211 42.0 1e-54 MKKSLIALFILTSVLAFSEGNVKKVPFESMTRTDNGIAYFKDEKTPFTGIVEKKSKDGKL EAVISLKDGKLEGKTFTYYPNGKVKREETFQNALVNGTVKNYSENGILEYEANYKNDKID GLEKTYYPNGKVEKEISYKNGKIDGLSRHFSDKGILLAEAYFTEGQPNGISKEYYPSGKL MSEQTFLMGSLNGPAKLYYESGKIKISSNYKNDVLDGKSFQYQENGKLVEELSYQYNQLN GLTKMYDKDGKLEYETQYANDKKNGISKKYYPSGKLLSEVTFKDDKEVGVLKGYHENGKL EGEIPYNNGVVEGIVKVYHENGKISEEVTFKNGKKNGPMKIYDENGNLKRQSNFVDDRQV N >gi|224461256|gb|ACDC01000146.1| GENE 3 2022 - 2960 1413 312 aa, chain - ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 312 1 337 338 180 38.0 4e-45 MKKSLITLFLITSVLAFSEISVKKVPYESMIKSDDGIIFVEGDSAPYTGIIEKKSEDGRV ESTISIKDGKLEGIERNYFPNGKLKSEMSYKNGKLDGVSKVFSEEGVLLAEAYFTDGEPN GVSKEFYPNGKLKSEQNFSMGALNGTAKYFFESGKIYIISNYKNDTLDGKLTEYQEDGKV LQELLYEANQLSGLIKLYRDGHLEFETQYANNEKNGLSKKYYPNGRLLSEVIFKDGKEIG 
ILKGYSETGKLQGEIPYNNGVIEGIVKVYYENGKVQEESTYKNGMKNGITKFYDENGNFL KQANFVDDKQVD >gi|224461256|gb|ACDC01000146.1| GENE 4 3192 - 4139 1133 315 aa, chain + ## HITS:1 COG:no KEGG:HH0050 NR:ns ## KEGG: HH0050 # Name: not_defined # Def: hypothetical protein # Organism: H.hepaticus # Pathway: not_defined # 23 300 375 661 1086 234 47.0 3e-60 MKKLLFLLFMLILSISASSKNFKYHPKTKDELKELIENESIYLGDIDTSAITDMSYLFII DQKKIDACGTAYEYITTKRKNFSGIGKWNTSNVTDMEGLFFKMKDFNEDISAWNTSKVEN MISMFEDADSFNQALNNWDVSKVKTMKNMFRGAISFNQALNKWNVSEVIDMEEMFEAAYK FNQNINSWNVSKVKNMSYMFNGAKEFNQLLDKWNVSNVEDMTCMFRYTKKFNQPLNLWDV SKVKYMEEMFYGAESFNQSLNRWNVSNVKDMARMFCDAKEFNQDLSMWKVQGATDTVNMF LGSPLENRKPKWEGQ >gi|224461256|gb|ACDC01000146.1| GENE 5 4214 - 6790 1809 858 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 6 857 5 811 815 701 44 0.0 MMSPNQFTENTITAINLAVDISKGNMQQSIRPEALALGLLMQNDGLIPRVIEKMNLNLKY IISELEKEMNNYPKVEVKVSNENISLDQKTNSILNRAEMIMKEMEDSFLSVEHIFKAMIE EMPIFKKLGISLEKYMEVLMSIRGNRKVDNQNPEATYEVLEKYAKDLVELAREGKMDPII GRDSEIRRAIQIISRRTKNDPILIGEPGVGKTAIVEGLAQRILNGDVPESLKNKKIFSLD MGALVAGAKYKGEFEERMKGVLKEVEESNGNIILFIDEIHTIVGAGKGEGSLDAGNMLKP MLARGELRVIGATTIDEYRKYIEKDPALERRFQTILVNEPNVDDTISILRGLKDKFETYH GVRITDTAIVEAATLSQRYITDRKLPDKAIDLIDEAAAMIRTEIDSMPEELDQLTRKALQ LEIEIKALEKETDDASKERLKVIEKELAELNEEKKVLTSKWELEKEDIAKIKNIKREIEN VKLEMEKAEREYDLTKLSELKYGKLATLEKELQEQQNKVDKDGKENSLLKQEVTADEIAD IVSRWTGIPVSKLTETKKEKMLHLEDHIKERVKGQDEAVKAVADTMLRSVAGLKDPNRPM GSFIFLGPTGVGKTYLAKTLAYNLFDSEDNVVRIDMSEYMDKFSVTRLIGAPPGYVGYEE GGQLTEAIRTKPYSVILFDEIEKAHPDVFNVLLQVLDDGRLTDGQGRIVDFKNTLIIMTS NIGSHLILEDPALSESTREKVADELKARFKPEFLNRIDEIITFKALDLPAIKEIVKLSLK DLENKLKPKHITLEFSDKMVDYLANNAYDPHYGARPLRRYIQREIETSLAKKILANEVHE KSNVLIDLDDNHIVFKEI >gi|224461256|gb|ACDC01000146.1| GENE 6 7058 - 7280 165 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294781981|ref|ZP_06747313.1| ## NR: gi|294781981|ref|ZP_06747313.1| conserved hypothetical protein [Fusobacterium sp. 
1_1_41FAA] # 13 74 1 62 154 94 91.0 1e-18 MKNISVRYGGYRMGIFIVAIFLLLIGYLSISIAKISGEKANEIKKSLENNKENLLETKGT LELIKIEGGKNSRS Prediction of potential genes in microbial genomes Time: Thu May 19 23:51:52 2011 Seq name: gi|224461255|gb|ACDC01000147.1| Fusobacterium sp. 2_1_31 cont1.147, whole genome shotgun sequence Length of sequence - 24103 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 6, operones - 5 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 42 - 497 669 ## FN1784 hypothetical protein 2 1 Op 2 . - CDS 520 - 969 492 ## Lebu_0879 hypothetical protein 3 1 Op 3 . - CDS 992 - 1312 337 ## gi|262067393|ref|ZP_06027005.1| conserved hypothetical protein 4 1 Op 4 . - CDS 1332 - 1445 102 ## 5 1 Op 5 . - CDS 1472 - 1924 602 ## FN1784 hypothetical protein 6 1 Op 6 . - CDS 1951 - 2412 527 ## Lebu_0879 hypothetical protein 7 1 Op 7 . - CDS 2436 - 2891 591 ## FN1784 hypothetical protein 8 1 Op 8 . - CDS 2940 - 3398 537 ## Lebu_0879 hypothetical protein - Prom 3434 - 3493 8.5 + Prom 3393 - 3452 8.0 9 2 Op 1 . + CDS 3500 - 4480 1291 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 10 2 Op 2 2/0.000 + CDS 4474 - 4959 848 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 11 2 Op 3 . + CDS 4949 - 6301 1674 ## COG0534 Na+-driven multidrug efflux pump 12 2 Op 4 2/0.000 + CDS 6320 - 7882 2339 ## COG0286 Type I restriction-modification system methyltransferase subunit + Prom 7923 - 7982 5.8 13 2 Op 5 1/1.000 + CDS 8051 - 9106 1271 ## COG3943 Virulence protein 14 2 Op 6 4/0.000 + CDS 9103 - 10374 1613 ## COG0732 Restriction endonuclease S subunits 15 2 Op 7 . + CDS 10367 - 11017 644 ## COG0732 Restriction endonuclease S subunits + Term 11030 - 11078 9.1 + Prom 11058 - 11117 10.6 16 3 Op 1 . + CDS 11138 - 12400 1401 ## llmg_2248 putative abortive phage resistance 17 3 Op 2 . 
+ CDS 12469 - 12903 497 ## gi|237738770|ref|ZP_04569251.1| conserved hypothetical protein + Prom 12922 - 12981 10.9 18 4 Op 1 . + CDS 13171 - 15462 2677 ## Bsel_0210 hypothetical protein 19 4 Op 2 . + CDS 15473 - 17164 2110 ## Bsel_0211 labile enterotoxin output A + Term 17180 - 17214 1.7 + Prom 17186 - 17245 15.5 20 5 Op 1 . + CDS 17272 - 18894 1808 ## Lebu_0003 protein of unknown function DUF1703 21 5 Op 2 . + CDS 18912 - 21950 3518 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 22 5 Op 3 1/1.000 + CDS 22019 - 22501 582 ## COG2131 Deoxycytidylate deaminase 23 5 Op 4 . + CDS 22514 - 23167 715 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 24 6 Tu 1 . - CDS 23472 - 24101 832 ## COG3641 Predicted membrane protein, putative toxin regulator Predicted protein(s) >gi|224461255|gb|ACDC01000147.1| GENE 1 42 - 497 669 151 aa, chain - ## HITS:1 COG:no KEGG:FN1784 NR:ns ## KEGG: FN1784 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 151 1 151 151 234 86.0 7e-61 MKKLILGLFMILVASSYAVPSFVNSKRAEERGYKVVSDSEGSISMQKVEDESATTISYWY GVKNPDAAELNKILKEDASRDLQNKGSLKMGKAYVEKYVDGQNFMYTIVFKNAKPADTLT SVAYYTKKEIPKNELNKYVDKLLAESEKYIK >gi|224461255|gb|ACDC01000147.1| GENE 2 520 - 969 492 149 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0879 NR:ns ## KEGG: Lebu_0879 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 148 1 156 159 81 34.0 8e-15 MKKFILGLFLILGAVSIAAPEYVDVKGMEKKGYIIYKNKKDSLVAFKITQNDRIGVTLYF SDKDNAEYLTNTFKRSAPSALKFLDEFENNRAYIQRFKGELYIYNIIAKEQKVNGCYVTI TFSEIDDLTEEKLNEKIDILLDEVENFLK >gi|224461255|gb|ACDC01000147.1| GENE 3 992 - 1312 337 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262067393|ref|ZP_06027005.1| ## NR: gi|262067393|ref|ZP_06027005.1| conserved hypothetical protein [Fusobacterium periodonticum ATCC 33693] # 1 106 45 150 150 163 96.0 4e-39 
MAISDDDSAIIISFSTANISSKEISDLLKSNALHNSENFIATLDNNRAYVNEFEAQGFYS YIIVPKKEKVKNQHTYATYVSSKKLSKNDLSKITNAILDEAESYIK >gi|224461255|gb|ACDC01000147.1| GENE 4 1332 - 1445 102 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKFILGLFLILGAISFAAPKFVDMAKVKKCWLYNQK >gi|224461255|gb|ACDC01000147.1| GENE 5 1472 - 1924 602 150 aa, chain - ## HITS:1 COG:no KEGG:FN1784 NR:ns ## KEGG: FN1784 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 150 1 151 151 78 33.0 7e-14 MKKIILGLFLILGAVSFAVPSFIDATKLQKSGHGIIQDEANLFTIGSPKEDTALVISYYL TDKNPQELSDTIKADAPAGEVKFLSAIDNDQAYVNEFQSENFYSYVVVPKKQKLGKYKIY ATYATVKKLPKDAINSTVKSIINEAEGLIK >gi|224461255|gb|ACDC01000147.1| GENE 6 1951 - 2412 527 153 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0879 NR:ns ## KEGG: Lebu_0879 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 152 1 156 159 81 28.0 1e-14 MKKIILGLFLILGAVSFAMPSFINTTKVQKSGYSIIEDSETALTIAGADIVDGDSILVAS FYLSDMSPKELSDAIKAEAQQQDAKFVASFDNNRAYVNEFKHVDFYSFTIVPKKQKISKY HIYVTYMSPKKLSKEDINKVINATLNEAEGLIK >gi|224461255|gb|ACDC01000147.1| GENE 7 2436 - 2891 591 151 aa, chain - ## HITS:1 COG:no KEGG:FN1784 NR:ns ## KEGG: FN1784 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 151 1 151 151 80 33.0 3e-14 MKKIILGLFLILGAISFALPKNLDANKLKKAGYEVVRDEDSAVIFGKSTDDAGITVALFL GDVNAKGVNDSIKATAPKNQKFLSSKETKRAYISKYKDTQYNGFTYSVVAKNSKSKGTVI SFLYMTDKELKDADLDKAIDKTVNEIESFLK >gi|224461255|gb|ACDC01000147.1| GENE 8 2940 - 3398 537 152 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0879 NR:ns ## KEGG: Lebu_0879 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 151 1 156 159 95 35.0 4e-19 MKKIVLGLFLIFGALSFASPSFVDVDKIKQNSYEIYDDDDDFFTFVKSTDEAGISVTFTI IEGGNSKEVSDIIKSSTPDNQQFLSSINNKRAYINKFANNENGGFTYNFVAKNTKIKDCY ISVLYATDNELSNTELNNAVDKVLNEVESYLK >gi|224461255|gb|ACDC01000147.1| GENE 9 3500 - 4480 1291 326 aa, chain + ## HITS:1 COG:FN1786 KEGG:ns NR:ns ## COG: FN1786 
COG2870 # Protein_GI_number: 19705091 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 1 319 3 321 323 535 91.0 1e-152 MIKKLIEKFKNIKIAVIGDLMLDEYIMGKVERISPEAPVPVVKVTEEKFVLGGAANVINN LAALGANVYCGGLVGNDNNAEKLINAFPKNVDCNLILKADNRPTIVKKRVIAGHQQLLRL DWEEEFSINEEEENIIIENLKNHIKELDAIILSDYNKGLLTKSLSQKIINLCRENNVIVT VDPKPKNITNFVGASSITPNKKEAYLAVDANSREDIDIVGRKLKEQYKLDTVLITRSEEG MTLYDEGIHNIPTYAKEVYDVTGAGDTVISVFTLARAAGATWEEAAKIANAAGGIVVGKI GTSTVSEKELIETYNSIYNIGGTCKC >gi|224461255|gb|ACDC01000147.1| GENE 10 4474 - 4959 848 161 aa, chain + ## HITS:1 COG:FN1788 KEGG:ns NR:ns ## COG: FN1788 COG0245 # Protein_GI_number: 19705093 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Fusobacterium nucleatum # 1 157 1 157 160 274 94.0 6e-74 MLRIGNGYDVHRLVEGRRLMLGGVEVPHTKGVLGHSDGDVLLHAITDAIIGALGLGDIGL HFPDNDENLKNIDSAILLKKINNIMKEKNYRIVNLDSIIVIQKPKLRPHIDSIRDNIAKI LEIEPELVNVKAKTEEKLGFTGDETGVKSYCVVLLEKDNVR >gi|224461255|gb|ACDC01000147.1| GENE 11 4949 - 6301 1674 450 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 449 10 458 459 677 87.0 0 MLDKTSFRKSVLTFLLPIAIQNLINVAISSTDVIMLGRYSEVALSASSLAGQVQFILILL FFGIASGATVLTAQYWGKKDIKSIEKVLAIGIKIAFFVSIGFFIFAFFFSRTAMRLFSND EATILQGIRYLKIVSFSYLTTSISIVYLVTMRSVERVGVSTVAYATSFVSNLIINYLLIY GNFGFPEMGVEGAAIGTLVARIIELGIVFYYNSKNHHFVSIKWKYIKSLDPVLKKDFFKY SAPTMMNELLWAGGTAAGIAILGRLGTSIVAANSITSVVRQLAMVFAFGLANTAAVMVGK EIGKKDFHTAEIYAKKLLFYSFLSSLVGVALLYIAKPFIISKFALNAEVEDFLNHTINVL FYYIPLQSISAVLIVGVFRAGGDTKFALISDAIPLWCGSVLLSAIGAFYLGLSTKLVYIL IMSDEIIKLPLIIWRYRSRKWINNITRELK >gi|224461255|gb|ACDC01000147.1| GENE 12 6320 - 7882 2339 520 aa, chain + ## HITS:1 COG:XF2728 KEGG:ns NR:ns ## COG: XF2728 COG0286 # Protein_GI_number: 
15839317 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Xylella fastidiosa 9a5c # 1 518 1 522 525 731 67.0 0 MDNKKEQERAELHRTIWAIANDLRGSVDGWDFKQYVLGILFYRYISENLTNYINKGEIEA GNPDFNYADLSDEDAIVAKEDLIATKGFFILPSELFVNVRKRADKDENLNVTLHNIFTNI ENSANGTESENDLKGLFDDIDVNSNKLGGTVAKRNENLVNLLNGVGDMKLGDYQENTIDA FGDAYEYLMGMYASNAGKSGGEYYTPQEVSELLTKLTLVGKTEVNKVYDPACGSGSLLLK FAKILGKDNVRNGFFGQEINITTYNLCRINMFLHDIDFDKFDIAHGDTLTEPAHWDDEPF EAIVSNPPYSIKWEGDASQILINDSRFSPAGVLAPKSKADLAFIMHSLSWLAPNGTAAIV CFPGVMYRSGAEQKIRKYLIDNNYIDCIIQLPDNLFYGTSIATCIMVMKKAKTDNKVLFI DASKEFVKVTNSNKMTEKHINDIVEKFTKRENVEYISNLVDYEKIVEENYNLSVSTYVEK EDTSEKIDIVELNKEIQRIVAREEELRKEIDKIIAEIEIK >gi|224461255|gb|ACDC01000147.1| GENE 13 8051 - 9106 1271 351 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 22 346 13 336 345 227 39.0 3e-59 MKSDITSKNRISKKREEMEEVKFLLYSYLEDDVNVGVIVKNDTLWLTQKSMAELFGVGVP AISKHLKNIYESEELEKNTTISKMETVVNRGFVGEIKEEVEYYNLDAIISVGYRVNSIKA TRFRMWATKILREYIQKGFVLDDERLKQGETTFGKDYFKELLERVRSIRASERRIWQKIT DIFAECSIDYDKNSEITKDFYATIQNKFHYAIVGKTAPEIIYSKADNTKENMGLTNWKNS PDGRILKTDVIVAKNYLELDEIKRLERLVVGYFDYVEDVIERENTFTMEEFATSVNEFLE FRKYEILKNKGKVSKKQAIEKAEKEYDIFNKTQKIESDFDKQIENLKRGKK >gi|224461255|gb|ACDC01000147.1| GENE 14 9103 - 10374 1613 423 aa, chain + ## HITS:1 COG:jhp0726 KEGG:ns NR:ns ## COG: jhp0726 COG0732 # Protein_GI_number: 15611793 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Helicobacter pylori J99 # 1 410 1 444 454 129 26.0 1e-29 MSRLDELIKELCPNGVEYKKTKDIVQEKFWIMPETPNFIEEGIPYITSKNIKNGFIDFKD VKYVSVDDYNRISNNRKIKKDDMLITMIGTIGEVAIVEDEIDFYGQNLYLLRMNNEIILN KYYYYYITLNKIKRTLVEKRNTSSQGYIKAGNIENLLIPVPPLEVQEEIVRILDDYTKSV EELKEKLNAELITRKKQYSWYRDYLLKFENKIKIVKLGELFEFKNGINKEKSSFGKGTPI 
INYVNVYKKNKIYFEDLQGLVEATDDELIRYKVKRGDVFFTRTSETIEEIGFTSVLLEDI ENCVFSGFLLRARPLTDLLLPEYCAYCFSTSSMRNAIIRKSTYTTRALINGTSLSQIEIP LPPLEVQKRIVEVLDNFEKTCKELNIELSSEIEIKEKEYEFVRNYLLTFEEKSRQAILAC ELA >gi|224461255|gb|ACDC01000147.1| GENE 15 10367 - 11017 644 216 aa, chain + ## HITS:1 COG:XF2726 KEGG:ns NR:ns ## COG: XF2726 COG0732 # Protein_GI_number: 15839315 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Xylella fastidiosa 9a5c # 25 215 21 207 409 202 53.0 3e-52 MRSKQQAQNLIKILQYVYGYVEVRLGDIASIVRGNGLQKRDFTEEGVGCIHYGQIYTKYG MVAEKTISFVEESLAEKLRKVEKGDIIFAVTSENIEDLCKCVVWLGEDEIVTGGHTAILK HNQNSKFLAYYFQTEAFHSQKRKLATGTKVMDITATKLEEILISLPPLEEQQRIVDILDR FDRLCNDISEGLPAEIEARQKQYEYYREKLLNFKKL >gi|224461255|gb|ACDC01000147.1| GENE 16 11138 - 12400 1401 420 aa, chain + ## HITS:1 COG:no KEGG:llmg_2248 NR:ns ## KEGG: llmg_2248 # Name: not_defined # Def: putative abortive phage resistance # Organism: L.lactis_MG1363 # Pathway: not_defined # 38 365 31 378 440 64 22.0 9e-09 MKIFMLSFKVNGVKNIEKDIEINFYNKTLKRFSPCGSNIKGIFGPNGIGKTSIIKGMDIL RKISLNDNYLTNDFNLIILDKIINKKIEKASLEIEFLVIDDKKKKSRYVHSITIAITSPK EIKILFENIKKKDPNTDQVVGEILIENGIIKNDSLHKDDLKSEIVDITKNLLEKRSIVNI VKPSVLKSIDLEKIRYFYRKLHIKIDREDSHLGYALMDNPLKDDIPFNDSIGNYDMIISK NNLPIFEDYLRRMTEFLKIFKPNLRNIEYEKKEGKEEYYINILFVYNDYKVNYEFESVGI KNLFSLFTYFRALSEDEVVVIDEIDTSIHDIYLNKLIEFFAVDGKGQLVFTAHNITLLQT LKKYKHSIDFINENMEVISWIKNGNSSPFKSYKDGYIKGLPFNIKEYDFLEIFSQESDVE >gi|224461255|gb|ACDC01000147.1| GENE 17 12469 - 12903 497 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738770|ref|ZP_04569251.1| ## NR: gi|237738770|ref|ZP_04569251.1| conserved hypothetical protein [Fusobacterium sp. 
2_1_31] # 1 144 17 160 160 209 100.0 6e-53 MLKEKNPDINFKFTSIPIKGKTNFRDEKYIDKVEKTKLKFQGESQVLYVVDTDDVDTSKE DLELLEKITEHVKKQDWHFVFFNRDIEEVLNKKADRKKKMKEARSYTEKKFYEVDKNNLK VRDYLIRGTSNLFSVISEEIEMKI >gi|224461255|gb|ACDC01000147.1| GENE 18 13171 - 15462 2677 763 aa, chain + ## HITS:1 COG:no KEGG:Bsel_0210 NR:ns ## KEGG: Bsel_0210 # Name: not_defined # Def: hypothetical protein # Organism: B.selenitireducens # Pathway: not_defined # 12 761 6 745 753 339 32.0 3e-91 MILSEDVFKPTEIYAFILTKVKELREHVLKMKLSDKEAKEKQEEIKKGLDSIVNELEKKI KELKNNSEWDKFTIAFYGETNAGKSTLIETLRILLNEKEKLKDREKYKEIDNCINSLKDE REVYDNKIKESVQKYEQALNSIMENLKKSEIELDDLKENLKLLKDSDDEFQVELNDIKEM INKEKSSSFKNFILWLFKRLPEQKNLPVIKDKIKGNSLKIKEIETNEKTINRKIEKMNKE TSTLESTKNKEIDECNKKIVLLDKKIQNTDEEIEKYCDGKIIGDGSSDYTRDVTEYEIEY NGQKFCLLDLPGIEGNEKIVLSNINRAVKKSHAVFYISASPNPPQSGNKENSGTIEKIKD HLGDQTEVYFIYNKKIKNPKMLKYDLIDYDEEDSLEETDKVLSSILESQYSGNISLSAYP AFLAIGNCCNRDRTSKIKFLENLNAENILSISQVEKFKNWLTESFVTNTKDKIKRANYKK VYSVIDETTIKIQEQNRILNIVKEALIRNSDNTASNLDSVLIGMKRKFRTELDHSLDEFE QNLRASVYGEIDRCIDNTEFKQIFESKYEEYSEKLSSNLQTRFETLNEEFVSEVKNTLEK HNKTREELIETYNTRYSIDKKFDFKLNLKSGINKTGLIISIGTTIAGIIMAMSNPAGWIV IGLAILGGIISLIKSVWGFFDNDYRAGQQRNTTNSKISEIKGSLRQEIEKKLPEIDQSLT RAIDETKEMLLAEKKDFDDLIDTFENAKNEFDKLSLNIQNKNN >gi|224461255|gb|ACDC01000147.1| GENE 19 15473 - 17164 2110 563 aa, chain + ## HITS:1 COG:no KEGG:Bsel_0211 NR:ns ## KEGG: Bsel_0211 # Name: not_defined # Def: labile enterotoxin output A # Organism: B.selenitireducens # Pathway: not_defined # 1 558 1 555 556 425 42.0 1e-117 METLKVFNKKKEDVFKMLDNLLLVLKEGKELGVDIEPEYITKIEKSIDENEDKKLKVVLI GGFSEGKTSIAAAWLEEYDKNKMKIAMTESTDDIREYNVGNINLVDTPGLFGFKETANKE KYKDITRKYISEANIVLYVMNSDNPIKESHKEEIQWLFKDLNLLPRTVFILSRFDNVADI EDENDYNGMLKIKRENVLKRLENFEVIDAEESHKLPIVAVAANPFDEGIEYWLGKLDEFK KISHIDSLQHATSDIVEKNGGVDSVLLETQKSIIKEVLELKIPLATKKVEKLDKEYKNLN NVIDQMNGDSANLKTKIFDVKKYLKNFVIELFRDLILQLKGTTLETFNDFFERNLGDKGI ILSNKILDEFDKKVFDIETDVQRLEKNLNNEIDNFKEFTQNSSLEKFKVGSQMMKMANLN LTNASVIAVRDFLNLSIKFKPWQAVKIAKFINNGVPIIGSIAGIAFELWDSYSKKQKEDE 
FLRTKNKIKEDFEKQREEYVELIENPEEFDKKIFRSYFKLQNNIVDLSNELSEKEKEKEI FKNWMEKAKVIDDNLKAIKVNEE >gi|224461255|gb|ACDC01000147.1| GENE 20 17272 - 18894 1808 540 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 540 3 545 545 594 60.0 1e-168 MKRLAIGIDDFRKIIKEDCYYVDKTKFMEAVLEDASNVKLFTRPRRFGKTLNMSMLKYFF DVRDSEENRKLFNGLDIEKSRYINEQGKYPTILISLKSIKYETWEESLEQLKSLLSNLYN EFEYIRECLNESEIELFNDIWFKKENGEYANSLKNLTSFLYKYYKKEVILLIDEYDIPLI TAHKYGYYDEIINFYKIFLGEALKTNQYLKMGVLTGIIRVIKTGIFSDLNNLKVYSILEK KYSEFFGFTEEEVKKALQYFNIEEELVNVKYWYDGYKFGNSELYNPWSIINFLDGRELKN YWVGTSENFLIKNILENSTSRTNEILDKLFNEEEVEEAITGTSDLSILMDSKEVWELLLF SGYLTVKEKLDDDIYSLKLPNMEVKKLFKKEFINVHFGISLFRKTMEALKNLNFNDFEKY FQEIMLKSTSNWDTSKEAFYHGLSLGMLSYLDNDYYVTSNFEAGFGRYDVVLEPKNRNDR AFILEFKVAEAENKLEKLSKEAIKQIEEKKYDINLKSKEIKEITSVGIAFYGKKLKVSYK >gi|224461255|gb|ACDC01000147.1| GENE 21 18912 - 21950 3518 1012 aa, chain + ## HITS:1 COG:XF2725 KEGG:ns NR:ns ## COG: XF2725 COG0610 # Protein_GI_number: 15839314 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Xylella fastidiosa 9a5c # 6 1010 10 1006 1007 1112 57.0 0 MSSVDYNMLISTLESTVVTEYIREDIPAYSYQSEADLEREFIKNLQNQGYEYLNIHNEKE LIANLKDKLEKLNNIIFSEKEWERFFKEKIANKNDSIVEKTRTIQEDYVKSFTRDDGSLV NISLINKKNIHNNFLQVINQYEEEGGNHNTRYDVSILVNGLPLIHIELKRRGVAIREAFN QINRYQRDSFWAGSGLFEYVQIFVISNGTNTKYYSNTTRARHIKEMSFNRKKVKKSSNSF EFTSYWADANSKSITDLVDFTKTFFAKHTILNILTKYCIFDTNETLLVMRPYQISATERI LSKIQLANNYKWAGKIDAGGYVWHTTGSGKTLTSFKTAQLASQLDYIDKVLFVVDRKDLD SQTQKEYDRFSKGSANGNTSTKILKAQLEDKYENKSKIIITTIQKLGYFIKQNKNHEVFR KNIVLIFDECHRSQFGELHLAIAKTFKNYFMFGFTGTPIFPKNSNGSSKTLFKTTEQTFG DKLHTYTIVNAINDGNVLPFRIDYINTIKEKENIQDKKVNAIDIEKAMSDPNRIKEVVSY IIDHFEQKTMRNKHYELKDQRLSGFNSIFAVSSIPVAKKYYFELKKQLKEKNKDLRVATI FSYSVNEEENTDNLDDESFDTENLDLGSREFLEEAISDYNKMFGTNYDTSSDGFQLYYEN 
LSKRTKDKEIDILIVVNMFLTGFDATTLNTLWVDKNLRMHGLIQAFSRTNRILNSIKTFG NIVCFRDLQKETDEAIALFGNKEAGGIVLLKTYEDYYNGYQDDKGREKEGYSQLIEELQS KFPLSEQITGESNKKEFVILFGNILKIKNILSAFDKFPGNEILSEREFQDYQSIYLDMYQ EIRSKNKEKEIINDDIIFEIELIKQVEINIDYILMKVTEYYKSNKEDKEILIDIKKAINS SLELRSKKELIEGFIEKINSSKNITDDFQKFVREEKEKDLEKVIEEEKLKPEETKKFIDN SLRDGNFKTTGTDIDKLLPPVSRFSSGNRGLKKQGVIDKLKGFFDKYLGLTV >gi|224461255|gb|ACDC01000147.1| GENE 22 22019 - 22501 582 160 aa, chain + ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 1 160 14 173 174 297 86.0 6e-81 MRENYIDWDSYFMGIALLSSMRSKDPNTQVGACIVNEDKRIVGVGYNGLPKGCEDTDFPW EREGEFLETKYPYVCHAELNAILNSIKSLKDCVIYVALFPCNECSKAIIQSGIKEIVYLS DKYDGTDTNRASKKMLDSAGVKYRKFTPNMDKLEIDFKNI >gi|224461255|gb|ACDC01000147.1| GENE 23 22514 - 23167 715 217 aa, chain + ## HITS:1 COG:FN1901 KEGG:ns NR:ns ## COG: FN1901 COG0664 # Protein_GI_number: 19705206 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 1 217 1 217 217 291 80.0 5e-79 MLEALKESVVFNKIQRENIKKILEETKYEIKTYSPNETIAFRGDEVKGLYLILKGTLSTE MLTEEGNVIKIEELVKSDVIASAFIFGSKNCFPVDLKAKEKAEVLFIERKEFLKLLFSQE QILENFLNEISNKTQLLTTKIWNNFNNKTIKKKFCNYVNRKQEKGEFIIESLGALAEFFG VERPSLSRVLSDLVKDEKLERIGRNRYKILDKEFFEI >gi|224461255|gb|ACDC01000147.1| GENE 24 23472 - 24101 832 209 aa, chain - ## HITS:1 COG:FN1900 KEGG:ns NR:ns ## COG: FN1900 COG3641 # Protein_GI_number: 19705205 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Fusobacterium nucleatum # 1 209 122 330 330 296 92.0 1e-80 ILLPMTTIIFGCLLGKFFAPYISAVISEIGVIVNKTTELRPILMGLTMSVIMGIILTLPI SSAAIGISLGLSGLAAGASLTGCCCQMIGFAVMSYDDNDLGTVFSIGFGTSMIQIPNIIK NPMIWIPPIVSSAILGVLSTTVFNLSSNSIASGMGTSGLVGQIATFSVNGMSYLPTMIIL 
HFLLPAIITFIVYKILKKKGYIKPGDLKI Prediction of potential genes in microbial genomes Time: Thu May 19 23:52:58 2011 Seq name: gi|224461254|gb|ACDC01000148.1| Fusobacterium sp. 2_1_31 cont1.148, whole genome shotgun sequence Length of sequence - 8347 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 355 376 ## COG3641 Predicted membrane protein, putative toxin regulator - Prom 407 - 466 13.4 + Prom 576 - 635 9.2 2 2 Op 1 . + CDS 694 - 1938 1972 ## FN1590 lipoprotein + Term 1966 - 2011 3.3 + Prom 1946 - 2005 3.2 3 2 Op 2 21/0.000 + CDS 2031 - 3614 219 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 4 2 Op 3 11/0.000 + CDS 3616 - 4635 1595 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 5 2 Op 4 . + CDS 4628 - 5716 1321 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 6 2 Op 5 . + CDS 5732 - 6058 448 ## FN1895 hypothetical protein + Prom 6065 - 6124 11.0 7 3 Op 1 . + CDS 6272 - 6979 590 ## COG2992 Uncharacterized FlgJ-related protein + Term 6985 - 7031 9.1 + Prom 6981 - 7040 4.0 8 3 Op 2 . + CDS 7074 - 7520 464 ## COG0456 Acetyltransferases + Prom 7547 - 7606 10.2 9 4 Tu 1 . 
+ CDS 7634 - 8056 677 ## FN0106 hypothetical protein Predicted protein(s) >gi|224461254|gb|ACDC01000148.1| GENE 1 1 - 355 376 118 aa, chain - ## HITS:1 COG:FN1900 KEGG:ns NR:ns ## COG: FN1900 COG3641 # Protein_GI_number: 19705205 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Fusobacterium nucleatum # 1 118 1 118 330 169 91.0 2e-42 MKNFFIKSLNGMAFGLFSSLIVGLILKQIGILFNIEFLTYLGGFSQLLMGAGIGVGVAYA LESHVLILIASAITGMYGAGSINFVEGQAILKVGEPMGAYFSVIFGLLIAKRIAGKTK >gi|224461254|gb|ACDC01000148.1| GENE 2 694 - 1938 1972 414 aa, chain + ## HITS:1 COG:no KEGG:FN1590 NR:ns ## KEGG: FN1590 # Name: not_defined # Def: lipoprotein # Organism: F.nucleatum # Pathway: not_defined # 4 414 2 411 414 718 84.0 0 MKTKRILFSILAVFMFVLVAACGKKEAPTDDANAQKEGAATEVSQNYHIGIVTTSVSQSE DNFRGAEAVAKKYGLSNEGGKVTVVTVPDNFMQEQETTISQIVSLADDPEMKAIIVSESV PGTYPAFKTIKEKRPDIILIANNCHEDPVQVSTVADVVLNPDSISRGYLIVKTAHDLGAT KFMHISFPRHLSYEVISRRRAVMEQTAKDLGMEYIEMSAPDPVSDVGVPGAQQFILEQVP NWIAKYGKDTAFFATNDAQTEPLLKQIAAYGGYFIEADLPSPTMGYPGALGIEFSDDEKG NWPKILEKVEKAVIEAGGAGRMGTWAYSYNFSTTEGLTDLAIKAIESGNREFTLDKVLAS LGEQTPGSKWNGSLMKDNNGVDIPNSFFVYQDTYIFGKGYMGITSVEIPEKYTK >gi|224461254|gb|ACDC01000148.1| GENE 3 2031 - 3614 219 527 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 277 503 12 221 318 89 31 1e-17 MSDTILKIEKLSKSFGENTVLKDINLELKAGEILGLVGENGAGKSTLMKIIFGMDVIKET GGYNGKIFFDGQEVNFNSPFDALQAGIGMVHQEFSLIPGFKSSENIVLNRESTKKNAIAY LFGDSVNKIDQNENQKRSEKAISKLGVNLSGQEQINEMAVAYKQFTEIAREIEREHTKLL VLDEPTAVLTEDEAQILLETMKKLSDKGISIIFITHRLNEIMAVSDKVTVLRDGELINTV PTKSTSVNEITEWMIGRKVNSTAEEKKVAHDDIETLLEIRDLWVDMPGELLKGLDLDIKK GEILGLGGMAGQGKIAVANGIMGLFKSKGDVKYKNEALVLNKPTYPLEKGIFFVSEDRKG VGLLLDESIERNIAFPAMEIKKQFFKKFLGFFNLIDDKAVTENAKKYIEKLEIKSMGEKQ KVGELSGGNQQKVCMAKAFTMEPELLFVSEPTRGIDVGAKQLVLETLKEYNRERNTTIVV TSSEIEELRSICDRIAIINEGKLAGILPASAGILEFGKLMSGIKEGK >gi|224461254|gb|ACDC01000148.1| GENE 4 3616 - 4635 
1595 339 aa, chain + ## HITS:1 COG:FN1897 KEGG:ns NR:ns ## COG: FN1897 COG1172 # Protein_GI_number: 19705202 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 1 339 1 339 339 535 91.0 1e-152 MIKKFGLPRLIILIFLISTYIIAPFVGIPITTALSDTMIRFGMNAILVLSLMPMIESGAG LNFGMPLGIEAGLLGSLLSIELGFSGFVGFALAILISIVFAYIFGWAYGVVLNKVKGGEM MIATYIGFSSVAFMCIMWLILPFKKPDMIWAYGGSGLRTTISVETYWKGILNNVFGKISE AIPVGEIIFFLLLAFIMWVFFRTKAGLSMSAVGKNEKFAQATGIDANKSRKQSVIISTII AAIGIVVYQQSFGFIQLYLAPFNMAFPAIAAILIGGASVNRVTIWHVMIGTFLFQGILTM TPTVVNAVIKTDMSETIRIIVSNGMILYALTRKDGGSRG >gi|224461254|gb|ACDC01000148.1| GENE 5 4628 - 5716 1321 362 aa, chain + ## HITS:1 COG:FN1896 KEGG:ns NR:ns ## COG: FN1896 COG1172 # Protein_GI_number: 19705201 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 23 362 1 340 340 535 92.0 1e-152 MDKNNKVKNFILDNSVPILILIMVAIMFPLSGLSGDYLVREMIERISRNLFLIMSLLIPI VAGMGLNFGIVLGAMGGQLALILVTNWHIMGLQGVFLAMILSMPFSILLGYVGGVILNRA KGKEMITSMILGYFINGVYQLVVLYSMGKIIPVSDRTLLLSSGRGIKNTVDLTEISKAVD NAIPLKIFGYDIPVLTLLFIVGLCFFIIWFRKTKLGQDMRAVGQDMEVSKSAGIEVNKVR IYSIVISTVLAGIGQVIYLQNLGTINTYNSHEQIGMFSVAALLIGGASVARATIPNAIGG VILFHTMFVVAPRAGKELMGSSQIGEYFRVFISYGIIALVLIIYEWRRKKEKEREREKAI GF >gi|224461254|gb|ACDC01000148.1| GENE 6 5732 - 6058 448 108 aa, chain + ## HITS:1 COG:no KEGG:FN1895 NR:ns ## KEGG: FN1895 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 107 1 107 109 144 77.0 1e-33 MKHTLKVAIIVLILVVISVILFVTGKRHDILIENNSIAGIKYSINGEPYKTLDAGKKALG ISKGVGNVIFIKTADNKVIEKELPSKNINLFINQAINNGEDWYKESEK >gi|224461254|gb|ACDC01000148.1| GENE 7 6272 - 6979 590 235 aa, chain + ## HITS:1 COG:FN1894 KEGG:ns NR:ns ## COG: FN1894 COG2992 # Protein_GI_number: 19705199 # Func_class: R General function prediction only # 
Function: Uncharacterized FlgJ-related protein # Organism: Fusobacterium nucleatum # 33 235 1 203 203 290 82.0 2e-78 MKKYLLAVVFLCLSILSYSNDTEALDQDTNTGIITQAKDFAKVKGKSKKQIFIDTLIPTI EKIRTKIAEDKEYVKTLIEKEILTTEEKLYLEEMYIKYKVKSKSKTELVHKMVVPPTSFI LGQASLESGWGNSKLAKEGNNLFAVRSSLRDPEKTVNLGPNQYYKRYESLEESLMDYVMT LSRHSSYSNLRKAINNGEQTIVLIKHLGNYSEMKNLYEQRLTQIITKNNLFKYDN >gi|224461254|gb|ACDC01000148.1| GENE 8 7074 - 7520 464 148 aa, chain + ## HITS:1 COG:FN2046 KEGG:ns NR:ns ## COG: FN2046 COG0456 # Protein_GI_number: 19705336 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Fusobacterium nucleatum # 1 148 1 148 149 213 75.0 1e-55 MELIHIENPNFEIMQKIIELEESAFEGAGNVDLWIIKALIRYGMVFIVKEGDKIVCIVEY MQIFNKKSLFLYGISTLKEYRHKGYANFILNETEKILKDLGYTEIELTVAPENQIAIDLY KKHGYKQESFLKDEYGTGVDRFMMKKFL >gi|224461254|gb|ACDC01000148.1| GENE 9 7634 - 8056 677 140 aa, chain + ## HITS:1 COG:no KEGG:FN0106 NR:ns ## KEGG: FN0106 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 140 1 140 140 214 81.0 8e-55 MEAKKEFLRMINECDEIALATSIHDMPNVRIVNYYYDQDNNIMYFATYKGREKISEFWKN NNVAFTTIPMKKGVREQVRARGHVRESEKTIIDLREEFSNKMSGFAEIIDKYSEELKVYE IRFTEATVTLDSRTYEKISL Prediction of potential genes in microbial genomes Time: Thu May 19 23:53:10 2011 Seq name: gi|224461253|gb|ACDC01000149.1| Fusobacterium sp. 2_1_31 cont1.149, whole genome shotgun sequence Length of sequence - 7255 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 250 - 309 10.7 1 1 Op 1 . + CDS 329 - 1762 1969 ## COG0591 Na+/proline symporter + Prom 1766 - 1825 6.1 2 1 Op 2 . 
+ CDS 1845 - 3623 2508 ## COG2849 Uncharacterized protein conserved in bacteria + Term 3632 - 3670 7.0 + Prom 3640 - 3699 9.0 3 2 Op 1 1/0.000 + CDS 3729 - 4091 424 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 4 2 Op 2 1/0.000 + CDS 4118 - 5392 1848 ## COG1114 Branched-chain amino acid permeases + Prom 5453 - 5512 5.1 5 2 Op 3 . + CDS 5561 - 6769 889 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 + Term 6775 - 6825 6.4 + Prom 6790 - 6849 6.9 6 3 Tu 1 . + CDS 6941 - 7253 378 ## gi|262067361|ref|ZP_06026973.1| hypothetical protein FUSPEROL_01637 Predicted protein(s) >gi|224461253|gb|ACDC01000149.1| GENE 1 329 - 1762 1969 477 aa, chain + ## HITS:1 COG:FN0107 KEGG:ns NR:ns ## COG: FN0107 COG0591 # Protein_GI_number: 19703455 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Fusobacterium nucleatum # 1 472 1 472 482 770 90.0 0 MASYEIFITFGIYLVFLMAIGVYFYSKTTTHESYVLGDRGVGYWVTAMSAQASDMSGWLL LGLPGAVYTSGLTEIWVVIGLALGTYLNWKFVAPALRVQTEKYNSLTVPSFISQKLNDKK GYIRTFSAIVILFFFTIYSASGLVASGKLFDSLLGIDYKWGVLIGGGTIIVYTFLGGYLA TCWTDFFQGCLMFFAIIVVPVAAYYSGGGIDGISTAMEAKDISLNIFKYTKVWSLPIIIS GLGWGLGYFGQPHIIVRFMSIDSADELWKSRLIAMIWVFISLLGAIAVGITGIGVFTDIS QMGGDAEKVFIFLIHKLFNPWMAGILFAAILSAIMSTISSQLLVSSNTLTEDFYKHIVKR EKTHKEMIWVGRLCVIVIFVIASVLAMNPSSKVLELVSYAWAGFGGVFSPVILFTLYKKD LHWETVLVSMIIATITVIAWKTSGLSNTIYEMVPAFVINSISIYLLEKFKVFGNNEK >gi|224461253|gb|ACDC01000149.1| GENE 2 1845 - 3623 2508 592 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 6 119 130 242 245 89 44.0 3e-17 MKNREYYTNGNLKAEYERNERGEKEGYELLYYESGVLRAEYHYKAGKLHGVTKEYYENGN LVAEGNYRNGMLEGLSRIYYESGKLKAESSYKNDALDGLCKMYYESGQVKAEYYYRDGSL EKTISSPKDNADAKVTNDKEFFDVDYEDGQLNLKLDLNTLLKSNLSKKDICKISYEDNEL 
KLKIYDEDTKETKQIPIDKEKNVKAVQEETKKVETEKVEAKKEEPKVQNLVSPKKETKKD ELEIPSFLKSRYEDELESEPEVKEAEPEAKRVFDPENDLDILEMVKTKREEKPELKFAKE TEKKPKIKKIIKKIDSEEDEIADIRSILDTSDIDTEEEIVRNRGNKVNKGKKKNSTTTVA PPPSRKKDSLEDQKKSVLKMIFFTLFLLAIGILFYFLYQKFTSEDTESLILDKKGTVAEE NATEETDEESGADTEGSPEEVTAEAQKEETKEEAKAEPETKEKEVTKDEVTKDTKEKENA KAKEETKKEDKKEEVAKSSDNSDIKKIDEVISQVMDKKNPDYLLKFNSEELALIRNTLYA RRGLKYTKGKYKTYFEGKSWYKPSVTSGKDLLPEKEERLVEIIRKYEKKAKK >gi|224461253|gb|ACDC01000149.1| GENE 3 3729 - 4091 424 120 aa, chain + ## HITS:1 COG:FN0052 KEGG:ns NR:ns ## COG: FN0052 COG1393 # Protein_GI_number: 19703404 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Fusobacterium nucleatum # 1 120 1 120 120 159 85.0 1e-39 MKDIIFFCYPRCSTCQKAKKWLEENSIKFTERDIVKDKPTEKELKEFFKKSGKELKKFFN TSGILYRELELKDKLPTMTEDEMIKLLATDGKLVKRPMIVTKDFVLNGFKEEEWKEKLKK >gi|224461253|gb|ACDC01000149.1| GENE 4 4118 - 5392 1848 424 aa, chain + ## HITS:1 COG:FN0053 KEGG:ns NR:ns ## COG: FN0053 COG1114 # Protein_GI_number: 19703405 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Fusobacterium nucleatum # 1 424 1 424 424 597 87.0 1e-170 MYNMIDVVTAGFALFAMLFGAGNLIFPPILGYELGSNWGVAAFGFILTGVGIPLMAIIAS ANAGKDLDSFSNKVSPLFAKFYGIALILSIGPLLALPRTGATAYEVTFFHAGFTTSTFKY VYLIVYFLLALLFSLKSSEVVDRVGKILTPILLIVLFIILVKGVFFNSSTIAEKVYGLPF KKGFVEGYQTMDALAAVVFSTVILNAIRGKTKLTEKQEFYYLLKVGLIAALGLTIVYAGL TYIGATFGGTELVAGAEKTDLLVKISINLLGKIGYLILAICVAGACLTTSIGLIVTVAEY FSGLMKVSYQKLVVITTIIGFIFAMFGVNKIVIISVPVLVFLYPISIALIILNFFRVKNA NVFKGVVLVSGLVGLYEGISVTGIAMPEVFTNIYNSLPLVNLGLPWLVPALVVGIVCNFI KTEK >gi|224461253|gb|ACDC01000149.1| GENE 5 5561 - 6769 889 402 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 7 395 12 410 418 347 45 2e-95 MANVYDVLKERGYLKQLTHEEEIKELLEKEKVTFYIGFDPTADSLHVGHFIAMMFMAHMQ 
QHGHRPIALAGGGTGMIGDPSGRSDMRTMMTVETIDHNVECIKKQMQKFIDFSDGKAILE NNANWLRNLNYIEFLRDIGEHFSVNRMLAAECYKSRMENGLSFLEFNYMIMQGYDFYVLN KKYNCTMQLGGDDQWSNMIAGVELIRRKDRRQAYAMTCTLLTNSEGKKMGKTAKGALWLD PKKTTPYEFYQYWRNIDDQDVENCLALLTFLPMDEVRRLGALKDAAINEAKKVLAYEVTK IIHGEEEATKAKEATEALFGSGNNLDNAPKIELGAEDFSKELLDVLVDRKILKTKSEGRR LIEQNGMSLNDEKITDVKFTLNENTLGLLKLGKKKFYNIVKK >gi|224461253|gb|ACDC01000149.1| GENE 6 6941 - 7253 378 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262067361|ref|ZP_06026973.1| ## NR: gi|262067361|ref|ZP_06026973.1| hypothetical protein FUSPEROL_01637 [Fusobacterium periodonticum ATCC 33693] # 1 104 1 104 253 181 97.0 2e-44 MSELQTIRKKIQGDFFLEKSYREGKLFYEVLRYKDDYIGINYGYFEDELSEETYINDNRI GMLVSKEKDKIYICTLDGKRHEVGITVAYHNKSGRLAYEMDYLD Prediction of potential genes in microbial genomes Time: Thu May 19 23:53:21 2011 Seq name: gi|224461252|gb|ACDC01000150.1| Fusobacterium sp. 2_1_31 cont1.150, whole genome shotgun sequence Length of sequence - 14791 bp Number of predicted genes - 18, with homology - 15 Number of transcription units - 7, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 610 735 ## gi|237738791|ref|ZP_04569272.1| predicted protein + Term 654 - 700 5.7 + Prom 1293 - 1352 3.5 2 2 Op 1 . + CDS 1411 - 3966 3635 ## COG3210 Large exoproteins involved in heme utilization or adhesion 3 2 Op 2 . + CDS 4017 - 4562 653 ## FN0142 hypothetical protein + Prom 4597 - 4656 3.7 4 3 Op 1 . + CDS 4758 - 6134 1553 ## COG3210 Large exoproteins involved in heme utilization or adhesion 5 3 Op 2 . + CDS 6176 - 6634 394 ## FN0145 hypothetical protein 6 3 Op 3 . + CDS 6658 - 6945 127 ## gi|237738796|ref|ZP_04569277.1| predicted protein + Prom 7042 - 7101 10.2 7 4 Tu 1 . + CDS 7121 - 7219 63 ## + Prom 7244 - 7303 10.3 8 5 Op 1 . + CDS 7333 - 7587 341 ## Dtox_4301 prevent-host-death family protein 9 5 Op 2 1/0.000 + CDS 7580 - 7840 346 ## COG4115 Uncharacterized protein conserved in bacteria 10 5 Op 3 . 
+ CDS 7850 - 8353 295 ## PROTEIN SUPPORTED gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase 11 5 Op 4 . + CDS 8382 - 8867 416 ## FN0056 acetyltransferase (EC:2.3.1.-) 12 5 Op 5 . + CDS 8954 - 9019 70 ## + Prom 9021 - 9080 3.0 13 6 Op 1 1/0.000 + CDS 9169 - 9819 698 ## COG0177 Predicted EndoIII-related endonuclease 14 6 Op 2 20/0.000 + CDS 9899 - 11092 1652 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 15 6 Op 3 1/0.000 + CDS 11165 - 11551 584 ## COG0822 NifU homolog involved in Fe-S cluster formation + Term 11574 - 11627 11.1 + Prom 11588 - 11647 11.7 16 7 Op 1 1/0.000 + CDS 11711 - 13003 1878 ## COG1686 D-alanyl-D-alanine carboxypeptidase 17 7 Op 2 . + CDS 13022 - 14512 2060 ## COG2317 Zn-dependent carboxypeptidase + Prom 14523 - 14582 9.9 18 7 Op 3 . + CDS 14605 - 14791 377 ## Predicted protein(s) >gi|224461252|gb|ACDC01000150.1| GENE 1 2 - 610 735 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738791|ref|ZP_04569272.1| ## NR: gi|237738791|ref|ZP_04569272.1| predicted protein [Fusobacterium sp. 
2_1_31] # 16 202 1 187 187 285 100.0 1e-75 SVSLSEHEKFKEPITMEEFYEDFSENKKIERFLCECDFPYSNMSVYFQNLGEIYAEFELE DWVCYEKETKEEWKIKERDRRKIREIIKEALLERIEEKQSNFEDSLKYWVKVVELTEEKI KEIFLKFPTVLHRGGFRLLLELKYMMSIIENSEKFKLTIPENVKNEIGYWLTDMQGEIRT EEEKNLFKEIRDKLKLKKIYQD >gi|224461252|gb|ACDC01000150.1| GENE 2 1411 - 3966 3635 851 aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 617 1837 2460 2806 561 64.0 1e-159 MKSGDTLGGLASATNTVTGIVSGLASNQGTKLPTSAVNKNNSNDDDDDDDKNNQTNTVGK DNLKAAQATNNFYANIGVNLGFNKSSSKSNSHSESAVVTTIRGKDGNSSITYNNVKNVEY VGTQAQDTKFIYNNVENINKTAVELNNSHSSTGKSSGISTGVTIGYGDGTQTEFNGVSIS ASKSNMNSNGTTYQNGRFVNVDEVHNNTKNMTLSGFNQEGGTVTGNIENLTIESKQNTST TKGSTKGGSLSVSANGLPSGSANYSQTNGERRVVDNASTFIIGDGSDLKVAKVENTASAI GTTENGKLSIDEYVGYNLENVDKLKTAGGSVGVSTSGITSIGVNYSDKKQEGITKNTVIG NVEIGKSSGDEINRDLDTMTEITEDRDFKTNINVESQTINYIKNPEKFKEDLQKAKNEIY DIYHAVDSTVNLQGKEDRNISEQVGEVRQAKVIYNLIDSRLQEAENQEDIAKIFEGASED LGYKVKVIFTDPSNSPQLIGVDKKGNKYIKDGTAYVDKDTGIGYILVNTESPANSTKAGV IGTIAEEQSHIIGKKEGRQKVVPDGSEKGLESLGRPTNDYFKKQYSKNDKAIELKSDGKD YSNVDFGENVGDKTIENPLELYDKKYTTADERKEVEKILSEEKGEDYLVDWELYNESLER DYRAEINYFVSQAKNRVNKLSEIEKGNFKEVPPKESVFHNFINTKKGKKIVINKSLRNTK KVDYDTGKEVVISKSNKIVKDYMNQGTSNNFTYGPDGIVRSDEGKFDKMLHGIFDIGNYI SKGTGVADKTTTLERINMTILGTLVSLNYDELEKWASENNYNAIGYKEYFEYKIYKFYIN SYKNSRRNMYK >gi|224461252|gb|ACDC01000150.1| GENE 3 4017 - 4562 653 181 aa, chain + ## HITS:1 COG:no KEGG:FN0142 NR:ns ## KEGG: FN0142 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 181 4 160 160 68 35.0 1e-10 MKKVLIIMIGILIFFLNMCIRYTPENHYILLDEKKISVEKEDKIISDNNLKNVRLLSDAT ILFEFYNIPNNLVLERVELFYQSKMIGSIDINEKINNLENCGDDYFDELGRKVGNKGYLL EKDFFRIMGEEYEKYNITYPNRKFELIIYLRNLDTNEIFKLNRSFSLYFEKKGYEFFILS V >gi|224461252|gb|ACDC01000150.1| GENE 4 4758 - 6134 1553 458 
aa, chain + ## HITS:1 COG:FN1817 KEGG:ns NR:ns ## COG: FN1817 COG3210 # Protein_GI_number: 19705122 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 160 2342 2505 2806 178 66.0 2e-44 MPKKSKGSTGSSYIVDRKNRKVLIPIDVNKIEDIKELLGTVTEEVAHGKDALEGRQDKKV AEDKSNDEEGLETLGRPANEYVKKSFGEDNNSEIKLTTDGIDLSNADVGEKVGDVVTLED RKFKKYYHKETNKLEIAVNSFKEIPKGISYIGNLALDSVVDALFAEDMEQSIKNLNNKDN ISKYIKIFRKKGRKAMLNEMSKEFDKIYLSKIYLSKIYLSKIYLSDVSKKLFVIYGAPDE IRKLYTNKEKDKNNNDIDIFILDKHLPEHMEDLWSNPKTSDNLDKISNYWKNELGDKDSF SLSEQNRERNKTIAYPDFTTSYKFYAFGKTKLFQSTYGSVKRLENGKYDVNITVIFQYTD RFEDVKNVNELSSAKQGLNKEFKGGKTFSFRTEPKRVTIKKQVDSLDEITGLLKNRLKGI DDNSNLGQYNIKGNVSLIKNNKLDNSNEIDNFKKYYNQ >gi|224461252|gb|ACDC01000150.1| GENE 5 6176 - 6634 394 152 aa, chain + ## HITS:1 COG:no KEGG:FN0145 NR:ns ## KEGG: FN0145 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 5 151 9 157 162 124 53.0 1e-27 MEKHKKILKKIIFIFFILPIYFFSTILANYLQHYEYTFKDTNGIKEYNIFSNGRYGYINL ENDMYIEGQTGDFKIKEVYENNDYFIGCDFNYNYETDKIGKRYYIVVDKNKSTMKIFSEE EFKEKFKNISNQEFINIYSFFKKRGTKFASYK >gi|224461252|gb|ACDC01000150.1| GENE 6 6658 - 6945 127 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738796|ref|ZP_04569277.1| ## NR: gi|237738796|ref|ZP_04569277.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 95 1 95 95 81 100.0 2e-14 MKYKGKDIAEVSITLIIILNFLIIFTKIRNIELLYILIFLFLNVYIIIFIGDTKEKYLMK KKILKVNKIMFIGINTLIFFIFLKSLISFYQKIKF >gi|224461252|gb|ACDC01000150.1| GENE 7 7121 - 7219 63 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVISQIVLIKKFILELIELRELHSSQKSTNKD >gi|224461252|gb|ACDC01000150.1| GENE 8 7333 - 7587 341 84 aa, chain + ## HITS:1 COG:no KEGG:Dtox_4301 NR:ns ## KEGG: Dtox_4301 # Name: not_defined # Def: prevent-host-death family protein # Organism: D.acetoxidans # Pathway: not_defined # 1 84 1 84 84 83 52.0 3e-15 MLAINYTTLRTNLKSYFDKAVDNDEDIIITRKNERNVVLLSLDKYNEFLKAMRNLEYMTK IREGIAELEAGKGEIHDLIEVDDE >gi|224461252|gb|ACDC01000150.1| GENE 9 7580 - 7840 346 86 aa, chain + ## HITS:1 COG:SA2195 KEGG:ns NR:ns ## COG: SA2195 COG4115 # Protein_GI_number: 15927985 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 3 84 5 88 88 83 52.0 1e-16 MNNLVWTHKAWQDYLYWQTQDKKTLKKINELVKDIERNGVLKGIGKPEVLKNESAYSRRI DEKNRLVYRIVDGFILIIACKGHYEE >gi|224461252|gb|ACDC01000150.1| GENE 10 7850 - 8353 295 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase [Brachyspira murdochii DSM 12563] # 5 167 4 166 166 118 36 3e-26 MEIKIREIEVEDYKELLDFMKKVKGETNFLLGYPNEIKMSYEDEKEHIKKVKSSETSNYF VVMKNSKIIGCIGFNGNTARKMKHYGTIGISVLKEYWGRGIATALLEKLISWAKEKGIKK INLDVFENNERAIKLYEKFGFKLEGCIEDGIFDGGNYINLLVYGLKI >gi|224461252|gb|ACDC01000150.1| GENE 11 8382 - 8867 416 161 aa, chain + ## HITS:1 COG:no KEGG:FN0056 NR:ns ## KEGG: FN0056 # Name: not_defined # Def: acetyltransferase (EC:2.3.1.-) # Organism: F.nucleatum # Pathway: Tyrosine metabolism [PATH:fnu00350]; 1- and 2-Methylnaphthalene degradation [PATH:fnu00624]; Benzoate degradation via CoA ligation [PATH:fnu00632]; Limonene and pinene degradation [PATH:fnu00903] # 1 157 1 158 159 190 79.0 2e-47 MSRFKIRNMREDDIEIIYKNLHLDFVNKYFKNNKEKKKIHDNHSEWYKTHISSFDYLIYI 
FEDEEANFVAMTSYEILEDTAKINIYLNKDYRNKGYSQEILSESIDKFLNDNKNIKTLKA CILEENLASKKIFENLSFIYDKKEICRDELEYLIYKKTIII >gi|224461252|gb|ACDC01000150.1| GENE 12 8954 - 9019 70 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRVIVGVFEANLLASFAEIS >gi|224461252|gb|ACDC01000150.1| GENE 13 9169 - 9819 698 216 aa, chain + ## HITS:1 COG:FN0057 KEGG:ns NR:ns ## COG: FN0057 COG0177 # Protein_GI_number: 19703409 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Fusobacterium nucleatum # 14 212 1 199 201 356 90.0 2e-98 MTKKEKVKKILEELHKKFGEPKCALNFETPFELLVAVILSAQCTDKRVNIVTEEMFKEVN TPEQFANMEIEEIENYIKSTGFFRNKAKNIKKCSQQLLEKYNGEIPQDMDKLTELAGVGR KTANVVRGEVWGLADGITVDTHVKRITNLIGLVKSEDPIKIEQELMKIVPKKSWIVFSHY LILHGRATCIARRPQCKNCEISDCCNYGKIKLLKEN >gi|224461252|gb|ACDC01000150.1| GENE 14 9899 - 11092 1652 397 aa, chain + ## HITS:1 COG:FN0058 KEGG:ns NR:ns ## COG: FN0058 COG1104 # Protein_GI_number: 19703410 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Fusobacterium nucleatum # 1 397 1 397 397 708 91.0 0 MKVYLDNNATTKVDEEVVKAMMPYFSDYYGNPFSLHLFGNETGLAVTEARQTIADILKAK PSEIIFTASGSEGDNLAIRGVAKAYKHRGKHIITSTIEHPAVKNTFIDLMEDGFEITMVP VDENGVMILDEFKKALREDTILVSVMHANNEVGSFQPVEEIGKITRERKIIFHVDAVQTM GKVEIYPEKMGIDLLSFSGHKFHAPKGIGVLYKRDGIRFAKVITGGNQEGKRRPGTSNVP YIVGLAKALKIATENMKEEWVREETLRDYFEDEVSKRIPEIKINGKGARRLPGTSSITFK YLEGESMLLNLSLKGIAVSSGSACSSDSLQPSHVLLAMGVPAEYAHGTLRFSLSKYTTKE EIDYTIEALVEIIGKLRELSPLWKTFKDNKLTDTASF >gi|224461252|gb|ACDC01000150.1| GENE 15 11165 - 11551 584 128 aa, chain + ## HITS:1 COG:FN0059 KEGG:ns NR:ns ## COG: FN0059 COG0822 # Protein_GI_number: 19703411 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Fusobacterium nucleatum # 1 125 4 128 128 218 94.0 3e-57 MQYTEKVMQHFMNPHNVGVIENPDGYGKVGNPSCGDIMEIFIKVDNNILTDVKFRTFGCA 
SAIASSSISTDMIIGKTVDEALQVTNKAVVDALGGLPAVKMHCSVLAEEAIKMAIEDYIS KRDGKKAE >gi|224461252|gb|ACDC01000150.1| GENE 16 11711 - 13003 1878 430 aa, chain + ## HITS:1 COG:FN0060 KEGG:ns NR:ns ## COG: FN0060 COG1686 # Protein_GI_number: 19703412 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Fusobacterium nucleatum # 64 430 3 368 368 448 72.0 1e-125 MFKRFKNLFLVMAILGLIFTTSYSGPVKEIQSIEEYSAQVLGDEEEDAEDTSQTIVIPKV KKVEKKEEIKKEPEVKKEEIKEEVKKETKKEAVKEEPKKKEEVKKEPEVKAVKEEVKKPE KIETKEVKKVEEEKKTEPKNLALEEPENPEKDKQKYEMITYYSKDGVEWVLPDNFRAVLV GDLNGNVIFSKNADTMYPLASVTKVMTLLVTFDEINAGNISLNDSVRISKTPLKYGGSGI ALKEGQIFVLEDLIKASAVYSANNATYAIAEYVGEGSVFNFVAKMNKKLKQLGLQNDIKY HTPAGLPTRNTKMPMDEGTPRGIYKLSIEALKYHKYIEIAGIKNTKIYNGKISIRNRNHL IGEDGVYGIKTGFHKEAKYNITVAVKFEGIDLIIVVMGGETYKTRDDLVRTIIANLKENY TVRNGQLIRK >gi|224461252|gb|ACDC01000150.1| GENE 17 13022 - 14512 2060 496 aa, chain + ## HITS:1 COG:FN0061 KEGG:ns NR:ns ## COG: FN0061 COG2317 # Protein_GI_number: 19703413 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent carboxypeptidase # Organism: Fusobacterium nucleatum # 1 496 1 496 496 759 82.0 0 MREEFRELVKRKNRIHANLELIQWDLETKTPLKSRPYLSELVGELSMQDYALSTSDEFVN LVEELNKQKETLTEIEKREIELSMEEIEKKKKIPADEYEDYARLTSYNQTVWEEAKAKKD FSIVKEGLKKIFDYNKKFATYRRKDEKTLYDVLLNDYEKGMDTERLDVFFSELKKEIVPF LKKIQEKKKTIKEVDKISVPIDEDVQLKFAKFLSSYVGFDFEKGLVETSEHPFTLNLNKN DVRLTTKNKKDSPMSTVFSIIHESGHGIYEQQTGDELIDTLLGTGGSMGLHESQSRFMEN IVGENKAFWKPLYSKAGEFYPFLKDLEFEEFYKQINRIEPGLIRVEADELTYSLHIMLRY EIEKMLINGEVNIDDLPKIWNEKVKEYLGLEPKNDSEGLMQDIHWYCGLIGYFPSYAIGN AYASQIYNTMKKDFDVEKALENQDLKKITDWLGEKIHKYGLLKDTPTIIKEVTGEELNPK YYIEYLKEKYSKIYEI >gi|224461252|gb|ACDC01000150.1| GENE 18 14605 - 14791 377 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGILDDVTGKLGELKDTVVDEAKKAKDEVVAKAAELKDKAVDKAKELKEGAEGKAAELKD KA Prediction of potential genes in microbial genomes Time: Thu May 19 23:54:04 2011 Seq name: gi|224461251|gb|ACDC01000151.1| Fusobacterium sp. 
2_1_31 cont1.151, whole genome shotgun sequence Length of sequence - 21685 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 6, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 165 276 ## + Term 245 - 293 6.1 + Prom 704 - 763 9.7 2 2 Op 1 1/0.000 + CDS 846 - 1274 413 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Term 1413 - 1452 -0.5 + Prom 1276 - 1335 8.8 3 2 Op 2 18/0.000 + CDS 1553 - 1705 266 ## PROTEIN SUPPORTED gi|19705334|ref|NP_602829.1| 50S ribosomal protein L33P + Term 1751 - 1819 30.4 + TRNA 1733 - 1808 87.4 # Trp CCA 0 0 + Prom 1739 - 1798 80.4 4 3 Op 1 46/0.000 + CDS 1838 - 2014 223 ## COG0690 Preprotein translocase subunit SecE 5 3 Op 2 45/0.000 + CDS 2011 - 2592 852 ## COG0250 Transcription antiterminator 6 3 Op 3 55/0.000 + CDS 2626 - 3051 702 ## PROTEIN SUPPORTED gi|237738811|ref|ZP_04569292.1| LSU ribosomal protein L11P 7 3 Op 4 43/0.000 + CDS 3113 - 3820 1184 ## PROTEIN SUPPORTED gi|237738812|ref|ZP_04569293.1| LSU ribosomal protein L1P + Term 3920 - 3951 0.1 8 3 Op 5 47/0.000 + CDS 3973 - 4485 830 ## PROTEIN SUPPORTED gi|237738813|ref|ZP_04569294.1| LSU ribosomal protein L10P 9 3 Op 6 . + CDS 4534 - 4899 580 ## PROTEIN SUPPORTED gi|237738814|ref|ZP_04569295.1| LSU ribosomal protein L12P + Term 4930 - 4966 3.1 - Term 4916 - 4953 3.0 10 4 Tu 1 . - CDS 5027 - 5224 254 ## gi|237738815|ref|ZP_04569296.1| predicted protein - Prom 5428 - 5487 5.6 + Prom 5203 - 5262 7.3 11 5 Op 1 58/0.000 + CDS 5339 - 8899 841 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 12 5 Op 2 1/0.000 + CDS 8935 - 12897 5345 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Prom 13020 - 13079 5.5 13 5 Op 3 8/0.000 + CDS 13099 - 13977 1088 ## COG1561 Uncharacterized stress-induced protein + Prom 14065 - 14124 4.9 14 5 Op 4 . + CDS 14177 - 14734 729 ## COG0194 Guanylate kinase 15 5 Op 5 . 
+ CDS 14735 - 14956 393 ## FN2032 DNA-directed RNA polymerase omega chain (EC:2.7.7.6) 16 5 Op 6 1/0.000 + CDS 14940 - 15956 1176 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 17 5 Op 7 1/0.000 + CDS 16033 - 18051 3663 ## COG3808 Inorganic pyrophosphatase + Term 18079 - 18124 4.1 + Prom 18057 - 18116 7.7 18 5 Op 8 . + CDS 18229 - 20031 1696 ## COG1835 Predicted acyltransferases + Term 20037 - 20088 4.2 - Term 20023 - 20075 9.8 19 6 Tu 1 . - CDS 20087 - 21631 1305 ## COG2978 Putative p-aminobenzoyl-glutamate transporter Predicted protein(s) >gi|224461251|gb|ACDC01000151.1| GENE 1 1 - 165 276 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no VGETKAKIETTKEKKATAVKKPVKKGKHATNATRAVVVEEIVQPVIIEVQEVKK >gi|224461251|gb|ACDC01000151.1| GENE 2 846 - 1274 413 142 aa, chain + ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 1 142 1 142 142 248 90.0 4e-66 MELQLHTGDIGNYLKNHDIKPSYQRMKIFQYLLDNHVHPTVDTIYKALCPEIPTLSKTTV YNTLNLFVEKKLVQVIVIEENETRYDLITHTHGHFKCNCCGALFDVELNIDYSMSPELAD CEIDEKHIYFKGLCKNCKGKQN >gi|224461251|gb|ACDC01000151.1| GENE 3 1553 - 1705 266 50 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19705334|ref|NP_602829.1| 50S ribosomal protein L33P [Fusobacterium nucleatum subsp. 
nucleatum ATCC 25586] # 1 50 1 50 50 107 100 9e-23 MRVQVILECTETKLRHYTTTKNKKTHPERLEMMKYNPVLKKHTLYKETKK >gi|224461251|gb|ACDC01000151.1| GENE 4 1838 - 2014 223 58 aa, chain + ## HITS:1 COG:FN2042 KEGG:ns NR:ns ## COG: FN2042 COG0690 # Protein_GI_number: 19705333 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecE # Organism: Fusobacterium nucleatum # 1 58 1 58 58 84 86.0 6e-17 MNLFQKVKMEYSKVEWPSKTEVIHSTIWVITMTVIVSVYLGVFDILAVKALNALEALI >gi|224461251|gb|ACDC01000151.1| GENE 5 2011 - 2592 852 193 aa, chain + ## HITS:1 COG:FN2041 KEGG:ns NR:ns ## COG: FN2041 COG0250 # Protein_GI_number: 19705332 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Fusobacterium nucleatum # 1 193 1 193 193 316 88.0 2e-86 MSIENVRKWFMIHTYSGYEKKVKTDLEQKVGTLQLRDVVTNILVPEEETTEIVRGKPKKI YRKLFPAYVMLEMEATREENENGISYKVDPDVWYIIRNTNGVTGFVGVGSDPIPMEDDEV KNIFNIIGMDTSKETIKLDFAEGDFVKILKGSFIDQEGQVAEIDYEHGRVKVMVDIFGRM TPVEIEVDGVLKV >gi|224461251|gb|ACDC01000151.1| GENE 6 2626 - 3051 702 141 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738811|ref|ZP_04569292.1| LSU ribosomal protein L11P [Fusobacterium sp. 2_1_31] # 1 141 1 141 141 275 100 2e-73 MAKEVIQIIKLQLPAGKANPAPPVGPALGQHGVNIMEFCKAFNAKTQDKAGWIIPVEISV YSDRSFTFILKTPPASDLLKKAAGISSGAKNSKKEVAGKITTAKLRELAETKMPDLNASS VETAMKIIAGSARSMGIKIED >gi|224461251|gb|ACDC01000151.1| GENE 7 3113 - 3820 1184 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738812|ref|ZP_04569293.1| LSU ribosomal protein L1P [Fusobacterium sp. 
2_1_31] # 1 235 1 235 235 460 100 1e-129 MAKHRGKKYLEVAKLVETGKLYDIKEALELVQKTRTAKFTETVEVALRLGVDPRHADQQI RGTVVLPHGTGKTVKILAITSGENIEKALAAGADYAGAEEYINQIQQGWLDFDLVIATPD MMPKIGRLGKILGTKGLMPNPKSGTVTPDIAAAVSEFKKGKLAFRVDKLGSIHAPIGKVD FDLDKIEENFKAFMDQIIRLKPATSKGQYLRTVAVSLTMGPGVKMDPAIVAKIVG >gi|224461251|gb|ACDC01000151.1| GENE 8 3973 - 4485 830 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738813|ref|ZP_04569294.1| LSU ribosomal protein L10P [Fusobacterium sp. 2_1_31] # 1 170 1 170 170 324 100 4e-88 MATQVKKELVAELVEKIKKAQSVVFVDYQGIKVNEETSLRKQMRENGAEYLVAKNRLFKI ALKESGVEDNFDEILEGTTAFAFGYNDPVAPAKAVFDLAKTKAKAKQDVFKIKGGYLTGK KVSVQEVEALAKLPSRDQLLSMLLNSMLGPIRKLAYATVAIADKKEGSAE >gi|224461251|gb|ACDC01000151.1| GENE 9 4534 - 4899 580 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738814|ref|ZP_04569295.1| LSU ribosomal protein L12P [Fusobacterium sp. 2_1_31] # 1 121 1 121 121 228 100 3e-59 MAFNKEQFIADLEAMTVLELKELVSALEEHFGVTAAAPVAVAAAGPVEAAEEKTEFDIVL KNAGGNKIAVIKEVRAITGLGLKEAKDLVDNGGVIKEAAPKEEAEAIKEKLTAAGAEVEV K >gi|224461251|gb|ACDC01000151.1| GENE 10 5027 - 5224 254 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738815|ref|ZP_04569296.1| ## NR: gi|237738815|ref|ZP_04569296.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 65 1 65 65 89 100.0 7e-17 MSRANLGVFEADLSAILESLNELVHLEFLTDTEFAANVNFLSLRNLASNELFFRTLKNFV IVLKL >gi|224461251|gb|ACDC01000151.1| GENE 11 5339 - 8899 841 1186 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 888 1142 1085 1391 1392 328 54 2e-89 MQKLIERLDFGKIKARGSMPHFLEFQLNSYEDFLQTNMSPNKREDKGFELAFKEVFPIES SNGDVRLEYIGYELHEAEAPLNDELECKRRGKTYSNSLKVRLRLINKKMGNEIQESLVYF GEVPKMTERATFIINGAERVVVSQLHRSPGVSFSKEVNTQTGKDLFSGKIIPYKGTWLEF ETDKNDFLSVKIDRKKKVLATVFLKAVDFFKDNKEIIEHFLEAKELNLKSLYKKYSKEPE ELLNVLKQELEGSLVKEDILDEETGEFIAETEAIITEELINILIENKIETISYWFVGPED KLLANTLANDETLTEEQAVVEVFKKLRPGDQVTIDSARSLIRQMFFNPQRYDLEPVGRYK MNKRLKLDVADDQISLTKEDVLGTMKYVTDLYNGDQNVHTDDIDNLSNRRIRGVGELLLM QIKTGLAKMNKMVKEKMTTQDIETVSPQSLLNTRPLNALIQDFFGSGQLSQFMDQSNPLA ELTHKRRISALGPGGLSRERAGFEVRDVHDSHYGRICPIETPEGPNIGLIGSLATYAKIN KYGFIETPYVKVENGVALVDDVHYLAADEEDGLFIAQADTKLGKGNKLQGLVVCRYGHEI VEIEPERVNYMDVSPKQVVSVSAGLIPFLEHDDANRALMGSNMQRQAVPLLRPEAPFIGT GLERKVAVDSGAVVTTKVAGKVIYVDGKKIVIEDADKKEHTYRLLNYERSNQSMCLHQTP LVDLGDVVKAGDIIADGPATKSGDLALGRNILMGFMPWEGYNYEDAILISDRLRKEDVFT SIHIEEYEIDARATKLGDEEITREIPNVSESALRNLDENGIIMIGSEVGPGDILVGKTAP KGETEPPAEEKLLRAIFGEKARDVRDTSLTMPHGSKGVVVDILELSRENGDELKAGVNKS IRVLVAEKRKITVGDKMSGRHGNKGVVSRVLPAEDMPFLEDGTHLDVVLNPLGVPSRMNI GQVLEVHLGMAMRTLNGGTCIATPVFDGATEEQVKDYLEKQGFPRTGKVTLYDGRTGEKF DNKVTVGIMYMLKLHHLVEDKMHARAIGPYSLVTQQPLGGKAQFGGQRLGEMEVWALEAY GASNILQEMLTVKSDDITGRTKTYEAIIKGEAMPDSDLPESFKVLLKEFQALALDIELCD EEDNVINVDEEVEVEETPTEYSPQYEIDTFGLHEIDEDAEDVEDLE >gi|224461251|gb|ACDC01000151.1| GENE 12 8935 - 12897 5345 1320 aa, chain + ## HITS:1 COG:FN2035 KEGG:ns NR:ns ## COG: FN2035 COG0086 # Protein_GI_number: 19705326 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Fusobacterium nucleatum # 1 1319 1 1319 1319 2424 93.0 0 MGIRSFDKIRIKLASPEKILEWSHGEVTKPETINYRTLNPEKDGLFCEVIFGPTKDWECS CGKYKRMRYKGLVCEKCGVEVTRAKVRRERMGHITLASPVSHIWYSKGSPNKMSLIIGIS 
SKELESILYFARYIVTSSEEDSIKVGKILTEKEYKLLKQTYPNKFEAYMGADGILKLLTA IDLEALRDELENELIDVNSAQKRKKLVKRLKIVRDFISSGNRPEWMILTNVPVIPAELRP MVQLDGGRFATSDLNDLYRRVINRNNRLKKLLEIKAPEIVVKNEKRMLQEAVDALIDNGR RGKPVVAQNNRELKSLSDMLKGKQGRFRQNLLGKRVDYSARSVIVVGPSLKMNQCGIPKK MALELYKPFIMRELVRRELANNIKMAKKLVEESDDKVWAVIEDVIADHPVLLNRAPTLHR LSIQAFQPVLIEGKAIRLHPLVCSAFNADFDGDQMAVHLTLSPESMMEAKLLMFAPNNII SPSSGEPIAVPSQDMVMGCFYMTKDRDGEKGEGKFFSNLDQVITAYQNDKVGTHAKIKVR INGKLVDTTPGRVLFNEILPEVDRDYSKTYGKKQIKALIKSLYEAHGFTETAELINRVKN FGYHYGTFAGVSVGVEDLVIPPQKKDLLKQADDEVAQIEKDYKSGKIINEERYRKTIEVW SRTTQAVTDAMMDNLDEFNPVYMMATSGARGNTNQMRQLAGMRGNMADTQGRTIEAPIKA NFREGLTVLEFFMSSHGARKGLADTALRTADSGYLTRRLVDISHEVIVNEEDCHTHEGIE VEALVDAAGKVIEELKERINGRVLAEDLVHNGKTIAKRNTMIHKDLLKKIEDLGIKKVKI RSPLTCALEKGVCQKCYGMDLSNYNEILLGEAVGVVAAQSIGEPGTQLTMRTFHTGGVAG AATVVNSKKAENDGEVSFRDIKTIEINGEEVVVSQGGKIIIADNEHEVDSGSVIKVKEGQ HVKEGDILVTFDPYHIPIISSHDGKVQYRHFTPKNIRDEKYDVHEYLVVRSVDSTDSEPR VHILDKKNEKLATYNIPYGAYMMVRDGAKVKKGDIIAKIIKLGEGTKDITGGLPRVQELF EARNPKGKAILSEIDGRIEILPAKKKQMRVINVRSLTNPDDFKEYLIPMGERLVVTDGLK IKAGDKITEGAISPYDILNIKGLVAAEQFILESVQQVYREQDVTVNDKHIEIIVKQMFRK VRIVDSGASLYLEDEVIEKRIVDLENKKLAEEGKALIKYEPVIQGITKAAVNTGSFISAA SFQETTKVLSNAAIEGKVDYLEGLKENVILGKKIPAGTGFNKYKAIKVKYSSDEEKSEEE >gi|224461251|gb|ACDC01000151.1| GENE 13 13099 - 13977 1088 292 aa, chain + ## HITS:1 COG:FN2034 KEGG:ns NR:ns ## COG: FN2034 COG1561 # Protein_GI_number: 19705325 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Fusobacterium nucleatum # 1 292 1 292 292 387 86.0 1e-108 MRSMTGYSKLNYEDENYVISMEIKSVNNKNLTTKVKLPYNLNLLENYIRAEIASFISRGS IDFRIEFEDKNENLKSLKYDEDLAKSCMQILNKMEEDFNEKFSNKLDFLVRNFGVISQKD LDTDEEKYKEIISLKLRELLQDFVKTKVEEGNRLRSFFKEQLNILKLKVEEVKKLKPQVV ENYRERLLANVNSVKADIDFKEEDILKEILLFSDRVDITEEVSRLESHFKQLEYEFNVDK DSQGKKIEFIFQEIFREFNTMGVKSNMYEISKLVVEGKNELEKMREQIMNIE >gi|224461251|gb|ACDC01000151.1| GENE 14 14177 - 14734 729 185 aa, chain + ## HITS:1 COG:FN2033 KEGG:ns NR:ns ## COG: FN2033 COG0194 # Protein_GI_number: 19705324 # 
Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Fusobacterium nucleatum # 1 185 1 185 185 314 95.0 6e-86 MSLGALYVVSGPSGAGKSTVCKLVRERLGINLSISATSRKPRNGEQEGVDYFFITAEEFE RKIKNGDFLEYANVHGNYYGTLKSEVEERLQRGEKVLLEIDVQGGVQVKEKFPEANLVFF KTPTKEELEKRLRGRNTDSEEVIQARLKNSLKELEYENKYDTVIINNEIEQACNDLISII ENGVR >gi|224461251|gb|ACDC01000151.1| GENE 15 14735 - 14956 393 73 aa, chain + ## HITS:1 COG:no KEGG:FN2032 NR:ns ## KEGG: FN2032 # Name: not_defined # Def: DNA-directed RNA polymerase omega chain (EC:2.7.7.6) # Organism: F.nucleatum # Pathway: Purine metabolism [PATH:fnu00230]; Pyrimidine metabolism [PATH:fnu00240]; Metabolic pathways [PATH:fnu01100]; RNA polymerase [PATH:fnu03020] # 11 73 1 63 64 85 90.0 5e-16 MKKEITYDELLSKIPNKYILTIVCGERARERAKERMERNGEPLPLTKYDKKDTEMKKVFK EILAGKVGYGKDE >gi|224461251|gb|ACDC01000151.1| GENE 16 14940 - 15956 1176 338 aa, chain + ## HITS:1 COG:FN2031 KEGG:ns NR:ns ## COG: FN2031 COG1477 # Protein_GI_number: 19705322 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Fusobacterium nucleatum # 19 338 1 320 320 509 82.0 1e-144 MVKTNKFIAFILVFLSIFLISCGKKVEKIEESKFLFGTYIKIVVYSDNKEKAMNSIEKAF NEIQRIDEKYNSKMEGSLIYKLNTTDNKSIKLDAEGLELFKGVKKAYELSEHKYDVTIAP LLELWGFTEEAMELPNLKLPTKEEIEYTKTFVDFSKVHISEDGTLTLESPVKEIDTGSFL KGYAIYRAKEVLKADGIDSAFITSISSMDLIGTKPEGKPWKIGLQNPENPSEILGIVPLK NRAMGVSGDYQTYVEIDGKMYHHILDKDTGYPVEDKKMVVVLCDNAFEADLLSTTFFLMP IDKAIDYANSRDDLEILIVDKDMNIITSKNFEYEEVKK >gi|224461251|gb|ACDC01000151.1| GENE 17 16033 - 18051 3663 672 aa, chain + ## HITS:1 COG:FN2030 KEGG:ns NR:ns ## COG: FN2030 COG3808 # Protein_GI_number: 19705321 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Fusobacterium nucleatum # 1 672 1 669 671 994 94.0 0 MDLLTQVMYLGLVAGILSLLAAFYYAKKVEHYQINIPKVEEITSAIREGAMAFLSAEYKI LIVFVVVVAAALGIFISVPTAIAFVLGAITSAIAGNAGMRIATKANGRTAIAAKEGGLAK 
ALDVAFSGGAVMGLTVVGLGMFMLSLILLVTKILGENVITVNDVTGFGMGASSIALFARV GGGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGMGADLFESYVGSI IATITLAFLLPVDDATPYVAAPLLISAFGIVASIIATLTVKTDDGSKVHAKLEMGTRIAG LLTIIASFGIIKYLGLDMGIFYAIVAGLAAGLIIAYFTGIYTDTGRRAVNRVSDAAGTGA ATAIIEGLAIGMESTVAPLIVIAIAIIVSFKTGGLYGISIAAVGMLATTGMVVAVDAYGP VADNAGGIAEMSELPHEVRETTDKLDAVGNSTAAVGKGFAIGSAALTALSLFAAYKEAVD KLTSEPLIIDVTDPEVIAGLFIGGMLTFLFSALTMTAVGKAAIEMVEEVRRQFREFPGIM DRTQKPDYKRCVEISTHSSLKQMIFPGVLAIIVPVAIGLWSVKALGGLLAGALVTGVLMA IMMANAGGAWDNGKKQIEAGYKGDKKGSDRHKAAVVGDTVGDPFKDTSGPSLNILIKLMS IVSLVLVPLFVR >gi|224461251|gb|ACDC01000151.1| GENE 18 18229 - 20031 1696 600 aa, chain + ## HITS:1 COG:FN2029 KEGG:ns NR:ns ## COG: FN2029 COG1835 # Protein_GI_number: 19705320 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Fusobacterium nucleatum # 1 599 1 601 604 773 73.0 0 MNELKKRSIGIDILKAISLISVIIYHFYEYKGTYIGVILFFVISGYLITEVLYERDDSYF SFIKRRFNKIFPPLIAVLTFTYLAFYYFYDYISEKLIYSSLSSLFGVSNLYQIFTGMSYF ERSGDLFPLLHTWSLSIEIQFYILFPFLIYLFKKLKLDTEIIVAIVILLSFISAGVMFYK EYINYDISAIYYGTDTRIFSIFMGSAFYFLFKDRDLENEKQKLNIISYICLGVIVVITLS VDYLSKSNYYGFLFLISILGSFITVTSLKTGFLDFDNPVANTLAKLGEHSYVYYLWQYPL MIFSLEFFKWSDIDYNYTVGIQVIILIILSEISYEFLIKRRQESIVLRRIFLVLYVALLA FLPISSETNSEEVKNRANEIDKMAVVENSENTSKPDNKNKDEEKTLATKVNTAEHKELKT DSDKSSAKTNIKTEENKTVAKNTDTIEAKDYTFIGDSVMKMGEPYIKEIFRDANVDAKVS RQFTDLPKILEELKDSKKLRNTVVIHLGTNGVINKEAFESSMRMLKGKKVYIMNTVVPKP WEKSVNKSLAEWSQEYDNITIIDWHKYAKGEKQLFYKDATHPKPEGAKKYAEFIFKNIKR >gi|224461251|gb|ACDC01000151.1| GENE 19 20087 - 21631 1305 514 aa, chain - ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 16 507 21 502 512 349 40.0 1e-95 MSERKGFMGKVIAVSNHLPHPVTIFIIFSIIVGILSVIFSKMGVSVEIEAINRSTKQVEL QTYTVKSLFDADGIRWIFEEAVPNFIGFAPLGVVLFFSLFFNFLNEVGLFPSFLKKSMQK 
IKGKYVSLFIAFLGVNSSFAGDIGYVLVIPIAGILYKQLKRNPIAGILLGFSSTSAGFAA CLVSIDALLGGLSTSAMSIVNPNYVVTPLANSIFMFFYTFFITFIIAYINDKIIEPKVNQ YFPEEITETNEIEENFTELTADENKGLKYAGIGFLVSIGIILLLSLPSGAPLRNPKTGLL LLGWSPLLSAIIPVICLIFFIPGIFYGIATKKIKSDKDLMTYLFKSLDGFGAFIVLCFFS SIFISWFSYSQLGIIIAAKGGQFLSEIGLTRIPLMIVFILFCSFINLFIGSMSSKYVLIA PIFLPMLYKMGVSPELSQLAYRLGDSATNIISPLMSFFPLVLIYCNKYNKKFGFGDLIAY MMPHAIIILITSIIFFGIWLMFNLPIGFNTVNFF Prediction of potential genes in microbial genomes Time: Thu May 19 23:54:22 2011 Seq name: gi|224461250|gb|ACDC01000152.1| Fusobacterium sp. 2_1_31 cont1.152, whole genome shotgun sequence Length of sequence - 25688 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 12, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 18 - 1568 1892 ## COG1288 Predicted membrane protein - Prom 1716 - 1775 13.9 - Term 1760 - 1809 11.2 2 2 Op 1 . - CDS 1825 - 2601 931 ## COG1262 Uncharacterized conserved protein - Term 2623 - 2663 5.2 3 2 Op 2 . - CDS 2669 - 4159 2402 ## COG3333 Uncharacterized protein conserved in bacteria 4 2 Op 3 . - CDS 4181 - 4621 270 ## FN2104 hypothetical protein - Prom 4660 - 4719 8.9 - Term 4644 - 4706 1.2 5 3 Tu 1 . - CDS 4737 - 5717 1638 ## COG3181 Uncharacterized protein conserved in bacteria - Prom 5753 - 5812 8.7 6 4 Tu 1 . - CDS 5833 - 6381 652 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 6489 - 6548 7.0 + Prom 6347 - 6406 12.1 7 5 Tu 1 . + CDS 6585 - 7151 824 ## gi|237738831|ref|ZP_04569312.1| predicted protein + Term 7175 - 7227 7.1 - Term 7171 - 7202 3.1 8 6 Tu 1 . - CDS 7209 - 8087 961 ## FN0031 hypothetical protein - Prom 8228 - 8287 9.0 + Prom 8467 - 8526 11.3 9 7 Op 1 . + CDS 8554 - 9831 1406 ## Sdel_0916 hypothetical protein 10 7 Op 2 . + CDS 9850 - 10431 657 ## gi|237738834|ref|ZP_04569315.1| predicted protein + Prom 10440 - 10499 8.3 11 8 Tu 1 . + CDS 10596 - 10838 329 ## COG2261 Predicted membrane protein - Term 10849 - 10881 1.1 12 9 Op 1 . 
- CDS 10935 - 16103 5849 ## FN0033 hypothetical protein 13 9 Op 2 . - CDS 16120 - 21285 5850 ## FN0033 hypothetical protein - Prom 21337 - 21396 7.9 - Term 21376 - 21422 3.4 14 10 Tu 1 . - CDS 21575 - 22240 715 ## FN0035 hypothetical protein - Prom 22273 - 22332 12.7 - Term 22306 - 22354 9.1 15 11 Tu 1 . - CDS 22430 - 23437 1736 ## COG0059 Ketol-acid reductoisomerase - Prom 23462 - 23521 7.5 16 12 Op 1 . - CDS 23602 - 24378 245 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 17 12 Op 2 . - CDS 24387 - 25445 1522 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 18 12 Op 3 . - CDS 25500 - 25616 75 ## Predicted protein(s) >gi|224461250|gb|ACDC01000152.1| GENE 1 18 - 1568 1892 516 aa, chain - ## HITS:1 COG:FN2106 KEGG:ns NR:ns ## COG: FN2106 COG1288 # Protein_GI_number: 19705396 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 4 516 6 518 518 825 89.0 0 MQIKIAITPIKMSEKQKKKRSFPSAFTVLAIILVLAAVLTYIVPSGQFSRLTYDDSTNEF VITDHENNVTTEPATQEVLDRLQIQLSLDKFTEGVIKKPIAIPGTYQRIEQKPQGFLDVL KAPITGALDTTDIMLFVFILGGIIGIINKIGAFDAGMAALSKRTKGKEFLLVTLVFVLTT LGGTTFGLAEETIAFYPILMPIFLLSGFDVLTCIAAIYMGSSIGTMFSTINPFATVIASN AAGISFTEGLTFRIVTLVLASIITLAYMYWYAKKVNKDPTKSYVYADKEEIHKRFLGEYD SNSEKEFTWRRRLCLLIFAAAFPVLIWGVSRGGWWFEEMSALFLGVALLLMFFSGLSEKD AVNTFIAGAGDLVGVVLTIGLARSINIVMDNGFISDTLLYYSTEFVAGMGKGTFAIAQLL IFSVLGFFIPSSSGLAVLSMPIMAPLADTVGLSREVVINAYNWGQGWMSFITPTGLILVT LEMAGTTFDKWLKYILPLMGIIGVFSAVMLVINTMF >gi|224461250|gb|ACDC01000152.1| GENE 2 1825 - 2601 931 258 aa, chain - ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 8 190 35 232 286 134 38.0 2e-31 MRSKFKDMIFVKGGKYTPCFTDDEKKVFDLEVCRYLTTQKIWLEVMNYNPSKFEGIYKPV DSVTWWEALEFCNKLSEKYNLEPVYDLSKKGTLLINQLDGEKTSPNIADFKKTEGFRLPT EVEWEWFARGGQVAIDKGTFDYKYSGSDNADEVAWYDKISNGETQNVGTKNPNQLGLYDC 
SGNIWEMCYDTAEDEFIPDGDLYIYEDTDSTIPTRRIRGGSWANFDFCSSIDSSFFERWD NKSTNGCEIIGFRIVRTI >gi|224461250|gb|ACDC01000152.1| GENE 3 2669 - 4159 2402 496 aa, chain - ## HITS:1 COG:FN2105 KEGG:ns NR:ns ## COG: FN2105 COG3333 # Protein_GI_number: 19705395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 493 1 493 494 753 96.0 0 MSDVLFGYAAALTPINLVAAIISVAIGITIGALPGLSAAMGVALLIPITFGMDPSTGLIT LAGVYCGAIFGGSISAILIRTPGTPAAAATAIDGYELTKQGKAGTALGTAITASFIGGIL SAIPLYLFAPRLAKLALLFGPAEYFWLSIFGLTIIAGASTKSIVKGLISGALGLMLSTVG MDPMLGNARFTFGVPALLSGIPFTAALIGLFSMSQVLMLAEKKIKKAGNMVDFDNKVLLS KEQILEILPTSLRSTVIGSIIGILPGAGASIAAFLGYNEAKRFSKKKELFGHGSIEGIAG AEAANNAVTGGSLIPTFTLGIPGESVTAVLLGGLMIQGLQPGPDLFTVHGKITYTFFAGF VIVNIFMLILGLFGSKLFAKVSRVSDSYLIPLIFALSVIGSYAINNQMADVWVMFVFGII GYFVQKFELNSASIVLALILGPIGESGLRRSLILNHNNYSILFQSTVSKVLLFLTLFSLL SPIVMAQLKKRKKTEE >gi|224461250|gb|ACDC01000152.1| GENE 4 4181 - 4621 270 146 aa, chain - ## HITS:1 COG:no KEGG:FN2104 NR:ns ## KEGG: FN2104 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 146 1 147 147 172 86.0 2e-42 MRKYDKFLTIGLFILEAFYFFLIKQLPEKAARYPLFVLGLMVFLTLLLAINTFIIKPKNV EDKESDQFKGILYRQFFLIITLSAVYIILIDIIGFFVTTAIYLFVTMVALKSSIKWSIVV SILFPIFLYLIFVSFLKVPVPRGFLL >gi|224461250|gb|ACDC01000152.1| GENE 5 4737 - 5717 1638 326 aa, chain - ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 18 325 1 307 308 556 93.0 1e-158 MKKKFLAVLTLLLSLLLVACGGEKKAAEANPDAYPEKPVNVIIAYKAGGGTDVGARILMA EAQKNFPQTFVIVNKPGADGEIGYTELAKAAPDGYTIGFINLPTFVSLPHERQTKYKIDD VEPIMNHVYDPGVLVVKADSPFNTLADFVEYAKAHPDELTISNNGAGASNHIGAAHFAKE AGIQVTHVPFGGSTDMISALRGGHVNATVAKISEVASLVKSGELRLLASFTDKRLEGFED VPTLTESGYPVIFGSARAIVAPKGTPKEIIQKLHDVLKAALESPDNIEKSKNASLPLQYM SPEELAQYIKDQEKYIIETVPTLGIK >gi|224461250|gb|ACDC01000152.1| GENE 6 
5833 - 6381 652 182 aa, chain - ## HITS:1 COG:FN1123 KEGG:ns NR:ns ## COG: FN1123 COG0526 # Protein_GI_number: 19704458 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 25 177 1 153 157 263 86.0 1e-70 MKKIIFILLLSILSLTSFAIPLNNMDKAGNVTLPNIELVDQYGKKHNLQDYKGKVVMINF WVSWCSDCQGEMPKVAELYKEYGENKKDLIILGVASPISKEYPNNKDRIGKKELLKYIAD NNYIFPSLIDETGKTFAEYEIEEYPSSFIINENGHLRAYIKGAVSKEELKQNIDKVLTSI QK >gi|224461250|gb|ACDC01000152.1| GENE 7 6585 - 7151 824 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738831|ref|ZP_04569312.1| ## NR: gi|237738831|ref|ZP_04569312.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 188 1 188 188 342 100.0 8e-93 MTTNKILGSKNVLKKAFVFLVLSASFLSLNSQDAFAEVAGDQSKITSKLVKNVKEVEYDR YYKGDFYSDVGAYPEGMFLVVEKLIENYIAFAHMGDASYLAPVGQRFEEKIEGQTIKRYI AVSQKKKEGYYCIDIYNNLNDQPVATLTAGLKIEKKKNGYLISPKNDLKIVYKGKTYKNQ SALNFLGF >gi|224461250|gb|ACDC01000152.1| GENE 8 7209 - 8087 961 292 aa, chain - ## HITS:1 COG:no KEGG:FN0031 NR:ns ## KEGG: FN0031 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 290 1 264 271 378 77.0 1e-103 MKKFLFLLLSVFFISNSLFATTNDKYIFYLDNPTDKNIKITLDSRVYNLKPKTYEVLNLK RGEHIAELSDGTKVYFKIFANSKGGIINPSGATYTINYFRYQSPRISVDWQEPEDIVLPT FNDFIIDKNYIAWEYDIFEEATKESMPKKIHPDEDIHVFTKIYSPSEVKEPDYTKGKAIE VYNFKKSDIDMENSKANLPKLDSDYNIPNSDDEVFQNYIKQIITLDKAYMNTNDAKKQKK ILQEYDKIAKIIWSKYSKYNIVEGSYNKVSLKKLNLKSLDRGVIITKIENKQ >gi|224461250|gb|ACDC01000152.1| GENE 9 8554 - 9831 1406 425 aa, chain + ## HITS:1 COG:no KEGG:Sdel_0916 NR:ns ## KEGG: Sdel_0916 # Name: not_defined # Def: hypothetical protein # Organism: S.deleyianum # Pathway: not_defined # 5 275 9 300 422 83 29.0 1e-14 MEEKLLEEAYEKSKDYLYEDSSRFNLLSIIEKDRDEAHIHSKILYSLLSQNWEKKDKETF LTLFLKELGIEEEIIYNKTWEVTREKAFDLDTVKGRLDFEIKSKDYIYIIEMKIDAGDQP EQLIRYQEFAKEQHKKYKIFYLTLDGHSASKKSVGEEISEEIKKVEYTNISFQEEILNWL 
GKCLDLVKGKENKSTCINQYIASINKILGEKDTKIKDNILKSTEDTKNAISLYGELNDKL QVTLENFMSVLKEKLRARIENEIIYYKKYVKDYYNYALDKMPQDYPGLYIVLAENKSRNS NYYRFVLKLEISPDLTACFGFIKNDTDEKEIAPFVSFSKVKRDSSGLYNKCIKGIKNLEL EDKLSDNNKAYWCYIKNSKDEIINFKDISLSNKALLSLMDEETLKEEVKNIATYISNEII KNMEI >gi|224461250|gb|ACDC01000152.1| GENE 10 9850 - 10431 657 193 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738834|ref|ZP_04569315.1| ## NR: gi|237738834|ref|ZP_04569315.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 193 7 199 199 303 100.0 2e-81 MKEIEKIEASIYLYDCLKEVLPEYAVKFMKELKNKFPETQILDFNEDEIREIYADSNIAP GIYLKFNEIKIEATNYYFTLKIEINTDELCLCFGFDSKKKGEELSFVKLEDMKDFSKDFY DNLIKLKTNLEDNKVKFRDGKKAIDMSLENTNFRKVSIDNKFLMNLLEDDTREKEFERVY KEIEDILKKAGLK >gi|224461250|gb|ACDC01000152.1| GENE 11 10596 - 10838 329 80 aa, chain + ## HITS:1 COG:BMEI1501 KEGG:ns NR:ns ## COG: BMEI1501 COG2261 # Protein_GI_number: 17987784 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Brucella melitensis # 1 78 1 79 86 58 44.0 4e-09 MGVIAWLVLGALSGWIANRLMNSRTGLIDNIITGIIGSFIGGFVFNFFGAETITGFNLHS IFVSVVGACILLWIINQIRR >gi|224461250|gb|ACDC01000152.1| GENE 12 10935 - 16103 5849 1722 aa, chain - ## HITS:1 COG:no KEGG:FN0033 NR:ns ## KEGG: FN0033 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 117 1722 1 1607 1607 2343 81.0 0 MLSFSNHEYNEKAKEYIEEIKNSSKDLNKESQDFIKTLFDLGNARYYSSFYGYVDIFSEQ TSESLKTKKEVNLDDIFPKSLYPAMELLIGKKFFKIFMAIAKNITKTPFSVGYFRRMVRS KNYFNYISILITLFKKFIDLHFLDIDALKILKKDYEKGIYNLDNNPYYIAYEIDNGNQEV IDLIKSALSSQKSEIDLTYYIFQAIFISSNKELVELTGKLLLAAKLQEGVRQQICENMDR GIQENFEYMFKIIYDNDLIRFSSVKRSLATWTGLAKNDGTDISKIGKKELEIINKLITNP KYEDELLKSDDNIEVYLALWNKSTRDVKEAVEAIEKLLKSSKYHIKLLISYYLHIIENKD YQREIAKKMIKEYSKDNKNIIEILACYFQFIINYIVGHKLKSDIEKGQIKAENYFKNKKE ALEFFDILENALSLITEKRKVFSPCIFPWNSEFIDTDILAKTLGLIAIFYPDDKLKAKVM KYIKEIDAWERQYFFEILFEKPNNKEEKDFVIATLSDRSGAGDAAYEIVKNNNLLKEYPR EIEDLLRLKNGDRRKSFIDLLMTQDKKALLVSIDNLISAKNENKRLAALDILNQVNSKEK 
ALYDKKEVKKLIEKISKPTDAEKILIENLSDKKKKESENTLSKLYNTEYDLELAYEIKEV DKLSKAIKKNKKDEYIIENSLNIKKLFTKSTDELFKIVKKLSELYVKNEDYEYMCFYAKE YILLRDGFSITKDVTNISYSDRQQLDNFPLENVWRDFYKKEIKDFSTLWQLNVLLTREYN GGINDSNIKECQDFYKKLLGFDITELKTKLKKANLKYIFTENYYNDTGYVLEIISMLYKE YCKENKDYLFEIGKVFTGYLLENFEAKDIVEQKERYNKEIYYNVNIYYLNPGIYYLFARI IPYLEFYSNEKSFIESFILRYNLDEKIKKYTNENLKGCEIGGRRKGLELRDYTIAIVLNI AEKDLIYKEILEIENKTEDEKKEVFWSLDTYMNNYRNILAKKENKRLVNLNQFMLNEALK IIYDEGRKIVDYLVQNELKRGDSPTIYSKLLHEIDRIEGIDYLVQILQALGKETLDRNSY YWGGNDTKKSVLSHLLKACYPSEKDNSKELAKKLKGTDITEQRLIEVAMYSSQWIEIIEG YLGWKGLASGCYYFQAHMSDIDRNKEGLIAKYTPISIEDLKEGAFDIDWFKSAYKELGEK KFEMLYDSAKYISDGAKHSRARMFADAVNGKLNLKETEKKIEDKRNKDLVASYSLIPLLK DKQKDALHRYQFLQKFLKESKKFGAQRRASEAKAVSISLENLSRNMGYSDVTRLIWNMET ALINEMKEYFVPKKLDDVDVYIKIDELGQSEIIYEKAGKELKSLPTKLKKDKYIEDIKEV HKNLKEQYRRSRKMLEEAMEDGTEFYGYEIENLMTNPVIAPILKSLVFKMDKNLGYYEDK KLKSTKKKSVAVKDDSLLKIAHCFDLFESGDWASYQKDIFDRELKQPFKQVFRELYVKTV DEKGRDKSLRYAGHQVQPTKTVALLKTRRWIIDGQEGLEKVYYKENIIAKIYALADWFSP ADIEAPTLEEVQFFDRKTFKPILIDNVPDLVFTEVMRDIDLVVSVAHIGDVDPEASHSTI EMRKAIVEFNCKLFKLKNVKFSENHVLIKGERAEYSIHLGSGLVHQKAGSAINVLPVHSQ HRGRVFLPFIDDDPKTAEIMAKVILFAQDEKIKDVFILEQIK >gi|224461250|gb|ACDC01000152.1| GENE 13 16120 - 21285 5850 1721 aa, chain - ## HITS:1 COG:no KEGG:FN0033 NR:ns ## KEGG: FN0033 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 115 1721 1 1607 1607 2492 85.0 0 MLIFYRKSLEKNIEKFIEDIRKASKNLDKENQKFIEDIFLEKKDGTKYTYYASYLRATLK QVLLSKKDVKLNDIFPKNVYPAMELLVGKKFLKIFLEISKNATKYSFSRGYSRRMVRSSS YYNYIDFLFDLFTDLVDLNFLNLNILTIVKGEYDNDGIYGLHNPYLIAYEIDNGNKELID LIKGALGSQKSKIDLNYYIIQAIFISNNKELLELTGKLLLAAKLQEGVRQEICENMDRGL QENFEYMFKIIYDNNLIRFSSVKRALGTWTGLTRDENADISKFGKKELEIINKLIANPKY EDELLKSDDNVEVYLALWNKSTRDIKEAVEAMEKLLKSSKYHIKLLISYFLDVIQDIKYQ REIAKKVIKEYDDTKEIIEILACYLNFVITYGSASDLKENLKNGKIIPETFFKNKKEALE FFDILEKALVLMDGKDKVFNPCIFPWFYQSIGTHTVATAMGLIAAFYPDDALKNRMMKHL KEINTWNRGYYLDVLFEKPNNKEEKDFVITMLSDRTSAGITAYEIAKDNNLVKEYPREIE 
DLLRLKNADTRKNLIDLLMSQDKKELLISIDNLVSAKNENKRLAGLDILNLANSKQKPLY DKKEVKNLVAKISSPTDAEKILIENLSDKKKKESENTLSKLYNTEYKLDLPYEIKEVEKL SKTIKKNKKSEYIIENSLNIKKIFTKSIDELFKIVKKLSELYVKNENYEYMSFYTKEYTL LRDGLSITKDVNNIPYNERQKLANYPLEDVWRDFYKKEIKDFSTLWQLYTLLVKDYNSSI NENNVKEYQDFYKKILGIDITELRAKLKKANLKYIFTENYYNDTGYVLEIIDMLYKEYSK ENKDYLFEIGKVFTSYILENFEAKDIVEQKERYNKEIYYSVSIYNRYSGVQYLFAKAIDY LEFYDDEKAFTESFVLRYNLDKKIERYINENLKDCEIGGPTKAFGLRNYAIASILKIAEK DLIYKYVLELDNEVAKEINVYGFSELDGFMDNYRNILAKKEDKKVATLNQFMLNEALKVI YDEGRKIVDYVVQNELKRGDSPTIYSKALNRIYKIEGIDYLVQILQALGKETLDRTSYYY GSGYDSKKGVLSHLLKVCYPTEKDNSKELAKKLKGTDITEQRLIEVAMFSSQWIEIIEGY LGWKGLASGCYYFQAHMSDIDVNKEGLFAKYTPISIEDLKEGAFDIDWFKSAYKELGEKK FEMLYDSAKYISDGAKHSRARMFADAVNGKLNLKETEKKIEDKRNKDLVASYSLIPLLKD KKKDALHRYQFLQKFLKESKKFGAQRRASEAKAVNISLENLSRNMGYSDVTRLIWNMETA LINEMKEYFVPKKLDDVDVYIKIDELGQSEIIYEKAGKELKSLPTKLKKDKYIEDIKEVH KNLKEQYRRSRKMLEEAMEDGTEFYGYEIENLMTNPVIAPILKSLVFKMDKNLGYYEDKK LKSVNKKAVTVKDDSLLKIAHCFDLFESGDWASYQKDIFDRELKQPFKQVFRELYVKTVD EKGRDKSLRYAGHQVQPTKTVALLKTRRWIIDGQEGLEKVYYKENIIAKIYALADWFSPA DIEAPTLEEVQFFDRKTFKPILIDDVPDLIFTEVMRDLDLVVSVAHVGDVDPEASHSTIE MRKAIVEFNCKLFKLKNVKFSENHVLIKGERAEYSIHLGSGLIHQKAGSAINVLPVHSQH RGRVFLPFIDDDPKTAEIMTKVILFAQDEKIKDVFILEQIK >gi|224461250|gb|ACDC01000152.1| GENE 14 21575 - 22240 715 221 aa, chain - ## HITS:1 COG:no KEGG:FN0035 NR:ns ## KEGG: FN0035 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 220 1 217 220 231 64.0 2e-59 MKKFSLKFILIFILSLFSSFVYADGVFSKYYNGRFNYVINVPTTKYENGVGGTENLNFVK NSNLIPTKNFFSAYEGANSDGLTIQDINGNIIILAYGTYFLNSEEVNGLSRETIRNSFEY DRLNYNLFLRKYYNGNLPKNIEPLKYDYNKNLFIYGENVAYNTIGKNFYVISYIEENKIV YKKVIYSKDSNAYIVFQASYLPKDKKFMDKLVVEMVNSIRY >gi|224461250|gb|ACDC01000152.1| GENE 15 22430 - 23437 1736 335 aa, chain - ## HITS:1 COG:RSc2075 KEGG:ns NR:ns ## COG: RSc2075 COG0059 # Protein_GI_number: 17546794 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Ralstonia 
solanacearum # 10 334 3 327 338 414 59.0 1e-115 MAGNILGTTVYYDADCNLQKLVGKKITVLGYGSQGHAHALNLKENGMDVTIGLRKDSKTW SVAEEAGFVVKETGEAVKDADVVMVLIPDEIQGDTYTNSIAPNLKKGAYLGFGHGFNIHF KKIQPREDVNVFMVAPKGPGHLVRRTFQEGSGVPCLIAVYQDPSGDTKDVALAWASGIGG GRAGILETTFKQETETDLFGEQAVLCGGITELIKTGFEVLTEAGYDPVNAYFECLHEMKL IVDLIYEGGFGKMRHSISNTAEYGDFLAGPKVITSVSKEAMKGLLADIQSGKFADEFLAD SKAGQPFLKAHRKAASEHQLEKVGQELRQLMSWIK >gi|224461250|gb|ACDC01000152.1| GENE 16 23602 - 24378 245 258 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 14 221 15 214 311 99 34 3e-20 MEKILSYKNVSFRRDGREILKNINWEIKKGENWALLGLNGSGKSTLLSMIPAYTFATSGE VSVFEKKFGTCIWAEVKEKVGFVSSSLNTFSDSLNNQTLNNIVLSGKYNSIGIYQEITQK DREKANNIIKDFKLSHLKLNKYITLSQGEQRKTLLARAFMNEPSLLILDEPCSGLDIRAR EIFLKTLEESKSDIPFIYITHQIEEIIPSITHVAILDNGEIVSQGNKFEVLTKENLSKLY GIDLKIEWSNNRPWLIVK >gi|224461250|gb|ACDC01000152.1| GENE 17 24387 - 25445 1522 352 aa, chain - ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 349 1 354 360 430 59.0 1e-120 MEYKIAVLKGDGIGPEIVDVTTKVLEKIGEKFNHKFIFTRGYLGGESIDKYGVPLSDETI EICKNSDAVLLGAVGGPKWDKIEAELRPEKGLLKIRKELEVFTNLRPAILFNELKNASPL KEEIIGDGLDIMVVRELTGGLYFGPKKYSEEEASDTLVYKREEIERITKKAFEIAKLRSK KLTSVDKQNVLDSSKLWRKIVNEISKDYPEVKVDHMYVDNAAMQLVINPRQFDVILTENT FGDILSDEASMLTGSIGMLPSASLGYGKVGIYEPCHGSAPDIAGQNIANPIATILSAAMM LRYSFNLNVEADTIEKAIEDVLKDGYRTADIYSEGYKKVGTIEIGEEIINRI >gi|224461250|gb|ACDC01000152.1| GENE 18 25500 - 25616 75 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRANFVVSERSEFNEFAANVNFLSLRNLASNELFFLH Prediction of potential genes in microbial genomes Time: Thu May 19 23:55:37 2011 Seq name: gi|224461249|gb|ACDC01000153.1| Fusobacterium sp. 
2_1_31 cont1.153, whole genome shotgun sequence Length of sequence - 2580 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 30/0.000 - CDS 69 - 644 730 ## COG0066 3-isopropylmalate dehydratase small subunit 2 1 Op 2 6/0.000 - CDS 644 - 2035 2191 ## COG0065 3-isopropylmalate dehydratase large subunit 3 1 Op 3 . - CDS 2045 - 2578 710 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases Predicted protein(s) >gi|224461249|gb|ACDC01000153.1| GENE 1 69 - 644 730 191 aa, chain - ## HITS:1 COG:SA1865 KEGG:ns NR:ns ## COG: SA1865 COG0066 # Protein_GI_number: 15927635 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Staphylococcus aureus N315 # 1 187 4 187 190 224 57.0 6e-59 MKPFIKFEGTIVPIMNDNIDTDQLIPKQYLKSTEKTGFGKYLFDEWRYNEDGSDNLDFNL NKSEYKKGTILITGDNFGCGSSREHAAWALQDYGFHVIVAGGYSGIFYMNWLNNGHLPIT LSKEERLELSKLPGDTVITVDLENNKLSTNGKDYFFNLEESWKERLLKGLDSIGLTLQYE DKIKEYENKNC >gi|224461249|gb|ACDC01000153.1| GENE 2 644 - 2035 2191 463 aa, chain - ## HITS:1 COG:lin2096 KEGG:ns NR:ns ## COG: lin2096 COG0065 # Protein_GI_number: 16801162 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Listeria innocua # 2 457 3 454 462 663 69.0 0 MKTLFDKVWEKHVITGNEGEAQLLYIDLHLIHEVTSPQAFSGLRIAGRRVRRPDLTFGTM DHNTPTIMADRYNIADETSKAQLEALKRNCEEFGVELADMFNERNGIVHMVGPELGLTLP GKTVVCGDSHTATHGAFGAIAFGIGTSEVEHVLATQTLWQKKPKTMGIEITGKLQKGVYA KDIILHLIKTYGIGLGNGYAFEFFGDTIKSLSMEERMTICNMAIEAGGKSGIIAPDEITF EYIKGREFSPKDEELEKKIKEWKELYTDDISAFDEYIKLDVSNLVPQVTWGTNPEMGMNI TDTFPEIKDLNYEKAYNYMDLKPGDSPKNINLKHVFIGSCTNGRLSDLEVVAKIVKGKKV HPNIKAVIVPGSQMVKKQAEEKGFAKIFLDAGFEWREAGCSTCLGMNPDLIPGGEHCAST SNRNFEGRQGKGARTHLVSPAMAAAAAIHGHFIDVRELEEVQD >gi|224461249|gb|ACDC01000153.1| GENE 3 2045 - 2578 710 177 aa, chain - ## HITS:1 COG:CC1541 KEGG:ns NR:ns ## COG: CC1541 
COG0119 # Protein_GI_number: 16125788 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Caulobacter vibrioides # 1 174 336 508 524 121 37.0 8e-28 LVLGKLSGKHAFVDKLNSLGFSGFDDKKIEELFADFKNLADKKKYVLDEDIISLISGDAA EVKGRFSLEHFEIIRTDIKAKAEIIMYVDGEKDVSSSYGSGPVDAAYKAINRLLNDNFVL EEYKLESITGDTDAQAQVVVIIEKDNKRHIGRAQSTDIVESSIKAYINALNRLYKED Prediction of potential genes in microbial genomes Time: Thu May 19 23:55:44 2011 Seq name: gi|224461248|gb|ACDC01000154.1| Fusobacterium sp. 2_1_31 cont1.154, whole genome shotgun sequence Length of sequence - 22861 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 9, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 967 1415 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 988 - 1047 5.4 2 2 Op 1 32/0.000 - CDS 1110 - 1601 729 ## COG0440 Acetolactate synthase, small (regulatory) subunit 3 2 Op 2 . - CDS 1591 - 3309 2487 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 4 2 Op 3 8/0.000 - CDS 3320 - 4531 1691 ## COG1171 Threonine dehydratase - Prom 4735 - 4794 5.3 5 2 Op 4 . - CDS 4797 - 6458 2737 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 6484 - 6543 3.2 + Prom 6770 - 6829 11.2 6 3 Op 1 2/0.000 + CDS 7006 - 7572 1085 ## COG0450 Peroxiredoxin + Term 7583 - 7612 0.5 + Prom 7584 - 7643 5.0 7 3 Op 2 . + CDS 7681 - 9318 2485 ## COG0492 Thioredoxin reductase + Term 9363 - 9410 4.0 - Term 9349 - 9396 4.0 8 4 Op 1 3/0.000 - CDS 9401 - 10837 2124 ## COG0260 Leucyl aminopeptidase - Prom 10871 - 10930 9.2 - Term 10914 - 10951 -0.2 9 4 Op 2 1/0.000 - CDS 10964 - 12058 1714 ## COG0012 Predicted GTPase, probable translation factor 10 4 Op 3 . - CDS 12081 - 12836 1329 ## COG0149 Triosephosphate isomerase 11 4 Op 4 . 
- CDS 12851 - 13450 621 ## FN1367 methyl-accepting chemotaxis protein 12 4 Op 5 1/0.000 - CDS 13469 - 13927 325 ## COG1040 Predicted amidophosphoribosyltransferases - Prom 14028 - 14087 4.3 - Term 13995 - 14035 1.3 13 4 Op 6 1/0.000 - CDS 14091 - 14309 213 ## COG3478 Predicted nucleic-acid-binding protein containing a Zn-ribbon domain 14 4 Op 7 1/0.000 - CDS 14299 - 14658 494 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 15 4 Op 8 1/0.000 - CDS 14679 - 15332 838 ## COG0164 Ribonuclease HII - Prom 15391 - 15450 4.3 16 4 Op 9 . - CDS 15453 - 16400 272 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit - Prom 16428 - 16487 9.5 + Prom 16490 - 16549 14.5 17 5 Tu 1 . + CDS 16573 - 16884 603 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 16896 - 16946 10.5 + Prom 17079 - 17138 14.9 18 6 Tu 1 . + CDS 17171 - 19966 3607 ## FN1905 168 kDa surface-layer protein precursor + Term 19986 - 20032 9.1 + Prom 20016 - 20075 16.5 19 7 Op 1 . + CDS 20106 - 20516 482 ## Lebu_0275 hypothetical protein + Prom 20555 - 20614 4.0 20 7 Op 2 . + CDS 20640 - 20840 227 ## gi|237738864|ref|ZP_04569345.1| predicted protein + Prom 20992 - 21051 14.7 21 8 Op 1 . + CDS 21131 - 21433 497 ## gi|237738865|ref|ZP_04569346.1| predicted protein + Prom 21437 - 21496 1.7 22 8 Op 2 . + CDS 21532 - 22185 844 ## gi|237738866|ref|ZP_04569347.1| predicted protein + Prom 22255 - 22314 4.7 23 9 Tu 1 . 
+ CDS 22341 - 22628 470 ## FN0038 hypothetical protein + Term 22656 - 22695 5.4 Predicted protein(s) >gi|224461248|gb|ACDC01000154.1| GENE 1 1 - 967 1415 322 aa, chain - ## HITS:1 COG:aq_2090 KEGG:ns NR:ns ## COG: aq_2090 COG0119 # Protein_GI_number: 15607049 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Aquifex aeolicus # 4 320 9 325 524 379 60.0 1e-105 MKCIKIFDTTLRDGEQTPRVNLNAKEKLRIAKQLEALGVDIIEAGFAAASPGDFEAIELI AQNIKNSTVTSLARAVKSDIEMAAKAIKKANKARIHTFIATSPIHREFKLKMSKEEILKT VDEMVRYARTFTNDIEFSAEDAMRTEKEYLVEVYETAIKAGATTINIPDTVGYRTPQEMY DTIKYLKENIKGIENIDISVHCHNDLGLAVANSIAAVQAGATQIECTINGIGERAGNTSL EEVVMLFKTRKDLFADFTTNIDTKQIYPTSKLVSLLTGVTTQPNKAIVGANAFSHESGIH QHGVLANPETYEIIKPEVVGRN >gi|224461248|gb|ACDC01000154.1| GENE 2 1110 - 1601 729 163 aa, chain - ## HITS:1 COG:MA3791 KEGG:ns NR:ns ## COG: MA3791 COG0440 # Protein_GI_number: 20092587 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanosarcina acetivorans str.C2A # 5 162 2 159 161 134 43.0 6e-32 MLNKEHQILIIAKNTNGIVARIMSLFNRRGYFVKKMSAGVTNKEGHARLTLTVDGDKESL DQIQKQVYKIIDVVKVKIFPEKDVIRRELMLLKVKADEETRSQIVQIANIYRGNILDVSP KSLVIELTGDIEKLRGFVGMMNNYGILEMAKTGIVAMSRGEKM >gi|224461248|gb|ACDC01000154.1| GENE 3 1591 - 3309 2487 572 aa, chain - ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 2 563 5 559 564 602 51.0 1e-172 MANEMIKGARILLECLSRLGIKEIFGYPGGAVIPIYDELYSFKDIKHYFARHEQGAVHEA DGYARSTGKVGVCLATSGPGATNLVTGIMTAHMDSIPLLAITGQVTSTLLGKDAFQESDI VGITVPITKNNYLVQDIRELPRILKEAYYIASTGRPGPVLVDIPRDIQLEEIPYDEFKKL 
YEQEFELEGYNPVYEGHKGQIKTAIKMIKDSKKPLIIAGAGILKGHAYDELKEFVDKTNI PVAMTLLGLGSFPGDHELALGMIGMHGTTYANYAANEADLVIAAGMRFDDRVTGNPLKFL PNANIIHIDIDPAEIGKNKLIDVPIVGDLKNVLAELNRKIPKLSHTKWLDEVAKLKKKYS LTFRKTEDDVLIPQEILFEINKLTKGEVIVATDVGQHQMWSAQFIKFNNPYSILTSGGAG TMGFGLPAAIGAQVANPDKKVLAIVGDGGFQMTFQELMMVKEYNLPVKIFIINNSYLGMV RQWQELFNDRRYSSVDLSYNPDFIKIGEAYGIKSIQLKTKKDLKKHLKKILESDEAVLVE CIVEKEENVYPMIPAGKDVSCIVGKRGVLDAE >gi|224461248|gb|ACDC01000154.1| GENE 4 3320 - 4531 1691 403 aa, chain - ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 398 1 398 404 518 73.0 1e-147 MHKLYDFIEARERLTTVVVKTKLIHSPVFSEESGNEIYLKPENLQKTGSFKIRGAYNKIA KLTEEEKKKGVIASSAGNHAQGVAYAAKRLGIKAVIVMPKHTPLIKVEATRKYGAEVVLH GEVYDDAYKKALELQKENGYVFVHPFNDEDVIEGQGTIALEILDELPDADIILVPLGGGG LVSGIASAAKLKNPQVKVIGVEPEGAASAIAALEKGKVVELAEANTIADGTAVKRIGEKN FEYIKKYVDDIVTVSDYELMEAFLLLVEKHKLVAENSGILPVAAAKKLNIKGKKIVAVLS GGNIDVLTISSMINKGLIMRGRIFTFSVQLADKPGQLLKVAEILAKQNANVIKLEHNQFK NLSRFKDVELQVTVETNGEEHISKIAEAFKKEGYDIVRENTPM >gi|224461248|gb|ACDC01000154.1| GENE 5 4797 - 6458 2737 553 aa, chain - ## HITS:1 COG:PAB0895 KEGG:ns NR:ns ## COG: PAB0895 COG0129 # Protein_GI_number: 14521553 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Pyrococcus abyssi # 3 553 2 551 551 664 60.0 0 MSRSNNLTEGAARAPHRSLLKGLGFVAEEMDRPIIGIANSFNEIIPGHVHLQTLVQAVKD GIRNAGGVPMEFNTIGICDGLAMNHLGMKYSLVTRQLIADSVEAVAMATPFDAIVFIPNC DKVVPGMLMAAARLNIPSIFISGGAMLAGVYKGKKVGLSNVFEAVGQYEAGLITRKELNT VEDLACPTCGSCAGMYTANTMNCLTEALGMGLPGNGTVPAVFSERLRLAKKAGMQILEIL KADLRPSDIMTKKAFENAVAVDMALGGSSNTALHLPAIAHEAGVDLTLDDFNEIAKKTPQ LCKLSPSGEYFIEDLYRAGGVTGVMKRLYENGRLNADEKTVALRTQGELAKDAYINDDDV IKPWDKPAYTTGGIAVLKGNLAEDGCVVKEGAVDKEMLVHSGPAKVFNSEEEAIKAMREK KIVAGDVVVIRYEGPKGGPGMREMLAPTATIAGMGLGKDVALITDGRFSGATRGASIGHV 
SPEAAAGGTIAIVQDGDIIEIDIPNRKINVKLSDEEIARRKAELKPYEPNVKGYLKRYAA HVSSAASGAIYVE >gi|224461248|gb|ACDC01000154.1| GENE 6 7006 - 7572 1085 188 aa, chain + ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 188 1 188 188 369 96.0 1e-102 MSLIGKKVPEFKAQAFKKGEKDFITVTDKDLLGKWSVFVFYPADFTFVCPTELEDLQDNY AAFQKEGAEVYSVSCDTAFVHKAWADHSDRIKKVTYPMIADPTGFLARAFEVMIEEEGLA LRGSFVINPEGKIVAYEVHDNGIGREAKELLRKLQGAKFVAEHGEVCPAKWQPGSETLKP SLDLIGEL >gi|224461248|gb|ACDC01000154.1| GENE 7 7681 - 9318 2485 545 aa, chain + ## HITS:1 COG:FN1984_1 KEGG:ns NR:ns ## COG: FN1984_1 COG0492 # Protein_GI_number: 19705280 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Fusobacterium nucleatum # 1 332 1 332 332 551 91.0 1e-157 MEKIYDMIIIGGGPAGLSAGIYGGRAKLDVLVIEKENKGGQISLTSEVVNYPGILEISGT EFMTQTRKQAEGFGVNFVQGEVVDMDFTKDIKTVKTKDAEYSALSVVIATGAAPRKLGFP GEQEFTGRGVAYCATCDGEFFTGMDIFVIGAGFAAAEEAMFLTKYGKSVTIIAREPDFTC AKSIGDKVKAHPKITTKFNTELIELTGDMKPTAAKFKNNVTGEITEYKAKVGETFGVFVF VGYAPSSQIFKGHIEIGEGGFIPTNEDLMTNVKGVFAVGDIRPKRLRQVVTAVADGAIAA TSIEKYVHDLREELGLKKEEKEEEKTTSIKTEKETFLDDDLRKQLVTVVDRFENPVEIVV FKNPAIEESLAIEDAVKDIASIAPEKLRFSSYNEGENKELEAKVKVERTPTIAVLDKDGN FSGLKYSSLPSGHELNSFILGLYNVAGPGQKVAPESLEKIDKIDKPINIKIGISLSCTKC PKTVQATQRIATLNKNIEMEMINIFTFQDFKNRYDIMSVPAIIVDDQHVYFGEKTVEDML EIINK >gi|224461248|gb|ACDC01000154.1| GENE 8 9401 - 10837 2124 478 aa, chain - ## HITS:1 COG:FN1906 KEGG:ns NR:ns ## COG: FN1906 COG0260 # Protein_GI_number: 19705211 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Fusobacterium nucleatum # 1 478 1 478 478 780 81.0 0 MSFNCVKKVENDYDKYVLVSTTGKINLPDYLDKKSKDLAKAVIEKNEFTAKASEKLAMTL VNNKKVIDFVIVGLGDKAKLNCKNIRQYLFDTLKNETGKVLLSFANEELDNMDIVAEVVE HINYKFDKYISKKKDKFLEVSYLTDKKVPKLIEGYELAKISNIVKDLINEQAEVMTPKAL 
ADKAVELGKQFGFQAEIMDEKKIQKLGMNAYLGVARAAHHRPYLIVMRYKGDEKSKYTHG LVGKGLTYDTGGLSLKPTASMLTMRCDMGGAGTMMGVMCAVAKMKVKKNVTCVIAACENS IGPNAYRPGDILTAMNGKTIEITNTDAEGRLTLADALTYIVRKEKVDEIIDAATLTGAVM VALGEDVTGVFTNNDEMAKEIISASNNWNEYFWQMPMFDIFKKNFKSPYADMQNSGSRWG GSTNAAKFLEEFIDDTKWTHLDIAGTAWASGANPYYSQKGATGQVFRTVFSYLKNSKN >gi|224461248|gb|ACDC01000154.1| GENE 9 10964 - 12058 1714 364 aa, chain - ## HITS:1 COG:FN1365 KEGG:ns NR:ns ## COG: FN1365 COG0012 # Protein_GI_number: 19704700 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Fusobacterium nucleatum # 1 364 1 364 364 641 89.0 0 MIGIGIVGLPNVGKSTLFNAITKAGAAEAANYPFCTIEPNVGMVTVPDERLNALAQIINP ERIVPATVEFVDIAGLVKGASKGEGLGNKFLSNIRATSAICQVVRCFDDENVIHVSGQVD PISDIEVINTELIFADIETIEKAIEKHEKLARNKIKESVELMAVLPKVKKHLEEFKLLKT LDLTDEEKQVLKNYQLLTLKPMIFAANVAEDDLATGNKYVDLVKDYAEKIGSEVVIVSAK VEAELQEMDDESKKEFLETLGVKEAGLNRLIRAGFKLLGLQTYFTAGVKEVRAWTIRIGD TAPKAAGEIHTDFEKGFIRAKVVSYDDFIKYSGWKGSQENGVLRLEGKEYIVHDGDLMEF LFNV >gi|224461248|gb|ACDC01000154.1| GENE 10 12081 - 12836 1329 251 aa, chain - ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 436 92.0 1e-122 MRRLVIAGNWKMYKNNSEAVATLTELKNLTKDVNNVDIVIGAPFTCLSDAVKTVEGSNVK IAAENVYPKIEGAYTGEISPKMLKDIGVEYVILGHSERREYFKESDEFINQKVKAVLEIG MKPILCIGEKLEEREGGKTLEVLATQIKGGLADLSKEEAVKVIVAYEPVWAIGTGKTATP EMAQETHKEVRNVLAEMFGKEVADKMIIQYGGSMKPENAKDLLSQEDIDGGLVGGASLKA DSFFEIIKAGN >gi|224461248|gb|ACDC01000154.1| GENE 11 12851 - 13450 621 199 aa, chain - ## HITS:1 COG:no KEGG:FN1367 NR:ns ## KEGG: FN1367 # Name: not_defined # Def: methyl-accepting chemotaxis protein # Organism: F.nucleatum # Pathway: not_defined # 1 199 1 201 201 212 64.0 6e-54 MEVYIDNQKTNFGRRSKDLEKILKAISKKLEKNNKVIENIYINGNSIEEFPFIDMDMKNV MEVTTKSYVDLSLESLNLSKEYIEIFFDINSGFQENIIEKEEISAIEIEETDVFLNWFSD 
LLYFLITNYSFTFPELEETFETFKGELAILSEFKEKKDYIAYVSTLNYCVSDILETFVAN IDYYQNCILNDEAQKKNLF >gi|224461248|gb|ACDC01000154.1| GENE 12 13469 - 13927 325 152 aa, chain - ## HITS:1 COG:FN1368 KEGG:ns NR:ns ## COG: FN1368 COG1040 # Protein_GI_number: 19704703 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Fusobacterium nucleatum # 1 150 53 202 204 204 75.0 7e-53 MRQIIADYKLRNRKDLAKDLAYLIQKPFFQLLEREKIDVIIPVPISDERMLERGFNQIEY LLDLLSVNYKKIQRIKDTKHMYSLKDVKKRAKNVKNVFKNKLNLTDKNVLIVDDVVTSGA TIRSISEELEKTNENINIKVFSIAMARHFINN >gi|224461248|gb|ACDC01000154.1| GENE 13 14091 - 14309 213 72 aa, chain - ## HITS:1 COG:FN1369 KEGG:ns NR:ns ## COG: FN1369 COG3478 # Protein_GI_number: 19704704 # Func_class: R General function prediction only # Function: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain # Organism: Fusobacterium nucleatum # 1 72 3 75 75 101 89.0 4e-22 MAFSCPKCRCRHCEEKSIILPEKKKNFIKIELNTYYAKTCLNCGYTEFYSAKIVDDETEK KCKDNAEPEGSY >gi|224461248|gb|ACDC01000154.1| GENE 14 14299 - 14658 494 119 aa, chain - ## HITS:1 COG:FN1370 KEGG:ns NR:ns ## COG: FN1370 COG0792 # Protein_GI_number: 19704705 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Fusobacterium nucleatum # 1 119 1 119 119 166 81.0 7e-42 MNTREIGNKYEDKSVEFLIKNSYKILERNYQNKYGEIDIIAQKDDEIVFIEVKYRKTNKF GYGYEAVDRKKLFKIVKLAQLYMQSKKYEKYKMRFDCMSYLEDELDWIKNIVWGDEIGF >gi|224461248|gb|ACDC01000154.1| GENE 15 14679 - 15332 838 217 aa, chain - ## HITS:1 COG:FN1371 KEGG:ns NR:ns ## COG: FN1371 COG0164 # Protein_GI_number: 19704706 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Fusobacterium nucleatum # 1 210 7 215 215 299 81.0 3e-81 MDNPLYLYDLEYKNVIGVDEAGRGPLAGPVVAAAVILKQYSEELDEINDSKKLTEKKREK LYDIILNNFNVAVGIASVEEIDKLNILNADFLAMRRALKDLEKFHEVDKDYIVLVDGNLK 
IKEYEGKQLPVVKGDAKSLSIAAASIIAKVTRDRIMKDLGLKYPDYDFEKNKGYGTKKHV EAIKTKGVLKNVHRKVFLRKILDETKDEPKEVQLRIL >gi|224461248|gb|ACDC01000154.1| GENE 16 15453 - 16400 272 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 34 290 32 275 285 109 31 2e-23 MENIKEKFEFEVSPEYEGMRLDKYLAEQIEEATRSYLEKLIDNSYVKINSKVINKNGRKL KSGEKIEISIPEEENIDIEAENIPLDIVFENDDFILVNKKYNMVVHPAYGNYTGTLVNAL LYYTNNLSSVNGNIRPGIIHRLDKDTSGLILVAKNNFAHAKLASMFTDKTIHKTYLCIVK GNFSDENLEGRIENLIGRDTKDRKKMAVVKENGKLAISNYRVVEQVKDYSLVEVLIETGR THQIRVHMKSINHPILGDVIYGSEDKNVKRQMLHAFKLEFLNPLDNKEYTFTGKLFDDFI EVAKNLKFDINKYTI >gi|224461248|gb|ACDC01000154.1| GENE 17 16573 - 16884 603 103 aa, chain + ## HITS:1 COG:FN0093 KEGG:ns NR:ns ## COG: FN0093 COG0526 # Protein_GI_number: 19703445 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 1 103 1 103 103 175 90.0 1e-44 MAIIKGTKDNFEAEVLKAEGIVVVDFGANWCGPCKSLVPILDEVVEEDPNKKIVKVDIDE EEELAAQYKIMSVPTLLVFRNGEIIDKSVGLIQKHEVKALFSK >gi|224461248|gb|ACDC01000154.1| GENE 18 17171 - 19966 3607 931 aa, chain + ## HITS:1 COG:no KEGG:FN1905 NR:ns ## KEGG: FN1905 # Name: not_defined # Def: 168 kDa surface-layer protein precursor # Organism: F.nucleatum # Pathway: not_defined # 681 931 1268 1487 1487 66 31.0 7e-09 MKEQIEKMLKRNLKRKLAITTATLIAFLLSANLAFAGGDYEVGNEGIKRENSGTLEEATE NLKKTGLFKIEAETIENRLKDWSTIKLQNEVNKTFINKGKIYLINEGIGKIENRGYIEAF LSINTASNILNSGYIDKMASDGNIYNKGYIKELLSSDTLNKGISIENEGTICTFNENKKL KNFGDVAVDNRDISSTEKIENYGLINFETGDYDTLKSANKTANKGIVLESYNLKEISPNS ITLKDDKGRLIKNYANSAIADTEINNGILNAFDATIDKNLEVKNNSVFNIYKAKIEDNKK ISFNNSTLNMSYTLIKNGIEMEFKNNSNIGNMKVIGQTSPLLKKLTIDNTTNVGSIMISD VDIEAINLVANEDIVNKNMVLRINDFETSPDKDVNTFVSTNVDLLGNSTTLGNITVKDGG RLTIGESTLFKNGLDEDSGISPVYYKKDIKIEGNGKILIGIDPYDISGVKELGQGNNLTD SIKGQVDTTDPNMLLAQDQLNIDADSLLHDIVIMPKYSFNYVDGAHEVRNKYKIVVAKDL 
AIPLPDDKPLQVTPAITITPATPITPGTSTPTVPATPAKPKDYGELNAIYRSIVTADKIK EFKVYNNDELKGFYRYLRDIYANSPYTTTIGNSLDNLSMLREKALFEIKPNLNKWAVMGG AVYNDNETKYKASTTEVDTKTTGAYAKGEYGLKEDTTFGLILGGTNSKTDLSTGKIKGSS AYLGAYAKKYVNNFKFTLGTAVDFAEDKVKRDAIGYEGIIETSRSSAKQKSRAFDLYTEL AYSKNLGNNFYIEPKFGLSYSRVRRGAVTENDGIVNLHVNSKTFNETKARVGLDLKKIIV SGNTIHNFIINTAYERILNGAKATTIKANVVGGSEFDILVPEREKGRATTGIEYKLENKF GLLFNLKVDYGFRHGSNKKSTRFSTGLGYKF >gi|224461248|gb|ACDC01000154.1| GENE 19 20106 - 20516 482 136 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0275 NR:ns ## KEGG: Lebu_0275 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 134 1 136 136 112 50.0 5e-24 MNKNYVGTYGVIKKNGGINLVCSVNYEGGGLFASILKCIDENNEYLKVIIFGSCKEENEK IAIIKKEGYEILKKPKFDVGDKVRLIKYPNEKAIVRLIIWHEKDRRIYYILDIEGNKKRS NSWYYEDENKFEKINE >gi|224461248|gb|ACDC01000154.1| GENE 20 20640 - 20840 227 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738864|ref|ZP_04569345.1| ## NR: gi|237738864|ref|ZP_04569345.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 66 1 66 66 92 100.0 8e-18 MICEELKSRKNFVEEDFIELRDSVEELISVIEKYKDMRYNSSTYIDDLKDFLADIRLTLK EKKNNR >gi|224461248|gb|ACDC01000154.1| GENE 21 21131 - 21433 497 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738865|ref|ZP_04569346.1| ## NR: gi|237738865|ref|ZP_04569346.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 100 5 104 104 143 100.0 3e-33 MNFLKLKDAANKLLEFMEKYDLDDYNERLVKKFLNELIYVIDTDEIDDIKKYQEVKEIIV GLYPPRGGLTEMYVADEDREKMNKINRELKELKKKITLLD >gi|224461248|gb|ACDC01000154.1| GENE 22 21532 - 22185 844 217 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738866|ref|ZP_04569347.1| ## NR: gi|237738866|ref|ZP_04569347.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 217 1 217 217 374 100.0 1e-102 MKYEYQGIKLGDSIEKIINLLNNKNTKLNDFGTDLIYKTGSTIEDISTRIYICLYTGIVV MIKVFDQDFCLVEDLKIGLPITNEIIEKYGLYEDDVAEDEGYYESIKYKKLVINIDWGTG RLKRYNDGIERIIGYTFYEQDGLEFNIRKDEVDNYLQCKNLKDIFYSLRKTNTIEVDVDK REIYGQLDNYKFTFDLVTRDIKSIQNLETREFVKTYN >gi|224461248|gb|ACDC01000154.1| GENE 23 22341 - 22628 470 95 aa, chain + ## HITS:1 COG:no KEGG:FN0038 NR:ns ## KEGG: FN0038 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 95 6 100 100 115 90.0 4e-25 MNATEKKELMGKYAKKLENAIKREASVTKEIENDKALIKYLEGQKTSGAAFDNTVYESYD AWIETIRKQIKKSESTLTNIEFKKVELEAIQKYIA Prediction of potential genes in microbial genomes Time: Thu May 19 23:56:25 2011 Seq name: gi|224461247|gb|ACDC01000155.1| Fusobacterium sp. 2_1_31 cont1.155, whole genome shotgun sequence Length of sequence - 3418 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 46 - 495 398 ## Lebu_1174 hypothetical protein 2 1 Op 2 . + CDS 495 - 1061 794 ## Lebu_1175 hypothetical protein + Term 1088 - 1126 -0.9 + Prom 1090 - 1149 3.7 3 2 Op 1 . + CDS 1169 - 1630 500 ## FN0169 coproporphyrinogen III oxidase 4 2 Op 2 . + CDS 1633 - 2025 496 ## FN0169 coproporphyrinogen III oxidase 5 2 Op 3 . + CDS 2050 - 2622 515 ## gi|237738872|ref|ZP_04569353.1| predicted protein + Prom 2624 - 2683 2.0 6 3 Op 1 . + CDS 2734 - 3069 354 ## gi|237738873|ref|ZP_04569354.1| predicted protein 7 3 Op 2 . 
+ CDS 3092 - 3416 390 ## gi|237738874|ref|ZP_04569355.1| conserved hypothetical protein Predicted protein(s) >gi|224461247|gb|ACDC01000155.1| GENE 1 46 - 495 398 149 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1174 NR:ns ## KEGG: Lebu_1174 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 146 38 178 178 88 40.0 9e-17 MIKFLRDEQKKDIDNDTKYIIEILTSLILIIIEDYYLEDDSFNELLIEFIYDKKHHKHED LAFLLEKKHSPKLINRVYDLAVMELDYTKEDEFFNIARKCTYALGYTNTPKAKEKLELLA QNENELIREYAIKQLNRHDFTDKDVEEQD >gi|224461247|gb|ACDC01000155.1| GENE 2 495 - 1061 794 188 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1175 NR:ns ## KEGG: Lebu_1175 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 6 186 5 185 185 254 70.0 1e-66 MKYEEQERKIYAKYDDKTIRVYQAYNNVIADEAIKLGTFGEHFSLTRMTWIKPSFLWMMY RCGWAEKENQERVLAIDIKRAAFDEIVKNSVISSYKPNLGITEDEWKEEVKNSLVRCQWD PERDIYGKPIGRRSIQLGIRGEAVEKYVNEWIVKITDITDEVKRIKKSIDNGTFKENLLP EEKEYIIK >gi|224461247|gb|ACDC01000155.1| GENE 3 1169 - 1630 500 153 aa, chain + ## HITS:1 COG:no KEGG:FN0169 NR:ns ## KEGG: FN0169 # Name: not_defined # Def: coproporphyrinogen III oxidase # Organism: F.nucleatum # Pathway: not_defined # 22 137 8 130 135 84 44.0 2e-15 MLPKSNMNKTREKNMLVYSFEMSEKEKVYLSVGVIATIFDSLKFLKTSDKLKIKKNKGLF YKGSTYIEKENISKLKKIVSSWKGLFSEATQNFVLIGFFNTKIDDYERWNCNKEEVIESL EKLVVLCEKAEKENKIIRCRKLTVQLADNRGER >gi|224461247|gb|ACDC01000155.1| GENE 4 1633 - 2025 496 130 aa, chain + ## HITS:1 COG:no KEGG:FN0169 NR:ns ## KEGG: FN0169 # Name: not_defined # Def: coproporphyrinogen III oxidase # Organism: F.nucleatum # Pathway: not_defined # 2 130 1 135 135 101 42.0 8e-21 MLKHDFGIVGEKKEFFLEDNLILYMIDSFEWIKTLSELEKNVEKYGLNYHGITYFKGESI TKLKNIILHWINIFSLGEDVIELRGMYYINIGKHSYNKYKKKYLIESLKKLVVLCEKAEK ENKIIEHWGI >gi|224461247|gb|ACDC01000155.1| GENE 5 2050 - 2622 515 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738872|ref|ZP_04569353.1| ## NR: gi|237738872|ref|ZP_04569353.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 190 1 190 190 303 100.0 4e-81 MKCIRNICLYLKKYISDKQFERIFYQDIDDFKSILEENIYWKILSSNFNKKEDIISMNTD LYDYVEKNYKSVYDEISDAYIEKLIETNEKNEIIDILKKKYKQKEEVFINCCMIDTKLEL IYLIKKALNYPKHCANNWDAIEDFIYDVVLPKKIVLQNWDSIKEKLPQDTIILKRILDKI NSRYSTVLYE >gi|224461247|gb|ACDC01000155.1| GENE 6 2734 - 3069 354 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738873|ref|ZP_04569354.1| ## NR: gi|237738873|ref|ZP_04569354.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 111 1 111 111 175 100.0 9e-43 MKRVIYKKVINENKENRTEFLLINFDYEDGNDYLAKIFTKEFNMKVEEKKDYIWFSIIKL CKKNTCYELLWHEDIGNIIYSLEQDEDIVNELELRLQKVLDVLNIKILESN >gi|224461247|gb|ACDC01000155.1| GENE 7 3092 - 3416 390 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738874|ref|ZP_04569355.1| ## NR: gi|237738874|ref|ZP_04569355.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 108 1 108 109 202 100.0 6e-51 MRGHAVYFFMLKDDLIESFKRVEEKLGGLQYVVHTTYKEPKFEIFDSIEKITDIGLIKPI EPNYFIALKNEKFSMREIKLKSGELCYDIQDKQGFLQFFPSGIFENSN Prediction of potential genes in microbial genomes Time: Thu May 19 23:56:57 2011 Seq name: gi|224461246|gb|ACDC01000156.1| Fusobacterium sp. 2_1_31 cont1.156, whole genome shotgun sequence Length of sequence - 3376 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 85 - 246 165 ## gi|294781855|ref|ZP_06747187.1| hypothetical protein HMPREF0400_02083 + Term 317 - 365 7.0 + Prom 353 - 412 12.2 2 2 Op 1 1/0.000 + CDS 611 - 2947 3280 ## COG1193 Mismatch repair ATPase (MutS family) 3 2 Op 2 . + CDS 2923 - 3375 195 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 Predicted protein(s) >gi|224461246|gb|ACDC01000156.1| GENE 1 85 - 246 165 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294781855|ref|ZP_06747187.1| ## NR: gi|294781855|ref|ZP_06747187.1| hypothetical protein HMPREF0400_02083 [Fusobacterium sp. 
1_1_41FAA] # 1 53 140 192 192 104 100.0 1e-21 MKINRGISTYVGKSIIENKEKYRIPYGSPASPPEEDFDVSDMVWQEKGRKKKD >gi|224461246|gb|ACDC01000156.1| GENE 2 611 - 2947 3280 778 aa, chain + ## HITS:1 COG:FN1581 KEGG:ns NR:ns ## COG: FN1581 COG1193 # Protein_GI_number: 19704902 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 778 1 778 778 1300 94.0 0 MNKHSFNVLEFDKLKELILENIVIDDNREVIENLEPYKDLSALNNELKTVKDFMDLISFD GGFEAVGLRNINSLMDKIKLIGTYLEVEELWDINVNLRTVRVFKARLDELGKYKQLRDTI GNIPNLRMIEDMINKTINPEKEIKDDASLDLRDIRLHKKTLNMNIKRKFEELFDEPSLAN AFQERIITERDGRMVTPVKFDFKGLIKGIEHDRSSSGQTVFIEPLSIVSLNNKMRELETK EKEEIRKILLRIAELLRNNRDDILAIGDKALYLDILNAKSIYAVDNKCEIPTVSNREVLS LEKARHPFIDKDKVVPLTFEIGKDYDILLITGPNTGGKTVALKTAGLLTLMALSGIPIPA SENSKIGFFEGVFADIGDEQSIEQSLSSFSAHLKNVKEILAGVTKNSLVLLDELGSGTDP IEGAAFAMAVIDYLNEKKAKSFITTHYSQVKAYGYNEEGIETASMEFNTDTLSPTYRLLV GIPGESNALTIAQRMGLPESIISKARAYISEDNKKVEKMIENIKTKSQELDEMRERFARL EEEARLDRERAKQETLVIEKQKNEIIKAAYEEAEKMMNEMRAKASALVEKIQHEEKNKED AKQIQKNLNMLSTALREEKNKTVEVVKKIKTKVNFKVGDRVFVKSINQFANILKINTSKE SASVQAGILKLEVPFEEIKIVEEKKEKVYNVSTHKKTPVRSEIDLRGKMVDEGIYELETY LDRATLNGYTEVYVIHGKGTGALREGILKYLKTSKYVKEYRIGGHGEGGLGCTVVTLK >gi|224461246|gb|ACDC01000156.1| GENE 3 2923 - 3375 195 151 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 15 150 7 140 234 79 31 3e-15 MYSGDSKIEKKVTFILAAAGQGKRMNMSLAKQFLEYKGEPLFYSSLKIAFENQYIDDIII VTNKENIKNIREFCENKKLLSKVKYIVEGGSERQYSIYNAIKKIENTDIVIIQDAARPFL KDKYIEESLKILDNTCDGVIIAVKCKDTIKV Prediction of potential genes in microbial genomes Time: Thu May 19 23:57:06 2011 Seq name: gi|224461245|gb|ACDC01000157.1| Fusobacterium sp. 
2_1_31 cont1.157, whole genome shotgun sequence Length of sequence - 13241 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 3, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 238 238 ## COG1211 4-diphosphocytidyl-2-methyl-D-erithritol synthase 2 1 Op 2 8/0.000 + CDS 253 - 1674 2110 ## COG0215 Cysteinyl-tRNA synthetase 3 1 Op 3 1/0.000 + CDS 1662 - 2057 389 ## COG1939 Uncharacterized protein conserved in bacteria 4 1 Op 4 1/0.000 + CDS 2059 - 3087 1691 ## COG1077 Actin-like ATPase involved in cell morphogenesis 5 1 Op 5 . + CDS 3090 - 3959 975 ## COG0470 ATPase involved in DNA replication 6 1 Op 6 . + CDS 4001 - 4240 489 ## gi|237738883|ref|ZP_04569364.1| predicted protein 7 1 Op 7 . + CDS 4227 - 4511 356 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system + Prom 4521 - 4580 5.5 8 1 Op 8 . + CDS 4626 - 4928 268 ## COG2827 Predicted endonuclease containing a URI domain + Term 5048 - 5114 30.0 + TRNA 5014 - 5101 70.9 # Leu TAA 0 0 + TRNA 5115 - 5191 81.5 # Met CAT 0 0 + TRNA 5216 - 5291 93.2 # Gly TCC 0 0 + TRNA 5301 - 5376 94.1 # Lys TTT 0 0 + TRNA 5384 - 5460 90.7 # Arg TCT 0 0 + TRNA 5468 - 5545 96.0 # Met CAT 0 0 + TRNA 5562 - 5636 64.0 # Glu TTC 0 0 + TRNA 5643 - 5726 64.5 # Ser TGA 0 0 + TRNA 5749 - 5824 87.4 # Phe GAA 0 0 + TRNA 5836 - 5911 94.0 # Val TAC 0 0 + TRNA 5923 - 5999 95.0 # Asp GTC 0 0 - Term 6009 - 6041 4.0 9 2 Tu 1 . - CDS 6050 - 8116 3097 ## COG0480 Translation elongation factors (GTPases) - Prom 8271 - 8330 11.7 + Prom 8405 - 8464 9.5 10 3 Op 1 . + CDS 8491 - 10212 2111 ## GYMC10_0303 hypothetical protein 11 3 Op 2 . + CDS 10224 - 10427 287 ## gi|237738888|ref|ZP_04569369.1| predicted protein 12 3 Op 3 . + CDS 10427 - 10876 697 ## GYMC10_0304 hypothetical protein 13 3 Op 4 . + CDS 10879 - 12450 2286 ## COG0464 ATPases of the AAA+ class 14 3 Op 5 . 
+ CDS 12462 - 12953 701 ## GYMC10_0302 hypothetical protein + Term 13056 - 13111 4.5 Predicted protein(s) >gi|224461245|gb|ACDC01000157.1| GENE 1 2 - 238 238 78 aa, chain + ## HITS:1 COG:FN1580 KEGG:ns NR:ns ## COG: FN1580 COG1211 # Protein_GI_number: 19704901 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2-methyl-D-erithritol synthase # Organism: Fusobacterium nucleatum # 1 78 154 231 231 132 88.0 2e-31 ENGIIVETPNRNNLIAVHTPQTFKFEILKKAHQMAEEKNILATDDASLVENISGRIKFIH GDYDNIKITVQEDLKYLK >gi|224461245|gb|ACDC01000157.1| GENE 2 253 - 1674 2110 473 aa, chain + ## HITS:1 COG:FN1579 KEGG:ns NR:ns ## COG: FN1579 COG0215 # Protein_GI_number: 19704900 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 473 1 473 473 874 91.0 0 MIKIYNTLTGHLDEFKPIKENEVSMYVCGPTVYNYIHIGNARPAIFFDTVRRYLEYRGYK VTYVQNFTDVDDKMINKANAENVSIKEIAERYIKAYFEDTAQINLKEDGMIRPKATDNID GMINIIKSLVDKGYAYESNGDVYFEVKKYKEGYGELSKQNIEDLESGARIDVNEIKKDAL DFALWKSSKPNEPSWDSPWGKGRPGWHIECSAMSRKYLGDSFDIHGGGLDLIFPHHENEM AQSKCGCGGTFARYWMHNGYININGEKMSKSSGSFVLLRDILKHFEGRVIRLFVLGSHYR KPMEFSDTELNQTKSSLERIENSLKRIKELNRENLDGTNDCQELLATKKEMEAKFIEAMD EDFNTAQALGHVFELVKSVNKALDEENFSKTAIEVLDEVYSYLVMIIEEVLGVKLKLEAE VNNISADLIELILELRKDAREQKNWALSDKIRDRLLELGIKIKDGKDKTTWTM >gi|224461245|gb|ACDC01000157.1| GENE 3 1662 - 2057 389 131 aa, chain + ## HITS:1 COG:FN1578 KEGG:ns NR:ns ## COG: FN1578 COG1939 # Protein_GI_number: 19704899 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 114 1 112 129 181 83.0 4e-46 MDNVDKLSKKDIRDYTGLELAFIGDAIWELEIRRYYLQFGYSIPTLNKYVKNKVNARYQS LIYKQIIEELDEKFKVIGKRAKNSNIKTFPKTCTVMEYREATALEAVVGAMYLLNEEEEI KKIINIVIKGE >gi|224461245|gb|ACDC01000157.1| GENE 4 2059 - 3087 1691 342 aa, chain + ## HITS:1 COG:FN1577 KEGG:ns NR:ns ## COG: FN1577 COG1077 # Protein_GI_number: 19704898 # Func_class: D Cell cycle control, cell division, 
chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 1 342 1 342 342 598 95.0 1e-171 MGFFNFRANRSIGIDLGTANTLVYSKKHKKIVLNEPSVVAVEKETKKVLAVGNEAREMLG KTPDTIVAVKPLSEGVIADYDITEAMIKYFIKKIFGSYSFFMPEIMICVPIDVTGVEKRA VLEAAISAGAKKAYLIEEARAAALGSGMDIAAPEGNMIIDIGGGSTDVAIISLGGTVVSK TIRIAGNNFDNDIVKYVKKTYNLLIGDRTAEEIKIKIGTALPLEEEETIEVKGRDLLMGL PKVITITSEEVREAIKDSLDQILQCIRTVLEKTPPELAADIVDKGMIMTGGGSLIRNFPE MITKYTNLKVNLAENPLESVVIGAGLALDQIDVLRKIEKAER >gi|224461245|gb|ACDC01000157.1| GENE 5 3090 - 3959 975 289 aa, chain + ## HITS:1 COG:FN1576 KEGG:ns NR:ns ## COG: FN1576 COG0470 # Protein_GI_number: 19704897 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Fusobacterium nucleatum # 1 289 1 289 289 415 87.0 1e-116 MLDEFLKNELSFNREAGTYLFYGDDLEKNYRIALEFSAALFSRNIENEDEKSKIKDKTLR NLYSDLMVVDNLNIDAVRDIIKKTYTSSHEGGAKVFILKNIQDIRKESANAMLKIIEEPT RDNFFILISKRLNILSTIKSRSIIYRVRKSTPEELGVDKYVYNFFLGISNDIAEYKEQEI DLMLEKSYKSIAGVLKEYEKEKNIVVKIDLYKCLRNFVQESTSLKKYEKIKFAEDIYSNA SKESINLIVDYIINLVKKNKNLKEKLEYKKMLRYPVNMKLLLINLIMSI >gi|224461245|gb|ACDC01000157.1| GENE 6 4001 - 4240 489 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738883|ref|ZP_04569364.1| ## NR: gi|237738883|ref|ZP_04569364.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 79 4 82 82 135 100.0 9e-31 MTTLSLNITDEQKKFLTDYANDKNVSIADMFTLFIEYLERLEDMEDYNLAVARMLDPNNK PCGTMKELASEFGIDYDEL >gi|224461245|gb|ACDC01000157.1| GENE 7 4227 - 4511 356 94 aa, chain + ## HITS:1 COG:SP1223 KEGG:ns NR:ns ## COG: SP1223 COG2026 # Protein_GI_number: 15901085 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Streptococcus pneumoniae TIGR4 # 4 87 2 84 84 69 38.0 2e-12 MMSYKFIPTIEFKEDLKKLDGAAVKFILKYIKKLESSENPKAYGKELSGNMVGLYRFRVS DYRIVAKFINNTFIIQGLGVGHRKNVYKILEKRL >gi|224461245|gb|ACDC01000157.1| GENE 8 4626 - 4928 268 100 aa, chain + ## HITS:1 COG:FN1575 KEGG:ns NR:ns ## COG: FN1575 COG2827 # Protein_GI_number: 19704896 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Fusobacterium nucleatum # 1 98 1 98 100 119 78.0 1e-27 MFYYLYMLRCEDGSIYTGTAKDYLKRYEEHLSGKGAKYTRSHKVKKIERVFLCENRSIAC ILESEIKKLTKDKKEAIIIEPDTYVKELENTRKIKILKKI >gi|224461245|gb|ACDC01000157.1| GENE 9 6050 - 8116 3097 688 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 688 3 690 690 1271 93.0 0 MKVFTTDNIRNISLLGHRGSGKTTLIESILYVKDYIKRKGDVENGTTVSDFDKEEIRRIF SINTSLIPVEHNNVKLNFLDTPGYFDFVGEVISSLRVSASAVLVLDATAGVEVGTEKAWK LLEERKLPRIIFVNKMDKGYVNYTKLLNELKEKFGKKIAPFCIPIGEKDEFKGFVNVVDM VGRVFDGKECVDTPIPADVDVSEVRNLLFEAIAETDEALMDKYFAGEEFTQEEIVKGLHK GVVNGDIVPVMVGSAQQNIGIHTLLNYLDLYMPCPTELFSGQRVGEDPITQQEKVVKISD ENPFSAIVFKTLVDPFIGKITFFKVNSGVLRKETEVFNPKKNKKERIAQLITMQGNKQIE IEELHAGDIGATTKLLYTQTGDTLCDKNYPVVFNKIRFPKPNIFSGVLPADKNDDEKLST ALQRVMEEDPTFVVTRNYETKQLLIGGQGEKHLYIILCKIKNKFGVHAELQDVIVSYRET ILGKAEVQGKHKKQSGGAGQYGDVFIRFEPSDKEFEFVDEIKGGVVPRNYIPAVEKGLME 
AKEKGVLAGYPVINFKATLYDGSYHPVDSNDLSFKLAAILAFKLGMEKAKPVLLEPVVKM KITIPEEYMGDVMGDLNKRRGRVLGMDHNEAGEQLLFAEVPEAEILKYSIDLRALTQGRG EFEYEFVRYEEVPENISKRVKEERNKDK >gi|224461245|gb|ACDC01000157.1| GENE 10 8491 - 10212 2111 573 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_0303 NR:ns ## KEGG: GYMC10_0303 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 7 573 6 564 567 363 38.0 1e-98 MADDYKISWANLKFIEDSLSIISNEIKYVDNRVSQVDGNVKIVQSKIEILAREFKDYVEM QTLANRKAEAKLNLSAIRDKLKDKFGHYDVVRRTATGILQANDLAIVKSSMLSHVTEKQM IETPNYWLTPCLVALAAWINNDKSLAERALAEGIRRNDEKTSLFFGLICRRVGREHSTLR WLARYLEAQDEEMLDRKAVIVLDAFASGILGNDTENFVYKQIQEWMANLEIKPGFTERQL ENWTDAINSKRVELKKGLYPYLEQYSNTWETLKDVLEGANLNNDLYQYFRNIFDQKEETK KLKVELDKILDSLVTEFDEEELPLKRQEQFEQLVVDNNGSESRAQAQMALEKSVYDDYRD FMQLLTDAAMNPEESKSSTATQKFATALSRNNIVTAFNDITAKNRIKVPYDIEINVDNFN DKTQNGEDEEEVLNRFEELIEQEKQEELSKAKLDLFQQFCLYGGAAVILYGIIKTFMDKS LAFITIIIGIGLIIYHFTSKSKLQKIIQQIIEAYDKKLESGQQIIRAIIAEIVDFRIEFN ERDAESTKVLDFFEQIRPEEYIRKIGTNERKIM >gi|224461245|gb|ACDC01000157.1| GENE 11 10224 - 10427 287 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738888|ref|ZP_04569369.1| ## NR: gi|237738888|ref|ZP_04569369.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 67 1 67 67 83 100.0 4e-15 MAEIMDKIKSAFKSIASTEEKEKKETATVSTIEKDAKKNEKNEYEGFASGFPEWDLEPPQ APVRRKR >gi|224461245|gb|ACDC01000157.1| GENE 12 10427 - 10876 697 149 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_0304 NR:ns ## KEGG: GYMC10_0304 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 5 143 4 146 180 72 30.0 4e-12 MDKAPKIYADWIKVLDVLKSAEDDENALALMEKGEIVWQSGVAERFLNKIVAAMNFRLKR AIDNFQKSYRGDENETIKAIMQLRKELQFLLKVVSIKALPDKEKAELRNIIVAQSNSIQE SLEKSSESDRSGKLASIVKNNRVNVQWEG >gi|224461245|gb|ACDC01000157.1| GENE 13 10879 - 12450 2286 523 aa, chain + ## HITS:1 COG:PA0657 KEGG:ns NR:ns ## COG: PA0657 COG0464 # Protein_GI_number: 15595854 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Pseudomonas aeruginosa # 17 500 14 492 493 241 31.0 2e-63 MSDIKKAKELLKRYSSARIPFIVIDTMERDRTLEVLKEVADELTISFFVHTMSKGIYDLS SGKVLSEDKSIYSAIDYMSDQMKRRQNLTLILTGIPDISNENADAKQLFDLVTHANETGG SIIVFTNGGVWNQLQRLGMTLKIDNPNEEEMYDIIKKYIKDYRNEVSIEWDEGDIREAAS ILNGVTRIEAENVIATLIAKREITKEDMDEVRFAKDRLFSNISGLEKIDVDESVVNVGGL AGLRKWLDEKKELLRVDKKDLLRSKGLRSPRGILLVGVPGCGKSLSAKAISASWKLPLYR LDFATVQGSYVGQSEQQLKDALTTAENVSPCILWIDEIEKGLSGAGSSNDGGVSTRMVGQ FLFWLQESKKQVFVVATANDVSMLPSELLRRGRFDELFFIDLPTAEERYDIIKMYMRKYL SLDFAGELADRIVEMTEGFTGADLESTVRDLAYRVIANDDFVLDEENVVQAFKNVVPLSQ TSPEKIAAIRDWGKERAVPASGKPIGAEEIKTSTESRTRKLLV >gi|224461245|gb|ACDC01000157.1| GENE 14 12462 - 12953 701 163 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_0302 NR:ns ## KEGG: GYMC10_0302 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 27 163 22 156 158 131 48.0 7e-30 MPKIMLKNPTETKTESYNNLSLTKPVEKAKTTVESVNLQRLLRKKRSKRYMEAMGALDTG GHVNNQEKINELIEAIREEFPEVELIKTGIFIGMVSKCFLGDPYEVHTVDFTHTIIKHYK QGETLPDGMEKARAIAARNIYMYIEVYTDHCCAISADGTVSII Prediction of potential genes in microbial genomes Time: Thu May 19 23:57:28 2011 Seq name: gi|224461244|gb|ACDC01000158.1| Fusobacterium sp. 
2_1_31 cont1.158, whole genome shotgun sequence Length of sequence - 1239 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 130 168 ## 2 1 Op 2 . + CDS 169 - 984 737 ## COG2801 Transposase and inactivated derivatives - Term 919 - 975 12.2 3 2 Tu 1 . - CDS 1038 - 1238 123 ## gi|294781844|ref|ZP_06747176.1| conserved hypothetical protein Predicted protein(s) >gi|224461244|gb|ACDC01000158.1| GENE 1 2 - 130 168 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no TLSEKEKIKQLEEENLYLKAENEYLKKLRALVQERELKEKKK >gi|224461244|gb|ACDC01000158.1| GENE 2 169 - 984 737 271 aa, chain + ## HITS:1 COG:FN0841 KEGG:ns NR:ns ## COG: FN0841 COG2801 # Protein_GI_number: 19704176 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 65 271 1 207 207 341 94.0 7e-94 MLLKIAGISRSVYYYYIDKKDIDEKNKDIIEKIKEIYYVNKGRYGYRRVTLELKNQGFNI NHKKVQRLIKKFNLQSIIRKKRKYSSYKGRIGKIADNHIKRNFEATAPNQKWFTDVTEFN LRGEKLYLSPILDAYGRYIVSYDISRSANLEQINHMLSLAFKENENYENLIFHSDQGWQY QHYSYQEKLKEKKITQSMSRKGNSLDNGLMECFFGLLKLEMFYEQEEKYKTLEELKEAIE DYIYYYNNKRIKEKLKGLTPASYRSQSLLVS >gi|224461244|gb|ACDC01000158.1| GENE 3 1038 - 1238 123 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294781844|ref|ZP_06747176.1| ## NR: gi|294781844|ref|ZP_06747176.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 1 66 108 173 173 109 93.0 7e-23 LEFSAPLTEDEFFNSLYIAYINFYIENEDDISCNFDLDCEPDYLFGHLANIELDENNEIL MSGING Prediction of potential genes in microbial genomes Time: Thu May 19 23:57:40 2011 Seq name: gi|224461243|gb|ACDC01000159.1| Fusobacterium sp. 2_1_31 cont1.159, whole genome shotgun sequence Length of sequence - 8871 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 313 423 ## gi|294781844|ref|ZP_06747176.1| conserved hypothetical protein - Prom 335 - 394 19.9 + Prom 373 - 432 11.8 2 2 Op 1 56/0.000 + CDS 479 - 847 627 ## PROTEIN SUPPORTED gi|19704890|ref|NP_602385.1| 30S ribosomal protein S12 3 2 Op 2 51/0.000 + CDS 875 - 1345 778 ## PROTEIN SUPPORTED gi|237738896|ref|ZP_04569377.1| SSU ribosomal protein S7P 4 2 Op 3 30/0.000 + CDS 1388 - 3469 3281 ## COG0480 Translation elongation factors (GTPases) + Prom 3479 - 3538 10.7 5 2 Op 4 . + CDS 3558 - 4742 1548 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 4763 - 4806 9.4 + Prom 4754 - 4813 4.5 6 3 Tu 1 . + CDS 4836 - 8871 5238 ## FN0387 hypothetical protein Predicted protein(s) >gi|224461243|gb|ACDC01000159.1| GENE 1 1 - 313 423 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294781844|ref|ZP_06747176.1| ## NR: gi|294781844|ref|ZP_06747176.1| conserved hypothetical protein [Fusobacterium sp. 1_1_41FAA] # 1 104 1 104 173 153 97.0 3e-36 MNLKEVKNILKNSKYLSKTKIEDEVEINGTISLWNRNDVDIIIEFDDENDINFSEDTLKL IEDKLNWIDKNKKLICKTFIEDEGMFYGLNDEIEKQLSKKEKAK >gi|224461243|gb|ACDC01000159.1| GENE 2 479 - 847 627 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19704890|ref|NP_602385.1| 30S ribosomal protein S12 [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 122 1 122 122 246 100 6e-65 MPTLSQLVKKGRQTLTEKKKSPALQGNPQRRGVCIRVYTTTPKKPNSALRKVARVKLTNG IEVTCYIPGEGHNLQEHSIVLVRGGRTKDLPGVRYKIIRGALDTAGVAKRKQGRSKYGAK NA >gi|224461243|gb|ACDC01000159.1| GENE 3 875 - 1345 778 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738896|ref|ZP_04569377.1| SSU ribosomal protein S7P [Fusobacterium sp. 
2_1_31] # 1 156 1 156 156 304 100 2e-82 MSRRRAAVKRDVLPDSRYSDKVVTKVINSIMLDGKKSIAEGIFYSAMDLIKEKTGQEGYD VFKQALENIKPQIEVRSRRIGGATYQVPVEVKADRQQTLAIRWLTTYTRARKEYGMIEKL AAELIAAANNEGATIKKKEDTYKMAEANRAFAHYRV >gi|224461243|gb|ACDC01000159.1| GENE 4 1388 - 3469 3281 693 aa, chain + ## HITS:1 COG:FN1556 KEGG:ns NR:ns ## COG: FN1556 COG0480 # Protein_GI_number: 19704888 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 693 1 693 693 1317 96.0 0 MARKVSLDMTRNVGIMAHIDAGKTTTTERILFYTGVERKIGEVHEGQATMDWMEQEQERG ITITSAATTCFWKGHRINIIDTPGHVDFTVEVERSLRVLDGAVAVFSAVDGVQPQSETVW RQADKYKVPRLAFFNKMDRIGANFDMCVSDIKEKLGSNPVPIQIPIGAEDQFEGVVDLIE MKEVVWPVDSDNGQHFDVKDIRAELQEKAEEARQYMLESIVETDDALMEKFFGGEEITKE EIVKGLRKATIDNTIVPVVCGTAFKNKGIQALLDAIVNFMPAPTDVAMVEGRDPKDPEKL IDREMSDEAPFASLAFKVMTDPFVGRLTFFRVYSGIVEKGATVLNSTKGKKERMGRILQM HANKREEIEQVYCGDIAAAVGLKDTTTGDTLCAEEAPIVLEQMEFPEPVISVAVEPKTKN DQEKMGIALSKLAEEDPTFRVRTDEETGQTIISGMGELHLEIIVDRMKREFKVESNVGKP QVAYRETITQSYDQEVKYAKQSGGRGQYGHVKIILEPNPGKEFEFVNKITGGVIPREYIP AVEKGCREALESGVIAGYPLVDVKVTLYDGSYHEVDSSEMAFKIAGSMALKQAATKAKPV ILEPVFKVEVTTPEEYMGDIIGDLNSRRGMVSGMIDRNGAKIITAKVPLSEMFGYATDLR SKSQGRATYSWEFSEYLQVPASIQKQIQEERGK >gi|224461243|gb|ACDC01000159.1| GENE 5 3558 - 4742 1548 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 394 1 407 407 600 72 1e-172 MAKEKFERSKPHVNIGTIGHVDHGKTTTTAAISKVLSDKGLAKKVDFDQIDAAPEEKERG ITINTAHIEYETANRHYAHVDCPGHADYVKNMITGAAQMDGAILVVSAADGPMPQTREHI LLSRQVGVPYIIVYLNKSDMVDDEELLELVEMEVRELLTEYGFPGDDIPVIRGSSLGALN GEEKWIEKIMELMDAVDSYIPTPERAVDQPFLMPIEDVFTITGRGTVVTGRVERGIIKVG EEIEIVGIKPTTKTTCTGVEMFRKLLDQGQAGDNIGVLLRGTKKEEVERGQVLAKPGSIH PHTNFKGEVYVLTKDEGGRHTPFFSGYRPQFYFRTTDITGAVTLPEGVEMVMPGDNITMT VELIHPIAMEQGLRFAIREGGRTVASGVVSEIIK >gi|224461243|gb|ACDC01000159.1| GENE 6 4836 - 8871 5238 1345 aa, chain + ## HITS:1 COG:no KEGG:FN0387 NR:ns ## KEGG: 
FN0387 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 887 1345 359 812 1724 309 43.0 4e-82 MNNNLYKMENTLRSLAKRYKSVKYSLGLVLIFLMLGINAFSEEIFTTEQVSTSKKNIKAS VENLKDKVEQVKEKNEKDIKNLRLELIQLMEQGNQVVKSPWASWQFGVNYYYDNWGGTYR GRGDKTEKYPYEGILERDTNEMYRYIATNSAMYSNLSKSTNVRSASTNRRKGLSNYGIAS NISQKEPIVSLELNAGISPRVINKKSPNTAPTPPDILLPTFEPKFVNPPVIPNAPHSPTL TLPSFNVNVGSSNNGSKKVILGYGDNSLIQEVAITGGDFKIKRNPIRAAYNGTQGTVPSG TPPMWHYSFVNYSGKSPFASVQGGHQFPSTITNAMGGVSWKPLGYSWSGLNEPGTDAENT ATPYQKNLAFLAVNKGAMLNNANFLYTNPIMTSQLLKEFVHQDVHGGIATATIRNGIVDS VGSSSSLVTAYDDMVAYQTTAIQQSLVPYSGITEQHTFLNGGKVIIEGANTSLGNTYSHM EGSKVRQSTINTGDIIFQPYSDGTNIFEKNTAVFVISNDTHETTSSGAIVPQIQDSINYN AGKIKMYTTNAVGLVTDPEQKRPVSLINRDTFELYGENSAAIYLKNAVALDIQTTNDKNF TLDTKDNSNSTLSTNSSYKPIKISGDKSIGLYSVASGSTVEGNFALDIGAIGVGNEKFTT TTISNLTDGVNLTNHSINTSGSTDDIEGSYGILSKSPTDLTSHQIRIFDKAVGNVGVYPA DNVELKLGGGSIELNGGSTAKDNIGIYIDGQGAVTSTGSIKLDGGVGNLGIYAKGGNLPI GVSNNVTVREIKGSGTKNTVFIYGSDGAKIKLSDGTLPNGTALSKGATYGLNISNATVDT DASTSNKKDTGAVFATGVGTEISINKTATLSTTASPNINITGTKLTDPGTEKYVGFGLMA KDGATIRAKNNYIKITNGSTAVASVGSGSKVEMNGGKIDYTGKGYALYATPGNTIDISDV KLTLDGNAVAYERDLATSFPIITDTNTSIHIKSKDVVVLNVKHASPLNVSSLVSTVVNSW AGLSTIPTYDAGAINYKMTTIDGLSAYNIDQDINKKNVVLGTADNNSKLFVRNMSIQRAK MNLASGKNVTAYLDASDLTNLETSTVIGLDMNSSANAAGRNETQINLASGSSVNADRIDA GSGAIGLFINYGETNIASGAKVNVEKSGLNDANAKAVGVYAVNGSTVNNEGEINVGGEGS IGVLGISYRKDSNGVLKRNEFGAKPNAGDVGVVNKGKIELDGKKAVGIYIENNDSNTSAP HTIEATNDTNGTINMSGQEAIGMAAKLGNLVNKGTINITADKGTGMFVETDGVRPATMTN DSTGTISIGDSTSESVLRTGMFTKN Prediction of potential genes in microbial genomes Time: Thu May 19 23:58:02 2011 Seq name: gi|224461242|gb|ACDC01000160.1| Fusobacterium sp. 2_1_31 cont1.160, whole genome shotgun sequence Length of sequence - 3270 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1867 2958 ## FN1554 hypothetical protein + Term 1870 - 1932 8.3 + Prom 1915 - 1974 9.1 2 2 Tu 1 . + CDS 2001 - 3242 1582 ## COG0786 Na+/glutamate symporter Predicted protein(s) >gi|224461242|gb|ACDC01000160.1| GENE 1 2 - 1867 2958 621 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 621 959 1582 1582 689 63.0 0 ATNNAGHTINLSGDGSMGMYLDNGAIGVNNGTITTVGNPKEAVGVVVRNGAEFTNNGIVN ISSKDGYAKYVKGGIIKNYGSFVVSNGATEDYAVGNKPTGKELIVNGVKVVDINAPLGAT TATITINGIPHTPVITNISGTRSMLTSNVGMYIDTLRGTNPISGSLGILGDSADLIVGSE AAQTTTSRGIQVPNRIIAPYNVVMATNPSIRYWNIYSGSLTWIATATLNKTTGLIENMYL AKIPYTAFAGNEATPVAVTDTYNFLDGLEQRYGVEELGTRENRVFQKLNSIGKNEEALFY QATDEMMGHQYANVQQRIVATGDILNKEFDYLRSEWQTVSKDSNKIKTFGVRGEYNTDTA GVINYTNNAYGVAYVHEDETVKLGDTLGWYAGIVHNTFKFKDIGNSKEEQLQGKVGLFKS VPFDENNSLNWTISGDVFAGYNKMHRKFLVVDEVFNAKGRYHTYGLGLKNEISKEFRLSE SFTLKPYVALGLEYGRVSKIREKSGEIKLDVKSNDYFSVRPEIGAELGFKHYFDRKTVKV GVTVAYENELGRVANGKNQARVSGTEADYFNIRGEKDDRTGNVKTDLNIGVDNQRVGVTA NVGYDTKGNNVRAGVGLRVIF >gi|224461242|gb|ACDC01000160.1| GENE 2 2001 - 3242 1582 413 aa, chain + ## HITS:1 COG:FN1801 KEGG:ns NR:ns ## COG: FN1801 COG0786 # Protein_GI_number: 19705106 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 1 412 1 412 413 675 90.0 0 MNFETIEGILNINLNSTMTLALAAVLLIMGYSINKRLSILNKYCIPAPVVGGFIFMFLTW IGHVTGTFKFNFENIFQSTFMLAFFTTVGLGASFALLKKGGKLLIIYWLVCGIISIFQNI IGITITKITGLEAPYALLSSAISMIGGHGAALAYGGTFAKMGYENAPLVGAAAATFGLIT AVLIGGPLGRRLIEKNNLRPDNSENFDQSIMEINKNEGEKLSDLDVIKNVVVILVCMAIG SYISTLIGKLINMDFPSYVGAMFMAVIVRNINEKTHTYNFNFSLVDGIGNVMLNLYLALA LMTLKLWELSGLIGGVLLVVACQVIFMILIAYFVVFRVLGANYDAAVMCSGLCGHGLGAT PSAIVNMTAINEKYGMSRKAMMIVPIVGAFLVDIIYQPATVWFIKTFVQNYVG Prediction of potential genes in microbial genomes Time: Thu May 19 23:58:17 2011 Seq name: gi|224461241|gb|ACDC01000161.1| Fusobacterium sp. 
2_1_31 cont1.161, whole genome shotgun sequence Length of sequence - 28048 bp Number of predicted genes - 25, with homology - 22 Number of transcription units - 13, operones - 9 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 36 - 95 13.8 1 1 Op 1 . + CDS 126 - 7352 10910 ## FN1554 hypothetical protein + Term 7379 - 7415 1.7 + Prom 7489 - 7548 12.8 2 1 Op 2 . + CDS 7626 - 15032 11793 ## FN1554 hypothetical protein + Term 15045 - 15085 6.9 + Prom 15074 - 15133 9.6 3 2 Op 1 1/0.000 + CDS 15162 - 15968 1128 ## COG2849 Uncharacterized protein conserved in bacteria 4 2 Op 2 1/0.000 + CDS 15985 - 16716 867 ## COG2849 Uncharacterized protein conserved in bacteria 5 2 Op 3 1/0.000 + CDS 16737 - 17243 837 ## COG2849 Uncharacterized protein conserved in bacteria + Term 17244 - 17271 0.1 + Prom 17256 - 17315 10.8 6 3 Op 1 1/0.000 + CDS 17431 - 18168 828 ## COG2849 Uncharacterized protein conserved in bacteria 7 3 Op 2 1/0.000 + CDS 18189 - 19202 1332 ## COG2849 Uncharacterized protein conserved in bacteria 8 3 Op 3 . + CDS 19228 - 19734 731 ## COG2849 Uncharacterized protein conserved in bacteria + Term 19747 - 19786 6.1 + Prom 19767 - 19826 10.0 9 4 Tu 1 . + CDS 19939 - 20415 737 ## FN2115 hypothetical protein + Term 20424 - 20457 -0.5 + Prom 20423 - 20482 4.6 10 5 Op 1 . + CDS 20526 - 20642 58 ## 11 5 Op 2 . + CDS 20642 - 21124 631 ## FN2115 hypothetical protein + Prom 21132 - 21191 5.9 12 6 Tu 1 . + CDS 21359 - 21841 559 ## FN2115 hypothetical protein + Term 21864 - 21900 5.9 + Prom 21849 - 21908 3.6 13 7 Op 1 . + CDS 21929 - 22111 206 ## gi|296328335|ref|ZP_06870862.1| conserved hypothetical protein 14 7 Op 2 . + CDS 22111 - 22602 730 ## FN2115 hypothetical protein + Term 22625 - 22661 2.6 + Prom 22608 - 22667 6.8 15 8 Tu 1 . + CDS 22832 - 23311 698 ## FN2115 hypothetical protein + Term 23413 - 23450 4.0 + Prom 23379 - 23438 8.6 16 9 Op 1 . 
+ CDS 23577 - 24074 493 ## FN2115 hypothetical protein + Term 24097 - 24133 4.1 + Prom 24080 - 24139 4.0 17 9 Op 2 . + CDS 24162 - 24344 173 ## gi|296328335|ref|ZP_06870862.1| conserved hypothetical protein 18 9 Op 3 . + CDS 24341 - 24826 647 ## FN2115 hypothetical protein + Term 24845 - 24880 5.3 + Prom 24835 - 24894 4.4 19 10 Op 1 . + CDS 24938 - 25054 158 ## 20 10 Op 2 . + CDS 25054 - 25539 569 ## FN2115 hypothetical protein + Prom 25553 - 25612 5.6 21 11 Tu 1 . + CDS 25808 - 26296 548 ## FN2115 hypothetical protein + Term 26319 - 26355 6.8 + Prom 26301 - 26360 6.0 22 12 Op 1 . + CDS 26407 - 26499 84 ## 23 12 Op 2 . + CDS 26499 - 26981 681 ## FN2115 hypothetical protein + Term 27000 - 27036 4.1 + Prom 27103 - 27162 7.2 24 13 Op 1 . + CDS 27186 - 27416 394 ## gi|237738922|ref|ZP_04569403.1| predicted protein 25 13 Op 2 . + CDS 27416 - 27898 536 ## FN2115 hypothetical protein + Term 27968 - 28011 1.0 Predicted protein(s) >gi|224461241|gb|ACDC01000161.1| GENE 1 126 - 7352 10910 2408 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 803 2408 1 1582 1582 1279 51.0 0 MENNSLHNMEKNLRSIAKRYENVKYSVGLAVLFLMKGTSAFSDENKIQEVEKQKEVITND QIAKSTVKEVKKQKPKASWATMQFGANDLYSNFFVTPKSEVEKTSIVKSEKTVLVASADN STSLPMLAKLSSDIETTNTPTMEEIKTSKENLRDSVGNLKEKIDVARRENNKEINGLRLE LIQLMEQGNQVVKSPWSSWQFGANYFYENWGGSYKGRGDKSEKYSFEGIYTRNSNLFTRN ISPNSELYKDYVKTIKDDATNSALSSTLNARGRSTRYGLASNSGIQEPVVTIEINAAIKP KSIEKTPITLNFIAPSAPIVPTPSIAQVVPPSLTLPEPKAPSKEISIVKPNANPFTGFFF NSNRSSIGVGSTNMILYSGVNPDDIKAGKEGEQVRPALKTGALNTTLGDITDINERPTNI LYRMAPNLNNLTFHIRGYFGDGSDGYVDAGSGATGGSDTVGGPTLGTIGVHTLLNGTVSN VTANLYGRAGFLTSETWRHGKVTMHNTNVNVYGKDNAVYYIMPAAFKTISKYTDSNYHLG AIQGETNVKMFGTGNTMYLSSGISAARLIKNTGKIELEGASNIVYSSFSYAPTWEVGVYG GKAGKMNSLIQFNQNVELYGDENVGLFFGSKIGGSPKSWETADRDAESNAGYLRKASYIG IYQGEIDIKARVGGQLAINPSATTQTASGQLVEDLTNPANPKYKGYTDKTVDGGVGLYVT SGQRKGIDVLKDMGVPVSVTPTLDDLKLDPIHNLEVGKMDISFGKYSKNGFMMIAKDGSV 
IDVGKATHQYYVTNLSTSITDGVNGATTTEADASTGTTIAYAEGTWDQSKHQLGSKQTDL NQNNTDAAAVNAGAARKALTDTTASTAAKLQGLGSEINVYPNVVLASKEGIAYMGDNQGI VNAKGTTEAVNYGAIIAYAKNKGQVNVDGTVTAIDKNTTLEDNKYKNIAAFAESGGQVDI NRKVTINGIGAFAKGTDSKAQLLSGTDEINAGVVGGMVATEGGYARLNGGTIRITKDNSR LFYADATGKIDFTNTTTIEMSKGIILPQEENNTAFFSSKATTEAGAVPTKYNGMRNVTIK LLSDDVVLKTVNNHPLETWTGSTNFESGIQGVMKYAALNKNGHTYKVYYTNGQFKIATNV DLDDANDVFNSIIMANEEVTIDNGVSITSNVGKGLVQGSLANTVDNSKTAYINNGTVNIS GANSSSIALRVNHGTIENNNLVKMTDGIGLYGSSGSKIENKSNGTIQITSASAYGVGMAG FLSGTTAQDYGTDKLISNLIATSGGNLASTVKTIDITNEGNISVTGKAIGIYADNTSTIA GFDNHVTKENAVVNNKASLNFGDESIGILTKKATVNLTGTGTDDISVGKDGIGVYAKDSD VNLLTDYGFQIKDRGVGIYAENSETSTGTMNVRYTGATTEAGTGAYFKGTGSNSLTNKLD INVDNVSNATKGMIGIYAKNTNFTNEGDIQVINTDTLGFGIISSGADVTNKGVITLEDSS DPTKPNIGMYTVDSNPLKNMGKITVGKNGIGIYGKNFSNGDSATLPNSTIEVGENGIGVY TADGNGHLKSGSIKTGKDGVGVYIAGNGGTITANNTFNMTLGDGSSGNNKGSFGFVNVGT NNVIHSDISNVTLQNNSVYIYSKDTSGTLANPQITNNTNITATGNNNYGLYSAGYVVNNG NMNLAAGTGNIGVYSIKAGTIENRNGVITVGGSVPGEDEYGIGMAAGYTWTKKDLLKPVS QRPEQTTGNIVNRGTINVNGQYSMGMYGSGNGTTVNNYGTINLNANNTTGVYLTDKAVGT NYGTITNTPGVKNVTAIVVKNGAKFVNDTAGVIRLNATNAVGILATKDEGKPLGTYIINY GTFEILGSGSEPEKILSGPHDLNKSVGKGKDKISIDVPAGATTGTIKVAGEVKIPEVVET KGVELEETQVSTIGMYINTSGTKFTKPITGLSALTQIKKADLIIGAEATQSTTSKYIQVG NSILKPYNDTILNNPQIEKWGIYSGSLTWMANIAQNQSTGTIENAYLVKIPYTNWAGNEA SPVEVTDTYNFLDGLEQRYGVEEIGTRENKVFQKLNSIGNNEEILFHQATDEMMGHQYAN VQQRVVATGDILDKEFNYLRNEWTNPSKDSNKIKTFGTNGEYKTSTAGVIDYTNNAYGVA YVHEDETVRLGESTGWYAGIVHNTFKFKDIGNSKEEMLQGKLGLFKSVPFDENNSLNWTI SGDIFVGYNKMHRKFLVVDEVFNAKGRYHTYGVGIKNEISKEFRLSESFTFKPYAALGLE YGRVSKIREKSGEIKLDVKSNDYFSVRPEIGAELGFKHYFDRKTVRVGVTVAYENELGRV ANGKNKAKVAGTEADYFNIRGEKDDRTGNVKTDLNIGVDNQRVGVTANVGYDTKGNNVRA GVGLRVIF >gi|224461241|gb|ACDC01000161.1| GENE 2 7626 - 15032 11793 2468 aa, chain + ## HITS:1 COG:no KEGG:FN1554 NR:ns ## KEGG: FN1554 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 862 2468 1 1582 1582 1219 49.0 0 MNSNNLDTMEKNLRSIAKRYENVKYSIGLAVLFLMKGTSVFSDDNHVQEIEKQKDIFTDI 
KKEKSEIKEKKSVKQANQKIKASWVNMQFGANDMYSNYFSIPKAKVDKASIVKSEKTILL ASTDNTSTLPMFAKLLTDIEETTETRTQVPTTAEINASKDNLRNSVGNLQNKINSARQEN NKEIEGLKLELTQLMEQGDQVVKSPWSSWQFGANYMYNEWGGAYKGRGDKAEKYAFEGIF TRSLNSFERVVSPLSEKYDQLEFSTNKYSALTSSRRGLASGYGLTSVERKQEPLVSIEIN AAVKPKTIKKTPLALKPVITAPNVPEPPTIVPIPTINLELPEPNTPSKVVVIAKPNAEPF TGFYFDGTWNHRELRDNISIYSGIDPASLIGNIDNRNPTPAAMTGSYNGRQLEGTRIINE NNRYTNAYYINSQTNATKIENNTFYLRGHYPTDSYNDSNTRAHLGISNNKKIVYNDGHGN GIPDEGVVGVHALGDLNIKNLEFNLYGRAGAVTNETWRHGILDFDNITVNMYNSDNMGFY NMPVARYTYKYGKNVGGIGREWRVLAGGFSGKANVNMYGRNNSVYLTTGLSYMKHWQNEG LIQSDGASNIVYSSFSYAPTLSKLVNPAGAGYLHNTNMIKLSNIKLYGDENIGMYFGSRI KGDVAKVHMEAPNEIESLYGYNNKAAHIGIYQGEIDFSAKIGEKLTIDNQNQQTAEGNLN NAGYTNETVDGAVGIFSESGQRVGIVARGDIMEGPTPTAAEIQAHRTDPSWDRWFQHKWN STTQQIEIDKTGYGAGFYYAASNDFSKDPIHNLEVAKLDIRFGKYSKNGIMVLAKQGTVI DVGKNTSNYHITGVSSDITDGINGANTLEADASTGTIVAYAEGTWDQLKHRYGSEDARIA RNDADAVAINNGAARKALTDTKATTAAKLQGLGSEININPNVVLASKEGIAYMGDNQGIV NAMGTTEAVNYGSIIAYGKNKGIVTVNGTVKAEDKNTVSEANKFKNIGAFAEAGGKAELK GAVTINGIGAFAKGAGSEAILSSTNNDIVINAGTVGGIVATDNGYAKLNGGTINVTKDNS RLFYADATGKIDFTRTTNINVSKGIVLPHEESNPAFYNSKVSTAAGVTPTKYNGMENVTI NLLSDDVVLRTVDNHAPETWTGGANFETNVKNIMKYSALNKNGHTYKAYYTNGEFKIAVN INRDDATDVFNGIVMGNEKVTIDNGISITSNAGKGLAQAALKNTVDNSKTAYINNGTVNI TGANSGSIALKVDHGTIENNGLVSMTDGIGLYGSSGSKISNNANGKISISSPSQNGIGIA GFLTGTSAQEYGTDKLISNLIATGGGNLPSTVKTIDITNNGEIEISGKAVGIYADNTTSK IAGSKIAGFDNHVTKENAVVNNNATLSFGDDSVGIYAKKAIVNLSGTGKDDISVGTKGIG VYTEDSSVNLLTDYGFQIKDKGVGLYAKNTDTSTGTMNVRYTGAVNKVGTGAYFEVTGSP ITNKLNINVDNVSNAQTGMIGIYAAGGTFTNEGNVKVTNTNTLGFGIISSGANVTNKGDI TLEDTLNQDKANIGMYTAGSDSLKNIGKITVGKNGIGIYGKNFSNGDSATLPNSTIEVGE NGIGVYTEAGTGENIKLESGSIKAGKDGVGVYAVGNGGTITANNTFNMTLGDGSSDADKG AFGFVNVGSNNKIYSDISNVTLQNNSIYIYSKDTSGTSVNPQIINNTNITATGKNNYGIY SAGYVVNNGNMNLSAGTGNVGVYSVNGGTIENRSGAITVGGSVPVNDEYGIGMAAGYTWT KKDLLKPISQRPVETTGNIINRGTINVNGQYSLGMYASGNGSTAKNYGTISLNANNTTGM YLTDKAVGHNYGTITNAAGVKDVTGVVVKNGAKFINEATGVVSLNATNALGILRTKDEGE TLGIIENYGTFNITGDGSEVEKVSESKDLNKSLGKGKDKISIDVPAGATTGTIKLNDIIQ 
SPEIVETKKLELEETQVSTIGMYINTSGVKFTKPITGLSELSQLRKADLIIGAEAAQSTT AKYIQVGNTILKPYNDTILNNPQINKWTIYSGSLTWMANIGQNQVNGTIENAYLAKIPYP VFAKDKNTYNFTDGLEQRYGKEGIGSRENTLFQKLNSIGNNEEVLLFQAFDEMMGHQYAN TQQRIQSTGSILDKEFNYLRNEWSNPSKDANKIKTFGAKGEYKTNTAGVIDYTNNAYGVA YVHEDETVRLGESTGWYAGIVHNTFRFKDIGNSKEEQLQGKVGIFKSVPFDHNNSLNWTV SGEIFAGHNKMHRKFLVVDEVFNAKGRYHTYGAAVKNEISKEFRLSEDFSLRPYASLKLE YGRVSKIREKSGEMRLDIKANDYFSVKPEIGTELAYRHYFGANTVKATVGVAYENELGRV ANGKNKAKVAGTDADYFNIRGEKDDRTGNVKTDLNLGWDNQRVGVTANVGYDTKGHNVRA GVGLRVIF >gi|224461241|gb|ACDC01000161.1| GENE 3 15162 - 15968 1128 268 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 267 1 244 245 221 50.0 9e-58 MKKILLGIFLLVSSLAFSAERILSFEETFLDEKTGKVHAKGEQTPYTGVIKNFKIPGEDG VFEGKISFKDGVIDGLVELYYSNGKLAEMATFKNGEKNGIQKKYYENGQIKMEVLHKNGK KDGIAKQYSNKGILIGEYPFKNDMVDGLVKQYNEVTGKLEIESTYKNGKSEGLLKAYYPS GKLKSEENYKNGLREGLRKDYYENGVLENERFYKNDKLEGISKIYYPSGKLQVEVNFKDN EADGIFREYDETGKIINQETYKNGQLID >gi|224461241|gb|ACDC01000161.1| GENE 4 15985 - 16716 867 243 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 243 1 245 245 298 65.0 7e-81 MKKLLLTIFLLVSSLAFSAERLVKIENTYMGNKGIVFVAGEETPFTGIVENYKTSEGDKT LSGKVPFKDGLMEGTSKLFYSNGKIASIATFKKGKIEGIQKDYYESGIRKREISYKNGLV DGITKMYYLNGNIQSEISYKKGVPDGISKTYHKNGKVNVEATYKNGVQVGIQKDYYQNGK LKIELPLDKNGLVNGIVKIYYPSGKIMSEESYKDDKLEGTVKKYDESGKITSEEFFKNGN RIK >gi|224461241|gb|ACDC01000161.1| GENE 5 16737 - 17243 837 168 aa, chain + ## HITS:1 COG:FN2116 KEGG:ns NR:ns ## COG: FN2116 COG2849 # Protein_GI_number: 19705406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 168 1 168 168 210 66.0 9e-55 
MKKLLLGAFLLVSALSFSAGRKVPAGKIVMDQNTGIAYVQGEQTPFTGTVEVKFDNGKVQ GLLEVKNGLLDGTGVTYYPSGKVKSKENYKNGYEEGINTIYYENGNIEYEKYVSNNGRLV YEKHFYQNGQIDFEATYKDGQLDGIVKKYGPNGQVAQQGIFKDGVQVQ >gi|224461241|gb|ACDC01000161.1| GENE 6 17431 - 18168 828 245 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 245 1 245 245 379 85.0 1e-105 MKKILLGVFLLLSVLSFSAERVVKLENAYVDDKGIVYVIGEKAPFTGIVENYKVPPILEG DSVLEGKIPFKNGVMEGYSKLYYPSGKLASVATFKNGKVEGIQKDYYENGKIKREISHKN GLVDGVSKLYYPNGKVQNEITHKKGIPDGVSKTYYENGKLLAEVTYKNGIEVGIQKDYYE NGKLKLELPYKNGVVDGLAKVYYPTGKLMLEENYKNDQLDGIVKRYDENGKIISEEFYKN GNKIK >gi|224461241|gb|ACDC01000161.1| GENE 7 18189 - 19202 1332 337 aa, chain + ## HITS:1 COG:FN2111 KEGG:ns NR:ns ## COG: FN2111 COG2849 # Protein_GI_number: 19705401 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 212 1 215 219 243 70.0 3e-64 MKKLLLGAFLLVSVLSFSAGRKVPIEKMMVDDTTGIAYVQGEKTPFTGTVEVKYDDGEVL ALMEVKNGLMEGTYKLLYPSGKTAIIATYKNGKTDGIQKEYYENGQIKMEVLHKNGKAEG YLRTYYLNGKLEGEEKYQNGLREGLTKSYYEDGSLEGERFYKNNNLEGINKIYHPNGKLA KIAVFKNGELDGTVKTYYPNGKLEGIGTFKDGKIDGIQKEYYENGQIKMESLAKNDKKNG IGRFYSITGVLIAEIPFKDDEVDGTIKNYNEVTGKLETEGEFKNGKVEGTMKEYYPNGKI QMEANFKDDKLEGIVKRYDENGKLIEQEIYKNGNRIK >gi|224461241|gb|ACDC01000161.1| GENE 8 19228 - 19734 731 168 aa, chain + ## HITS:1 COG:FN2116 KEGG:ns NR:ns ## COG: FN2116 COG2849 # Protein_GI_number: 19705406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 168 1 168 168 210 65.0 9e-55 MKKLLLGAFLLVSVLSFSAERKVQVEQVFKNANTGIVYVQGEETPYTGLIEVKFPNGKTQ ALTSYRNGVVHGKGITYHPNGKVWSKENYKNGVEDGVNIIYYENGNIEYEKNVSNNGRTV YEKHYFSSGKLDFEATYQDGKLNGVVKKYGENGQVVQQGTFKDGVQVQ >gi|224461241|gb|ACDC01000161.1| GENE 9 19939 - 20415 737 158 aa, chain + ## HITS:1 COG:no 
KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 158 1 151 151 88 39.0 7e-17 MIKRMIILIIMGLTLSSCDFIHYGKIAIQDNVRRIEMEREKKSVMKKDGPAAIDVDEYKE GVEEAIKDILKRPINKKVEFKGTTLIIPEGTRINSKHGNLVDEKTGYGVFISFSINPHCI SKKVNNREYGFFFDKHDVNINKIAKEIMRVNGFEDVCK >gi|224461241|gb|ACDC01000161.1| GENE 10 20526 - 20642 58 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPKNKEINISPQSYIRKEPLKIVDINEYLRNLRKGVDK >gi|224461241|gb|ACDC01000161.1| GENE 11 20642 - 21124 631 160 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 160 1 151 151 120 46.0 2e-26 MIKRMIILIIMGLTLSSCQLFTEAIKDNMYRVERARERKELSKKDGPSAIVVDEYKEDVE RVIQDIMKRPINKKVEFGETTLLIPENTRINSKHGNIVDEKTGYGIAVIFYIEDYCTEVF YRKKIEENKYIMLFYNHRDKSLDTVAQKIIKANGFTKTCK >gi|224461241|gb|ACDC01000161.1| GENE 12 21359 - 21841 559 160 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 160 1 151 151 125 51.0 4e-28 MIKRMIILIVMGLTLSSCQLFTEAIKDNILRYEIEQETKERHKKNGGGAISVDKYKEGVE ATIKDILKRPINKRIQFEEAVLLIPENTELNKKVGNIVDMKTGYGIPIYIINDGEHCSQL AFTKRVNGKYYKISYLENNIEISKIAQKIIKENGFRKVCK >gi|224461241|gb|ACDC01000161.1| GENE 13 21929 - 22111 206 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|296328335|ref|ZP_06870862.1| ## NR: gi|296328335|ref|ZP_06870862.1| conserved hypothetical protein [Fusobacterium nucleatum subsp. 
nucleatum ATCC 23726] # 1 60 273 332 332 87 78.0 2e-16 MEDYRKEELKKEYGYRNYQDKGLLKNKEKNNSPQPYTRKEAQPTLGIEEYLKGLRKGVGL >gi|224461241|gb|ACDC01000161.1| GENE 14 22111 - 22602 730 163 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 163 1 151 151 108 43.0 7e-23 MLKKGIVLIMLGLIFSSCDLIYYGKIAVYENKYRSELERSAREGMKKDGPGAINNEKYTE GVKEAIQDIKKRPVNKRVEFGGTTLLIPENTRLNPKHGNIVDEKTGYGIAILFEIEDYCT DVFYRKKISSDKYILLYYNNEDKDLNVIGQKIIKANGFTKTCK >gi|224461241|gb|ACDC01000161.1| GENE 15 22832 - 23311 698 159 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 157 1 148 151 86 38.0 4e-16 MMLKKGIMLIMLGLMFSSCDFIHYGKIAVYENTNRIERERETKEARKKDGPFAVVVDEYK EGVKEVIEDILKRPINKKVQFEGITLIIPEGTRINSKLGNIVDEKTGYGITIMFSLKKRY YITKEVNNKKYGFFYDEYDVNISKIAQRIMKINDFKEPK >gi|224461241|gb|ACDC01000161.1| GENE 16 23577 - 24074 493 165 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 165 1 151 151 104 41.0 1e-21 MMLKRVIMVIMLGLIFSSCDFIHYGKIAIQDNVRRIEMERERKEARKKDAYAAAGNPEYE TGVELAIQDIMKRPVNKRVEFGGTTLLIPENTRLNPKHGNIVDEKTGYGIAILFKMDNGC SPEVFYTKKVKSNLYIYLYYNNEDKDLNVIGQKIIKANSLTNTCK >gi|224461241|gb|ACDC01000161.1| GENE 17 24162 - 24344 173 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|296328335|ref|ZP_06870862.1| ## NR: gi|296328335|ref|ZP_06870862.1| conserved hypothetical protein [Fusobacterium nucleatum subsp. 
nucleatum ATCC 23726] # 1 60 273 332 332 88 76.0 1e-16 MEDYRKEEFRKEYGYGNYQDKGLPKNKEMNLSPQPYKRKKTPTTVEINEYLEKLRKGVGL >gi|224461241|gb|ACDC01000161.1| GENE 18 24341 - 24826 647 161 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 161 1 151 151 97 40.0 1e-19 MMMLKRVIMVIMLGLIFSSCDFIHYGKIAIQDNIRRIEMERERKELRKKDAPGAIMTDEY KEGVEIATQDIMKRPVNKRVEFEGATFIIPENTRLNPKHGNIVDEKTGYGIAITFTLSPH CMSKKVNDKEYSLFYNSKHNANVAEIAKEIIKVNGFKDTCK >gi|224461241|gb|ACDC01000161.1| GENE 19 24938 - 25054 158 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPKNKEININPQPYTRKETQPTLGIEEYLKGLRKGVDK >gi|224461241|gb|ACDC01000161.1| GENE 20 25054 - 25539 569 161 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 161 1 151 151 236 83.0 2e-61 MIKRMIILIVMGLTLSSCQLFTEAIKDNINRYETEEILKEYKKKDGSMAYQGDKEIFNVV KEVLKRPLNKEIQFDGIKIMIPENTRINSKTGTIVDMKTGYGLPISIYIKDNCDPLSDDK VTSKKRIKGGYYYINYLSQNKDTKALFEKISKVNGFTKGCK >gi|224461241|gb|ACDC01000161.1| GENE 21 25808 - 26296 548 162 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 162 1 151 151 180 62.0 1e-44 MIKRMIILIIMGLTLSSCQEIAIIKYSIDDAKQQAEIRKITSPYYEKEGGWAYQANKEVL DIVKEVLKRPINKEIQFDGIKIMLPENTRLNLKTEAIVDIKTGYGLPVCIYIRDYCDTYS DARSKKRLAGGYYYICYFSENKDTKALFEKISKINGFTNGCK >gi|224461241|gb|ACDC01000161.1| GENE 22 26407 - 26499 84 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSQSYVKKKNSTIVDINEYLGNLRKGVDR >gi|224461241|gb|ACDC01000161.1| GENE 23 26499 - 26981 681 160 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 160 1 151 151 113 45.0 2e-24 MIKRMIILIIMGLTLSSCQLFTEAIKDNMYRVERARERKELSKKDGPGAIVVDKYKEDVE 
RVIQDIKKRPINKKVEFGGTTLLIPENTRLNPKHGNIVDEKTGYGIAITFEITERCSSVY YRKKIKEGLYCKIYYNGINSELNIISKKIIETNGFTNTCK >gi|224461241|gb|ACDC01000161.1| GENE 24 27186 - 27416 394 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738922|ref|ZP_04569403.1| ## NR: gi|237738922|ref|ZP_04569403.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 76 1 76 76 127 100.0 2e-28 MYGPPILNDSPFLLDEVNEYNKGEFEKKYGYGNYQDKGLPKTKEINMSPQSYVKKKNSTI VDINEYLGNLRKGVDK >gi|224461241|gb|ACDC01000161.1| GENE 25 27416 - 27898 536 160 aa, chain + ## HITS:1 COG:no KEGG:FN2115 NR:ns ## KEGG: FN2115 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 160 1 151 151 131 51.0 8e-30 MIKKMIILIIMGLTLSSCQLFTEAIKDNIHRYEMEQETKERHKKNGGGAISVDKYKEGVE ATIKDILKRPINKRIQFEEAVLLTPENTELNKKVGNIVDMKTGYGIPIYIINDGEHCSQL AFTKRVNGKYYKISYLENNIEISKIAQKIIKENGFTKGCK Prediction of potential genes in microbial genomes Time: Fri May 20 00:00:04 2011 Seq name: gi|224461240|gb|ACDC01000162.1| Fusobacterium sp. 2_1_31 cont1.162, whole genome shotgun sequence Length of sequence - 8690 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 1 - 988 1556 ## COG1087 UDP-glucose 4-epimerase 2 1 Op 2 4/0.000 - CDS 988 - 2520 1952 ## COG4468 Galactose-1-phosphate uridyltransferase - Prom 2540 - 2599 3.4 3 1 Op 3 . - CDS 2674 - 3840 1737 ## COG0153 Galactokinase - Prom 3869 - 3928 14.9 4 2 Op 1 26/0.000 - CDS 4040 - 4924 1260 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 5 2 Op 2 . - CDS 4942 - 5367 559 ## COG1585 Membrane protein implicated in regulation of membrane protease activity - Prom 5411 - 5470 9.5 6 3 Tu 1 . - CDS 5476 - 5721 419 ## FN1796 hypothetical protein - Prom 5883 - 5942 12.7 + Prom 5783 - 5842 10.0 7 4 Op 1 25/0.000 + CDS 5942 - 6205 354 ## COG1925 Phosphotransferase system, HPr-related proteins + Term 6216 - 6252 4.2 8 4 Op 2 . 
+ CDS 6269 - 7996 2579 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Prom 8061 - 8120 9.8 9 5 Tu 1 . + CDS 8231 - 8629 824 ## FN1792 hypothetical protein + Term 8653 - 8682 1.4 Predicted protein(s) >gi|224461240|gb|ACDC01000162.1| GENE 1 1 - 988 1556 329 aa, chain - ## HITS:1 COG:FN2109 KEGG:ns NR:ns ## COG: FN2109 COG1087 # Protein_GI_number: 19705399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Fusobacterium nucleatum # 1 329 1 329 329 670 96.0 0 MSILVCGGAGYIGSHVVKYLLEKNEDVVVVDSLITGHVDAVDEKAHLELGDLKDEEFLNR VFEKYQIDGVIDFAAFSLVGESVSEPLKYFENNFYGTLCLLKVMKAHNVDKIVFSSTAAT YGEAENMPILETDRTEPTNPYGESKLAVEKMFKWCANAYGLKYTALRYFNVAGAYPSGEI GEAHTCETHLIPLILQVALGQREKISIYGDDYPTPDGTCIRDYIHVMDLADAHYLALNRL RNGGDSQVFNLGNGEGFSVKEVIEVTRKVTGHPIPAEVSPRRAGDPARLIASSQKALDTL KWVPKYDKLEQIIETAWNWHKNHPNGYED >gi|224461240|gb|ACDC01000162.1| GENE 2 988 - 2520 1952 510 aa, chain - ## HITS:1 COG:FN2108 KEGG:ns NR:ns ## COG: FN2108 COG4468 # Protein_GI_number: 19705398 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Fusobacterium nucleatum # 1 510 1 509 509 935 90.0 0 MEIYSLINRLIKYSLKNSLITEDDVMFVRNELMALLQLKDWEDINEDNYQVPEYPQEILD KICDYAIEQKIIEDGTTDRDIFDTEVMGKFTPFPREVINTFKNLSDENIKSATDYFYNFS KKTNYIRTERIEKNLYWKSPTEYEDLEITINLSKPEKDPKEIERQKNMPQVNYPKCLLCY ENVGFAGTLTHPARQNHRVIPLTLENERWYFQYSPYVYYNEHAIIFCSEHREMKINRDTF SRTLDFVNQFPHYFIGSNADLPIVGGSILSHDHYQGGNHEFPMAKSEIEKEVSFDAYPNI KAGIVKWPMTVLRLKSLDRNELIELSDKILKAWREYSDEEVGVFAYTNSTPHNTITPIAR RRGEYFEIDLVLRNNRTDEANPLGIFHPHSEHHNIKKENIGLIEVMGLAVLPGRLKMEMR KIAEFLKYEDFEKKISEDKDCEKHLSWLKAFLNKYPNIKDLSVDEILENILNVEIGLTFS RVLEDAGVFKRDEKGKNAFLKFINHIGGRF >gi|224461240|gb|ACDC01000162.1| GENE 3 2674 - 3840 1737 388 aa, chain - ## HITS:1 COG:FN2107 KEGG:ns NR:ns ## COG: FN2107 COG0153 # Protein_GI_number: 19705397 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: 
Fusobacterium nucleatum # 1 388 1 388 389 687 89.0 0 MLEDLIKEFKEIFKYDGNVETFFSPGRVNLIGEHTDYNGGFVFPCALDFGTYAVVKKRED KTFRMYSKNFKNLGTIEFNLDNLVYNKKDNWANYPKGVVKTFLDRAYKIDSGFDVLFYGN IPNGAGLSSSASIEVLTAVILKDLFKLDVDMVEMVKMCQVAENKFIGVNSGIMDQFAVGM GKKDHAILLDCNTLKFEYVPVKLKNMSIVIANTNKKRGLADSKYNERRSSCEEAVKVLND NGINIKYLGELTVAEFDKVKHFITDEEQLKRATHAVSENERAKVAVEFLKKDDIAEFGRL MNQSHISLRDDYEVTGVELDSLVEAAWEEEGTVGSRMTGAGFGGCTVSIVENDHVENFIK NVGKKYKEKTGLRATFYIANIGDGAGKI >gi|224461240|gb|ACDC01000162.1| GENE 4 4040 - 4924 1260 294 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 1 293 1 293 294 474 92.0 1e-134 MFYIPFFVLLLILFAVIALKAIKIVPESQVYIIEKLGKYNQSLSSGLNLINPFFDKVSRI VSLKEQVVDFDPQAVITKDNATMQIDTVVYFQITDPKLYTYGVERPLSAIENLTATTLRN IIGDMTVDETLTSRDIINTKMRQELDDATDPWGIKVNRVELKSILPPNDIRIAMEKEMKA EREKRAKILEAQATRESAILVAEGEKQSAILRAEAEKEVKIKEAEGKAQAILEIQRAEAE AIKLLNEAKPAKEILALKSFETFEKVADGKSTKILIPSEIQNLAGFMQTIKEIN >gi|224461240|gb|ACDC01000162.1| GENE 5 4942 - 5367 559 141 aa, chain - ## HITS:1 COG:FN1548 KEGG:ns NR:ns ## COG: FN1548 COG1585 # Protein_GI_number: 19704880 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Fusobacterium nucleatum # 3 141 1 138 138 177 72.0 7e-45 MTVGYIFWLVLTIIFTIIEFAIPALVTVWFAFAAALTVFVSLISDSMKVEITFFTVVSLL SLIFLRPYARAILSKNKDNFDAEKIDTAIIVKKIVDTSKEEKIYDVSYKGSIWTALSNDL FEVGDTPAISSFKGNKIILKK >gi|224461240|gb|ACDC01000162.1| GENE 6 5476 - 5721 419 81 aa, chain - ## HITS:1 COG:no KEGG:FN1796 NR:ns ## KEGG: FN1796 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 78 1 78 79 96 83.0 2e-19 
MERLKEDEVKKIIDELKQTGKYKEYQEMLLDDFEEHHVVYKIGADELIAIAHKNNTIPYK LIEFYDWQQMNYLIEEEDGIE >gi|224461240|gb|ACDC01000162.1| GENE 7 5942 - 6205 354 87 aa, chain + ## HITS:1 COG:FN1794 KEGG:ns NR:ns ## COG: FN1794 COG1925 # Protein_GI_number: 19705099 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Fusobacterium nucleatum # 1 87 1 87 87 133 97.0 8e-32 MKSKTVEIVNETGLHTRPGNEFVSLAKTFSSQISVENEAGAKVNGTSLLKLLSLGIKKGS KITVYADGEDENEAVDKLSSLLENLKD >gi|224461240|gb|ACDC01000162.1| GENE 8 6269 - 7996 2579 575 aa, chain + ## HITS:1 COG:FN1793 KEGG:ns NR:ns ## COG: FN1793 COG1080 # Protein_GI_number: 19705098 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Fusobacterium nucleatum # 2 574 7 579 579 974 89.0 0 MKNNLIKGIPASPGIAIGKAFLYKENNLEILEKSILSKEEELERLIKGREVAKKQLEEIK ENTLQKLGKDKADIFEGHITLLEDEELFSEIDSKISEKKCTAEFALSEAIDEYANMLANL EDAYFKERAGDLRDIGKRWLYGVMNAQVVDLSKLEPETIIVARELNPSDTAQINLENVLA FVTEIGGKTAHSSIMARSLELPAVVGVGTVLENLEDNQILIVDALNGEVIVNPDEETLKI YREKRENFLKEKEELKALKDKEAVSKDGTKVDVWGNIGSPNDLKGIISNGGFGIGLYRTE FLFMEKDSFPTEDEQFEAYKIVAEGLKGYPVTIRTMDIGGDKSLPYMELPQEENPFLGWR AIRVCLDRQEILKTQFRALLRASKYGQIKIMLPMIMDIEEVRKAKAIFENCKKELREEGI EFDEKIMLGIMVETPAVAFRAKHFAKECDFFSIGTNDLTQYTLAVDRGNEKIANLYDTYN PAVLQAIKMLIDGAHEGGIKISMCGEFAGDENAVAILFGMGLDSFSMSGISIPRVKRILM KLDKKECEKLVERILELSTAIEIKNEVKEFMKNIA >gi|224461240|gb|ACDC01000162.1| GENE 9 8231 - 8629 824 132 aa, chain + ## HITS:1 COG:no KEGG:FN1792 NR:ns ## KEGG: FN1792 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 11 132 1 121 121 108 84.0 8e-23 MKKFAMLALAMSLFLVACGEKKEEEKPAEQPAAEATATTNEAAATTTEAAAEAKSFSVKT EDGKEFTLEVAADGATATLTDAEGKVTELKNAETASGERYADEAGNEIAMKGTEGVLTLG DLKEVPVTVEAK Prediction of potential genes in microbial genomes Time: Fri May 20 00:00:09 2011 Seq name: gi|224461239|gb|ACDC01000163.1| Fusobacterium sp. 
2_1_31 cont1.163, whole genome shotgun sequence Length of sequence - 1116 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 67 - 594 809 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 + Prom 603 - 662 6.0 2 1 Op 2 . + CDS 686 - 1115 513 ## COG0006 Xaa-Pro aminopeptidase Predicted protein(s) >gi|224461239|gb|ACDC01000163.1| GENE 1 67 - 594 809 175 aa, chain + ## HITS:1 COG:FN1951 KEGG:ns NR:ns ## COG: FN1951 COG2110 # Protein_GI_number: 19705253 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Fusobacterium nucleatum # 1 174 1 174 175 280 79.0 1e-75 MYKDIIKIVSGDITKIPEVEVIVNAANNQLEMGGGVCGAIFRAASGDLAKECKEIGSCAT GEAVITKGYNLPNKYIIHTVGPRYLTGENGEAKKLESAYYESLKLAREKGLRKIAFPSVS TGIYRFPVNEGAEIALNTAKKFIDENPDSFELILWVLDEKTYVVYKEKYEKIIKE >gi|224461239|gb|ACDC01000163.1| GENE 2 686 - 1115 513 143 aa, chain + ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 143 1 143 462 261 93.0 3e-70 MLNKEVYINRRKKLKENFRDGLILIMGNNFSPLDCEDNTYPFIQDATFKYYFGIDHNGLI GIIDIDKNEEIIFGNDYTMSDIIWMGKQKFLKELALEVGIEKFVEKEELKKYLENRKNIR FTNQYKADNIMYLSSILNINPFE Prediction of potential genes in microbial genomes Time: Fri May 20 00:00:11 2011 Seq name: gi|224461238|gb|ACDC01000164.1| Fusobacterium sp. 2_1_31 cont1.164, whole genome shotgun sequence Length of sequence - 3139 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 2 - 117 100.0 # AE009951 [D:1076861..1076976] # 5S Ribosomal RNA # Fusobacterium nucleatum subsp. 
nucleatum ATCC 25586 # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Fusobacterium. - LSU_RRNA 314 - 2681 93.0 # FJ410389 [D:301..3086] # 23S ribosomal RNA # Fusobacterium necrophorum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Fusobacterium. + Prom 2607 - 2666 80.4 1 1 Tu 1 . + CDS 2702 - 2917 224 ## Predicted protein(s) >gi|224461238|gb|ACDC01000164.1| GENE 1 2702 - 2917 224 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYRTFTFYGSPFQTIPIHQYTILNILQFFTTLSLNPLNTTAVSLTYLRFRLDPFRSPLLW VSFLLSFPRVT Prediction of potential genes in microbial genomes Time: Fri May 20 00:00:19 2011 Seq name: gi|224461237|gb|ACDC01000165.1| Fusobacterium sp. 2_1_31 cont1.165, whole genome shotgun sequence Length of sequence - 1565 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 73 - 1542 99.0 # FJ471670 [D:1..1475] # 16S ribosomal RNA # Fusobacterium periodonticum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Fusobacterium. Prediction of potential genes in microbial genomes Time: Fri May 20 00:00:29 2011 Seq name: gi|224461236|gb|ACDC01000166.1| Fusobacterium sp. 2_1_31 cont1.166, whole genome shotgun sequence Length of sequence - 34422 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 9, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 223 - 289 14.6 1 1 Tu 1 . 
- CDS 324 - 8942 13655 ## FN1449 hypothetical protein - Prom 9044 - 9103 11.3 - Term 9041 - 9099 -0.3 2 2 Op 1 3/0.000 - CDS 9132 - 9851 993 ## COG2849 Uncharacterized protein conserved in bacteria 3 2 Op 2 40/0.000 - CDS 9860 - 12259 3493 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit - Prom 12279 - 12338 4.6 4 2 Op 3 1/0.000 - CDS 12343 - 13359 1528 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Term 13375 - 13413 -0.9 5 2 Op 4 1/0.000 - CDS 13439 - 13906 375 ## COG0622 Predicted phosphoesterase 6 2 Op 5 24/0.000 - CDS 13920 - 16358 3916 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 16384 - 16443 9.8 - Term 16370 - 16411 2.5 7 3 Op 1 . - CDS 16584 - 18491 2688 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 8 3 Op 2 . - CDS 18526 - 18801 244 ## FN2127 hypothetical protein 9 3 Op 3 9/0.000 - CDS 18779 - 19888 810 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 10 3 Op 4 . - CDS 19902 - 20117 332 ## COG2501 Uncharacterized conserved protein 11 3 Op 5 . - CDS 20167 - 22062 1861 ## FN0001 chromosomal replication initiator protein DnaA - Prom 22136 - 22195 15.1 + Prom 22600 - 22659 11.9 12 4 Op 1 . + CDS 22692 - 22826 224 ## PROTEIN SUPPORTED gi|197735492|ref|YP_002164270.1| hypothetical protein FNP_0004 + Term 22829 - 22861 2.1 13 4 Op 2 16/0.000 + CDS 22881 - 23216 360 ## COG0594 RNase P protein component 14 4 Op 3 18/0.000 + CDS 23225 - 23473 168 ## COG0759 Uncharacterized conserved protein 15 4 Op 4 16/0.000 + CDS 23470 - 24090 586 ## COG0706 Preprotein translocase subunit YidC 16 4 Op 5 4/0.000 + CDS 24092 - 24862 1132 ## COG1847 Predicted RNA-binding protein 17 4 Op 6 11/0.000 + CDS 24881 - 26248 1911 ## COG0486 Predicted GTPase + Prom 26350 - 26409 9.9 18 5 Op 1 . + CDS 26472 - 28355 2621 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 19 5 Op 2 . 
+ CDS 28371 - 28982 852 ## COG1279 Lysine efflux permease + Prom 29172 - 29231 10.8 20 6 Op 1 10/0.000 + CDS 29266 - 30162 1138 ## COG0379 Quinolinate synthase 21 6 Op 2 13/0.000 + CDS 30164 - 31456 1485 ## COG0029 Aspartate oxidase 22 6 Op 3 1/0.000 + CDS 31434 - 32294 502 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 23 6 Op 4 . + CDS 32287 - 32796 511 ## COG1827 Predicted small molecule binding protein (contains 3H domain) + Prom 32845 - 32904 10.8 24 7 Op 1 . + CDS 32933 - 33196 412 ## Lebu_1725 addiction module antitoxin, RelB/DinJ family 25 7 Op 2 . + CDS 33198 - 33470 453 ## COG3041 Uncharacterized protein conserved in bacteria + Term 33473 - 33517 3.5 - Term 33407 - 33446 -0.5 26 8 Tu 1 . - CDS 33524 - 34027 893 ## HMPREF0868_0528 hypothetical protein - Prom 34063 - 34122 17.6 + Prom 34059 - 34118 8.0 27 9 Tu 1 . + CDS 34138 - 34416 447 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump Predicted protein(s) >gi|224461236|gb|ACDC01000166.1| GENE 1 324 - 8942 13655 2872 aa, chain - ## HITS:1 COG:no KEGG:FN1449 NR:ns ## KEGG: FN1449 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 487 2872 782 3165 3165 1702 48.0 0 MNKNFQKIEKDLRSIAKRYKSVKYSIGLVILFLMMGLNAFSEEVNTDQNVSNPVGTIATR EGIRDSVEGLQGKIKDARAENTKSIESLRLELVQLMEQGNQVVKSPWSSWQFGINYMYDN WSGTYKGRGDKPSKYVYNSIYRRGNWEERNAIDTIAGKSVDGDPITPGNENTSTWQTATT PTGVTKLKRDTSIDASTNGKREWGLVELRKIREPLNEVEIFANVSPKEVKKDKLDIPVSV TPPATLSAPVVKPNVNKPTEAPKVDLPNPPVLEIPGDPNLTFNPTISVLKVEKVGNITVN PGTVTPVDFILEPAVLSNSRNFLRKYTQTPQSYNLNSTTVVVNNVSTRPDGYKTGNYIST WGYIKDLGNVTTNVDVDVDGTRAFMIDEGIDKNNDRGIDPFNYNGVINLKKSQNVGIDVQ GTHTVYGPGAYVNTESKYNDINTVANVVVTNSGTINGIGGPNIKNQVAFGFNNFDASTNN TRTEMVNAASGVIKLDAEESVGIQLRPEDPNASGASDRLGLNMMAGKNDGTITLNGAGSF GILTVKNKNPNTKVLAAPKTYSNYGVTTSAGGQIASRAQEVNKSFVKNTGTISIQSDNSI GVGILNSIQSVEVGKFIDIGTSTLGAYGDKNGNGKVDGAVGVYTEEATRPVKGRVYTYTT DSKGVTTETITAGGKDDHGLENTIGIEDTSQPYTDSTGATKYKRTGKQVGTDTVEVSATI 
TLGSNSTSSFGLRNSNTGSITLVSGGKVKIEGEKNFGALSNAAAGRININHGAEITGDGK ESIGYAMLQGTGVNAGTIDITGAPTNTATPAAYNGSIGFYGVKGTFTNTSTGKIITSGTL AHSVVLKQNSTTQGEMTFNHYGEINVSSASTKPGNIGVYSDGYALANFYNDAKVYVGDDS IGIHTPKKDGFNQTFKNHGTLGIKIGARSTFAYLDADGTPTETTLKEFFNFGNAKVNIES GMGEKSSLVYANNQAKALLDADLTIDKGDAAASTMALLATNKSAVTVDSGKTLTTNTQVA LAAINGTTTAGSGSTAKNKGTIVSNRTNNGIGIYAKDGGSKAINDGTITMMGKEAAGMYG EDITTFENKAGKSIEVKEEKSVGMYAKSTGTNTLTAQNDGKITTNKQKSVGIYLENATTG PAVANLKASNKEIEIKGGTESIGIYAPASTVSKVGKVTMADAATKSIAVYLSKGAQATTV ATDEIDLGKSAKNIAYFIKNKDTGFGTSTDIGKVSGYGVGVYLEGTSTTDIAELTATSPN LNFTTGNDTGNGIVGLYLKGDTDISAYNKKITVGDTVVDAINGKDIAPAIAIYSEKQGVS ATPYVIKADIQTGKKAVGIFSVPDKTAPVVANKSYIKYLGSRMDLGEGSTGFYVNGKTEL DTTNKTTTINLDGGLVAYVTQNSEFVGGKSEVNLSKSGVGVYGERGAKVDVGSWSFNNKG NAAEEVRLKEGIAKVTTPKSLKPKMVLTHVINGETYLTSTVTAIADTGYTQEENIGLMAQ GLKNTKVGITWDKGTDYEIINEGTIDFKDSIKSTAIYAESARVKNDGTIKLGGSSTGIYG IYRDDSPKFEDAGGTKYPNKLEIDTTANSNISLGTNSTGMYLVNAEKLNTAAGGTIQSSA NATNNFGIYAINGKVDVPTTGTPAEIAEANAYNNKNDNFKTLTMTNNSNITLGNGSVGIY TRGQSDAARNTVTNDGNITVGDNLTGAPAVGIYAENTNLTQGDTGTPDITVGEKGIALYG KNSTVEAKGTVNYTNKGILGYFEDSTFTSHYGDLTAHQNTILFLKNSTANMNGAGADIDI TVPDKAATSDSFAGVYVEGTSTLNGVKKITVGENSNGIFMKNATFTSNVTDIVSTKEGAK GLLAVESDLTNNSKITLSGDSSIGIYSDASNAKTVTNNGKLTISGKKTLGVFLKGSQTFI NIADIDVADTTSSVLAEKTVGIYTKDGTSTIKHNSGTINVGTKSIGIFSATNSGVEVDTP AKINVKDEAIGIYKEKGTALLKGEIDVAAHTSTTKNSEPVGLYGLNGANITDSASKITVG AKSFGFILENETTATTNQYTSTGAGTVSLGDDSVFLYSNGQASLTNGRDISSNSKRLIGF YIKGNGTNRGDLINNATIDFSNSLGSIGIYAPGGKATNNGRILVGETDSIDPATGKTYTD VSKIIYGIGMAADNGGHIINNNEIRIYGDKSIGMYGKGVGTTVENNGTIFLDGSRATATN KIQSMIGVYVDEGATFVNKGDIRTADAYAGKDVGGTIKVNDNVSGLVGVAVMNGSTLENH GNIDIDANESYGVVIRGKSPTQPAVIKNYGNFRINVRGRGTYGVSYKDISAADLAALEAI VNSKLKSDATGQELVAAAGTDKSYEGVSITIQNGKPVFTRNGVLVSDSEVEKIEKIIGNA TSNLGMSDVGFYIDTLGRTKPIDINGATPPINSQLIVGTEYSELTNKKEWFVKDDVIAPF LQQIQGRNFKLTSIAGSLTWMATPVLDNYGQIKGVAMTKLPYTAFVEKSHNAWNFADGLE QRYGMNALDSREKRVFNLLNSIGNNEEILLTQAYDEMMGHQYANVQQRIYETGNILNREF SYLRNAWSNPTKDANKVKVFGTNGEYKTDTAGIIDYKYNAQGVAYVHEDESVKLGETVGW 
YAGLVHNRYKFEDIGRSKEEMLQGKVGMFKSVPFDDNNSLNWTISGDVFVGYNKMHRRFL VVNEIFNAKSRYYSYGLGVKNELSKEFRLSEGFALKPYGALRLEYGRMSKIRERSGEVKL EVKSNDYISIKPEIGAELSYRAFFGPKSLKAAVTVAYENELGVLANPKNKARVAGTSADW FNIRGEKEDRRGNVKTDLNIGVDNQRIGVTANVGYDTKGSNIRGGLGLRVIF >gi|224461236|gb|ACDC01000166.1| GENE 2 9132 - 9851 993 239 aa, chain - ## HITS:1 COG:FN2121 KEGG:ns NR:ns ## COG: FN2121 COG2849 # Protein_GI_number: 19705411 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 20 239 9 230 230 249 60.0 3e-66 MKKILASLFVLFSIVTFSAEKIAIERIEVKEEKIYLKGQQTPFTGVVEKKYPNGRVEATL EVKDGKLNGKTFVYSEDGKVKKEENYINGLMEGVERGYYPSGKLEFEVTNKNDLRNGIER HYSEDGKLIIEVPYQNDVVTGLVKQYNKDGKLEYETNYVNNKREGLSKKYYPSGKLLSQV IFKNDKEEGLMKGYSEDGKLEMEIPYLHGSVEGLVKRYDENGKVVEQAMYKNNQEVKKK >gi|224461236|gb|ACDC01000166.1| GENE 3 9860 - 12259 3493 799 aa, chain - ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 146 799 1 653 653 1100 89.0 0 MLISLNWLKQYVDIKESVDEIANALTMIGQEVEAIDIQGKDLGNVVIGQIVEFDKHPNSD RLTLLKVNVGEGEPLQIICGATNHKLNDKVVVAKIGAVLPGNFKIKKSKIRDVESYGMLC SEAELGLAKESEGIIILPEDAPIGKEYREYAGLNDVIFELEITPNRPDCLSHIGIAREVA AYYNRKVKYPVIEMAETIESVNTVIKVNIEDKDRCKRYMGRVIKNVKIKESPEWLKTRIR AMGLNPINNVVDITNFVMFEYNQPMHAFDLDKVEGNITIRAAKENEEITTLDGVERVLKN GELVIADDEKAIAIGGVIGGQNTQIDDDTKNIFVEVAYFTPENIRRESRDLGIFTDSAYR NERGMDVENLPVVMNRAVSLIAEVAEGEVLSEVIDKYVEKPKRAEISLNLEKLNKFIGKT LTYEEVGKILTHLDIELKPLGDGTTLLIPPSYRADLTRPADIYEEVIRMYGFDNIEAKMP VMSIESGKENTNFKISRIVREILKELGLNEVINYSFIPKFTKELFNFGEEVIEIKNPLSE DMAIMRPTLLYSLIANVRDNINRNQTDLKLFEISKTFKKLGEGPNGLAIEDLKIALILSG REEKNLWNQSKSDYSFYDLKGYLEFLLERLNVTKYSLTRLTNNKNFHPGASAEIKIGEDV IGVLGELHPNLVNYFGIKREKVFFAELNLTSLLKYIKIKVNYETISKYPEVLRDLAITLD RSVLVGEMVKEIKKKVNLIEKIDIFDVYSGDKIDKDKKSVAMSIVLRDKNRTLTDEDIDK AMTAILELIKDKYNGEIRK >gi|224461236|gb|ACDC01000166.1| 
GENE 4 12343 - 13359 1528 338 aa, chain - ## HITS:1 COG:FN2123 KEGG:ns NR:ns ## COG: FN2123 COG0016 # Protein_GI_number: 19705413 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Fusobacterium nucleatum # 1 338 1 338 338 659 95.0 0 MKEEILKVKEEIQKYIEESKTLQKLEEIRVNYMGKKGIFTDLSKKMKDLTAEERPKIGQI INEVKEKISNLLDEKNKALKEKELNERLESEIIDISLPGTKYNYGTVHPINETMELMKNI FSKMGFDIVDGPEIETVEYNFDALNIAKTHPSRDLTDTFYLNDSIVLRTQTSPVQIRYML KHGTPFRMICPGKVYRPDYDISHTPMFHQMEGLVVGKDISFADLKGILTHFVKEVFGDRK VRFRPHFFPFTEPSAEMDVECMICHGEGCRLCKDSGWIEIMGCGMVDPEVLKYVGLNPDE VNGFAFGVGIERVTMLRHGIGDLRAFFENDMRFLKQFK >gi|224461236|gb|ACDC01000166.1| GENE 5 13439 - 13906 375 155 aa, chain - ## HITS:1 COG:FN2124 KEGG:ns NR:ns ## COG: FN2124 COG0622 # Protein_GI_number: 19705414 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Fusobacterium nucleatum # 1 153 1 153 153 233 77.0 8e-62 MKRILVLSDSHSYFDKALKIFEKEKPDIVIAAGDGIGDIDDLSYVHPEATYYMVKGNCDF FERSHSEENIFEIEGKKFFLTHGHLYDVKRSLNSIKEMTKKLKANLTIFGHTHKPYIEYY EDEILFNPGATEDGRYGFIILKDGNIQLFHKQLQL >gi|224461236|gb|ACDC01000166.1| GENE 6 13920 - 16358 3916 812 aa, chain - ## HITS:1 COG:FN2125 KEGG:ns NR:ns ## COG: FN2125 COG0188 # Protein_GI_number: 19705415 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Fusobacterium nucleatum # 1 812 1 811 811 1388 95.0 0 MSNVDNRYIEEELKESYLDYSMSVIVSRALPDVRDGLKPVHRRILFAMNEMGMTNDKPFK KSARIVGEVLGKYHPHGDSAVYGTMVRMAQDFNYRYLLVEGHGNFGSIDGDSAAAMRYTE ARMEKITAELLEDIDKDTIDWRKNFDDSLDEPTVLPAKLPNLLLNGAIGIAVGMATNIPP HNLGELVDGILALIDNKDIEILELMNFIKGPDFPTGAIIDGRAGIIEAYKTGRGKIKVRG KVDIEEQKNGKANIIVSEIPYQLNKANLIEKIANLVKEKKITEISDLRDESNREGIRIVI EVKKGEEPELVLNKLYKYTDLQTTFGVIMLSLVNNVPRVLNLKEMLNEYIKHRFEVITRR TAFDLDKAEKRAHILKGYQIALENIDRIIELIRASSDGTVAREQLIEKYGFTDIQARSIL DMKLQRLTGLEREKIDNEYKEIEALIKELREVLADNSKIYEIMKKELLELKEKYNDKRRT 
QIEEERMEILPEDLIKDEEIIITYTNKGYVKRIEASKYKAQRRGGRGVSALNTIEDDYAE KIITASTLDTMMIFTDKGKVYNIRAYEIPDLSKQSRGRLIGNIINLSEGEKVRDTIVIKE FVPEKEIVFITKNGLIKKTSLGEFKNINNSGLIAIKIKEDDDIIFVGLIEDVTKEEILIA THDGYCTRFLTDTIRPTGRSTQGVKAITLREGDAVVSAMLIKNPETDILTITENGYGKRT SLDEYPQYNRGGKGVINLKASEKTGKVVSVLEVTEDEELMCITSNGIVIRTSISEISRIG RATQGVRIMKVADEEKVAAITKIKKEEEELED >gi|224461236|gb|ACDC01000166.1| GENE 7 16584 - 18491 2688 635 aa, chain - ## HITS:1 COG:FN2126 KEGG:ns NR:ns ## COG: FN2126 COG0187 # Protein_GI_number: 19705416 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Fusobacterium nucleatum # 1 635 5 639 639 1186 95.0 0 MSYEAQNITVLEGLEAVRKRPGMYIGTTSERGLHHLVWEIVDNSVDEALAGYCNKIDVKI LPDNIIEVVDNGRGIPTDIHPKYGKSALEIVLTVLHAGGKFENDNYKVSGGLHGVGVSVV NALSEWLEVEVRKEGNVYYQKYHRGKPEEDVKIIGSCEASEHGTTVRFKADGDIFETLVY NYFTLSNRLKELAYLNRGLTITLSDLRKDEKKEETYKFDGGILDFLNEIVKEETTIIDKP FYVSAEQDNVGVDVTFTYTTSQNEVIYSFVNNINTHEGGTHVQGFRTALTKVINDVGKAQ GLLKDKDGKLMGNDIREGVVAIVSTKIPQPQFEGQTKGKLGNSEVSGIVNTIVSNSLKIF LEDNPAITKIVVEKILNSKKAREAAQKARELVLRKSVLEVGSLPGKLADCTSKKAEECEI FIVEGDSAGGSAKQGRDRYNQAILPLRGKIINVEKAGLHKSLESSEIRAMVTAFGTSIGE TFDISKLRYGKIILMTDADVDGAHIRTLILTFLYRYMKDLITEGNIYIACPPLYKVASGK QIIYAYNDLELKNILAQMNQENKKYTIQRYKGLGEMNPEQLWETTMNPDGRLLLKVSIDN AREADMLFDKLMGDKVEPRREFIEEHAEYVKNIDI >gi|224461236|gb|ACDC01000166.1| GENE 8 18526 - 18801 244 91 aa, chain - ## HITS:1 COG:no KEGG:FN2127 NR:ns ## KEGG: FN2127 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 90 1 90 90 87 66.0 2e-16 MKIMSISDIAISTIESEDKIKLMILREKWKELFSELAEISTVIDFNEKIIYIKSYDSVLK HYIFANKQKLINEIMESLEIKFEIEDIKIKS >gi|224461236|gb|ACDC01000166.1| GENE 9 18779 - 19888 810 369 aa, chain - ## HITS:1 COG:FN2128 KEGG:ns NR:ns ## COG: FN2128 COG1195 # Protein_GI_number: 19705418 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Fusobacterium 
nucleatum # 1 369 1 369 369 551 93.0 1e-157 MKISNISYLNFRNLENTSVELSDKINVFYGKNAQGKTSLLEAIYYSSTGISFKTKKTAEM IKYNFDEFISSISYSDYIANNKISVRFKNIPGAKKEFFFNKKRISQTDFYGKINIIAYIP EDIILINGSPKNRRDFFDIEISQIDKEYLSNLKNYDKLLKIRNKYLKENKRNTEEFAIYE KEFIKYASYIIFRRLEYVKSLSIILNLQYRKLFNIEQELNLKYETNLDKTGKVTVEMIQE SLQKEILQKKYQEDRYKFSLVGPHKDDYKFLLNGYEAKISASQGEKKSIIFSLKLSEIEI IKKNRKENPVVIIDDITSYFDEDRRKSILEFFNKRDIQVLISSTDKLDIEAKNFYVEKGI IEDENNVNK >gi|224461236|gb|ACDC01000166.1| GENE 10 19902 - 20117 332 71 aa, chain - ## HITS:1 COG:FN2129 KEGG:ns NR:ns ## COG: FN2129 COG2501 # Protein_GI_number: 19705419 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 71 1 71 71 102 91.0 2e-22 MKNIEKVKISTEFIKLDQFLKWIAVVDSGSEAKEVILDEKVKVNDEVETRRGRKIYPEYK VEIFDKTYIVE >gi|224461236|gb|ACDC01000166.1| GENE 11 20167 - 22062 1861 631 aa, chain - ## HITS:1 COG:no KEGG:FN0001 NR:ns ## KEGG: FN0001 # Name: not_defined # Def: chromosomal replication initiator protein DnaA # Organism: F.nucleatum # Pathway: not_defined # 1 631 1 637 637 840 84.0 0 MKKEKVEQDEKKEVVEVIEIENFEVSKTGSLADDLMKFENVKDIKIENKEVPDIEVQEIY IRETGNYLNLQENFINIPIEMIYFPFFTPQKQNKRINFKYTFEDLGVTMYSTLIPKDKKD KVFQPSIFEEKIYTFLISMYQEKSPQQDENEEVAIEFEISDFIVNFLGNKMNRTYYSKVE QALKNLKNTIYQFEISNHTKFGKNKFEDSSFQLLNYQKMKVGKKIFYRVVLNKNIVNKIK SKRYIKYNTKNLLEIMVKDPIASRIYKYISKIRYKNNKGEINVRTLAAIIPLKMEQRVEK IIKNGVKEYYLNRMKPVLTRILKAFDVLLELKYIVSFEEIYNKDEKTYYIAYVFNKERDG DCHMSEFIKKNEKNIVKENIDGVEEVIDLNADIDYQDNIEYLINKAKENPKIAPKWNAWV DKKIKKILAEDGEEMLKRVLNILIHMDKNIEIGLPNYISGILKNIGGKGSKKANNINMTI FENVSKGKGLKSKNQIKQARKKGMEKISNIKEIMTENNFLEEKLEDKNLLLEKKTEIKNE KLDNVDEKIYNIEESNLEKILSFFDEETRNTIEEKALENIKKEVDNNNIDVILNVKNFSK TMYYKMIGTSIMKILKAEYSEVLENINKNDK >gi|224461236|gb|ACDC01000166.1| GENE 12 22692 - 22826 224 44 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|197735492|ref|YP_002164270.1| hypothetical protein FNP_0004 [Fusobacterium nucleatum subsp. 
polymorphum ATCC 10953] # 1 44 1 44 44 90 100 1e-17 MKRTFQPNQRKRKKDHGFRARMSTKNGRKVLKRRRVRGRAKLSA >gi|224461236|gb|ACDC01000166.1| GENE 13 22881 - 23216 360 111 aa, chain + ## HITS:1 COG:FN0002 KEGG:ns NR:ns ## COG: FN0002 COG0594 # Protein_GI_number: 19703354 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Fusobacterium nucleatum # 1 111 1 111 111 150 89.0 5e-37 MNTLKKNGEFQNIYKLGNKYFGNYSLIFFNKNKLDYSRFGFVASKKIGKAFCRNRIKRLF REYIRLNIEKLNANYDIIIVAKKKAGEMIETIKYQDIEKDLNRVFKNSKII >gi|224461236|gb|ACDC01000166.1| GENE 14 23225 - 23473 168 82 aa, chain + ## HITS:1 COG:FN0003 KEGG:ns NR:ns ## COG: FN0003 COG0759 # Protein_GI_number: 19703355 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 82 1 82 82 133 81.0 1e-31 MKKIFILLIRFYQKFISPLFPAKCRYYPTCSQYTLEAIQEYGAIKGTYLGIKRILRCHPF HEGGYDPVPKRKIEDSEEKEKE >gi|224461236|gb|ACDC01000166.1| GENE 15 23470 - 24090 586 206 aa, chain + ## HITS:1 COG:FN0004 KEGG:ns NR:ns ## COG: FN0004 COG0706 # Protein_GI_number: 19703356 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Fusobacterium nucleatum # 1 206 1 205 205 300 87.0 1e-81 MSYLYNLLKQFLALLLTTTDKYVGNFGVSIIIVTILIKIALLPLTLKQDKSMKEMKKIQP ELEKLKEKYANDKQMLNIKTMELYKEHKVNPLGGCLPLLLQLPILFALFGVLRSGIIPAD SSFLWLKLPEPDPFFVLPVLNGAVSFFQQKLMGSADSNPQMKNMMYIFPIMMIFISYRMP SGLQLYWLTSSILAVVQQYFIMKKGA >gi|224461236|gb|ACDC01000166.1| GENE 16 24092 - 24862 1132 256 aa, chain + ## HITS:1 COG:FN0005 KEGG:ns NR:ns ## COG: FN0005 COG1847 # Protein_GI_number: 19703357 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Fusobacterium nucleatum # 97 256 1 162 163 204 83.0 2e-52 MEKTIEIKAIDKEKALKRALNILGVELTDNETVDIVEKVAPRKKFFGLLGIEPGIYDVSI KTKQEEKKEVKEHKEHKPYIHKFEKEKTEKHVKTEKVEKSEKVEKIEKLAHTEQEKEISE 
KVAFFVEKMKLDIKYKIRRVKERLYVVEFFGKDNALIIGQKGKTLNSFEYLLNSMIKNCK IEIDVEKFKEKRNDTLRVLAKRMAEKVSKTGKTVRLNAMPPRERKVIHEVVNKYPDLDTF SEGRDPKRYIVIKKKR >gi|224461236|gb|ACDC01000166.1| GENE 17 24881 - 26248 1911 455 aa, chain + ## HITS:1 COG:FN0006 KEGG:ns NR:ns ## COG: FN0006 COG0486 # Protein_GI_number: 19703358 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 455 1 455 455 759 96.0 0 MLLDTIAAISTPRGEGGISIVRMSGQDSLNILEKIFRAKSKKVSELKNYSINYGHIIDNE HIVDEVLVSIMKAPNTYTREDIVEINCHGGFLVTEQVLQVVLKNGARIAEIGEFTKRAFL NGRIDLTQAEAVIDVIHGKTEKSLSLSLNQLRGDLRDKIATIKKSVLDLAAHINVVLDYP EEGIDDPVPENLVDNLKKASAEIKDLISSYDKGKIIKDGIKTAIIGKPNVGKSSILNSLL REDRAIVTHIPGTTRDIIEEVININGIPLLLVDTAGIRNTDDIVENIGVEKSKELINSAD LILYVIDISREIDEEDFRIYDIINTDKVIGILNKIDIKKEIDLSKFPKIDKWIEISALSK IGIDNLEDEIYKYIMNENVEDSSQKLVITNVRHKSALEKTNEALLNIIETIDMGLPMDLM AVDIKDALDSLSEVTGEISSEDLLDHIFSNFCVGK >gi|224461236|gb|ACDC01000166.1| GENE 18 26472 - 28355 2621 627 aa, chain + ## HITS:1 COG:FN0007 KEGG:ns NR:ns ## COG: FN0007 COG0445 # Protein_GI_number: 19703359 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 1 627 1 627 628 1124 93.0 0 MDKDYDVIVVGAGHAGVEAALASARLGNKVALITLYLDTISMMSCNPSIGGPGKSNLVTE IDVLGGEMGRHIDEFNLQLKDLNTSKGPAARITRGQADKYKYRKKMREKLEKTENISLIQ DCVEEILVEDIKDRQNLSYEKKVIGVKTRLGLIYNTKAIVLATGTFLKGKIVIGDVTYSA GRQGETSAEKLSDSLRELGIKIERYQTATPPRLDKKTIDFSQLEELKGEEHPRYFSIFTK KEKNNTVPTWLTYTSEETIEVVRDMMKYSPIVSGMVNTHGPRHCPSIDRKVLNFPEKTKH QIFLEMESENSDEIYVNGLTTAMPAFVQEKILRTIKGLENAKIMRHGYAVEYDYAPASQL YPSLENKKISGLFFSGQINGTSGYEEAAAQGFIAGVNAAKKIKGEEPVIIDRSEAYIGVL IDDLIHKKTPEPYRVLPSRAEYRLTLRYDNAFMRLFNKIKEVGIVDKDKIEFLEKSINNV YTEINNLKNISVSMNDANNFLESLGIEEKFVKGVKASEILKIKDVNYDDLKAFLNLNDYE DFVKNQIETMIKYEIFIERENKQIEKFKKLEHMYIDKNINYDDIKGISNIARAGLNEVRP LSIGEATRISGVTSNDITLIIAHMNDK >gi|224461236|gb|ACDC01000166.1| GENE 19 28371 - 28982 852 203 aa, chain 
+ ## HITS:1 COG:FN1861 KEGG:ns NR:ns ## COG: FN1861 COG1279 # Protein_GI_number: 19705166 # Func_class: R General function prediction only # Function: Lysine efflux permease # Organism: Fusobacterium nucleatum # 1 201 1 202 207 266 76.0 2e-71 MDVYLQGFLMGLAYVAPIGVQNLFVINSAISQKRGRALLIALIVIFFDITLAFACFFGIG LLIDKLEWLKLIILLIGSLIVIYIGQGLIRSKSSFKETDTNISLAKVITTDCVVTWFNPQ AIIDGTMMLGAFRVNLVASDATYFILGVVSASFVWFTGVTLFVSFFRDKFNDKILRVINI VCGAIIIFYGIKLLLSFYKMLKG >gi|224461236|gb|ACDC01000166.1| GENE 20 29266 - 30162 1138 298 aa, chain + ## HITS:1 COG:FN0008 KEGG:ns NR:ns ## COG: FN0008 COG0379 # Protein_GI_number: 19703360 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Fusobacterium nucleatum # 1 298 1 298 298 513 93.0 1e-145 MKDRIKKLQKEKDVAILAHYYVDGEVQKIADYVGDSFYLAKTATKLKNKTIIMAGVYFMG ESIKILNPEKTVHMVDVYADCPMAHMITIKKIKEMREKYDDLAVVCYINSTAEIKAYCDV CITSSNALKIVSKLKEKNIFIVPDGNLAAYIAKQIKNKNIILNEGYCCVHNLVHLENVIK LKKEYPNAKVLAHPECKEEILNLADYIGSTSGIIEEALKDGDEFIVVTERGIQYKIYEKA PNKKLHFADTLICKSMKKNTLEKIENILLNGGDELEVDDEIAKKALIPLERMLELAGD >gi|224461236|gb|ACDC01000166.1| GENE 21 30164 - 31456 1485 430 aa, chain + ## HITS:1 COG:FN0009 KEGG:ns NR:ns ## COG: FN0009 COG0029 # Protein_GI_number: 19703361 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Fusobacterium nucleatum # 1 430 1 430 435 733 90.0 0 MKIENSDVVIVGSGVAGLICALTLSKKFKIILLTKKKLQDSNSYLAQGGISVCRGKEDRE EYIEDTLIAGHYKNDKRAVEILVDESEEAVNTLIENGVKFTGDKKGLFYTREGGHRKFRI LYCEDQTGKYIMESLIEKILERDNIEIIEDCEFLDIIEKENTCLGILAKKEEIFAIKSKF TVLATGGLGGIYKNTTNFSHIKGDGVAVAIRHNIELKDISYIQIHPTTLYSKENKRKFLI SESVRGEGAILLNQKLERFTDELKPRDKVTKAILEEMKKDKSEYEWLDFSTIKLDVKERF PNIYRNLMENNIDPLKDKVPIVPAQHYTMGGIKVDMNSKTSMKNLYAIGEVACTGVHGKN RLASNSLLESVVFGKRAAYSIIDENNISVYNEITDDIFENIVDKIILTDEKENKNIIEKR IKEDEFEKNR >gi|224461236|gb|ACDC01000166.1| GENE 22 31434 - 32294 502 286 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] 
# 10 279 10 283 286 197 40 6e-50 MNLRKIDKFQMDESIRLALKEDITSEDISTNAIYKNSRLAEISLYSKEEGILAGIDVFKR VFELLDDNVEFIEYKSDGDKLLNKDLILKIKADVKTILSAERTALNYLQRMSGIATYTRK MLEALDDENIKLLDTRKTTPNMRIFEKYSVKVGGGYNHRYNLSDAIMLKDNHIDAAGSIT EAIKLAREYSPFIKKIEIEVEDLKGVEEAVKAGADIIMLDNMDIETTKEAIKIINKKAII ECSGNVDINNINRFKGLEIDYISSGAITHSAKILDLSLKNLRYVDD >gi|224461236|gb|ACDC01000166.1| GENE 23 32287 - 32796 511 169 aa, chain + ## HITS:1 COG:FN0011 KEGG:ns NR:ns ## COG: FN0011 COG1827 # Protein_GI_number: 19703363 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Fusobacterium nucleatum # 1 169 1 169 169 248 97.0 4e-66 MIEREEREKKILEILRDSETLVSGTYLAEFFDVSRQVIVQDIAILKAKNIDIISTNRGYR LLSKGIKKVIKVKHDDAEIRNELNAIVDLGASVEDVFVIHKTYGEIRVKLDIKSRRDVDL LVENINSKLSKPLKNLTDNCHYHTIIAENENIFKEVEDKLKELGILMEE >gi|224461236|gb|ACDC01000166.1| GENE 24 32933 - 33196 412 87 aa, chain + ## HITS:1 COG:no KEGG:Lebu_1725 NR:ns ## KEGG: Lebu_1725 # Name: not_defined # Def: addiction module antitoxin, RelB/DinJ family # Organism: L.buccalis # Pathway: not_defined # 1 85 1 82 84 84 60.0 2e-15 MAVINIRVNDEVKKEAETIFKALGLNMSVAMNLFLKKCINENGIPFDLKVPNKETIEAME ETNKILNGDIERKSYKNADELFEDLGV >gi|224461236|gb|ACDC01000166.1| GENE 25 33198 - 33470 453 90 aa, chain + ## HITS:1 COG:jhp0831 KEGG:ns NR:ns ## COG: jhp0831 COG3041 # Protein_GI_number: 15611898 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Helicobacter pylori J99 # 1 90 1 90 90 82 50.0 2e-16 MYEIKTTTRFEKDLKLIKKRGYDTKLLKEVIDILSKGEKLEEKYKDHYLQGNYSGFKECH IKPDWLLVYKIEDNVLVLTLSRTGTHSDLF >gi|224461236|gb|ACDC01000166.1| GENE 26 33524 - 34027 893 167 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0528 NR:ns ## KEGG: HMPREF0868_0528 # Name: not_defined # Def: hypothetical protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 1 166 1 162 163 118 46.0 6e-26 MGMYAMYQEVKKEDFKKLLESDDFFETIEDLEEKDGTELCDIDKMWDALHFLLNGLSAIH 
GTPEDNILSEFIIGSESFDEESEDFTRYIPTERVIKIAKKLNEINFEDYLKDFDMNKFAE NGIYPDIWSYDEEREEIMEELSEHFENLKEFYNKVAKNKNIVVVTIC >gi|224461236|gb|ACDC01000166.1| GENE 27 34138 - 34416 447 92 aa, chain + ## HITS:1 COG:FN0013 KEGG:ns NR:ns ## COG: FN0013 COG1811 # Protein_GI_number: 19703365 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Fusobacterium nucleatum # 12 86 1 75 113 110 92.0 7e-25 MIINSGRWGKDMGLITNFLAIIIGGILGLTIGKKFNEDIKNIIVDCAGIFIIVIGIKSAL VAQKDIMILIYLIIGAVIGQLVNYSRLTPYEC Prediction of potential genes in microbial genomes Time: Fri May 20 00:01:21 2011 Seq name: gi|224461235|gb|ACDC01000167.1| Fusobacterium sp. 2_1_31 cont1.167, whole genome shotgun sequence Length of sequence - 4433 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 541 594 ## FN0015 hypothetical protein - Prom 646 - 705 17.0 + Prom 623 - 682 12.3 2 2 Op 1 . + CDS 792 - 1790 1078 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 1795 - 1847 8.0 + Prom 1799 - 1858 7.5 3 2 Op 2 . + CDS 1884 - 3131 1348 ## COG3177 Uncharacterized conserved protein + Term 3140 - 3189 3.6 4 3 Tu 1 . - CDS 3186 - 3680 537 ## FN0018 hypothetical protein - Prom 3731 - 3790 9.8 + Prom 3657 - 3716 12.4 5 4 Tu 1 . 
+ CDS 3881 - 4336 563 ## COG3086 Positive regulator of sigma E activity + Term 4364 - 4407 1.9 Predicted protein(s) >gi|224461235|gb|ACDC01000167.1| GENE 1 59 - 541 594 160 aa, chain - ## HITS:1 COG:no KEGG:FN0015 NR:ns ## KEGG: FN0015 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 159 1 181 182 180 61.0 2e-44 MGMDLCYYGVKEENIPDILDGNFFEEDFSDSEPQHTLRVFSVKELYYVYSGGKELEEEDF QGKNERDLFIEAFLGEVTVSSPPGDIYSYCTCKEKVKEIANFLNKIDMKDCFEKIEKFYS SSEEEEYIFDIENIIDRFNDFKEFYNELVKNNLGVFIYIS >gi|224461235|gb|ACDC01000167.1| GENE 2 792 - 1790 1078 332 aa, chain + ## HITS:1 COG:all5295 KEGG:ns NR:ns ## COG: all5295 COG0451 # Protein_GI_number: 17232787 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. PCC 7120 # 3 324 4 326 334 113 29.0 5e-25 MKKVFIITGSTGFLGNTIVKKLSKNKDYEVRALVYSKKEEDILKDIDCKIFYGDITNKAS LKDIFTVEDNKDIYVIHCAAIVTIKSDEDPKVYDVNVKGTNNVIDYCIEVNAKLLYVSSV HAIKESEGKIFETKDFDKDLVHGYYAKTKAEAAKNVLEAVKNRNLRACVFHPAGIIGSGD SSNTHTTQLVKRMLENKLVFVVNGGYNFVDVRDVADGIINAADMGEIGETYILSGEYISI KDYAKLVEKILGKKKYIFSIPIWFVKMIAPAMEKYYDLVKKVPLFTRYSIYTLQTNSNFS NDKAHKKLNFRNRKIEDSIKDTIIDITKKEIQ >gi|224461235|gb|ACDC01000167.1| GENE 3 1884 - 3131 1348 415 aa, chain + ## HITS:1 COG:FN0017 KEGG:ns NR:ns ## COG: FN0017 COG3177 # Protein_GI_number: 19703369 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 415 1 415 415 566 84.0 1e-161 MSNKYEKLIKLYYKKKNIDEEYIKRIENPATLITELKINPMKKGNKILDKEYSLFYVNLL EHTLLQEKIVKNSNKINYISNRLPTIAIKEIIMKILSNELYKTNKIEGIETVKSEIHSSL KDDRISNKKSNKLDGIIKKYKDIMENNFEDTEHIDNLSSFRKIYDEMFEDFEKSGNYKLD GKYFRKDTVKIINGLGNIIHIGVNGEEAIEKNIESLIEFMNRKDITFLLKASIVHFFFEY IHPFYDGNGRFGRYLLSLYLARKLDNLTAFSVSYSISRNLDDYYKSFVEVEDVNNYGEIT FFVENILKTIKNGQEMIIELLNDSVMRFKHSIEILDELTKELSEKENIILRIYLQNYLFN DFEELTNVELTSIIGDLTQQTINKYTQELEKKGYLVKIKQRPLTYSLSEKITEKI 
>gi|224461235|gb|ACDC01000167.1| GENE 4 3186 - 3680 537 164 aa, chain - ## HITS:1 COG:no KEGG:FN0018 NR:ns ## KEGG: FN0018 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 6 163 1 155 156 71 33.0 9e-12 MKKLFLIFLSLFCISCTGLTAFTQHTADPTEIKKLTEKGAMELMTSQEIEDLKAGKTVTM IGFRSNSRGVVLDKLTSMVNLSNNDVDKDVIAAAKLIQNNPGAIFISDNSDIFIRTIKFL GQNEDGRSIINGARFLFINNFNENRVKELAQKYNFKYSFPKLDN >gi|224461235|gb|ACDC01000167.1| GENE 5 3881 - 4336 563 151 aa, chain + ## HITS:1 COG:FN0338 KEGG:ns NR:ns ## COG: FN0338 COG3086 # Protein_GI_number: 19703681 # Func_class: T Signal transduction mechanisms # Function: Positive regulator of sigma E activity # Organism: Fusobacterium nucleatum # 37 150 1 114 114 196 87.0 2e-50 MVNKGIVTKIQGDTVAVKLYKSSSCSHCSSCSESNKMGSDFEFKINQKVELGDLVTLEIS EKDVVKAAMIAYVFPPIMMILGYIVADRLGFSEMQSIAGSFIGLVIGFIFLAIYDRFFAK KTIDEEIKIVSVEKYDPNACENLAERCEDFF Prediction of potential genes in microbial genomes Time: Fri May 20 00:01:32 2011 Seq name: gi|224461234|gb|ACDC01000168.1| Fusobacterium sp. 2_1_31 cont1.168, whole genome shotgun sequence Length of sequence - 17093 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 5, operones - 2 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 66 - 1412 1687 ## COG0166 Glucose-6-phosphate isomerase + Term 1443 - 1480 1.5 - Term 1424 - 1474 8.1 2 2 Tu 1 . - CDS 1505 - 2038 950 ## gi|237738588|ref|ZP_04569069.1| predicted protein - Prom 2179 - 2238 13.4 - Term 2214 - 2264 -0.3 3 3 Op 1 . - CDS 2397 - 3941 1684 ## COG1450 Type II secretory pathway, component PulD 4 3 Op 2 . - CDS 3938 - 4714 664 ## FN2087 hypothetical protein 5 3 Op 3 . - CDS 4683 - 5861 738 ## FN2088 hypothetical protein 6 3 Op 4 . - CDS 5854 - 6375 265 ## FN2089 hypothetical protein 7 3 Op 5 . - CDS 6344 - 6886 187 ## FN2090 hypothetical protein 8 3 Op 6 . - CDS 6893 - 7306 181 ## FN2091 hypothetical protein 9 3 Op 7 . 
- CDS 7303 - 7737 57 ## FN2092 integral membrane protein 10 3 Op 8 10/0.000 - CDS 7770 - 8246 767 ## COG2165 Type II secretory pathway, pseudopilin PulG - Prom 8270 - 8329 7.5 11 3 Op 9 24/0.000 - CDS 8476 - 9513 825 ## COG1459 Type II secretory pathway, component PulF 12 3 Op 10 . - CDS 9510 - 10742 1004 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 13 3 Op 11 . - CDS 10763 - 10948 96 ## gi|237738599|ref|ZP_04569080.1| predicted protein 14 3 Op 12 . - CDS 10945 - 11436 235 ## FN2097 hypothetical protein - Prom 11522 - 11581 12.7 + Prom 11501 - 11560 14.7 15 4 Tu 1 . + CDS 11716 - 12072 409 ## FN0064 putative cytoplasmic protein + Term 12203 - 12271 4.2 + Prom 12344 - 12403 9.0 16 5 Op 1 3/0.000 + CDS 12423 - 15362 3125 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) + Prom 15433 - 15492 9.4 17 5 Op 2 1/0.000 + CDS 15524 - 15823 386 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 18 5 Op 3 1/0.000 + CDS 15816 - 16688 950 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 19 5 Op 4 . 
+ CDS 16704 - 16988 485 ## COG2088 Uncharacterized protein, involved in the regulation of septum location + Term 16998 - 17046 9.4 Predicted protein(s) >gi|224461234|gb|ACDC01000168.1| GENE 1 66 - 1412 1687 448 aa, chain + ## HITS:1 COG:FN2054 KEGG:ns NR:ns ## COG: FN2054 COG0166 # Protein_GI_number: 19705344 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Fusobacterium nucleatum # 1 448 1 448 448 788 89.0 0 MKKISLDYSKISKFVNENELNELKNKVELVSEKLYNKTGAGNDFLGWLDLPVNYDKEEFA RIKKASEKIKSDSEVLVVIGIGGSYLGARAVIECLSHSFFNSLSKEKRNAPEIYFAGQNI SGTYLKDLIEIIGDRDFSVNVISKSGTTTEPAIAFRVFKELLENKYGEAAKERIYVTTDK NKGALKKLADEKGYEEFVIPDDVGGRFSVLTAVGLLPIAVAGISIDDLMAGAQTAREDYS KDFTSNDCYKYAAIRNILYKKDYNIEILANYEPKLHYISEWWKQLYGESEGKDKKGIFPA SVDLTTDLHSMGQYIQDGRRNLMETILNVENPLKDISIKKEAEDLDGLNYLEGKGLSFVN NKAFEGTLLAHIDGGVPNLIINIPELNAFNIGYLIYFFEKACAISGYLLEVNPFDQPGVE SYKKNMFALLGKKGYEELSKELNERLKK >gi|224461234|gb|ACDC01000168.1| GENE 2 1505 - 2038 950 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738588|ref|ZP_04569069.1| ## NR: gi|237738588|ref|ZP_04569069.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 177 1 177 177 137 100.0 2e-31 MKKKLFGLLLFSLVLSSLAYAKVRDTENQEAAQNVAGTSIVKLSPEEEKEAFKALERARK RIEKEDKEREEALKLAEKQAQEEAKRIEEAQAKAQAEAEEQQKQVQQVQETTVQENGNTV TEVVTTASGLTPQEEKEAFKALERARKRIEKEDKERAEALKLAEEQAKAQATQTAQE >gi|224461234|gb|ACDC01000168.1| GENE 3 2397 - 3941 1684 514 aa, chain - ## HITS:1 COG:FN2086 KEGG:ns NR:ns ## COG: FN2086 COG1450 # Protein_GI_number: 19705376 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Fusobacterium nucleatum # 114 514 1 401 402 607 85.0 1e-173 MKKFTLILFLILNSFLFSIGLNRDVDIIDMPLHEVLAVLSKECGRNLICSKEAKDIVIDT YFNKGEDLDSVLGFLAETYGLTMKKENNTTIFMLASEKNSKKAKIIGRVTSNNMSLEGAR IELKDLNKFVYSDKSGNFIIDNLDKDVYVCKISKKGYEEKGEIIDSSKSISILNVDLKEK ADNYTNRQNEANLEDLNFYEVDGKFYYTKTFSLFNVSPDEVLKVLHETFGENIKVSSLNK VNKLVVSAERDILENAISIIEDIDKNPKQVKISSQILDISNNLFEELGFDWVYRQNVASE ERNSLTAIILGKAGLNGVGSSLNIVRQFNNKSDVLSTGLNLLESTNDLVVSSVPTLMIAS GEEGEFKVTEEVIVGVKTTRENKNDRHTEPVFKEAGLIMKVKPFIKDDDYIVLEISLELS DFKFKRNVLNIKDINSGTYNSEGGSKVGRALTTKVRVKNGDTILIGGLKKSIQQNIESKI PILGDIPIISFFFKNTTKKRENSDMYIKLKVEIE >gi|224461234|gb|ACDC01000168.1| GENE 4 3938 - 4714 664 258 aa, chain - ## HITS:1 COG:no KEGG:FN2087 NR:ns ## KEGG: FN2087 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 53 241 1 188 226 172 64.0 1e-41 MKSSIELFRGEDMFKDLKIKNLKTIILLVCYLIVFYFLIFKNILKLVEIKELIEQEDIKI GRLNYEKNTVLKALALKKEDFEKEQKKIVKNEEDETKKSFDNIPSLFRYIEDKITKNNIN FQNFGRSRREEDKLNLTMTFKGKEKDVKNFFSDIENEDYDINFSSSYLKITVDKNLLEVK SNLVATVLDKKEEVEIDTNIGDKNIFQSLNLNPKEKEDEENSYSYMRIGDKTYYRVSAKK ENNKKNKKTKTKDKGEDR >gi|224461234|gb|ACDC01000168.1| GENE 5 4683 - 5861 738 392 aa, chain - ## HITS:1 COG:no KEGG:FN2088 NR:ns ## KEGG: FN2088 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 387 1 388 389 427 69.0 1e-118 MSKKTLALSHIDNYINGGKNTILLLENKFFYIFKVQIENVLNEEDRKEKLEDRLEIVFPR 
YNSDDFVLRYEILKKDKKRENIVVYLMDINYLNDCIIDDMKDYGFISIIPSFFISREKKD LNHYFNFDISENMLVITEYMNNNILDIQSFKLSKSSLDNENFEVEDKFSIINTFLANITE DIHIIFTGDKINFEDLELENKTYSFYSVDNLDFSKYPNFLPEELRNKYSLYYIEDKYLYI LLGLSIITIILTIIIHYSLNSSERKLETLELESTKLEEEIENARNEMEEIEVESKNLQEF LVKKEDMDIKISSFLEELTYLCPEYLKISSIEYDENKIFNIEGKTDKVERITKFLENITN SKNFILSNYDYILKKANEIEFKIEVKYRAVPR >gi|224461234|gb|ACDC01000168.1| GENE 6 5854 - 6375 265 173 aa, chain - ## HITS:1 COG:no KEGG:FN2089 NR:ns ## KEGG: FN2089 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 173 3 179 179 179 59.0 4e-44 MRKYCTTIKNKAYIFLEVIIISFLFISLTLFVQILLNNSFKLYKVDYETQENFQNLDFLN EIMKVEIRYIEKNINDGNIKNAVDYIVLNEAGEKIFLIADPSKRISLGGYSLLKDEIKIN AFNFVNIHFKKKVSIKDKNYLIFATVKYEVGSSRDLESLYNGVLTRMWIKEDV >gi|224461234|gb|ACDC01000168.1| GENE 7 6344 - 6886 187 180 aa, chain - ## HITS:1 COG:no KEGG:FN2090 NR:ns ## KEGG: FN2090 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 180 7 189 189 220 69.0 2e-56 MNKNKAFSLVEIIIAISLTLIVGSICLITFYSMNKSFLVMNKTYKRDKEIASFRDLLISH IKWNEGVEIRISNLSKNQNINSLGNLFLKESEKEGNLLVLKIQAFNEIEKTTSKYYRCFL FYENKVSISYFDEGDIVNLFNGTVILENCSGKFNFNNNILKFYLKDKEKEYEEILYYDQK >gi|224461234|gb|ACDC01000168.1| GENE 8 6893 - 7306 181 137 aa, chain - ## HITS:1 COG:no KEGG:FN2091 NR:ns ## KEGG: FN2091 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 137 1 128 128 139 64.0 2e-32 MKKSRAFSLMEVIVSVFILFLVLIPSIKLNSQQLKTYSKIRAKEKELHFFNSLNNYVKSK SISNSHLEFNSYSDFLSSFSDFQTYARNIQNDEFNLLIDVEDIEVDFSDRKEKVSLINLE YRGASKIYKNKIIKFKD >gi|224461234|gb|ACDC01000168.1| GENE 9 7303 - 7737 57 144 aa, chain - ## HITS:1 COG:no KEGG:FN2092 NR:ns ## KEGG: FN2092 # Name: not_defined # Def: integral membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 139 17 155 165 157 77.0 1e-37 MYIDINKKYIPNVLNFSILILSVFIRGISEIENFFIGAACYVLPILIFYGYISDILKREV FGFGDIKLIIALGGLLYHSEINIFLQIYIFYLLVFSIATLYITFYLCIYFCKNRALKIRG VEIAFAPYICIAFFIIYNYIEGIL 
>gi|224461234|gb|ACDC01000168.1| GENE 10 7770 - 8246 767 158 aa, chain - ## HITS:1 COG:FN2093 KEGG:ns NR:ns ## COG: FN2093 COG2165 # Protein_GI_number: 19705383 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Fusobacterium nucleatum # 8 158 1 151 151 174 59.0 7e-44 MKNRGFSLIEVIVAVAIIGILSGIVGLKLRSYIATSKDTRAVASLNSFRLAAQTYQIDND KPLIEDSSKYDDDTEIKKALEKLEIYLDKNAKEIIDKNRITIGASRDSENGDLKYGGEVR FTFKDPDNAANSDGYYMWLVPVNPTKNFDSKGKEWTKY >gi|224461234|gb|ACDC01000168.1| GENE 11 8476 - 9513 825 345 aa, chain - ## HITS:1 COG:FN2094 KEGG:ns NR:ns ## COG: FN2094 COG1459 # Protein_GI_number: 19705384 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Fusobacterium nucleatum # 1 344 1 344 346 416 74.0 1e-116 MRNQKEKILFFTNELALLIKSGLTFTKAIEIILKEEKNKKFKDILKKIHKNLTMGKNIYD SFKPFENTFGSTYLYILKIGELSGNIVESLEDISKSLDFDLSQRKKLGGILIYPIVVVCL TFLIVSFLLIFILPSFITIFEENQIELPLVTRILLGISRNFHYILIFIIVILTIIFILNM YINNNKYKRIRRDKFLLNIFLFGELKKLLLASNLYHSFSILLNAGIGMVESLEIMYMNNN NYYLKDRLFEVKKAILAGNNITTSFKNLNLYNDRFSILITVGEESGYLSENFLQISKILK EDFDYKLKKLLAILEPLVILILGLIVGFVVLAIYLPILSIGDIFI >gi|224461234|gb|ACDC01000168.1| GENE 12 9510 - 10742 1004 410 aa, chain - ## HITS:1 COG:FN2095 KEGG:ns NR:ns ## COG: FN2095 COG2804 # Protein_GI_number: 19705385 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Fusobacterium nucleatum # 2 410 6 414 414 667 88.0 0 MEKIENYFKKSINSSMDNNKISLIEDIEELYARENLNSNKGIFYILLEAIKFLASDIHIE ALNNIVRIRYRINGILKEVARIDKSFLAAISSKIKILSSLDIVEKRKPQDGRFSLRYKGR EIDFRTSIMPTMNGEKIVIRILDKFNYNFTLDDLYLSEENKKVFYKAINQNNGIIIVNGP TGSGKSSTLYSILKYKNKEEVNISTVEDPIEYQIEGINQVQCKNELGLNFATILRSLLRQ DPDILMIGEIRDKETAEIAVKASLTGHLVFSTLHSNDSLGCINRLVNLGIDNYLLSLVLQ 
MIVSQRLVRKLCPHCKKEDENYKEKLKSLNLVEENYKDVKFYTSGSCKKCMNTGYIGRIP VFEIIYFDESLKNMLAQKKEIKQNFKTLLENAMDKAKEGLTSLDEIMRQL >gi|224461234|gb|ACDC01000168.1| GENE 13 10763 - 10948 96 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738599|ref|ZP_04569080.1| ## NR: gi|237738599|ref|ZP_04569080.1| predicted protein [Fusobacterium sp. 2_1_31] # 17 61 17 61 61 69 100.0 6e-11 MIKKLFLCFLFLFICLNIFSKQSKKNVVRVDIIGKNANRSYFIKFSDENNLNSFEVYDED N >gi|224461234|gb|ACDC01000168.1| GENE 14 10945 - 11436 235 163 aa, chain - ## HITS:1 COG:no KEGG:FN2097 NR:ns ## KEGG: FN2097 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 30 163 1 134 134 160 77.0 1e-38 MNIAFLKYKEFKELRDINEAKTKITEAFYLVSTTSLKQKRKQELELDLSAKKIVISDKTL QSQDIELPKNLIYYHTYTSNLKNFKFSFTKNGNISKSFSIYIFNKEKKVRYKLSFYGFDR SKFLKINSYRKKNNNEINYNNIADYHKSTNEDRESFYKDWRKE >gi|224461234|gb|ACDC01000168.1| GENE 15 11716 - 12072 409 118 aa, chain + ## HITS:1 COG:no KEGG:FN0064 NR:ns ## KEGG: FN0064 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 3 116 1 114 117 160 71.0 2e-38 MEMSKLLVKDLMNGKFELISDYIYQIENYVIKVPKFFVTDYASIPRIFRAIVLPYGKHSG ASVVHDYLYSKGCELNIERKKADKIFLEILKEEGVNPILARLMYIAVRCFGKTRYKIK >gi|224461234|gb|ACDC01000168.1| GENE 16 12423 - 15362 3125 979 aa, chain + ## HITS:1 COG:FN0019 KEGG:ns NR:ns ## COG: FN0019 COG1197 # Protein_GI_number: 19703371 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Fusobacterium nucleatum # 1 979 1 981 981 1467 86.0 0 MEKKFRGEIPFWLKNKKNSIVYVCSSNRNIDDYFFVLKDFYKGRILRIKKENENGELKKY NYDLLELLKSDEKFIILISLEYFLEDYYSKANSIFIEKGKEVDIKALEEKLIEAEFEKTY MLTQRKEYSIRGDILDIFNINQENPVRIEFFGNEVDRITYFDLHSQLSIEKLDSIELYID NNKDKKDFFSLMYTNKNKVEYYYENNDILQAKVKRLMGENSDRENDIINKITELSKIGKQ IEIQKFTEEELKQFEVIDRIKKLSENTNIVIYSEEATRYKEIFKGYDIKFEKYPLFEGYR TEDKLILTDREIKGIRVKRERVEKKALRYKTVDEIAEQDYVIHENFGVGIFLGLENIDGQ 
DYLKIKYADEDKLYVPLDGINKIEKYINISDVIPEIYKLGRKGFKRKKARLSEDIEIFAK EIIKIQAKRNLANGFKFSKDTVMQEEFEEAFPFTETPGQLKAIEDVKRDMESGKVMDRLV CGDVGYGKTEVAIRAAFKAIMDEKQVVLLVPTTVLAEQHYERFSERFKNYPINIEILSRV QTKKEQEESIKKIENGSADLIIGTHRLLSDDIKYNDIGLLIIDEEQKFGVKAKEKLKKLK GDIDILTLTATPIPRTLNLSLLGIRDLSIIDTSPEGRQKIQTEYIDNNKDLIRDIIITEV SREGQVFYIFNSVKRIEMKSKELRELLPEYIKVDYIHGQMLARDIKRAIHNFENGNTDVL IATTIIENGIDIENANTMIIEGVEKLGLSQVYQLRGRIGRSNKKSYCYMLMNENKTKNAQ KREESIREFDNLTGIDLSMEDSKIRGVGEILGEKQHGAVETFGYNLYMKMLNEEILKLKG ENEEELEDVNIELNFPRFLPDNYIEKNEKIKIYKRALALKTFEELEDLHKELEDRFGRLK SEAKGFFEFLKIRIRARELGIVSIKEDKEKRILINFNEEKINVDKIIYLLANKKIMYSKF TRTIGFDGDIFEFFDLYSN >gi|224461234|gb|ACDC01000168.1| GENE 17 15524 - 15823 386 99 aa, chain + ## HITS:1 COG:FN0020 KEGG:ns NR:ns ## COG: FN0020 COG1188 # Protein_GI_number: 19703372 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Fusobacterium nucleatum # 1 99 1 99 99 156 96.0 1e-38 MRLDKFLKVSRIIKRRPIAKLVVDGGKVKLDGKVVKAAAEVKVGQTLEIEYYNKYFKFEI LQVPLGNVSKDKTSDLVKLLDTKGLDIEINLDKDEDFFE >gi|224461234|gb|ACDC01000168.1| GENE 18 15816 - 16688 950 290 aa, chain + ## HITS:1 COG:FN0021 KEGG:ns NR:ns ## COG: FN0021 COG1947 # Protein_GI_number: 19703373 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Fusobacterium nucleatum # 1 290 5 294 294 416 79.0 1e-116 MNKYKIFPNAKINIGLNVYQKAGDGYHEIDSVMSPIDLSDEMDITFYSEIGDLKISCSDK NIPTDERNILYKAYEIFFENSKKHKEKIEISLTKNIPSEAGLGGGSSDAGFFLKLLNEHY GNVYNEKELEELAMKVGSDVPFFIKNKTARVGGKGNKVELVENNLKDSLILVKPLGFGVS TKDAYNSFDELDEVRYANFEKIVECLRNDNRKDLEKYIENGLEQGISERNADIKMFKAIL NSVVPGKKFFMSGSGSTYYTFVTEIERSQIETRLRTFVDNVKIIISKTIN >gi|224461234|gb|ACDC01000168.1| GENE 19 16704 - 16988 485 94 aa, chain + ## HITS:1 COG:FN0022 KEGG:ns NR:ns ## COG: FN0022 COG2088 # Protein_GI_number: 19703374 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Fusobacterium nucleatum # 1 88 1 88 93 140 81.0 7e-34 MIVTNVKIKKVDGDKLDRLKAYVDITLDESLVIHGLKLMQGEQGLFVAMPSRKMRNEEYK DIVHPICPDLRNYITKVVEEKYNTIEEEATVEIA Prediction of potential genes in microbial genomes Time: Fri May 20 00:02:25 2011 Seq name: gi|224461233|gb|ACDC01000169.1| Fusobacterium sp. 2_1_31 cont1.169, whole genome shotgun sequence Length of sequence - 39399 bp Number of predicted genes - 63, with homology - 58 Number of transcription units - 14, operones - 8 average op.length - 7.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 356 144 ## COG1672 Predicted ATPase (AAA+ superfamily) + Prom 441 - 500 10.6 2 2 Tu 1 . + CDS 521 - 1000 671 ## COG3212 Predicted membrane protein + Term 1043 - 1079 2.1 3 3 Op 1 . - CDS 1366 - 1842 275 ## FN0109 hypothetical protein 4 3 Op 2 . - CDS 1826 - 3091 2022 ## COG0172 Seryl-tRNA synthetase - Prom 3117 - 3176 9.7 5 4 Tu 1 . - CDS 3183 - 4007 1164 ## COG4820 Ethanolamine utilization protein, possible chaperonin - Prom 4033 - 4092 9.1 + Prom 4074 - 4133 8.2 6 5 Op 1 1/1.000 + CDS 4187 - 4456 356 ## COG1925 Phosphotransferase system, HPr-related proteins 7 5 Op 2 . + CDS 4462 - 6945 3805 ## PROTEIN SUPPORTED gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P + Term 6963 - 6997 3.2 8 6 Tu 1 . + CDS 7013 - 7759 1147 ## FN1780 hypothetical protein + TRNA 7844 - 7919 94.0 # Val TAC 0 0 + TRNA 7927 - 8003 95.0 # Asp GTC 0 0 + TRNA 8009 - 8084 87.4 # Phe GAA 0 0 + TRNA 8099 - 8172 68.6 # Cys GCA 0 0 + TRNA 8225 - 8300 93.2 # Gly TCC 0 0 + TRNA 8306 - 8381 92.5 # Lys CTT 0 0 + TRNA 8389 - 8463 66.8 # Glu TTC 0 0 + TRNA 8480 - 8555 94.0 # Val TAC 0 0 + TRNA 8563 - 8639 95.0 # Asp GTC 0 0 + TRNA 8648 - 8723 81.3 # Thr TGT 0 0 + TRNA 8748 - 8831 68.7 # Leu TAG 0 0 + TRNA 8871 - 8947 89.3 # Ala TGC 0 0 + TRNA 8953 - 9029 91.8 # Met CAT 0 0 - Term 8940 - 9007 31.8 9 7 Op 1 . 
- CDS 9194 - 10309 880 ## COG0582 Integrase - Prom 10341 - 10400 4.1 10 7 Op 2 . - CDS 10442 - 11155 558 ## GALLO_0427 hypothetical protein 11 7 Op 3 . - CDS 11169 - 11555 443 ## COG1396 Predicted transcriptional regulators - Prom 11757 - 11816 10.3 + Prom 11713 - 11772 11.4 12 8 Tu 1 . + CDS 11807 - 11992 288 ## BL00668 hypothetical protein + Prom 12129 - 12188 7.0 13 9 Op 1 . + CDS 12213 - 12509 463 ## gi|237738618|ref|ZP_04569099.1| predicted protein 14 9 Op 2 . + CDS 12523 - 12783 364 ## gi|237738619|ref|ZP_04569100.1| predicted protein 15 9 Op 3 . + CDS 12793 - 12912 166 ## 16 9 Op 4 . + CDS 12905 - 13090 263 ## gi|237738620|ref|ZP_04569101.1| predicted protein 17 9 Op 5 . + CDS 13078 - 13365 278 ## gi|237738621|ref|ZP_04569102.1| predicted protein 18 9 Op 6 . + CDS 13381 - 13566 340 ## gi|237738622|ref|ZP_04569103.1| predicted protein 19 9 Op 7 . + CDS 13563 - 13670 62 ## 20 9 Op 8 . + CDS 13727 - 14467 859 ## gi|237738623|ref|ZP_04569104.1| predicted protein 21 9 Op 9 . + CDS 14480 - 14608 222 ## gi|237738624|ref|ZP_04569105.1| predicted protein 22 9 Op 10 . + CDS 14623 - 14727 63 ## 23 9 Op 11 . + CDS 14740 - 15195 725 ## gi|237738625|ref|ZP_04569106.1| predicted protein 24 9 Op 12 . + CDS 15195 - 16361 1578 ## Sterm_1409 hypothetical protein 25 9 Op 13 . + CDS 16384 - 16962 1029 ## Sterm_1410 hypothetical protein + Term 16978 - 17007 -0.3 26 9 Op 14 . + CDS 17021 - 18991 2469 ## LGAS_1473 DNA polymerase I 27 9 Op 15 . + CDS 19001 - 19642 759 ## gi|237738629|ref|ZP_04569110.1| predicted protein 28 9 Op 16 . + CDS 19656 - 22112 2793 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 29 10 Tu 1 . - CDS 22143 - 22232 63 ## - Prom 22445 - 22504 10.3 + Prom 22114 - 22173 6.7 30 11 Op 1 . + CDS 22202 - 22423 187 ## 31 11 Op 2 . + CDS 22410 - 22718 316 ## SH1780 hypothetical protein 32 11 Op 3 . + CDS 22699 - 24096 1610 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 33 11 Op 4 . 
+ CDS 24093 - 24296 317 ## gi|237738633|ref|ZP_04569114.1| predicted protein 34 11 Op 5 . + CDS 24289 - 24498 173 ## gi|237738634|ref|ZP_04569115.1| predicted protein 35 11 Op 6 . + CDS 24498 - 24773 430 ## gi|237738635|ref|ZP_04569116.1| predicted protein 36 11 Op 7 . + CDS 24760 - 24972 303 ## gi|237738636|ref|ZP_04569117.1| predicted protein 37 11 Op 8 . + CDS 24959 - 25153 238 ## gi|237738637|ref|ZP_04569118.1| predicted protein 38 11 Op 9 . + CDS 25140 - 25508 469 ## CCV52592_0062 hypothetical protein 39 11 Op 10 . + CDS 25508 - 25780 276 ## gi|237738639|ref|ZP_04569120.1| predicted protein 40 11 Op 11 . + CDS 25777 - 26061 210 ## gi|237738640|ref|ZP_04569121.1| predicted protein 41 11 Op 12 . + CDS 26073 - 26282 455 ## gi|237738641|ref|ZP_04569122.1| predicted protein 42 11 Op 13 . + CDS 26285 - 26509 343 ## gi|237738642|ref|ZP_04569123.1| predicted protein 43 11 Op 14 . + CDS 26526 - 27512 1069 ## APP7_0480 type I restriction enzyme EcoR124II M protein (EC:2.1.1.72) 44 11 Op 15 . + CDS 27440 - 28117 842 ## gi|237738644|ref|ZP_04569125.1| predicted protein 45 11 Op 16 . + CDS 28132 - 28671 866 ## gi|237738645|ref|ZP_04569126.1| predicted protein 46 11 Op 17 . + CDS 28668 - 29030 543 ## gi|237738646|ref|ZP_04569127.1| predicted protein 47 11 Op 18 . + CDS 29040 - 29225 205 ## gi|237738647|ref|ZP_04569128.1| predicted protein 48 11 Op 19 . + CDS 29242 - 29682 516 ## gi|237738648|ref|ZP_04569129.1| conserved hypothetical protein + Prom 29972 - 30031 8.7 49 12 Op 1 . + CDS 30161 - 30739 517 ## CLH_1731 hypothetical protein 50 12 Op 2 . + CDS 30681 - 32348 1409 ## COG5525 Bacteriophage tail assembly protein + Prom 32422 - 32481 4.5 51 13 Op 1 . + CDS 32511 - 32735 272 ## gi|237738651|ref|ZP_04569132.1| predicted protein 52 13 Op 2 4/0.000 + CDS 32774 - 34333 1813 ## COG5511 Bacteriophage capsid protein 53 13 Op 3 . + CDS 34290 - 35375 1704 ## COG0740 Protease subunit of ATP-dependent Clp proteases 54 13 Op 4 . 
+ CDS 35372 - 35707 550 ## gi|237738654|ref|ZP_04569135.1| predicted protein 55 13 Op 5 . + CDS 35707 - 36720 1500 ## CLH_1724 hypothetical protein 56 13 Op 6 . + CDS 36732 - 36965 444 ## gi|237738656|ref|ZP_04569137.1| predicted protein 57 13 Op 7 . + CDS 36968 - 37300 531 ## gi|237738657|ref|ZP_04569138.1| predicted protein 58 13 Op 8 . + CDS 37297 - 37866 752 ## Dred_1209 hypothetical protein 59 13 Op 9 . + CDS 37863 - 38348 523 ## gi|237738659|ref|ZP_04569140.1| predicted protein 60 13 Op 10 . + CDS 38350 - 38556 192 ## gi|237738660|ref|ZP_04569141.1| predicted protein 61 13 Op 11 . + CDS 38624 - 38797 272 ## gi|237738661|ref|ZP_04569142.1| predicted protein 62 14 Op 1 . + CDS 38920 - 39123 420 ## gi|237738662|ref|ZP_04569143.1| predicted protein 63 14 Op 2 . + CDS 39113 - 39398 400 ## gi|237738663|ref|ZP_04569144.1| hypothetical protein FSAG_02203 Predicted protein(s) >gi|224461233|gb|ACDC01000169.1| GENE 1 3 - 356 144 117 aa, chain + ## HITS:1 COG:FN0123 KEGG:ns NR:ns ## COG: FN0123 COG1672 # Protein_GI_number: 19703471 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 40 117 377 454 454 105 69.0 2e-23 RLALNFLIFLLSTDYILSHSFICLGLPTSTNFGVLPSRGIVVEPLGENNKIVFGECKYSK KQVGLSILKQLQEKAKNIKWNNNNREEYFILFSKSGFSEELEELAQKEKNIILKKLI >gi|224461233|gb|ACDC01000169.1| GENE 2 521 - 1000 671 159 aa, chain + ## HITS:1 COG:FN2085 KEGG:ns NR:ns ## COG: FN2085 COG3212 # Protein_GI_number: 19705375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 159 1 161 161 192 71.0 2e-49 MKRLLLVGAIIIGSLGFSTNALATLSQEQIKTIVKKEVPNGQLTKFELDRENGRKVYEVE VMDGNVEKEFKIDAETGEVIKFKTEKKVVKRVKKEPKISYDRAKEIALKQSKNGKFKEIE LKHKNGVLVYDVEVAEGFMDREFLIDAMTGEILRDKKDF >gi|224461233|gb|ACDC01000169.1| GENE 3 1366 - 1842 275 158 aa, chain - ## HITS:1 COG:no KEGG:FN0109 NR:ns ## KEGG: FN0109 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: 
not_defined # 1 158 1 158 158 202 91.0 4e-51 MLLKSSLFILLLVNIFTSNLLILSGILLVVLILNLCLNKNLKKHSRQLKVLLFFYLSTFL VQLYYGQQGKVLFKFYNFYLTQEGLMNFGVSFIRILNLVLMSWLINEMKLLTGRFSKYQK IIDTVIDLVPVVFVLFKKKMKAKNFTRYILKDINKRYE >gi|224461233|gb|ACDC01000169.1| GENE 4 1826 - 3091 2022 421 aa, chain - ## HITS:1 COG:FN0110 KEGG:ns NR:ns ## COG: FN0110 COG0172 # Protein_GI_number: 19703458 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 421 4 424 424 789 92.0 0 MLELKFMRENVEMLKEMLKNRNSNIDMDAFVALDAKRREVLSEVETLKRDRNNVSAEIAN LKKEKKDASHLIEKMGGVSSKIKELDAELVEIDEEIKNIQMTIPNVYHPSTPIGPDEDSN KEIRRWGEPRKFDFEPKAHWDIGEDLGILDFERGAKLSGSRFVLYRGAAARLERALISFM LDTHTLEHGYTEHITPFIVKAEVCEGTGQLPKFEEDMYKTTDDMYLISTSEITMTNIHRK EILEQSELPKYYTAYSPCFRREAGSYGRDVKGLIRLHQFNKVEMVKITDAESSYDELEKM VNNAETILQRLELPYRVIQLCSGDLGFSAAKTYDLEVWLPSQNKYREISSCSNCEAFQAR RMGLKYKVTNGSEFCHTLNGSGLAVGRTLVAIMENYQQEDGSFLVPKVLIPYMGGIDVIK K >gi|224461233|gb|ACDC01000169.1| GENE 5 3183 - 4007 1164 274 aa, chain - ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 1 273 1 273 274 429 86.0 1e-120 MNLDKVNKYIKEFEKTITKPRTNFDKSKFFVGVDLGTANIMITILDKDGKPVAGATQRSR VVKDGIVVDFIGAISIVRKLKEELEEKLGIEITEGYTAIPPGVEQGSVKAIVNVVESAGI DVKKVVDEPTAASYVLGITDGVVVDLGGGTTGISILKDGKVVFVADEPTGGTHMTLVIAG SYGVDFETAEDIKTDKKREKEVCIQITPVLQKMASIVKKYISAYDVKDVYLVGGACSFED SEKIFAKELGLNIHKPYMPLYITPIGIALAGLKD >gi|224461233|gb|ACDC01000169.1| GENE 6 4187 - 4456 356 89 aa, chain + ## HITS:1 COG:FN1782 KEGG:ns NR:ns ## COG: FN1782 COG1925 # Protein_GI_number: 19705087 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Fusobacterium nucleatum # 1 89 1 89 89 145 97.0 1e-35 MKSVKVHIKNKKGLHARPSSLFVQLVTKYDSDITVKSEDETVNGKSIMGLMLLAAEEGRE 
LELIADGPDEDAMLAELVDLIEVKRFNEE >gi|224461233|gb|ACDC01000169.1| GENE 7 4462 - 6945 3805 827 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 827 1 827 827 1470 88 0.0 MEIIRAKHMGFCFGVLEAINVCNSLVEEKGRKYILGMLVHNKQVVEDMERKGFKLVKEEE LLEDIDDLKENDIVVVRAHGTSKNVHEKLKERKVKVYDATCVFVNKIRQEIEIANEKGYS ILFMGDKNHPEVKGVISFADNIQIFESLEEAIEVKIDSDKTYLLSTQTTLNKKKFEEVKK YFKENYQNVIIFDKICGATAVRQKAVEDLAVKANIVIIVGDTKSSNTKKLYEISKKLNSE SYLVENEEQLDLTIFRGKEVIGITAGASTPEETIMNIEKKIRGTYKMPNVNENQNEFLEM LEGFLPNQEKRVEGTIESMDQNYSYLDVPGERTAVRVRTEELKGYKVGDTVEVLITGVSE EDDDQEYIIASRKKIEVEKNWEKIEDSFKNKTVLEGEVTKKIKGGYLVQALFHAGFLPNS LSEIPENEEKVAGKKVQVIVKDIKVDPKDKRNKKITYSVKDIKLAEQAKEFAGLEVGQTV DCVVTEVLEFGLAVDINALKGFIHISEVSWKRLDKLADAYKVGDKIKAVVVSLDEAKKNV KLSIKRLEADPWATVADEFKVGDEVDGVVTKVLPYGAFVEIKPGVEGLVHISDFSWTKKK VNVAEYVKEGEKVKVKITDLHPEDRKLKLGIKQLVANPWDSAEKDYAVDTVIKGKVVEVK PFGIFVELTDGIDAFVHSSDYNWIGEETPKFEIGNEVELKITELDLNDRKIKGSLKALRK SPWEHAMEEYKVGTTVEKKIKTVADFGLFVELTKGIDGFIPTQYASKEFIKNIRDKFNEG DVVKARVVEVNKDTQKIKLSIKEIEREEAKREEREQIEKYSVSSSEE >gi|224461233|gb|ACDC01000169.1| GENE 8 7013 - 7759 1147 248 aa, chain + ## HITS:1 COG:no KEGG:FN1780 NR:ns ## KEGG: FN1780 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 248 3 247 247 416 84.0 1e-115 MSSAFTGFVLLNEAKFDKEKFLKDLKEDWKITLDLGGEDENKEKDMLVGNIGDIMVAVAL MPAPIPNNEAVENAKTNYRWPDAVKVAEEHKAHILVSLLGEPDLIEGAKLYTKIVSALTQ QENCTGINVLGTVLNPDMYRDFTKYYEEKDMFPVENMIFIGLYAVEDNKVSAYTYGMEAF GKKEMEIIASSQNPEDIYYFLQGVADYVITSDVILQDGETIGFSAEQKIPITHSKAIAVD GVSVKLGF >gi|224461233|gb|ACDC01000169.1| GENE 9 9194 - 10309 880 371 aa, chain - ## HITS:1 COG:L55605 KEGG:ns NR:ns ## COG: L55605 COG0582 # Protein_GI_number: 15673415 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 17 361 9 346 359 102 25.0 1e-21 MAGRKANGEGTISTVIRNGKTYYKANITVGWDSNGKQIRKSFGSYKKSVVLDKMNTAKYQ 
AKTNSLSNSDISFGELFKDWIFNFKKIEVSPNTFYEYEASYRLRLMNYSIARKKANQITL KDLQQYFNELQKDFTANTIKKTYIQIHSCIKFAIIQGIMMKDFCPGVTLQKITKKENINV FSKQEQEMVLRTLDKRDIVDCLIYFTFYTGLRLGEVLGLQWSDIKDNMVKITRQYRRNVD VDKVDDRKLTYTFKELKTKNSAREIPLPDKVQELLKDIPRQGNLIFSNLGKPIEPKKPQR RIASICKKLNIPHRSFHSIRHSYATRLFEMDIPIKTVQVLLGHGDIATTMDIYTHVMKEK KLEVLDKLNNL >gi|224461233|gb|ACDC01000169.1| GENE 10 10442 - 11155 558 237 aa, chain - ## HITS:1 COG:no KEGG:GALLO_0427 NR:ns ## KEGG: GALLO_0427 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus # Pathway: not_defined # 7 153 8 151 366 66 28.0 1e-09 MVRIDNKVLYDKSQAQAYEVLINYSDGALPVDPFKIIKKLDNVQIYSYKECMEKLKKIES YEGMTEKQMLDAFPSNEGFTTLIGDTYNIFFNEKKPPARIRWTIFHELGHFFLKHFDECE CVKNFFTDPQEYEDTLEKEANCFARHCSSPLPIALYLGFHKRKPLSIELFKSCFDMSDET SKICLKHLKKFTEYYSCNRHEKLISHFKEKIRESDKHLHAKLCVNRWNNFIELIKKI >gi|224461233|gb|ACDC01000169.1| GENE 11 11169 - 11555 443 128 aa, chain - ## HITS:1 COG:BH0345 KEGG:ns NR:ns ## COG: BH0345 COG1396 # Protein_GI_number: 15612908 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 95 1 96 116 57 40.0 9e-09 MAEIKDRIISLRNEKNLTQSQLAEELNISPSAIGMYEQGRRKPSYELLEELCDYFNVDMD YLTGRSNIKNRFQEDLKNKREKNYEADLDKSDDDFKMVARDYNKLSEDKKKLFKSMMKNF MKSLNEEE >gi|224461233|gb|ACDC01000169.1| GENE 12 11807 - 11992 288 61 aa, chain + ## HITS:1 COG:no KEGG:BL00668 NR:ns ## KEGG: BL00668 # Name: not_defined # Def: hypothetical protein # Organism: B.licheniformis # Pathway: not_defined # 3 61 6 64 78 62 55.0 3e-09 MTIGEKLKQLRGDKKTKDVAKDLNVTVSALSNYENDYRVPRDEVKKKIADYYKKSVEEIF F >gi|224461233|gb|ACDC01000169.1| GENE 13 12213 - 12509 463 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738618|ref|ZP_04569099.1| ## NR: gi|237738618|ref|ZP_04569099.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 98 1 98 98 169 100.0 5e-41 MRIHKKIKVDLEKIKDVLKHDDLIERALALIEDYNTYESTDIEFTFSWNLNKCREWGYDY GGDVEYFDKFFVETLGMKESVKRAWDNKTAQKYYFYED >gi|224461233|gb|ACDC01000169.1| GENE 14 12523 - 12783 364 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738619|ref|ZP_04569100.1| ## NR: gi|237738619|ref|ZP_04569100.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 86 9 94 94 166 100.0 3e-40 MKKIKNIFGIFKHKASRPIVFKELFGINQLSACDRDGSWDSYDFVGTIDEVNDYEKRWCT QGSNGFGFLGIEAVKGFKGQFSYCGK >gi|224461233|gb|ACDC01000169.1| GENE 15 12793 - 12912 166 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSNKILEKYWGKNELKGLSLKRALAIIQILELWEGEND >gi|224461233|gb|ACDC01000169.1| GENE 16 12905 - 13090 263 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738620|ref|ZP_04569101.1| ## NR: gi|237738620|ref|ZP_04569101.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 61 1 61 61 95 100.0 9e-19 MTRSEIAARELLKESKKATLLDVIKYKTVWLLKVIFGAYMKYVELYDFDGLIWEEDAEWE L >gi|224461233|gb|ACDC01000169.1| GENE 17 13078 - 13365 278 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738621|ref|ZP_04569102.1| ## NR: gi|237738621|ref|ZP_04569102.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 95 7 101 101 186 100.0 3e-46 MGVVKGSFIARESLFKNHKYVVALNTTSEKYQAYVVLNPEDDINQVSMTPEMSYFIDGWS YGIISFKTDDARCNNSTYYENKCQDIITQLTAKIN >gi|224461233|gb|ACDC01000169.1| GENE 18 13381 - 13566 340 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738622|ref|ZP_04569103.1| ## NR: gi|237738622|ref|ZP_04569103.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 61 1 61 61 104 100.0 2e-21 MSEKMMLTMPETAKLTGIGLQKLKQIAREYSDFPYIKIGVKHLVIKEKLPDWFEKHKGEE L >gi|224461233|gb|ACDC01000169.1| GENE 19 13563 - 13670 62 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLAIVLASILVIYKRKTSVVSDQTNTDVSRKNI >gi|224461233|gb|ACDC01000169.1| GENE 20 13727 - 14467 859 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738623|ref|ZP_04569104.1| ## NR: gi|237738623|ref|ZP_04569104.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 246 1 246 246 469 100.0 1e-131 MNINDYNSKNTGKQVLVLGEDDIKVLNHFASIAKSGELKGLIVAGKYVGFTDTYRLASIK DIHEDLPGTNTGNALMYDVLDVLKKAKSLAVLKDGKLAIQVGVEVTEYEPMKDVKVPNIA TVREGLDYETYTEAFPSVNFAENVVWKMLKTPAGQERYKKYFKFESGKVIVEAYPNENSK LFLEILELKKDRTSLVTNLDCKYLDLWFKWTKNSKFDLAIGKNSNCAVKFSKDKVDYIVM PLSMME >gi|224461233|gb|ACDC01000169.1| GENE 21 14480 - 14608 222 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738624|ref|ZP_04569105.1| ## NR: gi|237738624|ref|ZP_04569105.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 42 1 42 42 70 100.0 3e-11 MFLIDGNYFELVLEDGDIAVLSNVVTGESLTMNIKELWNYAI >gi|224461233|gb|ACDC01000169.1| GENE 22 14623 - 14727 63 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRINKKSVATTTSATDKYTHFNDKIPQINKKCK >gi|224461233|gb|ACDC01000169.1| GENE 23 14740 - 15195 725 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738625|ref|ZP_04569106.1| ## NR: gi|237738625|ref|ZP_04569106.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 151 1 151 151 248 100.0 9e-65 MVKVEFTGSVEEVSKEILDFVRGNYINLAENIALPKSDTEKAIKRAIDNASAKKEAVKNV EEAPAQKLPIAPAKKEEAPVAVATPLPTKTAEYTADDLQRIAAAWIAKDIENNRKTMKDL LGKFGVKAITVLPQESYGAFVQELKNLGVDI >gi|224461233|gb|ACDC01000169.1| GENE 24 15195 - 16361 1578 388 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1409 NR:ns ## KEGG: Sterm_1409 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 388 1 389 389 351 49.0 3e-95 MAHALLGPSSASRWMACPPSVRLCEQFEDVESEYAKEGSLAHEIAELKVRKLIDPGLTSR KFTSAMKKLKEKELYQEEMQGYTDEYVEFIQEQMYSYPTTPHIAVEQKVDFSQYVPGGFG TADCILISNDTLHVIDFKYGKGVPVSVENNAQLLLYALGAYLAYEMIFPIEHIKMSIVQP RLTGIDTWECSLDYLLTFAKKAQEKAVMALNGEGDFECGEHCKFCKAKSICKERANVNLE LAKYEFKAADQLSLEEIGEILKKAQDLAEWAEDLNEYALAESLKGNNVPGWKAVNGRGSR SFKNTDEAIKVLKENGIAEELLYERKYLTLAQIEKVIGKKDFNNLVGDLIVMNVGKPTLV EASDKREAITNKIKAEDEFSAVDDINNL >gi|224461233|gb|ACDC01000169.1| GENE 25 16384 - 16962 1029 192 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1410 NR:ns ## KEGG: Sterm_1410 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # 
Pathway: not_defined # 5 192 6 195 195 191 51.0 2e-47 MANDTRVMTGKVRLSYVHLFKPYAAEKGQEEKYSCTILVPKTDVQTKMKLDAAINAAIEK GISSVWNGVKPPKPTIPIYDGDGVRPSDGQEFGPECKGHWVFTASAKIDYQPGIVDVRAQ PILNQSEIYSGIYARVSVNFFPYAVSGKKGIGCGLGNVQKLMDGEPLSAVGIKAENEFDE VEIDPVTGEPIL >gi|224461233|gb|ACDC01000169.1| GENE 26 17021 - 18991 2469 656 aa, chain + ## HITS:1 COG:no KEGG:LGAS_1473 NR:ns ## KEGG: LGAS_1473 # Name: not_defined # Def: DNA polymerase I # Organism: L.gasseri # Pathway: not_defined # 1 656 1 644 644 677 50.0 0 MRTLNIDIETFSSVDIGKSGAYKYAMSDDFQILLFAYSVDGQDVKIIDLAQGEAIPGEVL DLLKDETCIKYAYNAVFEWWCLNMAGIETPLEQWHCTMVHGLYCGYTAGLAAIGNAMGLP QDKKKLTTGSALIRYFCIPCNPTKSNGNRTRNLPHHAPEKWELFKEYCIQDVVTEMEIGR RLSAFPVPEREWKLWVLDTFMNAYGVRVDSELVNGALYIDALSRANLLEEARDITKLDNP NSTSQLLNWLEEAGEEVENLQKATVGKMIDTLDDGKAKRVLEIRQELSKTSVKKYKAMDE AMCKDERVRGLLQFYGANRTGRYAGRLVQVQNLPRNYIETLDVARDVIKKGDGELLEMLY GNIPDTLSQLIRTAFIPSEGNHFVVSDFSAIEARVIAWLAGEEWRMEVFKTHGKIYEASA SQMFGVPINTIAKGEENYHLRAKGKVAELALGYQGSVGALTAMGAADMGLTDEEMKDIVD RWRKSSKRIVELWYALENAAVEVLETGEPQIVKCVKLAKEYDFIYGQDFFTIELPSGRKL FYPKPFLKENQFGQMQMHYMGINQTTKKWEVIPTYGGKLTENIVQAIARDCLAETLLRVK AKGWPIVFHVHDEIILDVPKSVELEEVIKTMTEEISWAKGLILNAAGFTGSYYMKD >gi|224461233|gb|ACDC01000169.1| GENE 27 19001 - 19642 759 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738629|ref|ZP_04569110.1| ## NR: gi|237738629|ref|ZP_04569110.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 213 1 213 213 414 100.0 1e-114 MLHIGRKIKKFRIENNLSQKEFAEKIGVTQGFLSYVENGRLNIESPSLEKKILIAIGEAP DEDLRKDFEKNVELASDNVHSPKHYMIPGCNFECKDLSDAIVRNMPNPLGTRIWNVVKYL VRAEKKNGLEDYNKAVEYLSWIEKGNEADEYDNENTLENIADKLKTDWTTIIMGICEGYT AKKAILMNETFRNLIALNIPGAINCISKIIELG >gi|224461233|gb|ACDC01000169.1| GENE 28 19656 - 22112 2793 818 aa, chain + ## HITS:1 COG:XF2121 KEGG:ns NR:ns ## COG: XF2121 COG5545 # Protein_GI_number: 15838712 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Xylella fastidiosa 9a5c # 377 692 30 347 501 156 31.0 2e-37 MENSRKLIISEANNRLSKQWVTTEITWSEFVERLSKPKITAETLDEFLSYSKAKQDDIKD VGGFVGGKLKGNLRRSEAVESRSLITLDLDNLAYEDDTKIIKTLNSLGCAYAVYSTRKHQ TTKPRIRVILPLAEDVSADEYEPIARKVAESIGLRYCDPTTFQAVRLMYWPSHSTDSDYV FTYADKPMLDGKAVLNMYADWRDVSTWPEVPDAQKHHLTLLKQQENPLEKEGMVGAFCRR FNIYQAIDEFLPGVYEPCDISDRLTFVGGSTTAGAIVYQDGLFLYSHHATDPCSQKLVNA FDLVRLHKFGHLDIQADIKTPVAKLPSWLAMKEWVFAKTPVNSDLLKERRQKAISEFSVS NNPDVDAVDGVLVEEDDSWTAELVYNAKDSSKVLNTLANIMLILRNDRELKFKIFKDIFS SRILVRKDVPWDRKFEADDRLWTDTDDAGLRWYLESTYGITSTNKIIDGVNLIAEENAEN KVASRIQATLWDGEKRLETLFIDYLGCEDNVYTREVSEKSLVAAAKRAIYGGIKWDNMPI LIGPQGVGKSTFLKILGMEWYNDSLVNVEGKDACELIQGSWILEMGELSSLRKSEMNLVK NFLSRTDDVFRASYGRRAQKYPRRCAFFGTANDTNFLRDETGNRRFWPIDCFILNPKKSI FDDLKDELDQIWAEACELAKDKSYNLVLSKEALEIAIKEQDSHSEDNVYKGIILDYLDKK IPKNAWDSMDLFARRTYLNEYESTIPQYDENDLILRDKVCAAEIWEEALKMDIRYLKKSD SVEINKILSTLFQWEKVKQASRFGKYGVQRGYKRKIES >gi|224461233|gb|ACDC01000169.1| GENE 29 22143 - 22232 63 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFFYVTFYVTFYVTIKKNVTSLSYRMLHF >gi|224461233|gb|ACDC01000169.1| GENE 30 22202 - 22423 187 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLHRMLHKKTLVLLVLLYICNIVTFFSILIYKNKEIKGYLRSIKSINPIFIYIYKEKRAR MLHLRLEKIHEKK >gi|224461233|gb|ACDC01000169.1| GENE 31 22410 - 22718 316 102 aa, chain + ## HITS:1 COG:no KEGG:SH1780 NR:ns ## KEGG: SH1780 # Name: not_defined # Def: hypothetical protein # Organism: S.haemolyticus # Pathway: not_defined # 5 94 3 90 92 80 
45.0 1e-14 MKKSEREIEAYLVKSIKNKNGLCMKWTSPGNAGVPDRIVIVPGGDVYFVELKAEGKREDL SPLQRNFINKLKNLNCDARVIASFKEVDEFIEEVMPNEVYTT >gi|224461233|gb|ACDC01000169.1| GENE 32 22699 - 24096 1610 465 aa, chain + ## HITS:1 COG:XF0680 KEGG:ns NR:ns ## COG: XF0680 COG0553 # Protein_GI_number: 15837282 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Xylella fastidiosa 9a5c # 8 448 6 456 472 314 39.0 2e-85 MKFIPHEYQKYCIDRMIADDKLGLMLDMGLGKTIITLSAIADLKFNRFEVGKVLIIAPKK VAEATWTDEIAKWDHLSLLKTSLVLGGLQKRIKALAKTADIYVINRENVTWLVDYYKNAW PFDMVVLDEWSSFKNHQSKRFKSLKVIRNKITRIVGLTGTPAPNGLIDLWAQLYLLDQGE RLEKTIGKFRERYFEPGQRNRTVIFNYDAKEGSNEAIHEKISDICISMKAEDYLELPDII YEQVPVVLDTKAKKAYDELEKKAILELEDTEITVANAAALSNKLLQLANGAIYDENRKVF EVHDCKIERFLELIEQLNGKPALVFYNFQHDKDRIIEALKDSKLRIRLLKTPQDQLDWNK GEIDILLAHPASAAYGLNLQAGGNHVIWFGLNWSLELYQQANKRLHRQGQTEKVIIHHLV CKETRDEDVMEALQNKGDVQEALVESLKVRIMKVKEAEKKNKEQI >gi|224461233|gb|ACDC01000169.1| GENE 33 24093 - 24296 317 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738633|ref|ZP_04569114.1| ## NR: gi|237738633|ref|ZP_04569114.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 67 1 67 67 118 100.0 1e-25 MRTFGCIYFYVSGGSIEKTQDYGNEKDDKNYKLGNYFLDSTEARQVLDSKEYREFWERVR TGEIGND >gi|224461233|gb|ACDC01000169.1| GENE 34 24289 - 24498 173 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738634|ref|ZP_04569115.1| ## NR: gi|237738634|ref|ZP_04569115.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 69 1 69 69 115 100.0 1e-24 MIKLIKNSELKKTITYQFYGIRCNCCNSTNNVNVLEIRAENSSGGTIIDICDKCLIELKE QIEKLGGDE >gi|224461233|gb|ACDC01000169.1| GENE 35 24498 - 24773 430 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738635|ref|ZP_04569116.1| ## NR: gi|237738635|ref|ZP_04569116.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 91 1 91 91 155 100.0 7e-37 MTQEIIKIVGIEVQMPYHNEVYIVGEKPEGHGSMIVKNAGIVKEIRLADDDDSIQERDVI YIKMEKNGIILELSTSQPGLRIIWSDENVDM >gi|224461233|gb|ACDC01000169.1| GENE 36 24760 - 24972 303 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738636|ref|ZP_04569117.1| ## NR: gi|237738636|ref|ZP_04569117.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 70 4 73 73 119 100.0 9e-26 MWICKKCGEKIQGHYIGYVDIDKKGCAIDGTQEEEELIRYTCACCRIIKFGELKKLEKVA DWVEDEDVEM >gi|224461233|gb|ACDC01000169.1| GENE 37 24959 - 25153 238 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738637|ref|ZP_04569118.1| ## NR: gi|237738637|ref|ZP_04569118.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 64 1 64 64 108 100.0 1e-22 MWRCKFCGCTKFEIERKIIDRDFDSKKNTLNINDIKRSVMCCNCYNWGKYIEEIAYWEDE YERD >gi|224461233|gb|ACDC01000169.1| GENE 38 25140 - 25508 469 122 aa, chain + ## HITS:1 COG:no KEGG:CCV52592_0062 NR:ns ## KEGG: CCV52592_0062 # Name: not_defined # Def: hypothetical protein # Organism: C.curvus # Pathway: not_defined # 1 120 1 130 131 84 45.0 1e-15 MREIKFRAWDKINKDMFNVESINFQERRVYKDTVSYRKFEDIDLMQYTGLKDKNNKEIYE GDILFESFGERYYKVVFKNGNFRTEFEGYFDEYSFDLIDVVLDLCEINGNIYENSELMEE VR >gi|224461233|gb|ACDC01000169.1| GENE 39 25508 - 25780 276 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738639|ref|ZP_04569120.1| ## NR: gi|237738639|ref|ZP_04569120.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 90 1 90 90 145 100.0 8e-34 MNDLANKELRKLYHQVLKGLYRAKTIRENTDNNDIYSEFLLYDEDGNLIEETNVTSFESR EIIRLLINSYENQLLKVGGKIRKPNKEVRQ >gi|224461233|gb|ACDC01000169.1| GENE 40 25777 - 26061 210 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738640|ref|ZP_04569121.1| ## NR: gi|237738640|ref|ZP_04569121.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 94 1 94 94 151 100.0 1e-35 MNIDLNKLKNYKSIAYANEAAQLGKVKEEYKELLAEVRETSTFSYIKNRDNFVAEALDLI TATVNLLLLCGLTEQDFEKHIEKLESYKNGKYKR >gi|224461233|gb|ACDC01000169.1| GENE 41 26073 - 26282 455 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738641|ref|ZP_04569122.1| ## NR: gi|237738641|ref|ZP_04569122.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 69 1 69 69 82 100.0 7e-15 MIDEAELFEKIESKQFEIDYDNSITKSIQEYYKAKGQIEALEWVKRLIAVESDDDFIIDD TIELGKEWD >gi|224461233|gb|ACDC01000169.1| GENE 42 26285 - 26509 343 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738642|ref|ZP_04569123.1| ## NR: gi|237738642|ref|ZP_04569123.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 74 5 78 78 105 100.0 8e-22 MLHRYQIDLRVKEENTEKTIKKSIFRKKELTDAELEEAQLEFIRSTKAIYKEKGIDLEVL EWGIQKFELVRKNS >gi|224461233|gb|ACDC01000169.1| GENE 43 26526 - 27512 1069 328 aa, chain + ## HITS:1 COG:no KEGG:APP7_0480 NR:ns ## KEGG: APP7_0480 # Name: not_defined # Def: type I restriction enzyme EcoR124II M protein (EC:2.1.1.72) # Organism: A.pleuropneumoniae_AP76 # Pathway: not_defined # 2 250 6 255 318 261 55.0 2e-68 MSFKEHNNREVSKKLAEYITGTELRKYVAKKVKQYINLENPTVFDGAVGSGQLEQFVNPS ILYGVDVQESSINSARENFKNTELEVKSFFEYERENFEVDCVIMNPPFSLKFKDLTEQEQ KNIQKQFTWKKSGVVDDIFVLKSLEYTKRYAFYILFPGVGYRKTEEKFRELIGNRLAELN VISNAFTDTSIDVLFLVVDKNKITETVYRELYDCKLEKIIVSDTWKVDEDYRWEQIREEK EVEEVDINALNTKACELWIKGVERNLELDLFLIKECDANIDFMGNIRRLKAIVEKYENKF RSKKRCKNEMTLLEKQSKLLTLFSGAQR >gi|224461233|gb|ACDC01000169.1| GENE 44 27440 - 28117 842 225 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738644|ref|ZP_04569125.1| ## NR: gi|237738644|ref|ZP_04569125.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 225 1 225 225 393 100.0 1e-108 MQKRDDFIRETIKIANFVFGSTTVVISDIFNIKYMSKKDIFTKRDIVENGEPAIFYGDIS RKYDCFVDEEITKINSEAYNRADKINKGQILVNLEDFDYEDIGRCIFYENDIPAAINGNV AILTLKEKFEDAVNLKYITFYLNYKDIVRQYVYDKAVGEKVKRLSRLYFEHIPITIPLIE RQDKIIDNFIKVRKKFKNDFELLEKAIDLANKYTSFGVDGLLKLK >gi|224461233|gb|ACDC01000169.1| GENE 45 28132 - 28671 866 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738645|ref|ZP_04569126.1| ## NR: gi|237738645|ref|ZP_04569126.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 179 1 179 179 365 100.0 1e-100 MKKMLMVLCLIMLFAGCEEFGTDKDIQSTARLGNKLAENQPTPNDIDYSLERYNLIRRTY WVNGQREKAVNLPCPVVKPFGYIVLFTENGGIVGSFTVDGKVSSLNSFLTPDSEYYSRGE YTNDWLPDVDGSYGENDNMGIFFFTNDGKYIEWTGTYLYSDIPMKVENPIVKYEIGGNK >gi|224461233|gb|ACDC01000169.1| GENE 46 28668 - 29030 543 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738646|ref|ZP_04569127.1| ## NR: gi|237738646|ref|ZP_04569127.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 120 1 120 120 210 100.0 2e-53 MKIIGQLIIGIVGMMMSILMAYGFSFFTEKVDYSYQKAIDNISYDRLKKVEDTARAMIAT YKSDKLTYEAYKNTDVELATQAKIRANRTAVAYNDYILKNSFQWKGNIPSDIYNQLEIIE >gi|224461233|gb|ACDC01000169.1| GENE 47 29040 - 29225 205 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738647|ref|ZP_04569128.1| ## NR: gi|237738647|ref|ZP_04569128.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 61 2 62 62 91 100.0 1e-17 MEIYVRVIIIIFMFHFGVIGLVGIKYAMNDNKTKKDADRINFYIFLGFVVQIAGYFLWKS V >gi|224461233|gb|ACDC01000169.1| GENE 48 29242 - 29682 516 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738648|ref|ZP_04569129.1| ## NR: gi|237738648|ref|ZP_04569129.1| conserved hypothetical protein [Fusobacterium sp. 
2_1_31] # 1 146 1 146 146 228 100.0 8e-59 MATQEQRIVLKEIEDVLYSYPKYKNRIKEETEHLANPQLKKCCGVGGQGGNGYEIKSEYE QIEELKQRISNNISRYREMLFRIDECLNMVKDNKDYNFIELKYFQGLTYEEIAEKLEVHV TSTYKMRNRILGALKVHFKAQRLIEF >gi|224461233|gb|ACDC01000169.1| GENE 49 30161 - 30739 517 192 aa, chain + ## HITS:1 COG:no KEGG:CLH_1731 NR:ns ## KEGG: CLH_1731 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 2 178 10 189 197 87 34.0 3e-16 MNTEEKIVSSPELAEMFGVTDRYIRMLAQDGIVKKSGNRGKYLLVESVKGFIEFIKEQNS ADVDLKDTKLKKETEKIEKDIELKSIKISELKNELHSADIVRKVMTVMLTNLKGKLLAVP NKIAPLVVGCDNLGDIQDIVLSSIEDVLLELSEYSPELFKNKNIILEDEEEVEDEKSKGK GSSRKSKSKKNN >gi|224461233|gb|ACDC01000169.1| GENE 50 30681 - 32348 1409 555 aa, chain + ## HITS:1 COG:RSc0853 KEGG:ns NR:ns ## COG: RSc0853 COG5525 # Protein_GI_number: 17545572 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Ralstonia solanacearum # 27 501 18 511 660 309 36.0 1e-83 MKKAKEKDPVENPSLRKTINLFADIFQTLKPPPKLTIDTWADSYRILSSKTSAEPGRWKT DRVPFQREVMKAISDKKTTKIVMMYGAQLSKTEILLNVFGYYADYDPAPIMYLLPTKDLA EDFSSTRLDDMIQSTPQLKNKILNKVDGRDTKLQKEFVGGYITLVGSNSAAELSSRPLRI LLADEVDRFKSDVGGEGDPLNLAIERTKTFWNKKIVITSTPTIKGDSRVEKEYENSTKEE FYIPCPKCGSFQKLEWRNIIFEPVGHKCPDCLEISSEHEWKRNMIHGIWQPQEEEVDDWS VRGFHISELYSPFSTWPEIIKKFKAAKGNMQMMKVFTNTCLGQTWEEKVEKIDFLDISKR KEEYTAEIPDQVQVLTAGVDVQDDRLEIEVVGWGLGEESWGIYYKQFIGSPGQNDVWEQL DRFLETEFEYADGEKIRILCTCIDTGGHYTQEAYQYIKPREFRRVFGIKGKGGDGVAFVS KPSRTNRMQISLFTLGVNTGKETILARLKIEEPGSMYMHFPSNVDRGYDEAYFKGLTSEV KTTVWEKGVKKLFGK >gi|224461233|gb|ACDC01000169.1| GENE 51 32511 - 32735 272 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738651|ref|ZP_04569132.1| ## NR: gi|237738651|ref|ZP_04569132.1| predicted protein [Fusobacterium sp. 
2_1_31] # 11 74 1 64 64 107 100.0 2e-22 MNYTREECSQMIEVYRKAEIAVLTGKSYKIGTRELVREDLSEIRKGRAFWEGELDKLNNN GRKKLGRRVIPRDL >gi|224461233|gb|ACDC01000169.1| GENE 52 32774 - 34333 1813 519 aa, chain + ## HITS:1 COG:RSc0857 KEGG:ns NR:ns ## COG: RSc0857 COG5511 # Protein_GI_number: 17545576 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Ralstonia solanacearum # 36 506 26 482 508 138 26.0 3e-32 MNLLDKTIAFFNPKKALEREVARKKIEILNTGYSNHGASTTKSSMKGWISTGGGVKKDIY KNRKKLVERSRDLYMGAPVAQGVMKTINSNVIGSGLKLKSAIDYETLGISEEEAEAIETT IEKEFKLWADNKIEQMGVLNFDQVQDLVFLTILLNGECFVKFNYFETPKNPYSLKLQIIE PDRVMTPSILQNDETIVDGVKIDNNNRISGYYVARKHPLDVSGNVETDFISVYGKQEQLN ILHIMLAERPEQVRGIPILSPVIEALKQLDRYTDAELMAAVVSGMYAIFIESDKDNAQGA NIADHEVLDETEQIDSSNEETIELTPGLVQGLNPGEKVVATNPGRPNAQFDPFVTSILRQ IGAALEVPYELLIKHFTASYSASRAALLEAWKMFRKRRDWFSSNFTQVVYEEWLREAYLL GRVDMKNYGEDPLLTKAWSGAQWNGPSQGQLDPLKEVKASTLRVQQGFSTRTKETVELNG GDFEQNVRILAKENKLLEEKGVMINNAENDKEVLEHNEE >gi|224461233|gb|ACDC01000169.1| GENE 53 34290 - 35375 1704 361 aa, chain + ## HITS:1 COG:CAC1893 KEGG:ns NR:ns ## COG: CAC1893 COG0740 # Protein_GI_number: 15895167 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 8 186 14 195 256 139 42.0 6e-33 MPKMTKKFWNITKNEEAKSADVVMYGTIGSDEYWDDVCDKTIKEEIGNLGDVENINVHIN SPGGSVFAAVAIANTLKNHKAKVTAFIDGLAASAATIITSACDVVKMPKNAMFMIHNPLT WAYGNKQELEKTGILLDKVKDSILETYLAKAKGKTKEELSALMDEEKWFNAEEAKEYGFI DEIVDEVENLQNVNNLLIVNSLAFDISKFKNFPGFKPTEPVTEPTPEPTQNTATNTATNT EEMTVEKFKADYPELYKNIVNSAVQGERNRIEAIENLEIAGFDDVVNTAKFKEPVDAANL ALKILNIKKEKNKETLKNIQEESQATPVPVAPRAEEGSGSVVGIPVCNILKYMNKKTGGT K >gi|224461233|gb|ACDC01000169.1| GENE 54 35372 - 35707 550 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738654|ref|ZP_04569135.1| ## NR: gi|237738654|ref|ZP_04569135.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 111 1 111 111 186 100.0 5e-46 MSFIEKGNEYGVDQLLSGTGHKVMELEVPQGKSVKRGQAVNASAELSDGTDLFGVVLETA DGTAAKTKTTVVVFGEVIFEGLELKAATVKSDFIKKARDKGIIVKELGGRY >gi|224461233|gb|ACDC01000169.1| GENE 55 35707 - 36720 1500 337 aa, chain + ## HITS:1 COG:no KEGG:CLH_1724 NR:ns ## KEGG: CLH_1724 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 17 336 20 347 348 151 31.0 4e-35 MAVLLEFLGLYDQSVIKPKTFIRDMFFSKHETHEYPKWEIEYRKGRQLVAPFVSELIPGT EVVKRSYASKYYSAPKVAPKKTFSAQEIYFAKSAGETIYGGISPEEKKAKLIGEAFADFE EQISRREELMCIDLMFKGSIVVKGEGVEDKIEYGTVQEITPTVLWNQPNADISGDIESVI TLIGETTGQRVEHIVMDPVASRLFTQNEKIAKLLDIKNANFGQIDPKELASGAIYIGTLA PYNIPIYSYQTQHSVLKADGKTYDTVKMIPEGRVLFAPSNNTLHYGPAADIAKGIIVAER VPFEDEDTKINTLEVRTESRPLPVPFDIDAIKVLKVK >gi|224461233|gb|ACDC01000169.1| GENE 56 36732 - 36965 444 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738656|ref|ZP_04569137.1| ## NR: gi|237738656|ref|ZP_04569137.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 36 1 36 77 65 100.0 1e-09 MKLKVKQSLIYCGIVYNPGEVVDILESDIIERVKSLELVEAEEVTEEAENLEGTEETTEE NTEVEETNKNSKKSKKA >gi|224461233|gb|ACDC01000169.1| GENE 57 36968 - 37300 531 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738657|ref|ZP_04569138.1| ## NR: gi|237738657|ref|ZP_04569138.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 110 1 110 110 193 100.0 2e-48 MGFKEEVTSDIVDVFLNLEEFGDTHTIGKKETVCVIDEERFQNKQRNRTRSLENEGLFIE GMTLFIEKSFFKYPPHSGEKILVDGVRYLVEEAKEDMGLLEIDLTRYDEK >gi|224461233|gb|ACDC01000169.1| GENE 58 37297 - 37866 752 189 aa, chain + ## HITS:1 COG:no KEGG:Dred_1209 NR:ns ## KEGG: Dred_1209 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 4 187 2 181 185 84 32.0 1e-15 MIGVKVEATGINEVINTLGKYESELPSCISRAINRSLEMVKTEQIRKTTESYFAQKSKLL SSVNVFKTSKSNLTGSIISNGRVIGLDHFKLNPKTRTKGKIVQTAVKKGGYKSLPNAFIA YKNGHLGAFERTGKFITKNGRKRETIKRLMSVSAPQMLGNLSILEYLQGYADEKFRMRLE HEINRVIGV >gi|224461233|gb|ACDC01000169.1| GENE 59 37863 - 38348 523 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738659|ref|ZP_04569140.1| ## NR: gi|237738659|ref|ZP_04569140.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 161 1 161 161 303 100.0 2e-81 MIIEVEQLIFDFLTEKLQDKKVTVYHGLLPEINHEDREEGKSEKDLFPFAILRVTKFEQT RNGIDNYDVPVDLEVWIGTKMESEKDYLNNLSIGDYLKKEFLNESTVDGKFAVDQSFPFS IEYFTAESEPYFYSVCRFRVFGVPDTSEVVEKKIAKLLGRG >gi|224461233|gb|ACDC01000169.1| GENE 60 38350 - 38556 192 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738660|ref|ZP_04569141.1| ## NR: gi|237738660|ref|ZP_04569141.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 68 6 73 73 114 100.0 2e-24 MKTYIYVGKKLDLPEFLFVRGTVYFGEEIEKLIEKYPLLGRLLIPVEDYPKINKDYQYFD SIVDEIKI >gi|224461233|gb|ACDC01000169.1| GENE 61 38624 - 38797 272 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738661|ref|ZP_04569142.1| ## NR: gi|237738661|ref|ZP_04569142.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 57 4 60 60 79 98.0 9e-14 MLKKIGRPTDEPKSHRITVRIDEESKKTLDEYCLKKEVKPAEAIRIGIKKLKDDLEN >gi|224461233|gb|ACDC01000169.1| GENE 62 38920 - 39123 420 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738662|ref|ZP_04569143.1| ## NR: gi|237738662|ref|ZP_04569143.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 67 1 67 67 110 100.0 2e-23 MYANMEKVIKESRKHLTTHYDMTFDQLNDIRDNSKGIFEMIGTAFMFGFGQGMKYQKKRG KVNKNGK >gi|224461233|gb|ACDC01000169.1| GENE 63 39113 - 39398 400 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738663|ref|ZP_04569144.1| ## NR: gi|237738663|ref|ZP_04569144.1| hypothetical protein FSAG_02203 [Fusobacterium sp. 2_1_31] # 1 95 1 95 95 143 100.0 3e-33 MANNLIMKNEITSLELLAEINKFRKEEGIKKELLHKTLLAIIRDEFSEEINEQNILPVEY KDKKGEKRPMFILTLSQARQVLVRESKFVRRAVIH Prediction of potential genes in microbial genomes Time: Fri May 20 00:06:30 2011 Seq name: gi|224461232|gb|ACDC01000170.1| Fusobacterium sp. 2_1_31 cont1.170, whole genome shotgun sequence Length of sequence - 6485 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 121 205 ## + Term 132 - 163 0.1 2 1 Op 2 6/0.000 + CDS 186 - 1628 1960 ## COG3497 Phage tail sheath protein FI 3 1 Op 3 . + CDS 1643 - 2170 799 ## COG3498 Phage tail tube protein FII 4 1 Op 4 . + CDS 2181 - 2522 612 ## gi|237738666|ref|ZP_04569147.1| predicted protein 5 1 Op 5 . + CDS 2459 - 2692 266 ## gi|237738667|ref|ZP_04569148.1| predicted protein 6 1 Op 6 . + CDS 2708 - 5419 3874 ## COG5283 Phage-related tail protein 7 1 Op 7 . + CDS 5416 - 5637 210 ## Sterm_0927 hypothetical protein 8 1 Op 8 . + CDS 5683 - 5877 293 ## gi|237738670|ref|ZP_04569151.1| predicted protein + Term 5932 - 5980 2.6 + Prom 5930 - 5989 5.0 9 2 Op 1 . + CDS 6009 - 6221 260 ## gi|237738671|ref|ZP_04569152.1| predicted protein 10 2 Op 2 . 
+ CDS 6202 - 6484 305 ## gi|237738672|ref|ZP_04569153.1| hypothetical protein FSAG_02212 Predicted protein(s) >gi|224461232|gb|ACDC01000170.1| GENE 1 2 - 121 205 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no NDEFEHLMAEATRIKANLLKEKAEIEEKLMKLNKMGLTN >gi|224461232|gb|ACDC01000170.1| GENE 2 186 - 1628 1960 480 aa, chain + ## HITS:1 COG:STM4213 KEGG:ns NR:ns ## COG: STM4213 COG3497 # Protein_GI_number: 16767463 # Func_class: R General function prediction only # Function: Phage tail sheath protein FI # Organism: Salmonella typhimurium LT2 # 67 463 61 460 475 72 21.0 2e-12 MGYKHGTYQQEGATAFQLPVVLDYGHFIVGTAPIHKVKAENRKVNEVVRIGTYQEAIQYF GDTYDLDFSISQAIKVFFELYAVAPLYVVNILDLTTHKSEKKTLANKALEKGKVLIPSHK VIPESVVVKNATGKQVISDARTVYTAEGLEVYATVAGNNVDIEYEEVDLSKVTKTEAIGG FDSTTMKRTGLELANEIFLKYSELPAFIDVPDFSHESDVAAIMETKAKTLNGGMFEAIAL VNAPVDKKYNELVEWKETNNILSNDQVLLYGKIKLAGEVYYQSIHYAALSMKVDGENNGV PSQGPSNYSYKMDAFVWKNASGKYEEVRLDKEQQANFLNKNGVVTAINFKGWRCWGSETA KNPLATDPKDKYIYGRRMFKYIGNELVISYFNNVDKKFSLKMAETMKKSMNIRLNALVAA DQLLSAKVNFYSVDNSLIDIINGDITWTIELGIIPGAKSITFKKVYDVDALQKFAESLTA >gi|224461232|gb|ACDC01000170.1| GENE 3 1643 - 2170 799 175 aa, chain + ## HITS:1 COG:STM4212 KEGG:ns NR:ns ## COG: STM4212 COG3498 # Protein_GI_number: 16767462 # Func_class: R General function prediction only # Function: Phage tail tube protein FII # Organism: Salmonella typhimurium LT2 # 2 162 3 161 174 57 26.0 9e-09 MGRKQIPNALIDAETYFNGSNNLAGISEVELPNIEYDTVTSEQMGLTAELEVPLMGHFKK LEAKIKMDCVDESVLEINNEKSILIECKGAAQAMNRETHSADVYGIDATFKGLIKKMDGL KMKPSGKLETSIDLSVTYFKLEIGGKTVVEIDVLNNVNVIHGLANQAVRKYLGLN >gi|224461232|gb|ACDC01000170.1| GENE 4 2181 - 2522 612 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738666|ref|ZP_04569147.1| ## NR: gi|237738666|ref|ZP_04569147.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 113 1 113 113 194 100.0 2e-48 MKVKLSQTYNFGGKEFDELDINIEEMTGKDFMLCEKEFKARNKEAGAVKELEDSWAITVA AKSVGVKYGDLLNLVSIDYLKVVNGVKRFLSQGWEDKEAQKDTTVEATEETGA >gi|224461232|gb|ACDC01000170.1| GENE 5 2459 - 2692 266 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738667|ref|ZP_04569148.1| ## NR: gi|237738667|ref|ZP_04569148.1| predicted protein [Fusobacterium sp. 2_1_31] # 26 77 1 52 52 84 100.0 2e-15 MGRQRGSEGYYSGGNRGNWCLIYLDMITELLRVLNYFKVNVSYDSMLDCSLYELDYWIAR ANKFVEEEEERQNNNDN >gi|224461232|gb|ACDC01000170.1| GENE 6 2708 - 5419 3874 903 aa, chain + ## HITS:1 COG:STM2697 KEGG:ns NR:ns ## COG: STM2697 COG5283 # Protein_GI_number: 16766010 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Salmonella typhimurium LT2 # 27 660 33 701 935 102 21.0 5e-21 MAKDMSLIWQMGVAGANETMAILSKAAKSLNEVKDSTEDLVKTQKKLEGLDKVAEAYKNA NSEYNKAAKNLEQLRKAYAKSNNVTAEFKEQVKNAEKQVDKLNKQKERQKHVFEAARSAL ENEGIKLEGYKKKLKEVNEDLKKQEKLKKDLSKAQAISDIGDQFSKKGSEQLRRGAATGA ALAIPVKFYMDVEESQADLRKILGKEAEKYYDDLAELSKNGPLSQIEINEIAGSLAQSGI KGEDIVAYSDMAGKMKVAFDISTDEAGTFLAKTKEQLNLSKDELFSYMDTLNMLSNNYSV TAAQLADVSARTGGFAKSINLSKESNMAFATSLISTGVTAEQTSTVLGKLYSELSQGANT KNKAAALQRLGFDPGTINKEMAENAEGTILKVLEKIKNSNVADKSALISDIFGSDKSVIN GLSVLSENLDGVKEKLDKAKQAVSENEKVNGEYEDRLNTLTNQLKIFRNNAFNALADIGK SIAPELKETLNTLKEFAGKIANFIKENPKLVAFIVKMVAGFAAMNLGMGVANKLLLGPFA KGVGWLYKFGAFKSKGGVFFALKKMFPLASKLFGTFVKIGTFIGGKFIGIIKMVGLALKA AFVANPVGLIIAAIVAVIAIFVLLYKKCEWFRKGVDKAWKAIKEGFKATWTWIKNKFHAL MELGAKVWAKIKEYKALFIPFIGIFVVLYQKCEWFRNGVNAVWKAIKNAFSNTWQWIKDK FNALLEIGSNAWNGLKNSATVIIDKIREAFSGFFDWINKKWESLKNLGSKLNPFNWFKGE GEVAQNYSGTNYFGGGLTTLAERGAELVEMNNSSYLVNSPVMANLPRGARILNNSQTRSS LSSRVSSLKDRIRSISNDSRTVVGGDTITININGGSGSDTDIARAVKRVIEEIQSKKRRT AII >gi|224461232|gb|ACDC01000170.1| GENE 7 5416 - 5637 210 73 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0927 NR:ns ## KEGG: Sterm_0927 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 2 73 3 70 70 68 48.0 6e-11 
MRKVKVYKTVSGDTWDLISYKLYGSDQYFHQLMRANLNLLSIAVFDSNIPIIVPEITPIA SAVETSKLPPWKR >gi|224461232|gb|ACDC01000170.1| GENE 8 5683 - 5877 293 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738670|ref|ZP_04569151.1| ## NR: gi|237738670|ref|ZP_04569151.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 64 1 64 64 107 100.0 4e-22 MELKSEVMGMGSKMGRPVMGSPKTNDIKVRIDDETLKELLKYCEKNGITKAEAIRQGIHL LLKK >gi|224461232|gb|ACDC01000170.1| GENE 9 6009 - 6221 260 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738671|ref|ZP_04569152.1| ## NR: gi|237738671|ref|ZP_04569152.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 70 1 70 70 112 100.0 8e-24 MDKFEVEKLSMKMEALDNLLLAMETAIFSGNYDVSNFRRGFAHLTDMAMEVSSELNKMVE GAFANGRAKN >gi|224461232|gb|ACDC01000170.1| GENE 10 6202 - 6484 305 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738672|ref|ZP_04569153.1| ## NR: gi|237738672|ref|ZP_04569153.1| hypothetical protein FSAG_02212 [Fusobacterium sp. 2_1_31] # 1 94 1 94 95 139 100.0 8e-32 MVEQKIKNEITSLELLEQINLFRKEEGIKKELLHKTLLAIIRDEFSEEITEQKILPSSYK DKSGRTVPMFILTLSQARQVLVRESKFVRRAVIH Prediction of potential genes in microbial genomes Time: Fri May 20 00:07:06 2011 Seq name: gi|224461231|gb|ACDC01000171.1| Fusobacterium sp. 2_1_31 cont1.171, whole genome shotgun sequence Length of sequence - 23585 bp Number of predicted genes - 35, with homology - 32 Number of transcription units - 16, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 121 192 ## + Term 132 - 163 0.1 + Prom 130 - 189 4.0 2 2 Op 1 . + CDS 265 - 936 648 ## COG3500 Phage protein D 3 2 Op 2 . + CDS 929 - 1444 843 ## gi|237738674|ref|ZP_04569155.1| predicted protein 4 2 Op 3 . + CDS 1410 - 1913 629 ## Sterm_0929 phage baseplate assembly protein V 5 2 Op 4 . + CDS 1910 - 2419 523 ## Sterm_0930 hypothetical protein 6 2 Op 5 . + CDS 2416 - 2718 360 ## gi|237738677|ref|ZP_04569158.1| predicted protein 7 2 Op 6 . 
+ CDS 2715 - 3827 1105 ## COG3948 Phage-related baseplate assembly protein 8 2 Op 7 . + CDS 3824 - 4453 732 ## Sterm_0933 hypothetical protein 9 2 Op 8 . + CDS 4450 - 5463 1145 ## Sterm_2509 hypothetical protein 10 3 Tu 1 . - CDS 5469 - 5801 262 ## gi|237738681|ref|ZP_04569162.1| conserved hypothetical protein - Prom 5912 - 5971 6.4 + Prom 5581 - 5640 6.9 11 4 Tu 1 . + CDS 5810 - 6097 341 ## gi|237738682|ref|ZP_04569163.1| conserved hypothetical protein 12 5 Tu 1 . - CDS 6094 - 6354 200 ## gi|237740813|ref|ZP_04571294.1| predicted protein - Prom 6519 - 6578 6.8 + Prom 6321 - 6380 6.0 13 6 Tu 1 . + CDS 6402 - 7376 908 ## gi|237738683|ref|ZP_04569164.1| phage integrase + Prom 7414 - 7473 9.1 14 7 Op 1 . + CDS 7568 - 8377 849 ## gi|237738684|ref|ZP_04569165.1| conserved hypothetical protein + Term 8388 - 8419 -0.6 + Prom 8380 - 8439 3.7 15 7 Op 2 . + CDS 8465 - 8914 240 ## Sterm_2506 hypothetical protein 16 7 Op 3 . + CDS 8928 - 9149 204 ## gi|237738686|ref|ZP_04569167.1| predicted protein 17 7 Op 4 . + CDS 9136 - 9495 603 ## FN0636 hypothetical protein 18 7 Op 5 . + CDS 9497 - 9979 767 ## COG4824 Phage-related holin (Lysis protein) + Term 9986 - 10019 2.1 + Prom 10066 - 10125 12.3 19 8 Tu 1 . + CDS 10173 - 10370 498 ## gi|237738689|ref|ZP_04569170.1| predicted protein + Term 10400 - 10452 10.3 + Prom 10385 - 10444 6.7 20 9 Op 1 . + CDS 10466 - 10642 228 ## gi|237738690|ref|ZP_04569171.1| predicted protein 21 9 Op 2 . + CDS 10630 - 11652 1014 ## CTC01145 hypothetical protein + Term 11658 - 11689 0.0 + Prom 11715 - 11774 7.8 22 10 Tu 1 . + CDS 11863 - 11934 78 ## + Term 11969 - 12018 4.2 + Prom 11947 - 12006 3.5 23 11 Op 1 . + CDS 12038 - 12295 402 ## gi|237738692|ref|ZP_04569173.1| predicted protein 24 11 Op 2 . 
+ CDS 12334 - 12738 423 ## gi|237738693|ref|ZP_04569174.1| predicted protein + Term 12846 - 12883 1.2 + Prom 12925 - 12984 12.4 25 12 Op 1 2/0.000 + CDS 13043 - 14461 2159 ## COG0469 Pyruvate kinase 26 12 Op 2 1/0.000 + CDS 14485 - 15786 2256 ## COG0148 Enolase + Term 15796 - 15846 6.1 + Prom 15820 - 15879 7.9 27 13 Op 1 1/0.000 + CDS 15919 - 16476 604 ## COG3758 Uncharacterized protein conserved in bacteria 28 13 Op 2 . + CDS 16477 - 17223 752 ## COG3022 Uncharacterized protein conserved in bacteria + Term 17310 - 17376 31.6 + TRNA 17286 - 17362 82.1 # Pro TGG 0 0 + TRNA 17367 - 17441 72.4 # Gln TTG 0 0 + Prom 17367 - 17426 80.2 29 14 Tu 1 . + CDS 17545 - 19128 1948 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain + Term 19136 - 19168 1.0 - Term 19124 - 19155 0.0 30 15 Op 1 . - CDS 19163 - 19726 889 ## FN1560 hypothetical protein 31 15 Op 2 1/0.000 - CDS 19731 - 20456 659 ## COG1496 Uncharacterized conserved protein 32 15 Op 3 . - CDS 20456 - 21460 1162 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 33 15 Op 4 . - CDS 21474 - 21737 263 ## FN1563 hypothetical protein - Prom 21762 - 21821 12.4 + Prom 21883 - 21942 11.6 34 16 Op 1 . + CDS 21970 - 23157 1376 ## COG1301 Na+/H+-dicarboxylate symporters + Prom 23177 - 23236 2.1 35 16 Op 2 . 
+ CDS 23275 - 23403 71 ## + Term 23484 - 23534 7.5 Predicted protein(s) >gi|224461231|gb|ACDC01000171.1| GENE 1 2 - 121 192 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KDEFEHLIAEATRIKANLLKEKAEIEEKLMKLNKMGLTN >gi|224461231|gb|ACDC01000171.1| GENE 2 265 - 936 648 223 aa, chain + ## HITS:1 COG:STM4208 KEGG:ns NR:ns ## COG: STM4208 COG3500 # Protein_GI_number: 16767458 # Func_class: R General function prediction only # Function: Phage protein D # Organism: Salmonella typhimurium LT2 # 1 213 38 246 347 85 28.0 7e-17 MTYTDNSKNAVDDLELDLENLDYRWLNEWYPDENSRLLIGIQQNENEISKFLDLGIFYVD EPTFNNQRLSLKCLALPLDQTIREQVNSVAWEKITLSELLSKIATKHELSYELHCDNAFF DRLDQDRETDLGFLKRILSETALSLKVTDDKLIVFNDDALIDNDNIDIFNIKDFRIRSFT LKKKNQGVYDKVEVSYYDADKKKHIVETITKEELEKRNEVKNA >gi|224461231|gb|ACDC01000171.1| GENE 3 929 - 1444 843 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738674|ref|ZP_04569155.1| ## NR: gi|237738674|ref|ZP_04569155.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 171 1 171 171 193 100.0 3e-48 MLDDGGYVAFKEKADKTKTKKRVKKAKTKKIKTKGKSQAKKVAEKTLKDSLKQEYSINLT VDGDVKYCAGCIIELDDSFGRFAGRYVIDKVTHNIDGDYSCDIEAFKVGARQNAEERAKS IDKAKRDKAEKEKAKTANTRKKERELKKANKIKSKKVVSKNAGYLEARGSK >gi|224461231|gb|ACDC01000171.1| GENE 4 1410 - 1913 629 167 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0929 NR:ns ## KEGG: Sterm_0929 # Name: not_defined # Def: phage baseplate assembly protein V # Organism: S.termitidis # Pathway: not_defined # 3 155 5 165 177 71 31.0 1e-11 MLDILKQGEVNDIDIANGKARVIFPDRDNKISDWLNILVPFSESHSDNYHLEIGQTVIVL SLPDMMEQGYILGCPMRPSDISEGEVKRTFSDGGFYSYKDGVLTLSPVTKVVITADVEIK KKLTVDGDTTFKSNTDTKGTAMLGGINLNTHTHSGIQPGSGNTGGPS >gi|224461231|gb|ACDC01000171.1| GENE 5 1910 - 2419 523 169 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0930 NR:ns ## KEGG: Sterm_0930 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 2 166 3 162 166 69 31.0 5e-11 MIGSLGDIIFYASDLNVFSLKKELSRSRKAKITQHEPIYGIGKVRQQGRELMEVSLSIEL IAGLTKAPSLHLQMLKDFMELGRYAPLILGYHVIGEFPFLITGIDETLSHFNAATGEFDY 
INLDITLLEYVDDPLQYQKKIEYRQTAKTILGVEYEDTVKNLQKKVFKL >gi|224461231|gb|ACDC01000171.1| GENE 6 2416 - 2718 360 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738677|ref|ZP_04569158.1| ## NR: gi|237738677|ref|ZP_04569158.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 100 1 100 100 147 100.0 2e-34 MIFSINSKDEINYNPQNEIEDVVRNVHMILRVTKEEQPLMRDFSLDSDVVDKNIPVIKNK LIGLLMANLKKYEPRALLKNLDLKLENNDLEIMLEIEVIV >gi|224461231|gb|ACDC01000171.1| GENE 7 2715 - 3827 1105 370 aa, chain + ## HITS:1 COG:STM4202 KEGG:ns NR:ns ## COG: STM4202 COG3948 # Protein_GI_number: 16767452 # Func_class: R General function prediction only # Function: Phage-related baseplate assembly protein # Organism: Salmonella typhimurium LT2 # 7 369 7 370 371 192 33.0 9e-49 MIDDTYEILDANAEELRQQMQEKFEELSGRKISKHSPEGLIFASVAYLIAMREENYNDNL KQNYLKYARDYRLDLLGDRYGDRGLRLEEQYAKATFRFHIISAKQKKIVIPKGSLIRYND LYFETNEEYSITENTLFVDGIATCKTPGTIGNNIPVGHINTMVDLYPYFSKVENITISNG GTDLEEDEVYRERLRLVPDSFSVAGSVGAYVFWTLSTSPEIVDVTVKSPNPCEVDIYVLT KDGVPSEELRNQVLKVVNSDEIRPLTDKVTIKSPEVVDYKVEFDYYINKADEISINSIKD KVQTAVNEYIEWQKNKLGRDIIPDELIKRLKLAGVKRTVITSPIYKKLEPHQFAKCNASV VVNYLGVEDI >gi|224461231|gb|ACDC01000171.1| GENE 8 3824 - 4453 732 209 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0933 NR:ns ## KEGG: Sterm_0933 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 158 1 160 213 74 30.0 2e-12 MILIDDLKLTDIAAVSTLDDATTKWIYESIDFVLRGRNSIINSELKKLEMTDLMNEQEIN MLLWEYSIYTKNATLEEKKKIVKRAIFSKINIGTTKVLKDVSGLLYKGFDVKEWTAYNGR PGTFRIYTDKKIVDPKEYRELMENIEANKNVRSHLDYIELKQINTSKYYISGFKEVTLLA TKENKKKDFSVNNAIYIKGYKQIIGGISK >gi|224461231|gb|ACDC01000171.1| GENE 9 4450 - 5463 1145 337 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2509 NR:ns ## KEGG: Sterm_2509 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 214 4 228 362 65 33.0 2e-09 MKFNGITKKGREYLAKIQAENKPINFAKIKIGDGRLDNYDNPAELEHLINQKVEKGILTL NQEHDTVILTTNIDNVSLRTGYYPREIGVFVNDNGQEIMYYYMNDGDETSWIPPETDGPF 
KIELKLNLIASNAQSIVVEGVGKDLFITKEFLETNYTQKGGYTGTAQEIDDRVVSALGKE DGKFPLTEAVKGNVYYFPGNKKFYICKEAQNRRVSVPDGNFEELSIWENRKRLENFSKLE GERLYVPNATFIKIYKIAGMVTLIVDSGTAFFNKANTPIFNIPEKYRPNETLYFSASYRN STKSNTFFLYANGNLIKSEADDNLGAYYFTISYPAKN >gi|224461231|gb|ACDC01000171.1| GENE 10 5469 - 5801 262 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738681|ref|ZP_04569162.1| ## NR: gi|237738681|ref|ZP_04569162.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 110 1 110 110 185 100.0 8e-46 MENLFKYSKIFDGRASIKGQVLGSIPDNSKFIEIIGINYASDGNFYYFQPITLRTEIIRN RDIFFNLGITSDTREFGLSFKNNVISIIHSSYSNSTADNNFIAQILSVNA >gi|224461231|gb|ACDC01000171.1| GENE 11 5810 - 6097 341 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738682|ref|ZP_04569163.1| ## NR: gi|237738682|ref|ZP_04569163.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 95 1 95 95 162 100.0 5e-39 MENLIKFKTYRVDRTDDAMTLQTSAGVYAISEKLIYNFKRIIGIPSTSKIVSISATQSSG YAEYATYDYDADLAAVGHIVNTSNPRWISINVAYI >gi|224461231|gb|ACDC01000171.1| GENE 12 6094 - 6354 200 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237740813|ref|ZP_04571294.1| ## NR: gi|237740813|ref|ZP_04571294.1| predicted protein [Fusobacterium sp. 4_1_13] # 4 85 1 82 82 93 51.0 4e-18 MENLIKVKRSGDCNVVHNDCTIAFEPWSSTLLKNRPNNENNPAGILISFYHGRRVQLYIS GNTMYTRVNQGAEDYNSWLQWIKCSN >gi|224461231|gb|ACDC01000171.1| GENE 13 6402 - 7376 908 324 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738683|ref|ZP_04569164.1| ## NR: gi|237738683|ref|ZP_04569164.1| phage integrase [Fusobacterium sp. 
2_1_31] # 1 324 1 324 324 612 100.0 1e-173 MQLTVLENLKKENVDVYLEYLNSCKSSNWDTWGTTYKTYCNNFKLFLVWFQKSYKNKLLL SKETLLEMPTIIESYRNYCRSLGNSKRTLMNKTTAISTFYAWCVRRNKIKYHPFDSKLDK LRFTEKDKVRNSYFLTTEQILTVRLYMQVESKKYDLQDRILWELFLDSACRISAIQNLKM EQLDLENGYFRDVKEKEGYIVNAFFFQKCKELIKEWIQYREENRIDVDWFFVTKYGKIYK QMTQGAIRNRIKKLGKILDIEDLYPHTLRKTAINLINNLAGLGLASSYANHSSSGVTSKH YIAKANPTEVRNSIINARKKLGIF >gi|224461231|gb|ACDC01000171.1| GENE 14 7568 - 8377 849 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738684|ref|ZP_04569165.1| ## NR: gi|237738684|ref|ZP_04569165.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 269 1 269 269 490 100.0 1e-137 MFFIYTKERKSKLAFTVNLTADEVMQFMDGNLFLDYPELIPSEHVAIERNEAFKYPAYDE AKNTIREMTRDELIEEDIEVQLAPGEYIENKKLKVVPQPSSYHTWNTSTHTWDIDMEDVK RTFRHKFREILLDKMFGSYEHNGKIFQMQEYDEVNFMRVKMALDIAGEIEDYEVIKDALS TLGIPVDAELEEKIKMAMRAGKLKQLLKLLPTQWRLKDNSIASISLGELNLIYFSWILRV IAAQNKYTAITKKIREVSTVKELEAIKWD >gi|224461231|gb|ACDC01000171.1| GENE 15 8465 - 8914 240 149 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2506 NR:ns ## KEGG: Sterm_2506 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 141 1 141 142 117 45.0 1e-25 MYTLSETSLKMLKGVHPNLVNFMTELIKISPWNFKITAGVRTAEEQNRLYQKGRTAPGSK VTNVDGYKLKSNHQIKFDGLGYAADIGVIVNGEYKGTWKDFHYYQDIYNTAKNAGLLEKY SIEWGGNCWRTFKDAPHWQIKGADRVAYK >gi|224461231|gb|ACDC01000171.1| GENE 16 8928 - 9149 204 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738686|ref|ZP_04569167.1| ## NR: gi|237738686|ref|ZP_04569167.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 73 31 103 103 127 100.0 2e-28 MESFLDRIIKEKDDLQEKIIKLDRFFTTDTFENLSPVEKMHLKDQMRYMSAYLSTLRQRI NFYESKEGKHGND >gi|224461231|gb|ACDC01000171.1| GENE 17 9136 - 9495 603 119 aa, chain + ## HITS:1 COG:no KEGG:FN0636 NR:ns ## KEGG: FN0636 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 119 1 119 119 192 75.0 2e-48 MEMTRLNTMPIDDKYWEVLEDYTYRTSMGLVTVPKGFKTDYASVPRIFRNIINSSGKHGR AAVVHDWLYSSKCTLDVTREEADKIFLEIMAEWGVGVIKRNLMYRMVRLFGASHFRRGE >gi|224461231|gb|ACDC01000171.1| GENE 18 9497 - 9979 767 160 aa, chain + ## HITS:1 COG:BH0965 KEGG:ns NR:ns ## COG: BH0965 COG4824 # Protein_GI_number: 15613528 # Func_class: R General function prediction only # Function: Phage-related holin (Lysis protein) # Organism: Bacillus halodurans # 43 134 36 133 135 71 37.0 5e-13 MEDFFISAKNGIAMVWTGWISVLVWALGGFDLSVRVLVFLMLVDYITGIWAGYITKTVNS TRAYKGISKKVFILIIVSCSTVIEQLVPNVGIRNLVIVFYVATEFLSVIENASKLGLPIP EKLKIALEQCKGDKCNSKNADPKDVKPEKLKEKDFDEEIK >gi|224461231|gb|ACDC01000171.1| GENE 19 10173 - 10370 498 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738689|ref|ZP_04569170.1| ## NR: gi|237738689|ref|ZP_04569170.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 65 3 67 67 97 100.0 2e-19 MTKLWKEVKGLVKGTNVDKDNLDKETGLCTVDLIGGEFNGWAVAGQIIDDELIIDDNAKV YNPAE >gi|224461231|gb|ACDC01000171.1| GENE 20 10466 - 10642 228 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738690|ref|ZP_04569171.1| ## NR: gi|237738690|ref|ZP_04569171.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 58 1 58 58 66 100.0 5e-10 MEKDKAVKKYRTTEKGRKNTYYTNTKSACKKFLLTMSTKEDFELAKTWLEEGEKRWKL >gi|224461231|gb|ACDC01000171.1| GENE 21 10630 - 11652 1014 340 aa, chain + ## HITS:1 COG:no KEGG:CTC01145 NR:ns ## KEGG: CTC01145 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 340 1 360 360 234 45.0 3e-60 MEALRIVLKQSSANYRKAGTIDNKMTYPLPIPATVIGALHNICGYREYHSMDISIQGNFE AVSKDMYKNITVLNTISDRGTLVKMIAPNAISNAYIEVAEAVDDNANFITEKNIKIKNKE LLEEFKNLKILKEKLDSEKKLKLEEFKTRKKELSDKDELKKIRSEEKNYKEEFKKFEDEN YSKPYSQFRSIVKKPMFYELLNNIFLILHIKSDEKTLKDIENNIFNLQSIGRSEDFVEVV ECKKVELQEFDTEIKSAEGLSIYLNYNDFQEEKIFNLDVDGNVVKSGTKYYLDKYYKIVN LKREFEKTLAIYSNYFKANNSSENVKLDEYNNTKLLVNFI >gi|224461231|gb|ACDC01000171.1| GENE 22 11863 - 11934 78 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIQFKIVIGSWSLTITITKKEK >gi|224461231|gb|ACDC01000171.1| GENE 23 12038 - 12295 402 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738692|ref|ZP_04569173.1| ## NR: gi|237738692|ref|ZP_04569173.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 85 4 88 88 152 100.0 6e-36 MLKELMNHNELGVKFYRDENAVIFVEDEKIGVILKLSVYENIFIFHRQGNDVEAIKRQIE IAKHYDEIMAGTWRPETERKFTRIR >gi|224461231|gb|ACDC01000171.1| GENE 24 12334 - 12738 423 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738693|ref|ZP_04569174.1| ## NR: gi|237738693|ref|ZP_04569174.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 134 1 134 134 151 100.0 1e-35 MEEKKRKGYKTQDQQNKANQRYRATEEGKEKTKHSTYKSRARVFINEMASFKELKELKKM ILEMEDIKMKELKKLYAEWRKVSEEMLEEGFKGSIDCGDKAVREDFSNYAELQEIISFEE MLELEKEYIRKEQD >gi|224461231|gb|ACDC01000171.1| GENE 25 13043 - 14461 2159 472 aa, chain + ## HITS:1 COG:FN1765 KEGG:ns NR:ns ## COG: FN1765 COG0469 # Protein_GI_number: 19705084 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Fusobacterium nucleatum # 1 472 4 475 475 791 87.0 0 MKKTKIVCTIGPVTESVETLKELLNRGMNVMRLNFSHGDYEEHGTRIKNFRQAISETGKR AGLLLDTKGPEIRTMTLEDGKDVSIKAGQKFTFTTDQSVVGNSERVAVTYPDFAKDLKIG DMILVDDGLIELDVTEIKGNEVICIARNNGELGQKKGINLPNVSVNLPALSEKDIEDLKF GCKNNIDFVAASFIRKADDVREVRRILHENGGDRIQIISKIESQEGLDNFDEILEESDGI MVARGDLGVEIPVEDVPCAQKMMIKKCNRAGKPVITATQMLDSMIKNPRPTRAEANDVAN AIIDGTDAIMLSGETAKGKYPLEAVAVMDKIARKVDPTIVPFFVKHVTSKNDITSAVAEG SADISERLNAKLIIVGTESGRAARDMRRYFPKADILAITNNEKTANQLILTRGVIPYVDA TPKTLEEFFILGEAVAKKLNLVEKGDIVIATCGESVFIQGTTNSIKVIQVKA >gi|224461231|gb|ACDC01000171.1| GENE 26 14485 - 15786 2256 433 aa, chain + ## HITS:1 COG:FN1764 KEGG:ns NR:ns ## COG: FN1764 COG0148 # Protein_GI_number: 19705083 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Fusobacterium nucleatum # 1 433 1 434 434 763 95.0 0 MTGIVEVIGREILDSRGNPTVEVDVVLECGARGRAAVPSGASTGSHEAVELRDEDKSRYL GKGVLKAVNNVNTEIREALLGMDALNQVAIDKLMIELDGTPNKGRLGANAILGVSLAVAK AAAEALGQPLYKYLGGVNAKELPLPMMNILNGGAHADSAVDLQEFMIQPVGAKSFQEAMR MGAEIFHHLGKILKANGDSTNVGNEGGYAPSKIQGTEGALNLICEAVKAAGYELGKDITF ALDAASSEFCKEVDGKYEYHFKREGGVKDTDAMIKWYEELINKYPIVSIEDGLGEDDWDG WVKLTKAIGDRVQIVGDDLFVTNTERLKKGIELGAGNSILIKLNQIGSLTETLDAIEMAK RAGYTAVVSHRSGETEDATIADVAVATNAGQIKTGSTSRTDRMAKYNQLLRIEEELGAVA QYNGKNVFYNIKK >gi|224461231|gb|ACDC01000171.1| GENE 27 15919 - 16476 604 185 aa, chain + ## HITS:1 COG:FN1763 KEGG:ns NR:ns ## COG: FN1763 COG3758 # Protein_GI_number: 19705082 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium 
nucleatum # 1 184 1 184 184 337 92.0 8e-93 MNKVIKKEDWKVSVWAGGTTNEIFIYPKDSSYADRIFKARISVATTNNEEKSLFTKLPGV ERYISKLTGDMKLQHTGHYDVEMEDYQIDRFKGDWETYSWGKFEDFNLMLKGIRGDLYYR QIRGRCRLHLEKGSTIVFLYVIDGKINVNGIDLETEDFYITDDNILDVFGNNPKIYYGFI KEWDQ >gi|224461231|gb|ACDC01000171.1| GENE 28 16477 - 17223 752 248 aa, chain + ## HITS:1 COG:FN1762 KEGG:ns NR:ns ## COG: FN1762 COG3022 # Protein_GI_number: 19705081 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 247 1 247 248 312 81.0 4e-85 MKIIFSPSKEMREENIFENKKIEFTESPFKDKTNILIDVLKQKSIEEIESIMKLKADLLA KTYKDIQNYDKLKYIPAISMYYGVSFKELELEAYSEKSLKYLKDKLFILSALYGLSKPFD LLKKYRLDMTMSIVDKGLYNFWKKEINEYISKSLTEDEVLLNLASGEFSKMIDTKKINMI NIDFKEEKDGTYKSVSTYSKKARGKFLNYLIKNQIDSLEEIEKINLDGYSLNKDLSNSKN LIFTRKNF >gi|224461231|gb|ACDC01000171.1| GENE 29 17545 - 19128 1948 527 aa, chain + ## HITS:1 COG:FN1559 KEGG:ns NR:ns ## COG: FN1559 COG3263 # Protein_GI_number: 19704891 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Fusobacterium nucleatum # 1 527 1 527 527 805 87.0 0 MNNILFLSSVVIILSIFIYRYLSKFGVPMLLVFISLGMIFGVNGIFKIPYDNYELSRDIC SFALIYIIFFGGFGTNLSMARGIIKKSLILSSLGVIFTSLLTGLFSHYILKLDWYSSLLI GSVLGSTDAASVFAILRSHKLNLKENTASLLEIESGSNDPFAYVLTIAFLTLSKGNLNLP LLLFKQVFFGLAVGYIFAKVSCYIIRKVNNIDSGMSMALITASMLLSYSTSEFIGGNGYI TVYLLGVLLGNIHFNKKSEIVSFFNGLTSIMQILIFFLLGLLVNPLEALKYAVPAVLIMT VMTLLIRPFVVYALISPMKSSRGQKLLVSWAGLRGAASVVFAILVVVANKERGMVVFNIA FIVVLLSIAIQGSLLPYFSKKLNMIDEDGDVLRTFNDYSDTEDVDFITAEIDETHKWVGR QVKNLELMPSVLLVLIIRNNENIIPNGNTVIEKGDRIVLCGSSFVDKGTRINLYESMVDK NSKYINKSIRELDRNILIVLIKRENRTMIPSGNTVLLEDDILVLLDR >gi|224461231|gb|ACDC01000171.1| GENE 30 19163 - 19726 889 187 aa, chain - ## HITS:1 COG:no KEGG:FN1560 NR:ns ## KEGG: FN1560 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 21 166 17 163 167 140 58.0 2e-32 
MKKKLMVSFLALVLVACGSSGSLELTKQEKEKVNGDVNVARQLLVQKAILKDASAEKLSE DDQYNLNLAKQEVEVSYYLQKKFESELNNIQVSDEEAQKYYDVHKAEIGNTPFESVKDAI VAQITYEKQTGIVNKYYEDLLNKYKIEEILKKDFPEAAQAAVQTQTPAPAEAAPAAPAPE AKTEEKK >gi|224461231|gb|ACDC01000171.1| GENE 31 19731 - 20456 659 241 aa, chain - ## HITS:1 COG:FN1561 KEGG:ns NR:ns ## COG: FN1561 COG1496 # Protein_GI_number: 19704893 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 240 1 241 242 335 77.0 3e-92 MNYIDKDIKDHEDYIEFTTFNKFNIKIFFTKKHYGSIPERSKEDVAKDFSLNKIMLSCYQ THSDNVVLVGEDTSVHHFPNTDGILTSNKNAAVLTKYADCLPIFIYDEETKIFGAVHSGW KGSYQEIVKRAIEKINPKDLSTINILFGVGISCEKYNVGKEFYEDFKNKFSKEIVDKVFS MKNNKFFFDNQLFNYYLLKEYGVKEEKMFLNNRCTFSENFHSFRRDKELSGRNGAIIFME E >gi|224461231|gb|ACDC01000171.1| GENE 32 20456 - 21460 1162 334 aa, chain - ## HITS:1 COG:FN1562 KEGG:ns NR:ns ## COG: FN1562 COG2876 # Protein_GI_number: 19704894 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Fusobacterium nucleatum # 1 334 1 334 334 549 79.0 1e-156 MYIRLKDSKMSARLNDFLDKNNIKYFTIMDKIDIKYAILYIPNDFNQENFKEIQDIAEVI KLTSPYKFVSREFKKADTIIDVKGHLIGGDNFMLMAGPCSVENREMLSNIAKEVKKGGAV VLRGGAYKPRTSPYDFQGLGEVGLKYLREVADENDMLVVTELMDSDDLNLVSSYADIIQI GARNMQNFSLLKKLGKLDKPVLLKRGLSATINEFLLSAEYILAHGNQNVILCERGIRTFE TMTRNTLDLNAIALVRELSHLPIIVDASHGTGKRSLVGPLTLAGIMAGANGAMIEVHENP DCALSDGPQSLDFKLFDKVANNIRKSLYFRKDLE >gi|224461231|gb|ACDC01000171.1| GENE 33 21474 - 21737 263 87 aa, chain - ## HITS:1 COG:no KEGG:FN1563 NR:ns ## KEGG: FN1563 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 87 1 87 87 132 77.0 5e-30 MYNELDLHNLDFKVALSVFKKKYNEALKKKDRREILVIHGYGANKLGHKAVLATNLRIFL SNNRDKLSYRLDTNPGVTYVTPISRLE >gi|224461231|gb|ACDC01000171.1| GENE 34 21970 - 23157 1376 395 aa, chain + ## HITS:1 COG:FN2053 KEGG:ns NR:ns ## COG: FN2053 COG1301 # Protein_GI_number: 19705343 # Func_class: C Energy production and conversion # 
Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 1 395 1 395 395 622 93.0 1e-178 MSSKKLGLVPRLIIAIIVGILIGQFMPLWIVRIFKTFSTFFGSFLSFFIPLMIVGFVVSG IAKLTEGAGKLLGFTAIVSYVSTIVAGTFSYTVAANLYPKLVSGISQGINFEGKDVAPYF TIPLKPPIDVTAAIIFAFMMGITISIMRSQKKGETTFNLFAEYEEIISKILAGFVIPLLP FHILGIFSEMAYSGIVFKVLGVFAAIYLCIFAMHYIYMLVMFSIAGGISKKNPFTLIKNQ IPAYFTAVGTQSSAATIPVNIQCGLKNGTSPEIVDFVVPLCATIHLSGSMITLTSCIMGV LLLNGMPHSFSIMFPFLCMLGIAMVAAPGAPGGAVMSALPFLFLIGIDAQGPLGSLLIAL YITQDSFGTAINVSGDNAIAIYVDEFYKKYIKKAA >gi|224461231|gb|ACDC01000171.1| GENE 35 23275 - 23403 71 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MITCLPLVFQELRKGSFNNNGRRSNLIKLLLIYSKKISNDKL Prediction of potential genes in microbial genomes Time: Fri May 20 00:09:07 2011 Seq name: gi|224461230|gb|ACDC01000172.1| Fusobacterium sp. 2_1_31 cont1.172, whole genome shotgun sequence Length of sequence - 24173 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 5, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 28 - 87 5.8 1 1 Tu 1 . + CDS 132 - 653 748 ## COG2109 ATP:corrinoid adenosyltransferase + Term 683 - 721 5.2 + Prom 658 - 717 10.8 2 2 Tu 1 . + CDS 809 - 1021 372 ## gi|237738705|ref|ZP_04569186.1| predicted protein + Term 1029 - 1085 -0.1 + Prom 1052 - 1111 11.5 3 3 Op 1 . + CDS 1271 - 2098 790 ## FN2078 DeoR family transcriptional regulator 4 3 Op 2 . + CDS 2111 - 5515 4135 ## COG0433 Predicted ATPase 5 3 Op 3 . + CDS 5539 - 6186 965 ## gi|237738708|ref|ZP_04569189.1| predicted protein 6 3 Op 4 . + CDS 6201 - 6809 990 ## gi|237738709|ref|ZP_04569190.1| predicted protein 7 3 Op 5 . + CDS 6824 - 7153 533 ## gi|237738710|ref|ZP_04569191.1| predicted protein 8 3 Op 6 . + CDS 7155 - 7691 730 ## gi|237738711|ref|ZP_04569192.1| predicted protein + Term 7700 - 7733 4.0 9 4 Op 1 . + CDS 7743 - 9008 1683 ## gi|237738712|ref|ZP_04569193.1| predicted protein 10 4 Op 2 . + CDS 9030 - 12437 4191 ## DP1989 hypothetical protein 11 4 Op 3 . 
+ CDS 12447 - 15695 4114 ## DP1989 hypothetical protein 12 4 Op 4 . + CDS 15705 - 19496 4919 ## COG4642 Uncharacterized protein conserved in bacteria 13 4 Op 5 . + CDS 19509 - 22286 3150 ## DP1989 hypothetical protein + Term 22290 - 22352 16.0 + Prom 22331 - 22390 10.7 14 5 Op 1 . + CDS 22413 - 23459 1002 ## COG1106 Predicted ATPases 15 5 Op 2 . + CDS 23459 - 24112 565 ## gi|237738718|ref|ZP_04569199.1| predicted protein Predicted protein(s) >gi|224461230|gb|ACDC01000172.1| GENE 1 132 - 653 748 173 aa, chain + ## HITS:1 COG:FN1790 KEGG:ns NR:ns ## COG: FN1790 COG2109 # Protein_GI_number: 19705095 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Fusobacterium nucleatum # 1 173 1 173 173 293 91.0 1e-79 MEKGYVQIYTGNGKGKTTAALGLITRAVGNNFKIFFCQFLKGRDYGELHTLKKFETVVHE RYGRGVFIRSKEFVTDEDKKLMREGYESLKNALLSKKYDIVIADEILGTLRYDLISVDEI KFLIENKPETTELVLTGRNAPEELIELADLVTEMREVKHYFQKGVMARKGIEK >gi|224461230|gb|ACDC01000172.1| GENE 2 809 - 1021 372 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738705|ref|ZP_04569186.1| ## NR: gi|237738705|ref|ZP_04569186.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 70 1 70 70 100 100.0 2e-20 MGKLSKTLTILGGAVLAGVAYSLWKDKQDLEEENDELYDEIAALKKKNMEQDLEENIITE TNENSEDIVL >gi|224461230|gb|ACDC01000172.1| GENE 3 1271 - 2098 790 275 aa, chain + ## HITS:1 COG:no KEGG:FN2078 NR:ns ## KEGG: FN2078 # Name: not_defined # Def: DeoR family transcriptional regulator # Organism: F.nucleatum # Pathway: not_defined # 1 275 1 280 280 159 37.0 1e-37 MKKVRITVSDFMFEILKGDSEYFKVPVGKIGNTLFKYYIDKNLSKIKLEESSGKKVQFNL SKENEDIFFDILREKKAETEAELMRDIFFTYINNLRFKREEIIFNDTFKQIREAIKNNKK IGIKYHSTARIVNPYFIELSSKENRSYLFCYCEKNQDFRNYRISDIENIWNLQNEIYVKD EDYIEAIRKNFDPFLSYGNEIKVRMTEEGKALYERVNQNRPKLLKEEGDIYTFECSDKLA KVYFAQFYDEIEIIEPESLRESFKENFKRTYEMYK >gi|224461230|gb|ACDC01000172.1| GENE 4 2111 - 5515 4135 1134 aa, chain + ## HITS:1 COG:AF1060 KEGG:ns NR:ns ## COG: AF1060 COG0433 # Protein_GI_number: 11498665 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Archaeoglobus fulgidus # 730 949 244 461 505 62 25.0 4e-09 MEYEVIEKEERLQFLEEMFESTELKVTESYNYPTIMEDLSNVVLFHIKEVTFEGEEKSPR REAFENVIGMIQNEGVNFIYLILGDKKGVSFYFGLVKESKYDGELPMPIDEMGNNLLKSA IRGNFRGSKVEEVSPEEMIEIFDRMQTNSNNRNARKYASVIGTPGINESEDKKSFQGVDR LVDVMQGDDFGLCILAKPLSKRAIKKIEDDLYQIYNSLSTFSKISLQEGENLSKGTSISK GTSDSVSSGENTTKGTNYSKTTGTSENTGTSESKTAGSTTGTNYSESQTEGKNWGKTEGT NKGSSVTKGKSEGSSSGGSSSSTNKGTNESGTENSGTTFSENVGTSSSKGTNTGYSNSEN MSKTKGTNTSTGSSQSKTAGTSETTGTNTSTTRGTNNSVSESSTTGSSQNVSKDIINKKA ADYVKYIDEMLLPIIDYGKSKGLYLTTTFIFADNNSQLEKLGNTIKSLYSGKKGNKNPLE FKILENNDKKIEYFKNFQIPECISYDDENALTLKSHFVENDEVSLGNWYSPNELGLIAGL PEKEVVGLSLNEEVEFGLNAKTPEKGEELISLGNLVQSGNEIDTKVYLEKSALNKHIFIT GVTGTGKTTTCQKLLLESELPFLVIEPAKTEYRILMNNPKTEDILIFTLGNDKVAPFRLN PFEFFEGESITSRVDMLKAAMEASFDMEAAIPQIIESAMYSCYEDYGWNIDTDENEKFDN PYDEGVYSFPTLEDLLNKIEIEVTKHDFDDRLKKDYIGSITARLQGLLVGSKGQMLNTRR SIDFRELIEKKVVLEIEEIKNGTEKSLVMGFILTNLREALRIKSKKNKDFKHITLIEEAH RLLSKYTPGDSLNKKNSVETFADMLAEIRKYGESLIIADQIPNKMTPEVLKNTNTKIVHK IFAEDDKEAIGNTISLSKEQKEFLASLPPGRTIVFSQSWTKAIQVQIEAETDTTFAKDID 
EDRLKNRVEDFYIENYKKGIFIGTKYEKITREQFRLCREFSTSKEFVKIFKAVFEENINS FSEFENISKNLNMLLKDRDFLEDIRKILEKNEKYIENLNKTQESFKNYYENIFQEDIYKV IYFKYYESFKENKVAQNNIIDGIKEIFNRGNFNLNDGSHLIQMYRLELNEKLKR >gi|224461230|gb|ACDC01000172.1| GENE 5 5539 - 6186 965 215 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738708|ref|ZP_04569189.1| ## NR: gi|237738708|ref|ZP_04569189.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 215 1 215 215 367 100.0 1e-100 MGLFSAIGGFISGAIGAVGSFLGGAIGIAATAIAGLVGSPVFGAVVGLISLVSTVIGLTK KDEKQEELGAKAMLSDKKPQDFDSHQAYIDHLRNDIELTPEIKDKLKNDKVFNDNCTVLG ASVEWNGLNEKMGINMDIASLVKLVEAGVKTPEQFQTIANTFKSKEVEPKINDAIERNIP MKEGAKIVDTLKEGVDKVEGSKEIWEKLDKMLDEM >gi|224461230|gb|ACDC01000172.1| GENE 6 6201 - 6809 990 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738709|ref|ZP_04569190.1| ## NR: gi|237738709|ref|ZP_04569190.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 202 1 202 202 316 100.0 5e-85 MIMDESTMKGLSGVLDKILKILSKITPELIKEVTSLISSVAELLGLKDEDDSSEELGLKS EIADKRLEDFDSREEYINYLKDDVELDEYDREKLNNESLKEKYSAKGLDIEMGAINEKIG VKLGLEDYVMMAKAGINKVQDFMTIIDTFKEKGVEPLINEAIERLIPMKEGATVIDTLKE GVNKIENAKATWDKFNDMLENF >gi|224461230|gb|ACDC01000172.1| GENE 7 6824 - 7153 533 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738710|ref|ZP_04569191.1| ## NR: gi|237738710|ref|ZP_04569191.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 109 1 109 109 118 100.0 2e-25 MGNDELWSAFGKILLNPINYIKSIRGDIGEKKENENFEKTEKYIKEVEEKIEKEKYLKPE SKLKDIKLDEVGENLNWQDLRSKQNIKEDLAEEKEEDKWKKLDKILEDI >gi|224461230|gb|ACDC01000172.1| GENE 8 7155 - 7691 730 178 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738711|ref|ZP_04569192.1| ## NR: gi|237738711|ref|ZP_04569192.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 178 1 178 178 273 100.0 4e-72 MKIHRYFFWIEDNFIEIYKNGNLEKYEGEEKLYIDKFETFWEKWKKNSKIIASRDAIDFT FLVDKKVSKDDLLKGLDNYKKETEINFSSEDLKKLLDIKDFKTIIFEFNNQKKVITKTKG RYIESEFEENLPEIILFGDNIDEDILNNLANQRVEEKKNKTEAGQLDKIFGSQWNNRK >gi|224461230|gb|ACDC01000172.1| GENE 9 7743 - 9008 1683 421 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738712|ref|ZP_04569193.1| ## NR: gi|237738712|ref|ZP_04569193.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 421 17 437 437 524 100.0 1e-147 MPNNVTTENITKKDLIDNIKKTENIIEELNKTKNKLDKAIEFLDNIGTFIDNKIKNYMET LNSNNSNEKSQTKSVESGKIENNNVEKSEEEKEKKSDISKFIEAFSQEFSFIDKEQFNDE MENDEIMGNIIEKIKKAFTNKENSIMDLKERVSQEIRETNSLKETVSKYKNELNEKEKRI FEKENEINNYKIQLENKIKENIKLEKDYNDELNLKKNEIVRLENVERILTAGKTKLEKEK KELEEKNKENSEKMKIFMTKYEEKEKELKSVGDLLEAKKLYDKYLSMETRILSKLDNVLI QRDFESFISSGYTLSSLDNVWDIVKIEYKNISKEDKETLKEVFAFFLNQINKRFKDAKYG LVQVEVGEDFDAVNQVDLSQNSKGTVTECVFYGYGFLAEKENPNDEDIIARIIKNPLVLT A >gi|224461230|gb|ACDC01000172.1| GENE 10 9030 - 12437 4191 1135 aa, chain + ## HITS:1 COG:no KEGG:DP1989 NR:ns ## KEGG: DP1989 # Name: not_defined # Def: hypothetical protein # Organism: D.psychrophila # Pathway: not_defined # 479 1127 243 913 923 426 39.0 1e-117 MGNKKRRLSVAISALFSVVNNTQNYSRDFKESSEIINSFNDNYRDRLNEIEKIKFIFKKV MEIKREKMGKKGSAKKVKKEFESLEKNRIEKIEEISTQDINLKELKKYEKIKKFRNLEKV NIIDSNNEKDKFKDFQEIYNLINIKLEAKNKKWIFSQKDNIYYIFPYDFFVTERLKGGYS IYVSNKETKNNLDSIKKVSNRIRKKEFEFNLPEERSIKLLNSIGFLGNGNWVYYQKQVLE YVTIEDSFNYSSNYTHNLLGIYKLDNFFKSIENRNKKYNLVESNNLKDFEVILNYDLLDP SIITKSYEKKFNNLVEIYRTYKDYISCIYDEDDKKAGILFDTEKIIESIKNREKIFDNIE IAYLENELGIEKEVIYLDKNVYCYKNGEKEEIYNTNSEENKSIYHFKNGDIEERIYQNGI LNGKSILKFSNGGIEERDYKNGILDGKAVSKINNKEKEYIYNNGIKKEISKLKYYLSIDK ERINTDDYQEDILLDPNVGHWDLKEQDKKELKEILGKNVYKRNPKEDINQGGIVAIDFGT KSTVVVYQKDSENILPMRISGDKLNREVRNTDYENPTVIEFRDIDKFLKDYNAKVGRPDT KWQDITVSHTAFGNLTEVSSEYFDSIISDIKQWTASKNQKIILVDRKGKEILLPPYLELK EKSKDYLDPVEIYAYYIGSYINNMINGIYLEYYLSFPVTYEKAIRERILKSFEKGIQKSL PIEIQEDKDLMKKFRVRHGANEPAAFAVCALKNFKIEPKDKDDKVYYGVFDFGGGTTDFD 
FGIWKYSEEEDLYDYELEHFGAGGEKYLGGENILKELAYKVFTDNSSNLRKSQIQYTRPE WCDEIIGEETLVSQTKQARQNSTKVAKELRGIWENTTTERVDFISVNLFDSHDEVNAGFK LNVKEDELKNLIKEKIEKGIKNFFIKMEDAFKGEDVKEINIFLAGNSCKHPFVEEIFNSY IEKKKNKFKINIYDTKMFEKLKDTKKTNPTAKTGIAFGLIYSRNSGRIKVISRDEKANMN NEINFKFYIGNNRRNKFNCIISPNSNYDEYKFFGIVKSDIFELYYSTSPEAQTNEMKSSE AKIKRVNLKKDYDEEERYRIYLKADESDKLVYAIVKEEKDIETKKFVEEGEVTLN >gi|224461230|gb|ACDC01000172.1| GENE 11 12447 - 15695 4114 1082 aa, chain + ## HITS:1 COG:no KEGG:DP1989 NR:ns ## KEGG: DP1989 # Name: not_defined # Def: hypothetical protein # Organism: D.psychrophila # Pathway: not_defined # 417 1074 243 913 923 397 38.0 1e-108 MEKFENLKEFRDLDSLKLINEDKSTEKLNDKFKDLQEIYTLIRTNIKNNGKQWIYSQKDE IFYIFSQNKFTTLAFYSERLYTTDKEKLKEISNKIKDHKIEFNVPNNNELENLNRIGFLG SGQWYCLDNNGNYYSRYWNYSSSHYALGICSINDFKDDLDIRNQSFKLETNILKEIDKKI IENGIEVEKYIYTDYFIEILSKIGICNIDDVRNLLARMQKDDNEVSPKEFLERYKVTLLE SKELKDFEVILNYNLLDTDIIKGEEEQIKFRNLVNLYKTYKDYISCMYIKDDTEDTVELI FNADKIISSAENRDELFNGIEILYKKNKLGITKEEIYNDKNIFYFENGDAETIYDSRAEE RTSMYHFSNGDEEKRIYKNGVLQGEAIFKKDNKVKKYFYTDGIREEMPTLKYYLSIDKER INIDDYDEERLWDINLGHWDLKEEDKEELKEILGKKVYERDPKEDVHQGGIVGIDFGTKS TVVVYQKDKTTIMPMRISGGKLNKKVDDTDYENPTVIEFRNVEKFLEKYNEKDGRPNTRW EDVMVSHTAFGNLTNGPSEYFTSIISDIKQWTTKEKEKHYLKDRTGSEYTLAPYLKLDEN EENYIDPVEIYAYYIGSYINTMTNGIYLEYLLSFPVTYEKAIREKILKSFEKGIKKSLPI QIQEDEKLMKKFKVKHGANEPAAYAACALKNFKIEPKDKDDKVYYGVFDFGGGTTDFDFG IWKLAEDEDKYDYELEHFGAGGDKYLGGENIIKELAYKVFTENSDTLLKKRIQYIRPEGY DELKGEGALVNNDSSIAKLNTRILGEMLRGIWENSAIEDMKTIKLSYLYDTHGEKRGMGG DEELSFNVSEEKLKDFIKEKIAKGINNFFIKLEDAFKDEDAKEINIFLAGNSCKHPFVNE IFAEYQEKMKDKIKLNLYDLKAIEGLKEKDSTKVMPTGKTGVAYGLIYSRKGGRIKVTNR DEKENMANEVNFKFYIGNNKRDLFNSVLNPNSKYEKYEYFGKVTSDTFEIYYTTLPEAQT GKMEIDKTNVKRISLNEEYDEDEEYRIYIKATKPTKISYAIVKKEEDVDTKEFLEEGEIV LS >gi|224461230|gb|ACDC01000172.1| GENE 12 15705 - 19496 4919 1263 aa, chain + ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: 
Synechocystis # 369 559 144 334 349 75 24.0 9e-13 MSNIEQFESIKKFEDIKNFRDLENLKLIDENNEIIDFKEKFKDFQEIYSIVASFYNKNKN IKWIYSNKKRVSYIFSQKIFVTNTIRASGNTVNDLNIIKRETNLKDITFKIPDFSTLQNL NKIGFISDSSTNYWYYLDSTGNYKTCYFFIPKNYQYFYGGNYYVLGVASISGFYDILNKR NQSFLQGMNKSEKNDSLIREFEIALKYDLLDESFINLDDNKKEELKEIALEYDLLDKGFI KSDDNKKEKLKEIALEYDLLDKDFIKLDDNQKEKLKNLVEMYKKYSEYISCIYVEDGTED TVKIAFDDKKLFSLLEDNEKRYSLEIAYQENALGITKQEIDRESTVYYFDNGDREEIYLT EIGEQLSLYCFKDGNNERRIYKNGVLEGESTLTYKNGSFEKRKYRNGILEGEATLTYKNG TIETREYRNGILEGKRILTFENGENETREYKNGVLSEESTYYFKNGDTEKRKYKDGVLEG KSTYTHKSGDIEEREYKNGILSEESTYYYKNGDTEKRNTKKGVLEGKSIYTHKNGDVEER EYKNGILEGKSIYTSTDGSTEIREYRNGVLQGEVVYQKDGHKRKYLYTDGRKGEMPKLKY YLSIDKERINIDDYRENRLLDPNVGHWDLKEEDKEELKELLGKEVYSRDPKLDINQGGIV GIDFGTKSTVVVYQKDRNNILPMRISGEELNKEARDTDYENPTVIEFKDLVKFKKDYDEE IGRPNTKWEDVTISHVAFKHLMEGTSEQFDSIISDIKQWTVHKNVKTEVIDKKGNKISLP PYFELDENDENYIDPVEFYAYYIGSYINTMRNGIYLEYYLSFPVTYEKAVREKILNSFER GIKKSLPKQIQEDEKIMKKFRVKHGSNEPAAYAVCALKKLKIEPKDNEKIYYGVFDFGGG TTDFDFGIWKNSEDEDMFDYELEHFGAGGDIYLGGENILKELAYKVFSNNSSELRKRKIQ YVRPEWCPELSGEETLVENTREAKLNTRKLAEEKLRSIWEEKNTERIDGVKINLFNADGS LETGVDLKIDEEELKALIKDKIEKGIKNFFIKLEDAFKDENVKKVHIFLAGNSCRHSFVN EIFEKYVTEMEDKLELVIYDLKAIKESDKENESKVTGKTGVAYGLIYSRKGGRIKVTNRD EKENVANEINFKFYLGNNKKDKFNPIITPNSNYGKFEFLGILTSDIFEIYYTSSPEAQTG TMEIGKDDPKIKRISLNEEYDEDERYRIYIKIVTPDKISYAIVKKEEDIETKEILEEGEV FLD >gi|224461230|gb|ACDC01000172.1| GENE 13 19509 - 22286 3150 925 aa, chain + ## HITS:1 COG:no KEGG:DP1989 NR:ns ## KEGG: DP1989 # Name: not_defined # Def: hypothetical protein # Organism: D.psychrophila # Pathway: not_defined # 230 924 220 921 923 399 38.0 1e-109 MNHIYVRKSEILKKEIIENIKGNLFNKKKAIEEGLSKIISPNGNIPVKYKNKKYVYNIRV NRLFPDLKNYDFYSLNLNEKNSEDIKNIFSEEFKEFEDMEILSEEIFRKVFYREKDRYIE QGKILNTNNIIENYIIEKDGSLVIFNSQAEKIETKEGTLIPIIKISDIEDNNESKKELES IIKIFEENEIEILFNDIDKRYKEKYDELVELYNDIEKYNFSEDNSDYDKKEILKAFEEDK LDLPLNFIDEYLEKLKNIEKSRLDEKEYDEKIFTDYQKGNWDIFSDSKEDVDNRELIKIK TNEKYYERDPKNDVKDGGIVAIDFGTKSTVVAIQTQNEKTSLVRIAGGSYKKNVEKSQFE 
NPTIMEFIDIESFEKAYNESKGRPFTEWDDLKISYAANSSFIGGSKFVLEGLKQWAGNKD EKLVIYDKKWKRIDLKPYIDVEEDEFDPIEYYAYYIGSYINNMFTGNIFTKYLLSFPIKY ENEIRVKILKSFEKGIKKSLPTSILENEKIMDNFQVYRGANEPTAYFLCAGKELDKFPKK ENEKLFYGIFDFGGGTTDFSFGICKYIGLTSSRYDYEIKHFGEGGDKFLGGENILKNLAF EICKNNLAILKEKDIHFYCPVGCKEFNGYEGVLDNSYEAIYNIKQIAEKFRGFWEEDEFS KDLYSSDEIGVTLLSSKDSSEIINLKYDKTKCEDIIDSILRKGIENFISSLKLLFKNENL GVDKMEIFLSGNSSKSKRFQKMFEEEIQKIEDTVKSKDEKIFTINYPIEKLGNKDGLEVN AKTGTAIGLLESRKGGKFKIIAKDEEVNDNEINFRYYVGYLKNSKFKPVLDYKVGYDNWV KFLDASDIETEIYYTHQANSIEGGISGDNSNLDRKIITISKDYQDEDTYIFIRATKPDII EYCVSTEKKIKKNEFIEEAQELKLI >gi|224461230|gb|ACDC01000172.1| GENE 14 22413 - 23459 1002 348 aa, chain + ## HITS:1 COG:jhp0346 KEGG:ns NR:ns ## COG: jhp0346 COG1106 # Protein_GI_number: 15611414 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Helicobacter pylori J99 # 1 311 1 331 381 102 27.0 8e-22 MIKSIRIKNYRGIKDLEIDNFKKYNFFIGDNGSKKTTILESLGIGLSLLEFERILKNARN RKMKIKKENISSLFFNSDTTNVIDFILETTDNVKVETTVSIDKTLSMFQDFSSNDYSNYL YTIEKKIKEDKLKTDIYIKENSQTIYRNSKMDKIPLSFQNFLKKYNVLVEISDSLKNSSG TVFQVDRIIKSRKKDELLKYLQVIDKDIKEIYINDEEIFVEKETLKEFIPISSIGDGMVL ALDIITSLILVDDFRHILIDEIERGIYYKNYRKLSEIIIELCKDDENIQLFITTHSKEFL EVFNEVLSESEKDNFSLFSLRNKKEKLDFVHYTSEELKDTLETGWDPR >gi|224461230|gb|ACDC01000172.1| GENE 15 23459 - 24112 565 217 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738718|ref|ZP_04569199.1| ## NR: gi|237738718|ref|ZP_04569199.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 217 6 222 222 340 100.0 5e-92 METKILYFICEGITEVTLIKKLLEKNNYKSSSNDKEENKNLILFDLSSQKNIKIYLANCE GKDRCKKYVNSLLKSISDENFEIIFFLDADDSSEDGSFTGVKRTKDLVENILKNKNCSYS SYILPNDMDDGMTETLLNKCFLCNKTVKYIEETTFKEIEELKEIIISNKHKSLFMIMAAL LAKKGVAHHFIENNFKNFDSKNEDLKKLENWILEKIL Prediction of potential genes in microbial genomes Time: Fri May 20 00:10:55 2011 Seq name: gi|224461229|gb|ACDC01000173.1| Fusobacterium sp. 
2_1_31 cont1.173, whole genome shotgun sequence Length of sequence - 2011 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 64 - 993 1617 ## COG3958 Transketolase, C-terminal subunit 2 1 Op 2 . - CDS 1016 - 1828 1229 ## COG3959 Transketolase, N-terminal subunit - Prom 1934 - 1993 11.8 Predicted protein(s) >gi|224461229|gb|ACDC01000173.1| GENE 1 64 - 993 1617 309 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 1 309 1 309 309 573 96.0 1e-163 MSKKSTRQAYGEALVELGRINNDIVVLDADLSKSTKTDLFKKEFPKRHLNIGIAEADLIG TAAGFATCGKIPFASTFAMFAAGRAFEQIRNTVAYPKLNVKIAPTHAGISVGEDGGSHQS IEDIALMRAIPGMVVLCPCDAVETKKMVQAAAEYNGPVYLRLGRLDVETVLDDSYDFQIG IANTLREGNDVTIVSTGLLTQEALKAADELAKENISVRVINCGTIKPLDGETILKAAKET KFIITAEEHSVIGGLGSAVSEFLSETHPTLIKKLGVYDKFGQSGKGAEMLEKYELTAAKL VSMVKENLK >gi|224461229|gb|ACDC01000173.1| GENE 2 1016 - 1828 1229 270 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 1 270 1 270 270 515 93.0 1e-146 MKDISFLKEKAKEIRRSIVSMIAEAKSGHPGGSLSATDILTALYFSEMNIDPANPKMEGR DRFVLSKGHAAPAIYATLAERGYFSKDELLTLRKFGSRLQGHPDMKKLPGIEISTGSLGQ GLSVANGMALNAKIFNENYRTYIVLGDGEVQEGQIWEAAMTAAHYKLDNLCAFLDSNNLQ IDGNVTEIMGVEPLDKKWEAFGWNVIKIDGHNFEEILSALEKAKECKDKPTMILAKTIKG KGVSFMENVCGFHGVAPTAEELEKALAELA Prediction of potential genes in microbial genomes Time: Fri May 20 00:10:56 2011 Seq name: gi|224461228|gb|ACDC01000174.1| Fusobacterium sp. 
2_1_31 cont1.174, whole genome shotgun sequence Length of sequence - 1920 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 15/0.000 - CDS 30 - 1229 1824 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase 2 1 Op 2 . - CDS 1239 - 1895 766 ## COG0307 Riboflavin synthase alpha chain Predicted protein(s) >gi|224461228|gb|ACDC01000174.1| GENE 1 30 - 1229 1824 399 aa, chain - ## HITS:1 COG:FN1508_1 KEGG:ns NR:ns ## COG: FN1508_1 COG0108 # Protein_GI_number: 19704840 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Fusobacterium nucleatum # 1 203 1 203 203 374 93.0 1e-103 MIYKIEDVLEDIKNGIPLIIVDDENRENEGDLFVAAEKATYESINLMATYARGLTCTPMS SEYAVRLNLDPMTARNTDAKCTAFTVSVDAKEGTTTGISIADRLTTIKKLADINSVATDF TRPGHIFPLIAKDNGVLEREGHTEATVDLCKICGLAPVSVICEILKDDGTMARMDDLEVF AKEHNLKIITIADLIKYRKKTQELMKVEVVANMPTDNGTFKIVGFENHIDGKEHIALVKG DVAGKEGVTVRIHSECFTGDILGSLRCDCGSQLKTAMRRIDRLGEGVILYLRQEGRGIGL LNKLRAYNLQEAGMDTLDANLHLGFGADMRDYAVAAQMLKALGVKSIKLLTNNPLKINGI EEYGMPVVEREEIEIEHNKVNKVYLKTKKERMGHLLKIK >gi|224461228|gb|ACDC01000174.1| GENE 2 1239 - 1895 766 218 aa, chain - ## HITS:1 COG:FN1507 KEGG:ns NR:ns ## COG: FN1507 COG0307 # Protein_GI_number: 19704839 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Fusobacterium nucleatum # 1 218 34 251 251 387 96.0 1e-108 MFTGLVEEKGSVISLNSGDKSIKLKIKANKVLENVKLGDSIATNGVCLTVTEFSKDYFVA DCMFETISRSNLKRLKAGDEVNLEKSITLSTPLGGHLVTGDVDCEGEIVSITQEGIAKIY EIKISRKYMRYIVEKGRATIDGASLTVISLTDDTFSVSLIPHTQEKIILGSKKVGDIVNI ETDLVGKYIERFVHFDKLEEKENKKSKISREFLLENGF Prediction of potential genes in microbial genomes Time: Fri May 20 00:10:57 2011 Seq name: gi|224461227|gb|ACDC01000175.1| Fusobacterium sp. 
2_1_31 cont1.175, whole genome shotgun sequence Length of sequence - 1395 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1014 1195 ## COG1262 Uncharacterized conserved protein + Term 1025 - 1062 -0.2 + Prom 1046 - 1105 16.8 2 2 Tu 1 . + CDS 1205 - 1394 240 ## gi|294782520|ref|ZP_06747846.1| lipoprotein Predicted protein(s) >gi|224461227|gb|ACDC01000175.1| GENE 1 1 - 1014 1195 337 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 50 297 33 266 286 149 36.0 1e-35 EKLELAEEEIEKVIETLKNNINKYSRNKEKKENISPLKSNEIKLNKNKIEEMIIVKGGKY QPSFADEEKEVFDIEVSKYQITQKMWAEVMGTNPSHFKGENMPVESLNWWEALEFCNKLS EKYGLEPVYNLDKKSNGILMIKELGGETVYPDVANFKNTEGFRLPTEVEWEWFARGGQKA MNEGTFDYIFAGSNEINEVAWYRNNTGGKEEIQMGIAKVLNGGSTQEVGLKKPNQLGIYD CSGNVWEWIYDTAENSHRNLENKKLYTYRAFDNSCIHRRIRGGGWAARYENCSVSTRYYK VNRENSLDFSFSQIIDTCNQETLKVSSDIGLRIVRTI >gi|224461227|gb|ACDC01000175.1| GENE 2 1205 - 1394 240 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294782520|ref|ZP_06747846.1| ## NR: gi|294782520|ref|ZP_06747846.1| lipoprotein [Fusobacterium sp. 1_1_41FAA] # 1 63 1 63 460 112 96.0 6e-24 MKESISKFIIEGKALLWIKTNDFQEVERTMIESLNSLENKKFYIYEKGKTINFLNDSIES GMD Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:02 2011 Seq name: gi|224461226|gb|ACDC01000176.1| Fusobacterium sp. 2_1_31 cont1.176, whole genome shotgun sequence Length of sequence - 1393 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 258 - 317 6.8 1 1 Op 1 . + CDS 442 - 771 358 ## gi|237738547|ref|ZP_04569028.1| predicted protein 2 1 Op 2 . 
+ CDS 784 - 1393 646 ## gi|237738548|ref|ZP_04569029.1| hypothetical protein FSAG_02266 Predicted protein(s) >gi|224461226|gb|ACDC01000176.1| GENE 1 442 - 771 358 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738547|ref|ZP_04569028.1| ## NR: gi|237738547|ref|ZP_04569028.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 99 1 99 109 168 100.0 1e-40 MVNVPTESQAKVIIENPDGFDPLNPEILRVVKEGGEIEITGIKSNKKFFNIYSGKVEVPK GFEIIEVGEIPENFQKQGFRTDGDLIGTKNGEGFPKKTDKIIRIRKIKK >gi|224461226|gb|ACDC01000176.1| GENE 2 784 - 1393 646 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738548|ref|ZP_04569029.1| ## NR: gi|237738548|ref|ZP_04569029.1| hypothetical protein FSAG_02266 [Fusobacterium sp. 2_1_31] # 1 203 1 203 203 360 100.0 2e-98 MKIKLKTQSLDDWELNLFESYSTLPIIIGRTMIFGAYNVDFEMTANIWNQLPEEYKRKIY KYNWKKMIKSMLITVTDITAYSFGFGYNIKLKDSITMEEISKNFDENKKINFLTPSCAFP NSSMSVYFQYLGEVYAEVDLDELVAISDEENSFQYSIMPKLMEASNYRKRRENNTGKLEQ IYNEQLIIKNLVNKNIDKLSKEE Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:18 2011 Seq name: gi|224461225|gb|ACDC01000177.1| Fusobacterium sp. 2_1_31 cont1.177, whole genome shotgun sequence Length of sequence - 1345 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1344 1771 ## COG3210 Large exoproteins involved in heme utilization or adhesion Predicted protein(s) >gi|224461225|gb|ACDC01000177.1| GENE 1 3 - 1344 1771 447 aa, chain - ## HITS:1 COG:FN0290 KEGG:ns NR:ns ## COG: FN0290 COG3210 # Protein_GI_number: 19703635 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 271 284 545 727 371 77.0 1e-102 TAINADFEISGKKTSAEDLGFNTDISKAQEKTKDEERHLDAELHTDLLGKDKQEELKKAG GIISDLTTALGNKGQTDALENKGKTEGNFLERYKQLSMVRAIGDQVEKNPEYLSILDKKA IKNEKIDDDVQKEQLSVMNKLLNDALRAKGYAGPDIKMVLTDVTDPNGPFYTDTLTNVVV FDRKQLANLDRDKILNILGHEFGHYSKEDNKTGNQTIANYSEDKLEDRTKAMVAKEATED TLASIRNNPNVITGEEGKKLAESVPMEKREYQTLEWYVTGSAANEEVIGGIRASFTNSIY QSFDLKNDRAVKYTAITTSIGGGNSDVSAGIGLGLYLSDSPEEISKLTISHGGSIEILKT VSIGMDFLSENKEGQNFLQKLSNVKGVRLYIGKSIPISSDSLLEKFLPAKLEKHISVLDI GNINIIKDRTILDYWEDESIPSGARAY Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:18 2011 Seq name: gi|224461224|gb|ACDC01000178.1| Fusobacterium sp. 2_1_31 cont1.178, whole genome shotgun sequence Length of sequence - 1310 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 + CDS 5 - 616 670 ## COG0732 Restriction endonuclease S subunits + Prom 633 - 692 3.3 2 1 Op 2 . 
+ CDS 726 - 1308 614 ## COG0732 Restriction endonuclease S subunits Predicted protein(s) >gi|224461224|gb|ACDC01000178.1| GENE 1 5 - 616 670 203 aa, chain + ## HITS:1 COG:MA2415 KEGG:ns NR:ns ## COG: MA2415 COG0732 # Protein_GI_number: 20091246 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanosarcina acetivorans str.C2A # 1 201 185 387 391 63 25.0 2e-10 MFGDIKTNNKNWEIVKLEKYINIIGGYAFKNIDFKSTGIPLIRIGNINSGQFKSTNLVFI KENKKFEKFKVFPNDILISLTGTVGKDDYGNACILGNSYSEYYLNQRNAKIEIIDKINKN FFLEIIKIKEVKKKLTGISRGIRQANISNKDIYNLSIPLPPIELQNKFAERVEKIEKLKF EIEKSIEIAQNLYDSLISKYFDN >gi|224461224|gb|ACDC01000178.1| GENE 2 726 - 1308 614 194 aa, chain + ## HITS:1 COG:XF0296 KEGG:ns NR:ns ## COG: XF0296 COG0732 # Protein_GI_number: 15836900 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Xylella fastidiosa 9a5c # 15 169 23 172 442 86 31.0 4e-17 MNKNIQYRKLTDICEIITGEWGTEISENSQNIASIIRTTNFLNNGKIDIENKELIKREID KKKIEQKQLKRGDIIIEKSGGSPNQPVGRVVFFDLNSNEIFLCNNFTSILRVKEDINSKY VFYFFRNSYKNKKVLKFQNKTTGIINLKLQNYLNESHIFLPELKIQNKRVDILDNLENII EKNQNYLIHLRELT Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:19 2011 Seq name: gi|224461223|gb|ACDC01000179.1| Fusobacterium sp. 2_1_31 cont1.179, whole genome shotgun sequence Length of sequence - 1230 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 40 - 687 657 ## COG1272 Predicted membrane protein, hemolysin III homolog - Prom 765 - 824 5.0 + Prom 801 - 860 6.7 2 2 Tu 1 . 
+ CDS 1023 - 1230 163 ## COG0566 rRNA methylases Predicted protein(s) >gi|224461223|gb|ACDC01000179.1| GENE 1 40 - 687 657 215 aa, chain - ## HITS:1 COG:FN1885 KEGG:ns NR:ns ## COG: FN1885 COG1272 # Protein_GI_number: 19705190 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Fusobacterium nucleatum # 1 215 1 215 215 323 91.0 2e-88 MRFNRRLTFSEELGNTVTHGVMSAATLVLLPIGSLWGYFHGGYASAVGISIFIASLFLMF LSSTLYHSMYHNSKHKSIFRILDHIFIYVAIAGSYTPVALVIIGGWKGILIVVIQWTIVL VGILYKSLATRAMPKLSLTLYLVMGWIAIFFFPTLLRKANTVFLVLVVLGGVMYSIGAYF FAHDYKKYYHMIWHIFINIAAILHIIGIGFFLYRK >gi|224461223|gb|ACDC01000179.1| GENE 2 1023 - 1230 163 69 aa, chain + ## HITS:1 COG:FN0875 KEGG:ns NR:ns ## COG: FN0875 COG0566 # Protein_GI_number: 19704210 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Fusobacterium nucleatum # 1 69 78 146 261 104 84.0 3e-23 MNEKIFQELSSQENSQGIIIVYSKKNNDLNSLSNNLVILDDVADPGNLGTIIRLCDATNF KDIILTKGT Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:20 2011 Seq name: gi|224461222|gb|ACDC01000180.1| Fusobacterium sp. 2_1_31 cont1.180, whole genome shotgun sequence Length of sequence - 1046 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 5 - 952 1282 ## COG0006 Xaa-Pro aminopeptidase Predicted protein(s) >gi|224461222|gb|ACDC01000180.1| GENE 1 5 - 952 1282 315 aa, chain + ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 315 148 462 462 566 92.0 1e-161 MSFYLIKNIIKQRNIKNKIEIEEIEKGVNITKEMHLSAMKNVKAGMKEYELVAEVEKQPR KYNAYYSFQTILSKNGQILHNHSHLNTLKDGDLVLLDCGALTEEGYCGDMTTTFPVSGKF TERQKTIHNIVRDMFDRAKDLARAGITYKEVHLEVCKVLAENMKKLGLMKGEVEDIVSSG AHALFMPHGLGHMMGMTVHDMENFGEINVGYDEGEEKSTQFGLASLRLAKKLEVGNVFTI EPGIYFIPELFEKWKNEKLHQEFLNYDEIEKYMDFGGIRMERDILIQEDGTSRILGDKFP RTADEIEEYMKMYKK Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:21 2011 Seq name: gi|224461221|gb|ACDC01000181.1| Fusobacterium sp. 2_1_31 cont1.181, whole genome shotgun sequence Length of sequence - 957 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 8.3 1 1 Tu 1 . + CDS 95 - 898 1042 ## COG0561 Predicted hydrolases of the HAD superfamily Predicted protein(s) >gi|224461221|gb|ACDC01000181.1| GENE 1 95 - 898 1042 267 aa, chain + ## HITS:1 COG:FN0391 KEGG:ns NR:ns ## COG: FN0391 COG0561 # Protein_GI_number: 19703733 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 267 1 267 267 434 83.0 1e-121 MKYKLIVCDMDGTLLTSSHKISDHTANIIKKIEDSGVKFMIATGRPYLDARHYRDSLELK SYLITSNGARAHDEDNNPIVVENIPKELVKRLLNYKVGKDIHRNIYLDDDWIIEYEIDGL VEFHKESGYGFNIDDLSKYQNQEVAKIFFLGENKEIEELEKKMKKDFKDELSITVSSPFC LEFMKKGVNKAETLKKVLKILDIKPEEVIAFGDSMNDYEMLSLVGKPFIMGNANKRLIEA LPNVEVIGNNNEDGIGEKLQEIFNVEL Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:22 2011 Seq name: gi|224461220|gb|ACDC01000182.1| Fusobacterium sp. 
2_1_31 cont1.182, whole genome shotgun sequence Length of sequence - 945 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 207 162 ## COG2323 Predicted membrane protein 2 1 Op 2 . - CDS 207 - 659 455 ## FN0037 hypothetical protein - Prom 718 - 777 7.7 - Term 748 - 814 3.3 3 2 Tu 1 . - CDS 827 - 943 220 ## COG2323 Predicted membrane protein Predicted protein(s) >gi|224461220|gb|ACDC01000182.1| GENE 1 3 - 207 162 68 aa, chain - ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 68 1 68 210 96 82.0 1e-20 MELSYLNIAIKLIMGLLSSVLVINISGKGNLAPSSTMDQVSNYVLGGIVGRVIYAPNITV LQFFIVLM >gi|224461220|gb|ACDC01000182.1| GENE 2 207 - 659 455 150 aa, chain - ## HITS:1 COG:no KEGG:FN0037 NR:ns ## KEGG: FN0037 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 150 1 150 150 230 78.0 1e-59 MRFYSYNYLLEQIAKFDWWGAAFPLFLIICLIFTFFKYNKGHKDSKFRELAIIFTLTIIV VISIKITQYQKSHSNDNRYRQAVHFIEVIAEDLKTDKENIYINTSASIDGALVRIGTLYF RVISGDNGENYLLEKIDLENPKVELIEVNK >gi|224461220|gb|ACDC01000182.1| GENE 3 827 - 943 220 38 aa, chain - ## HITS:1 COG:FN0036 KEGG:ns NR:ns ## COG: FN0036 COG2323 # Protein_GI_number: 19703388 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 38 173 210 210 59 68.0 2e-09 IDKDTEWLETVLKEMGHDNISDIFLAEYDNGKITVVTY Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:25 2011 Seq name: gi|224461219|gb|ACDC01000183.1| Fusobacterium sp. 
2_1_31 cont1.183, whole genome shotgun sequence Length of sequence - 934 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 15 - 54 2.2 1 1 Op 1 . - CDS 125 - 502 427 ## gi|237738537|ref|ZP_04569018.1| conserved hypothetical protein 2 1 Op 2 . - CDS 528 - 932 432 ## gi|237738538|ref|ZP_04569019.1| predicted protein Predicted protein(s) >gi|224461219|gb|ACDC01000183.1| GENE 1 125 - 502 427 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738537|ref|ZP_04569018.1| ## NR: gi|237738537|ref|ZP_04569018.1| conserved hypothetical protein [Fusobacterium sp. 2_1_31] # 1 125 1 125 125 202 100.0 4e-51 MKQILYENNNNASYLINILLQVQQQVETVISWELSEFDFIIVDIGDFFNGIMPPEIEEVY NFGKKIEREHVIVVEHNYLLKILKNIRTVYYANMKTIIGNNAFSIKIFDGDIIEIRGNIE NNILL >gi|224461219|gb|ACDC01000183.1| GENE 2 528 - 932 432 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738538|ref|ZP_04569019.1| ## NR: gi|237738538|ref|ZP_04569019.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 134 1 134 134 169 100.0 6e-41 YEKEAKEEWRLKERERRKIREIYKSESISLEKMEEEQERYLIFIRNVYRTGKLSEEYFER FFSTYPLILRHLDAEELVLIEKNAEELGIEISDNIKYDIGYSLTNTQTEIMTEEEKDAIE EIRKKWNLKKVYED Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:38 2011 Seq name: gi|224461218|gb|ACDC01000184.1| Fusobacterium sp. 2_1_31 cont1.184, whole genome shotgun sequence Length of sequence - 898 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 31 - 90 9.1 1 1 Op 1 . + CDS 143 - 334 270 ## gi|237738535|ref|ZP_04569016.1| predicted protein 2 1 Op 2 . 
+ CDS 334 - 834 556 ## FN2112 hypothetical protein Predicted protein(s) >gi|224461218|gb|ACDC01000184.1| GENE 1 143 - 334 270 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738535|ref|ZP_04569016.1| ## NR: gi|237738535|ref|ZP_04569016.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 63 1 63 63 116 100.0 5e-25 MNWHSQGSIYGAAMIHFLDINDRKEIAVPNYDYKIGEMDGKDMRSIFERWKDKRTQEKRY GGK >gi|224461218|gb|ACDC01000184.1| GENE 2 334 - 834 556 166 aa, chain + ## HITS:1 COG:no KEGG:FN2112 NR:ns ## KEGG: FN2112 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 165 1 164 164 149 50.0 4e-35 MKKIFISLVSLLVFTSCVLHVYSFTSTNYNNDKISIKANLVDEQKENSPLNYIYIYDKRS NATEHHKIKILSPTIKIVSDGKEYVITPNSETIKVYKQGVVITDDFKAYIGKVQLDDGTI IDIPPLSFKKNVYVESYNPVTDTINAGARTKRLFSGTVEDYKKQKK Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:46 2011 Seq name: gi|224461217|gb|ACDC01000185.1| Fusobacterium sp. 2_1_31 cont1.185, whole genome shotgun sequence Length of sequence - 858 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 216 169 ## gi|237738533|ref|ZP_04569014.1| predicted protein - Prom 334 - 393 9.2 + Prom 229 - 288 12.0 2 2 Tu 1 . + CDS 373 - 804 521 ## Lebu_2085 hypothetical protein Predicted protein(s) >gi|224461217|gb|ACDC01000185.1| GENE 1 3 - 216 169 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237738533|ref|ZP_04569014.1| ## NR: gi|237738533|ref|ZP_04569014.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 71 1 71 72 129 100.0 8e-29 MKGASNANTVLSQLKGKIFYINKQKLNPFTAHSFIYAGQKITTPSYRKWQKIRNGYDDIL RLFNSDTYKDF >gi|224461217|gb|ACDC01000185.1| GENE 2 373 - 804 521 143 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2085 NR:ns ## KEGG: Lebu_2085 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 19 140 37 158 159 107 57.0 1e-22 MKKIFVILLVLFSVTVSAKTPTKKVTVKVYNSKEYSIILDKFLTRAEKYLETGDKKSLLD DYIKNIESSRVMKKALEDNYDKDKEAFSLVEVALFMKSSYSEGEDAKKLTDEDYKEMQEK FVKTKKYKQLEDYATIQTTTDVK Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:53 2011 Seq name: gi|224461216|gb|ACDC01000186.1| Fusobacterium sp. 2_1_31 cont1.186, whole genome shotgun sequence Length of sequence - 839 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 46 - 834 610 ## FN0484 lipase (EC:3.1.1.3) Predicted protein(s) >gi|224461216|gb|ACDC01000186.1| GENE 1 46 - 834 610 262 aa, chain + ## HITS:1 COG:no KEGG:FN0484 NR:ns ## KEGG: FN0484 # Name: not_defined # Def: lipase (EC:3.1.1.3) # Organism: F.nucleatum # Pathway: Glycerolipid metabolism [PATH:fnu00561]; Metabolic pathways [PATH:fnu01100] # 22 261 1 240 240 390 81.0 1e-107 MKKFFKILFFIIIISIAILWLVKIFFLTHKYQIKNYNEDKIEKDIVITFNGIYGYEKQLR FIDEKLAEDGYTVVNIQYPTVNENIAEMTEKYIAPNIEEQVKRLEQVNLERKSKNLPELK INFVVHSMGTCLLRYYLKENKLANLGKVVLITPPSHGSQLSDNPIADLIPYFIGPAVKDM KTNKDSFVNQLGNPDYPCYILIADSSNNFLFSLFIKGEDDGMVPLATAGLEGASLKTIKN TTHTSILEKQETVDEILQFLKN Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:58 2011 Seq name: gi|224461215|gb|ACDC01000187.1| Fusobacterium sp. 2_1_31 cont1.187, whole genome shotgun sequence Length of sequence - 785 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 785 1163 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461215|gb|ACDC01000187.1| GENE 1 2 - 785 1163 261 aa, chain + ## HITS:1 COG:HI1731a KEGG:ns NR:ns ## COG: HI1731a COG5295 # Protein_GI_number: 16273668 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Haemophilus influenzae # 4 139 800 942 1020 65 42.0 1e-10 DATGTGVKTGTPSAQLVKNGSTVSYVAGDNLTVAQDVTAGDHKYTYSLNKVLKDLTSAEF KTSAGDKTVINGDGLTVSPATPGTAPISITKDGISAGDKKVTNVAPGTISSTSTDAINGS QFHKLATNTVQLGGDNSTVTATQQLDKTGGIKFDIVGANGITTEAKDGKVTVKVDSSTIG ANAKLSYTANGAAPKQEVTLANGLDFKNGNFTTATVGANGEVKYDTVTQGLTVTDGKAGL PNPATPGATTPNGLVTAQDVA Prediction of potential genes in microbial genomes Time: Fri May 20 00:11:58 2011 Seq name: gi|224461214|gb|ACDC01000188.1| Fusobacterium sp. 2_1_31 cont1.188, whole genome shotgun sequence Length of sequence - 769 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 767 1299 ## FN0254 hypothetical protein Predicted protein(s) >gi|224461214|gb|ACDC01000188.1| GENE 1 2 - 767 1299 255 aa, chain - ## HITS:1 COG:no KEGG:FN0254 NR:ns ## KEGG: FN0254 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 255 814 1051 1677 131 41.0 2e-29 NVKITNKGKIDAGKNSYAIYGKDVQLTSGSELNIGDNGVGIFSTSTTPATPNIDIQAGAK INLGKDEAVGVFLGTDAATGVQANGVRINDAGSIMNIGDNSYGYVLKGTGTTFTNSSSGS VTLGTKSVYLYSDDTTGNITNNVALTSNGSTAGTALTSATGGQNYGIYSAGTVVNNANID FSKGIGNVGIYSIKGGTATNNATITVGDSNAQGNLYSLGMAAGYARTDSGNIINNGTINV VGKDAIGMYASGPGS Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:04 2011 Seq name: gi|224461213|gb|ACDC01000189.1| Fusobacterium sp. 
2_1_31 cont1.189, whole genome shotgun sequence Length of sequence - 753 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 696 857 ## COG0616 Periplasmic serine proteases (ClpP class) Predicted protein(s) >gi|224461213|gb|ACDC01000189.1| GENE 1 3 - 696 857 231 aa, chain - ## HITS:1 COG:FN0873 KEGG:ns NR:ns ## COG: FN0873 COG0616 # Protein_GI_number: 19704208 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 58 231 1 174 494 281 88.0 9e-76 MVILYALLQAVIISIVIIIAICILILLVKRKFKNKDVISLKGVKTVVFNIGELVEDYMVS AVSINKALSHDVTLKALENLVDDKKIEKIIIDVDEVDLSRVHIEEIKEIFKKLSVDKEII AIGTTFDEYSYQIALLADKIYMLNTKQSCLYFRGYEYKEPYFKNVLATLGVTVNTLHIGD YKVAGESFSHDKMTEEKKESLMNIKETLFQNFINLVKEKRKIDITNEILSG Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:04 2011 Seq name: gi|224461212|gb|ACDC01000190.1| Fusobacterium sp. 2_1_31 cont1.190, whole genome shotgun sequence Length of sequence - 733 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 33 - 707 883 ## gi|237738528|ref|ZP_04569009.1| predicted protein Predicted protein(s) >gi|224461212|gb|ACDC01000190.1| GENE 1 33 - 707 883 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738528|ref|ZP_04569009.1| ## NR: gi|237738528|ref|ZP_04569009.1| predicted protein [Fusobacterium sp. 
2_1_31] # 1 224 1 224 224 416 100.0 1e-115 MKKGIKILAFMLLAVFFTGCFASNVDLKAQKIYAREYKGMKIELSRSDLSGIFVDIQNMS NKDITIVWKESTLGGSRIIRHDAIVYPALNDENTVLTELQKKTFVIHRAEDFYYVDPVLY AQSGVRIKPLKFPVELKLVIKTNGDKETLSIFLDNNYRSDEDANSQRYKEDAYQIRKRED AEKLNKDYRRTKINRRDKVDDLPEAKVIKENPPVEDELYINHRK Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:14 2011 Seq name: gi|224461211|gb|ACDC01000191.1| Fusobacterium sp. 2_1_31 cont1.191, whole genome shotgun sequence Length of sequence - 712 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 36 - 479 319 ## COG0218 Predicted GTPase 2 2 Tu 1 . - CDS 626 - 706 95 ## Predicted protein(s) >gi|224461211|gb|ACDC01000191.1| GENE 1 36 - 479 319 147 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 147 48 194 194 263 97.0 6e-71 MKLARTSKTPGRTQLINYFLINDEFYIVDLPGYGFAKVPKEMKKQWGQTMERYIASKRKK LVFVLLDIRRVPSDEDIEMLEWLEYNEMDYKIIFTKIDKLSNNERAKQLKAIKTRLVFEK EDVFFHSSLTNKGRDEILTFMEEKLNN >gi|224461211|gb|ACDC01000191.1| GENE 2 626 - 706 95 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWMHNRDINAAKNILKEGLRILGISA Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:18 2011 Seq name: gi|224461210|gb|ACDC01000192.1| Fusobacterium sp. 2_1_31 cont1.192, whole genome shotgun sequence Length of sequence - 659 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 41 - 439 584 ## FN2052 hypothetical protein - Prom 480 - 539 8.4 Predicted protein(s) >gi|224461210|gb|ACDC01000192.1| GENE 1 41 - 439 584 132 aa, chain - ## HITS:1 COG:no KEGG:FN2052 NR:ns ## KEGG: FN2052 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 132 1 119 119 116 88.0 3e-25 MKIKFILAAMLALGSLSYSAEVTDTVAQEVINEVKNIEAEYQALMQKEAERKEEFIQEKA NLEKEVKELKEKQLGREELYAKLKEDSKIRWHRDEYKKLLKRFDEYYNKLEQKIADKEQQ IVELTKLLEVLN Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:21 2011 Seq name: gi|224461209|gb|ACDC01000193.1| Fusobacterium sp. 2_1_31 cont1.193, whole genome shotgun sequence Length of sequence - 655 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 78 - 137 8.6 1 1 Tu 1 . + CDS 174 - 654 909 ## gi|237738525|ref|ZP_04569006.1| predicted protein Predicted protein(s) >gi|224461209|gb|ACDC01000193.1| GENE 1 174 - 654 909 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738525|ref|ZP_04569006.1| ## NR: gi|237738525|ref|ZP_04569006.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 160 6 165 166 161 100.0 1e-38 MKKLVLIIGLILGLSAMAENNTTKVNIKKAEETVKSDVKKDFEKIKGKADTKAEAAKVNL NKDFEKTSTKSGAAKVDLKKDVDGIDADIKKDYEAVKDKVVTKTDAAGTDVKKGYEKVKD KVVDKLEAAKADVKENYEAAKTDVKKGFEKVKDKVEGKTE Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:29 2011 Seq name: gi|224461208|gb|ACDC01000194.1| Fusobacterium sp. 2_1_31 cont1.194, whole genome shotgun sequence Length of sequence - 634 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 632 859 ## NT05HA_0523 autotransporter adhesin Predicted protein(s) >gi|224461208|gb|ACDC01000194.1| GENE 1 2 - 632 859 210 aa, chain + ## HITS:1 COG:no KEGG:NT05HA_0523 NR:ns ## KEGG: NT05HA_0523 # Name: not_defined # Def: autotransporter adhesin # Organism: A.aphrophilus # Pathway: not_defined # 3 190 1404 1597 2065 89 40.0 1e-16 TANNLIDKGMNFSADDYDPATANTTVSKKLGERLEIVGGADKTKLSDNNIGSVVDNTGKI NVKLAKELKDLTSAEFKTPAGDKTVINGDGLTVSPATPTTSPISVTKDGISAGDKKVTNV APGTISKTSTDAINGSQLYNLSSNTIQLGGDNASTTDKQTLDKSGGIKFDIVGANGITTE AKDGKVTVKVDSSTIGTNSKLKYTANGDAP Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:33 2011 Seq name: gi|224461207|gb|ACDC01000195.1| Fusobacterium sp. 2_1_31 cont1.195, whole genome shotgun sequence Length of sequence - 621 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 11 - 70 15.8 1 1 Tu 1 . + CDS 178 - 525 240 ## gi|237738523|ref|ZP_04569004.1| predicted protein Predicted protein(s) >gi|224461207|gb|ACDC01000195.1| GENE 1 178 - 525 240 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738523|ref|ZP_04569004.1| ## NR: gi|237738523|ref|ZP_04569004.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 115 12 126 126 134 100.0 3e-30 MLKNIGFILFCYLILVIIQIIYLYKSYEKKKLLLFIGNKILLLVSLIIEYYSNKYFGRYG IFVIFYVFGIYILNDNISTTQEEDLLEKEKLLIIDFMFFMLIGAILMSLSGILKI Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:39 2011 Seq name: gi|224461206|gb|ACDC01000196.1| Fusobacterium sp. 2_1_31 cont1.196, whole genome shotgun sequence Length of sequence - 610 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 38 - 608 398 ## gi|237738522|ref|ZP_04569003.1| predicted protein Predicted protein(s) >gi|224461206|gb|ACDC01000196.1| GENE 1 38 - 608 398 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237738522|ref|ZP_04569003.1| ## NR: gi|237738522|ref|ZP_04569003.1| predicted protein [Fusobacterium sp. 2_1_31] # 1 190 1 190 191 298 100.0 9e-80 MKNRIFYFILFSIFLISCTDLKFIGKPAYVLPEYNTVIYGPIENGKVNRMGVSKNNIEKM NNNILNKYGITFQSSNRIYAMGNSTKYYYIKFYNDFKFTLKGKEYIIQKEKIKIKEDKSI IKYEYPIPVDITKNDENEYILDIGEIEILDRNGKIIKNKEKIPPFLFKKTLYVSLISKNI YYNGWAEDYP Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:49 2011 Seq name: gi|224461205|gb|ACDC01000197.1| Fusobacterium sp. 2_1_31 cont1.197, whole genome shotgun sequence Length of sequence - 606 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 389 561 ## COG3210 Large exoproteins involved in heme utilization or adhesion - Prom 457 - 516 4.8 Predicted protein(s) >gi|224461205|gb|ACDC01000197.1| GENE 1 2 - 389 561 129 aa, chain - ## HITS:1 COG:FN0290 KEGG:ns NR:ns ## COG: FN0290 COG3210 # Protein_GI_number: 19703635 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Fusobacterium nucleatum # 1 129 154 282 727 177 82.0 6e-45 MIESKQDKSENKDSTYGGGFSIDLANPSNFSANINGSKGNGEKEWVNTQTSLIAKNGGKI DTENLTNIGAVIGSESEINKLKVSANKVVVKDLEDKNKYENIGGGVSFGTDVPNVSVKHD KVDKEQINR Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:49 2011 Seq name: gi|224461204|gb|ACDC01000198.1| Fusobacterium sp. 2_1_31 cont1.198, whole genome shotgun sequence Length of sequence - 603 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 31 - 601 828 ## gi|262066234|ref|ZP_06025846.1| outer membrane protein Predicted protein(s) >gi|224461204|gb|ACDC01000198.1| GENE 1 31 - 601 828 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262066234|ref|ZP_06025846.1| ## NR: gi|262066234|ref|ZP_06025846.1| outer membrane protein [Fusobacterium periodonticum ATCC 33693] # 1 190 1 196 2287 310 92.0 3e-83 MGNNSLSNTEKSLRSIAKRYENVKYSVSLAVLFLMNGASAFSDTNAIQETDKQKEVAKDS QAGKTVVKETKAEKKQTSQKLKASWVNMQFGANDMYSNYFAVPKAKVEKTSLVTSEKTVL VASADNTASLPMFAKLLTDIEETTENRTEVLTTIAKKEETPTMEEIKASKQELRSSVGNL QDKIDTARRE Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:57 2011 Seq name: gi|224461203|gb|ACDC01000199.1| Fusobacterium sp. 2_1_31 cont1.199, whole genome shotgun sequence Length of sequence - 558 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 384 331 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|224461203|gb|ACDC01000199.1| GENE 1 1 - 384 331 127 aa, chain + ## HITS:1 COG:alr7153 KEGG:ns NR:ns ## COG: alr7153 COG0675 # Protein_GI_number: 17233169 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 4 126 244 364 408 105 43.0 1e-23 QKRITRKRNNRINDYLSKAARTIVNYCLNNDIGKLVLGYNEDFQRNSNIGSINNQNFVNI PYGKLRDKLIYLCKLYGIEFKLQEESYTSKASFFDGDEIPIYDKENPQEYIFSGKRIKRG LYQTSSR Prediction of potential genes in microbial genomes Time: Fri May 20 00:12:58 2011 Seq name: gi|224461202|gb|ACDC01000200.1| Fusobacterium sp. 2_1_31 cont1.200, whole genome shotgun sequence Length of sequence - 549 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 185 258 ## 2 1 Op 2 . 
+ CDS 245 - 415 176 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins Predicted protein(s) >gi|224461202|gb|ACDC01000200.1| GENE 1 3 - 185 258 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KLVEEGKIEYTEEYLSLEKEADKWKRENTTTIITLLLLLVFSLPALAVQALTTTQMRENT >gi|224461202|gb|ACDC01000200.1| GENE 2 245 - 415 176 56 aa, chain + ## HITS:1 COG:FN2048 KEGG:ns NR:ns ## COG: FN2048 COG2885 # Protein_GI_number: 19705338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 1 55 27 81 151 90 80.0 9e-19 MTIVLDERALNFDFDKSVVKPQYFEMLNNLKDFIEQNNYELTIEGHTDSIGSNQYK Prediction of potential genes in microbial genomes Time: Fri May 20 00:13:03 2011 Seq name: gi|224461201|gb|ACDC01000201.1| Fusobacterium sp. 2_1_31 cont1.201, whole genome shotgun sequence Length of sequence - 524 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 393 473 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Term 202 - 247 4.6 2 2 Tu 1 . 
- CDS 345 - 524 275 ## Predicted protein(s) >gi|224461201|gb|ACDC01000201.1| GENE 1 1 - 393 473 130 aa, chain + ## HITS:1 COG:ECs0274 KEGG:ns NR:ns ## COG: ECs0274 COG1974 # Protein_GI_number: 15829528 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 113 1 113 237 199 92.0 1e-51 MSTKKKPLTQEQLEDARRLKAIYEKKKNELGLSQESVADKMGMGQSGVGALFNGINALNA YNAALLTKILKVSVEEFSPSIAREIYEMYEAVSMQPSLRSEYEYPVFLMFRQGCSHLSLE PLPKVMRRDG >gi|224461201|gb|ACDC01000201.1| GENE 2 345 - 524 275 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no EQPAQGQRELTFRQESLAWSLLVRSWITFNLKPECRITGFFGCAYPSLRITFGKGSKLR Prediction of potential genes in microbial genomes Time: Fri May 20 00:13:07 2011 Seq name: gi|224461200|gb|ACDC01000202.1| Fusobacterium sp. 2_1_31 cont1.202, whole genome shotgun sequence Length of sequence - 503 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 402 436 ## COG0675 Transposase and inactivated derivatives 2 2 Tu 1 . - CDS 399 - 503 72 ## Predicted protein(s) >gi|224461200|gb|ACDC01000202.1| GENE 1 1 - 402 436 133 aa, chain + ## HITS:1 COG:Ta1471 KEGG:ns NR:ns ## COG: Ta1471 COG0675 # Protein_GI_number: 16082436 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermoplasma acidophilum # 2 124 72 194 237 138 54.0 3e-33 IHNKIRNKRKDFVNKLSTKIINNHDIICIEDLNIKGMLKNHKLAKSISDVSWSEFVRQLE YKANWYGRKIIKVPTFYPSSKTCSSCGNIKETLTLSERIYHCECCGLEIDRDYNASINIL RKGLEILKEEKVS >gi|224461200|gb|ACDC01000202.1| GENE 2 399 - 503 72 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no VPLIEVGASWEVCAFVSHIYLPSSSGSPYPDIFF