Prediction of potential genes in microbial genomes
Time: Fri May 20 01:19:23 2011
Seq name: gi|224461501|gb|ACDD01000001.1| Fusobacterium sp. 3_1_5R cont1.1, whole genome shotgun sequence
Length of sequence - 32372 bp
Number of predicted genes - 28, with homology - 28
Number of transcription units - 9, operones - 4 average op.length - 5.8

  N   Tu/Op    Conserved   S        Start     End  Score
               pairs(N/Pv)
                        + TRNA    17 -    92   75.8 # Thr CGT 0 0
  1   1 Op  1 .        - CDS    134 -   298    182 ## gi|257451413|ref|ZP_05616712.1| hypothetical protein F3_00005
  2   1 Op  2 .        - CDS    291 -   983    331 ## gi|257451414|ref|ZP_05616713.1| hypothetical protein F3_00010
  3   1 Op  3 .        - CDS   1008 -  1526    483 ## COG0703 Shikimate kinase
                        - Prom  1557 -  1616    8.2
                        - Term  1598 -  1644    9.1
  4   2 Tu  1 .        - CDS   1653 -  3125   2040 ## COG1982 Arginine/lysine/ornithine decarboxylases
                        - Prom  3214 -  3273   15.3
                        + Prom  3214 -  3273   15.2
  5   3 Tu  1 .        + CDS   3302 -  4240   1333 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase
                        + Term  4468 -  4496   -1.0
  6   4 Op  1 4/0.000  + CDS   4643 -  5371    935 ## COG0310 ABC-type Co2+ transport system, permease component
  7   4 Op  2 8/0.000  + CDS   5374 -  5676    403 ## COG1930 ABC-type cobalt transport system, periplasmic component
  8   4 Op  3 34/0.000 + CDS   5673 -  6407    419 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters
  9   4 Op  4 .        + CDS   6421 -  7248    372 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
                        + Term  7254 -  7287    1.4
 10   5 Tu  1 .        - CDS   7286 -  8365    988 ## gi|257451422|ref|ZP_05616721.1| hypothetical protein F3_00050
                        - Prom  8395 -  8454    7.0
                        + Prom  8410 -  8469    8.9
 11   6 Op  1 .        + CDS   8498 -  9277    438 ## PROTEIN SUPPORTED gi|163802692|ref|ZP_02196583.1| 30S ribosomal protein S21
 12   6 Op  2 .        + CDS   9336 -  9848    810 ## gi|257451424|ref|ZP_05616723.1| hypothetical protein F3_00060
                        + Term  9865 -  9900    4.2
                        + Prom  9886 -  9945   12.9
 13   7 Op  1 .        + CDS   9966 - 10862   1220 ## COG0053 Predicted Co/Zn/Cd cation transporters
 14   7 Op  2 .        + CDS  10879 - 11178    501 ## gi|257451426|ref|ZP_05616725.1| hypothetical protein F3_00070
 15   7 Op  3 .        + CDS  11201 - 11800    328 ## COG1011 Predicted hydrolase (HAD superfamily)
 16   7 Op  4 .        + CDS  11822 - 13159   1668 ## COG0534 Na+-driven multidrug efflux pump
 17   7 Op  5 .        + CDS  13167 - 15797   2553 ## FN1150 hypothetical protein
 18   7 Op  6 .        + CDS  15794 - 18817   2662 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains)
 19   7 Op  7 .        + CDS  18822 - 19640    883 ## COG0207 Thymidylate synthase
 20   7 Op  8 1/1.000  + CDS  19655 - 20704   1163 ## COG0820 Predicted Fe-S-cluster redox enzyme
 21   7 Op  9 1/1.000  + CDS  20714 - 22849   2791 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein)
 22   7 Op 10 1/1.000  + CDS  22851 - 25556   2018 ## COG0210 Superfamily I DNA and RNA helicases
 23   7 Op 11 28/0.000 + CDS  25553 - 26716   1107 ## COG0420 DNA repair exonuclease
 24   7 Op 12 1/1.000  + CDS  26706 - 29471   2604 ## COG0419 ATPase involved in DNA repair
 25   7 Op 13 .        + CDS  29485 - 30120    473 ## COG1636 Uncharacterized protein conserved in bacteria
 26   7 Op 14 .        + CDS  30113 - 30733    752 ## FN0520 hypothetical protein
                        + Term 30943 - 30977    0.3
                        - Term 30735 - 30763    2.3
 27   8 Tu  1 .        - CDS  30764 - 31279    683 ## COG0511 Biotin carboxyl carrier protein
                        - Prom 31315 - 31374    8.3
 28   9 Tu  1 .        - CDS  31417 - 32307    857 ## COG1032 Fe-S oxidoreductase

Predicted protein(s)
>gi|224461501|gb|ACDD01000001.1| GENE 1 134 - 298 182 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451413|ref|ZP_05616712.1|
## NR: gi|257451413|ref|ZP_05616712.1| hypothetical protein F3_00005 [Fusobacterium sp. 3_1_5R]
# 1 54 1 54 54 63 100.0 4e-09
MNKKKLYIYIVISIIAGIGLFFYDKTSFYDYIKIMGTAFLVCLFFITKHYFFKK
>gi|224461501|gb|ACDD01000001.1| GENE 2 291 - 983 331 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451414|ref|ZP_05616713.1|
## NR: gi|257451414|ref|ZP_05616713.1| hypothetical protein F3_00010 [Fusobacterium sp.
3_1_5R] # 1 230 1 230 230 424 100.0 1e-117 MKYLCMLCVIVLCLSCGRAETFQTTKAEKWILLENAYLFQEAESLEKIKKLKTNIQRKKL LNNEVAIQEDKEWEETYLDFSLDYSKILKASDSLFQYEILDIEKSADGYVEHLNNGMRIE FSSSFIMIHFPKEESDFHLAAKFLYSKNRLTTFQGFEIGGIFIPKEAELDLLSSKNILHI YDSPLSDRAWEKIFLLSLEYCNQSLTRNSKAITKISPKNFTLKLERKIYE >gi|224461501|gb|ACDD01000001.1| GENE 3 1008 - 1526 483 172 aa, chain - ## HITS:1 COG:FN0822 KEGG:ns NR:ns ## COG: FN0822 COG0703 # Protein_GI_number: 19704157 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Fusobacterium nucleatum # 1 172 1 172 172 201 62.0 4e-52 MKENIALIGFMGSGKTTVGRLLAKQLDMKFVDVDKVIAAQEKKSISDIFQENGEQYFRQK EREIILQESTKNNVVISTGGGAIIDNENIKNLQNTCFIVYLDADVHCIYDRVKNSKHRPL LQNIENLEAHISTLLEKRRFLYEFSSDYKVSIHLESNLYDTVEEIKKIYIDS >gi|224461501|gb|ACDD01000001.1| GENE 4 1653 - 3125 2040 490 aa, chain - ## HITS:1 COG:FN0501_1 KEGG:ns NR:ns ## COG: FN0501_1 COG1982 # Protein_GI_number: 19703836 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Fusobacterium nucleatum # 1 489 1 489 503 774 75.0 0 MSKLDQSKTPLFSVLKDEYAGNNTLPFHVPGHKRGKGADQEFINFIGEGPFTIDVTIFPM VDGLHHPHGCIKEAQELAADAYDVKHSFFAVNGTSGAIQAMIMSVVKPGEKLLVPRNVHK SVSAGIILSGAHPVYMNPEIDDELGIAHGVRPQTVADMLAQDSEIKAVLIINPTYYGVAT DIKKIADIVHSYDIPLIVDEAHGPHLHFHEDLPMSAVDAGADICSQSTHKILGSLTQMSL LHVNSNRVSVERVKEILSMLHTTSPSYPLMASLDCARRQIATEGKELLTKAISLAHYFRE EANKIPGIYCFGEEIIGREGAFAFDPTKITFTAKELGFTGTELEDMLTADYHIQMELADF YHTLGLVTIGDTKESINQLLSALQDISRRFSNQGRKLTHKLLKMPQIPEQVLIPREAFYR RKIKTSFDDSIGKVCGELVMAYPPGIPIIIPGERITKEILDYVKDMKVAKLQLQGMEDPD LKTINIITEI >gi|224461501|gb|ACDD01000001.1| GENE 5 3302 - 4240 1333 312 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 2 306 6 347 355 172 33.0 6e-43 
MLKIGNIEIKVPIFQGGMAIGVSMAELAAAVSNEGGVGVIAGTGMTKEELKKEIQKAKEK LVGIGKVLGVNIMVATTNFMELVDAAIESGVEFIIFGAGFSRDIFDYVKGTGTQAIPIVS SLKLAKISEKLGAPAVIVEGGNAGGHLGSELDSWDIVPEVAEHIHIPVIGAGGVITPKDG ERMLSLGAQGIQMGSRFVASKECGVSEVFKEMYKKVKEGEIVKIMSSAGLPANAIVSPYV KKVLDEVTEFPRNCFACLKKCTHKFCVNERLQMAHHGNYEEGIFFAGRDAWKITEILSVK EIMEKFQVLFQD >gi|224461501|gb|ACDD01000001.1| GENE 6 4643 - 5371 935 242 aa, chain + ## HITS:1 COG:STM2023 KEGG:ns NR:ns ## COG: STM2023 COG0310 # Protein_GI_number: 16765353 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Salmonella typhimurium LT2 # 2 231 7 236 245 236 56.0 2e-62 MLKKRSSQVLLLFFLFFYLPKYAFSMHIMEGFLPPMWAGIWGIVCLPFLFLGFKKIQAKV EENPKLKILLAMAGAFAFVLSALKLPSVTGSCSHPTGVGLGAILFGPTVMSVLGIIVLIF QALLLAHGGITTLGANTFSMGIFGPIVSYFLYKSLQKAKVSRSVSVFLAAALGDLATYII TSLQLALAFPSPDGGLFLSFEKFLGIFAITQVPLAISEGLLTVIIFNILWKYNEDTLKDL GV >gi|224461501|gb|ACDD01000001.1| GENE 7 5374 - 5676 403 100 aa, chain + ## HITS:1 COG:alr3944 KEGG:ns NR:ns ## COG: alr3944 COG1930 # Protein_GI_number: 17231436 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 2 100 3 100 100 97 52.0 6e-21 MQKKESIMKKNLILLFGVILMVILPLCFVSGEFGGADDQAEGVIEEVDASYHPWFESLWE PPSGEIESLLFALQAAIGAGVICYFIGYQIGKSKRDDEEE >gi|224461501|gb|ACDD01000001.1| GENE 8 5673 - 6407 419 244 aa, chain + ## HITS:1 COG:MJ1089 KEGG:ns NR:ns ## COG: MJ1089 COG0619 # Protein_GI_number: 15669277 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanococcus jannaschii # 4 244 9 259 268 101 29.0 1e-21 MISLDKLAYTSTIRMKNPNEKLCFSILFLFLCIFSNDIVMSSLVFLTMGIMTVFVAKISL RVYLKLLLLPLFFTLFGVLGVVFAQWSSSLSFQENNMLIYLSLLLKALASTSCLYFLILT TPMVDVIYSLQCIRLPKLFLEIMILMYRYIFVLLEFMTIIYISQDSRLGYSSYKKSFYSM GKLVSALFLSSYQKSMECYSSMESRAYQGEIKVLDLHYRKNSKNYVYMILMAILYFAIVC LVKE >gi|224461501|gb|ACDD01000001.1| GENE 9 6421 - 7248 372 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 13 272 137 393 398 147 34 6e-35 MKEYIIETKDLYYHYPDGTQALKGISLAIEKGKKIAIIGVNGSGKSTLFLNLNGVLKATS GKIFYEGKELKYDKRSLMEVRKNVGIVFQNPESMLFSSNVFQEVSFGPMNLGYPVEEVKK QVVSSLEEVNMLEFQEKSVHFLSYGQKKRVSIADILAMKPKVMILDEPTSSLDPRHTRQL KELFEDLHRKGITVIISTHDVNLAYEWADEILVMKDGKVVEFGASEEIFVKKELLFDCYL EQPYLVSLYEELKKKGVLQKIGKLPKTQKELLEMV >gi|224461501|gb|ACDD01000001.1| GENE 10 7286 - 8365 988 359 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451422|ref|ZP_05616721.1| ## NR: gi|257451422|ref|ZP_05616721.1| hypothetical protein F3_00050 [Fusobacterium sp. 
3_1_5R] # 1 359 1 359 359 645 100.0 0 MKKNIYSYILYLCLAFIFSACTVIDINQASTYASKRQYHYSLLQLDSYLKKGNEVDPKVL QKYEQFWNEGNRYYDAIIQQMGIADLKQISLAKERKLLMHRHFASLPETIKSKLSSNIYT PMNITKLQKEAVDSYISLGDLIGNSSYSKRLHQNYAYEKAMKYSPNPSLDLQQKWNFSKN NLERNIYVRWNGYTDSFFQNILVTKIQNLLIDSDLFILGRSQNAQIYFDVDIENYQFSNN PASLKTETKYKEISVSYQEEKIKVPYQELTFTKKWYLSYILRYQLVDKNGNIIFSGYKPC KNQEEKIWKQFVVLDSRYPLNLPRNEQEPQGKMEEEFITDSFISSLKEIQFKLNKLSNY >gi|224461501|gb|ACDD01000001.1| GENE 11 8498 - 9277 438 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163802692|ref|ZP_02196583.1| 30S ribosomal protein S21 [Vibrio campbellii AND4] # 2 254 10 262 271 173 34 1e-42 MKFSFEEKGEGKTIVFVHSYLWDREMWREQIDLLSQKYRCISIDLPSHRECFEKLKKEYS LEDLSQDIIDFLEEKGIEKYHYIGLSVGGMLIPYLYEKDQNKIESFVMMDSYVGAEGSEK KALYFHLLDTIENIKKIPPVMAEQIAKMFFANERKNDSNPDYVAFVNRLQNFSEEQLEDI VILGRSIFGREDKRETLKKIIIPTTILVGEEDEPRPPYESEEMGHLFPNAKVIVIPKSGH ISNRDNASCVNRVLKNLFL >gi|224461501|gb|ACDD01000001.1| GENE 12 9336 - 9848 810 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451424|ref|ZP_05616723.1| ## NR: gi|257451424|ref|ZP_05616723.1| hypothetical protein F3_00060 [Fusobacterium sp. 
3_1_5R] # 1 170 1 170 170 324 100.0 1e-87 MRKNLLLGISFFALSTSMFALDGIVKFGFASNAGAYNGRSKSFESYAPNLAAEIRQGFVL GEVGAGIAYHGKVGDTGIANVPVYALLKWNVFPILPVKPYIVGKVGRVLKTNEDVKGSDP SGRGYYGVGAGIEVMDLEVEAMYSATKIRQDHRGKDWLNQVSLGVGYKIF >gi|224461501|gb|ACDD01000001.1| GENE 13 9966 - 10862 1220 298 aa, chain + ## HITS:1 COG:MA0549 KEGG:ns NR:ns ## COG: MA0549 COG0053 # Protein_GI_number: 20089438 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 1 285 16 299 311 207 39.0 1e-53 MLKNYKEVQKVLFVILLLNILVAGIKTVLGYLIHSSSMLADGIHSFSDGASNVVGILGIQ LSKKPEDEDHPYGHEKIEMLSSLVIGLLLLVLGVQVLIEGIKTFQSPRSPNISVESMLLL AVTLFINIAVSYFEEKRGKQLKSTILISDAMHTRSDIYVSIGVFFSLLAIKMGLPSYVDT IMSCVVSFFILHASWEILRDNVGILLDSKVLDREKIQKIILSHPEIKGVHKIRTRGTLAH VYMDLHILVDKNMSVEEAHCLSHHLEHDLQKEFEIEIQVLIHVEPYRKVCYVNQKKEI >gi|224461501|gb|ACDD01000001.1| GENE 14 10879 - 11178 501 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451426|ref|ZP_05616725.1| ## NR: gi|257451426|ref|ZP_05616725.1| hypothetical protein F3_00070 [Fusobacterium sp. 
3_1_5R] # 1 99 1 99 99 174 100.0 2e-42 MQVEMKKEVKEYLERKDADAILLEYMPPCSMCNGSTFHVVAHMVKIRDRSKIKDFATRIQ VNGVEIFIPKEIEHMKKIKLEFKKALLSKNGSIKVTYYE >gi|224461501|gb|ACDD01000001.1| GENE 15 11201 - 11800 328 199 aa, chain + ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 1 196 2 197 201 122 37.0 5e-28 MKNIIFDLGNVLVNFHPRDFVDKHVLEEKREKIFRLILQGEEWQKLDRGTITQQEALESF LRKMPEEKETICKIFPIYLTDCLSPNQENIKLVYELKKRGYSLYVLSNFHKNLFEKIEKE WGVFQQFDGKIISCYHHFLKPEKEIYELLFKTYQINPEESVFVDDSLENIEMARELGVLG IHLPIREELSKKLSFLLER >gi|224461501|gb|ACDD01000001.1| GENE 16 11822 - 13159 1668 445 aa, chain + ## HITS:1 COG:FN1151 KEGG:ns NR:ns ## COG: FN1151 COG0534 # Protein_GI_number: 19704486 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 6 445 9 448 448 409 53.0 1e-114 MRQTQEQKQLVVEVFRITIPAIMDLLAQTLLAFFDMLMVASLGASAVSAVGLGHAPVIAI VPAFMAVGMGTTSLVSRAYGANNIKEGKNAVIQSLLLCIPIALVITILMLWKAEWILQHV GRADDLDFIAAKQYYKVSVLSLLFICFNVIYFATYRAIGKTKVPMIINIVGIFMNIFFNW IFIFVLKQGVFGAAIATLLSKMFSFSCFSYFTFLSKKYWISLQIRDFSWDRIMAGRILKI GIPAAAEQLLLRFGMLFFEMMIISLGNISYAAHKIASNAEAFSYNLGYGFSVAAAALVGQ QLGKNSTKGAEYNAKVCTLMSLLVMSSFALLFFSIPHLIISIFTKEIELQNLSASALRIV SICQPFLAVSMVLAGALRGAGATKSVLLITVFGIFGVRLPLTYLFLNVWKTGLLGAWWIM TIDLAFRSAATYYVFKKGKWKYLKV >gi|224461501|gb|ACDD01000001.1| GENE 17 13167 - 15797 2553 876 aa, chain + ## HITS:1 COG:no KEGG:FN1150 NR:ns ## KEGG: FN1150 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 55 859 63 875 903 295 28.0 7e-78 MKTKKFRYLSYYDNLSDTILEYRKDSYIVVENNQVKSILLSQCYHFPIFELRPVIFSLEE FFSYLFVSSDVLLKDIKRIFLLYSCLTKEMKETWQIQSYFDFVDIANEFFLFYEEIQGKE EELEKIIQPWQEEKYSFFRQLKEILEEKKEEYLAKEFLWNREKYHPENLKEFSRIVFFDI PSFPRIFQELFSLLEKDFDLEFVLQVSKEDFDEEHYMLRQVSPVFFEGEFFCYEVGSEWE 
EALYLLSEREKEDFFAYSNSSYEKSFSKLFPRKFLDSSRNSFNHTKLYQFMDLQLTLLRE KEQEQKETLALDKVLSAIQKTVCREYYGFWEEDFLLIQNLLQEEYRFLSISLLQSSNYKS IIGDKESFIEKMTLFLQDLFQIETWKEGKDIYEYFEKQIDIQKWKEEEYPDVLDVFYEIL SRLYASQGKTHLLSYENYFEGNLGRNLYQLLYRSLDSIYIKSAQSFSEEKIELRDWHSLM YERKKEKQAIFLDLDDKSLPKLTKTISFLTEVQKQQLDCQTREESILVEKYRFYQAVYSQ KRVVFLVQKSEEKNKTLSSFVEEFLWKQGKKIEKSPYSKAFFLQSLRESFSSEVSWTKEF EEGKALALKKENEELLKDGKLSLGAYDWRDLKTCSKYFYFHKILGNSGRIEELSFGLSPR LLGIVCHRFLERIGREQWKIFLQERQFAISDTTLAQYLEEEFKKDSLKIPPFLKQYLRKI VYPRIIKNTRNFFQNLEKKYRLENISRFQGEKGIEKEGIYRGKELNVSFQGRADLVIEAE QGKEIIDYKTGKTIEDQLDFYAFLFYGEEQKVEGRYYNLWDGMFSQGKKKEELNSACLEE FFKNFEEDSFYHISEKKAFCTYCPYQKICRREEEVE >gi|224461501|gb|ACDD01000001.1| GENE 18 15794 - 18817 2662 1007 aa, chain + ## HITS:1 COG:FN1149 KEGG:ns NR:ns ## COG: FN1149 COG1074 # Protein_GI_number: 19704484 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Fusobacterium nucleatum # 1 1002 4 1054 1056 489 36.0 1e-138 MKKLVLKASAGTGKTYRLSLEYLLSLYRGVPYSEIFVMTFTRKATAEIRERILEFSLEIL SHTETGKDLLENLQKLDTELVFREEILRTAYYSMLKNKDKIRIYTIDSFFQMLFHKVVSP YYQIYSMKMIEKEEENKEFYKKILKQILSKREFFDKMKLFFDLSPEKNIENYLLLIQNMI RERWKFLLLREPYQKRERIAYERSLQEHIESFESIFRTLEEKKKKERGYFTQSFYQSFFE KGEEEKQEILKHERDTFFKTNVFDGRKLSTRGKDEEILALREELLEELESFRCDLAKEVY NEEMIFFEESLFAIFEEIYTLYDTYKRKEKIFNYDDIAVYTYLTLFQEDLHFVEGNAITD TLEEVLDLKIHSVFLDEFQDTSILQWKILSAFLERAKSVICVGDEKQSIYGWRGGEKKLF EDLPNILDAKVENLDTSYRSLSSIVDFTNDFFKSFPLLYQEEGIDWQFLESKSHKKQRGE VLSYFVEEEEALEKLGELIEEKYSGNYGSLSILARKNKTLLQISDFLEEKKIPYQLSLQK EYQEEATIDAFLSLFRYFCTGKYLYLVEFFRSSVLQASNEILKKLLTGQENMIQYIYSGK EWKEKPKGSQEVRTLYLEFQEKEGKIEDMWLHCIKLFSLTEYFNKDSHILACYSFQQSLS YYDSWFEYFEAFDKNQLVNLEAWEEESKDAIQLMSIHKSKGLEFDNVIYFEAKDSRKGNR EQSILFYFQMAEDYRSLEHYFLTRGKYRKYMDYLPEPFPDYLSNVEKKEREEEINTLYVA LTRPKHNLYLFFSETWKGRDLVEELTPSSSAMFLPENKGKKEEENRQGIVLAFEKEVKEF DKDEKQRPEKYTLLTELHRMEGLATHFFLEHLKYATEEEIEFAKKRVIQEYASYFGREKI 
EALFSKERIQQILKVDSRIFSKDWDYIYPEFSIISPFDQKKYVIDRLMIKKAGKNKKGLV YLVDYKTGGNDPKQLENYKHILQELLKEEEGEYEFETKFLELGREGE >gi|224461501|gb|ACDD01000001.1| GENE 19 18822 - 19640 883 272 aa, chain + ## HITS:1 COG:FN0240 KEGG:ns NR:ns ## COG: FN0240 COG0207 # Protein_GI_number: 19703585 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Fusobacterium nucleatum # 3 272 5 275 275 331 64.0 9e-91 MLFDEEYRKLVEYICEKGEMVEGKVRTVYADGTPAYYKQVVGYQFRLDNSGKEAFLITSR KAAWKSSIRELYWIWYLQSNNVDELVDLGCKFWNEWKQEDGTIGKAYGYQIGKKTFQYKS QLDYVIGEIKNNPNSRRILTEIWVPEDLDKMALTPCVHLTQWTVLNGKLYLEVRQRSCDV ALGLVANVFQYQVLHKLVARECNLNCGDLIWTIHNAHIYDRHLEDLQKQVRETGTEKPIL DLGEEGLENFHQKVTIENYKPLENNYKYEVAI >gi|224461501|gb|ACDD01000001.1| GENE 20 19655 - 20704 1163 349 aa, chain + ## HITS:1 COG:FN0526 KEGG:ns NR:ns ## COG: FN0526 COG0820 # Protein_GI_number: 19703861 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Fusobacterium nucleatum # 2 348 4 355 358 518 74.0 1e-147 MEKLNLLDLSKKELTEFLVAEGMKKFYGKEVFVWLHKKFARNIQEMTNLSLQNREILEEK TYIPYLNLLKHQVSKIDKTEKFLFQLEDGNTIETVLLRHRDQRNTLCISSQVGCPVKCSF CATGQDGFVRNLRVSEILNQVYTVERRLNKRGEKLTNLVFMGMGEPLINIEALLKALEIL SSEEGICISKRRITISTSGIVPAIERILMEKVPVELAVSLHSAINEKRDQIIPINKAYPL EDLAAVLGEYQRQTKRRLTFEYILIKDFNVSEGDANALADFAHQFDHVVNLIPCNPVADT GLERPSEKKIERFYDYLKNVRKVNVSLRQEKGTDIDGACGQLRQNQRKK >gi|224461501|gb|ACDD01000001.1| GENE 21 20714 - 22849 2791 711 aa, chain + ## HITS:1 COG:FN0525 KEGG:ns NR:ns ## COG: FN0525 COG0744 # Protein_GI_number: 19703860 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Fusobacterium nucleatum # 21 675 2 655 731 688 52.0 0 MKKIIKSLFLLSFLGVVGMGILVFSIVMKYKMELPDVQELVENYEVSAPSVIYDRNGEIV DTLYQEARDNVKLEEVPEYSKQAFVAIEDKRFYEHHGIDPRGLLRAVFVNLRSGHARQGA SSITQQLAKNAFLTMDRTLSRKIKEMIITIEIERVYTKDEILEKYLNEIYFGSGAYGLKT AAKQFFHKDIQDINLAEAAMLAGVPNRPEGYNPRRKLENAIKRMNIVLSEMREDGKITEE 
EYQEALKQKFISEKEASAKDKKNPKVTIIYPRKDTRHYENPEFTKLIEDFLLKKFDANTV YNKGLKIYSSLDVAMQKSARTAFNQYPLLRARNGLNGAMVTIDPFSGQIITMVGGKDFKI GNFNRAIMAKRQFGSSFKPFVYFAALLNGFESNSVLEDSPVTFGKWSPKNANGSFTNMNT TLVNALDKSINSVSVKLLSAVGVPKFREMMEQVDPKLEIPDNLTAALGTAEGNPLQLAIN YAMFVNGGYLVSPILVTSIEDKHGNLLYEVVPRKDKIFESQDTSIITYMLKSSVQSGTSA RARVITRNGAPMEQGGKTGTTNNARTVWYAGITPEYVTTAYLGYDNNRAMPGLAGGNAVA PLYHNYYQDIINKGLYTPGKFSFMEDHIKNGELVVQRLDILTGLLSSEGREFVIRRGHTV VESDYKYLNGISSIFYGNPNPQEENADEHLEDGENPIVEEEEQLFDKLLGD >gi|224461501|gb|ACDD01000001.1| GENE 22 22851 - 25556 2018 901 aa, chain + ## HITS:1 COG:FN0524 KEGG:ns NR:ns ## COG: FN0524 COG0210 # Protein_GI_number: 19703859 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Fusobacterium nucleatum # 2 896 4 916 919 541 39.0 1e-153 MLDKNQQRVVEHTEGPLLVIAGPGSGKTKTLVERSVYLISEKKVNPSQILLSTFTEKAAR ELRMRIQKALQKKNFSVSIEEMYLGTMHSIWLRILEEYIEYSHYENGIEILDEEEEKFFL YSQLRQFKNLNFYGEFFEREHSYGDWTQSRLLQTIFSKIQEEAVDISSIRSYQEEIQFLK EAYLLYQNLLRKENKMSFSAIQMELYHSLLEYPEFLEKVQTKIHYVMIDEYQDSNPIQEK IILLLSGKYKNICVVGDEDQAIYRFRGATVENILRFPQVFEEDCETVYLEKNYRSSEEIV HLCNQWMNRVDWQGERFDKHSYSARYDTIERKSVFRISGSSNSRKRNELITWLKELKERK KIEDYSQIVFLFDNFRSPQVKRLEEDLEMAGIPVYCPRARNFFSREEVKLFFGVFMVLSP KIQESVKGYSYYEECLFRVRRLAKDDKDLQKWILEQREKEIGDFLEIYYQILSFSPFREI LEKQEEDVRRGREIYNLSLIGNILQSFQKLCKMKEDSKVERLEYLEYFFQSYLKKFIEKG VNEFEKKGEFPKGCIPFLTIHQSKGLEFSIVVLSSLYQNPPVYREKIRKSYDSLFQKKKL LQEHNEELYDFYRKFYVAFSRAKNALIFLEDNVSSSFQAFVRHSVDIVSSDFHWEDIPEE EYNSAEEMQTYSYTTDIASYDLCPRRYFFLRKISFPSLERENMIFGTLLHRCLERLHKYP DKIISLEEMIGKEKEKLEKKSKFFFQEEDIKMVYKILQEYQGKAVNLYDEILQAEGKEFL EWQGNMIYGEIDLLALQENQWKIIDFKTGKENPSYIEQLVLYQNLLRKYGKEKEIRLSLY YLLEQREEKIELSLKEEVAILEKIQRTIENIQKKEFTKREYQKEICDTCEFFSFCYRKET L >gi|224461501|gb|ACDD01000001.1| GENE 23 25553 - 26716 1107 387 aa, chain + ## HITS:1 COG:FN0523 KEGG:ns NR:ns ## COG: FN0523 COG0420 # Protein_GI_number: 19703858 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: 
Fusobacterium nucleatum # 1 283 1 284 291 296 56.0 4e-80 MKILHCSDLHLGKRPSGNKKFTETRYQDYFQAFEQLIEKISSLEIDVFLIAGDIFDKKEI NANILERTEALFQKLKYDHPKMTILVIEGNHDVISRQEDSWLEYLKNKGYCEAFSYRKDY EKENCFQQGDVSFYPVGYPGFMVEKALQDLAEHLDSSKKNIVIVHTAIFGMENLPGLVST ETIDLFRDKVVYMAGGHIHSFSSYPKEKPYFFVPGSLEYTNIPREKSSQKGAIYFDTDTG DFERILISPRKRIRTDIFSWESEIEEEFQRFLQKYSQKQEEIMIIPVNVKNTEYFPLERL EEIAEKEGILKVYFEIRESILGKEEEQEEYSSLEEVERELIESWDILKHPESFIRSFPRL KEFSIESNQENLFQLLDEILEEDENAD >gi|224461501|gb|ACDD01000001.1| GENE 24 26706 - 29471 2604 921 aa, chain + ## HITS:1 COG:FN0522 KEGG:ns NR:ns ## COG: FN0522 COG0419 # Protein_GI_number: 19703857 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 921 1 921 921 428 39.0 1e-119 MQIKKVVLNNYRSHSHIEVVFSKGINLILGKNGRGKTSILEAIGLALFHMTDRTGKTKGK TFMKYGEKECSIFIEFLGNDGREYSIFHHYFLKKPKVSILKDMQTEEEYRDNIEEKLEEL CGVKAEYRDIYENVIVAKQNDFINIFKETPENRARVFNKIFNTEIYNKLFIDLKGFVEQY LKEKEMLEVEENTLRLTLENKEERMEMLQQTEEKWKLYALKKEARLEEKQKIAKKIEQYE FIKREFETIKSKFSFQEQKIRQNKKELQERLVLAKKAKKARFLLEEHQESYQLYMELDKK IQEKKQEKNFLQKRREENQKLEEENRKLELLIKNNQTEEEVLQERMTEQQVLLLDLETRI EEDQKQQKELQTSLARLQSFWKEIEISLEKQKKWEQENFNLQQKQSLQEKNHKQKTEELL KLNIVEIQSFLQEIQEDKAEIQGKKERIAVYQQNIEDYQFAMHTLGQKICPFLKETCENM KGHEVDSYFQGEIQKTKKLMETLQHEIKALEEKLKKEFVYRKEEASYQLLQKEVQELEKD ILQTEILWKEILLERERQQYSFQTLLSQHNFASLEELQEKLRNLEDALLLLKIEEKEEEW KSLQKKQEILQERVEKLQKDRMSSLERQKQNILNIQEDLEEIWLEFLKEMESLETKMLSL QTSYRIYLENRKIADNLEEEKGKIRVLLLERDNLRISQREVNEKYRLLEKDLEQREQENW KDKLMEVERELLAVNETLGELGEKLKNDKQVLEKIVLQEEKIAGLSKKRNKIERKYKKAE SLRKNIKEMGTQVSKNMLHYISEGASINFHKITGRSERIYWSNEEKDKYQVYLLGENRKI EYQLLSGGEQVSVAIAIRGTMAQYFSNSKFMILDEPTNNLDIEKRKLLAEYIGEILNHLE QSIIVTHDDSFREMAEKIIEL >gi|224461501|gb|ACDD01000001.1| GENE 25 29485 - 30120 473 211 aa, chain + ## HITS:1 COG:FN0521 KEGG:ns NR:ns ## COG: FN0521 COG1636 # Protein_GI_number: 19703856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium 
nucleatum # 1 207 1 209 222 230 66.0 1e-60 MKENYDKKMEEQLKALQGERKKLLIHSCCGPCSSSVLEYLKDYLDIDVYFYNPNITEKEE YETRLEELKIFLDKIQFPMKVVEGEYEVRRDFFEKIKGLEKEPETGARCKVCYELRMEEA ARKAKEEGYDYFTTVLSISPMKNATWINEIGEKLEEKYKIPFLHGDFKKKNRYLRSIQLS KEYGMYRQEYCGCIFSKLEREEKLKEREKNG >gi|224461501|gb|ACDD01000001.1| GENE 26 30113 - 30733 752 206 aa, chain + ## HITS:1 COG:no KEGG:FN0520 NR:ns ## KEGG: FN0520 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 204 1 205 205 172 44.0 1e-41 MVNFSERTVRLVSIIVFILFLILGAKKHWFFVLEIIPIMIFFSTKGVQMFENSLWWGARV FWGLCFSIALFVILYRQIPEMIVVTKQYLMVRALMAVCVGAWLGDFFAKYIYIRLRFCVN RFASKGYRNSYKILSMKDYSQQYVKSPFKKMKVSFYYVGLEVDGVERIFLTEKEIFEQLQ HETTIEITIKRGCLGSYYGVGYEKKY >gi|224461501|gb|ACDD01000001.1| GENE 27 30764 - 31279 683 171 aa, chain - ## HITS:1 COG:aq_1614_2 KEGG:ns NR:ns ## COG: aq_1614_2 COG0511 # Protein_GI_number: 15606729 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Aquifex aeolicus # 49 170 27 142 150 64 34.0 1e-10 MLYLEELIRGPLREKERPKPKIHQKKQEIDPILLDAILTILLGGNEMIRKFKVSIDGKVH HIEIEETTQGVSSMDFSSPTIAREEIKVEVTPKVEVETSSVKDKVTVPIAGTISNIAVHV GQTVKEGDLLFVFEAMKMENEAISSCDGVIGNIYKKEKDMVNPNEIVMEII >gi|224461501|gb|ACDD01000001.1| GENE 28 31417 - 32307 857 296 aa, chain - ## HITS:1 COG:FN0392 KEGG:ns NR:ns ## COG: FN0392 COG1032 # Protein_GI_number: 19703734 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 2 284 1 285 297 291 51.0 7e-79 MLYDSYDYPLYRPPSEAYSLILQITLGCSHNGCVFCGMYQSKHFHIKSIEEIKMEMDMFA TRYSHIDKIFLADGNALTAPTEFLVEILEYIKIKFPKCERVSCYATHIDIRKKSLEELQL LSSKGLKLLYLGVESGDDETLRFIRKGATAQDMIDLSKKVKDANMKLSATFILGINGQEK DNTEHAIRTGELISKMYLDYVGLLTLRLEEGSYLTKLAAEGKYTLVELEEVVRELKLILE NIKTEEIPQEIIFRSNHASNFLTLKGTLPQDRDKMLEKVKQVIQQGEYPKQRKYYL Prediction of potential genes in microbial genomes Time: Fri May 20 01:20:36 2011 Seq name: gi|224461500|gb|ACDD01000002.1| Fusobacterium sp. 
3_1_5R cont1.2, whole genome shotgun sequence
Length of sequence - 69539 bp
Number of predicted genes - 90, with homology - 82
Number of transcription units - 28, operones - 18 average op.length - 4.4

  N   Tu/Op    Conserved   S        Start     End  Score
               pairs(N/Pv)
  1   1 Tu  1 .        + CDS     44 -   928    847 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
                        + Term   958 -   992    0.4
  2   2 Tu  1 .        - CDS    870 -  1406    799 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
                        - Prom  1437 -  1496    9.8
                        + Prom  1416 -  1475    8.8
  3   3 Op  1 .        + CDS   1580 -  4129   1802 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8
  4   3 Op  2 .        + CDS   4208 -  4570    241 ## gi|257451444|ref|ZP_05616743.1| hypothetical protein F3_00162
  5   3 Op  3 .        + CDS   4616 -  4747    243 ##
  6   3 Op  4 .        + CDS   4762 -  5478    799 ## FN0914 hypothetical protein
  7   3 Op  5 2/0.000  + CDS   5526 -  8456   3219 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain)
  8   3 Op  6 .        + CDS   8460 -  9662   1265 ## COG3581 Uncharacterized protein conserved in bacteria
                        + Prom  9672 -  9731    1.6
  9   3 Op  7 .        + CDS   9751 - 10026    441 ## gi|257451449|ref|ZP_05616748.1| FMN-binding domain-containing protein
                        + Term 10041 - 10082    4.1
 10   4 Op  1 1/0.250  + CDS  10102 - 11562   1762 ## COG1492 Cobyric acid synthase
 11   4 Op  2 .        + CDS  11636 - 13048    641 ## PROTEIN SUPPORTED gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27
                        + Term 13058 - 13109    9.0
                        + Prom 13085 - 13144   14.0
 12   5 Op  1 2/0.000  + CDS  13190 - 13501    382 ## COG0640 Predicted transcriptional regulators
 13   5 Op  2 .        + CDS  13494 - 15908   2764 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases
 14   5 Op  3 .        + CDS  15983 - 16786    828 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
                        + Prom 16804 - 16863   12.6
 15   6 Op  1 .        + CDS  16887 - 18191   1557 ## Coch_0229 hypothetical protein
 16   6 Op  2 1/0.250  + CDS  18196 - 18777    456 ## COG0693 Putative intracellular protease/amidase
 17   6 Op  3 1/0.250  + CDS  18806 - 19291    179 ## PROTEIN SUPPORTED gi|225085052|ref|YP_002656490.1| ribosomal protein S2
 18   6 Op  4 1/0.250  + CDS  19303 - 19746    715 ## COG0698 Ribose 5-phosphate isomerase RpiB
 19   6 Op  5 .        + CDS  19774 - 20112    529 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases
                        + Term 20139 - 20181    4.3
 20   7 Tu  1 .        - CDS  20156 - 20845    781 ## COG2964 Uncharacterized protein conserved in bacteria
                        - Prom 20892 - 20951    7.7
                        + Prom 20796 - 20855    9.0
 21   8 Tu  1 .        + CDS  21089 - 22726   2456 ## COG3033 Tryptophanase
                        + Term 22762 - 22814    4.2
                        + Prom 22729 - 22788    4.6
 22   9 Tu  1 .        + CDS  22840 - 24183   1902 ## COG0733 Na+-dependent transporters of the SNF family
                        + Term 24196 - 24251    6.3
                        + Prom 24297 - 24356    7.6
 23  10 Op  1 .        + CDS  24425 - 24688    335 ## gi|257451463|ref|ZP_05616762.1| hypothetical protein F3_00257
 24  10 Op  2 .        + CDS  24685 - 25257    650 ## gi|257451464|ref|ZP_05616763.1| hypothetical protein F3_00262
 25  10 Op  3 .        + CDS  25267 - 25452    244 ## gi|257451465|ref|ZP_05616764.1| hypothetical protein F3_00267
 26  10 Op  4 8/0.000  + CDS  25439 - 26710   1288 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase
 27  10 Op  5 .        + CDS  26723 - 27409    717 ## COG1475 Predicted transcriptional regulators
 28  10 Op  6 .        + CDS  27414 - 27794    505 ## gi|257451468|ref|ZP_05616767.1| hypothetical protein F3_00282
 29  10 Op  7 .        + CDS  27775 - 28971   1237 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family
                        + Term 29087 - 29140   -0.9
 30  11 Tu  1 .        - CDS  28968 - 30254    716 ## gi|257451470|ref|ZP_05616769.1| hypothetical protein F3_00292
                        - Prom 30302 - 30361   11.6
                        + Prom 30245 - 30304   13.2
 31  12 Op  1 .        + CDS  30368 - 30853    573 ## COG1475 Predicted transcriptional regulators
 32  12 Op  2 .        + CDS  30850 - 31275    279 ## gi|257451472|ref|ZP_05616771.1| hypothetical protein F3_00302
 33  12 Op  3 .        + CDS  31266 - 31664    339 ## gi|257451473|ref|ZP_05616772.1| hypothetical protein F3_00307
 34  12 Op  4 .        + CDS  31661 - 32269    519 ## COG1738 Uncharacterized conserved protein
                        + Term 32275 - 32332    8.0
                        + Prom 32309 - 32368   11.3
 35  13 Op  1 .        + CDS  32447 - 32944    795 ## COG1475 Predicted transcriptional regulators
 36  13 Op  2 .        + CDS  32941 - 33186    245 ## gi|257451476|ref|ZP_05616775.1| hypothetical protein F3_00322
 37  13 Op  3 .        + CDS  33208 - 33291    139 ##
 38  13 Op  4 .        + CDS  33288 - 33488    325 ## gi|257451477|ref|ZP_05616776.1| hypothetical protein F3_00327
 39  13 Op  5 .        + CDS  33502 - 33906    440 ## gi|257451478|ref|ZP_05616777.1| hypothetical protein F3_00332
 40  13 Op  6 .        + CDS  33940 - 34749   1009 ## COG3645 Uncharacterized phage-encoded protein
 41  13 Op  7 .        + CDS  34736 - 35539    567 ## gi|257451480|ref|ZP_05616779.1| hypothetical protein F3_00342
 42  13 Op  8 .        + CDS  35515 - 35769    462 ## gi|257451481|ref|ZP_05616780.1| hypothetical protein F3_00347
 43  13 Op  9 .        + CDS  35790 - 35915    131 ##
                        + Term 35941 - 35976    0.3
                        + Prom 35924 - 35983    4.2
 44  14 Op  1 .        + CDS  36011 - 36943   1393 ## gi|257451482|ref|ZP_05616781.1| hypothetical protein F3_00352
 45  14 Op  2 .        + CDS  36940 - 37275    349 ## gi|257451483|ref|ZP_05616782.1| hypothetical protein F3_00357
 46  14 Op  3 .        + CDS  37260 - 37964    746 ## lse_1612 phage terminase large subunit
 47  14 Op  4 .        + CDS  37913 - 38641    887 ## lse_1612 phage terminase large subunit
 48  14 Op  5 .        + CDS  38653 - 41175   2681 ## COG2369 Uncharacterized protein, homolog of phage Mu protein gp30
 49  14 Op  6 .        + CDS  41179 - 41403    351 ## gi|257451485|ref|ZP_05616784.1| hypothetical protein F3_00377
 50  14 Op  7 .        + CDS  41436 - 42017    519 ## COG5005 Mu-like prophage protein gpG
 51  14 Op  8 .        + CDS  42035 - 42109     61 ##
                        + Term 42288 - 42329    4.0
                        + Prom 42252 - 42311    6.0
 52  15 Tu  1 .        + CDS  42337 - 42861    525 ## gi|257451487|ref|ZP_05616786.1| ToxN
                        - Term 42898 - 42940    7.4
 53  16 Tu  1 .        - CDS  42952 - 43230    357 ## COG0776 Bacterial nucleoid DNA-binding protein
                        - Prom 43357 - 43416   11.6
                        + Prom 43224 - 43283   11.8
 54  17 Op  1 .        + CDS  43390 - 43857    682 ## gi|257451489|ref|ZP_05616788.1| hypothetical protein F3_00397
 55  17 Op  2 .        + CDS  43873 - 44334    626 ## gi|257451490|ref|ZP_05616789.1| hypothetical protein F3_00402
                        + Term 44343 - 44369   -1.0
                        + Prom 44358 - 44417   10.8
 56  18 Tu  1 .        + CDS  44447 - 44617    179 ##
                        + Term 44627 - 44677   -0.6
                        + Prom 44641 - 44700    6.8
 57  19 Op  1 .        + CDS  44735 - 44857    196 ## gi|257451492|ref|ZP_05616791.1| hypothetical protein F3_00412
 58  19 Op  2 .        + CDS  44854 - 45420    752 ## MCCL_0945 hypothetical protein
                        + Term 45456 - 45492    2.5
                        + Prom 45481 - 45540   11.6
 59  20 Op  1 .        + CDS  45565 - 45753    285 ## gi|257451494|ref|ZP_05616793.1| hypothetical protein F3_00422
                        + Prom 45908 - 45967    5.9
 60  20 Op  2 .        + CDS  45993 - 46166    202 ## Lebu_0982 hypothetical protein
                        + Prom 46193 - 46252    4.6
 61  21 Op  1 .        + CDS  46279 - 46554    472 ## gi|257451496|ref|ZP_05616795.1| hypothetical protein F3_00432
 62  21 Op  2 .        + CDS  46567 - 47328    818 ## COG3645 Uncharacterized phage-encoded protein
 63  21 Op  3 .        + CDS  47368 - 47556    304 ## gi|257451498|ref|ZP_05616797.1| hypothetical protein F3_00442
                        + Term 47571 - 47601    1.1
                        + Prom 47577 - 47636    8.9
 64  22 Op  1 .        + CDS  47672 - 47764    151 ##
 65  22 Op  2 .        + CDS  47831 - 47935    215 ##
                        + Prom 47943 - 48002   10.0
 66  23 Op  1 .        + CDS  48031 - 48240    258 ## gi|257451501|ref|ZP_05616800.1| hypothetical protein F3_00457
                        + Prom 48242 - 48301    6.4
 67  23 Op  2 .        + CDS  48404 - 48568    144 ## gi|257451502|ref|ZP_05616801.1| hypothetical protein F3_00462
                        + Prom 48599 - 48658   12.9
 68  24 Op  1 .        + CDS  48692 - 49204    625 ## gi|257451503|ref|ZP_05616802.1| hypothetical protein F3_00467
 69  24 Op  2 .        + CDS  49207 - 50799   1465 ## P9211_15361 hypothetical protein
                        + Prom 50858 - 50917    8.3
 70  25 Op  1 .        + CDS  51003 - 51350    379 ## gi|257451505|ref|ZP_05616804.1| hypothetical protein F3_00477
 71  25 Op  2 .        + CDS  51413 - 51817    164 ## gi|257451506|ref|ZP_05616805.1| hypothetical protein F3_00482
 72  25 Op  3 .        + CDS  51831 - 52013    237 ## gi|257451507|ref|ZP_05616806.1| hypothetical protein F3_00487
                        + Term 52027 - 52080   11.0
                        + Prom 52039 - 52098   12.6
 73  26 Op  1 .        + CDS  52208 - 53158   1338 ## Sterm_1262 hypothetical protein
 74  26 Op  2 .        + CDS  53171 - 53272    174 ##
 75  26 Op  3 .        + CDS  53272 - 53622    626 ## gi|257451510|ref|ZP_05616809.1| hypothetical protein F3_00502
 76  26 Op  4 .        + CDS  53642 - 54616   1385 ## gi|257451511|ref|ZP_05616810.1| hypothetical protein F3_00507
 77  26 Op  5 .        + CDS  54621 - 54959    601 ## gi|257451512|ref|ZP_05616811.1| hypothetical protein F3_00512
 78  26 Op  6 .        + CDS  54956 - 55396    346 ## gi|257451513|ref|ZP_05616812.1| hypothetical protein F3_00517
 79  26 Op  7 .        + CDS  55411 - 56367   1276 ## Sterm_0060 hypothetical protein
 80  26 Op  8 .        + CDS  56384 - 56773    501 ## gi|257451515|ref|ZP_05616814.1| hypothetical protein F3_00527
 81  26 Op  9 .        + CDS  56857 - 57039     86 ## gi|257451516|ref|ZP_05616815.1| hypothetical protein F3_00532
 82  26 Op 10 .        + CDS  57085 - 61785   5900 ## Sterm_3894 hypothetical protein
 83  26 Op 11 .        + CDS  61821 - 62102    415 ## gi|257451518|ref|ZP_05616817.1| hypothetical protein F3_00542
 84  26 Op 12 .        + CDS  62129 - 62356    229 ## gi|257451519|ref|ZP_05616818.1| hypothetical protein F3_00547
                        - Term 62321 - 62352    3.1
 85  27 Tu  1 .        - CDS  62362 - 62631    334 ## gi|257451520|ref|ZP_05616819.1| hypothetical protein F3_00552
                        - Prom 62658 - 62717    7.6
                        + Prom 62531 - 62590    5.8
 86  28 Op  1 .        + CDS  62711 - 66604   3772 ## Sterm_0064 hypothetical protein
 87  28 Op  2 .        + CDS  66613 - 67422    462 ## gi|257451522|ref|ZP_05616821.1| radical SAM domain-containing protein
 88  28 Op  3 .        + CDS  67432 - 68076    465 ## gi|257451523|ref|ZP_05616822.1| hypothetical protein F3_00567
 89  28 Op  4 .        + CDS  68073 - 68324    221 ## gi|257451524|ref|ZP_05616823.1| hypothetical protein F3_00572
 90  28 Op  5 .
+ CDS 68328 - 69537 1384 ## SZO_05380 collagen-binding collagen-like cell surface-anchored protein FneC Predicted protein(s) >gi|224461500|gb|ACDD01000002.1| GENE 1 44 - 928 847 294 aa, chain + ## HITS:1 COG:FN0395 KEGG:ns NR:ns ## COG: FN0395 COG0697 # Protein_GI_number: 19703737 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 5 287 3 286 286 245 51.0 6e-65 MKLKLEQKTKAVIYMLISALGFTMMSVAVKAIPEISLFEKVFFRNSISCFVAFLLLLRDR RGFYVKKENRLPVFIRSFLGFLGIVTNFYAIQYLLLADSNMLGKLSPITVSFFAVLYLKE KVDKEQILGIAFSFIGALFVIKPSFSLSMLPSLAGLTSVTFAGISYTVIRYLNDKENPNI IVFYFSLMSVLCSIPFMLTDFQIPDLRQWFYLLSIGLMACLAQFFMTYSYKNAEASEVAV YNYSGIPYGIILGYLLFDEIPDIYSCIGGVIIIVMAIYLYLHNKKKKANSIERL >gi|224461500|gb|ACDD01000002.1| GENE 2 870 - 1406 799 178 aa, chain - ## HITS:1 COG:FN0874 KEGG:ns NR:ns ## COG: FN0874 COG0494 # Protein_GI_number: 19704209 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 165 1 164 171 194 61.0 9e-50 MKFKHKERKEVFRNDVVTVYNENLVLPNGKEVTWTFTGKKEVVAILALTKKQTVIMVEQY RPAIRREFLEIPAGLVEKNELPLEAAKRELEEETGYQAESWTKICSYFGSAGVSDGEYHL FLAKELKKTHQHLDEDEFLTVREIPFKEISIYDLQDPKSIIAFQYYLLSSSCCEDTNK >gi|224461500|gb|ACDD01000002.1| GENE 3 1580 - 4129 1802 849 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 6 849 6 812 815 698 45 0.0 MNSNMFTENSILAMNEAKNLAVKYQQQVIKPEMLAYALLENKEGLIPKVLEKMGLNIHFI YQEIGNELEKMPRVQGGSEQEISLSPSTHRVLVEAEECMKKMGDSYLSVEHLFRALIENT PILKRLGIQVEKFDEVVKKVRGNRKVESQNPEETYEVLEKYAKNLVDLAREGKIDPIIGR DSEIRRAIQIISRRTKNNPILIGEPGVGKTAIAEGLAQRILNGDVPDSLKNKIIYSLDMG ALIAGAKYQGEFEERLKGVLKEVEESEGNIILFIDEIHTIVGAGKTNGAMDAGNILKPML 
ARGEVRVIGATTIDEYRKYIEKDAALERRFQIILVNEPDVEDTISILRGLKEKFETYHGV RIADAAIVAAANLSHRYISDRKLPDKAIDLIDEAAAMIRTDIDSMPEELDSLTRKTLQLE IEREALQKENDVASKERLEVLEKELAELKEEKARLQSQWELEKEEVNKVKKVKEEIENVK LEMEKAERNYDLTKLSELKYGKLASLEKELQGMTFENHLLKQEVSAEEISEIVSKWTGIP VAKLTESEKEKMLHLEDSLKTRVKGQEEAVKAVADTMIRSIAGLKDKHRPMGSFIFLGPT GVGKTFLAKTLAYNLFDSEDNVIRIDMSEYMDKFSVTRLIGAPPGYVGYEEGGQLTEAVR TKPYSVILFDEIEKAHPDVFNILLQVLDDGRLTDGQGRIVDFKNTLIIMTSNLGSSYILD DISLGEQTREAVMTELRASFKPEFLNRVDEIILFKALDQKAIREIVVLALESVAEKLKEK SIQVDFSSSLIEHLAHNAYNPQYGARPLRRYIQKELETSLAKKLLSNEISEYSHIKISLE GNDIIIKKQ >gi|224461500|gb|ACDD01000002.1| GENE 4 4208 - 4570 241 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451444|ref|ZP_05616743.1| ## NR: gi|257451444|ref|ZP_05616743.1| hypothetical protein F3_00162 [Fusobacterium sp. 3_1_5R] # 1 120 1 120 120 215 100.0 7e-55 MTRYRKLAYALLFSFCCFFSTSLFLHSNKVTLQFGEKAEKNIVLLVPQTSGTSHIAVLTQ GKSGISILENYSTEYFPDLPKKTSYLFVNFFKEKKVIIKCKYQCVIQTILTIFPKRVLRN >gi|224461500|gb|ACDD01000002.1| GENE 5 4616 - 4747 243 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTLEQEFLNEETEHSTAAMLLLASYLELGIFMLYKTIELFIS >gi|224461500|gb|ACDD01000002.1| GENE 6 4762 - 5478 799 238 aa, chain + ## HITS:1 COG:no KEGG:FN0914 NR:ns ## KEGG: FN0914 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 215 1 235 243 203 47.0 4e-51 MNYKKPLYCLIFILISFSLLAHSFTEEKIESLYKDMNLEKRITFPAFKQGIQGMERIHNR NNNILTIVDFTKPSTEERLYIIDLDKEQVLVSSYVAHGMRTGDLYAKYFSNRKGTLKSSD GFFLTGESYKGKNGFSLRLYGLEHGRNNNAYERTLVIHAARYAEQSFINRYGRLGRSRGC LAVPRSENGKIIEYIQGGSVCYVHSEGLKYEDYAFLNFTVADTHKKPEDVEEIEKLES >gi|224461500|gb|ACDD01000002.1| GENE 7 5526 - 8456 3219 976 aa, chain + ## HITS:1 COG:FN1139_1 KEGG:ns NR:ns ## COG: FN1139_1 COG1924 # Protein_GI_number: 19704474 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Fusobacterium nucleatum # 1 640 1 640 640 1021 77.0 0 
MNYRVGIDVGSTTLKTVILDEKDNIIEKSYQRHFSKVREKTLEHIKSLESILKGKECRVA ITGSAGLGISKEYGIPFVQEVFSTAGAVKKQYPKTDVVIELGGEDAKILFLQGSIEERMN GSCAGGTGAFIDQMASLMDMNATQLDTISLDYEKIYPIASRCGVFAKTDIQPLLNQGAKK ADIAASIYQAVVEQTITGLAQGRNIEGNVLFLGGPLSFLKGLQKRFVETLHLSEKNAIFP ELAPYFVALGSAYYAGTVKEIFSFEELVRILSREKKLKEESKETPLFHTQEEYQVFQERH QRVSIPEKDILNYSGKAYLGLDSGSTTIKIVLLDEEGNLLYRHYSSSKGNPVSLFLEQLK KIRELCGERIEIVSSAVTGYGEELMQAAFGVDLGIVETVAHYTAAKYFNPQVDFIIDIGG QDIKCFHIQNGNIDSILLNEACSSGCGSFLETFAKSMGYSIQEFSEKALFARSPASLGSR CTVFMNSSVKQAQKEGAGVEDISAGLARSIVKNAIYKVIRARNAEDLGKHIVVQGGTFLN DAVLRSFEQELGREVLRLNHSELMGAYGAALYAKNVFRGQSTLLKQKDLQNFEHRSVATR CNLCTNHCHLTVNHFSTGEHFISGNKCERGAGKTVQNHLPNMVAYKNQKFDSIPLVAFGR AKIGIPRVLNMYDMLPFWAALFTNLGCDVVLSAKSSRELYMKGQHTIPSDTVCYPAKLVH GHIEDLLSKDLDAIFYPCLTYAFDEGLSDNHYNCPVVAYYPELIQANIPEVEKKNYLYPH LGMENRGLLIEKLYDCFQDIIPNLTKREMKFAVEVAYERYFRYREKIREEGKRCFTWAKL EKKPTVILASRPYHIDSEINHGLDRLLNSLGFVVLTEDSLPACEKGDSQEVLNQWTYHAR MYNAARFVGESEQTELIQLVSFGCGIDAITSDEIHAILAKKEKLYTQLKIDEINNLGASK IRLRSLAATMREREAI >gi|224461500|gb|ACDD01000002.1| GENE 8 8460 - 9662 1265 400 aa, chain + ## HITS:1 COG:FN1140 KEGG:ns NR:ns ## COG: FN1140 COG3581 # Protein_GI_number: 19704475 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 400 1 400 407 606 74.0 1e-173 MNKNYKILIPMMLDIHFDFIAGVLRKEGYDVEILQNDSQEVIEDGLKNVHNDMCYPALLV IGQFINALKSGKYNLNRVALLLTQTGGGCRASNYICLLRKALDNNGFTQVKVFSLNFAGL EKGNEFSLSFRAGVRLFQSILYGDLLMLLYNQSVALEKNIGDTKKTLLYWKKKLVEDIGK KKFSKLKENYRNILEDFASIPKNKNNEKIKVGIVGEIYMKYSPLGNNHLTEYLEQEKAEV VNTGILDFLLFNIYDVIFDKKIYGKSGIRYVIAKMLTSYIQKKQEEMISCIKENGHFRAP SAFSKVVEMTKGYLGHGVKMGEGWLLTAEMLEFIQMGVNNIICAQPFGCLPNHIIAKGMI RKIKTNHPEANIVAVDYDPGASSINQENRIRLMLENAKYL >gi|224461500|gb|ACDD01000002.1| GENE 9 9751 - 10026 441 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451449|ref|ZP_05616748.1| ## NR: gi|257451449|ref|ZP_05616748.1| FMN-binding domain-containing protein [Fusobacterium sp. 
3_1_5R] # 1 91 36 126 126 145 100.0 8e-34 MILGSLFAFAETKEGAAMGFKDEIRVSVDVQGGKIISIEVSHRDPERVAKPAIEELKQEI LKKQSVEVDDIAGATATSQGFREAVKKAMEK >gi|224461500|gb|ACDD01000002.1| GENE 10 10102 - 11562 1762 486 aa, chain + ## HITS:1 COG:FN2070 KEGG:ns NR:ns ## COG: FN2070 COG1492 # Protein_GI_number: 19705360 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Fusobacterium nucleatum # 2 485 5 490 491 486 55.0 1e-137 MQKLMIQGTSSSAGKTTIVAGLCRVLAKQKKKVCPFKSQNMALNSYVDEEGRELSRATAL QAEAAMTKVKVSMNPILLKPNKDNESQVLVEGSPYGTLEAKEYFSMASQFKKIAKSNFEK LAEEYDYCILEGGGSPAEINLREYDYVNMGMAEMIDAPVILVGNIEIGGVFASLYGTIML LDEEDRKRIQGIIINKFRGDIDLLKPGIAMLEERLKKEGYCIPILGVLPCIDISLEEEDS LSSQFLDKKMEEGKIIISVLKGKQMGNTTDFQPFLQYPDVMLRYVEDPEELGKEDLIILA GSKNTLEEVEYFRRKGFEEKLKDLHKKGVPIFGICGGFQALGDQILDPYHIDGKLEEVEG FHLFTMVSTMEEEKIKMQVTKKIDMEEGLLKNCLGLEVKGYEIHHGRSSITSSVYVKEEV YGTYIHGIFENGEFTRHFLNNLRQRKHYELEAKNKDYKEFKELQYNKLAKAIEENLDMEK LYQIFR >gi|224461500|gb|ACDD01000002.1| GENE 11 11636 - 13048 641 470 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 [Haemophilus influenzae 3655] # 4 470 2 447 456 251 32 8e-66 MLETIKWITESVNNVLWGKNILVFLLVGSAIYFSIRTRFMQFRLFKTIIKTLFHKESEQK GISSLETFFLGTACRVGAGNIAGVVAAISVGGPGSIFWMWLVALLGASTSFVESCLAVMY RDKLEDGKYIGGSPWILKKQMNCRWLGVIYAIASIICYLGVVQVMSNSITESITSVYSNI DFGLSPVFYPIGAVFGIELTQENFLKYFLAILISIITASVIFGKSKKDAIIEALNKIVPI MAVLYILLVIFILITNITSIPAMIQNIFYQAFGGEQFLGAGFGIIVMQGVRRGLFSNEAG SGDSNYAAAVVDIEEPARQGMVQALGVFVDTLVICSATAFIVLLADPNVVGDASGMELFQ LAIQSHIGSIGAPFVVIIMFFFAFSTILAVTFYGKSAIYFINNHSNINLLYQLLIIVMVY IGGIKQNLFVWSLADFGLGIMTVINIIMIVPFAKPALDELKRYESLLKKN >gi|224461500|gb|ACDD01000002.1| GENE 12 13190 - 13501 382 103 aa, chain + ## HITS:1 COG:CAP0031 KEGG:ns NR:ns ## COG: CAP0031 COG0640 # Protein_GI_number: 15004735 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 9 95 7 92 95 62 37.0 2e-10 
MGKEEQIIEVSGIFKVLSNSMRLGILCYLSEKKEMTVNEIHEYFKEYSQPSISQQLQILK ANRIVKDRKQGQYVYYSIADERVLKFMDTLHDLYCTRGEEENE >gi|224461500|gb|ACDD01000002.1| GENE 13 13494 - 15908 2764 804 aa, chain + ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 3 454 2 464 469 524 59.0 1e-148 MSKKIVIVGGVAGGASTATRLRRLSEEYEIIMFEKGPYPSFANCGLPYHIGNIIPERESL IVQTPEKFKSRFQIDVRTLSEVIAVNTKEKKVQVRIQNGKLYEESYDVLVLSPGAKAWKA EIEGIHSHNIFSLKTIPDMDKIIAKLKNKVCKRVAIIGGGFIGIEAAENIKHLGIETILI EAGDHILSSFDSEFSENLEEEMREQGVELYLKQRVVKFQDGKELSLFLENGEIVEVDFVI MAMGVRPDTAFLKNSGITLGKRGEILVNEYLETNIQDVYALGDAIPGVALAGPANRQGRI VANNIFGKREKYCGSIGSSIIKVFDIVGAAAGKNEKQLKVEGIEYETVHLYPNSHAGYYP NATQLHAKILFEKESGILLGAQCIGYEGVDKFIDVMATSMHFKGTIYDLSELELCYAPPF GSAKSPVNMAGFIGRNIEDHLMETVSKEEMEDFNIQKHFRLDLRNPEESSVALAECEASI PLDELRDHLEELPKEKEIWCYCAVGLRGYLATRILMQHGFRVKNILGGYRLLPKDWKIEN SKEETSNIEKKEEETLYQKKEMEILNVTGLSCPGPLMKLKSKMESMEEGKDLHIIASDPA FANDVQAWVKASGNHLYEVKKEKGFVHAYLSKKESGLVHSSDTKVMETKEGMTIVVFSGD YDKAMAAFVIANGALAMGKRVTMFFTFWGLSILKKENPIAVKKSFIDCLFSVCLPKSWKN LPLSKMNFGGLGAKMMQVIMKRKNIESLDSLIQNAKENGVHIIACTMSMDAMGIVKEELL DGIDFGGVAQYLGAANEGNPNLFI >gi|224461500|gb|ACDD01000002.1| GENE 14 15983 - 16786 828 267 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 5 261 5 259 269 136 32.0 4e-32 MIYKDYHIHSEFSGDSNQNIEELIEHCISIGLKEIAITDHSEYGIQDMPPAFILNYSQYN VKIQELQEKYRKKICLRYGVEVGMDVQVKEYFERNINSYPFDFIIGSNHAIHSLDIASSN ITLGKTKQELQELYFQTLLHNIQNYHDFCVLGHMDFITRYGGEKFRGLNLKENWDIIQTI LQHLIKYGKGIEINTSGFRYHEERFYPLPEIVKEYLRLGGEIITVGSDAHIKSHIAMDFQ RVEDFLRSINYPYIASFEKRKAIIEKI >gi|224461500|gb|ACDD01000002.1| GENE 15 16887 - 18191 1557 434 
aa, chain + ## HITS:1 COG:no KEGG:Coch_0229 NR:ns ## KEGG: Coch_0229 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 201 411 229 454 454 101 34.0 6e-20 MDKKLENKIEKFYEKDNIEGVLELLDTLPEWGKEEYGEYARALNNIGRPEEALEYLMKEE AKEDTFIWNYRVCYSYSLLENWEKVIFYGKRALELDKKYEEDICYFLIESYEALKKPDEV IQILENHPDMDEIDWNSFYGKALVEKNEKKKAIPYLKKAVSLWKKYDTEFNWDGEEVTKL LAKLYYDLKMTKEFEQMKKKYHYSEANFDISKYTKEEEEQVIAHIEKYFGKIEKRIPDLD AEHVNIDILIIPASTKHPYTTLMTLGMGGRFMDGTPEELIPDKFGYDELFLCLPDDWEFG LDTMWAVQYLLDMARFPFSNKSWLGAGHSISYDIYLGNSNFTGFMITYPYEYGMEAFQLD ITEEKRIHFYNIVPLYTEELDYKQEVGFEELESLFVKSPMVTDIHRANVALNENISNIED GEEETEDYQQILYQ >gi|224461500|gb|ACDD01000002.1| GENE 16 18196 - 18777 456 193 aa, chain + ## HITS:1 COG:FN1876 KEGG:ns NR:ns ## COG: FN1876 COG0693 # Protein_GI_number: 19705181 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Fusobacterium nucleatum # 1 191 1 197 200 127 38.0 1e-29 MKKILLLLLPGVESMEFSPFLDIFGWNEMLGSKDIHLELCTLEKEVSSSWNLNLKVEKQI RNIELRDYIAVVIPGGFGSYHYFDTIENTEFRSFIQKAKKEELYILGICTGSILLASTGY FANKKMTTYLYENGRYSKQLSQYQVEFKNTMICKDDKLWTSSGPSTAIPMAFDLLEELSS RKNRKYIEAIMGF >gi|224461500|gb|ACDD01000002.1| GENE 17 18806 - 19291 179 161 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225085052|ref|YP_002656490.1| ribosomal protein S2 [gamma proteobacterium NOR51-B] # 3 145 7 148 150 73 31 3e-12 MKIEKNRVVTLEFKVYDKESHELLEDTQDVGPFMYIQGIGAFVPKVEEFLEGKEKGFKGS LDLGMEDAYGDYDEDLIEEMKRADFEEFDDIYEGMEFVAEMDDGSEVIYTVTEVDGDKIM TDGNHPFAGRNLTFEVLVTGVREAEEKELEHGHVHFHGFED >gi|224461500|gb|ACDD01000002.1| GENE 18 19303 - 19746 715 147 aa, chain + ## HITS:1 COG:FN1874 KEGG:ns NR:ns ## COG: FN1874 COG0698 # Protein_GI_number: 19705179 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Fusobacterium nucleatum # 3 147 2 149 149 204 72.0 5e-53 MKKIGLGADHGGFALKEVIKKHLLEKGYEVEDFGTHSTESVDYPKYGKLVAHAVIDKKVD 
CGILVCGTGIGISIAANKLSGIRAALCTNVTMAKLTRQHNDANILALGGRIIGDVVALEI VDTFLTTEFEGGRHSRRIESIESCELF >gi|224461500|gb|ACDD01000002.1| GENE 19 19774 - 20112 529 112 aa, chain + ## HITS:1 COG:FN1873 KEGG:ns NR:ns ## COG: FN1873 COG0537 # Protein_GI_number: 19705178 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Fusobacterium nucleatum # 1 112 1 112 112 170 75.0 5e-43 MASIFTKIINREIPADIVYEDDLVIAFRDIAPAAKVHILFVPKKEIPTINDIQKEDETLI GYIYSVIAKKAKELGMAEQGYRVVSNCNEYGGQTVFHIHFHLLGGEPLGTMV >gi|224461500|gb|ACDD01000002.1| GENE 20 20156 - 20845 781 229 aa, chain - ## HITS:1 COG:FN1942 KEGG:ns NR:ns ## COG: FN1942 COG2964 # Protein_GI_number: 19705247 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 228 1 228 229 274 64.0 1e-73 MKKELLAHYQSLVLFLGKTLGPSYEIVLHEVIGEKLKMIAIANGEISNRILGNPLSEETL ELLKNKTRHGENNMINHTVLLKNGKKIRSSSILIRDSKKVIGVLCINFDDSCFHEIHCQL LRTIHPDLFVQNYLSDISYNILLDELKSQKKEETQDNTIEMMMEKIFQEVSQELHFPLIR PNKKEKEKIVYELEKKGIFQLKEAIVFTAKKLSCSTTSIYRYLKKIQEE >gi|224461500|gb|ACDD01000002.1| GENE 21 21089 - 22726 2456 545 aa, chain + ## HITS:1 COG:FN1943 KEGG:ns NR:ns ## COG: FN1943 COG3033 # Protein_GI_number: 19705248 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Fusobacterium nucleatum # 1 545 1 545 545 1075 94.0 0 MKNYELNVPAPKSFSYVKRNIPEVTVEQRERALKATHYNEFAFPAGMLTVDMLSDSGTTA MTDQQWSAMMLGDESYGRNKGYYVLLDAMRDCFERGDQQKKIIDLVRTDCKDIEKMMDEM YLCEYEGGLFNGGAAQLERPNAFLMPQGRAAESILFEIVKKILAVRAPGKVFTIPSNGHF DTTEGNIKQMGSVPRNLYNKKLLYEVPEGGKYEKNPFKGDMDINKLQQLIDAVGVENIPM IYTTVTNNTVCGQAVSMKSIRETSKIAHKYEIPFMLDAARWAENCYFIKMNEEGYADKSI PEIAKEMFSYCDGFTASLKKDGHANMGGILAFRDRGYFWKKFSDFNPDGSVKTDVGILLK VKQISSYGNDSYGSMSGRDIMALAAGLYECCNFSYLHERVEQCNYLAEGFYKAGVKGVVI 
PAGGHGVYINMDEFFDGKRGHDTFAGEGFSLELIRRYGIRVSELGDYSMEYDLKTPEQQE EVANVVRFAINRSMYSQEHLDYVIAAVKALYKDRENIPNMRIVSGHTLPMRHFHAFLEPY ANEEK >gi|224461500|gb|ACDD01000002.1| GENE 22 22840 - 24183 1902 447 aa, chain + ## HITS:1 COG:FN1944 KEGG:ns NR:ns ## COG: FN1944 COG0733 # Protein_GI_number: 19705249 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 5 447 17 459 459 657 82.0 0 MQENTIMEKRDGFHSKWGFILACIGSAVGMGNIWRFPILVSEWGGMTFLIPYFIFVILIG STGVIAEFALGRAAGAGPVGAFGMCTEMKGNRKIGEAIGIIPVLGSLALAIGYSCVMGWI FKYTWLSIDGTMFAMQGNMEVIASTFGQTASAGGANYWIVIALIVSFGIMSMGIAGGIEK ANKIMMPILFILFVFLGIYIAFQEGASDGYKYIFTVNPKALCNPVLWIFAFGQAFFSLSV AGNGSVIYGSYLSKTEEIPGSAKNVAFFDTLAALLAAFVIIPAMAIGGAELSSGGPGLIF IYLVNVMNNMAGGRIIQVVFYICILFAGVSSIINLYEAPVAFLQEKFKTSRVMATAIIHI VGLIVAISIQAIVSTWMDIVSIYICPLGALLAGIMFFWIAGKDFVEDAVNTGSDKKIGSW FFPAGKYLYCFLALIALIAGAIFGGIG >gi|224461500|gb|ACDD01000002.1| GENE 23 24425 - 24688 335 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451463|ref|ZP_05616762.1| ## NR: gi|257451463|ref|ZP_05616762.1| hypothetical protein F3_00257 [Fusobacterium sp. 3_1_5R] # 1 87 1 87 87 114 100.0 1e-24 MRIDIYFLEKEEKFPISVYDRDEIKVGEILELGYKGKKKKWVKISKMELLKSSQGKSEKV VELTCSPLSIEERRMIIEAYLKKGDDK >gi|224461500|gb|ACDD01000002.1| GENE 24 24685 - 25257 650 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451464|ref|ZP_05616763.1| ## NR: gi|257451464|ref|ZP_05616763.1| hypothetical protein F3_00262 [Fusobacterium sp. 3_1_5R] # 1 190 1 190 190 261 100.0 2e-68 MNITKHALMRYVSRTHKIVEINEKTYDNFKRENEALISELEERLQEEFLKAEFFIKQKHE GHQEASFYINEDKMMTYVVANGNILTCYPIDFNLDEIGNLEMYHTLRKSFNREKELLKSM EESSVISKKEEEIKEIELEIQMLLSKQKQLEKGKRILENEIEVEELKINDVKNKISYIAE RISRSSKKGL >gi|224461500|gb|ACDD01000002.1| GENE 25 25267 - 25452 244 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451465|ref|ZP_05616764.1| ## NR: gi|257451465|ref|ZP_05616764.1| hypothetical protein F3_00267 [Fusobacterium sp. 
3_1_5R] # 1 61 1 61 61 87 100.0 3e-16 MKTLKEMCDIMLEKRIIESFEITEFEEIIAYTDEGILLQEKDIQRIYNITLGKKENKNEC V >gi|224461500|gb|ACDD01000002.1| GENE 26 25439 - 26710 1288 423 aa, chain + ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 3 423 6 433 434 458 55.0 1e-129 MNVCEAARKRIQYAISEFDNIIVSFSGGKDSGVMLNLTLDIAKEMNCLHKIGVYHMDYEA QYQATTDYVTEMFNDLPKEVRKYWVCLPIKAQCSVSMFQSFWQPWKFKEKEIWCRELPEN SINEENFPYDFDYEISDYQFNIKFGKEMAKTEKTCFLIGIRTQESLHRYKAVNKFNNKNE YKGKKYTTKITENLVNMYPIYDWLVDDIWVYNAKFQKKYNKIYDLFYQAGLKVNAMRVAS PFNDAAQDSLKLYKVIDPNNWGKLIGRVNGVNFTGLYGGTTAMGWKTIKKPDHFTWKEYM YFLLDTLPKHTRDIYLKKLETSIKYWTVSGGALPKEIVKELTVEHENLGKPKNNRNYTTE YDVIRFKDYLDEIEISKPNLLPTYKRMCIAILKNDTSCKTLGFGQTKYELEKRKNIMEKY RNL >gi|224461500|gb|ACDD01000002.1| GENE 27 26723 - 27409 717 228 aa, chain + ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 64 225 7 168 180 238 68.0 6e-63 MEKITTYKREKPIIVKNINTGDVFSFKSMSNACEMLKIDMQNLGRILNGERKSFNGYNAF YAGSPVYNVIPVPVEKIRANSYNPNSVAPPEMKLLYQSIKEDKYTMPIVCYYIEEEDIYE IVDGFHRYTVMKKHKDIYERENGCLPVVVIDKDISNRMASTIRHNRARGSHSIELMTNIV SELVESGMSDAWILKNIGMDADELLRLKQLSGLASLFKDREFSKSEEE >gi|224461500|gb|ACDD01000002.1| GENE 28 27414 - 27794 505 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451468|ref|ZP_05616767.1| ## NR: gi|257451468|ref|ZP_05616767.1| hypothetical protein F3_00282 [Fusobacterium sp. 
3_1_5R] # 1 126 1 126 126 200 100.0 3e-50 MWFEIYLQYSDIILETEKAVLIRLPYENIEEACYFYYPKALIKEKKGKKYLIFMWEKPFV RIYYREGKRGKRKYIYTNEDRSLREIMYDFEDRTPTHYEQKTEYFKRRASKITIEKVEVP DDLRDE >gi|224461500|gb|ACDD01000002.1| GENE 29 27775 - 28971 1237 398 aa, chain + ## HITS:1 COG:lin1348 KEGG:ns NR:ns ## COG: lin1348 COG0553 # Protein_GI_number: 16800416 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Listeria innocua # 9 388 7 389 399 246 38.0 5e-65 MIYEMSKEQLEAIEKLKKLKVGALFMETGTGKTMTALQLFNDRFQANKTNHLIWIAPLNT KDNFWKEVKKYGFENLPITFYGIESIRQSGRIFLEVLNLVKELETSFIVLDESLKIKNYV QVSQRIIKIGRYCKYRLILNGTLISKNFLDLYNQLDFLSPKILNMSFLEFRNRFTDEIIV KKRGRELKRFVSDNANEEALFKMIDNYIYRCELVLNLEREYVKYYYDFFEKEEEYNKIKD RMLDDLERGGFHFLEYAQKLQSIYAVTEDKKKVFEEILKEYPKCIVFCKFVESQEYLKEK YPNILVLSYQKHSFGLNLQEYNVIVFWDKTFDYATVEQAEARIFRIGQTKACTYIRMQCD CGLDKMIAGNICKKGNMLENLKIEFMKQIEERKKASLT >gi|224461500|gb|ACDD01000002.1| GENE 30 28968 - 30254 716 428 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451470|ref|ZP_05616769.1| ## NR: gi|257451470|ref|ZP_05616769.1| hypothetical protein F3_00292 [Fusobacterium sp. 
3_1_5R] # 1 428 1 428 428 686 100.0 0 MKKIKSKFSLKQKNFIKDLLKLHETIIVDLLIDNFSKTDLSIFFKLDKDYNENILKIFTE ILDFSEKGAFYFNKLQKINNIRTEKYMFYSIKKLKEEFEKDYFIPERGDGIAPEMILLYY NSNNIPKDSIYAFCEDFYLKTLKNYDTKQDKKLQEVYNDCTKITSYMIELEHYLNGKYTG NISDEELYKKFLAEKIEDGYKVKKGLLDFSYKYLNQIFETHYKAEKSYILLQTYLQKIAT MYEEWQKKTLEISFITKENQSLQNKIELLQRELELSLEKKIIVTNDKELIERNKELEKEN YYLKYQNEKLQLRLQELEDELNINRTIVEELEISQEIPPKTPKMNPKNIVVLGGKWTYEK IKNSNLPITFIRNEDILKSISGIKKYDLIIFDTSRNSHIFFNKLKSVTDNFYLISHSSID EIQKIIQG >gi|224461500|gb|ACDD01000002.1| GENE 31 30368 - 30853 573 161 aa, chain + ## HITS:1 COG:RSc0845_1 KEGG:ns NR:ns ## COG: RSc0845_1 COG1475 # Protein_GI_number: 17545564 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Ralstonia solanacearum # 2 134 9 141 141 107 40.0 8e-24 MKIEKRKLETLKPYINNAKQHPDWQIEQIKESIERFGFNDPIAIDNQGNIIEGHGRYLAS IQLKLEEVDCIVLEHLSELEKKAYIITHNKLTMNTNFDLTILEQELSNLKENEFDLLSTG FSEYELDTLLSSNELELDEVLSDEDKGKKEEKTCPHCGGVL >gi|224461500|gb|ACDD01000002.1| GENE 32 30850 - 31275 279 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451472|ref|ZP_05616771.1| ## NR: gi|257451472|ref|ZP_05616771.1| hypothetical protein F3_00302 [Fusobacterium sp. 3_1_5R] # 1 141 1 141 141 265 100.0 8e-70 MNLYLASLENDYVTIETMIDVKPLFVLGSFYYLQKLKQEILEQYFSYINSKDCKGFILDS GAFSMLNAKGGTESFLKNFDNYIDDYIKFIKFWNVKNFIELDIDPLVGYSKVLEIREKIE KEVGRKSIPVWHISRGIEEWK >gi|224461500|gb|ACDD01000002.1| GENE 33 31266 - 31664 339 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451473|ref|ZP_05616772.1| ## NR: gi|257451473|ref|ZP_05616772.1| hypothetical protein F3_00307 [Fusobacterium sp. 
3_1_5R] # 1 132 1 132 132 217 100.0 2e-55 MEIEEWKKTVKLYKYVAIGGIVTREIKKKDYKKVFLPMLKMARSEKCNVHGLGFTGKEIN DFPFFSCDSSSWSSIKRFGSMPVFSITEKCIKNRNISENKKIRSGNETRMKLMRYSIKEW KKFQVFLYKGGI >gi|224461500|gb|ACDD01000002.1| GENE 34 31661 - 32269 519 202 aa, chain + ## HITS:1 COG:TM0792 KEGG:ns NR:ns ## COG: TM0792 COG1738 # Protein_GI_number: 15643555 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 1 200 1 200 212 153 47.0 2e-37 MKKSIENLVLLNCIFVICLVVANVISSKLVILGNHFIVPAAVVSYGITFLCTDIIGEIWG KKEANKTVKRGLLTQIIATFLILFAIKIPIAPFMSDFQEKFQAVLGGSLRMTLASLVAYI IAQTNDVFIFHKLKTLNNGKYKWVRNNVSTICSQFLDTSIFITIAFYGVVPDLFLVIYSQ FFIKVIIALLDTPFFYLFTKSE >gi|224461500|gb|ACDD01000002.1| GENE 35 32447 - 32944 795 165 aa, chain + ## HITS:1 COG:RSc0845_1 KEGG:ns NR:ns ## COG: RSc0845_1 COG1475 # Protein_GI_number: 17545564 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Ralstonia solanacearum # 2 134 9 141 141 99 37.0 2e-21 MKIEKISLEKLKMHENNAKEHPDWQVEQIMRSIQEFGFNDPIAVDENNEIIEGHGRYLAL KELGIESCECIRLSHLNENQKRAYIIAHNKLTMNTGFNAEILAYEMNALKVDEFDLSVLG FGEAELNSIFNNFEDHDKDEKDEPEIIQPEKKMITCPFCGEEFEK >gi|224461500|gb|ACDD01000002.1| GENE 36 32941 - 33186 245 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451476|ref|ZP_05616775.1| ## NR: gi|257451476|ref|ZP_05616775.1| hypothetical protein F3_00322 [Fusobacterium sp. 3_1_5R] # 1 81 1 81 81 134 100.0 2e-30 MIELASEKYRITSIKRTDSKIWRVLIPGDVLEIRTTITRTKRNDGNQSAMQMKIYVNNVY MGKCSATRTYNALQKITLEKI >gi|224461500|gb|ACDD01000002.1| GENE 37 33208 - 33291 139 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKQIFTTHRLESHLTLVVGSIRRGKK >gi|224461500|gb|ACDD01000002.1| GENE 38 33288 - 33488 325 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451477|ref|ZP_05616776.1| ## NR: gi|257451477|ref|ZP_05616776.1| hypothetical protein F3_00327 [Fusobacterium sp. 
3_1_5R] # 1 66 1 66 66 100 100.0 3e-20 MKQYKTQKRHYERLKMMLVLERRIEELEKGNEALLKELKKSKQSDMVWLIFGIVIVFIGL NSLVFN >gi|224461500|gb|ACDD01000002.1| GENE 39 33502 - 33906 440 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451478|ref|ZP_05616777.1| ## NR: gi|257451478|ref|ZP_05616777.1| hypothetical protein F3_00332 [Fusobacterium sp. 3_1_5R] # 1 134 1 134 134 269 100.0 4e-71 MTQIFIAGNVPSCKNSKRIARIKDKKGNVVATRLINSEVVERYLKTYAFQWNFAGNVAEF HRRIKGKEKPYKVGFYFIRDSRRKFDYINAMQLPCDLMVKAGWIDDDNANEIIPVCLGYE VDKRNAGVRIEVLE >gi|224461500|gb|ACDD01000002.1| GENE 40 33940 - 34749 1009 269 aa, chain + ## HITS:1 COG:lin1738_2 KEGG:ns NR:ns ## COG: lin1738_2 COG3645 # Protein_GI_number: 16800806 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Listeria innocua # 107 250 6 149 152 117 45.0 2e-26 MNDLQNKNTFTSLELTQLINQYRKEEGDRKELQHKSLLEIIREEFSEEIGEQKILPTSYK DQWNREQPMFILNLQQSRQVLVRESKFVRKAVIKYIDELESRLKGQFQIPTSFAEALRLA AEQQEKIEELALDNKVKDQQISELQPKASYYDLILQCKDLLSMTVIAKDYGKSAEWMNKK LHQLGVQFKQSGVWFLYQKYAENGYTQTKTQNYSKSDGTQGARPHMYWTQKGRLFLYDLL KNNGVYPMIEIEEVRTNIAFIWGTGNEDR >gi|224461500|gb|ACDD01000002.1| GENE 41 34736 - 35539 567 267 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451480|ref|ZP_05616779.1| ## NR: gi|257451480|ref|ZP_05616779.1| hypothetical protein F3_00342 [Fusobacterium sp. 3_1_5R] # 1 267 1 267 267 526 100.0 1e-148 MKIDEFRKNCKKIHLWWGEEEVTRLARFSEIMKKTTVDSVKYLLISKDSIGFTDSYRAVV MKATKIVDEAKKPLGFYSPDLLSLLKVAQEIALIDDYTLVIRVKEEIHLFTPTVNQVPDI SQVCKIVPSDAERVAFRTDIFSEKEMPLADTLSWKAICTSFQSNDLDFMFPRFYFTHDGI VAKAEFGKTHLEMLFPITLEKECHKALNPKFIDLWIRATTKEKVIGSLLYNTKTSGSICF EIPNLKYIILPIAWREGEKEWKTKIMQ >gi|224461500|gb|ACDD01000002.1| GENE 42 35515 - 35769 462 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451481|ref|ZP_05616780.1| ## NR: gi|257451481|ref|ZP_05616780.1| hypothetical protein F3_00347 [Fusobacterium sp. 
3_1_5R] # 16 84 1 69 69 121 98.0 1e-26 MENENNAVKNPKHYQLGNLGIEAIDVIREVTGELKDGFQGKCVGDILKYVMRAHKKNGIQ DYEKAQEYLTYLIKYMKREQGGLE >gi|224461500|gb|ACDD01000002.1| GENE 43 35790 - 35915 131 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSALCMMLEIKRGTVTKEKRRNYEGIVVLCVTIHAFYDLIR >gi|224461500|gb|ACDD01000002.1| GENE 44 36011 - 36943 1393 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451482|ref|ZP_05616781.1| ## NR: gi|257451482|ref|ZP_05616781.1| hypothetical protein F3_00352 [Fusobacterium sp. 3_1_5R] # 1 310 1 310 310 462 100.0 1e-129 MTKKEKAYKKWLELGGEKAPRGTLSMIAKDLRISSNSIRVWKKREWCVDEKQKEKNVTNT QSVTDECNVSRMLQNVTENQEAIPIKRETKKSIQEEENRVKRLVKSQEHSKKLKKISRMT MQGYTAKEIAEAVDFHASTVAKWRKRYNLIQRRDELQLEAQARIAEVITKEREKRALQLI KGAEYLEGVALEKIKDFHTGKHGASEGKKLAAEINAIKQGLETLDAAKRYTDSWIGAGNA KEVSDLMLQDRKYELEQERFAFEKEKATAMLYTKLLEIEKGNKENTSEEIRGAIEEQFGN WEIDTWEETE >gi|224461500|gb|ACDD01000002.1| GENE 45 36940 - 37275 349 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451483|ref|ZP_05616782.1| ## NR: gi|257451483|ref|ZP_05616782.1| hypothetical protein F3_00357 [Fusobacterium sp. 
3_1_5R] # 1 111 1 111 111 179 100.0 3e-44 MTHKDSKSFFENFIQEDSAFHKYKISKSMEERLKIKNVLIEMIISGELKPFSELERNVWV LYREGISSEEIQNRLGISKSMYGVVKYRAGKKIEKFAEKFKERSERLCCEI >gi|224461500|gb|ACDD01000002.1| GENE 46 37260 - 37964 746 234 aa, chain + ## HITS:1 COG:no KEGG:lse_1612 NR:ns ## KEGG: lse_1612 # Name: not_defined # Def: phage terminase large subunit # Organism: L.seeligeri # Pathway: not_defined # 20 214 2 196 430 218 51.0 2e-55 MLRNMIRIVSTEYKNTVLRKRNKLLEDSFKMIEPSLKQRKVLTWWRDSSPYKNYFGIICD GAIRSGKTASVIYSFMTWSMTNFNRQNFILSGKTIGAFKRNILKDLVRMLRTLKFEYSYN RSDNVLVVTLGEVTNYYYVFGGKDEKSADLVQGLTAAGAFFDEAVLMPKNFLDQAIARCS VESSKVWFTCNPDNPHHFFKKDFIDVANEKNCYIYTLLWKITQLCRKTKNNNIN >gi|224461500|gb|ACDD01000002.1| GENE 47 37913 - 38641 887 242 aa, chain + ## HITS:1 COG:no KEGG:lse_1612 NR:ns ## KEGG: lse_1612 # Name: not_defined # Def: phage terminase large subunit # Organism: L.seeligeri # Pathway: not_defined # 1 239 201 422 430 176 42.0 7e-43 MEDNPTLSENKKQQYKLMYKGAFYERFILGLWVMAEGLVYQIIGDNFIDEDEIPVCDYYY ISCDYGIYNPMAWNLIGVLGNEVYIMEEYHHSGRETNETKTDEQYTQDFLEWKESICNKY GIEVEYTIVDPSASSFIVALEQEGQYVVKANNKVFEEDSERVSGIPLVQIYLNKLRLFIC RNCIETIKEFYSYRWDEKRSMRGEETPVKENDHHMDGIRYFFNTVIGYYYKDGNNSGEII AA >gi|224461500|gb|ACDD01000002.1| GENE 48 38653 - 41175 2681 840 aa, chain + ## HITS:1 COG:NMB1096_1 KEGG:ns NR:ns ## COG: NMB1096_1 COG2369 # Protein_GI_number: 15676977 # Func_class: S Function unknown # Function: Uncharacterized protein, homolog of phage Mu protein gp30 # Organism: Neisseria meningitidis MC58 # 518 657 41 199 210 106 33.0 2e-22 MSKKNKTVVDDATLESVVRVLFENMSATKGDLTDELIKKMLQDVEIGGALQKIEREVAGR ILSVKTDNEELRAMIPEIEARFNNVKFNRLFRNMLEACYYGYAAFEKVYNKEDYSLARLI YIPHKYVKYTKEKKWYISANNKELKLDKETFLLAIHRYNVANKIGVSILESCKIAFTDKE MFQGYLRSISQRYGNVITLFKYNKGEKREDVKRKVDDLRKYQDKMILAIPSDFQGTLKDN MQFINLSDLKPEIYANLQDKERKKIIQNLLGGTLSIDDGNGVGSRALGTVHGDGLEAVIK ERCEFICDCLQSLLYYDGLYHGYDSKQFYFSLESLEDENDVIQKEKEKENTRAIKIKNFT DVRALGYKISKSKLAEALGLEEEDLEEVEELSQPLEFESKKKEKLQALVENVSNTVDKRL 
KNNQEKENAFSKKMNIAMKQWLKNPLEEELDLDMEILEEDYIIMFLLGYLDSQEKALEFE EELDPFSLSFTKAIKSIIDRNPAMYDTIEKIEEEARNRMFWIKKSTELEATKKVLTSLQK NLEKGGTYHEWKKDIESIAEKAGLGEDGWYSELVYRNAMNNAYAAGRYQEQMDNIKQRPY FMYSAINDDRTSEICRSLDGKVYPADDAIWHVIYPPNHHNCRSQVIALNEKEVQGYGLEI EKPDKEIKKMAKNMKNFNTAPVPQKLNFLQKIVKIKEKALETLKKKIERLMYLEKKKKEK KAIVELEKKIKKYILSNKCNKKFNKQMDKHFKDSKNYIIGRSYFTVPRKKIEELVSQRLG NGKVSLQKNGEWKKQEIIDLGEIVGVDVKGTEEYVTSIVKVHYGKTGIHAVPYAKRKKGD >gi|224461500|gb|ACDD01000002.1| GENE 49 41179 - 41403 351 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451485|ref|ZP_05616784.1| ## NR: gi|257451485|ref|ZP_05616784.1| hypothetical protein F3_00377 [Fusobacterium sp. 3_1_5R] # 1 74 1 74 74 89 100.0 6e-17 MKMEELEKYWEKKVEIFFIEGNSYIGILSGTESEENEEGEYTGRELVVLDLTEHSYISFL PEEIEQIKSLDKVE >gi|224461500|gb|ACDD01000002.1| GENE 50 41436 - 42017 519 193 aa, chain + ## HITS:1 COG:HI1568 KEGG:ns NR:ns ## COG: HI1568 COG5005 # Protein_GI_number: 16273465 # Func_class: R General function prediction only # Function: Mu-like prophage protein gpG # Organism: Haemophilus influenzae # 1 115 1 113 138 59 35.0 4e-09 MIELKIENDLALYINKVAGKVNTKELMEEIANDMQARVQSRFRMSIGPDGHRWFPIGIRK GKPLMDTGLLSNSITSRATMTKAIVGTNNRYARLHNYGGVIRAKSAGALTIPISPKSYGK SARRFRSAFLVRTPGATFIAREVGRGKKRQLEFLYVLKQSAKIKARPFLGINTAMQERYR RIAEEFYRKAWKK >gi|224461500|gb|ACDD01000002.1| GENE 51 42035 - 42109 61 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIFCKKKCILQVRENSSLDYRGWG >gi|224461500|gb|ACDD01000002.1| GENE 52 42337 - 42861 525 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451487|ref|ZP_05616786.1| ## NR: gi|257451487|ref|ZP_05616786.1| ToxN [Fusobacterium sp. 
3_1_5R] # 1 174 1 174 174 293 100.0 2e-78 MPVIQLYEIDNDYIDYLRQFDSKVLNHSGITYSKTRKYLGVLLDINNCKYLAPLSSPEPK SDYINGQIRKSIIPIIRIVKLGTSNILLGKIKLSSMIPVYDMSVLSYYDINKEQDLKYKN LVIDELRFIYANKNLILKNANKLYQQKIKNMSMGYVQTTVDFILLEQKAKLYKK >gi|224461500|gb|ACDD01000002.1| GENE 53 42952 - 43230 357 92 aa, chain - ## HITS:1 COG:FN0818 KEGG:ns NR:ns ## COG: FN0818 COG0776 # Protein_GI_number: 19704153 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Fusobacterium nucleatum # 1 91 1 91 91 85 56.0 2e-17 MTKKEFIELYFRKGKFSTKTEAEKAMTSFLETLEEVALLNENILFSGFGKFEVVEKAERL GRNPKTGEEVRIPAKKSMKFKPGKSLDEKLNN >gi|224461500|gb|ACDD01000002.1| GENE 54 43390 - 43857 682 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451489|ref|ZP_05616788.1| ## NR: gi|257451489|ref|ZP_05616788.1| hypothetical protein F3_00397 [Fusobacterium sp. 3_1_5R] # 1 155 1 155 155 250 100.0 2e-65 MKKLLLYLFVILLFASCGKDEKYINTAREYPLPGSMSIYGVKIKTVDETVNTILGMFYDK KGAEIEKEVTWTNDKDTIIAKYKEAEVRIEARRDKNGNPEFMIMDIIGTKGNKEITGLNL VQIDMAESAEQLRKELEESNAELQRELNRMEKQFK >gi|224461500|gb|ACDD01000002.1| GENE 55 43873 - 44334 626 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451490|ref|ZP_05616789.1| ## NR: gi|257451490|ref|ZP_05616789.1| hypothetical protein F3_00402 [Fusobacterium sp. 3_1_5R] # 1 153 1 153 153 273 100.0 3e-72 MKKWLLGFLVTLCVFMVGCTNVRPHNEIMKSSSFVDVEVNEIIFNGIGMKVTNKTDDFVE IVWDSSNLNDYPLSFGNNLVTKALEKKPNTSIEPNGIFKKEMYIPEEIKMPAKLLLKVKK KDVEEYVSIMLEDRGEKIERKYNMWTGEWESGE >gi|224461500|gb|ACDD01000002.1| GENE 56 44447 - 44617 179 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKSRADYFRERRKKLKEFGVVVDREKLETLEKKLKEKNRTKTAWLNEKIDEELKK >gi|224461500|gb|ACDD01000002.1| GENE 57 44735 - 44857 196 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451492|ref|ZP_05616791.1| ## NR: gi|257451492|ref|ZP_05616791.1| hypothetical protein F3_00412 [Fusobacterium sp. 
3_1_5R] # 1 40 1 40 40 75 100.0 7e-13 MRMKKMNSDFWECSFKGIVFYVKNPNQAIEIAWRMSGGRV >gi|224461500|gb|ACDD01000002.1| GENE 58 44854 - 45420 752 188 aa, chain + ## HITS:1 COG:no KEGG:MCCL_0945 NR:ns ## KEGG: MCCL_0945 # Name: not_defined # Def: hypothetical protein # Organism: M.caseolyticus # Pathway: not_defined # 15 119 19 109 256 92 48.0 1e-17 MKYNPELIKKNNVYVVSSRAIAKELGKRHDNVVRDIENLISSDVRRLKKQSNDILADLEK MLIQSQYRDSKNRNYREYLLTKDGFTLYMFNIQGYNDFKIAYINEFNRMERALKQQALPL ENKVRIEDMTFEQTMRTVKGIVNNLKSRLVHERDILNMKISELDKTGLTDNVYTVKAKGK TYTVQDLD >gi|224461500|gb|ACDD01000002.1| GENE 59 45565 - 45753 285 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451494|ref|ZP_05616793.1| ## NR: gi|257451494|ref|ZP_05616793.1| hypothetical protein F3_00422 [Fusobacterium sp. 3_1_5R] # 1 62 1 62 62 77 100.0 4e-13 MEATKKKMGRPTNNPKNIQTRIRMTEEEAKKLDFCAKTLNTSKTEIISKGVDKIYQEISN KK >gi|224461500|gb|ACDD01000002.1| GENE 60 45993 - 46166 202 57 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0982 NR:ns ## KEGG: Lebu_0982 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 57 1 57 57 70 71.0 2e-11 MEKRNAKISFSRSGNGIGAKLPLSVPLLKKFGIGVDEREVEIIYDEEQQTITVKKKK >gi|224461500|gb|ACDD01000002.1| GENE 61 46279 - 46554 472 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451496|ref|ZP_05616795.1| ## NR: gi|257451496|ref|ZP_05616795.1| hypothetical protein F3_00432 [Fusobacterium sp. 
3_1_5R] # 1 91 1 91 91 159 100.0 5e-38 MLTENEKIAMKFVESCRNHLMMKDDLTEVENKMFSLLEKVSKKLKLIENGGPLQTEFEKA LYDTLNLTKEIYFEFGNHYGGVFESQYYDGV >gi|224461500|gb|ACDD01000002.1| GENE 62 46567 - 47328 818 253 aa, chain + ## HITS:1 COG:SA1801_2 KEGG:ns NR:ns ## COG: SA1801_2 COG3645 # Protein_GI_number: 15927569 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Staphylococcus aureus N315 # 136 251 8 123 126 149 65.0 4e-36 MNELKMIDERELLGKQFRVYGNFENPLFLAKDVAEWIEHNKPNELIANVDDTEKLKAIIS HSGQNREMWFLTEDGLYEVLMLSRKPIAKEFKKEVKKILKTIRKNGMYVVDDLLDNPDLA IQAFTKLKEEREKRKELEIKLEEDKPKVLFADSVSASKTSVLVGELAKLLKQNGIDIGQN RLFEQLRNLGYLIQKKGSSFNMPTQRSMELKLFEIKETTISQPNGEIRIQKTPKVTGKGQ QYFINLFLKNKIA >gi|224461500|gb|ACDD01000002.1| GENE 63 47368 - 47556 304 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451498|ref|ZP_05616797.1| ## NR: gi|257451498|ref|ZP_05616797.1| hypothetical protein F3_00442 [Fusobacterium sp. 3_1_5R] # 1 62 1 62 62 99 100.0 5e-20 MKEERLQRTITINSSLDTAIIEIADENDRRYTFVLEALVYEALQGKTDIGKILNNYERYK KL >gi|224461500|gb|ACDD01000002.1| GENE 64 47672 - 47764 151 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYAFLGENHNDMFHSRVHYKNYKVDNQND >gi|224461500|gb|ACDD01000002.1| GENE 65 47831 - 47935 215 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYDEWWKIAFKIFVVVKTIEYLYKLYKWIKNKKK >gi|224461500|gb|ACDD01000002.1| GENE 66 48031 - 48240 258 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451501|ref|ZP_05616800.1| ## NR: gi|257451501|ref|ZP_05616800.1| hypothetical protein F3_00457 [Fusobacterium sp. 3_1_5R] # 1 69 1 69 69 81 100.0 2e-14 MSEKKRKGYSTIEQQMRANKEYLDRNPEAKERGNRSRLKSTCKRFIRDFATLEELEEIKV LIAERSTKK >gi|224461500|gb|ACDD01000002.1| GENE 67 48404 - 48568 144 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451502|ref|ZP_05616801.1| ## NR: gi|257451502|ref|ZP_05616801.1| hypothetical protein F3_00462 [Fusobacterium sp. 
3_1_5R] # 1 54 16 69 69 77 100.0 3e-13 MKDFSLVTTKKISDYDIVEYLKENFSEYDDVDIEKSFIEELYFYSDIFIKRHWK >gi|224461500|gb|ACDD01000002.1| GENE 68 48692 - 49204 625 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451503|ref|ZP_05616802.1| ## NR: gi|257451503|ref|ZP_05616802.1| hypothetical protein F3_00467 [Fusobacterium sp. 3_1_5R] # 1 170 1 170 170 276 100.0 3e-73 MLVKDMKNGLLVNAVIDFINFLRDENEFNYKFVSENQEIFYTDGCKAIMNLQLNKEKYKN NKSQNFLFSFSRILKDMNEDDELKKELSEFILEYLKETNNYNEEMKGYIVNSYVTLDVLT ETVDVDKERATLLKEFSDEIRKIEPSFRLALDWDSYFKECQKMEETGVWE >gi|224461500|gb|ACDD01000002.1| GENE 69 49207 - 50799 1465 530 aa, chain + ## HITS:1 COG:no KEGG:P9211_15361 NR:ns ## KEGG: P9211_15361 # Name: not_defined # Def: hypothetical protein # Organism: P.marinus_MIT9211 # Pathway: not_defined # 4 433 28 418 426 209 35.0 2e-52 MNKEEFLTEKKKMLEETKKNAWAFMHASKELLADKEFMIEAAKKNGGALEYASSELKSDK EFVIVAVCQQGRILRYAAEELKDDEDVILAAISNDGSALEFATERLRKKREVVLVAVKTT GYGLEFASDNLRNDKEVVLTAVKRDEWALKFASDELRNDKCFILEIMKCCNNWALEYASD ELRNDKEVVIEAMKGRHSIPLWLASERLQHDKDVVLENIANDILRKMQYDISKIKCEKIN GEIKVSLPKQDLGIDEEQKREIIKKNIESILLKTYDDEDILKVVKELEKYENLEDILKKI KGDFTNPFREEMIKKIEEKPSLLEYADYEIRNDKEVILSMIKKAWYVFRFASKELRANKK VVLAAIEENALNLQYASEELKNDKEIVIKAVKQIGAAIGYASERLKNDREIGMEAIKNDH LALWLAGEKLKNDKFFVIEAMKYNPDTLFGASKELQEDKDVMLECITSKLLKIIEEKLKD NIEHKEKRKFIKENIESILLKTFDDEEILKAVNILEEHKNLQEILNIIKK >gi|224461500|gb|ACDD01000002.1| GENE 70 51003 - 51350 379 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451505|ref|ZP_05616804.1| ## NR: gi|257451505|ref|ZP_05616804.1| hypothetical protein F3_00477 [Fusobacterium sp. 3_1_5R] # 1 115 1 115 115 175 100.0 1e-42 MSVFTREDKLFVEYLKDYFAEEMKNGLISKIATFDDREYTFFFSELSTLKIIVMETEDDN GLSLHMGTNEDSSEYLTKFAEFMNTDKFKKLYKNLENYWFIQYRAEEYEDKYLNS >gi|224461500|gb|ACDD01000002.1| GENE 71 51413 - 51817 164 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451506|ref|ZP_05616805.1| ## NR: gi|257451506|ref|ZP_05616805.1| hypothetical protein F3_00482 [Fusobacterium sp. 
3_1_5R] # 1 134 1 134 134 230 100.0 2e-59 MINIEKYIHKTIGGNEYIQSFSDISSEIYKIFQDNMHFGIVDLEKIHNSLYDIYAVVEEN KIKVISKESEEKISESIMEFVEDDLSKLTTTIFPTAFMLKLDETEPENLRIFFENLGTLN HYVGFWLLEQKVRN >gi|224461500|gb|ACDD01000002.1| GENE 72 51831 - 52013 237 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451507|ref|ZP_05616806.1| ## NR: gi|257451507|ref|ZP_05616806.1| hypothetical protein F3_00487 [Fusobacterium sp. 3_1_5R] # 1 60 1 60 60 91 100.0 2e-17 MAEHGGKREGAGRPASRDKKIQKSIKIDPTIYKQIEQLDGTFISKIERGLELLLEKEYKK >gi|224461500|gb|ACDD01000002.1| GENE 73 52208 - 53158 1338 316 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1262 NR:ns ## KEGG: Sterm_1262 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 296 1 273 292 102 31.0 2e-20 MKKRLKVFESGNYPQGNFEADRVKAVFEKVVDKIPGIFAHSSHWAKKEEEPVSVGEFSNF ELLNKNGKLVVFGDVEFNEKGAGYYNDKILEGVSVEIDPKTNTLHKIAVLPKGVKPQVAG AEFELKEDELQGIYLQFEEFEEEKMTLEQIMATFPVLSLDDRAKLINSLVSTVTDEERNG MRKLMEWEKVEAIQVEPKVPKTEDEIRAEITAQMEFEAKRNQLIEKAKAKFTPAQQEIME FALKKAGEERTTVLEFESNGKKENMSYFERFEKAVETMEDAANFTSKTKELEFENKEEKT DAMKQAYDRTKARFSK >gi|224461500|gb|ACDD01000002.1| GENE 74 53171 - 53272 174 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKIGIFMFVLAVIGIAAHFWNKKHNDRKGGKK >gi|224461500|gb|ACDD01000002.1| GENE 75 53272 - 53622 626 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451510|ref|ZP_05616809.1| ## NR: gi|257451510|ref|ZP_05616809.1| hypothetical protein F3_00502 [Fusobacterium sp. 3_1_5R] # 1 116 1 116 116 204 100.0 2e-51 MARYDRKEKELVQIVEDTLTIAGVVKQDEDVNLYSLVAYNHSEDKWVKFVKETHKTGFAV AMVKQAKGIEFDNTGKDVVVPLLKIGIVNKEVVKKAFSELDQETVGICLAQGLAIL >gi|224461500|gb|ACDD01000002.1| GENE 76 53642 - 54616 1385 324 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451511|ref|ZP_05616810.1| ## NR: gi|257451511|ref|ZP_05616810.1| hypothetical protein F3_00507 [Fusobacterium sp. 
3_1_5R] # 1 324 1 324 324 619 100.0 1e-175 MLTEKQKEYIGVYAAIPSPNMFYWDLFREAREAYLGLGETINLDEVMAEMKEAGIVPRDT ELPPMTVNGNVAVSITPDIVGNSVGISALDSINANRTDTVVVNGQQMTASQYDVENKTMT LKNSICNTTNRMAAQALLTGKVKCAGGQEVDMKLPKDVEVGAKPKSWVTFFVEKINDYQI ETGYMPTYILVGSKIAAELITEIQNTKASLLAAKVEKKGQSAIININGVLSEIRTLPPAI GYKNLITETENKVFFINNMSLVPLYAGLEFVGDTNNPEMMRGDVFVDKGDVNKQTGRATL FAKSAPFPAIALPKLIKVYKVTLG >gi|224461500|gb|ACDD01000002.1| GENE 77 54621 - 54959 601 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451512|ref|ZP_05616811.1| ## NR: gi|257451512|ref|ZP_05616811.1| hypothetical protein F3_00512 [Fusobacterium sp. 3_1_5R] # 1 112 1 112 112 175 100.0 1e-42 MTKDFLERYSENVQLFLKEEYGEKVKQKLEDMRKEADFLIDTYGIDVKSIQEMNLELLRD LHADFRVYTRMANEELAREAKQEFYDLLDAYKETLDKKSDTSNAGKGIMVFL >gi|224461500|gb|ACDD01000002.1| GENE 78 54956 - 55396 346 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451513|ref|ZP_05616812.1| ## NR: gi|257451513|ref|ZP_05616812.1| hypothetical protein F3_00517 [Fusobacterium sp. 3_1_5R] # 1 146 1 146 146 272 100.0 5e-72 MTKVKRTAELLNRVKEILQEIPSVIPYVKFAFLEDQLFKKPVSNSILIEPLGESISPNGI SLAHPLKETKGFIIHYLFKSMTPDLHMIPFVEKKDEIINKLLDERMIGIDGLFLNYSIQT EHYKFTTDEVGEEIWVVKIIFKGQYR >gi|224461500|gb|ACDD01000002.1| GENE 79 55411 - 56367 1276 318 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0060 NR:ns ## KEGG: Sterm_0060 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 2 314 3 302 309 85 26.0 2e-15 MIKLLIGVQDNAKTKANILKAYSATEVDLKPDFNKVQSEAFTDSAYREKGYVSEKKANGS FTLEVTPETLKDLLPAFGYTLESVSLPGAGITFEGISKITHKGIANNGGKIEKYFTIVEQ DLEHEEENIITGAQFNSVELNFSKGSYVTMKVDVIGYKFEYKNSKSETGENVSDIDNVIT CNGIHLSLDGKDISANAQSIVVNINNNLEARFGLGSPDATNIKRTNFIEAKASLTFVGYE KEKYKTAYERLINGETGKADIKLLGTEKTAFGVQLHKIGVSDVQKTDKKSGAGMTQELEI FNDRSKSTPITFVFGTVN >gi|224461500|gb|ACDD01000002.1| GENE 80 56384 - 56773 501 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451515|ref|ZP_05616814.1| ## NR: gi|257451515|ref|ZP_05616814.1| hypothetical protein F3_00527 [Fusobacterium sp. 
3_1_5R] # 1 129 1 129 129 161 100.0 1e-38 MQKILKVGTEENYVEYKERLSFIERHNYKEKIKPKSLKMNKKKEDFTLNVEENKFENSPE FMLLQAQVVKIVEEGEVIFDHEDGKRKISAQTFNKIFENADNLDEVLSNILKANKLSVKE EEEEEEKND >gi|224461500|gb|ACDD01000002.1| GENE 81 56857 - 57039 86 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451516|ref|ZP_05616815.1| ## NR: gi|257451516|ref|ZP_05616815.1| hypothetical protein F3_00532 [Fusobacterium sp. 3_1_5R] # 1 60 1 60 60 114 100.0 3e-24 MKEKIRFYIRYLGRDSFTGFYELRFLPDGESVGYNHHDWEDIWYLENVKNALNEIISQKQ >gi|224461500|gb|ACDD01000002.1| GENE 82 57085 - 61785 5900 1566 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3894 NR:ns ## KEGG: Sterm_3894 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 949 1535 1293 1898 1925 204 28.0 3e-50 MIDNNMTMIVSLKDEASPEFKKMATRFGMSTSDFKKYLKDMMAKAKEMEQQINKLDFKKL QENIKTLRQEAQTQLAEMKKSLESLKEKSKDVFLSIEKWTTRAVTAIVAYTTIAGKQFAD LETNIKKVETISDDSFKKISKEVRNMAMDSGTSSKELAGALYEIVSAVGDVPEKYKILEY SNKLAIAGFTDTTTAVDVLTTILNGYGMEMSQVNRVSDILIQTQNKGKIVVSELAQYMGP IISTAKLAKVSLEELAGAMATMTANGVKAPEASTYLKNMMNELIKTGTDADKAFKKIYGK SFLQFKEQGGTLQEALISLNESAKKSGMTLIDVFSSIRSSSGALVLANNMDKFIDSLKAM ENAGGTTDKAFQKMMDTFTQKFKQVKEILKEFGLRIFETIAPQIDKLMEKIKSIDVDKVF SQENINNVVGFGKAIVTLGIALKGLKFANGFLEGLRVLTGAEVKKDMLSVLKNGLGAMKN NVKAQGLGVMAHFARRTPEQMQLPGMGTTSVPKTSMFAGMLSSFQGIIGKFKALFAGGFT GALKSLGSVIGRIIPYILGSVKVFAALLGKLALIGGVVAAVILALKLLWNILSKNKNVTQ VWGKVLTNLKDFISNIIYVGKQLYTFLLNTFTGNGVFGMVGKAIGGLASALGSFFNWILE KANWFLKFIGKGLERANTGDDKKSYWGNAYNPNNSNDGFLSVPSGIGKTKKENKDTILGG NVADLANSAQETANKFREGFEKLMSEIGANITKEFTNEEKLAELEKAKGKYGKYIEEINR AINNVKLDILAEKISYLSKGINIESMEKSFEQKKKDLEQMIEAQKELIERNKIKGIDEYP LKELEDQLKNMEYTLRKLPLDEQKEKLQEFAEAIEKMPIDEQISRYNALIPAIEGQIAIV EQMVNDGLLDEKVLKDYKKNLDEIKKKQQEVQAQGLTTWGKWGQGIQLLANTFSQLGSAT GSKTMSGIGNILGNVFSIGSAMKNWGGMSSITGMFSKAGSFSGGMASLGAVAGAVTGGIA LVGTIGSLIGRSGKKKAAKIDARNKENEEAYKKQISALHQLTQAIQQNSERIKSFADRML TDVAKNPTIRMIVGGENNFDLLHNSMIGGKHFADIVALEKGSARYRSGFRHKHKSTYTKV DIGEAELLRYLGFDKRELDAFTDSEMKQLDNVLNQVNHETLRRATGRNLTESSIEEWKKQ 
VHEFVEQIKYLEREKADLFKGSTLESFSGVEYKTEKELIKEYTEQFKQMGLVGEQYNKTI KEMAKNNQVLITSMLDVRNSTIEGFASGNGGFLTSMKSYFEKIFKNASSVAYDLVFSDLD HYLTQAFEKISNKLVDIKKSGKLDFKGLFSDFDFEKIKNLDIMEKQVKQSLDVIKKELLS RGVDLSLINKMLPWSDFNDRINSLKNALSSAMSAGLEEHKFSSFTKALGQSLYDSVKNSL IKAFSESALYQGMIEKFIKAEDFQSQLEKAGNFKEALRIADGIMKKFGYELEANGLGGFD GINNIRREEDTQLGNAYYTDKAANVNINVTNNFYAEVYGVDDLDTRIAKGTESGIKNWLN RPNGSN >gi|224461500|gb|ACDD01000002.1| GENE 83 61821 - 62102 415 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451518|ref|ZP_05616817.1| ## NR: gi|257451518|ref|ZP_05616817.1| hypothetical protein F3_00542 [Fusobacterium sp. 3_1_5R] # 1 93 1 93 93 117 100.0 3e-25 MEKFEKNELKEVAREAWEALSEVLPKGKEIKVGNERYVKKQVSSEMEEFLLEIEAFKKKM QELDFAYKEVADLTKEFIKARVLELPKYNIFQA >gi|224461500|gb|ACDD01000002.1| GENE 84 62129 - 62356 229 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451519|ref|ZP_05616818.1| ## NR: gi|257451519|ref|ZP_05616818.1| hypothetical protein F3_00547 [Fusobacterium sp. 3_1_5R] # 1 75 1 75 75 105 100.0 8e-22 MEKKTVVTGISYRDNTGEIKTLILSRNEIMEVNGKYLTNMVFSKEELEELEKALGKEIGS IFSHIEYKREDKILS >gi|224461500|gb|ACDD01000002.1| GENE 85 62362 - 62631 334 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451520|ref|ZP_05616819.1| ## NR: gi|257451520|ref|ZP_05616819.1| hypothetical protein F3_00552 [Fusobacterium sp. 
3_1_5R] # 1 89 1 89 89 154 100.0 1e-36 MFELLLKTAQSLADEYKIKCETGEEFLMVNGQKVSLPGNRIYFEFEDGSFTDSMQIFLLK EHLAKDSLKPHFEEVFSKEKKPVKIIVIK >gi|224461500|gb|ACDD01000002.1| GENE 86 62711 - 66604 3772 1297 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0064 NR:ns ## KEGG: Sterm_0064 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 99 432 147 485 763 80 23.0 3e-13 MQLSTLRYQGYTARITNLSTIEELQDWVSECNITLPQSNLISSMEARFQLEEKKINKGNE VKIEILDDVGNVLYTLQGEANIPRRTKSCTGLEVWEYNIKDSYNRLFEKVVSESQTFYDL YLCNINDKHNSLLHKIASALGFREEELDFESVAFENGNLIRLPFVYLEENSRWIDKLQAF IEATDGILYIKNKKLFFRPRNLSINHSFSFNRTNIITSLEESEKEVLQNGIRLVYDRYEK LDNQVVFNLQKKIITEPNTNQDTEVPTMKINFITSAVSNPTLTKASGYYFATEDPNSKVD ITLEENVHYKKMSWKETGAEVKFYNPLPHKLYVDNFEIKGVPLSMYADNEVSVMFPSVLE KRQENFITASKNKFIQTSEQAKFLAKKAMRRGVVNHAEYQFKTPFLHQIEVGGVYGLDLE DIHTVIEITNLSINLRPGVFRMDIQGISVKEELGNVKITSKLSGNPKESYVDLSSVKEDI QSTKSELVEKYDKKLLSLTEKYDEQLVNISDIHKEELSKITEKLGQLSQGIQDNKVKIYT LKPSYDTLTDLNIGDMYVSDTTVEILIKEGNRYVWKAIKDEDTKKQLESYMQSTNSKLVT ISYQFSAPKSPNVGDIWIDTQNDGIWKRWNGSEWEQVDKNVRDTLKKANQDIQKIESSLE TVNNTVNRKILAKAFVQEQEPKSGMKEYDVWYKPSTNTYKVYLNWRWNNASEDDIFPALR HYASLENAKLEIGKKIDKTNERAGLFLTNNDQTFGSKYGELAEVSLDKQGAIRLKNANNL LEWNVKDPWSTTKMKSKFYMGVTDVDKVPDNVYFKIGDETNGFSIELKEGERAKAKLDGK ELSQKFGEVNDKVQQSKEELDRNINNLAQADSENKRDLEEKLNAAKQEINAQLVNADGKW TALQGQYQETVRDVTNFKEQATGRLDSYETALQNGNFVITGRTVFDGNVNIVSKGTNERL EINSGNLSIYRTINGVEKKVTRIGNIRYGSISTDSKGKGVVHFTDMKEPLLVMPTIKAVN FGGNMASAFCYSEYVSACVYKFFIGGTRETYETPKNIKTVGTVVSYSNIYQYTIEGFGFT FPYTDVYSRNADISTPTYSHSNGVDSKYFNSSEYVDKLFKFLYGYDYGNSDEDIWEQDKK RYDIGTTYIKPTFTIELKEYVNGEFNATIFSKEFTLGFYKAKFKRYCINSQDFTHALNIR RNFTNRTNVRVELVVTFTEPRMGINGTDFYESSKKKGIGSDTYTKYSYNVSIYREFELKR FSQASFEGYNITSSYTSSKIEDSLGEGEVSYIAMEID >gi|224461500|gb|ACDD01000002.1| GENE 87 66613 - 67422 462 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451522|ref|ZP_05616821.1| ## NR: gi|257451522|ref|ZP_05616821.1| radical SAM domain-containing protein [Fusobacterium sp. 
3_1_5R] # 1 269 1 269 269 527 100.0 1e-148 MEFKLILGYRCNLKCSYCYQLKEHCNDKEMSYEIIDTFLERYNKLEGSHTINFFGGEPLL YADKIKYIMDRVDKDRTCLSISTNGSLRGTFYELQEYWGRSIGNLLSNKEHGDFTKLNEE SSFRYVVTKENIDELTESRITFLANHYKEKLQFKYDMSSKWELEHIQKMEHVQRILQGVL GGDFYIEMPVDYNSKFVCFVSGTNCFINYNGDYLACHRNPNSKLGNIMKEDFICCKNEYC LDRCTNRPENLYSYGKYEFKGVTLFHSCD >gi|224461500|gb|ACDD01000002.1| GENE 88 67432 - 68076 465 214 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451523|ref|ZP_05616822.1| ## NR: gi|257451523|ref|ZP_05616822.1| hypothetical protein F3_00567 [Fusobacterium sp. 3_1_5R] # 1 214 1 214 214 416 100.0 1e-115 MKYFYVSRNALVRDNAVVVYGEYTIQIPLEAYRSNPNTQDAIEYISEDNNFPNDWAYDSE NDVIFSQKDRPSPYHVFVKGVWIVKDKEGLKKYCEENIDKIKKEVLEYGFDYQGHRQRCR DKDVAYMVANIVALQTAQTLGKEKKVTWYFEDNHGMTAGVQELGVLMLYGTTFVQSVYDT ENYFKTLEEPKIITKEEFEQKRKTIHQALAGGEL >gi|224461500|gb|ACDD01000002.1| GENE 89 68073 - 68324 221 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451524|ref|ZP_05616823.1| ## NR: gi|257451524|ref|ZP_05616823.1| hypothetical protein F3_00572 [Fusobacterium sp. 
3_1_5R] # 1 83 1 83 83 154 100.0 2e-36 MIKIILEKNALNVQGHGNSYICHAVSAVSQYLCSNLEIISYKSGYGYLCATFRDSSISRI LLDNFVRFLKDLHSQEIHLEERR >gi|224461500|gb|ACDD01000002.1| GENE 90 68328 - 69537 1384 403 aa, chain + ## HITS:1 COG:no KEGG:SZO_05380 NR:ns ## KEGG: SZO_05380 # Name: not_defined # Def: collagen-binding collagen-like cell surface-anchored protein FneC # Organism: S.equi_zooepidemicus # Pathway: not_defined # 164 307 382 517 677 76 36.0 2e-12 MQHITNVLVHSNRCEVVDGHIFATGDKGLPHIHLKFLYMFGESSLQGKNLECKYLLPNGQ YSAETVRISSKDEVTFPIHYSCFTVNGWTTLRITLVSGSNRVTLEDITIKTKETKLGQVF TNALVEQAITQAIEITTTSIRAEGDSIKQELKDYIQQEKKNLKGDRGEKGNPGERGPVGE QGPRGYTGEKGSQGQRGEPGPQGQQGLRGEQGVQGNPGKNLEFTWKGTELGVRKEGDWSY SYKDLKGPKGDKGEPGTRGPKGERGVGVTSVTPLNNNQVRLEYGDGQSAVVEIPTVAGQQ GQKGEDGKGLEFKWRGTELGIRQEGSSNYAYQNLKGEPGNSNSVDTSNLAKLDESTTFQR NLTVKGDILSQGNVTAYSDIRLKKNVKNIENPLAKLREIRGVT Prediction of potential genes in microbial genomes Time: Fri May 20 01:26:52 2011 Seq name: gi|224461499|gb|ACDD01000003.1| Fusobacterium sp. 3_1_5R cont1.3, whole genome shotgun sequence Length of sequence - 10973 bp Number of predicted genes - 24, with homology - 18 Number of transcription units - 9, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 539 192 ## gi|257451526|ref|ZP_05616825.1| hypothetical protein F3_00582 2 1 Op 2 . + CDS 536 - 931 568 ## gi|257451527|ref|ZP_05616826.1| hypothetical protein F3_00587 3 1 Op 3 . + CDS 940 - 1332 344 ## COG4824 Phage-related holin (Lysis protein) + Term 1333 - 1365 1.1 + Prom 1416 - 1475 7.6 4 2 Op 1 . + CDS 1506 - 1787 497 ## gi|257451529|ref|ZP_05616828.1| hypothetical protein F3_00597 5 2 Op 2 . + CDS 1836 - 2360 527 ## gi|257451530|ref|ZP_05616829.1| hypothetical protein F3_00602 + Term 2374 - 2405 1.1 + Prom 2386 - 2445 8.9 6 3 Tu 1 . + CDS 2476 - 2586 56 ## + Prom 2636 - 2695 5.2 7 4 Tu 1 . + CDS 2803 - 2976 233 ## + Term 3028 - 3080 2.1 + Prom 3015 - 3074 8.4 8 5 Tu 1 . 
+ CDS 3112 - 3216 74 ## + Term 3292 - 3337 7.6 + Prom 3321 - 3380 14.1 9 6 Tu 1 . + CDS 3412 - 3984 644 ## gi|257451533|ref|ZP_05616832.1| hypothetical protein F3_00617 + Term 3999 - 4031 2.2 - Term 3982 - 4024 1.0 10 7 Tu 1 . - CDS 4262 - 5509 728 ## COG0675 Transposase and inactivated derivatives - Prom 5531 - 5590 7.9 - Term 5558 - 5603 1.4 11 8 Op 1 . - CDS 5712 - 6047 231 ## gi|257451535|ref|ZP_05616834.1| hypothetical protein F3_00627 12 8 Op 2 . - CDS 6044 - 6301 331 ## gi|257451536|ref|ZP_05616835.1| hypothetical protein F3_00632 13 8 Op 3 . - CDS 6303 - 6791 402 ## gi|257451537|ref|ZP_05616836.1| hypothetical protein F3_00637 14 8 Op 4 . - CDS 6821 - 7006 343 ## gi|257451538|ref|ZP_05616837.1| hypothetical protein F3_00642 15 8 Op 5 . - CDS 7054 - 7299 164 ## gi|257451539|ref|ZP_05616838.1| hypothetical protein F3_00647 16 8 Op 6 . - CDS 7286 - 7594 325 ## gi|257451540|ref|ZP_05616839.1| hypothetical protein F3_00652 17 8 Op 7 . - CDS 7618 - 7995 467 ## gi|257451541|ref|ZP_05616840.1| hypothetical protein F3_00657 18 8 Op 8 . - CDS 7992 - 8102 128 ## 19 8 Op 9 . - CDS 8107 - 8169 58 ## 20 8 Op 10 . - CDS 8159 - 8233 78 ## 21 8 Op 11 . - CDS 8241 - 9056 1006 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 9091 - 9150 6.0 22 9 Op 1 . - CDS 9246 - 9830 613 ## COG3617 Prophage antirepressor 23 9 Op 2 . - CDS 9833 - 10090 315 ## gi|257451546|ref|ZP_05616845.1| hypothetical protein F3_00682 24 9 Op 3 . - CDS 10093 - 10971 749 ## Sterm_3911 toprim domain protein Predicted protein(s) >gi|224461499|gb|ACDD01000003.1| GENE 1 3 - 539 192 178 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451526|ref|ZP_05616825.1| ## NR: gi|257451526|ref|ZP_05616825.1| hypothetical protein F3_00582 [Fusobacterium sp. 
3_1_5R] # 1 178 1 178 178 351 100.0 1e-95 MKIALIIGHNQRSRGAYSSIVGSEFDYWKRIAEKIQAEIPELVDVYERKPQKYYGQEMRE VLQELNKHNYKYCLELHFNAGVEQANGCECLIYHKNEQAQELASIFMSRLQNIFGSKIRG IIPVKTDNIRGGYGICHSKDNYILVEPFFGSNTDESLKFSIESDVVDFFVKYIQEVQI >gi|224461499|gb|ACDD01000003.1| GENE 2 536 - 931 568 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451527|ref|ZP_05616826.1| ## NR: gi|257451527|ref|ZP_05616826.1| hypothetical protein F3_00587 [Fusobacterium sp. 3_1_5R] # 1 131 1 131 131 234 100.0 1e-60 MSVTLASVFGSAVGKKVIDKVLDVVEKRIPMSADQRQQLEVELAKTEVEELKAKTEYVKS LGTRVRDAIIPLILFGFFLMHFMIFLSDFINGQIGREAPIVHISGDYTTVVLTIVGFLFT YKGATKISGKK >gi|224461499|gb|ACDD01000003.1| GENE 3 940 - 1332 344 130 aa, chain + ## HITS:1 COG:CAC1842 KEGG:ns NR:ns ## COG: CAC1842 COG4824 # Protein_GI_number: 15895117 # Func_class: R General function prediction only # Function: Phage-related holin (Lysis protein) # Organism: Clostridium acetobutylicum # 22 130 15 125 125 60 34.0 9e-10 MIGLTKTYLALCWTGWIGFLVWLIGGFDLLAKVLLALMLLDFLTGLWVGYKQKILNSKRA YKGLQKKFLILVLLCGASLMHKLVPGIGFRSLVGLFYCATEMLSIIENCAKCGVPIPKKL KKALEQVRDK >gi|224461499|gb|ACDD01000003.1| GENE 4 1506 - 1787 497 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451529|ref|ZP_05616828.1| ## NR: gi|257451529|ref|ZP_05616828.1| hypothetical protein F3_00597 [Fusobacterium sp. 3_1_5R] # 1 93 1 93 93 145 100.0 1e-33 MLEQLNKRKMRLIENGRSGMYEIAYLGSNSVILQAETVEELESKYEALCEELYSEKEYWN IDYTDYKEEVLTQEDFDKNRAEMIARNNKRFGL >gi|224461499|gb|ACDD01000003.1| GENE 5 1836 - 2360 527 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451530|ref|ZP_05616829.1| ## NR: gi|257451530|ref|ZP_05616829.1| hypothetical protein F3_00602 [Fusobacterium sp. 
3_1_5R] # 1 174 1 174 174 261 100.0 1e-68 MAKNGFSSDEQRNETIKKYRNSEKGKATSRKAVAKSNSKKFIRELADEEELKELLYIMKE REDMNENLKKQIDDGIVKIEKIGEWKEADFLKIVAEASRETKKYFIQKKDERDLKSIIKN AVRDLNNREEVLNTMTLLSVKTGIYYGNFIYRIMENYEKEHNAVVAFFLELQGA >gi|224461499|gb|ACDD01000003.1| GENE 6 2476 - 2586 56 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKKKKRIKKELLQIVILIIQLVIAILGLIREIIRK >gi|224461499|gb|ACDD01000003.1| GENE 7 2803 - 2976 233 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVKQTEANKKWQEKNKERAKYLSDRSRTKSFIRNLSTLEDLEEIQKLILDRKKELS >gi|224461499|gb|ACDD01000003.1| GENE 8 3112 - 3216 74 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIKAFLIKNYNVPYKKYLTPCRIYVILNVARKY >gi|224461499|gb|ACDD01000003.1| GENE 9 3412 - 3984 644 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451533|ref|ZP_05616832.1| ## NR: gi|257451533|ref|ZP_05616832.1| hypothetical protein F3_00617 [Fusobacterium sp. 3_1_5R] # 1 190 1 190 190 332 100.0 6e-90 MKLFKNVGIEDLESILKNGILPISETGNDNWEEGKRGNNAKDVVYLFSPKLEVNSFPKAY GIVLLEVEVEAQLNTFEKNDEHIDDYDEYIVDRVEVQDIKAVYIPKIFEDRVREFLTEET LKKVTFVEISAKHYQDFNLIEADEKTLSAFGKTAEIDSTEFNFFRGKRKVQGLFSEIEQI FDLYKVVYKF >gi|224461499|gb|ACDD01000003.1| GENE 10 4262 - 5509 728 415 aa, chain - ## HITS:1 COG:alr7153 KEGG:ns NR:ns ## COG: alr7153 COG0675 # Protein_GI_number: 17233169 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 5 391 4 381 408 193 32.0 6e-49 MYLTVKQQLKHLSKEEYLSLRELSHTAKNLYNQAVYNIRQYYFQENKYLNYQKNNSFLKS SENYKTLNSNMSQQILKEVDGSFKSFFGLLKKKNKGMYNAKVKLPDYLPKNSFTTLVIGF VRLNEDTFVIPYSTSFKKNHKKISIKIPPILLDKKIKEIRIIPKFNARFFEVQYTYEVEE EQKNLDKNHALAIDFGISNLATCVTSKGRSFIIDGKKLKSINQWFNKENARLQSIKDKQK YGKKPTLRQKYLYSSRNNKVNDYMSKAARKIINYCLENNIGTLVCGYNETFQRNSNIGKA NNQTFVNIPFGKLREKLEYLCKLYSLKFVEQEESYTSKSSFFDMDILPKFEVDKPQTYSF LGKRIKRGLYQTSKGYIFNADVNGALNILRKSNVVDLEVLYSRGEVDTPARIRIA >gi|224461499|gb|ACDD01000003.1| GENE 11 5712 - 6047 231 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451535|ref|ZP_05616834.1| ## NR: gi|257451535|ref|ZP_05616834.1| hypothetical protein F3_00627 [Fusobacterium sp. 3_1_5R] # 1 111 1 111 111 172 100.0 6e-42 MKIWKILFLAMLNFRIGIFYILYKYDLKNREYLYMTEAAKYYKKKTDILTLFFVVFLFLE IDFLFSFISKVGETVEEIHHFLRFYFCKKLSDKVYDLFLAKNEPDHVEGDD >gi|224461499|gb|ACDD01000003.1| GENE 12 6044 - 6301 331 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451536|ref|ZP_05616835.1| ## NR: gi|257451536|ref|ZP_05616835.1| hypothetical protein F3_00632 [Fusobacterium sp. 3_1_5R] # 1 85 1 85 85 147 100.0 3e-34 MYLNESCCGNLYVTSVYVDEECETCGNRYGTIFSFYSPKELLENLKKENYTEEAIEEVFK DADGFLKEHTYTKWLQENINKKRSK >gi|224461499|gb|ACDD01000003.1| GENE 13 6303 - 6791 402 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451537|ref|ZP_05616836.1| ## NR: gi|257451537|ref|ZP_05616836.1| hypothetical protein F3_00637 [Fusobacterium sp. 3_1_5R] # 1 162 1 162 162 270 100.0 3e-71 MKEPKSFKDICELQKELDSHIVNVRERTGRDIVKSMIAEIIEFDEETKDSHKTWKTKEYN SQKELEELTDVYFFFAQLVNNEKYLEEWKLEELFNKKETEIFKPDSLTMIMYATGVVRSL YSVFSALVDLTLEYGYTKEDILKTYWKKWQYNMSERIGKEWN >gi|224461499|gb|ACDD01000003.1| GENE 14 6821 - 7006 343 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451538|ref|ZP_05616837.1| ## NR: gi|257451538|ref|ZP_05616837.1| hypothetical protein F3_00642 [Fusobacterium sp. 
3_1_5R] # 1 61 1 61 61 124 100.0 1e-27 MSDVIQFNEKHKWCGCFGYVANEKKGRYMIAVAIPQKGTAYIFATKEEFDIVGKTNLVLA D >gi|224461499|gb|ACDD01000003.1| GENE 15 7054 - 7299 164 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451539|ref|ZP_05616838.1| ## NR: gi|257451539|ref|ZP_05616838.1| hypothetical protein F3_00647 [Fusobacterium sp. 3_1_5R] # 1 81 1 81 81 109 100.0 6e-23 MLTDDEKIEIIYNSLAKKEVLHFNDQYERMDFNKYMLYLNFLESQHLIKIKMYNSILSKK SKIVILKSKKKFSSNTNYTKR >gi|224461499|gb|ACDD01000003.1| GENE 16 7286 - 7594 325 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451540|ref|ZP_05616839.1| ## NR: gi|257451540|ref|ZP_05616839.1| hypothetical protein F3_00652 [Fusobacterium sp. 3_1_5R] # 1 102 1 102 102 127 100.0 2e-28 MLQFKDFLSARNMFSDKEESELTQAKKKIIKKIMKAMLEGRVNPLSNRESQVYKLREQGL DYEEIAKKLGIKETTCRKRLSSANKKIRIMIELFEEEQNANR >gi|224461499|gb|ACDD01000003.1| GENE 17 7618 - 7995 467 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451541|ref|ZP_05616840.1| ## NR: gi|257451541|ref|ZP_05616840.1| hypothetical protein F3_00657 [Fusobacterium sp. 
3_1_5R] # 1 125 1 125 125 154 100.0 2e-36 MTLGEQIKKYREKYGLSQRGFSNKVNISQAYISMLEAEKEVKLDPEKLEVLEKLLAEDLL NTIVKNEEEKENKNMEKKDNDLVQENKALKENIKELIKIIEKIYSDMDAFALGVKIGTLK SKIKD >gi|224461499|gb|ACDD01000003.1| GENE 18 7992 - 8102 128 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKILEETENVYKKVILSGRGSQTLLFMDCIYKEKL >gi|224461499|gb|ACDD01000003.1| GENE 19 8107 - 8169 58 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENSLELFLNVTKKSFLQRY >gi|224461499|gb|ACDD01000003.1| GENE 20 8159 - 8233 78 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRYIFIVSGRSYDNEEEINSGGK >gi|224461499|gb|ACDD01000003.1| GENE 21 8241 - 9056 1006 271 aa, chain - ## HITS:1 COG:YGR231c KEGG:ns NR:ns ## COG: YGR231c COG0330 # Protein_GI_number: 6321670 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Saccharomyces cerevisiae # 29 221 59 254 315 79 26.0 9e-15 MKKEFKMFGTIFVSVLVIIICALLFTNCYSVDTGEVAIISRFGKINRIDTEGLNFKLPFV ESKQFMEIREKTYIFGKTEEADTTLEVSTKDMQSIHIDLTVQANIVDPEKLYRAFQNKYE YRFVRPRVKEVVQATIAKYTIEEFVSKRAEISRIINKDISDDLAVYGMNVSNVSIVNHDF SDEYEKAIEQKKVAEQAVEKAKAEQAKLLVEQENKVKIAEFKLKEKELQARANAVEAQSL SPMLIKKMAIEKWDGHLPKVQGEGNTLIKLD >gi|224461499|gb|ACDD01000003.1| GENE 22 9246 - 9830 613 194 aa, chain - ## HITS:1 COG:SPy0980_1 KEGG:ns NR:ns ## COG: SPy0980_1 COG3617 # Protein_GI_number: 15674990 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Streptococcus pyogenes M1 GAS # 3 123 2 122 123 132 53.0 3e-31 MNEIKMFKNEKFGEIRIIEKEGKPYFNLKDVCVILGLEQVSRVKSRLKEDGVILNKVIDN LGREQQANFIDEPNLYKCIFQSRKENAEEFTDWVTSEVLPTIRKHGIYATDSVIDNILNN PDFGIELLTKLKEERNARIKAERRNAILTHINKTYTMTEIAKELNLKSATQLNKILSDKN LIFYQRNLGNVLSI >gi|224461499|gb|ACDD01000003.1| GENE 23 9833 - 10090 315 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451546|ref|ZP_05616845.1| ## NR: gi|257451546|ref|ZP_05616845.1| hypothetical protein F3_00682 [Fusobacterium sp. 
3_1_5R] # 1 85 1 85 85 158 100.0 9e-38 MEKEKILQAYRNAISQLPNEITIAESRKIVEYNSFSYVIPKGARVIKADDEFLGKWFYIQ AVLEKCLKEKILTWAEVQEFCFKGG >gi|224461499|gb|ACDD01000003.1| GENE 24 10093 - 10971 749 292 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3911 NR:ns ## KEGG: Sterm_3911 # Name: not_defined # Def: toprim domain protein # Organism: S.termitidis # Pathway: not_defined # 3 276 315 592 607 160 38.0 7e-38 GDDAFNRMSGGMRPGEVIIFTGNPGSGKSTFVNNLMANLVEQGIKVFTMQGEFRKEVFKT NICKILSRPGQIETFKHPLKDKLYGKISYEQEKKINTWLKGKITIHTEQTPTKADLIETM EQAYKKNGVKVFVIDNLMTINIDSADKYEAQKNLFIELQEFVKKYNVCLMIVAHPKKNIV KALDEVDDFIISGASEIVNLANAVVFLKRLSEDEVKKLQEQGFEASVGAILTKDRKYGDI RSKGFWNYEIKTGRFLDIKNPEKTKYKEYGWEKLEKKRLIVEMDNLPFLNEG Prediction of potential genes in microbial genomes Time: Fri May 20 01:28:59 2011 Seq name: gi|224461498|gb|ACDD01000004.1| Fusobacterium sp. 3_1_5R cont1.4, whole genome shotgun sequence Length of sequence - 73160 bp Number of predicted genes - 118, with homology - 105 Number of transcription units - 28, operones - 15 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 914 623 ## Sterm_3911 toprim domain protein 2 1 Op 2 . - CDS 911 - 2002 957 ## CPF_1598 hypothetical protein 3 1 Op 3 . - CDS 1999 - 2259 352 ## gi|257451550|ref|ZP_05616849.1| hypothetical protein F3_00702 - Prom 2285 - 2344 1.8 4 2 Tu 1 . + CDS 2416 - 2817 148 ## gi|257451551|ref|ZP_05616850.1| hypothetical protein F3_00707 5 3 Tu 1 . - CDS 2845 - 3132 300 ## gi|257451552|ref|ZP_05616851.1| hypothetical protein F3_00712 - Prom 3306 - 3365 5.8 6 4 Tu 1 . - CDS 3380 - 3514 56 ## - Prom 3543 - 3602 5.8 - Term 3619 - 3660 0.6 7 5 Tu 1 . - CDS 3726 - 3818 77 ## - Prom 3852 - 3911 4.2 - TRNA 4436 - 4520 66.6 # Ser GCT 0 0 8 6 Tu 1 . - CDS 4747 - 4953 275 ## gi|257451553|ref|ZP_05616852.1| hypothetical protein F3_00717 - Prom 5041 - 5100 11.3 - Term 5085 - 5113 1.0 9 7 Op 1 . - CDS 5114 - 5680 176 ## Mbar_A2117 hypothetical protein 10 7 Op 2 . 
- CDS 5685 - 5993 406 ## Nwat_0049 hypothetical protein 11 7 Op 3 . - CDS 6010 - 6939 539 ## Neut_0161 ATP-binding region, ATPase-like - Prom 6965 - 7024 3.8 12 8 Op 1 . - CDS 7026 - 7358 296 ## gi|257451557|ref|ZP_05616856.1| hypothetical protein F3_00737 13 8 Op 2 . - CDS 7400 - 7495 96 ## - Prom 7538 - 7597 2.2 14 9 Op 1 . - CDS 7601 - 7813 257 ## Sterm_1388 protein of unknown function DUF955 15 9 Op 2 . - CDS 7822 - 8289 636 ## gi|257451560|ref|ZP_05616859.1| transcriptional regulator - Prom 8324 - 8383 11.9 16 10 Tu 1 . - CDS 8426 - 9454 1078 ## SACE_2836 LexA repressor (EC:3.4.21.88) - Prom 9594 - 9653 8.9 + Prom 9571 - 9630 9.2 17 11 Op 1 . + CDS 9841 - 10206 386 ## gi|257451562|ref|ZP_05616861.1| hypothetical protein F3_00762 18 11 Op 2 . + CDS 10243 - 10809 480 ## gi|257451563|ref|ZP_05616862.1| hypothetical protein F3_00767 19 11 Op 3 . + CDS 10814 - 11047 348 ## gi|257451564|ref|ZP_05616863.1| hypothetical protein F3_00772 20 11 Op 4 . + CDS 11050 - 11220 275 ## gi|257451565|ref|ZP_05616864.1| hypothetical protein F3_00777 21 11 Op 5 . + CDS 11230 - 11478 365 ## gi|257451566|ref|ZP_05616865.1| hypothetical protein F3_00782 22 11 Op 6 . + CDS 11479 - 11547 167 ## 23 11 Op 7 . + CDS 11558 - 11752 295 ## gi|257451567|ref|ZP_05616866.1| hypothetical protein F3_00787 24 11 Op 8 . + CDS 11768 - 12394 674 ## COG5377 Phage-related protein, predicted endonuclease 25 11 Op 9 . + CDS 12407 - 13303 1166 ## CLL_A1941 hypothetical protein 26 11 Op 10 . + CDS 13313 - 14071 989 ## Sterm_1235 phage recombination protein Bet 27 11 Op 11 . + CDS 14078 - 14254 171 ## gi|257451571|ref|ZP_05616870.1| hypothetical protein F3_00807 28 11 Op 12 . + CDS 14254 - 14640 552 ## COG0629 Single-stranded DNA-binding protein 29 11 Op 13 . + CDS 14660 - 14785 205 ## gi|257451573|ref|ZP_05616872.1| hypothetical protein F3_00817 + Term 14939 - 14991 -0.6 30 12 Tu 1 . 
- CDS 14776 - 15087 384 ## gi|257451574|ref|ZP_05616873.1| hypothetical protein F3_00822 - Prom 15161 - 15220 11.5 + Prom 15143 - 15202 8.9 31 13 Tu 1 . + CDS 15274 - 15447 190 ## Lebu_0982 hypothetical protein + Prom 15471 - 15530 5.1 32 14 Op 1 . + CDS 15561 - 15677 198 ## gi|257451576|ref|ZP_05616875.1| hypothetical protein F3_00832 33 14 Op 2 . + CDS 15674 - 15919 243 ## gi|257451577|ref|ZP_05616876.1| hypothetical protein F3_00837 + Prom 15933 - 15992 10.8 34 15 Op 1 . + CDS 16016 - 16192 236 ## gi|257451578|ref|ZP_05616877.1| hypothetical protein F3_00842 + Term 16239 - 16286 0.1 + Prom 16237 - 16296 4.2 35 15 Op 2 . + CDS 16316 - 16858 593 ## gi|257451579|ref|ZP_05616878.1| hypothetical protein F3_00847 + Term 16877 - 16911 3.6 36 16 Op 1 . + CDS 16925 - 17146 364 ## gi|257451580|ref|ZP_05616879.1| hypothetical protein F3_00852 37 16 Op 2 . + CDS 17143 - 17931 852 ## Sterm_0893 hypothetical protein 38 16 Op 3 . + CDS 17888 - 18100 243 ## gi|257451582|ref|ZP_05616881.1| hypothetical protein F3_00862 39 16 Op 4 . + CDS 18113 - 18388 291 ## gi|257451583|ref|ZP_05616882.1| hypothetical protein F3_00867 40 16 Op 5 . + CDS 18385 - 19074 986 ## gi|257451584|ref|ZP_05616883.1| hypothetical protein F3_00872 41 16 Op 6 . + CDS 19119 - 19325 239 ## gi|257451585|ref|ZP_05616884.1| hypothetical protein F3_00877 42 16 Op 7 . + CDS 19303 - 19653 341 ## gi|257451586|ref|ZP_05616885.1| hypothetical protein F3_00882 43 16 Op 8 . + CDS 19662 - 20135 404 ## gi|257451587|ref|ZP_05616886.1| hypothetical protein F3_00887 44 16 Op 9 . + CDS 20132 - 20530 406 ## CKR_1756 hypothetical protein 45 16 Op 10 . + CDS 20517 - 20729 264 ## gi|257451589|ref|ZP_05616888.1| hypothetical protein F3_00897 46 16 Op 11 . + CDS 20729 - 20941 206 ## gi|257451590|ref|ZP_05616889.1| hypothetical protein F3_00902 47 16 Op 12 . + CDS 20934 - 21143 330 ## gi|257451591|ref|ZP_05616890.1| hypothetical protein F3_00907 48 16 Op 13 . 
+ CDS 21140 - 21439 393 ## gi|257451592|ref|ZP_05616891.1| hypothetical protein F3_00912 49 16 Op 14 . + CDS 21436 - 21726 383 ## Mvol_0505 hypothetical protein 50 16 Op 15 . + CDS 21707 - 21958 214 ## gi|257451594|ref|ZP_05616893.1| hypothetical protein F3_00922 51 16 Op 16 . + CDS 21933 - 22139 277 ## gi|257451595|ref|ZP_05616894.1| hypothetical protein F3_00927 52 16 Op 17 . + CDS 22136 - 22411 336 ## gi|257451596|ref|ZP_05616895.1| hypothetical protein F3_00932 53 16 Op 18 . + CDS 22415 - 22534 257 ## 54 16 Op 19 . + CDS 22539 - 23198 712 ## gi|257451598|ref|ZP_05616897.1| hypothetical protein F3_00942 55 16 Op 20 . + CDS 23242 - 23787 557 ## gi|257451599|ref|ZP_05616898.1| hypothetical protein F3_00947 56 16 Op 21 . + CDS 23777 - 24763 771 ## COG0286 Type I restriction-modification system methyltransferase subunit 57 16 Op 22 . + CDS 24809 - 25111 290 ## gi|257451601|ref|ZP_05616900.1| hypothetical protein F3_00957 58 16 Op 23 . + CDS 25123 - 25401 241 ## gi|257451602|ref|ZP_05616901.1| hypothetical protein F3_00962 59 16 Op 24 . + CDS 25398 - 25631 265 ## gi|257451603|ref|ZP_05616902.1| hypothetical protein F3_00967 60 16 Op 25 . + CDS 25653 - 26483 927 ## Blon_1319 hypothetical protein 61 16 Op 26 . + CDS 26523 - 28097 1340 ## GFO_2470 DNA-methyltransferase 62 16 Op 27 2/0.000 + CDS 28115 - 28750 352 ## COG0286 Type I restriction-modification system methyltransferase subunit 63 16 Op 28 . + CDS 28743 - 29783 910 ## COG0582 Integrase 64 16 Op 29 . + CDS 29845 - 30090 469 ## Pjdr2_1597 phage protein 65 16 Op 30 . + CDS 30090 - 30686 563 ## BC1875 phage protein 66 16 Op 31 . + CDS 30756 - 30869 185 ## 67 16 Op 32 . + CDS 30847 - 32208 1220 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 68 16 Op 33 . + CDS 32228 - 32800 498 ## gi|257451612|ref|ZP_05616911.1| hypothetical protein F3_01012 + Term 32957 - 32992 0.1 - Term 32776 - 32817 -0.4 69 17 Tu 1 . 
- CDS 32823 - 33623 1045 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity - Prom 33651 - 33710 9.4 - Term 33855 - 33886 1.8 70 18 Op 1 . - CDS 33887 - 34069 218 ## gi|257451614|ref|ZP_05616913.1| hypothetical protein F3_01022 - Prom 34094 - 34153 4.4 71 18 Op 2 . - CDS 34156 - 34269 142 ## 72 18 Op 3 . - CDS 34334 - 34426 124 ## 73 18 Op 4 . - CDS 34401 - 34472 60 ## 74 18 Op 5 . - CDS 34489 - 34596 139 ## - Prom 34638 - 34697 3.8 - Term 34625 - 34660 -0.7 75 19 Op 1 . - CDS 34755 - 34862 93 ## - Prom 34892 - 34951 7.6 76 19 Op 2 . - CDS 34955 - 35230 384 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 35260 - 35319 6.3 + Prom 35290 - 35349 10.6 77 20 Op 1 . + CDS 35396 - 35545 178 ## gi|257451618|ref|ZP_05616917.1| hypothetical protein F3_01042 78 20 Op 2 . + CDS 35457 - 35669 131 ## 79 20 Op 3 . + CDS 35671 - 35982 342 ## FN0165 hypothetical protein + Term 36052 - 36106 7.9 + Prom 36415 - 36474 11.0 80 21 Op 1 . + CDS 36505 - 36888 662 ## FN1869 hypothetical protein 81 21 Op 2 . + CDS 36911 - 37726 1237 ## COG3246 Uncharacterized conserved protein 82 21 Op 3 . + CDS 37752 - 38789 1572 ## FN1867 Zn-dependent alcohol dehydrogenase and related dehydrogenase 83 21 Op 4 . + CDS 38854 - 40113 1549 ## COG1509 Lysine 2,3-aminomutase 84 21 Op 5 . + CDS 40138 - 41220 1043 ## FN1865 hypothetical protein 85 21 Op 6 . + CDS 41157 - 42626 1683 ## COG1193 Mismatch repair ATPase (MutS family) 86 21 Op 7 . + CDS 42633 - 44195 2164 ## FN1863 L-beta-lysine 5,6-aminomutase alpha subunit (EC:5.4.3.3) 87 21 Op 8 . + CDS 44195 - 44992 1367 ## COG5012 Predicted cobalamin binding protein + Term 45017 - 45065 10.9 + Prom 45038 - 45097 12.4 88 22 Op 1 . + CDS 45151 - 46305 1278 ## FN0336 hypothetical protein 89 22 Op 2 1/0.000 + CDS 46318 - 46887 199 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 + Term 46892 - 46940 6.0 90 22 Op 3 . 
+ CDS 46960 - 48198 1542 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase 91 22 Op 4 11/0.000 + CDS 48206 - 49051 1212 ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain 92 22 Op 5 . + CDS 49064 - 49624 944 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain 93 22 Op 6 . + CDS 49621 - 50181 821 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) 94 22 Op 7 10/0.000 + CDS 50247 - 51233 1524 ## COG2376 Dihydroxyacetone kinase 95 22 Op 8 9/0.000 + CDS 51244 - 51855 976 ## COG2376 Dihydroxyacetone kinase 96 22 Op 9 . + CDS 51864 - 52268 413 ## COG3412 Uncharacterized protein conserved in bacteria 97 22 Op 10 . + CDS 52337 - 52399 63 ## 98 22 Op 11 18/0.000 + CDS 52412 - 53161 1303 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) 99 22 Op 12 3/0.000 + CDS 53174 - 54673 2490 ## COG0554 Glycerol kinase + Term 54753 - 54785 -1.0 + Prom 54689 - 54748 2.1 100 22 Op 13 6/0.000 + CDS 54813 - 56246 2178 ## COG0579 Predicted dehydrogenase 101 22 Op 14 4/0.000 + CDS 56261 - 57577 1874 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 102 22 Op 15 . + CDS 57577 - 57921 484 ## COG3862 Uncharacterized protein with conserved CXXC pairs + Prom 57923 - 57982 2.4 103 22 Op 16 . + CDS 58006 - 58470 398 ## COG4574 Serine protease inhibitor ecotin + Term 58497 - 58542 8.3 + Prom 58479 - 58538 6.3 104 23 Tu 1 . + CDS 58701 - 59354 940 ## COG1802 Transcriptional regulators + Prom 59424 - 59483 7.9 105 24 Op 1 1/0.000 + CDS 59514 - 60863 2038 ## COG3493 Na+/citrate symporter 106 24 Op 2 . + CDS 60891 - 62279 2294 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 107 24 Op 3 . + CDS 62288 - 62623 389 ## gi|257451647|ref|ZP_05616946.1| hypothetical protein F3_01187 108 24 Op 4 . 
+ CDS 62632 - 62976 625 ## COG1038 Pyruvate carboxylase 109 24 Op 5 1/0.000 + CDS 62990 - 64108 1894 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 110 24 Op 6 6/0.000 + CDS 64122 - 64406 495 ## COG3052 Citrate lyase, gamma subunit 111 24 Op 7 6/0.000 + CDS 64417 - 65319 1346 ## COG2301 Citrate lyase beta subunit 112 24 Op 8 . + CDS 65321 - 66862 2551 ## COG3051 Citrate lyase, alpha subunit + Term 66874 - 66923 6.5 113 25 Tu 1 . - CDS 66979 - 67644 661 ## COG1451 Predicted metal-dependent hydrolase - Term 67654 - 67685 2.7 114 26 Op 1 . - CDS 67695 - 68483 839 ## COG2357 Uncharacterized protein conserved in bacteria 115 26 Op 2 . - CDS 68503 - 69348 624 ## FN0925 hypothetical protein 116 26 Op 3 . - CDS 69381 - 70175 624 ## FN0924 hypothetical protein - Prom 70210 - 70269 9.2 + Prom 70231 - 70290 5.6 117 27 Tu 1 . + CDS 70311 - 71768 1272 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Prom 71966 - 72025 10.0 118 28 Tu 1 .
+ CDS 72053 - 73126 1531 ## COG0505 Carbamoylphosphate synthase small subunit Predicted protein(s) >gi|224461498|gb|ACDD01000004.1| GENE 1 2 - 914 623 304 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3911 NR:ns ## KEGG: Sterm_3911 # Name: not_defined # Def: toprim domain protein # Organism: S.termitidis # Pathway: not_defined # 1 292 1 304 607 117 33.0 5e-25 MTMIEFLDSFGIPYKFKGDEAVFELCPFCKHEHSKNRFLVNTKTGAYICNRQNSCGVKGH FKWDSLPKTKEKKEQKEYVTLRMADFNQDFTQEMLDYMAGRGISKETLINSKIFNRNGRF CFFYVGEDEAGTCIGVKYRTIDKKISAAKGSVMNLLNWRLVPKDSKELYIVEGEVDLLSL LEIGIKNVVSVPNGAGSHDWIDYHYEWLEKFKKIILIMDNDEAGKKGIKAIYDRLKHSEI EIKKINLLFYKDPNEILMDESGRMKLKKILETGEEDIMEPKMMDISDVHCDTDDEYYSWG DDAF >gi|224461498|gb|ACDD01000004.1| GENE 2 911 - 2002 957 363 aa, chain - ## HITS:1 COG:no KEGG:CPF_1598 NR:ns ## KEGG: CPF_1598 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 114 1 112 246 100 49.0 1e-19 MKYTIHGYKQEKLYENGLDNDDALILRVLSDMYSSGSKKIHFKIIDQEKYMWITYEYLLE QIIVIGSKNKLIRRIDNLITKNILKKYLETSKNGVKGRFLYVCFSEKHAILTDYDDTKEQ FSKNEKERSKNTKTQNDITPNEPKPISSMTKTQNDITPNEPKPISSMTKTQNDITPNEPK PISSMTKTQNEYDQNPKRVNKDSTINYSSINYSSIKKNKETTTNVVVKKEKSLYKQEDAS ALQAEAEALISNSAYDNKIKHLLSHWWFKEKPYFIKKKLKSNSSLKSILKLDFILNHRYF EEFIKWITEKEWQTIEARYFEVFLQNKKAEERIKNPPKSYVERKQEEEKEKNLDNLLKGA VNI >gi|224461498|gb|ACDD01000004.1| GENE 3 1999 - 2259 352 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451550|ref|ZP_05616849.1| ## NR: gi|257451550|ref|ZP_05616849.1| hypothetical protein F3_00702 [Fusobacterium sp. 3_1_5R] # 1 86 1 86 86 152 100.0 6e-36 MGRRNYLLETEMTLKAKGLLALMLDLPGNYWASICILKKYCKESTQTIKKTLKELEEFGY LKIDKKIKKWEISVTPFPQEQECDVK >gi|224461498|gb|ACDD01000004.1| GENE 4 2416 - 2817 148 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451551|ref|ZP_05616850.1| ## NR: gi|257451551|ref|ZP_05616850.1| hypothetical protein F3_00707 [Fusobacterium sp. 
3_1_5R] # 6 133 1 128 128 188 100.0 8e-47 MKGAIMKSHLIDEKLVKEKPEFLLKFQNTKINFLIEKTFVTGFVPEEFNENFYKDKTLIE FEIDKNNIVEDLKSFIKNYFYSSKQLYIVFTKKINWEEISLTEKIELNAKNFDIHFLDNK VQILLLIDKANFK >gi|224461498|gb|ACDD01000004.1| GENE 5 2845 - 3132 300 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451552|ref|ZP_05616851.1| ## NR: gi|257451552|ref|ZP_05616851.1| hypothetical protein F3_00712 [Fusobacterium sp. 3_1_5R] # 1 95 1 95 95 129 100.0 7e-29 MELLEAIRKNNEKLQEQNESIRKECEELRVKAELLTSLLHIVPTKLLCEELKRRCGQDFI EVERTEKIFISTKKGITETSSKDIILMLRKEDIKF >gi|224461498|gb|ACDD01000004.1| GENE 6 3380 - 3514 56 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKVFVRIRVHSYFLHFFIANKRKIIRTVSYSPEKCKKLYYKNS >gi|224461498|gb|ACDD01000004.1| GENE 7 3726 - 3818 77 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLNYLHFVDLNKIVLPTSEKLFYRRRSNS >gi|224461498|gb|ACDD01000004.1| GENE 8 4747 - 4953 275 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451553|ref|ZP_05616852.1| ## NR: gi|257451553|ref|ZP_05616852.1| hypothetical protein F3_00717 [Fusobacterium sp. 
3_1_5R] # 1 68 1 68 68 103 100.0 4e-21 MKFGEKLYKKIEICLIRGNVSKAKVAKALGITPTTLSRQFKTLKEEDHISTVTLKKIEEL TGEKIFIL >gi|224461498|gb|ACDD01000004.1| GENE 9 5114 - 5680 176 188 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A2117 NR:ns ## KEGG: Mbar_A2117 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # Pathway: not_defined # 1 187 32 201 219 62 29.0 7e-09 MKISHISSHQVQPTKKYIFDTNIWLYLLPLQRNQSGYHQNNAALYSSLLSNILSNGCKIA ILSIEVSEIFNVYLRERGKFFLNSQNKQYSSRNYKRIYRKSQSFINDKNYIASEISQNIL TFCSKVDDNFSVLSNNCMLGNGIMAFDFNDNYLLNFCELENYVFVSHDKDCQNISYQNLN LEVVTANI >gi|224461498|gb|ACDD01000004.1| GENE 10 5685 - 5993 406 102 aa, chain - ## HITS:1 COG:no KEGG:Nwat_0049 NR:ns ## KEGG: Nwat_0049 # Name: not_defined # Def: hypothetical protein # Organism: N.watsoni # Pathway: not_defined # 1 97 5 100 119 64 34.0 1e-09 MKIKVFEIIGGQFAVSAEKGKTLYDSIVKNISLGEIPIILDFEGIEMTISTFFNLAYGEL FKDYTEEEVEKLVKFENAKEISLSQIKQVKINALKLYRRGVD >gi|224461498|gb|ACDD01000004.1| GENE 11 6010 - 6939 539 309 aa, chain - ## HITS:1 COG:no KEGG:Neut_0161 NR:ns ## KEGG: Neut_0161 # Name: not_defined # Def: ATP-binding region, ATPase-like # Organism: N.eutropha # Pathway: not_defined # 14 308 8 279 290 92 28.0 1e-17 MKKNIHKLTAPIKIRNDFFTYETIAIFFDKIQKLSNVLGENVVHFDYSETIQIDGNMVVF LDMIKRYHFEKLGQSTFSGFNDEIKNLLCRNRYLNNSPEKLVSEEEILQDRKKTTVQFAR HEIKRIEKEEEEKTTEKFCDYMLTEVISHYRFDGVSEIIKNKILECTTELCANVIQHTNS SFISTCGQFFPVSRKFIFSMSNLGETFYDNINSFIKKNDDIDCIKWALKNSHTTKTTSGG LGLYTLFRFMHKIEGKIQIISGKGFYEAAFYSKNGDIKKREVMKNLTYKIPGTIITIIID LKKKIHYYY >gi|224461498|gb|ACDD01000004.1| GENE 12 7026 - 7358 296 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451557|ref|ZP_05616856.1| ## NR: gi|257451557|ref|ZP_05616856.1| hypothetical protein F3_00737 [Fusobacterium sp. 
3_1_5R] # 1 110 1 110 110 194 100.0 1e-48 METIEINGHTGGKSIKIWDDIYCLFTKNKVIELKKVVYKAHSFYIRKIAENDKTVTYQIY VDEGIEYELNVFLHEIYRNTESMFCEYFIEPVNEEIDFSEKPSLKLIVTK >gi|224461498|gb|ACDD01000004.1| GENE 13 7400 - 7495 96 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEGILGEEYVPDKELFLKKEILEEIKEYLG >gi|224461498|gb|ACDD01000004.1| GENE 14 7601 - 7813 257 70 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1388 NR:ns ## KEGG: Sterm_1388 # Name: not_defined # Def: protein of unknown function DUF955 # Organism: S.termitidis # Pathway: not_defined # 2 70 4 72 153 65 49.0 6e-10 MNIRLRVKNLIERCGTRNIFKICKKLNIEVVFMELGNIKGFYNSAVGNKFIAINDKLTEW EIKIVLAHEL >gi|224461498|gb|ACDD01000004.1| GENE 15 7822 - 8289 636 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451560|ref|ZP_05616859.1| ## NR: gi|257451560|ref|ZP_05616859.1| transcriptional regulator [Fusobacterium sp. 3_1_5R] # 1 155 1 155 155 203 100.0 3e-51 MRTTGEILKQKREMLGITAEKLGELTKVTQAYVTMTENNATKPSKAYLTNVKKILHITDL EQHEIEEYEEFRRLPEKIQKKLISLDKKMDSRISTLNTRELNQYEATLSQASSFFGDEKV SEEDKKKLLDAMTEMFFIGKAKNKQKYAKNKTNKK >gi|224461498|gb|ACDD01000004.1| GENE 16 8426 - 9454 1078 342 aa, chain - ## HITS:1 COG:no KEGG:SACE_2836 NR:ns ## KEGG: SACE_2836 # Name: lexA # Def: LexA repressor (EC:3.4.21.88) # Organism: S.erythraea # Pathway: not_defined # 162 307 80 219 220 67 32.0 8e-10 MEIRILLKKLRESRNLTIKELSEKAKVGNGTIGDIESGRNTPRKTTLEKLAKALNLNSDE RVELFATLVPLDVSKKIISSKNYKYVEENSIKFEKKFPTYLKNFMSENNYDITFLCKKIK TSEILLNNYLSGIETPNNDFIDSFIKTFKISRIEAEHIKAYIDYDNNFKDPSTISNIDFE LSYELIEVPVYSSVSAGYGYSPESLPIKYVSIPKIDGDIIGIRVSGDSMEKTIYDGDIVI VKKDVEVNIGDIGVFLLNKEFGDGVVKRLAKKNGVFVLESDNPYYKPIEIKSSEVITCGK VIKIIRSSTNKQKDPFVELFNSLSADKQQDLLNYAEFLKNKK >gi|224461498|gb|ACDD01000004.1| GENE 17 9841 - 10206 386 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451562|ref|ZP_05616861.1| ## NR: gi|257451562|ref|ZP_05616861.1| hypothetical protein F3_00762 [Fusobacterium sp. 
3_1_5R] # 1 121 1 121 121 229 100.0 3e-59 MSTKCLILKFEKGKFQGITCQYDGYLENVGMKLLNGFNSNKKLNELIELGNISSLGNGIK DTVSYEEDGYVFSHIREIEEFTGNPDYVYVRFLDGWYFTNNVRKCRSTLEKLTKKTIKER M >gi|224461498|gb|ACDD01000004.1| GENE 18 10243 - 10809 480 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451563|ref|ZP_05616862.1| ## NR: gi|257451563|ref|ZP_05616862.1| hypothetical protein F3_00767 [Fusobacterium sp. 3_1_5R] # 1 188 1 188 188 380 100.0 1e-104 MKDIKLSNKIMSLDVESNGLWGEPISIGFTIEENGKVIDKWEACYIDPTKEYNDWVKENV IVTLKENKNIFHCSSYDNLLRLFVLRYIHYCKDFGHTVLYHMGHIVEANLFKELVSGGYI GEWDAPYTPIEVSALLAVCGYEMDSVDKLAELNLIEKPEVSQRHQALYDAEVTARAYWFL KAQLEKKE >gi|224461498|gb|ACDD01000004.1| GENE 19 10814 - 11047 348 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451564|ref|ZP_05616863.1| ## NR: gi|257451564|ref|ZP_05616863.1| hypothetical protein F3_00772 [Fusobacterium sp. 3_1_5R] # 1 77 1 77 77 124 100.0 3e-27 MRIQTKRRGEKLEYKFINNNNQQQWLQKGHVGEVQIGDIVVNRKFGSNGFRRYDYRTFEE VIEICDDYVITKEIGGE >gi|224461498|gb|ACDD01000004.1| GENE 20 11050 - 11220 275 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451565|ref|ZP_05616864.1| ## NR: gi|257451565|ref|ZP_05616864.1| hypothetical protein F3_00777 [Fusobacterium sp. 3_1_5R] # 1 56 1 56 56 84 100.0 2e-15 MNKKNEIFLICVAIIGMVALAITLGYNLMCLKTPTIIQWILGVCGVVGFVFGYKYI >gi|224461498|gb|ACDD01000004.1| GENE 21 11230 - 11478 365 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451566|ref|ZP_05616865.1| ## NR: gi|257451566|ref|ZP_05616865.1| hypothetical protein F3_00782 [Fusobacterium sp. 3_1_5R] # 1 82 1 82 82 97 100.0 3e-19 MDLDNIEIIEEIIKDIKVLKEEVDFMQQIIRGESIPSNTDFKKAVYIVKNSFEEEQREEF LNVLLKLKEKQLNDYKEELRLM >gi|224461498|gb|ACDD01000004.1| GENE 22 11479 - 11547 167 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDYFIVGIVIVALGIVAFIDIE >gi|224461498|gb|ACDD01000004.1| GENE 23 11558 - 11752 295 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451567|ref|ZP_05616866.1| ## NR: gi|257451567|ref|ZP_05616866.1| hypothetical protein F3_00787 [Fusobacterium sp. 
3_1_5R] # 1 64 1 64 64 68 100.0 1e-10 MELRNIDKAFSLIIEYKEEKARLESLEYFEKEIQKNDSLYKKIKERMQERKESILALEYE IKAL >gi|224461498|gb|ACDD01000004.1| GENE 24 11768 - 12394 674 208 aa, chain + ## HITS:1 COG:lin0084 KEGG:ns NR:ns ## COG: lin0084 COG5377 # Protein_GI_number: 16799162 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Listeria innocua # 1 204 9 205 319 112 37.0 6e-25 MKKVAPLNNLSHEDWLKLRNLGIGGSDVGAILGYDKYKNNVQLWKHKKGIEKIAFKGNAA TERGHQSEDPIARLWEAFNLERVRLIKPEYSYCHSEYEFMRANFDYFGVVDGENCIIEIK SAEPQNMKEWKDGIPYSYYCQCLHYLAVSGLETCYLIAAIRLRMSNDIVLKEYRIKRDED EINYIIKKEVEFWESLKSDVPPFLIKNI >gi|224461498|gb|ACDD01000004.1| GENE 25 12407 - 13303 1166 298 aa, chain + ## HITS:1 COG:no KEGG:CLL_A1941 NR:ns ## KEGG: CLL_A1941 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 12 246 12 242 301 95 34.0 2e-18 MQIQTTTCVPAVLEFNFEEVKNNLQEHLEKYQDLVVTEENQKEMKSTLADLNKTAKSIDD YRKEQKKLMSEPIKKFEDKCKELSELVVKVSSPIKVQLDVFEETRMKGKEEEIKLLIQNV VDNVELNPYYANQLDILPKYLNKTAKEKDIVADLEARAAQLKLQQDAQAMKEKIQAQKTD LIQKEIEEANNKYKINLKISSFQYLMSDEYEMSDISSVIEKEARSNFERKEYYKKLEEKE EQAEAEVTQNISEKKEEVKKAVKNYNFTLQIENCSLEAAKELKEFLEERGIQHKMIQE >gi|224461498|gb|ACDD01000004.1| GENE 26 13313 - 14071 989 252 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1235 NR:ns ## KEGG: Sterm_1235 # Name: not_defined # Def: phage recombination protein Bet # Organism: S.termitidis # Pathway: not_defined # 1 244 1 236 238 253 56.0 3e-66 MEAKNNLVKKETEKGTMLFQVDGMEVKLSPALVRKYLVNGNGSITDSEVVYFMQLCKARH LNPFTKDCYLIKYSSQPATMVVAKEALERRAVKNEKYNGKKVGIYVENENGELIKKDNCI LLKSETIVGAWCEVYRKDWEYPVKIDVNFEEYIGRTKDGTPNTNWGNRPVTMITKVAKAQ ALREAFVEELSGMYDSAEVEKIIPEEVINEAETRNVNELVEAEVVTEEKPTKKEKAVEQN TSDDEIASIFGK >gi|224461498|gb|ACDD01000004.1| GENE 27 14078 - 14254 171 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451571|ref|ZP_05616870.1| ## NR: gi|257451571|ref|ZP_05616870.1| hypothetical protein F3_00807 [Fusobacterium sp. 
3_1_5R] # 1 58 1 58 58 87 100.0 3e-16 MGKSIKKYNLHNIGEEVEEVGRRLRTIASNSDNSLIDKEEIKRIVGKLKKYVKTLEVF >gi|224461498|gb|ACDD01000004.1| GENE 28 14254 - 14640 552 128 aa, chain + ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 128 1 154 154 124 49.0 6e-29 MNVITLIGRLTKDPEVKFGASGKAYCRFSLAVNRPFDKEQADFINCVSFGKTAELIGEYF KKGHQIGVSGHLQMNQFEANGEKRTSYDVIVDSFDFLNNKKEEGTVKQESKNTTKNNEVP EDEEEFPF >gi|224461498|gb|ACDD01000004.1| GENE 29 14660 - 14785 205 41 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451573|ref|ZP_05616872.1| ## NR: gi|257451573|ref|ZP_05616872.1| hypothetical protein F3_00817 [Fusobacterium sp. 3_1_5R] # 1 41 1 41 41 66 100.0 4e-10 MKKKSFWDSPDAANYIAALACLISLASIVFNVLTIIEKSQK >gi|224461498|gb|ACDD01000004.1| GENE 30 14776 - 15087 384 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451574|ref|ZP_05616873.1| ## NR: gi|257451574|ref|ZP_05616873.1| hypothetical protein F3_00822 [Fusobacterium sp. 3_1_5R] # 1 103 1 103 103 173 100.0 3e-42 MKENFSSDDFDVNSIGKSYYTQDFNGLKLLEQITQNSSILPINQEKQIKELEDINAKHSD MIEKMKVSIQDNRESSKRSLYLAILSIIIACIALAFSIYTHYF >gi|224461498|gb|ACDD01000004.1| GENE 31 15274 - 15447 190 57 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0982 NR:ns ## KEGG: Lebu_0982 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 57 1 57 57 65 64.0 7e-10 MEKRILKVSFSKTGNGMGARIPLSIPLLKKMGITQEEREVEVMYDEERQQVIIQKKK >gi|224461498|gb|ACDD01000004.1| GENE 32 15561 - 15677 198 38 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451576|ref|ZP_05616875.1| ## NR: gi|257451576|ref|ZP_05616875.1| hypothetical protein F3_00832 [Fusobacterium sp. 
3_1_5R] # 1 38 1 38 38 63 100.0 6e-09 MRMKKKKNLWEVTVCGLKFYTESMSEAIAIAWKIGGRA >gi|224461498|gb|ACDD01000004.1| GENE 33 15674 - 15919 243 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451577|ref|ZP_05616876.1| ## NR: gi|257451577|ref|ZP_05616876.1| hypothetical protein F3_00837 [Fusobacterium sp. 3_1_5R] # 1 81 1 81 81 117 100.0 2e-25 MNVKIQNEMTSLELVEQINFFREKEGKKTKLEYSNLLKVIRDEFSEEVNGVKIYGVKENY YTDKKGEKRHMFILTLSQTKQ >gi|224461498|gb|ACDD01000004.1| GENE 34 16016 - 16192 236 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451578|ref|ZP_05616877.1| ## NR: gi|257451578|ref|ZP_05616877.1| hypothetical protein F3_00842 [Fusobacterium sp. 3_1_5R] # 1 58 1 58 58 80 100.0 3e-14 MEKRKGKLLFHKSGNGKACKINIPMPWMRKMKMNEDEREVEISFDEENKKIIIEKSSK >gi|224461498|gb|ACDD01000004.1| GENE 35 16316 - 16858 593 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451579|ref|ZP_05616878.1| ## NR: gi|257451579|ref|ZP_05616878.1| hypothetical protein F3_00847 [Fusobacterium sp. 3_1_5R] # 1 180 1 180 180 272 100.0 6e-72 MNNLVSKESMTSLELLEQINLFRKEEGKKKDLRHKDLLSIIRNEFEEEITGRKISLSEYK DKTGRKLPMFILTLSQAKQVLVRESKYVRRAVIQYIEKLETSLANTQQTLPMKIDVIDFR EEEFVKGMREVKGILNSLKSQLLLDRLKLDNRLKQLEATGLTDKMKAYEAKGEFYVTQEL >gi|224461498|gb|ACDD01000004.1| GENE 36 16925 - 17146 364 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451580|ref|ZP_05616879.1| ## NR: gi|257451580|ref|ZP_05616879.1| hypothetical protein F3_00852 [Fusobacterium sp. 
3_1_5R] # 1 73 1 73 73 111 100.0 2e-23 MEDKVENKLNELLTKVKEGINTEDINKMIELHCLFIGIALVSKSMIPQNKDFQNKIEAVM ATMQQNNPLRMLQ >gi|224461498|gb|ACDD01000004.1| GENE 37 17143 - 17931 852 262 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0893 NR:ns ## KEGG: Sterm_0893 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 6 237 5 242 243 182 43.0 1e-44 MNIELKIEDDDIKLYLPNTHSIDFIKSLVKKIRENDVQVILTDRLSIDQNRLIWVLCHEY GELLGYTREEMRGILENEFCNEREIEYFSISPHKENACSMEIATEFIQFIIEHSIKMGVN LIIHEGKGPLMKRKQARDVVPDIRRFVLACLLNKTCAVCGRKNVELHHWDPIGSFGYEHD DGLQTRFISLCGEHHSEFHLIGAEEFERRYHLQGIWLSVNLVIALKKVYPWHFKGFKKEK YIKSLEVKNDERRNHKLKKDNL >gi|224461498|gb|ACDD01000004.1| GENE 38 17888 - 18100 243 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451582|ref|ZP_05616881.1| ## NR: gi|257451582|ref|ZP_05616881.1| hypothetical protein F3_00862 [Fusobacterium sp. 3_1_5R] # 1 70 1 70 70 108 100.0 1e-22 MTREEIISLKKTIFRDFTLNSEALRTSPVESYELCLIYERYGKKFKMSEIVRALNIKTTT TEMKKIPYIY >gi|224461498|gb|ACDD01000004.1| GENE 39 18113 - 18388 291 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451583|ref|ZP_05616882.1| ## NR: gi|257451583|ref|ZP_05616882.1| hypothetical protein F3_00867 [Fusobacterium sp. 3_1_5R] # 1 91 1 91 91 104 100.0 2e-21 MRNRKKREEELFGLDLFSRDFVGGYSHKKKKAWIQKKGFRSTIILYEIEEEEILSLFSEQ HKFYPFLFEKEVEGYERFNMGKGKRYRKVIS >gi|224461498|gb|ACDD01000004.1| GENE 40 18385 - 19074 986 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451584|ref|ZP_05616883.1| ## NR: gi|257451584|ref|ZP_05616883.1| hypothetical protein F3_00872 [Fusobacterium sp. 
3_1_5R] # 1 229 1 229 229 353 100.0 4e-96 MRRAKPKEKREIKINEKKEIPVIRDIQIPKELENAFIFTACMLEMKACLEEHREIWKKEF KNGGYLKANYKIALGRMINLCDMLSKKYTENGIKMSDFARIKIIKMIESIHMEKNIRDKT RKETRVMSYEDEYFPEYKEFLNTILAAWTYSMILFTKLHHFIRKEKIEKEAKELSDKVIL FMTMIDEEIVLSEHEIVTKAERKKYLKPVSEKKLQIIHTKLKEKGYLKD >gi|224461498|gb|ACDD01000004.1| GENE 41 19119 - 19325 239 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451585|ref|ZP_05616884.1| ## NR: gi|257451585|ref|ZP_05616884.1| hypothetical protein F3_00877 [Fusobacterium sp. 3_1_5R] # 1 68 1 68 68 104 100.0 2e-21 MKNYEYVCEFPRKEKQKAIEHCKFLNSKDVGKFEIREYTQSRLIATKETKPTTYIVVRRL EDENKDKV >gi|224461498|gb|ACDD01000004.1| GENE 42 19303 - 19653 341 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451586|ref|ZP_05616885.1| ## NR: gi|257451586|ref|ZP_05616885.1| hypothetical protein F3_00882 [Fusobacterium sp. 3_1_5R] # 1 116 1 116 116 182 100.0 6e-45 MKTKTKFRKIATFNNWKEAQEMCLKKNRCQQKPKYSILKLKKNKYVVIEWLLPTEKLEEE NRPFEKRSKKGKVLQNLHLIGEKRKNGITWNVIAEELELSVCTCQRYYRKFREGEF >gi|224461498|gb|ACDD01000004.1| GENE 43 19662 - 20135 404 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451587|ref|ZP_05616886.1| ## NR: gi|257451587|ref|ZP_05616886.1| hypothetical protein F3_00887 [Fusobacterium sp. 
3_1_5R] # 1 157 1 157 157 280 100.0 2e-74 MEKETVLEIEFMPVWDKWAWRITKQNENVLERGVFKDDEIRVMSSSGPCLCLDNFLYIKG MDSSHDDDCFVCTNKEKKIIKEKVKAINEKYEKPKRWRAEYGGEYFFVTSHGKVRAAIDY KYDYDTNRYNFGNYFKTKEQAKKALGRMKEVLKEVTE >gi|224461498|gb|ACDD01000004.1| GENE 44 20132 - 20530 406 132 aa, chain + ## HITS:1 COG:no KEGG:CKR_1756 NR:ns ## KEGG: CKR_1756 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 2 127 3 131 133 69 39.0 3e-11 MREIKFRAWDGLTKKITNYQICDDMVLFFDKHKGCWLKSQKDRFVLMQSTGIKDKNGVEI YEGDIVKVPHFLHDERIKINGVVKYVNNRAEFVIDLEDIEETFYCCNQKDRIEVVGNIYE NPELLEENNVEM >gi|224461498|gb|ACDD01000004.1| GENE 45 20517 - 20729 264 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451589|ref|ZP_05616888.1| ## NR: gi|257451589|ref|ZP_05616888.1| hypothetical protein F3_00897 [Fusobacterium sp. 3_1_5R] # 1 70 1 70 70 117 100.0 2e-25 MWKCKCCGSTDFIERVVGGYEKYSGYDKNGYPKELEETDYETIIECNNCGNYKDGSDDIK NIADWIEEEK >gi|224461498|gb|ACDD01000004.1| GENE 46 20729 - 20941 206 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451590|ref|ZP_05616889.1| ## NR: gi|257451590|ref|ZP_05616889.1| hypothetical protein F3_00902 [Fusobacterium sp. 3_1_5R] # 1 70 1 70 70 123 100.0 4e-27 MWKCKKCGNDTFYQPTRSRFIIYSADKNENIIGCDDDCNVEYGKFYCENCKKTGWRLSDV AKWVEEEENE >gi|224461498|gb|ACDD01000004.1| GENE 47 20934 - 21143 330 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451591|ref|ZP_05616890.1| ## NR: gi|257451591|ref|ZP_05616890.1| hypothetical protein F3_00907 [Fusobacterium sp. 3_1_5R] # 1 69 1 69 69 92 100.0 1e-17 MSKYKIKFHVSTGYVSSKIEETIDLLKDYGYSEEEAKEIIKNEDKIEAIFEEWVWETLDS GWEVLEVEE >gi|224461498|gb|ACDD01000004.1| GENE 48 21140 - 21439 393 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451592|ref|ZP_05616891.1| ## NR: gi|257451592|ref|ZP_05616891.1| hypothetical protein F3_00912 [Fusobacterium sp. 
3_1_5R] # 1 99 1 99 99 187 100.0 2e-46 MKQSKLFESTSHYKFMRERAFESDLGCNTTLIGFIWELNDQIAERWRDVSMLLAINNPKN NIKMTKGLTKIANDLRKIADELESYTKEKEPECVEGVIE >gi|224461498|gb|ACDD01000004.1| GENE 49 21436 - 21726 383 96 aa, chain + ## HITS:1 COG:no KEGG:Mvol_0505 NR:ns ## KEGG: Mvol_0505 # Name: not_defined # Def: hypothetical protein # Organism: M.voltae # Pathway: not_defined # 5 88 6 93 93 61 45.0 1e-08 MKQVITFKSYPEFYEKEKSGLKCNTVRLFTLSDDREYILQDIMNEEIKKEDVILRIMNSD TGETFEREISDVSKFEVNNVEIYIISWRHEDDEVKA >gi|224461498|gb|ACDD01000004.1| GENE 50 21707 - 21958 214 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451594|ref|ZP_05616893.1| ## NR: gi|257451594|ref|ZP_05616893.1| hypothetical protein F3_00922 [Fusobacterium sp. 3_1_5R] # 1 83 1 83 83 119 100.0 6e-26 MMKLKLRKLRKKREFTAHWKGGYSKCYGYGTFRPLIIRRRGENIKIKNFELMGFITQMYF LKSLKKYYKSEVGRIDDKTYKKQ >gi|224461498|gb|ACDD01000004.1| GENE 51 21933 - 22139 277 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451595|ref|ZP_05616894.1| ## NR: gi|257451595|ref|ZP_05616894.1| hypothetical protein F3_00927 [Fusobacterium sp. 3_1_5R] # 1 68 1 68 68 118 100.0 1e-25 MIKLIRNSEIEQHKTRYKFYNNRCNGCNKVGDVNILEVRADESSGGTVIVLCDECLKKLG KEIDEKIR >gi|224461498|gb|ACDD01000004.1| GENE 52 22136 - 22411 336 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451596|ref|ZP_05616895.1| ## NR: gi|257451596|ref|ZP_05616895.1| hypothetical protein F3_00932 [Fusobacterium sp. 3_1_5R] # 1 91 1 91 91 167 100.0 3e-40 MKEIYMSQIKKIFEMYMDIFEQEDSDHPDYRVIRPKERAIDRTLDKITDNKEEQRKIYYA IIGHLFNRGDGTYRPICNDLRALGYTVIEGR >gi|224461498|gb|ACDD01000004.1| GENE 53 22415 - 22534 257 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTVEEVKHKIDEFVGDDSSQRLDKLQSQIKLLKWVLGE >gi|224461498|gb|ACDD01000004.1| GENE 54 22539 - 23198 712 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451598|ref|ZP_05616897.1| ## NR: gi|257451598|ref|ZP_05616897.1| hypothetical protein F3_00942 [Fusobacterium sp. 
3_1_5R] # 1 219 1 219 219 378 100.0 1e-103 MKYTQNIEVEALQFTEDNIDEILDFICDGEPFEMCFVEDRETTKLDIIKKQKLYIEHPVG MITAYFGNYLVKISKNIFQVWSKEEFEKFHKIKLTDVKENKIKWAFSWNGENYYGGFDTR EEAIEEARKTDKSAKSVFVGIEVPYKEKCKNIVEIVTDSLNAGAYEEMEELAEDYMLYFR EGEKKILEDRLRETILIFQKEFGYEPSFFYVKEAEFVEL >gi|224461498|gb|ACDD01000004.1| GENE 55 23242 - 23787 557 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451599|ref|ZP_05616898.1| ## NR: gi|257451599|ref|ZP_05616898.1| hypothetical protein F3_00947 [Fusobacterium sp. 3_1_5R] # 1 181 1 181 181 382 100.0 1e-105 MDIVDIYKECGDFHEAVRRSGLPPLIAHIKLMKNGVLKIQDKIKYGTRANALGGKAEELF QKLVPEAVDANKYWKKNNPIYDFVYKGLTIDVKYSSIRPHTTTICWCARVSGEQDVTVIF LERERGTELKDPYILLIPRGFVNVKNTLHISCSGSWFSDFQIEIEELKDVLEQYVEAMNG V >gi|224461498|gb|ACDD01000004.1| GENE 56 23777 - 24763 771 328 aa, chain + ## HITS:1 COG:XF2728 KEGG:ns NR:ns ## COG: XF2728 COG0286 # Protein_GI_number: 15839317 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Xylella fastidiosa 9a5c # 14 263 203 502 525 169 33.0 7e-42 MGYDIKSIKKEFASKGIFYTPPELAEFLKSFVDIDTDEVYDPTCGHGSLLSVFGDEVKKY GQDINPVAIEYIKENFPHFHVELGDTIQEDKFSEKKFKVILANPPFSVKYEPNEEMLLDK RFKDCGILSPASKADYMFNLHILHKLKENGIAVVMNFPGILYRKNKEGEIRKWLIENNYI DTIVHIAGKKFEDTNIATCLIIYRKNKVTTDIKFVDSEFNLERMVSLEEIRENNYNLSIS TYVQKEEQKEEINIDDVNEEANRHFLIHLEETLKVNLFLVQSLEAKIDYEMFLNKIGGLV RKYRSKFKNREKEEIPKQWEEQLLLFDK >gi|224461498|gb|ACDD01000004.1| GENE 57 24809 - 25111 290 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451601|ref|ZP_05616900.1| ## NR: gi|257451601|ref|ZP_05616900.1| hypothetical protein F3_00957 [Fusobacterium sp. 3_1_5R] # 1 100 1 100 100 146 100.0 4e-34 MIREVRRNERLKAILFTKDNVGEVVKFLNYGKKRKLSKKQIEQYIKNGIRFKSYVSEDGN YIGDYGYRVYVDEWIVMGSNRYFETYTKKEFEEEFEIVDY >gi|224461498|gb|ACDD01000004.1| GENE 58 25123 - 25401 241 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451602|ref|ZP_05616901.1| ## NR: gi|257451602|ref|ZP_05616901.1| hypothetical protein F3_00962 [Fusobacterium sp. 
3_1_5R] # 1 92 1 92 92 185 100.0 8e-46 MIEKFKTKVGWTVYKIPAIKTTLWGGFGICDHCNSSTDFGYLIPVLNHYCCEKCFRDWEK TARFYPEDLEVENKRIKYYENILNIIEKRGKE >gi|224461498|gb|ACDD01000004.1| GENE 59 25398 - 25631 265 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451603|ref|ZP_05616902.1| ## NR: gi|257451603|ref|ZP_05616902.1| hypothetical protein F3_00967 [Fusobacterium sp. 3_1_5R] # 1 77 1 77 77 125 100.0 7e-28 MNFIKTWEKDRDEALKSLDMEKIRKYCLKYKIPMPKDEIVILAGAHKARLHMLTTTEEEK EISKKWLKDHNFKETIF >gi|224461498|gb|ACDD01000004.1| GENE 60 25653 - 26483 927 276 aa, chain + ## HITS:1 COG:no KEGG:Blon_1319 NR:ns ## KEGG: Blon_1319 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 29 259 167 423 450 84 26.0 5e-15 MENLIDIENIKEQTFKQFYAEITEKEKNNPNVYSHWISKIDKSKLAMNIGDYQTIHLPED ILELLLTDGNLTEAEENKINTFMLDSLKENTPLPLFMKNGLFSNKFQFDNCIITHHCQFA KKLKQIYYAGMCYGVDFTTEVVLREVIPISTSHYIYDNMPLNLEYRFFVDIDKKEIFDVL DYWHPSIKERLSEKQKISFDFGKKEQGSKFEKYKDRIKEELQNSLKNNTFNLTGIWSIDI LVDYYDNFWLIDMAIAKNSWGVHLLSEENQKRIFGH >gi|224461498|gb|ACDD01000004.1| GENE 61 26523 - 28097 1340 524 aa, chain + ## HITS:1 COG:no KEGG:GFO_2470 NR:ns ## KEGG: GFO_2470 # Name: not_defined # Def: DNA-methyltransferase # Organism: G.forsetii # Pathway: not_defined # 1 426 2 409 515 152 30.0 3e-35 MFNKDFYPTPCNLAEKMISKIDMKDYYFSILEPSAGKGDILDVLLDKNKYSRKNLSIDCI EIQPELRSILKEKKYRVVYNDFLSFYSQKKYDIIIMNPPFNNGDKHLLKAISIIKKYGGQ IVCLLNAETIKNPYTIYRKDLLRQLEENNAEIQFLQDEFKNAERSTNVEIALIYLNIPEK IEESDLLKNLKKAEKEKEYCFETKQVTHNDFLKNIIEHYNFEVNAGVEFLKEYERIKPFF KSDFDVKYERYSISVKVGENGYESDKINDFISLVRKKYWKQLISSDKFRQLFTSNILDEL WNKIEELKDYDFNEYNIMEIQKELNRNLVKSVEETILSLFDKLSYQHSYRDDYGKNIHYY NGWKTNKAYKINKKVIMPLNAWSSWHNRMETYHIIGKLLDIEKVLNYLNYKEIESIDLNK RIEQCFSQGITKNIECKYFTLTFYKKGTCHLVFKDDDLLLKFNIFGSQKKGWLPHGYGKK NYEEMEQGEKVVIDSFQGKTEYEKVIGDKKNYIIDVNRSLLELK >gi|224461498|gb|ACDD01000004.1| GENE 62 28115 - 28750 352 211 aa, chain + ## HITS:1 COG:SPy0676 KEGG:ns NR:ns ## COG: SPy0676 COG0286 # Protein_GI_number: 15674741 # 
Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pyogenes M1 GAS # 6 181 6 178 211 143 40.0 2e-34 MDINKQIDQLIGVTESYKAPEKLLEIVLDYHRLKKVTLEMLKAHNYKMDYDWFHEYFQSE HADRKNNKQDFTPNSIGNLLIRLNRKTEGIIYEPACGTGGIIIQNWNVAREQYGILHFNP NDRLYICEELTDRTIPFLLFNLALRGVNAIVVQCDALTRKAKTFYHVFNENNDFMEISEV SKIEPFHAKKVLGEFGLVYYLDEEKEGEKIE >gi|224461498|gb|ACDD01000004.1| GENE 63 28743 - 29783 910 346 aa, chain + ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 59 345 42 320 321 166 34.0 7e-41 MNKVEEKEMVLNDFIFSVKSKLDLPEHILLVLRSQFSLALSKYNLVKKGKYEVVVSENTN EFLWKKFFIGKRIENLSENSLKVYKDNLERFDSVIKKNFLDVTTEDVRFYIATVISKNSS VKETYLDNIRRFLKTFFQFLVDEGLIKTNPVTRIKPIRGNTSEKLPFSNMEIEKLRNACS SCLEKAVIEILISTAFRRQELAEIKLSDIDFEKGSIRTIGKGKKLGYGFLNAKALLALIE YIEKERESKSVFLFIRGKNKRFREKAGEPYDASSLYALIKKIGKKAGVTHVHPHRFRRTA ATKALDRGLGLEDVKELLRHEDIKTTLIYTTLNKDRLKDKHNRFLD >gi|224461498|gb|ACDD01000004.1| GENE 64 29845 - 30090 469 81 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1597 NR:ns ## KEGG: Pjdr2_1597 # Name: not_defined # Def: phage protein # Organism: Paenibacillus # Pathway: not_defined # 1 80 1 76 76 73 59.0 2e-12 MKNTLADLNNHLFAQLERLSDEDLTDEQLSQEIKRASAISKIATSITGNASIILEAQKLK TKRYELEGEEATDIPKMLEGQ >gi|224461498|gb|ACDD01000004.1| GENE 65 30090 - 30686 563 198 aa, chain + ## HITS:1 COG:no KEGG:BC1875 NR:ns ## KEGG: BC1875 # Name: not_defined # Def: phage protein # Organism: B.cereus # Pathway: not_defined # 1 197 1 193 195 156 42.0 4e-37 MAHKYTEEERQFIIENAKGITTFQLTELFNIKFQSDLKRSQIRAFLKNNGIRNEICSQFT KGCIPYNKGTKSICKANRCSFKKGHIPVNYKPVGYERVDEDGYVLIKVQDKGTWPERWRP KGSVMWEKYHKKKVPKGYVVIFADQNKLNFAKENLIVISRNELLVMNRKKRFSENSDITK VNINISKLDIKIMEKRKK >gi|224461498|gb|ACDD01000004.1| GENE 66 30756 - 30869 185 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIVKNDKISQDEMIVFAKKILETLHREEEENESSNLC 
>gi|224461498|gb|ACDD01000004.1| GENE 67 30847 - 32208 1220 453 aa, chain + ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 1 440 6 514 559 84 22.0 5e-16 MRVVIYVRESTNFQNPEAQLAECLEYCKQNNMEVVKTYKDIASGAKSNRKQFLKMQEDME DDLFDGIVLWELSRSTRDFITYKMMINRMFELGKELYSLQEGRLTEDNIDMEFSTDMRAM LNSYERKRVGRRIKFRKSYSRKQGLWTGGQAPLGYKIVNTQLQIDENTVNITKNIFSRFC NGEGVTQIANSYGFDYKKVRRMLQNPIYIGYIKDNQTEMIRDKRVVHKDYKIIKGIHQSI ISDELFQKAAILLKIRIRKQYNKGSYLLSSIFDYHGNRLYPCLNGKQPYYQNSKKNTKAY NLHKLENEVINELIANIEKMCLLNSVEQSEEAEIKVSKEVYLKKELEKLKDQFNKLLKQY LNGIVSEEIFETISKEIKEQEKSLEQEIQTIQIQHEQRKKRKLKEENVESLKKYLEKLQN TKDRIKLKEILDLIIYEVRMINDFRFYIVTNLY >gi|224461498|gb|ACDD01000004.1| GENE 68 32228 - 32800 498 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451612|ref|ZP_05616911.1| ## NR: gi|257451612|ref|ZP_05616911.1| hypothetical protein F3_01012 [Fusobacterium sp. 
3_1_5R] # 1 190 1 190 190 337 100.0 2e-91 MIIYDKAIRVFTRKQLREMLPILSGRVFLREKKINLEISVRIPYKKIGYTVEDMLKDYPS VKKYSELKLFYNIHASGYNLNSLTKKYNLVEGALGRILESKVSFEGNAKNFHYILDYSDK VKEFIWDNYEIIPYKDHTEIFSTVENLKEFKEQFDIEREILLEPFEKKYHIAFGGNLSIF LNRKIKNAEN >gi|224461498|gb|ACDD01000004.1| GENE 69 32823 - 33623 1045 266 aa, chain - ## HITS:1 COG:FN1433 KEGG:ns NR:ns ## COG: FN1433 COG4221 # Protein_GI_number: 19704765 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Fusobacterium nucleatum # 4 259 3 258 260 293 57.0 3e-79 MNCENRLFGKIAFITGATSGIGKSTAIAFAKEGVNLILTARRENLLLELKTFLEKEYHIQ VFTLRLDVRNAEDVKRSIEMLPLSWRNIEILVNNAGLALGLDKEYLNSSDDIDTVIDTNV KGMLYVTNAIIPLMLSHKKASIIVNLGSVAGDSAYAGGAVYCASKAAIKILSDGLRIDLV DTPIKITNIKPGIVETNFSNVRFKGDEERARKVYTGIQSLTPEDIADTIVYICNLPDNVQ IPEITMTPMHQADGLHIYKNVNLNTF >gi|224461498|gb|ACDD01000004.1| GENE 70 33887 - 34069 218 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451614|ref|ZP_05616913.1| ## NR: gi|257451614|ref|ZP_05616913.1| hypothetical protein F3_01022 [Fusobacterium sp. 
3_1_5R] # 1 60 1 60 60 68 100.0 1e-10 MEGKKSKMGRPTDSKKNLMLRIRLDEETYKKLEKLSKIENVSMSEFVRNFIKVQYKKKFK >gi|224461498|gb|ACDD01000004.1| GENE 71 34156 - 34269 142 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYDKWWEIAFKIFVVVKTIEYLYKLYKWIKNKKNKDS >gi|224461498|gb|ACDD01000004.1| GENE 72 34334 - 34426 124 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNAFLGDNHNDMCHNRVHYKNCKVADQSD >gi|224461498|gb|ACDD01000004.1| GENE 73 34401 - 34472 60 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWLHILLYIQEGRNVNEKCILGR >gi|224461498|gb|ACDD01000004.1| GENE 74 34489 - 34596 139 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLILATGGVYLFLRNKYKKNKLKDLYKQAKKDLKK >gi|224461498|gb|ACDD01000004.1| GENE 75 34755 - 34862 93 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFFESKPFFYLGERENLFLGMRKSLIELLGFGRI >gi|224461498|gb|ACDD01000004.1| GENE 76 34955 - 35230 384 91 aa, chain - ## HITS:1 COG:FN0818 KEGG:ns NR:ns ## COG: FN0818 COG0776 # Protein_GI_number: 19704153 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Fusobacterium nucleatum # 1 91 1 91 91 117 67.0 6e-27 MTKKEFAKVLFDNGVYSSKAEAERNIETIFSLMEECIINDGSFSITNWGKLEVVERAPRL GRNPKTGEEVKIPSRKSIKFRPGKAFLEKLN >gi|224461498|gb|ACDD01000004.1| GENE 77 35396 - 35545 178 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451618|ref|ZP_05616917.1| ## NR: gi|257451618|ref|ZP_05616917.1| hypothetical protein F3_01042 [Fusobacterium sp. 
3_1_5R] # 1 49 1 49 49 76 100.0 6e-13 MKLAKILLFPLVIIGLTLKTYDKAYEYIQYKFRIRKKWKHSDKAEDWWI >gi|224461498|gb|ACDD01000004.1| GENE 78 35457 - 35669 131 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKLTNIFNINLEYVKNGNIQIKPKIGGYKMEKKWEEKMEKQMKKALKKMPWEIKKIHIL LTFWKWWKGR >gi|224461498|gb|ACDD01000004.1| GENE 79 35671 - 35982 342 103 aa, chain + ## HITS:1 COG:no KEGG:FN0165 NR:ns ## KEGG: FN0165 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 20 94 1 75 75 89 72.0 4e-17 MISEKLKKKVKTINEEFKKLGFDLETDLEELCEEREDIAERLENTKFKKMTFSKDEEENC YILTLEDCQIGFFVILGEDEEGPWYEAEAEIIFFLKDYYSFYK >gi|224461498|gb|ACDD01000004.1| GENE 80 36505 - 36888 662 127 aa, chain + ## HITS:1 COG:no KEGG:FN1869 NR:ns ## KEGG: FN1869 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 127 1 127 128 233 91.0 1e-60 MKSVIRLRMSSHDAHYGGNLVDGARMLQLFGDVATELLIQMDGDEGLFKAYDNIEFMAPV FAGDFIEAVGEIVSAGNSSRKMVFEARKVIVPRPDISDSAADVLEEPIVVCRASGTCVTP KDKQRKK >gi|224461498|gb|ACDD01000004.1| GENE 81 36911 - 37726 1237 271 aa, chain + ## HITS:1 COG:FN1868 KEGG:ns NR:ns ## COG: FN1868 COG3246 # Protein_GI_number: 19705173 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 271 2 272 272 496 88.0 1e-140 MEKLIITAAICGAEVTKENNPAVPYTVEEIVREAESAYKAGASIIHLHVRYDDGTPTQDK ARFKECMDAIREKCPDVIIQPSTGGAVGMTDLERLQPTELGPEMATLDCGTCNFGGDEVF TNTDNTIKNFGKIMIERGVKPEIEVFDKGMVDYAIRYAKQGYIKYPMHFDFVLGVQMAAT ARDLVFISESIPEGSTWTVAGVGRNQFPMAALAIVMGGHVRVGFEDNVFIDKGVLAKSNG ELVERVVRMAKELGREIATPAEARRILGLTK >gi|224461498|gb|ACDD01000004.1| GENE 82 37752 - 38789 1572 345 aa, chain + ## HITS:1 COG:no KEGG:FN1867 NR:ns ## KEGG: FN1867 # Name: not_defined # Def: Zn-dependent alcohol dehydrogenase and related dehydrogenase # Organism: F.nucleatum # Pathway: not_defined # 1 345 1 345 345 569 89.0 1e-161 MKKGCKYGTHRVIEPLGVLPQPAKKISNDMELYSNEILIDVIALNIDSASFTQIEEEAHG 
DVEKIKAKILEIVGEKGKMQNPVTGSGGMLIGTIEKIGEDLVGVTPLKVGDKIATLVSLS LTPLKIEEITAIHPEIDRVEIKGKAILFESGIYAVLPEDMPENLALAALDVAGAPAQIAK LVKPCQSVAILGSAGKSGMLCAYEAVKRVGPTGNVIGVVRNEKEKALLERVSSKVKVVIA DATKPIDVLNAVLAANDGKEVDVAVNCVNVANTEMSTILPVKDYGIAYFFSMATGFTKAA LGAEGVGKDITMIVGNGYTHDHAAITLEELRESAVLREIFNELYL >gi|224461498|gb|ACDD01000004.1| GENE 83 38854 - 40113 1549 419 aa, chain + ## HITS:1 COG:FN1866 KEGG:ns NR:ns ## COG: FN1866 COG1509 # Protein_GI_number: 19705171 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Fusobacterium nucleatum # 1 417 1 424 425 738 83.0 0 MNTVNTRAKFFPNVTDEQWNDWKWQVRNRIETLDDLKQFANLSDEESEGVVKTLETLRMA ITPYYFSLIDLDDPNCPVRKQAIPTIQEIHQSKADLLDPLHEDADSPCPGLTHRYPDRVL LLITDMCSMYCRHCTRRRFAGQSDDSMPMERIDRCIEYIAKTPEVRDVLLSGGDALLVSD EFLESIIQKLRAIPHVEIIRIGSRTPVVLPQRITPELCNMLKKYHPIWLNTHFNHPKEVT PEAKKACEMLANAGVPLGNQSVLLRGVNDSVPVMKKLMHELVMMRVRPYYIYQCDLSMGL EHFRTPVSKGIEIIEGLRGHTSGYAVPTFVVDAPGGGGKTPVMPQYVISQSPHKVILRNF EGVITTYTEPDHYEEGFPGDYESTGVSMLLGGQQMALEPTQLLRHDRYAKRLEEEAKNK >gi|224461498|gb|ACDD01000004.1| GENE 84 40138 - 41220 1043 360 aa, chain + ## HITS:1 COG:no KEGG:FN1865 NR:ns ## KEGG: FN1865 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 19 335 1 317 320 350 58.0 4e-95 MINTYTFLKEHKRISIIGMEKNVGKTTLLNQLILDIADQKILALTSIGRDGEEVDVVTST HKPKIFVYPGTIVATARDCLANCDITKEILYTTDFTTPMGNIVVVRAITGGYVDIAGPSY NKQVKEILNIMESFGAEISIVDGALGRKSSAIGEVTDATVLATGAAFSLDMSKVIEETKK TTILLNLPDFPVEKKEIETWMSKARVVIQKKTGDVIFLKAISTMDSVQEIKEHLNQDLEN VFVRGAITSRFLDVFIKNRGSFDKINLIAIDGTRFFISYQEYQKALACNISFYVINTIHL LFVSCNPHSPLGVDFPKKEFQNKLQQEILCHVIDVKEGEKCDLSMKEVSKESVLTDSCHG >gi|224461498|gb|ACDD01000004.1| GENE 85 41157 - 42626 1683 489 aa, chain + ## HITS:1 COG:FN1864 KEGG:ns NR:ns ## COG: FN1864 COG1193 # Protein_GI_number: 19705169 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 489 1 486 487 478 56.0 1e-134 
MRFIDERSLERIGFNRLLSRVEVLSSYGEEKLKTLTNFISGEEEKLEQNFEEIEQFINFS EKGDRKSFLLTLESCIHRMKNIKKLIQMVENGNILDEVELFEVKVQAIYMEKLQECLQEL PKELQRFSLKPLSKILEALDPQSDRNPTFYLYESYSRQLTGLREQRKKVEKQIYATRDYE TIVKLKEERLTFLVEEEQEEYRIRTKLSQIILEEAAIYLENIEKIGNLDFLMAKAKFAKK YSAHRPVISRDSSLKIQKAVNLELKEMLESKGKQYTPIDIEIGAGVTIITGANMGGKSVA LKTITENLLLFHMGFFVIAEEASLPLVDFVFFISDDMQDISKGLSTFGAEIMKLREVNIF LELGKGFVVFDEFARGTNPKEGQKFVRALAKFLNGKPTISLITTHFDGVVDATMNHYQVV GLKNIDFDLLKNRIALSNKSMELIQECMDFRLEKASMEEVPKDALNIAKLIGLDEKFNEV ISREYHKED >gi|224461498|gb|ACDD01000004.1| GENE 86 42633 - 44195 2164 520 aa, chain + ## HITS:1 COG:no KEGG:FN1863 NR:ns ## KEGG: FN1863 # Name: not_defined # Def: L-beta-lysine 5,6-aminomutase alpha subunit (EC:5.4.3.3) # Organism: F.nucleatum # Pathway: Lysine degradation [PATH:fnu00310] # 5 520 3 518 518 901 84.0 0 MSNNKLDLNWDLVVEARESAKKIVADSQVFIDSHSTVTVERTICRLLGIDDVDAFGVPLP NAIVDFVKENGNITLGIAKYIGNAMLETGLSPQEIAEKVAKKELDICKMKWHDDFDIQLE INRIAVQTVERIRKNRETRESMIAGYGGDKTGPFLYIIVATGNIYEDVVQAVAGARQGAD IIAVIRTTGQSLLDYVPYGATSEGFGGTFATQENFRIMRNALDEVGKELGRYIRLCNYCS GLCMPEIAAMGALERLDVMLNDALYGILFRDINMQRTLCDQFFSRVINGFAGVIINTGED NYLTTADAFEEAHTVLASQFINEQYALVAGLPEEQMGLGHAFEMDPKLENGFLYELAQAE MAREIFPKAPLKYMPPTKFMTGNIFKGHIQDALFNIITITTNQRLCLLGMLTEAIHTPFL ADRALSIENALYIFNNLKDFGNDIEFKKGGIMNTRAEEVLEKAASLLKEIEGYGIFTTIE KGIFGGVKRPKDGGKGLAGVFEKDSTYFNPFIPLMLGGDK >gi|224461498|gb|ACDD01000004.1| GENE 87 44195 - 44992 1367 265 aa, chain + ## HITS:1 COG:FN1862 KEGG:ns NR:ns ## COG: FN1862 COG5012 # Protein_GI_number: 19705167 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Fusobacterium nucleatum # 1 263 1 263 263 458 87.0 1e-129 MSSGLYSMEKRDFDTTLDLTKIKPYGDTMNDGKVQMSFTLPVPCNEKGVEAALLLAKQMG FVAPAVAFSGSLDKQFSYYVVYGATSYTVDYTAIKVQALEINTMDMHECEKYIEEHFDRD VVIVGASTGTDAHTVGIDAIMNMKGYAGHYGLERYKGIEAYNLGSQIPNEEFIQKAIELK ADVLLVSQTVTQKDVHIQNMTNLVELLEAEGLRDKVILIAGGARITNDLAKELGYDAGFG PGKYADDVATYALEEMVARGMAKKK >gi|224461498|gb|ACDD01000004.1| GENE 88 45151 - 46305 1278 
384 aa, chain + ## HITS:1 COG:no KEGG:FN0336 NR:ns ## KEGG: FN0336 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 148 384 2 240 240 295 59.0 2e-78 MKRLVFVALLIFLSVTSVVKAEDWTKVSIYDNKIPSSIKMNLKYRGEHPETVDYVFVSSR TANIRDYPGTEGNIIEKYSYNDKLPLLEKIYVKGNYWYKVRTLKGNEGYIAASVSKKRNF RFDMALDKIKSLEHFLLTEKAAGRKIAAVNSYAPNPNHLDLQKNKDKYGTSADQNTAGKN ATGETVYIPDRSLVSIHNSGAGTSTVKALSVPELLTISNRNISYANIPSANFNKVVAIDS KNQNFIVFEKNGGEWEVISYVYSKTGMDSKLGFETPKGFFSTAMGKYVMPYNDENGQKQG AAKYALRFCGGGYIHGTPINDVEEVNREFFMKQKEFTLGTYSGTRKCIRTSEPHAKFLFD WVIKRPNRSANAQNLSENLVVIVF >gi|224461498|gb|ACDD01000004.1| GENE 89 46318 - 46887 199 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 84 183 244 343 347 81 39 1e-14 MGNGGIMKKYLGITLLLASFVFVACGKTSNTSIRDLSTEGNQNFAIEDIDAAKKPLEDII VFNQDGVTIRREGNNLILSMPELILFDFNKYEVKNGIKPSLRTLANALGANADIKIKIDG YTDFIGSEGYNLELSVNRAKAIKSYLVAQGAIENNISIEGYGKQNPVASNDTESGRARNR RVEFIISRS >gi|224461498|gb|ACDD01000004.1| GENE 90 46960 - 48198 1542 412 aa, chain + ## HITS:1 COG:FN0334 KEGG:ns NR:ns ## COG: FN0334 COG1448 # Protein_GI_number: 19703677 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Fusobacterium nucleatum # 1 411 1 414 415 421 50.0 1e-117 MLAKHYQGKKLNDEVFATAQRAKAAIDKYGKEAVFNATLGSLYDEEENLVVFDVVRQMFR ELPLTEFTAYAPHFTGSAGYKESVKRSVLGEHYEKEYPNYYFSVIGTPGGTGALSNTIKN YLNYGDKVLLPKRMWGPYKAMAKEAGGSFDCYELFDEEGKFHLASFEEKVNLLSEQQENL IVIINDPCQNPTGFKLSREEWLSVMKILKKASNKANIILLKDIAYQDFDTLEYREENILS DLPENILVVYAFSLSKALGIYGMRAGAQLAVSSKKEWMEEFDTSATFSCRATWSNASRGG MEMFVKIMETPHLKRKLLEEQQKYRDLLLERADIFLREAEECGLEVLPYKSGFFLSVPIG EKVRELITELEKQNIFTIIFDDAIRIAICGIPKRKLKGLAKKIKDTIENIRG >gi|224461498|gb|ACDD01000004.1| GENE 91 48206 - 49051 1212 281 aa, chain + ## HITS:1 COG:CAC3091 KEGG:ns NR:ns ## COG: CAC3091 COG1951 # Protein_GI_number: 15896342 # Func_class: C Energy production and conversion # Function: Tartrate 
dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Clostridium acetobutylicum # 1 277 3 278 282 308 55.0 1e-83 MKKLDLCMVTNEVEKMCMAANYYVDPKVLQKIETAYCSTEKSPLAKNVLEQILENDKIAE KEQVPMCQDTGMAVIFVEIGTEVYIPGDIYEAIQEGVRRGYTNGYLRKSMVKHPLDRVNT KDNTPAIIHTKMIAGSDQVKIILAPKGGGSENMSLVKMLKPADGIEGVKKLVLELISNAG GNPCPPITVGVGIGGSFEKAALLAKEALLRDTNDRSSDPIAASLEEELLEKINKLGIGPL GLGGKTTALAVKVNVFPCHIACLPVAINLNCHAVRHQEVIL >gi|224461498|gb|ACDD01000004.1| GENE 92 49064 - 49624 944 186 aa, chain + ## HITS:1 COG:CAC3090 KEGG:ns NR:ns ## COG: CAC3090 COG1838 # Protein_GI_number: 15896341 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Clostridium acetobutylicum # 1 183 1 183 187 219 62.0 3e-57 MEYTVNTPLTKEVIETLKIGDVVKITGTIYTARDAAHARLVKLIEEGKELPFSLEGQIIY YVGPTPAKPGYAIGSAGPTTSYRMDPYAPILMKHGLKGMIGKGGRSQEVRDSIQKEKAIY FAAVGGAAALIAKSIQKAELIAYEDLGAEAIRKLEVKDFPAIVVNDIYGGDLYEEGRKQY MEEISL >gi|224461498|gb|ACDD01000004.1| GENE 93 49621 - 50181 821 186 aa, chain + ## HITS:1 COG:FN0333 KEGG:ns NR:ns ## COG: FN0333 COG1954 # Protein_GI_number: 19703676 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Fusobacterium nucleatum # 1 184 1 184 186 197 53.0 9e-51 MTIEEMLALSPVIPAIKNDVSLDEAISSDSEIIFVIMANLLNIERVVTSLKEAGKKVFIH VDMIDGLSSSNYGVEYIVEKIQPFGIITTKHNIVSFALKMKIPVIQRFFILDSFSFEKTL SHIQENKPMAVEVLPGLMPKILHSLASKIDRPLITGGLISSKEDIVSALSAGACAVSTTD TKLWNI >gi|224461498|gb|ACDD01000004.1| GENE 94 50247 - 51233 1524 328 aa, chain + ## HITS:1 COG:FN1840 KEGG:ns NR:ns ## COG: FN1840 COG2376 # Protein_GI_number: 19705145 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 328 5 332 332 508 80.0 1e-144 MKKLVNQRENIVEEVVQGMIKAYPEKLSRVEGEPIILRKEKKVGKVALISGGGSGHEPAH AGYVGYGMLDAAVCGEIFTSPGADKVYRAIQEVDSGAGVLLIIKNYSGDIMNFEMAAEMA 
AMDGITVKQVVVDDDIAVENSTYTVGRRGIAGTVFVHKILGAAAEAGYSLDELVDLGNRL VNNIKTMGMSLKSCMVFSTGKQSFEIGDDEVEIGLGIHGEPGTHREKMATADEFTEKLFA QIDRETQLQKGEKIAVLVNGLGETTLIELFIINNHLQDLLQAKEVTVVKTFVGNYMTSLD MGGFSISIVKLDEEMRKLLLAEQDTIAF >gi|224461498|gb|ACDD01000004.1| GENE 95 51244 - 51855 976 203 aa, chain + ## HITS:1 COG:FN1841 KEGG:ns NR:ns ## COG: FN1841 COG2376 # Protein_GI_number: 19705146 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 202 1 202 202 254 63.0 1e-67 MLVKIVEKIADEIIQNKEYLTELDRVIGDGDHGVNLARGFEEIKAQISSYSSLAYSDIFQ KMGMTLLTKVGGASGAIYGTAFMSAGMYCKGKTELEKEDIVAIFKAMIEGVKKRGKASLG EKTLLDTVLPVYDLLQHRLEQGEDILSNTEEIKTVAKQGMESTKDIIATKGRASYVGERS LGHIDPGAASSYMMIKVICEEIK >gi|224461498|gb|ACDD01000004.1| GENE 96 51864 - 52268 413 134 aa, chain + ## HITS:1 COG:FN1842 KEGG:ns NR:ns ## COG: FN1842 COG3412 # Protein_GI_number: 19705147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 132 3 134 136 157 66.0 4e-39 MVGIVVVSHSKALAKEAITLAMEMKHSEFPLINGSGTDGDYFGSNPLMIKEAIEKAYTEE GVLVFVDLGSSVLNTQIAIDFLDDSIFNLDHIKIADAPLVEGLIAAVAINDAKASLTDII SELKEFKNFSKINE >gi|224461498|gb|ACDD01000004.1| GENE 97 52337 - 52399 63 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIRVETQYYDFFYTQILRKK >gi|224461498|gb|ACDD01000004.1| GENE 98 52412 - 53161 1303 249 aa, chain + ## HITS:1 COG:FN1838 KEGG:ns NR:ns ## COG: FN1838 COG0580 # Protein_GI_number: 19705143 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Fusobacterium nucleatum # 1 235 12 246 254 281 69.0 8e-76 MEPMTMYFAEFIGTALLLLLGNGVNMTLSLKHSYGKGGGWMCTCFGWGVSVTIAAYFVGW ISGAHLNPAVSLALAVAGSLEWTLLPGYIIAQVLGGILGATLAYLAYKRQMDEEPDVGTK LGVFSTGPSIDDAKWNVVTEAIGTAVLMIGILAIGYGKNQMPAGIGPVVVGLLIMVIGLG LGGATGFAINPARDLGPRIAHAILPIKGKGDSNWKYAWVPIIGPMIGGVLGTLLFRVVCQ MTEGCPILN >gi|224461498|gb|ACDD01000004.1| 
GENE 99 53174 - 54673 2490 499 aa, chain + ## HITS:1 COG:FN1839 KEGG:ns NR:ns ## COG: FN1839 COG0554 # Protein_GI_number: 19705144 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Fusobacterium nucleatum # 1 497 1 497 497 877 84.0 0 MKYIVALDQGTTSSRAILFDENQSIVGVAQKEFTQYYPKEGWVEHDPMEIWSSQSGVLAE VIARAGITQHDIIAIGITNQRETTVVWDKNTGKPIYNAIVWQCRRTAKICDELRKIEGLE EYIKDTTGLVLDAYFSGTKIKWILDNVDGAREKAEKGDLLFGTVDTWLIWNLTHGKVHAT DYTNASRTMLYNIKELKWDERLLKELGIPKQMLPDVRDSSGNYGYANLGGTGGHRVPIAG VAGDQQSALFGQACFGEGESKNTYGTGCFLLMNTGEKFVKSNHGLVTTIAIGLDGKVQYA LEGSIFIGGASVQWLRDELRLVNESKDTEYFARKVKDNGGVYVVPAFVGLGAPYWDMYAR GAILGLTRGANKNHIIRATLESIAYQTRDVLEAMQEDSGIQLAELKVDGGAAANNFLMEF QSDILGVKVRRPVVLETTALGAAYLAGLAVGFWESKEEIKGKWILDREFTPNMEEEEKEK KYRGWKKAVSRAREWEELD >gi|224461498|gb|ACDD01000004.1| GENE 100 54813 - 56246 2178 477 aa, chain + ## HITS:1 COG:FN0183 KEGG:ns NR:ns ## COG: FN0183 COG0579 # Protein_GI_number: 19703528 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Fusobacterium nucleatum # 1 476 23 498 498 691 71.0 0 MVDVAIIGTGIMGSSLAYELAKYQVSILLLDKEHDVSNGTTKANSAIVHAGYDAKEGSLM AKYNVWGNALYENLCKEVDAPYKRTGSYVLAFSEADRKHLEMLYQRGLANGVPDMKILER DEVLAKEPNITTEVVAALYAGTAGITGPWELAIKLVENAMENGADLMLDAEVTKIEKMDG YYRITTKDGKKVEAKTVVNAAGVYADKINNMVSSDSFKIIPRKGEYYILDKVQGNLTNSV IFQCPNEMGKGILVAQTVHGNIIVGPTALDVNDKEDVSNTLGGFESIRKAASKSIKDINY RDNIRNFAGLRAEADTGDFILGESKDAKGFFNMAGTKSPGLTSAPAMALDLSKMILEYLG KVEKKAEHIKNKKHPHFMDLSPEEKAALIAKDSRYGRIICRCENITEGEIVDTIHRKAGG RTIDGIKRRCRPGAGRCQGGFCGPRVLEILARELEVKPDEIVQDKKTGYILTGETKR >gi|224461498|gb|ACDD01000004.1| GENE 101 56261 - 57577 1874 438 aa, chain + ## HITS:1 COG:FN0182 KEGG:ns NR:ns ## COG: FN0182 COG0446 # Protein_GI_number: 19703527 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 1 416 3 418 421 617 77.0 1e-176 MKYDLVVVGGGPGGLAAAIEAKKNGIESILVIERAKELGGILQQCIHNGFGLHEFKEELT 
GPEYAGRFIDQLLEMNIEYKLDTMVLDVTDKKEVHAINSKDGYMLIEAKAIVFSMGCRER TRGAISIPGDRPAGVFTAGAAQRYINMEGYMVGKRVVILGSGDIGLIMARRLTLEGAEVL AVAELMPFSGGLTRNIVQCLEDYNIPLYLSHTVIDIQGKDRVQKVILAKVDENRQPIPGT EIEYECDTLLLSVGLIPENDISRKTGVEMDRRTNGPIVNEMMETSVPGIFACGNVVHVHD LVDFVSGEARKAGKAAAKYIKGEVSEGEYIFLKNGNGISYTVPQKVRMVNVDNSLEVFMR VNRIFKDVKLEVKAGEEVLMSLKKNHMAPGEMERIMIPKAKLEAAQGKEIVVEVVEGRQI IACTSYDNCTEEAVGGAK >gi|224461498|gb|ACDD01000004.1| GENE 102 57577 - 57921 484 114 aa, chain + ## HITS:1 COG:FN0181 KEGG:ns NR:ns ## COG: FN0181 COG3862 # Protein_GI_number: 19703526 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Fusobacterium nucleatum # 1 114 1 114 114 152 73.0 1e-37 MKKEMICIVCPVGCHISVDRDTLEVTGNTCPRGEKYGKEELTNPKRVITSTVCIEGAEDR RCPVKTNDSIPKGLNFACMEELKKVILHSPVKRGDIVIANVLDTGVDVVATKDM >gi|224461498|gb|ACDD01000004.1| GENE 103 58006 - 58470 398 154 aa, chain + ## HITS:1 COG:ECs3098 KEGG:ns NR:ns ## COG: ECs3098 COG4574 # Protein_GI_number: 15832352 # Func_class: R General function prediction only # Function: Serine protease inhibitor ecotin # Organism: Escherichia coli O157:H7 # 28 145 33 151 162 112 45.0 3e-25 MKKILLFLVLALSFSMFAFGENMELDIYPKAKKGTKQEIFILDKQEKEEDYKIELRFGKD IKVDCNVHSFLHGNLEEKSVEGWSYPYYIFQGSNDMVQTLMLCQEGKKLKRVYYPSATRI LPYNSKLPVVIYVPKDVKVEVHIWKHSGVKEASR >gi|224461498|gb|ACDD01000004.1| GENE 104 58701 - 59354 940 217 aa, chain + ## HITS:1 COG:BS_ydhC KEGG:ns NR:ns ## COG: BS_ydhC COG1802 # Protein_GI_number: 16077637 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 11 194 15 199 224 101 34.0 9e-22 MIIKQKSIREQVYESLKEAIVNGEIESGEKIIELEYAEKFGVSRTPLREALRMLELEGLV SSAEKGGVTVNYISKEDIEEIYKIRVALESIVLKEIIEKDKGCLKPLHSILRETKFALDE NMESGKLIKIFQKFNHELYEVAKLKQVSKLINNLNEYTKRFRVLCLKDEIRLEEAFIEHC KLVEALENKDLEEALKINDKHLYKSMELVLNKMPDTK >gi|224461498|gb|ACDD01000004.1| GENE 105 59514 - 60863 2038 449 aa, chain + ## HITS:1 COG:FN1375 KEGG:ns NR:ns ## COG: FN1375 COG3493 # Protein_GI_number: 
19704710 # Func_class: C Energy production and conversion # Function: Na+/citrate symporter # Organism: Fusobacterium nucleatum # 1 449 1 453 454 629 73.0 1e-180 MAKKNFRELFDIREFKWGGVNFPIFLCMLALTMVVVYVPFGGEKAGFLRPNFLTIFALLG VFGLLFGEIGDRIPFWDEYIGGGTVLVFFSAAVFGTYKFVPEPVVSAIKIFYGKQPVNFL EMFIPALIVGSVLTVDRRTLIKSMSGYIPLIVVGVLGASLCGIAAGLLFGKAPLDIMMNY VLPIMGGGTGAGAIPMSEMWSSKTGRPASEWFAFAISILTIANIIAILAGAFLKKLGENN PSLTGNGDLVIDDSKEVVKDKEVEVKAELVDTAAAFMMTGILFTAAHILGEVWETLGFPF EIHRLAFLIILTMVLNIAGVVPDRLKAGAKRMQTFFSKHTIWILMAAVGFTTDVNEIINA LSLANLVIAFAIVIGAVVFIMLLSKKMKFYPVEAAITAGLCMANRGGAGDVAVLGAADRM ELMSFAQISSRIGGAMMLILGSIIFGIFA >gi|224461498|gb|ACDD01000004.1| GENE 106 60891 - 62279 2294 462 aa, chain + ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 1 445 1 445 448 725 79.0 0 MKKVKIMETCLRDGHQSLMATRLKTEEMLPIIETMDKAGYYSMEMWGGATFDAAIRFLNE DPWERLREIKKRAKNTKLQMLLRGQNLLGYRHYADDVVDKFIEKAIGNGIDVIRIFDALN DIRNLKQACESTKKYGGHAQLAMSYTISPVHTVEYYKNLAQEMEAMGADSIAIKDMSGIL LPEVAYELVKELKSVLKVPLELHTHATAGLASMTYVKAIEAGVDIVDTAISPFSGGTSQP ATESLVRALAGAERETELNLDILKEVAEYFKPIRNKYVAEGILNPQALMTEPSIVEYQLP GGMLSNMLSQLKAQKAEYRYEEVLREIPRVREDLGYPPLVTPLSQMVGTQAVFNVISGQR YKMVPKEIKDYVKGLYGKSPVAVSEEIKEKIIGNEKVFTGRPADLLEAEYEKLKEESKEF TKSEEDVLMYAMFPQVAQTYLEKKYHSAKQEERKEQYIHIVF >gi|224461498|gb|ACDD01000004.1| GENE 107 62288 - 62623 389 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451647|ref|ZP_05616946.1| ## NR: gi|257451647|ref|ZP_05616946.1| hypothetical protein F3_01187 [Fusobacterium sp. 
3_1_5R] # 1 111 1 111 111 177 100.0 3e-43 MNSIIFGDRFVSFSDSLYITVVSMSIVFFALVLICFFVSCMKYIPQEKVVEKISTKKRET TKVVPQTMKEEKQEINYEDENIRLALMVASMEAAAEDENAYIKIRSIKEIV >gi|224461498|gb|ACDD01000004.1| GENE 108 62632 - 62976 625 114 aa, chain + ## HITS:1 COG:lin1060 KEGG:ns NR:ns ## COG: lin1060 COG1038 # Protein_GI_number: 16800129 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Listeria innocua # 10 111 1038 1141 1146 70 40.0 9e-13 MIKVYKLKIGEKVYEVELESITEKEGTIAETTPSQKKVETTATEGTSVEAPMQGVIVDVV VSVGDQVAAGDELVVLEAMKMENAIVAPVAGRVANIYVSKGENVDNGKLLITLA >gi|224461498|gb|ACDD01000004.1| GENE 109 62990 - 64108 1894 372 aa, chain + ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 1 371 2 375 376 421 66.0 1e-117 MELLYTLYQTTGLSMLTVNKSIMILVALCLLYLAIKKGYEPYLLLPISFGMLLVNLPGVP NEGLMDEGGLLYWLYKGVKLGIYPPMIFLAIGASTDFGPLIANPKSLLLGAAAQLGIFAA FIGSILLGLSGKAAASIGIIGGADGPTAIYLTSKLAPDMLGPIAVAAYSYMALVPVIQPP IIRLLTTKKEREIKMVQLRQVTKREKIIFPILVTIIVILLIPSSAPLVGMLMLGNLMKES GLVPNLVEHAKGSMMYVITICLGTTVGATTNAETFLTLTTIKIVLLGLFAFGFGTAGGVI FGKIMCKLSGGKINPMIGAAGVSAVPMAARVVQKVGQEENPSNFLLMHAMGPNVAGVIGS AVAAGVLLAVFK >gi|224461498|gb|ACDD01000004.1| GENE 110 64122 - 64406 495 94 aa, chain + ## HITS:1 COG:SPy1186 KEGG:ns NR:ns ## COG: SPy1186 COG3052 # Protein_GI_number: 15675156 # Func_class: C Energy production and conversion # Function: Citrate lyase, gamma subunit # Organism: Streptococcus pyogenes M1 GAS # 1 94 1 95 102 68 41.0 3e-12 MELKVAAVAGTTDKNDIFISIEPSSQGIEISLKSKVMEQFGDNIRETIENTLKDMGISSA KIEAEDNGAVEVVIMSRVQTAVMRSAQSTKYIWK >gi|224461498|gb|ACDD01000004.1| GENE 111 64417 - 65319 1346 300 aa, chain + ## HITS:1 COG:HI0023 KEGG:ns NR:ns ## COG: HI0023 COG2301 # Protein_GI_number: 16271998 # Func_class: G Carbohydrate transport and metabolism # Function: 
Citrate lyase beta subunit # Organism: Haemophilus influenzae # 1 296 1 291 291 318 56.0 6e-87 MKLRRSMLFVPATKPGTMRDAYVYKPDSVMFDLEDSVAITEKDSARILLFNMLKKFGPFY KEMGIETVVRINALDTEFGVEDLEAVVRAGIEVVRIPKTDTPEDVREVEAHIERIEKEAG IPVGTTKMMVAIESPLGALNALEIAKSSPRLIGMAIGGEDYVTNLKTTRSPEGIEMLMGR AMVVMAARSAGIAALDSVYSDIDNHEGFIKEATMIKQMGFDGKSLIHPTQIELIHKVYTP DEKSLKKSIKIMKATEQALKEGKGVFTVDGKMIDKPIIERAQHVLNLAKAAGLRWEEEDV >gi|224461498|gb|ACDD01000004.1| GENE 112 65321 - 66862 2551 513 aa, chain + ## HITS:1 COG:STM0061 KEGG:ns NR:ns ## COG: STM0061 COG3051 # Protein_GI_number: 16763451 # Func_class: C Energy production and conversion # Function: Citrate lyase, alpha subunit # Organism: Salmonella typhimurium LT2 # 23 513 12 505 506 543 55.0 1e-154 MKELRVDEALLASIKGYENRNAYVSPFAFQPEGTMQEAADLKGQVRRTKVVASLEEAIKK SGLKDGMTISFHHHFRDGDKVLPMVMEIIANMGFKDLRVAASSFTGAHECMVEYIEKGVV NRIESSGLRGKLAQAVSNGVLASPAVIRSHGGRARAIVEGDLKIDVAFLGVPSSDCMGNA NGVIGKSVCGSLGYAMVDAQYAKKVVLITDTLVAYPNHPISIPQTQVDFVVEVEEIGDPN GIMSGATRFTKNPKELLIAKNVVKAMIASGYFVDGFSMQTGSGGAALAVTRFMKEEMLKR DIKCSYALGGITAAFASLLEEGLVKEIFDVQDFDLGGVASITKNALHQEISADFYASPFN KSAAVNKLDFVVLSALEIDRDFNVNVISGSNGVIRAASGGHSDAAACAKMSIIVAPLLRG RLPIVIDRVTTVVTPGETVDVLVTELGITVNPLRQDLKANFEKAGIELIEMDTLIERAKF LAGEPKKAEFSDEVVAVVEYRDGSIIDVIKKVK >gi|224461498|gb|ACDD01000004.1| GENE 113 66979 - 67644 661 221 aa, chain - ## HITS:1 COG:FN0946 KEGG:ns NR:ns ## COG: FN0946 COG1451 # Protein_GI_number: 19704281 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Fusobacterium nucleatum # 3 217 14 223 229 142 40.0 7e-34 MSLEYQLTRKKIKRIILRVLEDGSLQVNAPFFVSQNEIETFLASQSSWIEKTRKKLLSQK KNKNPLQDHYSSGDTFSIFGKEITLQLRVSKASSIYLGKQFLYVFYQPEEKEQITQIIQN YLLQLLKEALEFYLKNYSSRLQLYPNQFQIKTMKSAWGIYHSKGNDISFNSLLLSQTKEF IEYVVVHELCHIRYLNHQKEFWNLVATQIPNYREIRKSSQT >gi|224461498|gb|ACDD01000004.1| GENE 114 67695 - 68483 839 262 aa, chain - ## HITS:1 COG:FN0926 KEGG:ns NR:ns ## COG: FN0926 COG2357 # Protein_GI_number: 19704261 # Func_class: S Function 
unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 262 3 259 259 375 74.0 1e-104 MSTKLDQSTFFEEFTIDKEYFDSTGLEWEELVRIYEDYVQLIPSLEKEAEYIVSKLIDAP NVHSVRRRVKKAKHLIEKIIRKGKKYKDRNISVENYREIVTDLIGIRVLHLFKDDWKGIH HNILNLWELSETPQVNIRRGDYNLQQFRESISDLNCEIIVREHGYRSVHYLVKIPITISL NVLVEIQVRTVFEEAWSEIDHIMRYPYDTDNPVITEYLAIFNRMVGCADEMGTFLKKVKK DFSLEKEFAEHCIPRDLDLKFK >gi|224461498|gb|ACDD01000004.1| GENE 115 68503 - 69348 624 281 aa, chain - ## HITS:1 COG:no KEGG:FN0925 NR:ns ## KEGG: FN0925 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 13 281 33 292 292 137 35.0 5e-31 MIEITEEKKTHILPLSKEMKLEENLEIQFSSLQLKHFPISYRNFSSMEKFLEIIPLGTTD VQVGEQILHNVTLRAFVYKNFRLLELKTREFRFAFSAELFDNVFFSREAFLQYEISPDLN NPRLENIFTLFQNIFHGAKIVFQYNDAQSELSISNEIEAFKFSLLSSSLEKYQNQIASIL SKKEKNFSSLKNSFYELEILYYYLSGKTFYDGWVNAKFPKGDIHSGDSVQFVRTISYPFQ RLSYDIRQTITLRQDLGNIGNGDTIQLNRKSASILLEAIEK >gi|224461498|gb|ACDD01000004.1| GENE 116 69381 - 70175 624 264 aa, chain - ## HITS:1 COG:no KEGG:FN0924 NR:ns ## KEGG: FN0924 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 64 260 1 200 209 88 28.0 2e-16 MDKFWNYFPSEQKFNVFLGEEYCNYDQSPSRKELAAFLLDKIPEALRHRVHNKESLSEIS QDLLDLAIFSRNNLIKTVEEFGKNISLNFSCYTPILENKRFQAIINMNMFLPLEREFHEK LHPIFPFSEPTEEKTNKLPFYRILGCINQSDKVFLTAQDTKKLKLLSFYQNFWTQLRKEL MERPTILLGMDLENKDVQEILGFLLEEIHYEKQNIYLVTSSSILSTNVTNFINKYDIKLL MKDTDSFQKSLDEKVVDVQKQLVW >gi|224461498|gb|ACDD01000004.1| GENE 117 70311 - 71768 1272 485 aa, chain + ## HITS:1 COG:FN0923 KEGG:ns NR:ns ## COG: FN0923 COG1502 # Protein_GI_number: 19704258 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Fusobacterium nucleatum # 5 485 8 479 479 466 46.0 1e-131 MLENLMKITSILLEYIWIMNISFILILVFLERKNPLYTLLWAIILSLAPYIGFIAYLFFG ISFRKRRKANKIYELARLESKDMIEFSQRADLQNWERLIHYLEMTSKNRLTWQNTMTPYF 
EGEKYFRALLQDLKEAKREIKIEMYLFRNDFLGKKILEVLKERANIGVEIFLLLDGVNPP SYSMRKFLKEAGIRYRIFFPSPLPYLNISLNANYRNHKKLCIIDRKISYLGGFNIGDEYI GNGKLGYWRDTAIRVAGEIVVELEKEFYFTWNIASREKRELGEKVYPYMQEVMQEIKRRK GRNTGYMQVATSGPNFAFHTLRDNYLNLIQGAKSHIYIQTPYFVPDDIILDALKIACLSG VKVKIMIPAKSDHFIIHPVNHYFVGELLELGAEILEYQKGFLHCKVIMVDGEVVSMGSCN VDYRSFYQNFEINVNIYEKDVVREFEKQFKKDVAVSERISYPKYRSRSIRTKIKEAVFRL FAPVL >gi|224461498|gb|ACDD01000004.1| GENE 118 72053 - 73126 1531 357 aa, chain + ## HITS:1 COG:BS_pyrAA KEGG:ns NR:ns ## COG: BS_pyrAA COG0505 # Protein_GI_number: 16078615 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Bacillus subtilis # 1 355 1 356 364 361 49.0 1e-99 MKGKLILENGMVFNGTVFGEVGETVGELVFNTGMTGYQELLTDPSYYGQMVVMTYPMIGN YGINLEDMESDRIHLRALIIKEEAKLPNNFRCEMSLDGFLRQNKVIGFKSVDTRYLTKVI RDCGAMKGIITTKDLTKKEIEERFSSYQNRDAVEQVSPKEIYEIPGKGLRLGFMDFGAKA NIIRNFKERDCHMVVFPWNTKAETILEYNVDGVFLSNGPGDPADLQNVIAEIKKLIEKKM PILGICLGNQLTAWALGGTTKKMKFGHRGGNHPVKDLDHNRIYITSQNHGYAIDKIPEKA RVSHVSMNDGTIEGLKCDDLHIMTVQFHPEAWPGPTDCEYLFDEFLEVIKGAKKDVR Prediction of potential genes in microbial genomes Time: Fri May 20 01:35:08 2011 Seq name: gi|224461497|gb|ACDD01000005.1| Fusobacterium sp. 3_1_5R cont1.5, whole genome shotgun sequence Length of sequence - 13354 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 9, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 262 317 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Term 285 - 341 14.0 + Prom 297 - 356 6.9 2 2 Tu 1 . + CDS 537 - 668 64 ## gi|257466978|ref|ZP_05631289.1| hypothetical protein FgonA2_06020 + Term 765 - 811 7.1 + Prom 809 - 868 10.3 3 3 Tu 1 . 
+ CDS 893 - 1363 578 ## COG3467 Predicted flavin-nucleotide-binding protein + Term 1373 - 1415 -0.8 + Prom 1375 - 1434 8.0 4 4 Op 1 4/0.000 + CDS 1479 - 2057 621 ## COG0518 GMP synthase - Glutamine amidotransferase domain 5 4 Op 2 . + CDS 1909 - 3018 1277 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 3162 - 3197 3.4 - Term 3149 - 3185 0.4 6 5 Tu 1 . - CDS 3242 - 4222 689 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 4310 - 4369 9.0 - Term 4229 - 4281 2.0 7 6 Tu 1 . - CDS 4381 - 4980 312 ## Apre_0340 helix-turn-helix domain protein + Prom 5211 - 5270 13.7 8 7 Op 1 . + CDS 5349 - 6071 590 ## Amet_2189 CRISPR-associated Cas family protein 9 7 Op 2 . + CDS 6086 - 7774 1416 ## Amet_2190 CRISPR-associated Cst1 family protein 10 7 Op 3 . + CDS 7793 - 8665 1032 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair 11 7 Op 4 . + CDS 8652 - 9404 969 ## Lebu_0797 CRISPR-associated protein Cas5 12 7 Op 5 . + CDS 9455 - 11860 1722 ## COG1203 Predicted helicases - Term 12086 - 12127 -0.9 13 8 Op 1 . - CDS 12128 - 12322 151 ## 14 8 Op 2 . - CDS 12396 - 12578 109 ## - Prom 12656 - 12715 11.6 + Prom 12818 - 12877 6.6 15 9 Tu 1 . 
+ CDS 12902 - 13352 710 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|224461497|gb|ACDD01000005.1| GENE 1 2 - 262 317 86 aa, chain + ## HITS:1 COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 2 68 981 1047 1071 71 52.0 3e-13 EATAVRKISEDSPNLLDFIKNRQVDLLINTPTKANDSQRDGFKIRRSAIEYGVEVLTSLD TMKAIIKMQDRNLKEETLDVFDISKI >gi|224461497|gb|ACDD01000005.1| GENE 2 537 - 668 64 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257466978|ref|ZP_05631289.1| ## NR: gi|257466978|ref|ZP_05631289.1| hypothetical protein FgonA2_06020 [Fusobacterium gonidiaformans ATCC 25563] # 1 34 1 34 36 65 100.0 1e-09 MFVTETLLYRWFQEVGNLISQKNIYFTLGRIPIDIKIKKKKIE >gi|224461497|gb|ACDD01000005.1| GENE 3 893 - 1363 578 156 aa, chain + ## HITS:1 COG:FN1023 KEGG:ns NR:ns ## COG: FN1023 COG3467 # Protein_GI_number: 19704358 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Fusobacterium nucleatum # 1 153 3 156 156 186 60.0 1e-47 MRKSNREITDVNELLEVMKHCDVCRLALNDNEYPYILPLNFGLEVLDGNIKLYFHSAMEG YKWEVIARDNRASFEMDCEHELQYFEEQGYCTMAYESIIGRGRITELKEVEKAGALQKIM DHYHVENSYYNPAAISRTRVYVLTVESMTGKRKIKK >gi|224461497|gb|ACDD01000005.1| GENE 4 1479 - 2057 621 192 aa, chain + ## HITS:1 COG:FN1444_1 KEGG:ns NR:ns ## COG: FN1444_1 COG0518 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase - Glutamine amidotransferase domain # Organism: Fusobacterium nucleatum # 1 166 1 166 194 262 74.0 3e-70 MKECSIIILDFGSQYNQLIARRVREMGVYAEVVPFYEPLDKILARKPKGIILSGGPASVY AEGAPTLDKALFDHGIPVLGLCYGMQLVTHLFGGEVARADKQEFGKAELIIDEKDAALFQ NIPNNTKVWMSHGDHVTRIGEGFHAIAHTDSCIAAVVNPEKNNLCFSIPSGSYSLRTWKR YATKLCIRSSKM >gi|224461497|gb|ACDD01000005.1| GENE 5 1909 - 3018 1277 369 aa, chain + ## 
HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 52 369 1 318 318 564 85.0 1e-161 MPLLIQILVLQQLSIRKKIIYAFQFHPEVTHSEHGRDMLQNFVLEVAKCEKNWSMDNYIE STIKAIQEKVGDKKVILGLSGGVDSSVAATLIHRAIGDQLTCIFVDTGLLRKNEAKTVME VYSENFHMNIKCVDAEERFLSKLKGVSDPEQKRKIIGKEFIEVFNEEAKKFEDAEFLAQG TIYPDVIESVSVKGPSVTIKSHHNVGGLPEDMKFQLLEPLRELFKDEVREVGRQLGIPHH MIDRHPFPGPGLGVRILGDITKEKADILREADDIFIEELRKADLYGKVSQAFVVLLPVQS VGVMGDERTYEYVASLRSVNTIDFMTATWSHLPFDFMERVSNRILNEVKGINRLTYDISS KPPATIEWE >gi|224461497|gb|ACDD01000005.1| GENE 6 3242 - 4222 689 326 aa, chain - ## HITS:1 COG:lin2373 KEGG:ns NR:ns ## COG: lin2373 COG4823 # Protein_GI_number: 16801436 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Listeria innocua # 6 240 3 219 298 77 27.0 4e-14 MANKNKPNKPFKTYRQQLKVLRNRNLTISNGSHAISILKRDNYYNIINGYKEIFLEVKGQ NEKFYTGTTFEHINALYVFDKKIRHLFLYYILIFENLIKTKIAYYHTETFDEIFNYLDVN NFSGKAESITKLIASISKEIEKYTGLKQPNAFSHYIEEHGELPLWVLFQKATFGTASYFF SSLQNNIKDKICTELNEEHNNRYGLKSSIFINPIFLENTIHFINSYRNVCAHNERFYNST FQKRGFKVDYSHYTKTEFRGTIFDLLIILKLFLLKSDFEILKKDFKKQLNKLENELVNNP NALEKIKRSALKLPIGWENILNHIWK >gi|224461497|gb|ACDD01000005.1| GENE 7 4381 - 4980 312 199 aa, chain - ## HITS:1 COG:no KEGG:Apre_0340 NR:ns ## KEGG: Apre_0340 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: A.prevotii # Pathway: not_defined # 5 91 3 88 259 87 51.0 4e-16 MAKKTIQGSKELAKQIKLRRNELGLTIEEAASRTNVGTKTWSRYEAGSSIRIDTCKGICK ALNWYTIPNQIIEDNEQFSIQEYKNHEPWSEFLYQMKYTLLEMRMYANSGSSIIAHTVLE ELLLYLCCMESSTLIEFNSNLNKLQDIDWKDWIFDLFDDMDIISFLYSDIYLGTNHPYHF FQWTKKQFYTNEFSDEKNL >gi|224461497|gb|ACDD01000005.1| GENE 8 5349 - 6071 590 240 aa, chain + ## HITS:1 COG:no KEGG:Amet_2189 NR:ns ## KEGG: Amet_2189 # Name: not_defined # Def: CRISPR-associated Cas family protein # Organism: A.metalliredigens # Pathway: 
not_defined # 1 239 16 256 257 192 43.0 7e-48 MRFTVTIQLNQSEIPKDRSRIFLSLIKFWMEKENPELFYKLYGSKATIRKNFTYSLFLGK CNFKRETIEIPNKQIVLNLSCYDLELGIHIYNALLKGKNHSYFYKDISMCITNIKLQKEK LISTDVAFFQTMSPCVVREHNEETNRDWFYSLSESKGQQLFLQNLQIQIVDTFPEAKDEV QAMEIRVLKNKEVKVKHYGIEVLANICELEIKAKSYILDYLSKAGIGSLKSTGFGMMKVK >gi|224461497|gb|ACDD01000005.1| GENE 9 6086 - 7774 1416 562 aa, chain + ## HITS:1 COG:no KEGG:Amet_2190 NR:ns ## KEGG: Amet_2190 # Name: not_defined # Def: CRISPR-associated Cst1 family protein # Organism: A.metalliredigens # Pathway: not_defined # 2 556 3 550 554 436 41.0 1e-121 METIVIRSEDWLKNAGIIGLYRILEEEESDEKSSISLEEDQIRFSAELLQNFSEKYFRYF IKQYKNVLSLYRTLNFKENISQFKEKNYENFGKEDLEKLNEHVENVKKYLKSNSYRAMYP LIRCPFDPLEKEKELKKVHLRKKESIEDGISDVKKLITHLEEIYDFLQQEDSQKYIGAKN VIYNIIKNAWDGISILNPQVKEQNMYFEFDKYFVQTTQEYLKQEKTKFKYRCFSCGESIK DTNIDLSFMNHIGFDVARKTSHVWNFNNYVHICPLCRLIYACVPAGFTYLYDKGIFVNAN THLKEMLRINDLIFKNVLGEKKDGKSIYGALVSGMSKEMNEHVEYDLSDIQVVRLEKKRY TFSILSRKFLSIIKKCRTDLEYIRKSSFKEGNDIYYIYNETIKRLMQGENLFLLIHKLVR LKISNGEDCYYKMGTVGTIIEINDIFLKEVGYMKDEKKEYNPLERARIIGHHLQEAYGGF EEGKYNKKLDGIAYRMLNALKTNNKIAFMDSLINAHMYVQKPIPSLFSDYLHQELAFKEL GYAFVTGMLGEEWKNEGNTSKN >gi|224461497|gb|ACDD01000005.1| GENE 10 7793 - 8665 1032 290 aa, chain + ## HITS:1 COG:aq_374 KEGG:ns NR:ns ## COG: aq_374 COG1857 # Protein_GI_number: 15605880 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Aquifex aeolicus # 4 287 3 352 357 158 34.0 1e-38 MKVKGLTVTMIFEAGSANYGEAVGNVSALKKVARNNGDQYTYISRQAIRYNIVEQLNCPL AEVSAEGSADKKVIQFHKNATIEKYPELDFFGYLKTEQKTAGKKRSAKVRLSHAISLETF KGDLDFLTNKGLADRLQENMNIAQSEIHHSYYSYTLTVDLDQIGIDEVEEIYLSQEEKSR RVIQLLDTISMLYRDIRGRREDLKPLFAIGGVYDIKNPIFQNAVEVKNNAININLLQNIL PKNMKEETICGLVEGKFRNNLQLKEELGAIPMSKFFDHLKKEVQSYYESC >gi|224461497|gb|ACDD01000005.1| GENE 11 8652 - 9404 969 250 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0797 NR:ns ## KEGG: Lebu_0797 # Name: not_defined # Def: CRISPR-associated protein Cas5 # Organism: 
L.buccalis # Pathway: not_defined # 1 248 1 251 254 256 52.0 6e-67 MKAVRVILYQDLVNYRHPMSFQLKESYPLPPYSTVIGMVHNLCRYQEYHPMKISIQGKYI SKTNDLYTRYEFKSGAKYDKSRHQMDAGGFGISRGIGTAELLSQVQLILHIRPENEDQIE EIIKAFKAPWEYPSLGRREDLAVIQEVEEVEITQRTHKDAKKKEDLYAYIPLEYLEEGLV EGRNTERGVKARGTKYLLNKRYYLENYGKEKDPKWIRRWEKVKVLYTSKITTKAMKQLWL DEDNYMVFEA >gi|224461497|gb|ACDD01000005.1| GENE 12 9455 - 11860 1722 801 aa, chain + ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Fusobacterium nucleatum # 1 792 1 797 812 361 34.0 4e-99 MIDVKKIKNEFLAKSNPLETIMEHTEILLQEYDRISTIYPNFIEKIPKVWKVLYLVCVYH DWGKINFPFQNRILTGKKQWNELPHAILSIACLNAEELSKEFPREEIQLLYSAILFHHNR EDLLSYEIDQIRSQLQEMSEDVKQMWEIICMNPQQYPLLNQKKIYFDSNLYFEDSFYQEF PVDILNPKECEKAVFYILLKGLLNRIDYAASAHIPVEHKNDFLEKGLETFMQDLNVERTS RGMEKSDWNNLQEYMKSHQEDNVIVVAQTGLGKTEAGLWWIGDHKGFFILPLRTAIDAIY DRIKNKIIKNDRVEERLALLHGESLESYLQLWEKEKELQEGSGFEWENYYIKTRQLSLPL TICTLDQLFPFVFRYRGYEAKLATMAYSKVVIDEVQMYGPDLVGFLIVGLTMIQKMGGKF AILTATFPGFIKDLMREQGLEFKMSTPYVKEESRHSVQWIQKEINADFILEKYKNNRVLV ICNTVKRCQEIYHDLYEKMEISQEKLLSHDIMDRELNLFHAKFIKKDRAIREKAILEFGS LLKQDNTPNDRQGIWISSSAVEASLDIDFDILITELSDMNSLFQRMGRCFRGRILETGDF NCYVFDGGDKKCSGIGYSIQEEIYEISKNDLRQHFSTNGNILTEKEKMDLVEQTYSKEKL EPTKYYKEVKDFIKNPSLYLPNEMSAKEGQFRFRNILSERIIPLPVYQQNIEEIRKIEEK LKLPLNSQMSENMEEKSISKEERILCREKLMQYTLSVESYLLKGANIEKKITINSYQEIK VVSCEYDSLIGLGKIIKEKKG >gi|224461497|gb|ACDD01000005.1| GENE 13 12128 - 12322 151 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYGFIKLKYSLHFNVFLLKFANGFLEGLRVLTFTFQCVSIKVKSILSEVAYFSSFTFQCV SIKV >gi|224461497|gb|ACDD01000005.1| GENE 14 12396 - 12578 109 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSFDDHLHFNVFLLKLNYFSLSALFLFDLHFNVFLLKYEEELYRKRVIETLHFNVFLLK >gi|224461497|gb|ACDD01000005.1| GENE 15 12902 - 13352 710 150 aa, chain + ## HITS:1 COG:YPO1493 KEGG:ns NR:ns ## COG: YPO1493 COG3328 # Protein_GI_number: 16121766 # Func_class: L Replication, recombination and 
repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 30 150 17 137 402 94 36.0 7e-20 MKEKKEVYKVKPLTEEKKNIIATLIEEYDIKTAQDIQEALKDLLGGTIQSMLEAEMEEHI GYEKYQHSDGANYRNGTKKKNIRSTYGEFQVEVPQDRNSSFEPKIVKKRQKDISEIDQKI INMYARGLTTRQISQQMEEIYGFECSESFI Prediction of potential genes in microbial genomes Time: Fri May 20 01:35:59 2011 Seq name: gi|224461496|gb|ACDD01000006.1| Fusobacterium sp. 3_1_5R cont1.6, whole genome shotgun sequence Length of sequence - 74800 bp Number of predicted genes - 71, with homology - 69 Number of transcription units - 24, operones - 12 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 820 970 ## COG3328 Transposase and inactivated derivatives - Term 782 - 817 4.4 2 2 Tu 1 . - CDS 890 - 1012 129 ## - Prom 1037 - 1096 5.8 + Prom 997 - 1056 8.9 3 3 Tu 1 . + CDS 1172 - 2827 1469 ## lp_1931 hypothetical protein + Term 2985 - 3051 30.0 + TRNA 2963 - 3039 66.9 # Arg CCT 0 0 - Term 3050 - 3083 4.0 4 4 Op 1 . - CDS 3094 - 4731 2680 ## COG2759 Formyltetrahydrofolate synthetase 5 4 Op 2 4/0.000 - CDS 4737 - 5171 483 ## COG0757 3-dehydroquinate dehydratase II 6 4 Op 3 . - CDS 5143 - 5946 746 ## COG0169 Shikimate 5-dehydrogenase 7 4 Op 4 . - CDS 5943 - 6200 286 ## gi|257451680|ref|ZP_05616979.1| chorismate mutase 8 4 Op 5 5/0.000 - CDS 6193 - 7173 1041 ## COG0082 Chorismate synthase 9 4 Op 6 . - CDS 7160 - 8386 860 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase - Prom 8435 - 8494 6.7 + Prom 8465 - 8524 12.9 10 5 Op 1 10/0.000 + CDS 8572 - 9297 241 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 11 5 Op 2 . + CDS 9353 - 10561 1908 ## COG0183 Acetyl-CoA acetyltransferase + Term 10591 - 10629 5.3 12 6 Op 1 . + CDS 10647 - 11024 436 ## gi|257451685|ref|ZP_05616984.1| hypothetical protein F3_01379 13 6 Op 2 . + CDS 11021 - 12595 1882 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 14 6 Op 3 . 
+ CDS 12635 - 13816 1427 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family 15 6 Op 4 . + CDS 13835 - 14515 1048 ## COG0670 Integral membrane protein, interacts with FtsH 16 6 Op 5 . + CDS 14538 - 15374 900 ## gi|257451689|ref|ZP_05616988.1| hypothetical protein F3_01399 17 6 Op 6 1/0.000 + CDS 15397 - 16188 765 ## COG1835 Predicted acyltransferases 18 6 Op 7 . + CDS 16190 - 18004 1409 ## COG1835 Predicted acyltransferases + Term 18011 - 18049 3.0 - Term 17999 - 18037 3.0 19 7 Op 1 . - CDS 18050 - 18214 188 ## FN1200 hypothetical protein 20 7 Op 2 . - CDS 18258 - 18632 571 ## COG3422 Uncharacterized conserved protein - Prom 18662 - 18721 7.1 + Prom 18731 - 18790 8.4 21 8 Tu 1 . + CDS 18844 - 19383 777 ## COG1592 Rubrerythrin + Term 19394 - 19435 5.8 + Prom 19400 - 19459 7.5 22 9 Op 1 4/0.000 + CDS 19562 - 21925 3527 ## COG0058 Glucan phosphorylase 23 9 Op 2 6/0.000 + CDS 21931 - 23763 2090 ## COG0296 1,4-alpha-glucan branching enzyme 24 9 Op 3 7/0.000 + CDS 23784 - 24929 1347 ## COG0448 ADP-glucose pyrophosphorylase 25 9 Op 4 17/0.000 + CDS 24950 - 26113 1397 ## COG0448 ADP-glucose pyrophosphorylase 26 9 Op 5 . + CDS 26124 - 27533 1431 ## COG0297 Glycogen synthase + Term 27534 - 27589 7.2 - Term 27528 - 27569 6.4 27 10 Tu 1 . - CDS 27575 - 29494 1177 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases - Prom 29539 - 29598 10.3 + Prom 29488 - 29547 14.2 28 11 Tu 1 . + CDS 29649 - 29966 612 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 30004 - 30052 7.7 + Prom 30032 - 30091 9.4 29 12 Op 1 1/0.000 + CDS 30129 - 30905 987 ## COG1692 Uncharacterized protein conserved in bacteria 30 12 Op 2 1/0.000 + CDS 30902 - 31831 1114 ## PROTEIN SUPPORTED gi|237737638|ref|ZP_04568119.1| ribosomal protein L11 methyltransferase 31 12 Op 3 1/0.000 + CDS 31841 - 32497 795 ## COG0283 Cytidylate kinase 32 12 Op 4 . 
+ CDS 32507 - 33736 1163 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase 33 12 Op 5 1/0.000 + CDS 33712 - 34416 648 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 34 12 Op 6 . + CDS 34428 - 35705 1933 ## COG0104 Adenylosuccinate synthase + Term 35774 - 35818 5.4 + Prom 35795 - 35854 11.0 35 13 Op 1 . + CDS 35925 - 39647 3552 ## CHY_2654 hypothetical protein 36 13 Op 2 . + CDS 39662 - 40990 1308 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 37 13 Op 3 . + CDS 41009 - 44623 3534 ## COG1002 Type II restriction enzyme, methylase subunits 38 13 Op 4 . + CDS 44677 - 46686 1424 ## COG1479 Uncharacterized conserved protein + Term 46695 - 46743 0.8 + Prom 46714 - 46773 9.2 39 14 Op 1 . + CDS 46798 - 49308 2034 ## Npun_F5663 alkaline phosphatase domain-containing protein 40 14 Op 2 . + CDS 49333 - 49923 514 ## Swol_2496 hypothetical protein 41 14 Op 3 . + CDS 49944 - 50501 466 ## Ppro_2290 hypothetical protein + Term 50568 - 50618 5.2 42 15 Op 1 . - CDS 50542 - 50607 66 ## 43 15 Op 2 . - CDS 50620 - 51183 631 ## COG0693 Putative intracellular protease/amidase 44 15 Op 3 . - CDS 51202 - 51450 361 ## FN1084 hypothetical protein + Prom 51584 - 51643 7.3 45 16 Tu 1 . + CDS 51672 - 53921 2099 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Term 53862 - 53899 0.8 46 17 Tu 1 . - CDS 53918 - 54715 978 ## COG0730 Predicted permeases - Prom 54740 - 54799 9.9 + Prom 54699 - 54758 8.8 47 18 Op 1 1/0.000 + CDS 54803 - 56206 357 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 48 18 Op 2 . + CDS 56154 - 58385 194 ## PROTEIN SUPPORTED gi|152975021|ref|YP_001374538.1| 30S ribosomal protein S1 49 18 Op 3 2/0.000 + CDS 58402 - 59736 1029 ## PROTEIN SUPPORTED gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase 50 18 Op 4 . 
+ CDS 59749 - 60303 720 ## COG0558 Phosphatidylglycerophosphate synthase 51 18 Op 5 . + CDS 60313 - 60585 347 ## gi|257451724|ref|ZP_05617023.1| YGGT family integral membrane protein 52 18 Op 6 . + CDS 60602 - 61543 1352 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 53 18 Op 7 . + CDS 61540 - 61800 409 ## gi|257451726|ref|ZP_05617025.1| hypothetical protein F3_01584 54 18 Op 8 . + CDS 61811 - 63172 1351 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 55 18 Op 9 . + CDS 63169 - 63330 166 ## CD0408 hypothetical protein 56 18 Op 10 . + CDS 63334 - 63615 176 ## SZO_12680 membrane protein + Term 63647 - 63689 2.3 + Prom 63620 - 63679 4.8 57 19 Op 1 . + CDS 63705 - 64475 768 ## SZO_12980 replication initiation protein 58 19 Op 2 . + CDS 64472 - 65323 482 ## COG1484 DNA replication protein 59 19 Op 3 . + CDS 65320 - 65805 329 ## SEQ_0732 hypothetical protein 60 19 Op 4 . + CDS 65802 - 66140 320 ## CD0412 putative conjugal transfer protein 61 19 Op 5 . + CDS 66191 - 68029 1800 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 68068 - 68106 2.2 + Prom 68041 - 68100 1.6 62 19 Op 6 . + CDS 68123 - 68320 86 ## SDEG_1344 hypothetical protein + Term 68328 - 68372 5.1 63 20 Tu 1 . + CDS 68397 - 68594 195 ## gi|257451736|ref|ZP_05617035.1| hypothetical protein F3_01634 + Prom 68830 - 68889 3.9 64 21 Tu 1 . + CDS 68957 - 70441 1056 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 70451 - 70510 -0.7 + Prom 70749 - 70808 10.9 65 22 Tu 1 . + CDS 70846 - 71898 512 ## gi|257451738|ref|ZP_05617037.1| hypothetical protein F3_01644 + Term 71908 - 71945 2.1 + Prom 72044 - 72103 11.9 66 23 Tu 1 . + CDS 72144 - 72401 304 ## gi|257451739|ref|ZP_05617038.1| hypothetical protein F3_01649 + Term 72465 - 72496 -0.5 67 24 Op 1 . - CDS 72415 - 72879 364 ## COG4824 Phage-related holin (Lysis protein) 68 24 Op 2 . 
- CDS 72893 - 73342 574 ## FN0636 hypothetical protein 69 24 Op 3 . - CDS 73352 - 73636 480 ## gi|257451742|ref|ZP_05617041.1| hypothetical protein F3_01664 70 24 Op 4 . - CDS 73651 - 74103 540 ## COG5632 N-acetylmuramoyl-L-alanine amidase 71 24 Op 5 . - CDS 74113 - 74775 622 ## gi|257451744|ref|ZP_05617043.1| hypothetical protein F3_01674 Predicted protein(s) >gi|224461496|gb|ACDD01000006.1| GENE 1 2 - 820 970 272 aa, chain + ## HITS:1 COG:ECs2221 KEGG:ns NR:ns ## COG: ECs2221 COG3328 # Protein_GI_number: 15831475 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 2 263 20 281 289 251 44.0 9e-67 CSESFISTVTDKVMDYIQDWKNRPLDEVYPVIFIDATHFSVREDNRIKKIAAYVVLGITK EGKKEVLSLEIGENESSKYWLGVLNVLKNRGVKDIMVICADGLTRIKEAIATAFPKTEYQ RCVVHQVRNTLKYVSYKDKKSFASDLKSIYLAVTESQALENLEKVKETWEEKYPNSMASW YQNWDVLTPIFKFSLEVRKVIYTTNAIESLNSTYKKLNRQRTVYPSDKSLLKVLYLSTLE ATKKWTQPLRNWGKVYGEFSIMYEGTQPMLNT >gi|224461496|gb|ACDD01000006.1| GENE 2 890 - 1012 129 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSKEKIISDSTLVAITLLLAESKPEEKEIMINVIMNFLE >gi|224461496|gb|ACDD01000006.1| GENE 3 1172 - 2827 1469 551 aa, chain + ## HITS:1 COG:no KEGG:lp_1931 NR:ns ## KEGG: lp_1931 # Name: not_defined # Def: hypothetical protein # Organism: L.plantarum # Pathway: not_defined # 1 551 1 557 559 264 28.0 6e-69 MKWEKLFKPHILERGYEYFRSHSIQNMEISSNRIRANVLGTEEYEVEILLSQDNITELYC SCPYAEEGKNCKHIAAVLYEWFDKKEKKDKKGSTLNKKKEEISNLLEKIDKQTINSFLSE VLMENEKLFLRFKNLLNENDTEEYLELYREEIEDIILEYTDEDNFINYYNVDSFVSELED IVYKDILPMTKDGNYKLAFDILHEMFISITKLDVDDSSGILSSLVDDIYDKWLKILSKVK PGEKRIIFKSLQSNLELPILDYMKEYIEKIIVKEFREKEYREVKLKWITKKIEECDKSEL EWIRNYKLGKWAIWYFHLLQEDKYKEEEFLAFCKNYWHNEAVRKYYIDFCIQQKDYQAAF QAIEESILLDADNSFLLSYYTIKKKEIFLLQGDQEAYVEQLWKLVMKYNPGNLEFFKELK QQYPTKEWLVQREKIFQRLSKDRHLAILYHEEKLYDRLLSIVVETQGIFLLGEYEKDLIS IFPKQVLQKYERELKEMASKTGNRKQYRELVSLLRKMKKIKGGNQVVENICMEWKIQYKN RPAMMGELEKL >gi|224461496|gb|ACDD01000006.1| GENE 4 3094 - 4731 2680 545 aa, 
chain - ## HITS:1 COG:FN2082 KEGG:ns NR:ns ## COG: FN2082 COG2759 # Protein_GI_number: 19705372 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Fusobacterium nucleatum # 3 545 2 544 544 850 81.0 0 MKTDIQIAQETQMLHINEIAKKIGLSEDDIEQYGKYKAKVDLDVLKRHKEKENGKLILVT AITPTPAGEGKSTVTIGLTQALNKIGKLSSAAIREPSLGPIFGMKGGAAGGGYAQVVPME DINLHFTGDMHAIGIAHNLISACIDNHINSGNQLGIDLTKITWKRVVDMNDRALRKVVIG LGGKANGVPRESSFQITVGSEIMAILCLSNNIKELKEKIGNIVFATSYSGQLLRVSDLHI EGAVAALLKDAIKPNLVQTLEHTPVFIHGGPFANIAHGCNSILATKMALKLTDYVVTEAG FAADLGAEKFLDIKCRMGGLTPNAVVLVATVRAIKHHGDGDLAKGMANLEKHLEIIQTYG LPAVVAINKFVTDTEEEIAYIEKFCNERGAEVSLCEVWAKGGEGGIDLANKVVKAIEEST KEYKPFYDINLSIQEKIEKICKGIYGADGVTFSAAAKKMLTLIEKEGYNHLPVCMSKTQK SISDNPNLLGRPTGFKVTINELRLAVGAGFIICMAGDIIDMPGLPKKPAAEVITISDEGI IDGLF >gi|224461496|gb|ACDD01000006.1| GENE 5 4737 - 5171 483 144 aa, chain - ## HITS:1 COG:FN0046 KEGG:ns NR:ns ## COG: FN0046 COG0757 # Protein_GI_number: 19703398 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Fusobacterium nucleatum # 1 140 1 144 147 160 56.0 1e-39 MKIMIIHGPNLNFLGIREKNIYGMEDYNSLCDYITSSFPEDEITCLQSNSEGRLIDFIQK AHLEKYDGIVINAGAYTHTSIALYDALKSISMVTVEVHISNIYAREEFRQHSYLASACLG QISGFGKEGYVYAIQKIKTYLGGV >gi|224461496|gb|ACDD01000006.1| GENE 6 5143 - 5946 746 267 aa, chain - ## HITS:1 COG:CAC0897_2 KEGG:ns NR:ns ## COG: CAC0897_2 COG0169 # Protein_GI_number: 15894184 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Clostridium acetobutylicum # 4 255 7 268 273 194 38.0 2e-49 MKNYALLGRKLSHSYSKIIHEYLFQKFSWDASYSFWEMEENLVSQALKISKEKKLSGFNI TVPYKESLFSQINILEDAAKNIGAINTIAIEKEQVIGYNTDCFGFQKMLEYFSIDVQNKK VIILGTGGASKAVAEALRQEGANTILFVSRSPKEGQLSYSDTFDGDIIINTTPVGMYPYV EKSPIHKKILSNFKIAIDLVYNPKETKFLLEAKELGLMTINGLFMLVAQAIRSEEIWNHK TFDISLYYEVYSFLEGIVYENHDNSRS >gi|224461496|gb|ACDD01000006.1| GENE 7 5943 - 6200 286 85 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|257451680|ref|ZP_05616979.1| ## NR: gi|257451680|ref|ZP_05616979.1| chorismate mutase [Fusobacterium sp. 3_1_5R] # 1 85 1 85 85 99 100.0 7e-20 MDKLEEYRKQMSEIDQKIASLFLTRMDLSIQIGNYKKEKNIPIYQEEREKIVLENIKKLT LKKKEQEYLEDFFLYLMSKSKEVQK >gi|224461496|gb|ACDD01000006.1| GENE 8 6193 - 7173 1041 326 aa, chain - ## HITS:1 COG:FN0934 KEGG:ns NR:ns ## COG: FN0934 COG0082 # Protein_GI_number: 19704269 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Fusobacterium nucleatum # 3 319 4 356 357 321 50.0 1e-87 MNWGKILQLSIFGESHGSTIGITIGGLLPGMKIPFVELQRDLSLRAPGQRLTSPRKEKDH FEIISGVFEGKTTGAPLTVIFPNLNTQSKDYEIHKKIPRPSHADYPAQIKYKGFQDLRGG GHFSGRLTAPLVFAGTFAKQYLKDRGIGISSTIVEKEDLEKKLPTLIQEGDSIGASISCK ITGVPVGIGNPFFDSLESSISHLAFSIPGVKGIEFGLGFDFIGKLGSEVNDEYNFIDGKV MTTTNYNGGILGGLSNGMPIEFRLVFKPTASIFKKQKSVDLEKQENTTLLIQGRHDPCIA LRAQIVVESIAALAILDQIWMGEYYG >gi|224461496|gb|ACDD01000006.1| GENE 9 7160 - 8386 860 408 aa, chain - ## HITS:1 COG:FN0933 KEGG:ns NR:ns ## COG: FN0933 COG0128 # Protein_GI_number: 19704268 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Fusobacterium nucleatum # 8 403 14 417 424 356 50.0 5e-98 MKLWSTHLKGKVKIPSSKSYCHRYIIAASFAKKESVLDNVSMSDDIKSTLEIVKKLGAKI EQKNQTFIIQKKSIYDKKEPLYFFCSESASTLRFLIPISITNPRKVFFYGKHNLPKRPLS PFFPILEASHVSFQTKGEKDLCIQLDGQLKSGKYEIAGNVSSQFITALLFALPLLEGDSE ISILGNLESRAYIEMTLDVLEKFQIQIFRTKNTFYIPGNQIYQSYSTSIEGDYSQAAFFL VANSLGNQIQIQGLSQESKQADYEILSMIKKLETKKEDEILVLDGSQCPDIVPILSLRAA LTPGKTMIQNIERLKIKECDRLHATAEILNQLGAKVIEHTTSLEFDGVSHLIGNSVSSFG DHRMAMMIAIASSCCQGEIILDDGNCVSKSYPNFWEDFKQLGGNYELG >gi|224461496|gb|ACDD01000006.1| GENE 10 8572 - 9297 241 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 237 4 238 242 97 32 2e-19 MKRLEGKIALVTGSARGIGRATVELLAAHGAAMVISCDMVETTFEQENIRHEILNVTDRE QIKGLVSKIEKEYGKIDILVNNAGITKDNIFLRMSEEQWDAVINVNLKGVFNVTQAVAKG 
MLKKGSGSIITLSSVVGIYGNIGQTNYSATKGGVISMTKTWAKELTRKGAQIRANCVAPG FIETPMTEALSGEVREQMANAVPLKRMGSVEDVANAILFLASDESAYITGQVIEVSGGLV V >gi|224461496|gb|ACDD01000006.1| GENE 11 9353 - 10561 1908 402 aa, chain + ## HITS:1 COG:FN0495 KEGG:ns NR:ns ## COG: FN0495 COG0183 # Protein_GI_number: 19703830 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Fusobacterium nucleatum # 1 402 1 402 402 626 81.0 1e-179 MSKVYIAAAKRTAIGSFLGSLSPLSASDMGAAVAKNILEETKIDPAKLDEVIMGNVLSAG QYQGVGRQTSVKAGIPYEVPGYSVNIICGSGLKSVILTYANIKSGVANLVLAGGTESMSG AGFVLPGQIRGGHKMADLTMKDHMICDALTDAFHKIHMGITAENIAEKYGITREEQDEFA LASQHKAIAAVDSGRFKDEIVPVTIKNKKGDIVVDTDEYPNRKTNLEKLAGLKPAFKKDG SVTAGNASGLNDGASIVLMASEEAVKENNLTPLVEIVGVGTGGVDPLIMGMGPVPAIRKA LKHANLTLKDMDLIELNEAFASQSLGVIKELINEHGVTKEWIAERTNVNGGAIALGHPVG ASGNRILVTLIHEMKKRGSEYGLASLCIGGGMGTAVIVKNVK >gi|224461496|gb|ACDD01000006.1| GENE 12 10647 - 11024 436 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451685|ref|ZP_05616984.1| ## NR: gi|257451685|ref|ZP_05616984.1| hypothetical protein F3_01379 [Fusobacterium sp. 
3_1_5R] # 6 125 1 120 120 137 100.0 2e-31 MKKYIMFLQYFLLGFFLYADSYYKEEGVLEDGTRYTKESWTSTKREKTKEKKVEVVKGKK EEIREVDLKFTEDFLKRERENQKKEKENYQNLWKNATKQENNLSESDLEDSVEYFDDFEV GEIEE >gi|224461496|gb|ACDD01000006.1| GENE 13 11021 - 12595 1882 524 aa, chain + ## HITS:1 COG:FN0904 KEGG:ns NR:ns ## COG: FN0904 COG2509 # Protein_GI_number: 19704239 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 1 522 1 524 527 679 65.0 0 MKIAIHNIVVSIKKNQDLEIQKELQKAGIQKENIKGLSYLKRSIDSRKKQDIKFVYSIEI ELKKEISSSSNAKWQEVKEIIPPKRFPLYPKREIYVVGSGPAGLFAAYRLAEYGYLPIVL ERGESIEERDKTTENFIKTSILNPNSNIQFGEGGAGTYSDGKLNTRVKSEYIENVFQLLV KFGAPEEILWNYKPHVGTDILKIVVKNLREAIIKMGGKFYFNTLLEDIKIQNGELQGFYI QKNGMKEYIAENQLVLAIGHSSRDTYRMLRKHGVAMEAKAFAMGTRMEHPRYEIDKMQYG KEVKNSLLEAATYAVTYNNQAEKRGTFSFCMCPGGVIVNAASQVGGTLVNGMSYSTRDGR FSNSAIVVGIKEHEFGEDIFSGMYFQEKLEKKAYDMIGSYGALYQNVWDFLSHKKTKHEI ETSYQMKKTSCQMEKLFPEVITENLRSALSYWKRNEEFISKNVNLIAPETRTSAPIKILR DVKGESLNVRGLYPIGEGAGYAGGITSAAVDGMKIVDCAFTRVL >gi|224461496|gb|ACDD01000006.1| GENE 14 12635 - 13816 1427 393 aa, chain + ## HITS:1 COG:CAC3299 KEGG:ns NR:ns ## COG: CAC3299 COG1979 # Protein_GI_number: 15896543 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Clostridium acetobutylicum # 1 390 1 386 389 428 53.0 1e-120 MENFNYYIPTKILFGKRKIESLGKEAAKYGKNILMVYGKGSIFKENCYGTSLYEQAKKSL EEANLTIFELPNIDPNPRIESVYAGAKLCREHSIDLVLAIGGGSTIDCAKGIAGQAKYEG DIWKCYETKDPSPIQEVLPIASVLTLSATGSEMNGSSVISNLSCNKKIGLTTSKFRPVFS ILDPSYTFTVNRKQTASGSVDIMSHIFEQYFTPDHGGYLQNRMMEGVLKTVIHYAPIALE EPDNYEARANLMWASTWALNDMFEKGKIPTDWATHQMEHELSAFYDITHGVGLGILTPYW MQYVLSNENQHRFVEYGKEVWNLTGTEEEIAKKSIEKTREFFTSLGIPAHLKEVGIGEEN LEVMAKQATQRRPLGAMKKLYAEDVLAIFKMAL >gi|224461496|gb|ACDD01000006.1| GENE 15 13835 - 14515 1048 226 aa, chain + ## HITS:1 COG:FN0866 KEGG:ns NR:ns ## COG: FN0866 COG0670 # Protein_GI_number: 19704201 # Func_class: R 
General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Fusobacterium nucleatum # 6 226 4 224 224 178 52.0 9e-45 MSGMYVDIQKSNSFLRKVFLYMIVGIVLSVVTPISLYFVAPKFLGLALQYYRVLVIVELI AVFTLSFRVYKMSSGTVKTLFVLYSMLNGLTLCTIGFLYDPMIVLYSFGITLSIFTVSAF YGFKTTEDLASYSRFFTIGLVSLILVSLVNLWLGVSSLYWMITVGGTVLFTGLIAYDVNR IRNMSFYLAEEDGEDVEKYAVMGALSLYLDFINLFLYILRFSGKKR >gi|224461496|gb|ACDD01000006.1| GENE 16 14538 - 15374 900 278 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451689|ref|ZP_05616988.1| ## NR: gi|257451689|ref|ZP_05616988.1| hypothetical protein F3_01399 [Fusobacterium sp. 3_1_5R] # 1 278 1 278 278 499 100.0 1e-140 MKRGLYALLFLLASSIGYGTSVFLQESWIEQRGKEQFRDTIIWRERSLEGQSGSNQYVEF ARTTATKDRKENNHWNGHSFFMGTNSNLEKNPNIYLGTSFGFFRGREKSHSYNWNEKTRT YGINAEMAHIKDKNLSLLGLGFTEMRHSPNTDKRYREKEIHIFGELGRLYSYDQIHYLYP FIAFSTQKIEGERVVPSSEVGFRYTRYWTEKLSSKFQTSYGREWTKRRREERYQNQFNFL MGLSYRYYEDLEIQLQYRGKMYKEAYQDFISLGFSHNF >gi|224461496|gb|ACDD01000006.1| GENE 17 15397 - 16188 765 263 aa, chain + ## HITS:1 COG:FN2029 KEGG:ns NR:ns ## COG: FN2029 COG1835 # Protein_GI_number: 19705320 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Fusobacterium nucleatum # 65 262 408 603 604 122 33.0 6e-28 MKKKVIALFGIALFSSGCTSLFWRVSEKEEKADVALLKFSEELELEENLRGQKASIETAT IVETNKQTEEVKLEKTLQTRKEITSLKTEENKQESKKENSKKIEVAVTQRKILFVGDSVM KGSEAQLRKIFPNAIVDSAVSRQFSALPDILHRVEKTQGIPDVVVVHLGSNGNIFEKHML ESMEILGNRKVFFINCKVERPWQESVNHFLKTQVAKYKNTKLVDWYSLAHDQNQYFAKDR IHPNQMGAKVYRAMILEKLEKEL >gi|224461496|gb|ACDD01000006.1| GENE 18 16190 - 18004 1409 604 aa, chain + ## HITS:1 COG:FN2029 KEGG:ns NR:ns ## COG: FN2029 COG1835 # Protein_GI_number: 19705320 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Fusobacterium nucleatum # 3 602 5 601 604 392 39.0 1e-109 MQRERNYGIDVLRGIALILIFTYHYYQFQGTYVGVIIFFALSGYLVTEGLFLEDFNYVSY LKKKFIKLYPLLLFIVALCTLGVFLLEKGLGNTYRYGALSVLFAGNNIYQAFSEISYFES 
HNDILPLVHTWALSLEIQFYIAYPLLLLACKKWKKNNRETAEIIFLLSSVSALCMFFHYL LGSDLSRIYYGTDTRLFTFLLAGACSSYMRTEKKWSKFIFYTLSIIGLIAIGLFSMYFRY DLEWNYLGAFYIISILTTIVTVSCYRFGFLNYKNPFSNLLQSLGIRGYSYYLWQYPIMIF ANEYFKWIKISYHWTVAIQVVILILISELTYRFIEKKNFSFLQVSIFFLLTIFLLIALPK PVRQESQVLEHKIEELANSNIRKEEPILEIQTEKPLLEQENKKEEEDYDSLELFLLGDEK EETAPSSKSMTKELTTVVEKPFHQVQKGVFRKPVTFIGDSVMKMCEMDIKKDFPNAYVDA AVSRQFFKLPGILEDAKKKGKLYPIVVIHLGSNGTIQKKSFDKMVQLLDGHQVFLLNCVV SKPWETEVNSLLEQEVAKYPNLHLINWYQYAKGQSSWFYKDATHPKPNGSKKYSHFILKN LENM >gi|224461496|gb|ACDD01000006.1| GENE 19 18050 - 18214 188 54 aa, chain - ## HITS:1 COG:no KEGG:FN1200 NR:ns ## KEGG: FN1200 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 54 208 259 259 68 57.0 7e-11 MLSYNLYKADNKELNLLSGIAYEKLKFKDSQKEMQNFMEHKIVPIYKVGLEYKF >gi|224461496|gb|ACDD01000006.1| GENE 20 18258 - 18632 571 124 aa, chain - ## HITS:1 COG:MA3316 KEGG:ns NR:ns ## COG: MA3316 COG3422 # Protein_GI_number: 20092130 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 58 119 12 73 77 68 53.0 3e-12 MGKFIVRETKTGIKFDLLAKNNEVIATSEVYKAKASCMNGIKSVMTNSAIATIEDQTKEN TPKEKNPKFEVYKDKAGEFRFRLKAKNGQIIATSEGYKAKASCMNGIESVKKNASVAPIE ELNK >gi|224461496|gb|ACDD01000006.1| GENE 21 18844 - 19383 777 179 aa, chain + ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 1 179 1 179 179 207 63.0 7e-54 MELKGTKTEQNLQTAFAGESMARNKYTYYASKAKKEGYVHIGKLFEETANNEKEHAKIWF KYLHGGAVPSTEQNLLDAAEGENYEWTDMYASFAETAKEEGFNELASLFTMVGKIEKTHE ERYRTLLENLKSGKVFSREENEEWECSNCGYIHYGPKAPGLCPVCKHPIDYFMLRPKNY >gi|224461496|gb|ACDD01000006.1| GENE 22 19562 - 21925 3527 787 aa, chain + ## HITS:1 COG:FN0857 KEGG:ns NR:ns ## COG: FN0857 COG0058 # Protein_GI_number: 19704192 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: 
Fusobacterium nucleatum # 1 787 1 787 789 1349 85.0 0 MLFEKEVWKEKLEQRILVKFGTSLEEASSFEIYQALGDTIMESIAKDWYDTKKKYEKKKQ AFYLSSEFLMGRAMGNNLINLGIQQEVIDFLKEIGIDYNQIEDEEEDAALGNGGLGRLAA CFMDSLATLNLPGQGYSIRYKNGIFNQYLRDGFQVEKPETWLRYGDVWSVVRPEDEVIVN FGNTSVRALPYDMPIIGYGTKNINTLRLWEAHAIQDLDLGVFNQQDYLHATQAKTLAEDI SRVLYPNDSTDEGKKLRLRQQYFFVSASLQDIMKKFKKVHGREFEKIPEYIAIQLNDTHP VIAIPELMRLLVDIEGVKWEDAWEIVKRTFSYTNHTILAEALEKWWIGLYQEVVPRIFQI TEGIHNQFRAELTQLYPNDAEKQNRMSIIQGNMIHMAWLAIYGSHKVNGVAELHTEILKE RELKDWYDLYPNKFLNKTNGITQRRWLLKSNPQLSAYITELIGDAWITDLSELKKLEQYL EDEVVLNKLLAIKQEKKEELVKYLRETQGVDINPKSIFDVQVKRMHEYKRQLLNILQVYD LYYYLKENPNVEFTPTTYIYGAKAAPGYKVAKGIIRLINDIAQIINGDNEVNDKLKVVFV ENYRVTVAEKLFPAADISEQISTAGKEASGTGNMKFMLNGALTIGTLDGANVEIAKEAGE ENEYIFGMKVEDIDALQKRGYDPRTPYNSVAGLKRVIDALIDGHLNDLGSGIYREIHSLL MERGDQYYVLEDFEDYRRKQRSINRDYRDQKAWARKMLKNIANAGKFSSDRTIMEYAKEI WGINEVR >gi|224461496|gb|ACDD01000006.1| GENE 23 21931 - 23763 2090 610 aa, chain + ## HITS:1 COG:FN0856 KEGG:ns NR:ns ## COG: FN0856 COG0296 # Protein_GI_number: 19704191 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Fusobacterium nucleatum # 1 609 4 611 611 965 76.0 0 MSGQTDRYLFHRGEHRQAYGYLGAHPSRTSTIFRVWAPNAKSVAVVGDFNSWVARAEDYC KKLNNEGIWEIEIPKLKKGFLYKYQIETVWGERILKADPYGFSSELRPNTASIVTGLPKF RWGDKRWLNKREVGYQRPVNIYEVHLGSWKKQEDGNFYNYREIAKLLVDYLTDMKYTHIE LMPLVEHPLDASWGYQGVGYYSVTSRYGSAEDFMYFVNYLHQHGIGVILDWVPGHFCKDA HGLYRFDGGTCYEYEDIVLGENEWGTANFNVARNEVRSFLVSNLYFWLKEFHIDGIRMDA ISNMLYYTRDNELHENQRSVEFLQFLNQTVHEEYPDVMLIAEDSSAWPLVTKYPMDGGLG FDGKWNMGWMNDILKYMEIDPFFRKNHHGKLTFSFMYAFSENFILALSHDEVVHGKKSIL NKMPGYYENKLNHVKTLYAYQMAHPGKKLNFMGNEFAQGLEWRFYEELEWKVLEENKGCQ SIQKYTRALNELYLKEKALWYDGQDGFEWIEHENIEENMLIFLRKTPDMKEVFIAVFNFS GKNQEKYKIGVPFAGNYECLLNSNETKFGGYDIGKKKTYQTIDSSWNYREQHIEVDIAGN TALFLKYRKK >gi|224461496|gb|ACDD01000006.1| GENE 24 23784 - 24929 1347 381 aa, chain + ## HITS:1 COG:FN0855 KEGG:ns NR:ns ## COG: FN0855 COG0448 # Protein_GI_number: 19704190 # Func_class: G Carbohydrate 
transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 1 378 3 380 384 666 84.0 0 MKKKRIIAMILAGGQGSRLKELTERIAKPAVSFGGKYRIIDFTLTNCSHSGIDTVGILTQ YEPHALNNHIGRGLPWDLDRMDGGVTVLQPHTKKNDENGWYKGTANAIYRNINFIEEYDP EYVLILSGDHIYKMDYDKMLKYHIKKEADATIGVFEVPLADAPSFGIMNTREDMTIYEFE EKPKEPKSTLASMGIYIFKWKVLKEYLEEDEKDPKSSNDFGKNIIPNMLQDGKKLVAYPF EGYWRDVGTIQSFWDAHMDLLEEGNELDLFDKSWRINTRQGIYTPSYVTPKAKVQNTLLD KGCLVEGEVKHSVIFSGVKIGKNSKIIDSILMADTEIGDNVIIQKAIIANDVKVLDNTVI GDGKEIVVIGEKRIVKSEPVK >gi|224461496|gb|ACDD01000006.1| GENE 25 24950 - 26113 1397 387 aa, chain + ## HITS:1 COG:FN0854 KEGG:ns NR:ns ## COG: FN0854 COG0448 # Protein_GI_number: 19704189 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 1 387 1 387 387 618 80.0 1e-177 MIKNYMAIIYLGQGNENISPLTKARSLASIPVGGSYRIIDFALSNVVNAGIRNVGLFCGN EEVNSLTDHIGNGFAWDLARKKDGIFIFKQMMDNHSRTGRARIHKNMEYFFRSSQEKVIV LNSHMVCNLDINDLIEKHEASGKEITMVYKKVKDAHEHFNHCSSVKIDENNRVVGIGQNL FFHEEENISLDAFVISKELVLKLLIDSIQDGNYNTLPELVAKKLASLNVNAYEFTGYLQC INSTREYFDFNMKILQREIREDVFGITSGRQILTKVKDTPPSLFKETANVENSLISNGCI IEGSVKNSILSRGAVIEKGVVLEDCVILQDCHIQKGAILKNVIVDKNNVIHEDEKLSASK EYPLVIEKSMNWDSKQYQNLMKYIKTK >gi|224461496|gb|ACDD01000006.1| GENE 26 26124 - 27533 1431 469 aa, chain + ## HITS:1 COG:FN0853 KEGG:ns NR:ns ## COG: FN0853 COG0297 # Protein_GI_number: 19704188 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Fusobacterium nucleatum # 1 458 1 459 461 714 75.0 0 MKVLFATAEAFPFVKTGGLGDVAYSLPKALQKEKIDVRVILPKYSKIKEEFLKQKRHLGH KEIWVAHHNEYVGIETVLYEDVTYYFIDNERYFKRNGIYGEFDDCERFLFFAKAVVETMD ITGFKPDIIHCNDWQTGLIPIYLKERGMQEIKTIFTIHNLRFQGFFFNNVIESLLEIDRY KYYHEDGIKYYDMISFLKAGVVYSDYITTVSESYAEEIKTPELGEGLHGLFQKLDYRLSG VVNGIDEKSYPIPKDSKENLKVKLQKKLGLKIEKDTPLIAMITRLDSQKGIDFVIEKMDE IMSMGVQFILLGTGENRYEDFFRWKESQYSGYLCSYIGFDSDLSLEIYQGADIFLMPSVY EPCGLSQMIAMRYGCIPVVRETGGLRDTVTPYNEYTGEGDGFGFRELNANDMMKTLHYAV 
QVYQRKQEWSVLIENAKARENSWKASAKKYEIIYQKVLGKFHDVEVIKK >gi|224461496|gb|ACDD01000006.1| GENE 27 27575 - 29494 1177 639 aa, chain - ## HITS:1 COG:FN0799 KEGG:ns NR:ns ## COG: FN0799 COG1523 # Protein_GI_number: 19704134 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Fusobacterium nucleatum # 22 639 22 645 645 831 64.0 0 MYQNIEKTTSFLNFWKINRPQFSLYAKEAKEVSLEFYKTVQDNIPYQIIRLNSQKHKLGD YWYYEDKAIQEGCLYRWNVDGVSILDPLALSYTGNMPVKEKKSIFLLHKQATSKKFSIKD KDRLIYEVHIGFFSKEQTYTTFIDKIPYLKELGINTVEFLPIYEWDDYTGNLHPDSTPIQ NAWGYNPINFFATTKKFSSSKEENSFSEVEEFRNLVEILHKNGIEVLLDVVYNHTAEGGK TGYLHHFKALGEDTFYIKNKDKDFSNFSGCGNSFHCNHTVTKEMIIESLLYWYLEMGVDG FRFDLAPVLGRDSHGQWLQRSLLYDLVEHPILSHATLISESWDLGGYFVGAMPSGWSEWN DSYRDTIRKFIRGDFGQIPDLIKRIFGSIDIFHANKKKYQATINFIACHDGFTMWDVLSY NRKYNFANGEKNQDGNNENYSYNHGEEGETKNPAILKLRIQQMKNMMLLLYISQGIPMLL MGDEIARTQLGNNNAYCQNNKITWMDWSRKDSFQDIFQFTKSMIQLRKTYSIFRKEEYLK MDEEIILHGVKLHQPDYSFHSLSIAFELWDQESDTQFYIALNSYSESLDFELPILKNKKE WYLLTDTSKVETCDFKAEEKITETNYSVISKSSIILVAK >gi|224461496|gb|ACDD01000006.1| GENE 28 29649 - 29966 612 105 aa, chain + ## HITS:1 COG:VC1919 KEGG:ns NR:ns ## COG: VC1919 COG0776 # Protein_GI_number: 15641921 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Vibrio cholerae # 1 91 1 90 90 60 43.0 1e-09 MNKKDFIALFAKNAELKTKTEAEKLVAAFLNTVEETLVAGDGVAFMGFGKFETLVREART CVNPRTKEKMNVAAKKVVRFKAGKALAEKVNVVEKKAKKGSKKSK >gi|224461496|gb|ACDD01000006.1| GENE 29 30129 - 30905 987 258 aa, chain + ## HITS:1 COG:FN1609 KEGG:ns NR:ns ## COG: FN1609 COG1692 # Protein_GI_number: 19704930 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 257 1 257 263 372 68.0 1e-103 MKILVVGDIVGRPGRKTLKSYLEKEKNQYDFIIVNGENAAAGFGITEKIAVEFLSWGIDI ITGGNHTWDKKEFYDFLRQSNRVIRPCNYPQGVPGVGYSILPSRNGKKVAVLSLQGRVFM 
PATDCPFQVAEKVMEEIRKETNIIIVDFHAEATSEKIALGWFLDGKVSAVYGTHTHIQTA DEKILPQGTSYITDVGMTGSENGVIGMKVECILPKFLTALPQRFEVAEGKEMLHGISLEI DEETGKTVKIDRIAWREE >gi|224461496|gb|ACDD01000006.1| GENE 30 30902 - 31831 1114 309 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237737638|ref|ZP_04568119.1| ribosomal protein L11 methyltransferase [Fusobacterium mortiferum ATCC 9817] # 1 309 1 309 309 433 67 1e-120 MKVMEVKVIFESDDIQKYQKQISDIFYDFGVTGLQIEEPLEKKNPLDYYKDESSFLMRNH AVSAYFPMNIYAKKRQETLLTVFEEKFGQDDEVVYTVDFYEHEEEDYQNSWKKYLYPEKI SAQFVVKPTWREYKAEEGEKVIELDPGRAFGTGSHPTTSLCVDLMEEGIQEGETVLDVGT GSGILMIVAEKLGAGFVCGVDIDELAVEVANENLELNKVSKEKYKVLHGNLIEKIEKQSY DVVVANILADVLLLLLKDISSVVKTGGKIIFSGIIEDKLEEVIRSVEMTGMRVEKVVAKG EWRALAIRA >gi|224461496|gb|ACDD01000006.1| GENE 31 31841 - 32497 795 218 aa, chain + ## HITS:1 COG:FN1607 KEGG:ns NR:ns ## COG: FN1607 COG0283 # Protein_GI_number: 19704928 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Fusobacterium nucleatum # 1 215 1 215 218 233 60.0 2e-61 MKEFIVALDGPAGSGKSTIAKRIAKQYHFTYVDTGAMYRMITWFFLENNVSWKEEIACQK ALEQVHLDMKNERFFVNGQDVSEAIRGLRVSSYVSEIAALKVVRNQLVHLQRKIAKGKEV ILDGRDIGTVVFPKANLKIFLLASAEERAKRRFLEYEEKGETISYEEVLKSIQERDYIDS TRKESPLRKAEDAIEIDSSTMTIEEVVAEVSKEIESKR >gi|224461496|gb|ACDD01000006.1| GENE 32 32507 - 33736 1163 409 aa, chain + ## HITS:1 COG:FN1606_1 KEGG:ns NR:ns ## COG: FN1606_1 COG1519 # Protein_GI_number: 19704927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Fusobacterium nucleatum # 1 400 1 403 426 362 51.0 1e-100 MYSLLHSFLVKMISLLGKEKQKDFIHKRIFQEYKALPKTIEIWIHASSVGEVNLLERFLL GCLEAFEGEILLTVFTDTGKEAALQKYGKYERVHILYFPLDDKVSIQKILTQISLKNLYI IETELWPNLIRFCKKEARVVVLNGRISNRSFGRYQKIKFLLTPLLQKIDYYYLQTEEDKK RYIALGAKEEYCNIVGNLKFDISMPSYSQEEKEAYRKELKLNTRKLWVAGSTRTGEYEIL LEAFQQLEDYTLVIVPRHLERVPEIESLLKEKKISYQKYTDEEKREDIAVLLVDKMGVLR KLYSIADVTFVGATLVNIGGHSLLEPLAYGKTPIFGPYTQNVKEIAKEILEKKIGYQVVD 
AKTMLEAIDMIEQQSQEVREKVECFLKENKEVGKKILEREAQWNTKKKK >gi|224461496|gb|ACDD01000006.1| GENE 33 33712 - 34416 648 234 aa, chain + ## HITS:1 COG:FN1606_2 KEGG:ns NR:ns ## COG: FN1606_2 COG0220 # Protein_GI_number: 19704927 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Fusobacterium nucleatum # 21 233 1 213 214 280 68.0 1e-75 MEHKEKEIEELWSYFFKKPRNNYNPYMLRLLDFPDYILFKKKMMDEYKGNWREFFGNENP IFLEIGTGSGNFTKEIAKRNPDQNFIGLELRFKRLCLAASKCQKENLENVVFLRRRGEEL LEFLGKDELSGLYINFPDPWEGNEKNRMIQEKLFLALDSILKVGGILFFKTDHDQYYQDV LDLVKNLENYQVIYHTADLHQSEKAENNIKTEFEHLFLHKHNKNINYIEIQKVK >gi|224461496|gb|ACDD01000006.1| GENE 34 34428 - 35705 1933 425 aa, chain + ## HITS:1 COG:FN1605 KEGG:ns NR:ns ## COG: FN1605 COG0104 # Protein_GI_number: 19704926 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Fusobacterium nucleatum # 1 425 1 425 425 713 82.0 0 MAGYVVVGTQWGDEGKGKIIDVLADRADYVVRFQGGNNAGHTVVVNGEKFILKLLPSGVL HGGTCIIGPGVVVDPKVLLDELASLETRGAKTDHVIISDRSQVIMPYHVKLDELREAKED GLKIGTTKKGIGPCYEDKISRYGIRMADLLDMAQFEEKLKRNVEMKNEIFTKIYGVEPLD YDKILADYKGYIEKIKHRIKDTIPMVNKALDENKLVLFEGAQAMMLDINYGTYPYVTSSS PTTGGVTTGAGVSPRKIDKGIGVMKAYTTRVGEGPFVTELLGEFGEKVRKIGGEYGAVTG RPRRCGWLDLVVGRYATMINGLTDIVITKIDVLSGLGKLKICTAYEIDGEIYESMPANTS LLYRAKPIYEELDGWDEDITKIEKYEDLPENCKKYLKRIEEIVNCKISVVSVGPDRSQNI HIHEI >gi|224461496|gb|ACDD01000006.1| GENE 35 35925 - 39647 3552 1240 aa, chain + ## HITS:1 COG:no KEGG:CHY_2654 NR:ns ## KEGG: CHY_2654 # Name: not_defined # Def: hypothetical protein # Organism: C.hydrogenoformans # Pathway: not_defined # 6 1059 3 1053 1187 574 35.0 1e-162 MKNYLIQNTFQKDITRKINGVIKADSKEEDIVVTELSEYVVTEEIQKYLNIFFHRYVDSL QSPTEDIGVWISGFFGSGKSHFLKMIGRILENKEYQGKKVVDFFEEKIQDSILQENINKA AGAPTDVILFNIDNVSDQNTQQNKDTIPIAFLKKFNEYLGFSRDNIKIADFERMLWEKGK FEVFQKTFSEISGKIWKEEYRNLDFYADDFLDTIEKLEIMSREAAIRWLEKETDISISAE SLADLLEEYLKQQSEGHRIVFLVDEIGQYIGENSQLMLNLQTLVETLGVKFKGRIWVGVT 
SQQDLGNILGKAKEKSNDFSKIQDRFKTILPLSSANIDEVIKKRLLEKKLLQQEELESLY DKTRIYIENAITFDKTGMTLALFKDRKDFAETYPFVGYQFNLLQKVFEKVRNMGYSGQHM SRGERSLLSSFQEAGIRIKDMEVGSLVPFQYFYASIEQFLEDNVRRPFIHAKNEKGIDEF GLEVLKLLFLLKGMKGITPNINNLTSFMVDSVDCDRLALEEKIKKALGKLEQQVLIQKDG EGYYFLTNEEQDINKEINQEFINEQDVYKELNTYIFGQIFTSPIVTMEDTKNKYNFSQKI DEIPFGKIGGQLDILILTPRSDEYDNIAILGYRDGYDLILRLPEEMTYWEEIQQSLRIAS YVQKTLRGNNREIVHQILQRKQQENSKRKQRIQNEVERVLSEAEIYIQGQKINIMTAQAD KRVAEALKAVARHRFNKAKLVQRPYSESEIRNVLSYEYDAGTNLFDVKKDMSTNSNYAAL QEILTYITLHTSRGNVITLKNIVEYFSKKPYGWETFSIHGLVAELWIYKQIQIEESKNPI HNAEELKTLLVKSQSKTEERLVISLREEIDTELLQKVNNLLKKLFGPEKEISVDQPKEDI LEILKKKIDLPKMYSKECESGVYPGKKELQEWIDLLEEILLSKEKTEKLLKHFLEMEEEL LEAYDNQGIVLDFFRSSKRKKFDEGLEKRKKIKEYEAVIGSIQEMKPYQDLVSILTSKSP YNQIKNIDALLEEIQVEEEKLILEEKNMLLQRIEEKAKEFSTLFQERDAMLEKTLEKLKE FKNKIEKTSSLSIFFSMSDWNKVLSSIETEYRKHIQEELRSLEKAAFEVTEDKTGVKELV EDIRNTFHAYQEEAKSKDVNQLEQLIQKAKKDEEGFILQGNGRAKKKERVQFRKIRVVEK SNIETMEAVEEYISTLEKEIQNLKEKMLQAIRENKIVDIQ >gi|224461496|gb|ACDD01000006.1| GENE 36 39662 - 40990 1308 442 aa, chain + ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 23 442 32 460 463 166 30.0 1e-40 MEDQRMEYKAEIPKKGNLLKAEIVSFLNSEGGTILLGADDQGRVLEEKIKDYKRWEEILS NWIFNAFYPEVIDLIQVFPNEIPFRILIKKGKEKPYFYKEGEGFHAKGVYIRVGSSKRVA SFQEIQRMIFATKSHEYESLPCEKTNLTFQYLENKLEEKGITFDSYGLSLLDKEGKYNNA ALLLSDQNPTISKFAVFQGLDVSIFLDKKEFRGSILKQLDDILYFSNLSNKKKIIITGKA QREEHLDIPEKALREAIVNCYCHRDWTLSGDIKIEFYDDRVMIFSPGSLPDGLTLENIKQ GMVAKRNRILVSALDKADMIENYASGIRRIFQDYEDHEKQPNFYISENGVIVTLYNRNYD TQNDTQNDTQNDTQNDTFKLKPVQRRDKIIEFFLLNNEITAEYLAEKAGVSLSTIRRDLA QLRKQGKIEYIGSAKEGKWKIK >gi|224461496|gb|ACDD01000006.1| GENE 37 41009 - 44623 3534 1204 aa, chain + ## HITS:1 COG:STM4495 KEGG:ns NR:ns ## COG: STM4495 COG1002 # Protein_GI_number: 16767739 # Func_class: V 
Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Salmonella typhimurium LT2 # 1 1204 1 1214 1225 681 34.0 0 MNKSNLKKFAIGARQELREKTKAQLKRLGMEEKKIEEGKDMGSQVEIYGKLYSKSSYQHL LVKYHSLGYEELVEESAYLWFNRLTALAYMELHDCFSEHMIFSKGSKGEPSILDEYFQAD FFQKMPLEKQEELHQLRDQNTSDSLETLYSILMEEKCEELSKIMPFLFSKKGKYADILFP SGLLMQDSVLKKLQAILLEIQEEDQSIPVEILGWLYQYYNSEKKDEVFAGLKKNQKVTKE NIPAATQLFTPHWIVKYMVENSLGKLILEYIPNMESIKKDWNYYIETEIESSSEKLSIEG IKILDPAMGSGHMLTYAFDILFDVYQELGWSKKESVLSILQNNLYGLEIDDRAGQIAAFA LLMKGKEKFPRLFQVLEREENFEMPVISLQESNAISKRMYTMLEESPTLQDLLKGFENAK EYGSILKIDSFAESILQEEYQKLQEKIQNQGQFSLLKNNEFLEGDLEEDLERLERIIRQY KIMIQKYDVVITNPPYMGGRGFSPKLKNYTEKNYKDSKSDLFSVFMEVCQGFTKKNRYTA MITMQSWMFLSSFETLRNHMITKTEIQTLNHLGTRAFSELAGEVVQTVSWVQKHKTPEKK GTYIRLVDYNNGEEKEREFFNKENYYQANQKDFTKIPGSPIAYWVSDRIREIFEKEKKLG EVGEAKQGLATADNNRFVRLWHEINFHTIGFGMKNSEEALNSKKKWFPYNKGGEKRKWYG NQEYVVNWEKDGYEIKNFKNSVVRNPSYYFKKSISWSDITSSGNSFRLYPEGFIYDVTGM SYFIEDKFLTYLGIFNTKIYSKLTKLINPTLHLQIGDMIKMPYFQIQNPLFEQLVSLILW ISFEEWASRETSWDFERLTLLNGENLSKAYKKYCTYWESKFFSVHSSEEDLNRILLESYS LQEEMDEKVGFSDITLLKKEASIVENIDNATSCGYLENRGAKLEFHSLELVKQFLSYAIG CIMGRYSLDKSGLIMANSDDVLSMSSNKITVSGVNGEIRHEILNPSFFPEEFGILSVTTE ERFENDIVSRVIAFISAAYGKEHLEENLEFITEILGKKAGESHEEVLRNYFIKDFYTDHC QRYQKRPIYWMLHSGKKNGFSALIYLHRYEENTIARVRSDYLLPYQDFMEQQEAHYSKIA SDEVSTPTEKKDAQKKVKELHDILKELKDYANKVKHIAEQRISLDLDDGVKVNYEKLGSI LKKI >gi|224461496|gb|ACDD01000006.1| GENE 38 44677 - 46686 1424 669 aa, chain + ## HITS:1 COG:Z5943m_1 KEGG:ns NR:ns ## COG: Z5943m_1 COG1479 # Protein_GI_number: 15804980 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 EDL933 # 1 555 1 556 592 328 36.0 3e-89 MKASEKKIKDLFSEAKTFFAIPVYQRDYNWQEKHCKQLLEDILNVGKDEKLGSHFIGSIV YIHEGVYGIGEKEFYVIDGQQRMITITLLYIALYHKLKLEEKEYADEIYELYLVNKFSKR DIKLKLLPPEENLNILNKILEEKFKELEKYQDRNIVKNYSYFKEIISHYSRDGIENLLVG LDKIIYVDIALENGKDDPQKIFESLNSTGLDLSQADLIRNYILMDLDREKQNRIYKEFWI 
PIECNCKISTGSEVKPYVSDFIRDYLTLKNGKIPSKPKIFEEFKKFYSSNFDEKLKEIKS FSEEYQRILKPEVEQDKELKEELRNLKILDQTIINTFLIGILRDHTESKITKEELLNILK LLQSYLWRRFITEKPSNSLNKIFQGMYQKISKDKNYYNILEENLLNQDFPTNEELKESLK TKNMYKDKEKLRYVFKKIENYKHNETIDFSNDKITIEHIFPQKPDKHWKELYSDDELSEM KIFKDTISNLTLTGSNSNLGNKVFQEKRDDIEHGYKNSKLYLNKYLGELGEWNLSSMEER FENLFENIIQIWKRPDKITEFEKITFVLKGPDVSGVGKLLPAENFEILKGSIIMREQKGN GKTAQRNERIIRELLNNGIIEKNGDKYIFKNNYKSSSPSAAASLILGRSANGWKEWKTYD GKLLDEYRK >gi|224461496|gb|ACDD01000006.1| GENE 39 46798 - 49308 2034 836 aa, chain + ## HITS:1 COG:no KEGG:Npun_F5663 NR:ns ## KEGG: Npun_F5663 # Name: not_defined # Def: alkaline phosphatase domain-containing protein # Organism: N.punctiforme # Pathway: not_defined # 1 836 1 855 855 313 28.0 3e-83 MEKEKIKEMLEERFQVIPVYPRKRHILFWYDEQKAFQDIIEELQLDKVKILRLQKGVNRN GEEIDTNLFKIKYTLEILDTDSHYLIYSANPRPCKHENYLLDIESYSDFFEADKSSMLVD DFGFDRKNSEIHNTIQEYSDFFASKERREKLKKLLENPFHTSEQELKLSILAVIVGAKTA DFQEILKNIIFDSGKLEQIKKWMGLSFLYEQILRKFHLSIESWEEFLKTLVVVHFYREIQ QKPSIHLESYYLGKTNELFIFTSSLFQNQNDSERIQEIFYEVGKELNMKGRIEELEFEKM LLGNSFEYFEKRILQELVERLQLGVVDYSQYISWIQARSHCSLWKEKYFYFYETLLAIAC LLEEKEQLEIKNRKTLSEILEDYTAHYYKIDQAYRNFYYSYDQIKSSELASMFDSFHSKI SHFYEVEYLEILLALWNEHLEERNSLVQQKDFYEKYIRRSDTRVAVIISDALRYEVGEEI TRQLEKEGNAKEIKIFPLLTNLPSITSVGMTQLLPPGKERNIDLFHNKMTVNGISTFSTE NREKILKESCLESSAISYDIFKNKNRSEQEAYVKGKKVLYFYHDSIDAIGDKAKTEHRTF DACKMAIQDILSLTKILSSLGVVNIFITSDHGFLYERKEIEEYNKLELQEEYFLLGKRYA FSTKKVKEKGCITISVGDFFGIFPEKNQRIKVGGSGLQFVHGGSSPQEMIIPCIHYRGGT NVKKAKKVGVCIKETGEKITSHFIKFAVYQLEPVNSQEKLIERTIVAALYDGNVRVSNEF KILLCAETENTEYPFSLTLSGEHEKVILRIIDVESKDILEQKEYQVKIGIASEFDF >gi|224461496|gb|ACDD01000006.1| GENE 40 49333 - 49923 514 196 aa, chain + ## HITS:1 COG:no KEGG:Swol_2496 NR:ns ## KEGG: Swol_2496 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 5 192 9 193 199 74 26.0 2e-12 MKYRAITGENFYVQELKNACQFLLQHPEVENRKQAFQELDILNCISESNFNKKFQAIQKR LQVFTPKLMELFLKVDYSTVKFINLYAILSKERFFAEFMDEVLKEKYSLLDFELTNYDFY 
EFMERKREQSEIIQQWSNAGCNKMIHKLKNFLIEANFLEKSLSGELYLIKRPLVDMAVIE EIEKHGNVKILRLMLY >gi|224461496|gb|ACDD01000006.1| GENE 41 49944 - 50501 466 185 aa, chain + ## HITS:1 COG:no KEGG:Ppro_2290 NR:ns ## KEGG: Ppro_2290 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicus # Pathway: not_defined # 7 185 8 189 191 139 37.0 4e-32 MKELDIRFQNLIQKVKSDEFYYNKGLANEVPFYIFDYNPKEELEVREYIHSLFLPEIQKE DRLEVVEIDMFELLLESMKNDNILEKAFLMEEKKGTRFLYEKLKKSFNVEVILRYISEKS EGKNFIILTGVGRVFPIVRTHTILNNLQNILGDTKVLLFFPGEYTSTDLRLFGFEDNNYY RAFRI >gi|224461496|gb|ACDD01000006.1| GENE 42 50542 - 50607 66 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLMRRLIFPINTIFLFFWEL >gi|224461496|gb|ACDD01000006.1| GENE 43 50620 - 51183 631 187 aa, chain - ## HITS:1 COG:FN1085 KEGG:ns NR:ns ## COG: FN1085 COG0693 # Protein_GI_number: 19704420 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Fusobacterium nucleatum # 3 182 2 181 182 211 55.0 9e-55 MKKVFVLLANGFELIEAMTPVDVLRRCGAEVTTVSTEEDLWVESSNSVIIKADKYWEEVN FEEGDILILPGGYPGYVRLRENRLVVSQVEKYLTTGKYVAAICGAPSLFSEHKLALQYKL TGHSSIQEDLKQNHIYTGETTTVDRNLITGIGAGHSLDFSFEIAALLFEKEVIEKVKEGM EIDLEKR >gi|224461496|gb|ACDD01000006.1| GENE 44 51202 - 51450 361 82 aa, chain - ## HITS:1 COG:no KEGG:FN1084 NR:ns ## KEGG: FN1084 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 79 4 82 85 111 73.0 7e-24 MFEDWQENLYDSTFDSIFNALVAEYKEGKLDVEELKVNIAEQQQILLNAFTEGEAKSTYC NAMIDAHQFVLSLITTGKIANY >gi|224461496|gb|ACDD01000006.1| GENE 45 51672 - 53921 2099 749 aa, chain + ## HITS:1 COG:FN1704_1 KEGG:ns NR:ns ## COG: FN1704_1 COG1752 # Protein_GI_number: 19705025 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Fusobacterium nucleatum # 7 366 2 374 375 273 44.0 9e-73 MKYIEQKKVFFLLSYCLFSLSTFSISQEEEKEIKNIKEQIAILQKRLETLEEKKAIENSA IEKRKIGLVLSGGGAKGYAHLSLLRLLEKQHIQIDYITGTSIGAFIGTLYSIGYSVDEIE 
ACLNSLNYDSLIKNNAYKRNPHDILTVNYDKQLNFSYPKGLASNEFLYLALKDILKSVEG IRDFNTLPIPLRIIATDLNTGKAKAFHEGDLAQVLTASMAVPTLLEPVKIGDTSYVDGLI SRNFPVQDVIEMGANFVIGSDVGNELKDNSDYNILSVLNQLIAIQSSSSHEEQKELVDIL IQPKIQKYSALDIQKREIFLKLGEEAVQENKEALLSCIKKESKTPKKILSTPAPIFFEKL VLSDNFQGKVRMVIEEFLSDIIGKELKEEELRDKILRVYRLPFISKVYYKKRGNELFLDG EVIPENTLGIGFHYQKDYGTTFRLGTNLHHIGKFGNTTNINAKIGDYLGLDIYSLFHYGI SDEVGLFSRLSYDERPFYLYERNRRLASFKKKIVKGELGIFTRYRDDLFLSAGLSTNYAK LNLESGDTDYRYFEYSKNFNNAFLRFKFDNVRNRKSGLKAEAEYKFSASSVKKNSNVYGP SYQLDGYFPISPKLTGTYHLSGGIMIGNRIPIDQYFKIGGLQNNMELNEFSFYGYRPHQK IADKFMIGNLGLQYEILSNIYWTGNWNMMAYHSPIETLEKTEKKWRRYIHGFASSLMYDS PLGPIELSISRNNQEKEFLTTFSIGYYFY >gi|224461496|gb|ACDD01000006.1| GENE 46 53918 - 54715 978 265 aa, chain - ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 16 256 8 248 254 219 54.0 4e-57 MSSMLHWVGDMSPEAFFILSALCFLAAFIDSIAGGGGMISLPAFMAVGLPPHIALGTNKI SAAIGTLASSLNFLRSNKIILPLVTRFAPLALFGAIFGVKTALLIPPKYFQPISFFLLIC VFIYTLINKNLGEEYDYQGINSVNIKWGCIFSLLIGFYDGFLGPGTGSFLIFMLIKIFHL DFAHATATTKFINLASNIISAALYFHAGKLNLPLGLIMAVIMIIGAFAGSSLAIYKGSKF IKPVFLVVTITLIGKMGYSFLSSLF >gi|224461496|gb|ACDD01000006.1| GENE 47 54803 - 56206 357 467 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 160 461 29 344 345 142 32 7e-33 MAWIKKGSYKAVLCFPTFVIKYPRVNQNTFKNMRSILTEQYGYLTCGKRAREFLLPIYFI PGIPILFQSRCRVYEEENLEDEKKQKKFMELIAKKNFWRFEVFTDMENIINLGEYKGKIY KHDYDYLSYDWKREWEYFKIKWRYRGKNMEEFVLQNEKLRVTLLSYGAIIQKIEMPDKNG NWQNIVLGFEKKEEYIEKNIPYFGAIVGRTAGRTKNGILKIGEKEYLLDKNANGKHSIHG GRYNLSQKYWKGEQKENRVLFSVESPHLENGYPGNANIQVEYSLEGDTLHLCYKAFSDED TYFNLTNHSYFSLSGNPEEEIGEQYLTLQAKEYVEVDEDTIATQISSVENTVFDFRIRKQ LKEIFQSQEKQVKIVGGGLDHPFLTQYAKLEDETSGRCLEVKTDNHAMVLYTANWLHEIG RKNHTGIALEAQDLPCLSELKEKEYKVGREYERKTSFRFYVDFQKNK >gi|224461496|gb|ACDD01000006.1| GENE 48 56154 - 58385 194 743 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|152975021|ref|YP_001374538.1| 30S ribosomal protein S1 [Bacillus cereus subsp. cytotoxis NVH 391-98] # 554 743 151 353 382 79 29 5e-14 MKEKQVSDFTLIFKKISDIIWGKKEENIEKKKIGGTMFDEKTLEMELAGRTLKVSTGKIA RQSCGAVMIQYGDTVLLSTVNRSKEPRKGADFFPLTVDYIEKFYAAGKFPGGFNKRESRP STDATLVARLIDRPIRPMFPEGFTYDVHIVNTVFSFDEQNTPDYLGIIGSSLALSISDIP FLGPVAGVTVGYIDGEFILNPSPEQLEKSLLDLSVAGTKDAVNMVEAGAKELDEETMLKA ILFAHDNIKKICAFQEEFVKLCGKEKITFEKEEVDTVISSFIEENGHERLQAAVLTLGKK NREEAVDGLEEELLSAFIAKQYPDVAEEEIPEEPILAFKKYYHDLMKKLVREAILYKKHR VDGRTTTEIRPLDAQINVLPIPHGSALFTRGETQSLATATLGTKDDEQLVDNLEKEYYKK FYLHYNFPPYSVGETGRMGAPGRRELGHGSLAERALRYVIPTEEEFPYTIRVVSDITESN GSSSQASICGGSLALMSAGVPIKEHVAGIAMGLIKEGEEFTVLTDIMGLEDHLGDMDFKV AGTKSGITALQMDIKITGITEEIMRIALNQAHVARQQILEVMNTAISSPADLKPNVPRIQ QITIPKDKIAILIGPGGKNIKGIIEETGSTIDITDDGKVSIFSKDLEVLENTLRLVNNYV KDVELNEVYEGKVVGIQKFGAFMEILPGKEGLLHISEISKERVANVEDVIKMGDVFKVKV ISLDNGKIALSKKKLELENTVAE >gi|224461496|gb|ACDD01000006.1| GENE 49 58402 - 59736 1029 444 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase [Desulfotomaculum acetoxidans DSM 771] # 1 441 19 462 462 400 44 1e-111 MNFALISLGCSKNLVDSENLTGILVNRKGFQLTNEIEEADLVLINTCGFIGDAKKESIET ILEVAEYKQERLKKIVVCGCLAQRYAEELLQEIPEIDAVIGTGEIDKIESVVDEILQDKK AVETSSFHFLPNADTDRVLTTPPHTAYLKISEGCNRRCTYCIIPQLRGDLRSRTKEDILE EAKRLVSGGVRELNLLAQETTEYGIDNYGKKALPDLLRELVKIEGLDWIRTYYMFPRSIT DELIEVMKQEEKICKYFDIPIQHISSNMLRRMGRAITGEQTKELLYKIRKEIPEAVFRTS LIVGFPGETEEEFQELKDFVEEFQFDYIGVFQYSREEDTVAYTMENQIPEEVKERRQAEI INLQNEIAESKNRKLLGREVEVLIDGISSESEYMLEGRLKTQALDIDGKVLTSEGTAQVG EMVRIMLEQNFEYDFIGRIVQNEK >gi|224461496|gb|ACDD01000006.1| GENE 50 59749 - 60303 720 184 aa, chain + ## HITS:1 COG:FN1709 KEGG:ns NR:ns ## COG: FN1709 COG0558 # Protein_GI_number: 19705030 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Fusobacterium nucleatum # 1 182 1 184 187 243 78.0 1e-64 
MNLPNQLTTARFILAIPFIYFLQTSDSHGFWYRMIALVIFSVASLTDFFDGYIARKYNLI TDFGKIMDPLADKILVISALVLFVDLNYMPAWMSIVVLAREFLISGIRILAAAKGEVIAA GNLGKYKTTSQMIVVIIALFIGKLSIYILGDYYTICEILMIIPVILTIWSGAEYTAKAKH YFIG >gi|224461496|gb|ACDD01000006.1| GENE 51 60313 - 60585 347 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451724|ref|ZP_05617023.1| ## NR: gi|257451724|ref|ZP_05617023.1| YGGT family integral membrane protein [Fusobacterium sp. 3_1_5R] # 1 90 1 90 90 121 100.0 2e-26 MYTILIIVNKLVEVFNILLLIRVVLSWLPMGQNALTRAVYSVTEPILEPIRRTTYPLLGN IPLDISPIIAYFLMQLIRNIVFRIVQVLYF >gi|224461496|gb|ACDD01000006.1| GENE 52 60602 - 61543 1352 313 aa, chain + ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 1 312 1 312 314 456 74.0 1e-128 MQEIGNEYHIPVLYEETLDQLVWNPDGIYIDCTLGGGSHSEGILKRLSEKGRLISIDQDA NAIAFCKKRLEKYGKQWSVFQNNFENIDIVSYLAGVDKVDGILMDIGVSSTQLDDGERGF SYRYDAKLDMRMNKEQDLSAYEVVNTYAEQDLVRILFEYGEERHAKKIASFICENRKEKP IETTGELVAIIKRAYSERASKHPAKKTFQAIRIEVNRELEVLEKAIKKSVDLLKPKGHLA IITFHSLEDRLVKTVFKDLATACKCPPELPVCVCGGKAKVKILTKKPIIPSEDELGKNNR AHSSKLRVVERLA >gi|224461496|gb|ACDD01000006.1| GENE 53 61540 - 61800 409 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451726|ref|ZP_05617025.1| ## NR: gi|257451726|ref|ZP_05617025.1| hypothetical protein F3_01584 [Fusobacterium sp. 
3_1_5R] # 1 86 1 86 86 140 100.0 4e-32 MKKAFILVGVIVGIIWGIHGYFLMQVMSLEQELHDKKTELDNNIKLLNRKVMEYDKKLDL AAIKKNMEENRGMLMAEEIKYFEVSE >gi|224461496|gb|ACDD01000006.1| GENE 54 61811 - 63172 1351 453 aa, chain + ## HITS:1 COG:FN1713 KEGG:ns NR:ns ## COG: FN1713 COG2265 # Protein_GI_number: 19705034 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Fusobacterium nucleatum # 1 450 13 462 464 519 60.0 1e-147 MVKLSQKIELTIDKIVFGGEGLGYFQEFAIFVPMATIGDVVEAEVISVKKHYARALISKI IKVGKDRVEGNRISFEEFQGCDFAMAKYEAQLQYKTAMVKEVMERIGKLNSNLVLDCIAS PEEKHYRNKVIEPFSKHKGKIITGFFQRRSHEVFEVEENMLNSKLGNKIIETFKQYANQE KLSVYDEKKHQGLLRNIMIRTNSSQEAMLVLIVNAKKIEESLKKILLQMPKNIPELKSIY LSLNTRKTNVVLGDKNICIWGEKTLKEELFGIHFHISPSSFFQINVPQTKHLYEKALSLI PKIENKNVVDAYSGTGTIGMLLSRKAKKVYAIEIVESASRDGAKTAKENHIDNIEFICGP VEVELDRLLEEGKNLDAIVFDPPRKGIEESILRKVAEVGIPEMVYISCNPSTLARDLKIM AECGYQVGEIQPFDMFPQTSHVECIALIQRVKS >gi|224461496|gb|ACDD01000006.1| GENE 55 63169 - 63330 166 53 aa, chain + ## HITS:1 COG:no KEGG:CD0408 NR:ns ## KEGG: CD0408 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 52 1 52 52 87 92.0 1e-16 MKILHFKQFYKHYVFVEDGEGGRKKVLKNYIDVNVCIDMVCGDTKNALESEDY >gi|224461496|gb|ACDD01000006.1| GENE 56 63334 - 63615 176 93 aa, chain + ## HITS:1 COG:no KEGG:SZO_12680 NR:ns ## KEGG: SZO_12680 # Name: not_defined # Def: membrane protein # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 92 1 92 93 107 93.0 1e-22 MRWIIKIILFPISLALSILTAFLTFLLGIGTALLYLLMMFCIFGAIASFLQKEVTIGIEA LILGFLLSPYGIPMVGATVIAFLQGINEAIKST >gi|224461496|gb|ACDD01000006.1| GENE 57 63705 - 64475 768 256 aa, chain + ## HITS:1 COG:no KEGG:SZO_12980 NR:ns ## KEGG: SZO_12980 # Name: not_defined # Def: replication initiation protein # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 256 26 284 284 425 89.0 1e-118 MNFDYFYNREAERFNFLKVPEILVDGEEFKGLSAEAIILYSMLLKRTGMSFKNNWVDKEG 
RVFIYFTVEEIMRRRNISKPTAIKTLDELDSKKGIGLIERVRLGLGKPNVIYVKDFMSII AVKENDFQKSKNLTSEVKDFNLRSKENELQEVQNVDSNYIENNKSKYSKREYSFGKSGLG TFQNVFLKDEDIGELQIKMAGELDNYIERLSTYLQSTGKTYKDHKATILSWFYKDQGSKK KTNIPTWEEYNKGVHL >gi|224461496|gb|ACDD01000006.1| GENE 58 64472 - 65323 482 283 aa, chain + ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 41 276 36 277 282 108 31.0 1e-23 MIKELEKVMIEDVEYSFDPEKEYIKDGHAYCKVCHERKDGKALEFFGKQMIFKTACKCDR DREAKEKERQKQLEIERLKSICFTSMIQWAYTFENYEGKENQSLIIAKNFVKDYELMRKE NIGLLFYGSVGSGKTYLACSIANALIEQYQISVKIRNFAQIINELQKGGFDLDKNAYIES LVNTSVLILDDLGIERDTSYAKEQVYNIVNNRYLKHKPTIFTTNLSYSQIENCTESVEYQ RIYSRIIEMCIPVMVLGEDYRKVIQEEKLKRNKERLLTGGERT >gi|224461496|gb|ACDD01000006.1| GENE 59 65320 - 65805 329 161 aa, chain + ## HITS:1 COG:no KEGG:SEQ_0732 NR:ns ## KEGG: SEQ_0732 # Name: not_defined # Def: hypothetical protein # Organism: S.equi_equi # Pathway: not_defined # 1 161 1 161 161 156 91.0 2e-37 MINEEIARKTLNMEVKAGKVTAKLLLTLLKKLMKEAEKLGGLEKLVSSKGNEVKLKDMVK KGQLEEIPVEEAELKELKKELNRYGVKFSVMKDKESGKYSVFFQAKDMKVMDKAFKHALS ESEKKTERKESIHKNIEKFKEMAKNTVSKDKVKNKQKEQIL >gi|224461496|gb|ACDD01000006.1| GENE 60 65802 - 66140 320 112 aa, chain + ## HITS:1 COG:no KEGG:CD0412 NR:ns ## KEGG: CD0412 # Name: not_defined # Def: putative conjugal transfer protein # Organism: C.difficile # Pathway: not_defined # 1 110 1 110 594 204 87.0 5e-52 MIDKILKDIKGLFKVQDKAKFLKQNIPYLAFFYFGNIFSHHVRAYTGGDVIDKIFQGILE LNTMSFIPSVHPIDVLMGVGVAALIKFIVYTKGKNAKKFRQGKEYGSARWVA >gi|224461496|gb|ACDD01000006.1| GENE 61 66191 - 68029 1800 612 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 1 304 1 300 301 239 44.0 1e-62 
MMEYATKITALYSRLSVGDEDRDGGESNSIVNQKAFLERYARQHKLMNIRHYIDDDESGR FFDRSAYTQMISDVENGKIGIVIMKDMTRWGRDYLQVGNAMEIFRRNNVRFIAINNGIDS KDQNTLEFAPFINIMSEWYARDISKKVTTGIKTKGASGKPVATEAPYGYIKDSNNKDYWI VDKEAAKVVKLIFTLFMEGKNRNQIAVYLKEKEILTPTFYMKQQDRGTAKNKKLNEENRY NWNKATLTRILKRQEYCGDVVNFKTEKHYKDKRNHYVDKDKWQIIPDVHEPIIDRITFEN VQRILKNAPVKRPNGDGEIHALSGLMYCKDCGTKMHIRTIHKSGKVQHVTYCSEYAKGKA KHPKCNSPHRIDVDDVMENLTEVLRKIAQYSLANKEEFEVLVKSSLAKEQTEEIKKQQNR IPQITDRMEQIERVMNKLYEDNALGNIEVKRYELLSRKYAEEYYTLKAEQEEIEERLSEF ENANQRAKNFINLAESYSDFEELTPTAINEFISKIVVHERDVKRAKYAVQRIEVYFNYIG KFENELTKEIEPTEQEMIQMREEIEEAKKEKSRAYHRAYSKEYRAKNLEKFREYERIKAR EYRARKKLQATT >gi|224461496|gb|ACDD01000006.1| GENE 62 68123 - 68320 86 65 aa, chain + ## HITS:1 COG:no KEGG:SDEG_1344 NR:ns ## KEGG: SDEG_1344 # Name: not_defined # Def: hypothetical protein # Organism: S.dysgalactiae # Pathway: not_defined # 1 65 1 65 65 70 87.0 2e-11 MTEKQEVQNDEIQTVKEQTIPQKPNNIMIKKIRGKTFVTEIYFDPNSKDTFQDKLLKVVK SERKE >gi|224461496|gb|ACDD01000006.1| GENE 63 68397 - 68594 195 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451736|ref|ZP_05617035.1| ## NR: gi|257451736|ref|ZP_05617035.1| hypothetical protein F3_01634 [Fusobacterium sp. 
3_1_5R] # 1 65 1 65 65 69 100.0 6e-11 MKNIDEKILMAEEEIKQLQNKRKKLISQQKQEERKREIKGYMKKEQSLKVSLPKARVLPK MNSIS >gi|224461496|gb|ACDD01000006.1| GENE 64 68957 - 70441 1056 494 aa, chain + ## HITS:1 COG:mll0964 KEGG:ns NR:ns ## COG: mll0964 COG0507 # Protein_GI_number: 13471082 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 1 258 1 239 1015 171 38.0 2e-42 MAIYHLCIKIISRGKGKSAVAASAYRSGEKIKNEYDGIVHDFTRKGGIAHTEILLPQNAP QEFSDRGTLWNSVEKIEKSKNSQLAREIEIALPKELDREKQIELVREYVKENFVKVGMCA DIALHDKNEGNPHAHILLTMRPFNEDKTWGAKSKKEYILDENGEKVKLKSGNYKTKKINT VDWNEQEKAEEWRKAWADITNKYLEENNIQEKVDHRSYQRQGIEQIPTIHLGISASQMEK KGIVTNRGNINREIRHQNMILREISRRIKALMRWIRSLTTNKNKDNLIENQEDISLQSTT PLKQNDLTDILSNLIKENADNSNINLEKYIESYQFLKEKNITSIPELKECISALRDKNYK TTRAIKDTEKKTNDKVELIDQSEKYLKHRDTYKTYVKLKKSKQEDFYNEHTAEIILFESA RKYLKEHLGESKTLAISKWKLEATALKKEKDTLYSQILDMRKEVEQAESVRNCIEKLLQG NRELTQVKRNELDL >gi|224461496|gb|ACDD01000006.1| GENE 65 70846 - 71898 512 350 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451738|ref|ZP_05617037.1| ## NR: gi|257451738|ref|ZP_05617037.1| hypothetical protein F3_01644 [Fusobacterium sp. 3_1_5R] # 1 350 1 350 350 623 100.0 1e-177 MNYQGLDSKNLVLNLLGQDANIQVNKKILLTLGLEEAFYLSYLINQYKYFLAEGSLREDD SFYASNSDISLFTTLNNSQIQRWKKSLIEKEFFKISIEGIPSVTYYYLNFEKILQIIASE KTNLELAYKNVYQNETINFEISSQESIFLLEKLTYKELRFFCKENKIKYSGNDTKRHLIY KIVENKNPSLLSTPYFFTVDDISSTSGSEIRPLSKMGGSEEPSGSKTRPLVDAKSIPNLE QINQEQKKNHDHEEHELDFHFEKIFHELGVHYTKTNRESIEKLLQKMKPYEVERYIRELH KNLKENPDVKDVNALFSVKIQRQECQINTPKKCTRKLILRSFVTHYENFI >gi|224461496|gb|ACDD01000006.1| GENE 66 72144 - 72401 304 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451739|ref|ZP_05617038.1| ## NR: gi|257451739|ref|ZP_05617038.1| hypothetical protein F3_01649 [Fusobacterium sp. 
3_1_5R] # 1 85 1 85 85 95 100.0 1e-18 MTEKELMEKLEAEEIATEDLLTEEFMSKFTQFHNYDELEEELASRCKSKNADKDKALKNV LLEKTDFKDLEDMKKKAIQFYLSKH >gi|224461496|gb|ACDD01000006.1| GENE 67 72415 - 72879 364 154 aa, chain - ## HITS:1 COG:lin0175 KEGG:ns NR:ns ## COG: lin0175 COG4824 # Protein_GI_number: 16799252 # Func_class: R General function prediction only # Function: Phage-related holin (Lysis protein) # Organism: Listeria innocua # 20 149 18 138 140 58 31.0 6e-09 MEKEIGIGKMLISALSYILFLLGGWNWTLGAMFIFMVSDYATGYIRSCLKGQLSSKVGYK GLLKKCSYIFIVLIGAALDRVLEENNIQIPVSFFGAPVSFKVLLICSVIGTEGISIVENF AEMGIKFPFTIRKLFKQLQQDDPSKNTYDEKKEP >gi|224461496|gb|ACDD01000006.1| GENE 68 72893 - 73342 574 149 aa, chain - ## HITS:1 COG:no KEGG:FN0636 NR:ns ## KEGG: FN0636 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 117 1 117 119 103 47.0 2e-21 MELNPLLTEPIGERKWILREEYRYEVNGYVITVPKGFQTDLASVPRVLWVFFPPFGKYTR AAIIHDYLYSELNDTGINRYWADKIFYHIMQELGVAGYKRASMYRAVRLFGEPAWKKKLK NEGYTEQAIVDHTEEALKYNKKIKEMLKL >gi|224461496|gb|ACDD01000006.1| GENE 69 73352 - 73636 480 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451742|ref|ZP_05617041.1| ## NR: gi|257451742|ref|ZP_05617041.1| hypothetical protein F3_01664 [Fusobacterium sp. 
3_1_5R] # 1 94 1 94 94 110 100.0 3e-23 MDKQVLLAVGEAVVAVVVYGVLLYKQKGKEAVMQEAIKAEATIRGKGLGTMKKKAVQEFV EKLPAHVRIFINETTIEAVVKELQPIFEKMKRNI >gi|224461496|gb|ACDD01000006.1| GENE 70 73651 - 74103 540 150 aa, chain - ## HITS:1 COG:lin0128 KEGG:ns NR:ns ## COG: lin0128 COG5632 # Protein_GI_number: 16799205 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 37 143 45 139 289 61 37.0 4e-10 MGFRFSQNSLDKMSRVHPNLVAFMKELIQVSPFDFKITSGMRTAKEQANLYQQGRIKPGL VVTNADGYKYCSNHQEKVDGYGYAVDVCILKYTSKGDIDWNFKYYKELYEIAKKNSLLEK YGIEWAGNWKKFQEGAHYQLKNARNVAFKK >gi|224461496|gb|ACDD01000006.1| GENE 71 74113 - 74775 622 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451744|ref|ZP_05617043.1| ## NR: gi|257451744|ref|ZP_05617043.1| hypothetical protein F3_01674 [Fusobacterium sp. 3_1_5R] # 1 220 9 228 228 375 100.0 1e-103 MLIVHFYNNTEKVYSVYANSLEEVKENPKAYFPEVTENTIITLEDFKHPMLEKGILREMT REELVEKEIPISLEEGEKIENKKLIKIEKPSKYHSWNGKEWIADLGKAKKEKRDELKGIR LQKIEENIEVHGAIFQVRNSDKEHFDDVGLMIRTGEIDGSYPKNWVLADNSVKQFTAQQI IDVWKERTKRKDRIFQEFGVLSIKLEKCDSVEKVQKITWE Prediction of potential genes in microbial genomes Time: Fri May 20 01:38:29 2011 Seq name: gi|224461495|gb|ACDD01000007.1| Fusobacterium sp. 3_1_5R cont1.7, whole genome shotgun sequence Length of sequence - 51670 bp Number of predicted genes - 78, with homology - 72 Number of transcription units - 14, operones - 11 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 709 494 ## Sterm_2818 hypothetical protein 2 1 Op 2 . - CDS 719 - 1246 615 ## gi|257451746|ref|ZP_05617045.1| hypothetical protein F3_01686 3 1 Op 3 . - CDS 1247 - 2392 1214 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 4 1 Op 4 . - CDS 2389 - 2727 336 ## gi|257451748|ref|ZP_05617047.1| hypothetical protein F3_01696 5 1 Op 5 . - CDS 2736 - 3383 722 ## plu2878 hypothetical protein 6 1 Op 6 . - CDS 3380 - 4243 935 ## Apre_0852 hypothetical protein 7 1 Op 7 . 
- CDS 4243 - 4563 394 ## gi|257451751|ref|ZP_05617050.1| hypothetical protein F3_01711 8 1 Op 8 . - CDS 4579 - 5160 582 ## gi|257451752|ref|ZP_05617051.1| hypothetical protein F3_01716 9 1 Op 9 . - CDS 5162 - 6859 1873 ## Amet_2425 hypothetical protein 10 1 Op 10 . - CDS 6849 - 6965 120 ## 11 1 Op 11 . - CDS 7006 - 7455 478 ## gi|257451755|ref|ZP_05617054.1| hypothetical protein F3_01731 12 1 Op 12 . - CDS 7418 - 7645 230 ## gi|257451756|ref|ZP_05617055.1| hypothetical protein F3_01736 13 1 Op 13 . - CDS 7655 - 8077 580 ## gi|257451757|ref|ZP_05617056.1| hypothetical protein F3_01741 14 1 Op 14 . - CDS 8091 - 9080 1260 ## Amet_0426 hypothetical protein 15 1 Op 15 . - CDS 9065 - 9547 424 ## gi|257451759|ref|ZP_05617058.1| hypothetical protein F3_01751 16 1 Op 16 . - CDS 9576 - 9914 403 ## gi|257451760|ref|ZP_05617059.1| hypothetical protein F3_01756 17 1 Op 17 . - CDS 9915 - 10373 379 ## BAV1322 phage protein 18 1 Op 18 . - CDS 10385 - 10783 490 ## gi|257451762|ref|ZP_05617061.1| hypothetical protein F3_01766 19 1 Op 19 . - CDS 10780 - 11019 442 ## gi|257451763|ref|ZP_05617062.1| hypothetical protein F3_01771 20 1 Op 20 . - CDS 11031 - 12146 1522 ## BcerKBAB4_5338 hypothetical protein 21 1 Op 21 . - CDS 12150 - 12764 912 ## gi|257451765|ref|ZP_05617064.1| hypothetical protein F3_01781 22 1 Op 22 . - CDS 12776 - 13483 713 ## gi|257451766|ref|ZP_05617065.1| head morphogenesis protein, putative 23 1 Op 23 . - CDS 13554 - 15047 1460 ## gi|257451767|ref|ZP_05617066.1| prophage LambdaCh01, portal protein - Prom 15095 - 15154 2.6 24 2 Op 1 . - CDS 15161 - 16468 1004 ## COG1783 Phage terminase large subunit 25 2 Op 2 . - CDS 16419 - 16829 499 ## gi|257451769|ref|ZP_05617068.1| hypothetical protein F3_01801 - Prom 16895 - 16954 2.5 - Term 16915 - 16954 3.6 26 3 Op 1 . - CDS 16956 - 17423 157 ## gi|257451770|ref|ZP_05617069.1| hypothetical protein F3_01806 27 3 Op 2 . - CDS 17434 - 17616 288 ## gi|257451771|ref|ZP_05617070.1| hypothetical protein F3_01811 28 3 Op 3 . 
- CDS 17663 - 17974 541 ## gi|257451772|ref|ZP_05617071.1| hypothetical protein F3_01816 - Term 17987 - 18019 3.2 29 3 Op 4 . - CDS 18024 - 18251 367 ## gi|257451773|ref|ZP_05617072.1| hypothetical protein F3_01821 - Prom 18279 - 18338 3.7 30 4 Tu 1 . - CDS 18429 - 18560 63 ## - Prom 18667 - 18726 9.4 31 5 Op 1 . - CDS 18738 - 18974 234 ## gi|257451774|ref|ZP_05617073.1| hypothetical protein F3_01826 32 5 Op 2 . - CDS 18988 - 19791 861 ## gi|257451775|ref|ZP_05617074.1| hypothetical protein F3_01831 33 5 Op 3 . - CDS 19846 - 20658 412 ## gi|257451776|ref|ZP_05617075.1| hypothetical protein F3_01836 - Prom 20691 - 20750 5.4 - Term 20698 - 20755 14.3 34 6 Op 1 . - CDS 20772 - 20924 267 ## 35 6 Op 2 . - CDS 20999 - 21304 69 ## gi|257451778|ref|ZP_05617077.1| hypothetical protein F3_01846 36 6 Op 3 . - CDS 21270 - 21875 390 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 37 6 Op 4 . - CDS 21853 - 22152 311 ## gi|257451780|ref|ZP_05617079.1| hypothetical protein F3_01856 38 6 Op 5 . - CDS 22149 - 22355 324 ## gi|257451781|ref|ZP_05617080.1| hypothetical protein F3_01861 39 6 Op 6 . - CDS 22357 - 22941 688 ## Sterm_0834 hypothetical protein 40 6 Op 7 . - CDS 22938 - 23291 470 ## gi|257451783|ref|ZP_05617082.1| hypothetical protein F3_01871 41 6 Op 8 . - CDS 23288 - 23446 166 ## gi|257451784|ref|ZP_05617083.1| hypothetical protein F3_01876 - Prom 23582 - 23641 2.3 - Term 23578 - 23608 1.3 42 7 Op 1 . - CDS 23676 - 24323 832 ## Fjoh_2073 hypothetical protein 43 7 Op 2 . - CDS 24320 - 24562 329 ## gi|257451786|ref|ZP_05617085.1| hypothetical protein F3_01886 44 7 Op 3 . - CDS 24609 - 24842 314 ## gi|257451787|ref|ZP_05617086.1| hypothetical protein F3_01891 45 7 Op 4 . - CDS 24854 - 25045 378 ## gi|257451788|ref|ZP_05617087.1| hypothetical protein F3_01896 46 7 Op 5 6/0.000 - CDS 25056 - 26027 1191 ## COG2842 Uncharacterized ATPase, putative transposase 47 7 Op 6 . 
- CDS 26024 - 27907 1656 ## COG2801 Transposase and inactivated derivatives 48 7 Op 7 . - CDS 27920 - 28330 503 ## gi|257451791|ref|ZP_05617090.1| hypothetical protein F3_01911 49 7 Op 8 . - CDS 28333 - 28434 143 ## - Term 28445 - 28488 8.7 50 8 Op 1 . - CDS 28521 - 29300 636 ## gi|257451793|ref|ZP_05617092.1| XerD related protein 51 8 Op 2 . - CDS 29281 - 29898 843 ## gi|257451794|ref|ZP_05617093.1| hypothetical protein F3_01926 52 8 Op 3 . - CDS 29938 - 30216 331 ## gi|257451795|ref|ZP_05617094.1| hypothetical protein F3_01931 53 8 Op 4 . - CDS 30246 - 30419 196 ## gi|257451796|ref|ZP_05617095.1| hypothetical protein F3_01936 - Prom 30520 - 30579 4.9 + Prom 30770 - 30829 14.3 54 9 Tu 1 . + CDS 30865 - 31491 552 ## COG2932 Predicted transcriptional regulator + Term 31502 - 31541 10.0 + Prom 31817 - 31876 8.5 55 10 Op 1 . + CDS 32068 - 32217 148 ## gi|257453085|ref|ZP_05618384.1| hypothetical protein F3_08501 + Prom 32220 - 32279 4.6 56 10 Op 2 . + CDS 32301 - 32714 400 ## gi|257451799|ref|ZP_05617098.1| hypothetical protein F3_01951 + Prom 32720 - 32779 3.1 57 11 Op 1 . + CDS 32817 - 32906 171 ## 58 11 Op 2 . + CDS 32943 - 33629 537 ## gi|257451800|ref|ZP_05617099.1| hypothetical protein F3_01956 59 11 Op 3 . + CDS 33671 - 33736 64 ## 60 11 Op 4 . + CDS 33757 - 36708 3917 ## gi|257451801|ref|ZP_05617100.1| hypothetical protein F3_01961 + Term 36731 - 36759 -1.0 + Prom 36711 - 36770 5.4 61 12 Op 1 . + CDS 36791 - 37108 391 ## gi|257451802|ref|ZP_05617101.1| hypothetical protein F3_01966 62 12 Op 2 4/0.000 + CDS 37161 - 37856 765 ## COG3701 Type IV secretory pathway, TrbF components 63 12 Op 3 11/0.000 + CDS 37861 - 38661 863 ## COG3504 Type IV secretory pathway, VirB9 components 64 12 Op 4 . + CDS 38676 - 39863 1305 ## COG2948 Type IV secretory pathway, VirB10 components 65 12 Op 5 . + CDS 39878 - 40354 587 ## gi|257451806|ref|ZP_05617105.1| hypothetical protein F3_01986 66 12 Op 6 . 
+ CDS 40348 - 42402 2279 ## COG3505 Type IV secretory pathway, VirD4 components 67 12 Op 7 . + CDS 42413 - 42685 402 ## gi|257451808|ref|ZP_05617107.1| hypothetical protein F3_01996 68 12 Op 8 . + CDS 42682 - 43638 1305 ## COG4962 Flp pilus assembly protein, ATPase CpaF 69 12 Op 9 . + CDS 43680 - 43937 386 ## gi|257451810|ref|ZP_05617109.1| hypothetical protein F3_02006 70 12 Op 10 . + CDS 43954 - 44238 354 ## COG0776 Bacterial nucleoid DNA-binding protein 71 12 Op 11 . + CDS 44223 - 46679 2831 ## COG3451 Type IV secretory pathway, VirB4 components 72 12 Op 12 . + CDS 46759 - 47613 1086 ## gi|257451813|ref|ZP_05617112.1| hypothetical protein F3_02021 73 12 Op 13 . + CDS 47626 - 48081 431 ## gi|257451814|ref|ZP_05617113.1| hypothetical protein F3_02026 74 12 Op 14 . + CDS 48110 - 49147 1256 ## COG3846 Type IV secretory pathway, TrbL components 75 12 Op 15 . + CDS 49221 - 49505 489 ## gi|257451816|ref|ZP_05617115.1| hypothetical protein F3_02036 + Term 49511 - 49555 10.3 + Prom 49522 - 49581 7.9 76 13 Tu 1 . + CDS 49731 - 50840 1063 ## gi|257451817|ref|ZP_05617116.1| hypothetical protein F3_02041 + Term 50921 - 50969 3.2 + Prom 50941 - 51000 11.3 77 14 Op 1 . + CDS 51022 - 51204 201 ## gi|257451818|ref|ZP_05617117.1| hypothetical protein F3_02046 + Prom 51236 - 51295 5.2 78 14 Op 2 . 
+ CDS 51329 - 51553 393 ## gi|257451819|ref|ZP_05617118.1| hypothetical protein F3_02051 Predicted protein(s) >gi|224461495|gb|ACDD01000007.1| GENE 1 1 - 709 494 236 aa, chain - ## HITS:1 COG:no KEGG:Sterm_2818 NR:ns ## KEGG: Sterm_2818 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 155 236 163 243 349 66 43.0 1e-09 MANWIQDPQSREEVDEVTQELKLPVYKPSAKGKFRQWFKESWNKLEDYLVNLKSEISEIR GSKEPNISKKSGFNLEKTNLTENDSSKLFTAKGALDLYNKLTSAIAEKEPKISKKSGFNL SKSDADDSDSSSTLATSKAVKKVKDALERLNLSWNNITGKPNFGLKSGEFMEGHRLAESL GVKEYGGLISSSGQKKEGNAYYDSNTKNMFYCKETNSYTSANSSYFEPFDNKELLN >gi|224461495|gb|ACDD01000007.1| GENE 2 719 - 1246 615 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451746|ref|ZP_05617045.1| ## NR: gi|257451746|ref|ZP_05617045.1| hypothetical protein F3_01686 [Fusobacterium sp. 3_1_5R] # 1 175 1 175 175 330 100.0 2e-89 MGFNLARIPHIYHDTKYVRKLFEILKQKHINVVEMWQELSYFNDLAKSKGHMLDVLGGNF KIARLGRTDEEYRKILKFEIPTFNFLGSPYEIRRILAECYDIPIESFILTELSGKIVIKI PSTINKQEVFKNIKRLKAAGVGLQVDIDIYIEDYLISELEKMTLTEIEKITLARE >gi|224461495|gb|ACDD01000007.1| GENE 3 1247 - 2392 1214 381 aa, chain - ## HITS:1 COG:lin1710 KEGG:ns NR:ns ## COG: lin1710 COG3299 # Protein_GI_number: 16800778 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Listeria innocua # 132 380 131 382 383 82 28.0 2e-15 MMITEKGFVVPTLEEIYQRKLAEFKTVKPNIRETDSNVIIPLLKFDAAEEYDAYLEGLSV YNNLNVYTAVGSGLNAITSHLNMTWLEATRAKSRIQITVSTETTIPQAWGVETLDGKKFV TLNAEDLKIPKGKTELDVISLNVGKENNVNVGQITKMTSIISGITSMTNTLPAVGGKDKE TDTELRERYLKRIDRKSSFTTEGIKNYILENTNVQKCQVIENDTDLTDSDGRLPHAYEAV CLGDTNENILQALYDYKLAGIRTVGDITKKFDDITVGFSRAIEKQIYVNISIAAIRDLWL QEYVEKIKKIVQDYIDTIEPQGTIYLYKILGEIYKATGGIKTIQIRLGDSSYSMSTSDYV LKKKEIAVVQVENITVSAEVS >gi|224461495|gb|ACDD01000007.1| GENE 4 2389 - 2727 336 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451748|ref|ZP_05617047.1| ## NR: gi|257451748|ref|ZP_05617047.1| hypothetical protein F3_01696 [Fusobacterium sp. 
3_1_5R] # 1 112 1 112 112 185 100.0 6e-46 MATSIKLDKDCDIVFNEDGVCELVESTEDIIQAIRVELEQNKEQWALNTLYGVPYLNEKN TGILQVKNNHSRILQELIKTISKYEIDKIESIEFENNEIVAKIKIKGDVYTL >gi|224461495|gb|ACDD01000007.1| GENE 5 2736 - 3383 722 215 aa, chain - ## HITS:1 COG:no KEGG:plu2878 NR:ns ## KEGG: plu2878 # Name: not_defined # Def: hypothetical protein # Organism: P.luminescens # Pathway: not_defined # 15 200 21 210 223 77 33.0 3e-13 MIEFVQAMIQDANNEIHTSLPAVITEVNHAAGTCTVQIIPKRELCGQVMSYPPLIDVKLD FLKFGGWKFQFPRKAGDKVWIGFSETTLSEDTSLERFSLNEPYIIGSCENDYEANSEDII LEGQGTRVEIKGDGSIIITTGSKEMTINSNLTLNGDLTHNGNTTQTGNTEQTGDVSVQGS VGASVDVNGGGISLKRHTHGYKPGGELPEQTDPAS >gi|224461495|gb|ACDD01000007.1| GENE 6 3380 - 4243 935 287 aa, chain - ## HITS:1 COG:no KEGG:Apre_0852 NR:ns ## KEGG: Apre_0852 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 5 265 3 265 282 87 27.0 8e-16 MAKLWKQVRIITVGGLIFNYEDLDVEFDVKCTDDNKSDTATIRIYNLSETTKNKLQANQA VTIDAGYQELHGVIFAGIVESVSTNRSENDMVTTITASPNNRAYTNTPVNMQFKAGIKAS EILKQLEKQVPFKIDVKELGKDTTYPNGKAFSNRLSNVISVLAKDTGTIARFTDSTIEFK KPGKAYSNVLKLGSEQGLVRVDKKEEKAEAEKEKKDSKKSKKENNKKKEKEKPKYSIEAF LIPIVKIGQLIEVESTLWSGKGVVKECSYVAGDVSSFSVNAMLEVIE >gi|224461495|gb|ACDD01000007.1| GENE 7 4243 - 4563 394 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451751|ref|ZP_05617050.1| ## NR: gi|257451751|ref|ZP_05617050.1| hypothetical protein F3_01711 [Fusobacterium sp. 3_1_5R] # 1 106 1 106 106 204 100.0 2e-51 MKALEIDVTGIEEYGIIADIGNDLKLDMIYNNIDHHMYVSVLDGAENRITGFFRLVPDVD FLSLAWNTLPYQLRCIKINDYAEEKDLITPSNLNQDYKFFLIGEEE >gi|224461495|gb|ACDD01000007.1| GENE 8 4579 - 5160 582 193 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451752|ref|ZP_05617051.1| ## NR: gi|257451752|ref|ZP_05617051.1| hypothetical protein F3_01716 [Fusobacterium sp. 
3_1_5R] # 1 193 1 193 193 343 100.0 4e-93 MSLWGQLQQEATSLVKSFLGIKEKSLLGGIPLHVISDKSRSISATVTNRRVEKGFNISDT VRKEPLIFQLTVVDNSKDYMLNRQSLEKMLEAGEPLEFYYAGRDLYQNIVIENIEELEQA DRKNCFTYYITLRQISVAEIKATDSKVDYKKAGSTGGKKKRTAAAVKSPTSSESTKIGAK QKERKKTGLKNIF >gi|224461495|gb|ACDD01000007.1| GENE 9 5162 - 6859 1873 565 aa, chain - ## HITS:1 COG:no KEGG:Amet_2425 NR:ns ## KEGG: Amet_2425 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 137 303 1 167 479 89 33.0 3e-16 MAVRKLSIDIMSYLKGKGFEAVDSQIKKVKSSLSSLKSITDNGLFKMAAGYFTVNTLIAQ YNKAIEASNFQIEQETKLYSTLKGQNFRDEQIESIKSYASELQKVGVVGDEVTLAGAQQL ATYNLTEESLKKLMPAMQDVIVQQKGLKGTGQDAVGVANMLAKGLLGQTGILQKAGITLT AYQEKMIKTGKQEEKVAALVEAVKMNVGEQNAEFLKTPEGKILSAQNRIGDIYEYVGGLM RETRGEFWSMMADNTEWIQGFLGGLVKTGTGIADTVITTVSGIFDTFKAMPQEARDVIKL LTGFFLIRKFPIAGAFLIIEDIFGAFQGKESFTEEAMNAIFAFTGADYKFEDLRKEINDF WHDLISPSDEATEKIGFLTATLENFFEIMRGGVGITEMVFGALRTGWDVVKLTGKIMVDP DSIHDHLEDFSNGGIQNIKHGWGTLNYAADNMTNTQHRYQAGLQEKRQKEFQEAASLLNT ARLPKNDIQKVAQMMEKPSVRLKKENQVPPNYTDKSKQVFNIYEATDAKKVAEQIEQKIK QNDKEKEQKWKAQVGGNFSLAGLEA >gi|224461495|gb|ACDD01000007.1| GENE 10 6849 - 6965 120 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALSGYTKNFEAVENYTVTQLRRYFERLIDYLEERYGS >gi|224461495|gb|ACDD01000007.1| GENE 11 7006 - 7455 478 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451755|ref|ZP_05617054.1| ## NR: gi|257451755|ref|ZP_05617054.1| hypothetical protein F3_01731 [Fusobacterium sp. 3_1_5R] # 1 149 1 149 149 265 100.0 1e-69 MQKRETLEVNGHKITLVEQPTQYILDLEKRFEDKELVGYCKEILKYPAGENPDMTEFLNI PDTIKYKDLELSLKNKDGEKDLYLAQELFVALGKNKTNTAYVAEVFLQKLGKNVNDFKYK ELVDMGAEVFKQVGEMIYLIKIRDTFRSL >gi|224461495|gb|ACDD01000007.1| GENE 12 7418 - 7645 230 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451756|ref|ZP_05617055.1| ## NR: gi|257451756|ref|ZP_05617055.1| hypothetical protein F3_01736 [Fusobacterium sp. 
3_1_5R] # 1 75 1 75 75 99 100.0 6e-20 MLKYFIVIFSFYYALCQALGQFQYFVIPHVTWEKVEKKKKYKHKDDIQEKLGKLGKLKKD RRNDAKARNIRSKRA >gi|224461495|gb|ACDD01000007.1| GENE 13 7655 - 8077 580 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451757|ref|ZP_05617056.1| ## NR: gi|257451757|ref|ZP_05617056.1| hypothetical protein F3_01741 [Fusobacterium sp. 3_1_5R] # 1 140 1 140 140 261 100.0 1e-68 MARNHYNYNSNKHDLVVNGMRVTDYGKDAKYTVAYENDFREVVTGADGDTITVEKNDRNA LITVKILQSSPLNIIFSQLASSDKEFPVLLTDRNFNGDIGSFSSIAHFVKIADLNVETVA KEREWQIRAINLKPALDLVK >gi|224461495|gb|ACDD01000007.1| GENE 14 8091 - 9080 1260 329 aa, chain - ## HITS:1 COG:no KEGG:Amet_0426 NR:ns ## KEGG: Amet_0426 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 9 323 2 328 334 66 26.0 1e-09 MAIKIGAEKKIVFLNVHKPTAVNQATVNVIGAFSTKKAVKEQLVTSIKDVTGLSEEDLLY KKIQAAFIAGAQEILIFGKQIKSQEYKELFDGVTNDWFGTITDEQDLEKIALISKEIAAR EKMLFATPAKGTEVNSSLKSSVQAITQDTTALVFSSNEETEDASVAGYAIPQFPGSILIA NKLINGAVDSGYSGAEQGILKGLNCNYQARMKGQLGLAEGVTMNGDSIDFIHCAKALKFR LEEDITLWLKATPKPTFYDTSSLKATILKRTGQFEAMGALAEGKTNVRLIDVADIPANDI LKGIYSGIKITCYYTYGIKEAKIDLYFAI >gi|224461495|gb|ACDD01000007.1| GENE 15 9065 - 9547 424 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451759|ref|ZP_05617058.1| ## NR: gi|257451759|ref|ZP_05617058.1| hypothetical protein F3_01751 [Fusobacterium sp. 3_1_5R] # 1 160 14 173 173 260 100.0 2e-68 MQKIRPNFQIKPIVDFKHSEEKTLPRIVSRTLNNSIVERYEERKDGDKGIYQQTEVHRHT ISFTFTLSKKESEEDVEEIRKHFSHIVGLEWWIDRARKELVIEEITDLIDISEYTKDGYI ERYSFDIIVRTLEENIAEIEHIEKVELELKVKGGISWQSK >gi|224461495|gb|ACDD01000007.1| GENE 16 9576 - 9914 403 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451760|ref|ZP_05617059.1| ## NR: gi|257451760|ref|ZP_05617059.1| hypothetical protein F3_01756 [Fusobacterium sp. 
3_1_5R] # 1 112 1 112 112 206 100.0 3e-52 MEFTLREFAQEELKRYEVKRRIAGKIDSPEGSIQEFDCLMLIYKARLRGNSPNLQDGGRS VGTLNGKSFKKDKLQMGDIVKVEGLDYKITDILPRIYADFDEFSLELMRNEE >gi|224461495|gb|ACDD01000007.1| GENE 17 9915 - 10373 379 152 aa, chain - ## HITS:1 COG:no KEGG:BAV1322 NR:ns ## KEGG: BAV1322 # Name: not_defined # Def: phage protein # Organism: B.avium # Pathway: not_defined # 7 149 8 155 158 64 31.0 1e-09 MKDEDFGYKRIVKELEELGKLKLIIYIDNQKTYEKTGASVDDIAMIMEYGSEEFNVSFPA RPFFRPTFDAHYEHFAKKLELGVGDIISGKTTARKVLQDVGRYAVKKVREMINNGNFAGL DEKTIKRKKSNKPLIDTKLLYRSIKYKIERSE >gi|224461495|gb|ACDD01000007.1| GENE 18 10385 - 10783 490 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451762|ref|ZP_05617061.1| ## NR: gi|257451762|ref|ZP_05617061.1| hypothetical protein F3_01766 [Fusobacterium sp. 3_1_5R] # 1 132 1 132 132 230 100.0 2e-59 MIGYVTVEEAQDFLTARYGEIDKEELEKALYQAFDKIEAIGARNGRMQGEKNFPRLHDSE EVMKLIQRVQMLEAYAMMNGGNEDIKRLGKGISGKSIGDMSVSYDRSQKIGEITFASVEA ARIMKRFSRKTF >gi|224461495|gb|ACDD01000007.1| GENE 19 10780 - 11019 442 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451763|ref|ZP_05617062.1| ## NR: gi|257451763|ref|ZP_05617062.1| hypothetical protein F3_01771 [Fusobacterium sp. 
3_1_5R] # 1 69 1 69 79 83 100.0 3e-15 MKLRHILYENVSVKGKAQFHQFIGGELEIEDAEEIKELLKNKNIEEIKEELPAVEESAGD TSEDSQEIEKTDKKKGKQK >gi|224461495|gb|ACDD01000007.1| GENE 20 11031 - 12146 1522 371 aa, chain - ## HITS:1 COG:no KEGG:BcerKBAB4_5338 NR:ns ## KEGG: BcerKBAB4_5338 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 10 356 13 351 391 151 32.0 3e-35 MAGTMVTTRLVGVKEELTPFLAYINANRAPLYMNLVALGRTGVVGQPKIAWVDYSSEGTQ TILTKAVSSNSETSFTVENGSIFKEGCLATIGDEVIEISSISENTLTVKRGQLTTKAADS YKIGEEVFFINDNIEEGADLQGATYKKGVNYDNNVQIIREEISVSASAEAITVPSAGGID AYSLEQMKKMDKVLGKIEKAIISGKKFESGLKRGMDGVKRFLAKGQLVDAGGQEISLEII GNVLRKIFEAGGDVNGGNYALYVPGVQKVKISKLLKDYIQAPPSENTLGAVANYVATDFG TLPIIPTVNLRADEMMILNHDDITLQVLRTRELQHEYMGKTGDNTKGLIVTELSVEVRNI PTMGMIMNLKK >gi|224461495|gb|ACDD01000007.1| GENE 21 12150 - 12764 912 204 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451765|ref|ZP_05617064.1| ## NR: gi|257451765|ref|ZP_05617064.1| hypothetical protein F3_01781 [Fusobacterium sp. 3_1_5R] # 1 204 1 204 204 291 100.0 2e-77 MIENEQEVIEYLKKEENKEFLSKNGFVTEVEKQVKTPLTDEEVKGYLAGKPEMTKQVGAT AVQAFLKEKLGKEVTEEDLKQGLVLGGTLENVKKLAVGKILSGVKYGELLMAKIDFSKIQ FKEDKIEGLDEQLNSLKTQYKDLFEPTVTGGAGTPPAISKKEPQSELEKVNQELEELKNK GQSVTIRAKIMMLLNKKAQLEKGE >gi|224461495|gb|ACDD01000007.1| GENE 22 12776 - 13483 713 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451766|ref|ZP_05617065.1| ## NR: gi|257451766|ref|ZP_05617065.1| head morphogenesis protein, putative [Fusobacterium sp. 3_1_5R] # 1 235 33 267 267 436 100.0 1e-121 MKEEELAELPDIEFSNLEKKKIIEDLTKVAIATNKHVFESWRTLTDEELKRPDLTGAKYW IRENYLRVNGLSKTFPTQMENFKEKQYKEILKSFNAPLDHRFSRMIDGKISQTDINKLLT ELKGYYAPSNDIKRLIEKLERNRTLGQAEMSKLHDWANRRNELWARNEAGNIYASQLEDL WLENGIEYYIWHTMQDDRVRMEHMEKDGKIFRIDEDVLPGQEFGCRCWAESIKKK >gi|224461495|gb|ACDD01000007.1| GENE 23 13554 - 15047 1460 497 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451767|ref|ZP_05617066.1| ## NR: gi|257451767|ref|ZP_05617066.1| prophage LambdaCh01, portal protein [Fusobacterium sp. 
3_1_5R] # 1 497 1 497 497 884 100.0 0 MQEIFEAYRKHRNSEIYMNYSRNQKLFAGKSSEVFYQDVLKRVKLEYMGVLNDKNQYAEF NRNGRILERVYRPFKDLVVGNNILGAVTKLYAELASNTEPTISIEEEKKKILEEIDLQDK TAEAVAIQSYAGKLLLKGYIVDGKFYIDIIPPYQYFAVKSQLNSDLVDYYVVFEEEKKVL TTEIYKKGKTEYRKYKIEKEGLSEMPYPLDLREYGAVQDGLGWIKKYENWQVVEIHNLFH RSDYVEDLVIHNRELVIGDTLTSQAFDKVANPLLQFPDGALEYDEEGNLAIKIKDRIVIV EPEDKEIKQIQMNTKTEEWNAHRKNIIEQVYQNTGTNEQAFGLNKNGSSASGEAKRRDME RTISTVVAKRDRVFTGFEKVIKWGYQELYHQELEVIITGKDILSLGVGEKIVIAVQGISS GILSIESAIKYVNIADINVEDEVRRVKTDLSYRTKLIESLRILQEIDTEERVAGLLKKQG DELIKELGLDEEEPVST >gi|224461495|gb|ACDD01000007.1| GENE 24 15161 - 16468 1004 435 aa, chain - ## HITS:1 COG:lin0105 KEGG:ns NR:ns ## COG: lin0105 COG1783 # Protein_GI_number: 16799183 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Listeria innocua # 8 420 9 438 443 153 31.0 6e-37 MTSKKMDLTNHKVKRISKILLPNFHNLYTAWRSSQYTRYVCQGGRASAKSSHIALILVLS LMREAVNIVVLRKVADTLRTSVYEQIKWAVQELGVEDYFIFKVSPMEIIYAPRMNKFLFF GVDDPNKRKSMKTADFPIAYFWIEEAAEFLTEEEIEIVIKSVLRGKISDNLKYKGFLSYN PPKQRHHWINKKYGVIQKQETSFIHHSCYTDNPYLSEEFILEAEEKKKKDPVGYDWEYLG EPVGGGVVPFPRLHIGKIPDLLIRTLDTFRNGVDWGYAVDPVAFVRWGYDRMRKRIYAIS EFYGVQKSNEVLAKAIKKQIKRNETVTCDSAEPKSVAELRSYGIRAYSAKKGKGSVESGE KWLGEMEIYIDPERTPNIAREFQLADYDIDRYGETVARLADIDNHTIDATRYAFEGDLKK RREANSKKFSRPSGL >gi|224461495|gb|ACDD01000007.1| GENE 25 16419 - 16829 499 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451769|ref|ZP_05617068.1| ## NR: gi|257451769|ref|ZP_05617068.1| hypothetical protein F3_01801 [Fusobacterium sp. 3_1_5R] # 1 136 1 136 136 194 100.0 1e-48 MNVEQIRAKKLYAEGKSAEEIAKLLEKSTGTIYRWIKQYKEEFEQSRKIAQMTTDDMSDL LDEAHKKNLLEIIENPHLLQNPKAADALIKIANVLEKMDARKEREAMLEANEEEKGVVFI DDIKEDGFDKSQSEEN >gi|224461495|gb|ACDD01000007.1| GENE 26 16956 - 17423 157 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451770|ref|ZP_05617069.1| ## NR: gi|257451770|ref|ZP_05617069.1| hypothetical protein F3_01806 [Fusobacterium sp. 
3_1_5R] # 1 155 1 155 155 280 100.0 1e-74 MDIVNIYELEKKLESLAEEKGYRFHVNSWSAGEVHRIYYMMYYGRESCYCGFVDCNNNRY YVCDRRHKRGINLLTGELERRQYRAMARVIFQEENDKEEEFVEFLRSKFWNFENIRIALQ YYRKGLNLEKLPNTLTKEQLEQVIACILENQNKLI >gi|224461495|gb|ACDD01000007.1| GENE 27 17434 - 17616 288 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451771|ref|ZP_05617070.1| ## NR: gi|257451771|ref|ZP_05617070.1| hypothetical protein F3_01811 [Fusobacterium sp. 3_1_5R] # 1 60 1 60 60 83 100.0 5e-15 MTENTWGGKREGSGRKAKENKKITKSFVISPELLKKLEEKYPGQSFSKVIEEALAEFLKE >gi|224461495|gb|ACDD01000007.1| GENE 28 17663 - 17974 541 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451772|ref|ZP_05617071.1| ## NR: gi|257451772|ref|ZP_05617071.1| hypothetical protein F3_01816 [Fusobacterium sp. 3_1_5R] # 1 103 1 103 103 195 100.0 9e-49 MKNYKEKSIYVGMSDIAALTAVGCVEEAPFINAEVIVFGEDAAYKAYIVENDDAEIPGHY ELCHTFQNWVKIYDDDGLVEEIKGKEIKIYRAGSMGLLIHVIK >gi|224461495|gb|ACDD01000007.1| GENE 29 18024 - 18251 367 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451773|ref|ZP_05617072.1| ## NR: gi|257451773|ref|ZP_05617072.1| hypothetical protein F3_01821 [Fusobacterium sp. 3_1_5R] # 1 75 1 75 75 132 100.0 7e-30 MKRFEEMIGKNWADVKGEMLNYITVDMENVDKKSGSCIVDFTNCSFLSVVGTYTEENDEV IIEVADDAIIYDNRG >gi|224461495|gb|ACDD01000007.1| GENE 30 18429 - 18560 63 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKIKGKKKARQRRERLEIVKLILEFMIAVLALITTIRDFLKG >gi|224461495|gb|ACDD01000007.1| GENE 31 18738 - 18974 234 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451774|ref|ZP_05617073.1| ## NR: gi|257451774|ref|ZP_05617073.1| hypothetical protein F3_01826 [Fusobacterium sp. 3_1_5R] # 1 78 1 78 78 120 100.0 3e-26 MAKYISVAQAANRLKVSVDTIYNYCKNGTLGGQYIFCQQKGTWKVDLESLELLEKESCFK SKLQLKKDTNQYSLFLES >gi|224461495|gb|ACDD01000007.1| GENE 32 18988 - 19791 861 267 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451775|ref|ZP_05617074.1| ## NR: gi|257451775|ref|ZP_05617074.1| hypothetical protein F3_01831 [Fusobacterium sp. 
3_1_5R] # 1 267 1 267 267 386 100.0 1e-106 MGIFDFFKKEDKEKELVKEMGKKILSKLKKEIKSMCTLTDEEKDREENNLTGSQYWIKRN CDRVRIETLDVHTLANEFDTLINKLEFEIERNDISKENKEKLLDFIDLQALEDDLNIKRV LTTLEEKNYISIEELYEIRKIILNDIWEDGVLNKESKEQKRIDKRYKELSLSTEATFLND VRKLKNNIEKGKLAKTNVGRFLGQVTNSVPKREKEVYKIIEKLDKNNIITLAELEFVKEK FIIYNKEFWEEQKIEAEKTLREESWGY >gi|224461495|gb|ACDD01000007.1| GENE 33 19846 - 20658 412 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451776|ref|ZP_05617075.1| ## NR: gi|257451776|ref|ZP_05617075.1| hypothetical protein F3_01836 [Fusobacterium sp. 3_1_5R] # 1 270 10 279 279 491 100.0 1e-137 MGSKGRFYKEIKEIFQENYKKHFVDLFAGAMEVTLNVKEEFGEVTVLANVKDEFMEGLIR NRKHGIKLYKKAVEHILNGETITSGRAFYEDKKKWLGWKEKFHEITRQNILKLASHEIKA LQMLMSVKQDDSLSSNFYSIQKLQNLEKYLQKMENIHIKHDFFNKNWQFDNSFILLDPPY VTSTGTKERGTKGYRYKAMRWTEQDDMALIEFIQKNKEKGNVFLVFGSVENTLSRLIQEA FPDIKFITKKYKKSMFGRSSEREEWYCVIK >gi|224461495|gb|ACDD01000007.1| GENE 34 20772 - 20924 267 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEKQIEKNEKYLIEEMKKHDGWCEVKIKHGYIIEANKKVPIKIIEKLNK >gi|224461495|gb|ACDD01000007.1| GENE 35 20999 - 21304 69 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451778|ref|ZP_05617077.1| ## NR: gi|257451778|ref|ZP_05617077.1| hypothetical protein F3_01846 [Fusobacterium sp. 
3_1_5R] # 1 101 1 101 101 137 100.0 3e-31 MKIIITQEERDKLLQLLGNSDSILRNKLLKAKRQRKSSTYKKCTNTERKIRQKLEELICA NYRMSNEELIEKLNISRALFYKKYNKQARELRGNCQSQALF >gi|224461495|gb|ACDD01000007.1| GENE 36 21270 - 21875 390 201 aa, chain - ## HITS:1 COG:RSc1207 KEGG:ns NR:ns ## COG: RSc1207 COG0568 # Protein_GI_number: 17545926 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Ralstonia solanacearum # 5 190 122 366 377 70 27.0 2e-12 METKETLELIRLAKLGNIQARNELIEKNINLIHKINRRYGSSEDGFQEGVLAFCHAIDKF DETKKVKLSSYAFHWIRQRIKRYRDREKYRLPAHVIEKMTKEERKIRIDFEYQDFNSEDE KTTEEENSICLKTTLEQYIKLACDEKEALILKKIYFESYQQQEIAKEMGVCRQRVNAIVK RSLQKLRRVFYENHYHTGRKR >gi|224461495|gb|ACDD01000007.1| GENE 37 21853 - 22152 311 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451780|ref|ZP_05617079.1| ## NR: gi|257451780|ref|ZP_05617079.1| hypothetical protein F3_01856 [Fusobacterium sp. 3_1_5R] # 1 99 1 99 99 164 100.0 1e-39 MKQVITFRSFTEFFEKEKSGLKCNTVRMFELCDDREYILRDIMNEEIKKEDVILKIMNFD TGESFEREISDVSKLEVNTAEIYIISWRHKDENGNEGNS >gi|224461495|gb|ACDD01000007.1| GENE 38 22149 - 22355 324 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451781|ref|ZP_05617080.1| ## NR: gi|257451781|ref|ZP_05617080.1| hypothetical protein F3_01861 [Fusobacterium sp. 
3_1_5R] # 1 68 1 68 68 123 100.0 4e-27 MKWIFRKDKFIEHTGSQHFEKALWVDEADGKEVIFKDENFGLVKGVGVTIVGVFHEYAVL REWCEVVQ >gi|224461495|gb|ACDD01000007.1| GENE 39 22357 - 22941 688 194 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0834 NR:ns ## KEGG: Sterm_0834 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 4 190 1 190 193 72 27.0 7e-12 MKEINVTRHALMRYAARVYKAAGITDRTFDSWRKRHEEEAQELEKCLKLEFKQAEYVTTA QFEGHKKAEFYIRKDIMMTYVASGENLVTCYYIDFGLDDVGNREMLEVLFKNLRRAIEEE ENFENKNETRVSFLKASLEKVKSDIAEYEAILAKIKEKKEIFEKELKLVDLEKIELSETI NNAREKIVRSKKAM >gi|224461495|gb|ACDD01000007.1| GENE 40 22938 - 23291 470 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451783|ref|ZP_05617082.1| ## NR: gi|257451783|ref|ZP_05617082.1| hypothetical protein F3_01871 [Fusobacterium sp. 3_1_5R] # 1 105 1 105 117 212 100.0 5e-54 MKKKTLIYVAHPYGGDEENKKAVEKFVDPLKKFKDITFISPIHSFWGYEKTDYLKGIDDC LSLLGQCDILAIPRFRDIEKSKGCLMELGFAKGAGITIVYWDELREYLETYEVEEEE >gi|224461495|gb|ACDD01000007.1| GENE 41 23288 - 23446 166 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451784|ref|ZP_05617083.1| ## NR: gi|257451784|ref|ZP_05617083.1| hypothetical protein F3_01876 [Fusobacterium sp. 3_1_5R] # 1 52 77 128 128 82 98.0 7e-15 MYGEDYKQALSKEIEKQAGYDIPFTRLNKEEASKMIIALEKIEEWKRKKGNL >gi|224461495|gb|ACDD01000007.1| GENE 42 23676 - 24323 832 215 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2073 NR:ns ## KEGG: Fjoh_2073 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 18 215 17 217 238 142 42.0 7e-33 MIDVKNMTAEEREKLRKQLIAEELEEKKRIKEEREEYKRLAEETAVKAFDMLAKLSEELK RAKEQIFEDFATIIKLKEELYGVRDNQLSHTFTTEDGKSVVLGYRNTDSFDDTVHVGIEK VKGYIKSLASGEKKEDIERVLNLLLKKDKNGNLKANRVLELQKIAEQINDNNLLEGVKII QESYKPMKTSTFIECYIRDKETGQRISIPLTMTGV >gi|224461495|gb|ACDD01000007.1| GENE 43 24320 - 24562 329 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451786|ref|ZP_05617085.1| ## NR: gi|257451786|ref|ZP_05617085.1| hypothetical protein F3_01886 [Fusobacterium sp. 
3_1_5R] # 1 80 1 80 80 146 100.0 3e-34 MFKLKIVTDKRTKYELAREICLEDGRLGWECTLDILGREAVVGTAYEKQGEDWKLVSDNG FTEKLIFLQIDDEIVIGGTK >gi|224461495|gb|ACDD01000007.1| GENE 44 24609 - 24842 314 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451787|ref|ZP_05617086.1| ## NR: gi|257451787|ref|ZP_05617086.1| hypothetical protein F3_01891 [Fusobacterium sp. 3_1_5R] # 1 77 1 77 77 100 100.0 3e-20 MTKLESTKKAICGQLGIKNREMVEEKLNSLAIDMLKRKNFLKKMNMTGKKNIEELSEKYT EITGRIIQKIKEINGIL >gi|224461495|gb|ACDD01000007.1| GENE 45 24854 - 25045 378 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451788|ref|ZP_05617087.1| ## NR: gi|257451788|ref|ZP_05617087.1| hypothetical protein F3_01896 [Fusobacterium sp. 3_1_5R] # 1 63 1 63 63 93 100.0 4e-18 MNISNISEAVRLGTAYQYKLQALLYLEEAVKEVGDKDYKKHLEMEVRRLQDELEDLEMQI RGL >gi|224461495|gb|ACDD01000007.1| GENE 46 25056 - 26027 1191 323 aa, chain - ## HITS:1 COG:NMA1286 KEGG:ns NR:ns ## COG: NMA1286 COG2842 # Protein_GI_number: 15794215 # Func_class: R General function prediction only # Function: Uncharacterized ATPase, putative transposase # Organism: Neisseria meningitidis Z2491 # 9 297 11 285 304 94 28.0 3e-19 MNRENTIVRLERFAEQKGLSYRKIAQMIGIGQSTLSEIRKGTYRGNEEEILLKLEDLMER HKKGIKRVDFSVETDTKKRIFFTIDTIKKYVASNAANEIISSAKIAYIVGRSGIGKTHAL MEYQKTYGSKIIFITAENGDKHTTVMRKIARSMRMDTKGTTDELKENIKERLRFTETIII IDEGEHLSPKVIDVIRAIADQTGIGLVIAGTDQLKHQITKNHREYEYLYSRGVTWMLLKE LTIKDVDKIFRKFIQDDLDFYEEEDIVKMTSFIAREVKGSARILENLLTMASMMANEGEN FEKTGGLITLDYLKAASKVVNTI >gi|224461495|gb|ACDD01000007.1| GENE 47 26024 - 27907 1656 627 aa, chain - ## HITS:1 COG:NMA1882 KEGG:ns NR:ns ## COG: NMA1882 COG2801 # Protein_GI_number: 15794770 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Neisseria meningitidis Z2491 # 182 564 220 614 681 135 28.0 3e-31 MEKFYTTQEVEKLLGKSRTTVARLAVKHHWTIEKRVANGTVKTFYLKQEIDSFRGVPAIV EEPKKNKTRTVALRDYKAVDELPDWNQKIAWARYFICLQLEKDYEQMEGSKDIIIHQFVQ DAKKRFPEKMEILKRISVGTLRRWWGIYSKNKQNPLALATQLGKSRGYRKMSYEIAIRAK 
NLYLSKNKPTMMKVYTRILLEFGEDAVTYPTIRNFLNHDINSLAKDYGRLNFKDFNDKHK PFMVRDHSKLKPNDLWVSDGHDFEFMCYHPYRKNADGSRYIGTPKWILWMDVRSRYIVGW TISWGETTESIALALKNGIEKYGRPLATYTDNGKAYKSKVLKGCKDKEELTGIYAALGIE KDKQRHAIAYNAQAKNIERMFVDFKRDFAVEFLTYKGGHILERPDTLKPILKKEKQLATG EVLELSEVEAYLEYWVAYRNDKYYRFRRGHRGDGMDGKTPKQWMDSLPETERIRISEDQL RMLFMYEDIRKVTQNGVVFMQNTYIHEELFTHLGEKVKIKYDPHNLKEIFVYLLSGEFLC KADRLEKYGWDGVEQYKEHRKRLQKFQASIKKTLEIKQEIVDNDIITYAEEAQERMEIIE NIKTAKKKEEVKVITVGGIEFEVEDDE >gi|224461495|gb|ACDD01000007.1| GENE 48 27920 - 28330 503 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451791|ref|ZP_05617090.1| ## NR: gi|257451791|ref|ZP_05617090.1| hypothetical protein F3_01911 [Fusobacterium sp. 3_1_5R] # 1 136 1 136 136 235 100.0 8e-61 MDKGLQTELQRYQKALEKTREIRCSMIDVEMSVSVAKQILGIHDWGMFARGEYKNWEEMV NILQKEVKKYPGTLKERDKNFKTLKKAMTLHGMSIKELEEIIGVNCYKIYRVVRGITRDQ EIKNKLEKELNVKFLV >gi|224461495|gb|ACDD01000007.1| GENE 49 28333 - 28434 143 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARKKELLWLLHELKKNPKVYARMIAKIEGLVG >gi|224461495|gb|ACDD01000007.1| GENE 50 28521 - 29300 636 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451793|ref|ZP_05617092.1| ## NR: gi|257451793|ref|ZP_05617092.1| XerD related protein [Fusobacterium sp. 3_1_5R] # 1 259 1 259 259 460 100.0 1e-128 MKSWEIDVMMLRSELIFREYAERTQDIYVKTVEDFLMSTDKESCNIQREDVIRYLERRLK QDSNNTVLVKLNALEFFFEEVLGLDITENIRKFKKLRREYKQDLVSLQDIDILLSSIPTR DRMILKTILETGEYPKHMHKWRVSDLVSKEDGWYLQEHKLRTEFVKELLAYIEKFEITDY LFYGNKGYIDYTNIYLILRKYSEMYLGRRVTVGELKHAVALELWKQGKVEEIKKYLGNQS LASIRQWYKMRGIILEGIR >gi|224461495|gb|ACDD01000007.1| GENE 51 29281 - 29898 843 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451794|ref|ZP_05617093.1| ## NR: gi|257451794|ref|ZP_05617093.1| hypothetical protein F3_01926 [Fusobacterium sp. 
3_1_5R] # 1 205 1 205 205 346 100.0 5e-94 MNQLLLTQEFKGISVDFYRNQDEFLFTMEQVSQLAGYKQIQNFKDILTNHPELRERRASF LMKSEALEGGVLKKRERRFFTEEGLWEVLMVSGTPKGIEVRRFIAGVLKKLRKGELITDG MKSLPQKVEHSLVNLESMAEKLLESHDKAVLPGLEEIVRGMKEFESRLQVIEKEMKEQTL AVSVTQKAMEDMMGEMDTIYEKLGN >gi|224461495|gb|ACDD01000007.1| GENE 52 29938 - 30216 331 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451795|ref|ZP_05617094.1| ## NR: gi|257451795|ref|ZP_05617094.1| hypothetical protein F3_01931 [Fusobacterium sp. 3_1_5R] # 1 92 1 92 92 149 100.0 5e-35 MQNLMEIKKIRIVSEVEKVNSLLDSGDWKLGKILEGKECIEFLMFYLPQEKEEEKVFVTY PFNTDYKDTRRKIDEFQKMVRELVHRELNIEK >gi|224461495|gb|ACDD01000007.1| GENE 53 30246 - 30419 196 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451796|ref|ZP_05617095.1| ## NR: gi|257451796|ref|ZP_05617095.1| hypothetical protein F3_01936 [Fusobacterium sp. 3_1_5R] # 1 57 1 57 57 101 100.0 1e-20 MEKLDYGDYMDGEIVFNSKADEKACLQCWNEGIEIRVDEYGRVYNEGGIYIADIKIK >gi|224461495|gb|ACDD01000007.1| GENE 54 30865 - 31491 552 208 aa, chain + ## HITS:1 COG:SPy1486_2 KEGG:ns NR:ns ## COG: SPy1486_2 COG2932 # Protein_GI_number: 15675391 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 89 201 1 111 117 60 36.0 2e-09 MKEKKLSETIGDKILKLRKETALTQEQFSKIAGVTPLSILKYESGERLISIETLLNIANY FKIPISYFLGENILKVDENMIKIPVVSVVSAGNGKCGLDDITDWIEFSENIFPACDFATT VSGDSMEPKIFDSDILLIRKTETLDSGDIGIFKIDEDVFCKKLQLNHLTNEVILKSLNPC YAPRYLSKEELENFKIIGKVVGKLDYHF >gi|224461495|gb|ACDD01000007.1| GENE 55 32068 - 32217 148 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453085|ref|ZP_05618384.1| ## NR: gi|257453085|ref|ZP_05618384.1| hypothetical protein F3_08501 [Fusobacterium sp. 3_1_5R] # 1 49 367 415 420 71 77.0 1e-11 MFEKVKEEFLKVISKEDAECIFNLIEDDITGFTSLTQMKREWIKVLFQS >gi|224461495|gb|ACDD01000007.1| GENE 56 32301 - 32714 400 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451799|ref|ZP_05617098.1| ## NR: gi|257451799|ref|ZP_05617098.1| hypothetical protein F3_01951 [Fusobacterium sp. 
3_1_5R] # 17 137 17 137 137 196 100.0 4e-49 MKSFFIFIFIFVIGLIGYGADSYQNDTDGNKVREEFTTVFQEAEEFYLLDNILSIEEKEK VLQTLPANTVKEKEKVIQEILKEKEEVIKNIGEYKKQKWKKDIEDQKNFNKSIQSKKEAI FPMMIFLVMMLCFFSPF >gi|224461495|gb|ACDD01000007.1| GENE 57 32817 - 32906 171 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIEFLLNNTKDLIGERKNYKKNRTRSENE >gi|224461495|gb|ACDD01000007.1| GENE 58 32943 - 33629 537 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451800|ref|ZP_05617099.1| ## NR: gi|257451800|ref|ZP_05617099.1| hypothetical protein F3_01956 [Fusobacterium sp. 3_1_5R] # 1 228 1 228 228 411 100.0 1e-113 MKYLFYIGTGKYSLSTEEDLLKISPGDLPGKKRILLGSKDIIRDTQRAYKKLFQKAVDQY NLKHSDKPISSYYCKINESKKLSLATGILLRLGTGREWKAFQEEEKIQTLYEKQFKKIQE LLPNFYLVNATLYFEEVPCLRIVGIPFQREEDQEKLLKVRVSKSNSFNKNRIKEIRDTLM KQIKIDFYELFQESLEIVPKRHRRKRNPEDEYYQMCLFPEFPLETLTI >gi|224461495|gb|ACDD01000007.1| GENE 59 33671 - 33736 64 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLIKHFPREPTTYENTNNLK >gi|224461495|gb|ACDD01000007.1| GENE 60 33757 - 36708 3917 983 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451801|ref|ZP_05617100.1| ## NR: gi|257451801|ref|ZP_05617100.1| hypothetical protein F3_01961 [Fusobacterium sp. 
3_1_5R] # 1 983 1 983 983 1593 100.0 0 MKLEYIIKECLKGKVKVTQALLVAFFITGGISFSEVPKITKLGKTTMTSERNREGYVSLT FKNSGKPDQIFYVKKGTSWETLKDYLPEGKKSNYIFTGYSTEVNGKGKEITGEITEDLTL YGVYKKEGTSAKKYVSKGTLDSLEDVHSSDWEGKETTKFLKVKPFVKSDIPSTRKTKINM DEDAKYNNFISTNVVKDIITLRNTLDKDYSDNQDALTKLGNATSSRIDVEAWKRKLGVKE SGNDILNVKYEKDTDEIILGKTTMTSEKNREGYVSLTFKNSGKPDQIFYVKKGTSWETLK DYLPEGKKSNYIFTGYSTEVNGTEIKERGEITEDLTLYGVYKKEGTSAKKYVSKGTLDSL EDVHSSDWEGKETTKFLKEKNPIKIKNLQPGEISANSKEAINGSQLHAVLKAMDDNIDIE AWKKKLNMGDKLNKNLSNFDNTSVNEEGKTKLITALNQGVSFETPTGALVKDSDVKAYVE GKIQNNKANMNGDNIEVANFTSVLNQGSNLEAPDGKLVTDSQVKDELDKYVKLDGTNVTD ENRSGLIGKLSNGASLETPKNAVVTDTFLKNSLADKATKAELAEKADKTSVDNKANKNLD NIGNIADKAKTTLTEALGTGTVTENDTHLVTGKTVHEALKGKADANTLNQFVKKDGSNID ADVKKSLTDKLSEGASIASPTDTLVTDRMVKTELDKVNTSVDNKITENNKNYYNKKEVDD KLKNTLTSEDVDKKIDNKANKNLDNIGNIADKAKTTLTEALGTGTVTENDTHLVTGKTVH EALKGKADADLSNVSKDTIMEKVSKGNVTSTTLKIMNGENAALHAMTVDLKEGSIEKKHL GEGLSKTIDDLEAKDKEHDKAISTIAKQSIQNTKNINYIHNSVQENKQAIANNARRIDQV NQKINKTAALAQATSNLDFGQVRVGNIALGAGVGNYMSETGVAVGIAYRPNEAIFLSAKW TALAGEPRYNSYGASVSYQFSVK >gi|224461495|gb|ACDD01000007.1| GENE 61 36791 - 37108 391 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451802|ref|ZP_05617101.1| ## NR: gi|257451802|ref|ZP_05617101.1| hypothetical protein F3_01966 [Fusobacterium sp. 
3_1_5R] # 1 105 1 105 105 148 100.0 1e-34 MKKFKKLYVMLCGIFITTANAWASGNGGMVWEKPATSISKSISGPVAGVVALIAIVIAAL AWTFTDGGSLMGKAIKMVIGFAIVGSAAVFLNSVFGISTGGGMLI >gi|224461495|gb|ACDD01000007.1| GENE 62 37161 - 37856 765 231 aa, chain + ## HITS:1 COG:CC2687 KEGG:ns NR:ns ## COG: CC2687 COG3701 # Protein_GI_number: 16126921 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbF components # Organism: Caulobacter vibrioides # 1 222 1 218 254 96 27.0 4e-20 MLFKQKNKNNKEGKSSWNNYEKAKQEYLEHISKTTNALHSWKLIALFSLIITGISICTVT YLSTRSSLIPYVIEVDQTGNAKGINPAYQVNYEPTEINIQFYLREFVTHSRWLSSDIVLQ GIFYKKSVALLGREAKEKYNKLVEAENWKEMITDGFTRDVQIESINKISGTMNSYQIRWI ETIYKRGSFISEKKLSGIFAIEIDQPKELEELKYNPLGIKIQDYHITSEGV >gi|224461495|gb|ACDD01000007.1| GENE 63 37861 - 38661 863 266 aa, chain + ## HITS:1 COG:mlr6404 KEGG:ns NR:ns ## COG: mlr6404 COG3504 # Protein_GI_number: 13475358 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB9 components # Organism: Mesorhizobium loti # 43 264 92 311 332 127 36.0 2e-29 MKKFIGILVFVGMCSYAGAATMEESLEVLDKITPSKSAKIVSGYKNAKAGVQTVFTYDED SMYSIYCRVNYLTTIMLQPGETITFVGGGDTARWRRATAETGSSEGSREVIYIKPTSIDL KTNLVINTTKRNYQINLISDKTLYNPIIKWQYPQDDFIQQVNAQKLLEEKEAREEEISDP TSLNYRYTLSSNKYHFSPEQVFNDKTKTFILLKEDIQEIPSLYIKEGKNLLLVNYRKKGN YLIVDRLFDEAELRIENRKVIIKRKK >gi|224461495|gb|ACDD01000007.1| GENE 64 38676 - 39863 1305 395 aa, chain + ## HITS:1 COG:RSc2575 KEGG:ns NR:ns ## COG: RSc2575 COG2948 # Protein_GI_number: 17547294 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB10 components # Organism: Ralstonia solanacearum # 181 394 200 415 425 185 43.0 2e-46 MDENKEKKEEKQNLDKAPQKKPKANIKKSNIIIIGVGLVLILGFSLLRMGKRGEEEKTTV EEVKQEESIATNNDSFQAVDYTKAQEEEDKLNLSYNIPATEPSEEENIEILENSVKTNEA LDNYYEKMLSEEMAAQTGVIEFGVNTSPQTSGHLPEVHFNEESSLDLSRFQTRENIVDDF NKQGEKRAFLSESAKKNYNSFLTESPLSKYELKAGGIIPGIMLTGINSDLPGTMIANVRE 
DVYDTVTGKFLLIPKGTRVVGKYSSAISFGQSRVLVVWQRLIFPNGKSLNLDNFEGTDMS GYSGLVGKVDNHTLKLFQGVVLSSILGAAAGIIDNGGENNSWRNNAGKGAGEEIVSIGEA IASRLLAVQPTITIKPGARFNIMVNSDLILEPYHK >gi|224461495|gb|ACDD01000007.1| GENE 65 39878 - 40354 587 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451806|ref|ZP_05617105.1| ## NR: gi|257451806|ref|ZP_05617105.1| hypothetical protein F3_01986 [Fusobacterium sp. 3_1_5R] # 1 158 1 158 158 254 100.0 9e-67 MMYYSIKKNGEIIAILEFKELLKDRNHNLLREMKGDIEIKEISEERAISLFRQNKGLCKE NKTREKLEEKVFILEMIEKIKQQLVVEKDMDLILPEKAQRFLPQIQQELLKQGITLSHPI QECIHDIDPLCKFQSKLKREKTRGRDGGIGRDRGEESC >gi|224461495|gb|ACDD01000007.1| GENE 66 40348 - 42402 2279 684 aa, chain + ## HITS:1 COG:mlr6395 KEGG:ns NR:ns ## COG: mlr6395 COG3505 # Protein_GI_number: 13475349 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Mesorhizobium loti # 95 603 144 617 735 323 36.0 9e-88 MLILGFLLVGIGIMRIFFPDFFQKIFTLKYLEKIVILSLLFAVIYIVFSIGSYRIANYLS LPKHLTPTLYWRLMFIKDIQRNSSLNNIYLQTTSMLCLGTIPFLFYMLKVRGNQLDTHGT ARWAKIWEWDEAGFLAPPGEYKEGVILGRTKEKFFGLIPSRCIIDNEKTHIAVVAPTRSG KGVGIIIPTLLNWLGSVFVMDMKGENFQITAGYRKKVLKQKILKFKPYGLEDSVSYNPLG EVRIGTPYEVKDATIIADILTDPGEGKKRDHWDTSASTLFVGLILHVLYVAKKEGRVASF GDIVDFLTSTDDTLENNFLKLLQYSHLDNSDVFRKIYEESQMRGVKEGTHPLVARTAAEI LNKDERERASVISTVMAKLTLFKDPIIRKNTSRVDFRIQDLMDYHTPVSFYIVVEAEQMD TLAPLLRILVTQTIGLLAPEMDFSSDAPPHKHRLLFMMDEFPAFGTIPLFEKALAYIAGY GMKAVVIAQALNQIKKNYGDKNSVFDNCATAVFYAPSPLDNETPRQISDLLGETTIKIKN RSYKALQIIGDNISESNQARKLLTPEEVRNKLGDKRNIVSIAGHYPMMGVKVRYYLEEYF TSKTNKNYGIPKTDYLYQKEEKRNLQKEQEEMKMKDIMKTLEETENIEIDEPAIQEMEEI LSSNLEEREAMLHTLEEENVEIIE >gi|224461495|gb|ACDD01000007.1| GENE 67 42413 - 42685 402 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451808|ref|ZP_05617107.1| ## NR: gi|257451808|ref|ZP_05617107.1| hypothetical protein F3_01996 [Fusobacterium sp. 
3_1_5R] # 1 90 1 90 90 77 100.0 2e-13 MEEIESIKKKRKNVFEQLKIQEKKLKKLEEKKQKEISQTILKSFSFLQEEEIFLQFVSEI EKENSNFLNKIESYFLNELKAATLEQESKK >gi|224461495|gb|ACDD01000007.1| GENE 68 42682 - 43638 1305 318 aa, chain + ## HITS:1 COG:AGpT89 KEGG:ns NR:ns ## COG: AGpT89 COG4962 # Protein_GI_number: 16119853 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 317 26 332 343 253 43.0 5e-67 MIDGRTNKMKVSEDRLLEFLEDSLGYENMELLEDDDVIELYVNDDTRIWIDTLSHGREWT GRYMEPMDSMRVIKTVASYTNKIIDIENTIVSAELPGTGSRFQGMIPPNVENPSFNIRKK GIKVFSLDDYILSGSLSVEQKEIILKGIQERKNILIVGATSTGKTTFANAVIAEIAKTKD RLIILEDTREIQSVAEDTLRMKTSQYVNLLKLFESTMRQRPDRIIVGEIRGGEALSLLIA WNSGHPGGLCTIHSETAEKGLSQLEQYVQIVSVSPQEKLIAQSVNLIIVLTRVGGQRKVS EIAEVKGYKDGKYILEYI >gi|224461495|gb|ACDD01000007.1| GENE 69 43680 - 43937 386 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451810|ref|ZP_05617109.1| ## NR: gi|257451810|ref|ZP_05617109.1| hypothetical protein F3_02006 [Fusobacterium sp. 
3_1_5R] # 1 85 1 85 85 153 100.0 3e-36 MDEDFRVPVCQGLIKPLTILGISREAMILNVALGASFVLAVRAVYMLPVFFVTHYLLYLV CKRDPLVINIFMKKYVKQKNYFYQG >gi|224461495|gb|ACDD01000007.1| GENE 70 43954 - 44238 354 94 aa, chain + ## HITS:1 COG:FN0818 KEGG:ns NR:ns ## COG: FN0818 COG0776 # Protein_GI_number: 19704153 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Fusobacterium nucleatum # 1 90 1 90 91 95 53.0 2e-20 MTRREFAKVLFQHGIFSSKLEADRNIEIIFQLIEKAIIEDGFFSIRHFGKIKLVERAPRM GRNPKTGEEINIDARKSIKFRPSRFLLERLWEEK >gi|224461495|gb|ACDD01000007.1| GENE 71 44223 - 46679 2831 818 aa, chain + ## HITS:1 COG:mlr6400 KEGG:ns NR:ns ## COG: mlr6400 COG3451 # Protein_GI_number: 13475354 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mesorhizobium loti # 6 817 4 793 816 641 40.0 0 MGGEMIKEYQDEKNKLSTYVPWIAMIDDSAIILNKNGTFQKTLRFRGHDLDHSTKLELQN SDARLNNVLRRLEGNWTMHVEARRVKSKEYKTTDIMNYATKLIDDERKVEFEKGIYYESE YYLTLTYLVPKDTEKTIKRFFIEENQSEENLDKSLEIFKKEFQEISVLFQELFLEVEELT PEETYTYLHSCISSKKIEKIKIPDVPLYISNYLCDSDLVGGLQPELGNMHFRCISIQGFP NFTETGFFDILNRLGMEYRWITRFMFLDKIEALSKLEKKWKATFNGRISLWKRFMTELSG EQQVTKVDEDALQKAEEIETQLNLTRGDYLTQGFYSCTLIVYDHNLEILNDKVLEIEKTI NKLGFTTINETINCVEAFLGSVPGNIYNNVRIPIVNSITLTHLLPTSCVWGGDDMNYHLN QPSLMYTETNGSTPFRFNIHVGDVGHTSVVGPTGAGKSVFLGLLSAQFFKYPNAQVFFFD KDASSRVLTYAMNGTFYDLGEDNLTFQPLQKIGVMKEEIEKILKENPRLSLEEATEIERK RASMELEWANEWLLEIFIQENIELTPVQKGKLWDALELLSTSKPKLRTLSNLNTSLNDRI LKETLEKFTIRGALGRYFDSSEENISFSSWQVFEMGKIMHNKSAITPLLSYLFHKIEEQL TGKPTILVLDECWMFFDNEQFSNKIREWLKVLRKKNTSVIFATQELGDIMNSKIFTTILD ACQTKIFLPNPNAFADNYIPIYEKFGLNQREISIIAKGTQKKEYYYRSSKGARLFELALG KNTLRLLAASDQDSQKLAKEIYREYGGGEKFVKVYLGL >gi|224461495|gb|ACDD01000007.1| GENE 72 46759 - 47613 1086 284 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451813|ref|ZP_05617112.1| ## NR: gi|257451813|ref|ZP_05617112.1| hypothetical protein F3_02021 [Fusobacterium sp. 
3_1_5R] # 9 284 1 276 276 427 100.0 1e-118 MLLHERRSMKLKRIVIYVFIGILSFNLNAGIFGGGKSGSLKEILKLLVSINAAAGTNGVA TLAEIESKLQLIEQTKMQLEQLKMEAENFKQLGKELGHADIAKINQILDRTLGFKDYANT TATAMGKNLDSFFSDYNGSRYQMFDKYDLDILKEQRNKLRENQTEMQKAVYNNMAKNQMY ANINDQGQELKRKVSTLNNVAGTLQAIQAIGGILEQTNVILLETKQMIATNMEMKDRIEA QARKEAEIQDDNAEAEAKETEEILKKVREEEKRDKKKSNFKIKF >gi|224461495|gb|ACDD01000007.1| GENE 73 47626 - 48081 431 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451814|ref|ZP_05617113.1| ## NR: gi|257451814|ref|ZP_05617113.1| hypothetical protein F3_02026 [Fusobacterium sp. 3_1_5R] # 1 151 1 151 151 216 100.0 4e-55 MFSKKRQEKPIFNNIYVPLIIFVFLILFQKIGIQYIKLLMLIMIPILTIPILYYKLKNRN LEKPWKIIFTIFLIEIVVCTGILYKGFPILPFGIAPINNLYFWGILITLCSKYYLLTKEK GDLEEISLKMKRLKVFLIGLWSLIILYQYLM >gi|224461495|gb|ACDD01000007.1| GENE 74 48110 - 49147 1256 345 aa, chain + ## HITS:1 COG:mll9606 KEGG:ns NR:ns ## COG: mll9606 COG3846 # Protein_GI_number: 13488455 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbL components # Organism: Mesorhizobium loti # 7 216 47 257 495 67 29.0 5e-11 MFGLDAEAILLTAKAYSETMISNLNPTVTTLMGAFVTIDLVLSFLFDESDGLDIFMKLIK KILYYGFWIWIIQEYSNLVFETLMGGAIQLGNVASGNGSSTEINVTLIEKFGLDFGDILQ VLTASGTAMLVDYMGVESGTTVLMMGIMGYVIFFVMLYVQILITFVKFYLVAGYGFLLMP FGCFSKTKDIALKGLNGLFSQAIEIFVLITILNIASDFMDGTFLITLSKKVSGWKAIKEG VIFQKTAILMFLYLLINRAGSIASALLSGAIASIGIGSQMGAQGFSNAVSTPGRLAGNMA NSASRYQRRDDAAKGGAKAFRDSSYAASAYKKAADGVKNFASRFR >gi|224461495|gb|ACDD01000007.1| GENE 75 49221 - 49505 489 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451816|ref|ZP_05617115.1| ## NR: gi|257451816|ref|ZP_05617115.1| hypothetical protein F3_02036 [Fusobacterium sp. 
3_1_5R] # 1 94 1 94 94 168 100.0 1e-40 MGLFGKKESKPFVSNDKLVEIITSYDPDLHGLTWTAVLEKKIEERYLFRNVENVAAYGDK GVIIKYKDLKVRSEFEVKKMREEMEREAGLDLGR >gi|224461495|gb|ACDD01000007.1| GENE 76 49731 - 50840 1063 369 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451817|ref|ZP_05617116.1| ## NR: gi|257451817|ref|ZP_05617116.1| hypothetical protein F3_02041 [Fusobacterium sp. 3_1_5R] # 1 369 1 369 369 655 100.0 0 MKGYQELDTNYYDKDKINELREQVSNFKKVSVNEILYYSGSEKKWFMKQDALNYLIFDKW VPKEEKEHPSLDLNNVKDVSIHLGQNVSKILLKSMNEAKDRVYVSCPYVGVPTIDFFDFL LREKQLDCKLLFSQKPMKEGKHDNHSIFKKFISYKILNDDDLEKENTDEIIRLENEKESL TNILAYICLLLFPLSLVGGYQYLYPKYGKLGMIFSIFLAFLFLRKSFSIKNKNKEIRENI ENEIDKLKKRDFRYIQYNSESSKNYRYLIEELEKSKKEDGTIDFLNIAFSHVKLYIVDDV VFLGSLNFSSTSFCNLESLIEIKDKNAVEALTKYYLNLYSSEAFHTPKDVSRELGEYIYN EEEILKRGY >gi|224461495|gb|ACDD01000007.1| GENE 77 51022 - 51204 201 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451818|ref|ZP_05617117.1| ## NR: gi|257451818|ref|ZP_05617117.1| hypothetical protein F3_02046 [Fusobacterium sp. 3_1_5R] # 1 60 1 60 60 86 100.0 5e-16 MQAKKKIGRPTDNPKNINFTIRLDEECVSILEKYCKQEEITRTDAIRKGIKKLKVDLKEK >gi|224461495|gb|ACDD01000007.1| GENE 78 51329 - 51553 393 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451819|ref|ZP_05617118.1| ## NR: gi|257451819|ref|ZP_05617118.1| hypothetical protein F3_02051 [Fusobacterium sp. 3_1_5R] # 1 74 1 74 74 87 100.0 3e-16 MEKNELFEMIMYHLMEEALKEEEKEIEEIFGELNEEQTLYLSDLRKKYFGLGMDIYVSVL NFSKYFRKMAGDVQ Prediction of potential genes in microbial genomes Time: Fri May 20 01:45:25 2011 Seq name: gi|224461494|gb|ACDD01000008.1| Fusobacterium sp. 3_1_5R cont1.8, whole genome shotgun sequence Length of sequence - 31713 bp Number of predicted genes - 35, with homology - 30 Number of transcription units - 11, operones - 9 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 99 - 158 7.9 1 1 Tu 1 . + CDS 322 - 858 576 ## BcerKBAB4_5408 hypothetical protein + Term 872 - 908 5.2 + Prom 882 - 941 7.0 2 2 Op 1 . 
+ CDS 1027 - 1221 289 ## gi|257451822|ref|ZP_05617121.1| hypothetical protein F3_02066
3 2 Op 2 . + CDS 1235 - 1300 78 ##
+ Term 1316 - 1362 6.6
+ Prom 1308 - 1367 10.8
4 3 Op 1 . + CDS 1535 - 1906 334 ## PCC8801_4420 signal peptidase I (EC:3.4.21.89)
5 3 Op 2 . + CDS 1946 - 2881 1164 ## gi|257451824|ref|ZP_05617123.1| hypothetical protein F3_02076
+ Prom 2884 - 2943 7.8
6 3 Op 3 . + CDS 2965 - 3453 684 ## gi|257451825|ref|ZP_05617124.1| hypothetical protein F3_02081
+ Term 3499 - 3538 8.2
+ Prom 3518 - 3577 4.9
7 4 Op 1 . + CDS 3775 - 6123 2290 ## COG0550 Topoisomerase IA
8 4 Op 2 . + CDS 6139 - 7170 924 ## Smon_1167 hypothetical protein
9 4 Op 3 . + CDS 7180 - 8193 903 ## Sterm_4203 DnaB domain protein helicase domain protein
10 4 Op 4 . + CDS 8197 - 8343 271 ##
11 4 Op 5 . + CDS 8340 - 9440 1047 ## COG2003 DNA repair proteins
12 4 Op 6 . + CDS 9492 - 9575 159 ##
13 4 Op 7 . + CDS 9572 - 10984 1241 ## CHAB381_0165 hypothetical protein
+ Prom 10999 - 11058 7.8
14 5 Op 1 . + CDS 11085 - 17819 5991 ## COG4646 DNA methylase
15 5 Op 2 . + CDS 17803 - 17886 70 ##
16 5 Op 3 . + CDS 17865 - 18641 822 ## COG3177 Uncharacterized conserved protein
+ Term 18713 - 18761 10.4
+ Prom 19005 - 19064 9.8
17 6 Op 1 . + CDS 19101 - 21392 2051 ## Smon_1193 MobA/MobL protein
18 6 Op 2 . + CDS 21425 - 21802 591 ## gi|257451835|ref|ZP_05617134.1| hypothetical protein F3_02131
19 6 Op 3 . + CDS 21822 - 22088 314 ## gi|257451836|ref|ZP_05617135.1| putative phage N-6-adenine-methyltransferase
20 6 Op 4 . + CDS 22085 - 22477 429 ## LC705_00863 hypothetical protein
21 6 Op 5 . + CDS 22481 - 22645 321 ## gi|257451838|ref|ZP_05617137.1| hypothetical protein F3_02146
22 6 Op 6 . + CDS 22682 - 22780 199 ##
23 6 Op 7 . + CDS 22816 - 23481 611 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains)
24 6 Op 8 . + CDS 23498 - 23800 313 ## gi|257451841|ref|ZP_05617140.1| hypothetical protein F3_02161
+ Term 23813 - 23857 10.1
25 7 Tu 1 .
- CDS 23827 - 24135 453 ## gi|257451842|ref|ZP_05617141.1| hypothetical protein F3_02166 - Prom 24164 - 24223 10.5 + Prom 24118 - 24177 10.7 26 8 Op 1 . + CDS 24353 - 25081 586 ## COG1192 ATPases involved in chromosome partitioning 27 8 Op 2 . + CDS 25095 - 25766 526 ## gi|257451844|ref|ZP_05617143.1| hypothetical protein F3_02176 28 8 Op 3 . + CDS 25771 - 25950 289 ## gi|257451845|ref|ZP_05617144.1| hypothetical protein F3_02181 + Term 26022 - 26059 -0.9 + Prom 26060 - 26119 9.1 29 9 Op 1 . + CDS 26202 - 26381 168 ## gi|257451846|ref|ZP_05617145.1| hypothetical protein F3_02186 30 9 Op 2 . + CDS 26408 - 27601 1026 ## COG0582 Integrase 31 9 Op 3 . + CDS 27666 - 27860 310 ## gi|257451848|ref|ZP_05617147.1| hypothetical protein F3_02196 + Prom 27897 - 27956 4.5 32 10 Op 1 . + CDS 27989 - 28441 620 ## Cbei_3616 XRE family transcriptional regulator 33 10 Op 2 . + CDS 28466 - 28957 282 ## BVU_1494 hypothetical protein + Term 29009 - 29050 5.0 + Prom 29008 - 29067 5.1 34 11 Op 1 . + CDS 29172 - 29831 357 ## COG3505 Type IV secretory pathway, VirD4 components 35 11 Op 2 . + CDS 29871 - 31709 949 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|224461494|gb|ACDD01000008.1| GENE 1 322 - 858 576 178 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_5408 NR:ns ## KEGG: BcerKBAB4_5408 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 2 156 8 174 190 79 32.0 8e-14 MKLVNIKEEYIQYLQNFSTNVKYNKQEKRPYVGIILEIDHHTYFAPLGSPKAKHLKMKSG LDFIKIRDGRLGVINLNNMLPVPKEFIQKIDFSKYEEKYRILLQDQAYWIQENSKKIQKQ GSKLYQFIITRENTIFHTRCNDFKLLEAKALEYLEKQRDISENSWLEKMKKEKHGIER >gi|224461494|gb|ACDD01000008.1| GENE 2 1027 - 1221 289 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451822|ref|ZP_05617121.1| ## NR: gi|257451822|ref|ZP_05617121.1| hypothetical protein F3_02066 [Fusobacterium sp. 
3_1_5R] # 1 64 1 64 64 73 100.0 4e-12 MKQKLDVLYQILFEEENSLEEKERENFFNKLTEEELEYFCNIKHLFFERGFYLGQYFSKM NIKK >gi|224461494|gb|ACDD01000008.1| GENE 3 1235 - 1300 78 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERAMRIALFTLTKLSLFYII >gi|224461494|gb|ACDD01000008.1| GENE 4 1535 - 1906 334 123 aa, chain + ## HITS:1 COG:no KEGG:PCC8801_4420 NR:ns ## KEGG: PCC8801_4420 # Name: not_defined # Def: signal peptidase I (EC:3.4.21.89) # Organism: Cyanothece_PCC8801 # Pathway: Protein export [PATH:cyp03060] # 11 121 78 181 200 63 33.0 2e-09 MDDYDKKSKLEKGTIVLFSPPKLATKNRIYHNPLLKKIVAIHGDVIEIKNSKLFINKKYR GEIQEKDSYGNKINRLSNGSYTISPGEYFVLGEHPNSYDSRYYGALTKEEISQVGKLFIP FSF >gi|224461494|gb|ACDD01000008.1| GENE 5 1946 - 2881 1164 311 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451824|ref|ZP_05617123.1| ## NR: gi|257451824|ref|ZP_05617123.1| hypothetical protein F3_02076 [Fusobacterium sp. 3_1_5R] # 1 311 1 311 311 556 100.0 1e-157 MKKKVILGMFLFASFSVFAEMEVERGYGNFEFSTSYSHLQSREKILSPYKEIHDFISLQN DFLYKQGATLDWEKENIWRAYAVGGTGKKYQNTGGGILFYHAYDEGSYFGIQMEGREVKF NAYTESLKGTRGSIRFLFDKQQENGTRLFIAPYYIFDNVKEIQNKSIGIYAKQEIALDLS KYSILEEGVKTYVEVDTQRNSVKKEKEVKDKHRNKNDSIRAGIGISYAETFDGGELKITP NMTVGYEREFLENRKYRSIQEKERDTDNVKMAVGVNMSYKNIDFMVEDMFLKSTNTRNHE NRVQAILTYKF >gi|224461494|gb|ACDD01000008.1| GENE 6 2965 - 3453 684 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451825|ref|ZP_05617124.1| ## NR: gi|257451825|ref|ZP_05617124.1| hypothetical protein F3_02081 [Fusobacterium sp. 
3_1_5R] # 1 162 1 162 162 263 100.0 2e-69 MSEELKEKKNPLEKIADMTLEMAKLVLVTQSLSTIKDKETGKEQGFAEMGDEKLSNILKV DKSRIGGKNGIIYEAVMKGYIFSANVKTDTPVLDKEGNVKLSKSGEPIMKSHRILAQSME GLRKGFRHYGVEIPEKLQVKEKNIENTKTPDFKVKESNGLSR >gi|224461494|gb|ACDD01000008.1| GENE 7 3775 - 6123 2290 782 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 636 5 656 709 232 30.0 2e-60 MKVIIAEKPKLGRTIAHALRCHEKKEGYIEGENYIVTWGFGHLFTLKDIGDYEGLKGIPW DKISLPFCPTVFELKFTPDKEGQQGEVQKQFNIIKELVHREDVVEVINCGDADREGQVIV DEILQQAKNKKPVTRLWLPEQTEETIRQQISVCASNTDYMNLYHEGLARSYIDWLLGINL SVYLTIKANSEKTIRVGRVLIPIIKYICDRDRMISNFIPETYFQIEGKGEKENVPISLLA KKRYITKEEASAVCQEWNSLKAKVVSITKKEIKKYPKNLFSLSKLQEELSSRHKMDFTTS LKLIQKLYEEGYVTYPRTNTQFLAENEKEKIKEIISKINETGYELEMKDSKSIFDSSKVE SHSALTPTLKLPEKLSSQEEIVYRVIFNRFVSNFLKEDTVVSETTLIIQVGEQEFKLKGT EIVSPGFYQYEPKSFKNQLPNLIEGEEFIVEFLPVEKVTTPPKKVSETELSNFLENPFKK RNLLEEEDEEDEEILSDEDYKMLLEGIEIGTVATRTPIIQKAQEYEYITLKNSQYSITPF GEKVIQLLDDLQINLYKEKNIEFSQKLKKVYHGMLSVEDTVSETWGTLQEIIRKDLTSSI VFTEEVGICPKCGKQVLDKSKFYGCIDCDFKLWKQSKYFKNTFPISTEQAKKFLNHETVD CEILTEEKKKEKKQLKMVIRDPYVNFEEEKEIIGICPKCGKEVLEGNKNFYCSAYKDGCN FVLWKESKHFKDTLKITKAVAKKLLKKDGKSSFEITRKDGKRKKVDLKIKLNGDYVNFEE VK >gi|224461494|gb|ACDD01000008.1| GENE 8 6139 - 7170 924 343 aa, chain + ## HITS:1 COG:no KEGG:Smon_1167 NR:ns ## KEGG: Smon_1167 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 10 285 45 304 453 72 23.0 3e-11 MLKNATLEEKNDKNFILNILQNQKFEEYNENSFQYVSEALKNDKEFCLEVLKIQPNSFRF MSENLRNDKDVVIASLDKKAKFKHLDEEHLKYLGETLKNDKNFIKKLIEEKGSRLIYAKD TFRNNREFVSIAIKTFPRIIIGCNSEFLKENKDLVLKSLSKDGEILNSLPLVGFRDNPEM ILTAAKNNAKALDKAFLNPEMKDLNKILEAAVESPSNALNTALKFIPLAMQTSNIVKKAI QSNPCAIAFSREDLVDYKQLADFAINQKIEVLSGCSEKYQKEKILSNPKKYLSYARTNIQ KDMLINHKEFLEYASQDVKIAFEQYLKNPFLEKMKKKSIGLER >gi|224461494|gb|ACDD01000008.1| GENE 9 7180 - 8193 903 337 aa, 
chain + ## HITS:1 COG:no KEGG:Sterm_4203 NR:ns ## KEGG: Sterm_4203 # Name: not_defined # Def: DnaB domain protein helicase domain protein # Organism: S.termitidis # Pathway: not_defined # 37 323 50 353 673 76 26.0 1e-12 MADLKLVHGFFQEYLEKKGINTKTAFRCVSPQHEDKHPSMLYSKKYKKCTCLACGEKYDI FKFVGMEYNLPTFKEQLKKIQEFMKNPELIENINKTVYSKKNTEVSLHKVEEKIVPVEKK YPPLAYYFRDCKKNILKTDYLINRGISREIQDKYNIGYDPQFKNGTWKALIIPTSFYSFT ARNTDKDAEDRLRKVGHLEVFNYWELQEEQKKPFFITEGEIDALSFCEIRQKAISLGGIG NINGLMTKFEKDKPGNLFYLCLDRDEPGQKAEMILYEKMKALGLRVERVNILGNYKDANE FLVKDRIEFQKRVQGLMQSLENHFCKKIEGKKYSLGR >gi|224461494|gb|ACDD01000008.1| GENE 10 8197 - 8343 271 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSHEKLLERYHVNTMYELRELLLKDDERVKDLKEFLEFFIEQEKSDEE >gi|224461494|gb|ACDD01000008.1| GENE 11 8340 - 9440 1047 366 aa, chain + ## HITS:1 COG:BH3032 KEGG:ns NR:ns ## COG: BH3032 COG2003 # Protein_GI_number: 15615594 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Bacillus halodurans # 208 329 103 226 232 76 34.0 8e-14 MTQKNFQEFYEYYGTSSKEELYKKIKNGDKEVAELSNFIEYMKQDREREGVFIRGKQDLV NYLEKSSIPSKEEIKFLSLDTSKKVLGEYSLSLEELSDIKNIFQKIYHPKMHSCMTTFYD SSTSFPNKPRKLEEFEKKLDNIGIIPVETIAIMGDQKRVYSYCVENFTDFHHKALKKKNF LEIPDTKIDTDTKLFQEFSEFYVKKEILGKNLITEEEEIKRLLKVAKENLSQEHFSILEY DDKYRITSMETVFVGGLNRSIVDIKTLIPYFLNESKGICLLHNHPSGNNSPSRQDLELTR DISEIANQFKKQMYDHFIVGSKVFSFREESLIKDHRAEGIASLNREDTKKKNPWEKKRES GKGMER >gi|224461494|gb|ACDD01000008.1| GENE 12 9492 - 9575 159 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIKQGIYLGDFSYIEKGNVLSSRGET >gi|224461494|gb|ACDD01000008.1| GENE 13 9572 - 10984 1241 470 aa, chain + ## HITS:1 COG:no KEGG:CHAB381_0165 NR:ns ## KEGG: CHAB381_0165 # Name: not_defined # Def: hypothetical protein # Organism: C.hominis_BAA-381 # Pathway: not_defined # 7 462 5 438 447 119 28.0 2e-25 MILNIEDTETIERINNKIQSVDFFKFFMPVVEALEKEALVFYEDCGEDSISFMFLTKKEM KFFQQNQKELLSGELNSFLPQKECLNIRFVKERKSFEDVEKRVPNKVYIRRKYIQFTFTD KISIKESRIDNNYFVLFDKYQIYTVCPYKNSKYEKRFFPTTLKNISTIDASRGHFFYYYF 
RELEKENYFMKDILKEYHRNKSLIHLPVSFLELEHTKNKKHFLETKLKTKVSKKANKHFL AYMFYVEKVKKYIHENEISKLFSLSEEQEEKILKKLSEISREYMKIRLFLYYYYCVIGIF PEEMPEEDEDEEEDFFKNDFLDYFQNSKKIKQKISLNIKSLHGFQKIHDKVTYEVVKRQR KNKIIWKKTNPFLDLQLPERFKRIETYQDLFLEGLKNRNCVATYLDAINQEKCIIYTTEY EEKRYTIEIRKRRNKFILAQVRGFANSDAPESLVLELKKSLEEETRRFKI >gi|224461494|gb|ACDD01000008.1| GENE 14 11085 - 17819 5991 2244 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 912 2206 2 1308 1315 576 29.0 1e-163 MKSDKRIDLVNNAYIRIREDMEQYIKFLHVIGNNYKYSPGNQLRIFEVNPKATACAEFKF WRDHCNRSVKIKQQGIPIYQRNEMGKLEKKTIFDISQTVPLSQNEKEVLWKFDSEEHRGI LEETKNSIYQKIMESTETFSSTIPKEIIHSFLVKSVEISTNTRLGILSDIQFSEKEKEIF STFHNKDIFSKSIGLVSKFNCTFLRKIEQEINKNQEKALTKSRQASYTTGKEIEREITID KAITEDIEKGGKQDEERRRKDSIFRGIIGRVEQESNERRSYENVRRNGRRAGGRIRVLGK DENGILRGHRSGRIRRTPIELGRDVRTLGRNRKTNRELLRGRKTKNDESLGINREFETAG SRRIQEIGKQFSLFDERNRDSSIHRTINTEKNVFQKGMKVLYQGKEFVIHSIHEEENNLK SMELKGESLVNGLPIRISETLLFREEQELQKILEIQKIKQEEIELPKDYWIVEFHERSET DKNYEGQRVTKELLDELQELDEEKNLSKGFYSKFYFDHIENGEVTGHLRVDIGDGNEVNQ RDFQYLYKQVEKENTIENKQEDEKDTANLKYPKINSTKDIPDILYETVKNDEDLTQLSKE KGSEEVKQKVDNFKITADILPHKLLPSERLNNNLEAISMLNRIEKGERELDTIAQETLAK YVGWGGLADVFDESKDGQWKEARAFLQENLSPSEYEAAKESTVTAFYTPKIVIDGVYKTL SDMGFEHGNILEPSMGIGNFIGNLPDEMNQSKFYGVELDSISGRIAKLLYPKSNIQVKGF EETDFANNFFDVAIGNVPFGEFKVNDRDYNKNNFLIHDYFFVKSIDKVRNGGVIAFITSS GTMDKKDESIRKYINARAEFLGAIRLPNDTFKGVAGTEVTSDIIFLKKRDSILERDEDWI HLAEDENGLSYNKYFVDHPEQVLGSMREVSGRFGKTLTCEPIAFLGQENNMESLKNRIEI AGERISAEAKYEIELLDDKRVSVPVNDDVKNFSYTVIGNEVYYRENSTFVKKEVTKKDQE KIQAYLGLKEALQEVIYKQKEDFKEEEIKVSQEKLNTAFDYFYEKYGSVNSKSNGKLLKE DAGYPLVSSIEILDDKGNFKGKGDIFRKRTIKKVTVIEKVDTLQEALILSISQKGKVDFH YMEALTEKKKEEIIQELKGEIFLDINLANGNIKDIRKEAFNAFSYVTKDEYLTGNILEKI RNIEKYDEILSAFSEEEQKNYQISREILQYQKQELEKVMPKQLEANEINVQLGATWIPAD 
YIHDFIRETFRPNFFAQKQIRVSFNDYLAEWNIQGKNIDNVNSIVTMTYGTNRVNAYRLL ENALNLRDTKVFDYHEEEGKRVAVLNKKETMLAAQKQDLIKEEFKKWVFKERTRRNYLVK LYNEKFNSIRLREYDGKNLTLEGINPEIKLRTHQINAIARTLYGGNTLLAHVVGAGKTFE MVASAMESKKLGLCNKPLFVVPNHLTEQMGREFMQLYPGANILVATKKDFEPSNRKKFTA RIATGEYDAVIIGHSQFEKIPMSKEYQQTHIQQELDEMIAHISALKAKEGQSISVKQLVK TRKKLEVKLEKLNDDFKKDDVVTFEELGIDKLYVDEAHSFKNLFLYTKMRNVAGIGQTEA FKSSDMFMKCRYMDEITGGKGIVFATGTPVSNSMTELYTMQRYLQFDDLKKANLHLFDAW ASTFGETTTAIELSPEGNGYRAKTRFAKFHNLPELMTRVKQFADIQTADMLNLPTPEVEY QKILTKPTEEQKEMLATLSKRADLVREQRVEPSIDNMLKITNDGKKLALDQRLIDDSLPD DPNSKVNSCVENVYQIWKKTKIHSSTQLIFSDMSTPKGKGEFNLYDDIKKKLIAKGVPSK EIAFIHDANNEKQKEELFAKVRSGEIRVLLGSTQKMGAGTNVQNKLIALHDLDVPWRPAD LEQRAGRIVRQGNENEKVEVYRYITEGTFDAYLWQTIENKQKFISQIMTSKTPVRMAEDV DESSLNYAEIKALATGNPLIKEKMDLDLEVTKLSLLESNYKSNLYQMEEQIAHEYPQKMK ELEKNIKNTKADIFNIEAKGKTEDRFTSISLQGNTITDKKLAGAHLLMAISSSIKHLDTY EKIGNYRGFELEVAYSSISKGYVGVLKGENRYSFEFGKDALGNIQRMDNVLDKIPERLEK MTLQLEEIKDNFEKTKLEVLKPFEQEGLLKEKKQRLYELNNLLNMDDSNSKIMEEKSGLE NRYKSTEVKNPWEKKITKTNDLSR >gi|224461494|gb|ACDD01000008.1| GENE 15 17803 - 17886 70 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYQDKCDIISNKEGTEEEFLCQKKTY >gi|224461494|gb|ACDD01000008.1| GENE 16 17865 - 18641 822 258 aa, chain + ## HITS:1 COG:pli0008 KEGG:ns NR:ns ## COG: pli0008 COG3177 # Protein_GI_number: 18450294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 13 247 10 254 254 197 44.0 1e-50 MSEKNILGEMTEEYIEDMLARLAHNSSAIEGNTLSYNQTVSIVLNRTIPSGSSINIREFY EVENHRYALEYILEELKEKTSLDLYVIKEINNKLMDHLNYNKGNFKTDGNAIKGADFETA SPQETPSLMYQWVENINFRIANSKTEEEKIKVILEEHIKFERIHPFSDGNGRTGRILMLY SMLEHNLDPFIIPVERREDYYDVLRSQNTEAFYHLIQPLQIVETNRRKAFFNKEKSTIKE QKNPWERKMKEGKEGISR >gi|224461494|gb|ACDD01000008.1| GENE 17 19101 - 21392 2051 763 aa, chain + ## HITS:1 COG:no KEGG:Smon_1193 NR:ns ## KEGG: Smon_1193 # Name: not_defined # Def: MobA/MobL protein # Organism: S.moniliformis # Pathway: not_defined # 1 394 1 387 507 104 30.0 1e-20 
MANYYLCMKNGKSINVVANFSYNMGIGKYSYKKNEVVYSQHHMPRWAASPEEFWQSYSLY DRANSSYKKIELSLQDELSLEENKKLLNQFLEKNIGMDYYHSVVIHEKDSSKANAKNIHA HIMICKRREDYLERNAIQFFSRYNVKTPELGGAIVDNDYWGKKKTLLNMRENWEKMINET FLENNIHTRVSCKSLKKQKEEALEKQDLILAECLDRPPIYIQNYILKKKQKTLTEDEIDK LEYYHDCKELRDLKNEVYLLRQEQIELALLEEEYKQEENNLFKEENFKDIYDLQSSIFDI ALEKKVIEEQIANPILLHSSAIRLLNDEYVKLELQLETISKTEDSQKEMKIYEIESQMRE IEETTKEEDISKKVEILLSELQENLQKINQKETIFFQELEEKINGLESTEETEKCYQNFQ YKNWESNYMTLQSKRKEALSFRRDLQKIEKQLSQEQLDFSVYNSLTKGIYGKKIKELNGY EKQLRERKGMTTSEEERLEKMIRKVESDIISIKTQYRYAKGKNMFIRRKHMIKEKYYKEF IKIKNLLEKTELKVNFLKNSMSELPQEKQKEFKEKYQNQVREKMLKELIGKKNYLLRQKT NFEIELRDHNVNKIIHHAMTQGKSTEFIKTYNALDLKLKENISSKEMEEIKKEMNILSME YTNLLKTIDKKQFLHHKDILYEKIGAKIKNIDIEVENIEKEITHVESLLVENVHYKGSIG NIKNFTPNYVKAELGEVLYAGLIHIEAEDKFEQRMKREMDFTR >gi|224461494|gb|ACDD01000008.1| GENE 18 21425 - 21802 591 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451835|ref|ZP_05617134.1| ## NR: gi|257451835|ref|ZP_05617134.1| hypothetical protein F3_02131 [Fusobacterium sp. 3_1_5R] # 1 125 1 125 125 184 100.0 2e-45 MKIGRIGHNYGNIISGNGKVVINGKEYQGRNITIVGNQVMIDGKVYEGNEEEKAIHIVIN GDVESLDLDSCEEISIYGNAKIVKTVSGNVKCNVIEGNVSTTSGDVITKEIKGSVSTMSG DIIES >gi|224461494|gb|ACDD01000008.1| GENE 19 21822 - 22088 314 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451836|ref|ZP_05617135.1| ## NR: gi|257451836|ref|ZP_05617135.1| putative phage N-6-adenine-methyltransferase [Fusobacterium sp. 
3_1_5R] # 1 88 1 88 88 161 100.0 1e-38 MKQVITFRSFTEFFEKEKSGLKCNTVRLFTLSDDREYILQDIMNEDIIQPYASEIRFIRG RLKFGNSKNSAPFPSMIVVFREGVNKTK >gi|224461494|gb|ACDD01000008.1| GENE 20 22085 - 22477 429 130 aa, chain + ## HITS:1 COG:no KEGG:LC705_00863 NR:ns ## KEGG: LC705_00863 # Name: not_defined # Def: hypothetical protein # Organism: L.rhamnosus_Lc705 # Pathway: not_defined # 2 130 3 130 130 70 39.0 2e-11 MREIKFRAFLKKDKCICKVLAVELNKGLDGTLQVEYPDGAKITLNLCSVELMQYTGMQDH KETEIYEGDIVSMEAMTPGASNIVGEVQFLECGYWVVREKEKKAVCLFEEGVYIKKLGNI YENHELLEEK >gi|224461494|gb|ACDD01000008.1| GENE 21 22481 - 22645 321 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451838|ref|ZP_05617137.1| ## NR: gi|257451838|ref|ZP_05617137.1| hypothetical protein F3_02146 [Fusobacterium sp. 3_1_5R] # 1 54 1 54 54 91 100.0 1e-17 MTFTAFILGIGVGFILGLSMNMDDETEQPMIPDTPRPEAPKPMPKTYKGSGKNR >gi|224461494|gb|ACDD01000008.1| GENE 22 22682 - 22780 199 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIEGNVSTTSGDVITKEIKGSVSTMSGDIIER >gi|224461494|gb|ACDD01000008.1| GENE 23 22816 - 23481 611 221 aa, chain + ## HITS:1 COG:BS_yomI_3 KEGG:ns NR:ns ## COG: BS_yomI_3 COG0741 # Protein_GI_number: 16079194 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Bacillus subtilis # 62 186 110 233 240 122 53.0 4e-28 MKKIISLVFIVSSISFANDINGIEFSNNIEEQRIENVDLTENTKEEKKDLLYFILNNNPD KGKYIHEKIIQYSSIYNVDPAVIVSIIKRESNFKIEAESHVGAIGLMQLMPGTAKSMGIK NPWDIEENIKGGVRYFKLCLEKNKGNIPLALASYNAGIGAVLKYRGIPPYKETKNYVNLV LNTLSKITGGGQYTYNEVEFEKALKGIFSEISFGNAEYGEI >gi|224461494|gb|ACDD01000008.1| GENE 24 23498 - 23800 313 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451841|ref|ZP_05617140.1| ## NR: gi|257451841|ref|ZP_05617140.1| hypothetical protein F3_02161 [Fusobacterium sp. 
3_1_5R] # 1 100 1 100 100 112 100.0 6e-24 MSDIFKINRQLSVTSTKIKFLEQKISLKKEYKKKMSKDVRKMRAHKLITKGALLEMLNME NEDNEVLLGFFSSFNKEEKEIYKKIGKEIFDENKRKKKMK >gi|224461494|gb|ACDD01000008.1| GENE 25 23827 - 24135 453 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451842|ref|ZP_05617141.1| ## NR: gi|257451842|ref|ZP_05617141.1| hypothetical protein F3_02166 [Fusobacterium sp. 3_1_5R] # 1 102 1 102 102 145 100.0 7e-34 MAKEKEIYDTTSIMFAKKLREIRNYKNMTIEQVAKKSGVSRSYITDLENARGYLISKEKL ERILQALKPIPQKDKEELFLYYLEKYVPAEILKTMLKEKNEE >gi|224461494|gb|ACDD01000008.1| GENE 26 24353 - 25081 586 242 aa, chain + ## HITS:1 COG:VC2773 KEGG:ns NR:ns ## COG: VC2773 COG1192 # Protein_GI_number: 15642766 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 4 187 2 203 257 72 28.0 1e-12 MERGKIIQIKISKGGVGKTFLTVQISSLIAALGKKVLVITTDPQNDVLGMMKEEISENLY KDGGLKEIVLHNKKNVIRLRENLFYIPLEFDGYFSSEFFKRLPDFLSSMRKEYDFIFIDS NPTPRTDKAFLELADYIIIPTLGSARSIQGVVSVLETINPEKILAIVFNQYRKTRLEKEV YNELKEYCNLLGFSDLVEKPIPYSSKIEELVAKRKTIWESQDKKIKEIQDILMNIVEKIY SL >gi|224461494|gb|ACDD01000008.1| GENE 27 25095 - 25766 526 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451844|ref|ZP_05617143.1| ## NR: gi|257451844|ref|ZP_05617143.1| hypothetical protein F3_02176 [Fusobacterium sp. 3_1_5R] # 1 223 1 223 223 365 100.0 2e-99 MGNIDLSKLKSKLEVLKIESEGKENFLDLIKKEDIPEVELTDSRELNIFLQQQYLIFLNL AQNSALSFGRLLENVFQKLKELDGEVTYCKFLNLIGTNRMTALRYRRRVKLYDSVDDNRK KEILLMRDDFIARLYQLNDIEGVISYINDGAKKEDLEEWIAEIASEKKVSALVPEEESKP MITFSSYKDDIISNFSKIETLSREKQEEVEKYLKKINKILQGE >gi|224461494|gb|ACDD01000008.1| GENE 28 25771 - 25950 289 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451845|ref|ZP_05617144.1| ## NR: gi|257451845|ref|ZP_05617144.1| hypothetical protein F3_02181 [Fusobacterium sp. 
3_1_5R] # 1 59 1 59 59 66 100.0 4e-10 MAIEIEVLEKKKETVTFYTTKEIKNILKKIAKEKNGSVSYVINAVLVGVFKEELENEEN >gi|224461494|gb|ACDD01000008.1| GENE 29 26202 - 26381 168 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451846|ref|ZP_05617145.1| ## NR: gi|257451846|ref|ZP_05617145.1| hypothetical protein F3_02186 [Fusobacterium sp. 3_1_5R] # 1 59 1 59 59 66 100.0 7e-10 MLSKYSDILTVKELQEILRYKKEKVYKILQNKILPSIRLGKKYLIAKEDVIKYIKSNKK >gi|224461494|gb|ACDD01000008.1| GENE 30 26408 - 27601 1026 397 aa, chain + ## HITS:1 COG:L36404 KEGG:ns NR:ns ## COG: L36404 COG0582 # Protein_GI_number: 15672990 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 195 374 178 354 374 74 31.0 4e-13 MKNTLVNGRLRERNGVYHVVLYWYDETNKQRTHSFSTKIRTDEKKARKRAEEILLKARVE FLPQKEKEQIVRSNLYVEEYFELMLHKYIVLRELAISSGTTVKGHYKRVYSFFHGKNIAL KDLNKTHVEDFLFYLLSVRKLNPSTVNNSYAFFKAICNIGIEEDLISFKIFTGIKYTFKK KKKDYTTLEKNFIRDFLQCLKKEEYALELLLFLFYGCRTGETLGITFSSINFIENTIKIK RSLVWNRENGEYYVNEFLKTNSSRRILPLFEEMKGLFLERKERIEKNKKFFKKAYDNTWN DYICVRDDGKMIKYYALRNFLESFCKKYNFPRLTPHSLRHTFATIMHDEGMDLKDLQMWL GHASIKMTADTYVHSNITKNSNAVSHVRETMIIKNIS >gi|224461494|gb|ACDD01000008.1| GENE 31 27666 - 27860 310 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451848|ref|ZP_05617147.1| ## NR: gi|257451848|ref|ZP_05617147.1| hypothetical protein F3_02196 [Fusobacterium sp. 
3_1_5R] # 1 64 1 64 64 72 100.0 1e-11 MLKKMGRPKIKEENKAKYEAAAFFNKEDYTLLQEIAIQKKTKPGILVKKIVKEWLKMQIK EREK >gi|224461494|gb|ACDD01000008.1| GENE 32 27989 - 28441 620 150 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3616 NR:ns ## KEGG: Cbei_3616 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.beijerinckii # Pathway: not_defined # 7 150 83 226 226 166 56.0 2e-40 MHDLKIKGIIEFLIKAKKATYAGDGNTTNSSRPCSHDLKYFEDNFEYYDSYFGTDPFIGE EALWSNGEIIWSMNYAGRKLDKEFEYGFLKEALLLVNSENPFRGPDEYSNGDYKYMCETI GDFEWFQGYESITFQGKLVYECYYHGGIVR >gi|224461494|gb|ACDD01000008.1| GENE 33 28466 - 28957 282 163 aa, chain + ## HITS:1 COG:no KEGG:BVU_1494 NR:ns ## KEGG: BVU_1494 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 163 3 166 166 151 48.0 7e-36 MEFYKALVGEKKQKKNIFSDLIGEWDIEWVDGKGTEKERHVIGEWIFSEILNGDGIQDIF ICPSREERVSNPQPDAEYGTTIRMYNPNKQKWDICYTCLGKMVYLEAEKVEDKIVLTNIS NDKGINLWVFDDISPSKFHWENKTSFDNGKTWITNGEVFAKRR >gi|224461494|gb|ACDD01000008.1| GENE 34 29172 - 29831 357 219 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 22 200 157 326 591 110 35.0 2e-24 MLTQTERLTMNGRPANPNYARNKNVLVIGGSGSGKTRFYVKPNLMQMHSSYCVTDPKGTI VIECGKMLEDNGYEIKILNTINFKKSMKYNPFAYIRSEKDILKLVQTIIANTKGEGEKAG EDFWVKAEKLYYTALIGYIFYEAPREEKNFATLLDMIDASEVREDDETYMNPIDRLFEAL EKKEPTHFAVKQYKKYKLAAGVIELRRTLNHYFSEICTS >gi|224461494|gb|ACDD01000008.1| GENE 35 29871 - 31709 949 612 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 5 299 6 301 301 321 56.0 2e-87 MRNFEKITALYERLSRDDELQGESNSIINQKKILEEYASKNNLSNIIHFTDDGISGTQFD 
RPGFMAMMNGVNQGNIGCIIVKDMSRLGRDYLKVGQCMEILRQKGVRLIAINDNVDSFYR EDDFTPFRNIMNEWYARDTSRKIQSTFRSKGESGKHTASTPPYGYIKDEKDKNKWIVDEK AAEIVRRIFNLTMQGNGPYRIAKILESEKVDIPAYHQQKLGYGLYQSKNFEHPYRWCSST IASILKKQEYLGHTVNFKTRKHFKDKKSKYVSEDNWLIFENTHEPIIDQETFDNVQRIRG NVKRYPDGWGEYHPLTGLMYCADCGSKMYVHRTSNYKNIPYYTCSAYTKTPCGMLCPSAH RIKAEVVLNLIQDTLKDIKKYLDEDNEAFICSIQNEMEEKEKIEIEKKKVRLTESQNRLR ELERLMCRIYEDMILNKIPNSRYEILNNQYETEQITLSKEIKDLEQQVSRYEKETDRARK FISLISRYENFDELTTTMINEFVEKIIVHERDRKGSQTSKQKIEIYFNFIGNYELPQAEL SEEKKQKLEEEERKIEERKDKLHQNYLKRKASGKQKEYEDKYKARREQKKQEKIKVLKRV GIPVSEYIKTNI Prediction of potential genes in microbial genomes Time: Fri May 20 01:47:34 2011 Seq name: gi|224461493|gb|ACDD01000009.1| Fusobacterium sp. 3_1_5R cont1.9, whole genome shotgun sequence Length of sequence - 9520 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 8.7 1 1 Op 1 . + CDS 97 - 669 563 ## COG1309 Transcriptional regulator 2 1 Op 2 . + CDS 720 - 1238 508 ## COG4283 Uncharacterized conserved protein + Term 1292 - 1326 2.1 + Prom 1292 - 1351 4.8 3 2 Op 1 35/0.000 + CDS 1379 - 3118 188 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 4 2 Op 2 . + CDS 3111 - 4820 190 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 5 2 Op 3 . + CDS 4823 - 4993 107 ## gi|257451857|ref|ZP_05617156.1| hypothetical protein F3_02241 6 2 Op 4 . + CDS 5033 - 5629 644 ## TDE0348 TetR family transcriptional regulator 7 2 Op 5 . + CDS 5654 - 8134 2160 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase 8 2 Op 6 . 
+ CDS 8103 - 9494 1173 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|224461493|gb|ACDD01000009.1| GENE 1 97 - 669 563 190 aa, chain + ## HITS:1 COG:FN0473 KEGG:ns NR:ns ## COG: FN0473 COG1309 # Protein_GI_number: 19703808 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 190 1 189 189 161 50.0 7e-40 MAQVLKEEVRNRILEAAEKVFYKKDYRGAKLTEIAKEADIPVALIYTYFKNKAVLFDAVV SSVYINFESAFNEEESLEKGSASERFDEVGENYIHELLKDRKKLIILMDKSSGTKHTEAK QKLISQMQVHIEVSLKRQSKQEYDPMLAHILASNFTEGLLEIARHYQSEKWAKDMLKLIA KCYYKGVESL >gi|224461493|gb|ACDD01000009.1| GENE 2 720 - 1238 508 172 aa, chain + ## HITS:1 COG:SP0939 KEGG:ns NR:ns ## COG: SP0939 COG4283 # Protein_GI_number: 15900819 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 172 1 172 172 257 82.0 8e-69 MKIYKDKEELKSEINKSFEKYISEFDIIPESLKDKRVPEVDRTPAENLAYQLGWTTLVLK WEKDEKNGFEVKTPSDMFKWNQLGELYQWFTDTYAHLSIEELKKRLKENIISIYTMIDTL SEEELFQPHMRKWADEATKTATWEVYKFIHVNTVAPFGTFRTKIRKWKKIVL >gi|224461493|gb|ACDD01000009.1| GENE 3 1379 - 3118 188 579 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 346 549 40 249 329 77 27 5e-14 MPEKELRKKVIGKNGLSNSLLALKIVFDLIPQILLVYLISSLITNNISEDNLKYIFLGIF TSFVLKGVFYYFATKVAHDKAYEKLTELRLDIIGHLKKLSLGFFKEHNTGELTNIVQHDV EQVEVYLAHGLPEIMSVTLLPTIIFVAMIFVDWRLALGMIAGVPLMYLVKVLSQKTMDKN FAIYFNHENKMREELMEYVKNISVIKAFTKEEEISERTLKTAREYIYWVKKSMGMVTIPM GLIDIFMEIGVVIVMILGSIFLYYGNITTPNFILAIILSSAFTASISKTATLQHFSIVFK EALKAIGKVLTVPLPKKKTEQGLEFGNIEFKDVNFAYGKDSFELKNINLNFKKNSLNAFV GASGCGKSTVSNLLMGFWDADEGQILINGKDIKEYSQENISMMIGSVQQEVILFDLSIFE NIAIGKLNATKEEVIEAAKKARCHDFISALPNGYETRVGEMGVKLSGGEKQRISIARMIL KNAPILILDEAMAAVDSENERLIGEAIDDLSKDKTIITIAHHLNTIRDSDQIIVMDKGVV LDAGSHEELMKRCDFYKDMVEAQNKVDRWNLKEVVTENV >gi|224461493|gb|ACDD01000009.1| GENE 4 3111 - 4820 190 569 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 347 554 38 251 329 77 29 3e-14 MFREMLKLLTKTGKRDLIISSVFFALYGLSSIAMIVIVFSILFQIFDGTSLASLYKYFIA IGLLVVFKGICNMVADMKKHSAGFDIVQQIRERMIIKLKKFSLGFYTNERLGEINTILHK DVDNMSLVVGHMWSRMFGDFLIGAVVFIGLASIDLKLAILMAVSVPIALIFLYLTIKQSE KIENQNNSALLDMVSLFVEYVRGIPVLKSFSNNKSLDNELMNKTKKFGETSKAASRFKAK QLSIFGFLLDIGYLLLLISGAILVIKGNLDVLHFIIFAVISKEFYKPFASMEQHYMYYVS AVDSYERLSRILYADVIPDKVNGIVPEDNDIAFENIDFSYEKDEFKMEKLSFSIAEKTMT ALVGESGSGKTTITNLLLRFYDVHKGKITLGGTDIRDIPYDELLDRISIVMQNVQLFDNT IEENIRVGKKGATKEDIIKAAKKARIHDFIMSLPKGYETDIGENGGILSGGQRQRISIAR AFLKDAPILILDEMTSNVDPVNESLIQDAITELAKNRTVLVVAHHLKTIQKADQILVFQK GNLLEKGKHGELLAKNGYYTKLWKAQYEV >gi|224461493|gb|ACDD01000009.1| GENE 5 4823 - 4993 107 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451857|ref|ZP_05617156.1| ## NR: gi|257451857|ref|ZP_05617156.1| hypothetical protein F3_02241 [Fusobacterium sp. 3_1_5R] # 1 56 1 56 56 97 100.0 3e-19 MILLKNVYYEWEDGRTALKNVNLEIKKDLPFSINEYFKKYSLTFSFIGVRISMNIK >gi|224461493|gb|ACDD01000009.1| GENE 6 5033 - 5629 644 198 aa, chain + ## HITS:1 COG:no KEGG:TDE0348 NR:ns ## KEGG: TDE0348 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: T.denticola # Pathway: not_defined # 1 197 1 197 198 233 64.0 2e-60 MAKAFTEEEKIKIKEKIMETALDLFHDKGTKSLSISELTKRVGIAQGSFYNFWKDKESLI LDLMAYRVIQKLDDIEQKIPYSLENPKKFLSDIIYKGSVDLAKKIRKQSMYKDAFKIFLV HDFKEGSRIETLYRDFLDRLAEYWEQNNVVRSVDKKGLANAFIGSFILCCNNEHFNEETF DEVLYIYISGIVSKYIEI >gi|224461493|gb|ACDD01000009.1| GENE 7 5654 - 8134 2160 826 aa, chain + ## HITS:1 COG:BS_pps KEGG:ns NR:ns ## COG: BS_pps COG0574 # Protein_GI_number: 16078943 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Bacillus subtilis # 1 823 4 863 866 295 28.0 2e-79 MILDFQEIKKEDILIAGGKGANLGEMTSAKINVPNGFVITAKEYQDFLKVNGIDVLIENE IQKVGNKEDILLKIARDVREKIKYGKFPKEMENRIREKYLNFGENTRVAIRSSATAEDLP 
DASFAGQQDTYLNVQGLENVFHQIQNCYASLWGNRAVSYRFRQGYSQNAVSIAVVIQEMV ESEKAGVLFTVNPVNKKENEMHINANFGLGESVVSGKVTADTYIVDKSGNIMEVNIGTKE TQIIYGEKGTIEVAVREDKRKNRVLNDVEISKLIKYGLEIENHYGMPMDIEWAMKDDVIY ILQARAITTLANTEKSMVEDTLVEQYIKKQKIKKDTQEMMAFFLEKIPFAYRALEFDYLM AISNQKANILREVGIVFPKNPIIDNDGIQTFSDRGKRINRNIFQFFKFLKNMKDFDTCYQ KCNDFMKIYESKIENMKELNFEIMTLEECKKFMEESYTILQKLAYDRFKYALFPSVLNSK KLNKIIKKVNTTYSSFDFYWNLNNKTSVVTDDIYKLASKIRKNQNLKREIISGEDFQTLY EKYDNFRVLIDKFMKENGFKSDYNCYCLSAKTFKEDPNRLLNILRPLLNADENNDERKQS KDFLKLMQDMKEIYGNKYSDIEKEVMYFRYFHLVREESQYLWETLFYYVRQCVKRINSIL LGSENYEIGIANLFYQELLEAMKRGELNIADKEKISRRNQKFPLATKVWESSKSLIFKTK GDVLKGISGSVGIAVGKVCVINSPKEFYKMKKGDILVCHFTDPEWTPLFTLANAVVADTG SALSHAAIVAREYNIPAVLGVGFATTKFKDGDMVQVDGNTGIVTGC >gi|224461493|gb|ACDD01000009.1| GENE 8 8103 - 9494 1173 463 aa, chain + ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 36 458 23 448 475 196 32.0 9e-50 MGILVSLQVVNMNSKKEIAEKEVRRLKILNQPILLLLIKMSIPTIFGMLITVLYTLTDTF FIGLLNHKSMTAAIGIVFSFTSMIQAIGFWFGYGSGNIMSKKLGEQDEKEATIISSLGIG FSILSGILIATLSWIFISDLSKLIGGNASESLLAFTMQYLKIMIVSIPFSLYSITLYNQL RLCGNVKDGMVGLLLGMFSNMILDPIFIFVFELGFTGAGYATLAGQIIACIFLTMLAKRN GNIPVSLKNVKYNKERIYHILVGGMPNFSRQVITSISLILVNRIAASFGDSLIAALTISS RIVAIAYMIMIGWGQGFQPICAMNYGAKKYDRVKSAFQLTVVVGTIFLILSAILLYLFSE DFIKIMSKDGEVILLGGQILRMQCISIPLLGYLAVASMFMQNTGKYFCSLFISISRQGIF YIPLLYLLVHCYGEFGIYLLQPVSDIFSFVLAVYIVHRNNEYI Prediction of potential genes in microbial genomes Time: Fri May 20 01:47:42 2011 Seq name: gi|224461492|gb|ACDD01000010.1| Fusobacterium sp. 3_1_5R cont1.10, whole genome shotgun sequence Length of sequence - 2782 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 79 143 ## + Prom 85 - 144 7.2 2 2 Op 1 . 
+ CDS 219 - 398 202 ## SZO_12650 regulatory protein 3 2 Op 2 . + CDS 418 - 1845 1191 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 1870 - 1915 3.6 4 3 Op 1 . + CDS 1932 - 2144 272 ## CD1862 putative conjugative transposon DNA recombination protein 5 3 Op 2 . + CDS 2193 - 2781 610 ## CD1862 putative conjugative transposon DNA recombination protein Predicted protein(s) >gi|224461492|gb|ACDD01000010.1| GENE 1 2 - 79 143 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KKLLGFKSPNEVLEEYRREKKVELS >gi|224461492|gb|ACDD01000010.1| GENE 2 219 - 398 202 59 aa, chain + ## HITS:1 COG:no KEGG:SZO_12650 NR:ns ## KEGG: SZO_12650 # Name: not_defined # Def: regulatory protein # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 59 1 59 59 86 88.0 2e-16 MKDTFMDTAKVMSKGQVTIPKRIRELLDLQNGDYVTFVVNKDKVQIQNSKIFIEENIDK >gi|224461492|gb|ACDD01000010.1| GENE 3 418 - 1845 1191 475 aa, chain + ## HITS:1 COG:FN0191 KEGG:ns NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 1 475 1 476 477 717 76.0 0 MTIDEIKKLIQNGEKIDVEFKESRNALTKDVFDTVCSFNNRNGGHIFLGVNDKRDIVGVS EDKVDKIIKEFTTSINNSQKMYPPLYLLPEVFEIDSKKVIYIRVPEGYQVCRHNGRIWDR SYEGDINITDHAELVYKLYARKQGSYFVNKVYPSLDIEFLDTDVIDKAKRMAVARNKNHV WENMSYEKLLRSANLILTDPETKHEGITLAAILLFGKDNSIMSVLPQHKTDAIFRVENKD RYDDRDVVITNLIDSYDRLIAFGQKHLNDLFVLDGIVNVNARDRILREIVSNTLAHRDYS SGFPAKMIIDDEKIMIENSNLAHGMGALDLQKFEPFPKNPAISKVFREIGLADELGSGMR NTYKYTRLYSGVDPLFEEGDIFRTIIPLKKIATQKVGGNGVAQDVAHSVAQDVAHDKIAL AEFIKEKIRGNNKITRKAIADEAGVSVKTIERTIKEMDNLQYVGSGSNGHWELNE >gi|224461492|gb|ACDD01000010.1| GENE 4 1932 - 2144 272 70 aa, chain + ## HITS:1 COG:no KEGG:CD1862 NR:ns ## KEGG: CD1862 # Name: not_defined # Def: putative conjugative transposon DNA recombination 
protein # Organism: C.difficile # Pathway: not_defined # 1 70 19 88 3011 136 94.0 2e-31 MRINDFHNILELIKQDVLQSEAEYLKLLKVVGNNQKYDFRSQLSIYDKNPEATACAKFDY WREHFNRTVM >gi|224461492|gb|ACDD01000010.1| GENE 5 2193 - 2781 610 196 aa, chain + ## HITS:1 COG:no KEGG:CD1862 NR:ns ## KEGG: CD1862 # Name: not_defined # Def: putative conjugative transposon DNA recombination protein # Organism: C.difficile # Pathway: not_defined # 1 151 106 256 3011 247 91.0 1e-64 MDYIFDISQTVSKNRDVNEVNLWRFDKETHRDVLKELIKAEGYEESDSTLENIFSLSRLD GDEKIDSLMNELRISDEDRISFTKFARDSVSYAVASRFKLDYPMDKELLKENFAMLDSIS LMSLGETVSDISGTVIDATIQKSKELELQKEADKTPDGYAETSNRIYEDREAETDHGLED RGREPSAVPSDDFSPQ Prediction of potential genes in microbial genomes Time: Fri May 20 01:47:53 2011 Seq name: gi|224461491|gb|ACDD01000011.1| Fusobacterium sp. 3_1_5R cont1.11, whole genome shotgun sequence Length of sequence - 1970 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 20 - 313 406 ## COG4939 Major membrane immunogen, membrane-anchored lipoprotein - Prom 365 - 424 13.6 + Prom 309 - 368 15.7 2 2 Tu 1 . + CDS 413 - 1636 1143 ## COG1301 Na+/H+-dicarboxylate symporters + Term 1664 - 1715 15.3 - Term 1656 - 1699 7.2 3 3 Tu 1 . 
- CDS 1709 - 1969 316 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461491|gb|ACDD01000011.1| GENE 1 20 - 313 406 97 aa, chain - ## HITS:1 COG:FN1351 KEGG:ns NR:ns ## COG: FN1351 COG4939 # Protein_GI_number: 19704686 # Func_class: S Function unknown # Function: Major membrane immunogen, membrane-anchored lipoprotein # Organism: Fusobacterium nucleatum # 16 97 59 140 140 80 50.0 6e-16 MYLVYSCQTYVTNILKNFSCEVVFKDGDGKIKDENYGKEFEGEKFEQAKIAIQACQLYPE TLVEVQDPEKIEIIAGATHSQAEFKEAVWNALKKAKK >gi|224461491|gb|ACDD01000011.1| GENE 2 413 - 1636 1143 407 aa, chain + ## HITS:1 COG:BH3820 KEGG:ns NR:ns ## COG: BH3820 COG1301 # Protein_GI_number: 15616382 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Bacillus halodurans # 8 401 1 401 413 272 44.0 8e-73 MKTKKLQLSLTTQIFLALVLAIIVGVCLTKTPEIAADYIAPFGKIFLNLIKWIVCPLVFF SIMSGVISLQDIKKIGSIGGKTLFYYLCTTAFAVAIGLFFANSFKGIFPILATTNLSYDA TASVSFMENIVNIFPKNFIAPFADANMLQVIVSSLFIGFAIIGVGNSAKRVVDAINIVND IFVMGMEMILKLSPIGVFCLLCPVVAKNGPAIIGSLAMVLFVAYICYIVHAVLVYSLSVK VFAGINPITFFKGMMPAILFAFSSASSVGTLPLSMECTEKLGAKKEISSFILPLGATINM DGTAIYQGVCSVFIASCFGIDLTLSQMITIVLTATLASIGTAGVPGAGMVMLAMVLQSVG LPVEGIAIVAGVDRLFDMGRTTVNITGDAACCMVINAMEARKERRKV >gi|224461491|gb|ACDD01000011.1| GENE 3 1709 - 1969 316 86 aa, chain - ## HITS:1 COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 2 68 981 1047 1071 71 52.0 4e-13 EATAVRKISEEAPNLLDLIKNREVDLLINTPTKANDSQRDGFKIRRSAIEYGVEVLTSLD TMKAIIKMQDRNLKEESLDVFDISKI Prediction of potential genes in microbial genomes Time: Fri May 20 01:47:53 2011 Seq name: gi|224461490|gb|ACDD01000012.1| Fusobacterium sp. 
3_1_5R cont1.12, whole genome shotgun sequence Length of sequence - 1342 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 1108 1330 ## COG0505 Carbamoylphosphate synthase small subunit - Prom 1136 - 1195 11.4 Predicted protein(s) >gi|224461490|gb|ACDD01000012.1| GENE 1 35 - 1108 1330 357 aa, chain - ## HITS:1 COG:BS_pyrAA KEGG:ns NR:ns ## COG: BS_pyrAA COG0505 # Protein_GI_number: 16078615 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Bacillus subtilis # 1 355 1 356 364 363 48.0 1e-100 MKGKLILENGMVFNGTVFGEIGETVGELVFNTGMTGYQELLTDPSYYGQMVVMTYPMIGN YGVNLEDMESDKIHLRALIIKEEAKLPNNFRCEMSLDGFLRQNKVIGFKSVDTRYLTKII RDCGAMKGIITSKDLTKKEIEEKFSSYQNKDAVAQVSSKEIYEIPGKGLRLGFMDFGAKA NILRNFQKRDCHLVVFPWNTSAEKIMEYHLDGVFLSNGPGDPADLQNVIAEIKKLIEKKM PIVGICLGNQLTAWALGGTTKKMKFGHRGGNHPVKDLDHNRIYITSQNHGYAIDKIPEKA RVSHVSMNDGTVEGLKCDDLHIMTVQFHPEAWPGPTDCEYLFDEFLEVIKGAKKDVR Prediction of potential genes in microbial genomes Time: Fri May 20 01:48:00 2011 Seq name: gi|224461489|gb|ACDD01000013.1| Fusobacterium sp. 3_1_5R cont1.13, whole genome shotgun sequence Length of sequence - 20334 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 10, operones - 6 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 68 - 127 11.1 1 1 Op 1 4/0.000 + CDS 159 - 1424 1173 ## COG4393 Predicted membrane protein 2 1 Op 2 10/0.000 + CDS 1426 - 2706 1403 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 1 Op 3 36/0.000 + CDS 2716 - 3918 1422 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 4 1 Op 4 1/0.000 + CDS 3922 - 4596 288 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 5 1 Op 5 . 
+ CDS 4614 - 5036 679 ## COG4939 Major membrane immunogen, membrane-anchored lipoprotein 6 1 Op 6 . + CDS 5109 - 5549 523 ## FN1350 integral membrane protein 7 1 Op 7 36/0.000 + CDS 5553 - 6758 1250 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 8 1 Op 8 . + CDS 6767 - 7450 261 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 9 1 Op 9 . + CDS 7483 - 8091 255 ## Apar_1241 hypothetical protein 10 1 Op 10 . + CDS 8095 - 8610 588 ## BHWA1_01954 hypothetical protein + Term 8616 - 8671 11.8 - Term 8608 - 8655 7.5 11 2 Tu 1 . - CDS 8700 - 10130 1245 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 10164 - 10223 12.6 + Prom 10226 - 10285 14.2 12 3 Op 1 5/0.000 + CDS 10333 - 11640 1779 ## COG0672 High-affinity Fe2+/Pb2+ permease 13 3 Op 2 . + CDS 11685 - 12356 1236 ## COG3470 Uncharacterized protein probably involved in high-affinity Fe2+ transport + Term 12389 - 12451 8.0 - Term 12383 - 12433 4.5 14 4 Tu 1 . - CDS 12466 - 13008 337 ## gi|257451881|ref|ZP_05617180.1| hypothetical protein F3_02371 - Prom 13038 - 13097 4.9 15 5 Op 1 . - CDS 13142 - 14017 893 ## FN0721 hypothetical protein 16 5 Op 2 . - CDS 14062 - 14949 447 ## FN1938 hypothetical protein 17 5 Op 3 . - CDS 14952 - 15395 258 ## gi|257451884|ref|ZP_05617183.1| hypothetical protein F3_02386 - Prom 15575 - 15634 5.2 18 6 Tu 1 . - CDS 15745 - 16446 333 ## EUBELI_00033 hypothetical protein - Prom 16500 - 16559 5.0 19 7 Op 1 . - CDS 16564 - 17079 528 ## gi|257451886|ref|ZP_05617185.1| hypothetical protein F3_02396 20 7 Op 2 . - CDS 17100 - 18113 328 ## TDE0547 hypothetical protein - Prom 18276 - 18335 13.4 + Prom 18205 - 18264 8.9 21 8 Op 1 . + CDS 18322 - 18489 191 ## gi|257451888|ref|ZP_05617187.1| hypothetical protein F3_02406 22 8 Op 2 . + CDS 18499 - 18816 268 ## gi|257451889|ref|ZP_05617188.1| hypothetical protein F3_02411 + Prom 18838 - 18897 2.0 23 9 Op 1 . 
+ CDS 18963 - 19319 389 ## gi|257451891|ref|ZP_05617190.1| hypothetical protein F3_02421 + Prom 19327 - 19386 2.3 24 9 Op 2 . + CDS 19416 - 19940 399 ## TDE0510 hypothetical protein + Term 19975 - 20024 14.2 25 10 Tu 1 . - CDS 20115 - 20333 215 ## COG3464 Transposase and inactivated derivatives Predicted protein(s) >gi|224461489|gb|ACDD01000013.1| GENE 1 159 - 1424 1173 421 aa, chain + ## HITS:1 COG:FN1355 KEGG:ns NR:ns ## COG: FN1355 COG4393 # Protein_GI_number: 19704690 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 125 421 1 298 298 433 72.0 1e-121 MLKFYIDVVSFLTIFAFLVGIILAFLKEEKKMFLNILMLVISVLGISLTTAMIVFKQLYP QKMVKISLFYNRLALSMGMIFILLSILFFIFTLVQKRKMLPLVIITSGLATYFLAFTVFP QVYALTKEFIAFGEDSFGTQSLLRLGGYLLGVLTVVVMGLSIYKMYFRFHLPQRKVFALF IFLIVSLDFILRGVSALARLRFLKASNPFVFQVMILEDKGNLPIFVMFLVAFVFSVLLFL ENLKVKGSFKNRAMLRKEKARLKNNRAWSITLCFMSILVVLSVTLVHSYINKPVELTPAQ PYQEEGNKIIIPLTDVEDGHLHRFSYKATGGNDVRFIVVKKPKGGSYGVGLDACDICGVA GYYERNDDVICKRCDVVMNKSTIGFKGGCNPVPFEYEIVNKKIIIDKAVLEQEKDRFPVG E >gi|224461489|gb|ACDD01000013.1| GENE 2 1426 - 2706 1403 426 aa, chain + ## HITS:1 COG:FN1354 KEGG:ns NR:ns ## COG: FN1354 COG0577 # Protein_GI_number: 19704689 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 426 3 428 428 700 83.0 0 MFWRMVRGTLFRQRNKMVMIAFTVALGVSLATAMMNVMLGVGDKVNKELKTYGANITVMH KDASILDDLYGIHGEDVSDKFLLEEEIPKVKQIFWGFNIVDFAPYLERSIEVKGFLEKVK IYGTWFHHHLVMPTGEELDTGIKNLKNWWEVKGEWLEEEDENEIMLGSLLAGKYNYQVGD TLEFTSDSGIKKLKIKGIFNSGGDDDSSIYANLKTVQDLFDLKGKISLLEVSALTTPDND LAKKAAQDPNSLTISEYETWYCTAYVSSISYQLQEALTDSVAKPNRQVAESEGTILNKTE LLMLLICILSSFASALGISNLITASVIERSQEIGLIKAIGGTSTRIILLILTEIVLSGIF GGIFGYVAGIGFTQVIGKTVFSSYIEPAIIVIPIDIALVFAVTILGSIPAIRYLLALKPT EVLHGR >gi|224461489|gb|ACDD01000013.1| GENE 3 2716 - 3918 1422 400 aa, chain + ## HITS:1 COG:FN1353 KEGG:ns NR:ns ## COG: FN1353 COG0577 # Protein_GI_number: 19704688 # Func_class: 
V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 400 1 400 400 594 84.0 1e-169 MTKRKMYMKLVLNSLIRRKARMIVALLAIAIGATIMSGLVTIYYDIPRQLGKEFRSYGAN FVVLPSGNEKISEEEFQNLKSKIKVHNVVGIAPYRYETTKINQQPYILTGTDMIEVKNNS PFWYIEGEWTTNEDTENVMIGKEISKKLNLQIGDSFTVEGPKAGTKVVASKQSDSAEESK KKDFGSNFYAKKLTVKGIITTGGAEESFIFLPITLLDEILEDVIQIDGIECSVEADSKQL ELLAENLESYDNNIIARPVKRVTQSQDIVLGKLQVLVLLVNIVVLVLTMISVSTTMMAVV AERRKEIGLKKALGAYNSEIKKEFLGEGSALGFIGGVLGVGLGFIFAQEVSLNVFGRAIE FQWLFAPITVIVSMLITTLACLYPVKKAMEIEPALVLKGE >gi|224461489|gb|ACDD01000013.1| GENE 4 3922 - 4596 288 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 8 224 5 221 311 115 33 2e-25 MEEREVLLEVKNVSKIYGDLHALKDVNLTVRKGEWVAIMGSSGSGKSTMMNIIGCMDKPS VGEVILDGQNITKETQKSLTEIRREKIGLIFQQFHLIPYLTALENVMVAQYYHSIPDEQE ALDALEIVGLKERAHHLPSQLSGGEQQRVCIARALINSPEIILADEPTGNLDETNENIVI NILKKLHTEGTTIIVVTHDAEVGEAAERKIILDYGKIVDDIYLK >gi|224461489|gb|ACDD01000013.1| GENE 5 4614 - 5036 679 140 aa, chain + ## HITS:1 COG:FN1351 KEGG:ns NR:ns ## COG: FN1351 COG4939 # Protein_GI_number: 19704686 # Func_class: S Function unknown # Function: Major membrane immunogen, membrane-anchored lipoprotein # Organism: Fusobacterium nucleatum # 1 140 1 140 140 154 57.0 4e-38 MKKKMWILMFAMSALMIACGKKEFSNMSFQDGNYAGEYISEDSEHKDSCEVALEIKDNKI ISCEAVYKDAKGNIKDEHYGENAGEEKFAKAQLAIEGFEKYSDMLLEVQDPEKVDSIAGA TVSNKEFKMAVWNALEKAKK >gi|224461489|gb|ACDD01000013.1| GENE 6 5109 - 5549 523 146 aa, chain + ## HITS:1 COG:no KEGG:FN1350 NR:ns ## KEGG: FN1350 # Name: not_defined # Def: integral membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 145 1 144 145 154 60.0 9e-37 MKKNIFEKLGILLSIILLLIPKWIAPVCPGLKEDGGHMGCYYSGNLVMKIAVVMIILCIL MIVLAKYKYVKLLGSAIIIALSAFSYLIPHGMTHMHNEIGKPYGFCKMETMACRVHHTFE IVGIVAGIIAIVMIINIITILLKKEK >gi|224461489|gb|ACDD01000013.1| GENE 7 5553 - 6758 1250 401 aa, chain + ## HITS:1 
COG:FN1349 KEGG:ns NR:ns ## COG: FN1349 COG0577 # Protein_GI_number: 19704684 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 401 1 401 401 590 79.0 1e-168 MKKRIDATSLAMENIRQRKTRSICMILLVALFSIIVYMGSMFSLSLRSGLDSLSNRLGAD VIVVPAGYKAEIESVLLKGEPSTFYLPENTMKKLEQFEEIEQMTPQIYVATLSASCCSYP VQIMGIDIESDFLIYPWISNSIQKELDDNEAIVGSHVAGEQGEKIHFFNQELKIVGRLKS TGVGFDATVFVNQKTAKELAKASERITANRVAEEDVISSVMIKVKPGVDSVKLSSKISRA LAHEGIFAMFSKKFVNTISSNLKVLSSYVGGLILIIWIFSIVILSISFMAIFNERKKEMA VLRVLGASKKMLQEIIVKEAGILSLWGAALGSFLGILLSMIILPLVAKSLTMPFLSPSIL KYILIFLLSFVLGSLIGPISTIQVVRKLTEKDSYMSLKEEI >gi|224461489|gb|ACDD01000013.1| GENE 8 6767 - 7450 261 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 215 245 105 33 3e-22 MLEIKNISKSYMRASQSFYAVNNVNLNIEKGDFIHIIGRSGSGKSTLLNILAGLLSADKG EVLLEGQNYTLLEDEEKSKFRNENIGFIPQSPALLSYLNVLENIRLAYDLYHTDGSSEEK ARYFLKELGLEHLANSYPKELSGGELRRVIILRALITDVKILIADEPTSDLDIEATREVM ELLQKLNERGLTILIVTHELDTLKYGKSIYTMSEGVLTPGNHLTKTS >gi|224461489|gb|ACDD01000013.1| GENE 9 7483 - 8091 255 202 aa, chain + ## HITS:1 COG:no KEGG:Apar_1241 NR:ns ## KEGG: Apar_1241 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 1 118 1 118 119 77 31.0 3e-13 MTKEERAGKWFRKIANSEAISMEKKMEICNKVAKKMVILFIVIFLLEFVLLFMINDRVIF NHLSDFLNRLSEEKHTRNHYKGIALVGTLLCLPIMVLPIIVTLIFKKTWMKAEVYKVIDK IKRDKTFSPNNEPVSCMNEWMGKWEGIKENLDCQIDLDSYFTKKQIGRGILNILDIGTVY FPTGRIFACDPMIELEDAKPYI >gi|224461489|gb|ACDD01000013.1| GENE 10 8095 - 8610 588 171 aa, chain + ## HITS:1 COG:no KEGG:BHWA1_01954 NR:ns ## KEGG: BHWA1_01954 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 6 171 77 240 241 136 42.0 3e-31 MPIGTHPVKICVVSEEVSGDRYACVKVEINKNKVMRYELAMVGNENLDEEMEKGDYFGFG VDCGMACIADVKTQEAFKKYWKQREREEEGIDPYNDLFDNLLEENFKTNPKYQRECGDWL 
NWRIPETEYNVLIFASGWGDGYYPCYFGYDVQGKISAVYIHFIDIQSDYID >gi|224461489|gb|ACDD01000013.1| GENE 11 8700 - 10130 1245 476 aa, chain - ## HITS:1 COG:FN1313 KEGG:ns NR:ns ## COG: FN1313 COG4166 # Protein_GI_number: 19704648 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 3 473 2 471 474 537 61.0 1e-152 MQKKLKYLLCLFSLFFFACSSEKQELEKREQILYTAMPKKEYHLIPNHYEKNDRALITQL WEGLTELKDGGVRFIEVTDIQHSQDFLTWTFHLRDDLTWSNGEKITAESYRKSWLDSLKN SPMIEEKYRMFVVKNAEKFSENKVSEKEVGIQVKDNVLEVTLNTPIPNFDEWVSNPIFYP LHPKNETLKEEDKIVNAAFKIASFQENKIILEKNEDYWDAVNTRLKKVDISLVEDGIMAY EMFPRYEIDIFGAPFYEIPFERLKQANTLPEKLVFPVMKYYYISIPNENKERFLEKSNVK LRELLYAVSDPEFMGRVILQNDSPSIFPHPHPSSEIITKSKEEFENLQQKENFVFSESPY VAKFDSNSLLEKKLLLSTLKEWISSFKFPIRVTSEKEAKTSFEIKNYLVGTNQKEDFYYY ISKKYGKNLKTEEEFLKDLPVIPLLQENTSLLLHSDIQGLSVAPSGDIYLKYIVII >gi|224461489|gb|ACDD01000013.1| GENE 12 10333 - 11640 1779 435 aa, chain + ## HITS:1 COG:FN1251 KEGG:ns NR:ns ## COG: FN1251 COG0672 # Protein_GI_number: 19704586 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Fusobacterium nucleatum # 8 420 1 414 433 641 80.0 0 MREYFKKIFMGICTFVLLFGLNYTVLEAAQKKKYDTWQDVAKDMNIEFQDAKKSIEAGDA DAAYKFMNNAYFNYYEVQGFEKNVMVNISAKRVNEIEAMFRKIKHTLKGNIEGNISELDK EIDLLAVKVYKDAMVLDGVISEEAPDSEGERLFKGEVASADASTIKWKSFGVSFGLLLRE GLEAILVIVAIIAYLVKTGNEKLCKQVYIGMGAAIVCSFLLAFLIDILLGGIGQELMEGI TMFLAVGVLFWVSNWILSRSEEQAWSRYIKSQVQKSIDEKSGRVLIFSAFLAVLREGAEL VLFYKAMLTGGQTDKLFAFYGFLAGVLALIIIYLIFRYSTVRLPLRPFFMFTSILLFLLC ISFMGKGVVELTEAGVISGSTVIPAMNGYQNTWLNIYDRAETLIPQLMLVIASVWMILGN LLKERKIKKEAEDSK >gi|224461489|gb|ACDD01000013.1| GENE 13 11685 - 12356 1236 223 aa, chain + ## HITS:1 COG:FN1252 KEGG:ns NR:ns ## COG: FN1252 COG3470 # Protein_GI_number: 19704587 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein probably involved in high-affinity Fe2+ transport # Organism: Fusobacterium 
nucleatum # 1 223 1 228 228 322 82.0 3e-88 MKNLKFLAMALLVLGLTACGEKKEEAAAPAENPAAAEATTEAAAPAEKPGESGFAEIPID ETVVGPYQVAAVYFQAVDMIPEGKQPSAAESDMHLEADIHLLPEAGVKYGFGEGEDIWPA YLTVNYKVMSEDGKKEITSGSFMPMNADDGPHYGINVKKGLIPIGKYKLQLEIKAPTDYL LHVDSETGVPAARDNGLAAAEEYFKTQNVEFDWTYTGEQLQNK >gi|224461489|gb|ACDD01000013.1| GENE 14 12466 - 13008 337 180 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451881|ref|ZP_05617180.1| ## NR: gi|257451881|ref|ZP_05617180.1| hypothetical protein F3_02371 [Fusobacterium sp. 3_1_5R] # 1 180 1 180 180 316 100.0 4e-85 MKDIYENLPPAKLPMKYPGEITAMKRISGTQLSKERLFSLQDTKNHFEYDILDYWEKFEE ISACRLPLSPNKKAIILRTYYRRFPDEKHRREFFYLCLLNDDYRIIDSMQIYDNSIHVGK SKPPMREEQDFLIAKDDSCIATTHYYYLDTGKETFKTCDIYKLKEKEIFKEEKYISIQTK >gi|224461489|gb|ACDD01000013.1| GENE 15 13142 - 14017 893 291 aa, chain - ## HITS:1 COG:no KEGG:FN0721 NR:ns ## KEGG: FN0721 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 81 280 29 228 239 183 50.0 6e-45 MKTKKILFCILLFFLCACQGKEDTEKRTELHKEEIIEEKGLQEIEKVQKLKMTEEQKQKM SMEEAAIANKKREIQNLEIEKKLKKFVPRGWKILQFVTGDLNKDTLEDVAMVIEETDAEN FVKNDALGPEILNINPRELWILFQEKDGDYALETKNDIGLIPSEHDEECPTLADPLLNGE IFIENHLLKCQFHYWLSAGSWYASIVSYIFRYQKEHFELIGVDYYSYHRASGEEKESSYN LFTGKMKITTGGNISGEGKEKVEWKNKPCERKPTLEELMEDDYTILLIEES >gi|224461489|gb|ACDD01000013.1| GENE 16 14062 - 14949 447 295 aa, chain - ## HITS:1 COG:no KEGG:FN1938 NR:ns ## KEGG: FN1938 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 148 1 152 155 119 42.0 1e-25 MGIYFIACILIILGIIFISLGKGSAAEENRIKKLFSDEKNTQKVEGFLEIIHLDKTRYLS ECEAKITFMKTNGKKFSSFESDFKFLWKWNGKGRVPITITYDKKNPSNYSIKELKQIQSS QNSKLVLPFIGILWIVLAIFIITSEIRLSFAENTKYYPYQNQKYHFEVDLPTRIPEFGLE ATSEEGVALTAYHDSINIAIYGYGIPDFTALKKEYQKQIREKEKTLGYYILGKDFFIVSY QEKNKIIYSKYLLSKSERTSVALLFEYSSEHKDLMDSIITDMTNSFHFYRETRRK >gi|224461489|gb|ACDD01000013.1| GENE 17 14952 - 15395 258 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451884|ref|ZP_05617183.1| ## NR: gi|257451884|ref|ZP_05617183.1| 
hypothetical protein F3_02386 [Fusobacterium sp. 3_1_5R] # 1 147 105 251 251 266 100.0 5e-70 MISECLENPEYFLQKIETLKKEENEKDWEEKFQDLYEEYKEYKNYAKDTKELFFQGAIIL LSNHHFLARYDWKADKETFINLMEDLNIVKTKKLTWKEEELSETGDVELWCSQLAELWKE AGYHTLLLDNDSDEYLVGIEKMECRRR >gi|224461489|gb|ACDD01000013.1| GENE 18 15745 - 16446 333 233 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00033 NR:ns ## KEGG: EUBELI_00033 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 137 28 164 446 188 68.0 2e-46 MLEGGRRVGKSTIAEEFAKNEYASYILIDFANVSSSVLSCFDDIHDLDLFFLRLQVITEV NLIIGNSVIIFDEIQLFPKARQAIKYLVKDGRYHYIETGSLISIKKNVKDILIPSEEMKL QVYPMDFEEFSWATKNNSIEIIKKFYDSGKVLYYHTWKKKNSTHYYEIDFLLSNGIKVSA IEVKSAGVAKHNSLDAFSEKYSSQLEKAILLSQKDKTFTNGIYYYPIYMGILF >gi|224461489|gb|ACDD01000013.1| GENE 19 16564 - 17079 528 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451886|ref|ZP_05617185.1| ## NR: gi|257451886|ref|ZP_05617185.1| hypothetical protein F3_02396 [Fusobacterium sp. 3_1_5R] # 1 171 1 171 171 327 100.0 2e-88 MKKLFTIILLLAVVILGGIGFFKMKIPKNIPVSQETFIQVMTEEGYTLEEIKEGGIKEDF PNIVESYAAWKETERKNRIKVLYCKFPSSKEAHTAFLSIYSSVKSDGTQTFEFLNESGFF GNTQYYQEFFMPTSGWISEMWCVDDTLVIAQSNEFPMLVSNIFTEKFGYKH >gi|224461489|gb|ACDD01000013.1| GENE 20 17100 - 18113 328 337 aa, chain - ## HITS:1 COG:no KEGG:TDE0547 NR:ns ## KEGG: TDE0547 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 49 336 9 299 303 166 32.0 2e-39 MTKAQIIILALVASLWIVNTGIFLAFYSYLKKSKNNFGISLLKEEIAGTGTRNVGKVEML IYFSLLMLCLYYLLFLSGHNPGISMIRGQIIATPCIALFNARKRTGKSILALFGTFVLYL NLILSHAIVGLPVKTPILEIAQSEVILNKTTISELMKEDFQIYVKKEKYPPSLKYEDILT SGVFEKYSQTTPIFVEKGYEAKSISLRDADYLFVKDHCILGSFLPYGREEKETNLEDCKI VQIDVFEESWKTIQEKQISCQINGVNLLKPLQTEEMKKVFGKNLWETPEKKIEGYDAHYQ IFYPANRVFWSEYHISMKLEEKNILREFYLETRLPQK >gi|224461489|gb|ACDD01000013.1| GENE 21 18322 - 18489 191 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451888|ref|ZP_05617187.1| ## NR: gi|257451888|ref|ZP_05617187.1| hypothetical protein F3_02406 
[Fusobacterium sp. 3_1_5R] # 18 55 18 55 55 68 100.0 1e-10 MKKILLLLLFLCFIFLLTGCHFFGEHMKKEEVEKYMDEKYGNNSYTLAEGKRKYT >gi|224461489|gb|ACDD01000013.1| GENE 22 18499 - 18816 268 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451889|ref|ZP_05617188.1| ## NR: gi|257451889|ref|ZP_05617188.1| hypothetical protein F3_02411 [Fusobacterium sp. 3_1_5R] # 1 105 1 105 105 161 100.0 1e-38 MYDYPNLPFQVKEEVGFQYLITPRRFLYDIFHQVFFERFFPNYLETKKYKTKMYHRTLQV KIPFETEEELSEAANKVKKYLDTSIKNYPIFTKFPISFLFENKVK >gi|224461489|gb|ACDD01000013.1| GENE 23 18963 - 19319 389 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451891|ref|ZP_05617190.1| ## NR: gi|257451891|ref|ZP_05617190.1| hypothetical protein F3_02421 [Fusobacterium sp. 3_1_5R] # 1 118 1 118 118 216 100.0 5e-55 MADYYEVSLKNLPEITFSCTRGWASFFLFEKIYVKDDMYQRLGEKASEEFSKNHHNIHFD SIGETGDPFVITINSSEELESISKTVKEFYEFTEKKYPALYLGEQIPIEIIYEVNYGR >gi|224461489|gb|ACDD01000013.1| GENE 24 19416 - 19940 399 174 aa, chain + ## HITS:1 COG:no KEGG:TDE0510 NR:ns ## KEGG: TDE0510 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 170 2 168 222 216 62.0 3e-55 MKHDLRERILKNKEFNRLLMHECDIRFYEEVQEVQFSENHERYSISCKAFAQDASGGEFV FLEDDSIGLISSEGDVGRIAESLEEFLTFLIHVGNIFDFSCKHLYKNQDLLNAYCNGYLS KIREEYRRKNKNWDDIRSSIAKKLSLPFLPNKLAEFAMNFYHSATREPVWFFVK >gi|224461489|gb|ACDD01000013.1| GENE 25 20115 - 20333 215 72 aa, chain - ## HITS:1 COG:FN0599 KEGG:ns NR:ns ## COG: FN0599 COG3464 # Protein_GI_number: 19703934 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 69 359 427 428 70 50.0 5e-13 ISPKMQTAWKTLRKYRKYIRNTLGTSYSNGPLEGMNNFIKSVKRVAFGFRRFSHFRQRIL IMQGIAQINPNF Prediction of potential genes in microbial genomes Time: Fri May 20 01:49:08 2011 Seq name: gi|224461488|gb|ACDD01000014.1| Fusobacterium sp. 
3_1_5R cont1.14, whole genome shotgun sequence Length of sequence - 2381 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 53 - 727 442 ## gi|257451894|ref|ZP_05617193.1| hypothetical protein F3_02436 2 1 Op 2 . + CDS 753 - 1412 310 ## gi|257451895|ref|ZP_05617194.1| hypothetical protein F3_02441 3 1 Op 3 . + CDS 1428 - 1856 489 ## gi|257451896|ref|ZP_05617195.1| hypothetical protein F3_02446 4 1 Op 4 . + CDS 1862 - 2275 666 ## FN0351 hypothetical protein + Term 2291 - 2334 3.1 Predicted protein(s) >gi|224461488|gb|ACDD01000014.1| GENE 1 53 - 727 442 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451894|ref|ZP_05617193.1| ## NR: gi|257451894|ref|ZP_05617193.1| hypothetical protein F3_02436 [Fusobacterium sp. 3_1_5R] # 1 224 1 224 224 387 100.0 1e-106 MKLTNTFQRDSLLYQEGYILSLVKDVEKTWKLNLLPFFIVFFVFGTATIYFRFFANYYSR VGFLGIFLLLFLVIIFSEKDCYLLSKNNLDESTLRNLRQEMIDCLFFDNNVIIGRKDIFI LAKNGIRLLPKEEIEDLDILCVYGRVSLKWLNIILTTKQGKITFSIADTWKNSRLVNAYQ YKSIFLPLTILRRKSKEDLSIYQVNVLPKGMKGTPLKDLILKKY >gi|224461488|gb|ACDD01000014.1| GENE 2 753 - 1412 310 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451895|ref|ZP_05617194.1| ## NR: gi|257451895|ref|ZP_05617194.1| hypothetical protein F3_02441 [Fusobacterium sp. 3_1_5R] # 1 219 1 219 219 384 100.0 1e-105 MKLLKEIALTDIIEKENGIFSLLKQLENRKRMSYLFHFFSIGFYLFLLFSRKPHTRPSFL LQSFIYVLVMLISYLNAAIFYLFWKKESYYLEKENFDDMRAAKIQEEMRDYLLADEKVII GKKYIFSLKRNGLAVIPKEDILEVNLRILSKADVYLETRGGQSRFPIHADILASLYEAKS LAVLKRKFDSEEEKDLYKKKHGLGQGQWEFLKHLILTKE >gi|224461488|gb|ACDD01000014.1| GENE 3 1428 - 1856 489 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451896|ref|ZP_05617195.1| ## NR: gi|257451896|ref|ZP_05617195.1| hypothetical protein F3_02446 [Fusobacterium sp. 
3_1_5R] # 1 142 1 142 142 234 100.0 9e-61 MTTYKRKITNLIVGLLSAPAAAFFLLAILRYFLSPLIMLIISGIAFILILYLTIFSDNIK FVIDEEDKTMIYYENGKVVKEYDLKNASLSYNMKFGHSAVIDLIINGEKIDCEPLGERQF EKMYHQLEKLVGVEPIKLKVGE >gi|224461488|gb|ACDD01000014.1| GENE 4 1862 - 2275 666 137 aa, chain + ## HITS:1 COG:no KEGG:FN0351 NR:ns ## KEGG: FN0351 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 9 132 8 138 144 68 32.0 6e-11 MSTMILIPISIWIIGIAVMFFMNSRNKNAVGNYLSQYPNAAKIYVSHKGVIVQSQTQILA VNDETPAVFTEMKGYGVYCKPGVNILTVEHSSTRPGVLYKTVTKSTGGVKIEVDIKAEAE YIITFDKETQNFKIDLK Prediction of potential genes in microbial genomes Time: Fri May 20 01:49:43 2011 Seq name: gi|224461487|gb|ACDD01000015.1| Fusobacterium sp. 3_1_5R cont1.15, whole genome shotgun sequence Length of sequence - 24672 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 11, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2271 2029 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 2296 - 2355 7.2 + Prom 2400 - 2459 7.3 2 2 Op 1 . + CDS 2491 - 3420 831 ## SUB0840 transporter protein 3 2 Op 2 . + CDS 3436 - 4383 1298 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases + Term 4393 - 4424 2.5 - Term 4375 - 4416 5.0 4 3 Tu 1 . - CDS 4421 - 4645 287 ## FN1044 hypothetical protein - Prom 4775 - 4834 5.5 5 4 Tu 1 . - CDS 4899 - 5687 807 ## FN1045 hypothetical protein - Prom 5760 - 5819 6.0 - Term 5792 - 5831 5.4 6 5 Tu 1 . - CDS 5848 - 6375 739 ## COG0778 Nitroreductase - Prom 6403 - 6462 7.0 - Term 6780 - 6819 -0.3 7 6 Tu 1 . - CDS 6885 - 7247 418 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Prom 7331 - 7390 8.1 + Prom 7305 - 7364 9.4 8 7 Tu 1 . + CDS 7390 - 7620 255 ## mru_0500 acetyltransferase GNAT family 9 8 Op 1 . 
- CDS 8271 - 9098 1170 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 10 8 Op 2 . - CDS 9108 - 9959 1055 ## COG1737 Transcriptional regulators 11 8 Op 3 . - CDS 9973 - 10869 968 ## COG1242 Predicted Fe-S oxidoreductase + Prom 10867 - 10926 6.6 12 9 Tu 1 . + CDS 10968 - 11954 706 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 11957 - 12007 9.4 - Term 11951 - 11989 5.5 13 10 Op 1 . - CDS 11996 - 12751 693 ## gi|257451911|ref|ZP_05617210.1| hypothetical protein F3_02521 14 10 Op 2 1/0.000 - CDS 12744 - 14555 2309 ## COG1217 Predicted membrane GTPase involved in stress response 15 10 Op 3 . - CDS 14573 - 15430 952 ## COG0130 Pseudouridine synthase 16 10 Op 4 . - CDS 15437 - 16624 1273 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 17 10 Op 5 27/0.000 - CDS 16635 - 19700 3257 ## COG0841 Cation/multidrug efflux pump 18 10 Op 6 13/0.000 - CDS 19713 - 20813 1337 ## COG0845 Membrane-fusion protein 19 10 Op 7 1/0.000 - CDS 20806 - 22242 1646 ## COG1538 Outer membrane protein - Term 22267 - 22310 6.2 20 10 Op 8 . - CDS 22339 - 22872 695 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 22896 - 22955 5.6 - Term 22985 - 23027 -0.9 21 11 Tu 1 . 
- CDS 23054 - 24598 2180 ## COG1070 Sugar (pentulose and hexulose) kinases Predicted protein(s) >gi|224461487|gb|ACDD01000015.1| GENE 1 3 - 2271 2029 756 aa, chain - ## HITS:1 COG:FN1122_1 KEGG:ns NR:ns ## COG: FN1122_1 COG1022 # Protein_GI_number: 19704457 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Fusobacterium nucleatum # 1 598 1 600 600 590 52.0 1e-168 MFFLKDYQKVGIYYEGQEITYRDIIIKAKQLGEHHKIEEHSKSILFSENRPEFLYAFLGI WNRNATCVCIDASFDEEEFLYYVNDSEAERIFTSKTNEQVARKTVEKSGRDVEIIVLEEE VWEEKEYRPEELVLMAPEKETVALMLYTSGTTGNPKGVMLTFDNILYNIESLDEYNMFLE SDVTLALLPMHHIFPLLGSGVIPLSHGASIIFLKELSSQAMMEALQKYQVTMMIGVPKLW EMLHKKIMEQIKANKIAHLLFKVCESLQSKSLSKIIFGKLHQKLGGKLRYFVSGGSKLDE QVAKDFFTLGITICEGYGMTETAPMISFNPLSEAKPGTAGKILRNLDLLIAEDGEILVKG RNVMKGYYKREEATKETIDEKGYLHTGDLGEIRNGYLYITGRKKEMIVLSNGKNINPIDI EFWIQGKTNLIQEIVVLEWKGLLTAAIYPNFQAIRDEKIVNIEETLKWDVIDKYNKQAPD YRKVLDTIIVPEEFPKTKIGKIRRFMIPAVLENIGKKEIISEEPSSEEYAIIKEYLSLAK ARTVVPQAHLELDLGMDSLDMIEFISFLGSRFGMVVQNETILENSTVESISAYVEKHRGE DKIEDVNWKEILSKETKVDLPYYGIFARIGKILNYLLFWSYFRIDIKGREYLDEKPTIYV GNHQSFLDICLITRAFPFAIMKNCYFMAKVVHFKSFLMKFFASQANVVTLDINDNITEVL QTMAKVLREGKSILIFPEGVRTRDGKLNSFKKSFAI >gi|224461487|gb|ACDD01000015.1| GENE 2 2491 - 3420 831 309 aa, chain + ## HITS:1 COG:no KEGG:SUB0840 NR:ns ## KEGG: SUB0840 # Name: not_defined # Def: transporter protein # Organism: S.uberis # Pathway: not_defined # 5 306 7 306 309 206 43.0 9e-52 MQTVLFPVFFMLFLGYLARKKEWITTQQNEGGKKIVFNILFPILVFHVLAQSELKKEFLI QILFLFFAWSFVFLVGKAMTRFTGKRFSNISPYLLLTCEGGNVALPLYISLVGAAHAVNI VTFDVAGILINFGLVPILVTKQSSSELIWKSLLKKIFTSSFILAVLIGILFNVTGLYSYL MNSTFQDIYLSTIDIVLKPITGIILFTLGYELKLNRAMLQPLWRLSLLRLLTCSGIIGTF FLFFPNLMKEEVFSIAVFLYFMCPTGFPVPLQIQALVKEEEEEHFMSAFISVFLMIALAV YTIITLVWK >gi|224461487|gb|ACDD01000015.1| GENE 3 3436 - 4383 1298 315 aa, chain + ## HITS:1 COG:AF0807 KEGG:ns NR:ns ## COG: AF0807 COG1304 # Protein_GI_number: 11498413 # Func_class: C Energy production and conversion # Function: L-lactate 
dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Archaeoglobus fulgidus # 30 306 93 357 366 113 31.0 5e-25 MNQEIKSPWPGPKGLIPVTSGRAEDANVYNRRYLDDIHVEMRVLDSIKPSLRTKIFGETF DSPIMMPAFSHLNKVGVDRKKPMLHYAFAAKELNMLNWVGMEPNDEFEEILEAGARTVRI IKPFMDHSIILEQIAFAETHNATAVGIDIDHVPGSNGKYDVVDGIPLGPVTTEDLKSYVN STSLPFVAKGVLSVQDALKAKEAGVKAIVISHHHGRIPFGIPPIQVLPRIKEALKGSGIF IFVDGSMESGYDVYKALALGADAVSVGRAILAPLLKEGKEGVVKKVKKMKEELSELMMYT GIEDTKSFDPSVLYY >gi|224461487|gb|ACDD01000015.1| GENE 4 4421 - 4645 287 74 aa, chain - ## HITS:1 COG:no KEGG:FN1044 NR:ns ## KEGG: FN1044 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 70 170 239 242 75 60.0 7e-13 MKAGSMIGQLERILIVILLLQNQYEAISFVLVAKSIARFKQLDDKEFAEKCLVGTLPSLL LSLVITLVVKKCFL >gi|224461487|gb|ACDD01000015.1| GENE 5 4899 - 5687 807 262 aa, chain - ## HITS:1 COG:no KEGG:FN1045 NR:ns ## KEGG: FN1045 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 260 1 267 269 246 52.0 8e-64 MKEYAALMIDLKKSKSYSTESRNKLQQKILETIQKLNTLFSTTITKEVEFSAEDEIQGLF SSPMAAYLYLRFFQLLTFPLELHAGIGLGSWDIVIENSSSTAQDGPVYHHARKAIEESKK NLEYFSLFYSERNEDRVINSLINAYEVLLKKQSKYQAELHLMTEFLYPISIENVLSENAM IEFLKDSEQSNWKSEIIDGREVEELFYIKLGKRRGLATQLSELLESSRQSIEKSLKVGNI YEMRNLVFAILEILKNMKGEKE >gi|224461487|gb|ACDD01000015.1| GENE 6 5848 - 6375 739 175 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 3 175 1 172 174 111 34.0 6e-25 MDLLEIMKRRRSVRQYTEEAIPKESIEKILQAGLLSASGKNARPWEFIVVQEKENLKYLS ECRVGSAKMLEKANCAIIVLADSEKTPIWIEDASIAMTNMHLMADYLGVGSCWIQGRGRM ASDDIASTEDYLRNKFLFPQQYKLEAILSLGIPANHPIPRRLEDLELEKIHYETF >gi|224461487|gb|ACDD01000015.1| GENE 7 6885 - 7247 418 120 aa, chain - ## HITS:1 COG:SMc01274 KEGG:ns NR:ns ## COG: SMc01274 COG0239 # Protein_GI_number: 15965143 # Func_class: D Cell cycle control, 
cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Sinorhizobium meliloti # 1 114 1 115 125 79 41.0 2e-15 MSEVFLVGLGGAIGSILRYGVGQIFQRTESGFPLGTLCINVLGSLCIAMISSLAIKYGYE NSRLTLLLKTGICGGFTTFSTFSLESMNLLKEGNTIFFFSYICCTVLFSFLAIYIVERVI >gi|224461487|gb|ACDD01000015.1| GENE 8 7390 - 7620 255 76 aa, chain + ## HITS:1 COG:no KEGG:mru_0500 NR:ns ## KEGG: mru_0500 # Name: not_defined # Def: acetyltransferase GNAT family # Organism: M.ruminantium # Pathway: not_defined # 1 76 1 76 170 76 46.0 3e-13 MVLETERLYLRNWTEEDAEALFYCAKDNRVGPMAGWLPHQSVEESLHIIQTLLLLPYTFA MILYYTTFLCILFSYF >gi|224461487|gb|ACDD01000015.1| GENE 9 8271 - 9098 1170 275 aa, chain - ## HITS:1 COG:FN1143 KEGG:ns NR:ns ## COG: FN1143 COG0363 # Protein_GI_number: 19704478 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Fusobacterium nucleatum # 1 273 1 274 274 399 70.0 1e-111 MRVIITEKNVVDWAAVYIARKIKEFQPTKERPFVLGLPTGGTPLGMYKRLIQFYQDGLLS FENVVTFNMDEYVGLEANNEQSYHYYMHHNFFDYIDIPKENINILNGMAEDYEKECREYE EKIKKIGGIHLFLGGVGEDGHIAFNEPGSSLSSRTRDKELTTDTILANARFFDNDITKVP KLALTVGVGTILDAKEVLIMVNGPKKARALHKGIEEGVNHLWTISALQLHEKGIIVTDEE ACNELMVGTYRYYKDIEKDNLDTEQLIQDFYREYR >gi|224461487|gb|ACDD01000015.1| GENE 10 9108 - 9959 1055 283 aa, chain - ## HITS:1 COG:PM1577 KEGG:ns NR:ns ## COG: PM1577 COG1737 # Protein_GI_number: 15603442 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pasteurella multocida # 1 278 1 279 286 169 33.0 8e-42 MSVILKLKMMRENFSKMEQKIADYILKHPEEVKQLTTYQVAKVCKTSQASIVRFAKKMGF SGYPDFKLSLSQDMGVLSAKKEVSIIDSEIDSNDSLQEVCQKVARENMRAIEDTYSLLDF KELEKAVKALGKAKKIMILGAGFSGVVARDLSYKLLELGKDVVFESDFHMQFSLLTTMTS RDILFVISYSGKTKEVYEITKKAKERGIQIITLTTIAGNPIRDLGDITLNTVELNKNFRA TALSPRISQMTVIDMLYVKLILENKEMEENILEAMEIVKNFKL >gi|224461487|gb|ACDD01000015.1| GENE 11 9973 - 10869 968 298 aa, chain - ## HITS:1 COG:FN1142 KEGG:ns NR:ns ## 
COG: FN1142 COG1242 # Protein_GI_number: 19704477 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 3 291 4 292 304 395 63.0 1e-110 MGRFYSLNDYFRDTFGEKIYKVSLDGGFTCPNRDGKVGFGGCIFCSEEGSGEFSGDRHKK IYQQIEDQLQLISKKFPSGKVIAYFQNFTNTYADIPYLKKVYEEALSHPRVMGLAIATRP DCLGEDVLQLLDKMNQKTFLWIELGLQTVNEEVATFFHRGYPLSVYTKACDDLKKYRIRF VTHILLGLPKEKEEDGLKTALYAQECGTWGIKIHCLYVQKNTYLEQLYKNHEIKIQKKDE FVKKVVTILENLSYNIVIHRLTGDGDRESLIAPLWTLKKRDVLNSIQKELKLRENLKK >gi|224461487|gb|ACDD01000015.1| GENE 12 10968 - 11954 706 328 aa, chain + ## HITS:1 COG:FN0751 KEGG:ns NR:ns ## COG: FN0751 COG0252 # Protein_GI_number: 19704086 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Fusobacterium nucleatum # 1 326 1 333 336 370 53.0 1e-102 MKNRILLINTGGTIGMIGEPLQPSKDWKEITKNHPILWDFPVDYYQMENLVDSSDMNPDI WLEIAKILKKEYENYDGFVILHGTDTMSYTASALSFLCKNLSKPIILTGSQVPLAKPRSD ALQNLITAIQIASQYKIPEVCILFRDNLLRGNRSKKIDATNYFGFSSPNYPVLGEIGAEI KISWDKILSFPKEKFQVEENLCSDIIVLEIFPGMNIEFYNTILNSSIKGIILKTFGNGNA PTSLCFLEFLKELQKKKIPVINVTQCIRGSVEHGKYAASHNLISLAVISSKDMTTEASIT KLMYLLGKNYSYEDIQEAFQKNLAGEIS >gi|224461487|gb|ACDD01000015.1| GENE 13 11996 - 12751 693 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451911|ref|ZP_05617210.1| ## NR: gi|257451911|ref|ZP_05617210.1| hypothetical protein F3_02521 [Fusobacterium sp. 
3_1_5R] # 1 251 1 251 251 431 100.0 1e-119 MSKKSFLILSCILIILNMGCRRQISNQKIDYPMKKIESAIATNNVLLLKDFLLHTENRKE FIYMALDYNSLNVLEFLLHFPKLEEGVANSPYFYVQGKEAFQLLQKHSYDVNVKNYEGKS LAEYYYDTKGIEFFKYFLEEAAKLDLSKENSLIFKAIASEDIELIHLLIKRKADFTVLDK KGNYPIYYAKTTAIISRLLDFPYDLQHKNFRKENVLGEVYLRLQKSQKRDLLRKCARLGI DSNYSSYQKEQ >gi|224461487|gb|ACDD01000015.1| GENE 14 12744 - 14555 2309 603 aa, chain - ## HITS:1 COG:FN0634 KEGG:ns NR:ns ## COG: FN0634 COG1217 # Protein_GI_number: 19703969 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Fusobacterium nucleatum # 1 601 1 601 605 1072 88.0 0 MKIKNIAIIAHVDHGKTTLVDCLLRQGGAFGSHELEKVEERIMDSDDIEKERGITIFSKN ASVRYKDYKINIVDTPGHADFGGEVQRIMKMVDSVLLLVDAFEGPMPQTKYVLKKALEQG HRPIVVVNKIDKPNSRPEEVLYMIYDLFIELNANDYQLEFPVVYASSKAGFAKKELKDEE KDMQPLFDTILEFVEDPDGDKNHPTQFLITNTEYDNYVGKLAVGRIHNGMLKRNQEVMIM KRDGSQVKGKVSVLYGYEGLRRVELQEAEAGDIVCIAGMENIEIGETLADINNPVALPVI DIDEPTLAMTFMVNDSPFVGKDGKYVTSRHIWDRLQKEVQNNVSMRVEATDTPDAFVVKG RGELQLSILLENMRREGFEVQVSKPRVLMKEIDGVKMEPMEMALIDVDDSYTGVVIEKMG VRKAEMIAMTPGQDGYTRLEFKVPARGLIGFRNEFLTVTKGTGILNHSFFEFEAFKGEIP TRNKGVLIATEPGVTVPYALNNLQDRGTLFLDPGIPVYEGMIVGEHNRENDLVVNVCKTK KLTNMRAAGSDDAVQLATPRKFSLEQALDYIAEDELVEVTPLNIRLRKKILKEGERRRNR SDV >gi|224461487|gb|ACDD01000015.1| GENE 15 14573 - 15430 952 285 aa, chain - ## HITS:1 COG:FN0635 KEGG:ns NR:ns ## COG: FN0635 COG0130 # Protein_GI_number: 19703970 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Fusobacterium nucleatum # 1 285 1 287 287 287 56.0 2e-77 MDGIIIINKEKGMSSFDVIRSLRKLLQERKIGHTGTLDPLATGVLILCLGKATRLAQEIE AQEKVYEAEMEFGYQTDTYDLEGEIVATSPKKEVKREEFEEVLSHWKGKISQIPPMYSAI KIQGKKLYELARKGIEIEREGREVEIFGIDILDFEGKKAKIRTKVSKGTYIRSLIYDIGE ELGSFATMTALNRIQVGEHHLKNSYTISEIHDKINVCDFSFCIPVEEYFSFPKIQLEGEK NLILFRNGNTVIFKEKDGEYRVYQNGLFLGLGKIEKQRLKGYKYF >gi|224461487|gb|ACDD01000015.1| GENE 16 15437 - 16624 1273 395 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: 
YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 11 393 10 392 393 303 38.0 3e-82 MMNDVFLKHWDRSQNLSAKWDELEAKFGDKDLYPLWIADMDFPAPKEVIDAVVEKAKQGI YGYTARPSSYYQALCDWTEKRFHYSLNPKYLIHSPGGVTSFTLALEVLTEKGDAVLVTPP VYPGFFRTITGTGRKLVTSPMLETSMGNFEINWEEFEEKIIQEKVKVFIFCNPHNPVGKV YKEEELKKIANICLKHNVRIIEDQMWRDLTFGGAKTISLLQLGEEVRENTVACLSATKTF NLAGLHASFLYVSNEKIRLALIDKIEVLDIHRNNALSIVAMETAFQKGEAWLQSALEYLE ENLKMAVDFIHKELPEIKAYMPESTYTLWVNFSHYSLQGEEITKHLAKYGKIATGNGAPY GQGGETCQRINLACSREVLLKSLEGLKVAVEAMGE >gi|224461487|gb|ACDD01000015.1| GENE 17 16635 - 19700 3257 1021 aa, chain - ## HITS:1 COG:FN0515 KEGG:ns NR:ns ## COG: FN0515 COG0841 # Protein_GI_number: 19703850 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 1021 3 1021 1022 1137 55.0 0 MVEYFLKNRIVTLVLTLLILLGGILSYFKLGKLEDPEFKVKEALVVTLYPGASPHQVELE VTDKLEQKIREMPHVEYIDSTSKAGYSEIRVKIEESIPSKEVEQYWDILRKKVADSKLYL PSTAISPIVLDDYGDVYGMFFAITSEGYSKEELNRYSKYIKRELESIQGVSKAVLYGKAD SVVEIVIDRSKMANLGINEKMIYTAMLQQNIPTPAHNIEQGTRYLRFQLHSNFQSIEDIE NLVIFSKPDLLKMLTGAGGDTLFLKDIAEIKKSSSNPSSNMMRFCGKMSIGLQLSPESGT NVVKTGEKIDKRLEEISSSLPIGIEVHKIYYQPELVSNAISQFVYNLIASVAVVIGVLLF TMGMRSGLIIGSGLVLSILGTFIYMLFVKMDLQRVSLGAFIIAMGMLVDNSIVIVDGTLN ALENKMERYEAVTLPTKKIALPLFGATFVAIAAFLPMYLMKSSIGEYISSLFWVIAISLG LSWIFSMTQTPLLCYLYLNDPGQQKVSKKRRKFYWILRKWMNKILHFRKVSLLILLGSFC FIILLSFGISTSFFPNSDKKGFVLNIWTPEGSNLEYTNQISKILEKEIAKNKQVENYTTF VGASPSRYYVATIPELPTTSLAQIIVNVDKLSTIEDLEKSLTNFTWENLPDVQIQVKRYA NGIPTKYPLQLRITGSDPKILRDLARKVEKELYEIPGAKNVNVDWKEKVLTMVPNLDEQK ERKHAVSTFDIASALNRLGNGNQVGVFHEGVEDLPIVIREKSGGQQVNSNNLEQLPIFGV GMQSLPLGEFIKGTDLVWEDPMILRHNGKRAIQVQADVETGIQVEKIRSILAEKIKDISL PEGYSLEWNGEYYEQNKNIAKVLSYVPVQFMIMFVACLLLFATLTDPFIIFVVLPLSLIG IVPGLLLTGRPFGFMAIIGMVSLSGMMIKNSIVLLDEIRYQKLHTDKTEFDAVVDASLSR 
VRAVSLAAGTTIFGMFPLMFDPLYGEMAITIIFGLAASTILTLFVVPLLYVSIHKIYKNK K >gi|224461487|gb|ACDD01000015.1| GENE 18 19713 - 20813 1337 366 aa, chain - ## HITS:1 COG:FN0516 KEGG:ns NR:ns ## COG: FN0516 COG0845 # Protein_GI_number: 19703851 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 17 357 17 356 357 250 42.0 3e-66 MNKKWICIFMISFLLIACEKNGEKEKIRPVKIQEIGINLSQVILSEYPSSIQAKQEAMLS FQVPGKIEKILVSLGDKVKKGQVLAKLEEQDYHLNLEANAQKYEASKAVAENARLQFERV KTLYQNNAIPKKDYDMALAQYKSAIAAEKANQAGLSHAANEVYYGDLIAPYDGIVSKKMT EAGMVVAAGTPILSISSEDVSELTIQVPAKELEKIKEAQRYYFIVEEDKSKTYPLTLKTI SFTPDMTKSTYPIVFQLERDNIKNLYAGMSGTVVVALKKEENSKILLPISAIFEENGSFV YLYGKENKAEKREVKLGDLQGNGEIQIISGLKTGDKVIIAGVSSIHEGQVIKALPPTTDT NVGNLL >gi|224461487|gb|ACDD01000015.1| GENE 19 20806 - 22242 1646 478 aa, chain - ## HITS:1 COG:FN0517 KEGG:ns NR:ns ## COG: FN0517 COG1538 # Protein_GI_number: 19703852 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 20 465 1 449 449 297 40.0 3e-80 MKRKWNLFFCLLFLTSCSSVNKETSENSLLQELQNKEKETQEILKEQKLSLEEAIRLAKE RNLELKMKELEKEIASIDKRTAFGNFLPKISAFYTRSFWEEPLSAQIDLPSSLGKFPMIG PLLPKEIHGRLLDQNYSVYGMQASMPIFAPATWFLYSARKKGEDIHSLVFDLTEKMITVK VIQQYYWILALKSEEKQLQASLQSAEQLLHNTKIALETQSILDWQYQKAEVYYKQKKLAL EENRRDLKIANMNLLLTLNLSPFSEIYLEDTNLSTKKPLLNYEEVVYQSLLHSQALEIQN KMIEVEKEKVKISLSRFLPIVGLQGFYGEHSFSLLTSPHYLFGILGGVFSVFNGFQDISA YQKAKIEQQKAIIKREQLMLQTIAETTNVYQKLQSSLEEQEIAQGNLKAENGKFYQKEME KKVGMIDELSYLQALQSYEEARSLNAKAEYQSAVLQEILDMLMEQGRFVKIREGEKNE >gi|224461487|gb|ACDD01000015.1| GENE 20 22339 - 22872 695 177 aa, chain - ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 20 171 162 311 338 76 30.0 3e-14 
MKERQIEFFDSIEKSDIIYYKQDSHPFDGLVFYRYPTGEVRERIRYENGIKNGLSLSYYP NGIVSQSSEYREGLLDGDTVFYYKSGKMKEFIHFSANEFEGEWIIYYENGELKSRAFFEK GRLNGTKITYYENGKVREILNFQNNLLHGKNIQYYPSGEIQWVHHYSYGELIDDGEF >gi|224461487|gb|ACDD01000015.1| GENE 21 23054 - 24598 2180 514 aa, chain - ## HITS:1 COG:TM0116 KEGG:ns NR:ns ## COG: TM0116 COG1070 # Protein_GI_number: 15642891 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 5 512 4 489 492 164 24.0 6e-40 MEKYYIGFDAGTQSVKVAIYNLQLECVAEQNYPTHLYYPKAGWVEMNVNEYLVAVKQGMK DCVEQMRKKGLDVSKVRAIFGDGIICGIVGVNEEGEAITPYINYLDSRCQEDVENLSAQN LTIWAEETGNAVPNCMFPAMIARWILKNNLAFQKEGKKFMHNAPYVLSHLAGLNSKDAFI DWGAMSGWGLGFEVYKKEWSDKQLEILKIKKEYMPKIVKPWKIIGSLTKEIAEFTGLPEG VSICAGAGDTMQSMLGCGLIDKNMAADVAGTCAMFCVSTDGIKPELSTPESGLIFNSGTL ENTYFYWGFIRTGGLALRWYRDNLCKQEGVDEYFDILSQEAEKIPVGSNGVLFLPYLTGG NTECVNACGCFLNMTMDTNQATLWKSVLEAIGYDYIGVTDTYRKAGVNLDQITITEGGSR SELWNQIKSDMLDAKVKTLQKAGGALITNILTAAYAVGDISNLKEALTSLLKIKKVYSPS EKNTKYYRNIYTLRKDLIQNKMQKTFEILKEIRE Prediction of potential genes in microbial genomes Time: Fri May 20 01:50:09 2011 Seq name: gi|224461486|gb|ACDD01000016.1| Fusobacterium sp. 3_1_5R cont1.16, whole genome shotgun sequence Length of sequence - 14591 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 6, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 10.9 1 1 Tu 1 . + CDS 96 - 1079 1512 ## COG2502 Asparagine synthetase A + Term 1173 - 1216 -0.5 + Prom 1284 - 1343 12.6 2 2 Op 1 34/0.000 + CDS 1369 - 2043 1007 ## COG0765 ABC-type amino acid transport system, permease component 3 2 Op 2 16/0.000 + CDS 2036 - 2764 592 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 3 . + CDS 2786 - 3502 1216 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 5 2 Op 4 . 
+ CDS 3516 - 4229 762 ## COG2045 Phosphosulfolactate phosphohydrolase and related enzymes
6 2 Op 5 . + CDS 4297 - 4632 353 ## Lebu_0216 hypothetical protein
7 3 Tu 1 . - CDS 4598 - 5599 1237 ## COG0582 Integrase
- Prom 5668 - 5727 12.2
+ Prom 5637 - 5696 17.2
8 4 Op 1 . + CDS 5754 - 6299 610 ## FN0212 hypothetical protein
9 4 Op 2 . + CDS 6309 - 7232 841 ## gi|257451928|ref|ZP_05617227.1| hypothetical protein F3_02606
+ Term 7388 - 7428 -0.2
- Term 7221 - 7264 6.3
10 5 Op 1 . - CDS 7280 - 8077 1497 ## COG5266 ABC-type Co2+ transport system, periplasmic component
11 5 Op 2 . - CDS 8110 - 8529 704 ## FN1808 hypothetical protein
12 5 Op 3 12/0.000 - CDS 8558 - 9433 1133 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin
13 5 Op 4 42/0.000 - CDS 9430 - 10329 1046 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components
14 5 Op 5 25/0.000 - CDS 10344 - 11060 227 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16
15 5 Op 6 . - CDS 11061 - 11972 1523 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin
16 5 Op 7 . - CDS 11984 - 12526 582 ## FN1814 hypothetical protein
- Prom 12612 - 12671 14.7
- Term 12565 - 12619 3.1
17 6 Op 1 . - CDS 12686 - 12793 74 ##
18 6 Op 2 .
- CDS 12796 - 14589 2509 ## COG0481 Membrane GTPase LepA Predicted protein(s) >gi|224461486|gb|ACDD01000016.1| GENE 1 96 - 1079 1512 327 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 1 327 1 327 327 493 74.0 1e-139 MEYKSKLGLLDTEIAIKKVKDFFEKELSLELSLIRVSAPIFVRPESGLNDNLNGIERPVS FDVKAGDIAEIVHSLAKWKRMALYRYGIETYNGLYTDMNAIRRDEDPDAIHSYYVDQWDW EKIIKKEDRNVETLKHVVKGIYTVLRKTERYLRTQYPTLSKKLPEEITFVTTQELEDKYP NLTPKEREHAIAKEHKAVFLMKIGGTLASGEKHDGRAPDYDDWELNGDILVWYEPLQIGL ELSSMGIRVDEESLERQLKIAGLEERKVFPFHQMVLNRELPYSIGGGIGQSRICMFFLEK IHIGEVQASIWPEEVRKECEEKNIILL >gi|224461486|gb|ACDD01000016.1| GENE 2 1369 - 2043 1007 224 aa, chain + ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 1 224 1 236 236 284 75.0 9e-77 MEYLQTLQEIFLAEDRYLYILNGLGFSVGVTLFAAILGVLLGILLALMKLSNSKILSKIA LVYIDIVRGTPAVVQLMILANIIFVGALRETPILIVAGIAFGMNSGAYVAEIIRAGIEGL EKGQTEAGRALGLSYAQTMKFVIIPQAVKKILPALVSEFITLLKETSIVGFIGGVDLLRS ANIITSQTYRGVEPLLAVGIIYLILTTVFTILMRKVEKGLKVSD >gi|224461486|gb|ACDD01000016.1| GENE 3 2036 - 2764 592 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 232 46 1e-60 MIRVEHLDKNFGNLKVLKDISVEIEKGDIVAIIGPSGSGKSTFLRCINRLEEPSAGHIFI DEEDLMDENVDINQIRAKVGMVFQHFNLFPHMTVLENLTLAPIQIKNIAQKEAEEKAKLL LNKVGLLDKASSYPNQLSGGQKQRIAIARALAMEPELILFDEPTSALDPEMIKEVLDVMR DLAKEGMTMMIVTHEMGFAKNVANRVFFMDQGTILEDCHPKELFENPKSDRVKDFLNKVL NK >gi|224461486|gb|ACDD01000016.1| GENE 4 2786 - 3502 1216 238 aa, chain + ## HITS:1 COG:FN0800 KEGG:ns NR:ns ## COG: FN0800 COG0834 # Protein_GI_number: 19704135 # Func_class: E Amino acid transport and metabolism; T Signal 
transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Fusobacterium nucleatum # 16 238 7 230 230 265 64.0 4e-71 MKIMKYVGIGMGMMLLSMVAFGKTLYIGTNAEFAPFEYLEKGKVTGFDMELMNALAKEMK MDVKIENMAFDGLLPALQMKKVDVVIAGMTETPERKKAVSFTKPYFKAKQVIITKKGKDI KDFKELSGKRVGVMLGFTGDAVVSDIKGAKVQRFDATYSAVMALEKGKVDAVVADSEPAK KYIASYKDLAIASAKAEEEDYAIAVRKNDKALLDNLNKALVKVKSNGTYDALLKKYFK >gi|224461486|gb|ACDD01000016.1| GENE 5 3516 - 4229 762 237 aa, chain + ## HITS:1 COG:CAC3233 KEGG:ns NR:ns ## COG: CAC3233 COG2045 # Protein_GI_number: 15896479 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Phosphosulfolactate phosphohydrolase and related enzymes # Organism: Clostridium acetobutylicum # 1 225 1 229 235 163 40.0 2e-40 MKIDVFLTAEEVKQKEISNSNVIVIDVLRATSVMVTAIAHGVSKIYPYESIEEVREASLT SSCCILCGERKGLKIEGFDYGNSPLEYQTEKIKNREMFMTTTNGTRALSNIKGKNNKIWI ASFLNISTVLSFLEKEEKDCIIVCAGTENHFSLDDALCAGMIIEKLSNYEKTDIALALEQ IAKTSINVKESLKNTKHYRYLKSIGLEKDLEFCCHLNTYPLLLEYKRETNSIFAVTK >gi|224461486|gb|ACDD01000016.1| GENE 6 4297 - 4632 353 111 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0216 NR:ns ## KEGG: Lebu_0216 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 5 96 96 183 192 80 50.0 2e-14 MDIQTPKVAQAISKSSGMNTDSTIKMLQMLAPLLMGALGKQKREQNLDANSLDSFTSTLA GNFLDGNTNTSNMMNLVTNMLDTNNDGSIVDDVLKMDSSLFISKRSYFFLP >gi|224461486|gb|ACDD01000016.1| GENE 7 4598 - 5599 1237 333 aa, chain - ## HITS:1 COG:FN0837 KEGG:ns NR:ns ## COG: FN0837 COG0582 # Protein_GI_number: 19704172 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Fusobacterium nucleatum # 1 323 1 323 328 348 58.0 8e-96 MDIIKTKEQDLVLPRRKKRAQEGRKSFFEIYKSPKTLQDYLFYLKDFLSFVYDGDGSFQQ EEILPLMKGIEKEDVEQYIAHLLQERNMKKTSVNKVVSAMKSLYKELEQYQVENPFRYVK LFKTTRNLDNILKISSNDIKKIIEQFQVKSEKDYRNLMILYTLYYTGMRSDELLHMEFRH LMNREGSYFLKLEKTKSGREQYKPLHPALMEKLQEYKKEMKALYQLEEEDLQNHFVFCSH 
FDKNKALSYRALYDLIKSLGLSIEKDMSPHNIRHAIATELSLNGADLVEIRDFLGHADTK VTEIYINAKSILEKRVLNKIPDIMEEKNSSSSK >gi|224461486|gb|ACDD01000016.1| GENE 8 5754 - 6299 610 181 aa, chain + ## HITS:1 COG:no KEGG:FN0212 NR:ns ## KEGG: FN0212 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 181 1 180 180 163 52.0 2e-39 MTDFEKINFMIETIEENRIPEGKTFNEFSMEFFQEVKLLPLSKYLRSIGKNKRLPKIMNM RKAGEVLTDTYADSDLVSFVKRKSKQGQIPELDYQSIMLLRRIDVKDNWEKIFRFFRGSE TVAEINSTTRPELLPQEIEMLENFLKEKLHLSEKELDWLLEKFRKILTEKELLRAIRKLA K >gi|224461486|gb|ACDD01000016.1| GENE 9 6309 - 7232 841 307 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451928|ref|ZP_05617227.1| ## NR: gi|257451928|ref|ZP_05617227.1| hypothetical protein F3_02606 [Fusobacterium sp. 3_1_5R] # 1 307 1 307 307 511 100.0 1e-143 MKKKKLILIMEYNYEEAVNEVLRNPETEYKALTVFFRMNLQNGLEFLKKLKRIFSLENII LMSDIEYLANDLEVGYVIELKQFYDFNLEQFLKVYESSVQHFENFFDFLESVSDVFHFSF HQYEKEKAWFSLLFGHGILIINDENYEKILQNYHKIKAHTSDLAFINLNEAGVEKNLKLL KMLGSDAQIAFGVTNSLKSKFSQWIDVIIYQRSPYYERNIQNFISQIFSFNSWEKALALL QNFFTIEEKSFEADLYEEEEDVLKVPKRFFLKIENKIEFMEKAENVFYCSKDKKEHYRLE KDKDFIG >gi|224461486|gb|ACDD01000016.1| GENE 10 7280 - 8077 1497 265 aa, chain - ## HITS:1 COG:FN1807 KEGG:ns NR:ns ## COG: FN1807 COG5266 # Protein_GI_number: 19705112 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 265 1 264 264 272 52.0 6e-73 MKKLVLMAGVLTLSATAMAHTQYLYTDTLDVSGKKEVKMKTLFGHPGEGNEIGGVAVGTV DGKAMPTKEFYMIHNGEKTDLTAKVVDGIIKTDKNTVRTLDYTFTPADGLKGQGSFIFVM VPNHATDEGYTFYGAPKLIIAKDGAGSDWDKRVAPGYPEIIPLKHPADLWTEDVFVAKFV DKDGNPVKHARIDVDFINAKIDIKNDMYKGGNPDMPKVSKRTYTDDNGMFYFSAPRAGMY AIRGVESMDKANKVVHDTGLVVQFK >gi|224461486|gb|ACDD01000016.1| GENE 11 8110 - 8529 704 139 aa, chain - ## HITS:1 COG:no KEGG:FN1808 NR:ns ## KEGG: FN1808 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 15 139 1 125 125 183 70.0 2e-45 
MRKKLVFILGTLLSVAALAHAPLVSVDDNGDGTIYVEGGFSNGASAAGIPVVIVKDAPYN GPEETFKGKEILYEGKFGADNSITLPKPATPKYEVYFNAGEGHIVGKKGPALTEGEQEAW KKAVDTFDFGDWKDYMLEK >gi|224461486|gb|ACDD01000016.1| GENE 12 8558 - 9433 1133 291 aa, chain - ## HITS:1 COG:FN1809 KEGG:ns NR:ns ## COG: FN1809 COG0803 # Protein_GI_number: 19705114 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 11 291 2 283 283 272 51.0 7e-73 MRKTQVLLGAFLISSSVFAKNLVLTSIPSTYSLGKELTKNTSIRVESVFGSDTSMTMTRE AIAGDGFILPKEKADAVIDISKIWVEDNLFERVRQENIHTVEIDASYPFDSKKSMLFFNY DKDGKVIPYVWMGTKNLVRMAAIVTKDFIALYPKETAKLEKNLVDFTAKVMEIEEYGNNA FLEVESTEVISLSQNIKYFLNDFNIFAEERNPEEITEENVGKIMEETGLKVFVSDRWLKK KIVKEIEKRGGSFVVLNTLDIPMDKDGKMDEEALWKSYKNNIDTLHKAFLK >gi|224461486|gb|ACDD01000016.1| GENE 13 9430 - 10329 1046 299 aa, chain - ## HITS:1 COG:FN1810 KEGG:ns NR:ns ## COG: FN1810 COG1108 # Protein_GI_number: 19705115 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Fusobacterium nucleatum # 1 297 1 297 297 323 72.0 2e-88 MLDMIRNFVISLANQGILPEAFGYEFIVNALICAVFIGPILGAVGTMVVTKKMAFFSEAV GHAAMTGIAIGILLGEPMQAPYVCLFAYCILFGLFINYTKNRTKMSSDTLIGVFLSFSIA LGGSLLILVAGKVNAHILESILFGSVLTVTDIDIYILLFSAFVLCVVITPYFNRMLLASF NPSLASVRGVNVKLIDYIFIAVVTVITIASVKIVGSILVEALLLIPAASAKNLAKSMKGF VCYSILFSLISCIVGIVFPIQLQISIPSGGAIISVAGSIFFLTIIIRTIFKKFLEGEAI >gi|224461486|gb|ACDD01000016.1| GENE 14 10344 - 11060 227 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 7 222 9 216 309 92 31 2e-18 MSKGIRIEIKNLNLTLSNTEILKNINLTIQEGSIHCLVGPNGGGKTSLLRCILGQMPFTG EISFHYDEKEATGENGKYTIGYVPQILDFERTLPITVEDFMCMTYQTKPCFLGSTKKYKP IMEDLLKHLSMYDKRKRLLGNLSGGERQRVLLAQALYPLPNLLILDEPLTGIDKIGEEYF KNILNELKEKGVTILWIHHNLKQVKEMADFVTCIKQEIIFHGDPKVEIDEKRVLEIFA >gi|224461486|gb|ACDD01000016.1| GENE 15 11061 - 11972 1523 303 
aa, chain - ## HITS:1 COG:FN1812 KEGG:ns NR:ns ## COG: FN1812 COG0803 # Protein_GI_number: 19705117 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Fusobacterium nucleatum # 1 303 1 302 302 394 66.0 1e-109 MLRKVLAIFLFTIFSIFSLGANKLKVGVTLQPYYSYVANIAGDKVDLFPVIRGDLYDSHN YQPQYEDLKQLGKADVVVVNGVGHDEFVFDMIKAVPNKNKIKIIYSNAGVSLMPVSGSRS SEKIMNAHTFISITTSIQQVYNIAKELGKLDPANKDYYMKNAREYAKKLRKIKTDALAKV SAYKKIDFRVATMHGGYDYLLSEFGVDVKAVIEPAHGIQPSAKDLKEVIDVVKRDKIDII FGEAAFQSKFIDTLHKETGVEVRSLSHMTNGPYTKDSFEKFIKEDLDSVISAMQFVAKKK GLK >gi|224461486|gb|ACDD01000016.1| GENE 16 11984 - 12526 582 180 aa, chain - ## HITS:1 COG:no KEGG:FN1814 NR:ns ## KEGG: FN1814 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 180 12 189 192 115 36.0 1e-24 MSSKKENLFLSLIVIIILCLAYILVQLTSKKNIGQIITTTQISAYKDLSNVNNSFYTELT NSLVEIEAIKEEEGKIPDISKLEEEEISPYLKDDLWEERGALEWQKIEYKTGIYYLGISK QVNLVGNYLIEFNLEEMDKSVIYYNNEQDDGRSLPKTISHLEEHWKEIVPYTGTEEREKF >gi|224461486|gb|ACDD01000016.1| GENE 17 12686 - 12793 74 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNFKEHFYENALFLCTKSNLFMLFLKNNIVFYEI >gi|224461486|gb|ACDD01000016.1| GENE 18 12796 - 14589 2509 597 aa, chain - ## HITS:1 COG:FN0777 KEGG:ns NR:ns ## COG: FN0777 COG0481 # Protein_GI_number: 19704112 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Fusobacterium nucleatum # 1 597 8 604 604 1016 86.0 0 KQKRNFSIIAHIDHGKSTIADRLLEYTGTISARDMKEQLLDSMDLEREKGITIKAQAVTL LYKAKDGLEYELNLIDTPGHVDFIYEVSRSLSACEGALLVVDAAQGVEAQTLANVYLAIG NDLEVVPIINKIDLPAAEPEKVKKEIEDIIGLPAEDAVLCSGKTGIGIEDVLEAIVQKIP APHYEEEGPLKALIFDSKFDDYRGVITYIKVEDGSLKKGDKIKIWSTEKEFEVLELGIFS PHMVPKEELGTGSVGYIITGVKSIHDTRVGDTITHPNRPCLFPMAGFKPAQSMVFAGIYP LFTDDYEDLREALEKLQLNDASLTWVPETSVALGFGFRCGFLGLLHMEIIVERLRREYNL DLISTTPSVEYKVTIEGQEQMIIDNPCEFPEPGRGRIHVEEPFIRGKVIVPKEYVGDVMG 
LCQEKRGIFLAMDYIDENRSMLTYELPLAEIVIDFYDKLKSRTKGYASFEYELSEYRESN LVKVDILVSGKPVDAFSFIAHNDSAYTRGRAICEKLKDVIPRQQFEIPIQAALSSKIIAR ETIKPYRKNVIAKCYGGDITRKKKLLEKQKEGKKRMKTIGNVEIPQEAFVSVLKLNN Prediction of potential genes in microbial genomes Time: Fri May 20 01:50:37 2011 Seq name: gi|224461485|gb|ACDD01000017.1| Fusobacterium sp. 3_1_5R cont1.17, whole genome shotgun sequence Length of sequence - 3260 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 106 - 396 433 ## gi|257451937|ref|ZP_05617236.1| hypothetical protein F3_02651 2 1 Op 2 . - CDS 393 - 1217 906 ## COG3177 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1248 - 1880 540 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 4 1 Op 4 . - CDS 1971 - 2861 764 ## DSY5047 hypothetical protein 5 1 Op 5 . - CDS 2842 - 3069 362 ## gi|257451941|ref|ZP_05617240.1| hypothetical protein F3_02671 - Prom 3193 - 3252 12.1 Predicted protein(s) >gi|224461485|gb|ACDD01000017.1| GENE 1 106 - 396 433 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451937|ref|ZP_05617236.1| ## NR: gi|257451937|ref|ZP_05617236.1| hypothetical protein F3_02651 [Fusobacterium sp. 
3_1_5R] # 1 96 1 96 96 171 100.0 2e-41 MKYGYDSLFLTLLAFSFLGCGGTKYPVYKEDDKIDLLFEAIVNEDEKSMKELKVIPSQLT AGKNQGDKIATQEYFDWQEKIRAVEFIKAEKENKKQ >gi|224461485|gb|ACDD01000017.1| GENE 2 393 - 1217 906 274 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 30 242 29 242 263 125 34.0 9e-29 MNTQKQINLDLYKKFLDTKRPLEDCIVRKLETELKTSYIYHSNAIEGNTLTLKETDVILE YGITVKGKSLQEHLEVKGQEYAVNFLKEEVKHRTELNIELIKNFHSLILSGIDPLHAGTF KKYSNFIGGTNVQTVSPFQVEYELNQLIEKYNKDTNNNLIEKIAKFHADFEKIHPFSDGN GRTGRLIMNFELMKKGYPICIIRNEDRLEYYDSLELAQTKKDYSKIISFITTSLEHTFEF YFKHLSQDWKKELAEFQRIKTPFQKKEKDKEIER >gi|224461485|gb|ACDD01000017.1| GENE 3 1248 - 1880 540 210 aa, chain - ## HITS:1 COG:YPCD1.91 KEGG:ns NR:ns ## COG: YPCD1.91 COG1961 # Protein_GI_number: 16082774 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Yersinia pestis # 2 201 3 178 183 67 26.0 1e-11 MIYGYIRISSKTQNEERQIIALKDAGVSSDNIFIDRESGKNFNRASWQKLMAKLVVGDTL IIKELDRMGRNNKEIKENFELIKNKGCFLEFLENPLLSTRNKSQIEIELIQPLILHLLGY FAEKERDKILTRQKEGYDSLDTDEKGRKISKKKNKVVGRPSKIENLSLEQKRYIEAWIQG NIKISDCIKNTRIGKTSLFKIKKLRRAKIV >gi|224461485|gb|ACDD01000017.1| GENE 4 1971 - 2861 764 296 aa, chain - ## HITS:1 COG:no KEGG:DSY5047 NR:ns ## KEGG: DSY5047 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 8 288 11 291 324 261 48.0 2e-68 MKKEENSIVDFTDIIENNPSFRAYSGANGIKKGILYHGKPYMLKITHRNKDSRYTNSILS EYICSKIFSILGFSVQEVILGKIMDNGKEKLCVACKDFKEKGEYLYEFLSIKNSLLKDES SNGSGTELSEILSTIKEQKFINKNEVTKFFWDMFIVDSYLGNFDRHNGNWGFLVNENTKS TRIAPVYDCGSCLYPAATDDDLILFLNSKEEMNKRIYTFPTSAIRLEDKKINYFDFLSST DNIHCIESLKRITSIISAKEIEVENFIESLPISNIRSTFYKTILKERKDFRKSLRT >gi|224461485|gb|ACDD01000017.1| GENE 5 2842 - 3069 362 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451941|ref|ZP_05617240.1| ## NR: 
gi|257451941|ref|ZP_05617240.1| hypothetical protein F3_02671 [Fusobacterium sp. 3_1_5R] # 1 75 1 75 75 124 100.0 1e-27 MKQKEDWEEQLPQYLKHDIENVEKYSFKTSTVYDCYLDEVYGSINACQWDGVISIEQADY LRKKYWEGNYEKGRE Prediction of potential genes in microbial genomes Time: Fri May 20 01:51:06 2011 Seq name: gi|224461484|gb|ACDD01000018.1| Fusobacterium sp. 3_1_5R cont1.18, whole genome shotgun sequence Length of sequence - 46866 bp Number of predicted genes - 51, with homology - 45 Number of transcription units - 17, operones - 10 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 72 111 ## + Term 237 - 277 1.3 2 2 Op 1 . - CDS 93 - 629 549 ## COG3663 G:T/U mismatch-specific DNA glycosylase 3 2 Op 2 . - CDS 566 - 835 109 ## gi|257451943|ref|ZP_05617242.1| hypothetical protein F3_02681 4 2 Op 3 . - CDS 789 - 929 135 ## gi|257451944|ref|ZP_05617243.1| hypothetical protein F3_02686 5 2 Op 4 . - CDS 929 - 1546 791 ## COG1279 Lysine efflux permease - Prom 1576 - 1635 12.0 - Term 1593 - 1641 6.5 6 3 Tu 1 . - CDS 1655 - 3349 2559 ## COG0405 Gamma-glutamyltransferase - Prom 3442 - 3501 6.2 + Prom 3401 - 3460 8.6 7 4 Op 1 1/0.000 + CDS 3486 - 4271 1090 ## COG0796 Glutamate racemase 8 4 Op 2 . 
+ CDS 4283 - 4885 744 ## COG0491 Zn-dependent hydrolases, including glyoxylases
+ Term 4898 - 4934 4.1
- Term 4880 - 4928 11.0
9 5 Op 1 17/0.000 - CDS 4931 - 6157 1621 ## COG0151 Phosphoribosylamine-glycine ligase
10 5 Op 2 10/0.000 - CDS 6174 - 7610 1859 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful)
11 5 Op 3 21/0.000 - CDS 7678 - 8238 682 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN
12 5 Op 4 13/0.000 - CDS 8226 - 9242 839 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase
13 5 Op 5 2/0.000 - CDS 9254 - 10603 1702 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
14 5 Op 6 4/0.000 - CDS 10633 - 11349 1115 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase
15 5 Op 7 1/0.000 - CDS 11379 - 11855 740 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase
16 5 Op 8 . - CDS 11868 - 15590 4596 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain
17 5 Op 9 . - CDS 15600 - 15713 144 ##
+ Prom 15916 - 15975 8.9
18 6 Tu 1 . + CDS 16048 - 16332 167 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35
+ Term 16350 - 16392 0.0
- Term 16333 - 16385 10.2
19 7 Op 1 1/0.000 - CDS 16399 - 17472 1489 ## COG1363 Cellulase M and related proteins
20 7 Op 2 . - CDS 17487 - 18995 2199 ## COG0747 ABC-type dipeptide transport system, periplasmic component
21 7 Op 3 1/0.000 - CDS 19062 - 19838 955 ## COG0561 Predicted hydrolases of the HAD superfamily
22 7 Op 4 . - CDS 19851 - 20654 1033 ## COG0607 Rhodanese-related sulfurtransferase
23 7 Op 5 . - CDS 20686 - 20772 70 ##
24 7 Op 6 . - CDS 20759 - 22084 1264 ## COG0733 Na+-dependent transporters of the SNF family
- Prom 22120 - 22179 16.6
25 8 Op 1 . - CDS 22315 - 23010 691 ## COG2964 Uncharacterized protein conserved in bacteria
26 8 Op 2 . - CDS 23027 - 23473 577 ## gi|257451965|ref|ZP_05617264.1| hypothetical protein F3_02791
27 8 Op 3 . - CDS 23473 - 23973 829 ## COG2190 Phosphotransferase system IIA components
28 8 Op 4 . - CDS 24014 - 24787 908 ## COG0566 rRNA methylases
29 8 Op 5 1/0.000 - CDS 24789 - 25886 1268 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog)
30 8 Op 6 . - CDS 25896 - 26921 1375 ## COG0687 Spermidine/putrescine-binding periplasmic protein
31 8 Op 7 . - CDS 26978 - 27778 1042 ## COG0561 Predicted hydrolases of the HAD superfamily
- Prom 27855 - 27914 9.8
- Term 27953 - 27986 -0.3
32 9 Tu 1 . - CDS 28007 - 29260 1277 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen
- Prom 29306 - 29365 4.2
- Term 29280 - 29324 2.6
33 10 Op 1 . - CDS 29368 - 29436 75 ##
34 10 Op 2 . - CDS 29436 - 30206 668 ## COG0286 Type I restriction-modification system methyltransferase subunit
35 11 Op 1 . - CDS 30318 - 30776 183 ## HMPREF0868_0125 hypothetical protein
- Prom 30817 - 30876 3.4
- Term 30878 - 30915 -0.9
36 11 Op 2 . - CDS 30990 - 31832 653 ## Apre_0343 KilA domain protein
- Prom 32043 - 32102 10.1
+ Prom 32035 - 32094 6.3
37 12 Tu 1 . + CDS 32142 - 32372 364 ##
- TRNA 32143 - 32217 71.6 # Gln TTG 0 0
- TRNA 32221 - 32297 82.6 # Pro TGG 0 0
- Term 32084 - 32137 6.7
38 13 Op 1 . - CDS 32319 - 33635 1201 ## COG0534 Na+-driven multidrug efflux pump
39 13 Op 2 . - CDS 33638 - 35371 1947 ## CLI_1231 hypothetical protein
40 13 Op 3 . - CDS 35379 - 35714 518 ## gi|257451977|ref|ZP_05617276.1| hypothetical protein F3_02851
41 13 Op 4 . - CDS 35767 - 35847 65 ##
42 13 Op 5 . - CDS 35832 - 37595 248 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
43 13 Op 6 . - CDS 37624 - 37995 194 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20
44 13 Op 7 .
- CDS 38040 - 39593 510 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 39677 - 39736 7.6 + Prom 39577 - 39636 11.5 45 14 Tu 1 . + CDS 39716 - 39985 483 ## gi|257451981|ref|ZP_05617280.1| hypothetical protein F3_02871 + Term 39991 - 40033 2.8 - Term 39984 - 40017 2.1 46 15 Op 1 . - CDS 40022 - 41068 1511 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 47 15 Op 2 . - CDS 41089 - 42240 1283 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 48 15 Op 3 . - CDS 42249 - 43022 258 ## PROTEIN SUPPORTED gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e - Prom 43044 - 43103 10.6 - Term 43088 - 43133 5.5 49 16 Op 1 . - CDS 43145 - 43777 613 ## COG3022 Uncharacterized protein conserved in bacteria 50 16 Op 2 . - CDS 43790 - 44944 1667 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase - Prom 44975 - 45034 12.7 - Term 45036 - 45076 5.1 51 17 Tu 1 . - CDS 45153 - 46511 829 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 46575 - 46634 10.9 Predicted protein(s) >gi|224461484|gb|ACDD01000018.1| GENE 1 1 - 72 111 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no GSENVNKIQTILDEIYEENIGYI >gi|224461484|gb|ACDD01000018.1| GENE 2 93 - 629 549 178 aa, chain - ## HITS:1 COG:Cj1254 KEGG:ns NR:ns ## COG: Cj1254 COG3663 # Protein_GI_number: 15792578 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Campylobacter jejuni # 17 170 1 150 160 142 46.0 4e-34 MERKKRIVQEGGHHGEMLERIVHPFPAFYQKNSTILILGSFPSVKSREENFFYGHLQNRF WKMLAKIFEEEFPETQEQKKKLLKRHKIALWDVIHSCKIKGSSDSSIQDVIPNDLTEILR ESPIQKIICNGGTSYKYYKKYQEKILGKEAILMPSTSPANAGYSLERLVEIWRKEFKD >gi|224461484|gb|ACDD01000018.1| GENE 3 566 - 835 109 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451943|ref|ZP_05617242.1| ## NR: gi|257451943|ref|ZP_05617242.1| hypothetical protein F3_02681 [Fusobacterium sp. 
3_1_5R] # 5 89 1 85 85 136 98.0 4e-31 MQKKVIYIGLYIIHSKYHRQGVGKNLFRKLENAFIQNKFQKIRLAVILENTISFQFWKQM EFIEKERKIWKGKSGLYKKVVIMEKCLKG >gi|224461484|gb|ACDD01000018.1| GENE 4 789 - 929 135 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451944|ref|ZP_05617243.1| ## NR: gi|257451944|ref|ZP_05617243.1| hypothetical protein F3_02686 [Fusobacterium sp. 3_1_5R] # 1 46 1 46 46 68 100.0 2e-10 MRREVQTLEDIRAAFEIYQSNLYYFHVTHHRDAKESDLYRTLYNTF >gi|224461484|gb|ACDD01000018.1| GENE 5 929 - 1546 791 205 aa, chain - ## HITS:1 COG:FN1861 KEGG:ns NR:ns ## COG: FN1861 COG1279 # Protein_GI_number: 19705166 # Func_class: R General function prediction only # Function: Lysine efflux permease # Organism: Fusobacterium nucleatum # 1 201 1 202 207 228 64.0 5e-60 MNHYLQGLLMGLAYVAPIGLQNLFVINTALTQKKGRVFLTALIVIFFDVTLAFACFFGAG AVMEKSNILKVLILFIGSLIVIYIGYGLLKEKVSMRETEVNISITKVITSACIVTWFNPQ AIIDGTMMLGAFRASLPATESMKFILGVTSASCLWFLGISSFISLFSQKFDDKVLRGINL VCGIVIIFYGCKLFYSFIQILQGLV >gi|224461484|gb|ACDD01000018.1| GENE 6 1655 - 3349 2559 564 aa, chain - ## HITS:1 COG:FN0941 KEGG:ns NR:ns ## COG: FN0941 COG0405 # Protein_GI_number: 19704276 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyltransferase # Organism: Fusobacterium nucleatum # 22 564 34 579 579 582 55.0 1e-166 MKKSFLCSISLFLLLSAGLCAEEWKPYDEQGNVVRTGRDATGQNAVVSTARYEASKIGLD ILKNGGNAIDAAVGVGFALGVCEPQSSGLGGGGFMVVRLAKTGETKFIDFRETAPAKATP DMWVLDKDGNVIGNEKEFGGKSIGVPGSVKGFLYALNQYGNLKRKDVIQPSVDLARNGYK VSAIMNMDMKNQLENMIKYPETAKIYLKNGKPYEVGDTIKNPDLANTMEKIIEKGEEAFY SGPIAESIVKSAQEAGGLLSMEDMKNYSLRIKDPVHGNYRGYEIITSTPPSSGGAHIIQI LNILENYDMKSIPVGSTRYYHLLSEAMKMAFADRAKFMGDTEFVKIPLQGVINKDYAKTL QAKIDETKSQDYSEGDPWKFESKDTTHYSIVDKEGNIVAVTFTVNGVFASGVVAKDTGVL LNNEMDDFDTGHGKANSIIGGKKPLSSMSPTIILKDGKPVASLGGLGAQKIITGITQVAL LMMDYGMDIQEAINFPRIHDAYGTLTYEGRMNPQVVQELEKMGHEMKNGGEWLEYPCIQG VTMAEDGTLRGGADPRRDGKALGF >gi|224461484|gb|ACDD01000018.1| GENE 7 3486 - 4271 1090 261 aa, chain + ## HITS:1 COG:BS_yrpC KEGG:ns NR:ns ## COG: BS_yrpC COG0796 # 
Protein_GI_number: 16079734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Bacillus subtilis # 1 255 1 256 265 223 46.0 2e-58 MKIGVFDSGIGGLSVLHQAMQMLPQENFIYYADVDHVPYGTKTKEEIIKYTSEAVDFLVK EGVKAIVIACNTATSAAIQELRERYSLPIIGMEPAVKKAIDFHPEKRVLVIATPMTVQGE KLHNLIEKVDTEHLVDAIALPKLVTFAEEEMFEDEVVSAYLQEEFKQLNLEDYSSIVLGC THFNYFKESLKKLLPKGVKFLDGNEGTIKKLISELENINALEKNEKRKIEYYYSGRKLSE IQDITKMGRYIHRLNHMLMIK >gi|224461484|gb|ACDD01000018.1| GENE 8 4283 - 4885 744 200 aa, chain + ## HITS:1 COG:CAC2272 KEGG:ns NR:ns ## COG: CAC2272 COG0491 # Protein_GI_number: 15895540 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Clostridium acetobutylicum # 10 196 9 198 199 131 42.0 8e-31 MLEIKKQALGLYRTNCYVLIQEGKSVIIDPGFSPEIIEDMIAGTTPLAILLTHGHLDHVN AVKALHQKYHLPIYMSKKEDAILKLTTSVPEGYHRDFEAEYFDLQEGDLQIENFSFEIIA TPGHTEGSLCIRCENHLFTGDTLFRGTIGRTDIFSSDPKKMKESIQKIKKLDPKYIVYPG HSSNTTLEEEFLTNPFYQEM >gi|224461484|gb|ACDD01000018.1| GENE 9 4931 - 6157 1621 408 aa, chain - ## HITS:1 COG:FN0981 KEGG:ns NR:ns ## COG: FN0981 COG0151 # Protein_GI_number: 19704316 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Fusobacterium nucleatum # 1 405 1 424 426 495 60.0 1e-140 MRILVIGSGGREDAIAWKLQQNPRVEEIIIKSSSLSIEELLKIAKEEKIDFTMVGSEELL VKGIVDAFEKENLKIFGPNKQAAMLEGSKAFSKDFMKKYGVKTAKYENFKNSKEALAYIE KQDYPLVVKASGLAAGKGVIICQSLEEAKKAVQEIMVDKVFQDAGAEVVIEEFLEGVEAS ILSITDSKVILPFISAKDHKKIGEKETGLNTGGMGVIAPNPYVTEKVSEAFQKDILEPTL RGMKEEGMKFAGIIFFGLMITKKGVYLLEYNMRMGDPETQAVLPLLESDFLEMLEDALEG NLDANKIKWSKDSSCCVVLASGGYPVSYQKGYEIHGLDKIENHVFFAGVKKENRKYYNNG GRVLNIVATGANLEKAIEKAYRDIEKVSFQDSCYRKDIGTLYFPVIEI >gi|224461484|gb|ACDD01000018.1| GENE 10 6174 - 7610 1859 478 aa, chain - ## HITS:1 COG:FN0982 KEGG:ns NR:ns ## COG: FN0982 COG0138 # Protein_GI_number: 19704317 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP 
cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Fusobacterium nucleatum # 6 478 28 504 504 637 65.0 0 MEHDYEILSTGGTYRYLQENGVPVIEVSEVTKMQEMLDGRVKTLHPVIHGGILAVRGNEE HMSCIEKLGIHTIDMVVVNLYPFFEKVQSDISFEEKIEFIDIGGPTMLRSAAKSFQDVVV ISDPSDYEVVKEDISISGEVSYEHRKRFAGKVFNLTSAYDAAISNFLLEEDFPRYFSTSY EKKMDLRYGENPHQKAAYYVSTTENGAMKDFIQHQGKELSFNNLRDMDVAWKVVQEFDEE IACCGLKHSTPCGVAIAETVEDAFEKAYSCDPTSIFGGIVSFNREVNAKTAEELTKIFLE IIIAPSYTKEALEVLAKKKNLRVIECHQKPTDKMNLVKVDGGLLVQEEDRVNLDNLQVVT KKAPTEEEKKDLLFGMKVVKHVKSNAIVVVKNQMALGIGTGEVNRIWATQQAIERAGKGV VLASDAFFPFRDVVDCCAENHIQAIIQPGGSMRDQESIDACDEHGISMIFTGIRHFKH >gi|224461484|gb|ACDD01000018.1| GENE 11 7678 - 8238 682 186 aa, chain - ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 182 1 190 204 166 49.0 3e-41 MFKIAVLVSGGGTDLQSILDAIEDKKLTDCKVSYIVADRECRALERAKKYNIPFCILKKG ELNQFFQEKDMDLIVLAGYLSILPSDFLQRWEKKIINIHPSLLPKFGGKGMHGNHVHKAV LAAKEEKSGCTVHYVTEEIDGGEIILQREVPVYAEDTVELLQERVLEQEHILLPEAIQKI KEERKK >gi|224461484|gb|ACDD01000018.1| GENE 12 8226 - 9242 839 338 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 4 335 13 344 356 327 49 6e-89 MSNSYKSAGVDKEEGYKAVELMKQNVLKTHNKSVLTNLGSFGAMYELGAYKNPVLISGTD GVGTKLEIALKQKKYDTVGIDAVAMCVNDVLCHGAKPLFFLDYLACGKLDSEVAAELVSG VTEGCLQSGAALIGGETAEMPGFYKVGDYDIAGFCVGIVEKENLIDGSKVQEGDKIIALA SSGVHSNGFSLVRKVLTDYDEVISTKEHGSAKVSDILLTPTRIYVKNILKVLENFEVHGM AHITGGGLPENLPRCMGKEFSPVVWKDKVQKLEIFDIIQKRGNIPEEEMFGTFNMGIGYT LVVKAEDSEKIIDFLNSLGETAYEIGYIEKGDHSLCLK >gi|224461484|gb|ACDD01000018.1| GENE 13 9254 - 10603 1702 449 aa, chain - ## HITS:1 COG:FN0987 KEGG:ns NR:ns ## COG: FN0987 COG0034 # Protein_GI_number: 19704322 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine 
phosphoribosylpyrophosphate amidotransferase # Organism: Fusobacterium nucleatum # 1 447 1 448 448 749 83.0 0 MGILAVHSKKVRNDLVGIGYYGMYALQHRGQEGAGYTICDTITDNIVRQKTIKNVGLVSD VFLAEDFQRFTGNILIAHTRYGSASTGSSRNCQPIGGESAMGMISLVHNGDLSNQEELKK DLIEKGMLFHTAIDTEIILKYLSIYGIYGYRDAVLKTVEKLKGCFALAMIINDKLIGVRD PEGLRPLCLGRIKEDMYVLASESCALDAIGAEFVRDIRAGEMVIIDENGVESIQYQESNK KASSFEYIYFARPDSVIDGISVYEFRHTTGRYLYEQHPVEADIVIGVPDSGVPAAIGYAE ASGIPYSAGLLKNKYVGRTFIAPVQELRERAVKVKLNPIRRLIEGKRIIVVDDSIVRGTT SKKLIDTLYEAGAKEVHFRSASPIVIEESYFGVNIDPDNILMGSHMSVEEIREKIGATTL EYLSLENLKKSLGNGEDFYIGCFKEDEER >gi|224461484|gb|ACDD01000018.1| GENE 14 10633 - 11349 1115 238 aa, chain - ## HITS:1 COG:FN0988 KEGG:ns NR:ns ## COG: FN0988 COG0152 # Protein_GI_number: 19704323 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Fusobacterium nucleatum # 1 237 1 237 237 407 89.0 1e-113 MERREFLYEGKAKQLYATDDKDLVIVHYKDDATAGNGAKKGSIHNKGIMNNEITTLIFNM LEEHGIKTHFVKKLNERDQLCQKVQIFPLEVIVRNLIAGSMAKRVGIAEGTKPSNTIFEI CYKNDEYGDPLINDHHAVALKLATYEELKEIYSITAKINDLLREKFDKIGITLVDFKIEF GKNAKGEILLADEITPDTCRLWDKATGEKLDKDRFRRDLGNIEEAYIEVVKRLTETKA >gi|224461484|gb|ACDD01000018.1| GENE 15 11379 - 11855 740 158 aa, chain - ## HITS:1 COG:FN0989 KEGG:ns NR:ns ## COG: FN0989 COG0041 # Protein_GI_number: 19704324 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Fusobacterium nucleatum # 1 152 1 152 157 222 78.0 2e-58 MKVAIIFGSKSDIDVMKGAANCLKEFGIDYEAHVLSAHRVPELLEETLENLEKTGCKVII AGAGLAAHLPGVIASKTTLPVIGVPIKAALEGVDALYSIVQMPKSIPVACVGINNSYNAG MLAVQMLAIENEDLSKKLIEFRKNMKAKFAEDNKTVEL >gi|224461484|gb|ACDD01000018.1| GENE 16 11868 - 15590 4596 1240 aa, chain - ## HITS:1 COG:FN0990_1 KEGG:ns NR:ns ## COG: FN0990_1 COG0046 # Protein_GI_number: 19704325 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: 
Fusobacterium nucleatum # 16 980 5 977 983 1069 56.0 0 MCSDFFMLKNNIGGIMKNCRIFVEKKEGFNLEAKRLCKEWKEALQLSSLTKVRILNCYDV FGANDIEDAKKMIFSEVVTDMVSENFDETIPHFAVEFLPGQFDQRADSAYQCMNLLSTEN ENVVITSGKLFLLEGSISSEDVEKAKKFYINPVEMREKDLKKLEQETLQFQSSVPMIEDF KGLKEEMELAMSQEDLDFIETYFKEEEKRMPTETEIRVLDTYWSDHCRHTTFETELREII FPKGSFGEELQRVFDKYLADKQVSLMEMAKLIGKKMRKEGKLDDLEVSEEINACSVYIDV DVDGEIEKWLLMFKNETHNHPTEIEPFGGASTCLGGAIRDPLSGRSYVYQAIRVTGAANP LEAFEDTLEGKLPQKKITTAAAHGYSSYGNQIGLTTGLVSEIYHEGYKAKRMEVGAVVAA TPARNVRRETPIAGDIIILLGGKTGRDGCGGATGSSKEHTKDSLALCGAEVQKGNAPEER KIQRLFRKEKVSQMIKKCNDFGAGGVSVAIGELADGLKINLDLIPTKYAGLNGTELAISE SQERMAVVIAKEDEASFLEEAALENLEATKVAEVTEEKRLILIWKGQEIVNLSRAFLDTN GVRQKAKVEVETPSGKNPFQEVLFRGNTLAESWQTCMKDLNVASQKGMVEMFDSNIGAGT ILMPFGGKYQMTPSDVAVQKISVEKGHTTTASAITWGYNPNISSWSPYHGAAYAVVESLA KLVSVGVDYRKVRLSFQEYFQKLGKDAKNWGKPFAALLGSLEAQESFGTPAIGGKDSMSG SFQDLHVPPTLISFAVAPVSTKEVISPELKKVGSHIYLLKHQALENSMPNYEICKKNFTW LHEQITAGKVLSCMTIKMGGIAEALTKMSFGNQIGLELQNIGEDFFKLAYGSFILESEET LEFENLEYLGKTIQKYQIHILEKETSTILAADKLEQEWLNVLAPVFPYEYKEEKKEIYTL DTYVNIEIYHSKDRIAKPRVLVMAFPGTNCEYDSAKAFRDAGADPHILVFRNLKPSYIET SIEAMIQELKQAQILMLPGGFSAGDEPDGSGKFIATVLQNPRIMAEIQNFLDRDGLILGI CNGFQALIKSGLLPYGKLGTVTENSPTLTFNKMGRHVSQMVRTKIVSNKSPWLSSFHVGD EFIVPVSHGEGRFYVQEEELKSLIQKGQIVTQYVDFEGKATNEFHHTPNGSTCAIEGIVS PDGRILGKMGHSERKGEDLYKNIPGNKVQDIFSNGVKYFK >gi|224461484|gb|ACDD01000018.1| GENE 17 15600 - 15713 144 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYFMPKIKMKKHKKRSYNIGHSVHRNQIVYDKSESFL >gi|224461484|gb|ACDD01000018.1| GENE 18 16048 - 16332 167 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 1 94 3 96 96 68 37 5e-11 TMTKKEFVAALAKKAEVTGKEADKMVKCFLELVEESLVAGNDVKFIGFGSWETKKREARK LRNPQTGKEMKIAAKRVVKFKVGKALADKVAAKK >gi|224461484|gb|ACDD01000018.1| GENE 19 16399 - 17472 1489 357 aa, chain - ## HITS:1 COG:lin1180 KEGG:ns NR:ns ## COG: lin1180 COG1363 # Protein_GI_number: 16800249 # Func_class: G Carbohydrate transport and metabolism # 
Function: Cellulase M and related proteins # Organism: Listeria innocua # 1 354 1 353 359 202 36.0 1e-51 MKRVLEMTKAFTNAFGAPGFEDDVLEEIKKQIPDMKWERDSINNLFIYFSEKEKQKPTVL LDCHSDEVGFMIEHINDNGSLRFLPLGGWHIGNIPAMSVIIKNSQGEYIPGVVASKPPHF MTEEERSRLPKLSELSIDIGTSSYEETVNLYGIEIGNPVVPDVNFSYDEKIGIMRAKAFD NRLGAVAAIEVLKQFQEMGKMLDVNLVVSISSQEEVGLRGAQVAAQRIQPDFVIVFEGSP ADDSFQSGREAKGKLRGGVQLRALDAAMVSNPRVLEFAKRIAREKQIPFQMIVREKGSTN GGKYHITGRGIPTLVLGIPTRYAHTSYCYASLLDTKAAIDLAREVIEELDREQIETF >gi|224461484|gb|ACDD01000018.1| GENE 20 17487 - 18995 2199 502 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 2 493 3 498 511 383 39.0 1e-106 MKKWTMSIILCLFSFLLLACGGKEAVEGEKKDTLVYAQISEGKTLDPQDTTEQYSQRSVS LIYSRLVEINEKTGGIDPGLARSWERPNPNEIIFHLRNDVKFSNGYDFTAEDVKFTIERA QSLPKVAHLYKPITEITILDPYTISLKTTEAFAPLLNHLTHKTSSILSKKYYDEVGDKYF ENPVGTGPYMLKEWKIGDRLELEANPNYFDGEPSIKHVVFRAIPEESTKVIGLQTGEIDM VGDVEAVSRETIAADDNLGLIEGSSVNTIYLGMNTERKIFADKEVRKAISMGVNRDDIVN SLLAGAGQKANSFLAPTVFGYSKDSKVYEYNPEEAKKIIAEKGLVGSKIKIAVSNSQLRS QMAEIIQAQLKEIGLEVSIENLEWGTFLSATANGDVDMFILGWGPSTYDGDYGLFPNFHS SQKGGEGNRSQYANPKMDQLLEDARKEMDVEKRRSLYIEATDLINEEAVVLPLYYPLTSV GYNKALKGVEAESYPMIHKYSY >gi|224461484|gb|ACDD01000018.1| GENE 21 19062 - 19838 955 258 aa, chain - ## HITS:1 COG:FN0869 KEGG:ns NR:ns ## COG: FN0869 COG0561 # Protein_GI_number: 19704204 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 258 7 270 270 181 38.0 1e-45 MKWVISDLDGTLLNNDRTVGEKTILGVQNLLKKGYPFVIATGRGFASANTIREKLGVPIY MVCNNGATIYSPKGELIFENYIPVEMVKKVTACLEKYRVDYRGFFQDYYFMPNYGKEDMK RIEYKAVILEKDEDFQILEKILVVDPNTDLLRKIQKELQEEVGEELTITLSSSECLDINS KNCSKAAGIEKVATYLQLHLQDAIAFGDSENDFAMLASVGKAVSMKGTYAAQEKDYEVTE FTNHEDGVIRHLEKYIKF >gi|224461484|gb|ACDD01000018.1| GENE 22 19851 - 20654 1033 267 
aa, chain - ## HITS:1 COG:FN0870 KEGG:ns NR:ns ## COG: FN0870 COG0607 # Protein_GI_number: 19704205 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Fusobacterium nucleatum # 34 267 3 240 240 168 39.0 8e-42 MKRILNFDEDMEQILRKMEEEVGIFCSRKRNRNLGIQNLGDYCYFSVGKTMDELEKEGIN FYKSKDVANDYNGPKPSYSMVHIKLLYSPEFKILGAQMIGRGNLERRYEVLKKFLSEGKG LKELAEYSIYGKTLEEEMDILNLSAFYAMEVSKPLVPVEEVRKLQESEAFFLDVREEEEH EYACILGSTNIPLHSLVQRLSEIPRDKKVFVYCRSAHRSLDAVNFLRGMGYNNVYNVEGG FIAISYEEYTKDKEEKREKIVSRYNFE >gi|224461484|gb|ACDD01000018.1| GENE 23 20686 - 20772 70 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELGKEVKEVLKSLFFLCEIVIEKILET >gi|224461484|gb|ACDD01000018.1| GENE 24 20759 - 22084 1264 441 aa, chain - ## HITS:1 COG:FN1944 KEGG:ns NR:ns ## COG: FN1944 COG0733 # Protein_GI_number: 19705249 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 6 441 20 459 459 355 45.0 1e-97 MEKHFKKRDSFQNKIGFILACVGSAVGMGNIWLFPYRVGEFGGAAFLFPYLFFVVLLGLT GVSGEMAFGRAMRSGPLGAFKKALEKRGKKYGAFLGFIPVLGSLGIAIGYAVVVGWILKY TVQSFSGILQVTENYGELFGTITTRYSSLTWHFTAILISLLIMVAGIQGGIEKINRVLMP LFFGLFCLLAIRVFFLENSISGYEFLWKTDFEKIFQIKTWIFALGQAFFSLSLAGSGTVV YGSYLKEEVDVINSSIHVAFYDTFAAILAALVIIPAVFSFGMEVSAGPGLMFLVMPSVFQ QMPFGRIFSSLFFLAVFFAGITSLVNLFESSIEALEEKFSFSRRKAVSIVMIFSFLIGIF VEDVNYLGKLMDIVSIYLIPLGAFLSAILFYWVCGDEFVRREIQKGRLKSFPKCLLPMGK YVFCGISFVVFILGIFYHGIG >gi|224461484|gb|ACDD01000018.1| GENE 25 22315 - 23010 691 231 aa, chain - ## HITS:1 COG:Cj1387c KEGG:ns NR:ns ## COG: Cj1387c COG2964 # Protein_GI_number: 15792710 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 13 228 1 215 218 58 29.0 7e-09 MDAPFLFKVGGKMTEFQKEYYSGMIEFLSATFGETIEISLFEVLENKRTSLCAKSKNCLK DLGDEVDRNLLFCIREYKKEQKYTAKLPWKEKNGDLSRVSFYYIQDEKKNLTGILCIKKN ISPMIVAANFLNESLKALTGGPERNLEEEVSNKGKWKQENTLLKYSQYVIEDYFDSLNVP 
SYAMTVEERIKVVETLNQKGIFQLKGNIIEVAKRLDISEKTLYRYLKKEIE >gi|224461484|gb|ACDD01000018.1| GENE 26 23027 - 23473 577 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451965|ref|ZP_05617264.1| ## NR: gi|257451965|ref|ZP_05617264.1| hypothetical protein F3_02791 [Fusobacterium sp. 3_1_5R] # 1 148 1 148 148 237 100.0 2e-61 MYLDDFRDSLAFYDEETHKVLKVVNFQPVLRYVHSEYVSDGRKYVHEILKEYPKYRKIIV DEISEEYYAREYADTMWQRDCEFFFQEVKTILKQCNYEFLSMPKLERKKKIIRLEELFSR YENTWQYQYVDFENVKEDYRYILQWKNR >gi|224461484|gb|ACDD01000018.1| GENE 27 23473 - 23973 829 166 aa, chain - ## HITS:1 COG:FN0915 KEGG:ns NR:ns ## COG: FN0915 COG2190 # Protein_GI_number: 19704250 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Fusobacterium nucleatum # 1 166 1 164 164 206 63.0 2e-53 MGLFNNLFGKKEEKKVVTIYAPVNGTVIDLAEIPDPAFAEKMVGDGCGMEPKEGAICSPV NGEIANIFDTRHAVSFDSEDGLEMIVHFGIDTVKLKGEGFKALRGEGETKVGDAIVEYDL AYIAANAPSTRTPVIINNMEEVEKIEVIALGKEVKAGDPIMKVTLK >gi|224461484|gb|ACDD01000018.1| GENE 28 24014 - 24787 908 257 aa, chain - ## HITS:1 COG:FN0875 KEGG:ns NR:ns ## COG: FN0875 COG0566 # Protein_GI_number: 19704210 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Fusobacterium nucleatum # 7 253 9 258 261 210 50.0 2e-54 MRDILKISSLENDQLKFFSKLKKKKYREEAKLFLAEGRKFLEYSENASYVLFREDIPIEE TILEKFNCPIISLSSKCFEKVSVQENSQGVIILYSYKNIVLSRDARQLIILDDIQDPGNL GTIIRLVDASGFSDIILTKNSVDYYNEKVVRSSMGSIFHVNLHTMEKIEIVDYLKKQQYN IVVTSLQEDSIPYMEMSLDERNAFVFGNEGHGVSKEFLEIADEKVIIPISGQAESLNVAM ALGILLFYSRDLKRVLE >gi|224461484|gb|ACDD01000018.1| GENE 29 24789 - 25886 1268 365 aa, chain - ## HITS:1 COG:FN0617 KEGG:ns NR:ns ## COG: FN0617 COG0592 # Protein_GI_number: 19703952 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Fusobacterium nucleatum # 1 363 1 363 364 283 40.0 5e-76 MKFKIKREEFISVLSDYTSILKENSIKPILSALFMEVKENELVFMGSSIEMDYRKQIQCE 
GMEEGAVAFKPALVLEYIKLLEEEWLTVEKLDGFLKIANGEFAILEEENYPKIVELASMS LLQIQGNEFAKYLETVKFSASQTPENLALNCIRVVFGKEKINFVSTDSYRLLYLEKKIRA QFERAISLPLEAVNVIIKLLKEKTEVISLELSGENLLLLWEGTYFSCRLTAVPYPNFQGI LNQNFFDKKMEFCLEDLKAAMKRVITVAKTSIDAKYGGTFDFKGKQLIVKAVTTGRAKTQ QKVAMMKEGDDFIASLNCKYLSEFLDTISKNVIIYGKNSSSMFRVMEEGNEELIYILMPL ALREV >gi|224461484|gb|ACDD01000018.1| GENE 30 25896 - 26921 1375 341 aa, chain - ## HITS:1 COG:FN0618 KEGG:ns NR:ns ## COG: FN0618 COG0687 # Protein_GI_number: 19703953 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Fusobacterium nucleatum # 1 341 1 342 342 443 64.0 1e-124 MKKLLLALCSSLLLLACGAKDDTNSLYLYGWADYIPHEIYEDFEKETGIHVVEDIFSSNE EMYTKLKAGGDGYDIVMPSSDYVEIMMKEGMIEKLDKSKISTLGNIAPFIMQKLQAFDKN NEYAVPYNTSVTVIAVNKNYVKDYPRSFDIFNREDLQGRMTLLDDMREVMTSALAIHGYD QKTPSVEAMEKAKQTILSWKKNIAKFDSESYGKGFAAGDFWVVQGYPDNIFRELSEEERA NVDLIIPEVGAFGSIDSFAILANAKHKENAYKFIEYIHRPEVYAKLADILELPSINEPAT KLMKTKPLYDLSELEKVQVLMDIHETLDLQNKYWQEILIAD >gi|224461484|gb|ACDD01000018.1| GENE 31 26978 - 27778 1042 266 aa, chain - ## HITS:1 COG:FN0391 KEGG:ns NR:ns ## COG: FN0391 COG0561 # Protein_GI_number: 19703733 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 264 1 263 267 238 47.0 9e-63 MKYKAVVCDMDGTLLNGEHRVSERSKNIIKTIIEKGVKVFLASGRPYPDIQYFKKSLGLN SYSISSNGAVVHDEQGKEIMYYSLEKELLSELLNLPFGNLHRNLYTRNSWYVEVALKELL EFHKESGFAFQQISNLAEKNDGNATKLFFLDESEKSILDFEKKLKAKFEDRVSITLSTPN CLEIMKKGVNKGRAVKDTMQKLGIPLEEVIAFGDGLNDYEMLSLVGNPFVMSNASPRLLE ALSEVPRAPKNTEDGVAQILERLFLK >gi|224461484|gb|ACDD01000018.1| GENE 32 28007 - 29260 1277 417 aa, chain - ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans 
str.C2A # 3 415 11 446 458 129 27.0 1e-29 MKESRELEFKATITNTFLKTVSAFSNYHSGRIIFGIDDNGNVVGLDEIEEFCLDLENKIN DNISPKPDFRFIKDRKQKTITLVVDEGVNKPYLYKGKAYKRNDTSTVEVDRIELNRLTLL GLNRYYEELKAKNQNLEFTVLKKELEEKVSLKEFSKDVLKTLNLYDDKNGYNHAAEILAD KNAFPGIDIAKFGKNIDEILDRNLFTHISIISQYKKTMEIFNRYYKYEQVVGSERIEKEL IPERAFREVLANALIHRTWDINSNIRIAMYDDKIEVSSPGGLPSGISKKEYLNGQISQLR NPILGNIFFRLKYIEMFGTGIRRINESYKNFMVKPSFEIFENSIKIVLPVLVTKLLLTTD EERIMEVFKEGNILSSGEILKMTEFKKDKLNRLLKKLIQKNYIEMIGNGRGTKYLKR >gi|224461484|gb|ACDD01000018.1| GENE 33 29368 - 29436 75 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKALTCNLYTDIYKFNRRRLLN >gi|224461484|gb|ACDD01000018.1| GENE 34 29436 - 30206 668 256 aa, chain - ## HITS:1 COG:NMA1038 KEGG:ns NR:ns ## COG: NMA1038 COG0286 # Protein_GI_number: 15793994 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Neisseria meningitidis Z2491 # 2 251 237 513 514 320 59.0 2e-87 MKKQFEEHIIEEGFFGQEINMTNFNLARMNMFLHNVNYNNFSIKRGDTLLNPLHNEEKPF DAIVSNRPYSIKWVGDADPTLINDERFAPAGKLAPKSYADYAFIMHSLSYLSSKGRAAIV CFPGIFYRKGAERTIRKYLVDNNFVDCVIQLPDNLFFGTSIATCILVMAKNKTENRVLFI DASKEFKKETNNNILEEKNINTIIEEFRNREEKEDTREIIDIKVLNQEIEETVRKIDSLR ASINEIIKKLEEEGES >gi|224461484|gb|ACDD01000018.1| GENE 35 30318 - 30776 183 152 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0125 NR:ns ## KEGG: HMPREF0868_0125 # Name: not_defined # Def: hypothetical protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 2 152 55 207 207 216 76.0 1e-55 MGNRIEKSQRNGNKNQTEVVTLCTPLLECAFSFNQKFRDYFSAVTCVSPFKFNADMATAW RKVKTENDLNFTIQDMLKVYYGESDYAKYDNSACQWNQFLKDFCSDEFSDYYSNKLKVAA IIWKEVRDSKNEKIYSRQLLNEYGDKIEEYHK >gi|224461484|gb|ACDD01000018.1| GENE 36 30990 - 31832 653 280 aa, chain - ## HITS:1 COG:no KEGG:Apre_0343 NR:ns ## KEGG: Apre_0343 # Name: not_defined # Def: KilA domain protein # Organism: A.prevotii # Pathway: not_defined # 1 276 1 277 279 396 75.0 1e-109 MSKMKKETIEAKGFAIQIYTEDFKNDYISLTDIARYKSDEPFIVINNWLRSKDNIQFLGL 
WESMHNPDFKPIEFDRFRNEAGSNAFTLSPQKWIEKTNAIGIVSKSGRYGGTFAHSDIAM EFASWISPEFKLYIIQDYKRLKSDENSELSLSWNLNREISKINYKIHTDAIKEYLLKDLT NEQLSYKYASEADMLNVALFNKRAKQWREENPDLNGNMRDYASLNELLVLANMESYNAVL IGKGIDQKERMIELRKLARTQLMSLEKLSDSGIKKLEGKK >gi|224461484|gb|ACDD01000018.1| GENE 37 32142 - 32372 364 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGMAGFEPTHNGVKVRCLTAWRHPKMVGIARFELAALCSQGRCATGLRYIPYRTLKILS QKQFSVKLYFILFFKK >gi|224461484|gb|ACDD01000018.1| GENE 38 32319 - 33635 1201 438 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 407 7 418 448 214 31.0 2e-55 MEKKSINRLFLEYMIPSTTGLLVTALYVIVDSIFIGRGIGQNALASLNIAYPIITVSSAI SLMIGMGASTVMTLHAGKKRIRELSLSYVLFFNGFFYLFLIFLVFCFPKFLMELLGSTPE IDNMVKTYISFCSIGLIFLMISTGLNAAVRNLGSPRYAFFSMVMGALCNVILDWLFIFVF DFGIAGAAAATSLGQILSFFLLYCYLRKREIRFSFWPKRFQKQMIEKIFSIGFSSFIMEF AHAVMLVLFNKQFVKYGGEISVAAFCIVASTFYLFRMVFTGLSQGLQPILSYFYGKKDYT FVQEAYQKAKILSIIIGMIGFFICVIWKRSIMGMYHSDPDFVSLSANGLFLYVTSMIFVA FNFIVIAYYQSIGDGRRAIFFSFIRSAIFLIPYLYILPIFIGVKGIWLTLTCAEISTTIL MLFFEKKNKIQLDRKLLL >gi|224461484|gb|ACDD01000018.1| GENE 39 33638 - 35371 1947 577 aa, chain - ## HITS:1 COG:no KEGG:CLI_1231 NR:ns ## KEGG: CLI_1231 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_F # Pathway: not_defined # 1 574 1 577 579 679 57.0 0 MKQEATINRDSIVLDFSTKFCNSHEALLESDGFRRILTAYLAKLETRDTPVYEYLLEAVG TKEEIPDRVTKLFKLLIILDLEEVHILDQKYSVLLKDKSIFIEFLEGLYNFWRKFERYAI ISNNTRGEGLQNVNFIEALNNFKNLVITTYRLIEEALMGYKNRVYRQLNAGVNAGVILNG AKRNCPAEYSILEKILFIDTVILQPPFITYPKRNTRKGIFQEVFENPIKDMVINRDNWFC YPAKVGESLVFIYFNRYFMSQGLSLCNLFELAKEEEYVNKAPDIIYVYGVKDYETEMKTV FYRDKENNRMVGYANYCEDIDYFGYMKKMILTLHNIRMIEQGHLPIHGAMVNITMKSGAV HNVVIMGDSGAGKSESLEAFRSLSEEYVKDMKTIFDDMGTLKLASKAPLAYGTEIGAFVR LDDLDTGYAYKEIDRSIFMNPDKINSRIIIPISSYEEIMRGYPIDMFLYANNYEAEGDLI EFFKKKEEAIPVFKAGRRKAKGTTSETGIVDSYFANPFGPVQTQEKTDILIENFFEDMFK 
KGVKVGQIRTKLGIAGEEHSGPKAAATVLFEMLKPLK >gi|224461484|gb|ACDD01000018.1| GENE 40 35379 - 35714 518 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451977|ref|ZP_05617276.1| ## NR: gi|257451977|ref|ZP_05617276.1| hypothetical protein F3_02851 [Fusobacterium sp. 3_1_5R] # 1 111 1 111 111 183 100.0 3e-45 MKPRVTWILFLVIFIFSSCMSRWAFISETDYTKREEQIVKIYEKLSKKYDRLLEDPIEEK ERKALEEKFQTFYVNLNELTVKNDPKHLQFLQEYRNQVRIKLNYLQDLKED >gi|224461484|gb|ACDD01000018.1| GENE 41 35767 - 35847 65 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVTKIIEKAFTLSIEVSAFFKTTCRI >gi|224461484|gb|ACDD01000018.1| GENE 42 35832 - 37595 248 587 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 347 568 131 351 398 100 30 2e-20 MKSREYSTKELIARFLPYFSKYKHILLFDLTCAAFTTLCDLALPLILRFITQTGMKDLSL LSIQLILQLGALYIVLRLVDTSANYFMANVGHVMGAKIETDMRRDVFNHLQGLSYSFYNE NKSGQILTRVTTDLFDVTEFAHHCPEEFFIAGIKILISFIILININVPLTLLLFAMIPLM ILSVYKFNQKMRNAQKDQRNHIGEINSGIENNILGAKVVKSFANEEIEKEKFEVQNQQFL GIKKVFYKYMASFHAVSRLYDGLMYVTVIILGGIFMLQGKLSPADLFLYALYISTLLATV KRIVEFMEQFQKGMTGIERFLELMDTETDVEDSENAKSVNNVKGDISFEEVGFRYQSTGE SVLEHLNFSIEAGKNIAIVGPSGVGKTTICNLIPRFYDVTEGAIYLDGTNIKELKVQDLR QNIGIVQQDVYLFSGTVFENIEYGKPGASLKEIERAAKLAGAYEFIDALPEKFNTYIGER GTKLSGGQKQRISIARVFLKNPPILILDEATSALDNQSEKIVQTSLELLSKGRTTITIAH RLSTIMNADEILVLTEQGIVEKGKHQELLDRHGFYHELYYGNQWSQK >gi|224461484|gb|ACDD01000018.1| GENE 43 37624 - 37995 194 123 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 1 119 1 123 126 79 36 4e-14 MKFQFIHENFNVMDLEKSLKFYEEALGLKEGRRKEAADGSYILVYLRDGITDFELELTWL KDRSENYDLGDEEFHLAFRVDDYEAAYKKHKEMGCIAYENPSMGIYFITDPDGYWLEIVP TRK >gi|224461484|gb|ACDD01000018.1| GENE 44 38040 - 39593 510 517 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 441 1 408 458 201 31 9e-51 
MKKYDVIVIGTGAGNILTDAALDSGLKVAQIEKDKFGGTCLTKGCIPTKVMVTAADMIRN NEEVHKIGVESQPMKVNWKVLSRRVWQKIDESKEIVEEYKQEKNLDVYEGRAFFVRDKVL QVEYNQGGFSEEITADIIVLAAGARSRRIKLQGMETTSYLTSEDIFGASWPQKPYKSLII VGAGAIGTEFAHAFSSFGTKVTVVQFEDRLLPKMDKDISKYLGERFADLGIKVHYNQISK KITQKDGEKVLQIEDKITGEIKELKAEEILVAAGVVPNTDLLDLSNTSIQMNTQGWIRTN EFLETSVEGVYAIGDINGHGQLRHKANYEADILVHNLFPEALPPGQVAEGMKPERRFARF EYIPSVTYTYPQVSSIGLSEEEARKQASEKGWDIRVGYHHYSSTAKGYAMGFEPGDKEDG FIKVIIDAKSKYILGVHIIGAEAGILLQPYASLLGSGRIEHLVYEQEIASEETKKARATD YSRYLDPHKVTSITEAMTAHPALTELVMWTQYFVPMK >gi|224461484|gb|ACDD01000018.1| GENE 45 39716 - 39985 483 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257451981|ref|ZP_05617280.1| ## NR: gi|257451981|ref|ZP_05617280.1| hypothetical protein F3_02871 [Fusobacterium sp. 3_1_5R] # 1 89 1 89 89 161 100.0 1e-38 MKKLYMFVTSWCPHCKNAKNWIQELKAENPKYETISLEIIDEEKEVNKVNELNFGYYYVP TFFLEDEKLHEGVPSKEIVKSVLDKALQS >gi|224461484|gb|ACDD01000018.1| GENE 46 40022 - 41068 1511 348 aa, chain - ## HITS:1 COG:BMEI1196 KEGG:ns NR:ns ## COG: BMEI1196 COG1024 # Protein_GI_number: 17987479 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Brucella melitensis # 5 347 11 339 349 182 32.0 1e-45 MEINILYQVVGNVGQIILNRPKKLNALDRASVRELREILEKFAKDSEVCFVILRSNIEKA FCAGGDLLSDKKILEEEGLEAMVDELRAEYALASQITHFEKPILVYLNGITMGGGAGISV GADIRIVTETTQWAMPEMRIGLFPDVALSYYFARMQAGLGEYLAITSNSIQAEDCLWAGV ADYKIQSGDYTSLEKELLAMDWYGIEKEEILKKIQKKVEQYSSPKCIGNLEKRSKELKQH FTKSSFKEVFQSLEQEKETSDFARSILDSLNKNSTLSMAITWELLKRAKKLSLEECYQLD LVLIRSYFQGKDIFEGIRAILIEKTKDPKWEYKNIDEIPTEVVLSYFN >gi|224461484|gb|ACDD01000018.1| GENE 47 41089 - 42240 1283 383 aa, chain - ## HITS:1 COG:SA1044 KEGG:ns NR:ns ## COG: SA1044 COG0044 # Protein_GI_number: 15926784 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Staphylococcus aureus N315 # 1 155 1 157 424 62 34.0 2e-09 MILIKNIDVYSPAFLGKRDVLISGNQIEKIAEKIECGNIEVEVFDGSGKKLVPGFIDQHV 
HLIGGGGEGGFHTRTPETPFSKLIEGGITTVVGVLGTDSTTRSIENLLAKVKALKNEGIT AYMTTGAYSVPSPTLTGSVEKDITFIEEIVGVKIAISDHRASYVDTPILEKIASQVRRAG MFSGKHGMVVMHMGDGREILNSVWNLLQHSEIPIHHFLPTHVNRKKEVWEDSLEFLKQGA YIDLTSSFEEDDFLSASQGIDFLKKNGYDLSRVTISSDGYGSAPVFDEGGRLVKITYSPV NTNYQEIKKLVQKYHFSLEEALTFTTKNPAMEFGWYPKKGSIQEKSDADFLILDENLSIF GVFALGEICMWEYEIRKKGTYEE >gi|224461484|gb|ACDD01000018.1| GENE 48 42249 - 43022 258 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e [Oribacterium sinus F0268] # 1 257 1 253 254 103 29 1e-21 MQTKEDKFLEGQILDKILQCQEDYIFTNTNFLDLHQQSVAQAILHREKKRQRIKAVFWGG YEGAMRRILFFLPEYLEEHSYETFEDVLGVLEVTKLDKNLSLNHRDYLGAFTGLGLKRET MGDILVRENGADLIVLKEMIPILLEEYCSVGKSSVQVQEKSLKKLIFVEENQKKERGTVA SLRLDNVLCEIFNLSRTQAQEWIQKGSVYVNYVEKYKNESGIEAGDIIVLRGMGKAKIEE TGSLTKKNKVPIYYIKY >gi|224461484|gb|ACDD01000018.1| GENE 49 43145 - 43777 613 210 aa, chain - ## HITS:1 COG:FN1762 KEGG:ns NR:ns ## COG: FN1762 COG3022 # Protein_GI_number: 19705081 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 33 210 65 245 248 96 37.0 3e-20 MILLSPSKEMSKDGIPSHKIPTFQKEAEELLPEIQEKDKYEAWSLYHGLAFRYLKKGEFS EKDLVFMEKNLCIFSALYGLLSAKDGISEYRLDFSKKGLYAYWGDKIYQELLKRCSSSEE WIINLASDEFSKTILKYLPKENKFLQIDFLEEKQGELKKHSTVSKKGRGAMARYLILSHD TSIERIKKFKEENFKFREDLSTEKHFVFVR >gi|224461484|gb|ACDD01000018.1| GENE 50 43790 - 44944 1667 384 aa, chain - ## HITS:1 COG:FN1133 KEGG:ns NR:ns ## COG: FN1133 COG1820 # Protein_GI_number: 19704468 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Fusobacterium nucleatum # 4 384 2 386 386 482 65.0 1e-136 MKGEAMILKNAKMVLFNRMFQGDLRIEGSSITNIEENLIPNTKEEVFDLQGKLLIPGFID VHIHGADGADAMDGSVESLQKISKYLATRGTTNFLATTLTSSKEILKKVLACIGEVQNQE MDGANIFGAHMEGPYFDVQYKGAQNEKYIKMAGMEEIKEYLSVKKGLVKLFAMSPNANNL DVIRYLVKEGVIVSVGHSASSFEEVMAAVEAGLSHATHTFNGMKGFTHRDPGVVGAVLNS 
DEITAEVIFDKIHVHPDAVRVLIKTKGVERVVCITDSMSATGLPCGRYKLGELDVDVVDN QARLSSNGALAGSVLTMDKAFRHLLELGYSLIDAVKLTSTNVAKEFNLNTGMIRAGKDAD LVVLDEKNEVAMTVVKGKIKYTNL >gi|224461484|gb|ACDD01000018.1| GENE 51 45153 - 46511 829 452 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 1 450 3 446 456 323 38 9e-88 MEQAVVQVNEILWGSILIFLLMGTGIFYTFKLRFIQIRKFGQGIRRVTSGFSFHGKDADH NGMSSFQALATAVAAQVGTGNLAGAATAIASGGAGAIFWMWLSAFFGMATIYAEATLGQI YKTKVNGAITGGPAYYIQAIFKHSFFSRLLAYFFSISCILALGFMGNAVQANSIASAFEI AFHINPMIVGIVVAILSGLIFFGGTKRIASVTEKVVPLMAGMYIIICIVILILNYQNFFP ALQSIFVEAFTGRAAMGGALGITVQKAMRYGVARGLFSNEAGMGSTPHAHAIAKVNTPVE QGDVAIITVFIDTFVVLTATAMVILTSGLAFKGKTGIELTQAAFEMRLGQFGTVFIAIAL FFFALSTIIGWYFFGEANIKYLFHEKGNSVAIYRILVMCMIVFGSMQKVGLVWELADMLN GFMVLPNLIALLLMSSLVKATSDRYEKRELQK Prediction of potential genes in microbial genomes Time: Fri May 20 01:52:25 2011 Seq name: gi|224461483|gb|ACDD01000019.1| Fusobacterium sp. 3_1_5R cont1.19, whole genome shotgun sequence Length of sequence - 44073 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 13, operones - 8 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 12 - 2516 3030 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 2 1 Op 2 . - CDS 2517 - 3203 877 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 1 Op 3 1/0.000 - CDS 3213 - 3959 996 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 4 1 Op 4 . - CDS 3949 - 4737 862 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 5 1 Op 5 . - CDS 4774 - 7047 2690 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 7080 - 7139 6.7 - Term 7106 - 7140 1.9 6 2 Op 1 . 
- CDS 7145 - 7807 712 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 7 2 Op 2 1/0.000 - CDS 7832 - 9268 1889 ## COG2195 Di- and tripeptidases 8 2 Op 3 . - CDS 9293 - 10804 2133 ## COG1288 Predicted membrane protein 9 2 Op 4 . - CDS 10883 - 11305 486 ## FN0788 hypothetical protein - Prom 11401 - 11460 15.1 + Prom 11336 - 11395 8.9 10 3 Tu 1 . + CDS 11439 - 12434 758 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 12577 - 12612 2.8 - Term 12314 - 12361 2.5 11 4 Op 1 7/0.000 - CDS 12385 - 12909 608 ## COG2059 Chromate transport protein ChrA 12 4 Op 2 1/0.000 - CDS 12906 - 13445 539 ## COG2059 Chromate transport protein ChrA 13 4 Op 3 . - CDS 13461 - 14633 1756 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 14 4 Op 4 . - CDS 14644 - 15372 751 ## FN0710 hypothetical protein 15 4 Op 5 . - CDS 15391 - 16836 1337 ## COG1002 Type II restriction enzyme, methylase subunits 16 4 Op 6 1/0.000 - CDS 16836 - 18305 545 ## PROTEIN SUPPORTED gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 17 4 Op 7 1/0.000 - CDS 18278 - 18964 883 ## COG1354 Uncharacterized conserved protein 18 4 Op 8 1/0.000 - CDS 18933 - 19898 440 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 19 4 Op 9 1/0.000 - CDS 19909 - 20799 977 ## COG1481 Uncharacterized protein conserved in bacteria 20 4 Op 10 . - CDS 20812 - 23460 2887 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains - Prom 23506 - 23565 10.8 + Prom 23462 - 23521 9.5 21 5 Op 1 1/0.000 + CDS 23645 - 24100 644 ## COG0629 Single-stranded DNA-binding protein 22 5 Op 2 . + CDS 24124 - 24672 779 ## COG2096 Uncharacterized conserved protein + Term 24682 - 24736 15.2 - Term 24673 - 24721 14.6 23 6 Op 1 1/0.000 - CDS 24728 - 26302 2306 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 24 6 Op 2 . 
- CDS 26317 - 27114 1025 ## COG3568 Metal-dependent hydrolase 25 6 Op 3 . - CDS 27086 - 28627 1818 ## COG1640 4-alpha-glucanotransferase - Prom 28740 - 28799 19.1 + Prom 28615 - 28674 11.3 26 7 Tu 1 . + CDS 28780 - 29493 580 ## COG2188 Transcriptional regulators + Term 29496 - 29533 5.1 - Term 29473 - 29533 9.5 27 8 Op 1 . - CDS 29549 - 29977 640 ## gi|257452014|ref|ZP_05617313.1| Heat shock protein 28 8 Op 2 1/0.000 - CDS 29977 - 30786 1007 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 29 8 Op 3 36/0.000 - CDS 30875 - 31651 926 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 30 8 Op 4 30/0.000 - CDS 31641 - 32486 722 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 31 8 Op 5 . - CDS 32488 - 33627 1443 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Prom 33667 - 33726 13.7 + Prom 33626 - 33685 9.5 32 9 Tu 1 . + CDS 33746 - 34768 1359 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases + Prom 34787 - 34846 7.6 33 10 Op 1 25/0.000 + CDS 34883 - 35146 450 ## COG1925 Phosphotransferase system, HPr-related proteins + Term 35163 - 35199 5.2 34 10 Op 2 . + CDS 35214 - 36944 2128 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 35 10 Op 3 . + CDS 37011 - 38333 1599 ## COG1253 Hemolysins and related proteins containing CBS domains - Term 38299 - 38353 12.1 36 11 Tu 1 . - CDS 38366 - 41680 3786 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 41882 - 41941 14.0 + Prom 41804 - 41863 11.8 37 12 Tu 1 . + CDS 41950 - 43170 422 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases - Term 43048 - 43090 -0.6 38 13 Op 1 3/0.000 - CDS 43184 - 43615 297 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis - Prom 43665 - 43724 2.7 39 13 Op 2 . 
- CDS 43732 - 43989 388 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis - Prom 44012 - 44071 8.9 Predicted protein(s) >gi|224461483|gb|ACDD01000019.1| GENE 1 12 - 2516 3030 834 aa, chain - ## HITS:1 COG:FN0867_1 KEGG:ns NR:ns ## COG: FN0867_1 COG1022 # Protein_GI_number: 19704202 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Fusobacterium nucleatum # 6 609 5 606 606 650 52.0 0 MEGTVFLYDRQKTAIIYKEKEYSYKEMIEGIKYYATLLEIEAKDRVMVCLENRPESMMTL FSIWENKGISVNVDAGSTEEQLTYFIQDAEPKYIYASNKNIKNITNAVEESGLATKIINV DEVKIPEHFPVEEYSVKIEDETQTAVMLYTSGTTGNPKGVMLSYENIMENIRGVKAVDLV TETDRLLAVLPYHHIMPLSFTLVMPLHFGVLTVLLDDLSSEGLKKALKKYKISVIIGVPR LWEVIGKSILRQIQAKTLTKKVFEFAQKHVRSISLRKKIFKKVHTELGGNIRIMVSGGAK LDSEIGELFETLGFHMIQGYGLTETAPIVSFNVPGRERQDSVGEIIPKVEVKFLEDGEIL VRGKNVMQGYYKKPEATKMVIDEEGWFHTGDLGRMEGKHLLVIGRKKDMIVLANGKNINP SDIESELFKLTDFVQDIAVIEYEKKLVAIVYPNFDLMKARGIHNVNETLKWDIIDKYNVN APSYRKIHDIKVVKEELPKTKMGKIRRFLLPDLLKKQEQQGNSTQEVKREVEIAAEYLEE YKILQEYLESTKGEKVYPDSHLEIDLSLDSLDMVELIAFLESNFGVKLSEEEFVDLKTPL AVVKAIHSRDKETISKDSSFKKILEECDDVTLPVSSWVGKFVHILISILLGLLFKIKIEN KEKLAIEGAAVYIANHQSFLDVLLINKALSMKQIGELFYIATIIHFRGTFKQYLCNHGNV VLVDVNRNLRNTLKAAAKILKSNKKLMIFPEGARTRDGELQEFKKTYAMLAKELNLPIIP MVVKGAFEAMPFGQKPKWGSQMSLKALDPIYPEGKTIEEIIEESKRVIAEELKK >gi|224461483|gb|ACDD01000019.1| GENE 2 2517 - 3203 877 228 aa, chain - ## HITS:1 COG:PA2809 KEGG:ns NR:ns ## COG: PA2809 COG0745 # Protein_GI_number: 15598005 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Pseudomonas aeruginosa # 1 225 1 224 226 114 29.0 2e-25 MNILLIQRRQEFAKELKIAWKEKEHIVDIAGNYESGLQFFYAGHYDIVLLDTWIKGGDAY LLAEKIRERSRKIGLIFLSEEHSFFFKKRAYEVGADTYLSLPISVEEVSLQVFALGKRVK AEAEYRKYRYLYGEIEVDALQRKVYRKGEDLNFTEKEFLLLTVLLKNQGLALHKDMIRKE VWGEDFVGASNILESYIKKIRKKLQDTEHKWIKTIRGYGYGIEERKGK >gi|224461483|gb|ACDD01000019.1| GENE 3 3213 
- 3959 996 248 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 15 247 30 261 277 165 36.0 6e-41 MTDREILNFIEGKKFSKQLWSPIGRAMHKYHMIEEGDKIAVGISGGKDSLTTLNALIRIQ KIAQVSFEIIPIHIHPNTDKASYQKMKEYCEKLGLELVVETTNLEEILFNEENPMKNPCF LCGRIRRGILYRMLQERKINKLALGHHKDDIIETFLMNVFYQGNLHMMKPSYYAEEYGVQ VIRPLAFVEEKNIIRYVNRLELPVTKSDCPYETSEQSRRLKMKNLIHEMTKDNPNVRSTI FSSIEDLL >gi|224461483|gb|ACDD01000019.1| GENE 4 3949 - 4737 862 262 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 6 259 21 274 277 404 77.0 1e-112 MKNILEIEESIRAGYRKKIWKKFVKAVQDFELIEDGDRIAVGVSGGKDSLLLCKLFQELK KDRSKNFDLQFISMNPGFEAMDIDKFEQNLKDLEIDCTIFDANVWQVAFDQDPESPCFLC AKMRRGVLYKKVEELGCNKLALGHHFDDVIETTMINLFYASTVKTMLPKVSSTSGKLQII RPLVYVKEQDIKSFMKSNEIEAMSCGCPVESDKTDSKRKEIKILLEELEQKNPNIKQSIF SAMKNINLDYILGYTRGNKNDR >gi|224461483|gb|ACDD01000019.1| GENE 5 4774 - 7047 2690 757 aa, chain - ## HITS:1 COG:MA3787 KEGG:ns NR:ns ## COG: MA3787 COG0493 # Protein_GI_number: 20092583 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 299 754 13 467 469 496 58.0 1e-140 MYNIIEKKNLSKNIYLMKIKAEALVEAAKPGQFLIVKIDEKGERIPLTICDYDKEEGSVT IVFQVLGESTKEMAKMEVGDFFADVLGPLGKESDLLHEEKKVLQKKKYLFVAGGVGTAPV YPQVKWMKQQDCFVDVIIGSKNKESLIFEEEMRKVATNVYVCTDDGSYGSKGLVTDKIQE LVELGKKYDHAIIIGPMIMMKFAVEVCKQYGISTTVSLNPLMVDGTGMCGACRVSIGKEV KFACVDGPEFHGEEVNFDEALRRQRMYRTEEGRNILKLEDGENHHNPSCPNHEVVFVDRK 
KRIPVREQKPEERNQNFEEVCYGYSLEEAKLEASRCLQCKNPLCVQACPVSIDIPTFIRE IKEDNLQAAADTIAKYSSLPAICGRVCPQESQCEGKCIVGIRGEAVSIGKLERFVGDWAI ENKTSFCIPEKKQQKVAIVGGGPAGLTAAGDLAKKGYEVTIFEALHKLGGVLSYGIPEFR LPKEKIVEKEIENLLQLGVKVETNSLIGRTFTVDELLDKKGFSAVFIASGAGLPRFMNIA GENLNGVISANEFLTRVNLMKAHQSTYATPVKIGKRVLVIGGGNVAMDAARTAKRLGAET KIVYRRSEKELPARLEEIQHAKEEGISFLFLSSPIEILGDENAWVKGVKCIRMKLGEMDE SGRAAFSQVPNSEFIIEAETIIMALGTSPNPLILETTKDLQQNRWKGIATTSEFGETSRI GIFAGGDAVSGAATVILAMEAGKKAARKIDEYLQSML >gi|224461483|gb|ACDD01000019.1| GENE 6 7145 - 7807 712 220 aa, chain - ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 1 210 1 211 217 116 33.0 4e-26 MIKKEDRIYFEKLFPFWEKLTHQEKNYFIINSRTMTFTKGVDISSSPECFGLTIIKNGKI RVFLTSKEGKELSLFFLETMDIGVLTAQCIYPKLQVSINLHTEEVTEVIVMNPEAFSLMR KRNSEVSDFNMDLIYTRFSEIIEQMETALFVPLSVRLIRYLLKQEKKELIITQEEIARHL GSAREVITRNLKLLQNAGCLQVSRGKIQILSEEKLKTMLD >gi|224461483|gb|ACDD01000019.1| GENE 7 7832 - 9268 1889 478 aa, chain - ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 3 478 4 486 486 491 54.0 1e-138 MRKLEGLKPERVFYYFEEISKIPRESYHEKEISDYLVQFGKDHNLEVYQDESLNVVLRKK ASSGYENAPGVILQGHMDMVCEKEEDSKHDFSKDPIDLLIEGNHITANKTTLGADNGIAV AMGLAVLEDENLLHGPLELLVTTSEEVDLGGALALKSGILQGKMLINIDSEEEGILTVGS AGGEGVEITLPIEKINIRHPFAYRIKIQNFLGGHSGAEIHKQRGNANKAMVEVLDLLKEK VDFLLVSVKGGSKDNAIPRAAEVIIATEEKLDMTLREVLKEVKELYISFEPQVEMFFEEI INVYEAIDENSFYQYVNLMEEIPTGVYTWMKDYPEIVEASDNLAIVKTEEESIKITISLR SSEPDILARLKKCISEIAEKYKAKYEFSAGYPEWRYRSDSPLREKAIQIWKELTGEEMKV AIIHAGLECGALSQNYPDIDFISIGPNMQDVHTPEEKLEIASTEKAYQYLVKLLQELK >gi|224461483|gb|ACDD01000019.1| GENE 8 9293 - 10804 2133 503 aa, chain - ## HITS:1 COG:FN2106 KEGG:ns NR:ns ## 
COG: FN2106 COG1288 # Protein_GI_number: 19705396 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 2 503 17 518 518 773 84.0 0 MSEKKKRSFPTAFTVLFIILILAAGLTYLVPSGKFSRLTYDDITNEFVITDHNDEVSTEA ATQEVLDRLHIQLALDKFTEGIIRKPIAIPGSYQRIEQHPQGFLDVVRAPITGTMDTVDI MIFVLILGGIIGIVNKIGAFDAGMSALSKKTKGKEFLLVVLVFALTTLGGTTFGLAEETI AFYPILMPIFLVSGFDALTCIAAIYMGSAIGTMFSTVNPFSVVIASNAAGISFTEGLIFR IVTLILASIVTLGYMYWYAKRVNKDKTKSFVYSDEATIQERFLGNYDASAETPFTWRRKL CLIIFALAFPILIWGVALGGWWFEEMSALFLVVAIVIMFLSGLSEKEAVNTFVSGSSELV GVVLTIGLARSINIVMDNGFISDTLLYYSTEFVAGMSQGVFAVAQLVIFSFLGFFIPSSS GLAVLSMPIMAPLADTVGLSREIVINAYNWGQGWMSFITPTGLILVTLEMAGTTFDKWLK YILPLMGIMGIFSVVMLIINTML >gi|224461483|gb|ACDD01000019.1| GENE 9 10883 - 11305 486 140 aa, chain - ## HITS:1 COG:no KEGG:FN0788 NR:ns ## KEGG: FN0788 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 138 1 137 139 139 49.0 2e-32 MINYNRFIEEFTQGKCHSFEDFQRIAKQFGLFFEKINGEMILGYQGRGEVDQVCYEFYRY FFPETKLQAKNFNLISKIHELHFQFVLEQVNEVYQKYNLPPRYDRTLSIRENAVLLLNTL KIKTAIRKEDLDFIQYILRY >gi|224461483|gb|ACDD01000019.1| GENE 10 11439 - 12434 758 331 aa, chain + ## HITS:1 COG:BMEII0621 KEGG:ns NR:ns ## COG: BMEII0621 COG3839 # Protein_GI_number: 17988966 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Brucella melitensis # 1 289 1 318 351 302 48.0 8e-82 MNILTIKNLGKQYQKKEWALRHINLEITEGEFLILVGPSGCGKSTLLRLIAGLEEVTEGE ILFHSNKKDIAMVFQNYTLYPHMTVYENLAFPLKVKSWSSEKIKNKILEIAKILEIETLL QRKPNELSGGQKQRVALGRAMVRDANIFLFDEALANLDTNLRSQMRYELLSLQKKINKTF IYVTHDQTEAMTMGDRIVVMKEGHIEQIGTPKEIYLDPKTTFVASFLGNPSMNFLKSENY LVGIRSEDIKIIEKETEDSYLFLSEFTEFLGSRSYLHGQVRDTPFIIEIPTTKEYKKGDR LFLDFPLSKRYYFNILTGQRIPLFQIKESKV >gi|224461483|gb|ACDD01000019.1| GENE 11 12385 - 12909 608 174 aa, chain - ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and 
metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 170 1 169 176 135 51.0 4e-32 MIYIHLFLVFLKIGLFSFGGGYAVLSLIQQEVIEKYQWVSLSEFTEIVAVSQVTPGPIGI NSATFIGYKVTGNAFGSLCSTTGVVLPSIIILVLISLFLQKFKDSLTVKRIFLSLRPVVL GLVLGAGVSLLHPENFGHPATYVVFAMVVLAGIFTKINPILLILLSGTVGFFVL >gi|224461483|gb|ACDD01000019.1| GENE 12 12906 - 13445 539 179 aa, chain - ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 6 179 8 181 186 187 56.0 7e-48 MKKEAELFWSFFKIGAFTLGGGYAMIPLMQDEIVTKKKWLTDEEFLDALAIAQSSPGVLA VNTSIMTGYRISGRLGIAAAVLGAVLPSFLIILCLSTVIIQYREAKLFQQVFFGVKPATV ALIFIAVYKLCKSTKLNWTQYWIPLLVAVLVGMNFMSPVWIIICTMIIGNLYYAWRDKK >gi|224461483|gb|ACDD01000019.1| GENE 13 13461 - 14633 1756 390 aa, chain - ## HITS:1 COG:FN0711 KEGG:ns NR:ns ## COG: FN0711 COG0452 # Protein_GI_number: 19704046 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Fusobacterium nucleatum # 1 385 1 390 404 458 62.0 1e-129 MKKILLGVTGGIAAYKAANFTSLLKKRGYEVKIIMTENATKIITPLTLETLSKNPVCVDM WHEKAHYEVEHISLAHWADVVVILPATYNIVGKIANGIADDMLSTVIAATNKPVFFALAM NVQMYENPILYENIEKLKKYHYHFIEAAEGMLACQDVGKGKLEKEEDVIWEIESYFLAQT LEGKLKNKKVLITGGPTEEAIDPIRYLSNRSSGKMAYALAKAAVAAGAKVSLISGPTHLE KPRRLQEFVSIRGAREMYQEVESRFETCDIFVSCSAVADYRPKEYSPIKIKKKEGDLRID LERNPDILLEMGKRKSHQILVGFAAETNDIEENAQRKLEKKNLDYIVANDSKTMNQEVNT VSIIKKGGSKLEIQEKAKEELAYDIWKNIL >gi|224461483|gb|ACDD01000019.1| GENE 14 14644 - 15372 751 242 aa, chain - ## HITS:1 COG:no KEGG:FN0710 NR:ns ## KEGG: FN0710 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 24 240 6 224 225 123 39.0 6e-27 MVIYFIIKKTFEEGGAMSLLSSLFGSKEVEKENLEKIERLEQKILGKEKEIERLILELET VNNSTKIKPRQLEIFEKNVKDSREEINRLNSVLKSFGIPSKKQYYKYKVELSKLYSASRF REVLDFLLAKGYIFISDVPFSSLQDEIIILKNGEEALKRHSDYLQDNYDWEIATYRNKGE 
KLIKIFGRGKKLTQFFSEYYLEYMDDLDRIDLNILAQYGCDAELIQEVKEKREQYYLEQR EQ >gi|224461483|gb|ACDD01000019.1| GENE 15 15391 - 16836 1337 481 aa, chain - ## HITS:1 COG:CAC2309 KEGG:ns NR:ns ## COG: CAC2309 COG1002 # Protein_GI_number: 15895576 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Clostridium acetobutylicum # 1 449 52 553 581 130 28.0 4e-30 MGKKHYVIYTPIQESKQIAKLAFTYAPPKVKWKLADLSCGNGNLLVSFAGYMKEQNRMIP IQYYGYDIDEKAIIEARNRLLNENCYFSCEDSLFLGKEKKFDIILGNPPYLGEKNHKEIF DDLKKTEFGKKYYEGKMDYLYFFIEKAIDLLEEEGILVYLTTDYWLVADGAKTLRRILKK EGEFLYFQDYNTSLFEGALGQHNLLFIWRKGKKGSQVLVQEKEIKFYIEQEELYAENGNI YLWQPKIKKQLQAIKEKANYRLGDLLDIKQGIVSGCDKAFVLNHYEEELKEYLKPFYKNK DIFSYSLEKQEEFWILYLNEKREWNDILEKYLSPYREKLAARREVRLGKIAWWNLQWARE EKIFTQAKILGRQRCKGNWFAYSEEDVYGSADIYYFLPKYEDLDLFYILAYLNSSLFSFW YSHCGKKKGNLLEFYSKPLMEVPIYYPENLSERREVSNLAKLQIQKYSIERQQKIDNYFK F >gi|224461483|gb|ACDD01000019.1| GENE 16 16836 - 18305 545 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803542|ref|ZP_02197411.1| 30S ribosomal protein S20 [Vibrio campbellii AND4] # 1 427 2 439 520 214 30 7e-55 SQKMFKSSIGTMIITMISRVLGLFRGSLIAYYFGSSYLTDAYFSAFKISNFFRQLLGEGA LGNTFIPLYNQKCEQEGEEKGKAYIFSVLNLVFLFSFLISLGTVFLSNSIIDFIVVGFPE ETKSLAAILLKIMSFYFLFISLSGMMGSILNNFGEFLIPASTSIFFNLAIIVSAMFFSKT YGIYALAFGVLIGGIFQFLVVWYPLWKKIGKHSFHIDWKDKYLGLLGYRLIPMLVGVVAR QVNTIVDQFFASFLVVGGVTALENASRVYLLPVGVFGVSLSNVVFPSLSKAAAKKDYTKI QRELERGLNILLFLVVPSMVVCILYAKEVIRLLFSYGKFGEDAVTITAQALLFYSIGLYA YVGVQFLSKGFYALGDNKRPARYSIMAIVINIALNALLIQKLEYRGLALATSVASCCNFI ALVFTFHKKYISLAFLSCIKIAMLSIAASLFAYFISRALPYILLKFVAFAILYLLCWIPL FYKKRREIF >gi|224461483|gb|ACDD01000019.1| GENE 17 18278 - 18964 883 228 aa, chain - ## HITS:1 COG:FN0708 KEGG:ns NR:ns ## COG: FN0708 COG1354 # Protein_GI_number: 19704043 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 2 224 3 225 225 201 55.0 7e-52 MDVVLKLENFEGPLDLLLQLIEKKKVKIAEIQISQLIDEYLEIISQAKEENLELKADFLV 
VASELLEMKALSLLKLEKEKEKEEELRGRLEEYKIFKELGVQLSLFEKEYNISYSRGEGR KVIKKIKKEYDLIHFGSNDLYQIYKKYSEQLEKKEYLELALEKAYSLRDEMDKLYLHIYQ KNYSFAELFDFAENKTHLIYIFLAILELYKDGKIEIEEEGVRKCLKAQ >gi|224461483|gb|ACDD01000019.1| GENE 18 18933 - 19898 440 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 13 300 14 305 317 174 34 1e-42 MKVIKNILDIEENLQKSYVAIGNFDGLHTGHRTIIKRAMERAKEKDGVSIVFTFQNHPME LLRKDGRSVKYINTNEEKLFMLEKMGVDYVVLQPFTQDFADLTPLEFVRLLKNKLGVEEI FVGFNFSFGKGGVAKTKDLVYLGEGEGIYVHEFKAITSGEDVISSTLIRKSMMTGEFERA LKLLGHPMIVIGEVIHGKKIARKLGFPTANIQIKDRLYPPFGIYGAKLQIEGEDQIRYGV INVGVNPTLKPGEFSLEVHILNFDEDIYGKKMYIELMEYLRTEEKFDSVEELVACIANDV AVWTKRSEELKDGCCIKIGEF >gi|224461483|gb|ACDD01000019.1| GENE 19 19909 - 20799 977 296 aa, chain - ## HITS:1 COG:FN0706 KEGG:ns NR:ns ## COG: FN0706 COG1481 # Protein_GI_number: 19704041 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 295 1 295 299 328 63.0 7e-90 MSYSSQVKTEITSRESITNLEKLAELYGIFQSKDAIGRYEINLRVENSFLAKRVYSLLKE VTTLKIGIKYSICNKLGEHNVFSIEVFRQKGIKEFLSSLQFRYIDIVSHEEILKGYIKGM FLACGYMKDPKKEYAMDFFIDKKEIAEDFYRILLHNKKKVFITKKRNKTLVYLRNSEDIM DMLVLMGAMKQFFSYEETTMMKDLKNKTIREMNWEVANETKTLNTGNYQIKMIEYIEENM GLNNLTPVLLEAVQIRLEHPESSLQELADFIGISKSGIRNRFRRIEGIYEKLKEEA >gi|224461483|gb|ACDD01000019.1| GENE 20 20812 - 23460 2887 882 aa, chain - ## HITS:1 COG:FN0705_2 KEGG:ns NR:ns ## COG: FN0705_2 COG0749 # Protein_GI_number: 19704040 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Fusobacterium nucleatum # 382 882 1 501 501 574 62.0 1e-163 MKRALLLDVSAMMYRAYYANMNMRTKEMPTGAVYGFLLTLFQLLKEYEPEYMAAAFDIKR SHLKRTELYAEYKSNRDSAPEDLLKQIPYIEAVLDAFGIQRIKIEGYEADDVLGSLSTKL SKKGIQVTIVTGDKDISQLLDENIEIYLLGKEVLKTREDVKNYIDVYPEKIPDLFGLIGD SSDCIPGVRKIGPKKAVPMLDKYENLEGIYENIDKLIEIPGVGKTLIEIMKEDKELAFLS 
RTLAKIEKNLDFSFSLEDLYFEKKEEALREIFQKLEFKSFLKRLEQKEEKEQIKVAEVIP QKKKSEKSENRIVNSIEELKAEIKEFTEEEKIILLYDRLGLTCTSSNKSIYIPLFHIGLL ESNIDLEECQKLFFSLKGKLYTYDLKELLKLGFSFQKPVYDMMIAYHLVSSQTKEDYTSI GQYYLKTIAEDEKIVFAKQKIETLSIDSYGNFLLKRSQILYSCLDSLEKDLKEKELENVL WETELPLIPVLASMEKQGIKIDRKYFQKYSLELNEKLVLLEKEIWKEAGEEFNINSPKQL GEILFLKLNLPTGKKTKTGFSTDVEVLENLSSQGFQIATNLLEYRKLAKLKNTYVDPIPK MVKFGDRIHTCYHQIGTVTGRLSSSDPNLQNIPVKTEEGIRIRQGFIARDGWKLLSIDYS QVELRVLAALSKDENLVKAYQEKQDLHSVTAKKIFELEEKEEVSREQRTMAKIINFSIIY GKTPFGLAKELGISVKDAAEYIKRYFEQYPRVAQFETEIIEFAEQHGYVETYFGRKRIID GIHSKNRMVKNQAERMAVNTVVQGTAAEILKKVMIKIYSWLLGKTDIYLLLQVHDELIFE IQEERVEEYVKTLTNFMKNTIQLEEVELEVNTNIGNTWAEAK >gi|224461483|gb|ACDD01000019.1| GENE 21 23645 - 24100 644 151 aa, chain + ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 151 1 154 154 139 54.0 2e-33 MNIISLMGRLTRDPEVKFGQSGKAYCRFSIAVNRPFSKDEADFINCVSFGKTAELIGEYF RKGHQIALVGRLQMNQYESNGEKRTSYDVVVDSFDFISTKSSSDTRNYENSYDSRSYETK NTENRMSSTPKKDTFEDNLDSEAMLDDEFPF >gi|224461483|gb|ACDD01000019.1| GENE 22 24124 - 24672 779 182 aa, chain + ## HITS:1 COG:FN1303 KEGG:ns NR:ns ## COG: FN1303 COG2096 # Protein_GI_number: 19704638 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 182 1 183 192 197 57.0 1e-50 MAETKYVNLNKVYTKRGDKGKTDLFGGSQASKASLKVNAYGAIDELGAFLGLVRFYSKEE DIKSLMLELEKKLLIVGGFLASDEKGQAMMKVKIEEEDIRFLEEKIDFYNAKLPDLFAFI LPGDTEVSSYLHVARTVARRAERAMVALAETETLQENLLKYINRSSDLLFILARYDAEIL QK >gi|224461483|gb|ACDD01000019.1| GENE 23 24728 - 26302 2306 524 aa, chain - ## HITS:1 COG:BS_glvC_1 KEGG:ns NR:ns ## COG: BS_glvC_1 COG1263 # Protein_GI_number: 16077887 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus subtilis # 1 444 
1 450 452 538 58.0 1e-152 MLQKLQRFGGAMLMPSVLFAFAGLVVGLTSILKNPNLVGNIAEQGTLWYHFWVVVEEGGW TLFRQMPVVFALGIPIGLAKKANGRAALETFVIYMTFNYFINAFLTQFSFFGIDMSMDKI PGITMIAGVKTLDTSIIGSILIAGISVYLHNKYFDKKLPELLGIFQGTSFVIILGFLLMI PVAFGTAIIWPKVQLGIAALQGFLKGAGVAGVFSYTLLERLLIPTGLHHFIYGPFMFGPA VVENGITAYWATHIQEFAAAAEPLKEIFPQGGFALHGNSKVFGLPAAALAMYVTSKSSKK KIVAGLLIPAALTGFLTGITEPIEFTFLFAAPVLFVAHAILGACMSSLMYVFGVVGNFGS GLIDFLAINWLPMFSNHSAQVIVQIGIGLIFSVIYFFVFRFLILKLNLKTPGREEEEEET KLYSKKEYRERESQKSSQAKTTDEENYLEQAKMILEALGGKENIAEVTNCVTRLRVTVKD ETLIQADKDFKKAGAKGVVRNGKSFQVIIGFSVGQVRAAFDSLL >gi|224461483|gb|ACDD01000019.1| GENE 24 26317 - 27114 1025 265 aa, chain - ## HITS:1 COG:SPy1985 KEGG:ns NR:ns ## COG: SPy1985 COG3568 # Protein_GI_number: 15675775 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Streptococcus pyogenes M1 GAS # 2 261 3 269 272 115 30.0 1e-25 MKLLTINVHSWLEEKQEEKMELLAKVIAEKRYDVIAMQEVNQKIEARLLKGEIREDNFLY QLCKKIEKYTEEKYEYHWSHSHIGFDIYEEGIALLTRHSILEKEDFYCTNSKTVYSISSR KIVKIFLEIEGKEIEFYSCHMNLPNCIEENMEQNIQNILKHSSRNCLKILMGDFNTDAFH DQSSYQKILEQGLFDSYTLSKKKDDGVTVYKNISGWENSVEEKRLDYIFLTEQYEVESSY VIFNGKNYPCISDHNGLEVILKIKE >gi|224461483|gb|ACDD01000019.1| GENE 25 27086 - 28627 1818 513 aa, chain - ## HITS:1 COG:SPy1292 KEGG:ns NR:ns ## COG: SPy1292 COG1640 # Protein_GI_number: 15675245 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Streptococcus pyogenes M1 GAS # 1 497 1 497 497 607 57.0 1e-173 MIKRSSGVLMHISSLPGKFGIGTFGKEAYQFVDFLEETKQSYWQILPLTTTSYGDSPYQS FSAIAGNTSFIDFDILRDEELLLEEDYQNIFYGENLERVDYAAVYESRQIVLRKVVKKFQ ESKKWMSELEVFQKENKNWLDDFSEYMAIKGYFSNKALQDWEDMEIRRREKKSLEKYREM LKEEILYHRITQFLFFYQWKNLKKYANQKGIQIIGDMPIYVSSDSVEMWTMPELFKVDKE NRPLYVAGCPADEFSPDGQLWGNPIYDWKKHKEKKYSWWIHRIQESLKMYDVIRIDHFKG FSDYWQIDKNAIVAKEGTWEAGPGIELFKTIRKELGEVPIIAENLGFIDEKAQKLLEDCG FPGMNILQFAFEGGADNKDLPYHYIKNSVSYTGTHDNPVIYAWFEDQTEEVKRYVCQFLN IREGETIPQAMIRGIYSSVSILAIVTMQDLLEKGKEARMNTPSLMGGNWEWRMRAEELSF 
DKKGFLRHMTGLYGREREDKVEEEDEVTNNKCS >gi|224461483|gb|ACDD01000019.1| GENE 26 28780 - 29493 580 237 aa, chain + ## HITS:1 COG:BH0873 KEGG:ns NR:ns ## COG: BH0873 COG2188 # Protein_GI_number: 15613436 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 237 2 237 237 162 39.0 5e-40 MKKYIEVYQDIKKKIEKGELKTGEELVSETELCEQYSYSKDTIRKALSLLEMNGYIQKIK GKNSTVLGHGRMKNNFLGSIQTSEELNRDNKYSIKNKLISLEVIPATTKLIEIFSSNSKK KFYKIKRSRSIDGENLEFDIFYFDKNLVPNLTAEIVTKSTYEYLENTLKLKISHSRREIF FRHATEEEKKYMDLQNFNMVAVIKSITYLSNGSILQYGTTSYRPDKFSFISIAKRAK >gi|224461483|gb|ACDD01000019.1| GENE 27 29549 - 29977 640 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452014|ref|ZP_05617313.1| ## NR: gi|257452014|ref|ZP_05617313.1| Heat shock protein [Fusobacterium sp. 3_1_5R] # 1 142 1 142 142 257 100.0 1e-67 MRKKIMILMVLMFSVLSLQSMAAQPIKEVEKTIVLAYQDLVGKEYKMISPFGGNKITLGF DVQNRIYGYTGLNRFWGQAEIENGKVKVGEVFTTENKGVQEQRILQVKYLTILKDVESIH FEGENLVLTTPFQEKLVFQPIL >gi|224461483|gb|ACDD01000019.1| GENE 28 29977 - 30786 1007 269 aa, chain - ## HITS:1 COG:FN1800 KEGG:ns NR:ns ## COG: FN1800 COG0652 # Protein_GI_number: 19705105 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium nucleatum # 1 259 1 260 274 320 65.0 2e-87 MKKLVKLFLAMFTMLLLTSCANEMVKDTKKLFTDTSAKYNNIVATFVTTQGEIEFYLYPE AAPITVANFINLAKRGFYDETKVTRAVENFVVQAGDPTGTGTGGPGYTIPDEFVEWLDFY QYGMLAMANAGPNTGGSQFFFTLYPADWLNGLHTIFGEIKSEADFQKIRKLEVGDVIKEV KFTGDVDLILSLNKYQVEAWNERLDQVYPNLKKYPIVDPTPEQIKAYQAELDRIFTRDDK KNSAKFEYPIPKLIRAVGNMFQNKKEVVE >gi|224461483|gb|ACDD01000019.1| GENE 29 30875 - 31651 926 258 aa, chain - ## HITS:1 COG:FN1799 KEGG:ns NR:ns ## COG: FN1799 COG1177 # Protein_GI_number: 19705104 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Fusobacterium nucleatum # 2 257 8 263 264 339 76.0 
2e-93 MMNKRRTSLFFFCFTMLFFYLPLIILVVYSFNEGKSMVWKGFSLKWYRELFTYSENIWKA FRYSIGVAIFSGFLSTVIGTLGAIALKWYSFKSKKYLQLLTVLPLVVPDIIIGVSLLIMF ASIHWKLGLLTIFIAHTTFNIPYVLFIVMARLEEFDYSVVEAAYDLGATERQALQKVILP MLFPAIVSGFLMAVTLSFDDFVITFFVAGPGSSTLPLRIYSMIRLGVSPVINALSVILIA LSILLTISTKKLQKNFIG >gi|224461483|gb|ACDD01000019.1| GENE 30 31641 - 32486 722 281 aa, chain - ## HITS:1 COG:FN1798 KEGG:ns NR:ns ## COG: FN1798 COG1176 # Protein_GI_number: 19705103 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Fusobacterium nucleatum # 2 277 3 278 284 371 74.0 1e-103 MKDSKKKYFYAFPITLWLTLFFMIPMLIVLSYAFLKKGTYGGVEFSFSMAAFSIFQDKVF LTVLWKTIYISMWITALTVFFSIPVAYYIARSRYKQELLFFIIIPFWTNFLVRIYSWISL LGSNGFINSLLMKFHILEEPIKFLYNPAAVVVISVYTSLPFAILPLYAVVEKFDFSLLEA ARDLGATNCQAFFKVFIPNIKSGIVAATLFTLIPSLGSYAVPKLVGGTNATMLGNIIAQH LTITRNWPLASTISGSLIIITSIAVWLFSKIEKKGGKEYDE >gi|224461483|gb|ACDD01000019.1| GENE 31 32488 - 33627 1443 379 aa, chain - ## HITS:1 COG:FN1797 KEGG:ns NR:ns ## COG: FN1797 COG3842 # Protein_GI_number: 19705102 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 8 378 1 371 376 553 74.0 1e-157 MVEVKQRLEKNDIRIEHIRKSFDGVEILKDINLTINQGEFFSILGPSGCGKTTLLRMIAG FISADEGAIYLGNENLLDLPPNLRNVNTIFQKYSLFPHLTVYENVAFPLRLKKVEEKIIE EEVKKYISLVGLEEHMQKKPSQLSGGQQQRVAIARALINKPGVLLLDEPLSALDAKLRQN LLLELDLIHEEVGITFIFITHDQQEALSISDRIAVMNKGEVLQIGTPAEVYESPANMFVA DFLGDNNFLEGEVIEILENNFAKIQTKDLGELIIEQDKKVEIGNHVKVSIRPEKIKVTKT KPKEIRSTINTLPVYVNELIYTGFQSKYFVHLCSKEEYTFKVFKQHAVYFDDNDEGAIWW DEDAFISWDADDGFLIEVV >gi|224461483|gb|ACDD01000019.1| GENE 32 33746 - 34768 1359 340 aa, chain + ## HITS:1 COG:AF0807 KEGG:ns NR:ns ## COG: AF0807 COG1304 # Protein_GI_number: 11498413 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: 
Archaeoglobus fulgidus # 37 336 82 364 366 129 33.0 6e-30 MTLQEVYQEARGRMKGFCSICPECNGKMCAGKVPGMGGCGSGFSFQHNYTSLKNIHLKMR CLHKAKDPKTTLQLFGQNLSMPILGAPITGPKFNFGGYVNQEEFCDDIILGAKATGTLAM IGDTGDPTAYEAGIKSLKRANGFGIAIIKPRYNEEIIKRIRIAEEAGAIAVGIDLDGAGL LTMKLFNQPVEPKSMEDLKELVNSTNLPFIVKGILSVEDAKACVEAGVDAIVVSNHGGRV LDDCISPVEVLQDIVEAVGNQIIVLVDGNVRSGEDVLKYLALGARAVLIGRPCIWASVGN RQEGMETLFQSLQSQLYKAMLMTGNHSVQEISPNTIFKNA >gi|224461483|gb|ACDD01000019.1| GENE 33 34883 - 35146 450 87 aa, chain + ## HITS:1 COG:FN1794 KEGG:ns NR:ns ## COG: FN1794 COG1925 # Protein_GI_number: 19705099 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Fusobacterium nucleatum # 1 87 1 87 87 115 78.0 3e-26 MANKTVEITNETGLHTRPGNEFVSLAKTFSSQIEVENQDGKRVKGTSLLKLLSLGIKKGT KVTVYAEGEDAEQAVEQLANLLENLKD >gi|224461483|gb|ACDD01000019.1| GENE 34 35214 - 36944 2128 576 aa, chain + ## HITS:1 COG:FN1793 KEGG:ns NR:ns ## COG: FN1793 COG1080 # Protein_GI_number: 19705098 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Fusobacterium nucleatum # 2 574 7 579 579 755 68.0 0 MERKFIKGIDASPGIAIGKVFLYQESELTIIQESNRTVEEEKQRLIHGQEKTKEQLEAIK EKTLLTLGKDKADIFDGHITLLEDEDLLEEINDLLEEGNISAEFALKTQIEEYCKMLSNL EDPYLRERAADLQDIGKRWLYNVANVTIVDLSSLPANTIIVAKDLTPSDTAQVDLQNVLA FVTEIGGKTAHSSIMARSLELPAVVGTGNICSLAKNEEIIIVDALTGDIILNPSQEELEI YKSKQEHFLQEKEMLKELKNKAAISKDGVEVGVWCNIGSPKDVKGVLNNGGQGIGLYRTE FLFMNNDRFPTEEEQFEAYKEVAMALEGKPVTIRTMDIGGDKSLPYMELPKEENPFLGWR AIRVCLDRTEILETQFKALLRASAFGYIKIMLPMIMDITEIRRARKLLEKCKAELKEKGI AFDENIQLGIMVETPAVAFRAKYFAKEVDFFSIGTNDLTQYTLAVDRGNENISHLYNTYN PAVLQAIQASIEGAHEAGISISMCGEFAGDEKATALLFGMGLDAFSMSAISVPRIKQNIL NIDRASAKAFVDEVMNCATTEEVLAKVEEFYSKLKK >gi|224461483|gb|ACDD01000019.1| GENE 35 37011 - 38333 1599 440 aa, chain + ## HITS:1 COG:STM4407 KEGG:ns NR:ns ## COG: STM4407 COG1253 # Protein_GI_number: 16767653 # Func_class: R General function prediction 
only # Function: Hemolysins and related proteins containing CBS domains # Organism: Salmonella typhimurium LT2 # 3 423 1 422 447 407 50.0 1e-113 MNLLIKLFLIFTLVGIGAFFAISEIALASARKLKLHTLVEEGNKKAQKILDIQENSGNFF AAVQIGINAVSILGGSLGASIVGDFFQELEWLSPLKPFGNIFSFLIVTWLFIEFADLLPK RIAMVYPEKIALAIIHPMLFLIFFLKPFIKIINGFASIFFKLFGMEQVRNEDVTYDDIFA VVDAGAESGILQEKEQSLIENIFELDSRWVSSIMTTRDEISYLALDDTEEELREKIMDYP HTKFLITESDIDSILGYITSKDLLPSLMLSKKSIKELIKNYRKHLLILPNTLTLSETLDR FNEAKDDFAIILNEYGLVVGLVTMKDVVNTLMGDVVFQNSEDQQIIERDEHSWLIDGVTP IEDVKKVLDRIEKFPEEDSYESIAGFLMYMLKMIPKRGAKLEFMDYQFEIVDVDNFKIDQ ILVIDMLEEKKIEVENTESH >gi|224461483|gb|ACDD01000019.1| GENE 36 38366 - 41680 3786 1104 aa, chain - ## HITS:1 COG:FN0499 KEGG:ns NR:ns ## COG: FN0499 COG1629 # Protein_GI_number: 19703834 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 20 395 12 382 743 179 34.0 3e-44 MKKEMILTLLCFASLQAVAAVQEVELNPTKIRGGGATYDGSVLSNEKKNVVIITKADIEK KNYRDLESIFKDSPVTSVVYTEAGPLVTLRGSGQKTAMRVKVLLDGVSINTVDDSMGVIP FNAIPVGSIEKIEIIPGGGITLYGSGTSSGVINIVTNQANRKNFGDISFTMSSFDTYNTT LNKGIAFGDKLFWNIGIEAEKGKAYREKEESKKLNVLSGIDYKINNKHRIKLHGSKFWAD FDGTNELDLISLQKHRRGAGKSDANVKSNRYSVSFDYEYKPTENLTFTSSYNQQRFRRDF TQDNRPYLTFLPSEWLEDFFGIPDGMNADLVIKNVNNHLTGRIEEKIQNGKIAGEWKYRE GRGKLSFGYEHSAHRLKRNMDVVVEPFNPITNNYFFLRNKEERIINEEILEQHPNQLMGF HDNVLGGFILSDRDSMEEYGFDFDKFYKKMNDLYYKHTTSEADKKKYENGDPSPWDYWET LKPNLWKVVYDLMQEKMSDYAKDGKTIYRREDSENYDQKPTVPILLKDEKFEDFLKLILP HMVDSTFVVQPVTQSMVDVKKTTDSFYLFDTYKLTDRLEVNGGLRYEKAKYTGNRYTKTE QVIKGNPDNSTTKSMIDMYTELSEAEYAKKNAGDRHKWSGNETSKEKLKELKEKGVTTIL MTDLTRKEKKEEENLGGEIGVNYRFNDTDTVYLKYERGFNTPLPTQLTNKTFDPKTKMKV YWESNLKTEKIDNVELGIRGMLTPNVTYSLAGFISDTQNEILSIVKNGSSHMLREWRFIN IDKTRRMGLEFQSQQNFEKLTLKESLTYVDPRILSNDYEKQVQQIGVDKAEEMYQNNQKV RDWAIENILFSEKSFTIPTGTSEEEIAKMKEESKRLGKEAVKIIQNLREKGIKVDYSAKE EKLREITSGMSPADQARIRKEANELGKEAEERVLAEPRKELEELIAKSAYPDIFKKHLRK FSNYKLIHEGTMKESIYKQFEEEIKASYTKGTLEKGSRIPLSPKWKGTFSADYQFTDKLK 
LGMNTTYIGSYDSAEPGKGYEIVMTKVPHHMVADFYGTYNVNEEFSIKFGINNVFNHQYY LRQDSRTATPAPGRTYSAGFSYRF >gi|224461483|gb|ACDD01000019.1| GENE 37 41950 - 43170 422 406 aa, chain + ## HITS:1 COG:FN0771 KEGG:ns NR:ns ## COG: FN0771 COG0635 # Protein_GI_number: 19704106 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 403 1 405 411 374 47.0 1e-103 MWDYRYKTHHDVEKLLSKLIKNRICTKSVFLERLKETNPSGQLSLYVHTPYCDHICSFCN LNRKQGNQEIDNYAKRLYKEIKNYGNFRFCKSSEIDVIFFGGGTPTIYSEAQLENILKTI HDSFSLAENCEFTFETTLHNLSQEKIFLLTKYGVNRLSIGIQTFSTRGRKLLNRRFSKEI VLKRLQKIRHSFHGTLCIDIIYNYPEQTIEELLEDVRHISDCHIDSVSFYSLIIEEKSKL SRLFQKNPFSFQYNLQKDKELHRIFCEQMRKQGYHLLELTKFVKKDKYQYIQNNYQAKHL LPIGTGSGGRIGNIGCYHMNSKLSFYMQYSSAYMKAYQLLGLFQFPDISEENLKSFSGKN YEKIRNKMNKFVEKGYFSLQNHLFIYTIEGIFWGNNIAAEILKLSQ >gi|224461483|gb|ACDD01000019.1| GENE 38 43184 - 43615 297 143 aa, chain - ## HITS:1 COG:MA3472 KEGG:ns NR:ns ## COG: MA3472 COG3315 # Protein_GI_number: 20092284 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Methanosarcina acetivorans str.C2A # 7 138 140 273 274 86 36.0 1e-17 MFSCVGNVLEEKIYQEIRQEEENIVIIIEGLLIYFTEEEVKKLFHILKRNFPKATIFAEF SKPFIIKHQKYHDTVKDNMAKFRWGIQNAKEIEKVCPEVKWIGEWNLTEEMRPFSRYKLF LLAPFLRKVNNSIVKLKFKDSTN >gi|224461483|gb|ACDD01000019.1| GENE 39 43732 - 43989 388 85 aa, chain - ## HITS:1 COG:FN0388 KEGG:ns NR:ns ## COG: FN0388 COG3315 # Protein_GI_number: 19703730 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Fusobacterium nucleatum # 3 76 6 79 269 70 50.0 6e-13 MTRIQLTGVEETLLIPFYARVYGSKHYASYFYDKEALEIFSKIDYDFSKFENGKMSLYGC LARSIILDREVKKFLKSILIQSVLA Prediction of potential genes in microbial genomes Time: Fri May 20 01:52:47 2011 Seq name: gi|224461482|gb|ACDD01000020.1| Fusobacterium sp. 
3_1_5R cont1.20, whole genome shotgun sequence
Length of sequence - 20199 bp
Number of predicted genes - 17, with homology - 17
Number of transcription units - 5, operones - 4 average op.length - 4.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 43 - 1212 1632 ## COG1301 Na+/H+-dicarboxylate symporters
- Prom 1236 - 1295 8.9
- Term 1277 - 1324 3.2
2 2 Op 1 . - CDS 1349 - 1738 825 ## COG3576 Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase
3 2 Op 2 . - CDS 1810 - 2094 366 ## Hac_1467 hypothetical protein
4 2 Op 3 4/0.000 - CDS 2110 - 3099 1495 ## COG1087 UDP-glucose 4-epimerase
5 2 Op 4 4/0.000 - CDS 3111 - 4604 1984 ## COG4468 Galactose-1-phosphate uridyltransferase
6 2 Op 5 . - CDS 4604 - 5803 1410 ## COG0153 Galactokinase
- Prom 5829 - 5888 7.7
7 3 Op 1 . - CDS 5890 - 6846 798 ## COG0429 Predicted hydrolase of the alpha/beta-hydrolase fold
8 3 Op 2 . - CDS 6843 - 7721 993 ## COG0429 Predicted hydrolase of the alpha/beta-hydrolase fold
9 3 Op 3 . - CDS 7736 - 8119 553 ## FN1276 hypothetical protein
10 3 Op 4 27/0.000 - CDS 8132 - 11182 3662 ## COG0841 Cation/multidrug efflux pump
11 3 Op 5 13/0.000 - CDS 11205 - 12299 1491 ## COG0845 Membrane-fusion protein
12 3 Op 6 . - CDS 12315 - 13586 1567 ## COG1538 Outer membrane protein
13 3 Op 7 . - CDS 13602 - 14246 626 ## FN1272 TetR family transcriptional regulator
- Prom 14306 - 14365 7.5
+ Prom 14194 - 14253 7.5
14 4 Op 1 . + CDS 14396 - 14947 442 ## FN0534 hypothetical protein
15 4 Op 2 . + CDS 14959 - 16428 649 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13
+ Term 16452 - 16503 7.6
- Term 16433 - 16497 16.4
16 5 Op 1 . - CDS 16526 - 19675 3269 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases
17 5 Op 2 .
- CDS 19677 - 20198 265 ## Vpar_0090 hypothetical protein Predicted protein(s) >gi|224461482|gb|ACDD01000020.1| GENE 1 43 - 1212 1632 389 aa, chain - ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 3 385 5 387 390 495 76.0 1e-140 MAKLKDNLIVKLLLGVIIGIIVGLYANEQVIGIINTIKFLIGQIIFFIVPLIILGFITPA ITKMKSNASKMLGTMLGLSYSSSVGAALFSMVAGYILIPKLNIITNVEGLKEIPELIFKV EVPPVVSVMTALVLSIVLGLAVIWTNSKKTEELLDEFNEIMLSVVYKIVIPLLPIFIAST FATLSYEGSITKQFPVFLKVIVIVLLGHYIWLAVLYLLGGAISGKNPWSLLKHYGPAYLT AVGTMSSAATLPVALSCAKKSNVLHDDVADFAIPLGATTHLCGSVLTEVFFVMTVSKILY GSLPSVGTMILFVFLLGIFAVGAPGVPGGTVMASLGIIISVVGFDETGTALMLTIFALQD SFGTACNVTGDGALALILNGIFKKELEKN >gi|224461482|gb|ACDD01000020.1| GENE 2 1349 - 1738 825 129 aa, chain - ## HITS:1 COG:FN1138 KEGG:ns NR:ns ## COG: FN1138 COG3576 # Protein_GI_number: 19704473 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase # Organism: Fusobacterium nucleatum # 2 128 7 142 143 158 60.0 3e-39 MLTDVMKEMIEKELAYVSTVSNDGIPNIGPKRSMRLLDEHTLIYNENTGKQTMKNLIDNG KVAVAYADRAKLDGYRFVGKAEVFTEGKYYDEAVEWAKGKMGAPKAAVVIHIEEIYTLRS GSTAGDKIS >gi|224461482|gb|ACDD01000020.1| GENE 3 1810 - 2094 366 94 aa, chain - ## HITS:1 COG:no KEGG:Hac_1467 NR:ns ## KEGG: Hac_1467 # Name: not_defined # Def: hypothetical protein # Organism: H.acinonychis # Pathway: not_defined # 1 93 1 93 94 99 53.0 3e-20 MATTIQLEKNGALKKSYLGFSWTTFFFGFFVPLFRGDAMWFIVMLILNACTLCMAQLILS FLYNGIYTKNLLKDGYKPADTFSEDILRRKGYIL >gi|224461482|gb|ACDD01000020.1| GENE 4 2110 - 3099 1495 329 aa, chain - ## HITS:1 COG:FN2109 KEGG:ns NR:ns ## COG: FN2109 COG1087 # Protein_GI_number: 19705399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Fusobacterium nucleatum # 1 329 1 329 329 568 81.0 1e-162 
MAVLVCGGAGYIGSHVVKALLDQGEKVVVIDNLITGHVDAVDERAELLLGDLRDEEFLNH AFEKHSIDGVIDFAAFSLVGESVEEPLKYFENNFYGTLCLLKAMKKYKVNHIVFSSTAAT YGEPENIPILETDTTFPTNPYGESKLCVEKMLKWCDKAYGIKYTALRYFNVAGAHASGEI GEAHTTETHLIPIVFQVALGQRAKIGIYGDDYPTQDGTCIRDYIHVMDLADAHILALNRL RKGGDSTVFNLGNGEGFSVKEVIEVCRKVTGHTIPAETSPRRAGDPAKLVASSEKAMHEL KWTPKYNSLEKIIETAWNWHKSHPNGYED >gi|224461482|gb|ACDD01000020.1| GENE 5 3111 - 4604 1984 497 aa, chain - ## HITS:1 COG:FN2108 KEGG:ns NR:ns ## COG: FN2108 COG4468 # Protein_GI_number: 19705398 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Fusobacterium nucleatum # 3 494 2 505 509 651 60.0 0 MAEIHGTLNRLIKYGLENELIVDYDEIWVRNELMDLFHLTEWKEMPISACMMPKYPQSIL DTLCDYAVEQGIIEDTAGNRELFDTKIMGKLTPSPSQVIDRFRATSEFSKEVATQKFYEF SQKTNYIRMDRIAKNVYWQVPTEYGNLEITINLSKPEKDPRDIERQKNLPSSSYPQCLLC YENVGYAGRGNHPARQNHRVLPFILEEEKWYLQYSPYVYYNEHAIVFSREHRPMKISRGS FARITAFLEQVPHYFLGSNADLPIVGGSILSHDHYQGGNHEFPMAKAEIEEEIVFQGFEK VKAGIVKWPMSVLRISSPNREAIINLSDKILRTWREYSDEECGIFAYTGEEAHNTITPIG RRRGENFEMDLVLRNNRRSEEHPLGIFHPHKEYHNIKKENIGLIEVMGLAVLPGRLKEEL EIIRGYLKEESYLEKIKADERVIKHYDWIASFPNAEIDLEKEVGIVFSHVLEDAGVYKRT EEGRKGLLRFVEAVNEN >gi|224461482|gb|ACDD01000020.1| GENE 6 4604 - 5803 1410 399 aa, chain - ## HITS:1 COG:FN2107 KEGG:ns NR:ns ## COG: FN2107 COG0153 # Protein_GI_number: 19705397 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Fusobacterium nucleatum # 7 396 1 388 389 509 66.0 1e-144 MEQMEIIVKRLCEEAKQLFSIEKNDTLEGYFSPGRVNLIGEHTDYNGGYVFPCALSFGTY AVLKRRKDKLCRMYSNNFKELGIFEISLENIIYDEKDAWTNYPKGVIKMFQELGVNTSFG FDILFEGNIPNGAGLSSSASIELLMAEIVRDLYQVEMDRVAMVKLCQKSENVFNKVNCGI MDQFAIGMGKKDHAILLDCNSLEYHYVPVVLEDASIVIANTNKKRGLADSKYNERRASCE AAVADLQKEGCKIQYLGELSLQEFEEKKSLILGEEKQKRAKHAVAENERTKIAVEKLNQN DICAFGKLMNDSHISLRDDYEVTGFELDSLVEAAWEEEGCLGSRMTGAGFGGCTVSIVKN EAVEHFIENVGKKYQEKTGLKAEFYIAKIGEGTRKLGEF >gi|224461482|gb|ACDD01000020.1| GENE 7 5890 - 6846 798 318 aa, chain - ## HITS:1 COG:PA0368 KEGG:ns NR:ns ## 
COG: PA0368 COG0429 # Protein_GI_number: 15595565 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta-hydrolase fold # Organism: Pseudomonas aeruginosa # 4 307 7 321 332 172 33.0 9e-43 MIEYKPSFWFRNAHINTCYPTFFRKVDISYRRQRIFLEDGDFLDFDWVEKGNSKLILLCH GLEGSSESHYIKAFARYFSERAWDILALNYRSCSKEPNPSPFFYIAGKGDEISTALQYAS SYEEIVFIGFSLGANKVLHYLGTEREIPKNVTMGVAVSPPCDLKGSSLLFARGWNKIYEQ YFLKQLKKKMIQKEEKYPNIFQKFEISLEEVQKAKTLVEFDNLVTSKLAGCKDAYEYYKR NSSLFCLKNIHHPSFILTALDDPMMSESCYPREEVEKNMFLYLETPKYGGHISYASFEKD YWLEKFIFEKVNLLKNLK >gi|224461482|gb|ACDD01000020.1| GENE 8 6843 - 7721 993 292 aa, chain - ## HITS:1 COG:STM3462 KEGG:ns NR:ns ## COG: STM3462 COG0429 # Protein_GI_number: 16766750 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta-hydrolase fold # Organism: Salmonella typhimurium LT2 # 2 286 59 347 355 110 28.0 3e-24 MLNYERRRITTEDEDFLDLDCLLTGNSRLAILCHGLGGSARAPYMKSTAKEFQRRNFDVV AMNYRSCSEEVNRRAKMYGMMTYLDLETIIKAFEEEYSEIVLVGFSMGGNIVLNFMVHLL KNYKMIKGAVSVSAPCDVWDSITDFEKLGNGEYQEYFLDRMKDCLREKNKKYPGIFEEAG IKLEEVLQSKGLQAFDETFTVKIEGFKDVDEYYRTTSTKGKLHLITKPTLLLLPWDDIVV SKNCFPVEEGKKNPNLFFERPKFGGHLGYESKNQSFGLEERIVDFILEEVVE >gi|224461482|gb|ACDD01000020.1| GENE 9 7736 - 8119 553 127 aa, chain - ## HITS:1 COG:no KEGG:FN1276 NR:ns ## KEGG: FN1276 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 126 5 129 131 125 45.0 6e-28 MEMSNYKMILIHVNESQKGRLEDFFEDIGFYYYAVQSHAERVISKTLRHKNNKIWPGTDC FFNLIVAEEKLEEMLSYLKTFRMSLPEGIIMSIGIIPVERVIPSLYQEDIPIKEELLKEL KKKHNYK >gi|224461482|gb|ACDD01000020.1| GENE 10 8132 - 11182 3662 1016 aa, chain - ## HITS:1 COG:FN1275 KEGG:ns NR:ns ## COG: FN1275 COG0841 # Protein_GI_number: 19704610 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 998 1 999 1020 1111 59.0 0 MTLAGLSIRRPVATSMIVISVIFIGIITMLNMRTELLPNMEIPTVTIRSSWSGAVAEDVE 
TQITKKIEAILPNVEGIDKIESTSSYGSSSIVVKFNFGTNADQKVTEIQREVSKILNDLP KDASNPVAKKIQAGIGSLSMVVMMSAPNKAELTTFVDEYLKPKFESLPGAAQVDVYGNAA KQLQIQVDSEKLAAYNLSPVELYNLISSSNTVLPIGTLQTGTKQLVVRYMGEMQSIEDFE NMIISSNGNTLRLKDISNVVLTREDESNKGYISGKEAITILLQKSTDGSTVDLTEKANKA LRELKGIMPKGTEYNIIMDTSVDIKSSITGVSSNALQGLILATIVLFVFLRNIRATFLIT LALPISVIFTFAFLKATGTTLNLISLMGLSIGVGMLTDNSVVVIDNIYRHITELHSPVLE ASENGSTEVSASIFASALTTMLVFIPILFIPGFAREIFRDMAYAIIFSNVAALIVALTLI PMLASKLMSNDVKISSDGKIFHKIREKYLKLISYALSHRKLTIFITLGIFVFSIFVSSFL KFNFMPKQDQGRYSITAELPNGLDLEKSDKIAKQIEAFVKEEPNTKTYFIIVGNNSVNVN VDIGKKDTRSTSVFDIIEKMRPLVSKIPDTRVSLKEDFGMGSIRRDVEFQIKGANLNEIK ELGVLVQEEVSKNPKLRDVKSSLDPGNQEARLILNRDKIRSYGINPVVIAQNLSYYILGG NRGNTTTIKTGTENIDVLVRLPKEKRQDINQLKNLNIKIGDHKFIKVGDVADIVYGEGSL SIQKKDRIYSVTISANDNGLGVRGVQQAFIEAFKKVNQSDAISYSWGGESENMNATMSQL SSALLIAIFLIYALIAAQFESFLLPFVVIGSIPLALIGVFLGLFILGQAMNMMTMIGIIL LAGIVVNNAIVLIDFIQLMRVRGMSRTEAIIEAGSTRLRPILMTTATTVLGMIPMALGFG EGSEIYKGMSLAVIFGLSVSTLLTLIVIPILYSLMDDFIMRGKHFFSNFKFFKKIK >gi|224461482|gb|ACDD01000020.1| GENE 11 11205 - 12299 1491 364 aa, chain - ## HITS:1 COG:FN1274 KEGG:ns NR:ns ## COG: FN1274 COG0845 # Protein_GI_number: 19704609 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 3 351 2 363 370 346 52.0 4e-95 MRKQIIAIMLMGIVLFVACGKKEETSMERPVKKIVSEQVIIREMSQIFESDAVLEPKDKV NHNTERGGTIEKIYKKNGDYVKKGDLVMSFSDAGTKASYLQALANLQTAESSYRIAQGNH SKFKQLYDRGLVSHLEYVSYENTLVSASGQLEVAKAMFQSAQSDYSKLERRADISGTIGN LFGKEGNKVNPLEDVFTVLNDSQMQAYIGLPGEYIANVKNGDHLTVHVDNTGKDYEAVIG EVNPIADTTTKNFMTKIILQNSEKEIKDGMYASINLPIGSKQVLSVPDEAIFVRNLISYV FKIVDGKAVRVEVQAGSQNGEYTEITSPDIKEGDKIVVKGLFGLQDGDKVEESTPADLDP KQAN >gi|224461482|gb|ACDD01000020.1| GENE 12 12315 - 13586 1567 423 aa, chain - ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: 
Fusobacterium nucleatum # 13 423 3 413 413 340 48.0 4e-93 MKKIWTMFFLVGSLAFAREITLEEAIQESMNHSKTLKISEKKLQISKLNRSQAIKKALPS VVYNTSYQRTEYERNISKNKSSMQLEKGGYKQSITISQPIFQGGAIIAGIQGAKAYETIA DLSYVQEGLNTRLKTIRTFSNIVNSKRNLQALENSEKQLQKRYQKQEAQLELRLITKTDL LKTKYNLLEIQSLIAKAKSNIEVQTEDLKFQMGVDKEEQLEVKEFNVPNHLTDTIDFQKD KEKALEFSIQSLIAKSQVEIAKAQETAALGNMLPKINAFASYGVATERTKWKQTREDAEW MGGLSVSWNVFSFGSDYDNYQIAKLEKENKELSETIAQDRIELTLKTAYSELQRLEILRE SRKRGLEAAELNFSMDQEKFDSGLISTIDYLLSETQLREARVNYYQAELDYYYAFEYYRS LLV >gi|224461482|gb|ACDD01000020.1| GENE 13 13602 - 14246 626 214 aa, chain - ## HITS:1 COG:no KEGG:FN1272 NR:ns ## KEGG: FN1272 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: F.nucleatum # Pathway: not_defined # 9 213 6 210 211 65 27.0 1e-09 MVEKQEIENKKEKILEIFQKLVLEKGYSKVSVEEITSSLGISKGSFYSYFRSKTDMVLEC IEENFWISLERQKHIENISNSMESTLWNYFIERFQTNIQHIKKELVLISLFKNLEILEES IVKRLICFEKTYIEYWEKQLEKYDEELNILEEERHEYAILLAKMIQGFRMSALFVTQDEN FFTTDVAEVLKRIEDKTILNKIEFLIKNIFKMIK >gi|224461482|gb|ACDD01000020.1| GENE 14 14396 - 14947 442 183 aa, chain + ## HITS:1 COG:no KEGG:FN0534 NR:ns ## KEGG: FN0534 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 44 183 2 142 142 107 50.0 3e-22 MIFGVTMKIRKLFFVLAPLMIGVGIYLLYRSRNLYYYQLLQDTHLHPYINQIRENAKIYR KIFSTWIVYSLPDGLWLFSFGAALLLDRVYYWMHLFIFSAIYALMIGIEYLQKLYGGHGH WLGTFDLQDIEAYTIAYLSILFFSLIFYFFQPKNKVHNRKKELGIDCIYIGIFGVLGALP SLL >gi|224461482|gb|ACDD01000020.1| GENE 15 14959 - 16428 649 489 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 115 487 173 545 546 254 37 3e-67 METLNYRTYLTTLKLVFLKAISDLYPEHEVVFLNSLNNGLYGKIIYNNHVFREADYDKIK TQMKQIIDANLPIQIVSSNYEAIKLSPIEENREDIQELINTTLWTGIMKMELDGYVDYFY HLPYDSTGKLNAYDVYPYSSGFILKYPITDPNTLEQKIDTPKMAAIFEESDHWLRLMDVP NAGSINRKVLNHEIRSLIRINEALHNKNLAKISEQIVKNDKIKVITIAGPSSSGKTTFAN RLFIQLKADEVNPLVISLDNYYIGRKNIPLNEEGEKDYEALEALDIRLLNQNLVDLIDGK 
EVELPIYNFITGEREEKGKIVRLSNKHGVIIIEGIHGLNEAMTKYIPKEQKFKIYISCLT QLNLDKHNRIATSDVREIRRMVRDSLSRNTAAEETLAMWSSVRKGEEKHIFPFQEEADVI FNSNLVYEMGVLKNAAMRELVKVPTTSPYYADARRLIGLLACFLPIETDDVPDDSILKEF IGKSFFYNY >gi|224461482|gb|ACDD01000020.1| GENE 16 16526 - 19675 3269 1049 aa, chain - ## HITS:1 COG:XF2739 KEGG:ns NR:ns ## COG: XF2739 COG0610 # Protein_GI_number: 15839328 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Xylella fastidiosa 9a5c # 5 1017 18 1053 1058 731 38.0 0 MLQAFTEANYENSIIQLFQSMGYQYVYGPDIERDFESPLFEEVLMDQLHMINPKAPLEAI QSALFKIKNFENGELIEKNNLFMEFLQNGIEVSYLEQGEQYSTQIYLVDYKHIEKNSFIV ANQWTFIENSNKRPDIVLFLNGIPVVLMELKSPSREEVDSSEAYSQLRNYMHEIPSMFIY NCICVMSDQLISKAGTITSDETRFMEWKTKDGSYENTRYAQFDTFFEGIFTKDRFLDILK NFICFSNIEGKKIKILAGYHQYFAVKKAIESTRRAVETDGKGGVFWHTQGSGKSLSMVFY SHLLQEALESPTIVVLTDRNDLDNQLYQQFVNCKDFLRQTPEQAKSREDLKSLLAGRKVN GIIFTTMQKFEESEEALSERRNIIVIADEAHRGQYGLSEKIKMTKNDEGEEVAKKVIGTA RIIRNSLPNATYIGFTGTPISSKDRSTREVFGEYIDIYDMTQAVEDGATRPVYYESRVVH LKLDEETLKLIDQEYEIMSQDADLEVIEKSKRELGQMEVILGNEKTIDSLVQDILNHYES YRQHELTGKVMIVAYSRSIAMKIYRRILEIHSHWTEKVAVVMTESNKDPEEWREIIGNKH HRAELAKKFKDNSSPLKIAIVVDMWLTGFDIPSLSTMYIYKPMSGHNLMQAIARVNRVFK EKVGGLVVDYIGIASALKTAMNDYTIRDRKNYGDQDISKVAYPKFLEKLSVCQDLFHGYE YTKFSTGNDLQRAKVISGAVNFMLDIRQEQKKESFLKEALLLQQSLSLCSSLVGESQRYE ASFFEAVRVLIIKLMNNGAGKKISLKEMNERISSLLKQSIKSEGVINLFDGIEKEFSIFD PHFLEEISKMKEKNLALELLKKLISEQVKIYTRSNVVKSEKFSEMIQQTMNRYLNGMLTN EEVVQELLKLAKEIQEAQEAGKELGLSSEELAFYDALSKPQAVKDADTNKELIALTKELT ESLRKNRTVDWQKKESARAKMRMMIKKLLKKYKYPPEGAEDALRTVMIQCELWTDNFVFH EKPEKEIYAERFMLEGQREYRMVAEETMK >gi|224461482|gb|ACDD01000020.1| GENE 17 19677 - 20198 265 173 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0090 NR:ns ## KEGG: Vpar_0090 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 29 173 11 154 156 92 38.0 5e-18 NINKERIKIEKNDKMKNEIIIEKDLKSIIKIGLDLLYKRDIYLIRKKVSERAVVFKFGVY 
FSNLISLYYPKYDVDIEYNRSGDDPKRLISTEKLIIPDLIFHKRGCRGPNILFLEFKTYW NKNQEEDENKIKEVCNPEGKYKYQYGITVLLEKNRSEVKINFYHNNNWEEWNL Prediction of potential genes in microbial genomes Time: Fri May 20 01:53:02 2011 Seq name: gi|224461481|gb|ACDD01000021.1| Fusobacterium sp. 3_1_5R cont1.21, whole genome shotgun sequence Length of sequence - 2665 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 177 - 521 381 ## HELPY_0444 type I restriction/modification specificity protein 2 1 Op 2 . - CDS 533 - 1522 1097 ## COG0582 Integrase 3 1 Op 3 . - CDS 1609 - 2634 1150 ## COG3943 Virulence protein Predicted protein(s) >gi|224461481|gb|ACDD01000021.1| GENE 1 177 - 521 381 114 aa, chain - ## HITS:1 COG:no KEGG:HELPY_0444 NR:ns ## KEGG: HELPY_0444 # Name: hsdS1 # Def: type I restriction/modification specificity protein # Organism: H.pylori_B38 # Pathway: not_defined # 20 114 22 117 419 102 59.0 4e-21 MNKIKLKKIARSNLYSYNLKEDNWEYINYLDTGNITMNHINEIQHINLRVEKLPSRAKRK VRYNNIIYSTVRPSQKHFGIIKNILPNFLVSTGFVVLEIDPLKADADFIYYFLT >gi|224461481|gb|ACDD01000021.1| GENE 2 533 - 1522 1097 329 aa, chain - ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 3 328 2 320 321 384 61.0 1e-106 MKEKIITAVLQQMQTMLNNAQMSRLQDILEYELFSYKIEKNDFMEEDIWTNERLLNTFLS AKRVEGCSEKSLSYYQKTIETMLNSIGKEIKYIVTDDLRSYLTEYQSEKQSSRVTIDNIR RILSSFFSWLEDEDYILKSPVRRIHKVKTISSIKDTYSDEELERMRDSCHEIRDLALIDI LASTGMRVGELVLLNRQDIQFGERECIVFGKGDKERVVYFDARTKIHLQNYLNTRVDSNP ALFVALRKPYNRLTIGGIEVRLRKIGKELEINKVHPHKFRRTLATIAIDKGMPIEQLQKL LGHRRIDTTLQYAMVKQSNVKLAHKKFIG >gi|224461481|gb|ACDD01000021.1| GENE 3 1609 - 2634 1150 341 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # 
Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 10 341 8 342 345 231 38.0 1e-60 MKKKNEITIHSSTAEYLTFVASTGNSQDSFEIRYEDENIWLSQKMMAQLYDVEVNTVNYH IKKIFQDNELLEESVIRKFRITAEDGKTYNTKHYNLQLIIAVGFKVNNQRAVQFRKWSGQ IVKDYTIQGWTMDKERLKKGHMFTDEYFERQLQYIREIRLSERKFYQKITDLYVTAFDYD KNSKTTKLFFQTVQNKLHFAVHRHTASELIFERANANKKNMGLTTWENAPNGKIIKADVN IAKNYLNDQEMKYLERIVSMYLDYAELQAERKIPMSMEDWSKRLDGFLEFNGNELLIGAG KISSEQAKLHAETEFEKYRIIQDRLYKSDFDEFLLLEEETK Prediction of potential genes in microbial genomes Time: Fri May 20 01:53:21 2011 Seq name: gi|224461480|gb|ACDD01000022.1| Fusobacterium sp. 3_1_5R cont1.22, whole genome shotgun sequence Length of sequence - 47652 bp Number of predicted genes - 50, with homology - 49 Number of transcription units - 13, operones - 12 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 27/0.000 - CDS 1 - 481 239 ## COG0732 Restriction endonuclease S subunits 2 1 Op 2 1/0.000 - CDS 471 - 1979 1662 ## COG0286 Type I restriction-modification system methyltransferase subunit - Prom 2055 - 2114 10.2 - Term 2098 - 2141 4.4 3 2 Op 1 8/0.000 - CDS 2161 - 3855 205 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 4 2 Op 2 . - CDS 3842 - 5584 2110 ## COG4988 ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components - Prom 5641 - 5700 9.5 - Term 5696 - 5747 8.1 5 3 Op 1 . - CDS 5766 - 7322 1845 ## COG2978 Putative p-aminobenzoyl-glutamate transporter 6 3 Op 2 . - CDS 7350 - 8078 782 ## COG2071 Predicted glutamine amidotransferases - Prom 8175 - 8234 13.9 - Term 8227 - 8263 0.1 7 4 Op 1 . - CDS 8312 - 9160 764 ## COG1737 Transcriptional regulators 8 4 Op 2 8/0.000 - CDS 9174 - 9758 321 ## PROTEIN SUPPORTED gi|162456259|ref|YP_001618626.1| putative ribosomal protein 9 4 Op 3 . 
- CDS 9730 - 10479 937 ## COG0689 RNase PH
10 4 Op 4 1/0.000 - CDS 10479 - 12248 260 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
11 4 Op 5 5/0.000 - CDS 12245 - 13318 1129 ## COG0763 Lipid A disaccharide synthetase
12 4 Op 6 5/0.000 - CDS 13329 - 14132 888 ## COG3494 Uncharacterized protein conserved in bacteria
13 4 Op 7 25/0.000 - CDS 14132 - 14905 1288 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase
14 4 Op 8 4/0.000 - CDS 14921 - 15346 712 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases
15 4 Op 9 1/0.000 - CDS 15359 - 16186 953 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase
16 4 Op 10 . - CDS 16202 - 18355 2441 ## COG0210 Superfamily I DNA and RNA helicases
- Prom 18382 - 18441 11.1
17 5 Tu 1 . - CDS 18545 - 19771 1381 ## FN0173 hypothetical protein
- Prom 19912 - 19971 8.4
+ Prom 19814 - 19873 7.0
18 6 Op 1 1/0.000 + CDS 19906 - 20673 754 ## COG1183 Phosphatidylserine synthase
19 6 Op 2 1/0.000 + CDS 20660 - 21766 809 ## COG0859 ADP-heptose:LPS heptosyltransferase
20 6 Op 3 . + CDS 21756 - 22442 542 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
+ Term 22444 - 22503 4.3
+ Prom 22457 - 22516 10.3
21 7 Op 1 6/0.000 + CDS 22538 - 23545 1177 ## COG1145 Ferredoxin
22 7 Op 2 2/0.000 + CDS 23573 - 24376 1067 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
23 7 Op 3 . + CDS 24386 - 25363 1045 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits
24 7 Op 4 . + CDS 25391 - 26155 1010 ## COG2116 Formate/nitrite family of transporters
+ Term 26175 - 26222 12.1
- Term 26163 - 26210 12.1
25 8 Op 1 . - CDS 26227 - 26688 709 ## FN1065 hypothetical protein
26 8 Op 2 . - CDS 26690 - 27466 1207 ## FN1064 hypothetical protein
27 8 Op 3 . - CDS 27469 - 28665 1651 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase
- Term 28692 - 28725 4.0
28 9 Op 1 . - CDS 28742 - 29284 508 ## PROTEIN SUPPORTED gi|34763431|ref|ZP_00144379.1| PROBABLE SIGMA(54) MODULATION PROTEIN; SSU ribosomal protein S30P
29 9 Op 2 23/0.000 - CDS 29361 - 30044 242 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
30 9 Op 3 . - CDS 30037 - 31206 1465 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component
31 9 Op 4 1/0.000 - CDS 31203 - 32468 1254 ## COG0514 Superfamily II DNA helicase
32 9 Op 5 2/0.000 - CDS 32446 - 32985 510 ## COG0514 Superfamily II DNA helicase
33 9 Op 6 . - CDS 33007 - 33936 745 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake
- Term 33951 - 33992 5.6
34 10 Op 1 . - CDS 34014 - 35135 1109 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog)
35 10 Op 2 1/0.000 - CDS 35151 - 35969 976 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32)
36 10 Op 3 1/0.000 - CDS 35995 - 36996 1302 ## COG0240 Glycerol-3-phosphate dehydrogenase
37 10 Op 4 . - CDS 37008 - 37607 853 ## COG0344 Predicted membrane protein
38 10 Op 5 1/0.000 - CDS 37675 - 38673 1439 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
39 10 Op 6 14/0.000 - CDS 38677 - 39243 767 ## COG1799 Uncharacterized protein conserved in bacteria
40 10 Op 7 2/0.000 - CDS 39255 - 39935 829 ## COG0325 Predicted enzyme with a TIM-barrel fold
41 10 Op 8 1/0.000 - CDS 39948 - 41045 1093 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases
42 10 Op 9 . - CDS 41029 - 42723 2179 ## COG1109 Phosphomannomutase
- Prom 42767 - 42826 10.4
- Term 42811 - 42856 6.2
43 11 Op 1 .
- CDS 42891 - 43667 1139 ## FN0558 TraT complement resistance protein precursor 44 11 Op 2 41/0.000 - CDS 43706 - 45325 1549 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 45 11 Op 3 . - CDS 45336 - 45599 531 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 45654 - 45713 7.5 + Prom 45603 - 45662 9.7 46 12 Op 1 . + CDS 45759 - 46034 488 ## COG1937 Uncharacterized protein conserved in bacteria + Prom 46036 - 46095 7.8 47 12 Op 2 . + CDS 46123 - 46671 748 ## COG2849 Uncharacterized protein conserved in bacteria 48 13 Op 1 . - CDS 46668 - 46745 70 ## 49 13 Op 2 . - CDS 46724 - 47515 837 ## COG0500 SAM-dependent methyltransferases 50 13 Op 3 . - CDS 47519 - 47650 89 ## gi|257466108|ref|ZP_05630419.1| FUR family transcriptional regulator Predicted protein(s) >gi|224461480|gb|ACDD01000022.1| GENE 1 1 - 481 239 160 aa, chain - ## HITS:1 COG:PAB2150 KEGG:ns NR:ns ## COG: PAB2150 COG0732 # Protein_GI_number: 14520513 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Pyrococcus abyssi # 58 155 300 398 427 66 34.0 2e-11 MKYRLSDICHYVKGKVDVSELDNSTYISTENMLPDKGGVTEAASLPTTLQTQIYEKDDVL VSNIRPYFKKIWFADQNGGCSNDVLVFRANEGVEPGFLYYVLADDKFFDFSMATSKGTKM PRGDKKALMEYEVLDFNIDTQKKVASLLGDIDEKIRVNTE >gi|224461480|gb|ACDD01000022.1| GENE 2 471 - 1979 1662 502 aa, chain - ## HITS:1 COG:XF2742 KEGG:ns NR:ns ## COG: XF2742 COG0286 # Protein_GI_number: 15839331 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Xylella fastidiosa 9a5c # 9 474 16 491 519 534 51.0 1e-151 MAKKSNVKIGFEKEIWDAACVLWGHIPAADYRKVIVGLIFLRYISSSFEKKYKELLEEGY GFEDDRDAYMEDNIFFVPKEARWSTISAATHTAEIGMVIDNAMRAIEAENKTLKNVLPKI YASPDLDKRVLGEVVDLFTNNINMEDTEESKDLLGRTYEYCIAQFAAYEGTKGGEFYTPS SIVKTIVEILKPFDNCRVYDPCCGSGGMFVQSVKFLQAHSGNRNHISVFGQESNADTWKM AKMNMAIRGIDANFGPYQADTFFNDLHSTLKADFIMANPPFNLSNWGQDKLQDDVRWKYG LPPAGNANYAWIQHMVHHLAPNGKIGLVLANGALSTQTSGEGNIRKAIIEDDLIEGIVAM 
PTQLFYSVTIPVTLWFISKNKKQKGKTLFIDARNMGFMVDRKHRDFTEEDIQKLANTFTH FQEGILEDEKGFCAVVETEEIRKQDYILTPGRYVGIADPEDDGEPFEEKMTRLTSELSDM FEKSHELEDEIRKKLGAIGYEI >gi|224461480|gb|ACDD01000022.1| GENE 3 2161 - 3855 205 564 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 365 564 36 241 329 83 30 2e-15 MKQNRRSAARIMFQLAGLLKPLWGIMSIAVATGVIGFLFSFGISMFGAYAILKSLDWGNL SKVPFGTWSLHAYFIAMAICAFFRGLLHYIEQYCNHFIAFHILAEIRVRLFKVMRRLAPA KMDGENQGNLISMITGDIELLEVFYAHTVSPILIACVTTIFLFLYYLGLHWIYAIYALLG QIFVGILVPWIASRKASTVGMKVRNEIGNLNGEFLDKLRGLREVVQYRRGKEMVARISSL TDHLCEGQRELRNQMALVQVWTDSAIIFVSLFQLILSVFLVSSGMVGMEAAILAGVLQVG SFAPYINLANLGNILSQTFACGERVLSLMEEKPAVEDSENAEEISMGNLRVDNLHFEYRN GRQQQVLKGVNLEIKPGEIVGIMGPSGCGKSTLLKLMMRFWDADAGTISLGGKNIKEAKR SSLYSHYNYMTQSTSLFTGTIQDNLLVAKPEASEEEIMEALKKASFYDYVMSLPDKLQTV VEEGGKNFSGGERQRIGLARCFLADRSIFFLDEPTSNLDVQNEAIILKSLMQERKDKTII LVSHRLSTLGVCDRILKMEQGQLV >gi|224461480|gb|ACDD01000022.1| GENE 4 3842 - 5584 2110 580 aa, chain - ## HITS:1 COG:FN1819 KEGG:ns NR:ns ## COG: FN1819 COG4988 # Protein_GI_number: 19705124 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, ATPase and permease components # Organism: Fusobacterium nucleatum # 1 578 1 580 581 637 56.0 0 MIEKQLYRFCGETEKYIKDSVFLSCYRLLAGIGFSFLFAKLLTDILEANWTTNLFLVIGG MIVIILIKQWCMRMVASKLGFLVSEVKENLRKAIYQKVLRLGISYQESFQTQEVIHLAVD GVEQLENYFGGYLTQFYYCFASSLILFCVIAPWNLKAALVLLIMACSIPLTLQLLLKIVK KVQKKYWSKYASVGNLFLDSLQGLTTLKVYGTDGKREEEIADLSEGFRKQTMKVLKMQLS SIAVINWIAYGGTVAAIIISILAYRRGDLGLFGLLFIFMLAPEFFIPMRTLTAQFHVAMT GVAAAENMMNFLQKEEEKSLGEESYQKGSQIHVKNLVYHYQDGTKALDGLNLDLESGKLT AIVGHSGCGKSTFASLLSGEMQVGVHQIFVGDTDIRSLKAGEITKHILRITHDGHIFSGT VKENLLMGNPEASEEMMIEALEKVSMWKFLQEKDGLNTVLLSQGKNVSGGQAQRISLARA LLHNAEIYIFDEANSNVDIESEEIILSVIYELAKTKTVVYISHRLPSIRKADTIYVMRKG KVVQSGNHESLYAEEGLYQSMYREQEDLENFQKGGSHETK 
>gi|224461480|gb|ACDD01000022.1| GENE 5 5766 - 7322 1845 518 aa, chain - ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 1 511 1 502 512 351 40.0 2e-96 MANETEFSRKGFLGKVAAISNRLPHPVTIFIILSVVVAFLSVIFSQMGVQVEIEAINRST KEVELQTFQVRNLLNAEGIRWIFESAVENFISFEPLGVVLFFSLFFNFLNEVGLFPSFLK KSMQKIKGRYVSFFIAFLGVNSSFAGDIGYVLVIPIAGIIYKQLKRNPIAGIILGFSSTS AGFAACLVSIDALLGGLSTSAMSIVNPDYIVTPLANSIFMFFFTFFITFIIAFINDRFIE PQLEGMTLEEEISETEENNFSILTEEENKGLRAAGLGFLVSLAIILILSVPSWAPLRNPN TGKLLLGWSPLLSAIVPVICFIFFVPGLFYGIATKKIRNDKDLMTFLFKSLDGFGAFIVL CFFSSIFISWFSYSQLGIIIAAEGGKFLSGIGLSRLPLIIAFVLFCSFANLFIGSMTSKY VLLAPIFLPMLYKMGISPELAQLAYRIGDSSTNVVSPLMSYFALILIYCNKYNKKFGMGD LITYMIPHAIVILISSLIFLGIWVTFDLPIGFGTVNFL >gi|224461480|gb|ACDD01000022.1| GENE 6 7350 - 8078 782 242 aa, chain - ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 2 236 3 237 243 211 43.0 1e-54 MKKPIIGITSAYEKEEGLRNYHRTTVSIDYTKAVVKGGGIPLVIPVTEDREIIKDQIALL DGLLLSGGTDLNPFLYGEDFKNGIHLVSPERDAYEWILLEEFLKTGKPILGVCRGHQLLN VYFKGSLYQDLKYYSSEVIQHRQEMYPELATHTVNIIDRDNILFELYGEKIFTNSFHHQI INRLGENLTVIATTNDGVIEAFQKKSHKFLYGIQWHPEMMTARGNTEMQKIFEKFVSYCM KE >gi|224461480|gb|ACDD01000022.1| GENE 7 8312 - 9160 764 282 aa, chain - ## HITS:1 COG:CAC1850 KEGG:ns NR:ns ## COG: CAC1850 COG1737 # Protein_GI_number: 15895125 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 263 16 267 293 122 27.0 1e-27 MDKISKFYMEHYKTLTKGEKKIAEYIVKNPKRVLLLSALELGKEIGVSDASILRFSKSLG FGKFTEFRNYIALEIREANPADRIVKHWDNFQSNSDIVNKIVNADLKNIKEFLMNIDFEA VNELVSWINHSRKIYILGIGSSRAISQFLFWYIKRLGFDVECVNEGGLGLYESFSHMKEE DLVILFTFPRFLKDEIQALNLAKEKKAKIVAITSNLFSEISYLSDMVFKLSCENEGFFNS 
YIVPMELCNIILTALFEQNKEKIYSEMKKNTEMKDFLFTNEK >gi|224461480|gb|ACDD01000022.1| GENE 8 9174 - 9758 321 194 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|162456259|ref|YP_001618626.1| putative ribosomal protein [Sorangium cellulosum 'So ce 56'] # 1 185 6 197 207 128 41 7e-29 MKLFLATGNKHKIEEIKAIFHENEVEIFSILDGISIPEVVEDGKTFEENSQKKALEIAKY LNMMTVADDSGLCVDALGGAPGVYSARYSEEGTDEANNQKLIQNLKGIDNRKARFVSVIS FAKPDGEVFSFRGEVEGEIVDDRRGEFGFGYDPYFYVKEYGKTLAEMPEVKNQISHRANA LKKFQEFWRQKKSF >gi|224461480|gb|ACDD01000022.1| GENE 9 9730 - 10479 937 249 aa, chain - ## HITS:1 COG:FN1851_1 KEGG:ns NR:ns ## COG: FN1851_1 COG0689 # Protein_GI_number: 19705156 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Fusobacterium nucleatum # 1 240 1 240 242 306 67.0 2e-83 MERIDGRKENQLREIKITRDFNIHAEGSVLIESGNTKVICTASVSEKVPSFIKNTGKGWL TAEYSMIPRATGERNQREAAKGKLSGRTMEIQRLIGRALRSSIDLEKLGERTITLDCDVI QADGGTRTASITGAFVAMAIAAAKLLREGTITDSPVLSSVAAVSVGKCEGNIFLDLNYVE DSSAEVDMNVIQNDLGEYIEVQGTGEEATFCRKELNTLLDMAEIGIHQLLEKQREVLGED YEIIFSNRK >gi|224461480|gb|ACDD01000022.1| GENE 10 10479 - 12248 260 589 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 350 571 135 355 398 104 31 9e-22 MKKLSWNPLKNKSLSTFLHYSMQYKWKMFAIVILSALASAMSAIPAWLSKYLIDDVLVKQ EKNMLFLVLAGMFFCTLIKVVAVYYADIGSGYITEVIKRDIKVDIFKHLQKLPLHYYKKN KLGDIMARLSGDTSTLGRMGFIIFEMFKEFLTTFVLIIRMFQVDYILALISLIVLPLILQ VVRKYTKKIRKSGRVRQDTTGAITAFTQESLSGIFVVKAFNAMKIMISKYEKISYDEFQK SFKTAKIKAKVSPINELITTLMIVLVALYGGYKIIVTKDITSGDLVSFVTALGLMQQPLK RLVAKNNELQESIPSADRVLEILEENIEKEYTGEEKHLDGRIESIEIENVSFVYPDTTEN VLEDISLSIKSGEVVALVGKSGSGKSTLVNLIARFYETVSGKILINGVDSQTIPLEEFRN YIGVVPQESFLFSGSIAENIAFGKERVTQEEIEKAAKMANAYDFIMELPEQFETEVGERG TRLSGGQKQRIAIARALIQNPQIMILDEATSALDTESEKLVQEALDELMKGRTTFVIAHR LSTIIHADKIVVMEDGKIREVGNHTELLEKKGLYEHLYHIQFQEKMEEK >gi|224461480|gb|ACDD01000022.1| GENE 11 12245 - 13318 1129 357 aa, chain - ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: 
FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 357 1 356 356 375 52.0 1e-104 MKIFVSTGEVSGDLHLSYLAKVIRKKYPDCELYGVAGLHSREAGVTVIQDIQELAIMGFL EAFKKYSFLKEKMESYLQFIEKEKIEKVLLIDYGGFHLKFLKALKERCPDVKVNYYIPPK LWVWGKKRIQSLRLADEIMVIFPWEVDFYQKEGVKVHYFGNPLVETCPPRKQSGDKILLL PGSRKQEILSVMDIYYDLILRNPKQEFLLKLSNEEAFSFLPKEMKDLPNVEIIFGKDLGE IVKKCSYAVAVSGTVTLELALFDVPSIVVYRTSFLNYFIAKYLLKVGYISLPNITLGEEV FPELIQKDCEVKNIEQYLEKIKQNPASWKKKLESVRESLSGENIIENYADFLVEGEK >gi|224461480|gb|ACDD01000022.1| GENE 12 13329 - 14132 888 267 aa, chain - ## HITS:1 COG:FN0596 KEGG:ns NR:ns ## COG: FN0596 COG3494 # Protein_GI_number: 19703931 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 267 1 267 267 311 56.0 7e-85 MEKIGIIVGNGKFPLYFMKEAKSQGYDLYPVGLFDSIEEEIKNMEHYRSFHIGHIGEIVK HFSFCGIKKLILLGKVEKSLLFQNLDLDYYGQEIMKMLPDKKDETLLFAVISFLKLNGIK VLSQNYLLSSFMVEEICYTEKKPEKEDHKTIQLGVEAAKMLTKLDIGQTVIVKEEAVVAL EGMEGTDKTILRAGELAGKGCIIVKMARPKQDMRVDIPTVGVETVKKAIEIGAKGIVMEA KKMFFLEREEAISLANQYGIFLIGKKV >gi|224461480|gb|ACDD01000022.1| GENE 13 14132 - 14905 1288 257 aa, chain - ## HITS:1 COG:FN0595 KEGG:ns NR:ns ## COG: FN0595 COG1043 # Protein_GI_number: 19703930 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Fusobacterium nucleatum # 1 257 1 257 257 421 80.0 1e-118 MVEIHSTAIVEEGAILEDGVKIGPYCIVGKDVKIGKNTVLQSHVVVEGITEIGEENTIYS FVSIGKASQDLKYRGEPTKTIIGNKNSIREFVTIHRGTDDRWETRIGSGNLLMAYVHIAH DVIVGDGCILANNVTLAGHVVVDSHAIIGGLTPVHQFTHIGSYVMVGGASAINQDICPFV LAEGNKAVVRGLNTVGLRRRGFSDEELSNLKKVYRIIFRKGLPLKEALAEAEEQFGSDKN VAYLLEFIRNSERGIAR >gi|224461480|gb|ACDD01000022.1| GENE 14 14921 - 15346 712 141 aa, chain - ## HITS:1 COG:FN0594 KEGG:ns NR:ns ## COG: FN0594 COG0764 # Protein_GI_number: 19703929 # Func_class: I Lipid 
transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Fusobacterium nucleatum # 1 140 1 140 141 211 74.0 3e-55 MLNTLEIMERIPHRYPFLLVDRILEMDVENKRVIGRKNVTINEEFFNGHFPEHPIMPGVL IVEGMAQCLGVLVMEGQEGKVPYFAAVENVKFKQPVRPGDTITYDVKVEKIRSNIVKASG VALVDEVKVAEASFTFCIADK >gi|224461480|gb|ACDD01000022.1| GENE 15 15359 - 16186 953 275 aa, chain - ## HITS:1 COG:FN0593 KEGG:ns NR:ns ## COG: FN0593 COG0774 # Protein_GI_number: 19703928 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Fusobacterium nucleatum # 1 275 7 282 283 387 71.0 1e-107 MKRRTIAKEIEYSGIGLHKGETIFMRLLPSNTGKIIFRRVDLEKGKNEIVLDIDNTFDLT RGTNLKNGFGAMVFTIEHFLSALAMVNITDLIVELNGNELPICDGSAKVFLELFENAGTR DLEEEVEEIIIKEPLYLSLGDKHIVALPSEEYKLTYSIRFEHSFLKSQTAEFILDYETYR KEIAPARTFGFDYEIEYLRKNNLALGGTLENAIVVQKDGVMNPGGLRFEDEFVRHKMLDI IGDFKILNRPIKAHIIAIKAGHALDIEFAKKLREI >gi|224461480|gb|ACDD01000022.1| GENE 16 16202 - 18355 2441 717 aa, chain - ## HITS:1 COG:FN0592 KEGG:ns NR:ns ## COG: FN0592 COG0210 # Protein_GI_number: 19703927 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 716 1 735 735 764 58.0 0 MSLLEKLNDKQREAAATVEGPLLILAGAGSGKTRTITYRIAHMIEELGIPPYLILAVTFT NKAAKEMKERVISLIGEEAERATISTFHSFGVRLLRMYGSKLGYQANFTIYDVEDQKRII KGIMKELNLQNTDLSEKKLASLISKLKEEGVSADDYEKDAYEYEAKTIAEIYRRYNIKLK NQNGIDFSDILLNTKNLLEIPEILEKIQTKYQYIMVDEYQDTNNIQYQIVNKIAQKHRNI CVVGDENQSIYGFRGANIQNILNFEKDYKDAMVVKLEQNYRSTAIILDAANAVIRHNTSS KNKNLWTDKKEGDKIKVFKALNQRDEVEKVISEIAKEKQKGRAYRDMTILYRTNAQSRVF EEAFLRYRIPYKIFGGMQFYQRAEIKDILAYLSLINNPLDETNLLRIINVPKRKIGDKSI EKIRFFAREQGLTLLDSLARAGEISGIGSGLAVTIQQFYTLIRELMDLAPYENTSIIFSS LLEKIGYKQYLETAYEDAEVRISNIEELGASILELENLLGNLSLRDYLENVSLVSATDDL QENQDYVKLMTIHNAKGLEFPVVFLVGVENETFPGNSKFSSEDDLEEERRLCYVAITRAE ERLVISFSTTKYVYGEVQASQESIFLKEIPSEYKLEDWKEERPKYQKLQTKNTISTEDLK 
KKTSNLPFSVGERVLHKKFGLGIVRDLEEKKIIVEFVTGKKEIAAIVAEKFLSKAES >gi|224461480|gb|ACDD01000022.1| GENE 17 18545 - 19771 1381 408 aa, chain - ## HITS:1 COG:no KEGG:FN0173 NR:ns ## KEGG: FN0173 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 35 399 1 430 461 143 29.0 1e-32 MRKFIMFLGIFLLVSIVGIFFGRNMILKYVLEDRLGQINQAYVKIGSVESNFFEKYISLR DVQVESHEKAGTDFIRIQQVKTYYDLDYNHKKVELFDTEVIGLEFITPKDEEDMRALREA KAVEVVSGQYPFAKVFQEESEKEEHQASFGVKSSDEYQKIKDAVQDMKNGGNGIHHNLNT IRENIEKLREKYVKPEPVEEPKSIISLDRMLGKYLTLMYEDEIYNLLLRYREIVKEMEER VRRDVERRDDIWEIQMNRVSIFFDIYGINFNGEIKNFNSRLSKNYDNISFKLFGEKDDTI GMIKGELNLLKLDLNATLDIPELNLIGVTEFRKYLSDGVASLQQDIQMDKYDVALQGVLT AKRMKLVENPLLEKIQDLEIRYQYNSRDRQLYLNTHFLKNKIDEMKNN >gi|224461480|gb|ACDD01000022.1| GENE 18 19906 - 20673 754 255 aa, chain + ## HITS:1 COG:FN0991 KEGG:ns NR:ns ## COG: FN0991 COG1183 # Protein_GI_number: 19704326 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Fusobacterium nucleatum # 1 255 1 257 261 331 71.0 9e-91 MVKKKYIAPNVITAGNMFLGYISITESIKGHFIPAIWFIILAMVCDGLDGKTARKLDAFS EFGKEFDSFCDAISFGLAPSILVYSILNQVAPGSLFIIPVSFLYALCGVMRLVKFNIITV ASSEKGDFSGMPIPNGASMVCSYYLICHTIYQNFQISFFDINVFIAIIVLAAALMVSTVP FKTPDKTFSFIPKNKTLISFLIILIIATLKYSLFIVSFTYVLLNLLTFFTKKFVGEHQDQ LDEFFEVVEEEDETK >gi|224461480|gb|ACDD01000022.1| GENE 19 20660 - 21766 809 368 aa, chain + ## HITS:1 COG:FN0992 KEGG:ns NR:ns ## COG: FN0992 COG0859 # Protein_GI_number: 19704327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 3 355 4 354 358 348 53.0 1e-95 MKQNDPIKILVIRFKRIGDAILASPVCNSLKKTFPNSSIDYVLYEPSAPLFTNHPYIDNV ICISKKEQENPFLYLKRVWKITRNKYDIIIDIMSTPKSEVFTLFSLGTPYRIGRVSKNKK RGYTYNYKQYEPQNTKNKVDKFLKQLLSPLEKDFKLSYAPELILSVSEEEKKEMKQKMER IGLSLEKPIIPFAVLSRVAGKTYPIENMKKIIQYCLDHYEAQFVFFYSSDQKAQIKEIEK DLNFPKNIFTNLETRDMRELMAFFANSTCYIGNEGGPRHLAQALGLPCFALFNPSAEKRE 
WLPWPSDTNVGIEPKDTLSFHQISQEKYNSLTKEEAFALMTVPFIIEKLDIFLSKVLGKW LGGLLSGS >gi|224461480|gb|ACDD01000022.1| GENE 20 21756 - 22442 542 228 aa, chain + ## HITS:1 COG:CAC1511 KEGG:ns NR:ns ## COG: CAC1511 COG0664 # Protein_GI_number: 15894789 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 6 227 5 226 228 117 32.0 2e-26 MEVSLEEIKKIEVFHGITKKSIEKIQKTAEIISLPQNKYLYTDKQNLDYIYFVLSGKVVI SKGNEHGESRIIFLLSSGTMINQPFMRNNTSAIECIAFENSRILRITFSDFATILSQDYK LCKNCMIFMENRIRRLYRQLKNSVSINLDKKLAAKLYRLGIEHGSSSQEEGMTKINLNIT ITCLAKMLGCQRESLSRAMKSLNSRKIVKMIGRSIYVDMEAAKNLFKN >gi|224461480|gb|ACDD01000022.1| GENE 21 22538 - 23545 1177 335 aa, chain + ## HITS:1 COG:CAC1513 KEGG:ns NR:ns ## COG: CAC1513 COG1145 # Protein_GI_number: 15894791 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 331 1 333 338 360 53.0 2e-99 MKLRLSVEEFDKGLEKLSKEFKIFAPKSFQDRGTYSDTDIVKYDVVNHFDEMVWDRKSNF SPKETILPINQVLFYFTEKEFTESTEEEKKILVFLRACDLNAVKRIDQIYLANGANKDTF YARRREKVKFVVVGCTESYRNCFCVSMGSNTVDNYDAAMNIRNNEIYLEIQNDSLAVFEG EKTDFNIDFVKENKFSIEVPENIDFMHLQSHSMWDEYDTRCIACGRCNFTCPTCTCFSMQ DIYYRENQNVGERRRVWASCQVDGYTRIAGGHSFRNKQGQRMRFKTLHKIHDFKKRFGYN MCVGCGRCDDACPQYISFSEAITKVKNAMDEKNKV >gi|224461480|gb|ACDD01000022.1| GENE 22 23573 - 24376 1067 267 aa, chain + ## HITS:1 COG:CAC1514 KEGG:ns NR:ns ## COG: CAC1514 COG0543 # Protein_GI_number: 15894792 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Clostridium acetobutylicum # 6 267 4 264 264 335 58.0 4e-92 MCKCSNPYIPFSAEIIEIVKHTEIEWTFRAKLDSSSVKPGQFYEISLPKYGESPISVSGI GENFVDFTIRNVGKVTKELFEFQVGDFFLVRGPYGNGFEIENYKGRDLVIVAGGSGLAPV RGIIEYVYAHKEEFTSFQLIVGFKSPKDILFKQDLEKWSKTLNILVTVDGAEEGYQGATG 
LVTKYIPKLQFQDIQKVSSVVVGPPMMMKFSVAEFLKLGMLEKNIWVSYERKMHCGVGKC GHCKMDATYICLDGPVFDYEFAKNLVD >gi|224461480|gb|ACDD01000022.1| GENE 23 24386 - 25363 1045 325 aa, chain + ## HITS:1 COG:CAC1515 KEGG:ns NR:ns ## COG: CAC1515 COG2221 # Protein_GI_number: 15894793 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Clostridium acetobutylicum # 4 322 2 320 320 462 66.0 1e-130 MLRDINTKKVMKNAYRITKHKYKTALRVRVPGGLIDPDSLMIISKISTEYGDGQIHITTR QGFEILGIDMENMPEVNQLIQPVIEKMGINQEIKGSGYGAAGTRNIAACIGNKVCPKAQY NTTNFAKRIEQAIFPHDLHFKVALTGCPNDCIKARMHDFGIIGTCLPEYEMDRCVNCGAC VKKCKRMSVGALREENNKIIRNEEKCIGCGECVLNCPMSAWTRSPKKYYKLMLLGRTGKK NPRLAEDWLKWVDEDSIVKIIQNTYQYVKEYIDPKAPGGKEHIGYIVDRTGFQEFRKWAL KDVNLPEETVENKNIYWSGPNYCSF >gi|224461480|gb|ACDD01000022.1| GENE 24 25391 - 26155 1010 254 aa, chain + ## HITS:1 COG:CAC1512 KEGG:ns NR:ns ## COG: CAC1512 COG2116 # Protein_GI_number: 15894790 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Clostridium acetobutylicum # 1 248 1 247 256 201 47.0 1e-51 MYDEVIGKLTEAAKKKVNLLNSSTFKYLVSSAFAGAFIGIGILLIFTIGGYMGGEPSVKV VMGLSFSVALSLVIFSGTDLFTGNNLVMTVGVLNKGVKTSDLIKVWIVSYIGNLLGAILL SFLFVNSGLVDKGPVMEFFQKMALAKASPDAISLIFRGILCNIMVCLAVFLSFKVQDETT KIILIIMCLFVFITVGFEHSIANMTVYAVGVFSRSMTEVTLGQAIYNLATVTLGNIIGGA FFIGCGVFSLRSKS >gi|224461480|gb|ACDD01000022.1| GENE 25 26227 - 26688 709 153 aa, chain - ## HITS:1 COG:no KEGG:FN1065 NR:ns ## KEGG: FN1065 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 23 153 1 131 131 152 75.0 3e-36 MVKNMTNVKESLAGLCITAFITLIGNFFATKISPIEALPGILILVAIAIIGITLAEILPI KIPAVAYVVTLSTILTIPGFPMAELLSAQTGKINFLALCTPILAYAGIYTGKNLEGLKKT GWRIFVLAIFVMLGTYLGSAIIAQVILKMLGQI >gi|224461480|gb|ACDD01000022.1| GENE 26 26690 - 27466 1207 258 aa, chain - ## HITS:1 COG:no KEGG:FN1064 NR:ns ## KEGG: FN1064 # Name: not_defined # Def: hypothetical protein # Organism: 
F.nucleatum # Pathway: not_defined # 5 254 1 250 253 356 79.0 4e-97 MRDGIRNIKLHILALILVVIAEWIGVFKFQLGKGIIALFPMLYALIFGIVAKFVKASSEK DMKDAGSLVGITLMLLMAKYGTTIGPTLPKIISASPALILQELGNIGTVLLGVPVAIALG LHREAIGGAHSIAREPNIAVIADRFGLDSEEGEGVLGVYIVGTVFGTIFIGLLASLLASY TPLHPYSLAMASGVGSASMMTASVGALSTLYPDMAETLAAFGATSNMLSGLDGVYMSIWI SLPLAEWLYKKLQKREVK >gi|224461480|gb|ACDD01000022.1| GENE 27 27469 - 28665 1651 398 aa, chain - ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 1 393 1 393 394 598 72.0 1e-171 MEVMEEVKLIHSDMIRWRRDLHQIPELNLELPKTVKYVTKELDKMGIVYTTLVNGNAVVA VIRGEKGEGKTIGLRADMDALPIPEETGLEFASKNGCMHACGHDGHTAMLLGAAKYFSTH RKEFRGNVKLLFQPGEEYPGGALPMIEEGAMENPHVDAVMGLHEGIISEEVPVGSIGYRD SCMMASMDRFLIKIIGKGCHGAYPQMGVDPILLASEVVLALQGIVSREIKATEPAIVSVC RIQGGYCQNIIPDVVELEGTVRATNESTRKFLAERIESIVKNITAAARGSYELEYDFKYP VVMNDKKFTQEFLKSARKVLKEEQIYQMEAPVLGGEDMAYFLQKAPGTFFFLSNPKRYAD GTIYPHHNPKFDIDEECFVLGAALFVQTALDFLNKEEE >gi|224461480|gb|ACDD01000022.1| GENE 28 28742 - 29284 508 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34763431|ref|ZP_00144379.1| PROBABLE SIGMA(54) MODULATION PROTEIN; SSU ribosomal protein S30P [Fusobacterium nucleatum subsp. 
vincentii ATCC 49256] # 1 176 1 179 181 200 58 1e-50 MKLSIQGKRLELTDAIKAYAERKFEKVEKFHDGILEINVTLSAVKLKTGNYHSAEVLAYL SGKTLKATSTEEDLYFAIDQAADALEIQLKKHKDKNKRANSQKRGKSWKFDPESGVVTNQ EERRMVKVLLPKKPMSMEEALLQLEVLEKQFFAFKSLETGKMSIVYKRKDGDYGYIVEEA >gi|224461480|gb|ACDD01000022.1| GENE 29 29361 - 30044 242 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 226 7 223 318 97 31 1e-19 MNKFILELKHLEKYYQETGTKLHIIRDLSFNIEKGEFVTILGRSGSGKSTLLNLIGLLDR ADAGEIILGGKLLFSMNEIEKNKLRNEFLGFVFQFHYLLPEFTALENVMLPAMLAKKLKK EEIEKRAIELLISVGLGERLQHKPNQLSGGEKQRVAIARALINQPKLLLADEPTGNLDEE TSETIFKIFKEINGKYGQTIIVVTHSRELAKISSRQIYLKKGMLEEI >gi|224461480|gb|ACDD01000022.1| GENE 30 30037 - 31206 1465 389 aa, chain - ## HITS:1 COG:FN0581 KEGG:ns NR:ns ## COG: FN0581 COG4591 # Protein_GI_number: 19703916 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Fusobacterium nucleatum # 1 389 1 389 389 409 60.0 1e-114 MIEFFIAKKHIVERKKQSFISMLGVFIGVTVLTVSIGISNGLDKNMIQSILSLTSHILVS DSTNQEIVDYEELSEKINQIKGVKGSIPMISTQAIIKYHGVFGNYTSGVKVEAYDLEKAE KALELSSMIKEGKIDIEKKNGIYVGKELADSTGMKIGDEITMVSAENTEIPLQIAGVFQS GFYDYDVNLVLLPLEMAQYMSYRGQVVDKINVRLQNPYDAPRVADEISQNLSMMTMTWGN MNRNLLSALSLEKTVMILVFSLIVIIAGFVVWVTLNTLVREKVKDIGILRSMGFSQKNIM GIFLIQGLILGVAGIILGICVSLGILWYLKNYSLAFITSIYYLTKIPIEISGKEIAVIVG ANLGIIFISSIFPAYRASKMESVEALRHE >gi|224461480|gb|ACDD01000022.1| GENE 31 31203 - 32468 1254 421 aa, chain - ## HITS:1 COG:FN0578 KEGG:ns NR:ns ## COG: FN0578 COG0514 # Protein_GI_number: 19703913 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Fusobacterium nucleatum # 2 415 184 606 614 471 55.0 1e-132 MTATATPRVQEDILDKLHIPDAYIYQGSFNRKNLYFRVERGKVPEAYIADYLKKSQGEAG IVYCSTRKSVDSMYSYLKEIRGYSVGKYHGGMEKEEREESQNDFLMDKIQVMVATNAFGM GIDKSNVRFVIHANLPGDLESYYQEAGRAGRDGGRAEAILLYQEEDISTQRFFIEKNEEI 
DEDFKREKLHKLDKMIEYAELESCYREFILSYFGEARVKNYCGFCGNCRKQTDVQDLSVE AQKVLSCIGRAKESIGQSTVTNILLGKADTKMKLKGLDRLSTFRIMEEKEIPWLEDFIHY LLSEGYISQTAGSFPVLKLNTQSWDILQNRRKVLRKEEEEVRFSMQRNPLFRKLLRLRLE ISEREKVAPYIIFSDLTLWEFAQFRPKTKYEMMKIQGVGNQKFTHYGEEFLHCILEEEEL R >gi|224461480|gb|ACDD01000022.1| GENE 32 32446 - 32985 510 179 aa, chain - ## HITS:1 COG:FN0578 KEGG:ns NR:ns ## COG: FN0578 COG0514 # Protein_GI_number: 19703913 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Fusobacterium nucleatum # 1 168 9 176 614 206 54.0 1e-53 MEKEAKRLLQEIYGYRDFRKGQKAILESVFQGREVLGILTTGGGKSICYQIPALLFEGLT LVISPLISLMKDQVDTLKMIGVKSAFLNSTLKKEEYRRLVGKIFRGEIKILYVAPERLCN ESFISLMQQIKISLLAVDEAHCISQWGHDFRKSYLEIPTFLKKIKTKSTNFSFDSYSNS >gi|224461480|gb|ACDD01000022.1| GENE 33 33007 - 33936 745 309 aa, chain - ## HITS:1 COG:FN0571 KEGG:ns NR:ns ## COG: FN0571 COG0758 # Protein_GI_number: 19703906 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 3 309 2 304 304 248 47.0 1e-65 MEYTKREFLFLSWINTKEPFLHSTFLYRLIKQSILENKNLFRISEEELFQFICEEQLYTR KEYEKVLILIEDFFREEIQEEISNIETICKKEEIEMIPYGEENYPFPLKNIRNSPYVLYL KGKLPQTEILKKSVALVGSRDCSEEGKNFAKKVAQYLKKNKIYNISGLAKGIDSIGHLET LGQTGAILGQGLAREIYPRENQILASRILNMGGFLLSELPPLTPVSMEHLIARNRLQSGL TSGIIIAESALQGGTLHTFRFAREQGKKIYVASLNQKFIQKYHKDIIVLENISDFEKKKR KNRQQKTLF >gi|224461480|gb|ACDD01000022.1| GENE 34 34014 - 35135 1109 373 aa, chain - ## HITS:1 COG:FN0536 KEGG:ns NR:ns ## COG: FN0536 COG0592 # Protein_GI_number: 19703871 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Fusobacterium nucleatum # 1 373 1 381 381 305 45.0 9e-83 MQVIVNRTEFLKKLRIVEKAISENKIKPILSCVYMETRGEMLFLCGTNLETTITTTVSCK QVIEEGKVAFQYPLIDEYMKELKEEEVQIRMAGDSLMVEGGDAVSEFSTFSSEDYPKAFE 
NFMQQEKEVLLRMNSIELASIFDKLKFSAGNTDNPAIHCVRIEGRDGEIHFVTTDTYRLT YLHKEFLLPEDFQMSLPLEAVEACSKIFRGLEADVKLYFDKKFAHFEIEDIHIMSSLIEL NFPAYQAILSNGNYDKTMGISTENLLSILRRVIIFVRNNEESKYGATFHLSDGLLKIKGN SDIAKINEEMIVDYQGAPLKVSLNTKYLFDFVQNLEKDTELSVEMLSSKTSVKVHEKGKE DYIYILMPLALKD >gi|224461480|gb|ACDD01000022.1| GENE 35 35151 - 35969 976 272 aa, chain - ## HITS:1 COG:SA1390 KEGG:ns NR:ns ## COG: SA1390 COG0568 # Protein_GI_number: 15927141 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Staphylococcus aureus N315 # 5 260 97 355 368 208 44.0 8e-54 MAERDLLSLYLKDIRQYRTLEKEEELDLVIKAQSGDEEAKNQLILCNLRLVVNVAKGYRS KGMNLIDLISEGNLGLIRAIEKFDVGKGFRFSTYAVWWIKQSISKAIIFKGREIRIPSYR YDILNKINKYVTETVKLCGIYPTVEEVAEYLKMPVNKVEEVMIEFQEPMSLSTEIGEDIY LEDTLSGAEEHFEEKVYYKMMQQRLKDILSRLESREQEILKLRFGLDGYEIHTLEDIGKN FNITRERVRQIEKNTLKKLKRKYTKELRETLL >gi|224461480|gb|ACDD01000022.1| GENE 36 35995 - 36996 1302 333 aa, chain - ## HITS:1 COG:lin2050 KEGG:ns NR:ns ## COG: lin2050 COG0240 # Protein_GI_number: 16801116 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Listeria innocua # 2 331 4 334 338 294 46.0 2e-79 MEKVVVLGAGSWGTALSMVLAQNGHQVVLWEYQEELAQKLQKERENKKLLPGVIFPENLE VISESTNLLKDVKYVIFSIPSQALRSVVQKFSSQIQGDMILVNTAKGIEISSGMRLSEVM KDEILGKYHKNLVVLSGPTHAEEVSKGIPTTIVAAGEEDKAKQIQELFNNNNFRVYLNDD LIGVEIGAAIKNCLAIAAGALDGLGCGDNTKAALITRGIAEISRYGKCFGAKESTFSGLS GIGDLIVTAMSQHSRNRYVGEKLGRGEHIDDILSSMTMVAEGVPTVKAVYEQMKKQNISM PIVEAVYRVIYENMSAKEMMNELMNRSVKKEFY >gi|224461480|gb|ACDD01000022.1| GENE 37 37008 - 37607 853 199 aa, chain - ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 199 1 194 194 175 53.0 4e-44 MKLLFFIIIAYFLGSLPSGVWIGKITKNIDIRNYGSKNSGATNAYRILGAKYGLMVLFAD ALKGFLAVALAAAGGLSPNAVSIVALVVILGHSLSFFLAFKGGKGVATSLGVFLFLEPKV 
TFLLIFIFIAVVFVSRYISLGSIIAAGLLPILTFWVEIGKEKTNWLLIFITLLLGAFVVY RHKSNIIRLLEGKENKFKL >gi|224461480|gb|ACDD01000022.1| GENE 38 37675 - 38673 1439 332 aa, chain - ## HITS:1 COG:FN0563 KEGG:ns NR:ns ## COG: FN0563 COG0482 # Protein_GI_number: 19703898 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 1 331 1 331 333 484 74.0 1e-136 MAKIKALALFSGGLDSALAIKVVQEQGIEVIALNFVSHFFGGVNEKAEYMAKQLGIQLEY IHFEKRHMEVVKDPVYGRGKNMNPCIDCHSLMFRIAGELLEKYGASFLISGEVLGQRPMS QNPQALEKVKKLSGVGDLILRPLSGKLLPPSLAETEGWIQREGLLDINGRGRSRQMELMD HYGLVDYPSPGGGCLLTDPAYSIRLKTLEEDGLLDHEYADLFSLIKISRFFRFEKGRYLF VGRDQISNEKIDEIRRNREGSFYIYSFETPGPHMIAFGELTEEEKNFSRKLFSRYSKAKG KLQIKLNVSGKIEELDPISVEEIEKEMKKYQL >gi|224461480|gb|ACDD01000022.1| GENE 39 38677 - 39243 767 188 aa, chain - ## HITS:1 COG:FN0562 KEGG:ns NR:ns ## COG: FN0562 COG1799 # Protein_GI_number: 19703897 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 98 181 16 99 111 100 64.0 2e-21 MKENEGKKKGLFSTVNGLTSGMKELFGIDSVESDYEEEDTGILDFSTADEPMAEETAKTV KSSKNGKQKTFFGKAKSSQVMKEVFSNEDDGGINNCQTVFVDPKGFADAERIADYIVKDK MITINLEFLDTKVAQRLMDFLAGAMRVKESSFVAISKKVYTIVPKSMKVHYEGKKNQKKT ILEFEREE >gi|224461480|gb|ACDD01000022.1| GENE 40 39255 - 39935 829 226 aa, chain - ## HITS:1 COG:FN0561 KEGG:ns NR:ns ## COG: FN0561 COG0325 # Protein_GI_number: 19703896 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Fusobacterium nucleatum # 4 225 3 222 223 228 59.0 1e-59 MKQIEERIKEIYEEVKQYSPYPEKVKVIAVSKYLTAQEMLPYLETGIITLGENRVQVIQE KYEELSTYPFAKSLEWHFIGNLQKNKVKYIADKVSMIHSVNKLSLAEEINKKMEALGKKM PVLIEVNVSGEESKEGYEVLEAEKDLPKLLNLKNISICGLMTMAPFTEDIEEQRRVFHKL RTLKEDWNEKYFQGSLTELSMGMSNDYKIALQEGATMIRLGRKIFY >gi|224461480|gb|ACDD01000022.1| GENE 41 39948 - 41045 1093 365 aa, chain - ## HITS:1 
COG:FN0560 KEGG:ns NR:ns ## COG: FN0560 COG0635 # Protein_GI_number: 19703895 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 13 365 8 365 365 355 51.0 7e-98 MSMLYKVKADAVYIHIPFCLHKCEYCDFTSFSGKLNWKKRYLEALYQEISLYEHSYYDTI YFGGGTPSLLEGKEIAKILELLPHDEKTEITVECNPKTLNLKKLQDYFEIGVNRLSIGIQ SMNEKYLKMLGRLHTVQEAKEVFQMAREIGFQNISVDMMFALPTQTLEEVEEDIENFLCL DADHISIYSLIWEENTPFFQKLEKGIYQRTENDVEAEMYQKIIETMKENSYEHYEISNFA KSGYSSRHNQKYWQNQNYLGIGLGASGYLEEIRYSNDRDFEHYFSNVNKNRFPREEEEIL NGEMKEQYRYLLGFRQLNTWLTPSGKYKKICETLFKKSYLIKREEEYQITQKGLFFFNDM LEYFL >gi|224461480|gb|ACDD01000022.1| GENE 42 41029 - 42723 2179 564 aa, chain - ## HITS:1 COG:FN0559 KEGG:ns NR:ns ## COG: FN0559 COG1109 # Protein_GI_number: 19703894 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Fusobacterium nucleatum # 8 564 23 580 580 723 64.0 0 MELNWKLEYEKWLHSSLLSEDEKKELKAISSDEIELENRFYTDLSFGTAGMRGIRGIGRN RMNRYNVGKASQGLANYILKMTGEEGKKRGVAIAYDCRIDSEENAETTARVLAANGIKAY VFESLRSTPELSFATRELRAQAGVMITASHNPKEYNGYKVYWEDGAQIVEPQASGIVDSV NAVDVFQDVKTITLEEAKKQGLFCSIGKSIDDRFIEEVEKNAIHREISGKENFPIVYSPL HGTGRVAVQRVLKEMGFLNVHTVAEQELPDGTFPTCPYANPEDHSVFQLSLDLADKVGAK LCIANDPDADRTGIAFLDKEGKWYIPNGNQIGILLANYIFTNKKIPKNGAVISTIVSTPM LDPIAKAYGITLYRTLTGFKYIGEKIRQFEQKELDGVFLFGFEEAIGYLSGTHVRDKDAV VTSMLVAEMAAYYDAQGSSLYEELLKLYDKFGYYLEETIAITKKGKDGLEAIANTMKKLR EIKPTVLCGQKVLEIRDFNENYNGLPKSNVLQYVLEDGSQVTVRPSGTEPKIKYYICVSD KVEITAKEKLNQFKKSFQDYVNAL >gi|224461480|gb|ACDD01000022.1| GENE 43 42891 - 43667 1139 258 aa, chain - ## HITS:1 COG:no KEGG:FN0558 NR:ns ## KEGG: FN0558 # Name: not_defined # Def: TraT complement resistance protein precursor # Organism: F.nucleatum # Pathway: not_defined # 43 258 1 216 216 223 57.0 7e-57 MQNYSKCVLELYFKEYGGNMKNKKLIFVLLTTLLLVFSGCGALNTAIKKRNLDVQTKMSE TIWLNPVSANQKTVFVQIKNTSGKTVNIEDKIKDTLSQKGYYVVQDPNQASYWLQANVLK LDKVDLRESDPFGSGVLGAGVGATLGAYNTGSMNTAIGLGLAGALIAGTVDALVSDMAYT 
MVTDIMISEKTNSKVSVSTNNNLTQGTRGRTKVTTSSESNRNQYQTRVVSVANQVNLKFE EAQPTLEAQLQQVIGGIF >gi|224461480|gb|ACDD01000022.1| GENE 44 43706 - 45325 1549 539 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 539 3 547 547 601 57 1e-171 MAKLLKFNAEARNKLEEGMNTLADAVKITLGPRGRNVVLEKSYGAPLITNDGVSIAKEIE LEDPFENMGAQLLKEVAIKSNDVAGDGTTTATILAQSIVKEGLKMLSAGANPMFLKRGIE AASKEAVECLKKRAKKIASNSEIAQVASISAGDEEIGKLIAEAMQKVGETGVITVEEAKS LETTLEVVEGMQFDKGYVSPYMVTDAERMTAELENPFILVTDKKISSMKEILPILEKTVQ TSRPVLMIVDDLEGEALTTLVINKLRGTLNVVAVKAPAFGDRRKAMLQDISILTGATLIS EETGKRLEEMEIEDLGRAKTVKVTKDSTVIVDGAGSQDEIQIRVQQVKTQIEESNSEYDT EKLKERLAKLSGGVAVIRVGAATEVEMKERKLRIEDALNATRAAVEEGIVSGGGSILLQL VSDMKEYQMQGEEGMGVEIVKKAFEAPMKQIAENSGVNGGVVIEKIKNSPDGYGFDAKTE TYVDMMSAGILDPAKVTRSAIQNAASIASLILTTEVLVVNKKEETMPGNPSNPMMNGMM >gi|224461480|gb|ACDD01000022.1| GENE 45 45336 - 45599 531 87 aa, chain - ## HITS:1 COG:FN0676 KEGG:ns NR:ns ## COG: FN0676 COG0234 # Protein_GI_number: 19704011 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Fusobacterium nucleatum # 1 86 1 89 90 63 44.0 6e-11 MKIKPLGKRILVQVKEKEEMTKSGIILSSVKDKETSNRGKIVAVSLEVEEVKIGMEVVFE KYAGTEIEDGEEKYLVLDMEQVLAVIE >gi|224461480|gb|ACDD01000022.1| GENE 46 45759 - 46034 488 91 aa, chain + ## HITS:1 COG:BH0558 KEGG:ns NR:ns ## COG: BH0558 COG1937 # Protein_GI_number: 15613121 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 6 82 16 99 100 69 47.0 1e-12 MKQCIDSHKVHARLKKIQGQVNGISNMIDQDIPCEDILIQINAVKSAIHKVGQIILEGHL DHCVRDAINEGKADEAIERFSKAVSYFANLK >gi|224461480|gb|ACDD01000022.1| GENE 47 46123 - 46671 748 182 aa, chain + ## HITS:1 COG:FN1078 KEGG:ns NR:ns ## COG: FN1078 COG2849 # Protein_GI_number: 19704413 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 182 1 181 
181 206 67.0 2e-53 MNNQYNQDGKKEGLWVKLYDNGVVQEERNYVNGVREGVYKSYYANGQLEIIKNYKNGNLH GSYETFYNDGKISSRHALIDGRIIGKYEEFYPNGTLKSCSEYVGDSTTPVKTIKYFPNGE KKMEANLKKGFLFGAYKEYHSNGVVYKIATYGEKGRLEGAYQEFNAEGVLIKECTYKNGQ EI >gi|224461480|gb|ACDD01000022.1| GENE 48 46668 - 46745 70 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGCNKNLKKSADYLSRDSTFVFVTV >gi|224461480|gb|ACDD01000022.1| GENE 49 46724 - 47515 837 263 aa, chain - ## HITS:1 COG:FN0197 KEGG:ns NR:ns ## COG: FN0197 COG0500 # Protein_GI_number: 19703542 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 260 1 263 266 321 66.0 9e-88 MDIKEFEKMLDSKRHQGDMEKIWDHKSAWFFQKTEKSKENFKNRLVFRLVKNRKLLKGDS KLLDIGCGTGRHLLEFSNYTSYITGIDISSKMLEYAKEKLDKVPNVKLRHGNWMELFYKE KEYDLVFASMTPAISLIEHIERMCFISKKYCMMERFVFHRDSIREEIQEMLGRKLNRLHQ NEKEYSYAVWNIVWNLGYFPEIMYETEEYEEEKTIEEYLEQIECTKEEEKKIREFLRTKG KNGSIMSSHKLKKAVILWDVTKI >gi|224461480|gb|ACDD01000022.1| GENE 50 47519 - 47650 89 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257466108|ref|ZP_05630419.1| ## NR: gi|257466108|ref|ZP_05630419.1| FUR family transcriptional regulator [Fusobacterium gonidiaformans ATCC 25563] # 1 43 74 116 116 72 97.0 7e-12 KKIYSITKEGWGALEEWKEDIEIRKNNFEVFLQKFSELSKEGK Prediction of potential genes in microbial genomes Time: Fri May 20 01:53:54 2011 Seq name: gi|224461479|gb|ACDD01000023.1| Fusobacterium sp. 3_1_5R cont1.23, whole genome shotgun sequence Length of sequence - 25430 bp Number of predicted genes - 23, with homology - 21 Number of transcription units - 9, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 209 95 ## COG1695 Predicted transcriptional regulators 2 1 Op 2 . - CDS 199 - 582 208 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 3 1 Op 3 . 
- CDS 643 - 720 97 ## - Prom 742 - 801 11.1 4 2 Op 1 13/0.000 - CDS 852 - 2528 1591 ## COG1123 ATPase components of various ABC-type transport systems, contain duplicated ATPase 5 2 Op 2 49/0.000 - CDS 2541 - 3341 599 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 6 2 Op 3 38/0.000 - CDS 3345 - 4313 477 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 7 2 Op 4 . - CDS 4317 - 5861 1325 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 5901 - 5960 6.0 8 3 Tu 1 . - CDS 5981 - 6319 246 ## COG2826 Transposase and inactivated derivatives, IS30 family 9 4 Tu 1 . - CDS 6710 - 6916 169 ## gi|257452102|ref|ZP_05617401.1| hypothetical protein F3_03490 - Prom 6991 - 7050 9.8 - Term 7004 - 7054 12.5 10 5 Tu 1 . - CDS 7072 - 8019 1632 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Prom 8203 - 8262 10.2 + Prom 8154 - 8213 6.6 11 6 Op 1 . + CDS 8238 - 9449 1592 ## COG0426 Uncharacterized flavoproteins 12 6 Op 2 . + CDS 9449 - 9913 511 ## COG4807 Uncharacterized protein conserved in bacteria 13 6 Op 3 . + CDS 9941 - 11899 2641 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 11908 - 11939 4.1 - Term 11894 - 11927 4.5 14 7 Op 1 . - CDS 11933 - 13306 1639 ## COG0006 Xaa-Pro aminopeptidase 15 7 Op 2 1/0.000 - CDS 13334 - 15403 2721 ## COG0480 Translation elongation factors (GTPases) - Prom 15525 - 15584 5.6 - Term 15516 - 15559 10.5 16 8 Op 1 . - CDS 15586 - 17055 2323 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 17 8 Op 2 . - CDS 17019 - 17093 91 ## - Prom 17154 - 17213 9.9 + Prom 17126 - 17185 12.6 18 9 Op 1 . 
+ CDS 17285 - 19363 2275 ## COG1199 Rad3-related DNA helicases 19 9 Op 2 7/0.000 + CDS 19374 - 21473 743 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 20 9 Op 3 11/0.000 + CDS 21485 - 21979 795 ## COG0319 Predicted metal-dependent hydrolase 21 9 Op 4 . + CDS 21982 - 22716 876 ## COG0818 Diacylglycerol kinase 22 9 Op 5 . + CDS 22723 - 23880 1077 ## gi|257452114|ref|ZP_05617413.1| hypothetical protein F3_03550 23 9 Op 6 . + CDS 23901 - 25409 1724 ## COG0747 ABC-type dipeptide transport system, periplasmic component Predicted protein(s) >gi|224461479|gb|ACDD01000023.1| GENE 1 2 - 209 95 69 aa, chain - ## HITS:1 COG:FN0196 KEGG:ns NR:ns ## COG: FN0196 COG1695 # Protein_GI_number: 19703541 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 69 1 69 116 110 69.0 9e-25 MNINERSKFKHLTAFVLVILAERKYSPREIHGLLLREFPGFVRDMSTIYRCLSTMEKEGL LSIEWHLPE >gi|224461479|gb|ACDD01000023.1| GENE 2 199 - 582 208 127 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 28 117 419 508 563 84 44 5e-16 MFFRENMIEYKKIGLRNEEEFLKKKMNELSGGQRQKVAIARALSMKPKLLLVDEISSMLD DSSKVNIMRLLKKLQYDLGFSMLFITHDILLAKKIADYVYCMENGKIVKKGSVRKVFCST EERKYEY >gi|224461479|gb|ACDD01000023.1| GENE 3 643 - 720 97 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSMAKNISKVFQRIEIVKCFSKNMS >gi|224461479|gb|ACDD01000023.1| GENE 4 852 - 2528 1591 558 aa, chain - ## HITS:1 COG:FN0195 KEGG:ns NR:ns ## COG: FN0195 COG1123 # Protein_GI_number: 19703540 # Func_class: R General function prediction only # Function: ATPase components of various ABC-type transport systems, contain duplicated ATPase # Organism: Fusobacterium nucleatum # 1 539 35 580 589 429 45.0 1e-120 MKTILKLEDFYFTYKGNSKYTLNGINLDIKEGEALGIIGESGSGKTTLLLSILGLLFSKG NSLGKIYFDEQLLDKEEKYKILRWKDISMVFQNQLDVFNPKITVGEHIYELLDNLKRKEK YNRVKELFTMVRLDKKFIESYPNELSGGMRQKVLIATALSCNPKLILIDEPTTSLEEISK VEIIKILKNLIKNNITLIVTSHDLEIIKELTEKIIVMDSGNIIETGITKKFLNLQKHPYS RALVQASPFINIFKDLWGINEVEDFKDMEGCPFYSKCPQRVPVCLNENPKLSKIDEESQV ACHLGGIINLLEVNKLSKTYISKKFKIDALSEVSLKIRMGEIVSIIGESGSGKSTLAEII SGIKEKTSGEVKFLNEEIGANILGSLNSIQIIFQDSSTAMNLELSIENILKEPFLLLKDK NSFPTKKMKEYLNNLGFPTSKEFLEKKAKNLSGGEMQRLSIVRALLLEPKLLIADEITSM LDPSRKANLLRVLKGLQNKYGFSMLFITHDLILAQKITDYFYVLKDGKIIEEGDGLKIFN RASHPYTKKLVSNIIYNI >gi|224461479|gb|ACDD01000023.1| GENE 5 2541 - 3341 599 266 aa, chain - ## HITS:1 COG:FN0194 KEGG:ns NR:ns ## COG: FN0194 COG1173 # Protein_GI_number: 19703539 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 13 265 12 265 274 259 52.0 5e-69 MKLKKLFCTIEIYVLLIFIIFAFFSKYISFQNINLETNDSLVAPNLEHILGTDDLGFDIF SQLVYGGKISLEISFFTAIFSAVGGSILGAFAGYFGGWRDKMILSIIDIFLSIPELPLMI VLGAFLGTNLKNIIFVLVLVTWTHPAKIARNEIIKLKNEKYILLSKAYGGSFFHIFRWHL LKPMWSIIITAIVKIMNKAILAEASLAYLGLGDPLSKSWGMIISRAMSFPNIYFTEYYKW WLLPPLVLLIILVVTLASLAQKLEKL >gi|224461479|gb|ACDD01000023.1| GENE 6 3345 - 4313 477 322 aa, chain - 
## HITS:1 COG:FN0193 KEGG:ns NR:ns ## COG: FN0193 COG0601 # Protein_GI_number: 19703538 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 10 313 4 306 318 266 50.0 4e-71 MKISKKLVIRYFILFFIVISINFILPRLMKADPFLFLSSEGADDISGLSEKQIEQYYIYY GLDRPLWQQYLFYLKSIFTGNLGFSISKTLPVTTIIFSHITWTISIVLSSLTITIFLGIF LGIISAYNRENLFGKYTYLFFVTLSQIPPFLIGFGILVIGAFYIPSLPIAGGITPFLKFE WKYEVLLDILKHAILPTLTLIVVRFPHFFMLIRGKMIVEMSKRYAFIEKAKGFNDMYILC KHCLKNAITPLITEALLSIALILQGSLIVENVFKYPGIGRLLKEAVFARDYPLLQGIFLF MVCITLGISLISEIIKENEKIG >gi|224461479|gb|ACDD01000023.1| GENE 7 4317 - 5861 1325 514 aa, chain - ## HITS:1 COG:FN0192 KEGG:ns NR:ns ## COG: FN0192 COG0747 # Protein_GI_number: 19703537 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 4 514 8 515 515 466 46.0 1e-131 MKKIVKSFILVFLLGIISIISYAKTETKQKVVDVIRMQGEDYGAPNPFKNSIRGPGKYKT DIIYDSLIEKDEKGFIPWLAKKWTIDNKDDSITFDLHTNVKWHDGKPLTAEDIKFTIEYY DKFPPVVDQTHDNGESIIRKIEILPNNKIKFTFKKYSPLNLERIGTVKIIPKHIWEKIDN PLAYTGEGYLVGSGPYKVIEYSSDKGSYAFEAFDDFWGIKPAAKRLEWIPISDPVLALER GEVSIISVSPNVIDRYKNNKKYGLVIENSFHTFRLVWNQKKVKEIQNKNVRKAIAYAINR ESLIDKLEKGYGHLSSPGYIVPSNPMYNANITKYPYSVKKAKELMKNKTIDATILVSNNP KEIKMAELIKIDLAKIGINLTVKSVDAKSRDNDVKNGNYEFALLKYGSMGGDADYVRNVY LSTAKSGIQRIQGYKNKELDDVLMAQLLEKDTKKRKELLYKAQEIIADELPMLPLYSEDF IYVYRKGDYSNYRKRFDNPVPLFTKLSFLIKEKK >gi|224461479|gb|ACDD01000023.1| GENE 8 5981 - 6319 246 112 aa, chain - ## HITS:1 COG:BH2524 KEGG:ns NR:ns ## COG: BH2524 COG2826 # Protein_GI_number: 15615087 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS30 family # Organism: Bacillus halodurans # 1 111 202 313 314 114 52.0 4e-26 MLHAIEQVISSFPKKSFQSFTSNRGKEFACFQEVEQLGISFYFADSYCVWQRGSNENSNG 
LLREFFPKKTNLAKVQTEELLQVLLAMNHRPRKCLGFKTPFEALFHEIYKIT >gi|224461479|gb|ACDD01000023.1| GENE 9 6710 - 6916 169 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452102|ref|ZP_05617401.1| ## NR: gi|257452102|ref|ZP_05617401.1| hypothetical protein F3_03490 [Fusobacterium sp. 3_1_5R] # 1 68 1 68 68 100 100.0 3e-20 MSYTHLTIIQRNMIEILRKEKYSTRKIATLLGVHHSTIARELNRLVGLYSTILAQEHATK RNLKKGRS >gi|224461479|gb|ACDD01000023.1| GENE 10 7072 - 8019 1632 315 aa, chain - ## HITS:1 COG:FN0174 KEGG:ns NR:ns ## COG: FN0174 COG2070 # Protein_GI_number: 19703519 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Fusobacterium nucleatum # 3 312 2 310 318 370 67.0 1e-102 MKKTDRLCELFGIEYPIFQGAMAWIANGNLAGSVSRDGGLGIIAGGGMPGDVLRAEIKKA KAIAGAKPIGVNLMLMADNIEEQVNICVEEKVEVVTTGAGNPGIYMETLKGAGIKVCPVV ASVALAKRMEKIGADAIIAEGMEGGGHIGTITTMSLLPQIVDAVNIPVICAGGVASGRQM LAALAMGASGVQCGTIFIVAKECQVHDNYKKAILKAKDRSTVSTGNYTGHPVRVLENKFA KEILEMEKNGAPKEEIEAMGTGKLRLAVVDGDIVAGSVMAGQVAAMVQEEKTTKEILVTL MEELTVAKENLKNEF >gi|224461479|gb|ACDD01000023.1| GENE 11 8238 - 9449 1592 403 aa, chain + ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 403 1 403 403 631 72.0 0 MYCCTKVTEDVIWIGINDRKTERFENYLPLDNGVTYNSYLIYDEKTCAIDAVEVGQSGPF YAKLENSLGGRKLDYLVVNHVEPDHSGAIKELFRVYPELKVIGNMKTLDMLKAFDEDFPV DAFITIKEKEIFDLGKHKLTFYTMPMVHWPESMCTYDMTDKILFSNDAFGSFGALDGGIF DDEVNHEFYEDEMRRYYSNIVGKYGSSVNAIMKKLSGVDIQYICPSHGILWRSDIGKILG LYQKWANLEPEKEGVVIIYGSMYGHTAEMAEILGRELGNRGIHDVIIYDASRTDHSYIIS KIWKYKGLMIGSCAHNNAVYPKVEPLLHKLENYGLKNRYLGIFGTMMWSGGGVKAICEFA SKLKGLEVVGDPIEIKGKATSLDIDQLQWLASQMADKLIGERK >gi|224461479|gb|ACDD01000023.1| GENE 12 9449 - 9913 511 154 aa, chain + ## HITS:1 COG:FN0742 KEGG:ns NR:ns ## COG: FN0742 COG4807 # Protein_GI_number: 19704077 # Func_class: S Function unknown # Function: Uncharacterized 
protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 153 1 155 156 168 61.0 3e-42 MTNNDILKRLRYALSISDKLAVKIFHLGKSQVTEEEFCSLLLRPDEDDFKKCSNSLLFSF LDGLIILKRGNLEKPVEEIKITKNNLNNLILRKLKIALNLKSDEMLSFFKLGGAELSSSE LSALFRKEGHKNYRECGDKYIRVFLKGLTEYYRK >gi|224461479|gb|ACDD01000023.1| GENE 13 9941 - 11899 2641 652 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 279 645 5 366 372 313 46.0 6e-85 MTAKKNILLQSALGSVFSISEQLSIETQTPETGTKVIVAGRVEKPGVVDIEEGMTLQDVI NAVGGIKNKKQFKAAQFGIPFGGFLTSKHLDKPIDFSLFPENERNMIILSDEDCIISFSK FYIEFLQDMVSENDEKYAAYHQVTHEVERIGRILDRISKGKSNMRDVYLLRYLSDIIKTK LNQKHNMVLEIIDTFYDEIEEHVEDLKCPAGQCIHLLKFKITEKCIGCTACARVCPVQCI TGAPKKRHFLDTSRCTHCGQCVSACPVGAIFEGDHTLKLLKDLATPKRLVVAQIAPAVRV AIGEAFGFEAGENVEKKLVAALKAIGFDYVFDTAWAADITIMEEASEFQERLERFYAEDP TVRLPILTSCCPAWVKFIEQNYPDMLDVPSTVKSPMQIFSTIAKDIWAKELGYEREKVTV VGIMPCLAKKYEAARFEFSRGDNYDTDYVISTRELIRIFKETGIDLKELEDEDFDNPLGQ YSGAGIIFGRTGGVIEAATRSTIEMITGEKIPQIEFQELRGWEGFRIAELKIGHIELRIG IAHGLEEAAKMLDKIRSGEEFFHAIEIMACKGGCIGGGGQPKALKKQVILEKRAEGLNNI DRSLEIRTSHDNPMVKAIYEKYLDYPLSHKAHELLHTKYFNRTRRNHRQPSK >gi|224461479|gb|ACDD01000023.1| GENE 14 11933 - 13306 1639 457 aa, chain - ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 456 1 460 462 513 57.0 1e-145 MLTKEILQKRRNELKTLQEADLILLPSNMDSPMNCKDNCYPFIQDATFQYYFGMNHPNLI GVIDVKEKKEYVFGKDFSMSDIIWMGKIKFLKEEAEDLGLEFRDLEELPKWIENRKVAIT NYYRADTVFYMAKLLGKDPYRLGEHISEELISKIIEQRNQKSNEELEELEKAVNVTREMH LEAMRVTRVGMKEYEVVAALEAVAAKHQCSLSFPTIFSKNGQILHNHRHDNVLQEGDLVI LDAGAKLPSGYCGDMTTTFPVSKKYSDRQRKIYNIVIHMFERAEELCRPGITYREVHLEV CKVMVEELKELDFFRGEVENIVIAGAHALFMPHGLGHMLGLDVHDMENFGEEKVGYAEFP 
KSSQFGLASLRLGRELEEGFVYTIEPGIYFIPELFDLWKKERKFEEFLNYDVIESYMDFG GIRYEGDFVITKDSCRRLGEKMLKYPEEIEEYRKKFS >gi|224461479|gb|ACDD01000023.1| GENE 15 13334 - 15403 2721 689 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 684 3 686 690 1103 79.0 0 MKVYETSMIRNVSLLGHRGSGKTTLVEAILHSKDVIKKMGSIEQGTTVSDFDKEEERRLF SINTSLIPVEHEDYKINFLDTPGYFDFVGEAISALRVSASAVLVLDATSGVEVGAQKAWR MLEDRKLPRIIFVNKMDKGYVNYGKLLQELKEKFGKKIAPFCIPIGEKEEFKGFVNVVDL VGRIYNGKECVDAPIPEDIDVTEVRSLLMEAIAENDEALMEKYFAGEEFTQEEIEQGLHK GVVSGDIVPVLVGSAMEEVGVHTLLHMIQLYMPTPVELFDGQRIGKDPITQEKKIVDIKT ENPFSAIVFKTMVDPFIGKITLFKVNSGVLRKDMEVLNPNKNKKERIAQVMSLMGNKQIE VDELRAGDIGATTKLQYTQTGDTLCQKEYPVMFQEIAFPKPILFSGVKPADKNDDEKLST CLQRMMEEDPTFKITRNYETKQLLIGGQGEKHLYIILCKIKNKFGVHAELEDVVVAYRET ILGKAEVQGKHKKQSGGAGQYGDVHIRFEHSETDFEFVDEIKGGVVPKQYLPAVEKGLLE AREKGILAGYPTINFKATIFDGSYHAVDSNELSFKLAAILAFKKGMEMAKPKLLEPVVRM EIHIPDEYMGDVMGDLNKRRGRVLGMEPNQYGDQVLLVEVPEVEILKYCVDLRAMTQGRG EFQYEFVRYEEVPEILSKKIIEARAEATK >gi|224461479|gb|ACDD01000023.1| GENE 16 15586 - 17055 2323 489 aa, chain - ## HITS:1 COG:FN1547_1 KEGG:ns NR:ns ## COG: FN1547_1 COG1263 # Protein_GI_number: 19704879 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Fusobacterium nucleatum # 1 394 1 394 411 583 78.0 1e-166 MFSYLQKIGKALMVPVAVLPAAAILMGIGYWIDPTGWGANSQLAAFLIKAGAAIIDNMAI LFAVGVAFGMSKDKNGAAALTGLVAFQVVTTLLSSGSVAQLLSIAPEEVNPAFGRINNQF IGILCGVISAELYNRFSGIELPKFLAFFSGKRFVPIITSGVMIVVSFILMYVWPLIFSAL SSFGVKIASMGAVGAGIYGFFNRLLIPVGLHHALNSVFWFNLAGINDIGRFWGAPDAAYA DLPAAIQGAYHVGMYQAGFFPIMMFGLLGACAAFVKTAKVENRAKVMSIMTAAGFASFFT GVTEPIEFAFMFVAPVLYLLHAVLTGVAVFLAASFNWMAGFGFSAGLVDLVLSSRNPNAH NWYMLIVLGIIFFVVYYLVFYVAITKFDLKTPGREVEEEEIQAQQKEKISNNLLANQLIP 
LLGGSENIEEIDYCTTRLRLRVKESANINDKEIKKLVPGLLKPSKNTVQVIIGPEVEFIA DEMKRVLNK >gi|224461479|gb|ACDD01000023.1| GENE 17 17019 - 17093 91 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCLVSRKTKEEFLCLVIYKKLEKH >gi|224461479|gb|ACDD01000023.1| GENE 18 17285 - 19363 2275 692 aa, chain + ## HITS:1 COG:FN0743 KEGG:ns NR:ns ## COG: FN0743 COG1199 # Protein_GI_number: 19704078 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Fusobacterium nucleatum # 6 692 53 741 741 710 53.0 0 MDIAELSHHIPNFEYRKEQVDMMNAIRESLEADRKIIIEAGTGTGKTLAYLIPTLEWAIE NKKKVICTTNTINLQEQLLFKDLPIAKNIINKHFSYLLVKGRNNYICKRLFHNFILGNSL DISTLSSEQKKQFDYLKSWGKMTEFGDKAELPFEVDSDIWEMIQSSSEFCQGKRCPFREE CFYMKNRALKASADLIVCNHHIFFADLNVRNSVDFDAEYLILPKYDVVVFDEAHNIESVA RSYFSLEVSRYSFVRMLNQIQNQEKTKKRKVLPALETLLQSLPTEKKQEKEFRKALQNIE QEHLKCLEIGLDFFETLANHFLKNHQGKISKSLQKDEMLFSPFLSPLREKKDHFILAMKS YSIALDYFYSQVKDGEEQNQYLMDFQNFAFKLKSFLATFQEIHQFDNDDFVYWIEANAKY KNAALVAAPLNIDQILKESLFVHLERLIFTSATLAVNGDFSYFKIAVGLEEDTMEKMIPS PFFYDDQMTVYIPSDLVDPDKSFDFVEEVSEFLKQLFLKTGGRAFVLFTSYSSLNQIYYS MLEDLQEAGITVLLHGEKPRSQLISDFKRVENPVLFGTSSFWEGVDIQGEQLRNVVIIKL PFLVPSDPVVSAISAKFEKQKRNPFMEYQLPEAVIKFKQGIGRLVRSKEDYGNIFILDNR ILKKRYGKVFLDSIPSKNIQILEKNQILKIVK >gi|224461479|gb|ACDD01000023.1| GENE 19 19374 - 21473 743 699 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 66 698 52 734 750 290 29 5e-78 MKKYNLFGFHISLDVKRNHNKEKEAELYSHEYLLREKIIYYILLMSVACFFYYVPLVMRD KYYKVGDITISDIFAPKTIVFRDDDTREKIIQEIVDTSQREYIFSSDTQQIYVELLQKFM QEAIEVKKGSIKSIDYNYYEKQTGKKFPETVDKDLMSYRVKDLTYLQSLWTNDLTAIYNA GVYRENDYSQDSTILRYGAPFDKEFEELTKLEKDVLSVFISPNYIFDSKGTTVELEEKLK QIPDQYMQIKAGTLIASKGEMLDKRKIHILESLGIYSLKRGFIILFSTILYLIFVSSIFY TIALHLFQNEILNKNKFRGVFLILFAIFGLFWIVPLDMIYFIPLDSALFLLVFLTGKRYS SFIYASVLAFLLPLTDYNLTLFAMHLTCLSFSIFLIQKVNTRNGLIATGIQLSIFKLFIF FILSFFAKEESFNIMFQSMQIMISGFFSGMVAIALLPFFERTFNILTVFQLSELGDLSHP 
LLRKLAMDAPGTFQHSMMVATLSENAALAIKANSVFTRVACYYHDIGKCKRPNYYVENQK NGENPHNDISPFMSTLIITSHTKDGDDMAKKYQIPKEIRDIMYEHQGTTFLAYFYNKAKA IDPNVLKEEFRYSGPKPRSKESAIIMLADSIEAAVRSLSVKTPREVETMIRKIINGKVED DQLSEADLTFKEIEAIVQSFLKTFSSIYHERLKYPGQKN >gi|224461479|gb|ACDD01000023.1| GENE 20 21485 - 21979 795 164 aa, chain + ## HITS:1 COG:FN0746 KEGG:ns NR:ns ## COG: FN0746 COG0319 # Protein_GI_number: 19704081 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Fusobacterium nucleatum # 1 164 1 162 162 191 70.0 6e-49 MELLVEVSVEYEKNDYAEFIQEITENNNEVLTEYIEEVLTMEKIESTLPLYVSLLLTGNE QIQGINREFRKKDSPTDVISFAYHENEDFLVGPYDTLGDIVISLDRVGEQAKEYNHSFTR EFYYVLTHGILHILGYDHIEEEDKKEMREREEEILGHFGYTREK >gi|224461479|gb|ACDD01000023.1| GENE 21 21982 - 22716 876 244 aa, chain + ## HITS:1 COG:L95012 KEGG:ns NR:ns ## COG: L95012 COG0818 # Protein_GI_number: 15673069 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Diacylglycerol kinase # Organism: Lactococcus lactis # 2 120 21 142 151 90 42.0 2e-18 MKPYQESEFKNKKMTDGFNSAIEGVFETIRTEKHMKFHAFATILIIVIGLFINLSRYELL SLIISISFVWLAELFNTAIECCVDLTCQEYNLLAKKAKDVAAGAVLLSAFNALIIGYLVF SKHIGVQLQQSFRVLRSSYQHKTVLIFIVVLSIVLLIKLITQKGTPLKGGMPSGHSALAS SIFTIISFLTDNPKVFYLSFLLLILVIQSRVEGKIHTLLETLVGAFLGSSITYLILYLLK YKAW >gi|224461479|gb|ACDD01000023.1| GENE 22 22723 - 23880 1077 385 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452114|ref|ZP_05617413.1| ## NR: gi|257452114|ref|ZP_05617413.1| hypothetical protein F3_03550 [Fusobacterium sp. 
3_1_5R] # 1 385 1 385 385 712 100.0 0 MITEKEYFLLALLAYCDFSKKHIGKNLWKIWEEEKEKKTFRTSFTLLQSKFYPQFTTFFE EELKKWFIIRIDNRKAKKISSSQSGFFSVCFGNSKQEYVISYRGSEVYPLEDAYQDFINT DLTIGMGKIPIQFHEGIEVVEKLVQDLGLKYPQISLTGHSLGGGIAQYVAFSLHNLHQYI PITYTWNAVGITHIEELSIQKIKRNIDYQKKIVNYGHSEDFTNSLFSHIGKQYFVDKKLS SKRINHRNFLEKIPFLKKSLSSCHCENVFLPFFGEGKSLQKKVCLAYLAAACRKLIMQEK LFSKDFLADYYLQTDLSKITLEKYRRELIESLKKYTKALYCKQIIEQLEDFSPQDMQVFW KEFLRRIASPYRYLDVFDILVYEYI >gi|224461479|gb|ACDD01000023.1| GENE 23 23901 - 25409 1724 502 aa, chain + ## HITS:1 COG:FN0998 KEGG:ns NR:ns ## COG: FN0998 COG0747 # Protein_GI_number: 19704333 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 499 1 498 500 318 36.0 2e-86 MKKLLAFIFCLSCFFACGKKEQAPVSVENHQQTVTVALAYKPKSFDPSKHTDGVTMSVTK QIYSNLFSLGEKGEILPELAESYKIVSENTLDITLKKGILFHDGSEMKADDVIASLQRNL DSPVSHVLINPIQSMKKLNEYELEITSNTSPNLLLHNFTHGSIAITKEVPMNEDQVNLVG TGPFKIKLWGNGEKVELEAFDDFYIQKPNFQNLVFVTIPETSNRVIGLETGEIDIAYDII PSDLSLLTEEKGLTYMSGLSFGSDFLSINTERMNDVDIRRALALAIDKKGINEAVFEGKL DLASSILPPNVFGYSNSGIEISQNVEEAKKIMKEKGYDETHPLSLKMYIYEEPTRRQISE IIQANLKEIYMDVEVVSLEVSSFLQFTAQGQHDFLLGLWYTSSGDADFGYYPLLHSSSKG VPGNRAFYDNKEVDQLLDDARTTSSEKQRLKDYAEVQKIIDQELPIFPLFYKTYFIGMRN HISNLVFDPRGSHILYNLQFSK Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:21 2011 Seq name: gi|224461478|gb|ACDD01000024.1| Fusobacterium sp. 3_1_5R cont1.24, whole genome shotgun sequence Length of sequence - 4020 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 21 - 1187 1495 ## COG0500 SAM-dependent methyltransferases 2 1 Op 2 36/0.000 - CDS 1206 - 2360 1379 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 3 1 Op 3 24/0.000 - CDS 2429 - 3103 319 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 1 Op 4 . - CDS 3078 - 4019 1142 ## COG0845 Membrane-fusion protein Predicted protein(s) >gi|224461478|gb|ACDD01000024.1| GENE 1 21 - 1187 1495 388 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 388 1 394 412 357 52.0 2e-98 MTAKEAMELLEELIQEKKLIKIVLSDKEVDAEWDKVLIRPVKIKEQDFMQFEKFKNNKSY HFNMEAACLYEEISISVKQFKQAYIHSEGKDYHLTRKGDKYFSKESGNTCCQKILEHNKT KKYLLAEGKPIDFLVYLGVMSKEGKVYKHSYAKYRQINKYLEFIENTIEELQEKKWIQDH IRILDFGCGKSYLTFALYYYLREIKKISFTIIGLDLKEDVMKHCNKIAKELGYENLEFLT GNIKDFEKLQEVDLVFSLHACDNATDYSILKALEMKAKAILAVPCCQHEFFQKINKNKKS PLFHSMNVLGKHGILLERFSSLATDAYRSSFLELKGYRTQVMEFIDMEHTPKNILMKAIY EGKVKNEQKKYEEYQEFLNFLGIDPLLK >gi|224461478|gb|ACDD01000024.1| GENE 2 1206 - 2360 1379 384 aa, chain - ## HITS:1 COG:FN0828 KEGG:ns NR:ns ## COG: FN0828 COG0577 # Protein_GI_number: 19704163 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 383 25 408 408 369 52.0 1e-102 MLGIIIGISSVMSMWSIGRGGQEGITGNLKKNGYGKFTVTVDSSKDDFRYKYLFSLSQMK DLKEEGHFKNVAAQIEEYFGIKIGEEKEGILINMSTPDYEVLDPVEMMAGRNFLSFEYSP KEYVVLLDNLTAKSLFGTEKNAIGKEIEISKRRRGMNLSYRVVGVFRNPLESFIRVMKTG FFPRFARIPYQNYNYVFDKGSGVFTDILIEAKNPENLGQEMEEAKNYLEQKNQIQNIYTT RTVASDTESFDQILSTLNIFITFASAISLFVGGIGVMNIMLVTVVERTKEIGIRKSLGAT NRDILIQFLVEAVILTVMGGLIGLILGFFISFSAGKLLGIQPIYSLTSILLSLGVSISIG IIFGVSPARKAANLNPIDALRAES >gi|224461478|gb|ACDD01000024.1| GENE 3 2429 - 3103 319 224 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 218 245 127 32 1e-29 MIQIKKVNKYYINGENKLHALQDVDFHIQKGEFVSIMGSSGSGKSTMMNILGCLDREFEG EYVLDGISIREIAEKNLCKVRNQKIGFVFQSFHLLPKLSALENVELPLVYAGISKKEREE RAKKMLEIVGLGTRLHHRPNELSGGQRQRVAIARALVNDPAIILADEPTGNLDSQSEKEI MNFFQELHQKGKTIVVVTHEPEVAKYTKRILHFKDGKLLGEDVL >gi|224461478|gb|ACDD01000024.1| GENE 4 3078 - 4019 1142 313 aa, chain - ## HITS:1 COG:FN0826 KEGG:ns NR:ns ## COG: FN0826 COG0845 # Protein_GI_number: 19704161 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 1 302 10 309 338 240 44.0 2e-63 GYIEAKGRVEVNDTISVFVDKSLKVKEIFVKEGDYVEKGQILMTFDDLSKNKLLRAMERE RLQLQKLKRNYEVERSLEKIGGASLNSLKDMQEEIRIHELNLEEYQEDFQKTASEIVSPA NGTVSSLTAQENYLVNTDTPLLKIADLSNIKIVLEIPEYNVRYLKLGEKLSLQPEIFEEK ESFSGEIVRIGKIAKVSPSTSENILEVEVKPLEEIPYIVPGFKVSAKIELQETKEEKRIL IPKTALLEENGSFYVFSLAENQLAMKKGVEAEILSGQEAAILKGLQEGDKIIANPDVSLK EGDKILDSNQKSK Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:24 2011 Seq name: gi|224461477|gb|ACDD01000025.1| Fusobacterium sp. 3_1_5R cont1.25, whole genome shotgun sequence Length of sequence - 7858 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 156 197 ## gi|257466086|ref|ZP_05630397.1| periplasmic component of efflux system 2 1 Op 2 . - CDS 153 - 1427 1330 ## FN0825 putative cytoplasmic protein 3 1 Op 3 . - CDS 1437 - 2030 614 ## COG0406 Fructose-2,6-bisphosphatase 4 1 Op 4 1/0.000 - CDS 2051 - 2632 817 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 5 1 Op 5 . - CDS 2641 - 5487 3081 ## COG0178 Excinuclease ATPase subunit 6 1 Op 6 1/0.000 - CDS 5506 - 7032 2123 ## COG2385 Sporulation protein and related proteins 7 1 Op 7 . 
- CDS 7019 - 7762 1046 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase - Prom 7793 - 7852 6.2 Predicted protein(s) >gi|224461477|gb|ACDD01000025.1| GENE 1 3 - 156 197 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257466086|ref|ZP_05630397.1| ## NR: gi|257466086|ref|ZP_05630397.1| periplasmic component of efflux system [Fusobacterium gonidiaformans ATCC 25563] # 1 51 1 51 357 93 100.0 4e-18 MKKKVIVGLLLMAVIGIGVWKFHKKSEREEKVYSIISVHKKQGNGYIEAKG >gi|224461477|gb|ACDD01000025.1| GENE 2 153 - 1427 1330 424 aa, chain - ## HITS:1 COG:no KEGG:FN0825 NR:ns ## KEGG: FN0825 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 21 421 2 410 410 143 27.0 1e-32 MKKYLILFFCFSCSLLAKEQSFEEMLRQVSKSSYEEEKYQLEQESLSIRREHSKGRDFQE GLIANLEYTEHHRHKERNDYIKKGTLQWGPFFVSAYDLGEKEGEYVGVGIEKNLKDLFYS QHDSQLRQLKWDEKAKFWNYQNHRQKKMIAFIQLYRDYKNSLYELEIKEQERKRLEKEEA KLALSYRLGNTKRVDWQAANFGLQNLALEISALEKQKKAYEERFRREFRISLEGKSIQEI PLLEVEFQDVLEEYGRAELEEEKAKLKVQEEALRYSIYEEKIPDVSILYEHSSKNHKRLE EDMLSLKFRKKLFADHYNSKILQNEIKEQELFLAQREEEIKAERELMIANYENYQSQYQV AQNRAQLESSKYEIKKLEYDLGKIDYIDVMEAFDKYLDAKISLEKSKNRLAAYLYEWKVR KVEK >gi|224461477|gb|ACDD01000025.1| GENE 3 1437 - 2030 614 197 aa, chain - ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 194 1 194 206 167 43.0 1e-41 MKLYFVRHGETEWNTQRRFQGRKNSPLTEKGEQQAKNIAEVLRNIPFTRLYSSSLGRARK TAQEIQKGRGIPLEIMDEFIEISMGELEGKTKSDFAELFPEQYEKYLHASLDYNPQAFRG ETFEEIQARLRKGMNDLVRKHEEEDVILVVSHGMTLQILFTDLRHGNLERLREEKLPENT EVRVVEYRDQKFIIQSV >gi|224461477|gb|ACDD01000025.1| GENE 4 2051 - 2632 817 193 aa, chain - ## HITS:1 COG:FN1104 KEGG:ns NR:ns ## COG: FN1104 COG0632 # Protein_GI_number: 19704439 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Fusobacterium nucleatum # 1 193 1 191 
194 181 52.0 8e-46 MFEYLEGIVAYKKPEYFALEVHGIAYRVYISLRMYEKIEVGKSYRVYIYNHIREEEYKLI GFLEEKERKLFELLLSVKGIGVSLALAALSTYPVEQLVAYIQEEKVGQLKKIPKLGEKKA QQIILDIQSKLKHFGVEQRFVEDSSMAWAEDIASALENLGYAKKEVEHLLQHENWTEYQS LEEAMKAILKKMK >gi|224461477|gb|ACDD01000025.1| GENE 5 2641 - 5487 3081 948 aa, chain - ## HITS:1 COG:FN1103 KEGG:ns NR:ns ## COG: FN1103 COG0178 # Protein_GI_number: 19704438 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Fusobacterium nucleatum # 1 945 16 960 960 1501 79.0 0 MLEKIVIKGARQHNLKNLDLEIPKNKFVVITGVSGSGKSSLAFDTIYSEGQRRYVESLSS YARQFIGQMNKPEVDSIEGLAPAISIEQKTTNRNPRSTVGTITEVYDYMRLLFAHIGTAH CPICGRKVEKQSLEEIAETILEKFEEGQKMILLSPVIKDKKGTHKNIFLNLQKKGFVRVR VDGEILYVEDEISLDKNKKHTIEAVVDRLAFKKEDKEFVSRLTQSLEVASGLSNGKIILQ VGKEDMLFSENYACPEHEEVSIPDLSPRLFSFNAPFGACPECKGIGKKLEIDENKLIEDE NLSILEGGMYIPGAASRKGYTWEIFKAMAEHFHLDLGKPVKELTKEERDLIFYGKSGYFQ VDYEGNGYSFHGLKEYEGAVANLERRYRESFSEAQKEEIENKYMIEKPCKLCHGKRLKEE VLAVTIAEKNIIEITEMSIADAYQFFLDLSLTKKQEKIAKEILKEIRERLQFMINVGLDY LSLARETKTLSGGEAQRIRLATQIGSGLTGVLYVLDEPSIGLHQKDNDKLLATLSHLRNL GNTLIVVEHDEDTMMQAEEIIDIGPGAGVYGGEIVAHGSPNEIMKNKNSLTGQYLSGKKK ILIPEKRREWKHSLVLTGACGNNLKNVTVEIPLEIMTVVTGVSGSGKSTLINQTLYPILF NRLNKGKLYPLEYKSIEGLEQLEKVINIDQSPIGRTPRSNPATYTKIFDDIRDLFSQTKD AKLHGFDKGRFSFNVKGGRCEACQGAGILKIEMNFLADVYVECEVCKGKRYNKETLDVYY KGKNIAEVLNMSVIEAYEFFKNIPSLERKLKVLVDVGLDYIKLGQSATTLSGGEAQRIKL AAELSKNTRGKTIYILDEPTTGLHFQDIEKLLEVLQRLVEKGNTVLIIEHNLDVIKTADY IIDIGKDGGARGGEILATGTPEEIAKVKDSYTGKYLAKLLKKTRKIKK >gi|224461477|gb|ACDD01000025.1| GENE 6 5506 - 7032 2123 508 aa, chain - ## HITS:1 COG:FN0806 KEGG:ns NR:ns ## COG: FN0806 COG2385 # Protein_GI_number: 19704141 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Fusobacterium nucleatum # 174 508 1 333 333 319 52.0 9e-87 MKITKIAAACCLALLFVSCSNEKIKEKTETWGAIQRRNTRPMDNAYSLLNGKDYPRVENN FGKEVPYLQPVNDTKKEDFFEKYDEKKAMQFFKNLKIEGYGSNSMYWRWKTSISKSALFQ 
KIAQRIPQISARGRRNVFVLKDGVWLNNQKISDVGSVKDMKVLSRGASGVITHLLVETSK GSYLITKEYNIRRLLATNGKVFGARSGSSDYGKTPIANGNALLPSASLAFDIGTFSVDIY GGGFGHGTGMIQYGAGDLASNYGLSYQQILDHYYTNVDLVDMETVSGVEQNIKVGVTKPN GSLEHGSICLTGSGKLRVYAEDESFDYTFDPNTEIRVTPKAGRLYIKTDVKEFWTNKKFF VDGGGYYLLVKNLRKAHTNNPRYRGKMQFVPNGNTLHMISVVDMEDYLKQVVPSEMPRSF GVEALKVQAVAARTYAISDFLKGRYAALGFHVKDTTESQVYNNQVENEDANRAIEETRGK ILVYHGVPIDAKYSSTSAGFTEAAHHVW >gi|224461477|gb|ACDD01000025.1| GENE 7 7019 - 7762 1046 247 aa, chain - ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 244 1 245 245 333 67.0 2e-91 MKFLGVIPARYASTRLEGKPLKDICGHSMIEWVYRRCKNTKLDDVIVATDDERIFREVER FGGKVIMTSTEHSNGTSRIAEVCQKITDYDVIINIQGDEPLIEADMIDMIVDAFQQEELC MCTLKHKLDSWEDIENPNQVKVVTDKNDYALYFSRSILPYPRKENIDLYYKHIGIYGYTR NFVLEYAAMASTPLESSESLEQLRVLENGYQIKVLETSHQSVGVDTQEDLEKVCKWIEER GITIENY Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:37 2011 Seq name: gi|224461476|gb|ACDD01000026.1| Fusobacterium sp. 3_1_5R cont1.26, whole genome shotgun sequence Length of sequence - 9199 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 58 - 117 5.0 1 1 Tu 1 . + CDS 231 - 1469 1220 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase - Term 1449 - 1498 14.2 2 2 Op 1 7/0.000 - CDS 1559 - 2125 764 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 3 2 Op 2 . 
- CDS 2144 - 3451 2086 ## COG2233 Xanthine/uracil permeases - Prom 3624 - 3683 6.3 - Term 3477 - 3521 -0.8 4 3 Op 1 24/0.000 - CDS 3689 - 4444 615 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 5 3 Op 2 17/0.000 - CDS 4428 - 5168 226 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 3 Op 3 . - CDS 5152 - 6126 1089 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 6287 - 6346 8.7 + Prom 6167 - 6226 9.1 7 4 Tu 1 . + CDS 6246 - 7838 1108 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain 8 5 Tu 1 . - CDS 7863 - 9197 1371 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase Predicted protein(s) >gi|224461476|gb|ACDD01000026.1| GENE 1 231 - 1469 1220 412 aa, chain + ## HITS:1 COG:Cgl2031 KEGG:ns NR:ns ## COG: Cgl2031 COG1473 # Protein_GI_number: 19553281 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Corynebacterium glutamicum # 17 406 18 421 421 286 41.0 4e-77 MSTEKVLKSIEKIAKEQEDFYKYLHSHPELSMEESNTANMVCEKLVSFGYDVQRIGGGIV GVLKNGEGKTVLYRADMDALPIKEISNLPYASSVTQKKLKGEMVPVMHACGHDFHVTAGI GAAWAMANNKDEWSGTYIALFQPGEELGCGSQSMVEDGLVEKIPHPDIAFAQHVLVAPKS GMVGVCPGPFLSTAASIDIKVYGKGSHGSMPHLSVDTVVLAANIVTRLQTIVAREINPMD MAVLTVGALNAGDTSNIIPQEAVIKINIRAYTDEVREHLIEAIKRTVKAECTASRSPKDP EFKIYNEYPPTINDKEAAFKLQEAFKKYLGEDRVEKDYQPMSASEDFSNIPNAFGIPYVF WGFGAYNKKEDILPNHNPAFAPDLHPTMETGTEAAIVAAMSYLEKNDIKKGL >gi|224461476|gb|ACDD01000026.1| GENE 2 1559 - 2125 764 188 aa, chain - ## HITS:1 COG:BH1514 KEGG:ns NR:ns ## COG: BH1514 COG0503 # Protein_GI_number: 15614077 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus halodurans # 1 185 1 186 198 182 51.0 3e-46 MQLMKDYIQKYGVAIGDNILKVDSFLNHQIDPYLMMEVGKEFKQRFEGKGINKILTIEAS 
GIAVGITTAFAFQVPMVFAKKNKPSTMSDSYNATVFSFTKNKEYNITVAKEFIQKGDKIL IIDDFLALGNAILGLKSLCEQAGAEVVGVGIAIEKGFQAGGKMLRESGLHVESLAIVDSL QNGKIITR >gi|224461476|gb|ACDD01000026.1| GENE 3 2144 - 3451 2086 435 aa, chain - ## HITS:1 COG:CAC0872 KEGG:ns NR:ns ## COG: CAC0872 COG2233 # Protein_GI_number: 15894159 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Clostridium acetobutylicum # 2 427 9 431 435 253 36.0 6e-67 MTKNKSPYDIDGVPALREALPLGLQHILAMFVANITPIMIVGGALNLPTEEIAILIQASM LVAGLNTFIQTYRFGPVGARLPIVVGSNFTFVPLAITIGNNYGYEAVLGAALVGGIFEAC LGFFIQKVRRFFPSVVTGVIVLSIGLSLLPVGIASLAGGFGATDFGSFENLAIGCFVLIV IILFKQFAKGIWSTGSIFIGTMIGFILTLVMGKVDLSTVAQAGYLNLPMPFRYGFMFKSD AILAMMLLFVVSAVETLGDMSSVTMGGADRELTDKELSGGIVADGIGASLASIFGILPTT SFSQNTGIITMTKVMSRYVVGLGAVILMIGAFFPKVGALLTVIPPSVIGGSLVMIFAMIS ISGINLLTKEKLTGRNAVIVAVSLGLGYGLGSVPDALAHFPESLKLLFGGSGIVISGGIA IILNIVLPHDEKIFE >gi|224461476|gb|ACDD01000026.1| GENE 4 3689 - 4444 615 251 aa, chain - ## HITS:1 COG:AF0086 KEGG:ns NR:ns ## COG: AF0086 COG0600 # Protein_GI_number: 11497706 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Archaeoglobus fulgidus # 8 244 12 243 244 153 39.0 4e-37 MKKLSKKIIAIFEIFLFWYCMSIVLKKDLLPNPMITLETTFTLLITSSRIWIHILVSLYR VMLGILLGTVFSIPMGILLGYSKRIEKYFGEAFDFLYMIPKIVFLPIFFVLLGIGDLSKI ALIATVLFFQQTILIRDNVKNISEEIYDSIRILQASFWQIIQHVVFPSCLSGIFTSVKSS LGISFALLFITENFASQSGLGYFITKCMDRRDYVTMYAAILILAILGCILYTIFCFLERK ICKWKFLNHKE >gi|224461476|gb|ACDD01000026.1| GENE 5 4428 - 5168 226 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 202 1 210 245 91 32 2e-18 MIVLNNVSASYLDGNTKKKVIENLDLVIKEKCNVSIMGSSGCGKTTLLKVIAGLKKIEEG SISYRGKKYNTPIPEISLLFQNYGLLDWKTAEENILLPIYLRRSQKDSEKFSQLVKDLGL EKCLHKYPSQLSGGEKQRVAIGRALMTECKFLLLDEAFSSLDFVTKERIQNHLKKVFMKR GVTIILVTHSMEEALFWGDKIIIFESSTSKTPHILENYKESCDKEDWKKKEKILKRIKRI QNEEIK 
>gi|224461476|gb|ACDD01000026.1| GENE 6 5152 - 6126 1089 324 aa, chain - ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 19 308 16 299 300 198 36.0 1e-50 MYRKLFVAFWITILFVSCGLEKVQDKKMKVGILSIADSGALFVAEKEQLFLKNGLDVELI PFGSAVEQSRAMEAGELDAMMTDAIVQNLVNQGENNLKEVLVALGDTAEHGKFLILASPT TEHNSLKNLSGAKLGISENTMMEFLVDSYFSLLNLEIHDVEKVNIPSLSLRMEMLLQGKI DLAILPEPLGDFAVLQGAKIVLDDTKLNENLSQSIIVFRESYIEKNFLEVKKFVKSYSEA AKMINEAPDQYKDYIFEMANIPEILKSSYRLPYYSIASVTERQLFDKMQNWMIQKKLLTQ TKDYSSSIDSRFIDIVGEENDSVK >gi|224461476|gb|ACDD01000026.1| GENE 7 6246 - 7838 1108 530 aa, chain + ## HITS:1 COG:FN1559 KEGG:ns NR:ns ## COG: FN1559 COG3263 # Protein_GI_number: 19704891 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Fusobacterium nucleatum # 1 523 1 520 527 338 40.0 2e-92 MLPILILVSFVILLSCILDKFSLKSGVPALLLFIGLGMIFGEEGVIKIPFYDFSMANNIC STALIFIMFYGGFGTKWKEAKKIIIEASLLSSFGVFLTALLVALGIHILLHWSWLESFLF GSILSSTDAASVFSILKRKKLGLKENTAPLLEVESGSNDPCSYMMTLICLLCLTGNVSTN TLISYIFLQLGGGFFFGSILSLITRFLLKKLHFAEGLDTLFITGAVIAAYALPSYFNGNG YLSVYIFGILLGNSSFPHKENLSSFYDGVTGLSQMFIFFLLGLLSTPSRLVDSFFPAFLI FLILTFIARPISVFSLLLPFRSSTEKKILVSFAGLRGAASIVFAIMTIVHDYTPKQDIFH IVFCVVLLSMIFQGSFLAWMAIRLKMIDTEVDVMEIFTSYASKTKLQFFEITIPQNHYWI SMKIKDLLLPPNILILNILRKEKQIVPYGDFIIQENDIITFSVLGNHKKLSFNIDMYTLP SRSKWIGKSIQEYGEKKNIFISLIIRKEKAMMPKANLLLEEDDQIYFHKK >gi|224461476|gb|ACDD01000026.1| GENE 8 7863 - 9197 1371 444 aa, chain - ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 14 443 9 451 451 327 45.0 3e-89 
SERGKIMRMEEFVFPEKVLYILEELEKYGEGYLVGGSVRDILLGREVHDFDFCTNLSYET LKQIFSKYFCIETGKAFGVLRLRIDGEEFEIASFRSEKGSDGRRPEEVIFVKRIEEDLAR RDFSINAMAYNQEKGLLDYYDAQKDLENKVIRFIGNPRERIQEDGLRIMRAFRFMSQLGF SLESNTKKAIMEEREMLGKIAKSRITEEWNKLILGDFVVETLEEMKETGVLELILPSLKS LYHSYDLWEHTMQVVKSVPKDLDLRLAAIFHDIGKPLTKTIDEKTGYYHFYGHEKKGAER IRSILQEELEESNKTRKEVEFLIENHMILQRNSSEKGIKKLISHFGIERTEKLIKLSIAD NLGKNLQQLRENNVADLFYKIVEKQKIPTLQELAIDGFALMKLGYQGKEIQKIKEYLLEE VLEGKIENKETALLSKAKELHPKS Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:41 2011 Seq name: gi|224461475|gb|ACDD01000027.1| Fusobacterium sp. 3_1_5R cont1.27, whole genome shotgun sequence Length of sequence - 7692 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 401 538 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 456 - 515 12.8 + Prom 414 - 473 8.0 2 2 Tu 1 . + CDS 497 - 1072 629 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase + Term 1151 - 1196 3.2 - Term 1137 - 1184 5.2 3 3 Op 1 13/0.000 - CDS 1195 - 2979 2343 ## COG0173 Aspartyl-tRNA synthetase 4 3 Op 2 1/0.000 - CDS 2983 - 4224 1696 ## COG0124 Histidyl-tRNA synthetase 5 3 Op 3 . - CDS 4226 - 5458 1322 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Prom 5481 - 5540 4.8 - Term 5487 - 5545 0.2 6 4 Op 1 . - CDS 5559 - 5783 298 ## COG1314 Preprotein translocase subunit SecG 7 4 Op 2 . - CDS 5845 - 6912 587 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 8 4 Op 3 . 
- CDS 6939 - 7505 678 ## COG1739 Uncharacterized conserved protein - Prom 7544 - 7603 5.5 Predicted protein(s) >gi|224461475|gb|ACDD01000027.1| GENE 1 2 - 401 538 133 aa, chain - ## HITS:1 COG:Cgl1127 KEGG:ns NR:ns ## COG: Cgl1127 COG0494 # Protein_GI_number: 19552377 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 1 130 1 129 131 74 33.0 4e-14 MKKKIQVVAAMIEREDGRVLAVLRSAKKKIGNRWEFPGGKVEEGESYFQTAEREVQEELC CRVQAVEEMGSIYEEVEDAVIEVHFVKCLWKGTAFTLTEHDAFIWIKKENLLSLKFAEAD RPMLERLVNEEKS >gi|224461475|gb|ACDD01000027.1| GENE 2 497 - 1072 629 191 aa, chain + ## HITS:1 COG:FN1132 KEGG:ns NR:ns ## COG: FN1132 COG1057 # Protein_GI_number: 19704467 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Fusobacterium nucleatum # 1 186 1 192 193 168 46.0 6e-42 MKIGIYGGSFNPIHLGHQKIIEFILQKTLLDKIIVIPVGFPSHRANTLEKGLHRFQMCQL AFEHLSQVKVSDIEINLGETSYTYDTLMKIRKIYGEEHEYFEIIGEDSLASFHTWKKPQE ILKLAKLLVLQRETFELKSENPNIILLNSPLFPISSTEIRKQLQEKRKEIEWLNPKVLRY IREQHLYENIL >gi|224461475|gb|ACDD01000027.1| GENE 3 1195 - 2979 2343 594 aa, chain - ## HITS:1 COG:FN0299 KEGG:ns NR:ns ## COG: FN0299 COG0173 # Protein_GI_number: 19703644 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 592 1 592 592 939 77.0 0 MTYRSHNLGELRKENIGQVVTLSGWVDTKRDLGGLTFIDLRDREGKTQIVFDIDYTNKDI IETAQKLRNEAVIKVVGEVKERASKNLNIPTGEIEVFVKELEVLNQCDVLPFQITGTEEN LSENIRLKYRYLDIRRPKMIQNLKMRHRMIMAIRNYMDQAGFLDVDTPILTKSTPEGARD FLVPSRINGGTFYALPQSPQIFKQLLMIGGVEKYFQIAKCFRDEDLRADRQPEFTQLDVE MSFVTQDDVMNEIEGLAKYVFKHVTGEEANYTFERMPYAIAMGEYGSDKPDLRFEVKLKD LSDVVAKSSFKAFSATVENGGIVKAIVAPKAFEKFSRKVLGEYEDYAKQYFGAKGMAYIK IAENGEISSPIAKFFQEDEMKAILERTGAGAGDVILIIADRAKTVHAALGALRLRLGKEL GLIDMNSYKFLWVVDFPMFEYDEEEGRYKAEHHPFTSIKEEDMEKFLAGQTDNIRTNTYD 
LVLNGSEIGGGSIRISNTELQAKVFERLSLSPEEAKEKFGFFLDAFKYGAPPHGGLAFGI DRWLMVMLKEESIRDVIPFPKTNKGQCLMTEAPGKVEEEQLEELFLHSTFQEEK >gi|224461475|gb|ACDD01000027.1| GENE 4 2983 - 4224 1696 413 aa, chain - ## HITS:1 COG:FN0298 KEGG:ns NR:ns ## COG: FN0298 COG0124 # Protein_GI_number: 19703643 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 410 1 413 413 606 75.0 1e-173 MKLIKAVRGTKDIIGEDALKYNYISEIAKQVFESYGCQFIKTPIFEETDLFKRGIGEATD VVEKEMYTFKDRGDRSITLRPENTASVVRSYLENAIYAKEDVSRFYYNGSMFRYERPQAG RQREFNQIGVEVLGEKSPILDAEIIAMGYKLLQKLGITDLEVRINCIGSNASRTEYRKKL LEYFTPMKEDLCEDCKNRLERNPLRVLDCKVDHDKMDGAPSIIDSLFEEERAHYEAVKKY LTIFGVPFVEDSGLVRGLDYYSSTVFEIVTNKLGSQGTVLGGGRYDNLLKQLGDKDIAAF GFASGVERIMMLLEDYPKKATDVYIAWLGENTLEKAMEITALLRENNLKVAVDYNSKGMK SHMKKADKLNTKYCVILGEDELAKNVVVLKNFETREQEEVSVENILMAIKGGK >gi|224461475|gb|ACDD01000027.1| GENE 5 4226 - 5458 1322 410 aa, chain - ## HITS:1 COG:FN0297 KEGG:ns NR:ns ## COG: FN0297 COG2256 # Protein_GI_number: 19703642 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Fusobacterium nucleatum # 1 406 1 406 407 550 64.0 1e-156 MNLFESNYEAIKPLSFQLRPQSLDEIFGQEKLLGKHGVLRKLIETGRLTNSIFFGPPGCG KSTLGEIISHTMDCAFESLNATTASLQDIKEVVLRAKRNVEYYQKKTILFLDEIHRFNKL QQDALLSYCENGTFILIGATTENPYYSLNNALLSRVMVFEFKSLEKKEIQQILKRAQTKI GISLSPFLEEVMSEMAQGDSRVALNYLELYQNLKDSLSEEEIYQVFMERKHSFHKTQDKY DMISAMIKSMRGSDPDAAVYWLGCLLEGGEDPRYMARRIMIQACEDVGMANPEAMLVASA AMQASERIGMPEIRIILAQAVIYLAISSKSNSAYLAINQVMEEIKNGNRQEVPKNICHDN VGYLYPHDYPNHFVRQTYMEEKKRYYLPQENKYEKLIEEKLKKLWENKEG >gi|224461475|gb|ACDD01000027.1| GENE 6 5559 - 5783 298 74 aa, chain - ## HITS:1 COG:FN0538 KEGG:ns NR:ns ## COG: FN0538 COG1314 # Protein_GI_number: 19703873 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Fusobacterium nucleatum # 1 74 1 74 74 86 71.0 
1e-17 METLLTVFLFILAIALIILVLIQPDQSHGMSASMGMGSSNTVFGISKDGGPLAKATKVVA ALFIIDALLLYLIK >gi|224461475|gb|ACDD01000027.1| GENE 7 5845 - 6912 587 355 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 5 345 12 347 378 230 40 2e-60 MGKIAKKLLEYYDKHKRDLAWRGEVPAYYTWISEIMLQQTRVEAVKPYFARFIEELPNIE ALANCEEEKLMKLWQGLGYYSRARNLKKAACQIMEMYGGELPKEKKELLHLAGIGPYTAG AISSIAYGKKETAVDGNVIRVMSRLFAVEGNVLEGKGRQKIEELTYQELPEDRAGDFNQA LMDLGATICIPNGAALCHLCPLHLECQANLKKEVEKYPEKKKKKERKLERQTILLLSDGQ KFALEKRKEKGLLAGLWQFPMLEGRLSLQEVREYLKEKGISYSGIEEYEPAIHIFSHVEW HMVSYIIEVEKWEIQEKQEENFVWLSKEEILTEYSVPSAFKVYLDYLKQGQRKLF >gi|224461475|gb|ACDD01000027.1| GENE 8 6939 - 7505 678 188 aa, chain - ## HITS:1 COG:FN1907 KEGG:ns NR:ns ## COG: FN1907 COG1739 # Protein_GI_number: 19705212 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 188 4 193 195 235 63.0 4e-62 MKTIGRECEISFEEKKSKFIGYIKPVYSKEEAEEFIEKIKRLHPQATHNCSVYSIKEKGK EFFKVDDDGEPSGTAGKPMGDIVQYMEVQNLVVVATRYFGGIKLGAGGLIRNYAKTCKLA ILEAGIVDYVKKETIIIEFPYERVGEIDKLLSSSSILEKSFLDRVVYQVEVEEDLKKVIE QMPYINII Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:45 2011 Seq name: gi|224461474|gb|ACDD01000028.1| Fusobacterium sp. 3_1_5R cont1.28, whole genome shotgun sequence Length of sequence - 9335 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 1152 826 ## COG0658 Predicted membrane metal-binding protein 2 1 Op 2 . - CDS 1155 - 2444 1125 ## COG2211 Na+/melibiose symporter and related transporters 3 1 Op 3 . - CDS 2441 - 3550 1325 ## COG1323 Predicted nucleotidyltransferase 4 1 Op 4 . - CDS 3561 - 4589 1304 ## COG2008 Threonine aldolase 5 1 Op 5 . - CDS 4600 - 5169 791 ## COG0817 Holliday junction resolvasome, endonuclease subunit 6 1 Op 6 . 
- CDS 5173 - 5631 539 ## COG0219 Predicted rRNA methylase (SpoU class) - Prom 5661 - 5720 11.3 + Prom 5707 - 5766 9.8 7 2 Op 1 2/0.000 + CDS 5791 - 6411 809 ## COG0491 Zn-dependent hydrolases, including glyoxylases 8 2 Op 2 . + CDS 6436 - 7362 383 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 9 2 Op 3 . + CDS 7393 - 8700 1905 ## COG1875 Predicted ATPase related to phosphate starvation-inducible protein PhoH + Term 8703 - 8753 9.5 - Term 8695 - 8734 6.0 10 3 Tu 1 . - CDS 8744 - 9331 849 ## FN0557 hypothetical protein Predicted protein(s) >gi|224461474|gb|ACDD01000028.1| GENE 1 3 - 1152 826 383 aa, chain - ## HITS:1 COG:FN0223 KEGG:ns NR:ns ## COG: FN0223 COG0658 # Protein_GI_number: 19703568 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Fusobacterium nucleatum # 15 379 2 369 378 162 34.0 9e-40 MKHIIYFFLLLCLGIRLYEKIDFYELYEGEKIFLELEVYHGRGRSLNRYQTIYTKLAELE DGRYEGEFEILEKTPYYYELEICSLRKKEENFCQRYLKACVQKLGEGRDPSFRHFLEAIL LGRAWTLFREERKLFQYVGLSHLLAISGLHVGLLFYFLEKLLLFFKIPKQTRNYLTLGIS HFYCFGIFLSPSFVRAYVMGIFYLFHELLGEKISREKMLFFSAWILLMLHPTEVLSPSFL LSYTAILTIFYVFPLLKLYFEDVPPYLSYIFYTLSIQCIGIPLTAYFFGSLACLSFFVNL LILPIGTSLILFSFFTFFLEIFHLGFLTVPILEFFYHIFYEILEWIGELPYLTIYLENKI SGELVFLSYFVIVFIVRILYLQK >gi|224461474|gb|ACDD01000028.1| GENE 2 1155 - 2444 1125 429 aa, chain - ## HITS:1 COG:FN0222 KEGG:ns NR:ns ## COG: FN0222 COG2211 # Protein_GI_number: 19703567 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Fusobacterium nucleatum # 1 428 1 440 448 504 65.0 1e-142 MKQLTTKTQIFYGLGVSYAIVDQIFAQWILYFYLPPASAGLPMIMPAIYISYALAISRFV DMVTDPLVGFLSDKLDTRFGRRIPLVAFGTIPLALTTFAFFFPPQGNPDMAFVYLAVVGS LFFTFYTIVGAPYNAMIPEIGRNQTERLNLSTWQSVFRLVYTAIAMIIPGVLIKYFGKGD NLLGIRSMVAFLCIIVVLGLAITVFTVKEKEYSSGEVSKENFKDTIRIILKERNFFYYLF GLLFFFVGFNNLRAVMNYYVEDIMGMGKAEITIASALLFGVAALFFVPTNKVSKKYGYRK 
IMLSCLLLLAIFTGNLYFLGKIIPVKLGFILFALLGIPIAGAAFIFPPAMLSEIANHISE RSGSRIEGLCFGIQGFFLKLAFLISILMLPLVLTMGGKLVQKAGIYNASMLSLVFFALSF FCYYRYREE >gi|224461474|gb|ACDD01000028.1| GENE 3 2441 - 3550 1325 369 aa, chain - ## HITS:1 COG:FN0732 KEGG:ns NR:ns ## COG: FN0732 COG1323 # Protein_GI_number: 19704067 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Fusobacterium nucleatum # 4 361 6 379 396 303 45.0 5e-82 MQAIGIIAEYNPFHKGHLYHLETIKEKYPNAVIIAVMSGDYVQRGEPAIISKSRRAKQAK EAGIDIVIELPAIYSTQSAEIFARASVGILHLCHCEAFVFGSETNNIERLEKIARLSLSK EFNLALKEFLSQGFSYPTAFSKALFGEKIEPNDTLGIEYIKAIWFWKSSMRAESILRKQS GYYEENQKEQMAGATVIRQKIEQKEDYSKYLVDGNYLEEPFAFWDKFYPYLRYALLFHSN SFSEIQDMEEGLENRIRKAAEEHVCYSSFLESIMTKRYTYARIQRVLLHILLGISKQKTE RWKEEIPYLRVLEFSERGQEYLRVLKKGKIPVITTKKNIQKKLSEEARELFFWNERASSF YLSVVEEKQ >gi|224461474|gb|ACDD01000028.1| GENE 4 3561 - 4589 1304 342 aa, chain - ## HITS:1 COG:FN0810 KEGG:ns NR:ns ## COG: FN0810 COG2008 # Protein_GI_number: 19704145 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Fusobacterium nucleatum # 1 340 1 340 340 339 49.0 4e-93 MLSFLNDYSEGGHPKVMEDLMETNGESTVGYGFDPYCDKAREIISKKLKQENTETWFFAG GTLTNLTVIAHVLKPHQAVITAFTGHINVHETGAIEATGHKVLGLPSEDGKLTPKMIEDC LAYHEDFFFVEPKMIYISNTTEVGTIYTKKELMNLKACCEKNGLYLFMDGARLAYAFGAK ENDITWEDLGKYTDVLFIGGTKCGAMFGEAVAIIHDDLKKDFKYSIKQRGGLFAKGRLIG VQFISLLQENLYEEIGRKANEAAIVLRDGLRELGFTSPYDSPSNQQFVLMTQEEFEKISS VVLCGAEGKWRDGRCRIRFVTGWKNTVEEAKEAIEKIREVLA >gi|224461474|gb|ACDD01000028.1| GENE 5 4600 - 5169 791 189 aa, chain - ## HITS:1 COG:FN0214 KEGG:ns NR:ns ## COG: FN0214 COG0817 # Protein_GI_number: 19703559 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Fusobacterium nucleatum # 1 188 1 190 190 240 63.0 1e-63 MRILGIDPGTAIVGYGVIDYEKGKFHVVDYGCIYTEKDLPMEDRLVKVHEELSSLIQKYQ PEEMAVEELFYFKNNKTVISVGQARGVIVLTGRLHGLQIHSYTPLQVKMGITGYGRAEKK 
QIQQMVQRFLGLSEIPKPDDAADALAIAINHIHTKTSALLQIDTVCLKKLPKGTEKLSVQ EYRELFLKK >gi|224461474|gb|ACDD01000028.1| GENE 6 5173 - 5631 539 152 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 1 150 1 150 150 270 84.0 1e-72 MNIVLYQPEIPYNTGNIGRSCVLTNTHLHLIKPLGFSLDEKQIKRSGLDYWHSVQLTVWE SFEEFLASDPNMRLFYATTKTKQRYSDVSYKANDYIMFGPESRGIPEEILKKNPERCITI PMIPMGRSLNLSNSAAIILYEALRQVDFDFGE >gi|224461474|gb|ACDD01000028.1| GENE 7 5791 - 6411 809 206 aa, chain + ## HITS:1 COG:FN1162 KEGG:ns NR:ns ## COG: FN1162 COG0491 # Protein_GI_number: 19704497 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 1 201 1 201 207 251 59.0 5e-67 MQVKKFHLGPMMTNCFLTWGDNGTAYFFDCGGKNLDKVEAFIKDNQLSMKYLILTHGHGD HIDGIHEFIKRFPEAKIYIGKEEKEFLSNPNLNLNSYISGNNFEFDGEIHTVQGGDMIGE FLVLDTPGHTIGSKSFYHKDSNILMAGDTLFYHSYGRFDLPTGSQRQLVESLRKLCELPE NVIVYNGHTEETTIGEEKEFLGFHRR >gi|224461474|gb|ACDD01000028.1| GENE 8 6436 - 7362 383 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 294 1 296 306 152 33 1e-36 MEKIYDVIIVGGGPAGLTAGIYLGRGKARTLILEKANVGALLSAHKIDNYPGFLNSPSGK EIYEIMKKQALSYDVEIQEATVLAFDPYEETKIVKTDKGNFKCKYIIIASGMLKAKKVPG EAKYIGAGVSYCATCDGAFTRNRIVSLVGKGEELAEEALFLTRFAKEVHVYVTEDILEAP QEVLHALLENEKVKIQYSVSLEEVKGDGEALTSFVLKDSTGNLSEKNTDFLFLYLGTKSN TELFGEFADMNSKGFIKTNEKMQIRTPNMYAIGDIREKEIRQVTTATNDGTIAASVIIKD ILTKKANK >gi|224461474|gb|ACDD01000028.1| GENE 9 7393 - 8700 1905 435 aa, chain + ## HITS:1 COG:BH2629 KEGG:ns NR:ns ## COG: BH2629 COG1875 # Protein_GI_number: 15615192 # Func_class: T Signal transduction mechanisms # Function: Predicted ATPase related to phosphate starvation-inducible protein PhoH # Organism: Bacillus halodurans # 1 435 1 442 442 375 
46.0 1e-103 MRKIFVLDTNVLIHDPYCIYKFEDNEVVVPIFVIEEIDKLKRNPNTAIQARLVSRVIDEI RKKGSLYQGVELEKDIFFRVEIDNNIEDLPTVLRRDVMDNMIISVTLGIQKKNPEKRVVI VSKDINMRIKADALGLEVQDYKNDKVDYSELYTGFLDISVSKEILEEYSNSGKISLEKLD VNSENLTPNCFIRMNCENDFVTGRYANGKVRKIILGDIEAWGLRARNEEQRFAMELLMDE AVKVVTLVGGAGTGKTLLAIAAALEQVVERKKYKKIFIARPIIPMGKDLGYLPGSEKEKL KPWMQPIFDNIEFLSHTRGEKTGEKVVQGLEAMGLMKIEPLTYIRGRSIPAGFIVIDEAQ NLTPLEIKTIVTRVGEDTKIVFTGDPAQIDNPYLDANTNGLTYLAEKLKNEKILGHMTLV KGERSEVAEIAAKLL >gi|224461474|gb|ACDD01000028.1| GENE 10 8744 - 9331 849 195 aa, chain - ## HITS:1 COG:no KEGG:FN0557 NR:ns ## KEGG: FN0557 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 194 46 241 244 216 61.0 5e-55 MFGLEKDTIGGYNSMLQKQKEQSGDVFVTTVLFDDYYEMLYHHKNIKELPNMTEKEYFVR GSTALYDAIGKTIVNVDREQELAEKKVDQVLFIITTDGMENASQEFTAKQVRALIEKQKK EKKWEFLFLGANIDAEETAAQFGISKEKAVNYHADSLGTQKNYKVLGEAVLQMRSGQQLK KEWKQEIEEDYKSRK Prediction of potential genes in microbial genomes Time: Fri May 20 01:54:53 2011 Seq name: gi|224461473|gb|ACDD01000029.1| Fusobacterium sp. 3_1_5R cont1.29, whole genome shotgun sequence Length of sequence - 12929 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 3, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 139 106 ## gi|257466055|ref|ZP_05630366.1| hypothetical protein FgonA2_01235 2 1 Op 2 . - CDS 141 - 239 281 ## 3 1 Op 3 . - CDS 303 - 572 421 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 4 1 Op 4 . - CDS 557 - 784 408 ## FN1099 hypothetical protein - Prom 805 - 864 9.8 - Term 879 - 916 6.4 5 2 Op 1 2/0.000 - CDS 945 - 2252 2284 ## COG0148 Enolase 6 2 Op 2 . - CDS 2286 - 3704 2061 ## COG0469 Pyruvate kinase - Prom 3751 - 3810 3.7 7 3 Op 1 . 
- CDS 3838 - 4299 682 ## COG2731 Beta-galactosidase, beta subunit 8 3 Op 2 1/0.000 - CDS 4356 - 5255 1081 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 5271 - 5313 6.7 9 3 Op 3 3/0.000 - CDS 5334 - 6011 1096 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase 10 3 Op 4 4/0.000 - CDS 6021 - 6893 1349 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 11 3 Op 5 1/0.000 - CDS 6890 - 7792 331 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 12 3 Op 6 9/0.000 - CDS 7796 - 9649 696 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 13 3 Op 7 2/0.000 - CDS 9667 - 10656 368 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 14 3 Op 8 1/0.000 - CDS 10670 - 11677 1156 ## COG1609 Transcriptional regulators 15 3 Op 9 . - CDS 11681 - 12790 1355 ## COG3055 Uncharacterized protein conserved in bacteria - Prom 12869 - 12928 6.8 Predicted protein(s) >gi|224461473|gb|ACDD01000029.1| GENE 1 1 - 139 106 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257466055|ref|ZP_05630366.1| ## NR: gi|257466055|ref|ZP_05630366.1| hypothetical protein FgonA2_01235 [Fusobacterium gonidiaformans ATCC 25563] # 1 46 1 46 237 88 100.0 1e-16 MKKILLGILLAFSLVACANGTTKKEVKAKNIELVFILDRSGSMFGL >gi|224461473|gb|ACDD01000029.1| GENE 2 141 - 239 281 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYVQILSMICLGVLLSSCVSLGVGTGITIGG >gi|224461473|gb|ACDD01000029.1| GENE 3 303 - 572 421 89 aa, chain - ## HITS:1 COG:FN1100 KEGG:ns NR:ns ## COG: FN1100 COG2026 # Protein_GI_number: 19704435 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Fusobacterium nucleatum # 1 87 1 87 88 125 74.0 2e-29 MGYRLVIPEKLNKKIIKFDKSVQKTLYSYIKKNLLDTEEPRLHGKALTGNLKGMWRYRVM DYRLIVEIQDDVLIIVAVDFDHRKKIYEK >gi|224461473|gb|ACDD01000029.1| GENE 4 557 - 784 408 75 
aa, chain - ## HITS:1 COG:no KEGG:FN1099 NR:ns ## KEGG: FN1099 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 75 1 75 75 78 65.0 7e-14 MSVISIRFNNEEERLIKEYVESKGATVSQFIKDLLFKQIEEEYDLEIVQEYLKEKEAGRL NLISFEEAVKEWDID >gi|224461473|gb|ACDD01000029.1| GENE 5 945 - 2252 2284 435 aa, chain - ## HITS:1 COG:FN1764 KEGG:ns NR:ns ## COG: FN1764 COG0148 # Protein_GI_number: 19705083 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Fusobacterium nucleatum # 1 434 1 434 434 697 85.0 0 MTRIYDVVAREILDSRGNPTVEVDVVLECGAKGRAAVPSGASTGSHEAVELRDGDKARYL GKGVLKAVQNVNTEIKERLVGMNALDQVSIDKAMIELDGTPNKGRLGANAILGVSLAVAK AAAEALGQPLYKYLGGVNAKELPLPMMNILNGGSHADSAVDVQEFMIQPVGAKTFAEAMQ MGCEVFHHLGKLLKANGDSTNVGNEGGYAPAKIQGTEGALDLICEAVKAAGYELGKDITF AMDAASSEFAKEADGKYTYHFVREGGVVRTTEEMVDWYKSLVEKYPIVSIEDGLAEDDWA GWQKLTEALGEKVQLVGDDLFVTNTERLQKGIELKAANSILIKLNQIGTLTETLDAIEMA KRAGMTAVVSHRSGETEDATIADIAVATNAGQIKTGSTSRTDRMAKYNQLLRIEQELGDM AQYRGMKVFYNIDRK >gi|224461473|gb|ACDD01000029.1| GENE 6 2286 - 3704 2061 472 aa, chain - ## HITS:1 COG:FN1765 KEGG:ns NR:ns ## COG: FN1765 COG0469 # Protein_GI_number: 19705084 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Fusobacterium nucleatum # 1 472 4 474 475 647 72.0 0 MKKTKIVCTIGPKTESKETLKTLLQSGMNVMRLNFSHGDYAEHGARIVNFREAMKETGIR AALLLDTKGPEIRTIKLEGGKDVSIIAGQTFTFTTDKSVIGNQNKVAVTYEGFARDLKVG DIVLVDDGLLSMTVTKISGNEVECIAENSGDLGENKGINLPNVKVNLPALAEKDIQDLKF GCEQKVDFIAASFIRKADDVRAVRKVLEENGGAGIQIISKIENQEGLDNFEEILEESDGI MVARGDLGVEIPVEEVPFAQKMMIQRCNAVGKIVITATQMLDSMIKNPRPTRAEANDVAN AIIDGTDAVMLSGETAKGKYPIEAVTVMKRIAEKTDPLILPVEDAHLEVGEITVTTAVAK GTADVAEMIGAKVIVVATASGRAARDMRRYFPSADILAITNNERTANQLVLTRGVTSYVD GVASSLDEFYTLAEKAVRELGLAVSGDVIIATCGEQVYINGTTNSVKVIHIK >gi|224461473|gb|ACDD01000029.1| GENE 7 3838 - 4299 682 153 aa, chain - ## HITS:1 COG:FN1134 KEGG:ns NR:ns ## COG: FN1134 COG2731 # Protein_GI_number: 19704469 # Func_class: G Carbohydrate transport and 
metabolism # Function: Beta-galactosidase, beta subunit # Organism: Fusobacterium nucleatum # 1 150 5 153 155 124 44.0 8e-29 MIYDKIENIGRYLGISNYLDQAIRYIMTGNYQKAEYGRNVVAGEDIYYNCPEGAMAKNVE GMDYEYHRTYIDIHIPLKGKENIAFFEMKQGKEVKSYEEENDYGLYQGIAEGKLCIKEGE FLMLFPEEVHLALMKVEEEATPIEKVIFKVRAK >gi|224461473|gb|ACDD01000029.1| GENE 8 4356 - 5255 1081 299 aa, chain - ## HITS:1 COG:HI0687 KEGG:ns NR:ns ## COG: HI0687 COG0697 # Protein_GI_number: 16272629 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Haemophilus influenzae # 2 299 1 301 304 286 58.0 4e-77 MIKNEYFLRISYIVLLGFGFPIMRFMSIHFDTLNNNAIRFLSGGFFLIMLSLVCFKNELK KVKDNPKIIYKLFILGFFMTANMYFFINGLKNTSALTGSIFGVLAMPLSVCMAAIFYKDE RERAKKKSFLFGSIFAVIGSFLFVFFGNSNGGNSISFIKGGLFLLIAIFIQSIQNLVVKN VAKTLHAIVISAFTATFSGIVFLCLSIQTKTIIQLQEVSHSLIFGLVLAGIYGMLTGMLM AFYLVQTQGVIVFNLLQLLIPLSTAVVGYFSLGEGFSIGQSLSGILVVIACVFTLKKRD >gi|224461473|gb|ACDD01000029.1| GENE 9 5334 - 6011 1096 225 aa, chain - ## HITS:1 COG:FN1476 KEGG:ns NR:ns ## COG: FN1476 COG3010 # Protein_GI_number: 19704808 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Fusobacterium nucleatum # 1 223 1 223 224 273 72.0 2e-73 MKERIEALRGKLIVSCQALPEEPLHSSYIMSRMAYAAYVGGASGIRANTVVDIHEIKKTV DLPIIGIIKEVYGDNPVYITPTMKEISALVTEGVDIIAIDGTKRERPDGNTLEALMKEAK EKYPKQLFMADISSVEEAVEAERLGFDFVGTTLVGYTEYTKGNLPLVELEKVLKAVSIPV IGEGNLDTPEKAKNALQLGAFAVVVGGAITRPQQITKKFVDEMNK >gi|224461473|gb|ACDD01000029.1| GENE 10 6021 - 6893 1349 290 aa, chain - ## HITS:1 COG:FN1475 KEGG:ns NR:ns ## COG: FN1475 COG0329 # Protein_GI_number: 19704807 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Fusobacterium nucleatum # 1 288 1 288 290 448 77.0 1e-126 
MKGIFSALMVPYNLDGSINEKGLRELVRHNIDVMKVDGLYVGGSTGENFMISTEEKKEVF RIAMDEAKNEVQMMAQVGSINVKESVELGKYATELGYPCLSAVTPFYYKFSFAEIKEYYE TIVRETQNNMVIYSIPFLTGVNMDIAQFGELFANPKIIGVKFTAGDFYLLERMRKAYPDK LILSGFDEMLLPAVVMGVDGAIGSTYNVNGIRAKEIFRLGKEGKIAEALEIQHVTNDLIE GILQNGLYPTIKEILKCQGVDAGICRRPMAPTTEEQAKVAKELYQKYLAK >gi|224461473|gb|ACDD01000029.1| GENE 11 6890 - 7792 331 300 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 5 290 4 318 319 132 28 2e-30 MKQDKIIAVDIGGTNIKYALVSFRGEILSSGDIPTEASKGIEILLSKLDNIIQTFLSEEI LGIAISATGQIDYYQGKVVGGNPIIPGWIGCELVKILEEKYHLPCVLENDVNCAALGEAW LGAGKGQKDFLCLTIGTGIGGGIILNHNLYRGASAVAGEFGKLHLRGKEEVYEKYASMSA LVQKVEEKTGKHRNGKEIFDLYWQEEKTVVSLVNEWIHDIAEGLKVLLYLWNPSCIILGG AVTHQGETFQKKIEEELQKQITPNYLECLELKFANLGNHAGLLGATFLLLDKIKQEEVKQ >gi|224461473|gb|ACDD01000029.1| GENE 12 7796 - 9649 696 617 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 192 613 5 428 431 272 35 8e-73 MKLFNKLEEWIGGTLFVGMFIILVLQIISRQILDDPLIWSEELARLFFVYVGMLGISMGI RTQSHVMIDFVYARLPEKLQKIIFTGIQMIIFLCISSFSYFGYLLIEKKADIELVSLGIS AKWMYIALPVISVLMLIRFFQAYQENWENKKVLISPKIILAFMIIFMALLIFQPSVFKVF KLTQYFKLRGNSVYVALLLWLVLIFAGVPVGWSLLASSMVYFSMTKWAVAYFASSKFVDS VDSFSLLSVPFFILTGILMNGSGITERIFYFAKATLGHYTGGMGHVNVAASLIFSGMSGS AIADAGGLGQLEIKAMRDEGYDDDICGGITAASCIIGPLVPPSISMIIYGVIANQSIAKL FLAGFVPGVLTTIALMIMNYFVCKKRGYKKAKKCTSKERWEAFKRAFWALLTPIIIIGGI FSGMFTPTEAAVVAALYSVILGMFIYKELTLKALFQHCVEAMAISGVTVLMIITVTFFGD MIAREQVAMKIAEVFIQFANSPLTVLVMINLLLLFLGMFIDALALQFLVLPMLIPVAEQV GIDLVFFGVMTTLNMMIGILTPPMGMALFVVAQVGKMPVSTVTKGVLPFLIPIFVTLVVI TIFPQIIIFLPNLIMGA >gi|224461473|gb|ACDD01000029.1| GENE 13 9667 - 10656 368 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 15 308 38 329 346 146 31 8e-35 MNLKKLVAMGLVFGAMATAALAAEYELKMGMTAGTSSNEYKAAQFFAKKLKEKSKGEIEL KLYPDAQLGKNDLDMMGQLEGGVLDFTFAEMGRFSTFYPEAEVYTLPYMMKNFKHMQKAT 
FGTNFGKQLLKKIETKKNIIVLSQAYNGTRQTSSNKAINSIKDMKGLKLRVPNAPANLAF AKYSGAAPTPMAFSEVYLALQTNSVDAQENPLSAVKAQKFYEVQKYIAMTNHILNDQLYL VSAATMEDLPSNLQKVVKEAAVEAAKYHTQLFEKEEASLKDFFKTKGVKITEPKLDEFRA AMKPFYDQYTKKNGKLGQQALKEIQAAAK >gi|224461473|gb|ACDD01000029.1| GENE 14 10670 - 11677 1156 335 aa, chain - ## HITS:1 COG:FN1471 KEGG:ns NR:ns ## COG: FN1471 COG1609 # Protein_GI_number: 19704803 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 333 1 333 333 261 41.0 1e-69 MMTQKKIAEMLGISRTTVARALQENSSIKEETRQRVLQLVRETQYEKNYLGSSLAGRKKI VHAFVVKSKNEFYTKEIQRGMQKIQKKYAKYRLEIRVHLHDINQPQEQVAMLENILSQQE QMDGLLIVPLEKNKIYSLLKPYLTKIPVISLTMQLHSDIPHVGTDYHRQGRIVANILSYC LREGESLVILDNGDDKLSTQDYLNGFLERISEEKLDILGPYRCHGIQESVDLLKDLSSKK KIRAVFSNRYAQNIIRELPDSWFLEKNIVVNGMSEGIQQLLLEKKVIATVTEEVYEEAAF AGKLLFQILYQNKKVQKMWSKTNSKIIYLENLEKK >gi|224461473|gb|ACDD01000029.1| GENE 15 11681 - 12790 1355 369 aa, chain - ## HITS:1 COG:FN1470 KEGG:ns NR:ns ## COG: FN1470 COG3055 # Protein_GI_number: 19704802 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 366 1 367 372 482 66.0 1e-136 MKKLFSILFLTCSCLSLAEHRISWEAVGELPAQKSYEKNIGTAGLLQGMIDDYVVVGGGA NFPIKPLTEGGAKVTHKDIYLLKENKKGLEVLEQMQLDTPIGYGASVSTGKEIYYLGGSP EAAHNKDVLKVSVENGKMKVEKVADLMLGFENGVATYQNGKIYYGVGKIENEEGKLKNSN RFFVLDLQTGENKELASFPGEARQQTVGQVLNGKFYVFSGGSSVSYIDGYAYDFEKNVWE KAADVVVDGERILLLGANSIKVSENEMLVIGGFNYYLWNEANDKLSNLKDKELADYKAQY FGKEPFRYEWNRKVLVFNAKENTWRSIGEVPFDAPCGAALLKHGKMMYSINGEIKPGVRT PRIFRGEFR Prediction of potential genes in microbial genomes Time: Fri May 20 01:55:04 2011 Seq name: gi|224461472|gb|ACDD01000030.1| Fusobacterium sp. 3_1_5R cont1.30, whole genome shotgun sequence Length of sequence - 6372 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 21 - 560 614 ## FN0731 hypothetical protein 2 1 Op 2 1/0.000 - CDS 577 - 1728 1469 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 1763 - 1822 3.2 3 1 Op 3 20/0.000 - CDS 1845 - 2222 676 ## COG0822 NifU homolog involved in Fe-S cluster formation 4 1 Op 4 1/0.000 - CDS 2267 - 3439 1534 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 5 1 Op 5 . - CDS 3500 - 4141 709 ## COG0177 Predicted EndoIII-related endonuclease 6 1 Op 6 . - CDS 4134 - 4595 488 ## FN0056 acetyltransferase (EC:2.3.1.-) 7 1 Op 7 . - CDS 4606 - 5814 857 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 8 1 Op 8 . - CDS 5877 - 6281 376 ## COG0824 Predicted thioesterase - Prom 6311 - 6370 2.0 Predicted protein(s) >gi|224461472|gb|ACDD01000030.1| GENE 1 21 - 560 614 179 aa, chain - ## HITS:1 COG:no KEGG:FN0731 NR:ns ## KEGG: FN0731 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 179 1 177 177 86 33.0 3e-16 MKKLGILALMFSLSLCIFAGPFTVKDIPRDVEREIFDSFSGSGEDRRRNIEDAKEAYIRL QNKAYDSDIPKEDLEVIIVRLHQMYGTNFQKQSGEFDREVAQYKDMVRRVEEKVKAETQK VELENQKAKKEIEVLSHNHGMTQALYQEILAKAEEKYPNNFVAQRYFIEGAIEFSKIKK >gi|224461472|gb|ACDD01000030.1| GENE 2 577 - 1728 1469 383 aa, chain - ## HITS:1 COG:FN0060 KEGG:ns NR:ns ## COG: FN0060 COG1686 # Protein_GI_number: 19703412 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Fusobacterium nucleatum # 33 376 1 352 368 278 47.0 2e-74 MKKKWLVAGFLWGISLFLQAGEIREIQTIDQILQEETTPVVEIQKVETKKVEIKDTEIKL PEVKVVEQKVEEKKKEIVKEKKIVPKKVQEVKEVKVPEKKKEKEKPVKVEKIIKEIKEVK EEKKQERDTVLSKEDTYLAGLVADTRGNIYYSKNIDKKLPMASVTKVMTLLVTFDAIRNG EAHFDDKVVITKDVYNKGGSGISMKPGETFTLLDLIRATAIYSANNAAYAVAKHIGKGSI PNFIKKMNKKAREVGVSKEISYYSPAGLPTRYTKEPMDIGTARGIYKLSLEAIKYPEYME IAGIKQMKIHNGRISIRNRNHLIGEEGIYGIKTGYHKEAKYNITVASKDGQREFIVVILG GNSYKDRDNAVLHLLDKVKRELR >gi|224461472|gb|ACDD01000030.1| GENE 3 1845 - 2222 676 125 aa, chain - ## HITS:1 
COG:FN0059 KEGG:ns NR:ns ## COG: FN0059 COG0822 # Protein_GI_number: 19703411 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Fusobacterium nucleatum # 1 125 4 128 128 203 84.0 8e-53 MQYTEKVMNHFMNPHNVGVIENPDGYGKVGNPSCGDIMEIFLKIDNDIITDVKFRTFGCA SAIASSSVSTDLVLGKTVEEALQITNKKVVEALGGLPAVKMHCSVLAEEAIKLAIEDYMA KKENK >gi|224461472|gb|ACDD01000030.1| GENE 4 2267 - 3439 1534 390 aa, chain - ## HITS:1 COG:FN0058 KEGG:ns NR:ns ## COG: FN0058 COG1104 # Protein_GI_number: 19703410 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Fusobacterium nucleatum # 1 390 1 390 397 578 75.0 1e-165 MKVYLDNNATTKVDPAVFEAMVPYLTEYYGNSSSLHLFATETSQALNEARNTIARILKAK TSEIIFTASGSEADNLAIRGVAKAYKHRGKHIITSTIEHPAVKNTYLDLAEEGFEITMVP VDENGVLKLEELKKAIREDTILISVMHANNEVGAFQPVEEIAKIAKEHRILFHVDAVQTM GKLTIHPEEMGIDLLSFSGHKFHAPKGIAALYIRNGVRFGKVLTGGSQENKRRPGTSNVA FAVGMAKALDMAVSNMNEEWKREEELRNYFEEELLKRIPEIVVNAKSVKRLPGTSSITFK YLEGESILLTLSSKGIAVSSGSACSSDSLQPSHVLLAMSIPAECAHGTIRFSLGKYNTKE EIDYTIEAVVETVTRLRSISPLWNAFQNNK >gi|224461472|gb|ACDD01000030.1| GENE 5 3500 - 4141 709 213 aa, chain - ## HITS:1 COG:FN0057 KEGG:ns NR:ns ## COG: FN0057 COG0177 # Protein_GI_number: 19703409 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Fusobacterium nucleatum # 14 205 1 192 201 326 80.0 2e-89 MDKKQRVREVLKRLEEKFGKPKCALDFKSPFELLVAVILSAQCTDVRVNIVTKQMFPHVN TPEQFANMEVEEIEEWIRSTGFYHNKAKNIKKCSQQLLELYHGEVPQDMEQLVNLAGVGR KTANVVRGEIWGLADGITVDTHVRRLSNLIGFVKEEDPIRIERELMKIVPKKSWIDFSHY LILQGRDTCIARRPRCNQCEISEFCKGKKIIDK >gi|224461472|gb|ACDD01000030.1| GENE 6 4134 - 4595 488 153 aa, chain - ## HITS:1 COG:no KEGG:FN0056 NR:ns ## KEGG: FN0056 # Name: not_defined # Def: acetyltransferase (EC:2.3.1.-) # Organism: F.nucleatum # Pathway: Tyrosine metabolism [PATH:fnu00350]; 1- and 2-Methylnaphthalene degradation 
[PATH:fnu00624]; Benzoate degradation via CoA ligation [PATH:fnu00632]; Limonene and pinene degradation [PATH:fnu00903] # 4 133 5 138 159 65 35.0 8e-10 MDIKLRKMEERDIPTIYQYIHKKYVKKYYEKEEEKQWQAHRNWYCFVLNSNSYFFYIIER EQEFIGTVRYELEEEKAIVSIYIREEYRNQGYAKLALLESISCLLKEVEVEGIFAHILQE NECSQQVFLHCGFQKYKKEVYWKEIKVREDRNG >gi|224461472|gb|ACDD01000030.1| GENE 7 4606 - 5814 857 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 7 395 12 410 418 334 42 9e-92 MENVFHVLQERGYLKQFTHEEEIKALLEKEKVTFYIGFDPTADSLHVGHFIAMMFMAHMQ KYGHRPIALIGGGTGMIGDPSGRTDMRTMMTRETVQHNIDCIKKQMEKFIDFSDGKAILA NNADWLWDLNYIEFIRDIGSHFSVNRMLAAECFKSRMENGLSFLEFNYMLMQGYDFLVLN KKYGCVLELGGDDQWSNMIAGVELIRKKEQKSAFAMTCTLLTNKEGKKMGKTAKGALWLD PEKTSPYEFYQYWRNVDDADVEKCLSLLTFVPMEEVRRLVSFQDERINEAKKVLAFEITK MIHGEEEALKSQKAAEALFSGGADLTTVPKLEVSIGEELLNVLVENKVLKTKSEGRRLMQ QGAMTLENVKMSDPAYVITEDSFSGDALLKLGKKKFYQLVRK >gi|224461472|gb|ACDD01000030.1| GENE 8 5877 - 6281 376 134 aa, chain - ## HITS:1 COG:FN1881 KEGG:ns NR:ns ## COG: FN1881 COG0824 # Protein_GI_number: 19705186 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 1 130 1 129 129 117 47.0 4e-27 MFQYTYQIQKEDINHGNHVGNERALVFFKKAREAWLAEKNYSELSIGEGCGIIQKSAGIE YRKQIFLQDTIDVNIIKIEVEKLFFTFFYQIYNQKGELCVEGNTKMLAYDYKNQKVRKIP NHFLKRIEEYGMES Prediction of potential genes in microbial genomes Time: Fri May 20 01:55:44 2011 Seq name: gi|224461471|gb|ACDD01000031.1| Fusobacterium sp. 3_1_5R cont1.31, whole genome shotgun sequence Length of sequence - 112676 bp Number of predicted genes - 109, with homology - 102 Number of transcription units - 37, operones - 18 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 17.7 1 1 Tu 1 . + CDS 100 - 2202 1742 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 2207 - 2239 3.2 - Term 2195 - 2227 4.0 2 2 Op 1 . 
- CDS 2233 - 2295 111 ## 3 2 Op 2 1/0.286 - CDS 2297 - 2800 436 ## COG1309 Transcriptional regulator 4 2 Op 3 . - CDS 2831 - 4222 2164 ## COG2067 Long-chain fatty acid transport protein - Prom 4249 - 4308 12.5 - Term 4242 - 4289 0.3 5 3 Op 1 . - CDS 4310 - 6241 2136 ## COG3855 Uncharacterized protein conserved in bacteria 6 3 Op 2 1/0.286 - CDS 6252 - 8786 3188 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase 7 3 Op 3 . - CDS 8807 - 9394 759 ## COG0517 FOG: CBS domain - Prom 9554 - 9613 10.9 - Term 9632 - 9661 2.1 8 4 Op 1 1/0.286 - CDS 9674 - 10702 1512 ## COG1363 Cellulase M and related proteins 9 4 Op 2 . - CDS 10715 - 12229 2035 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 12261 - 12320 7.7 10 5 Tu 1 . - CDS 12322 - 12783 628 ## Smon_1033 hypothetical protein - Prom 12809 - 12868 5.4 - Term 12860 - 12907 7.2 11 6 Tu 1 . - CDS 12914 - 15451 3132 ## COG0474 Cation transport ATPase - Term 15468 - 15517 14.2 12 7 Op 1 . - CDS 15529 - 15612 74 ## 13 7 Op 2 . - CDS 15652 - 16467 1150 ## Dtox_1618 4Fe-4S ferredoxin iron-sulfur binding domain protein 14 7 Op 3 12/0.000 - CDS 16479 - 17822 1685 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase 15 7 Op 4 4/0.286 - CDS 17806 - 18501 841 ## COG0132 Dethiobiotin synthetase 16 7 Op 5 . - CDS 18491 - 19474 1140 ## COG0502 Biotin synthase and related enzymes 17 7 Op 6 . - CDS 19510 - 20490 1333 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase - Term 20505 - 20548 6.2 18 8 Op 1 . 
- CDS 20569 - 21159 869 ## COG3291 FOG: PKD repeat 19 8 Op 2 2/0.286 - CDS 21177 - 22325 1620 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB 20 8 Op 3 4/0.286 - CDS 22337 - 23662 1671 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Term 23684 - 23736 3.2 21 9 Op 1 1/0.286 - CDS 23741 - 24535 1131 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 22 9 Op 2 1/0.286 - CDS 24564 - 25802 1375 ## COG0786 Na+/glutamate symporter 23 9 Op 3 3/0.286 - CDS 25824 - 27581 2651 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 24 9 Op 4 21/0.000 - CDS 27604 - 28401 1235 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit 25 9 Op 5 1/0.286 - CDS 28403 - 29368 1299 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit 26 9 Op 6 9/0.000 - CDS 29427 - 30569 1858 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 27 9 Op 7 . - CDS 30587 - 31003 753 ## COG0511 Biotin carboxyl carrier protein 28 9 Op 8 . - CDS 31047 - 31352 492 ## gi|257452201|ref|ZP_05617500.1| hypothetical protein F3_03985 29 9 Op 9 . - CDS 31373 - 33358 2231 ## COG3711 Transcriptional antiterminator - Prom 33420 - 33479 8.2 30 10 Tu 1 . - CDS 33532 - 35040 1091 ## COG1404 Subtilisin-like serine proteases - Prom 35067 - 35126 3.6 - Term 35050 - 35097 10.1 31 11 Op 1 . - CDS 35128 - 36309 1686 ## COG0786 Na+/glutamate symporter 32 11 Op 2 . - CDS 36337 - 37332 1575 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 37413 - 37472 12.3 33 12 Tu 1 . - CDS 37478 - 37546 70 ## - Prom 37743 - 37802 10.4 34 13 Tu 1 . + CDS 37796 - 39058 1635 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 39079 - 39126 5.4 35 14 Op 1 . - CDS 39114 - 40763 1767 ## Lebu_0003 protein of unknown function DUF1703 36 14 Op 2 . 
- CDS 40823 - 41731 906 ## COG3586 Uncharacterized conserved protein 37 14 Op 3 . - CDS 41757 - 42125 450 ## gi|257452209|ref|ZP_05617508.1| hypothetical protein F3_04025 38 14 Op 4 . - CDS 42139 - 42402 276 ## gi|257452210|ref|ZP_05617509.1| hypothetical protein F3_04030 39 14 Op 5 27/0.000 - CDS 42405 - 43916 1168 ## COG0732 Restriction endonuclease S subunits 40 14 Op 6 5/0.286 - CDS 43928 - 45355 1780 ## COG0286 Type I restriction-modification system methyltransferase subunit 41 14 Op 7 . - CDS 45368 - 48634 3339 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 42 14 Op 8 . - CDS 48662 - 49477 1107 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase 43 14 Op 9 1/0.286 - CDS 49491 - 50243 1018 ## COG0708 Exonuclease III 44 14 Op 10 . - CDS 50254 - 51456 1350 ## COG1160 Predicted GTPases 45 14 Op 11 . - CDS 51453 - 52850 1740 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 46 14 Op 12 . - CDS 52894 - 53940 1308 ## COG0502 Biotin synthase and related enzymes 47 14 Op 13 . - CDS 53934 - 54200 393 ## CLD_0905 hypothetical protein 48 14 Op 14 . - CDS 54197 - 54664 651 ## COG0350 Methylated DNA-protein cysteine methyltransferase 49 14 Op 15 . - CDS 54674 - 56053 1877 ## COG1362 Aspartyl aminopeptidase - Prom 56134 - 56193 12.8 + Prom 56132 - 56191 15.2 50 15 Tu 1 . + CDS 56236 - 56418 260 ## gi|257452222|ref|ZP_05617521.1| hypothetical protein F3_04090 + Term 56440 - 56487 9.4 - Term 56433 - 56470 3.0 51 16 Op 1 1/0.286 - CDS 56511 - 57089 889 ## COG0279 Phosphoheptose isomerase 52 16 Op 2 . 
- CDS 57103 - 57975 925 ## COG0583 Transcriptional regulator 53 16 Op 3 3/0.286 - CDS 58021 - 60684 3180 ## COG0525 Valyl-tRNA synthetase 54 16 Op 4 4/0.286 - CDS 60668 - 61282 682 ## COG0218 Predicted GTPase 55 16 Op 5 18/0.000 - CDS 61293 - 63605 2721 ## COG0466 ATP-dependent Lon protease, bacterial type 56 16 Op 6 24/0.000 - CDS 63615 - 64886 267 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 57 16 Op 7 29/0.000 - CDS 64895 - 65482 928 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Term 65549 - 65602 12.2 58 17 Op 1 1/0.286 - CDS 65616 - 66905 1943 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 59 17 Op 2 1/0.286 - CDS 66916 - 68592 1744 ## COG0608 Single-stranded DNA-specific exonuclease 60 17 Op 3 32/0.000 - CDS 68597 - 68965 503 ## COG0858 Ribosome-binding factor A 61 17 Op 4 15/0.000 - CDS 68981 - 71122 3230 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 62 17 Op 5 22/0.000 - CDS 71139 - 71660 446 ## PROTEIN SUPPORTED gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae 63 17 Op 6 32/0.000 - CDS 71662 - 72723 660 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 64 17 Op 7 . - CDS 72740 - 73201 632 ## COG0779 Uncharacterized protein conserved in bacteria - Prom 73272 - 73331 12.3 + Prom 73272 - 73331 12.9 65 18 Tu 1 . + CDS 73415 - 73846 487 ## COG2185 Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) + Term 74085 - 74122 2.3 66 19 Op 1 . - CDS 73848 - 74867 1077 ## COG2855 Predicted membrane protein 67 19 Op 2 . - CDS 74871 - 75272 621 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 75279 - 75318 6.1 68 20 Op 1 . - CDS 75327 - 76772 2299 ## COG4145 Na+/panthothenate symporter 69 20 Op 2 . - CDS 76766 - 77026 222 ## FN0686 integral membrane protein 70 20 Op 3 . 
- CDS 76998 - 77801 1095 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 71 20 Op 4 1/0.286 - CDS 77828 - 78577 269 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 72 20 Op 5 4/0.286 - CDS 78577 - 79278 313 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 73 20 Op 6 49/0.000 - CDS 79279 - 80055 902 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 74 20 Op 7 38/0.000 - CDS 80057 - 80971 796 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 75 20 Op 8 . - CDS 80984 - 82471 1889 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 82569 - 82628 9.5 + Prom 82545 - 82604 4.5 76 21 Tu 1 . + CDS 82625 - 84019 1209 ## FN0687 hypothetical protein - Term 83872 - 83920 5.1 77 22 Op 1 11/0.000 - CDS 84011 - 85120 855 ## COG1088 dTDP-D-glucose 4,6-dehydratase 78 22 Op 2 11/0.000 - CDS 85165 - 86577 1424 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 86641 - 86700 9.7 79 23 Op 1 . - CDS 86848 - 87711 662 ## COG1209 dTDP-glucose pyrophosphorylase 80 23 Op 2 . - CDS 87728 - 88432 818 ## COG1083 CMP-N-acetylneuraminic acid synthetase 81 23 Op 3 . - CDS 88444 - 88950 509 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 82 23 Op 4 . - CDS 88997 - 89929 797 ## CGSHiGG_07890 N-acetylneuraminic acid synthase-like protein - Prom 90070 - 90129 16.6 + Prom 90026 - 90085 14.0 83 24 Tu 1 . + CDS 90105 - 91244 728 ## COG0438 Glycosyltransferase - Term 91216 - 91255 3.0 84 25 Op 1 11/0.000 - CDS 91265 - 92215 855 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 85 25 Op 2 . - CDS 92212 - 93171 714 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 93265 - 93324 8.5 + Prom 93155 - 93214 7.7 86 26 Tu 1 . + CDS 93297 - 94484 804 ## COG0438 Glycosyltransferase + Term 94720 - 94757 -0.9 - Term 94328 - 94378 -0.9 87 27 Op 1 . 
- CDS 94506 - 95270 343 ## FN1240 lipopolysaccharide core biosynthesis protein RfaY 88 27 Op 2 . - CDS 95246 - 96004 841 ## COG0726 Predicted xylanase/chitin deacetylase 89 27 Op 3 . - CDS 95955 - 96059 62 ## 90 27 Op 4 . - CDS 96093 - 96272 272 ## gi|257452261|ref|ZP_05617560.1| hypothetical protein F3_04285 91 27 Op 5 . - CDS 96256 - 97329 1284 ## COG0859 ADP-heptose:LPS heptosyltransferase 92 27 Op 6 . - CDS 97336 - 98979 1136 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 99132 - 99191 12.0 + Prom 99082 - 99141 12.3 93 28 Tu 1 . + CDS 99164 - 100045 1046 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Term 100054 - 100097 11.2 - Term 100049 - 100078 1.4 94 29 Tu 1 . - CDS 100090 - 100311 181 ## PG1526 hypothetical protein - Prom 100345 - 100404 3.2 + Prom 100054 - 100113 6.5 95 30 Tu 1 . + CDS 100298 - 100411 82 ## + Term 100592 - 100630 -0.4 - Term 100346 - 100379 0.0 96 31 Tu 1 . - CDS 100436 - 101734 1767 ## COG1362 Aspartyl aminopeptidase - Prom 101760 - 101819 6.7 + Prom 101706 - 101765 7.5 97 32 Tu 1 . + CDS 101795 - 101920 112 ## - Term 101823 - 101877 7.8 98 33 Tu 1 . - CDS 101892 - 103709 2191 ## COG0326 Molecular chaperone, HSP90 family - Prom 103752 - 103811 10.2 - Term 103817 - 103854 5.7 99 34 Tu 1 . - CDS 103864 - 105363 2243 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 105394 - 105453 9.8 + Prom 105381 - 105440 9.7 100 35 Tu 1 . + CDS 105534 - 106397 920 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 106408 - 106465 6.2 - Term 106404 - 106445 8.1 101 36 Op 1 . - CDS 106451 - 107431 1102 ## COG1186 Protein chain release factor B 102 36 Op 2 . - CDS 107478 - 107552 170 ## 103 36 Op 3 . - CDS 107562 - 107771 357 ## gi|257452271|ref|ZP_05617570.1| hypothetical protein F3_04335 104 36 Op 4 1/0.286 - CDS 107781 - 108125 387 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 105 36 Op 5 . 
- CDS 108122 - 108880 958 ## COG0084 Mg-dependent DNase 106 36 Op 6 1/0.286 - CDS 108867 - 109727 1164 ## COG1281 Disulfide bond chaperones of the HSP33 family 107 36 Op 7 1/0.286 - CDS 109739 - 110164 559 ## COG1555 DNA uptake protein and related DNA-binding proteins - Prom 110189 - 110248 4.4 108 37 Op 1 1/0.286 - CDS 110250 - 111632 1891 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 109 37 Op 2 . - CDS 111635 - 112603 916 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT Predicted protein(s) >gi|224461471|gb|ACDD01000031.1| GENE 1 100 - 2202 1742 700 aa, chain + ## HITS:1 COG:FN0831 KEGG:ns NR:ns ## COG: FN0831 COG1629 # Protein_GI_number: 19704166 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 4 700 2 698 698 930 71.0 0 MRRKKALFLSFLLCNLVAFGEKTIQLPESNIQSDYIEINKMKNTKHIIVIEKKDIQEKGY TNFSSILQDIPSIHVGTTAWGEIDIRGQGEGNAGKNIQVLVDGAPITTLVNHPIQTNYDV VPVENIERIEIIPGGGSIIYGSGTAGGVINITTNLSKLQKIDNHVEVSAGNGGEKYNVSF GYPITKKLNAQISYLRDNQNLYFKNTYRNSDYFTAGIFYQVAQNQSLSLRYSTLSEKGKF VRNINYNKLQEYGKNYKPDPQKITLGLDKDGHKIEGYLDGYSNAKRDFDSINASYRLQFK EGSSYLLDAFYNKGNFSNMALSDETMYHHTYGFKNKLDIPYAKNTIFEGSSLLLGIDSYQ QDASLEYNDYKVKNWKKKIYTTKPLSFHYKKRTNAFYLLNTLKYGNWESSQGIRRDYTYW NFDKIAAKNDGKDTSHRHNTNYELSLAYKYRETGRVYARYERGFTSPDGLEITDDFSKGK IHPTQGEDEIYDLYEIGWKEYLGFTTINLTAFYNKTDNEMSRNYILSDELGFGRKTINIL KTKRKGLELSFSQKFGNLSLKESYAYLKGKREYNGKEGKFLSPNNYIDWTNTGLKKVPKH SLTLEANYQFTPRISGEIRYKYNGKYSNFSSLDQKEEEGYIKSHSVTDLSLHYHHENGLH LYGGINNLFNEKYFEYAGSRIYTVIPAEERTFFLGAKYKF >gi|224461471|gb|ACDD01000031.1| GENE 2 2233 - 2295 111 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLRGPGYLIHKKLEEYGKQ >gi|224461471|gb|ACDD01000031.1| GENE 3 2297 - 2800 436 167 aa, chain - ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 143 
1 143 188 130 44.0 9e-31 MARKVVFDRERIIEKAFKMLKKEGMEAITARKLGDYMNASPAPIYNSFRSMEELKEVLVE KAKALFLDYIQNNRTELPFLNMGLGFCIFAKEESNLFRNIFLNPNIEGNIIEQFREISQQ EIIKDSRFDNISEDRRTEIFFDCLDLCSRTSKFYCLRTNCCNRSRVN >gi|224461471|gb|ACDD01000031.1| GENE 4 2831 - 4222 2164 463 aa, chain - ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 227 463 42 273 273 220 51.0 5e-57 MNFKLKCMAVASLLSISAYGASIDHIQTYAPEYLGNQAQNGAINGVSPFYNPAGTTQLEE GLYVNGGLQIAAGHEQSEYKGKEYKAIFVQPVPNIAITKVNKGEATYFNFSAIAGGGTLN YKHGVVGTAIIPDLVSKLHTGVLTREEKGLPIKVDVLDGTSAKGSNLYAQMTLGKAYQIN DKLSLSGGVRFVHGHRSLKGHIAVKAYTGNKFVDNKVLNSALKTATLEADVDSKRTADGF GFVLGANYKANEKWNIGMRYDSRVKLNFKAETTEKQISIPIIKGFEPIGFTSNLYYPEYK DGKKQRRDLPAILALGTTYQVTDKWMTGLSANYYFNKDAKMDGQKYNNGFEVAFGNEYKL NDKWAVLGSINYAKTGALKDSYNDIEYALDSVMLGTGLKYQYSPTLELTASVAHYFYKSE EGNIKGKVAEKASPMMKKLQNVNEQQKYKKSITAFGLGFTKKF >gi|224461471|gb|ACDD01000031.1| GENE 5 4310 - 6241 2136 643 aa, chain - ## HITS:1 COG:FN0798 KEGG:ns NR:ns ## COG: FN0798 COG3855 # Protein_GI_number: 19704133 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 2 641 3 643 645 914 69.0 0 MSELKYLELLSQSFPNIAETSTEIINLQAILNLPKGTEHFLTDVHGEYEAFSHVLRNGSG SIRQKIEDIFQDTLTELEKKELATVIYYPDGKYDMALEEQPNMNKWIRTIIYRLLKVCKN VSSKYTRSKVRKAMPKDFEYIMQELLYESREEDNKRDYVESIIDTLISISYTKQFIVAMS ELIRRLTIDHLHLVGDIYDRGPAPHLIMDCLLDYHHVDIQWGNHDMLWIGAGVGNKACIA NVIRICCRYNNNDILEEAYGINLLPLATFAMKYYGKDPCKSFRPKEGMDSDLVAQMHKAI SIIQFKVEGLFSERNPNLQMKDREILKEINYERGTILWQGKEYPLNDTFFPTIDPKNPLE LLDEEAELLDRLKDSFMNSEKLQRHLRFLFSHGSLYLCCNSNLLYHACVPLTKDGKLAEV EIEGVKYKGKAYLDKVDTIARQAFFDRVGNEKDKRNRDFLWYLWCGELSPLFGKDVMRTF ERYFIDDKSTHEEHKNPYYTFINQEETCNMILSEFGLNPKISHIINGHVPVKVKKGESPV KANGKLFVIDGGFARAYQKTTGIAGYTLIYNSYGIKLVSHAPFESKEKALKEGADILSSV VIEDKIVQRKRVKDTDIGKKLQGQVNDLKKLLLAYRKGIIQVK 
>gi|224461471|gb|ACDD01000031.1| GENE 6 6252 - 8786 3188 844 aa, chain - ## HITS:1 COG:FN0796 KEGG:ns NR:ns ## COG: FN0796 COG0574 # Protein_GI_number: 19704131 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Fusobacterium nucleatum # 1 844 1 851 851 1191 70.0 0 MKQVYEFREGGKDLIPLLGGKGGNLAEMTKIGLAIPNGIIVTTDACREYFRNGKKISEEL RNEILEKLETIQKKPLLVSVRSGAPISMPGMMDTILNVGFNDAVAEEVLASIKDETFVYS SYARFISMFSEIVQGVEKKKFDKIAEETKNPKDLIPLYKALYEKETGEKFPEEVKEQILM AVNSIFNSWNNERAILYRKLNNIDDNMGTAVVVQEMVFGNFNNKSGTGVVFSRNPSTGEK QIFGEYLICAQGEDIVAGIRTPEPIAKLQEEMPKVYEELLENIHKLEQHNKDMQDIEFTI QDEKLYILQTRNGKRAPKAAVKIAIDMQEEGIISKEDAVLRVDPSLVNQLLNGDFEEKAV KEATLLGKGLAASSGVAVGRVMFDSKRVKIREKTILVREETSPEDLKGIALAQGILTVKG GATSHGAVVARGMGKCCITGCGAIKINEIDREMYIGGRTVKEGEFISISGYTGEIYLGKV AIKEASYDDNLKKILSWAYEIKRLQVRMNADTPEDVKMGKDFGAEGIGLCRTEHMFFQKD KIWAIRQVILGEEGEEKNKAIEKLFELQKEDFMGIFKNLNGDVANIRLLDPPVHEFLPKE KADKIIMAKNLGIHLYDLEIRIRKLKDENPMLGHRGCRLGVSYPRLYKAQGRAIIEAALD CRKEGHPVHPEIMIPFTMEAKELAYLRKEITEEIECLFEERQERLDYKLGTMIEIPRACL LANEIAEVADFFSFGTNDLTQMSMGLSRDDSVKFLDQYREKGIWEGEPFYSIDQKAVGKL VEYGTRLGREANKNLTVGICGEHGGDPKSIEFFERQGFDYISCSPFRVPSAILAAAQSYL KNRK >gi|224461471|gb|ACDD01000031.1| GENE 7 8807 - 9394 759 195 aa, chain - ## HITS:1 COG:FN0795 KEGG:ns NR:ns ## COG: FN0795 COG0517 # Protein_GI_number: 19704130 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Fusobacterium nucleatum # 1 195 1 193 198 186 50.0 3e-47 MELTERQEKILELIKENSPISGEEIAQNLGVTRSALRTDFSILRKMSFISAKQNHGYCFV GEEPKNKIGQVMGEPKQMDSKSSVYETIVYMFENDIGSVFITENKNVLVGVVSRKDLLKA ALGNKDLEKLPIHMVMTRMPNLIYVTEQDSIKTAVEKIMKHQIDSVAVVKQEKEVCYLVG RFSKTNISKLYLETL >gi|224461471|gb|ACDD01000031.1| GENE 8 9674 - 10702 1512 342 aa, chain - ## HITS:1 COG:FN0999 KEGG:ns NR:ns ## COG: FN0999 COG1363 # Protein_GI_number: 19704334 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: 
Fusobacterium nucleatum # 1 342 4 347 347 507 72.0 1e-143 MNVDINYILDLTEELLSIPSPVGYTHLGIARIAEELDKFGIRYEYTKKGAILAFVEGENR EYRKMISAHIDTLGAVVRNVKANGRLELTNTGGYAWGSVEGENVLVHTLSGKVYEGTLLP VKASVHTYGDVARELPRIEENMEVRIDEDVKKAEDVLKLGILQGDFVSYETRTRRLANGY IKSRYLDDKLCIAQVFGYLKYLADTASKPKSDLYIYFSNFEEIGHGVSLFPEDLDEFISI DIGLAAADAHGDEKKVNIIAKDSRSPYDFVLRKKLVEAAEAADIPYTVSVNYRYGSDATT AILQGFDFKYACIGPSVDASHHYERTHNDGIIATVDLMIAYL >gi|224461471|gb|ACDD01000031.1| GENE 9 10715 - 12229 2035 504 aa, chain - ## HITS:1 COG:FN0998 KEGG:ns NR:ns ## COG: FN0998 COG0747 # Protein_GI_number: 19704333 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 502 1 499 500 611 60.0 1e-174 MKKVFTLLLGLLAVLFVACGEKQSNTEGQEKVVVVSQGAKPKSLDPYMYNEIPGLAVTRQ FYDSLFKKEDDGSITPLLAESYEYKTPTELWVTLRQGVKFHNGDILTVDDVLFSFQRMKE TPASAIMISDIEKVEAVDDKTFKIILKQSSAPLLFSLSHPLTSILNKKYVEEHQGNISTE PMGTGPYKFVSWGDGEKIEMAAFDDYFRGRAKVDKVIFREIIEDSSRLAALETGEIDIAY DMTAIDSGMIEAKDNLVLISEPTTAVEYICLNNQKSPFDNKLFRKALDYAIDRQSIVDSV YMGRAKITNSIVNPNVFGFYDGLNKFTFDPEKAKELIAESGIKNPKFTLSINEGSDRQQA AQIIQANLRDVGIDMQIQILEWGTYLQSTAEGKFEAFLGGWMSGTSDADIVLFPLLDTKS FGSAGNRARYSNPAFDKLVEDARSELDVAKRKELYKEAQLILQEDTPMTIMYAKNKNIGV NKRIKGFIYDPTNVHSLYTLEIAE >gi|224461471|gb|ACDD01000031.1| GENE 10 12322 - 12783 628 153 aa, chain - ## HITS:1 COG:no KEGG:Smon_1033 NR:ns ## KEGG: Smon_1033 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 2 153 3 148 148 130 47.0 1e-29 MKTILLGMLLLGSVAYAKVEDVLGTWITEKADTGNQIIVEIYQAQNGKYNGRVLELTMPI YTEGEYQGKERMDLQNPNPQLKHRKLVGIDFVSNFDYNEGKDKFENGNIYSPINGKTYHS YMQLQKDGRLLVKGSIDKSGLIGKKQYWTRYKK >gi|224461471|gb|ACDD01000031.1| GENE 11 12914 - 15451 3132 845 aa, chain - ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 6 845 21 862 
862 1028 64.0 0 MASVKGLTTEQVKKLQEQYGKNALIEEEKESIFLVFLKQFKDALVMILIAASIVSAVSGN IESTFVIILVLIVNAVIGTVQHVKAQKSMDSLRKLSAPKSKVMRDGNKVEIDAFDLVPGD LVFVEAGDIIPADAKIIESYSLLVNENSLTGESNSVEKSPSAEDMSDLPLGDRTNVVYSG SLVNYGRAVIQITKIGMETEIGNIAKLLGETKEKMTPLQKALDSFGKNLTIVILVLCALI FGIYVYHGNSIMESLLFAIALAVAAIPEGLNPIITIVLSLETQKLAKQNAIVKELKSVES LGSISIICSDKTGTLTQNKMTVKKLYLDAKVLQETALQAENTTHKMMLEECIFCSDATET VGDPTETALVVLAANYGHDVQALKEEHPRLSEIPFDSDRKLMSAVYAKEDKYIMYTKGAL DSLLPRLVKIDIDGEVRDITEADIERIKLVNEKFAEDGMRVLSFGYRYMKSKDITLFDEE KYVFLGLVGMIDPPREESIQAVAECRRAGIKPIMITGDHKITARTIARQIGIFEEGDLVL EGVDVEKLSQEELIEMVPKVSVYARVSPEHKIRIVSAWQSLWKICAMTGDGVNDAPALKR ADIGIAMGITGTEVSKNAASLILADDNFSTIVKAITIGRNIYRNIKNSIGFLLSGNMAAI LAVVYASFANLPVIFSAVQLLFINLLTDSLPAIAVGVEPGNEDVLDEKPRDPKEGILTRD FLQRISLEGILIMIFIVIAFHIGLAGGNALKGSTMAFSVLCLARLFHGFNYRGKRNIFAI GFLKNKMAIAAFFIGFVLLNGVLFTPALYKTFGIAALNLEQYLMIYVLAFFPTVILQIVK WIKYR >gi|224461471|gb|ACDD01000031.1| GENE 12 15529 - 15612 74 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIGAHEKGSSFKYSINNPDVNTSGTIA >gi|224461471|gb|ACDD01000031.1| GENE 13 15652 - 16467 1150 271 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1618 NR:ns ## KEGG: Dtox_1618 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: D.acetoxidans # Pathway: not_defined # 4 271 3 272 272 320 59.0 3e-86 MKVKKVWAAYFSATGTTEKVVTGLAKSLAKKMQVEFDCFDFTLPDVRKCETPFQEGDVVV FGTPTIAGRVPNVLLKYLATIEGRGALAIPISLYGNRNYDDCLIELRDILAKANFYPIAA GAFIGEHSFSRILGAGRPDEKDMAIVEEFAEKIVKKIETGDKTLIEVNGTPDPYRWYYQP RDRQGNPVDIRKVKPLTNDKCTDCKICAKVCPMGSISFENVREIPGICIKCCACIKKCPE NAKYYEDAGYLYHQHELEEGYTRRAEPEYFV >gi|224461471|gb|ACDD01000031.1| GENE 14 16479 - 17822 1685 447 aa, chain - ## HITS:1 COG:FN1002 KEGG:ns NR:ns ## COG: FN1002 COG0161 # Protein_GI_number: 19704337 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Fusobacterium nucleatum # 5 443 12 451 452 728 77.0 0 MKKRSELQEKDLQYIFHPCAQMKDFEENPPLVIQKGEGLYLIDEEGKRYMDCISSWWVNL 
FGHANRRINQVVMEQINNLEHVIFASFSHKPAIDLAEALVEVLPKGINKFLFADNGSSCI EMALKLSFQYHLQTGNPQKMKFISLENAYHGETIGALGVGDVDIFTQTYRPLIKEGRKVR VPYLDSRKSEEEFQKYEEECIQELRDLIESSHHEIACMIVEPMVQGAAGMLMYSANYLRQ VRELTKKYNIHLIDDEIAMGFGRTGKMFACEHAGITPDIMCLAKGLSSGYYPIALVCITT DIFNAFYADYKEGKSFLHSHTYSGNPLGCRIAVEVLKIFKEENILAMVQEKGAYLQAKME ELFEGKDYVKSYRRIGMIGAIEIHEIPGQERVGRKIAALALEKGVLIRPIGNIVYFMPPY IITKEEINTMLQVCKESIEEYLKATKN >gi|224461471|gb|ACDD01000031.1| GENE 15 17806 - 18501 841 231 aa, chain - ## HITS:1 COG:FN1001 KEGG:ns NR:ns ## COG: FN1001 COG0132 # Protein_GI_number: 19704336 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Fusobacterium nucleatum # 1 222 1 218 219 230 51.0 1e-60 MNTKGYFVIGTDTDIGKTFCSTLLYHGIKDKNGMYYKPVQSGGILKEGKLYAPDVLSLCQ FEGIPYQEDMVSYVLGPEVSPHLASEIEEKTLDLDKVRSHFQELCKKYDYLIVEGAGGLH VPLIRDKFYIYDLIREFNFPVILVSSAKVGSINHAVLTMESLEKLGIPLHGIIFNRVKNT EESKIYEQDNMNIILQKAPTKNHLVILEGKKEIPQEDLNLFLKGEANEETK >gi|224461471|gb|ACDD01000031.1| GENE 16 18491 - 19474 1140 327 aa, chain - ## HITS:1 COG:FN1000 KEGG:ns NR:ns ## COG: FN1000 COG0502 # Protein_GI_number: 19704335 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Fusobacterium nucleatum # 1 326 32 359 360 446 69.0 1e-125 MKEFIHQLKDRVLEGYLVTREDTAKLLSISIEKEEELKELLQAANEIREKFCGNFFNLCT ILNAKSGRCSENCRYCAQSAHFKTNADVYPLVSKEVALEAAKEVEVEGAHRFSLVTSGRG LQGKEEELDKLQEIYRYLKENTDLDLCASHGICSKEALQKLKDSGVKTYHHNLESSRRFY PTICTSHTFDDRVNTVKYAHEVGLQVCSGGIFGLGETEEDRIDMAFDLRELRVHSVPINI LTPIPGTPLENNKEIDPKELLKDIAIYRFILPKVSIRYAGGRVKLGEYAKLGLEGGVNSA LTGNFLTTTGNTIESDKKMIKELGYEY >gi|224461471|gb|ACDD01000031.1| GENE 17 19510 - 20490 1333 326 aa, chain - ## HITS:1 COG:FN1921 KEGG:ns NR:ns ## COG: FN1921 COG0340 # Protein_GI_number: 19705226 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Fusobacterium nucleatum # 1 235 1 231 234 202 45.0 7e-52 
MKIYPFEVLDSTNDYMKEHRETFQEFDVVMAKNQRAGKGRRGNIWISTEGMALFTFLVKK REQETDEKYMKLPLLAGLAVIRALKNRKELEYQFKWTNDIYLRNKKLAGILVERREDDFF IGIGMNVNNLIPLEIKNIAISLQEVYQETTEIESLIREIVLECEKLLEEYFSGQWEDILQ EINAMNYLKGKKIGLRAGNLFVQGIVQRIDENGELELLSQEGLQSFGIGEVVKERILIKL EKNLEIFVKAYILKEANYDVIAYTEEIFEGIWEERLAKLQVKVERNSSLEEMTQKYQAKS LEEYPDIFPLEYYEEEKIKEISKIFA >gi|224461471|gb|ACDD01000031.1| GENE 18 20569 - 21159 869 196 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 32 167 536 672 1734 67 35.0 2e-11 MCQNVWENDDFIFKGDELKGMTAKGKDKVKTQGLTDMIIPATTPEGVAIKRIGDNAFYRR GLTSVVIPDTVESIGYDAFGVCKLTEVKLPSALVGIEGFAFYRNKLKKVIFGDKVKKIEP SAFALNELEEIDLPEGLELIDTSSFYKNSLSSVKIPASVKKINMYAFHKNNIAEVEVPAG AQLHVYAFEANTEIKK >gi|224461471|gb|ACDD01000031.1| GENE 19 21177 - 22325 1620 382 aa, chain - ## HITS:1 COG:FN0208 KEGG:ns NR:ns ## COG: FN0208 COG1775 # Protein_GI_number: 19703553 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Fusobacterium nucleatum # 1 381 1 381 382 696 90.0 0 MEEMKELLEQFKYYANNPRKQLDKYLAEGKKAVGIFPYYAPEEIVYAAGVVPFGVWGGQG PIERAKEYFPTFYYSMALRCLEMALDGTLDGLSASMVTTLDDTLRPFSQNYKVSAGRKIP MIFLNHGQHRKEAFGKQYNARIFNKAKEELEKICDVTVTDENLKKAFVVYNENRAEKRKF IKLAASHPQTIKASDRCYVLKSSYFMLKDEHTAMLKKLNEKLAALPEEKWDGVRVVTSGI ITDNPGLLEVFDAYKVCVVADDVAHESRGLKVDIDLSIEDPMLALADQFARMDEDPILYD PDIWKRPKYVVDLAKENNADGCLLFMMNFNDTEEMEYPSLKQAFDAAKIPLIKMGYDQQM VDFGQVKTQLETFNEIVQLNRM >gi|224461471|gb|ACDD01000031.1| GENE 20 22337 - 23662 1671 441 aa, chain - ## HITS:1 COG:FN0207 KEGG:ns NR:ns ## COG: FN0207 COG1775 # Protein_GI_number: 19703552 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Fusobacterium nucleatum # 1 441 1 441 442 874 94.0 0 
MAGKVEKLPNKTPRPIEGHKPAAAVLRGVVDKVYAGAWEAKRRGELVGWSSSKFPIELAK AFDLNVVYPENHAASTAAKKDGLRLCQAAEDMGYDNDICGYARISLAYAAGEPTDSRRMP QPDFVLCCNNICNMMTKWYENIARIHNIPLIMVDIPFSNTVDTPEEKVDYLIGQFDYAIK QLEELTGKKFDEKKFEDACARANRTAAAWLKSCSYMSYKPSPLSGFDLFNHMADIVAARC DEEAAIGFELLAQEFEQSIAEGTSTWEYPEEHRILFEGIPCWPGLRHLYEPLKDNGVNVT AVVYAPAFGFRYNNIREMAAAYCKAPCSVCIETGVEWRETMAKQNGISGALVNYNRSCKP WSGAMPEIERRWKEDLGIPVVHFDGDQADERNFSTEQYKTRVQGLVEIMEERKQERLAKG EDVYTNFENTKETDWSKPTLK >gi|224461471|gb|ACDD01000031.1| GENE 21 23741 - 24535 1131 264 aa, chain - ## HITS:1 COG:FN0206 KEGG:ns NR:ns ## COG: FN0206 COG1924 # Protein_GI_number: 19703551 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Fusobacterium nucleatum # 4 258 5 259 265 431 89.0 1e-121 MSKFTMGVDVGSTASKCVILKDGKEIVAKAVISVGTGTSGPARAIKQALEEIGYHSIEQL DGAVATGYGRNSLEEVPAQMSELSCHAKGAYFLFPKVRTIIDIGGQDSKALKVGDNGMLE NFVMNDKCAAGTGRFLDVIAKVLEVDLNDLEKLDEQSKVDVAISSTCTVFAESEVISQLA RGTKIEDIVKGIHTAIASRVGSLAKRVGIKDQVVMTGGVALNQGMVRALEKNIGFKIHTS EYCQLNGAIGAALFAYQKCLQAEK >gi|224461471|gb|ACDD01000031.1| GENE 22 24564 - 25802 1375 412 aa, chain - ## HITS:1 COG:FN0205 KEGG:ns NR:ns ## COG: FN0205 COG0786 # Protein_GI_number: 19703550 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 1 408 1 413 419 485 70.0 1e-137 MERIVLELGMFETLALAVLAIYFGEFLRKQFPVLKRYCLPAAVVGGTVFALISMLLYSTN ICELSFDFKTVNSLFYCIFFAASGAAASLSLLKKGGKLVIIFAILAAVLAAGQNALALFV GKLMNVNPLISMMTGSIPMTGGHGNAAAFAPIAVEAGASAAMEVAIASATFGLISGCILG GPLGNFIIKRHRLENPALDGKDDVENMEEGTESSSAVFMDKASLVNAMFLMCIALGIGQI ATLLLKKVGVSFPIHVSCMLGGILIRLFYDRKKGNHDVLYEAIDTVGEYSLGLFVSMSII TMKLWQLSDLGGPLFVLLISQVIFIVIFCYLLTFNLLGRDYDAAVMAVGHSGFGLGAVPV SMTTMQTVCRKYRYSKLAFFVVPVIGGFISNISNAIIITKFLNIAKAMVGIG >gi|224461471|gb|ACDD01000031.1| GENE 23 25824 - 27581 2651 585 aa, chain - ## HITS:1 COG:FN0204 KEGG:ns NR:ns ## COG: FN0204 COG4799 # Protein_GI_number: 19703549 # 
Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Fusobacterium nucleatum # 3 585 2 584 584 973 81.0 0 MGNYSMPNYFQNMEQIGKELTRIDEQNEQQIKEVEAKIASLVDELHAAGTPDEKIAEKGQ LTALQRIAELVDEGTWCPLNSLYNPEDFETATGIVKGLGRINGKWAMVVASDNKKIVGAW VPGQSDNLLRASDTAKCLGIPLVYILNCSGVKLDEQEKVYANRRGGGTPFYRNAELQQAG IPVIVGIYGTNPAGGGYHSISPTILIAHKDANMAVGGAGIVGGMNPKGFIDQEGAEQIIE ATAKAKGVDVPGTVSIHYDQTGFFREVYAEEIGVLDAIRYYMDCLPSYNLEFFRVDEPME PALDPNDLYSILPMNQKKVYNIYDIIGRLVDNSEFSEYKKGYGPEMVTGIAKVDGLLVGI VANFQGLLMKYPEYKENAIGIGGKLYRQGLVKMNEFVTLCSRDKLPIIWLQDTTGIDVGN DAEKAELLGLGQSLIYSIQNSKVPQMEVTLRKGTAAAHYVLGGPQGNDTNAFSLGTAATE INVMNGETAATAMYSRRLVKDKKAGKDLTPTIDKMNKLINEYKEKSTPEYCAKTGMVDEI VNLYDIRAYMIAFANSAYQNPKAICAFHQMLLPRAIKEFNTYVKK >gi|224461471|gb|ACDD01000031.1| GENE 24 27604 - 28401 1235 265 aa, chain - ## HITS:1 COG:FN0203 KEGG:ns NR:ns ## COG: FN0203 COG2057 # Protein_GI_number: 19703548 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Fusobacterium nucleatum # 3 265 4 267 267 472 86.0 1e-133 MANYKNYTNKEMQAITIAKEITDGQIVIVGTGLPLIGASLAKRIFAPNCKLIVESGLMDC SPIEVPRSVGDCRLMAHCGVQWPNIRFIGFEANELLNGNDRMIAFIGGAQIDPYGNVNST CIGDYHHPKTRFTGSGGANAIATYSNTVIMMQHEKRRFIDQVDYVTSVGWGDGVGGREKL GLPGNRGPIAVVTDRGILRFDEKTKRMYLAGYYPTSSIEDIIENTGFEIDTSRAVLLEAP SEDVIKMIREEIDPGQAFIKVPVEE >gi|224461471|gb|ACDD01000031.1| GENE 25 28403 - 29368 1299 321 aa, chain - ## HITS:1 COG:FN0202 KEGG:ns NR:ns ## COG: FN0202 COG1788 # Protein_GI_number: 19703547 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: Fusobacterium nucleatum # 1 321 1 321 321 521 76.0 1e-148 MSKVMSLHDAIKTYVKSGDSICIGGFTTNRKPYAAVYEILRQGLGDFTGYSGPAGGDWDM LIGEGRVRNFINCYIANSGYTNVCRRFRHEVEKVGKMNLEDYSQDVIMYMLHASSLGLPF LPVKLMQGSDLVNKWGISKEVREKDPKLPNDKLVEIENPLVPGEKVVAVPVPRLDVALIH 
VQKASINGTCSIEGDEFHDIDIAIAAKHCIVTCEELVTEEEIRKDPSKNSIPQFCVDAVV HAPFGAHPSQCYNYYDYDADFYKMYDKVTKTEEDFKAFLQEWVYDIKDNEEYINKVGASR LAKLRVVPGFGYAAKLVKEAK >gi|224461471|gb|ACDD01000031.1| GENE 26 29427 - 30569 1858 380 aa, chain - ## HITS:1 COG:FN0201 KEGG:ns NR:ns ## COG: FN0201 COG1883 # Protein_GI_number: 19703546 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Fusobacterium nucleatum # 1 378 1 371 375 451 76.0 1e-127 MEFIKILEIMMAKSGFVALTWQSLVMFVISFILIYLAIVKQFEPLLLLPIAFGVFLTNLP LADLMKEADPWYASGVLRIIYNGIKSNLFPCLIFMGIGAMTDFGPLIANPISLLLGAAAQ FGIYVTFMFANSLPFFSAKQAAAIAIIGGADGPTSIYLANNLAPELLAPIAVAAYSYMAL IPLIQPPIMKLLTTKKERAVKMKQLRKISKVEKIVFPIGTVLFTTLLLPSVAPLLGMLML GNIFKESGVVQRLSDTAQNALINIVTIMLGVTVGATANGELFLRLETIAIIFMGLFAFCM STVGGVLLGKVLYLVTGGKINPLIGSAGVSAVPMAARVSQTVGASENPTNFLLMHAMGPN VAGVIGSAVAAGYFMLIFGR >gi|224461471|gb|ACDD01000031.1| GENE 27 30587 - 31003 753 138 aa, chain - ## HITS:1 COG:FN0200 KEGG:ns NR:ns ## COG: FN0200 COG0511 # Protein_GI_number: 19703545 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Fusobacterium nucleatum # 1 138 1 134 134 99 52.0 2e-21 MKYVVTVNGEKFEVEVERADGRSSGLSRRPMERGERAAAPVQKAAPVVEAPKATPAAAPA PAATSSGTANAVVSPMPGVILDLKVKEGDMVTVGQAVVVLEAMKMENEIVSEFAGKVTSI KVKKGDNVDTDAVLVEIQ >gi|224461471|gb|ACDD01000031.1| GENE 28 31047 - 31352 492 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452201|ref|ZP_05617500.1| ## NR: gi|257452201|ref|ZP_05617500.1| hypothetical protein F3_03985 [Fusobacterium sp. 
3_1_5R] # 1 101 1 101 101 149 100.0 8e-35 MITTEYIGFLESLFTSILGMAIVFMSLVFLAIFVMIVSKVIGSLEKTLLDKKSEAKVLTK PVEVPKKDNKEALKIAVITAAISEERREPVDRFVITNIQKI >gi|224461471|gb|ACDD01000031.1| GENE 29 31373 - 33358 2231 661 aa, chain - ## HITS:1 COG:FN0198 KEGG:ns NR:ns ## COG: FN0198 COG3711 # Protein_GI_number: 19703543 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Fusobacterium nucleatum # 10 527 1 522 660 288 37.0 2e-77 MALNTKHFEILKELKKEDDLKRVANIFNQTERNIRYKIQELNENLGQEKIFIKKRKIYCL LDEQDISSLIKGLNVQNYVYEQKERMDLLIIKTILHEDEFQIEEIADSLQMSKSTLRADI KILAEKLRKVGIHLEQYSNKKYRAQYKNNDLIYYLSIFLYNYVTFDEGRKAISFKRSNYF EKIVYEILTKMYFSVLEDSYQKIKSIDLPYTDETLNLLILLISVLKLRKLNSEDLEVLNK KVLKETKEFKVLRKTFPELSELNIYFLTDYLLRISCDEKEIFARHRNWIEIELGVYRLIK EFEHLKKVQLVKNKKLLDDILYYIKPLIYRSSKQIELKNTVLKEVKSIYGDTFYYLKKAF QSFETLLGLEVSDNEIGFLVPIFQVALRNRVRKAKKILVVSSYKRNLINFLLARLEEEFL VEIVNVISMKQLDNFQEEVDLIITTSDLSQMNLKLPFCRVSPILTESDRNHLEEFELPPQ DKNISLDTLMNVIERNLEGQKWSHLKLKEDLLQSFPNIIVDEKSQERKESLLIQKYQMKE LDVFDWKEAVKAAAEILWKHKDVKKAYMEDICNHLEEDALMFLLNENSALFYTEPKENVY HTGFSIVHVETPLLLKDKKIEYFVCFAPKGDAEDQNLLFQLNDFFEEENFENTLKSILRK K >gi|224461471|gb|ACDD01000031.1| GENE 30 33532 - 35040 1091 502 aa, chain - ## HITS:1 COG:FN2100 KEGG:ns NR:ns ## COG: FN2100 COG1404 # Protein_GI_number: 19705390 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Fusobacterium nucleatum # 84 446 4 383 416 212 35.0 1e-54 MTEYKKHYKIKLKKQNTKKEKDLQEERWEKFLKEKNISYEKKEFFPGYEIYKVGELSETL EKEIQENPEIDFIKPLHRYFLNFQTKQIETEKGTILKPKQGEKYPIVGVLDNGIAPLEEF ENWLYQDETSYCKEEIYPSHGTFVAGVILYGDTLSQEHWCGGREVQIFNAAVVPDFSVYQ LEEDELYERIYKAISEHSWIKVWNLAISIRFPVEKDRISDFGLLLDYLQGKYDILICKSC GNGNFVENGKEAGMILQGSDTERALVVAACNRDKVVSSFSLSGKGHKILQKPDIAMYGGD VFRNEEGKRKIEGVFSFSPEGEIVSSFGTSFATARMTRIAANLLFWKENSSSLFLKAMMV HAARGYEKYSLGYGCSLSSEEIYQEYQNSILEEGSLVEEESFLFYFSNHKIVATLTSDVV LDYHQEEEYILEDISWRIFYQGREITGENQLGNFEYFSSLKKLECEMKEENGEVKIVLFR RKKRKKTQENKEKLQYCLLWKK 
>gi|224461471|gb|ACDD01000031.1| GENE 31 35128 - 36309 1686 393 aa, chain - ## HITS:1 COG:FN0793 KEGG:ns NR:ns ## COG: FN0793 COG0786 # Protein_GI_number: 19704128 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 3 392 4 395 399 448 64.0 1e-126 MAFTLDMYQTLGLAIILLLLGNWIKSKVGVFQKYFIPAPVIGGFLFSILLLIGHSTGAFD FEFDSNLKNFFMVVFFTSVGFLASFSLLKKGGVGVALFLFTAIILVIIQNGVGVALAKAF GLNPGIGLAAGSIPLTGGHGTSGAFGPYLEERGVVGATVVAIASATYGLVSGCVIGGPIA KKLMEKFHLACVKDCEMKNTKAEEALVTEKSIFKAVCMIGIAMGLGACITPVIKEAGLSL PAYLIPMLIAAIMRNIVDGTANKTPINEISIVGNVCLSLFLSMALMSMKLWQLADLALPL ITILLIQTVIMGLFAYYVTFNIMGRDYDAAVMATGHCGFGMGATPNAIANMEAFTSVNGF STKAFFVIPLVGSLFIDFFNAVIIQTFTSIFVG >gi|224461471|gb|ACDD01000031.1| GENE 32 36337 - 37332 1575 331 aa, chain - ## HITS:1 COG:FN0487 KEGG:ns NR:ns ## COG: FN0487 COG1052 # Protein_GI_number: 19703822 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 1 331 1 337 338 470 68.0 1e-132 MKVIFYGVRDVEKPIFEAVNKKFGYDMTLIPEYLTDEATTRKAEGNDVVVLRGNCFATKE RLDIYKEMGVKYVMTRTVGTNHIDVPYAKSLGMKTAYVPFYSPNAIAELALSLAMSILRN VTYTGNKTKDKNFIVDKQMFSREVRNCTVGVVGLGRIGMTAAKLFKGLGAKVVGYDLFPK TGVDDIVTQVSMDELLAQSDIITLHAPYIKENGKVITKEAFAKMKDNVILINTGRGELVD TDALVEALESGKVYGAGIDTLDNEVSLFFKDFAGKELPTPAFEKLVAMYPKVIITPHVGS YTDEAALNMIETSFDNIKEYLETGACKNEIK >gi|224461471|gb|ACDD01000031.1| GENE 33 37478 - 37546 70 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYTHLIPKNIQNKTFAEKILKK >gi|224461471|gb|ACDD01000031.1| GENE 34 37796 - 39058 1635 420 aa, chain + ## HITS:1 COG:FN0488 KEGG:ns NR:ns ## COG: FN0488 COG0334 # Protein_GI_number: 19703823 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Fusobacterium nucleatum # 1 420 15 439 439 711 86.0 0 
MNKETLNPLLSAQAQVKKACDALGADPAVFELLKEPQRIIEISIPVKMDDGSIKTFKGYR SAHNDAVGPFKGGIRFHQNVNADEVKALSIWMSIKCQVTGIPYGGGKGGITVDPSELSQR ELEQLSRGWVRGMYKYLGEKVDVPAPDVNTNGQIMAWMQDEYNKLTGEQTIGVFTGKPLT YGGSQGRNEATGFGVAVTMREACKALGGDLAKSTVAVQGFGNVGRFTVKNIMKLGGKVVA VAEFEKERGAFAVYKEAGFTFDELLVAKEAGSITKVAGAKVITMEEFWALNVDAIAPCAL ENAITAKEAELITAKLICEGANGPITPEADEILYKKGITVTPDILTNAGGVTVSYFEWVQ NLYGYYWTEKEVEEKEERAMVDAFNPIWALKQEKNVSFRQATYMKSIKRIAEAMKVRGWY >gi|224461471|gb|ACDD01000031.1| GENE 35 39114 - 40763 1767 549 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 1 545 1 542 545 493 48.0 1e-138 MKKVLPIGITDFQELIQGNYYFIDKTKLIEDMLTFGAKVTLFTRPRRFGKTLNMSMLRYF WDIQNAEENKKLFQGLYIENSFCFAEQGKYPVIYLSLKDMKSTSWEDCLESMKLFIQNLF YQYRYILPKLDFFAKARFSKYVNGDSNLAELSFSLKFLTELLSEYYQVKVMLLIDEYDTP IVSAYENGYYEDAIAFFRNFYSAALKDNVNLQLGVMTGILRVAKEGIFSGLNNLAVYSVL DEKYSSYFGLTEKEVKDILDYYKLEHDIQKVKEWYDGYLFGNTEIYNPWSIISYVANQKI EAYWIGTSSNALINQMLEKARQEQSDIFETLEILFQGKSILQKIQKGSDFHDLIHVEEVW QLFLYSGYLTVGQELEQGFYQLKIPNKEVYSFFQESFIQKFLGNITNFSTLVTALVQKEW KKFEEILQMIVMNSFSYFDITMEDEKVYHVFMIGLLSVLQEQYYIHSNRESGYGRYDISI EPKDKTKSGFILEFKTAKSKEELEKRAREAFLQIEEKQYDVEMKERGIVDIVKLGIAFCG KTLKVEVQE >gi|224461471|gb|ACDD01000031.1| GENE 36 40823 - 41731 906 302 aa, chain - ## HITS:1 COG:BMEII0447_2 KEGG:ns NR:ns ## COG: BMEII0447_2 COG3586 # Protein_GI_number: 17988792 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Brucella melitensis # 113 300 1 193 195 126 37.0 4e-29 MSDINVFEIKPKVKELKGSPVVLEKEIQNLIEQNMEEFFGIRFLATEYSITNGRMDSIGI DENNCPVIFEYKRSSSENIINQGLFYLDWLLDHKADFQLLVMNVLGKEAAKEIDWSAPCV FCIAKEFTKFDEHAVNQMQRNIKLVKYNKYGENLMLFEHINVPVLKKDTVSKGKKVKHEK KLSEKDSYDWETRIQKLPKEKQELYFSIRDYILSKGDDISENSLKNYIAFKRVKNFVCML PYKNKISLYLKLNPIEEVLIEDFVRDVKNIGHWGTGDLEIIIQSKEDYEKAKPYLDRAYE KN >gi|224461471|gb|ACDD01000031.1| GENE 37 41757 - 42125 450 122 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|257452209|ref|ZP_05617508.1| ## NR: gi|257452209|ref|ZP_05617508.1| hypothetical protein F3_04025 [Fusobacterium sp. 3_1_5R] # 1 122 1 122 122 228 100.0 7e-59 MVTFKLIFNDGKIAIYWYFPEGKEENGHGVIIVNQVEHTIKIETLAPDDFQREEPAENLN RLRDEINAMMLENGEPPLTEEELPTATEPMIITFFADHVIKNIREEIKETGTLPKTGMSA WY >gi|224461471|gb|ACDD01000031.1| GENE 38 42139 - 42402 276 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452210|ref|ZP_05617509.1| ## NR: gi|257452210|ref|ZP_05617509.1| hypothetical protein F3_04030 [Fusobacterium sp. 3_1_5R] # 1 87 1 87 87 119 100.0 5e-26 MKMEEFIPYFDKKVVVYFTDGTSRYGILSGTESEENEEGEYTGRELLVLDISEHSYMSFL PEEIKKMDIIEKKSLNKYYRKIKYLFS >gi|224461471|gb|ACDD01000031.1| GENE 39 42405 - 43916 1168 503 aa, chain - ## HITS:1 COG:hsdS KEGG:ns NR:ns ## COG: hsdS COG0732 # Protein_GI_number: 16132169 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Escherichia coli K12 # 26 253 5 237 464 102 27.0 1e-21 MAKKKELTIEEKLQAALVSKEEQPYEIPDSWVWVRLGSIVSVHRGLSYSKVDEIIRENND EGYLVLRGGNLTEDGLNFEDNVYVREEIGRRAIELEENDVILVASTGSSKVIGRACIVEH KLEKTTIGAFLMLCRPVTSISKWVHYIFKGNSYRNYISNISKGSNIKNIKGEYITNYAIS FPPIEEQQRIVKKLDFLFEKTKKAKKLLQEVKEEIEMRKISILDKAFRGELTKKWREKNK TGSVLELLQEIQNEKMKKWEEECCEAEKNGRKKPKKIKLSKIEEMIVPKEEEPYKIPDTW KWVKLGEISQISMGQSPLGEKVNSLIGVGLIGGPSDMGENYPIITRYTSQITKLSSIGDI IVSIRATLGKNIFSDGEYCLGRGVCGIRSKIVNNILLRFYFTNSIEYLYKISSGTTFAQV SKEDISNLYFSLPPLEEQQEIVRVLEEVLEKEKKVKELIDLEEKIDLLEKSILDKAFRGK LGTQDINDEPALELLKKIIDKEE >gi|224461471|gb|ACDD01000031.1| GENE 40 43928 - 45355 1780 475 aa, chain - ## HITS:1 COG:hsdM KEGG:ns NR:ns ## COG: hsdM COG0286 # Protein_GI_number: 16132170 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Escherichia coli K12 # 1 469 1 507 529 393 43.0 1e-109 MTNNEIVQKLWNLCNVLRDDGITYHEYVTELTYMLFLKMACELGTEEEIQIPEAYRWKTL VAYEGIALKNHYQQALLDLGKELGQLGIIYRNAQTRIEEPANLKKLFSEIDKIDWYSVDK EDLGDLYEGLLEKNASEKKSGAGQYFTPRVLIDAIVRMIKPELGETIYDPAAGTLGFIIE 
ADKYLRNISQDYYGTAENPISEELSQKYKKVFSACELVQDTHRLGSMNALLHGIGGNFLQ GDTLSEFGKQFSHFDIILSNPPFGTKKGGERATRDDLVYATSNKQLNFLEVIYRSLNVTG KARAAVVVPDNVLFEGGVGKEIRQDLLNKCNVHTILRLPTGIFYSQGVKTNVLFFTRGIS DTNNTKEIWYYDLRTNMPSFGKTNPLSKEHFEEFERSFEKREEKEILERWTLVSMEEIMK KEYSLDLGLIKDESVIDSENLPNPIVTAQASIDKLEEAVDLLKSVIHELTLCEQE >gi|224461471|gb|ACDD01000031.1| GENE 41 45368 - 48634 3339 1088 aa, chain - ## HITS:1 COG:STM4526 KEGG:ns NR:ns ## COG: STM4526 COG4096 # Protein_GI_number: 16767770 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Salmonella typhimurium LT2 # 2 1085 3 1164 1169 624 34.0 1e-178 MQTNFEFLKKDWELLAKIGEMAEYTLYKDPNTSIMKIRQFGEELVKIMFKVERISDSKKN MASDRLLALKKYELIPEDIEKILTTLRKKGNKAVHGIYGDEETAETLLSMAVKVAAWFQE VYGSDLSFTSEEIIYQKPKNIDYQEAYESLVKRSEEMNQELEEWIPKTPSLRSREERRQL IYQKKRIEFTEAETREIIDHQLKEAGWEVNTHSFNYKLHKTLPQKGKNMAIAEWPCKKED GKQGYADYALFCGEVLYGILEAKRMGTDIAGALQRDSRMYAKGFQRMEGVSLCEGAPFGE YKAPFLFSSNGRAYNKDLPEKSGIWFLDSRREENLPKTLRGFYSPRDLQELFRKEEEKAN ETLKQESIDYLLSKNGLGLRYYQVEAIQAVEEALISGKEKALLTMATGTGKTMTALGLIY RLLKTKKYKRILFVVDRSALGIQAGETFKNVKIEQQMTLKQIYDIKELSDKHSEDDTKVH VATVQGLIKRILYNTEEEKKPSVGQYDCIIVDEAHRGYILDKDMSEEESYFHDEKDFQSK YRAVLEYFVADKIALTATPAAHTYHIFGEPVYEYSYSQAVLDGYLVDAEPHYKIVTKLSS DGIHYAKGAEIKLFDEETQEVEVKEVLEDELNFDIEQFNTNVITENFNRAVCSTLVEEIS PEGPEKTLFYAVTDEHADMLVRILREEYEKQGLYSMNHDMIEKITGSVKDVGKLIKKYKN ENYPTIAVTVDLLTTGVDIPKISNLVFLRKVKSRILYHQMLGRATRRCDEIGKECYRVFD AAENYQDLKDFSDMKPVVVNPSLSINDILEQWFEVEEEEVRDWAVQQVIAKLQRKKKRIE DLGEEIFQRNAQNFRGESMNNIESYIQYLKEIPQEKQREVFQKEEAFLVYLDTIPAKKKR KVISEHEDEVLEMYQEFGDWKRPEDYLEGFRKYIQDNQEKIQALKILKESPKGFRKKDLK ELIMILGAEGYKDSSLNSAYRSVKNEDIAADILTYVKNVLKGSPIVDKEGKIEDVMRRIK KLNKWNRVQQGILEKIAQSLRNDNYLTEEDFNSGRLKESYGGYERLNNRLNGLLEEIVEI INEEIILN >gi|224461471|gb|ACDD01000031.1| GENE 42 48662 - 49477 1107 271 aa, chain - ## HITS:1 COG:FN1263 KEGG:ns NR:ns ## COG: FN1263 COG4822 # Protein_GI_number: 19704598 # Func_class: H Coenzyme transport and 
metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Fusobacterium nucleatum # 1 271 11 279 283 233 45.0 4e-61 MKKQAILLIHFGTTHDDTREKTIDAFRKKVELSFADWDVFEAFTSRMIIKRLKARGIVKQ NPLELLQELKEQGYTHIYVQTSHILHGIEYENLKEELASYKKEFEEIKMGEPLLSSVEDY KQVVSALGKRQKTVENQVVVYIGHGTEHAANASYSMMRYVFFQEGYSPFFMGTVEGYPEF PEVLKEIQVQYPLEKPKVILKPFMFVAGEHAKNDIAVDWKKAFEEAGFVVSDVVLEGLGE ILEIQDIFMKHLQEAIENQRESIAEYKKKLS >gi|224461471|gb|ACDD01000031.1| GENE 43 49491 - 50243 1018 250 aa, chain - ## HITS:1 COG:FN0047 KEGG:ns NR:ns ## COG: FN0047 COG0708 # Protein_GI_number: 19703399 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Fusobacterium nucleatum # 1 250 1 250 253 406 77.0 1e-113 MKLISWNVNGIRACLKKGFMEYFEAQDADIFCLQETKCSAGQVELDLKGYHQYWNYAVKK GYSGTAIFTKKEPISVSYGLGIEEHDQEGRVITLEFEDFYMVTVYTPNSKNELERLDYRM IWEDEFRSYLAKLNEAKPVVVCGDMNVAHEEIDLKNPKTNRRNAGFTDEERTKFTELLKA GFTDSFRYLYPDRLHAYSWWSYRANARKNNTGWRIDYFVVSNDWKEQIQEAEIHAEQEGS DHCPVALYLK >gi|224461471|gb|ACDD01000031.1| GENE 44 50254 - 51456 1350 400 aa, chain - ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 2 396 4 396 411 300 43.0 4e-81 MMQETANANRKHVAFFGKRNAGKSSLFNLLLGEDYSLVSSHLGTTTDPVYKAMELVGYGP IRLIDTAGLDDIGELGELRVKKSKEVLRKIDMAIYVLDASQEITVEEREEAKKLFQRFHI PYVFVWNKRDMIGEILEAEWKSKYPNDVYLQINPIEKKRQLVDCIVKQLELEEEDPSLIG DLVHYGDSVILVVPIDSEAPKGRLILPQVQILRDCLDHGIKSYVVRDTELEKALEDLKDV KLVITDSQIFHRIADMVPLEIPLISFSILFARQKGELQEFLEGIQVLESLKEKEKAKVLI VESCSHTQSHEDIGTVKIPNLLRKKLNSKIEIVFQQGRNLEEDLRGIDLIIHCGSCMLTR KQMLNRIQIAKEQQIPITNYGIVLAYFSGVLERSIKILKK >gi|224461471|gb|ACDD01000031.1| GENE 45 51453 - 52850 1740 465 aa, chain - ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine 
biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 10 463 13 470 472 447 51.0 1e-125 MERNQEHMKVNREEIFRLLEEGKKVTREQILDILERAKRKEKITHLDIARLLYIEDQDLI QEMFEVAGKIKRDVYGNRVVLFAPLYVSDFCVNNCVYCGYKKENQFHRRKLTMDEVRKEV MILEEMGHKRLALEAGEDPVNCDIEYILECIDTIYDTYNKNGKIRRINVNIAATTVENYR RLKEKGIGTYILFQETFDEEVYRRVHPNCIKGNYEYHTTAFDRAMEAGIEDVGAGVLFGL ADPRFEVLALMMQNEHLEKRFGVGFHTISVPRLRPAERVNLETFPHLLDDEMFKKIVTII RIAVPYTGMILSTRESAEMRELLLKYGISQVSAGSCTGVGGYEEHIKGKQVSQFKLADER SPRQVIEDLMKAGYIPSYCTSCYRTGRVGEKFMEIAKTEKIHNMCKPNALTTLLEYAVDY GDEELLQKVETFVREQASEIENTHIRNFVLKNIDKLKAGERDLYL >gi|224461471|gb|ACDD01000031.1| GENE 46 52894 - 53940 1308 348 aa, chain - ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 14 348 15 345 350 259 42.0 6e-69 MLVRQYIDELYERNDLEEEKLLYILDYIQKEEIGYLQKKALQTKEKYYGKKIYLRALIEF TNYCKRECRYCGINRYNTQVERYRLSEEEILKACQRAKELGFHTFVLQGGEDVYFRDEIL VDLVKKIKERFPEFALTLSVGERPYESYQKLKEVGVDRFLLRHETIIPEMYKKLHPQSEL QTRLDCLESLKSLGYQIGAGFMVGLPGYENKDYVKDLLFLKHLSPHMTGIGPFIPHHDTE LRNEKAGSVEKTIIILALVRLLLPKVLLPATTALGTVSEDGRLRGFASGANVVMPNVTPP EFRDKYALYNGKKNTGDEAAEGLRQTCEMIRKNNYEVDMGRGDSKVKY >gi|224461471|gb|ACDD01000031.1| GENE 47 53934 - 54200 393 88 aa, chain - ## HITS:1 COG:no KEGG:CLD_0905 NR:ns ## KEGG: CLD_0905 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 3 84 2 83 83 92 60.0 4e-18 MKKRVAVISAILENAIEHQVEFNDVIAKFQKNIHGRMGIPFHQEGISVVSITMIGSMDEI NSFTGKLGSIDSVQVKTAISKKEIEEVC >gi|224461471|gb|ACDD01000031.1| GENE 48 54197 - 54664 651 155 aa, chain - ## HITS:1 COG:PA0995 KEGG:ns NR:ns ## COG: PA0995 COG0350 # Protein_GI_number: 15596192 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Pseudomonas aeruginosa # 4 153 5 160 173 158 53.0 3e-39 
MKYYIKYVSPVADLYLVEEQGQLVEISYHHLKKKEEMEEKNTELLQEVKRQLEEYFSGRL QNFDLPLKPKGTDFQKQVWKALLTIPYGETKSYGDIAKQIGKEKAVRAVGGANHVNPISI VIPCHRVIGKNGNLTGYGGGLEVKEKLLELERKKV >gi|224461471|gb|ACDD01000031.1| GENE 49 54674 - 56053 1877 459 aa, chain - ## HITS:1 COG:BB0366 KEGG:ns NR:ns ## COG: BB0366 COG1362 # Protein_GI_number: 15594711 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Borrelia burgdorferi # 11 457 13 457 458 503 53.0 1e-142 MFDKKQKWTGEEERIIFHFSEDYRQFLSKVKTEREFVKEGIILAEKNGFKAAEMFTKYVP GDKVYYVNRNKNLVLVVIGQEDLEQGIHYVVSHIDSPRLDLKANPLYEELDLAYMKTHYY GGIKKYQWASIPLALHGVVVLESGEVIEISLGEEEKEPVFTIPDLLPHLAGKYQGERKTS EVIQGEELQILVGSMPTKVETEEVKDKIKQNILEILKRNYGMEEADFVSAELELVPAGKA RDIGFDKSLIGAYGQDDRVCGYTSLRAILEITNIPMKTAVCFLADKEEIGSTGSTGLQSD FLNYFTGDILEKTKGSYHEMMLRRTLWNSRALSSDVNVAMDPIFKGVHDAQNAAKVGSGV VVTKYTGARGKSGTNDADAEYVAYIRKILNEGDVCWQTGMLGKVDEGGGGTVAMFLAHLG INTIDIGPGLLAMHSPFEVASKLDIYHTYKAYKVFYQAK >gi|224461471|gb|ACDD01000031.1| GENE 50 56236 - 56418 260 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452222|ref|ZP_05617521.1| ## NR: gi|257452222|ref|ZP_05617521.1| hypothetical protein F3_04090 [Fusobacterium sp. 
3_1_5R] # 1 60 1 60 60 91 100.0 2e-17 MTNFIQNVVDEKTISLLGKEKMTQLLSSLQDLDKEFSSFGESEMKQLVDCLRTKVCELAK >gi|224461471|gb|ACDD01000031.1| GENE 51 56511 - 57089 889 192 aa, chain - ## HITS:1 COG:FN0502 KEGG:ns NR:ns ## COG: FN0502 COG0279 # Protein_GI_number: 19703837 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Fusobacterium nucleatum # 1 191 1 191 194 272 82.0 3e-73 MQLLDSYKTELALLTKFIEEEEERKETEKVARALAEVFRKKGKALICGNGGSNCDAMHFA EEFTGRFRKERPALPAISLSDSSHITCVGNDYGFDFIFSKGVEAYGQEGDMFLGISTSGN SQNVIEAVKVAKERKMITVALLGKDGGKLKGMCDYEFIIPGKTSDRVQEIHMMILHIIIE GVERILFPENYR >gi|224461471|gb|ACDD01000031.1| GENE 52 57103 - 57975 925 290 aa, chain - ## HITS:1 COG:FN0503 KEGG:ns NR:ns ## COG: FN0503 COG0583 # Protein_GI_number: 19703838 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 288 8 295 302 333 64.0 2e-91 MDLHYLEIFYEVAKAKSFTKAASELYINQSAVSIQVKKFEEILNTKLFDRSSKKIKLTYT GEALYKMAEEIFDKVKRAEKEISKIIDLDRARLSIGASPVIAEPLLPRLMKGFSKAHEEI EYDLQVSEKENLLRMLKEGDLDVLIIDEERINNPNLEVLTIERVPYVLVSKKEYTNIQEI AKDPLITRKYVPNNNQAISILEEKYRITFEEKIPVFGNLSVIKGMINEEIANAILPYYAV YKEIQNGEYKTVYKITEIKDAYQVVITKDKKGLIQIIKFLNFIQDYRLQY >gi|224461471|gb|ACDD01000031.1| GENE 53 58021 - 60684 3180 887 aa, chain - ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 885 1 885 887 1375 75.0 0 MKELSKTYSPKEIESKWYPIWEEKKYFAGKLEEGKENYSIVIPPPNVTGILHMGHVLNNS IQDTLVRYQRMTGKNTLWLPGCDHAGIATQNKVERKLKEEGLTKEDLGREEFLKRTWEWK EEHGGIITTQLRKLGASLDWDRERFTMDEGLSHAVRKIFVDLYKDGLIYQGEYMVNWCPS CGTALADDEVDHEESHGHLWHLKYPVKDSEEFIIIATSRPETMLADVAVAVHPEDDRYKH LIGKMLVLPLVGREIPVIADEYVDREFGTGALKITPAHDPNDFALGQKYHLPIYNMMTAE GKVSDEYPKYAGLDRFEARKVMVKELEESGVLVKIEELNHNVGQCYRCSTVVEPRVSKQW FVKTKPLAEKAIEVVRNGQVKIMPKRMEKIYYSWMENIRDWCISRQLWWGHRIPAWYGPD 
EHLFVAMDEAEAKEQAKLHYGKEVELRQEEDVLDTWFSSALWPFSTMGWPEKTKELELYY PTSTLVTGADIIFFWVARMIMFGLYEMKDIPFHNVFFHGIVRDDLGRKMSKSLGNSPNPL DLIDQYGADAIRFSMIYNTSQGQDVRFSEKLLEMGRNFANKIWNASRFVMMNLEDFDINT FDVKEVKYELVDEWIISRLQETAKAVETRLANFQLDDAAKAVYEFLRGDFCDWYVEIAKI RLYNLEDVQSKRTAQYVLWSMLESGLRLLHPFMPYISEEIWQSIKKEDAGETIVLAEYPK FEEEKYHQDLEEDFAYIQDVVSSLRNIRAEMGISPAKEAKVVVRSEDDRELQVLEKNRAF LQQLAKISELSYGKEIEKPAESAFRVAKNSVVYMILADLIDKEAEVKKIQDQIAKVQKDL DKVNAKLANEKFVSKAPADILEREKRIQKEYQDKMEKLVENLKNFMI >gi|224461471|gb|ACDD01000031.1| GENE 54 60668 - 61282 682 204 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 192 1 192 194 280 75.0 2e-75 MRIKRADYLKSAVYEKDYPEILNSVEFAFVGRSNVGKSSLINSLTSRTKLARTSKTPGRT QLINFFTINQEFYIVDLPGYGFAKVPKAMKKEWGSTIERYIISKRKKLVFVLLDIRRIPS EEDMEMLRWLDFHELPFKIIFTKTDKISNNEKFRLLKDIRKKIEFHNEDVFFYSSLSHKG REEVLQFMEDTLKEAGGNVDEGIK >gi|224461471|gb|ACDD01000031.1| GENE 55 61293 - 63605 2721 770 aa, chain - ## HITS:1 COG:FN2014 KEGG:ns NR:ns ## COG: FN2014 COG0466 # Protein_GI_number: 19705310 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Fusobacterium nucleatum # 1 766 1 766 768 988 66.0 0 MEKTSFLPTRDLIIFPGVVTPIYVGRKDSLTTLEEAVKNKNKLILGLQKDPNVEEPDLDK GIYKVGILVSILQVIKMPNNNIKVLVEGESRVKISNVSLTNGHYEADYTFVRELAKKSKE TEAIFRKVFSYFEKYLSFAGKSAVELLVTLKNNKDFSLSFDVIAANLPITTDLKQELVEI FNIRDRGYRLLDILSNEMEIVSLEKKIDDKVKSKMNEAQKAYYLKEKISALKEELGDYSQ DDDILELVDKMKEANLPEEVQKKLENEVKKLSKMQLFSAESSVTRNYIETVLDLPWNNTT EDILDIKVSSDILERDHYGLKEPKTKVLDYLAVKKLNPEAKGSILCLVGPPGVGKTSLVK SIADSMGRAFVRVSLGGVRDEAEIRGHRRTYVGSMPGKIMKALKEAGTNNPVILLDEIDK MSSDMKGDPASAMLEVLDPEQNKSFEDHFVDMPFDLSKVFFVATANSLYPVSRPLIDRME IVELDSYTEYEKLHIAKQYLIKQARKENGLEKISLSITDKAISRIINEYTAEPGVRNLKR QIIKLCRKLARIVVEEGRETIKIGVKDLETYLGKPIYRKETRRKEETRIGSVNGLGVTSV 
GGCTLPVQAVTVPGKGGLSVTGKLGDVMKESVEVAFNYVKSNLDYYVPHDEEFFAKKNIH IHFPDGATPKDGPSAGIAITTAIISVLCNREIRQDVAMTGEVSLLGDVLPIGGVKEKVLG AHRGGIREVIIPEGNARDQEDIPEEIKGEMKIHIAKTYADVEKIIFADKK >gi|224461471|gb|ACDD01000031.1| GENE 56 63615 - 64886 267 423 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 167 412 258 448 466 107 31 3e-22 MKDKELEHCSFCGKSENEVAKLFAGRDGSLICDECIDQCYNMLMMDEEEEYLPATGDTIQ IQQMEMLKPEEIKEKLDDYVIGQERAKKILAVAVYNHYKRLLYKEKQEKKKSKDNDEVEL QKSNVLLIGPTGSGKTLLAQTLARILKVPFAIADATTLTEAGYVGDDVENVLVRLIQAAD YNIENAEKGIIYIDEFDKIARKSENVSITRDVSGEGVQQALLKIIEGTLSQVPPEGGRKH PNQPLIEIDTSNILFIVGGAFEGLGKVIQGRLHKKTLGFGADIQAPKEQVGEGEFLSQVL PEDITKRGIIPELVGRLPIIANLEDLDEKAFINILTKPKNAIVKQYQKLFQMEGVELEFT EEALAEVAHLAMSRKIGARGLRSILENTMLEMMYRLPSDSSIQKVILGKEAVLDHNKVEI IRN >gi|224461471|gb|ACDD01000031.1| GENE 57 64895 - 65482 928 195 aa, chain - ## HITS:1 COG:FN2016 KEGG:ns NR:ns ## COG: FN2016 COG0740 # Protein_GI_number: 19705312 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Fusobacterium nucleatum # 1 193 1 192 193 280 71.0 1e-75 MYNPTVIDNDGRQERHFDIFSRLLRDRIIFLGTEVNDQVAASLVAQLLYLEAEDPTKDII LYINSPGGSVSAGLAIYDTMNYVKPDIQTVCIGQAASMGAFLLSAGTKGKRFALENSRIM IHQPLGGTGSGYHQATDVQIIAKELQATKEKLASIIAKNSGKTTEEVLEDTERDNYLTAE EAVNYGLIDMVMKAR >gi|224461471|gb|ACDD01000031.1| GENE 58 65616 - 66905 1943 429 aa, chain - ## HITS:1 COG:FN2017 KEGG:ns NR:ns ## COG: FN2017 COG0544 # Protein_GI_number: 19705313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Fusobacterium nucleatum # 1 429 1 429 429 382 50.0 1e-106 MKYEVKKLEQSAVAISMKLEGTEFLPIRDKVVAKIGKEVEIKGFRKGHAPADAVLAQYKD AVIDEVTQEVVNSNMETIIREKEIAPISTIRNPKVTMNDDSFEMDFEIDVYPEIKLGEYK 
GISAEKEAFEFKEEMLTQRMESMRTSKAKLVDCPEDHKAEMGDTVNLAFEGFIDGVPFEG GKADSHQLKLGTKSFIDTFEDQLVGYVKGQEGEVKVNFPEEYHAPELAGKPAIFKVKINA IQKMETPEMNDELAKELGFESVEDLKTKTTENIIAEGTQRAEDEYLGKLILKVVEASEFE VPVSMVQQEIQNEMRRFEQQLQQQGLSLDMYMQMMGGDRKAFEEQIRPMVEPRIKSDLVL AEIARNEKIEATDEDVTEKMAEVAKMYGMEVAKMEEELKAHNQLDAFKYSVRAEIVMKKT IDFIKAEAK >gi|224461471|gb|ACDD01000031.1| GENE 59 66916 - 68592 1744 558 aa, chain - ## HITS:1 COG:FN2018 KEGG:ns NR:ns ## COG: FN2018 COG0608 # Protein_GI_number: 19705314 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Fusobacterium nucleatum # 5 545 8 552 556 560 56.0 1e-159 MGGYNSEKLLSTLLKNRGIQDFSKLHEFINPSVSSFRDPFLFENMETIVSMLEKAKKEGS RICIYGDYDVDGITGTAFLVKVFRQIGMDTLYYIPSRDEEGYGLTKKNIDFLLEKGVKLV ITVDTGYNSLEDIAYAKSKSMEVIISDHHKTVREEGDEDILFLNPKLSQSYEFKFLSGAG VALKIAQALYQRLHLDLNELYQYLDIIMIGTIADVVPMVDENRIIIKNGLRILQKTKVKG LSYLLKYLKFGDKHINTTDVSYYISPLINSLGRVGTSRIAADFFIKEDDFEIYNIIEEMK KLNKKRRELEKNIYDDAIHSIEKNGKKGLKCIFLANRRWHPGVIGVVSSRLSLKFQVPVV LIALEGKLGKASCRSVRNISVYNILEEVKEDLVRYGGHDLAAGFTIEEEKVEKVRQYFME YLSRDTQAVERKKKYTIDMELPLEAIGENLLKDIEKISPFGSENPHPLFLEKNLNFREIR KFGVDQRHFNGTLVKAGREYPAVGFDLGHRIQLDTYLAQTFDIVYYPEKVNLHGEKMIQI RIKDIIIKDEFYDIFIKS >gi|224461471|gb|ACDD01000031.1| GENE 60 68597 - 68965 503 122 aa, chain - ## HITS:1 COG:FN2019 KEGG:ns NR:ns ## COG: FN2019 COG0858 # Protein_GI_number: 19705315 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Fusobacterium nucleatum # 1 120 1 119 120 144 68.0 5e-35 MKRQRLAGIEKEISRVISSVLFSEIKNPNIRGLVSVTKVRVTEDLKFADTYFSIMPPIAS EGQKPVEREKILEALEEVRGFFRKRIAEEINLRFVPEVRVKLDDSIEHAIHITKLLNDLK GS >gi|224461471|gb|ACDD01000031.1| GENE 61 68981 - 71122 3230 713 aa, chain - ## HITS:1 COG:FN2020 KEGG:ns NR:ns ## COG: FN2020 COG0532 # Protein_GI_number: 19705316 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Fusobacterium 
nucleatum # 1 713 1 736 737 963 75.0 0 MKLRVHELAKKYAVKNKEFLEILNTEIGIEVTSHLANLDEAQIEKVEEYYSRLSKAEEKE EKKAPKVNKGKDKQHKKNLPITLEEEEEEEIVEVVERKNKKHKKKKGRRTDFVVKTVEAG PAVIEEDGMKIIKVKGEITLGDFAERLKVNSAEIIKKLFLKGQMLTINSPLPFELAEELA MDYDALVEREEEVELEFGEKFDLEIEDKKEDLVERPPVITIMGHVDHGKTSLLDAIRTTN VVSGEAGGITQKIGAYQVERDGKKITFVDTPGHEAFTDMRARGAQVTDIAILVVAADDGV MPQTIEAISHAKAAKVPIIVAVNKIDKPEANPMRVKQELMEQGLVSVEWGGDVEFVEVSA KKKMNLDTLLDTILITSEILELKANFKKRAKAVVLESKLDPKVGPIADILVQEGTLRIGD VIVAGEVQGKVRALVNDKGDRVKSVEVSQPVEIIGFNQVPQAGDTMYVIQNEQHAKRIVE EVAKERKIAETTRKTISLEALSAQLEHENVKELNLVLRADSRGSVEALRDSLMKLSNEEV AVNIIQAAAGAITESDIKLASASNAIIIGFNVRPTTKALREAELANVEIRTSRIIYHITE DIEKALSGMLEPEYKEVYLGRIEIKKVYRISKVGNIAGCIVVDGKVKNDSNIRILRNNIV IFEGKLSSLKRFKDDAKEVIVGQECGLGVENFNDIKEGDVVEAFDMQEVKRSL >gi|224461471|gb|ACDD01000031.1| GENE 62 71139 - 71660 446 173 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237742963|ref|ZP_04573444.1| ribosomal protein L7Ae [Fusobacterium sp. 4_1_13] # 4 171 8 176 176 176 49 5e-43 MAVERTCIICRKKEEKKTFFRLCQREDKYYWDKTGKAQARGYYVCPSKECLGRLAKHKKI KVEMQDLYEMIKEVERYEKNYIGIFQTMKHSNMLTFGMKMVLEEIEHIHLLIVATDISDK YARQLEEQSSERKIPLEYFGTKEELGKVFGKEEVNVIAVKDKKMARGLVDKMK >gi|224461471|gb|ACDD01000031.1| GENE 63 71662 - 72723 660 353 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 10 352 11 350 537 258 40 7e-68 MTNKDARAFLEALDELEKEKGIEKESLLQAVEQALLTAYKKNYGDEENVEVVIDRENGDV KVYEVKKVVTEEDLYDAALEISLEEAKKISRRAKLGEEVRIEVDCESFRRNAIQNGKQIV IQKVREAERENIYDRFKAQEGEILTGIIRRIDERKNVFIEFGGIETILTAGEQCVSDRYK VGNRIKVYLVEVEKTNKFPKIVISRRHEGLLRKLFELEIPEISSGAIEIKAVAREAGSRA KVAVYSELPNIDIVGACIGQKRARIKNIVDELGGEKIDIVIWKENMEEFVSAVLSPAKVN SVELLEDGETARVLVDESQLSLAIGKSGQNARLAAKLTGMRVDIKVANAELED >gi|224461471|gb|ACDD01000031.1| GENE 64 72740 - 73201 632 153 aa, chain - ## HITS:1 COG:FN2023 KEGG:ns NR:ns ## COG: FN2023 COG0779 # Protein_GI_number: 19705319 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # 
Organism: Fusobacterium nucleatum # 4 153 7 156 156 163 54.0 9e-41 MESVVQKIEKIVIPAAEELGLSLVDVEYMQDGGYWYVRVYVEKLEGDVNLEDCASLSGKI EDAVDQLIDKKFFLEVSSPGIERPLKKESDFIRFTGEKIFVALKHKLNEKRNIEGILRAY ENQSLLLEVDGEELQIPFSEVKKAHLVFDFDEF >gi|224461471|gb|ACDD01000031.1| GENE 65 73415 - 73846 487 143 aa, chain + ## HITS:1 COG:FN1853 KEGG:ns NR:ns ## COG: FN1853 COG2185 # Protein_GI_number: 19705158 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding) # Organism: Fusobacterium nucleatum # 7 138 5 136 136 116 45.0 1e-26 MQLQNYKILIGVIGEDIHETGNKIIAQILEHDGFEVINLGIQVSPSSFVEHAKKDDVTAI IVSSLYGKAKEDCKHLMKLFQEDSLFHPPIYLGGYLASPQENWKEVENFFLNLGFTRVYK PGTPIEKTIADLREDLMIPYETF >gi|224461471|gb|ACDD01000031.1| GENE 66 73848 - 74867 1077 339 aa, chain - ## HITS:1 COG:L178384 KEGG:ns NR:ns ## COG: L178384 COG2855 # Protein_GI_number: 15672357 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 6 330 3 317 331 189 40.0 5e-48 MNVQKIVPGLLLSILIAMISQFLIKLPAFSTLGAALIAILLGMILGNTICKKSFYDEGTK FSEKRLLEYSIVLNGLILDIIVMKQVGLQGIGFIICLMFLTIGIAYIISRKFGFGKKFSL LMGAGNAVCGSSAIGTVAPILEADSKDKGISITCVNVLGTILMIALPVLSSILYSSDTLL TSALIGGTVQSIGQVIASAKLVNDSVVEMSTIFKLIRVLLLVGIALMFDMLNLEEGKPLF SLKLSKIEGNKKRTKVGIPWFILAFLFCFLLRSTGYIPAPVLFWAKKISTQFEIIALAAI GLRVKFSDILKEGVKAFGVSLLIGLSQVIFALGLIKVFF >gi|224461471|gb|ACDD01000031.1| GENE 67 74871 - 75272 621 133 aa, chain - ## HITS:1 COG:Cgl1127 KEGG:ns NR:ns ## COG: Cgl1127 COG0494 # Protein_GI_number: 19552377 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 1 129 1 129 131 95 43.0 2e-20 MKKHLQVVGAMLVNKEGRILSTLRPLGKKLGNYWEFPGGKVEPGETKEEAVVREILEELD CHIEVEKEVGENTLDYGDVIITLTVFQCRMKDEVTVKEHDAFVWIKPENLLSLVWAPVDI PILEKIVEEKKGE >gi|224461471|gb|ACDD01000031.1| GENE 68 75327 - 76772 2299 481 aa, chain - ## 
HITS:1 COG:FN0685 KEGG:ns NR:ns ## COG: FN0685 COG4145 # Protein_GI_number: 19704020 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Fusobacterium nucleatum # 10 480 13 483 484 584 67.0 1e-166 MLVSIPILLYLLLMLYIAFRVNKKKRNSNNFAEEYYIGSRDMGGVVLAMTIIATYVGASS FIGGPGVAYKLGLGWVLLACIQVPTAFFTLGILGKKLGILSRKLNAVTLLDVIRARYQSD IVVILSALMLLIFFLGSVVAQFVGGARLFESVTGAPYIVGLILFSVVVITYTTIGGFRAV ALTDAIQGFVMLFATFILFWIILKKGNGMENIMRTIADINPDLLRPDSGGNIAKPFILSF WILVGVGLLGLPATTVRCMGFKDTKALHQAMVIGTSVVGLLMLGMHLVGVMGLAIEPNVE VGDKIIPILALNHLHPILAGVFIGGPLAAIMSTVDSLLIISSSTIIKDLYLHYVEKDAGE AKIKKLSTYCSLGFGVLVFLLAVRPPELLVWINLFALAGQEALFFAPILFGLYWRKANSF GAIASMLAGVSTYLYTTIMKTPIFGMHAVVPALLISVIAFITGSFFGKAPEQKTLKIFFE D >gi|224461471|gb|ACDD01000031.1| GENE 69 76766 - 77026 222 86 aa, chain - ## HITS:1 COG:no KEGG:FN0686 NR:ns ## KEGG: FN0686 # Name: not_defined # Def: integral membrane protein # Organism: F.nucleatum # Pathway: not_defined # 6 81 19 94 104 95 61.0 5e-19 MKSKRKQINKEALLTVGMYLVYFVWWYYFAYCFGEEEVSQYHYILGLPEWFFYSCVLGLV VMNVLVFFVIKFFFQDMDLEEEDKKC >gi|224461471|gb|ACDD01000031.1| GENE 70 76998 - 77801 1095 267 aa, chain - ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Bacillus halodurans # 1 250 1 251 254 146 33.0 4e-35 MIFLIDVGNTNIVFGISDGEKIINTLRTETIKEKDFDYVPVLKDLLCQKEKIRKVEGSIL SSVVPEVTKKLMEAIKSIYKVDTLLVDEIIDESLNIQIDSPEKLGMDLKVDAVAALKKYP SPQLIFDLGTATTCSVLDENSCYIGGAIIPGLKVSLNALIEATSQLPMIDCSIPIAEYIG KNTQDCMRIGALYGHALMLEGFVREIQKKFSKPLHIALTGGLSTIVSQHMNIETTFDPYL TLEGLLYLYQDFHKMGGNNEEQKETDQ >gi|224461471|gb|ACDD01000031.1| GENE 71 77828 - 78577 269 249 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 234 1 239 245 108 29 2e-22 MLDFENVSKDYGKKAILKNVSFSVKKGEIFGILGQSGAGKSTIGKLLLQMIEKTEGKILF 
EGKELKEVSRREIQTVFQDPYSSLNPSLTVGQILEEPLLANGIKDKSERRKKVIETLYKV GLLESDTEKYPSELSGGQRQRVCIAGAIILSPKLIVCDEPIASLDLAIQEQILQLIYRIN QEEGITFIFISHNLPAIYRIADRILLLYQGEVQEIQNVLDFFYHPKSEYGKKFLQNTKAI ENHIEKKAV >gi|224461471|gb|ACDD01000031.1| GENE 72 78577 - 79278 313 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 19 221 38 261 329 125 32 1e-27 MELLHVENLNLWIREKPLLQQISISISDGEIVGLVGESGSGKTLFTKCILGTLPESANLY YDRFEVKAELGAVFQNAFTSLNPTMKIEKQLRHLYLSQYGNDIGWKEKVEELLEKVGLDK NRNVLKKYPHELSGGEQQRVVIVGALLGEPKFLIADEVTTALDVQTKQEIIHLFQTLRDD LGIAILFITHDISLLQNFATKMYVMYQGELVDKEHPYGKKLFQLSQNIWRRER >gi|224461471|gb|ACDD01000031.1| GENE 73 79279 - 80055 902 258 aa, chain - ## HITS:1 COG:FN1361 KEGG:ns NR:ns ## COG: FN1361 COG1173 # Protein_GI_number: 19704696 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 21 258 18 255 255 333 75.0 1e-91 MEKKTKKLLAFFLGIFLLLAISAYHNPYQVSESLTLARPSVEHILGTDNLGRDIFSRLLI GSFYSVSIAFLAVLLASILGSFLGGIAGYFEGYLDETLLFFSETLMSIPAILITLGIIVI FRAGFYSITLAIFILYTPRCINFVRALVKQEKHKNYIKMAKIYGVGHFRILFRHIGPNIF LPILVNFSTNFAGAILTEAGLGYLGFGIQPPYPTLGNMLNQSQSYFLTAPWFTIAPGFVI VVLVYQMNQIAKKYQEKK >gi|224461471|gb|ACDD01000031.1| GENE 74 80057 - 80971 796 304 aa, chain - ## HITS:1 COG:FN1360 KEGG:ns NR:ns ## COG: FN1360 COG0601 # Protein_GI_number: 19704695 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 304 1 304 305 417 75.0 1e-116 MYYIKKGIRMILSIFFIGTCSFCLLEWIPGDPATAILGVEASAKDIENLRQQLGLDLSFG ERYWNWIYGAFHGNLGTSFKYGESVSKLILERLPLTLSIAIFSIVLVFLVSIPFAFALHN IKNKKIRNFWESILGIFISIPSFWLGILFMYFFGIILRWTSTGYNDSYRSLVLPCCIIAI PKIGWITMHLYANLYKELREEYIKYFYSNGMKKRYLNLYILKNAILPIVPLTGMMLLELV 
TGVVIIEQIFSIPGIGRLLVSSVFTRDIPLVQGLIFYTSTFLVLMNFGIDILYSLIDPRI RLGE >gi|224461471|gb|ACDD01000031.1| GENE 75 80984 - 82471 1889 495 aa, chain - ## HITS:1 COG:FN1359 KEGG:ns NR:ns ## COG: FN1359 COG0747 # Protein_GI_number: 19704694 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 495 1 495 495 721 74.0 0 MKKIAGWKQFCLIALLAAIFSACGKEEAKLDELKTVSAVDIDSLNPYQVVSSASEQLLLN VFEGLIMPASDGSIVPALAESYEISEDGKTYTFTIREGVSFHNGNPMDIHDVEFSLNKMA GKLGDAPTEGLFENIEKIEVLDDKKIAIHLGKPDSSFIYYMKEAIVPDENKDHLTEVAIG TGPYQVGEYQKEQKLVLTKNENYWGEKAEIPKVSILVSPNAETNFLKLLSGEINFLTEID SKRLEELKEFTIASGPRNLCLILALNNQEKPFDDVEVRKAIDLAIDKEKIVQLAMNGHGT VIETNMSPVMKKFLWEEKGEKANPARAKEILEKRGLLPMHFTIKVPNSSKMYLDTAQALR EQLKEAGIQVDLETIEWASWLSDVYTNRKYVASLAGLSGKMEPDAILRRYTSTYKKNFTN FHNDNYDKLVAEAKLSADEKVQIHNYKEAEKILREEQAAIFLMDPDSIIAMEKGLEGFEF YPLPYLNFAKLRFKK >gi|224461471|gb|ACDD01000031.1| GENE 76 82625 - 84019 1209 464 aa, chain + ## HITS:1 COG:no KEGG:FN0687 NR:ns ## KEGG: FN0687 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 462 1 460 467 452 53.0 1e-126 MYVAVTGRGKARVIQFCEQHRIPGTKKKKTIVIRTLGNYEKMLEKNPNVLAELKEEAKRF TEQKKEKTKETSTSLFRFGHSLVKKVWEEMDLNSIFEEKFLQDIFSLVVYRLGSSYTNFR TNRKTPFANLESISYENFYYILEHLAEKKESLIQHLGKFFNKKTARSNELAYYHISSYNY NSYWKDLHGSSRFFLQREKEDLPFSMVLLLDRNGIPISYDLFTKKFILEQQLEEVKQKSG IEKLVILSANRNKIEKKEYILPVDFLDLPFSLQLQIIAEEDWTITEKNEESEEVLSKEKT VHFNNQLKVYASWSKKRAFKDYVEGNQKNGYYYISTNDFSIKNTEMLKMFQHIWNIEEKF RITNVDFERKHIHGHFCLCFLCLCIIRYFQYLLGSEGKASIPMIYANKAISNPMILIEKK GKESRIHPIHLTNSYLKLANLLEIPEIKDNMSLEEFEATVSLLF >gi|224461471|gb|ACDD01000031.1| GENE 77 84011 - 85120 855 369 aa, chain - ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 367 1 380 399 441 58.0 1e-123 
MKTYLVTGAAGFIGTNFIKYLLQKYSDIFIVAFDKLTYAGRKENLEEEIQKEQIKFIEGD ICDSTLVEEIFIKYSIDYVVNFAAESHVDRSIESSRVFLETNVMGTQNLLEIAKKFWMIG KDNYGYPFYQEGKKFLQVSTDEVYGSLEINTSKGKKIGKYRKIFGTKFFNEEMPLAPRSP YSASKAAADLLVMAYKETYHLPVNITRSSNNYGPYQFPEKLIPLMIQKILQGKNLPIYGN GKNVRDWIYVEDHCRGIELVLQHGKLGEIYNIGGLYEETNFNIVLLLLEKIVQIIKENQE YQKYLQKDISKINKELISYVQDRLGHDERYAMNISKIQDQLGWQPKIEFSVGLQKTILWY LQHQDWILK >gi|224461471|gb|ACDD01000031.1| GENE 78 85165 - 86577 1424 470 aa, chain - ## HITS:1 COG:FN1698 KEGG:ns NR:ns ## COG: FN1698 COG1091 # Protein_GI_number: 19705019 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Fusobacterium nucleatum # 195 467 3 294 298 211 41.0 3e-54 MSKFIKMETELEGVYIIDTLKFEDERGYFSEIYQKDCFKELGIQDDFIQENISYSKKGTL RGLHFQTKKKQGKLLRVLKGKIYDVIVDLRENSKTYGQHIGIELQGQDQKLLWIPPGCAH GFLSLDEKNVVQYQCTENYAAEYESGILWSDTDLNIDWKLKEYGILKEELIISEKDRRQK SFEDYEKRKKEERVLILGGNGQLGKAFQKFIQKKKIRYQAVDIDTLDITDEKKCREFLEK NFFHCVINCAAYTDVDKAELELERCKTVNADAIKLWIDLCRERQIPFITFSTDFVFDGTG DEPYSEEKDPNPISWYGKTKLEGEKNALCYEKALVIRSSWLFSNEGTNFCKKVLTWSKQK KEIHIVDDQISTPTYVRDLVYFTWLLYEKACFGLYHMSSSGECSKYDLAKYLLSSISWNG ILERASSEEFENRAERPKYSKLYCMKLQREVGKPLPYWKKAVQHFLKGEL >gi|224461471|gb|ACDD01000031.1| GENE 79 86848 - 87711 662 287 aa, chain - ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus # 1 282 1 282 292 367 60.0 1e-101 MKGIILTGGSGTRLYPATQVISKQILPIYDKPMIYYPLSVLMLAGIREILIISTPRDIGL FQSLLGNGENFGISLSYEVQKEANGLAEAFLIGEDFIQKDFCALILGDNIFYGRAFTDTL KIAANLQEGVAVFPYYVQNPKEFGVVEFDSLGNIISLEEKPQHPKSNFILPGLYFFDATV VDKAKRVQKSARGELEILSILEMYLEEQKIFPFHLGRGMMWFDTGTEDSLLDSANFIKTI QQNQRIVIACLEEIAYQKSWITKEKVIQQAEKMKNSKYGKYLYSAIS >gi|224461471|gb|ACDD01000031.1| GENE 80 87728 - 88432 818 234 aa, chain - ## HITS:1 COG:Cj1143_2 KEGG:ns NR:ns ## COG: Cj1143_2 COG1083 # Protein_GI_number: 15792468 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-N-acetylneuraminic acid synthetase # Organism: Campylobacter jejuni # 8 224 1 212 218 129 37.0 7e-30 MYGTKKILAIIPARGGSKGIKKKNIIEINGLPLIAYTLKESQNSRYLDRTIVSTEDLEIK EVVEKYGGEVPFLRPIELAQDHSKTIDCIVYSIDMLKKLGEEYDYVMILQCTTPLRKACH IDESIEKIINSTERSLVSVSEVEEHPILMRTLNTDGSLSNLLNKNSTMRRQDFPKVYKVD GVIYIQKIDKNFNENTSLNDGKLAYIMDRKYTIDIDEYLDIAKVEFYLHELRKK >gi|224461471|gb|ACDD01000031.1| GENE 81 88444 - 88950 509 168 aa, chain - ## HITS:1 COG:L142326 KEGG:ns NR:ns ## COG: L142326 COG0110 # Protein_GI_number: 15673486 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Lactococcus lactis # 66 167 100 200 200 81 38.0 9e-16 MNKVFRNIFSSCYKKFHEAYQKENYYNYRKKFSGINPDFIFNGEAIKIYGNGEIFLVANF YIGSYSTTQLTHGTKVSVGHDTATSHNVRIYTSNRNPADVIFEKDLTGIKQGDVIIGNYC WIGSNVFICQGVKIGDYTVIGANSVVSKDIPSNCIAAGSPIRILKNKS >gi|224461471|gb|ACDD01000031.1| GENE 82 88997 - 89929 797 310 aa, chain - ## HITS:1 COG:no KEGG:CGSHiGG_07890 NR:ns ## KEGG: CGSHiGG_07890 # Name: not_defined # Def: N-acetylneuraminic acid synthase-like protein # Organism: H.influenzae_PittGG # Pathway: not_defined # 71 310 64 291 292 157 42.0 3e-37 MKKYLCISATNYNLLLFCLLKNFLNNTIFWLSPNLSFLKEKDFFLLSQAVSSKQRNLENK LQFLEIKKEYFKKQDFTIYAQDHILDSYSFIRGTFSIIEDGTMTYLEANKEWEKEKNRSF FSKIKRRMRGKIATCGVSSKVDKVYLRGILPIPSCLQDKVESMDIYSLWAKKTKEEKEWI NTFFHFQTRNLELLQSKKIILFTQPLSEDKIMTEEQKIQIYRKIIEKENIKDIVIKVHPR ETTKYEEYFASISILEENTPFELYLLHGLKGKKVITLFSTAVYGLKDFEVVFYGTKDNKL LMDRFGEIKY >gi|224461471|gb|ACDD01000031.1| GENE 83 90105 - 91244 728 379 aa, chain + ## HITS:1 COG:PM1138 KEGG:ns NR:ns ## COG: PM1138 COG0438 # Protein_GI_number: 15603003 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pasteurella multocida # 4 370 3 379 392 155 29.0 1e-37 MKKILFYIDSLILGGEQKLAIDYIRLLCKNYKLQILINMDFGNDNFFLSKIPSNLPISFV IEKNLIETLNFYRKNRKNSIKNRILYSYYLYKKRRARLHNLPSIIKSLDYDFCIDFSNKL 
PPALTDERVLVWNHSSLEGTSEKTINNFLKPKYKKNKYIIVVSESMKLEYLKAFPEFQKK IKVVPNFIDIQEIEKKSLKTIPEKTSFFLCCSRLDPKKDISTIIRAFSLWKKNKEHYEKL FILGSGVYQNNLETLSKSLDLENDVVFLGQKENPYPYMKQAKLFLHASLQEGFGLVLVEA MACKTPVISTDCPVGPKEILENGTYGILIPMKDENSLYRAIQKIMENESIYLDYQAKAYQ RAHDFSKENVLMRLTPLLF >gi|224461471|gb|ACDD01000031.1| GENE 84 91265 - 92215 855 316 aa, chain - ## HITS:1 COG:STM3707 KEGG:ns NR:ns ## COG: STM3707 COG0463 # Protein_GI_number: 16766992 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 2 228 4 228 344 127 32.0 3e-29 MNKIDISIIIPIYNVESYLRECLESVYAVNLLKEVILVNDGSTDHSLQIAEEFAEKYSEE TILISQEHKGVSPSEARNRGLREAKGEYIYFLDSDDFIEPKAFEVFFSKIKGTDLDILHG RGHYYQNGKILDSMTQHDEIFTTKYPISGKDFMYLMNDADSYVDYVVLSIYRAEYLKQNA FFFKPGITYEDILFSYVVFWNAKKIQHVEDFIYYYRQRDGSITKVSRRYLDIFYVYDFLA EFVIKENIRHFRITADIISQIRSLAKKEKVFNEIVYEKLWRLPSKNIKSIRNLINLKFRS YITKKIKYEKIKNLEY >gi|224461471|gb|ACDD01000031.1| GENE 85 92212 - 93171 714 319 aa, chain - ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 1 221 1 224 329 133 32.0 5e-31 MRIIDLSIVIPIYNVEKYLRECLESVYAMNLLKEVILVNDGSTDHSLQIAEEFAEKYSEE TILISQENKGLSAARNVGLERARGEYIYFIDSDDFLDSKAFESFFYKIQGTDLDILHGNV FQYYEKENRIVKKNLPIKTGIRMGKEFLYQMYEEKCYREVVWLNIYRRKFLLEEKLYFVE KLLYEDTPFSFFAFWKAKKIGYEEECFYYYRFREGSIMSKPRNCLDCFYIFNILMDFIFK EKLKSEEITSHFISSVRSFAKREKIFNDEIYWKLWSLPKKNIICLRNLIDIYFRKKHLKR ITLEEIIGNKQSKFLEDRK >gi|224461471|gb|ACDD01000031.1| GENE 86 93297 - 94484 804 395 aa, chain + ## HITS:1 COG:FN1245 KEGG:ns NR:ns ## COG: FN1245 COG0438 # Protein_GI_number: 19704580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Fusobacterium nucleatum # 3 370 2 364 381 209 37.0 7e-54 MKKRILFRSGSLGMGGLEKVLVQTLQILSSSEWEISLLLTYDEKEKNILEKEIPKNIPYS 
FLNDNNFLKKLEYSQLRKKQNLYHKLKYLYFLHKARRNSLQKTNEYIKKYGPFDIFIDYD GGAMKYIEQIPIPEKVVFFHSSPSQAIHNKGKQKRYQKRLQNYTKIIAICDAMKEELQEM FSSLANKIFRIYNPFLFQRINSLQYDTSSLSENEKKMLSDNYCLMVSRLDTSQKDFSTLF QAFYQAKKEGLQDKLYLIGDGISREELEEEVHKLGLEEEILFLGLQTNPYIWMKHSKLLV HSSRAEGFGLVLVEALACGRMVIASDCPVGPREILNQETCGVLFSVGNIEQLKNQLLFFL QNSDLRKKYESHISKSISRFDSKAILQAYQELFLR >gi|224461471|gb|ACDD01000031.1| GENE 87 94506 - 95270 343 254 aa, chain - ## HITS:1 COG:no KEGG:FN1240 NR:ns ## KEGG: FN1240 # Name: not_defined # Def: lipopolysaccharide core biosynthesis protein RfaY # Organism: F.nucleatum # Pathway: Lipopolysaccharide biosynthesis [PATH:fnu00540]; Metabolic pathways [PATH:fnu01100] # 12 251 8 239 240 122 34.0 1e-26 MDEATRKIKSIKYQSYTIYGTEEYLELGKCILKNQYEILEVYKDDNRSYVAKIRLNGKIY VLKSPRSEIRLIQRKWKTFWNKGEALTTLYNVFSLKQKKFNNIAAVYLAIVRKHFLIQES FLLMEYIEGEVFNHPERLDDFMKIVNKLHSLGRYHGDLNTSNVVLTKKGLYLIDTQAKKD IFGHFKRAYDILILSEDKVVTRIQYPILKKYLGSSFVFFWLLAKTLRKIKYSDFVYQLRK RKRRQRKKKWKEKS >gi|224461471|gb|ACDD01000031.1| GENE 88 95246 - 96004 841 252 aa, chain - ## HITS:1 COG:FN1244 KEGG:ns NR:ns ## COG: FN1244 COG0726 # Protein_GI_number: 19704579 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Fusobacterium nucleatum # 5 240 10 246 250 200 49.0 3e-51 MKKQKIEIPILMYHQFKEDMNHVGNSIATYVTRKQFEWHLRTLKFLGYETITFRDLEKIG LENRLKKRYIILTVDDGYQDNYEILFPLLKQYQMKAVIYLVSDSYNRWDVEEYGVDKSPM MKEEQVREMIESGLVEFGGHTLHHCDFDVVDEETARQEILENKKELEEKYGISLTSFAYP YGHVTETAKKIVKEAGYHFAVSTSTGTGIITDDLYEMRRTSIDRTSVWRFLKRISRRYSV YKGKKWMRQQEK >gi|224461471|gb|ACDD01000031.1| GENE 89 95955 - 96059 62 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKFTMDCMWLNNLERIRVDEKTKNRNSNIDVSSI >gi|224461471|gb|ACDD01000031.1| GENE 90 96093 - 96272 272 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452261|ref|ZP_05617560.1| ## NR: gi|257452261|ref|ZP_05617560.1| hypothetical protein F3_04285 [Fusobacterium sp. 
3_1_5R] # 1 59 1 59 59 97 100.0 3e-19 MKVKFKNLGPIKNGEFDIKDLKNLNLKKNMEPYEGMIPKKWKYYTKGLTVDSRGTSASN >gi|224461471|gb|ACDD01000031.1| GENE 91 96256 - 97329 1284 357 aa, chain - ## HITS:1 COG:FN1247 KEGG:ns NR:ns ## COG: FN1247 COG0859 # Protein_GI_number: 19704582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 5 326 12 346 379 264 46.0 2e-70 MWKKFQNFMRKYRLRLGRYLWDRKINKEKIVEGNFIEKNNIHSILFLRQDGKVGDMVVHT MIFRAIKEQYPKIKIGVITKGAAKGIIENNPNVDKIYHFEKDKNKIKKLAKQIAEEHYDL LVDFSIMLRVRDMMLISLCQARYNIGVNRDDWQLFDINVHFDFHSHITNLYGAFLDRIGV KYSSTKYEIFNEPLALETQNRVIVLNPYASNRHRSLSDEMVVRIGKEILQYKKIEIYIVG EASRKEELEEIAKNIGGNTRYYPTKSILDLVALIQRADLIVSPDTAAVHIASAFDKKIIS VYLEDNEEIYYARMWAPNSSQAKIIYSNHENMDWFAWEEMEKYLQMMLEGDKNESKV >gi|224461471|gb|ACDD01000031.1| GENE 92 97336 - 98979 1136 547 aa, chain - ## HITS:1 COG:jhp1312 KEGG:ns NR:ns ## COG: jhp1312 COG2194 # Protein_GI_number: 15612377 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Helicobacter pylori J99 # 220 535 224 550 553 183 34.0 1e-45 MDFLTSILIQNGKVFYFLIWIHMVINIFVWRVEHGKFQKRILEKAINSFFFLFLVNMLLL FFPYLSYLYAVLLVILCFSEILFYTEYQSLFTSNTFLVLQETNIQESKEFLKEFFSFQFF RNFFLLFIVVFVIPYFLLLGTKKLPFTILQYFLAFTFIVSFITFLQAQLPKKRKRYYSYL PLLRITKAYFDAKKQGRENELSLEEQKKIEILIEKAENKMDTLIFVIGESASRNYMEIYG SYLKNTPYMQKIKEEGNLFAFENVISSESLTAISIPNMLTFKNYEEKKPWYQCSNLISIL KKAGYETYWLSNQSKNETVGRVFSSLAGSSFFAEDFTKNQEIYDEILIQKGLEIVKDTKK KAIFFHLSGSHNSYVKRYPENWNIDTIETIRGNQKEKIKKYIAEYSNSLRYTDYILNLII EKFKQENMACFYLSDHAEEMYESRNVRGHAGDGGSRYMVEIPMFLYLSSFFQEQNPQLVK ICEQRKSSPYMTDDIIHTILGIMGIKTSDYEENRDFLSMEFNPSRKRIYQGKDYDGFWKK QFLGKGE >gi|224461471|gb|ACDD01000031.1| GENE 93 99164 - 100045 1046 293 aa, chain + ## HITS:1 COG:CAC0679 KEGG:ns NR:ns ## COG: CAC0679 COG1597 # Protein_GI_number: 15893967 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and 
enzymes related to eukaryotic diacylglycerol kinase # Organism: Clostridium acetobutylicum # 1 292 1 290 295 302 53.0 5e-82 MQKVKFIYNPISGAANTPKMLDTVIAFYQKYNKTIVPFRIGENFPLEMAFEDIHENYEHI LIAGGDGTINRTINLYLQKNLSLPIAILPTGTANDFAKYLSMPMDIEEACEKILKAEVKK VDLGQVNDKYFINVFSFGLFTDVSQKTPTHLKNTFGKLAYYFNGIKEIPRFTKIDLRVES EDLTIQTKCFLAFVFNGQTAGNINIAYNSQIDDGLLDVILVKGENLLKLGNLVYNFLRGE HLEEADKENILYFKSKALTLSSPQEITTDIDGEAGPQLPVTITCIPKTLNILF >gi|224461471|gb|ACDD01000031.1| GENE 94 100090 - 100311 181 73 aa, chain - ## HITS:1 COG:no KEGG:PG1526 NR:ns ## KEGG: PG1526 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: Homologous recombination [PATH:pgi03440] # 4 73 406 475 479 89 57.0 4e-17 MGPTIHDTIHDKLESVLEFCKAPKSREEIQSFLKLKNRSHTMKFYIQPLLEEGKLKMVFP EKPKSKYQKYIKK >gi|224461471|gb|ACDD01000031.1| GENE 95 100298 - 100411 82 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGPIVCVITSDNGILVRKISPSSKIGIPPEYNLVYL >gi|224461471|gb|ACDD01000031.1| GENE 96 100436 - 101734 1767 432 aa, chain - ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 1 431 1 431 433 472 53.0 1e-133 MKKEISFAKDLMEFLDKSPCAFFAVEEMKARLQAKGYEELQEQDAWDLKKNGKYYVTKNN SAILAFQIGSGEIEKEGFHIIGSHSDSPCFRVKHNPEMSVEGKYLKLNTEVYGGPILSTW FDRALSLAGRVTVKGKDAFHPKSLFVNIDEDFMTIPNLCIHMNRGVNDGASWNAQKDTLP FLATLEKGMEVEGALQRKIADLLAVKIEDILGMDLFVYDREKAKIIGMKQEFVQSGRIDN LGMAHASLEALLTSKKAKACNVILVSDNEEVGSMTKQGANSPFLKNTLRRIVLSLGKGEE EFMRALANSFLISSDQAHALHPNYTEKQDLTNRPVLNGGVAIKIAANQAYTSDAHSIAVF TGICQKAKQKYQFFHNRSDMKGGSTIGPITTTQLDIPSVDIGNPILSMHSVRELLGIQDH YSLYQIFQEFYK >gi|224461471|gb|ACDD01000031.1| GENE 97 101795 - 101920 112 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYNLVLSDDAKKQLKNHPKISLGWFSSFLLFPISTKHSYS >gi|224461471|gb|ACDD01000031.1| GENE 98 101892 - 103709 2191 605 aa, chain - ## HITS:1 COG:FN0321 KEGG:ns NR:ns ## COG: FN0321 COG0326 # 
Protein_GI_number: 19703666 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Fusobacterium nucleatum # 1 605 1 607 607 752 69.0 0 MRKEEKIFQAETKELLHLMIHSIYTNQEIFLRELISNASDALDKFKFQSLTDDSLEQGDA LEIHLSMDKEKREISIEDNGIGMTYEEVNENIGTIAKSGSKAFREKLEAAQKSEVDIIGQ FGVGFYSAFIVADEVTLETKSPYGETGVKWTSKGDGTYEVEEIEKENRGSKITLHVKEGE EFDQFLEEWKIKELVKKYSDYIRYQIKMGEDTLNSSQPIWKKTKSEVKEEEYKEFYKSNF HDWQDPLLHFPLKVQGNVEYSALLYIPQKAPFDFYTKNFKRGLQLYTKNVFIMDKCEELI PEYFSFVSGLVDCDSLSLNISREILQQNKELEVISKNLEKKIITELKKLWKNDRETYIKM WEEFGKNIKFGVQDMFGMNKEKLQDLLIFQSTLEDKYVSLKEYVDRMGEAKEILYVAGDD LTTMKSLPKMEALKEQGKEVLLFTDKIDEFTIRVLQEYEGKKFKSISDSDFVLEGSEEKQ EEAKKAAEEHKDLLAEVKDILGDKITEVNFSANIGNVASSLLSKGAISLEMEKVLSEMPG NEKVKAEKVLALNPEHPLIQRLQEEKNEEDKKNLVSVLYNQARLLEGFAVENPAEFIKSM NALLK >gi|224461471|gb|ACDD01000031.1| GENE 99 103864 - 105363 2243 499 aa, chain - ## HITS:1 COG:FN1340 KEGG:ns NR:ns ## COG: FN1340 COG0008 # Protein_GI_number: 19704675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Fusobacterium nucleatum # 2 497 12 514 516 810 78.0 0 MEKKIRTRIAPSPTGDPHVGTAYIALFNLAFANHNGGDFILRIEDTDQNRYTEGSEQMIF DALHWLGLDYAEGPDVGGDYGSYRQSERFDLYVKYAKELVEKGGAYYCFCTSDRLDNLRE RQKAMGKAPGYDGHCRSLTKEEIEAKLAAGEPYVIRLKMPYEGETVIHDRLRGDIVFENN KIDDQVLLKADGFPTYHLANVVDDHLMGITHVIRAEEWIPSTPKHIQLYKAFGWEAPEFI HMPLLRNSDRTKISKRKNPVSLIWYKEEGYIKEGIINFLGLMGYSYGENQEIFSLQEFID NFNIDKVSLGGPVFDLVKLGWVNNHQMRLKDLEELTKLAVPFFEREGYIANFETYKKIVA IQRESAQTLKQLAQESKTFFEDEYELPVVTEDMNKKERKSVEKLHASLEDEVGKKSIALF LEKLNAWKQEQFTAEEAKDLLHSLLDELQEGPGKVFMPLRAVMTGQARGADLYNVLYVIG KERAIKRIHTMLDKKGIQL >gi|224461471|gb|ACDD01000031.1| GENE 100 105534 - 106397 920 287 aa, chain + ## HITS:1 COG:PA3829 KEGG:ns NR:ns ## COG: PA3829 COG1073 # Protein_GI_number: 15599024 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 7 
284 7 291 307 83 25.0 4e-16 MTIEKFEIYSEKKKLQGKKYLANVEKRKKKTILMCHGFAGIQDLFFPSYAEKFVEEGFDV ITFDYNGFGESEGITEIVPNHQIQDILNIILYIKRDETLQENKLFLWGTSLGGLYVLKVA TLSKEIAGIYAQITFANGLRNNTLGLDEEGVQKYINQIENIKYKEIKDNKVLLLPLKRLL SDEQSKAFLEDYKDIFPELMATKLSLSTIKQINELCIDNDLATIQVPVLLGKAMQDKVNS PMEMNFIYEHLQSDKKLLELDCGHYEIYVGEAFEKAIQEQTSWFEKI >gi|224461471|gb|ACDD01000031.1| GENE 101 106451 - 107431 1102 326 aa, chain - ## HITS:1 COG:FN1341 KEGG:ns NR:ns ## COG: FN1341 COG1186 # Protein_GI_number: 19704676 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Fusobacterium nucleatum # 19 324 1 306 308 453 79.0 1e-127 MEEGFWNDKRSSSAVIKTMNEEKALLASFQSLVEEMEEEEVLIEFVEAGDEMSQIELEEK HKQLLKDMESFSTSLLLDGEYDGNNAIVTIHSGAGGTEACDWADMLYRMYTRWCNDKKYK VEEMDFMPGDSVGIKSITFLVSGYHAYGYLKCEKGIHRLVRISPFDANKKRHTSFASVEV VPEVDESVEVEIEASDIRIDTYRASGAGGQHVNMTDSAVRITHFPTGIVVTCQKERSQLS NRETAMKMLKSKLLEIELKKKEEEMKKIQGEQSEIGWGSQIRSYVFQPYTMVKDHRTGVE IGNIKAVMDGDLDDFMNGYLRWNKKK >gi|224461471|gb|ACDD01000031.1| GENE 102 107478 - 107552 170 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDILECKREYAALREQLEDIRRSL >gi|224461471|gb|ACDD01000031.1| GENE 103 107562 - 107771 357 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452271|ref|ZP_05617570.1| ## NR: gi|257452271|ref|ZP_05617570.1| hypothetical protein F3_04335 [Fusobacterium sp. 
3_1_5R] # 1 69 1 69 69 123 100.0 3e-27 MEKKKVIVCRGMTCGKKNQKMWEALSKREDIILEEVRCFGQCKKGPNVKIDGQIYHFMDL EKVEWFLNK >gi|224461471|gb|ACDD01000031.1| GENE 104 107781 - 108125 387 114 aa, chain - ## HITS:1 COG:FN1342 KEGG:ns NR:ns ## COG: FN1342 COG0736 # Protein_GI_number: 19704677 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Fusobacterium nucleatum # 1 111 1 120 122 103 55.0 6e-23 MIRGIGTDIIEISRIEKAMKKIQFLQKVFTEKEQEEQKARGEKMESYAAIFSAKEAIVKA MGTGFRGISFTDIEILHDDLGKPLVYLYGIAQNNWHISLSHCKEYAVATAIWEE >gi|224461471|gb|ACDD01000031.1| GENE 105 108122 - 108880 958 252 aa, chain - ## HITS:1 COG:FN1343 KEGG:ns NR:ns ## COG: FN1343 COG0084 # Protein_GI_number: 19704678 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Fusobacterium nucleatum # 1 252 7 258 258 371 75.0 1e-103 MKRIDSHVHLNDERFDVDREEVLQRIQEEMDFVVNIGYDLESSQISLDYARKYPFIYATV GLHPAEEEEYTEELEKIFERMAKEEKVLAIGEIGLDYHWMVKSKEEQQEIFRKQLALAER LGKPVVIHTREAMEDTVKILKEFPTIKGILHCYPGSVETAKQMIDRFYLGIGGVLTFKNA KKLVEVVKEIPLEHLILETDCPYMAPTPYRGQRNEPIYTKEVAMKIAELKGISYEEVVEV TNQNTRKAYGML >gi|224461471|gb|ACDD01000031.1| GENE 106 108867 - 109727 1164 286 aa, chain - ## HITS:1 COG:FN1610 KEGG:ns NR:ns ## COG: FN1610 COG1281 # Protein_GI_number: 19704931 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Fusobacterium nucleatum # 1 282 1 280 285 298 53.0 1e-80 MGRVIRGLSKNARFVAVDTTDIVQEAMEIHHCNLLAADSFGRLLTVASLMGNSLKGEDIL TLRTDTNGQIKNILVTADSNGNVKGYLSNNTPDVSDTPLLGEGMLKVIKDFGLKDPYIGF CQMSSHGLAYDLSGYFYTSEQIPTVIAFTVLFRDEHTVEKAGGYMMQLLPNAEESFLEAL EQKVGAIRSIDELFHGGMDLEDIIALLYDDMNSEEKRVVEEYEILEEKEIQYHCNCDRDK FYRALITLGKEEIDKILQEDGKLEAECHFCGKHYEFREEDFKHEEN >gi|224461471|gb|ACDD01000031.1| GENE 107 109739 - 110164 559 141 aa, chain - ## HITS:1 COG:FN1611 KEGG:ns NR:ns ## COG: FN1611 COG1555 # Protein_GI_number: 19704932 # Func_class: L 
Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Fusobacterium nucleatum # 1 139 16 154 159 108 48.0 4e-24 MSQGNMKEKQRGKVDINIANKGEFLAAGIASRYTDGILEYRNAVGAFEHLEELKNIKGIG EATYHKLSKKLEIGTKKNRNPLFINRADKKILSYYGFSKKEIKAIEKYREKEGRISNNII LKKIITKKQYEKYKDLFRYSK >gi|224461471|gb|ACDD01000031.1| GENE 108 110250 - 111632 1891 460 aa, chain - ## HITS:1 COG:FN1612 KEGG:ns NR:ns ## COG: FN1612 COG0635 # Protein_GI_number: 19704933 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 454 9 468 469 520 60.0 1e-147 MNKRSIEEFMRVMLPEALEEELKIEEIPDGVKIQIAGKEVEFCYPDLGKAVDDQKQTMVK LALLKAYQKDYVWGGLMGVRPSKIVRRFLKEGFSYKEVLEHLEHFYLVKKEKAKILVDIV KKEETFLHRGASNLYVGIPFCPTKCSYCSFASYEISGGVGRYYKEFVNTLEKEIRFTGEQ LRKQPQQIESVYFGGGTPSTLTEEDLERILKVFREEIDFSFVREFTFEAGREDSITLKKL EILKKYGVDRVSLNPQTFQEKTLARVHRKFNRRHFEEVYEDCKRLGFILNMDFILGLPEE TTEDILDTLEQLKQFDVENITIHSLAFKRASKLAKGSQEREEIDRKKIEEKISSLMREKK LEPYYLYRQKNMLDWGENIGYAKIGMESIFNMEMIEENQNTIALGGGGISKVVVEEENGH DYIERFVNPKDPALYIREMEERQKQKFALFEKYRKEKNEV >gi|224461471|gb|ACDD01000031.1| GENE 109 111635 - 112603 916 322 aa, chain - ## HITS:1 COG:FN1613 KEGG:ns NR:ns ## COG: FN1613 COG2805 # Protein_GI_number: 19704934 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Fusobacterium nucleatum # 1 312 3 310 316 241 45.0 2e-63 MEKLFVKYRKLGASDLHIREEAKLCYRKDGDLYFSEEVVSTKLFDEFCQSLGILKEEQER DSSYEDSFGHRYRLNFAKGEKGRMLSVRIISEFLPEFPSELYIVLKDLFTSKHGLVLVTG STGSGKSTTLRFFLEQYNESYAKKIICLEDPIEFYYQEKKSLFFQREIGRDSESFETAMK AALRQDPDILLIGEIRDLQSLYTALSFAESGHLVFSTLHTGNCVETIHKMISFSSKEKQE EIRQRLSSSLRWTIAQELVKGKEGGRVPIFELLKNTKAVANMISSGREVQLPSVLESSAS QGMCSKEQSRENWIRKGKLERI Prediction of potential genes in microbial genomes Time: Fri May 20 01:57:23 2011 Seq name: gi|224461470|gb|ACDD01000032.1| 
Fusobacterium sp. 3_1_5R cont1.32, whole genome shotgun sequence
Length of sequence - 14375 bp
Number of predicted genes - 11, with homology - 11
Number of transcription units - 3, operones - 1 average op.length - 9.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 3 - 1497 1704 ## COG0606 Predicted ATPase with chaperone activity
- Prom 1686 - 1745 7.8
+ Prom 1499 - 1558 8.8
2 2 Tu 1 . + CDS 1582 - 2139 557 ## COG1971 Predicted membrane protein
+ Term 2184 - 2221 8.0
- Term 2171 - 2209 8.2
3 3 Op 1 . - CDS 2215 - 2619 682 ## COG0781 Transcription termination factor
4 3 Op 2 . - CDS 2628 - 2852 348 ## gi|257452281|ref|ZP_05617580.1| hypothetical protein F3_04385
5 3 Op 3 . - CDS 2854 - 3390 660 ## FN1618 hypothetical protein
- Term 3399 - 3431 4.0
6 3 Op 4 4/0.000 - CDS 3440 - 3802 670 ## COG1302 Uncharacterized protein conserved in bacteria
7 3 Op 5 . - CDS 3871 - 5214 2023 ## COG0439 Biotin carboxylase
8 3 Op 6 . - CDS 5238 - 5645 495 ## gi|257452285|ref|ZP_05617584.1| biotin carboxyl carrier protein of acetyl-CoA carboxylase (BCCP)
9 3 Op 7 . - CDS 5658 - 9092 3824 ## COG0587 DNA polymerase III, alpha subunit
10 3 Op 8 . - CDS 9109 - 10869 1857 ## FN1385 hypothetical protein
11 3 Op 9 .
- CDS 10885 - 14289 3167 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Prom 14313 - 14372 6.3 Predicted protein(s) >gi|224461470|gb|ACDD01000032.1| GENE 1 3 - 1497 1704 498 aa, chain - ## HITS:1 COG:FN1614 KEGG:ns NR:ns ## COG: FN1614 COG0606 # Protein_GI_number: 19704935 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Fusobacterium nucleatum # 5 498 5 497 497 573 59.0 1e-163 MNFCIYSSAYLGVTPYVIEVEVDISAGLPTFSIVGLGDTAILESRYRVKTALKNSGYLLS PKRIVINLSPAGLRKEGAQYDFPLAVTLMYLSGYLKDPYQKLKKYLWLGELSLNGKLKSV KGLINTAILAKEKGFQGIIIPKDNLEEASLIEGIQVIALSSLKEVQEFLLDEEERDDRLS ISEEEFIFPYDFSEVKGQSHAKRALEISAAGGHNILMIGSPGSGKSMLAKRLPGILPPMS LEERIEATKLYSISGELDGKKLSLQERPFRAPHHTTTEIAMIGGGKKMMPGEISLASGGI LLLDEMNEFKKSVLEALRQPLEDRVVRITRALYRLEYQADTILVGTSNPCPCGYAFENNC RCTASEKYHYQKKLSGPILDRIDLYVEMRRLTEEELLADREEESSKEIKKRVILARKIQE ERYGNTFHTNAKMTQEERKQYCSLSEEDKEFFKVAFAKLEISARGFTKLLSVARTIADLA GREKIEREDLLEALSYRR >gi|224461470|gb|ACDD01000032.1| GENE 2 1582 - 2139 557 185 aa, chain + ## HITS:1 COG:FN1615 KEGG:ns NR:ns ## COG: FN1615 COG1971 # Protein_GI_number: 19704936 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 181 1 180 183 99 35.0 5e-21 MNFIAVFLIGVGLSMDAFAVSICQGLIQIGQNKKEMEKIAFTFGFFQFGMTFLGGMAGKI LVPFVKNYEHIIPCIIFCGIAIFMLKEGWENRNNSCEAVSHLDSFKTLFLLGVATSIDAL FIGITFALQVNYPLFWASILIGCTTFVISAFGYYFGKSFSNLSKNKAYYLGAFLLFALGI HSFIG >gi|224461470|gb|ACDD01000032.1| GENE 3 2215 - 2619 682 134 aa, chain - ## HITS:1 COG:BS_yqhZ KEGG:ns NR:ns ## COG: BS_yqhZ COG0781 # Protein_GI_number: 16079488 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus subtilis # 1 129 1 124 131 91 38.0 3e-19 MTRREAREELFKWIFQTEIQGNSVEEAFEHSFLREEIEKDEVSKVFLERYRKGMVEHQEE VAEKIEAAMTDWDLPRIGYVEKSLLKIAVYEIYFEDLPVEIIVNEAVEIAKIYGDVKTHE FINGVLAKVIKMKK >gi|224461470|gb|ACDD01000032.1| GENE 4 2628 - 2852 348 74 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|257452281|ref|ZP_05617580.1| ## NR: gi|257452281|ref|ZP_05617580.1| hypothetical protein F3_04385 [Fusobacterium sp. 3_1_5R] # 1 74 1 74 74 106 100.0 6e-22 MLEEYIARFVLHMSQSYRKYIGGFFGFLIGVLWLQFGFFPMLFVCLCAILGYKLGDLKIQ KKIKRKILEKLKED >gi|224461470|gb|ACDD01000032.1| GENE 5 2854 - 3390 660 178 aa, chain - ## HITS:1 COG:no KEGG:FN1618 NR:ns ## KEGG: FN1618 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 24 166 6 152 179 73 31.0 3e-12 MGKKLLFFLAWLGIFCIGIFNIIYLVLPSLITRYISISSFMLETAILALSVLYVLLAVYK LLTKFERNKDYQVETPNGTVVIAASTINKYVVEVLQKNFPVQSTKVRSYNKRSGILIDAK MDMVLSKNVADSIQEVQTKITEEVQDKLGIQIKKIKVHLSNMAVKEEAKVEVSEEEVK >gi|224461470|gb|ACDD01000032.1| GENE 6 3440 - 3802 670 120 aa, chain - ## HITS:1 COG:FN1619 KEGG:ns NR:ns ## COG: FN1619 COG1302 # Protein_GI_number: 19704940 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 108 1 108 122 134 82.0 5e-32 MSELGNIRIADEVIKTIAAKAASEVEGVYKLAGGVVDEVSKMLGKKRLTNGVKVEVGEKE CSIEVYIVVEYGYKIPVVAQAVQEAVLKTVSDLSGLKVVEVNVYVQNVMDREEPILEEDL >gi|224461470|gb|ACDD01000032.1| GENE 7 3871 - 5214 2023 447 aa, chain - ## HITS:1 COG:alr0939 KEGG:ns NR:ns ## COG: alr0939 COG0439 # Protein_GI_number: 17228434 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Nostoc sp. 
PCC 7120 # 2 442 3 444 447 524 58.0 1e-148 MFQKILVANRGEIAVRIIRAAKELNIKTVAVYSEADRDSLHVRLADEAVCIGGNTSADSY LKIPNIMAAAEATGADAIHPGYGFLSENAKFAEICMSHEITFIGPRIDCIQNMGDKATAR ATAIANEVPVSRGTGIVQDVEEAVKRVEEEIQYPVMIKATAGGGGKGMRIAFSEEELREN MVAAQQEALAAFGNGDVYIEKYIEEPRHVEVQVLGDNYGNVIHLSTRDCSIQRRHQKMIE EAPAFSVPFKIQNEMGEAAVRLAKAIQYNSAGTLEFLVDKNNQYYFMEMNTRVQVEHTVT ELVTGIDIIQLQIRVAAGEKLNIKQQDVAVFGHAIECRINAENPNDFLPSPGIIQQYIAP GGNGIRVDSHSYTGYEISPYYDSMIGKLIAFGVNREEAIAKMKRALSEYIIEGVETTIPF YLEVLNNENYQKGNVTTAFIEENFSGK >gi|224461470|gb|ACDD01000032.1| GENE 8 5238 - 5645 495 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452285|ref|ZP_05617584.1| ## NR: gi|257452285|ref|ZP_05617584.1| biotin carboxyl carrier protein of acetyl-CoA carboxylase (BCCP) [Fusobacterium sp. 3_1_5R] # 1 135 1 135 135 205 100.0 9e-52 MKLDHHNIQEMMKIVNQYQLEELSYEGEQGKITLKASSNPRIIRKESVEKKEVKKVENSK FIISEAIGKYFFLRENGKAYIEVGQELKVGDTIGYVTSIGISTPLISKFSGVIEEILVKN GDVIDYGKKLIKIQE >gi|224461470|gb|ACDD01000032.1| GENE 9 5658 - 9092 3824 1144 aa, chain - ## HITS:1 COG:FN1383 KEGG:ns NR:ns ## COG: FN1383 COG0587 # Protein_GI_number: 19704718 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Fusobacterium nucleatum # 3 1137 4 1132 1133 1154 53.0 0 MGDFVHLHLHTEYSLLDGVGKIEDYMKRAKSLRMKAIAITDHGNMFGALEFYQKAQKYGL KPIIGMEIYLSEYPLAEKKGRNFHLILLAENYEGYQNLMKLSSLAYLEGFYYKPRLDKEL LKKYSKGIIALSACMNGEIASYILEGAEESKIEATILEYQDIFGKENFYLEVQAHEEIEQ QKVNEALYTFGRKMNIPLVATNDTHYVNKGEHVLQDVIVCIQTGSHLSDEKRMKIEMQDL YLKSYEEMYAVLGEQYQEALQNSVEIAKRCQLWIPMHEFQFPDYNLPEGISSLEEYLKKL TYEGLGKRYPQGLDERIIERVEYELSIINKMGYAGYFVVVWDFISFARSRKIPIGPGRGS AAGSLVAYALEITQLDPLEYHLIFERFLNPERISMPDIDIDICQERRQEIIDYVVQKYGQ DKVAQIATFGTLKARAAIRDVGRVMDVELAKIDKAAKCIPMFASLKEVLEENIELKTMYQ QDVELKNVINTAMRIENKVRHISTHAAGVLITKKSLTESVPLYADSKNGIVSTQYQMKEL EELGLLKIDFLGLRTLTILQRTQDYIEENTGKKIELSEIPLQDKTVYEMLSKGDSFGVFQ MESKGLRSILKRLQPNSFGDIVALLSLYRPGPLGSGMVDDFIDRKHGKKAIEYPHPSLEE VLKETYGVILYQEQVMKIANIMADYSLGEADLLRRAMGKKNVEIMHENRSKFIERSIKKG 
YSQEKAEEIFDLIDKFAGYGFNKSHSAAYALIAYWTAYCKVHYPKYFYAALLSSEISDID KISFYFADAKAHGVEIETPDVQFPSSRFVVKGDKILFALSAIKNVGTGISEKIKEEREKK GEFRSYEDFVERMKKEGLNKKGLEAFIYAGALDSLSGNRHEKIESLDKVLDYVQRKAKAD DIQQMNLFGDSKKNLVQFSLSRTEEYSMEVLLEKEKEFLGFYVSANPLDQYESLYQSFDF DELNLIKEENMEREVWLYGIIQNLKKTRTKKGDAMAFADLENYQGQIPMVIFPKVYQENG FLLLDKSIVFVKGKVQIDYFRGEEIKKVIVQKLFSFEHFLSQSSLRVYLHIVEQEMDVYP KLKESLLQYRGGETELYFALQGKKGKKLRKSSFSIRLTLSFLEEVSQIIGKNRIKIKWKE SEHR >gi|224461470|gb|ACDD01000032.1| GENE 10 9109 - 10869 1857 586 aa, chain - ## HITS:1 COG:no KEGG:FN1385 NR:ns ## KEGG: FN1385 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 21 584 1 576 578 431 45.0 1e-119 MNQKKERCREAMFQFYQRENLFRLYKTYLLDWIAEGYIGKELDVFSISMIGPTTEKETFL SLLEEVYCSETIFQKIFSTLDKEVQEIFETLAWGERYYLCVETKEKYYSEENRFLKELSG KYSFFRLAKDAKNRDYFELHYDILRYIRQFMKKPKEIHLQAVHSPKYMKKENNEEEILEN LTKYFEFYEQGGIALSSSGKILKESKLNMKKYCNISEYYIDAKDLDYLKTETISLFFFLL KDEYKRVDYFQANNIKTIVQDLLSTELVKEEKYAYCSLFLNFLKGVKNIWQHSENLKECF QSILEVLEELESGMLVSVDNIIKSILFQDKFYELIDIQDVKDYIYINEANYERTRITNYD RYRDYIVIPFIKSFFFLLGTLGIFELYYNMPDEDSYLYLKNGYLSKYDGLQYIRLTKLGE YVLGKRESYEFKKVTEESEILLDEEHLIITLLGDAPTKTMFLERIGQKIASNKYKVTKES FLRNLEENSSLQERIEEFHSKIPNPKPQIWIDFLDELALKSRAIIWKPEYRVLKLKEDKE LISIVSKDKRFQEFILRGEDYHIFVKEENMGALSKLFKEYGYYMNW >gi|224461470|gb|ACDD01000032.1| GENE 11 10885 - 14289 3167 1134 aa, chain - ## HITS:1 COG:FN1386 KEGG:ns NR:ns ## COG: FN1386 COG0553 # Protein_GI_number: 19704721 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 245 1133 1 891 892 881 55.0 0 MNLYYIYHSCFAVEGKSHILIFDYYKIPKEKATEREYFFNRYIRQQEKKVYVFSSHSHGD HFNPEIFSWKEENQEIQYILSKDIQGNFPEEIKLFWMGEGEQRDIDDLKIFAYGSTDAGV SFIVYMEEKIIFHAGDFHLWHWEDDTEEEEETMRKEYFRILKTIQKDKHSMIDYAFLPVD PRLGRYTTEGLENFVKELDILQVIPMHFWENYFVMEEAKAILDEYGVQLVAVKQPMEMIH GIFYMLKKEERGYYIDLVDSYGKAVSEEEIELAPIYPYIPMEQKDVFFAGWDKKWDRIFL 
EDQEELLQFLKEQDHFVKENFEKITWKKGKYSLILCIREKKGEEGIYTSQIELSEVDEDI SIITEDIVANGSFYSLKNKRTGLYDLKDFITELRFPEVEKLMTLAHKHFPEMELRYRDYK TVEVETIIVKPQILIEKIASDNSLYLRISAEVSSMDYKFLKDNDFEEVVTLNLREKKIQI SKIDTSPVRELVEELVKILVKTQRELGIRQAYYLDEDYFLILPENVAREFITKNLLQFAN QYQIAGTDKLKKYNIKAVKPRITGNFHYSIDFLEGEAEIEIEGEKFSIQEVLSSYQKDSY IVLSDGTNALINKRYIEKLERIFKESEDNKVKISFFDLPIIDDIIEEKILSDEYVLQKNF FYGMNHLKEYQVSLPKLNATLRSYQEYGYKWLSYLSEKHLGACLADDMGLGKTLQAIAIL TKLHQEKKKSLIIMPKSLIYNWQSEIEKFSPGLKVGIYYGNHRDLQVMEEQDVILTTYGT VRNDIVLLKEFFFDLVILDESQNIKNIHSQTTRAVMLLQSQNRIALSGTPIENNLSELYS LFRFLNPGMFGSLEEFNNTYALPIQKENNPEAVQDLRKKIYPFILRRVKKEVLQDLPDKI EKTIYIDMNVEHKKFYEERRNYYYNMIHASIREKGLGKAQFFILQALNELRQITSCPEVK NSYISSSKKEMLIEQIVEAVENDHKVLVFTNYIGSIENICKSLEEREIAYLSMTGSTKDR QQLVNKFQKNEKYKVFVMTLKTGGVGLNLTAADTIFIYDPWWNKTVENQAIDRAYRLGQD RTVFSYKLILKDTIEEKILQLQELKSKLLDDVISEDNLSNKSFTEKEIEFILGK Prediction of potential genes in microbial genomes Time: Fri May 20 01:57:46 2011 Seq name: gi|224461469|gb|ACDD01000033.1| Fusobacterium sp. 3_1_5R cont1.33, whole genome shotgun sequence Length of sequence - 2471 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 35 - 94 18.2 1 1 Op 1 4/0.000 + CDS 135 - 1010 1071 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 2 1 Op 2 . 
+ CDS 1033 - 2470 1217 ## COG1629 Outer membrane receptor proteins, mostly Fe transport Predicted protein(s) >gi|224461469|gb|ACDD01000033.1| GENE 1 135 - 1010 1071 291 aa, chain + ## HITS:1 COG:FN0767 KEGG:ns NR:ns ## COG: FN0767 COG0614 # Protein_GI_number: 19704102 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 27 289 6 269 275 278 57.0 8e-75 MKKIFITIAIAIFYNFAYAWNINKDFIADNYGNKIPKKAYSKIIVIDPGTVEILFEIGAE NSIVAIGKTGQSKIYPIEKTEKLESVGNMTTPNFEKILQYKPDLVLVNSMMARSMQNFKR LNIPILVSDTETLQDIIQSIRVLGVMTAQEIKAETLAKQCEEKLQKIQAKIPKTSSRKGA ILYAVSPMMAFGKKSLPGDILHYLGVENIADNVPGNRPILSPEYVLKVNPDFLAGAMSLQ SVDDIINSSNVIAKTKAGKNKKIFILDSSMILRDSYRIFDEIEKLQQKIYE >gi|224461469|gb|ACDD01000033.1| GENE 2 1033 - 2470 1217 479 aa, chain + ## HITS:1 COG:FN0768 KEGG:ns NR:ns ## COG: FN0768 COG1629 # Protein_GI_number: 19704103 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 6 479 1 477 715 382 46.0 1e-106 MKKVSLGICNILLSYAMFGAEIDLGTQNIYSETGFETSLRNSISSPYIVTSKDIKEKHYT RVSEILRDIPNIYIGPGGSVDMRGQGNVHARTTVQLLIDGVPANFLDTSHINLPIDTLNP EDIKRIEVIPGGGAVLYGSGTSGGVINIITKKYTGNYARAGYQIGSYHNHKYDVAAGTSL GNLDINLSYSKNNRDGYRKKAFSNSDFFLGKLRYRFNKTDSLEFKYSYFDNKYKGVNMLT KEQIDKDRKQSGLSPTDTLKNSIQKEEWNLTYDAKWTDWLEHKSNLFYQSTEIKSSEYED AVPFYQERINMYKKMLSRPNINPMMKKNLEKQIQTFETIIANNPKIELRQGSHFKDQKFG FKTKNKFKYGENSDFILGLGYIHNKMSRDSWAYSENTQTNQTIETLTNTKIPLNKKTFEI FGLNTYRHNNLEFVQGLRFEHSKYNGKRKYKNLEYPLKDCTMNNVAANLAVNYLYSDTG Prediction of potential genes in microbial genomes Time: Fri May 20 01:57:50 2011 Seq name: gi|224461468|gb|ACDD01000034.1| Fusobacterium sp. 
3_1_5R cont1.34, whole genome shotgun sequence
Length of sequence - 12291 bp
Number of predicted genes - 11, with homology - 11
Number of transcription units - 4, operones - 3 average op.length - 3.3
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 2/0.000 + CDS 2 - 727 681 ## COG1629 Outer membrane receptor proteins, mostly Fe transport
2 1 Op 2 35/0.000 + CDS 740 - 1714 938 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component
3 1 Op 3 1/0.000 + CDS 1711 - 2511 222 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
4 1 Op 4 . + CDS 2527 - 3759 995 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases
+ Term 3768 - 3821 8.1
- Term 3755 - 3809 12.1
5 2 Op 1 . - CDS 3846 - 5759 2676 ## COG0441 Threonyl-tRNA synthetase
6 2 Op 2 . - CDS 5773 - 9387 4009 ## FN0610 hypothetical protein
- Prom 9418 - 9477 10.7
- Term 9439 - 9476 3.2
7 3 Tu 1 . - CDS 9483 - 9803 354 ## COG0526 Thiol-disulfide isomerase and thioredoxins
- Prom 9827 - 9886 5.7
8 4 Op 1 12/0.000 - CDS 9906 - 10574 181 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase
9 4 Op 2 1/0.000 - CDS 10555 - 11022 670 ## COG0802 Predicted ATPase or kinase
10 4 Op 3 1/0.000 - CDS 11032 - 11496 690 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase
11 4 Op 4 .
- CDS 11493 - 12089 629 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 12133 - 12192 15.1 Predicted protein(s) >gi|224461468|gb|ACDD01000034.1| GENE 1 2 - 727 681 241 aa, chain + ## HITS:1 COG:FN0768 KEGG:ns NR:ns ## COG: FN0768 COG1629 # Protein_GI_number: 19704103 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 1 241 473 715 715 247 52.0 2e-65 YSDTGNIYVKYERGFTSPAPAQLMDKIRKNGKYDYVNNNLKSEKSNSFEIGWNDYLFSSL VSGDIFYSETKDEISTIFSGGHGTAFHSVNIGKTKRYGLDIKASQIFDKWKLSEAYTYTH AKIIKDKNKEYEGKYISYVPEHKFSFNIDYQITPKWTIGGEYQYSADTYLSNSNQFGKDG KRATLDIQTSYEFEKGITLYAGINNVCNQKYYETVSVGEKGKKYYDPAAERNYYAGFRYQ F >gi|224461468|gb|ACDD01000034.1| GENE 2 740 - 1714 938 324 aa, chain + ## HITS:1 COG:FN0769 KEGG:ns NR:ns ## COG: FN0769 COG0609 # Protein_GI_number: 19704104 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 4 321 3 320 322 398 67.0 1e-111 MKTKLYLIFAILFCMTAIVICISVGNVYISPKELFHYSYLDEYTKTVLMDLRLPRVIMAF LVGMLLASSGNVVQIIFQNPLADPYIIGIASSATFGAVVAYLLGLPEYSYGAVAFVCCLL STLFLFRIAKKGNHIHISTLLIMGITLASFLAGFTSFSIYWIGEDSFKITIWLMGYLGNA SWKQIVFLLFPIIFSTLYFYQKRNALDILLLGDEQAHSMGISVSKLKSHCLILASLMVAF SVAFTGMIGFVGLIVPHIVRNIVGPLNRRVIPCTLFFGGTFLLICDTVGRMILSPLEVPI GVITALLGAPFFFYLAMKNRRNMP >gi|224461468|gb|ACDD01000034.1| GENE 3 1711 - 2511 222 266 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 9 233 2 226 245 90 28 7e-18 MISKEQDFLKIEKLNFSYGKYPILKDITCNIQERKMTAIIGPNGCGKSTLLKNILGFHSG NFESFSLLGKNWKEFSPKDLARVIAYIPQKINIIPGISVRDFICLGRFSQLKHSWDSYSK RDFEIVDNIIETLHLELFRNRTLVSLSGGELQKIFLAKALAQEAKILLLDEPTSALDFHN AIFFMKTLKKYIQYYDITPIIVLHDLNLAASFCDEIIMMKEGKIVNKGSPKEMMTVENIH QIYGLDCRVHYSIENEAPYIIPKVKN >gi|224461468|gb|ACDD01000034.1| GENE 4 2527 - 3759 995 410 aa, chain + ## HITS:1 
COG:FN0771 KEGG:ns NR:ns ## COG: FN0771 COG0635 # Protein_GI_number: 19704106 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 400 1 405 411 449 55.0 1e-126 MFEKRYKSHHDVENILSKYIKKEAANKKDFLEILEKENSSKELGLYLHTPYCDKICSFCN MNRKKVDAELEEYHEYLCQQIEYYGKFPFCQSSELEVVFFGGGTPTIYSNRQLENILQCL QKNFKFSNSYEMTFETTLHNLSVEKAKLLQANGVNRLSVGIQTFSDRGRKILNRSFGKKE AFQRLKALREAFHGLLCIDIIYNYPEQSDEEVLEDIHLASQLGIDSISFYSLMIQEGSEI LKNLSKESIETFYDVKRDEQLHNLFYQEALTRGYHLLELTKITNGKDKYRYIRSRNLLPI GVGAGGHLENIGCYHMNKKMTFYSKNPENTQKLTQLSGFFQFDSFSDKKLQELAEKEYPR LRRKIEEYIQQGWIQFDGENFSYTTNGVFWGNNINAELLEMALSYSIKIE >gi|224461468|gb|ACDD01000034.1| GENE 5 3846 - 5759 2676 637 aa, chain - ## HITS:1 COG:FN0611 KEGG:ns NR:ns ## COG: FN0611 COG0441 # Protein_GI_number: 19703946 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 19 635 1 618 620 1004 77.0 0 MKIEFLDGKVQEFHEACNMFVVAKSISNSLAKKAVAAKIDGELYDMSYVLDHDAKVEFIM PESEEGVEVIRHSAAHLMAQAVIRLFPGTKVTIGPAIENGFYYDFDPKEQFTEEDLIRIE EEMKKLSKEDIKVERFMMSREEAIEYFEKLGENYKVEIIKEIAKGEQLSFYRQGDFVDLC RGPHVPSTGHIKAVKLKSVAGAYWRGDSKNKMLQRIYGYAFATEKDLKDFLRLMEEAEKR DHRKLGKELELFFLSEYGPGFPFFLPKGMVLRNTLIDLWRAEHEKAGYVQIDTPIMLNRE LWEISGHWFNYRENMYTSSIDDVDFAIKPMNCPGGVLAFKYQQHSYRDLPARVAELGKVH RHEFSGALHGLFRVRAFTQDDSHIFMTEEQIESEIIGVVNLIDKFYSKLFGFQYSIELST RPEKSIGTDEIWEKAETALAGALNHLGREFKINEGDGAFYGPKLDFKIKDAIGRTWQCGT IQLDFNLPERFDVTYIGEDGEKHRPVMIHRVIYGSIERFIGILIEHYAGAFPMWLAPVQV KVLTINDECAPYAKEVVAQLKEQGIRAELDDRSESIGYKIREANGRYKIPMQVIIGKNEI EKREVNVRRFGSQAQESMELDAFLAMVKEEAKVKFKD >gi|224461468|gb|ACDD01000034.1| GENE 6 5773 - 9387 4009 1204 aa, chain - ## HITS:1 COG:no KEGG:FN0610 NR:ns ## KEGG: FN0610 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 16 1169 3 1129 1155 738 37.0 0 MKDKRILLALFLVLSINAATYAQEVEGRADEVEIDLNTNTMTSESGIVLKQSNMKTKVYT 
VQRDTDAGKAYYRDGIIAQVDNETGKVKIESQEGEANTTGDEAHFYKNFGYLEVAPVTGA EVPNDRVYFGSDHISYKDEKIYIDKGWMTTDFKVINHSQNPKEISYHILSDQIVIEPDKQ LTMYDSNLYLGKHKTLPMEFPWFRLNIRGGSKVPLFPNWSEKEYYGWQTSWGVLYGNRSS KLKGGIAPKFADDMGWLIGRSETWYDTKRYGTAQLNVTDLLLHSKVKGKKDSVDAVKYEQ KHKRYHVETKHDYDGEYGSFHFSGINATTSMISSLDELITKYEANKEFDNTAGLRGTGIY LDRPGFDKNIAYYSLNSDLHGMGKNKDIRFQASSKLTTDKKLYTLSTYDNVEDNVFDTKG DNALYTNVSLYQDNTRYKLGGYYNYLYDITPGYTRKYDRSRGEDYGFVALDKKNTLGIQY DEIRGDKLRGLHLWETESNATALKRSNLLGLPIDYTPVAVSEYSIYNQQNGKLLLGDYSS GKLHIKPSIATKREEKQLDLLENEKIVITDNSSLNPENYIVKGGYDRFRQYNRFNNEVYE KRKENKANINIIEDDSLNVNVFAGNQKEEIWTREGMISGKIDDVTKLKSESSFYGFDIEK KFGTLVLRGDLRQDDYRHSDGSSLRYGLGLNHKVNLYEKEDVKVSNDFDIYMQKYRYSGG KEEDKAQNLYTKKDSYQIKDTLSWDTKAVHTQYSGEYQVDKNPIYHSDKKAEKLKQKINF QVGEDKKIGLFYDRNDRYTNRIVNKFENYKDLSTENYGGDFDYGPHSFAYERQNIDFQFP KIAKEEIESDSFRYSYRWDDKSLGFSYRTGKDSVLMNEFHDSKVLDIDNKVYGVNFHKNG DIQHHVYFNYENSRHREGSSKVNLEGSRKNIGHTDEINFSYQYRDTRMKDAEYIKYASLE TGKAENELSVQDIERVKSLWEDKKVMEDPFHLTGIRDDAFLFGDGRVNFRFYTSLERNKA RYEKTHNLGDSLQKIKGAMYYTYNRYGLGYTFEENAGWLKNNSQYTWKKKNREHQISLYG KMGKPSESWKVKTYAKFYENLLDKVNADKKHKRALDGIGVEIGKEFDFYEWSVLYERKYS LTTRDYEWRLGLQFTLLTFPDNAIFSLGANKSSQKVRPKTQFMDSVNVEKIVDDKVKTKV IMQK >gi|224461468|gb|ACDD01000034.1| GENE 7 9483 - 9803 354 106 aa, chain - ## HITS:1 COG:MT4033 KEGG:ns NR:ns ## COG: MT4033 COG0526 # Protein_GI_number: 15843547 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Mycobacterium tuberculosis CDC1551 # 2 88 7 95 116 85 38.0 2e-17 MSGVIHLNESSFSEDLLQNHEKVLINFWAEWCGPCKFVNHILEELAQEENLIICLINVDK NPKLMQKFQIETIPNVLLYSHGKLIEQVKKLDKLIMKEEIRKKILA >gi|224461468|gb|ACDD01000034.1| GENE 8 9906 - 10574 181 222 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 43 216 1 180 380 74 30 4e-13 MLILGIDTSTKLCSVALYDTEKGILGELNITVPKNHSNVILPMIDQLFLFTEKTIEDVER 
IAVGIGPGSFTGIRVGMAIAKGLAIGKKIPIVGVSGLDALAASVREKGRVFALLDARKSR VYYRIFENGKALCEAKDGNLKDVLQEYQGAEVNYFIGDGALAYQNMILEAYGKKACILSE ESSVARALYFAKMSVDQEEDNLYTLEPMYVCKSQAEKSKENV >gi|224461468|gb|ACDD01000034.1| GENE 9 10555 - 11022 670 155 aa, chain - ## HITS:1 COG:FN0929 KEGG:ns NR:ns ## COG: FN0929 COG0802 # Protein_GI_number: 19704264 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Fusobacterium nucleatum # 1 155 1 153 153 160 62.0 7e-40 MRKKLYFQELDTLADSLANYAKEDTFIALIGDLGTGKTHFTQRFAKSLGVTENLKSPTFN YVLGYESGRLPLYHFDVYRLTEAEELYEVGYEDYLRENGVILMEWANLVESELPEEYIRI ELHYTEEENQREVDLCYIGNQEKEKELFTYVNFGN >gi|224461468|gb|ACDD01000034.1| GENE 10 11032 - 11496 690 154 aa, chain - ## HITS:1 COG:FN0930 KEGG:ns NR:ns ## COG: FN0930 COG2870 # Protein_GI_number: 19704265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 1 154 7 160 160 221 75.0 4e-58 MILDRKLASIMVEEAKKQGKIVVFTNGCFDILHVGHLRYLQEAKRQGDILIVGVNSDASV RRLKGKDRPINSEKDRAEMLCGLESVDYTVLFEEDTPVSLLEELKPSIHVKGGDYKKEDL PETEIVEKHGGEVRILSFVEGKSTTNIVNKIQKK >gi|224461468|gb|ACDD01000034.1| GENE 11 11493 - 12089 629 198 aa, chain - ## HITS:1 COG:FN0931 KEGG:ns NR:ns ## COG: FN0931 COG0494 # Protein_GI_number: 19704266 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 1 196 1 196 205 133 39.0 3e-31 MQIQNVLHAGKYKCAAVMICFYKEKDETYIVLEKRANGIHQGGEISFPGGKRDFEDIDFK ATAIRETSEELGISKDKIEYLGYAGTFIGIFDLLLDVHLCKLKIEKKEELLYNKQEVEYL IFLPLSYLERTEPIMEIAELKNIPKFDVRAYGLPSRYWDTWPYYERNLYFYFYQGEVIWG ITAEILFSWMKGKKKEKL Prediction of potential genes in microbial genomes Time: Fri May 20 01:58:06 2011 Seq name: gi|224461467|gb|ACDD01000035.1| Fusobacterium sp. 
3_1_5R cont1.35, whole genome shotgun sequence Length of sequence - 6016 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1012 1340 ## FN1859 major outer membrane protein + Term 1051 - 1085 4.0 + Prom 1044 - 1103 6.3 2 2 Tu 1 . + CDS 1126 - 2763 1626 ## Lebu_0003 protein of unknown function DUF1703 + Term 2773 - 2832 11.1 - Term 2767 - 2812 3.8 3 3 Tu 1 . - CDS 2831 - 2908 68 ## - Prom 2931 - 2990 3.5 4 4 Op 1 . - CDS 3054 - 3356 122 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 5 4 Op 2 . - CDS 3319 - 3552 303 ## gi|257452305|ref|ZP_05617604.1| hypothetical protein F3_04505 - Prom 3579 - 3638 7.4 - Term 3699 - 3735 3.0 6 5 Tu 1 . - CDS 3747 - 4994 1434 ## Lebu_1380 protein of unknown function DUF1703 - Prom 5019 - 5078 7.2 - Term 5037 - 5071 4.0 7 6 Tu 1 . - CDS 5110 - 6015 874 ## FN1859 major outer membrane protein Predicted protein(s) >gi|224461467|gb|ACDD01000035.1| GENE 1 2 - 1012 1340 336 aa, chain + ## HITS:1 COG:no KEGG:FN1859 NR:ns ## KEGG: FN1859 # Name: not_defined # Def: major outer membrane protein # Organism: F.nucleatum # Pathway: not_defined # 2 336 60 368 368 110 29.0 7e-23 YVSVMTKTMGEIESTSTKDSDSDRTWKGADPKTRLQVQSEVALTENQNLEFRLRDWNGIT HRNHGASGVDEYYLQHTYDFGKLGSSKVNAKLESRYQREASSDILDEEDFKLYVTDSKFI RERVVFDFADYLFQNDYVKTTTAVLAPEFKYAWGTWNEWEKDEKGKVTTDDYSNHYENYG LYANYEADIIGGVHFQAEFDNLFGYTVYNGRGDEDVKGKTGSAEFTLSRPTTLYKAGKHT FAFNPELQYSTGWAWNKKTTEYGLQGTVRTQSDKLEKWGSYTAKFEPKLAYTYQATEFVK LMANVGAEYKNRTTDRHAASHWRWQPYAELGLKVNF >gi|224461467|gb|ACDD01000035.1| GENE 2 1126 - 2763 1626 545 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 1 544 1 544 545 493 52.0 1e-138 MTKKIPIGISDFKVLIENGYYYFDKTKFIPNIIEEFGKSLLFTRPRRFGKTLNMSMLRYF 
FDIRNAEENRNLFKDLYIEKTSNIEYQGQYPIIYLSLKDLKLNSFEETIEEIKNLIAELY EEHIYILENLSSFNKNIFNHVLNRNANMVELRNSLKFLSKILTDFYRKNVVLIIDEYDTP IVSAYEYGYYDEIITFFRPFLSSVLKDNEHLQMGIMTGILRVAKEGIFSGLNNLSVYTIL DDKYSDFFGLTEKEVEKSLLDFQMDYRIDEVKSWYDGYKFGETEIYNPWSILNFLSSKKL ESYWINTSDNFLIKKILGNSDDSIREELREIFNYRFIEKSIDKASNMIDVNNTKEIWQLL LFSGYLTIGKKTETTIDSYLLKVVNKEVHNFFQKTFIDNYFVDESLFAKMMMSFLQKDLE TFEKCLQKILLLSFSYYDGSKEEKFYHNLMFGMILYLDRRYKVYSNKEIGFGRYDIVLEP KNKNDTAYILEFKVAKNREELEEKSKEALEQIKNQKYDVDLKAKGYSILQVGMAFFGKEV KVKYL >gi|224461467|gb|ACDD01000035.1| GENE 3 2831 - 2908 68 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKERGIQEIVKLGIAFYRKKVKIKV >gi|224461467|gb|ACDD01000035.1| GENE 4 3054 - 3356 122 100 aa, chain - ## HITS:1 COG:FN0211 KEGG:ns NR:ns ## COG: FN0211 COG2026 # Protein_GI_number: 19703556 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Fusobacterium nucleatum # 25 95 16 87 88 66 47.0 1e-11 MQKVMGLIMTNYKIIRTSPFDEHFKKLDNSVKILILKYLKKLENSENPKAYGKELSGNFA GLYRFRVNSYRIISKIEDDKLIIYALAVGKRSTIYSRCKI >gi|224461467|gb|ACDD01000035.1| GENE 5 3319 - 3552 303 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452305|ref|ZP_05617604.1| ## NR: gi|257452305|ref|ZP_05617604.1| hypothetical protein F3_04505 [Fusobacterium sp. 
3_1_5R] # 1 77 1 77 77 110 100.0 3e-23 MATISFRLSDEEKKIISNFSKKNNITISELMLKSILEKIEDEEDYMLGEKIMLDPKTKIT GDLKELAESYGIDYDKL >gi|224461467|gb|ACDD01000035.1| GENE 6 3747 - 4994 1434 415 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1380 NR:ns ## KEGG: Lebu_1380 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 411 9 413 552 426 52.0 1e-118 MKKSLPIGITNFQELIQGNYYFIDKTKLIEDILQDGSKVTLFTRPRRFGKTLNMSMLQYF WDIQQAEDNQKLFQGLHIESSPYFTEQGKYPVIFLSLKDIKERTWEECKKEIKKWLSDLY DKYHFLRDSFNQRDLKYFEDIWLGKEEGSYSNALKDLSKYLCRYYQKKVVILIDEYDTPI VSAYENGYYEEAISFFRNLYSAALKDNEYLQVGVMTGILRVAKEGIFSGLNNLAVYGILD EKYSSSFGLTEEEVEQALHYYEMEYNLPEVKEWYDGYRFGKTEIYNPWSIISYIINQRIE PYWIGTSSNALINQMLEKARKEESDIFQKLENLFQGNSILQKIQKGSDFHDLVHVEEVWQ LFLYSGYLTMAREEEQGFYQLKIPNKEVYSFFQESFIQKFLGNVTNFSALVVALT >gi|224461467|gb|ACDD01000035.1| GENE 7 5110 - 6015 874 301 aa, chain - ## HITS:1 COG:no KEGG:FN1859 NR:ns ## KEGG: FN1859 # Name: not_defined # Def: major outer membrane protein # Organism: F.nucleatum # Pathway: not_defined # 1 301 59 368 368 298 53.0 2e-79 SLNIQYRWYGETENRVPGEKNSDWARGTNNYGRLQTEAKVNITENQKLEVRARNFHTLRT EKSKAPEDQLRLRHFYNFGNIADTKVNATSRVEYNQKSNDGEKHIEASVGFDFAQYLFNN DFIKTDSFVLRPRYKYTWEGHKDGSYTNTLGINLESAYTLPLGFSAEFNLYPQYVFADEK FETNKGMKDKEFYMEMEAYLYNSTELYKNGKFDVSFEFEGGYDPYSFHQYKLVENRNDNK ERRDYSLYALPYLQANYQATEFVKLYAAAGAEYRNWKDSAESTASHWRWQPTAWAGMKVN F Prediction of potential genes in microbial genomes Time: Fri May 20 01:58:43 2011 Seq name: gi|224461466|gb|ACDD01000036.1| Fusobacterium sp. 3_1_5R cont1.36, whole genome shotgun sequence Length of sequence - 33544 bp Number of predicted genes - 30, with homology - 23 Number of transcription units - 9, operones - 6 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 363 - 402 7.0 1 1 Op 1 . - CDS 410 - 571 312 ## 2 1 Op 2 . - CDS 600 - 2504 2672 ## COG0370 Fe2+ transport system protein B - Prom 2535 - 2594 2.3 - Term 2517 - 2564 2.1 3 2 Op 1 . - CDS 2597 - 2695 137 ## 4 2 Op 2 . 
- CDS 2725 - 2796 81 ##
- Prom 2945 - 3004 14.9
- Term 2934 - 2986 5.4
5 3 Op 1 12/0.000 - CDS 3060 - 3575 675 ## COG0602 Organic radical activating enzymes
6 3 Op 2 . - CDS 3591 - 5795 2884 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase
- Prom 5819 - 5878 2.1
7 4 Op 1 . - CDS 5881 - 5958 103 ##
8 4 Op 2 . - CDS 5975 - 6964 1828 ## COG0191 Fructose/tagatose bisphosphate aldolase
9 4 Op 3 . - CDS 6983 - 7468 230 ## FN0109 hypothetical protein
10 4 Op 4 . - CDS 7455 - 8729 1719 ## COG0172 Seryl-tRNA synthetase
11 4 Op 5 . - CDS 8723 - 9337 743 ## COG5011 Uncharacterized protein conserved in bacteria
12 4 Op 6 . - CDS 9410 - 10411 1374 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase
13 4 Op 7 . - CDS 10432 - 10911 853 ## FN1910 hypothetical protein
14 4 Op 8 . - CDS 10956 - 13079 2808 ## COG4775 Outer membrane protein/protective antigen OMA87
15 4 Op 9 . - CDS 13150 - 17172 4073 ## FN1912 hypothetical protein
16 4 Op 10 24/0.000 - CDS 17191 - 17895 1038 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division
17 4 Op 11 1/0.000 - CDS 17897 - 19789 2311 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division
18 4 Op 12 17/0.000 - CDS 19805 - 20464 988 ## COG0569 K+ transport systems, NAD-binding component
19 4 Op 13 . - CDS 20477 - 21811 1623 ## COG0168 Trk-type K+ transport systems, membrane components
20 4 Op 14 . - CDS 21827 - 23053 1933 ## COG2195 Di- and tripeptidases
21 4 Op 15 1/0.000 - CDS 23050 - 24429 1660 ## COG0534 Na+-driven multidrug efflux pump
22 4 Op 16 . - CDS 24444 - 26024 1649 ## COG0038 Chloride channel protein EriC
23 4 Op 17 . - CDS 26017 - 26820 1075 ## COG0253 Diaminopimelate epimerase
- Prom 26901 - 26960 9.9
+ Prom 26896 - 26955 10.7
24 5 Op 1 . + CDS 26975 - 27220 523 ## FN1796 hypothetical protein
25 5 Op 2 . + CDS 27217 - 27279 56 ##
+ Term 27349 - 27411 9.0
- Term 27344 - 27391 7.1
26 6 Op 1 .
- CDS 27409 - 28071 685 ## gi|257452329|ref|ZP_05617628.1| hypothetical protein F3_04625 27 6 Op 2 . - CDS 28068 - 29456 1983 ## COG1757 Na+/H+ antiporter - Prom 29491 - 29550 9.1 + Prom 29149 - 29208 7.5 28 7 Tu 1 . + CDS 29411 - 29515 126 ## + Term 29526 - 29568 1.7 - Term 29512 - 29556 7.2 29 8 Tu 1 . - CDS 29628 - 29702 150 ## - Prom 29725 - 29784 8.5 - Term 29796 - 29835 7.7 30 9 Tu 1 . - CDS 29851 - 33429 5236 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 33483 - 33542 7.2 Predicted protein(s) >gi|224461466|gb|ACDD01000036.1| GENE 1 410 - 571 312 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNMSTLIVLLVLVVIIIVAIRRVKTKGGCSCGKEHDCGCGCGCGHSHEEEKNN >gi|224461466|gb|ACDD01000036.1| GENE 2 600 - 2504 2672 634 aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 2 619 82 700 709 510 42.0 1e-144 MIINVVDASNIERNLYLSTQLSEIGIPMVIALNMMDVVERNQDKIDTEKLSQLLACPIIE ISALRNKNIDTLVETAMKTAGKKQVHIQSFEEEVEELIKKIEDGVSALKDSSYKRWYAIK LFEKDEKAVANLALSSEKEEKVREIREAAEEKYDDDGEGIITDARYHFITSIIGKTVKKG RTGLTNSDKVDRILTNRILALPIFVVLMFGIYYIAVTIVGGPITDWVNDTFFGEMIGENV AGMLESAEVAPWLSSLIVDGIIGGVGGVLGFLPIIAALYVMMAILEDIGYMARIAFILDR IFRKFGLSGKSFIPILIGTGCSVPGIMATRTIENDNDRRMTIMVASFMPCGAKTEIIALF AASLFAGDKGWWFAPFCYFAGIIAVIISGIMLKKTQQFSGDPAPFVMELPEYHLPTPWNV ARTVWDRVKAFVIKAGTIILLTTVVIWFLQNISTSFEFVEFSEDSHSILEAVGKVIAPIF APLGFGHWAATVATITGLVAKEVVVSTFGVVAGLGDVGADDPTMVEYASSIFTSVSALSF MLFNQLSIPCFAALGAIRSEMNSKKWTWFAIAYQLLFSYVIALMVYQFGKVFILGEAFGV GTVVAVIIFALMVYLLCRKSTTKKGEVKRAVEVK >gi|224461466|gb|ACDD01000036.1| GENE 3 2597 - 2695 137 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFNALTGSNQYVGNWPGVTVEKRQGLIRKIKK >gi|224461466|gb|ACDD01000036.1| GENE 4 2725 - 2796 81 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYFRKGMDMNLRMAPIGVEMKI >gi|224461466|gb|ACDD01000036.1| GENE 
5 3060 - 3575 675 171 aa, chain - ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 167 1 167 168 230 62.0 8e-61 MNYSGIKYSDMINGPGIRVSLFVSGCSHACPGCFNKETWNPNYGEEFTEKQKKEIFDYFK KYPMLLRGLSLLGGDPTYKTNIEPLKTFILEFRKNFPEKDIWMWSGYTWEEILSSPSLLS LVKNCDVLVEGKFIETEKDLSLQWRGSRNQRVIDIVKSLKEKAIVLFEEIA >gi|224461466|gb|ACDD01000036.1| GENE 6 3591 - 5795 2884 734 aa, chain - ## HITS:1 COG:FN0311 KEGG:ns NR:ns ## COG: FN0311 COG1328 # Protein_GI_number: 19703656 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Fusobacterium nucleatum # 2 733 1 728 728 1176 77.0 0 MVKKVIKRDGTVIDFDAKRIVHAISMAFKQNSRTIPEELISKIAHQIENIENKIMSVEEI QDLVVKKLMASSEKDIAMAYQSYRTLKTEIRNKEKSIYKQIGELVDASNESLLVENANKD AKTISVQRDLLAGISSRDYYLNKIVPRHIKEAHIKGEIHLHDLDYLLFRETNCELVDIER MLKGGCNIGNAKMLEPNSVDVAVGHIIQIIASVSSNTYGGCSIPYLDRALVPYIQKSFYK HFKRGLHYTEDFEEEKVENILNSYQREEIIYDNQELKETYPKAYRYASDLTEESVKQAMQ GLEYEINSLSTVNGQTPFTTIGIGTETSWEGRLVQKYVFKTRMGGFGKNKETAIFPKIAY AMTEKLNLNPDSPNWDIAQLAFECMTKSIYPDILFVTQEQWEQGTVVYPMGCRAFLSPWK NKEGKEIYAGRFNFGATSMNLPRMAIKHQGDEKGFYQELDRILEICKENSIFRAKYLEKT TADIAPILWMYGALAEKEEKETIADLIWGGYATVSIGYIGLSEVSQLLYGKDFSQDEKVY KKSFAILKYIADKLEQFKAETGLGFAMYGTPSESLCDRFARMDQKEFGDIPGITDKGYYD NSFHVSSKLKMDQFEKLRLEALGHQYSKGGHISYIETDSLQKNIEAIPAILRYAKSVGIH YMGINQPVDKCYVCGYKGEFSATENGFACPQCGNHDNKKMSVIRRVCGYLAQPNSRPFNK GKQKEIMSRVKHNG >gi|224461466|gb|ACDD01000036.1| GENE 7 5881 - 5958 103 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEQESLKKKILQAFLYVKTQYIVKK >gi|224461466|gb|ACDD01000036.1| GENE 8 5975 - 6964 1828 329 aa, chain - ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema 
pallidum # 4 322 3 330 332 344 53.0 1e-94 MGYTYRELGLSNTREMFAKANREGYAVPAFNFNNMEMALAIVEACAEMGSPVILQCSAGA IKYMGYDVAPLMAKAAVDRARNMGSDIPVALHLDHGADLETVKKCIAAGFSSVMIDASHY DYEENIKVTKEVVEYAHKNAGEYVSVEAELGVLAGIEDDVHAEEHKYTNPEEVIDFVGRT GVDSLAIAIGTSHGAHKFKPGEDPKLRLDVLDAVAEKLGSFPIVLHGSSAVPKKYVDMIK EFGGEMKDAIGIPDSELRGATKSTVAKINVDTDGRLAFTAGVRQVLGTNPKEFDPRKYLG AGQKEMKEYYKTKVQDVFGSEGAYVKGTK >gi|224461466|gb|ACDD01000036.1| GENE 9 6983 - 7468 230 161 aa, chain - ## HITS:1 COG:no KEGG:FN0109 NR:ns ## KEGG: FN0109 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 160 2 156 158 114 48.0 9e-25 MLKNKIYLLLFIIIFAINLFFQDWRTLSLSFLFLLCWNIAYNSQFKQQLKRIWILFFFYL STFVIQLYYHQEGKVLVQLFGFYITLEGVQQFLGNFLRILNLILLSWIVANQKIFHGRFA RYQEIIETVIEFVPQVFILFRKKMKIKWFFRYILKKIQEKQ >gi|224461466|gb|ACDD01000036.1| GENE 10 7455 - 8729 1719 424 aa, chain - ## HITS:1 COG:FN0110 KEGG:ns NR:ns ## COG: FN0110 COG0172 # Protein_GI_number: 19703458 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 424 4 424 424 674 79.0 0 MLEARFIRENREKVQEMLKNRNNSLDLSEFDRLDAERREILSEVEALKRERNTESAKIAQ FKKEGKDASEVIKAMGMTSAKIKELDTKLAEVEEKVNYILMIIPNMYHETTPIGKDEEEN VEIRKWGTPREFAFTPKSHWEIGEELGILDFERGAKLSGSRFVLYRGAAARLERALISFM LDTHTTEHGYTEHLTPFMVKSEVCEGTGQLPKFEEDMYKTTEDNMYLISTSEITMTNIHR KEILDQAELPKYYTAYSPCFRREAGSYGKDVKGLIRVHQFNKVEMVKITDNKTSYDELEK MVNNAETILQKLELPYRVIALCSGDIGFSAAKTYDLEVWLPSQNKYREISSCSNCEDFQA RRMGLKYRPQGENKSEFCHTLNGSGLAVGRTLVAIMENYQQEDGSFLIPKVLVPYMGGME VVKK >gi|224461466|gb|ACDD01000036.1| GENE 11 8723 - 9337 743 204 aa, chain - ## HITS:1 COG:sll1084_2 KEGG:ns NR:ns ## COG: sll1084_2 COG5011 # Protein_GI_number: 16329879 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 4 154 3 162 222 97 36.0 1e-20 MKKRVYFDKYDNMRFISHLDLIRFLERLFQKTNLPIKYSNGFHPRPKMSFGNPISLGTEA FGEIMDIELEEDLSNAEVLRRLNSAQVLGFQVQKVESLEGKGNIVEEYPYTRYSVEGSCS 
VIDRLEELLQQEEIVEVREKKGKIVTRELKERIVSWERKENGITLTSINISPNAYLELAK ITQQEVRIKRLGYEKAEDKGEELC >gi|224461466|gb|ACDD01000036.1| GENE 12 9410 - 10411 1374 333 aa, chain - ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 1 331 1 331 332 475 74.0 1e-134 MSYQINDLVTLLNGTIKGESVERVSGLAPFFHAEEGEVTFAAEEKFLTKLQECKAKVIIV PDIDLPMNLGKTYIVVRDNPRILMPKLLHFFKRPLKKMEKMIEDSAKIGENVSIAPNVYI GHDAVIGDHVVLYPNVFIGEGVEIGAGSILYSNVSIREFVKIGKECIFQPGAVIGSDGFG FVKVQGNNMKIDQIGSVVIEDFVEIGANTTVDRGAIGNTVIKKYTKIDNLVQIAHNDRIG ENCLIVSQVGIAGSTEIGNNVTLAGQTGVAGHIKIGDNIVIGSKSGVSGDVKSNQILSGY PLVDHKEDLKIKVSMKKLPELLKRVKELEKKGK >gi|224461466|gb|ACDD01000036.1| GENE 13 10432 - 10911 853 159 aa, chain - ## HITS:1 COG:no KEGG:FN1910 NR:ns ## KEGG: FN1910 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 17 159 15 157 157 103 52.0 2e-21 MKKMLMMLGLVSALSVSAFAEKIAVVDSQEVIGKYSGTKTVGASLDKEAKRYENEINQRQ VALQKEEVALQAKGNKITDAEKKAFQAKVEGFYKYVNTSKETMGKMEYDKMSVIFKKANK AVQAVAAEGKYDYVLERGAVLLGGEDITDKVIKKMEATK >gi|224461466|gb|ACDD01000036.1| GENE 14 10956 - 13079 2808 707 aa, chain - ## HITS:1 COG:FN1911 KEGG:ns NR:ns ## COG: FN1911 COG4775 # Protein_GI_number: 19705216 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Fusobacterium nucleatum # 23 707 3 678 678 818 59.0 0 MKRTLVAMLLFLVSMVSFAAGGSLLVKKLEVLNNQEVPASIILNQMDLKEGKPFSTEVML HDFQTLKASKYLEDVMIQPQAYEGGVNVVVNVVEKKDAQMLLREDGIISVSEQANVDKSL ILSNIMISGNQLVSTSDMKAVLPLKQGGYFSKTAIEDGQKALLATGYFKEVVPSTQKNGN GVEVTYTVVENPVIQGINIHGNTLFSTQDILKALKTKTGEVLNINYLRADRDAIMNLYQD QGYTLSEITDMGLNEKGELEVVVSEGIVRNVSFQKMVTKQKGHRRKPTDDILKTQDYVIQ REIELQSGKIYNSQDYDNTVQNLMRLGIFKNIKSEIRRVPGDPNGRDIVLLIDEDRTAIL QGAISYGSETGVMGTLSLKDNNWKGRAQEFGVNFEKSNKDYTGFTIDFFDPWIRDTDRIS 
WGWSLYKTSYGDSDSALFNNIDTIGAKINVGKGFARNWRFSLGMKGEYVKEEANKGNFTQ LKNGSWQYNGKNKNNPNERIFDKDAVNDKYWVWSIFPYLTYDTRNNPWNATSGEYAKLQL ETGYAGGYKSGSFSNVTLELRKYHRGFWKKNIFAYKVVGGIMTHSTKEGQRFWVGGGNTL RGYDGGTFRGTQKVTATIENRTQINDILGIVFFADAGRAWNQKGRDPEYGNDETFSKGIA TTAGVGLRLNTPMGPLRFDFGWPVGKLQDKYSQDRGMKFYFNMGQSF >gi|224461466|gb|ACDD01000036.1| GENE 15 13150 - 17172 4073 1340 aa, chain - ## HITS:1 COG:no KEGG:FN1912 NR:ns ## KEGG: FN1912 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 264 1339 64 1173 1175 370 25.0 1e-100 MKGIFKYKMTVINISVFLTLLVGAIFYAANHSEEAIASVSKLFLGDPILIEKIEIKKDKI RLEGISMDLEGEPFLRIPKIEAERPSFLKLGNITIPEGDIYILRKEDGKLNIDRYLPKEE SKKINLKDYRPITNIPIEKISFETLRTHYEDRVLDPKFQKTISWQGEIVFDRKKGISSKL LGSDKEERYQIDYSGEKMPYDVQLDITGIRPEDYWKPYLKTESIVLETGNLEAHIHSDYY GNTGEIQADIPKLEFLNKKWEAGNLYISLDKNQVNASLDYKENGENKNSIITYDLEKKEA HAEFLDIYYDKISLDLAMQKKWKLDFLAEHSVYPKLEGTLSFDFKEDKIPFSLQSNIVDT DGEYLKNASYLKLYKKKQFFLNYDIAKANLEKGEGEIPISIYDYKANIIFQAKDNVIEIQ KVKIDSEKNGSILLKGFTDINEKKAEFEYKSDHFCFEKEIEGTEVFAQLALQGNISYDTH LGVKVSSQGEIEKVQYGDYGIEGLRVDMEYEEDEIQVYAFENRFLNAKGSIDIVNQNTNL EIELKDFDNTKVNVSYPEFFVNHARGQVRGNIKNPIADLYIEEGKLSILSQKENQVRGNF HLEDKVVSFQDVNLDQNLFSGEYRIFDNSYHIFANIIEEKLSDYYGFHDLYYRVIGEVEV NGKGRELFATAKSTIDKIYYRGRKLPNIAWEGSYTLGKQGIGKIDLSPVYLQNDKKKRFL SLEAKIDLDQETLSVDIPKQSFYLEDIEDYTMVDFLEGKWTLSGKIKGNYKNPNYDFQME GENLKVKKAPLDYLTLKFHGNTEKLIIDSMKTAYLQNKAEIQGYYGIRDGSYDISVKAPK IDWKLLQSFASEYGVENIEGNSNLDFHIRSEQSQGSLLLHNFSFEMPKKYISVKNFTGNI ELHGNEMMVHQISGIVNEGKAVVKGRMQLPKLNEVKKDFSFLKKLDYYFNIDVQELKYRI PEMLSLDISSHLRLESNKLRGNIELLKGKVVDIPNTYQSYWKIIRKFFEEKSSQVVLNSQ SLGQDFEVQESETKLENLLDIDLSLWIQEGIKVDIPELNVAVEDVKGTVVGGLSVVGKEG KYALLGNLEVEKGSLMVNTNIFSLDKAMLSFNENKTYLPNVNPSLLIDSNVDVNGEKVRF SIQGKTDDLRFSIGSSQGNTSGSLNSLITGQIQENESNASYTALLRNIIGGQLTQTVIRP FAKIIRKVFHFEKFRITSNVYNQTKKGDDSSGDLYLGAKIEVEDNLYKDKLYWNFTGTLY DTGLQNTQINSQSKDNKIMDQYDLSLRYPYSETKTFEIGVGKLPSKFYTNQEQIKEKKKL NYHIGVKIEKKMNDFFDIFR >gi|224461466|gb|ACDD01000036.1| GENE 16 17191 - 17895 1038 234 aa, chain - ## 
HITS:1 COG:FN1722 KEGG:ns NR:ns ## COG: FN1722 COG0357 # Protein_GI_number: 19705043 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Fusobacterium nucleatum # 1 234 1 232 232 244 60.0 1e-64 MKEYLQEGIQKLGISLSEKQIENLLTYVTLLLEYNQHTNLTAIREEKAVIEKHILDSLLL QEYIPKDATTAIDIGTGAGFPGMVLAICNPTVYFTLMDSVGKKTKFLEWVKENLKLQNVE VINARAEDYIQISKRREYYDLGFCRGVSKLAVILEYMIPFLKVGAIFLPQKMVGTQEERE AENALKILKSCLEREYINHLPYSQEERLILQIQKEEKTDKMYPRKVGMITKKPL >gi|224461466|gb|ACDD01000036.1| GENE 17 17897 - 19789 2311 630 aa, chain - ## HITS:1 COG:FN1723 KEGG:ns NR:ns ## COG: FN1723 COG0445 # Protein_GI_number: 19705044 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 2 630 1 633 633 943 75.0 0 MVQEFDVIVVGAGHAGAEAALAAARLGKKTAIFTISLDNIGVMSCNPSLGGPAKSHLVRE IDALGGEMGRNIDKTYIQIRVLNTKKGPAVRSLRAQADKIRYAKEMKRTIETCENLSAIQ GMVSELLVEDGKAVGIKIREGVEYRAKRIILATGTFLRGLIHIGESHFSGGRMGELSSED LPLSLLKHGLDLQRFKTGTPSRIDARTIDFSVLEEQPGETAKILKFSNRTSDKELKDRRQ ISCYIAHTNEEVHTEIKNNRERSPLFNGTIQGLGPRYCPSIEDKVYRYADKPQHHLFLER EGYDTNEIYLGGLSSSLPVDVQENMIHKIHGFEHAQIMRYGYAIEYDYIPPSEIQYSLES RTIPNLFLAGQINGTSGYEEAGAQGLMAGINAVRSIDGKDPIVLDRADSYIGTLIDDLVL KGTNEPYRMFTARSEYRLVLREDNADLRLSKIGYEVGLVSEEEYQKVETKRENVRNIIEA LQQNFVGPGNPRVNERLSEKGEEILKDGASLFEVLRRPEITYEDIEYMTEGTKTFDFTSY DEDTKYQVEVQTKYSGYIERSFKMIEKHKSMEEKRIPQDIDYDSLQNIPKEAKEKLKKIR PNNIGQASRISGVSPADIQVLLIYLKMRGN >gi|224461466|gb|ACDD01000036.1| GENE 18 19805 - 20464 988 219 aa, chain - ## HITS:1 COG:FN1724 KEGG:ns NR:ns ## COG: FN1724 COG0569 # Protein_GI_number: 19705045 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 217 1 217 218 258 58.0 4e-69 MKQYLVIGLGRFGRSVAKTLYESNQEVMAVDVSEDLVQDMINDYKVENAMVLDGTDLTSL 
QEIGAQNFDTAFVCMRNLESSILTTLNLRELGISKIIAKAGSREHGKVLEKIGASKIVYP EEYMGRRIAQLVMEPNMIEHLRFSSDFLLAEIKAPNLFWNKTLIQLNVRNKYNANIVGIR KANDVFYPNPAAETLIEKGDILVVITDSKTARTLESLGE >gi|224461466|gb|ACDD01000036.1| GENE 19 20477 - 21811 1623 444 aa, chain - ## HITS:1 COG:FN1725 KEGG:ns NR:ns ## COG: FN1725 COG0168 # Protein_GI_number: 19705046 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 8 444 13 448 448 439 58.0 1e-123 MKRDKKQMSPSRKLILGFLLVIIVGVFLLMLPFSLKEGKSLSPLEALFTVVSAVCVTGLS VVDVAEVFSPVGDAILIAFIQIGGLGVMTFSSIVFLLAGQKMTLYTRILLKEERNANSVG EILNFVRLMLLTVFIIESIGAVILMHEFRKIMPYEQAVYYGIFHSISAFCNAGFSLFSNN LENFRGNPVISLTISYLIILGGMGFAIINSFIMMIRKGVSRFTLTSKLAIQISMILTFGG AILFFLLEFSNSATLFPLPWSEKIIASIFQSVTLRTAGFNTIPLANLRSATVFMACIWML IGASPGSTGGGIKTTTLGVILFYVIGIIRGKEHVEIFNRRLDWDVMNKALALLVVSLSYI ALVILLLLVIEPFSMEKIVFEVVSAFGTVGLTMGITPYLTVTSKLIIIVTMFIGRLGPMT IALALGEKKKKARVQYPKEDILIG >gi|224461466|gb|ACDD01000036.1| GENE 20 21827 - 23053 1933 408 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 3 403 2 403 408 481 58.0 1e-136 MREELVNRFLKYVKIYTTSDEASETCPSTERQWDLAKILVEDLKEIGLEDICLDKNGYVM ATLPANIEGAPSIGWIAHMDTAPNYNGNHVNPRIIENYDGKDIILDEEKEIISSVVDFPE LKNYIGKTLIVTDGSSLLGADDKAGVTEILEAVKYLKAHPEIPHGKVRVGFTPDEEIGRG ADLFDIKAFDCDFAYTVDGGEIGELEYENFNAASVHIEITGRDIHPGAAKDKMINSMLLA MEVQSMLPVEQRPEYTTGYEGFFLLDSLQGSVEKTTMDYIIRDHSFEKFTKKKEFIQEVI DFLGKKYPKAKLECHVKDSYFNMREKIEPVMYIIDLAKKSMEELGIIPKVSPIRGGTDGS RLSYEGLPCPNIFTGGHNFHGKHEYICVESMEKARDLIVRITENATKL >gi|224461466|gb|ACDD01000036.1| GENE 21 23050 - 24429 1660 459 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 449 1 
449 457 502 59.0 1e-142 MSLSHSFLEQESIGRLLWKFSLPAVVGMVVNALYNVVDRIYIGHIERVGHLAITGVGVIF PIVLLSFAFALLVGLGSSANISLHLGKKEKDRAEQFLGNSFVLGSIFSLSFTILLFFIMK ECIYLVGGSDVSYPYAKQYLEIVAIGFLPMTLSYILNAAIRSDGNPKMAMLTLLIGTFVN IILDPIFIFTLNMGVRGAALATIISQTVSFLWTIYYFTSSKSVMKLKKKYIRFHFELSKK VIALGSSSFGVQVGVSIINYIMNVILREYGGDLSIGAMAIIQSVMSLLLMPIFGINQGVQ PILGYNYGAKKYDRVKEALFKGIGAATFICVLGFLSIELFSQYWIILFTKETSLLELAEY GLRRQVIVFPIVGFQIVSSIYFQAVGKPKLSFFISMSRQILVLIPCLFLLSSIWGLDGVW YASPLSDFIATIVTFILIKRELKHLEYLKLEKEREEIVE >gi|224461466|gb|ACDD01000036.1| GENE 22 24444 - 26024 1649 526 aa, chain - ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 7 522 3 516 521 597 62.0 1e-170 MSRKILNVEESLKHIQNSNGKLYFLCLMVGLITGVIVSFYRYALHIFNVLRETFVSPATL HNYPFLIKIWCLFLVVGFFIDFLYRKYPRTSGSGIPQVKGIILGTVHYKHWFAQLLAKFV GGLFGIGAGLSLGREGPSVQLGSYVATGIAKSFHCNRVDENYLITSGASAGLAGAFGAPL AGVMFSLEELHKFLTAKLIICIFVASIASDFIGRRFFGMDTSFSMLAHYPKDINPYLQFA LYILFGVIIAFFGKLFTVTLVKTQNLFQGIKISRWMKVVFVMSTSFLLCLVLPEVTGGGH ELVESLPHLQQGILFLFFVFVIKLLFTSISYATGFAGGIFLPMLVLGAILGKIFALILLS VFPFTPEMIVHFMVLGMVGYFVSVVRAPITGAVLILEMTGSFDHLLALVTVSVVAFYVTA LLKLAPVYDILYERMPKDDIEETHDVESMGKTLIVVPVAAESYLDGKKISEVEWGEEVLV VALRRSELEKIPKGDTVMQSGDNIVLLLPESIVAEVKEKLLQKGIE >gi|224461466|gb|ACDD01000036.1| GENE 23 26017 - 26820 1075 267 aa, chain - ## HITS:1 COG:MTH1334 KEGG:ns NR:ns ## COG: MTH1334 COG0253 # Protein_GI_number: 15679334 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Methanothermobacter thermautotrophicus # 3 263 7 283 289 174 36.0 2e-43 MKFWKMEAAGNDFVIFDGREIEIKDINALAKKLCDRHFGVGADGILFCQESDTADIKMNY YNSDGSRGEMCGNGIRCLSRFIYENNIVDKRKMNIETDNGVKEVVLTVGENEHISQVKVE MGKAEWEKGFQEETLEIEGRSFDFYRVTVGVPHIAILVDEFMKDEELNYWGAILEKHPSF PRKTNVNFIKVLNEKEVQIKTWERGAGRTLGCATGCSSCGVILQRLQKIKGEVHFYTEGG DVFVQTQDDFVTIYGKANLIFTGDMDV 
>gi|224461466|gb|ACDD01000036.1| GENE 24 26975 - 27220 523 81 aa, chain + ## HITS:1 COG:no KEGG:FN1796 NR:ns ## KEGG: FN1796 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 77 1 77 79 67 58.0 1e-10 MERLTLEEVQKYIKEIKEKGLYEKYQAMILDDFEEHHIVYLLEEEEIIALAYKNQVTPYS MKDYYNWYEMNLLIEEEEYGL >gi|224461466|gb|ACDD01000036.1| GENE 25 27217 - 27279 56 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIFIKFIKIKINLNFLKKIY >gi|224461466|gb|ACDD01000036.1| GENE 26 27409 - 28071 685 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452329|ref|ZP_05617628.1| ## NR: gi|257452329|ref|ZP_05617628.1| hypothetical protein F3_04625 [Fusobacterium sp. 3_1_5R] # 1 220 1 220 220 400 100.0 1e-110 MRKKIIVCFFLCCSILSFSARPKSYQVSQKELIHWSYEAAENVFPDSIEGWKHVLVGTLA VETNLGQFKGNSIYGVSQMRNSGFQFVQRELQRNSKERKVFEELAGRSPNTVTLKMLETD HRLSIIYMAFYYKFCAHGKAHPEDKEAAAKIWKQYYNTKLGTGTPERFLSAYAKQKKYIE QYQKNLEDIPEEIKNTVQEIELLESLEIEENMENNDERKE >gi|224461466|gb|ACDD01000036.1| GENE 27 28068 - 29456 1983 462 aa, chain - ## HITS:1 COG:FN0352 KEGG:ns NR:ns ## COG: FN0352 COG1757 # Protein_GI_number: 19703695 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 1 458 1 458 459 556 75.0 1e-158 MFDLVQHRKPSLVESLFIILIIFILLGFPMIAIPNMTPHIPVLVSIIFLILYGMFQKVSF KKMQESMIQSVSTSMGAIYLFFFIGILISVLMMSGAIPTLMYFGLDVISTKVFYLSTFCI TAIIGISIGSSLTTVATLGVALMGLSNAFGLNPAITAGAIVSGAFFGDKMSPLSDTTGIA ASIVGVDLFDHIKNMLYTTLPAFVISAIVFGAFSPWNQAGDISSVAAFKEDILSTGLVHS YALLPFLLLLIFSIFKVPAIITIIFTSILSLMIAETHTSYSLQEIGTFLFSGFSKTGVSE SIASLVSRGGINSMFFTITIIILALSLGGLLFGLGIIPTLLESIAHFLNSASRATICVVI TALGVNFIVGEQYLSILLAGKTFKPVYDKLHLHSKNLSRTLEDAGTVINPLVPWGVCGVF ITSMLGVPTLVYLPFAVFCYSSLILTVVFGFTGLTLTKGGNE >gi|224461466|gb|ACDD01000036.1| GENE 28 29411 - 29515 126 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQLMMVFCAVLNQTYISLPFLCSIFCFFILCRQ >gi|224461466|gb|ACDD01000036.1| GENE 29 29628 - 29702 150 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQENIKNHNHHFANFFYYYYYMN 
>gi|224461466|gb|ACDD01000036.1| GENE 30 29851 - 33429 5236 1192 aa, chain - ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 406 1 406 410 735 86.0 0 MAKKMQTMDGNQAAAYASYAFTEVAGIYPITPSSPMAEYTDEWASKGMKNIFGVPVKLVE MQSEAGAAGSVHGSLQAGALTTTYTASQGLLLKIPNMYKIAGELLPGVIHVSARALSAQA LSIFGDHQDIYAARQTGFAMLATNSVQEVMDLAGVAHLTAIKTRVPFMHFFDGFRTSHEI QKVEVMDYEVFKSLVDYDAIQAFRDRALNPEHPVTRGTAQNDDIYFQAREAQNKFYDAVP DVTAHYMAEISKVTGRDYKPFNYYGAADAERIIVAMGSICEAAEEVIDYLNAKGEKVGMV KVHLYRPFSEKYFFDVFPKSVKKIAVLDRTKEPGSLGEPLLLDVKSLFYGKENAPLIVGG RYGLSSKDTTPAQVVAVFDNLKAEQPKDLFTVGIVDDVTFTSLEVGAPVVVSDPSTKACL FYGLGADGTVGANKNSIKIIGDKTDLYAQGYFAYDSKKSGGVTRSHLRFGKNPIKSTYLV STPNFVACSVPAYLNQYDMTSGLREGGKFLLNCVWDKEEALQRIPNNVKRDIARANGKLY IINATKLAHDIGLGQRTNTIMQSAFFKLAEIIPFEDAQQYMKDYAKKSYAKKGDDIVQLN YQAIDIGASGLVEIEVDPAWKDLKVEAKVEEKDCGCSSCSCTPVEKFVEKIAKPVNAIKG YDLPVSAFDGYEDGTFENGTSAFEKRGVAVDVPLWDSTKCIQCNQCSYVCPHAVIRPFLV SEEEKAASPVEFATLKAMGKGLDGLTYRIQVSPLDCVGCGSCVNVCPAPGKAITMQPIAT SIDAEEDKKADYLFNKVEYRSNLMSIDTVKGSQFAQPLFEFHGACPGCGETPYLKAITQL FGDRMMIANATGCSSIYSGSAPATPYTTNSCGEGPSWASSLFEDNAEFGMGMHVAVEALR DRIQTIMEANLDTVSEEMATLFKEWIANRKYSAKTREIRDILVPMLEKTDAAYAKEILEL KQYLIKKSQWIIGGDGWAYDIGYGGLDHVLASSEDVNVIVLDTEVYSNTGGQASKSTPTA AVAKFAAAGKSVKKKDLAAIAMSYGHIYVAQVSMGANQQQYLKAIKEAEAHQGPSIIIAY APCINHGIKKGMSKSQTEMKLATECGYWPLFRYNPSLEAIGKNPLVIDSKEPKWEKYDEY LLGETRYLTLSKSNPEHAKELFAENKFEAQRRWRQYKRLAAMDFSEENRSAE Prediction of potential genes in microbial genomes Time: Fri May 20 01:59:52 2011 Seq name: gi|224461465|gb|ACDD01000037.1| Fusobacterium sp. 
3_1_5R cont1.37, whole genome shotgun sequence Length of sequence - 45437 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 11, operones - 8 average op.length - 5.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 - CDS 23 - 1225 1647 ## COG0282 Acetate kinase 2 1 Op 2 . - CDS 1250 - 2263 1502 ## COG0280 Phosphotransacetylase - Prom 2292 - 2351 5.9 - Term 2291 - 2343 -0.9 3 2 Op 1 . - CDS 2358 - 3362 704 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 4 2 Op 2 . - CDS 3364 - 4770 1828 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 5 2 Op 3 . - CDS 4798 - 5190 476 ## FN1073 hypothetical protein 6 2 Op 4 1/0.000 - CDS 5187 - 6275 1186 ## COG1161 Predicted GTPases 7 2 Op 5 5/0.000 - CDS 6289 - 7077 768 ## COG4974 Site-specific recombinase XerD 8 2 Op 6 6/0.000 - CDS 7099 - 8406 1643 ## COG1206 NAD(FAD)-utilizing enzyme possibly involved in translation 9 2 Op 7 13/0.000 - CDS 8428 - 10677 2620 ## COG0550 Topoisomerase IA 10 2 Op 8 5/0.000 - CDS 10692 - 11543 1145 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 11 2 Op 9 1/0.000 - CDS 11562 - 12236 889 ## COG0457 FOG: TPR repeat 12 2 Op 10 . - CDS 12217 - 13431 1288 ## COG1570 Exonuclease VII, large subunit 13 2 Op 11 . - CDS 13440 - 15422 2537 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 15536 - 15595 14.1 + Prom 15533 - 15592 13.0 14 3 Op 1 . + CDS 15619 - 16116 889 ## COG0716 Flavodoxins 15 3 Op 2 . 
+ CDS 16136 - 16480 528 ## MA2700 hypothetical protein + Term 16491 - 16547 12.4 - Term 16490 - 16524 2.5 16 4 Op 1 1/0.000 - CDS 16528 - 17193 776 ## COG4123 Predicted O-methyltransferase 17 4 Op 2 1/0.000 - CDS 17193 - 18110 1176 ## COG1774 Uncharacterized homolog of PSP1 18 4 Op 3 1/0.000 - CDS 18111 - 18794 725 ## COG2003 DNA repair proteins 19 4 Op 4 2/0.000 - CDS 18812 - 19870 1201 ## COG2038 NaMN:DMB phosphoribosyltransferase 20 4 Op 5 6/0.000 - CDS 19883 - 20464 740 ## COG0406 Fructose-2,6-bisphosphatase 21 4 Op 6 8/0.000 - CDS 20479 - 21285 1033 ## COG0368 Cobalamin-5-phosphate synthase 22 4 Op 7 . - CDS 21305 - 21871 896 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase - Prom 21901 - 21960 9.5 + Prom 22048 - 22107 10.8 23 5 Tu 1 . + CDS 22157 - 22411 511 ## + Term 22436 - 22468 4.2 + Prom 22447 - 22506 12.9 24 6 Tu 1 . + CDS 22581 - 23327 980 ## COG0501 Zn-dependent protease with chaperone function + Term 23335 - 23364 0.5 + Prom 23442 - 23501 10.0 25 7 Op 1 22/0.000 + CDS 23524 - 23952 573 ## COG0720 6-pyruvoyl-tetrahydropterin synthase 26 7 Op 2 1/0.000 + CDS 23954 - 24622 175 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 27 7 Op 3 1/0.000 + CDS 24623 - 25198 800 ## COG0302 GTP cyclohydrolase I 28 7 Op 4 . + CDS 25207 - 25902 745 ## COG0603 Predicted PP-loop superfamily ATPase 29 7 Op 5 . + CDS 25903 - 26385 666 ## COG0780 Enzyme related to GTP cyclohydrolase I + Term 26393 - 26442 5.4 - Term 26380 - 26430 7.2 30 8 Op 1 . - CDS 26451 - 27419 1332 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 31 8 Op 2 2/0.000 - CDS 27455 - 28813 1851 ## COG2610 H+/gluconate symporter and related permeases 32 8 Op 3 3/0.000 - CDS 28845 - 29843 419 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 33 8 Op 4 2/0.000 - CDS 29858 - 31135 1402 ## COG3395 Uncharacterized protein conserved in bacteria 34 8 Op 5 . 
- CDS 31145 - 31930 861 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 31980 - 32039 11.7 - Term 32021 - 32071 10.7 35 9 Op 1 . - CDS 32101 - 33048 994 ## COG0679 Predicted permeases - Term 33061 - 33104 4.4 36 9 Op 2 . - CDS 33112 - 33492 525 ## FN0656 hypothetical protein 37 9 Op 3 26/0.000 - CDS 33533 - 34732 1964 ## COG0126 3-phosphoglycerate kinase - Term 34748 - 34785 6.2 38 9 Op 4 1/0.000 - CDS 34796 - 35800 1549 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase - Prom 35826 - 35885 4.8 39 9 Op 5 . - CDS 35950 - 36819 240 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit - Prom 36867 - 36926 8.2 40 10 Op 1 4/0.000 + CDS 36984 - 37976 1185 ## COG0373 Glutamyl-tRNA reductase 41 10 Op 2 6/0.000 + CDS 37988 - 38887 883 ## COG0181 Porphobilinogen deaminase 42 10 Op 3 2/0.000 + CDS 38906 - 40384 1637 ## COG0007 Uroporphyrinogen-III methylase 43 10 Op 4 7/0.000 + CDS 40368 - 41345 1115 ## COG0113 Delta-aminolevulinic acid dehydratase 44 10 Op 5 . + CDS 41342 - 42643 1281 ## COG0001 Glutamate-1-semialdehyde aminotransferase 45 10 Op 6 . + CDS 42702 - 43478 886 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 43484 - 43516 1.3 - Term 43464 - 43513 6.2 46 11 Tu 1 . 
- CDS 43517 - 45172 1504 ## Lebu_0945 protein of unknown function DUF1703 - Prom 45276 - 45335 9.6 Predicted protein(s) >gi|224461465|gb|ACDD01000037.1| GENE 1 23 - 1225 1647 400 aa, chain - ## HITS:1 COG:FN1171 KEGG:ns NR:ns ## COG: FN1171 COG0282 # Protein_GI_number: 19704506 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Fusobacterium nucleatum # 1 396 1 396 398 657 79.0 0 MKVLVINCGSSSLKYQLINPDSKEVFAKGLCERIGIEGSKFEYEVPAKDFEIKLQSPMPT HQEALKLVVDNLVDKEHGVIASVEEVDAIGHRVVHGGETFASSVLITEEVMKAIEDNNDL APLHNPANLMGIHTCMKLMPGKPNVAVFDTAFHQTMPAKSFMYPLPYEDYTELKVRKYGF HGTSHLYVSQTMREIMGNPEHSKIIVCHLGNGSSMSAVLDGKSVDTSMGLTPLQGLMMGT RCGDIDPAAVMFIKDKRGLSDKEIDNRLNKQSGFLGIFGKSSDCRDVEDGVAAGDERAIL ADDMFCYRIKSYIGAYAAAMGGVDAICFAGGIGENAAGIREKVLEGLEFLGVKLDKEVNS VRKKGNVKLSAEDSKVLVYKIPTNEELVIARDTFAIVSGK >gi|224461465|gb|ACDD01000037.1| GENE 2 1250 - 2263 1502 337 aa, chain - ## HITS:1 COG:FN1172 KEGG:ns NR:ns ## COG: FN1172 COG0280 # Protein_GI_number: 19704507 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Fusobacterium nucleatum # 1 334 4 337 337 532 81.0 1e-151 MSFLGQVRKKALQANRRIVLPESFDERVLRAVAEILKEKVAQPILVGNPDQIMNDAKAYE ISLQGARIVDPENFERFEVYVDKLVELRSKKGMTREEATKILKNDINFFGAMMVKMGDAD GMVSGASSPTAKVLRAGIQVIGTKPGMKTVSSVFIMELSQFKEMYGSVLVFGDCSVIPHP NAEQLADIACSSAETALSIANINPRVALLTFSTKGSANHECVDKVIEAGRILRDRHVSFR FDDELQADAALVKSIGEIKAPLSDVSGNANVLIFPNLSAGNIGYKLVQRLAGANAYGPII QGLDAPVNDLSRGCSVNDIVVLTAITSAQACTDCTFE >gi|224461465|gb|ACDD01000037.1| GENE 3 2358 - 3362 704 334 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 24 333 8 320 336 275 43 3e-73 MAFWKKLFGKKEEEQEEIEILEEKNKEEERPKGIFASLHEKLFKTREGLFSKMKSLFSSH KIIDEEMYEELEELLIQSDIGLEMTQKIVTDLEKAVKKQGVQNPEEVYPVLKTVMEEYLI ESEEEFPREEEQLQVILIVGVNGVGKTTTIGKIAAKLKKEGKKVVLGAGDTFRAAAVEQL EEWAKRSEAEIVKGKEGADPGSVVFDTLTKAEELGADVAIIDTAGRLHNKAYLMKELEKI 
NNVVRKKIGERHYESLLVIDGTTGQNALNQAREFNEVTHLTGFIITKLDGTAKGGIVFSL SELLKKPIRFIGVGEKIEDLRKFSKKDFIAALFE >gi|224461465|gb|ACDD01000037.1| GENE 4 3364 - 4770 1828 468 aa, chain - ## HITS:1 COG:PAE0983 KEGG:ns NR:ns ## COG: PAE0983 COG2509 # Protein_GI_number: 18312326 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Pyrobaculum aerophilum # 4 464 18 474 481 322 39.0 8e-88 MKKEYDILFLGGGQAGVFGAYEAAKKNQNLKIAIIDRGKMLHKRICPKEKLGYCVNCPTC AIIYGVSGAGAFSDSKFNMDYRVGGDVHTVVGKKIVNDTIDYVVSIYRDFGFQEEPAGLK YNKVMEEIKRKCIENEVQLVDTPTMHLGTDGSRKLYQKMLEFLVEKNVDFIVDAKITDLL VENNEIQGAIVERNKEIEEYYAKNVVVAMGRSGAAKMMKFANKHKISYQVGAIDIGVRAE IPNLVMRDINENFYEAKMIYYSKTYGDKMRTFCSNPGGFIAAEKYGDDVILANGHAFKDR KSENTNLALLCTKHFTEPFKEPFEYATAIARMSAMLTGGKLLVQSYRDLKQGKRSTEESM TRLNIVPTTEDYIPGDISLACPKRILDNIMEFIEVHDKITPGFASGDLLLYFPEIKFRST RLDIDENMQTSVKGLYAAGDSSGYGSGLNIAAVMGILAVRAILEKIGD >gi|224461465|gb|ACDD01000037.1| GENE 5 4798 - 5190 476 130 aa, chain - ## HITS:1 COG:no KEGG:FN1073 NR:ns ## KEGG: FN1073 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 37 127 59 165 168 75 37.0 4e-13 MRGIIAVLAAIIIGIALMSRPDSLTEMENAEQTALSLKHLRVSLEQYYQEYKIYPENLVQ DEKFMEIYGKTDLDGTRSRGDSKENNEVHITEDFKEVTDDGGWNYNPKTGEIRANLDFDC FHQKIDWHVM >gi|224461465|gb|ACDD01000037.1| GENE 6 5187 - 6275 1186 362 aa, chain - ## HITS:1 COG:FN1072 KEGG:ns NR:ns ## COG: FN1072 COG1161 # Protein_GI_number: 19704407 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 359 1 366 366 451 60.0 1e-127 MGKKCIGCGIPLQNTDSKKDGYTPKDINGKEELYCQRCFRVSHYGEHSSSFLSREDYQKE LQQWVSPKRLALAVFDIIDFEGSFQDDILDILREMDSIVVINKVDLIPGEKHPSEVANWV KGRLASEGISPLDIAIVSCKNNYGMNGVLRKIQHFYPNGVEVLVLGVTNVGKSSVINRLL GKNRVTVSKYPGTTLLSTMNEISGTNLCLIDTPGLIPEGRFSDLMIEEDQLRVIPSTEIS RKTFKLEKDRCIVLGEFVKLRVLNEEEQRPIFSLYASQGVQFHETSLEKASLQKENENCL HLKKKQKFCKEIFTIAAGEELVWKGFAWLSVKRGPLHIEVEYPEGGDIVIRKAFIHPRRV SF 
>gi|224461465|gb|ACDD01000037.1| GENE 7 6289 - 7077 768 262 aa, chain - ## HITS:1 COG:FN1071 KEGG:ns NR:ns ## COG: FN1071 COG4974 # Protein_GI_number: 19704406 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Fusobacterium nucleatum # 9 259 12 282 290 103 30.0 4e-22 MKEFILWKQKYIHHLELQRGLSQNSLRAIQKDLEQFLNYMEEYQDGELTVLTLKSYFFHL QEKHASNTIQRKISSIKVFLRFLKEENIVQEDFSLYFTKVRKEEDTILFFEKDVWEQFRR AFENNLRDKAIFELLYSTGMKPKEFLSLTYLQIEWQKQEIYFFQKKESRTVFFSHRAKEA LWNYCEEKGRKEGRIWDFSEKTLRNIFKKYREKISGLENMTIYSFRHTFAITLLRAGMPK SELQYLLGLEQGELLQRYETYK >gi|224461465|gb|ACDD01000037.1| GENE 8 7099 - 8406 1643 435 aa, chain - ## HITS:1 COG:FN1070 KEGG:ns NR:ns ## COG: FN1070 COG1206 # Protein_GI_number: 19704405 # Func_class: J Translation, ribosomal structure and biogenesis # Function: NAD(FAD)-utilizing enzyme possibly involved in translation # Organism: Fusobacterium nucleatum # 1 433 1 432 434 624 76.0 1e-178 MKKEVIVVGAGLAGSEAAYQLAKRGIPVRLYEMKQQKKTEAHHYDYFAELVCSNSLGGDH LGNASGLMKEELRLLDSLLVRVADETKVPAGQALAVDRHGFSEKITQILRNMENITIVEE EFKEIPKDQYVLIASGPLTSDALFTELLTLTGEESLYFYDAAAPIVSLESIDMTSAYFQS RYGKGEGEYINCPMTREEYEAFYTELIRAERAPLKKFEEEKLFDACMPIEKIAMSGEKSL LFGPLKPKGLINPKTDKMDHAVVQLRQDDKDGKLYNMVGFQTNLKWGEQKRVFSMIPALR QAEFIRYGVMHRNTFINSTKLLEKDLSLKTQNNLYFAGQITGGEGYVAAMATGCMAAINI ANKLQGKEPFILEDVTAIGALIRYITEEKKKFQPMGPNFGIIRSLEGKRIRDKKERYLEM SRIAIEYLKNKIKML >gi|224461465|gb|ACDD01000037.1| GENE 9 8428 - 10677 2620 749 aa, chain - ## HITS:1 COG:FN1069_1 KEGG:ns NR:ns ## COG: FN1069_1 COG0550 # Protein_GI_number: 19704404 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Fusobacterium nucleatum # 6 676 13 683 684 828 67.0 0 MANKNLVIVESPAKAKTIEKILGKNYEVIASFGHICDLPKTKMGVDVDNDFKTSYSTIKG KGEVVKKLKMQAKNANKVYLASDPDREGEAIAWHIANALKLNKEEKNRIEFHEITDRAIR EAVKNPKVIDENKVNAQQARRVLDRLVGYEISPFLWKLISPNTSAGRVQSVALKLICDLE DKIQKFVPEKYWDVKGLFDRKWLWPLYKIDGKKVDKIKDYEIVKRVQSLQKQKFHVLEAK 
ISKKSKKPPLPLKTSTLQQLASSYLGFSASRTMSLAQKLYEGIDINGSHKGLITYMRTDS TRISEEAKEMAKSYVIETFGKEYVSDVKAKKESKQKIQDAHEAIRPTDVYLTPEILEKVL DKDQAKLYKLIWERFLISELASMKYEQFEIVCAQEEVQFRGSMNKILFDGYYKIFKEEEE LNLADFPKIEAGDLLELSKLEMKEDMTKPPARLTESSLVKMLETEGIGRPSTYASIIETL KKREYIVMEKRSFIPTEIGYEVKAQLEKYFSKIMNVKFTAELENELDGVEEGTENWISLL HRFYDGLKEEMEACREAVQAESEKTILSDVLCANGKDYMIAKTGRFGRYLASPVENDDTK ISLKNINISMEQWKQGKIFVKDALEESLKKKAGHRTDVKTESGAYYLLKEGRFGSYLESE NFKEDSLREALPAEIRKDLKAGKIEILDGVYQFVQRIRAIKEEEEALIQEAGVCEKCGKP FKVNRGRWGKFLSCTGYPDCKNIRKIEKK >gi|224461465|gb|ACDD01000037.1| GENE 10 10692 - 11543 1145 283 aa, chain - ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 4 282 10 287 288 282 52.0 4e-76 MYQIHIEDEQYPKLLKEISKPPKTLYCMGDIRLLQAERKVAVVGTRTATDYGKICCQKLV KTLCSADVVTVSGLALGIDAICQQETLHCGGKTIAVVGSGLDEIYPKQNTNLWNRIAKEG LLVSEYPLGTKAFPKNFPERNRIIAGLSKAVVVVESKERGGSLITAELALEENRDVYAIP GDIDSPCSRGCNQLIRDAQAKLLAKMEEILYDYDWNRKEEKEIEVEVSQEAQKILVSLIR EKSLDDLEKELFLSKQVLLSQLMQLEIEGWIKSVSGGKFKKIK >gi|224461465|gb|ACDD01000037.1| GENE 11 11562 - 12236 889 224 aa, chain - ## HITS:1 COG:FN1067 KEGG:ns NR:ns ## COG: FN1067 COG0457 # Protein_GI_number: 19704402 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 82 220 94 232 237 110 38.0 3e-24 MKIRWNKGMKIVCFLALSFSLLATGEIREIEMIGAHENKAIPEKVVTEAVVSKEAKEQGE GVETTEEGVQEEEDKVLTFATVFQKGNYLFVQKQYEKANSVFKSDFSDMKNIFGAATTDR FLGRHAQAIEEYTKVLQINKDFGEAYLGRALSYRDSGKYSEAISDFEKYLSLTQKEDGYL GLGDTYMAMGDYSKAQQILAQGSSKYPSSVLMKKMMSQAYLKTK >gi|224461465|gb|ACDD01000037.1| GENE 12 12217 - 13431 1288 404 aa, chain - ## HITS:1 COG:FN1066 KEGG:ns NR:ns ## COG: FN1066 COG1570 # Protein_GI_number: 19704401 # Func_class: L Replication, recombination and 
repair # Function: Exonuclease VII, large subunit # Organism: Fusobacterium nucleatum # 1 403 1 403 404 422 55.0 1e-118 MEYIYKVSDFNLKIKRYLEGNYEFKNMIIEGELSGVTYYKSGHLYFQLKDENSQVKCAAF SYQRRGIPGDLQEGEKVRVFADVGFYENRGDFQLLVRGIEKQNTLGKMYADLEKLKKRMA AEGYFSLEHKKTLPSYPKVIGVVTALTGAAVQDIIKTIRKRDPRIDVYIYSAKVQGTGAE QEIIAGIEALNRIPEIDFLIAGRGGGSVEDLWAFNKEEVALAFYHSQKPIISAVGHEVDI LLSDFTADVRAATPTQAVELSVPELEQYYQEIQARYTKLQLLGKQSLLRKKQDLQKRSQN YSLQHFPKSFESYKRDLMYREEQLRRGMEWILEQKKQEHHLALQKMIHLNPMYTLQKGYA VLRKGKKNLRSLSEVNLQDEIEIRMIDGIIKAEVKEKRYENTME >gi|224461465|gb|ACDD01000037.1| GENE 13 13440 - 15422 2537 660 aa, chain - ## HITS:1 COG:FN0224 KEGG:ns NR:ns ## COG: FN0224 COG0556 # Protein_GI_number: 19703569 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Fusobacterium nucleatum # 1 650 5 653 663 927 74.0 0 MFRLCAKYQPTGDQPIAIEKLVKSLERKNRDQVLLGVTGSGKTFTIANVIEKVQRPTLII APNKTLAAQLYQEYKSFFPENAVEYFVSYYDYYQPEAYIKTTDTYIEKDSAVNEEIDKLR NAATAALIMRKDVIIVASVSAIYGLGSPEIYKKMTIPIDLKTGISRSKLIERLIALRYER NDMNFVRGTFRVKGDVVDIYPSYLETGYRLEFWGDDLEAISEIHTLTGEKIKKNLERIVI YAATQYITEEEDLERIITEIREDQVREVKQFQDSGKLLEAQRLQQRVDYDIEMIKEIGYC KGIENYARYLAGKLPGETPNTLLDYFPENFLLVLDESHVGVPQIRGMYNGDISRKTTLVE NGFRLKAALDNRPLQFDEFRARTGQTIYVSATPGDYEIQQSGSSVVEQLIRPTGILDPFI EVRPTKGQVDDLLEEIHKRVVKKQRVLVTTLTKKMAEELTEYYLELGVKVRYMHSDVDTL DRIEIIKLLRKGEIDVLVGINLLREGLDIPEVSLVAILEADKEGFLRSRRSLIQTIGRAA RNVEGRVILYGDVMTDSMALAMEETKRRRTIQENYNLLHGIEPEAIIKEIAEEMIQLDYG ISEEAFSKKSKKKFHSKEEIEKEIAKCHKKIVKLSKELDFEQAILVRDEMKLLQEMLLSF >gi|224461465|gb|ACDD01000037.1| GENE 14 15619 - 16116 889 165 aa, chain + ## HITS:1 COG:FN0472 KEGG:ns NR:ns ## COG: FN0472 COG0716 # Protein_GI_number: 19703807 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 165 1 167 167 227 77.0 6e-60 MKTVGIFYGTTGGKTQEVIDIVAAKLGDVKVIDVANGIGELSSFDNIILASPTYGMGDLQ DDWAGCIDELAGMDFSGKVVALIGVGDSAIFGGNYVEAMKHFYDAVSPKGAKIVGAMSTD 
GYSFEASEAVIDDKFMGLAIDASFDEGEMTEKVEEWLAKIKPEFV >gi|224461465|gb|ACDD01000037.1| GENE 15 16136 - 16480 528 114 aa, chain + ## HITS:1 COG:no KEGG:MA2700 NR:ns ## KEGG: MA2700 # Name: not_defined # Def: hypothetical protein # Organism: M.acetivorans # Pathway: not_defined # 1 109 22 130 135 122 53.0 6e-27 MEIFYHHIYEYQKGIRNLILHSTSKGNLNLVRTKLSAENISFLIYPLGKDKINIFFGDTE CIAVIKKIGKISLTDYTPEEDFILGIMLGYDRRKQCSRYLQMKQENKKLTVHTA >gi|224461465|gb|ACDD01000037.1| GENE 16 16528 - 17193 776 221 aa, chain - ## HITS:1 COG:FN0907 KEGG:ns NR:ns ## COG: FN0907 COG4123 # Protein_GI_number: 19704242 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 214 8 218 223 144 41.0 1e-34 MESYDEVFLEHGASFYQEKEGFRFGNDIVLLAEFITEFAKPQQKNLEIGTGNGILPILLS QQGFLSKEYCAVDILESNIVLAQKNAEKNGIYAQFLCQDIRSFSEKNSYRQIFANPPYMK QDGKLQNDNKKKAIARHEICLSLEEFILSVKKILAPIGALYMVYRSHRLQELLEMCSRYQ LYASKIQFVYHENGQVSNLVLLEVYKGKQIKCEILKAKYIK >gi|224461465|gb|ACDD01000037.1| GENE 17 17193 - 18110 1176 305 aa, chain - ## HITS:1 COG:FN0908 KEGG:ns NR:ns ## COG: FN0908 COG1774 # Protein_GI_number: 19704243 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Fusobacterium nucleatum # 5 305 12 312 312 367 65.0 1e-101 MMEENIENQEVEIKEEYRVLTVMFEVTKKRYFFEVPEGVEYKKGDYVIVETIRGQEIGLS CGKPMMVAVKSLVLPLKPVIKKASEEERTIYLQQREDAKRAFAIGKEKILHHKLPMKLVE TEYTFDRSKLLFYFTAEGRIDFRDLVKDLANIFKIRIELRQIGVRDEARILGTIGLCGRE LCCRSFINKFDSVSIKMARDQGLVINPTKISGVCGRLLCCINYEYKQYEEALRRFPAVNQ MVASPDGDGKVLSIAPLLGTLYVDVFGKGIFQYRVEEVKFNKKEANKLKNVKSNEEAEHK DLEKE >gi|224461465|gb|ACDD01000037.1| GENE 18 18111 - 18794 725 227 aa, chain - ## HITS:1 COG:FN0909 KEGG:ns NR:ns ## COG: FN0909 COG2003 # Protein_GI_number: 19704244 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Fusobacterium nucleatum # 6 227 6 232 232 210 48.0 2e-54 MEILKNEGHRERLRKRYIERGFNSLQEYEVLELLLTYALPRKDTKALAKELLHRFGTLSA 
VCKAKTEELQSIKGIKENTSILLHFVGDLQKELFRNSLQEEKNIHIQRKEDLISYVRAQI GFENREKFFVLFLNTANQLLCSEELFQGSIDRSAVYPREILEKVLKYKAKSVIFAHNHPS GNTQPSRQDIALTKEMKDALRMFDVLLIEHIIVSKHSYFSFLEEGLL >gi|224461465|gb|ACDD01000037.1| GENE 19 18812 - 19870 1201 352 aa, chain - ## HITS:1 COG:FN0910 KEGG:ns NR:ns ## COG: FN0910 COG2038 # Protein_GI_number: 19704245 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 4 349 7 352 354 424 58.0 1e-119 MKGLEDIFSDIVGKDKQSIEIVKEILKKKMKPEGSLGILEELVQKMAGIYSYPLPKIQKK CHIVAVADNGIIEEKVSSCPLEYTRLVSEAMLHNIATIGIFTKQLGIDLEVVDIGMKEDI QKEYPNFYRKKIRRGSRNFVQEAAMTEEECEKAILEGFSFIQERREEYDIFSNGEMGIGN TTTSSAVLYALTQKNIHDVVGHGGGLSEEGLFKKKQIIQDACQKYQLFGKSPFDILRSVG GYDIAFLVGCYLGTAFYRKAMIVDGFISAVAALLACRMKAEVQDYCIFSHQSEEPGMKII LEELQETTFLQMKMRLGEGTGAVMVYPILDCALAMFQSLKTPKEVYDMFYEQ >gi|224461465|gb|ACDD01000037.1| GENE 20 19883 - 20464 740 193 aa, chain - ## HITS:1 COG:FN0911 KEGG:ns NR:ns ## COG: FN0911 COG0406 # Protein_GI_number: 19704246 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 192 1 190 191 206 55.0 2e-53 MGKIILVRHGQTQMNADRIYFGKLNPPLNPLGKIQAHEAKKRLETEITSYDFIHASPLER TKETAEIVNFLGKRISFDERLEEINFGIFEGLKYHEIVERYPKEYEESVANWKTYHYETG ESLETLQKRVVEYIFSLDLEKDHLIVTHWGVICSFLSYVMSENLESYWKFKILNGGVVIL EVKDNFPVLAKLL >gi|224461465|gb|ACDD01000037.1| GENE 21 20479 - 21285 1033 268 aa, chain - ## HITS:1 COG:FN0912 KEGG:ns NR:ns ## COG: FN0912 COG0368 # Protein_GI_number: 19704247 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Fusobacterium nucleatum # 1 256 1 271 278 194 47.0 2e-49 MKGLILLFSFMTRLPVPKMEFDSEELGKSMKFFPVVGLVIGLILYLFARGISFVTGSSFP FLLSVLVLLLEVAITGALHLDGLADTFDGMFSYRSKQKILEIMKDSRLGTNGALALIFYF LLKWSIFAELFYTLGKNYFAIFLVTMPIIARLGSVIHCAFFPYARGTGMGKAFVDYTGKK ELAFSVVLTGVLLAILWYFSKALPLIVALGVSCLILILFQYLFGKLVQHKIGGITGDTLG ALVELSEVMYGFILYVCINAMDWIIFYI 
>gi|224461465|gb|ACDD01000037.1| GENE 22 21305 - 21871 896 188 aa, chain - ## HITS:1 COG:FN0913 KEGG:ns NR:ns ## COG: FN0913 COG2087 # Protein_GI_number: 19704248 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Fusobacterium nucleatum # 1 188 1 187 187 205 52.0 4e-53 MGRIVYFTGGARSGKSAHSEQYILDRHYDHKIYLATAIVFDEEMKERVKLHVERRGKEWD TLEAYRNLYEVVKTSMKEATRGVILLDCITNMISNLLLDEQEDWDNISQERVQELEKYIL EEISTFLEEIKKTSYDLVVVSNELGMGLVPPYPLGRYFRDICGRANQLVADEAQESYFIV SGTKLRLK >gi|224461465|gb|ACDD01000037.1| GENE 23 22157 - 22411 511 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLLVLLALMLGLVACGQKEEAAPAEEQTAVEQAAETVEEAATEATETVEQAAEATTEA AADAADAVKDAAADVKEAATTDKQ >gi|224461465|gb|ACDD01000037.1| GENE 24 22581 - 23327 980 248 aa, chain + ## HITS:1 COG:VCA0581 KEGG:ns NR:ns ## COG: VCA0581 COG0501 # Protein_GI_number: 15601340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Vibrio cholerae # 2 237 16 246 263 120 31.0 3e-27 MLIACTSTAPLTGRNQLKLVSDESLVARSVHSYNQLIQQARQQGKLANNTNNGRRLNMIG KRVASAVERYMYTNGMGDRVRYLNWEFNLIDSKEINAFAMPGGKIAFYSGIIPVLQTDAR IAFVMGHEIGHVIGGHHAEGYSNQQLAGLATTLTNVMVGGAASSLVSDGLSLGLLKFNRT QEYEADKYGMIFMAMAGYNPAEAIQAEARMAALSENSGSDFLSTHPANDKRIAALKAFLP EAMKYYQK >gi|224461465|gb|ACDD01000037.1| GENE 25 23524 - 23952 573 142 aa, chain + ## HITS:1 COG:CAC3624 KEGG:ns NR:ns ## COG: CAC3624 COG0720 # Protein_GI_number: 15896858 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Clostridium acetobutylicum # 1 141 1 136 136 160 57.0 6e-40 MYTLSSEASFDSAHFLKDYIGKCRNIHGHRWKVKIEIYAENLQSDGGFRGMVLDFGDIKK ELKEITNYFDHAFILEKNSLKPSLFNALVEEGFRLIEVDFRPTAENFSKFFYEHFEKKGF PVLQATVYETPNNCASYSKGVR >gi|224461465|gb|ACDD01000037.1| GENE 26 23954 - 24622 175 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal 
protein L35 [Rickettsia canadensis str. McKiel] # 3 214 17 222 225 72 29 6e-12 MPKYKVVEIFESINGEGKKAGQLALFIRFQYCNLNCSYCDTKWANSKNSPFTWMSLEEIL SLAKEKRIKNITLTGGEPLLQTDIRSLLEAFSKEKQFEIEIETNGSVPLETFRNIENSPS FTIDYKLPESHMEEYMSLENFSSVHRNDTVKFVVSNRKDLEKAKEIIEQYSLIGKCAVYF SPVFGKIALPSIVDFMKEHHLNGVNMQLQMHKFIWDPEEKGV >gi|224461465|gb|ACDD01000037.1| GENE 27 24623 - 25198 800 191 aa, chain + ## HITS:1 COG:CAC3626 KEGG:ns NR:ns ## COG: CAC3626 COG0302 # Protein_GI_number: 15896860 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Clostridium acetobutylicum # 2 187 3 189 195 238 63.0 4e-63 MIDKKAIQEHVKGLLLALGEDPNREGLLETPKRVANMYEEIFEGIQYSNQELATMFGKTF EGDSETNSDDMVIIRDIEIFSVCEHHLALMYDMKVTVAYIPNKKLLGLSKVARICDMVGK RLQLQERIGRDIAEIMQKVTDSEDIAVLIQGKHSCMTMRGIKKQQSITETSCFLGKFKEN LVLQNRLYQRL >gi|224461465|gb|ACDD01000037.1| GENE 28 25207 - 25902 745 231 aa, chain + ## HITS:1 COG:AF0442 KEGG:ns NR:ns ## COG: AF0442 COG0603 # Protein_GI_number: 11498054 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Archaeoglobus fulgidus # 1 225 1 215 239 169 44.0 4e-42 MKVLVLLSGGLDSTTCLAIAVDKYGADQVVALSASYGQKHTKEILSARAIAKYYQVELLE LNLSKIFSYSNCSLLSHSTEEVPHHSYAEQLNQQEEEILSTYVPFRNGLFLSTAASIALS KECSIILYGAHSDDAAGNAYPDCSPAFNEAMNTAIYEGSGRQVKVEAPFIGLHKKDIVKL GLTLQVPYELTWSCYEGKEHSCGECGTCIDREKAFEENGTIDPLVKIRGGK >gi|224461465|gb|ACDD01000037.1| GENE 29 25903 - 26385 666 160 aa, chain + ## HITS:1 COG:BH2241 KEGG:ns NR:ns ## COG: BH2241 COG0780 # Protein_GI_number: 15614804 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Bacillus halodurans # 3 160 8 165 165 256 75.0 9e-69 MRETENLSLLGNQNTKYPQDYAPEMLETFENKHPDNDYFVKFNCPEFTSLCPITGQPDFA NIVISYVPNIKMVESKSLKLYLFSFRNHGDFHEDCMNIIMKDLIKLMNPKYIEVWGKFTP RGGISIDPYCNYGQKGTKWEEIAFHRMANHDMYPEKVDNR >gi|224461465|gb|ACDD01000037.1| GENE 30 26451 - 27419 1332 322 aa, chain - ## HITS:1 COG:FN0903_1 KEGG:ns NR:ns ## COG: 
FN0903_1 COG0794 # Protein_GI_number: 19704238 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Fusobacterium nucleatum # 4 207 3 206 206 270 65.0 4e-72 MLENQEILAIAHGIIDTEIQGLEKLKASMGQELIEAAKIIYESKGKLIITGIGKTGAIGK KIAATLSSTGTTTIFMNSTEGLHGDLGMVNPEDIVIGISNSGESDEILHIIPAIKNIGAR VFAMTGNPNSRLAQEAEIVLFCGVDSEGCPLNLAPMASTTSALALGDALAGILMKMRDFQ PQNFAMYHPGGSLGRRLLSRVKNLMKTGEDLALCSLDTKMKDVIVKMNEKRLGILCVMKG EELVGIITEGDIRRALSREEEFFTFHAEEIMTKQYKKVEQDMLANEALSYMEEGKYQISV MPVFHEGKFVGVVRIHDLLKIK >gi|224461465|gb|ACDD01000037.1| GENE 31 27455 - 28813 1851 452 aa, chain - ## HITS:1 COG:FN0225 KEGG:ns NR:ns ## COG: FN0225 COG2610 # Protein_GI_number: 19703570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 1 440 1 440 452 590 81.0 1e-168 MEQQILLGLFIGILCLIFMIMKTKIHTFLALIIATILVGLIGGVEYSKIIESITKGFGGT LGSIGIIIGFGVMMGQLFEVSGAAKKMALTFLKIFGKGREELAMAITGFLVSIPIFCDSG FVILTPLLKAISKETKKSIVSLGLALATGLVITHSLVPPTPGPVGVAGIFGVNVSSIILW GIVIAAPMMMASLLFAKFSGNKIWQIPTEDGGWTRDRNYIYKGEQEKVFDEDSLPSTFLA FSPIVVPILLILLGTISKTMSLTGKMIDFIQFVGTPVLAVGIGLILTIYGLAKNMDRKSM MEEVETGIKSAGTIILITGAGGAFGILIRDSGVGDIIANSLVETSLPAILLPFVIATLVR FVQGSGTVAMITAASITAPIIAKLDVNPVFAALAACIGSLFFSYFNDSFFWVINRSIGIT EGKEQLRLYSIASTVAWAVGIVVLLIVNMIFG >gi|224461465|gb|ACDD01000037.1| GENE 32 28845 - 29843 419 332 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 6 322 9 323 346 166 33 3e-40 MNKPKIAVPMGDPAGVGPEIVVKTAVAEEIRDLCDLVVIGDRKVLEKAIEICGVNLQIHS MEKVEDGDYRDGILNVIDLQNIDLNIMEYGKVQGMCGKAAFEYIKKSVDLAMSHQVDAIA TTPINKESLRAGNINYIGHTEILGDLSNSRDPLTMFEVANMRVFFLTRHMSLRNACDAIT KERVLEYIQRCTKALKQLGVNGKMAVAGLNPHSGEHGLFGYEEVEEVTPAVEEAQKLGYD VVGPIGADSVFHQALQGRYQAVLSLYHDQGHIATKTYDFERTIAITLDMPFLRTSVDHGT AFDIAGQGIVSAISMIEAVRLAAKYAPNFKNI 
>gi|224461465|gb|ACDD01000037.1| GENE 33 29858 - 31135 1402 425 aa, chain - ## HITS:1 COG:FN0227 KEGG:ns NR:ns ## COG: FN0227 COG3395 # Protein_GI_number: 19703572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 425 1 425 425 541 68.0 1e-153 MAKYIIVADDLTGSNATCSLLKKVGLRAASIFQLPKTKIETIDVISYSTDSRGISKEEAY ERVKDAVSFLKSEETLLYNKRIDSTLRGNIGSEMDAMLEQLEEDRIAVVVPAYPDSGRIV VNKIMLVNGILLENSDAGRDPKTPVNTSCVEELIQKQSKYSSHYFSLQDIAKEEEKLVKE IELYGKENRVLIFDAVTNEDIIKIARLMNRSNLKIITVDPGPFTMYYTKELQKKNNLEKK ILMVIGSATETSKKQIEHILQHEEIFLEKMNPNNFFVEESRQQEIQRVVSMIKKGIDSYD LFLITTTPIGNDEKLNLPEIAKMKGVSVEEISKIISNTLTEAATLVLEEVQKFEGVYSSG GDITLALLEKLNSIGVEIKEEVIPLAAYGRLIGGKFPNMKLVSKGGMVGKEDTIKLCLNQ MKSDI >gi|224461465|gb|ACDD01000037.1| GENE 34 31145 - 31930 861 261 aa, chain - ## HITS:1 COG:FN0228 KEGG:ns NR:ns ## COG: FN0228 COG1349 # Protein_GI_number: 19703573 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Fusobacterium nucleatum # 1 254 4 257 258 283 66.0 2e-76 MLSSERYQFIVQYLEEHNSATRKELADLLGVTSMTIGRDLKKLEQKGYLVCTYGGAILPN SLVEEKKYDRKKEENTKIKKRIAEKALEEIRSNMTIILDAGTTTYELACLIAQSSIQNLR VITNDLYIALELYQKENIKIILLGGEVARETGATTSVLSIKQIENYNADIAFLGISSISD NLDITVPTEVKAILKRSIMKISEKNILLTDYSKFGKKKLYKAAHIKNFDSIITDHIFSKK EIEKYGLKKKIIQVDGKNKTI >gi|224461465|gb|ACDD01000037.1| GENE 35 32101 - 33048 994 315 aa, chain - ## HITS:1 COG:FN0623 KEGG:ns NR:ns ## COG: FN0623 COG0679 # Protein_GI_number: 19703958 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 315 4 318 318 274 50.0 2e-73 MENLILAMNVVLPILIILTIGYVLKYFNMVDSHSLNKMNSLVFRVFMSSLLFINIYRLDA EAVFQLKNLRFILFPVLGVFCMIFLSYLVYSRTIKDSKKCSVMIQAAYRGNFVLFGIPIA STLYGEEALGITSLLLAAVIPTFNLTAILLLEFYRGEKIKLSKLVSSTYKNPLLLASTLA IICLLLDIHIPNILEVTISSLAKVATPLAFIVLGGSLEMKSVKKHWKYLLTANIVKLLVF 
PFFLIVASHFLSFTSMEITAFLAATACPAAVASFTMAKEMDADGDLAGEIVVTTSAFSIV TIFFWVLILKNIAWI >gi|224461465|gb|ACDD01000037.1| GENE 36 33112 - 33492 525 126 aa, chain - ## HITS:1 COG:no KEGG:FN0656 NR:ns ## KEGG: FN0656 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 14 123 10 122 126 75 36.0 4e-13 MERSNSSDSSKKIQYLSILLIVCAFASAIFMKTNTEEVYEGTAKGFHGDIHVQVAAHRND ENAIEITDIQVKHEDTPDIGGVAISDLVEKVKAEQSVEVEMVAGASYSSQGFLEAVKEAV AKVPEK >gi|224461465|gb|ACDD01000037.1| GENE 37 33533 - 34732 1964 399 aa, chain - ## HITS:1 COG:lin2552 KEGG:ns NR:ns ## COG: lin2552 COG0126 # Protein_GI_number: 16801614 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Listeria innocua # 1 399 1 396 396 575 77.0 1e-164 MAKKNIKDLELQGKKVLMRVDFNVPMKDGKITDENRIVAALPTIQYALEQGAKVIAFSHL GKVKTEEDLVSKSLKPVAVRLSELLGKELKFVAATRGAELETAVNSLQNGEIMMFENTRF EDLDGKKESKNDPELGKYWASLGDVFINDAFGTAHRAHASNVGISSNIGEGKSAAGFLME KEIRFIGGAVDAPERPLVAILGGAKVSDKIGVIENLLEKADKVLVGGAMMFTFLKALGKS TGTSLVEEDKVELAKALLEKANGKLILPVDTVVAKEFNNEAAHRTVSVDEVPADEMGLDV GAGTVELFSKEIASAKTVVWNGPMGVFEMPNYAKGTIGVCEAIAHLQGATTIIGGGDSAA AAISLGYADKFTHISTGGGASLEYLEGKVLPGVASISEK >gi|224461465|gb|ACDD01000037.1| GENE 38 34796 - 35800 1549 334 aa, chain - ## HITS:1 COG:FN0652 KEGG:ns NR:ns ## COG: FN0652 COG0057 # Protein_GI_number: 19703987 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 327 1 327 335 567 90.0 1e-161 MSVKVAINGFGRIGRLALRVMSENPEYDVVAINDLTDAKTLAHLFKYDSAQGRFQGTIDV TEEGFVVNGDSIKVFAKANPEELPWKELGIDVVLECTGFFTSKEKAEAHIKAGAKKVVIS APATGDLKTVVYNVNHDVLDGSETVISGASCTTNCLAPMAKVLNDNFGIVEGLMTTIHAY TNDQNTLDAPHKKGDLRRARAAAANIVPNTTGAAKAIGLVIPELKGKLDGAAQRVPVITG SITELVTVLEKSVTVEEINAAMKAAANESFGYNDEDIVSSDVIGCRFGSLFDATQTRVMT VGDKQLVKTVSWYDNEMSYTSQLIRTLGAVTKAK >gi|224461465|gb|ACDD01000037.1| GENE 39 35950 - 36819 240 289 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 74 277 82 278 285 97 33 2e-19 MHKYIVEPEYDGYEIGEYLKESKGYSGRGLRKLEIYLNGKKIKNTAKKVRKLNRVFIVEK EKETGIRPMDIPLEIVYEDENLLILNKQANLVTHPTTKKVDATLANGVVAYFLKTTGKTM VPRFYNRLDMNTTGLIIVTKNAYSQAYLQEKTEVRKSYQTIVKGIVEQDEFYITKPIGKV GEDLRRIELAVSEGGQEAKTFVKVLKRFPERNRTLLDVTLFTGRTHQIRAHLSLEGYPIV GDDLYGGADDRIKRQLLHAYRLTFQNPKNGEQQEIMIDLAKDMQDYLNG >gi|224461465|gb|ACDD01000037.1| GENE 40 36984 - 37976 1185 330 aa, chain + ## HITS:1 COG:FN0646 KEGG:ns NR:ns ## COG: FN0646 COG0373 # Protein_GI_number: 19703981 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Fusobacterium nucleatum # 1 328 2 329 329 333 56.0 3e-91 MNIKNFAVIGISHEILSMQEREEVIKQKPRVLFEELFQAGDIKAYVDLSTCLRVEFYLEL EENKSLEDIQKRFPVQKGLQSKQGEEALLYLAKVVCGFFSVIKGEDQILAQVKQAYAKAL EEEHSSKLCNIIFHKIIELGKKFRSKSNIAHQALSLEAISLRSIRERVPSLQNKKILLLG IGELAQSILALLVKENLSNIYITNRSYHKAEEVSNIYQVNMIDFREKYQWIAEADIIISA TSASHIVLEYEKFLQYKQDKEYFMLDLAVPRDIDPRIADLEKIEVLNLDDIWKISKEHSC FREQLLEDYFYILEEQIESIHKALSYYEQK >gi|224461465|gb|ACDD01000037.1| GENE 41 37988 - 38887 883 299 aa, chain + ## HITS:1 COG:FN0645 KEGG:ns NR:ns ## COG: FN0645 COG0181 # Protein_GI_number: 19703980 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Fusobacterium nucleatum # 3 295 2 293 298 378 65.0 1e-105 MSKQQIILGSRGSILALAQTNWVKEQLEKYHPELSFSIQIIETQGDKDLHSHFGNSQSSL KSFFTKEIEKSLLEGEIDIAVHSMKDVPSVSPAGLICGAIPIREDVRDVLISRSGKPLAE LPQGAIIGTSSLRRIQNIKKIRPDLEIKALRGNIHTRLRKLEEEQYDAIILAAAGLKRVK LEEKITEYLDPTVFPPAPAQGALYIQCREEDIEVQKILQSIHNKNLEKVLVVEREFSKIF DGGCHTPMGCYSNLQGDTLEFFAMYSHENKRYQTKVVENLSKGKDIARMAAQKIEKMFK >gi|224461465|gb|ACDD01000037.1| GENE 42 38906 - 40384 1637 492 aa, chain + ## HITS:1 COG:FN0644_1 KEGG:ns NR:ns ## COG: FN0644_1 COG0007 # Protein_GI_number: 19703979 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: 
Fusobacterium nucleatum # 2 245 3 246 251 346 68.0 5e-95 MKKKVYLVGAGPGDAGLFTLKGKQLLEEADCIIYDRLIPMEILNFAKKDAELIYLGKENT EGGLLQEKINHCLIEKALEGKMVVRLKGGDSFVFGRGGEEILALVEQGIDFEVVPGITSS ISVPAYAGIPVTHRDVARSFHVFTGHTMKDGTWHNFEVLAKLEGTLVFLMGVKNLDKIVN GLIQYGRDSKTPIAIIEKGATEQQKVHVGTLKNIVTLAKERDVKAPAIIIIGEVVSLQEK LNWFEATKKKKILVTRDIKQAPDFSEKLQKHGFFPIEFPLLEIQKHTLSFLKDFFQKYSV ILFNSPNGIRYFLEAIPDLRMIAHCKIGVVGRKTREVAESYKLIPDFMPKEYCVHELAKL SKEYSQEGDHILIFTSDISPCDCEKYSKEYNRKYEKFVLYSTSKKEYSKEEMEQKIKEVD IITLLSSSTVEALYENLEGDLSILEGKQIASIGPVTSKTLKKYGFTVDYEATIYDTNGLV EILKEANNVSKN >gi|224461465|gb|ACDD01000037.1| GENE 43 40368 - 41345 1115 325 aa, chain + ## HITS:1 COG:FN0460 KEGG:ns NR:ns ## COG: FN0460 COG0113 # Protein_GI_number: 19703795 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Fusobacterium nucleatum # 1 320 1 320 322 441 64.0 1e-123 MFQRTRRLRSSAILREMLQNVHLSLQDLIYPIFVEEGENKKEEISSMPGQYRYSIDRLPE LLENCRELGIKALLLFGIPNHKDEVGSEAYHSHGIVQKALQFIKENYGDQFLLITDVCMC EYTSHGHCGILHEKEVDNDTTLQFLSKIALSHAQAGADIVAPSDMMDGRVQAIRATLDEN GFSYIPIMAYSVKYASSFYGPFRDAADSAPSFGDRKSYQMDFQNDKEFYQEVLSDMEEGA DFIMVKPGMPYLDVLHAVKERISLPLVSYQVSGEYSMIKAAALQGWIDEKKIVLESILAF KRAGADLIITYYALEIAAWLKENRK >gi|224461465|gb|ACDD01000037.1| GENE 44 41342 - 42643 1281 433 aa, chain + ## HITS:1 COG:FN0540 KEGG:ns NR:ns ## COG: FN0540 COG0001 # Protein_GI_number: 19703875 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Fusobacterium nucleatum # 4 431 4 430 434 600 67.0 1e-171 MKQENSKKYFEEACQYIPGGVNSPVRAFQSVHREAPIFAKKAKGAYLWDEDDNRYLDYIC SWGPMILGHNPDFVLQGVQEAILSGSSFGLPTKKEVELAKLIVQSVPCIEKVRLTTSGTE ATMSAVRLSRAYTKRNKIIKFEGCYHGHSDALLVKSGSGLLTQGFQDSNGIPQSVLQDTI TIPFGNKEKTLEYLQTKEIACIIVEPIPANMGVIESQKDFLQFLREETQNYGSLLIFDEV ISGFRVALGGAQEYFKITPDLCTLGKIIGGGYPVGAFGGKEEIMNLIAPLGQVYHAGTLS GNPISVRAGYETLSYLMQHKSSIYLDITKKTEFLVKEIEKLIQKYEIPAVVQSMPSLFTI FFSEKEKVTNLEDALSSNVDFFTIYFNTLLENGILAPPSQFEAHFISHAHSEDDLKKTLK 
VIELAFQKIHEVM >gi|224461465|gb|ACDD01000037.1| GENE 45 42702 - 43478 886 258 aa, chain + ## HITS:1 COG:FN1973 KEGG:ns NR:ns ## COG: FN1973 COG0251 # Protein_GI_number: 19705269 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Fusobacterium nucleatum # 133 258 3 128 128 201 82.0 1e-51 MGAGKISFRKACTLKKYGAIIEIVAKDISKEFETLSNLQIRKKSYDEKDIQGHFLVIAAT NNSVLNHQIVEDCKKRNILVNNISSKEDMTCRFASIYEEEEYQIAISAHGYPKKSKQLRE EIKQYLIQRSDVRMKKIIHTEKAPAALGPYSQAIEANGVLYVSGQIPFVPATMTLVSDDV QAQTRQSLENIGAILAEAGYTFNDVVKASVFIKDMNDFAKINEVYNEYLGEAKPARACVE VARLPKDVKVEIEVIATK >gi|224461465|gb|ACDD01000037.1| GENE 46 43517 - 45172 1504 551 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0945 NR:ns ## KEGG: Lebu_0945 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 4 550 3 552 552 501 49.0 1e-140 MEQKKGLPNGISDFKLLREENYYYIDKTNLIEELQQEIGKTILFTRPRRFGKTLNMSMLQ YFWDIHNAEANRKLFQGLYIESSPYFSEQGKYPVIYLSFKDLKSKSWKDCLEDIKLFIQN LFYQYRHILPKLDSFANARFSKCIKGDSNLAELKFSLKFLTELLSFHYQTKVVLLIDEYD TPIISAYEHGYYEEAISFFRTFYSAALKDNEYLQMGIMTGILRVAKEGIFSGLNNLVVYS ILDEKYSSYFGLTEEEVEEALKYYHMEYNLQEVKEWYDGYRFGNTEIYNPWSIINYISNR KLDAYWINTSSNGMIHQVLEMAERTGSSIFQKLEMLFQQKTIIQRINKGSDFHDLVNMDE IWQLFLHSGYLTINDNEKDNMYELRIPNKEVYSFFQESFIQKFLGNYTTFHSLLRSLEKG DVKELEQTLEEILLSSVSYFDLSKESEKFYHVFMIGLVANFQERYYIKSNRESGEGRYDL ALEPRDRRKTGLLLEFKVANSEEELDKKAKEALLQIQEKRYDTEMQERGIQEIVKLGIAF CGKRVKVITKE Prediction of potential genes in microbial genomes Time: Fri May 20 02:00:18 2011 Seq name: gi|224461464|gb|ACDD01000038.1| Fusobacterium sp. 3_1_5R cont1.38, whole genome shotgun sequence Length of sequence - 14337 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 2, operones - 1 average op.length - 17.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 12.3 1 1 Tu 1 . 
+ CDS 92 - 1723 1307 ## Lebu_0003 protein of unknown function DUF1703 + Term 1905 - 1969 -0.9 - Term 2033 - 2081 9.4 2 2 Op 1 . - CDS 2109 - 2579 536 ## COG2606 Uncharacterized conserved protein - Term 2587 - 2632 5.6 3 2 Op 2 . - CDS 2661 - 3026 697 ## gi|257465827|ref|ZP_05630138.1| hypothetical protein FgonA2_00075 4 2 Op 3 1/0.000 - CDS 3031 - 4170 1515 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 5 2 Op 4 9/0.000 - CDS 4182 - 6359 2324 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 6 2 Op 5 . - CDS 6376 - 6888 694 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 7 2 Op 6 . - CDS 6899 - 7222 341 ## gi|257452384|ref|ZP_05617683.1| hypothetical protein F3_04900 8 2 Op 7 . - CDS 7219 - 8502 1666 ## COG1253 Hemolysins and related proteins containing CBS domains 9 2 Op 8 . - CDS 8526 - 8969 596 ## COG0511 Biotin carboxyl carrier protein 10 2 Op 9 1/0.000 - CDS 8982 - 9461 760 ## COG4492 ACT domain-containing protein 11 2 Op 10 1/0.000 - CDS 9454 - 10314 1117 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 12 2 Op 11 1/0.000 - CDS 10311 - 11243 1270 ## COG0223 Methionyl-tRNA formyltransferase 13 2 Op 12 1/0.000 - CDS 11245 - 11703 680 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains 14 2 Op 13 1/0.000 - CDS 11700 - 12179 675 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 15 2 Op 14 . - CDS 12179 - 12838 446 ## COG1381 Recombinational DNA repair protein (RecF pathway) 16 2 Op 15 . - CDS 12835 - 13410 493 ## FN1493 hypothetical protein 17 2 Op 16 . - CDS 13400 - 14221 663 ## COG1792 Cell shape-determining protein 18 2 Op 17 . 
- CDS 14258 - 14335 65 ## Predicted protein(s) >gi|224461464|gb|ACDD01000038.1| GENE 1 92 - 1723 1307 543 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0003 NR:ns ## KEGG: Lebu_0003 # Name: not_defined # Def: protein of unknown function DUF1703 # Organism: L.buccalis # Pathway: not_defined # 2 537 3 538 545 446 48.0 1e-123 MKKIPIGIEDFKMLITDDYFYIDKTKFIEEILNDGSLVKLFTRPRRFGKTLNMSMLKNFF DIRGAEENKKLFDSLYIEKSPVFAEQGKYPVIFISFKGLIGDTLEKLIDSLKVKISKLFA EYRDLIEKLDKFDTALFEKMILREDISEAELSESLLTLTDILYRYYKKQVIVLIDEYDAP LTYAYGQGYYKEAVDFFKTLYGNVLKTNSNLKMGVLTGAIRVAQAGIFSDLNNIETHTIL DEAYDEYFGLLENEVENILIEYKSEDKLEDVKSWYDGYKFGNMEVYNPWSILRYVKYKKL DAYWINTSGNALIKELLLLSDGTVFEDLDNLVNGQEKNIYVNESIALGNDLDPNRIWEII LFSGYLTVKEKISNESYLIKIPNKEIQSFFKGLFAEIVFKGKSNITSMKAALENKDINTI IRILEKVVLNAISFYDTNKKLENPYQTLLAGFLYALDDYYEMKPNPETGYGRADIILKPR NKKWIGYIFELKRAKTKNLEKEAEKALEQIEEKKYDTILISEGIKEIIKIGLVFDGKKAV AYY >gi|224461464|gb|ACDD01000038.1| GENE 2 2109 - 2579 536 156 aa, chain - ## HITS:1 COG:FN1373 KEGG:ns NR:ns ## COG: FN1373 COG2606 # Protein_GI_number: 19704708 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 155 1 155 162 177 63.0 8e-45 MKKTNAMRELDKAKIQYQYYEYEVDENHLGAIDVALKTGQDITRIFKTLVLVNEKKEMIV ACIPGSDTIDLKKLAKVSSSKKVEMIEMKQLLPMTGYIRGGCSPIGIKKKHRTFLHSSAR NKESIIVSGGMRGLQIELATEDLISYIGMEVEDIIV >gi|224461464|gb|ACDD01000038.1| GENE 3 2661 - 3026 697 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257465827|ref|ZP_05630138.1| ## NR: gi|257465827|ref|ZP_05630138.1| hypothetical protein FgonA2_00075 [Fusobacterium gonidiaformans ATCC 25563] # 1 121 1 121 121 144 98.0 2e-33 MKKFLLLAVLALGLVACGQKEEAKVEEATQVEQAVETPAAEEEVITFTREDGEANIVVKS SDKFETATIVIEDQEYAAKRVEAADGVKVATEDEKISVHFKNDYGVLEMDGTEVNLTVVK E >gi|224461464|gb|ACDD01000038.1| GENE 4 3031 - 4170 1515 379 aa, chain - ## HITS:1 COG:FN1481 KEGG:ns NR:ns ## COG: FN1481 COG0343 # Protein_GI_number: 19704813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine 
tRNA-ribosyltransferase # Organism: Fusobacterium nucleatum # 3 371 2 370 373 648 83.0 0 MKKLPVTYTLEMTDGKARAGKIQTPHGIIETPVFMPVGTQATVKAMTKEELDDIGTQIIL GNTYHLFLRPGDDLIDRLGGLHQFMSWKKPILTDSGGFQVFSLGALRKIKEEGVYFSSHI DGSKRFISPEKSIEIQNHLGSDIAMLFDECPPGLSTREYLIPSIERTTRWAKRCVEAHQK ADKQGLFAIVQGGIYEDLRQKSLEELLEMDEHFSGYAIGGLAVGEPREDMYRILDSIVEK CPENKPRYLMGVGEPIDMLEAVESGIDMMDCVQPTRIARHGTVFTKHGRLVIKNAEYAED TRALDEECDCYVCRNYSRAYIRHLLKVDEILGARLTSYHNLYFLVQLMKDAREAIKKGEF QKFKKEFIDKYNINIHRRK >gi|224461464|gb|ACDD01000038.1| GENE 5 4182 - 6359 2324 725 aa, chain - ## HITS:1 COG:FN1482 KEGG:ns NR:ns ## COG: FN1482 COG0317 # Protein_GI_number: 19704814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Fusobacterium nucleatum # 2 724 3 725 725 996 68.0 0 MSYWDSFVECVRQNHLEIDLDKVKLAYYLAEESHEGQYRKSGEAYIMHPIEVAKILVGLK SDTDTIIAAILHDIVEDTFITLADIEYNFGKNVAHLVDGVTKLKSLPNGTKNQSENIRKM ILAMTQNLHVILIKLADRLHNMRTLKFMKPEKQVAIAQETLEVYAPLAHRLGIAKIKWEL EDLCLYYLHNDKYLEIRSLIDKKKDERKDYIDSFIQTMTRILSDVGIKGQVKGRFKHFYS IYKKMYELGKEFDDIYDLMGVRIIVSNTSDCYHVLGEVHSRYTPVPGRFKDYIAVPKSNN YQSIHTTIVGPLAKFIEIQIRTEEMDKVAEEGVAAHWAYKEKRKTNKDDQIYGWLRNIIE LQQNTSNTEDFVKSVTADIKNDTIFVFSPKGDIVELPNMATTLDFAFAVHTQVGCRCIGA KVNGKIVPLDTKLQNGDRVEIITSKNSKGPSKDWLEIVRTHGAKSKIRKFLKDVNAEEIT KAGRESLEKELVRLGMSLKDLDTDSIILKHMEKNNIKSMEEFYYHVGEKRSKLEIIISKL RSKIEKEKVASEIKLEDIMTKKEEKPSRGKNDFGIVIDGINNTLIRFAKCCTPLPGDEIG GYVTRLTGITVHRKDCMNYQSMLKMDPSREIIVSWDEKLIHTKANKYNFGFTVFVNNRDG ILMDVVNVISNHKIHISSVNSHETNREGKLLASLKFTIEINDKEEYNQLINNISKIRDVL SIERD >gi|224461464|gb|ACDD01000038.1| GENE 6 6376 - 6888 694 170 aa, chain - ## HITS:1 COG:FN1483 KEGG:ns NR:ns ## COG: FN1483 COG0503 # Protein_GI_number: 19704815 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 1 170 1 170 170 291 80.0 5e-79 
MDLKKYVARVENFPKEGIIFRDITPLMNDGEAYQYATEKIVEFAREHQVELVVGPEARGF IFGCPVSYALGIGFVPVRKPKKLPREVVSYAYDLEYGSNTLCMHKDSIKPGQRVLIVDDL LATGGTIEASIHLIEELGGVVAGIAFLIELEELKGREKIKQYPILTLMKY >gi|224461464|gb|ACDD01000038.1| GENE 7 6899 - 7222 341 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452384|ref|ZP_05617683.1| ## NR: gi|257452384|ref|ZP_05617683.1| hypothetical protein F3_04900 [Fusobacterium sp. 3_1_5R] # 1 107 1 107 107 170 100.0 2e-41 MKKFLGIVFLVILSQGIYAQEQTWEYPFIKALNYEERQEWNLAIEELEKSRALQEENLFV LKELGYCYAKQGEWEKAKECYEKVLFFYPEDSNAKKNLEILLENKTK >gi|224461464|gb|ACDD01000038.1| GENE 8 7219 - 8502 1666 427 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 1 423 1 426 426 498 65.0 1e-141 MDTYLYIVVLVILVLLSGFFSASETALTAFRSIHLEKFVDEKKDSIVVLLKKWLKDPNPM LTGLLIGNNIVNIMASSIATVVMVTYFGNTGKSILIVTILMTVAILIFGEITPKLIARNH SSEVAGKVISFIYYLTLFLNPLILILVFISKVIGRACGVNMDNAGVMITEEDIISFVNVG QEEGIIEEDEKEMIHSIVGFGETTAKEVMTPRTSMTAFEGSKTIEDIWDTLMEDGFSRIP VYEETIDNILGILYIKDIMSQVKNGNINQPIRELVRPAYFVPETKSIIEILKEFKVKKVH IAMVLDEYGGIGGLLTIEDLIEEIVGEIRDEFDEEEEEFVRKVGDHSYEVDAMIDIETLD KELGIQLPVSEDYESLGGLITTELGRVTEKGDELELENVKLQVLEMDKMRISKVLITCEK EEEPKEE >gi|224461464|gb|ACDD01000038.1| GENE 9 8526 - 8969 596 147 aa, chain - ## HITS:1 COG:VC0296 KEGG:ns NR:ns ## COG: VC0296 COG0511 # Protein_GI_number: 15640324 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Vibrio cholerae # 71 147 120 196 196 89 53.0 3e-18 MKLDLKTMEELAENMNTYQLDSIDLEVGGERFCLKKSISKEANITNVVKTMENAETIEMP VIEEKKEEILGKQIFSPMAGTIYRAPAPDKAPFVEEGMNVKVGDTLCIVEAMKMMNEVKS TESGIITKILAEDGVVVKKGEALFEIK >gi|224461464|gb|ACDD01000038.1| GENE 10 8982 - 9461 760 159 aa, chain - ## HITS:1 COG:FN1487 KEGG:ns NR:ns ## COG: FN1487 COG4492 # Protein_GI_number: 19704819 # Func_class: R General function prediction only 
# Function: ACT domain-containing protein # Organism: Fusobacterium nucleatum # 16 158 11 153 153 140 49.0 1e-33 MTKKAKENEKIHDGKRQYYIVDKTILSASIQKVIAVNEMVKNEHISKHEGIRRTGLSRST YYKYKDFIKPFFEGSQEKIFNIHMSLKDRQGLLAQILEVIADDKMNILTIVQNAAVDGIV QLTISLQGTAETPKNIETTLAKIQVIDGVRDLRILGSNS >gi|224461464|gb|ACDD01000038.1| GENE 11 9454 - 10314 1117 286 aa, chain - ## HITS:1 COG:Cj0855 KEGG:ns NR:ns ## COG: Cj0855 COG0190 # Protein_GI_number: 15792193 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Campylobacter jejuni # 1 281 1 280 282 263 51.0 3e-70 MKLLDGKKVAAEIKEELKRKIVEEKEKTGKIPGLGIIQIGHNEAASVYVQSQIKGSKALG IQAFLYAFEDDVKEEVVLQKIEELNQTEEIDGIILQLPLPEQISRSHILQAIDVNKDVDG FKTENMGRLHLGEEGFNPCTPEGVITLLKKYDIEIAGKNVTIIGRSNIVGKPMLGLFVNH DATVTICNSLTKNLKEHTLKADIIVVAVGKEKFLTADMVQEGAIVVDVGINRTVTGKIVG DVEFEEVSKKTSYITPVPGGVGSMTVAMLFQNIWKAFIKNRRIVND >gi|224461464|gb|ACDD01000038.1| GENE 12 10311 - 11243 1270 310 aa, chain - ## HITS:1 COG:FN1489 KEGG:ns NR:ns ## COG: FN1489 COG0223 # Protein_GI_number: 19704821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Fusobacterium nucleatum # 1 307 8 314 317 365 58.0 1e-101 MRILFMGTPDFAVSSLRKLQEEHEVIAVFTKIDKPNQRGKKIQYTPVKQYALEHNLEVIQ PKSVKDMEIIEKIKEYRPDLIVVVAYGKILPKEILEIPKYGVINVHSSLLPKYRGAAPIH ASIIHGEKESGVSIMYVVEELDAGPVLAQESVEILEEDNCESLHNKLQEIGASLLLKTIS KIEKQEIQAIPQDETKVSFVKPFQKEDCKIDWNQSAREIFNFVRGMDPFPGAFTLYHGKQ LKIGRVEEEKEMILEGKAGEILAFVKGKGIVVATGKGNVVITKAKPENKKMLSGVDLING NFLQEGEHFE >gi|224461464|gb|ACDD01000038.1| GENE 13 11245 - 11703 680 152 aa, chain - ## HITS:1 COG:FN1490 KEGG:ns NR:ns ## COG: FN1490 COG1327 # Protein_GI_number: 19704822 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Fusobacterium nucleatum # 1 150 1 149 149 172 62.0 2e-43 
MRCPFCGSEDTKVVDSRSYLEGNSIKRRRECVVCQRRFSTFERVEEVPLFVIKKDQRRVP FNRDKVMRGLTFATVKRNIGREDLEKIVYEVEKNIQNTLKNEITTRDLGEMILEKLKKID QVAYVRFASVYKEFDDVKSFVELIEEMEREKK >gi|224461464|gb|ACDD01000038.1| GENE 14 11700 - 12179 675 159 aa, chain - ## HITS:1 COG:FN1491 KEGG:ns NR:ns ## COG: FN1491 COG1762 # Protein_GI_number: 19704823 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Fusobacterium nucleatum # 1 155 6 160 162 203 69.0 1e-52 MLNIVKITDYMSEDLICLDLTAKTKDEVLKELSTLMGKAPHIGTNSEVIYKALLEREKLG STGIGKGVAIPHAKTDAVEQLTIAFGISREKLDFKSLDEEEVNLFFVFASPNKDSHIYLK VLARISRFIREEEFRNTLLSCKTEKEVIECIREKEGVTL >gi|224461464|gb|ACDD01000038.1| GENE 15 12179 - 12838 446 219 aa, chain - ## HITS:1 COG:FN1492 KEGG:ns NR:ns ## COG: FN1492 COG1381 # Protein_GI_number: 19704824 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Fusobacterium nucleatum # 4 219 3 231 233 114 31.0 1e-25 MSKFIRNKAFVLGNYSFGEADRNLIVLTEDFGKIQLTVKGILKSKKRDIVATEALSYVDL LLYKKGEQFIISDFSSIENFMAIRQDLDSLSFAFYLLAVVNRFVFEGYRVPKIFKLLKNS LYYLNREEVKKKQLVLLNYFLFILMKEEGIFRVDEILIHLNPEEKEIVECIWKKQMENIY KEDRYTEEKLLLLLKKLELYIKEKLDIDISIDQYMMGGL >gi|224461464|gb|ACDD01000038.1| GENE 16 12835 - 13410 493 191 aa, chain - ## HITS:1 COG:no KEGG:FN1493 NR:ns ## KEGG: FN1493 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 25 188 21 186 192 103 39.0 2e-21 MKINKILFACIFLLFSFFAFGKEYKAIEKVKNFSFEVAETNYLGKKQKKILYKVQMSLPN SFKKEILFPELNKGEIYLYTGKTKTVYLPMFEQKKTTSLEKDEVQVLNVIDILVERLSSD KKFKKAYYEKKNVEFVLEENYKVRIVSYLDIDGYVFPKKWLIEEKGQKVLELTLSKVVID PKLTERDFQIS >gi|224461464|gb|ACDD01000038.1| GENE 17 13400 - 14221 663 273 aa, chain - ## HITS:1 COG:FN1496 KEGG:ns NR:ns ## COG: FN1496 COG1792 # Protein_GI_number: 19704828 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining 
protein # Organism: Fusobacterium nucleatum # 84 273 2 194 210 125 40.0 1e-28 MKINRNNEGKRIRFTFILLGIICLFLLLFTNVSNYLKNRIENFFLPIQASLYQSKENISD NFETYLNRDQLFKENERLKLENNKLKFILRENKILLEENKRLTSLLEMKQSLTEKIQFAK VYFRKPENMYDQFYIDLGEKDGIKKNMIVSQGEKLIGRIVEVYENSSLVYMITKESIVVS AKSENHMFGVVKGIGEDKLYFEPNVYDDSLKVGDKIYTSGISDIYPGDMYIGYISEIEKG DNSLFTSITIRPSINISNLKEVLVIQSRRNYEN >gi|224461464|gb|ACDD01000038.1| GENE 18 14258 - 14335 65 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no NFIGKLRKYCICIEKCKKKWYYIMK Prediction of potential genes in microbial genomes Time: Fri May 20 02:01:03 2011 Seq name: gi|224461463|gb|ACDD01000039.1| Fusobacterium sp. 3_1_5R cont1.39, whole genome shotgun sequence Length of sequence - 55695 bp Number of predicted genes - 62, with homology - 61 Number of transcription units - 18, operones - 13 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 144 - 230 104 ## + Prom 228 - 287 10.7 2 2 Op 1 . + CDS 330 - 887 726 ## COG0457 FOG: TPR repeat 3 2 Op 2 1/0.000 + CDS 911 - 1459 828 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 4 2 Op 3 11/0.000 + CDS 1498 - 2856 1563 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 5 2 Op 4 1/0.000 + CDS 2846 - 3796 1229 ## COG0462 Phosphoribosylpyrophosphate synthetase 6 2 Op 5 . + CDS 3799 - 4407 709 ## COG0009 Putative translation factor (SUA5) 7 2 Op 6 . + CDS 4416 - 4850 592 ## FN1994 hypothetical protein + Term 4852 - 4893 7.3 - Term 4843 - 4876 2.1 8 3 Tu 1 . 
- CDS 4890 - 5870 1385 ## COG0180 Tryptophanyl-tRNA synthetase - Prom 5894 - 5953 6.2 + Prom 5913 - 5972 7.0 9 4 Op 1 1/0.000 + CDS 5994 - 6947 465 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Prom 6983 - 7042 8.2 10 4 Op 2 16/0.000 + CDS 7093 - 8109 1748 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 8143 - 8177 6.2 + Prom 8123 - 8182 6.7 11 5 Op 1 10/0.000 + CDS 8204 - 9706 1908 ## COG1129 ABC-type sugar transport system, ATPase component 12 5 Op 2 . + CDS 9726 - 10745 1723 ## COG4211 ABC-type glucose/galactose transport system, permease component 13 5 Op 3 . + CDS 10808 - 10978 304 ## COG1773 Rubredoxin + Term 10982 - 11023 5.6 + Prom 10998 - 11057 11.8 14 6 Tu 1 . + CDS 11083 - 11694 710 ## gi|257452407|ref|ZP_05617706.1| hypothetical protein F3_05015 + Prom 11711 - 11770 9.0 15 7 Op 1 2/0.000 + CDS 11796 - 12356 873 ## COG0450 Peroxiredoxin + Term 12385 - 12413 1.0 16 7 Op 2 . + CDS 12436 - 14082 2495 ## COG0492 Thioredoxin reductase + Term 14097 - 14133 3.4 + Prom 14105 - 14164 10.7 17 8 Op 1 1/0.000 + CDS 14186 - 15142 290 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 18 8 Op 2 1/0.000 + CDS 15111 - 15743 815 ## COG0164 Ribonuclease HII 19 8 Op 3 1/0.000 + CDS 15754 - 16125 505 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 20 8 Op 4 1/0.000 + CDS 16085 - 16309 245 ## COG3478 Predicted nucleic-acid-binding protein containing a Zn-ribbon domain + Prom 16320 - 16379 3.0 21 8 Op 5 . + CDS 16413 - 16940 524 ## COG1040 Predicted amidophosphoribosyltransferases 22 8 Op 6 8/0.000 + CDS 17017 - 17769 1311 ## COG0149 Triosephosphate isomerase 23 8 Op 7 . + CDS 17785 - 19305 2080 ## COG0696 Phosphoglyceromutase + Term 19325 - 19358 2.3 + Prom 19360 - 19419 13.1 24 9 Tu 1 . 
+ CDS 19533 - 21017 2379 ## COG1757 Na+/H+ antiporter + Term 21049 - 21098 13.5 + Prom 21104 - 21163 7.9 25 10 Op 1 9/0.000 + CDS 21205 - 21474 405 ## COG3830 ACT domain-containing protein 26 10 Op 2 . + CDS 21497 - 22855 2125 ## COG2848 Uncharacterized conserved protein 27 10 Op 3 1/0.000 + CDS 22919 - 24016 1424 ## COG0012 Predicted GTPase, probable translation factor 28 10 Op 4 14/0.000 + CDS 24096 - 24278 268 ## PROTEIN SUPPORTED gi|237737599|ref|ZP_04568080.1| LSU ribosomal protein L32P + Term 24291 - 24323 1.7 + Prom 24339 - 24398 10.4 29 11 Op 1 16/0.000 + CDS 24424 - 25425 1414 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 30 11 Op 2 14/0.000 + CDS 25422 - 26408 1343 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 31 11 Op 3 6/0.000 + CDS 26419 - 27333 1429 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 32 11 Op 4 27/0.000 + CDS 27372 - 27596 527 ## COG0236 Acyl carrier protein + Term 27617 - 27654 2.1 33 11 Op 5 1/0.000 + CDS 27665 - 28912 2030 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 34 11 Op 6 3/0.000 + CDS 28922 - 29620 698 ## COG0571 dsRNA-specific ribonuclease 35 11 Op 7 1/0.000 + CDS 29672 - 30664 1161 ## COG1243 Histone acetyltransferase 36 11 Op 8 . + CDS 30639 - 31916 968 ## COG1530 Ribonucleases G and E 37 11 Op 9 1/0.000 + CDS 31928 - 32425 290 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 38 11 Op 10 7/0.000 + CDS 32429 - 33811 1636 ## COG1066 Predicted ATP-dependent serine protease 39 11 Op 11 . + CDS 33808 - 34857 705 ## PROTEIN SUPPORTED gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 40 11 Op 12 . 
+ CDS 34929 - 35327 592 ## FN1852 hypothetical protein + Term 35346 - 35378 4.2 + Prom 35383 - 35442 8.0 41 12 Op 1 1/0.000 + CDS 35478 - 36785 522 ## PROTEIN SUPPORTED gi|229254937|ref|ZP_04378866.1| SSU ribosomal protein S12P methylthiotransferase 42 12 Op 2 1/0.000 + CDS 36802 - 38043 1186 ## COG1158 Transcription termination factor 43 12 Op 3 1/0.000 + CDS 38047 - 39195 1261 ## COG0739 Membrane proteins related to metalloendopeptidases 44 12 Op 4 1/0.000 + CDS 39220 - 40281 1389 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 45 12 Op 5 . + CDS 40338 - 40790 595 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 46 12 Op 6 . + CDS 40787 - 41110 420 ## FN0480 hypothetical protein 47 12 Op 7 . + CDS 41124 - 41525 478 ## FN0481 hypothetical protein + Prom 41527 - 41586 5.7 48 13 Op 1 1/0.000 + CDS 41623 - 41865 381 ## PROTEIN SUPPORTED gi|237739934|ref|ZP_04570415.1| LSU ribosomal protein L31P 49 13 Op 2 . + CDS 41921 - 42544 895 ## COG0035 Uracil phosphoribosyltransferase + Term 42550 - 42594 4.2 + Prom 42577 - 42636 10.2 50 14 Op 1 15/0.000 + CDS 42676 - 42870 440 ## COG2608 Copper chaperone 51 14 Op 2 2/0.000 + CDS 42867 - 43124 318 ## COG2217 Cation transport ATPase + Term 43130 - 43173 4.4 52 14 Op 3 . + CDS 43187 - 45091 2651 ## COG2217 Cation transport ATPase 53 14 Op 4 . + CDS 45093 - 45494 497 ## COG0295 Cytidine deaminase + Prom 45505 - 45564 4.2 54 15 Op 1 1/0.000 + CDS 45594 - 46970 1653 ## COG2031 Short chain fatty acids transporter 55 15 Op 2 21/0.000 + CDS 46998 - 47648 1036 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit 56 15 Op 3 . + CDS 47661 - 48326 1079 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit + Term 48456 - 48487 2.1 + Prom 48393 - 48452 2.3 57 15 Op 4 . + CDS 48496 - 49797 1835 ## COG0422 Thiamine biosynthesis protein ThiC + Term 49802 - 49854 11.2 - Term 49787 - 49844 11.0 58 16 Op 1 . 
- CDS 49850 - 51184 973 ## COG0534 Na+-driven multidrug efflux pump 59 16 Op 2 . - CDS 51193 - 52977 524 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 - Prom 53048 - 53107 18.0 + Prom 53021 - 53080 8.3 60 17 Tu 1 . + CDS 53179 - 54360 1789 ## COG1301 Na+/H+-dicarboxylate symporters + Term 54382 - 54423 7.1 + Prom 54397 - 54456 12.6 61 18 Op 1 . + CDS 54480 - 54809 475 ## FN0737 hypothetical protein 62 18 Op 2 . + CDS 54822 - 55679 314 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains Predicted protein(s) >gi|224461463|gb|ACDD01000039.1| GENE 1 144 - 230 104 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METLSIILYNKIITKLYYLGGNYAKKKK >gi|224461463|gb|ACDD01000039.1| GENE 2 330 - 887 726 185 aa, chain + ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 12 171 142 296 628 58 28.0 9e-09 MLNTDLLKVKFLSKYTEQELEDYEANLRMKLLGKVNIISSMAKLASLCFFKKDYDTAIYF FEKLMTLDATNGNWPGFLAYVYYEQEKYEKAIPYFEKSVDLSPNSPFIYFLLGNSYSRLG KIKEATWCYELAIFLDFDIYGAHVDFAKKYEKMGQKEKALEEYILAYEIDPRDKKIKKKI DALSQ >gi|224461463|gb|ACDD01000039.1| GENE 3 911 - 1459 828 182 aa, chain + ## HITS:1 COG:FN1990 KEGG:ns NR:ns ## COG: FN1990 COG0484 # Protein_GI_number: 19705286 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Fusobacterium nucleatum # 42 178 1 175 175 87 34.0 9e-18 MSDGLLVVILIIAILAFSGRIQGLSGLLIFGLILFFLGWFTIKFFWIILAIIGINYITRS MKPKTQRRTRYTYRTYSQQDFEDFFRRASGGQYKGQYQGNYSGNHYGNSYGSYVEDLSKY YAILGVVEGASKEDIKKAYLKKVKEHHPDRFATASETEKKFHEEQLKAINEAYDKIEKSY TV >gi|224461463|gb|ACDD01000039.1| GENE 4 1498 - 2856 1563 452 aa, chain + ## HITS:1 COG:FN1991 KEGG:ns NR:ns ## COG: FN1991 COG1207 # Protein_GI_number: 
19705287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Fusobacterium nucleatum # 3 448 1 446 446 568 66.0 1e-162 MSLKTLILAAGKGTRMKSDLPKVLHKVNGKPMLHKILDVVNFLQPEENILILGYKREEIL ATLDTCSYVVQEEQLGTGHAILQAKEKLKDYHGDIMVLYGDTPLLREETLQQLHQYHKEQ KATTTVLTAVYENPFGYGRILKKERKVLGIVEEKEATEEQKKIQEVNAGVYCFDSQELWK ALSKINNKNEKGEYYLTDVLSIQAMEGKTVLSYELKDSQEILGVNSKVELAEANQVLRQR KNKQLMENGVTLLDPSITYVEEDVKIGQDTVLAPTVILQGKTIIGKKCEILGNTRIIDSQ LGDNIVVESSVIEESILEDGVTMGPFAHLRPKAHLKKKVHIGNFVEVKKSVLEEGVKAGH LTYLGDAHVGERTNIGAGTITCNYDGVNKFPTNIGKDVFIGSDSMLVAPVNIGENALIGA GSVITKDVPENALAVERNKQIIKNEWRKKNGR >gi|224461463|gb|ACDD01000039.1| GENE 5 2846 - 3796 1229 316 aa, chain + ## HITS:1 COG:FN1992 KEGG:ns NR:ns ## COG: FN1992 COG0462 # Protein_GI_number: 19705288 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Fusobacterium nucleatum # 3 313 5 315 316 509 83.0 1e-144 MVDSVKIFAGTSNKELAQKIAEKYGMELGKAEVVRFKDGEVFVKIDETVRGRDVFVVQPT SEPVNENLMELLIFVDALKRASAKSINVIVPYYGYARQDRKSSPREPITSKLVANLLTKA GVTRLLTMDLHADQIQGFFDIPVDHLQALPLMVKYFKSKGFYGDKVVVVSPDIGGVKRAR KLAEKLDCKIAIIDKRRPKPNMSEVMNLIGEVEGKIAIFIDDMIDTAGTITNGATAIMER GAKEAYACCTHAVFSDPAIERLTASSLTEIIVTDSIRLPERKKIDKVKILSVDELFAEAI NRVVHNQSVSELFEVK >gi|224461463|gb|ACDD01000039.1| GENE 6 3799 - 4407 709 202 aa, chain + ## HITS:1 COG:FN1993 KEGG:ns NR:ns ## COG: FN1993 COG0009 # Protein_GI_number: 19705289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Fusobacterium nucleatum # 4 202 17 216 217 221 61.0 5e-58 MEVKYQEIGEQIKKGALIIYPTDTVYGIGASIQSEEALIHLYQAKSRNFSSPLIALVDSV ERISEIAYVERKKELLEKLSQKFWPGGLTIILPAKDCVPKIMISGGNTVGVRIPNHEMAL SIIRASGGILPTTSANISGEATPSSYQELSEAIKRNADIVIDGGVCPVGEASTILDFTKD SIQILRLGAITKEEIEAVIGKI 
>gi|224461463|gb|ACDD01000039.1| GENE 7 4416 - 4850 592 144 aa, chain + ## HITS:1 COG:no KEGG:FN1994 NR:ns ## KEGG: FN1994 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 143 4 144 144 135 63.0 5e-31 MKTRVNPNAISPMEMNQMSSMMGMMSSLQKIGKGKRKYSIPLDKSSKKFLVRFIDEVKKQ FAGSVMADQNKQIYDFLVYVKEVSEKKESTELKVSFEEEEFLKRMLKDSVRGMETMEFKW YQFVKKRMVKMLTSQYRDLLAKFK >gi|224461463|gb|ACDD01000039.1| GENE 8 4890 - 5870 1385 326 aa, chain - ## HITS:1 COG:FN0405 KEGG:ns NR:ns ## COG: FN0405 COG0180 # Protein_GI_number: 19703747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 325 1 325 325 520 77.0 1e-147 MKRSLSGIQPSGILHLGNYFGAIDQFVTMQDDYEGFYFVADYHSLTSLTKPEDLRENTKN IILDYLSLGLNPEKSTLFLQSDVPEHVELYWLLCNVAPVGLLERAHSYKDKLAKGFTPNM GLFNYPALMAADILIYDADVVPVGKDQKQHLEMTRDIAAKFNQQYEVDFFKLPDPLIMDK VAVVPGTDGQKMSKSYGNTIQMFAPKKQLKQQVMSIVTDSTPLEEPKNPDNNIAKLYALF ANIEKQNEMKEKFLAGNYGYGHAKTELLNAILEYFGNAREKREELAQNTKYVEEILQEGA RKARAIASKKVQEAKEIVGLLGNIYR >gi|224461463|gb|ACDD01000039.1| GENE 9 5994 - 6947 465 317 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 6 315 7 318 319 183 34 2e-45 MKYYAGVDLGGTNTKIGICDAEGKIVSSSSIKTDSIRGVDDTLFRIWTEIQRQVLEQKIE KEDLQGIGIGIPGPVKNQSIVGFFANFPWEKNINLQEKMEKISGVTTKLDNDVNVIAQGE AIFGAARGHRSSITVALGTGIGGGIFIDGKLISGMTGAGGEVGHMKLVPDGKLCGCGQKG CFEAYASATGMIREALSRLYVNKQNALYDKFQGNYENLEAKDIFEAAAAGDIFSQEIVDY EAEYLAMGIGNLLNIINPEVIVLGGGIALAKEQILVPIQTKISKYALEITLENLEIKTGV LGNEAGILGAAALFIVS >gi|224461463|gb|ACDD01000039.1| GENE 10 7093 - 8109 1748 338 aa, chain + ## HITS:1 COG:FN1165 KEGG:ns NR:ns ## COG: FN1165 COG1879 # Protein_GI_number: 19704500 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 338 1 341 341 451 70.0 1e-126 
MKKTGIVLGALLLAAGLVGCGEKKEAAAPAENAVRMGLTAYKFDDNFIALFRQAFQTEAD AVGDQVALQMVDSQNDAAKQNEQLDVLLEKGIDTLAINLVDPAGVDVVLEKIKAKELPVV FYNRKPSDEALASYDKAYYVGIDPNAQGIAQGKLVEKAWKENPALDLNGDGVIQFAMLKG EPGHPDAEARTIYSIKTLNDDGIKTEELHLDTAMWDTAQAKDKMDAWLSGPNADKIEVII CNNDGMALGAIESMKAFGKSLPVFGVDALPEAITLIEKGEMAGTVLNDAKGQAKATFQVA MNLGQGKEATEGTDIQMENKIVLVPSIGIDKENVAEYK >gi|224461463|gb|ACDD01000039.1| GENE 11 8204 - 9706 1908 500 aa, chain + ## HITS:1 COG:FN1166 KEGG:ns NR:ns ## COG: FN1166 COG1129 # Protein_GI_number: 19704501 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 500 1 500 500 781 82.0 0 MENLKYVLEMEGITKSFPGVKALDNVQLKVRPHSVHALMGENGAGKSTLMKCLFGIYEKD AGKILLDGIETSFHSTKEALENGVSMVHQELNQVLQRNVLDNIWLGRYPKKGLFIDEKKM YEDTIRIFKDLDINIDPRKKVSELQVAERQMIEIAKAVSYNSKVLVMDEPTSSLTEKEVA HLFRIINKLRDSGVGIVYISHKMEEIKAISDDITILRDGTWVGTDSVKELDTDKIISMMV GRDLTDRFPPKDNEVKEKILEVKNLTGFYQPTIQDVSFDLHKGEILGIAGLVGAKRTEIV ETMFGMRKLESGQIFLHGKEVKNTDPKSAIKNGFALVTEERRSTGIFSMLDITANSTLSN LDKYKNKFGLLENKQMKDATKWVIDSMRVKTPSQSTPIGSLSGGNQQKVIIGRWLLTEPE VLMLDEPTRGIDVLAKYEIYQLMIDLAKKEKGIIMISSEMPELLGVTDRILVMSNGRIAG IVKTSETNQEEIMALSAKYL >gi|224461463|gb|ACDD01000039.1| GENE 12 9726 - 10745 1723 339 aa, chain + ## HITS:1 COG:FN1167 KEGG:ns NR:ns ## COG: FN1167 COG4211 # Protein_GI_number: 19704502 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Fusobacterium nucleatum # 1 339 1 339 339 461 79.0 1e-130 MNIRNKEGKINYKELFIQSGLYLVLFCMLLVIIWKEPSFLSIRNFKNILTQSSVRAIIAL GVAGLILTQGTDLSAGRQVGLAAVISATMLQAVTNVNRVFGLDRELPIIYAIIVVCLVGL VIGVVNGLIVAKLNVHPFIATLGSMTVVYGINSLYYDIVGASPISGFSSKYSSFAQGAVD LGGFSIPYLIIYATIATIIMWTLWNKTKFGKNIFAVGGNPEAAKVSGVNVVLTLVGIYAL SGVFYAFGGFLEAGRIGSATNNLGFMYEMDAIAGCVIGGVSFYGGVGRISGVITGVIILT VINYGLTYVGVSPYWQYIIKGIIIVAAVAFDSIKYAKKK >gi|224461463|gb|ACDD01000039.1| GENE 13 10808 - 10978 304 56 aa, chain + ## HITS:1 COG:AF1349 
KEGG:ns NR:ns ## COG: AF1349 COG1773 # Protein_GI_number: 11498945 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Archaeoglobus fulgidus # 1 53 20 72 73 75 67.0 3e-14 MKLYVCEVCGYVYDSTLGDVDHGIPAGTKFEDLPDDWVCPPCGVSKDHFREMEVNK >gi|224461463|gb|ACDD01000039.1| GENE 14 11083 - 11694 710 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452407|ref|ZP_05617706.1| ## NR: gi|257452407|ref|ZP_05617706.1| hypothetical protein F3_05015 [Fusobacterium sp. 3_1_5R] # 1 203 1 203 203 400 100.0 1e-110 MRKKELKYFTIEDSFGGNQDWFTDPMMNRGGCGAVTACDTCMYFSKYYAQKHLYPFDIEN LTKEKFIEFSNIMKPFLSPRRMGINTLELYMDGFQEYLNSVLDTFLGMRGFLGTEKLDEA EEKVIEQIEKGFPIPYLNLLHQDKSFEDYEWHWFSLIGYEKKEENFFVKAVSYGKVEWLD FKKLWNTGHKQKGGMVLYFLLKR >gi|224461463|gb|ACDD01000039.1| GENE 15 11796 - 12356 873 186 aa, chain + ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 186 1 188 188 293 75.0 1e-79 MSLIGKKVSEFKVQAYHNGEFKEVSNKDFEGKWAAFVFYPADFTFVCPTELADLADHYAE FQKEGCEVYSVSCDTHFVHKAWHDTSDSIKKIQYPMLADPTGKLARDFEVMIEEEGLALR GSFIVNPQGEIKAYEVHDNGIGREASELLRKLRAAKFVAEHGEVCPAKWQPGSETIKPSI DLVGKL >gi|224461463|gb|ACDD01000039.1| GENE 16 12436 - 14082 2495 548 aa, chain + ## HITS:1 COG:FN1984_1 KEGG:ns NR:ns ## COG: FN1984_1 COG0492 # Protein_GI_number: 19705280 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Fusobacterium nucleatum # 1 331 1 331 332 449 74.0 1e-125 MERIYDVIIIGGGPAGLSAGIYAGRAKLDVLLLEKAVPGGQIRITDEVVNYPGILETTGA GFGEKAAEQAKKFGVEFATEEVIGMDFSGKIKTIKTTSGEYKTLAVVIATGASPRKLGFP GELEYAGRGVAYCATCDGEFFTGLPVFVVGAGFAAAEEAMFLTKYASKVTVIAREPDFTC AKSIGDKVKAHPKIEVKFHTELIEATGDSQLRHAKFKNNETGEITEYHAPDGDTFGIFVF VGYAPETQLFKGVIDLDPAGFIPTNEDLMSNVEGVYAAGDIRPKKLRQVVTAVADGAIAA TNIEKYVQELREELGMVKEEIEEEKVESSSTNSRVLDDAIMQQIQGLAERFEKSVQLVVI 
QDPEKAEKSAEMLSLVNEIASASDKIQVQHYQKGENPEMEAKIQANFLPVVAFLNDKGEY ARIKYAVVPGGHELTSFLLALYNVAGPGQAVKEEIQQKATEIDERVNLKIGVSLTCTKCP ETVQSAQRIAVENQNVDIEVVDVFGFQDFKKKYDIMSVPAVVMNDKSLFFGQKDIQALLD DIFEKLGK >gi|224461463|gb|ACDD01000039.1| GENE 17 14186 - 15142 290 318 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 22 302 20 284 285 116 33 3e-25 MRRIVENYEYQVGEEDQGKRLDLFLKEQLPEATRSYLEKLITEGYVLCNEKVITKNGKKL KGKEVIQLAIPEEEEMEIVAENIPLDIVYEDKYLLVVNKQANMVVHPALGNYSGTLVNAL LYYCKENLSDMNGVIRPGIVHRLDKDTTGLIIVAKNNQVHSKLALMFQEKTIRKTYVAIV KGRFSEERKEGRLETLITRDPKDRKKMTVSQIQGKKAISNYRVLLDGDKHSLVEVKIETG RTHQIRVHMKYLNHPILGDIVYGQEDTKCKRQMLHAYRLEFIHPITEEAICLEGKLPEDF IEAGKRVFDGKDVGTVLI >gi|224461463|gb|ACDD01000039.1| GENE 18 15111 - 15743 815 210 aa, chain + ## HITS:1 COG:FN1371 KEGG:ns NR:ns ## COG: FN1371 COG0164 # Protein_GI_number: 19704706 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Fusobacterium nucleatum # 1 210 3 211 215 222 60.0 5e-58 MEKMWEQSLYDFDVEKGEKIVGVDEAGRGPLAGPVVAAATLLKQYDRSLDSIQDSKKLTE KKREALFSVIPQFFYVGIGQASVEEIEKYNILNATFLAMRRALANLEEQVEIQDATILVD GNFTIRECQRKQEAVVKGDAKSLSIAAASIMAKVTRDHKLVELAKQYPDYLFEKHKGYGT KVHREKILSLGPIPGVHRDSFLVKILQKKQ >gi|224461463|gb|ACDD01000039.1| GENE 19 15754 - 16125 505 123 aa, chain + ## HITS:1 COG:FN1370 KEGG:ns NR:ns ## COG: FN1370 COG0792 # Protein_GI_number: 19704705 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Fusobacterium nucleatum # 3 119 2 118 119 107 52.0 5e-24 MQNNRQKGNEYEERAVNILRENQYQILERNFRIFQGEIDIIAEKDGVLVFIEVKYRKNRN FGYGKEAVDSRKLGKIFRVAEYYKTYCGKQYQKMRIDVIHFLGDTYFWEKDVAWGDEIGC EMF >gi|224461463|gb|ACDD01000039.1| GENE 20 16085 - 16309 245 74 aa, chain + ## HITS:1 COG:FN1369 KEGG:ns NR:ns ## COG: FN1369 COG3478 # Protein_GI_number: 19704704 # Func_class: R General function prediction only # 
Function: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain # Organism: Fusobacterium nucleatum # 6 63 1 59 75 63 57.0 7e-11 MWRGVMKLDVKCSKCGSKEYEVRNVILPEKKQGMKLELNLYYVKTCLSCGYSEFYLAKVV DKDEKEVPVPKAEY >gi|224461463|gb|ACDD01000039.1| GENE 21 16413 - 16940 524 175 aa, chain + ## HITS:1 COG:FN1368 KEGG:ns NR:ns ## COG: FN1368 COG1040 # Protein_GI_number: 19704703 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Fusobacterium nucleatum # 2 163 37 198 204 121 40.0 6e-28 MLRYYEGHYYVHLYQEPIRSWIHEYKFQGRKEFGEIFAKWMKKAFWECYDRNKIDIVVPV PIHEKRRLERGFNQTEEILKHLSVPYITIERYKNTKHLYQYGMKRDRQEIMEAAFYCPIS LEGKNVLLFDDIITTGTTISEMKKAICQKGMPNKIVSFAFALSERVKIEQKSGGN >gi|224461463|gb|ACDD01000039.1| GENE 22 17017 - 17769 1311 250 aa, chain + ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 249 1 250 251 334 72.0 1e-91 MRRTVIAGNWKMNKTNQEAVEMLHQLKEEVAGISEVDIVIGAPFTCLSDAVKETVGSNIK IAAENVYPKSSGAYTGEISPKMLKAIGVEYVILGHSERREYFQESDEFINEKVKAVLAEG MTAILCIGEKLEDREAGKTNMVNETQLRGGLAGITKEEAANIIVAYEPVWAIGTGKTATP ELAQETHAEIRKVLVSLFDKVGEEMTIQYGGSMKPENAAELLAQKDIDGGLIGGASLEAK SFAAIVKAGR >gi|224461463|gb|ACDD01000039.1| GENE 23 17785 - 19305 2080 506 aa, chain + ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 2 505 3 510 510 572 57.0 1e-163 MKKPVMLVIMDGWGINEKLEEKNAIRVAKPHNLLQLEEKYPHSKLQASGEAVGLPEGQMG NSEVGHLNIGAGRVVYQPLVEISVDIRNGEFFKKPALVEAFEYAKQHKTKIHFGGLLSPG GVHSHTEHLYGLLEMAKKYELSEVYVHAFLDGRDTPPSSAIDYVKELEEKMKEIGIGKIA SLSGRYYAMDRDKNWDRVELAYKAMVLGEGNHADSAVKAMEDSYSNGKTDEFVLPTIIDK QGKIGKGEVFINFNFRPDRAREITRALNDKEFTGFDREYLDLQFYCMRQYDSTIEAKVIY 
EDKNIAKTFGEVVSEAGLKQLRTAETEKYAHVTFFFNGGKETQYEGEDRILVPSPKVATY DLQPEMSAYEVTEGALKALDSDQYDVIILNFANTDMVGHTGVMEATVKAVQTVDECIGKI ADKILEKDGVLLITADHGNADLMEDPITKVPFTAHTTNLVPCLLVSNRYQDVSLKDGALC DLAPTLLYFLGIEQPEEMNGTCLIEK >gi|224461463|gb|ACDD01000039.1| GENE 24 19533 - 21017 2379 494 aa, chain + ## HITS:1 COG:FN1860 KEGG:ns NR:ns ## COG: FN1860 COG1757 # Protein_GI_number: 19705165 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 3 482 46 525 525 659 74.0 0 MKAFFKLSPVFLLAALMVAGYDALIAAPIATMYACVVAMLTEKTKFQGVIDAAIASVKEI QVALFILMIAYAMAEAFMSTGVGASIIIIALKFGITGKTVALVGAIVTAILSIATGTSWG TFAACAPVFLWLNHIVGGSITLTLGAIAGGACFGDNIGLISDTTIVSSGIQGVEVVRRIR HQGVWSGLVLLSGIILFGVFGVIMDLPSTVGDAAEAISKITPEVWTQLAEERESAVKLLE QVQAGVPLYMVIPLVVVLVLAFAGFQTFICLFSGVILSYVFGYFAGTVGTVNEYLDMCMS GFSDAGGWVVVMMMWVAAFGGVMKMMNAFRPLSDLLGRMARNVKQLMFFNGCLSIFGNAA LADEMAQIVTIGPIIKELVEENVEASEEDMYVLKLRNATFSDAMGVFGSQLIPWHVYIGY YLGIIGIVYPIYEFKPMDLIQYNFIAYIAVISMLVLTLTGLDRLVPLFGLPSEPKVRLRT KEEREAYTASKKAK >gi|224461463|gb|ACDD01000039.1| GENE 25 21205 - 21474 405 89 aa, chain + ## HITS:1 COG:MK1213 KEGG:ns NR:ns ## COG: MK1213 COG3830 # Protein_GI_number: 20094649 # Func_class: T Signal transduction mechanisms # Function: ACT domain-containing protein # Organism: Methanopyrus kandleri AV19 # 2 89 3 90 90 78 42.0 2e-15 MKCIITVLGTDKVGIIAKICTYLSEVNVNILDISQTIIGGYFNMMMIVDMTDANKKMEEV NEELIHIGKKMGVIITMQHEDIFNCMHRI >gi|224461463|gb|ACDD01000039.1| GENE 26 21497 - 22855 2125 452 aa, chain + ## HITS:1 COG:lin0538 KEGG:ns NR:ns ## COG: lin0538 COG2848 # Protein_GI_number: 16799613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 452 5 451 451 593 70.0 1e-169 MISRVEIQETNRMIAEAKLDVRTITMGISLIDCADTDVDKFNEKVYKKITTYAKDLVRVG DEIAKQFGIPVVNKRISVTPIAIAAASCQTNSYVSIAKTLDRAAKDCGVNFIGGFSALVQ KGCTPSDTILINSIPEAMDVTERVCSSVNVGTSRNGLNMDAIKRMGEVIKETAERTKDRD GIGCAKLVVFCNAVEDNPFMAGAFHGVGEADCVINVGVSGPGVVKRALEEVREGDFETLC 
ETVKKTAFKITRVGQIVAQEAARRLQVPFGIIDLSLAPTPAIGDSIGEIFQEMGLECAGA PGTTAALAILNDNVKKGGVMASSYVGGLSGAFIPVSEDHAMIEAVERGALSLEKLEAMTC VCSVGLDMIAIPGDTSAATISGIIADESAIGMINNKTTAARLIPVVGKEVGDQVEFGGLL GYAPVMKVNSFSCEKFIARGGRVPAPIHSFKN >gi|224461463|gb|ACDD01000039.1| GENE 27 22919 - 24016 1424 365 aa, chain + ## HITS:1 COG:FN1365 KEGG:ns NR:ns ## COG: FN1365 COG0012 # Protein_GI_number: 19704700 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Fusobacterium nucleatum # 1 365 1 364 364 608 84.0 1e-174 MIGIGIVGLPNVGKSTLFNAITKAGAAEAANYPFCTIEPNIGMVTVPDERLNALSEIINP QRVVAATVEFVDIAGLVKGAAQGEGLGNKFLSNIRSTAAICQVVRCFEDENVVHVDGSVD PIRDIEVINTELIFADLETVDKAMEKHKKLAQNKIKESVELMTVLPKAKSHLESFQLLKT FDFTEEEKSLLKNYQLLTLKPMIFAANVAEDDLAEGNAYVEKVREYAKILGSEVVIVSAK VEAELQEMDDEESKQEFLESLGVKEAGLNRLIRAGFKLLGLQTYFTAGVKEVRAWTIHIG DTAPKAAGEIHTDFEKGFIRAKVVSYEDFIQYRGWKGAQEVGVLRLEGKEYIVQDGDLME FLFNV >gi|224461463|gb|ACDD01000039.1| GENE 28 24096 - 24278 268 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237737599|ref|ZP_04568080.1| LSU ribosomal protein L32P [Fusobacterium mortiferum ATCC 9817] # 1 60 1 60 60 107 81 1e-22 MAVPKKKTSKAKKNMRRSHHALAGISLSICSKCGAPKRQHRVCLECGDYNGKQVLATEAE >gi|224461463|gb|ACDD01000039.1| GENE 29 24424 - 25425 1414 333 aa, chain + ## HITS:1 COG:FN0147 KEGG:ns NR:ns ## COG: FN0147 COG0416 # Protein_GI_number: 19703492 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Fusobacterium nucleatum # 1 331 1 332 332 373 63.0 1e-103 MKIALDAMSGDFAPHSTVEGAVLFTKEIAETEIILVGKEEVIREELKKYSYDKERIRIQN AKEIIEMTDHPVEAIRNKKDSSMNVALDLVKKGEADACVSSGNTGALLSASQLKLKRIKG VLRPAIASVFPSKKGQIVMLDLGATADCKAEYLNQFSSLASKYAELLLGVSSPRVGLLNI GEEVGKGNELTREAYTLLQTNQSIHFIGNIEATQMMEGKVDVVVTDGFTGNMVLKTAEGT AKLITSLLKETIQESLLSKIGALFLKKSFIHLKEKMDSSEYGGAIFLGLNEISIKAHGNS DANGIKNALKVADKFSKINLIEQLKKVIEEEAN >gi|224461463|gb|ACDD01000039.1| GENE 30 25422 - 26408 1343 328 aa, chain + ## HITS:1 
COG:FN0148 KEGG:ns NR:ns ## COG: FN0148 COG0332 # Protein_GI_number: 19703493 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 1 328 1 328 328 434 63.0 1e-121 MKSVGIKGLSSYVPERVMTNFDFEKIIDTSDEWIRTRTGIEERRFAKPEQATSDLCYEAT RKLLAERAIDPKEIDFIMVCTCTPDYPVPSTACVLQSKLGIMGIPAVDINAACSGFMYGL TMAASMAQTGLYKNILVIGAETLSRILDMQDRNTCVLFGDGAAAAIVGEVEEGSGILATH LGAEGENDGILQIPGGGSKYPHTLESIEERKQFVKMKGQNVYKFAVHALPDATLAALEKA KISPNQVTRFFPHQANLRIIEAAAKRMNVPVDKFHVNLHKVGNTSAASVGLVLADALEKG MVKKGDYVALTGFGAGLTYGSVVMKWAY >gi|224461463|gb|ACDD01000039.1| GENE 31 26419 - 27333 1429 304 aa, chain + ## HITS:1 COG:FN0149 KEGG:ns NR:ns ## COG: FN0149 COG0331 # Protein_GI_number: 19703494 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Fusobacterium nucleatum # 1 298 1 298 299 399 71.0 1e-111 MGKVAFVFPGQGTQYVGMGKDLYEKSPRAKEILDKMFQSLDFDLKSIMFEGTAEDLKQTK YTQPAIVALSLTLMELAKEKGLKADYVAGHSVGEYTAYGAAGMLSFEEAICLTAARGQIM NDVSEKVNGTMAAVLGMPAEKIQEVLAGMDGVVEAVNFNEPNQTVIAGQKAAVEAACLAL KEAGARRALPLAVSGPFHSSLMKEAGEKLKEEAEKYHFSMTEIGLVANTTAEVLTSVEDV KNEIYHQSFGPVYWVKTIEYLVAAGVDTIYEIGPGKVLSGLIKKINKEITVKNIETLEEI ENLM >gi|224461463|gb|ACDD01000039.1| GENE 32 27372 - 27596 527 74 aa, chain + ## HITS:1 COG:FN0150 KEGG:ns NR:ns ## COG: FN0150 COG0236 # Protein_GI_number: 19703495 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Fusobacterium nucleatum # 1 74 1 74 75 84 71.0 4e-17 MLDKIREIVVEQLGVEPEQVVMEASFTEDLGADSLDTVELIMAFEEEFGVEIPDTEAEKI KTIKDVVDYVEAHQ >gi|224461463|gb|ACDD01000039.1| GENE 33 27665 - 28912 2030 415 aa, chain + ## HITS:1 COG:FN0151 KEGG:ns NR:ns ## COG: FN0151 COG0304 # Protein_GI_number: 19703496 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 
3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Fusobacterium nucleatum # 1 414 1 412 413 541 67.0 1e-154 MRRVVVTGLGMISPLGINLKNSWERLLQGECGISKIESYDASEMPVQIAAEVKDFNPMDF GIEKKEVKKLARNTQFAIAASKMALEDSKLNLEETNPFDIGVVISSGIGGMEIFEDQHKN MLEKGVKRISPFTIPAMISNMAAGNVAIYLGLQGPNKSVVTACASGTNSIGEAFEEIKLG KAQIMLAGGTEAAITPFAQNAFANMKALSDTHNEEPQKASRPFSKDRDGFVMGEGAGILV LEELEHAKARGAKIYAEMVGYGSSCDAYHITAPYESGVAAAHAMTMAMKEAGVKPEEVEY INAHGTSTPANDKTETKAIKVALGEENAKKVWISSTKGALGHGLGAAGGLEGVIIAKVLE TGMVPPTINYETPDEECDLDYVPNVKREKEIRVAMSNSLGFGGHNAVILMKKYQD >gi|224461463|gb|ACDD01000039.1| GENE 34 28922 - 29620 698 232 aa, chain + ## HITS:1 COG:FN0152 KEGG:ns NR:ns ## COG: FN0152 COG0571 # Protein_GI_number: 19703497 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Fusobacterium nucleatum # 3 231 2 230 234 270 61.0 1e-72 MSKNLVDLEHRINYYFNDKNLLKNALIHRSFGNEHKHYKNINNEKLELLGDAVLGLVVAE YLYQKYPEEKEGVLAKIKSMAVSEPVLASISRKLRIGEYLLLSKGEMVTGGRDRNSILGD VFEAILGAIYLDSGFFAAKEYVLFHLKDMIDHIDDFEEILDFKTILQEYCQKKYRDIPKY TLVGEEGPDHRKLFEMQVQIQNNIAKAKGTNKKIAEQMAAKQLCKELGVKYL >gi|224461463|gb|ACDD01000039.1| GENE 35 29672 - 30664 1161 330 aa, chain + ## HITS:1 COG:FN0153 KEGG:ns NR:ns ## COG: FN0153 COG1243 # Protein_GI_number: 19703498 # Func_class: K Transcription; B Chromatin structure and dynamics # Function: Histone acetyltransferase # Organism: Fusobacterium nucleatum # 3 316 21 334 348 307 49.0 2e-83 MCFVIKKKLNGQETDIQVEDIHRIVKEYLKTLPKKSEKEVAFFGGTFTGLSMELQKKYLE ALQEYIERGDIQGIRLSTRPDYIQKDILEQLRKYGVKAIELGIQSLDEEVLRRSDRFYTE KQVLSSIQQMQSYGFEVGIQIMVGLPGSSLEKEIKTIKTLIACQPDTARIYPTLVLEDTI LEKQYHQGEYQALTLEEAVERSRILYAYLEQAGIRTIRLGLQATEELSKEKTMVAGPYHP AFRELVETEIAYVFLKDIFEKEGVQTIFCTETEVSRIVGLKKKNKERFGKDFQVKIQNNL SEGQVLIGNKVYSREERIRRNLDVGTSMDF >gi|224461463|gb|ACDD01000039.1| GENE 36 30639 - 31916 968 425 aa, chain + ## HITS:1 COG:slr1129 KEGG:ns NR:ns ## COG: slr1129 COG1530 # Protein_GI_number: 16329250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Ribonucleases G and E # Organism: Synechocystis # 61 351 83 389 674 129 28.0 1e-29 MWELLWTSDLFHHKVAIFRDNELWDLRIEEKDKIVRNGFYLAKKEKEHFLLLSSGEKVFC SEAFPNGQEKIVQVLQEEREEKLAQVSQKLEMTNPYFVFFPYGKGIFLSKKMEEEQERKR LREIFQKYEEKGSFLIRTEAKGMLERNLEQEIQQVLKEWQLVQERAFHLKKKGNLRSTVV WMEEILEEYGKQDWKTCYCENFELKETLKEKLTFYQKQVREYHGEISLWKQRKLEEQIRV LCQEKIDLASGGYLWIESTRACVTIDVNSGAGSPRRSNIEAAREIPRQVKLRNLAGNIVI DFINCKNVEEKKEIVSILKEGFLRDQHFIQWGNFHDFDLFLFSRQRKGKELSFYYSEESL FYQVQCLEEECNDLLEQKEKILLIEGEKNILQEWKKQTQCGIKDSIDFQYRKEEKESKKF HIEIK >gi|224461463|gb|ACDD01000039.1| GENE 37 31928 - 32425 290 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 157 4 156 164 116 41 3e-25 MRVGIYAGSFDPITKGHQDIIRRALKIVDKLIVLVVNNPSKKYWFNIEEREAMILESMES QYREKIEIHRYEGLLVDFMREKGVNLLIRGLRAVSDYEYEMGYAFTNKELSQGKAETIFI PASREYMYLSSSGVREIAINQGDISAYVDKALEEKIKLRAKELVK >gi|224461463|gb|ACDD01000039.1| GENE 38 32429 - 33811 1636 460 aa, chain + ## HITS:1 COG:FN0157 KEGG:ns NR:ns ## COG: FN0157 COG1066 # Protein_GI_number: 19703502 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Fusobacterium nucleatum # 1 458 1 450 452 609 74.0 1e-174 MAKTKSVYYCTECGYQSAKWLGRCPSCQEWGTFEEEVALPKELQKQFHSSSSSGNLGEKV KALSEVTMESSERYTTSMGEFDRVLGGGLLQGEVVLLTGNPGIGKSTLLLQVASCYTEYG DVIYISGEESPSQIKNRSERLGIDEKGLLLFTETDILSIYEYLLKKKPKVVMIDSIQTIY NSALDSISGTPTQIRECTLKIIELAKTYGISFFVVGHITKDGKVAGPKILEHMVDAVFNF EGEEGLYYRILRSTKNRFGSTNELAVFSMEEDGMKEIKNSSEYFLSEREEKNIGSMVVPI LEGTKVFLLELQSLLTDVSIGIPKRVVQGYDRNRLQILIAIAERKLYLPLGMKDVFINVP GGLNISDPAADLALLISMLSAYHSVEISQKIAAIGELGLRGEIRKVFFIEKRLRELEKLG FKGVYVPEANRKELEKKQYHLKIIYLKNLEELLERIKEGR >gi|224461463|gb|ACDD01000039.1| GENE 39 33808 - 34857 705 349 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764769|ref|ZP_02171823.1| ribosomal protein L18 [Bacillus selenitireducens MLS10] # 13 342 20 351 360 276 41 2e-73 
MIQYDLVEIFEKIAPGTPLREGIVNILDGRLGALLILGYDEEVEKVLDGGFFINCDYTPE RLFELAKMDGAIILDEKCEKILYANVHIQADAKYPTSESGTRHRTAHRASQQLKKLVVAV SERKSVVTVYQGIGKYRLQNLSVLMEEATQALKILERYRYVLDKALVNLTLLELDDLVTV FDVITMAQRFEMIARIENELVGYVRELGKEAHLISSQLKELTQDIELEHLEFMKDYLKEE SKIELVKKKIHQLTDQELLEAEVLADVFGYGKTYSVLDNKVSSRGYRILGKISKLTKKDI EKMVSTYGNIAEIQEAEDDDLLEIKLSKFKIRAMRTGIQRLKFTVELTR >gi|224461463|gb|ACDD01000039.1| GENE 40 34929 - 35327 592 132 aa, chain + ## HITS:1 COG:no KEGG:FN1852 NR:ns ## KEGG: FN1852 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 132 2 123 126 125 55.0 3e-28 MKKLILLFSLLFAATGYASTYKDGIYRGYYISGQETQIEVQFTLKNDVMTEAKYRTLQYK NHDWLKEENFVKMNKGYMGALNYMVGKKVDQAVLDKLYTPEGIEKAGATVRGGKLRHAVQ LALMAGPIKITK >gi|224461463|gb|ACDD01000039.1| GENE 41 35478 - 36785 522 435 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229254937|ref|ZP_04378866.1| SSU ribosomal protein S12P methylthiotransferase [Capnocytophaga ochracea DSM 7271] # 1 433 6 431 433 205 30 4e-52 MKKATVITYGCQMNVNESAKMKKIFENLGYEITEDIRESDAIFLNTCTVREGAATQIYGK LGELMQVKADRGSIIGVTGCFAQEQGKELLKKFPVIDIVMGNQNIGRLPQAIENIENQTE KHVVFTNHEDDLPPRLDADFGSDQTASIAISYGCNNFCTFCIVPYVRGRERSVPLEEIVR DVDQYVKKGAKEIMLLGQNVNSYGHDFKNGDTFAKLLTEICKVEGDFIVRFVSPHPRDFT DDVIEVIAKEDKIAKCLHLPLQSGSSQILKRMNRGYTKEQYLALAHKIQDKISGVALTTD IIVGFPGETEEDFLDTLEVVREINYDNAFMFMYSIRQGTRAATMKEQIPEDIKKERLQRL MDVQARCSYKESQKYQGKTVRVLVEGESKKNKEVLSGRTSTNKIVLFQGPISLKGSFVDV EIYECKTWTLYGKLV >gi|224461463|gb|ACDD01000039.1| GENE 42 36802 - 38043 1186 413 aa, chain + ## HITS:1 COG:FN0476 KEGG:ns NR:ns ## COG: FN0476 COG1158 # Protein_GI_number: 19703811 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Fusobacterium nucleatum # 1 405 1 405 413 525 65.0 1e-149 MEILESFLVNELYEVAKQLGVPCKKGLKKGEVKILLEKYFEENPNHTMASGYLEVLPDGY GFLRNTSVEKDIYISASQIRKFKLRTGDLVMGEVRKPTGEEKNFAVTKILRINNGNLAAA ESRIPFEDLVPAYPTEQFHLETGKESISSRVIDMVAPIGKGQRALIIAPPKAGKTMLISS IANSLIRNYPKTEVWILLIDERPEEVTDIKENVTGAEVYASTFDEDPRNHIKVTESILEK 
AKRKVEDGEDIVILMDSLTRLARAYNIIIPSSGKLISGGIDPTALYYPKNFFGTARNIRG GGSLTIIATVLVDTGSKMDDVIYEEFKSTGNCDIHLDRHLSELRIFPAIDIQKSGTRKEE LLIGKKKLDKVWKIRRMLSKMDRATAAQTLIRGMKKTENNEGLLSLFIKEGEH >gi|224461463|gb|ACDD01000039.1| GENE 43 38047 - 39195 1261 382 aa, chain + ## HITS:1 COG:FN0477 KEGG:ns NR:ns ## COG: FN0477 COG0739 # Protein_GI_number: 19703812 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Fusobacterium nucleatum # 54 382 7 320 321 276 49.0 4e-74 MKQRGSIVILSVIILFLFVRLQEESKKEIVNLEEFTDYYETSVADNGGFELIESFYNFER VYNFPNQYIEVAKKEEETKKEETKYPKKSTYIVRKGDTPSKIAARFGMSLNSFRANNPNM DKSLKVGTSVNVVSEDGVFYKLQKGDSVSRIAVKYKVKAADIVKYNNISPKKMRVGQELF LKSPDYKAFLEKEKPKLTKKEIDKKLKEKQEKEDQKIYAENKKTGKKSKPKQEQVNEGEN VETSSGDSGEVASTGGGGFSMPVRYAGVSSPYGSRFHPILKRYIFHSGVDLVAKYVPLRA AKSGVVTFAGNMSGYGKIIIIKHDNGYETRYAHLSQISTRVGERVERGELIGKTGNTGRT TGPHLHFEIRRSGKTLNPMKYL >gi|224461463|gb|ACDD01000039.1| GENE 44 39220 - 40281 1389 353 aa, chain + ## HITS:1 COG:FN0478 KEGG:ns NR:ns ## COG: FN0478 COG0821 # Protein_GI_number: 19703813 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Fusobacterium nucleatum # 1 352 1 352 354 480 73.0 1e-135 MGRKSREVQIRDLKLGKGNPVIIQSMTNTETSDVKATVRQILDLEEAGCELVRMTINTKE AAMAIPAIKERVHIPLVADIHFDYRLALLAMENGIDKLRINPGNIGSEDKIFLVVEKAKE KKIPIRIGVNSGSLEKHILEKYGTVTADAMVESAMYHVKLLEKYGFYDIVISLKASNVAM MVEAYRKIQTLVDYPLHLGVTEAGTAFQGSIKSSIGIGSLLVDDIGDTIRVSLTENPVEE IKVAKEILKVLGLRKEGVEIVSCPTCGRTEIDLISLAKTVEKEFAKEKRNIKIAVMGCVV NGPGEAREADYGIAGGKGIGILFQKGKIVKKVKEKDILKELKNMIEEDFKRKN >gi|224461463|gb|ACDD01000039.1| GENE 45 40338 - 40790 595 150 aa, chain + ## HITS:1 COG:FN0479 KEGG:ns NR:ns ## COG: FN0479 COG1595 # Protein_GI_number: 19703814 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Fusobacterium nucleatum # 1 149 1 149 149 179 73.0 1e-45 
MDFDEIFEQYFDKVYYKVLGIVKNSDDAEDISQEVFISVYKNLKKFKGESNIYTWIYRIA INKTYDFLKKNKTMLEINEEILSLEYNVDMNTNMILKEKLKKISMQEREFVILKDIYGYK LKEIAEMKDMNLSTVKSIYYKAIRDMGGNE >gi|224461463|gb|ACDD01000039.1| GENE 46 40787 - 41110 420 107 aa, chain + ## HITS:1 COG:no KEGG:FN0480 NR:ns ## KEGG: FN0480 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 4 107 2 101 101 82 47.0 4e-15 MMTSPKERVRANIYKELLEQEKRRNKKLSVVSVSVFLLGVFATSGYNALYRTSTVGQASS YVMGAEKQVKEFEKDSFMLDSIYNTGVLHEKTVTLNPDELFGLDTQI >gi|224461463|gb|ACDD01000039.1| GENE 47 41124 - 41525 478 133 aa, chain + ## HITS:1 COG:no KEGG:FN0481 NR:ns ## KEGG: FN0481 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 131 7 132 132 101 42.0 8e-21 MKKLLVYIVFVLSSFVMFGESEFGIIQDGELRRVGVSEANLRQAKAVINKAETTYKMLVL ERREIELKINKLMMENPAKNLSTLDTLFDRIGVIEAKILKDKVRSQIEMQKYISQDQYVQ ARELSIQRLNKRK >gi|224461463|gb|ACDD01000039.1| GENE 48 41623 - 41865 381 80 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739934|ref|ZP_04570415.1| LSU ribosomal protein L31P [Fusobacterium sp. 
2_1_31] # 1 80 1 80 81 151 87 9e-36 MKKGLHPEYNVVVFEDMAGNQFLTRSTKMPKETTMYEGQEYPVIKVAVSSASHPFYTGEM RFVDTAGRVDKFNKRYNLGK >gi|224461463|gb|ACDD01000039.1| GENE 49 41921 - 42544 895 207 aa, chain + ## HITS:1 COG:FN0483 KEGG:ns NR:ns ## COG: FN0483 COG0035 # Protein_GI_number: 19703818 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 207 8 214 214 351 84.0 6e-97 MAVIEVNHPLIQHKLTILRNKDTDTKSFRENLSEIAKLMTYEATKNLKVMEEEVETPLMK TTGYTLEEKVAIVPILRAGLGMVEGIQSLIPTAKVGHIGVYRNEETLEPVYYYCKLPTDI EKRRVILVDPMLATGGSAVYAIDYLKSQNVKDIVFMCLVAAPIGIEKLLNKHPDVAIYTA KIDQGLTENGYIYPGLGDCGDRIFGTK >gi|224461463|gb|ACDD01000039.1| GENE 50 42676 - 42870 440 64 aa, chain + ## HITS:1 COG:FN0244 KEGG:ns NR:ns ## COG: FN0244 COG2608 # Protein_GI_number: 19703589 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 10 64 1 55 56 59 60.0 2e-09 MRKVLKIDGMGCEHCVKSVKEALSTLEGLSLLEVKIGEATVEMAEDYDMKKIQEALDDAG YDLL >gi|224461463|gb|ACDD01000039.1| GENE 51 42867 - 43124 318 85 aa, chain + ## HITS:1 COG:FN0245 KEGG:ns NR:ns ## COG: FN0245 COG2217 # Protein_GI_number: 19703590 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 6 84 24 103 769 68 47.0 3e-12 MKKEILEISGITCQACVAKIERKVSRMDGVEQVNVNLSTGIGTFSYDSGKVKLEEIIAMI EKLGYEGKVPQKEDKEEKRKKKKKD >gi|224461463|gb|ACDD01000039.1| GENE 52 43187 - 45091 2651 634 aa, chain + ## HITS:1 COG:FN0245 KEGG:ns NR:ns ## COG: FN0245 COG2217 # Protein_GI_number: 19703590 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 632 126 769 769 656 55.0 0 MGSMMGLPLPRVISMEENPILFALMQLCFSIPVLYLGRHFYQKGLKQLFLRAPNMDSLIA VGTGAAFLYSLYGFYRITQGEIHYVHHLYFESSVMILAFISLGKYLEERSKGKTSEAIQK LMDMQVVVAHKIVGENILSVPLEEVELQDILLVKAGEKIPLDGIILEGESTINESMLTGE 
SIPVSKKVGDTVYGATINGEANLKIKVEAVGEDTVIAKIIHLVEDAQGTKAPIAKLADEI SLYFVPVVMMIAIVAALFWYFVMGKDFLFSITIFVSVMVIACPCSLGLATPTAIMVGTGR GAELGVLIKSGEALQKAQEMTAIVFDKTGTLTEGKPELEKILSYESGEWLRIAASLEQYS EHPLGRAVIEAAKREGLSFFEIENLEILVGRGISGKKDGKSYFLGSPKGVLEFGGSLENT GEVVSYEEEGKTVLYLVEEEKTVASFIVADQMKEESKQVLEILKNKGFSLAMITGDKKET AESIAKKIGMDTVFAEVSPEDKYLKVKELQEQGKKVIMVGDGINDSPALMQADLGIAMGG GTDIAMESADIVLMKKNLFGILDALDLSEATMKNIKQNLFWAFLYNSLGLPLAAGVLYPF TGHLLNPMIAGFAMAMSSVSVVTNALRLRYFKRG >gi|224461463|gb|ACDD01000039.1| GENE 53 45093 - 45494 497 133 aa, chain + ## HITS:1 COG:BS_cdd KEGG:ns NR:ns ## COG: BS_cdd COG0295 # Protein_GI_number: 16079584 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus subtilis # 7 128 4 125 136 129 46.0 2e-30 MEEKEIRALIQKAMEVRKNAYAPYSKFLVGAVLIDEEGREYRGVNVENTSYGLSSCAERN AIFSGVAKGMKKIAVLCVVGDTEDPIRPCGACRQVILEFANEDTKIILSNLHGKYEVFSI EDLLPNSFFVKIY >gi|224461463|gb|ACDD01000039.1| GENE 54 45594 - 46970 1653 458 aa, chain + ## HITS:1 COG:FN1858 KEGG:ns NR:ns ## COG: FN1858 COG2031 # Protein_GI_number: 19705163 # Func_class: I Lipid transport and metabolism # Function: Short chain fatty acids transporter # Organism: Fusobacterium nucleatum # 1 458 1 458 458 671 75.0 0 MEQKKEKKGIFKKFTSASVSLMQRWLPDPFIFCAILTFFVFVASLLFTKASVFDVIGYWS GGFWSLLAFSMQMALVLVTGHTMASSPVFKKLLENMASKLKTPRQAIIVVTVVSTIACIL NWGFGLVIGAIFAKEIAKKLKGVDYRLLIASAYTGFLVWHGGLSGSIPLQLASGGEALKQ QTLGVISEAIPTSQTLFSPMNLYIVIGLLILLPIINVAMYPSHDEVVTVDPALLKEVEPV VIDSKKMTPAEKIENGRLVSYALGLMGYVYIIKYLMENGFALNLNIVNFIFLFTGIIFHG TPRRYLDALAEAIKGAAGILLQFPFYAGIMGIMVGADVDGNSLAGLMSNFFVNISTPRTF PVFTFLSAGIVNFFVPSGGGQWVVQAPIVMPAGQMIGVTAAKSAVAIAWGDAWTNMVQPF WALPALGIAGLGAKDIMGYCLIVTIISGLFICSGFLLF >gi|224461463|gb|ACDD01000039.1| GENE 55 46998 - 47648 1036 216 aa, chain + ## HITS:1 COG:FN1857 KEGG:ns NR:ns ## COG: FN1857 COG1788 # Protein_GI_number: 19705162 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: 
Fusobacterium nucleatum # 2 216 3 217 217 322 81.0 2e-88 MKKIVSMEEAISHIKDGMTVHIGGFLAVGTPENIITALIEKGVKDLTIVANDTGYPDRGI GRLVLNNQVKKVIASHIGTNPETGRRMQSGEMEVELVPQGTLAERVRAAGCGLGGVLTPT GLGTIVAEGKDIVTVDGKDYLLEKPIKADVALLLGTTVDKAGNVIFAKTTKNFNPLMGTA ADLVIVEAEKIVEVGEIDPDHVMLSKIFVDYIVEGK >gi|224461463|gb|ACDD01000039.1| GENE 56 47661 - 48326 1079 221 aa, chain + ## HITS:1 COG:FN1856 KEGG:ns NR:ns ## COG: FN1856 COG2057 # Protein_GI_number: 19705161 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Fusobacterium nucleatum # 1 218 1 217 217 336 79.0 2e-92 MELDKKLVREYIAARVAKEFHDGYVVNLGIGLPTLVANFVPEGMEVIFQSENGCIGVGPA PAPGQEDPHAINAGGGFITALPGAQYFDSATSFGIIRGGHVDATVLGALEVDKEGNLANW MVPGKMVPGMGGAMDLVVGAKHVIVAMEHTSRGAIKILDKCTLPLTAVKVVDMIITEKCV FKITDKGLVLTEISPYSSLEDIKATTAAEFTVAEDCKQLSL >gi|224461463|gb|ACDD01000039.1| GENE 57 48496 - 49797 1835 433 aa, chain + ## HITS:1 COG:CAC3014 KEGG:ns NR:ns ## COG: CAC3014 COG0422 # Protein_GI_number: 15896266 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Clostridium acetobutylicum # 3 433 2 432 436 567 65.0 1e-161 MREYFTQMEAAKKGIVTPEMKIVAEKEKMDVEKLRDLVAKGQVCIPCNINHKNISPEGIG TGLKTKVNVNLGISGDKRDYEEEFKKVDLAIQYGCEAIMDLSNYGKTNTFRKKLIEKSPA MIGTVPMYDAIGYLEKDLQDMEVKDFLEVIEAHAKEGVDFMTIHAGLTRRAVEFLKKQER LTNIVSRGGSLLFAWMETKKQENPLYEYYDQVLDILRKYDVTISLGDGLRPGSNHDSTDA GQLAELIELGYLTKRAWEKDVQVMVEGPGHMAINEIAANMQIQKRLCYGAPFYVLGPLVT DIAPGYDHITSAIGGAIAASSGADFLCYVTPAEHLRLPDVEDVKEGIIATKIAAHAADIA KGIPGARDWDNKMSDARRRLAWEEMFGLAIDEEKARRYFNSRPVEVKDSCSMCGKMCAMR TVNRILEGKDINI >gi|224461463|gb|ACDD01000039.1| GENE 58 49850 - 51184 973 444 aa, chain - ## HITS:1 COG:FN1940 KEGG:ns NR:ns ## COG: FN1940 COG0534 # Protein_GI_number: 19705245 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 2 443 3 444 456 562 69.0 1e-160 
MIHWNMIREILSLALPAVGEMTLYMMIWILDTMMVGQYGGKLAVSSVGLSTEIIYSFFNI LIAMGMSSSLTSLISRALGAKDFKKAERIANAGFKISFGLAILFFLVLFFVPKQILTLAG ATKDMLPSAVIYAKISAFSFFLLTFSSTNNGIFRGAKDTKTSLYIAALINIVNLSLDYVL IFGKFGFPELGVKGAAIATVAGNGTGLLLQWFRLKKLPFHLHLFSSSKKEDFKEVILLAV PSALQEANFSLSKLLGITFVMSLGTIAFAANQIGIAIEAVSFMPGWGIAIANTALVGHSI GEKNEKKAHDYTFYSTVIASIFMGIIALIFFFFPEELIHLFIQKEEIEVISAGALCLQVG AMEQIPIAFAMVIESYFKGTGDAKTPFYVSFIMNWCIRVPLAFYYISIQKYPIHIFWLIT TIQWTLEGILIYYLYHRKGKIILH >gi|224461463|gb|ACDD01000039.1| GENE 59 51193 - 52977 524 594 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 186 573 24 407 425 206 35 2e-52 MIYGNTEGMKEFTLQQLEQLYEIKLNKGQLISEEIAIFLANISTKINKEINLCIDRNGNI TEISIGDSSTVSLPFIPVYEKKLSGKRIVHTHPNGNPKLSSVDISALLKLKLDAILAIGC IEEKVTGIGLALCNLEEDVIHYEEHLYSSFEELENFPFLEKLQSIETALRRKNIVEDDKE YAVLVGIDSKTSLQELEELAYACNIEVVGHFFQNRSKADKVLFLGPGKARELSLFQQIKR ANLIIADEELSGLQVKNLEEVTGCKVIDRTTLILEIFARRARSREAKIQVELAQLKYRSN RLIGYGVTMSRLGGGVGSKGPGEKKLEIDRRRIRENISFLKKELENIKKTRSVQREKREN SNIPKIALVGYTNVGKSTLRNLLAAEYNPNSNTKEDVFAENMLFATLDTTTRTILLDDKR LISLTDTVGFIRKLPHDLIEAFKSTLEEVIFSDLILHVVDSSSEEALSQMEAVYQVLEEL QCQNKKNILVLNKCDLARPEQILAIREKYSHITAVEISAKEHKNIEFLLEEIKKELPQNT KTCTYLIPYSDSSMVAYLHKTSTIQEEKYEAEGTFIKAIVNQETENRCKQFEIE >gi|224461463|gb|ACDD01000039.1| GENE 60 53179 - 54360 1789 393 aa, chain + ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 6 390 5 387 412 316 50.0 5e-86 MKKIGLLPRLIIGLVVGILLGMSGIEIIIRLLGTFNSIFGNFLGFVIPLIIVGFVAAGIA DLGKDAGKLLGVTVAIAYVSTVISGTFAYFVDTTIFQQLHLEDAAEMIKAAEANARSLSP LFTVDMPPIMGVMTALLIAFTLGIGAAVINSEVLKKGMQEFQAIVEKVISNIVIPFLPLH ICGIFANMTYEGKTAAIMSVFVKVFIIIIILHAIIILFQYTIAGTIAGGNPIKLIKNMIP AYLTAIGTQSSAATIPVTLRQTKKNGVSDGVADFAIPLCATIHLSGSTITLTSCSLALMI IYGMPHGFATMFGFILMLGITMVAAPGVPGGAVMAALGLLGSMLGFNEELLSLMIALYLT 
QDSFGTACNVTGDGAIAVLVNKFAGNKLETKED >gi|224461463|gb|ACDD01000039.1| GENE 61 54480 - 54809 475 109 aa, chain + ## HITS:1 COG:no KEGG:FN0737 NR:ns ## KEGG: FN0737 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 109 1 109 109 155 68.0 6e-37 MPHLKVRGLEKKVLIEKSKEIIDGLTEIIQCDRTWFTIEHIDTEYIFDGKIQEGYSFIEL YWFERGEEIKKRVAAFLTEKMKEMNGNKDACIIFFPLLGENYCDNGVFF >gi|224461463|gb|ACDD01000039.1| GENE 62 54822 - 55679 314 285 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 45 284 35 280 285 125 34 5e-28 MIKVGKRQTMLVDHFASVGAYLVPVLVEEEEEKIEILLPNNELEERELQEGEEVEVLIYR DSEDRLIATFRKTEALVGTLAKLEVVDTNPRLGAFLDWGLTKDLLLPVSQQEVRAEIGKR YLVGIYEDSKGRLSATMKIYNFLLPNHDFSKNDTVKGTVYRVNDEIGVFVAVEDRYFGLI PKSECFQAFEVGEELDLRIIRVREDGKLDVSPRVILSEQISKDAEVILQKMRILKDHFRF NDDSSPEDIKDYFSMSKKAFKRAIGQLLKQGLIDKKEDGYFSLKK Prediction of potential genes in microbial genomes Time: Fri May 20 02:01:32 2011 Seq name: gi|224461462|gb|ACDD01000040.1| Fusobacterium sp. 3_1_5R cont1.40, whole genome shotgun sequence Length of sequence - 1873 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 10 - 69 10.3 1 1 Tu 1 . 
+ CDS 258 - 1872 2246 ## Smon_1341 YadA domain protein Predicted protein(s) >gi|224461462|gb|ACDD01000040.1| GENE 1 258 - 1872 2246 538 aa, chain + ## HITS:1 COG:no KEGG:Smon_1341 NR:ns ## KEGG: Smon_1341 # Name: not_defined # Def: YadA domain protein # Organism: S.moniliformis # Pathway: not_defined # 139 513 76 449 563 91 28.0 8e-17 MLEEKSVKHWLKRKVKVTEALLVAFLITGGIASAEETVSKAEFDALSKKVEALEKNGTNY VTILSTSEENKVAPSEGSDVVNIGTNQIGEKNKKKYDVTVVGTKNKIKKFRIIPDDVKPD SIEEFYACGETVIGNGNIVANGVSIGRDNIAIGSRGVVLGEDSKNYGDRSINIGYLTETY GTDSIAMGFSSKALDTSTLALGGYTQAKTVGAIAIGDRAIAEKGVALGYESLADRSGGEV GYYPEFGTTEREAIAKKLGKEAEYKETMKVFETSKELIEKEREYYSVIAKREEYQKQLNE QTVDLIKYNKDAQEYKDAFAKILEIKTQYKENEKKELELGKIVSTANPEYKKLEDARSTY GNIFGAYSSTMSAVSVGNEKKGIYRQITGVAAGSKDTDAVNVAQLKSLNTKVDKGASHYY SANDMNHHVDNFNNDGAKGLFSTVAGPGSTIKQTNNQMFQGATSSILGSFNTIDAGDKEF DGVANSIVGVANTTKDANGSLIFGAGNKITNSYRGVDFTKLNGNGLNLSNSQSVAEAL Prediction of potential genes in microbial genomes Time: Fri May 20 02:01:40 2011 Seq name: gi|224461461|gb|ACDD01000041.1| Fusobacterium sp. 3_1_5R cont1.41, whole genome shotgun sequence Length of sequence - 1455 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1455 1832 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461461|gb|ACDD01000041.1| GENE 1 3 - 1455 1832 484 aa, chain + ## HITS:1 COG:ECs4480 KEGG:ns NR:ns ## COG: ECs4480 COG5295 # Protein_GI_number: 15833734 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Escherichia coli O157:H7 # 81 349 67 346 1588 87 32.0 4e-17 STDKSAIWKATHSAVSIGDFKKGYTRQITGVAAGSEDTDAVNVAQLKKVKDSINTAVENS KIHYYSVNSNKVGDDSNYKNNGATGDDAIAIGIGVKAKGQHAIAMGNNVESSGYASIAIG KDSEATKQGAIAIGMGAKVYAGGGVAIGTNVQAGDSPSDGDWSPVALGYGTKSLGGASTA FGYESVARGAHSIAGGDRSKATGQDSVALGQEVEASGTWSVALGQKTVASGSNSMSMGDN TKASGSNSTAMGIKTEAGGAGSTAMGYGTKAIGNWSLATGAYSKSEGKFSTAMGLSSVAK GHNSFAVSGANVEKDASNAIAMGYNATAKLTDSVALGSGSVASTKQGVAGYNPITDKNYE RTPEAAAAYQKWIDAYNAWEAIDEAEKDKKAEKLKECNAMKTEYNKLVSTWESTKSVIAV GDKEKGISRQITGVAAGTEDTDAVNVAQLKALNTKVDKGASHYYSVNDIDDHVDNYKNDG AKGV Prediction of potential genes in microbial genomes Time: Fri May 20 02:01:47 2011 Seq name: gi|224461460|gb|ACDD01000042.1| Fusobacterium sp. 3_1_5R cont1.42, whole genome shotgun sequence Length of sequence - 23125 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 5, operones - 3 average op.length - 6.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 782 1029 ## COG5295 Autotransporter adhesin + Term 796 - 834 8.5 + Prom 839 - 898 11.4 2 2 Tu 1 . + CDS 921 - 2573 2018 ## FN1654 hypothetical protein + Term 2807 - 2846 1.1 - Term 2552 - 2588 4.1 3 3 Op 1 . 
- CDS 2595 - 3233 546 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 4 3 Op 2 1/0.000 - CDS 3230 - 4831 1166 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP 5 3 Op 3 1/0.000 - CDS 4848 - 5522 920 ## COG1846 Transcriptional regulators 6 3 Op 4 10/0.000 - CDS 5535 - 6173 1037 ## COG0036 Pentose-5-phosphate-3-epimerase 7 3 Op 5 7/0.000 - CDS 6163 - 6969 1052 ## COG1162 Predicted GTPases - Prom 6996 - 7055 4.3 8 3 Op 6 . - CDS 7057 - 7764 751 ## COG2815 Uncharacterized protein conserved in bacteria - Prom 7800 - 7859 7.0 9 4 Op 1 1/0.000 + CDS 7915 - 8451 767 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 10 4 Op 2 . + CDS 8460 - 9275 852 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 11 4 Op 3 . + CDS 9256 - 9501 268 ## FN0286 hypothetical protein 12 4 Op 4 12/0.000 + CDS 9511 - 9750 399 ## COG1837 Predicted RNA-binding protein (contains KH domain) 13 4 Op 5 30/0.000 + CDS 9761 - 10279 575 ## COG0806 RimM protein, required for 16S rRNA processing 14 4 Op 6 2/0.000 + CDS 10281 - 10991 708 ## COG0336 tRNA-(guanine-N1)-methyltransferase 15 4 Op 7 1/0.000 + CDS 10988 - 11554 734 ## COG4752 Uncharacterized protein conserved in bacteria 16 4 Op 8 . + CDS 11554 - 15906 4590 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) + Term 15917 - 15960 -0.0 + Prom 15953 - 16012 3.4 17 5 Op 1 . + CDS 16051 - 17217 1194 ## FN0280 hypothetical protein 18 5 Op 2 1/0.000 + CDS 17207 - 17941 989 ## COG2853 Surface lipoprotein 19 5 Op 3 1/0.000 + CDS 17942 - 19300 1879 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 20 5 Op 4 . + CDS 19320 - 20195 1063 ## COG4866 Uncharacterized conserved protein 21 5 Op 5 . + CDS 20212 - 21414 1774 ## COG1171 Threonine dehydratase 22 5 Op 6 . 
+ CDS 21435 - 23069 1625 ## COG1283 Na+/phosphate symporter Predicted protein(s) >gi|224461460|gb|ACDD01000042.1| GENE 1 3 - 782 1029 259 aa, chain + ## HITS:1 COG:FN0735 KEGG:ns NR:ns ## COG: FN0735 COG5295 # Protein_GI_number: 19704070 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Fusobacterium nucleatum # 19 255 375 585 617 123 39.0 3e-28 PGKDGKDGKNGEGAKVLAGNNIKVDSKEKKQGEDKVIENTISLEENIKVKTVSTDSINVG DTVKISKEGINAGKQKITNVADGKADSDAANMKQLRKVEANAKKDAEKLGNAINHNAQKI HDLKKEVGNVGALSSAMAALNPMEYDPMKPNQVLAGVGSYKNSQAVAVGMSHHFNENLRV QAGVSVSEGRKTESMVNLGLAWKIGKDDRDDSYNKYKEGPISSIYVMQDEMKQVMEENKN LKSEVEEMKKQLQTLIKQK >gi|224461460|gb|ACDD01000042.1| GENE 2 921 - 2573 2018 550 aa, chain + ## HITS:1 COG:no KEGG:FN1654 NR:ns ## KEGG: FN1654 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 5 542 30 566 571 542 51.0 1e-152 MQKIKGLPNGISDFKVLREKNYYYIDKTSFIEELQQEIGKTILFTRPRRFGKTLNMSMLQ YFWDIQHAEENRTLFQGLYIESSPYFSEQGKYPVIFLSLKDLKERTWEGCQKAMKKLLSD LYDKHQFLREFLNPRDLKYFDHIWMEEKEANYSGVLKDLAKYLFQYYQKKVIILIDEYDT PMVSSYEHGYYEEAIAFFRNFYSAALKDNEYLQTGLMTGILRVAKEGIFSGLNNLVVYSI LDEKYSSYFGLTEEEVEEALQYYEMEYKLQDVKEWYDGYRFGNTEIYNPWSILNYISNKK LDAYWIHTSNNFLVYDLLEKANINIFDDLQKVFQGKEIQKTIEYSFPFQDMTNPQEIWQL LVHSGYLKTEKSLDNHRYALKIPNQEIQSFFEKSFLNRFLGGVDMFGEMITALKKGKIEI FEKKLQDILLTKVSYHDVGQEEKYYHNLVLGMILSMSKEYEIHSNLESGYGRYDISLEPK DKMKSGFILELKVAKSEEELEKKAKEALQQIEDKKYDIEMKERGIQEIIPLGISFYGKKI QVLKSVKKAV >gi|224461460|gb|ACDD01000042.1| GENE 3 2595 - 3233 546 212 aa, chain - ## HITS:1 COG:FN1075 KEGG:ns NR:ns ## COG: FN1075 COG0596 # Protein_GI_number: 19704410 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Fusobacterium nucleatum # 1 211 1 214 215 226 54.0 2e-59 MNYRIILIHDFGKSYRDMEKLEEHLFSMGYVVENLNFPLTFADLQSSKDILLQRIHNLKE 
SGLTERDEIVLIGFGFGGILIRECLHNKEFLQNVDTLLFISSPWNNSTLHRRIKRVFPFI NLFLKPLRAFSKEPIQLPRKLKVGLIIGTEYYNLFGHFLGEYNDGYVTKKDCFIPGAQDV IYLPICHREIHKKIGTAKYISNFISKGKFRVN >gi|224461460|gb|ACDD01000042.1| GENE 4 3230 - 4831 1166 533 aa, chain - ## HITS:1 COG:FN0682 KEGG:ns NR:ns ## COG: FN0682 COG1293 # Protein_GI_number: 19704017 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Fusobacterium nucleatum # 1 533 1 538 541 358 41.0 1e-98 MLYLDGISLSFLQKDIEEKLNKRKINRIFQNTDTSLSLHFGKQVLVLSCNPQLPICYVTE DKETVLEESVSSFLNTLRKHLMNSFLYQVEQVGWDRTLIFRFSKLTELGDYKQYFLIFEL MGRNSNLFLCDQDYKILDLLKRFSLDEVQTRNLFPGAHYEALPSTKISPNEITGTTEKPY FQTVEGVGKLLSESLQNPEDLKLLVTEAPKINLYRKNGNIVLLNFLGLVPKDYDEVLSFS DLQEAILFYFQEERIFGTLVKLRNQLEAQLNKRKKKIEQILKKIDLDEKKNENFESWKEK GDILASCLFQLKKGQENCEAFDFYHNEMIKIPLDIRKTPKENLENYYKKYNKAKTTLVYA QKRKKEMEEELSYLESLFVFLSSANDIEVLKGIEEECIQAGYSKAKPKKAYKKKKKTEKK YAVLEYPNYSLFYGRNHTENDFVSFQIADKEEYWFHAKNIPGSHVILRSFIPIEEEMIQK ACQVAAFYSQANLGDKVLVDYTQKKYLKKPKDSKPGFVTYTHEKGIWVVKEKL >gi|224461460|gb|ACDD01000042.1| GENE 5 4848 - 5522 920 224 aa, chain - ## HITS:1 COG:FN0681 KEGG:ns NR:ns ## COG: FN0681 COG1846 # Protein_GI_number: 19704016 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 224 1 225 225 294 71.0 1e-79 MTVNFIKVNDLLEEFYKLFYKTEDMALKRGIKCLTHTELHIIESVGHESLTMNELAERLG ITMGTATVAASKLSEKGFLNRERSQNDRRKVFVSLTDKGIKALAYHNSYHKMIMSSITEN IKGKDLDHFITVFEDILEALRNKTDYFKPLPICDFEHGTKVSVVEIKGTPIVQNYFASEG IENFTVVITKKSSDKGTIILKKQDGTELKLDILDAKNLIGIKAD >gi|224461460|gb|ACDD01000042.1| GENE 6 5535 - 6173 1037 212 aa, chain - ## HITS:1 COG:FN0680 KEGG:ns NR:ns ## COG: FN0680 COG0036 # Protein_GI_number: 19704015 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Fusobacterium nucleatum # 3 211 5 213 215 335 82.0 3e-92 MEIKIAPSILSSDFSRLGEEIVAIDQAKADYIHIDVMDGIFVPNLTFGPPIIKSIRKYTN 
LIFDVHLMIDKPERYIEDYVKAGADIITVHAESTIHLHRVIQQIKSFGVKAAVSLNPSTS EEVLKYVIQDLDMVLVMSVNPGFGGQKFIPAVVDKIKAIRAMREDIEIEVDGGITDATIQ SCIEAGATTFVAGSYVFSGNYAERIANLKNKK >gi|224461460|gb|ACDD01000042.1| GENE 7 6163 - 6969 1052 268 aa, chain - ## HITS:1 COG:FN0679 KEGG:ns NR:ns ## COG: FN0679 COG1162 # Protein_GI_number: 19704014 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 265 22 285 285 304 59.0 9e-83 MRGILKKKENKHNCVVGDYVEISEENSIIEIYPRKNQLTRPVVANIDYLAIQFAAKNPIL DFFRLHMLLLHSMYEKVCPCIIINKIDLLTEIELQDLQQQFHFLKDLSIPIFFISQKEQI GIEVLKEFFQNKITAIGGPSGVGKSSLINLLQEAKELETGEISKKLQRGKHTTRDTRLLA LPQGGYIIDTPGFSSLELPLIENFEQLMKLFPEFEVGKPCKFGDCHHIHEPSCAVRKAVE DGKISQERYQFFTNIYHKLKTERWNYGN >gi|224461460|gb|ACDD01000042.1| GENE 8 7057 - 7764 751 235 aa, chain - ## HITS:1 COG:FN0678 KEGG:ns NR:ns ## COG: FN0678 COG2815 # Protein_GI_number: 19704013 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 13 162 47 196 200 121 38.0 1e-27 MKFKKNLLLYLGLLVLLFFSYKVFTRYYFHDFLHEVPNVVGLSERQAKKILSKNDLEIKV MGDQYSELPEGQIMLQNPKEHSVVKSGRRIQVWISRGQNLLQIPSLVGTNLLTAQSLVQQ QGLIVDKITYIPKDLPYNEILATDPDLSQAIAKGSKISFLVSGSASSTDLNLKVPDIIGY PLEDAKFILESEQLLLGKIIRKASENTESGIVIGTSIPAGRSVDLSTKIDLIVSE >gi|224461460|gb|ACDD01000042.1| GENE 9 7915 - 8451 767 178 aa, chain + ## HITS:1 COG:FN0288 KEGG:ns NR:ns ## COG: FN0288 COG0634 # Protein_GI_number: 19703633 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 176 1 175 175 216 67.0 1e-56 MNYRIETMINRERVEERIRELAKEIERDYKDRKEEVIFLGLLKGSVMFLSDLIKETNLDL KIDFMSVSSYGSGTTTSGVVKILKDTDFDMKGKNLLIVEDIIDTGLTLKYVKEFLYAKGA AEIKICTLLDKPERRKVELKGDYVGFTIPDAFVVGYGLDYDQKYRNLPYVGTVVFEEN >gi|224461460|gb|ACDD01000042.1| GENE 10 8460 - 9275 852 271 aa, chain + ## HITS:1 COG:FN0287 KEGG:ns NR:ns ## COG: FN0287 COG0030 # Protein_GI_number: 19703632 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Fusobacterium nucleatum # 1 263 1 263 264 300 60.0 3e-81 MAFQHKKKYGQNFLTRQTEILAKIMEVSEVNSEDCILEIGPGEGALTELLLQEAKSVLNI EIDEDLKPILQKKFGNIEKYRLVMGDVLEVNFAEYMQEGTKVVANIPYYITSPIIQKIIE NRSLIQAAFLMVQKEVGERICAKKGKERSALTLSVEYFAKPEYLFTIPKEYFTPIPKVDS AFIGIRMKKEEEIAKQAPETLFFKYVKAGFFNKRKNLANNFLALGFTKAEIKEKLAILGI SETERAENLSLEDWFSVIKALEGSSVGKEKL >gi|224461460|gb|ACDD01000042.1| GENE 11 9256 - 9501 268 81 aa, chain + ## HITS:1 COG:no KEGG:FN0286 NR:ns ## KEGG: FN0286 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 81 2 80 80 101 64.0 6e-21 MEKKSYEFLLRSKIEDIDFINKIMEAYEGAGVVRTLDAKTGLVSIVLTEDFKDFVREILE DLRNRWVSFELLSEGPWSGRL >gi|224461460|gb|ACDD01000042.1| GENE 12 9511 - 9750 399 79 aa, chain + ## HITS:1 COG:FN0285 KEGG:ns NR:ns ## COG: FN0285 COG1837 # Protein_GI_number: 19703630 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Fusobacterium nucleatum # 1 79 1 79 79 111 82.0 4e-25 MERLEYLMNYIIKELVQEKEEVRVSYEVIDSTVTFQIRVAKGEMGKIIGKNGLTANAIRG VMQAAGVKDKLNVNVEFLD >gi|224461460|gb|ACDD01000042.1| GENE 13 9761 - 10279 575 172 aa, chain + ## HITS:1 COG:FN0284 KEGG:ns NR:ns ## COG: FN0284 COG0806 # Protein_GI_number: 19703629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Fusobacterium nucleatum # 1 172 1 172 173 155 47.0 4e-38 MKLLTAGRILGTHHLLGAVKVVSGLEELPKLLGSKCMTKLETGENILLTPTKIEHLVGDS WVFQFEEIKNKAEALKLRNALIEVRRDLLGYTEEDIFLSDYIGLLAKEVESKEEIGRVEE IFETAAHPILVIQSEHYETMVPDTPTFVKEVNFETGEIYIELLEGMKEEKRK >gi|224461460|gb|ACDD01000042.1| GENE 14 10281 - 10991 708 236 aa, chain + ## HITS:1 COG:FN0283 KEGG:ns NR:ns ## COG: FN0283 COG0336 # Protein_GI_number: 19703628 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
tRNA-(guanine-N1)-methyltransferase # Organism: Fusobacterium nucleatum # 1 236 1 236 238 329 68.0 2e-90 MKITVLTLFPDFFSAFQSESIIGRAIEMGKVEIVIRDIRDYCYDKHKQADDMPFGGGAGM VMKPEPLFRALADCSGKVIYTSPQGEKFSQKMALDLSEERELVIIAGHYEGIDERVIEEK VDMEISIGDYVLTGGELPAMVMMDSIIRLLPGVIRRESYENDSFFQGLLDYPQYTRPADY EGCKVPEVLLSGHHKKIEEWRFYQSVKRTLERRPDLLQGRVWTKQEKKILGELIKK >gi|224461460|gb|ACDD01000042.1| GENE 15 10988 - 11554 734 188 aa, chain + ## HITS:1 COG:FN0282 KEGG:ns NR:ns ## COG: FN0282 COG4752 # Protein_GI_number: 19703627 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 188 1 187 187 305 77.0 3e-83 MRKQIYLALVHYPVYNKRRDVVCTSVTNFDIHDISRTCSTYDIKGYRLVVPVDAQKKLTE RILGYWQEGFGGNYNKDREEAFVRTRVAESIEEVIAEIEKIEGKKPKIVTTSARHFPNTV SFGNLQEKLFETENQPYLLLFGTGWGLTDEVMAMSDYILEPIRANSEYNHLSVRAAVAII VDRLLGEN >gi|224461460|gb|ACDD01000042.1| GENE 16 11554 - 15906 4590 1450 aa, chain + ## HITS:1 COG:FN0281 KEGG:ns NR:ns ## COG: FN0281 COG2176 # Protein_GI_number: 19703626 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Fusobacterium nucleatum # 1 1450 6 1454 1454 1766 61.0 0 MSSKEIRMKPGRELLQRLGIQFMQLEEIRYSERRNVLRVFCVLPTYLAISELERLHQDLQ ITFGNNVKIEFSSKLLDENIPKEELKNIVDLAIQRLRKTEPRFKSFLCNYRIFIEGNDIY LEVNTDCGIEIIEDGHGSQKLEAVLYEYGLKCYSIHIERGDFTAENLHRERERKEEIRQI EKKAIQEQNEIAAKAAAKVPTIPEKTDFPKRGNGGFSRNKTREIKGSPIPMKDFAEVMEE DTCIVEGEIFSLEDRELSTGNILKTLWITDESNSLTAKLFLKKDEVLEIAKNDYVRIEGK VQIDTYAQNEKIIMIQAINRLERKKTKKEDLAEEKMVELHTHTKMSEMVGVTEVGDIIKR AKQYGHSAVAITDYGVVHSFPGAHKAAKEAGIKAILGCEAYMIDDTLPIVHNLKEDQDLE KASFVVYDLETLGFNSHEGKIIEIGAVKIVEKRIVDRFSQLVNPGQSIPQNIVDVTNITD SMVQNEPNIEEVLPKFLDFIEGSILVAHNADFDIGYLKQQCKQQGYSDFNPSFIDTLQMA KDLYPELKQFGLGPLNKKLGLSLENHHRAVDDCQATGNMFLIFLDKYLDQGIHKLSEMQG AFPVNTKKQNTRNVMLLVKDRVGLENLNRLVSDAHLYHFGNRKPRVLKSNLEKYREGLIV GCSLTGHSINDSDLFHDYSTGNVERIPKKISFYDYIELLPRQAYTENIEYNGTGLISGNS 
YIEKMNQYFYDLAKEKGILVTGSSNVHYLDPEEAKIRTILLYGSGMVHGAKAYKTDNGFY FRTTGELLEEFSYLGEEAAKEILIQNTNAIAEKIEVIKPIPDGFYPPSIDNAEETVREMT YEKAYRIYGNPLPEIVEKRLERELNAIIGNGFSVLYLSAQKLVKKSLDNGYLVGSRGSVG SSLVAFMMGITEVNALYPHYICTNPDCKHVEFIEREGVGIDLPEKKCPKCGQMYKRDGYS IPFEVFMGFNGDKVPDIDLNFSGEYQSEIHRYCEELFGKENVFKAGTISTLAEKNAAGYV KKYFEDNGMSISQAEVMRLAKKCEGAKKTTGQHPGGMVVVPSDHTIFEFCPVQKPANDEN SDSITTHYDYHVMDEQLVKLDILGHDDPTTIKLLQEYTGLDIYEIPLSDPDTLKIFSSTE SLGVTPQQIGSEVATFGIPEFGTPFVRQMLLDTRPTTFAELVRISGLSHGTDVWLNNAQD FIRQKQATLSEVITVRDDIMNYLIDQGIEKGTAFKIMEFVRKGKPSKDPEGWKKYSDLMK EKHVKDWYIESCRRIKYMFPKGHAVAYVMMAIRIAYFKVHYPLAFYAAYLSRKAEDFNFE TLGTPEKARIRLEELSKEGKLDVKKKAEQALCEVMIEMEARHIELLPIDLYHSSGKKFLI QGDKIRVPLIALAGLGGAVIDNILEERQKESFISIEDFKKRTKVSQTIVEKMKDLKIIEN MNETNQISLF >gi|224461460|gb|ACDD01000042.1| GENE 17 16051 - 17217 1194 388 aa, chain + ## HITS:1 COG:no KEGG:FN0280 NR:ns ## KEGG: FN0280 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 387 40 426 427 378 49.0 1e-103 MTQGVSEKVPKKNYEIVTREQEQLPENLWNHTKFRFSLIKQKKKAPLIFLLAGTGSDYNS LRMELFQRILYDAGFHVISISSQMTVNFIASASKFHVPGLLEEDSKDMYEIMKKCYQAVE KEVEVSEFLLTGYSLGATNAAFISKLDETEQFFNFQRVFMVNPAVNLYSSARQLDNYLNQ VTGNSVSNLEKMLEALLTKLKEESKNEYTGLTSESIFKSFQGNQFSDAQKAALVGLAFRM NAIDLNYVSDLLAKTGVYTKLDEHIKKFSPMLSYFVKIKFGDFGSYVDKVALPHYQKKLG EAYSKERLIAESSLHGVQDYLRKSPKIVVVTNEDELILSKEDLAFLRATMGDRIFVYPKG GHCGNMFYTPNIQVMLNFLKEGVFIHEK >gi|224461460|gb|ACDD01000042.1| GENE 18 17207 - 17941 989 244 aa, chain + ## HITS:1 COG:FN0279 KEGG:ns NR:ns ## COG: FN0279 COG2853 # Protein_GI_number: 19703624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Surface lipoprotein # Organism: Fusobacterium nucleatum # 44 237 54 259 260 231 56.0 1e-60 MRNRLFLLLFLSIFSFSLFAEETGMSAQEEKEIQEMTEYFGDYDPWEGLNRRVYYFNYGF DKYFFVPVVEGYQKITPVFVQHRVSNFFDNTKNISSLGNALAQTKGRKSMRSLGRLSINT ILGLGGLFDVASALGMPKPYEDFGLTLAHYGVPRGPYLILPILGPSYLRDAFGMLVDSQI ANGKEFSIPRTYTLPLSAIDRKSRVRFRFYGTNSPFEYEYVRFLYKKYRTVQEETHQNFN IGGI >gi|224461460|gb|ACDD01000042.1| 
GENE 19 17942 - 19300 1879 452 aa, chain + ## HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 1 452 1 452 452 579 62.0 1e-165 MDLQKEVLKYKEDVVRGIQEMIQVPSVKSEALPGKPFGEGPANALHAFLAYAEKLGFHTE NFDNYAGHIDMGEGEETLGILAHVDVVPVGEGWTYPPFSGTIADGKIFGRGTLDDKGPAM MCLYCMKALQDLKIPLSRKIRMIIGADEESGSACLKHYFQDLKMPHPDYAFTPDSSFPVT FAEKGAVRVKITRKFKTLEEVVLRGGNAFNSVAEKVRANFPSALVSGLESKNRVKVEEED GISEVFVQGVAAHGAKPHLGVNAIQVLFDYLKDCGIHNEEFRELVELFKNYLKMETDGAS FGVNFSDEESGNLSLNVGMISLEDNQLEICIDMRCPVLVENQKVIDTMKPKVEAAGFEFV LYSNSKPLYFPKDSFLVKTLMDVYQEVTGDMEAKPVAIGGGTYAKQTTNAVAFGALLKSQ EDLMHQKDEYLEIDKLDTLLPIFIEAIYRLAK >gi|224461460|gb|ACDD01000042.1| GENE 20 19320 - 20195 1063 291 aa, chain + ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 287 1 286 290 286 52.0 2e-77 MWKKLEIESKEVIDRYTKHRFQICDFAFTNLFLWSRGEVIEYEEEEDVLCLRGHYNDQIY YFMPVPKEETEENIGAMKRRMDTILEEGASISYVPEYWVEKLQDDYVLEEIRDSFDYVYQ VEDLAFLKGRRFAKKKNHISKFKRTYPDFTFEEITTENLEAVKAFQSQWCFCRECEKEEV LRNENMGIMSLLDHFETLGLSGSVLKVNGEIVGFSLGEVLDQDYVLIHIEKAIADYVGSY QILNSLFLQQHFLEYQYVNREDDFGNEGLREAKESYHPAFLLKKYDVISKK >gi|224461460|gb|ACDD01000042.1| GENE 21 20212 - 21414 1774 400 aa, chain + ## HITS:1 COG:TM0356 KEGG:ns NR:ns ## COG: TM0356 COG1171 # Protein_GI_number: 15643124 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Thermotoga maritima # 1 398 1 397 401 345 53.0 1e-94 MVTLEKIQEAKSCIQDSVRKTPVLNCPKLGAQTGNDVYFKLENLQQTGSFKLRGALNKIA HLSEEEKKCGVIASSAGNHAQGVALGATAKGIKSTIVMPAGAPLSKVRATREYGAEVVLH GAVYDNAYQKALEIQKETGAIFLHPFDDEEVIAGQGTIGLEILEQLPDVDAVLVPIGGGG ILAGIATAIKSVKPEVKVIGVEAAGAASMTAALAKGECCDIENCSTIADGIAVRKVGYKT 
LELVKKYVDEVVTVTEDEIVQGIFYLLEKSKLVAEGAGASGVAALLAGKINLKGKKVCAV ISGGNVDMNFIEKIVNKALVLNGQRHEITVYIPDKPGEMEKLTRVLHEQNANIIYISQTK YRASLAITEVKVDLVVECRDEAHQEEIHAALEKNGARIAK >gi|224461460|gb|ACDD01000042.1| GENE 22 21435 - 23069 1625 544 aa, chain + ## HITS:1 COG:FN0276 KEGG:ns NR:ns ## COG: FN0276 COG1283 # Protein_GI_number: 19703621 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Fusobacterium nucleatum # 15 544 1 525 525 592 60.0 1e-169 MYFQVLCTVVGGLGIFLLGMDNMSSGMQKIAGPRLKKILATLTTNRILGIFTGIMITALV QSSSVSTVMTIGFVNASLLTLKQALGIILGANIGTTITGWLLAMNIGKYGLPIVGLAAIL LMFKKEDKVRVRLMTLMGFGFIFLGLQLMSDGLRPLRELPEFVELFKAFRADTYLGVIKV ALIGAAITGIVQSSAATLGITITLASQGLIDYPSAVALVLGENVGTTVTALLASIGASAN AKRAAYAHTLINIIGVAWVTAIFPYYLFGLENILDPDHHVGAAIASAHTCFNICNVILMI PFVGVLDKFLQRIVPSDGNIEEDEVKVTKLSSMGKMLPTVIIDQTKNEVLTMGKYIKHIF FRLEELYEDPDKIAVNVVEINQVEDKLDLYEKEINNINYALLNRTLDQEYIEKTRRNLLV CDEYETISDYIGRIGDSIEKLQEHNIVIEGFRVEILQSLNDKIVKFFQHIHQGYESKEMK YFSDGIDEYNEIKNFCKTKRKEHFKDSTENIIPSRLNTEFSDIINYYQRAADHIYNIIEY YMKL Prediction of potential genes in microbial genomes Time: Fri May 20 02:02:03 2011 Seq name: gi|224461459|gb|ACDD01000043.1| Fusobacterium sp. 3_1_5R cont1.43, whole genome shotgun sequence Length of sequence - 3338 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 44 - 103 9.0 1 1 Op 1 . + CDS 156 - 944 1170 ## gi|257452480|ref|ZP_05617779.1| hypothetical protein F3_05380 2 1 Op 2 . + CDS 904 - 1374 723 ## gi|257452481|ref|ZP_05617780.1| hypothetical protein F3_05385 + Term 1388 - 1444 9.5 3 2 Op 1 . - CDS 1410 - 1799 456 ## SbBS512_E1933 phage capsid scaffolding protein (GpO) 4 2 Op 2 . - CDS 1811 - 1996 284 ## gi|257452483|ref|ZP_05617782.1| hypothetical protein F3_05395 5 2 Op 3 . - CDS 2002 - 2364 339 ## SNSL254_A1182 hypothetical protein - Prom 2402 - 2461 8.6 + Prom 2339 - 2398 14.3 6 3 Op 1 . 
+ CDS 2525 - 2746 314 ## gi|257452485|ref|ZP_05617784.1| hypothetical protein F3_05405 + Prom 2767 - 2826 9.2 7 3 Op 2 . + CDS 2858 - 3151 469 ## gi|257452486|ref|ZP_05617785.1| hypothetical protein F3_05410 + Term 3155 - 3205 8.5 Predicted protein(s) >gi|224461459|gb|ACDD01000043.1| GENE 1 156 - 944 1170 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452480|ref|ZP_05617779.1| ## NR: gi|257452480|ref|ZP_05617779.1| hypothetical protein F3_05380 [Fusobacterium sp. 3_1_5R] # 1 262 1 262 262 332 100.0 2e-89 MAALTIFAITGAVSFADPATTDPNAKIKELETRVGANADGIATLGTRANKLEEFSKAQAI LNEKYDQAVDNVGTKADKDGNNIDVDKYKEKLGLDKLEDRLDTQKARIKGLTQNKVNKDD YEQDKKEITKKIDKEIKDRQDHSKIFNDSIKDHDKKINSLTKDVDELLENAVNGEYVKGE VGKEAAARKAADEKHDEKINSLTESVDELLENAVNGEYVKREVEKEAETRKAADKAHDKG IADNKKAIDNEAATREAADKKQ >gi|224461459|gb|ACDD01000043.1| GENE 2 904 - 1374 723 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452481|ref|ZP_05617780.1| ## NR: gi|257452481|ref|ZP_05617780.1| hypothetical protein F3_05385 [Fusobacterium sp. 3_1_5R] # 12 156 12 156 156 210 100.0 2e-53 MKQQQEKQQTKNNDAAIAENKAAIKKYDTAIDTNKKAIADEVKRSTEVDARHDAAIAANK AGIEANASAIKHLDSKLNKTTAMMTAMNNVDFQDVNAGEVAIGAGVGHFVGSQAVAVGVA YGVNDDLKVHAKLSGVAGDAHYNAIGGGVTYKFRTR >gi|224461459|gb|ACDD01000043.1| GENE 3 1410 - 1799 456 129 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1933 NR:ns ## KEGG: SbBS512_E1933 # Name: not_defined # Def: phage capsid scaffolding protein (GpO) # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 2 120 169 287 294 114 44.0 1e-24 MYLFNKRSLQNLKGVHPTLVKLMKTAILSSPFPFVITEGCRSLERQKQLLKEKKTRTLQS YHLTGHAVDIAIKVGEKITWEYRYYEAVAKHIQKIAHRQHILITWGGTWKNLVDACHFQL EEERKKSLE >gi|224461459|gb|ACDD01000043.1| GENE 4 1811 - 1996 284 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452483|ref|ZP_05617782.1| ## NR: gi|257452483|ref|ZP_05617782.1| hypothetical protein F3_05395 [Fusobacterium sp. 
3_1_5R] # 1 61 1 61 61 99 100.0 8e-20 MRTKELENILIAEFGPKASTPELSKKLKISLTTIYKLIKEGKLILVEPGKVDTLSLFNCI F >gi|224461459|gb|ACDD01000043.1| GENE 5 2002 - 2364 339 120 aa, chain - ## HITS:1 COG:no KEGG:SNSL254_A1182 NR:ns ## KEGG: SNSL254_A1182 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Newport # Pathway: not_defined # 19 120 20 120 130 87 46.0 1e-16 MKDIHCLEIFPYSSSKFCVLEDFEYPMKHRVIFVPKHFITDLSSIPRIFWNFYPPFGLYT LASIIHDFLYSKEGSKQVQSRKEADEIFLTIMEETGVSWYTRILFYYAVRLFGSLYFQKE >gi|224461459|gb|ACDD01000043.1| GENE 6 2525 - 2746 314 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452485|ref|ZP_05617784.1| ## NR: gi|257452485|ref|ZP_05617784.1| hypothetical protein F3_05405 [Fusobacterium sp. 3_1_5R] # 1 73 1 73 73 110 100.0 3e-23 MKISEEDLSTEIIEQLVNMVGEFTNVNDLAETLNVSRTTISRKIEEGEIVAFHFGSRVIV VTRSLQGIIEKFL >gi|224461459|gb|ACDD01000043.1| GENE 7 2858 - 3151 469 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452486|ref|ZP_05617785.1| ## NR: gi|257452486|ref|ZP_05617785.1| hypothetical protein F3_05410 [Fusobacterium sp. 3_1_5R] # 1 97 1 97 97 138 100.0 1e-31 MNTRETIQKRVKTLETSIKREKAILQELESDKATIQRIEDLVEKGIALASDSHYASYDEW KLHLEKQVKRGERSLENLKIRKAELEAFRFYLEKVGA Prediction of potential genes in microbial genomes Time: Fri May 20 02:02:51 2011 Seq name: gi|224461458|gb|ACDD01000044.1| Fusobacterium sp. 3_1_5R cont1.44, whole genome shotgun sequence Length of sequence - 31092 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 10, operones - 6 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 10.1 1 1 Tu 1 . + CDS 93 - 794 1041 ## FN0602 hypothetical protein + Term 949 - 986 4.1 + Prom 799 - 858 6.7 2 2 Op 1 36/0.000 + CDS 994 - 1506 339 ## PROTEIN SUPPORTED gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 3 2 Op 2 46/0.000 + CDS 1552 - 1758 324 ## PROTEIN SUPPORTED gi|19703669|ref|NP_603231.1| 50S ribosomal protein L35P 4 2 Op 3 . 
+ CDS 1776 - 2126 510 ## PROTEIN SUPPORTED gi|237739652|ref|ZP_04570133.1| LSU ribosomal protein L20P + Term 2148 - 2202 11.7 - Term 2138 - 2186 6.1 5 3 Op 1 4/0.000 - CDS 2219 - 3385 1278 ## COG0003 Oxyanion-translocating ATPase 6 3 Op 2 . - CDS 3382 - 4569 954 ## COG0003 Oxyanion-translocating ATPase - Prom 4686 - 4745 8.9 + Prom 4705 - 4764 10.9 7 4 Op 1 13/0.000 + CDS 4845 - 5396 845 ## COG1556 Uncharacterized conserved protein + Term 5409 - 5449 2.3 8 4 Op 2 1/0.400 + CDS 5467 - 7623 2598 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 9 4 Op 3 2/0.000 + CDS 7624 - 8574 1382 ## COG0142 Geranylgeranyl pyrophosphate synthase 10 4 Op 4 1/0.400 + CDS 8574 - 9503 640 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase 11 4 Op 5 1/0.400 + CDS 9500 - 10201 267 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 12 4 Op 6 12/0.000 + CDS 10219 - 11514 2188 ## COG0644 Dehydrogenases (flavoproteins) 13 4 Op 7 . + CDS 11517 - 11801 400 ## COG2440 Ferredoxin-like protein + Term 11826 - 11862 5.0 + Prom 11816 - 11875 10.7 14 5 Op 1 2/0.000 + CDS 11950 - 12627 969 ## COG2186 Transcriptional regulators + Term 12708 - 12753 4.2 + Prom 12645 - 12704 10.4 15 5 Op 2 1/0.400 + CDS 12790 - 14217 2078 ## COG0277 FAD/FMN-containing dehydrogenases + Prom 14261 - 14320 4.5 16 6 Op 1 2/0.000 + CDS 14368 - 15504 1790 ## COG1960 Acyl-CoA dehydrogenases 17 6 Op 2 29/0.000 + CDS 15522 - 16298 1177 ## COG2086 Electron transfer flavoprotein, beta subunit 18 6 Op 3 1/0.400 + CDS 16312 - 17286 1608 ## COG2025 Electron transfer flavoprotein, alpha subunit 19 6 Op 4 23/0.000 + CDS 17350 - 17730 474 ## COG1380 Putative effector of murein hydrolase LrgA 20 6 Op 5 . + CDS 17727 - 18449 936 ## COG1346 Putative effector of murein hydrolase + Term 18460 - 18526 9.1 + Prom 18511 - 18570 7.5 21 7 Tu 1 . + CDS 18605 - 19993 1626 ## COG1757 Na+/H+ antiporter + Term 20002 - 20039 6.6 - Term 19988 - 20027 7.0 22 8 Tu 1 . 
- CDS 20032 - 20478 546 ## COG1490 D-Tyr-tRNAtyr deacylase - Prom 20504 - 20563 6.4 + Prom 20546 - 20605 7.9 23 9 Op 1 1/0.400 + CDS 20631 - 22139 2195 ## COG1488 Nicotinic acid phosphoribosyltransferase 24 9 Op 2 1/0.400 + CDS 22148 - 23062 701 ## COG0688 Phosphatidylserine decarboxylase 25 9 Op 3 1/0.400 + CDS 23037 - 23438 381 ## COG5341 Uncharacterized protein conserved in bacteria 26 9 Op 4 1/0.400 + CDS 23443 - 24564 1160 ## COG0628 Predicted permease 27 9 Op 5 . + CDS 24548 - 25699 1317 ## COG0116 Predicted N6-adenine-specific DNA methylase 28 9 Op 6 . + CDS 25689 - 26351 756 ## FN0343 hypothetical protein 29 9 Op 7 . + CDS 26373 - 27680 2078 ## COG2056 Predicted permease 30 9 Op 8 1/0.400 + CDS 27728 - 28624 1442 ## COG3643 Glutamate formiminotransferase 31 9 Op 9 1/0.400 + CDS 28635 - 29861 1650 ## COG1228 Imidazolonepropionase and related amidohydrolases 32 9 Op 10 . + CDS 29871 - 30509 1096 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase + Term 30515 - 30556 6.4 - Term 30501 - 30546 1.3 33 10 Tu 1 . 
- CDS 30550 - 30996 514 ## COG3086 Positive regulator of sigma E activity - Prom 31023 - 31082 12.2 Predicted protein(s) >gi|224461458|gb|ACDD01000044.1| GENE 1 93 - 794 1041 233 aa, chain + ## HITS:1 COG:no KEGG:FN0602 NR:ns ## KEGG: FN0602 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 233 1 234 236 253 52.0 3e-66 MLRVKSVETAFQASPNQYVQGAIDALGIFDNIIQPVFPYPFSNIALIFSFEKMDRPTVFE IRINAPDDSLISQGEFGVMPDSFGNGRKIVNLSNFLVAERGLYSVDILEKVSEDKVNFLK TEELFMADYPPKRRFSQEEIQEILATDGVIKMVKTDYKPVKYVQDETLEPIHFQLFLDPS EEVEEGFVAFPENDKVEIRGEIFDLTGIRRQIEWMFGQEMPKEEETKEETTEE >gi|224461458|gb|ACDD01000044.1| GENE 2 994 - 1506 339 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 [Vibrio campbellii AND4] # 9 170 1 166 166 135 42 4e-31 MNISEKIRINDKIRGKEFRIIGADGEQLGVMSAAEALEIAANQDLDLVEIAATAKPPVCK IMNFGKYRYEQERKAKEAKKNQKQTVVKEVKVTARIDAHDLDTKVNQIQKFLEKDNKVKV TLVLFGREKMHASLGVGTLDEVAEKFAETADVDKKYAEKQKHIILTPKKK >gi|224461458|gb|ACDD01000044.1| GENE 3 1552 - 1758 324 68 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19703669|ref|NP_603231.1| 50S ribosomal protein L35P [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 68 1 68 68 129 89 2e-29 MPKMKTHRGAKKRIKVTGTGKFIVKHSGKSHILTKKDRKRKNSLKKDLVVSETLKRHMQG LLPYGVGR >gi|224461458|gb|ACDD01000044.1| GENE 4 1776 - 2126 510 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739652|ref|ZP_04570133.1| LSU ribosomal protein L20P [Fusobacterium sp. 
2_1_31] # 1 115 1 115 116 201 86 6e-51 MRVKTGIVRRRRHKKILKAAKGFRGASGDALKQAKQATMKAMAYSTRDRKVNKRRMRQLW ITRINSAARLNGLTYSVFMNGLKKAGIELDRKVLADLALNNAAEFAKLAETAKAAR >gi|224461458|gb|ACDD01000044.1| GENE 5 2219 - 3385 1278 388 aa, chain - ## HITS:1 COG:FN1537 KEGG:ns NR:ns ## COG: FN1537 COG0003 # Protein_GI_number: 19704869 # Func_class: P Inorganic ion transport and metabolism # Function: Oxyanion-translocating ATPase # Organism: Fusobacterium nucleatum # 1 388 1 388 388 504 65.0 1e-142 MRIIIYTGKGGVGKTSIAAATASHLANLGKKVLLLSTDQAHSLQDSLDHPLTYYPQEVFP NLEAMEIDSTEESKKAWGNLRDYLRQIISEKANGGLEAEEALLFPGLDEVFALLQILEIY QENRYDVLIVDCAPTGQSLSMLSYSEKLTMLADTILPMVKNVNSILGSFISKKTSVPKPR DAVFEEFESLVKRLNHLEEILHDKKTSSIRIITTPEHIVLEEARRNYTWLQLYHFTVDAI YVNKIYPEKALEGYFENWKENQNKSLQIVEESFFNQKIFSLELQEEEIRGKDSLERISQL LYQGEDPSQIFYEGEEFKIEEKNGTRIFILPLPFTTKQDISVIKEEQDLLVTVLNETRRF RLPDKLQKRYISNYVLEDGKLKISMDYE >gi|224461458|gb|ACDD01000044.1| GENE 6 3382 - 4569 954 395 aa, chain - ## HITS:1 COG:FN1538 KEGG:ns NR:ns ## COG: FN1538 COG0003 # Protein_GI_number: 19704870 # Func_class: P Inorganic ion transport and metabolism # Function: Oxyanion-translocating ATPase # Organism: Fusobacterium nucleatum # 1 395 1 395 396 506 66.0 1e-143 MARIIIFTGKGGVGKSSVATAHALASSREGKKSLIISADMAHNLGDIFQKKIGKTITNIS TNLDAIELDPDAIRKEIFPEVKNAMMDLMGKNGLGVSNINEQFSFPGLGNLFCLLKIREL YESNQYERIFIDCAPTGETLALLKLPELLAWYMEKFFPVGKMMVRVLSPISKVKYGVTLP KRSTMNNIEKMHQSLLELQSLLKNKEICSVRLVCIPEKMVVEETKRNFMYLHLYQYQVDA VFINRVLQENIQNPFMKKWQSIQEKYIQELEEVFRNIPLTKIPWYPKEILGYEAVEKLCD TLSTSADLFSVHKQIENETYSPCEGGYRLNIVIPNAKKENIQVFLHEMDLNLKINNVNRC IPLPNSLRGSKIVKMDLEKDNLWIQFQQNTKEAKE >gi|224461458|gb|ACDD01000044.1| GENE 7 4845 - 5396 845 183 aa, chain + ## HITS:1 COG:FN1539 KEGG:ns NR:ns ## COG: FN1539 COG1556 # Protein_GI_number: 19704871 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 182 1 182 183 290 80.0 9e-79 MSTITDELYESFKKNLESVNGSCMRTAKAGLGKLIADVFTTQEITSISVFESPMMKEAGV 
VATLREAGITVHTDHIRLHAETDKGGVSEAQHGIAELGTIVQEQDDADGRMVSTMSEFYI GLVKGSTIVATYDDMFDILSAMPEIPNFVGFVTGPSRTADIECVGTVGVHGPIQVCIIIV DDA >gi|224461458|gb|ACDD01000044.1| GENE 8 5467 - 7623 2598 718 aa, chain + ## HITS:1 COG:FN1540_1 KEGG:ns NR:ns ## COG: FN1540_1 COG1139 # Protein_GI_number: 19704872 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Fusobacterium nucleatum # 1 463 1 463 463 890 92.0 0 MASEDLKKEIRSALDNATLGRTLGNFCKTYPARREKSYDGVDFEATRQKIAEVKSYAADH IDEIIEEFTTNCEKRGGHVYHATSTEDAMEWIRQLVKEKGVKTIVKSKSMASEEIHMNHV LGDDGVLVQETDLGEFIIALEGNTPVHMVMPALHLNKEQVADLFGDYTKKKHEPIISEEV KTARRVMRDKFTHADMGVSGANVAVAETGTVFTMTNEGNGRMVGTLPEIHLYIFGIEKFV KSFSDARHIFKALPRNGTAQRITSYISMYTGACEVTSNKETDEKRKKDFYCVILDDPGRR AILAEPDFREMFDCIRCGACLDVCPAFALVGGHVYGSKVYTGGIGTMLTHFLVSEERAAE IQNICLQCGRCNEVCGGGLHIAEMIMKLREKKMAENPDALKKFALDAVSDRKLFHSMLRI ASVAQGIFTKGEPMIRHLPMFLSGMTKGRSFPAIAQVPLRDMFHTIEQNVKEPKGTVAIF AGCLLDFIYTDLAKAVVANMNSIGYKVEMPLGQACCGCPASNMGDTENARKEAEINIEGM QAEKYDYIVTACPSCTHQLHLYPTFFEEGTEMYKRAKELADKTFDFCKLFYDLGGVADIG DGKPVKVTYHDSCHLKRSLRVSEEQRELLKHTKGVEFVEMHDCDNCCGFGGSYSLLYPEI SAPILENKIQNIKDSGAEVVALDCPGCLMQIKGGLDARGVDVKVKHTAEILAEKRGLV >gi|224461458|gb|ACDD01000044.1| GENE 9 7624 - 8574 1382 316 aa, chain + ## HITS:1 COG:FN1541 KEGG:ns NR:ns ## COG: FN1541 COG0142 # Protein_GI_number: 19704873 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 2 315 10 323 326 414 65.0 1e-115 MIEQVKQYMHLIADYSKKETEVGAVLEDALNASGKMFRTKLLLFCASLGPCYEEKKEKLC KLAAMVELTHLASLIHDDIVDDSPYRRGKISIQGKYGKDAAVYAGDFLMARIYYYEAVER LNESAALLSKTVEHMCTGEIGQDLCRYREDVSVEEYFQNIQGKTAALFETACHIGAMEAG CSQEMIEKLKLFGRNLGMMFQLKDDILDFTSNMDEIGKETHKDFQNGIYTFPVIMALQQE QAKKILYPIMEKNKGHRLDDAEITKMESCVLEYRGVEATYQEIQSLSKKNKQILQEIQGN QEAILPLWKLMDELEA >gi|224461458|gb|ACDD01000044.1| GENE 10 8574 - 9503 640 309 aa, chain + ## HITS:1 COG:FN1542 KEGG:ns NR:ns ## COG: FN1542 
COG1575 # Protein_GI_number: 19704874 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Fusobacterium nucleatum # 6 307 5 306 306 305 55.0 6e-83 MADYEKLTGQMALQLAAPHTWVASIGPALFAILFCRMEGYFLQVWQEIFLLVSCIFLQSS VNTFNDYIDFIKGTDGIEDCLEEKDAVLLHHHLSPRQVISLGICYLFFGVILGVLASLPA GYLPLGIGCIGLFAILCYSGGPFPISYLPIGEIVSGFAMGALIPLGIVACSDGGLQFQVI LYALPFVIGIAFIMLTNNSCDIEKDKLAKRCTLAVLLGRKRSKKIYQGLLVLWGISIVFL SARSLEFFSCISIFFLVLARHKIGYLWKSSLLAQDRIEQMKNIVLANIIINGGYLLAMAS YILVELILA >gi|224461458|gb|ACDD01000044.1| GENE 11 9500 - 10201 267 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 13 232 1 221 221 107 32 9e-23 MKEEKKSEKVHGVFETISKEYDKANDRISLGFQRKWKGMLVQKLLEETEKQGRVLDVCCG TGDISIWIAKKRNDLNIVGLDFSSSMLREAEKKSRGLSNILWKEGDAMALPFEEHSFSAA CISFGLRNTADYEMVLREMKRVLKEDGILYCLDSFVPDNRWIRPCYQMYFKYMMPFLGGG KKHYQEYFWLYESTQQFLRKQELLLLYQKLGLRELKVYSKMYGACVLIQGKKE >gi|224461458|gb|ACDD01000044.1| GENE 12 10219 - 11514 2188 431 aa, chain + ## HITS:1 COG:FN1544 KEGG:ns NR:ns ## COG: FN1544 COG0644 # Protein_GI_number: 19704876 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Fusobacterium nucleatum # 1 431 1 431 431 758 92.0 0 MSEEKFDAIIVGGGLAGCSAAIVLANAGLAVLVVERGDFCGAKNMTGGRLYGHSLEKIIP NFAEEAPIERKITREKISLMSEDGSFDIGFGSKKLSSTNENASYTVLRSVFDQWLASKAE EAGAEIIPGILVDELIMEDGKVVGVSATGEELYADVVILADGVNSLLAQSIGMKKELEPH QVAVGAKEVIRLGEDVINQRFAVNGEEGVAWLSCGDPTLGGFGGGLLYTNKDTVSVGVVA TLSDIGHHELSINQLLDRFKEHPSIAPYLEGGTSIEYSGHLVPEEGLHMVPELYRDGVLV TGDAAGFCINLGFTVRGMDFAIESGRLAAETVIKAHQLGDFSAETLSDYKKALDNSFIMD DLKQYKGFPTLLGRREIFEDLPAMVNDIAAKAFTVDGKQGQSLMMYVLNSVAKHTTAAKL VNFVTTVLEAF >gi|224461458|gb|ACDD01000044.1| GENE 13 11517 - 11801 400 94 aa, chain + ## HITS:1 COG:FN1545 KEGG:ns NR:ns ## COG: FN1545 COG2440 # Protein_GI_number: 19704877 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # 
Organism: Fusobacterium nucleatum # 1 94 1 94 94 181 94.0 3e-46 MKKMKIEDKLALNIFHVDEENSHIDVDKNFTDEAEIKKLLLACPAECYKYIDGKLSFSHL GCLECGTCRVLSHGKIVKEWKHPIGEVGVTFRQG >gi|224461458|gb|ACDD01000044.1| GENE 14 11950 - 12627 969 225 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 222 9 227 231 120 34.0 3e-27 MLEKSYEKVIEYVRIHILRGDYKIGDKLPSERELAALLGMSRNSIREGLRILERMGVLSS QQGAGNYIVGKFDEVLTDVLSMMYALREVEISQITDFRHGLEYAALNLALENATKEEKEK MKYHLEKLEVAEDDEEWLQHDKSIHYLLIESSKNKYLLVNYIALTAIMDLYIPTMRGKIL RAMKTQHYLYDAHRKIVEGILENNLVKGMEGLSLHFKYLKDYRYS >gi|224461458|gb|ACDD01000044.1| GENE 15 12790 - 14217 2078 475 aa, chain + ## HITS:1 COG:FN1536 KEGG:ns NR:ns ## COG: FN1536 COG0277 # Protein_GI_number: 19704868 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Fusobacterium nucleatum # 1 474 1 474 475 835 87.0 0 MGGYVYNQVSPELVEKFKQIVPGKVYVGEEINQDYFHDEMPIYGEGQPEVLIDATTTEDI AAIVKLCYENNIPVIPRGAGTGLTGASVAIKGGVMINMTKMNKILEYDYENFVVRVEPGV LLIELAEDAQRQGLLYPPDPGEKYATLGGNVATNAGGMRAVKYGSTRDYVRAMTVVLPTG EIVKLGATVSKTSTGYSLLNLMIGSEGTLGIITELTLKLIPAPKETISLIIPYEKLEECI ATVPKFFMNHLQPQALEFMEREIVLSSERYIGKSVFPKELEGTEIGAYLLVTFDGDNMEE LEEITEKAAEVVLEAGALDVLVADTPAKKKDAWAARSSFLEAIEAETKLLDECDVVVPVN KIAPYLNYVNGVGEKFDFTVKSFGHAGDGNLHIYACSNDMEDAEFKRQVAEFMTDIYQKA AEMGGQISGEHGIGYGKMDYLSEFAGTVNMRLMKGIKEVFDPKMILNPNKICYKM >gi|224461458|gb|ACDD01000044.1| GENE 16 14368 - 15504 1790 378 aa, chain + ## HITS:1 COG:FN1535 KEGG:ns NR:ns ## COG: FN1535 COG1960 # Protein_GI_number: 19704867 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 378 1 378 378 678 90.0 0 MAYLISEEAQDLLADVKKFCENEVKEQCKEYDVTGEWPKEIYDKAIEQGYHALEVPEEFG GPGLSRVDIAALLEEMAIADAGFATTISASGLGMKPVLISGSQEQKQRVADLILEGGFGA FCLTEPGAGSDASAGKTTAVKDGDSYILNGRKCFITNGAVASFYCITAMTDKTKGVKGIS 
MFLVEAGTPGLSTGNHENKMGIRTSNTCDVVLEDCRIPASALVGKEGEGFAIAMKTLDQA RTWMGCIATGIAQRGINEAIAYGKERIQFGKPVIKNQALQFKIADMEIKTETARQMVAHA LTKMDLGLPFAKESAIAKCYAGDIAMEVASEAIQVFGGYGYSREYPVEKLIRDAKIFQIF EGTNEIQRIVIANNVIGR >gi|224461458|gb|ACDD01000044.1| GENE 17 15522 - 16298 1177 258 aa, chain + ## HITS:1 COG:FN1534 KEGG:ns NR:ns ## COG: FN1534 COG2086 # Protein_GI_number: 19704866 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Fusobacterium nucleatum # 1 258 1 259 259 338 73.0 7e-93 MEILVCIKQVADDSVEIAMNPTTGKPALEGVAEVVNAFDTYALEMATRLKEAKGGNICVL SLGGATTTNSLKNCLAVGADEAFHIKDETYQEKDTIAVAQILAKGIQEVEAQRGKKFDLV FCGKESTDFASGQVGIMLADELHYGVVTNLVDIDGDETKVSTKRETEEGYQEIEVACPAV LTVTKPNYEPRYPTIKSKMAARKKAIAEVVVDTTAECVITEVKMSAPAKRQAGVKLATGT PEELVAQAIEKMLEAKVF >gi|224461458|gb|ACDD01000044.1| GENE 18 16312 - 17286 1608 324 aa, chain + ## HITS:1 COG:FN1533 KEGG:ns NR:ns ## COG: FN1533 COG2025 # Protein_GI_number: 19704865 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Fusobacterium nucleatum # 7 323 3 323 323 402 67.0 1e-112 MSIEKQKNIMVYVETVEGSPINVSLEALTQARKLATGDQVVAVLVGEKLDEAAKKCVEFG ADEVLCIEDTRKEVEAVGDILAQCNAKYEPKVILIGSSLDGKDIAAMVASRAKLPSLTDV IAMREENGTCYMTIPMYSGNILKEVSVAAGKTVIVVLRSGACKKEAAAGAGNIQKEEVAL NELLTKVTNVVTEISESVNLEEAEVIVSGGRGMGSKENFELVKQLAEVCGGVVGATRPVT EENWVPRSHQIGQSGKIVAPKLYIACGISGATQHISGAIGSNYIVAINKDEDASIFDVSD VGIVGNVMDILPLMIEEIKKVKSK >gi|224461458|gb|ACDD01000044.1| GENE 19 17350 - 17730 474 126 aa, chain + ## HITS:1 COG:FN1532 KEGG:ns NR:ns ## COG: FN1532 COG1380 # Protein_GI_number: 19704864 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Fusobacterium nucleatum # 1 121 1 121 127 131 65.0 3e-31 MGQCLLILAISLLGQFLSDLISFPIPKTIIASLILFVLLELKVVKVDYLRGILDICRKNL AFFFMPVGVAIMTKLGERPSMDYLKVLIVMIISTCVIMIVTGKATDIIIGIQEKIFKRND KGGDNK >gi|224461458|gb|ACDD01000044.1| GENE 20 17727 - 18449 936 
240 aa, chain + ## HITS:1 COG:FN1531 KEGG:ns NR:ns ## COG: FN1531 COG1346 # Protein_GI_number: 19704863 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Fusobacterium nucleatum # 9 237 13 241 244 294 72.0 1e-79 MSGFLENPLVHNVLFSPFFGMVLSLVAYMIGAYFFKKTKSIFCNPLLIGILLAILFMLAT DIPFEAYNQGGSILKMLISPVESVIIGVALYEQLEILKKNWFPILLSSFIGSTFAIIVVY VLGKLIVLPQDLLYATFPKSVTTAIALDIGSKFGWDGSLITMMTVSTGIIGAVVAPWITK FIKSPVARGLAIGTSSHAVGTSKAIEMGEIEGAMSGLGLSLAAIVTSFMVPVILTILHVI >gi|224461458|gb|ACDD01000044.1| GENE 21 18605 - 19993 1626 462 aa, chain + ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 11 460 1 450 473 388 46.0 1e-107 MKRKPTLVEALLPIVFLIVIIAVGILKYGADPQIPLLMATIVAAALGKYLGYTWSEMEKG IVETILPATQAILIQMIIGVIIGTWIVAGIVPTMIYYGLQIISPGFFLLATTVLCSIVSL ATGSSWTTAGTVGIALLGVGEVLGIPTALTAGAIISGAYFGDKLSPLSDTTNLAAAVSGT TLFEHIRNMMKTTIPAYCIALALYTGIGLRYLGRELNTEEIYKLLHILEQEFVINPVLLL PPILVILMVALKTPAVPGLTLGGVLGAIFAFVIQHKDFGAILEASQYGYKATTGYELADN LLSRGGLQSMMFTVSLIIVAMAFGGVVEKIKVLETVEERLVTFTKTTGSLVLTTVLSCIF CNATLPEQYLSILIPGRMFKDRYRKKGLDPRVLSRILEDSGTMTSALIPWNTCGAFMYAT LGVYPFAYLPFAFFNLLSPLIAILSGFFGVGIIKLEEEERLD >gi|224461458|gb|ACDD01000044.1| GENE 22 20032 - 20478 546 148 aa, chain - ## HITS:1 COG:FN0349 KEGG:ns NR:ns ## COG: FN0349 COG1490 # Protein_GI_number: 19703692 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Fusobacterium nucleatum # 1 147 4 150 154 197 68.0 5e-51 MKAVIQRVQYASVAVEGNIIGKIEKGFLILLGITHEDTEKDVLWLANKIKDLRVFEDENG KMNLSLEEVKGEVLIVSQFTLYGNCMKGRRPAFIDAARPELAIPLYEKFLETFQSFGIKT ESGKFGADMKVELLNDGPVTLIIESKDK >gi|224461458|gb|ACDD01000044.1| GENE 23 20631 - 22139 2195 502 aa, chain + ## HITS:1 COG:FN0348 KEGG:ns NR:ns ## COG: FN0348 COG1488 # Protein_GI_number: 19703691 # Func_class: H Coenzyme transport and 
metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 7 499 7 499 501 741 75.0 0 MKRLNTLTEFARVINSDRYQYTESDIFLMEKMQDKVATFDVFFRKTEDGGFAVVAGVQEV LDLIHILNETSEEEKRMYFSTILEEQHLIEFLSKIRFTGDIYALPDGAIAYPNEPILTIK APLIEAQILETPILNIINMAMAIATKASMVTRAAYPQVVSSFGSRRAHGFDSAVSGNKAA VIGGCSGHSNLMTEYRYGIPSSGTMAHSYIQSFGVGKKAEKEAFTKFIEHRKNRKGNTLL LLIDTYNTIKIGLENAIEAFQEAGIDDNYPGVYGVRIDSGDLAYLSKKCRQRLDEVGMKK AKIFLTNSLDEKLIKSLKEQGACVDIYGVGDAIAVSKSYPCFGGVYKIVELDGKPLIKLS EDVIKISNPGFKEVYRIFDKEGKAYADLVTLVEGDRDKEILLSGKDLILRDEKYDFKKSY LKAGEYTFEKLTKVYVKQGEVQEALYEELLDTMKSQKHYFESLEKVSDERKRLENPHQYK VDLSQDLLQLKYGLIKSIKEEA >gi|224461458|gb|ACDD01000044.1| GENE 24 22148 - 23062 701 304 aa, chain + ## HITS:1 COG:FN0347 KEGG:ns NR:ns ## COG: FN0347 COG0688 # Protein_GI_number: 19703690 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Fusobacterium nucleatum # 1 299 1 299 300 372 65.0 1e-103 MKFEAIRYIERKTGEYKIEKVPGESFLKFLYYNPFGKLALEALVKRKFLSVWYGKKMDTP ESKKKILPFVKALEIPMEEAEKSWEDFTSFNDFFYRKLKKGARTWDMREEVLVSPADGKI LAYENIDSFSSFLVKGQEFSLEELFASKEMAEKYAGGSFVIVRLAPVDYHRFHFPIDAWV GTSHKIDGYYYSVSTHAIRRNIRIFLENQREYTILESKLFGDIAYFEVGATMVGGIHQTY LENTMINKGEEKGYFDFGGSTCLLLFEKGKVQLDEDLLENTKKGLETKVYVGEKIGYAKK DGVL >gi|224461458|gb|ACDD01000044.1| GENE 25 23037 - 23438 381 133 aa, chain + ## HITS:1 COG:FN0346 KEGG:ns NR:ns ## COG: FN0346 COG5341 # Protein_GI_number: 19703689 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 3 133 2 130 130 138 57.0 3e-33 MQRRTEYFKRGDIVIYLLLVFLFFQLALNILQFPEVKAEKAEIYVDGRLEYVYPLQEEQK LFFVNTPIGGVNVEIKDKKIRVTTSNSPLKLCVKQGWIDGVGESIIGVPDRLLIQIVGEI SEDDEDYVDGVVR >gi|224461458|gb|ACDD01000044.1| GENE 26 23443 - 24564 1160 373 aa, chain + ## HITS:1 COG:FN0345 KEGG:ns NR:ns ## COG: FN0345 COG0628 # Protein_GI_number: 19703688 # Func_class: R General function prediction only # Function: Predicted permease # Organism: 
Fusobacterium nucleatum # 39 351 7 319 331 200 39.0 5e-51 MKKEKYSGIAFILFAVVILQSYLQETETFASILGGTISFFIPLIWAMFLSILLYPLQKFL EEKLHLKRELALIVVLILLGLCVSLFMLTVIPQVSKSIKELQQIYPYMEKRVGEFLDKIL SLLHKQGLLLMNETEIMKAISEYTQDNIQKIQQIGISIFWNVFDVTFGLANFFIGLFLAC FILLKPEDFVKVIERVIYLNVKKEKALNIIEILRKSKDIFLNYVVGRLLVSIIVALIVFL ILFLTKTPYPVLTALLFGVGNMIPYLGVLGASIVSGFLILIFAPYKIGYLIFAIILSQAL DGFIIGPKIVGDKVGLNSFWVVVAILLCGKLMGIAGMFLGVPIFCIIKLIYQEKWRAYVE KEKEGIEENEPKI >gi|224461458|gb|ACDD01000044.1| GENE 27 24548 - 25699 1317 383 aa, chain + ## HITS:1 COG:FN0344 KEGG:ns NR:ns ## COG: FN0344 COG0116 # Protein_GI_number: 19703687 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Fusobacterium nucleatum # 8 382 4 376 379 526 71.0 1e-149 MNQKFSMVASSTMGLESIVKEECKKLGFQNIQTFNGRVEFDGDFKTLAKANIHLRCADRV FIKMAEFKALSYEELFQEIKKIVWEHWIEEDGEFPISWVSSVKSKLYSKADIQRIVKKAM VERLKEKYKKEIFEETGAKYRIKIQCHNDIFLVMMDTSGEGLNRRGYRSLKNEAPLKETM AAALIYLAKWQGGERAFLDPMCGTGTLAIEAAMIARNIAPGANRNFAAEEWSIIPEDIWI DARDEAFSMEDYEKRVKIYASDIDEETIKIAKKNIERAGVEEDIILTCQDFREVKVEEKA GAMITNPPYGERLLDLVEVEELYRNLGQFCKKHLSKWSYYIITSFESFEKVFGKKASKNR KLYNGGIKCYYYQYYGEDRVNGR >gi|224461458|gb|ACDD01000044.1| GENE 28 25689 - 26351 756 220 aa, chain + ## HITS:1 COG:no KEGG:FN0343 NR:ns ## KEGG: FN0343 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 12 207 19 217 224 84 36.0 3e-15 MEDKYRGIWEEAEETFLEVLRIATQKQQELQKIGDLAGEELLEKEVISKYEALYLALQEE NFEDFSEIQWKQFQETLTEIQKKHQMDSTVLKEKRYLRKKLEGKSGAEVVKRLLEYQQKE LEKQKKNIMEEANQILEEEEKIHRKLCEAIQEVEQLQLFERLQPLQKRYAIISEKALDIQ KKIDYTVRDVEKKWKFKIYGTISEQKLQETSEEFFKKQKN >gi|224461458|gb|ACDD01000044.1| GENE 29 26373 - 27680 2078 435 aa, chain + ## HITS:1 COG:FN0341 KEGG:ns NR:ns ## COG: FN0341 COG2056 # Protein_GI_number: 19703684 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 1 433 1 442 442 476 68.0 1e-134 
MLLFNPVVLSVIVMSALCLLKLNVLISILIAALVAGGVAGMGLSGTISTLIGGMGGNAET ALSYILLGTLAVAINHTGVASILSRKIASLVNGKKYVLLCFIAFIACFSQNLIPVHIAFI PILIPPLLKLMNQLKVDRRAMACSLAFGLKTPYITLPVGFGLIFHGILAKEMANNGMEVA KTAIYKPLWILGVAMLIGVLLAIFVTYRKPREYQDLPLKGMEEVISEKMELKHWLTLVAA ILAFVVQILTGSLPLGALAALIALFVFGCIKWNEIDTMLNGGIQIMGLIAIIMLVAAGYG TVIRETGAVAELINALVGMVGGSKAIGAFAMLIVGLLITMGIGTSFGTIPVVATIYVPMC IHLGFSVESTVILMAAAAALGDAGSPASDTTLGPTSGLNADGQHEHIWDTCVPTFLHFNV ALIIGAMIGSIMIYG >gi|224461458|gb|ACDD01000044.1| GENE 30 27728 - 28624 1442 298 aa, chain + ## HITS:1 COG:FN0741 KEGG:ns NR:ns ## COG: FN0741 COG3643 # Protein_GI_number: 19704076 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Fusobacterium nucleatum # 1 293 1 300 321 447 72.0 1e-125 MAKIVECVPNYSEGRDLAKIEKIVAPFKEDTRIELLGVEPDGDYNRTVVTVMGEPEIIAE AVIRSIGIAAEVIDMNVHKGEHKRMGATDVVPFIPIKDMSIEECNELSKKVGKEVWERYQ VPIFLYENTASAPNRVSLPDIRKGEYEGMKEKMLLPEWTPDFGERAPHPSAGVTAVGCRM PLIAFNINLDTADVEIAKKIAKAIRFSSGGFRHIQAGPAEIKEKGFVQVTMNIKDFKKNP IYRVFETVKMEAKRYGVNVTGSEIIGAVPMEAIVESLAYYLGVEDLGMNKILESKLIK >gi|224461458|gb|ACDD01000044.1| GENE 31 28635 - 29861 1650 408 aa, chain + ## HITS:1 COG:FN0740 KEGG:ns NR:ns ## COG: FN0740 COG1228 # Protein_GI_number: 19704075 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Fusobacterium nucleatum # 1 408 1 413 413 576 70.0 1e-164 MKADLILYDIGQLITSRELEENHIEILENGYLAIKGDKIIGVGTGEVPTSFIQFDTKFVR IGKKVVSPGLVDSHTHLVHGGSREHEFSMKIQGVPYLEILAAGGGILSTLKATREASLED LIEKTKKSLRYMLELGVTTVEAKSGYGLSLEQEIKQLEATKLLNHLQPVSLVSTFMAAHA TPPEFKGRTGEYVEEVIKMLPEIKKRNLAEFCDVFCEEGVFSVEESRKILSKAKELGFQL KIHADEVVSLGGVNLAGELQAVTAEHLMVITDEGIEALKKGNVIADLLPATSFNLRHDYA PARKILEAGVQVALSTDYNPGSCPSENLQFVMQIGAAHLKMTTEEVFKAVTINGAKAVCR EKEIGSLEVGKQADIAVFDVPNAEYMLYHFGVNHTDSVYKAGKLVYQR >gi|224461458|gb|ACDD01000044.1| GENE 32 29871 - 30509 1096 212 aa, chain + ## HITS:1 COG:FN0739 KEGG:ns NR:ns ## COG: FN0739 COG3404 # 
Protein_GI_number: 19704074 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 1 212 1 212 212 222 55.0 4e-58 MKLMDMTLTQFLNEVDSPSPAPGGGSVGALVGGIGASLGRMVAHLSFGKKKYNAHPEEAR AAFEKNFVRLLEVKNELGRLVDADTDAYNLVMGAYKLPKDTEEQKVAREAEIQKNLKLAV QTPYETVMYCAEGIDLLGVLLQYGNQNAISDIGVGCLMLFAGLEAGIFNVLINLQSITDE AYNKEMKEKVMKIKEKAQAQKEEIVKIVEGAM >gi|224461458|gb|ACDD01000044.1| GENE 33 30550 - 30996 514 148 aa, chain - ## HITS:1 COG:FN0338 KEGG:ns NR:ns ## COG: FN0338 COG3086 # Protein_GI_number: 19703681 # Func_class: T Signal transduction mechanisms # Function: Positive regulator of sigma E activity # Organism: Fusobacterium nucleatum # 38 137 2 101 114 134 64.0 6e-32 MENKGIVQKIDGKQITVKLFKDSSCSHCNQCHGASKYGKDFEFETDKKAKVGDLVTLEIA EKEVIKAAAIAYVFPPLMMIIGYLVTDKLGFSENQSILGSFIGLILAFIGLFIYDKFFAK KSIEEEIRVISVENYDPTKIEKNTSCEL Prediction of potential genes in microbial genomes Time: Fri May 20 02:03:06 2011 Seq name: gi|224461457|gb|ACDD01000045.1| Fusobacterium sp. 3_1_5R cont1.45, whole genome shotgun sequence Length of sequence - 15003 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 6.8 1 1 Op 1 . + CDS 86 - 910 930 ## FN0760 hypothetical protein 2 1 Op 2 . + CDS 926 - 1972 1498 ## COG1077 Actin-like ATPase involved in cell morphogenesis 3 1 Op 3 . 
+ CDS 1989 - 3611 1230 ## gi|257452522|ref|ZP_05617821.1| hypothetical protein F3_05590 4 1 Op 4 12/0.000 + CDS 3595 - 4119 616 ## COG1386 Predicted transcriptional regulator containing the HTH domain 5 1 Op 5 1/0.000 + CDS 4106 - 4822 666 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 6 1 Op 6 31/0.000 + CDS 4839 - 5132 597 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 7 1 Op 7 21/0.000 + CDS 5147 - 6604 411 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 8 1 Op 8 . + CDS 6620 - 8071 1830 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) + Term 8078 - 8113 5.1 9 2 Op 1 . - CDS 8094 - 9257 1393 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 10 2 Op 2 . - CDS 9272 - 10663 1425 ## COG1288 Predicted membrane protein - Prom 10863 - 10922 9.1 + Prom 10724 - 10783 15.6 11 3 Op 1 . + CDS 10852 - 11679 904 ## COG1737 Transcriptional regulators - TRNA 11780 - 11855 85.4 # Thr GGT 0 0 + Prom 11697 - 11756 7.2 12 3 Op 2 . + CDS 11895 - 12803 994 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain + Term 12912 - 12945 0.8 - Term 12769 - 12811 6.3 13 4 Tu 1 . - CDS 12818 - 13531 770 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Prom 13572 - 13631 7.2 14 5 Tu 1 . 
+ CDS 13661 - 14953 1719 ## COG3681 Uncharacterized conserved protein Predicted protein(s) >gi|224461457|gb|ACDD01000045.1| GENE 1 86 - 910 930 274 aa, chain + ## HITS:1 COG:no KEGG:FN0760 NR:ns ## KEGG: FN0760 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 222 1 229 270 67 26.0 5e-10 MRFLKLRYLYIIALLFLVVIYLFPTTFHTQEQKEWLDIYFGNFIYLAFFVLFLYGVRMWY ETIAEKIVFEIRLYFGLFSFFASLALFFLWNGGLSFQSLEVTQATRDGILTEMIYEFHTG LIAAYAMYLLLNWNIYPFYYCMYAMLVGAILFFFLVVYKPLKKRYSHWKQVKRERIERER AERAIQEQIKIKKALEKEEARKVAQFEQRKIELIQERARGFEMGQLMSSVDLDDEEEEQE EFESNSENTEVMEEEKEEQEFQVDIFAEELENKK >gi|224461457|gb|ACDD01000045.1| GENE 2 926 - 1972 1498 348 aa, chain + ## HITS:1 COG:FN0758 KEGG:ns NR:ns ## COG: FN0758 COG1077 # Protein_GI_number: 19704093 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 1 348 5 353 353 503 74.0 1e-142 MLKKYLGRIFGMFSDDIGIDLGTSNTLICVKNKGIILNEPSVVAIHTKTKEIYEVGERAK MMIGRTPQAYDTIRPLKNGVIADYEITEKMLNSFYRRISNRLLYNPRVIICVPAGITQVE KRAVIDVTREAGAREAFLIEEPMAAAIGIGINVFEPEGSMIVDIGGGTAELAVISLGGVV RKSSFRVAGDKFDADIIEYIRQTHNLLIGEKTAEDIKKAIGTVVELEEDLSVDVSGRSLL NGLPKDVKVYASELIPVLNSSVQEIIEEIKIIFEKTPPELAADIRRRGIYITGGGALLRG IDQRMAENLNLKVVSVENPLNAVIDGITILLKNFSIYKSVLVSTETDY >gi|224461457|gb|ACDD01000045.1| GENE 3 1989 - 3611 1230 540 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452522|ref|ZP_05617821.1| ## NR: gi|257452522|ref|ZP_05617821.1| hypothetical protein F3_05590 [Fusobacterium sp. 
3_1_5R] # 1 540 1 540 540 888 100.0 0 MKKWLAWIFYISISCFSFADYFLSNGEVTFYFDDQEKEVSYLRGDALYPLDISRIRFYWI DEKENVYDFQASVKKVEKEGENILAVSYVLDHSEWKITFIPSFQKKNQLFAFLEGNIQQK GYLVMEISPQQENRYIRTEKEEQNLEYENFMISSNRKDLSLYLSKDSSLSEFQLERVLKA SKKFREDRLYYIFDKMQEGKQQVAFTFHFYQKEKEEWKTFEELFLEEKAAALFFQQSYEK MRSSKILSKNLEYLDLLSSRVYIPNFLSYAKARMSYLEKQQLLFIRALYHMTENHQRILE DVNLRKKEIDSVHYFYYALLYAEKTQQRIDQNLVNKRLLPQILSIYDEMTEDGRLIAVED SLEAYASYYRLLSLLEKRVEFSSELEFIQERKEKLYSYIHKAFLYHGAFKDRSFEESVNV KNIEYIFLLPKSIQQTTLKQWYEKNYDRKLGVIHYSKEKDLDMVHNLKMVSILYEMGMSY EADQLLENLEKYMRRSQNYVLEEYSLVDKVEKQEIEISARALYYYLLANWNREQYHGNER >gi|224461457|gb|ACDD01000045.1| GENE 4 3595 - 4119 616 174 aa, chain + ## HITS:1 COG:FN0757 KEGG:ns NR:ns ## COG: FN0757 COG1386 # Protein_GI_number: 19704092 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Fusobacterium nucleatum # 1 174 1 174 181 208 63.0 5e-54 MGMKDELESILFLGGDENKVKDLAKFFSISLEDMLKLIEELKEDRKDTGICIEMDADLVY LVTNPKNGEIIHQYFEQEVKPRKLSAAAMETLSIIAYKQPITKREIEKIRGVGVDHIVQT LEERNLVRVCGYRDSIGRPKLYEVSNKFLGYMGISSLEELPEYRQIQEELDGRE >gi|224461457|gb|ACDD01000045.1| GENE 5 4106 - 4822 666 238 aa, chain + ## HITS:1 COG:FN0756 KEGG:ns NR:ns ## COG: FN0756 COG1187 # Protein_GI_number: 19704091 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Fusobacterium nucleatum # 6 238 1 234 234 236 57.0 3e-62 MEENKIRINKFLASKGVASRRQIDLWIEEGKILVNGILATSGQKVSAEDKILVNGKMISE KEEKKVYYILYKEEEVLSAVKDERGRKTVVDCIPTKARIFPVGRLDYRTSGLILLTNDGE LFNRVMHPRAEIFKTYEVLAKGHLTREQLKTLEEGVELEEGKTLEALVAKVKYEKGNTFF EISIREGRNRQIRRMVEAVGSRVYRLRRTKIGRLSLEGLKLGQYRRLQEEEIEYLYSL >gi|224461457|gb|ACDD01000045.1| GENE 6 4839 - 5132 597 97 aa, chain + ## HITS:1 COG:FN0755 KEGG:ns NR:ns ## COG: FN0755 COG0721 # Protein_GI_number: 19704090 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln 
amidotransferase C subunit # Organism: Fusobacterium nucleatum # 1 96 1 96 96 101 63.0 3e-22 MSLSREEVLKVAKLAKLKFSEEKIEKFQEELNDILGYVDMLNEVDTTEIEPLIYVHEAQN NFREDEARASLEVEEVLRNAPNAEDGAIIVPRVVGEE >gi|224461457|gb|ACDD01000045.1| GENE 7 5147 - 6604 411 485 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 25 473 25 452 468 162 29 1e-39 MKKIYEMTAKELHQSFLAGEYRAVEIVEAFFQRIEAVESKINSFVSLRKEKVLEEAKQLD EKKLSGKELGSLAGVVVALKDNMLCQGEKVTAASKILENYEGIYDATVVSKLKEADALIL GFTNMDEFAMGGTTKTSYHKMTANPYDITRVPGGSSGGAASSIAAQQVPLALGSDTGGSI RQPASFCGVVGLKPSYGRVSRYGLMAFASSLDQIGPLAKNVEDIAYAMNVIAGTDDYDAT VEEVEVPDYTSFLGKEIRGMKIGVPKEYFIEGIRAEVKEIIMKSIDTLKSLGAEIIEISL PHTKYAVPTYYVLAPAEASSNLARFDGVRYGYRSENSQNIEDLYINSRTEGFGDEVKRRI MIGTYVLSAGFYDAYFKKAQKVRRLIQEDFIKAFETVDVIVTPVAPSPAFQLSEQKTPIE LYLEDIFTIPANLAGIPGLSVPAGLAGGLPVGIQFLGKAFHEGDLLQVGSAFEKARGDWK LPILD >gi|224461457|gb|ACDD01000045.1| GENE 8 6620 - 8071 1830 483 aa, chain + ## HITS:1 COG:FN0753 KEGG:ns NR:ns ## COG: FN0753 COG0064 # Protein_GI_number: 19704088 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Fusobacterium nucleatum # 1 481 1 481 481 712 75.0 0 MAREWESVIGLEVHLQLKTGTKVWCGCKADYDGDGMNTHTCPICLGHPGTLPKLNKKVVE YAVKAALALNCKINHHSAFDRKNYFYPDAPKNYQITQFEKSYAEKGHLDFRLNSGREVRV GITKIQIEEDTAKSIHASHESFMNYNRASIPLVEIISEPDMRSSEEAYEYLNTLKSIIKY TGVSDVSMELGSLRCDANISVMEKGATKFGTRVEVKNLNSFKAVARAIDYEIGRQIETIE QGGSIDQETRLWDDEAQITRVMRSKEEAMDYRYFHEPDLLQLYIPQSRIDEIQASMPESK AEKLVRFTKDYELPEYDAQVLTEEMELADYFEKVVEVSKNPKSSSNWIMTEVLRHLKETG KEIESFEISAENLGKIICLIDAKTISSKIAKEVFALSLTDSRDPEMIVKEKGLLQVSDEG AIISMVEEVLANSTKMVEDYKNSDEGRRPRVLKGLMGQVMKLSKGKANPELVTKLMLERL EKM >gi|224461457|gb|ACDD01000045.1| GENE 9 8094 - 9257 1393 387 aa, chain - ## HITS:1 COG:SA1935 KEGG:ns NR:ns ## COG: SA1935 COG1473 # Protein_GI_number: 15927707 # Func_class: R General function prediction only # Function: 
Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Staphylococcus aureus N315 # 2 380 4 385 394 209 33.0 1e-53 MKETLSHYIDQNQKEILEMADFIFDHPELGLQEFQAVTNIKNYLKKHQFQMEENIYGFET AFRASYQVGTGGPSIGLLCEYDALEGLGHACAHHMQGPSIVATAVALQEVLKDYNFNIIV YGTPAEETLGAKVAMSERGAFSDIDIALMMHGSPTTTTDVKSMALSNFDIIFHGVSSHAA LAPEKGRSALDGILLMMQGIEFLREHVKEDTRMHYTITDGGGAANVVPKIAKAKLSLRSY DRQYLDSVVERVKKVIQGAAMMTETSCEIIQTKALDSKIPVLSLNELLMENAKLLSAPRI TPPREKTGSTDFGNVMYRVPGSCIRIAFVPEGSSSHSDVYLEKGKTEEAHDAILLASKIL AYTAYDLISNPENLKKIQKEFKEKKEG >gi|224461457|gb|ACDD01000045.1| GENE 10 9272 - 10663 1425 463 aa, chain - ## HITS:1 COG:BH2308 KEGG:ns NR:ns ## COG: BH2308 COG1288 # Protein_GI_number: 15614871 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 3 463 16 470 470 264 32.0 3e-70 MKKKFAFPNTYVIIMAIIVLAVLLTWIIPPGEFERVKDEVSKQMIIIPGTFQYLEKNPIS FLQIPLYIMKGLAKSVDIIFLVIVVGGTFNIIIETGMFQSIAAKMTKVFSKNEILIIPAF TTIFALACTTMGVNTFIGFAPIGVIVARSIGYDAIVGVAMVCLGGAIGFSTGTFNPFTTG VAQSLAGLPLFSGLGYRFFCLIVFLIVTNVYIIWYAKRIKKNPELSTVYEMEMENKKIEV SSKHYDTIEKKHYLVLLVVIACFAILIYGSQKWEWKLPENGAIFIWMGILSGAVYGFSPN KIAEEFTKGARKLVFGALMIGMARAIALVLTDGKILDTTVFFLGNLLVNLPSMLQAIGMF IMQLLINGIITSGSGQAAVTMPIMLPVADIIGMTKQTSVLAFNFGDGLSNYVLPTSSALM GFIAMVGISYNNWMKFMWKLFVIWIITGSALIIIANMIHYGPF >gi|224461457|gb|ACDD01000045.1| GENE 11 10852 - 11679 904 275 aa, chain + ## HITS:1 COG:CAC1850 KEGG:ns NR:ns ## COG: CAC1850 COG1737 # Protein_GI_number: 15895125 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 7 253 13 259 293 94 23.0 2e-19 MQALYEKIKEAKLTKTEKRIAQFFLESEEDIFFMTSKEIAELLQISDTSIIRFVKHIGFT NFTDFRDFIKNNLQTKMASLPQFIQNMEELKHNSVEQALLSQMNKNISTLFDTKSLAKME EIIDVLWEAENRYIVGLKSTVGLANFLGIRLSFMLGKVYTFVSNDTILFNSIRNIKKEDV LFLFDYPTYSREMILLCRVAKKRGAKIILVSDMINSPCSMYADIHFTIKIKGISLFHSLI SSQFFVEYLLSAISKKMTKDQKKEFLVWKKLLSER >gi|224461457|gb|ACDD01000045.1| GENE 12 11895 - 12803 994 302 aa, chain + ## HITS:1 COG:FN0163 KEGG:ns NR:ns ## COG: 
FN0163 COG0646 # Protein_GI_number: 19703508 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Fusobacterium nucleatum # 18 296 1 298 309 266 45.0 3e-71 MREIIFFFGGRNMLLEALKRRILVLDGAMGTMLASYGEKPCYEVLNKTKENLIQKIHEKY IEAGADIITTNSFNCNQMALQKYHLKESVYDLTKKSVEIAKKATKNSKKAVYILGSIGPS IANLPEDMKSWKQSYFQQILGLLDGGVDALLLETIYDENKANCILGNIEEIFQEKKVEVP VFCSMTINQNGKLLTGTSITRAVEKMDRPWIVGFGLNCSYGMENIVSFLPELIRATDKYC MVYANAGFPNEKGEYTENIEEMLELLQPFLEKHWIHIVGGCCGTNEKYTYAFAKKIALLA ER >gi|224461457|gb|ACDD01000045.1| GENE 13 12818 - 13531 770 237 aa, chain - ## HITS:1 COG:FN1185 KEGG:ns NR:ns ## COG: FN1185 COG0846 # Protein_GI_number: 19704520 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Fusobacterium nucleatum # 2 236 7 242 252 303 61.0 3e-82 MEEIEKLASWIQESKHLVFFGGAGTSTDSGIKDFRGKNGLYQENFHGYSPEEVLSIDFFH RHRDLFLKYVEEKLSIANIKPHAGHYALVELEKMGKLKTIITQNIDDLHQAAGSKKVLEL HGTLKDWYCLSCEKHNTHPFQCQCGGTVRPNVTLYGEMLNESVTEAAIREIQKADVLIIA GSSLTVYPAAYYLQYYKGNKLVIINQSPTQYDKQAGLLISKNFAETMTEVLEYIKKK >gi|224461457|gb|ACDD01000045.1| GENE 14 13661 - 14953 1719 430 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 20 426 3 409 411 582 71.0 1e-166 MNKELKDKILNILQEEIVPAEGCTEPIAIAYAAAKLAQVLGEKAENIDIYLSGNMIKNVK SVFIPSSDGMVGIEAAVAMGFIAGNADKELMVISDVTKEQLEAVKDYYAEKRIHTYAHEG DIKLYIRMEAKTKNHTASIEIKHTHTNITELKKDGKILLAQACNDGNFNSPLSDREILSV KLIYDMAKKIPLPEIEPLFFQVVAYNSAIAEEGLKGKYGVNIGKMILDNIERGIYGNDIR NKAASYASAGSDARMSGCSLPVMTTSGSGNQGMTASLPIIRYCRERNVSYEQMIRGLFMS HMITIHVKTNVGRLSAYCGAICASSGVAAALTYLEGGSYYNVCDAITNILGNLSGVICDG AKASCALKISSGVYSAFDACMLALNKDVLRPEDGIIGKDIEETIKNIGELAQAGMKETDE VILDIMVGKR Prediction of potential genes in microbial genomes Time: Fri May 20 02:03:29 2011 Seq name: gi|224461456|gb|ACDD01000046.1| 
Fusobacterium sp. 3_1_5R cont1.46, whole genome shotgun sequence Length of sequence - 3419 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 150 - 209 11.0 1 1 Op 1 2/0.000 + CDS 249 - 2162 2844 ## COG1960 Acyl-CoA dehydrogenases 2 1 Op 2 . + CDS 2178 - 3380 1871 ## COG0426 Uncharacterized flavoproteins Predicted protein(s) >gi|224461456|gb|ACDD01000046.1| GENE 1 249 - 2162 2844 637 aa, chain + ## HITS:1 COG:FN1424_1 KEGG:ns NR:ns ## COG: FN1424_1 COG1960 # Protein_GI_number: 19704756 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 377 1 377 377 648 87.0 0 MFFKTTEEHEELRAKVREFVETEVKPIAFELDQENKFPEEAIKKFAKMGMMGLPYPKEFG GAGKDILSYAIAVEELSRVDGGTGVILSAHVSLGTFPIAAFGTEEQKKKYLVPLAKGEKI GAFGLTEPNAGSDAGGTETTAVLEGDHYILNGEKIFITNAPYADIYVVFAVTTPDIGTKG ISAFIVEKGWEGFTFGDHYDKLGIRSSSTAQLIFNNVKVPKENLLGKEGKGFNIAMATLD GGRIGIASQALGIAQGAYEEALNYAKEREQFGQPIAFQQAITFKLADMATKLRAARFLVY SAAELKEHHEPYGMESAMAKQYASDVALEIVNDALQIHGGAGYLKGMPVERFYRDAKICT IYEGTNEIQRVVIGAHIVGKAPKPTALAAAPKKKGPVCGIRKNVIFKDGSMQDKVNALVA ALKADGYDFTVGIDMDTPILDAERVVSFGKGVGKKENVELVKELAKQAGAALGCSRPVAE TLRYLPLNRYVGMSGQKFKGNLYIACGISGAIQHLKGIKDATTIVAINTNGNAPIFKNAD YGIVGSIEEVLPLLAAALNNGEDKKPAPPMKKMKRVIPKPVAPSYKLHVCNGCGYEYNPE FGDEDGEVKPGTLFKNLPEGWTCPECGEAVDQFIEVE >gi|224461456|gb|ACDD01000046.1| GENE 2 2178 - 3380 1871 400 aa, chain + ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 399 1 403 405 604 70.0 1e-172 MHCVREITKDLYWVGGNDRRITMFENIHPLKDGVSYNSYLLLDKKTVLFDTVDWTIVRQF VENIEYVLDGRTLDYLVINHMEPDHAAAIEEVLLRYPKAKVISTEKGFYLMTQFGFHVDP ANQITVKEGDKQNFGKHEIVFVEAPMVHWPEAMVSFDTTNGVLFSADAFGSFKALNGAMF NDEVDFDKDWIDEARRYYTNIVGKYGPHVQHLLGKAPVDQIKFICPLHGPVWRNDFGYLI 
DKYVKWSTYTPEEKAVMIVYASMYGNTENAVEILASKLVQKGIKVKLYDVSNTHVSHLIS DTFKYSHVILSSVTYNLGIYPPMHNYLMDMKALNLQNRTFAILENGSWACKVGSLMREFI ENNLKKSTVLNETVTLTSSTNEVNLKEMDDLVESIVESMK Prediction of potential genes in microbial genomes Time: Fri May 20 02:03:31 2011 Seq name: gi|224461455|gb|ACDD01000047.1| Fusobacterium sp. 3_1_5R cont1.47, whole genome shotgun sequence Length of sequence - 4617 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 989 650 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 1018 - 1077 10.5 - Term 1051 - 1092 9.0 2 2 Tu 1 . - CDS 1109 - 2776 2345 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 2850 - 2909 13.9 - Term 2874 - 2912 3.1 3 3 Op 1 . - CDS 2922 - 3914 1416 ## COG3641 Predicted membrane protein, putative toxin regulator - Prom 3934 - 3993 5.2 4 3 Op 2 . - CDS 4009 - 4518 767 ## COG1827 Predicted small molecule binding protein (contains 3H domain) - Prom 4545 - 4604 8.9 Predicted protein(s) >gi|224461455|gb|ACDD01000047.1| GENE 1 24 - 989 650 321 aa, chain - ## HITS:1 COG:YPO2151 KEGG:ns NR:ns ## COG: YPO2151 COG0697 # Protein_GI_number: 16122384 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 8 300 13 307 373 142 31.0 1e-33 MNNKIISIFFAFIAAMLYALNIPFSKLLLEQISPVFMASFLYFGAGIGMLCLSLLKKGNK EQNKLTKKELPYTLGMIFLDILAPISLMFGLKWISPTNASLLNNFEIVATSCIALFFFQE KISSKMWAAILLITLSSVLLSLDEQTNLSFSYGAIFILLACIFWGMENNCTRMLSSKNIV QIVVLKGICSGFGSFIVAFVIGEHLPHFFLILIILILGFLSYGLSIFFYVKAQKELGAAK TSAYYSINPFIGTFLSFLIFQEKLSKYYFLALFIMILGTILVILDTLFIKHKHIHSHQSL TEHTHEHIHFVTNLEKHIHKH >gi|224461455|gb|ACDD01000047.1| GENE 2 1109 - 2776 2345 555 aa, chain - ## HITS:1 COG:FN0684 KEGG:ns NR:ns ## COG: FN0684 COG1151 # 
Protein_GI_number: 19704019 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Fusobacterium nucleatum # 6 552 4 556 566 809 69.0 0 MCHTNMFCYQCQETFKNEGCQISGVCGKKPTTASLQDLLIYIDKGVANYSQALRQAKSPL IDNTVNKYLINSLFITITNANFDDQEIFHEIQRGLQLRESLKAECERLGLHTKFENHNLA KWYFTNERDVLNFSKTVGVLRTANEDIRSLRELLTYGLKGMAAYTEHAFNLGKTDESLFA FIEKALLATEDDSLGVNELIPLVLECGQFGVSAMALLDNANTSAFGNPEITKVNIGVGTR PGILISGHDLNDIKQLLEQSKDAGVDIYTHSEMLPAHYYPELKKYPHLFGNYGNAWWKQK EEFETFNGPIVFTTNCIVPPKKGASYEGKVFTTNAAGFPDWKKIPVREDGTKDFSEVIEM AKTCQAPKEIEHGEIIGGFAHNQVFALADKVVEAVKSGAIKKFVVMGGCDGRHKERDYYG DFAQALPKDTVILTAGCAKYRYNKMNLGDIGGIPRVLDAGQCNDSYSLAVIALKLKEVFD LDDINKLPIIYNIAWYEQKAVIVLLALLYLGVKNIHLGPTLPAFLSPNVAKVLVENFGIA GIGTVEDDMKKFFEI >gi|224461455|gb|ACDD01000047.1| GENE 3 2922 - 3914 1416 330 aa, chain - ## HITS:1 COG:FN1900 KEGG:ns NR:ns ## COG: FN1900 COG3641 # Protein_GI_number: 19705205 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Fusobacterium nucleatum # 1 330 1 330 330 308 60.0 7e-84 MKNFCIKTLNGMALGLFSSLIIGLILKQCGQFLHLPILIQFGTLAQYFMGPAIGVGVAYS LQSPPLVLIASLITGAFGAGTIQFVEGIAQIKIGEPMGAYIASLVAALLVTNLSGKTKLD IILLPACTIIVGCLVGIFISPAISLFMKYLGEIINTATTLHPIMMGMTLAVSMGMILTLP ISSAAIGISLGLHGLAAGAALVGCCCQMIGFATISYRENGMGGFISQGIGTSMLQIPNII KNPWIWLPPTLASAILGPISTSIFHMESNAVGSGMGTSGLVGQVSTLAVMGTTSLLPMLL LHFLLPAILSLIFAKILMKQNKIQLGDMKL >gi|224461455|gb|ACDD01000047.1| GENE 4 4009 - 4518 767 169 aa, chain - ## HITS:1 COG:lin2129 KEGG:ns NR:ns ## COG: lin2129 COG1827 # Protein_GI_number: 16801195 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Listeria innocua # 3 168 6 172 173 129 43.0 2e-30 MTGETRREKIVSLLKNQEKAISGREFAQQLEVSRQVIVQDIAILRAKNVPILSSPEGYLL EKTEKKLQFSFFSRHQSLQEMKEELEIIVDYGGKLLNIQVEHEIYGLITSNLCLQNRLDI ELFLEKLQETNSKPLSFLTNGLHSHTVEVDDLNQKKFILKKLQEKGFLQ Prediction of potential genes in 
microbial genomes
Time: Fri May 20 02:03:50 2011
Seq name: gi|224461454|gb|ACDD01000048.1| Fusobacterium sp. 3_1_5R cont1.48, whole genome shotgun sequence
Length of sequence - 61353 bp
Number of predicted genes - 66, with homology - 60
Number of transcription units - 15, operones - 9
average op.length - 6.7

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 . + CDS 4 - 105 100 ##
2 1 Op 2 . + CDS 102 - 1724 1950 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
3 1 Op 3 1/0.000 + CDS 1742 - 3028 1728 ## COG0536 Predicted GTPase
4 1 Op 4 . + CDS 3038 - 3946 1016 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase
5 1 Op 5 . + CDS 3981 - 4367 348 ## gi|257452544|ref|ZP_05617843.1| hypothetical protein F3_05702
6 1 Op 6 1/0.000 + CDS 4372 - 4767 536 ## COG3920 Signal transduction histidine kinase
7 1 Op 7 1/0.000 + CDS 4772 - 5122 606 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor)
8 1 Op 8 . + CDS 5147 - 6712 2422 ## COG1418 Predicted HD superfamily hydrolase
9 1 Op 9 . + CDS 6737 - 6802 91 ##
10 1 Op 10 . + CDS 6790 - 8013 1746 ## COG1760 L-serine deaminase
11 1 Op 11 . + CDS 8018 - 8926 914 ## COG2990 Uncharacterized protein conserved in bacteria
12 1 Op 12 . + CDS 8942 - 9835 747 ## COG2990 Uncharacterized protein conserved in bacteria
+ Term 9894 - 9927 0.5
13 2 Op 1 12/0.000 - CDS 9803 - 10243 495 ## COG3610 Uncharacterized conserved protein
14 2 Op 2 . - CDS 10254 - 11015 559 ## COG2966 Uncharacterized conserved protein
15 2 Op 3 . - CDS 10999 - 11331 461 ## FN0762 hypothetical protein
- Prom 11361 - 11420 13.8
+ Prom 11362 - 11421 9.1
16 3 Op 1 . + CDS 11449 - 13104 2008 ## gi|257452554|ref|ZP_05617853.1| GTP-binding protein
17 3 Op 2 . + CDS 13086 - 14519 1364 ## COG1078 HD superfamily phosphohydrolases
+ Prom 14541 - 14600 1.9
18 4 Op 1 . + CDS 14695 - 15849 215 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase
19 4 Op 2 . + CDS 15919 - 16542 598 ## FN0764 amino acid transporter LysE
20 4 Op 3 1/0.000 + CDS 16568 - 17620 1536 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
21 4 Op 4 . + CDS 17638 - 18036 787 ## COG1970 Large-conductance mechanosensitive channel
+ Prom 18225 - 18284 11.4
22 5 Op 1 . + CDS 18344 - 20815 2777 ## COG1629 Outer membrane receptor proteins, mostly Fe transport
23 5 Op 2 . + CDS 20830 - 24789 4240 ## COG3468 Type V secretory pathway, adhesin AidA
+ Term 24835 - 24881 12.2
- Term 24823 - 24868 9.5
24 6 Op 1 11/0.000 - CDS 24891 - 25685 791 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase
25 6 Op 2 3/0.000 - CDS 25682 - 26320 738 ## COG0352 Thiamine monophosphate synthase
26 6 Op 3 . - CDS 26304 - 26852 518 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2
27 6 Op 4 5/0.000 - CDS 26849 - 27955 1084 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
28 6 Op 5 . - CDS 27952 - 28731 950 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis
29 6 Op 6 . - CDS 28758 - 28970 311 ## gi|257452567|ref|ZP_05617866.1| hypothetical protein F3_05817
30 6 Op 7 . - CDS 29020 - 29121 62 ##
31 6 Op 8 . - CDS 29148 - 29615 639 ## COG1683 Uncharacterized conserved protein
- Prom 29700 - 29759 10.4
+ Prom 29652 - 29711 7.4
32 7 Tu 1 . + CDS 29858 - 30259 577 ## gi|257452569|ref|ZP_05617868.1| hypothetical protein F3_05827
+ Prom 30268 - 30327 3.9
33 8 Op 1 1/0.000 + CDS 30381 - 33899 3846 ## COG1196 Chromosome segregation ATPases
34 8 Op 2 . + CDS 33918 - 34925 1354 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase
35 8 Op 3 . + CDS 34922 - 35482 557 ## FN1131 hypothetical protein
36 8 Op 4 . + CDS 35491 - 36102 779 ## Smon_0263 hypothetical protein
37 8 Op 5 . + CDS 36115 - 36792 730 ## COG5522 Predicted integral membrane protein
38 8 Op 6 32/0.000 + CDS 36863 - 37942 1526 ## COG0216 Protein chain release factor A
39 8 Op 7 . + CDS 37944 - 39050 328 ## PROTEIN SUPPORTED gi|170727358|ref|YP_001761384.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific
40 8 Op 8 . + CDS 39029 - 39595 315 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29
41 8 Op 9 . + CDS 39610 - 40128 942 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family
42 8 Op 10 . + CDS 40115 - 40330 349 ## gi|257452579|ref|ZP_05617878.1| exodeoxyribonuclease VII small subunit
43 8 Op 11 1/0.000 + CDS 40317 - 41189 1224 ## COG0142 Geranylgeranyl pyrophosphate synthase
44 8 Op 12 . + CDS 41189 - 42220 1085 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase)
45 8 Op 13 . + CDS 42230 - 44206 2089 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases
46 8 Op 14 . + CDS 44215 - 45168 890 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family
47 8 Op 15 . + CDS 45158 - 45865 767 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases
48 8 Op 16 . + CDS 45869 - 46078 431 ## gi|257466444|ref|ZP_05630755.1| hypothetical protein FgonA2_03266
49 8 Op 17 4/0.000 + CDS 46086 - 47114 1013 ## COG4394 Uncharacterized protein conserved in bacteria
50 8 Op 18 . + CDS 47128 - 47691 921 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A)
+ Term 47705 - 47734 0.5
+ Prom 47709 - 47768 7.0
51 9 Op 1 8/0.000 + CDS 47808 - 48119 373 ## COG2739 Uncharacterized protein conserved in bacteria
52 9 Op 2 23/0.000 + CDS 48130 - 49479 2062 ## COG0541 Signal recognition particle GTPase
53 9 Op 3 . + CDS 49520 - 49783 407 ## PROTEIN SUPPORTED gi|237739055|ref|ZP_04569536.1| SSU ribosomal protein S16P
+ Term 49808 - 49846 6.2
54 10 Tu 1 . + CDS 49856 - 50770 1040 ## COG4874 Uncharacterized protein conserved in bacteria containing a pentein-type domain
+ Term 50798 - 50853 5.1
- Term 50785 - 50840 5.1
55 11 Tu 1 . - CDS 50886 - 51389 661 ## COG0716 Flavodoxins
- Prom 51413 - 51472 5.7
+ Prom 51448 - 51507 1.9
56 12 Op 1 . + CDS 51542 - 52294 888 ## FN1183 putative cytoplasmic protein
57 12 Op 2 . + CDS 52284 - 53699 1502 ## FN1182 hypothetical protein
58 12 Op 3 . + CDS 53719 - 54600 1276 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair
59 12 Op 4 . + CDS 54614 - 55723 1092 ## CTC01145 hypothetical protein
60 12 Op 5 6/0.000 + CDS 55733 - 57934 1809 ## COG1203 Predicted helicases
61 12 Op 6 12/0.000 + CDS 57954 - 58448 475 ## COG1468 RecB family exonuclease
62 12 Op 7 13/0.000 + CDS 58459 - 59451 951 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair
63 12 Op 8 . + CDS 59458 - 59736 311 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair
+ Term 59854 - 59916 -0.7
64 13 Tu 1 . - CDS 60441 - 60662 109 ##
- Prom 60723 - 60782 4.0
65 14 Tu 1 . - CDS 60794 - 60958 94 ##
- Prom 60986 - 61045 3.2
+ Prom 61054 - 61113 7.8
66 15 Tu 1 .
+ CDS 61190 - 61273 215 ## - TRNA 61191 - 61267 84.7 # Arg TCG 0 0 Predicted protein(s) >gi|224461454|gb|ACDD01000048.1| GENE 1 4 - 105 100 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPYIFYCKLKNNKKYGKIKVFFGIYYLGGIKIE >gi|224461454|gb|ACDD01000048.1| GENE 2 102 - 1724 1950 540 aa, chain + ## HITS:1 COG:FN1301 KEGG:ns NR:ns ## COG: FN1301 COG0488 # Protein_GI_number: 19704636 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 539 1 539 539 968 89.0 0 MIITSGLGMRFSGRKLFEDANLKFTPGNCYGIIGANGAGKSTFVKILSGDLEATEGEVIF DKKKRMSVLKQDHFQYEEEEVLNVVLMGNKILWDIMVEKNAIYAKEEFTDEDGLRAAELE GEFAELNGWEAETEAETLLMGLGIGADLHHALMKELTEPQKVKVLLAQALFGEPDALLLD EPTNGLDIKAISWLENFIMNLEHTTVLVVSHDRHFLNKVCTHITDIDYGKIKMYVGNYDF WYESNQLMIQLISNKNKKLEQKRQELQEFIARFSANASKSKQATSRKKQLEKLQLEDMQI SNRKYPFVEFKPDRDAGNNMLKVENLSKTIDGVKVLDNVSFTINTGDKVVFLAKNDIVKT TLLSILAGEMEADSGSYTWGVTTSQAYMPRDNSAFFTNPDLNLIEWLRPYSPDEHEAFVR GFLGRMLFSGEETLKKCTVLSGGEKVRCMLSRMMLSGANVLLFDNPSDHLDLESITSLNK ALIKFSGTILFGAHDHEFIQTVANRIIEITPSGLVDKLMSYDEYLEDEELQAKIEAMYAE >gi|224461454|gb|ACDD01000048.1| GENE 3 1742 - 3028 1728 428 aa, chain + ## HITS:1 COG:FN1918 KEGG:ns NR:ns ## COG: FN1918 COG0536 # Protein_GI_number: 19705223 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 428 1 428 428 614 81.0 1e-176 MFIDEVVITVKAGNGGDGSAAFRREKSVQFGGPDGGDGGNGGSIFFYADPNVNTLVDFKY KKIFKAQHGENGQKKQMFGKAGEDLIIKVPVGTQVRDLQTGKLLLDMNEKNETRMLLKGG RGGWGNVHFKTSTRKAPKIAEKGREGAELQVKLELKLIADVALVGYPSVGKSSFINRVSA ANSKVGSYHFTTLEPKLGVVRLEEGKSFVIADIPGLIEGAHEGVGLGDKFLRHIERCKMI YHLVDVAEIEGRDAISDFEKINEELSKFSEKLAKKPQVVLANKMDLLWDMEKYETFKSYV EEKGYEVYPVSVLLNEGLKEILYKTFDKIQKVEREPLEEETDIMEVLQELKIQKDDFEIT QDEEGVYHIEGRIVDGVLAKYVIGMDDESIVNFLHLMRSLGMEEAMQEAGIEDGDTVQIA NVEFEYVE >gi|224461454|gb|ACDD01000048.1| GENE 4 3038 - 3946 1016 302 aa, chain + ## HITS:1 COG:FN1917 KEGG:ns NR:ns ## COG: FN1917 COG0324 
# Protein_GI_number: 19705222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Fusobacterium nucleatum # 4 295 6 295 303 306 58.0 2e-83 MERRAVIIGGPTGVGKTSLSINLAKELKADIISADSAQVYRGLDIGTAKIRKEEMQGIKH HLLDVVEPTTKYSVGEFAEATNAILQEKYEKKENILLVGGTGLYLSAVSDGLSSLPPADF ALRTKFMEKTTEELYQELLTQDVLSANTIHPNNRVRIERALEVFLLTGKSFVVLSKQNVK ENPYSFHKIALERNREHLYDRINQRVDLMMEEGFLEEAKYLYQRYGEALKKLRIIGYDQL IEYFEGMISLEKAIELIKRDSRHYAKRQFTWFKQKKDYIWYNLDEQSEEEILSDIKNFLI KK >gi|224461454|gb|ACDD01000048.1| GENE 5 3981 - 4367 348 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452544|ref|ZP_05617843.1| ## NR: gi|257452544|ref|ZP_05617843.1| hypothetical protein F3_05702 [Fusobacterium sp. 3_1_5R] # 1 128 1 128 128 212 100.0 5e-54 MKKVLVVIFLTYLFTSCSNLSSMTESNSSQEQWKVFISEVKLAVEEKKIKMLQEKMMVSQ KNKYIYQELSKLDMDQQDIQFYFKEPEYNFPKIQGLVAIQYADRTEYFNIFYTWKNGKWW ISDLEERR >gi|224461454|gb|ACDD01000048.1| GENE 6 4372 - 4767 536 131 aa, chain + ## HITS:1 COG:FN1915 KEGG:ns NR:ns ## COG: FN1915 COG3920 # Protein_GI_number: 19705220 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 23 131 1 109 109 107 55.0 7e-24 MKQTEVEVHIPSSLENLSVVRAMIRTYLQNHHIAEGDVVQLLSVVDELATNAIEHAYQNK LGEVIINIEKDGSKVRLFVEDSGSGYDDKKVSKEEGGIGLILARKLVDIFEIIKKEQGTV FRIEKEVREAM >gi|224461454|gb|ACDD01000048.1| GENE 7 4772 - 5122 606 116 aa, chain + ## HITS:1 COG:FN1914 KEGG:ns NR:ns ## COG: FN1914 COG1366 # Protein_GI_number: 19705219 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Fusobacterium nucleatum # 1 116 1 115 115 127 64.0 5e-30 MENTFELTERKLENGITVIGVMGELDALVAPKLKELMNRHIDMGNIKLILDCENLVHINS LAMGILRGKLQSVKEIGGDIKIIRLNNHIQTIFDMIGLDEIFEIYATEEEAVVSFR >gi|224461454|gb|ACDD01000048.1| GENE 8 5147 - 6712 2422 521 aa, chain + ## HITS:1 COG:FN1913 KEGG:ns NR:ns ## COG: FN1913 COG1418 # 
Protein_GI_number: 19705218 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Fusobacterium nucleatum # 14 521 1 508 508 638 78.0 0 MNLILGIGLGVFGLAIAFALIYKKMVIDKQIQTLNNLEDEVAKSKIKAKEILESAEKEAV SKGKEIELKAKERAYSLKEEAEKEIRNSKNEILQKEARLAKKEETLDHKIEKLENKSQEL EKTTEELEQKREEIETVKKEQEAELERITGLTKAEAKDILIAKLKEELTHDNALAIREFE NKLEDEKDRISRRILSTAIGKAAADYVADATVSVVNLPSDEMKGRIIGREGRNIRSIEAL TGVDIIIDDTPEAVVLSSFDGVKREIARITIEKLITDGRIHPGKIEEVVNKAKKEVEKEV VAAGEEAILELSIPGLHPDIIKTLGRLKYRTSYGQNVLVHSIEVAKIAATLAAEIGADVE LAKRAGLLHDIGKVLEHDVESSHAIIGGEFLKKYGEKATIINAVMAHHNEVEFETIEAIL VQAADAVSASRPGARRETLTAYIKRLEQLEEIANSFQGVESSFAIQAGRELRMIINPDRV NDDEATVMSREVAKKIEETMQYPGQIKVTIVRETRAVDYAK >gi|224461454|gb|ACDD01000048.1| GENE 9 6737 - 6802 91 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVPFSLKKKTKKFKEGYLWIH >gi|224461454|gb|ACDD01000048.1| GENE 10 6790 - 8013 1746 407 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 404 1 403 408 559 68.0 1e-159 MDTLRELFKIGCGPSSSHTMGPERAAKKFLAKNPDAAKYRVELYGSLAATGKGHLTDWII EETLKPKVTEIIWKADYIHPYHTNGMKFYALDQKESILDEWLVFSVGGGTIKEEKDFEET SLEKKEVYTLNKLDDIMEWCRKNRKKLWEYVEFCEGEGIWDYLWEIHQTMEEAINRGLTK EGFLPGNLKYPRKAKETYLKAKTKTRLLRFVDKMFAYSLAVSEENASAGKVVTAPTCGAS GVIPGLLRAMREEYSLDEATVLRGLAIAGLIGNLIKQNATISGAEGGCQAEVGAACSMAS AMAVYFMGGSMEEIEYAAEIGMEHHLGMTCDPVGGYVQIPCIERNAIVATRSFNTANYVM VTGGDHTISFDEVVITMKETGKDMCSAYKETSNGGLAKYYNKILAGE >gi|224461454|gb|ACDD01000048.1| GENE 11 8018 - 8926 914 302 aa, chain + ## HITS:1 COG:YPO1363 KEGG:ns NR:ns ## COG: YPO1363 COG2990 # Protein_GI_number: 16121643 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 27 286 40 307 315 150 36.0 2e-36 MKKEILFYWNVMQHGRAKGKVSTFDQKIKYIARNILYYPWAKQIVSFLQSHPYLSHEIYR 
YPVLCSKIHRPYMTCDFSIQKKVDSILASYQYIDNFFQEDSLTKLYRNGRIKILQIKGKD DITIDAYLKLYSQYEKEGEFNLVLYWGEILLATLTFSIVDGRLFIGGLQGLGREYTDPEI LKKVTKSFYGMFPKRLVLEIFYSLFSEKKIAVGNRSHIYLAARYKHQEKRKIHADYDEFW QSLGANPFGEDLWALPEKLVRKEIEEIPSKKRSQYRSRYAILDEIQQLVLEFLKQESKKI VI >gi|224461454|gb|ACDD01000048.1| GENE 12 8942 - 9835 747 297 aa, chain + ## HITS:1 COG:YPO1363 KEGG:ns NR:ns ## COG: YPO1363 COG2990 # Protein_GI_number: 16121643 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 27 293 40 315 315 156 36.0 5e-38 MKRELQFYYQIMKERYGKGTTHALRKKIKYITRTLFYYRYSMQLARFIMNDRYLSKTIHQ YRMLTEKLHKPYMTYSFSSKEKLEVIFSSYAYLELYFRDDILQELYTKTKIKVLDIVGKE DCTLSIYFKVYPNFDKEGEFNLIMYQGDILLATLTFSIWKDKMFIGGLQGLGRIYNDPEI LKKVTKHFYGLFPKRILMEVFYHLFPEPKIAVGNANHIYLAQRYRYKKERKVKADYDEFW ESLGGIQREDGLWELAEKIARKPIEEIPSKKRSQYRSRYQILDQIEELVSNFLINSK >gi|224461454|gb|ACDD01000048.1| GENE 13 9803 - 10243 495 146 aa, chain - ## HITS:1 COG:CAC2266 KEGG:ns NR:ns ## COG: CAC2266 COG3610 # Protein_GI_number: 15895534 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 8 142 3 137 152 86 36.0 2e-17 MFVYSSPIQILAAVFTTLGFGVLFNVKGNNLLHTCIAGGISWAVYLFCSTHAYSLSFSYF LATFILSLYSEIIARIKKTPVTSILIAAMIPLAPGGGIYYTMLHILQKNYPLALSKGVDT LIIAGSMAIGVFSASALFRVYQEIRH >gi|224461454|gb|ACDD01000048.1| GENE 14 10254 - 11015 559 253 aa, chain - ## HITS:1 COG:FN0781 KEGG:ns NR:ns ## COG: FN0781 COG2966 # Protein_GI_number: 19704116 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 10 247 13 250 256 155 34.0 6e-38 MKKEIKEEYQILSLACKTARLLLENGSEVHRIEQISKKICEYYGYSCQCFASLTCVVITL ENKEGEIFSLVERIENRNTNLNKITRISKLVEEISSHSYFSFKEELQDIQEEVTYSPLQV LLAHMIGAAFFVFLFQGNHQEIFVSGLTGFCIAFTAFISQKIKLESLFVNLLQGMVCSSI PCLFYSLGWIQNIDISIISSLMIMVPGVAFINAIRDLFSGDLVTAQSRLLEVALIGMTLA TGSGIALKFFYIS >gi|224461454|gb|ACDD01000048.1| GENE 15 10999 - 11331 461 110 aa, chain - ## HITS:1 COG:no 
KEGG:FN0762 NR:ns ## KEGG: FN0762 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 65 1 65 118 83 69.0 2e-15 MSELEKKILRFLLSAAAYSENAICKNLGINLEELHTSFHILEENGYLESYETFLAREQLN ESNSCSSHGGCSSCHSCSKGSCCNKGEEDYSDIRVLTEKAVEEFGSEKGN >gi|224461454|gb|ACDD01000048.1| GENE 16 11449 - 13104 2008 551 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452554|ref|ZP_05617853.1| ## NR: gi|257452554|ref|ZP_05617853.1| GTP-binding protein [Fusobacterium sp. 3_1_5R] # 1 551 1 551 551 920 100.0 0 MKTRVFQRYEDYQKYMRQFLLEEEIELEQEKKDIENRRFVVMIVGEAKSGKSSFIDAYLK TNILPIDVKQCTNALIHIRHSEHLFLEVNHHGQFLTLESEEEIREFLNREANFSKSGKQE ELELSLYYPLEKEFQEIEFIDSPGVNAEGGLGEISEEYLPSVNAIIFVKSLYGQALESTS FIDFFRGKTKRRHKESRFLLLTGSALLSKKDQESLEKDAVAKYGDYIATEKIIALDSKLK LFWNECQDLSEIEIARKIEEEDFDSATVLWYRCQGQKEAFMKALLEKSNFINLEQKLKTF AKDYEKILCLQFLENILGAYQRQIHIFEDQRQVLVDHRKDPDTLQETVNEKRREIQELSK RLEIGVQEMYQKYIQEDFLEAMLTDSYRKWDEELVVFRRKRDWKQLELWFQEKMKESAQV SLELSEQMVEECNEKLFCDGRKVYLEIFKPNSMDYSLLKAEKVEESFFQISEMLSSLKAH LKANIKRNLENCLYKYTGKIHANCHRLEYACEELLAEKWNSEKLQIKITEISEKISILEK QREEILWELKL >gi|224461454|gb|ACDD01000048.1| GENE 17 13086 - 14519 1364 477 aa, chain + ## HITS:1 COG:FN2068 KEGG:ns NR:ns ## COG: FN2068 COG1078 # Protein_GI_number: 19705358 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Fusobacterium nucleatum # 1 469 29 510 513 483 51.0 1e-136 MGVKVVKDLVHGYIYIDEKIQKCIDTPYFQRLHRVKQLTCNLLFPSVNHTRYEHSLGVMK LACDFWDTLAPFLQQRGKDEEEILLLREQLRFAALLHDVGHPAFSHLGEKFLEKTEICQA IREILPQKYSMEETFFQNTTLKGSPHELMSCYCILSKFQEVLDSSLQLDFVCRMIIGNPY AEKEKWAENICIQILNSSSIDVDKLDYLMRDNHMTGEIAPFMDVERLLASLSLDEENRLC FIAKAIPAVQSVVDSRDSLYLWVYHHHISVYTDFLLGEMLKTSIELKYMSREEFFSPQAI TEDLIADDDVYSYLRALYCREKRKKSNLYLACLSSQFFERHFLKSLWKTIYEYHDREIAW MQAGIISEIEDLNALLKDDDAMGQLAKRVQKEVGLQEGEIFFVSQHHKFYHSVQKTEIEL VLKGEKRKLSELLPQKNFEKFHQLSFFFYVKEEKKEEAYESFLKNLKIILEERKRLL >gi|224461454|gb|ACDD01000048.1| GENE 18 14695 - 15849 215 384 aa, chain + ## PROTEIN SUPPORTED ## 
NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 71 345 1 283 319 87 26 2e-16 MRKEKEERILECIREQEKISRISLAKLLQWNPTTVGSIVAELLKKSYIQEVEMEASTGGR KATLLSLKEDMSPSILGISFAPSFLQIGIGSIQGKIFETEKIVLTPLIIEKIWQFLFQII DKKLNKWKEIRQISVIISGLVNSEKGVSIFSPHYQWRNIEIKKILEERYQKKVFVENDVR AMALLEKSFGSCKKKRNFVVLNIGDGVGSSIFIDNKLYIGSYSGSGELGHMQVNAKGLRR CSCGKIGCLESEVSNLSILDKISSQIKLGQYSILRQKLKRDGNLSIEDFLFALGEKDLLA LQIAEESVEMITRALDAIISLLNPERVILYGSIFQSEYLYREILKKIQSILISEQGYKIS LSNFYKEAYAYAPFAVLRYLSIKN >gi|224461454|gb|ACDD01000048.1| GENE 19 15919 - 16542 598 207 aa, chain + ## HITS:1 COG:no KEGG:FN0764 NR:ns ## KEGG: FN0764 # Name: not_defined # Def: amino acid transporter LysE # Organism: F.nucleatum # Pathway: not_defined # 60 207 1 148 150 78 33.0 1e-13 MLLDTSIIKGIIAGFILSLPFGPVGIYCMEVTIVEGRWKGYVSALGMVSIDVLYGMIALL FVNKVEDIIIRYEGYLTVLIGIFLIIIAIRKLTQPVTIKRVKHEFKTLLQGYFTFMFFAL ANISSIAVIILIFTTLRVFDSESPSMLCQVPLGIFAGGASLWFFTTTVLCKLRKTVEEGN LIRVSRVASCLILILGIYLIVQAIIKI >gi|224461454|gb|ACDD01000048.1| GENE 20 16568 - 17620 1536 350 aa, chain + ## HITS:1 COG:FN0765 KEGG:ns NR:ns ## COG: FN0765 COG0482 # Protein_GI_number: 19704100 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 4 350 15 361 362 496 70.0 1e-140 MVNMEYREENKKVRVGVALSGGVDSSTVAYLLKKQGYDIFGVTMKTCHAEDADAKKVCED LGIDHYVLDLTEPFSEKVMDYFVEEYMRGKTPNPCMVCNRHIKFGKLLDFILGQGAQYMA TGHYTKLVDGHLSVGDDGGKDQVYFLSQVPKEKLKKIIFPVGELEKIQVRELAKELGVRV YAKKDSQEICFVEDGKLKEFLIEKTKGKVYNKGNIVDKNGKILGKHNGLAFYTIGQRKGL GISSESPLYVVELNSERNEIIVGTNEDLMREQLTAEQCNLFLVDKLEELHNMNCYAKTRS RDTLHACRLEVLGDEVIAHFIDNKVRAVTPGQGVVFYNELGQVIAGGFIK >gi|224461454|gb|ACDD01000048.1| GENE 21 17638 - 18036 787 132 aa, chain + ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance 
mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 132 1 131 136 139 57.0 9e-34 MSILKEFKEFAIKGNVVDMAVGVIIGGAFGKIVASLVGDVIMPAVSCISGGQSFAEKAIE IPSKVEGAEPILIKYGLFIQNIIDFVIIAVCVFIMVKIINSLKKKEEEAPAAVPEPTKEE VLLTEIRDLLKK >gi|224461454|gb|ACDD01000048.1| GENE 22 18344 - 20815 2777 823 aa, chain + ## HITS:1 COG:FN0499 KEGG:ns NR:ns ## COG: FN0499 COG1629 # Protein_GI_number: 19703834 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 7 823 1 743 743 339 30.0 2e-92 MKGKMLLLGILLSSVSFATAEIQLESSKIHGGGYYKPSMEENNGTIVITEEMIQKKHYDS VAKIFEDSPVSVVRHTAFGPIVDLRGSGERTISRVKVMIDGTPINPLEETHGTIPFDTIP VESIAKIEIVPGTGTTKYGGGTTGGYINIHTKKDRQNNYITVNADHASYNTNSIGIAAGM NATKKLFVYVGEAYQRKDGYRKKDHSDRNNFLGGFDYQINTKHRIKGQGNLYREDLKSTT EVTHEELKQDRRKAGEDTKIEMDRDFASLDYEYTPSSNFTLRTNVNRAHFTRDVAMNGKQ TQLILPNAFRFEMGIFGDLADLKPVLQNFESTMEGKFKEKNQEGKVDGEWKYNQGKGHLQ FGYSYNEKNLNQDLKAISKPFTLGEMGYLFQGDPAPHPFEDYAGKVIDADTMWRRIYKDY GTSEAEIEQALVGRKERFDSEKVDVQNYNTVDAFKDTHALYLLNDYKITPKFNLRAGLRW EHSEYGADRKNRMLFGIHNAKKSYLAAIAAMYGFLSDYEKQKLVEGKLNYVHMNLSLKET RIKDSSDNVGGEVGFSYQYNKKGSIFFRYERGFLSPLPSQLTNKDFLTGIYYPSHVKSEK VDTFEIGVKHSLWNNTHIEANTFFSLTKDEITNMRYNANNHMNMRWAYANISKTRRFGFE LNAEHIFDKLKIRESFSYVDAKIAKDTGFKDYYHSGYVEGTDKKFAKEPLYYKKGQQVPL VSKIKVTVGAEYQFTDKLSLGGNYNYVSGYDTREPSEGFQAKIYKVKGHGTLDLFGRYYF TDYAYMRFGVNNVLGEKYNLREDSHYAVPAPKQNYYAGFSYKF >gi|224461454|gb|ACDD01000048.1| GENE 23 20830 - 24789 4240 1319 aa, chain + ## HITS:1 COG:ycgV_2 KEGG:ns NR:ns ## COG: ycgV_2 COG3468 # Protein_GI_number: 16129165 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1060 1284 151 378 413 79 29.0 6e-14 MKLKNKKYISLFAYLLVSHFAFSMEAHIVQNGKKQEVGNEKITLKHDFTKMDQKVKEAEE YSKKLFKEGKISGVISGTLYPFSGEEALSKTGVIFDFKLENNNANSASLTGKKLDLIDGT 
IDVTAMPWNLYGKDNIATIENSEINIKKTREYPYYHFNDDDMMRNYYESSQNYLQLMRSI EEKEKYLDKLYVSQPIYVGNSYLYLKNSTVKVDAKKEDNTNDLIVREHLVMNRKDDNKNF NSSKAGLQIIRDGEKLEKGKYDLELSNSLNITLSIEDNFAFSYYNKDSIKKWYQNRFHSE KGRPLFLLGKNAKLKAGTFRVDNSNIWTYLYMANMFDLMDVYFNKDNRMIVFEKDSEADL EKIEIQNSNLKFEDAIVRLKNANGPVLISEYSQITGRGIFDIYGQSILKETIGTGDRIDK NLFFTDLEFLPGSKMDLSFIQDAHRNYFSIDTGIEEIKDDALKGGEYKFTFHNNTNTYLG YSQRNLNRKIQRKPDSNEAKGIGLKHASDADMNFQEGSHLYLYREKYAKDTLKKGGTWKE ENDEQGNLLEFTGRLHFDNANIHFRSNLKEELSDRLIASRYPITGTGAKLHIKNSGATDT TGKERLTLIEAKKGVGADTKFTLPHDLELGGYVYTLHEKISADGAKEYYLASGVVEGVRP ENSEKPKEIQTGIPDSSSYTKKDSIHNQESIDNQKIRVEAAGEDYAISWEGSSEKEINIQ NSHVAAKGKKGILVKNSKVNLKASELSATDGMDIDGTDTNKERLLVDKDSSLYLEHAKGG AALAVKEAGVKFDNSKKILLKGTYAIYSNGGNIAGNGIFNINGNIYHKGKGGIDLTLEKG SVLNTSSIDIAGNRDSSKLHFKAGSAMYINNYSNTNMIFDKGSALHTYAISEANQHLDIY HKGNKVIFTQPIAFHNTDIYFRTNMDEEESDKLELLGSLSGTSANLHLMNHAEADMPAGG KKVELVYAKSPKNFKWNLANQVEVGGYFYNAVIEKTSQGPNQDIISVRIGDAGTRKATLS STAKGVISNSVSDYTMYHSFHDSLFDSLYSNDYSSQKLHSIWAKTSGNTFETKEYGMKNE VNTILVGMDRALNAEEGLYGGFFAGNLHNSKKISSSSGKGSFQGFTGGAYLSYRGYLGFG DAFVAYTTGKSKYHVLDTASATVTNDRNSKHLGAGLRFGRQFFMDNGEHFYVEPSAKITY GRLNAETSKASNGLMTKVDAIKSWTTGAYARVGYQNQFTFGKINSYAKVGVTQEILGKYS VRLNQSGVEQVKLDGNTMNYGVGLEYSLGNNSISFDFDVKNSPILKNYYKVSLGYQYKF >gi|224461454|gb|ACDD01000048.1| GENE 24 24891 - 25685 791 264 aa, chain - ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 1 255 4 258 265 305 62.0 7e-83 MKTVLSIAGSDSSGGAGIQADIKTMQANGVYAMTAITALTAQNTLGVNGIFEIPAEFLEK QLESIFQDIYPDAIKIGMLSSSSSIRKIASILKKYEAKKIVLDPVMISTSGTPLISSDAV YDLQKFLFPLATLITPNIPEAELLSGISIKNEQDMERAAKKLGEKYHCSVLCKGGHQKNT AHDLLFDNGTYTWFYGEKIDNPNTHGTGCTLSSAIASNLAKGCSLKDSIQHSKEYLSLAL HSMLNLGKGNGPLNHGVLISKEKF >gi|224461454|gb|ACDD01000048.1| GENE 25 25682 - 26320 738 212 aa, chain - ## HITS:1 COG:sll0635 KEGG:ns NR:ns ## COG: sll0635 COG0352 # Protein_GI_number: 
16329575 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Synechocystis # 26 207 156 337 343 152 42.0 3e-37 MRKRIEIPKGIYGITGDNFSNGKSNLDCVKEMIEGGIRILQYRDKTKSMLEKYQEAKEIA KLCKEKGVIFIINDHVDLALLVNADGVHIGQDDYPVEEVRALLGNDKIIGLSTHSPEQGF KAFQNENVDYIGVGPIFPTTTKDTKAVGLEYLDFAIQNLHLPLVAIGGIHEDNLEKILAR KVEHFCMVSGIVGAKNIRETVQNLWKQWEENQ >gi|224461454|gb|ACDD01000048.1| GENE 26 26304 - 26852 518 182 aa, chain - ## HITS:1 COG:Cj1046c KEGG:ns NR:ns ## COG: Cj1046c COG0476 # Protein_GI_number: 15792373 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Campylobacter jejuni # 2 174 86 262 267 116 37.0 3e-26 MKVGIAGCGGIGSNVAYHLIRSGIVEFKFGDFDIVESSNLNRQFFFHSQIGKTKALCLKE NLLQINPKAIIEAEVIHFEKENIQNFFYDCDIIIEAFDKKECKTMLLEEISTTGKPIIAA SGIADYDIENLQIKKLSSNLYVVGDFMKGIENYPTYSHKVNMVAAMMAKVVLELGGYFEK KN >gi|224461454|gb|ACDD01000048.1| GENE 27 26849 - 27955 1084 368 aa, chain - ## HITS:1 COG:CAC2921 KEGG:ns NR:ns ## COG: CAC2921 COG1060 # Protein_GI_number: 15896174 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 366 1 366 368 348 48.0 1e-95 MSFYDEKKKWNSFDFSSYFSRVTEEDVLQSIEKEKLSEYDLLNLLSPTATKYLEKMAQKA HNLKLQHFGNVICLYIPIYVSNYCSNGCTYCGFSMKNKIHRRHMTLKEIEEEAKEIAKTK IEHIILLTGEVKDLSTLQYIKEGVSILKKYFASVSVEVMPLETEEYATLKKVGLDGMTIY QETYNEEVYDKVHLYGKKKDYLFRLGTPERAAEAGLRTVGIGALFGLSNIREEAFFAGLH LQYLIHHYPNTTFGISLPRINPAEGGFQPDHPLDDIQFVQFLTAYRIFQPKADLSVSTRE VPEFRDHLLALGVTRISAGSKTDVGGYTNQDASTAQFEISDSRSVEETVAAVERQGFQVI YKDWENLV >gi|224461454|gb|ACDD01000048.1| GENE 28 27952 - 28731 950 259 aa, chain - ## HITS:1 COG:CAC2922 KEGG:ns NR:ns ## COG: CAC2922 COG2022 # Protein_GI_number: 15896175 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole 
biosynthesis # Organism: Clostridium acetobutylicum # 1 253 1 254 255 298 64.0 6e-81 MDRLELQGRIFNSRLLTGTGKFRDKKLIEPMLESSESEIITMALRRVNFQNPQENILNYI PKKITLLPNTSGARNAEEAIKIAMIAREAGCGDFIKIEVINDMKYLLPNNEETIKATKFL AKEGFIVLPYMYPDIYAAKALEDAGAAAVMPLGAPIGSNKGLLSKSFLEILNENKRVPLI VDAGIGTPSQAAEAMEMGVDAVLVNTAIATAEDPVMMGKAFSMAVKAGRMAYLAKLATTS KYAQASSPLTDFLFRGDKE >gi|224461454|gb|ACDD01000048.1| GENE 29 28758 - 28970 311 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452567|ref|ZP_05617866.1| ## NR: gi|257452567|ref|ZP_05617866.1| hypothetical protein F3_05817 [Fusobacterium sp. 3_1_5R] # 1 70 1 70 70 120 100.0 4e-26 MKVVINGLDREIPENMNILALVKNLSTENNISLTGAIVLIDEELIPKAMWEKTFPQPSSK IEVLSFVSGG >gi|224461454|gb|ACDD01000048.1| GENE 30 29020 - 29121 62 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAEREQSDSYYLNWIMPAKGSNFFMYFVSSSFY >gi|224461454|gb|ACDD01000048.1| GENE 31 29148 - 29615 639 155 aa, chain - ## HITS:1 COG:TM0410 KEGG:ns NR:ns ## COG: TM0410 COG1683 # Protein_GI_number: 15643176 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 5 141 2 145 149 147 51.0 6e-36 MSKEKILISACLLGIPCRYDGKDNKIEKLSSLQEYYDFVPVCPEQLGGLSTPRCPCEIQG NKVISKEGKDCSEEFQKGAEESLKLIKKWKIQKAILKAKSPSCGYGFIYDGSFTRKLIKG NGYTANLLEKEGVSIFCETELDKIFKEVYNKLTIK >gi|224461454|gb|ACDD01000048.1| GENE 32 29858 - 30259 577 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452569|ref|ZP_05617868.1| ## NR: gi|257452569|ref|ZP_05617868.1| hypothetical protein F3_05827 [Fusobacterium sp. 
3_1_5R] # 1 133 1 133 133 195 100.0 1e-48 MAGRKVREEMKQLESIKTKLQGEIEGLKTQKSNISKEISLKQAHIQKITEKIKTLSQGAG NEIIISEHAILRYIERVLQIPLEEIEQKIVSSTLKEQIKMLGDGSYPLENGKYRIVVKDN VVVTILGDDMVEL >gi|224461454|gb|ACDD01000048.1| GENE 33 30381 - 33899 3846 1172 aa, chain + ## HITS:1 COG:FN1129 KEGG:ns NR:ns ## COG: FN1129 COG1196 # Protein_GI_number: 19704464 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Fusobacterium nucleatum # 1 1168 11 1177 1193 839 49.0 0 MYLKAVEVHGFKSFGEKVYIEFNQGITSIVGPNGSGKSNILDAVLWVLGEQSYKNIRAKE SQDVIFSGGKDKKAMNQAEVSLIIDNQDGYFEEFPQEDLVITRKIHITGENEYFINHQKS RLKDISALFLDTGIGKSAYSVIGQGKVERIINSSPKEVKGIIEEAAGIKKFQASKNEAMK NLENVELELEKIELVLQEVRENKNRVEKQAEVAQRYLDVREEKQRAQKSIFLTDYHQKQE EQEIAGKEQEGFLENCQKFDKELKETEENIHRLEEEKKNLQEKMEKISSKNESLRTFLEE QEREKVRVQERQAAFQRELEEKKERLIQEKQKREEREKNKRGFFLKKEELKKKIEDLEEK NQVFEVLLKNLDQEKKIFEETLEVKDHKLREVELQKLNVINDLETSSKRMQSSETRVKNL QIDAEESQKKLEEVKKEFLSAEEKKNQQEQKLKESESRTQFVEEEISRLSIALNKGAEKL RQLEFEEKRSSARYEAILRMEENNEGYYKGVREVLQANIPGVAGVFLELIQIPEYLERAL EAAVSGNLQDIVVENSDVAKRTIQYLREKKAGKASFLPLDMLKINKKTVSQKISGVLGVA ADLVASEEKYRKAVDFVLGNLLVVENYDIAIQISKANFFSGNIVTLNGELVSSRGRISGG DQNKGIASQLLERKKERKKLEEELEVLRSRMQKGNQALDEYSKQLEKYENEISNLDMMGD NLRKQKKLAEEYVESLQEKISRMEKEIRIATMELEEEIRYTKEFEKKMNSTHAQKEELIA LSSTLKQEIQEIREKNKELQEKIEIQKEKFSDIRILFLNSKNHWEQLSQEEERLSKEEKE FQGMEEELERRIEILQNGKLSLEETQLELAKKIENTLEEYHKESKEMEKLHEQDKQNVEK EREFHKIQKEIESRLLFMKDKYERTEEKLERIREESILLEEELEKLTEIESEIFPFEKMR SRKENLRNLEAKLLSFGDVNLLAIEEFRELKEKYSYLGNQRDDLVRGKKVLLDLISEIKD TIYERFQEAYHIISENFNKMCMETLDNSEGKLNLLEAEEFENAGVEIFVKFKNKKRQSLS LLSGGEKSMVAIAFIMSIFMYKPSPFTFLDEIEAALDEKNTRKLIAKLKEFTSQSQFILI THNKDTMRESDSIFGVTMNKEIGISKVVPVKF >gi|224461454|gb|ACDD01000048.1| GENE 34 33918 - 34925 1354 335 aa, chain + ## HITS:1 COG:FN1130 KEGG:ns NR:ns ## COG: FN1130 COG1663 # Protein_GI_number: 19704465 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # 
Organism: Fusobacterium nucleatum # 10 330 1 321 325 474 70.0 1e-133 MKILSYIYYLITSLRNFLYDKGFLPIYHVKDVEIICIGNISVGGTGKTPAVQFFVKKLQK MGRNVAVVSRGYRGKRKNEPCLVSDGRVIFASPQESGDEPYIHALNLTVPIIVSKNRYHA CLFARKHFHVDTIVLDDGFQHRKLARNRDVVLVDATNPFGGRHLLPWGTLRESFKKAAKR AEEFIITKADLVSEREIEKIKKYLKHSFHKEISVAKHGVHSLRDMAGNLKPLFWIEGKRV LIFSGLANPLNFEKTVLALEPSYIERIDFIDHHNFKEKDLLRIERRAEQMEADYILTTEK DFVKFPKHLDIPNLYVLKIEFTMLEDHSLETWRVF >gi|224461454|gb|ACDD01000048.1| GENE 35 34922 - 35482 557 186 aa, chain + ## HITS:1 COG:no KEGG:FN1131 NR:ns ## KEGG: FN1131 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 186 1 186 257 128 39.0 1e-28 MKKAEIVKKFRSVSIEDLEKEIQERGKYKVFSEFAEIMDKRSYFTVDIEGGICRKKVNPI LLEFPYEEDTKKLASMILSYGAPEERQVIHEISRLSNIEIPKLKEKLMTTLVNRNFDFAK RYAKELFLRDERSFWKVLNIFVELGEAENQKREVLKAFEVCMNIVKYDERLFHLYLSFLT RYRDNY >gi|224461454|gb|ACDD01000048.1| GENE 36 35491 - 36102 779 203 aa, chain + ## HITS:1 COG:no KEGG:Smon_0263 NR:ns ## KEGG: Smon_0263 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: DNA replication [PATH:smf03030] # 4 203 2 200 200 184 49.0 2e-45 MAKVYAYFLELTGESGVVDTWAECQEKTKGVKKARYKSFPDRIQAGNWLSRGAIYEKKEA LQKKIIQKMELPEGIYFDAGTGRGIGVEVRVSDKNGNSLLEEGCNEFGNILLGFSKTNNY GELTGLSKAIDIALEKKIFHIYGDSNLVLEFWSQGRYHPEKLEKETVILIQDVIKKRKQF EALGGKISYISGDINPADLGFHK >gi|224461454|gb|ACDD01000048.1| GENE 37 36115 - 36792 730 225 aa, chain + ## HITS:1 COG:FN0996 KEGG:ns NR:ns ## COG: FN0996 COG5522 # Protein_GI_number: 19704331 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Fusobacterium nucleatum # 2 218 3 220 232 172 49.0 4e-43 MEMFVLFGTSHMIMILIGVISVLAFIGLGFLIKPQALAKFVSVVVLGIKIAEMYYRHIFL GEEIYRMLPFHLCNLTIILSLFMMFFHSKFLFQLVYFWFVGAIFAIITPDIIFDYPNFWT ISFFVTHFYLVFSALFALIHFHFRPTKKGMIMAFLFINLWAVVMYFVNQELGTNYLFVNR IPETTTLLSYFGAWPYYFLPVEGIYLIQSILLYLPFRKANIKFNF >gi|224461454|gb|ACDD01000048.1| GENE 38 36863 - 37942 1526 359 aa, chain + ## HITS:1 COG:FN1332 KEGG:ns NR:ns ## 
COG: FN1332 COG0216 # Protein_GI_number: 19704667 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Fusobacterium nucleatum # 1 356 9 363 365 513 83.0 1e-145 MFDKLEEVVARYEELHTLLSSPEVLNDPKKMIECNKNLNALTPLIEKYKEYKAVKDDVEF IKESLKTEKDEDMRSMMQEELKENEEQMPELEKELKILLLPKDPNDDNNVIVEIRGGAGG DEAAIFAGDLFRMYCRYAERKKWKIEIIEKQDLEGLDGLKEVAFSIQGFGAYSKLKFESG VHRVQRVPKTESAGRIHTSTATVAVLPEVEDITEVHIDPKDLKIDTYRSGGAGGQHVNMT DSAVRITHLPTGVIVQCQDERSQLKNREKAMKHLASKLLEMEVEKQRSEIEGERRLQVGT GDRAEKIRTYNFPQGRITDHRIKLTVHQLEAFLDGDLDEMIDALITFSQAEMLSASGEE >gi|224461454|gb|ACDD01000048.1| GENE 39 37944 - 39050 328 368 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|170727358|ref|YP_001761384.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific [Shewanella woodyi ATCC 51908] # 129 338 64 275 314 130 37 1e-29 MNLLDILQFAEEYLKKYSFSKSRLESELLIADVLHLDRLSLYVNYDRMLEEEEKLKIKKY LFQMAKTKKSYRELREEREEENFQEENRKLLQQSIEYLKKYEVPNAKLDAEYIFADVLKV NRNMLSLYLHREISEEQKQELREKLIQRGKFRKPLQYILVKWEFYGYEFITDERALIPRA DTEILVEQAKILSLEKENPKILDIGTGTGAIAITLAKEVPEAEVLGIDISERALSLAKEN KEYQFVRNVSFLQSNLFEKLEGKSFDIIVSNPPYIPQEEYEDLMPEVKNYEPKNALTDAG DGYSFYQRIIQEANDYLNEKGYLLFEVGYQQAKQVKQWMEEEKFEDLYIAEDYAGHQRVV LGRKGGEN >gi|224461454|gb|ACDD01000048.1| GENE 40 39029 - 39595 315 188 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 186 7 192 199 125 34 4e-28 KKRRRKLRIIAGEARSRKLKTRKGFETRPTLANVKEALFSMIAPHLEDSVFLDLFSGSGN IALEALSRGAKRAVMIEKDTEALRFIIENVNALGFQDRCRAYKNDVFRAIEILARKGEKF SIIFMDPPYQDNVCTKVLEHIEKFEILGEEGIIICEHHAFEEMAERVGSFQKIDERKYQK KVITFYAR >gi|224461454|gb|ACDD01000048.1| GENE 41 39610 - 40128 942 172 aa, chain + ## HITS:1 COG:FN0342 KEGG:ns NR:ns ## COG: FN0342 COG0652 # Protein_GI_number: 19703685 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Fusobacterium 
nucleatum # 1 161 1 161 167 221 73.0 4e-58 MQLQAMIKTDKGDIRLQLFPEVAPMTVTNFVYLARRGYYNGLKFHRVIPDFMIQGGDPTG TGAGGPGYQFGDEFQKGVVFDKKGILAMANAGPNTNGSQFFITHVPTDWLNYKHTIFGEV VSEGDQVVVDKIAQGDLMNEIQILGDVEDFLQSQEEIVKQLDGIFGEGHEEK >gi|224461454|gb|ACDD01000048.1| GENE 42 40115 - 40330 349 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452579|ref|ZP_05617878.1| ## NR: gi|257452579|ref|ZP_05617878.1| exodeoxyribonuclease VII small subunit [Fusobacterium sp. 3_1_5R] # 1 71 1 71 71 77 100.0 3e-13 MKKNSFEANLEEIDTIIAKMESGELSLEDSIKEYEKAMKLLKKSSDLLENVEGKLYQVMK DQEGELQVEEL >gi|224461454|gb|ACDD01000048.1| GENE 43 40317 - 41189 1224 290 aa, chain + ## HITS:1 COG:FN1327 KEGG:ns NR:ns ## COG: FN1327 COG0142 # Protein_GI_number: 19704662 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 1 290 8 297 297 295 52.0 7e-80 MKNYREYFNKRFGEVLTQYNTPVWMAEGMQYACLQGGKRIRPQLLFMTLSLLGKERDLGF PFAAALEMIHSYSLVHDDLPAMDNDDYRRGQLTTHKKFGEANGILIGDALLTNAFSVMVR GSIGKVPSEKILEIVALFSEYAGIDGMIGGQAMDVAYAGKQISYKTLTFIHEHKTGRLLL LPILVACILGDASLEQREALESYGKKIGLAFQIKDDILDVEGSFEELGKAVKSDEKLNKS TYPSIFGLEKSKELLAETLQEARLVLEKVFKKEELEEFFELTEFMEKRTK >gi|224461454|gb|ACDD01000048.1| GENE 44 41189 - 42220 1085 343 aa, chain + ## HITS:1 COG:FN1330 KEGG:ns NR:ns ## COG: FN1330 COG0809 # Protein_GI_number: 19704665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Fusobacterium nucleatum # 1 343 9 351 351 518 78.0 1e-147 MSTKLHDYDYHLPEELIGQTPREPRDHAKLLLVNRDTKTIEDKYFYDILDYLQKGDVLVR NSTKVIPARLYGQKETGGILEVLLVKRKDLDTWECLLKPAKKLKLGQKIYIGPNSELVAE LLEIQEDGNRILKFSYEGSFEENLDRLGTMPLPPYIVEKLEDQEMYQTVYAKRGESVAAP TAGLHFTEELLQKIQEKGIEIVDIYLEVGLGTFRPVQTEDVLDHKMHEELFEIPEEAAQK INQAKAEGRRIISVGTTTTRALESSVDENGILLAQKKNTGIFIYPGYQFQIVDALITNFH LPKSTLLMLVSAFSEREFILEIYQHAVEEKYHFFSFGDAMFIY >gi|224461454|gb|ACDD01000048.1| GENE 45 42230 - 44206 2089 658 aa, chain + 
## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 1 657 1 660 660 798 59.0 0 MKKIEIKTFLDYSFLSNVRFSKDGKYISYTKTKANLEKNDYEHYVYIYNTGTKETKEYTS LGKEKNVFWINEHQFLFQTSRDAGLQEKIKEGEEWTEYYLMDIQGGEAKVFLQLPYSVTG MQACSSGFIFTANYANYGISLHNLTGEERAKAIAKKKEEGDYEVLDEIPFWSNGAGFTNK QRNRLYFYEKESKKIEPLTPEFMNVEYFKVSGDKVLFIAEEYQGKLEQTNALYEYDVKQK ECTCLLEDGKYNFSFADYMGKDIVCAASDMQEFGINENHKLYFVKDGSLELFYANDTWLL STVGSDCKFGGGKTFHTTDNELYFLSTLEDFSVINCLHRDGRLEFVTEKDGSVDFFDIHG ERLVYGAMKDYGLQELYVKENLKETCITKHNQKILEEYSISKPEKIFMESHGEQIEIYVI KPVHFEEGKEYPAILDIHGGPKTVYGNVFYHEMQVWANMGYFVFFTNPHGSDGRGNLFMD IRGKYGSIDYEDLMKATDIVLEKYPIDKTRVGVTGGSYGGFMTNWIIGHTDRFACAASQR SISNWISKFGTTDIGYYFNADQNQSTPWDNVEKLWSHSPLKYANKVKTPTLFIHSEQDYR CWLAEGLQMFTALKYHGVEARLCMFRGENHELSRSGKPKHRVRRLEEITNWFEKYLKK >gi|224461454|gb|ACDD01000048.1| GENE 46 44215 - 45168 890 317 aa, chain + ## HITS:1 COG:FN0714 KEGG:ns NR:ns ## COG: FN0714 COG1902 # Protein_GI_number: 19704049 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Fusobacterium nucleatum # 4 311 6 314 314 391 63.0 1e-108 MKTIFTPYQIKGISFKNRIVLPPLVRFSLLGTDGKVNQNLLDWYERIAKTEVGLIVVEAT AVEEAGKLRENQLGIWSDEMIEGLSKIVEICHHYETPVFIQIHHAGFKEKISEVSTERLD EILDLFVQAFHRAKKAGFDGIEIHGAHGYLLSQLSSSVWNHREDCYGNRFYFAKQLIEKT RDLFDESFLLSYRMSGNDPEVADGIEMAKFLEKMGVDLLHVSNGVPKEVKQAVKISNYPS GFPFHWITFLGTEIKKAVKIPVIAVYGIKTEEQASCLIEDFDLDFVAVGRAMIFYPNWME KCRKDFEKRMKQKKYEN >gi|224461454|gb|ACDD01000048.1| GENE 47 45158 - 45865 767 235 aa, chain + ## HITS:1 COG:FN0717 KEGG:ns NR:ns ## COG: FN0717 COG1187 # Protein_GI_number: 19704052 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Fusobacterium nucleatum # 1 226 1 226 226 289 69.0 
4e-78 MRIDKFLVECGVGSRKEVKELLKSRKIRVNGLFITSPKENIEEEKDEVYYGEKKLSYQEF RYYILHKKAGYVTALEDSREATVMDLLPEWVIKKDLAPVGRLDKDTEGLLLFTNDGKLNH RLLSPKSHVDKTYHASLECDITEEALEKLREGVMIGEYKTLPAKAEKLEDRKIALTIREG KFHQVKKMLEAVGNKVIYLKRISFGKLVLGDLELGEVKEVSLEDIISFDVEEVEG >gi|224461454|gb|ACDD01000048.1| GENE 48 45869 - 46078 431 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257466444|ref|ZP_05630755.1| ## NR: gi|257466444|ref|ZP_05630755.1| hypothetical protein FgonA2_03266 [Fusobacterium gonidiaformans ATCC 25563] # 1 69 1 69 69 89 91.0 8e-17 MKKMSLLLVLSAVLLTACTTSVGVGTGFNLGGLGVGLSTSAPLKKQKAKTVDEVATEALQ ETKVQEKAR >gi|224461454|gb|ACDD01000048.1| GENE 49 46086 - 47114 1013 342 aa, chain + ## HITS:1 COG:FN0719 KEGG:ns NR:ns ## COG: FN0719 COG4394 # Protein_GI_number: 19704054 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 341 1 345 350 313 51.0 4e-85 MKTRSIDIFCQVIDNFGDIGVCYRLYKELSSLFPETSIRLLLDKTEEFFALCSEYQEISY KTYAEIEAEKESVETAEVIIEAFACEIPDNYLQKAYHNSKLIVNLEYFSAEDWTEDFHLQ ESILGIGTCRKFFFMPGISKKTGGILTKAYYPNLSLQDFGITREDYDLVGSIFSYEKNFT SLFESLQKIGKRVCLCILGEKSQESVRKSLGNFKRYDRIELKFLPFYSQENYEALIQKCD FNFVRGEDSFARALLTGKPFLWHIYPQENDLHFQKLQSFLEKYCPENKALQNTFFSYNRE ETDYSYFWEHFKEIREQNEEFRDYIQKHCNLGIKLKQFIENF >gi|224461454|gb|ACDD01000048.1| GENE 50 47128 - 47691 921 187 aa, chain + ## HITS:1 COG:FN0720 KEGG:ns NR:ns ## COG: FN0720 COG0231 # Protein_GI_number: 19704055 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Fusobacterium nucleatum # 1 187 1 187 187 323 90.0 1e-88 MKIAQELRAGSTIKIGNDPFVVLKAEYNKSGRNAAVVKFKMKNLISGNISDSVYKADDKM DDIKLDKVKAIYSYNDGSFYVFSNPETWESIELKGEDLGDALNYLEEEMELEVVYYESTP VAVEVPTFLERQIEYTEPGLRGDTSGKVMKPARINTGYEIQVPLFVEQGEWIKIDTRTNE YVERVKK >gi|224461454|gb|ACDD01000048.1| GENE 51 47808 - 48119 373 103 aa, chain + ## HITS:1 COG:FN1394 KEGG:ns NR:ns ## COG: FN1394 COG2739 # Protein_GI_number: 
19704726 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 103 1 103 103 91 59.0 3e-19 MDLQEFLEIGSLLELYKNLLSEKQKEYLIEHFEEDYSLSEIATTHNVSRQAVSDNIKRGI KVLNDYEKKLKMFEQKRKLREKLESLQRDFRPEVLKKIMDDLL >gi|224461454|gb|ACDD01000048.1| GENE 52 48130 - 49479 2062 449 aa, chain + ## HITS:1 COG:FN1393 KEGG:ns NR:ns ## COG: FN1393 COG0541 # Protein_GI_number: 19704725 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Fusobacterium nucleatum # 1 449 1 444 444 652 79.0 0 MLDNLGSRFQDIFKKVRGHGKLSESNIKDALKEVKMSLLEADVNYKVVKDFIESIREKAI GTEVLKGINPGQQFIKLVNDELVQLLGGTNARLTKAPKNPTVLMLSGLQGAGKTTFAGKL AKFLKKQNEKVLLVAADVYRPAAIKQLQVLGQQVDVAVYAEEGHQDVLGICERALEKAKE EHATYMIIDTAGRLHIDEALMEELRNIKRLTRPQEILLVVDSMIGQDAVNLAKSFNESLS IDGVVLTKLDGDTRGGAALSIKSVVGKPIKFIGVGEKLDDIELFHPDRLVSRILGMGDVV SLVEKAQSAIEEEDAKSLEEKIRTQKFDLNDFLKQLQNIKKLGSLGSILKLIPGMGQIGD LAPAEKEMKKVEAIIQSMTKQERKKPEILKASRKQRIARGSGTDVADINRLLKQFDQMKT MMKMFAGGKMPNFPNLNGMMSGKGGKFPF >gi|224461454|gb|ACDD01000048.1| GENE 53 49520 - 49783 407 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739055|ref|ZP_04569536.1| SSU ribosomal protein S16P [Fusobacterium sp. 
2_1_31] # 1 87 1 87 87 161 91 9e-39 MLKLRLTRLGDKKRPSYRLVVMEDLSKRDGKAVAYLGNYFPLEDSKVVLKEEEILKFLSN GAQPTRTVKSILVKAGIWAKFEESKKK >gi|224461454|gb|ACDD01000048.1| GENE 54 49856 - 50770 1040 304 aa, chain + ## HITS:1 COG:FN0238 KEGG:ns NR:ns ## COG: FN0238 COG4874 # Protein_GI_number: 19703583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria containing a pentein-type domain # Organism: Fusobacterium nucleatum # 1 304 5 309 310 417 67.0 1e-116 MQSHITGKVLMIRPICFGYNEETAVNNYYQKKDTKSSREIQEEALEEFDTMVEVLRKYKI EVKVLEDTLQPYTPDSIFPNNWFSSHENGSIVLYPMFAENRRLERREDIYDFFNEDKMNI LDYSPLEKEEIYLEGTGSLVLDRKNRKAYCSLSKRADERLLDIFCQDLGYQKIAFHSYQT VEMERKEIYHTNVMMSVGEKFAILCADSIDNLEERAKVIASLEEDGKEIIFITEEQVEHF LGNALELKNEEGVHLCIMSATAEKILTEEQRKSLEKYAVIIPVKVSTIEKYGGGSARCML AELY >gi|224461454|gb|ACDD01000048.1| GENE 55 50886 - 51389 661 167 aa, chain - ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 166 1 166 169 132 41.0 3e-31 MKTLVVYSSLTGNTKKATTWAFEAVIGEKELFSVEEAMKIDTSSYDRIIQGFWVDKGTLD PKSRKFLKQIKGKELIFIGTLGAYPNSKHAIKVMERSKKIAEENNCYLGTCMVQGKMSDV LLKSMDKFPLNLIFRKTEERLERIQVASFHPNEEDKEKIQEFVRNLY >gi|224461454|gb|ACDD01000048.1| GENE 56 51542 - 52294 888 250 aa, chain + ## HITS:1 COG:no KEGG:FN1183 NR:ns ## KEGG: FN1183 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 242 1 243 250 290 61.0 3e-77 MRFILKFQLSTMRIPIEIRRTMISFIKKSLTQAHDGKYYENFFKDTELKDYCFSIIYPLK QFHKNEIELKKPEISVVFSCTEKQNIAFLLMNVFLLQKNKKFPFPDDEYMILKEIVPVRE KEILGNVGIFRSTLGGGIVVREHIKEEKKDIYYSVGDEKFLEKLDWIMKKRFERLGYPKE MIQFSSKLLEGKKVIVKHFGLTFPVTNGIFEIYAPKILLKEIYRTGLGSRLSQGLGMLEY LGPGGEENEA >gi|224461454|gb|ACDD01000048.1| GENE 57 52284 - 53699 1502 471 aa, chain + ## HITS:1 COG:no KEGG:FN1182 NR:ns ## KEGG: FN1182 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: 
not_defined # 1 471 1 511 517 376 49.0 1e-102 MKHNLDEGVYGFDTAISASDWRYSAAIVGLRYYLQEFQKKYEIKKNIEIDGIFDDFFLYS SQDIQENTYLSFVEKFYGEDLPHKALENKLKSSSVFSQEEEKWIKEKMGANTTLKKVFSK IKFTGENKQEVLNLIEENRYTIIKETFRNKKNLYDNYCQSGVLFTEAAKDSVCRVKGYYI DAGKKGKSTAYRFRTDSIIYEDDIIFDFIPFAFTGSTFETVFLNDNVDLDTLYKVNFNVK TFFEKKEQEKVSIQQNLMELLQNQTHPMKYGMEIIYKDREKTHFNTWYLRNESIRIFQKV DIPKINLNMKVGENYRNILKETFQNILNLVRLDLIIDFLLKEREKSNLPSIYFAIKELLK INIEIKNIGGENMEFNKNQKFAYACAKEIVKIFKKNNIEKKLDSYRQKLTSSLIFKDYKR TLDILMQLSNYSGVYFGFLYDFMENPSKNDDIIRMFILELNTENFENKVEK >gi|224461454|gb|ACDD01000048.1| GENE 58 53719 - 54600 1276 293 aa, chain + ## HITS:1 COG:FN1181 KEGG:ns NR:ns ## COG: FN1181 COG1857 # Protein_GI_number: 19704516 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 293 1 300 300 409 76.0 1e-114 MKKNALTLTVVANMTSNYSEGLGNISSVQKVYKNRAVYSIRSRESLKNALMVQSGMYEDL QTVVNGATQKNVTPELNASNCRALEGGYMCTAADTYVRNSSFYLTDAISTESFVNETRFH NNLYLATNYAKANGLNVQANAGDVGLMPYQYEYEKSLKVYSITIDLEKIGKDENFNQEAD KTEKYERVKSILEAIQNLSLVVKGNLDNAEPMFVIGGLSERKTHYFENVVKTEQGALVIG EDIVLKKNKGFRCALLRGDNFSNEEEILKMLQPISMEVFFEDLIKEVKDYYQA >gi|224461454|gb|ACDD01000048.1| GENE 59 54614 - 55723 1092 369 aa, chain + ## HITS:1 COG:no KEGG:CTC01145 NR:ns ## KEGG: CTC01145 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 351 1 348 360 306 52.0 8e-82 MKALRIVLRQTSANYRKTGCLENKMTYPLPLPSTIIGALHNICDYKEYHPMDVSIQGKFS SLSKRAYTDYCFLNSVMDDRGILVKMANGNCLSNSFLRVASSKKPQGNSFKKRISIQVHD ENLFGEYCHLKDISEEIKIKKDTVYKEKLSEFKKKKTELSSQKKLLDKNSEEYKKILEEE KKWKIEEKNYVEEFKKYEEENYTKPIKSYRSLVTSLKFYEILHDIFLILHIRAEEEVLKE IHDNIYRLTSLGRSEDFLEVEDCSVVELQEFQEDIYSKENANTSIYLNRKDVAEEKIFSF EVDSNHSSGGTKYYVNKNYTLEENKRIFTKIPVLYSMNFGAQESSENVKLDFWGTDSEGN KIPVLVNFL >gi|224461454|gb|ACDD01000048.1| GENE 60 55733 - 57934 1809 733 aa, chain + ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R 
General function prediction only # Function: Predicted helicases # Organism: Fusobacterium nucleatum # 2 733 9 812 812 642 47.0 0 MEYYAKPNKTIAQHNFDLQQARECLVRFGYLYSEEENRILREAIEYHDLGKMNEFFQKRV LSQRKIKFNTELEVEHNILSIYMIDPKKYPKEKYYSILYAVLFHHRYSDVVQTMVERKKD IERLLQNFTSYRLPMGLKISSLHTLTNQKTLGLLMKCDYAASGNYQIEYPNDFLEEKLEL WSSKLGILWNDLQEFCYSHKNESIIAIADTGMGKTEAALRWIGNSKAFFTLPIRTAINAI YDRVSRDILERENLEERLSLLHSTSLEYYAKNIAEEELDIFEYHQRGKHLSLPLTICTAD QIFNFILKYKGYEMKLATLSYSKVVLDEIQMYDPSLLAAIILGIKTILELGGKIGIVTAT FPPIVEALMKKEIPDFSFQKQIFHSKNNVIRHNLISYDKRMGTEEMIDLFLRNKKIGKSN KILVVCNTIKDAQAMYDTLLEQEELSPYLHLLHSRFIKEDRARKEKEILAFGKTEIKENG IWISTQLVEASLDIDFDYLFTELQDLSSLFQRFGRCNRKGKKSTKEANCYVYLKTEEGYL KEAGSSYGFIDKVIYHLSREALLGHTGEISEELKTKWIEEFLSYEKLEQSSFLSEFRDAI EEYKNILNSNENTSEELTRLRDIQNVTVIPLPVYQKHEEEIRDLEENLKNPEMTKEEKLR FKEEIMKHTVTVPKYMLENYKKALQDGNVESMPIPPVKMSNYEKVIILECLYDSQRGFQA KKFVEKNINFTFL >gi|224461454|gb|ACDD01000048.1| GENE 61 57954 - 58448 475 164 aa, chain + ## HITS:1 COG:FN1178 KEGG:ns NR:ns ## COG: FN1178 COG1468 # Protein_GI_number: 19704513 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Fusobacterium nucleatum # 1 164 1 164 164 211 75.0 4e-55 MKKEITGIMVYYYEVCQRKLWYFLHEIQMESDNSNVILGRLLEENTYTRDEKKIAIDGII NIDFFRTKKVLHEIKKSKVMEQASILQVQYYLYYLEKKGLTGIKGVLDYPLLKQKVEVEL TGIDRKHLDEILSKIELIMELDIPPDIEKKSICKKCAYFDLCFV >gi|224461454|gb|ACDD01000048.1| GENE 62 58459 - 59451 951 330 aa, chain + ## HITS:1 COG:FN1177 KEGG:ns NR:ns ## COG: FN1177 COG1518 # Protein_GI_number: 19704512 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 330 9 338 338 548 88.0 1e-156 MKRSYFLYSNGTLKRKDNTITFINENEEKKDIPIEMIDDIYIMSEMNFNTKFINYISQFG IPIHFFNYYTFYTGSFYPRETAVSGQLLVKQVEHYLDKDKRIEIAREFIEGASFNIYRNL RYYNGRGKEVKTYMHQIEELRKQLSKVTDVEELMGYEGNIRKIYYEAWNVIINQEIDFEK RVKNPPDNMINSLISFVNTLFYTKVLGEIYKTQLNSTVSYLHQPSTKRFSLSLDISEIFK 
PLIVDRLIFSLLNKNQVTEKSFIKDFEYLRLKEDASKLIVQELEERLKQVIQHKDLNRKV SYQYLIRLECYKLIKHLLGEKKYLSFQMWW >gi|224461454|gb|ACDD01000048.1| GENE 63 59458 - 59736 311 92 aa, chain + ## HITS:1 COG:FN1176 KEGG:ns NR:ns ## COG: FN1176 COG1343 # Protein_GI_number: 19704511 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 92 15 106 106 157 93.0 4e-39 MYVVVVYDISLDEKGSYHWRKIFQICKRYLHHIQNSVFEGELSEVDIVRLKYEVSDYIRD NLDSFIIFKSRNERWMEKEMLGLQEDKTDNFL >gi|224461454|gb|ACDD01000048.1| GENE 64 60441 - 60662 109 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFNYIIYLSSKKRFTFYFSFTLIDEDVILAAISNEFTFYFSFTLISLKTMDFRMQIQFTF YFSFTLITKCYTL >gi|224461454|gb|ACDD01000048.1| GENE 65 60794 - 60958 94 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSFTFYFSFTLIVEEGEVIFDHEEFTFYFSFTLMMIQEYGLIHYLNLHSTLVLL >gi|224461454|gb|ACDD01000048.1| GENE 66 61190 - 61273 215 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVVEGFEPSTPGSEDQCSIQLSYTTV Prediction of potential genes in microbial genomes Time: Fri May 20 02:05:27 2011 Seq name: gi|224461453|gb|ACDD01000049.1| Fusobacterium sp. 3_1_5R cont1.49, whole genome shotgun sequence Length of sequence - 2100 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 8.7 1 1 Tu 1 . 
+ CDS 287 - 1999 2330 ## COG0018 Arginyl-tRNA synthetase + Term 2012 - 2074 5.3 Predicted protein(s) >gi|224461453|gb|ACDD01000049.1| GENE 1 287 - 1999 2330 570 aa, chain + ## HITS:1 COG:FN0506 KEGG:ns NR:ns ## COG: FN0506 COG0018 # Protein_GI_number: 19703841 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 570 1 569 569 817 71.0 0 MLLVNKELSKIFTSTVTKLYSDKEIKEVEITAATNEKFGDFQCNFAMMNSKIIGKNPRMI AEEIQQNLIENEVIEKLEIAGPGFINIFLKEAYLSSFIKKIGKEEFDFSFLDRKGDVIID FSSPNIAKRMHIGHLRSTIIGDAICRIYRYLGYHVVGDNHIGDWGTQFGKLIIGYHKWLD KDAYQRNAIEELERVYVKFSQEAEEHPELEEEARLELKKLQDGDEENYNLWKEFIKVSME EYQKLYDRLDVHFDTFYGESFYHPIMPEVVKELVDKGIAKEDDGAKVVFFPEEENLFPCI VQKKDGAFLYATSDIATVKFRLNTYDVNHLIYLTDERQQDHFKQFFRVTEMLGWDVKKYH VWFGIMRFADGVFSTRKGNVIRLEELLDEGKRRAYEIVKEKNPSLPEEEKQHIAEVVGVG AIKYADLSQNRQSPIIFEWDKILSFEGNTAPYLQYSYARVQSVLDKAKDLGKAATEDTCL ILKDKYERSLANYMTIFPSSVLKAAETCKPNLIADYLYDLSKKLNSFYNNCPILNQEDDI LKSRAYLAKQAGEVIKQGLSLLGIQTLDRM Prediction of potential genes in microbial genomes Time: Fri May 20 02:05:39 2011 Seq name: gi|224461452|gb|ACDD01000050.1| Fusobacterium sp. 
3_1_5R cont1.50, whole genome shotgun sequence Length of sequence - 41448 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 8, operones - 7 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 1/0.000 + CDS 2284 - 4701 2259 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 16/0.000 + CDS 4694 - 7489 3712 ## COG0060 Isoleucyl-tRNA synthetase 4 1 Op 4 1/0.000 + CDS 7499 - 7951 628 ## COG0597 Lipoprotein signal peptidase 5 1 Op 5 19/0.000 + CDS 7966 - 8844 1360 ## COG0752 Glycyl-tRNA synthetase, alpha subunit 6 1 Op 6 1/0.000 + CDS 8848 - 10911 2773 ## COG0751 Glycyl-tRNA synthetase, beta subunit 7 1 Op 7 2/0.000 + CDS 10920 - 11474 669 ## COG0302 GTP cyclohydrolase I 8 1 Op 8 5/0.000 + CDS 11484 - 12311 615 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 9 1 Op 9 . + CDS 12295 - 13146 1180 ## COG0294 Dihydropteroate synthase and related enzymes 10 1 Op 10 . + CDS 13206 - 13517 554 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 13529 - 13572 9.0 - Term 13521 - 13553 4.0 11 2 Op 1 17/0.000 - CDS 13557 - 15041 1607 ## COG0168 Trk-type K+ transport systems, membrane components 12 2 Op 2 . - CDS 15054 - 16412 1451 ## COG0569 K+ transport systems, NAD-binding component - Prom 16637 - 16696 11.9 + Prom 16659 - 16718 13.7 13 3 Op 1 . + CDS 16793 - 19000 2850 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 14 3 Op 2 . + CDS 19023 - 22850 4131 ## FN0498 hypothetical protein + Term 22865 - 22927 -0.9 + Prom 22996 - 23055 9.2 15 4 Tu 1 . + CDS 23076 - 24536 1349 ## FN1654 hypothetical protein 16 5 Op 1 . - CDS 24553 - 25218 547 ## COG0500 SAM-dependent methyltransferases 17 5 Op 2 . - CDS 25208 - 25810 612 ## FN0850 putative cytoplasmic protein 18 5 Op 3 . - CDS 25788 - 26909 1229 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 26932 - 26991 13.3 - Term 26951 - 26999 4.1 19 6 Op 1 . 
- CDS 27011 - 27880 856 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family 20 6 Op 2 . - CDS 27813 - 27968 231 ## gi|257452622|ref|ZP_05617921.1| hypothetical protein F3_06094 21 6 Op 3 . - CDS 27985 - 28416 401 ## CPF_2500 hypothetical protein 22 6 Op 4 . - CDS 28413 - 30392 1743 ## COG0337 3-dehydroquinate synthetase - Prom 30458 - 30517 9.0 + Prom 30411 - 30470 12.3 23 7 Op 1 . + CDS 30492 - 30836 454 ## gi|257452625|ref|ZP_05617924.1| hypothetical protein F3_06109 24 7 Op 2 1/0.000 + CDS 30839 - 32023 1363 ## COG1295 Predicted membrane protein 25 7 Op 3 1/0.000 + CDS 32083 - 33984 2435 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 26 7 Op 4 4/0.000 + CDS 33981 - 36257 1963 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 27 7 Op 5 . + CDS 36270 - 36791 733 ## COG0242 N-formylmethionyl-tRNA deformylase 28 7 Op 6 . + CDS 36820 - 37086 317 ## gi|257452630|ref|ZP_05617929.1| hypothetical protein F3_06134 29 7 Op 7 . + CDS 37083 - 38141 1835 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins 30 7 Op 8 . + CDS 38155 - 38646 709 ## FN0932 hypothetical protein + Term 38703 - 38777 29.2 + TRNA 38686 - 38761 93.2 # Gly TCC 0 0 + TRNA 38771 - 38846 95.4 # Lys CTT 0 0 + TRNA 38856 - 38930 66.8 # Glu TTC 0 0 + TRNA 38934 - 39009 97.4 # Val TAC 0 0 + TRNA 39025 - 39102 93.9 # Asp GTC 0 0 + Prom 39031 - 39090 80.4 31 8 Op 1 . + CDS 39281 - 40615 1405 ## COG1373 Predicted ATPase (AAA+ superfamily) 32 8 Op 2 . 
+ CDS 40612 - 41349 927 ## COG2071 Predicted glutamine amidotransferases Predicted protein(s) >gi|224461452|gb|ACDD01000050.1| GENE 1 109 - 2283 1713 724 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 1 721 2 720 764 664 49 0.0 TMDNIFSKVAKELLLRENQVESTVKLLDEGATVPFISRYRKEITGNLDEVQITDILEKVQ YLRNLEKRKEEVLRLIEEQGKLTEELTKAIQVAEKLQEVEDIYFPYRKKKKTKADVAIEK GLEPLADFFLLAHSVQEIETKAQEFITEEVPNIEEAIEGAKLIWAQKVSEKAEYRERIRE ILLKYGKMDSKESKKAKELDEKAVYQDYYEYSESLAKIPSHRILAVNRGEKEGILSVNLS LEEKEKQHVESLLLRSFTKEVELYELFHSIIRDAYDRLLFPAVEREVRNILTDKAEEEAI LVFRENLKNLLLQAPLHEKTILALDPGYRTGCKVAILDKHGFYQENDVFFLVEGMHHEKQ LEDARKKALKYIKKYGIDLVVIGNGTASRETESFVAKLIREEKLKIQYLIANEAGASVYS ASKLAAEEFPDLDVTVRGAISIGRRIQDPLAELVKIDPKSIGVGMYQHDVNQGRLDESLD QVITTVVNNVGANLNTASWALLSHISGIKKTVAKNIVEYRKENGNFTKRESLLKVKGLGP KAYEQMAGFLVIPEGDNILDNTIIHPESYHIAEKMLKEIGFSLEEYDKNLGEAREKLKTV KVEEFAEKHNFGLETCKDVYEALRKDRRDPRDDFQKPLLKSDILSIENLSVGMELEGTVR NVVKFGAFVDIGLKNDALLHISAISDEFVSDPSKVLSVGQIIKVRIKEIDKERGRVGLTR KKEL >gi|224461452|gb|ACDD01000050.1| GENE 2 2284 - 4701 2259 805 aa, chain + ## HITS:1 COG:FN0066 KEGG:ns NR:ns ## COG: FN0066 COG0642 # Protein_GI_number: 19703418 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 87 805 17 737 737 395 34.0 1e-109 MQIKRDSLLLRIIFYNDIAIIFTSLALALVFSLMVFSSMEQRLADTAREKVFLLYKAYVT EAKDSRETFRMVTNSVFQLAGTGIVENNRVYYDQIAKNISHELMKHSYERYSNSRVTLVN GEGTLLGRNSSERSYQILERSFIQEWENSKYDRKDMFFYKKDNRLFFRFITTFYESDFQN NVFVILDLPMSSYSIENLREFIGMNEEDKILVSVSGHYYYGDLDYETGGELLNSFQITRF APEGFEYFFNQKEINKHAYYMAFYKIRDLDSKYIASLGVAISKEKFLTTKYMVSALMIFI VSILIVISTTVCTKLFAKLLEPLTAILDAVYDIGRGNYKINLEEDVVYEIRNLSNAMEKL AKNISLKENQLKLHNDSLEKNLNRIDAIQKILMGVNLEQDFQLGMKGFLSALTSEAGLGY SRAIYLEYDREKNILQAKDFACNSSLVAECLEDKEKLKVFSFQLQEIDRILPLLKVPCDS QNYLGKSLNENRILYENDKAYRFPFGNDLFHSLGISHFIILPLYRSENMKSCILLDYYIR 
EREITQEEIELLTLLLLNVNIQLKNKEVEDRKLHFERTSTMEKMSVHFMKGREKLFSRIE SLVDKVEKNGYNKKITLEEITRLKRDFHKIKFDHSILEEYSNFSKKHFEMISVEEFMKEL AKYVQGYMDKYEINFSQFISCNGYFYADKSKLFKAFIELLKNSSEAILTRNRLDKKINIV AIEDKKSNQILINIMDNGIGMYPEEVKEINKAFESYQETTAMGLGLSIVSMVIHEHKGSI AVTSKLDEGTDVKIILNIYKGEQHE >gi|224461452|gb|ACDD01000050.1| GENE 3 4694 - 7489 3712 931 aa, chain + ## HITS:1 COG:FN0067 KEGG:ns NR:ns ## COG: FN0067 COG0060 # Protein_GI_number: 19703419 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 931 1 933 933 1422 71.0 0 MSEKDYAETLHLPKTNFQMKGNLPNKEPNYIKKWEDNKIYEKGLAKGKESFILHDGPPYA NGNTHIGHALNKILKDIIIKYKTLQGYKAPYIPGWDTHGLPIELKVMEKLGSKAKDMTAL EIRQLCKEYALKWVDIQREEFIRMGVIGKWEDPYLTLKPQFEAKQLQIFGELYANGYIFK GLKPIYWSPVTETALAEAEIEYHDHTSPSIYVRMKANSDLLEKISLTEEAYVVIWTTTPW TLPANVAISLNPDFDYGVYKTEKGNLILGKDLAETAFAEMGIENPELVKEFKGSTLEMTS YQHPFLDRTGYIILGTHVTADAGTGCVHTAPGHGQEDYVVGCRYNMPIVSPINYKGYLTE EAGPLFAGLFYEKANKAIIDHLTETGFLLKMKEITHSYPHDWRSKTPVIFRATEQWFVKA EGSDLREKALRALDDVEFIPAWGRNRIGSMLETRPDWCISRQRVWGVPIPVFYNEETGEE IFNQDILNHVISFVEKEGSDAWLLHTSEELIGEENLKKYHLEGISLRKETNIMDVWFDSG SSHRAVLETWEGLRWPADLYLEGSDQHRGWFQTSLLTSVGSRGVAPFKKILTHGFVNDGK GEKMSKSKGNVVAPEKIIKQYGADILRLWCASVDYREDVKISDNIVKQMAETYRRVRNTA RYILGNSYGFDPKKDAVPYQDLLEIDKWALHKLEMLKKSVGESYEKYEFYNVFQEIHYFA GIDMSAFYLDIIKDRLYTEKEDSIARRSAQTVMIEILMTLVKMIAPILSFTAEEIWEHLP ETLRDQESVLLTDWYVMKEEYINEEIAEKWSKIQKVRKDANKLLEKARQGENRIIGNSLD AKVQCYTEDAGLKAFLENNHETLEAALIVSQVEILSEKTENFVAGEEYKELFLQVLHADG EKCDRCWKYSTNLGTKEDHPHLCPRCSSVVE >gi|224461452|gb|ACDD01000050.1| GENE 4 7499 - 7951 628 150 aa, chain + ## HITS:1 COG:FN0068 KEGG:ns NR:ns ## COG: FN0068 COG0597 # Protein_GI_number: 19703420 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Fusobacterium nucleatum # 1 150 14 163 165 155 58.0 2e-38 MIYIILFVMLLVLDQFTKYIVEQSFYLSESIPIIDEVFNFTYVENRGIAFGLFQGRLSII 
SILTVVAIVAIFIYVLRNKKTLSILEHFGYTLILSGAVGNMIDRLFRGFVVDMLDFRGIW SFVFNLADVWINVGVFLLIVDYLILRRNEK >gi|224461452|gb|ACDD01000050.1| GENE 5 7966 - 8844 1360 292 aa, chain + ## HITS:1 COG:FN0069 KEGG:ns NR:ns ## COG: FN0069 COG0752 # Protein_GI_number: 19703421 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Fusobacterium nucleatum # 1 290 1 290 290 560 90.0 1e-160 MTFQEIIFALQKFWGSHGCVLGNPYDIEKGAGTFNPNTFLMSLGPEPWNVAYVEPSRRPK DGRYGENPNRVYQHHQFQVIMKPSPINIQELYLESLRVLGIEPEKHDIRFVEDDWESPTL GAWGLGWEVWLDGMEVTQFTYFQQVGGLELDIVPVEITYGLERLALYIQNKENVYDLEWT DGIKYGDIRYQFEFENSKYSFELASLEKHFAWFDQFEEEAGKILDEGLVLPAYDYVLKCS HVFNILDSRGAISTTERMAYILRVRNLARRCAEVFVQNRKDLGYPLLKKEAK >gi|224461452|gb|ACDD01000050.1| GENE 6 8848 - 10911 2773 687 aa, chain + ## HITS:1 COG:FN0070 KEGG:ns NR:ns ## COG: FN0070 COG0751 # Protein_GI_number: 19703422 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Fusobacterium nucleatum # 1 685 1 686 686 740 59.0 0 MRLLFEIGMEENPARFLEPALAELKKNFMNKCKAERIAHGEVKMYGTPRRLILCAEEVAE QQEDLNELNIGPAKSIAFLNGEITRAGIGFAKSQGIDPVDLEIIQTDKGEYIAARKSLQG QATKTLLPELLKSLVLELSFPKSMKWSDLKIRFARPIEWFLAMADSEVVEFEIEGMKSSN HSKGHRFFGKEFTVNSVEDYFVKIRENNVIIDIQERKKMIREDILSKIAEDEQVVIEEGL LSEVTNLVEYPYPIVGTFNSDFLEVPQEVLIISMEVHQRYFPILDKNGKLLPKFVVIRNG IEDSDNVRIGNEKVLSARLADARFFYKEDLRNHLENNVEKLKHVVFQKDLGTIYQKITRT QEICEILLAKLHLEEKRETVLRTAYLAKADLVSNMIGEKEFTKLQGFMGADYALKFGEKE EVSKGIREHYYPRFQGDELPQVVEGILVGIADRLDTLVGCFGVGVIPSGSKDPFALRRAA LGIVNIILNSKLDLSLRELVNASLDTLAKDGVLKRDRAEVEKEVMEFFKQRLINVFSEKM DRDIVAAVLEVQSEDAMDAFTRMQALKAFLTEEGAKDLLDLAKRVGNISKEAKSREVNVT LFQQEEEKELYHYTEKTKMEIESLVSDKNYAGYLAAVLASKEIVTKYFNAVKVMDENVEV QNNRISQLGLLSSLYQKLADLSVLEER >gi|224461452|gb|ACDD01000050.1| GENE 7 10920 - 11474 669 184 aa, chain + ## HITS:1 COG:FN0071 KEGG:ns NR:ns ## COG: FN0071 COG0302 # Protein_GI_number: 19703423 # Func_class: H Coenzyme transport and metabolism # Function: 
GTP cyclohydrolase I # Organism: Fusobacterium nucleatum # 1 182 5 186 187 236 64.0 1e-62 MDEKRIAKAFEEILEAIGENRNREGLEETPIRVAKSYQELFSGIGQDPRKVLQRTFNVKK NDYIIEKQIDFYSMCEHHFLPFFGKIDIAYIPNGKILGFGDLLKLVDILSKRPQIQERLT EEIATYLYEELRCQGVFVRVKAKHLCMTMRGEKKENTEIITVSSNGVFEMDSQKRFEVLQ LLNS >gi|224461452|gb|ACDD01000050.1| GENE 8 11484 - 12311 615 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 262 1 261 278 241 45 5e-63 MDKIYIEKLEFRAYHGVFPEEKKLGQKFIVSLELELDTREAALSNNLDKTLHYGLISERV ESLVLEKSYDLLESLAEKIAETLLLEYPVLQGIKVRVDKPQAPIPLSFQTVAIEIYRSWH RVYLSLGSNLGDKKGNLDRAIEEISSLVHTEVIRKSSFLETEPFGYLEQDTFVNACIEIK TLLTAKEVLKACLGIEEKMGRKRLIKWGPRNIDIDILFYDKEIYDENDLVVPHPWIEERM FVLEPLCEIAPNYIHPILKKTIFMLKRGVEHETTL >gi|224461452|gb|ACDD01000050.1| GENE 9 12295 - 13146 1180 283 aa, chain + ## HITS:1 COG:FN0073 KEGG:ns NR:ns ## COG: FN0073 COG0294 # Protein_GI_number: 19703425 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Fusobacterium nucleatum # 2 273 5 275 277 357 63.0 1e-98 MKLHCRGLELELGKRTYIMGILNVTPDSFSDGGKYNHLDAALQHAQEMIEEGADILDIGG ESTRPGHIQISEEEEIARVVPVIQALRKQFPTILLSIDTYKWRVAEAALKAGVHILNDIW GLQYDKGEMANLAKEYEVPVIVMHNQNTEEYQEDRIQALRKFFQKSFEIAEKANLSRECL ILDPGLGFGKGFQGDVEILGRLSELRDMGPILLGTSKKRFIGTLLEGLPSEERVEGTTAT TVIGIQQGVDIVRVHNVKENKRVAMVADAIYRKDYLCDNITYK >gi|224461452|gb|ACDD01000050.1| GENE 10 13206 - 13517 554 103 aa, chain + ## HITS:1 COG:FN0093 KEGG:ns NR:ns ## COG: FN0093 COG0526 # Protein_GI_number: 19703445 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 1 103 1 103 103 132 68.0 1e-31 MAIVHVTKENFKQEVLEANQPVVVDFFATWCGPCKSLSPVLEDVVAEDSFKKIVKVDIDA EPELASEYKIMSVPTLLLFKHGEVVEKSVGLIQKDEVKALFSK >gi|224461452|gb|ACDD01000050.1| GENE 11 13557 - 15041 1607 494 aa, chain - ## 
HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 479 1 479 483 511 60.0 1e-144 MNNKMIRYVLSNILKLEAVFMIVPLILSIYYQEKTLVSLAHLFTILLLIGTAYLLSKKQP ENVQIFAKEGLFIVAFSWLALSFFGALPFVISREIPSFVDAFFEVVSGFTTTGASILSNV EALSHSLLYWRSFTHFVGGMGVLVLALAILPKNNNQSLHIMKAEVPGPTVGKLVSKMTYN SRILYMIYIFLTLLITLFLYLGGMPLFDSVLHTFGTVGTGGFGIKNTSVAYYHSAYIEYV LAIGMLLSGMNFNLFYALLLRNFKQVLHNEELKYYLSIVAFAILAICIDNYNQYDNIEQL FRDSLFTVSSIMTTTGFSTINFDTWSVFSKTILLLLMMIGGCAGSTAGGMKVSRFIVLFK TFIYEFKKTYSPNRVFRLKMDGRALSQELIMSIRTYLILYLSLFFLLLLCVAPESPDFIS ACSAVAATFNNIGPGFGTVGPTMNYSHFSNFNKIVLSISMLLGRLEIFPILLIFSPEIFT PFFKKIKSLFQTEK >gi|224461452|gb|ACDD01000050.1| GENE 12 15054 - 16412 1451 452 aa, chain - ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 452 1 451 452 421 55.0 1e-117 MKIIIVGAGKVGELLCNDLSNEGNDITLIEENQKVLDQVLASSDIMGLVGNGANCEILKE ANIEKADIFIAVTQSDEINIISSVMAKKLGAKYTIARVRNTEYSSQIQFMSDSLGIDRML NPESEAAFFILKNLEFPKALNVESFSGNTVNMLEVLIEENSYLDHLKLIDFKNHYFKSIL VCIVKRNQEVHIPTGNFILQAGDRIYVTGIQAELSEFYKSLGHSEEKIKSVAIIGAGRIT YYLTSLLLEQKMNLKIFEINEEKANLLSETYENANVVWGDGTDSTLLEEEQFSSYDACIS LTGIDEENVILSMYANKVGIKKTITKINNSSLFHLLDFSELQTIVTPKKLIADYIIKTVR SFINSENEENIETLYRLAENRVEAIEFKVPEDSDVINIPLKNLNIKDNLLIAYIIRNNQA IFPGGMDIILPEDRVIIVTTEKYLNHVNKILK >gi|224461452|gb|ACDD01000050.1| GENE 13 16793 - 19000 2850 735 aa, chain + ## HITS:1 COG:FN0499 KEGG:ns NR:ns ## COG: FN0499 COG1629 # Protein_GI_number: 19703834 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 7 735 1 743 743 929 68.0 0 MRKMLFLIGALLSISAFAEQTVELGSTSIKGNRKADYTLTPKEYKNTYTITQEKIQERNY 
KNVEDVLRDAPGVVIQNTAFGPRIDMRGSGEKSLSRVKVLVDGVSINPTEETMASLPINS IPIETVKKIEIIPGGGATLYGSGSVGGVVSITTNSNATKNNFFMDLNYGSFDNRNFGFAG GYNVTDKLYVNYGFNYLNSEDYREHEEKENKIYLLGFDYKINAKNRFRVQTRYSKMKHDG SNWLSQDELKTSRKKAGLNLDLDTTDKSYTFDYEYRPTENLTLAATAYKQQQDRDITTDD IRDIEIVASNRNYTDLKEYMTFYDVKSTLKAKFKEEKHGIKLKGKYEYGNGEVIFGYDYQ DSNNKRNSLVQSETLKTYNDRISDLNLDPTDRKPIVNRVDIDLTKKSHGFYAFNKLELGK KFDFTTGFRTEITEYNGYRKNGPNTMPIISPKTNEIKTNEKMTNYAGEAGMLYKYSDTGR AFVRYERGFVTPFANQLTDKIHDTELKNPGGFFTPPIVNVASLYVANNLKSEITDTIEVG FRDYIFDSLVSASFFATDTTDEITLISSGITNPAVNRWKFRNIGKTRRLGIELEAEQKWG DFEFSQSLTFVDTKVLKTDKESNIYRGDKVPMVPNIKATLGLKYNVTDNLSLIGTYTYLS KRETRELDEKDKVYKHTIKGYGTADLGVLYKVDKYSNFKVGAKNLFGKKYNLRETKLEAL PAPERNYYLEFNVKF >gi|224461452|gb|ACDD01000050.1| GENE 14 19023 - 22850 4131 1275 aa, chain + ## HITS:1 COG:no KEGG:FN0498 NR:ns ## KEGG: FN0498 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 738 1275 1 583 583 389 44.0 1e-106 MKNRLLILCLASLASISYANEGKAYEGPAHYRVIETQEHIVPLERGAYEDLLRVVDEQNR QKGISSNYLKKGRDGRHLGDIDLVPVAQFAGDENDYKVELVSKKSSKLSGYHSVHELLEG KDKKLKEDGTFEKLSYSREGNQKRFYFGNGNVVKDITITGTEKFDEKLRETKKQKNDRYI IEGVYKRPFKDRDQLGISVDDYKKNIEGQSREKALKYIKQKLEERLGTSSKYKFEIKNGE LYAKDSSGKEWKVLLHIEPVSVPEIRYGSTKKEYKDDIFTNIYLYTPTSSSDDKKDSSGR VLYTKDNNIVVEDKFKYLDNVVEFDSKKETIKKEYEKDKKTMSNEEFKKKWVTPFEKGGE FEKALISFTKDLKLASDEKEQVDQRKNAARKSKEKIENDKNWPKDLYSFQLKYMNEKEKE ETFKKYPKASELLKEWFEQNKIYDEADKKSDELSEKISSEIPKKHGFYDGWKPKKEENKW LKGVVANKDLTRKYLGKNVEFRGQGRIEGTVDLGEGNNELTIKEQFTGRYGTNIVLGPKA ALKNIKYVNVAGAIGDSSHSSLSGRTSLTLDIDPSVANEKGHLTQHAFKNSDPNIVFRGL GSDITSDNRNDFYMELMASRIAKNSVVDMGRKLKYQTQDFHNPAKKIDMEIKMISDSIAH TIENKEEKEKENSLIEVKIKDKIKALNEQENAVYQSIHRSGRLDILQPTLTTTNKKTTFN VADDDREEKKKTKLIHMIKTASPEEVIEKVGQFHLSESSKKDAMERIRKIATSENMKKLK EKTEQFKELASSTEYQKLDFLRKSEEVENLNSGETWQALRQEIYDKATIERKIEEVKKVV NAIDQENIQKLAEKYPEIETLKKISSNLESLKETLASIKGKEIDIKSTTIIQSLFSTFNS LGTNMKKQALMTEDSLDNETAHTFESYETGRREYAELKNILFYSSREEEALSELKNVISQ LQERNIYSKLNKVAKNEISTYTNIPFDIDHSLLDKKSVYTRGGFISSRTVQKNFKGNIYT 
GYGIYEQEYDKGLRLGAIFGGANTDHTETYSRTLRTVATESNIKGVSAYAGAYVNKTLYT PNLEWISGLGLQYGYYTVKRQVKNNYQELMSKGKPQIGAFNTYTGFVYTHSLQNDLILRG KGILSYSLVHQGKVKEKDGLNLDIEAKDYHYVDGELGVSLAKTLYDDSKKSTLSAGISGI FGLSGYDNKALKAKIHNSNSSYDIVGDKVKKDAVKIYLDYNMQLDLGFNYGLEGTYITNN KQSDVKIGLKAGYAF >gi|224461452|gb|ACDD01000050.1| GENE 15 23076 - 24536 1349 486 aa, chain + ## HITS:1 COG:no KEGG:FN1654 NR:ns ## KEGG: FN1654 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 2 474 30 501 571 469 49.0 1e-130 MKQLPIGVSDFRDLIVGNYYFIDKSAFIQEIVRDGAKVKLFTRPRRFGKTLNVSMLKYFF DITNAEENKKLFQGLSIESSPYFKEQGKCPVIYLSLKDIKEANWQDCNRRMRKLLSDLFD EYKYLRDSLDQRDLKNFDAIWMEELNGNYFDALKDLSKYLSRYHQKKVVILLDEYDTPIV SAYENGYYQEAIIFFRTFYSAALKDNLFLELGVMTGILRVAKEGIFSGLNNLAVYSILNE RYSSCFGLTEIEVQEALEYYQLEYNLQEVKKWYDGYCFGNVEIYNPWSIINYISNRKVGA YWVGTSNNVLVYDLLEKSGNDIFEDLQLVFQGESLFKTLDYSFSFQDMTNPNEIWQLLVH SGYLKVQRIDEGEKYAISIPNLEIYSFFEKSFLNRFLGGIDLFQEMISELKRGNIVFFER KLQSILLHSMSYHDISTHEKYYHNLVLGMLLSLTKEYHIHSNQESGYGRYDLILGLYNKS EIKKKS >gi|224461452|gb|ACDD01000050.1| GENE 16 24553 - 25218 547 221 aa, chain - ## HITS:1 COG:FN0851 KEGG:ns NR:ns ## COG: FN0851 COG0500 # Protein_GI_number: 19704186 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 1 215 1 218 222 175 43.0 5e-44 MTFEKHFNSYEENAIVQKKVAKHLASFFKNILNPPKTILEIGCGTGIFSRELVRYFPNAS LSLNDIFDTSAFFDNISYEKFIVHNAETMSLDSYDLISSSGCFQWFTDLQTFLEKLSSHT NCLVFSMFLEDNLKEIKDHFQITLAYPSVSETIHTLRKHYSKVEYQEEIFEIDFPTPLAA LRHLQATGVTGIGETNIRKIRSYPHKKLTYRVGYFKAERAN >gi|224461452|gb|ACDD01000050.1| GENE 17 25208 - 25810 612 200 aa, chain - ## HITS:1 COG:no KEGG:FN0850 NR:ns ## KEGG: FN0850 # Name: not_defined # Def: putative cytoplasmic protein # Organism: F.nucleatum # Pathway: not_defined # 1 193 1 194 196 114 37.0 1e-24 MRWIFFFNGWGMTEDAFPHLSLEQVEVINYPYDIQEIKDLLEHHKNDTLYAVAWSFGAYY 
FSKLPKEIQNHFHKKIAINGLPETLGSYGILPKMCKFTLENLTPESLRSFYKNMDFHGNI SKKFTDIQEELAFFYENYQKPENPFDFAWIGENDRIFSEKKLIRYYEKERVPYQCFFGGH YPFHFFQNFFELLGDTKNDL >gi|224461452|gb|ACDD01000050.1| GENE 18 25788 - 26909 1229 373 aa, chain - ## HITS:1 COG:FN0849 KEGG:ns NR:ns ## COG: FN0849 COG0156 # Protein_GI_number: 19704184 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Fusobacterium nucleatum # 1 369 5 375 381 391 57.0 1e-108 MKLTEMQEELNQFEQEGRLRKVETKPANMTNFSSNDYLSLAGQIPLRQKFYEEYPCLALS SSSSRLIDGSYSIVMDLEKKLEEIYGKSALCFNSGFDANSSVIETIFPKKSLILTDRLNH ASIYDGIIASNSKFLRYSHLDMKALEKLLKKYQNDYEDIVIISESIYSMDGDCADLEALV SLKKQYNAQLMIDEAHSYGVYGYGIAYEKKLVSEIDYLILPLGKGGASMGAFVLCDEVAK KYLINRSRKFIYSTALPPITHAWNYYVLTHMQDFQEEQEALFRKEKLLYQLLQEEKIATT SSTHIVSIVIGNNEKANALSKALFQKGFLIQAIKEPTVPKNTARLRLSLTSAIPEEEIKR FVKELRHEMDILF >gi|224461452|gb|ACDD01000050.1| GENE 19 27011 - 27880 856 289 aa, chain - ## HITS:1 COG:FN0662 KEGG:ns NR:ns ## COG: FN0662 COG0010 # Protein_GI_number: 19703997 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Fusobacterium nucleatum # 1 285 30 314 318 437 71.0 1e-123 MEQKVEEKKICFVSFNSEEGIRRNFGRLGAAEGWIHLKKAFANFPVFDPDIHFYDLKTPI DVVNGDLEAAQYELSMTVSMLKNKNFLVVCLGGGHDIAYGTYNGILKYAQSKELDPKIGI ISFDAHFDMRSYEKGASSGTMFLQIADDCEREGRVFDYNVIGIQKFSNTKRLFDTAKHFG VNYYLAEDISKLNEFNIDPIIKRNDHIHLTLCTDVFHITCAPGVSAPQSFGIMPDEAMRL LNIISSHTKDLTIDVAEISPKFDFDDRTSRLMANLIYQTILNHFEVSFK >gi|224461452|gb|ACDD01000050.1| GENE 20 27813 - 27968 231 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452622|ref|ZP_05617921.1| ## NR: gi|257452622|ref|ZP_05617921.1| hypothetical protein F3_06094 [Fusobacterium sp. 
3_1_5R] # 1 51 15 65 65 100 100.0 3e-20 MYWTGRCDGEEADVLRIHQVVKKIDIGRIDGTKSRGKENMLCQFQFRGGNS >gi|224461452|gb|ACDD01000050.1| GENE 21 27985 - 28416 401 143 aa, chain - ## HITS:1 COG:no KEGG:CPF_2500 NR:ns ## KEGG: CPF_2500 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 25 142 72 189 190 66 28.0 3e-10 MKKDLAALLQNNPEFDLSICKILNDKQLESLENNIVLFLHGKTLDMGTKDLSILLLKELL CSLSQQELLPKTIILSQKTVLCNQRESEFLHFFRLLEQKNIEILTCKTSAEYYKINKNIP IGHFASMEEIIEKLFHASKIIQW >gi|224461452|gb|ACDD01000050.1| GENE 22 28413 - 30392 1743 659 aa, chain - ## HITS:1 COG:FN0871_1 KEGG:ns NR:ns ## COG: FN0871_1 COG0337 # Protein_GI_number: 19704206 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Fusobacterium nucleatum # 15 347 8 347 350 307 46.0 4e-83 MKEILIHTKTDRYPILIGSNFLTKLHSFTQKYDKILFLTNDTLFSHYSNLYEEKIASSKT EYYVLPDGEQYKNLDFIQKIYNVMLEKHFSRKSCILCFGGGVVCDMGGFVAASFMRGIDF IQIPTSLLAQVDASIGGKVAVNHPFGKNLIGFFYSPKAVLIDVSLLHSLHEVQFQSGMSE VIKHSILCPNNAYSDFLVKNQKEIQEKEEATLISLIEQSCQIKKYYVEEDMQEKGIRAFL NFGHTYAHALENLYHYEHISHGEAVAKGCLLDLFTSYQKGLLPLEYFEKIKNLFDDYSID STPVLFPFQNLWEAMEQDKKNAFSKINTVYLKKQENKKEFLLQELDKQATQGYLSQENHH ETKAVIDIGTNSCRLYIAEWSPKEHKIIKHLYQEVQIVQLGEKVNETKFLQESAIKRTLD CLIHYQNIIKQYACSTIYCFATSATRDAHNREYFIQKVLKNTGIQIHCIPGETEAEYNFR GVSLAIDGQILIVDIGGGSTEFTLGNHGEILFSKSLNIGAVRATELFFQEENYSFKNIQN CKHWILEQLKEIETIRQKDFVLIGVAGTATTQVSVAKKMKNYTRELVHLSEISNSQLEQN LSLFLSKSLEERKKIIGLEAKRANVIIAGTIILQTIFQYLGKETMTISEFDNLMGAMIL >gi|224461452|gb|ACDD01000050.1| GENE 23 30492 - 30836 454 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452625|ref|ZP_05617924.1| ## NR: gi|257452625|ref|ZP_05617924.1| hypothetical protein F3_06109 [Fusobacterium sp. 
3_1_5R] # 1 114 1 114 114 204 100.0 1e-51 MKKILAFFLLLSSMSFAVEIYPETYAMQKMIPQLEKGKRYVGSSSYEAMEQIVAVPMNQN IQKALGTGDTSIYFIDSNGNTVKAGPEDYIVAPKSLSRIYVLSKQQLQENYRGQ >gi|224461452|gb|ACDD01000050.1| GENE 24 30839 - 32023 1363 394 aa, chain + ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 17 394 2 395 396 275 42.0 1e-73 MTREDFHEIYSGVGTSIKHALWKYKEANSNLWVTSLCYYTVLSMIPIFAILFSIGTWLGL GEYLLKQIDNHSPIKGEAIQLLLTFTDNLLTNARSGVLAGLGFLFLIWSLISMFSIVEKA FNDIWDIDATRSFVRKISDYLTFFILLPTLILVSNASSLLIQNDFLSKILPYFSVLLFFM ALFMVMPNTEVKWLPAFVASFFTSVMFSIFQYAFIYLQVLINAYNMIYGSFSVIFIFLIW LRIAWFLIILGAHLSYLLQNRDINLYCDSLSIDEINFQSKFSLAVHLLAVMVRRYQKEES LVTRAELTARFHNVIAIDGVLRILKKGNFILEGKNEKQEKVYSLAKNIEKTRLEEVYFVI SSYGKMIEDIEYEIISAKRLQTRLCELGGYEEKE >gi|224461452|gb|ACDD01000050.1| GENE 25 32083 - 33984 2435 633 aa, chain + ## HITS:1 COG:FN1155 KEGG:ns NR:ns ## COG: FN1155 COG0768 # Protein_GI_number: 19704490 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 1 633 62 711 711 627 51.0 1e-179 MLIGLRLCQVQIAQKGKYSSSILKQVQGKDEEIGERGNILDRNGKQLAFNKRQYMVIIDP SKIHINEDQFLKNLQEIADKKILDLDATFFEKLEKEFEANKKYKVIAKKIDDVTREKIKE CLSKIPEERKYLFFKKEIEREYYRKDIYETLVGMVRFTKNSKEKKQGVFGLESRYERYLA GKTLSRDKYYSRDKKKVLPTSMEWMYTNLNGNNIYLTIDNEINYILNDEIKAQYDALNAE EAYGVIMDPKNGKILGISTYTRNPKDLRNQVFQNQYEPGSIFKPIIVASALDEGLIQKNS TFNVGNGSIVKYRHTIRESSRSTTGILTTTEVIKKSSNVGMVLIGDHFTEEQFEKSLRKF GLYEKTGVDFPNEIKPYTTPHEKWDKLKKSNMAFGQGITVTPIQMITAFSAVVNGGKLFR PYLVEKVVDDDGVVLRRNVPKVVRQVIKPEVSDMLVGMLEETVANGTGSRAKVEGYRVGG KTGTAQLSSNGRYLAHQYLASFVGFFPVENPQYVILVMILKPQAESVFGRYGGTASAPVV GNIIRRISKIKNVSSQEVSKIISSNKEIVDEKQDLVLGESMPDLKGLSPKEVMNLFQTTN YDIHIVGTGLVVRQEPAAGKSLEDVDKIEVILE >gi|224461452|gb|ACDD01000050.1| GENE 26 33981 - 36257 1963 758 aa, chain + ## HITS:1 COG:FN1156 KEGG:ns NR:ns ## COG: FN1156 
COG1198 # Protein_GI_number: 19704491 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Fusobacterium nucleatum # 1 758 1 766 766 705 51.0 0 MIYYQLYLEKNKGLYTYMDEKEEYHIGESVFVSFRNRKQVAYIIAKDSRKEFSFKVLPIL GKTEFPNLPPVLVEVARWMVRYYVTSYEAVLKNIIPKDIKIKKKIFYSLSSPMVLDIPKE LLDSFREYSSVSKVTLRKYVSLEEIKQSITEQEIIEVSKNRYIWNETKEKRGLLGSYFFQ KGQMPALKLIERFSKVEVEEFLKKHYLEEQDRFESGLSSVGDFSSSLSFRDVNLNEEQKK AVDRITKGEHFFYLLKGVTGSGKTEVYLSLIRKAFQEGKGSIFLVPEISLTPQMIERFQD EFQENIAILHSKLTSKERAEEWLQLYQGKKRVVLGVRSAIFAPVQNLQYIIIDEEHESSY KQDNNPRYHAKQVALKRAMLEKAKLVLGSATPSIESYYYAKKGLYQLIELNERYNQAKMP EIELVDMKEEKDLFFSEKLLEEIRNTLLRKEQVLLLLNRKGYSTYIQCQDCGHVEECDHC SIKMSYYASKGIYKCNYCGKVVKYTGRCSACGSEHLIHSGKGIERVEEELKHYFPDISIL RVDGDQKGNQFFERAYHDFLDEKYQVMIGTQLIAKGLHFPNVTLVGVINADMILNFPDFR AGEKTYQLLAQVAGRAGRAEKNGKVIIQTYQSEHYAMDKVREHDYEGFYEKELEARDFLE YPPFAKMILLGLSSRDEEYLKIKSEEIFKRIPQEQVDLYGPIPCLVYRVKDRYRYQIFIK GNREKIEEYKKLLRKVLIEYQQDENIRISIDAEPLNMI >gi|224461452|gb|ACDD01000050.1| GENE 27 36270 - 36791 733 173 aa, chain + ## HITS:1 COG:FN1157 KEGG:ns NR:ns ## COG: FN1157 COG0242 # Protein_GI_number: 19704492 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Fusobacterium nucleatum # 1 173 1 172 174 189 58.0 2e-48 MIYEIRKYGDPVLRKVAEKVEDINDEIREILSNMLETMYATDGVGLAAPQVGISLRMFVC DVGTPEESQVKKIINPIITPLTEENISVEEGCLSVPGIYRKVDRIAKIKISYQNEMGEKI EEILEGFPAIVVQHEYDHLEATLFVDRVSPMAKRMIAKKLQALKKETMRDAKE >gi|224461452|gb|ACDD01000050.1| GENE 28 36820 - 37086 317 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452630|ref|ZP_05617929.1| ## NR: gi|257452630|ref|ZP_05617929.1| hypothetical protein F3_06134 [Fusobacterium sp. 
3_1_5R] # 1 88 15 102 102 120 98.0 3e-26 MLLIFAYSMFGVIPQILKSQTKIAKIKEEIEYLEGKNQKELQEIEKYTKNIEELDNDYER ERIARNRLQMIKPDEVIYRLNQKNQEEQ >gi|224461452|gb|ACDD01000050.1| GENE 29 37083 - 38141 1835 352 aa, chain + ## HITS:1 COG:FN1159 KEGG:ns NR:ns ## COG: FN1159 COG1494 # Protein_GI_number: 19704494 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Fusobacterium nucleatum # 1 352 1 346 346 536 82.0 1e-152 MKRELALEFARVTEAAALAAHKWVGRGDKEAADQAAVDAMRTMLNRLAIDGEIVIGEGEI DEAPMLYIGEKVGRAYHEEEAKDELEEGEVPYYTPVDIAVDPVEGTRMTAQGQSNAVTVL AVAKKGSFLKAPDMYMEKLIVGPEAKGKIDLERPLMENIENVAKALGKELHEMMVVVLDK PRHTQIIKDLQKLGIKVYALPDGDVAGSILTCLVDSDVDMLYGIGGAPEGVISAAVIRAL GGDMQARLKLRNEVKGVSLENDKISNFEKSRCEEMGLKVGEILRMDDLVKDDEVIFSATG ITGGDLLTGIYRRGMIAKTQTLVVRGSSKTVRYINSVHNLEYKDPKILHLVK >gi|224461452|gb|ACDD01000050.1| GENE 30 38155 - 38646 709 163 aa, chain + ## HITS:1 COG:no KEGG:FN0932 NR:ns ## KEGG: FN0932 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 163 7 167 167 101 38.0 1e-20 MIFYEDFIKKIEEAQYEELQLEEIENFGLEKTKKMGGYGIAIPLMLIAAYEIFVAIFMKQ YYLILIALVLFYFGLRQCRNMWAYKVVVNTKEKHFFFQKLDLDLHKVEKIQLREAKIGKK VTVVLDFITIEKKQVIIPMYMTNQLRLVRVLQNLVGSKFSIKK >gi|224461452|gb|ACDD01000050.1| GENE 31 39281 - 40615 1405 444 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 442 23 464 470 692 77.0 0 MKRFIMDDLVKWKDSKYRKPLILKGVRQVGKTWILKEFGRLYYDNVAYFNFDENMEYREF FTTTKDTKRILQNLMLISGEKIEPNSTLIIFDEIQDCPEVINALKYFYENVPEYHIVCAG SLLGIALAKPSSFPVGKIDFLNMVPMNFSEFLIANGDENLKNYLDSVEEIEKIPEAFYNP LYEKLKMYYITGGMPEPVYMWSKERDMELMIRSLNNIIEAYERDFAKHPNTKEFPKISMI WKSLPSQLSRENKKFIYKVVKEGARAREYEDALQWLVNANLVSKVYRISAPRIPLSAYDD LSAFKIYMADVGILNRLSLLSPKVFGEGSRLFTEFKGALTENFILQSLIPQFEVSPRYWS 
DNIYEVDFVIQHENDVFPIEVKAEKNTKSKSLLKFKEKYSENVKLRVRFSFDNLILDGDL LNIPLFMVDYTKKLIDIVLRKEYL >gi|224461452|gb|ACDD01000050.1| GENE 32 40612 - 41349 927 245 aa, chain + ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 2 243 4 243 243 216 47.0 4e-56 MKALIGITGSIITCGNDEIFATYERAYVNDDYVSAVEKAGGIPIILPIVEEKENIKEFVS RVDAIVLSGGYDIDPSYWGEEIGRKYERIYPRRDHYEMLVIKYAKELKKPVLGICRGHQM INVAFGGSLYQDLSEIPGSYIQHVQQAKYYEATHGIEIEEGSFISKSMGVKNRVNSYHHL AIKDLGNSLRIVGRAPDGVVEAIEYITEEQFFIGVQFHPEMMHRHHEFALHLFQDFIQEV ERRKK Prediction of potential genes in microbial genomes Time: Fri May 20 02:06:32 2011 Seq name: gi|224461451|gb|ACDD01000051.1| Fusobacterium sp. 3_1_5R cont1.51, whole genome shotgun sequence Length of sequence - 23573 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 8, operones - 3 average op.length - 7.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 28 - 543 670 ## COG0716 Flavodoxins - Prom 592 - 651 15.0 + Prom 551 - 610 11.5 2 2 Op 1 . + CDS 696 - 1076 369 ## COG2832 Uncharacterized protein conserved in bacteria 3 2 Op 2 30/0.000 + CDS 1104 - 1712 790 ## COG0811 Biopolymer transport proteins 4 2 Op 3 11/0.000 + CDS 1715 - 2101 573 ## COG0848 Biopolymer transport protein 5 2 Op 4 . + CDS 2114 - 2851 958 ## COG0810 Periplasmic protein TonB, links inner and outer membranes 6 2 Op 5 . + CDS 2880 - 3338 547 ## Smon_1033 hypothetical protein + Prom 3342 - 3401 7.2 7 3 Tu 1 . + CDS 3459 - 4007 586 ## COG1309 Transcriptional regulator + Term 4012 - 4046 1.2 + Prom 4031 - 4090 11.2 8 4 Op 1 . + CDS 4120 - 5088 1382 ## COG4143 ABC-type thiamine transport system, periplasmic component 9 4 Op 2 . + CDS 5157 - 5237 58 ## 10 4 Op 3 . 
+ CDS 5284 - 6189 987 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 11 4 Op 4 13/0.000 + CDS 6199 - 6999 1175 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 12 4 Op 5 5/0.000 + CDS 7009 - 7929 1336 ## COG0167 Dihydroorotate dehydrogenase 13 4 Op 6 . + CDS 7932 - 8648 968 ## COG0284 Orotidine-5'-phosphate decarboxylase 14 4 Op 7 1/0.000 + CDS 8642 - 9859 1468 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 15 4 Op 8 . + CDS 9852 - 10475 947 ## COG0461 Orotate phosphoribosyltransferase + Term 10514 - 10563 6.2 16 5 Tu 1 . - CDS 10603 - 11283 722 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain - Prom 11304 - 11363 6.8 17 6 Tu 1 . + CDS 11572 - 12888 2021 ## COG1757 Na+/H+ antiporter + Term 12895 - 12937 7.3 - Term 12881 - 12925 10.2 18 7 Tu 1 . - CDS 12933 - 14081 1858 ## COG0192 S-adenosylmethionine synthetase - Prom 14106 - 14165 7.0 - Term 14115 - 14149 2.1 19 8 Op 1 . - CDS 14283 - 15167 1064 ## COG1210 UDP-glucose pyrophosphorylase 20 8 Op 2 1/0.000 - CDS 15197 - 17119 2171 ## COG0143 Methionyl-tRNA synthetase 21 8 Op 3 1/0.000 - CDS 17112 - 17780 919 ## COG2121 Uncharacterized protein conserved in bacteria - Term 17796 - 17828 -0.3 22 8 Op 4 1/0.000 - CDS 17858 - 18217 731 ## COG0718 Uncharacterized protein conserved in bacteria 23 8 Op 5 . - CDS 18285 - 19979 1505 ## COG0616 Periplasmic serine proteases (ClpP class) 24 8 Op 6 . - CDS 19989 - 20909 1175 ## COG1164 Oligoendopeptidase F 25 8 Op 7 1/0.000 - CDS 20912 - 21763 1037 ## COG1164 Oligoendopeptidase F 26 8 Op 8 1/0.000 - CDS 21792 - 23057 836 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 27 8 Op 9 . 
- CDS 23101 - 23484 609 ## COG5496 Predicted thioesterase - Prom 23510 - 23569 7.5 Predicted protein(s) >gi|224461451|gb|ACDD01000051.1| GENE 1 28 - 543 670 171 aa, chain - ## HITS:1 COG:FN1822 KEGG:ns NR:ns ## COG: FN1822 COG0716 # Protein_GI_number: 19705127 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 168 1 168 169 186 52.0 2e-47 MKTVIIYSSLTGNTKKVCEVAYEHLQDEKQLIKVEDKDTVDWSTVENVIIGYWVDKGTAD AKTRKFLSKLKDKNLYFIGTLGESPTSFHGQKCIKNVTKLCEKDNQFKGGVLVRGKVSDD LKQKMDKFPLNIVHKFVPNMKQIVLDAEGHPNEEDFQQVIHFVEETVNPNL >gi|224461451|gb|ACDD01000051.1| GENE 2 696 - 1076 369 126 aa, chain + ## HITS:1 COG:FN2099 KEGG:ns NR:ns ## COG: FN2099 COG2832 # Protein_GI_number: 19705389 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 7 126 4 123 125 106 47.0 1e-23 MDVIVIVKKNILLGIAWVSLVLGGIGIFLPLLPTTPFILLSAFCFQKSSERFHQWILNSP IFGKYIRDYQEQKGITLKNKIIAISFMAIGMLFSAYKVPQIHMRIFLGITFVAVSYHILK LKTLKK >gi|224461451|gb|ACDD01000051.1| GENE 3 1104 - 1712 790 202 aa, chain + ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 1 202 1 202 202 226 55.0 3e-59 MAYYFKVGGPILWVLFFMSMGALAIILEKTVFFTTKEKKVNANFKKDINDLISAGKVEEA IQLCETQKGSVAGSIKTFLKRAKKGQDVQDYESIIKEIMLESMSPLDRGLSSLEAIGSLA PMCGLLGTVTGMIKAFINISKMGAGDPTIVADGISEALVTTAAGLYVAIPVIAAYNIFSK IAARREDEVDKIVANIINIFRR >gi|224461451|gb|ACDD01000051.1| GENE 4 1715 - 2101 573 128 aa, chain + ## HITS:1 COG:FN1311 KEGG:ns NR:ns ## COG: FN1311 COG0848 # Protein_GI_number: 19704646 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Fusobacterium nucleatum # 29 128 1 100 100 117 55.0 7e-27 MGRKNKRGALKPDLTPLIDVIFLLIIFFMISTTFNNYGTIPIELPSSTVESKKENKAVEI 
IVDKDGRFYVSADGKNQEVTLEDIPNHLQGVEEVTVSADRNMKYQTVMDVMTKVKEQNIA NMGLTFYE >gi|224461451|gb|ACDD01000051.1| GENE 5 2114 - 2851 958 245 aa, chain + ## HITS:1 COG:FN1310 KEGG:ns NR:ns ## COG: FN1310 COG0810 # Protein_GI_number: 19704645 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Fusobacterium nucleatum # 32 245 11 242 242 171 49.0 1e-42 MKKYLVLSFCIHMLCFIGFYHHEMHKGEEKLPLNQVISVSFVVENPPPSDNPGSPNVADK ILEKKENSTNEKPKEQPKKEKPKKEEQVKEKTFDSKMATKDAKEVKKEESAAVASDSHEK ESSDSGKGSGSDNPFYGSNFQANGDGSYTALSSEGINYQILNEVEPDYPSQAESIGYDQR VSVKVKFLVGLKGNVENIQIIKSHKKLGFDDEVMKAIKKWRFKPIYYAGKNIKVYFVKEF HFNPQ >gi|224461451|gb|ACDD01000051.1| GENE 6 2880 - 3338 547 152 aa, chain + ## HITS:1 COG:no KEGG:Smon_1033 NR:ns ## KEGG: Smon_1033 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 1 152 1 148 148 96 36.0 3e-19 MKKLWILFFLFPNLVFAAREDILGKWISTKYKDGNQIIIEVIEKEDGKFYGKMIDQTVPF YQEGEFQGKEKMDLKNPDPSLKHRKLVGVEMLKSIAYQEEKDRYDGGTVYIPGMGKTLYA SVQVEKDSMKMKGSFDKAGILGKTQLWHRYEK >gi|224461451|gb|ACDD01000051.1| GENE 7 3459 - 4007 586 182 aa, chain + ## HITS:1 COG:FN1823 KEGG:ns NR:ns ## COG: FN1823 COG1309 # Protein_GI_number: 19705128 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 175 1 154 156 100 38.0 2e-21 MPRKSVYTREMVLEAAVEVFKNEGYEKITVKNIAKQLGCSIAPVYAAYTSMEDLKRDVVI KVEDTLVSCFDEISNSCHVVEEDVSMTQEQRDLFERMFSMVDPENEAVKQKFENLVEEAL QNSENPDQSRMSLFNVFMKAISIMSETKHKKFSKSEILSLIARHKNYILTLKKKKGYGNR RK >gi|224461451|gb|ACDD01000051.1| GENE 8 4120 - 5088 1382 322 aa, chain + ## HITS:1 COG:PAB1835 KEGG:ns NR:ns ## COG: PAB1835 COG4143 # Protein_GI_number: 14521007 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type thiamine transport system, periplasmic component # Organism: Pyrococcus abyssi # 19 315 34 336 352 157 35.0 3e-38 MKKIILGSLLLLSASAFAEEIVVYGPSTSKWIGKKYAPIFEKVTGDTIKYVSIDGVVQRL 
TLEKVNPKADIVVGLTPVDIEVAKKHNVIQKYKPKNIGMIKKDIKFDKEFYATPYDYGML AINYDKTKIKNPPKTLAELGKMKKQLLIENPNTSNTGAEILQWSLALYGKNWKKFWMTIQ PAVYNVEPGWEEAFAKFTAGEAPMMLGYATSDMWFAQDNTQKEKYASFYVEDGNYQYIES AALVKKKEVKEGAKKFMEAVLGEEFQNMTAAKNYMFPVTSVNLGKEFDAVPRTDKKVQFV PNKEVVEHLSKYKKEAIQILKK >gi|224461451|gb|ACDD01000051.1| GENE 9 5157 - 5237 58 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRKVLCVEKPKISSIFLFLFLYLLG >gi|224461451|gb|ACDD01000051.1| GENE 10 5284 - 6189 987 301 aa, chain + ## HITS:1 COG:PAB1498 KEGG:ns NR:ns ## COG: PAB1498 COG0540 # Protein_GI_number: 14521526 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Pyrococcus abyssi # 2 299 6 306 308 320 52.0 2e-87 MRNFISIQDLSKQEILDLLALAKKLKEKPEPNLLQGKIVATLFFEPSTRTRLSFTSASYR IGANVLGFDSIQGTSVMKGESFEDTIRMVSSYSDVIVIRHPKDGTAQKAADISSVPVINA GDGKNEHPSQTLLDLYTIQEELGSLENKKIAFVGDLKYGRTVHSLTRAMKHFHAKFYFVA PDLIQMPKHLLEELEEAGLEYSLHNNYEDILKEVDILYMTRIQKERFEDPKDFEKVESSY RIEKEDIVGKCQEHMIILHPLPRVDEIAVSVDECKHALYFKQAANGVPVREAMIALAVGK K >gi|224461451|gb|ACDD01000051.1| GENE 11 6199 - 6999 1175 266 aa, chain + ## HITS:1 COG:CAC2651 KEGG:ns NR:ns ## COG: CAC2651 COG0543 # Protein_GI_number: 15895909 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Clostridium acetobutylicum # 8 252 10 243 246 148 36.0 9e-36 MYLRDCIVKENTCVASCYYRMVVEIPEELLVSKPGQFFMLKSLQDAFSLRRPISIHQVNK QDRTMEFYYEVKGRGTESLADFQEGEKISLQGPLGHGFSVVKDKKVIVIGGGMGIAPMKY LLDDLKENNEVTFIAGGRNQAAIEILDFFSFQKLRAYITTDDGSVGMKGNVVTKLKDLLE QDSYDQIYVCGPHGMMIAAAETAQEKGVACEISLENRMACGVKACVGCSIQTVDGMRKVC YDGPVFDSRKIVNYEPKEKASICCGN >gi|224461451|gb|ACDD01000051.1| GENE 12 7009 - 7929 1336 306 aa, chain + ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus 
subtilis # 4 301 2 297 311 293 48.0 3e-79 MSCLKTEFLGVSMKNPLVTSSGCFGFGKEYQDYFDPNQLGGIVLKGITLEARDGNYGVRI AETPGGMLNCVGLENPGIDVFEQEIIPNLRKEGITTNLIVNINGKTMEDYIEIAKRVDKI DEIAIVELNISCPNVKDGGMAFGANPEVAGAVTREVRKVTKKPLVVKLSPNVTDIAKIAK IVEENGADAVSLINTVLGMAIDVKTKKPVLGNTFGGMSGGAVKPIALRMIYQVYEAVKIP IVGMGGILNGTDALEFLMAGASILSIGTGFFINPMVSLEIEKTLRDYCEQEGLKNIQEIV GIAHRR >gi|224461451|gb|ACDD01000051.1| GENE 13 7932 - 8648 968 238 aa, chain + ## HITS:1 COG:alr2983 KEGG:ns NR:ns ## COG: alr2983 COG0284 # Protein_GI_number: 17230475 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Nostoc sp. PCC 7120 # 3 231 2 234 238 194 44.0 1e-49 MDAREKIIIALDFPTEEKAKACVESLGEEAVFYKVGLELFLNSQGKILEYLREKGKKIFL DLKFHDIPNTTAMASVFAAKKDVVMFNVHASGGKKMMQKVIEETKKINEAASAIAVTILT SFSEEEIQSLFQSKLSLKELAIHFASLAKEAGLSGVVCSPWEAADIKKVCGESFQTVCPG VRPKWSATNDQERIMTPKEAVQHGCDYLVIGRPVTKHENPKEAMRMIVKEVEEGLGLC >gi|224461451|gb|ACDD01000051.1| GENE 14 8642 - 9859 1468 405 aa, chain + ## HITS:1 COG:all2303 KEGG:ns NR:ns ## COG: all2303 COG0044 # Protein_GI_number: 17229795 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Nostoc sp. 
PCC 7120 # 1 402 7 436 439 250 35.0 4e-66 MLVKNAKIIMGTEEVLVDILIENGRFVKFGKDFVENSQKEVLDANFHYILPGIIDAHTHM RTPGFTQKEDNISGSKAAIRGGVTTFFDMPNTNPATVTLEALEEKRNIYKGNSYSDYAFY FGGTRFDNHEEVEKAIDETVATKIFLNVSTGDMLVEEDAILENIFKASKRVAVHAEEEMV SKAIQLARKTKKPLYLCHISLEKELEYIREAKEMDVEVYGEVTPHHLFLSEEDRESTEES KLFLRTKPELKTKQDNEALWKALQYGILDTVGTDHAPHLLEEKKAKLTFGMPSVEHSLEM MWKGVKEGKLSIPRLQEVMSENPAKIFGLKKKGKIAVGYDADFVIIDDGDHSEIRQEEII SKAAWSPYVGQKRGCKVLTTVLRGNIVYHEGKFGKKIGKEILKHE >gi|224461451|gb|ACDD01000051.1| GENE 15 9852 - 10475 947 207 aa, chain + ## HITS:1 COG:SP0702 KEGG:ns NR:ns ## COG: SP0702 COG0461 # Protein_GI_number: 15900601 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 1 205 1 207 210 181 48.0 7e-46 MNRKEAIAQVLLSTGAVKLNVKEPFTFVSGIKSPIYCDNRQMIAYPEEREVIIQGFQEAL EGKEYDILAGTATAGIPWAAFLAYSLKKPMSYIRGEKKNHGAGKQIEGASVEGKKVIVIE DLISTGGSSIKAVEAAYAEGASSVEVVSIFSYEFPKAYQQFGDKKIPWQSLSNFEVLIHK AEEMNYVTEEERKIAADWNKNADTWGK >gi|224461451|gb|ACDD01000051.1| GENE 16 10603 - 11283 722 226 aa, chain - ## HITS:1 COG:FN1305 KEGG:ns NR:ns ## COG: FN1305 COG1917 # Protein_GI_number: 19704640 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Fusobacterium nucleatum # 117 226 2 111 111 122 54.0 7e-28 MIEIKKIPRGSSFFLKEEIKVRNFQVSSKILVQSSHARMTLVSMGKGEEISAETMPYSRC FQLLKGKVFLQLNQEKLDMEIDHFLLLGENSFYSIHAEEDSIFLEIEYDRGGNFMSEVQT IKHITRGTTFALKEEISYEAGQIISKNLVTNNAMVMTLMSFDQGESLAAHKAPGDALVSL LDGEAKFWIDGKENVVKAGESILLPGNISHAVEAIKAFKMLLIIVK >gi|224461451|gb|ACDD01000051.1| GENE 17 11572 - 12888 2021 438 aa, chain + ## HITS:1 COG:FN0624 KEGG:ns NR:ns ## COG: FN0624 COG1757 # Protein_GI_number: 19703959 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 1 438 5 442 444 607 82.0 1e-173 MKEERKEKQYGVIAFLPLFVFLALYIGSGIIFNLLGVEGAFKKFPRHVALLIGIVVAMLM 
NRGMKLDKKIEIFSENAGNSGFMLVGMIYLLAGGFQGAARAMGGVESVVNLGITFIPSIA LVPGVFLISCFISLAIGTSMGTVAAMAPIAIGVAEAAQLNIPLTVAAVIGGAYFGDNLSI ISDTTISAAKGVGSEMKDKFKMNFLIALPAAIFAAIMYGIMGGEGNIVGEHSFHILRVLP YLVVLMIALTGFNVSVVLVLGIAMTGVIGFLEGTVNFFTWIEAIGEGMSDTFSITIVAIL ISGLIGLIKYYGGIDWIVNILSSKMSDRKSAEYGIGLLSGILSAALVNNTIAIIISAPLA KEIGKKYRIAPKRLASLIDIFACAFIALTPYDGGMLMITALVDVSPLEVLQYSFYMFALI IVTCITIQFGLLRTEEER >gi|224461451|gb|ACDD01000051.1| GENE 18 12933 - 14081 1858 382 aa, chain - ## HITS:1 COG:FN0355 KEGG:ns NR:ns ## COG: FN0355 COG0192 # Protein_GI_number: 19703697 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Fusobacterium nucleatum # 1 381 1 381 383 647 80.0 0 MKKFNYFTSEFVSPGHPDKVSDQISDAVLDACLTEDPNARVACEVFCTTGQVIVGGEITT TTYIDVQDIVRKKIEEIGYRDGMGFDANCGVLSAIHAQSPDIAMGVDIGGAGDQGIMFGG AVKETPELMPLAIVLAREILVRLTKMTRSKEIAWARPDAKSQVTLAYDEEGNIDHVETVV VSVQHNPEVSNEEIRKTIIEKVIEPVLEQYHLSKEEITYHINPTGRFVIGGPHGDTGLTG RKIIVDTYGGYFRHGGGAFSGKDPSKVDRSAAYAARWVAKNIVAADFADKCEIQLSYAIG VAEPTSIKIDTFGTSKVSEEKLEEAVKKTFDLTPRGIEKSLELRSGTFKYQDLAAFGHIG RTDIDVPWERCNKVEDLKKAMM >gi|224461451|gb|ACDD01000051.1| GENE 19 14283 - 15167 1064 294 aa, chain - ## HITS:1 COG:FN1266 KEGG:ns NR:ns ## COG: FN1266 COG1210 # Protein_GI_number: 19704601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Fusobacterium nucleatum # 1 290 8 296 301 416 74.0 1e-116 MKKITKAVIPAAGLGTRVLPATKAQPKEMLTIVDKPSLQYIVEELVASGIQDIIIVTGRN KNSIEDHFDFSYELEDTLKKDKKTELLEKVSHISDMANIFYVRQNFPKGLGHAILKAKPF IQEEEPFIIALGDDIIYNPEYPVAKQLIDCYEKYGHSIVACQEVKKEEVSKYGIVNPGEI YDDITCQIENFIEKPSLEEAPSTLASLGRYCLSGKIFHYLEEAKPGKNGEIQLTDSILSM IQDGEKVLAYSFNGERYDIGNKFGLLKANIEYGLRHEEISEELKDYLSSLLTKE >gi|224461451|gb|ACDD01000051.1| GENE 20 15197 - 17119 2171 640 aa, chain - ## HITS:1 COG:FN1268_1 KEGG:ns NR:ns ## COG: FN1268_1 COG0143 # Protein_GI_number: 19704603 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # 
Organism: Fusobacterium nucleatum # 3 509 4 510 526 840 75.0 0 MNNFYVTTPIYYVNGDPHVGSAYTTIAADVMSRYQKLAGNNVYFLTGTDEHGQKVEQTAK EKGFTPQAWTDKMAPAFTEMWKALNIHYSDFIRTTEQRHKDSVKKILKTVYEKGDIYKGE YEGQYCISCETFFPENQIVEPGHCPDCGKKLSTVKEESYFFRMSKYQDALLKHIEEHPDF ILPHSRRNEVISFIKQGLQDLSISRNTFSWGIPIEFAPGHITYVWFDALTNYITATGYEN DSEKFDTYWNNARVCHLIGKDIIRFHAIIWPCMLLSAGIKLPDSIVAHGWWTSEGEKMSK SKGNVVNPYDEIKKYGVDAFRYYLLREANFGSDGDYSTKGVVGRVNSDLANDLGNLLNRT LGMYHKYFQGSIVASGNYEEIENSVHQMWEDTLTQVDKHMYYYEYSRALECIWKFISRMN KYIDETMPWALAKEETQKTRLATVMNTLVESLYKIAVLVSPVIPEAAQKIWSQLGVEKDI QEARLSSLHTWNTFEEKHTLGKATPIFPRIEIVEEEPKLDPMQVNPDLVVENPIDIDTFK KTKIQVVEILEVSKVKGADKLLKFKVSLGDHVRQILSAIAEYYPNYQDLVGSKILAVTNL KPRKMRGEISQGMLLTTEDEQGVCQVIQIPKNTPAGTEVE >gi|224461451|gb|ACDD01000051.1| GENE 21 17112 - 17780 919 222 aa, chain - ## HITS:1 COG:FN1269 KEGG:ns NR:ns ## COG: FN1269 COG2121 # Protein_GI_number: 19704604 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 5 213 4 208 209 184 50.0 2e-46 MEKTESSKKYRFYGLCLYYFIHLLNYTFSYIRIENTGEEKVNENIRPYIFCFWHEKLLSS SLAMRNLRRKVGLASPSKDGELIAVPLEKMGFDLVRGSSDKQSVSSLLSLLKFLKKGYAM GTPVDGPKGPPYKVKHGLLYLAQKSGIPIVPMGGAFSKKWVFSKTWDHFQVPKPFSKIFY VLGNPIYLNKDSNLEEIALFLEQEINNLNEKAERLVREGNYE >gi|224461451|gb|ACDD01000051.1| GENE 22 17858 - 18217 731 119 aa, chain - ## HITS:1 COG:FN1270 KEGG:ns NR:ns ## COG: FN1270 COG0718 # Protein_GI_number: 19704605 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 32 119 1 88 88 103 76.0 8e-23 MVRKLKGNRPAQAAAGNQMDILKQAQAMQQQMLQVQEELKGKDLTVSVGGGAVNVKVNGQ KEVLEVKLSDEILKEAASDKEMLEDLILSGINEAMRQAEELAESEMNKVTGGINIPGLF >gi|224461451|gb|ACDD01000051.1| GENE 23 18285 - 19979 1505 564 aa, chain - ## HITS:1 COG:FN1271 KEGG:ns NR:ns ## COG: FN1271 COG0616 # Protein_GI_number: 19704606 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and 
vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Fusobacterium nucleatum # 32 563 18 563 565 338 37.0 2e-92 MKTLLQFFKKIFLFLFREVCSFFIKLVLSLILLAIVVGTFISYISKENTTEIKQGSYVLL RASSPLSEHIPIPDPLSLQEKHMTFFEVLYALDSIRQDQRIQGVLLDADFLSWNKAQLEE IGNKLQKLEEEGKKVITTLQEVNRNNYFLASYTKEIVMTPIHAASSNISPYHYEELYWKN LLDRFGVSINVIPIGDYKSYMENYSHSQMSKEFRENMTRLLDKSYDYSIEAIANNRKLEK NTLKAWIENGEFMGTSFPTLFEKGLITKGEYPNRIRDEIGDDKIISIQEYFSLVKMKTRP KNYLALLNLEGTIEDETLFLDEVKAIQKDQNVKGIILRINSPGGSALVADTMYHAVKKLR EKIPVYVSISGTAASGGYYVAAAGEKIFASPLSVTGSIGVVSMIPNFSNLEKKANVTTES ISKGKYADLYSYLQPLSEENYNRIREGNLGVYQDFLEVVSSNRNIKKDFLDKNLAQGRVW LGIEAKENGLIDELGGLEATIYALEQDKKLGTLPILQVSKNDVFGQYLGKYRKFLSVLPS SMQQKVPKDRLWNKPLMYFPYEVE >gi|224461451|gb|ACDD01000051.1| GENE 24 19989 - 20909 1175 306 aa, chain - ## HITS:1 COG:FN0887 KEGG:ns NR:ns ## COG: FN0887 COG1164 # Protein_GI_number: 19704222 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 1 302 294 595 600 420 71.0 1e-117 MKEYHYYDNSISLLEYDKEFSYEEAKQLVIDSVLPLGEEYQNKIKTALSDGWLDVMEKEN KRSGAYSINIYDVHPYMLLNYQGTLDDVFTLAHELGHTMHSILSTEHQPFATHSYTIFVA EVASTFNERLLLDSMLEKTKDPKERIILLEQALGNIVGTYYIQTLFANYEYQAHQLVERG EAVTPDILSGIMEQLFKQYFGDTLVFDELQKIIWSRIPHFYHSPYYVYQYATSFAASANL YKQLKTNPESVTKYLRLLQSGGNDYPMEQLKKAGADLSKVESFDAIAEEFNRLLDLLEKE LENYQA >gi|224461451|gb|ACDD01000051.1| GENE 25 20912 - 21763 1037 283 aa, chain - ## HITS:1 COG:FN0887 KEGG:ns NR:ns ## COG: FN0887 COG1164 # Protein_GI_number: 19704222 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Fusobacterium nucleatum # 3 282 11 290 600 314 56.0 2e-85 MNYTWNLEDIYPSWEAWEQDFQKMKKDMEIIPNYQGKIHNSRENFVEMTKLEESLSRLVD KLYLYPYLMKDLNSKDEVASMKLQEMEAIFTDFGVKTAWTVPETLMIPEHTMKQWILEDD FLKDYAFPLQETYRLQKHVLSEEKEQLLSYFSQYLGAPDDIYSELSISDMEWKTVKLSNG WEGPITNGMYSKILSTNRNQEDRKLAFEALYEAYHKNKNTYGAIYRSLLQRGVASSRARN YSSTLEKALEGKNIPKEVFLSLLDSALKNTAPLQRYAKLRKKY >gi|224461451|gb|ACDD01000051.1| GENE 26 
21792 - 23057 836 421 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 1 410 3 421 447 326 43 8e-89 KMVNTNELGLKTKLVLGAQHVLAMFGATVLVPFLTGMNPSIALIAAGLGTLIFHAVTKRI VPVFLGSSFAFIGAIALVLKNDGIAVVKGGVIVAGLVYLVMSLIILKFGVDRVKSFFPPV VVGPIIMVIGLRLSPVAMSMAGYSNGGFDTKSLIISSIVVISMVCISILKKSFFRLVPIL ISVAIGYTVAIFFGLVDFNLISQAKWIGLSDDAFHALVTVPEFTFTGIVAIAPIALVVFI EHIGDITTNGAVVGKDFFQDPGIHRTMLGDGLATIAAGFIGGPANTTYGENTGVLAVTKV YDPSVLRIAACYAIILGFLGKFGVMLQTIPTPVMGGVSIILFGMISAVGARTIVDAQLDF SNSRNLIIASLILVFGIAINEIAIWGTISVSGLAIAAFVGVILNKILPEDQPYTKKQLRK M >gi|224461451|gb|ACDD01000051.1| GENE 27 23101 - 23484 609 127 aa, chain - ## HITS:1 COG:FN0889 KEGG:ns NR:ns ## COG: FN0889 COG5496 # Protein_GI_number: 19704224 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 1 124 1 123 127 107 50.0 7e-24 MLEVGLQCEVSKVVQMEDTAAKVASGLLDVFATPMMIALMEKAAYTLVQDHLAEGDSTVG VEIGAKHVKATPVGTTVKAIATLTKIEGRFLTFSVQAFEEDGTLIGEGTHQRCIINSQKF IDKLNKR Prediction of potential genes in microbial genomes Time: Fri May 20 02:07:14 2011 Seq name: gi|224461450|gb|ACDD01000052.1| Fusobacterium sp. 3_1_5R cont1.52, whole genome shotgun sequence Length of sequence - 107112 bp Number of predicted genes - 102, with homology - 96 Number of transcription units - 32, operones - 23 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 6 - 113 123 ## 2 1 Op 2 9/0.000 + CDS 58 - 615 493 ## COG3683 ABC-type uncharacterized transport system, periplasmic component 3 1 Op 3 . + CDS 612 - 1379 744 ## COG2215 ABC-type uncharacterized transport system, permease component + Term 1388 - 1442 9.0 4 2 Op 1 . - CDS 1407 - 2054 704 ## COG1564 Thiamine pyrophosphokinase 5 2 Op 2 . - CDS 2054 - 2878 967 ## FN0891 DNAse I homologous protein DHP2 precursor (EC:3.1.21.-) - Prom 2915 - 2974 10.6 + Prom 2933 - 2992 5.3 6 3 Tu 1 . 
+ CDS 3020 - 3748 974 ## COG0560 Phosphoserine phosphatase + Term 3754 - 3791 5.7 + Prom 3810 - 3869 7.5 7 4 Op 1 2/0.000 + CDS 3906 - 4559 734 ## COG0785 Cytochrome c biogenesis protein 8 4 Op 2 3/0.000 + CDS 4579 - 6156 1812 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase 9 4 Op 3 7/0.000 + CDS 6169 - 6942 1141 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 10 4 Op 4 . + CDS 6932 - 8587 1655 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 8610 - 8652 7.2 + Prom 8641 - 8700 12.1 11 5 Op 1 23/0.000 + CDS 8736 - 9218 651 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 12 5 Op 2 1/0.200 + CDS 9231 - 11015 2215 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 13 5 Op 3 . + CDS 11033 - 12736 1774 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 14 5 Op 4 . + CDS 12785 - 13660 996 ## COG0682 Prolipoprotein diacylglyceryltransferase 15 5 Op 5 . + CDS 13654 - 14958 1363 ## COG1757 Na+/H+ antiporter + Term 14999 - 15046 2.8 - Term 14932 - 14965 4.0 16 6 Op 1 . - CDS 14966 - 17494 1925 ## COG0608 Single-stranded DNA-specific exonuclease 17 6 Op 2 . - CDS 17491 - 19323 1561 ## gi|257452674|ref|ZP_05617973.1| hypothetical protein F3_06374 - Prom 19389 - 19448 10.0 + Prom 19188 - 19247 9.2 18 7 Op 1 1/0.200 + CDS 19473 - 21245 2123 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 19 7 Op 2 . + CDS 21276 - 21743 555 ## COG1576 Uncharacterized conserved protein 20 7 Op 3 . + CDS 21783 - 22112 587 ## gi|257452677|ref|ZP_05617976.1| hypothetical protein F3_06389 21 7 Op 4 . + CDS 22112 - 22498 311 ## gi|257452678|ref|ZP_05617977.1| hypothetical protein F3_06394 22 7 Op 5 . 
+ CDS 22495 - 23727 1280 ## FN0465 hypothetical protein 23 7 Op 6 1/0.200 + CDS 23741 - 25222 2202 ## COG1190 Lysyl-tRNA synthetase (class II) + Term 25247 - 25280 1.0 24 8 Op 1 23/0.000 + CDS 25291 - 25659 423 ## COG1380 Putative effector of murein hydrolase LrgA 25 8 Op 2 . + CDS 25637 - 26329 780 ## COG1346 Putative effector of murein hydrolase + Prom 26331 - 26390 9.4 26 9 Op 1 1/0.200 + CDS 26475 - 26735 335 ## PROTEIN SUPPORTED gi|237739595|ref|ZP_04570076.1| SSU ribosomal protein S20P + Term 26748 - 26780 4.0 27 9 Op 2 . + CDS 26798 - 27373 637 ## COG0778 Nitroreductase + Term 27568 - 27603 0.3 - Term 27294 - 27334 1.1 28 10 Op 1 . - CDS 27348 - 28493 771 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 29 10 Op 2 . - CDS 28506 - 28586 165 ## 30 10 Op 3 . - CDS 28588 - 30021 1925 ## COG0260 Leucyl aminopeptidase 31 10 Op 4 . - CDS 29999 - 30073 56 ## - Prom 30132 - 30191 7.0 + Prom 30014 - 30073 6.5 32 11 Op 1 . + CDS 30164 - 30625 583 ## COG2849 Uncharacterized protein conserved in bacteria 33 11 Op 2 1/0.200 + CDS 30703 - 32289 2335 ## COG0513 Superfamily II DNA and RNA helicases 34 11 Op 3 1/0.200 + CDS 32301 - 33245 1022 ## COG1559 Predicted periplasmic solute-binding protein 35 11 Op 4 14/0.000 + CDS 33262 - 34596 1048 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 36 11 Op 5 . + CDS 34593 - 36782 1334 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 37 11 Op 6 . + CDS 36795 - 36881 62 ## 38 11 Op 7 1/0.200 + CDS 36872 - 37135 399 ## PROTEIN SUPPORTED gi|19705275|ref|NP_602770.1| SSU ribosomal protein S15P + Term 37148 - 37178 0.3 39 11 Op 8 . + CDS 37195 - 38277 1109 ## COG5438 Predicted multitransmembrane protein + Prom 38287 - 38346 5.9 40 12 Op 1 38/0.000 + CDS 38373 - 39908 2327 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 39935 - 39968 5.1 41 12 Op 2 . 
+ CDS 39994 - 40845 961 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 42 12 Op 3 49/0.000 + CDS 40769 - 40915 229 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 43 12 Op 4 44/0.000 + CDS 40930 - 41796 1587 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 44 12 Op 5 44/0.000 + CDS 41817 - 42824 604 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 45 12 Op 6 . + CDS 42814 - 43785 828 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 43793 - 43829 4.0 + Prom 43812 - 43871 6.0 46 13 Tu 1 . + CDS 43892 - 44137 412 ## FN0683 hypothetical protein + Prom 44159 - 44218 8.0 47 14 Op 1 . + CDS 44315 - 44401 160 ## 48 14 Op 2 . + CDS 44471 - 45130 771 ## Fjoh_4684 hypothetical protein + Prom 45151 - 45210 19.2 49 15 Tu 1 . + CDS 45239 - 45886 792 ## COG1704 Uncharacterized conserved protein + Term 45892 - 45952 8.6 + Prom 45891 - 45950 8.4 50 16 Op 1 . + CDS 46034 - 48781 3132 ## COG0457 FOG: TPR repeat 51 16 Op 2 . + CDS 48798 - 49163 481 ## FN1835 hypothetical protein 52 16 Op 3 30/0.000 + CDS 49178 - 49789 908 ## COG0811 Biopolymer transport proteins 53 16 Op 4 11/0.000 + CDS 49802 - 50236 590 ## COG0848 Biopolymer transport protein 54 16 Op 5 1/0.200 + CDS 50251 - 51030 812 ## COG0810 Periplasmic protein TonB, links inner and outer membranes 55 16 Op 6 1/0.200 + CDS 51047 - 52435 1559 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 56 16 Op 7 . + CDS 52439 - 53875 1603 ## COG2812 DNA polymerase III, gamma/tau subunits 57 16 Op 8 . + CDS 53872 - 54696 618 ## gi|257452708|ref|ZP_05618007.1| hypothetical protein F3_06554 58 16 Op 9 16/0.000 + CDS 54719 - 55168 562 ## PROTEIN SUPPORTED gi|237738486|ref|ZP_04568967.1| LSU ribosomal protein L9P 59 16 Op 10 1/0.200 + CDS 55185 - 56525 1988 ## COG0305 Replicative DNA helicase 60 16 Op 11 . 
+ CDS 56550 - 57770 1549 ## COG0826 Collagenase and related proteases 61 16 Op 12 . + CDS 57778 - 58437 777 ## FN0749 hypothetical protein + Term 58446 - 58483 5.1 - Term 58428 - 58476 3.3 62 17 Op 1 12/0.000 - CDS 58479 - 58958 525 ## COG3610 Uncharacterized conserved protein 63 17 Op 2 1/0.200 - CDS 58959 - 59714 591 ## COG2966 Uncharacterized conserved protein 64 17 Op 3 . - CDS 59726 - 60454 754 ## COG4123 Predicted O-methyltransferase - Prom 60565 - 60624 24.3 + Prom 60453 - 60512 10.2 65 18 Op 1 2/0.000 + CDS 60702 - 61847 1757 ## COG1960 Acyl-CoA dehydrogenases 66 18 Op 2 29/0.000 + CDS 61885 - 62682 1162 ## COG2086 Electron transfer flavoprotein, beta subunit 67 18 Op 3 . + CDS 62715 - 63725 1376 ## COG2025 Electron transfer flavoprotein, alpha subunit + Term 63742 - 63775 4.0 + Prom 63809 - 63868 8.0 68 19 Op 1 1/0.200 + CDS 63901 - 65709 2522 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 69 19 Op 2 . + CDS 65729 - 67486 2075 ## COG0006 Xaa-Pro aminopeptidase 70 19 Op 3 . + CDS 67499 - 67972 652 ## FN1219 hypothetical protein - Term 67958 - 67999 0.9 71 20 Op 1 . - CDS 68002 - 68625 899 ## COG1272 Predicted membrane protein, hemolysin III homolog 72 20 Op 2 . - CDS 68699 - 70774 2493 ## COG2217 Cation transport ATPase - Prom 70812 - 70871 9.9 - Term 70857 - 70895 5.1 73 21 Op 1 . - CDS 70904 - 72121 1420 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 74 21 Op 2 . - CDS 72173 - 73516 1429 ## COG0534 Na+-driven multidrug efflux pump 75 21 Op 3 1/0.200 - CDS 73526 - 74347 904 ## COG1968 Uncharacterized bacitracin resistance protein 76 21 Op 4 . - CDS 74357 - 75355 1404 ## COG0451 Nucleoside-diphosphate-sugar epimerases 77 21 Op 5 . - CDS 75358 - 76176 841 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) - Term 76189 - 76228 4.8 78 22 Op 1 . 
- CDS 76237 - 76977 852 ## FN1719 hypothetical protein 79 22 Op 2 1/0.200 - CDS 76996 - 79665 3450 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 80 22 Op 3 . - CDS 79721 - 81751 2123 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 81 22 Op 4 . - CDS 81748 - 82785 882 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 82 22 Op 5 2/0.000 - CDS 82795 - 84069 931 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 83 22 Op 6 5/0.000 - CDS 84098 - 85375 1139 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 84 22 Op 7 1/0.200 - CDS 85390 - 86319 1265 ## COG0517 FOG: CBS domain - Term 86334 - 86365 2.5 85 23 Op 1 1/0.200 - CDS 86378 - 88903 3314 ## COG1461 Predicted kinase related to dihydroxyacetone kinase 86 23 Op 2 1/0.200 - CDS 88922 - 89470 431 ## COG1396 Predicted transcriptional regulators 87 23 Op 3 1/0.200 - CDS 89474 - 90682 239 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 88 23 Op 4 1/0.200 - CDS 90694 - 91197 670 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 89 23 Op 5 1/0.200 - CDS 91201 - 93369 2097 ## COG0826 Collagenase and related proteases 90 23 Op 6 . - CDS 93341 - 93943 452 ## COG0237 Dephospho-CoA kinase - Prom 93977 - 94036 11.0 + Prom 93931 - 93990 8.6 91 24 Tu 1 . + CDS 94090 - 95058 926 ## CLB_1618 AraC family transcriptional regulator + Term 95067 - 95108 5.1 + Prom 95091 - 95150 9.6 92 25 Op 1 35/0.000 + CDS 95200 - 96945 196 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 93 25 Op 2 . + CDS 96938 - 98656 1454 ## COG1132 ABC-type multidrug transport system, ATPase and permease components + Term 98668 - 98718 8.0 - Term 98474 - 98514 -0.0 94 26 Tu 1 . - CDS 98709 - 99563 823 ## COG1284 Uncharacterized conserved protein - Prom 99617 - 99676 3.8 + Prom 99683 - 99742 8.5 95 27 Tu 1 . 
+ CDS 99920 - 100111 317 ## Mmol_1121 hypothetical protein + Prom 100122 - 100181 10.3 96 28 Tu 1 . + CDS 100217 - 100495 386 ## COG2388 Predicted acetyltransferase + Term 100519 - 100552 3.1 + Prom 100558 - 100617 12.5 97 29 Tu 1 . + CDS 100649 - 102190 1999 ## COG2978 Putative p-aminobenzoyl-glutamate transporter + Term 102217 - 102253 5.0 + Prom 102472 - 102531 9.4 98 30 Tu 1 . + CDS 102594 - 103166 603 ## gi|257452749|ref|ZP_05618048.1| hypothetical protein F3_06759 + Prom 103198 - 103257 7.2 99 31 Op 1 11/0.000 + CDS 103381 - 105612 3074 ## COG1882 Pyruvate-formate lyase 100 31 Op 2 . + CDS 105640 - 106365 768 ## COG1180 Pyruvate-formate lyase-activating enzyme + Term 106369 - 106424 14.7 - Term 106183 - 106234 -0.7 101 32 Op 1 . - CDS 106433 - 106993 690 ## Plut_0528 exonuclease 102 32 Op 2 . - CDS 107022 - 107111 100 ## Predicted protein(s) >gi|224461450|gb|ACDD01000052.1| GENE 1 6 - 113 123 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCYNNLQVREVEERKQYYEKDNDGDDILFLFFISG >gi|224461450|gb|ACDD01000052.1| GENE 2 58 - 615 493 185 aa, chain + ## HITS:1 COG:FN1114 KEGG:ns NR:ns ## COG: FN1114 COG3683 # Protein_GI_number: 19704449 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 185 16 196 196 141 45.0 5e-34 MKKIMMGMIFYFCFLFQDSLAHPHVFFDTQVSIQIEKKKMEGVEVTLLLDEMNTLLNQKV FRASKEGDVKDKNIVFLKYLYSHIRVFWNGKRIPKQDILFELAMLEEEQLRIDFFVSIDK PIQPKDKLSISFYDTDYYYTYDYNKSSFHLNGLEKGRWNTRFYTDKGISFYFKTVHPDIY EVIFE >gi|224461450|gb|ACDD01000052.1| GENE 3 612 - 1379 744 255 aa, chain + ## HITS:1 COG:FN1115 KEGG:ns NR:ns ## COG: FN1115 COG2215 # Protein_GI_number: 19704450 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 17 249 1 239 244 135 37.0 6e-32 MRKKIFLGIFIVLILIMLWKFPNIYRFLILEQKHFIQLMKQSIREQQDGVLGILIVLTFF 
YGLIHSLGPGHGKSFLVTYVLKEKIATWKLLCMTAMIAYLQAFLAYVFVTFILDLASQSS MLSLYTLDQKTRFLSAIMIVLIASFDFILLFRKKEESPKECWLFAGVVGLCPCPGVMSVL LFLNLLGYEAYSKMFTLSTATGIFCMLSVFGFMAGKMKEYLVQESSPKILEYLHIIGIIL LFGIGIYQIYFSIFI >gi|224461450|gb|ACDD01000052.1| GENE 4 1407 - 2054 704 215 aa, chain - ## HITS:1 COG:FN0890 KEGG:ns NR:ns ## COG: FN0890 COG1564 # Protein_GI_number: 19704225 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Fusobacterium nucleatum # 1 208 1 207 209 161 42.0 6e-40 MKRAYLFLNGELRGSQNFYQNLLQEKQGDIFCVDGGSRHLQSLGITPKELWGDLDSTSPI LRVEWEKQGCQVFQFPIEKDFTDFELLLQSLEQRSYEEWIVIGGLGGDTDHLLSNLYLCI QYPKIQFLSEEESIFLSPSHYLFQNLQGHKVSFIPFSNSILSLSLKGFQYNLSSYHLQQG ETLCHGNTIVKEKAEITFENGLLLVVLKNKKLISK >gi|224461450|gb|ACDD01000052.1| GENE 5 2054 - 2878 967 274 aa, chain - ## HITS:1 COG:no KEGG:FN0891 NR:ns ## KEGG: FN0891 # Name: not_defined # Def: DNAse I homologous protein DHP2 precursor (EC:3.1.21.-) # Organism: F.nucleatum # Pathway: not_defined # 9 274 14 279 279 320 60.0 3e-86 MKLFYQCLLFLCLSIASFAQEAYIASFNVLKLGESPKDFETMAKTIEHFDLVGLEEVITP EGLERLVKSLNKYTNHTWDYHISPFPVGTRKYKEYYAYVWKKDRVTFLSSEGFYPDREKL FIREPYGANFQIGKFDFTFVLQHAVYGKSETERRAEAFQLVKVYRYFQDRNKKENDILIG GDFNLSAFDEAFSSLYEDKDQIIYGVDPRIKTTIGMKKMANSYDNIFLSKKYTEEFTGKS GAIDFTNRQYKVMRNKVSDHLPVFIIVNIDRDDD >gi|224461450|gb|ACDD01000052.1| GENE 6 3020 - 3748 974 242 aa, chain + ## HITS:1 COG:FN0892 KEGG:ns NR:ns ## COG: FN0892 COG0560 # Protein_GI_number: 19704227 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Fusobacterium nucleatum # 1 241 3 244 247 300 64.0 1e-81 MKQIAAFFDIDGTIYRNSLMIEHFKKLIKYELLDMEAYQQHVEESFKLWDTRTGDYDEYL NKLVQSYVKAMKGMLVSYNDFISDQVVYLKGNRVYAYTREKIKWHKEQGHKVIFISGSPD FLVSRMAKKWEADDYKASQYLLDETEKEYSGEIIPMWDSVHKIQALEEFRKKYDIDLTKS YAYGDTNGDISMLTSVGFPRAINPSRELVMKIKETPYLQENAKIIIERKDVIYELDANVK IK >gi|224461450|gb|ACDD01000052.1| GENE 7 3906 - 4559 734 217 aa, chain + ## HITS:1 COG:FN0186 KEGG:ns NR:ns ## COG: FN0186 COG0785 # 
Protein_GI_number: 19703531 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Fusobacterium nucleatum # 1 217 1 217 217 231 73.0 7e-61 MLNGELFVGAVYLAGLLSFFSPCIFPLLPVYLGMLSSGGKRSLLKTIVFVIGLSSSFVLL GFGAGSVGALLTSSTFRIISGVVVILFGFIQMDVIKASFLERTKLVELKQKEEDSVLGAF ILGFTFSLGWTPCVGPILTSILFLSSGGGSPVYGALMMFIYVLGLATPFLIFSFFSKQLG SKMGSFRKYLVPLKKIGGVLIVIMGILLLTDRLNLFV >gi|224461450|gb|ACDD01000052.1| GENE 8 4579 - 6156 1812 525 aa, chain + ## HITS:1 COG:FN0188_2 KEGG:ns NR:ns ## COG: FN0188_2 COG0229 # Protein_GI_number: 19703533 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Fusobacterium nucleatum # 378 525 1 148 148 269 86.0 1e-71 MKIWKKLLCSAMLVLGCSTAFAKGEDFSKIMLKDVAGKEYSFGKMKKPTYVKFWASWCPV CLSGLEEIDNLSKEKKDFEVVTVVFPGKKGEKSAVDFKEWYRSLEYKNVKVLLDEKGELL KLVNPRVYPTSVVLDASSKVQKVIPGHLGKAEIKSLFPMLKMDKKMDSKMMNDMMMKDNK MDKMMKDDKNMKMEKAGDSKKRKKKDNVVTKNIREIYLAGGCFWGVEAYMEKIYGVVDAV SGYANGKTKNPKYEDLIYRGSGHAEAVFVKYDANKISLETLLKYYFRIIDPTSVNKQGND RGTQYRTGIYYKDIQDKKIIDAEIRLQQQKYKKKIVVEVLSLQNFYKAEEYHQDYLKKNP NGYCHIDLSKAHDIIIDKKKYPKLSEKELKMKLNVQQYKVTQQGDTERAFQNDYWNFFEA GIYVDITTGEPLFSSKDKYNSACGWPSFTKAIVPEVVTYHKDTSFNMIRTEVRSRSGNAH LGHVFDDGPRDRGGKRYCINSAAIQFIPLKEMEEKGYGYLLSLVK >gi|224461450|gb|ACDD01000052.1| GENE 9 6169 - 6942 1141 257 aa, chain + ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 2 257 3 260 261 305 68.0 4e-83 MRLLIADDEPLIRRGIKKLVNLSEIGIEEVYEADNGEETLQLFEQYHPEIVLLDINMPRV DGLTVAKEIKSLSPETKIAMLTGYNYFDYAQKAIRIGVEDYILKPVSKKEITEIIAKLAH SYQEERKQQTIQKVFQKKVEVIQENSKNDYHSNMKRYMEENYTDSQFSLGVLAEKLNLSS 
GYLSILFKKTFGIPFQDYLLQLRMEKAKLLLLTTHLKNYEIAEQIGFEDVNYFSLKFKKY FRLSPKQYKEMVLKNEN >gi|224461450|gb|ACDD01000052.1| GENE 10 6932 - 8587 1655 551 aa, chain + ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 4 549 5 550 552 749 72.0 0 MKINRPLNVKIGIYFLLTNFILVILLGSIFYFSSSNLLIQKDISAAEEAIARSGNYIELY ANKLTSFSELISQDESVYRYLKYKDESEKARILRMIQNTLKTDAYIQSIILLRKDGYVIS NEKNVNMEISSDMMKEEWYVQALKNSMPILNPLRKQNFSQDDMEHWVISVSREIHDENGE NLGVLLIDVKYQALHEYLQSRELGEQGDTIILDELERIVYYKDIPCMNAKNTCLQRFRTI QEGYDRSNNTIMVKYPIHHTNWVLVGISSLEEIRSLKVHFFELIFMSALASIIITWVISS FILNRITKPVRELEKHMSHFSESLSKVSLTGDVSAEILSLQNHFNDMIEKIKYLREYEIN ALHSQINPHFLYNTLDTIIWMAEFEDTEKVISITKALANFFRISLSNGKEKIPLKEEIRH IQEYLYIQKQRYEDKLEYEFDINSSLENIEVPKIILQPLVENALYHGIKNLQGAGKIRIY SRIFEKKFELIVEDNGVGFEKAKQQATMKMGGVGVKNVNKRIQFYYGEEYGVKIDSGFTA GARVIISLPLM >gi|224461450|gb|ACDD01000052.1| GENE 11 8736 - 9218 651 160 aa, chain + ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 9 158 23 175 176 167 50.0 6e-42 MICKDNIGFKKLEEVINEVEEKEMAIIPILHKAQEIFGYLPEEVQQFISQKTNIPIGRIY GIVTFYNFFSTNPKGKHQISVCTGTACYVRGAQKVLDEIKKELGIDVGQTTEDGLFSLDC LRCIGACGLAPVMMIDSDVHGKLEKEQVKEILSFYRNQKA >gi|224461450|gb|ACDD01000052.1| GENE 12 9231 - 11015 2215 594 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 6 526 8 527 527 649 58.0 0 MCERKIYICGGTGCMSSKSKRLKENIEAILASNHLEDKVEVRLTGCFGFCEKGPIVKIMP DNTFYTEVNPRDAIEIVETHIIYGKKIERLLYQDPKTGEIIHNTEDMNFYQKQERRILHN 
CGVINPESVEDYLEQDGFRAIQKALQEMTPVKVIQEIQNSGLRGRGGGGFPTGIKWEIAS KQEGNEKYIVCNADEGDPGAFMDRSILEGDPYGVIEGMMIAGYAIGANHALIYIRAEYPL AISRLQKAIEQARKKGYLGKHIFGTSFSFDVNLKFGAGAFVCGEETALIQSMQGERGEPK SKPPYPAQSGYLGKPTVVNNVETLLNVPLIIQHGSEWFREIGTEKSPGTKVFALAGKVNN VGLVEVPMGTTLREIIYEIGGGIKNGKRFKAVQTGGPSGGCLTNKDLDISIDFDTLAARG SIMGSGGMIIMDEDDCMVSIAKFFLEFTLDESCGKCTPCRIGNTRLYEILTRITEGEGTM EDLKLLEELSDTIKEASLCGLGQTSPNPVLSTLKEFREEYIQHIEDKTCLAGVCQKLTHY RITDKCVGCTLCARNCPVHAIVGTVKKQHIISQELCIKCGICYDRCKFGAITRA >gi|224461450|gb|ACDD01000052.1| GENE 13 11033 - 12736 1774 567 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 217 558 5 357 372 274 42.0 3e-73 MVSLEIDGKMLEVKEGRTILEAAKEIGIEIPHLCYMNLEEIGFKNDCSSCRICVVEVEGQ RRLIPSCSTPVANGMKIWTNTKRVMQKRRNIVELLLSDHPKDCLICGKNGNCELQKIAIS FGIRKIRFSGRESSYEKEESVAITRDVTKCIMCRRCESICRDIQSCNILTGVRRGFSAVV DTAFSRSLQHTRCTFCGQCVSVCPTGAIYETDNSFQLFQDIMNEEKIVVMQVAPAVRVAI GEMFGMEAGTDVTGKLVSALKKIGIDYVFDTNFAADVTVMEEATELKYRMEHGKILPIFT SCCPAWVRFLQQNYPEMEKYLSSTKSPQEIFGAIAKHIFQKEQEKEVVCVSLMPCVAKKY EASIGKDVNYSVTTREIVNLLKQFNIDLSLMPEEDFDQPFATSSGGGDIFGRSGGVMEAT ARTLYYLLEKEDLKEVAFHNLRGFDGLKFSEVKIGEKVLRLAVVHGLRQAREVVEAIRNG QLQIDALEVMACKGGCLAGGGQPYHHGDFSIIQKRTEAIQRLDDRNSIQCSHQNQDVLRM YREKIGSIYGDEAKELFHYEKGRKLVI >gi|224461450|gb|ACDD01000052.1| GENE 14 12785 - 13660 996 291 aa, chain + ## HITS:1 COG:FN0489 KEGG:ns NR:ns ## COG: FN0489 COG0682 # Protein_GI_number: 19703824 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Fusobacterium nucleatum # 1 286 1 286 288 394 74.0 1e-110 MQPVIFSIGGFELHYYGLMYAFAFLVGIQLAKKMAKERAFDINIIENYAFVAILSGLLGG RLYYVAFNLSYYLQNPMEILAVWHGGMAIHGGILGGILGTYIYGAIKKINPLTLGDFAAA PFLLGQAIGRIGNLMNGEVHGVPTFTPWSVIFQWKPKFYEWYTQYLTLPIEEQKKFPDLV PWGLTFPSSSPAGMEFPNLALHPAMLYELVLNLVGTAILWFILRKKTEKAPGFLWWHYII 
FYSINRIIISFFRAEDLMFYSFRAPHIISAILIMISIVALVFSQKKKEKKC >gi|224461450|gb|ACDD01000052.1| GENE 15 13654 - 14958 1363 434 aa, chain + ## HITS:1 COG:FN0978 KEGG:ns NR:ns ## COG: FN0978 COG1757 # Protein_GI_number: 19704313 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 1 430 1 430 431 390 56.0 1e-108 MLGILSIFLFAVILIVCIFYQVSIIYALVLGSLIFLAYGIIEGYSFSELWKMILSGVLTV KNILIVFLLIGMITATWRASGTIAMIIFLGSKLITPSIFILLSFLLCALLSVLIGTALGT SATMGVICISIARAMGIDELFVAGAVLSGIYFGDRCSPMSTSALLISEITETNLFENIKA MIKTSIIPLLITCALYFILGMKSEGSADVSVISSLFQENYRLHWIVLLPAIFMILLSFFK VNVRITMSISIFLSFGIAYFVQGEEIENLFQYLIYGYRHSNVALNKMMHGGGILSMWKVS LIVGISSSYSGIFAKTNILTKLKEYIKILSKKITDFGAVLVTSVITCMIACNQSLAVIMT QQLCKDIMKKEKLAITLENTVITVAALVPWSVAMAVPFQALEIDNIAAIYGFYVYLIPLW NLGMAIKKEKVEVN >gi|224461450|gb|ACDD01000052.1| GENE 16 14966 - 17494 1925 842 aa, chain - ## HITS:1 COG:FN0374 KEGG:ns NR:ns ## COG: FN0374 COG0608 # Protein_GI_number: 19703716 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Fusobacterium nucleatum # 2 842 3 844 844 691 44.0 0 MRNTKWIYQNYKYYPQKIEKKEAIHSIVYSIMKERNLSHQENFNTNPFLLKDMEEAVSLL QEAKKKKQTIWIYGDYDVDGITSVSLCYLALSELGYEVEYYIPLRDEGYGLNQEALQSIY NQGGKIVITVDCGIVSSKEVDFANSLGMTMIVTDHHELQGELPKAAAVINPKRKENIYPF PSLAGVGTAFFLMTALFEKEGKRKEITKYFDIVALGTIADIVPLIENNRILVQQGLSLLA KSQWTGLRILVKRLFPDYETHHFSAYDVGFIIAPIFNAAGRLEDAKSSVRLFLEKDSKKA NEQIDYLIQNNLDRRAVQEKILQACLEEISQKKLEDKNSIVIAREGFHHGVIGIVASKLV DRFYKPTIIMEIKANEGIATASCRSISGINIVESLEAVSHLLLRYGGHSGAAGFSILIEN IAKFYEEFEAILEDKISKEITTRKLNITKELLPFQIQYPLLHDMKYLEPFGASNPAPIFS LKHCKLDKIRLIGADKKHIMCNIHHGDTIFWNCVWFQAFDIYEELLYIQEVDVAFHLKLE TYRGRYQYKIFIDDIQSSNTTNEVRYHQEEIEYSYVQFPYEVILYLKHTNLSENLSLNFE EREVRLFSNRSYIAYLDSNTSKILHYWKQEKNCNFHVRKKEVFLEEEHYKIHLEITINED FHSYSLKEGQLFQDIKNFLLGKEGKYNSIQTKILASLFKKRQNTLATMECGRGIRTLINT IKLYADYTKQQYQILENWDEKEKIETQCQFHIFLFPKTPKKIPALSSRILILTGQDQILE GYFTIEDSYSLPKNIHWIEEEEISKHKIVFSHRLRKEKQKKILEQLLNLQDFYATKDLLV HL 
>gi|224461450|gb|ACDD01000052.1| GENE 17 17491 - 19323 1561 610 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452674|ref|ZP_05617973.1| ## NR: gi|257452674|ref|ZP_05617973.1| hypothetical protein F3_06374 [Fusobacterium sp. 3_1_5R] # 1 610 1 610 610 985 100.0 0 MKRKVLYTLFYLFLSSHFLFSYSYEDYELYLKAKKEYQEKKYQEAYETLSLLKRIFPYSR VQKSKLSDYYLALIQYQLDQKEEAIRGLTANILPLHTEERDYLLGTLYMNKKNPKQANLY FQRLLSSEYSYSHEKIEKKIEQILCKNNPYYQHYFAAKFYQNFESISNLTKKDILEIASY LSSKGEEQNSQTLLLKFLKENQGKKEDFFPFYSALLNSFFQTKSYDKVIQYANLFSKVDI QAIENRDFYLLQKARAYHHKKQYIKAISCYETIKNPRYQSDASLELAAIHYTLENYDTVI QILEKKSPKTTYDWKLLGNSYFILKEREKFLSVAQKIEEKESNAYENILYHYLIAHPKET IDKDNSLYFTNFVVNRYLENLYPFDSSDTLKSTLLEYEKLKDFAPMYDRDLIELEFKNSH FYYKSNIETAYAVSKFYEKFGFYDLAYQNSKRNASLFSRFKNSISLLFPRYYPELIKKYS LQYNISEEILNTLILLSSEWNNNYEKENKLGLFALDFRNTSEASNLKNPEISIKLACQKL KKIQKKYPQALATMIVFLYGESYYKELIWEENGDISLNKISDLNMRYEIQQLILHYCFYK NLYSTLGRKI >gi|224461450|gb|ACDD01000052.1| GENE 18 19473 - 21245 2123 590 aa, chain + ## HITS:1 COG:FN0462 KEGG:ns NR:ns ## COG: FN0462 COG0323 # Protein_GI_number: 19703797 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Fusobacterium nucleatum # 1 590 7 643 643 614 53.0 1e-175 MGKIHILEESVSNAIAAGEVVENPASLVKELLENSLDAGSKNIYLFIREGGRFVEIRDDG MGMSREDVLLSVERHATSKIKSKEDLFALQSYGFRGEALSSIASVSKMSITSCEVDANLG TKMTVLGGKVTGIKDFPRTQGTDIIIQDLFFNTPARLKFLRKASTEYIQIKDIVLKEALA NPEVKIHLEIDGKESICSSGNGLENTILEIFGKNALKNLTKFSYGYLGNEKLYRSSRDSI FVFVNGRPVKAKLIEEAIIDSYYTKLMKGKYPFACVFLEIPASEIDVNVHPSKKIIKFAN ASEVYSQVRNAIEEVFEEEKEFSFAQFSVKEVEEDREEKKILESFEIAENREAKEVFKEI KEEKKYFQEEKTDKIPLFDLQNDKIPVIKDRRSFFEEEKISPKTYDFKILAQIYDTFLLV ERNGIFEIYDQHIVHERVLYEELKEKYYGNAIQKQQLLVPLKLSLDPREKELFFENQEQF SFFGIEGEDFGGNEIIIRSVPSVELKASMEEIIREILYQLQHEKERDIRESMIISMSCKG AIKANQKLVLEDMYPLVQKLHEIGEYTCPHGRPIIMQLPFEELEKWFKRK >gi|224461450|gb|ACDD01000052.1| GENE 19 21276 - 21743 555 155 aa, chain + ## HITS:1 COG:FN0463 KEGG:ns NR:ns ## COG: FN0463 COG1576 # Protein_GI_number: 19703798 # 
Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 155 1 155 155 173 56.0 1e-43 MNVSIVCVGKVKDKYILDGIAEFQKRLQAFTKFDIIEVKEYGREQTIAQSTEKETEELLS VLEKIGGYHILLDLKGKERDSVQMAKHLENLQVQGNSRINFIIGGSDGYTEELRRYCQEG ISFSKFTFPHQLMRLILIEQIYRWFSINHHIKYHK >gi|224461450|gb|ACDD01000052.1| GENE 20 21783 - 22112 587 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452677|ref|ZP_05617976.1| ## NR: gi|257452677|ref|ZP_05617976.1| hypothetical protein F3_06389 [Fusobacterium sp. 3_1_5R] # 1 109 1 109 109 124 100.0 2e-27 MYSQGETFYYDIEDEEYELSVLSTFLVGEQEYLITEDFDGTLHVFIYDEDEDDIFLVEDE DEAAQLIQDWKDEYLDGEDIGDYEDDEYYDREDRYQEESYNEIEEDDEY >gi|224461450|gb|ACDD01000052.1| GENE 21 22112 - 22498 311 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452678|ref|ZP_05617977.1| ## NR: gi|257452678|ref|ZP_05617977.1| hypothetical protein F3_06394 [Fusobacterium sp. 3_1_5R] # 1 128 1 128 128 225 100.0 6e-58 MIPEKWMIKSQDTYGKNLEVVNLKEFQQNGVYSYYYDSRLGECELVFFEKENRISLLRKG KNELHLNLQVGRTFEMKYQAEGYQDTFFVRALSCKREEGIFEFSYDILEENGERINQIVI QMKRRKSR >gi|224461450|gb|ACDD01000052.1| GENE 22 22495 - 23727 1280 410 aa, chain + ## HITS:1 COG:no KEGG:FN0465 NR:ns ## KEGG: FN0465 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 7 401 1 403 410 116 28.0 1e-24 MKKLLMLVGMSLLTACTSLDATKAKEDIREILLPKNQLENSTPMENTKIEEKVEVEKSEW KLSLETMPEVLTSIRMELKNNQKMVFDAKVNKISLYVGQTAVIKDNAGMNKLKLLVSPQK SNPNLKTGSSMFTFRSIYQGTYVVAWETLSGVKKQLTIENHLKYKFTEEENYDIILRSFQ EQNLKALEESVALYRMSFSNGKNTRKSMLSLLELATIKKDKKLIRESLQYWSKIQGLNTE ESKAVQEGKKIVGLSKIPEKRVEKEDIKISVENDSSDLVSGNYEQYKSLYRSANRKATLH LYNAAIKDYQKALIIGKKFPETVSIYDGLGNSYYGLGKYQQSIEYFQKSLSHKGNSSERR AETYYKLASAYNKLGEKREYKKYLTLLKERYANSLWGKKAQIELMKLNER >gi|224461450|gb|ACDD01000052.1| GENE 23 23741 - 25222 2202 493 aa, chain + ## HITS:1 COG:FN0466 KEGG:ns NR:ns ## COG: FN0466 COG1190 # Protein_GI_number: 19703801 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA 
synthetase (class II) # Organism: Fusobacterium nucleatum # 1 493 1 493 493 808 79.0 0 MEKYFDRAAKESLIMEKWKKIEELKEMGIKPFGGKYDKKHMVGDILKHTPEEELIFKTAG RIMSFRRKGKIAFAHIEDQTGKIQIYVKQDELGEEAFQLVKMLNVGDMVGIEGTLFITHT GELTLRANVVTLLTKNIRALPEKFHGLTDVETRYRKRYVDLIMNREVKETFLKRTMIIKE LKKYLDDRGFLEVETPMMHPIVGGAAARPFITHHNTLDVDLYMRIAPELYLKKLIVGGFD KVYDLNKCFRNEGMSTRHNPEFTTVELYQAYADFNDMMDLTEGVITTLCDKVNGTYDITF DGVDLHLKDFKRVHMVDLIKEVTGVDFWRKDITFEEAKAFAKEHHVEIADHMNSVGHVIN EFFEQKCEEKVIQPTFIYGHPVEISPLAKKNEEDPRFTDRFELFINAREYANAFSELNDP ADQRSRFEAQVEEAERGNDEATPVIDDDYVEALEYGLPPTGGLGIGIDRLVMLLTGAPSI RDVILFPQMKPRD >gi|224461450|gb|ACDD01000052.1| GENE 24 25291 - 25659 423 122 aa, chain + ## HITS:1 COG:FN0467 KEGG:ns NR:ns ## COG: FN0467 COG1380 # Protein_GI_number: 19703802 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Fusobacterium nucleatum # 1 115 1 115 118 85 50.0 2e-17 MLTEFLIITSLNYIGVVVAKILHLPIPGTIIGLILLFIFLATKQLKLERIEKISNFLLEN MTILFLPPAINLIAAGSFLEGQILKIIFLMVATTFFTMGITGKVVQFLIEKKEERDERNH RG >gi|224461450|gb|ACDD01000052.1| GENE 25 25637 - 26329 780 230 aa, chain + ## HITS:1 COG:FN0468 KEGG:ns NR:ns ## COG: FN0468 COG1346 # Protein_GI_number: 19703803 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Fusobacterium nucleatum # 1 230 1 230 230 252 63.0 4e-67 MKEIIVDNPYFGIVLTLFFFQIGKFIFQKTQSPLCNPLMIATVLIIALLHFFDIPLDDYT IGGDYILFLLGPATVVLAVPLYKQLNLLKKYFFPVLVGGIVGSFTAILSVIILGKALNFD FVLLLSFMPKSITTPIGIELSTMLGGIPAITIFAILVTGIFGNVSAPFICQVFRIKHPVA KGIGIGVASHAVGTTKAMEMGEIEGAMSALSIVIAGILTLIWAPIIKIFL >gi|224461450|gb|ACDD01000052.1| GENE 26 26475 - 26735 335 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237739595|ref|ZP_04570076.1| SSU ribosomal protein S20P [Fusobacterium sp. 
2_1_31] # 1 86 1 86 90 133 82 3e-30 MANSKSAKKRVAVAERNRERNQAVKTRVKTMNKKVVVAVQDQDAEAAKNALSVAYKELDK AVSKGIMKKNTASRKKSRLAAKVNAL >gi|224461450|gb|ACDD01000052.1| GENE 27 26798 - 27373 637 191 aa, chain + ## HITS:1 COG:FN1880 KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 184 2 189 192 107 35.0 2e-23 MLEKIKKNRSHRSFEQVNIPTEDLHRILTAVSYSASARNAQENRFMFTNSFKQCKQIFKQ TKWAGAISWNPTEEEGPTAYILLCNPSEKPTAMSFVDMGIALQSMTLVAQDLGYSCCILG AYNKKEVEKIFGLPDGYFSFLLLAIGKATDTVEVVITHDLSVKYQREEENHHTVFKLPME DVLLTNIDENN >gi|224461450|gb|ACDD01000052.1| GENE 28 27348 - 28493 771 381 aa, chain - ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 5 380 12 390 391 227 37.0 3e-59 MTKIEKAKYIWKNCFQDSEEETNFYFEKHFQEAQWKYYSKEDKILSSLHENPYTLKIKDS LSSYPYIVGVATLPEDRGQGYMTKLLLEEMLNLRNKNVDFCFLLPINPIIYRGFGFEYFS RKEEYSFDISLLPSQKRNASIQILEITKENLEKHWKDWKKIYSISMIPYTLYEERDFNSF KNLLEEIYLSEGKIYLFYQKNRPSGYLILDTEEDKIHIREFLGTNHKAYLDMFAFLKGYQ EYYSKIQIMSPENSNLEFFFKNQCKIEKKSSPFFMGRILQVQSFLQKLQFIAPEITIFIE DPILFENTGYYTLTSEVSFTQNHIENYDFQIGIRELVPLVLGFFSFQDLLRLGKVKLHSV EHLAKIETLFTRKFNYFHQYW >gi|224461450|gb|ACDD01000052.1| GENE 29 28506 - 28586 165 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFEDYAVELLLMITMITVCKMFMFTL >gi|224461450|gb|ACDD01000052.1| GENE 30 28588 - 30021 1925 477 aa, chain - ## HITS:1 COG:FN1906 KEGG:ns NR:ns ## COG: FN1906 COG0260 # Protein_GI_number: 19705211 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Fusobacterium nucleatum # 1 476 1 477 478 466 50.0 1e-131 MYFQMISKIQKNYDKTISLLAENEIAFCSCVSKQNQDMITKIFQKKKFSAKEGEVCEISF LENENLCTTIFIGLGKKEDLTKNILRESLYSALEKETGHFLISSEDPDLIDLDIFAEIAE 
HINYDFDKYKSKKKDKFLYLDFYNPNQLKFPQESQILSEISSIVRNLINEPAAYMTPDRL SIEAQICSEKYGFEIEILDEHKADSLGMKAFLAVGRAAFDRPKVIVMRYLGNPHSKEKTA LIGKGVCYDTGGLSLKPTSSMLNMKDDMSGAATVIGIMSAVAQNKIKHNVIGVIAACENA IGPNAYRPGDVIGSLNGKTIEVTNTDAEGRLTLADALTYSIRIEKATELIDIATLTGAMY MALGSEACGVITNTPSLYEKLVKASENWREEFWQMPLFKNQKKSLKSSIADIKNSGPRQA GASFAAKFLEEFVEEKPWLHLDVAGTCFSEEGDSYYKKGATGQLLRSVYTYLKDKEL >gi|224461450|gb|ACDD01000052.1| GENE 31 29999 - 30073 56 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWYNKHRKTLNLREEMDYVFSNDF >gi|224461450|gb|ACDD01000052.1| GENE 32 30164 - 30625 583 153 aa, chain + ## HITS:1 COG:FN0601 KEGG:ns NR:ns ## COG: FN0601 COG2849 # Protein_GI_number: 19703936 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 22 148 9 137 141 104 47.0 6e-23 MKRIFILLNFLMFSYWIEARTEIEYKNLEEKDGLVYYQEEIYSGKVTRGKDRYYYQDGKA DGTWLWFYPNGNLKTIETWREGKLQGKYILYLDNGNPIMKTSYSNGKDMGEYLLYYPNGR LRVKGRYEYGKPKGVWEYYTETGKLKGKGKEIL >gi|224461450|gb|ACDD01000052.1| GENE 33 30703 - 32289 2335 528 aa, chain + ## HITS:1 COG:FN1975 KEGG:ns NR:ns ## COG: FN1975 COG0513 # Protein_GI_number: 19705271 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 528 1 528 528 782 79.0 0 MEKKETLKEFRELGIGEKLLKALSKKGYETPTPIQSLTIPALLTGEKDIIGQAQTGTGKT AAFALPILENIEHQDKIQGIVLTPTRELALQVAEEMNSLGSSKKIKIIPVYGGQSIDIQR KLLRNGADIIVGTPGRVIDFIERKFLRLQDLKYFILDEADEMLNMGFLEEVEKILEATNE DKRMLFFSATMPSEILKVAKKHMKDYEILAVKARELTTDLTDQIYFEVNERDKFEALCRI IDLAEDFYGIVFCRTKTDVNEVVGRLNDRGYDAEGLHGDIGQNYREVTLKRFKAKKINIL VATDVAARGIDVNDLSHVINYAIPQEAESYVHRIGRTGRAGKEGTAITFITPQEYRRLLQ IQKIVKTEIRKEEVPEVKDVIQAKKFQIQKDIDEILGEGEYDKFKKLAQDLLKKEEAENI VSSLLKLAYEDVLDESNYNEISSTKSVGGGKARLFVALGRKDGMTAKKLVEKVMKVAKVQ DKKIRNVEVYEAFSFITVPFKEAEIIIDSFKARQKGKKPLIEKAKSQK >gi|224461450|gb|ACDD01000052.1| GENE 34 32301 - 33245 1022 314 aa, chain + ## HITS:1 
COG:FN1976 KEGG:ns NR:ns ## COG: FN1976 COG1559 # Protein_GI_number: 19705272 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Fusobacterium nucleatum # 23 310 22 308 310 299 54.0 4e-81 MKKASYITLFIFIIGILGYGYQQICKKREYQVALNFEYGKNIREELLKINARNHKLFWLY LRYFHQGGKDIKAGYYEVHGQYSWKDVLSMLEEGRGKYQKITIIEGTPLFQVFELLEEKG IGKAEKYREQLQMISFPYPTPDGNWEGYFYPETYNVPENYTEKDVIQLFLQEFLKHFPEE EYPDKEEFYQKLILASLLEREAKLEEEKPMIASVIENRLKKGMRLEIDSTVNYLYQYQKK RIYYKDLEKDSPYNTYRHTGLPPGPICSPTEKSMYAAYHPAKTDFYFFVTKGEGAHHFTK TYQEHINFQKKYKK >gi|224461450|gb|ACDD01000052.1| GENE 35 33262 - 34596 1048 444 aa, chain + ## HITS:1 COG:FN1977 KEGG:ns NR:ns ## COG: FN1977 COG0037 # Protein_GI_number: 19705273 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 1 443 1 448 448 274 40.0 3e-73 MQGFQKFLKDQKKYQYIQEGDRILVAFSGGPDSVFLVEMLLQLQEQLSFQMLLLHLHHMI RQEDADRDYQFCLEYARKKNLEIIAKKLDVPSYAKENRQSLEEAGRNLRYKFFQEIRKEK SYHKIATAHHLDDHLETFFFRLLRGSSMEGLAGISRKQGDRIRPLRDFEKKEILFYLEEH QIPYCHDKTNEEVEYSRNRIRLELLPQFDSYNPKWKEKVASFMEELEENKKGKSIDWRDY SEEDFLNVTKLQKEREYLQQKIIYEYILSKQISVNRKQIHQICTLLKKGGSLSYDLKNFW KFKKEYDRIWIEPIKKEEANMFVNNVEIKVPGEVYFQNYRIKILVCEENRSKGNQEFLWN WDGISSLKVRNFQEGDRIQLAGMKTPKKVKEIFINEKVPREQRKQIPILIYGEEIIALGN LRQAKWNKTDDGKIICIKIEEVRR >gi|224461450|gb|ACDD01000052.1| GENE 36 34593 - 36782 1334 729 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 113 719 13 595 636 518 46 1e-146 MSDKQEKDIMEQEEQKELQETASEENIEQENQKLQTEEEKKEEEIHKTLEEDSKKDTTSR TQEENKKTEKRIYINNEEDLKKILRESFGNSKNNKNPKKLGGKFNFVGFLLLVFIVAVVL SFPKFMKDSKSGEELHEVSYTSFVKSIDEKKFQRIEEREGYLYGYLSGEKEEFRLNVSEE KTGTTATVVYKARMITDRLGEDSNVVSKMEAAGLDVKAIPPAQTPFILNLLASWLPILLL IGVWVFMLRGVGKGGGGGPQIFNVGKSKAKENGENITQISFADVAGIDEAKQELEEVVEF LREPEKFKKIGARIPKGVLLLGSPGTGKTLLAKAVAGEAKVPFFSMSGSEFVEMFVGVGA SRVRDLFAKARKNAPCIVFIDEIDAVGRKRGTGQGGGNDEREQTLNQLLVEMDGFGNEET IIVLAATNRPDVLDRALKRPGRFDRQVYVDKPDLKGRVEILKVHAKNKKFSQDVDFEIIG KKTAGLVGADLANILNEAAIIAARANRDEINMMDLEEASEKVEMGPEKKSKVVSERDKKL TAYHETGHAIARYALGSEEKVHKITIIPRGAAGGYTMSLPAEEKSYQTKQDLLDFMVFAY GGRAAEEIVFGKENISTGASNDIERATAYAKAIVTRFGMVDEFGPILLDGTQEGDMFERK YYSEQTGKEIDDVVRKIIKTQYQKTLDILKENRDKLEAVTKVILEKETIMGDEFEKIMSS DTKEFTNEV >gi|224461450|gb|ACDD01000052.1| GENE 37 36795 - 36881 62 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKDLSSVLGLASYYLAWIINTKEEIQWL >gi|224461450|gb|ACDD01000052.1| GENE 38 36872 - 37135 399 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|19705275|ref|NP_602770.1| SSU ribosomal protein S15P [Fusobacterium nucleatum subsp. 
nucleatum ATCC 25586] # 3 87 1 85 85 158 92 1e-37 MAMRSKEEIIREFGKKEGDTGSTEVQIALLTEKINHLTEHLRVHKKDFHSRLGLLKMVGQ RKRLLSYLTKKDLEGYRALIAKLGIRK >gi|224461450|gb|ACDD01000052.1| GENE 39 37195 - 38277 1109 360 aa, chain + ## HITS:1 COG:FN1980 KEGG:ns NR:ns ## COG: FN1980 COG5438 # Protein_GI_number: 19705276 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Fusobacterium nucleatum # 1 357 1 369 369 313 47.0 5e-85 MKKILIVLLSFFLFQMMYGEEEYVRGKILSLEDIITADSGDEEVQEVYIYRVKFLSGDRK GEEVSIEYPIYREEEYNIGAKPGDKVVLYYESNEIGDEKYYISDIDKRSQLLGISGLFIL LTLFISKKNGLKALLALGITVLFVIKVFIPSILLGYSPILFSVITGIFSTFVTIYLMTGF EKKGFIAIVGTLGGVLFAGILSYIAVNTMRLTGYETTDSLSFASYLKGIKLRELISAGVI IGSMGAVMDVAMSMSTAMHEIHQKKSDIGRKELFYSAMKMGNDMIGTMVNTLILAYIGGS LLLTVMVYIQREQFPMIRLLNFENIATEILRSISGSIGILICVPITAYVGSILYGKKTKR >gi|224461450|gb|ACDD01000052.1| GENE 40 38373 - 39908 2327 511 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 509 1 510 511 640 60.0 0 MRKKHGIIMALLCMLMFVLTACGGGDKAATAPAEKDTLIVADGASPKTLDPRATNDNVSA RVMVQIYDTLVEQDENTQIQPGLAESWEQADDVTTIFHLRKGVKFHNGEELKASDVKFSL DAMKASPQTSEIIEPLKEVVVLDDYTVKVVTEFPFAPILNHLAHPTASIVNEKAVKEAGE SYGQHPVGTGPFKFVDWQSGDRVTLEANEEYYKGASPIKHLIFKNVVEITNRTIGLETGE IDIAYDIEGLDKLKIAEDPKLNLVEDLDLSMVYLGFNLKKAPFDNIKVRQAIAYAIDQQP IIDTAFQGAAFPANSIIGPKIFAHSDKGIKYQQNLEKAKALLAEAGYRDGFKTEIWINDN PTRRDIAVILQDQLKQVGIDVEVKTLEWGAYLDGTARGDHQMFILGWGTVTADPDYGINN LVSTKTVGAAGNRSFYSNPKVDELLQKGRSTIDPEARKAIYEEIQVILQEDLPMYYIVYP KKTVGMQKYIEGFKFNPAGHHRIYGVSFKAE >gi|224461450|gb|ACDD01000052.1| GENE 41 39994 - 40845 961 283 aa, chain + ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel 
transport systems, permease components # Organism: Fusobacterium nucleatum # 1 224 1 224 308 358 83.0 5e-99 MYKYVIRRLLLLIPVLLGISLLVFAIMYVTPGDPAQLMLGENAPKAAVEALREKMGLNDP FIVQYFRFVGKAITGDFGRSYTTGREVFAEIFSRFPNTLVLAILGIIISVVIGIPIGIIS ATKQYSAVDSISMVLALLGVSMPVFWLGLMLILLFSVKLGLLPSGGFDGLKSVILPALTL GVGSAAIVTRMTRSSMLEVIRQDYIRTARAKGVSEKVVINKHALCFNSYYYRSRTTIWTL IRWSCINRICIFMARSWKNDGRCNSAKRFSNCISSRYFSSSSI >gi|224461450|gb|ACDD01000052.1| GENE 42 40769 - 40915 229 48 aa, chain + ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 48 261 308 308 74 79.0 4e-14 MMVDAIRQKDSPTVLAAVIFLAAAFSIVNLLVDILYAYVDPRIKSQYK >gi|224461450|gb|ACDD01000052.1| GENE 43 40930 - 41796 1587 288 aa, chain + ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 5 288 6 289 289 433 83.0 1e-121 MAAANKKRSQWCEVWRMLTKNKMAMLGLFILLFLIILALFADIIYDYDTVVIKQNLSHRL QGPSGAHWLGTDEFGRDILARLVHGARVSLKVGILAVGLSIVLGGILGAISGFYGGTIDN IIMRAMDIFLAVPSILLAIAIVSALGPSMINLMVAISVSSVPTYARIVRASVLSIRDQEF IEAAKAIGASNTRIIFKHIIPNALAPVIVQGTLGVANAILSIAGLSFIGLGIQPPAPEWG SMLSGGRQYLRYAWWVTTFPGLAIMVTILSLNLLGDGLRDALDPRLKQ >gi|224461450|gb|ACDD01000052.1| GENE 44 41817 - 42824 604 335 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 22 320 35 328 329 237 42 2e-61 MDGKLLDIKNLEVQYVTDEETVYAVNGIDISLNEGETLGLVGETGAGKTTTALGIMRLVP NPPGKIMGGEIIYEGENLLKLPEEEMRKIRGNKISMIFQDPMTSLNPVMTVGEQIAEVIQ IHENITTEESMKKAGEMLELVGIPAARINDFPHQFSGGMKQRVVIAIALACNPKLLIADE 
PTTALDVTIQAQVLDLMNNLKEKFKTAMILITHDLGVVAQVCDKVAIMYAGEIVESGSLE EIFENTKHPYTLGLFGSIPSLDEERTRLIPIRGLMPDPTNLPEGCKFNPRCPHATDLCRQ KIPNAVEVSPGHKVKCFIAEGLVEFKEGWEAKDGE >gi|224461450|gb|ACDD01000052.1| GENE 45 42814 - 43785 828 323 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 6 313 11 324 329 323 52 2e-87 MENKVLLEVKNLKKYFQTPKGPLHAVDGVNFSIEEGKTLGVVGESGCGKSTTGRVILRLL EATDGEILFEGKNIREYSKAEMSKLRQEMQIIFQDPFASLNPRMTVSEIIAEPLIIHKQC KSKKELEDRVLELMETVGLSQRLMNTYPHELDGGRRQRIGIARALALRPKFIVCDEPVSA LDVSIQAQVLNLMQDLQEKLGLTYMFITHDLSVVKHFSDDIAVMYLGELVEKAPSKELFR NPIHPYTKALLSAIPSTNIRNKMERIRLEGEITSPINPEPGCRFAKRCIYAQDICRKESP KLQEVHGNHFFACHRAEELDFIK >gi|224461450|gb|ACDD01000052.1| GENE 46 43892 - 44137 412 81 aa, chain + ## HITS:1 COG:no KEGG:FN0683 NR:ns ## KEGG: FN0683 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 81 1 81 81 122 72.0 5e-27 MHDGCSGKFEDGKQVVQKLRMMGFSEQLMPVPAVFVCEDCKHEIVMDTFEYVCPHCNTIY AVTPCHAFDVENILSAGKKKE >gi|224461450|gb|ACDD01000052.1| GENE 47 44315 - 44401 160 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSTSVSHPFGLLGAERSFLEMIYDIIY >gi|224461450|gb|ACDD01000052.1| GENE 48 44471 - 45130 771 219 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4684 NR:ns ## KEGG: Fjoh_4684 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 214 1 213 214 159 42.0 9e-38 MTLEEFRELLEENEDWSPGWDAIEEAFSKVYKNQEPTHFGTLLPSRAVFGGKEFLDGYSM YQSSKGYKHLVSFGMTELYAEEEALGGEYSKWGYEMTIKLKEKEEEKCMWAVDMFSNLAR YTFQSNSFFEEFQYIAGDGTSICKDKKSKITALMTILDTEISPIDTIYGRVEFIQLVGIT ERELQKIQENPQNMKVLYERMKEDNPDFVLDLERTKSYL >gi|224461450|gb|ACDD01000052.1| GENE 49 45239 - 45886 792 215 aa, chain + ## HITS:1 COG:TM0961 KEGG:ns NR:ns ## COG: TM0961 COG1704 # Protein_GI_number: 15643721 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 19 155 32 180 193 79 34.0 4e-15 
MFLIVLILIIIGVNFIKSYNRLQALAQKVKSSNSDIKNAIFRKVELTNKLMEIAKGYANH EKLIFIKTSEDFSSAYKDSSESLAHLKSLSVHFPELKANENYLDLSEKITTNEDLIMERR DAYNMAAQSYNAERLKFPFVLFSSSLGFREAPYLDLESNQKIDEFTTDDGEIIKDIFRNA ANTTTDFTKKGIEKIQKTAQNINKKEEVEDEEKEE >gi|224461450|gb|ACDD01000052.1| GENE 50 46034 - 48781 3132 915 aa, chain + ## HITS:1 COG:FN1836 KEGG:ns NR:ns ## COG: FN1836 COG0457 # Protein_GI_number: 19705141 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 16 914 14 936 936 559 41.0 1e-159 MKKKISTLGIYFLLTTLSFSGEREDFQRIDTLYKERNFDAALQQSVQYIKNYPSSSRILE MRNQVGKLYFIKREYSKAREQFRAILAMEPSGSTKNETYYYLARIYAALGENDQNRFALT QIKTSSSFYAKAHYESAIQYMEKMKYQEAIQLLAVPIHKKGDFYAESLLNTALAYFNQED FISSKKYLLEYSSVEQRKNRSLVEYLYGTMLYKENKVLDAVQRLETLVQQDSTSLYGKKA ILTLIEIYSNQGNAGKVEEKLAKLQGTPEYNRAMTMIGDLYVSKQQYQKALEMYAKSNQQ NDPRLLYGKAYSLYKLNRLEEALQFFEKLRSTDYYNQAIYHIFAIEYRLKHYQRILDNRH IMKRVVVTQTDNDNINTIIANAAYELGEYTLSKDYYGRLYAITPKKENLFRIILMDSKTM DIEDMARRFADYRKYYPSDTEFRKEITQAVGEAYYKAGKTEEAISVYRAYLQEKYDLEIT QALTVALLKEKRYGEMEEYLSRVPEGKENQYLRGIAAMGSGDHAQAEVYFYQMLTRLEEG NPEIPNIQLNRVRNFFLMEKYPEVTRVGESYLAKFASGKERQEVLDKVALSYFRLANYEK AREYDRQISMIDGFEEYGKFQIADTYYNEKKYQEAANQYKDLFTAYPNGKYAEQARYWYA NSLAMAGNQAAFTTEKQNFMRDYPNSSFVDNLTSLDKNLKSEMATKHLEESIKNKKTKNA QDLVSQVQSPEDKAYYQAKVYDSQKKSDLARKEYEKLLQSAKYKDYANLQLGNYWYARKD WKKAKNYYSSANSLGGAGNKDFVLYQLANLNAMEGKEQEALKLYRTVYKSYPGKYGVQAK TKAAEIFEKIGDEKSYLFLYQELSKVKEKEIRSYALEKLLYFSLEKENMKQAKIHYEALK KQDATKAKKYQDFFR >gi|224461450|gb|ACDD01000052.1| GENE 51 48798 - 49163 481 121 aa, chain + ## HITS:1 COG:no KEGG:FN1835 NR:ns ## KEGG: FN1835 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 39 120 40 121 121 69 50.0 3e-11 MNKNKILISAFFLWSFFLFAEGEVRELPVEGATTTSNVLDKGITAQNEQTLEVKELDTQE LLLQNQDLESSSIKITGKALKEQQQQVKVVQNDTLKIEEELAAGVKPKSFWQKIKDFFTG E >gi|224461450|gb|ACDD01000052.1| GENE 52 49178 - 49789 908 203 aa, chain + ## HITS:1 COG:FN1834 KEGG:ns NR:ns ## COG: FN1834 COG0811 # Protein_GI_number: 
19705139 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 17 202 1 186 187 245 68.0 3e-65 MQFMKIGGLLMWFIFALGVMGLYAILERTVYFTVKERNSIANLNKKLKDLLEKNKIKEAI VYLNSNKSSSARVLQAILIYGYKENKESLEALEEKGKEVAIQQLRYLERNMWLISVAAHV APLVGLLGTVTGMIKAFQAVALYGTGDPAVLAKGISEALYTTAGGLFVAIPAMILYNYYN KKIDSIVSDIEQSSTELLNYFRR >gi|224461450|gb|ACDD01000052.1| GENE 53 49802 - 50236 590 144 aa, chain + ## HITS:1 COG:FN1833 KEGG:ns NR:ns ## COG: FN1833 COG0848 # Protein_GI_number: 19705138 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Fusobacterium nucleatum # 34 143 1 112 114 83 40.0 1e-16 MKLERNKRRGAGELALEMTPMIDVVFLLLIFFMLATTFDDKAGIKIDLPKSAIREEKVVH KLQLFADKDKNLYLLYEEAGKETRLSISQEELEGKLQEQLQRAEDKNLVISADQSLSHGY IVELMSSAKKAGATGLNIDTSYQK >gi|224461450|gb|ACDD01000052.1| GENE 54 50251 - 51030 812 259 aa, chain + ## HITS:1 COG:FN1832 KEGG:ns NR:ns ## COG: FN1832 COG0810 # Protein_GI_number: 19705137 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Fusobacterium nucleatum # 28 259 3 234 234 119 36.0 5e-27 MKKEVDIPCFLLSLGLSFLILFLLSSTLPKASEEVENLKIGLVAMENDNSLDSDGSSTTD AAPSELTKPQLPEPPTLEEIKEERKEESKEPETKVEEKVEEIGEIKKPNLADLKKTISKP KLENPSVNMDRFDKKTSPKNGIGIDIDRILSKATGQKGLPSGSRMGVVDGTAVIQWNPSN PEPSFPEVAKKTGKNGSVVLLITVNEIGDVISVRMEQGSGVPEINEAISKVARTWKVKLV KKGTSVGGTFVLKYSFHLK >gi|224461450|gb|ACDD01000052.1| GENE 55 51047 - 52435 1559 462 aa, chain + ## HITS:1 COG:FN1831 KEGG:ns NR:ns ## COG: FN1831 COG2204 # Protein_GI_number: 19705136 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Fusobacterium nucleatum # 1 462 1 464 464 551 66.0 1e-157 MILLGFRLDTELKEELSNNFENNLNFADNIVDFMECVKTKKYEAIVIEEQNLKDDNLMNL 
VAKVSEFQKKGVIIVLGETSNLKVVAGSIKAGAYDYILKPEENSTIVKIIEKSVKDYKLL AERVDKNRKIGDKLIGRSKEMIDLYKMIGKVAGNDVPVLVVGERGTGKTSVAKAIHQLSN VSDEGFLSINCNSFRGELIERKLFGYEIGAFQGANFNQRGILEQEEMKILHLGNVESLSL DMQSKILYLLEEQKFFRLGGQDAIQSKVRVIASTSEDLESRIQEGKFIEELYRKLRVVEI HIPPLRNRKNDIPFIADHYIMECNIELNTNIRGISRPALKKIMRYDWPGNVNELKNAIKS AMTLSRGTSILLEDLPSSVLGEKVMSKAGIGELSLKEWIRQEIQYYKNENSQDYYGQIIS KVEKELISQILEMTNGKKVETAEILGITRNTLRTKMNNYGLE >gi|224461450|gb|ACDD01000052.1| GENE 56 52439 - 53875 1603 478 aa, chain + ## HITS:1 COG:FN1830 KEGG:ns NR:ns ## COG: FN1830 COG2812 # Protein_GI_number: 19705135 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Fusobacterium nucleatum # 1 475 1 483 484 421 51.0 1e-117 MYITLYRKYRPASFQEVAGEQEIVRALKNALKNNQLSQAYLFTGPRGVGKTTIARLIAKS VNCLAPKEDGEACGVCENCLSFQEGSFLDLIEIDAASNRGIDEIRLLKEKINYQPSQGKK KVYIIDEVHMLTKEAFNALLKTLEEPPSHVIFILATTEPDKILPTIISRCQRYDFKTLSL QDMGNQLQYILSQENLEMEEEVKELIYEASGGSMRDAISILERLLVSASEKKISLEESEK ILGMTPVQKMEQFLHCLLGEEKKEILEELDELWLESVDMEAFLKDFAKFIKNQIKKEKLG IEKGLFIIKNIYEVLNIFRLEEDKRLVAYVLVEKLLKQDSIRTSAMQKYKPVLENEKIKN AEKLTEEKTIISLLDIQNRWEEIIEKAREEKISMGVYLSNAKLVSLENSMLSLSYEESNL FSKEQIQEKQYSSILLKVLEEEFKQKFKLKVFTTISEKKQENRIAKKILDYFGGEIIS >gi|224461450|gb|ACDD01000052.1| GENE 57 53872 - 54696 618 274 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452708|ref|ZP_05618007.1| ## NR: gi|257452708|ref|ZP_05618007.1| hypothetical protein F3_06554 [Fusobacterium sp. 
3_1_5R] # 1 274 1 274 274 436 100.0 1e-121 MILAVLISVALFILSLSFPLIAFVLPSYQLKNSKKWGLRRTILLHFVIMICLWIVQKELF FIYFILPFSISIWYFFFTFILKKEERDMNQIVITALSSSVMLGVYWIVFHKQYQKEYELV LGIYQKAYQLTQGEIQGIHEYISSYFPSMIFQYMMLTVFFCYLVLVGMKKYRDWNLHYIW SIPYIVYSFLSNVFEIENIYVENFGEIAKAILLWYGIKSIYDLLADYFKRFAWILHGASF LLAIEFPNIVFIFGGLIMLLENYWSKKLGEIVKK >gi|224461450|gb|ACDD01000052.1| GENE 58 54719 - 55168 562 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237738486|ref|ZP_04568967.1| LSU ribosomal protein L9P [Fusobacterium mortiferum ATCC 9817] # 1 149 1 149 149 221 71 2e-56 MAKVQVILTEDVAGQGRKGQIVTVSDGYAHNFLIKNKKGILATEEELKKMESRKKKEAQR AEDDRLKAVEVKKALESAKVVLGVKTGENGKLFGAITNKEVSIGIKETFNLDIDRKKIEC NIKALGEHIAVVRLHTEVKAEVKVVAVAK >gi|224461450|gb|ACDD01000052.1| GENE 59 55185 - 56525 1988 446 aa, chain + ## HITS:1 COG:FN1827 KEGG:ns NR:ns ## COG: FN1827 COG0305 # Protein_GI_number: 19705132 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Fusobacterium nucleatum # 4 445 3 445 446 455 56.0 1e-128 MKTLEEISKIPHSLEAEQAILGGIFVEPDLFEEVLEIVSPEDFYKNMYSVIFRSMLEVYR ESNEIDMVLIKNKLLQVHQFTEEQINEELSNILENSFSAVNLKEYARLVKEKAILRRLGE AGRKITEIAYRDDRDAEDILDEAESIVLKVDQQKKGKEIISLREAAKIEFDRLERIEANQ GETVGVTTGFSDLDKDTGGWNPSDLVIVAARPAMGKTAFALNLVLNAAKKGNKSILVFSL EMSTQQLYQRFMSIEAGVALSKIRNGHLDSKDWGRLGAATDIIGNYDITIADIPNVNVLE IRALARKIKSRQDLDMIVIDYLQLIRGSSVRSESRQQEISEISRSLKSLARELDIPIIAL SQLSRSPESRPDKRPMLSDLRESGAIEQDADVVIFLYRDDYYNQESPDAGITEIIIGKQR NGPTATVKLRFFHELTKFANFTTRVD >gi|224461450|gb|ACDD01000052.1| GENE 60 56550 - 57770 1549 406 aa, chain + ## HITS:1 COG:FN1826 KEGG:ns NR:ns ## COG: FN1826 COG0826 # Protein_GI_number: 19705131 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 1 403 4 407 410 679 80.0 0 MKKAELLAPAGNVEKLKTAIHYGADAVFLGGKMFNLRAGSNNFSDEELEECVQYAHERGK RVYVTLNIIPHNEELEQLPDYVKFLEKIGVDAVIVADLGVFQIVKENTNLAISVSTQASN 
TNWRSVKMWKDMGAKRVVLAREISLENIMEIRQKVPDIELEVFVHGAMCMSVSGRCLLSN YMTGRDANRGDCAQSCRWKYSVVEETRPGEYMPVYEDERGTYIFSSKDLCTIEFIDKILE LGVDSLKIEGRMKGIFYVANVVKVYRDALDSFYSGNYEYNPKWKEELEATSNRSYTDGFY KGNPGVEGQNYNNRNSYSQTHQLVAKVEEKISENEYILAIRNRLFVGETLEVISPGISVR DFVMPKMILLNKGREEGEIEQANPNSFVKIVTDIPLSEMDMLRKKL >gi|224461450|gb|ACDD01000052.1| GENE 61 57778 - 58437 777 219 aa, chain + ## HITS:1 COG:no KEGG:FN0749 NR:ns ## KEGG: FN0749 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 26 210 35 219 405 207 54.0 2e-52 MKKYILIFLCSMQGMLWADEIREVPIINMQDVYQKLSLAGKLDFCIFQQAYLGFLTISNK NADYLAIIDYTKPSNEKRFFLLDMINYKIVNQTYVSHAKNTGLDTAVHFSNDRNSMQSSL GFYLTKDTYKGEYGYSLVLEGLEDKINSNAEERRIVMHGGDFAEESYLKTYGFLGRSWGC PVLPKSEIALVIDKLKNRHVLFIAGNDTNYQEITKFKFK >gi|224461450|gb|ACDD01000052.1| GENE 62 58479 - 58958 525 159 aa, chain - ## HITS:1 COG:FN0780 KEGG:ns NR:ns ## COG: FN0780 COG3610 # Protein_GI_number: 19704115 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 6 155 4 153 163 119 49.0 2e-27 MLLYILEVFWALLATLAFSIIFQVTGKRLILSTIAGGIGWIVLSVALHYFQYSSVTSFLF SAMSITIYAEIVAKKMNTTVTTTLIPGLIPLVPGSGIFFTMDNFVQGNYIKAVDLGRETL FVTAAITIGIVFITSLSQMIIRIVKYRTILQKHRKKIKR >gi|224461450|gb|ACDD01000052.1| GENE 63 58959 - 59714 591 251 aa, chain - ## HITS:1 COG:FN0781 KEGG:ns NR:ns ## COG: FN0781 COG2966 # Protein_GI_number: 19704116 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 5 245 13 253 256 252 53.0 5e-67 MQEAKILSLANLTGKTLLQSGAETSRVESCIQQICKHFGLRAQTFVSITCIITSAKNKEG NSLCSVERVTSISNNLHRIDQIHDILLHLEEYDTSRLEKTIYKIRNTQVHKTSTLVTSYF FAAFFFCLLFKGGFQDAIMSGIGGILIFYLSLFTKKLRVNPFFFNTLGGFSCTLTAYLWY KLHILNSVSYASIGTIMLLVPGLALTNAIRDLVAGDLLSGISRACEALLIGTALATGAGF ALFLLFQLEMM >gi|224461450|gb|ACDD01000052.1| GENE 64 59726 - 60454 754 242 aa, chain - ## HITS:1 COG:FN0782 KEGG:ns NR:ns ## COG: FN0782 COG4123 # Protein_GI_number: 19704117 # Func_class: R General 
function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 4 241 6 243 243 277 63.0 1e-74 MEKETTIDLLKKGLKIIQRNDYFNFSLDSLLISEFVKINKQSKKILDLGTGNAAIPLFLS LKTTGQIYGLEIQKVSYDLAIKNIVLNHLEEQIQILHGDMKNWQEFFPRNSFDIVVSNPP FFEFHGNRELLNDLDQLTLARHEISITLEELIQVASNLVKEHGYFYLVHRADRLADIFEL CRKYQLEPKRLQFCHTKRKKNAKILLLEAVKLGKSSLQILPPLFANKEDGSYSEEILTMF EK >gi|224461450|gb|ACDD01000052.1| GENE 65 60702 - 61847 1757 381 aa, chain + ## HITS:1 COG:FN0783 KEGG:ns NR:ns ## COG: FN0783 COG1960 # Protein_GI_number: 19704118 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 381 1 381 381 657 86.0 0 MEFNIPKTHELFRQMIREFVEKEVKPLATELDEEERFPVETVKKMAEIGIMGIPIPTQYG GAGGDNLMYAMAVEELSRACGTTGVVVSAHTSLGSWPILKFGTEAQKQKYLPKMASGEWI GAFGLTEPNAGTDASGQQTTAVFDEEKQEWIINGSKIFITNAGYAHVYVVFAMTDKSKGV KGISAFIIESGTPGFSIGKKEKKLGIRGSATCELIFEDVRIPKENLLGDLGKGFKIAMMT LDGGRIGIASQALGLAQGALDEAVQYVKERKQFGRALSKFQNTAFQLANMEVKVEASRLL VYKAAWNESNHLPYTVDAARAKLFAAETAMEVTTKAVQLFGGYGYTREYPVERMMRDAKI TEIYEGTSEVQRMVISGNLLK >gi|224461450|gb|ACDD01000052.1| GENE 66 61885 - 62682 1162 265 aa, chain + ## HITS:1 COG:FN0784 KEGG:ns NR:ns ## COG: FN0784 COG2086 # Protein_GI_number: 19704119 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Fusobacterium nucleatum # 1 265 1 262 262 385 78.0 1e-107 MKIVVCIKQVPDTTEIKLDPVKGTLIRDGVPSIMNPDDKAGLEEALKLKDLYGAKVTVVT MGPPQAEAILREAYAMGVDNAILITDRKFGGADTLATSNTIAAAIKKIVNEDGCDLIIAG RQAIDGDTAQVGPQIAEHLGLPQVSYVKEMKYDEADKSLTIKRVVEDGYYLLKVSTPALV TVLAEANQPRYMRVKGIVEAFDKPITTWGFADIDIDEKIIGLAGSPTKVKKSFTKGAKAA GEVFEVEAKEAAQMILEKLKEKFVI >gi|224461450|gb|ACDD01000052.1| GENE 67 62715 - 63725 1376 336 aa, chain + ## HITS:1 COG:FN0785 KEGG:ns NR:ns ## COG: FN0785 COG2025 # Protein_GI_number: 19704120 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Fusobacterium nucleatum 
# 1 336 1 391 391 474 66.0 1e-134 MNLIDYKGILVFAEQRDGKIQNVALELIGKARELAESIDTKTVSAVLIGENIKGLAQELI HYGADVVYVVDGAEYKVYDTEKFAQVFKALINDKKPEIVLFGATTIGRDLAPRVSSRMTT GLTADCTRLEIAEDKSLWMTRPAFGGNLMATIVCPDHRPQMSTVRPGVMKKRNKEEDRKG EIVDYPVTLDMSKCKVQVLEVVKEEGNTVDISEAKILVSGGRGVGHKANFQELEDLAAEV GGIVSASRAQVDAGNISHDRQVGQTGKTVRPFVYFACGISGAIQHVAGMEESEYIIAINK DKYAPIFSVADLGIVGDVHKVLPLLTEEIRKFKAAK >gi|224461450|gb|ACDD01000052.1| GENE 68 63901 - 65709 2522 602 aa, chain + ## HITS:1 COG:FN0452 KEGG:ns NR:ns ## COG: FN0452 COG0449 # Protein_GI_number: 19703787 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Fusobacterium nucleatum # 1 602 1 607 607 638 57.0 0 MCGIVGYSGNETKAKEVILSGLEKLEYRGYDSAGIAIVMENQELFIEKKKGKLAVLKEYV EKDSKLEGKIGIGHTRWATHGIPTDENAHPHYGQDKKVAVVHNGIIENYWKLKEELLKEG VHFSSDTDTEVVAQLFEKLYQGDLLEATLLLLEKIKGSYALGMIHQAEPTRLVCCKKESP LVIGIGENASYIASDATALLKYTKNFIYLEDGDIAILEGNQVKLYDRLGKEITREVVYID ASPEQVSKQGYEHFMLKEMEEQGDIIEKTLGVYVNEEGNVNFQKQVAGISLENFHKIYVV ACGTAYHAGLQFQYFMKHLCQKEILVEIASEFRHDPPFLDEKTLVIVISQSGETYDTLMA LRQAKLQGAMTLAICNVLGSTIAREANRVIYTLAGPEISVASTKAYTAQVVLLYLLTLYF SGKNEKELDDAYQLSEKFKHIFDKKEEIKRVSEKIAKSKDIFYLGRGLDEKIAREGSLKL KEITYIHSESFPIGELKHGSIALIEEGVPVVLLSTRKEWSEKSSSNLKEVKSRGAYVIAI AVEGSDEVKAGADEYIEVEEAGKYLTALLAVVKMQLLAYYVAVAKGLDVDKPRNLAKSVT VE >gi|224461450|gb|ACDD01000052.1| GENE 69 65729 - 67486 2075 585 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 585 1 584 584 639 55.0 0 MKNQEKIGWIQSKMKDSDIAAYIVPTADYHQSEYLGEYFKARAFLSGFTGSAGTLVILSE EAYLWTDGRYYVQAEKQLEGSGIHLMKQGMPGIPNYIEFLREKLAKKEKIGMDMKVFVTS DILKLQKDFECKDVGDLTIEIWKDRPNLPKDTIFIHEEKYHGEASPLKIAKIREDLSQHS LDYQLIATLDDIAWIFNLRGKDIEDNPVFLSFALISQEDVVLYCDKEKISDTVASYLREI 
GVEWKEYFAIFEDLSKLEGRIGMEFESSSYALYSSILEKKNIVNHQPKSSFLKTIKTEVE LENTKKIHILDGVAVTKFMYWLKHHYQTENMTEYSAEKYLDSLRAQIEHFQELSFHTIAG FGSNAAMMHYQASPEKEVVLKEGALFLVDSGGQYLEGTTDITRTFALGEVPEEQKRHFTL TLKGMIDLSKAKFMHGATGTNLDILARQHLWNIGIDYKCGTGHGVGHFLGVHDGLHGIRF QYNAQRLEENMVVTNEPGVYIAGSHGIRIENELVVRPYLETEHGKFLQFETITFAPIDLD AILPELLSVEEKEWLNQYHKDVYTKISPFLNEKEKEWLKIYTRSI >gi|224461450|gb|ACDD01000052.1| GENE 70 67499 - 67972 652 157 aa, chain + ## HITS:1 COG:no KEGG:FN1219 NR:ns ## KEGG: FN1219 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 150 4 151 151 131 46.0 8e-30 MPLYIKVLTDYYHFLIGDLEEKRKVFLMELLKYLLLKDEYGYDPFLEGETERVVFLLRCI QQEEIPMSFESYVKLQKWHRKDAWSDGELQEYFLHQRKGKEVKIMFDFDNASSEEIEILS YLNRFLEGKGRKFQVLNIHNARYVDVSELLEELRNKQ >gi|224461450|gb|ACDD01000052.1| GENE 71 68002 - 68625 899 207 aa, chain - ## HITS:1 COG:lin1978 KEGG:ns NR:ns ## COG: lin1978 COG1272 # Protein_GI_number: 16801044 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Listeria innocua # 7 207 10 210 210 147 43.0 2e-35 MTLDRIEEHWNAWTHYIGSLAAIVALVLLIIRALQISNFLYLGTVIVFGIALISLYSISG TYHILKPGKVKNIFHILDHIGIYFLIAASYTPYIFMGLTGPKKWIIFGVQWGITLLGIFF KIFFTGKFQVLSTMLYLAMGWTIVFVFRDIYHSLSPLSFRFLLASGIVYSVGTIPFLLDN IRFSHAVWHIFVLAGSTLGFLSIFFLV >gi|224461450|gb|ACDD01000052.1| GENE 72 68699 - 70774 2493 691 aa, chain - ## HITS:1 COG:CAC2241 KEGG:ns NR:ns ## COG: CAC2241 COG2217 # Protein_GI_number: 15895509 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 4 684 8 693 699 517 42.0 1e-146 MKTREYSVKNLHCSGCSAVIQGMILKLPGMIGVNIDIYEEKIMLQYEDTIEDSSLLEKIN EIGNRIEPGTLFYKKEESVENNDEKKNFYMKICSFGGMIFFLLIALLFSNNHVQFFCYLI SYFFISGDILWKASKNLGRGKILDENFLMSIASLGAIYLGEYHEAIGVMFFYKIGEILEE KAVLTSKKSISSLLKLRPEVAFQKQKDGSFQEVPSSSLHVGDIIQVKEGEKIPVDGKIIK GESFLDVSALTGEVIPMDVKVGDTVLSGSINGDRILELKVVRKFSDSTISKIIDMVEHAN 
TKKSKVEKFMSKFARYYTPIVVSLALIVGLILPFFLGNFKIWFERAILFLVISCPCALVI SIPLTFFHNIGRASKQGILVKGANYLEAVLDIKNIVFDKTGTLTKAKFQIKKIVGENKEL LQELAKAGEFYSKHPIGMAIYESIPLTIEEKDIQNYKNIPGYGVRLEYKGKEVFLGKETY LLEQGISYPKIEKTGSIVFILLEGTYQGYIVVEDEIKEESTHTISQLQKLGFTPYILTGD GKEIGESVGKQLGINSKNIFTNLLPEQKVKTLKKIQESGKTLYIGDGINDAPVLASSDIG ISMGNMGSDVAIEASDIVFMDDHIEKLLLLLALAKQNRRTLYTCITFALGIKILVMILGI LGIANMWFAIFSDVGVTLLCILYSSFSFHRS >gi|224461450|gb|ACDD01000052.1| GENE 73 70904 - 72121 1420 405 aa, chain - ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 396 1 395 395 541 64.0 1e-154 MKISKRAEEMGYSPIRKLIPYADEAEKRGVKVYKLNIGQPNIVTPDSFFEGLHSYKEKIV KYSDSRGIPSLLESFVRSYRQSGIELEKEDILITQGGSEAIFFTLMAICDEGDEVLVPEP FYSNYSSFSRFAGAKVVPISTSIETGFHLPKKEEIEALITPKTKAIMFSNPVNPTGTVFT EKEIRMIGELAIEHDLYIIGDEVYRQFVYDDETEFLSVMKLDHLQDRVVIVDSISKHYSA CGARIGLVASKNHELMAQILKFCQARLCVSTIEQHSAANLINTMNSYFEDVKLKYKNRRD LLFSYLSRIPGVVCSRPEGAFYIIAKLPVDDSEKFAKWLLTDYSYENMTLLIAPGPGFYM TPGKGKQEVRFSFCTNVDDIENAMIVLKRALVEYQRLFLAEKEAQ >gi|224461450|gb|ACDD01000052.1| GENE 74 72173 - 73516 1429 447 aa, chain - ## HITS:1 COG:FN0162 KEGG:ns NR:ns ## COG: FN0162 COG0534 # Protein_GI_number: 19703507 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 10 446 10 446 446 510 62.0 1e-144 MLLFKSKQNKKFFALALPIFLELTLVNVVGNIDTIMLGRFSDEAVGAVGGITQALNIQNV LFGFISLGTGILVAQYVGAKNYKKMKEVISTSLFLNFIFSFLLALLYILFWRQIFNFMKL PEELVNIGKYYFLLLSSFCAFQALTLTSGAILKSYGKPKLMLFVNVGVNLLNILGNGMFL FGWLGMPILGTLGVGISTVFSRAIGCIFAIYLVKKHCHFQFSKKYFQPFPWKTIQNLLSI GIPTAGENLAWNIGQLLILSMINALGTNYIAARTYLMLITMFIMVFSISLGHATAIQIGQ LVGAKKWNQAYLRGFSSLKLSFILAIVTSSTVFLLRVPIMSIFTQNEEILKISYQVFPYF ILLESGRVFNIVIINALHASGDILPPMIVGIIFVFLVAVPFSYLFGIKFAWGLVGIWIAN AMDEWIRGFAVLYRWKSQKWKTKSFIS >gi|224461450|gb|ACDD01000052.1| GENE 75 73526 - 74347 904 273 aa, 
chain - ## HITS:1 COG:FN1702 KEGG:ns NR:ns ## COG: FN1702 COG1968 # Protein_GI_number: 19705023 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Fusobacterium nucleatum # 6 259 1 254 266 280 61.0 3e-75 MNPFFIIIILAVIEGLTEFLPVSSTGHMILANFFLGKSTFREEFMNHFLIIVQLGAILAV IVFFWKKVNPFVKSKEEFKKRFQLWSKVIVGVFPAAIIGLIFDDYIEQHFMNNIYIVIFT LIFYGIALMGIEHHHKKTAMKVRYASFRKLHYRTALYIGFFQCLAMIPGTSRSGATIIGA LLLGVSRPLATEFSFYLAIPTMFGATLLKLLKTNIVYTTQEWYYLGIGTFIAFLVAYIVI SWFMNYIQKRDFTLFGWYRILLGIIVLVIYLLG >gi|224461450|gb|ACDD01000052.1| GENE 76 74357 - 75355 1404 332 aa, chain - ## HITS:1 COG:FN1703 KEGG:ns NR:ns ## COG: FN1703 COG0451 # Protein_GI_number: 19705024 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 1 332 1 332 332 480 69.0 1e-135 MIVVTGAAGMIGSAVVWKLNEMGINDILLVDKLRTEDKWLNIRKRDYRDWMDRDVFLDWL FQEAEANEITAIVHMGACSATTETDGDYLMSNNYAYTKALWEYCSQRNIRFIYASSAATY GAGEQGYHDMVSPEELKALKPLNKYGYSKKRFDDWAFKQKSHPSIWAGLKFFNVYGPQEY HKGRMASMVFHSFRQYKETGKVKLFQSHKEGYEDGGQLRDFVYVKDVVDIIYYMLTQNFE SGIYNIGTGQARSFLDLAMATIKAAAGREDIQVSDVIEFIPMPEDLRGKYQYFTQAQMEK LGNTTYHLHMHSLEEGVKDYVQNYLSQEDAYL >gi|224461450|gb|ACDD01000052.1| GENE 77 75358 - 76176 841 272 aa, chain - ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 4 248 8 255 290 155 35.0 6e-38 MKVDLHLHSTASDGSFSPKQIVQLALLKKMKAIALTDHDTIDGLYEAKQEAEKWGIEFVP GIEFSTYWKNYEVHILGYFLNLEDSNFITTIQELKILREERNKKIIQLLQNYGIILDMTS LEKQYPKQSIGRVHIAKEIIKNGYVKDMQEAFSKYLAQGGLAYVPKEGLSPHKAIQILKE NAAFSSLAHPKFISKNENEILQLIEELKEVGLDAIEANYAGFKSYEIRKYRSWAKKYNLF ITGGSDFHGTNRKNVEIGMQGLDYSQFNKFRR >gi|224461450|gb|ACDD01000052.1| GENE 78 76237 - 76977 852 246 aa, chain - ## HITS:1 COG:no KEGG:FN1719 NR:ns ## KEGG: 
FN1719 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 246 1 239 239 217 47.0 3e-55 MKKLFMISALSIMLVACGGAVKEKDLVQKYQLTPNSAVHWDQTIMHIIPAEAKIADWYGN ENPINYLQKTGRMNEKDFNFLVSLSQKKAEQVSKEEYEQFLDLLTSYVNTLPRKFFLSNT NIKDPKGLVKLMVRESNSTLDNPSRYIKETIASPEEWQQIVKFSSQDDLKEKDVKKLRKI LNSFLKDPELYSPEVWYRREVSDRMLELTKMQQAGNLTKMQQNNINAKALYLAYPEYFSK LDKWDK >gi|224461450|gb|ACDD01000052.1| GENE 79 76996 - 79665 3450 889 aa, chain - ## HITS:1 COG:FN1718 KEGG:ns NR:ns ## COG: FN1718 COG0653 # Protein_GI_number: 19705039 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Fusobacterium nucleatum # 1 886 1 869 869 1229 73.0 0 MIANLLKAIFGTKNEREVKRIQKIVVKINALSDEYSSLSDEELKGKTAIFKERLQNGETL DDILVEAFATVREASSRVLGLRHYDVQLIGGIVLHEGKITEMKTGEGKTLVATAPVYLNA LSGRGVHVITVNDYLATRDREMMGRVYSFLGLTSGVIVNGMYGKDRRAAYQCDITYGTNS EFGFDYLRDNMVASVGEKVQRELNYCIVDEVDSILIDEARTPLIISGASSDAIKWYQVAY QVVSLLNRSYETEKIKNIKEKKEMNIPDEKWGDYEVDEKAKNIVLTEKGVSKVEKLLKLD NLYSPENVEITHYINQALKAKELFKRDRDYLVRDTGEVVIIDEFTGRAMEGRRYSDGLHQ AIEAKEAVRIAGENQTLATITLQNYFRMYQKLSGMTGTAETEATEFVHTYGLEVVVIPTN EPVIRKDHSDLVYKTKEEKLEAIIDKIEELYKKGQPVLVGTVSIQSSEELSDLIKKKGIP HNVLNAKYHAQEAEIVAQAGRKGSVTIATNMAGRGTDIMLGGNPEFLAIHEAGSREAENY SEILSKYVKQCEEERKEVLALGGLYILGTERHESRRIDNQLRGRSGRQGDPGESQFFLSL EDDLMRLFGSDRVKAVMEKLGLPHGEPITHKMINKAIENAQTKIESRNFGIRKNLLEFDD VMNKQRTAIYASRNEALVKEDLKSNILSMLHDIIYTKTFQHLVGEVKEDWDIQGLAKYLA ERFDYIIEDEKEYMSMNVEDYAALLYDRLSAVYEEKENRMGSEIMRKIEKYILFEVVDAR WREHLKALDGLREGIYLRAYGQKNPVTEYKLVSSEIYEKMLETIQEEITSFLFKIVIKTE ENEKIEEETPKKAEKIQFIPKNQQELTPEDECPCGSGKKYKNCCGRIKK >gi|224461450|gb|ACDD01000052.1| GENE 80 79721 - 81751 2123 676 aa, chain - ## HITS:1 COG:FN1717 KEGG:ns NR:ns ## COG: FN1717 COG0272 # Protein_GI_number: 19705038 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Fusobacterium nucleatum # 13 
673 33 695 696 729 62.0 0 MISKKEKENRREELQKKLQRYSDAYYSQNESLISDYEYDMLLKELEQLEKELQIENKSSI TQKVGSSLKNTKFQKTTHKTPMLSLSNTYEIGEIEDFLLRAKKNLNMTDNLEVEMEIKLD GLSISIIYEKGKLVRAVTRGDGIVGEDVTENVLQIESIPHELSEAFDIEIRGEIVLPFSE FENLNKIRIEKGEEVFANPRNAASGTLRQLDPKIVKERHLAAYFYFIVNAEQYGIHSQKE SILFLEKLSLPTTKICEIFQDSSKLEDRINYWSSEREKLPYETDGLVLKINDISLWEKLG STGKSPRWAVAYKFPAKQVTTKLLDITWQVGRTGKITPVAELEEVELSGSRVKRASLHNY DEIVRKDIRIGDTVFIEKAAEIIPQVVKAVKEDRTGKEKKIEAPEHCPICESILEREEGL VDLKCINKHCPGKIQGEMEYFVSRDGMNISGLGGRILEKLLSLHYLENVSDIYSLKEKRE ELEQLDKMGKKSIENLLSSIEESKKTSYDKVLYALGIPFVGKVAAKLLAKESKNILKLQT MTEEELQQIEGVGEKMARSIVDFFQNETKKQLIQNLIDIGLCFTLKDGVEASTEAKIWQG KNFLVTGTLSKYTRAELQEDIEKSGGTNLSSVTKKLDYLIVGEKAGSKLEKAKALGSVTI LTEEEFLALKEKLTKQ >gi|224461450|gb|ACDD01000052.1| GENE 81 81748 - 82785 882 345 aa, chain - ## HITS:1 COG:FN1920 KEGG:ns NR:ns ## COG: FN1920 COG0482 # Protein_GI_number: 19705225 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Fusobacterium nucleatum # 3 344 2 342 343 355 52.0 7e-98 MKEKVILGMSGGVDSAVAAYLLQKNGYDVIAVHFSVHKTERTQEEVKDSQTIANQFSIPL YSYCLEKEFQEKIISYYLTEIEKGRTPSPCPLCDDSVKFHLLFQEAEKHGAKYVATGHYA SISSNNVFKTSLLEANHHIHKDQCYMLYRLSSEKLKRILFPLSSLEKSEVREIARKIGLF VSEKKDSQGICFAPEGYQAFLKKHLAHKIQKGNFIDEKGNILGKHSGYPLYTLGQRRGLG IQMKEISFITKINPDTNEITLGKFDNLLEDKVILENTIFHLPLETLKNMTLLARPRFSST GFLGKVREENGVVTFHYFEKNAHNASGQHLVLFYENYVVGGGIIA >gi|224461450|gb|ACDD01000052.1| GENE 82 82795 - 84069 931 424 aa, chain - ## HITS:1 COG:FN1924 KEGG:ns NR:ns ## COG: FN1924 COG1055 # Protein_GI_number: 19705229 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Fusobacterium nucleatum # 1 424 1 424 425 556 76.0 1e-158 MLLILGILIFVLVFYCIITEKVPSCYATMLGALIMSFCGIITEEEILQTIHSRLDILLLL IGMMMIVSFISETGLFQWFAIRVVKLVRGEPLLLLILLSLITAISSAFLDNVTTILLMAP 
ISILLAKQLQLDPFPFVMTEVLSSDIGGMATLIGDPTQLIIGSEGHLSFNEFLWNTAPMT IIALTILLVSVYFLYIRKMKVPHELRAQIMELESSRILKNKKLLTQSLFVLILVILGFVS NNFVNKGLSVIALSGAFVLAFISKRNPKEIFEKIEWDTLFFFIGLFAMIRGIENLGIINV MGEKILEISTGNFHFATLSVMWFSSLCTSILGNVANAATFSKIIQTLLPNFENIENTKAF WWALSFGSCLGGSITMIGSATNIVAVATAKKSGCKIDFITFMKFGCRIAIINLIVASLYL YLRY >gi|224461450|gb|ACDD01000052.1| GENE 83 84098 - 85375 1139 425 aa, chain - ## HITS:1 COG:FN1925 KEGG:ns NR:ns ## COG: FN1925 COG1055 # Protein_GI_number: 19705230 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Fusobacterium nucleatum # 2 425 1 424 424 557 75.0 1e-158 MLLILALCIFIAVFYCIITEKIPTPWATMLGGLTMSLLGIINQEEALEAISERLEILFLL IGMMMIVLLISETGIFQWFAIKVAQLVRGEPFSLIILLCTITALCSAFLDNVTTILLMAP VSILLAKQLKLDPFPFVISEVMAANIGGLATLIGDPTQLIIGAEGNLNFNQFLMNTAPVS ILSMISLLFTVYFMYGRKMQVSHELKARIMELDSSRSLKEPTLLKLAGSIFALVILGFIL NNFINKGLAIISLSGAFYLVVLAKRKPKEIFENLEWETLFFFIGLFMMIKGIEELNVMEI IGQQLVHITEGNFPLAMFSITWISAIFTSIIGNVANAATMSKIIQVMIPSFNSLGNTSHF WWALSFGSCLGGNISLLGSATNVVAVGAATKAGCKIDFVKFLKFGSIIALENLIIASLYI FFRYL >gi|224461450|gb|ACDD01000052.1| GENE 84 85390 - 86319 1265 309 aa, chain - ## HITS:1 COG:FN1926_2 KEGG:ns NR:ns ## COG: FN1926_2 COG0517 # Protein_GI_number: 19705231 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Fusobacterium nucleatum # 147 309 3 167 167 197 63.0 2e-50 MKFASYLDPQLIFMDIQKETKQEVIQEMIHRIAAKDSTVREKENIIEEMVLKRENEISTC IGEGVAIPHARIENFGDFVVAIAILEKPILGEIGASNKFDEVNVVFLIISDVLKNKNILK VMSAISKIVMKNPMVFDKIKTEKNPSKIIDYIEETGIEISHKIVAEDVLSPDIIPVHPED TLENVAKRFILEQKTGLPVVDSDGTFLGEITERELIDYGMPDYLSLMGDLNFLTVGEPFE EYLIHEQTTSIENLYRKDKKMIKIDRKTPIMEICFIMVYKGINRLYVIDHGKYCGMITRS DIIKKVLHI >gi|224461450|gb|ACDD01000052.1| GENE 85 86378 - 88903 3314 841 aa, chain - ## HITS:1 COG:FN1927_1 KEGG:ns NR:ns ## COG: FN1927_1 COG1461 # Protein_GI_number: 19705232 # Func_class: R General function prediction only # Function: Predicted kinase related to 
dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 555 1 560 560 677 65.0 0 MKIEIKSLNAVRLTKLFIAASRWLSKYADVLNDLNVYPVPDGDTGTNMSMTLQAVENELV KLDHEPNMKELSEIVSENILLGARGNSGTILSQIIQGFLSVVENTEEISIDVAARAFMAA KDKAYQAVNQPVEGTILTVIRKVAEAAMVYQGPQDDFILFLVHLKNIAHEAVENTPNELA KLKEAGVVDAGGKGIFYVLEGFEKSVTDPEMLKDLARIAKAKTVKRDKMEFAQEEDIAFK YCTEFIIESGSFPLEEYKAKIAPLGDSMVCAQTAKKTKTHLHTNHPGEVLEIAAALGNLN NIKIENMLIQHRNLLVTEAELSQVSGTVNPKEETFLLRSENATPIAYFAVVDNVELGNRF LDDGATAVLIGGQTKNPSVADIENGLKKIHSKTIILLPNNKNIISSAKMAAERSDKEVIV FETKSMLEGHYVVKHKEESMDILLQQLGRNYSIEITKASRNTKVEELEIEKEDCIALVNG RIVEKAKNNAELIEKLYTRYLDRNSLSIFAVLGKEREEEGMKALKQHSSRIKYQEFEGNQ ENYPYYIYVEQRDPNLPRIAIVTDSASDLTPELMNGYDIHIIPLRLKIGDKNYDDGITIT RKEFWNKILREKVLPKTSQPSPAEFHKVYQNLFDKGYESVITILLSSKLSGTQQAAKIAK EMMGNDKDIYIVDSKAVTFAEAHQVLEAAKLVKEGATTKEVLERLYELQDQMKLYFAVND ITYLQKGGRIGRASSIIGGLLKVKPILKLEDGEVTLETKVIGERGALSYMEKLIKNEGKK NSIILYTAWGGGNQELHNADVLKKMSEDSRKIEHRGRFEIGPTIGSHSGPVYGFGMISKI R >gi|224461450|gb|ACDD01000052.1| GENE 86 88922 - 89470 431 182 aa, chain - ## HITS:1 COG:FN1928 KEGG:ns NR:ns ## COG: FN1928 COG1396 # Protein_GI_number: 19705233 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 182 1 184 184 228 70.0 4e-60 MSIGEKIKKSRNEKSLSLRELAVKVDLSASFLSQIEQGKASPSIENLKKIATALDVRVSY LIEDDEIQKNVDFVKKENVKYIESRDSNTKMALLTVSNDEKTMEPILYEIGPGGESGRNS YSHSGEEFIYITQGELEIYINDSMYKLKEGDSLYFKSNQQHRFKNSTKKETKAIWVVSPP GF >gi|224461450|gb|ACDD01000052.1| GENE 87 89474 - 90682 239 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 263 389 765 895 904 96 40 4e-19 MKAACILVGTELLNGAMVDTNSIYMAEELNKVGIELPYKMIVRDIKEEIIDAIQYFHSRV DIIIMSGGLGPTLDDITKDAIADFLGKKLIVDPEELKVLHQKFASRGLPILEMNTKEVEK PEGAISFENSVGMAPAIYIDKIAAFPGVPRELYDMFPKFLSYFIKEKNWKHKIYIKDIIT YGIPESVLENHVKDCFQEEGIFYEFLVKNYGILIRMQADAMKKNKVEKIKEKIYNIIGDF IIGEDSVKIEEKIVQYLKEKQWKISLAESCTGGLIADHFVRLAGVSEVFYEGIVSYDNEA 
KKKRLGVQKQTLDNDGAVSENTAREMLLGLSTEVAISTTGIAGPGGGSNEKPVGLVYIGI RVLDKTYVIKKIFHGNRQQIRQRTVLEALVSLFQILTKGCEM >gi|224461450|gb|ACDD01000052.1| GENE 88 90694 - 91197 670 167 aa, chain - ## HITS:1 COG:FN1930 KEGG:ns NR:ns ## COG: FN1930 COG1267 # Protein_GI_number: 19705235 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Fusobacterium nucleatum # 6 167 9 170 171 209 70.0 2e-54 MDKRLKTIRNLATWFGLGDLPKAPGTFGTLGAIPLYILLCYLRKIFPNTMIYNSFYFMFL MTFFAISVYVADICEQEIFKKEDPQAVVIDEVLGYLTTLAFINPIGISQTLWAIGLAFLI FRFFDITKLGPIDKSQHLKMGIGVVMDDFLAGIIGNFLLVCLWTIFF >gi|224461450|gb|ACDD01000052.1| GENE 89 91201 - 93369 2097 722 aa, chain - ## HITS:1 COG:FN1931 KEGG:ns NR:ns ## COG: FN1931 COG0826 # Protein_GI_number: 19705236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Fusobacterium nucleatum # 1 722 1 720 720 672 50.0 0 MKIVAPAGNKERLYAAIKAGAEEIYMGLQGFGARRAAENFTVEEFIEALDYAHLRESRIF LTLNTLMFKEEIEFLYCNLKKLYEHGLDAVIVQDLGLANYLHQNFPDLELHGSTQLSVAN HVEINFLQSIGFTRVVLPRELTFEEIKSIREHTTIELEVFVSGALCICYSGNCYLSSFLG GRSGNRGMCAQPCRKNYQINQENKSYFLSPKDQFLGEEEIQKLKAIGIDSIKLEGRMKEP NYVFTTVQYYRNLIDDIATKERSSSLFNRGYSKGYFYEKTSEIMNPLFSSNLGERIGIIQ GKEIRLEKDVILGDGFSYVSSSFEKLGGCYLNQIIIKGNKEKRKKAFKGEILILKDVSRG SKYLYRSYSKEIQDSIEVEKKQQDKRKEILASFIGNIGEKAKLSIHTKNEWGKEITLSVF SEENLQNANKKATTKEEVFQKISELGNTSFYLKNIEIELEENIFIPASLLKSLKRFAIEK LEKKLLESYLRIAPKEWKLSSIPNEKIDDLDFFFIVRTEKQKQYLEEKGYSNIFYRSYDI ASEGELEKQNLDTLVAANFYQILKNRNTSGIIGNWNLNISNPYTFELLERLPQLDLLMLS PEMSFEKMKNIGATKQKKAILAYSKLRGMYIELDLIKGKNTILQNQENDNFHLKTNALGH TEVYLEEALNILSKQNLIKELGISVIVIEFTYETLQEIDLVLQELREKKGRYKAYNYERG VY >gi|224461450|gb|ACDD01000052.1| GENE 90 93341 - 93943 452 200 aa, chain - ## HITS:1 COG:FN1932 KEGG:ns NR:ns ## COG: FN1932 COG0237 # Protein_GI_number: 19705237 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Fusobacterium nucleatum # 1 190 4 
192 193 157 47.0 1e-38 MIIGITGTIASGKSTVSDYFIKQGYVVIDADKITKELQEQKEVLKEFLEIFGESVLLENR SLNRQKLREIVFQDKTALQKINRIMHPKVREKFEDVRSRTLKEEIVFFDIPLLFEAHFED LCEKIILVCAEREVQIRRVIQRDNSSRELAEKIINSQAKEEEKRKKSDYIIENNGTVEEL YQKLKKWEETFNENCRSSRK >gi|224461450|gb|ACDD01000052.1| GENE 91 94090 - 95058 926 322 aa, chain + ## HITS:1 COG:no KEGG:CLB_1618 NR:ns ## KEGG: CLB_1618 # Name: not_defined # Def: AraC family transcriptional regulator # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 3 317 4 323 329 123 28.0 1e-26 MRKENFIEKYFERLSRIENLTKITDLFGIRYVFPNSHGKYWFYRLAIEEGMDITFTSLPN RLNYAFQIVNWDEEVLEFGVCTEGKMEILCYPSLERYSYQQGQACLYYSKNSVEKFEFYA KYYKGFSIHLHLDYFEYFFKLGKSSWLEKEWRNNLKNMLQEKFLKIWNSSIFLQALAKEI ADFKIKSILDYFEFKGKINYFLVKLMLECMGISDSEDEKIQHLTFLIAQDYEKIYSLQEI SRFLDTPIYQLQKMMKKKKGITICQYIRNLKLEYAKILLEKNDYTVAEVASMIGYSNPSK FSKIFFERYQKKPKKFKNNKIY >gi|224461450|gb|ACDD01000052.1| GENE 92 95200 - 96945 196 581 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 354 559 41 252 329 80 30 4e-14 MKIYQKLKLYAGEKIGYAYCSIFLSFLASLFLLLPYWLLWGFLKELVLVQNIKNVRYYAT WIVIFMMIYGIFYFLSLWCSHLLAFRLETNLRKEGIKHLLNASFSFFEKNSSGKIRKLID DNASETHTIVAHLLPDLVGTALLPLGMVIIFFRIDIILGCSLLFMIGIGIWQLKDMMGEQ EFMKTYMISLEKMNAEAVEYVRGMQVIKIFNTTIQSFKAFYDAILSYSNYALHYSMSCRR AYVWFQVLFHLFITFPLPIAILFMRHGANEKLVLIKIIFYVIFAGLLFLAFMRVMYVGMY HFQALEVISKLEALFDEMERNNVTYGKLEQANHFDLEFKKVSFCYEEQYVCKELSFQLKE KKTYALVGSSGGGKSTIAKLLSGFYSVQEGEILLGGKNIKEYSETFLSKHIAFVFQNSKL WKKTIFENVKMGREDASYEEVMKALEKAQCEDILNKFEERENTLIGAKGVYLSGGEIQRI AIARAILKNADIVILDEASAAADPENEYELQRAFSNLMKDKTVIMIAHRLSSIQNVDEIL VIDQGSIIERGNHKELMANKSRYADLQKFFSEANDWRIAND >gi|224461450|gb|ACDD01000052.1| GENE 93 96938 - 98656 1454 572 aa, chain + ## HITS:1 COG:SP1435 KEGG:ns NR:ns ## COG: SP1435 COG1132 # Protein_GI_number: 15901287 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Streptococcus pneumoniae TIGR4 # 5 572 14 581 581 
547 50.0 1e-155 MIKEKHFLGLTTQGKKDLIRASFSSFFMHFAYMAPIMLIFFFSESVLQGKEASPMIYGLG ILVLCFVMYLLIFYNYNTLYNATFQESANLRIHLADTLKNLPLSYYSKHNTSDLSQTIMK DVADMEHAMSHAIPQTFGFILYIIVISILMLLENVVLTLCILVPILLSFFLLILSKKMQI SSSTKYYKQLRENSEFFQESIEMQQEIKSYGQKEKVQQELMKQIEESETLHKKAELSQAF PVVFAQSILKFILGLTVFIGAKLYVEGEVSLLYLLGYLIAASKIMDGMNGLYLNLAEMMS LDARIQRIQEIQQVKRQEGKEIELSSYDIIFQKVSFSYRSDCKVIDKVSFVAKQNEVTAI VGASGCGKTTLLRLISRLYDYDEGKIFVGGKEIVDIDINHFFKNISIVFQEVLLFNTSIM ENIRIGKKSATDEEVIQAAKLANCDEFVSRFPKGYQTIIGENGSKLSGGERQRISIARAI LKDAPIVLLDEISASLDIENERKIQESLKRLLKHKTVIVISHRMKSIEKADNIIVMNEGK IEKIGKHKELLKSSSIYKNMIKKSEFAENYVY >gi|224461450|gb|ACDD01000052.1| GENE 94 98709 - 99563 823 284 aa, chain - ## HITS:1 COG:FN0789 KEGG:ns NR:ns ## COG: FN0789 COG1284 # Protein_GI_number: 19704124 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 277 1 277 280 259 50.0 3e-69 MKKKTLFVIKDYILISFACALMGFTINYFYISNKLAEGGVSGICLILHYLSNIPISYLYL GLNIPLLIIAWKFLGRDFSMKTIYATVLLSFFMDFFSYLRTPIPDFLLASLFGGALTGIS LGLIFISGGSTGGTDIIAKLITRYRGVSVGKALLAMDFVILSLVAFLFGKLIFMYTLIAV TVSSKIIDFIQEGMDEAKAIFIMTSKPQELKTAISKKINRGVTFLDGEGGFSGEKLKVLY CVISKYQLVNLKRTVRQLDPNAFLTITNVHEVLGEGFKHLNTEE >gi|224461450|gb|ACDD01000052.1| GENE 95 99920 - 100111 317 63 aa, chain + ## HITS:1 COG:no KEGG:Mmol_1121 NR:ns ## KEGG: Mmol_1121 # Name: not_defined # Def: hypothetical protein # Organism: M.mobilis # Pathway: not_defined # 5 60 2 57 58 63 50.0 2e-09 MKNTFWKNKKNKKTYKILEEAVDCTNIRDGVKVFIYQPIDKKESYFVREQEEFFQKFEKI SQE >gi|224461450|gb|ACDD01000052.1| GENE 96 100217 - 100495 386 92 aa, chain + ## HITS:1 COG:PA1749 KEGG:ns NR:ns ## COG: PA1749 COG2388 # Protein_GI_number: 15596946 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Pseudomonas aeruginosa # 25 85 92 152 161 59 44.0 1e-09 MDKIVLVETEKSGSFEIRENNIVLAELNFNKLENGVIDAYHTFVDSSLRGQGVAEKLYLE LIQYAKEKGYKIIPTCSYIGRRIQKDLDLIKK >gi|224461450|gb|ACDD01000052.1| GENE 97 100649 - 102190 1999 513 
aa, chain + ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 4 513 3 511 512 669 71.0 0 MEVKQKKSFMNSFLDFIEAGGNKLPHPLTLFFILCIIIVIISGIAAKMGASVTYTALDRK TLEISEQTLEVKSLMSAEGIRYIFNSMVTNFTGFAPLGTVLVALIGIGVCEASGLMSATL RKVVTSTPRKAITAVVVLAGVMSNIASDAGYVVLVPLGALIFLSFGRHPLAGLAAAFAGV SGGFSANLLLSTTDPLLSGLTTEAARLMRPDYFVNPASNYYFMFVSTFIITILGTIITEK LVEPRLGKYEGEMVDSHEGELSDLERKGLRYAGLSVILYIAIMLILMLPENAILREDGLL KSFTSHGLVPALMLFFLIPGLVYGITTKKITSDKEVAKMMGKSLGTMGGYLALVFVSAQF VAYFNFTHLGTYIAVEGAAGLKAIGFTGLPLILAFILVSAFINLFMGSASAKWAIMAPVF VPMLMQLGYSPEFTQVAYRIGDSSTNIISPLMSYFAMIVAFAQQYDKKAGMGTLISIMLP YSISFLIGWSILLVIWFMLNLPIGPEAFIHLLG >gi|224461450|gb|ACDD01000052.1| GENE 98 102594 - 103166 603 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452749|ref|ZP_05618048.1| ## NR: gi|257452749|ref|ZP_05618048.1| hypothetical protein F3_06759 [Fusobacterium sp. 
3_1_5R] # 1 190 1 190 190 325 100.0 7e-88 MFINLSKNKNCSQVIQYLKQYTTEEFQRETEIVGLIYESHFLDMEQKSLEILKGLNLENK KYIFALCISKGIQGNSLYRIKQTVEAMGYELDYIEHIIIGENISGEASFQGEKKSSELEE KIKNISQELNQKRIKKINTSYSKCSSFFAKIVRISLIYHFLQSSLDSNRCRKCGHCRGVC PTQKKIQKSS >gi|224461450|gb|ACDD01000052.1| GENE 99 103381 - 105612 3074 743 aa, chain + ## HITS:1 COG:FN0262 KEGG:ns NR:ns ## COG: FN0262 COG1882 # Protein_GI_number: 19703607 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Fusobacterium nucleatum # 3 742 2 742 743 1286 82.0 0 MQKAWRHFQEGNWAQTIDVTDFIKKNYQEYLGDESFLKGPTENTKKLWDILSVMLKEERE KGIYDAETKIPSRIDAYGPGYIKKELETIVGLQTDAPLKRAIFPNGGLRMVKNSLEAFGY RLDPTLEEFYSKNRKTHNSGVFSAYTPEIKLARHTGIITGLPDAYGRGRIIGDYRRVALY GVNYLIEKRKEDLNSCNPTEMTEDVIRKREEMFDQIEALEALKRMGASYGFDLGEPASTA QEAIQWTYFAYLAATKDQNGAAMSIGKVSTFLDIYIQRDLEEGSITEEQAQEFMDHFVMK LRIIRFLRTPEYDALFSGDPVWVTESLGGMDNNGKSMVTKNSYRMLHTLYNLGPAPEPNL TILWSEHLPMAWKKYCAKVSIDTSSLQYENDDIMRPQFGDDYGIACCVSPMAIGKQMQFF GARVNLPKALLYAINGGKDENKKVQVTPEVFEKIQGEYLNYDEVWEKYDKILTWLANTYV KALNIIHYMHDKYSYEALEMALHDINIKRTEAFGIAGLSIVADSLAAIKYGKVKMIRDEE GDVVDYEIEKPYVPFGNNDDKTDELAVLVLRTFMNKIRSHKMYRDAIPTQSILTITSNVV YGKKTGNTPDGRRAGTPFAPGANPMHGRDTKGAVASLASVAKLPFEHANDGISYTFAITP NTLGKTMEEKKSNLVGLMDGYFKQTGHHLNVNVFGRELLEDAMEHPEKYPQLTIRVSGYA VNFVKLTREQQLDVVNRTISDKF >gi|224461450|gb|ACDD01000052.1| GENE 100 105640 - 106365 768 241 aa, chain + ## HITS:1 COG:FN0261 KEGG:ns NR:ns ## COG: FN0261 COG1180 # Protein_GI_number: 19703606 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Fusobacterium nucleatum # 1 241 1 241 243 353 68.0 2e-97 MKAYINSFESFGTKDGPGIRFVLFLQGCPLRCRYCHNVDAWNLQHPNYIYTSEEILEEVN RVKVFLTGGITISGGEPLLQADFVKEFFQLCHKNGIHTALDTSGYIFTEKVKEVLEETDL VLLDLKHIDSEKYYDLTSVNLSPTLEFLEYLSKTQKDTWIRYVLVPGYTDDVEDLKRWAE YVSKYSNVKRVDILPFHQMAIYKWEKERKNYTLRDVLPPTKEAVRFAENIFLSYGLPVYT E >gi|224461450|gb|ACDD01000052.1| GENE 101 106433 - 106993 690 186 aa, chain - ## 
HITS:1 COG:no KEGG:Plut_0528 NR:ns ## KEGG: Plut_0528 # Name: not_defined # Def: exonuclease # Organism: P.luteolum # Pathway: not_defined # 2 183 3 183 197 117 34.0 2e-25 MKILYLDTETTGLTYRSTIIQLAAIVEIDGEIKETINLYCAPFPDSDISEEALSITKFTR EEIFQFDSPQVVCQTFTQILGKYVDKYNKNDKFIVIGHNVKFDLDMLRNWAYRCNERFIA SYIDFKNEFDTLAFTKCLKILGRLPQTENNKLETLCQAFQIPLENAHNALADTIAAKDLY HYLQNK >gi|224461450|gb|ACDD01000052.1| GENE 102 107022 - 107111 100 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AGKVNFENIEECSSKIHTNEFHGKQIVKL Prediction of potential genes in microbial genomes Time: Fri May 20 02:09:09 2011 Seq name: gi|224461449|gb|ACDD01000053.1| Fusobacterium sp. 3_1_5R cont1.53, whole genome shotgun sequence Length of sequence - 2336 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 524 484 ## COG3142 Uncharacterized protein involved in copper resistance 2 1 Op 2 . - CDS 517 - 1932 1754 ## COG1027 Aspartate ammonia-lyase - Prom 2022 - 2081 10.4 + 5S_RRNA 2203 - 2260 91.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 3 2 Tu 1 . 
- CDS 2243 - 2335 215 ## Predicted protein(s) >gi|224461449|gb|ACDD01000053.1| GENE 1 2 - 524 484 174 aa, chain - ## HITS:1 COG:FN0469 KEGG:ns NR:ns ## COG: FN0469 COG3142 # Protein_GI_number: 19703804 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Fusobacterium nucleatum # 1 174 1 174 202 194 58.0 7e-50 MIKEACVGSIQEAILAEKNGANRIELCDNLIEGGTTPSYGCMKVALKSLNIPIFPMIRPR GGNFCYTKEEIETMKEDILMAKKLGIPGVVFGALTSNGELDIPNLQYLMEAAKPMQTTFH KAIDEMNFPLKAIPQLIGLGFDRILTSGKKEKALEGVELLNEMIEVANEKIIIV >gi|224461449|gb|ACDD01000053.1| GENE 2 517 - 1932 1754 471 aa, chain - ## HITS:1 COG:CAC0274 KEGG:ns NR:ns ## COG: CAC0274 COG1027 # Protein_GI_number: 15893566 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 1 465 1 465 465 558 60.0 1e-158 MTFRLESDSIGSLQVPSNAYYGVQTLRAKNNFFITGYRLNPIFISSLAYVKKAAAICNME AKTIQEDIAKAIIQAADEIIAGKFRDQFITDVIQGGAGTSMNMNMNEVIANRANELLGGA LGTYDKVHPNDHVNFGQSTNDVVPTSGKLTIQFLLKDLVSNLEDLYSAFQSKATQYDHII KMGRTHLQDAVPIRVGQEFRAFSGPVKRDIERLKAAMYELSFVNMGATAVGTGLNADVNY IQRVVEVLSEVTGFSFSQCEDLVDGTRNLDSFVYLSSILKTCAVNLSKTANDIRLMSSGP KAGIAELILPQEQPGSSIMPGKVNPVIPEVMNQVCFQIFGNDVTITKAAEAGQLELNVFE PVLFFNLFQSIQILTNGIRTFIDNCIAGIQVNEEDCKYWLTRSVGVVTALSPHIGYKVAA EIAKLSLKTGKPVYDLVLEQGLLEKEKLDVILNPFEMTKPGIAGKELLSRD >gi|224461449|gb|ACDD01000053.1| GENE 3 2243 - 2335 215 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no DVVPDYEASSLRVTHPSATQVQRLESTCMC Prediction of potential genes in microbial genomes Time: Fri May 20 02:09:17 2011 Seq name: gi|224461448|gb|ACDD01000054.1| Fusobacterium sp. 
3_1_5R cont1.54, whole genome shotgun sequence Length of sequence - 11659 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 21 - 63 9.8 1 1 Op 1 1/0.000 - CDS 71 - 664 724 ## COG0353 Recombinational DNA repair protein (RecF pathway) 2 1 Op 2 1/0.000 - CDS 666 - 968 438 ## COG2926 Uncharacterized protein conserved in bacteria 3 1 Op 3 5/0.000 - CDS 988 - 1959 1682 ## COG0205 6-phosphofructokinase 4 1 Op 4 10/0.000 - CDS 1964 - 2920 1262 ## COG0825 Acetyl-CoA carboxylase alpha subunit 5 1 Op 5 . - CDS 2910 - 3830 1090 ## COG0777 Acetyl-CoA carboxylase beta subunit - Prom 3912 - 3971 9.2 - Term 3941 - 3981 5.1 6 2 Op 1 . - CDS 4003 - 4530 621 ## FN0407 hypothetical protein 7 2 Op 2 1/0.000 - CDS 4591 - 5004 569 ## COG1959 Predicted transcriptional regulator 8 2 Op 3 1/0.000 - CDS 5033 - 5941 885 ## COG3872 Predicted metal-dependent enzyme 9 2 Op 4 1/0.000 - CDS 5960 - 7477 1723 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 10 2 Op 5 5/0.000 - CDS 7482 - 9254 1800 ## COG0322 Nuclease subunit of the excinuclease complex 11 2 Op 6 . - CDS 9241 - 10092 1022 ## COG1660 Predicted P-loop-containing kinase - Prom 10269 - 10328 80.3 + TRNA 10247 - 10322 93.2 # Gly TCC 0 0 + TRNA 10333 - 10409 89.3 # Ala TGC 0 0 + TRNA 10414 - 10490 91.0 # Met CAT 0 0 - Term 10536 - 10568 3.2 12 3 Tu 1 . 
- CDS 10576 - 11658 1555 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461448|gb|ACDD01000054.1| GENE 1 71 - 664 724 197 aa, chain - ## HITS:1 COG:FN0412 KEGG:ns NR:ns ## COG: FN0412 COG0353 # Protein_GI_number: 19703754 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Fusobacterium nucleatum # 1 197 1 197 197 273 63.0 1e-73 MAIKSLEKLIDQFHKLPGIGRKSATRLGFHILDYSEQEIDDFIQALEDIKGKIHRCPVCG DYCEEELCPICSDEARDHKSICVVEDSRDVVSLEKTGKYRGLYHILGGKLAPLQGITPDK LNLKSLLERLAKEDVQEIILALNPDLEGETTAMYLVKLLKPFDVKITKIASGIPMGGNLE FADSATIARALDARQEV >gi|224461448|gb|ACDD01000054.1| GENE 2 666 - 968 438 100 aa, chain - ## HITS:1 COG:FN0411 KEGG:ns NR:ns ## COG: FN0411 COG2926 # Protein_GI_number: 19703753 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 2 98 1 97 98 99 81.0 2e-21 MVDNVLELVRKERRKNQIRREIEDNDRKIRDNRKRVELLSNLKAYFTANMSYEDILDIID NMSSDYEDRVDDYIIRNAELGKERREINKTVKELKKSMVD >gi|224461448|gb|ACDD01000054.1| GENE 3 988 - 1959 1682 323 aa, chain - ## HITS:1 COG:FN0410 KEGG:ns NR:ns ## COG: FN0410 COG0205 # Protein_GI_number: 19703752 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Fusobacterium nucleatum # 2 323 8 329 329 484 76.0 1e-137 MIERKIAIMTSGGDSPGMNAAIRAAAKTAMSKGMTVYGVRRGYLGMLNDEIFPMTGQFVS GIVDKGGTVLLTARCDEFREEKFRAIAANNLKKRGIEGLVVIGGDGSYHGADLLYREHGI KVIGIPGTIDNDIKGTQFTLGFDTCLNTILDAISKIRDTATSHERTILVEVMGRSAGDLA LQACIAGGGDGIMIPEMDNPIELLALQLKERRKSGKLHDIVLVAEGVGRVYEIEQELKGR ISSEIRSVVLGHVQRGGTPSGFDRMLASRMGHKAIELLEQDQGGLMIGLEGIELVTHPID YAWNGERKTNLDTDYELALLLAK >gi|224461448|gb|ACDD01000054.1| GENE 4 1964 - 2920 1262 318 aa, chain - ## HITS:1 COG:FN0409 KEGG:ns NR:ns ## COG: FN0409 COG0825 # Protein_GI_number: 19703751 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Fusobacterium nucleatum # 1 311 1 311 313 
402 67.0 1e-112 MEFERKVEDIEEKIKELEIFAEEKGIALGEEIQKLKDFRDQFLEEIYKEVTDWDKVSISR HPMRPHTVDYIEYLVEDFVELHGDRLFRDDPAIIGGLGKIDGKSMMIIGHEKGRGTEDKI KRNFGMANPEGYRKALRLFHMAERFHLPVLVLIDTAGAYPGLEAEQNGQGEAIARNLMEM SDLRTPIISVVIGEGGSGGAIGLGVADKVYMLEHSTYSVISPEGCAAILFKDSSKAPEAA QNLKISAQNLLRLEVIDGIIPESLGGAHRDPERTAINLKNVILSSFSELEQIPLEDLLEN RYNKFRKMGKFKTKIEGE >gi|224461448|gb|ACDD01000054.1| GENE 5 2910 - 3830 1090 306 aa, chain - ## HITS:1 COG:FN0408 KEGG:ns NR:ns ## COG: FN0408 COG0777 # Protein_GI_number: 19703750 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Fusobacterium nucleatum # 6 300 13 304 304 344 57.0 1e-94 MAFFKIKNLSRSRKKYATLTVETSEVEEKENKEVKQHTENNDDKVESLWSRCPSCQEIIY QEDLQNNLWTCPNCSHHFSVSARQRIDLLIDTGSFEETDVEYCSSDPLQFPGYLEKYKET QEKENLIEGVICGRGKLQGIDVAIAVMDFKFMGGSMGSVVGEKIVDTMELGLREKIPVIV VASSGGARMQEGVLSLMQMAKTAAAAERLKKAGIPFISIPVNPTTGGVTASFAMLGDIIM SEPKARIGFAGPRVIEQTIRQKLPENFQKSEFLQEHGMVDMVVERKNMKETLYKILTNIL GATNGI >gi|224461448|gb|ACDD01000054.1| GENE 6 4003 - 4530 621 175 aa, chain - ## HITS:1 COG:no KEGG:FN0407 NR:ns ## KEGG: FN0407 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 173 11 174 174 125 44.0 6e-28 MRKILGCLSIATLLLAGCTSTDFRNVFDSLNTSVPEAINQAVSSRVNAENELYTVGSASV GQTGSIIAQSKANKIASEALRSKIRAAVETNFKSYTLNMDSYSKNLVLPAIPELTSYATD LVIKQVKQKGAWEDSSKVYSLLSVPTAEVTSTSQKVLKSFLTNTSKKLEDLSKGI >gi|224461448|gb|ACDD01000054.1| GENE 7 4591 - 5004 569 137 aa, chain - ## HITS:1 COG:FN1093 KEGG:ns NR:ns ## COG: FN1093 COG1959 # Protein_GI_number: 19704428 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 132 1 128 142 138 55.0 3e-33 MRLKNEVEYVFRILLYLSKYGKDRVISSTEISEKEQIPHLFSLRILKKMEKAGLLSIQKG AKGGYSLKKDPKDITLKTAIECIEGDIIVKDCVSDPKSCSLRGGRCSVHRAMALIEKEFI EHLAKYNFQDLSDENYF >gi|224461448|gb|ACDD01000054.1| GENE 8 5033 - 5941 885 302 aa, chain - ## HITS:1 COG:FN1092 KEGG:ns NR:ns ## COG: FN1092 COG3872 # 
Protein_GI_number: 19704427 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Fusobacterium nucleatum # 4 302 7 304 304 407 68.0 1e-113 MNQKMSIGGQAVLEGVMMRGTEYLATAVRKNTGEIVYRKRKISSRKKEFYKMPFIRGMFM LFDSLVLGIQELTFSANQSGETEEENLSQKEAIMTTIVSLALGIGLFIVLPSFIGGLLFS ENKLYANLLEAFIRLAGFVLYIWALSFSKDIHRVFEYHGAEHKSIYAYENDMELTPENAK KFTTLHPRCGTSFLFIVMLVAIIVFSCIDFLVPTPETLLGKLALKLILRVGLMPLIASLS YELQRYSSKHLDHFFVRLLSFPGLSLQRITTQEPDLSQLEVAIVAIKVSLGEQVQNATEI LE >gi|224461448|gb|ACDD01000054.1| GENE 9 5960 - 7477 1723 505 aa, chain - ## HITS:1 COG:FN1091 KEGG:ns NR:ns ## COG: FN1091 COG2208 # Protein_GI_number: 19704426 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Fusobacterium nucleatum # 57 505 1 447 447 467 55.0 1e-131 MFYILLLLVLLFLFFLLLRKIEMINTREKLRIITGLKNHLEMDNLQEVIQIEYDETLKQI VKQEAELNNSLEELKEYKKELELTYDSLLSKSTQLEYSNQFLEKRVANLSNLNSISRSVL SIFELDKIINIILDAYFVLTGAKRISLYLWDEEGNLLNKKIKGSIRFQGTVSYSPELLKK FGKTEYERIYQELGKGFTVLKDEELIISPLSVNQEEMGVIYIIEDKDKMIDIDEEMMSAL GIQIGTAIKNARAYYELLSNERISQELAVASRIQNRILPQDIHSVDGLQIAKYFKPAKEI GGDYYDYGMLRDEIFFITIADVSGKGVPAAFLMALGRSVLKTLMEMKQGCPSQEMQELNQ LIYGDITEEMFITMLHSKYDLKTRTLTFSNAGHNPLLVYKAAKDVIELHTVKGVALGFLE NYAYREASLEIEKGDIVVFYTDGITEAENMNSELFGIERLKEVVYNNKGRSAEKIKEAIL DEIIAFREEREQVDDITFVILKSKK >gi|224461448|gb|ACDD01000054.1| GENE 10 7482 - 9254 1800 590 aa, chain - ## HITS:1 COG:FN1090 KEGG:ns NR:ns ## COG: FN1090 COG0322 # Protein_GI_number: 19704425 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Fusobacterium nucleatum # 1 588 1 588 589 695 62.0 0 MEIGKWDIPENPGVYLMKEKNKVIYVGKAKNLHKRVKSYFQKEVDREKTRELVKHIEDIE YILCPSELDALLLENNLIKKYNPKYNIALKDEKTYPYLSLTKETFPAFHMIRKSKHLDLE HREYFGPYPFGAWKLKKILLKLFKIRDCFWDMNKKYKRPCLKYDMKTCLGPCVHKEVQEE YQAMVAKVREVLKGNTKDCIQELRVQMEEMAEKFEFEKAIILREQIQELANLEKEQISEY 
GKEVDEDIFVWKEVFDRMFLCVLNVREGKILGKISNNFLLEEKVYENLEEELLLSYYRKY PIPKSIVFEEKQQEVLKEGLIHLELMFDRKIESYYPKIKSRRLELLEMALLNLEKDIENF HLKKEVIEDGMKELYSFLGLKHFPRRIECFDISNIQGKDAVASMSVSIEGKAAKGEYRKF KIQCKDTPDDFAMMREVIYRRYSKLEPKDFPDVILIDGGLGQINSAGAILEELGKIQFTD LLSLAKRDEEVYKYGETLPYSISKDKEALKIFQRVRDEAHRFGITYHRKLRSKRVISSEL DHIEGIGEVRRKKLLKRFSSVSGVKAASLEELEECVPKQVAIRIKEELGG >gi|224461448|gb|ACDD01000054.1| GENE 11 9241 - 10092 1022 283 aa, chain - ## HITS:1 COG:FN1089 KEGG:ns NR:ns ## COG: FN1089 COG1660 # Protein_GI_number: 19704424 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Fusobacterium nucleatum # 3 283 4 290 290 300 56.0 3e-81 MKKEVVIVTGLSGGGKTTVLNILEDLSYYTIDNMPIGMEKFLLYTNLDKIAIGIDIRTFQ SLEDFLSVTESLQSKKISYSIIFVEASKEVILSRYHLTRRHHPLKESTLLKSIEKEIAFM SSIKEMADGVIDTSFLKPRDLEPKIKAILKVPGCSREMNIHLQSFGFKYGLPIDVDLVFD VRFLPNPYYKEELKEKSGNDPEVVDYIDSFPISAEFYKKLYDFISFLIPQYITEGKKHLS IGIGCSGGKHRSVAFVNKLYKDLVMEKKFRVYKSHREQEFGNW >gi|224461448|gb|ACDD01000054.1| GENE 12 10576 - 11658 1555 360 aa, chain - ## HITS:1 COG:FN0735 KEGG:ns NR:ns ## COG: FN0735 COG5295 # Protein_GI_number: 19704070 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Fusobacterium nucleatum # 137 360 384 617 617 112 35.0 1e-24 PQGDKGDTGVAGPQGPQGDKGDKGDKGDTGAAGPQGPQGDKGDKGDKGDTGDTGAVGPQG PQGDKGDTGAAGPQGPQGDKGDKGDKGDKGAVGPQGPKGEPGKDGKDGKNGEGAKVLAGN NIKVDSEEKKQGEDKVIENTISLKEDIKVKTVSTDSINVGDTVKISKEGINAGKQKITNV ADGKADSDAVNVKQLNEVKKEVKENTKEMTKKLYHLGEEIDGVRSEARGIGSLSASLAAL HPMQYDKAKPNQVMAGVGTYRDKQAVAVGMTHYFTENLMMTAGVSLAETSNTKAMANVGL TWKFGKGDDRENLPEVYKEGPISSIQVMQGKMMNLENINKEQQTKIEMLEKQVKLLLENR Prediction of potential genes in microbial genomes Time: Fri May 20 02:09:28 2011 Seq name: gi|224461447|gb|ACDD01000055.1| Fusobacterium sp. 
3_1_5R cont1.55, whole genome shotgun sequence Length of sequence - 25261 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 10, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1186 1457 ## COG5295 Autotransporter adhesin - Prom 1372 - 1431 8.5 - Term 1444 - 1476 4.0 2 2 Tu 1 . - CDS 1492 - 1941 645 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 1977 - 2036 8.5 + Prom 1993 - 2052 14.3 3 3 Tu 1 . + CDS 2114 - 2830 684 ## Sterm_1160 molybdenum ABC transporter, periplasmic molybdate-binding protein - Term 2718 - 2766 7.2 4 4 Op 1 1/0.500 - CDS 2796 - 3539 533 ## COG2035 Predicted membrane protein 5 4 Op 2 . - CDS 3536 - 4618 1097 ## COG0787 Alanine racemase - Prom 4650 - 4709 5.5 - Term 4671 - 4721 8.1 6 5 Tu 1 . - CDS 4741 - 6225 2085 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 6258 - 6317 11.2 - Term 6239 - 6292 -0.7 7 6 Tu 1 1/0.500 - CDS 6334 - 7227 934 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 7265 - 7324 5.3 - Term 7388 - 7423 0.2 8 7 Op 1 . - CDS 7436 - 8740 2005 ## COG0213 Thymidine phosphorylase 9 7 Op 2 . - CDS 8754 - 10667 2078 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism 10 7 Op 3 . - CDS 10684 - 11295 819 ## gi|257452776|ref|ZP_05618075.1| hypothetical protein F3_06900 11 7 Op 4 1/0.500 - CDS 11282 - 12511 1436 ## COG0285 Folylpolyglutamate synthase 12 7 Op 5 . - CDS 12508 - 13224 1127 ## COG0775 Nucleoside phosphorylase - Prom 13252 - 13311 12.2 + Prom 13283 - 13342 8.4 13 8 Tu 1 . + CDS 13363 - 14271 1026 ## COG1560 Lauroyl/myristoyl acyltransferase + Term 14272 - 14303 4.1 - Term 14260 - 14291 4.1 14 9 Op 1 7/0.000 - CDS 14304 - 15137 1460 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 15 9 Op 2 . - CDS 15183 - 15959 1223 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 16 9 Op 3 . 
- CDS 16007 - 16831 1144 ## COG0489 ATPases involved in chromosome partitioning 17 9 Op 4 1/0.500 - CDS 16909 - 17799 1164 ## COG1159 GTPase 18 9 Op 5 1/0.500 - CDS 17800 - 18513 657 ## COG0582 Integrase 19 9 Op 6 17/0.000 - CDS 18510 - 20177 2575 ## COG0497 ATPase involved in DNA repair 20 9 Op 7 1/0.500 - CDS 20171 - 20971 891 ## COG0061 Predicted sugar kinase 21 9 Op 8 1/0.500 - CDS 20993 - 22096 1277 ## COG4942 Membrane-bound metallopeptidase 22 9 Op 9 . - CDS 22093 - 22938 939 ## COG2177 Cell division protein - Prom 22965 - 23024 8.8 - Term 22991 - 23029 7.2 23 10 Op 1 . - CDS 23030 - 24679 1604 ## FN1654 hypothetical protein - Prom 24705 - 24764 6.5 - Term 24713 - 24752 5.2 24 10 Op 2 . - CDS 24770 - 25261 773 ## COG3958 Transketolase, C-terminal subunit Predicted protein(s) >gi|224461447|gb|ACDD01000055.1| GENE 1 1 - 1186 1457 395 aa, chain - ## HITS:1 COG:XF1981 KEGG:ns NR:ns ## COG: XF1981 COG5295 # Protein_GI_number: 15838575 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Xylella fastidiosa 9a5c # 233 370 209 347 1190 70 38.0 8e-12 MLEEKSVKHWLKRKVKVTEALLVAFLITGGIGYAESANASSSNKMKYYGVSEESMPGTGE REPKNEYGEGARGGKKSIAIGENAKVGTWKTIWEKHIGTVNEGYNFYYGKDKDYKLTQAD VGKGYFSDGDNSVVLGNDAKSDYANSVVIGTQADSTRGNSNVVVGHKAKLSGHNGVVIGE NSIGSSSNLGVIAIGKDSRAAGGIVIGTEATLGRWEKGTNGEETNKEDLGGGIVIGRKAS ATTLPNVVIGDEARSAKDYSVVLGSYAEVENENATAVGNAATVTVDGGIALGGSSSSTTK GGKMGYDPMTGKIKENLDKVSSVWESSTGALSVGDSAYTRQITNVAAGTEDTDVVNVAQL KALNTKVDKGATHYYSVNDMDDHVDNFNNDGAKGF >gi|224461447|gb|ACDD01000055.1| GENE 2 1492 - 1941 645 149 aa, chain - ## HITS:1 COG:FN1079 KEGG:ns NR:ns ## COG: FN1079 COG0783 # Protein_GI_number: 19704414 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Fusobacterium nucleatum # 1 144 1 144 144 159 60.0 2e-39 MKKVELLNKYLSNLAVLLIKLHNLHWNVVGQQFMSIHNFTESQYDTYFVYYDDVAEALKM 
QGQRPLVKMKDYLAVASIQEAEDKDFSPCEVLSIIKADMEEMNQLAREIRAIASEEDDFA VANMMEDHISATVKQLWFIDSMTKVDCKL >gi|224461447|gb|ACDD01000055.1| GENE 3 2114 - 2830 684 238 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1160 NR:ns ## KEGG: Sterm_1160 # Name: not_defined # Def: molybdenum ABC transporter, periplasmic molybdate-binding protein # Organism: S.termitidis # Pathway: ABC transporters [PATH:str02010] # 24 225 36 262 264 66 24.0 7e-10 MKRFLFILAIIFAFLCLSKCQLQEETKIPLSIPYELHHVVPEIITAYHREGNREEIEVFE YHSKELKDIPKIGIHIIDSWKRSLKNPILRSPLVIIGTRKIYSLEQLKDSSISLPDPDIN TTGLHAINLLKEQDLWLNFKRNITYKNKGILSMESVDLAEEDFAIVSLADTYFMKNSFIV LDLPKEEYNTFYSIQNYDKNKEEQKKFIDFLTSEKSMKIFQKYGFFKVTSEDSSKSEK >gi|224461447|gb|ACDD01000055.1| GENE 4 2796 - 3539 533 247 aa, chain - ## HITS:1 COG:FN0490 KEGG:ns NR:ns ## COG: FN0490 COG2035 # Protein_GI_number: 19703825 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 245 1 260 260 180 48.0 2e-45 MIENIIKGLAVGVANIIPGVSGGTVAVLLGIYERLTDAMGNFFLVSFQKKKEYFIFLFQI MIGAVLGVLLFAKLIEFSIQNYPKGTASFFSLCILPSLFYIVKPYQKTKKNMFLFLLGAL FLGFFMLLSFFFKKETGAEMTPVSLISFSYGMRLFFCGLIAAGAMIIPGISGSLLLLVLG EYYHILSFILHMQMIPLLYLAMGVALGLVLFSKAIHWLLHKEKEKTMFFIAGIVFMSIFQ IWTSLPM >gi|224461447|gb|ACDD01000055.1| GENE 5 3536 - 4618 1097 360 aa, chain - ## HITS:1 COG:FN0491 KEGG:ns NR:ns ## COG: FN0491 COG0787 # Protein_GI_number: 19703826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Fusobacterium nucleatum # 5 358 4 357 359 292 45.0 9e-79 MVEHSFYLEVNREAILHNINVLRKWKRKDIIPVIKANAYGHGMLEMAKTCVQAGVTQVAV ARYEEAKKILEDSYFQSLSKECSFQILIFESIGDFSLLDDFPRMDISINSLEELEKALEY HISPKKMQVKIDFSFGRNGIQEKDLPFFIKKVKEENLTFKGIFSHLFSSSYEDGLLCIKK FSSLVQEMGKERFDRIHLQNSAASYNYDCDIVTDIRVGMLTYGLQEPGYFHEELQRAFCL KGKIDSIRCVENMKYLAYEGKEDVGMKFAKWAAKIKIGYADGFGKENENGSCIIQRKEYR IAEVTMDNTFLEVDERVKVGDEVLLFYNPTKTKQETGKEIHEHLTGLTNRLPRKWIGEIK >gi|224461447|gb|ACDD01000055.1| GENE 6 4741 - 6225 2085 494 aa, chain - ## HITS:1 COG:BH0844_1 KEGG:ns 
NR:ns ## COG: BH0844_1 COG1263 # Protein_GI_number: 15613407 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 2 406 4 416 424 452 57.0 1e-127 MKLFSEVQKIGKALMTPIAILPAAGLLLAFGNKLNLPIMEQAGQIIFSNLPLLFAIGAAV GLVGGDGVAGLAAIVAILIMNTTMGLVAGAAQGIANGDPSFAMVMGVPTLQTGVFGGLIA AIIAAICYNKFYKTELPAFLGFFAGKRLVPIMTAVFAFIIGLLMPWIWQPVQHGLAALSY LANETNTNVSTFIFGVIERALIPFGLHHIFYAPFWYQFGEYTTKAGEVINGDQAIWFAML KDGIHNFSAETYQGAGKFLTGKFVFMMFGLPGAALAMYQEARPENKKLVGGILFSAALTS FLTGITEPIEFTFIFVAPVLYAIHCVFAGLSFMLMNILGVRIGMTFSGGFIDYIVFGVLP GTSGFETKWYFVIVVGLIISVIYYLGFRFFIRKFNLATPGREVATEATEKKEVSEDELAN GVLVALGGKENLISLDACITRLRVEVKDTAKVEDAALKALGATGVLKVGENGVQAIFGAK AQFICNDLKKMTGI >gi|224461447|gb|ACDD01000055.1| GENE 7 6334 - 7227 934 297 aa, chain - ## HITS:1 COG:PAB2381 KEGG:ns NR:ns ## COG: PAB2381 COG0697 # Protein_GI_number: 14521649 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pyrococcus abyssi # 4 281 5 270 280 128 33.0 1e-29 MSNRGYKILLFMTAAIWGGGFPITKIALNYGTSPNAILAVRFLAASVILFAYLCYKKEKI TRSEVKLGLFTGVFLSLGFSFQTVGLSYTTASKNAFLTGTYVVLTPFFAWLFTRKMPKKQ IYFSCFLSLLGIFLLSWSGENVSMQFGDVLSLLCAVFYAIQISYMSAKIGEKNPLHVNFF QMLSAGILTLIYNIILKGGSVSSFPENKVQLFSVGFLVVFNTLLAYSAQTLAQKYVESSL VCLILSTEILFGAFISFLFLGEILSFQSLLGGFLMFLSIFLAEFDWKKKSDKESITK >gi|224461447|gb|ACDD01000055.1| GENE 8 7436 - 8740 2005 434 aa, chain - ## HITS:1 COG:SA1938 KEGG:ns NR:ns ## COG: SA1938 COG0213 # Protein_GI_number: 15927710 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Staphylococcus aureus N315 # 1 400 14 413 446 411 55.0 1e-114 MRFVDIIEKKKQKKSLSKEEIKIWIQGLVEGSIPDYQSSALLMAIVLNGMTQEETTNLAE AMVLSGEQIDLSNISGVKVDKHSTGGVGDKTTLVLGPLVASCGLKVAKMSGRGLGHTGGT 
LDKLESIPGFDCFLTTENFVRQVEKIGIALVGQTADLVPADKKLYALRDVTATVESIPLI ASSIMSKKLAFGSDTILLDVKFGEGAFMKTIEEGKELASSMIKIGKSLGRDTRAILTEMD QPLGNTIGNALEVIEAIETLQGKGPEDFTELCIASAELMLLQGKIVSTKEEAREMLWKKI DSGEAFEKFCEVVREQKGDVQALHDISLFPQAKNRTELKSQKTGYVVKIHSQNLGFLSME IGAGREKKEDDINPAVGLKLHKKYGDFVEVGESLCAIYHDNPLEENWKTRLLESFEIQEG KPKKKNMIEAIIEE >gi|224461447|gb|ACDD01000055.1| GENE 9 8754 - 10667 2078 637 aa, chain - ## HITS:1 COG:FN1012 KEGG:ns NR:ns ## COG: FN1012 COG1493 # Protein_GI_number: 19704347 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Fusobacterium nucleatum # 4 619 2 613 615 689 57.0 0 MESYRSISIREISEAMNLTVLNEGNLDLKVSRPNLYQVGYELTGFLATGSEELTDYINVY GQEESYYLEKLSSETKEEILSKYFALPFPALVISSAAIVSEEVLAIAKRYNKNVLRSQYL ISETIRELKFYLLRQLWIEEVYEDYALMEIHGIGVLLTGYEDAKIGSMIELVGRGHRLIT DRNVLIRRLGENDVEGMNMLEKTTEKDHFFIENHRGRQIDVTSHFGVKSTRKKKKINIVI HLEEWDEKKFYDRLGLDVEYEVFVGEKIQKITLPVRKGRNLAVIIETAALSYRLRRMGLN SAEYFLSESQRIIRENQEKRGLNMGNKTMAMPVRKLKNEFDLKIIYGEELIDTTYVETTN VFRPSLALAGHYELYQNSENRGVQVFSTVEFKFLESLSEEERVENLKRYLSYDFPLIVLT TGLHAPEYFMRLVKESKHILCRSPFRKPSQLIANFNNYLETYFAPTLSLHGVFVELYGFG VLLIGKSGIGKSETALELIHRGHRLVADDFVKFSESPTGDIIGKSARIPYFMEIRGLGII DIKTLYGLGAVRIAKRLDLIIELKEQDEDSYITSVGGQAEKQEILGKSFKKETIYISSGR NAAVMVEILVMNTMAKILGYNAEKAFDFGMKLLNSED >gi|224461447|gb|ACDD01000055.1| GENE 10 10684 - 11295 819 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452776|ref|ZP_05618075.1| ## NR: gi|257452776|ref|ZP_05618075.1| hypothetical protein F3_06900 [Fusobacterium sp. 
3_1_5R] # 19 203 19 203 203 194 100.0 2e-48 MKLIRLLNILLLLVFLFFMYHIYQIYTEEGIGRAKENAKIEKEVEDSFQVRKKNFYLTQK DAPLRPKLEEVVEELNTEENKQEILKEQLEEKIENTIQNTLDTFGNKVEPTVKKEEKVAK KIEKKVEKTAKKVEKKVVEKVKKEEKIEAPKAVIIEKKEKKEKPVEQPKAPAPVVTEIKE VKLPKPAAGEIREIEGTFDASSL >gi|224461447|gb|ACDD01000055.1| GENE 11 11282 - 12511 1436 409 aa, chain - ## HITS:1 COG:FN1014 KEGG:ns NR:ns ## COG: FN1014 COG0285 # Protein_GI_number: 19704349 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Fusobacterium nucleatum # 3 401 13 411 415 498 62.0 1e-141 MMKIYSHSMFGIKLGLQNMERLCEKLGHPERAYKIIHIAGTNGKGSTATTLERILLEAGY QVGKYTSPHILKFNERIIANGKQISDEEIEKYYYQVEKIMEEEKIDATFFEITTAMMFSY FRDKKLDYVVLETGMGGRLDATNVSQAELCIITNVSLDHTEYLGDSIYKIAKEKAGIIKN CPKVIVADQQEEFLQAILEEKAEVINVLNKYADATYQLDFEKFMTEIKIEGKNYHFSLFG DYQYHNFLCAYEAAKQLGISEEIIQRAAEKVVWECRFEIAARQPLVILDGAHNPDGVREL VKIVKQHYHKTEVAILTSILKDKDIKPMLEMLAEVSDDIILTSLADNPRGSTAKELFDLA NNPDIFSMEEDMKQAYKLLIGKNRKLNIICGSFYTLIKWKEEVQSNETN >gi|224461447|gb|ACDD01000055.1| GENE 12 12508 - 13224 1127 238 aa, chain - ## HITS:1 COG:FN1015 KEGG:ns NR:ns ## COG: FN1015 COG0775 # Protein_GI_number: 19704350 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Fusobacterium nucleatum # 1 229 5 233 237 232 53.0 6e-61 MKIGIMGAMHEEIVELQQDLELGYHIEKIGDLDFMIGKLYGREVVLVEGGIGKVNAALCA SLLSHHFQVDALLFTGVAGALHSDINIADIVLGTELLEHDFDVTAFGYPLGKIPRMDVHA FPANQKLLEIAKKVGTRIFGEKHIFEGRILSGDQFVADLQKIQFLQETFDGYCTEMEGAA VAHVCHILGTPFLIIRSISDKANHDAQMDYPEFVKIAAKNSKKMIEGILKETIWEESL >gi|224461447|gb|ACDD01000055.1| GENE 13 13363 - 14271 1026 302 aa, chain + ## HITS:1 COG:FN1016 KEGG:ns NR:ns ## COG: FN1016 COG1560 # Protein_GI_number: 19704351 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Fusobacterium nucleatum # 74 290 1 217 226 234 54.0 2e-61 MNYRIQFYLLLFFRKILLWLPESFRFSFGNFLGKAAYHLIKSRRQTALWNLQLAFPEKTE 
QERKEIAIHSYQIMVKYFLSTLWYENYLENRVTIFNRKAIELAYAKGKGVMAAVMHMGNM EASVKAGEGFPIVTVAKDQRNPYIENFIIESRKKNLKLDLLTKSRQTVRQLQSYHKKEEK YIYALFSDHRDKGAHVNFFGLETVAPTGAVSLAYKYNMPLLLVYSCLEKDNSASIHISEE IPLIRTENPKQDVLENTQALIHRMEEIIRQYPEQWMWFHDRWNLYRDFKKEGLLPPFLQG KK >gi|224461447|gb|ACDD01000055.1| GENE 14 14304 - 15137 1460 277 aa, chain - ## HITS:1 COG:FN1019 KEGG:ns NR:ns ## COG: FN1019 COG1250 # Protein_GI_number: 19704354 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Fusobacterium nucleatum # 1 277 1 277 279 452 83.0 1e-127 MKVGVIGAGTMGSGIAQAFAQVEGYEVVLCDINDEFAARGKEKLKKGFDKRIAKGKMEQA AADAILSKITTGTKEKCGDCDLIIEAAIENMEIKKQTFKELQAICKPEAMFATNTSSLSI TEIGAGLDRPVIGMHFFNPAPVMKLVEVIAGLNTPAEMVDKIKKISEEIGKVPVQVEEAA GFVVNRILIPMINEAVGIYADGVASVEGIDSAMKLGANHPMGPLALGDLIGLDVCLAIME VLYREFGDTKYRPHPLLRKMVRGGKLGMKSGEGFYKY >gi|224461447|gb|ACDD01000055.1| GENE 15 15183 - 15959 1223 258 aa, chain - ## HITS:1 COG:FN1020 KEGG:ns NR:ns ## COG: FN1020 COG1024 # Protein_GI_number: 19704355 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Fusobacterium nucleatum # 1 258 1 258 258 355 69.0 5e-98 MEFVKYQQEGFVGVVTIDRPKALNALNSQVLEELAQTFDAVDLQNTRVIVLTGAGEKSFV AGADIGEMSSLSKAEGEAFGKKGNAVFRKIETFPIPVIAAINGFALGGGCEIAMSCDIRI CSDNALFGQPEVGLGITPGFGGTQRLARLIGQGKAKEVIYACKNMKAEEAFSVGLVNAVY PIADLMPEAMKLAAKIAKNAPIAVRMCKEAINGGYDLAMDDAVALEAKVFGQCFETEDQR EGMKAFLEKRKVEGFKNK >gi|224461447|gb|ACDD01000055.1| GENE 16 16007 - 16831 1144 274 aa, chain - ## HITS:1 COG:FN2098 KEGG:ns NR:ns ## COG: FN2098 COG0489 # Protein_GI_number: 19705388 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Fusobacterium nucleatum # 32 274 15 255 257 280 58.0 3e-75 MSGCSTCPSASGCSTEKKATCGEKNTNPFNKIKKVIGVMSGKGGVGKSTVTVLLAKELQA RGYKVGILDGDITGPSIPRLTGIREERAEAVSETEIFPVTTKEGIKVMSLNLLLEDENEP VVWRGPVVGNVVKQFWNDVIWGELDFLLIDMPPGTGDVALTVMQSLPLDGVVMVSVPQDM 
VSMIVAKAVNMTKKMNVPVLGLVENMSYIVCPGCETIIHFHDNNGGKDSLKEMNLNLLGE LPMKQEIAKMTQGDDSGIGMIFKEIADRFLKVIK >gi|224461447|gb|ACDD01000055.1| GENE 17 16909 - 17799 1164 296 aa, chain - ## HITS:1 COG:FN0270 KEGG:ns NR:ns ## COG: FN0270 COG1159 # Protein_GI_number: 19703615 # Func_class: R General function prediction only # Function: GTPase # Organism: Fusobacterium nucleatum # 1 295 1 296 296 408 77.0 1e-114 MKAGFIAVVGRPNVGKSTLMNKLVSEKVAIVSDKAGTTRDNIKGILNFQGKQYIFIDTPG IHKPKHLLGEYMTEIAIRSLKDADAILFLLDGTQEISTGDFFVWEKIQSSKKPVVVLVNK IDKISDEEIEEKKLEIQEKLGEGFKIVFASGMYSFGLPRLLDALEEYLEEGIQYYPEDMY TDMSIYRMITEIVREKILEKTRDEIPHSVAIEILNVAERKEAKDKFDINIYVERSSQKGI LIGKDGKMLKEIGSEARKEIENLLERKIYLTLWVKVKDDWRKKKPFLKELGYSYEE >gi|224461447|gb|ACDD01000055.1| GENE 18 17800 - 18513 657 237 aa, chain - ## HITS:1 COG:FN0269 KEGG:ns NR:ns ## COG: FN0269 COG0582 # Protein_GI_number: 19703614 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Fusobacterium nucleatum # 15 237 8 241 241 73 27.0 2e-13 MKEVEEYLRQNVAQEKTRRIYLRDLEQVREFLEKDFLEVEEADLQKYFDFCQQSLKESSL RRKQSVLRKFYQYLLTERKIQKNPFPVISTTYKKEEKQEKERLSQEEYELLLTKLPEEMK VLTEMLWETEAKILDLFDVRVESLREYDYKKIVGKRQGKVYSYEIPEFLQEAFQKMVEGK QPEEKVFHGNRQQYDKELKKINETWKASQIKKESWKSEKIEIEKIREHYFEIGIGDK >gi|224461447|gb|ACDD01000055.1| GENE 19 18510 - 20177 2575 555 aa, chain - ## HITS:1 COG:FN0268 KEGG:ns NR:ns ## COG: FN0268 COG0497 # Protein_GI_number: 19703613 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Fusobacterium nucleatum # 1 552 6 557 558 493 54.0 1e-139 MLRELKIENLAIIEELDLEFQEGFVVLTGETGAGKSIILSGINLLIGEKASVDMIRDGEN SLLAQGVFDITKKQEQDLQKFGISIEDGEVVVRRQLDRNGKSKIYVNSIRVNVTELREIM SSLVDIVGQHSHQMLLNKNNHQKLLDHFLEEKGQIVKKEVESLAKEYDILDRRIKEIEKN RQEALEKKEFYEYQLQEIEKLQLKEGEDEKLEEEYKKIFHAGKIKEKLYNTLYALRDGEY NVSSLLHQSKKNVENLGKYGKEFQEAYESLENISYQLDDCLGVLDELQDSIEVEEGNLDE ISKRLDEINRIKNKYNGSIKDILIFRDSIAEKIDFLDENNLEVKTLIEKRKQIAGNYQEK 
ALQLHNERLKVARFIEKELEQELQFLKMEEARLHVQFTEKEGISSEGMEEIEFFISTNLG QSMKPLAKIASGGEVSRIMLAIKVLFSRVDNIPILIFDEIDVGVGGETVKKIGDKLQEIG QRAQVISITHSPAIAARAAQQFYIEKDISGEKTLSSVTELQEEERVREIARMLSGEQVTE SVLELAREMLQEGRL >gi|224461447|gb|ACDD01000055.1| GENE 20 20171 - 20971 891 266 aa, chain - ## HITS:1 COG:FN0267 KEGG:ns NR:ns ## COG: FN0267 COG0061 # Protein_GI_number: 19703612 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Fusobacterium nucleatum # 4 266 3 267 267 237 47.0 2e-62 MKKKVYLYYNDGKEIAQELYEKSLPFFQEKGIEIMTKERENEADFYVVIGGDGTLLTAFK KFARVDIPVIAINAGHLGFLTEIKKEDMFQEYQNFLEGKSQTQKRHFLKVKIGGKTYRAL NEVVITRESVVKNMVKLKVFSEDSFVNHYKGDGLIIATPTGSTAYSLSAGGPIVGVPMKV YILTPIAPHNLNTRPLVMDGSSPLSVSLIEEEKAYCIIDGNNEKLLDGNDRVEISYSEET LHLVVPKNRDYYSVIREKLKWGDNLC >gi|224461447|gb|ACDD01000055.1| GENE 21 20993 - 22096 1277 367 aa, chain - ## HITS:1 COG:FN0266 KEGG:ns NR:ns ## COG: FN0266 COG4942 # Protein_GI_number: 19703611 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Fusobacterium nucleatum # 3 367 4 403 403 266 46.0 3e-71 MKKYIVILFSFCFFHLSFADQVKDMKKKIQNIEKQIQVKNTRIKKIDVEKSQIAKKIEQL KREIEENSRKRLEMQNEIVEVTKKIEYGSKNLEISNQEFENKKLQYDAKMIAWSHYLIGH AGDLEDKPLVTKNFKTLLYSDLQRMGKIQTVQGDIKTVKEQIEAERAKLAKLQSGLAANI AEGDRKQKQQNALIAQLNQEKKEHQGSIQKLSKEKARIARQIEQIIRSRVKVDKKIVKKT QAYSKIGKTMKPLDGPIVVHYGQIKAGQVSSNGIEIKANMGAPVKAATSGTVIYASNFQG LGKVIMIDYGYNTIGVYGNLISLKAGLNQKVSKGQVIGILGVSSNGEPHLYYEVRFNLHP VDPMGTF >gi|224461447|gb|ACDD01000055.1| GENE 22 22093 - 22938 939 281 aa, chain - ## HITS:1 COG:FN0265 KEGG:ns NR:ns ## COG: FN0265 COG2177 # Protein_GI_number: 19703610 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Fusobacterium nucleatum # 1 273 1 273 308 190 39.0 2e-48 MNKVFGLGKENLKYVSRLKRRIFYCVISLVIILNIFISAGLNLRKLSEVNEGKAFFVVDL QHNLENSKKEALEKMFWKMEGVRKVQYLSKEKSFQELQQELNISIPMKDNPLTDSIVVYL 
SKTSNMEKIREKLEDNEAVKEVFQDKGYLEHIQKNNGMYQTLTYVSSFGAILIFGLLIFL FKAASALDFFNCINAIRDDNYNLKRSKRRNLIPFTLSTLAGELIFLNIYVYVRKIFIAYK SDFLLLAYWDTFLWHLLALLIINLVIWILPITILGIDGEEE >gi|224461447|gb|ACDD01000055.1| GENE 23 23030 - 24679 1604 549 aa, chain - ## HITS:1 COG:no KEGG:FN1654 NR:ns ## KEGG: FN1654 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 4 542 29 566 571 522 50.0 1e-146 MEWKKQLPKGITDFKEVIENQFYYFDKSFLIEEIGKDGSAVTLFTRPRRFGKTLNMSMLQ YFWDVQKAEENRKLFQGLYIESSSYFSEQGKYPVIYLSLKDMKEKTWEECSKKIKRIISN VYNQYEIIRDSLNQRDLKIFDEAWLETLESNNSNAIKELSEYVSKYYQKKVIILIDEYDT PIVSAYENGYYEEAIAFFRNFYSSALKDNEYLQLGVMTGILRVAKEGIFSGLNNLSVYTV LDEKYSSSFGLTQEEVEQALEYYDLKDNFQEVKEWYDGYLFGNTEIYNPWSIINYLSKEK IAAYWVNTSNNFLVYDLLEKANISIFEELEKIFQGKEICKTLDSAFSFQDMKNPQEIWQL LVHTGYLKIERELENSRYALTIPNQEIQSFFEKSFLNRFLGGVDIFYDMLLALKNQNMEI FEKKLQDILLTKISYHDIGKEEKYYHNLVLGMMLSMSREYEIYSNQESGYGRYDISIEPK EKDKIAFILEFKLAKSEEELEKKAKEALLQIEEKRYDVGIQKRGKQKIVKLGIAFCGKKV RVYQQDRLK >gi|224461447|gb|ACDD01000055.1| GENE 24 24770 - 25261 773 163 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 1 162 148 309 309 257 79.0 7e-69 VFAAAEYEGPVYIRMGRLDVETVLEDNYEFQIGLANTLREGTDVSIVSCGLMTQEALKAA DILAEEGISVRVINSGSVKPLDGETILKAAQETKFIVTAEEHSVIGGLGAAVSEFLSETH PTLIKKVGIYDAFGQSGKGQELLEKYELTADKLVAVIRENLKK Prediction of potential genes in microbial genomes Time: Fri May 20 02:09:55 2011 Seq name: gi|224461446|gb|ACDD01000056.1| Fusobacterium sp. 3_1_5R cont1.56, whole genome shotgun sequence Length of sequence - 7291 bp Number of predicted genes - 10, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 2 - 437 642 ## COG3958 Transketolase, C-terminal subunit 2 1 Op 2 . 
- CDS 457 - 1269 1093 ## COG3959 Transketolase, N-terminal subunit 3 1 Op 3 . - CDS 1314 - 1835 405 ## FN1061 hypothetical protein 4 1 Op 4 . - CDS 1832 - 2401 770 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation - Prom 2423 - 2482 9.4 + Prom 2416 - 2475 13.9 5 2 Tu 1 . + CDS 2505 - 2672 325 ## - TRNA 2506 - 2594 67.5 # Ser GCT 0 0 - TRNA 2600 - 2683 66.2 # Ser TGA 0 0 - Term 2433 - 2475 6.1 6 3 Op 1 . - CDS 2695 - 4017 1654 ## COG0144 tRNA and rRNA cytosine-C5-methylases 7 3 Op 2 . - CDS 4007 - 4072 135 ## + Prom 3909 - 3968 9.4 8 4 Op 1 1/0.000 + CDS 4152 - 5489 1802 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) 9 4 Op 2 . + CDS 5513 - 6406 915 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 10 4 Op 3 . + CDS 6413 - 7039 511 ## COG4122 Predicted O-methyltransferase + Term 7190 - 7248 11.1 Predicted protein(s) >gi|224461446|gb|ACDD01000056.1| GENE 1 2 - 437 642 145 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 2 145 3 146 309 261 87.0 4e-70 MKKSTRQAYGEALVELGQQNKNIVVLDADLSKSTKTDLFKKAFPDRHINVGIAEADLIGT AAGFATCGKIPFASSFAMFAAGRAFEQIRNTVAYPKLNVKIAPSHAGVSVGEDGGSHQSV EDMAIMRSIPGMVVLCPCDAVETKK >gi|224461446|gb|ACDD01000056.1| GENE 2 457 - 1269 1093 270 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 1 270 1 270 270 459 81.0 1e-129 MKDLKSLESIAKNIRRSIVSMICEAKSGHPGGSLSIVDILTALYYDEMNIDPAKPKMEGR DRFVLSKGHAAPALYAVLAEKGYFPKEELMTLRKFGSHLQGHPDMKKVPGVEISTGSLGQ GLSVANGMALNAKIFKEDYRVYVMIGDGELQEGQIWEAAMTAAHYKLDNVCAFVDSNNLQ IDGNVDAVMGVEPLDKKWEAFGWNVLSIDGHNFEEIFSALEAAKACKGKPTLILAKTVKG KGVSFMENVCGFHGTAPTVEERDKALAELA 
>gi|224461446|gb|ACDD01000056.1| GENE 3 1314 - 1835 405 173 aa, chain - ## HITS:1 COG:no KEGG:FN1061 NR:ns ## KEGG: FN1061 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 167 5 169 184 103 41.0 3e-21 MRIEYLGLVISFFCFLIGIKFPDWDFKWKLRHRSIITHSPLFSTILVFLYYTKLEERLFS YVIASFSFGVAIHMIFDLFPHGWGSGALLKIPIFKITCSPKNSQYFFLFTIILNFFYVLL FLERKEEYFVYILFGFLYMLSRIPYEKKFWRPFGLYFLLIALGALNFVDIALK >gi|224461446|gb|ACDD01000056.1| GENE 4 1832 - 2401 770 189 aa, chain - ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 4 186 5 185 189 169 46.0 3e-42 METMILASKSPRRKEILEMLAWNFEVCSQETEEIFEEGKSIEENMQKIALEKAKAVVNLH PNSLILSCDTMVVVENTILGKPKNKKEAKAMLQALSGKHSYVYSAVALLDRKRDLEETFV EKTKIYFYQMSEKEIDDYIATGEPMDKAGAYAIQGKASAFIEKIEGDYWNVVGLPISRVY QKLKEWGYL >gi|224461446|gb|ACDD01000056.1| GENE 5 2505 - 2672 325 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVREGFEPSVPRGYSSLAGKCIRPLCHLTAHGGRLETRTLKSYDVGFQDRFLTN >gi|224461446|gb|ACDD01000056.1| GENE 6 2695 - 4017 1654 440 aa, chain - ## HITS:1 COG:FN0313 KEGG:ns NR:ns ## COG: FN0313 COG0144 # Protein_GI_number: 19703658 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Fusobacterium nucleatum # 1 430 1 431 435 464 58.0 1e-130 MAIKNEIIALLQEVDQGKYSNIALNELFHRKVFKKGEKNFITEVFYGVIRNKIYLDYMIS QKVKEVKKDWLQQLFRLSFYQIRFMKSDDKGVIWEGVELAKKKYGVSVSRFVNGVLRNFQ RSFIEEEENLKREGREEVLFSYPKWFFEQIKKESPERYIEILKSLKRTPLLSVRVNLLKY SCEEFEEYLRKEEIEIIKKVETVYYMKAGNLLNSSEFQEGKIIVQDAASYLAAKNLGAKP GEIVLDTCSAPGGKTSVLAEAMKNEGQILSLDIHTHKIKLIQENCKKLGITIVQAVKLDA RHLSLQGKKFDRILVDAPCSGYGVLAKKPEGLYNKKEENIKELVTLQREILMAAAEVLKV GGEMVYSTCTILPAENQENAKWFLETHPNFESIQLQIPENVAGTYDECGGFSIDYQEEVV DSFYMIKWRKNLDKALDSVL >gi|224461446|gb|ACDD01000056.1| GENE 7 4007 - 
4072 135 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNVCYNEWKVGKEKLGGKYGN >gi|224461446|gb|ACDD01000056.1| GENE 8 4152 - 5489 1802 445 aa, chain + ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 1 443 7 449 449 545 65.0 1e-154 MENILEYLESNRLSELKNILNEENPVDIAEHFENLSKEKTILVFRILQKDTASEVFSYLS SEKQEEIIESITDEELKRILDELFLDDTVDLIEEMPANIVDKILKNSSSETRKLINQFLK YPENSAGGVMTVEYVSFKNDMTIGQALSYFKNVGMNKEDTDICFVIDKTRHFLGIITLKQ LIIVEDDVPLVDAMDTSIPTVNTLDDQEEVADLFRKYDYNSIPVVDNENRLVGLITIDDV VDVIDQENTEDFHIMAAMEPSNEEYLRESIFSLAKHRIIWLLVLMISATATGIIIRKYES VLQSVVTLAIFIPMLMDTGGNAGSQSATLIIRGLALGEIELKDIGKIVWKEFRVSILVGI VLALVNFLRIYYIDQVGFTIAMVVCFSLLITVIIAKVVGGSLPIIAKALKLDPAIMASPL ITTIVDACALAIYFQLSSHFLNLVS >gi|224461446|gb|ACDD01000056.1| GENE 9 5513 - 6406 915 297 aa, chain + ## HITS:1 COG:FN1038 KEGG:ns NR:ns ## COG: FN1038 COG0697 # Protein_GI_number: 19704373 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 6 294 8 297 303 285 56.0 9e-77 MNQFLLGNGALLLTSFIWGSAFVAQVTGMDLIGPFTFSASRCFLSTLFVLALIFLQKEKD DTKMKDLLFGGIACGLFLFLGSSCQQVGLQYTTAGKTSFITSLYIVLVPLLGIFFKKKVN LFTWMAVFLGTVGLYLLAMSGLTEGANINKGDFFVFLGSFFWAGHILVIDYFTKKVNPIK LSCLQFAVTTCLAAFVALSIETPTLPNIFASWKSIAYAGILSGGIAYTLQIVGQKHTTNT TLASLILSLESVFGAIAGFIVLHERLKPSEILGCIIMFIAILVAQIPSDLFQKKKGN >gi|224461446|gb|ACDD01000056.1| GENE 10 6413 - 7039 511 208 aa, chain + ## HITS:1 COG:FN0314 KEGG:ns NR:ns ## COG: FN0314 COG4122 # Protein_GI_number: 19703659 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Fusobacterium nucleatum # 1 206 1 213 215 209 55.0 3e-54 MLEILNNINQYLYHKIEESDPILLELEAYAKEHKVPIITKEVAEYLKMMLQIKKCHNALE 
IGTAIGYSGIYIARQITGQLTTIEIDEERFEEAKVNFKKADISNVVQILGDATEKIKEIQ ENFDFIFIDASKGQYQKFFEDSYPKLNQGGLIFIDNILFRGYVCEENYPKRFKTLVKKLD EFISYLYKNHNFVLLPFGDGIGIVHKSK Prediction of potential genes in microbial genomes Time: Fri May 20 02:10:25 2011 Seq name: gi|224461445|gb|ACDD01000057.1| Fusobacterium sp. 3_1_5R cont1.57, whole genome shotgun sequence Length of sequence - 55914 bp Number of predicted genes - 59, with homology - 59 Number of transcription units - 13, operones - 8 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 30/0.000 - CDS 2 - 223 264 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 2 1 Op 2 51/0.000 - CDS 258 - 2339 2709 ## COG0480 Translation elongation factors (GTPases) 3 1 Op 3 56/0.000 - CDS 2366 - 2836 772 ## PROTEIN SUPPORTED gi|237737535|ref|ZP_04568016.1| SSU ribosomal protein S7P 4 1 Op 4 . - CDS 2857 - 3225 610 ## PROTEIN SUPPORTED gi|237737534|ref|ZP_04568015.1| SSU ribosomal protein S12P - Prom 3272 - 3331 11.8 - Term 3284 - 3342 4.2 5 2 Tu 1 . - CDS 3369 - 5267 2472 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 5295 - 5354 9.4 + Prom 5275 - 5334 10.8 6 3 Tu 1 . 
+ CDS 5369 - 7633 1700 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 7647 - 7686 4.5 7 4 Op 1 4/0.000 - CDS 7661 - 8416 863 ## COG2099 Precorrin-6x reductase 8 4 Op 2 6/0.000 - CDS 8431 - 9174 1110 ## COG1010 Precorrin-3B methylase 9 4 Op 3 12/0.000 - CDS 9167 - 10186 1209 ## COG2073 Cobalamin biosynthesis protein CbiG 10 4 Op 4 9/0.000 - CDS 10202 - 10963 1077 ## COG2875 Precorrin-4 methylase 11 4 Op 5 7/0.000 - CDS 10942 - 11664 904 ## COG2243 Precorrin-2 methylase 12 4 Op 6 2/0.000 - CDS 11657 - 12235 883 ## COG2242 Precorrin-6B methylase 2 13 4 Op 7 6/0.000 - CDS 12222 - 12866 823 ## COG2241 Precorrin-6B methylase 1 14 4 Op 8 5/0.000 - CDS 12859 - 13983 1317 ## COG1903 Cobalamin biosynthesis protein CbiD 15 4 Op 9 1/0.200 - CDS 14015 - 14659 858 ## COG2082 Precorrin isomerase 16 4 Op 10 3/0.000 - CDS 14668 - 15987 1402 ## COG1797 Cobyrinic acid a,c-diamide synthase 17 4 Op 11 9/0.000 - CDS 15984 - 17060 1081 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 18 4 Op 12 2/0.000 - CDS 17041 - 18003 1180 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 19 4 Op 13 . - CDS 18000 - 19502 1852 ## COG1492 Cobyric acid synthase 20 4 Op 14 . - CDS 19489 - 21024 1204 ## COG1178 ABC-type Fe3+ transport system, permease component - Prom 21156 - 21215 9.2 + Prom 21033 - 21092 8.8 21 5 Tu 1 . + CDS 21120 - 22115 871 ## FN0917 hypothetical protein - Term 21942 - 21986 9.2 22 6 Tu 1 . - CDS 22066 - 22758 826 ## COG1738 Uncharacterized conserved protein - Prom 22844 - 22903 10.7 + Prom 22815 - 22874 7.3 23 7 Op 1 . + CDS 22902 - 24143 1166 ## COG0772 Bacterial cell division membrane protein 24 7 Op 2 . + CDS 24190 - 24384 396 ## gi|257452822|ref|ZP_05618121.1| hypothetical protein F3_07134 25 7 Op 3 1/0.200 + CDS 24412 - 25803 1807 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 25828 - 25861 -0.1 + Prom 25904 - 25963 4.3 26 8 Op 1 . 
+ CDS 25991 - 26551 646 ## COG1658 Small primase-like proteins (Toprim domain) 27 8 Op 2 1/0.200 + CDS 26541 - 27485 277 ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B 28 8 Op 3 . + CDS 27500 - 27805 319 ## COG1799 Uncharacterized protein conserved in bacteria + Term 27827 - 27872 12.3 - Term 27814 - 27858 12.1 29 9 Op 1 . - CDS 27885 - 28745 1057 ## HRM2_32380 hypothetical protein 30 9 Op 2 1/0.200 - CDS 28759 - 32331 4648 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 31 9 Op 3 1/0.200 - CDS 32393 - 33742 2189 ## COG1757 Na+/H+ antiporter 32 9 Op 4 . - CDS 33760 - 34947 1911 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 35123 - 35182 10.5 + Prom 34915 - 34974 7.0 33 10 Op 1 . + CDS 35073 - 36506 927 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 34 10 Op 2 . + CDS 36528 - 37415 754 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 37420 - 37451 -0.7 - Term 37408 - 37439 -0.7 35 11 Op 1 . - CDS 37444 - 38724 1336 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family 36 11 Op 2 . - CDS 38711 - 38902 129 ## Moth_1713 sigma-54 dependent trancsriptional regulator 37 11 Op 3 . - CDS 38904 - 39875 1401 ## COG0391 Uncharacterized conserved protein 38 11 Op 4 26/0.000 - CDS 39897 - 40787 1123 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 39 11 Op 5 . - CDS 40791 - 41216 479 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 40 11 Op 6 . - CDS 41227 - 42165 1211 ## COG0225 Peptide methionine sulfoxide reductase - Prom 42197 - 42256 5.4 - Term 42185 - 42239 13.1 41 12 Op 1 . 
- CDS 42258 - 42449 192 ## gi|257452839|ref|ZP_05618138.1| hypothetical protein F3_07219 42 12 Op 2 42/0.000 - CDS 42487 - 42762 340 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 43 12 Op 3 42/0.000 - CDS 42765 - 44162 1914 ## COG0055 F0F1-type ATP synthase, beta subunit 44 12 Op 4 42/0.000 - CDS 44186 - 45034 1026 ## COG0224 F0F1-type ATP synthase, gamma subunit 45 12 Op 5 41/0.000 - CDS 45049 - 46551 2057 ## COG0056 F0F1-type ATP synthase, alpha subunit 46 12 Op 6 38/0.000 - CDS 46570 - 47103 720 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 47 12 Op 7 37/0.000 - CDS 47100 - 47606 647 ## COG0711 F0F1-type ATP synthase, subunit b 48 12 Op 8 40/0.000 - CDS 47646 - 47921 673 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 49 12 Op 9 . - CDS 47947 - 48753 975 ## COG0356 F0F1-type ATP synthase, subunit a 50 12 Op 10 . - CDS 48757 - 49131 247 ## gi|257452848|ref|ZP_05618147.1| hypothetical protein F3_07264 51 12 Op 11 . - CDS 49106 - 49360 255 ## gi|257452849|ref|ZP_05618148.1| hypothetical protein F3_07269 52 12 Op 12 1/0.200 - CDS 49360 - 50076 1221 ## COG0500 SAM-dependent methyltransferases 53 12 Op 13 1/0.200 - CDS 50106 - 51464 2091 ## COG1109 Phosphomannomutase 54 12 Op 14 1/0.200 - CDS 51461 - 52003 433 ## COG4769 Predicted membrane protein 55 12 Op 15 1/0.200 - CDS 52006 - 53439 2037 ## COG0015 Adenylosuccinate lyase 56 12 Op 16 1/0.200 - CDS 53453 - 53863 198 ## PROTEIN SUPPORTED gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 57 12 Op 17 5/0.000 - CDS 53864 - 54811 1082 ## COG0681 Signal peptidase I - Prom 54842 - 54901 8.7 - Term 54864 - 54931 12.0 58 12 Op 18 . - CDS 54932 - 55282 552 ## PROTEIN SUPPORTED gi|237739925|ref|ZP_04570406.1| LSU ribosomal protein L19P - Prom 55349 - 55408 6.5 + Prom 55372 - 55431 10.4 59 13 Tu 1 . 
+ CDS 55457 - 55912 199 ## gi|257452857|ref|ZP_05618156.1| hypothetical protein F3_07309 Predicted protein(s) >gi|224461445|gb|ACDD01000057.1| GENE 1 2 - 223 264 74 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 72 1 72 407 106 69 3e-22 MAKEKYERSKPHVNIGTIGHVDHGKTTTTAAISKVLSDLGLAQKVDFDKIDVAPEERERG ITINTAHIEYETET >gi|224461445|gb|ACDD01000057.1| GENE 2 258 - 2339 2709 693 aa, chain - ## HITS:1 COG:FN1556 KEGG:ns NR:ns ## COG: FN1556 COG0480 # Protein_GI_number: 19704888 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 693 1 693 693 1217 88.0 0 MARKVSLDMTRNVGIMAHIDAGKTTTTERILFFTGVERKIGEVHEGQATMDWMEQEQERG ITITSAATTCFWREHRVNIIDTPGHVDFTVEVERSLRVLDGAVAVFSAVDGVQPQSETVW RQADKYQVPRIAFFNKMDRIGANFEMCVSDIREKLGSNPVPIQLPIGAEDQFEGIVDLIE MKEIVWGADSDNGQVFEVREVRESMKEAAEEARQYMLESVVETSDELMEKFFGGEEITVE EIRSALRVATIANTVVPVTCGTAFKNKGVQPLLDAIVDYMPSPTDVAMVAGTDPKDPEKE VDRQMSDEAPFAALAFKVMTDPFVGRLTFFRVYAGIVEKGSYVLNSTKGKKERMGRLLQM HANKREEIDVVYCGDIAAAVGLKDTTTGDTLCAEDAPIVLEKMEFPEPVISVAVEPKTKA DQEKMGIALSKLAEEDPTFRVRTDEETGQTIISGMGELHLEIIVDRMKREFKVESNVGQP QVAYRETITKSVDQEVKYAKQSGGRGQYGHVKVTIEPNPGKEFEFINKITGGVIPKEYIP AVEKGCREALESGVVAGYPMVDVKVTLYDGSYHEVDSSEMAFKIAGSMALKQGAGKAAAV ILEPVFKVEVTTPEEYMGDIIGDLNSRRGMVSGMIDRNGAKIITAKVPLSEMFGYATDLR SKSQGRATYSWEFSEYIQVPASIQKAIQEERGK >gi|224461445|gb|ACDD01000057.1| GENE 3 2366 - 2836 772 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237737535|ref|ZP_04568016.1| SSU ribosomal protein S7P [Fusobacterium mortiferum ATCC 9817] # 1 156 1 156 156 301 98 4e-81 MSRRRAAVKRDVLPDSRYSDKVVTKVINSIMLDGKKAIAEGIFYGAMDIIKEKTGQEGYD VFKQALENIKPQIEVRSRRIGGATYQVPVEVKADRQQTLAIRWLTLYTRQRKEYGMIEKL AAELIAAANNEGATIKKKEDTYKMAEANRAFAHYKI >gi|224461445|gb|ACDD01000057.1| GENE 4 2857 - 3225 610 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237737534|ref|ZP_04568015.1| SSU ribosomal protein S12P 
[Fusobacterium mortiferum ATCC 9817] # 1 122 1 122 122 239 96 3e-62 MPTLSQLVKNGRDTLVEKKKSPALHGNPQRRGVCVRVYTTTPKKPNSALRKVARVKLTNG IEVTCYIPGEGHNLQEHSIVLVRGGRTKDLPGVRYKIIRGALDTAGVAKRKQARSKYGAK KA >gi|224461445|gb|ACDD01000057.1| GENE 5 3369 - 5267 2472 632 aa, chain - ## HITS:1 COG:FN2102 KEGG:ns NR:ns ## COG: FN2102 COG0488 # Protein_GI_number: 19705392 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Fusobacterium nucleatum # 1 631 1 631 631 790 69.0 0 MALVQVSNLYMGFSGSCLFRDINFSIDEKDKIALIGMNGAGKTTLVKILLGLEYSEVDPR TQQRGNISTKNGIKIGYLSQNPKLDLENTVFEEMMTVFSELQKIHQRMQEINISLANNLG NNQELMNELGEIAAYYEQHEGYAVEYRVKQILRGLSLKENLWEQKIKNLSGGQLSRVALG KILLEEPDLLVLDEPTNHLDLNSIAWLEKTLKSYPKAIFLVSHDVYFLDNVANRIYEMEG KTLKAYSGNYTDFVIQKEAYLSGAVKAYEKEQEKIQKMEEFIRRYKAGVKSKQARGREKI LNRMDKMENPVITTKKMKLKFDTDLQSVDLVLELKKLCKSFSGKKLFENLDLKIYRGERV GIIGKNGTGKSTLLKIINSLEKASEGTFSIGEKVKIGYYDQNHQGLGMDNNVLEELMYHF TLSEEEARNICGAFLFREDDIYKKISSLSGGEKARVAFMKLMLEKPNFLILDEPTNHLDL YSREILMNALEDYSGTLLVVSHDRNFLDQVVRKIYRIEENGFSVFHGDYSSYLEEEKEVK EKSNEGNLSFEEQKKQRNRVANLERKTKKLEEEIARLEEKKSICEKEYEEAGRKNDLDAL LDLQRKLEEWDEKIFQKLEAWEELESEKNSLK >gi|224461445|gb|ACDD01000057.1| GENE 6 5369 - 7633 1700 754 aa, chain + ## HITS:1 COG:FN1704_1 KEGG:ns NR:ns ## COG: FN1704_1 COG1752 # Protein_GI_number: 19705025 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Fusobacterium nucleatum # 8 345 5 341 375 206 36.0 1e-52 MFQKSYFYFFLFLFTSFVSFTDGWQNQEERIAALNQEITQLMKKKQEYEVLKQKIRSEVT KENPKIALVLSGGGAKGAAHIGVLKVLEKYQIPVDIIIGTSVGSIVGGMYAIGYSPEEIE TLILNLNFGKLLTDSKDKTLKTIESHLTNEKYPLHFNMDKEFNISTPMGILNGQNIYFQL KDIFSPAENIHNFDEFPIPYRAITTNLQNGKEEIIKEGNLALASFQSMAIPAFISPVEHN GEFFVDGGVVNNFPVDVAIQMGADIIIGVDISADDNKISNDSNIISILDKISSYNGNRST ELHRQLANILIVPNVKQHNTVDFSNLSDLIQEGEIAAEKHANILQKFTDSSEFQKKKMKK LQQKSFYIEKIKCHGNEILSLEEIVQLAPPSKTKRYTKEQLEEWARKIYANTYVDKVEYH 
IKDNILYFNIHEKKEIILNAGLAYHTHYGGSFNVAANIPNFFDNITTHLGLKAEISEFPK LDIHNSFQYRIQRQTFYGQGRIFFQKSPFFLYEAGDNISTYATMDIGTSLTLGTELSPSL MLQYELSHHNINHNYVKGKRKIKEIEQNYKILKNTLKVTKDTLNRNVFSNKGYKLEGEIS NMNSTDNHKISASSLKGTAEIYIPITDTNLTLSSALSGGKISGRNIPKTEYIKIGGSRNF QNNVEFLGVPISSIHSNHFWLWNFGLQYKLFENLNFIGKYNHIEYSNEKNEKQKEDGYGF GLGFDIFYTPITFQISKRRHYRYPVWELSLGYAF >gi|224461445|gb|ACDD01000057.1| GENE 7 7661 - 8416 863 251 aa, chain - ## HITS:1 COG:FN0950 KEGG:ns NR:ns ## COG: FN0950 COG2099 # Protein_GI_number: 19704285 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6x reductase # Organism: Fusobacterium nucleatum # 1 246 19 264 266 320 64.0 2e-87 MIWVIGGTKDSRDFLEEYTKYDSNVIVSTATEYGGKLLENLDITISTQKMNLDEMLQFLK DYSIQKIVDVSHPYAYEVSKNAMRVAEMQGISYYRFERKEIELCAKKYSKFKNLKDLLHY VESLEGNILVTLGSNNVPSFQNLKNLSKIYFRILPKWDMVKRCEEHGILPKNIIAMQGPF TENMNIAMLEQLQIQYLITKQAGDTGGEREKISACDKKGIEVIYLEKEKLEYKNCYFELN TLIEALKIPSK >gi|224461445|gb|ACDD01000057.1| GENE 8 8431 - 9174 1110 247 aa, chain - ## HITS:1 COG:FN0951 KEGG:ns NR:ns ## COG: FN0951 COG1010 # Protein_GI_number: 19704286 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Fusobacterium nucleatum # 1 247 1 247 249 402 81.0 1e-112 MNKGKIYVVGIGPGNMEDISVRAYRVLKEVDVIAGYTTYVDLVREEFQEKEFCASGMKRE VERCQEVLELAKEGKNVALISSGDSGIYGMAGIMLEVAMESDIEVEVVPGITSTIAGAAL VGAPLMHDQALISLSDLLTDWEVIKRRIEAASQGDFVISLYNPKSKKRVSQIQEAREIML KYKKASTPVALLRHIGREEENYDLCTLENFLDYEIDMFTIVLIGNSNSYIKNGRMITPRG YQDKYQY >gi|224461445|gb|ACDD01000057.1| GENE 9 9167 - 10186 1209 339 aa, chain - ## HITS:1 COG:FN0952 KEGG:ns NR:ns ## COG: FN0952 COG2073 # Protein_GI_number: 19704287 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Fusobacterium nucleatum # 1 326 1 322 337 388 66.0 1e-108 MKIAFWTVTRGAGNIAKEYAELLSSQVKYDEIQVYTLEKFSIKDTVQIQNFTDKLEEKFH SYDTHVFIMASGIVIRKISKLIKGKDIDPAVLLIDEGKHFVISLLSGHLGGANEITYKIA SLLNLIPIITTSSDITGKIAVDIISQKLNAELEDLKSAKEVTSLIVDGKTVDILLPKNVK 
VQKADKISKNPDGIIVVSNKRKLEMTRIFPKNLILGIGCKKDTREKDILEAIEASMEKHN LDMRSVKHIATVDIKKDELGLVQAAKTLEKELIIISREEIKKVQDKFEGSDFVEKNIGVR AVSEPAAYLSSSRKGQFLERKAKYQGITISIYEEEIESE >gi|224461445|gb|ACDD01000057.1| GENE 10 10202 - 10963 1077 253 aa, chain - ## HITS:1 COG:FN0957 KEGG:ns NR:ns ## COG: FN0957 COG2875 # Protein_GI_number: 19704292 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Fusobacterium nucleatum # 2 253 6 257 257 417 86.0 1e-117 MEKVYFIGAGPGDPELITIKGQRIVKEADVIIYAGSLVPKQVIDCHKEGAEIYNSASMSL EEVIAVMVKAVQAEKKVARVHTGDPAIYGAHREQMDILDEYGVEYEVIPGVSSFLASAAA IKKEFTLPNVSQTVICTRIEGRTPVPERESLESLASHQASMAIFLSVHMIDRVVESLLKH YPKTTPVAIVQRATWEDQKIVLGTLETIEEKVREANINKTAQILVGNFLGKEYEKSKLYD KYFSHEFRQGIEK >gi|224461445|gb|ACDD01000057.1| GENE 11 10942 - 11664 904 240 aa, chain - ## HITS:1 COG:FN0959 KEGG:ns NR:ns ## COG: FN0959 COG2243 # Protein_GI_number: 19704294 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Fusobacterium nucleatum # 1 238 9 246 248 367 81.0 1e-101 MNNKFYGIGVGVGDPEEITMKAVNVLKKLDVVILPEAKKDEGSVAYEIAKQYMKKDVEKV FVEFPMLKSLEDRINARKANAKIVEEYLEKGLNVGFLTIGDSMTYSTYVYLLEHLPEKYL VETVPGISSFVDMASRFNFPLMIGEESLKVVSLNSHTEIEKEIASSDNIVFMKVSRSFER LKQAIIATGNQENIIMVSNCGKENQVVTYDIEELEEEDIPYFTTLILKKGGMKAWKKFIS >gi|224461445|gb|ACDD01000057.1| GENE 12 11657 - 12235 883 192 aa, chain - ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 1 189 1 189 189 293 79.0 2e-79 MHIYDKEFVQEELPMTKQEIRAISIAKLQLHPNSILIDVGAGTGTIGIEAATYLSQGKVY AIEKEEKGLETIRKNAAKFQLQNFELIHGKAPDAIPNIPYDRMFIGGSTGKLEEIIQHFM HYGIEKAILVINCITLETQSKAMEVLKSFGFRDIEVVQVQVSRGKKVGPYTMMYGENPIY IIKVVKGEKNIE >gi|224461445|gb|ACDD01000057.1| GENE 13 12222 - 12866 823 214 aa, chain - ## HITS:1 COG:FN0966 KEGG:ns NR:ns ## COG: FN0966 COG2241 # Protein_GI_number: 19704301 # Func_class: H 
Coenzyme transport and metabolism # Function: Precorrin-6B methylase 1 # Organism: Fusobacterium nucleatum # 1 209 14 224 229 195 46.0 6e-50 MSRVMVVSIGPGNVDYISQKAKERLEQSDFVLGSRRQIEDVRSICSVTTEFYVYKKITEI KEVVEKEQKKKISILVSGDSGYYSLVPYLKKVLREEFDIIPGLSSFQYLFSKIGENWQDF FIGSVHGRKLDYIQKFREENRGLVLLTDEENNPKQIAKNLWEAGFREVDIIVGENLSYQE ENISYYKIEDWEIMPERFEMNVCIYRKGEENAYL >gi|224461445|gb|ACDD01000057.1| GENE 14 12859 - 13983 1317 374 aa, chain - ## HITS:1 COG:FN0967 KEGG:ns NR:ns ## COG: FN0967 COG1903 # Protein_GI_number: 19704302 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Fusobacterium nucleatum # 1 372 1 371 375 472 63.0 1e-133 MEDRELRNGYTTGSCAAAAVKAALMSLLYHISLQEVEVETPKGEELVIPILKVRRRGNFA SAAVQKYAGDDPDVTNGISICVKVFLQKEFPKIERAIIRGKCLIYGGRGVGLVTKKGLQV EVGKSAINPGPQKMIEKVVKDLLQETEDKVVICIYIPEGRAKASQTYNPKMGVLGGISVL GSTGIVKAMSEEALKASMYAELKVLRMDKRRKWVIFAFGNYGKAYCEKLGLDIEQMIIIS NFAGFMIESAVKLGFQKIILLGHIGKAIKLAGGIFHTHSRVADGRMEVMGANAFLYGLDS TIIRKILLSNTVEEACNYISDSKFFDYLSNRIRDKIVEYSRKEGFESEVLLFSFEKGTLG QSDAFLKMVEECHE >gi|224461445|gb|ACDD01000057.1| GENE 15 14015 - 14659 858 214 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 1 214 4 217 219 353 84.0 2e-97 MAYIKVPGDIEKRSFEIIEEEMGEKIHQFSEQELPIVKRIIHTSADFEYGDLIEFQNNAI QSGIESLRKGCKIYCDTNMIVNGLSKPAMSKFACSAYCLVSDKEVIEEAKKEGLTRSIVG IRKAAKDKETKIFIIGNAPTALYQLKEMIERGEIERPALVIGVPVGFVGAAESKEAFKSL DVPYITINGRKGGSTIGVGILHGILYQIYKREGF >gi|224461445|gb|ACDD01000057.1| GENE 16 14668 - 15987 1402 439 aa, chain - ## HITS:1 COG:FN0972 KEGG:ns NR:ns ## COG: FN0972 COG1797 # Protein_GI_number: 19704307 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Fusobacterium nucleatum # 1 438 1 444 444 501 58.0 1e-142 MKAFLLAGTHSGVGKTTISMGLMKIFSRKYQVSPFKVGPDYIDPSFHAWVTGNFSYNLDY 
FMMGKQGVQYSFQSHQKDFSIVEGVMGLYDGIDHSLDNASAAHISRILDLPVILIVDAQG KSTSIAAQVLGYQKLDERVKIAGVIINQVNSEKSYIHCKEAIERYTKIPCLGYVKKEEQL RISSRHLGLLQANEVKDLDEKLETLADMIEQTIDIKRIEDIAERQEKSETIFHPLEKYQN YWRGRKIGIARDEAFRFYYQDNLESLEYLGFEVEYFSPIHDSQLPEKVDYLYFGGGYPEI FSEGLEKNKKMREEIQKFTGGIYAECGGFMYLGKEIIQLSEEKLQMCALLPVSTKMKNRL NISRFGYISLEENGIEIAKAHEFHYSDLENMEKDTRVLIARKIDGRSWSCIFEKEGRIYA GYPHIHFFNSIEFLKKIWR >gi|224461445|gb|ACDD01000057.1| GENE 17 15984 - 17060 1081 358 aa, chain - ## HITS:1 COG:FN0973 KEGG:ns NR:ns ## COG: FN0973 COG0079 # Protein_GI_number: 19704308 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Fusobacterium nucleatum # 2 353 3 355 357 430 61.0 1e-120 MDLHGGNIYRLQREGKEVLDYSSNINPLGVPQKFIEKAIQNFSSLSQYPDIDYIELREKI ADYNQVSRENILVGNGATEILFLYIRALRPKKTLLVGPCFAEYARALKTVESEICLFPLK EEENFILNVEALISEIQKEDYDLVLLCNPNNPTGRFIPLEDFKKIVTVIEKKGIQLFVDE AFIEFVESWKEKTVALLKSKSVFILRALTKFFAIPGLRLGYGMTWNADLFSRMQEEKEPW SVNVFANLAGLTMLEDEEYIRKTEDWIREEKKYFHQELSKISEIKVYETETNFILLQLLS KEAREFQAAMIEKGILVRDASNFPFLNEYYIRLAIKDRVSNQKVIKAIQEILKKGERV >gi|224461445|gb|ACDD01000057.1| GENE 18 17041 - 18003 1180 320 aa, chain - ## HITS:1 COG:FN0975 KEGG:ns NR:ns ## COG: FN0975 COG1270 # Protein_GI_number: 19704310 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Fusobacterium nucleatum # 3 309 5 316 325 353 61.0 3e-97 MIFIFRYSFAYFLDLVFGDPYWFPHPVRFIGKWISSLEKILYRFSNKYYSGVFLWFATCA ITFIISFYLAKNEYLEIFFLYTSLATKSLAMEGKKVIRLLEEGDLDKAKKELSYLVSRDT KEMDEEQISMSTLETIAENTVDGVISPMFYAFIGSHFHFLGVSLALPFAMTYKAINTLDS MVGYQTEQYNLFGRFSAKMDDIANWIPARLAGGIFIPLAAWILGFQAKKSYQIFQRDGNK HASPNSGQSEAAYAGALGVQFGGKIFYFGEAYEKQKIGDALFPFSVEIVRRGVKLLYGTS FCACILFILLGGLYHGFTWR >gi|224461445|gb|ACDD01000057.1| GENE 19 18000 - 19502 1852 500 aa, chain - ## HITS:1 COG:FN0977 KEGG:ns NR:ns ## COG: FN0977 COG1492 # Protein_GI_number: 19704312 # Func_class: H Coenzyme transport and 
metabolism # Function: Cobyric acid synthase # Organism: Fusobacterium nucleatum # 3 490 2 493 496 541 56.0 1e-153 MRKHRSLMVVGTASGVGKSATVTALCRIFQKDGYRVCPFKSQNMALNSYVTKDGKEMGRA QAVQAEAIGLEPQAWMNPILLKPSNDKKIQVIIEGKSFGNLTGPEYHKYKQNFIPRLQEI YHRIEKHYDISVIEGAGSPAEINMLEEDISNFGMARIADAPVLLVADIDKGGVFASIYGT IMLLEEKDRRRIKGIIINKFRGNVEVLKPGLEKIETLTGVPVLGVMPYSDFDLEEEDSLS EKYKKKNSKKISIRIGVVQLRHLSNMTDFDALRRLEEVDLHFISKVEEIEGEDIIILPGS KNTIEDYLEIEKKGIVHKLREEVKKGTMIIGICGGFQMLGSRIEDPYEIESEAGSVEACG FLEMNTILEKDKNLLQYQGSFQFGKDCLEKMNGVSVKGYEIHQGVSISEMRDAMGEDRLI SLAKGRVWGSYLHGIFDNTEFLNRLLEEFRQKKNLEGSLQDYQEYREEQWEKLEQLYRKH LDISRIYKIMNEFEKGQEKK >gi|224461445|gb|ACDD01000057.1| GENE 20 19489 - 21024 1204 511 aa, chain - ## HITS:1 COG:PH1352 KEGG:ns NR:ns ## COG: PH1352 COG1178 # Protein_GI_number: 14591158 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Pyrococcus horikoshii # 9 499 40 552 559 127 26.0 5e-29 MRRYKYPKIIINSIYLLLWILPLFWFVRDFWVMEEIQNSLDRSLWRTVVFTWKQSIYSSC LAFIVAIIPARYLAYHKNLLSKILESLLFIPFFFPVLSTIGIFSIVFNLPWIEKFSILYS MKAILIAHVFYNSPIFVKYIGESLRRIPKEIEESMILDGASSWKIFWKGQLPLMMPQVFK AFILCFTYCFLSFAILLSLGGIQYQSLEVEIASTLQGDFNFSKAMIYGLLQFLMLLSVNS LGILLPDYELKGSGYSKKMPYYTFLFSALYALFECGIVLASIVASFYNYFTGEFSLRAYQ IIFSSSFQEEYPIWRSLGNSFLVAGIAALGTVIIVYFLLRNYSRIIELLIFSNLGISGAF FAMTLYYIYVLYEVPFTLLLVFAYFMTGIPLAYSFLYQNVKNFPKDLQEMALLDGTSHWT YFWKIQFPILRPLFLLSFLQSFAIFLGEFTLAYTMQLGDIFPVVSLVNYSLLVDKKYLES SALSAVLLLLILLLFFLGECLKVRGEVHEEA >gi|224461445|gb|ACDD01000057.1| GENE 21 21120 - 22115 871 331 aa, chain + ## HITS:1 COG:no KEGG:FN0917 NR:ns ## KEGG: FN0917 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: Purine metabolism [PATH:fnu00230]; Pyrimidine metabolism [PATH:fnu00240]; Metabolic pathways [PATH:fnu01100]; DNA replication [PATH:fnu03030]; Mismatch repair [PATH:fnu03430]; Homologous recombination [PATH:fnu03440] # 11 321 1 322 322 138 31.0 3e-31 
MFYFFYGNQSLLELEFKKRREEYSQKNYIIHSFDFSNQEEEIFLQELSMNSMFAETKCFL VKRVEHFKGNQLSSLLKGMSLFDLSKKEIFFFYAEKEIGKTVEKELTKLGTEIAIFSEEE QEKNLKHYLEKKLSLSSYDAEKLLEMLGKNFHKIEQESNKILQFLDGESFSFEKVFPILS IEKEYNIFSIIDQFLEQESPQILLEYLQQNKNDISVILYNLAESVFLIAKISSLIEQDQI DDRVSYTNFKTSFPKIQQYFRGKGNRSLHPYPVYLKIKIAKKHPISFWLKKLNEILLCEY QFKSGFMDIQMSIEQFILGFYPFSLSIPQDK >gi|224461445|gb|ACDD01000057.1| GENE 22 22066 - 22758 826 230 aa, chain - ## HITS:1 COG:FN1996 KEGG:ns NR:ns ## COG: FN1996 COG1738 # Protein_GI_number: 19705292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 2 230 3 233 235 272 60.0 4e-73 MQNEILWAIMLLCNFLCIMAIYYRFGKIGLFAWVPVATILANIQVVMLVRLFGMEVTLGN ILYAGGYLVTDILAENYGKEDAKKAVYLGFFSMIAMTIIMQVAIHFTPSSAGIELFDGVK GVFALMPRLAIASLLAYLISQQHDIWAYEFWRYRFQDRKYIWIRNNASTMVSQLLDSFIF TVVAFYGVFPLPVLWEVFIGTYLIKFLVAICDTPFIYLGEYLKRMGKIQE >gi|224461445|gb|ACDD01000057.1| GENE 23 22902 - 24143 1166 413 aa, chain + ## HITS:1 COG:FN0042 KEGG:ns NR:ns ## COG: FN0042 COG0772 # Protein_GI_number: 19703394 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Fusobacterium nucleatum # 33 406 39 416 417 211 35.0 3e-54 MKRKDAIHENIYDKYQKLHESGEELEKQVSRNKRSSALLMILFIILSLSIANMFSVSLGL RNDQLGLVKKHTLMIFIGLFLCFVLSKISYKTFQKSFAKKALYIIPLLIFIGMMLAPSSI VPVRNGAKAWIQLGGFAIQPAELFKVSYIILLSGVLARIEDENSMKDYTLIILVGAFTFL PYAIFIHLQNDLGAIIHYALITGYLFVLSNISIKIIRLWSLIGGVAVVSAFSLIYKLGAD NLSGYKLKRIYSFLDGLFTGNYSPEFGYQVRQALIGFGSGGFLGKGFANGIQKYSYVPET ATDFISVTFGEEFGLLGMFILLSFYLILYWIICTISKECQDSFGKYLSAGIGAYLIIQVF INIGVAIGILPVFGLTLPLFSNGGSSIFAILSALGICLNINKTSHLFEKKKKK >gi|224461445|gb|ACDD01000057.1| GENE 24 24190 - 24384 396 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452822|ref|ZP_05618121.1| ## NR: gi|257452822|ref|ZP_05618121.1| hypothetical protein F3_07134 [Fusobacterium sp. 
3_1_5R] # 1 64 1 64 64 80 100.0 2e-14 MEMKDIIEKVNYYSSLSKKRALTAEEEADRAIWRKRYLEKLTSQVRKHLDSIQIVDEKEQ NKIQ >gi|224461445|gb|ACDD01000057.1| GENE 25 24412 - 25803 1807 463 aa, chain + ## HITS:1 COG:FN0040 KEGG:ns NR:ns ## COG: FN0040 COG0017 # Protein_GI_number: 19703392 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Fusobacterium nucleatum # 3 463 1 461 461 763 80.0 0 MEMTTVKSIFRNKETYIDKEVKLGAWVRKIRSQKNFGFLEINDGSFFNGIQVVFDTSLEN FDEISRLSIASSVIVEGTLVKSEGAGQEFEIKASKVEVCQKADLDYPLQNKRHSFEFLRT KSHLRARTNTFSAVFRVRSAAAYAIHKFFQEQNFVYVHTPIITSSDAEGAGEMFRITTLD LNNVPKNEDGSINFQKDFFGKSTNLTVSGQLNGETYCAAFRNIYTFGPTFRAEYSNTARH ASEFWMIEPEMAFADLEVNMDIAEKMVKYIIRYVMETCPEEMNFFNQFIEKGLFDKLNNV LNNDFGRLTYTEAIDILEKSGKKFEYPVKWGIDLQSEHERYLAEEHFKKPVFLVDYPKDI KAFYMKLNEDGKTVRAMDLLAPQIGEIIGGSQREDNLEILENRMNELGMDKEDYSFYLDL RKYGSFPHSGYGLGFERMIMYLTGMANIRDVIPFPRTPNNIEF >gi|224461445|gb|ACDD01000057.1| GENE 26 25991 - 26551 646 186 aa, chain + ## HITS:1 COG:FN0039 KEGG:ns NR:ns ## COG: FN0039 COG1658 # Protein_GI_number: 19703391 # Func_class: L Replication, recombination and repair # Function: Small primase-like proteins (Toprim domain) # Organism: Fusobacterium nucleatum # 1 180 1 180 183 194 57.0 1e-49 MKPKIQEIIIVEGRDDISAVKAAVDAEIIQVNGFAIRKKGNIDKIKKAYEKKGIILLTDP DYAGNEIRSFLQKHFPKAKNAYISRSEGKKGDDIGVENAKPEAILRALELAKCNIEKQEN AYSIHFLYDLELVGHPRSTKYREIFTSILGIGYSNGKQLLSKLNRYGFSEEEILKAHQKM KEEYEK >gi|224461445|gb|ACDD01000057.1| GENE 27 26541 - 27485 277 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 4 314 6 309 311 111 28 1e-23 MKNKIRELLLQYDSILITAHKNPDGDAVGAGLALTLSLLELGKKVRFVLQDKIPDTTLFL EGSHLIEQYQEEENFQNIELVVFLDCATRDRAGCMNHLTEGKTTINIDHHMSNPHYGDYA FVEPNISATSEILTQLLREWNFPMNAAIASALYLGIVNDTGNFEHDNVTVNTLKAAQFLV EQGANNAMIVRNFLKTNSYASLKLLGEALFHFQFFEEKKLSYFYLTKEVMNKYAAKKEHT EGIVEKLLSYEKASVSLFLREEEDGSIKGSMRSKDSIDVNQIAAYFGGGGHVKAAGFSSQ 
DCADIILNKILELL >gi|224461445|gb|ACDD01000057.1| GENE 28 27500 - 27805 319 101 aa, chain + ## HITS:1 COG:FN1010 KEGG:ns NR:ns ## COG: FN1010 COG1799 # Protein_GI_number: 19704345 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 96 1 93 98 86 44.0 1e-17 MEKETSIVFLKPKRFEDCDDCVRYVAEDKIVNVNLKDLKEKDARRLYDYVHGAVYVKQAK LIDIGENIFCCVPKNINSEVKYNQGNTSKSNEEEEIIPFAK >gi|224461445|gb|ACDD01000057.1| GENE 29 27885 - 28745 1057 286 aa, chain - ## HITS:1 COG:no KEGG:HRM2_32380 NR:ns ## KEGG: HRM2_32380 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 27 286 40 300 301 288 54.0 2e-76 MKKILSILFVLLISQFTFAAPSLGSEYKLSKVIEVEGRQGIAVDKDYYYISSSTALYKYD KSGNLVQKNTNPFTKLEKEANHFGDIDVWNGEIYTGIEIFEFGTSKNIQVAVYDAATLEY KYSIPWAAESGQVEVCGLAVDRDNNTVWMADWTKGRYLYCYDLATKKYERKVHLRPDPQY TQGIYCIDGKMLISADDGDADFHESDNIYVADISDKKQTASYVSLFREMSDFKRAGEIEG LSIDPTNSDLLVLANRGTRVDRGMPVGFYEGYDKEIHELYVYTKVR >gi|224461445|gb|ACDD01000057.1| GENE 30 28759 - 32331 4648 1190 aa, chain - ## HITS:1 COG:FN1421_1 KEGG:ns NR:ns ## COG: FN1421_1 COG0674 # Protein_GI_number: 19704753 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 1 409 3 411 412 777 91.0 0 MKRIMKTMDGNQAAAYASYAFTEVAGIYPITPSSPMAEYVDEWASKGMKNIFDVPVKLVE MQSEAGAAGTVHGSLQAGALTTTYTASQGLLLKIPNMYKIAGELLPGVIHVSARSLSVQA LSIFGDHQDIYATRQTGFTMMASGSVQEVMDMATVAHLTAIKSRVPVLHFFDGFRTSHEI QKIELMDYDVCKKLVDYDAIQAFRDRALNPEHPVTRGTAQNDDIYFQTREAQNKFYDAVP DIAAYYMEEISKETGRDYKPFKYRGAADATRVIIAMGSICPAAEETVDYLVEKGEKVGLL TVHLYRPFSEKYFFNVLPKTVEKIAVLERTKEPGAPGEPLLLDVKGLFYGKVNAPVIVGG RYGLSSKDTTPAQIKAALDNLKLDNPKTNFTVGIVDDVTFTSLEVGERLVVSDPSTKACL FYGLGADGTVGANKNSIKIIGDKTDLYAQGYFAYDSKKSGGVTRSHLRFGKNPIKSTYLV SSPMFVACSVPAYLNQYDMTSGLKEGGKFLLNCVWDKEEALQRIPNNVKRDIARANGKLY 
IINATKLAHDIGLGQRTNTIMQAAFFKLAEIIPFEEAQQYMKDYAYKSYGKKGDDIVQLN YKAIDVGASGLIELEVDPAWKDLEVVDQVKEDKNNDTCNCKTDLLKTFVKDIVEPINAIK GYDLPVSAFTGREDGTFENGTASFEKRGVAVDVPEWIVDNCIQCNQCSYVCPHAAIRPFL ITEEEKKASPVELITKKAVGKGLEDVTYRIQVTPLDCVGCGSCVNVCPAPGKALVMKPIA NALELEEDKKATYLYGSVPYRTDRMPTSTVKGSQFSQPLFEFNGACPGCGETPYLKVISQ MFGDRMMVSNASGCSSVYSGSAPSTPYTKNCHGEGPAWASSLFEDNAEYGFGMHIGVEAL RDRLQHIMEGAMEEVSPALQGLFREWIENRAYAAKTREVSPKIIELLEGKEEAYAKEILG LKQYLIKKSQWVVGGDGWAYDIGYGGLDHVLATNEDINIIVMDTEVYSNTGGQASKATPT GAVAKFAAAGKPVKKKDLAAICMSYGHIYVGQVSMGANQQQFLKAIQEAEAYNGPSIIIA YAPCINHGIKKGMSKSQTEMKLATECGYWPIFRYNPLLEAEGKNPLTLDSKEPKWELYQD YLMGETRYLTLMKTNPNEAKALFDKNQWDSQRRWRQYKRLASLDFSEEKR >gi|224461445|gb|ACDD01000057.1| GENE 31 32393 - 33742 2189 449 aa, chain - ## HITS:1 COG:FN1420 KEGG:ns NR:ns ## COG: FN1420 COG1757 # Protein_GI_number: 19704752 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 5 448 2 445 445 620 78.0 1e-177 MLENQVKASFKGLIPFIVFIVIYLGAGMILQSQGVELAFYQLPGPVAAAAGIVVAFILFK GTIEEKFNTFLEGCGHQDIMTMCIIYLLAGAFAVVSKAMGGVDSTVNLGITYIPPHYIAV GLFVIGAFISTATGTSVGAIVALGPIAVGLGEKSGVPMALILAAVMGGAMFGDNLSVISD TTIAATKTQGVEMRDKFRINLFIAAPAAIITIILLFMFGRPDVVPEAMSYDFNIVKVLPY VFVLVMALIGINVFVVLASGVLLSGIIGFAYGDFTLLTFGQQVYNGFTNMTEIFILSMLT GGMAQMVTKQGGIQWVIEKIQTMVVGTKSAKFGIGMLVGLTDIAVANNTVAIIINGEIAK QLSTKYEVDGRESAAFLDIFSCVAQGAIPHGAQMLILLGFAKGAVSPTQLMPLLWYQILL FIFSVVYIMMPQLSKQVLNFLDKPQSKKA >gi|224461445|gb|ACDD01000057.1| GENE 32 33760 - 34947 1911 395 aa, chain - ## HITS:1 COG:FN1419 KEGG:ns NR:ns ## COG: FN1419 COG0626 # Protein_GI_number: 19704751 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Fusobacterium nucleatum # 1 395 1 395 395 684 84.0 0 MEMKKCGLGTTAIHGGAVKNPYGSLAVPVFQTSTFIFDSAEQGGKRFALEEPGYIYSRLG NPTTSILEARVAALEEGEAAVAMSSGMGAISSTLWTVLKAGDHVITDTTLYGCTFALMNH GLTKFGVEVSFVDTSDLEAVKKAMKPNTRVVYLETPANPNLKIVDLEAIAKLAHTNPNTL 
VIVDNTFATPFLQKPLKLGVDIVVHSATKYINGHGDVIAGLAITNQELANQIRLVGVKDM TGSVLGPQEAYYILRGLKTFEIRMERHCKNAEKVVEYLCKHDQVEKVYYPGLVDHPGHEV AKKQMRAFGGMISFELKGGIEAGKTLLNNLKLCSLAVSLGDTETLIQHPASMTHSPYTKE ERMAAGITDGLVRLSVGLENVEDIIADLEYGLSKI >gi|224461445|gb|ACDD01000057.1| GENE 33 35073 - 36506 927 477 aa, chain + ## HITS:1 COG:FN1418 KEGG:ns NR:ns ## COG: FN1418 COG1167 # Protein_GI_number: 19704750 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Fusobacterium nucleatum # 1 469 1 471 475 647 67.0 0 MKTKIIRDAKHNISMQLYEILKEDILQNNWKENTKFYSIRQISIKFQVNLNTVLKVFQTL EEEGYLYSIKGKGCFIKKGYNLDVNERMTPILNTFRFGQNTKGQEINFSNGAPPKEYFPV EAYQNILSEILSDVEGSKNLLGYQNIQGLESLRQELTQFVKPYGITVSKDNIIVCSGTQN VLQLISTTLGTIPRKTVLLSNPTYQNAVHILESSCNIENIDLQSDGWDMKKLEEILQSKK IHLVYVMTNFQNPTGVSWSLEKKKQLLEFSKKYDFYIIEDDCFADFYYERKMAKPIKAFD KEGRVLYLKTFSKLVMPGIGLAMLIPPKNFVEKFTINKYFIDTTTSGIHQKFLELFIKRG LLEKHLEQLREILGQRMKYMVETLQKIPHLRILHIPKGGFFLWIELANYIDGEKFYYKCR LRGLSILPGFIFYSNTKNSCKIRISIVSSSFDEMQIGCQIIQDILEHCEGVSEMKLP >gi|224461445|gb|ACDD01000057.1| GENE 34 36528 - 37415 754 295 aa, chain + ## HITS:1 COG:FN0354 KEGG:ns NR:ns ## COG: FN0354 COG0697 # Protein_GI_number: 19703696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 70 292 2 224 224 174 46.0 1e-43 MRNYILKKYATRFYAFLAVFFWASAFVSTKIVLQSGQLSAMDLGTLRYFLAAILLLPLAI LFKVRLPDSRDLSKFAISGILGYTAYMFFFNTASTMITPSTASVINAICPGVTAIFAYFL FYEKISWKGILGLAISFTGILFLSLWNGSFSLNIGVLYMLAAALCLSMYNISQRSFVKRY NAMETMTYCLLAGSLFLLLCHGKSLTLIPSLSKQMWIHLLYLAIFPSILSYYCWAKAMEC CNKTTEVTNFMFVTPMLATFLSFLMIKEFPTWSTYFGGALILLGMLLFQIEKTKK >gi|224461445|gb|ACDD01000057.1| GENE 35 37444 - 38724 1336 426 aa, chain - ## HITS:1 COG:FN0185 KEGG:ns NR:ns 
## COG: FN0185 COG3593 # Protein_GI_number: 19703530 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Fusobacterium nucleatum # 39 426 1 397 400 462 61.0 1e-130 MKFNKIQVKNWGNFVDISLDCEDFLIFTGASDTGKSSLMKAILSFFRVRNLREGDIRDSK FPLEMIGNFIEKTGEFQLKFLKKNAEEIRYFVRYSAEWQEISEKEFQDFIQPISVFYIPS VLEESQMDYLFERVFQNEKLKAYHRFWEEYQEARKNRKSHGFYRHLFLRFLCEIATHEEK NNFWEHSILLWEEPEFYLNPQEERACYEKLLEHSRLGLQIIVSTNSSRFIDLEQYQSICI FRKKEEETRVYQYRGNLFSGDEVTEFNMNYWINPDRSELFFARKVILVEGQTDKIVLSYL AKKLGVYSYDTSIIECGSKSTIPQFIRLLNAFKIPYVVVYDKDNHLWRNPTEIFNSNQKN RSIQKMIYKKFGSYVEFENDIEEEIYNEDRERKNYKNKPFYALETVMQENYKIPKKLREK VYRIFE >gi|224461445|gb|ACDD01000057.1| GENE 36 38711 - 38902 129 63 aa, chain - ## HITS:1 COG:no KEGG:Moth_1713 NR:ns ## KEGG: Moth_1713 # Name: not_defined # Def: sigma-54 dependent trancsriptional regulator # Organism: M.thermoacetica # Pathway: not_defined # 1 63 1 66 748 61 43.0 8e-09 MKRKAKCIESNCIMCRACFTNCPVKAIDRKININRELCIGCGTCMKVCQHGAMILEEVED EIQ >gi|224461445|gb|ACDD01000057.1| GENE 37 38904 - 39875 1401 323 aa, chain - ## HITS:1 COG:BS_yvcK KEGG:ns NR:ns ## COG: BS_yvcK COG0391 # Protein_GI_number: 16080529 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 315 15 329 331 324 51.0 2e-88 MRKQPSIVVLGGGSGISVLLRGLKHLPVDITTIVTVADSGGSSGVLRKEFSCLPPGDFRN VIAALSDVEPLMEEVFQYRFQKDTFLGGHPLGNLIIMAMTELTGNLQESIDSLRKLFNIK AQILPASLDNVTLAAKKIDGSIVEGENEIPRTNQKIQEVFYTTQVSPIPKTLEIIKKADL IILGMGSLYTSLIPHLLVEGISESIAKSKAKKIYICNAMEQPGETEQYTVSDHVKAIYQH SQEGLIDTILVDSHSIPKREMKRYEEAGVSRVEIDFSKLQELGLEVIDRNMIEVDKKGMI RHHPYRLAAVIYSLIDHWERFYD >gi|224461445|gb|ACDD01000057.1| GENE 38 39897 - 40787 1123 296 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: 
Fusobacterium nucleatum # 10 296 6 294 294 412 80.0 1e-115 MFIFSIFPYFFIFLLILLFISKGIKIVPESNVYIVEKLGKYHQSLSSGLNFINPFFDRIS RVVSLKEQVVDFPPQPVITKDNATMQIDTVVYFQITDPKSYTYGVERPLSAIENLTATTL RNIIGDMTVDQTLTSRDIINTKMRVELDEATDPWGIKVNRVELKSILPPEDIRVAMEKEM KAEREKRATVLEAQAKRESAILVAEGEKQSTILRAEAAKESEIQEALGKAQAILEIRKAE AEGIRLLNEAKITKEVLSLKSFESLEKVAEGQATKIIIPSELQNLSSFVTAIREMK >gi|224461445|gb|ACDD01000057.1| GENE 39 40791 - 41216 479 141 aa, chain - ## HITS:1 COG:FN1548 KEGG:ns NR:ns ## COG: FN1548 COG1585 # Protein_GI_number: 19704880 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Fusobacterium nucleatum # 1 138 1 138 138 89 40.0 2e-18 MGIVFWIILACIFAGLEIIIPALITIWFAFAALLLVMLSFFNFFILSPFMEWKFFIFVSV ILLLLTRPFSKKYFQNQKEEFRGDWVGKELVIEKVIREGYYEAKFKGSIWTLLSEDSLEV GDIVKIVSYEGNRIIVKKKEA >gi|224461445|gb|ACDD01000057.1| GENE 40 41227 - 42165 1211 312 aa, chain - ## HITS:1 COG:SP1359_1 KEGG:ns NR:ns ## COG: SP1359_1 COG0225 # Protein_GI_number: 15901213 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Streptococcus pneumoniae TIGR4 # 1 162 1 162 163 207 62.0 3e-53 MKEIYLAGGCFWGVEGYFRRIDGIEDVKVGYANGKTEEANYQNLKITEHAETVKIIYREQ EIDLETILEHYFRIIDPTSLDQQGHDKGRQYRTGIYYTDEKDLPVIQEFYQSVERLYQER LMVEVEKLQHFILAEDYHQDYLGKNPNGYCHIPLHLAFEPLVKIQSYVKKSKVELEKDLT ELQYLVTQKAATELAYENEYWSQAEEGIYVDITTGEPLFSSKDKFDSGCGWPSFSKAFSS GVLRYYHDESHGMKRIEVKSRIGDAHLGHVFEDGPKKLGGLRYCINSASLRFIPLEKMEE EGYGEYVKYIAK >gi|224461445|gb|ACDD01000057.1| GENE 41 42258 - 42449 192 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452839|ref|ZP_05618138.1| ## NR: gi|257452839|ref|ZP_05618138.1| hypothetical protein F3_07219 [Fusobacterium sp. 
3_1_5R] # 1 63 1 63 63 114 100.0 2e-24 METKSLKSFFYITEEGDICLDSCIPSLAEEFNPEMIYTVLNIMLEHVMECYPTFMRKVWN EEK >gi|224461445|gb|ACDD01000057.1| GENE 42 42487 - 42762 340 91 aa, chain - ## HITS:1 COG:FN0357 KEGG:ns NR:ns ## COG: FN0357 COG0355 # Protein_GI_number: 19703699 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Fusobacterium nucleatum # 4 88 3 89 134 82 51.0 1e-16 MATSFMVKVVTPTKVVLEQEADFLLVRTTEGDMGILGNHFPLVAALADGQMKIRKDKREK FFRVEGGFIEISNNQVTILSNQAYPQEERVI >gi|224461445|gb|ACDD01000057.1| GENE 43 42765 - 44162 1914 465 aa, chain - ## HITS:1 COG:FN0358 KEGG:ns NR:ns ## COG: FN0358 COG0055 # Protein_GI_number: 19703700 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Fusobacterium nucleatum # 1 462 1 462 462 790 88.0 0 MNKGKITQIISAVVDVEFKDELPKIYNALKVQVGEKELVLEVQQHLGNNVVRTVAMDSTD GLLRGMEVMDTGAPITVPVGKAVLGRILNVLGEPVDQKGPVETEEYLPIHREAPKFEEQE TVTEIFETGIKVIDLLAPYIKGGKTGLFGGAGVGKTVLIMELINNIAKGHGGISVFAGVG ERTREGRDLYNEMTESGVLNKTSLVYGQMNEPPGARLRVALTGLTVAENFRDKEGQDVLL FIDNIFRFTQAGSEVSALLGRIPSAVGYQPNLATEMGTLQERITSTKSGSITSVQAVYVP ADDLTDPAPATTFSHLDATTVLSRDIASLGIYPAVDPLDSTSKALSPDIVGKEHYEVARE VQRVLQRYTELQDIIAILGMDELGDEDKLVVSRARKIQRFFSQPFAVAEQFTGMEGKYVS IKDTIRGFKEILEGKHDELPEQAFLYVGTIEEAVLKGRDLMKGAE >gi|224461445|gb|ACDD01000057.1| GENE 44 44186 - 45034 1026 282 aa, chain - ## HITS:1 COG:FN0359 KEGG:ns NR:ns ## COG: FN0359 COG0224 # Protein_GI_number: 19703701 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Fusobacterium nucleatum # 1 282 1 282 282 317 60.0 2e-86 MASSKKIKTRIKSIQSTHQITKAMEIVSTTKFRRYSLLAKESQAFSDSIQKILTNISMGV KAEKHPLFDGRERVRNIGVIVVTSDRGLCGSFNSSTLKELEKFRKKHDDQHIFIIPVGKK GRDYCEKRGYNVIQDYVGVDNYNMLTITEEISKVIVDRYRQEKLDEVYIIYNKFISALRS DLTLSKVIPITRLEGEENRGYIFEPSAEEVLSSLLPRYIGVTVYQAVLNNTASEHSARKN AMKNANENAEDMIRQLDLKYNRERQAAITQEITEIVGGAEAL 
>gi|224461445|gb|ACDD01000057.1| GENE 45 45049 - 46551 2057 500 aa, chain - ## HITS:1 COG:FN0360 KEGG:ns NR:ns ## COG: FN0360 COG0056 # Protein_GI_number: 19703702 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Fusobacterium nucleatum # 1 499 1 499 500 801 83.0 0 MKIRPEEVSEIIKKEIENYKKSLDVKTSGTVLEVGDGIARIYGLSSVMSNELLEFPNGVM GMALNLEENNVGAVILGNASLIKEGDGVKATGRVVSVPAGEGMLGRVVNALGEAVDGKGE IRPSKYMPVERKASGIISRQPVFEPLQTGLKSIDGMVPIGRGQRELIIGDRQTGKTAIAL DAIINQKGNGVKCIYVAIGQKRSTIAQIFQKLEDAGAMEYTTIVAATASEAAPLQYLAPY SGVAMGEYFMDKGEHVLIIYDDLSKHAVAYREMSLLLRRPPGREAYPGDVFYLHSRLLER AAKLSPELGGGSITALPIIETQAGDVSAYIPTNVISITDGQIFLETQLFNSGFRPAINAG ISVSRVGGAAQIKAMKQVASKVKLELAQYNELLTFAQFGSDLDKATKAQLDRGNRIMEVL KQAQYRPYPVEEQVVSFFGVTNGYLDSIPVERVKAFEEELLGKLRASSTILDRIREEKAL SKELEAELRAFIESFKKTFE >gi|224461445|gb|ACDD01000057.1| GENE 46 46570 - 47103 720 177 aa, chain - ## HITS:1 COG:FN0361 KEGG:ns NR:ns ## COG: FN0361 COG0712 # Protein_GI_number: 19703703 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Fusobacterium nucleatum # 1 168 1 168 174 136 57.0 2e-32 MIENQVGRRYAEAIYTIAEERGKVKETHTFLNSIMELYKNDITFRNFIQHPLLKVQEKEE VLREIFAEVSDELLQIAFYILEKGRISFIRNIVAEYLKIYYEKHQILDVVATFAVELSEE QKTKLIQKLKDKTKHEIRLETQVDESILGGGILKIGDQVMDGSLRKELQQIKNGKKS >gi|224461445|gb|ACDD01000057.1| GENE 47 47100 - 47606 647 168 aa, chain - ## HITS:1 COG:FN0362 KEGG:ns NR:ns ## COG: FN0362 COG0711 # Protein_GI_number: 19703704 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Fusobacterium nucleatum # 6 167 1 162 163 112 45.0 4e-25 METTTMPVISIDVNLFWQIINFFILVFVFNKYFKTPIQRILTERKKKITSELHSATLSKE EAKVSAKQAETALKEARDEAHEILKKAEYRAEEVRNEILADARLQKERMLREASEEVMRL KAKARRDLHQEVTSLAVELAEKLMKKNIDKQTATDLIDDFIERVGDEA >gi|224461445|gb|ACDD01000057.1| GENE 48 47646 - 47921 673 91 aa, chain - ## HITS:1 COG:FN0363 KEGG:ns 
NR:ns ## COG: FN0363 COG0636 # Protein_GI_number: 19703705 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 5 91 3 89 89 101 83.0 3e-22 MMEGMLMAKAIVLAGSGIGVGLAMIAGLGPGIGEGYAAGKAVEAVARQPEARGNIISTMI LGQAVAESTGIYSLVIALILLYANPLINMLG >gi|224461445|gb|ACDD01000057.1| GENE 49 47947 - 48753 975 268 aa, chain - ## HITS:1 COG:FN0364 KEGG:ns NR:ns ## COG: FN0364 COG0356 # Protein_GI_number: 19703706 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Fusobacterium nucleatum # 51 268 1 218 218 259 68.0 5e-69 MSFQALQFVTPALVEGPKVVFFIPLPSSLQHLPFVMQYGQGHYGWPVSITVVTTWFLILM LFLFFKLCTKKLEIVPGKPQILLESIYEFLDNLMEQMLGAWKAKYFAFLGSLFLFIFPAN IISFFPIPWARFTGGTFSIEPAFRAPTADLNTTIGLALLTTIIFIATSIKQNGVWGYLKG FFSPLPIMAPLNVVGELAKPLNISVRLFGNMFAGSVIMGLLYKACPWVIPAPLHLYFDLF SGLVQSFVFVTLSMVYIQSSLGDAEYLD >gi|224461445|gb|ACDD01000057.1| GENE 50 48757 - 49131 247 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452848|ref|ZP_05618147.1| ## NR: gi|257452848|ref|ZP_05618147.1| hypothetical protein F3_07264 [Fusobacterium sp. 3_1_5R] # 1 124 1 124 124 215 100.0 8e-55 MEDIKTIFKHAGISAILVFLYGLLIWNFYVLIGTFSACLVSILSFYSLCEDVKTQVFLKD DSRRRAFLRYLKRYVLSGVYLAVLGYFWGLPMILSAAVGLLNIKLNIYLLPIFKKLKNYS RKEE >gi|224461445|gb|ACDD01000057.1| GENE 51 49106 - 49360 255 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452849|ref|ZP_05618148.1| ## NR: gi|257452849|ref|ZP_05618148.1| hypothetical protein F3_07269 [Fusobacterium sp. 
3_1_5R] # 1 84 1 84 84 122 100.0 8e-27 MSGFFNRDFFYYLSLFSQLGITMVGNIAVSLFLYLIFAKYVFRHPLILFLFLLLGIVSGY YQVYKLITQKKERGKKGGRHQDDF >gi|224461445|gb|ACDD01000057.1| GENE 52 49360 - 50076 1221 238 aa, chain - ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 1 218 2 219 247 113 29.0 4e-25 MYKHFSKIYDNFMQYADYTKWKEEIKKLILLGQPKGKELLELGCGTGELLKRFEKDYHCH GLDISEHMLKVAQEKLAAQKIPLYLGDMVDFDTGDRYDIIIAIFDTVNHIVDMIDLKRHF RTVFANLKPGGVYIFDIVDRAFMDEMFPNDVFVDVRDDLTVIWEHELEDGIDYIDATYFT HLVGSRYRRVEETYAKKIYHRRELEHAIRRSNLKIQKVVTSTGIAGNRYMYLLKKEEL >gi|224461445|gb|ACDD01000057.1| GENE 53 50106 - 51464 2091 452 aa, chain - ## HITS:1 COG:FN0366 KEGG:ns NR:ns ## COG: FN0366 COG1109 # Protein_GI_number: 19703708 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Fusobacterium nucleatum # 1 451 1 451 452 612 69.0 1e-175 MRKYFGTDGIRGEANRELTVDIALRLGYALGYYLKKKSTEKKKIKVILGSDTRISGYMLR SALTAGLTSMGVQVDFVGVLPTPAVAYITKTKKADAGVMISASHNPAKDNGLKVFGSTGY KLPDEVEEEIEYFMDHIEEISTEVLAGDEVGKFKYAEDEYYLYRNYLLSSVKGDFQGIKL IIDAANGSAYRVAKDVFLELGAEVIVINDTPNGKNINVKCGSTHPEILSKVVVGYEADLG LAYDGDADRLIAVDKSGKIVDGDKVIAILSVLMKQRGELHQNGVVTTVMSNMGLENYLKS QGISLVRASVGDRYVLEKMLANGINIGGEQSGHIILSDYATTGDGVLTSLKLVEAIRDAK KDLHEMIREIKDWPQVLINVTVDNAKKNSWKEFPVLTSFIAKMEEEMGENGRVLVRTSGT EPLIRVMVEGREETQVQEIAEKIAEVVRTELA >gi|224461445|gb|ACDD01000057.1| GENE 54 51461 - 52003 433 180 aa, chain - ## HITS:1 COG:FN0367 KEGG:ns NR:ns ## COG: FN0367 COG4769 # Protein_GI_number: 19703709 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 4 167 5 168 172 152 61.0 5e-37 MEVKKKREVYIAAFVLLALYLSLLESLIPKPFPWMKFGFSNIIILVILEKWDKKMAFEVL LLRIFIQALMLGTMFSPGFLVSLCSGFLSLCLTTMLYRVRKYLSLLSISCLSAMFHNAIQ 
LVVVYFLLFRNISLQSKSIMIFIFAFLLLGVISGLITGILVSKLALRIPRSKDKKETEVV >gi|224461445|gb|ACDD01000057.1| GENE 55 52006 - 53439 2037 477 aa, chain - ## HITS:1 COG:FN0368 KEGG:ns NR:ns ## COG: FN0368 COG0015 # Protein_GI_number: 19703710 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Fusobacterium nucleatum # 1 477 1 477 477 764 82.0 0 METKIYSNPLAERYSSKEMLEVFSPDFKFSTWRKLWVALAESEKELGLEIQDEQIQQMKE NIYNIDYALASQKEKEFRHDVMAHVHTFGTQAPLAMPIIHLGATSAFVGDNTDLIQIREA LLLTKQKMVNVMAELSKFAKENRALPTLGFTHFQAAQLTTVGKRACLWLQSLMLDLEELE FRNSTLRFRGVKGTTGTQASFKDLFEGDFQKVRELDEKVTEKMGFDKRFLVTGQTYDRKV DSEVMNLLANIAQTAHKFTNDLRLLQHLKEIEEPFEKNQIGSSAMAYKRNPMRSERISSL AKFVIALQQSTAMTAATQWFERTLDDSANKRLSLPQAFLAVDAILIIWKNIMDGLVVYPK MIEKRIMSELPFMATEYIIMECVKQGGDRQELHERIRQHSMEAGKQVKVEGKENDLIDRI LADDYFKLDKERLLEILDPKSFTGFAAEQVLDFLELEIQPILEKNKDQLGMNSELRV >gi|224461445|gb|ACDD01000057.1| GENE 56 53453 - 53863 198 136 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228002792|ref|ZP_04049785.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Anaerococcus prevotii DSM 20548] # 1 136 1 143 146 80 32 1e-14 MLRRLEQEDIDFLYALEQTNFPTSYYSKSQLLEMLSDEAYSIYGIERDKKLIAYVIFFNS IDCQELMKIAVSQEYRRQGLATKLLEVEKRRPILLEVRESNLGAQEFYKQHGFEKIYVRK QYYRDNGENAVILEKK >gi|224461445|gb|ACDD01000057.1| GENE 57 53864 - 54811 1082 315 aa, chain - ## HITS:1 COG:FN0370 KEGG:ns NR:ns ## COG: FN0370 COG0681 # Protein_GI_number: 19703712 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Fusobacterium nucleatum # 10 295 8 284 286 270 47.0 2e-72 MRNHILWNVIIYVIVTSFFLYIWWKQKKLAGIIEQYRIRFGNWIIEKFHVQAEAAKKAIQ RFIDVTEALVTALVLVLVLQHFYVGNFKIPTPSMVPTIEIGDRVLANMVVYRFTSPKKED VIVFKEPIEDSKNYTKRVIALPGETIKIEGNAVYTDNQKNEKRSYSILPSTSDIPRSLME GEEWKVPKKGDHITVVPSTNYKQLFVENGLNPNEIQKGIMENAALAFMFMPNLQFYINGE PTGPILDFLHDNSSLNHLMAGEVVEQDLDQDYYFVLGDNTDHSADSRIWGFVKKERITGK VLFRFWPLNRVGFVK >gi|224461445|gb|ACDD01000057.1| GENE 58 54932 - 55282 552 116 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|237739925|ref|ZP_04570406.1| LSU ribosomal protein L19P [Fusobacterium sp. 2_1_31] # 1 116 1 116 116 217 93 1e-55 MKEKLIQLVEKDYLRTDIPQFKAGDTIGVYYKVKEGNKERVQLFEGVVIRVNGGGIAKTF TVRKVTAGIGVERIIPINSPMIDKIEVLKVGRVRRSKLYYLRGLSAKKARIKEIIK >gi|224461445|gb|ACDD01000057.1| GENE 59 55457 - 55912 199 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257452857|ref|ZP_05618156.1| ## NR: gi|257452857|ref|ZP_05618156.1| hypothetical protein F3_07309 [Fusobacterium sp. 3_1_5R] # 1 151 1 151 151 257 100.0 2e-67 MKKITFLFFIFLLLPSKIFAFSFDTEVKKYYDIPKIQKNFPTSKVRRQNASYDAITIENN FQGYTIILAHFDTPIQAKSFFYQTIQDAAKQNLKLFLSENGYTTLLDIDRGIIYGILTEE ENCISVRFTNIDTISEILPILDKNIDSWKMI Prediction of potential genes in microbial genomes Time: Fri May 20 02:11:12 2011 Seq name: gi|224461444|gb|ACDD01000058.1| Fusobacterium sp. 3_1_5R cont1.58, whole genome shotgun sequence Length of sequence - 14158 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 6, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 9.2 1 1 Op 1 32/0.000 + CDS 254 - 1255 866 ## COG1135 ABC-type metal ion transport system, ATPase component 2 1 Op 2 22/0.000 + CDS 1245 - 1895 816 ## COG2011 ABC-type metal ion transport system, permease component 3 1 Op 3 . + CDS 1919 - 2695 937 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Term 2710 - 2757 2.0 4 2 Op 1 . - CDS 2652 - 2753 148 ## 5 2 Op 2 . - CDS 2750 - 3022 300 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 6 2 Op 3 . - CDS 3019 - 3249 452 ## gi|257452862|ref|ZP_05618161.1| hypothetical protein F3_07334 - Prom 3279 - 3338 9.8 + Prom 3235 - 3294 7.7 7 3 Tu 1 . 
+ CDS 3351 - 3857 252 ## Dole_0250 putative transcriptional regulator - Term 3661 - 3701 4.2 8 4 Op 1 42/0.000 - CDS 3743 - 4561 914 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 9 4 Op 2 25/0.000 - CDS 4548 - 5195 244 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 10 4 Op 3 . - CDS 5205 - 6164 1531 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 11 4 Op 4 . - CDS 6233 - 6922 740 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 12 4 Op 5 . - CDS 6919 - 7929 811 ## COG1533 DNA repair photolyase 13 4 Op 6 . - CDS 7910 - 8728 734 ## Dtur_0464 hypothetical protein - Prom 8827 - 8886 7.5 + Prom 8766 - 8825 8.0 14 5 Op 1 . + CDS 8858 - 9814 1396 ## COG0039 Malate/lactate dehydrogenases 15 5 Op 2 . + CDS 9801 - 10556 761 ## COG0101 Pseudouridylate synthase + Term 10564 - 10599 2.1 - Term 10733 - 10787 4.8 16 6 Tu 1 . - CDS 10791 - 14075 3616 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 14098 - 14157 10.2 Predicted protein(s) >gi|224461444|gb|ACDD01000058.1| GENE 1 254 - 1255 866 333 aa, chain + ## HITS:1 COG:FN0660 KEGG:ns NR:ns ## COG: FN0660 COG1135 # Protein_GI_number: 19703995 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 332 1 334 335 442 70.0 1e-124 MIQLKQVNKIYNNGFHAVKDINLEIQKGDIFGIIGLSGAGKSSLIRMLNRLEETSSGEIW MDGVNINSLSKDQLLKKRKKIGMIFQHFNLLSSRTVAENIAFSLEIANWKQEDIQRRVKE LLELVELSEKANYYPSQLSGGQKQRVAIARALANKPDILLSDEATSALDPKTTKSILDLL REIQKKFSLTVVMITHQMEVVREICNKVAVMSEGKIVEQGGVHHIFSNPTSEITKELISY VPEKKEQNFTRKKGHMLLKLNFLGSISEEPIISNIIRTCAIDISIISGKIDTLATMNVGH LYVELSGNLEAQEKAIAAFQEADVKVEVIYNAL >gi|224461444|gb|ACDD01000058.1| GENE 2 1245 - 1895 816 216 aa, chain + ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: 
ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 2 216 18 232 233 250 68.0 1e-66 MLFNMLWTSTLETLYMVFFSTAFALLLGFPFGILLVITKENGLWEHLKFHQVLETSINIL RSFPFIILMIVLFPLSRVITGTTIGSTAAIVPLAIGTAPFVARMIEGALLEVDSGLIEAS ESMGASNWTIIRKVMIPEATSSLINGITITIISLIGYSSMAGAIGAGGLGDLAIRYGYQR FQIDLMCYAIVILLIIVQATQWIGNWFINQRKKKLG >gi|224461444|gb|ACDD01000058.1| GENE 3 1919 - 2695 937 258 aa, chain + ## HITS:1 COG:FN0658 KEGG:ns NR:ns ## COG: FN0658 COG1464 # Protein_GI_number: 19703993 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Fusobacterium nucleatum # 7 258 9 261 261 338 70.0 7e-93 MLKKVFTIGSFVVLSSLALAGTLKVGASPVPHAEILNFVKADLKKQGVDLKVVEFTDYVT PNLALSDGELDANFFQHIPYLQKFASERKLKLTSVGKIHVEPIGLYSKKAASLKNLKKGA TIAIPNDPSNGGRALILLHNKKLLVLKDPKNLYATEFDIVKNPNNFKFKAVETAQLPRVL ADVDAALINGNYALESGLNPTKDALLLEGKESPYANVIAVKVGKEKNADIQKLVKTLQNP KVKEFIDKQYRGGVVAAF >gi|224461444|gb|ACDD01000058.1| GENE 4 2652 - 2753 148 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKGLIPCIILGISPFLLEFRKQQRHRLCIVYR >gi|224461444|gb|ACDD01000058.1| GENE 5 2750 - 3022 300 90 aa, chain - ## HITS:1 COG:FN0211 KEGG:ns NR:ns ## COG: FN0211 COG2026 # Protein_GI_number: 19703556 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Fusobacterium nucleatum # 1 87 1 87 88 102 57.0 2e-22 MRYQVEFSQQGKKELKKLDAFAQKIIMKWISKNLINTENPRIHGKELKGNLKSFWRYRVG NYRLLADIQDENITILLIKIGHRREIYDKK >gi|224461444|gb|ACDD01000058.1| GENE 6 3019 - 3249 452 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452862|ref|ZP_05618161.1| ## NR: gi|257452862|ref|ZP_05618161.1| hypothetical protein F3_07334 [Fusobacterium sp. 
3_1_5R] # 1 76 1 76 76 105 100.0 1e-21 MSVISLRLNEKEEKLLKEFSEFEGLGISSYIKKIIYERLEDEYDIQCFDKAYEEYLESGK KSYSFDEVLNELGIEL >gi|224461444|gb|ACDD01000058.1| GENE 7 3351 - 3857 252 168 aa, chain + ## HITS:1 COG:no KEGG:Dole_0250 NR:ns ## KEGG: Dole_0250 # Name: not_defined # Def: putative transcriptional regulator # Organism: D.oleovorans # Pathway: Homologous recombination [PATH:dol03440] # 11 135 235 370 383 82 33.0 7e-15 MSNICLTTIESKPFNPNIANGFFRAGFIETWGRGIEKICEACSNYGIKIPEYTVYPEDIT LKFEALNTAKNAASKIDDNFYLVFDYLNQFPTTKHKNIMEDLNISRRTLERIISLLKEQS YIERIGNNRSGYWKILKKEELPIILFQITLQKKIPTDNKTIVPPGFNP >gi|224461444|gb|ACDD01000058.1| GENE 8 3743 - 4561 914 272 aa, chain - ## HITS:1 COG:CAC2878 KEGG:ns NR:ns ## COG: CAC2878 COG1108 # Protein_GI_number: 15896132 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Clostridium acetobutylicum # 1 249 4 252 268 159 38.0 4e-39 MLQYEFMQKAFFVGLLLSIIVPCIGSFIILKRLSMLGDALSHASLSGVAFGLLLAWNPLV GAFLACVIAGLGTEYLRKKIPQYSEISIAVITSLGVGFAGVLSSFIKNATSFHSFLFGSI VAISTLEVIMITCVSVLVLFLFLFFYKELFYIAFDEEGARVAGVPVNRINFIVAIITAIT VSIASRTVGALMISSFMVLPMAAAMQVARSYKTTILFAIFYAVCSTLLGLTLSYYYGLKP GGTIVLLSVGIFFCNVIWKSMIGSSSFFNIFQ >gi|224461444|gb|ACDD01000058.1| GENE 9 4548 - 5195 244 215 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 21 209 21 212 311 98 32 2e-20 MKTLIQVEQGYFHYPKQEKLLENINFHIQEGEFTAIIGANGAGKTTLLKLLLEQLSFQRG KVIRKYRQISYVSQAQDKLQESFPATVLEVVLLNLRQEIGYFHFTKEKHREKARKALKMV GMERYEKHLLKELSGGQRQRVMIAKALVQEPELLILDEPTTGLDKKSVEDLFDTLTNLNH EKGMAILMISHDLFRVRTWCEHIYLLEEGELYASV >gi|224461444|gb|ACDD01000058.1| GENE 10 5205 - 6164 1531 319 aa, chain - ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: 
Listeria innocua # 1 315 1 309 312 258 45.0 2e-68 MKKRLWALLLAMLCLCIALIGCGKKKEEVVSDKIKVVTSNYPMYDFTKRIAGDTLEVVNL VPPGTEPHDWEPSVQDIAQLEEAKAFIYNGAGMETWVEKVLESLNNKELLVVEASQRVDL LKAEEHEEEHHHEHESHEEHHHHHGEWDPHVWLSLRAAQVEMENIKNLLVEVNPEQKEVY EENYQKAIKEFQSLDEEYKTALESFKGKEIVVAHEAFAYLCRDYDLHQLGIEGVFADSEP SPAKMKEIIDFVKEHQVKVIFFETLASPKVAEAIAKETGASTDMLNPLEGLTEEEIAAGK DYLSMMRENLESLKKAFVE >gi|224461444|gb|ACDD01000058.1| GENE 11 6233 - 6922 740 229 aa, chain - ## HITS:1 COG:XF0145 KEGG:ns NR:ns ## COG: XF0145 COG4221 # Protein_GI_number: 15836750 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Xylella fastidiosa 9a5c # 3 218 5 234 251 106 29.0 4e-23 MKVLVTGATSGIGKAVVERLLQEGNQVTGVGRDFQKYPIEHVCYQKYSCDFRKMAEMEKL LRMVAREEWDVVILVAGLGYFAPHEEIHFSKIQEMVQVNLSSAMMIVQATLRNLKKKRGQ IIFVSSVTATKASPMGAAYSATKAGISHFATSLWEEVRKYGVRVSVIEPDMTKTDFYEHN SFDVGEEADNYLLAEEVVDAIFFLLSQRQGMNIRRIEVQPQRHKITRKG >gi|224461444|gb|ACDD01000058.1| GENE 12 6919 - 7929 811 336 aa, chain - ## HITS:1 COG:FN0898_2 KEGG:ns NR:ns ## COG: FN0898_2 COG1533 # Protein_GI_number: 19704233 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Fusobacterium nucleatum # 4 332 2 330 330 309 49.0 5e-84 MKSWNSNFSHIYVEKEVMNYERTKKIIEKFPKAVVIVIERYQDVFHPVGQEFSYQKQSQK LILAKKQDNFLYKGAKVCESFQNHHFYYTSFFLNCIYDCDYCYLQGVYSSANLVIFVNLE DFLEEVKQLLETTKELYLCISYDTDLLAFEGITSFVEEWYDFSLEHPSLKIELRTKSAKV LDFSKKKYNPNFILAWTLSPESTSQIFEKKTPNLEHRLEAIQKWQRQGFITRLCFDPIFW KKDFQEEYRNFLKKCFSKLDQEKILDISVGTFRVSKEYLKKMRKQNPNSLLLAYPFVCEE GVYSYPREIQQKMFSFVEEELLQYIEKEKLFIGGKI >gi|224461444|gb|ACDD01000058.1| GENE 13 7910 - 8728 734 272 aa, chain - ## HITS:1 COG:no KEGG:Dtur_0464 NR:ns ## KEGG: Dtur_0464 # Name: not_defined # Def: hypothetical protein # Organism: D.turgidum # Pathway: not_defined # 1 267 1 273 277 154 35.0 3e-36 MVYLFFALYGEAKPFIEKWKLKKQNQYTKYQVFERESFCCVVTGVGSMKMAIHTTHFLSS RNLQEEDIFCNVGIAGTKSSHFTKGELYFIHKIHSKESGRDFYPELLYRQKYQEASLETF 
SKVVEKEEEIQEDLVDMEGAAFFETLHFFAKKKQIFLWKCVSDFLEGEKVNPEELLKKHC EGLATFLEQFIGRKNEELEFLKRERRDLEEKLWKHLFCSETMRIQGKDLLHYAELSEKNV EKMIQKYLRKEVKTKTEGKKYFEDLRNEILEF >gi|224461444|gb|ACDD01000058.1| GENE 14 8858 - 9814 1396 318 aa, chain + ## HITS:1 COG:FN1169 KEGG:ns NR:ns ## COG: FN1169 COG0039 # Protein_GI_number: 19704504 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Fusobacterium nucleatum # 1 317 1 317 318 570 86.0 1e-163 MLQTKKVGIVGIGHVGSHCALAMLLQGVCDEMVLMDILPEKAKGYAIDCMDTVSFLPHRT IIKDGGIKELSEMDVIVISVGSLTKNNQRLEELKGSMEAIKSFVPDVVKAGFNGIFVVIT NPVDIVTYFVRQLSGFPKHRVIGTGTGLDSARLRRILSETTNIDSHVIQAFMLGEHGDTQ VANYSSATIHGVPFLDYVKTHPEQFKDVDLLDLEKQVVRTAWDIIAGKGSTEFGIGCTCA NLVKAIFHNERRVLPCSAYLEGEYGQSGFYTGVPAIIGNNGVEEILELPLNEREEKRFKE ACEVMKKYIEIGNSYGIC >gi|224461444|gb|ACDD01000058.1| GENE 15 9801 - 10556 761 251 aa, chain + ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 5 251 3 244 244 174 39.0 2e-43 MEFVNLRFSIEYEGTRYLGWQRLGEKQREKTIQGKIEQVLARLFALNPEEVSVIASGRTD AGVHAKEQIANVHLPSGKSPQEIEEYCNQYLPEDIRIFHAHFVEELFHSRFHAKTKEYHY EISLQKPSVFHRNVTWYCPMQLDIEKMKESSQYFLGEHDFIAFSSLKKTKKSTIRRIDRI EIQETESGLLFRFVGNGFLQNMIRILVGTLIEVGEGKKTKEDIISIFQSKTRQKAGFLAP AKGLTLYKVYY >gi|224461444|gb|ACDD01000058.1| GENE 16 10791 - 14075 3616 1094 aa, chain - ## HITS:1 COG:FN0499 KEGG:ns NR:ns ## COG: FN0499 COG1629 # Protein_GI_number: 19703834 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 9 391 2 379 743 187 33.0 8e-47 MKKGYLGFLFLISSLSSFAVEQEVKLAPTTVDGRSSYNGSVTENEIKNIVVITKEEIQKK QHKDLLSVFEDSPMTMVTHTQAGPLIALRGSGEKTVMRVKVLLDGTSINTVDDSMGVIPF NAIPVSSVEKIEIIPGGGITLYGSGSSSGVINIITKSGKMKDYGVVNVTSSSFNTYNVNM SKGLKLGKNIFANVAVEAEKGKGYRQKEEHKKYNVLGGFHFRLNDKNSIRIYGSQYKNDE 
DNTNELSIYDLKRNRRKAGDTLSKVKSDRHTFGVDYQYNPSEKLHFTANYNSSKFSRDIT QDARPSLTFLPSIDFFDNAFADSDSRIDLVLRNVSQRLEGRFEENIDNARGKLDYSYANN KGKFTFGYDYTSHHLKRVSTTVSAPYNEYRDIGLLIHKKHDRAFSEERLKENPDMIIGYS TIAADSMYNNPEDYFLNGKIGEKGIEDFYIKKNKFSLEDKLAKKLYKYATPEMIADYEKK KGTAEEKGVTSLVMEIFKSGVDPMPMMIDINRWYQTDFRKEKVFSLIDQNKLVEKDGKKG VYVKNPASKENTFFEINENTTFEDFARIHETLNSPPITSSTFVSSRIDTKKTTDSFYLHN DYSLTDNFDIGLGLRYEKSKYSGTRKTLTNQIIKMNPGVDRKKFADSAIGTLDLYTQTSD VIYTERTDRGQDQPGFMILEKLRRLKELRETGQTIIPMVNLTTQYRKTEENIGGDISFSY KLNDTNRMYVKYERAFNTPLPTQMTNKTFDPIHKVRVYWESGIRTEKMNNFEIGFRGMLR KNISFSAAAFLSDTYDEIISVVKDGNSHQTREWRFINLDKTRRLGLELQSEQTFDKLRLK ESVTYIHPKILANNYKDEVMRIANEQMTHLIDGRRKSIREYFNNSDMKKMKEKERIIAAI DHFYQEEFYQKNIADKEKINHFIEEYVQKNINPFIDNSSDVDNETKKYLKDGIKNNLEND RNYVSIIREQYDYEYSLTNGSFLEEGERIPLAPKVKATFGADYQFTDKLRIGMNTTYIGS YISVEPARVYEIIKTKVPAHFVSDIYGTYHCTEDFSIKFGVNNIFNHQYNLRQDSYTATP APGRTYSAGFSYRF Prediction of potential genes in microbial genomes Time: Fri May 20 02:11:36 2011 Seq name: gi|224461443|gb|ACDD01000059.1| Fusobacterium sp. 3_1_5R cont1.59, whole genome shotgun sequence Length of sequence - 27554 bp Number of predicted genes - 26, with homology - 24 Number of transcription units - 4, operones - 3 average op.length - 8.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 73 - 195 110 ## 2 1 Op 2 9/0.000 - CDS 164 - 856 658 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 3 1 Op 3 . - CDS 853 - 2868 2183 ## COG0147 Anthranilate/para-aminobenzoate synthases component I - Prom 2896 - 2955 5.4 - Term 2933 - 2967 1.2 4 2 Op 1 . - CDS 2980 - 3627 785 ## gi|257452874|ref|ZP_05618173.1| hypothetical protein F3_07394 - Prom 3650 - 3709 11.9 5 2 Op 2 . - CDS 3711 - 3776 65 ## 6 2 Op 3 1/0.000 - CDS 3810 - 4235 855 ## COG0716 Flavodoxins 7 2 Op 4 . 
- CDS 4301 - 5194 794 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 8 2 Op 5 1/0.000 - CDS 5214 - 6515 1753 ## COG2252 Permeases 9 2 Op 6 1/0.000 - CDS 6526 - 7401 936 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 10 2 Op 7 1/0.000 - CDS 7404 - 8510 1115 ## COG0772 Bacterial cell division membrane protein 11 2 Op 8 1/0.000 - CDS 8520 - 8960 866 ## COG0756 dUTPase 12 2 Op 9 1/0.000 - CDS 8957 - 10207 1407 ## COG0612 Predicted Zn-dependent peptidases 13 2 Op 10 22/0.000 - CDS 10220 - 11305 1173 ## COG0795 Predicted permeases 14 2 Op 11 . - CDS 11308 - 12384 959 ## COG0795 Predicted permeases 15 2 Op 12 . - CDS 12371 - 12922 682 ## FN1032 hypothetical protein 16 2 Op 13 . - CDS 12934 - 14115 603 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 17 2 Op 14 . - CDS 14130 - 15194 1255 ## COG0787 Alanine racemase - Term 15207 - 15248 6.0 18 3 Op 1 31/0.000 - CDS 15257 - 16198 1430 ## COG0341 Preprotein translocase subunit SecF 19 3 Op 2 1/0.000 - CDS 16200 - 17435 1734 ## COG0342 Preprotein translocase subunit SecD 20 3 Op 3 9/0.000 - CDS 17449 - 17868 571 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 21 3 Op 4 . - CDS 17880 - 20480 3577 ## COG0013 Alanyl-tRNA synthetase 22 3 Op 5 . - CDS 20500 - 21225 262 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 23 3 Op 6 . - CDS 21241 - 23634 3039 ## FN0694 S-layer protein 24 3 Op 7 1/0.000 - CDS 23609 - 26209 2820 ## COG0249 Mismatch repair ATPase (MutS family) 25 3 Op 8 . - CDS 26212 - 27156 483 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase - Prom 27203 - 27262 5.9 - Term 27216 - 27252 4.9 26 4 Tu 1 . 
- CDS 27271 - 27441 400 ## gi|257452895|ref|ZP_05618194.1| hypothetical protein F3_07499 - Prom 27491 - 27550 9.8 Predicted protein(s) >gi|224461443|gb|ACDD01000059.1| GENE 1 73 - 195 110 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITIEKISCYKYVLNCKQIVYSFFILCLLLLKITLKNDSN >gi|224461443|gb|ACDD01000059.1| GENE 2 164 - 856 658 230 aa, chain - ## HITS:1 COG:FN1729 KEGG:ns NR:ns ## COG: FN1729 COG0115 # Protein_GI_number: 19705050 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Fusobacterium nucleatum # 1 225 1 228 249 169 43.0 5e-42 MKIILDDAFLFGAGVFETIKVEKGRAIFCEEHLKRLHQSLEFFGISQKISEEEVQEYLDK QEEKDFALKIVVSSKNILYLKRENPYLSQNREKGLRLCFSKVLRNSTSAMVYHKTTQYYE NLLEKKKVKECGYDEVLFWNERGELTEGAVSNIFFIKGEKLYTPAVSCGLLAGIMRAKVM ERYTVEEKIIRKEDLETFDACFMTNSLMGMFWVKEIEGVFYDNNRENFML >gi|224461443|gb|ACDD01000059.1| GENE 3 853 - 2868 2183 671 aa, chain - ## HITS:1 COG:alr3443_2 KEGG:ns NR:ns ## COG: alr3443_2 COG0147 # Protein_GI_number: 17230935 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Nostoc sp. 
PCC 7120 # 205 658 35 496 514 403 46.0 1e-112 MRTLLIDNYDSYTYNLYQLLADISEDEVLVIKNDEYSWQEVQDLSFDLVVISPGPGTPTK KEDFGVCEEFIRYCEKPIFAVCLGHQGLYHILGGEVGKAPVAMHGRLSKIYHKERGIFQN LKQGIEVVRYHSLLCKGIVPDCLEVEARTEEGLIMALSHKTRPIWSVQFHPESICTENGR EMLENFFRLGREFYEKEEEFIYEVIDFLGEGEEIFRKLYPKFPKVLWLDSSKVEEGLARF SIFGLSSVEKGHSLTYHVDSGVVKKSWENGKIEEFSESIFDYLQRNQKSWKLKEELPFDF QLGYIGYFGYELKKECVTGNQHSYEYPDAQFRYVDRAVVLDHLEKKLYLLSEGREKVWIE EVKEILQSAESYQEREHFSDYPRVAFVENKKQYLENIRKSQELISQGESYEICLTNRLDI FAKIHPVDYYLLLRKVSPGPYSAFLPYQDISIASSSMEKFLTIDRNRIVETKPIKGTIRR GNTAEEDEKLKRSLVEEEKNKSENLMIVDLLRNDLGKVSEIASVKVPKLMAVETYTTLHQ LVSTITGKVASQYDSIDVIKASFPGGSMTGAPKKRTLEIIDHLEKVPRGVYSGSIGFLAN NGTADFNIVIRTAIIEKEKVSLGVGGAIIALSNAEEEFEEILLKAKGVLRAFQLYFKGNT EEEIEIEASIE >gi|224461443|gb|ACDD01000059.1| GENE 4 2980 - 3627 785 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452874|ref|ZP_05618173.1| ## NR: gi|257452874|ref|ZP_05618173.1| hypothetical protein F3_07394 [Fusobacterium sp. 3_1_5R] # 1 200 1 200 215 295 100.0 1e-78 MKKFTTSLGIFFITSFVTLAATSLKPAEVHTKDGVFTNEKGMVLQGEYEIREGLYDSNFH FQNGKLLHFSFETRKKKENDFEVEGKFATPESFKGELSIKTEEKGKKEEILKKKLSLDGT LEGKTLYTFATEVLTKTPESLKIPTFNTVETWKLQDGKREWKEEKTVELSKKEGSMDYHM FQKTNTEEEQVFSKGKLVHTSKGSSSSSEMKSMKQ >gi|224461443|gb|ACDD01000059.1| GENE 5 3711 - 3776 65 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKGTVTVPFLSYIYQMKKM >gi|224461443|gb|ACDD01000059.1| GENE 6 3810 - 4235 855 141 aa, chain - ## HITS:1 COG:FN0513 KEGG:ns NR:ns ## COG: FN0513 COG0716 # Protein_GI_number: 19703848 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 2 141 3 142 142 154 56.0 5e-38 MKIALVYRSTTGRTEAMAKAIEEGILAAGGAVNVSSIEDVNVDDVFASDILVLGSSADGA ESIDEANFVPFMEDNKDKFAGKKVFLFGSYGWGGGEYANTWKDQVVEFGAEMIEEPVTCL EDPEDATLDQLREVGKKIAAL >gi|224461443|gb|ACDD01000059.1| GENE 7 4301 - 5194 794 297 aa, chain - ## HITS:1 COG:FN2101 KEGG:ns NR:ns ## COG: FN2101 COG0697 # Protein_GI_number: 19705391 # Func_class: G Carbohydrate transport and metabolism; E 
Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 5 295 7 299 301 242 49.0 8e-64 MNLGMGILVTFIGGVFWGFSGVAGKYLFEYTGVTSDWLVPWRLLFAGCIMLLYLYYKQGK EIFRILKEDYKDLLLYAIFGMMACQYTYFTTVQYSNAAIATVLQYSAPPLIMVYMCYKER KKPAKIEVISLIFSCIGVFVLCTHFQFETFVISPKALVWGMISALAMVVNTVQPVNLLKK YGSFLPLAWSMTIGGSILFFWTRPDKIPVEYTWNLFGGFFAVVFLGTIVAFSFYMQGIKV IGPTKASLIACVEPISATVLSIILLGTAFEFLDIVGIALILMAVCLLTYPTKNKKNS >gi|224461443|gb|ACDD01000059.1| GENE 8 5214 - 6515 1753 433 aa, chain - ## HITS:1 COG:FN1025 KEGG:ns NR:ns ## COG: FN1025 COG2252 # Protein_GI_number: 19704360 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 6 433 8 435 435 483 65.0 1e-136 MENQGFLDRYFKLSERGTTVRNEVIGGLTTFLAMAYIIFVNPSILSLTGMDKGALITVTC LATALGTFISGVWANAPFGLAPGMGLNAFFTFTLVMDKGVTWETALGIVFLSGCFFFILS LGGIRERIADCIPLSIKIAVGAGIGLFITLIGLKNMGLVVKNDATLVGLGVLGPEVLIGI AGLFIAVILEIKRVKGGILIGILSSTILAFVFHKVEMPASFISLPPSMAPIFMKLDIKSA FQISLMGPIFSFMFVDLFDSLGTLISCSKEIGLVDKDGKIKGFGKMLYTDVASTIFGAMM GTSTVTTFVESSAGIAAGARTGLASVVTSILFVLSLVFAPIVGVVPAYATAPALIIVGVY MFKNVQHLDFNDLKTLVPAFIIIIMMPLTYSISIGLSLGFISYIIIHLLTGDFKALNIPL IFVGILSLVNLIV >gi|224461443|gb|ACDD01000059.1| GENE 9 6526 - 7401 936 291 aa, chain - ## HITS:1 COG:FN1026 KEGG:ns NR:ns ## COG: FN1026 COG0564 # Protein_GI_number: 19704361 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Fusobacterium nucleatum # 1 285 1 288 289 219 46.0 4e-57 MKEYQVEEKYVGVRIDRYLRKEFPDLSLGDIFKTLRTGKIKVNGKKVKENYRFLEEDRIQ NYLQVEEKEKMTFIHLSQEEKKRLEDGIFYQDTDILVFYKKAGELMHKGSSHDYGLAEQF QAYFQNEDFHFVNRLDKETSGLVLGGKCLKIVRELAEAIKKRTIIKKYYIIIEGSPEKNH FSLKTYLKKGENKVLESQTAKEEYKECSASFSVIRKNKEYSLLEAVLETGRTHQLRVQLA GIGFPILGDIKYGKKRAKRMYLHSHLLKISDWEKEWDTGIPTEFLSYFNNK >gi|224461443|gb|ACDD01000059.1| GENE 10 7404 - 8510 1115 368 aa, chain - ## HITS:1 
COG:FN1027 KEGG:ns NR:ns ## COG: FN1027 COG0772 # Protein_GI_number: 19704362 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Fusobacterium nucleatum # 11 368 7 364 366 406 62.0 1e-113 MGKNRKRYFLIKRIKKMNMWFIANIFVIFLLSLMSIYSSTIPKGPGFFKKELLWFVISAF VFIGFSLLDYHKYMKYDRYVYLFNVLMLLSVFVIGTKRLGAQRWIDLGPISIQPSEFAKI FLVLTLSSYMAKRSHERFEGFKAMTFSFLHMLPIFGLIALQPDLGTSLVLLIVYATLVFI NGLDWRTIFILIVAAILAVPGSYFFLLHDYQRQRVLTFLHPGEDMLGSGWNVMQSMIAIG SGGIDGKGFLQNSQSKLRFLPESHTDFIGAVYLEERGFLGGVALLFLYLFLLIQILKIAD DTEEKFGKLICYGIASIFFFHIFINLGMIMGIMPVTGLPLLLMSYGGSSLVFAYMMLGIV QSVKFHRG >gi|224461443|gb|ACDD01000059.1| GENE 11 8520 - 8960 866 146 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 1 146 1 146 146 211 74.0 3e-55 MSKVQVKVVLEEGVQLPKYESAGAAGLDVRANITESISLGSLERTLIPTGIRMAIPEGYE VQVRPRSGLALKHGITLLNTPGTIDSDYRGELKIIIANMSKEPYVIEPQERIGQLVLNKV EQMEFELVSSLDETERGEGGFGHTGK >gi|224461443|gb|ACDD01000059.1| GENE 12 8957 - 10207 1407 416 aa, chain - ## HITS:1 COG:FN1029 KEGG:ns NR:ns ## COG: FN1029 COG0612 # Protein_GI_number: 19704364 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Fusobacterium nucleatum # 3 409 2 408 408 431 54.0 1e-120 MSEQVQVKTLSNGITVLIEKVPELQSFSLGFFVRTGARNEREEESGISHFIEHMMFKGTE TRTAKDLSEVIDNEGGIINAYTSRETTVYYVQLLSNKLEIAIDVLSDMMLHSTFTEENIE KERNVIIEEIKMYEDSPEDTVHDENISFALRGIQSNSISGTPEGLKKITRDHFMNYLKDQ YVASNLLIAISGNFDETVLMTQLEEKMSSFPKSDKKREYDNRYEIYAGTQVITRDTQQVH ICFNTRGIDVHHPKKYAASILANALGGGMSARLFQRIREEKGLAYSVYSYQSVYEDCGIF TTYAGTTKEAYQEVVNMIQEEYKKVREEGITEQELQRCKNQFTSALMFHLESSKGRMSSM ASSYINNGKVEAREEIMKRINEVSLEDIKEMAQYLFDEKYYSCTVLGNIKKEEFSI >gi|224461443|gb|ACDD01000059.1| GENE 13 10220 - 11305 1173 361 aa, chain - ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # 
Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 361 2 363 363 341 48.0 1e-93 MKKLDIYMTKNFLKYFSYSLFSFLGIFVLSQVFKVLRYVNEGQLSPGQIPLFIGNLLPGI IINVAPLAVLLGGLISINIMASNLEIISLKTSGIRFARLVRGPIFMSFLISLIVFYLNDR VYPGSVVRNRELRGKEDVEEREVPKEKENAFFRNVEGRYVYYMKKINRETGIMDHVEVLD MSENFDKIERMITAKKGRYDFKRKLWVFEDAHIYYPDTDTVEARAFIQEQKYMDEPEYFI SLSNIVPKQQTIAELKKAIKEGSATGNEIREILSELGKRYSFPFASFVVSFLGLALGSHY VRGMSILNIVISILLGYAYYLVEGAFEALGMNGYLNPFLSGWIPNLLFLAAGLYFMRRAE Y >gi|224461443|gb|ACDD01000059.1| GENE 14 11308 - 12384 959 358 aa, chain - ## HITS:1 COG:FN1031 KEGG:ns NR:ns ## COG: FN1031 COG0795 # Protein_GI_number: 19704366 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 358 1 359 359 375 57.0 1e-104 MKIIDSYILKECRGPIILSVSIFTFIFLLDIIVAMMENIIVKGISVFDVARILSFYVPPI LTQTIPLGLFVGIMITFSKFTRSSEAIAMNSIGMDIRAILKPILTLGIASMFFILFLQES IIPRSYIKLQYLASKIAYENPVFQLKERTFMNNLEGYSLYIDKVGRDKHASGILIFENDE KTIFPIVLVGHQAYWRDSSIILERANFISFDEKGVRKLTGSFEDKRVELQAYFSDLQIKV KEIEMMSIGTLLREMKGKSKEEKLIYKVEINRKLALPFSSVMLSVLGVLLSIGHQRSGKR AGILVGILTIFFYICLLNVGIVLANVGKIPILLGVWLPNVLLAALTYRLYIVKKRRGI >gi|224461443|gb|ACDD01000059.1| GENE 15 12371 - 12922 682 183 aa, chain - ## HITS:1 COG:no KEGG:FN1032 NR:ns ## KEGG: FN1032 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 171 1 173 179 101 38.0 1e-20 MYLDIIVCVILLLAILTGASNGMYVEFISIFGLFLNIMLTKTYTPTVISFFKIKYINNNY ALTYIVVFISLYLFIKIVLCITNRVLRDKSNGIITRGIGAFIGLAKGTVIAFIFLLIYNF SMDLFPSIRVYSVGSKTNLIFADAVPEMEKFIPDIFVEKLNRIRNFNFIEKALRNQRNTY ENN >gi|224461443|gb|ACDD01000059.1| GENE 16 12934 - 14115 603 393 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 390 1 392 396 236 33 9e-62 MAKIVLQRGKEKKIQNFYPNVFQDEIKEKIGTMKTGDLVDIVTEEMEFVARAYVTEGSSA 
YARVLSTKDEKIDKTFFQKKIKNAYDRRKHLLKETNCIRAFFSEGDGIPGLIIDKFEHYV AVQFRNSGLEVFRQEILNAIKKYLKPKGIYERSDVENRTHEGVEQKTGILFGEIPERIVM EDNGAKYHIDIIHGQKTGFFLDQRDSRKFIQKYIHEKTRFLDVFSSSGGFSMAALRDGAR EVIAIDKDAHALELCRENYELNHFQSNFSTMEGDAFLLLETLGGRKEKFDIITLDPPSLI KRKAEIYRGRDFFFDLCEKSFPLLEENGILGVMTCAYHISLQDLIEVTRMAASKHGKKVR VLGVNYQPEDHPWILHIPETLYLKALWVQVVED >gi|224461443|gb|ACDD01000059.1| GENE 17 14130 - 15194 1255 354 aa, chain - ## HITS:1 COG:FN0406 KEGG:ns NR:ns ## COG: FN0406 COG0787 # Protein_GI_number: 19703748 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Fusobacterium nucleatum # 1 352 1 352 354 332 46.0 5e-91 MRAWVEIDTENLRHNIREIQKRAEGFGVWGVIKANAYGLGVLPVAKILAEEGIHYFAVAS LEEAKEVRNANITGEILILGSLFHDEILEAESLDFHINVSCREELEWIAKNAPKTKIHLK IDTGMTRLGFSYQEGMEVIEFAKKLSLNITGVFSHFSDADGNSQEAKEYTQKQIERFLPY ATREDIPYRHIFNSGALIQYTDQKIGNMVRAGICLYGILGSTPIPSFKNVVTLKTKVLFK KTVTEETYVSYGRLCKLEKGETYVTLPIGYADGVKKYLANGGKVEILGEACPIIGAICMD MMMVKIPENLVNKIEIGTEVSVFNNDIIRRNNISETCTWDMFTGLGRRVQRIYK >gi|224461443|gb|ACDD01000059.1| GENE 18 15257 - 16198 1430 313 aa, chain - ## HITS:1 COG:FN0700 KEGG:ns NR:ns ## COG: FN0700 COG0341 # Protein_GI_number: 19704035 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Fusobacterium nucleatum # 3 313 5 317 317 355 61.0 6e-98 MQVNVIKNSKQYLGLALTMVILSLGVFFTKGLNYGIDFSGGNLLQIKYENKITLHDINES LDNIQGIPQIGTNSRKVQISEDNTVIIRTQEISEDEKKEILNALQSVGAYQIDKEDKVGA SVGEELKTSAIYALGIGAVLIIIYITFRFEFIFAVGAIVALLHDLILALGCISLLYYEIN TPFIAAILTILGYSINDTIVVFDRIRENLKRRAKTQMSIEECLAKSVNQVMIRSINTSVT TLFAIVAILLLGGDSLRTFIVTLLVGILAGTYSSVFIATPVVYFLHKKGDGKGMEKISVE KDEDQEDEEKILV >gi|224461443|gb|ACDD01000059.1| GENE 19 16200 - 17435 1734 411 aa, chain - ## HITS:1 COG:FN0699 KEGG:ns NR:ns ## COG: FN0699 COG0342 # Protein_GI_number: 19704034 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: 
Fusobacterium nucleatum # 1 411 1 411 411 557 75.0 1e-158 MKSKLMLKLLLVLGILAGAMWLSFSKPTKLGLDLKGGVYVVLEAVPEEGQTLDKDAMGRL IEVLDRRINGLGVAESSVQMAGDNRVIIELPGVDNTEDAVKMVGKTALLEFKLKQEDGSL GETLLTGGSLKKADVSYDNLGRPQIQFEMTPEGAREFAKITRENIGKQLAITLDGEVQTA PVINGEIPSGSGVITGNYTVEEAKATATLLNAGALPVKAEIAEIRTVGASLGDESIAQSK QAGMLAIVLIWAFMILFYRLPGIVADIALVFFGFITFGLLNFIDATLTLPGIAGLILSAG MAVDANVIIFERIKEELQFGNTIRNAIASGFNKGFVAIFDSNITTLIITIILFTFGTGPV KGFAVTLTIGTIGSMLTAITITKVLLLNFVEIFGFTKPSLFGIKGKKEEAK >gi|224461443|gb|ACDD01000059.1| GENE 20 17449 - 17868 571 139 aa, chain - ## HITS:1 COG:FN0698 KEGG:ns NR:ns ## COG: FN0698 COG0816 # Protein_GI_number: 19704033 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Fusobacterium nucleatum # 2 139 1 138 138 166 71.0 1e-41 MLKKYLALDVGDVRIGVAKSDIMGIIASPLETIDRRKMKPVKRIIELCEQENTKSIVVGI PKSLDGSEKRQAEKVRIFIHALKSAIPGVEIFEVDERFTTVTADQILTEMNRKGALEKRK VVDKVAASLILQQYLNMKK >gi|224461443|gb|ACDD01000059.1| GENE 21 17880 - 20480 3577 866 aa, chain - ## HITS:1 COG:FN0697 KEGG:ns NR:ns ## COG: FN0697 COG0013 # Protein_GI_number: 19704032 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 863 1 863 867 1268 74.0 0 MLTGNEIRQKFIEFFESKAHKHFESASLIPDDPTLLLTVAGMVPFKPYFLGQKEAPCPRV TTYQKCIRTNDLENVGRTARHHTFFEMLGNFSFGDYFKREAIVWSWEFVTEVLGLPKDKL WVTVFTTDDEAEKIWIEDCHFPKERIVRMGESENWWAAGPTGSCGPCSEIHVDLGPEYGG DENSKIGDEGTDNRYIEIWNLVFTEWNRMEDGHLEPLPKKNIDTGAGLERIAAMVQGKSN NFETDLLFPLVEEAGRLTNSKYHESPEKDFSLKVITDHSRAVTFLIHDGVIPSNEGRGYV LRRILRRAVRHGRLLGQKELFLYKMVKKVVDQFAIAYPDLTANLENIQKIVKIEEEKFSN TLDQGIQLVNEQIEMALQAGKSSLDGEITFKMYDTYGFPYELTEEICNERGIAVSQEEFL AKMEEQKEKARSARAVIMEKGQDSFIEEFYDKHGVTEFTGYHSTEEKAKLLNIRQKEDGT LLLIFDKTPFYGESGGQVGDHGSISSEAFQGKVLDVKKQKEIFTHIVEVVSGEAEEGKEY TLTVDSKYRAAVSKNHTATHLLHKALREVLGTHVQQAGSLVDSEKLRFDFSHYEAMTEEQ 
IQEVEERVNEKISEAIAVEVSHKTMEEAKACGAMMLFGDKYGDVVRVVHVPGFSTELCGG IHVENIGHIGLFKIVSEGGIAAGVRRIEAKTGYEAYRFVEENIGMLKKTAKLLKTEDSLL LEKVEKVLQEEKEKAREITSLKDKIAKQEAEALYTHALEIADVKVFMAKYEDKSMDDLRK MIDFVKDKEENAIVVLTSTFEKLSFAVGVSKALTGKYKAGNLVKIAAEITGGKGGGKPDF AQAGGKDKSKIEEAMEAIKKAIKENK >gi|224461443|gb|ACDD01000059.1| GENE 22 20500 - 21225 262 241 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 8 226 6 229 245 105 25 3e-22 MINLTARNLVKAYKQRKVVDSVSLEVNKGEIVGLLGPNGAGKTTTFYMITGIIKPNAGSV ICNGEEITSYPMYKRANLGIGYLAQEPSVFRNLTVEDNIRAVLEMKHYGKKEQKEMVDKL LEEFKLTHVYESLGYSLSGGERRRVEIARTIANNPSFILLDEPFAGVDPIAVEDIQQSIR YLQKRGLGILITDHSVRETLNITEKAYIMAQGKVIISGSPQEISENEMARKIYLGENFKL D >gi|224461443|gb|ACDD01000059.1| GENE 23 21241 - 23634 3039 797 aa, chain - ## HITS:1 COG:no KEGG:FN0694 NR:ns ## KEGG: FN0694 # Name: not_defined # Def: S-layer protein # Organism: F.nucleatum # Pathway: not_defined # 244 742 7 505 643 204 29.0 2e-50 MTKKAWIYSGVGAVVLVAGYFNYFGEDKKLDTLKKVIETSNAIYKSADYFVEAKKQIDYV DDKETKFEIAKAVVKGMALSGDNVVIDKLRNLVLKNNILGVSENGWKFNTSELRYNKSTD EIISEAGVSAINEKKGIHLEGKKFLTTTSMSHILLENGVKFEVGQAGLRGEKAEYDDSTK KILLSGNIELYNPQKDGKEFRGKFGNMIYDVEKGRGETSLPFEIIYKETILNAEKMDFHP EENAFHLEQNVKITSKEYNANLLAIDKKAGEDFITFVGPIRGQNEEYTYSMNRAVYDTVK KEITMTGNIDILSRKGERIRADRAIYHEKEKLLDVYSDGNKVTYDGSGHHIEATSFQYDA KTGDVHVHSPYRYTNQNGDIFEGSNLEYNKATGKAIVKGEVKYQSKDYTVKTVDLDYARE TGVLTIANPYSITMKDGTNFEGKSAVYNEKTGNLVSPGSIYMVGKDYVAHGHDLKYNNNT GAGTLEGPVNLVSETQNFNITGDRAVFDKKNGAVMGNVKGNLQGTMIATSKAIYKSNQKM VELPAPIQYRNPQENLHGNMKSGKYFVEDHRFQGKQFVAIRPGEKVSSEYAEYFTEEKRA ELIGKVRMENADQVVSTEKASYELIDKYAELPETFKMTKGQFVVNGASGNVNFTTQKLFV KKPKMNSQAGEHFEAERLEGNLKTLIMDFRDKVYGKTMQKNVLSEYRGEKARVYLKKEQN QYRAKKLEVFEDAVFSQEDKKLRGKRGQYDFDTSLVNFYGDVSFTSKDGNIRSDEMIYNT QTKKAKAKGNVELHYNK >gi|224461443|gb|ACDD01000059.1| GENE 24 23609 - 26209 2820 866 aa, chain - ## HITS:1 COG:FN0693 KEGG:ns NR:ns ## COG: FN0693 COG0249 # Protein_GI_number: 19704028 # Func_class: L Replication, 
recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 857 20 893 896 922 59.0 0 MASETPLMAQYKEIKEEYQNAILLFRLGDFYEMFFEDAKIASKELGLTLTSRNREKGQEV PLAGVPYHSVASYVAKLLEKGYTVAICDQVEDPKAAKGIVKREVTRVLTPGTQIDVDYLD GKSNQYLMSFVCKEEGAAIAYFDITTGEFRVRELKEGNLFYQLLGELGKINPKELILEEE IYRNYQDDFEKYPDFSGIKINFCKNVKQAESYLKECYQILSLESFGLAQKPLAQQACANI LDYVKTLQKGQEFPLMKISLLSNQETMELNRSGQKNLEIFGALFSILDLCKTSMGSRYLK RVLQNPLLNIAKIKKRQDYVEFFTKEVLLREEVRELLSEVYDLERILGKIQLSTVNGRDI LALGNSLKAALLLEKQLHRYSMLKMEREVFSEMQENIMNAIMIEAPFSIREGGIFQKGYH QELDELRHIASSGKEILLDIEAREREKTGIKTLKIKYNKVFGYFIEVSKANEHLVPSHYI RKQTLVNSERYIVEELKEYEDRILNAKTKIESLEYYLFQEFVEKIKEQKEALSDLARQLS FLDVMTSFAQLAIQKSYVRPEVVEEDILEICGGRHPIVENLIPKGTYIKNDLYFDKSERM MVLTGPNMSGKSTYMKQIALIIILAQVGSFVPADFAKIGIVDKIFTRIGASDDLLTGQST FMVEMSEVANIIHNATEKSFIILDEIGRGTSTFDGISIATAITEHIHSHIRAKTIFATHY HELTELEKELELVKNYRIEVQEQGKEVLFLREIVQGGADKSYGIEVAKLSGLPQSILKRS KQILSRLEKQKALVEKKMQGEQMVLFQAEEIEEEEETERNMEEQSVLEEIRKLSIDQMTP LQALLVLQSLKGKLSGGKHDEKSLDL >gi|224461443|gb|ACDD01000059.1| GENE 25 26212 - 27156 483 314 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 3 306 38 342 353 190 33 7e-48 MKKIFIAPIAGVTDYTYRGILEEFHPDLLFTEMVSSDALAALNDKTISQILRLRPGNGVQ LFGKDIEKMLYSAKYVEKLGVKHIDINSGCPMKRVVHSGHGAALLKSPDKIKEILSTLRE GLQEDTDLSIKIRVGYEKPENYIQIAKIAEEVGCSHITVHGRTRAQLYTGFADWSLIKEI KENVSIPVIGNGDIFTAEDAKEKIEYSGVDGVMLARGIFGNPWLIREIREILQYGQVKTK VTAEDKINMAIHHIEEIAKDNPQREFVFDVRKHICWYLKGISNSAECKNQINRSVDYQAT VALLQELKEKIKGE >gi|224461443|gb|ACDD01000059.1| GENE 26 27271 - 27441 400 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452895|ref|ZP_05618194.1| ## NR: gi|257452895|ref|ZP_05618194.1| hypothetical protein F3_07499 [Fusobacterium sp. 
3_1_5R] # 1 56 1 56 56 68 100.0 1e-10 MHVIDKETCIGCGACEGVCPVSTISATDDGKYEVGDACVDCGACAAGCPVSAISAQ Prediction of potential genes in microbial genomes Time: Fri May 20 02:12:19 2011 Seq name: gi|224461442|gb|ACDD01000060.1| Fusobacterium sp. 3_1_5R cont1.60, whole genome shotgun sequence Length of sequence - 27669 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 5, operones - 4 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 22 - 339 398 ## gi|257452896|ref|ZP_05618195.1| hypothetical protein F3_07504 2 1 Op 2 11/0.000 - CDS 349 - 1440 1282 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 3 1 Op 3 21/0.000 - CDS 1430 - 2470 1407 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 4 1 Op 4 . - CDS 2467 - 4062 201 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 - Term 4083 - 4117 3.4 5 1 Op 5 . - CDS 4129 - 5367 1913 ## FN1590 lipoprotein - Prom 5464 - 5523 16.9 - Term 5493 - 5543 10.5 6 2 Op 1 1/0.000 - CDS 5569 - 7602 3571 ## COG3808 Inorganic pyrophosphatase 7 2 Op 2 . - CDS 7671 - 8666 1143 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 8 2 Op 3 . 
- CDS 8650 - 8850 374 ## gi|257452903|ref|ZP_05618202.1| hypothetical protein F3_07539 9 2 Op 4 8/0.000 - CDS 8847 - 9410 782 ## COG0194 Guanylate kinase 10 2 Op 5 1/0.000 - CDS 9423 - 10301 945 ## COG1561 Uncharacterized stress-induced protein 11 2 Op 6 58/0.000 - CDS 10362 - 14330 5368 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 12 2 Op 7 28/0.000 - CDS 14367 - 17921 840 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 17997 - 18056 4.0 - Term 18032 - 18066 3.6 13 2 Op 8 47/0.000 - CDS 18102 - 18467 547 ## PROTEIN SUPPORTED gi|237738814|ref|ZP_04569295.1| LSU ribosomal protein L12P 14 2 Op 9 43/0.000 - CDS 18508 - 19020 761 ## PROTEIN SUPPORTED gi|237738813|ref|ZP_04569294.1| LSU ribosomal protein L10P - Term 19041 - 19078 1.5 15 2 Op 10 55/0.000 - CDS 19188 - 19925 1082 ## PROTEIN SUPPORTED gi|237738812|ref|ZP_04569293.1| LSU ribosomal protein L1P 16 2 Op 11 45/0.000 - CDS 19957 - 20382 669 ## PROTEIN SUPPORTED gi|237738811|ref|ZP_04569292.1| LSU ribosomal protein L11P 17 2 Op 12 46/0.000 - CDS 20419 - 21009 844 ## COG0250 Transcription antiterminator 18 2 Op 13 . - CDS 21013 - 21195 232 ## COG0690 Preprotein translocase subunit SecE - Prom 21264 - 21323 2.8 - TRNA 21220 - 21295 87.4 # Trp CCA 0 0 19 3 Tu 1 . - CDS 21329 - 21481 228 ## PROTEIN SUPPORTED gi|197735409|ref|YP_002164187.1| ribosomal protein L33 - Prom 21609 - 21668 6.4 - Term 21657 - 21699 8.0 20 4 Op 1 1/0.000 - CDS 21707 - 22498 1078 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 21 4 Op 2 11/0.000 - CDS 22514 - 23800 713 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 22 4 Op 3 11/0.000 - CDS 23812 - 24297 642 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component - Term 24312 - 24348 7.5 23 4 Op 4 . 
- CDS 24365 - 25408 307 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 - Prom 25619 - 25678 5.1 - TRNA 25676 - 25762 72.3 # Leu CAA 0 0 - Term 25629 - 25663 2.1 24 5 Op 1 . - CDS 25814 - 26881 1319 ## COG0598 Mg2+ and Co2+ transporters 25 5 Op 2 . - CDS 26883 - 27593 302 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 Predicted protein(s) >gi|224461442|gb|ACDD01000060.1| GENE 1 22 - 339 398 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452896|ref|ZP_05618195.1| ## NR: gi|257452896|ref|ZP_05618195.1| hypothetical protein F3_07504 [Fusobacterium sp. 3_1_5R] # 1 105 1 105 105 193 100.0 3e-48 MASRNIIRNWGILAILFLGIIYSLMVTGKEHSIILDNRNGLSGLRYSIDGENYQAMGTKK IQRYVQGKAHTIYIKKSNGQVTEKDFQLGFQENDLELDIKEIVKS >gi|224461442|gb|ACDD01000060.1| GENE 2 349 - 1440 1282 363 aa, chain - ## HITS:1 COG:FN1896 KEGG:ns NR:ns ## COG: FN1896 COG1172 # Protein_GI_number: 19705201 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 21 344 1 324 340 446 74.0 1e-125 MQNKWKKQLINNSVPILMFALVVFAFPLSGLSLSYIANEMLLRMSRNLFLVLSLLIPIIA GMGLNFGIVLGAMGGQLALIFVSDWHIVGVQGVLLAMILSIPFSMVLGYIGGAVLNRAKG REMITSMILGYFINGVYQLIVLYAMGVVIPLTDSKILLSSGRGVRNTIDLGKLNAALDKF VSFKIMNIEIPVLTILFIVALCIFIVWFRSTKLGQDMRAIGQDMEVARSSGIEVDKTRII AIVISTVLAGIGQVIYLQNIGTMNTYNSHEQIGMFSIAALLIGGASVAKASIPNALGGVV LFHLMFVLAPRAGKELMGSAQIGEYFRVFVSYGIIALVLIMYEWRAQKEKEAERERLIQE QLK >gi|224461442|gb|ACDD01000060.1| GENE 3 1430 - 2470 1407 346 aa, chain - ## HITS:1 COG:FN1897 KEGG:ns NR:ns ## COG: FN1897 COG1172 # Protein_GI_number: 19705202 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 7 343 1 337 339 443 74.0 1e-124 MIDGKKIIDSIGWPRLIIGLFLLSTYCVAPFVGIPLFVAFRDTFTRFGMNAILVLSLMPM 
IEAGAGLNFGMPLGVEAGLLGALISIQLGLTGGIGFFAAILFAIPFAILFGWFYGLILNR VKGGEMMIATYIGFSMVSFMCMMFLLLPFTRPDMIWAYGGEGLRTTISVERYWQKILDDL IGIHWELLPVGEILLFAVVAFGMWIFFRTRTGLSMSAVGKNPKFAQATGVSINRVRIQSV IISTVLAALGIIIYQQSFGFIQLYLAPFNMAFPAIAAILIGGASVNKVTVWHVLIGTFLF QGILTMTPTVVNAVIQTDMSETIRIIVSNGMILYALTRKGGGSHAK >gi|224461442|gb|ACDD01000060.1| GENE 4 2467 - 4062 201 531 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 290 519 26 236 318 82 27 4e-15 MNDILLKAENLSKSFGENVVLKDINFTIKPGEIVGLVGENGAGKSTLMKIIFGMEVIQAT GGYGGRLEFEGKEVRFSSPFEALEAGIGMVHQEFSLIPGFEAAENIVLNRESTKKGISEY LFGNRIRKMNEVENLERAKNAIEQLGVEQLQAEAQISEMPVAHKQFTEIAREIEREKTKL LVLDEPTAVLTEEEAKVLISTMKKLSEKGIAIIFITHRLQEILDVSDKVIVLRDGVLINT VETKDTNVNQITEWMIGRKISSSEEVIHEMNEKLENILEMKDLWVDMPGEMVKKLSLNIK KGEILGLGGMAGQGKIGIANGVMGLCDAGGEILYKGETLSLNAPKEALSKGIFFVSEDRK GVGLLLEESIEKNIAYPAIQIKQQFLKKKFGLFQLLDEKAVTENAKKYIEKLEIRCTSSK QFVKELSGGNQQKVCLAKAFTMNPELLFVSEPTRGIDIGAKQLVLETLKEYNQEKGTTII ITSSEIEELRSICDRIAVINEGKVAGILSPKADILEFGKLMVGAKEGGEEQ >gi|224461442|gb|ACDD01000060.1| GENE 5 4129 - 5367 1913 412 aa, chain - ## HITS:1 COG:no KEGG:FN1590 NR:ns ## KEGG: FN1590 # Name: not_defined # Def: lipoprotein # Organism: F.nucleatum # Pathway: not_defined # 1 410 1 412 414 541 65.0 1e-152 MKKYVLAFIVSLSLIVFAACGKKAPEEAKDVARASKEAGANYHIGVVSGTVSQSEDGLRG AEAVVKEYGALENGGRVVHVTFPDNFMQEQETTISKIVSLADDPEMKVIVMAEAIPGTSA AFKAIKEKRPDIILLANTPHEDPELISQYADVSVHPDSVARGYLIVKAAHDLGAKKFMHI SFPRHLGYELIARRRAIFQAAAKDLGMEYIEMSAPDPVSDVGVPGAQQFILEQVPNWLDK YGKDVAFFATNDAQTEPLLKKIAEIGGYFIEADLPSPTMGYPGALGVQFAEDEKGDWPKI LAKVEEAVKNAGGSGRMGTWAYSYNFASVQALTDFAMKHLDNGTDLKDFSALLESYQKYT PGAGWNGSNYVDANGVEKDNFFMLYQDTYVFGKGYLHMTDEKVPEKYFEIKK >gi|224461442|gb|ACDD01000060.1| GENE 6 5569 - 7602 3571 677 aa, chain - ## HITS:1 COG:FN2030 KEGG:ns NR:ns ## COG: FN2030 COG3808 # Protein_GI_number: 19705321 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Fusobacterium nucleatum # 6 
677 8 671 671 845 81.0 0 MEQMFMYFGIIAGIISLVAAFYYAKKVESYSINIPRVAEITEAIREGAMAFLTAEYKILI WFVIAIAILLGIAISPFTAVAFVLGAVTSAIAGNIGMRIATKANGRTAIAAKEGGLAKAL DVAFSGGAVMGLSVVGLGILMLSIVMLVLTGMGMELSTVAAELTGFGMGASSIALFARVG GGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGMGADLFESYVGSII AAVALGTFIAANEAGMTAIGYIFAPLVLAGLGIIASILASFTVKTNDPNAVHHKLETGTR IAGLLTIIASFGVVKYFELPLGVFWAIVAGLVAGLVIAYFTGLYTDTHTKAVNRISDAAS TGAATAIIEGLAVGMESTVAPIIVIAIAIIIAFQQGGLYGIAIAAVGMLATTGMVVAVDA YGPVADNAGGIAEMSELPPEVRETTDKLDAVGNSTAAVGKGFAVGSAALTALSLFATYKQ TVDSMTDFDLVIDVTDPEVIVGLFIGGMLTFLFAALTMTAVGKAAIEMVEEVRRQFREIP GIMEKKAKPDYKRCVEISTHSSLKQMILPGVLAIVAPVIVGVWSVQALGGLLAGALVTGI LMAIMMANAGGAWDNGKKQIEAGYKGDGKGSDRHKAAVVGDTVGDPFKDTSGPSMNILIK LMTIVSVVLVPFFVKFI >gi|224461442|gb|ACDD01000060.1| GENE 7 7671 - 8666 1143 331 aa, chain - ## HITS:1 COG:FN2031 KEGG:ns NR:ns ## COG: FN2031 COG1477 # Protein_GI_number: 19705322 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Fusobacterium nucleatum # 19 324 2 313 320 342 58.0 4e-94 MKKINKILIVIFCLFLGKISYAKEIKYEESKFLFGTYIKITSYSESTSTAKKAIQAAFQE IERIDKKFNSKTEGSLIYQLNHSSNKEISLDAEGKFLFQTIQKAYLLSHKKYDITISPLL RLWNFENPEKAKIPNKISLEKILKEVDFEKIKIEGNRLRLLSPVKEIDTGSFLKGYALAR AEKVLKEKGLKSAFISSISSIDLLGSKPGGKPWKIALENPTNANDILGVLSLQDKALGVS GDYQTYVEIQGKRYHHILDKRTGYPVGDKKMVVVICKDGFEADVYSTTFFLMPIEEVLNY ANKMANFEVMIVDKNMKFHMSKGFSQYFSKK >gi|224461442|gb|ACDD01000060.1| GENE 8 8650 - 8850 374 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452903|ref|ZP_05618202.1| ## NR: gi|257452903|ref|ZP_05618202.1| hypothetical protein F3_07539 [Fusobacterium sp. 
3_1_5R] # 1 66 1 66 66 107 100.0 3e-22 MKKDITYDELLEKIPNKYILTIVGGERARELHAGATPLTKTAKKDTDLKKVFREIIDGKI HYEEDK >gi|224461442|gb|ACDD01000060.1| GENE 9 8847 - 9410 782 187 aa, chain - ## HITS:1 COG:FN2033 KEGG:ns NR:ns ## COG: FN2033 COG0194 # Protein_GI_number: 19705324 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Fusobacterium nucleatum # 1 181 1 181 185 237 68.0 1e-62 MPKGNLYVVSGPSGAGKSTICRKVRKMLGINLATSATTREPRTGEVHGVDYYFLSHAEFE KKIQEGAFLEYAKVHNNYYGTLKSEVENRVNQGEKVILEIDVQGGLQVKALYPDAHLIFF KTPNLEQLEARLRGRKTDSEETIQLRLKNSIEELKCEEKYDICIVNHTVEQACNDLIQII EEKENLS >gi|224461442|gb|ACDD01000060.1| GENE 10 9423 - 10301 945 292 aa, chain - ## HITS:1 COG:FN2034 KEGG:ns NR:ns ## COG: FN2034 COG1561 # Protein_GI_number: 19705325 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Fusobacterium nucleatum # 1 292 1 292 292 262 55.0 5e-70 MRSMTGYAKLIYEDEKYALQMEMKSVNNKNLSCKIKLPYNLNFLETKIRNEIATKVLRGS VELRIELEEKEENLEAIQYDKNLSRAYFDTLSSMEKELGEVFSNKMDFLVRNFNVLKKGN NEVSEEEYSQFLLPKVQELLLPFLESRQEEGNRLQLFFLEKFEILKSYVTKIEEYQPSVV ERYKEKLLARLQTCREDLHFEENDILKEVMIFTDRSDISEELSRLKSHLQQLEKELKSKE LGLGKKIEFLLQEIFRELNTTGVKSNLYEISNLVVSAKNELEKIREQIMNIE >gi|224461442|gb|ACDD01000060.1| GENE 11 10362 - 14330 5368 1322 aa, chain - ## HITS:1 COG:FN2035 KEGG:ns NR:ns ## COG: FN2035 COG0086 # Protein_GI_number: 19705326 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Fusobacterium nucleatum # 1 1318 1 1319 1319 2109 81.0 0 MGIRNFEKIKIKLASPEKIEEWSYGEVTKPETINYRTLNPEKDGLFCEKIFGPTKDWECA CGKYKRMRYKGLICEKCEVEVTKSKVRRERMGHIALAAPVSHIWYSKGTPNKMSLIIGLS PKELESVLYFARYIVTESEEESLEIGKILTEKEYKLFKQLYGTKFEAYMGAEAILKLLER INLEELRIELEAELEEVSSAQKRKKIVKRLKIVRDFIASGNRPEWMILKNVPVIPADLRP MVQLDGGRFATSDLNDLYRRVINRNNRLKKLLEIRAPEIVVKNEKRMLQEAVDALIDNGR RGKPVVAQNNRELKSLSDMLKGKQGRFRQNLLGKRVDYSARSVIVVGPSLKMNQCGIPKK MALELYKPFIMRELVKRELATNIKTAKKLVEEADDKVWDVIEDVIQDHPVLLNRAPTLHR 
LSIQAFEPVLIEGKAIRLHPLVCSAFNADFDGDQMAVHLMLSPEAIMEAKLLMLAPNNII APSSGEPIAVPSQDMVMGCYYMTEKKKGAKGEGKAFSNIDQLLTAYQNKVIDTHALVKVR VNGEMIETTPGLVMFNEILPVQDRNYQKTIGKKELKQLIAYLYDEHGFTETADLINKLKN FGYHYSTLAGISVGVEDLVIPEEKKHLLAAADEQVEQIDADYKAGKIINEERYRKTIEVW SKTTDAVTKAMMDGLDKFNPVYMMANSGARGNTNQMRQLAGMRGNMADTQGRIIETPIKA NFREGLTVLEFFISSHGARKGLADTALRTADSGYLTRRLVDISHEVIVNAEDCGTMQGIE VGDLISGGKVIEKLAERIKGRVLAEDLVYNGEVLATRNTMIGKELLREIDEKGIKKVKIR SPLTCALEKGVCKKCYGMDLSNLKEILLGEAVGVVAAQSIGEPGTQLTMRTFHTGGVASA SAAITQIKSENGGRISFRDIHTLNLNGEEIVVSQAGKVIVADNEYEVSSGSILKVKAGDI IEEGTVLVTFDPYHIPLIAAQDGKVEYRELTPKKTHDEKYDVWQSLVVKAMDSGDVNPRV HILDKDGKKLGTYNIPYGAYMMVEDGAMVKKGDILAKIMKIGEGSKDITGGLPRVQELFE ARNPKGKAMLTEIDGRVEINNRKKKGMRVVTVKSLDGEGEQREYLVPVGERLIVTDGLKV KAGDKITEGAISPFDVLNIKGLVAAEQFILESVQQVYRDQGVGVNDKHIEIIVKQMFKKV KIVDSGASLFLEDEVVEKRLVDLENKELAEKGKALIQYEPIIQGITKAAVNTGSFISAAS FQETTKVLSNAAIEGKVDYLEGLKENVIIGKKIPAGTGFSAYKNVAMKVQEEFGAELEET EE >gi|224461442|gb|ACDD01000060.1| GENE 12 14367 - 17921 840 1184 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 889 1142 1085 1390 1392 328 55 3e-89 MGKLVERLNFGKIKERGIMPHFLEFQLNSYEDFLQMKVAPNNRENKGLESAFREIFPIES SSNGEIRLEYVSYELHAAEPPLNDELECKKRGKTYSDSLKVRLRLFNKKSGNEIQESLVY FGEVPKMTERGTFIINGAERVVVSQLHRSPGVSFNKEVNIQTGKDLFSGKIIPYKGTWLE FETDKNDFLSVKIDRKKKVLATVFLKAVDFFKDNTEIKEHFFEVKELDLTEFYEKYANDT EELLSVVRTKIESSFLKEAIYDEETGEIIAEEDAVISEALISKIIENKIAVLSYWEVKPE DLLIANTIANDTTKNSDEAVTEVFKKLRPGDLVTVDSARSLIKQMFFNVQRYDLEPVGRY KMNKRLKLEIDENEVLLTPEDVLGTIQYVIDLNNGESHVHTDDIDNLSNRRVRGVGELLL MQIKTGLLKMSKMVREKMTIQDIETLTPQSLLNTRPLNALILDFFGSGQLSQFMDQSNPL AELTHKRRISALGPGGLSRERAGFEVRDVHDSHYGRICPIETPEGPNIGLIGSLAIYAKI NQYGFIETPYVAVKDGVADLNDIRYLAADEEEGMFIAQADTKLGENNELLEPVTCRIGPE ILDVEAKRVHYLDISPKQVVSVSAGLIPFLEHDDANRALMGSNMQRQAVPLLRTQAPFIG TGLERKVAVDSGAVVTTKVDGTVSYVDGKKIVIETEDKREYTYRLLNFERSNQSMCLHQS PLVNLGDKVKAGDIIADGPATSKGDLALGKNILMGFMTWEGYNYEDAILISDRLRKDDVF 
TSIHVEEYEIEARNTKLGDEEITREIPNVSEAALRNLDANGIIMVGSEVEPGDILVGKTS PKGETEPPAEEKLLRAIFGEKARDVRDSSLRMPHGSKGTVVEILELSRENGDELKAGVNK AIRVLVAEKRKITVGDKMSGRHGNKGVVSRVLPAEDMPFLEDGTHLDVVLNPLGVPSRMN IGQVLEVHLGMAMRTLNGGTHIATPVFDGATEEQIKDYLENQGYPRSGKVTLYDGRTGDK FDNKITVGIMYMLKLHHLVEDKMHARAIGPYSLVTQQPLGGKAQFGGQRLGEMEVWALEA YGASNILQEMLTVKSDDVTGRTKTYEAIIKGEEMPDSDLPESFKVLLKEFQALALDVELC DENDNVINVDEELNKEDTTLEYSPSDMLELEDDEDEDDYEDDEE >gi|224461442|gb|ACDD01000060.1| GENE 13 18102 - 18467 547 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738814|ref|ZP_04569295.1| LSU ribosomal protein L12P [Fusobacterium sp. 2_1_31] # 1 121 1 121 121 215 91 3e-55 MAFDREKFIADLEAMTVLELKELVTALEDHFGVTAAAPVAVAAAGPAEAAEEKTEFDVVL KSAGGNKIAVIKEVRAITGLGLKEAKELVDNGGVIKEAAPKEEAEAIKEKLTAAGAEIEV K >gi|224461442|gb|ACDD01000060.1| GENE 14 18508 - 19020 761 170 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738813|ref|ZP_04569294.1| LSU ribosomal protein L10P [Fusobacterium sp. 2_1_31] # 1 170 1 170 170 297 88 4e-80 MATQVKKEIVAELVEKIKKAQSVVFVDYQGIKVNEETALRKKMRESGAEYLVAKNRLFKI ALKESGVEDNFDEILEGTTAFAFGYEDPAVPAKVVFDLAKDKAKAKQDIFKIKGGYLTGK RVSIDEVEALAKLPSREQLLSMVLNSMLGPVRKLAYATVAIADKKEAAGE >gi|224461442|gb|ACDD01000060.1| GENE 15 19188 - 19925 1082 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738812|ref|ZP_04569293.1| LSU ribosomal protein L1P [Fusobacterium sp. 2_1_31] # 11 245 1 235 235 421 88 1e-117 MTTYREEIKEMAKHRGKKYIEVSKLVETGKLYEVKEALELVAKTRTANFVETVEVALKLG VDPRHADQQVRGTVVLPHGTGKTVKILAITSGENVQKALDAGADYAGAEEYISQIQQGWL DFDLVIATPDMMPKIGRLGKILGTKGLMPNPKSGTVTPDVAGAVSEFKKGKLAFRVDKVG SIHVAIGKADFSADKIEENFKAFMDQIVRLKPAASKGQYLRSVAVSLTMGPGVKMDPLLV AKYVG >gi|224461442|gb|ACDD01000060.1| GENE 16 19957 - 20382 669 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237738811|ref|ZP_04569292.1| LSU ribosomal protein L11P [Fusobacterium sp. 
2_1_31] # 1 141 1 141 141 262 94 2e-69 MAKEVIGLIKLQLPAGKANPAPPVGPALGQHGVNIMEFCKAFNAKTQDKAGWIIPVEISV YNDRSFTFILKTPPASDLLKKAAGIQSGAKNSKKEVVGKITSAKLRELAETKMPDLNAGS VEAAMKIIAGSARSMGIKIED >gi|224461442|gb|ACDD01000060.1| GENE 17 20419 - 21009 844 196 aa, chain - ## HITS:1 COG:FN2041 KEGG:ns NR:ns ## COG: FN2041 COG0250 # Protein_GI_number: 19705332 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Fusobacterium nucleatum # 1 196 1 193 193 234 66.0 7e-62 MTKTEVKRWFMIHTYSGYEKKVKTDLEQKIETLGMTEIVSKILVPEEKSTEIVRGKEKVV FRKIFPGYVMLEMTAVREESDEGINYKVDSDAWYVVRNTNGVTGFVGVGSDPIPMEEHEV ENIFRVIGYKEEVREQQLYKADFEVGDYVKVLDGGFVNKEGRVAEMDYEQGKVKIMIDIF GRMTPVEVSFSSVEKM >gi|224461442|gb|ACDD01000060.1| GENE 18 21013 - 21195 232 60 aa, chain - ## HITS:1 COG:FN2042 KEGG:ns NR:ns ## COG: FN2042 COG0690 # Protein_GI_number: 19705333 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecE # Organism: Fusobacterium nucleatum # 1 57 1 57 58 62 56.0 2e-10 MSLFQDVRKEYSKVQWPKKKDIISSTVWVVVMAVILSIYLGVFDLIATRLLKNLVSLFGG >gi|224461442|gb|ACDD01000060.1| GENE 19 21329 - 21481 228 50 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|197735409|ref|YP_002164187.1| ribosomal protein L33 [Fusobacterium nucleatum subsp. 
polymorphum ATCC 10953] # 1 50 1 50 50 92 84 3e-18 MRVQVLLECTETKLRHYSTTKNKKNTPERLEIKKYNPVLKRHTIYKEVKK >gi|224461442|gb|ACDD01000060.1| GENE 20 21707 - 22498 1078 263 aa, chain - ## HITS:1 COG:FN1255 KEGG:ns NR:ns ## COG: FN1255 COG0647 # Protein_GI_number: 19704590 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Fusobacterium nucleatum # 1 257 12 270 275 251 47.0 7e-67 MNRLKNKTCFLFDLDGTIYLSEHLIPGATDLLAEIRRQGKHFAFMTNNSSSAKQQYLEKM KRLGIEVTAKEILTSTDATLRYLKMQNMKKIVLLATPEVEKEFQEEGFTIIKERGKEADC VVLTFDLTLTYDKIWTAYDYLVKGLPYIASHPDYLCPLKEGFKPDVGSFISMFQTACHRE PLIIGKPNHYMVEEAMERFHVKKEDMVIVGDRLYTDIRTGLRSGVTAIAVLSGETTEDML ENTEDVPDYVFPSVKEIFDIMKK >gi|224461442|gb|ACDD01000060.1| GENE 21 22514 - 23800 713 428 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 427 3 429 435 279 35 2e-74 MEAFLPVIVLFVLFFLNIPIGFALMGSALFYFMFLNTTMAMNMVIQQFVTAVESFPYLAV PFFIMVGSVMNYSGISEELMNMAEVLAGHMKGGLAQVNCLLSAMMGGISGSANADAAMES KILVPEMIKKGFSKPFSAAVTAASSAVSPVIPPGTNLILYALIANVPVGDMFLAGYTPGI LMTLAMMITVHIISVKRGYQPSRERMARPAEIGRQAIKSIWALAIPFGIILGMRIGMFTP TEAGGVAVFFCFIVGFFIYKKLKLYHIPIILMETVKSTGAVMIIIASAKVFGYYMTLERI PQMITEGLMNFTSSPVILLMVINILLLFVGMFIEGGAALVILAPLLVPAVKALGVDPLHF GVIFIVNIMIGGLTPPFGSMMFTVCSIVDVKLEDFIREVWPFILSLVVVLLVVTYSSSIA LFIPNLFR >gi|224461442|gb|ACDD01000060.1| GENE 22 23812 - 24297 642 161 aa, chain - ## HITS:1 COG:FN1257 KEGG:ns NR:ns ## COG: FN1257 COG3090 # Protein_GI_number: 19704592 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Fusobacterium nucleatum # 10 154 1 145 147 149 68.0 1e-36 MRDLLKKFELYLGSVFISVTVVVVIMNVFTRYFLKFTYFWTEEVAVGCFVWTIFLGTSAA YRERGLIGVEAIVVLLPKKVRKVVEFVTFLLLVIISAIMFYFSLTYVMGSSKITSALEIS YSYINSGIVLSFALMTIYSVIFAVQCFKEMITGKDCKEIEG >gi|224461442|gb|ACDD01000060.1| GENE 23 24365 - 25408 307 347 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 1 316 1 326 346 122 27 2e-27 MKKGFKFFCAMGLLALALVGCGGNKDAAAPEGEKKEARVIKVTTKFVDDEQTAKSLVKVV EKVNERSNGSLELQLFTSGTLPIGKDGMEQVANGSDWILVDGVNFLGDYIPDYNAITGPM LYQSFDEYLKMVRTPLVENLNKQAEEKGIKVLSLDWLFGFRNMITKKPVKTPEDMKGLKL RVPTSQLYTFTIEAMGGNPVAMPYPDTYAALQQGVIDGLEGSILSYYGTKQYENVKEYSL TRHLLGVSAVCISKACWDSLTDEERTIIQEEFDAGAQDNLTETIKLEDEYAQKLKDAGVT FHEVDADAFNKAVAPVYGMFPKWTPGIYDEIMKNLKEIREELAKEGK >gi|224461442|gb|ACDD01000060.1| GENE 24 25814 - 26881 1319 355 aa, chain - ## HITS:1 COG:FN0332 KEGG:ns NR:ns ## COG: FN0332 COG0598 # Protein_GI_number: 19703675 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Fusobacterium nucleatum # 1 354 1 350 351 417 59.0 1e-116 MPNSHASRSKKNGLPPGSIIYTGENPDHEVSITVIYYNQEIFEKQVFHSVDEFRFNRRFQ GNAWINIDGISDVNYIKKIGRYFHIDNLTLEDLANPEQRVKLEEREEYLFLILKMLSLNL ITEEIEYEQLSFILEDNILITFQETPKDVFDGIRYRLESDKTKIRSLSTGYLAYTLIDAI VDNYFVILDEVEKEIDNLESKVIDKSEKEDLENILELKQSISSLKRFIAPLRELVAKLQT RGMRGYFSEDMRIYLNDLYDHSIITFETVEMLNSRVHELVQLYHSTVSNDMNQIMKILAV ISTVFMPLSFLTGLYGMNFRYMPELESPIGYFVLLAFMVLLVLGMLFYFKKKKWI >gi|224461442|gb|ACDD01000060.1| GENE 25 26883 - 27593 302 236 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 15 233 16 230 236 120 35 7e-27 MDRFITELGYQEIAIRIIAAIFIGGIIGYEREKNNRPAGFRTHILVCLGAAITSIIQDRM RIDVLRLSVTHPEAMQAIKLDLGRLGAQVISGIGFLGAGSIMRERGTVEGLTTAAGIWAT GCIGLAIGWGFYSLTLIATLAVIITLITLKKLEVSWIAKQYNAKILVQYKQHIRGEDILE MSDYLKEINVKVLGITKEEVEKTALFTIRLKKNAKVSDILLNLASNEKIEQVRKED Prediction of potential genes in microbial genomes Time: Fri May 20 02:12:40 2011 Seq name: gi|224461441|gb|ACDD01000061.1| Fusobacterium sp. 
3_1_5R cont1.61, whole genome shotgun sequence Length of sequence - 2627 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 326 - 1750 1510 ## COG0591 Na+/proline symporter - Term 1718 - 1772 10.1 2 2 Tu 1 . - CDS 1782 - 2627 1339 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461441|gb|ACDD01000061.1| GENE 1 326 - 1750 1510 474 aa, chain + ## HITS:1 COG:FN0107 KEGG:ns NR:ns ## COG: FN0107 COG0591 # Protein_GI_number: 19703455 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Fusobacterium nucleatum # 1 468 1 469 482 624 71.0 1e-178 MAGIETFITFIIYLLFLMGIGVYFYTKTNTHEDYVLGGRGVGYWVTAMSAQASDMSGWLL MGLPGAVFLNGLTEIWVIIGLAAGTYANWKWVAPKLRVQTEETDTLTLPTFLTKRLGDPT GMIRTFSAIAILFFFTIYSSSGLVAAGKLFETILGIDYTWGVLIGGGTIIVYIFLGGYLA CCWTDFFQGVLMFFAITIVPVMAYFQGGGINGIEMAMRAREISLNIFSRTENIDIFIILS GLAWGLGYFGQPHILVRFMSIDKVEELWKSRLIAMIWVVISLVGAIAVGVTGIAVFPNIT ELNGDAEKIFIYMIAKLFNPWIGGILFAAILSAIMSTISSQLLVSSNTLTEDFYKYIKRT PSNKELMWVGRLSILVIFFIAGILSLNPDSKVLSLVSYAWAGFGAVFGPAILITLYKKTI HWKSVLLGMIVSAITVVVWKHTGLGNTLYEILPGFLVNTVIILCTNQYLKTERE >gi|224461441|gb|ACDD01000061.1| GENE 2 1782 - 2627 1339 281 aa, chain - ## HITS:1 COG:FN0735 KEGG:ns NR:ns ## COG: FN0735 COG5295 # Protein_GI_number: 19704070 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Fusobacterium nucleatum # 58 273 373 595 617 117 36.0 2e-26 EAGPKGDKGDKGDTGAVGPKGEPGKDGKDGKNGEGAKVLAGNNIKVDSKEKKQGEDKVIE NTISLKEDIKVKTVSADSINVGDVNISKSGINAGKQKITNVADGKADSDAVNVKQLNEVK KEVKENTKEMTKKLYHLGEEIDGVRSEARGIGALSASLAALHPMQYDKAKPNQVMAGVGT YRDKQAVAVGMTHYFTENLMMTAGVSLAETSNTKAMANVGVTWKFGSKEEGEDIKISEDV ILKEQLGKLTMENRNQKQENLELKSRVEKLEQKLEAILNQR Prediction of potential genes in microbial genomes Time: Fri May 20 02:12:41 
2011 Seq name: gi|224461440|gb|ACDD01000062.1| Fusobacterium sp. 3_1_5R cont1.62, whole genome shotgun sequence Length of sequence - 1624 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1623 2166 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461440|gb|ACDD01000062.1| GENE 1 3 - 1623 2166 540 aa, chain - ## HITS:1 COG:SMc01708 KEGG:ns NR:ns ## COG: SMc01708 COG5295 # Protein_GI_number: 15964211 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Sinorhizobium meliloti # 291 464 475 648 1291 69 37.0 2e-11 SDRQNVMGSDNEITGRDQGTNSGKKRTNVDTIIGGGNKIRGNNTYMKGYESLTVIGNNNE SVNPSSGIVIGDNQQIGTIDETVVIGSMRPEDKKDGNNAQGHRSVIIGYQAGGKDERCSG GFNVAIGHSARVDGWMGAVTGYNSHIKANDGHFLSIYGAENKISSNIRPEWVNMGAYANS IVGSWNNIEDSNNSMIFGAGNKVSHAMSITEKVEEVNGNGPRLSWRSQDGEAYSDISNKD MADLAMLNGGSVMTLGNANVIDYAIRSQVLGTGNILKGTNTKESTMNSINGYRNIGTNIK NMSLLGNGNKVSETKNGVVIGDYHELNGGNNNIILGSMETREEEETRTYLKDGEPAKYKV KKQVAVKKHKDNISNAVMIGYNTDVEKDGGVALGSEAVSNIDKEIVGWDVSKEKASMEST PIWKSTRAALSVGDVENKVTRQITGVAAGRADTDAVNVAQLKSVLSHPFHVFSGGNASTK GTDISNGTDLTFYKMNWEFRDGLKAAVEGEGENRRVVVSLDKENLKKDPDFKGTKGDKGD Prediction of potential genes in microbial genomes Time: Fri May 20 02:12:56 2011 Seq name: gi|224461439|gb|ACDD01000063.1| Fusobacterium sp. 3_1_5R cont1.63, whole genome shotgun sequence Length of sequence - 41414 bp Number of predicted genes - 46, with homology - 40 Number of transcription units - 11, operones - 6 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 320 - 404 68.6 # Tyr GTA 0 0 + Prom 197 - 256 11.9 2 2 Tu 1 . + CDS 322 - 435 287 ## - TRNA 423 - 497 66.8 # Glu TTC 0 0 - TRNA 502 - 577 81.3 # Thr TGT 0 0 3 3 Op 1 . - CDS 684 - 1511 964 ## COG0731 Fe-S oxidoreductases 4 3 Op 2 . 
- CDS 1544 - 2230 732 ## COG0588 Phosphoglycerate mutase 1 - Prom 2250 - 2309 10.7 + Prom 2241 - 2300 7.3 5 4 Op 1 . + CDS 2329 - 2511 191 ## - TRNA 2330 - 2403 67.5 # Cys GCA 0 0 - TRNA 2419 - 2494 87.4 # Phe GAA 0 0 6 4 Op 2 . + CDS 2499 - 2627 364 ## - TRNA 2500 - 2577 93.9 # Asp GTC 0 0 - TRNA 2584 - 2659 97.4 # Val TAC 0 0 7 5 Op 1 1/0.000 - CDS 2707 - 3579 905 ## COG0470 ATPase involved in DNA replication 8 5 Op 2 1/0.000 - CDS 3579 - 4601 1399 ## COG1077 Actin-like ATPase involved in cell morphogenesis 9 5 Op 3 8/0.000 - CDS 4614 - 4991 171 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 10 5 Op 4 1/0.000 - CDS 4979 - 6400 2041 ## COG0215 Cysteinyl-tRNA synthetase 11 5 Op 5 1/0.000 - CDS 6462 - 7160 346 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 12 5 Op 6 . - CDS 7136 - 9472 2873 ## COG1193 Mismatch repair ATPase (MutS family) - Prom 9494 - 9553 12.0 + Prom 9558 - 9617 7.9 13 6 Tu 1 . + CDS 9654 - 9911 390 ## PROTEIN SUPPORTED gi|237745230|ref|ZP_04575711.1| LSU ribosomal protein L28P + Term 9925 - 9968 4.2 - Term 9965 - 10005 5.1 14 7 Op 1 . - CDS 10025 - 11524 2264 ## COG1288 Predicted membrane protein 15 7 Op 2 . - CDS 11583 - 12572 1380 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 12600 - 12659 16.0 - Term 12641 - 12691 8.4 16 8 Op 1 11/0.000 - CDS 12709 - 12927 346 ## PROTEIN SUPPORTED gi|237736139|ref|ZP_04566620.1| SSU ribosomal protein S18P 17 8 Op 2 . - CDS 12962 - 13246 400 ## PROTEIN SUPPORTED gi|197736538|ref|YP_002165316.1| ribosomal protein S6 18 8 Op 3 35/0.000 - CDS 13341 - 14420 1768 ## COG0206 Cell division GTPase 19 8 Op 4 . - CDS 14444 - 15703 1618 ## COG0849 Actin-like ATPase involved in cell division 20 8 Op 5 . 
- CDS 15693 - 16379 686 ## FN1453 hypothetical protein 21 8 Op 6 6/0.000 - CDS 16393 - 17259 1407 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 22 8 Op 7 11/0.000 - CDS 17273 - 18112 1264 ## COG0812 UDP-N-acetylmuramate dehydrogenase 23 8 Op 8 26/0.000 - CDS 18125 - 19468 1686 ## COG0773 UDP-N-acetylmuramate-alanine ligase 24 8 Op 9 4/0.000 - CDS 19476 - 20543 1368 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 25 8 Op 10 28/0.000 - CDS 20557 - 21858 1455 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 26 8 Op 11 28/0.000 - CDS 21874 - 22959 1437 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 27 8 Op 12 . - CDS 22959 - 24242 1458 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 28 8 Op 13 . - CDS 24245 - 24808 610 ## COG0241 Histidinol phosphatase and related phosphatases 29 8 Op 14 . - CDS 24855 - 26006 1300 ## COG2404 Predicted phosphohydrolase (DHH superfamily) 30 8 Op 15 . - CDS 26018 - 28576 3109 ## COG0495 Leucyl-tRNA synthetase 31 8 Op 16 1/0.000 - CDS 28615 - 29328 371 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 32 8 Op 17 . - CDS 29338 - 30606 1979 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 33 8 Op 18 . - CDS 30639 - 31463 899 ## gi|257452953|ref|ZP_05618252.1| hypothetical protein F3_07793 34 8 Op 19 1/0.000 - CDS 31473 - 32480 710 ## PROTEIN SUPPORTED gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 35 8 Op 20 14/0.000 - CDS 32493 - 32969 686 ## COG2137 Uncharacterized protein conserved in bacteria 36 8 Op 21 1/0.000 - CDS 33019 - 34098 1602 ## COG0468 RecA/RadA recombinase 37 8 Op 22 . - CDS 34070 - 35140 1213 ## COG0859 ADP-heptose:LPS heptosyltransferase 38 8 Op 23 . 
- CDS 35098 - 35787 461 ## FN0545 lipopolysaccharide core biosynthesis protein RfaY 39 8 Op 24 11/0.000 - CDS 35784 - 36791 785 ## COG0859 ADP-heptose:LPS heptosyltransferase 40 8 Op 25 3/0.000 - CDS 36788 - 37819 1184 ## COG0859 ADP-heptose:LPS heptosyltransferase 41 8 Op 26 3/0.000 - CDS 37807 - 38586 540 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 42 8 Op 27 . - CDS 38595 - 39674 907 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 39695 - 39754 14.9 - Term 39730 - 39773 7.8 43 9 Tu 1 . - CDS 39790 - 39993 422 ## - Prom 40047 - 40106 8.5 + Prom 39975 - 40034 6.1 44 10 Op 1 . + CDS 40099 - 40227 428 ## - TRNA 40100 - 40177 93.9 # Asp GTC 0 0 - TRNA 40184 - 40259 97.4 # Val TAC 0 0 45 10 Op 2 . + CDS 40252 - 40440 141 ## + Term 40674 - 40700 -0.6 - TRNA 40269 - 40344 87.4 # Phe GAA 0 0 - TRNA 40363 - 40446 66.2 # Ser TGA 0 0 - TRNA 40449 - 40523 66.8 # Glu TTC 0 0 - TRNA 40545 - 40621 98.9 # Met CAT 0 0 - TRNA 40640 - 40716 89.8 # Arg TCT 0 0 - TRNA 40725 - 40800 94.1 # Lys TTT 0 0 - TRNA 40804 - 40879 93.2 # Gly TCC 0 0 - TRNA 40892 - 40968 82.4 # Met CAT 0 0 - TRNA 40977 - 41064 70.9 # Leu TAA 0 0 46 11 Tu 1 . 
- CDS 41140 - 41412 284 ## gi|257466656|ref|ZP_05630967.1| excinuclease ABC subunit C Predicted protein(s) >gi|224461439|gb|ACDD01000063.1| GENE 1 2 - 223 264 74 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 72 1 72 407 106 69 3e-22 MAKEKYERSKPHVNIGTIGHVDHGKTTTTAAISKVLSDLGLAQKVDFDKIDVAPEERERG ITINTAHIEYETEK >gi|224461439|gb|ACDD01000063.1| GENE 2 322 - 435 287 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVREGFEPSKAEPSDLQSDPFGRSGTSPLYNIVSGTP >gi|224461439|gb|ACDD01000063.1| GENE 3 684 - 1511 964 275 aa, chain - ## HITS:1 COG:FN0127 KEGG:ns NR:ns ## COG: FN0127 COG0731 # Protein_GI_number: 19703472 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 1 275 1 282 284 237 46.0 2e-62 MARYVFGPVPSRRLGISLGMDIVVPKTCNLNCVFCECGPTKDWTIERQHFISYDEFIQEL EEALTDVVPDYVTFSGSGEPTLSLDLGKIIRYIKKEHPSIKIAVITNSLLLHREDVLEEI QEADLIMPSLHTVRQEIFEKIVRVYPNYRIETVLEGLQKLCSCFRGNIDLELFLIEGLNT SFSDLKAYATFVKTLSYRKLQLNSLDRPGTESWVKPVPYHKLLEIKEYLEQEGLSGVEII GKFNINQKITEDESRMKAMKERRKYTEEEIKSLYK >gi|224461439|gb|ACDD01000063.1| GENE 4 1544 - 2230 732 228 aa, chain - ## HITS:1 COG:FN0729 KEGG:ns NR:ns ## COG: FN0729 COG0588 # Protein_GI_number: 19704064 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Fusobacterium nucleatum # 1 228 1 228 228 302 61.0 4e-82 MKLVLVRHGQSEWNLQNRFTGWADVDLSETGIREAKEAGRELLAQKIDFDLCFTSYQKRA IKTLQYILEELDALYLPIIKTWKLNERHYGALQGLNKSETAKKFGEEQVHIWRRSFDIQP PAMEKEDKRSPRYDKRYRDLKEEEIPLSESLKDTIVRVLPYWNEVIAPEIKKGKNILIAA HGNSLRALVKHLLKISDEKIMELNLPTGKPLIFEITEELEIVEAPKLF >gi|224461439|gb|ACDD01000063.1| GENE 5 2329 - 2511 191 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEATTRFELVMEILQTSALPLGDVAIIYIKWCPEAESNHRHGDFQSPALPTELSGHPWRE >gi|224461439|gb|ACDD01000063.1| GENE 6 2499 - 2627 364 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGVTRLELATSCVTGRRSNQTELHPQDMVVAIGLEPMTLCL 
>gi|224461439|gb|ACDD01000063.1| GENE 7 2707 - 3579 905 290 aa, chain - ## HITS:1 COG:FN1576 KEGG:ns NR:ns ## COG: FN1576 COG0470 # Protein_GI_number: 19704897 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Fusobacterium nucleatum # 1 289 1 287 289 152 34.0 9e-37 MLEDWIRQDISKNKKSGTYLFYGEDSSRLEKAVLSFAKALCCPEEKDYYCDSCSVCNRIQ KGVYADVHVLENLKIEDIREAETSFHESSYEGERKIFILPNIQDLRKESANALLKSIEEP GEGTFFLLWSTRKNILATIRSRAIQVFVPRVNYQELGVSKECYHFFEGNEQDILNCLKEN INWQEHQSYRNIQKNIVSYLETQQTSSKVKVYQSLIDFLEVKENLSVVEILWFIEELVGS PCERKDFAWIFHYCLMQERYQGKLEEKLILSKMLNFPINNKVLFANLFLK >gi|224461439|gb|ACDD01000063.1| GENE 8 3579 - 4601 1399 340 aa, chain - ## HITS:1 COG:FN1577 KEGG:ns NR:ns ## COG: FN1577 COG1077 # Protein_GI_number: 19704898 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 4 340 6 342 342 546 86.0 1e-155 MAFFKLNRGLGIDLGTANTLVYSKKHKRIVLNEPSVVAVERETKKILAVGNEAKEMLGKT PDSIVAVRPLSEGVIADYDITEAMIKYFIKKVFGSYSFFMPEIMICVPVDITGVEKRAVL EATISAGAKRAYLIEEARAAALGAGMDISAPEGNMIIDIGGGSTDIAVISLGGTVVSKTI RIAGNNFDSSIIKYVKKTHNLLIGDKTAEEIKIKIGTALPLEEEETMEVKGRDLMMGLPK TVTISSEEIREAIMDSLMEIVRCIKSVLEQTPPELASDIVDKGMVMTGGGSLIRNFPEMV EKYTSLKVTLAENPLESVVRGSGLALEQVKVLRKIEKAER >gi|224461439|gb|ACDD01000063.1| GENE 9 4614 - 4991 171 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 8 114 12 121 141 70 34 2e-11 MESVDIREMSGLALAYLGDTVWETQVRLYWVKKGFNISHLNYKVKKFVNAKAQSHYYQLL KEELSEEENAIMRRAKNANIRSFPKSCSNQEYREATAFEAILGAWFLQGEIDKIQAFANR ILEKE >gi|224461439|gb|ACDD01000063.1| GENE 10 4979 - 6400 2041 473 aa, chain - ## HITS:1 COG:FN1579 KEGG:ns NR:ns ## COG: FN1579 COG0215 # Protein_GI_number: 19704900 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 471 1 471 
473 666 68.0 0 MIKIYNTLSASLDTFTPRKEKEVSMYVCGPTVYNYIHIGNARPAIVFDTVRRYFEYRGYK VTYVQNFTDVDDKMIKRANEEGTTVEDVAHRYIQAYLEDMKSLHIKEEGMIRPKATEHIQ EMIDMIQNLIDKGHAYESNGDVYFRVATYHQEYGALSKQKIEDLQSGARIEVTEIKESPL DFALWKASKPGEPSWKSPWGEGRPGWHIECSAMSNKYFGNSFDIHGGGQDLIFPHHENEI AQSKCSCGGSFANYWMHNGYINIDGVKMSKSLGNFVLLRDILKHFSGKVIRFFMLSAHYR KPMNFSDAELSQAKIALERIENSLIRAHEISETSIALEGSAGVELKKALEDTKGKFIEAM DEDFNTAQAIGVIFELVRELNKTLDSSYNQEAYVIVKETADYLYHILYDVLGIEVEVETK VENLTVDLVEFILELRREARAEKNWALSDRIRDRLAELGIQIKDGKDSTTWRV >gi|224461439|gb|ACDD01000063.1| GENE 11 6462 - 7160 346 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 10 227 2 216 234 137 36 8e-32 MHCGYSKIKKKMSFILACAGIGKRMKLGYPKQFLEYDGKPLFLKPLLCAEQSEYVDEIII VSQEEYLEDIKTLCQKEGIHKLKAVVTGGRERQDSIFAALKKVSIDMDYVMVQDAVRPFC KEKYIRESYEQLEAGYMGTVVGVAVKDTIKEITEDGFVKNTPKRSSLFAAHTPQAFQKEI LKEAYEKAYQDKFLGTDDASLVERLQLSIKIIVGDYDNIKITTPEDLKILNP >gi|224461439|gb|ACDD01000063.1| GENE 12 7136 - 9472 2873 778 aa, chain - ## HITS:1 COG:FN1581 KEGG:ns NR:ns ## COG: FN1581 COG1193 # Protein_GI_number: 19704902 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 1 778 1 778 778 1014 71.0 0 MSIHSHRVLEFDKLKEKVMTYLAIEKNVEEIINLKPFTDLSSLQQEFVYVQDCMDFMQYD GGLDVRHLKDICALTEKIKLIGTYLEVDELWDININLRFFRIFQTQLEDLGKYKALRDYM KQVSPLRLIEDLISKAIDAEKQIKDDASLDLRDIRIHKKVLAQNIRRKFDELFEEPSVSA AFQERIITERDGRMVVPVKLDFKGLIKGIEHDRSSSGQTVFIEPLSIVSLNNKMRELETK EKEEIRKILLRLSEQIRNHQDEIYKIGNMILYIDRLQAKANFGLEEACHVPMVQGKEILY LEKARHPFIPKEKVVPLTFEIGKDYKILLITGPNTGGKTVALKTAGLLTLMALSGIPIPA SQNSRIGFFQGVFADIGDEQSIEQSLSSFSAHVTNLQDILEQVHRNCLVLLDELGSGTDP TEGSAFAMSIIDYLKEKKCNSIITTHYSEVKAHGYNEEGIETASMEFDTTTLSPTYRLLM GIPGESNALTIAKRLGIPQEIIEKAQSYISEDNKKIELMINNIKNKSESLDKMQAELTGL REAAKMNQEKWEEERKALEREKNEILKKAYEDSEKMMNEMRAKASALIEKIQKEEHSKEQ AKQIQKNLNMLSSALKEEKNKTITLNKTMKKKAHFKEGDRVFVKNINQFATVLKINAMKE 
SAQVQAGILKLEVPFEEIRVTEEKKEKTYQVQVHKKIAVRSEIDLRGKMVEEGIHELETY LDRALLNGYHEIYVIHGKGTGALRNGILEYLKTCPYVKDYRIGGHGEGGLGCTVVTLK >gi|224461439|gb|ACDD01000063.1| GENE 13 9654 - 9911 390 85 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237745230|ref|ZP_04575711.1| LSU ribosomal protein L28P [Fusobacterium sp. 7_1] # 1 85 1 85 85 154 87 6e-37 MQRCEITGTGIISGNKISHSHRLTRRVWKPNLQVTTILVNGNPIKIKVCSRTLKSLKGAS EVEIMNILKANAATLSERLKKHLSK >gi|224461439|gb|ACDD01000063.1| GENE 14 10025 - 11524 2264 499 aa, chain - ## HITS:1 COG:FN0023 KEGG:ns NR:ns ## COG: FN0023 COG1288 # Protein_GI_number: 19703375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 1 499 1 499 499 677 71.0 0 MKKWKIPDTFVIIFFVVLLAGFLTHVVPVGSFDMKDITYTTSDGAEKTKSVPVAGSFHYA LDEQGQPLVKGIKVFEPGGEIGLTNYVYEGLVSGDKWGTAVGVVAFILVIGGAFGIILKT GAVETGLYALISKTKGSEILIIPLVFILFSLGGAVFGMGEEAIPFAMILVPIIIGLGYDS ITALMITYCSTQIGFATSWMNPFSVAVAQGVAGIPVLSGSGFRIFMWIFFTAVGTIFTMR YAKKVKATPNLSVAYETDKYYREDYKAEATEGQKFTLGHKLVLLVVVLGMIWVIWGVIKQ GYYLPEIATQFVIMGIISGIIGVVFHLNDMTTNDMASSFRKGAEELVGAALVVGMGKGIV LVLGGTSAGEPSVLNTILNWVATGMEGMHSAFSAWVMYIFQSCFNFFVVSGSGQAALTMP IMAPLSDLLGVTRQVAVLAFQLGDGFTNLIVPTSGLLMAILGVARLDWGTWVKFQWKFQA LLFILGSIFVIGASLVNFS >gi|224461439|gb|ACDD01000063.1| GENE 15 11583 - 12572 1380 329 aa, chain - ## HITS:1 COG:FN0511 KEGG:ns NR:ns ## COG: FN0511 COG1052 # Protein_GI_number: 19703846 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 2 328 5 331 335 399 62.0 1e-111 MRVLFFDAKSYDKENFDAYKEKYGFDIKYLKVKLNEETVDFVKGYEIISIFVNDTVNPPV IDKLIEYGVKLIVLRCAGYNNVDVNYINGRIKLVRVPAYSPYSVAEYTASMVMTLNRKIH KAYVRTREGNFSINGLMGFDLHKKTVGVIGAGRIARIFIKIMRGFDARVIAYDPYPNESF ARDLGYEYVDLDTLYRESDIISLHCPLTRENTYLINRESMKKMKDGVMIVNTGRGRLIDT IDLIEALKDKKVGAAALDVYEEEAGYFFEDMSSSIIEDDILGRLLSFNNVLLTSHQAYFT KEAFRDITLTTLENIQSFLQGKELENEIK 
>gi|224461439|gb|ACDD01000063.1| GENE 16 12709 - 12927 346 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736139|ref|ZP_04566620.1| SSU ribosomal protein S18P [Fusobacterium mortiferum ATCC 9817] # 1 72 1 72 72 137 97 8e-32 MAEFRRRRAKLRVKAEEIDYKNVDLLKRFVSDKGKINPSRLTGANAKLQRKIAKAIKRAR NIALIPYTKIEK >gi|224461439|gb|ACDD01000063.1| GENE 17 12962 - 13246 400 94 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|197736538|ref|YP_002165316.1| ribosomal protein S6 [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 94 1 94 94 158 84 4e-38 MKKYEIMYIISPTVLEEGRDAIIEKVSELLTSNGANILKTEKWGERKLAYLIDKKKTGFY VLTTFEIDGTKLAEVESKLNITEEVMRYIVVKQD >gi|224461439|gb|ACDD01000063.1| GENE 18 13341 - 14420 1768 359 aa, chain - ## HITS:1 COG:FN1451 KEGG:ns NR:ns ## COG: FN1451 COG0206 # Protein_GI_number: 19704783 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Fusobacterium nucleatum # 21 359 22 359 360 377 64.0 1e-104 MLIEQDLVKIKVLGAGGAGGNAINDMISSGVGGVEYIAANTDSQDLNKSLADSRLQLGEK LTRGLGAGADPSIGKQAAEEDIDKIKQLLEETDMLFITAGMGGGTGTGAAPVIARVAKEL GILTVAIVTRPFSFEGKKRKNNADLGVRQLKETVDALVIIPNDKLFELPDKTITLQNAFK EANNILKIGIRGVADLMIGNGLINLDFADVRATMLNSGIAVLGFGEGEGENRAMKATEKA LQSPLLEKSIQGASKILINITGSPDITLMEAQTISETVRDAAGKTAEDVMFGLVVDPEVG DKVLVTIIANNFVDETQDAEPFINLKPQGNKEEEMLTENKEAHYNDDDIDLPPWLRSKK >gi|224461439|gb|ACDD01000063.1| GENE 19 14444 - 15703 1618 419 aa, chain - ## HITS:1 COG:FN1452 KEGG:ns NR:ns ## COG: FN1452 COG0849 # Protein_GI_number: 19704784 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Fusobacterium nucleatum # 2 372 3 377 447 248 38.0 2e-65 MEDNITKLIMDIGNSHIKLLVGEVSTDFTKIKVLQYVEVPTKGMKKSVVESSDELSYAIQ KALNSLDNPEHREIDKVTIGVGGKYIQSKTRKLSIEFEEREVQESDLERLYELAEECLEP EDLVLKREMYNIKINNAGIVKNPIGLVASRLEANVHLIYVDREDIEKMTDAIVEAGFDIE NIYLNAYASLKSTLVDEESTKMGVALVDIGEGVTDIIISKNHKIIYSKSANLGGIHFMSD 
IMYLFHVSEEEAREVYSSYMKGEMGEQYISSSGKCFVKEDVEKIIDARIGDIATFILNTI QESGFTGYLGQGMVLTGGVASLDRLVGKINAQTGGIVRRKKPLPIRGLEKPEYRMATVVG LFLEAIEEEMEAQQKRIYEAMREEEVEDDLEELLEDTRSEKKSSGETFGKIKKWISYFI >gi|224461439|gb|ACDD01000063.1| GENE 20 15693 - 16379 686 228 aa, chain - ## HITS:1 COG:no KEGG:FN1453 NR:ns ## KEGG: FN1453 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 46 224 1 189 191 122 38.0 1e-26 MLLRLSFISLIIWFIYMIPSQFLNLDIFKIKKINIGENSKILNEELSAVAEKIYDKSIWQ IDMKKLKQELSKDIRLESVEISHDKVGELNFKVEEKELLYYAQIGERIYLMDKKGEVFGY FNERDKMSLPLLVSKDGKNVSSLVEVLSNLQEYSFYDSISQIYEVDRNRIDIILIDGTKI FTNTSVDKKKYKVAMALYFEIIKNKKIAYMDLRFQDFIIRYVEDDNGR >gi|224461439|gb|ACDD01000063.1| GENE 21 16393 - 17259 1407 288 aa, chain - ## HITS:1 COG:FN1454 KEGG:ns NR:ns ## COG: FN1454 COG1181 # Protein_GI_number: 19704786 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Fusobacterium nucleatum # 1 287 1 286 287 368 65.0 1e-102 MRIAVFMGGVSSEKEVSIRSGEAILESLQRQGYDAYGVVLTEKNMISAFQEEQYDLAYLA LHGGAGENGEIQSVLELLGKKYTGSGVAASAISMDKLLTKKIASLEGVRMAKTFTTVAEI SRYPVMVKPSKDGSSVGIHVCNNQEEVEKALQEISGYAMIEEYIQGEELTVGVLNGKALG VLKIIPQAADIYDYESKYAAGGSIHEFPARIAKIAYEEAMVNAVKIHEALGMKGVSRSDF ILKDDQVYFLEVNACPGMTKTSLVPDLATLQGYTFDDITRILVEDALA >gi|224461439|gb|ACDD01000063.1| GENE 22 17273 - 18112 1264 279 aa, chain - ## HITS:1 COG:FN1455 KEGG:ns NR:ns ## COG: FN1455 COG0812 # Protein_GI_number: 19704787 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Fusobacterium nucleatum # 1 279 1 281 281 333 65.0 2e-91 MKVLEQQIMKEYSNMKIGGKAKRLIIVENKEEMKEAYEKYDSLLLLGNGTNLLLNDGYLD YNFVSTEKLNRIEKLEKNRVYVEAGVDLDTLLAFMEKENLSGIEKMAGIPGSIGGLTYMN GGAFGTEIFDFIDEIEVLTERNIIQSIKKKDLYVRYRKTEIQEKKWIVLSVIFQFQTGFD KSTVEEIKKSREEKHPLDKPSLGSTFKNPEGDFAARLISEAGLKGRKVGGAQIAEKHPNF VLNLGEATFQDILDTLDLVKKTVKEKFGVQLEEEIIIIR >gi|224461439|gb|ACDD01000063.1| GENE 23 18125 - 19468 1686 
447 aa, chain - ## HITS:1 COG:FN1456 KEGG:ns NR:ns ## COG: FN1456 COG0773 # Protein_GI_number: 19704788 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Fusobacterium nucleatum # 1 446 9 464 468 518 58.0 1e-147 MEKIYFVGINGIGMSGLAKIMKCQGYDVVGADLARNYVTEELESLGITVYPEHKACQMQG RDSLIASSAIHSDNPEFQYAKQHNIPLMKRGELLATLLNNKVGIAVAGTHGKTTTSSMMS AVMLSLDPTIVVGGILPEIGSNAKVGMGEYFIAEADESDNSFLFMKPKYAVVTNIEEDHL ETHGNLENIEKSFRQFVEQTERKVLVCTDCANVRTVFSESEKIMTYGMDYEANIMAKNVE IVNGKTSFEVLIQGESQGRFYISIPGKHNILNSLPVIYFSLLFGVPKEEIQDKLLHFRGS KRRYDVLYWDQENNRKIIDDYAHHPTEIQATLKGVKSIEKGKIIGIFQPHRYSRVHFLLE RFAHCFEGLDELILLPIYSAGEQNESGVSEKDIAKIIPTIPVTCIESKERVVERIMEETR EDNHIFIFMGAGDISKLAHEVADRLQK >gi|224461439|gb|ACDD01000063.1| GENE 24 19476 - 20543 1368 355 aa, chain - ## HITS:1 COG:FN1457 KEGG:ns NR:ns ## COG: FN1457 COG0707 # Protein_GI_number: 19704789 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Fusobacterium nucleatum # 1 355 4 357 357 429 58.0 1e-120 MKKVILTTGGTGGHIYPALAVAEGLRNKGIETLFIGSSTRMEKDIVPKANFRFIGLDIHP PRSMKTAMKYLKSFVHAYHILKEEEPDAVIGFGNYISVPVLTMAFLLRKKIYLQEQNADL GFANRLFYRFAQFTFLAFEHTYNTVPIKYQKKFIVSGNPLRSEIHEVNYEEARERLKVQK DEKVLLITGGSLGAQEINNAVLKYWEHFFQAKNVRVYWATGKQNYEEVQEKVKRAKMTDT IKDYFENMIHIMAASDLVVCRAGALTISELIALQKPAVIIPYSSQKVGQYQNAKILEERH SAVIYTNQESEQAIEKVIELLNNEEELRTMGIRMRSLQTPHAVNTIISNLDIWRD >gi|224461439|gb|ACDD01000063.1| GENE 25 20557 - 21858 1455 433 aa, chain - ## HITS:1 COG:FN1458 KEGG:ns NR:ns ## COG: FN1458 COG0771 # Protein_GI_number: 19704790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Fusobacterium nucleatum # 1 433 23 454 454 474 58.0 1e-133 MKKAMVLGMGISGNGAKTLLEKEGYMVIAVDDKLAMSSEEAMKYLDDIEVFIKSPGVPYT PLVKAVQEKGIKVQDEIEIAYQYMVKTNRNMTIVAVTGTNGKTTTTSKIAELLNYAGKKA AAAGNIGRSFVDVLLSEENYEYAVLELSSYQLENVYEFTPYISLVTNLTPDHLTRYETLK 
DYYDTKFRICQNQKEENSFFLYNIDHEELRKRENLMKGKKISLSKEQDADTCVRNGKIIF QEEEIMQVSELSLKGNHNLENSLFIITAGKLLGLDTKVIREFLMNTEPLEHRMERCFQYG KVQFINDSKGTNIDSAKFALEAYPGCILICGGFDKKVDLNPLADIIVKQVKEVYLIGVIA EKIKSLLLERNYPVEKIYSLETIENSLLDMKKRFTKEDEELILLSPATSSFDQFKSFEHR GQVFKELVCKIFG >gi|224461439|gb|ACDD01000063.1| GENE 26 21874 - 22959 1437 361 aa, chain - ## HITS:1 COG:FN1459 KEGG:ns NR:ns ## COG: FN1459 COG0472 # Protein_GI_number: 19704791 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Fusobacterium nucleatum # 1 361 1 360 361 422 67.0 1e-118 MLYFLASYSTELGFLKSIYLRDFISFSLSFLLVLFLGKPFIHYLQKKKFGETIRQEGPAS HMSKKGTPTMGGVLIIFSLLLTTLLVADISNAFIGLLMISTLIFAGIGFIDDYKKFTVNK KGLAGRKKLLGQSIVAVIVWVYIKYMGLTGDTSVDFSVVSPSNPRWMLYLGGIGMLIFIL LVILGASNAVNITDGLDGLAIMPTVICSTILGVIAYFTGHIELSSHLQLYYTSGIGEITI FLAAICGSGLGFLWYNCYPAQIFMGDTGSLSLGGILGVVAVLLKQELLLPIIGAVFVLEA VSVILQVGSFKMRGKRIFRMAPIHHHFELGGLAETKVTMRFWIITILLGIFALGLIKLRG I >gi|224461439|gb|ACDD01000063.1| GENE 27 22959 - 24242 1458 427 aa, chain - ## HITS:1 COG:FN1461_2 KEGG:ns NR:ns ## COG: FN1461_2 COG0770 # Protein_GI_number: 19704793 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Fusobacterium nucleatum # 17 423 3 414 416 385 51.0 1e-107 MERLCQLLQKKFPSLPKIKIQNVVMDSRKITEGSLFFAIQNGNQYVQEALDKGASLVIAD RYSGNHEKVIKVENTILVMQELAEEYRRCLKTKMIAITGSNGKTTTKDIIYAILSKTFQT KKTLGNYNNHIGLPFTILNLEEKDEFAVLEMGMSSFGEIDLLGKIARPDYGIITNIGDSH LEFLKTRENVFKAKTELLPYLPEGCFISSGDDVFLKKIPAIHVGYDERNDYRIFGYQKKD RRSSFQLNDKQYEIPLEGKHNVMNAAMGIAIAEMIGMDSKTIQQNLLQIELSPMRFERSE YQGTKYINDAYNASPISMGVALDSLVETTAECKIAVLGDMLELGEKEVTFHKDVIEKAIS CSLQAILLYGPRMKKALQEFSNIPNKVLHFEKKEEIKDYLKQFPRKTVLIKASRGMKLEE IIEREEK >gi|224461439|gb|ACDD01000063.1| GENE 28 24245 - 24808 610 187 aa, chain - ## HITS:1 COG:FN1461_1 KEGG:ns NR:ns ## COG: FN1461_1 COG0241 # Protein_GI_number: 19704793 # Func_class: E Amino 
acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Fusobacterium nucleatum # 1 187 5 189 197 192 51.0 2e-49 MKKAIFLDRDGTLNIEKEYLYQEKDLEFEKGVIEALSIFRDLGYLLIVVTNQSGIARNYY TEEDLEIFHQAFQRRLSFFGLKIDKFYYCPHHPEKGIGKYKQDCFCRKPKPGMLEKGIAE FDVDRNLSYMVGDKYADIQAGRAARISPILVRTGYGKEEEQKLQLGEAKVFDTLLAFAHY IKQRERL >gi|224461439|gb|ACDD01000063.1| GENE 29 24855 - 26006 1300 383 aa, chain - ## HITS:1 COG:FN1601 KEGG:ns NR:ns ## COG: FN1601 COG2404 # Protein_GI_number: 19704922 # Func_class: R General function prediction only # Function: Predicted phosphohydrolase (DHH superfamily) # Organism: Fusobacterium nucleatum # 1 357 1 357 358 390 51.0 1e-108 MADIVCDTRSQKRKKPLVVVVTHGDADGLVAAAIVKAFEERINPEQSFLIFSGMDVTEEQ TEKLFDYICKYNDLGIRDKIYILDRPIPPLGWLSMGYVCDVPMIHIDHHITNHPDTYTFD ERGKYILHHWSEEESAAFLSLEFFKPLQEKAEVFKKLYNTFYDLAKATSEWDTFHWKQLG ETTNDLLWKKKALSINAAEKLLGSVGFYRAIQERIGEEDYSQDLFTYFFRLQDAYDHQFQ NAYEFAKRSVTEYIFKSHRIGVIYGVDVNYQSMIADYLFLDKKYHFDVIAFVNIYGTVSF RGKGNFDVSILAQKLGEFCGHSGGGHKNASGCKIYNRDRFKENLLELFYESMDALKLGNF NKKGRGPKKSLFFIKKKYTKGKI >gi|224461439|gb|ACDD01000063.1| GENE 30 26018 - 28576 3109 852 aa, chain - ## HITS:1 COG:FN1517 KEGG:ns NR:ns ## COG: FN1517 COG0495 # Protein_GI_number: 19704849 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 852 1 857 859 1367 75.0 0 MKEYVFKEVEKKWQERWEKDQVFKGSNTVEGKENYYVLEMLPYPSGKLHVGHARNYTIGD VIARYKRMKGYNVLHPMGWDSFGLPAENAAIQNGAHPAKWTKSNIENMKRQLKLLGFSYD WDREIASYTPEYYKWNQWIFKKMYEKGLVYKKKSLVNWCPDCKTVLANEQVEDGKCWRHS KTAVIQKELEQWFFKITDYADELLEGHEELRGGWPEKVLTMQKNWIGKSFGTEVVFQVVE NNTDLPVFTTRVDTIYGVTYAVVAPEHPIVDEILKANPAIKSAVMAMKNMDVIERAAEGK EKNGIDTGWHVKNPYNGVEVPLWIGDYVLMNYGTGAVMAVPAHDERDYAFAKKYNLEIKS VIFPKEGEIALPFVEDGLVQNSAEAFNGIPNREALVKMAEFGEEKGFAKRTFKYRLKDWG VSRQRYWGTPIPVLYCEKCGEVLEKDENLPVMLPEDIQFSGNGNPLETSESFKNATCPCC GGPARRDTDTMDTFVDSSWYFLRYCDAQNKDLPFDKKIVDGWTPVDQYIGGVEHAVMHLL 
YARFFHKMLRDLGYLSSNEPFKRLLTQGMVLGPSYYSAAENRFLFAEEVELKGEKAFSKK TGEELVVKVEKMSKSKNNGVDPEEMILKYGADTTRLFIMFAAPPEKELEWNENGLAGAYR FLTRVWRLVLENQDHISLEKIDYTAINKADKALIIKLNQTIKKVTESIEDDYHFNTSIAA TMELLNDVQAYQSDSTQYTRVLGEALKQIVIMLSPFVPHFCDELWESIGETGYVSEQEWP VYDEKYITTDDVVMAIQVNGKMRGSIEVERETSKEEIEKLALAVPNVVKHIEGKELVKLI VVPNKIVNIVVK >gi|224461439|gb|ACDD01000063.1| GENE 31 28615 - 29328 371 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 4 237 9 248 255 147 33 1e-34 MERVIGINPVLEVLQNREKTIEKLEVYKGVRGEVLQKIQRLASERNIKIFYTNKKIENSQ GFCIFLTDYDYYREFDEILENMARKSQSIILILDEIQDPRNFGALIRSAEVFGVDAIIIP ERNSVRINETVVKTSTGAIEYVPIVKVTNLSNTIEKLKKIDYWVYGAAGEAESSSAEEQY PQKVVLVLGNEGTGLRKKVREYCDKLIKIPMRGKINSLNVSVAGGILLSEIAKFHKE >gi|224461439|gb|ACDD01000063.1| GENE 32 29338 - 30606 1979 422 aa, chain - ## HITS:1 COG:FN1520 KEGG:ns NR:ns ## COG: FN1520 COG0766 # Protein_GI_number: 19704852 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Fusobacterium nucleatum # 1 422 1 423 423 631 77.0 0 MVEAFQIIGGKDLAGELVVEGSKNSTLPIMIATLVAKGKYVLKNVPNLRDIRTLVKLLES LGLQITKLDDHSYEIINTGLTNLEASYDLVKKMRASFLVMGGMLAHSKKATVSLPGGCAI GSRPVDLHLKGFEQLGVKIHIDHGYVYAEAEELIGNEIILDFPSVGATENIIMAAVKAKG KTILENAAKEPEIVDLCNFLNKMGAKITGAGRSRLEIEGVEELHACEHSIIPDRIVAGTY IIAAILFQGKITVRGVVREDLASFLSKLEEMGLKYQIEDDVFTVLSKLEDLKPGKITTMP HPGFPTDLQSPIMTLMCFIKGTSEIKETIFENRFMHVPELNRMGAKIDIDGSKATITGVD HFSSAEVMASDLRAGASLVLAALKSPGTSIVNRIYHVDRGYESLEVKLQALGANIERIKV DA >gi|224461439|gb|ACDD01000063.1| GENE 33 30639 - 31463 899 274 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452953|ref|ZP_05618252.1| ## NR: gi|257452953|ref|ZP_05618252.1| hypothetical protein F3_07793 [Fusobacterium sp. 
3_1_5R] # 1 274 1 274 274 480 100.0 1e-134 MKRAFLFLCLFLLSFSSFAIQITGKTMLDKVQIGQVKIDFIDAENHSYSTKSNFLGEYSL HLPEGYYRIYIENENYRIAESHNQVYSFSKDRTLNLSLEKKKQQLEGMILDESGYGVADV SLEIKQNGKTYQLQSDKYGKFQFPIDCGLLSIFAQKEGFLEGGEVILVREKRPVKNLQII LKKRYSYILGIVTDGVKALPGVTVRLRNENLETIDQVFSNPLGYYQFRNIGNNQKVAVSV YEEGFQEYISDFFFVDKNYEKEHIILRKKEKNML >gi|224461439|gb|ACDD01000063.1| GENE 34 31473 - 32480 710 335 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Cryptobacterium curtum DSM 15641] # 2 315 518 841 860 278 45 5e-74 MIILGIESSCDETSIAIIRDGKTILSNYISSQIDIHKEYGGVVPEIASRQHIKNIAAILE ESLTEAGITLKEVDYIAVTYAPGLIGALLVGISFAKALAYANHIPLIPVHHIKGHIYANF LEHDVELPCISLVVSGGHTNIIYMDEKHEFHNLGGTLDDAVGESCDKVARVLGLGYPGGP VIDKMYYQGNPQYLKLTKPKVGKYEFSFSGIKTAVINFDHKMKSRGETYKKEDLAASFLG TVVDILTEKTIAAAKEKKVKHILLAGGVAANSLLRKQLAERAEQEGMKLLYPSMRLCTDN AAMIAEAAYYKIQNGGKPADYNLNGVATLDINQDI >gi|224461439|gb|ACDD01000063.1| GENE 35 32493 - 32969 686 158 aa, chain - ## HITS:1 COG:FN0548 KEGG:ns NR:ns ## COG: FN0548 COG2137 # Protein_GI_number: 19703883 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 34 158 2 127 130 75 40.0 4e-14 MIQKFTLQGKEILSEEEYEELIRYRIRLSAYTWLSKRDYSAKELEMKLSRYCSQKQWILD LIEDLQEQEYLDDYHYAVQWIQSKKYGRSKMEYLLLQKGLSREIVKKALEETYESDLDEI VRVWNKLGEKAKEKKVMALLRKGYRYSEIKKALAEIEE >gi|224461439|gb|ACDD01000063.1| GENE 36 33019 - 34098 1602 359 aa, chain - ## HITS:1 COG:FN0547 KEGG:ns NR:ns ## COG: FN0547 COG0468 # Protein_GI_number: 19703882 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Fusobacterium nucleatum # 7 334 16 342 381 416 73.0 1e-116 MAKAKEKTVELTAKQKALETAVKEITKDFGEGAIMKLGDNSHMQIEVIPTGSLNLDAALG LGGVPRGRVVEIYGAESSGKTTIALHIIAEAQKMGGIAAFIDAEHALDPVYAKALGVDID ELLISQPDFGEQALDIADTLVRSGAIDVIVVDSVAALVPKVEIDGEMSDQQMGLQARLMS KALRKLTATLNKSKTTMIFINQIREKIGSFGFGPQTTTTGGKALKFYSSVRMEVKRIASV 
KQGDDVIGNETVVKVTKNKIAPPFKEASFQIMYGKGISKVGEILDIALAKDIVAKSGAWF SFGEIRLGQGKENVKARLEEESDLLNAIYEEIKKLEAPIEEEIKTGLFGEEESEEVSKA >gi|224461439|gb|ACDD01000063.1| GENE 37 34070 - 35140 1213 356 aa, chain - ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 15 339 6 334 335 390 59.0 1e-108 MVGKEENKEERKKMRILVIRLSSIGDVLLTTPVLKAWKEKYPDSILDFVVLKQFQAAIQN CPYIDNIHIFDKKQHDGIKNIIKFSKKLAENQYDYVFDLHNKFRSQLMRWSMRVPYFVYP KRKWWKSILVNLGLISYQVDDTIIKNYFAAFRKFSLSYQGEDLYFHVSEEDKKKFESYRN FPVLAPGASKNTKKWPIENFALLAKLLYEKYSYPSILIGGKEDEETCQKIIELSGGKAIS FAGKLSLQESGALLSQAAFLVSNDSGPFHIARGVKCPSFVIFGPTSPGMFELGKRDTLLF AGVDCSPCSLHGDKECPKKHFRCMKEITAEQILKKIEEKNSKEGVFSHGESKRKNS >gi|224461439|gb|ACDD01000063.1| GENE 38 35098 - 35787 461 229 aa, chain - ## HITS:1 COG:no KEGG:FN0545 NR:ns ## KEGG: FN0545 # Name: not_defined # Def: lipopolysaccharide core biosynthesis protein RfaY # Organism: F.nucleatum # Pathway: not_defined # 31 190 8 161 198 129 50.0 6e-29 MKEKISSQEVLYFSSKEALTLFELWKQGNYKIKKTLKDSNRSYVLLLEIEGKYFVYKEPR EKNRRKWQQFLSLFRGSESKREAFQMLEIENHGFLGPQLQFAYEKRKLGRVIHSFLLYSY IDAEEITVETAEKALSYLHRIHEAGFLHGDSQISNFLIHEEEIYIIDSKFQKNKYGDFAC AYEEYYFELSCPTCSFLIDRKRIPYRIAKKWKDLKEWWVKRKTKKREKK >gi|224461439|gb|ACDD01000063.1| GENE 39 35784 - 36791 785 335 aa, chain - ## HITS:1 COG:FN0544 KEGG:ns NR:ns ## COG: FN0544 COG0859 # Protein_GI_number: 19703879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 330 1 329 342 334 51.0 1e-91 MKILIIHTAFIGDIVLSTPMIAKIADTYPKAQIYYLTVPAGASILQNNPHLTKIISYDKK GKDKTWKAFFDLAKELRKEKFDKIYCPHRYLRSMLLSLLVGAKEKIGYRTAPLSCFFSKK IPYQKNCHEVERLLSFIEGGSKTRYEIELYPGKEEENFWKKLQEETATYSCIVAIAPGSR WETKRWPLEYFQELMDKLCETGRTAILLIGGKEEQNLSFKIQKGVWDLRGKTSLLELTKI LQEVDYVVTNDSSPIHIASSSSKAKIIAIFGPTVKEIGFTPWSKNSVVIEKEDLDCRPCS 
IHGSNHCPQKHFHCMKELKPEMILQEIAEYSKGER >gi|224461439|gb|ACDD01000063.1| GENE 40 36788 - 37819 1184 343 aa, chain - ## HITS:1 COG:FN0543 KEGG:ns NR:ns ## COG: FN0543 COG0859 # Protein_GI_number: 19703878 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 1 342 6 344 345 391 60.0 1e-108 MQEIKRIIVARTDKIGDLVLSIPSFYMLKKMYPKAELIVLVRKYNYDIVKNLPYIDRVLK IDDFKKEELLMKIAYFKADAFIALFHDDYIAKLVKASKAKIKIGPISKPSSWFLYNKGVL QKRSLSMKNEAEYNLDLVKKLNPLRYQACYELNTDLVLTEENRKVASLFWEQEKLGEKVL VCNPFLGGSTKNLRDEEYGKILKYLLLREENIDIILTCQISEEERALKLKEYIGMEKIHI FANGGSILNVAAVIEKAQLYFGGSTGPTHIAGALGQKIVAFYPSKKTQSKIRWGIFRKYL EDVHYFIVDEESSEKENYEKPYFDSMNKAKEEKIANLLYEAFL >gi|224461439|gb|ACDD01000063.1| GENE 41 37807 - 38586 540 259 aa, chain - ## HITS:1 COG:FN0542 KEGG:ns NR:ns ## COG: FN0542 COG0463 # Protein_GI_number: 19703877 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Fusobacterium nucleatum # 1 247 5 250 263 312 65.0 4e-85 MKLSVAMITCNEEKILEKTLKSIVHLASEIVIVDSGSTDRTEEIAKKYGAKFVHQDWLGY GPQRNVAIGLCQSDWILNIDADEEISPKLYERIKNIIERPVTKKVYKVSFTTVCFGKKIY HGGWSGAKKVRLFYKNSGKFNNNTVHEEFETKEEIESIKEEIYHHSYVNLEDYFHKFNRY TTEGAKDAFQKRKKVSVLKIVLEPFYKFIRMYLLRLGFLDGLEGFVLANTSAMYSMVKYY KLYELYQKEKESHGSSCKK >gi|224461439|gb|ACDD01000063.1| GENE 42 38595 - 39674 907 359 aa, chain - ## HITS:1 COG:FN0541 KEGG:ns NR:ns ## COG: FN0541 COG0726 # Protein_GI_number: 19703876 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Fusobacterium nucleatum # 10 359 1 351 351 333 50.0 3e-91 MYWIFIFIFILLIFHHHGIPIFLYHQVQPNSKVTPELFEKHLLWLTKKGYHTMTMSEYIE EGANKKTVLLTLDDGYYDNYKYVFPLLKKYNMKATIFLNTLYIAEERTKEEEIEENGVAN QKAILQYIETSCAESPQYMSWKEIQEMYDSGLVDFQAHSHKHMAVFSDNKLQGFFNGKEE DCTDTYLYGGKIKRGYPKFKKRGEYTLPGIQIDKKFFSLFEEYYHKTLQYIADNKRRIEE GQKFIENHSKYFHKVTDEEFETRIREDYLENKKKIEEHLGYEVNCFCWPWGHRSWASIQI 
LEKYGVKAFVTTKKGTNDQLPNLKFIKRIELRNYSLQKFKWNVRITSNLILGKLYSLVS >gi|224461439|gb|ACDD01000063.1| GENE 43 39790 - 39993 422 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKMVLLVLVLVGIFAGCTHTEKTATGGAVVGAAVGALLGNDARSTAIGAGLGGALGAGA GEITKNK >gi|224461439|gb|ACDD01000063.1| GENE 44 40099 - 40227 428 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGVTRLELATSCVTGRRSNQTELHPQNMVVAIGLEPMTLCL >gi|224461439|gb|ACDD01000063.1| GENE 45 40252 - 40440 141 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRPFHYGAQRRNRTTDTGIFSPLLYRLSYLGINNKQKWRKVRDSNSKVLRRRFSRPIPYQ LG >gi|224461439|gb|ACDD01000063.1| GENE 46 41140 - 41412 284 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257466656|ref|ZP_05630967.1| ## NR: gi|257466656|ref|ZP_05630967.1| excinuclease ABC subunit C [Fusobacterium gonidiaformans ATCC 25563] # 1 76 44 119 133 119 98.0 6e-26 IYIGISHDVKKRFQEHLEKKGAKYTKAHPVEKILFTIPCETKSEALKLEYFFKTWTKKQK EDFLQKADADLGKSLYQKKKLQEKKKEEKI Prediction of potential genes in microbial genomes Time: Fri May 20 02:13:57 2011 Seq name: gi|224461438|gb|ACDD01000064.1| Fusobacterium sp. 3_1_5R cont1.64, whole genome shotgun sequence Length of sequence - 29850 bp Number of predicted genes - 30, with homology - 26 Number of transcription units - 11, operones - 5 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 59 - 118 8.0 1 1 Op 1 . + CDS 151 - 1509 1453 ## FN0748 hypothetical protein 2 1 Op 2 . + CDS 1481 - 2014 658 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 2019 - 2057 1.1 3 2 Op 1 . - CDS 2009 - 2077 135 ## 4 2 Op 2 1/0.250 - CDS 2049 - 2768 870 ## COG1496 Uncharacterized conserved protein 5 2 Op 3 . - CDS 2765 - 3577 1282 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 6 2 Op 4 . - CDS 3574 - 3858 299 ## FN1563 hypothetical protein - Prom 3882 - 3941 11.8 + Prom 3775 - 3834 8.7 7 3 Tu 1 . 
+ CDS 3886 - 3999 62 ## - TRNA 3949 - 4032 67.8 # Leu TAG 0 0 - TRNA 4039 - 4114 94.1 # Lys TTT 0 0 - Term 3952 - 4002 1.2 8 4 Tu 1 . - CDS 4077 - 4184 85 ## - Prom 4384 - 4443 6.4 - TRNA 4124 - 4199 76.3 # His GTG 0 0 9 5 Tu 1 . + CDS 4128 - 4388 640 ## + Term 4421 - 4478 -0.2 - TRNA 4208 - 4283 93.2 # Gly TCC 0 0 - TRNA 4288 - 4364 82.6 # Pro TGG 0 0 - Term 4452 - 4505 2.5 10 6 Tu 1 . - CDS 4522 - 5928 790 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 6126 - 6185 9.7 + Prom 6083 - 6142 9.7 11 7 Tu 1 . + CDS 6166 - 6819 1009 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins + Term 6841 - 6883 5.1 - Term 6829 - 6870 8.7 12 8 Op 1 6/0.000 - CDS 6893 - 8923 2886 ## COG2987 Urocanate hydratase 13 8 Op 2 . - CDS 8925 - 10457 2147 ## COG2986 Histidine ammonia-lyase - Prom 10611 - 10670 8.7 + Prom 10506 - 10565 13.6 14 9 Tu 1 . + CDS 10593 - 11732 978 ## COG1940 Transcriptional regulator/sugar kinase + Term 11747 - 11787 9.2 - Term 11735 - 11775 8.4 15 10 Op 1 12/0.000 - CDS 11785 - 12723 1489 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 16 10 Op 2 3/0.000 - CDS 12749 - 13333 841 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 17 10 Op 3 13/0.000 - CDS 13330 - 13935 918 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 18 10 Op 4 12/0.000 - CDS 13935 - 14468 738 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 19 10 Op 5 12/0.000 - CDS 14458 - 15411 1371 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 20 10 Op 6 1/0.250 - CDS 15435 - 16745 1647 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 16814 - 16873 13.7 - Term 16956 - 16989 3.1 21 11 Op 1 . 
- CDS 16997 - 17557 708 ## COG0193 Peptidyl-tRNA hydrolase 22 11 Op 2 40/0.000 - CDS 17560 - 19947 3011 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 23 11 Op 3 1/0.250 - CDS 19966 - 20979 1098 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 24 11 Op 4 1/0.250 - CDS 20997 - 21395 494 ## COG0622 Predicted phosphoesterase 25 11 Op 5 24/0.000 - CDS 21449 - 23884 3092 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 26 11 Op 6 . - CDS 23931 - 25847 2435 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 27 11 Op 7 . - CDS 25840 - 26139 381 ## gi|257452987|ref|ZP_05618286.1| hypothetical protein F3_07999 28 11 Op 8 9/0.000 - CDS 26120 - 27214 1078 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 29 11 Op 9 . - CDS 27250 - 27462 388 ## COG2501 Uncharacterized conserved protein 30 11 Op 10 . - CDS 27533 - 29266 1850 ## FN0001 chromosomal replication initiator protein DnaA - Prom 29497 - 29556 10.4 Predicted protein(s) >gi|224461438|gb|ACDD01000064.1| GENE 1 151 - 1509 1453 452 aa, chain + ## HITS:1 COG:no KEGG:FN0748 NR:ns ## KEGG: FN0748 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 21 446 2 424 430 441 58.0 1e-122 MHKFHHRLRVIKFISTLLACFIVTFLILYVIQKKENILLGLASIITSPAILITDFILVGG IGAAFLNALLIFFFNFILIRILKLKITGIVIACLLTVFGFSFFGKNMLNILPFYIGGIVY CIYAHEELSDNFVPIAFSSALAPFVSEIAFQVGSTESSYVGAIILGIGIGFIICPLAKKM YHFHEGFNLYNLGFTGGILGAVIASILKLYDVPIEPQYLVSTEHHFFLSVLCSAIFGALI LIGLLIKDVHIHYYFKLLRDPGFHTDFTKKYGYGPSFINMGIMGFLSMLFLSLEGQTLNG PILAGIFTVVGFAAYGKTPLNTFPILLGVHLASYGSNTPLFSICLSGLFGTALAPIAGVY GTLWGVVAGWLHLSVVQSIGIIHSGLNLYNNGFSCGIVASVLLPVMNMVSEQNAKSKLHL LKRHKVYIQAINRHFETQKKEEIHEKNTTHSH >gi|224461438|gb|ACDD01000064.1| GENE 2 1481 - 2014 658 177 aa, chain + ## HITS:1 COG:FN0747 KEGG:ns NR:ns ## COG: FN0747 COG0494 # Protein_GI_number: 19704082 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP 
pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 12 177 9 169 171 143 46.0 1e-34 MRKIRPIPIKELHFLKPAIEKHPHNHIPLEFLIKQDAIAALLLNEDATKAFLVKQYRPGA GKELYEIPAGLIEEKEDPKLACFREVEEETGYLPKDYKILYESKKALFVSPGYTEEALYF YIFQLYSDNTIPQALKLDEGEELVGSWIPIEEIFSENKPHISCDLKTIFCFLLWKSL >gi|224461438|gb|ACDD01000064.1| GENE 3 2009 - 2077 135 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGHLYILRNKKGVQYKNCIPFL >gi|224461438|gb|ACDD01000064.1| GENE 4 2049 - 2768 870 239 aa, chain - ## HITS:1 COG:FN1561 KEGG:ns NR:ns ## COG: FN1561 COG1496 # Protein_GI_number: 19704893 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 2 238 8 241 242 182 41.0 4e-46 MIRDYENRTEFLEWKDFGLRLIYTKKSLGNVMEMSFSELREKLNLPLEKVIITGKQTHSD HIAMIQEKDIVYFEDNDGFITDREDVILYTKYADCMPVFLLDSKQKKIAVVHSGWKGSFQ RIACKALTKMSKYYGTKVEDIEVVFGVGISQEHYEVGEEFFKQFQDSFSPIFITKSFQKK GEKYFYDNQEFIAQTLLECGVKEEKIFRNHLCSFEGDYHSYRRDREGAGRNGAFIYFEK >gi|224461438|gb|ACDD01000064.1| GENE 5 2765 - 3577 1282 270 aa, chain - ## HITS:1 COG:FN1562 KEGG:ns NR:ns ## COG: FN1562 COG2876 # Protein_GI_number: 19704894 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Fusobacterium nucleatum # 3 269 67 333 334 397 70.0 1e-110 MTKFVTRDFQKKDTVLEIAGHKIGGGNFLLMAGPCSVENKEMVFSIAKKVKECGGSVLRG GAYKPRTSPYDFQGLGEEGLRYLREAADEYGLLVVTEVMSAEDLELVERYADILQVGARN MQNYSLLKKLGTVKKPILLKRGLAAKIEELLMAAEYIFAYGNPNIILCERGIRTFETMTR NTVDINAIPLLKELTHLPILIDASHGTGKRSLVSPVTLAAVVAGADGAMVEIHEHPSCAL SDGPQSLDFEMFEIFVKNLNKILAVREELL >gi|224461438|gb|ACDD01000064.1| GENE 6 3574 - 3858 299 94 aa, chain - ## HITS:1 COG:no KEGG:FN1563 NR:ns ## KEGG: FN1563 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 87 1 87 87 98 57.0 9e-20 MYNEIDLHGMNYEDALRIFIQKYNEILRKKEKREICVIHGYGSKRLDSSAVLRENLRNYL SKQKGKLKYRLDLNPGVTYVVPIAFLEERGKRKK >gi|224461438|gb|ACDD01000064.1| 
GENE 7 3886 - 3999 62 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIITYFKYKKKSKISYSNSQKWCEERDLNSHAEGARS >gi|224461438|gb|ACDD01000064.1| GENE 8 4077 - 4184 85 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVERQFVALVVAGSNPVDHPIILCVISSVGRAHDF >gi|224461438|gb|ACDD01000064.1| GENE 9 4128 - 4388 640 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDGIRTRDNQCHKLALYQLNYDHHIFGAGNEVRTRDIQLGRLTLYQLSYSRKMVGIARF ELAALCSQGRCATGLRYIPTNILLNI >gi|224461438|gb|ACDD01000064.1| GENE 10 4522 - 5928 790 468 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 1 446 3 445 456 308 36 2e-83 IMELVNTLNGYLWSYILIGLLLISGIFYTLRTGFAQIFLFGDMLKLVTGKLSALKDGEKK EANQVSAFQAFCISVSSHVGTGNLAGVAIAVVLGGPGALFWMWVTSLIGCATSLIENTLA QVYKEEDGKGGFRGGPAYYMEKALGWKSMAKFFSVIVIITFAFAFNTVQANTIAQAFEGS FGFSPMVVGIVVTVLSALVIFGGLQRIANFAGLVVPVMALGYVIVALIVLLMNIAHIPAL IMLIVKSAFGVQAMAGGAMGVAMLQGVKRGLYSNEAGMGSAPNAAATSNVSHPVKQGLLQ AFGVFVDTIIICSATGFIVLLLPDYANVGETGIKLTQIALSREVGAWGNPFITACLFLFA FSSVIGNYYYGETNVEFLSGGNKQIMLIFRVISVAIIYIGSVAKLSTVWDLADLSMGIMA IMNIVAIAILSPKALHVIQDYRKKRKEGKNPEYSVKDTPEITNTEVWD >gi|224461438|gb|ACDD01000064.1| GENE 11 6166 - 6819 1009 217 aa, chain + ## HITS:1 COG:FN1265 KEGG:ns NR:ns ## COG: FN1265 COG2885 # Protein_GI_number: 19704600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 11 214 1 200 202 200 60.0 1e-51 MKFQKTTASLLLALTLVGCTSSPFLTDEGNINKKSSGTAGGAAVGALLGQLIGKDTKGTL IGAGMGALAGLGWGAYRDQQEAALRASLKNTAVQVQRDGENISLYLPGGVTFASDSAQIS GNFYSALNSIAQVLVQYPETQILVQGHTDSTGSFQHNMDLSNRRANSVKHYLIGQGVASN RLMSQGFGPNNPVADNSTPDGRQMNRRVEIKIAPKYN >gi|224461438|gb|ACDD01000064.1| GENE 12 6893 - 8923 2886 676 aa, chain - ## HITS:1 COG:FN0792 KEGG:ns NR:ns ## COG: FN0792 COG2987 # Protein_GI_number: 19704127 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Fusobacterium 
nucleatum # 3 676 4 673 673 1230 86.0 0 MINQDIFHAMTIKLEACDIPKEIPKMDPNIRRAPKRVVNLTEDDIKLALKNALRYIPEEF HEMLAPEFLEELMEHGRIYGYRFRPEGRIYGRPIDEYKGNCTDTKAIQVMIDNNLDFAIA LYPYELVTYGETGQVCQNWMQYRLIKKYLENMTQDQTLVMASGHPTGLFHSNPYAPRVII TNGLMVGLFDDYDNWARGAAIGVANYGQMTAGGWMYIGPQGIVHGTYSTILNAGRLFCGV PADGDLRGKLFVTSGLGGMSGAQGKAGVIAKGVAIVAEVDISRIHTRLEQGWVNQIAETP EEAFTIAHEKLAAKEAYAIAFHGNVVDLLEYADAHNEHIDLLSDQTSCHAVYDGGYCPVG ISFEERTRLLAEDRKTFRELVDKTLKRHYDVIKRLTDKGVYFFDYGNSFLKAIYDTGVKE ISKNGRDDKAGFIFPSYVEDILGPELFDYGYGPFRWCCLSGKHEDLIKTDHAALELVDPN RRYQDRDNYVWIQDADKNNLVVGTQCRIFYQDAMSRTAIALKFNDMVRKGEIGPVMLGRD HHDVSGTDSPFRETSNIKDGSNIMADMATQCFAGNAARGMTMIALHNGGGVGIGKSINGG FGMVLDGSLRVDEILKQAMPWDVMGGVARRAWARNPHSIETVIEYNNKNQGTDHITLPYI ASDDLVNGLVEKVLKK >gi|224461438|gb|ACDD01000064.1| GENE 13 8925 - 10457 2147 510 aa, chain - ## HITS:1 COG:FN0791 KEGG:ns NR:ns ## COG: FN0791 COG2986 # Protein_GI_number: 19704126 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 1 509 6 514 516 849 85.0 0 MELVLGSNRITLEDLVNVTRRGYKVKISEEAYEKIDRARALVDKYVEEGKVSYGITTGFG KFAEVTISKEETGQLQKNIIMSHSCSVGNPMPNDVARGIVLLRAVNLAKGYSGVRRVVVE TLVEMLNKNVTPWIPEKGSVGSSGDLSPLAHMSLVLLGMGKAYYEGELLDGKTAMERAGI PILPSLSSKEGLALTNGTQSLTSVGAHVLYDAINLSKHLDIAAAMTMEGLHGIIDAYDAR IGEVRGQEGQIQTAENMRNLLAGSGNVTKQGVERVQDSYVLRCIPQIHGASKDTLEYVKH KVEIEINAVTDNPLIFVDTDEVISGGNFHGQPMALPFDFLGIALAEMANVSERRIEKMVN PAINHGLPAFLVEKGGLNSGFMIVQYSAAALVSENKVLAHPASVDSIPTSANQEDHVSMG SVAAKKSKDILENVRNVIGIELITACQAIDLKGAKDKLSKATRAAYDEVRQYVPYVDVDR ESYVDIHKAEAIIKTNKIVEAVEKIIGGLH >gi|224461438|gb|ACDD01000064.1| GENE 14 10593 - 11732 978 379 aa, chain + ## HITS:1 COG:FN0790 KEGG:ns NR:ns ## COG: FN0790 COG1940 # Protein_GI_number: 19704125 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Fusobacterium nucleatum # 1 378 16 386 387 301 44.0 1e-81 MYQKEIKKNNENIIFEYIYNQKQGFSIAEVCQSLDLTFPTVKRIFESFLEKSILIQAKKN 
NHGVGRKAMEYTYNNDFCYSIGVRISEDFLHLILTNSIGKVFCQSKITIPSQLKNICSFL EENILVFLRQINQEKKNKIVGIGISIPGIFNQETKMIEFKINHFSSFVALEELQKNIPYP IYIENESNLSAIAEAVLGKYLNLSEFTVLTINKNIGSSHFVRREKDRNFYFKAGRIHHMI VNKNGRKCYCGSKGCLGTYISIKALLQDFQEIFPEVQDIESIFHEKYRESKEGKKILEQY IEYLAIGIQNLLFFSNPEKIIISGMICHFQEYLYTKLLNKIYHSGHIFFRGRDTVVFSSF HENSSLVGAALFPIVDSMF >gi|224461438|gb|ACDD01000064.1| GENE 15 11785 - 12723 1489 312 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 4 262 5 261 264 189 42.0 6e-48 MEAIVMAVVILGVTGLAMGLFLAFAAKKFEVQIDPKIEEIISILPGANCGGCGYPGCSGY ASAIVETGAAMTLCSPGGSAVAAKIGDIMGASVDTSGEKVVARVLCQGDNTFSKKRFDFD GELRTCAAVTLYAGGDKSCKYGCLGYGDCERVCPVGAIVVNEKGIASVDEEACISCGLCV KACPKSVIAMTPVAKKVTVKCMSKDKGGDAKKACGIACIGCGMCQRTCPFGAIEVSNNLA KIDPAKCKNCQLCVVVCPTKAIYTGLNRPLPKKPEPKKPAAPKPAAAPTPTPEVKKEVVV EKVVEEVKAEKE >gi|224461438|gb|ACDD01000064.1| GENE 16 12749 - 13333 841 194 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 1 194 1 194 194 257 86.0 1e-68 MSLGSIFGIIISSIFINNIIFAKFLGCCPFMGVSKKIDASLGMGMAVTFVITIASGVTWL VYRFILEPMGLAYLQTIAFILIIASLVQFVEMAIQKTSPSLYKALGVFLPLITTNCAVLG VAIINIQADYNFIETLVNGFSVAVGFSLALILLAGVRERIEYSAIPKAFQGIPIAFLTAS LLAMAFMGFSGMKI >gi|224461438|gb|ACDD01000064.1| GENE 17 13330 - 13935 918 201 aa, chain - ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 4 192 3 191 205 255 81.0 3e-68 MGNKIKILLEGMFTGNPVFVLLLGLCPTLGTTTSAINGFSMGVAVIAVLACSNVLISLFK 
KCIPDQVRIPAFIMIIASLVTIVDMMMNAYTPELYKVLGLFIPLIVVNCIVLGRAESFAS KNSVFDSLLDGIGTGIGFTLSLTLLGTIREILGNGSVFGISLFPEGFTPALIFILAPGGF MTIGVVLAIINVVKAKRGEKK >gi|224461438|gb|ACDD01000064.1| GENE 18 13935 - 14468 738 177 aa, chain - ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 1 177 1 177 177 219 59.0 3e-57 MKNKFVHYGAVLFIIAAVSAGILAAVNGFTSQVIANNAIQLVTEARKQVLPAAASFKEEE GKEVEGMTFIPGFDEAGSNVGYVVSVDQNGYAGNINFVLGLDMEGKITGINIISSGETPG LGARINEPEWQSHWIGEDDSHEFNKATDAFAGATISPNAVYTGMMRTIKAYKAEVIK >gi|224461438|gb|ACDD01000064.1| GENE 19 14458 - 15411 1371 317 aa, chain - ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 1 316 1 314 314 467 77.0 1e-131 MSSILKMGPSPHIRTSETVESVMYDVIIALIPAFLIAVYVFGLRAIIVTGVAVLTCLVTE YICQKIMKQDISIFDGSAVLTGILFSFVIPVIMPLPYVIIGCIIAIALGKMVYGGLGHNI FNPALVGRAFVQASWPVAITTFAYDGRTGATMLDAMKRGLDINTVLIANSGNLYLDALIG KMGGCLGETSALALILGGCYLIYKKQIDWKVPAVMIGTVFVMTWAMGAADPIMQILSGGL MLGAFFMATDMVTSPHTDKGRVVFAFGIGFLVSCIRMKGGYPEGTAYAILIMNGVVPLIN RYIRPKKFGEVKTNNEK >gi|224461438|gb|ACDD01000064.1| GENE 20 15435 - 16745 1647 436 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 436 7 441 441 652 74.0 0 MKFFGFRGGVHPPENKLQTETFPVEKLEAPKMLYVPLLQHIGAPLDPIVAVGDQVLKGQK IADSQGFLTSPIHSPVSGTVKKIEERVFPLMGTCKSIVIENDGQETWAELSKIENWETAE VKDLLAMIREKGIVGIGGASFPTHVKLNPPADTKIDTLLLNGAECEPYLNSDNRLMLENP SSIIEGVKIIKKILGVSTAIIGIEENKPEAIANMKKAAEGTGIEIAPLKTKYPQGGEKQL IKAVLNREVPSGKLPSSVGVVVQNTGTAAAIYEGLVHGTPLIEKVVTVSGKAIATPKNVR 
IAIGTPFSYLLDACGVDREKVDKLVMGGPMMGMAQFSEEAPVIKGTSGLLALTTEETNPY KPKACIGCGKCVSVCPMSLEPVMFARLAAFQQWEGLQNYHLMDCIECGSCAFICPANRPL TEAIKIGKAKLRSMKK >gi|224461438|gb|ACDD01000064.1| GENE 21 16997 - 17557 708 186 aa, chain - ## HITS:1 COG:FN1597 KEGG:ns NR:ns ## COG: FN1597 COG0193 # Protein_GI_number: 19704918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Fusobacterium nucleatum # 1 186 1 186 191 205 55.0 4e-53 MKLVVGLGNPGKKYEKTRHNVGFMAIDLFLKKHSILGEKEKFLSKVVETNFQGEKVYFIK PQTYMNLSGNAIHEVVQFYKIDPVSEILVVYDDKDLPLGKLRYKVKGSSGGHNGMKSIIS HIGQEFCRLKCGIGSTSGNVIDFVIGDFQKAEESELESMLEIAVEGIEDWLKNINSEKMM QKYNKK >gi|224461438|gb|ACDD01000064.1| GENE 22 17560 - 19947 3011 795 aa, chain - ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 145 795 1 653 653 793 61.0 0 MLISLNWLKQYVDLKEDVLELEKALTMIGQEVENIEEQGKHLHHVVIGKIVDYQKHPNSD KLTLLQVDTGEETLQIVCGAPNHKLGDKVVVAKIGAILPGDFKIKKSEIRKVESYGMLCS EVELGIGTSADGIIILPEDAPIGEEYRKYAKLDDVVFELEITPNRPDCLSYIGIAREIGA YFERKIKYPMIVMDEIIDQVSTQAKITIEDKERCHRYMGRLIKNVKVEESPEWLKQRIQS MGLKPINNIVDITNFVMFECNQPMHAFDFDKLAGNEIFVRAAKEGEEIITLDGVERKLNG ELVIADGEKPIAIAGVIGGEATQIDENTKNIFLEVAYFTPENIRKTSRTLGIFTDSAYRN ERGMDPEGIPYAMDRAASLIQQVAGGEILSKPLDKYLVRRELTEIPINLEKVNKFVGKAL DLDTVGNILTNLEILIKPYGPNALLVTPPSHRADLTRPADIYEEIIRMYGFDNIEAKMPK EDISAGKTAERYEIQENLKKLLTEMGLHEVINYSFIPQKARNIFHYSQPVLEIQNPLSED MAIMKPNLQYSLLANVRDNFNRNQYDLKFGEVSKTFVKVEGEDLAQEDIHLGIVLAGHKD KTLWNTGKESYDFYDIKAYVETVLAEMGIQNYNLIRSMDSNFHPGRSADIQIGRECIGTF GEVHPDIAEAMEIKKERVYLAELNITTMKKYSKKKLGYDRVSKYPAVLRDLAIVLDQDVL VGEMVKMIQKKHSLIEHIDIFDVYYGENLGEGKKSIAISIIFRDKKKTLSDTEIEENIQS ILKLIREKYQGEIRQ >gi|224461438|gb|ACDD01000064.1| GENE 23 19966 - 20979 1098 337 aa, chain - ## HITS:1 COG:FN2123 KEGG:ns NR:ns ## COG: FN2123 COG0016 # Protein_GI_number: 19705413 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Fusobacterium nucleatum # 1 337 1 337 338 501 70.0 1e-141 MKQEITALQEEAKKEIELVSSLGQLDELRIKYMGKKGKLTDLSKGMKNLSAEERPEIGQL INDAKNEILEAFSSKNSILVKEQKEKKLKEEVIDISLPSRALSLGTEHPITETMNFMKDI FIKMGFDVADGPEIEYVKYNFDALNIPDSHPSRDLTDTFYMNPEVVLRTQTSPVQIRYML EHKPPFRMICPGKVYRPDYDVSHTPMFHQMEGLVIGSNISFADLKGILTQFVKEVFGDTR VRFRPHFFPFTEPSAEMDVECNICHGEGCRVCKGSGWLEIMGCGMVDPEVLKAGGYNPEE VSGFAFGMGIERIAMLRLGIDDLRSFFENDIRFLKQF >gi|224461438|gb|ACDD01000064.1| GENE 24 20997 - 21395 494 132 aa, chain - ## HITS:1 COG:FN2124 KEGG:ns NR:ns ## COG: FN2124 COG0622 # Protein_GI_number: 19705414 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Fusobacterium nucleatum # 2 132 22 153 153 128 47.0 2e-30 MEREKPERVFAMGDYTKDFEELSYLYSEIPFEIVKGNCDFWDHHFSEEKLVLLKGKRIFL THGHLYGVKSSYDSLRQMGKNMKCDIILFGHTHREYFEKKEIILANPGAAQDGKYGILNI ENTKVEIILKRL >gi|224461438|gb|ACDD01000064.1| GENE 25 21449 - 23884 3092 811 aa, chain - ## HITS:1 COG:FN2125 KEGG:ns NR:ns ## COG: FN2125 COG0188 # Protein_GI_number: 19705415 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Fusobacterium nucleatum # 1 810 1 811 811 1147 75.0 0 MSNISNRYIEEELKESYLDYSMSVIVSRALPDVRDGLKPVHRRILFAMNEMGMTNDKPFK KSARIVGEVLGKYHPHGDTAVYNTMVRMAQEFNYRYMLVEGHGNFGSIDGDSAAAMRYTE ARMSKITAELLEDIDKNTIDFRKNFDDSLDEPTVLPSKLPHLLLNGSTGIAVGMATNIPP HNLGELVDGSLQLIDNPEISDLELMEYIKGPDFPTGGIIDGKKGIRDAYLTGRGKIRVRG KVKIEENKNGKFFLIIEEIPYQLNKSTLIERIANLVKEKKITGIVDLRDESNREGIRVVI ELKKGEEPELVLNKLYKYTELQSTFGVIMLALVNNVPKVLTLKQMLCEYISHRFQVITRR TLFDLDKAQKRAHILQGYRIALENINRIIEMIRSSKDANQAKEQLIEKYAFTEIQAKSIL DMRLQRLTGLEREKVEAEYQDLEKLIIELQDILSHDNKIYDIMKQELLKVKDTYGDKRRT HIEEERMEILPEDLIKDEEMIITCTNKGYIKRIEANKYKSQNRGGKGVTGLNTIDDDVVD TILTASNLDTLMIFTDKGKVYNIKVYQLPELSRQSRGRLISNLLRIGEEEKIRAIIKTRV 
FDKEKELVFVTKQGIVKKTSLEEFKNINTGGLIAIKFKEEDDLIYVGLVEAAENEVFIAT RKGFAVRFPNDNVRPTGRNTMGVKGIELREGDEVVSALLIKEKEMDILTITENGYGKRTR LDEYPSHNRGGKGVINLRCNDKTGNIVSVLTALDEEELVCITSNGIIIRTPMNSISRFSR AAQGVIIMKVALDEKVASITRIKAEEEKEEI >gi|224461438|gb|ACDD01000064.1| GENE 26 23931 - 25847 2435 638 aa, chain - ## HITS:1 COG:FN2126 KEGG:ns NR:ns ## COG: FN2126 COG0187 # Protein_GI_number: 19705416 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Fusobacterium nucleatum # 3 638 6 639 639 971 77.0 0 MNNYGAQNITVLEGLEAVRKRPGMYIGTTSARGLHHLVWEVVDNSVDEALAGYCNTITVS ILPDNIIQVEDNGRGIPVDIHPKYGKSALEIVLTVLHAGGKFENDNYKVSGGLHGVGISV VNALSEWTEIKVKRDGNVYYQKYLRGKPIEDVKIISSLEAGDTTGTTVTFKPDAEIFETV IFEYEVLQHRLKELAYLNRGLEINLLDCRNEIGKKEKFQFEGGISDFLKEVTHENQVLLS KQIHVEGQAEQVGVDIAFTYTTSQSETIYSFVNNINTTEGGTHVTGFRTCLTKVINDIGK SQGFLKEKDGKLQGGDIREGIVAIISVKVPQPQFEGQTKTKLGNSEVSGIVNSVLSVDLK IFLEDNPNDTKLIIEKILNSKKAREAAQRAREAVLRKSVLEVGSLPGKLADCSSKKSEEC EIFLVEGDSAGGSAKQGRDRYFQAILPLKGKILNVEKAGLHKALESEEIRAMVTAFGTNI GEESFDLNKLRYGKIILMTDADVDGAHIRTLILTFLYRYMVDLIHNGNVYIAQPPLYKIS FGKSIRYAYTDAQLKEILQSVEGENKKYTLQRYKGLGEMNPEQLWETTMDPEARLLLKVS IDNAREADMLFDKLMGDKVEPRREFIQEHAEYAKNIDI >gi|224461438|gb|ACDD01000064.1| GENE 27 25840 - 26139 381 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257452987|ref|ZP_05618286.1| ## NR: gi|257452987|ref|ZP_05618286.1| hypothetical protein F3_07999 [Fusobacterium sp. 
3_1_5R] # 1 99 1 99 99 157 100.0 2e-37 MKIQVHKLFDIVQEEFQKSAPMQEIFLKSHWENIVGKYSKYSEILWFREGKLCIKVYNSM ALQHMYMNKNKILVKIQEYAKKKAIIIEDVKYLLEGKYE >gi|224461438|gb|ACDD01000064.1| GENE 28 26120 - 27214 1078 364 aa, chain - ## HITS:1 COG:FN2128 KEGG:ns NR:ns ## COG: FN2128 COG1195 # Protein_GI_number: 19705418 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Fusobacterium nucleatum # 1 363 1 366 369 311 51.0 9e-85 MKVLSIQLNHVRNLKNQEIIISSPIQVFYGKNGQGKTSILEAIYFAATGLSFRTKHSSEM IRYTKNTLSCSLGYQDQFSKKSLSVSIENEKKQFFFLGKKISQMEFYGNLNVIYYIPEDV MLINGSPSVRRLFMDREISQINVFYLQQLKKFSHLLKIRNKYLKEKLYQNEEFLIYEKEF VECGSYLIEQRNHYLQLMSSFIKNIYQNLFDKEKELQLQYKTFIEFQNDVTLSKIQEEFW KEIKKKKEKEIQYGFSMVGPHKDEFIFLLERQDAKLYASQGEKKSIIFSLKLSEIDILSK NKKEMPIVLIDDVTSYFDEERCYSVLQYLYEKKVQVFITSTERLKIEADYYRIEKGEVYE NTSS >gi|224461438|gb|ACDD01000064.1| GENE 29 27250 - 27462 388 70 aa, chain - ## HITS:1 COG:CAC0003 KEGG:ns NR:ns ## COG: CAC0003 COG2501 # Protein_GI_number: 15893301 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 69 2 67 68 61 57.0 3e-10 MKEEKVTLKTEFITLNQLLKLVGISFNGAEAKYMILDGKIKVNGEVEIRRGKKIRSGDIV EFEEMKYMVE >gi|224461438|gb|ACDD01000064.1| GENE 30 27533 - 29266 1850 577 aa, chain - ## HITS:1 COG:no KEGG:FN0001 NR:ns ## KEGG: FN0001 # Name: not_defined # Def: chromosomal replication initiator protein DnaA # Organism: F.nucleatum # Pathway: not_defined # 2 539 5 535 637 368 47.0 1e-100 MKKIDDNIIEIPEEIEEEKFEILNHGSLAKDIKKVSIMTKELPEIEMQEYHIKESGNFLG IQGKVINMPIEMIVFPFFTPQKQNRRVNFKYYFDDLGVTMKSTLVVENNKDIVFQPSILE DKIYTFLLSLYERKEEDDDEEYIEFEISDFVVDFLGNKMNRTYYTKIEQALKNLKRTMYE FSINNHKKLGDYKFESELFQLLDYEKRKRGKKVYYKVRLNRNIRKKIQEKRYIIYNSKAL IEILNKDHIAARIYKYISQIRYKTGEKNVTNIRTLAAIIPLKVEQETERETKTGVKKYIL NRLKPVLTRICKAFDVLVEFGYILQYETEYNKEEDTYYLTYIFNKEKDNTCHISSYLKPK KKKSIEQKTKMRNQNIEEAEVVEKTKKTKVKSYEEEFSETILASLEYLKRNSYIKSLWNQ 
RNDRKISNLLKTEDEAFVVDLLSRFGRSYNENIKASISVYMDGIIKKMRKEEKQMGNNLT LFPVNSFSNSTNVAKTKKQIIQSRPILVKESLTWKEIENKLKKYTEEERKKIEEKALEKY YQETGGNKSFILDAKKNNLARYHKIICSYIEEVLLEQ Prediction of potential genes in microbial genomes Time: Fri May 20 02:14:38 2011 Seq name: gi|224461437|gb|ACDD01000065.1| Fusobacterium sp. 3_1_5R cont1.65, whole genome shotgun sequence Length of sequence - 12193 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 3, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 10.2 1 1 Op 1 . + CDS 98 - 232 196 ## PROTEIN SUPPORTED gi|197735492|ref|YP_002164270.1| hypothetical protein FNP_0004 + Term 235 - 268 4.1 2 1 Op 2 16/0.000 + CDS 296 - 628 437 ## COG0594 RNase P protein component 3 1 Op 3 18/0.000 + CDS 637 - 846 178 ## COG0759 Uncharacterized conserved protein 4 1 Op 4 16/0.000 + CDS 865 - 1482 656 ## COG0706 Preprotein translocase subunit YidC 5 1 Op 5 4/0.000 + CDS 1485 - 2234 855 ## COG1847 Predicted RNA-binding protein 6 1 Op 6 11/0.000 + CDS 2243 - 3616 1974 ## COG0486 Predicted GTPase 7 1 Op 7 . + CDS 3626 - 5470 2359 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division + Term 5483 - 5513 1.2 8 2 Op 1 . + CDS 5530 - 8481 3036 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 9 2 Op 2 . + CDS 8488 - 8553 90 ## 10 2 Op 3 1/0.000 + CDS 8550 - 8855 380 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 11 2 Op 4 1/0.000 + CDS 8842 - 9705 627 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 12 2 Op 5 . + CDS 9740 - 10024 468 ## COG2088 Uncharacterized protein, involved in the regulation of septum location + Term 10141 - 10209 13.5 + TRNA 10115 - 10189 71.6 # Gln TTG 0 0 + Prom 10115 - 10174 79.8 13 3 Op 1 . + CDS 10310 - 10744 487 ## gi|257453002|ref|ZP_05618301.1| hypothetical protein F3_08084 14 3 Op 2 . 
+ CDS 10772 - 12121 1849 ## COG0166 Glucose-6-phosphate isomerase Predicted protein(s) >gi|224461437|gb|ACDD01000065.1| GENE 1 98 - 232 196 44 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|197735492|ref|YP_002164270.1| hypothetical protein FNP_0004 [Fusobacterium nucleatum subsp. polymorphum ATCC 10953] # 1 44 1 44 44 80 86 7e-15 MKRTFQPNTRKRKKDHGFRSRMATKNGRKVLKRRRARGRQVLSA >gi|224461437|gb|ACDD01000065.1| GENE 2 296 - 628 437 110 aa, chain + ## HITS:1 COG:FN0002 KEGG:ns NR:ns ## COG: FN0002 COG0594 # Protein_GI_number: 19703354 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Fusobacterium nucleatum # 3 110 2 109 111 91 50.0 4e-19 MFHTIKSQDNFQNIYKTGKKIYGTYSLLFYKENQMNHNQYGFVASKKIGNAVCRNRIKRL FREFIKQNEILLPKFTTFILVAKKKSGEEIKTIKYEQIEKDLYKIFKIKK >gi|224461437|gb|ACDD01000065.1| GENE 3 637 - 846 178 69 aa, chain + ## HITS:1 COG:FN0003 KEGG:ns NR:ns ## COG: FN0003 COG0759 # Protein_GI_number: 19703355 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 69 1 69 82 94 65.0 4e-20 MKNMLLFSIRCYQKYISPYLGKNCRFYPTCSQYTYEAIQKYGCLKGIYLGIKRISKCHPF HPGGYDPLP >gi|224461437|gb|ACDD01000065.1| GENE 4 865 - 1482 656 205 aa, chain + ## HITS:1 COG:FN0004 KEGG:ns NR:ns ## COG: FN0004 COG0706 # Protein_GI_number: 19703356 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Fusobacterium nucleatum # 1 204 1 204 205 261 75.0 4e-70 MTYLYELLKQLISSLLLSVDNVVQNFGISIIIATIIVRIILLPLTLKQDKSMKAMKKIQP ELEILKEKYGNDKQLLNQKTMELYQKHKVNPAGGCLPLLVQLPILFALFGVLRGGIIPED SKFLWLELTKPDPFYIFPLLNGAISFFQQKLMGNSDNPQMKNMMYMFPIMMIFISYKMPG GLQLYWLTSSLTAVLQQYFIMKKGD >gi|224461437|gb|ACDD01000065.1| GENE 5 1485 - 2234 855 249 aa, chain + ## HITS:1 COG:FN0005 KEGG:ns NR:ns ## COG: FN0005 COG1847 # Protein_GI_number: 19703357 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein 
# Organism: Fusobacterium nucleatum # 96 248 10 162 163 142 58.0 5e-34 MIKNTQIKAMTEEEAKKRALNILEAKEYQIIGIKTLESPKSFLGLFNKNGLFEISVDTEK LEKEIIKTTPVIEKKKKQTSEIKEKRKTEDVSFKENQKETTENIISEREIVSKISTLLEN IGLNLRVEYKKISEKHYQFQLFGEDNGIIIGKKGKTLNSFEYLVNSIYKEYKIEIDVEGF KEKRNQTLRELGKKMAEKCIKNRKTIRLNPMPPKERKIIHEILNRYSELETYSEGRDPKR YIVIKYKKK >gi|224461437|gb|ACDD01000065.1| GENE 6 2243 - 3616 1974 457 aa, chain + ## HITS:1 COG:FN0006 KEGG:ns NR:ns ## COG: FN0006 COG0486 # Protein_GI_number: 19703358 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 457 1 455 455 614 72.0 1e-176 MLLDTIAAISTPRGEGGISIVRISGPESLHILEKIFFPKKNIPVKELRNYGIHYGHIKKG EEIIDEVLVSIMKAPNTYTREDIVEINCHGGYLITEKILELVLSSGARLAEMGEFTRRAF FHGRIDLTQAEAVMDIIHGKTETSLSLSMNQLRGDLKEKILSLKKAILDLAAHINVVLDY PEEGIDDPIPENLLKNLRQVSVEIKELISSYQKGKMIKEGVKTVIIGKPNVGKSSLLNSI LREERAIVTQVAGTTRDIIEEVINIKGIPLVLVDTAGIRNTTDLVENIGVMKSKEFLQKA DLVLFVLDASQELSKEDEEIYASLQENQKVIGILNKTDLEKKIQISSLSKIKNWIEISAM KYIGIEEMEEKIYQYILQENVEESSKKLILTNIRHKSALEKTNQAIENIFATVEQGLPMD LMAVDIKEALDSLSEITGEISTEDVLDHIFHNFCVGK >gi|224461437|gb|ACDD01000065.1| GENE 7 3626 - 5470 2359 614 aa, chain + ## HITS:1 COG:FN0007 KEGG:ns NR:ns ## COG: FN0007 COG0445 # Protein_GI_number: 19703359 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Fusobacterium nucleatum # 2 613 3 624 628 890 69.0 0 MQEFDVIVVGGGHAGCEAALASARLGLKTAIITLYLDSIAMMSCNPSIGGPGKSNLVTEI DILGGEMGRHTDQFNLQLKHLNESKGPAARVTRGQADKFLYRTNMRLTLEHTENLSILQD CVEKLLVQEDEVYGVKTRLGIEYKAKSVILCTGTFLKGKVVIGDITYSAGRQGESAAEKL SENLRELGLQVERYQTATPPRIDKKSIDFSKLKELHGEKHPRYFSIFTEKKENTIVPTWL TYTNEKTLEKTKEMLQYSPIVSGIIETHGPRHCPSIDRKVLNFPEKTDHQIFLEMESLDS DEIYVNGFTTAMPPFAQDEILHTISGLEQAKIMRYGYAVEYDYMPAFQLYPSLENKKISG LFCAGQINGTSGYEEAAAQGLVAGINAARKILGKNPIFIDRSEAYIGVMIDDLIHKKTPE PYRVLPSRSEYRLHLRFDNAFMRLYDKTKEIGLLTQEKLLLVEKAIQNVKQEVERLKTIS 
ISMQEANQFLEKKQCSDFFSKGVKIADVLKKKEITYLDLKELIEIPDYPEFVRNQIETIL KYEIFMEREEKQILKFKELEHQFIPKDFDFSSVKGISNIALSGLLEVKPLSIGEAGRISG VTGNDLALLIAHLR >gi|224461437|gb|ACDD01000065.1| GENE 8 5530 - 8481 3036 983 aa, chain + ## HITS:1 COG:FN0019 KEGG:ns NR:ns ## COG: FN0019 COG1197 # Protein_GI_number: 19703371 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Fusobacterium nucleatum # 5 981 3 979 981 888 52.0 0 MEKIQKYRGEIPYFIQEHCKDILIYICSSYRNLEDYYSVLKDISSLPMYMLERKETEESI SQRYELFEFFKKKKKAILLLTLDMFLTKYKEIGSYQIFTVGKEYSITKLVEHLEQQEYTR NYLLEKKGEYSIRGDILDIYPYTDSSPIRIEFFGEEIERISYFDIENQKSFHLLKEYKMY TDNNKIEKSLIPFLNLEKKNYSLFFENIELLSYKLEEMILLEENEREKQKYRKEFENLYE NGIELEILQFQYQDLERFKKKEELEAISKSKKIILKSLEIEKYQEIYSNVISKYHKYPYF EGYENEKELVLTDRELKGIRVKREIEKKKKLKISSPEQIQEGEYIIHENYGVGLYLGMEI IDGKDYLRIQYADEDKLFVPLEGIQKIEKYVHVPGIIPEIYHLGTRGFSKKREKLQEDIL KFAKEILEIQAKRKSIGGFQYSPDTVWQEEFESSFPYTETSAQKKAIQDVKQDMEMGKIM DRLICGDVGYGKTEIAIRATFKAIMDHKQVVLLAPTTVLAEQHYHRFQERFLNYPIEIAV LSRMKTPKEQKEILEKIKSGSIDLVIGTSRLLSDDIEFKDLGFVIIDEEQKFGVKAKEKF KKIRGNINILAMTATPIPRTLNLSLLGIRDLSIVDTPPDGRKTIKTFFIEKKEENIVKAI LKELAREGQVFYVFNSVKRIEEKVKELEKILPSYVKIDYIHGKMSGKELKYKIEQFENMQ IDVLVSTTIIENGIDIENANTMIIEGMEKLGLSQIYQLRGRIGRGRRQSYCYCIISEYKS KKAEEREKSLIELGQGSGLDLSMEDMRIRGAGEILGEKQHGAIETLGYHFYMKMLEEEIA KLKGEKIEETERKLYISLPFAKYIPDFYIQKEEKIVIYKRALFLQTMEEILEFEKEILDR FGKFPQEVIGFFQYLKIQYYCKKFGIYELIETDFKYWIRFEENKVDIDRIVELFSQQKID YSQRSKQVVFEGNIFNFFEMYKK >gi|224461437|gb|ACDD01000065.1| GENE 9 8488 - 8553 90 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYDTIDFVKKTRYICNERTQK >gi|224461437|gb|ACDD01000065.1| GENE 10 8550 - 8855 380 101 aa, chain + ## HITS:1 COG:FN0020 KEGG:ns NR:ns ## COG: FN0020 COG1188 # Protein_GI_number: 19703372 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Fusobacterium 
nucleatum # 1 99 1 99 99 123 73.0 9e-29 MRLDKFLKVSRIIKRRPIAKLVVDEKKAKLDGKIAKSSTEVKVGQELELEYFNKYFKFKI LQVPSGNVSKEKTSELVELIESKGIEKNFSLDSEEEFFENI >gi|224461437|gb|ACDD01000065.1| GENE 11 8842 - 9705 627 287 aa, chain + ## HITS:1 COG:FN0021 KEGG:ns NR:ns ## COG: FN0021 COG1947 # Protein_GI_number: 19703373 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Fusobacterium nucleatum # 4 284 8 290 294 221 46.0 1e-57 MKIYKIKANAKINIGLNILGKAENGYHLLDMTMLPISYYDTLRIQVFSQKGGLHIFCKDR SIPRDKRNILFKIYEKFYQWTQIEPEKIKISLRKNIPSEAGLGGGSSDGAFFLKFLNTYY SYPLSKEELFWLAFEVGSDLPFFLKNMASRVEGTGEKITPFFHQSKQKILIFKPKFGFST KEAYELSDAYSTIKMADIPLIIQGLKEGNIQEKEENISNHLEEVLLLHKKELKKLKEKIE KYTRKKTFMTGSGSAYYIFLEEKSAYSIRRKCKKYFKDCKVQLCNFL >gi|224461437|gb|ACDD01000065.1| GENE 12 9740 - 10024 468 94 aa, chain + ## HITS:1 COG:SA0456 KEGG:ns NR:ns ## COG: SA0456 COG2088 # Protein_GI_number: 15926175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Staphylococcus aureus N315 # 1 94 9 98 108 102 55.0 1e-22 MKITDVRVKKIIGEETGRLKAYVDLTFDEAFVIHGLKLIEGESGKFIAMPSRKMPDGEFK DIVHPISSELRKEITDCVIQKYEEVLKEEIVSEE >gi|224461437|gb|ACDD01000065.1| GENE 13 10310 - 10744 487 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453002|ref|ZP_05618301.1| ## NR: gi|257453002|ref|ZP_05618301.1| hypothetical protein F3_08084 [Fusobacterium sp. 
3_1_5R] # 1 144 1 144 144 209 100.0 5e-53 MKKIILLIFLLLAFHSFSNHTLSTDHWSYEALKHVSNKKIINEDIQRFDGTKLVTKSEFV YSLSRILKLVETEKASQEDIRVLESLILQYSDELNKIGFDTKTYDNKLENINDNIQILQA LVNENEKKIDILMKRIEKLENKKY >gi|224461437|gb|ACDD01000065.1| GENE 14 10772 - 12121 1849 449 aa, chain + ## HITS:1 COG:FN2054 KEGG:ns NR:ns ## COG: FN2054 COG0166 # Protein_GI_number: 19705344 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Fusobacterium nucleatum # 1 449 1 448 448 636 69.0 0 MKSISFDFKTSRQFISEEEIENIKPQITLAANILENGSGAGNDFLGWLSLPTNYDKEEFI RIQEAAEKIKKQSEVLVVIGIGGSYLGARAVIECLNHTFYNHLDSKKRNTPEIYFVGHNI SGRYIKHLLEVIGDRDFSVNVISKSGTTTEPAIAFRIFKKKLEEKYGKKEAKGRIFATTD AKKGALKSLAIQEGYETFVIPDNVGGRFSVFTAVGLLPIAVSGISISELMSGAKDGELEY SKTFDENICYQYAAVRNILYRKNISVELLVNYDPRFHFIAEWWKQLFGESEGKDGKGLFP AAVDFSTDLHSMGQYIQDGKRILMETVLQVEAEEEDITLELEKEDLDGLNYLAGKTMHEI NQKAFSGTLLAHIDGGVPNFVITLPEVNAYYIGKLLYFFEKACGVSGYLLAVNPFNQPGV ESYKKNMFALLGKKGYEELSKELEKRLKK Prediction of potential genes in microbial genomes Time: Fri May 20 02:14:59 2011 Seq name: gi|224461436|gb|ACDD01000066.1| Fusobacterium sp. 3_1_5R cont1.66, whole genome shotgun sequence Length of sequence - 22113 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 6, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 33/0.000 + CDS 79 - 1011 1078 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 2 1 Op 2 35/0.000 + CDS 1004 - 2038 816 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 3 1 Op 3 . + CDS 1996 - 2784 195 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 2790 - 2832 4.3 + Prom 2904 - 2963 12.1 4 2 Op 1 . + CDS 3000 - 3068 74 ## 5 2 Op 2 . + CDS 3065 - 4048 1369 ## COG3181 Uncharacterized protein conserved in bacteria 6 2 Op 3 . + CDS 4059 - 4490 475 ## FN2104 hypothetical protein 7 2 Op 4 . 
+ CDS 4503 - 5981 2035 ## COG3333 Uncharacterized protein conserved in bacteria + Term 6018 - 6063 0.7 8 3 Op 1 40/0.000 - CDS 5993 - 7726 708 ## COG0642 Signal transduction histidine kinase 9 3 Op 2 . - CDS 7748 - 8431 639 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 8455 - 8514 8.6 10 3 Op 3 . - CDS 8531 - 9916 1173 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 9961 - 10020 8.8 + Prom 9897 - 9956 6.4 11 4 Tu 1 . + CDS 10020 - 10868 1643 ## COG0214 Pyridoxine biosynthesis enzyme + Term 10896 - 10941 2.5 - Term 10879 - 10934 8.1 12 5 Op 1 1/0.333 - CDS 10937 - 11512 631 ## COG1573 Uracil-DNA glycosylase 13 5 Op 2 . - CDS 11522 - 12301 1071 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 14 5 Op 3 . - CDS 12323 - 13507 1702 ## COG0281 Malic enzyme 15 5 Op 4 2/0.333 - CDS 13524 - 14066 602 ## COG1704 Uncharacterized conserved protein 16 5 Op 5 . - CDS 14088 - 15860 1617 ## COG4907 Predicted membrane protein 17 5 Op 6 . - CDS 15879 - 17366 1493 ## COG2317 Zn-dependent carboxypeptidase - Term 17380 - 17428 5.8 18 6 Op 1 . - CDS 17433 - 19142 2018 ## COG0442 Prolyl-tRNA synthetase 19 6 Op 2 1/0.333 - CDS 19198 - 21237 1934 ## COG1200 RecG-like helicase 20 6 Op 3 . 
- CDS 21276 - 22025 1213 ## COG0217 Uncharacterized conserved protein - Prom 22051 - 22110 6.0 Predicted protein(s) >gi|224461436|gb|ACDD01000066.1| GENE 1 79 - 1011 1078 310 aa, chain + ## HITS:1 COG:FN0885 KEGG:ns NR:ns ## COG: FN0885 COG0614 # Protein_GI_number: 19704220 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 36 308 18 283 286 179 38.0 6e-45 MKKITAILFMILSTTILAFTNMQEINGVKYQFDFNEAPKRAVSISQFTTEIMLKLGLEKQ MIGTAFLEEEIYPSVASSYRKVPVLAEKWPSLEQLLSKNPDFVTGWEVAFKKGVDSKMIH RSHINMFVPKSSIEFNADLNTLFDDYKMFGKIFHKEKEVEKYIATEKARVEKIKKEVKNK QEFTYFLYDSGTDKAFTVFEGFTTNLLKLVHGKNILSGKGVQKTWGETSWETVIAENPDY FIIVDYSVGIREETDSDSKIKAIKANPKLKNLKAVKNNKFIRVKLAEIVPGIRNVDFFER VAKEVYKIHE >gi|224461436|gb|ACDD01000066.1| GENE 2 1004 - 2038 816 344 aa, chain + ## HITS:1 COG:FN0884 KEGG:ns NR:ns ## COG: FN0884 COG0609 # Protein_GI_number: 19704219 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 20 342 25 344 345 235 44.0 9e-62 MNKKWKITLLCIASLLIPIFCIGFGSIKIDNKWVVQIVMNHVLGKEYFVCKWERTLETIV WDLRFPRILLAFLTGAALSLVGVIMQTITKNNLAEPYILGISSGASAGAVSVIILSGTYP ILQKISIEQGAFLGSLLSISMVFFISSRHLTRGSSLILTGVGVSSFFSAMTTVIIYSSKN NSQLVTAMFWMTGSLSSAAWESLFYPFLIFLFFTILVYLYSHELDILLMGDTDANTLGVH TQFLKFIMIGISTLLISILVSLTGIIGFIGLVIPHIARKIIGYQHRTLVIFSTLLGGNFL VVADTFARSYFSPEEMPIGVITAFIGTPIFLWIVRRNYSYGGRE >gi|224461436|gb|ACDD01000066.1| GENE 3 1996 - 2784 195 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 14 237 1 229 245 79 25 2e-14 MDCKKELFLWGERMIEVKNLSYHKDNKDILKNVSLSFQENCITGILGANGSGKTTLLRHL IRELPSHNAIYIGGKEINQISKKDFAKKISFISQNTMYIPEMTIEDIVMIGRYPYKKLFF NYSKEDEKKVEESLLLFNLENLRQKAIGSVSGGEAKRAFIARAFAQNTEILILDEPINHL DIKHQLALLKLFHKLKEKTIILSIHNLEFALKFCDQIILMKDGKVIEMGKTEAVFSSQKI LEVFEVEVEVKKIADEKVIMYR 
>gi|224461436|gb|ACDD01000066.1| GENE 4 3000 - 3068 74 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIMLIKKGGKLTKILTLLTKI >gi|224461436|gb|ACDD01000066.1| GENE 5 3065 - 4048 1369 327 aa, chain + ## HITS:1 COG:FN2103 KEGG:ns NR:ns ## COG: FN2103 COG3181 # Protein_GI_number: 19705393 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 21 327 2 307 308 448 74.0 1e-126 MKSKFLKVFSSVVMGAVLLSACGTDKSKAENGGDKYPSKPVNVIVAYKAGGGTDVGARIL VSEAQKSFPQPFVIVNKPGADGEIGYTELLKSEADGYTIGFINLPTFVSIPLQRKTNFQK EDAQAIMNHVYDPGVLVVREDSKWKNLEEFVEDAKQNPDALTISNNGTGASNHIGAAHFA YEAGIKVTHVPFGGSTDMIAALRGSHVDATVAKISEVASLVKNKELRILGTFTDERLEGF EDVPTLKEKGYNVLFGSARALVAPKGTPEEIIQYLHDTFKTALESPENIEKSNNANLPLK YMSGEELTNYINEQDQYIKEMVPKLGI >gi|224461436|gb|ACDD01000066.1| GENE 6 4059 - 4490 475 143 aa, chain + ## HITS:1 COG:no KEGG:FN2104 NR:ns ## KEGG: FN2104 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 143 1 147 147 108 53.0 6e-23 MIKYDRILTIGLIILEALYFCMIKSLPEKAAKYPLFVLALLIILTIALGIKSFTTKIEKE KSEIFQGFQGKQFIFIVVLSAIYIFGIEKIGFFISSFVYLIVIMVGLKSNIKWAVISSIV FCLLIYSIFVVFLKVPVPNGILI >gi|224461436|gb|ACDD01000066.1| GENE 7 4503 - 5981 2035 492 aa, chain + ## HITS:1 COG:FN2105 KEGG:ns NR:ns ## COG: FN2105 COG3333 # Protein_GI_number: 19705395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 492 1 492 494 660 83.0 0 MSDILFGFVTALAPVNLLAACLSVSIGIIIGALPGLSAAMGVALLIPITFGMPASTGLIV LAGVYCGAIFGGSISAILIRTPGTPAAAATAIDGYELTLKGKAGKALGTAVIASFIGGIL SSISLYLFAPTLATLALKFGPAEYFWLSIFGLTIIAGASTKSITKGLISGAIGLMLSTIG MDPMLGNPRFTLGIPSLLSGIPFTASLIGLFSMSQVLMLAEKKIKESGNLVHFEDKILLT KEELLRILPTALRSTVIGNLIGILPGAGASIAAFLGYNEAKRFSKHKEEFGHGSIEGIAG SEAANNAVTGGSLIPTFTLGIPGESVTAVLLGGLLIQGLQPGPDLFTIHGKITYTFFAGF IIVNIFMLILGLTGSKIFAKISRVPDTYLIPIIFSLSVIGSYAIHNQMADVMIMFVFGFI GYVVNKLELNSASIVLALILGPIGESGLRRSIILNHGKLDILFKSPVSIFLIVCTILSLF SPMIMKKLQKRS 
>gi|224461436|gb|ACDD01000066.1| GENE 8 5993 - 7726 708 577 aa, chain - ## HITS:1 COG:BH3156 KEGG:ns NR:ns ## COG: BH3156 COG0642 # Protein_GI_number: 15615718 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 2 569 5 577 589 196 26.0 1e-49 MKKKILLICFTLILSSIFTVSIIFYNMMKHNYIESILANANSNIQLIHLILAENKYADKY LFKLSQSLSQKTGFRVTFIRTDGIPLADSNDNSILFENFQSLPSFQIAKKNITSHYVKKQ PLTTIPEIKIFTKLHFYNKKSTILMLSKKLTFLEEFQKKFFLAILTGIFISSILSVFLSL YFTAWATKPILQLTNAVREISQGNFCPKLLLRSHDELEELAKNFYNMNQKIKILLQDIQN KANNLQNILDNLSEGILLLDIQGNVILMNKFAEFEFEISNSTHNFFSYSNFSFCHKEIQQ SLLNKQTFELKKRIGKKIYKLHNHFMEENKQMILVIQNITQLEQNEELRREFVSNASHEL KTPLTIISGFIETIKLGHVQEKQQLEHILNIIDLESKRLNKLVNNLLHLSHLEKNVEQTN KKIYRVSLYRTIPQIKNLYQPLLEEKDISLDISIANDFIESHISEEFLHIVLGNLLENAI KYSKIHSNIILSSKIDNRKLYFKIQDFGCGIAKDEQEKIFQRFYRVDPSRNNKIKGNGLG LSIVKKMIENVNGNISVESELQKGTTFLITIPITEKS >gi|224461436|gb|ACDD01000066.1| GENE 9 7748 - 8431 639 227 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 226 3 227 232 200 47.0 2e-51 MGEKILIIDDEEAILELLKFNLEIYGYKIFTSNTGKGILEKIIEIHPNIILLDLMLPEID GMSICKKVRENSIWNDLRIIILSAKSQEIDKITCLEIGADDYITKPFSIRELIARIHAFS RRISPTVPTTQEIIQYHDLVIDPKEKTVLKKDKKISLTLLELKLLLYLLKNQGKISTREM IFKNVWNYEEQNNTRSLDVNIRKLRQKLEDSNNHYIETIRGIGYKLL >gi|224461436|gb|ACDD01000066.1| GENE 10 8531 - 9916 1173 461 aa, chain - ## HITS:1 COG:FN1462 KEGG:ns NR:ns ## COG: FN1462 COG1167 # Protein_GI_number: 19704794 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Fusobacterium nucleatum # 1 448 2 456 469 530 60.0 
1e-150 MIFPLDNNSKTPLYIQMYSEIKKQIQDGSLHSNEKLPSKKHFMEQYHISQNTVQNALYLL LEEGYLYSIERRGYFVSNLENIFTKSLPSKTVQKENNISKVKYDFAYSGVDVQSIPKTIL KKITRDIYDEQNTELLFQGDIQGYLPLRESICQYLENSRGFSVSSNQIIISSGTEYLFYI IFKIFDQKIYGLENPGYKMLQELFTSNQIEFHPIPLDESGIQVEELEKQKVQIACITPSH QFPSGIIMPIRRRNELLQWANSSEERYIVEDDYDSEFKYNGRPIPALKAIDQKDKVIYMG SFSKSISPALRVSYMVLPKNLLTVYERKLPYFICPVSTLSQKILHKFISEGYFIKHLNRM RTLYKQKREFIVQSFKKTNITILGADAGLHLLLSFPPSFPESKFLADCKKHSIRLYPIRE YYFQENITTNPIFLLGYASLEKKQIQEGISLLLKILESNQE >gi|224461436|gb|ACDD01000066.1| GENE 11 10020 - 10868 1643 282 aa, chain + ## HITS:1 COG:FN1463 KEGG:ns NR:ns ## COG: FN1463 COG0214 # Protein_GI_number: 19704795 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Fusobacterium nucleatum # 3 282 1 280 280 466 89.0 1e-131 MDMITKFNGGVIMDVTNVEQAKIAEEAGAVAVMALERVPADIRAAGGVSRMSDPKMIKEI MAAVKIPVMAKVRIGHFVEAEILEAIGIDFIDESEVLSPADNVYHVNKNEFKTPFVCGAR NLGEALRRICEGAKMIRTKGEAGTGDVVQAVSHMRQIMKEMNIVKSLREDELYVMAKDLQ VPYELVKYVHDHGRLPVPNFSAGGVATPADAALMRRLGADGVFVGSGIFKSGDPRKRAKA IVEAVQNYNNPEVIARVSENLGEAMVGINEEEIKVIMAARGV >gi|224461436|gb|ACDD01000066.1| GENE 12 10937 - 11512 631 191 aa, chain - ## HITS:1 COG:FN0901 KEGG:ns NR:ns ## COG: FN0901 COG1573 # Protein_GI_number: 19704236 # Func_class: L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Fusobacterium nucleatum # 1 188 1 188 195 155 46.0 4e-38 MLEKNDLWEELKYGAASIGNTILKPHQLEVLIGGGNPDSDILILGDDPELYLNENLKTKE GSSGEFLYLLLEFCGIQKEDIYVSTLSKRNARLKDFMPEDYEKLKELLICQIGLLSPKVI VCLGYEAAQMLLEKEINLEKDRQEVFTWKAGIQVFVTYDVNTVKKARAELGKKAKLALEF RNDLKKLEYFK >gi|224461436|gb|ACDD01000066.1| GENE 13 11522 - 12301 1071 259 aa, chain - ## HITS:1 COG:FN0900 KEGG:ns NR:ns ## COG: FN0900 COG1235 # Protein_GI_number: 19704235 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Fusobacterium nucleatum # 1 257 1 257 260 327 60.0 1e-89 
MKVAMLGSGSGGNASYVEENGYGILIDAGFSCKKIEERLASIGKSAENIKALLITHEHTD HISGAGILARKYNLPIYISPESLEVCRQKLGKIAEDQIHCIQKDFFLNENIYVKPFDVMH DAVRTLGFHIETASQKKLAISTDIGYITNLVREAFQDVDVAILESNYDYNMLMNCSYPWD LKARVKGRNGHLSNNDAAKFIREMYTNKLQKIFLAHVSKDSNHPNIIHDTMELEFEKYSQ KPNYEISSQNIATKLFESK >gi|224461436|gb|ACDD01000066.1| GENE 14 12323 - 13507 1702 394 aa, chain - ## HITS:1 COG:SA1524 KEGG:ns NR:ns ## COG: SA1524 COG0281 # Protein_GI_number: 15927279 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Staphylococcus aureus N315 # 6 393 5 390 409 432 57.0 1e-121 MSNVYEESLKLHEANHGKLSVVSKVTVKSREDLSLAYSPGVAEPCRKIQENKENVYRYTS RGNMVAVITDGTAVLGLGDIGPEAALPVMEGKAVLFKEFGGVDAFPICLDTKDTEEIITT IKRIAPGFGGINLEDISAPRCVEIETRLKEELDIPVFHDDQHGTAIVVVAGLINSLKLLK KNVEEIKVVINGIGAAGSSIAKLILQLGVPGKNMLLVGKDGILNREQSENYNHIHKELSC RTNDACQTGTLKDAIQGADVFVGVSVGGIVSAEMIESMNHDAIVFAMANPTPEIMPEEAK KAGARIVGSGRSDYPNQINNVLVFPGLFKGALRAKSKKITEEMKMAAAVGLANLITEEEL KDDYIIPGAFDPRVAETVAKEVEKVAKAQGICRE >gi|224461436|gb|ACDD01000066.1| GENE 15 13524 - 14066 602 180 aa, chain - ## HITS:1 COG:FN1125 KEGG:ns NR:ns ## COG: FN1125 COG1704 # Protein_GI_number: 19704460 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 15 180 18 183 183 204 63.0 7e-53 MIIIFIIIIVVCFMAISFKNKFVVLLSRVKNAWSQIDVQLQRRFDLIPNLVETVKGYAAH EKGTLEAVIAARNQYVSAGSVQEKMEASNQLTGVLRQLFAVSEAYPDLKSNTNFLQLQEQ LKEVEDKVAYARQFYNDTVTKYNQSIQLFPASLFAGLFHYVEEPLFQAVTGSQEVPKVKF >gi|224461436|gb|ACDD01000066.1| GENE 16 14088 - 15860 1617 590 aa, chain - ## HITS:1 COG:FN1127 KEGG:ns NR:ns ## COG: FN1127 COG4907 # Protein_GI_number: 19704462 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 26 529 27 544 606 345 39.0 1e-94 MKKIFSLLFFICFSLVLFSADFEITNLNITAKLEENASMKVREEVQYRIGEINGILFDLD AKGNGPLTSLAVYATDENGNFEKVPQTNLEITEEDELYRIKVYARTVNQIRTFAFVYELQ GGAKLYQDIAELNRIFVGKNWQSPIGQVQVKVLLPNTVPQDSIHAYGHGPLTGNISLEDN 
TIFYNLEQYYPGDFVEAHILFGPQGLSGVPQDLLVKEKAKDRLLAQEKAWAEEANVERER YQKLEKHGKFAFGIEAFCLALYFLFAKFILRKPKKLEQEFPEYFRELPTDDSPAIVGNFF QAENSEKIFATIMDLVRRKYLNLELRGAEQILTINTEKNKTENLTPYEKEIIEIYLHQIG SRSEVNLSTISKQKLSLSISQRILGWNSLVKREYAAKGYGDSRSPLIILGVFCCFLFLGL SIVAISVFEQVQFAFFIPVIFAFLLPYTFNSKFPNAKTTESMQKWKAFKKFLEDYSLLKE AKIDSIYLWEHYFVYALVLGVADKVAKAYQLALEKGEILMPEGRSSLHYYAPCLHSYIRQ PSLHQNIQKTYQRSHQSIARSTRSSSIGRGGGFSGGSSGGGGSRGGGGAF >gi|224461436|gb|ACDD01000066.1| GENE 17 15879 - 17366 1493 495 aa, chain - ## HITS:1 COG:FN0061 KEGG:ns NR:ns ## COG: FN0061 COG2317 # Protein_GI_number: 19703413 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent carboxypeptidase # Organism: Fusobacterium nucleatum # 6 495 3 495 496 541 57.0 1e-154 MKDKIQEFKECIKEKKYLLASIEVLQWELETLAPKKGQDYLSEVLAYMSMKDYELSTSDK FQNLVRDLLQEKESLDPILQKEVEQAAEEMEKMKKIPAEEYRAYAELCAKNQGVWEEAKQ NNNFQLVEENLTKIFEYNRKFARYLQKEEKNLYDVLLRDYEKGMTCEKLDVFFASLKKEI VPLLHKIQKKKKQPFPFLTSPISKEKQKEFCHLLAEYLGFDFERGILAESEHPFTLNINK KDVRITTKYVEALPFSSIFSTIHETGHAIYEQQIGDELVSTLLGSGGSMGLHESQSRFWE NIIGRSFEFWKELYPSLQTHFTSLKTIPLEEFYQAINQVEASLIRTEADELTYCLHIMLR YELEKEIIEGTLSVKDLPKAWNEKIEEYLGITVPNATEGVLQDVHWYAGLIGYFPSYALG NAYASQLFHTMKQELSLNDYSQDKLQEVRLWLGENIHQYGMMKTTSELIREITGEDLNPD YYIEYLKNKYEALYQ >gi|224461436|gb|ACDD01000066.1| GENE 18 17433 - 19142 2018 569 aa, chain - ## HITS:1 COG:FN1658 KEGG:ns NR:ns ## COG: FN1658 COG0442 # Protein_GI_number: 19704979 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 1 569 1 567 567 865 75.0 0 MRFSKAYIKTLKETPKEAEIISHQLLLRAGMIKKLASGLYTYLPLGFRTLKKVENIIREE MDRAGSQELLMPVLQPAELWQESGRWNVMGEEMVRLKDRHQRDFVLGPTNEEVITDIVRN DISSYKSLPINLYHIQTKVRDERRPRFGLMRSREFIMKDAYSFHTSQESLDEEFENMKNT YTRIFERCGLKFRPVEADSGAIGGSGSQEFHVLAESGEDEIIYSDGCSYAANVETAISKI ENPPKEEEKEVELVSTPNASSIEELSQFLNVPKYKTVKAMMYKDLGTDTFAMVLIRGDFE VNEVKLKNALNAIAIELAKDEEIEALGLTKGYIGPYALQNKNFTIIVDPTVLEVSNHILG 
GNQKDSHYINVNYGRDYTADMVKDIRLVKAGEDCPRSNGKLHSARGIECGHIFKLGDKYS KALGASYLDEKGESKIMLMGCYGIGVGRTMAAAIEQNYDEHGIIWPSALAPYLVDVIPAN IKNAEQMQLAEKIYEQLNAEHLDAMLDDRDERPGFKFKDADLIGFPFKVICGKKAAENIV ELKIRKTGETFEIPVDEIISKIKDLEKQY >gi|224461436|gb|ACDD01000066.1| GENE 19 19198 - 21237 1934 679 aa, chain - ## HITS:1 COG:FN1660 KEGG:ns NR:ns ## COG: FN1660 COG1200 # Protein_GI_number: 19704981 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Fusobacterium nucleatum # 10 674 18 684 689 790 62.0 0 MEQYHSTLYQVLDSKKYKGLKALGIKTVHDLLYYFPRAYDNRSNIKKIAELRMEEYAVIH AKLLHVYSAPTKLGRKMTKATATDGSGFLEIVWFGMPYLQKSLKLQEEYIFVGTVKRAMG AFQMTNPEFKLSKGQKMRGEILPIYSSHKNLSQNRLRKYLKEILFENSLLSENIPKEICQ KYNILGRNQALSEIHFPSSEKILEEAKRRFAIEELLIIEMGILKNRFLTDALTQAFYHLE GKKTLVKQYLSSLPFQLTKAQKKVITEIYKDLEQGRIVNRLVQGDVGSGKTMVAMVLLLY MIENGYQGALMAPTEILAIQHYLGIYSKMQELGLRVELLTGSIRGKKRRKLLDDLKEGNI DLLIGTHALLEEEVRFHQLGFIVIDEQHRFGVLQRKKLREKGILTNLLVMTATPIPRSLA LSIYGDLDVSILDELPPGRSPIKTKWISTKEDMEKMYAFIRKQLSQGKQAYFVAPLIEES EKLLLSSILEVEEEVKEKLPNYKIALLHGRMKNIEKDEIMQRFKQREIDILVSTTVIEVG IDVPNAVIMTILNAERFGLSALHQLRGRVGRGKDASFCFLISKTQNETSKQRLEIMEATQ DGFIIAEEDLKLRNAGEIFGLRQSGLSDLRFIDLLHDVKTIKLVRDECMEYLRKNQGKIL LPSLEEDIFQKFKDSVQKD >gi|224461436|gb|ACDD01000066.1| GENE 20 21276 - 22025 1213 249 aa, chain - ## HITS:1 COG:FN1661 KEGG:ns NR:ns ## COG: FN1661 COG0217 # Protein_GI_number: 19704982 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 249 1 249 249 394 83.0 1e-110 MSGHSKWNNIQHRKGAQDKKRAKLFTKFGRELTIAAKEGGGDPNFNPRLRLAIEKAKAGN MPKDILERAIKKGTGELEGVDFTEIRYEGYGPAGTAFIVDVVTDNKNRSASEVRTVFSRK GGNLGADGAVSWMFKKLGIIEVASEGLDLDEFMMAALEAGAEDVTDEGETFEVVTDYTQL QTVAENLKAAGYTYTEAEISMVPDNKVEITDLETAKKVMLLFDSLDDLDDVQEVYSNFDI PEELLEQLD Prediction of potential genes in microbial genomes Time: Fri May 20 02:15:11 2011 Seq name: gi|224461435|gb|ACDD01000067.1| Fusobacterium sp. 
3_1_5R cont1.67, whole genome shotgun sequence Length of sequence - 9699 bp Number of predicted genes - 15, with homology - 12 Number of transcription units - 6, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 234 - 422 223 ## - Prom 452 - 511 4.5 + Prom 325 - 384 1.9 2 2 Tu 1 . + CDS 441 - 515 74 ## + Prom 635 - 694 7.2 3 3 Tu 1 . + CDS 715 - 825 142 ## + Term 902 - 936 -0.5 4 4 Op 1 . - CDS 926 - 1192 248 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 5 4 Op 2 . - CDS 1189 - 1419 432 ## gi|257467415|ref|ZP_05631726.1| hypothetical protein FgonA2_08228 - Prom 1464 - 1523 15.4 + Prom 1963 - 2022 7.9 6 5 Op 1 . + CDS 2060 - 2578 946 ## COG2109 ATP:corrinoid adenosyltransferase 7 5 Op 2 . + CDS 2652 - 3626 1187 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 8 5 Op 3 2/0.000 + CDS 3637 - 4113 721 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 9 5 Op 4 . + CDS 4115 - 5473 1349 ## COG0534 Na+-driven multidrug efflux pump 10 5 Op 5 . + CDS 5527 - 6831 1632 ## COG0427 Acetyl-CoA hydrolase 11 5 Op 6 14/0.000 + CDS 6876 - 7253 449 ## PROTEIN SUPPORTED gi|237736456|ref|ZP_04566937.1| LSU ribosomal protein L21P 12 5 Op 7 14/0.000 + CDS 7266 - 7601 275 ## PROTEIN SUPPORTED gi|237742036|ref|ZP_04572517.1| 50S ribosomal protein L27 13 5 Op 8 . + CDS 7602 - 7886 481 ## PROTEIN SUPPORTED gi|237736458|ref|ZP_04566939.1| LSU ribosomal protein L27P + Term 7894 - 7946 12.6 + Prom 7897 - 7956 7.6 14 6 Op 1 . + CDS 7978 - 8904 960 ## COG0501 Zn-dependent protease with chaperone function 15 6 Op 2 . 
+ CDS 8983 - 9630 491 ## COG0671 Membrane-associated phospholipid phosphatase Predicted protein(s) >gi|224461435|gb|ACDD01000067.1| GENE 1 234 - 422 223 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVEVAGIEPASEIKVTISFYKLSLLLYFADVTPANRANKSYSLKFPFCLEKSQKVICIWV TP >gi|224461435|gb|ACDD01000067.1| GENE 2 441 - 515 74 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKCENLIFQGFTKEETEVYCDDIL >gi|224461435|gb|ACDD01000067.1| GENE 3 715 - 825 142 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MISLAIIGCGNVSFFVSPFIGIALVCYGLFRNPYKK >gi|224461435|gb|ACDD01000067.1| GENE 4 926 - 1192 248 88 aa, chain - ## HITS:1 COG:FN0211 KEGG:ns NR:ns ## COG: FN0211 COG2026 # Protein_GI_number: 19703556 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability system # Organism: Fusobacterium nucleatum # 1 87 1 87 88 101 58.0 4e-22 MKYQVEFTKTASKKFQKLDSSIKKILFSWITKNLQNCSNPRAFGKALKGNLSDKWRYRVG DYRIMARIEDSKIIIIIVDIGHRKDIYE >gi|224461435|gb|ACDD01000067.1| GENE 5 1189 - 1419 432 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257467415|ref|ZP_05631726.1| ## NR: gi|257467415|ref|ZP_05631726.1| hypothetical protein FgonA2_08228 [Fusobacterium gonidiaformans ATCC 25563] # 1 76 1 76 76 91 98.0 1e-17 MSVVSLRLNEKEEKVLKEFAGFENIGISTYIKKVLFEKLEEEYELKLFDSLWNEHIQSGG ETVTLEEVAKENGIEL >gi|224461435|gb|ACDD01000067.1| GENE 6 2060 - 2578 946 172 aa, chain + ## HITS:1 COG:FN1790 KEGG:ns NR:ns ## COG: FN1790 COG2109 # Protein_GI_number: 19705095 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Fusobacterium nucleatum # 2 172 3 173 173 191 61.0 4e-49 MKTYVQIYTGNGKGKTTASLGLAVRALGNGWKVLLCQFMKGQNYGELRTLATFPNMTIRR FGTGNFIRKIENVQEIDKKLAREGYSFLKEVIQSGEYSLVIADEIFVARRFGLVSSEEIL SLIQLKSENTELVLTGRHAPDEIIEKADLVTEMCEVKHYFKQGVKAREGIER >gi|224461435|gb|ACDD01000067.1| GENE 7 2652 - 3626 1187 324 aa, chain + ## HITS:1 COG:FN1786 
KEGG:ns NR:ns ## COG: FN1786 COG2870 # Protein_GI_number: 19705091 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Fusobacterium nucleatum # 7 323 4 320 323 382 61.0 1e-106 MRRKDWITKITENFQKVKIAVLGDLMLDDYIIGKVERISPEAPVPVVNVEEEKFVLGGAA NVVNNLSNLGAEVYCLGVIGTGHNSKRLLSAFDKKVHIDGIIRSEERPTIVKKRVLSGNH QLLRLDWEDSTAISKKLEDELLERFVKISSEIDAIILSDYNKGVLTSRVSKEIIRICREK NIIVTVDPKPINIDNYCGASSITPNRKEAYQCAGVSTSYSIEALGMDLRKKYELETVLIT RSEEGMSLYREDIYTVPTFAKEVYDVTGAGDTVISVFTLSKVAGASWQEAAEIANTAAGV VVGKVGTSTVSIEEIQREYCRIYE >gi|224461435|gb|ACDD01000067.1| GENE 8 3637 - 4113 721 158 aa, chain + ## HITS:1 COG:FN1788 KEGG:ns NR:ns ## COG: FN1788 COG0245 # Protein_GI_number: 19705093 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Fusobacterium nucleatum # 1 158 1 158 160 213 66.0 2e-55 MFRIGNGYDVHVLTEGRKLILGGVEIPHTKGVLGHSDGDVLIHAIMDALLGALSLGDIGL HFPDTEEEYRGISSLLLLKKIKELVQEKGYRVGNIDATIALQKPKLRPYIDTMREKIANI LEIDVDRVSIKATTEEKLGFTGREEGIKAYAVTLLEKE >gi|224461435|gb|ACDD01000067.1| GENE 9 4115 - 5473 1349 452 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 450 12 458 459 400 49.0 1e-111 MGKDKKEFYSMVWKLVLPMAIQNVVNVAVISTDVIMLSKVGEKVLAGASLASQLQFIMTL ICFGITSGATILTAQYWGKGDKRTVEKILGLSLKLSLIVSFFFFVLATFFPKFSMEIFSK DPAVIEEGVKYLRIVGFSYLLTAITIVYLNILRSIEKVFIATLVYTVSLCTNIGVNAILI FGLLGFPKMGIVGAAIGTLVARLIEIIMVAIYAKKNETLLRLHLQDIFKVSRILWKDYFH YATPVIFNELCWGAGIAANAAILGHLGSSMVAASSVTQILRQLSAVVTFGIANAAAILIG KTIGEKRYDLAQNYAKRLIRLSIISCSIGSLLIFCISPWVVKHFAVTPEIQDYLSYMLKI IVLYIIAQGISVVFIVGIFRAGGDSRYGLFVDFSTMWLGSILLGFIGAFILHLPVKIVYL LLMCDEFLKVPMVIKRYKKRKWLKNVTRDFIS >gi|224461435|gb|ACDD01000067.1| GENE 10 5527 - 6831 1632 434 aa, chain + ## HITS:1 COG:FN0621 KEGG:ns NR:ns ## COG: FN0621 
COG0427 # Protein_GI_number: 19703956 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Fusobacterium nucleatum # 1 434 1 434 434 540 60.0 1e-153 MTHWKGLYQERLCSAEQAVKSIPNNCRVVPSHAAGEPKHLVEAMMANREQYHNVDIFSMV NLGHAAYGKEEEKEHFHVNAAYASASTREVVNAEHGDFTPCFFYQVPELLKKDGPMPADV ALIQVSLPDEHGYCSLGVSSDYTKEAAENAKIVIAQVNKYMPRTLGNNFVHVSKMTHIVE YDEPIHILNPPFVGETERKIGEYCASLIQDGDTLQLGIGAIPDAVLSFLTDKKHLGIHSE MISDGVVDLIEAGVIDNSRKNFNPGKSIVSFLMGTEKLYNYVHNNPALEMHPVDYVNHPI IAAQNDNLVSINSALQVDLMGQANSETLGHKQFTGIGGQVDFVRAASMSKGGRTIIAMPS TAAKGKISKIVFLLDEGAAVTTSRTDIDYVITEYGIAKLRGKSLRARAKALIEIAHPDFR EGLREQALQKFGRL >gi|224461435|gb|ACDD01000067.1| GENE 11 6876 - 7253 449 125 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237736456|ref|ZP_04566937.1| LSU ribosomal protein L21P [Fusobacterium mortiferum ATCC 9817] # 23 125 1 103 103 177 87 3e-44 MDYAWKRSSATLPATNTFGGVRMYAVIKTGGKQYKVAEGQVLRVEKLNAEVNETVELQEV LLVADGENVKVGTPVVEGAKVVAEILAQGKGAKVINFKYKPKKASHRKKGHRQLFTEIKV TSIQA >gi|224461435|gb|ACDD01000067.1| GENE 12 7266 - 7601 275 111 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237742036|ref|ZP_04572517.1| 50S ribosomal protein L27 [Fusobacterium sp. 
4_1_13] # 1 111 1 109 109 110 50 4e-24 MIRVTVVRKNGNITGYYAKGHAEYAELGSDIVCAAASTAMQNPLAGMQEVLRLNPQYGFD DDGYITVTLDRMNFQGKEKEVSSLLETMVVMIRELERNYPKNIKLVEKEEK >gi|224461435|gb|ACDD01000067.1| GENE 13 7602 - 7886 481 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237736458|ref|ZP_04566939.1| LSU ribosomal protein L27P [Fusobacterium mortiferum ATCC 9817] # 1 94 1 94 94 189 96 5e-48 MKFILNIQLFAHKKGQGSVKNGRDSNPKYLGVKKYDGEVVKAGNIIVRQRGTAFHPGNNM GMGKDHTLFALIDGYVKFERLGKDKKQVSIYASK >gi|224461435|gb|ACDD01000067.1| GENE 14 7978 - 8904 960 308 aa, chain + ## HITS:1 COG:FN0920 KEGG:ns NR:ns ## COG: FN0920 COG0501 # Protein_GI_number: 19704255 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Fusobacterium nucleatum # 1 305 1 305 309 442 71.0 1e-124 MYGLSQIRNKQIQVPHLNIFKIGTWVMMGIFASYLMIYLFLGQEILNYFPLLLLFAFATP LFSLWMSKASVKRAYHIRLIGEGGARNEKEQLVVDTIQLLSEKLKLQKLPEIGVYPSYDV NAFATGASKNSALVAVSQGLLQTMDETEIIGVLAHEMSHVVNGDMLTSSILEGFVSAFAL IATIPFLFGRSDNDRGERAGSSLMTYYLVRNIANFFGKLVSSAYSRRREYGADRLASKIT GAVYMKSALMKLQDISQGRVNLQAEDRRFANFKITNNFSMGGMANLFASHPSLENRIEEV EKLEQQGW >gi|224461435|gb|ACDD01000067.1| GENE 15 8983 - 9630 491 215 aa, chain + ## HITS:1 COG:BMEII1103 KEGG:ns NR:ns ## COG: BMEII1103 COG0671 # Protein_GI_number: 17989448 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Brucella melitensis # 55 204 28 169 292 61 31.0 1e-09 MMNVSNVLENILLPVYRLDYFFLKAFCHEVKGDNVTNILYPYFQEEKLDKFFHAVTHFGE GYLEFFLVLLFFLLFLYDKKKFKVCKEYALSLILVLCSTQVVVNILKLTFGRARPYVFFD PERFYGIFYLIDNHLLMNSQYHSFPSGHTITIWGTIWFFFFTVKSKYRYLLFSLGFLVAL SRMYLGYHWFSDVTVSIGLSYVIVKWIVTKRSVMR Prediction of potential genes in microbial genomes Time: Fri May 20 02:15:28 2011 Seq name: gi|224461434|gb|ACDD01000068.1| Fusobacterium sp. 
3_1_5R cont1.68, whole genome shotgun sequence Length of sequence - 891 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 235 - 284 6.5 1 1 Tu 1 . - CDS 417 - 605 225 ## gi|257453037|ref|ZP_05618336.1| hypothetical protein F3_08261 - Prom 694 - 753 7.3 Predicted protein(s) >gi|224461434|gb|ACDD01000068.1| GENE 1 417 - 605 225 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453037|ref|ZP_05618336.1| ## NR: gi|257453037|ref|ZP_05618336.1| hypothetical protein F3_08261 [Fusobacterium sp. 3_1_5R] # 1 62 1 62 62 81 100.0 1e-14 MAEHGGKREGAGRPTSPDKKIQKSIKIDPTLYKQIESLDGSFISKIERGLELLLQEENEH KK Prediction of potential genes in microbial genomes Time: Fri May 20 02:15:40 2011 Seq name: gi|224461433|gb|ACDD01000069.1| Fusobacterium sp. 3_1_5R cont1.69, whole genome shotgun sequence Length of sequence - 27477 bp Number of predicted genes - 48, with homology - 42 Number of transcription units - 16, operones - 7 average op.length - 5.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 2.1 1 1 Op 1 . + CDS 87 - 326 362 ## gi|257453038|ref|ZP_05618337.1| hypothetical protein F3_08266 2 1 Op 2 . + CDS 332 - 1123 776 ## SSUBM407_p004 toxin of epsilon-zeta postsegregational killing system 3 1 Op 3 . + CDS 1120 - 1368 284 ## gi|257453040|ref|ZP_05618339.1| hypothetical protein F3_08276 4 1 Op 4 . + CDS 1403 - 2659 1125 ## CCC13826_0614 hypothetical protein 5 1 Op 5 . + CDS 2677 - 2946 202 ## BcerKBAB4_1183 resolvase domain-containing protein 6 2 Tu 1 . - CDS 2917 - 3849 330 ## PROTEIN SUPPORTED gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase - Prom 3929 - 3988 5.5 + Prom 3969 - 4028 7.2 7 3 Tu 1 . + CDS 4133 - 4195 140 ## + Term 4231 - 4283 -0.6 8 4 Tu 1 . - CDS 4192 - 5307 1010 ## COG0582 Integrase - Prom 5328 - 5387 6.0 9 5 Tu 1 . 
- CDS 5439 - 6380 715 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 6402 - 6461 7.3 + Prom 6132 - 6191 10.2 10 6 Tu 1 . + CDS 6432 - 6539 70 ## + Term 6635 - 6671 2.1 11 7 Op 1 . - CDS 6576 - 6989 453 ## COG2856 Predicted Zn peptidase 12 7 Op 2 . - CDS 7004 - 7840 971 ## HS_1527 hypothetical protein 13 7 Op 3 . - CDS 7856 - 8263 524 ## Sterm_3926 transcriptional regulator, XRE family - Prom 8287 - 8346 14.6 + Prom 8322 - 8381 16.8 14 8 Op 1 . + CDS 8427 - 8636 267 ## gi|257453049|ref|ZP_05618348.1| hypothetical protein F3_08321 15 8 Op 2 . + CDS 8668 - 8733 62 ## 16 8 Op 3 . + CDS 8714 - 8827 138 ## 17 9 Tu 1 . - CDS 8789 - 8893 254 ## gi|257453050|ref|ZP_05618349.1| hypothetical protein F3_08326 - Prom 8928 - 8987 8.2 + Prom 8869 - 8928 3.9 18 10 Tu 1 . + CDS 8977 - 9174 311 ## gi|257453051|ref|ZP_05618350.1| hypothetical protein F3_08331 + Prom 9205 - 9264 3.8 19 11 Op 1 . + CDS 9294 - 9527 366 ## gi|257453052|ref|ZP_05618351.1| hypothetical protein F3_08336 20 11 Op 2 . + CDS 9517 - 9675 216 ## gi|257453053|ref|ZP_05618352.1| hypothetical protein F3_08341 21 11 Op 3 . + CDS 9683 - 10375 563 ## gi|257453054|ref|ZP_05618353.1| hypothetical protein F3_08346 22 11 Op 4 . + CDS 10378 - 10548 240 ## gi|257453055|ref|ZP_05618354.1| hypothetical protein F3_08351 23 11 Op 5 . + CDS 10549 - 10875 438 ## gi|257453056|ref|ZP_05618355.1| hypothetical protein F3_08356 24 11 Op 6 . + CDS 10885 - 11397 642 ## gi|257453057|ref|ZP_05618356.1| putative phage associated protein 25 11 Op 7 . + CDS 11409 - 12062 729 ## BB3533 hypothetical protein 26 11 Op 8 . + CDS 12081 - 12590 726 ## gi|257453059|ref|ZP_05618358.1| hypothetical protein F3_08371 27 11 Op 9 . + CDS 12644 - 12718 57 ## 28 11 Op 10 . + CDS 12685 - 14994 1854 ## Sterm_3911 toprim domain protein + Prom 15095 - 15154 12.5 29 12 Op 1 . + CDS 15351 - 15557 125 ## gi|257453061|ref|ZP_05618360.1| hypothetical protein F3_08381 30 12 Op 2 . 
+ CDS 15541 - 16296 650 ## gi|257453062|ref|ZP_05618361.1| hypothetical protein F3_08386 31 12 Op 3 . + CDS 16301 - 16519 220 ## gi|257453063|ref|ZP_05618362.1| hypothetical protein F3_08391 32 12 Op 4 . + CDS 16509 - 16844 224 ## gi|257453064|ref|ZP_05618363.1| dUTPase 33 12 Op 5 . + CDS 16855 - 17397 402 ## gi|257453065|ref|ZP_05618364.1| hypothetical protein F3_08401 + Term 17418 - 17468 9.7 + Prom 17439 - 17498 6.1 34 13 Op 1 3/0.000 + CDS 17611 - 18051 480 ## COG3728 Phage terminase, small subunit 35 13 Op 2 . + CDS 18044 - 19294 732 ## COG1783 Phage terminase large subunit 36 13 Op 3 . + CDS 19307 - 20701 1508 ## Sterm_1427 portal protein SPP1 37 13 Op 4 . + CDS 20688 - 22262 1261 ## COG5585 NAD+--asparagine ADP-ribosyltransferase 38 13 Op 5 . + CDS 22259 - 22435 228 ## FN0575 hypothetical protein + Term 22437 - 22481 5.2 39 14 Tu 1 . + CDS 22488 - 22646 265 ## gi|257453071|ref|ZP_05618370.1| hypothetical protein F3_08431 - Term 22556 - 22604 4.1 40 15 Tu 1 . - CDS 22630 - 22974 434 ## gi|257453072|ref|ZP_05618371.1| hypothetical protein F3_08436 - Prom 23106 - 23165 10.2 + Prom 22987 - 23046 7.1 41 16 Op 1 . + CDS 23087 - 23911 858 ## Ccel_2974 hypothetical protein + Term 23923 - 23960 3.1 42 16 Op 2 . + CDS 23985 - 24119 254 ## gi|257453074|ref|ZP_05618373.1| hypothetical protein F3_08446 43 16 Op 3 . + CDS 24122 - 24235 62 ## 44 16 Op 4 . + CDS 24228 - 24818 922 ## Sterm_1430 minor structural GP20 protein 45 16 Op 5 . + CDS 24836 - 26026 1648 ## Sterm_1431 hypothetical protein 46 16 Op 6 . + CDS 26035 - 26415 304 ## CDR20291_1437 hypothetical protein 47 16 Op 7 . + CDS 26417 - 26761 346 ## Sterm_1433 hypothetical protein 48 16 Op 8 . + CDS 26761 - 27309 514 ## Sterm_1434 hypothetical protein Predicted protein(s) >gi|224461433|gb|ACDD01000069.1| GENE 1 87 - 326 362 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453038|ref|ZP_05618337.1| ## NR: gi|257453038|ref|ZP_05618337.1| hypothetical protein F3_08266 [Fusobacterium sp. 
3_1_5R] # 1 79 1 79 79 142 100.0 9e-33 MAQYDLTLAKEVTRECAWGILGTINRIENKMGSSLFLELIEKKIEKEIREIPGMNFDEIN TLDIKCAFVRDVLSELEKV >gi|224461433|gb|ACDD01000069.1| GENE 2 332 - 1123 776 263 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_p004 NR:ns ## KEGG: SSUBM407_p004 # Name: not_defined # Def: toxin of epsilon-zeta postsegregational killing system # Organism: S.suis_BM407 # Pathway: not_defined # 29 263 31 268 287 156 36.0 9e-37 MEKNYTQQELEIAFKKILNYYKSFYFTHKNPKVFLLGGQPGAGKSGLEHMINLKKDYVSI SGDDYREYHPRFQEINLEYGREASKYTQQWAAEMTEKLIKELRKEKYNLIVEGTLRTAEL PLKEANAFKKEGYEVELNVVVVKPEKSRLGTLQRYEEMLKRGKIPRMTPKEHHDLVVNNI ANNLEIIYKAEVFDNIKLFDRENNLLYNKIENPSVSPKDILQKEFHREWQEEEIKEWNER WEYLIQVMKDRKESRTRTVRRFL >gi|224461433|gb|ACDD01000069.1| GENE 3 1120 - 1368 284 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453040|ref|ZP_05618339.1| ## NR: gi|257453040|ref|ZP_05618339.1| hypothetical protein F3_08276 [Fusobacterium sp. 3_1_5R] # 1 82 1 82 82 122 100.0 5e-27 MKKEEKSENNLWLKKKNWREQIAHIRIPSYLKEDLEEYFSYIKIGDNLMSKATIDNVISS INVAEQENDLSSEEAKYIRKIL >gi|224461433|gb|ACDD01000069.1| GENE 4 1403 - 2659 1125 418 aa, chain + ## HITS:1 COG:no KEGG:CCC13826_0614 NR:ns ## KEGG: CCC13826_0614 # Name: not_defined # Def: hypothetical protein # Organism: C.concisus # Pathway: not_defined # 10 391 3 388 400 285 43.0 2e-75 MEKRFEKEFKIYILKNKDVPVLRFENEKKIDNTKLGDYPSYRFKHIQILCEDLLPKGYTN TKDSSELKHWIEQRKIPKNRKNMEDILHYQLQNQITDPNNPMSYIDVSYGLSLNDSYWIV PDDGKEYLWKEYNLYQNKFSEILSLVAFGEKNISNLPEENRTSPEYTTDGMLAKCWTTID DTIVLLKKSSEHHKVEAYAEYYMAQVAKMMNFEHVSYDIMKYHDSIVSSCPLFTSEEEGY VPMYRCLKKDDCHKKGARLLESISEITGQEFLEDIMVFDSLIYNTDRHLGNLGMMIENST GKYLRPAPIFDNGNSILSFLPGQNLPQIFKNYTSKFEIDFDLLSANLVSERHREGLKRLE TFQFQRHPLYNLSEDILVKGEQFIQARAKLLTRQLDKKKEKERNPWSKKIEKIHGIER >gi|224461433|gb|ACDD01000069.1| GENE 5 2677 - 2946 202 89 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_1183 NR:ns ## KEGG: BcerKBAB4_1183 # Name: not_defined # Def: resolvase domain-containing protein # Organism: B.weihenstephanensis # Pathway: not_defined # 3 74 6 77 200 73 
51.0 2e-12 MIYGYIRISSKTQNEERQIIALKESGVSSDNIFIDRESGKNFNRASWQKLMARLVVGDTL IIKELDRMGRNDELQIILRHLNQTKFVHK >gi|224461433|gb|ACDD01000069.1| GENE 6 2917 - 3849 330 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase [Streptococcus pneumoniae SP6-BS73] # 4 299 10 310 317 131 34 4e-30 MSYKHLTINERNKIEVLRKEGYSSRRIAKILGFHHSTISRELKRCDNEYEAVYAQKDKIK KSSSKGRKPKVNDNITKCISEKLHKKWSPEQIANTVCKDIVSFKTIYNWIYSGIIDFDIS KLRRKGKSRKVKETRGRFNIGKSINKRPKEVKKRNTFGHWELDTVVSSRGKSKGCLATFV ERKTRFYIALPMVDRSKNSMLGAIEKLIQSLPLEALKTFTSDRGKEFACYEDVEKQGINF YFADAYSAWQRGSNENSNGLLREYYPKKTDLSKISINELIKNLVELNTRPRKCLEYQTPF NLFMHELSLV >gi|224461433|gb|ACDD01000069.1| GENE 7 4133 - 4195 140 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVPNPPFILCFYISVCLLSA >gi|224461433|gb|ACDD01000069.1| GENE 8 4192 - 5307 1010 371 aa, chain - ## HITS:1 COG:L55605 KEGG:ns NR:ns ## COG: L55605 COG0582 # Protein_GI_number: 15673415 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 17 363 9 348 359 101 26.0 2e-21 MPVRRNKGEGSITTTTRNGKTYYKASVTIGYDMDGKQIRKSFGSFRKSIVVDKINTVKYE VKNDLLEKQGDIKFGKLFLSWIQDFKKVEVSGNTFAEYEVCYRLRILPYLLANAKANEIT LPFLQKYFNSLQNEWSVNVIRKTYVKINACLNFALIQGFINRNPAKGIKLPKQEKSTKYK VFTKEEQDLILQTLNLRNIVDTAIYFDFYTGLRLGELLGLKWEDLDGRLLSIKRQYQKNI EIAEQRKTTYVLKKLKTIHSYRCMTLPNKIVNLLSNMERTSEFIFPGIDGQPLEVKKLPR RLAAICKKLNIPHRSFHSIRHSFATRLFEKNVQIKTVQELMGHSEIATTMDIYTHVMPQT KEEAANILDSI >gi|224461433|gb|ACDD01000069.1| GENE 9 5439 - 6380 715 313 aa, chain - ## HITS:1 COG:lin2373 KEGG:ns NR:ns ## COG: lin2373 COG4823 # Protein_GI_number: 16801436 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Listeria innocua # 31 312 6 289 298 122 33.0 9e-28 MNLFTVDQDRVVFILGDFMIEIIKLLSEKVKQPTTIEQQIQILKSRNVIIDDEETAVRYL KLYNYYFITGYLHPYKNKKDGSYIPISFGKILNQIQFDMRLREICMYGLDIIEKSLKTML AYHFSHNYQYGNIAYFYKEFFPGKESSHEKLISYYQKAVENNKDLPFVKHNMNTYGILPT 
WAAIELFTMGNLENFFKLLKNECIREIEKEIGFPKNKIANWIESLRIFRNMVAHNQRLYN FTIPSTPIKTKEYNEQSGKIFDYIVVMKYLFLDIKDWNEYLIPRLEYIFEDFKEEIELKC IGFPKNWKDILMK >gi|224461433|gb|ACDD01000069.1| GENE 10 6432 - 6539 70 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTKNCIPVPILRIVSPTVNFNGDTIPILDSILYL >gi|224461433|gb|ACDD01000069.1| GENE 11 6576 - 6989 453 137 aa, chain - ## HITS:1 COG:FN2066 KEGG:ns NR:ns ## COG: FN2066 COG2856 # Protein_GI_number: 19705356 # Func_class: E Amino acid transport and metabolism # Function: Predicted Zn peptidase # Organism: Fusobacterium nucleatum # 11 133 19 136 138 69 37.0 2e-12 MENFVKILIKEHETNNPFIICRNLKIQIKYIDSYEMKGAVAKVWEKPCIFLSSYLEGFSK YFALSHELCHAIKHDIEEVKFFKDNTFYSTDKFEIEANEFAALLLKNPETQNYDIDNLDD LDREVEEELVKFIYKNS >gi|224461433|gb|ACDD01000069.1| GENE 12 7004 - 7840 971 278 aa, chain - ## HITS:1 COG:no KEGG:HS_1527 NR:ns ## KEGG: HS_1527 # Name: not_defined # Def: hypothetical protein # Organism: H.somnus # Pathway: not_defined # 2 277 3 278 288 391 76.0 1e-107 MNDIKLYENKEVRSIWDEEHEEWLFSIVDVVGILTESKNPQVYWRVLKKRLVDEGNQTVT NCNALKMKAKDGKMRLTDVTDMQGIFRIIQSIPSPKAEPFKMWLAEVGKERIDETIDPEI AIDRALATYLKKGYSENWIHQRLLTIRVRKELTDAWSNHGVKEGIEYAILTDEITKAWSG MTTRKYKQLKGLKKENLRDNMSTLELVLNMLAEATTTELVEEENPTNLEENKQIAKAGGA VAGNARKEIESRTGKSVITKKKAVDFTKLLVEIDKLEK >gi|224461433|gb|ACDD01000069.1| GENE 13 7856 - 8263 524 135 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3926 NR:ns ## KEGG: Sterm_3926 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: S.termitidis # Pathway: not_defined # 1 133 1 127 127 80 42.0 1e-14 MTLGDRVKRKREELKLSQEELAEKMGYKSKTSIHKIEQNITDLPLSKVEELSKVLRVSTS YLMGWEEEPKKPIDPIIDEYHLDEYELAEYNKILNMNMLMFNDKELSEEGKEKLEIALKE VFVEELLKRRAAKKK >gi|224461433|gb|ACDD01000069.1| GENE 14 8427 - 8636 267 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453049|ref|ZP_05618348.1| ## NR: gi|257453049|ref|ZP_05618348.1| hypothetical protein F3_08321 [Fusobacterium sp. 
3_1_5R] # 1 69 1 69 69 117 100.0 3e-25 MINTELLKQKIDSSGYRFSWIAKQLKLSSYGFRKKLNNDTEFKVSEVSKICKILTINDKE RDTIFFCKD >gi|224461433|gb|ACDD01000069.1| GENE 15 8668 - 8733 62 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKSVITILNKKGNTVGILQV >gi|224461433|gb|ACDD01000069.1| GENE 16 8714 - 8827 138 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFYKFKREVNEGKMEISITTQEDELVVSLKVNWDSS >gi|224461433|gb|ACDD01000069.1| GENE 17 8789 - 8893 254 34 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453050|ref|ZP_05618349.1| ## NR: gi|257453050|ref|ZP_05618349.1| hypothetical protein F3_08326 [Fusobacterium sp. 3_1_5R] # 1 34 1 34 34 68 100.0 1e-10 MSAKIYACLLGEWVDISSGDTLLAGIPIHFQRYY >gi|224461433|gb|ACDD01000069.1| GENE 18 8977 - 9174 311 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453051|ref|ZP_05618350.1| ## NR: gi|257453051|ref|ZP_05618350.1| hypothetical protein F3_08331 [Fusobacterium sp. 3_1_5R] # 1 65 1 65 65 127 100.0 2e-28 MQERKETLAITIEEAAEYIGVGKDCVKRMTEVPDFPLFRVGNRTLIIKPKILDWLIKHNR EDFGK >gi|224461433|gb|ACDD01000069.1| GENE 19 9294 - 9527 366 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453052|ref|ZP_05618351.1| ## NR: gi|257453052|ref|ZP_05618351.1| hypothetical protein F3_08336 [Fusobacterium sp. 3_1_5R] # 1 77 1 77 77 113 100.0 4e-24 MKEKESVEYTKIMATASAKKLSGYLKKLDISETETKKLIDLILEQVRDCIELGKEIAYAE MISYMKNNLKVGDVNAD >gi|224461433|gb|ACDD01000069.1| GENE 20 9517 - 9675 216 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453053|ref|ZP_05618352.1| ## NR: gi|257453053|ref|ZP_05618352.1| hypothetical protein F3_08341 [Fusobacterium sp. 3_1_5R] # 1 52 1 52 52 98 100.0 1e-19 MRIRIRPTIFNVSLAITPFLMLLAYHDRGYFACGGETLVPLVGLVAHYAFKE >gi|224461433|gb|ACDD01000069.1| GENE 21 9683 - 10375 563 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453054|ref|ZP_05618353.1| ## NR: gi|257453054|ref|ZP_05618353.1| hypothetical protein F3_08346 [Fusobacterium sp. 
3_1_5R] # 1 230 1 230 230 342 100.0 9e-93 MRTAKPKVKREIKVNKKKEIPIKKVKFIDEDIEKRKFLLESIYRIYENMKIHKDIWTKEI KQNNGYLNPYYKMLMGRIIKLSEKLIATYISQDIQRTEISSFWIQKSVVALMQENISFKQ SNKNKEKLWKDDSLPFDSEEFYNSLMVVWGVSIIIQEKLKNIMRLDKIQKEMDEMINKIN KLLDFIDTDIVQLTEEQNENKKSKILKPLSKKKMEIIHAKLKEKGYLKGA >gi|224461433|gb|ACDD01000069.1| GENE 22 10378 - 10548 240 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453055|ref|ZP_05618354.1| ## NR: gi|257453055|ref|ZP_05618354.1| hypothetical protein F3_08351 [Fusobacterium sp. 3_1_5R] # 1 56 1 56 56 81 100.0 2e-14 MESFDNVKNPKKSYGIFKVFNSGVGGKNMDLNNINEAKALIQKIEKIKKEMELRMM >gi|224461433|gb|ACDD01000069.1| GENE 23 10549 - 10875 438 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453056|ref|ZP_05618355.1| ## NR: gi|257453056|ref|ZP_05618355.1| hypothetical protein F3_08356 [Fusobacterium sp. 3_1_5R] # 1 108 1 108 108 204 100.0 2e-51 MHKNSIPDKIISFLTEYPGRSNREIAQGIAVNENNIRGELYKLKKKGFIFGCAKDGWYTD EDLRNQLQRKKEIATDVLEECYALFKNCENEKNKIQLARIITDLLKKF >gi|224461433|gb|ACDD01000069.1| GENE 24 10885 - 11397 642 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453057|ref|ZP_05618356.1| ## NR: gi|257453057|ref|ZP_05618356.1| putative phage associated protein [Fusobacterium sp. 
3_1_5R] # 1 170 1 170 170 238 100.0 9e-62 MNKLSLYQITEEMEALDSLYWENIDEETGEIKNAEVLEEFEKEIKNLLSSKGAQIIASQR QSKLFIEAAKSEIERLTKLKKSLERKEEFYKNYIIRSMEKANLIEIKTEIGKIKLTSGKG KVEIYDLGLLDDKFFKIKKEPMKTEIREALKAGLEVQGAKMIYENGLSIK >gi|224461433|gb|ACDD01000069.1| GENE 25 11409 - 12062 729 217 aa, chain + ## HITS:1 COG:no KEGG:BB3533 NR:ns ## KEGG: BB3533 # Name: not_defined # Def: hypothetical protein # Organism: B.bronchiseptica # Pathway: not_defined # 1 216 1 210 216 203 48.0 4e-51 MANLIMILGESGTGKSTSIETLNEKETFIIQVVDKPLPFKGFKKRYSLKTKENPKGNRFI SDRADVIIKILQSLNKEKGVKNIIIDDSQYIMANEFMRRAKEKGYEKFTEIGQNFYNLID TANDLREDINVIFLQHTEVTDDGRKKAKTIGKLIDDKISLEGRFTIVLITEVEDGAYYFR TQNNGNDTCKSPRGMFEDLRIPNDLQYVVTKCNEYFN >gi|224461433|gb|ACDD01000069.1| GENE 26 12081 - 12590 726 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453059|ref|ZP_05618358.1| ## NR: gi|257453059|ref|ZP_05618358.1| hypothetical protein F3_08371 [Fusobacterium sp. 3_1_5R] # 1 169 1 169 169 262 100.0 6e-69 MMNLWNANAEDLTKKTGTKERFQNSGIYEVKIKEAFISDSTKSQAKAITIVLETEENYGR VNFWFLKGDGTENEFARTTLNRMMYLLKLKADKLKIESKKIKNYKGEEIERAFLPELEGK NIGVILNVKIDGDQINFDVKDFFDIKSKKTSDEILNKTEAYTVEFFTKK >gi|224461433|gb|ACDD01000069.1| GENE 27 12644 - 12718 57 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTSSHSKKEGGPKWINIKDMGTN >gi|224461433|gb|ACDD01000069.1| GENE 28 12685 - 14994 1854 769 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3911 NR:ns ## KEGG: Sterm_3911 # Name: not_defined # Def: toprim domain protein # Organism: S.termitidis # Pathway: not_defined # 1 255 11 275 607 87 30.0 2e-15 MDKYKRYGNELRFDYCPICKKESSDNPHFSINLETKQYYCHSTGRGGSIEELEDFDVDLE NISIKKEKKIQAANFDSIMKSRADKHLGEDWLTYLKGRGISEKGLGRLVRLGRNNTMMIP VTDGEHVVSIKYRTMDKKMSSEKGSQSNYLVNWQNIKNKSYLIIVEGEIDLLSAIEAGYD NVVSLPFGAKNLKAIEHQKTWIESFSKITIAVDNDIPGEECKKEIIEILRRVKNKVHEVN FGTYKDLNEVLQDKGVEAIEAIIKAASKVEHIFRPFYKEENGYYCWQKENYVRITDFTLE LTGYSDNYIVGLVTNAGRQREFKAKKTDLLAKNGILEHLGYYLGSSQSIAKFWSWFLDES TEQFLLEIPHYGIIENEYYDPGSKVICSKEDLKIQKVEEIEKMSKEDKEWLQENLIYLRK 
DVNQSLLGICWALGRFHIHGNYPILEVSGTTSIGKTEYVEFISRILFGSKENIKSFTTLT NHQIRSLSSCSNITPWAIDEVKITGKNLREKAIELYSTIRAVYDNKTINQGNITSKLTEF SLCTPLIISGETELSDVSIKNRMISTTLNKHNKSKDDIFFQLKDTSLLEKLGKEALRNRM NNGKIEVELDIVKGLLNQVKDERQFYNGKCILTGLKALNEIIKIDPKVKQNFILFLNGQL ANEYSVTTNFLELLELVADSGVDVRTFYQVKNGKHYARFNLLYKAIAEEHFKTNSTLELL DARTLKKQLIENNYILNDRISARFPKDDFTNDTIPATAVELVPNCLFEI >gi|224461433|gb|ACDD01000069.1| GENE 29 15351 - 15557 125 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453061|ref|ZP_05618360.1| ## NR: gi|257453061|ref|ZP_05618360.1| hypothetical protein F3_08381 [Fusobacterium sp. 3_1_5R] # 1 57 1 57 68 91 100.0 1e-17 MEDLKLKRGTSFIEFYYRGLSITNSKELAAYIKINKWYFDRAKPEVQEQFRRLYRIYKKQ EKKNEKKN >gi|224461433|gb|ACDD01000069.1| GENE 30 15541 - 16296 650 251 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453062|ref|ZP_05618361.1| ## NR: gi|257453062|ref|ZP_05618361.1| hypothetical protein F3_08386 [Fusobacterium sp. 3_1_5R] # 1 251 1 251 251 467 100.0 1e-130 MRKRIDPQELVGKEFENKIGEKFKIVKYLFKEKTNHCFDVEFLETKNIQLGTLNQIRNGT CIDVVQKKKMKRLQRELDLRKRNRLVKQAKNVCHVPNNLKEKNVLAIDLSTTSTGIAYSQ KGEIVRWKTIKAEDKDFRKRGAKIIEELVKILKKGKIDFVVLEDVYLGLNSSVLTMLSEV RGMLTYPLVKLNIDILIVPPVLWKHRIEGVPFHREEQKEFMMKKFLEYTGENPDSDDVAD AYMMLRACLED >gi|224461433|gb|ACDD01000069.1| GENE 31 16301 - 16519 220 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453063|ref|ZP_05618362.1| ## NR: gi|257453063|ref|ZP_05618362.1| hypothetical protein F3_08391 [Fusobacterium sp. 3_1_5R] # 1 72 1 72 72 113 100.0 3e-24 MKEIIREFKGYEKRKAFLFAKSLKISGVEDIKIQVSYDSEHLATKTKRPSKFIVYQEIWE ENDLDRRGEYER >gi|224461433|gb|ACDD01000069.1| GENE 32 16509 - 16844 224 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453064|ref|ZP_05618363.1| ## NR: gi|257453064|ref|ZP_05618363.1| dUTPase [Fusobacterium sp. 
3_1_5R] # 1 111 1 111 111 195 100.0 7e-49 MKDRLQEIWERQKNFDNIVFANAGVTREQVANEIKVALITEIGELYNENPTFKFWKEQKN IEITDKTKEEFVDCLHFLISLGQDIFKDEQEMFDWYCKKNDKNLLRQKTGY >gi|224461433|gb|ACDD01000069.1| GENE 33 16855 - 17397 402 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453065|ref|ZP_05618364.1| ## NR: gi|257453065|ref|ZP_05618364.1| hypothetical protein F3_08401 [Fusobacterium sp. 3_1_5R] # 1 180 1 180 180 282 100.0 7e-75 MTEKEIKKIVKVTVEEVHKHKLKPAKNPFQLTELYLSQYKNMEESIQIKLDTVKELRAEI PGMKSPVLVPDVIQGGKAENISQLEKREEVIEHLVKEVEQLSALKLQIEKVMDKYREDKD FRILQERFFNRKTFDEIGDLLNIDESTVRRRKNRIIKEMSEILFPICAEMPPKVRLDCTE >gi|224461433|gb|ACDD01000069.1| GENE 34 17611 - 18051 480 146 aa, chain + ## HITS:1 COG:lin0104 KEGG:ns NR:ns ## COG: lin0104 COG3728 # Protein_GI_number: 16799182 # Func_class: L Replication, recombination and repair # Function: Phage terminase, small subunit # Organism: Listeria innocua # 1 122 1 117 180 88 42.0 4e-18 MKLNARQKAFCEYYVACGNATEAAKKAGYSEKTAYSMGNENLRKPELKNYINELMAKMEE KRMASAEEVLKFLTASMRGEVEEEVVVVEGEGDGCSSARKMKKQISAKERIKAAELLGKR HLLFSDKVKVEGSIPVMIVGEDELDD >gi|224461433|gb|ACDD01000069.1| GENE 35 18044 - 19294 732 416 aa, chain + ## HITS:1 COG:SPy0972 KEGG:ns NR:ns ## COG: SPy0972 COG1783 # Protein_GI_number: 15674984 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Streptococcus pyogenes M1 GAS # 6 414 8 417 429 370 45.0 1e-102 MIKFKKVSLPQTVGKGYKTFWNFKGRYKVVKGSRASKKSKTTALWIICNMMKYAKANTLV VRKVFRTIKDSCYSDLLWAIDRLEVEEYWEKKESPLELTYIPTGQKILFRGFDDPMKITS ISVTTGSLCWCWVEEAYEITDEAAFNMLDESIRGVVEEPLFKQVIICFNPWNERHWLKKR FFDVQDENILAITTNYMCNEWLDDSDKKLFEDMKKNNPRRYQVAGLGNWGIVDGLVYENW HEQEFDWREILEKRKKAKAVFGLDFGFTNDPAAFFCAVLDLEQKELYVFDEFYKTHMHNS DIYREIERMGFRKEIIIADGVEAKSIEHLRNLGLPRVKGSKKGRDSINAGIQFVQDFKIY IHPRCPNFLMEISNYSWDKDKFGKSINKPVDDFNHLMDAMRYALEDYMRESSLSFD >gi|224461433|gb|ACDD01000069.1| GENE 36 19307 - 20701 1508 464 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1427 NR:ns ## KEGG: Sterm_1427 # Name: not_defined # Def: portal 
protein SPP1 # Organism: S.termitidis # Pathway: not_defined # 1 454 1 471 483 419 48.0 1e-115 MWEWIKNIFKKNKKVENMEIRKLEYLISTWLTSKVRQDQLNGERYYRGNQDILQKKRKAI MEQGRLEAVDNMVNSKIVDNQYAKMVDQKVNYLLAKKPTFNCENEDVRELFGAKFLRMLR NLGEDSLNNGIGWIYPYFGKDGTLQFRKFEASEILPIWKDNNKEELELAIRLYEVMEFEG DRLKPVKKVEVYSEHGVNFFIWDYNRLKEVGHSDYISIGESGYNWGKVPLVPFRSNNLEQ PLICRVKCLQDGLNEILSKFQDNMLEDAGTTILILTNYDGENLGEFRRNLATYRAVKVNN MDGGKGGLDKLTIEVNAENYQLIIKLLKKAIIENARGFDAKDERLGGNPNEMNIQSMYSD IDLDANQMETEFQASFEELMWFVNKALNINDTLEVVFNRDVLVNETESITNCIQSSTLLS LETVLAQHPWVTNVGEELKRLKKQKEESIEGEYAGVMNAPDDNE >gi|224461433|gb|ACDD01000069.1| GENE 37 20688 - 22262 1261 524 aa, chain + ## HITS:1 COG:BH3531 KEGG:ns NR:ns ## COG: BH3531 COG5585 # Protein_GI_number: 15616093 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Bacillus halodurans # 4 325 6 314 490 137 30.0 6e-32 MTMSKKYWVERFEQEEARNAKVSLRHMQIAKKQYQSTMRKIDSEIRSWYSRYAIDNEVSY EEAQKILSGKDRKSLKLSLEEYIKLGEQQNVSFNAEVEKTLKKVSAGVHVNRLESIKSSI QAELDILYTNVERGLGEHFCEVVGAGYARTSYLVQSMSGNYESIYGINKDLLNQMIYRPW TNDGKNWSNRIWKQKDKLIGELHTSLVQSLVLGDDVNLLADKMSKRLDVGFSRAANLLMT ESAAYHSKAAELCYKDLGVEKYEILATLDNRTSTVCQGMDGKVFERKQYQVGVTAPPFHC RCRTTTIPYFEDLTEDETRAARDEEGNYVEEKANMKYPEWKEKYLKENEKEGIIDIEVDE MTPCLRRLKDDTIVKTEVKEIKYKKADFKDWLFDWSKTEKKGYQILALYAEDDSRIQGAV SIKPRKDNLTVEIDIAESAPFNRLYKNKAEEKEYSGVGAHLFAEVCKRSFEIGYDGYVEF KAKTNLVDYYKEKLGAIAIDTQRMYIDTDGAKKLIEKYYGGGKI >gi|224461433|gb|ACDD01000069.1| GENE 38 22259 - 22435 228 58 aa, chain + ## HITS:1 COG:no KEGG:FN0575 NR:ns ## KEGG: FN0575 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 3 58 2 57 57 69 66.0 4e-11 MKLEDFGMMTEPMPENKVEYDLRALNAHCKENGILPTDLSTKELERFEKHKEKKVINF >gi|224461433|gb|ACDD01000069.1| GENE 39 22488 - 22646 265 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453071|ref|ZP_05618370.1| ## NR: gi|257453071|ref|ZP_05618370.1| hypothetical protein F3_08431 [Fusobacterium sp. 
3_1_5R] # 1 52 1 52 52 73 100.0 5e-12 MIDVVKQKQVEIDNFVDKIMEELENKKIDLAHYEYLLGQLKNKIEFYLRFQR >gi|224461433|gb|ACDD01000069.1| GENE 40 22630 - 22974 434 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453072|ref|ZP_05618371.1| ## NR: gi|257453072|ref|ZP_05618371.1| hypothetical protein F3_08436 [Fusobacterium sp. 3_1_5R] # 1 114 1 114 114 214 100.0 2e-54 MNKYSKTIYKILRTIDYCYSKKIMIKFDLEKLGLSQHELGLYLHNLVEAGYIGGIIISQA FGQTHFEKYRAHEYAYITLSGMMFLEENSEMKNFYKTITEIRDWFCAIAPIAGI >gi|224461433|gb|ACDD01000069.1| GENE 41 23087 - 23911 858 274 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2974 NR:ns ## KEGG: Ccel_2974 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 31 265 66 281 289 71 26.0 3e-11 MKKGVYLIFILLFLFTACKMQEDVEIAKTISKIDKIYKNREYEKALSMIDELQKSHPQSK ENETLKNIRENVEIEIRNEEIRKTNKEREFKEKQEKMRKIENNFKEAMRNIEKEYDEFNG ITWFDSKKIDGEKFSKETELLAIRVFLYGSQRGKLGEYVDNVRLVLRYYGDDWIFFDKVI ILADGERKEFQLNAYDAEREVVKGGSLGRVYEKYDIAIDEEDMDFFLKVANSNTTKVRFY GKYQTDFYLNPHEKYIIKNILTAYKKNLYLQIVD >gi|224461433|gb|ACDD01000069.1| GENE 42 23985 - 24119 254 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453074|ref|ZP_05618373.1| ## NR: gi|257453074|ref|ZP_05618373.1| hypothetical protein F3_08446 [Fusobacterium sp. 
3_1_5R] # 1 44 1 44 44 62 100.0 7e-09 MTVAGYCFLGVVVVVLCIYVYGRVKYRLTKPKDIVKEAKKGFKK >gi|224461433|gb|ACDD01000069.1| GENE 43 24122 - 24235 62 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGLETVLLFRLFCIVGEKEQDLKSMTYIVKNERRKHE >gi|224461433|gb|ACDD01000069.1| GENE 44 24228 - 24818 922 196 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1430 NR:ns ## KEGG: Sterm_1430 # Name: not_defined # Def: minor structural GP20 protein # Organism: S.termitidis # Pathway: not_defined # 1 193 1 200 200 106 39.0 6e-22 MNKEELLALGLSEELANKVVDKYGHLVTKTRLDEVIAERDTLKTQVSERDKQLKELEKAA GDNKELKEQIEKLQKDNKDAADKYKKDLHDLQVNNAVDLAISGAKGKNGKAIKALLDLEK AEIKDGKIIGLEEQLAKLKESDGYLFEETQQPQNTNPAGFTPGAGSSKTPGGEGPKAYSQ IMQMLAENPGLDISKI >gi|224461433|gb|ACDD01000069.1| GENE 45 24836 - 26026 1648 396 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1431 NR:ns ## KEGG: Sterm_1431 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 396 1 357 357 367 53.0 1e-100 MKHFDAKIFNGEAFGKYVSIIPNTKKNELLKSGAIQGNQEIKDAFANQTGTHYATLPMHG RIGGKTLNYNGSTNVTATSTKTYSRGVISIGRMAAWTEKDFSYDITGGVDFMDNVAKQVV DFWADAYQGILLSILKGIFAMNSGKDKEFAEGHTYNITELAGKDGKVGPTTLNSASQKAC GDNKNIFKVAIMHSTVATNLENLQILKYFTQTDANGMQREVGLATWNGRVVFIDDSMPTA KFTGGKYAKVEASHPDALKIQTPGTGVKEVPQATVAGAKFDSKWVPKDGEYAALVEAGTE YTTYLLGAGAFDYEDLGVKHAHEMVRDAKTNGGEDMLITRRRLVYAPYGISYKTDSTISP EDTELEKGTNWELVKSQDGDVIDHKAIPIARIISRG >gi|224461433|gb|ACDD01000069.1| GENE 46 26035 - 26415 304 126 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1437 NR:ns ## KEGG: CDR20291_1437 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 126 9 136 136 86 39.0 3e-16 MDAIEKLLQSFGYAVGEADRPLLSFIRNTVENSIKIRANIMEIPQELLPVVEKRTVGEFL ATKLSTGDFKSDSIDLEPLVKTIQEGKVTITYDTSGQTREMMLKTYCGMLIAYGELEIVA YRKLRW >gi|224461433|gb|ACDD01000069.1| GENE 47 26417 - 26761 346 114 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1433 NR:ns ## KEGG: Sterm_1433 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 
3 112 5 116 119 80 41.0 1e-14 MNHKAVLERTYIATAKVYGYEKVKEKGITKNKEIVLIEQLKCRIDYETITGTEQGNLGKV YQQVILFCNPDIRIPPNSKIEVTQLGRTETYWSSGKPAVYSSHQEIILQVKEVA >gi|224461433|gb|ACDD01000069.1| GENE 48 26761 - 27309 514 182 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1434 NR:ns ## KEGG: Sterm_1434 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 177 5 142 149 77 33.0 2e-13 MKIQMDEKAFMRFLKECKKDAVDARPVLENGLNEIGARLLRRVKQKTPVGQSQEGKIARR DKNGKLMTYSRGANKGKIKTRIGIIHQGGNLRRSWYVTNTIRKNDTSRVIVYNSSRYGMY VEYGHRQTPGRFVPVLGKRLKARWVKGRFMLTKSIQEVDTIALGVMKKHIKKAVSAWKYR YF Prediction of potential genes in microbial genomes Time: Fri May 20 02:19:19 2011 Seq name: gi|224461432|gb|ACDD01000070.1| Fusobacterium sp. 3_1_5R cont1.70, whole genome shotgun sequence Length of sequence - 104940 bp Number of predicted genes - 126, with homology - 115 Number of transcription units - 31, operones - 23 average op.length - 5.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 514 425 ## gi|257453080|ref|ZP_05618379.1| hypothetical protein F3_08476 2 1 Op 2 . + CDS 511 - 981 416 ## gi|257453081|ref|ZP_05618380.1| hypothetical protein F3_08481 3 1 Op 3 . + CDS 991 - 1248 304 ## gi|257453082|ref|ZP_05618381.1| hypothetical protein F3_08486 + Prom 1259 - 1318 12.2 4 2 Op 1 . + CDS 1358 - 1810 630 ## COG3600 Uncharacterized phage-associated protein 5 2 Op 2 . + CDS 1816 - 2436 342 ## gi|257453084|ref|ZP_05618383.1| hypothetical protein F3_08496 + Term 2438 - 2467 -0.3 - Term 2426 - 2455 -0.3 6 3 Tu 1 . - CDS 2489 - 2572 67 ## - Prom 2711 - 2770 6.5 + Prom 2989 - 3048 9.7 7 4 Op 1 . + CDS 3204 - 4466 880 ## gi|257453085|ref|ZP_05618384.1| hypothetical protein F3_08501 8 4 Op 2 . + CDS 4463 - 5407 777 ## COG0863 DNA modification methylase + Prom 5411 - 5470 6.2 9 5 Op 1 . + CDS 5520 - 5615 172 ## 10 5 Op 2 . + CDS 5638 - 5895 272 ## gi|257462770|ref|ZP_05627178.1| hypothetical protein FuD12_02899 11 5 Op 3 . 
+ CDS 5972 - 6142 191 ## gi|257453089|ref|ZP_05618388.1| hypothetical protein F3_08521 12 5 Op 4 . + CDS 6157 - 6588 471 ## gi|257453090|ref|ZP_05618389.1| hypothetical protein F3_08526 13 5 Op 5 . + CDS 6624 - 6788 216 ## 14 5 Op 6 . + CDS 6803 - 7486 338 ## gi|257453092|ref|ZP_05618391.1| hypothetical protein F3_08536 15 5 Op 7 . + CDS 7560 - 8345 1109 ## gi|257453093|ref|ZP_05618392.1| hypothetical protein F3_08541 16 5 Op 8 . + CDS 8365 - 8667 425 ## gi|257453094|ref|ZP_05618393.1| hypothetical protein F3_08546 17 5 Op 9 . + CDS 8671 - 9996 1240 ## COG3846 Type IV secretory pathway, TrbL components + Term 10007 - 10036 0.5 18 6 Op 1 . + CDS 10067 - 10594 303 ## Lebu_0954 type IV secretory pathway protease TraF-like protein 19 6 Op 2 . + CDS 10609 - 10707 130 ## 20 6 Op 3 . + CDS 10792 - 10923 213 ## gi|257453097|ref|ZP_05618396.1| hypothetical protein F3_08561 21 6 Op 4 . + CDS 10939 - 11052 99 ## 22 6 Op 5 . + CDS 11092 - 11262 268 ## gi|257453099|ref|ZP_05618398.1| hypothetical protein F3_08571 23 6 Op 6 . + CDS 11312 - 12394 714 ## gi|257453100|ref|ZP_05618399.1| ATPase involved in chromosome partitioning 24 6 Op 7 . + CDS 12411 - 13079 757 ## gi|257453101|ref|ZP_05618400.1| hypothetical protein F3_08581 25 6 Op 8 . + CDS 13111 - 13413 413 ## gi|257453102|ref|ZP_05618401.1| hypothetical protein F3_08586 26 6 Op 9 . + CDS 13434 - 13736 400 ## gi|257453103|ref|ZP_05618402.1| hypothetical protein F3_08591 27 6 Op 10 . + CDS 13763 - 14083 230 ## gi|257453104|ref|ZP_05618403.1| hypothetical protein F3_08596 + Prom 14101 - 14160 7.3 28 7 Tu 1 . + CDS 14236 - 14442 248 ## gi|257453105|ref|ZP_05618404.1| hypothetical protein F3_08601 + Term 14614 - 14660 8.2 + Prom 14444 - 14503 7.8 29 8 Op 1 . + CDS 14668 - 15237 412 ## BcerKBAB4_5408 hypothetical protein 30 8 Op 2 . + CDS 15234 - 15374 123 ## 31 8 Op 3 . + CDS 15440 - 15640 244 ## + Term 15656 - 15710 8.1 - Term 15640 - 15698 -0.6 32 9 Op 1 . 
- CDS 15726 - 16589 665 ## gi|257453109|ref|ZP_05618408.1| hypothetical protein F3_08621 33 9 Op 2 . - CDS 16601 - 17209 664 ## COG0582 Integrase - Prom 17367 - 17426 8.4 + Prom 17204 - 17263 6.7 34 10 Op 1 . + CDS 17363 - 17578 150 ## 35 10 Op 2 . + CDS 17647 - 20100 2546 ## Ppro_3827 MobA/MobL protein 36 10 Op 3 . + CDS 20088 - 20615 334 ## gi|257453112|ref|ZP_05618411.1| hypothetical protein F3_08636 37 10 Op 4 . + CDS 20630 - 20962 322 ## gi|257453113|ref|ZP_05618412.1| hypothetical protein F3_08641 38 10 Op 5 . + CDS 20974 - 21300 489 ## gi|257453114|ref|ZP_05618413.1| hypothetical protein F3_08646 39 10 Op 6 . + CDS 21301 - 21465 258 ## gi|257453115|ref|ZP_05618414.1| hypothetical protein F3_08651 + Term 21466 - 21493 0.1 + Prom 21496 - 21555 7.3 40 11 Op 1 . + CDS 21767 - 25657 2929 ## COG4227 Antirestriction protein 41 11 Op 2 . + CDS 25717 - 26037 496 ## gi|257453117|ref|ZP_05618416.1| hypothetical protein F3_08661 42 11 Op 3 4/0.000 + CDS 26051 - 26767 708 ## COG3701 Type IV secretory pathway, TrbF components 43 11 Op 4 11/0.000 + CDS 26780 - 27613 901 ## COG3504 Type IV secretory pathway, VirB9 components 44 11 Op 5 . + CDS 27626 - 28747 1278 ## COG2948 Type IV secretory pathway, VirB10 components 45 11 Op 6 . + CDS 28802 - 30754 1708 ## COG3505 Type IV secretory pathway, VirD4 components 46 11 Op 7 . + CDS 30800 - 31120 316 ## gi|257453122|ref|ZP_05618421.1| hypothetical protein F3_08686 47 11 Op 8 . + CDS 31101 - 32039 1100 ## COG4962 Flp pilus assembly protein, ATPase CpaF 48 11 Op 9 . + CDS 32063 - 32314 281 ## gi|257453124|ref|ZP_05618423.1| hypothetical protein F3_08696 49 11 Op 10 . + CDS 32324 - 34927 2540 ## COG3451 Type IV secretory pathway, VirB4 components 50 11 Op 11 . + CDS 34940 - 35668 905 ## COG3713 Outer membrane protein V 51 11 Op 12 . + CDS 35721 - 36035 401 ## COG0629 Single-stranded DNA-binding protein + Prom 36065 - 36124 6.4 52 12 Op 1 . 
+ CDS 36157 - 36312 188 ## gi|257453128|ref|ZP_05618427.1| hypothetical protein F3_08716 53 12 Op 2 . + CDS 36309 - 37868 1505 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - TRNA 37778 - 37862 67.5 # Ser GGA 0 0 - Term 37636 - 37695 2.4 54 13 Tu 1 . - CDS 37912 - 38733 904 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 38759 - 38818 9.6 + Prom 38796 - 38855 8.9 55 14 Tu 1 . + CDS 38886 - 39086 451 ## FN1309 hypothetical protein + Term 39096 - 39124 1.4 + Prom 39108 - 39167 12.8 56 15 Op 1 1/0.125 + CDS 39200 - 39805 623 ## COG4399 Uncharacterized protein conserved in bacteria 57 15 Op 2 1/0.125 + CDS 39829 - 40833 1298 ## COG2255 Holliday junction resolvasome, helicase subunit 58 15 Op 3 1/0.125 + CDS 40823 - 41239 405 ## COG1959 Predicted transcriptional regulator 59 15 Op 4 5/0.000 + CDS 41254 - 41964 738 ## COG1385 Uncharacterized protein conserved in bacteria 60 15 Op 5 . + CDS 41951 - 43261 873 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 61 15 Op 6 . + CDS 43239 - 44201 1246 ## FN1213 hypothetical protein 62 15 Op 7 . 
+ CDS 44218 - 44622 263 ## gi|257453138|ref|ZP_05618437.1| hypothetical protein F3_08766 63 15 Op 8 1/0.125 + CDS 44625 - 46454 2336 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 64 15 Op 9 1/0.125 + CDS 46447 - 48291 2271 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 65 15 Op 10 1/0.125 + CDS 48312 - 48617 224 ## PROTEIN SUPPORTED gi|229884332|ref|ZP_04503793.1| predicted RNA-binding protein containing KH domain, possibly ribosomal protein 66 15 Op 11 1/0.125 + CDS 48633 - 50420 1929 ## COG1154 Deoxyxylulose-5-phosphate synthase 67 15 Op 12 1/0.125 + CDS 50404 - 51234 850 ## COG3481 Predicted HD-superfamily hydrolase 68 15 Op 13 1/0.125 + CDS 51243 - 52046 877 ## COG1189 Predicted rRNA methylase 69 15 Op 14 1/0.125 + CDS 52043 - 53326 1716 ## COG0793 Periplasmic protease 70 15 Op 15 1/0.125 + CDS 53326 - 54009 934 ## COG0313 Predicted methyltransferases 71 15 Op 16 1/0.125 + CDS 53990 - 54880 1217 ## COG1161 Predicted GTPases 72 15 Op 17 . + CDS 54877 - 55656 1217 ## COG0171 NAD synthase 73 15 Op 18 . + CDS 55669 - 56028 557 ## FN1201 hypothetical protein + Term 56076 - 56108 3.2 + Prom 56156 - 56215 13.1 74 16 Op 1 . + CDS 56267 - 57421 735 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog 75 16 Op 2 . + CDS 57430 - 58860 1593 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 58905 - 58964 11.9 76 17 Op 1 . + CDS 59058 - 60506 2071 ## COG2368 Aromatic ring hydroxylase 77 17 Op 2 . + CDS 60527 - 60835 254 ## gi|257453153|ref|ZP_05618452.1| hypothetical protein F3_08841 78 17 Op 3 . + CDS 60810 - 62117 1481 ## COG0427 Acetyl-CoA hydrolase 79 17 Op 4 . + CDS 62127 - 63476 1392 ## CD1965 hypothetical protein 80 17 Op 5 . + CDS 63515 - 63763 269 ## gi|257453156|ref|ZP_05618455.1| hypothetical protein F3_08856 81 17 Op 6 . + CDS 63729 - 63848 72 ## - Term 63777 - 63819 10.1 82 18 Tu 1 . 
- CDS 63845 - 64402 497 ## COG1396 Predicted transcriptional regulators - Prom 64584 - 64643 16.0 + Prom 64503 - 64562 12.5 83 19 Op 1 4/0.000 + CDS 64660 - 66018 1821 ## COG2610 H+/gluconate symporter and related permeases 84 19 Op 2 1/0.125 + CDS 66067 - 67392 1674 ## COG3048 D-serine dehydratase 85 19 Op 3 . + CDS 67403 - 68512 1224 ## COG3616 Predicted amino acid aldolase or racemase + Term 68567 - 68609 9.0 - Term 68555 - 68597 9.0 86 20 Op 1 1/0.125 - CDS 68623 - 69126 701 ## COG0716 Flavodoxins 87 20 Op 2 . - CDS 69142 - 69822 834 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 - Prom 69855 - 69914 9.4 + Prom 69835 - 69894 17.3 88 21 Op 1 . + CDS 70013 - 70720 910 ## COG0813 Purine-nucleoside phosphorylase 89 21 Op 2 . + CDS 70717 - 72420 1564 ## COG1032 Fe-S oxidoreductase 90 21 Op 3 . + CDS 72429 - 73745 1800 ## COG1160 Predicted GTPases 91 21 Op 4 1/0.125 + CDS 73820 - 74965 1506 ## COG1114 Branched-chain amino acid permeases 92 21 Op 5 1/0.125 + CDS 74984 - 75349 519 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 93 21 Op 6 . + CDS 75369 - 76634 1692 ## COG1114 Branched-chain amino acid permeases + Term 76651 - 76694 -0.8 + Prom 76641 - 76700 8.8 94 22 Op 1 . + CDS 76743 - 76949 370 ## FN0101 glutaredoxin 95 22 Op 2 24/0.000 + CDS 76946 - 79204 2705 ## COG0209 Ribonucleotide reductase, alpha subunit 96 22 Op 3 . + CDS 79197 - 80231 1264 ## COG0208 Ribonucleotide reductase, beta subunit 97 22 Op 4 . + CDS 80255 - 80560 418 ## gi|257467363|ref|ZP_05631674.1| hypothetical protein FgonA2_07961 98 22 Op 5 . + CDS 80633 - 81712 1705 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 81714 - 81760 7.9 - Term 81631 - 81685 -0.0 99 23 Op 1 1/0.125 - CDS 81743 - 82189 390 ## COG1846 Transcriptional regulators 100 23 Op 2 . - CDS 82214 - 82759 649 ## COG0386 Glutathione peroxidase 101 23 Op 3 . 
- CDS 82761 - 82874 122 ## - Prom 82931 - 82990 14.4 - Term 82973 - 83020 7.6 102 24 Tu 1 . - CDS 83041 - 83856 1184 ## COG5266 ABC-type Co2+ transport system, periplasmic component - Prom 83918 - 83977 16.9 + Prom 84199 - 84258 7.6 103 25 Op 1 6/0.000 + CDS 84293 - 84754 772 ## COG0054 Riboflavin synthase beta-chain 104 25 Op 2 16/0.000 + CDS 84759 - 85838 1261 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis 105 25 Op 3 15/0.000 + CDS 85848 - 86495 873 ## COG0307 Riboflavin synthase alpha chain 106 25 Op 4 . + CDS 86508 - 87719 1692 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase + Term 87729 - 87763 2.2 - Term 87710 - 87757 12.3 107 26 Op 1 . - CDS 87758 - 89245 1713 ## COG4868 Uncharacterized protein conserved in bacteria - Prom 89272 - 89331 7.5 108 26 Op 2 18/0.000 - CDS 89345 - 90061 280 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 109 26 Op 3 19/0.000 - CDS 90077 - 90874 245 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 110 26 Op 4 24/0.000 - CDS 90858 - 91847 1379 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 111 26 Op 5 20/0.000 - CDS 91851 - 92735 989 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 112 26 Op 6 . - CDS 92760 - 93911 1775 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 94008 - 94067 11.5 + Prom 93906 - 93965 9.1 113 27 Tu 1 . + CDS 93990 - 94106 92 ## - Term 94045 - 94071 -1.0 114 28 Op 1 1/0.125 - CDS 94081 - 94866 985 ## COG0561 Predicted hydrolases of the HAD superfamily 115 28 Op 2 1/0.125 - CDS 94859 - 96166 1163 ## COG0534 Na+-driven multidrug efflux pump - Term 96191 - 96226 5.1 116 28 Op 3 . - CDS 96244 - 96444 324 ## COG1278 Cold shock proteins - Prom 96631 - 96690 23.6 117 29 Tu 1 . + CDS 96760 - 98100 2069 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Term 98111 - 98156 5.8 + Prom 98107 - 98166 7.3 118 30 Op 1 . 
+ CDS 98186 - 98791 421 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Prom 98805 - 98864 6.1 119 30 Op 2 . + CDS 98887 - 99369 557 ## Lebu_2038 hypothetical protein 120 30 Op 3 . + CDS 99372 - 99761 404 ## Lebu_2039 hypothetical protein 121 30 Op 4 . + CDS 99762 - 101948 2637 ## COG2217 Cation transport ATPase 122 30 Op 5 . + CDS 101970 - 102722 602 ## Lebu_2041 hypothetical protein 123 30 Op 6 . + CDS 102745 - 103041 548 ## Lebu_2042 hypothetical protein + Term 103059 - 103095 6.6 124 31 Op 1 22/0.000 + CDS 103103 - 103807 586 ## COG0850 Septum formation inhibitor 125 31 Op 2 22/0.000 + CDS 103809 - 104600 1229 ## COG2894 Septum formation inhibitor-activating ATPase 126 31 Op 3 . + CDS 104606 - 104863 400 ## COG0851 Septum formation topological specificity factor Predicted protein(s) >gi|224461432|gb|ACDD01000070.1| GENE 1 2 - 514 425 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453080|ref|ZP_05618379.1| ## NR: gi|257453080|ref|ZP_05618379.1| hypothetical protein F3_08476 [Fusobacterium sp. 3_1_5R] # 1 170 1 170 170 342 100.0 3e-93 MKTVLILGHNARDKGAYSPYLKMSEYDYWGEVVKGLDVPVLRRNPNRGYGLEMREMLSRL EQIEYDVAIELHFNSAISNANGAEVLIYRGNTTSKTLATKFLKKLEAMGHRNRGIIEVSH ERERNGAYGICNSRGHYLLIEPFFGSDEKDCIPVGEMREILKEWIKEVQI >gi|224461432|gb|ACDD01000070.1| GENE 2 511 - 981 416 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453081|ref|ZP_05618380.1| ## NR: gi|257453081|ref|ZP_05618380.1| hypothetical protein F3_08481 [Fusobacterium sp. 3_1_5R] # 1 156 1 156 156 305 100.0 7e-82 MNVTLASVFGSAVGKKVIDKVLDVIGNKIPMTKDEKENLAVELGKIEVDGIKAQTQNILA RNKFIQGLVNAMPLIGWILPLSFAALIGCYLFQFGSDVWFSARGMEAPIYSINKEYSELM KKFIEFLFYGKIAGKLSPFHNSEGTTGGKFLDKLYN >gi|224461432|gb|ACDD01000070.1| GENE 3 991 - 1248 304 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453082|ref|ZP_05618381.1| ## NR: gi|257453082|ref|ZP_05618381.1| hypothetical protein F3_08486 [Fusobacterium sp. 
3_1_5R] # 1 85 1 85 85 144 100.0 2e-33 MTDGLLFTILGTMCGVIGLLYNVIRNLRNDFYNELDKMKKLSDDRDNDIKELIREMKADL KEDMREIKNDIRRAESFKCAGQNQG >gi|224461432|gb|ACDD01000070.1| GENE 4 1358 - 1810 630 150 aa, chain + ## HITS:1 COG:UU033 KEGG:ns NR:ns ## COG: UU033 COG3600 # Protein_GI_number: 13357589 # Func_class: S Function unknown # Function: Uncharacterized phage-associated protein # Organism: Ureaplasma urealyticum # 7 150 9 153 157 103 37.0 1e-22 MYSIYQIAGWFFSKEYAMSSKKLQKLCWYAYSWYIALNSDPEDGHLERLITDVPGAEAWV HGPVFRDLYTDFRYNEYTKTKNAQIENLDKDTLDFLERIWNVYGSFSGEQLEEMTHNEAP WMNARKDVDKFESSNKLINDRDIFQEYLNR >gi|224461432|gb|ACDD01000070.1| GENE 5 1816 - 2436 342 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453084|ref|ZP_05618383.1| ## NR: gi|257453084|ref|ZP_05618383.1| hypothetical protein F3_08496 [Fusobacterium sp. 3_1_5R] # 1 206 1 206 206 367 100.0 1e-100 MKKGLDSIKTNVKNLEENIKLKEKSGDNIVNIDFSFFYFPSISLKYFNNCFKDEESYDKF MVNFYHKILNYLKDKTYNQLESNGKHTHNIKDAKQVRTINKILDEYTKKFTFLPCINIEM RQEFYQIASLAGARIIGTRYGKTFYILFFDPYHLIYPNAKYNVDKTSYKGENVFYINDDI KIFCLEHLLENERCVDCEVMDNLLKK >gi|224461432|gb|ACDD01000070.1| GENE 6 2489 - 2572 67 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLFSKNLTDKFYKYDFMVCLIFKILRF >gi|224461432|gb|ACDD01000070.1| GENE 7 3204 - 4466 880 420 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453085|ref|ZP_05618384.1| ## NR: gi|257453085|ref|ZP_05618384.1| hypothetical protein F3_08501 [Fusobacterium sp. 
3_1_5R] # 1 420 1 420 420 731 100.0 0 MNYEGLDSKSLVLNLLGQDANIQVNKKILLTLGLEQAFYLSYLINQYKYFLAEGSLREDD SFYASNSDISLFSTLNNSQIQRVKKILVEKGFFKISIEGIPSVTYYYLNFEKILQIVASE KTNLELAYKNVYQNETINFEISSEESISLLEKLTYKELRFFCKENKIKYNGNDTKRDLIY KIVENKNPSLLSTAYFSTVDDISTTSGSEIRPLSKMGCSEDPSGFKTRPLVDTKSIPNLK QIKQKKNHDHEEHDLEFDFEKIFHELGVNYTKTNQESIERLLQKMKPHEIEHYIRELYQN IKENPNVKDINALFSAKIQKQECQINTSKKKEIETLSKPKPTAKTTYTREDIKNELAKYN ILYQRSLFEKVKEEFFKVASKEDAEYIFEIIDNDITGFTSLTQLYREWIKVLFPSEGGKL >gi|224461432|gb|ACDD01000070.1| GENE 8 4463 - 5407 777 314 aa, chain + ## HITS:1 COG:XF0641 KEGG:ns NR:ns ## COG: XF0641 COG0863 # Protein_GI_number: 15837243 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Xylella fastidiosa 9a5c # 3 299 51 372 380 135 27.0 2e-31 MKIMHGDCSEYLKTIKTESIDCIVTSPPYWQLRDYGVSNQIGMEESIEEYIDKLMNIMNE LYRVLKKSGTFFLNLGDTYSNVNSKFYPANKMKDNKFSWIEGTVVSRKTNILRKSKMMIP ERLSIRMIDSGWILRNEIIWHKPNALPESLTDRFTNDFEKIFFFTKSQKYYFQKQYEPYS EKTLHSFKDGIMPTGKKKMLSAGESKMAMKKIDKPWRAVYNEKGRNMRTVWSIANKGLRE GHYASFPEKLVERCLLSGCPQNGIVLDPFLGSGTTLKVAKNLNLNGIGIELKKEYIDIAV HRIGEDLFTKIQIV >gi|224461432|gb|ACDD01000070.1| GENE 9 5520 - 5615 172 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFILSAAFLFIIGIGAEKDYIQLELFQELRR >gi|224461432|gb|ACDD01000070.1| GENE 10 5638 - 5895 272 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257462770|ref|ZP_05627178.1| ## NR: gi|257462770|ref|ZP_05627178.1| hypothetical protein FuD12_02899 [Fusobacterium sp. D12] # 1 83 1 83 103 82 90.0 7e-15 MDIFEIIVAFIYIYFLYIFYDYYISRKILKKWQRLSKEKKEEFIKNLSKKNQKRLLYMIE IEKDREMEKQKKKFTKVKKKYRKER >gi|224461432|gb|ACDD01000070.1| GENE 11 5972 - 6142 191 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453089|ref|ZP_05618388.1| ## NR: gi|257453089|ref|ZP_05618388.1| hypothetical protein F3_08521 [Fusobacterium sp. 
3_1_5R] # 1 56 5 60 60 80 98.0 3e-14 MKKTLAKYLKTIFILKKSEGKYEENYGNIVFVFNNEESSIMDKMIEIEQLTLFQIN >gi|224461432|gb|ACDD01000070.1| GENE 12 6157 - 6588 471 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453090|ref|ZP_05618389.1| ## NR: gi|257453090|ref|ZP_05618389.1| hypothetical protein F3_08526 [Fusobacterium sp. 3_1_5R] # 1 143 1 143 143 199 100.0 3e-50 MKNFLTMILLVFGIMFFLRENPFRSEEEKIKIKFTREFSYMEEDFYLLNEVLSVKEKDKL LVDLSSSSRGEQEKAFKELLKEKQEEINKIEDYKKIKFKKEIEENKKIGNTINSIYENNI FRAIIVIVTLLCAVIYILFPRKI >gi|224461432|gb|ACDD01000070.1| GENE 13 6624 - 6788 216 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIKIVEAKYKEELEKKLNAEIKKLEKRTIKDIKYSVFSFREYSYVHSALIIFE >gi|224461432|gb|ACDD01000070.1| GENE 14 6803 - 7486 338 227 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453092|ref|ZP_05618391.1| ## NR: gi|257453092|ref|ZP_05618391.1| hypothetical protein F3_08536 [Fusobacterium sp. 3_1_5R] # 1 227 1 227 227 353 100.0 4e-96 MKYLFYIGTRKYSLSTEEDLLKISPGDLSGKKEIVIGSQNLIHDVQRAYRKLFQKAVDQY NLQRQDKPISSYYCKINESKKLSLATGILIRLGKKEEWENILDKDKKKIEKLYQNQLEVI KEFLPDFHIVNATIYLEEPCLRIVGIPIKREKEEKKKLSIHICKSACFDKTRIEELRKHL VFQIKSDFLKLYYKEIEVTRMIQERKKKKQKQRKITEKDYRQLELFQ >gi|224461432|gb|ACDD01000070.1| GENE 15 7560 - 8345 1109 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453093|ref|ZP_05618392.1| ## NR: gi|257453093|ref|ZP_05618392.1| hypothetical protein F3_08541 [Fusobacterium sp. 3_1_5R] # 1 261 1 261 261 384 100.0 1e-105 MKTWKRIIIIGLMLCVNFQAFALFGGGSSKGIGKIVKILVAMQAKQALMETELGKELFEA VQQTQNQLKQIEMEMTNMMSLTQELTTGQLLKLQQDYQELLSIQNQFQNTLGSFKNFENQ FQNTYKDFKDLKGLTSLDYIQQADDLLTASRNLTKETFAMAGLGSSEKMANDAQRITELM KAANSAEGQKAVMQAGVNMAGEQARMLGEMRTLLAASLKAQNAEIMRSLQNEKAGVEKTK QVLQVHESKNYKKPDLLNGKW >gi|224461432|gb|ACDD01000070.1| GENE 16 8365 - 8667 425 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453094|ref|ZP_05618393.1| ## NR: gi|257453094|ref|ZP_05618393.1| hypothetical protein F3_08546 [Fusobacterium sp. 
3_1_5R] # 1 100 1 100 100 186 100.0 4e-46 MKKIIILSFIVICFIACGKREPEYPIFTHDEKVAIYKEARDHNNTEKLKEIENLMKQLET EGKKGDKIAEKERADWHTVRVLYIAPEQSYKKPDLLNGKW >gi|224461432|gb|ACDD01000070.1| GENE 17 8671 - 9996 1240 441 aa, chain + ## HITS:1 COG:mll9606 KEGG:ns NR:ns ## COG: mll9606 COG3846 # Protein_GI_number: 13488455 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbL components # Organism: Mesorhizobium loti # 59 271 102 322 495 67 25.0 5e-11 MILTEILSNFIEILYKIPQNLRTMAMTLLFFLSTIEIALTIYNNIDNPQFQYLKWGKTKI LKIGFIIFAIQKYESIIKAIKSFFLEIGTKGLGLSISSSDYFNDPSIIFDKGKKLSLIIL ESVEGFSPSTYIFILLALLAFVGFFVISIQIILCWIEFYFLTGISIVFLPFGALDMTGEY YKNVFKTIMSCSIKLCVFNIWILICDKIIKKALSNPPNGIVDLDYALVVCGTAFILVAIM LILPSMTSGLLTGSPQMNAGAAMSAAIGAGAALATNAYHTARMGGKAGTELGSSINKGTE QGGQKGAIYGAKLGGPAGAIVGSWVGAGVGALGGAIKGGFKGAYAAGRYGINQGLLRKPE YGANTKEKKEDSKSNLANTVNLKNAESKNPQTMNTGSAREEDTNATDVLRELQSEQGSKN TANTGTQPIVNGEQGVPSWMQ >gi|224461432|gb|ACDD01000070.1| GENE 18 10067 - 10594 303 175 aa, chain + ## HITS:1 COG:no KEGG:Lebu_0954 NR:ns ## KEGG: Lebu_0954 # Name: not_defined # Def: type IV secretory pathway protease TraF-like protein # Organism: L.buccalis # Pathway: not_defined # 14 175 17 161 163 69 35.0 4e-11 MKRRDMRKKYVWFLFLILVIGTLQYKSRDYTINITRSLPLGIYHLEEASDIQLGDIVQFQ LEKEKMDFLYDREYLPRIADTLLKIVAADSTNSEKIRIQNNSIFPILYIGNHNWGPILPA DSKNRVVPQISLEEMKPKEGEYLLLSPVARSFDGRYWGSISKEKILKKATPILIF >gi|224461432|gb|ACDD01000070.1| GENE 19 10609 - 10707 130 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCDKFFQNDKKIKRILDFSLKILYNLVYKEFV >gi|224461432|gb|ACDD01000070.1| GENE 20 10792 - 10923 213 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453097|ref|ZP_05618396.1| ## NR: gi|257453097|ref|ZP_05618396.1| hypothetical protein F3_08561 [Fusobacterium sp. 
3_1_5R] # 1 43 1 43 43 68 100.0 9e-11 MILHYNMDGSIQMQLKFRGKVIEKYFSSQKEYVAFLQNFDSKI >gi|224461432|gb|ACDD01000070.1| GENE 21 10939 - 11052 99 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGELQIKRSRKPCDVFSKKKSGRQYKIMLELSRNLAT >gi|224461432|gb|ACDD01000070.1| GENE 22 11092 - 11262 268 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453099|ref|ZP_05618398.1| ## NR: gi|257453099|ref|ZP_05618398.1| hypothetical protein F3_08571 [Fusobacterium sp. 3_1_5R] # 1 56 1 56 56 67 100.0 2e-10 MKTREDILREIERLEKEKIKVERKAKKTGEKEYYDWALINASKILALEWTLQETVL >gi|224461432|gb|ACDD01000070.1| GENE 23 11312 - 12394 714 360 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453100|ref|ZP_05618399.1| ## NR: gi|257453100|ref|ZP_05618399.1| ATPase involved in chromosome partitioning [Fusobacterium sp. 3_1_5R] # 1 360 1 360 360 664 100.0 0 MIKNITIQEKNGKVINNSLYLNVKHIEILGVSKKENSIILEYKDDTISLFNDFLLEEKIE KDVSGRLIYLKKKININIYQSKSTRIQATINIPFPILDNWKLSKGDAKVFITCDSKKVYI RKGDSMRGKVYTVKISKGGIGKTWITAQLGHGLALNGNKVMILTSDSQNNIFDYMIPEKE HEKYKHIKDLRHSVLYGKGEVIPLRKNVDFIPVESSIFTEKFLEKLPEFIEGLRKEYNFI LIDSIPTKAIDSAFVSLSDKIIIPVFCDAVTRKEAVNVMMEAGIEKVHSIIVNLFRNTAV QRESYQFLKEFTSEDIVFPSPIKETAHVESLIHKCKTVWESKASSIKSIQNSLLDVIMTM >gi|224461432|gb|ACDD01000070.1| GENE 24 12411 - 13079 757 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453101|ref|ZP_05618400.1| ## NR: gi|257453101|ref|ZP_05618400.1| hypothetical protein F3_08581 [Fusobacterium sp. 3_1_5R] # 1 207 1 207 222 321 100.0 2e-86 MGSAEQFKALMEQRKKEKAPEKNVVIKSSQFNLIQEDFLVFDYEVFQEFTKDKELLEYIK NKTFDLINAQAGGALYIGKNLTEVAEKLSKRGSPEGLYTRYLQYNGIKKDTALRLRKRWE LYKKAKEEHSKKIISLLNVQEIEEVYRNQKLLEDFSNMKLEDVKDVLQRRIVASASTKRI DFPLEYHYSFLEKKYQKKISNLDEEKQKLALELLQKLEELLQ >gi|224461432|gb|ACDD01000070.1| GENE 25 13111 - 13413 413 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453102|ref|ZP_05618401.1| ## NR: gi|257453102|ref|ZP_05618401.1| hypothetical protein F3_08586 [Fusobacterium sp. 
3_1_5R] # 1 100 1 100 100 174 100.0 1e-42 MKKIMTIIIGLSLFISCGKKEPEYPIFTYDEKEAMYKEAKDNNNTEKLKEIEDLMKQLEI AGKKGDKVAEQEYEDWHVVEVLYVSPKKKDPGANLLNRSW >gi|224461432|gb|ACDD01000070.1| GENE 26 13434 - 13736 400 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453103|ref|ZP_05618402.1| ## NR: gi|257453103|ref|ZP_05618402.1| hypothetical protein F3_08591 [Fusobacterium sp. 3_1_5R] # 1 100 1 100 100 181 100.0 2e-44 MKKIILLSLFVICFMACGKKELEYPIFSHDEKVTMYKEAKDNNNTEKLEEIEKLMQQLEI AGKKGDEVAEKERADWHTVRVLYIAPEQSYKKPDLLNGKW >gi|224461432|gb|ACDD01000070.1| GENE 27 13763 - 14083 230 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453104|ref|ZP_05618403.1| ## NR: gi|257453104|ref|ZP_05618403.1| hypothetical protein F3_08596 [Fusobacterium sp. 3_1_5R] # 1 106 12 117 117 178 99.0 1e-43 MVLVYLVIIGIIGAIYNFLNTKHIEKYGYPIFDFYFIEILAGSGGFLFIGYWWREHAIKH SKDAFAPNLFIGFISLGIIIWIMGKSISYYQDKRHLGKKRKMIIKI >gi|224461432|gb|ACDD01000070.1| GENE 28 14236 - 14442 248 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453105|ref|ZP_05618404.1| ## NR: gi|257453105|ref|ZP_05618404.1| hypothetical protein F3_08601 [Fusobacterium sp. 
3_1_5R] # 1 68 1 68 68 86 100.0 5e-16 METKEFLENTLAALKKQEKDLLPTEENFVKLEAIRGQIACIEEKIRKLKNFNPFAEKIKK EKNRGMER >gi|224461432|gb|ACDD01000070.1| GENE 29 14668 - 15237 412 189 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_5408 NR:ns ## KEGG: BcerKBAB4_5408 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 4 172 8 180 190 96 34.0 5e-19 MSIKFYILKEKYIDYLREADTKVQKNKKESRPYIGIVYKVGDFNYFSPLASPKDKHIRMK NRIDFIKIDNGKLGIINLNNSVPVNKDQLKLLDIDSLKNSLKVEEKKYGTLCEDQLIWCN DNLERIEKNFKKLYTLSINGKLPESIRDRCCDFKVLEKQCLDYSNKLKQEKLETINDIIK SNSKNQLDL >gi|224461432|gb|ACDD01000070.1| GENE 30 15234 - 15374 123 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLKIEKYYNVSDVIMTEELEQKLLPMREISQEKVVTEEKDIEIER >gi|224461432|gb|ACDD01000070.1| GENE 31 15440 - 15640 244 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENIEFWKSKLTELKKKKERLEMSGEDPVQLEEIICRIKEIEEKIKKLENPFLKKISQKK DRGIER >gi|224461432|gb|ACDD01000070.1| GENE 32 15726 - 16589 665 287 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453109|ref|ZP_05618408.1| ## NR: gi|257453109|ref|ZP_05618408.1| hypothetical protein F3_08621 [Fusobacterium sp. 
3_1_5R] # 1 287 1 287 287 487 100.0 1e-136 MLFFNLNLDKYDGEISQIIDIKMKFLDKISNLENVNCFWRIHSTDFPFLKYPYIYNKDMV LEAVDNEQSKFSSEDIISCIEKAHIELIPGLEKSLNDLGLRYDFIIEISNKEYNNSLRDY MAENNLQFAFESIGNTNMTLEEKINFISNELKELGAENNTLIIMDPYIFPKKHDSDYLDF FLGFIKKSKIKKLKLITSSDNNKYNRNLHLEFKTQLEELSIILEVFRYDKQHDRRWIIEE KNTGFLCGPSLNGIGKDKTTTISHLNKEDVIYIINNEISLLNEKIEV >gi|224461432|gb|ACDD01000070.1| GENE 33 16601 - 17209 664 202 aa, chain - ## HITS:1 COG:CAC1595 KEGG:ns NR:ns ## COG: CAC1595 COG0582 # Protein_GI_number: 15894873 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 18 197 8 181 186 73 38.0 3e-13 MLTESKKGQAVKAITPEDLKRIREYLSMKGKIPFLEFINFGCNVALRISDLSVIEFQDIN ERRWKLELIEKKTKKKRVIKLNKTCQKAVKNLKEYYQELGYDVSKGYLFKSLSPYQLKHK LDTPFTVNGVSRAFKRLEEMLNIQYPLGSHSLRKTWGKKVYEETLNIALIMKAFNHSSPA VTLRYIGIEQEEIDQLYEDFEI >gi|224461432|gb|ACDD01000070.1| GENE 34 17363 - 17578 150 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLILYNIFFRLQAKNQGQDQEKCKKSAKTVKIVSKMKGYLVLSNFFKIRKARLRISLSE KRSVLFMGFAT >gi|224461432|gb|ACDD01000070.1| GENE 35 17647 - 20100 2546 817 aa, chain + ## HITS:1 COG:no KEGG:Ppro_3827 NR:ns ## KEGG: Ppro_3827 # Name: not_defined # Def: MobA/MobL protein # Organism: P.propionicus # Pathway: not_defined # 1 197 1 195 234 144 44.0 2e-32 MGTYRLCYKKGKKGYARNHAAYILREEKYKGKEDLIYKESGNIPFVDGSNAIKFWEYADV FERADSVAYREMELNIPNELNHEQAKELIQNFVKKEFSSSVPYTFAIHESYNKEGEKNLH CHLMFSERELDGISRSDELFFRRANSKNPAKGGAEKNRIWQKKEKLLDLRKSWEVEQNLV LEKYGLEIRVDCRSLQEMRREALEKENFQRAEELDRTPINISGKILYKVDHKMRLTEEEE KKYQTFLRAKEEKKIKEKNIEILPKEEMIEYIDHIKKYSIEDSTLNICTKGAYFRMKRTK SQVKKELVKYPDNVELQNHYEIISNEIINTEKQWKNTLKFKNISEQLERDHKRELEKAST IFATKFGENYEKAKERFNQEKLQKKLRRKYQNYDTLKLKIKATMLEVENSIYKAADIVSE YKYSKVLHNFTTHIEAISKLKEEEQEISLYYPKELSMIQAKLIWHEKRKLETKQEFIELT DKIKTSKTFPSLIQQIEQNNSIEKKEFQKLYPEREENELEYFQNKIRLIQRQAELEKLYI KYSDKKTIDLKRLYSVSLEKEAVEKLFNSKYSREVEPNSDIAKHLLEEVQGEIEKNNKKI QNQEKALSYLKNALDYSNREFGLSGMEMLAINKLSNQEYNRGYKQKKKLDDVLSIEKKKL 
AKMGMLSWGRKELQNNIQKHILEQNRCQRKLDKLLLKYKGSVELSEESKKIEKVYQTNFE KIQKQMWSTKKENKINYQLKRTLEIKPEKLRKLPKIHSKHVEKREKVQREIRGNFSRVLQ HARYELDKILAADKTEIQSTLDITLKKEKDKGYEWEL >gi|224461432|gb|ACDD01000070.1| GENE 36 20088 - 20615 334 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453112|ref|ZP_05618411.1| ## NR: gi|257453112|ref|ZP_05618411.1| hypothetical protein F3_08636 [Fusobacterium sp. 3_1_5R] # 1 175 1 175 175 286 100.0 4e-76 MGTIMLKACDRKIYNLRKRISNLEKKKYLTIIQENKIARKIRDHKLLQLGLLFEITYTLI YSEYEVTGHLLQLKEKQGEELNILQTEGNSIFSEISIEEHDKEEVRYLLTEERKARNHIL ISYGALLESTNTMYYPLSVLIAYIRNIHNYTKEELKSLEEIGRQFFREKDGKGEN >gi|224461432|gb|ACDD01000070.1| GENE 37 20630 - 20962 322 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453113|ref|ZP_05618412.1| ## NR: gi|257453113|ref|ZP_05618412.1| hypothetical protein F3_08641 [Fusobacterium sp. 3_1_5R] # 1 110 1 110 110 177 100.0 3e-43 MLESIEEKKVRLLDEAVEYLKLQELKESYGYQGFFENTNVLKNNPDLNNDKFFVLKLLDL NPLEFQYASTELQNDREVAMKAILKEKGNLRYIGVSLKEDKNFMSLFKKC >gi|224461432|gb|ACDD01000070.1| GENE 38 20974 - 21300 489 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453114|ref|ZP_05618413.1| ## NR: gi|257453114|ref|ZP_05618413.1| hypothetical protein F3_08646 [Fusobacterium sp. 3_1_5R] # 1 108 1 108 108 168 100.0 1e-40 MEKEIYQTYYENGKLKEECPIRRGKLHGVSTVYREDGEILEKRIYQNDMCMGNPFTGMNI FELEDVLGLTDSTKEEWQETLEEEGECYEITPEFASIILLDNKEEEEN >gi|224461432|gb|ACDD01000070.1| GENE 39 21301 - 21465 258 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453115|ref|ZP_05618414.1| ## NR: gi|257453115|ref|ZP_05618414.1| hypothetical protein F3_08651 [Fusobacterium sp. 
3_1_5R] # 1 54 1 54 54 66 100.0 4e-10 MEGMEKLMKDLGVNNQEELLEYINDPANQEEEIVKQLKEFIDFFAQKNTLQSEN >gi|224461432|gb|ACDD01000070.1| GENE 40 21767 - 25657 2929 1296 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 923 1276 158 519 522 167 29.0 1e-40 MAEKTYKVQGIFEWFKQFHGKTIRIEGESKNFEILLDNESLPHLLGVQYVNRVNEIRRGR NLYQYISKLSDQAIYNKIAKNNYNKLQATQQRIDTFQTFMENIEKAYIVEQTHQNTQLKS NYLVIQSEDNTFFHLGIHQSKNTYLETYFVRSDIKYFETSQIIEPIKGLYQYNEKIMSFE PFSFCQKEHILEKEEKTMNLYQYFGCKNQEELYRLVENKNPSVRELIDFMNYAKEKIISK VEKIQHKEDMIKYLGSIELPKEDEFSLTFVDSANRVLSHKIFHRGTGLSEIMKQAYHPRA RGFFLLSNSTNYNLYDNLYYELKDLNYESLDRMYLKNNEIYSVDLELGGSIGNVSEEKEN TFLFSETFTNVSSIPKMEKYQEFIEYYREQQLPGKNVIANHREIEKLLKLSNQHLSQEYF SIIPYDKNNKITNYETLFKGTVNSAPVDLKVLIPYLLDENIKGIQILHNHPSGIPKPSRK DIELTRDIESLCKKFNKELLDHTIVGKEDIFSFQREGMLYQNVLAQGVAEINEPREMIEA LKDDIENLQYANDSMRGNKEIAKFIIPENPSALRHMTEELRNDKELVLQAIQKNGKMFEY ASEVLRDDKDVAYEALKKDKANVLYLGDTIGKKIQEEKDLEIFLQEYAKEKEKQLEEAKQ IKTDKLYIYEDKKLISVEELSTEGYRNAFLLYKEKDRENRSVPSSLKFEMFGITEKGELY LTGIDQTDTSYTLEEACERFHQYGKISGERNQKMLPRMIHEIDQKYEVNMAKVYQEITGH EIAEPKIQEMTPQKEIEQDKIYAYILEGKNEKGEELTLIAEDTLTKEGIANVYKSSKELY EKGLISKVKIFGIDSNGIVYHGRPDQRNFDLSKGEYAKNLISGSLKVNLLESRQEYILDK IKEIDKQYHVGMLELYQGITNVRGNEVMQQSSRSKKEMEKEEKIVGVSNSKKEGQKIMQQ IMEEAKVSKPSKYKEEREKFTDSVIKSLEEGKIPWERDWESKSSILHNPISQTKYQGKNA FKLAVTSFKKGYTDPRWVTFKQAQQAGWKVKKGEKATTIEIYQKYDKATKKEFDEKVLEG MSYAEKSKYMKENVYFFIKTHNVFNGQQLEGIAPFKEEKREVKYEKIDKILQNSKVPVQY LGNKAFYSIETDSITLPEKENFKSENRFYGTALHELAHSTGHKNRLNRNIDSKFGTKQYA REELVAEFSSVFIGQQLGLNYDKDKLDNSKAYLQNWAKHLKEDKNLLYDAIKDAEIATKS IVAMQTKEGPEIFKQHNHERQFLKGKKVEKENGMER >gi|224461432|gb|ACDD01000070.1| GENE 41 25717 - 26037 496 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453117|ref|ZP_05618416.1| ## NR: gi|257453117|ref|ZP_05618416.1| hypothetical protein F3_08661 [Fusobacterium sp. 
3_1_5R] # 1 106 1 106 106 167 100.0 2e-40 MKLVTRVQNKMKVAISLFFLKAALAMASTSDAPWVGMLDKIMKVLVGPTARLLSIFALVV VGFVFMSGNTKEGGKMGLNIAVGVSIIFAAATWGPKFFGYSGSILM >gi|224461432|gb|ACDD01000070.1| GENE 42 26051 - 26767 708 238 aa, chain + ## HITS:1 COG:CC2687 KEGG:ns NR:ns ## COG: CC2687 COG3701 # Protein_GI_number: 16126921 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, TrbF components # Organism: Caulobacter vibrioides # 47 228 40 218 254 72 29.0 7e-13 MRFFSKKKTYSKYYPKGKREIPRGRGTSSINNPALNMYLNLAKSCRNWQLAFLIMAGFFG VSLFSYFQLANRTKLVPFIIEIDQEGKPHFAGKMEQIQFKANDVLIFSMLDTHIMNSRSV SLDRVITYKLLKKQYSFLSKEMKNKMNEEITTLNIEKKFKTKESIDVQITSILKNSEGVY QVNWIEKKFKDGSFVESKKMTGLFSVSQKTGTLSEEELRNNPLELIIEDYNITIDKSI >gi|224461432|gb|ACDD01000070.1| GENE 43 26780 - 27613 901 277 aa, chain + ## HITS:1 COG:RSc2576 KEGG:ns NR:ns ## COG: RSc2576 COG3504 # Protein_GI_number: 17547295 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB9 components # Organism: Ralstonia solanacearum # 38 253 99 315 334 114 30.0 2e-25 MRKKILLYCFLAIASVSFAIDSASEFDEYSTQEPMIKAGVKKGKQNIKTNYFYNENDSYK IYARAGYVSTVLLNPDEDIIHAEIGDATRWYIQTYYTGTERGMTPAITVKPFVPELKTNL IISTSKRTYNFMLEAAYNSYNPIVTFEYQKEIQIAKRKEADLKAQATSINIHNLNFDYSW KKGKYPWSPEQVFDDGEKTIILLPESTKATQLPVLFIKDEQTGEAAMIRHRYDPVKREFI IDRLFQQAILRYGEQEIIIKRKGSFIKSSHDHISISL >gi|224461432|gb|ACDD01000070.1| GENE 44 27626 - 28747 1278 373 aa, chain + ## HITS:1 COG:mlr6405 KEGG:ns NR:ns ## COG: mlr6405 COG2948 # Protein_GI_number: 13475359 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB10 components # Organism: Mesorhizobium loti # 105 366 109 388 409 149 35.0 1e-35 MDFNTNQDPIEQNGIEEKSEMRIKGVFFKRKMLYGILLFFLGILVFYLFSGEIFSKEENE GKKEVQEKEQTNVEEVEANYEDVPTDVGYEEKLRDAEGKIITEEDLNREVIDTQENSEEI EYQKGKRRELEEEAIQAMRSPSSITIATRPQINNRNVVLTPGGGNTPVTDYDGNRQESKR 
NFLNREKSQKFYQLGELVDPVSEYELKAGDFIPAVMLQAINSDLPTKGIVAQVAENVMDT VTGKYLLIPQGTKIIGTYDSSITFGQERLLVVWQRLIFPNGRTILLENMQGIDLSGKAGL NADVNNHFATLLKGVILSSLMGSAAAITTDRKKDWRGAAAEGAGEQIISIGDRIARKKSI QTTNIIYETGRSI >gi|224461432|gb|ACDD01000070.1| GENE 45 28802 - 30754 1708 650 aa, chain + ## HITS:1 COG:mlr6395 KEGG:ns NR:ns ## COG: mlr6395 COG3505 # Protein_GI_number: 13475349 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Mesorhizobium loti # 13 645 69 634 735 301 31.0 3e-81 MKWLVNFRKYSFQKKISIFLILVGFWVGCQVGTQTFAKKVGYHKALGTPLFYYQKTPIYF PFSIFSWMKWEKSAPKALDDAQLNMITVLLISVMSIGIINKKKKKKDYYGSASWATKKEI ENMNFFPYLKKEIYEADYHFSQTIALYQEKILRNCNIQTRMQQSEYAKSGVFVGRDAWGR DLIDLSPGHLMMIAKTGGGKGISVVITTLFTWKGSTIVNDVKGDNWLWTAAYRRRLGHKC FRFEATADGVEKVSCHYNPLAEIRKGTLWEYQDARIIAETLVSPDRLKDPFFGPNGVTYL TAVILHVLYTVKRRVANLPDVYNFMSSPQFTEEEKLKQMMTFEHNDTVNRNLFYEIYNDV IILQDGEESPRTHPRVSRVAADMLGRSDKERSGIISTAKTELEVFAIPTVARNTAYSDFR ISDLQNYEVPVDLYFVTPINAIDITSTLLKLFLTQILFILTDKIEINSKGENTAYRHRLL LLMDEFTAIGRIDLLNKEVALMRGHGIKGFFIIQDMKQLKATYGENNAFLGNMSTTIYYS TNDVDTAKYIETRLGNKTEKMITRSYGQGGILFRKNLNYSEHYIARPLMTAEEIHSMDEN TSIILSAGKRPIKGKIVKWYEEPEFQNRFQRCPAATNRTPSDVIMPISEN >gi|224461432|gb|ACDD01000070.1| GENE 46 30800 - 31120 316 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453122|ref|ZP_05618421.1| ## NR: gi|257453122|ref|ZP_05618421.1| hypothetical protein F3_08686 [Fusobacterium sp. 
3_1_5R] # 1 106 1 106 106 125 100.0 8e-28 MEKEKDTTEMNSNEEQKKNKKTIDEKIQDLKSRLKSAEQQKKEMLQRKGVLIWRKIKPSF FQDERILYDLLEQDDKMGILIEKVEYIIKQLFPTYIRGDKDDKESK >gi|224461432|gb|ACDD01000070.1| GENE 47 31101 - 32039 1100 312 aa, chain + ## HITS:1 COG:AGpT89 KEGG:ns NR:ns ## COG: AGpT89 COG4962 # Protein_GI_number: 16119853 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 303 25 326 343 237 42.0 2e-62 MTKNQNNLLDLLEISLGKEVLSYLSEDIVTEITMNDDGTVWVDTLDRGWVLTNIRLEEEA VYSIIALVANSVNQEVTMQTPIISAELPGSGFRFEGNIPGISSRSVFNIRKYSILNFVLD DYVKSNIMSEEQKRVIEEAVKHHRNILVVGGTGSGKTTLCNAILSEIAKYQERIIIIQDT NELKCACPNRLFLRSNQYVSMRDLLTSTLRRTPRRIVIGEVRDGAALNVLKAWNTGHPGG LCTLHADSAELGLFQLEGYVSEVSQNSQRDTIARTVDLIVDLQKEGLGRKVRGIIQVEGL DDSGNYILKNIA >gi|224461432|gb|ACDD01000070.1| GENE 48 32063 - 32314 281 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453124|ref|ZP_05618423.1| ## NR: gi|257453124|ref|ZP_05618423.1| hypothetical protein F3_08696 [Fusobacterium sp. 
3_1_5R] # 1 83 1 83 83 147 100.0 1e-34 MDEELRQPICKGFIEEPTVAGGAREPVVLNFLLGLISIFATATFYFLPIFFVLHGFIIKF TKEDPFFFLVLRNHLTYKDYYDA >gi|224461432|gb|ACDD01000070.1| GENE 49 32324 - 34927 2540 867 aa, chain + ## HITS:1 COG:RSc2581 KEGG:ns NR:ns ## COG: RSc2581 COG3451 # Protein_GI_number: 17547300 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Ralstonia solanacearum # 54 850 28 790 816 500 34.0 1e-141 MFQEYKKPKIPKELFVEPDLVAYLMPYVSIDENKNYEVRTTDSLGEERVENFPIVYLKDG SYQTTFQFRGKDLDSCTVYELLSITSRMNNTLKQLNADWTIHVHAIRKKIKKYNKKEGIQ NIPIKIIELEREEFFKSGHHYESDYYITFTWLTPTDQLQKAKSLFFKKTDQEVIINFVEE HLKYYNQELTKIYALLSDILQECRILSIDEVVSFYHSLVSDNPELQLKAPRAFYYKGELI ATGNLIEKYKETLQQEEIRTELLPVLLDSYLYDSGITGGIEPKIGKYHIRTVSLLKYPGD AIVGILDELNRVNIEYDWCSRYIMMDTLTAKKELEKYFDRWDSARESFKTLLKTSLFKME GKENDAAASRAWQIRKEKANLEQDYNTVGYYTFTVVLKGENKAEVEKRALLVKTILNAKG FTAKIENFNALEAYLSTMPGNLLNVRKPLQNSLVFGNLLPLNAVWAGDAWNKHLNTPPLL YCQTIGNTPFRLNLHFGDVGHTIIIGPTGAGKSVLLATLHAQFLAYPKAKVIAFDRGAST RVLNKAAGGVFYDLGEDNIRFQPLRYCDQEKEREWCQEWILGLLEENTLTITPDIQTYVW KALTNLAKLPIEKRTMNSFVDLVGGQSRDIKNALQSYYGKGPFAKYFDGNEEFLQESLYT VFEMEKVMEHKNVITPLMEYLFHLIDTKMIDGISPMLLTLDEGWAFLKNKRFSGKIEDWE RTLRKKNVSIVFTTQNPDEVLESDIKAAILNQCYTRIFLPNPNAKAEIQAGYYKIFGLND TEIDILQNSTPKKQYYFKNPKGSRLFELALSPLELCYLANSSGKDQEKCKELSNLSQKEF NKQWLNYRGFYGDDIVDRLEEIIKEEC >gi|224461432|gb|ACDD01000070.1| GENE 50 34940 - 35668 905 242 aa, chain + ## HITS:1 COG:PM0998 KEGG:ns NR:ns ## COG: PM0998 COG3713 # Protein_GI_number: 15602863 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Pasteurella multocida # 5 242 27 274 274 95 28.0 6e-20 MKKICILGFFIISSFSFANTIGIAGVYREPLHHAKSTLSALPIVQIEYKDFYFKNYKAGF YFYQEPGFKVSILVNPLGGYTDFAIQKSKLKKGYQNISNRNTQFMAGLALDFQLDKRTIG HGEYMLGHYGSMGEIKINQVYRLHDRITFLPGISFHYYDAKYMHHYIGISKEEVTKNEKI KKSYHGKDTISGGVNATVEFALTEQVSCNIFAGVEAYNHIKESALVKKSHQVYGGIGFRV SF >gi|224461432|gb|ACDD01000070.1| GENE 51 35721 - 
36035 401 104 aa, chain + ## HITS:1 COG:CAC1919 KEGG:ns NR:ns ## COG: CAC1919 COG0629 # Protein_GI_number: 15895193 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 1 104 1 114 133 65 31.0 2e-11 MNNFSGVGRLTADAELQKKDGKSIVRFNIAIYRTKEVTDFFPCIIFGEYGEKLYDFLKKG KMIGIMGQVHNNQYEKDGEKRYYTSILVNRIELLEKKSELENIE >gi|224461432|gb|ACDD01000070.1| GENE 52 36157 - 36312 188 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453128|ref|ZP_05618427.1| ## NR: gi|257453128|ref|ZP_05618427.1| hypothetical protein F3_08716 [Fusobacterium sp. 3_1_5R] # 1 51 1 51 51 71 100.0 2e-11 MERIKNIIYKIKEEDSPALYEFFLFSIALDFVDSEKNTSITEKVASEEEIL >gi|224461432|gb|ACDD01000070.1| GENE 53 36309 - 37868 1505 519 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 3 347 7 367 542 132 30.0 1e-30 MNRRAVGYCRFSSDSQREESIDAQLRAIRDFCKQNNYDLIKVYKDEGISGTSTKDRESFL QLIEDSKKNMFDYVIVHKFDRFARNRYDHAIYEKILNDNKVKLLSVLERLNDSPESIILK SVLTGMNEYYSLNLSREIKKGLNENALKAMHTCGIPPLGYDLDENRRYIINEKEAEAVRL IFSLAADGIGFASIARTLNERSYKNKRGREFKKTSIRDTLLNQKYIGTYFHSLKNRDGTF RRDPILIENAHPAIIEKSLFYKVQTRFKNHLKGPRDRKNTTYYLTGFCRCGECGGSFSGG YRSAHIDGSVHYGYECRKRRAKENNCKNKPIFKEVLEPVILELIKSEIFAEENMEILVKD ISEVIKKYKISQEQEEEYYVKEIEKLNKMVLKLLDKNLEGFLSDEIFRKKNKELNERILI MKEKLYSLETLDQLKEDNLRKYLLKLKNDSTHSLNRKIVESFLHEVIIYQDHIEVTLRRF PKQILDMSKDGGSRGNRTHKTLPPTAFQAAALPLGDTSD >gi|224461432|gb|ACDD01000070.1| GENE 54 37912 - 38733 904 273 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 3 272 6 278 290 162 32.0 6e-40 MNKKILLVNDMPGYGKVALSAMTPILSTMGHSLFNLPTALVSNTLDYGKFEIMDTTEYME 
KSLQIWEELNFSFDCISTGFIFTKRQVELILQYIEKKKTQGIFVMVDPIMGDQGKLYNGV KEETVDNMRKLSSVADVMVPNFTEACFLARKYVGQKTISLEEVKDLIQLLLSNGAKSIVI TSIETEDNQHYVCGFDSKTQDYFFLPYDHIPIQFPGTGDIFSSILLGNLLHEYSLTESVQ KAMNVVYEFILKNKDNQDKFRGIAIEEGLSLIK >gi|224461432|gb|ACDD01000070.1| GENE 55 38886 - 39086 451 66 aa, chain + ## HITS:1 COG:no KEGG:FN1309 NR:ns ## KEGG: FN1309 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 64 1 64 64 90 71.0 2e-17 MVTKDMNILEAVQNYPIAIEVFQKHGLGCVGCMIASGETLGEGIAAHGLNPDAIVDEINE LIKQGK >gi|224461432|gb|ACDD01000070.1| GENE 56 39200 - 39805 623 201 aa, chain + ## HITS:1 COG:FN1218 KEGG:ns NR:ns ## COG: FN1218 COG4399 # Protein_GI_number: 19704553 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 196 3 195 200 181 49.0 7e-46 MLLRLLIMVLIGAWIGWITNWLAIKMLFHPYEEKRFLCFKLQGLIPKRKKDIGSGIARVV EQELLSLKDVLNQMDTELIFQNIERMMDEYLEDNLAKEIQKAFPFAAMFVGKDSLGKIKS LLKQAILSRKEEICSAFTNHLEENVDIQKIISDKIASFSFQKVEEIILSLAKKELKHIEL VGAILGAVIGGLQFLLFSYFS >gi|224461432|gb|ACDD01000070.1| GENE 57 39829 - 40833 1298 334 aa, chain + ## HITS:1 COG:FN1217 KEGG:ns NR:ns ## COG: FN1217 COG2255 # Protein_GI_number: 19704552 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Fusobacterium nucleatum # 1 331 1 331 332 490 77.0 1e-138 MDRIVSELEIPGEIEIQKNLRPKSFREYIGQESLKEKIFISIQAAKRRGSVIDHVLLYGP PGLGKTTLAGVIANEMGANLKITSGPVLEKAGDLAAILTSLEENDVLFIDEIHRLNTAVE EILYPAMEDKELDIIIGKGPAARSIRIELPNFTLIGATTRAGLLSAPLRDRFGISHKMEY YTEEEVKEIILRGGKILEIEVEGEGAEELAKRSRGTPRIANRLLKRVRDYAEIRGKGIIT QEIAIQALNLLGVDMEGLDDLDRNILQAMFENYGGGPVGIETLSLLLGEDRRTLEEVYEP YLIQKGFLKRTNRGRIATSKAIAYWEKMEEKNEN >gi|224461432|gb|ACDD01000070.1| GENE 58 40823 - 41239 405 138 aa, chain + ## HITS:1 COG:FN1216 KEGG:ns NR:ns ## COG: FN1216 COG1959 # Protein_GI_number: 19704551 # Func_class: K Transcription # Function: Predicted transcriptional regulator # 
Organism: Fusobacterium nucleatum # 1 138 1 139 143 186 71.0 8e-48 MKINTKVRYGFKALAYIAMNTEENKLVRIKEIAESQNISIQYLEQILFKLKNEKIIEGKR GPSGGYRLAMSPKEITLHKVYMILDDEVKVIDCNESDEHRQQCKDSICGSTCIWSKLDYA LTKILSDTTLEDFINNVK >gi|224461432|gb|ACDD01000070.1| GENE 59 41254 - 41964 738 236 aa, chain + ## HITS:1 COG:FN1215 KEGG:ns NR:ns ## COG: FN1215 COG1385 # Protein_GI_number: 19704550 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 235 1 234 235 233 56.0 2e-61 MISVIIERNEYLDEIILEKKEDLHHLLHVFRLEIGDKVRAVDGDYEYICEIQKIIENKVH LQILEKREDAFSLSVDIDAAICLIKNDKMDFCIQKLTELGIRSIIPTVAKRCVVKLKEKK EKWNTIVKETMKQCQGVKPTQIQEVTDLKKLPLEDYDLILLPYECEEEHSLKYVLQNRVE KPRKVLYVIGPEGGFEKEEIQYLASKRAEVVSLGKRILRAETAAIVVGGILVHEFG >gi|224461432|gb|ACDD01000070.1| GENE 60 41951 - 43261 873 436 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 7 421 4 425 451 340 39 1e-92 MNSDKRVAFYTLGCKVNQYESESIKNQLLQKGYEEVDFESIADIYIVNSCTVTSIADRKT RNMLRRAKKQNPSGKVIVTGCYAETNRKDLLEMEEIDFVIGNKDKSAVAKFVQEIHTQER VEKKESIFQEKEYQEYEFATFREMTRAYVKIQDGCNEFCSYCKIPFARGKSRSRKQEKVL EEIDKLLMEGFQEIILIGINLGDYGKDLEGDISFETLVQEILKRDLLKRVRIGSVYPDRI TNSFISLFENPKMMPHLHISLQSCDDTVLKNMKRKYGRELILNSLLSLREKVPSMEYTAD IIVGFPGETEEMFQNTYASLEEIGFSHLHIFPYSDREGTLASRMKNKLSPEIKKERVTIL ENLQKKVEEDRRKAYLGKTIEVLIEEEKDGYWWGYSPNYLRVKIKGDNISVNSVIQVEIE KVEKGVLVAYEYAKSL >gi|224461432|gb|ACDD01000070.1| GENE 61 43239 - 44201 1246 320 aa, chain + ## HITS:1 COG:no KEGG:FN1213 NR:ns ## KEGG: FN1213 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 10 320 16 327 327 313 55.0 7e-84 MNMRKVYNGVILVLIAILVILLYFNFRGNQISLSEHDRMLFIGKKNLVAVYEDKLAVDIP FEIHTNKEMTFGDLVKKKEYEEVLRKVNDILPEKIEKYAVVKYGEIDYKVKNAKKLPETT IDESRYALASSIYSMFDELYREANTADVLNQNIIVDVLNANGRGGYARKTGELLTQNLSM KYNAANYEKNQEESYIILNDISMDKARDIVMTLPEKYFKIQAKPVVPTLANVVIVLGKEQ 
NLPFAISIEGSEANIKKAAANLKKAGYKTIKTSTKSGNEKSFIEYRKEDYFIAYKIAKML DIQDMVEKDSLSDKVDIHLQ >gi|224461432|gb|ACDD01000070.1| GENE 62 44218 - 44622 263 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453138|ref|ZP_05618437.1| ## NR: gi|257453138|ref|ZP_05618437.1| hypothetical protein F3_08766 [Fusobacterium sp. 3_1_5R] # 1 134 1 134 134 149 100.0 7e-35 MWIYLLSFLGIFLENSFFFSGEKVFFFSIPFFSYVLLKKRGNSLIPLLLTILLVSLQGNS YFSFFLYFLCYGVVFYFAFRNMEYNQGTVFYLTIIELGFYSILQNYHWNFLCFMIHAFCF LGLNYYYLKKCYKD >gi|224461432|gb|ACDD01000070.1| GENE 63 44625 - 46454 2336 609 aa, chain + ## HITS:1 COG:FN1211 KEGG:ns NR:ns ## COG: FN1211 COG0768 # Protein_GI_number: 19704546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Fusobacterium nucleatum # 2 594 5 599 657 719 63.0 0 MRRKKSDFFLGVENNSRGKIYLGVIVLFFFILLVRMFYLQVLQGKEYRYLSEKNQFKLKK ITSPRGQIFDSTGKLIVTNGVGYRLVYLRERNNEEEYVNAIVDLTGYDKEYILKRIKYGE IFPYTRENVLIEDLEEEKAYKLMEKIVDYPYLEVQAYSKRKYLYDSVAAHSIGYVKKISK KEYELLKEQGYTPRDVVGKEGLEKQYDRELKGEDGYEYIEVNAFNKIQRQMESKDPIPGK NLHLSLNMELQQYMEEQYREEGRAGAFIALDAKTGEIITLVSYPTFSLNMFSSQISQTVW NEIMNDKRRPLGNKAVAGEYPPGSVFKVISALAFLESGIDPKQKYLDANGYYQIGKWKWR AWKRGGHGLVDMKKSLVESANPYYYRLADQVGYKPIAEMAKRFGLGSLTGVDIPGEKMGA IPTPEWKKKKLKASWVKGDSILMSIGQGYDLVTPLQIAKAYSIVANKGYAYSPHLVKYLE DVKTKKREKVVGKRIEVKSVPKAHYDIINEALIATVSQDNGTTRILRNPKYLVAAKSGSA QNSQSKTTHAWVAGYFPANDPEIVFTALLTAAGGGGAVAGGMTKKFMDKYDEMKNPPPKV EKMEETNNE >gi|224461432|gb|ACDD01000070.1| GENE 64 46447 - 48291 2271 614 aa, chain + ## HITS:1 COG:FN1210 KEGG:ns NR:ns ## COG: FN1210 COG0595 # Protein_GI_number: 19704545 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Fusobacterium nucleatum # 34 614 25 608 608 828 72.0 0 MNKVEDEKGGFDNIRKALKNIKSEIDELKSPKKKKTENVKTEIKKNNQKTTKKAVSSKKE DKMFVIPLGGLEEVGKNMTVLQYKDEIIVVDVGAIFPDESLPGVDLVIPDFTFLENNKEK IKGVFITHAHEDHIGAIPYLYEKIGKDIPIYGGKLTMAFVKSKFDNVGLSKKLPKMKEVT 
GRTKVKVGKYFTVEFVKVTHSITDSYSVSIKTPAGHVFHTGDFKIDLTPVDGDGVDFARL AELGDEGVDLLLSDSTNSEVEGFTPSEKSVGEAFKQEFMKAKGRIIIAVFASHIHRIQQI IDIAVKNHRKIAIDGRSLVKVFEIAPSVGCLNIPEGALVSLAEVDKLKDHKVVILCTGTQ GEPMAALSRIAKNMHKHIKVKEGDTVIISATPIPGNEKAASSNINNLLRFDAEVVFKKIA GIHVSGHGSKEEQKLMLNLIKPKHFMPVHGELRMLKAHMKTAMETGVSKNDILITQNGNK VEVTKNYVKINGKVTAGETLVDGLGIGDIGSAVIKDRQQLSQDGIVVVAYTIERKTGKII AGPEIATRGFVYMKDAEELIKEASDLLETKIPFSEKYLPKEWGLLKNNVKDCAAKFFYNK TKRNPMILPIITEI >gi|224461432|gb|ACDD01000070.1| GENE 65 48312 - 48617 224 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229884332|ref|ZP_04503793.1| predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Sebaldella termitidis ATCC 33386] # 1 98 1 99 106 90 49 2e-17 MKLTSKQRAFLKKKAHELNPIVRIGKDGLQETVIESILSAIDSRELIKVKILQNCETEKE EIYQQLLEETRFDVVGMIGRTIIVFKENKEKPVVSTELKSL >gi|224461432|gb|ACDD01000070.1| GENE 66 48633 - 50420 1929 595 aa, chain + ## HITS:1 COG:FN1208 KEGG:ns NR:ns ## COG: FN1208 COG1154 # Protein_GI_number: 19704543 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Fusobacterium nucleatum # 6 595 4 600 600 660 54.0 0 MAKVLDLEKKAIEIRKTLIQTVSQTGGHLAPNLGVVELTLALHHVFDFSKDKLLFDVGHQ SYVHKLLTDRKERFSTLRTRGGVGPFLDPTESSWDHFISGHAGTALAAAVGMAKAYPEKK IVVVIGDASIANGHSMEALNYIGGEKIKNILVILNDNEMSIGRNVGSLSKFLGKVMLSSP YLSLRKEIRSFVDKIQATSIKDTLERMEISVKNFLFPTNVAENFGYIFLGSIDGHNLEEL VGTFLKAKEMEGPLFLHVKTVKGKGYRFAEQNTEKFHGIAPFDLSTGVVANSSETYSNVF GTKMKEISKKDNSVFAITAGMLSGTGLKKMAEVFPERVLDTGIAEGFATTMSAGLAISGE KPYLCIYSTFLQRSFSQIIHDISLQNLPVRFIIDRAGIVGEDGKTHHGLHDLSFLLSVPN IVVLNPTTKEELEEMLNFSLGYQEGPMAIRIPRDVAYSLPMKSTWKIGTWQEVKTGKKTL LIAVGSMLKELLSLKVEGTIVAASSLRPLDKEYIKSQFEKYETIIVCEENYKEASFFQYL LNELDSMGIQRKLYSISLSSFIIGHGKRKELLEEYGLSGAKLLERIEEIVDGGKK >gi|224461432|gb|ACDD01000070.1| GENE 67 50404 - 51234 850 276 aa, chain + ## HITS:1 COG:FN1207 KEGG:ns NR:ns ## COG: FN1207 COG3481 # Protein_GI_number: 19704542 # Func_class: R General function prediction only 
# Function: Predicted HD-superfamily hydrolase # Organism: Fusobacterium nucleatum # 1 274 1 274 274 298 59.0 9e-81 MEERNNKSYYFIKELLQLDLVKALELYDDQGVKVSTHTYDVLNLSIEEILKQYKTLENAS KKLDFFAITVGVIIHDVSKASIREQEENLSHSQMMIKNPDYILKEVEEVLREVEEKTGLF LKKTIKKRISHIVISHHGRWGKIQPSTKEACIVYKGDMYSAKYHRINPIGADSILAYIEK GYSLEEICQKLNCTPGVVKDRLKRSRNELKLSTIGQLIHYYQKNKKVPLGDEFFVLRVEE TKKLKQLVDKQGFQELILQNPLIPYFEDEAIFKEKK >gi|224461432|gb|ACDD01000070.1| GENE 68 51243 - 52046 877 267 aa, chain + ## HITS:1 COG:FN1206 KEGG:ns NR:ns ## COG: FN1206 COG1189 # Protein_GI_number: 19704541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Fusobacterium nucleatum # 2 245 8 250 266 218 51.0 1e-56 MKERLDQILWKYGYVDSIEKAKRLIMLGSVIVNEQRIDKAGTLFKYSEEMNIRVKGQENP YVSRGGFKLKKAIDDFACSFQGKRVLDIGASTGGFTDCSLQEGAAYVYALDVGTNQLAWK LRQDPKVKSIEQCHVKELNWTLLDQEPVDYMVMDVSFISVCGIFRYLYPFLEENGKLLLL IKPQFEVEKHFLEKGIVYERKAHQEVLERVIKIAKENGFFLQNIEVSPILGGKGNVEYIS CFSKQQTSAVLELESILEKAKEMGGLK >gi|224461432|gb|ACDD01000070.1| GENE 69 52043 - 53326 1716 427 aa, chain + ## HITS:1 COG:FN1205 KEGG:ns NR:ns ## COG: FN1205 COG0793 # Protein_GI_number: 19704540 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Fusobacterium nucleatum # 12 418 2 423 427 504 63.0 1e-142 MKIVNKYMILFLLISSLCFAKEKNRVGFLTNLKELKEISDIMDIVNENYVDTGDHKFSRK TLMQGALKGMVESLEDPHSTYFTKAELESFEEDVRGKYVGVGMVVQKKANEALTVVSPIE DAPAFKVGIRPRDKVVSIGGVSTYNLTTEECVKKLKGKAGTSIAIKVQREGREKLLDFTL KRETIQLKYVKHRMLDSKIGYLRLTQFGENIYPDLRKALEDLQAKGMKALVFDLRSNPGG ALDQAIKVSSMFLKEGRVVSVKGRDGKEKISKREGKYYGDFPLVILVNGGSASASEIVAG AIKDNKRGMLVGEKTFGKGSVQTLLPLPDGDGIKITIAKYYTPSGVSIHGKGIEPDVPVE DKDYYLLFDGTITNVDEKENKASKKKLIQEIKGTKEAKKMDTHKDIQLNVAKGILEGILV GKGREKK >gi|224461432|gb|ACDD01000070.1| GENE 70 53326 - 54009 934 227 aa, chain + ## HITS:1 COG:FN1204 KEGG:ns NR:ns ## COG: FN1204 COG0313 # Protein_GI_number: 19704539 # Func_class: R General function prediction only # Function: Predicted 
methyltransferases # Organism: Fusobacterium nucleatum # 1 227 1 235 235 307 72.0 1e-83 MLYIVATPIGNLEDMTFRAVRILKEVEYIFAEDTRVTRKLLQHYEISTKLDRYDEFTKMK RIPDIIKLLEEGKNIALVTDAGTPCISDPGYELVDAALQAGIQVSPIPGASALTASTSVA GISLRRFCFEGFLPKKKGRQTLFKSLLEEERPIIIYESPFRLIKTLKDIENYLGNREVVI VREITKIYEEILRGRTKELLEKLENKTIKGEIVLIIKGVNDDVDDRD >gi|224461432|gb|ACDD01000070.1| GENE 71 53990 - 54880 1217 296 aa, chain + ## HITS:1 COG:FN1203 KEGG:ns NR:ns ## COG: FN1203 COG1161 # Protein_GI_number: 19704538 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 286 1 287 289 389 74.0 1e-108 MSMTEINWYPGHMKKTKDLIKENMPLIDVVLEIVDARIPISSKNPDIPVFAKNKKRIVVL NKSDLMEKSELSKWKEYFLKVEKADAVVEISAETGYNVKQLYACIDKVSKEKKDKLYAKG LKKVNIRIIVLGIPNVGKSRLINRIVGKNSAGVGNKPGFTKGKQWVKLKDGLELLDTPGI LWPKFENREVGFHLAMTGAIKDEILPLEEVACAFLSKMISLGKWNILQQRYKLLEEDYNE ITGYILEKIALRMAMLNKGGELNVKQAAYTLLRDYRSGKLGKFGVDILENSIGEEE >gi|224461432|gb|ACDD01000070.1| GENE 72 54877 - 55656 1217 259 aa, chain + ## HITS:1 COG:FN1202 KEGG:ns NR:ns ## COG: FN1202 COG0171 # Protein_GI_number: 19704537 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Fusobacterium nucleatum # 1 249 8 256 258 306 58.0 2e-83 MKSLEEKLVKFIQEQVKNAGFKKVILGLSGGIDSALVAYLAVKALGKENVIAIKMPYKTS SQESIEHANLVLKDLGLEDRTIEITPMVDAYFTNQAEASSLRRGNYMARTRMTVLFDQSA LENALVIGTSNKTEILLGYGTLFGDMACSFNPIGDIYKKDVWSLSRYMGVPKEIIEKQPS ADLWAGQTDEQELGLSYKEADEILERLVDKKQSLEEIVAAGYEEGIVNKVIQKVKASAYK RKLNPIAKVGEVLGRDFSF >gi|224461432|gb|ACDD01000070.1| GENE 73 55669 - 56028 557 119 aa, chain + ## HITS:1 COG:no KEGG:FN1201 NR:ns ## KEGG: FN1201 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 117 2 118 124 86 42.0 3e-16 MIDNHLKEEFQEYQKEKQQISEIIRKAEGRNNSQHKIISAIFVILIVAILILGIILNRLT LLQTLEIATLLAVLKVIWLFYDLQKTMHFQFWLLNSLEFRLNEIDKKARNIERTLKEEK >gi|224461432|gb|ACDD01000070.1| GENE 74 56267 - 57421 735 384 aa, chain + ## HITS:1 COG:CAC0707 KEGG:ns NR:ns ## COG: 
CAC0707 COG1508 # Protein_GI_number: 15893995 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Clostridium acetobutylicum # 49 382 94 462 464 134 31.0 3e-31 MKTSLGLEQKQILKLSQEMKLSLQVLSMPYRQVLNLWNGNMEKTYSSTEKSFFENLTEEK DFYTFLEEQLLYINPPKNIRENLIFCIYNLNSSGFLEFSDIELCKHLGISMKELKETYEY LYTLHPIGVGAHNFRECIRLQLQKKKEWNQKMEGILLHLQYIADGNSEKLRKKLGITKEE FQSFLNKIKNCNPIPARGYFIRKNLTIVPDFVLVLENEVWIAKENTELRDSLYSTQIIET PSIKLLRLCIEKRMDTLKKIVDYVVEYQKEYFKGENFLHTLHEKQVAHDLNLHLSTVSRA IQNKYLKTEKGIYSIKSLFCYEEKREKIKMEIERLIKYENYQKPYTDTQIQEELKSKFGK IPRRTIAKYRQELGYVSSFYRKIR >gi|224461432|gb|ACDD01000070.1| GENE 75 57430 - 58860 1593 476 aa, chain + ## HITS:1 COG:BH0992 KEGG:ns NR:ns ## COG: BH0992 COG3829 # Protein_GI_number: 15613555 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 21 473 20 451 454 300 39.0 6e-81 MLGKKGECGMDKQVYNVFRKIVESSYDGIYVTDGKGNTIFVNQAYEELTGLSIQKLQGKN MKELVEEGIYDESGSLQAIQSKEKITINQKLKSGKIIFITSSPVFDARGKIIYVVTNVRD MRELQRLEQKFLNTQKLAEKYKTELEFLKQKEKQNTNPQSKNKQMMNIIKLLETTARFDT SILLEGETGTGKTHLAKIIHDNSPRREQKFVEINCGAIPKELMESELFGYEKGAFTGADK NGKMGLFELANNSTLFLDEISELPLEMQVKLLKVLETGYIVRVGGIQPIPVDVRIITASN KCLKKQIEKNCFREDLYYRINVVRVHLPSIRERKEDIIMIANNFLKQFNEMYGLHKKISE DVYQAFLKYSWPGNIREIKNVVEQLVVISQVDEIKKEILPKELAFQEVFSDESGVILACQ KCMEKYLNMSLKEATNEFQKDVIEKLLLELKSQRKVAEKLGVNPSTITRKLQEKEI >gi|224461432|gb|ACDD01000070.1| GENE 76 59058 - 60506 2071 482 aa, chain + ## HITS:1 COG:AF0333 KEGG:ns NR:ns ## COG: AF0333 COG2368 # Protein_GI_number: 11497945 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Aromatic ring hydroxylase # Organism: Archaeoglobus fulgidus # 2 481 3 479 500 585 57.0 1e-167 MALMTAAQYEESLRKMNMEVYLFGKRVECPVDDPIIRPSLNSVKMTYELAQKPEYQELMT VISPLTGERINRFAHIHQSTEDLKNKVKMQRLLGQKTASCFQRCVGMDAFNACYSTTYDM 
DKALGTNYHEKLVTFLKYCQEKDLTVDGAMTDPKGDRGLSPSQQADPDLFLRIVERRENG IVVRGAKAHQTGIVNSHEVLVMPTISMTEKDVDYAVCFAVPVDTKGIKIIYGRQSCDTRK LEDGMLDRGNPKFGGHEALVVFDDVFVPEERIFMCGEYQFSGSLVERFAGFHRQSYGGCK VGVGDVLIGATALIADYNGTKKASHVKDKIIEMVHLNETLYACGIACSSEGHTTPAGSFE IDMLLANVCKQNVTRLPYEIGRLAEDIAGGIFVTMPSEADYNSPIVGKYVEKYLKAVATV PTYDRMKVLRLIENMMLGTAAVGYKTESLHGAGSPQAQRIMISRQSNLEMKKQLVKDILD IK >gi|224461432|gb|ACDD01000070.1| GENE 77 60527 - 60835 254 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453153|ref|ZP_05618452.1| ## NR: gi|257453153|ref|ZP_05618452.1| hypothetical protein F3_08841 [Fusobacterium sp. 3_1_5R] # 1 102 1 102 102 161 100.0 1e-38 MEEILELIGKEIQPSLQEHGGSLEIVLYEEEKQELQLRLMGQCCSCPHSIDTVENFIKVK LKEKFPKLKKISVNTGVSNELLEIAKNLLRKEDSIGKLERDL >gi|224461432|gb|ACDD01000070.1| GENE 78 60810 - 62117 1481 435 aa, chain + ## HITS:1 COG:FN0621 KEGG:ns NR:ns ## COG: FN0621 COG0427 # Protein_GI_number: 19703956 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Fusobacterium nucleatum # 1 431 1 431 434 603 67.0 1e-172 MENWKEIYKEKIVTADEAVKYIQSGNRVIFAHACGEAQIITEAMLRNKEEYEKVEIVHLV PMGTGEYAQEENQKYFYHNSFFGGGSTRKALNGTYGDFTPSFFFEIPKLFKKDGKLPVDV AIIQVSSPDEHGYCSYGISCDYTKGAAENAKIVIAQVNKYMPRTLGDSFIHLSEMDYIVA YDEPILQLVPPKIGEIEQKIGEYCASLIQDGDTLQLGIGAIPDAVLTFLKDKRHLGIHSE MISDGVVDLILAGVIDNSKKTIHKNKCVVSFLMGSPKLYDFVNNNPAVELYPVDYVNHPM IIAKNDNMISINSALQVDLMGQVNAESIGAKQFSGTGGQVDFVRGASMSKGGKSIIAMPS TAAKGRISKIVMNLDLGSTVTTSRNDVDYVVTEYGIASLKGKTLRERAKALIAIAHPDFR ESLMKQALEKFQILN >gi|224461432|gb|ACDD01000070.1| GENE 79 62127 - 63476 1392 449 aa, chain + ## HITS:1 COG:no KEGG:CD1965 NR:ns ## KEGG: CD1965 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 11 446 8 435 463 439 51.0 1e-122 MQVFTRKLGEEQPSWKLGLFRIRLPFIHYRFESSEALQAVLMCSTCLGAIPILTGVLGIP FELAWSMVIINGLLYNMHSFLGDPVVPGWITPSIPLTIAYLTQFEMGPTRIQALIALQLL VAFIFLIMGITGFAGKLMKIVPDSIKSGILLGAGFAAIIGEFVIEKGRFNLYPFSIAIGT IFSYFLLFSERFKELRKKYKFIDVLGKYGMLPAILISVVIAPLFHELPFPTIQIGQFIKI 
PEFSNIFHQVSVFGVGFPSIDLFIKAIPMSVMVYIIAFGDFVTSGELLREADEIRTDEKI DFNSNRSNLISGIRNLIQGIFIPYVPLCGPLWAAVSAAVFERYKEKTESSGTTSYGMESV YSGVGTFRWMTFICVTIFPIVSLLQPTLPVALSLTLLVQGFVCTRLAMMICTDKLDMGIA GVMAAVIAVKGAAWGLGVGVLLYLLLSKK >gi|224461432|gb|ACDD01000070.1| GENE 80 63515 - 63763 269 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453156|ref|ZP_05618455.1| ## NR: gi|257453156|ref|ZP_05618455.1| hypothetical protein F3_08856 [Fusobacterium sp. 3_1_5R] # 1 82 1 82 82 153 100.0 4e-36 MKILFCGGCNPLYNRVLVYEKVKDLKINSNILLLNGCHRGCKKISKDEKCINVQEFFTTH SSIENYSERDIIVWIYEQSKKK >gi|224461432|gb|ACDD01000070.1| GENE 81 63729 - 63848 72 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYGFMSSQRKNKKDSIVISKRGCIKVLIQPLCFIDIIEN >gi|224461432|gb|ACDD01000070.1| GENE 82 63845 - 64402 497 185 aa, chain - ## HITS:1 COG:FN0555 KEGG:ns NR:ns ## COG: FN0555 COG1396 # Protein_GI_number: 19703890 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 1 185 15 199 199 244 71.0 9e-65 MNKEINVGNIIRNIRLSKGLLIKEVAMKCDISSSMLSQIEKGNANPSLNTIKSIAQVLEV PLFKFFLDFEKNEDKINLLKKENRKIISTKNVRYELLSPKTATNIECMKMILTSKNAETS MYPMSHKGEEIAVLLEGKVEITVDLFSTIMFPGDSVHIPAQVPHKWKNLYDKESVIIFSV SPPEF >gi|224461432|gb|ACDD01000070.1| GENE 83 64660 - 66018 1821 452 aa, chain + ## HITS:1 COG:FN0554 KEGG:ns NR:ns ## COG: FN0554 COG2610 # Protein_GI_number: 19703889 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 1 452 1 449 449 629 84.0 1e-180 MSSTFVIASIFLSILLLIILTVKVKLHPFFALTISAFFFGIASGHSLTDIIGAYSDGLGG TIAGIGIVIAIGTVMGALLEQSGAAETMAESILKVTGNKNADIGLAITGYFVSIPVFCDS AFVLLSPLAKRISKDTGNSMTTMAIALAMGLHATHMLVPPTPGPLAVAGILGANLGMVIL FGMLVSIPVTIVAILVGRIFGKKYFFLPNEDIVEVDNGEENKKKLPSPFMSFAPILIPIF LMLLRTISNLESRPFGENYLFHITNSLGQTIIALFIGLIIAFFTYKSVYPKDKSVWTFDG IFGEALKTAGQIVLIVGAGGAFATVLKLSNLQEIVMNLFAGISIGIIVPYIIGAIFRTAI 
GSGTVGMITAASMLLPLVDVLGFNSPIGLVIAMLACAAGGFMVFHGNDDFFWVVVSTSGM KPEIAYKVFPIISILQSLVALLCVFILKMIFL >gi|224461432|gb|ACDD01000070.1| GENE 84 66067 - 67392 1674 441 aa, chain + ## HITS:1 COG:FN0553 KEGG:ns NR:ns ## COG: FN0553 COG3048 # Protein_GI_number: 19703888 # Func_class: E Amino acid transport and metabolism # Function: D-serine dehydratase # Organism: Fusobacterium nucleatum # 1 441 1 441 441 663 77.0 0 MDKKYLIEENTTIKKMANLEEIAWINKKEKDYLEYEKALPISDEELKEAEKRLNRFAPFI KKVFPETTDTNGIIESPLEPIFHMQKELEEKYETKIPGKLYLKMDSHLPVAGSIKARGGV YEVLKHAEDLAIAAGLLKLEDDYSILANSNFKEFFSKYKVQVGSTGNLGLSIGITSASLG FEVIVHMSADAKQWKKDMLRAKGVKVVEYESDYGKAVEEGRKSSDADPNSYFVDDEKSMN LFLGYTVAASRIQKQFEEKNIIINQEHPLIVYIPCGVGGAPGGVAYGLKRIFKDNAFVFF IEPTLSPCVILGMESELHEKINVHDIGIHGITHADGLAVASPSGLVGRLMEPILSGGFTV EDYKLYDYLRLLDKYENKKIEPSSCAAFEGPTFLLKHEETRKYMEKKVGKHIEGAYHICW ATGGKMVPEEDMKNFLETYLK >gi|224461432|gb|ACDD01000070.1| GENE 85 67403 - 68512 1224 369 aa, chain + ## HITS:1 COG:FN0552 KEGG:ns NR:ns ## COG: FN0552 COG3616 # Protein_GI_number: 19703887 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid aldolase or racemase # Organism: Fusobacterium nucleatum # 1 368 1 368 369 523 71.0 1e-148 MRKEDLKTPTILLNIGVLKDNIHRYQKLCNQYGKELWPMIKTHKSKGILEIQIEEGASGA LCGTLEEAEMCQKMGVKKIMYAYPVSTKENIQRVIELASKSYFIIRLDNEEGAKKINEMA IKEDVIINYTIIVDSGLHRFGVSVDNILGFAEKLKQFQNLKFVGISSHPGHVYSSTCSKD IEKYVEDECDTLHKAKSILEEAGYSIKYVTTGSTPTFDEAVKHPEINVYHPGNYVFLDSI QLSIGKAKKENCALTVYSSIISHPQEDLFICDAGAKCLGLDQGAHGNNSIIGYGTIIDHP ELVIVSLSEEVGKIKVTGQTDLKVGDKIEIIPNHSCSSANLCSYYTVVKNGQVTDSISVD VRGNSFTRT >gi|224461432|gb|ACDD01000070.1| GENE 86 68623 - 69126 701 167 aa, chain - ## HITS:1 COG:FN0724 KEGG:ns NR:ns ## COG: FN0724 COG0716 # Protein_GI_number: 19704059 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 167 1 167 167 171 55.0 4e-43 MNKIGIFYGTSGSTTLGIVDELEFQLRKENYQTYNVKDGIEAMKDYDNLILVTPTYGVGE LQPHWQKQYETLSKMDFHGKVVGLIGLGNQFAFGESFVGALRVLYDVIVKNGGKVVGFVS 
DKEYSHEETTSVIDGNFVGLPIDETNQGSKTPQRIISWLEVVKKEMK >gi|224461432|gb|ACDD01000070.1| GENE 87 69142 - 69822 834 226 aa, chain - ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 1 226 1 234 234 258 55.0 5e-69 MIFKRTELLIGKDKLKMLQNSHILLFGLGGVGGQAFEALVRTGIGEISIVDFDTVDITNC NRQILATQNTVGKYKTEVAIERALSINPTIKIHSYTERVSKDNVLSFFQNRQYDYIIDAI DTITAKLDIIQYAWEHQIPVISSMGTARKWNPSLLEITDIKKTSVCPLARVMRRELKKRG VNRCKVVYSKEEAKCLQEDTLGSIAFVPPVAGLLLVGEVVKDLCNL >gi|224461432|gb|ACDD01000070.1| GENE 88 70013 - 70720 910 235 aa, chain + ## HITS:1 COG:FN0435 KEGG:ns NR:ns ## COG: FN0435 COG0813 # Protein_GI_number: 19703773 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Fusobacterium nucleatum # 1 235 7 241 241 321 65.0 6e-88 MSVHIAAKLGEIAEIVLLPGDPLRAKWIAENFLENPICYSTVRGMYGYTGEYQGKRISIQ GTGMGIPSISIYVNELIQEYGVKTLVRIGSAGSYQEDVKIRDIVLAMSTCTDSSLNANRF PNANFAPTSDASLFLKAYQLAKEKNLSVHAGSILTSDEFYNDDPDTWKHWAKFGILCVEM ETAALYTLAAKFKVKALSILTISDSLVTKEATSSEERQTSFSTMVDLALGVVVNI >gi|224461432|gb|ACDD01000070.1| GENE 89 70717 - 72420 1564 567 aa, chain + ## HITS:1 COG:FN0734 KEGG:ns NR:ns ## COG: FN0734 COG1032 # Protein_GI_number: 19704069 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 1 564 1 564 568 1027 85.0 0 MKFLPTTREEMKKLGWDTLDVLLISGDTYLDTSYNGSVLVGKWLVKHGFRVGIIAQPEVD SPVDITRLGEPNLFFAISGGCVDSMVANYTATKKRRQQDDFTPGGINNRRPDRAVLVYSN MIRRFFKGTKKKIVISGIESSLRRITHYDYWTNKLRKPILFDAKADILSYGMGEMSMLAL ARALQQNEDWTEIRGLSYLSKEPKENYLALPSHADCLASKDVFTKAFHQFYLNCDPITAK GLYQKCDDRYLIQNPPSLTYTEKEMDAIYSMEFARDVHPYYKAMGAVRALDTIRYSVTTH RGCYGECNFCAIAIHQGRTVMSRSQSSIVEEVTEMTKLPKFKGNISDVGGPTANMYSLEC KKKLKLGSCPDRRCLYPKKCPSLQVNHRNQVDLLRKLKKIPKIKKIFIASGIRYDMILDD 
TQCGQMYLKELVQDHISGQMKIAPEHTEDSILSLMGKDGRSCLNEFKNQFYQLNQKLGKK QFLTYYLIAAHPGCREKEMVDLKRFASKELRVNPEQIQIFTPTPSTYSTLMYYTEKDPFT GKKLFVEKDNGKKQKQKDIVLDKKYRS >gi|224461432|gb|ACDD01000070.1| GENE 90 72429 - 73745 1800 438 aa, chain + ## HITS:1 COG:FN0170 KEGG:ns NR:ns ## COG: FN0170 COG1160 # Protein_GI_number: 19703515 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Fusobacterium nucleatum # 1 437 1 440 440 777 87.0 0 MKPIVAIVGRPNVGKSTLFNNLIGDRVAIVDDMPGVTRDRLYRETEWNGAEFVVVDTGGL EPANNEFMMTKIKEQAEVAMNEADVILFVVDGKAGLNPLDEEVAYILRKKQKPVVLCVNK IDNYLQQQDDVYDFWGLGFEYLVPISGAHKVNLGDMLDMVVDIIGKLEFPEEEEDILKLA VIGKPNAGKSSLVNRLSGEERTIVSDIAGTTRDAIDTLIEYKENRYMIIDTAGIRRKSKV EESLEYYSVLRAIKTIKRADVCLLMLDAQEGLTEQDKRIAGIAAEERKPIVIVMNKWDLV KNKDMKKYKEELYAELPFLSYAPIEFVSALTGQRTTKLLEIADTIYEEYTKRISTGLLNT VLKDAILMNNPPTRKGRLIKINYGTQVSVAPPKFVLFCNYPELIHFSYARYIENKFRESF GFEGSPILISFEKKSKEE >gi|224461432|gb|ACDD01000070.1| GENE 91 73820 - 74965 1506 381 aa, chain + ## HITS:1 COG:FN0053 KEGG:ns NR:ns ## COG: FN0053 COG1114 # Protein_GI_number: 19703405 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Fusobacterium nucleatum # 6 381 6 381 424 327 51.0 2e-89 MVKRKDVVFTGFALFAMLFGAGNLIFPPMLGHNLGSSWGIAALGFVVTGVGFPLLGLIAA VHTGPELDDFAKRVSPLFARSYITILILTIGFFLAMPRTGATAYEMTLQNVGDTNPIHKY IFLAFYFLITWMFSLRANKVVERIGSILTPVLLIILAVIMYQGIFHPFSMPQTVVLEEAP FKIGFIQGYQTMDTLATIVYSAVIMKSIRHGRNLSQEEESSFLWKSSLIAVGLLACVYGA LTYIGATFSGFETVGNTDLLSQIVRNLLGDFGNIILGLAVAGACLTTAIGLVATVGDYFE KILPFSYRTIVTVTCIAGFVFSNFGVQTIIQVAIPILVVLYPISMMLIFLNLLQKYMKND MVYRIIIVLTTMFGLYQAYSL >gi|224461432|gb|ACDD01000070.1| GENE 92 74984 - 75349 519 121 aa, chain + ## HITS:1 COG:BH3485 KEGG:ns NR:ns ## COG: BH3485 COG1393 # Protein_GI_number: 15616047 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus halodurans # 8 115 8 115 119 139 66.0 1e-33 
METLLIWYPKCGTCRNAKKWLDEHGIEVLTRHIVEENPTKEELKHFWELSSFPLKKFFNT SGILYRELGLKDKLKEMSEEEMLSLLSTNGMLVKRPILVQDKKVLVGFKEAEWKQFFNIA E >gi|224461432|gb|ACDD01000070.1| GENE 93 75369 - 76634 1692 421 aa, chain + ## HITS:1 COG:FN0053 KEGG:ns NR:ns ## COG: FN0053 COG1114 # Protein_GI_number: 19703405 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Fusobacterium nucleatum # 1 419 1 421 424 390 60.0 1e-108 MYTWKNVLLTGFALFAMLFGAGNLIFPPMLGKTLGDVWLTGTIAFILTGVGFPLLGIIST ALSGKKDINEFADKVSPLFAKIFFIALILAIGPLLAIPRTGATAYEITFLHAGVSSSLYK YVYLVLYFGITLLFSLKANKVVDRIGSILTPILLAMLFIIIVKGVSSPLGVPVAGTILTP FKNGFIEGYQTMDTLASIVFAGVILKSIRGDRELSPKQEFSFLIQVSIIACLGLSIVYGG LSFIGASVSGMGSELGKTELLVYLTTTLLGKSGYAILGICVAGACLTTAIGLVATVADYF SKITSLSYEILAVLTTIVSFIFACFGVDVIVKIAVPVLVFLYPLAMALILLNVFQIQNHF VFKGTCLGAGLISFYEMLGVLGVQNEFLANIYSFLPFSSLGFAWLVPAVLGGVLFRLIKK N >gi|224461432|gb|ACDD01000070.1| GENE 94 76743 - 76949 370 68 aa, chain + ## HITS:1 COG:no KEGG:FN0101 NR:ns ## KEGG: FN0101 # Name: not_defined # Def: glutaredoxin # Organism: F.nucleatum # Pathway: not_defined # 1 67 1 67 67 83 68.0 2e-15 MIRVYSKEDCAKCKNLKSILEGKGLDFEYIEDKKQLMIVASKARIMSAPVIEYQEKVYSM DDFLKVIA >gi|224461432|gb|ACDD01000070.1| GENE 95 76946 - 79204 2705 752 aa, chain + ## HITS:1 COG:FN0102 KEGG:ns NR:ns ## COG: FN0102 COG0209 # Protein_GI_number: 19703450 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Fusobacterium nucleatum # 1 752 1 755 755 1186 76.0 0 MNLERRKVINRDGIIEDLNIEKIREKLVRACAGLEVNMVELESKIESIYEENITTKKIQE SLINSAVSMTSFEESDWAEVAGRLLMMEAEREVYHSRGFSYGELEKTISLMLSYGLYDAR LSKYTKEEIYELNQAIVPERDMVYDYAGASMFVHRYLLKYSGKIHELPQEVFMIIAMLLS IYEKDKVKVAKEIYEGLSLRKISLATPILANLRIPNGNLSSCFITAIDDNIESIFYNVDS IAKISKNGGGVGVNISRIRAKGSMVNGYYNASGGVVPWIRILNDTAVAVNQQGRRAGAVT VAIDSWHLDMESFLELQTENGDQRGKAYDIYPQVVVSNLFMERVKSGADWTLVDPYEIRQ IYGVELCELYGVEFEEVYERIERENKIQLKKIMKARDLFKEIMKSQLETGMPYIFFKDRA 
NERNHNSHLGMIGNGNLCMESFSNFSPSKNFQEKIVGNVAIHEKEMGEVHTCNLLSLNLA EIMEEELEKYTSLAVRALDNTIDLTVTPLAESNKHNEKYRTIGVGAMGLADYLAREYMIY EESEEEISQVFERIAAYALKASAFLARDRGQYPAFVGSKWSQGIFFGKTQDWYEKHSKYS DVWKEVFYLVDQYGLRNGELTAIAPNTSTSLLMGATASVVPTFSRFFIEKNQSGATPRVV KYLKDRAWFYPEFKNVDPKTYVKITSKIGQWTTQGVSMELLFDLNKNVRAKDIYDTLLTA WETGCKSVYYVRTIQKNTNIMNEKEECESCSG >gi|224461432|gb|ACDD01000070.1| GENE 96 79197 - 80231 1264 344 aa, chain + ## HITS:1 COG:FN0103 KEGG:ns NR:ns ## COG: FN0103 COG0208 # Protein_GI_number: 19703451 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Fusobacterium nucleatum # 1 344 5 348 348 579 84.0 1e-165 MDRKRLFNPEGNDSLLERRIIKGNSTNLFNLNNVKFTWATQLYRTMMANFWIPEKVDLTQ DKNDYENLTVPEREAYDGILSFLIFLDSIQTNNVPNISDYVTAPEVNLLLSIQTFQEAIH SQSYQYIIESILPKESRDLIYDKWRDDKILFERNRFIAQIYQDFIEETSDRNFAKVLVAN YLLESLYFYNGFNFFYLLASRNKMVGTSDVIRLINRDELSHVVLFQKIIREIKAENPNFF QEEEIRMMFQTAVEQEILWTEHIIGNRVLGITTETTEAYTKWLANERLRTIGLAPMYDGF TKNPYKHLERFADTEGDGNVKSNFFEGTVTSYNMSSSIDGWDEF >gi|224461432|gb|ACDD01000070.1| GENE 97 80255 - 80560 418 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257467363|ref|ZP_05631674.1| ## NR: gi|257467363|ref|ZP_05631674.1| hypothetical protein FgonA2_07961 [Fusobacterium gonidiaformans ATCC 25563] # 1 101 1 101 101 186 100.0 5e-46 MNIVLDEKVEKYMQQKHLSALIIEMTPVGCSCVGIHNHAEPDYLETDKIAEYEKKESYEL YVWKEEIKVFIEKDLLPCNEISILGTYNPFNKRVYMHCEIK >gi|224461432|gb|ACDD01000070.1| GENE 98 80633 - 81712 1705 359 aa, chain + ## HITS:1 COG:FN1908 KEGG:ns NR:ns ## COG: FN1908 COG0584 # Protein_GI_number: 19705213 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Fusobacterium nucleatum # 1 359 1 353 357 582 79.0 1e-166 MNVKKVLVLASVLLSVSAYAESMEANGMHNKLIIAHRGASGYLPEHTLESKALAFAQGAD YLEQDLAMSKDGRLIVIHDHFLDGLTDVAKKFPDRKREDGRYYVIDFTWDELQTLEMTEN FSTENGVQKQVYPGRFPLWASHFRLHTFEDEIEFIQGLEKSTGRKVGIYPEIKAPWFHHQ NGKDIAKATLEVLKKYGYTKKSDLVYLQTFDYNELKRVKTELMPQMGMDLKLVQLIAYTD 
WHETEEKGKDGKWINYDYDWMFKKGAMKEVAKYADGVGPGWYMLVDENTSTLGNLKYTDM VEDIKTTKMENHPYTVRKDALPKFVKDIDEMYDALLNKSGATGLFTDFPDLGVKFVETK >gi|224461432|gb|ACDD01000070.1| GENE 99 81743 - 82189 390 148 aa, chain - ## HITS:1 COG:BS_ykmA KEGG:ns NR:ns ## COG: BS_ykmA COG1846 # Protein_GI_number: 16078380 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 2 147 3 147 147 132 51.0 2e-31 MDKYDVLKLENQLCFPLYAVAKEITRAYQPYLEPLHLTYTQYITMMVLWEQKKVSVKELG AYLYLDSGTLTPLLKKMEQKSWIRRIRSKEDERKVWIELTTEGEALKEKAVNIPKNMGKC INIDAKEAKQLYLILHHLLQNPHFQKNK >gi|224461432|gb|ACDD01000070.1| GENE 100 82214 - 82759 649 181 aa, chain - ## HITS:1 COG:FN2007 KEGG:ns NR:ns ## COG: FN2007 COG0386 # Protein_GI_number: 19705303 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Fusobacterium nucleatum # 1 181 17 197 199 236 63.0 2e-62 MNIYEFNVKNIKGEDISLQDYQGKVLLIVNTATACGFTPQYNDLENLYKKYQEKGLIILG FPCNQFGQQAPGTDYEISDFCSLNFGVSFPQFSKIDVNGETAHPLFQYLQSEKSFAGFDA EHKLTPILEDILSKEDPNFTEKSSIKWNFTKFLVDRNGKVLQRFEPTTDISKIDEIIKSV L >gi|224461432|gb|ACDD01000070.1| GENE 101 82761 - 82874 122 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQEILSLFGQPLALCLASTSVLILMSHIFIIDYKERV >gi|224461432|gb|ACDD01000070.1| GENE 102 83041 - 83856 1184 271 aa, chain - ## HITS:1 COG:FN0947 KEGG:ns NR:ns ## COG: FN0947 COG5266 # Protein_GI_number: 19704282 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 271 1 270 270 377 67.0 1e-104 MLSKKILFTGLALCISASAFAHFQLIHTTSSNITDKNTVPFELIFTHPGEGMEGHSMDIG KDEKGSIKPMEAFFSVHKEQKTDLKNKLVSSKFGPNGHQVQAYKFTFDKTTGLKGGGDWG FVAVPAPYYEASEEIYIQQVTKAFVNKDDISTDWDARIAEGYPEIIPLNNPTNLWVGQVF RGKVVDPEGKAVANAEIEVEYINADIQNSQFVGENKFENAAMVLRADEFGYFSFIPVHAG YWGFAALGAGGEKTHNGKELSQDAVLWIEAK >gi|224461432|gb|ACDD01000070.1| GENE 103 84293 - 84754 772 153 aa, chain + ## HITS:1 COG:FN1505 KEGG:ns NR:ns ## COG: 
FN1505 COG0054 # Protein_GI_number: 19704837 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Fusobacterium nucleatum # 1 151 5 155 157 232 78.0 2e-61 MHTLEGKYSGKGLRVGIVAARFNEFITSKLISGAEDALLRHEVEEKDITLAWVPGAFEIP LAAKRMANSGKYDCIITLGAVIKGSTPHFDYVCAEVSKGVAHIGLESNIPVIFGVLTTNS IEEAIERAGTKAGNKGFDVAMTGIEMANLLKDM >gi|224461432|gb|ACDD01000070.1| GENE 104 84759 - 85838 1261 359 aa, chain + ## HITS:1 COG:FN1506_2 KEGG:ns NR:ns ## COG: FN1506_2 COG1985 # Protein_GI_number: 19704838 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Fusobacterium nucleatum # 145 357 2 220 223 251 55.0 1e-66 MEDLEYMHLALELAKHGEGRVNPNPLVGAVVVKNGKIIGKGYHHEYGGPHAEVFALQEAG EEAKGATIYVTLEPCSHYGKTPPCAKKIIDSGIKRCVISMGDPNPLVAGKGISMMRDAGI EVEIGLCETEARALNRVFLKYISTKLPFLFLKCGITLDGKLATRNFQSKWITNEIAREKV QKLRNKYTGIMVGVHTVIEDNPSLDARIENGRDPYRIIVDPYLEIPLSSKLLHRHDKKTV IITSFLEKETQKKKELDDLETRFIFLEDRIFSWPQMLIEIGKLGIDSVLLEGGGQLISSA FREDVIDGGEIFIAPKILGDKEAVAFVSGFSKESMDEAITLPNVELHQYGNNCSMEFYR >gi|224461432|gb|ACDD01000070.1| GENE 105 85848 - 86495 873 215 aa, chain + ## HITS:1 COG:FN1507 KEGG:ns NR:ns ## COG: FN1507 COG0307 # Protein_GI_number: 19704839 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Fusobacterium nucleatum # 1 215 34 251 251 282 65.0 4e-76 MFTGLVEEMGRVLSITEGNHSMQIKIQCKKVLEGAKLGDSIATNGTCLTAVEIGKDYFVA DCMHETMKRTNLHRLKKSDFVNLEKSITLSTPLGGHLVTGDVDCEGKITNIRQDGIAKIY TVELPKYYMKYVVEKGRVTLDGASLTVMELGDSSLGVSLIPHSQEMIILGKKKVGDYINI ETDLIGKYVEKLLSFPKQEEKKSKLSLDFLAENGF >gi|224461432|gb|ACDD01000070.1| GENE 106 86508 - 87719 1692 403 aa, chain + ## HITS:1 COG:FN1508_1 KEGG:ns NR:ns ## COG: FN1508_1 COG0108 # Protein_GI_number: 19704840 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Fusobacterium nucleatum # 1 203 1 203 203 314 76.0 2e-85 
MLSRIEDALEDIKNGKPIIVVDDENRENEGDLFVAAERANYDAINLMAIEGRGLTCVPMS REWAERLQLLPMTAVNTDAKCTAFTVSVDYKYGTTTGISIGDRLTTILHLADSSSKAEDF TRPGHIFPLIAKDRGVLEREGHTEATVDLCRVAGLKPVAVICEILKQDGTMARMDDLEIF AKEHDLKIISIEDLIKYRKKNDELVKIEIKAQMPTAYGSFSIVGFDNQLDGKEHIALVKG DVKGKENVLIRVHSECFTGDILGSKRCDCGDQLHSAMKRIDKEGEGIILYLRQEGRGIGL INKLKAYKLQEEGLDTLDANLHLGFAGDLRDYGIAAQMLHALGVKSIRLLTNNPAKLEGL EEYGVKITGREEIEIHHNEVNEHYLLTKQLRMRHMLHVKKSEK >gi|224461432|gb|ACDD01000070.1| GENE 107 87758 - 89245 1713 495 aa, chain - ## HITS:1 COG:FN1121 KEGG:ns NR:ns ## COG: FN1121 COG4868 # Protein_GI_number: 19704456 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 494 1 505 506 794 77.0 0 MKIGFDHNKYLEEQSKFIAERVNHYDKLYLEFGGKLMFDLHAKRVLPGFDENAKIKVLSK LKDKVEVVICVYAGDIERNKMRGDFGITYDMEVFRLIDDLREHDLKVNSVVITRYEERPA TALFITKLERRGIKVYKHLATKGYPTDIDTIVSDEGYGKNPYIETERPIVVVTAPGPGSG KLATCLGQLYHEFKRGKSAGYSKFETFPVWNVPLKHPLNIAYEAATIDLADVNMIDPFHL EAYGETTVNYNRDIEAFPLLKRIIEKITGEESIYKSPTDMGVNRVGFGIIDDAVVQEASK QEIIRRYFNAGCEYKKGYIDYPIFQRAELIMRNLNLTEEDRKCVAAARNKAKTSGILSAV ALELQDGSIITGRQSELMDATSAAILNAVKHLADFDDKLLLLSPVILEPILTLKEKTLNH KNVPLDCEEILIALSISAATNPMAASALSKLQELKGVQAHCTHILAKKDEQTLKKLGIDI TCDQVFPTENLYYNS >gi|224461432|gb|ACDD01000070.1| GENE 108 89345 - 90061 280 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 228 1 226 305 112 32 8e-24 MSSILKIENLHVFYDNIHALKGISLEVHEGEIVSLIGANGAGKTTTLQTISGLIQAKQGT IHFRDKDIMKQKPEQICKLGIAQVPEGRRIFSRLPVKDNLKLGQYIIKDSGENKEKDRAQ FYSIFPRMSERKNQLAGTLSGGEQQMLAMGRAIMSRPKLLILDEPSMGLSPLFVKEIFNV IKKLNEMGTTILLVEQNAKMALSISDRAYVIETGKITLEGNAKELLKNPEVKKAYLGA >gi|224461432|gb|ACDD01000070.1| GENE 109 90077 - 90874 245 265 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 9 250 1 231 245 99 25 9e-20 METINKNAILSAHNISIQFGALKAVSDFNLEIYPGDLVGLIGPNGAGKTTVFNVLTGVYP 
ASSGEYHFNGNLIKNSSTSKLVTQGLARTFQNIRLFKYLSVLDNVMVAHNFSMKYGIFSG MLRLPSCWKEEKEIRKKSMNLLKIFHLDKFANQAAGNLPYGEQRKLEIARAMATNPKLLL LDEPAAGMNPTETEELMKTIKFIRDTFGIAILLIEHDMKLVLGICEKLVVLDHGTIIASG DPQEVINNPQVVTAYLGQDNTEEEE >gi|224461432|gb|ACDD01000070.1| GENE 110 90858 - 91847 1379 329 aa, chain - ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 42 321 1 280 285 376 78.0 1e-104 MEKSKKINYIISFLLLFLIYLGMTLMIHSSIFSRYQLSVIILICINIILAVSLNITVGCL GQITIGHAGFMSVGAYTAALFSKSALVSGVPGFFLALLLGGIVAGVVGIVIGVPALRLNG DYLAIITLAFGEIIRVLIEYFDFTGGPQGLRGIPKFNNFDIIYWIMVFSVILMFSLMTSR HGRAVLAIREDEIASCASGINTTYYKTFAFTLSAIFAGIAGGIYAHNLGVLGAKQFDYNY SINILVMVVLGGMGSFTGSILAAIVLTLLPEMLREFSDYRMIVYAVILIFMMIFRPKGLL GREEFQLSLALTWCKQKLRIGGKKNGNHQ >gi|224461432|gb|ACDD01000070.1| GENE 111 91851 - 92735 989 294 aa, chain - ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 1 294 14 308 308 411 81.0 1e-115 MEFLLQIINGLQIGSIYALVSLGYTMVYGIAQLINFAHGDIIMVGAYTSLFSIPIFQKMG LPIWATIFPAMIICALLGMLTEKIAYRPLRNSPRISNLITAIGVSLFLENIFMKLFTPNT RAFPKVFSQVSIHLFGISFNYGSVITILLTLALSIALHLFMKNTKYGKAMLATSEDYGAA TLVGINVNFTIQLTFAIGSALAAIASVLYVSAYPQVQPLMGSMLGIKAFIAAVLGGIGIL PGAVIGGFILGIIESLTRAYLSSQLADAFVFGILIIVLLIKPTGILGKNIKEKV >gi|224461432|gb|ACDD01000070.1| GENE 112 92760 - 93911 1775 383 aa, chain - ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 2 380 3 381 383 518 69.0 1e-147 
MKKWSYGMLAAALLLTACGGEKKELSQGAETNTIKLGAAGPLTGALAIYGVSATNGTKLA IDEINKNGGILGKQIELNLLDEKGDTTEAVTAYNKLMDWGMVAYIGNVTSKPSVAVSELA AEDGIPMITPSGTQFSITEAGKNIFRVCFTDPYQGEVLATLASEKLHAKTAAVLINNSSD YSDGVAQAFLKKSQEKGIQVVATEGYSDGDKDFKAQLTKLLPLNPDVIVVPDYYEQDALI ASQAREIGLTSQFIGPDGWDGVIKTLDSSSHDVLEGALFTNHYAIDDSNEKVQHFVKAYR DSYQDEPSAFSALSYDAVYMLKDAIETVGSTDKEAVAKALREISFDGVTGHLTFDENNNP VKAVTIIKVENGKYKFDSVLEAK >gi|224461432|gb|ACDD01000070.1| GENE 113 93990 - 94106 92 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYIKEMDSCKVFFDIFSKNKKNALTGIKKIILKTVLS >gi|224461432|gb|ACDD01000070.1| GENE 114 94081 - 94866 985 261 aa, chain - ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 257 1 257 265 183 40.0 3e-46 MIKLVASDMDGTLLNEQGNIPSHFWEIEKNLEEKQILFCAASGRQYFNLELLFSSIKNNT IFLAENGSLVIFRDKVLFENSMSKKDLKEWLQIASSLQNVFPVFCGKNSAYIEKTENETF LTEVKKYYHKLEMVDSLEEISENMLKLAICDLNGSETNSYPHYKKFNAEYQVVVSGGIWL DIMNQSTNKGVALEKIKEFFEIKYDELLVFGDYLNDYEMMSCGKYSFAMENAHPKLKEKA NYVTKSNKDEGVLFTIKQFLK >gi|224461432|gb|ACDD01000070.1| GENE 115 94859 - 96166 1163 435 aa, chain - ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 11 431 13 436 440 294 44.0 2e-79 MKSLTKKIFQFAIPSITSMWIFTLYTIVDGIFIGKYVGPLALGAANLAMPIFNLSFGIGI MIAVGASTLISIAFSQKNFKQGNYYFNLASFFAFLLGTCLSLFCFFALKSIVTFLGANDN LFPYVYEYVRIILFFFPFYLCGYGWEIYIKVDGNAVYPMFCVLLGAGINIALDYIFLAIF HTGVQGAALATGLAQTITSLALLAYIIKYSKNFSFQKVHIYGKNILCILKTGFSEFFTEI SSGILILIFNHFLFFYLGERGIISFSAISYLSSLVIMTMIGFAQGIQPILSFSYGKKSKK EILHIFNISILSIIVLGIFFLLFACFFSQNLVKYFLSIETETLVTSVALKKYSISYLFMG LNILFSAFFTALKKAKFSLLITFCRGIFLPIIALFSTPFLLGKENLWFAATISEGMTFLI SFYLYQNYKKELLHD >gi|224461432|gb|ACDD01000070.1| GENE 116 96244 - 96444 324 66 aa, chain - ## HITS:1 COG:FN0528 
KEGG:ns NR:ns ## COG: FN0528 COG1278 # Protein_GI_number: 19703863 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Fusobacterium nucleatum # 1 66 6 71 71 102 93.0 2e-22 MKGTVKWFNKEKGFGFITGEDGKDVFAHFSQIQKEGFKELFEGQEVTFDITEGQKGPQAS NIVIVK >gi|224461432|gb|ACDD01000070.1| GENE 117 96760 - 98100 2069 446 aa, chain + ## HITS:1 COG:SPy1150 KEGG:ns NR:ns ## COG: SPy1150 COG0446 # Protein_GI_number: 15675127 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 1 446 1 455 456 583 65.0 1e-166 MEKIVVVGANHAGTAAINTILDNYKDKELVVFDRNSNISFLGCGMALWIGGQISSGDGLF YSSKEILEGKGAKIHMETEVYNIDFENKFVYAKGVQDGKEYRESYDKLILSTGSLPIQLP VPGTELENVQFVKLYQNAKEVIEKLNTNKEIKHVTVVGAGYIGVELAEAFKRWGKEVCLV DFCEDCLSTYYDKNFRDMMDQNLADHGIELRYGQLLKEIKGNGKVESVVTDKEEFKTDMV VLCVGFRPNTALAKDQLETFRNGAYKVDKTQKTSKDGVYAIGDCATVYDNTIDDINYIAL ATNAVRSGIVAAHNVSGTPLEGIGVQGSNGISIYGLNMVSTGLTFEKAQRLGIKVGETTY TDLQRPEFIETKNEPVTIRIVYNLDTRVILGAQIASREDISMAIHMFSLAIQEKVTIDKL KLLDIFFLPHFNKPYNYITMAALSAK >gi|224461432|gb|ACDD01000070.1| GENE 118 98186 - 98791 421 201 aa, chain + ## HITS:1 COG:CAC2437 KEGG:ns NR:ns ## COG: CAC2437 COG1853 # Protein_GI_number: 15895702 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Clostridium acetobutylicum # 2 190 3 191 199 223 58.0 1e-58 MKKNFKPSVMLNPVPVVLITSRNKQGEENVFTVAWTGTVCTKPPMLSISIRPERLSYEYI KETLEFTVNLPTKSLVKAVDYCGVRSGRKENKIKNMGFHLKRGEKVSTSYIEECPIALEC KVTQIIPLGTHHLFLAEVVSCFVEDSLIDKENKIHFEEANLITYSHGEYYPSVKKSIGNF GFSVRKKKRKNTCILNKKGIK >gi|224461432|gb|ACDD01000070.1| GENE 119 98887 - 99369 557 160 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2038 NR:ns ## KEGG: Lebu_2038 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 159 1 159 161 192 59.0 3e-48 MLPSFYGVIEVKHYHQGRLRIQTNSLIQNPELEMELLQNIKQIEGIESVKINDKIGSVLI 
LFQETKIEASFLYLIILKMLHLEEEAFRKKPGKLKLLCRNVLEAVDFSIYNKSKGLLDGK LIVSSIFLYYGVKKLRVTPQLPSGATLLWWAYNLMIKGKE >gi|224461432|gb|ACDD01000070.1| GENE 120 99372 - 99761 404 129 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2039 NR:ns ## KEGG: Lebu_2039 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 122 1 122 123 147 60.0 1e-34 MLKNLLKTTYFMFHQLKIVHSIPGRLRLTVPGLSAIPEEMRKHEHYTTELILSKEGIQSI EYSYLTNKVLIHYDPSLITDKEIVSWLNAVWKIIVDHSDLYEKMTLGEIEKNLDKFYELL KKELRRGEL >gi|224461432|gb|ACDD01000070.1| GENE 121 99762 - 101948 2637 728 aa, chain + ## HITS:1 COG:FN1190 KEGG:ns NR:ns ## COG: FN1190 COG2217 # Protein_GI_number: 19704525 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 726 1 732 735 999 75.0 0 MSNKNYLLNCEIKHRIRGRIRIKSRALKYLGTLKEEVESQLMQVRYIENAKISEMTGSIV IYFEDITLTDQNLISLLQNTLNAYLVEIYKNEKTVTGSKYVIERKLQEESPKEIIQKIVA SSTLLAYNIFRPSVSTAVGMARFLNYNTLATLSLAMPVLKNGILSLIKNRRPNADTLSSS AILSSIALGKEKTALTIMILEEFAELLTVYTMKKTRGAIKDMLSVGENFVWKEMEDGSVK RIPIEEVEKGDLILVQTGEKISVDGLIRKGEALIDQSSITGEYMPVTKKQGEEVFAGTIL KNGSITVEAQKVGDDRAVSRIIKLVEDANFNKADIQSYADTFSAQLIPLNFLLAGIVYLG TRNVQKALSMLVIDYSCGIRLSTATAFSAAINTAAKNGILIKGSNYIEELSKSDTVIFDK TGTITEGKPKVQTLQVFGKRMKEDKMLSLAAAAEETSSHPLAVAILNEMKDRGLNIPKHQ DTLIVVAKGMETKVGKDMIRVGSRKYMEENNISLEESQEVVRGILHRGEIIIYVARNEEL IGVIGVSDPPRENIKKAINRLRNQGIDDIVLLTGDLRQQAETIASRMSMDRYESELLPED KAKNILKFQSGGSKVIMIGDGINDAPALSYANVGVALGSTRTDVAMEAADITITSDDPLL VPGVVGLAQKTVKTIKENFAMAIGMNSFALVLGATGILPAIYGSVLHNATTILVVGNSLK LLKYDVNK >gi|224461432|gb|ACDD01000070.1| GENE 122 101970 - 102722 602 250 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2041 NR:ns ## KEGG: Lebu_2041 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 247 1 247 248 287 57.0 2e-76 MKKLTLTILHKLPNRIRFQVSERIRDLKSFAHSLKCDNSKIRLRYNFRTNTLLVEFNPDE IYLQEVIYRVVTALSIENGMLPVRLIEEYESKSLNSLSVYSGAAIMISFLHSLKQATNTT 
LQTTMNHFALALTTTALAEHAYSETKRKGFFDIELVPALYLIKSYFDNNSISSIALMWLT TFGRHLIVNNSSSKEIKVFRLKDKDGQYHYIADVREDNSIENLSDLVHHVFFNKKKMNKN TEKYVTISMK >gi|224461432|gb|ACDD01000070.1| GENE 123 102745 - 103041 548 98 aa, chain + ## HITS:1 COG:no KEGG:Lebu_2042 NR:ns ## KEGG: Lebu_2042 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 9 87 5 83 95 105 68.0 6e-22 MFGFGNGMGNFKREHCVGIAIGVGVAAVGYYLYKKNQDKVDNFLRKQGINVKTSSSTNYE AMDLETLTEMKEHIEDVIAEKELSAGAVTECDVTCANN >gi|224461432|gb|ACDD01000070.1| GENE 124 103103 - 103807 586 234 aa, chain + ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 22 232 1 214 216 196 50.0 3e-50 MIENETIVCYTIKNMLNWVDSMKNYVILKGKKDRLEIQLNGEVDFITLRNSMIEKMKEAK NFIGEGKMAIEFTGRDLSELEENVLIDLIRLHSNLNIVYVFSGEKIKEVNRFSLFHSISE EGPTKFFRGTLRSGSKLEYDGNLVILGDVNPGSLIKASGNVLVLGHLNGTVYAGIEDSNN SFVAAMFLNPVKLIIGNKVSKVLQKEILDTNRVKKGSFQIAQVKQGEIVIEEWR >gi|224461432|gb|ACDD01000070.1| GENE 125 103809 - 104600 1229 263 aa, chain + ## HITS:1 COG:FN0176 KEGG:ns NR:ns ## COG: FN0176 COG2894 # Protein_GI_number: 19703521 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Fusobacterium nucleatum # 2 262 3 263 264 390 78.0 1e-108 MSQVIVVTSGKGGVGKTTTTANIGAGLAEKGHKVLLIDTDIGLRNLDVVMGLENRIVYDL VDVIEGKCRIPQALIKDKRCSNLSLLPAAQIRDKNDINEEQMKTLIEVLRKDFDYIIIDC PAGIEQGFKNAIAAADRAIVVTTPEISATRDADRIIGLLEANGIKDPKLIVNRIRMDMVK ENNMLSVEDMLDILAIGLIGVVPDDESIVISTNKGEPLVYKGETLAAKAYRNIVERIEGK EVDFLNLDVKMGFFDRLKFIFRG >gi|224461432|gb|ACDD01000070.1| GENE 126 104606 - 104863 400 85 aa, chain + ## HITS:1 COG:FN0177 KEGG:ns NR:ns ## COG: FN0177 COG0851 # Protein_GI_number: 19703522 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological 
specificity factor # Organism: Fusobacterium nucleatum # 1 80 1 81 99 101 72.0 3e-22 MGLFDFFKKNNSKDEAKSRLKLVLMQDRAMLPSGVMERIKDDIIQVLSKYVEIDQEQLNI EMSNCDDDPRQIALLANIPIRQKNK Prediction of potential genes in microbial genomes Time: Fri May 20 02:24:25 2011 Seq name: gi|224461431|gb|ACDD01000071.1| Fusobacterium sp. 3_1_5R cont1.71, whole genome shotgun sequence Length of sequence - 13496 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 237 - 267 -0.5 1 1 Op 1 . - CDS 418 - 990 754 ## FN1315 hypothetical protein 2 1 Op 2 . - CDS 1001 - 1777 750 ## COG0327 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1774 - 2649 1151 ## gi|257453203|ref|ZP_05618502.1| hypothetical protein F3_09093 4 1 Op 4 31/0.000 - CDS 2691 - 4001 1815 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 5 1 Op 5 1/0.000 - CDS 4011 - 5837 1778 ## COG0358 DNA primase (bacterial type) - Prom 5867 - 5926 9.8 - Term 5886 - 5934 11.1 6 2 Op 1 1/0.000 - CDS 5958 - 7607 2277 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 7 2 Op 2 1/0.000 - CDS 7623 - 9008 1805 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 9038 - 9097 3.5 8 2 Op 3 1/0.000 - CDS 9106 - 10107 1096 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 9 2 Op 4 1/0.000 - CDS 10104 - 10772 862 ## COG0125 Thymidylate kinase 10 2 Op 5 15/0.000 - CDS 10754 - 11908 1323 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 11 2 Op 6 32/0.000 - CDS 11905 - 12729 836 ## COG0575 CDP-diglyceride synthetase 12 2 Op 7 . 
- CDS 12719 - 13411 674 ## COG0020 Undecaprenyl pyrophosphate synthase - Prom 13436 - 13495 5.0 Predicted protein(s) >gi|224461431|gb|ACDD01000071.1| GENE 1 418 - 990 754 190 aa, chain - ## HITS:1 COG:no KEGG:FN1315 NR:ns ## KEGG: FN1315 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 14 186 4 175 177 102 38.0 7e-21 MRKYLLSLCMLCFSCIAWAEVNDTLNLIPKKEIPKIEEKIHEIYNKKKVKVYVNTLTEGE GFQVADPERTVILNISRDKTTQVKVTLRFSKDIDIEEEQSKMDLSLDNASSILIGGKPGE YILQVLDGVEYLLENVEISEPQILMQKAEEKAEFQKGIFISLGVILLLLLKIGYDFLKKK KAKQEKIITK >gi|224461431|gb|ACDD01000071.1| GENE 2 1001 - 1777 750 258 aa, chain - ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 256 1 257 258 250 54.0 2e-66 MKTKDFINILEKKYPKNLAEDWDNVGLLVGDEEKDLQKILFSLDVTEEVIDYAIKNSFDM IISHHPIIFRGIKRVLKQDALGTKIFKLVKYGINVYTLHTNLDAQIEGLNDYLLEKIGIS NSSILEKREDGTGIGRIFKYPEGKLISEIQEELSNYLKLSFQRYIGKNRNKKVYRACLVN GSGMSYWRMAQSRGVELFITGDVSYHDALDAKESGMDIIDIGHYEAERFFAELLMKNLQE TSLNFEIFDSKPVFQLIK >gi|224461431|gb|ACDD01000071.1| GENE 3 1774 - 2649 1151 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453203|ref|ZP_05618502.1| ## NR: gi|257453203|ref|ZP_05618502.1| hypothetical protein F3_09093 [Fusobacterium sp. 
3_1_5R] # 1 291 1 291 291 423 100.0 1e-117 MKQKEFEELLLNKRFSDDEFFEYLQKNTLKDIEFEVMDEQVAIEKKDFTLLESSVLEYIE ELCSYDPDHLSEEREAHIALEVKKVLYYAFFYFKEGISYMDLVQEGIVGLMKGVDRQSER LDFWIIREIFLFVYSEIQDLKFGFKNFLKGKREEAEHHHEHEHHHDHEEDHECSCGHDPN EEHECCGKHHHKEEEEEILDKNQILEKLLKSNTAIDEMEQIIEESLDFHHIKNRLYAIEI EVLNYYFGLLVEKRYSIFEIEEKFQLQKNHAQNIFENAMYKLSTLKGKLEL >gi|224461431|gb|ACDD01000071.1| GENE 4 2691 - 4001 1815 436 aa, chain - ## HITS:1 COG:FN1318 KEGG:ns NR:ns ## COG: FN1318 COG0568 # Protein_GI_number: 19704653 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Fusobacterium nucleatum # 116 436 23 331 331 412 80.0 1e-115 MREFIKNEKVLSLIRKAMKEKVITYEEINDELKEDFPLEQIDKLISGMIEQGIEIKKKAS LEKEKKEKTKKKTTKIETKEASKTVTKKRKKKEEEVEEVSKMKEEEEEFQDTDLSFEEIP EEENLEDLLEKEEDFDASELEEIPEEELTNEELAELSNGMKVDEPIKMYLREIGQIPLLT HKEELELAKKALEGDEFANKRLIEANLRLVVSIAKKHTNRGLKLLDLIQEGNIGLMKAVE KFEYTKGYKFSTYATWWIRQAITRAIADQGRTIRIPVHMIETINKIKKEARIYLQETGKD ATPEILAERLGMEVEKVKSIQEMNQDPISLETPVGSEEDSELGDFVEDQKMLTPYELTNR SLLREQLDSVLGSLSSREEKVLRYRYGLDDGSPKTLEEVGKIFKVTRERIRQIEVKALRK LRHPSRRKKLEDFKVE >gi|224461431|gb|ACDD01000071.1| GENE 5 4011 - 5837 1778 608 aa, chain - ## HITS:1 COG:FN1319 KEGG:ns NR:ns ## COG: FN1319 COG0358 # Protein_GI_number: 19704654 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Fusobacterium nucleatum # 2 601 3 599 603 483 46.0 1e-136 MFRQEDIDRLMEQLNIVDVVGEFVELKKSGANYKGLCPFHADNNPSFSVNPQKNICKCFV CGAGGNPITFYSKYKKISFQEAVRELAKKYHIPLQEIKQNKEENEKFERYYKIMEEAHQY FSHLIFENIGREALEYLVKRKVGPKLIQENNLGYASPSWDSLFNHLIELGYQSEELELLG LVKRRENGQYYDVFRNRVIFPIYSIQGRVIAFGGRTLEQDKEIPKYINSPDTPIFKKGKG LYGLERVSGIKQKNYAMIMEGYMDVLSTVSYGFDTSIAPLGTALTKEQVKLLKRYTENVI LCFDSDNAGQMAAERAIFLLKEEGFNIRVLQLKGAKDPDEFLKKFGKEAFLQEVQACLEA FDFLYSYYKKEYALDDIMSKQKFVERFQEFFHSLSKELEQELYLHKFSDLLGMDVQVLRP LLFPKGTRNFSHLRERKQEEEIFVQDFQELYSVLEKLSVQISILDLQQHKESEEAKYYHF LKDMPFQFDFTRKIFQDLEKYEQGRDTQSRNTILESIREIFTKDNYTIEEKEHLFELLSE 
CLERDKIEEQKHIVCRDWAREMFKNTVVTDPFKQLRLKQLEQKILKNKKDMGQFLEMYQQ YLKLREEK >gi|224461431|gb|ACDD01000071.1| GENE 6 5958 - 7607 2277 549 aa, chain - ## HITS:1 COG:FN1320 KEGG:ns NR:ns ## COG: FN1320 COG0760 # Protein_GI_number: 19704655 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Fusobacterium nucleatum # 186 547 1 355 356 142 30.0 2e-33 MAIRKFRKIMKLVIFIVAIAMIGSGAWLTFTNLLQHHSAGETQYAYQLNGEKVSKVKIAR EENNLMEQLNKMGQGKTSKELVSLIAFQKVINDELTLQLAEDMKIKVPSSEIKEEYEKIE NSIGNKEQFKRMLSVQGYSKKSFKAMLEENLLLQKVMEKFAEEAKKSGKDGNLLFQEALA KKRNEMKIDKLSPEYEKLQLKVVEEKDGFKITNVDMADRVTQLMLMTGEEEAKVTEEVKK QFEEGIAFAKKAQEKGVLISKDLPINVQLAEYGKAFFEKLKSEVKIDEVELSRFFQANHN RYNQHASIDVDVAVLKIVPSKEDIAAIEKKAEETLKSLKKENFAKIGADLQKKSPETVIY EELGWFEKGAMVKEFEEAAFSSKEAQIYPKVISTQFGKHLLYIQEVQENKVKAAHILFRE VASQASIDKSLKEAESIKEKLDKKEVTFETLKNINKNLLFAHTFTGVDKSGVIQGFVTDK ALVDTMYAAEMNKVQIYSDDYAKKAGIIYLFSKTKQEEDKIVSLEEVQDRVRDEYRSWRA QQELQKIMN >gi|224461431|gb|ACDD01000071.1| GENE 7 7623 - 9008 1805 461 aa, chain - ## HITS:1 COG:FN1321 KEGG:ns NR:ns ## COG: FN1321 COG2204 # Protein_GI_number: 19704656 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Fusobacterium nucleatum # 1 461 9 469 469 612 72.0 1e-175 MKNAILAISEKKDTLKQIRKELSEKYEVITFNNLLDAIDMLRESDFDLVLLDEYLTWFSL SDAKKKLSSIGKDFATIALFDDITPDKLKEIKQAGIYSYLPKPVLVSDIDKVILPVLHNL ELVKENKKMTEKLTELEHETEIIGQSPKIKEVKNLIDRVADSDMPVLISGEKGVGKLVIA REIYKKSDRKKQDYIQVSCATIPEENLEKELFGYERGTFIGANTSKKGLLEEIDGGTIYI EDIALMDLKVQSKLLKVIEYGELRRVGGTKVRRVNVRFIIGSDIDLKEETEQGRFRKDLY HRLTAFLIVVPPLRERKEDVPLLVSYYLNRIVKELHRETPVISGEAMKYLMEYSYPRNIR ELKNMVERMALVSNEKILDVEDLPLEIKMKSATLENKTVVGVGPLKDILEQEIYSLDGVE KVVIASALQKTRWNKQETSKLLGIGRTTLYEKIRKYGLDIK >gi|224461431|gb|ACDD01000071.1| GENE 8 9106 - 10107 1096 333 aa, chain - ## HITS:1 COG:FN1322 KEGG:ns NR:ns ## COG: FN1322 COG0750 # Protein_GI_number: 
19704657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Fusobacterium nucleatum # 1 333 1 338 339 326 54.0 4e-89 MTVLIAIVVLGIIILVHELGHFATAKLFHMPVSEFSIGMGPQVYSYETSKTTYSFRAIPL GGYVNIEGMEIDSEVEGGFASKPAYQRLIVLVAGVCMNFLFAMTLLTALYFHLGNAEYSK EPIVGAVIEESPAVQYLQAEDRIVQIEGVSILTWEDIGKNIQNKEKIEVLVERGEEEKSF QIPLIQKENRSFLGVYPKIIKSSYSFGQSFLKANSSFINIISDMGKGLWKMVRGEISVKE ISGPIGILQVVGEASKQGIVSVLWLSVFLSINVGLLNLLPLPALDGGRILFVLLEILHIP FSKKIEENIHKIGLFLFLTLIFFISIQDVLHLF >gi|224461431|gb|ACDD01000071.1| GENE 9 10104 - 10772 862 222 aa, chain - ## HITS:1 COG:FN1323 KEGG:ns NR:ns ## COG: FN1323 COG0125 # Protein_GI_number: 19704658 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Fusobacterium nucleatum # 1 222 1 222 225 260 60.0 1e-69 MGKIIVIEGTDSSGKETQSHLLLEHFLSLGRKARRLSFPNYESPACEPVKMYLAGEFGLN AEKVNPYPVSTMYAIDRYASYQKDWGYDYQQEESIFVADRYVTSNMIHQASKLEGKEKEE YLIWLETLEYKQFEIPRPDCIIFLDMPTKQAQELMKKRANKITGEEEKDIHERNREYLEK SYRNACEMAEKYGWTRISCVDGDRIKSIQEIRNEILEKVREI >gi|224461431|gb|ACDD01000071.1| GENE 10 10754 - 11908 1323 384 aa, chain - ## HITS:1 COG:FN1324 KEGG:ns NR:ns ## COG: FN1324 COG0743 # Protein_GI_number: 19704659 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Fusobacterium nucleatum # 1 382 4 390 390 431 58.0 1e-120 MKRIVVLGSTGSIGKSSLEVVRGNADLFQIVGLSGHRNMELLKQQIKEFHPKYVTVGYWE AYQELKTIFPEIQFFYGEQGLEELASVEDYDILLTAVSGAVGIRATVKGIEKEKRIALAN KETMVAAGSYINDLLKRYPKAEIIPIDSEHSAIFQSLQGNDKKEVKRLIITASGGAFRGK TRIELEKVGVQDALKHPNWSMGKKITVDSATLVNKGLEIIEAHELFGIDYDKIDTILHPQ SIIHSMVEYQDNSIIAQMGVTDMKLPIQYAFTYPRRVSNSVLESLDFLKYGQMSFEKIDT QVFQGIDLARKAGNMGGTMPIVLNAANEIAVDFFLKEKIRFLEIYEVIQAAMEQFPREEI QSLEHILAKDHEVREWVKTWEKLL >gi|224461431|gb|ACDD01000071.1| GENE 11 11905 - 12729 836 274 aa, chain - ## HITS:1 COG:FN1325 KEGG:ns NR:ns ## COG: FN1325 COG0575 # Protein_GI_number: 19704660 # Func_class: I Lipid 
transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Fusobacterium nucleatum # 3 270 5 281 294 197 44.0 2e-50 MKSRIIVALIGIPILIFVILFGGIPLLIFTNFVVGIGTWEFYRMIEHSGRRVHKYVGMLA SLALPNYIFWTQGQKVEGEIAILIFAMVLMFLERVFTNRIEHASTEIGNTVLGLIYVSYF FSHILKWSFWDNGGQLILLLQIMVWSCDSFAYFIGISIGRKIFKRGFTEISPKKSIEGFL GGILCTILAAYLLLKYFTLFLAQTQEELLIFSLILGVGVSLAAQIGDLVESLFKRECGIK DSGKILAGHGGILDRFDSMIFVLPIMYYIMGAVL >gi|224461431|gb|ACDD01000071.1| GENE 12 12719 - 13411 674 230 aa, chain - ## HITS:1 COG:FN1326 KEGG:ns NR:ns ## COG: FN1326 COG0020 # Protein_GI_number: 19704661 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 1 227 1 227 230 311 67.0 6e-85 MSIEVPKHIAIIMDGNGRWAKKRALPRTLGHREGAKTLQKILKYAGELGIQYLTVYAFST ENWNRSEEEVSALMKLFSKYIKNEEKNLMKNNVRFLVSGRKERVSSSLLEEIKALEEKTS RNTGITFNIAFNYGGRAEIVDAVNQLLQEKKEKISEEDISSHLYQNIPDPELIIRTSGEF RISNFLLWQLAYAEIYVTDTLWPDFDEKSLDLALENFQKRERRFGGVYEK Prediction of potential genes in microbial genomes Time: Fri May 20 02:24:46 2011 Seq name: gi|224461430|gb|ACDD01000072.1| Fusobacterium sp. 3_1_5R cont1.72, whole genome shotgun sequence Length of sequence - 17636 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 6, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 63 - 491 580 ## COG0716 Flavodoxins + Term 495 - 542 2.2 - Term 488 - 525 3.6 2 2 Op 1 33/0.000 - CDS 532 - 1101 985 ## COG0233 Ribosome recycling factor 3 2 Op 2 24/0.000 - CDS 1119 - 1838 1246 ## COG0528 Uridylate kinase - Term 1857 - 1910 8.1 4 2 Op 3 38/0.000 - CDS 1911 - 2804 537 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 5 2 Op 4 . - CDS 2837 - 3580 1104 ## PROTEIN SUPPORTED gi|237743354|ref|ZP_04573835.1| SSU ribosomal protein S2P 6 3 Op 1 . 
- CDS 3708 - 7064 3259 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 7 3 Op 2 10/0.000 - CDS 7073 - 7519 590 ## COG0691 tmRNA-binding protein 8 3 Op 3 1/0.000 - CDS 7530 - 9509 1243 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 9 3 Op 4 . - CDS 9464 - 10036 652 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism 10 3 Op 5 . - CDS 10038 - 10448 579 ## gi|257453222|ref|ZP_05618521.1| hypothetical protein F3_09188 11 3 Op 6 1/0.000 - CDS 10466 - 11473 1373 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Term 11482 - 11521 6.3 12 3 Op 7 . - CDS 11532 - 11792 432 ## COG1862 Preprotein translocase subunit YajC 13 3 Op 8 . - CDS 11859 - 13010 1651 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain - Prom 13044 - 13103 6.2 14 4 Tu 1 . - CDS 13129 - 13650 773 ## COG1778 Low specificity phosphatase (HAD superfamily) - Prom 13717 - 13776 11.0 + Prom 13752 - 13811 8.0 15 5 Tu 1 . + CDS 13838 - 14398 684 ## COG3758 Uncharacterized protein conserved in bacteria - Term 14656 - 14701 8.3 16 6 Op 1 . - CDS 14786 - 15391 926 ## gi|257453228|ref|ZP_05618527.1| hypothetical protein F3_09218 17 6 Op 2 . - CDS 15400 - 15465 127 ## 18 6 Op 3 . 
- CDS 15481 - 17634 3115 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461430|gb|ACDD01000072.1| GENE 1 63 - 491 580 142 aa, chain + ## HITS:1 COG:FN0029 KEGG:ns NR:ns ## COG: FN0029 COG0716 # Protein_GI_number: 19703381 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 1 137 1 138 143 154 55.0 5e-38 MNTIGIVYYSFTGNVLRMVKELEKGIEEVGGKFKSYRVAEVKADEIFQQDIIVMASPANG SEEIETEFFQPFMENNQKKFQGKKVYIFGSWGWGEGYFLEKWKKQLEEFGAILVAEPILC NGYPNGETRKALQEMGKILVEK >gi|224461430|gb|ACDD01000072.1| GENE 2 532 - 1101 985 189 aa, chain - ## HITS:1 COG:FN1623 KEGG:ns NR:ns ## COG: FN1623 COG0233 # Protein_GI_number: 19704944 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Fusobacterium nucleatum # 6 189 7 190 190 240 77.0 1e-63 MTTGKEVIQECQNKMQKTIEATKEKFTSIRAGRASVAMLDNIKVEQYGSDMPLNQVATVS APEARLLVIDPWDKTMIPKIEKAILAANLGMNPNNDGRVVRLVMPELTADRRKEYVKLAK KEAENGKIAVRNIRKDMNTALKKIEKDKESGMSEDELKRFEAEVQTLTDKTIKDLDDLLA KKEKEITTV >gi|224461430|gb|ACDD01000072.1| GENE 3 1119 - 1838 1246 239 aa, chain - ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 1 239 1 239 239 399 87.0 1e-111 MEKPCYQKVLLKLSGEALMGEQEFGISSDVINSYAMQIKEIVDLGVQVSIVIGGGNIFRG LSGAEQGVDRVTGDHMGMLATVINSLALQNAMEKIGLATRVQTAIEMPKVAEPFIKRKAQ RHLEKGRVVIFGAGTGNPYFTTDTAAALRAIEMNTDVVIKATKVDGVYDKDPVKYADAVK YETVTYTEVLNKDLKVMDATAISLCRENKLPIVVFNSLVPGNLKKVILGEKIGTTVVAE >gi|224461430|gb|ACDD01000072.1| GENE 4 1911 - 2804 537 297 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 292 1 276 283 211 45 3e-54 MAAITAGLVKELRERTGAGMLDCKKALEQHDGDIEKAIDYLREKGIAKAVKKAGRIAAEG LIFDGVTADHKKAVVLEFNSETDFVAKNEEFKNFGKALVQIALDKNINTIEELKATEFEA 
GKTVEAVLTELIAKIGENMNLRRIHETVAKDGFVETYSHLGGKLGVIVEMSGEATEGNLH KAKDIAMHAAAMDPKYLCQEEVTTADLEHEKEIARKQLEEEGKPAQIIEKILIGKMNKFY EENCLVNQIFVKAENKETVGQYAGDLKVLSFTRYKVGDGIEKKEEDFAAEVAAQIKG >gi|224461430|gb|ACDD01000072.1| GENE 5 2837 - 3580 1104 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237743354|ref|ZP_04573835.1| SSU ribosomal protein S2P [Fusobacterium sp. 7_1] # 1 247 1 247 247 429 86 1e-120 MAVITMKQLLEAGVHFGHQAKRWNPKMAKYIFTERNGIHVIDLHKSLKKIETAYDEMRKI VEDGGKVLFVGTKKQAQEAIKEQAERSGMYYVNSRWLGGMLTNFSTIKGRIERLKELERM EAEGILDTAYTKKEAATFRKELAKLSKNLTGIKEMKEVPQAIFVVDVKMEELAVTEADHL GIPVFAMIDTNVDPDKVTFPIPANDDAIRSVKLITSVMANAIVEGNQGKENVEPASEEIQ VEEGSAE >gi|224461430|gb|ACDD01000072.1| GENE 6 3708 - 7064 3259 1118 aa, chain - ## HITS:1 COG:mlr5451_3 KEGG:ns NR:ns ## COG: mlr5451_3 COG0318 # Protein_GI_number: 13474545 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Mesorhizobium loti # 621 1115 2 497 508 460 47.0 1e-129 MLLKERRFLPLFLTQFLGALNDNLLKMAIITFITYHLQGSLTEKGILISSVNVITILPMF FISATAGQFADKFQRNSLVKIIKGIEIFGILLCIFFFYSGQYPLILVTLFVMSMRSAFFG PLKYSILPQHLKEAELISANAFVDTSTYLAVLFGTILGTYLHSPTFVLAFLLSSAVIGFI SSFFIPISPAPRPKAKLHKNILKDIRITYRKVAELKVIYQTILGISWFWSLAAVVMLLIY PLCESVLGTSRNAVAVFMLIFALGISIGAYLCTKILKGVVHPTYVPLSSLGMAISMFALY WATNRYVPPFENLHTVPFFTSFVGIRMAIILFLLAFFSGMYLVPLNTFLQTRAPKKYLAT VIAGNNIVNACGMVFLSIFIMILFHLGISIPQIFFFLSLVSILVAFYILTMLPDALPRSI AQSLLAIFFKVEVKGLEHFEKAGKRVLVIANHTSLLDGLLVAAFMPERLIFAINTHIAKK WWVKIFKPVVTLHPLDPTNPVALKNIIDELKKNQKCIIFPEGRITVTGSLMKVYEGAGVV ANAADANILPVRIDGAQFSKFSYLKTKFKTSYFPKITITVLPHTKITLEEGTSPAVRRKQ IGDQLYTIMTNMMYQSSPISTPLFRALLTARKIHGKGHVVAEDIGRRPITYQQLILKSYV LGKFFQDSIEEKHVALMLPNSLANVVAYFGLQSVGKIPAMVNFTQGEAQILSCLDTANVK TLITAKKMVDLMELQPLIESLEHIGIRVLYLEEVQEELSYTQKLVGMYRYYRRYSPKVDS SDIATILFTSGSEGVPKAVALNHENLQANRYQISSVFAFNEKDIFFNMLPMFHSFGLEVG TILPLLSGIKVFFYPSPVHYKIVPELVYDSNATILCGTDTFFQGYAKQANPYDFYNIKYA 
IVGAEKLKDSTSQIWMEKFGVRILEGYGITETSPVLAVNTPMYQKKNSVGKWMPGIEYRL EEIEGVEEGGRLFVKGKNIMRGYLKNGELETLPDGWYDTGDIVSVDEEGFVHILGRAKRF AKIGGEMVSLSAVEEVLQEKYPDIKLAVISIQDEKKGEQLVLFTEAENMDSKELLDYFKT KHYSELWIPKKILTKQEIPILGTGKTNYVKLREMLDTK >gi|224461430|gb|ACDD01000072.1| GENE 7 7073 - 7519 590 148 aa, chain - ## HITS:1 COG:FN0609 KEGG:ns NR:ns ## COG: FN0609 COG0691 # Protein_GI_number: 19703944 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Fusobacterium nucleatum # 1 148 1 148 148 215 76.0 2e-56 MILAGNKKAYFDYFVEDKLEAGIELQGSEVKSAKAGKVSIKESFIRIINGELFIMGMSIV PWSFGSIYNPEERRVRRLLLHKKEIRRLHEKVSQKGYTIVPLDVHLSHGYVKLEIALARG KKTYDKRESIAKRDSERDIRRSLKENNR >gi|224461430|gb|ACDD01000072.1| GENE 8 7530 - 9509 1243 659 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 1 647 56 697 730 483 41 1e-136 SEYFGVEKEFRGGKLMREEFVRGTFSIIKERFAFVDTEEGEGIFIPKTAFHGALDGDVVL VRITKDKTEEHGREGEVTEIVSREKEKIVGILERRSDFGFVRPTHAFGKDIYIPRGKMKK AQNGELVVVSIYFWGDKDRKPEGEIIEVLGDPYNTKNMVDALIYREGMSEEFPRKVKTEL KNIRTTISEKEVSSRHDLREYSIITIDGEDARDLDDAVYVEKMKNGNYKLLVCIADVSYY IPENSELDLEAQKRGNSVYLVDRVLPMFPKEISNGICSLNENEDKLTFTCEMEIDCTGKV IQAEMYKSVIRSVHRMTYTKVNEMIEGKEQTLQEYQDIQEMVKDMLDLSQILRARKYARG SIDFDLSEIKLVLDENEKVKYVKLRERGEAEKIIEDFMIAANEAVAEKLFWMEIPSVYRT HEKPERERLQKLNESLKNFHYRVHNLEDVHPKQFQEMIEDSKEKGVNLIVHKMILMALKQ ARYSMENVGHFGLASECYTHFTSPIRRYADLEVHRILDSTLKSYPSGKELSRNVKKLPKI CEHISKTERTAMKVEEESVKIKLVEYMMNQVGEEFSAIVTGFSNRRVFFETEEHIEVSWD VVSSRHFYEFDEREYAMLDREQTEHQYHMGDKVKIVIVKASLQELEIEAVPTIVMQKGW >gi|224461430|gb|ACDD01000072.1| GENE 9 9464 - 10036 652 190 aa, chain - ## HITS:1 COG:FN0607 KEGG:ns NR:ns ## COG: FN0607 COG1713 # Protein_GI_number: 19703942 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Fusobacterium nucleatum # 16 179 16 179 193 176 59.0 2e-44 
MREEEKDYCVWLSKILSKKRFAHVLSVVKEADYLARKNGADVEKCRLAALLHDCAKEMPL EEMQEICRREKFVDLSEQDLENGEILHGFVASVYVKEKFGIEDKEILEAICYHTVGKVGM SLIGKIVYIADAIEETRNYPNVVAIREKTHENLELGILMEIEHKLEYLSSIGARLHPNTL EWKKSLEEGN >gi|224461430|gb|ACDD01000072.1| GENE 10 10038 - 10448 579 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453222|ref|ZP_05618521.1| ## NR: gi|257453222|ref|ZP_05618521.1| hypothetical protein F3_09188 [Fusobacterium sp. 3_1_5R] # 1 136 1 136 136 218 100.0 9e-56 MKKLVTLVIWILAILVGGIYMTFPSSKVVRDATKVEKTQEDIDGKDYVLYLPDGSSEEKE LQESENKSEELHRLVQAELDYLYEKEVEGSKIELRNIYVTEDGVYILCTEKPKEQSLQAI AEVLKQLEITAKVQVL >gi|224461430|gb|ACDD01000072.1| GENE 11 10466 - 11473 1373 335 aa, chain - ## HITS:1 COG:FN1334 KEGG:ns NR:ns ## COG: FN1334 COG0860 # Protein_GI_number: 19704669 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Fusobacterium nucleatum # 6 335 2 338 338 308 50.0 1e-83 MYKKSLSILFFLFVCFSSFAAEIQKVVDRGEKIEIQLNGSVAGNITEAYDEDSRVLFLEI PKASLNKKIGLADNPYIENFNMEDYGGSVGLTCRLKNKLSYKIEKGSKSVALVFQQGSGK KKLIVIDPGHGGKDPGAARGAYREKDIVLSVGKYLKEELGGEYDIIITRDTDKFITLSER PKMGNRAGAKLFVSLHVNAAVNTAANGVEVYFFSKKSSPYAERIAQYENSFGEKFGEKSS SIAQISGEIAYKHNQTESIPLAENISEKIARSLGMRNGGAHGANFAVLRGFNGPSILVEM GFISNASDVEKLIREEHQRQIAQDVAAGIREFFER >gi|224461430|gb|ACDD01000072.1| GENE 12 11532 - 11792 432 86 aa, chain - ## HITS:1 COG:FN1335 KEGG:ns NR:ns ## COG: FN1335 COG1862 # Protein_GI_number: 19704670 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Fusobacterium nucleatum # 3 85 7 89 94 80 59.0 6e-16 MEKYGNLILIVLVWGAIFYFLVMRPNKKRQKEQKELFDSLHEGVEVVTAGGIKGTILYVG EDFVDVKVDKGVKLTVRKTSISTIVK >gi|224461430|gb|ACDD01000072.1| GENE 13 11859 - 13010 1651 383 aa, chain - ## HITS:1 COG:FN0118 KEGG:ns NR:ns ## COG: FN0118 COG0484 # Protein_GI_number: 19703466 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone 
with C-terminal Zn finger domain # Organism: Fusobacterium nucleatum # 1 371 1 376 392 414 64.0 1e-115 MEKRDYYEVLGVTKGSSEGEIKKAYRKAAMKYHPDKYTNASEKEKKEAEDKFKEVNEAYQ VLSDPQKKQQYDQFGHAAFEQGAGGFGGGFGGGFEDLGDIFGDLFGSAFGGGFGGSSRRR SSVQPGDDLRLQVEITLEEANTGVEKTVKYNRKGKCTHCDGTGAEDKKVKQCSKCHGTGR IQVQQRTPFGVFQNVSECPDCHGTGKIPEKKCTHCHGTGAEKEKIEKTVKIPAGIDDGQK LKLTGMGDASTTGGAFGDLYVHVRVKPHPIFERNDIDLYCDVPITFATAVAGGEIEVPTL TGKKKVKIAAGTQTGKMMKLSGEGMKSLRGNYHGDLLIRLNIETPTNLTKHQMELLQKFE ESLEEKNYPKRKGLFDKIKDLFQ >gi|224461430|gb|ACDD01000072.1| GENE 14 13129 - 13650 773 173 aa, chain - ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 2 169 1 168 168 184 52.0 8e-47 MLEKIEMVVFDIDGTLTDGRLIRDNEGNTSKNFYAKDGFAMGQWLRLGKKIGIITGKESR IVADRAKELGIIDVIQGSKNKARDLEQFLEKYSYTKEQIAYMGDDINDLGILSKVGFSSC PKDAAPEVLAMVDFIAAHNGGQGAARDLMEHIMKANGMWKKVLEYYQKEENRS >gi|224461430|gb|ACDD01000072.1| GENE 15 13838 - 14398 684 186 aa, chain + ## HITS:1 COG:FN1763 KEGG:ns NR:ns ## COG: FN1763 COG3758 # Protein_GI_number: 19705082 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 185 1 184 184 192 49.0 2e-49 MYTIIKKNEWQSLEWSGGITNQLYIYPKTGDYTTRNFSARISIAETRDESRSQFTSLPGI DRFISNLEGTMKLEHEDHYDIEVHPYEIERFQGSWVTFSTGKYRDFNLMLQGVMGDLYFK ELTGDITLHLQEALTFAFIYVIEGSIILDKQIKLEASDLLIATDCRLDVKTDSAKVYYGF VKEWDS >gi|224461430|gb|ACDD01000072.1| GENE 16 14786 - 15391 926 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453228|ref|ZP_05618527.1| ## NR: gi|257453228|ref|ZP_05618527.1| hypothetical protein F3_09218 [Fusobacterium sp. 
3_1_5R] # 1 201 5 205 205 377 100.0 1e-103 MYIGILVLGAAFAACGPSNTKIRPYAEVKEEVPGIQSIVTRADRGSIYIALKNVSEEELE IIWEDSTLGGDQVSHGTYVDINDYRLKQENTKMKKGEIFQTVLRRKNDLYYLDPVLYQPG GVKVKALKYPTDLVLKVKQGEKISTLETHIQQEESLHQKDVDARLQGAKDANFIPEFTDK KIVLREDKKIKKGVVTNGIEE >gi|224461430|gb|ACDD01000072.1| GENE 17 15400 - 15465 127 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMKGGCIDKYFMYLSYKKKIG >gi|224461430|gb|ACDD01000072.1| GENE 18 15481 - 17634 3115 717 aa, chain - ## HITS:1 COG:PM0714 KEGG:ns NR:ns ## COG: PM0714 COG5295 # Protein_GI_number: 15602579 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Pasteurella multocida # 142 712 2092 2712 2712 71 24.0 5e-12 VADGTADTDAVNKSQLDKAKAAATTKVKGSENIEVDGTPEKDGSMTYTVKTKDKMTLGKD SDKKIVVNGETGKLTVGKEVEMDGTTGNARFGKVQVNGEKGTVGGLTNTTWNPDKIVSGQ AATEDQLKVLDKKMQNNTEELINKGMDFSGNDYDAKNAKTKIHKKLGDRLEIIGSLEAGV EADSKNLRTRVTDDGKLELLMAKNPKFKSVEAGEGDTKVTIGDTGIKIGDKTYITKDGIN ANGNKITNVADGTADTDAVNKGQLDKAQAAATTKVDGSDNIQVDKKPEKDGSMTYIVKTK DKMTLGKDSDKKIVADGETGKLTVGKEVEMDGTTGDARFGKVQVNGEKGTVGGLTNKTWD KNKIVSGQAATEDQLKAVDDKVDNLGNKIDETTNKVEKGLNFAADTGKATNRQLGDTLTI AGDNNNITTSVEEGKVKVALKEDVKVKTLTSETVTTDKLILKGKDGKTTDVGETLDKHDR DIQENKNAIKKGLNFAANHGEVNKQLGDTMSIKGKDGLSEKEIEEKYDVENIVTSVDKEG NLWIKMAKNPKFKSVEAGEGDTKVTIGDIGIKIGDKTYITKDGINANDNKIKNVADGKVA KGSKDAINGGQLHDALSNVQEGMNQINHRVDKLDDRMHRGLANAAAMSTVEFLEIGINQA TVGAAIGTYRGNQAVAVGVQAAPTENMRVHAKVSVAPSRNNTETMAGVGASWRFNIK Prediction of potential genes in microbial genomes Time: Fri May 20 02:27:31 2011 Seq name: gi|224461429|gb|ACDD01000073.1| Fusobacterium sp. 3_1_5R cont1.73, whole genome shotgun sequence Length of sequence - 39362 bp Number of predicted genes - 56, with homology - 55 Number of transcription units - 16, operones - 8 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 2279 2616 ## COG5295 Autotransporter adhesin - Prom 2322 - 2381 18.8 - Term 2407 - 2449 9.8 2 2 Op 1 29/0.000 - CDS 2464 - 4281 2659 ## COG0443 Molecular chaperone 3 2 Op 2 21/0.000 - CDS 4319 - 4879 896 ## COG0576 Molecular chaperone GrpE (heat shock protein) 4 2 Op 3 . - CDS 4876 - 5889 1391 ## COG1420 Transcriptional regulator of heat shock gene - Prom 5947 - 6006 10.0 - Term 5995 - 6037 5.1 5 3 Op 1 . - CDS 6046 - 6501 488 ## NT01CX_0673 hypothetical protein 6 3 Op 2 1/0.000 - CDS 6513 - 7262 914 ## COG2849 Uncharacterized protein conserved in bacteria 7 3 Op 3 . - CDS 7273 - 8736 2394 ## COG0516 IMP dehydrogenase/GMP reductase 8 3 Op 4 . - CDS 8755 - 9570 925 ## COG0668 Small-conductance mechanosensitive channel - Prom 9710 - 9769 8.4 + Prom 9565 - 9624 10.8 9 4 Tu 1 . + CDS 9801 - 10004 424 ## COG1278 Cold shock proteins + Term 10019 - 10050 3.4 - Term 9924 - 9958 -0.9 10 5 Op 1 . - CDS 10048 - 11271 1513 ## COG1301 Na+/H+-dicarboxylate symporters 11 5 Op 2 . - CDS 11284 - 11772 629 ## COG0262 Dihydrofolate reductase - Prom 11800 - 11859 7.8 - Term 11817 - 11858 7.3 12 6 Op 1 50/0.000 - CDS 11901 - 12251 547 ## PROTEIN SUPPORTED gi|237739944|ref|ZP_04570425.1| LSU ribosomal protein L17P 13 6 Op 2 26/0.000 - CDS 12274 - 13254 1304 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 14 6 Op 3 36/0.000 - CDS 13282 - 13869 905 ## PROTEIN SUPPORTED gi|237744174|ref|ZP_04574655.1| SSU ribosomal protein S4P 15 6 Op 4 48/0.000 - CDS 13919 - 14308 632 ## PROTEIN SUPPORTED gi|19704620|ref|NP_604182.1| 30S ribosomal protein S11 16 6 Op 5 . - CDS 14340 - 14696 576 ## PROTEIN SUPPORTED gi|237739948|ref|ZP_04570429.1| SSU ribosomal protein S13P - Prom 14816 - 14875 2.0 17 7 Op 1 . 
- CDS 14889 - 15002 199 ## PROTEIN SUPPORTED gi|237736194|ref|ZP_04566675.1| 50S ribosomal protein L36 18 7 Op 2 9/0.000 - CDS 15023 - 15244 266 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 - Prom 15264 - 15323 5.3 - Term 15253 - 15293 -1.0 19 7 Op 3 12/0.000 - CDS 15325 - 16083 1322 ## COG0024 Methionine aminopeptidase 20 7 Op 4 . - CDS 16105 - 16737 1086 ## COG0563 Adenylate kinase and related kinases - Prom 16764 - 16823 4.7 - Term 16783 - 16822 5.9 21 8 Tu 1 . - CDS 16831 - 20925 5104 ## COG5295 Autotransporter adhesin - Prom 20974 - 21033 8.7 - Term 21153 - 21196 5.4 22 9 Op 1 53/0.000 - CDS 21201 - 22481 901 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 23 9 Op 2 48/0.000 - CDS 22520 - 23002 627 ## PROTEIN SUPPORTED gi|237736189|ref|ZP_04566670.1| LSU ribosomal protein L15P 24 9 Op 3 50/0.000 - CDS 23002 - 23187 273 ## PROTEIN SUPPORTED gi|237743917|ref|ZP_04574398.1| LSU ribosomal protein L30P 25 9 Op 4 56/0.000 - CDS 23202 - 23705 757 ## PROTEIN SUPPORTED gi|237736187|ref|ZP_04566668.1| SSU ribosomal protein S5P 26 9 Op 5 46/0.000 - CDS 23732 - 24094 485 ## PROTEIN SUPPORTED gi|237736186|ref|ZP_04566667.1| LSU ribosomal protein L18P 27 9 Op 6 55/0.000 - CDS 24121 - 24654 744 ## PROTEIN SUPPORTED gi|237743914|ref|ZP_04574395.1| LSU ribosomal protein L6P 28 9 Op 7 50/0.000 - CDS 24678 - 25073 612 ## PROTEIN SUPPORTED gi|237736184|ref|ZP_04566665.1| SSU ribosomal protein S8P 29 9 Op 8 50/0.000 - CDS 25102 - 25389 451 ## PROTEIN SUPPORTED gi|237743912|ref|ZP_04574393.1| SSU ribosomal protein S14P 30 9 Op 9 48/0.000 - CDS 25411 - 25962 868 ## PROTEIN SUPPORTED gi|237739378|ref|ZP_04569859.1| LSU ribosomal protein L5P 31 9 Op 10 57/0.000 - CDS 25981 - 26322 486 ## PROTEIN SUPPORTED gi|34764027|ref|ZP_00144913.1| LSU ribosomal protein L24P 32 9 Op 11 50/0.000 - CDS 26345 - 26713 576 ## PROTEIN SUPPORTED gi|237736180|ref|ZP_04566661.1| LSU ribosomal protein L14P 33 9 Op 12 50/0.000 - CDS 26747 
- 26998 385 ## PROTEIN SUPPORTED gi|237739375|ref|ZP_04569856.1| SSU ribosomal protein S17P 34 9 Op 13 50/0.000 - CDS 27040 - 27222 291 ## PROTEIN SUPPORTED gi|34764030|ref|ZP_00144916.1| LSU ribosomal protein L29P 35 9 Op 14 50/0.000 - CDS 27222 - 27647 675 ## PROTEIN SUPPORTED gi|34764031|ref|ZP_00144917.1| LSU ribosomal protein L16P 36 9 Op 15 61/0.000 - CDS 27650 - 28306 979 ## PROTEIN SUPPORTED gi|237736176|ref|ZP_04566657.1| SSU ribosomal protein S3P 37 9 Op 16 59/0.000 - CDS 28329 - 28661 508 ## PROTEIN SUPPORTED gi|237736175|ref|ZP_04566656.1| LSU ribosomal protein L22P 38 9 Op 17 60/0.000 - CDS 28713 - 28988 468 ## PROTEIN SUPPORTED gi|19704962|ref|NP_602457.1| SSU ribosomal protein S19P 39 9 Op 18 61/0.000 - CDS 29012 - 29842 1384 ## PROTEIN SUPPORTED gi|237742669|ref|ZP_04573150.1| LSU ribosomal protein L2P 40 9 Op 19 61/0.000 - CDS 29886 - 30173 390 ## PROTEIN SUPPORTED gi|237736172|ref|ZP_04566653.1| LSU ribosomal protein L23P 41 9 Op 20 58/0.000 - CDS 30175 - 30807 918 ## PROTEIN SUPPORTED gi|237736171|ref|ZP_04566652.1| LSU ribosomal protein L1E 42 9 Op 21 40/0.000 - CDS 30836 - 31462 992 ## PROTEIN SUPPORTED gi|237742672|ref|ZP_04573153.1| LSU ribosomal protein L3P - Prom 31544 - 31603 4.8 - Term 31487 - 31526 1.4 43 9 Op 22 . - CDS 31613 - 31924 469 ## PROTEIN SUPPORTED gi|237736169|ref|ZP_04566650.1| SSU ribosomal protein S10P - Prom 31990 - 32049 8.4 - Term 32007 - 32053 -0.1 44 10 Tu 1 . - CDS 32113 - 32337 387 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase - Prom 32362 - 32421 2.5 - Term 32346 - 32391 9.4 45 11 Op 1 . - CDS 32424 - 32807 732 ## COG2033 Desulfoferrodoxin 46 11 Op 2 . - CDS 32828 - 33253 445 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 47 11 Op 3 . - CDS 33237 - 33347 114 ## 48 11 Op 4 . - CDS 33350 - 33802 645 ## COG0456 Acetyltransferases 49 11 Op 5 . - CDS 33815 - 34552 644 ## COG0101 Pseudouridylate synthase - Prom 34575 - 34634 5.4 + Prom 34538 - 34597 4.8 50 12 Tu 1 . 
+ CDS 34671 - 35027 288 ## gi|257453278|ref|ZP_05618577.1| hypothetical protein F3_09468 + Term 35033 - 35067 1.2 - Term 35020 - 35053 1.0 51 13 Tu 1 . - CDS 35054 - 35656 580 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 35737 - 35796 9.8 + Prom 35643 - 35702 10.8 52 14 Tu 1 . + CDS 35732 - 35974 119 ## gi|257453280|ref|ZP_05618579.1| hypothetical protein F3_09478 + Term 36035 - 36087 5.1 53 15 Tu 1 . - CDS 35971 - 36453 534 ## COG2131 Deoxycytidylate deaminase - Term 36461 - 36492 3.4 54 16 Op 1 4/0.000 - CDS 36508 - 38145 1749 ## PROTEIN SUPPORTED gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P 55 16 Op 2 1/0.000 - CDS 38138 - 39013 835 ## PROTEIN SUPPORTED gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P 56 16 Op 3 . - CDS 39003 - 39272 439 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 39300 - 39359 8.5 Predicted protein(s) >gi|224461429|gb|ACDD01000073.1| GENE 1 2 - 2279 2616 759 aa, chain - ## HITS:1 COG:ECs4480 KEGG:ns NR:ns ## COG: ECs4480 COG5295 # Protein_GI_number: 15833734 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Escherichia coli O157:H7 # 74 339 106 356 1588 90 32.0 9e-18 MISKNTILKNLEKYLKRSFKGKVRINESSLIAYLLVGGFFCFVSNVGYATVAGKGEIKYQ SSGSKTEDGLAINYAQALGDSQTIAIGGNDDGGNGNAIAKGEGSIAIGGKSRTDGKTAIA VGVEAEANEGATALGAKSKSLGTNSIALGLEAEAKQNESIAIGHAAKADELHAISMGYKA NAKSQNGISIGAETVADGQNSIVIGKGSKTSELTDEDGIGLAKKPRDASVVIGVDSVSKN QYTITMGHRAQTLGQDSFAFGNESKARKDRAIAFGERTIADGENSMALGSKAGAYTVNTL AFGAGAQAKESQAISIGNKAHSNGENSISIGTTSLVGKVGQYGMSTGEKIVKNGTAIGGY SNVDANGGTTIGSFTTVNNEGGVALGIDSIADVSGGKAGAKQTYSAYNSKKAGAFTSTNT YNGNMRTGVAPRNNLKAGAVSVGKADGSFTRQIVGLAAGTEDTDAVNVAQLKSLTMKIEG DKNENEGETPKVGLWDGSLKVTAKGTGSIKASTVVEGNKVTVNVDSTDLEKKIQAAGKVK YFSVKSTGGNNENNDGATADNAVAIGKDSSATEENSIALGSDATTENAKKEVTQAEVQAG 
NGDKVVYNGFAGTKPYAQLSVGSKDKERQIKNVAAGEVSASSTDAVNGSQLYAVAAKPMT YTDDKGTTINRILGQNINLKGGAQGDLSTGNIAVEASGTDTLNIKLAKNLKEIDSISKNG KAGSPKITLGDSNISFNSDVDMGSKKITNVAAGDVSENS >gi|224461429|gb|ACDD01000073.1| GENE 2 2464 - 4281 2659 605 aa, chain - ## HITS:1 COG:FN0116 KEGG:ns NR:ns ## COG: FN0116 COG0443 # Protein_GI_number: 19703464 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Fusobacterium nucleatum # 1 605 1 607 607 934 88.0 0 MAKIIGIDLGTTNSCVAIMEGGSATIIPNAEGARTTPSVVNIKDNGETIVGEIAKRQAVT NPNSTVSSIKTYMGSDHKVEIFGKKYTPQEISAKTLQKLKKDAEAYLGEEVKEAVITVPA YFTDAQRQATKDAGTIAGLEVKRIINEPTAAALAYGLEKKKEEKVLVFDLGGGTFDVSIL EIADGVIEVISTAGNNHLGGDDFDKKIIDWMVTEFKKETGLDLSSDKMAYQRLKDAAEKA KKELSTMMETPISLPFITMDATGPKHLEMKLTRAKFNDLTRDLVEATQGPTKTALSDASL QPGEIDEVLLVGGSTRIPAVQEWVEAYFGKKPNKGINPDEVVAAGAAIQGGVLMGDVKDV LLLDVTPLSLGIETLGGVFTKMIEKNTTIPVKKSQVYSTAVDNQPAVTINVLQGERSRAA DNHKLGEFNLEGIPAAPRGVPQIEVTFDIDANGIVHVSAKDLGTGKENKVTISGSTNLSK EEIDRMTKEAEANAAEDKKFEELIAARNQADMLISSTEKSMKDHADKLGEEDKKNIEAAI EELKKVKDGDSKEAIDQAVEKLSQAAHKFAEELYKDAQAQQAQGAAGNTGVGSANEDVAE AEVVD >gi|224461429|gb|ACDD01000073.1| GENE 3 4319 - 4879 896 186 aa, chain - ## HITS:1 COG:FN0114 KEGG:ns NR:ns ## COG: FN0114 COG0576 # Protein_GI_number: 19703462 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Fusobacterium nucleatum # 36 185 51 199 199 140 64.0 9e-34 MTDEAKKEEVLEEVKEEILEAEEVKEETTEKVLSPEEEIGKLKAEIEDWKQSYLRKQADF QNFTKRKEKEIDELRQYSSQKIVEKLLGSLDNLERAISAAKETNDFDGLVQGVEMILRNI QDVMKSEGVEEIEALGKEFDPMFHHAVMQEDSPEFKDNEVMLELQKGYKMKDKVIRPSMV KVCKKS >gi|224461429|gb|ACDD01000073.1| GENE 4 4876 - 5889 1391 337 aa, chain - ## HITS:1 COG:FN0113 KEGG:ns NR:ns ## COG: FN0113 COG1420 # Protein_GI_number: 19703461 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Fusobacterium nucleatum # 1 337 12 350 351 377 61.0 1e-104 
MGISDREKLVLNAIVNYYLTFGDTIGSRTLVKKYGIELSSATIRNVMADLEDMGFIGKTH TSSGRIPTDKGYRYYLNELLKVERLSQQERESIEGFYEERIGELDKLLETTSSLLSKLTS YAGIAVEPRIVDSEIHRVELIHIDEYLVMAIIIMKDRRVKTKKIHLINPLSEKELGSIAK ELNERIQYEHLTVKEIEEFILGGDMIQSSETSMEDLNRFFIDNVTSMFKERDVDSASEVL DFLSEKKDIRQMFERLIQKRENIGQGVQVIFGDELGIKELEDYSFVYSLYQLGGAQGIIG VIGPKRMAYSKTVGLLDCVTKEVNRAIDRIEKKEVKK >gi|224461429|gb|ACDD01000073.1| GENE 5 6046 - 6501 488 151 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0673 NR:ns ## KEGG: NT01CX_0673 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 1 151 1 151 151 197 65.0 9e-50 MIGREPQKENDLFFTCALIDYIARKTKNKRVAIVDSLGKERLHKIYDLADIYHSDNLERV CDDFIQEAKILNGNFDNVKDAKYMVPSHWDIAKVYKRLILGIAKEKNIEIIEALMEAYHS FVSDLIDDYNSSFYYDAPQNILNTFLYGVVE >gi|224461429|gb|ACDD01000073.1| GENE 6 6513 - 7262 914 249 aa, chain - ## HITS:1 COG:FN1230 KEGG:ns NR:ns ## COG: FN1230 COG2849 # Protein_GI_number: 19704565 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 89 247 2 161 162 154 50.0 1e-37 MKMKQCMGIFLIMSSCLFAEIREITPLSEFSSVLLGETVQKNTKPEKVISHKAEEKKSVS TMSGIPKHSRNISQKDTSQFVDISEQDNRNGVIYRQGEESPFTGVFALFMGDWIQYIETY KNGKLDGESSWYSQNGTQILLEQYQAGKLHGNQLSYYENGNPKAEVMYDKGKITGVISFS KDGKEIHKSIFNNGTGIWKLYWENGNVLEIGKYTNFRKDGIWKKYNEDGSLESTLEYQNG RLLKETWGE >gi|224461429|gb|ACDD01000073.1| GENE 7 7273 - 8736 2394 487 aa, chain - ## HITS:1 COG:FN1231_3 KEGG:ns NR:ns ## COG: FN1231_3 COG0516 # Protein_GI_number: 19704566 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Fusobacterium nucleatum # 204 487 1 285 285 463 84.0 1e-130 MMNGKILKEAITFDDVLLVPARSEVLPHQVSLKTRLTKKITLNVPILSAAMDTVTESDLA IALARQGGIGFIHKNMSIEEQAAEVDRVKRSESGMITNPITLNQESTVMQAEEIMRRYKI SGLPVIEEDGKLIGIITNRDIKYRKDMNQLVGEIMTKEKLITAPVGTTLDEAKEVLLANR IEKLPITDEEGYLKGLITIKDIDNIIQYPNACKDEKGTLRCGAAVGIGPDTLDRVKALVE AGVDIITVDSAHGHSKGVIEMVRKIREAFPDLDLIGGNIVTAEAAKDLVEAGANAVKVGI 
GPGSICTTRVVAGVGVPQLTAVNDVYEYCKNQGIGVIADGGIKLSGDIVKALAAGADCVM LGGLLAGTKEAPGEEILLEGKKFKSYVGMGSIAAMKRGSKDRYFQTETDAQKLVPEGIEG RIAYKGAVKDVVFQLCGGIRAGMGYCGTPTIERLQVEGRFMKITGAGLLESHPHDITITK EAPNYSK >gi|224461429|gb|ACDD01000073.1| GENE 8 8755 - 9570 925 271 aa, chain - ## HITS:1 COG:FN0619 KEGG:ns NR:ns ## COG: FN0619 COG0668 # Protein_GI_number: 19703954 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Fusobacterium nucleatum # 12 271 9 272 281 279 57.0 4e-75 MNQIFLELSKMLTELLPYLFLKGISLLALIVIFPKLVKYFIRFLDKVMLRRGFDDLLMSF TESFVSTLGYIILFFSAVGILGVKATSLMAVLGTAGLAVGLALQGSLSNLAGGVLILFFK QFTKGDYIAIASGQEGTVQSIRILYTTLVTVNNQLIIIPNSQLANGYIINYSTNPERRMD LTYSASYDDKVDDVIAVLTKIAESHPKVLKNKPITIRLKQHSASSLDYMFRVWTLQEDYW DTIFDFNETVRKEFDKHGIEIPYNKLDIYNK >gi|224461429|gb|ACDD01000073.1| GENE 9 9801 - 10004 424 67 aa, chain + ## HITS:1 COG:lin1401 KEGG:ns NR:ns ## COG: lin1401 COG1278 # Protein_GI_number: 16800469 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Listeria innocua # 1 66 1 66 66 80 68.0 6e-16 MLKGTVKWFNNEKGFGFITGEDTVDYFVHFSGIAGEGFKSLEEGQAVTFEVSEGKKGPMA VEVTKAN >gi|224461429|gb|ACDD01000073.1| GENE 10 10048 - 11271 1513 407 aa, chain - ## HITS:1 COG:CPn0528 KEGG:ns NR:ns ## COG: CPn0528 COG1301 # Protein_GI_number: 15618439 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Chlamydophila pneumoniae CWL029 # 14 394 7 394 414 156 28.0 8e-38 MKEQKGKISFPIILLTGIIIGSLVGVVFREKAVVLKPLGDIFLNLMFTAVVPMVFVSIAT AVGNMVNMTRLRKILFSTVLTFIGTGLIASVYVFIAVKIFPPAVGTKIALQSTTMQEAKS SADLLVSSFTVPDFIDLLSRRNMLPLIIFATLFGFCVSHCGGEESPIGKVLNNLNDIMMK LINLIMWYAPIGLGAYFASLVGEFGPNLIGDYGRTLLIYYPLCLLYFFTAFPFYAFLAGG KEGIKRMFQYIYSPAITAFATQSSMATLPVNMETCKKIGVPKDISDLVLPMGATMHMDGS VLSSIVKISFLFGIFQTPFTGIETYFLSIVVSILAAFVLSGAPGGGLVGEMLIVSLFGFP PEAFPLIATIGFLVDPPATSLNASGDTIASMLVARMVEGKDWLHRHI >gi|224461429|gb|ACDD01000073.1| GENE 11 11284 - 11772 629 162 aa, chain - ## HITS:1 
COG:FN0241 KEGG:ns NR:ns ## COG: FN0241 COG0262 # Protein_GI_number: 19703586 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Fusobacterium nucleatum # 5 159 6 164 164 154 49.0 6e-38 MSPNYERLKMIVCVGENNLIGDKDPSGNGLLWHSKEELLYYKSITTGQVTLFGENTAKFV PIHLMKKTREVLILTMDSNIEDILQQYPEKDVFLCGGATIYRYYLEHYPIAQVYVSKLKK HVEVAEAKNPLYFPNLESLGYVCVKETEYEDFIACIYEKKRA >gi|224461429|gb|ACDD01000073.1| GENE 12 11901 - 12251 547 116 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739944|ref|ZP_04570425.1| LSU ribosomal protein L17P [Fusobacterium sp. 2_1_31] # 1 116 1 116 116 215 93 4e-55 MNHNKSYRKLGRRADHRKAMMKNMTISLLTSERIETTVTRAKELRKFAERMITFGKKGTL ASRRNAFAFLRSEEAVAKLFNELAPKYADRNGGYTRIIKTSVRKGDSAEMAIIELV >gi|224461429|gb|ACDD01000073.1| GENE 13 12274 - 13254 1304 326 aa, chain - ## HITS:1 COG:FN1283 KEGG:ns NR:ns ## COG: FN1283 COG0202 # Protein_GI_number: 19704618 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Fusobacterium nucleatum # 1 325 17 342 342 483 81.0 1e-136 MLKIEKHARGIHITEVRESEFKGQFVVEPLYRGYGHTLGNALRRVLLSSIPGAAIKGIRI EGVLSEFSVMDGVKEAVTEIILNVKEIVVKSETAGERKMTLSVKGPKVVTAADIIPDIGL EIINPEQEICTITTDRELDIEFLVDTGEGFVVSEEIERDGWAVDYIAVDAIYTPIRKVSY DIQDTMVGRMTDFDKLTLSVETDGSIEIRDALSYAVELLKLHLDPFLEIGNKMENLRVEV EEEVENQSSSIRDDINLNIKIEELDLTVRSFNCLKKAGIEEVGQLSKLSMNELLKIKNLG RKSLDEILEKMKELGYDLAQNGSAES >gi|224461429|gb|ACDD01000073.1| GENE 14 13282 - 13869 905 195 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237744174|ref|ZP_04574655.1| SSU ribosomal protein S4P [Fusobacterium sp. 
7_1] # 1 195 1 195 195 353 88 1e-96 MARNRQPVLKKCRNLGIDPVILGVNKSSNRSLRPNANRKPTEYAIQLREKQKAKFIYNVM EKQFRKLYDEAARKLGVTGLTLIEYLERRLENVVYRLGFAKTRRQARQIVSHGHITVNGR RVNIASYRVKVGDVIAVVENSKNLEIIKSAVDTANAPAWLQLDKAAFAGKVLQNPTKDDL DFDLNESLIVEFYSR >gi|224461429|gb|ACDD01000073.1| GENE 15 13919 - 14308 632 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|19704620|ref|NP_604182.1| 30S ribosomal protein S11 [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 129 1 129 129 248 92 5e-65 MAKKTVAKVKKKSKNIPNGVAHIHSTFNNTIVAITDTEGKVISWRSGGTSGFKGTKKGTP FAAQIAAEQAAGVAMENGMKKVEVRVKGPGSGREACIRSLQAAGLEVTKITDVTPVPHNG CRPPKRRRV >gi|224461429|gb|ACDD01000073.1| GENE 16 14340 - 14696 576 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739948|ref|ZP_04570429.1| SSU ribosomal protein S13P [Fusobacterium sp. 2_1_31] # 1 118 1 118 118 226 94 2e-58 MARVAGVDIPRNKRVEIALTYIYGIGRPTSQKVLKEAGVNFDTRVKDLTEEEVNKIREII NGIKVEGDLRKEVRLSIKRLMDIKCYRGLRHKMNLPVRGQSSKTNARTVKGPKKPIRK >gi|224461429|gb|ACDD01000073.1| GENE 17 14889 - 15002 199 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736194|ref|ZP_04566675.1| 50S ribosomal protein L36 [Fusobacterium mortiferum ATCC 9817] # 1 37 1 37 37 81 97 9e-15 MKVRVSVKPICDKCKVIKRHGKIRVICENPKHKQVQG >gi|224461429|gb|ACDD01000073.1| GENE 18 15023 - 15244 266 73 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 1 72 1 73 73 107 69 1e-22 MSKKDVIELEGTILEALPNAMFKVELENGHTILGHISGKMRMNYIKILPGDGVTVQISPY DLSRGRIVYRKKN >gi|224461429|gb|ACDD01000073.1| GENE 19 15325 - 16083 1322 252 aa, chain - ## HITS:1 COG:FN1297 KEGG:ns NR:ns ## COG: FN1297 COG0024 # Protein_GI_number: 19704632 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Fusobacterium nucleatum # 2 252 3 253 254 398 76.0 1e-111 MILKNLEEIKEIEKANQIIARLYRDILPPYIKAGISTKELDKIVDDYIRSQGAIPGCIGV QGMYNEFPAATCISVNEEVVHGIPGDRILQEGDIVSVDTVTILNGYYGDSAYTYAVGEID 
EESKKLLEVTKKSREIGIEQAIVGNRLGDIGHAIQKYVEKEGFSVVRDYAGHGVGLAMHE DPMVPNYGRAGRGLKIENGMVIAIEPMINVGTYKVVLHPDGWTVSTKDGKRSAHFEHSIA IVDGKPIILSEF >gi|224461429|gb|ACDD01000073.1| GENE 20 16105 - 16737 1086 210 aa, chain - ## HITS:1 COG:FN1298 KEGG:ns NR:ns ## COG: FN1298 COG0563 # Protein_GI_number: 19704633 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Fusobacterium nucleatum # 1 210 2 211 211 312 76.0 3e-85 MEMNIVLFGAPGAGKGTQAKFIMDQYEIPQISTGDILRQAIANKTTLGLEAKKFMDEGKL VPDSVVNGLVAERLEQADCKKGFIMDGFPRTVVQAEELDKILEKLNRKIEKVIALNVKDE DIVERITGRRTSKKTGKIYHMTFNPPVDEDPADLVQRADDTKEVVEKRLSTYHEQTAPVL DYYKAQNKVSEIDGSQQMEEITKQIFSILG >gi|224461429|gb|ACDD01000073.1| GENE 21 16831 - 20925 5104 1364 aa, chain - ## HITS:1 COG:FN0735 KEGG:ns NR:ns ## COG: FN0735 COG5295 # Protein_GI_number: 19704070 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Fusobacterium nucleatum # 1160 1351 386 576 617 102 40.0 6e-21 MLEEKSVKHWLKRKVKFTEALLVAFLITGGIGYAADNGAGSGTGVAIGTGSNAQRDGVVA IGRGAHTNYAGGSGYSEVNGDVAIGLNATTHSYYDQSGSVAIGKNSYVENTVGIQEKLFA FKQTPFNSWGFGSLPQEPDKVVTGVAIGDNTYVRTGGTMVGSHNYRGKIGDITVSTDEYK TKRITGLGIYSTTLGANSFTNGTVATTSGALNVISSDYDGNDATKATKNFGATITGSLNS IESATSSNNVSGLANAVVGTANRTNNSNGSLIFGAGNEITNSIANVKTDAITGSAFGGPD SITSMSEKVRKLVKDSESAGSTMAFGGGNKADWTQLTAMIGVNNTITGESGKIAKLNMVN GYKNTVTNANNNIIMGNEHTSTKDNNIMIGGLSKADTRNVANTVSVGYDAKVTVEGGVAL GHQSVAAVDKGKAGFDITKNTASTDENATWKATHAALSVGDVEKKVTRQITGVAAGTEDT DAVNVAQLKKVKDSINTAVENSKIHYYSVNDDNNHVENYNNDGAKAKNAVVIGIGSTSNG VNSTVLGNDIKLTGDKNGRNNSIVVGHHIEADGTHNAIFATDYNNDDNKTTHVFGEQNTV LGVGNLVGWTAEKDPSDATNTKWIYTKNTSGSDQNTVVGMNNTVNTNGNTVLGSSNEIRN NGSVISIGSGNVVGGTIINESGNEEGVGLRSGVFGHDSSVSHNEAFVFGNESKATAMEAY VLGNSSENTGKNSIVLGNYAKNESIGGSILGSHAENHGEWGTALGGCSNVTVDYGVALGA FSTANTSSGIAGYDPSGNSADNSSVWNSSLAAVSIGDSKEGYTRQITNVAAGTEDTDVVN VAQLKSLGKKVETDYAKVDASNLSDQNVTSWKTKLGVSDLTSTLLTYKANGKGNQTVSLA 
TGLNFTNGENTTASIDANGVVKYDVNKELKNMTSISGKEGEGKISFGKDAKNNNPTVNVN NSRITGVANAIDKMDAVNKGQLDTAINEAVKQAKADINVKGEDGISVQRTKDTFTVSLDK KTKATLAKVGTGKIEENNQNTVTGDTVYKAIKDLKGDISAAKTEVKGSEQIEVVKEATSD GHDLYKVKAKTEELKVKDGKIEKPTKEKALVTAGNIAKVIEETELTTTVKSGSKNITVEE KKIGNNTEYTMDLAKDISVDSIKIGDKISISKSGINAGKQKITNVADGKADSDAANMKQL RKVEENAKKDAEKLGNAINHNAQKIHDLKKEVGNVGALSSAMAALNPMEYDPMKPNQVLA GVGSYKNSQAVAVGMSHHFNENLRVQAGVSFSEGRKTESMVNLGLAWKIGKDDRDDSYNK YKEGPISSIYVLQDEVIFLKQANQKKDKEIDELKMLVKKLMSEK >gi|224461429|gb|ACDD01000073.1| GENE 22 21201 - 22481 901 426 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 16 426 19 437 447 351 44 3e-96 MTLLEKFNSKLSSIMKVPELRDRILFTLLMFLVARIGTFIPAPGVDTDRLAAMTAQNDIL GYINMFSGGAFTRVSIFALGIIPYINASIVVSLLAAIIPQIEEIQKEGEAGRNKITQWTR YLTIAIALVQGFGVCMWLQSVGLIFDPGILFFLTTIATLTAGTVFLMWVGEQISVKGIGN GVSLLIFLNVISRGPSNIVQTIQTMSGSKFLIPVLLAVAAAGILTIMGIVVFQLGQRKIP IHYVGKGFNSRGGMGQNSYIPLKLNSSGVMPVIFASVLMMIPTVMINAIPSKYAIKTTLS MIFNQQHPVYMIVYALVIVFFSFFYTAIVFDPEKVADNLKRGGGTIPGIRPGIETVEYLE GVVTRITWGGALFLAAISILPFAIFSALGLPVFFGGTGIIIVVGVAIDTVQQIDAHLVMR DYKGFI >gi|224461429|gb|ACDD01000073.1| GENE 23 22520 - 23002 627 160 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736189|ref|ZP_04566670.1| LSU ribosomal protein L15P [Fusobacterium mortiferum ATCC 9817] # 1 159 1 159 159 246 76 2e-64 MKLNELTPSVPRKARKRVGRGESSGWGKSAGKGSNGQNSRAGGGVKPYFEGGQMPIYRRV PKRGFSNYPFRKEYALVSLDALNKFEDGATVCPDCLAEMGIIKCACSLVKVLGNGELTKK LTIKAHKITKSAQAAIEAKGGSVEVIEVKTFADVAGNNKK >gi|224461429|gb|ACDD01000073.1| GENE 24 23002 - 23187 273 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237743917|ref|ZP_04574398.1| LSU ribosomal protein L30P [Fusobacterium sp. 
7_1] # 1 61 1 61 61 109 88 2e-23 MSKLRIELVKSMIGRKPNHIATLKSLGLKKMHDVVEHTMTPELKGKLAQVEYLLKIEEVQ A >gi|224461429|gb|ACDD01000073.1| GENE 25 23202 - 23705 757 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736187|ref|ZP_04566668.1| SSU ribosomal protein S5P [Fusobacterium mortiferum ATCC 9817] # 1 167 1 167 167 296 89 2e-79 MSKFANREEKQYQEKLLKISRVSKTTKGGRTISFSVLAAVGDGEGKIGLGLGKANGVPDA IRKAIASAKRNIVEVSLKGGTVPHEIVGKWGATSLWMAPAYEGTGVIAGSASREILELVG VKDILTKIKGSRNKHNVARATVEALKLLRTAEEIAALRGKEVKDILS >gi|224461429|gb|ACDD01000073.1| GENE 26 23732 - 24094 485 120 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736186|ref|ZP_04566667.1| LSU ribosomal protein L18P [Fusobacterium mortiferum ATCC 9817] # 1 120 1 122 122 191 80 6e-48 MFKKVNRASVREKKHLAIRNKISGTAERPRLSVYRSNNNIFAQLIDDVNGVTLVSASTIM KGMKVENGGNVEAAKAVGKAIAEKAVEKGIKEVVFDRSGYKYTGRIAALAEAAREAGLSF >gi|224461429|gb|ACDD01000073.1| GENE 27 24121 - 24654 744 177 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237743914|ref|ZP_04574395.1| LSU ribosomal protein L6P [Fusobacterium sp. 7_1] # 1 177 1 177 177 291 79 5e-78 MSRVGKKPIVVPAGVEVKIDGHKVTVKGPKGTLEKEFNQELTIKLENGEVVVERPNDEPK VRAIHGTTRALIQNMVSGVSEGFKKSLTLVGVGYRAAVKGKGLELSLGYSHPVIIDEIPG ITFTVEKNTTILVEGIEKDLVGQIAANIRSKRAPEPYKGKGVKYTDEHIRRKEGKKA >gi|224461429|gb|ACDD01000073.1| GENE 28 24678 - 25073 612 131 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736184|ref|ZP_04566665.1| SSU ribosomal protein S8P [Fusobacterium mortiferum ATCC 9817] # 1 131 1 131 131 240 90 1e-62 MYLTDPIADMLTRIRNANAVMHEKVDVPFSKMKERIAEILKEQGYISNYKIVTDGTKQNI RVYLKYDGKERVIKGIKRISKPGRRVYSSVEDMPRVLSGLGIAIVSTSKGIVTDKVARME NVGGEVLAFVW >gi|224461429|gb|ACDD01000073.1| GENE 29 25102 - 25389 451 95 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237743912|ref|ZP_04574393.1| SSU ribosomal protein S14P [Fusobacterium sp. 
7_1] # 1 95 1 95 95 178 91 5e-44 MAKKSMIARDARRAELSEKYAEKRAELKKRVAAGDMEAMFELNKLPKDSAAVRRRNRCQL DGRPRGYMREFGISRVKFRQLAGAGVIPGVKKSSW >gi|224461429|gb|ACDD01000073.1| GENE 30 25411 - 25962 868 183 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739378|ref|ZP_04569859.1| LSU ribosomal protein L5P [Fusobacterium sp. 2_1_31] # 1 183 1 183 183 338 92 2e-92 MSKYVSRYHKLYNDVIVPKLMKDLEIKNIMDCPKLEKIIVNMGVGEATQNSKLMDAAMAD LTIITGQKPLLRKARKSEAGFKLREGMAIGAKVTLRKERMYDFLDRLVNVVLPRVRDFEG VSANAFDGRGNYSLGLADQLVFPEIDFDKVEKLLGMSITMVSSAKTDEEGRALLKAFGMP FKK >gi|224461429|gb|ACDD01000073.1| GENE 31 25981 - 26322 486 113 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34764027|ref|ZP_00144913.1| LSU ribosomal protein L24P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 113 1 113 113 191 84 5e-48 MAKPKIKFVPASLHVKTGDTVCVISGKDKGKTGKVVKVFPKKGKVVVEGVNVVKKHLKPS PVNPQGGVVEKAAAIFSSKVMLFDEKAGKPTRVKYEVRDGKKVRVSKKSGEII >gi|224461429|gb|ACDD01000073.1| GENE 32 26345 - 26713 576 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736180|ref|ZP_04566661.1| LSU ribosomal protein L14P [Fusobacterium mortiferum ATCC 9817] # 1 122 1 122 122 226 93 2e-58 MVQQQTILNVADNSGAKKLMVIRVLGGSKKRFGRIGDIVVASVKEAIPGGNVKKGDVIKA VIVRTRKETRREDGSYIKFDDNAAVVINNNNEPKATRIFGPVARELRAKSFMKILSLAPE VI >gi|224461429|gb|ACDD01000073.1| GENE 33 26747 - 26998 385 83 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237739375|ref|ZP_04569856.1| SSU ribosomal protein S17P [Fusobacterium sp. 2_1_31] # 1 83 1 83 83 152 91 2e-36 MRNERKVKEGIVVSNKMEKTIVVAIETMALHPIYKKRVKKTTKFKAHDEQNVAQVGDKVR IMETRPLSKDKNWRLVEIIEKAR >gi|224461429|gb|ACDD01000073.1| GENE 34 27040 - 27222 291 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34764030|ref|ZP_00144916.1| LSU ribosomal protein L29P [Fusobacterium nucleatum subsp. 
vincentii ATCC 49256] # 1 60 1 60 60 116 100 2e-25 MRAKEIREMTSEDLVVKCKELKEELFNLKFQLSLGQLTNTAKIREVRREIARINTILNER >gi|224461429|gb|ACDD01000073.1| GENE 35 27222 - 27647 675 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34764031|ref|ZP_00144917.1| LSU ribosomal protein L16P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 1 141 1 141 141 264 90 5e-70 MLMPKRTKHRKMFRGRMKGNAQRGTTVAFGDYGLQALEPSWITNRQIESCRVGINRTFKR EGKTFIRIFPDKPITARPAGVRMGKGKGNVEGWVCVVKPGRILFEVSGVTEEKAKAALRK AAMKLPIKCKIVKREENGGEN >gi|224461429|gb|ACDD01000073.1| GENE 36 27650 - 28306 979 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736176|ref|ZP_04566657.1| SSU ribosomal protein S3P [Fusobacterium mortiferum ATCC 9817] # 1 218 1 218 218 381 83 1e-105 MGQKVDPRGLRLGITRSWDSNWYADKKEYAKYFHEDVKVREFVKKAYYHAGVSKVKLERT SPSQITVLISAGKAGIIIGRKGAEIESLRAKLEKMTGKKITVKVQEVKEFNKDAVLVAES IATQIEKRIAYKKAMTQAIGRAMKAGAKGIKVMVSGRLNGAEIARSEWAVEGKVPLHTLR ADIDYAVATAHTTYGALGIKVWVFHGEVLPTAKEGGEA >gi|224461429|gb|ACDD01000073.1| GENE 37 28329 - 28661 508 110 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736175|ref|ZP_04566656.1| LSU ribosomal protein L22P [Fusobacterium mortiferum ATCC 9817] # 1 110 1 110 110 200 93 1e-50 MEARAITRYVRLSPRKARLVADLVRGKSALQALDILEFTNKKAARVIKKTLASAIANATN NFKMDEDKLVVSTIMINEGPVLKRIMPRAMGRADIIRKPTAHIIVAVSEK >gi|224461429|gb|ACDD01000073.1| GENE 38 28713 - 28988 468 91 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|19704962|ref|NP_602457.1| SSU ribosomal protein S19P [Fusobacterium nucleatum subsp. nucleatum ATCC 25586] # 1 91 1 91 91 184 93 6e-46 MARSLKKGPFCDHHLMSKVEAVVESGNNKAVIKTWSRRSTIFPNFIGITFGVYNGKKHIP VHVTEQMVGHKLGEFAPTRTYHGHGADKKKK >gi|224461429|gb|ACDD01000073.1| GENE 39 29012 - 29842 1384 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237742669|ref|ZP_04573150.1| LSU ribosomal protein L2P [Fusobacterium sp. 
4_1_13] # 1 276 1 276 276 537 93 1e-152 MAIRKMKAITNGTRHMSRLVNDELDKVRPEKSLTVPLKSAYGRDNYGHRTCRDRQKGHKR LYRIIDFKRNKLDIPARVVTIEYDPNRSANIALLFYADGEKRYILAPKGLHKGDVVKAGA SADIKPGNALKIKDMPVGVQIHNIELQRGKGGQLVRSAGVAARLVAKEGTYCHVELPSGE LRLIHGECMATIGEVGNAEHSLVNIGKAGRARHMGKRPHVRGSVMNPVDHPHGGGEGKNP VGRKSPLTPWGKPAIGVKTRGKKTTDKFIVRRRNEK >gi|224461429|gb|ACDD01000073.1| GENE 40 29886 - 30173 390 95 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736172|ref|ZP_04566653.1| LSU ribosomal protein L23P [Fusobacterium mortiferum ATCC 9817] # 1 95 1 95 95 154 78 6e-37 MNAYDIIKKPVITEKSELLRKEYNKYTFEVNPKANKFQIRNAVQELFNVKVLTVATMNYK PVTKRHGMKLYQTSARKKAIVKLAEGHTITYFKEV >gi|224461429|gb|ACDD01000073.1| GENE 41 30175 - 30807 918 210 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736171|ref|ZP_04566652.1| LSU ribosomal protein L1E [Fusobacterium mortiferum ATCC 9817] # 1 209 1 209 210 358 86 4e-98 MAVLNIYDLAGNQTGTVEVNEAVFGIEPNKTVLHEVLTAELAAARQGTAATKTRAMVRGG GRKPFKQKGTGRARQGSIRAPHMVGGGVTFGPQPRSYEKKVNKKVRNLALRSALSAKVAN NQIVVLEGAVEAPKTKTIVNLVNKIDAKQKQLFVVNDLTDVKDYNLYLSARNLENAVVLQ PNEIGVYWLLKQEKVILTKEALTTIEEVLA >gi|224461429|gb|ACDD01000073.1| GENE 42 30836 - 31462 992 208 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237742672|ref|ZP_04573153.1| LSU ribosomal protein L3P [Fusobacterium sp. 
4_1_13] # 1 208 1 208 211 386 91 1e-106 MSGILAKKIGMTQIFEDGKFVPVTVVEAGPNYVLQKKTEESDGYTALQLGFDEKKEKNTT KPLMGIFNKAGVKPQRFVRELKVDSVEGYELGQEIKVDVFSEVEYVDITGTSKGKGTAGV MKRHGFGGNRATHGVSRNHRLGGSIGQSSWPGKVLKGLRMAGRHGNATVTVQNLKVVKVD AENNLLLIKGAVPGAKNGYLVIKPAIKK >gi|224461429|gb|ACDD01000073.1| GENE 43 31613 - 31924 469 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736169|ref|ZP_04566650.1| SSU ribosomal protein S10P [Fusobacterium mortiferum ATCC 9817] # 1 103 1 103 103 185 90 4e-46 MASNKLRIYLKAYDHSLLDESAKKIVEVAKKSGAEVVGPMPLPTKIKKYTVLRSVHVNKD SREQFEMRVHRRMVELVNSTDKAIASLTAVNLPAGVGIEIKQI >gi|224461429|gb|ACDD01000073.1| GENE 44 32113 - 32337 387 74 aa, chain - ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 1 71 1 71 538 97 66.0 7e-21 MEEVLVFGYKNPDTDSICSSIAMAALKRKQGFDAIACCLGSLSKETEFVLRKLSVETPKM LKTVSAQVMALKIY >gi|224461429|gb|ACDD01000073.1| GENE 45 32424 - 32807 732 127 aa, chain - ## HITS:1 COG:MTH757 KEGG:ns NR:ns ## COG: MTH757 COG2033 # Protein_GI_number: 15678782 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Methanothermobacter thermautotrophicus # 3 123 5 123 124 93 42.0 7e-20 MRNDFFKVAGSKKLLEVAVDGEGCLKEAIPGVEKLEVKSEDASTEKHVPYVEEQENGYLV KVGKETAHPMQDAHYIQFIEIVVDDNNLYRRYLNPGDAPEAFFAVPKGTKVVAREYCNLH GVWQYTK >gi|224461429|gb|ACDD01000073.1| GENE 46 32828 - 33253 445 141 aa, chain - ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 1 141 3 142 142 192 70.0 1e-49 MNIRIDNVGEYLKEHGIKPSYQRMRIFQYLLDYHNHPTVDVIYKALCPEIPTLSKTTVYN TLNLFVEKKIVNVIIIEENETRYDLVSTTHGHFKCQECGAVYDVELKNTPFQAESLLEGC KVEEEHFYFKGICKNCMEKKH >gi|224461429|gb|ACDD01000073.1| GENE 47 33237 - 33347 114 36 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MTLLVNYYIIKYIIITKLTYKILYTIGGGIEHEYQN >gi|224461429|gb|ACDD01000073.1| GENE 48 33350 - 33802 645 150 aa, chain - ## HITS:1 COG:FN2046 KEGG:ns NR:ns ## COG: FN2046 COG0456 # Protein_GI_number: 19705336 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Fusobacterium nucleatum # 1 149 1 148 149 131 52.0 6e-31 MKFRELVEIDLEYLNKIVELEEEAFEGQGGVDLWILKALIRYGKVFVLEDKNGELVSVLE FMQVFEKKEAFLYGICTRKKYRRQGWAEYILDLGEKYLKEKFYHGIALTVDPKNEIAIHL YKNKDYKVLELQENEYGEGIHRLLMKKSLE >gi|224461429|gb|ACDD01000073.1| GENE 49 33815 - 34552 644 245 aa, chain - ## HITS:1 COG:FN1600 KEGG:ns NR:ns ## COG: FN1600 COG0101 # Protein_GI_number: 19704921 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Fusobacterium nucleatum # 2 242 6 247 247 241 54.0 6e-64 MKNIKISYQYDGSSFMGFQRQPEKRTVQGEIEKCLFRILKEKIDLTSSGRTDRGVHAMHQ VSNFFTAVNIPLDKLFYALSRCLPEDILLLELEEARKDFHARFSAKTRSYCYRITWEKSP FERRYKTYVKKKIDSQSFFKILEIFMGKHNFQNFRLQDDAFANPIREIYSIQVKEVDEGM DIYIEANAFLKSQIRIMLGTAFQVYFQKVESNRIEKMLKEPDKEFPKYLADPNGLYLYHI KYDEE >gi|224461429|gb|ACDD01000073.1| GENE 50 34671 - 35027 288 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453278|ref|ZP_05618577.1| ## NR: gi|257453278|ref|ZP_05618577.1| hypothetical protein F3_09468 [Fusobacterium sp. 
3_1_5R] # 1 118 1 118 118 196 100.0 4e-49 MDKKQVEELQSLLQKQNYTIIYVDFANPKNVIVSHSINDFSSIDLEHFAKFYVFNDNFMR VYSRKGPKHFDFYEIQKEDFDPGYDEKVFFVDYSQYKKLHMRVGKIDGKSAMQYLYFE >gi|224461429|gb|ACDD01000073.1| GENE 51 35054 - 35656 580 200 aa, chain - ## HITS:1 COG:FN1901 KEGG:ns NR:ns ## COG: FN1901 COG0664 # Protein_GI_number: 19705206 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 14 198 33 216 217 144 44.0 7e-35 MKKNWGNYAKLFSFKKGEAIFFRGEEVKGLHILAEGIAVAEMLKENGDVNQIEEMQGETF LASAFVFGGNPYYPVDLRAKTDCKIYFVPKEELIFVFQKEPEMLEKFVNDISSKAQFLSN RLWSQFQYKSIGSKLNQYLLSQEKEGKCCFDRSLKELAELFGVTRPSLSRVLGQYVEEGI LERNGRNQYKILDRESLEEN >gi|224461429|gb|ACDD01000073.1| GENE 52 35732 - 35974 119 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453280|ref|ZP_05618579.1| ## NR: gi|257453280|ref|ZP_05618579.1| hypothetical protein F3_09478 [Fusobacterium sp. 3_1_5R] # 1 80 1 80 80 96 100.0 6e-19 MEKLILTCPHCHKKMKIQKKAAKYKCPHCSSICIISSIALFLLMIQNYIQFFTQKIKHKY QNVKNTYKYLKMLRDNQKKH >gi|224461429|gb|ACDD01000073.1| GENE 53 35971 - 36453 534 160 aa, chain - ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 3 158 15 170 174 262 75.0 2e-70 MKRKDYITWDEYFMGVALLSAMRSKDPNTQVGACIVSPDKKIIGLGYNGLPKGCEDDEFP WEREGEFLETKYPYVCHAELNAILNSTQSLKNCTIYVALFPCHECSKAIIQSGIREIVYL SDKYAETESNIASKRMLDSAGVVYRKLEKTCQNLYLSFES >gi|224461429|gb|ACDD01000073.1| GENE 54 36508 - 38145 1749 545 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P [Fusobacterium nucleatum subsp. 
vincentii ATCC 49256] # 1 543 287 823 827 678 59 0.0 MLNGNENSNEFLEMLEDYLPAEKTGGKNQRVVGTINSIERNFVYLDVPGQRTVVRVRAEE LSEYNVGDQVEVVLVGLLEADDDQEVLIASRKRIDLEDNWKHIEDSYENKTVLSGRIVKK IKGGYIVEAALYQGFLPNSLSEINEKDGEAMVGKNIDVIVKDIKQDSRDKRSKKITFSKK DITLMKEGEEFAKLTVGDVVTCTVSGIMDFGLSVMIDHLRGFIHISEVSWKRLDDLRDLY TVGQTVEAKILSLDEEKKNIKLSIKQLTTNPWDLSKDAFHEGDEVEGKVTRVLAYGAFVE LTEGVEGLVHISDFAWNKKRINMEEYAKVGETVKVKILEFNPEGRKLKLGFKQLVENPWD VAEEKFAEGKELTATILDIKPFGLFAEIESGVDVFVHSSDFGWPGDEPANYQVGDSISFK VLELNVEDKKIKGSIKALKKSPWDKAMEEYKVGTTVEKKIKNIMDFGLFVELSKGIDGFI PTQFASKDFVKDLKDKFEIGQVVKAQIVEINQETQKIKLSIKKIELEEQKREDQDLLAKY GTAGE >gi|224461429|gb|ACDD01000073.1| GENE 55 38138 - 39013 835 291 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|34762725|ref|ZP_00143715.1| LytB protein; SSU ribosomal protein S1P [Fusobacterium nucleatum subsp. vincentii ATCC 49256] # 7 288 3 284 827 326 57 0.0 MNHKVTIIRANKMGFCFGVMEAVRLCEDILQDPKNANKNKYILGMLVHNDFVVQSFEKKG FVTIEESEISSLEKGDIVVIRAHGITKEVQKQLEEKELDLYDATCIFVSQIKLKILWAIE QGYDIIFIGDKHHPEVKGITSYAKNIQIFASLEELKKVTIEKEKKYFLSTQTTLNQKKFL EIKKYMEENYSNVYIFNKICGATQERQKATESLAKEVDVVFVLGGKKSSNTQKLYEISKS LNPNTYLLEKEEDLEEAYLQGKSKIGLTAGASTPEEIIRNIENKIRGILDA >gi|224461429|gb|ACDD01000073.1| GENE 56 39003 - 39272 439 89 aa, chain - ## HITS:1 COG:FN1782 KEGG:ns NR:ns ## COG: FN1782 COG1925 # Protein_GI_number: 19705087 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Fusobacterium nucleatum # 1 88 1 88 89 120 77.0 5e-28 MKTVKVEIKNKAGLHARPSSLFVQAVAKYDSEIKVRCDEEEINGKSIMGLMLLAAEQGRI LELTADGPDEEAMLAELVDLIEVKKFNEP Prediction of potential genes in microbial genomes Time: Fri May 20 02:27:55 2011 Seq name: gi|224461428|gb|ACDD01000074.1| Fusobacterium sp. 3_1_5R cont1.74, whole genome shotgun sequence Length of sequence - 3285 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 83 - 2188 2057 ## COG3968 Uncharacterized protein related to glutamine synthetase - Term 2181 - 2232 7.2 2 2 Tu 1 . - CDS 2249 - 3163 1241 ## COG2066 Glutaminase - Prom 3224 - 3283 12.0 Predicted protein(s) >gi|224461428|gb|ACDD01000074.1| GENE 1 83 - 2188 2057 701 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 1 701 1 696 696 660 46.0 0 MKTMLEVFGIHCFSEKELKSRVPKDVFKSFKKVQSGKEELSITTANVIANAIKLWAIENG ATHFTHWFQPLTELTAEKHESFLSVHSDGTSITEFTGKELIKGESDTSSFPNGGLRSTFE ARGYTAWDIGSPMFLKGEGLSKSLYIPTAFIGYSGEALDKKVPLLRSISAVRKEALRIQK TLGDFDTRHIDVTLGVEQEYFLVEKKFFDLRKDLTLSGRTVFGNLPPKGQEMNDHYYGTI KERVEAFMTELDTELWKVGVMSKTKHNEVAPNQFEVAIMFNTANVAVDQNQITMDMIKKV ATRHHLTALLHEKPFHGINGSGKHCNWSLSTDTGKNLLDPSSLEENRFDFLLYVMAVMEG VYRYSGILRACTATPGNDYRLGGHEAPPAIISIFLGNELQQIFENIQHNNLSMTTQKDLL DLGSSFPKIPKDISDRNRTSPFAFTGNKFEFRMPGSSASPATPTFILNTIVADILKEYAD KLEQWENISPNVKVVKLIQEQYPKYKNILFNGNGYDKNWEVEAKALGLQNFKNTVEALPN YISEESIALFERNQVLTRAELQSRFHVYCERYNKQNNIEISSAIEIARNEIYPSVLAYIT KIAQNIDILKSLVEETEYQEEKKLLKTLLTNKNEMLQSIHELVDGMKTATSIVDQYQRAQ YYSNTLIPKLTNLRKVVDILEKESDKHTWPIPSYYDLLFNL >gi|224461428|gb|ACDD01000074.1| GENE 2 2249 - 3163 1241 304 aa, chain - ## HITS:1 COG:FN1397 KEGG:ns NR:ns ## COG: FN1397 COG2066 # Protein_GI_number: 19704729 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Fusobacterium nucleatum # 1 304 1 304 304 456 76.0 1e-128 MQELLQKIVEKNKELTNLGAVANYIPELDKANKNALGICVMDMEGNQFCYGECGTRFTIQ SISKIISLMLAILDNGEEYVFSKVGMEPSGDPFNSIRKLETSSRKKPYNPLINAGAIAVA SMIKGKNVRERFQRLLDFTRKITEDETVDVNYKIYCGESETGDRNRAMGYFLKGEGIIEG NVEEALDIYFKQCSMEVTVYTIAKLGLFLANDGVLSNGERVISTRLSRIVKTLMVTCGMY DESGEFAVRVGMPSKSGVGGGIVSVVPKKMGIGVYGPSLDKKGNSIAGAGVLEDLAKELD LSIF Prediction of potential genes in microbial genomes Time: Fri May 20 02:28:03 2011 Seq name: 
gi|224461427|gb|ACDD01000075.1| Fusobacterium sp. 3_1_5R cont1.75, whole genome shotgun sequence Length of sequence - 17705 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 59/0.000 - CDS 59 - 457 576 ## PROTEIN SUPPORTED gi|237736381|ref|ZP_04566862.1| SSU ribosomal protein S9P 2 1 Op 2 . - CDS 474 - 908 694 ## PROTEIN SUPPORTED gi|237736380|ref|ZP_04566861.1| ribosomal protein L13 - Prom 992 - 1051 7.2 - Term 1038 - 1074 7.5 3 2 Tu 1 . - CDS 1100 - 2533 2100 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 2563 - 2622 7.6 4 3 Op 1 9/0.000 - CDS 2661 - 3365 951 ## COG3279 Response regulator of the LytR/AlgR family 5 3 Op 2 . - CDS 3362 - 5035 1744 ## COG3275 Putative regulator of cell autolysis 6 3 Op 3 . - CDS 5045 - 5944 943 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 5972 - 6031 6.7 - Term 5989 - 6041 9.7 7 3 Op 4 . - CDS 6061 - 6444 465 ## gi|257453293|ref|ZP_05618592.1| hypothetical protein F3_09543 - Prom 6620 - 6679 79.6 + TRNA 6603 - 6679 77.6 # Arg ACG 0 0 8 4 Op 1 . - CDS 6683 - 7393 439 ## COG4912 Predicted DNA alkylation repair enzyme 9 4 Op 2 . - CDS 7393 - 8253 922 ## COG0384 Predicted epimerase, PhzC/PhzF homolog 10 4 Op 3 . - CDS 8263 - 8910 812 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Prom 8979 - 9038 19.3 + Prom 8974 - 9033 15.9 11 5 Tu 1 . + CDS 9085 - 9309 255 ## gi|257453297|ref|ZP_05618596.1| hypothetical protein F3_09563 + Term 9329 - 9376 10.6 + Prom 9359 - 9418 8.7 12 6 Op 1 2/0.000 + CDS 9447 - 10646 1709 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 13 6 Op 2 4/0.000 + CDS 10643 - 11335 780 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 14 6 Op 3 . 
+ CDS 11336 - 12160 994 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component + Term 12165 - 12197 3.2 - Term 12153 - 12185 3.2 15 7 Tu 1 . - CDS 12188 - 12334 312 ## gi|257453301|ref|ZP_05618600.1| hypothetical protein F3_09583 - Prom 12385 - 12444 3.7 - Term 12408 - 12457 8.4 16 8 Op 1 1/0.000 - CDS 12468 - 13307 1132 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase 17 8 Op 2 1/0.000 - CDS 13309 - 14748 2178 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 18 8 Op 3 . - CDS 14760 - 15431 729 ## COG0692 Uracil DNA glycosylase 19 8 Op 4 . - CDS 15444 - 17204 2131 ## COG1032 Fe-S oxidoreductase 20 8 Op 5 . - CDS 17223 - 17645 426 ## FN1229 hypothetical protein Predicted protein(s) >gi|224461427|gb|ACDD01000075.1| GENE 1 59 - 457 576 132 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736381|ref|ZP_04566862.1| SSU ribosomal protein S9P [Fusobacterium mortiferum ATCC 9817] # 4 132 1 129 129 226 86 8e-59 MADMNQYRGTGRRKTSVARVRLIPGGQGVVINGKSMAEYFGGREILAKIVEQPLTLTETL DKYEVRVNVCGGGNAGQAGAIRHGVSRALVEADETLKAALREAGFLTRDSRMVERKKYGK KKARRSPQFSKR >gi|224461427|gb|ACDD01000075.1| GENE 2 474 - 908 694 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237736380|ref|ZP_04566861.1| ribosomal protein L13 [Fusobacterium mortiferum ATCC 9817] # 1 144 1 144 144 271 90 2e-72 MKKYTYMQRKEDVVREWHHYDAEGKILGRLAVEVAKKLMGKEKITFTPHIDGGDFVVVTN VAKMVVTGKKLTDKKYYNHSGFPGGIRERKLGEILDKRPEELLMLAVKRMLPKNKLGREQ LTRLRVFAGAEHTHEAQQPNKVEF >gi|224461427|gb|ACDD01000075.1| GENE 3 1100 - 2533 2100 477 aa, chain - ## HITS:1 COG:FN0221 KEGG:ns NR:ns ## COG: FN0221 COG1966 # Protein_GI_number: 19703566 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Fusobacterium nucleatum # 1 457 1 455 474 598 72.0 1e-171 MFSFIGAVIALIVGYVVYGAFVDRVFGSTDAKVTPAKRMADGVDYVEMDWKKAFLIQFLN IAGTGPIFGAVAGAMWGPAAFIWIVFGCIFAGSVHDFLIGMLSVRQDGASVSEIVGKYLG ENARKLMVAFSIVLLVLVGVVFVKSPADILHNLTGIPTMVLLGIIIIYYLIATVLPIDQV 
IGRIYPIFGVCLLIMAVGIGFGIIFQGYAVNIPEITFHNFHPAGKSIFPYLCISIACGAI SGFHATQSPMMARCLRTEREGRRVFYGAMISEGIVALVWAAAAMCYFGNIEGLAAAGSAA VVVDTISRGVLGPVGGALAILGVVACPITSGDTAFRSARLTIADAIGYKQGPVKNRFVIA VPLFAIGLALCFIPFAIIWRYFGWSNQTLATIALWAAAKYLEKHGKNFWIAVIPALFMTV VVTSYIICAPEGFAWVFGDMDIHVVEQIGIVAGIIVSALSGLLFWKTKTPAAEIEVE >gi|224461427|gb|ACDD01000075.1| GENE 4 2661 - 3365 951 234 aa, chain - ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 1 233 2 240 240 234 50.0 9e-62 MRCVIVDDEFPAREELKYFISKFPGTELTQEFGDSLDAFDYLQEHAKEVDVLFLDINMPE LNGLNLGKIIRKLNPAMKIIFVTAYREYAVDAFEIQAFDYLLKPYSEDRIEKLLSRLSVE KKQISNKVSISVGEKIMVFNTEDIIVVEADKKESRVYTTKECYLTKMKISDWEEQLPENQ FYRCHRSYLVNLSKVREIEPWFNNSFVIHMENCPVKIPVSRNNMKEFKSLFQVR >gi|224461427|gb|ACDD01000075.1| GENE 5 3362 - 5035 1744 557 aa, chain - ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 19 549 3 534 541 486 49.0 1e-137 MFTLMNHLLNNIGYIIAAAFLFTKIKSAIEGLREEERRNHIIYVFFFSALAIAGTYIGLD YKGSILNTRNIGVITGGLLLGPEVGILAGIFSAIHRILIPIGEATEIPCAIATILAGVFS GYLHNRYRESVKPMIGFFLAIIVESISMILILGFSSNFDESLDVVRSIYFPMSFMNSLGV YALISIIQNTLSTMEVNAGKQAKIALEIANKTLPYFQKGESLDSVCKIILESLDAKAVAI TDLEKIRASYVVEGIPKIEKTEIQSAFTKKVLELGRIMVFGKNNTGDLSDYLFLSKEIKS CIILPLFERGKVSGALKIFFDTPEKVTANNKYLAIGLSQLISTQLELEKLDALEDSARKA ELKALYSQINPHFLFNVLNTIASFVRIDPNKAREVIIDLSTYLRYNIENSMKFVPLEQEL EQVKAFVAIESARFGNKIKVHYEIEEKALESEIPSLSIQPLVENSIIHGLLPKRQGGNIW ISAKVKEEGTRIVIQDDGVGISESVIHSLEEEIGSSIGLKNVHHRLKLIYGKGLLVERLS EGTKISFWIYRQEVEKR >gi|224461427|gb|ACDD01000075.1| GENE 6 5045 - 5944 943 299 aa, chain - ## HITS:1 COG:FN1498 KEGG:ns NR:ns ## COG: FN1498 COG0697 # Protein_GI_number: 19704830 # Func_class: G Carbohydrate transport and metabolism; E 
Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 284 1 285 299 329 67.0 3e-90 MKKELQGVILVSLAATLWGFDSIALTPRLFHLQVPYVVFILHFLPFIGMTVLFGREEFQK IKQLDSHDLFYFFLVALFGGAVGTLSIVKALFLVNFQHLTVVTLLQKLQPVFAIILARVL LKEVIEKKFIFWALIALLGGYFLTFEGNVPSMEGNNIGLACLYSLLAAFSFGSATVFGKR ILKNASFRTALYVRYSFTSIIGFFIALVSGSFQSFAQTTGMEWLIFIIIGLTTGSGAILL YYYGLRYIPARISTICELCFPISSVIFDFLLNGKLLSMIQLVSAAIMLLAIYRITQKQK >gi|224461427|gb|ACDD01000075.1| GENE 7 6061 - 6444 465 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453293|ref|ZP_05618592.1| ## NR: gi|257453293|ref|ZP_05618592.1| hypothetical protein F3_09543 [Fusobacterium sp. 3_1_5R] # 1 127 1 127 127 196 100.0 3e-49 MEKQKLEKIEARIKELYQDTKEDHTLCGKYSKLLQVAKQVLEEEKEEGLLVKRLRIAEGK LTHSVLWNEASNLEVLFLVTQEEDFLGCATWKIPVEDQNNPEIFTEEAHGDKIAKWLKQE YLQEVFF >gi|224461427|gb|ACDD01000075.1| GENE 8 6683 - 7393 439 236 aa, chain - ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 2 232 15 251 251 114 33.0 2e-25 MKKESKGIQNFLKGFQEEEYQKFNAKLIPNLPSKEVLGVRTPILRKLAEELYLRQAERML QYMTELPHRYLEENHLHAFLIENIKDFSQTMEETEKFLPYINNWATCDTFSPKIFKKYPL EVYEKIKVWLQSTHEYTVRYGIGLLLSNYLEKHFQKEMLELVANIQREEYYIRMMIAWYF ATALAKQWSCTLPYLEQHTLEEWTHNKAIQKAIESRRITEEQKEYLRTLKRKTSKK >gi|224461427|gb|ACDD01000075.1| GENE 9 7393 - 8253 922 286 aa, chain - ## HITS:1 COG:lin0782 KEGG:ns NR:ns ## COG: lin0782 COG0384 # Protein_GI_number: 16799856 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Listeria innocua # 1 278 1 271 282 179 37.0 7e-45 MKRPIFIYDAFTKEKFGGNGAGILFHAEELSTAEKQNLAKELGFSETVFIQASEKADFKF EYFTPKQEVDLCGHATIAAIYSLFEENRISEDKDRITIDTKLGVLPIFLERQGKELLSVW MEQDEGDLSFTLDISEEEILASLGLTEKDRNRKFLLVKAYSGLWDLMIPLASKESLDKIQ 
IDFSKVEALSEKLSVISFHPFFLEDKHVYVRNFAPIVDIPEESATGTSNGALAFYLFKQG YLSENEILYCHQGESLQRKSQILAKITREEKILVGGEAIRILQGEY >gi|224461427|gb|ACDD01000075.1| GENE 10 8263 - 8910 812 215 aa, chain - ## HITS:1 COG:ML1003 KEGG:ns NR:ns ## COG: ML1003 COG1974 # Protein_GI_number: 15827479 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Mycobacterium leprae # 91 214 113 235 235 70 37.0 2e-12 MDFKTYLKEKREELGYSQNKLAKALQITQPYYNSIERGEVKNPPSEEILERMIGLFSLNE KDAEYFLYLAAVERTPKIILEKMKQIKGEGPAAIPLFPRISAGIGVFGEEEVEDYISIPG VRNVEEVFSVRVKGDSMEPTIKNSSIIVCRQNMQVHNGEIGAFLVNGEAFVKRLQIKPDY VVLMSDNPNYQPIYISPNDEFVSLGKVLKVINDII >gi|224461427|gb|ACDD01000075.1| GENE 11 9085 - 9309 255 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453297|ref|ZP_05618596.1| ## NR: gi|257453297|ref|ZP_05618596.1| hypothetical protein F3_09563 [Fusobacterium sp. 3_1_5R] # 1 74 1 74 74 140 100.0 3e-32 MRILVKNKKWETSFQTVTLICDVKAKNGIFHIQFPYNGKYVQIKSNNLDLTFHHLEKVFN RFGNLPETKQFLAS >gi|224461427|gb|ACDD01000075.1| GENE 12 9447 - 10646 1709 399 aa, chain + ## HITS:1 COG:FN0128 KEGG:ns NR:ns ## COG: FN0128 COG1840 # Protein_GI_number: 19703473 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Fusobacterium nucleatum # 89 396 5 312 314 452 68.0 1e-127 MYINLSMNIKEIITKYPETKAVFENQGIQGLEDEKVLQLLEAYPLSKIMELKKVDSKAFL SRLEESIKTNRETSDITMKKEEKKENGLSLLGLLPCPVRIPLLEGFQNFLQNHPDVEVNY ELKAASSGLDWLKKDVIEANHVDQLADMFLSAGFDLFFDNKWMGKWKAEGIFEDMTGLTH YNTDFENENISLKDPKGDYSMIGVVPAIFLVNKNALGNRKTPESWQDILSEEFENSISLP IADFDLFNSILVHIYKLYGQEGVEKLGKSLLSNLHPAQMVDAKEPAITIMPFFFSKMIKE NGPMQVVWPKEGAIISPIFMLTKKHRKEELKPIVNFMGGKEVGTIISHQGLFPSIHPEVE NPTSGKPFVWIGWDFIYSHDMGQLLQDCESWFMKGAKRS >gi|224461427|gb|ACDD01000075.1| GENE 13 10643 - 11335 780 230 aa, chain + ## HITS:1 COG:FN0129 KEGG:ns NR:ns ## COG: FN0129 COG0378 # Protein_GI_number: 19703474 # Func_class: O Posttranslational 
modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Fusobacterium nucleatum # 1 230 1 230 231 392 79.0 1e-109 MKFITISGPPSSGKTSLILKTIENLKQKGMKVGVVKFDCLYTEDDVLYEKMGIPVKKGLS GSVCPDHFFVSNIEEVVQWGKRQGLDLLITESAGLCNRCSPYIKDIKAICVIDNLSGINT PKKIGPMIKTADIIVITKGDIVSQAEREVFAARVQIVNPRAAILHVNGLTGQGSFEFANL VMEENQEIDTVVEKQLRFSVPSAVCSYCLGETRIGTSYQMGNIRKIDLED >gi|224461427|gb|ACDD01000075.1| GENE 14 11336 - 12160 994 274 aa, chain + ## HITS:1 COG:FN0130 KEGG:ns NR:ns ## COG: FN0130 COG1136 # Protein_GI_number: 19703475 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 7 273 1 267 268 417 82.0 1e-117 MEELNLMDILGIQEEIIEEVTIVAGYNKLGEKENFDSFTIKAGEIVAIVGPTGSGKSRLL ADIEWGAQGDTPTKRSILVNGKPMDAKKRFSPSHKLVAQLSQNMNFVMDLSVRDFLDLHA ESRLAANREEIIEKIFRQANDLAGEKFNLDTPITSLSGGQSRALMIADTAILSSSPIVLI DEIENAGIDRKKALDLLVGNNKIVLMATHDPILALMGDRRIVIKNGGIAKVMESNPEEKQ ILGKLEELDDVVQSMRNQLRYGEVLSCDFEIKKI >gi|224461427|gb|ACDD01000075.1| GENE 15 12188 - 12334 312 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453301|ref|ZP_05618600.1| ## NR: gi|257453301|ref|ZP_05618600.1| hypothetical protein F3_09583 [Fusobacterium sp. 
3_1_5R] # 1 48 16 63 63 76 100.0 5e-13 MIAMPKPLSMKEERIEIEGKTREVSFEGEQVFIDGSVEIVAEGNSYLK >gi|224461427|gb|ACDD01000075.1| GENE 16 12468 - 13307 1132 279 aa, chain - ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 1 277 9 284 286 474 81.0 1e-134 MIVQDTKVVKVGENVSIGGKKRFTLIAGPCVMESQELMLEVAGEINKICKKLGIEYIFKA SFDKANRSSIHSYRGPGLEEGLKMLQKVKDTYGIPVVTDIHEPWQCEKVAEVADLLQIPA FLCRQTDLLIAAAATGKPVNIKKGQFLAPWDMKNVVVKMEESGNEGILLCERGSTFGYNN MVVDMRSLLEMRKFGYPVVFDVTHAVQKPGGLGNATSGDREYVYPLMRAGLAIGVDAIFA EVHPNPEVAKSDGPNMLYLKDLEEILKVAIQIDDLVKNY >gi|224461427|gb|ACDD01000075.1| GENE 17 13309 - 14748 2178 479 aa, chain - ## HITS:1 COG:FN1225 KEGG:ns NR:ns ## COG: FN1225 COG0769 # Protein_GI_number: 19704560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Fusobacterium nucleatum # 4 476 3 479 485 509 54.0 1e-144 MEKLLEGLQYEILQKPEVEIFTGMEHDSRKIVEGSIFVALEGEVVDGHTFIDTAIEKGAK LIIVSKEVPCQKGIGYVLIKNLRKHLGILASNFYGWPQKNIKILGVTGTNGKTTTTYLLE QLLGEEKVARFGTIEYKIGKEVIEAPNTTPESLDLVRMIKKAYDQGLEYIIMEVSSHALE LGRVNMLEFDGAIFTNLTLDHLDYHKTMEQYFMAKRKLFLKLRGKAIKVLNVDDEYGRRL QEEFHGISYGTKQAEVQGKILGFEGGKEKVELSLFGKKKECKIQILGGFNLYNLLGSIAL VKELGMSEEEIFAKVELLQGAPGRFETVDCGQDYMVVIDYAHTGDALENILQAIQEIKTK KIITIFGCGGDRDPRKRPIMAEIAERYSDFVVLTSDNPRTENPESILEEVKGGFTKENHI CVLERAEAIAEGIRRAEKGDIVLIAGKGHETYQILGRKKYHFDDREFARREIVFRKQGR >gi|224461427|gb|ACDD01000075.1| GENE 18 14760 - 15431 729 223 aa, chain - ## HITS:1 COG:FN1226 KEGG:ns NR:ns ## COG: FN1226 COG0692 # Protein_GI_number: 19704561 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Fusobacterium nucleatum # 1 222 1 222 226 282 63.0 4e-76 MVHIGNDWDKVLEGEFQQEYYQNLRKILVREYRSKRIFPPAEKIFNALKWTSYKDCKVVL 
LGQDPYHGLGQAHGLSFSVPKGQRIPPSLQNMYKELQNSLGLSIPHHGCLEKWAKQGVLL LNTSLTVVEGQASSHSKIGWEVFTDHVIQKLNEREEALIFILWGNHARSKKKWIDSRKHY ILEGVHPSPLSANRGFFGCGHFRQVNEILRTLGKEEIDWQIEE >gi|224461427|gb|ACDD01000075.1| GENE 19 15444 - 17204 2131 586 aa, chain - ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 6 583 5 614 622 503 42.0 1e-142 MTQVNIDNYLLEILKPGQYLGNEINSIHKKEYQTHMCLFFPDIYEVGMSNLGIRILYNIL NKLEGFYLERGFCPMEDLEEKMREHQIPMFSWETKTPLKEFDIVGFSLSYEMAYPNLLNA LDLAGIPFRWKDRGEEYPLLMAGGTCMMNPTVISPFMDYIVIGDGEDVMPEIARIMMKNQ GKTKVEKLQAIQHLDGVWIPRFHKEGEKVKRAIVEDLNDTSYYAEQIVPYIEVVHDRATV EIQRGCSRGCRFCQAGIVYRPVRERSLEKNLELIEKMIQDTGYSEVSLSSLSSSDYSNIH QLIAGIKANPLNKNVGVSLPSLRMNPDSVRVAESISGGKRTGFTFAPEAGSQRMRDIINK GVTEEEILATAEEAVRAGWDNLKFYFMIGLPFETKEDVLAIHELAKKVMFKCRPISRRVQ VTVSVSNFVPKPHTPFAWQKQMGFEEMYEKHSLLREAFKGFKGVSLKIHDPKKSYLEGFL SRGDERISDLVELAFHKGVKLDDYRDNFELWKAAMDELGIQEEKYLGERSQDTVFPWDFV DTGVHKSFLLEEWEKAKKEALTPECREKCSMCGMRERFPKCLKIYK >gi|224461427|gb|ACDD01000075.1| GENE 20 17223 - 17645 426 140 aa, chain - ## HITS:1 COG:no KEGG:FN1229 NR:ns ## KEGG: FN1229 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 140 7 146 146 107 48.0 2e-22 MLRVRLKQAYVYFFIPWTAKKEQEVQNFLSEEEFFIFSTMGRYDKNHSYFLWKKIKKSEL VYLEIYQKLALLHDCGKEKKGFLARCLTVILGRKRMKDFHSERAYEKLKNRNLELAELCQ KHHQRATTKEMKLFQKLDDE Prediction of potential genes in microbial genomes Time: Fri May 20 02:28:24 2011 Seq name: gi|224461426|gb|ACDD01000076.1| Fusobacterium sp. 3_1_5R cont1.76, whole genome shotgun sequence Length of sequence - 10100 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 3, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 192 - 251 11.9 1 1 Op 1 . + CDS 357 - 923 519 ## FN2097 hypothetical protein 2 1 Op 2 . 
+ CDS 908 - 1078 163 ## gi|257453308|ref|ZP_05618607.1| hypothetical protein F3_09620 3 1 Op 3 24/0.000 + CDS 1080 - 2543 796 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 4 1 Op 4 10/0.000 + CDS 2594 - 3715 519 ## COG1459 Type II secretory pathway, component PulF 5 1 Op 5 . + CDS 3725 - 4198 624 ## COG2165 Type II secretory pathway, pseudopilin PulG + Prom 4256 - 4315 6.2 6 2 Tu 1 . + CDS 4339 - 4629 264 ## gi|257453312|ref|ZP_05618611.1| integral membrane protein + Prom 4637 - 4696 6.6 7 3 Op 1 . + CDS 4719 - 5012 276 ## gi|257453313|ref|ZP_05618612.1| hypothetical protein F3_09645 8 3 Op 2 . + CDS 4915 - 5565 323 ## gi|257453314|ref|ZP_05618613.1| hypothetical protein F3_09650 9 3 Op 3 . + CDS 5537 - 6058 527 ## gi|257453315|ref|ZP_05618614.1| hypothetical protein F3_09655 10 3 Op 4 . + CDS 6055 - 7182 607 ## gi|257453316|ref|ZP_05618615.1| hypothetical protein F3_09660 11 3 Op 5 . + CDS 7172 - 7765 333 ## gi|257453317|ref|ZP_05618616.1| hypothetical protein F3_09665 12 3 Op 6 . + CDS 7692 - 8993 1275 ## COG1450 Type II secretory pathway, component PulD 13 3 Op 7 . + CDS 9022 - 9600 691 ## gi|257453319|ref|ZP_05618618.1| hypothetical protein F3_09675 14 3 Op 8 . 
+ CDS 9647 - 10096 276 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 Predicted protein(s) >gi|224461426|gb|ACDD01000076.1| GENE 1 357 - 923 519 188 aa, chain + ## HITS:1 COG:no KEGG:FN2097 NR:ns ## KEGG: FN2097 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 85 188 31 134 134 66 38.0 5e-10 MKKINKGGFSLLEVCVSALLVMIVIQISTSLYRNYQEHLDLQLAKIKISKLFYLYSMKSF YQRKAYYFTISDIEKTIEVKNSFFLLENKVVLPNHLSYYLTSNSVLDQKYGHLTRNGNIS PSFSIYLFGYQGFVKDKITFSSFEETKILRLRQYHKIKGKSVDMENIQKYHLETNKNRKL FYQEWREE >gi|224461426|gb|ACDD01000076.1| GENE 2 908 - 1078 163 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453308|ref|ZP_05618607.1| ## NR: gi|257453308|ref|ZP_05618607.1| hypothetical protein F3_09620 [Fusobacterium sp. 3_1_5R] # 5 56 1 52 52 90 100.0 2e-17 MERRMKNIYQWINILFLCIVFEEKIPVIFHMKDSISRSSLLFYQEKGERKIILLGE >gi|224461426|gb|ACDD01000076.1| GENE 3 1080 - 2543 796 487 aa, chain + ## HITS:1 COG:VC2732 KEGG:ns NR:ns ## COG: VC2732 COG2804 # Protein_GI_number: 15642726 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Vibrio cholerae # 50 484 50 497 503 325 39.0 2e-88 MKKYFLDFIYFQKDCLQLFFFETVKKELQSLILPVARENQKLLRLEKLFAFYETENSIYY IVKDIEEMQADDEPKEKQIMYYVISQYLYEFYFQYFQIYYKNFSFVNTKEKQLSTQTIHM LLEIAVLTKVSDIHFEIFETNAQIRFRIDGKLKRVIMFSMETHSILISQIKILSKLNIVE KRLPQDGSFSKIIEKYQIDFRVSILPNIYGEKAVIRILDRNNTKFDLESLGFESDQLIAI KRILKSNAGIILNCGPTGSGKTTTLYSFLQYKNKEETNIVTIEDPVEYHLEGITQIACRE EIGLNFSVILKSLLRQDPDIIMIGEIRDRETAALAIKAALTGHLVFSTIHAKNSTQCIDR LCDLGISPFLISNSLLMILSQRLFRKNCIYCRNKNEDSVKLSSLLAYNSNKEIKSYSSVG CSHCNYKGYLGRIGVYELFIVDDYNRNWILCRDTKKELKPHMISLDENVLNKIKSGIISL EEVIGEI >gi|224461426|gb|ACDD01000076.1| GENE 4 2594 - 3715 519 373 aa, chain + ## HITS:1 COG:PA4527 KEGG:ns NR:ns ## COG: PA4527 COG1459 # Protein_GI_number: 15599723 # Func_class: N Cell motility; U Intracellular 
trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Pseudomonas aeruginosa # 41 373 39 373 374 129 22.0 8e-30 MIVNTEKDVYKCLNVRKNKKMVIFSREINFSFYRKKYLLPFVKEFLFLLRNGIAYLEAFT IMKTYENNIFKKKILEDIIDSVQQGNKIVDSFSINSEFFGKFFLKVLFIGEESGNMESAL ELLISELEEYKKLKKQIFSLLFYPCFLICFSTFILIFLFSFIFPKLLSLFQDTGIPLPLI TRILLQIKYIFPFLGLIVILCFIGIYLIFIKKYHKELQHKIDRFLFNKYYCSGLFAEMLR LRMSKYLELLLKTGFSFQETFSILEKEIENLEFCKRFLSMKKKIYKGEKVHIAFRELGCF SEKDLYFIALGEEGGNIEEIFQKIAMYTQEQLHFKIQKYLLWLEPSIFIIFGLCIGIVII AVYLPMFSLSNIL >gi|224461426|gb|ACDD01000076.1| GENE 5 3725 - 4198 624 157 aa, chain + ## HITS:1 COG:FN2093 KEGG:ns NR:ns ## COG: FN2093 COG2165 # Protein_GI_number: 19705383 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Fusobacterium nucleatum # 8 157 1 151 151 129 45.0 3e-30 MKNKGFTLIEIVIAVAIVAVLSTLVTPQVRNQLAKGKDTKAIATLSSLRIASQMYQMEHT EKLIEPDDYDSDEKVKEAFQKLSEYLDPNAKKILKDAKIEIGGSKNSKDAGIQYGGELFF TFKNPDEKGKSDGIYLWFKLPENIGQFDSRGVEWKSY >gi|224461426|gb|ACDD01000076.1| GENE 6 4339 - 4629 264 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453312|ref|ZP_05618611.1| ## NR: gi|257453312|ref|ZP_05618611.1| integral membrane protein [Fusobacterium sp. 3_1_5R] # 1 96 53 148 148 139 98.0 5e-32 MISISLYSFPLIFLYGYVSDFVQKEVLGFGDIKFVMSVGAIMASTYHLWISIYYFYMISF VLASMIGVYILYSKKTKELAMLPYFSLSLCILKVYL >gi|224461426|gb|ACDD01000076.1| GENE 7 4719 - 5012 276 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453313|ref|ZP_05618612.1| ## NR: gi|257453313|ref|ZP_05618612.1| hypothetical protein F3_09645 [Fusobacterium sp. 
3_1_5R] # 1 97 32 128 128 164 100.0 2e-39 MLSFLWLREKQYDKKWEQRNAIISFQAKIQKERMEEGDFYYNNAKWSLDNMNSKFLLKVS KEHLRNGDKEEDIYYFQMFDIESSKKIWEGWSIVIPK >gi|224461426|gb|ACDD01000076.1| GENE 8 4915 - 5565 323 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453314|ref|ZP_05618613.1| ## NR: gi|257453314|ref|ZP_05618613.1| hypothetical protein F3_09650 [Fusobacterium sp. 3_1_5R] # 28 216 1 189 189 332 99.0 1e-89 MEIRRRIFTIFKCLISKVQKRFGKGGVLLYQNNKETAFSSLEIAIAFSIFLIFLSFFFPS IFLFCGSYEKIREISKISQEEQNLERLLEHLLAHKISYLSPDLPSCFVLDTEGKTLIDAN IQSLKSWKMEEGDTLMIQCIFQDDKNQYLEKTFVLRFFRSHLYLEQYRNGYFITGDRIDM LSNVRGYFSIKDSILKICYIRKNRDKTYENNFYISS >gi|224461426|gb|ACDD01000076.1| GENE 9 5537 - 6058 527 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453315|ref|ZP_05618614.1| ## NR: gi|257453315|ref|ZP_05618614.1| hypothetical protein F3_09655 [Fusobacterium sp. 3_1_5R] # 1 173 1 173 173 269 100.0 6e-71 MKTIFIFHHKKRGFIFLPILFFISFFMGMMLIEFQEIYSLFSVHILEKQSEEKKISQETL TNIVNYEKAKIEKYLSEYPDKKLYHYLTETEDQISLLVKTNSPISIGGYHLENEIPQKII YDTWKGYFVKYYELIERKQKYKIRFMVEYRYGKNKKVSEFISYKIIEMEVYLL >gi|224461426|gb|ACDD01000076.1| GENE 10 6055 - 7182 607 375 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453316|ref|ZP_05618615.1| ## NR: gi|257453316|ref|ZP_05618615.1| hypothetical protein F3_09660 [Fusobacterium sp. 3_1_5R] # 1 365 1 365 375 588 100.0 1e-166 MKYSVYTWKEICSQQNIAKNSILLLDSKFFKIIVLTIPPSIDEEDRKETVYEKLSQDYFL DTKEISYIYECVLEENAQTETVFCCYLKEDISFLSNLSFNILFVIPSFLLGTAISKVKKY YLLNFQEKEVYIFLYENQKMSSVQMIQLHSEEQSMINQCLQLQMEKLPVILIGDYTEKQR EILSQYFSIYDLNRLQIRKIAQKIDIFHHGRFSRKKKYIKYLSYLYLLGSCMIICLGFYW QYQIESLRTELVSLEKKISHVGYQMNLLEEEILKVREEQEKREQELEEQKQKFYQIHKML WKIYEISKWKEIICIESLEDRTLKLKLQFPSQKSYLYFMSQLLRKNFKFLNHDRIERIHN KYEVDIEIEEAENEE >gi|224461426|gb|ACDD01000076.1| GENE 11 7172 - 7765 333 197 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453317|ref|ZP_05618616.1| ## NR: gi|257453317|ref|ZP_05618616.1| hypothetical protein F3_09665 [Fusobacterium sp. 
3_1_5R] # 1 197 1 197 197 311 100.0 1e-83 MKNKKEIHGIIVIGISFMIVYFFTWKNFQEYREKNRKKEELLQKIMLQEKHYQELQQQLK IMKISLPHKEIEEQNQEMFSHLLEFEILLQRVLEKHHLQLQGLGRIQKEGNRLFISSKIQ GKIYNLLMLIQELEQDSRRVSFSEEYWKLERSHNQTAILDCNFVIYVKEGEYDFEVITRN NKKNRVPFVHLAKKTLY >gi|224461426|gb|ACDD01000076.1| GENE 12 7692 - 8993 1275 433 aa, chain + ## HITS:1 COG:FN2086 KEGG:ns NR:ns ## COG: FN2086 COG1450 # Protein_GI_number: 19705376 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Fusobacterium nucleatum # 83 433 60 402 402 235 38.0 2e-61 MKLLQGITKRTGYLLYIWQKRHSISFYILFALICLFTLEEVCYAKELPKDIEMSDTTLRE VLDELEGCLGIKITVDESPKDPLNLFFQEGQSIEEVLDMLGEITNKKVKKISNHEFLLEE VQIQKEEISKEYHLHYLRSKEIYDALKDLFPDIKIANLDSRNQVIVVAEEKKIHEIDKLM EHMDIEGKQVKVHSQILDISKDLFHELGFDWLYEKPSQQKNKFSVAVLGEESVGNSGPVL GSKWNLIRQFSNATEALGLSLKLLEARQDLKITSSPSILIAHGNKGEFKITEEVIVGEKK EKKKGESTSVEPIFKEAGLILKVIPYIHQDNSVTLDISLELSDFRYRQTNHKKDWNFNSQ GGSKIGRSLSTRIHVKNKEMILIGGLSRFTRRNTENKVPFLSDIPGLGYLFKSESKKDAE TDMYIKIFIEVCE >gi|224461426|gb|ACDD01000076.1| GENE 13 9022 - 9600 691 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453319|ref|ZP_05618618.1| ## NR: gi|257453319|ref|ZP_05618618.1| hypothetical protein F3_09675 [Fusobacterium sp. 
3_1_5R] # 1 192 1 192 192 320 100.0 2e-86 MGDFNKEILFAPGGKVFIGSNSEENRILKECSMYQIVKEVLPQVYYRLPKRRKEIYEEEI LKIARYYMKVQYNINSLFPKGETAAYILQYSTRKLTEFDFYAPTYKNRKFQVGKYKLTFY RKDKNSPLFEMGERAKLVELFRYIGPYSFKYDVRKQLKEIIKKYRFENLKADFDVPKWME KEFDKIKKIQEI >gi|224461426|gb|ACDD01000076.1| GENE 14 9647 - 10096 276 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 9 147 15 153 165 110 38 3e-24 MDFQELKLKQYASLIEDEKDEIAILSNTSAFLYEILEDVNWVGFYFVKGDELVLGPFQGK TACYRIPFSRGVCGWVARNEKPIIVPNVHEFEGHIACDASSNSEIVLPIFKDGKLYAVLD IDSAEFDNFCILEQVFLGEIIEILEKKWK Prediction of potential genes in microbial genomes Time: Fri May 20 02:29:28 2011 Seq name: gi|224461425|gb|ACDD01000077.1| Fusobacterium sp. 3_1_5R cont1.77, whole genome shotgun sequence Length of sequence - 2953 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 76 - 2852 97.0 # FJ410389 [D:301..3086] # 23S ribosomal RNA # Fusobacterium necrophorum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Fusobacterium. Prediction of potential genes in microbial genomes Time: Fri May 20 02:29:32 2011 Seq name: gi|224461424|gb|ACDD01000078.1| Fusobacterium sp. 3_1_5R cont1.78, whole genome shotgun sequence Length of sequence - 1396 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 59 - 1396 97.0 # AJ295750 [D:1..1472] # 16S ribosomal RNA # Fusobacterium equinum # Bacteria; Fusobacteria; Fusobacteriales; Fusobacteriaceae; Fusobacterium. Prediction of potential genes in microbial genomes Time: Fri May 20 02:29:33 2011 Seq name: gi|224461423|gb|ACDD01000079.1| Fusobacterium sp. 
3_1_5R cont1.79, whole genome shotgun sequence Length of sequence - 1506 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 201 122 ## gi|257453322|ref|ZP_05618621.1| hypothetical protein F3_09692 2 1 Op 2 . + CDS 188 - 670 711 ## gi|257453323|ref|ZP_05618622.1| hypothetical protein F3_09697 3 1 Op 3 . + CDS 674 - 1357 515 ## gi|257453324|ref|ZP_05618623.1| hypothetical protein F3_09702 4 1 Op 4 . + CDS 1386 - 1496 130 ## Predicted protein(s) >gi|224461423|gb|ACDD01000079.1| GENE 1 1 - 201 122 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453322|ref|ZP_05618621.1| ## NR: gi|257453322|ref|ZP_05618621.1| hypothetical protein F3_09692 [Fusobacterium sp. 3_1_5R] # 1 66 1 66 66 108 100.0 1e-22 EWKKNGKKTAGVIAQEVEKILPQAVQNQGYKSVDYNALVGLCIEINKALLERIERLEKVV EKYGNS >gi|224461423|gb|ACDD01000079.1| GENE 2 188 - 670 711 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453323|ref|ZP_05618622.1| ## NR: gi|257453323|ref|ZP_05618622.1| hypothetical protein F3_09697 [Fusobacterium sp. 3_1_5R] # 1 160 1 160 160 298 100.0 8e-80 MAIPSKGPISLNDIRQNLGVYGPISLNDYRVRALAKKPSGTISLKDCYKQSAENVYKLVV ERNGDGDYGYALGRLGSITPQKLNGKTITFFFAYDSYITLKTQDTKPYFKEVTLEYEDRV ITLQQANYTKYRYFGYDDYIIKKIQSSVGKGIEIRLTAKE >gi|224461423|gb|ACDD01000079.1| GENE 3 674 - 1357 515 227 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453324|ref|ZP_05618623.1| ## NR: gi|257453324|ref|ZP_05618623.1| hypothetical protein F3_09702 [Fusobacterium sp. 
3_1_5R] # 1 227 1 227 227 378 100.0 1e-103 MKKYIFRVQDLKQGNMIPYNVLDLDEKTPENTETETFIALEIEESYDIDLFLYDKEKQEV RRKTKKELYDNGLYTLKIGEIFDETSQDFKTLDQPSRWHTWNGKEWEVDIKEVKDIVNEK WKVERQLKIDKDTNYKGFIFQTREYTDIHNFEQYGFLMMLGKAKQSDKVEWRMKDNSYHE FTVQELLEVVAIWGARKKAIFEDNKRMWLELEKAETVEEIEKIKWGV >gi|224461423|gb|ACDD01000079.1| GENE 4 1386 - 1496 130 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVRDMFLEADMRFILGFLFGVMYVSIILSLCLLGI Prediction of potential genes in microbial genomes Time: Fri May 20 02:29:59 2011 Seq name: gi|224461422|gb|ACDD01000080.1| Fusobacterium sp. 3_1_5R cont1.80, whole genome shotgun sequence Length of sequence - 1506 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 299 70 ## gi|257453326|ref|ZP_05618625.1| hypothetical protein F3_09712 + Term 308 - 341 -0.2 + Prom 460 - 519 3.4 2 2 Tu 1 . + CDS 553 - 615 66 ## 3 3 Tu 1 . - CDS 924 - 1088 122 ## gi|257453328|ref|ZP_05618627.1| hypothetical protein F3_09722 - Prom 1142 - 1201 8.3 4 4 Tu 1 . - CDS 1208 - 1414 267 ## gi|257453329|ref|ZP_05618628.1| hypothetical protein F3_09727 - Prom 1437 - 1496 4.7 Predicted protein(s) >gi|224461422|gb|ACDD01000080.1| GENE 1 3 - 299 70 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453326|ref|ZP_05618625.1| ## NR: gi|257453326|ref|ZP_05618625.1| hypothetical protein F3_09712 [Fusobacterium sp. 3_1_5R] # 1 98 1 98 98 172 100.0 5e-42 SKKWKKIIDEPQSYLVHNKSFTIDSEASELLIIETENASQDYSIYGLYDIELVKEIGKAV LTPMGYYDGSSFTISGNTLSFKSYVGTSNNNKIWVYTR >gi|224461422|gb|ACDD01000080.1| GENE 2 553 - 615 66 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYGTMVLINGLEKSTVSSID >gi|224461422|gb|ACDD01000080.1| GENE 3 924 - 1088 122 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453328|ref|ZP_05618627.1| ## NR: gi|257453328|ref|ZP_05618627.1| hypothetical protein F3_09722 [Fusobacterium sp. 
3_1_5R] # 1 54 1 54 54 83 100.0 5e-15 MEQFFSAIIVKGFNEKISYYASEKEIEIQLINNKIYLTDKGEEWDAFIQKVYYM >gi|224461422|gb|ACDD01000080.1| GENE 4 1208 - 1414 267 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453329|ref|ZP_05618628.1| ## NR: gi|257453329|ref|ZP_05618628.1| hypothetical protein F3_09727 [Fusobacterium sp. 3_1_5R] # 1 68 31 98 98 125 98.0 1e-27 MIGQDYNWNYYSTGIIPVLDKFPYNLALGSAISGAENDHFVVHVEENKISVTKTRSYHKQ IFTLIAYY Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:17 2011 Seq name: gi|224461421|gb|ACDD01000081.1| Fusobacterium sp. 3_1_5R cont1.81, whole genome shotgun sequence Length of sequence - 1388 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 389 391 ## COG3177 Uncharacterized conserved protein 2 1 Op 2 . - CDS 439 - 1176 843 ## Lebu_1402 filamentation induced by cAMP protein Fic 3 1 Op 3 . - CDS 1218 - 1340 103 ## gi|257451832|ref|ZP_05617131.1| helicase Predicted protein(s) >gi|224461421|gb|ACDD01000081.1| GENE 1 2 - 389 391 129 aa, chain - ## HITS:1 COG:pli0008 KEGG:ns NR:ns ## COG: pli0008 COG3177 # Protein_GI_number: 18450294 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 9 129 6 131 254 100 38.0 7e-22 MLENTIMGKITDEYKDDLLVRMAHHSTAIEGNTLTQGDTTSILIYGYIPKGMNEREYYEV KNYKKAFSFLLEAEKEISSTLMKQYHKLIMENLRDDNGQFKKTGNMVIGADFEPTKPYLV PSMIENWCN >gi|224461421|gb|ACDD01000081.1| GENE 2 439 - 1176 843 245 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1402 NR:ns ## KEGG: Lebu_1402 # Name: not_defined # Def: filamentation induced by cAMP protein Fic # Organism: L.buccalis # Pathway: not_defined # 1 221 1 221 222 282 60.0 7e-75 MKDKYNMTLEENIFVAKRNMVDSIWKSANLEGIAVTYPETEIIIDGMAVQNMYIKDINSV VNLKHAWNFLLENVEYPIDLGYLCKLHQYLGEANVIPFPGVVRTSGVNMGGTSWKSEERP DKERIQDNIKEILETKSPTERAMDIFLYLTRQQIFYDGNKRLATLAANQIMIQNGVGLLS 
VPIEKQKEFKDKLITYYETNQTEDLKKFLYDFCIEGIHFEKARENLSSTKNTWEKKITKT NDLSR >gi|224461421|gb|ACDD01000081.1| GENE 3 1218 - 1340 103 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257451832|ref|ZP_05617131.1| ## NR: gi|257451832|ref|ZP_05617131.1| helicase [Fusobacterium sp. 3_1_5R] # 1 40 2205 2244 2244 70 100.0 3e-11 MDDSNSKIMEEKSGLENRYKSTEVKNPWEKKITKTNDLSR Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:25 2011 Seq name: gi|224461420|gb|ACDD01000082.1| Fusobacterium sp. 3_1_5R cont1.82, whole genome shotgun sequence Length of sequence - 1381 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 49 - 558 475 ## COG2963 Transposase and inactivated derivatives + Prom 585 - 644 3.8 2 2 Op 1 . + CDS 664 - 744 75 ## 3 2 Op 2 . + CDS 801 - 1380 438 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|224461420|gb|ACDD01000082.1| GENE 1 49 - 558 475 169 aa, chain + ## HITS:1 COG:FN1887 KEGG:ns NR:ns ## COG: FN1887 COG2963 # Protein_GI_number: 19705192 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 169 1 169 169 199 82.0 2e-51 MSKLTREDKIEIYERRLKGETISSLAKSFNIHESNIKYLIALIGKYGNNILRKSKNRAYS KEFKLQAINRILINHESINSVAIDIGLISAGVLHNWLSKFKENGYNVVEKKKGRKPKSMT KTKNNDKELSEKEKIKKLEDEIIYLKAENEYLKKLRALVQERELKKKKK >gi|224461420|gb|ACDD01000082.1| GENE 2 664 - 744 75 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRRIKILLKKLKKFIMRIKEDMVIAE >gi|224461420|gb|ACDD01000082.1| GENE 3 801 - 1380 438 193 aa, chain + ## HITS:1 COG:FN0486 KEGG:ns NR:ns ## COG: FN0486 COG2801 # Protein_GI_number: 19703821 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 193 1 193 203 329 97.0 2e-90 
MKKFNLQSIIRKKRKYSSYKGQVGKIADNHIKRDFEATAPNQKWFTDVTEFNLRGEKLYL SPILDAYGRYIVSYDISRSPNLEQINHMLNLALKENENYENLIFHSDQGWQYQHYSYQKR LKEKKITQSMSRKGNSLDNGLMECFFGLLKSEMFYEQEEKYKTLEELKEAIENYIYYYNN KRIKEKLKGLTPA Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:29 2011 Seq name: gi|224461419|gb|ACDD01000083.1| Fusobacterium sp. 3_1_5R cont1.83, whole genome shotgun sequence Length of sequence - 1158 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 6.8 1 1 Tu 1 . + CDS 104 - 1132 494 ## PROTEIN SUPPORTED gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase Predicted protein(s) >gi|224461419|gb|ACDD01000083.1| GENE 1 104 - 1132 494 342 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase [Streptococcus pneumoniae SP6-BS73] # 3 334 2 315 317 194 38 2e-50 MAQQQYTTKRRKGQHLTLIERGKIEAFLKINIPKIQIASEIGISIRTLYREINRGMVRGL LNSDYSTYDAYSAEFAHKKYLEVMKSKEGTLKIGKNRKLIEYVENSMLNDKNSPYVALEK AKKENIEVNICLKTLYNYIHKQLFINFSEEDMIYKKDRRKQEKIPKRIRKIGGRSIEERP EEINNRQEVGHFEADTVVGKRGTKEAILVLTDRKTRLEMVRKIPDKTAESVIKELSKIII EYPGVIKSITSDNGSEFMRADKIEEENIAYYYAHSYSSWERGSNENNNKLIRRFIPKGTD ISEVSEEEIKEIEKWMNDYPRKLFNGKSANEMYLSEFTKYFS Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:30 2011 Seq name: gi|224461418|gb|ACDD01000084.1| Fusobacterium sp. 3_1_5R cont1.84, whole genome shotgun sequence Length of sequence - 1015 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1013 1045 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase Predicted protein(s) >gi|224461418|gb|ACDD01000084.1| GENE 1 2 - 1013 1045 337 aa, chain + ## HITS:1 COG:FN1122_2 KEGG:ns NR:ns ## COG: FN1122_2 COG0204 # Protein_GI_number: 19704457 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Fusobacterium nucleatum # 181 337 2 159 228 165 53.0 1e-40 YAIKHNNNAKAAIRYHTSHQQVKRWRDRYDGTIQSLLPKSRRPKSHPKWEVIDKYNKQAP DYRKVLDTIIVPNEFPKTKIGKIRRFMVPAVLENIGKEEVVTEEPSTEEYTIIKEYLSTS KGRTVVPQAHLELDLGMDSLDMIEFISFLGSRFGMVVQNETILENSTVESIAAYVEKHRG EDKIEDVNWKEILNKETEIKLPYYGIFARIGKLLNYLLFWTYFRIEIQEREYLEKKPTIY VGNHQSFLDVALIARAFPTSILKNCFFMAKGIHFKSFFMKFFAKQGNVVLLDINENITEV LQTMAKVLREGKSILIFPEGVRTRDGKLNSFKKSFAI Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:31 2011 Seq name: gi|224461417|gb|ACDD01000085.1| Fusobacterium sp. 3_1_5R cont1.85, whole genome shotgun sequence Length of sequence - 1008 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 38 - 1006 753 ## COG0582 Integrase Predicted protein(s) >gi|224461417|gb|ACDD01000085.1| GENE 1 38 - 1006 753 322 aa, chain + ## HITS:1 COG:CAC1110 KEGG:ns NR:ns ## COG: CAC1110 COG0582 # Protein_GI_number: 15894395 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 3 312 14 325 340 62 23.0 1e-09 MLENWQGVTEKNRKIYEKYLNSCRSNNEETWETTYKTYASRMYKFLKWLNKEKNRYLLSQ DTLENAVEIIEEYKNYCRECGNSKRTIANAIVTISSFYDWTVRRKMIKYHPFKDRLEKQK ITDRDNTRESYYLTTEQVLTARLYMKVENKKFDLQDRILWELFIDSACRISAIQALTLEQ LELEGGYFKNVIEKEGYVVNAYFFDTCKSLIKEWLQERERIGIKEKWLFVTKYEGEYRQM SQATIRKRIKKIGKILEINGLYPHSLRKTSINLLSKLGGLDIASHYANHASTVVTSKHYI EKESATEIRNQILALRQKIGIF Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:32 2011 Seq name: gi|224461416|gb|ACDD01000086.1| Fusobacterium sp. 3_1_5R cont1.86, whole genome shotgun sequence Length of sequence - 956 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 1 - 150 88 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Prom 159 - 218 3.8 2 1 Op 2 1/0.000 + CDS 243 - 728 508 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 3 1 Op 3 . 
+ CDS 619 - 955 339 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461416|gb|ACDD01000086.1| GENE 1 1 - 150 88 49 aa, chain + ## HITS:1 COG:SP1275 KEGG:ns NR:ns ## COG: SP1275 COG0458 # Protein_GI_number: 15901135 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Streptococcus pneumoniae TIGR4 # 1 45 17 61 1058 77 84.0 4e-15 PIIIGQAAEFDYSGTQACETLKKEGIEVVLINSNPATIMTDKAIAIEFT >gi|224461416|gb|ACDD01000086.1| GENE 2 243 - 728 508 161 aa, chain + ## HITS:1 COG:lin1949 KEGG:ns NR:ns ## COG: lin1949 COG0458 # Protein_GI_number: 16801015 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Listeria innocua # 1 143 98 240 1070 176 57.0 2e-44 MAVELSEKGILEKYGIKVIGTSIESIKRGEDRELFREAMEKIGEPILTSHVVESLEEGYK IANEIGYPVVVRPAYTLGGTGGGFAHNPQELEEILLKGLSLSRVGQVLIERSILGWKEIE YEVIRDANGNGITVCNMENIDPVRNSYRRFHCSGSKPNFDR >gi|224461416|gb|ACDD01000086.1| GENE 3 619 - 955 339 112 aa, chain + ## HITS:1 COG:BH2536 KEGG:ns NR:ns ## COG: BH2536 COG0458 # Protein_GI_number: 15615099 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus halodurans # 18 112 240 334 1062 155 76.0 1e-38 MPMEMALRYVIWKILTRLGIHTGDSIVVAPSQTLTDREYQMLRRASLKIVEEIGIVGGCN VQFALHPKSFEYAIIEINPRVSRSSALASKATGYPIARVATKLAMGYLLDEV Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:32 2011 Seq name: gi|224461415|gb|ACDD01000087.1| Fusobacterium sp. 
3_1_5R cont1.87, whole genome shotgun sequence Length of sequence - 853 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 852 1226 ## HSM_0708 YadA domain-containing protein Predicted protein(s) >gi|224461415|gb|ACDD01000087.1| GENE 1 3 - 852 1226 283 aa, chain - ## HITS:1 COG:no KEGG:HSM_0708 NR:ns ## KEGG: HSM_0708 # Name: not_defined # Def: YadA domain-containing protein # Organism: H.somnus_2336 # Pathway: not_defined # 47 271 2608 2846 3674 78 31.0 3e-13 NGGQLHAVKQDVKAAKTEVTSKGKTVQVTEQAATDGHTIYNVEVRQNVKYATEDGKEVIL GTDGKFYHPENLKEDGTPVDANKAIPKDKVKAKLAEEAKLDNISSGKIADGSKEAINGSQ LKEVGDYLGLQPKQDGTGFEKPTFTALKNVDGSDNTAPKNVIDSVNTTIGKVNEGLKYGA DNTTTPTTQQLGSSLSVKSAEKELVKAGTDAKFVGKNIVTNYENNSGNGTVSIGISEKPE FKEVTLKDGDGNTTTINKDGMTITPKNLEEGKTVVKLTKNGLD Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:37 2011 Seq name: gi|224461414|gb|ACDD01000088.1| Fusobacterium sp. 3_1_5R cont1.88, whole genome shotgun sequence Length of sequence - 804 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 29 - 316 407 ## gi|257453338|ref|ZP_05618637.1| hypothetical protein F3_09787 2 1 Op 2 . - CDS 388 - 804 602 ## gi|257453339|ref|ZP_05618638.1| hypothetical protein F3_09792 Predicted protein(s) >gi|224461414|gb|ACDD01000088.1| GENE 1 29 - 316 407 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453338|ref|ZP_05618637.1| ## NR: gi|257453338|ref|ZP_05618637.1| hypothetical protein F3_09787 [Fusobacterium sp. 
3_1_5R] # 1 95 1 95 95 149 100.0 4e-35 MGLFGKKEQKPFVSNNSLVEIISCLGAYEEGISEKGYEALEKKLERRYLYRNVESVDFGG EVALVKYKDLKVRSEFEVKKMREEMEREAGLDLGR >gi|224461414|gb|ACDD01000088.1| GENE 2 388 - 804 602 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257453339|ref|ZP_05618638.1| ## NR: gi|257453339|ref|ZP_05618638.1| hypothetical protein F3_09792 [Fusobacterium sp. 3_1_5R] # 1 138 1 138 138 228 100.0 7e-59 VLVTILNIASDFMDGTFLITLSKKVSGWKAIKEGVIFQKTAILMFLYLLINRAGSIASAL LSGAIASIGIGSQMGAQGFSNAVSTPGRLAGNMANSASRFSDDQTHKGGAKAFRRESYAA NAYKRAADGVKNFASRFR Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:50 2011 Seq name: gi|224461413|gb|ACDD01000089.1| Fusobacterium sp. 3_1_5R cont1.89, whole genome shotgun sequence Length of sequence - 794 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 56 - 793 1071 ## COG5295 Autotransporter adhesin Predicted protein(s) >gi|224461413|gb|ACDD01000089.1| GENE 1 56 - 793 1071 245 aa, chain - ## HITS:1 COG:PM0714 KEGG:ns NR:ns ## COG: PM0714 COG5295 # Protein_GI_number: 15602579 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Pasteurella multocida # 6 238 1957 2198 2712 73 36.0 3e-13 NGSQLDKVAKESKTEVKAGDSGNVTVNKSDDTPDKHVVYTVDMKKDITLDKVTVKDKEDN KTEVTPGKVSVDGKNGSGVTLNGADGSIGLKGENGKDALSIKGEKGQAGVDGKNGTDGKT RIVYEYADPKNPGTKVREEVATLNDGIKYKGDSGEAYTKLNKQTEIVGGQKDTDKLSENN IGVVASQDGDNAKLTVKLSKELKDLTSVETKDEEGNKTVQNSKGTTITDKDGNKTEITKD GMTIT Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:50 2011 Seq name: gi|224461412|gb|ACDD01000090.1| Fusobacterium sp. 
3_1_5R cont1.90, whole genome shotgun sequence Length of sequence - 794 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 10.6 1 1 Tu 1 . + CDS 83 - 787 780 ## COG2992 Uncharacterized FlgJ-related protein Predicted protein(s) >gi|224461412|gb|ACDD01000090.1| GENE 1 83 - 787 780 234 aa, chain + ## HITS:1 COG:FN1894 KEGG:ns NR:ns ## COG: FN1894 COG2992 # Protein_GI_number: 19705199 # Func_class: R General function prediction only # Function: Uncharacterized FlgJ-related protein # Organism: Fusobacterium nucleatum # 31 231 2 202 203 172 49.0 5e-43 MKKFVILLIISIFSFAFICQDAMASTNAVSITQAKDFSKIAKNRKQVFIDTLVPIINEIK GNIKTDKEKVEEILKKEEAMRTNSEKALLEENYTKYKVNSRTPQELLKKMVLPPTSLIIA QASVESGWGGSKLAQLGNNLFGMTSISKSSADSVKIGNMRYKKYAGIQESVEDYILTISR HNAYKSLRGGIRRGEDSVGLVKHLGSYSELGSKYSSYVAKVIQSNSLQKHDTDL Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:51 2011 Seq name: gi|224461411|gb|ACDD01000091.1| Fusobacterium sp. 3_1_5R cont1.91, whole genome shotgun sequence Length of sequence - 731 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 20 - 730 466 ## COG0732 Restriction endonuclease S subunits Predicted protein(s) >gi|224461411|gb|ACDD01000091.1| GENE 1 20 - 730 466 236 aa, chain - ## HITS:1 COG:VC1768 KEGG:ns NR:ns ## COG: VC1768 COG0732 # Protein_GI_number: 15641771 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Vibrio cholerae # 13 203 36 216 462 60 28.0 4e-09 NDNLEQQAQALFKEWFIDNPEKKNWSNGTFSDLIQSTLSGDWGKEVATRNNTEKVYCIRG ADIPEVKAGNKGKMPIRYILPKNYASKKLNAGDIVVEISGGSPTQSTGRCTAISESLLNR YDSGMICTNFCRAIKPISGYSIFIYYYWQHLYDKGVFFSYENGTTGIKNLDISGFLETEP IVIPLKEKILEFNDYCQTIFNQIFSHGKESEYLVQLRDTLLNKLMSGEIDVSAIDL Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:52 2011 Seq name: gi|224461410|gb|ACDD01000092.1| Fusobacterium sp. 3_1_5R cont1.92, whole genome shotgun sequence Length of sequence - 695 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 694 879 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461410|gb|ACDD01000092.1| GENE 1 1 - 694 879 231 aa, chain - ## HITS:1 COG:BH2536 KEGG:ns NR:ns ## COG: BH2536 COG0458 # Protein_GI_number: 15615099 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus halodurans # 1 231 336 563 1062 234 49.0 8e-62 NEVTGKTYACFEPSLDYIVVKIPKWPFDKFKKADRRLGTKMMATGEIMAIGENFESAFLK GIRSLEIGRYNLEHPAIESLRMEELKKEVVNPSDERIFVVAEMLRRGYIKEKLQKLTGID KFFMEKIEWIVKQEELLKKMSFADLDEKFLRNLKKKGFSDKGIADLMKISEEDIHAKRMQ YGIVPSYKMVDTCAGEFESSSSYYYSTYSQYDEVVVNSGRKMIVIGSGPIR Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:52 2011 Seq name: gi|224461409|gb|ACDD01000093.1| Fusobacterium sp. 
3_1_5R cont1.93, whole genome shotgun sequence Length of sequence - 695 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 695 809 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461409|gb|ACDD01000093.1| GENE 1 2 - 695 809 231 aa, chain + ## HITS:1 COG:BH2536 KEGG:ns NR:ns ## COG: BH2536 COG0458 # Protein_GI_number: 15615099 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus halodurans # 1 231 336 563 1062 232 48.0 4e-61 NEVTGKTYACFEPSLDYIVVKIPKWPFDKFKKADRRLGTKMMATGEIMAIGENFESAFLK GIRSLEIGRYNLEHPAIESLRMEELKKEVVNPSDERIFVVAEMLRRGYIKEKLQKLTGID KFFMEKIEWIVKQEELLKKMSFADLDEKFLRNLKKKGFSDKGIADLMKISEEDIHDKRIQ YGILPSYKMVDTCAGEFEASSSYYYSTYSQYDEVVVNSGRKMIVIGSGPIR Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:53 2011 Seq name: gi|224461408|gb|ACDD01000094.1| Fusobacterium sp. 3_1_5R cont1.94, whole genome shotgun sequence Length of sequence - 654 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 620 716 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|224461408|gb|ACDD01000094.1| GENE 1 3 - 620 716 205 aa, chain + ## HITS:1 COG:Z2082 KEGG:ns NR:ns ## COG: Z2082 COG3328 # Protein_GI_number: 15801523 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 203 88 290 293 208 46.0 6e-54 VISLEIGENESSKYWLGILNALKNRGVKDIMVLCADGLSGMKEAIQTAFPETEYQRCIVH QVRNTLKHVSYKDMKAFAADLKQIYLAPTEEKGYEALQRVKEKWEEKYPYSMKSWEQNWD ILSPIFKFSMDVRKVIYTTNAIESLNSTYKKLNRQRSIFPNEKALLKTLYLATLQATKKW TMPLRNWGKVYGEFSIMYEERFEKN Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:54 2011 Seq name: gi|224461407|gb|ACDD01000095.1| Fusobacterium sp. 3_1_5R cont1.95, whole genome shotgun sequence Length of sequence - 617 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 615 741 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|224461407|gb|ACDD01000095.1| GENE 1 3 - 615 741 204 aa, chain - ## HITS:1 COG:SMa0384 KEGG:ns NR:ns ## COG: SMa0384 COG3328 # Protein_GI_number: 16262658 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 36 202 27 188 400 138 39.0 8e-33 KDVYKVKPLTEGKKNIIASLLQEYDIQSAQDIQVALRDLLGGTIQSMLEAEMEEHLGYEN YERTEDRMEGDNYRNGTKKKKIRSQYGEFEVEVPQDRNSSFDPKIVKKRQKDISEIDQKI INMYARGLTTRQISQQIEELYGFECSESFISNVTDKILQDIEDWQNRPLDAIYPILFIDA VHFSVREDNRVKKIAAYVILGITI Prediction of potential genes in microbial genomes Time: Fri May 20 02:30:54 2011 Seq name: gi|224461406|gb|ACDD01000096.1| Fusobacterium sp. 
3_1_5R cont1.96, whole genome shotgun sequence Length of sequence - 595 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 37 - 411 564 ## gi|257453347|ref|ZP_05618646.1| GrdX protein - Term 132 - 187 13.4 2 2 Tu 1 . - CDS 392 - 595 181 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase Predicted protein(s) >gi|224461406|gb|ACDD01000096.1| GENE 1 37 - 411 564 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|257453347|ref|ZP_05618646.1| ## NR: gi|257453347|ref|ZP_05618646.1| GrdX protein [Fusobacterium sp. 3_1_5R] # 1 124 1 124 124 201 100.0 1e-50 MEYVIITNNRKVANLYQETNQVKFYEHKDFLHILDKVQEQVYEGRKLLSDPIISHLEDAK NPFKSVIVSKEYFSENQEFKKIIDLAVKIATQLETPTETYSEEELEAFRFIDLKLLQESS HAFD >gi|224461406|gb|ACDD01000096.1| GENE 2 392 - 595 181 67 aa, chain - ## HITS:1 COG:FN1122_2 KEGG:ns NR:ns ## COG: FN1122_2 COG0204 # Protein_GI_number: 19704457 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Fusobacterium nucleatum # 1 62 161 222 228 60 43.0 5e-10 AKELDVDVQAFVIQGAYELFPTSARMPKMGKVHLEILPRFSPKDMTYEEITQEARNQIEK RLNQKHD Prediction of potential genes in microbial genomes Time: Fri May 20 02:31:01 2011 Seq name: gi|224461405|gb|ACDD01000097.1| Fusobacterium sp. 3_1_5R cont1.97, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 159 327 ## gi|257466697|ref|ZP_05631008.1| cell surface protein - Prom 223 - 282 10.6 + Prom 92 - 151 6.1 2 2 Tu 1 . 
+ CDS 269 - 355 183 ## + Term 448 - 498 6.2 Predicted protein(s) >gi|224461405|gb|ACDD01000097.1| GENE 1 3 - 159 327 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257466697|ref|ZP_05631008.1| ## NR: gi|257466697|ref|ZP_05631008.1| cell surface protein [Fusobacterium gonidiaformans ATCC 25563] # 1 52 1 52 910 88 90.0 1e-16 MLEEKSVKHWLKRKVKFTEALLVAFLITGGIASANVVVGTGTGNGDNTITES >gi|224461405|gb|ACDD01000097.1| GENE 2 269 - 355 183 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVIARPPPIQKRGFSFYLCSFYIVIFVI Prediction of potential genes in microbial genomes Time: Fri May 20 02:31:09 2011 Seq name: gi|224461404|gb|ACDD01000098.1| Fusobacterium sp. 3_1_5R cont1.98, whole genome shotgun sequence Length of sequence - 528 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 526 706 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461404|gb|ACDD01000098.1| GENE 1 1 - 526 706 175 aa, chain + ## HITS:1 COG:FN0422 KEGG:ns NR:ns ## COG: FN0422 COG0458 # Protein_GI_number: 19703764 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Fusobacterium nucleatum # 5 175 702 872 1063 217 59.0 7e-57 GIAIANEVGYPVLVRPSYVLGGQGMEICHDEVNLVKYLEASFSRDASSPVLIDKYLNGIE LEVDAICDGEDVLIPGVMEHLERAGVHSGDSITIYPQQNLYKGTEEEILDITRKIARALE VKGMMNIQFIAYQNELYVIEVNPRSSRTVPYISKISGLPVIEIASRMMLGEKLKD Prediction of potential genes in microbial genomes Time: Fri May 20 02:31:10 2011 Seq name: gi|224461403|gb|ACDD01000099.1| Fusobacterium sp. 3_1_5R cont1.99, whole genome shotgun sequence Length of sequence - 528 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 528 634 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) Predicted protein(s) >gi|224461403|gb|ACDD01000099.1| GENE 1 3 - 528 634 175 aa, chain - ## HITS:1 COG:FN0422 KEGG:ns NR:ns ## COG: FN0422 COG0458 # Protein_GI_number: 19703764 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Fusobacterium nucleatum # 5 175 702 872 1063 213 58.0 2e-55 GIAIANEVGYPVLVRPSYVLGGQGMEICHDEINLVKYLEASFSRDASSPVLIDKYLNGIE LEVDAICDGEDVLIPGVMEHLERAGVHSGDSITIYPQQNLYAGTEEQILEITTKIARALK VKGMMNIQFIAYQNELYVIEVNPRSSRTVPYISKISGLPVIEIASRVMLGEKLKD