Prediction of potential genes in microbial genomes
Time: Fri May 13 06:17:31 2011
Seq name: gi|225935391|gb|ACGA01000001.1| Bacteroides sp. D2 cont1.1, whole genome shotgun sequence
Length of sequence - 90265 bp
Number of predicted genes - 65, with homology - 65
Number of transcription units - 29, operones - 18 average op.length - 3.0

  N Tu/Op   Conserved   S      Start   End   Score
            pairs(N/Pv)
                     - TRNA 353 - 429 54.7 # Arg ACG 0 0
                     - TRNA 459 - 532 55.8 # Arg ACG 0 0
                     - TRNA 554 - 627 54.1 # Arg ACG 0 0
  1  1 Tu 1 .        - CDS 707 - 3004 1676 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily
                     - Prom 3033 - 3092 5.1
                     - Term 3092 - 3139 8.2
  2  2 Tu 1 1/0.250  - CDS 3176 - 5221 1863 ## COG0326 Molecular chaperone, HSP90 family
                     - Prom 5249 - 5308 6.5
                     - Term 5281 - 5315 4.0
  3  3 Tu 1 .        - CDS 5356 - 7884 1970 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8
                     - Prom 7915 - 7974 3.2
                     + Prom 7962 - 8021 7.9
  4  4 Op 1 .        + CDS 8242 - 10818 2295 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit
  5  4 Op 2 .        + CDS 10851 - 12065 1177 ## BT_0900 TPR repeat-containing protein
                     + Term 12089 - 12126 4.1
                     - Term 12196 - 12249 10.1
  6  5 Op 1 .        - CDS 12307 - 13428 637 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins
                     - Term 13450 - 13492 6.4
  7  5 Op 2 .        - CDS 13506 - 13796 256 ## BT_0902 hypothetical protein
                     - Prom 13823 - 13882 7.7
                     - Term 13851 - 13888 2.1
  8  6 Op 1 .        - CDS 13925 - 14725 608 ## BT_0903 hypothetical protein
  9  6 Op 2 .        - CDS 14777 - 16606 1221 ## BT_0904 hypothetical protein
 10  6 Op 3 .        - CDS 16638 - 17360 763 ## BT_0905 hypothetical protein
 11  6 Op 4 5/0.000  - CDS 17369 - 18397 1063 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
 12  6 Op 5 .        - CDS 18443 - 19426 839 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
 13  6 Op 6 .        - CDS 19476 - 20552 847 ## BT_0908 hypothetical protein
 14  6 Op 7 23/0.000 - CDS 20564 - 21433 690 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain)
 15  6 Op 8 .        - CDS 21517 - 22512 1138 ## COG0714 MoxR-like ATPases
                     - Prom 22588 - 22647 8.3
                     - Term 22685 - 22735 12.2
 16  7 Op 1 .        - CDS 22750 - 24162 1109 ## BT_0911 putative integration host factor IHF alpha subunit
 17  7 Op 2 .        - CDS 24175 - 24447 293 ## BT_0912 DNA-binding protein HU
 18  7 Op 3 3/0.000  - CDS 24440 - 25750 1145 ## PROTEIN SUPPORTED gi|229200236|ref|ZP_04326798.1| SSU ribosomal protein S12P methylthiotransferase
 19  7 Op 4 .        - CDS 25747 - 26706 731 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28
                     - Prom 26733 - 26792 8.6
                     - Term 26770 - 26819 9.0
 20  8 Op 1 .        - CDS 26845 - 27003 246 ## PRU_0750 hypothetical protein
 21  8 Op 2 .        - CDS 27020 - 27208 320 ## PROTEIN SUPPORTED gi|53713719|ref|YP_099711.1| 50S ribosomal protein L33
 22  8 Op 3 .        - CDS 27230 - 27490 441 ## PROTEIN SUPPORTED gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28
                     - Prom 27510 - 27569 2.6
                     - Term 27564 - 27599 -1.0
 23  9 Op 1 .        - CDS 27613 - 28845 914 ## COG1546 Uncharacterized protein (competence- and mitomycin-induced)
 24  9 Op 2 .        - CDS 28890 - 29909 654 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase
                     - Prom 30075 - 30134 5.2
                     + Prom 29927 - 29986 4.6
 25 10 Tu 1 .        + CDS 30110 - 34567 3347 ## BT_0921 hypothetical protein
                     + Term 34575 - 34626 9.3
                     - Term 34566 - 34610 6.2
 26 11 Op 1 .        - CDS 34655 - 35512 731 ## BT_0922 hypothetical protein
 27 11 Op 2 .        - CDS 35534 - 35971 526 ## BT_0923 putative periplasmic protein
                     - Prom 36051 - 36110 7.7
                     + Prom 36064 - 36123 5.8
 28 12 Op 1 5/0.000  + CDS 36291 - 37442 992 ## COG1470 Predicted membrane protein
 29 12 Op 2 24/0.000 + CDS 37446 - 38189 285 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
 30 12 Op 3 .        + CDS 38182 - 39138 849 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component
 31 13 Op 1 40/0.000 - CDS 39268 - 40542 993 ## COG0642 Signal transduction histidine kinase
 32 13 Op 2 .        - CDS 40564 - 41241 179 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9
                     - Prom 41364 - 41423 6.3
                     - Term 41388 - 41429 2.1
 33 14 Op 1 .        - CDS 41456 - 42949 1627 ## COG0442 Prolyl-tRNA synthetase
                     - Prom 42981 - 43040 2.4
 34 14 Op 2 .        - CDS 43120 - 43548 224 ## BT_1390 hypothetical protein
                     - Prom 43608 - 43667 7.6
                     - TRNA 43687 - 43759 82.4 # Gly CCC 0 0
                     + Prom 44246 - 44305 7.5
 35 15 Op 1 13/0.000 + CDS 44536 - 45948 1031 ## COG1538 Outer membrane protein
 36 15 Op 2 9/0.000  + CDS 45955 - 46941 961 ## COG0845 Membrane-fusion protein
 37 15 Op 3 22/0.000 + CDS 46964 - 48148 562 ## COG0842 ABC-type multidrug transport system, permease component
 38 15 Op 4 .        + CDS 48132 - 49316 941 ## COG0842 ABC-type multidrug transport system, permease component
                     + Prom 49395 - 49454 7.0
 39 16 Tu 1 .        + CDS 49546 - 53535 2205 ## COG5002 Signal transduction histidine kinase
                     + Term 53573 - 53622 15.6
                     - Term 53561 - 53610 12.4
 40 17 Op 1 .        - CDS 53667 - 54179 385 ## BT_0959 hypothetical protein
 41 17 Op 2 .        - CDS 54193 - 56082 1622 ## COG4206 Outer membrane cobalamin receptor protein
 42 17 Op 3 .        - CDS 56113 - 56532 365 ## BT_0961 hypothetical protein
 43 17 Op 4 .        - CDS 56541 - 57944 1039 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family
 44 17 Op 5 .        - CDS 57964 - 59472 748 ## COG1145 Ferredoxin
                     - Prom 59577 - 59636 4.7
 45 18 Tu 1 .        - CDS 59660 - 60754 876 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold
                     - Prom 60779 - 60838 9.8
 46 19 Tu 1 .        - CDS 60912 - 62999 1152 ## BT_0964 hypothetical protein
                     - Term 63413 - 63452 -0.0
 47 20 Op 1 6/0.000  - CDS 63481 - 64422 625 ## COG3712 Fe2+-dicitrate sensor, membrane component
                     - Prom 64458 - 64517 6.7
 48 20 Op 2 .        - CDS 64553 - 65104 333 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
                     - Prom 65139 - 65198 2.9
 49 20 Op 3 .        - CDS 65200 - 65550 229 ## COG1733 Predicted transcriptional regulators
                     - Prom 65770 - 65829 8.0
                     + Prom 65522 - 65581 3.9
 50 21 Tu 1 .        + CDS 65642 - 66661 810 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family
                     + Term 66773 - 66815 1.5
                     - Term 66478 - 66520 4.0
 51 22 Op 1 .        - CDS 66658 - 66912 205 ## BT_0967 hypothetical protein
 52 22 Op 2 .        - CDS 66924 - 70061 2640 ## BT_0968 hypothetical protein
                     - Prom 70228 - 70287 6.2
                     - Term 70466 - 70514 1.1
 53 23 Tu 1 .        - CDS 70571 - 71686 690 ## BT_0969 hypothetical protein
                     - Prom 71728 - 71787 7.0
                     + Prom 71703 - 71762 5.5
 54 24 Op 1 .        + CDS 71942 - 72562 502 ## COG1011 Predicted hydrolase (HAD superfamily)
 55 24 Op 2 .        + CDS 72600 - 73409 547 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
                     + Term 73459 - 73506 10.2
                     - Term 73447 - 73492 10.6
 56 25 Op 1 .        - CDS 73560 - 74288 501 ## BT_0973 hypothetical protein
 57 25 Op 2 12/0.000 - CDS 74300 - 74785 473 ## COG3610 Uncharacterized conserved protein
 58 25 Op 3 .        - CDS 74787 - 75554 699 ## COG2966 Uncharacterized conserved protein
                     - Prom 75607 - 75666 8.3
                     + Prom 75468 - 75527 2.3
 59 26 Tu 1 .        + CDS 75632 - 76885 853 ## COG0477 Permeases of the major facilitator superfamily
                     - Term 76885 - 76948 8.6
 60 27 Op 1 .        - CDS 77004 - 77255 230 ## BT_0977 hypothetical protein
 61 27 Op 2 .        - CDS 77338 - 77889 539 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
                     - Prom 77981 - 78040 5.6
                     - Term 78013 - 78063 13.6
 62 28 Op 1 .        - CDS 78065 - 81208 2378 ## BT_0979 hypothetical protein
 63 28 Op 2 .        - CDS 81239 - 82921 1648 ## BT_0980 hypothetical protein
 64 28 Op 3 .        - CDS 82980 - 87374 3686 ## COG0642 Signal transduction histidine kinase
                     - Prom 87455 - 87514 3.5
 65 29 Tu 1 .
- CDS 87548 - 90202 2246 ## COG3250 Beta-galactosidase/beta-glucuronidase Predicted protein(s) >gi|225935391|gb|ACGA01000001.1| GENE 1 707 - 3004 1676 765 aa, chain - ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 22 299 24 301 308 160 35.0 1e-38 MRKIFLVFITLWLIIPAIHAQKVGLVLSGGGAKGMTHIGIIRALEENNIPIDYIAGTSMG AIIGSLYAMGYSPDDMVELLKSEDFKRWYSGEVEEKYVYHFKKNLPTPEFFNIRFSFKDS LKSLKPQFLPTSVVNPIQMNLVFVDLYARATAACKGDFDKLFVPFRCIASDVYNKKQLVM KEGDLGDAVRASMSFPFMFKPIEIDNVLAYDGGIYNNFPTDVMRDDFHPDVIIGSVVSTN PTKPKENDLMSQIENMVMQKTDYSIPDSMGILMTFKYDNVNLMDFQRIDELHDIGYNRTI SMMDSIKSRIHRRVNLDNIRLRRMVYRSNFPELRFKNIIIDGANPQQQAYMKKEFHKSDN KEFTYEDLKQGYFRLLSDKMISEIIPHAIYNPEDDTYDLHLKVKLENNFAVRLGGNISTS NSNQIYLGLSYQDLNYYAKEFILDGQLGKVYNNVQFMAKIDFATAIPTSYRLIGSISTFD YFKKDKLFSRNNKPAFNQKDERFLKLQVGLPFLSSKRAEFGVGIARIEDKYFQKSVIDFG NDKFDKSRYDLFGGSISFNGSTLNSKQYPTRGYREALVAQIFVGKERFYPGEGSTTSNNK DHHSWLQLSYMKEKYHNMSEHWVLGWYLKALYASKNFSENYTATMMQAGEFSPTLHSKLT YNEAFRANQFVGAGIRPIYRLSQMFHLRGEFYGFMPIYPIEKNSLNKAYYGKAFSKFEYL GEISVVCQLPFGDISAYVNHYSSPKREWNVGLSIGLQLFNYRFIE >gi|225935391|gb|ACGA01000001.1| GENE 2 3176 - 5221 1863 681 aa, chain - ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. 
PCC 7120 # 1 581 2 601 658 455 41.0 1e-127 MQKGNIGVTTENIFPIIKKFLYSDHEIFLRELVSNAVDATQKLNTLASIGEFKGELGDLT VHVELGKDTITISDRGIGLTAEEIEKYINQIAFSGANDFLEKYKNDANAIIGHFGLGFYS AFMVAKKVEIITKSYRDDAKAVKWTCDGSPEFTIEEIEKADRGSDIILYIDDDCKEFLEE ARISELLKKYCSFLPVPIAFGKKKEWKDGKQVETAEDNIINDTTPLWTRKPSELSDEDYK SFYSKLYPMSDEPLFWIHLNVDYPFHLTGILYFPKVKSNIELNKNKIQLYCNQVYVTDSV EGIVPDFLTLLHGVIDSPDIPLNVSRSYLQSDSNVKKISTYITKKVSDRLQSIFKNDRKQ FEEKWNDLKIFINYGMLTQEDFYDKAQKFALFTDTNDKHYTFEEYQTLIKDNQTDKDGNL IYLYANNKDEQYSYIEAATNKGYNVLLMDGQLDVAMVSMLEQKLEKSRFTRVDSDVVDNL IVKEDKKGETLEANKQDAITTAFKSQLPKMDKVEFNVMTQALGENSAPVMITQSEYMRRM KEMANIQAGMSFYGEMPDMFNLILNSDHKLIKQVLNEEENACQAEVAPILSEMDNVNKQR NELKDKQKDKKEEDIPTAEKDELNDLDKKWDDLKSKKEAIFIGYASNNKVIRQLIDLALL QNNMLRGEALNNFVKRSIELI >gi|225935391|gb|ACGA01000001.1| GENE 3 5356 - 7884 1970 842 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 833 2 806 815 763 48 0.0 MNNQFSQRVSDIIVYSKEEANRLRSRYIGPEHLLLGILRDGEGKAIEILSKLNTNLAAIK QQIEAQLKAEADEMLLPDVEVPLSNDAAKILKMCILEARGMKSNIADTEHVLLAILREKN NMAASVLEANDINYVKVLEQATLQPDINSGMGFSEDDEDDEEMSSPRSGRGGSDERQQAQ TASKKPSNDTPVLDNFGTDMTKAAEEGRLDPVVGREREIERLAQILSRRKKNNPILIGEP GVGKSAIVEGLALRIIQKKVSRILFDKRVVALDMTAVVAGTKYRGQFEERIRSILNELQK NPNVILFIDEIHTIVGAGSAAGSMDAANMLKPALARGEIQCIGATTLDEYRKNIEKDGAL ERRFQKVIVEPTTAAETLQILRNIKDKYEDHHNVYYTDEALEACVKLTDHYITDRNFPDK AIDALDEAGSRVHLTNVNVPKEIEEQEKLIEEAKNKKNEAVKSQNFELAASFRDKEKELS IQLDEMKKEWEANLKENRQTVDAEEIANVISMMSGIPVQRMAQAEGIKLAGMKEDLQAKV IAQDTAIEKLVKAILRSRVGLKDPNKPIGTFMFLGPTGVGKTHLAKELAKYMFGSADALI RIDMSEYMEKFTVSRLVGAPPGYVGYEEGGQLTEKVRRKPYSIVLLDEIEKAHPDVFNIL LQVMDEGRLTDSYGRMVDFKNTVIIMTSNIGTRQLKEFGRGVGFATQSRLDDKEFSRSVI QKALNKSFAPEFINRVDEIITFDQLSLEAITKIIDIELKGLYDRIESIGYKLVIEDKAKE FIASKGYDVQYGARPLKRAIQTYLEDGLSELIISSSLKEGDTIQVTLNEEKGELEMKVVS PE >gi|225935391|gb|ACGA01000001.1| GENE 4 8242 - 10818 2295 858 aa, chain + ## HITS:1 COG:BH0007 KEGG:ns NR:ns ## COG: BH0007 COG0188 # Protein_GI_number: 15612570 # Func_class: L 
Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 1 812 1 804 833 866 55.0 0 MLEQDRIIKINIEEEMKSSYIDYSMSVIVSRALPDVRDGFKPVHRRILYGMMELGNTSDK PYKKSARIVGEVLGKYHPHGDSSVYFAMVRMAQEWAMRYPLVDGQGNFGSVDGDSPAAMR YTEARLNKLGEAMMDDLYKETVDFEPNFDNTLVEPKVMPTRIPNLLVNGASGIAVGMATN MPPHNLSEVIEACDAYIDNPEITVEELMEFVKAPDFPTGGYIYGVSGVREAYLTGRGRVI MRAKAEIESGQTHDKIVITEIPYNVNKAELIKYIADLVNDKKIEGISNANDESDRDGMRI VIDIKRDANASVVLNKLYKMTALQTSFGVNNVALVHGRPKTLNLRDLIKYFIEHRHEVVI RRTQFDLRKAKERAHILEGLIIASDNIDEVIRIIRAAKTPNDAITGLIERFNLTEIQARA IVEMRLRQLTGLMQDQLHAEYEEIMKQIAYLESILADDEVCRRVMKEELLEVKAKYGDVR RSEIVYSSEEFNPEDFYADDQMIITISHMGYIKRTPLTEFRAQNRGGVGSKGTETRDEDF VEHIYPATMHNTMMFFTQKGKCYWLKVYEIPEGTKNSKGRAIQNLLNIDSDDNVTAYLRV KSLEDSEFINSHYVLFCTKKGVIKKTLLEQYSRPRQNGVNAITIREDDSVIEVRMTNGNN EIIIANRNGRAIRFHEAAVRVMGRTATGVRGITLDNDGQDEVVGMICIKDLETESVMVVS EQGYGKRSEIEDYRKTNRGGKGVKTMNITEKTGKLVTIKSVTDENDLMIINKSGITIRLK VADVRIMGRATQGVRLINLEKRNDQIGSVCKVMTESLEDEIPAEEAEGTIVSDPNADAPD IDDAADVNENESNNEIEE >gi|225935391|gb|ACGA01000001.1| GENE 5 10851 - 12065 1177 404 aa, chain + ## HITS:1 COG:no KEGG:BT_0900 NR:ns ## KEGG: BT_0900 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 404 1 398 398 627 88.0 1e-178 MKRVLFSMVLLMAVSFSFAQMKNVKEAKSMANDVKPNFKQAEQLIKEAMKNPETKDLADT WDVAGFIQRRINEEQMKNAFLKKPYDTLKVYNSILKMYEYYNKCDELAEIPNEKGKVKNK YRKANASSMLAERPNLINGGIQYFNLDKNKEALKFFATYVESASYPMLADKELAKNDTLL PQIAYYATLAADRVGDKDAIIKYAPAALSDKDGGKFAMQLMADAYKAKGDTAAWIKSLEE GILKFPGNDYFFANLVDYYNSSNQASKAMEFADRMLANDPNNKLYLYVKAYLYHNMKEYD NAIEYYKKAIAADPEYAEAYSNVGLVYLMKAQDYADKATTDINDPKYAEAQAVVKKFYEE AKPFYEKARALKPDQQDLWLQGLYRVYYNLNMGPEFEEIDKMMK >gi|225935391|gb|ACGA01000001.1| GENE 6 12307 - 13428 637 373 aa, chain - ## HITS:1 COG:MA2866 KEGG:ns NR:ns ## COG: MA2866 COG0589 # Protein_GI_number: 20091690 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and 
related nucleotide-binding proteins # Organism: Methanosarcina acetivorans str.C2A # 86 243 8 151 152 63 31.0 5e-10 MEDKLVTLAILTYTKAQILKNVLENEGIETYIHNVNQIQPVVSSGVRLRIKESDLPRALK ITESSTWLSESIVGEKEPKVKDKSNKILIPVDFSNYSMKACEFAFNLAKTENAEVILLHV YFTPIYASSLPYGDVFNYQIGDEESVKTIIQKVHSDLNALSEKIKEKVASGEFPNVKYSC ILREGIPEEEILRYAKEQRPMVIIMGTRGKNQKDIDLIGSVTAEVIDRSRTAVLAIPENT PFKQFSEAKRIAFITNFDQRDLIAFEAFFNTWKSFHFSVSLIHLTDSKDTWNEIKLAGIK EYFHKQYPGLEIHYDVVMNDNLLKGLDQYIKDNHIDIITLTSYKRNIFARLFNPSIARKM IFHSDTPLLVINS >gi|225935391|gb|ACGA01000001.1| GENE 7 13506 - 13796 256 96 aa, chain - ## HITS:1 COG:no KEGG:BT_0902 NR:ns ## KEGG: BT_0902 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 178 95.0 5e-44 MRTITFNELRKIKDSLPSGSMHRIADELGLHVDTVRNFFGGHNFKEGKSVGIHLEPGPDG GLVMLDDTTVLDRALKILDELNMSMQKEQATESVQV >gi|225935391|gb|ACGA01000001.1| GENE 8 13925 - 14725 608 266 aa, chain - ## HITS:1 COG:no KEGG:BT_0903 NR:ns ## KEGG: BT_0903 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 12 277 277 418 85.0 1e-116 MSLTCFGQDSLSIDTRQTNGVDSIHASHTTFSSNTLEDATKAEGDSAYIKEDYAAAIQIY EALLKNGEAADVYYNLGNSYYKIGEIAKAVLNYERALLLQPGNGDIRANLEVARAKTIDK VEPVPEVFFVSWIKSLTNSMSVDAWATWGIVSFILLIIALYFFIFSKQIMLKKIGFISGI ILLIVTVCSNLFASQQKEHLVNRSEAIVMNPSVTVRSTPSESGTSLFILHEGRKVSVKDN SMKEWKEIRLEDGKVGWVPASAIEVI >gi|225935391|gb|ACGA01000001.1| GENE 9 14777 - 16606 1221 609 aa, chain - ## HITS:1 COG:no KEGG:BT_0904 NR:ns ## KEGG: BT_0904 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 609 1 608 608 1060 90.0 0 MKKLIIILMALIAYSTHVFADKVSFTASAPDAVVVGEQFRLSYIVTTQKVKDFRAPSIKG FDVLMGPSRSQQSSTQIVNGNVTSTSSITFTYILMANNAGEYTIPGASIVADGDQMVSNS VKIKVLPQDQGGNSGQNNSSSGSIHSSSGTSVSNQDLFIMASASKTNVYEQEAFVLTYKI YTRESNLQLNNAKLPDFKGFHSQEIEMTTNARWTPEHYQGRNYYTTVYRQFVLFPQQSGK LYIDPAQFQMTIGKPVQSDDPFDAFFNGGSNVIEIKKNISTPKIAINVNPLPTGKPADFS 
GGVGEFNISSSINSKELKTNDAITIKLVISGTGNLKLISNPEIKFPDDFEVYDPKVDNQV RLTREGLTGNKVIEYLAIPRHAGTYKIPGVSFSYFDIRSKSYKTLKTEDYVVNVEKGAGN ADQVIANFTNKEDLKVLGEDIRYIKQNEVTLQPKGSFFYGSMTYWLFYIIPALAFIIFFI IYRKQAAENANVAKMRTKKANKVATKRMKLAGKLLSENKKDAFYDEVLKALWGYISDKLN IPVSRLSKDNIEEKLRNHGVNEELIKEFLNALNDCEFARFAPGDENQAMDKVYSSSIEVI SKMENSIKH >gi|225935391|gb|ACGA01000001.1| GENE 10 16638 - 17360 763 240 aa, chain - ## HITS:1 COG:no KEGG:BT_0905 NR:ns ## KEGG: BT_0905 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 3 242 242 261 82.0 2e-68 MLKSKYILFAVFLLAAAGVSAQKAERDYIRKGNRLFNDSVFVDAEVNYRKALEANPKSTV SMYNLGNTLSQQQKFQDAMEQYVSASKIEKDKMKLAHIYHNMGVIFQAGKDYAKAVDAYK MSLRNNPTDHETRYNLALAQKMLKDQQNQQNQDQNQDQNKDQQKQDQKQDQNKDKQKDQK QDEKKDQQQPPKSEKKQDNQMSKENAEQLLNSVMQDEKDVQDKVKKQQKVMQGGRLEKDW >gi|225935391|gb|ACGA01000001.1| GENE 11 17369 - 18397 1063 342 aa, chain - ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 6 328 5 318 318 94 26.0 4e-19 MFRFGEPTYLYLLLLLPFLAAFYLYSNYKRRKNIRRFGDPTLLAQLMPDVSKYRPDVKFW IIFVAIGLFSVLLARPQFGSKLETVKRKGVEVIIALDISNSMLAQDVQPSRLEKAKRLIS RLVDELDNDKVGMIVFAGDAFTQLPITSDYISAKMFLESISPSLISKQGTAIGEAINLAA RSFTPQEGVGRAIIVITDGENHEGGAVEAAKAAAEKGIQVSVLGVGMPDGAPIPVEGTND YRRDREGNVIVTRLNEAMCQEIAKEGKGIYVRVDNSNSAQKAINQEVNKMAKSDVESKVY TEFNEQFQAIAWVILLLLLAEILILDRKNPLFKNIHLFSNKK >gi|225935391|gb|ACGA01000001.1| GENE 12 18443 - 19426 839 327 aa, chain - ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 3 318 4 313 318 157 31.0 4e-38 MVFANIEYLFLLLLLIPYIVWYILKQKKSEATLQISDARVYAHTPKSYKNYLLHVPFLLR 
CLALVLVILVLARPQTTNKWQNSEIEGIDIMLAIDVSTSMLAEDLKPNRLEAAKDVAAEF INGRPNDNIGITLFAGETFTQCPLTVDHAVLLDMIHNIKCGLITDGTAVGMGIANAVTRL KDSKAKSKVIILLTDGTNNKGDISPMTAAEIAKSFGIRVYTIGVGTNGMAPYPYPVGNTV QYVSMPVEIDEKTLTEIAGTTDGNYFRATSNSKLKEVYEEIDKLEKTKLNVKEYSKRDEE YHWFALAAFLCVLLEVLLRNSVLKKIP >gi|225935391|gb|ACGA01000001.1| GENE 13 19476 - 20552 847 358 aa, chain - ## HITS:1 COG:no KEGG:BT_0908 NR:ns ## KEGG: BT_0908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 357 5 361 362 585 91.0 1e-166 MNRNIILIALLGLLFVSKAVAQSVTVEAKIDSLQILIGEQAKVQLQVAMDAKQRAIFPAY TDTLVRGVEIIETVKPDTQFLNDRQRMVITQEYIITSFDSALYYLPPMPVTVDGKVYNSK ALALKVYSMPVDTLHPDQFFGQKPVMKAPFAWEDWYGLIACSFLALPLLGLLIYLIVRIR DNKPIIRKIKVEPKLPPHQAAMKEIERIKTEKVWQKGLSKEYYTELTDAIRTYIKDRFGF NALEMTSSEIIDQLLEMNDKEAISDLKLLFQTADLVKFAKHNPQMNENDANLINAIDFIN ETKQPEEENQKPQPTEITIIEKRSLRVKAMLICGIALLSAALIGTFIYIGLQLYNLFV >gi|225935391|gb|ACGA01000001.1| GENE 14 20564 - 21433 690 289 aa, chain - ## HITS:1 COG:BB0175 KEGG:ns NR:ns ## COG: BB0175 COG1721 # Protein_GI_number: 15594520 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Borrelia burgdorferi # 3 276 8 278 291 142 31.0 8e-34 METSEILKKVRQIEIKTRGLSNNIFAGQYHSAFKGRGMSFSEVREYQFGDDIRDIDWNVT ARFNKPYVKVFEEERELTVMLMVDVSGSLEFGTIKQLKKDMVTEIAATLAFSAIQNNDKI GVIFFSDRIEKFIPPKKGRKHILYIIRELIDFQPESRRTNIRLALEYLTNVMKRRCTAFI LSDFIDQESFKNALTIANRKHDVVALQVYDRRVSDLPPVGLMQIKDAETGHEQWIDTSSK AVRRAHRDWWIQKQTELNDTFTKSNVDAVSVRTDQDYVKALLNLFAKRN >gi|225935391|gb|ACGA01000001.1| GENE 15 21517 - 22512 1138 331 aa, chain - ## HITS:1 COG:Rv1479 KEGG:ns NR:ns ## COG: Rv1479 COG0714 # Protein_GI_number: 15608617 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium tuberculosis H37Rv # 30 331 52 352 377 314 52.0 2e-85 MAESIDIRELNERIERQSAFVTNLTTGMDQIIVGQKHLVESLLIGLLSDGHVLLEGVPGL 
AKTLAIKTLASLIDAKYSRIQFTPDLLPADVIGTMVYSQKDESFKVQRGPIFANFVLADE INRAPAKVQSALLEAMQERQVTIGKETFMLPEPFLVLATQNPIEQEGTYPLPEAQVDRFM LKVVIDYPKLEEEKLIIRQNINGEKFNVKPILRADEIIEARKVVRQVYLDEKIERYIVDI VFATRFPEKYDLKELKDMIGFGGSPRASINLALAARTYAFIKRRGYVIPEDVRAVAHDVL RHRIGLTYEAEANNMTSDEIISKILNKVEVP >gi|225935391|gb|ACGA01000001.1| GENE 16 22750 - 24162 1109 470 aa, chain - ## HITS:1 COG:no KEGG:BT_0911 NR:ns ## KEGG: BT_0911 # Name: not_defined # Def: putative integration host factor IHF alpha subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 470 1 487 487 384 60.0 1e-105 MNERLTIQDLTDLLAAKHSMTKKDAEAFVKEFFLLIEQALENEKTVKIKGLGTFKLIDVD SRESVNVNTGERFQIKGHTKVSFSPDANLRDTINKPFAHFETVVLNENTILEDTPIEETE EEETGEEASAQAVSNEMGDNSETTAIEENEGTDNLSEEEPIQEEQIASRPLVEESIEELA ITEESAIVQELTEQTSPKEPEEIITETNIEEKVEQLEDEEVPEEEVTIDEQQPASIEEEK EKITAEKIIEQELLKANLQPVTPIVPPAEKETIKPVQPEQISQPVSKKTAPVKEKSPVPY LIAVIVVVLLLCGGVILFIYYPDLFSSSSDKNALDMPPVTTQPVQPEAQLSDTIAHKDTT KIITPDVSKVATTTQPVAKKEEAIPVKAEPQTVTQQPATSAYLDSASYKITGTKTKYTIK EGETLTRVSLRFYGTKAMWPYIVKHNPKVIKNPNNVPYGTTIEIPELTKE >gi|225935391|gb|ACGA01000001.1| GENE 17 24175 - 24447 293 90 aa, chain - ## HITS:1 COG:no KEGG:BT_0912 NR:ns ## KEGG: BT_0912 # Name: not_defined # Def: DNA-binding protein HU # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 1 90 90 144 88.0 8e-34 MNNKEFTSELAERLGYTIKDTSELMNSLLSSMTQELEEGNVVAIQGFGSFEVKKKAERIS INPASKQRMLVPPKLVLSYRPSNTLKDKFK >gi|225935391|gb|ACGA01000001.1| GENE 18 24440 - 25750 1145 436 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229200236|ref|ZP_04326798.1| SSU ribosomal protein S12P methylthiotransferase [Pedobacter heparinus DSM 2366] # 24 432 1 409 410 445 51 1e-124 MKRKTIDIITLGCSKNLVDSEQLMRQLEEVGYSVTHDTENPQGEIAVINTCGFIGDAKEE SINMILEFAERKEEGDLKKLFVMGCLSERYLKELAVEIPQVDKFYGKFNWKELLQDLGKV YHDELYIERTLTTPQHYAYLKISEGCDRKCSYCAIPIITGHHISKPIEEILDEVRYLVSQ GVKEFQVIAQELTYYGIDRYKKQMLPELIERISDIPGVEWIRLHYAYPAHFPTDLFRVMR ERDNVCKYMDIALQHISDNMLKLMRRQVSKEDTYQLIEQFRKEVPGIHLRTTLMVGHPGE 
TEEDFEDLKEFVRKVRFDRMGAFAYSEEEGTYAAETYEDSIPQEVKQARLDELMDIQQGI SAELSAEKIGKQMKVIIDRLEGDYYIGRTEFDSPEVDPEVLIDRSERELKIGQFYQVEVM NADDFDLYAKIINDYE >gi|225935391|gb|ACGA01000001.1| GENE 19 25747 - 26706 731 319 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 5 317 2 322 336 286 46 3e-76 MGFFSFFSKEKKETLDKGLSKTKESVFSKIARAVAGKSKVDDEVLDNLEEVLITSDVGVE TTLNIIKRIEKRAAADKYVNTQELNLILRDEIAALLTENNSGDVADFDVPIARKPYVIMV VGVNGVGKTTTIGKLAYQFKKAGKSVYLGAADTFRAAAVEQLMIWGERVGVPVVKQKMGA DPASVAYDTLSSAVANNADVVIIDTAGRLHNKVGLMNELTKIKNVMKKVVPDAPDEVLLV LDGSTGQNAFEQAKQFTLATEVTAMAITKLDGTAKGGVVIGISDQFKIPVKYIGLGEGME DLQVFRKNEFVDSLFGENA >gi|225935391|gb|ACGA01000001.1| GENE 20 26845 - 27003 246 52 aa, chain - ## HITS:1 COG:no KEGG:PRU_0750 NR:ns ## KEGG: PRU_0750 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 51 1 51 52 79 90.0 3e-14 MAKKTVASLHEGSKEGRAYTKVIKMVKSPKTGAYVFDEQMVPNEKVQDFFKK >gi|225935391|gb|ACGA01000001.1| GENE 21 27020 - 27208 320 62 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53713719|ref|YP_099711.1| 50S ribosomal protein L33 [Bacteroides fragilis YCH46] # 1 62 1 62 62 127 100 2e-28 MAKKAKGNRVQVILECTEHKDSGMPGTSRYITTKNRKNTTERLELKKYNPILKRVTVHKE IK >gi|225935391|gb|ACGA01000001.1| GENE 22 27230 - 27490 441 86 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28 [Bacteroides thetaiotaomicron VPI-5482] # 1 86 1 86 86 174 95 1e-42 MSKICQITGKKAMIGNNVSHSKRRTKRTFDLNLFNKKFYYVEQDCWISLSICANGLRIIN KKGLDAALTEAVAKGYCDWKSIKVIG >gi|225935391|gb|ACGA01000001.1| GENE 23 27613 - 28845 914 410 aa, chain - ## HITS:1 COG:FN1929_2 KEGG:ns NR:ns ## COG: FN1929_2 COG1546 # Protein_GI_number: 19705234 # Func_class: R General function prediction only # Function: Uncharacterized protein (competence- and mitomycin-induced) # Organism: Fusobacterium nucleatum # 248 409 1 160 165 133 48.0 6e-31 
MFAEIITIGDELLIGQVVDTNSAWMGQELNKIGIEVLRIVSIRDREKEILEAIDNAMERV NIVLVTGGLGPTKDDITKQTLCKYFHTELVFSEEVFENVKRVLAGKIPMNALNKSQAMVP KDCTVINNPVGSASVSWFERDGKVLVSMPGVPQEMTAVMTESVLPKLHERFQTDVIMHQT FLVQHYPESVLAEKLEPWESALPECIKLAYLPKLGIIRLRLTGRGQNREEVKVLLEREKV KLEKILGEDIFGEEDTPLEVIVGELLKKKKLTVSTAESCTGGSIAARLTSIAGSSEYFNG SVVAYSNEVKMGLLHVSSETLEQHGAVSEETVIEMVKGAMKALKTDCAVATSGIAGPGGG TPEKPVGTVWIAAGYKNEIRTYKQETNRGRSMNIERAGNNALLMLRDLLK >gi|225935391|gb|ACGA01000001.1| GENE 24 28890 - 29909 654 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 4 333 479 811 832 256 43 3e-67 MSAIILGIESSCDDTSAAVIKDGYLLSNVVSSQAVHEAYGGVVPELASRAHQQNIVPVVH EALKRAGVTKEELSAVAFTRGPGLMGSLLVGVSFAKGFARSLNIPMIDVNHLTGHVLAHF IKAEGEEERQPVYPFLCLLVSGGNSQIILVKAYNDMEILGQTIDDAAGEAIDKCSKVMGL GYPGGPIIDKLARQGNPKAFTFSKPHIPGLDYSFSGLKTSFLYSLRDWLKEDPDFIEHHK VDLAASLEATVVDILMDKLRKAAKEYKINEIAVAGGVSANNGLRNAFREHAEKYGWNIFI PKFSYTTDNAAMIAITGYFKYLDKDFCSIDLPAYSRVTL >gi|225935391|gb|ACGA01000001.1| GENE 25 30110 - 34567 3347 1485 aa, chain + ## HITS:1 COG:no KEGG:BT_0921 NR:ns ## KEGG: BT_0921 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1484 1 1484 1484 2617 88.0 0 MSVFVAEELSEVLNTKVVIGRINMGLLNRIIIDDVLLDDQSGQEMLKVTRLSAKFDIMPF FKGKISISSVQLFGFNINLRKETPDSPPNFKFVLDAFASKDTVKKENSLDLRINSILIRR GRMSYHVLSEEETPGKFNAKHIQLQNIIANISLKALSKDSLNLGIKRLSLDEKASGFSLK KMSLKLVANNKQTNIDNFTIELPETSLKLDTIHLEYDSLKAFDRFTEQVHFSFRTLPSQV TLKDISPFVPILSHFKEPVTLDMEVKGTVDQLTCSHLEITADDRQFRLRGDVSLQDLSRP QDAYVYGTLSELSANTRGVGFLVRNLSPNYNGVPPLLERLGNVSFQGEISGYFTDLVTYG QLQTSLGNVKTDLKLSSDKTKGLFAYSGAVKTEDFQLGKLLDNEELGEITFNLDVHGRHI ADHLPAVELKGLIASIDYSRYRYENITLDGEYKQGGFNGKIALDDPNGSIYLNGDVNVAS KVPTFNFLAVVNKVRPHDLNLTTKYPDAEFSLKLKANFTGGSVDEMIGEINVDSLEFTAP DKAYFMQNMNIRATKQNGENQLRLTSEFLKASIEGKFQYHTLPASILNIMRKYVPSLILP PKKPIETHNNFLFDIHIYNTDILSTIFDIPLTVYTHSTLKGYFNDALQRLRVEGYFPRLQ 
YKNNYIESGMILCENPADHIRARVRLTNLKKKGAVNLSLDAQAKDDNVSTTLDWGNNAAA TYSGKLAAVAKFLRTSGEKSLLKAMVDVKPTDVILNDTLWKIHPSQVVVDSGRVDVNNFY FSHQDRYVRINGRLSENPKDTVKVDLKDINMGYVFDIASISDDVNFEGDATGTAYASGVL KKPIMNTRLFIKNFSLNHGRLGELDIYGEWDNENRGIRLDASIQDISPSPSRVTGIIYPL KPESGLDLNIEANELNLKFLEHYMKSIANDIKGRGTGKVHFYGKFKGLNLDGAVMTDASM NFDILNTHFAVKDTIHLAPTGLTFNNIHISDMEGHSGRMNGYLHFQHFKNLNYRFEIQAN NMLVMNTKESTDMPFYGTVYGTGNVLLSGNAIQGLDVNVAMTTNRNTTFTYINGSVASAT SNQFIKFVDKTPRRTIQDSVQVISYYEQIQQKRQEEEEEQKTDIRLNILVDATPDATMRI IMDPIAGDYISGKGTGNIRTEFYNKGDVKMFGNYRINQGVYKFSLQEVIRKDFIIKDGST ITFNGAPLDANMDIQASYTVNSASLNDLIPDASAIIQQPNVRVNCIMNLSGMLVRPTIKL GIELPNERDEIQTLVRNYISTEEQMNMQILYLLGIGKFYTEDARNNNQNSNVMSSVLSST LSGQLNNALSQVFETNNWNIGTNLSTGDKGWTDMEVEGILSGQLLNNRLLINGNFGYRDN PMANTNFVGDFEAEWLITRSGDIRLKAYNETNDRYYTKTNLTTQGVGIMYKKDFNKWSDL YFWNKWRLRNKRKQEEAEKVKPQQTDSITDKTAKSAVKRKRVQEQ >gi|225935391|gb|ACGA01000001.1| GENE 26 34655 - 35512 731 285 aa, chain - ## HITS:1 COG:no KEGG:BT_0922 NR:ns ## KEGG: BT_0922 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 284 1 292 293 350 63.0 3e-95 MKLKIYTLLLALSVTWSLQSCDNDDDGSIAVPAELQSAFSSKFPNAANVKWETKSGYYVA DFYDGYEASAWFTQDGKWQMTETDIPYNALPQAVKMSFENSEYASWKRDDIDKLERTGVE TVFVIKVEKQNQEVDLYYSADGTLIKSIVDTDDDNSEHLPVQLTEAMKNFINEKYPNARI MEVDVEDNRNDWDFGYTEVDIIHNGIPKDVLFNQTGNWYSTSWEVRQDELPEAVNSTLNN QYREYRFDEAEYIEKADGTIYYRIELEKGDVDKVVNIGENGTVLS >gi|225935391|gb|ACGA01000001.1| GENE 27 35534 - 35971 526 145 aa, chain - ## HITS:1 COG:no KEGG:BT_0923 NR:ns ## KEGG: BT_0923 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 145 248 92.0 5e-65 MKKLVFLLVCLFTLQTVARADDDKPIQITQMPQPAQQFIKQHFADSKVALAKMESDFFYK SYEVIFTNGDKVEFDNKGNWEEVNCKYSSVPTAIIPAAIQKYVTTNYPDAKILKIERDKK DYEVKLSNRTELKFDLKFNLIDIDF >gi|225935391|gb|ACGA01000001.1| GENE 28 36291 - 37442 992 383 aa, chain + ## HITS:1 COG:BH3215 KEGG:ns NR:ns ## COG: BH3215 COG1470 # Protein_GI_number: 15615777 # Func_class: S Function 
unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 8 383 9 385 385 286 41.0 7e-77 MTMRTNYFLLLAILLGIIPMSYTHANDSIPKSVILYTPYTKISVSPGASIDYSIDLINNT DQLTNANLSVSGLSASWKHEMKSGGWSLSQLSVLPKEKKTFNLKVEVPLKVNKGNYHFVV YAGNAKLPLDVVVAQKGTYQTEFTTDQPNMQGNSKSTFTFSATLKNQTADQQLYALMANA PRGWNVVFKPNYKQATSAQVEANSTQNVSIDITPPANVEAGSYKIPVRAATGTTSAELEL EVVVTGSYQMELTTPRGLLSTDVTAGDIKKIELEVRNTGSSLLKDIQLSANKPADWEVTF EPSKIDALKAGETSTVMATLKASKKALPGDYVTTIMAKTPEVNADAQFRVAVKTPMIWGW VGVLIIIATIGVVYYLFRKYGRR >gi|225935391|gb|ACGA01000001.1| GENE 29 37446 - 38189 285 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 11 216 17 222 318 114 31 2e-24 MGEQVIVLTDLTKQYGNFTAVDHIRLNIRKGEIFGLLGPNGAGKSTTILMMLGLTEPTSG TVEICGINSTTHPIEVKRKIGYLPEDVGFYDDMTGPENLIYTARLNGISDKEAKTKAMEL MKRVGLEDQLAKKTGKYSRGMRQRLGLADVLIKNPEIIILDEPTSGIDPAGVQEFIELIR WLSKEEGLTVLFSSHHLDQVQKVCDRVGLFSNGQLLALIDMAELKDKKQELSDIYNHYFE EGGERHE >gi|225935391|gb|ACGA01000001.1| GENE 30 38182 - 39138 849 318 aa, chain + ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 8 317 31 344 345 318 57.0 1e-86 MNKVNHPFWVIVNKEISDHVKSWRFIILIGIIALTCMGSLYTALTNISEAIKPNDPDGSF LFLKLFTVSDGTLPSFVLFINFLGPLLGIALGFDAVNSEQNKGTLSRMLSQPIHRDCIIN AKFVAALIVIGIMLFVLSFLVMGCGLIAIGIPPTAEEFWRIVFFIITSIFYVAFWLNLAI LFSLRFRQAATSALASVAVWLFFSVFYTMIVNLVAKGLSPSQMASPYQIISYQKFILGLM RLAPSELFNEATTTLLMPSVRSLGPLTMEQVQGAIPSPLPLGQSLLVVWPQLTGLIAATV ICFAISYIMFMRREIRSR >gi|225935391|gb|ACGA01000001.1| GENE 31 39268 - 40542 993 424 aa, chain - ## HITS:1 COG:BMEII1015 KEGG:ns NR:ns ## COG: BMEII1015 COG0642 # Protein_GI_number: 17989360 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Brucella melitensis # 136 400 148 414 447 91 
25.0 3e-18 MKLIYYIILRITLALTLILTVWAIFFYVTMIDEVNDEVDDALEDYSETIIIRALAGEELP SKTNGSNNQYYMMEVSKEYAESREDIQYKDSMVYIEEKGETEPARILTTIFKDDEGRYHE LTVSTPSIEKDDLRDAIQVWIIFLYVALLFCIIIISVWVFYRNMRPLYVLLHWLDGYQTG KRNKPLSNDTQITEFRKLNEAAIRYVERTEQMFEQQKQFIGNASHEIQTPLAICRNRLEM LMEDDSLSEKQLEELMKTHQTLEYITKLNKSLLLLSKIDNGQFTDTKELELNVLLKQYLQ DYEEVYDYRNIEVTIDEQGVFKVTMNESLAVALLTNLLKNAFVHNIDGGHIRITVTKNGI TFRNSGVEHPLNKEHIFERFYQGSKKEGSTGLGLAIADSICRLQHLNIKYYFEQNEHCFE ISQS >gi|225935391|gb|ACGA01000001.1| GENE 32 40564 - 41241 179 225 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 1 183 1 190 226 73 31 3e-12 MKILIVEDEPSLRELIQCSLEKERYVVETASDFNSALRKIEDYDYDCILLDIMLPDGSGL DLLERLKALHKRENVIIISAKDSLEDKVLGLELGADDYLPKPFHLVELNARIKSVIRRHQ HDGEIDIRQGNVRIEPDKYRVFVNEQEVELNRKEYDILLYFINRPGRLINKNTLAESVWG DHIDQVDNFDFIYAQIKNLRKKLKDSGATIEIKAVYGFGYKMIVE >gi|225935391|gb|ACGA01000001.1| GENE 33 41456 - 42949 1627 497 aa, chain - ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 8 497 5 488 488 475 49.0 1e-134 MAKELKDLTKRSENYSQWYNDLVVKADLAEQSAVRGCMVIKPYGYAIWEKMQRQLDDMFK ETGHVNAYFPLLIPKSFLSREAEHVEGFAKECAVVTHYRLKNAEDGSGVVVDPAAKLEEE LIIRPTSETIIWNTYKNWIQSYRDLPILCNQWANVFRWEMRTRLFLRTAEFLWQEGHTAH ATREEAEEEAIRMLNVYGEFAEKYMAVPVVKGVKSANERFAGALDTYTIEAMMQDGKALQ SGTSHFLGQNFAKAFDVQFVNKENKMEYVWATSWGVSTRLMGALIMTHSDDNGLVLPPHL APIQVVIVPIYKNDEQLKQIDAKVEGIVAKLKALGISVKYDNADNKRPGFKFADYELKGV PVRLVMGGRDLENNTMEVMRRDTLEKETVTCEGIETYVQNLLEEMQANIYKKALDYRNSK ITTVDTYDEFKEKIEEGGFILAHWDGTTETEEKIKEETKATIRCIPFESFVPGDKEPGKC MVTGKPSACRVIFARSY >gi|225935391|gb|ACGA01000001.1| GENE 34 43120 - 43548 224 142 aa, chain - ## HITS:1 COG:no KEGG:BT_1390 NR:ns ## KEGG: BT_1390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 139 13 147 153 103 37.0 3e-21 
MAKKIEKYLPVDYSDSFSIEVTENRLSPKDIILKVLAHNPAWLRMLYAIRACLVKPFGIE TKAIESEELIIEEDKQEAIMRKDDKHLLFYVDIFITPLETGKQMIEVTTLVKYHNWVGKA YFFCIKPFHRVIVPLVLKKALC >gi|225935391|gb|ACGA01000001.1| GENE 35 44536 - 45948 1031 470 aa, chain + ## HITS:1 COG:VC1606 KEGG:ns NR:ns ## COG: VC1606 COG1538 # Protein_GI_number: 15641614 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 3 467 4 460 476 169 27.0 9e-42 MKKNKRLCILLTLYFSIGQIDAQTPLSFEESLHLLNQGNQSLKIADKSIEIAKAERDKLN AFWYPSLQSTGAFVHMSEKIEVKQPLSQFTDPAKDFVHSIIPDDQIISSILDQIGANTLI FPLTPRNLTTVDLSAEWVLFSGGKRFHATNIGRTMVDLARESRAQVSANQQSLLVESYYG LRLAQQIVTVREETYNGLKKHYENALKLEAAGMIDKAGRLFAQVNMDEAKRALEAARKEE TVVQSALKVLLNKKDADANIIPTSPLFMNDSLPPKMLFDLSVNSGNYTLNQLQLQQHIAK QEVRIAQSGYLPNIALFGKQTLYSHGIQSNLLPRTMVGIGFTWNLFDGLDREKRVRQSKL TEQTLALGQMKARDDLAVGVDKLYTQLEKAQDNVKALNATIALSEELVRIRKKSFTEGMA TSTEVIDAETMLASVKVARLAAYYEYDVALMNLLSLCGTPEQFANYQPKP >gi|225935391|gb|ACGA01000001.1| GENE 36 45955 - 46941 961 328 aa, chain + ## HITS:1 COG:HP1488 KEGG:ns NR:ns ## COG: HP1488 COG0845 # Protein_GI_number: 15646097 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Helicobacter pylori 26695 # 36 327 37 327 329 209 36.0 4e-54 MKPTSKTLSWAFVIILLAVGIFTGLGVILMHKQPLVLQGQAEATEIRISGKLPGRIDTFF VQEGDWVHQGDTLVVINSPEVHAKYQQVNALEQVAVQQNKKIDAGTRRQIVATALQLWNK TKSDLTLAQTTYNRILTLYKDSVITSQRKDEVEAMYKAAVAAERAAYEQYQMAVDGAQKE DKASAASMVDAARSTVDEVSALLVDARLIAPENGQIATIFPKRGELVAPGTPIMNLVVMD DIHVVLNVREDLMPQFKMDETFVADVPAIGKKNIEFKIYYISPLGSFATWKSTKQTGSYD LRTFEIHARPTQKMDDLRPGMSVLLTLD >gi|225935391|gb|ACGA01000001.1| GENE 37 46964 - 48148 562 394 aa, chain + ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 16 346 12 340 387 105 22.0 1e-22 
MDESQTYSPFRSVLLREWRRMTSRRLYFGVCIVLPLFTLFFMATIFGNGQMENIPIGIVD QDNTATSRTIVRNISAVPTFKVTKHFVNEAAARESVQKKEIYGYLSIPPQFEQNAITGKN ATLSYYYHYALMSVGGELMAAFETSLAPVALSPVVMQAMALGVEQDQITTFLLPVQANNH PIYNPSLDYSVYLSQLFFFVLFQVLILLITVYAVGIEIKFRTADDWLATAKGNIVTAVLG KLLPYTIIYILIGWLANYVMFGILHIPFQGSWWLMNIMTVLFIIATQALGLFLFSLFPAI SLVISVVSMVGSLGATLSGVTFPVPNMYPLVRDASNLFPVRHFTEMMQTMLYGGGGFIHL WPSAVILCIFPLLALSLLPHLKRAIESHKYENIK >gi|225935391|gb|ACGA01000001.1| GENE 38 48132 - 49316 941 394 aa, chain + ## HITS:1 COG:jhp1379 KEGG:ns NR:ns ## COG: jhp1379 COG0842 # Protein_GI_number: 15612444 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Helicobacter pylori J99 # 13 369 6 351 376 129 28.0 8e-30 MKTLSKLQQLSFIIRREFLAISTSYAVLLVLMGGIFVYGLLYNYMYAPNIVTDVPVAVVD NSHSELSRDFIRWLDATPQAEISSQAMDYHEAKEWMKECKVQGILYLPHDFEERVFRGDE AVFSLYATTDAFLYFEALQGASSRVMLAINDKYRPDEAVFLPPQGLLAVAMAKPINVEGT ALYNYTEGYGSYLIPAVMMIIIFQTLLMVIGMVTGEEYSSKGIRAYTSFGYGWGVAIRIV AGKTSVYCALYAIFAFFLLGLLPHFFSIPNIGNGLYIVLLLIPYLMATSFLGLAASRYFT DSEAPLLMIAFFSVGLIFLSGVSYPMELMPWYWKVAHYIFPAAPGTLAFVKLNSMGASMA DIRPEYITLWIQALIYFTISIWVYKKKLESNLIS >gi|225935391|gb|ACGA01000001.1| GENE 39 49546 - 53535 2205 1329 aa, chain + ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 778 1033 357 607 607 139 33.0 3e-32 MKYIVYILLFFPVWVTAQTYKYIGIEDGLSNRRIFNIQKDAQGYMWFLTNEGMDRYNGKD IKHYKLNKEGNTLDAPIRLGWLYTEPHIGIWVIGKQGRIFQYDANKDDFRMVYRLPDTSE TISCGYLDRNNNIWLCRKDTVLLYNIKDARIFQFPNVLHSSITAIEQVDEHHFFIATETG VRYVKQENNVLEIIPVETLDYFHAQVSELYFHQQLKRLFIGSFERGVFVYDTNTQEIIRP DADLSDVNIARISPLNETELLIATEGMGVYKVDVNTCKLEHYIVANYQSYNEMNGNNIND VFVDEEKRIWLANYPTGITIMDNRYENYHWMKHSMGNHQSLINDQVHAVIEDEDGDLWFG TSNGISLYQSKTGKWHSFLSSFDQQIKDKNHIFITLCEVSPGVIWAGGYTSGIYKINKNT LSVEYFSPYLLSHVNMRPDKYIRDIIKDSRGYIWSGGYYNLKCFNLATNSVRLYPGLNSI 
TSIVEKDKDTMWIGTVNGLYLLNRNSGEYQYIKMEIGAVYINALYQADNGLLYIGTNGVG VFVYNHQDKTFEHYFSDNSALVSNRIFTILPEVNGRIMMSTENGITCFHTKEKIFNNWTR GEGLLPAYFNASAGTVRKNKGFVFGSTDGAIEFPVNVKFPDYKFSRLIFSDFHLSYQPVY PGVKDSPLQKSIDETDVLELEYDENTFSFEVSTINYDSPGNALYSWKLEGFYEKWTQPGA NNLIRFTNLPPGRYILHVRAISREEHDIVFQEHAMKIIITQPFWSSWWAILCYILLVIGG FYFVLRVINLRKQKNISDEKTQFFINTAHDIRTPLTLIKAPLEELLEEETLTDNGITRTN IALRNVESLLRLVSNLINFERTDIYSSKMSVSEYELNSYMNEIYDTFSSYAAIKRIEYTY ESTFSYLNVLFDKEKMDSILKNIISNSLKYTPESGKVNISVSDTNDSWKVIIKDTGIGIP ASEQSKLFKLHFRASNAINSKVTGSGIGLMLVGKLVSLHGGKISVESVEHQGTTIKIVFP KSNKNFQNTGEEAPSKFEALAPVLPTPNVPAKATATIDNPNLQRILVVEDNDELRSYLVS SLSSIYNVQACANGKEALIIIKEFWPELVLSDIMMPEMGGDELCVAIKSDIETSHIPVLL LTALGDENNILDGLSIGADEYLIKPFSVKILRANIANLLANRELLRMRYANLDIEKTMVP SANGTNSLDWKFISNAKKIVDENINNPEFSVDVLCESSGMSRTSFYCKLKALTGQSPTEF IRVMRLKRATELLKEGGYAINEISDMVGFSETKYFREVFKKYYKMSPSRYAKEGGNPAAT ELEDDDEED >gi|225935391|gb|ACGA01000001.1| GENE 40 53667 - 54179 385 170 aa, chain - ## HITS:1 COG:no KEGG:BT_0959 NR:ns ## KEGG: BT_0959 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 170 3 169 169 264 89.0 9e-70 MESTTVKLYSLNYGNVRTYLAASLFIVGNILFPQFFHLIPQGGITWLPIYFFTLIGAYKY GWKVGLLTAVLSPIINSSLFGMPMPVVLPAILLKSVLLAIAAGWAAQRFGRISIPVLVGV VLTYQVVGTLGEWMYIGNFYNAIQDFRIGIPGMCLQVFGGYLFIRYLIRK >gi|225935391|gb|ACGA01000001.1| GENE 41 54193 - 56082 1622 629 aa, chain - ## HITS:1 COG:PA1271 KEGG:ns NR:ns ## COG: PA1271 COG4206 # Protein_GI_number: 15596468 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Pseudomonas aeruginosa # 14 532 5 510 616 100 26.0 1e-20 MRRNTACLFLFGSLLSLSGMAHTKKDSIQAGRNYMIDEVVVTGTRNETDVRHLPMTISVV GRQQIEKRYEPSLLPLLTEQVPGLFTTSRGIMGYGVSTGAAGGMSLRGIGGSPTAGLLVL IDGHPQYMGLMGHPIADAYQSMMAEKVEVLRGPASVLYGSNAMGGVINIVTRRQQEEGVK TNMQVGYGSYNTLQTEFSNRVKKGRFSSVVTGSYNRTDGHRPDMGFEQYGGYAKLGYDIS SFWKVWGDVNVTHFNASNPGTIQVPLIDNDSRITRGMTSFALENHYEKTSGGLSFFYNWG 
RHKINDGYQIGKEPQKSHFNSKDKMLGVSWYQSATFFTGNRLTVGFDYQHFGGESWNKVL ATGEHTPGVDKQMDEFAGYVDFRQDISSWFSLDAGIRVDHHSHVGTEWIPQGGLAFHLPK NAELKAMVSKGYRNPTIREMYMFPPANPELKPEKLINYELSYSQRLLEGALSYGISLYYI NGDNLIMSNGLFPPLNINSGEIENWGVETNIGYHINSHWNVNANYSWLHMENPVLAAPEH KLYAGADYTSGRWSISTGVQYVKGLYTSVVKGSEQQDHFVLWNLRGNYRVCRFANLFIKG ENLLAQRYEINAGYPMPKATFMGGINLNF >gi|225935391|gb|ACGA01000001.1| GENE 42 56113 - 56532 365 139 aa, chain - ## HITS:1 COG:no KEGG:BT_0961 NR:ns ## KEGG: BT_0961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 144 241 84.0 7e-63 MEELINLLHTGGYSCTIANKGEIRTFTQRGVADIYDLLTQEPEFLKGASIADKVVGKGAA ALMILGGIKELYTDIISTKALELLQKSDIKVGFTEEVPFIWNRNHTGGCPVETMCSEVES AEEILPLIRDFLEKIRNKK >gi|225935391|gb|ACGA01000001.1| GENE 43 56541 - 57944 1039 467 aa, chain - ## HITS:1 COG:MA0422 KEGG:ns NR:ns ## COG: MA0422 COG1453 # Protein_GI_number: 20089314 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 54 467 1 385 400 227 36.0 4e-59 MEDKNKKDINRRDFIKIVGISAATSTGLLYGCSSKGTSSSSSATGEGEVPTDKMTYRTSP TTGDRVSLLGYGCMRWPLKPAPNGNGEVIDQDAVNGLIDYAIAHGVNYFDTSPAYVQGFS EKATGIALSRHPRDKYYIATKLSNFSPDTWSREASLKMYHKSFADLQVDYIDYMLLHGIG MGGMEALKGRYLDNGMLDFLVKEREAGRIRNLGFSYHGDIEVYDYLLSRHDEFKWDFVQI QLNYVDWKHAKETNTRNTDAEYLYGELHKRGIPSIIMEPLLGGRLSKLNDNLVARLKQRR PESSVASWAFRFAGSFPDILTVLSGMTYMEHLQDNLRTYSPLEPLTDEEKEFLEETAQLM LKYPTIPCNDCKYCMPCPYGLDIPAVLLHYNRCVNEGNVPRNGQDENYAKARRAFLVGYD RSVPKLRQASHCIGCNQCVAHCPQNIDIPKELHRIDQFVEQLKQGTL >gi|225935391|gb|ACGA01000001.1| GENE 44 57964 - 59472 748 502 aa, chain - ## HITS:1 COG:MTH401 KEGG:ns NR:ns ## COG: MTH401 COG1145 # Protein_GI_number: 15678429 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanothermobacter thermautotrophicus # 194 484 21 323 337 69 23.0 1e-11 MLRTIRLTTAIVCFTLITLLFLDFTGTLHTWFGWLAKIQFLPALLALNIGVVLFLVVLTF 
LFGRIYCSVICPLGVFQDIVSWVSGKQKKNRFRYSPAMKWLRHGVLGVFIIMMVAGLNSL AILLAPYSAYGRIASSLFAPVWQWGNNLLAYFAERMDSYAFYEVDVWIKSLSTMLLAIVT LVVLFILAWRNGRTYCNTICPVGTVLGFISKYAIFKPVIDTSKCNSCGLCARNCKASCIN PKAHEIDYSRCVTCMDCIGKCKHGAITYTRRKPKNETATSEDTKAKSVTTEQVDNARRSF LSASAIFATTSVLKAQEKKVDGGLATIEDKKIPQRENPIYPPGALSARNFTQHCTACQLC VSVCPNQVLRPSDNLLTLMQPEMSYERGYCRPECTKCSEVCPAGAIHLTSLAEKSAIQIG HAVWIKENCVPLTDGMECGNCARHCPTGAIQMVTSDPEKTDSPKIPVVNVEKCIGCGACE NLCPSRPFSAIYVTGHQMHRII >gi|225935391|gb|ACGA01000001.1| GENE 45 59660 - 60754 876 364 aa, chain - ## HITS:1 COG:XF1739 KEGG:ns NR:ns ## COG: XF1739 COG2220 # Protein_GI_number: 15838340 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Xylella fastidiosa 9a5c # 31 363 26 358 385 312 43.0 7e-85 MRIIFKKFKARMIVGCILAVIALLAVSVIVFINQASFGRTPHGERLERVMKSPNYRNGEF KNQHETLLMTSDRGRFSGIWEFIFRKIDGLRPEQAVKAIKTDLRKIDPNEEILVWFGHSS YLIQTGGKRILVDPVFCMASPVSFVNKPFKGTELYTPDDMPDIDYLVISHDHWDHLDYNT VKKLKDRIGAVICPLGVGEHFEYWGFDKERLIELDWNEDANLAPGFMIHCLPARHFSGRG LTANQSLWASFLLETPSQKIYIGGDGGYDTHYAEIGNRFPDIDLAILENGQYNEEWSLIH LMPQYMAQTARDLKAKRVLTVHHSKYALAKHRWDEPLKNAEEMKNKDSLNVLIPEIGEVV ALEK >gi|225935391|gb|ACGA01000001.1| GENE 46 60912 - 62999 1152 695 aa, chain - ## HITS:1 COG:no KEGG:BT_0964 NR:ns ## KEGG: BT_0964 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 633 168 799 800 1092 84.0 0 MGYRPSIISEVSVTSSKEVFVEIPMDENVQDLAEVLVKPEIKKDRTVNPMAITGGRMISI EEASRFANGFDDPARLSSAFAGVAGDVGTNAVAIRGNAPQFTQWRLEGIEIPNPTHFADL AGLGGGFLSALSTQVIGNSDFYNGAFPAEYSNALSGVFDMHLRNGNNQKYEHAFQVGLMG IDLASEGPISKKRGSSYLVNYRFSTTSLASGNDINLKYQDLAFKLNFPTRKAGTFSIWGL GLIDRNKADVLERSEWETMGDRSSGYNKLDKLAGGLAHKYVLDENTYIRSSLSATYSKDR SLADLHTDDAIINVADIQNSRWNLVFNSYLNKKFSPRHTNRTGITITELKYDLDYKVSPY LGLNKPMEQISKGSGESTVLSAYSSSVINLSNNLTTSLGVTGQYFTLNKNWSIEPRVALK WKINPAHSLAAAYGLHSRRERLDYYFVEQIINGKKESNRYLDFSKAHHFGFTYDWNINQT LHLKIEPYYQYLFHIPVEENSSFSIINYEEFYLDRILTNTGIAKNYGIDITLEQYMKNGF 
YYMITASLFKSKYRGGDRIWRNTRLDKSFLVNLLAGKEWMVGRLKQNVLSVNGRLFFQGG GRYTPVDEEKSQEEKDIVFDESKAYTKRFNPSINGDVSISFRINKKRVSHEFSLKILNVG MRTGMHFYQYNERTSVVEEKDGAGLIPNISYKIYF >gi|225935391|gb|ACGA01000001.1| GENE 47 63481 - 64422 625 313 aa, chain - ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 109 277 150 318 354 81 32.0 2e-15 MFLSARFPSETEEKVQKWIIKDKDQQAKAKASLDYWNELDVEADSSTYASLERVNLRTGY NKEHLTGIASYQRFARVAAVVVPLLLLAGGMFCYLSPQNEMIEVSVAYGEQKRLILPDSS EVWLNAGSTILYPETFAKDKRLVILNGEAYFSVQKDTASPFIVEVPQLSVKVLGTKFNVK AYPGDEKITTTLTSGKVEVSVQSQPPRILKPNEQLTYDKKSSDIHISVIDTHDTNSWVVG KLIFTNATAGEIFRTLERHYNTTIDNEANIPASKRYTVKFLKDEDLDEILNILKDIIGFD YQQHENKIVLTQP >gi|225935391|gb|ACGA01000001.1| GENE 48 64553 - 65104 333 183 aa, chain - ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 4 179 9 194 206 73 27.0 3e-13 MNNNIDIKTLEAFQDGNHKAFETIFITYYNKTKAFIDGYIKSEPDAEELTEDLFVNLWIN RHSIDTSKSFNSYLHTIARNAAINFLKHKYVCDAYLNHNQETGYSSTSEEDLIAKELEML IDELVGGMPEQRRMIYTLSRNEGLSNAEIAERLNTTKRNVESQLSLALKEIRKVISCFLL SLL >gi|225935391|gb|ACGA01000001.1| GENE 49 65200 - 65550 229 116 aa, chain - ## HITS:1 COG:BH0737 KEGG:ns NR:ns ## COG: BH0737 COG1733 # Protein_GI_number: 15613300 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 12 110 14 112 118 88 41.0 2e-18 MYKKKIPFDIDCGIKITMEVIGGKWKSCIIQELCTGPKRPSELHRLFTEASPRVINQQLK ELEMHGIISKKIFMELPPHSEYAITELGKTLIPLIEQLEIWGDSFRPEMKRILKIE >gi|225935391|gb|ACGA01000001.1| GENE 50 65642 - 66661 810 339 aa, chain + ## HITS:1 COG:BH0654 KEGG:ns NR:ns ## COG: BH0654 COG0010 # Protein_GI_number: 15613217 # Func_class: E Amino 
acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Bacillus halodurans # 53 337 3 302 304 160 31.0 3e-39 MNNSSLLADVSGIPYFCFKKMRKYLLSLALVVGIVSMNAQEKVTNNVNMENKKSTLRFIY PQWQGGIVDHWMPDIPAEDSSRGYYLGAQLLNYLAPQTGQKTVEVPVSLDINDRATEKGI SARSVILKQTKAALDLLKENHPDRIVTLGGECSVSVVPFTYLINRYPDDVAIVWIDAHPD INLPYDEYKGYHAMALTACLGMGDEEIMELLPGKTDASKALIVGLRSWDEGMQERQRKLG IKGLSPQEVANDRTKIMEWLKSTGVSKVVIHFDMDVLDPAEIIAAVGVEPNGMKIEEVVR IINDISSEYDLVGLTVAEPMPRVAIKLRNMLNQLPLLKD >gi|225935391|gb|ACGA01000001.1| GENE 51 66658 - 66912 205 84 aa, chain - ## HITS:1 COG:no KEGG:BT_0967 NR:ns ## KEGG: BT_0967 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 77 1 67 73 73 70.0 2e-12 MKSKRIREEEIKTVLGKRNYVILIIGFILIIIGYILMSGEGSTLAAYNPDIFSGTRIRIA PLLCLLGYLLNVFGILYRPRQRGI >gi|225935391|gb|ACGA01000001.1| GENE 52 66924 - 70061 2640 1045 aa, chain - ## HITS:1 COG:no KEGG:BT_0968 NR:ns ## KEGG: BT_0968 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1045 1 1044 1044 1821 82.0 0 MSNLLFKRWNDFGGWLVFLIAAFVYGMTIEPTASFWDCPEFISCAEKLQVGHPPGAPFYM LVGNLFTQFASDASQVSGMVNFLNALLSAGCILFLFWSITRLVRALLVGDERKLSVMDVI IILGAGFVGALAYTFSDTFWFSAVEGEVYAFSSFLTALVFWMILRWQDEADSVSGDRWII LIAYIIGLSIGVHLLNLLCIPAIVLVFYYQKYQTLSLKGVIGAIALSGILIVLILFVYIP GMADVGGWFELFFVNVMGLPFQSGLIVFLGLVLFLLIGAIYRFRKRIVNTGLWCLLMLTI GYTTYAVILIRANANTPLNENAPDTIFTLKSYLNREQYESAPLLYGRTYASEPEYVPEGD YYKVKTEKGSAIYRPDKKEGKYKIIRYKEDVCYTQNMLFPRMWNDRSAASYKGWSGGGAN EAPTQKENLTYFITYQLNYMYWRYFLWNFVGRQNDIQGSGEPEHGNWITGISWLDNLRLG DQKLLPESLRENKGHNVFYGLPLLLGLLGIYWQWTRGKKGKQQFSVLFFLFFMTGLAIVL YLNQTPGQPRERDYAYAGSFYAFAIWIGMGAAGCCDMLRRKQAKILPVGLLMLLCLFVPI QMASQTWDDHDRSNRYTCRDFGANYLMTLPDKGNPIIFCDGDNDTFPLWYNQDTEEVRRD VRICNLSYAQTDWYIYQQQCPLYDAPGLPISWDQNQYQEGKNEYVAVRPELKKQIEALYQ KHPEEARDSFGDDPYEIKNILKYWVFAEKQEFHVIPTDTINIHIDKNAVLRSGMMLPEAI HHLRGEELKDAIPDKLSISLKDIRLLTKVDLLILEILANCNWERPLYMAISVGNSSKLKF 
DDYFVQEGLAFRFTPFNYKEWGDVKEGNGYAIDTEKLYENVMNRYKYGGLDTPGLYLDET TLRICYSHRRLFAQLAKELVKQGDDIRARKVLEYAGQAIPAYNVPEVYESGSYDIATAYA SIGETQKATDLLNHLIAESENYIDWAFSLGDNRIGMVQQDCLYRFWQWNQCNELLKNMDK ERYEQSNQQFEKKYMRLTQLMNYHN >gi|225935391|gb|ACGA01000001.1| GENE 53 70571 - 71686 690 371 aa, chain - ## HITS:1 COG:no KEGG:BT_0969 NR:ns ## KEGG: BT_0969 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 362 550 81.0 1e-155 MNNLRILLYMMLLSLAACHPEGTSVKQGLGKAAQLMAQDPDTASIILETIQSSQMNEAQL AEYNLLCTQLNEDKNIPHSSDKQIRQAASYYEKNGNEYQKSKAYYYLACVESDLEQKEEA EIHFKKAIKLAKETEEYDHLAKICKRCSLYYQKYGNFDEALEMERKAYASQLMLNDGKSD SSVILSSALGMFGVMSLLLGLLWKKNRHALSQLDLFKEEILKKDVESDKLILQCNHLEEK YQSLQLHIYESSPVVSKVRQFKERNVLSSKIPSFSEKDWTELLRLQENVYGLVSKLKEIG PKLTEEDLRVCAFLREGVQPACFADLMKLTTETLTRRISRIKTEKLMLANSKESLEDIVK SLGDLSFCARN >gi|225935391|gb|ACGA01000001.1| GENE 54 71942 - 72562 502 206 aa, chain + ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 2 191 3 192 207 97 30.0 2e-20 MIKNIVFDFGGVIVDIDRDKAVQAFIKLGLADADTRLDKYHQTGIFQELEEGKLSADEFR KQLGDLCGRPLTMEETKQAWLGFFNEVNLNKLDYILELRKSYHVYILSNTNPFVMSWACS SDFSSKKKPLNDYCDKLYLSYEVGHTKPAPEIFEFMVNDCNIIPSETLFVDDGASNIHIG KELGFETFQPKNGEDWREEMTAILQK >gi|225935391|gb|ACGA01000001.1| GENE 55 72600 - 73409 547 269 aa, chain + ## HITS:1 COG:mll2118 KEGG:ns NR:ns ## COG: mll2118 COG1028 # Protein_GI_number: 13471973 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Mesorhizobium loti # 7 182 10 186 271 160 47.0 2e-39 MQPQVILITGASSGFGKITAQMLSEQGHIVYGTSRKPSEDMNQVKMLVVDVTNSFSVCQA VERILLEQGRIDVLINNAGIGIGGALELATEEEVNIQMNTNFFGVVNMCKAVLPSMRKAR 
KGKIINISSIGGVMGIPYQGFYSASKFAVEGYSEALALEVHPFHIKVCVVEPGDFNTGFT DNRNISEQTRLDADYGESFLKSLEIIEKEERNGCHPRKLGAAICKIVECTNPPFRTKVGP WIQVLFAKSKKWLPDAVMQYALRIFYAIK >gi|225935391|gb|ACGA01000001.1| GENE 56 73560 - 74288 501 242 aa, chain - ## HITS:1 COG:no KEGG:BT_0973 NR:ns ## KEGG: BT_0973 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 242 378 92.0 1e-103 MEIFWRTIAYYNSATWLLQIVIILIGIALTGLLIGRPRPWVKMAMKFYMIGLYTWISLVY YYIYCEERSYNGVMAMFWGVMAIIWIWDAITGYTTFERTHKYDLLSYVLLAMPFIYPLVS LARGLSFPEMTSPVMPCSVVVFTIGLLLLFAQKVNMFLVLFLCHWSLIGLSKTYFFQIPE DFLLASATIPGLYLFFREYFLNNLHADTKPKAKYINWLLISVCVGLAVLLTTTMFLELVP KG >gi|225935391|gb|ACGA01000001.1| GENE 57 74300 - 74785 473 161 aa, chain - ## HITS:1 COG:Cj1165c KEGG:ns NR:ns ## COG: Cj1165c COG3610 # Protein_GI_number: 15792489 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 6 161 5 160 164 85 35.0 3e-17 MIALDILTDGFFAAVAGIGFGAISDPPLRAFKMIAILAALGHACRFCLMTYLGVDIATGS LFAGLVIGFGSLWLGKKVYCPMTVLYIPALLPMIPGKFAYNMVFSLIMCLQNVNDPDKLD KFMSMFFSNTLIASTVIFMLAVGATFPMFLFPHRAFSLTRH >gi|225935391|gb|ACGA01000001.1| GENE 58 74787 - 75554 699 255 aa, chain - ## HITS:1 COG:Cj1166c KEGG:ns NR:ns ## COG: Cj1166c COG2966 # Protein_GI_number: 15792490 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 13 247 12 249 258 142 34.0 7e-34 MKTGESLLSIGKFIAEYSAHLMGAGVHTSRVVRNTKRIGEAFGLDVKLSVFHRNIILTII DKETNEACNEVIDIPAHPISFEHNSELSALSWEAVDNHLSLEELKDKYKKIISAPRIHPL FVLLLVGFANASFCKLFGGDLISMGIVFSATITGFYLKQQMQAKKINHYVVFIVSAFVAS LCASTALIFDTTSEIAMATSVLYLVPGVPLINGVIDVVEGYVLTGFARLTEASLLIVSIA IGLSFTLLMVKNSLI >gi|225935391|gb|ACGA01000001.1| GENE 59 75632 - 76885 853 417 aa, chain + ## HITS:1 COG:BMEI0267 KEGG:ns NR:ns ## COG: BMEI0267 COG0477 # Protein_GI_number: 17986551 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and 
metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 11 379 20 365 397 115 28.0 2e-25 MNNQFTHPKERIRFAILTFFFAQGLCMASWASRIPDFKDVFAANYAFYWGLILFMIPVGK FVAIPLAGYLVSKLGSRSMVQVSILGYASSLLCVGLAHEVYLLGFLLFCFGVCWNLCDIS FNTQGIEVERLYGKTIMATFHGGWSLGACAGALIGFVMILAGVSPIWHYTLIFIIILIIA LSGRKYLQESAPQEAEVSDTKTQERNTAKAPNGFRLLFQKPEMLLLQLGLVGLFALIVES AMFDWSAVYFESVVHVPKSLQIGFLVFMIMMATGRFLTNYAYQLWGKKKVLQLAGSFICI GFFVSALLGGVFESMAMKVIINSLGFMLVGLGISCIVPTLYSFVGAKSKTPVSIALTILS SISFIGLLIAPLLIGAISQAFDIRIAYMLIGILGGCIVLIVSFSSAFDIQESGDKPE >gi|225935391|gb|ACGA01000001.1| GENE 60 77004 - 77255 230 83 aa, chain - ## HITS:1 COG:no KEGG:BT_0977 NR:ns ## KEGG: BT_0977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 83 137 83.0 1e-31 MCSKVMDFLTDDDFINYVLGVTPQSASQWETYFREHPEEMADAEEAKAVLLAPANVDCGF SIVENNELKDRIISSIKDFSGIL >gi|225935391|gb|ACGA01000001.1| GENE 61 77338 - 77889 539 183 aa, chain - ## HITS:1 COG:BH3216 KEGG:ns NR:ns ## COG: BH3216 COG1595 # Protein_GI_number: 15615778 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 5 183 3 178 193 67 26.0 2e-11 MMEQTNHSTDTLLASFQAGNMAAFSQLYNLHINVLFNYGLKLTIDKELLKDCIHDIFVKL YTKKDELGTIDNLRSYLFISLKNKLCDELRRRMYMSDTAVEEVSISTPTDVEDDYMEEEQ RKNEFSLVRRLLDQLSPRQREALTLYYIEEKKYEDICEIMNMNYQSVRNLMHRGLTKLRS LAS >gi|225935391|gb|ACGA01000001.1| GENE 62 78065 - 81208 2378 1047 aa, chain - ## HITS:1 COG:no KEGG:BT_0979 NR:ns ## KEGG: BT_0979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1047 1 1031 1031 2004 89.0 0 MNIQVRTILLGLLSIGFVQSYAQTFALQVKNDQITYLNDDRGNRILDFSTCGYKSSEQDI PSVRNVVFVPWKAGDNTARIQRAIDYVASLTPDASGFRGAVLLDQGEFSLSGSIRISASG IVLRGTDKEKTILLKKGVDRGALIYMEGMDDLNVQDTLKVFSHYVPVNARTLEVASGVSL KKGDRVMVTRPSGKEWIASLGCDIFGGGISALGWKEGDMDLTWDRTVCEVNGNQVTLDAP 
LTVALDANYGTSSLLTYQWNGRIHDCGVENMTLISDYDKRYPKDEDHCWTGISIEDAENC WVRLVNFKHFAGSAVIVQRTGSKITVEDCISKEPVSEIGGMRRCTFHTLGQQTLFQRCYS EQGIHDFAAGYCAAGPNAFVQCDSYESLGFSGSIDAWACGLLFDVVNIDGHNLTFKNLGQ DKNGAGWNTANSLFWQCTAAEIECYAPAKDAMNRAYGCWAQFSGDGEWAQSNNHVQPRSI FYAQLGERLNKECAERARILPRNTSATSSPTVEVAMELAKEAYKPRLTLEHWIGDNKFAP SVASTGVKSIDDIKEKKSAALANSSSSSSSSAAASAAKLLTQPEVTVTNGRIQMNGALLV GGTHTTPWWNGKLKTNYLKKASPAITRFVPGREGLGLTDRIDSVVDFMKQKNILVFDQNY GLWYDRRRDDHERVRRRDGDVWGPFYEQPFGRSGQGTAWEGLSKYDLKRPNAWYWSRLKE FAEKGNKDGLLLFHENYFQHNILEAGAHWVDSPWRSSNNINQTGFPEPAPFAGDKRIFVA DMFYDISHPVRRELHRQYIRQCLNNFADNSNVIQLTSAEFTGPLHFVQFWLDVIAEWETE TGKKAKVALSTTKDVQDAILADPKRAAVVDIIDIRYWHYKTDGIFAPEGGKNMAPRQHMR KMKVGKVTFTEAYKAVNEYRQKFPQKAVTFYAQNYPAMGWAVFMAGGSCPVIPCTDKAFL KDAVAMEVEETNTDEYKKMVKSDIGSIIYSKSGTEIPVQLSSGKYVLKYIHPASGKIETI NKSLKINGLYNLKVPDKKEGIYWFHKL >gi|225935391|gb|ACGA01000001.1| GENE 63 81239 - 82921 1648 560 aa, chain - ## HITS:1 COG:no KEGG:BT_0980 NR:ns ## KEGG: BT_0980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 560 1 560 560 1020 91.0 0 MNNIKKVLSAWMLVACVLPVAAQYPVIPDSVKARGAKQEAEFERKSDAAWEKALPTVLEE AKKGRPYKPWASKPEDLIKSNIPAFPGAEGGGMYTPGGRGGKVIVVTSLEDSGPGTLREA CETGGARIIVFNVAGVIRLKSPISVRAPYVTIAGQTAPGDGICVTGQSFLIDTHDVVIRH MRFRRGAQDVAFRDDAVGGNAVGNIMVDHCSASWGLDENMSIYRHVYNRGADGHGLKLPT VNITIQNSIFSEALDTYNHAFGATIGGHNSMFCRNLFASNISRNSSVGMDGDFNFVNNVV FNWWNRSVDGGDHNSFYNMINNYFKPGPITPIGKPISYRILKPEAGRDKNRPLSFGKAYV NGNIIHGNAKVTKDNWDGGVQLKEEVDAAKFLPLIKSDEAFKMPPVTVMDTKKAYTFVLD NVGANFPKRDAVDARVIKTVQTGKAIYAKDAPEFVSPYVKRRLPADSYKQGIITDIRQVG GLPEYKGEAVVDSDGDGMPDAWEVANGLNPNDPADANMDCNGDGYTNIEKYINGIDTRKK VDWTDLKNNYDTLSKRKSLL >gi|225935391|gb|ACGA01000001.1| GENE 64 82980 - 87374 3686 1464 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 928 1181 8 256 294 159 36.0 5e-38 MIGMYAFAYPNMIVEHYTAERGLPNNIVNCTLKGQDGFIWFGTWYGLCSFDGSKFRSYDN HDGFYSTDIPPRKIQRIVEDKNGFLWIKTIDRKLYLFDKRHESFHAVYDDVKEYSENIQI IKIQTTEDGDVLLLTKDKSLLRAYTDKEGKITMKQLHDSRPNVNVYDMRLKHNIFCETAE FINWIGMDYQILSLRKGEALKDKPADFIQKKVSANPDQFTCASYNSKFLWLGDKKGHIYS IDPQNGVVNRYEIPEIKQPVSNLLVTESGLMYITTNEGAYEYNIGYKQLTKLPFTIPEKD NGIIFYDKYDKVWFQEGNQALTYYDPLNRSHHRFTFPNQNAIGNFEMQDAGEQGMFFMTP GGEILLFDREKLEMTRINQLKPFSDDLPNQLFFHLLLDKDGILWLASTSSGVYRLNFPKK QFQLLTEVSPSPVVPERSTSWNQGIRALFQAQNGDIWVGTRWQALYRLDRNGQVKQIFSD KNYLLGAVYHIMEDKDGNLWFSTKGNGLVKAEPDMNSPHGLRFTRYINDPKNPNSISNND VYFTYQDSQGRIWVGLLGGGLNLISEENGAITFIHKYNGLKQYPAYGLYMEVRTMTEDED GRIWVGTMDGLMSFDGHFTAPEQIQFETYRQVSENSNVADNDIYVLYKDTDSQIWVSVFG GGLNKLVRYDKEKREPIFKSYGIREGMNNDVVKSIVEDKNGNLWFTTEIGLSCFNKATEQ FRNYDKYDGFLNVELEEGSALRTLNGDLWIGTRQGILTFSPDKLETLHTNYDTRIVDFKV SNRDLRSFRECPILKESITYAKAIELNYNQSMFTIEFAALNFYNQNRVSYRYILEGYEKE WHYNGKNRIASYTNVPPGDYTFRVETVDEANPELVSNCTLAITILPPWWLSWWATLIYVI LGLAALYFSLRLAFFMLKMKNDIYIEQKVSEMKIKFFTNISHELRTPLTLIKGPIQELRE REKLSPKGLQYVDLMEKNTNQMLQLVNQILDFRKIQNGKMRLHVSLIDFNEMIASFQKEF RVLSEENEISFTFQLPDEPIMVWADKEKMSIVIRNIISNAFKFTHSGGSIYITTGLTDDG KRCYVRVEDNGVGIPQNKLTEIFERFSQGENAKNSYYQGTGIGLALSKEIVNLHHGQIRA ESPEGQGAVFIVELLMDKEHYRPSEVDFYVGDTETAPVSVEQDPVANAISEDGTEEEPEI DASLPTLLLVEDNKDLCQLIKLQLEDKFNIHIANNGVEGLKKVHLYHPDIVVTDQMMPEM DGLEMLQSIRKDFQISHIPVIILTAKNDEGAKTKAITLGANAYITKPFSKEYLLARIDQL LAERKLFRERIRQQMENQTTTEEDSYEQFLVKKDVQFLEKIHQVIEENMDDSDFNIDTIA SGIGLSRSAFFKKLKSLTGLAPVDLVKEIRLNKSIELIKNTDLSVSEIAFAVGFKDSGYY SKCFRKKYNQSPREYMNEWRKGER >gi|225935391|gb|ACGA01000001.1| GENE 65 87548 - 90202 2246 884 aa, chain - ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 45 801 3 731 755 233 28.0 2e-60 MNKRPFIILSAFLLLFSLIEKGQATETYRPETSVAGFIQLPGSGRQVYNFNPGWRFFRGD VRGAEAVNFDDRSWNVVSTPHTVELMPAEGSGCRNYQGPAWYRKHFVLPAETKGQRVVLH 
FEAAMGKQILYLNGKRIQEHLGGYLPFTLDLTANGVQAGDSCLLAVFTDNSDDKSYPPGK RQYTLDFAYHGGIYRDVWMIAKSPVAITDAIDSQTVGGGGVFVHFDKISEKSAQVYVNTE VQNDDARFESVTVETTLTDADGKVIKRSSGKLSLKPGEKKSIRQQMEVKNPTLWSPDTPY LYRVQSRIKKGNKSIDGGITRVGIRLAEFRGKDGFWLNGKPFGQLVGANRHQDFAYVGNA LPNSQQWRDAKRLRDAGCTIIRVAHYPQDPAFMDACDELGLFVIVATPGWQYWNKDPKFG ELVHQNTREMIRRDRNHPSVLMWEPILNETRYPLDFALKALEITKEEYPYPGRPVAAADV HSAGVKEHYDVVYGWPGDDEKEDKPEQCIFTREFGENVDDWYAHNNNNRASRSWGERPLL VQAMSLAKSYDEMYRTTGLFIGGAQWHPFDHQRGYHPDPYWGGIYDAFRQKKYAYEVFRS QSPASLQHPLAECGPMIFIAHEMSQFSDKDVVVFTNCDSVRLSIYDGTKTWTKPVVHAKG HMPNTPVIFENVWDFWEARGYSYTQKNWQKVNMVAEGIIDGKVVCTQKKMPSRRSTKLRL YIDTQKVNLVADGSDFIVVVAEVTDDSGNVRRLAKENIVFTVEGEGEIIGDATIGANPRA VEFGSAPVLIRSTRKAGKIKVKAHVQFEGTQAPTATEIELESVPAELPFCYEEQTYEIQR TTPSTLNVNPGKESSEGKVQLTEEERQRVLDEVERQQTEFGTEK Prediction of potential genes in microbial genomes Time: Fri May 13 06:19:54 2011 Seq name: gi|225935390|gb|ACGA01000002.1| Bacteroides sp. D2 cont1.2, whole genome shotgun sequence Length of sequence - 2409 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 52 - 83 4.5 1 1 Tu 1 . 
- CDS 230 - 2392 1404 ## BT_0984 hypothetical protein Predicted protein(s) >gi|225935390|gb|ACGA01000002.1| GENE 1 230 - 2392 1404 720 aa, chain - ## HITS:1 COG:no KEGG:BT_0984 NR:ns ## KEGG: BT_0984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 718 47 803 803 1287 80.0 0 MNIWVEDGDVMFYVSRSGTFDENNCQLKQGRVRLRLSPNPFKDAKDFRQELKLKDGYVEI AAGNTQIQFWVDVFHPTIHVEVTNAQPLQTEVFYENWRYQERPVRKGEGQQCSYKWAPPK GSMTEADCVSVNTNQLLFYHRNPEQTVFDVVVAQQGMNEVKSQMMNPLKNLTFGGTLLGE NLEFVGTTDEVYAGTDYRAWKFRSSKAARKEHFCIVLHTDQTETIEEWEQGIQTALHRIA PKGKVSSKTIIQDKKQTRSWWNSFWQRSFIEADGEAKEITRNYTLFRYMLGCNAYGSVPT KFNGGLFTFDPCHVDEKQSFTPDYRKWGGGTMTAQNQRLVYWPMLKSGDFDMMPSQFDFY NRMLKNAELRSRVYWQHDGACFSEQIENFGLPNPAEYGFKRPDWFDKGLEYNAWLEYEWD TVLEFCQMILETKNYANADITPYLPLIESSLTFFDEHYRQLASRRGRKALDGNGHLILFP GSACETYKMTNNASSTIAALKTVLETYGKKEEMLKTIPPIPLRYIEIKDTLNPTIAPVLK QTISPAVSWERINNVETPQLYPVFPWRIYGVGKEDLDIARNTYFYDPDAIKFRSHTGWKQ DNIWAACLGLTEEAKKLSLAKLSNGPHRFPAFWGPGYDWTPDHNWGGSGMIGLQEMLLQT NGEQILLFPAWPKEWNVHFKLHAPGETTVEATLKNGKVTDLKVLPESREKDIIIMIEKEK Prediction of potential genes in microbial genomes Time: Fri May 13 06:20:12 2011 Seq name: gi|225935389|gb|ACGA01000003.1| Bacteroides sp. D2 cont1.3, whole genome shotgun sequence Length of sequence - 17792 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 4, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 119 - 1552 1189 ## BT_0985 putative sialic acid-specific acetylesterase II 2 2 Tu 1 . - CDS 1669 - 4968 2809 ## BT_0986 putative DNA-binding protein - Prom 5037 - 5096 4.2 - Term 5057 - 5106 11.0 3 3 Op 1 . - CDS 5176 - 6069 389 ## BF2115 putative AraC-type transcription regulator 4 3 Op 2 . - CDS 6149 - 6487 214 ## BF1869 hypothetical protein 5 3 Op 3 . - CDS 6566 - 7360 418 ## gi|160882221|ref|ZP_02063224.1| hypothetical protein BACOVA_00167 6 3 Op 4 . 
- CDS 7394 - 9703 1072 ## gi|260170301|ref|ZP_05756713.1| hypothetical protein BacD2_00383 7 3 Op 5 . - CDS 9734 - 12208 1382 ## gi|160882223|ref|ZP_02063226.1| hypothetical protein BACOVA_00169 8 3 Op 6 . - CDS 12245 - 13162 643 ## BT_1062 hypothetical protein 9 3 Op 7 . - CDS 13217 - 14803 1382 ## BT_1063 hypothetical protein 10 3 Op 8 . - CDS 14859 - 16400 1008 ## BT_1064 hypothetical protein 11 3 Op 9 . - CDS 16408 - 16995 285 ## PG2130 hypothetical protein - Prom 17034 - 17093 9.9 - Term 17426 - 17464 4.7 12 4 Tu 1 . - CDS 17495 - 17617 56 ## - Prom 17675 - 17734 1.8 Predicted protein(s) >gi|225935389|gb|ACGA01000003.1| GENE 1 119 - 1552 1189 477 aa, chain - ## HITS:1 COG:no KEGG:BT_0985 NR:ns ## KEGG: BT_0985 # Name: not_defined # Def: putative sialic acid-specific acetylesterase II # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 926 93.0 0 MKSSIIKTGTMLAGFLLAACLSTHAEVKLPAIFSDGMVMQQQTNANLWGMATPHKKVTVT TSWNGKQYAATADKNGAWKLTVATPEAGGPYTVTFDDGTQKTLNNILIGELWLCSGQSNM EMPMKGFKNQPVENANMDILHSKNPQIRLFTVKRTSTFTPQNDVIGSWKEATPASVRDFS ATAYYFGRLVNEILDVPVGLVVAAWGGSACEAWMTADWLKAFPEAKIPQTEADIKSKNRT PTVLYNGMLHPLIGMTMKGVIWYQGEDNWNRAHTYADMFTRLINGWRAEWKQGDFPFYYC QIAPYDYGIITEKGKEVINSAYLREAQAKVEHRVANSGMAVLLDAGMEKGIHPAKKQIAG ERLALLALTKAYGVEGVNGESPYYKSIEIKNDTVVVSFERANMWISGKNCFESKNFQVAG EDKVFYPAKAWIERSKMLVKSDKVPHPVAVRYCFENYVEGDVYCDGLPLGSFRSDDW >gi|225935389|gb|ACGA01000003.1| GENE 2 1669 - 4968 2809 1099 aa, chain - ## HITS:1 COG:no KEGG:BT_0986 NR:ns ## KEGG: BT_0986 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1083 1 1086 1105 1930 84.0 0 MKSRLKQQIFALSLLAWTAVCPADAQQTRSLREQFQNPSDEAKPWTFWYWMYGAVSKEGI TADLEAMKHAGLGGTYLMPIKGIHEGAQYDGKAQQLTPEWWEMVRFSMEEADRLGLKLGM HICDGFALAGGPWITPEESMQKVVWSDTIVNGGKLMAIRLPQPEAYESYYEDIALFALPV EDAADEMQAKITCVNLATTGNVKATQTVNMDAAGVIRSSYPCYIQYEYEQPFTCRNIEIV LNGNNYQAHRLKVMASDDGVNYRLVKQLVPARQGWQNTDENSTHSIPPTTARFFRFYWTP 
EGSEPGSEDMDAAKWKPNLKIKELRLHREARLNQWEGKAGLVWRVAQATKEEEVGKQDCY SLSQVINLTEQYTSHSNGKTLTATLPKGKWKLLRMGHTATGHTNATAGGGKGLECDKFNP KTVRKQFDNWFAQAFAKTNPEVARRVLKYMHVDSWECGSQNWNKRFAIEFQKRRGYDLMP YLPLLAGIPMESVEQSEKILRDVRTTISELVVDVFYQVLADCAREYDCQFSAECVAPTMV SDGLLHYQKVDLPMGEFWLNSPTHDKPNDMLDAISGAHIYGKNIIQAEGFTEVRGTWDEY PGMLKALLDRNYALGINRLFYHVYVHNPWLDRKPGMTLDGIGLFFQRDQTWWDKGAKAFS EYATRCQSLLQYGHPVTDIAVFTGEEVPRRSILPERLVPSLPGIFGAERVESERIRLANE GQPLRVRPVGVTHSANMADPEKWVNPLRGYAYDSFNKDAILRLAKAENGRITLPGGASYK VLVLPLARPMNLEPVLSSEVQKKINELKEAGILVPSLPYTEEDFSVYGLERDMIVPEDIA WTHRRGELGELYFVANQKNETRTFTASMRINGKKPECWNPVTGEMNIHPSYRINGNRTEV TLTLAPNESVFIVYPAEGVHEGYGESSLQLQKEKKDIARASLNIALEAKEYAITFAANQK TLTRKELFDWSQETDEQIRYYSGTATYKTTFRWKNKPNKDQQIYLNLGTVYNLATVRVNG VDCGTIWTAPYRANITGALKKGTNELEIEVTNTWANALTGADEGKAPFDGIWTNAKYRRA EKTLLPAGLLGPLSFSITE >gi|225935389|gb|ACGA01000003.1| GENE 3 5176 - 6069 389 297 aa, chain - ## HITS:1 COG:no KEGG:BF2115 NR:ns ## KEGG: BF2115 # Name: not_defined # Def: putative AraC-type transcription regulator # Organism: B.fragilis # Pathway: not_defined # 1 291 1 289 290 265 44.0 1e-69 MKDNKPYRQNKVFDNISETEEFNVYTKIDDLPLNETPAYLEEGINGICTGGSALFNVFGN KRRIVPNDLVVIFPFQLAAVTEISDDFSMTFLKVSKNLFMDTISGVCIPTLDFFFYMRKN FSTPLCDEECQRFIHFCHILIYRINLPRNLFRRESIMQLLRVFYWDIYVAYKRNPKAAEL VKYTRKEKLLFDFFCLVIEFHTVSRDVAFYAQKMCVSAKHLTMVITDMSGRSAKDWIIEY SILEIKALLRDSDLEIKEVASRTNFQSNSVMTRFFREHTGMTPSDYRERIFIQIDGS >gi|225935389|gb|ACGA01000003.1| GENE 4 6149 - 6487 214 112 aa, chain - ## HITS:1 COG:no KEGG:BF1869 NR:ns ## KEGG: BF1869 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 105 13 112 122 66 34.0 4e-10 MIEGAPFPAIEKVYDPSSKKCNGRITPQAPIVITGHHLDMLTWDSANLYLVSSVNDRMLI ECGDIHKYSDDKVYTTIPDIDEGEYFLALMILMKDKESFLYIFPISLIVQFT >gi|225935389|gb|ACGA01000003.1| GENE 5 6566 - 7360 418 264 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882221|ref|ZP_02063224.1| ## NR: gi|160882221|ref|ZP_02063224.1| hypothetical protein BACOVA_00167 [Bacteroides ovatus ATCC 8483] # 1 
264 1 264 264 526 100.0 1e-148 MKRITVIIFVSFFILLKGYSQKIEIYEEAGIKYAAIKSMELPANRVENRSENPDFYKTIQ LFKYTDSGMDTEPIQVNGENVYVKRITRHPYNENNQTIKIVWKVSEYFIISPDNVCSDGN DSDGKNVGEKTMKWATANGYLATANTNSYTTASFAVPKGCAMYRGKDGKDEPGTWRIPTL REGSLIMIFYKELERTKDKGTDFQPFDLSLDDKKGTAYWLATENNTSGSAWSIKFYPMAV KYTSSLISKGSTLYLRCIRDIPLK >gi|225935389|gb|ACGA01000003.1| GENE 6 7394 - 9703 1072 769 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170301|ref|ZP_05756713.1| ## NR: gi|260170301|ref|ZP_05756713.1| hypothetical protein BacD2_00383 [Bacteroides sp. D2] # 1 769 1 769 769 1482 100.0 0 MTIKIKYNVYFRTACIALFSLIFTSCTKDKLEEFDPGNATGKMVMVSLNVSLPPAVEPTS VSGYAVTKPFFRNRTNADSPFTVILGEGQKEYAINSNTRVSDGTTSLYNLWLFQFHENGS INGKPHKLSDVKTAINDMVTIDVPLVVADNQTLYLLVLGPKLDYDMSEVGTLDDLKKWNF KYLTNVEGHTQSLITADNEVPFAGEVSGVTVVNVDDGKRGLVEYNKPVGFVGGIEVRRLM ARITLRYKFEVENYQLQGLKLLNVNNTIRLSNPSKNTAEDTYATLETVDFEGPDSNDFYS ATWYVAQNCQGTIEKILSENQRYYKIVDKTPAGDAPPLGTQIEAWAYPTTASATDEYAIY QMYVGNNNTNNFDVEPNHFYNLRTTINTDLTSAKNDERIRMYTASQYVEFHASQNISVGG GAKFDNTKYNKAGVSYDLDAAYDVRPIVIQTQGRKVEVGIYTDNLCITKVSSNKSWLYLS SSSNYTDAFNNVDEPLATSVKANTILPTQVTFYLYNNEYVYDENGKLVNPGENDKDGKRS LYIKVTTTTEGEGGGAVQAYHIFKMDQRPAVYAGRFGGERDADGNYTMGLVHDRLPVRGP KYLESITTMGTIQYGYDKVETAAYSYGTDNMDYGKEATRNLAENIRNLPWNNEKIAAPQK DASRYVLLYQYQYPASSFSARVCYDKNRDENGDGRIDKEEMKWYFPASNQMIGMYIGSFL NNESFPGSATTEDGNDYVKRWYQGVNSNQKMQTGSSRCVRDIDISLDTD >gi|225935389|gb|ACGA01000003.1| GENE 7 9734 - 12208 1382 824 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882223|ref|ZP_02063226.1| ## NR: gi|160882223|ref|ZP_02063226.1| hypothetical protein BACOVA_00169 [Bacteroides ovatus ATCC 8483] # 1 824 6 829 829 1552 99.0 0 MNITINIQFSANVLLLTIFALFAYSCTKENETTGARIYEGEKIPMSIKMGTRNTTADDDE VIKSVRAIVFNDKNELVYNDVSDASINVDNIYIAPIRAARGYNNIYIICNETPELTEKLA AITLENEIEKVTFSAIGIVAPPPMYGNVSRAYVESRSDGTNATVTINNVTTTELPIEVNR MVSRISFTAIKNITNEDEDFKVTKLNVKVCRMPVATTIGEGQAYTENVWSDDLTISGTGE LDNNGTYTINGNNYTIPDNIDFITIPITYIPEHLLSEPQNASQATYLKIDAQCVLKNGST QVLNCIYLLNIGQEPPKNHNLTRNNHYRIYATITGMGAMGLYAEIVAMEEHDITINWKPI 
DGLVIVSDKAADYDAVADTSRNVNIWNDFSVYSGILKAYHSGTGYKDILFKYGSLIAISS PDANTEFTPPTNATILNDVLWYPGSYAPLNITEWADIPYLETDKIPTDNTIVQVAAGKGD PCKLAGLSESQIKTLGIVDNKQWRMATPAENLILKTAAENESNSQGYPSFHWLFSPHNRY RDTNGISQGDRSNGWYWDNNATVFHFSGEEPQNAEIQENMDRQNAYIIRCVRNEMIKSQL EVGIISSPTYQGTETSGNAYFGIISNIPYWTATLVTEGAYVGSTTEFDDFSFVAGETIHT THGGNTQNIPIYVKRKESTSPRSFRVKVEGIGLEGKTVNKILTVSQSGYDLRAKTGLSSM GNIPQEGGSYTTTINLTPTDVTISSGKLYLQITYGGVIRCKSTEVHTEPNKYSYDVTITI PENNSPSIVELIGNIYLEYENNLTVPLGSPTIKQNGTISSNNEN >gi|225935389|gb|ACGA01000003.1| GENE 8 12245 - 13162 643 305 aa, chain - ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 301 17 313 317 138 30.0 3e-31 MAAGITGCDVFHDDLDQCDLFLEFRYDYNMVNEDWFADQVEEVKVYVFDAEGKYLQTFIE KGNPLKNTDYRMLIPYRLKGCTAVVWAGKTDEFYRLSSMTAGDPIDKLSLKYEPDNDMSN NHLDALWHSGPLQMFSPENISNTETVSLIRNTNDITIGITRGSNPVDLSKYDIQLIAANS IYDYKNNFGDGNKNIIYHPCADEEDNKLSLQTRLHTLRFVKDVAMPFSITEQTSGKAIDI GGETTINLIDYLLKSKPEAMGDQEYLDRRYQWDINIRIGDKEENGYVALSITINNWTYWF QPTDM >gi|225935389|gb|ACGA01000003.1| GENE 9 13217 - 14803 1382 528 aa, chain - ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 523 1 562 566 129 25.0 2e-28 MKTKSFLLSTLVTFMFAGCSSDDAREGNIPSGELNGKAYLSLSLQSHTSTTRAENVVEKP GSSGESKAGTVRVLLFDEEDVCLDVVNFDDLTVGNSGGESGGTGTPEAVASDAKLVPEKT KKVFVVINPYTDGNKGWNLAAETVKGKPWSAINTAIEAVIANVATTNQFMMTSAGKGTGI EGALTNVTVHKPSGYTNEAINQAKTDAKNNPAEISVDRLSAKVELVVKESFTTKPDGAKF TFGGWELSVTNKSVKLYSELITYDNATTGAVYRRDKNYLSDEQPDVSNASTMETNMDAAF NYLKNIDSESEEMPAVAQSKGTSLYCLENTMEAKAQQLGFTTKVVVKATYTPNGLTENSS YFSWKGNYYTLDQLKDEYNNTASGGLKTDLPIFLKKAGVVAEGVSDIDKAIAELTADKFT AKTGIIGRFCAVRYYHESVCYYDVLIRHDQNVTTKMALGRYGVVRNNWYHLELQSVSGPG TPWIPDPSDPDPTNPTPPGTDDDEADAYISVKITINPWTYWTQGVDLH >gi|225935389|gb|ACGA01000003.1| GENE 10 14859 - 16400 1008 513 aa, chain - ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # 
Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 132 512 14 388 392 342 47.0 2e-92 MINLKSILWLLPLSLLVSPLLYGENLPDKWYKGTIQATAIELQQVGDSLHIQVLYDLDKI KVSSNHSIELIPVLIAANHQLELPEISVKGHSNFQNLKRKLALMNTQERDLYQSEAPYYV VKGYGITEEKQIRYSLVIPFESWMKDARLDIKKEVMGCCKPGKLLSTFPLFGTVTLEQPP VPYHIVPHISYVRPQVEPVKKREMSCEAFLDFVVSKTIIRPDYMNNPVELEKISSMLAEV RNDTTITIRGISVIGYASPEGSVLFNKQLSEGRAKALVNYLLPRFPFSKELYKVEYGGEN WEGLRKMVAESDMAEKDGILHIIDHIPVEINYRTNTSRKKSLMLYKQGNPYRFMLREYYQ HLRKAICKIEYDVQNFNIEQAKVLIHSRPQNLSLNEIYLVALTYKNGSPEFIELFEKAVS LFPDDKIANLNAASAALSREDIALAEKYLKKAEISAPEYENAVGVLYLLKSDYKQAKLHL IKAAESGLEQAYFNLEELNKKEEDIRSMTKLDY >gi|225935389|gb|ACGA01000003.1| GENE 11 16408 - 16995 285 195 aa, chain - ## HITS:1 COG:no KEGG:PG2130 NR:ns ## KEGG: PG2130 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 6 195 8 193 193 166 47.0 6e-40 MKIHDIAILFFIFWGSISQCAAQKIAVKTNLLYGTYASSPNLETEFGISPRGTVDLGAGF NWFTSVHSSSNKKLVHWLGSVEYRYWTCERFSGHFGGIHILGGQYNIAGHHLPLLFGDDS GHYRYEGWGIGGGISYGYHFLLGNRWSLEANIGIGYVRLHYDKFRCETCGEKTGTENRNY FGPTKAAVSLIFLIK >gi|225935389|gb|ACGA01000003.1| GENE 12 17495 - 17617 56 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKFLKFVIKKYVYTALWTVNSVWFIENIPMVIGTEFFFL Prediction of potential genes in microbial genomes Time: Fri May 13 06:22:37 2011 Seq name: gi|225935388|gb|ACGA01000004.1| Bacteroides sp. D2 cont1.4, whole genome shotgun sequence Length of sequence - 25023 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 12, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 266 207 ## BT_0759 hypothetical protein + Term 332 - 371 -0.0 + Prom 268 - 327 4.2 2 2 Op 1 . + CDS 409 - 690 283 ## CLD_0105 hypothetical protein 3 2 Op 2 . + CDS 684 - 1412 416 ## BF3041 hypothetical protein + Term 1647 - 1692 1.1 + Prom 1567 - 1626 4.3 4 3 Op 1 . + CDS 1694 - 2905 604 ## Slin_2713 hypothetical protein 5 3 Op 2 . 
+ CDS 2895 - 3413 92 ## gi|260170311|ref|ZP_05756723.1| hypothetical protein BacD2_00433 + Term 3489 - 3526 -0.5 - Term 3317 - 3364 -0.9 6 4 Op 1 . - CDS 3406 - 4695 682 ## BT_0987 putative cytochrome c-type biogenesis protein 7 4 Op 2 . - CDS 4565 - 4921 108 ## BT_0987 putative cytochrome c-type biogenesis protein - Prom 4985 - 5044 9.5 - Term 4951 - 5010 5.6 8 5 Tu 1 1/0.500 - CDS 5084 - 7735 1790 ## COG0474 Cation transport ATPase - Prom 7767 - 7826 2.0 9 6 Op 1 40/0.000 - CDS 7856 - 9223 961 ## COG0642 Signal transduction histidine kinase 10 6 Op 2 . - CDS 9220 - 9906 520 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 9959 - 10018 6.3 - Term 10002 - 10037 1.0 11 7 Tu 1 . - CDS 10052 - 12070 1768 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 12108 - 12167 1.6 12 8 Op 1 . - CDS 12190 - 12900 630 ## COG3250 Beta-galactosidase/beta-glucuronidase 13 8 Op 2 . - CDS 12937 - 16299 2434 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 16167 - 16226 5.5 14 9 Tu 1 . + CDS 16273 - 17844 853 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 17851 - 17908 4.4 + Prom 17917 - 17976 7.3 15 10 Tu 1 . + CDS 18086 - 19585 937 ## COG0642 Signal transduction histidine kinase 16 11 Tu 1 . - CDS 19582 - 20427 370 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 20498 - 20557 8.4 - Term 20546 - 20601 11.2 17 12 Tu 1 . 
- CDS 20627 - 24826 3490 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 24867 - 24926 3.3 Predicted protein(s) >gi|225935388|gb|ACGA01000004.1| GENE 1 3 - 266 207 87 aa, chain + ## HITS:1 COG:no KEGG:BT_0759 NR:ns ## KEGG: BT_0759 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 83 1 57 59 70 64.0 2e-11 NGKNPEVYNYIGNDSALIIEKEIETEMKAELYSFLLDNKFNKGVMFKKSIEQFVEHYEMV GLVQEETLMRAFQRWRKLVKEEKAIKL >gi|225935388|gb|ACGA01000004.1| GENE 2 409 - 690 283 93 aa, chain + ## HITS:1 COG:no KEGG:CLD_0105 NR:ns ## KEGG: CLD_0105 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 3 69 1 67 71 65 47.0 6e-10 MTLKEKQLEFIIYCIENTAERLGRYSADVYNKLKELGAIDGYINAFYDTLHTQGKAYIVD SLLEYIYHRDPQWLPEDYRPFQVSTQQKGDKSC >gi|225935388|gb|ACGA01000004.1| GENE 3 684 - 1412 416 242 aa, chain + ## HITS:1 COG:no KEGG:BF3041 NR:ns ## KEGG: BF3041 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 163 1 161 171 165 51.0 1e-39 MLTVYHGSTYRVEQPLAGVCRPNLDFGVGFYLTDLKEQAVRWALRTADIRHENSVWLNIY SLDIDACRNFSFNYLHFTTYDAHWLDFVVACRQGNVIWQDYDIIEGGIADDRVIRTIDLY MRGDYTREEALSRLIHQEPNNQICITNQKVIDEHLHFVDVILLPFPSLSKEIPNADIVMQ GKYYSIVELLATRLHISSLQALDIFYNSESYQRIVHRLGDLYLMSDAYIVDELMRELQKR QG >gi|225935388|gb|ACGA01000004.1| GENE 4 1694 - 2905 604 403 aa, chain + ## HITS:1 COG:no KEGG:Slin_2713 NR:ns ## KEGG: Slin_2713 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 401 1 397 401 150 30.0 9e-35 MAHLIVTDFGAIKSANIEIKKYNFFIGHTSSGKSTIAKLLAIFNNSIFWTIKEGDFNSFL RLLDKYNINFEFTSTTIIRYSNEKYYWEIGLNKFHSNYEDADLMEMANTSKSYDFILKFI ERKENNFAYKEFIKSLKNLLDLKDSAMVELIKPALVGLLYEECIPVYIPAERLLISTFSN SIFSLLQAGASIPDCIKDFGSLYEKARIQYKNIDINILDIKVSFNNNGDTVYLMNENKEV KLSQTSSGIQSIIPLWIVFNQYVESKKKQILVIEEPELNLFPSTQHFLIDWIMRKMRKSN GSIVITTHSPYVLSVVDNLILAQEILKKSNKKKLVLSKIKELIPSMALIDFDDVSSYFFH SDGTVKDIRDTDIKSLGAEYIDTASDKLGYIFDELCNIERNEL 
>gi|225935388|gb|ACGA01000004.1| GENE 5 2895 - 3413 92 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170311|ref|ZP_05756723.1| ## NR: gi|260170311|ref|ZP_05756723.1| hypothetical protein BacD2_00433 [Bacteroides sp. D2] # 1 172 1 172 172 304 100.0 1e-81 MSCKCFNRRFVFKNTDSFDHKYVNSKCKCTSRFTIYENKSKFTIASKDISEVDKIKIDGY FDSSSEHRKCDYLFVYTSPLSCVYIFVELKGTDIAHAITQIGNTVNLFYNQGYLKDKKVI GAIVSSRHPSNDGTYRKAKQTLEKSLSSKIKSFRIEKKNKEMTYDPINDKVI >gi|225935388|gb|ACGA01000004.1| GENE 6 3406 - 4695 682 429 aa, chain - ## HITS:1 COG:no KEGG:BT_0987 NR:ns ## KEGG: BT_0987 # Name: not_defined # Def: putative cytochrome c-type biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 429 76 504 504 706 81.0 0 MENQYRTPIQEDGSFSFRFPVYAKLREVSIRNYAEHLYIHPGDSIHVEIDFKDLFHPKVT GDAEKLNQEILAFTESAYYYIQNYSINPNLNIKDFETELKKEYDFRLERRNEYLTKYKPM GDVTLFTEELLKQDYYYTLLSYGNQCQFKTRKEMDRYHKLLSAINKLYNKGILSARLYDI ADEVECYIAYGITYKDKKNPSVEEIMSAVGENELNQYIYTKMAVGSLNANDTLALTTRHT QFDSIVKMPHLRAQVMQIYNQTKSYLENPQPVSNNLLYGEFHENSKLKTSMPYMEPVYNI LEKNHGKVIYFDFWARWCPPCLAEMEPLKQLRSKYSTKDLVIYSICVSEPKEEWEECLNE YSLKNRGIECIYASDYFGKDNLQKIRKQWKIDRMPYYLLINRKGQIVDFGTTARPSNPQF VSRIDDALK >gi|225935388|gb|ACGA01000004.1| GENE 7 4565 - 4921 108 118 aa, chain - ## HITS:1 COG:no KEGG:BT_0987 NR:ns ## KEGG: BT_0987 # Name: not_defined # Def: putative cytochrome c-type biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 504 100 63.0 2e-20 MKHYYSFLIITSLLLAGCGQKDRAKSQFESSVQTEESYSLAKEYIKEAVITGKVLNRDFY PQEKELTLIIPFFLENGKSISYPHTRRWFIFIPLSGLCKIKRSFHSQLCGTFVYPPGR >gi|225935388|gb|ACGA01000004.1| GENE 8 5084 - 7735 1790 883 aa, chain - ## HITS:1 COG:PA4825 KEGG:ns NR:ns ## COG: PA4825 COG0474 # Protein_GI_number: 15600018 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pseudomonas aeruginosa # 40 883 45 903 903 964 54.0 0 MVWKKKLRSPQYTFNSEKVFLVATQPGKSIYSYLQTTKLGLTQGEVQERQSIYGRNEVVH EQKKNPFILFIKTFINPFIGVLTGLAIISLFLDVLMADPGEQEWTGVIIISSMVLFSAIL 
RFWQEWRASEATDSLMKMVKNTCLAKRAGEQEEEIEITELVPGDIVYLAAGDMVPADIRI IDSKDLFISQASLTGESEPIEKFPEVRGQQFRKGSVIELDNICYMGSNVISGAAKGIVFE TGNKTYLGTIAKSLVGHRATTAFDKGISKVSFLLIRFMLVMVPFVFFVNGFTKGDWFEAF IFAISVAVGLTPEMLPMIVTANLSKGAIAMSKKKTIVKNLNAIQNFGAMDILCTDKTGTL TCDKIVLEKYINADGSDDNSKRILRHAYFNSYFQTGLRNLMDKAILSHVRDLSLEHLKDD YTKVDEIPFDFTRRRMSVVIEDRQGKRQIITKGAVEEVLDVCSYAEFNGQIHPLADALKI KAQMISEEMNQQGMRVLAVSQKSFIEKDCNFAIEDEKEMVLIGYLAFLDPPKPSAAEAIE QLYAHGVAVKILSGDNDVVVKAIARQVGIDTSHFLTGIEIENMDETALKEAVKDTTLFSK LTPLQKTQIISLLQEQGNTVGFLGDGINDAGALRQSDIGISVDSAVDIAKESADIILLEK DLMVLEDGVLEGRKTFGNINKYIKMTASSNFGNMFSVMFASAFLPFLPMMPIHLLIQNLL YDISQTTIPFDRMDPEFLKKPRKWDASDLSRFMIYVGPISSIFDIITYLVMWYVFSCNSP EHQTLFQTGWFVEGLLSQTLIVHMIRTRKIPFIQSRATWPVMGLTFLIMAIGILVPFTAF GRSIGLTALPLSYFPWLVGILLSYCILTQIVKNWYIRKFVRWL >gi|225935388|gb|ACGA01000004.1| GENE 9 7856 - 9223 961 455 aa, chain - ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 173 453 184 463 466 151 32.0 3e-36 MKIRTTLTLQYAGLTAAVFFVFVMAVYYVSEHSRSNAFFRNLQSEAITKAHLFLKNQVDA KTMQSIYLNNQKFIDEVEVAVYTTDFKILYHDALQNDIVKETPEMIKRILQRKNINFYVD EYQAIGLVYPFEGKDYVVTAAAYDGYGYANRDALRNMLILLFIGGLSVLVVVGYILSRST LKPIRNIVKEAEKITASHIDKRLPVKNEQDELGELSTTFNALLERLEKSFNSQKMFVSNV SHELRTPMAALTAELDLALLKERSSEQYQMAIGNALQDSHRIVNLIDGLLNLAKADYQSE QITMEEVRLDELLLDARELVLKAHPDYHIELVFEQEAEEDNVLTVIGNSYLLTTAFVNLI ENNCKYSSNRTSSVLIAYWEQWAIIRLSDTGVGMSDTDKENLFTLFYRGENKNIAPGNGI GMALTQKIIHLHKGELTVSSHKDEGTTFVVKLPHI >gi|225935388|gb|ACGA01000004.1| GENE 10 9220 - 9906 520 228 aa, chain - ## HITS:1 COG:ECs0609 KEGG:ns NR:ns ## COG: ECs0609 COG0745 # Protein_GI_number: 15829863 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 4 224 3 221 227 180 43.0 2e-45 
MYTILIIEDEPRVASLLMNGLEENGYQTMVAYDGLMGLRLFQTHTFDLVISDIVLPKMDG FELAKEIRKTNPNIPILMLTALGSTNDKLDGFDAGADDYMVKPFDFRELNARIKVLLKRV AGNAQELPQELVYADLRIDLQRKDVERNSISIKLSPKEYNLLLYMVENAERVLSRVEIAE KVWNTHFDTGTNFIDVYINYLRKKIDRDFEPKLIHTKAGMGFILTDKL >gi|225935388|gb|ACGA01000004.1| GENE 11 10052 - 12070 1768 672 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 14 307 310 629 1087 150 32.0 1e-35 MDKDGKVLHSETENFGFRTIEVRESDGLYINGVRINVRGVNRHSFRPESGRTLSKAKNIE DVLLMKSMNMNSVRLSHYPADPEFLEACDSLGLYMMDELGGWHGKYDTPTGVRLIEGMIE RDVNHPSIIWWSNGNEKGWNTELDGEFHKYDPQKRPVIHPQGNFSGFETMHYRSYGESQN YMRLPEIFMPTEFLHGLYDGGHGAGLYDYWEMMRKHPRCIGGFLWVLADEGVKRVDMDEF IDNQGNFGADGIVGPHHEKEGSYYTIKQLWSPVQIMNTSIDKQFDGKFSVENRYDYLNLN TCRFLWKQVKFPLATDASNAAIQVLKEGEVQGSDVVAHSAGILDIKTNILSNADALFLTA IDPYGHELWRWTFPVNKLNQQTEQLSPLSSRPTYTETENELTVKANKRTFIFSKKDGQLK GVSVDNRKINFANGPRFIGARRADRSLDQFYNHDDEKAKEKDRTYSEFPDAAVFTKLDVK EDGGNLVVTANYKLGNLDKAQWTINPSGELALDYTYNFSGVVDLMGIRFDYPEDQVISKR WLGAGPYRVWQNRIHGTQYDVWENDYNDPIPGETFTYPEFKGYFGDVSWMNIQTKEGTIS LTNEAPDAYIGVYQPRDGRDRLLYTLPESGISVLNVIPPVRNKVNSTDLCGPSSQPKWVN GPQTGRVIFRFM >gi|225935388|gb|ACGA01000004.1| GENE 12 12190 - 12900 630 236 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 42 211 63 229 1087 79 31.0 6e-15 MKKVTTLLSTLALATTLAAQNLPQTERQYLSGHGCDDMVEWDFFCTDGRNSGKWTKIGVP SCWELQGFGTYQYGITFYGKPFPEGVANEKGMYKYEFEVPEKFRGKQVNLVFEASMTDTE VKVNGRKVGSKHQGAFYRFSYNITDFLKYGKKNLLEVTVAKESENASVNLAERRADYWNF GGIFRPVFLEVKPAVNLRHIAIDAKMDGTFRANCYTNISNDGMSIRTQILDKKERS >gi|225935388|gb|ACGA01000004.1| GENE 13 12937 - 16299 2434 1120 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # 
Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 64 1091 6 957 1087 612 35.0 1e-174 MPQGKAFLHRVLLLLLFFNISVKERGTKYPFVRLFDMKKKERMKIYTLLLGVLFVSPIQA QTVHDWENHHVLQINREPARAAFTPFSVQKGDGSISLDGTWKFRWTPVPNERVMNFYQIN FDDKDWTDFPVPANWEVNGYGTPIYVSAGYPFKIDPPRVMGEPKTDYTTYKERNPVGQYR RTFVLPAGWEADGQTFLRFEGVMSAFYVWINGERVGYSQGSMEPSEFNVTKYLKSEENQI SLEVYRYSDGSYLEDQDFWRFGGIHRSIHLIHTPDIHVRDYAVRTLPASVGNYKDFILQI DPQFSVYRGMTGKGYVLQGVLKDASGKEVATLKGDVEDILDLEHKASRMNEWYPQRGPRK MGRLSAIIKSPERWTAETPYLYKLHLTLQNEEGKVVEQIEQAVGFRSVEIKKGQLLVNGN PVRFRGVNRHEHDPRTARVMSEERMLQDILLMKQANINAVRTSHYPNVSRWYELCDSLGL YVIDEADIEEHGLRGTLASTPDWHAAFMDRAVRMAERDKNYPCIVMWSMGNESGYGPNFA AISAWLHDFDPTRPVHYEGAQGVDGNPDPKTVDVISRFYTRVKQEYLNPGIAEGEDKERA ENARWERLLEIAERTNDDRPVMTSEYAHSMGNALGNFKEYWDEIYSNPRMLGGFIWDWVD QGIYKTLPDGRTMVAYGGDFGDKPNLKAFCFNGLLMSDRETTPKYWEVKKVYAPVELKME NGKLKVTNRNHHIDLSSYRCLWTLSVDGKQKEQGEITLPEIAPGESGTIDLPTFRSLNPL SDYQLKVSIVLKSDALWAKAGHEVAWEQFCLQKGDLASADLINKGTLQVEEDDKSLLISG RGFSVQWEKKVNGSMTSLIYKGKEMLAHSDDFPVQPVTQVFRAPTDNDKSFGNWLAKDWK LHGMDHPQINLESFHHEKRADGAVIVRIQTSNLYKKGKVVTTSVYTVFSDGTIDLKTTFL PQGVLPEIPRLGIAFCLAPAYDTFTWYGRGPQDNYPDRKTSAMIGLWKGSVADQYVHYPR PQDSGNKEEVHYLTLTDKQNKGIRVDAVENAFSASALHYTVQDIYEETHDCNLKPRAEII LSMDAAVLGLGNSSCGPGVLKKYAIEKKEHTLHIRISSKQ >gi|225935388|gb|ACGA01000004.1| GENE 14 16273 - 17844 853 523 aa, chain + ## HITS:1 COG:SP0659 KEGG:ns NR:ns ## COG: SP0659 COG0526 # Protein_GI_number: 15900560 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pneumoniae TIGR4 # 231 379 33 174 188 70 29.0 8e-12 MKKSFTLRHIITSNIMMMKKLLFSSLFLLGSLVSQAQHEYTIEGEVKGVKDGTHVSLFLT DGRVGSIVGTDTIRNGTFFFKRNAGESGMDQLSLMCRDTDFPPMSLDIYATPGAKIKVTG TNPLIYTWRVDSPVKEQQEHNRFIEDSRDLWDEFQRLAIKERSMRSASEAERKALRTKSD SISSIINQRELKLMKELPISNVWMEKLLRLSMSLKYNSRFTNKEEILALYDRLNEEQKAS 
IEGQEVRVNLFPPKTVKEGDDMADTDLFDLDGKMHHLADFKGKYMLLDFWSSGCGPCIMA LPEMKEIQEQYKDRLTIISLSSDTQNRWKAASAQHEMTWQNLSDLKQTAGLYAVYDVNGI PNYVLISPEGKIVKMWSGYGKGSLKLKMRRYLDAPKREMSITQNANRKVVNHPSFESTNT DIIEVKQVELTDTATIVHFYAYYIPKYWIQVSANAQLVDEQGASYTLQKADGITPGEHFF LPESGEAEFSLTFKPLPLKTKSFNFTEGTAKKDWQINGIKLTK >gi|225935388|gb|ACGA01000004.1| GENE 15 18086 - 19585 937 499 aa, chain + ## HITS:1 COG:MA2348_2 KEGG:ns NR:ns ## COG: MA2348_2 COG0642 # Protein_GI_number: 20091183 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 189 494 114 422 427 148 34.0 3e-35 MEDYKQTLYNDELLNILPDGVIIFDTNGEVIQLNQQAFAELHVHPSVNDMLPFPTNRLFK LLNKKEDILSTILEKIRQGENTYSLPEHTFMQEQVDYTQFPIRGEFATIRDRTNLNKILF FFRNITVELTQEYILNTALQKTKIYPWYYDISRSEFTLDDRYFEHLGIPAGENNTLTMEE YVNMIHPDDRQPMADAFVVQLSGNTTFDKTVPFRLRRGDGTWEWFEGQSTYIANISGHPY RLVGICLSIQEYKDIENTLIEARKKAEESDRLKMAFLANMSHEIRTPLNAIVGFSDVIGS TYDELSEEERADFVRLISINSEHLVRLIDDILDLSKIESNTIKFTFSNCSLNSLMMDIEK EQAMKPISGIEIKSLLPDEDVYINTDITRLKQVICNFINNARKFTQKGYIYFGYTLDNRN ANSVQIFVEDTGSGIPQECLNEIFDRFYKVDTFKQGTGLGLSICKTIVEHLQGDIFVESK IGEGSCFTVTLPFERKVEV >gi|225935388|gb|ACGA01000004.1| GENE 16 19582 - 20427 370 281 aa, chain - ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 137 281 150 293 307 74 29.0 3e-13 MESDMPKYDLPVDYIVGEGFAKDLLKSYKNFPCKIESGLFILCIKGTMQVTINSNAHHIS ENDLITLPPNCILEIHTFSSDIQIYYAGFSSHFIESINLMKATQHLLPVIMENSIVALSP LQACSYKMFYESSILSYASYRTRENKEIVKAVLTMFIQGATEIYKLQNNWYLSSQSRKYE IYQEFLQLAMKHYTVHHGASFYANELGLSLPHFCSTIKKAAGNTPLEVIASIILMEAKSR LKSADEPVKNIALSLGFNNISFFNKFFKQHTGITPQEYRGR >gi|225935388|gb|ACGA01000004.1| GENE 17 20627 - 24826 3490 1399 aa, chain - ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 371 940 4 557 563 120 25.0 2e-26 MKKLLLAVFSIATTFSLYAQREVPQERMEQIYEEVKTPYKYGLAVAPADNYHKIDCPTVF RQGDKWLMTYVVYNGKGGTDGRGYETWIAESDNLLEWRTLGRVLSYRDGKWDCNQRGGFP ALPDMEWGGSYELQTYKGRHWMTYIGGEGTGYEAVKAPLYVGLAWTKGDISTAHEWESLD KPILSIHDKDAQWWEKLTQYKSTVYWDKDKTLGAPFVMYYNAGGRHPETDLKGERVGIAL SKDMKTWKRYSGNPVFAHEADGTITGDAHIQKMGDVYVMFYFSAFEPSRKYKAFNTFAAS YDLVNWTDWKGADLIIPSKNYDELFAHKSYVVKHDGVVYHFYCAVNNAEQRGIAIATSKP MGRSAVRFPVPESKNRRQIMTLNEGWKTWITEATHLKGNFMMPAKTVNIPHNWDDYYGYR QLTHGNLHGTAMYVKDFTADVKSGKRYFLRFDGVGTYATITVNGKDFGRHPIGRTTLTLD VTDELKQGVNRLEVKAEHPEMIADMPWVCGGCSSEWGFSEGSQPLGIFRPVVLEVTDEIR IEPFGVHIWNDEKAANVFVETEIKNYSKATETVELVNKLSNADGKQVFRLVEKVTLAPGE MKVIRQQAPVENPVLWNTENPYLYKLASMIKRDTKTTDEISTPFGIRTISWPVKRNDGDG RFYLNGKPVFINGVCEYEHQFGQSHAFSNEQVAARVKQIRAAGFNAFRDAHQPHHLDYQK YWDEEGILFWTQFSAHVWYDTPEFRENFKKLLRQWVKERRNSPSVVMWGLQNESTLPREF AQECSDLIREMDPTAKTMRVITTCNGGEGTDWNVIQNWSGTYGGDVTKYGRELSQANQLL NGEYGAWRSIDLHTEPGDFQVNGVWSEDRMCQLMETKIRLAEQAKDSVCGQFQWIYSSHD NPGRRQPDEAYRKIDKVGPFNYKGLVTPWEEPLDVYYMYRANYVPAAKDPMVYLVSHTWA NRFEKGRRRATIEAYSNCDSVLLYNDLTNEKATFLGRKKNNGTGTHFMWENRDIRYNVLR AVGYYKGKPVAEDLILLNGLEQAPNFKLLYQDDKKILKGEAGYNYLYRLNCGGDDYTDSF GQLWLQDNTNYSRSWAENFKDLNPYLASQRTTNDPIRGTRDWTLFQHFRFGRHQLEYRFP VADGTYRIELYFTEPWHGTGGSASTDCEGLRIFDVAVNDSVVLDDLDIWAESGHDGVCKK VVYATVKGGMLKIHFPEVKAGQALISGIAIASTDQELKPTVFPASGWSWEKADKEVMEKT PKELLPEDKNARVSISYEAETAVLKGKYQKKEHRKQTGVFFGKGKGNSIEWNVSTGLAQV YALRFKYMNTTGKPIPVLMKFIDSKGVVLKEDVLNFPETPDKWKMMSTTTGTFINAGHYK VLLSAENMDGIAFDALDIQ Prediction of potential genes in microbial genomes Time: Fri May 13 06:23:31 2011 Seq name: gi|225935387|gb|ACGA01000005.1| Bacteroides sp. D2 cont1.5, whole genome shotgun sequence Length of sequence - 16560 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 51 - 2750 2451 ## BT_0997 hypothetical protein - Prom 2930 - 2989 7.8 + Prom 2887 - 2946 5.8 2 2 Op 1 . + CDS 2967 - 4181 823 ## BT_0998 two-component system sensor histidine kinase 3 2 Op 2 . + CDS 4147 - 5400 1005 ## COG0642 Signal transduction histidine kinase + Term 5418 - 5481 5.2 4 3 Op 1 . - CDS 5693 - 6571 545 ## COG3568 Metal-dependent hydrolase 5 3 Op 2 . - CDS 6595 - 8757 1489 ## GYMC10_6263 metallophosphoesterase 6 3 Op 3 . - CDS 8774 - 9667 770 ## COG3568 Metal-dependent hydrolase 7 3 Op 4 . - CDS 9697 - 11442 1508 ## BT_0030 hypothetical protein 8 3 Op 5 . - CDS 11469 - 14969 2802 ## Dfer_2402 TonB-dependent receptor plug - Prom 15098 - 15157 6.5 - Term 15106 - 15137 2.5 9 4 Op 1 . - CDS 15185 - 16177 863 ## COG3712 Fe2+-dicitrate sensor, membrane component 10 4 Op 2 . - CDS 16185 - 16556 313 ## BDI_2325 RNA polymerase ECF-type sigma factor Predicted protein(s) >gi|225935387|gb|ACGA01000005.1| GENE 1 51 - 2750 2451 899 aa, chain - ## HITS:1 COG:no KEGG:BT_0997 NR:ns ## KEGG: BT_0997 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 899 1 873 873 1731 95.0 0 MKNTRLFMFAACTLFLAACGRQTVKIMTPPDASNRVLFGAEQLQTTLDKAGYQVMMQQGD TTFSDPEIKTILLTEVNDTTLKKEGFHISTTGNLTRVSGRDGSGVIYGCRELIDRVNDSD GKLNFPEELKDGPEMVLRGACVGLQKMTYLPGHGVYEYPYTPESFPWFYDKEQWIKYLDM LVANRMNSLYLWNGHPFASLVKLEDYPFALEVDEETFKMNEEMFSFLTEEADKRGIFVIQ MFYNIILSKPFAEHYGLKTQDRNRPITPLIADYTRKSIAAFIEKYPNVGLLVCLGEAMCT VEDDVEWFTKTIIPGVKDGLQALGRTDEPPLLLRAHDTDCKLVMDAALPLYKNLYTMHKY NGESLTTYEPRGPWSKIHTDLSSLGSIHISNVHILANLEPFRWGSPDFVQKAVTAMHNVH GANALHLYPQASYWDWPYTADKLPNNEREFQLDRDWIWYQTWGRYAWNCHRDRTDEMGYW DHQLGKFYGTSDENASNIRVAYEESGEIAPKLLRRFGITEGNRQTLLLGMFMSQLVNPYK YTIYPGFYESCGPEGEKLIEYVEKEWKKQPHVGEMPLDIVAQVIEHGDKAIAAIDKAAGS VSSNKDEFARLQNDMHCYREFAYAFNLKVKAAKLVLDYQWGKEIKNLEEAIPLMEQSLEH YRKLVELTDEHYLYANSMQTAQRRIPIGGDDGKNKTWKELLVHYEKELENFKANLALLKE KQNGNAVTETVEIAAWTPANVKLISNYPTVKVDEGISLFMDVPGKIEAVAPELKGMKALR 
FNGNEQREKGTSITFETDAPVKLLVAYFKDDQKKYAKAPKLEIDASANDYGQAEPVLTNA VRINGMPLANVHAYSFPAGKHTLMLPKGYLQVLGFTAAETKVRNAGLAGDEETMDWLFY >gi|225935387|gb|ACGA01000005.1| GENE 2 2967 - 4181 823 404 aa, chain + ## HITS:1 COG:no KEGG:BT_0998 NR:ns ## KEGG: BT_0998 # Name: not_defined # Def: two-component system sensor histidine kinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 402 1 403 810 596 73.0 1e-169 MKEQTNYDYEKYVQIAQMAKMGWWESDLKNKEYICSDFIVDLLGLESNRISFTEFHQRIR EDHRLRLKNEYMSLSYLETYEQMFPIRAKDGEIWVYSKINFQKPDKEGYRNMAGLLQYID RPIDTTNENIDFFQVSNLLYQQNNISYSLLAFLQCDDVTQVVNKTLGDLLNQFLGDRIYI FEINRKEQRQDCTYEVTAEGISKEQEFLSNIPWDPSTWWNHQIAERRAIILNTLDDMPEE AAEYRQTLEVQDIKSLMVVPLISKEEVWGYMGIDMVRTQRSWSNVDYQCFSSLANIISIC IELRKSELQAKEDRLALDNSEKILRNIYKNLPAGVELYDKDGYLVDINDKELEIFGLSDK NEALGVNLFDNPNIPLEVKERLRAKEDVNFSINYDFYKNKSICR >gi|225935387|gb|ACGA01000005.1| GENE 3 4147 - 5400 1005 417 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 125 410 17 307 328 182 37.0 1e-45 MTSTKINQYVDSRRNGIINLTTKVTALYDSQNRFINYLFINIDTTETTNAYTKIQEFENL FLLIGDYAKVGFAHFNVLTRDGYAQDTWYRNLGEKEGTPMPQVIGVYAHVVPEDQAVLKN FVGEVKTGKATSLRKEVRVCRENGKYTWTSINVMVRDYRPQDGIIEMLCINYDITPLKET EQKLIIARDKAEELDRLKSAFLANMSHEIRTPLNAIVGFSSLLAETDSRSERQEYIKIVQ ENNELLLQLISDILDLSKIEAGTFNFVYTNVDVNETCSEIIKSMGMKVGKGVELILEEPF PECYIYTDKNRFTQVISNFINNALKFTQQGSITLGYEQVSHQKIKFYVRDTGMGIPEEKQ KSVFERFVKLNTFVQGTGLGLSICKSIVSQMGGEIGVESTEGVGSCFWFTHPYHAAD >gi|225935387|gb|ACGA01000005.1| GENE 4 5693 - 6571 545 292 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 37 285 1 250 257 166 37.0 5e-41 MKCFINKKYYFLFLILLFAGELPAKSLNLDKTKPEDIIRLASYNIRTKGDKGDKAWEVRL 
NALVDVVRRNKFDMFAIQEGRTSQLKDMMILNEFSYIGRDRDGDNKGEHCAIFYKKDRFK VLKHGDFWYSETPDIPSYGWGARCRRICTWGYFKDLRTGKKFYVFNSHTDHEATEARRQS SFMLLEQVRKIAKGKPTFCTGDFNATPDEEPIQLLLKDSLLLDSYECTLTPPKGPSGSFY AYDKTRNIAKRIDFIFVTPKIKVLSYHTIDDDIKYNKYSSDHFPVMVEVLPK >gi|225935387|gb|ACGA01000005.1| GENE 5 6595 - 8757 1489 720 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_6263 NR:ns ## KEGG: GYMC10_6263 # Name: not_defined # Def: metallophosphoesterase # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 65 706 983 1559 2013 143 28.0 2e-32 MKNKFYHISTYSVMRYWTILWMAILSFSCSDFNPMDSYSRIPPDRNTDIDDGDEGDGAGG LFEKGYGTVNKPYLVMDVIQIQNMSEALVKGKMIYFQLGADIDMKSISNWDPLNPTGDYY IYFDGNNHIIKNFTCTDKAYASFFGILAGTCKNVGFYNAHVEAATNSGAGVIGGYIGVKA PNAVEKTGQVENCYVSGKVKGKYAGGIASRMGRPYGGQICYIKNCYSTAEVISTGDECGG IVGSMYENSEVSYCYSTGVLIGANSVGGIAALPSEGAKITSCVAWNWKMTGPAARSGRIC GVLSQGENGHQADPVASECYAWEDMVCTGFSPEDNVGSVSTGKYDGVGESVLTLQNSIAN WGTPWHNVGNIDMGFPILEWQLDRGDYASYGGHDNEPEGDFANGDGTQNNPYVIANAIHI QNMSKALIEKQTTYFVLSADIDMQGIKWTPLNDANGYHKWIDFDGRNHVVKNLTCESGTY RSFFGVLCGECRNVGFVDANISSSSTGIGIIAGYVGLAAGAENYTGKITNCYTSGILKGS GAAGGIGGVLGGSGYIKNCYSSATVIDQIANNTGKAGGIIGRVNGNASGSSIENCYSSGD ISAIGGGNVGSIVGKIDKGKLTVKNCVAWNTMLTSTDKTKIGRIVGGTANTAYENCYAYD GMILKAGETIFTVSDETSPSGSSFQGVAKSTNELKSTVISWDSSLWKEGNNGYPVFKWSK >gi|225935387|gb|ACGA01000005.1| GENE 6 8774 - 9667 770 297 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 55 296 4 255 257 109 31.0 6e-24 MKKIVIYTLISFIFSVFASSCEDNKDNSLYYPDFTWDTGDGEEDEDPVTETSMRVATYNL QVETGTGWTNRRERVAQLIKDYDFEICGFEEASWEQRSYLGTQLASDYQILAYGRDTGND DNKAGEMSGILYKKSRYTLLDAGRFWFSETPDIPSNGWDETNFKRFCVWGKFKDSKTQKE FYLFETHMPLADNARKHACQMLVDAVSDKAKDNTPAFCTGDFNATPDAPEIATTICQSGI LKDAYREAAVQHGALFTFPSKKTRIDFIFVKHATVLSTRTIVSSLSDHYPMVIVVEI >gi|225935387|gb|ACGA01000005.1| GENE 7 9697 - 11442 1508 581 aa, chain - ## HITS:1 COG:no KEGG:BT_0030 NR:ns ## KEGG: BT_0030 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 605 609 477 45.0 1e-133 MKIMKYTLWATVGLSLLFASCESILDINPKDRLTTKDYFTNEEQLRLYSNQFYSNNFPGD GDIYKDNADVLIVSPLDDEVSGQRVIPETGGGWSWSALRSINFLLDNLGNCKDQKVRDKY EALARFFRAYFYFEKVKRFGDVPWYDKVLGSDDADLYKARDSREFVMGKIMEDLNFAIEV FKETNRTKELYRVTWWTAQALKSRVGLFEGTYRKYHGLGDYEKYLNDCVSASNEIMTASG GYSLYQSGSQSYRNLFKSENAIDAEIILARDYNNDLSLVHKVQAFENSPTLGRPGLSKKL VNYYLMKNGNRFTDQPNYATMEFKDEIVNRDPRLAQTIRTSNINMNVTMTGYHLLKYAND NMSYTGDSSNDLPLFRLAEVYLNYAEAKAELGTLTQTDIDNTVNKLRTRAGVTGKLNMNA ANLSPDPYMCAPETGYVNVTGDNKGVILEIRRERAIELVMEGFRYYDLMRWKEGQCMAQS FKGFYLPATAINKAYDIDGDGTNDVCFYTTSSQPNVGNVTYVKLASDGSGTSLSEGNYGN LLCYSWIDRTWNENRDYLYPVPRQEITLSDGVVTQNPGWNE >gi|225935387|gb|ACGA01000005.1| GENE 8 11469 - 14969 2802 1166 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2402 NR:ns ## KEGG: Dfer_2402 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 4 1166 28 1217 1217 931 43.0 0 MKHKVLLILLLGIIISVSASAQKVTINLKQVKLEKVFSAITQQTGLTVAYSRTIVNPDRI VTVEAKNQELSKVLNDLFLGTNVNYEIGKTKIYLKEKVTDFEQQTSNANKKNISGRVVDE KGEPIIGASVMVQGSSLGTITNVDGRYTLANVPESSTITVSYIGYITVNYAATSRNLSQV VLREDSKTLEEVVVVGYGTQKKVNLTGAVAIVSGDELTTRSAANLSQLLQGSVPNMNVNF SSGRPGQGGSFNIRGVNSISADAAPLTIIDGIEGDINKVNPNDVESISVLKDASAAAVYG ARAAYGVILVTTKNGKIGKTNVSYNGRFSFGETTTSTDFETRGYYSAGINDMFYKTYQGV PYTHYTQEDYHELWIRRNDKVEDPSRPWVVEKNGEYKYYGNFDWYNCLFDNTRPTWEHNL TVSGGTEKVKYMLSGNYYNQKGIIRIDPDRFKKYTFRSKIIANITSWFELSNNTSYYHSE YTYPGLSGVNDVFTRAGRHALASIVPMHPDGTLVYRTGLTDTGEVADGVSAVLLNGGHHN RDREYEFVTTFEAVLKPIKHFEVRANYSWAHYNQQNLNRSVDVLYSRNPGETITMDNGRT RGNYLSEIQNNQIRQTFNLYGTYDNTFANAHSVKVIVGGNYDYKYFKKLGMKRNGLLSES LDDFNLAKGDDISITGGQEEYAILGFFYRLNYGYKDRYLFEASGRYDGSSRFRRGHRFGF FPSFSAGWRVSEEAFFTQAKNYVSNLKLRLSYGSLGNQKTVGYYDYLQLINTGAVMNYAF GDTTKGDYAYESAPNSTDLTWETVITKNIGLDLGFLNNRLNVSFDAYIRDTKDMLMAGKT LPGVYGASSPRMNVADLRTKGWEASITWGDSFTLASKPFNYRIMAGIGDNTSKVTKYDNP NRTLTDPYEGQQLGEIWGYVVDGYFKTDEEARNYKVDQSFVNQMINASALDNGLHAGDLK 
FVDLDGNNKIEQTTSANDRKDMKVIGNSLPRYNYNFGISADWYGIDFSVLFQGIGKQNWY PGAETSMFWGPYSRPYASFIPSDFMSQVWSEENTNAYFPRPRGYVALGSNRELAVVNTKY LQNLAYCRLKNLSIGYTLPDKWLSKIGFEKIRVYFSGENLLTFTKLHSDYIDPEQASASN SWKTSKSDANIYPWAKTYSFGVDITF >gi|225935387|gb|ACGA01000005.1| GENE 9 15185 - 16177 863 330 aa, chain - ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 117 320 67 267 280 71 25.0 2e-12 MNRPTDKQIEEVLAGVATPEDAKFVAEWFATEEGNTYLDAVMTREAEHLKAETAEIYVDH TIPSEKMYHQIQKNISRKQIKRICFRVAAILIPVIFLIGLYIQINSRVNLFGTTEYEEIR VAKGERIQMMFQDGTRVYINSDSWLKYPKKFGLSKREVFLVGEAYFVVAKNKKRPFIVNL NGPSVHVLGTSFDVQAYPENKDIVICLDEGHVNLTLSSAKKYPLLPGEKLIYNKESDQCR IIKNDHSKQVSMWKDNVISFKDTPLAEVVKVLNRWYNVNFKIEDEQASKYVYTLTSDNTI LEKVLQDLEKIAPVKFEYDEVRKEVTVRMK >gi|225935387|gb|ACGA01000005.1| GENE 10 16185 - 16556 313 123 aa, chain - ## HITS:1 COG:no KEGG:BDI_2325 NR:ns ## KEGG: BDI_2325 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: P.distasonis # Pathway: not_defined # 1 116 82 197 197 121 53.0 7e-27 MTKNHVLNLIRNENSAISKNYEIAQSTPVYEDNLIENLEKKELMASFYKAVDMLPPQKRS ICLMKVKEELTNQEIAERMNLSVNTVKTHYTEALKMLRIHLSKMLIIVTFVTLMTYLSVH YLR Prediction of potential genes in microbial genomes Time: Fri May 13 06:24:52 2011 Seq name: gi|225935386|gb|ACGA01000006.1| Bacteroides sp. D2 cont1.6, whole genome shotgun sequence Length of sequence - 69070 bp Number of predicted genes - 53, with homology - 52 Number of transcription units - 30, operones - 15 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 261 240 ## BDI_2325 RNA polymerase ECF-type sigma factor - Prom 388 - 447 7.3 - Term 369 - 430 10.4 2 2 Op 1 . - CDS 457 - 2661 1957 ## BT_1001 putative alpha-rhamnosidase 3 2 Op 2 . - CDS 2691 - 4550 1363 ## BT_1002 hypothetical protein 4 2 Op 3 . 
- CDS 4579 - 6675 2015 ## COG3533 Uncharacterized protein conserved in bacteria - Prom 6717 - 6776 4.1 - Term 6742 - 6786 10.7 5 3 Tu 1 . - CDS 6804 - 7553 279 ## COG4422 Bacteriophage protein gp37 - Prom 7617 - 7676 4.2 - Term 7593 - 7639 9.1 6 4 Tu 1 . - CDS 7740 - 7934 117 ## BT_2526 hypothetical protein - Prom 8047 - 8106 5.6 7 5 Op 1 . - CDS 8284 - 9222 309 ## COG3023 Negative regulator of beta-lactamase expression 8 5 Op 2 . - CDS 9251 - 9601 426 ## BT_4442 hypothetical protein - Term 9618 - 9652 4.0 9 5 Op 3 . - CDS 9687 - 10205 541 ## BF2226 hypothetical protein - Prom 10346 - 10405 7.4 + Prom 10503 - 10562 5.9 10 6 Op 1 . + CDS 10606 - 11859 854 ## COG1106 Predicted ATPases 11 6 Op 2 . + CDS 11874 - 12584 478 ## BVU_3668 hypothetical protein - Term 12602 - 12647 10.5 12 7 Tu 1 . - CDS 12820 - 13158 70 ## BT_2526 hypothetical protein - Prom 13283 - 13342 6.9 - Term 13322 - 13364 -0.1 13 8 Op 1 . - CDS 13393 - 16230 1709 ## BT_2473 hypothetical protein 14 8 Op 2 . - CDS 16258 - 17379 415 ## BT_2316 hypothetical protein 15 9 Op 1 . - CDS 17506 - 18780 505 ## gi|260170343|ref|ZP_05756755.1| hypothetical protein BacD2_00613 16 9 Op 2 . - CDS 18801 - 20864 1314 ## FIC_00184 hypothetical protein 17 9 Op 3 . - CDS 20900 - 21874 844 ## BT_1062 hypothetical protein - Prom 21968 - 22027 5.0 - Term 22035 - 22090 1.0 18 10 Op 1 . - CDS 22135 - 23700 1413 ## BT_1063 hypothetical protein 19 10 Op 2 . - CDS 23744 - 25027 895 ## BVU_0907 hypothetical protein 20 10 Op 3 . - CDS 25035 - 25604 476 ## BDI_3526 hypothetical protein - Prom 25762 - 25821 7.1 + Prom 25704 - 25763 9.6 21 11 Tu 1 . + CDS 25885 - 26754 487 ## BT_2889 AraC family transcription regulator 22 12 Tu 1 . - CDS 26976 - 28193 768 ## BF3522 tyrosine type site-specific recombinase - Prom 28381 - 28440 7.2 + Prom 28587 - 28646 5.7 23 13 Tu 1 . + CDS 28721 - 29215 117 ## BF2331 hypothetical protein + Term 29244 - 29282 1.0 + Prom 29398 - 29457 5.4 24 14 Op 1 . 
+ CDS 29517 - 31190 778 ## azo2045 hypothetical protein 25 14 Op 2 . + CDS 31196 - 33187 585 ## swp_4497 hypothetical protein + Term 33192 - 33240 9.1 - Term 33179 - 33227 5.3 26 15 Op 1 . - CDS 33230 - 33637 287 ## COG4933 Uncharacterized conserved protein 27 15 Op 2 . - CDS 33615 - 34670 567 ## Caul_1164 hypothetical protein - Prom 34759 - 34818 4.3 28 16 Op 1 . - CDS 34824 - 34934 61 ## gi|260170356|ref|ZP_05756768.1| putative ATPase involved in transport 29 16 Op 2 . - CDS 34986 - 35522 293 ## COG1106 Predicted ATPases 30 16 Op 3 . - CDS 35506 - 36447 298 ## ETA_03970 hypothetical protein - Prom 36512 - 36571 4.9 - Term 36522 - 36594 14.0 31 17 Tu 1 . - CDS 36613 - 38151 946 ## COG2461 Uncharacterized conserved protein - Prom 38267 - 38326 3.7 - Term 38356 - 38396 -0.1 32 18 Op 1 9/0.000 - CDS 38426 - 39796 442 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 33 18 Op 2 27/0.000 - CDS 39793 - 42933 2671 ## COG0841 Cation/multidrug efflux pump 34 18 Op 3 . - CDS 42937 - 44055 902 ## COG0845 Membrane-fusion protein - Prom 44268 - 44327 5.6 35 19 Op 1 2/0.000 + CDS 44367 - 45152 367 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 45169 - 45196 0.1 36 19 Op 2 . + CDS 45222 - 45827 340 ## COG1309 Transcriptional regulator + Prom 45888 - 45947 2.9 37 20 Op 1 . + CDS 46007 - 46456 305 ## Dole_2867 hypothetical protein 38 20 Op 2 . + CDS 46453 - 47256 348 ## COG0716 Flavodoxins + Term 47259 - 47303 -0.7 + Prom 47260 - 47319 1.7 39 21 Op 1 . + CDS 47347 - 48327 697 ## COG0863 DNA modification methylase 40 21 Op 2 . + CDS 48308 - 51088 1331 ## Hlac_3519 N-6 DNA methylase + Term 51253 - 51279 -1.0 - Term 51069 - 51126 13.1 41 22 Tu 1 . - CDS 51131 - 52894 627 ## BF0032 two-component system response regulator - Term 53068 - 53120 2.1 42 23 Tu 1 . - CDS 53280 - 53894 614 ## COG1793 ATP-dependent DNA ligase - Prom 53922 - 53981 4.6 + Prom 53845 - 53904 7.5 43 24 Tu 1 . 
+ CDS 53929 - 54120 87 ## - Term 54084 - 54139 15.4 44 25 Tu 1 . - CDS 54157 - 54948 957 ## BT_1007 hypothetical protein - Prom 54972 - 55031 4.9 + Prom 54899 - 54958 4.4 45 26 Op 1 2/0.000 + CDS 55043 - 55858 660 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 55870 - 55929 1.9 46 26 Op 2 . + CDS 55988 - 56896 845 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 56919 - 56961 7.2 + Prom 56973 - 57032 7.2 47 27 Op 1 . + CDS 57079 - 59553 1936 ## BT_1010 hypothetical protein 48 27 Op 2 . + CDS 59591 - 60769 1135 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 49 27 Op 3 . + CDS 60793 - 62256 1516 ## BT_1012 hypothetical protein 50 27 Op 4 . + CDS 62303 - 66175 3117 ## COG4692 Predicted neuraminidase (sialidase) + Term 66199 - 66245 10.1 + Prom 66206 - 66265 6.0 51 28 Tu 1 . + CDS 66288 - 67175 195 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Term 66982 - 67034 1.4 52 29 Tu 1 . - CDS 67190 - 67972 653 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 68135 - 68194 5.7 + Prom 68053 - 68112 2.5 53 30 Tu 1 . 
+ CDS 68150 - 69040 453 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily Predicted protein(s) >gi|225935386|gb|ACGA01000006.1| GENE 1 3 - 261 240 86 aa, chain - ## HITS:1 COG:no KEGG:BDI_2325 NR:ns ## KEGG: BDI_2325 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: P.distasonis # Pathway: not_defined # 7 86 5 84 197 79 52.0 4e-14 MLDKSAQYTNTDDEKLFSFIEKGDKGAFTQAYDRYHKLLYVLAYRYLMNADMAEDVVQHV FARLWEFRSELHVGISLKNYLFTMTK >gi|225935386|gb|ACGA01000006.1| GENE 2 457 - 2661 1957 734 aa, chain - ## HITS:1 COG:no KEGG:BT_1001 NR:ns ## KEGG: BT_1001 # Name: not_defined # Def: putative alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 734 1 729 729 1409 93.0 0 MKKKYMILLGALSLASSTFAQTWIWYPGDYEIWLGNQMNNRRTERGAFFPPFWKTDSHYV VVEFSKQLNLSEPEEIFIAAEGKYNVKLDGKLQFGMPETMTLPAGKHNLNIKVWNQATPP TIYVKGKTVNSDSSWRVTYEDKEWIDESGKASGTSATIYIDAGCWNFDGATQLPSQFSLM REPQQPVAKTEQAEGGILYDFGKETFGFITLKNLSGKGKIEIYYGESPEEAKDKAYCETL DKLLLEPGQVTDLAIRSTSPLNSSDNEYTLENSKAFRYVYVTHEPGVQIGEVSMQYEYLP EEYRGSFRCNDEELNRIWEVGAYTMHLTTREFFIDGIKRDRWVWSGDAIQSYLMNYYLFF DSESVKRTIWLLRGKDPVTSHSNTIMDYTFYWFLSVYDYYMYSGDRHFVNQLYPRMQTMM DYVLGRTNKNGMVEGMTGDWVFVDWADGYLDKKGELSFEQVLFCRSLETMALCADLVGDE IGKQKYEKLAAALKAKLEPTFWNNQKQAFVHNRVNGQQSDAVTRYANMFSVFFQYLNADK QQAIKNSVLLNDSILKITTPYMRFYELEALCALGEQEAVMKEMKAYWGGMLKEGATSFWE KYNPEETGTQHLAMYGRPYGKSLCHAWGASPIYLLGKYYLGVKPVKEGYKEFAIAPVLGG LKWMEGTVPTPNGDIHVYMNGKTMKVKATEGEGYLTINSRRPPKANIGIPEKVSEGVWRL RIDSPEERIVTYHL >gi|225935386|gb|ACGA01000006.1| GENE 3 2691 - 4550 1363 619 aa, chain - ## HITS:1 COG:no KEGG:BT_1002 NR:ns ## KEGG: BT_1002 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 615 1 617 619 1167 88.0 0 MRKLFLTAICILCSHWLWSGEIWVSPKGNDFNDGTRQSPKATLTAALRQAREWRRTGDDR VQGGITIYMEGGTYALYEPVFIRPEDSGTKESPTVIRSATDEKVVLSGGVRIKNWKKQGK LWVADVPAFNGRPLDFRQLWVNGEKAVRARDVEDFEKMNRICSVDEKNEVLYVPAVAVRR LIDNKGKLKAEYAEMVIHQMWCVANLRIRSIEVQGDSAAVRFHQPESRIQFEHPWPRPMV 
TTDGHNSAFYLTNARELLDVPGEWYHDMDARRVYYYPREGEKMQEAEVMVPAIETLVQVE GTLDRPMCHIRFEKITFSYTTWMRPSEKGHVPLQAGMYLTDGYRIDPKMQRDYLNHPLDN QGWLGRPAAAVRVAAARQIDFERCRFEHLGSTGLDYEEAVQGGIVRGCLFRDIAGNGLLA GSFSPAAHETHLPYDPADRREVCTHQQINNCYFTEVGNEDWGCLAIAAGYVSDINIEHNE ISEVPYSGISLGWGWTQTVNCMRNNRVHANLIHHYAKHMYDVAGVYTLGSQPKSYVTENC VHSIYKPGYVHDPNHWFYLYTDEGSSFITVRDNWTEGEKYLQNANGPGNVWENNGPKVDD AIRERAGLETEYKDLLNVR >gi|225935386|gb|ACGA01000006.1| GENE 4 4579 - 6675 2015 698 aa, chain - ## HITS:1 COG:TM0280 KEGG:ns NR:ns ## COG: TM0280 COG3533 # Protein_GI_number: 15643049 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 25 694 4 618 620 397 36.0 1e-110 MKQIKLLLLLASASVTGALAQSNGLTDMSQSRFAKMANTGIGAVHWTDGFWGDRFQVFSR TSLQSMWNTWNAPEISHGFHNFEIAAGVCKGEHWGPPFHDGDMYKWMEGVASVYAVNKDP ELGKLMDNFITCVVKAQRADGYIHTPVVIEELNKGIDSHALEDQHKQTVIGTKVGDEDEK GAFANRLNFETYNLGHLMMAGIVHRRATGKTTLFDAAVKATDFLCHFYETASAELARNAI CPSHYMGVVEMYRATGNPRYLELSKNLIDIRGMVENGTDDNQDRIPFRDQYRAMGHAVRA NYLYAGVADVYAETGEQQLMKNLTSIWNDIVTRKMYVTGACGALYDGTSPDGTCYEPDSI QKVHQSYGRPYQLPNSTAHNETCANIGNMLFNWRMLEVTGDAKYADLVETCLYNSVLSGI SLDGKKYFYTNPLRISADLPYTLRWPKERTEYISCFCCPPNTLRTLCQAQNYAYTLSPEG IYCNLYGANTLTTTWKGKGEVALTQETDYPWDGNVRVTLDKVPRKAGTFSLFLRIPEWCE KATLTVNGQPLQVNAKANSYAEVNRAWKKGDVVELVMDMPVRLLEAHPLAEEIRNQVVVK RGPLVYCLENMDIANGEKIDNVLIPADIKLTPKKITIEGSPIVALEGKARLASATSWEGV LYRPVVQAEKTVDIRLIPYYAWGNRGKGEMTVWMPLAR >gi|225935386|gb|ACGA01000006.1| GENE 5 6804 - 7553 279 249 aa, chain - ## HITS:1 COG:MT2803.2 KEGG:ns NR:ns ## COG: MT2803.2 COG4422 # Protein_GI_number: 15842273 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Mycobacterium tuberculosis CDC1551 # 5 212 11 223 284 77 30.0 2e-14 MGEKASMWNLWHGCHKLSEGCRHCYVYRTDGKYGKDSSVVTKTEKFDLPLLRKKNGTYKI PSGNLVYTCFTSDFLIEDADEWRAEAWEMIRIRQDLHFLFITKRIDRLQQCLPPDWGDGY ENVTICCTMENQDRVDFRLPIYREIPIKHKIIICEPLLSRIDFRGELGDWVEQVVAGGES GKEARMCDYEWVLDIRQQCIDANVGFWFKQTGSFLLKEGHEYKIARQFQHSQARKAGLNY TPDKQEIKG 
>gi|225935386|gb|ACGA01000006.1| GENE 6 7740 - 7934 117 64 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 64 1 64 132 75 64.0 6e-13 MIIYNSRLAKCLLDKKKHSFMIFGCYFTRYKQLEFWEEIENRIHVRQYMEYFLPALLSAV GVSL >gi|225935386|gb|ACGA01000006.1| GENE 7 8284 - 9222 309 312 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 148 1 104 116 105 45.0 8e-23 MRKIDSIIVHCSATKAGQDFTAADIDRWHRERGFNGIGYHYVIRLDGRLEKGREIDLAGA HCKGWNERSVGICYIGGLDENGHPADTRTNAQKRVIYQVIMDLQRQYTILQVLGHRDTSP DLNGDGVIEPYEYVKACPCFDVREFMKSGRELLFVLLLGLVLPGVLSGCRSKKEVTSRSS EVQMDSSSSGHSSHVALYDVNQEKKMLERMEESTEQILVVLGKDTATGDFSSTCFISGSR KTVNKVFEKKERIDEKEDTQISSFIDTNHHKVVCEEEENSLYYRKNISWMLIISISALML LYFIYKKRRILF >gi|225935386|gb|ACGA01000006.1| GENE 8 9251 - 9601 426 116 aa, chain - ## HITS:1 COG:no KEGG:BT_4442 NR:ns ## KEGG: BT_4442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 116 116 52.0 3e-25 MQITTILAFITAMGGLEAVKWIVRYISCRKTDARKEEADVSSLEEENRRKKVDWLEDRLT QRDEKIDGLYIELRKEQEEKIDWIHKCHEVELAQKESEVKKCEIRGCVKRIPPSEY >gi|225935386|gb|ACGA01000006.1| GENE 9 9687 - 10205 541 172 aa, chain - ## HITS:1 COG:no KEGG:BF2226 NR:ns ## KEGG: BF2226 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 155 1 153 155 84 34.0 2e-15 MAIRFKRVQRLCDPTNKEAGKKVYPVISYQYDTSATLDEFAKEISSASGVSEGETISVLK DFRTLLRKTLLSGRSVNIAGLGYFYLAAQSKGTEKAEDFTVSDISGLRVCFRANSDIRLF AGTSTRSDGLKFKDLDHINDSSIIDGGSGEGGEAPDPDEGTGGSGEAPDPAA >gi|225935386|gb|ACGA01000006.1| GENE 10 10606 - 11859 854 417 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: 
Predicted ATPases # Organism: Fusobacterium nucleatum # 1 416 1 415 420 163 32.0 5e-40 MLLEFTVKNYRSFHKECTFSLEAQNIVEEPKTNVVSLDGYKIVKTAAVYGPNSSGKSNLI SALDNMRNCVINSVRLNDNEMLPYDPFLLSETSDDTYEPTHYEILFLMDGIRYRYGFDYT YTSIVGEWLFTKVGTKKEKTLFIRTEEGIGVSEKDFSEGIGYESKTNENRLFLSLCAQLG GTISKKIINWFNNGYRIISGLQSIGYKTLSKQMFHEKKEGYQEALAFFKTLQLGFEDLLS EEESTTKVSFVSGKSMVLEKNIKLSTVHNKYNDKGQVIGEEVFNLENQESAGTLKLVELS GPIFETLLKGSILVVDELDAKMHPLISQYIIKLFNNAETNPKNAQLIFSTHDTHLLSARL LRRDQIWFTEKDTLEQTDLYSMMDIILPDGSKPRNDTNYEKNYINGRYGAIPFIMND >gi|225935386|gb|ACGA01000006.1| GENE 11 11874 - 12584 478 236 aa, chain + ## HITS:1 COG:no KEGG:BVU_3668 NR:ns ## KEGG: BVU_3668 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 236 3 234 237 243 56.0 4e-63 MARVIKIDNALQKRHEYKKNKRKARTVICRILIVCEGEKTEPNYFRSFDRHRKGNVVYEL TLDGGRMNTVGVVDKAISLRDKANIPYDRVWAVFDKDGFPAKNFNTAIAKANQNHIECAW SNEAFELWYLYHFHNRITGMTRDEYAVAITKAVNASPLYKKKTPYQYAKNDKSSFDIVTT FGSQDNAIRWAETKHCEYTDERYAMHNPCTTVYRLVRQLTGTDEELNEEIIRKIDE >gi|225935386|gb|ACGA01000006.1| GENE 12 12820 - 13158 70 112 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 112 21 132 132 128 66.0 6e-29 MIFGCCFTRFKYLEVWKEMELRIHERQYIECLLLALLPALILSLFLSWWCMLFVLLNYHL LYWMERWFGHHSSFDWEALEHCGDTFYLRKRKSYAWMKWYGKKSLPPSEWDD >gi|225935386|gb|ACGA01000006.1| GENE 13 13393 - 16230 1709 945 aa, chain - ## HITS:1 COG:no KEGG:BT_2473 NR:ns ## KEGG: BT_2473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 524 867 312 644 706 82 25.0 1e-13 MKKRQIPLFSHPRCLIFFSILGILILSSCVDDNETGTINSSPANVIQLSIPDAELVQVRS VATGKECMISNLQVLIYNDAHTAAPVYYQEGKSTQLSFLFGNGTASPTITLTDYIPQNGD RVYVLCNMTDRTSINETTSEDQLKNYSSQGQMKGFSSSQKEERTIPMYGWIEWNEAATSN VCLLTRCLAKIVVNANLEQLFPDKKVYWEWKNLNFSDFILDSDNQGDIYKGNINENGISE KHDLLSAITPIDESKGLVTSYYPLEYKHSIYALGKEVDKNKFSVDRAMLLLTVQNMDGSD 
KEYYRLDFWNKSTGAYWDILRNHSYNFNISKLKSKGYATTLEAIQNPGSNIEYSVTVSDN WSQGFASNGQYLLKTDRESIELLRGIEDPVVMVKIELQADDSGNANFDNVTTRRIRALGM DKKTLFPYSQMQMYYSIDNKTLIPIKKTTESNLSDSEVDLPVNSNKYWIYACVSNANLEF QGYLEITIGNIIKYIPIKAIEHSEANPIDESGPANSFITPITYGIYSFDATVIGNGVDGI VAETKNKPAWSGQLKDDLAMYHFKNAMGEDISVSKNVAVEPKSAKLIWQDKDNLISQVAF NSETNRVEFLSNGAGNGVIAVYDNSDPNAADANVLWSWHIWCTEQPDIIELGLPTNGEVY SGINYRIMDRDLGATTCTPDELTTRGMGYQWGRKDPFIGSASFESTESAPMYNVRGTDLF FEEIQKTKSIGTIEYAIRHPNVFIISNGAAYSGGDWLYYISPIYQSSSMTGNQYLWGNNP YSEYKMVNIKTMKTLYDPCPAGYKVSPPDVYGAFLKEKIIEGSVTDRLPSSDYHYGTMDG EKIYPPSFTKYGAYFYCGEGSMQRVYFTGNVRFGDTGFKGFQGMYYGGSLAGTLLHDDGN GGNDVVSFAFNIPSSFGFGCSFGSKEAHYWLDRSAAASIRCVRDE >gi|225935386|gb|ACGA01000006.1| GENE 14 16258 - 17379 415 373 aa, chain - ## HITS:1 COG:no KEGG:BT_2316 NR:ns ## KEGG: BT_2316 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 266 476 784 888 73 26.0 1e-11 MDTEGTSNCYIADRGGKSYSFTATIMGNGVDGIIDNGEFEDAMGNKLAKVTGADILPLSA KLLWQDTDELVEQVALVDNRVQVKMGRSRGNAVIAVYDNVNPNAVNAKILWSWHLWCTAT PGLITYSASRYTGNEYKVMDRNLGATTATGGLGTTQGLSYQWGRKDPYSGSLSYDGKRTV LYDIRSAESEVEYSKNEPATIGQTISTPHICNMGTSDTYSWCKDNDLYMKYLWGNPTGTD KVFPQSTSKTIYDPCPVGYKVAPGDVFQVLAKTGWVWAHSNIAYYFYIKNGYSHGSSFYC DGTGADETKVIYLPEVYLPNWVAGDLSLSGKYWTSSLYGEKKAFTFTFTLQGSYMLPVNR RFADVGSVRCIKE >gi|225935386|gb|ACGA01000006.1| GENE 15 17506 - 18780 505 424 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170343|ref|ZP_05756755.1| ## NR: gi|260170343|ref|ZP_05756755.1| hypothetical protein BacD2_00613 [Bacteroides sp. 
D2] # 1 424 1 424 424 795 100.0 0 MISNITNVLTALFVTFLLFSCVEEKNSCDDGRTNNVPDGKERMEIRLIAPVGNIMTTRGV ENALFEENRIDKLFMDIMDGENVVTSCSTEDGRLILVSVSDSIYIASAMFDIGVLKTGYT VCIRANDDFPKVISEQPVTPLYMSGFGFIEKSTEQNFRTSLHLLRGVAKLRTTVRKTALS VPQQELEIGDVKVQVIHAVDKIRKYAPFYSHDSDEYIYENRPSSYFSYFSYFDYPEISLT DILLKDRVELDGEMVTFYSQYIYENYLEVSRDYELDKNVTLLKLTIPVTDGMTSRIIERT IRLDNGDYRVKRNHIYSIDVQVLSIDEVKIYTDMLNWTDIEITGDIVGTDFDVDRKEISL VQGITDPVKLIQVCCNTPGSFQIRILQPNQTDLMSINDLQLYCDGISTNHRVIGNSGIYD VSTA >gi|225935386|gb|ACGA01000006.1| GENE 16 18801 - 20864 1314 687 aa, chain - ## HITS:1 COG:no KEGG:FIC_00184 NR:ns ## KEGG: FIC_00184 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 346 577 43 324 1036 109 30.0 3e-22 MKSYWWVCFLLFYRFCISCTENEGSREAMMNPDEVQMAINITTRTTNDPEVLNVNETRIS RIRVYIFDGTSLDKMYYWTLTETSGFYATPVFTVKAATGKSLCAIVNEPTDMNTRAILES VDHPNDLIDVQYQIADYLTTQANVPEYTTDYCLPMYGEVKGIDVSKGMTQTRNMTVDRAV ARVDVYMRKEAGNKEEVLIPNGLMVTGGAETGYVSPAKVGNYASSTIDITREAVKSIPEE TSTKDKGILAYSFYLPEMECKDRKLNLKLDGYDSIDLGGDSDNSGGAVLEKLERNHVYQL LCRFKTKMPALDINLTVCPWIALEPNIDKIYGDIHIINCFIVPPGGRVHIPMDGVYKAWA QLDEFEHAPIPDGEVMVEVLWQDEIGVMTDHSIKVMNASLRDKAYVIVETGEKMGNAVIT MKVNGEIYWSWHIWCTDYNPNLKNGQQELNGYVWMDRNIGATYNSYNEEGGVRSKGFLYQ WGRKDLFPPAKGWEKAEADKELYGITGELKSFSKTPVAVLNNISNSVHNPMTFYTSSDSW YTVNEKLADYSLRYLWNTKLGKKTLFDPCPAGWRVPAPKNVEDSPWPDDSIVLYKEMKGF GFNVSNVYYPAAGYRLNSTGDIMMDINLNTPYNYWGRYWSGNLRRSFTMTLTCVVAWGER PEVTGETYGVYAKLERASAASVRCVKE >gi|225935386|gb|ACGA01000006.1| GENE 17 20900 - 21874 844 324 aa, chain - ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 315 1 308 317 305 48.0 2e-81 MRHNIKIKGIIRRTGLGGCLLLLLACFSCESIHETLPECRLYIDFKYDYNMEFTDAFAAQ VNRVDVFVFDKEGTFIMKKSEQGERLGTGTYKMSFELPAGEYQVAAWAGMSDAFEMQELQ AGQSKLNDLTMKMKREKSLIHNVKLEPFWYGEVKTVHFTGEKEQTETVRLVKDTKKFRFI LQKSGPGEELDVNDCLFEIRADNGSLAWDNSLLEDDVICFQPYHLEKVEDVGIVVEMNTM RLLENKKVYLMLTRKSDGKELMKVNLISYLLLTKMEGHHIPAQEYLDRQSEYAIVFFYNP 
ELLDFPANKIVINGWTIWLKGEEL >gi|225935386|gb|ACGA01000006.1| GENE 18 22135 - 23700 1413 521 aa, chain - ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 517 1 562 566 145 28.0 3e-33 MKLDKIFVTLFVGLAMTACSNDEEATTGGNQLPVDGKAAYMSLSLAMPKSSGTRAPGENH GTANEQKVNELYLALFDASDICLETKTLSAADYILNVGGADVAGHDGKAFRVPSATAKVL AVVNPSDKFKLVCVASTSWSVINNAVEQTVKEVIGDAKNNFMMINAGKNASPADGALVAA NVKVVDGTTYADATAAIAAAEGDRSEIHVDRVVAKVSLGVNASGVTVPAGVTCTFGNWAL NVTNKSMFPYSEIIMPAGGSAGADYRIDPNYTLAGFNVSQFNYLKIAANGSLPTDFSAMA DSKYCLENTMAAYAQTQAQTTAAVVSAVYTPNSFTAGESWFRLLGVTYKTLADLQTVYVA AKSATTPDAAQQQVLDLCDQFYARMSAAATAQSKTIGADFAAITIAELDAITNGGEYSKP AAGETVGVEYFQKGVCYYNILIRHDDAITATMAHGKYGVVRNNWYTLTINSVKQPGTSWI PDKTDPTKPENPGEDNDDNEAYLSVDITVNPWTTWSQGVDL >gi|225935386|gb|ACGA01000006.1| GENE 19 23744 - 25027 895 427 aa, chain - ## HITS:1 COG:no KEGG:BVU_0907 NR:ns ## KEGG: BVU_0907 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 426 23 431 486 232 34.0 2e-59 MKRICLYMALGLLGILSGHAQEISIKNAIARKISDDKVEVSFNIDCNTLQFSSRQQLTII PVIISESENDSLFLPSVCIAGTNRYRVNKRQEKLYGKIKGQEIIRYKKKYDTVIEYNETV PSQEWMSGARVEVLRELQGCAGCGEILGNSPIADIPLLKKIVERPNLEIRIPVIEEKHRS FTHTATLNFKVNQSTLLADYMNNPVELAKIYSSIDSIREDVSYRIDRIGIVGYSSPEGNY VSNARLSEQRAKALERNLEQAYKLDNGIIVSRSVPENWEGLSAWLQEYRPSYMQKVLDII EQTPVPDERDAKIKAIDGGKIYNALLREVYPLLRLVEYTVSYTVVPFSVEQGREIIHTRP DKMNHYEMYLVADSYGKGSDEYNKIIRMIADRFPDDRIANNNAAIVAWEMEDYDAMNVYL QKLNEKR >gi|225935386|gb|ACGA01000006.1| GENE 20 25035 - 25604 476 189 aa, chain - ## HITS:1 COG:no KEGG:BDI_3526 NR:ns ## KEGG: BDI_3526 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 8 189 4 185 185 268 70.0 8e-71 MKKYLCFIGLLGTLLCSTATSHAQEIAVKSNLLYDATTTMNLGLEIGLAKKWSLDLSGNY NPWKFNDETRLRHWGVQPELRYWLCERFNGHFIGVHGHYAKYNVGGISFLSDNMEQHRYQ GHLWGGGISYGYQWLLGNRWSMEAVIGVGYARLDHSKYPCATCGTVQKSEKKNYLGPTKA AINMIYIIK >gi|225935386|gb|ACGA01000006.1| 
GENE 21 25885 - 26754 487 289 aa, chain + ## HITS:1 COG:no KEGG:BT_2889 NR:ns ## KEGG: BT_2889 # Name: not_defined # Def: AraC family transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 284 8 289 294 332 56.0 1e-89 MENKLLYIEEHMSCQNYMTTIETGFKFIEFTEDTELGEDDTSKNYLLFFLKGDFTVSYNQ FHNRTFQAGDMILIPRSSSFKGVAGKGGNLLFMFFDVPESSCDKLVLQSLSEICKEKEYK FEPIKIRYPLTPFLEILTYCIRNSMNCSHLHDLMQRELFFLLRGFYEKEEIAMLFYPIIG KRIDFKDFVMHNYTKVSNIEQLISLSNLGRSCFFARFNEVFGMTAKQWMLKQKNQLILEK MTKPGICIKDIIEELGFDSQVYFSRYCKQHFGCTPSQLIKRCQADNSTD >gi|225935386|gb|ACGA01000006.1| GENE 22 26976 - 28193 768 405 aa, chain - ## HITS:1 COG:no KEGG:BF3522 NR:ns ## KEGG: BF3522 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 2 404 1 399 403 383 50.0 1e-105 MITSVRLMLNKSRRLNNGSYPLVFQVIHERRKKLMYTGFRIKEESFDELEEKIIDGVDST FTTADVARMNRELRKIKNRIRAQIRHLERSTESFTVEDVLAQYIHKNVRQQFYLLRYIDT QIDRKKTLKKEGTAAAYRSTRLSLAKFLNGSDIRMSAIDLRFIRQYEDFLYNSGVTGNTV SYYLRNLRTLYNQAVTDGYHPHGEYPFAKAQTRPAKTVKRALTRKDLQALANLGLEEMPE LKFARDLYLFSFYAQGMAFVDIVLLRKSDIYNGVLTYSRHKSKQLIRIAVTPQMQELMDK YETEGEYVFPIISEKSLSEYKQYRLSLGRINRYLKKIAVMIDIAVPLTTYTARHTWATLA RDYGAPVSVISAGLGHTSEEMTRIYLKEFDVSQLDKVNSMVTNLS >gi|225935386|gb|ACGA01000006.1| GENE 23 28721 - 29215 117 164 aa, chain + ## HITS:1 COG:no KEGG:BF2331 NR:ns ## KEGG: BF2331 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 161 1 154 163 97 35.0 1e-19 MVTVTFAIKPYLVRYMYVRYRQSLESRSHSNPPSKLPVPIHLSHLTPVYHFLYQLSVPHP QGVSWKETGNISFVLPSPRLGKNPEVYNYFGQDSIFIIEKEIEVEMKAELYSFLLENKFK NGVMYIKSMHEFVEKYNMTESVEEESLMRAFQRWRKMVKENLKK >gi|225935386|gb|ACGA01000006.1| GENE 24 29517 - 31190 778 557 aa, chain + ## HITS:1 COG:no KEGG:azo2045 NR:ns ## KEGG: azo2045 # Name: not_defined # Def: hypothetical protein # Organism: Azoarcus_BH72 # Pathway: not_defined # 5 557 6 586 591 372 38.0 1e-101 MENILELKTLNELQQYKFYIPSYQRGYRWKSYEVKDLLNDIADFQPRLINDTDEKTWYCL 
QPIVIKNISNLEYEVIDGQQRLTTIYLILHYLNQDFVEAKRDKLFSLEYETRKDSHNFLL SLNFEKSNDNIDFFYISQAYKTISDWFDNKGDNFDKGEFRSKFKFSTKVIWYQSFEENPI DIFTRINIGKIPLTNAELIKALFLNSSNFEKGELKKLRQRQFEIATEWDNIEHTLQNDRL WYFINKNNSTSNRIEFIFDLMNSEKDTADQYSTFRFFQKKFTKKQNDTIDIVWSEIKQYF QRFNEWFNERDLYHKIGYLLCVEAIDIATLYEKSTLLSKSEFKNELNKIIKDDLKEVRLS ELQYGDNKVKNVLLLYNILTMLNSEKDNSYFPFDLFKNEKWDIEHITSVKDAIPDKNRKD WLKDAVVFIDDSNREGKLLKKQAMECECDDDEVFKTLFEHIVAHFNSEINDEDINDISNL TLLDSETNRGYKNAVFPFKRKTIIKRDKAGIFIPLCTKNVFLKYFSEYPPKISFWTQDDR NNYLIDIETTLNEYINK >gi|225935386|gb|ACGA01000006.1| GENE 25 31196 - 33187 585 663 aa, chain + ## HITS:1 COG:no KEGG:swp_4497 NR:ns ## KEGG: swp_4497 # Name: not_defined # Def: hypothetical protein # Organism: S.piezotolerans_WP3 # Pathway: not_defined # 10 498 11 519 838 238 35.0 5e-61 MNEFDKKYKGKTFSFYELINTHNIEIPIIQRDYAQGREDKKEIRDNFLNALYNSLEENYQ LKLDFIYGSIVNDAFQPLDGQQRLTTLFLLHWYAATKDKCLDDNVQKTLNRFSYETRISS RDFCKSIINNRIIVNKDSSISNIIIDSPWFYLSWKKDPTIRAMLRTIDDIHNKFCLIDNL WEKLISETLPICFYYVELENIGLTDDLYIKMNARGKLLSTFEKFKAGFQKYINSNKWEEN HNPTDTFAFKIDTKWSDLFWTNFSNKQSIDDSLVHFISTIAMIRHAIEKKEERINHITRL NDNPDSIRPELFSKDSFCYLKKCFDLYCDILESNIDLNLNFPLWRHKPSKDFLSQIVSEN ENISYTQKVLFYAQTEFLMYNYKDFKKEKFEDWMRVIRNIVSRGNIEVTGKRPDIIRSPQ TFDGVINLISELSEGCGDIYNYLSKVEKINSSFARSQVEEEKLKAKLLMHNIENVALKNI IFEVEDNDLLMGRIEFALSCINFDYSEISSFDIEKFNKVGQVINRYFSKETDISNDIRRA LLTIGVNGKYEYYEYWWSFWYVGNANKRCLIDKFRELEYYIYSDCKPYFKELVLKLIDNS LRDIIEKFVPPHDMPSWKIKLIKNPALLDKNTSHYIAIPEDNSYCYLLKSKRPRDIDGSI KIE >gi|225935386|gb|ACGA01000006.1| GENE 26 33230 - 33637 287 135 aa, chain - ## HITS:1 COG:BMEI1226_2 KEGG:ns NR:ns ## COG: BMEI1226_2 COG4933 # Protein_GI_number: 17987509 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Brucella melitensis # 5 130 1 130 139 67 33.0 7e-12 MKVLLSIKPEFVQEIFSGKKKYEYRKAIFTKNVDKVVVYSTKPVGMIVGEFTVENILNDK PCTLWDQTHKDSGITKDFFDQYFEGRTHGYALKISSPQLYEEPINPFKLFKTFVAPQSFK YIDSNDSEALLFSNY >gi|225935386|gb|ACGA01000006.1| GENE 27 33615 - 34670 567 351 aa, chain - ## HITS:1 COG:no 
KEGG:Caul_1164 NR:ns ## KEGG: Caul_1164 # Name: not_defined # Def: hypothetical protein # Organism: Caulobacter_K31 # Pathway: not_defined # 1 351 5 351 351 329 46.0 1e-88 MEQQRFCDINLSDPFFNSLKEDYPEFSKWYDKKKKDKAKAFVQKNNEGMLQAFLYLKQEL EDIIDIYPPMPPANRLKVGTFKIEAHNTKLGEQFVKKIVAAALHIDADEIYVTIYRKHEG LIKLLKRYGFLIYGTKGQEEEPELVFVKPMKVYSGDLLYDYPFVHTSKVRKFILSIKPEY HTPLFPDSILDNEERHKSFLVQDIAYTNSIHKVYLCKMRDIDQLLRGDILLIYRMKDEKG AAYYRSVVSSICIVEEVRKASDFKSMEEFIKYANAYSIFDENELRKWYTVYGMVLIKMTY NIAFDRRVTRGELIEQVGLSASDYWGFMQINDEQFKNIISRGKINESLIID >gi|225935386|gb|ACGA01000006.1| GENE 28 34824 - 34934 61 36 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170356|ref|ZP_05756768.1| ## NR: gi|260170356|ref|ZP_05756768.1| putative ATPase involved in transport [Bacteroides sp. D2] # 1 36 1 36 36 70 100.0 3e-11 MHIVLPDGTEPRGDGNFERNYIKGRYGAVPYLSTIE >gi|225935386|gb|ACGA01000006.1| GENE 29 34986 - 35522 293 178 aa, chain - ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 5 178 198 364 420 70 30.0 2e-12 MLLVVRQIFTTTNADNSGYNVISGLNSQGYEGLTERQFLNKEAESVGALQFFKDLQLGFI DIETDEHEIEKGRKAIDVFTVHNIYNKDGEITGKQRFRFDYCESQGTQKLFELAGPLFEA LRHGRLLVMDELDAKMHPLISQHIIKLFSSEKTNQHHAQLLFTTHDTNLLSSHLLRRD >gi|225935386|gb|ACGA01000006.1| GENE 30 35506 - 36447 298 313 aa, chain - ## HITS:1 COG:no KEGG:ETA_03970 NR:ns ## KEGG: ETA_03970 # Name: not_defined # Def: hypothetical protein # Organism: E.tasmaniensis # Pathway: not_defined # 15 260 11 258 301 133 34.0 9e-30 MLQKTFLARCDNRACLAKTNVISGSPEDWLSNDLLSGSNTFGLTFDFFIDWAINRISPYV WIKRILLPTYTYDEFIGKLDSEMEKEFGKDYLCRLGRFATEYDMQVQFIVFHDELDWAND RSELIIVSLSFKEGYYSFSPQKYSLSEFKELIKSHSGGPVSIGSKGLIYGTSRLECSLSK TDSLYPGDADLLLLNEDNKAVCILEFKKHTLSSPVSEQCFTNYYPRPDERKYKRLALLRD YLASKSNSRILFFVLYYPTQTYIEQRWKLEVIEGNAFSLRATDSFIFELPADKSDNEYKK VIEKISQVIAARS >gi|225935386|gb|ACGA01000006.1| GENE 31 36613 - 38151 946 512 aa, chain - ## HITS:1 COG:FN1655 
KEGG:ns NR:ns ## COG: FN1655 COG2461 # Protein_GI_number: 19704976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 4 508 1 510 512 565 59.0 1e-161 MENMKTFLPETDLAKLRLVLEIKDAYNRGTISLEEARKMLKEKVVTLKPYEIALAEQELK EVQDDECRKEDIQRMLELFDGIMDTSCPDLPPDHPIMCYYRENDELRKILSSIEELAQYP LIKNQWLELYDKLTPYRLHLSRKQNQLYPVLEKKGFDRPTTTMWLLDDFVRDEIRDARIL LENDSDDEFMACQQTIVYDIRDLMEKEETVLYPTSLVMISPEEFEEMKSGDREIGFAWIG EDLQQKPSSTPAEKEKGEMPGFAAELAGLLNKYGYGRGGGDELLDVATGRLSLEQINLIY RHLPVDLSYVDENELVCFYSDTKHRVFPRSKNVIGRNVKNCHPRSSVHVVEDIIEKFRSG EQDHAEFWINKPGFFVYIYYVAVRDENGKFRGILEMMQDCTHIRSLEGSRTLLTWDDTNT PAQTEPSSAEKPGEESAKIEITSATLLKDLLAAYPLLKDRMEEISPKFKLLKSPLARVIL PKATIKMMSERTGIPLEVLIESLKSKIEELSR >gi|225935386|gb|ACGA01000006.1| GENE 32 38426 - 39796 442 456 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 1 454 1 455 460 174 27 9e-43 MKRIIYIIMCMALLSSCHIYRNYQRPEGLPVDSLFRETDSSRTEDSLILGDLPWQEMFRD TLLQNLIRYGLANNTDMQTALLRVDQAKAQLKAACLSFLPSLTLSPQGTLTGTDGSKTVK TYELPVQASWEIDLFGNLRNAKKGTQATLLQQKAYQQAVRSELIATIANDYYSLLMLDEQ LEISRATLDVWREQIRTMESKFKVGEETENAITQARANLYELEATHNDLQRQQREAENTL CTLLGTTSRSIQRGSLEQQSFPELINIGVPLRLLSKRPDVVQAEMTLANAYYTTNQARSA FYPNLTLTGSAGWTNSLGQVVTNPGGWILSAVASLTQPIFNRGKLISNLRVSKDDEQIAL LAYKQALLDAGQEVNDALYATEAAQRTLGSHQKQCKELERTVQTSEALYQTGNATYLELL TARQSLLNARLNVVSDRFTRCQSIINLYNALGGGCE >gi|225935386|gb|ACGA01000006.1| GENE 33 39793 - 42933 2671 1046 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1026 5 1022 1051 763 40.0 0 MKLQKFIDRPIFSMVISVVILVAGLIGLYALSVEQYPDIAPPTIRVSTTYSGANAEAVQK SVIVPLEEAINGVENMTYMTSTASNTGSAEITVYFKQGSDPDMAAVNVQNKVSTATSLLP AEVTKVGVTTMKRQSSMLKIFGLYSPKGTYDETFLTNYLTINVKPQIQRISGVGEVNVMG GDYAMRIWLKPDVMAQYELEPSDVETALSSQNIEASTGSIGEDSKNVYQYTLKYRGRLET 
EQEFEEIVIKSYNDGRVLKLKDIATVELGAQSYSYTGTVNGAPGSTCMINQTAGTNANEI ITNIDKYLDELKETLPEDMEIVELMSTKNFLDASINEVIKTLLEAIILVVLVVYVFLQNP RSTLIPTISIFVSLIGTFAALYIAGFSINLLTLFALVLAIGTIVDDAIIVVEAVQTRFDE GYRSPYKAAVDAMNGISSAVVTSTLVFMAVFVPVSFMGGTSGIFYTQFGITMAVSVGISA INALTLSPALCALILRPNEILVEGKKPEFSTRFRMAFDSAFKRIVLKYKMGAKFLIKRKW LAWSSLGLAIILLCYLMSTTKTGLVPSEDTGSIFVSLDAPAGSTLAETASIMDKVEKELK EIPQIDNFNKVAGFGMGSGSGSSHGMFIIKLKHWDERQGEESSVDAISEEIYRRTATIKN ANIFVFSPPMISGYGTGNSFELYIQDRSGKGTEALSEVTNEFLAELNKRPEIQMAYTSFS ANFPQYRVDVDEAQCQRVGTTTEEVLNVLSGYFGSIYASNFNRFTKLYRVIIQAPSDSRK NMQSLDNIFVKTNGGMAPVSQSVKLTKTYGAESLTRFNMFSSINVQGMPADGYSSGDVIS AVTEVAAQTLPTGYGYEYSGMTREEEQMVNSHDTIIIYGVCILFIYLILCALYESIFIPM AVILSVPFGLMGSFLFARMWGIENNIYLQTGLIMLIGLLSKTAILLTEYATERRHAGMSL TQAAMSAATVRFRPILMTALTMIFGMLPLMFASGVGANGNRSLGVGVVGGLIIGTVALLF VTPAFFIVFQYIEERVMGKRKEDRQL >gi|225935386|gb|ACGA01000006.1| GENE 34 42937 - 44055 902 372 aa, chain - ## HITS:1 COG:ECs4393 KEGG:ns NR:ns ## COG: ECs4393 COG0845 # Protein_GI_number: 15833647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 4 371 5 377 385 140 29.0 4e-33 MKVRSIFVSGLCCISLLSGCQQNTKQQNEGENYPVMTLKPEDRQLSVKYSAVIEGKQDVE IRPQVSGTITKVCVEEGARVHKGQILFIIDQVPYKAALQKAQAAVATAEACEATARQTLE GKQSLFKDKVISDFELRTAQNDYKSAKAALLQAQAEMADAQNNLSYTEVKSPVDGYAGMT SYRIGALVSSSMTDALIDVSDNSEMYVYFSLTEKQVLSLTAQYGSLDKALQSFPEVSLEL NDGSNYEQKGKVDVISGIIDKTTGAVSMRAVFGNKDKRLMSGGQANIIITYDRPQCIVIP QGATYEIQNRIFAYKVVDGKAVSTPIKVFEINDGAEYIVEEGLQEGDIIVSEGAGLLKDG TIISAVKEEKEE >gi|225935386|gb|ACGA01000006.1| GENE 35 44367 - 45152 367 261 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 157 252 191 286 288 66 36.0 5e-11 MEGSRLHILVCDGALELELNGKPYEMKGPALLDIMDTITVRINRIDPNLRAWCLFITFEF ASESLKNLRPGPLNHLLERLYLPVWHLSQEESGILERQLLLLKETLGNPKHYYRQELSET 
YFRSFSLELGNIMFTHEENMDDALPYISKRDHITLNFMKLVSKHFMEEHNIDFYAEALCI STKHLTRIVKGMTGKTPHTIICNELIHQAMAMLENDSIPVSQIAEELHFSDQAAFCKFFK KHKKVPPMAYRRRRNNIVTKK >gi|225935386|gb|ACGA01000006.1| GENE 36 45222 - 45827 340 201 aa, chain + ## HITS:1 COG:CC2662 KEGG:ns NR:ns ## COG: CC2662 COG1309 # Protein_GI_number: 16126897 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Caulobacter vibrioides # 6 182 21 198 213 70 31.0 2e-12 MELKRSETQLIEAVSTIITEEGFSKIGINKIARTAGCDKVLIYRYFGGLDGLIAVWAKKH DFYIQAYDTFINRIETVNEANLKQITKEILLAQLHFIRENKMHQELLLWELSGGLKFKAI RDLREENGHKLQKILEQKVDTKDLNISMYITLLIASINHIVLSTLEYPLFNGIDFSSDTS WTIYENTLCDYIDMLFEKFNR >gi|225935386|gb|ACGA01000006.1| GENE 37 46007 - 46456 305 149 aa, chain + ## HITS:1 COG:no KEGG:Dole_2867 NR:ns ## KEGG: Dole_2867 # Name: not_defined # Def: hypothetical protein # Organism: D.oleovorans # Pathway: not_defined # 4 147 73 221 225 71 24.0 8e-12 MTPDYHTISWEYSNQDKNTNIRVVLKNGTYKIDGMLKSKSYSKTYSSNGVPWFQNIGFNI GYSIKEKSTFKFECIRPDNLKLYEMQADGKEVVVHNGIREQRINVHLTGLLAKFFGCDYY IDLSSGQFIQYKGVQGAPGTPETIITIKK >gi|225935386|gb|ACGA01000006.1| GENE 38 46453 - 47256 348 267 aa, chain + ## HITS:1 COG:MJ0298 KEGG:ns NR:ns ## COG: MJ0298 COG0716 # Protein_GI_number: 15668473 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanococcus jannaschii # 1 148 1 149 150 106 40.0 5e-23 MKTLIIVSSTYLGNTMKVAKAMAQELKATVKCPKEVTESDISSYDLIGLGSGINFASHNK EIISLVEELDIARKKIFIFSTRCRPILGSYHKKLKSLIAQKDGILLGEFSCVGFDRTGPW VGMNGYNKNRPNSNDLFKSALFALQMRKKTHPLSALKKTYTPLLTYEGLPLRSDGTNEVV GNVVLLNTSRCIACGKCMKSCPMNVFTLKDNAKTPLPVDEMNCIMCGKCEKECPADAVFI NETFSNGLRILFRESSCDKLQKAYWYK >gi|225935386|gb|ACGA01000006.1| GENE 39 47347 - 48327 697 326 aa, chain + ## HITS:1 COG:yhdJ KEGG:ns NR:ns ## COG: yhdJ COG0863 # Protein_GI_number: 16131150 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Escherichia coli K12 # 37 288 19 258 296 77 30.0 4e-14 
MGSTSKDIKEEDKRLLEGVIQKKKKQQKDIKPFVNKVIQGDCLNILPSIPDKSIDMILCD LPYGTTQNKWDSVIDLQALWAEYERIIKDNGAIVLTAQGIFTAKLILSKEKLFKYKITWI KSKPTNFLNAKKQPLRKHEDVCVFYKKQSVYNPQMTKGEAYDKGVRKDQYTGSYGEFKPQ HVKSDGERYPNDVVFFEEDHDDFVYVKTAESEGEVYHPTQKPVELGRYLIRTFSNPGDII LDNACGSGSFLLSAILENRRFIGIEKNEDVLLHRIQPTDYIKICMDRISETLKREEVTPS TRKLFKKPITKYHTLNYLETDATNQL >gi|225935386|gb|ACGA01000006.1| GENE 40 48308 - 51088 1331 926 aa, chain + ## HITS:1 COG:no KEGG:Hlac_3519 NR:ns ## KEGG: Hlac_3519 # Name: not_defined # Def: N-6 DNA methylase # Organism: H.lacusprofundi # Pathway: not_defined # 287 685 277 677 694 96 25.0 5e-18 MPQINYNERSWAIDVISEINLYLANKSWHVKSAGGENTIRNEKSSLFPDVLIFKDSAKKI ILQGWELKMPDTPITDTELINNAKIKSEILNNNSFILWNVTSAVLYVKNGNTYSILKSWD SIDLHSRSEVKENENLWKDLLHTILEDLNDYFEHGEITESGSSEILAIDKIIDIVLENID STAENIKENIKRNAILDAQIDNWWLSSAAEYGFNPQTKKDVKHKLPTLSKVILTDWFFKI IFGNIIKRHFNEAKIIETITFDTTVSEAQQIIAKISKLCNFGNIFGDNIANELISDNVWK QLVQLNLFISNIKIESVDIQILQNLLQSSIDYAKRKVAGQFATPPQLADLLTRITINNKN GITLDPCCGTGTIIKQAYSLKEEYEIGQDQIIESIWASDKYSFPIQLSTLSLSNPNNMGK TLNIFRSDVIDLHMGQTAFEEHNNGNQVEKKLPYVDYVVSNLPFIREKEIKKLNPNIGKI NELIKEQTKAKNTLSKKSDIFAYIPFYLYNIISNNGRIGLILSNAWLGTDYGEIFLKLIQ KYFNIDRVVVSGNGRWFNNAKVVTTLLIATKREISDPVNLDRKISFCTLKEKIENIADIK KLSSEIILNKESEQVNIQSYSISEIKQLENIGIPWSGYFANLNWLPAITEKLLDSKKIFK FIRGERRGCNRMFYPAAGHGIEDEYIKPVLKNLKKAPSFIATSQTQAFCCSKSIEELETL NHTGALNWIRSFENQMDKTNKKPLPVALHRANMFWYEMSTKNMADFVVNINFDKSLFIAM LDKRSFFDQRMIGFSIKEQYKDENKIFLLALLNNILSMFFIESLGFGRGLGALDLSKTKF EKSFKMLNPEVLSDVQKEEIIKAFQPIFNRDRLPLEKELEQKDRIDFELMLLKIYGLEKY YDTIRQALLHLYKTRFAVKEKKHKKV >gi|225935386|gb|ACGA01000006.1| GENE 41 51131 - 52894 627 587 aa, chain - ## HITS:1 COG:no KEGG:BF0032 NR:ns ## KEGG: BF0032 # Name: not_defined # Def: two-component system response regulator # Organism: B.fragilis # Pathway: not_defined # 29 585 41 578 579 426 41.0 1e-117 MNDCYLNLLVSFRALFSADPGANLVNSKQTIGIILPDLSLLGYAVPVESFGIGLFVILFA YLIIEYNVQLLSHEKFKIAMNMLHTAHTPLILLRNQLEELKTGNLPEPLSQQVEEALGYA ECIIYCNQNIATLNKVTKRIPPKTSTVNLELSTYVTSIVNQCRAYANSRQIQLTVGECSD 
CVSCRINENIMTAALQHLINKMILISESGCCISINVTHTMNSWQLQISNNEIAGQRTEKM FPFIPIIFPVYGYSDLWTVRKIIRLHGGKITGCRHGKAATFQIVIPTDCHCLNQSCPVLR HSSTKTKTPTNDSCENPKTNKQNTKTRETSHILLVMADKLFSDYLKKTLSRYFQISVLDN PELLMDTAISQNPDAIIIDDNVNGISGDILSTQIKEDKTIGYIPIILLLRAFDNESYLSH LGSGADRLGLRTESICKLRADIRMLVENRMVLRERIRLFLSDAISPMIPTKAEIETENAD RKFMDKVNKILEKNLSTDKYTIDKLSIDIGMSRTAFYNKIREITGNPPENYINSFKMDKA LKLLASQQYSISDIAGILGYCDAKYFGKKFKDFYHVCPSDYIKSIVG >gi|225935386|gb|ACGA01000006.1| GENE 42 53280 - 53894 614 204 aa, chain - ## HITS:1 COG:AGl502_1 KEGG:ns NR:ns ## COG: AGl502_1 COG1793 # Protein_GI_number: 15890358 # Func_class: L Replication, recombination and repair # Function: ATP-dependent DNA ligase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 183 3 157 546 171 52.0 1e-42 MAKNSPDEGKSIKMVEKQVDPSFNNLKEYQAKRQFSETPEPMADTVETPSRIPVFVVQKH EASHFHFDFRLEVDGVLKSWVVPKGPSMNPKDKRLAIQVEDHPLSYAHFEGVIPEGNYGA GTVEIWDSGTYAYVGNNRNISAAIKNGILEFKLHGHKLKGLFTLIHTNMDDQDRDWLLIK KDDVFAVTHVYDAKNIPLYDEVFL >gi|225935386|gb|ACGA01000006.1| GENE 43 53929 - 54120 87 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIHCHNNQHICLLPEFIYFNGFKTVFKFANLFFYSFSPFSQECIPQLSCIMFVYSKKAGK YVT >gi|225935386|gb|ACGA01000006.1| GENE 44 54157 - 54948 957 263 aa, chain - ## HITS:1 COG:no KEGG:BT_1007 NR:ns ## KEGG: BT_1007 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 263 1 263 263 356 83.0 4e-97 MKSFNVTSMLVVLMFLCFTGIQAQTVTPSKKYITKELNNVSNFSSIRVLGSPDVEYRQSS DSKTTVSIYGSDNLVDLLEVSTVNGVLQVNIKKGVKILSGERRLKVIASSPSLDDVDIKG SADVYLKGTLKGADLNLNITGSGDIEAENLQYTNLSAFVKGSGDINVKNVKATTVKTIVS GSGDVEMKGSTQMAMLTVNGSGDISADKLTATNVVATVSGSGDISCYASKQLDAKANGSG DIEYKGNPSIVNKQGKKDSISGK >gi|225935386|gb|ACGA01000006.1| GENE 45 55043 - 55858 660 271 aa, chain + ## HITS:1 COG:PA1713 KEGG:ns NR:ns ## COG: PA1713 COG2207 # Protein_GI_number: 15596910 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 36 270 32 264 278 75 23.0 7e-14 
MQPKETIIFKYSDIIFSFLVNDDTVCTHKAVSHLLIYVYSGKMSITEQGDELTVTAGECV FIRRDHKVVITKGPYNGEQYKGITMQFNRDFLIKRYKALDKKEICAQLAPFQNSVVKLSK SADIDSIFLSMIPYFDSSNKPNDELMKLKLQEGLLTLLYMDKSFYPTLFDFTGPWKIDII NFLNENYMYDLSIGEIALYTGRSLATFKRDFKKISNLSPQKWIMQKRLNVAYEKIKEGGE KIADVCFDVGFKNRSHFTTAFKKQFGFTPTH >gi|225935386|gb|ACGA01000006.1| GENE 46 55988 - 56896 845 302 aa, chain + ## HITS:1 COG:BMEI1884 KEGG:ns NR:ns ## COG: BMEI1884 COG1073 # Protein_GI_number: 17988167 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Brucella melitensis # 3 301 2 307 307 317 55.0 2e-86 MKKNVTFKNRAINMAGNLYLPNDFSDNKKYAAIVCVHPGSSVKEQTAGIYAGKLAAHGFV TLAFDASYQGESGGEPRYIEEPTARVEDIRSAVDYLTTLEFVDPDRIGVLGVCAGGGYSV NAAMTEHRIKAVGTVVGANIGRIYRENNPIQTLEAIGKQRTAEANGAEPMIINWTPSSPE EREKQGITDIDIVEAVEYYRTPRGEYPTSCNKLRFVSMASVLAFDAFNLVEQLLTQPLQI IVGDKQGAFGSYKDGQELYRRATCEKDLLVIEGASHYDLYDIPKYVDQAVKKLTEFYGKH LK >gi|225935386|gb|ACGA01000006.1| GENE 47 57079 - 59553 1936 824 aa, chain + ## HITS:1 COG:no KEGG:BT_1010 NR:ns ## KEGG: BT_1010 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 824 1 824 824 1614 91.0 0 MEMNKKRLIGYLFVLMSVGCIHAQETNVSAQEYKLWYDCPAQVWTEALPLGNGRLGAMVY GTPGTEQIQLNEETIWAGRPNNNANPDALEYIPKVRELVFAGKYLEAQTLATEKVMAKTN SGMPYQSFGDLRIAFPSHTRYSNYYRELSLDSARVIVRYEVDGVQYQRETITSFTDQVVM VRLTANRPGQITFNAQLTSPHQDVMIASEEGNCVTLSGVSSLHEGLKGKVEFQGRLTAKN KGGEIACADGILSVEKADEAIVYVSIATNFNNYQDITGNQIERAKNYLEKAMVHPFIESK KNHIDFYRQYLTRVSLDLGKDQYSNVPTDKRVENFKNTNDAHLVATYFQFGRYLLICSSQ PGGQPANLQGIWNDKLFPSWDSKYTCNINLEMNYWPSEVTNLSELNEPLFRLIKEVSDTG KETAKVMYGANGWVLHHNTDIWRITGAVDKAPSGMWPSGGAWLCRHLWERYLYTGDIEFL RSVYPILKESGRFFDEIMVKEPVHNWLVVCPSNSPENVHSGSNGKATTAAGCTMDNQLVF DLWTTIISASQILDTDQEFATHLAQRLKEMAPMQVGHWGQLQEWMFDWDDPKDVHRHISH LYGLFPSNQISPYRTPELFDAARTSLIHRGDPSTGWSMGWKVCLWARLLDGDHAYKLITD QLTLVRNEKKKGGTYPNLFDAHPPFQIDGNFGCAAGIAEMLMQSYDGFIYLLPALPTVWQ EGSIKGIIARGGFELDLSWKNGKVSRLVVKSHKGGNCRLRSLNPLTGKGLKRAKGENPNP 
LYAVPTIPEPLINEKANLNKVEIAETYIYDLPTKAGQEYILIGK >gi|225935386|gb|ACGA01000006.1| GENE 48 59591 - 60769 1135 392 aa, chain + ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 49 389 23 361 361 311 43.0 1e-84 MVACGTVCFLACTAPQKVETEKWSERMARSEMKRFPEPWMIEKAKKPRWGYTHGLVVKSM LEEWKHTGDSTYYDYAKIYADSLIDADGCIKTMKYLSFNIDNVNGGKILFDLYAQTGDER YKIAMDTLRKQMTEQPRTSEGGFWHKLRYPHQMWLDGIFMASPYLVQYGATFNEPALFDE AKKQILLINSKTYDPATGLYYHGWDESREQKWANPETGCSPNFWSRSIGWYGAAIVDVLD FLPQETAGRDSIIQILQGLAKTIVKYQDPQSGTWYQVTDQGAREGNYLESSATALFVYTL AKAINKGYLGNEYIEPTKKAFDGMVKTFTRLEEDGSYTITNCCAVAGLGGDSKRYRDGSF EYYISEPIIENDPKSVGSFILAAIEYEKLTTK >gi|225935386|gb|ACGA01000006.1| GENE 49 60793 - 62256 1516 487 aa, chain + ## HITS:1 COG:no KEGG:BT_1012 NR:ns ## KEGG: BT_1012 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 475 1 475 483 907 95.0 0 MRKFKILPLLLLLLTLVIPAAAQKKTQKTYIPWNNGKLVVSEEGRYLKHENGAPFFWLGE TGWLLPERLNRDEAEYYLEQCKRRGYNVIQVQTLNNVPSMNIYGQYSMIDEYNFKNINQK GVYGYWDHMDYIIRTAAKKGLYIGMVCIWGSPVNRGEMTVDQAKAYGKFLAERYKDEPNI IWFIGGDIRGDVKTAEWEALATSIKAIDKNHLMTFHPRGRTTSATWFNNAPWLDFNMFQS GHRRYGQRFGDGDYPIEENTEEDNWRFVERSLAMKPMKPVIDGEPIYEEIPHGLHDENEL LWKDYDVRRYAYWSVFAGSFGHTYGHNSIMQFIKPGVGGAYGAKKPWYEALDDPGYNQMK YLKNLMLTFPFFERIPDQSVIVGQNGERYDRAIATRGNDYLMVYNYTGRPMEVDFNKISG AKKNAWWYTTKDGKLEYIGEFDNGVHKFQHDSGYCSGNDHILIVVDSSKNYMEKDWTELP DRQGAGA >gi|225935386|gb|ACGA01000006.1| GENE 50 62303 - 66175 3117 1290 aa, chain + ## HITS:1 COG:STM1252 KEGG:ns NR:ns ## COG: STM1252 COG4692 # Protein_GI_number: 16764604 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 928 1285 21 344 347 154 34.0 1e-36 
MLMNKRIFILFLFSLLLFTGSIYAAIGITDLRTEQLKNPAGIDVRQPRLSWRIESNEQNV IQTAYHILVASTPELLEQGKGDIWDSGKIESDASQWIAYQGKTLKRNAPYYWKVKIDTNK GTTDWSTPAFWTTGLFNEADWQGQWIGLDKAAPGDSETQWSRLAARYLRKEFALKKEVKR ATVHIAGMGLYELFINGQRIGNQVLAPAPTDYRKTILYNTYDVTSLLQTENAIGVTLGNG RFYTMRQNYKPYKIPTFGYPKLRLNLIVEYADGSKETIATNTTWKLTTDGPIRSNNEYDG EEYDARKELGNWTQTGYDDKSWMPAQRVSIPSGTLRAQMMPGMKVTETLKPVSIKKLGNK YILDIGQNMAGWVRFRVKGQAGDSIRLRFAESLQSNGELYTRNFRDARSTDVYVVSGRET KDATWAPRFIYHGFRYVEVSGYPNAKAEDFVAEVVEDEMEHTGTFSCSDETLNKIIRNAF WGIRSNYKGMPVDCPQRNERQPWLGDRTMGCWGESMLFDNYAMYTKWTRDIREAQREDGC IPDVAPAYWNYYSDNVTWPAALPMACDMLFTNFGDKRPIEENYPAIKKWVSHIREYYMTE EYIITKDKYGDWCVPPESLELIHSKDPSRKTDGDLIATAYYLKVLQLMHRFASLQGLKAD AEEWEDLEHRMKDAFNARFLHVKESTSPVPGHALYPDSIFYGNNTVTANILPLAFGLVPK NQINAVAQNAVASIITTNKGHISTGVIGVQWLLRELSRRGHADVAYLLATNKTYPSWGYM VDKGATTIWELWNGDTANPEMNSGNHVMLLGDLLPWCFNNLAGIRADRWKSGYKHIVFQP AFEIQELSNVDASYMSIYGKITSRWAKTPTHLEWDIELPANTTGEVHLPDGRKEKIGSGK YHFSVDIPTRNTAILSDEFLYEKASFPECHGATIVELKNGDLVASFFGGTKERNPDCCIW VCRKPKGSKEWTAPKLAADGVFSIKDSQAVLAGIDSTCTPVKDEKGKLIARRKACWNPVL FQIPGGDLILFYKIGLKVSDWTGWLVRSRDGGKTWSKREPLPEGFLGPIKNKPEYINGRI ICPSSTEGSNGWRVHFEISDDKGKTWKMIGPLDAELSVPTQNRKKGGMNVDDQEGGEAIR GEGAKPVYAIQPSILKHKDGRLQVLCRTRNAQVATAWSSDNGDTWSKVTLLDVPNNNSGT DAVTMKDGRHILIYNNFSTLPGTPKGPRTPLCVAISEDGINWKPVLTLEDSPISQYSYPS IIQGKDGKLHAIYTWRRQRIKYAEIDPTKF >gi|225935386|gb|ACGA01000006.1| GENE 51 66288 - 67175 195 295 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 53 285 3 238 242 79 26 4e-14 MADNYIERQQEQYEARKAAWKQAQKYGKKKSTTVHPTESASCTPMTSKSSPSKRRVFITG GAEGIGKAIVEAFCLSGDQVAFCDINEIAGQETAKATGSIFHKVDVSDKDALESCMQRIL SEWNDIDIIVNNVGISQFSSITETSVEDFDKILSINLRPVFITSRLLAIHRKEQSSPNPY GRIINICSTRYLMSESGSEGYAASKGGIYSLTHALALSLSEWNITVNSIAPGWIQTHDYD QLRPEDHSQHPSRRVGKPEDIARMCLFLCEENNDFINGENITIDGGMTKKMIYLE >gi|225935386|gb|ACGA01000006.1| GENE 52 67190 - 67972 653 260 aa, chain - ## HITS:1 COG:PA1640 KEGG:ns NR:ns ## COG: PA1640 COG1752 # Protein_GI_number: 
15596837 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 1 182 1 180 345 182 49.0 7e-46 MDKQKVALVLSMGGARGIAHIGVIEELLRHNFEITSIAGSSMGAMVGAMYASGKMEECKE WLYSWDKRKMWELADITLSRDGLVKGDRFIKELKQIIPDMNIEDLPVPYVAMATDIVRDQ EVRFDRGSLHEAIRASISIPMLFRPLRKDGMVLIDGGILNPLPLAHVKRIEGDILIAVDV NAPLDSGKKKKISPYNLLTESSRMMMQQITRYQIERCQPDILIQMSGDTYDMLEFHHAAS IVKTGVEVTRSVLKEFLDNE >gi|225935386|gb|ACGA01000006.1| GENE 53 68150 - 69040 453 296 aa, chain + ## HITS:1 COG:XF0250 KEGG:ns NR:ns ## COG: XF0250 COG0697 # Protein_GI_number: 15836855 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Xylella fastidiosa 9a5c # 1 267 7 278 304 130 31.0 3e-30 MKEAFIKLHISIIIAGATGVFGKLITLNEGLLVWWRMMFATLILAGILYFAKKLPKLPLK EILKISGVGVLLALHWIFFYGSIKASNVSIGVVCFSMVGFFTAFLEPLINRHRVSLKEVT FSLLTLFGIVLIFHFDTRYRMGITLGIISSALAALFTITNKKTAKNHASSALLLYEMFGG FIGVSCILPFYLHYFPVSTIVPGMGDLINLLLLASFCTVGLYLLQIQVLKKISAFTVNLS YNLEPIYSIILAMIIFNEAKELNIAFYSGLGMIILSVILQTLSVTFQKKFSIPQQS Prediction of potential genes in microbial genomes Time: Fri May 13 06:28:27 2011 Seq name: gi|225935385|gb|ACGA01000007.1| Bacteroides sp. D2 cont1.7, whole genome shotgun sequence Length of sequence - 42774 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 9, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 6 - 51 5.5 1 1 Op 1 . - CDS 85 - 2049 1869 ## BT_1017 hypothetical protein 2 1 Op 2 . - CDS 2106 - 3440 921 ## COG5434 Endopolygalacturonase - Prom 3513 - 3572 2.5 3 2 Op 1 . - CDS 3641 - 6460 1959 ## BT_1019 putative secreted hydrolase 4 2 Op 2 . - CDS 6498 - 9818 2857 ## BT_1020 hypothetical protein 5 2 Op 3 . 
- CDS 9851 - 10774 664 ## BT_1021 arabinosidase - Prom 10831 - 10890 7.3 - Term 10864 - 10913 9.4 6 3 Op 1 . - CDS 10980 - 11648 733 ## BT_1022 hypothetical protein - Term 11657 - 11698 3.0 7 3 Op 2 . - CDS 11709 - 13448 1685 ## BT_1023 hypothetical protein - Prom 13545 - 13604 9.1 - Term 13555 - 13599 9.4 8 4 Op 1 . - CDS 13626 - 15809 1900 ## BT_1024 hypothetical protein 9 4 Op 2 . - CDS 15824 - 19144 2445 ## BT_1029 hypothetical protein 10 4 Op 3 . - CDS 19185 - 20606 870 ## BT_1026 hypothetical protein - Prom 20702 - 20761 7.3 11 5 Tu 1 . - CDS 20783 - 21676 649 ## BT_1031 hypothetical protein - Prom 21905 - 21964 6.0 - Term 21890 - 21933 10.5 12 6 Op 1 . - CDS 21971 - 24244 1988 ## COG3537 Putative alpha-1,2-mannosidase 13 6 Op 2 . - CDS 24252 - 25220 904 ## COG2152 Predicted glycosylase 14 6 Op 3 . - CDS 25241 - 26566 1077 ## COG0477 Permeases of the major facilitator superfamily - Prom 26588 - 26647 6.4 - Term 26619 - 26655 3.0 15 7 Tu 1 . - CDS 26665 - 28848 1508 ## BT_1035 hypothetical protein - Prom 28870 - 28929 4.6 - Term 28916 - 28986 11.1 16 8 Op 1 . - CDS 29005 - 30138 908 ## BF1313 hypothetical protein 17 8 Op 2 . - CDS 30170 - 31303 1031 ## BF1328 putative secreted endoglycosidase 18 8 Op 3 . - CDS 31329 - 32900 1521 ## BF1327 hypothetical protein 19 8 Op 4 . - CDS 32915 - 35668 2334 ## BVU_0619 hypothetical protein - Prom 35739 - 35798 10.6 - Term 35867 - 35908 6.3 20 9 Op 1 . - CDS 35959 - 37140 968 ## BT_1045 hypothetical protein 21 9 Op 2 . - CDS 37162 - 38292 858 ## BT_1044 hypothetical protein 22 9 Op 3 . - CDS 38339 - 39976 1437 ## BT_1043 hypothetical protein 23 9 Op 4 . 
- CDS 39999 - 42743 2068 ## BT_1042 hypothetical protein Predicted protein(s) >gi|225935385|gb|ACGA01000007.1| GENE 1 85 - 2049 1869 654 aa, chain - ## HITS:1 COG:no KEGG:BT_1017 NR:ns ## KEGG: BT_1017 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 653 1 653 653 1180 89.0 0 MKRILFTMLLAASLSAEAQTQTYETEFARPLNEVLTDIQNRFGVRLKYDIDTVGKVLPYA DFRIRPYSVEESLTNVLAPFDYKFVKQKGNMYKLKAYEYPRRTDADGEKMLAYLNTLYAD RQAFELRVDSLKKEVRQRLGIDTLLAQCVKSKPILSKIRKFDGYTVQNFALETLPGLYVC GSIYTPQSKGKHALIICPNGHFGGGRYREDQQQRMGTLARMGAICVDYDLFGWGESALQV GSAAHRSSAAHTIQAMNSLLILESMLASRKDIDTSRIGTNGGSGGGTHTVLLSVLDDRFT ASAPVVSLASHFDGGCPCESGMPIQLSAGGTCNAELAATFAPRPQLVVSDGGDWTASVPT LEFPYLQRIYGFYNAKDKVTNVHLPKEKHDFGPNKRNAVYDFFADVFKLDKKMLDESKVT IEPESAMYSFGEKGALLPEGAIRSFDKVAAYFDKKAFADLKSDASLEKKAMDWVASLKLD DEKKAGFAATAIYNHLRKVRDWHNEHPYTTIPEGINPLTGKPLSKLDREMIADSAMPKEV HERLMKELRRVLTEEQIEQILDKYTVGKVAFTLKGYQTIVPNMTEEETAYVLEQLKLARE QAIDYKNMKQISAIFEIYKTKCEQYFNEHGRNWRQMFKDYVNKRQEEKKAQEKK >gi|225935385|gb|ACGA01000007.1| GENE 2 2106 - 3440 921 444 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 24 443 28 437 448 281 37.0 2e-75 MNLRTTFLVFLCFCATTILRAERVDMLKTGAKANGKALNTKLINSTIDRLNRGGGGTLFF PAGTYLTGSIHLKSNITLELEAGATLLFSDNFDDYLPFVEVRHEGVMMKSFQPLIYAVDA ENITIKGEGTLDGQGKKWWMEFFRVMIDLKDNGMRDINKYQPMWDAANDTTAIYAETNKD YVTTLQRRFFRPLFIQPVRCKKVKIEGVKIINSPFWTVNPEFCDNVIIKGITIDNAPSPN TDGVNPESCRNVHISDCHISVGDDCITIKSGRDAQARRLGVPCENITITNCTMLSGHGGV VIGSEMSGSVRKVTISNCVFDGTDRGIRIKSTRGRGGVVEDIRVSNVVMSNIKQEAVVLN LKYSKMPAEPKSERTPIFRNVHISGMTVTDVKTPIKIVGLEEAPISDIVLRDIHIQGGKQ KCIFENCERITMDDVIVNGEEIKM >gi|225935385|gb|ACGA01000007.1| GENE 3 3641 - 6460 1959 939 aa, chain - ## HITS:1 COG:no KEGG:BT_1019 NR:ns ## KEGG: BT_1019 # Name: not_defined # Def: putative secreted hydrolase # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 31 938 1 922 923 1690 86.0 0 MNIKKYLLLFLCLPCFMQAKTITISRLTCEMQEGLVVVEGSPRLGWVMESPENGTRQSAY EIDIREAFTGRSVWNSGKVYSTQSQLVSIEGADIRPDNQFNYSWRVRVWDETDAPSEWSR EAKFRSASSRLSEGKWIGAITRQNAHLPEGRKFHGGELKKPEVKAAWEAVDTLAKKSICL RRTFQVGDKTEGNTNRQSGKKIVEATAYVCGLGFYEFSLNGKKVGNSEFAPLWSDYDKTV YYNTYDVTEQLRHGENAVGILLGNGFYNVQGGRYRKLQISFGPPTLLFELVINYEDGTST TLRSDNDWKYDFSPVTFNCIYGGEDYDARREQKGWNQVGFDDSHWRPVVIQEAPKGILRP QMAPPVKIMERYDIQKVTKLNAEQIASASKSTKRTVDPSAFVLDMGQNLAGFPEITVRGK RGQKVTLLVAEALTDEGACNQRQTGRQHYYEYTLKGEGDETWHPRFSYYGFRYIQVEGAV LKGQKNLQRLPVLKNIQSCFVFNSAKKVSTFESSNRIFNAAHRLIEKAVRSNMQSVFTDC PHREKLGWLEQVHLNGPGLLYNYDLTAYAPQIMQNMADAQHANGAMPTTAPEYVVFEGPG MDAFAESPEWGGSLVIFPFMYYETYGDDSLIKKYYPNMRRYVDYLKTRAEMGILSFGLGD WYDYGDFRAGFSRNTPVPLVATAHYYMTVMFLIKAAKMVGNDFDIHYYTSLAHEILASFN KCFLNKETAQYGTGSQCSNALPLFLQMTQDSDEQGSYQPDAKLDEKVLMNLIKDVEAHGN RLTTGDVGNRYLIQTLARNGQHELIYKMFNHEEAPGYGFQLKFGATTLTEQWDPRQGSSW NHFMMGQIDEWFFNSLVGIRPSTTYQKFIIAPQPVGDLKYVKASYETLYGTIVVDWTSQN GIFTLNISVPVNTTAVVYLPGEKEPKEIQSGTYQLVCAR >gi|225935385|gb|ACGA01000007.1| GENE 4 6498 - 9818 2857 1106 aa, chain - ## HITS:1 COG:no KEGG:BT_1020 NR:ns ## KEGG: BT_1020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1106 1 1109 1109 2185 94.0 0 MNNSAKILFVLAAGWLTTTAFAQDRIHYTGKELSNPAYHDGQLSPVVGVHNIQLVRANRE HPDASNGNGWTYNHQPMLAYWNGQFFYQYLADPSDEHIPPSQTFLMTSKDGYHWTNPEIV FPPYKVPDGYTKESRPGVQAKDLIAIMHQRVGFYVSKSGKLITMGNYGVALDKKDDPNDG NGIGRVVREIKKDGSFGPIYFIYYNHGFNEKNTDYPYFKKSKDREFVKACQEILDNPLYM MQWVEEADREDPLIPLKKGYKAFNCYTLPDGRIASLWKHALTSISEDGGHTWAEPVLRAK GFVNSNAKIWGQRLSDGTYATVYNPSEFRWPLAISLSKDGLEYTTLNLVHGEITPMRYGG NYKSYGPQYPRGIQEGNGIPADGDLWVSYSVNKEDMWISRIPVPVEINASAHADDDFSNN KSIAELTDWNIYSPVWAPVSLEGQWLKLQDKDPFDYAKVERKIPASKELKVSFDLKAGQN DKGTLQIEFLDENGIACSRMELTDDGIFRMKGGSRFANMMKYEAGKVYHVEAVLSTADRN IQVYVDGKRVGLRMFYAPVAAVERIVFRTGVPRTFPTVDTPADQTYDLPNAGAQDSLAGY GIANVKTSSTDKDASSAFLKYADFSHYADYFNHMEDENIVQAIPNAKASEWMEENIPLFE CPQHNFEEMYYYRWWSLRKHIKETPVGYGMTEFLVQRSYSDKYNLIACAIGHHIYESRWL 
RDPKYLDQIIHTWYRGNDGGPMKKMDKFSSWNADAVLARYMVDGDKDFMLDMKKDLETEY QRWERTNRLKNGLYWQGDVQDGMEESISGGRKKKYARPTINSYMYGNAKALSCMGILSGD EGMAMKYGMRADTLRNLVENELWNTRHQFFETMRTDSSANVREAIGYIPWYFNLPDTTKK YEVAWKEILDEKGFSAPYGLTTAERRHPEFRTRGVGKCEWDGAIWPFASAQTLTAMANFM NNYPQTVLSDSVYFRQMELYVESQYHRGRPYIGEYLDEVTGYWLKGDQERSRYYNHSTFN DLMITGLIGLRPRLDNTIEVNPLIPVDNWDWFCLDNVLYHGHNLTILWDKNGDRYHCGKG LRIFVDGKEVGYADTLTRIVCKDVLK >gi|225935385|gb|ACGA01000007.1| GENE 5 9851 - 10774 664 307 aa, chain - ## HITS:1 COG:no KEGG:BT_1021 NR:ns ## KEGG: BT_1021 # Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 307 1 281 283 548 92.0 1e-155 MKKLRNTLAAISLLFLASCGGNKDYYMFTSFHEPADEGLRYLYSEDGMHWDSIPGIWLKP ELGQHQLMRDPSMVRTPDGTYHLVWTTSWKGDLGFGYAHSKDLIHWSEQQMIPVMADEPT TINVWAPEIFYDDESEQFMVVWASCVPGRFEKGIEEENNNHRLYYITTKDFKTVSKAKLL YDPGFSTIDAVIVKRAKNDYVMVLKDNTRPERNLKIAFSDSMTGPYSPASQPFTESFVEG PSVEKLGDDYLIYFDVYKKKIYGAMRTKDFRNFTDVTEEVSIPVGHKHGTIFKAPESLVK ALLEEKK >gi|225935385|gb|ACGA01000007.1| GENE 6 10980 - 11648 733 222 aa, chain - ## HITS:1 COG:no KEGG:BT_1022 NR:ns ## KEGG: BT_1022 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 222 1 222 222 358 86.0 1e-97 MKKILILSLLFIFSWMSAQAVDLNKENRDPEYVKSIVGRSQKIVDKLGLTDAKIAEDVRN IIANRYFELNDIYEVRDAKVKKVKDSGLTGDAKNEALKAAENEKDAALYRSHFAFPADLS LFLDEKQIEAVKDGMTYGVVKVTYDSHLDMIPTLKEEEKAQIYAWLIEAREFALDAENSN KKHAAFGKYKGRINNYLAKRGYDLKKEREEWYKRVKARGGSL >gi|225935385|gb|ACGA01000007.1| GENE 7 11709 - 13448 1685 579 aa, chain - ## HITS:1 COG:no KEGG:BT_1023 NR:ns ## KEGG: BT_1023 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1088 92.0 0 MKKKHVLLAAFAAAMLTPTVAWAQYPQITDEAKENYKKMMTEERKRSDEAWAKALPIVQK EAKEGRPYIPWASRPCDLPQANIPAFPGAEGGGMYSFGGRGGKVITVTNLNDRGPGSFRE ACETGGARIIVFNVAGIIRLESPIIVRAPYVTIAGQTAPGDGVCIAGESFWVDTHDVVVR HMRFRRGETKVWHRDDSFGGNPIGNIMIDHCSCSWGLDENISFYRHMYDPSEGQYESKDL 
KLPTVNVTIQNTISAKALDTYNHAFGSTLGGENCAFARNLWASNAGRNPSIGWNGIFNFV NNVVFNWVHRSSDGGDYTAMFNMINNYYKPGPATPKDSNVGHRILKPEAGRSKLNYKVYG RVYADGNVMEGYPAITADNWKGGIQIEDQPNTDGYTENIRTYQPFEMPYIKVMGANDAYD YVLKHAGATIPCRDIVDERVIEEVKTGIPYYEKKLPKDAYGDLTGLSPKSMGEDGQFKYR RLPKDSYKQGIITDIRQMGGFPEYKGTPYVDTDGDGMPDEWEIANGLNPNDPSDANKDCT GDGYTNIEKYINGISTKHKVDWRDMKNNYDTLAEKGKLM >gi|225935385|gb|ACGA01000007.1| GENE 8 13626 - 15809 1900 727 aa, chain - ## HITS:1 COG:no KEGG:BT_1024 NR:ns ## KEGG: BT_1024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 727 10 765 765 719 51.0 0 MINRITKFSAVAIMATTALFFQGCSDEFLQDKKAYGAVTSEVYNDYEGAKGRVNILYQYL LPSSTSGIGFDTPSAGVADECSQSTEEYGGFSVFIDPTTELDYTTVPDYAYREMKNTSPY GRIRECNDIIEGVTASTLTEEQKRELLGQAYFFRAWCYYRLVKIYGGMPIIDHVQTPMLN DDGGLDLSIPRSTTKDCIDFICDDLETASNYLPVNWNTTLSADYGRVTAGTALALQGRAR LLYASPLFNRGDDEARWDLAYQSNEAAVNKLTEGGFGLAYAGNPGRNGGNWAKMFSDYQS PEAVFVTLYNTIKESAGTSYNKNNSWENSIRPKNTYGGGGKATTAQMVDLFPMADGMKPG ESKYAYDSKCFFLNRDPRFYRTFAFPGVMWRFKGDPTSKGVDYPYTGDEYVLWNYTWYDD ENKQIAEDQSGYGSDGLATNVKGVYVRKRSDDLGVNSSPLYNYDLTVSEPFKWSGAPYME MRYAEVLLNLAESACGIKNYDKAISILKDIRLRAGYSESDVAAYGDKLERMDRGELFGAI LYERQIELAYEGKRFDDMRRWMLWDGGVGQEVLKSTWKLSGFNGNTCNYLKVTPFNETHR RTGLEVCVNTSFGTAEAKMDADPIVKAGITRPVLNLDDSKTVLTEKASEGATNTAIDNLA DFYRANLYRKTTRVDGDPQYEINFLPKYYFIGFKQNFMQNNVTLLQNIGWNDYNQSGALG TFDPLAE >gi|225935385|gb|ACGA01000007.1| GENE 9 15824 - 19144 2445 1106 aa, chain - ## HITS:1 COG:no KEGG:BT_1029 NR:ns ## KEGG: BT_1029 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1106 1 1102 1102 1229 57.0 0 MQKKYLFLKCCLCFLMFFCVSLQAQQARSVSGVVVDEIGEPIIGAAVKIDGTTLGTITDM DGKFVISVPPKGKITISYIGYTSQTISDFKDMRIVLKEDLMKLDEVVVVGYGTQKMKNVT GAISTVTPTEISDLSVASLGNALGGVVNGLGVSGGNTRPGEGASLTIRQNDVASSFSTGG SSSPVYVIDDYIQDETAFNNLDASVIESITVLKDAAAAVYGARAAQGVILVKTKRGQIGA PKISYNGQFGYTDEVSRAKLLNAYDYGVLWNGVASAPTTDGSINNRTDLFQADELAAMRN LNYDLLKDEWSAAFTQKHSINVSGGTDRATYFAGMSYYTQDGNLGRLDYDRWNYRAGVDA 
KISKWLKASLQVSGDYGNQTKAYNKVGGENSDTDYNTLLTHLRYIPSSVAGHSMAALGVS NSEVKDIQAYNYRSIQDLNNYSNNRTSNMTLNSSVEYDFGWSSIFKGLKMKFSYSKSINN TKANQLGTNVTVYKMLNRGGSGQHLYTGDDLDLSEANFSAITKSNGNLIRRQMDRADQYQ MNFNVSYARKFGLHDVSALFSIEKSEAESEYVWGQVLDPLSFNDGQSKTASGTQSTEFSR SESGMLSYIGRVNYSYADRYLLEFLIRSDASTKFAPENYWGVFPSLSLGWVVSEESWFKE HVKGIDFLKVRGSFGMLGKDNIAAWSWLTLYNRDANKGAVFGTSASNNIGAGISTNVNNG IPNRDAHWDKSYQTNLGFDIRTLDSRLSINLDGYYNMHRDMFMSPQGSAALPSTVGTKPA PTNYGSVDTYGVEISLGWKDNIGKDFKYWVKLNTGYSDNKVKKMYWPTTIAQDQQHPNER TDLGTWGYQCIGMFRSYQEIEEYFEKYMKKPDGTYGTYLGKTKDAVFPGMLIYKDIKGPL NSDGTYQAADHSVTDVDKVKISNRSNPYSFTLNFGGSWKGLSFSAQLGASWGGYSFVPKE ARGVYSPITKQTDYKALEYYNMPSFWSNNMYVYENVYDAQGNIVAAQNRDAKYPNLMYSL NSETSTFWRISGTRVTLRNLTVAYSIPNEWLRKLGVGIESCRFNLTGQNLLSLYNPYPDN FMDPLTGNYGVYPVLRKFTLGVNISF >gi|225935385|gb|ACGA01000007.1| GENE 10 19185 - 20606 870 473 aa, chain - ## HITS:1 COG:no KEGG:BT_1026 NR:ns ## KEGG: BT_1026 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 472 25 476 481 288 34.0 3e-76 MKSKMKIKMRVVRNLWYLPMVLFIASCAKGFDNNETFSSGVSNTQLLSPAEDGITFSKSA DEKTFTVKWPVVLGASAYQVLLQIVDDPENPETVLDKTVDGCSLEVEMKEDTKYQVAVKS LGNVEYNNADAKESSLAQWSSLIPTIATIPSGSDIAEWFAANPIQDTGAEMAYDLEANGT YTLNSPLDFAGYWITFRGDKIHHPKVTMGTNGLIKTRYGLKIKYIDFDCTGVDQAAQYGS FLTMSETPDESTLVDGKWYPMLMPIVIQSCNIKGIARSLVYNGGKNYIISGLYIKDCIVE LNTAQDSKKPVISFTDGGRGCINDLEISNSTFYDLSKGGNFFIQYANNAAGSTNLSSLFS SSSVKLLNNTFYNVAYSKNFANYSGMKNSKVTITVMNNIFQHCGTVQKLFNSGNKQFSNN AYYEEVDGDVNKQSTDGSEFQEDPGFTNPISGNFAVSSSTIIAKKVGDPRWLQ >gi|225935385|gb|ACGA01000007.1| GENE 11 20783 - 21676 649 297 aa, chain - ## HITS:1 COG:no KEGG:BT_1031 NR:ns ## KEGG: BT_1031 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 297 3 296 297 520 86.0 1e-146 MKTRRNLFKHILWILILAECFPLLAIAGSQQKEQRYKIAVCDWMILKRQKIGSFQLVHEL NGDGVELDMGGLGKREMFDNKLREPHFQQLFRETAQKYQLEVSSIAMSGFYGQSFLERAN YKELVQDCLHAMRVMNTKVAFLPLGGIETGWEKNSALRTEVVKRLKEVGDMAASEGVVIG 
IETQLDAKGDVKLLKEINSPGIKIYFKFQNALENGRDLCKELKILGKKNICQIHCTDTDG VTLPFNERLDMNKVKKTLDKMGWSGWLVVERSRDKDDVRNVKKNYGSNIKYLKEVFQ >gi|225935385|gb|ACGA01000007.1| GENE 12 21971 - 24244 1988 757 aa, chain - ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 35 756 35 752 770 335 32.0 2e-91 MKRIILTYTLAFSLLSASAGEGGSPPLKNTDKGLLIDYVDPFIGTTNFGTTNPGAICPNG MMSVVPFNVMGSSENTYDKDARWWSTPYEYTNCFFTGYAHVNLSGVGCPELGSLLLMPTT GELNVDYKEYGSKYKDEQASPGYYSNYLTKYNVKTEVSATPRTGIARFTFPKGKSHILLN LGEGLTNESGAMLRRVSDCEVEGVKLLGTFCYNPQAVFPIYFVLRVKKNPSATGYWKKQR AMTGVEAEWDPDQGKYKLYTRYGKEIAGDDIGTYFSFDTEEGEQVEVQMGVSFVSIENAR LNLDREQARKNFEQIHAEARSKWNDDLSRITVEGGTDAQKTVFYTALYHLLIHPNILQDV NGEYPAMESDKILTTKGDRYTVFSLWDTYRNVHQLLTLVYPERQMEMVRTMLDMYREHGW LPKWELYGRETLTMEGDPSIPVIVDTWMKGLRDFDVDLAYEAMYKSATLPGAENLMRPDN DDYMSKGYVPLREQYDNSVSHALEYYIADFALSRFAEALGKKKDAEIFYKRSLGYKHYYS KEFGTFRPILPDGTFYSPFNPRQGENFEPNPGFHEGNSWNYTFYVPHDVYGLAKLMGGKK PFIDKLQMVFDEGLYDPANEPDIAYAHLFSYFKGEEWRTQKETQRLIDKYFTSKPDGIPG NDDTGTMSSWAIFNMIGFYPDCPGLPEYTLTTPVFNKVTIRLDPKWYKEKELIIETNRTQ PGTLYIDKVLLNGQKFNKYHITHDELVHGKYLKFELK >gi|225935385|gb|ACGA01000007.1| GENE 13 24252 - 25220 904 322 aa, chain - ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 6 321 11 326 326 375 58.0 1e-104 MNKIQIPWEDRPVGCTDVMWRYSQNPVIGRYHIPSSNSIFNSAVVPFEDGFAGVFRCDNK AVQMNIFTGFSKDGIHWDINHEPIQFKAGNTEMIESEYKYDPRVTWIEDRYWVTWCNGYH GPTIGIAYTFDFKEFFQCENAFLPFNRNGVLFPQKIDGKYAMLSRPSDNGHTPFGDIYIS YSPDMKYWGEHRCVMKVTPFPESAWQCTKIGAGSVPFLTDEGWLLFYHGVITTCNGFRYA MGAAILDKDHPEKVLYRTREYLIGPAAPYELQGDVPNVVFPCAALQDGERVAVYYGAADT VVGMAFGYIKEIIDFTKRTSII >gi|225935385|gb|ACGA01000007.1| GENE 14 25241 - 26566 1077 441 aa, chain - ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 
COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 56 441 53 411 492 98 22.0 3e-20 MQQNKTASQRNTISPACWVPTAYFAMGLPFIAINLVSVFMFKDLGISDTQIAFWTSLIMM PWTLKFLWSPFLEMYKTKKFFVLITELLSGVLFGIVAFSLFFDYFFAISISTMAVIAFSG ATHDIACDGVYMAELNKEDQAKYIGVQGAFYNVAKLVANGGLVAMAGALAEHFGAIEGAS IDANKGAYSSAWMIIFGVIAAIMVLIGIYHIRILPSTQVPVTTKKSTSEIKKELVAVIGN FFTKKHIYYYIAFIILYRLAEGFIMKIAPLFLRASRDVGGLGLSLTEIGTLNGIFGSAAF VLGSLLAGIYVSKFGLKKTLFTLCCIFNLPFVAYTFLAVAQPTNVYLIGTCITMEYFGYG FGFVGLTLFMMQQIAPGKHQMSHYAFASGIMNLGVMLPGMVSGYLSDLLGYRNFFIYVLI ATIPAFLITYFIPFTYDDSKK >gi|225935385|gb|ACGA01000007.1| GENE 15 26665 - 28848 1508 727 aa, chain - ## HITS:1 COG:no KEGG:BT_1035 NR:ns ## KEGG: BT_1035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 727 11 734 734 1300 83.0 0 MKHWLLSVLLVVCGAVSAQVTLVKDGKSNARIIVQDKMPDTKTSAQLLQHFLYHISKSKL PIENDKTPRKGDILIGGQTPQTVTEDGFSISTQDGILRISGKEKGVVYGVVTLLEQYLGV DYWGENEYSFTPQKTVILPLINKVENPAFRYRQTQCYAIRTDSIYKWWNRLEEPNEVFAA GYWVHTFDKLLPSSVYGKEHPEYYSYFKGKRHPGKASQWCLSNPEVFEIVAQRVDSIFKA NPDKHIMSVSQNDGNYTNCTCDACKAIDDYEGALSGSVITFLNKLAARFPDKEFSTLAYL YTMNPPKHVKPLPNVNIMLCDIDCDREVTLTENASGKEFVKAMEGWSKITNNIFVWDYGI NFDNYLAPFPNFHILQDNIRLFKKNHATMHFSQIAGSRGGDFAELRAYLVSKLMWNPEAN VDSLMQHFLHGYYGEAAPHLYQYIKVMEGALIGSGQRLWIYDSPVSHKYGMLKPALMRRY NHLFDLAEKAVATESDFLKRVQRARLPIQYSELEIARTETEKDLADINKKLDLFEERVKE FQVPTLNERSNSPIDYCKLYRERYMPQKENSLALGAKVTYITPPTGKYAALGKTALVDGL FGGATFVDSWIGWEGTDGAFVIDLGETKEIHSVETDFLHQIGAWILFPLKVVYSYADDGE HYTHWNTIDLQEERTGEVKFRGVKAESAEPVKTRYVKVEVTGTKECPTWHYGVGHPSWFF IDEVIIK >gi|225935385|gb|ACGA01000007.1| GENE 16 29005 - 30138 908 377 aa, chain - ## HITS:1 COG:no KEGG:BF1313 NR:ns ## KEGG: BF1313 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: 
not_defined # 1 377 3 379 380 299 45.0 1e-79 MKYIYLSSLMIAFSTVFFMGCENAEYKAQTNSLYITDAADIATAATITMDDGADINVVVR LAQKVTEDVEVEIQFVPELLTEYNTISGTEYEVLPANMLPQNATVTIPAGEISANYRLHV NDFATNGITYAIPLALGNVIKGDVTKSKAQSKFIYVLAKPLNVSAPVLRGNAGNVSTLPA TDWGITVDAWTLEAWVRMDGYKKNNQAIFTSGSGDHEIYIRFGDANKPYNYLQIKTLGGQ KQTASDLEVNTWYHWAFVYNGTTLTIYKNGKQETFFDPPAPKGGNVRIDYMKIVDSGSYF QNNCTMSQVRFWKVARTESEIANNMYFEVNPKNPNLIALWPMDEGEGTAFRDATGNGHDA TSNGALQKWEHNIRFDK >gi|225935385|gb|ACGA01000007.1| GENE 17 30170 - 31303 1031 377 aa, chain - ## HITS:1 COG:no KEGG:BF1328 NR:ns ## KEGG: BF1328 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.fragilis # Pathway: not_defined # 10 359 11 342 350 122 32.0 3e-26 MKHIKKMLGFAFAATFFVGCTDVESINDIDLNQTTKTDEYYAALREWKETPGLPQVFVWF DNWTGTSPTGENSLRGLPDSITIASNWGGHPKFDLTPERKADMEYVQKVKGTKVVVTLFS QNIGDDLPDKEIYHEAGTSSDPEKVRPAIKAYAEDIYKACLECGYDGYDWDYEPGGGMGV GPLWSNKVQRTIFVEELSYWFGHGAMDPNRDRGDRAMPEKRLLFLIDGEVGIRTGMDKEW LSYYVDYFVLQAYGGMADYRMKGVLDDMSDWIQQGIITPDEIVRRTIATENFESYANTGG GFLGMSNYVYKTTYTTGGKSYEIDQLIGGCGLYRVGFDYAQGKGDYDGSPEYYFLRQGIT NIYKNFYSRMNATANEE >gi|225935385|gb|ACGA01000007.1| GENE 18 31329 - 32900 1521 523 aa, chain - ## HITS:1 COG:no KEGG:BF1327 NR:ns ## KEGG: BF1327 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 523 5 514 514 402 44.0 1e-110 MKRFNIIKTVSAAAFTTTLLFTSCTGNFDELNTHPTDLYPENMTPTERVGTLFVAMTRLL NACQENNSQHTEQMVGQYGGYFATTAPWNGTNFGTFNPSADWVSVPYKDMFTEFYPNFQT VKESTGGSGYIYAWASILRVGVMIRVADIYGPIPYSEMGKGEFQVAYDDVKTLYHNMISD LTRSIEVLSAFVQENAGKEVPVAEYDIIYNGDFSKWIKYANSLKLRMAVRIASNTEDTEY AKQIMAEAIEGGVIESNGDNAFFPAMPYNPFEIASGWGDLAINATLSAYMTGYSDPRISK YMNTATSYRDYRGVRMGISDTSKAVEGSAARYSKPAFTADSPMPVFYAAETYFLKAEAAL QGWIAGGDAAAKSYYEQGIQMSMSQYGVEIGDYLSSTVVPQGYTDRLNSKNNISFTSVPS VAWGDGTNKLQKIITQKWIANYPMGIEAWCDYRRTGYPELFTARDNLSSAGYIGDIDSKR MVRRLPYPTTEKSSNSANVEAAIATMLGGPDTGSTDLWWAKKN >gi|225935385|gb|ACGA01000007.1| GENE 19 32915 - 35668 2334 917 aa, chain - ## HITS:1 COG:no KEGG:BVU_0619 NR:ns ## KEGG: BVU_0619 
# Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 917 117 1033 1033 945 54.0 0 MGEDTKVLDEVVVTALGIKRSEKALTYNVQQVKGDELTTVKSTNFMSALAGKAAGVQINS SASGPGGGVKVVMRGAKSITQNNNALYVIDGIPMYNRASSGGDGAMATQPSSESIADINP DDIESMSLLTGPSAAALYGYEGANGVVLITTKKGRADKTTVTYSNNTTFSDPMMMPKFQN TYGNLAGETMSWGLETDKRYDPANFFNTGSNISNALSLSTGNDKSQSYISVATTNAAGIM PNNDYDRYNFSFRNTTNFLNNKFILDTSANYIIQNDKNMVSQGQYFNPLTALYLFPRGED FNDVTVYERYDDLSGVSTQYWPYGDLGGVSSQNPYWIMNRMNRENDRRRYKLAASLQYNI IEGLNVQGRVSVDNTESKYTEKFHAGTVGNFAGTKGRYKEERRMDRQIYADVLANFNKTF KDFNVMANLGASIKDLRMESSSLHGDLNNVTNHFTIENLTRSGYYKVDANGLKRQTQSVF ANAEIGWRSFLFLSLTGRADWDSSLALSDRGNKAFFYPSVGISSILSEVIDLPEWFSFLK ARFSFTSVGNAYDPYLTKARYIYDDQLDQYKTESLYPNSDLKPEITKSYEAGLNMRFFKG ALNFDITYYNSDTRNQTFAAPLPASAGYTAVNIQAGSVQNQGIEMALGYNNEWRDFGWST NFTFTYNKNKIKRLADGVTNRITGEPIEMPYVDKARLGGSTSPVVRLTEGGSMGDLYINR DFKRDNNGYIYLDAKTGLPSMVETEYKKIGSLLPKVNLGWRNSFTYKGVRLNVLLSGRFG GLVVSNTQAFLDRCGVSEYSAQLRQTGGVTINDRNVSAKDYLGIVAAGTGEGDHYVYNAT NIRLQEVSLEYTLPKKWLYNIADVTLGLIGNNLAMIYCKAPFDPELVASSSSTFYTGVDY FMQPSLRNMGFSVKVQF >gi|225935385|gb|ACGA01000007.1| GENE 20 35959 - 37140 968 393 aa, chain - ## HITS:1 COG:no KEGG:BT_1045 NR:ns ## KEGG: BT_1045 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 393 14 395 396 451 57.0 1e-125 MKKIVMTLFGTLLLAGCQNELYTDTQKDFGSDQGAYIAADGPIQIFVEEGKEYFIKDIKV GLAVKENKACEVPLTVGDQAQLDEYNKKNGTSYLMLPPEMYEIPSVMTFKPNLAMQSIPV ALKDLKFSLKGDYALPIRLSKGSVTLVPNEEEALIILEKRTRTKALLMNNAGSESGDMFP QDFKVKQWTMEVMIKRSAYNANNKSIGGTKVAPNSSPLDEIFTRFGDVTINPNQLQIKTG ASQIDVPSNKFAAKANEWYMLTFVYDGKMTLVYVNGDLVASREIRIGEYGLTGFWLGGSN EFVREVRFWDIARTAQQIKGYTWKMVNPDEQGLLLYYPCNGQKRDHETGVISDDATKIWN WATYYNGDKSKLDLPMVGKFDDNNGEMYVFPAE >gi|225935385|gb|ACGA01000007.1| GENE 21 37162 - 38292 858 376 aa, chain - ## HITS:1 COG:no KEGG:BT_1044 NR:ns ## KEGG: BT_1044 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 364 364 593 75.0 1e-168 
MKLRNYLYIGIAVLGFTACSDWTDAEREVFPEQEEINRFIPLIDAQNEADLMPSQREYYR QLREYRTTPHVKGFGWFGNWTGKGTNAQNYLKMLPDSVDFVSLWGTRGRLSDEQKNDLKF FQEVKGGKALLCWIIQDLGDQLTPQGLTATEYWVTDKGKGDFQEGVKAYANAICDSIEKF NLDGFDIDYEPGYGHKGTLATSQTISSTGNNNMQIFIETLSARLRPAGKMLVMDGQPDLL STETSKLIDHYIYQCYWERNTNTVINKINNPHLDDWERKTIITVEFEQCWRCGGIDPDNK AVNSNANRRYTSVRTEFNGKYGTQIFDYATLNLPSGKRIGGIGTFHMEYDFANEPSYRWL KTALYYGNQVYPGTFE >gi|225935385|gb|ACGA01000007.1| GENE 22 38339 - 39976 1437 545 aa, chain - ## HITS:1 COG:no KEGG:BT_1043 NR:ns ## KEGG: BT_1043 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 544 2 543 546 553 53.0 1e-156 MKFKKYLLVVFSLGLLTACEDYIDINKSTTGVTDEELEAGGLIYGTQLMEMQQRVIPIGS PSKTTDPGNDLAVTDVMSSAQYIGFFGMNNNWKLSTEATWDFADNRMSYAYEQLYMKVYQ PWVQIYQKIGNSEEPDKQEIMAIFNVVRIAGWLRATDCFGPIVYSNAGNGDIAPMLDKQE DVYKAMLKDLATCADILNKSTSKIMPKYDLIYDGDPIKWTKLANSLMLRMAVRTHFVNKD LAVEYIQKALNPRNGGVIESKTDEAKIGNTSKFPLLNSLIATIDYKENRMGATAWSYLTG YNDPRIDKYYTKGTYNKISGYYCIAPGNTLGKAEGDNTAEFASCPKVENADPLYWFRASE TFFLKAEAALYGLTEGNVKDLYEAGVKLSMDEWGIAEAAATTYLNNSAQPATLNSPSNYY YAGGKSYNSDITAGNITPKWDDAASEETKLQRIITQKYLALYPNGVEAWTEYRRTGFPYL ASLFDNTLYNKVGATEKTRAPERFSFAAKEYSGNPNMSSITTLLNGPDKGGTKLWWVRAN RPVQN >gi|225935385|gb|ACGA01000007.1| GENE 23 39999 - 42743 2068 914 aa, chain - ## HITS:1 COG:no KEGG:BT_1042 NR:ns ## KEGG: BT_1042 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 914 186 1097 1097 1348 75.0 0 MGEDTKVLDEVVVTALGIKRSEKALSYNVQQVKGDELTTVKDANFMNSLNGKVAGVNIQR SASGVGGGTRVVMRGNKSIAGQNNVLYVVDGVPIGNKADRSGDGTGFVGMASGEGISNFN PDDIESLSVLTGPSAAALYGANAANGVILINTKKGAQGVMRLNVSSSVEFANPFVMPEFQ NTYGNVSGNYFSWGEKMENPYSWEPRDFYNTGATFNNSFNLSMGTEKNQTYISASAVNSN GMVENNKYHRYNVTFRNTAKFLNDKLTLDVSASYVREYYNNMVSFGTYFNPIVGAYLYPR GMNFESEKYFERYDNELGYNAQSWQPGSMGMDVQNPYWIAYRNLRPEVKDRYMLYASLKY DITSFLNVAGRLRMDNTYSESEDRRYASTTSTFAGENGRYKYSNEFYKQKYADIMVNFDK QFAQVYHATVNAGVSFEEYDTKGHGYGGDLLLVPNKFTYGNVNPSVASPFETGGDSRTQN 
FATFASAELAWNSALYLTLTGRADKPSQLVNSKEQWIFYPSVGLSAIVTELLPDNIRESI NPVLGYFKVRGSFTEVGSPIPFTGLTPGTITHKLENGTVSPFEYYPLSDLKAERTRSYEL GIDTRWFNNTITLGVTVYHSNTYNQLLKADMPGSSGYKYMYVQAGNVQNRGIEMTLGYDQ TFGEFNYNTTFTATANRNKIKKLASNVKNPVNGELLDLSDIQLGRFRLREGGEVGALYAD RRVERNEEGYILYNPGQTLATEKTTPFKIGSVNPKWNLGWRHGFNYKGINANVMFTARIG GNVISKTQATLDRFGVSKASADARDAGYVMLGNIKMQPEDYYGTIYDLDSYYVYSATNVR LQEASIGYTLPNKWFGDVVKNVNISVYGTNLWMIYNKAPYDPELTASTGTFGQGYDYFML PSSRTYGFSLKFGF Prediction of potential genes in microbial genomes Time: Fri May 13 06:31:11 2011 Seq name: gi|225935384|gb|ACGA01000008.1| Bacteroides sp. D2 cont1.8, whole genome shotgun sequence Length of sequence - 2943 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 773 - 807 0.5 1 1 Tu 1 . - CDS 817 - 2052 698 ## BT_1041 integrase - Prom 2191 - 2250 9.0 + Prom 2165 - 2224 9.4 2 2 Tu 1 . + CDS 2433 - 2943 342 ## BT_1042 hypothetical protein Predicted protein(s) >gi|225935384|gb|ACGA01000008.1| GENE 1 817 - 2052 698 411 aa, chain - ## HITS:1 COG:no KEGG:BT_1041 NR:ns ## KEGG: BT_1041 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 410 1 410 413 681 82.0 0 MFKYSRDGVSVLTILDTRRVKKSGLFPIKVQVVFRRKQKYYSTGKELSKEDWDRLLKAKS RLLTEIRSDIESSFSNVKQHINELIQKGEFNIETLSIRLGRQMKDMNLHSAFSLKMKELK DNEQASTYLSYQSALKSLESFGGTYIPLERITVDWLKRCERFWLSKGKSYSSISIYFRAL KCILNRAIQDGILKESSFPFGKNKYEIPEGCGRKLALTLADIKKVMSYQDGTKDIEEFRD LWVFSYLCNGINFMDLLFLQYSNIVDGEICFVRSKTSRTAKHSREIHAIITPEMWNIIHK WGNPRINPQTYIFKYADGTENAFERVRLVRRIVTKCNRRLKKIAQDTGISQLTTYTARHS FATVLKRAGAKTSYISESLGHSNLAVTENYLAYFEKEERIRNAQLLTDFNI >gi|225935384|gb|ACGA01000008.1| GENE 2 2433 - 2943 342 170 aa, chain + ## HITS:1 COG:no KEGG:BT_1042 NR:ns ## KEGG: BT_1042 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 169 1097 248 82.0 8e-65 
MKEIIKHKFSYLLFFLLLVSCFASAYGQERTITLNLSKVPLNTALKEVERQTSMSVVYNT NDVDINRIVSIKVTKESLNNVMNQLFRGVNVSFSIVDNHIVLSAKSNKEDQQKKTPIMAS GTVTDSKGEPLIGVSILVKGTSNGTITDMDGNFKIQAAKGDVLEVSYIGY Prediction of potential genes in microbial genomes Time: Fri May 13 06:31:21 2011 Seq name: gi|225935383|gb|ACGA01000009.1| Bacteroides sp. D2 cont1.9, whole genome shotgun sequence Length of sequence - 4648 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 2776 2505 ## BT_1046 hypothetical protein 2 1 Op 2 . + CDS 2794 - 4344 1395 ## BT_1047 hypothetical protein 3 1 Op 3 . + CDS 4372 - 4648 205 ## BT_1048 putative secreted endoglycosidase Predicted protein(s) >gi|225935383|gb|ACGA01000009.1| GENE 1 2 - 2776 2505 924 aa, chain + ## HITS:1 COG:no KEGG:BT_1046 NR:ns ## KEGG: BT_1046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 924 16 940 940 1483 83.0 0 LANAQPLKVTLGEDTQVLDEVVVTALGIKRATKALSYNVQEVKGDELTAVKDANFMNSLS GKVAGVQINSGATGAGGATRVVMRGMKSLTKDNNALYVIDGVPIFNTGKSGGEGLFGEMG GSDAVADLNPDDIASISMMTGPSAAALYGSAAANGVVLITTKKGQTEKTTITVSNSTTFS KAYIMPDMQNRYGTSSGLFSWGELTNRRYNPSDFFETGSNVINSVALSTGNTKNQTYLSA STTNSGGILPNNSYNRYNFTARNTTHFLNDKLTLDIGAQYIVQNNKNMVSQGQYYNPLPS LYLFPRGDNFDEIRLYERYNTNYGYMEQYWPYGDASLSLQNPYWIQNRINRTSNKKRYMM NASLKWQATDWLNVVGRVNLDNSDYRNKNEKSASTLTTFCGVSGGFEDAMRQERSLYADF LANIDKTFGDFHLTANVGASIYHTSMDQLYIAGDLVIPNFFQINNINFSANYKPDPTGYE DEIQSIFASAELSWKNQLYLTVTGRNDWDSKLAFSKQKSFFYPSVGLSALLSEMVKLPEV ISYAKIRGSYTVVASSFDRFLTNPGYEYNSQTHNWANPTVYPMDNMKPEKTKSWEVGLNL KFWGNRFNLDATYYRSNTLNQTFKVDIPSSSGYKQAIVQAGNVQNQGIELALGFSDKWAG FGWSSNATFTLNRNKVKRLASGSVNPVTGEAIQMDEMNVGWLGKENVAPRVILTEGGSMT DIYVYNQLTKDNNGNIKVDQNGNLGITSSNTPVKVGNLDADFNLGWTNHFTYKGIDLGVV LSARVGGLAYSATQGILDYYGVSETSATARDNGGIPINNGKVNAQKYYQTIGTGEGGYGR YYLYSATNVRLQELSLNYTLPKRWFKNVANVTLGVVGRNLWMIYCKAPFDPELSASTSSN YYMNVDYFMQPSLRNFGFNVKVQF 
>gi|225935383|gb|ACGA01000009.1| GENE 2 2794 - 4344 1395 516 aa, chain + ## HITS:1 COG:no KEGG:BT_1047 NR:ns ## KEGG: BT_1047 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 514 514 763 73.0 0 MRKYLRNITVGAALLLLAGSCTDKFEEYNTNQYQIHDADPATLMKSMIETIVNIQQNDSQ MQDQMVGQLGGYLCCSNTWSGTNFSTFNQSDAWNATPWNTPFEKIYGNFFQIQEATNSTG HYYAFSCMIRAITMLRVADCYGPMPYSQVKKGNFYVSYDTQEQVYTSILSDLANAADVLY NYYVETNGNAPLGANDPVFDGNYSGWAKLANSMRLRVAMRISGTWPGIAQEAAEAAVTHK AGLIESNSDNAMLSCGTQSNPYQLAAVSWGDLRVNANIVDYMNAYGDPRMPKFFNKSTLA GKTDKYVGMRTGDADFKKADAAGFSIPAYTATSKLMVFCAAETAFLRAEGKLRGWNVGSK TAKAYYEDGINLSMEQYQVSATEYLKIDEAPVVSHESDAVQNATVTITNTVSVMWDDSEA DNVNGKNFQRVITQKWIANYPLGLEAWAEYRRTGYPELYPCIDNLSDCGVSSQRGMRRLS FPYTEAQNNKANYDLGVAELGGADNEATDLKWAKKN >gi|225935383|gb|ACGA01000009.1| GENE 3 4372 - 4648 205 92 aa, chain + ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 92 15 91 375 111 66.0 7e-24 MKKIYLLYIVLISLATTSLIGCSDWTESEAKTFPESIVSDEYYAALRAYKQTDHQVAFGW FGGWSGEGAYMKSSLAGIPDSVDIVSIWGNWS Prediction of potential genes in microbial genomes Time: Fri May 13 06:31:56 2011 Seq name: gi|225935382|gb|ACGA01000010.1| Bacteroides sp. D2 cont1.10, whole genome shotgun sequence Length of sequence - 30821 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 10, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 861 657 ## BT_1048 putative secreted endoglycosidase 2 1 Op 2 . + CDS 870 - 2069 886 ## BT_1049 putative patatin-like protein 3 1 Op 3 . + CDS 2093 - 3049 518 ## BT_1050 hypothetical protein + Term 3136 - 3195 10.0 - Term 3247 - 3308 16.5 4 2 Tu 1 . - CDS 3380 - 4372 868 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 4429 - 4488 8.2 + Prom 4456 - 4515 6.9 5 3 Op 1 . 
+ CDS 4535 - 5143 433 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 6 3 Op 2 . + CDS 5140 - 5766 478 ## COG2731 Beta-galactosidase, beta subunit 7 3 Op 3 . + CDS 5775 - 8945 2232 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 8 3 Op 4 . + CDS 8942 - 9559 390 ## COG1180 Pyruvate-formate lyase-activating enzyme 9 3 Op 5 . + CDS 9564 - 10280 413 ## BT_1056 hypothetical protein 10 3 Op 6 . + CDS 10308 - 11315 823 ## BT_1056 hypothetical protein 11 3 Op 7 . + CDS 11344 - 14223 2024 ## BT_1057 hypothetical protein + Term 14249 - 14300 8.4 - Term 14236 - 14288 16.2 12 4 Op 1 . - CDS 14318 - 15451 1174 ## COG0642 Signal transduction histidine kinase 13 4 Op 2 . - CDS 15471 - 16400 545 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 16562 - 16621 8.8 + Prom 16432 - 16491 5.4 14 5 Op 1 . + CDS 16601 - 17560 645 ## BT_1060 hypothetical protein + Term 17614 - 17655 1.0 + Prom 17599 - 17658 4.5 15 5 Op 2 . + CDS 17799 - 18746 476 ## BT_1061 hypothetical protein + Term 18757 - 18819 18.6 - Term 18753 - 18796 12.6 16 6 Op 1 . - CDS 18857 - 19813 788 ## BT_1062 hypothetical protein - Prom 19833 - 19892 2.2 17 6 Op 2 . - CDS 19946 - 21592 1422 ## BT_1063 hypothetical protein 18 6 Op 3 . - CDS 21630 - 23150 1415 ## BT_1064 hypothetical protein 19 6 Op 4 . - CDS 23152 - 23724 463 ## BT_1066 hypothetical protein 20 6 Op 5 . - CDS 23743 - 25500 1157 ## BT_1067 hypothetical protein - Prom 25585 - 25644 4.7 - Term 25866 - 25903 -0.9 21 7 Op 1 8/0.000 - CDS 25971 - 26513 572 ## COG1475 Predicted transcriptional regulators 22 7 Op 2 . - CDS 26510 - 27814 833 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 23 7 Op 3 . - CDS 27818 - 28222 249 ## BT_1071 hypothetical protein - Prom 28252 - 28311 3.5 24 8 Tu 1 . - CDS 28338 - 28742 257 ## gi|260170434|ref|ZP_05756846.1| hypothetical protein BacD2_01068 - Prom 28838 - 28897 9.7 25 9 Tu 1 . 
+ CDS 29182 - 29421 208 ## BT_1516 replicative DNA helicase + Prom 29663 - 29722 6.5 26 10 Tu 1 . + CDS 29854 - 30820 734 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III Predicted protein(s) >gi|225935382|gb|ACGA01000010.1| GENE 1 1 - 861 657 286 aa, chain + ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 93 374 375 477 82.0 1e-133 ITEAQKKDLQFCQQVKGTRFTMCFIIRSVGDQITPQNIRENWENMGFSSEKEAVNDFWEW PSDESNKEAIEASIRKYASAIADTVNKYGYDGFDIDYEPNFGNPGNIVDEDDRMFAFVDE LGKYFGPKSGTGKLLVIDGEPQSITGRPEVGLYFDYFIIQAYNNSSPGSDSKLDKRLITG GVAGAGLVQTYSSVMSEEQITKMTIMTENFEATDAAMDGGYDYTDRYGNKMKSLEGMARW QPSNGFRKGGAGTYHMEAEYGTSPEYKNIRRAIQIMNPSSHSLLKN >gi|225935382|gb|ACGA01000010.1| GENE 2 870 - 2069 886 399 aa, chain + ## HITS:1 COG:no KEGG:BT_1049 NR:ns ## KEGG: BT_1049 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 399 1 393 393 472 59.0 1e-131 MKRMNYIKILCFSAVITGLGSIVSCGDAKYDTLDTHAYIEEALSSTSQKVTVQATGESFT TLNVHLSNLSSTDNHYKLVTDQSVLDTYNHINGTGYIMLPKDYYTLPETITVKAGQYAAD ALPIALKAFSQEMMKSGESYALPVKLVLQDGSISPMENTGTFVILAESIIEFSAPMFVGA PSLKANKFTESPETYSQYTIEVRFQVANTADRDRAVFKNGGDDANFILLRFEDPQSDNEN YKAHSLVQIVGRNRLYLNPSNSFKPNEWQHLALVCDGSNYRLYINGVDSGVLSIPTGATT FSDVNWFCLGDDSYSRWGNCKILMSEARIWSVVRSASQIQNNMTQVSPKSVGLEAYWRFN EGQGNVFEDATGKGHTLTTSATPTWIDGILSTDKSTEWK >gi|225935382|gb|ACGA01000010.1| GENE 3 2093 - 3049 518 318 aa, chain + ## HITS:1 COG:no KEGG:BT_1050 NR:ns ## KEGG: BT_1050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 313 1 299 301 168 36.0 3e-40 MKNIFQSLSIATLLGLFVPFISSCSDNEEVFNEWNATYVSLQRNDYLSGNVKKFNLTHDA NGIGGDEMKMAFTVKTQKAVSTDMVIALSAKSETEGLDASQIVLSSSQVTLKEGQMTSEE ITATVDPTIFASIMEKTSFSFSISISNVTTNDKNTVISSSLSTLPVIINKAAYCNLKSGT PSNSQLISNRAGWVIKLEEGVEGTGNNLIDGRTNTDIALSGAGFWFTVDLGETKVLTGIK 
TNHWGNSSSYAPREVEILQSDNGTTWKSLSSLAISGGTQNITFISPITTRYLKYQIITIA SSGRTDVTEFNVYEPKSE >gi|225935382|gb|ACGA01000010.1| GENE 4 3380 - 4372 868 330 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 114 324 116 322 331 78 30.0 1e-14 MDETILVNYLQGECNDEEAARVEAWCEERPENRKTLEQLYYTLFAGERIAVMNTVDTEAS LDQFKSVIREKEKKAKRKSISIRWGRYATVAAAFLTGLVFAGGIAWGLLSNKLSDYEVIT AAGQRAQTVLPDGSKVWLNASSKIVYHNSFWSSDRQIDLSGEAYFEVSHDKHAPFIVNSK QIKTCVLGTKFNVRARQDENRVVTTLLQGLVRVDSPRTEENGYLLKPGQTLNVNTDTYQA ELIEYNQPTDVLLWINGKLEFKQQSLLEITNIMEKLYDIKFIYKDEALKSERFTGEFSTD STADEILNVLMHTNHFSYKKDGRIVRVMKK >gi|225935382|gb|ACGA01000010.1| GENE 5 4535 - 5143 433 202 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 2 189 4 196 206 66 23.0 4e-11 MSQSPIYLDINNNKSVITALKAGEEKVFDVVYRHYFRRLCAFCSQYVGEQEEIEEIVQET MMWLWENRCTLMEELTLKTLLFTIVKNKALNRLSHFEIKRKVHQEIVDKYDSEFNNPDFY LSDELFRLYEEALKKLPKEYLEAYEMNRNQHLTHKEIAEKLNVSPQTINYRIGQALKLLR VALKDYLPLFILIFGPNFFEQS >gi|225935382|gb|ACGA01000010.1| GENE 6 5140 - 5766 478 208 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 61 205 4 150 152 72 34.0 4e-13 MKSKFVFSSSSLFIGLFLFFFTENVCSQESADNWTLKQARQWTQKQEWANGLKAMPHKTT DYQEFASQYHKNQKVWDKTFQWLATHDLANMPAGRYEVDGEHCYINVQDATTQDVSKRKI EAHRHGIDLQYVVKGTERFGITSTEYAEPITEYKPDVTFYKAKKIKYVDSTPDTFFMFFP KNFHQALVKAGKEPEDIRVIVAKIEYIP >gi|225935382|gb|ACGA01000010.1| GENE 7 5775 - 8945 2232 1056 aa, chain + ## HITS:1 COG:jhp1446 KEGG:ns NR:ns ## COG: 
jhp1446 COG1074 # Protein_GI_number: 15612511 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Helicobacter pylori J99 # 3 818 6 728 946 139 26.0 4e-32 MSELLVYKASAGSGKTFTLAVEYIKLLILNPRAYRQILAVTFTNKATAEMKERILSQLYG IQIGDKDSEAYLNRIKEETGKTEQEIREAAGVALSYMLHDYSRFRVETIDSFFQSVMRNL ARELELSPNLNIELNNTEVLSDAVDSMIEKLGPTSPVLAWLLDYINERIADDKRWNVSDE VKSFGRNIFDEGYIEKGEGLRQRLRNPDTIKEYRKQLKALETEILEQMKGFYDQFEGELD GHALTADDLKNGSRGIGSYFRKLNNGILGNDIRNATVEKCLEDAKNWATKTSPRYADIIN LANSSLMQILEDAEKLRSKNNLLLNSCRLSLQHLNKVQLLANIDEEVRQLNHDNNRFLLS DTNALLHQLVKDGDSSFVFEKIGTNIHNVMIDEFQDTSRMQWGNFKLLLLEGLSQGADSL IVGDVKQSIYRWRNGDWGILNSLNDHIEHFPIRVKTLATNRRSETNVIRFNNQIFTAAAN YLNGVYKQQLGKDCEDLQKAYADVVQESPRSTEKGYVKVSFLEPDEEHDYTEQTLISLGE EVKHLLTSGVRLNDIAILVRKNKSIPRIADYFDKELHYKVVSDEAFRLDASLAICMMLNA LRFLSDENNKIARAQLAVAYQNEVLQKGLDWNTLLLLPAENYLPAAFLERIKELRLMPLY ELLEELFSIFEMNLIKDQDAYLFAFFDAVTDYLQSNSSELDGFIRHWDETLCSKTIPSGE VEGIRIFSIHKSKGLEFHTVLLPFCDWKLENETNNQLVWCAPQTTPFNALDILPINYSTQ MAESIYGNEYLQERLQLWVDNLNLLYVAFTRAGKNLIIWSRKSQKGTMSELLANTLPIVA QEEGIDWEEDCYEQGQLCPSEEERIKTSTNKLTQKPEKLPIRMESMRHDIEFRQSNRSAD FIQGIEEEDSDDRFINRGRMLHTLFSVIETAEDIDPAIERLIFEGVIRNDEKEKVAREVA KKAFSSPEIQDWYSGKWTLFNECAIIYKEKGVLQTRRPDRVMMKDDQVVVVDFKFGKENP TYNKQVKGYMQLLTKMGYKNITGYLWYVDEEKIEKV >gi|225935382|gb|ACGA01000010.1| GENE 8 8942 - 9559 390 205 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 12 202 12 256 302 102 27.0 6e-22 MKAIPLIGIARHRLTIDGEGVTTLVAFHGCPLRCKYCLNPTSLQPDGVWERYDCNQLYEE VRKDELYFLASCGGVTFGGGEPLLQNEFIRQFRQLCGPEWRITVETSLNVPLQNVEELIS IVDNYIVDIKDMNNDIYQRYTGKGNERVLCNLRYLIEEGKTGKIIIRTPLIPSYNTEKDV DYSIELLKEMGITQFDRFTYKTPND >gi|225935382|gb|ACGA01000010.1| GENE 9 9564 - 10280 413 238 aa, chain + ## 
HITS:1 COG:no KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 267 269 210 48.0 4e-53 MARGKQTCKILKEIRRQIAEANDIEFITSECQYLGDCLGTCPKCEAEVRSLEQQLLNRQR SGLNVKIVGVSLGLSAALISSPLYGQEAIKDSLSNSCDSLPKVEVTGGKTEKRHQVAGTT ITAEELKALSLKKEEAMFGMVEVMPEFPGGQKALMAYIAKNTNYPQDGPCISGRVIVQFT ITKKGKIIDAKVARGIHPVYDKEALRIIKKMPAWKPGTQMGKPVAVRYTVPIVFRAKQ >gi|225935382|gb|ACGA01000010.1| GENE 10 10308 - 11315 823 335 aa, chain + ## HITS:1 COG:no KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 266 269 185 42.0 2e-45 MKRGKQTCKILKEIRRQIAEANDIEFITSECQYQGDCLGTCPKCEAEVRYLEQQLERKRR VGKAITLFGLSTGILTIIPPTSLNAETLQCPKMNWTITADSLIQEKDMKGEAPFYGPEKL PEFPGGTEKFIEFLKENLRYSEIDSIANNTYTDVKFTIDHDGQVINPKITRKIHPKVEAE ILRVMSLIPRWTPGMLANETVQVTYNLMLSFSAMYGPELRITYTRINYPDVYEKVPVMPD FPEGQQALMKFLAKNIQYPNTGTEGNGVQGRTIIQFIVDKEGNVVKPKVLRGVDPYLDKE ALRVINQMPKWKPGELEDGTKVAVYFTVPVMFRLQ >gi|225935382|gb|ACGA01000010.1| GENE 11 11344 - 14223 2024 959 aa, chain + ## HITS:1 COG:no KEGG:BT_1057 NR:ns ## KEGG: BT_1057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 959 1 954 954 1638 86.0 0 MQSFLQLVAHDLYTKIGNDLSRTALIFPNKRANLFFNEYLAGESDQPIWSPAAMSISDLF QKLSVQKSGDPIRLVCELYKVFKEETQSQETLDDFYFWGELLISDFDDVDKNMVDADKLF SNLQDLKNLMDDYEFLDKEQEEAIQQFFQNFSIEKRTELKEKFISLWDKLGTIYHRYRTN LTELGIAYEGMLYRNVIEQLDTDQLKYDKYIFVGFNVLNKVENDFFSKLKDAGKALFYWD YDIFYTQQIKKHEAGEFLKRNLEEFPNELPENFFNSFKEPKKIRYISASTENAQARFLPE WIKAMTDDHSQIAEEKEKENAVVLCNEALLLPVLHSIPQEVKNVNITMGFPLAQTPVYSF INAAMELQTNGYRSDTGRFTYEAVSAILKHPYTQQLSSHAGPLERELTQTNRFYPLPSEL KQDDFLTTLFTPRNGIKELCDYLIELIKYISTIYRKEGEYNDIFNQLYRESLFQSHTKIN RLYSLIESGELNIRTDTLKRLITKVLTSSNIPFHGEPAIGMQVMGVLETRNLDFRNLIIL SLNEGQLPKSGGDSSFIPYNLRKAFGMTTIEHKNAVYAYYFYRLIQRAENITLLYNTSSD GLNRGEESRFMLQLLVEGPHDITREYLEAGQSPQSTQEIRVEKTPEVLRRLYRAYDSTHP 
NSVILSPSALNAYLDCRLRFYYRYVAGLKTPDEVSAEIDSALFGTIFHLSAQSAYTDLTA AGKTIQKEDLERLLRNDVKLQSYVDQAFKEELFKVSPEEKPEYNGIQLINSKVIVSYLKQ LLRNDLQYTPFEMVAMEKKVSEEITIQTGQGPFTLRLGGTIDRMDAKESTLRIVDYKTGG SPKIPANIEQLFTPSETRPNYIFQTFLYAAIMSRQQSLMVAPALLYIHRAASENYSPVIE MGEPRKPKIPVNNFAFFEDEFRERLQALLEEIFSEEEPFTQTEDTKKCSYCDFKAICKR >gi|225935382|gb|ACGA01000010.1| GENE 12 14318 - 15451 1174 377 aa, chain - ## HITS:1 COG:CAC3391 KEGG:ns NR:ns ## COG: CAC3391 COG0642 # Protein_GI_number: 15896632 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 130 373 342 576 579 122 35.0 2e-27 MNMEINPSEYKVLIVDDVISNVLLLKVLLTNEKFKIVTAGNGTQALEQVKKENPDLVLLD VMMPDISGFEVAQQMKADPEMAEIPIIFLTALNSTADIVKGFQVGGNDFISKPFNKEELI IRVTHQISLVAAKRIIVAQTEELRKTIMGRDKLYSVIAHDLRSPMGSIKMVLNMLILNLP SETIGEEMYELLTMANQTTEDVFSLLDNLLKWTKSQIGKLKVVYQDIDMVEVVEGVSEIF TMVASLKNIKIVQDVPVENMAVRADIDMIKTVIRNLISNAIKFSNEGSEVVVSLAEEDGM AIVSVKDSGCGIDDENQKKLLHTDTHFSTFGTNNEEGSGLGLLLCQDFVVKNGGKLWFTS KKGDGSTFSFSIPLLEK >gi|225935382|gb|ACGA01000010.1| GENE 13 15471 - 16400 545 309 aa, chain - ## HITS:1 COG:XF0611 KEGG:ns NR:ns ## COG: XF0611 COG0451 # Protein_GI_number: 15837213 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 3 307 22 326 329 483 70.0 1e-136 MKRILVSGGAGFIGSHLCTRLINEGHDVICLDNFFTGSKDNIIHLMDNHHFEVVRHDVTY PYSAEVDEIYNLACPASPIHYQHDPIQTAKTSVMGAINMLGLAMRLDAKILQASTSEVYG DPIVHPQPESYWGNVNPVGYRSCYDEGKRCAETLFMDYHRQNSVRVKIIRIFNTYGPRML PNDGRVVSNFILQALNNEDITIYGDGKQTRSFQYIDDLIEGMIRMMNTEDGFTGPVNLGN PNEFPVLELAERIISLTGSSSKIVFKSLPDDDPKQRQPDITLAKEKLGWQPTVELEEGLK RMIEYFKNV >gi|225935382|gb|ACGA01000010.1| GENE 14 16601 - 17560 645 319 aa, chain + ## HITS:1 COG:no KEGG:BT_1060 NR:ns ## KEGG: BT_1060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 319 1 317 317 458 72.0 1e-127 
MTINDPNTNLIEAMKEKLPAKGKLADMLMDTLYIGKEAVYRRLRGEVPFTLQEAALVSRK LGISLDKIIGTSFKSNAMFDINIVDYDDPFESYYNTLYKYVSLINTMQDDPESTMGTSSN IIPQTLYLKHDLLAKFRFFKWMYQNKYIQCNNFDELEIPAKLLNIQKDYVDMTRHIHSID YIWDSMIFQHLINDIQYFSDVHLISDEAKMQIKEELFLLTDELEELATRGKTKDGNTVRI YVSHINFEATYSYVETNSLQLSLIRVYSINSLTTLDHEIFHSLKEWIQSLKKFSTLISES GEMQRIHFFKQQREIIDTL >gi|225935382|gb|ACGA01000010.1| GENE 15 17799 - 18746 476 315 aa, chain + ## HITS:1 COG:no KEGG:BT_1061 NR:ns ## KEGG: BT_1061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 1 315 319 464 75.0 1e-129 MKNGILNELITVMKERVPKGQNLASNLADILCMGKEAVYRRLRGEVSFSIEEVAIISHKL GISIDQVVGDHHSNKVTFDLNLHHSSNAIDSYYEILNRYLQIFDFVKEDNSTEIYTASNL LPFTLYSSFENLSKFRLSRWIYQHGQIKTPHSLEDMRVEKKIVQAHKKLSESVKQCPKTF FIWDTSIFYSFVNEIKYFASLHMITPNDVENLKEELYQLLTYIETLSSKGEFSEGRKVYF YLSNINFEATYSYIEKLNYQISLLRIYSINSMDSQSPHICQMQRDWILSLKRHSTLISES GEAQRIIFLTKYKKV >gi|225935382|gb|ACGA01000010.1| GENE 16 18857 - 19813 788 318 aa, chain - ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 317 1 317 317 482 74.0 1e-135 MNLKSLVRHVTFGILIISSWIMTSCDNFNEDLPECRLFVKFKYDYNMEFADAFHTQVDKV ELYVFNKDGKFLFKQAEEGSSLSTGNYLMEVALPIGEYQFMAWAGARDSYDITSLTPGVS TITDLRLQLKREESFIINKELETLWYGEIIDVNFTGKTHQTETINLIRDTKRVRFIFQGY TNDWELNVDDYDYEIRESNGYLGHDNSLLEDDMLSFQPYRRDQVNSSAASVDLNTMRLMK DRKTRLVVTEKSSAKKVFDINLIDYLAMTNMEDKKIGMQEYFDRQSNYHIVFFISDSWLA MRIVINGWTVYSQTEGEL >gi|225935382|gb|ACGA01000010.1| GENE 17 19946 - 21592 1422 548 aa, chain - ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 546 1 564 566 351 43.0 6e-95 MKKKNFFMLALAAIAFAACSNEDIVPGGENGTDVTNSDGDAWVALTVQSTKTRALNTPNQ ENGTPDESTITKVYAVFFDGHTDASVVTKIVDFGTEDINNLPNRTNAFKVPSTSVAVLVI ANPQNLGRTIQEGDTYENVNKVITLTAEAGVTTGVAKSGGFVMTNAKGDLEPSDTDGKPK 
VLTLYNTPTAATSSPLTIRIDRISAKVRVYVKPDGGEKLSETATIDSDETEWYLNITNKK FYPASKRMKTAMNTWTPYDQYSLGSYRVDPNFGVVNDIGAWDDDDQTVYAGNYFYVSSAT KKEDIDWNEVNTSVKYCLENTQNQTDVVNFNMHAYTTQVLLKVGYAPTTITNEDGSKVTL ANGTDWFNMNGIFYTQASILTYIEKELASKYKHDKPSEYATDITDAYNGYIKELLKTEVT IPTDKDGAKTAEEMAAELKLEFSNLAAQVSTASKGGKNLKGVEYYSEGICYYKIMIKHDD SPTIMNKLGEFGVVRNSVYDITVNKINNPGYPIIPDPDPNTPDEEEDRYLSIKIEVNPWT WYSQVEPL >gi|225935382|gb|ACGA01000010.1| GENE 18 21630 - 23150 1415 506 aa, chain - ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 119 506 4 392 392 610 79.0 1e-173 MMMKRNISFIVCMLVVCSLNIKNVMAQSRSYEGNIAVKPVQLEQKGNFLHIDIDFSLNNV KVKSARGVDLIPQLVSPERTLNLPKVSIKGRDEYLAYERELAVMSAKEKKTYEKPYVVEK GSKWRNDTIRYRYVVRFEPWMKDARLDVQRDECGCGETALMDVEQFDKVTLERVLLPYVV SPQLAYLQPKAEEIKRRDIQAECFLDFEVNKVNIRPEYMNNPQELAKIRKMIDELKSDPS VKVNRLDIIGYASPEGTLVTNKRLSEGRAMALREYLASRYDFARNQYYIIFGGENWEGLE EALNTMEMEYKDEVLDIIRNIPIEKGRETKLMQLRGGVPYRYMLKYIFPSLRVAICKVNY EVKNFNIDEAKEVIKSRPQNLSLNEMFLVANTYPVGSQDFIDIFETAVRMYPKDEIANIN AAAAALSRNDLLSAERYLDMVHSNTNLSEYNNAIGVLMLLKGEYEKGEKYLKIASESGLE AAGHNLEELARKRTNAAEIEKKNRDK >gi|225935382|gb|ACGA01000010.1| GENE 19 23152 - 23724 463 190 aa, chain - ## HITS:1 COG:no KEGG:BT_1066 NR:ns ## KEGG: BT_1066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 189 189 350 87.0 2e-95 MKKVSLLILFMLGCILFESKAQKVALKSNLLYDATTTMNLGLELGLARKWTLDIPVNYNP WKPSDGKRLRHWGIQPEVRYWFCERFRRTFIGIHGHYADFNVGGWPDWSFISENMQKNRY QGHLYGGGFSVGHSWILKKRWSIEASVGVGYAHIVYDKYPCTTCGTKEKESSKNYFGPTK ASVSLIYIIK >gi|225935382|gb|ACGA01000010.1| GENE 20 23743 - 25500 1157 585 aa, chain - ## HITS:1 COG:no KEGG:BT_1067 NR:ns ## KEGG: BT_1067 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 584 17 607 608 634 59.0 1e-180 MAGCEHDVSTIGQENEEEALKEKVQIEIFARAKSYSLPPTRALESEVGKTPWILVFKGQG 
AGATFIEAVQAFEMISKRYVLLTKQPAGNKYQLLILANPLDQFYYGNASTGYDFNQAALL SKLTPGVTTYGEACSKLLTKPLDSNPLTTVPYSGDEETIPMSHRLEVDQIDNTTKIANTD GSSLLLTRVVAKIVIVNKASNFILKGVTAVMNVPRQGKLHDADGILMNNTTNLTEYRYDA AYSLPVVKAETIATDVQSTEKFPIYMYESDVQNNTYLIIQGTYDNKDYYYKMSLLNSSLQ AMDIERNHSYTFTINRAKGPGYDTVSDAKTAKPSNTDLDFEIMVDDSESYEILANNDYYL GVSNSVFIAYYSGENMEDYDAFSVITDCTKDFQDARTITDNKLEVSGAFQLSGPFKVPIV TSNGTNPVITPVTVGVTNQLMGYEEGIDDKKNAYITLKLGNLEKKVHIRQRKAISAGGST LKFMPTNNTDPVVSEINYFCLTGQVEEGTDNPKEWIRLYSSVMPEGHTGEDHIIVEDGRI YVEVLPNTNATPRSGIVHLTTMASVGGVSLNSVQRIKLDITQLGQ >gi|225935382|gb|ACGA01000010.1| GENE 21 25971 - 26513 572 180 aa, chain - ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 6 165 7 166 180 223 64.0 2e-58 MSVDKSPVYEVKAVPVEKVYANDYNPNVVAPPEMKLLELSIWEDGFTMPCVCYYDKEADN YILVDGYHRYTVLKTSQRIYKRENGLLPIVVIDKDLSNRMSSTIRHNRARGMHNIELMCN IVAELDKAGMSDQWIMKNIGMDRDELLRLKQISGLADLFANREFSIPDEVAPTETERKTL >gi|225935382|gb|ACGA01000010.1| GENE 22 26510 - 27814 833 434 aa, chain - ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 9 433 3 434 434 440 49.0 1e-123 MAKKKIAGTKNVYELAQERLKVIFNEFDNIYVSFSGGKDSGVLLNMCIDYIRKNNLKIRL GVFHMDYEIQYKMTIDYVDRMLEANKDILDVYRVCIPFRVATCTSMYQSFWRPWEDSKKN IWVRSMPKKAMTKEDFPFYNTTMWDYEFQMRFAQWIHNKKDAVRTCCLIGIRTQESFNRW RCIYMSRKFQMYHKYRWTSKVGNDIYNAYPIYDWKTTDVWTANGKFQWDYNVLYDLYYRA GVNLERQRVASPFINEAQESLQLYRVLDPNTWGKMVGRVNGVNFTGMYGGTHAMGWQSVK LPEGYTWREFMYFLLSTLPERARKNYLRKLSVSVNFWRTKGGCLSDATIQKLIDAKVPII VMDNSNYKTLKKPVRMEYQDDIDIPEFKEIPTYKRMCVCILKNDHACKYMGFSPTKEEMS KRSQVMEQYRIIVS >gi|225935382|gb|ACGA01000010.1| GENE 23 27818 - 28222 249 134 aa, chain - ## HITS:1 COG:no KEGG:BT_1071 NR:ns ## KEGG: BT_1071 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 2 135 135 200 72.0 2e-50 MQIIRLKGKDKHLYRLLAPMVMDPEVIRANNNYPFKTGEEYVWFIAIEDKEVVGFVPVEQ KSRKKAVINNYYVKAEGTVREEILSHLLPAIVAEFGPESWLLNSVTLVQDKETFEKFEFV SMDKKWTRYVKMYR >gi|225935382|gb|ACGA01000010.1| GENE 24 28338 - 28742 257 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170434|ref|ZP_05756846.1| ## NR: gi|260170434|ref|ZP_05756846.1| hypothetical protein BacD2_01068 [Bacteroides sp. D2] # 1 134 1 134 134 238 100.0 7e-62 MKGLEIIKDRLQRKREQYIHQGEKIEQMKCTMQNTGEEIEQLEYVLSEMEHIADANRAPR VVPNARVEEVFAYLCRVFEFLQHRLKLHFKYKLACEVIFARYQFKSRLHTPGREYLSFAT ILTYFKRERGMAYG >gi|225935382|gb|ACGA01000010.1| GENE 25 29182 - 29421 208 79 aa, chain + ## HITS:1 COG:no KEGG:BT_1516 NR:ns ## KEGG: BT_1516 # Name: not_defined # Def: replicative DNA helicase # Organism: B.thetaiotaomicron # Pathway: DNA replication [PATH:bth03030] # 3 67 175 239 461 70 55.0 1e-11 MIEARVASNKNSAPGLTTGFADLDWITCGWQPGEEIVIAARLAVGKTAFALHLARTAASA GYHIAVYIFTYTIFFLMND >gi|225935382|gb|ACGA01000010.1| GENE 26 29854 - 30820 734 322 aa, chain + ## HITS:1 COG:aq_1099 KEGG:ns NR:ns ## COG: aq_1099 COG0332 # Protein_GI_number: 15606369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Aquifex aeolicus # 8 322 5 304 309 269 42.0 5e-72 MEKINAVITGVGGYVPDYILINDEISKMVDTTDEWIMGRIGIKERRILKEEGLGTSYIAR KAVKQLIKRTQTNPDDIDLIIVATTTPDYRFPSTASILCERLELKRAFAFDMQAVCSGFL YALETGANFIRSGNYKKVVVVGAEKMSSIINYTDRATCPIFGDGGAAVMLEPTTEDFGIM DAVLRTDGKGLPFLHIKAGGSVCTPSYYTLDNQMHYIYQEGRTVFKYAVSNMADACESII ARNHLSKDNIDWVIPHQANQRIITAVTQRLEVPVEKVMVNIERYGNTSAGTLPLCLWDFE DKLKKGDNLILTAFGAGFAWGA Prediction of potential genes in microbial genomes Time: Fri May 13 06:33:45 2011 Seq name: gi|225935381|gb|ACGA01000011.1| Bacteroides sp. 
D2 cont1.11, whole genome shotgun sequence Length of sequence - 17152 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 6, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 77 - 121 -0.6 1 1 Op 1 . - CDS 355 - 2208 1077 ## COG0210 Superfamily I DNA and RNA helicases 2 1 Op 2 . - CDS 2210 - 2560 264 ## gi|260170440|ref|ZP_05756852.1| hypothetical protein BacD2_01098 3 1 Op 3 . - CDS 2638 - 4407 952 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family - Prom 4467 - 4526 7.3 + Prom 5333 - 5392 3.0 4 2 Op 1 . + CDS 5463 - 6692 792 ## COG0582 Integrase 5 2 Op 2 . + CDS 6714 - 7901 840 ## BF3004 tyrosine type site-specific recombinase + Term 7917 - 7956 7.4 - Term 7896 - 7951 8.3 6 3 Op 1 . - CDS 7979 - 8275 201 ## BDI_0745 hypothetical protein 7 3 Op 2 . - CDS 8278 - 8556 300 ## BF1983 hypothetical protein - Prom 8576 - 8635 3.1 8 4 Op 1 . + CDS 9045 - 9284 79 ## gi|260170446|ref|ZP_05756858.1| hypothetical protein BacD2_01128 9 4 Op 2 . + CDS 9296 - 9508 168 ## 10 4 Op 3 . + CDS 9510 - 9737 172 ## BF1345 hypothetical protein 11 4 Op 4 . + CDS 9749 - 11686 1535 ## COG1475 Predicted transcriptional regulators + Term 11702 - 11737 5.6 12 5 Op 1 . + CDS 11740 - 11922 140 ## gi|260170449|ref|ZP_05756861.1| hypothetical protein BacD2_01143 13 5 Op 2 . + CDS 11919 - 12146 239 ## gi|260170450|ref|ZP_05756862.1| hypothetical protein BacD2_01148 + Term 12178 - 12214 7.1 - Term 12137 - 12172 -0.5 14 6 Op 1 . - CDS 12331 - 13623 369 ## Xaut_0787 hypothetical protein 15 6 Op 2 . - CDS 13635 - 13886 145 ## gi|260170453|ref|ZP_05756865.1| hypothetical protein BacD2_01163 16 6 Op 3 . - CDS 13870 - 15348 442 ## PP_3067 hypothetical protein 17 6 Op 4 . - CDS 15410 - 15874 391 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 18 6 Op 5 . - CDS 15879 - 16331 390 ## BVU_2479 hypothetical protein 19 6 Op 6 . 
- CDS 16398 - 16892 231 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 16971 - 17030 5.8 Predicted protein(s) >gi|225935381|gb|ACGA01000011.1| GENE 1 355 - 2208 1077 617 aa, chain - ## HITS:1 COG:BB0344 KEGG:ns NR:ns ## COG: BB0344 COG0210 # Protein_GI_number: 15594689 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Borrelia burgdorferi # 9 309 12 323 699 64 23.0 6e-10 MGHNIDEKVDLTLEACISPNSRKSFFLFAGAGSGKTHSLVRLLNKIKDKWEKQLNLEGRH IAVITYTNAATDEILNRLDYSPLFHVSTIHSFVWEVIKHYQTDIKKYYIAFKEVEKAEIE LKLEKSKSKSGKNYNNNLEKLVAINEKIIKASDINRFIYNPNGNNYEHNALSHADVIKIG AKMIVDNELLQKIIAQQYPFFMIDESQDTKKELIAAFFEIERNVDDNFTLGLFGDQKQRI YTDGEERIVEIIPQEWEKPIKEMNYRCAKRIVELANKIGCQIDVYAKQSPRDNAPEGTVR LFLIKNKDDLNKIELESKVIQMMADITGDQMWNRDATNVKVLILEHMMAARRLGFAEFFD VMHEVDKYKQTLLQGLVSDMDIFTKLIFPLVENVLKNDNLSALNLLKLHSPLLRELPQNK AFQILDKCKKVIDILANMDFNVVTIRELITYVCTTHIFVVPEILKRASKMVLNDLTEDDK EDMDIVCSWIKVMDLPVNQIKRFDDYVNRRTMFDTHQGVKGLEFERVLVVIDDNESRGFL FSYDKLLGVKPLSDGDIKNKETGKETSMERTTRLFYVTCTRARNSLAIVMYTNAPETAKE TVLVNKWFIDDEIIIIN >gi|225935381|gb|ACGA01000011.1| GENE 2 2210 - 2560 264 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170440|ref|ZP_05756852.1| ## NR: gi|260170440|ref|ZP_05756852.1| hypothetical protein BacD2_01098 [Bacteroides sp. 
D2] # 1 116 1 116 116 207 100.0 2e-52 MVYQTGVDVVWIKEKETIYPYTFEDSLIFTNLDLFRGDEKLKKLGTVTTFYNYLHSIKDL KTFHMKIFDKLESSSNVKAEFANTLLYAEQFEALQTPAYIREGLEWLKSYFEFNHK >gi|225935381|gb|ACGA01000011.1| GENE 3 2638 - 4407 952 589 aa, chain - ## HITS:1 COG:PA1939 KEGG:ns NR:ns ## COG: PA1939 COG3593 # Protein_GI_number: 15597135 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Pseudomonas aeruginosa # 40 584 1 565 665 347 36.0 4e-95 MRINHVHIRNFRKLRNCRIDFDENQTIFVGANNSGKTSAMSAIIWFLKGKDRFTTREFTL TNWIKINELADKWLVFDEVDPLLLTPNQWDDIVPSLDIWIDVSESEAYQVYKIIPSLTWK KDSVGVRLRFEPKDIKSLYADYKKEVMKVRNLQASPEYKETRNLKLYPQNMWDFLNHNHN LLAYFNIKYYVLDSEKINHELESEVQVTPDDSLEENPLVDLIRIDSIEAFREFSDPMGKN ESDIDTLSKQLQAYYRQNYTDETIIETGDLKLMEELSRANTSYDLKLQKSFSMPIGELSN INYPGFQNPSISIRSNVNIVDSISHDSSVQFSIQGKPELSLPEKYNGLGLRNLISIYLKL IQFREQWTRNSKNEEDVKKSIEPIHLVFIEEPEAHLHAQAQQVFIRKALEALTNAAENNA LKENPNLTTQLIVSTHSNHIVNEVDMNCLRYFRRVLDEIIGIPVSKVVNMSRTFGENEND TKKFVTRYIKLTHCDIFFADAAILVEGAAERILMPHFIKREGMDNYYISVMEINGSHAHR FSSLLKKLNILTVIVTDIDAQQEVEEDGKKKKNLAFPKEGKVKRPITIL >gi|225935381|gb|ACGA01000011.1| GENE 4 5463 - 6692 792 409 aa, chain + ## HITS:1 COG:SMc02489 KEGG:ns NR:ns ## COG: SMc02489 COG0582 # Protein_GI_number: 15966799 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sinorhizobium meliloti # 217 397 137 325 330 62 27.0 2e-09 MENEKFKVLLYLKKSVLDKSRKTPIMGRITVGDTVAQFSSKLSCTLSLWNPRASRLNGKS QEAVETNGKIDKLLLSINEAYNMLTERNVPFDATDVKNLFQGGIATQMTLFRLFDRHIEE VRARVGIDVSHRTIPNYLYTRRRLGEFVSKNYNVKDLAFSQLNEQFIREFQDYLILERNL GVETVRHYLAILKKICRIAFKEGHSDRHYFVNYPLLKQKVNPPRTLSREEFEKIRDLQFE EHRWSHITTRDMFLFACYTGTAYVDVIFITNDNLSKDDAGDLWLKYQRGKNGKLCRVKLL PEAIELIEKYKNPSRETLFPKMEYNALKWNLQSIRQLIGMTGPLTYHMGRHSFSSLITLE GGVPIETVSKMLGHSDIKTTQIYARVTPKKLFEDMDKYIEATKDLKLVL >gi|225935381|gb|ACGA01000011.1| GENE 5 6714 - 7901 840 395 aa, chain + ## HITS:1 COG:no KEGG:BF3004 NR:ns ## KEGG: BF3004 # Name: not_defined # Def: tyrosine type 
site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 393 1 394 403 519 61.0 1e-146 MRSTFKTLFYINRGKVKKDGTTAIFCRITVDGEQTVITTGIFCNPKDWKSKKGEVKDEKV NGQLKSFRQRIEQTYENTLKKYGVVSVDLLKNAILEIHTIPTMLLLAGEAERERLRVRSL EINSTSTYRQSKISQNNLREYLLTLKMKDIAFTDITEEFGEGYKLFLKAKEYKSGHINHC FTWLNRLIYIAVDQEILRFNPIADVKYEKKEPPKLQHISRNELKLIMEKPMPNTFQELIR RAFIFSAFTALAYVDLKGLYPRHIGRTAEGKPFIRIHRKKTQVEAYVPLHPVAEQILSLY NTEDDTKPVFPLPNRDQIWYCITEIGFLAGVKGNMSYHQSRHTFGTLLLSEGIPIESISK MMGHTNISTTQVYAKVTDMKISEDMDKLIERRKAM >gi|225935381|gb|ACGA01000011.1| GENE 6 7979 - 8275 201 98 aa, chain - ## HITS:1 COG:no KEGG:BDI_0745 NR:ns ## KEGG: BDI_0745 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 97 3 96 97 105 56.0 6e-22 MSTELVTKDNERIKSLFCSLDRLLDRIETVMTGYEPSLNGERFLTVAQVSERLKISRRAL QEYRTKGKIPYLQLGGKTLYRESDIQKLLEQNYREAWE >gi|225935381|gb|ACGA01000011.1| GENE 7 8278 - 8556 300 92 aa, chain - ## HITS:1 COG:no KEGG:BF1983 NR:ns ## KEGG: BF1983 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 91 1 91 94 92 48.0 4e-18 MEIINIEARTFEVMLERFESFVRKAEKLVDRNKSKELDGWLDNQDVCLILNVNPNTLQYL RDKRKLAYTKFNRKMYYKPEDVERFINNLKSE >gi|225935381|gb|ACGA01000011.1| GENE 8 9045 - 9284 79 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170446|ref|ZP_05756858.1| ## NR: gi|260170446|ref|ZP_05756858.1| hypothetical protein BacD2_01128 [Bacteroides sp. 
D2] # 1 79 1 79 79 103 100.0 4e-21 MIRTAEKNRKQRFSNLPISIGSILKIAFVVAGFALWGWSFVAVVAGIYLAWAIIKGVLSC LISLAVIVGFILLLIVLIF >gi|225935381|gb|ACGA01000011.1| GENE 9 9296 - 9508 168 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKPKYFNSLGSLNEYANSYPNNSFSFEHNPRKCGCDSVRCSFDVTDKEDNDILEILVICP ICANNPKNRQ >gi|225935381|gb|ACGA01000011.1| GENE 10 9510 - 9737 172 75 aa, chain + ## HITS:1 COG:no KEGG:BF1345 NR:ns ## KEGG: BF1345 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 69 4 73 73 62 51.0 6e-09 MTPPKMNLHGVSICPAGEENYEQYKDFRGKWLFQYDYRDTDGELFSVVMPTLTECRMRLD KWLSEKINKQNQESF >gi|225935381|gb|ACGA01000011.1| GENE 11 9749 - 11686 1535 645 aa, chain + ## HITS:1 COG:lin2922 KEGG:ns NR:ns ## COG: lin2922 COG1475 # Protein_GI_number: 16801981 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 38 198 25 182 283 123 44.0 1e-27 MTAIKNNGTAVAVAEKANEVVTAVVVRTLENGVKKAVNIPSDLIEPSNYNARKTFDADAL KELAQSISVHGLIQPITLRRKGEKGEHYEIICGERRFRACRMLKLAEIPAIVREATDEQA YDLSISENLQREDVPPMEAAKAYKRLIDTKRYDVASLALQFGKSEKHVYQTLKLCELIKG IANLVKAGKLTASAGIVISKYDKKIQAEILKDRLGEDGQGEWCSISAGVLDGKIQSCYTN HLENYLFDKTPCLKCPHNSTNFDLFATGSGCGRCADKKCLDAKNTAYLVEQAQAVALADP KLVFIGEQYGYENEATQKVRKGGYEFKNVHTYNLSRYPTAPSVPQSADFAKPEDFAKAQD KYGKEQERYSKQTAHLDELKEQGKIRVYAEIGDKGVKLHYKEVASKDTKSNEQLIADLTA KKKRNTELQAEKTAEGVKELLRTDELPESAFTADEETAMYFFMLSKLRRSHYKAVGLKEN DYYGLTEEKKLQIASTLTEEQKTVIRRDYLYSHLTDRTATVADTKGGLLLAFAKQHLPKK TAEIEATHAEDYGKKNARLDERIAGLKKAEKQMKADAKAKKTEQPAKAENTAKTVGKTAD KAKKATALTAKVAEQPKVIPTPTAGTVHLVTVPPKGKKAEVATAV >gi|225935381|gb|ACGA01000011.1| GENE 12 11740 - 11922 140 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170449|ref|ZP_05756861.1| ## NR: gi|260170449|ref|ZP_05756861.1| hypothetical protein BacD2_01143 [Bacteroides sp. 
D2] # 1 60 1 60 60 103 100.0 4e-21 MTTVFVLFQTDVHRTRASRVFFGIFSTEVKAIDHAKENGLYRHDAEVVIIECEIDKFSEV >gi|225935381|gb|ACGA01000011.1| GENE 13 11919 - 12146 239 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170450|ref|ZP_05756862.1| ## NR: gi|260170450|ref|ZP_05756862.1| hypothetical protein BacD2_01148 [Bacteroides sp. D2] # 1 75 1 75 75 144 100.0 1e-33 MRIDAYIGNPIIATQEQYYQLEELLHEIDPDCRSVQLRTERLIIENISRRTFHYPHNPSI EWVAGRLQYMVENGL >gi|225935381|gb|ACGA01000011.1| GENE 14 12331 - 13623 369 430 aa, chain - ## HITS:1 COG:no KEGG:Xaut_0787 NR:ns ## KEGG: Xaut_0787 # Name: not_defined # Def: hypothetical protein # Organism: X.autotrophicus # Pathway: not_defined # 54 421 131 504 514 235 34.0 3e-60 MEENNLPNFKYILDLLENKFNKKISQRFMQIAMMSSAINASYIKCGVYNLQGNSLNLDNI ENAIDFFQSRRLYYVSVLPLIPQMAKGSKKIKYIDLLNFFQYCIDSCLVGITSAYFKLLI NKCISDYEITSDGTIAKGNFDFSHLESFFLEPERLSLLDQLELRPDIVIQKTMLPKSDNK IFSFSETANAMSLYEGAFHKYKLDKSSIFKELSLFLYDIAIYIEDDFNIVIEETDFEKIS QKYKSLKLSIQTTDYFVALNSKAPFQKVGNYYFSTVVLLTRFVTNTLSTSLLKNRTFQIH SGFVFEDKVASILKKFNFELTDIKRINRKEFDLVTIKNGEIYNFQCKNNYYDISTVDLDY NKIAKLNNRLCNYYEKALTKEINRENLIKDKLGIQQIYHFVISRYPVITRNEKIINFNRL ENWLIDKNLI >gi|225935381|gb|ACGA01000011.1| GENE 15 13635 - 13886 145 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170453|ref|ZP_05756865.1| ## NR: gi|260170453|ref|ZP_05756865.1| hypothetical protein BacD2_01163 [Bacteroides sp. 
D2] # 1 83 1 83 83 147 100.0 2e-34 MAQKDKIRSIEKPEMPIGTLHGQDIDDFFNKSIAYHKEVFENTLSEYEPFDLYSNWKTFV RNLDEFINIPFEIDIDDKKNIYH >gi|225935381|gb|ACGA01000011.1| GENE 16 13870 - 15348 442 492 aa, chain - ## HITS:1 COG:no KEGG:PP_3067 NR:ns ## KEGG: PP_3067 # Name: not_defined # Def: hypothetical protein # Organism: P.putida # Pathway: not_defined # 11 314 18 298 425 130 29.0 1e-28 MGIFATYTQDEYVRKEMQMKRSFDIYKMSDSRERKLWKESIIVFDSSALLDIYFLPKATR EKVFENHFKKKLEGRLWIPSHVNFEYYKNREQIIKKPITENYQPLKDEVLGAIKKSISAI IKKTNDFKNRTKNDDRHPYLKQEDIDKYLTVLESLKNDTDSFEKSIVNQILEIEKEIEDL PNNDDVLKAVEEYFQIGREYNFNEIVEITKEGKHRYEHSIPPGYQDKKDKKGFQIFGDLI IWKQIIEYASETQKPILFICNDLKEDWCYLEDDVTEKRIKSPREELIKEIFDEANVDFWM YNLSQFLYKSNKHLFSSDEEVIDSDKILRFSRLIQENKFSKKPKSRSRVIHEEFYECNEC DGNNDGFGNYVDYWTETSIINEYPSSHRNSKFDSAYTGACEWCNTLHIECPCCHSVTAIN IHQYDEKVECEGGCGIFFFIESDPSHHNMDNYEIKIIDHRAEECSACGEEYIDDGGNTGV CKKCNEEYGTER >gi|225935381|gb|ACGA01000011.1| GENE 17 15410 - 15874 391 154 aa, chain - ## HITS:1 COG:DR0676 KEGG:ns NR:ns ## COG: DR0676 COG0454 # Protein_GI_number: 15805703 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Deinococcus radiodurans # 1 141 1 140 152 67 27.0 1e-11 MDDRIKFITDIDTLVKLRFDYISTEIDLSTTDKAKLEPKLREYFKEYVSEEKFTAYAMEI DGEIVSAAFLIIDEKPPGLNNLNGRYGTIMNVFTYPEHRGKGYATSVMSSLIVDADNIFV SVLDLYSTKAGKGLYEKLGFKEVEYSTMRLYTNE >gi|225935381|gb|ACGA01000011.1| GENE 18 15879 - 16331 390 150 aa, chain - ## HITS:1 COG:no KEGG:BVU_2479 NR:ns ## KEGG: BVU_2479 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 149 1 146 147 150 47.0 2e-35 MNYIETNLQPGEEIKYVAKLHFFLFVQPIVLLLIGALLASSPKEISAMTHYAGLLILFFG IVSLMQRLLVKIGSAYAVTNKRVILKTGVISRRVVDLVLAKCEGLHVKQSVLGRIFNFGA ITVTTGGVTSSYPFIADPLAFRREINTQIG >gi|225935381|gb|ACGA01000011.1| GENE 19 16398 - 16892 231 164 aa, chain - ## HITS:1 COG:HP1543 KEGG:ns NR:ns ## COG: HP1543 COG0739 # Protein_GI_number: 15646150 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Helicobacter pylori 26695 # 28 164 140 281 312 115 40.0 3e-26 MEQVYSANEFQWIADSLQLSFGELCNYPVIFPIKKAQRISSGFGMRVHPVYRVRKFHTGI DISGVKGTPVYATGNGIIVRKGYCSGYGNYIEIKHAGGFHSFYAHLSRTMVNVRDSVGIA QQIAYVGSTGIATGSHLHYEIRKGKYYLNPTGWCYLLFEIWKNK Prediction of potential genes in microbial genomes Time: Fri May 13 06:35:04 2011 Seq name: gi|225935380|gb|ACGA01000012.1| Bacteroides sp. D2 cont1.12, whole genome shotgun sequence Length of sequence - 18470 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 4, operones - 3 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 467 148 ## BT_2287 conjugate transposon protein 2 1 Op 2 . - CDS 479 - 1063 611 ## PGN_0593 putative conserved protein found in conjugate transposon TraO 3 1 Op 3 . - CDS 1066 - 2028 732 ## PGN_0594 conserved protein found in conjugate transposon TraN 4 1 Op 4 . - CDS 2075 - 3451 1032 ## PGN_0595 putative conserved protein found in conjugate transposon TraM 5 1 Op 5 . - CDS 3492 - 3716 164 ## BF1772 hypothetical protein 6 1 Op 6 . - CDS 3724 - 4347 374 ## BF1237 hypothetical protein 7 1 Op 7 . - CDS 4389 - 5390 774 ## PGN_0598 conserved transmembrane protein found in conjugate transposon TraJ 8 1 Op 8 . - CDS 5394 - 6014 615 ## PGN_0599 putative conserved protein found in conjugate transposon TraI 9 1 Op 9 . - CDS 6068 - 6382 331 ## gi|260170467|ref|ZP_05756879.1| hypothetical protein BacD2_01233 10 1 Op 10 . - CDS 6394 - 8892 1545 ## COG3451 Type IV secretory pathway, VirB4 components 11 1 Op 11 . - CDS 8885 - 9220 385 ## BF0124 hypothetical protein - Term 9238 - 9262 -1.0 12 1 Op 12 . - CDS 9263 - 9565 227 ## Fjoh_3006 hypothetical protein - Prom 9648 - 9707 2.2 13 2 Op 1 . - CDS 9716 - 10429 550 ## BT_2301 conjugate transposon protein 14 2 Op 2 . - CDS 10466 - 10924 329 ## BT_2302 conjugate transposon protein 15 2 Op 3 . 
- CDS 10932 - 11687 611 ## COG1192 ATPases involved in chromosome partitioning - Prom 11853 - 11912 5.7 16 3 Op 1 . + CDS 12423 - 12854 363 ## BF1366 hypothetical protein 17 3 Op 2 . + CDS 12858 - 14132 863 ## BT_2305 putative mobilization protein 18 3 Op 3 . + CDS 14167 - 16176 1318 ## COG3505 Type IV secretory pathway, VirD4 components + Term 16185 - 16227 7.2 - Term 16171 - 16214 6.4 19 4 Tu 1 . - CDS 16232 - 18469 696 ## SUN_1091 hypothetical protein Predicted protein(s) >gi|225935380|gb|ACGA01000012.1| GENE 1 2 - 467 148 155 aa, chain - ## HITS:1 COG:no KEGG:BT_2287 NR:ns ## KEGG: BT_2287 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 144 7 141 145 152 55.0 3e-36 MKRIKSFFGITAWLVAIMLLALACSDDMDINKVYAFDLVCMPVQKKIVQGEVAEIRCQIV KEGDYQETKFFIRMFQPDGVGELWLDDGRVLLPNDLYLLEKETFRMYYTSHCTDQQVIDV YIEDNHGQVVQKLSRFRMTRERKRIYKQERNSHDD >gi|225935380|gb|ACGA01000012.1| GENE 2 479 - 1063 611 194 aa, chain - ## HITS:1 COG:no KEGG:PGN_0593 NR:ns ## KEGG: PGN_0593 # Name: traO # Def: putative conserved protein found in conjugate transposon TraO # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 194 1 194 194 255 65.0 7e-67 MKRLSFIMAFALCLLFTDQAHAQRQLPGMRGIQVTAGMVDGIYSSALDNEAGYYFGAAMA TYAKNGNKWVFGAEFMERYYPYRTIRIPVSQFTAEGGYYLKILSDPSKTFLLSLGGSALA GYETSNWGEKTLYDGSTLRHKDAFIYGGAITLKLETYLTDRIVLLLTGRERILWGNSTGH FHTQFGLGLKFIIN >gi|225935380|gb|ACGA01000012.1| GENE 3 1066 - 2028 732 320 aa, chain - ## HITS:1 COG:no KEGG:PGN_0594 NR:ns ## KEGG: PGN_0594 # Name: traN # Def: conserved protein found in conjugate transposon TraN # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 320 1 311 311 497 75.0 1e-139 MKRIFLMLALIVSVFAANAQEVQTVTEETATVIQTKPTTGDYYQGFTRPLTFDRMIPPYA LEVTFNKTVHVIFPAPIRYVDLGSADLLAAKADGTENVLRVKAALRDFSRESNLSVITED GNYYTFNVKYADEPVKLSVEMTDFLHDGEAVNRPNNALAVYMQELGQESPLLVKLIMQSI YKNNDREIKHLGSKRFGIQHTLKGVYTHNGLLYFHLQLKNSSNVPFNIDFITFKIVDKKV AKRTAIQEQVIWPLRAYNNLMVIGGKRTERMVFTLPKFTIPDDKMLVIELNEQEGGRHQR 
FTVDNADLVRAKVINELKVK >gi|225935380|gb|ACGA01000012.1| GENE 4 2075 - 3451 1032 458 aa, chain - ## HITS:1 COG:no KEGG:PGN_0595 NR:ns ## KEGG: PGN_0595 # Name: traM # Def: putative conserved protein found in conjugate transposon TraM # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 40 456 25 452 453 385 52.0 1e-105 MEQENLNKGTPPETEAIPEVQPSAPDSKSGAGKEPPKSPPKKEVTMAERQKRKKMLIMPL FFLIFGGAMWLIFAPGEKDDTKVEGLSGLNAELPVPKDEGIVGDKRDAYEREAMQQKEQE RMRSLQDFSTMFDQNEEQKPEESNEPEYYESPSARQTVPANNLQASANAYQDINKQLGSW YEEPAGQSAQELAVEERMSELERKLEEAEAQKATEDEQTALLEKAYAMAARYMPPQAGQA ESSNAVPSTKDKVNVQPVKQVRHNVVSLLSAPMSDDEFIESFSQPRNMGFFTAAGNDEVK DKNSIRACVNQTVTLTSGREVQIRLLEPMQAGSILIPANNIITGSCKIAGERLDIAINSI QYAGNIIPVEISVYDTDGQRGIFIPNSDEVKAVKEVASTLASSAGTSIMISDDAGSQLAA DMGKGLIQGASQYVSKKMSVVKVTLKANHRLLLLPKEN >gi|225935380|gb|ACGA01000012.1| GENE 5 3492 - 3716 164 74 aa, chain - ## HITS:1 COG:no KEGG:BF1772 NR:ns ## KEGG: BF1772 # Name: traL # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 73 1 73 94 74 46.0 1e-12 MLQKIIGRVNEWLEDRLRSLVRPLSPDARVIVIVTMLIVFSSLSIYMTVSSIYNLGKDKG KELQIEQIETLKLE >gi|225935380|gb|ACGA01000012.1| GENE 6 3724 - 4347 374 207 aa, chain - ## HITS:1 COG:no KEGG:BF1237 NR:ns ## KEGG: BF1237 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 207 1 207 207 339 77.0 4e-92 MEFKSLKNIETSFKQIRLFGIVFLVLCAAVTGYSVWSSYSFAEAQRQRIYVLDGGKSLML ALSQDLSQNRPVEAREHVKRFHELFFTLSPDKKAIESNVNRALFLVDKSAFKYYQDMAEK GYYNRIVSGNINQIVQVDSIACNFDVYPYTAVTYARQMILRQSNMTERSLITRCNLINSV RSDNNPHGFTMEKFEIVENRDIRVTER >gi|225935380|gb|ACGA01000012.1| GENE 7 4389 - 5390 774 333 aa, chain - ## HITS:1 COG:no KEGG:PGN_0598 NR:ns ## KEGG: PGN_0598 # Name: traJ # Def: conserved transmembrane protein found in conjugate transposon TraJ # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 6 332 1 328 328 506 75.0 1e-142 MMLLAIDFENLHQILRNLYQEMMPLCGDMIGVAKGIAGLGALFYVASRVWQALSRAEPID 
VYPLLRPFVIGLCIMFFPTIVLGTINAVMSPVVKGTHTILESQIEDVTALQAEKDGLEYD ARVREGKAWLVDDEVYDRKMEELGIWEMGEIISMWGERKWYDMKMWFRQLVRDFFELLFN AAALTIDTLRTFFLVVLSILGPLAFALSVYDGFQSTLTNWLSRYISVYLWLPVADLFSAV LSKIQALMLQMDIGLLKDPSYVPDGSNGVYIIFLIIGIIGYFSIPTVAGWVIQAGGGNAT RGVNSAASKGAALAGGVAGAAAGNVAGRLLGKK >gi|225935380|gb|ACGA01000012.1| GENE 8 5394 - 6014 615 206 aa, chain - ## HITS:1 COG:no KEGG:PGN_0599 NR:ns ## KEGG: PGN_0599 # Name: traI # Def: putative conserved protein found in conjugate transposon TraI # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 206 1 207 207 239 60.0 4e-62 MKKILLSLVLAVSFFGFEAKAQWAVIDPSNLAQNILTVSKTANTATNVINSFKEMQKIYN QGKEYYDKLQSVHNLIKDARKVKETVELVSEISQIYSTNFNKMLTDKNFTHKELEAISNG YSKLLKESGNLLSDIKNIVSTSNGLSMTDAERMAIIDEIHSQMIEHRNLTRYFSKKSISV SFIRSQEKGDMEHVRALYGDPSERYW >gi|225935380|gb|ACGA01000012.1| GENE 9 6068 - 6382 331 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170467|ref|ZP_05756879.1| ## NR: gi|260170467|ref|ZP_05756879.1| hypothetical protein BacD2_01233 [Bacteroides sp. 
D2] # 1 104 1 104 104 217 100.0 1e-55 MFSKCNPTDPCAVHQDFNLWEITGRWVSPDGAPAVTIYRNTSRKRGGIRLCLTYNNPQVV CDCTVYCVFGLYYIDLYGRIGLAYDREREVLLLSAFGEYVRVED >gi|225935380|gb|ACGA01000012.1| GENE 10 6394 - 8892 1545 832 aa, chain - ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 436 734 189 472 593 72 23.0 4e-12 MRNIMKAATIESKFPLLAVEHDCIISKDADITLAFRVDLPELFTVTSTEYEAIHSAWAKA IKVLPDYSVIHKQDWFVKETYKPETDKEDMSFLSRSFERHFNERPYLNHSCFLYLTKTTK DRMRMQSNFSSLCRGNIIPKEVNKDTATRFLEAVGQFERIVNDSGFVSLTRLSSDEITGT ENEPGLVEKYFALSLDDTTCLEDMELGADGLRIGAKKVCLHTISDVEDLPGTVGTDMRYE KLSTDRSDCRLSFAAPVGLLLSCDHVYNQYIFLDDSAENLRKFEKSARNMQSLSRYSRGN QINKEWIDKYLNEAHSFGLTSVRAHFNVMAWSEDAEELKHIRNDVGSQLALMECKPRHNT TDAATLYWAAMPGNAGDFPSEESFYTFIEPALCFFTGETNYKSSLSPFGIKMCDRVSGKP LHVDISDLPMKQGITTNRNKFVLGPSGSGKSFFMNHMVRQYWEQGTHVVLVDTGNSYQGL CEMVRRKTKGEDGIYFTYTEENPISFNPFYTDDYKFDVEKKDSIKTLLLTLWKSEDDKVT KTESGELGSAVSAYIERIRSDRSIVPSFNTFYEYMLGDYRRELEEREIKVSKEDFNLDNM LTTLRQYYRGGRFDFLLNSEQNIDLLNKSFIVFEIDSIKENKELFPVVTIIIMDAFIQKM RRLKGVRKQLIVEEAWKALSSANMAEYLKYMYKTVRKYFGEAIVVTQEVDDIIASPIVKE AIINNSDCKILLDQRKFMNRFDVIQSLLGLTDKEKGQILSINMANHPNRMYKEVWFGLGG VQSSVYATEVSGEEYLTYTTEETEKMEVFKKTDELGGDYELAIKQIAESKRN >gi|225935380|gb|ACGA01000012.1| GENE 11 8885 - 9220 385 111 aa, chain - ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 107 3 108 110 158 69.0 7e-38 MEHNINKGIGKSVEFKGLKSQYLFIFAGGLLAVFVVFVIMYMAGVDQWVCIGFGVIAASV LVWLTFNLNAKYGEHGLMKMLAKRQHPRYLINRKNSRRLFSRPKKKEVRDA >gi|225935380|gb|ACGA01000012.1| GENE 12 9263 - 9565 227 100 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 3 100 30 127 127 141 80.0 8e-33 
MTKKKILMSAAFIFAAVSSAFAQGNGIGGITEATNMVTSYFDPGTKLIYAIGAVVGLIGG IKVYNKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|225935380|gb|ACGA01000012.1| GENE 13 9716 - 10429 550 237 aa, chain - ## HITS:1 COG:no KEGG:BT_2301 NR:ns ## KEGG: BT_2301 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 219 12 224 240 68 29.0 2e-10 MLYALVVAIYLLRERMQKRKKNAAKNSGFNPFRSSPKEDIVGKSKFDLRQSRTEATTLIN NEKGIENEPIFADENKKDTPVTPPLVESEIVLSAENMTDENSNEINLVIENTAPEFEPEY NNEDIDKEETEDEDTEGVAGVSIALGLGFDELAGMVRTVETADAATSEEKEEAGRVLVEI RKTEMFDQVVGDEPKKKVVSALMDDYFSAFHRKKREAGETDEPIVKAPKDFNVRGFA >gi|225935380|gb|ACGA01000012.1| GENE 14 10466 - 10924 329 152 aa, chain - ## HITS:1 COG:no KEGG:BT_2302 NR:ns ## KEGG: BT_2302 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 79 152 47 120 120 88 56.0 9e-17 MAKRADTSGLDKGAVISYAKPSVREKVLKPYDPSSSEPVPETEQVPELPQETEVETEPVK EKEQPREEPKRRENKVQEYESLFFKEAAIKTRNGKVVYIRKEHHDRIMKIVRVIGENEFS LFNYLDNILEHHFATYQDEITKLYRKKNTDVF >gi|225935380|gb|ACGA01000012.1| GENE 15 10932 - 11687 611 251 aa, chain - ## HITS:1 COG:BBD21 KEGG:ns NR:ns ## COG: BBD21 COG1192 # Protein_GI_number: 11496587 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Borrelia burgdorferi # 6 242 2 244 246 59 27.0 5e-09 MKKEPKYVAFSTQKGGAGKTTLTVLVASYLHYVKGYNVAVIDCDFPQHSIADLRERDFKM VDDDEYYKGMAYEQITRLDGKKFYPVVESSTEEALNDAEALCEEEEYDFIFFDLPGTLNN KDLVVALANMDYIIAPIAADRFVLESTLNYLIAVRDTIVNPGKSNIKGMHLLWNLVDGRE KSDLYEVYEAVIKDLSFPVMKTFVPNSLRFRKEQSISHKALFRSTIFPADKALVKGSNID ALTDELLEILK >gi|225935380|gb|ACGA01000012.1| GENE 16 12423 - 12854 363 143 aa, chain + ## HITS:1 COG:no KEGG:BF1366 NR:ns ## KEGG: BF1366 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 143 2 138 140 159 59.0 4e-38 MNEEKREPRKRGKGGRPPKNDPAKHRLTVNLTDTQHAGFLAMFEQSGISSLSAFICARIF 
GDEFRVVKTDASAVEFTAKLTALHSQFRSVGVNYNQIVKELHSNFGEKKALALLYKLEKA TVELAGIGREVMQLCEQFKAKYL >gi|225935380|gb|ACGA01000012.1| GENE 17 12858 - 14132 863 424 aa, chain + ## HITS:1 COG:no KEGG:BT_2305 NR:ns ## KEGG: BT_2305 # Name: not_defined # Def: putative mobilization protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 408 382 55.0 1e-104 MVAKISHGSNLYGALSYNQEKVDGGLGKVLATNLVIEPTDGAFNATACMQDFERFMPSHI TTKKPVIHISLNPHPDDKLTDSQLAEIGQKYMERLGYGSQPYMIFKHEDIDRQHIHIVST RVRPDGTLVPDSFEKDRSNKIRRELEKEYNLIPANGQKQGEAWQFAPVNVSQGNLKKQVA NVIKPLSEMYRFQTLGEYRALLSLYNIGVEEIKGENKGKPYRGLVYSALDSEGNRVGKPL KSSLFGKALGIEVLEKQFGTSKETIKADGIPARTRIVVAASLATARTESEFREALQKKGI DLVLRRNDEGRIFGVTFIDHNERAVLNGSRLGKEFSANVLNERFADDSPREDLQTRQPKP SDDVRPNNQPPTNKQPEPQSGHSTVDDAASSLFSVLTPETGGQDNQQPTIKRKKKKKRRY GRQL >gi|225935380|gb|ACGA01000012.1| GENE 18 14167 - 16176 1318 669 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 201 558 117 466 589 90 25.0 1e-17 MQNEDDLRGLAKVMDFMRALSILFVVINVYWFCYEAMHEWGINIGVVDKILMNFQRTAGL FSSILITKIFAVVFLALSCLGTKGVKEEKITWTKIYVCLAIGFVLFFLNWWLLALPLPKV ANAGFYIFTISVGYILLLMAGTWISRLLKNNMFDDVFNTENESFMQETKLMVNDYSINLP TKFWYKKKQWKGWINVVNPFRAAIVLGTPGSGKSYAVVNQFIKQQIEKGYTGYIYDFKFP DLSTIAYNHLLNNREGYEKVPTFYVINFDDPSRSHRCNPINPSFMDDISDAYESAYTIML NLNRTWVQKQGDFFVESPIILFAAIIWYLRIYEDGKYCTFPHAIEFLNKSYEDIFPILTS YPDLENYLSPFMDAWKSGAADQLQGQIASAKIPLSRMISPQLYWVMSGDEFTLDINNPDD PKILCVGNNPDRQNIYGAALGLYNSRIVKLINKKGQLKSSVIIDELPTIYFKGLDNLIAT ARSNKVAVLLGFQDFSQLKRDYGDKEAAVVMNTVGNIFSGQVVGDTAKTLSDRFGKVLQK RQSMTINRNDKSTSISTQMDSLIPPSKISNLTQGMFVGAVADNFDERIEQKIFHCEIVVD NERVAVETKAYRKIPVITNFVDENGVDRMKEMIKENYDRIKAEAKQIVADELARIKADPE LSKLLPKEE >gi|225935380|gb|ACGA01000012.1| GENE 19 16232 - 18469 696 745 aa, chain - ## HITS:1 COG:no KEGG:SUN_1091 NR:ns ## KEGG: SUN_1091 # Name: not_defined # Def: hypothetical protein # Organism: Sulfurovum_NBC37-1 # Pathway: not_defined # 1 743 91 824 826 645 46.0 0 INAGMRLMNVRDITDINGNTDTYATVYIPRGQESKFLNKLRDYATKVTQKGNPKNQKLIN SIEDIHIALLNSFWTDSPQLLPNENYNWFEVWIKVDESTNKSEQITQFTDLLTTLEISFK SNYLLFPERAVLLVYANNESLIQILNSSDSLAELKAGKETANFWTNESPSDQNEWVEDLL SRLVVNDQSNISVCVIDTGVNSGHKLLSPVLQSAHCLAYDSQWTTADRKGHGTMMAGVCA YGDLSRSLEHNAPVEVKHTLCSVKLLPDVGENPKELWGDITKQCIYKAEITIPDKKVLNC MAVTSSDTHDKGRPSSWSAALDDMCVGTSEETPRLIIVAAGNVNDEEVWNNYPDGNTLQS VHNPGQSWNCLTIGAYTEKIQIDDSDYDEFERVAPSGGISPFNSTSFLWDKKWPIKPDVV FEGGNLIKTGNDFPPYTEHDALGVLTTSKNIQFRQFESFNMTSASCALASNLVGEIAAIY PDLWAESLRALIIHSAKWSDTMLRQFNVAGRTNIKKLLRSCGYGIPNRERALYSTENGFT YISQSSLKPYIKNGSTVSLNEMHFIDLPWPKELLESLGEINTTLRITLSYFIEPSPGEIG WKDKYTYQSCGLRFDVNNPQETQDQFKRRINKYVESEDADAANIDLVENDSGRWLIGMKN RSVGSIHSDMIITTAADLASCNMIAVFPVGGWWKMRTNLNRYNSRIRYSLIVSLETPATE VDLYNVVTTKIAAMVETPVEINIPV Prediction of potential genes in microbial genomes Time: Fri May 13 06:36:56 2011 Seq name: gi|225935379|gb|ACGA01000013.1| Bacteroides sp. 
D2 cont1.13, whole genome shotgun sequence Length of sequence - 67150 bp Number of predicted genes - 67, with homology - 66 Number of transcription units - 31, operones - 17 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . - CDS 1189 - 1401 208 ## PGN_0927 hypothetical protein - Prom 1630 - 1689 6.2 + Prom 1436 - 1495 2.4 3 2 Op 1 . + CDS 1657 - 3222 1641 ## PGN_0581 hypothetical protein 4 2 Op 2 . + CDS 3257 - 5365 1246 ## COG0550 Topoisomerase IA + Term 5444 - 5482 9.1 5 3 Tu 1 . + CDS 5927 - 6454 528 ## COG4734 Antirestriction protein + Term 6477 - 6505 1.0 6 4 Tu 1 . - CDS 6523 - 7272 178 ## Taci_1485 hypothetical protein - Prom 7306 - 7365 4.3 - Term 7313 - 7356 5.2 7 5 Op 1 . - CDS 7381 - 7710 239 ## BF1731 hypothetical protein 8 5 Op 2 . - CDS 7707 - 8021 299 ## BF1259 hypothetical protein - Prom 8055 - 8114 6.1 - Term 8187 - 8224 0.2 9 6 Tu 1 . - CDS 8278 - 8787 431 ## COG0262 Dihydrofolate reductase - Prom 8986 - 9045 3.0 - Term 8876 - 8924 8.2 10 7 Op 1 . - CDS 9095 - 9946 402 ## BF1264 hypothetical protein 11 7 Op 2 . - CDS 9960 - 10601 301 ## gi|260170490|ref|ZP_05756902.1| hypothetical protein BacD2_01348 - Prom 10621 - 10680 2.4 12 7 Op 3 . - CDS 10687 - 11136 139 ## gi|260170491|ref|ZP_05756903.1| hypothetical protein BacD2_01353 - Prom 11211 - 11270 2.1 - Term 11212 - 11261 7.4 13 8 Tu 1 . - CDS 11278 - 12345 455 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 12604 - 12651 0.2 14 9 Tu 1 . - CDS 12768 - 13676 232 ## BVU_2470 putative transposase - Prom 13838 - 13897 6.7 + Prom 13579 - 13638 3.1 15 10 Tu 1 . + CDS 13668 - 13820 86 ## - TRNA 13898 - 13974 86.1 # Asp GTC 0 0 + Prom 14004 - 14063 6.2 16 11 Op 1 . + CDS 14127 - 15443 1198 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 15484 - 15536 9.4 + Prom 15459 - 15518 4.2 17 11 Op 2 . + CDS 15538 - 16128 366 ## COG1678 Putative transcriptional regulator - Term 16071 - 16108 1.0 18 12 Op 1 . 
- CDS 16133 - 16669 349 ## PROTEIN SUPPORTED gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase 19 12 Op 2 . - CDS 16666 - 17124 357 ## BT_1080 hypothetical protein 20 12 Op 3 . - CDS 17151 - 17768 512 ## COG0353 Recombinational DNA repair protein (RecF pathway) - Prom 17798 - 17857 3.3 + Prom 17563 - 17622 4.2 21 13 Op 1 . + CDS 17860 - 19311 634 ## COG0591 Na+/proline symporter 22 13 Op 2 . + CDS 19319 - 19924 594 ## COG0218 Predicted GTPase 23 13 Op 3 . + CDS 19933 - 20538 591 ## BT_1084 hypothetical protein + Prom 20554 - 20613 1.7 24 13 Op 4 . + CDS 20637 - 21032 306 ## BT_1085 hypothetical protein + Term 21056 - 21088 2.0 + Prom 21059 - 21118 7.8 25 14 Op 1 . + CDS 21138 - 23126 2134 ## COG1297 Predicted membrane protein 26 14 Op 2 . + CDS 23133 - 24689 1198 ## COG0657 Esterase/lipase 27 14 Op 3 . + CDS 24692 - 25609 721 ## BT_1088 hypothetical protein 28 15 Tu 1 . - CDS 25566 - 26249 500 ## COG0321 Lipoate-protein ligase B - Prom 26272 - 26331 6.9 + Prom 26198 - 26257 7.8 29 16 Tu 1 . + CDS 26316 - 27155 556 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 27210 - 27266 3.5 - Term 27201 - 27251 4.3 30 17 Op 1 . - CDS 27292 - 29502 2109 ## COG2217 Cation transport ATPase 31 17 Op 2 . - CDS 29600 - 29923 414 ## BT_1092 putative heavy-metal binding protein 32 17 Op 3 . - CDS 29996 - 32209 2250 ## BF2084 putative TonB-dependent outer membrane receptor protein - Prom 32239 - 32298 2.7 33 18 Tu 1 . - CDS 32343 - 32741 202 ## BT_1140 hypothetical protein - Prom 32838 - 32897 6.1 + Prom 32748 - 32807 5.9 34 19 Op 1 . + CDS 32938 - 34023 766 ## COG1408 Predicted phosphohydrolases 35 19 Op 2 . + CDS 34098 - 34331 248 ## BDI_1027 hypothetical protein + Term 34363 - 34401 -0.9 + Prom 34373 - 34432 4.9 36 20 Tu 1 . + CDS 34476 - 35327 837 ## BT_1144 hypothetical protein + Prom 35360 - 35419 5.4 37 21 Op 1 . 
+ CDS 35441 - 36061 411 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division 38 21 Op 2 . + CDS 36105 - 36743 396 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 36776 - 36835 4.7 39 22 Tu 1 . + CDS 36856 - 39705 2641 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 39766 - 39815 8.1 - Term 39754 - 39802 13.4 40 23 Op 1 . - CDS 39828 - 40397 530 ## COG0778 Nitroreductase 41 23 Op 2 . - CDS 40466 - 41551 939 ## BT_1149 type II restriction enzyme HpaII 42 23 Op 3 . - CDS 41631 - 42434 479 ## gi|293373029|ref|ZP_06619398.1| hypothetical protein CUY_0916 43 23 Op 4 . - CDS 42477 - 43094 589 ## COG1739 Uncharacterized conserved protein - Prom 43279 - 43338 3.2 - Term 43197 - 43249 10.1 44 24 Op 1 2/0.000 - CDS 43350 - 44597 1328 ## COG4198 Uncharacterized conserved protein - Prom 44652 - 44711 6.5 - Term 44611 - 44660 8.1 45 24 Op 2 6/0.000 - CDS 44760 - 45680 1361 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Prom 45725 - 45784 5.1 46 24 Op 3 . - CDS 45788 - 46855 1093 ## COG1932 Phosphoserine aminotransferase - Prom 46996 - 47055 6.1 47 25 Op 1 . - CDS 47061 - 47876 627 ## COG0513 Superfamily II DNA and RNA helicases 48 25 Op 2 . - CDS 47828 - 48388 440 ## COG0513 Superfamily II DNA and RNA helicases 49 25 Op 3 . 
- CDS 48393 - 48572 204 ## gi|237716306|ref|ZP_04546787.1| conserved hypothetical protein - Prom 48593 - 48652 5.0 - Term 48700 - 48757 11.2 50 26 Op 1 7/0.000 - CDS 48814 - 50088 1479 ## COG2871 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF 51 26 Op 2 9/0.000 - CDS 50107 - 50733 805 ## COG2209 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE 52 26 Op 3 9/0.000 - CDS 50764 - 51405 656 ## COG1347 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD 53 26 Op 4 9/0.000 - CDS 51426 - 52100 852 ## COG2869 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC 54 26 Op 5 7/0.000 - CDS 52114 - 53286 1277 ## COG1805 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB - Prom 53328 - 53387 2.7 55 26 Op 6 . - CDS 53408 - 54757 1361 ## COG1726 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA - Prom 54789 - 54848 3.8 - Term 54817 - 54865 13.0 56 27 Tu 1 . - CDS 54930 - 56324 1390 ## COG3579 Aminopeptidase C - Prom 56401 - 56460 7.1 + Prom 56317 - 56376 12.4 57 28 Tu 1 . + CDS 56571 - 57815 1029 ## BT_1162 hypothetical protein + Prom 57817 - 57876 6.3 58 29 Op 1 . + CDS 57953 - 58450 376 ## BT_1163 hypothetical protein 59 29 Op 2 . + CDS 58452 - 58700 220 ## BF2008 hypothetical protein 60 29 Op 3 . + CDS 58712 - 60937 1584 ## BF2061 putative transmembrane protein 61 29 Op 4 . + CDS 60927 - 62255 761 ## BF2060 putative transmembrane surface-related protein 62 29 Op 5 . + CDS 62252 - 63169 518 ## COG1216 Predicted glycosyltransferases 63 29 Op 6 . + CDS 63184 - 64377 783 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis + Term 64445 - 64490 -0.9 - Term 64433 - 64478 7.1 64 30 Op 1 . - CDS 64512 - 64958 267 ## COG3023 Negative regulator of beta-lactamase expression 65 30 Op 2 . - CDS 64963 - 65268 378 ## BT_1518 hypothetical protein 66 30 Op 3 . - CDS 65325 - 65867 466 ## BT_1517 hypothetical protein - Prom 65997 - 66056 8.1 67 31 Tu 1 . 
- CDS 66061 - 67149 851 ## BT_4046 hypothetical protein Predicted protein(s) >gi|225935379|gb|ACGA01000013.1| GENE 1 233 - 1183 561 316 aa, chain - ## HITS:1 COG:AGpT158 KEGG:ns NR:ns ## COG: AGpT158 COG0464 # Protein_GI_number: 16119896 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 257 18 284 345 252 49.0 8e-67 MATANQLKTLIKSHFEENNEKFNTVALQIAAHEAKLGHSNLANDIKQIIDNGKKTSPKLK PLNSELQGLVLEVESDKKLSDLIVPSEIKDRINRVIREFTHRNKLLSHNLENRRKILFSG PPGTGKTMSASIIANELHLPIYVILMDKVVTKYMGETSAKLRQIFDLIVDRPAVYLFDEF DAIGSQRGKDNDVGEMRRVLNSFLQFLERDNSESLIIAATNNLGMLDQALFRRFDDVIHY NLPSDEEKIQLLKSRLSKNVTSKDMKILLPMLENLSHAEINQACLDAIKESVLNNSDISL SLIEKTIKERTMAYKI >gi|225935379|gb|ACGA01000013.1| GENE 2 1189 - 1401 208 70 aa, chain - ## HITS:1 COG:no KEGG:PGN_0927 NR:ns ## KEGG: PGN_0927 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 59 1 59 67 83 69.0 2e-15 MNKIKEILDKKGVKQVWLAEQLGKSFNVVNAYVHNRRQPSLETLYKIASILDVDVNQLLC SKEELKKGNE >gi|225935379|gb|ACGA01000013.1| GENE 3 1657 - 3222 1641 521 aa, chain + ## HITS:1 COG:no KEGG:PGN_0581 NR:ns ## KEGG: PGN_0581 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 19 521 25 493 494 198 32.0 4e-49 MDEKKVKNEEPKVFLTQGEDGKLKAIAGEQDGKLKAVDPKKENADQFMKIDTNGNALENF FKKFSAQFSHPSHTGVYAVCESAVDKVAAFFDKLIRIDHKDKALDPYRVKFDGKIRIEVE FTGKYQPLDLNKLNWKEVEKLGLSGESLQDALKAMVYGHKSPGLVDIKPAVDGQEFPMQA RLSLEQQPDGSIKIATHPKQDQPDFEKPFMGVVFTDQDKEQFLKTGHGGRVFDIEPVAGG EKVSSLISLDKLTNRFEAMALKDIYIPQTLKNAPLSEEQQQGLKEGKAVWIEGMDKKTKQ GEQPQKIDRFVQYNAASKNFDFKFSDEQRQQFNQERRAKQGEGQAEGQEQKMPKARKLGE IWVYSKQGGVQLSREDFDKLCNKEPIYVKDMESQKPKQQQDASGAQKVEATDQKGQKYNA WVWIDEDKGKVRHSSKHPDQVRAIEAKQAAQSGQNIKPAAESKTQVAVNNEGKTNEATKH STEPLKKGQSQPTEKQAEKKEQKQEQKQSPAAPKKGKGRKM >gi|225935379|gb|ACGA01000013.1| GENE 4 3257 - 5365 1246 702 aa, chain + ## HITS:1 COG:CAC3567 
KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 641 5 658 709 418 39.0 1e-116 MKIIIAEKPSVAKSIAAIVGANNKKDGYIEGNGYAVTWAFGHLIGLAMPEHYGIEGFKRE NLPILPKEFKLLPRQVKEGKEYKNDPGVMKQLKVIRELFSNGDEIIVATDAGREGELIFR YIFEYIGSSLPFRRLWISSLTDRAIRDGFQTLRPGSDYDNLYASAKARSTADWLVGLNAS QALSISAGYGVWSLGRVQTPTLAIICSRYLENKDFKPQTYFRLKLHTAKDAVAFPALSVD KFVTRTIAEEVTAQVKAAGAVQVTGVERKEVKQEPPLLYDLTTLQKEANSKHGFSADKTL TIAQALYEAKKISYPRTGSRYISEDVLEEIPHLIAILRSHPLFGAYIDSRFDNTLNTRSV DDTKVTDHHALIITEDPATGLSKDEQLIYDMVAGRMLEAFGERCIKENTTVSLDAGGVHF SCKGSVTLVPGWRSVFNFKDEPNDEDDTAVLPALIEGDSLSVRDCEIQEKQTKPRPLHTE SSLLAAMESAGKEVENEEEREAMKDCGIGTPATRASIIETLFSREYIVRDKKSLVPTNKG LVVYLAVKDKKIADVAMTGQWEQALNKIASGNMDASTFHKAIEVYASQITSELLDMTFEQ QNNRQSCPCPKCKSGNVVIYNKVAKCLNENCGLTVWRSIAKKELTDGQLTDLLMNGKTAF IKGFISSKTGSTFEAAVKFDTDYKTVFEFPQNKGASKKRRSK >gi|225935379|gb|ACGA01000013.1| GENE 5 5927 - 6454 528 175 aa, chain + ## HITS:1 COG:YPMT1.61c KEGG:ns NR:ns ## COG: YPMT1.61c COG4734 # Protein_GI_number: 16082851 # Func_class: R General function prediction only # Function: Antirestriction protein # Organism: Yersinia pestis # 9 172 10 166 168 107 43.0 1e-23 MENISEARIYVGTYAKYNEGSIFGKWLDLSDYSDKDEFYEACRELHKDEQDPELMFQDWE YIPSGLIGESWLSDNIFEIIEAIEDIDDDKKEAFEVWLNHESYDLSSKDINGLISSFEDD FQGAYKDEEDYAYEVVEECYELPEFAKTYFDYEKFARDLFMTDYWFDDGYVFRHS >gi|225935379|gb|ACGA01000013.1| GENE 6 6523 - 7272 178 249 aa, chain - ## HITS:1 COG:no KEGG:Taci_1485 NR:ns ## KEGG: Taci_1485 # Name: not_defined # Def: hypothetical protein # Organism: T.acidaminovorans # Pathway: not_defined # 15 249 6 234 235 180 40.0 5e-44 MKYLCIEMKSYSNKIFSDERLDYSCSYCGGDTPVTRDHVPSRILLDDPFPENLPLVGCCQ KCNNGFSKDEEYFASVIECIIHSTSNPELLSREKIKRVLRHNSRLRQRIESSFIKEDPIL FDDIEQRLYFKIESERFDNVLIKLAKGHVKFEHSTPMFEDPDVIWFKYINELSEDEHTAF FNIEDSGKITEIGSRAFHKTHVDNVMQTYHNQWEIVQEGRYVYCVSNDLGQTAVRIILSN YLACYIVWQ >gi|225935379|gb|ACGA01000013.1| GENE 7 7381 - 7710 
239 109 aa, chain - ## HITS:1 COG:no KEGG:BF1731 NR:ns ## KEGG: BF1731 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 8 95 5 92 119 87 50.0 1e-16 MSEDAGLRDWMQEWFEQFMGRFDRLDKFMDIMSGRHNVLNGERLLDNQDLCQLLNVSKRT LQRYRSAGELTYITIHHKIFYTEKNVEKFIRDHFSKGDNDSEKEPEPLR >gi|225935379|gb|ACGA01000013.1| GENE 8 7707 - 8021 299 104 aa, chain - ## HITS:1 COG:no KEGG:BF1259 NR:ns ## KEGG: BF1259 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 98 1 95 119 69 41.0 3e-11 MKVITIESEVYKRLVRKIEWLYSYAKKQEKENALPNPDPSEIWVSDLEAAAILRVSKRTM QRLRSNGDITYSIRGGKAWYTLAEVKRLLLGRVVNNDKPEGGKP >gi|225935379|gb|ACGA01000013.1| GENE 9 8278 - 8787 431 169 aa, chain - ## HITS:1 COG:MA4540 KEGG:ns NR:ns ## COG: MA4540 COG0262 # Protein_GI_number: 20093324 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Methanosarcina acetivorans str.C2A # 3 166 7 177 198 80 31.0 9e-16 MSKIKAHVAISLDGYIAHTDGETNWIPGELAKEISEIHQTSEILLAGFNTYEAIFEQCAG SWPYKNTYVISHHDFSALANKNLQFLFCDQFEQISTLKKEAAGDISVIGGGNLITFLLNN GLLDELNLYIVPVLLGDGIRFLGKTYDVNVKSSKTEEYKGTTKICYILN >gi|225935379|gb|ACGA01000013.1| GENE 10 9095 - 9946 402 283 aa, chain - ## HITS:1 COG:no KEGG:BF1264 NR:ns ## KEGG: BF1264 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 281 1 279 282 250 51.0 3e-65 MRDFYTRLSAKIQTELQSIDLEGCDIPIKESVRMIKYLEDRLSELRDYFLVLKSITPEEE IVFFKEMKPELLGLLLYFNKIHNIELKRPIGSNETQSEYYDKELMSLTYFFERNLDFHQY YRANSTYLDEQYFIRGKSSLQLCVDSAKYILDPLFSTGYDYKVAKIICNEMLRIYLNKKK HSIEKQVIIKKSRELLPNNNLKWTGSKSDAIELGYSIRDSGAINNGNVDVKEIMNFLEVS FDIDLGDYYRTYVAIKGRKKDRTPFLNKLIEVLIKRMERDDSE >gi|225935379|gb|ACGA01000013.1| GENE 11 9960 - 10601 301 213 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170490|ref|ZP_05756902.1| ## NR: gi|260170490|ref|ZP_05756902.1| hypothetical protein BacD2_01348 [Bacteroides sp. 
D2] # 1 213 25 237 237 354 100.0 2e-96 MKIETIKSKGFALSIILFPLMLLCGFLMHPDLLEMKMLHTAQDLVDRFHNNQLYHIGHFI VMMAVPLIIVVMVGTMNLLQGKGKKLGFWGGIIGIFGAFILAVDKGALCLVMSAFDTIPE PQFQGLFPYLDVIVKKAGLLWIVYFLPLLPLGAVIQTIGLMKEKMVQKWQGILIIIGLLL LNNPEIELISTVGSILMSIGYVSWGIKLYRNSN >gi|225935379|gb|ACGA01000013.1| GENE 12 10687 - 11136 139 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170491|ref|ZP_05756903.1| ## NR: gi|260170491|ref|ZP_05756903.1| hypothetical protein BacD2_01353 [Bacteroides sp. D2] # 1 149 1 149 149 247 100.0 1e-64 MIVNDKNQRNFSQNKKRRIGTDILFGAFVLVAISGFILLYMKMTSDSVVLMINQWWWAFI HRMSALLALIFTMPHVYKHRKWYKKVFTPKPKSKITVILSISFAITLLTIIVLAVKRDSV SWEIVHSLVGFIAVAFIIIHALKRYHIIK >gi|225935379|gb|ACGA01000013.1| GENE 13 11278 - 12345 455 355 aa, chain - ## HITS:1 COG:CC0891 KEGG:ns NR:ns ## COG: CC0891 COG2207 # Protein_GI_number: 16125144 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Caulobacter vibrioides # 249 354 217 323 328 92 44.0 1e-18 MFQALFFGILVITKKNKSMLDKHVSAWLLFLTIHILCISASQSEIIPYSSSFLYPSYILA GLHGFFLYLCTKSLMSNSPKDFRKEYISIAGYIAIAVIGFIFYNSYPQTITVITRILAFL CNALFIILAVRLFHRYKKFLQTHFSNIDKLSFNWMRFLAFGLIILLSGALVLVLLSEIFS FPVSLYSVFAIIVLLFVNVLGFRGIKHSTIFNQTIIPYNQGQPEKQPATNKEETSYANYG LKQEDAIILSERLKTYMEDEKPYTNMDLTLKDLASALDTYPHYITQVLNTLFNQNFYEFI NTYRVEEVQRRLHDSQFMNLTILAIAYDCGFNSKSAFNRIFKQKTGLTPSEYKKQ >gi|225935379|gb|ACGA01000013.1| GENE 14 12768 - 13676 232 302 aa, chain - ## HITS:1 COG:no KEGG:BVU_2470 NR:ns ## KEGG: BVU_2470 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 1 298 1 298 412 199 36.0 9e-50 MKHTFKINFYLKRSVIRKNGKMPVVMRIIINGERTDVHLPIDICSSMWSVEFGRASGKTD EARQVNVFLEQVHATLFTYYRQSLVEGVIVTAKGLKQRFVQLGVQPQTLLSIFRKHNDDA YLMAKSEGISMSTYKKYDLAFRRVEQFIAKQGKTDIQLCHLDMCFINDYELFLRTDCKLG INMAAKMLQILKKVVLLARNLGIINFNPFLNHHTKREEIGIRYLTWSELERIKQKEITIP RLAFVRDLFVFSSYTGLSYADIRKLSECHISSFKNGSLWIMQNRTKTGHLARIPLLPLLL KY >gi|225935379|gb|ACGA01000013.1| GENE 15 13668 - 13820 86 50 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MFHKSLFFTDTNIHTLLVSEKRQIMILRDDLTTRIWSKYHKKKRTRNVID >gi|225935379|gb|ACGA01000013.1| GENE 16 14127 - 15443 1198 438 aa, chain + ## HITS:1 COG:MJ0001 KEGG:ns NR:ns ## COG: MJ0001 COG0436 # Protein_GI_number: 15668173 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanococcus jannaschii # 26 277 12 241 375 78 25.0 3e-14 MKNTPIERHLIDETINEFQIVDFSKATIREVKAIASKAEAASGVEFIKMEMGVPGLPPSA VGVKAEIEALQNGIASLYPDINGLPELKKEASNFIKAFINVDLSPEGCVPVTGSMQGTFA SFLTCSQCDEKKDTILFIDPGFPVQKQQLVVMGQKFETFDVYDYRGEKLKEKLESYLKKG NISAIIYSNPNNPSWICLKEEELRIIGELATQYDVIVLEDLAYFAMDFRQDLSKPYQPPF QPSVAHYTDNYILLISGSKAFSYAGQRIGVSCISDKLYHRSYPGLTKRYGGGTFGTVFIH RVLYALSSGTSHSAQFAMAAMLKAANEGQYNFLNEVKIYGERAQKLKEIFLRHGFHLVYD NDLGEPIADGFYFTIGYPGMTSGELAKELMYYGVSAISLVTTGSHQEGLRACTSFIKDHQ YAQLDERMKLFAENHPIA >gi|225935379|gb|ACGA01000013.1| GENE 17 15538 - 16128 366 196 aa, chain + ## HITS:1 COG:CPn0139 KEGG:ns NR:ns ## COG: CPn0139 COG1678 # Protein_GI_number: 15618063 # Func_class: K Transcription # Function: Putative transcriptional regulator # Organism: Chlamydophila pneumoniae CWL029 # 19 196 10 188 188 104 32.0 1e-22 MNIDSDIFKIQSNNVLPSRGRILISEPFLRDATFGRSVILLVDHTDEGSMGLVINKQLPL FLNDIILEFKYLDEIPLYKGGPIATDTLFYLHTLSDIPGSISISKGLYLNGDFDEIKKYI LQGNKISECIRFFLGYSGWDSEQLSNEIRENTWLVSEEEKSYLMKNNIKDMWRKALEKLG SKYETWSRFPQVPTLN >gi|225935379|gb|ACGA01000013.1| GENE 18 16133 - 16669 349 178 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 11 172 2 162 166 139 41 5e-32 MRQSFLMNERIYLRAVEPEDMDIMYEMENDPSMWDISNFTVPYSRYVLRQYIEGSQCDVF ADKQLRLMIVRKSDQCILGTIDITDFVPLHSRGEVGIAVHKDYRQQGYATDALKLLCEYA FDFLSLSQLYAHVMTDNKVCVKLFTSCGFVQCGLLKNWLQVEGCYKDALLLQCLNPKK >gi|225935379|gb|ACGA01000013.1| GENE 19 16666 - 17124 357 152 aa, chain - ## HITS:1 COG:no KEGG:BT_1080 NR:ns ## KEGG: BT_1080 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 239 87.0 3e-62 MEEQIKRAVRNLNISYVFFWVLPAFLLGAGEFELFPVGGLVDNAQAIYYFETIGILLTAL CVPLSLKLFSLVLKKKIDHMTITLALKRYVQWNIVRLGVLEVAIVVNLLCYYLTLSSTGN LCMLIGLTASLFCLPSEKRLRNELHIAKDEKL >gi|225935379|gb|ACGA01000013.1| GENE 20 17151 - 17768 512 205 aa, chain - ## HITS:1 COG:DR0198 KEGG:ns NR:ns ## COG: DR0198 COG0353 # Protein_GI_number: 15805234 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Deinococcus radiodurans # 4 198 2 193 220 195 49.0 5e-50 MNQQYPSILLEKAVGEFSKLPGIGRKTAMRLVLHLLRQDTATVEAFGNSVITLKREVKYC KVCHNISDTETCQICANPQRDASTVCVVENIRDVMAVEATQQYRGLYHVLGGVISPIDGV GPSDLQIESLVQRVTEGGIKEVILALSTTMEGDTTNFYIYRKLDKLGVKLSVIARGISVG DELEYADEVTLGRSIVNRTLFTGTV >gi|225935379|gb|ACGA01000013.1| GENE 21 17860 - 19311 634 483 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 1 422 3 423 512 112 26.0 2e-24 MMILVTIICYFAVLLLIARITGRKGGSNAAFFKGENQSPWYIVSFGMIGASISGVTFVSV PGMVRGMDMTYMQTVLGFFFGYMVVAHVLLPLYYKLNLTSIYGYIGSRIGLRAYRTSSFF FLLSRMLGTAAKLYLVCLILHTYVFQEMHVPFWLIAVGSVTLVWIYTHKSGIKTIVWTDT LQTFCLIAALIFIIYFTIQRLDLDFSGIVQTIQSSEHSRIFVFDDWVSRQNFFKQFFSGI FIVIVMTGLDQDMMQKNLSCRNLREAQKNMYCYGFSFIPLNFLFLCLGILLITLAGQMQL ELPAMNDDILPMFAAQGYLGQPVLILFTIGIIAAAFSNSDSALTAMTTSVCVDLLNTRKD TEEVARRKRDKVHLLLSILLAFFICLVEMLNNKSVIDAIYIIASYTYGPLLGMFAFGLFT RRQTNDRWVPLIAILSPLLCYLADWWIGKETGYKFGYELLMLNGTLTFAGLMLMSKKELQ RNK >gi|225935379|gb|ACGA01000013.1| GENE 22 19319 - 19924 594 201 aa, chain + ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 192 1 189 194 136 44.0 3e-32 
MEITSAEFVISNTDVRKCPAGIFPEYAFIGRSNVGKSSLINMLTNRKGLAMTSATPGKTM LINHFFINKNWYLVDLPGYGYARRGQKGKDQIRTIIEDYILEREQMTNLFVLIDSRLEPQ KIDLEFMEWLGENGIPFSIIFTKADKLKGGRLKININAYLRELSKQWEELPPHFVSSSED RMGRTEILNYIENINKDLNIK >gi|225935379|gb|ACGA01000013.1| GENE 23 19933 - 20538 591 201 aa, chain + ## HITS:1 COG:no KEGG:BT_1084 NR:ns ## KEGG: BT_1084 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 201 1 199 199 256 83.0 5e-67 MKMKKSFLSIAFMAVFMLMATNSQAQSWSDLLNKDNISKVVNAITGNTESIDMTGTWSYK GSAVEFESDNLLMKAGGAAAATMAESKLNEQLSKIGIKDGQMSFTFNADSTFTSTVGKKT LKGTYSYNASTKQVDLKYLKLLNLHAKVNCSSSSLELLFNSDKLLKLMAFIGSKSNSSAL KTVSSLADNYDGMMLGFQLSK >gi|225935379|gb|ACGA01000013.1| GENE 24 20637 - 21032 306 131 aa, chain + ## HITS:1 COG:no KEGG:BT_1085 NR:ns ## KEGG: BT_1085 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 244 96.0 8e-64 MKKILFLMTLLVMGVSFAFAQTNADIKFDKTTHDFGKFSENSPVVSCTFTFTNIGDAPLV IHQAVASCGCTVPEYTKEPIMPGKKGTIKVTYNGTGKYPGHFKKSITLRTNAKTEMVRLY IEGDMTAKDAK >gi|225935379|gb|ACGA01000013.1| GENE 25 21138 - 23126 2134 662 aa, chain + ## HITS:1 COG:PH0361 KEGG:ns NR:ns ## COG: PH0361 COG1297 # Protein_GI_number: 14590271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 25 626 3 595 626 238 30.0 3e-62 MKQEEDKFTGLPENAFRELKPGEVYNPLMGPSKNYPEVNIWSVAWGIAMAILFSAAAAYL GLKVGQVFEAAIPIAIIAVGVSGAAKRKNALGENVIIQSIGACSGVIVAGAIFTLPALYI LQAKYPEMTVTFMQVFISSLLGGVLGILFLIPFRKYFVSDMHGKYPFPEATATTQVLISG EKGGSQAKPLLMAGMIGGLYDFIVATFGWWNENFTTRVCSAGEMLAEKAKLVFKVNTGAA ILGLGYIVGLKYASIICAGSLAVWWIIIPGMSAIWGDSVLNAWNPEVTSTVGMMSPEEIF KYYAKSIGIGGIAMAGVIGIIRSWSIIKSAVGLAAKEMGGKGNVEKSIIRTQRDLSMKII AIGSIITLILIVLFFYLDVMQGNLLHTLVAIVLVAGISFLFTTVAANAIAIVGTNPVSGM TLMTLILASVVMVAVGLKGPSGMVAALVMGGVVCTALSMAGGFITDLKIGYWLGSTPAKQ ETWKFLGTIVSAATVGGVMIILNKTYGFTSGALAAPQANAMAAVIEPLMSGVGAPWLLYG IGAVLAIILTLCKIPALAFALGMFIPLELNVPLVVGGAVNWFVTTRSKDATLNTERGEKG 
TLLASGFIAGGALMGVISAAMRFGGINLVNEAWLNNTWSEVLALGAYALLILYFIKASMK VK >gi|225935379|gb|ACGA01000013.1| GENE 26 23133 - 24689 1198 518 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 26 261 45 305 328 161 33.0 3e-39 MRRLLSIIVSLITAITFAQQPVELPLWPDGAPNSSGLTGEEQEVRPHFVTNVTYPTMTVY HPEKPNGMAIIMCPGGGYRGLGMDGEGYDMAPWFCEQGITYIVLKYRMPNGHCEVPISDA EQAIRMVRQHAKEWNVNPHRVGIMGASAGGHLASTIATHYNSETRPDFQILLYPVVTMMQ VTRGNTREALLGKNPTMEQIQKFSAELQVTPDTPQAFIAITSDDPSVPPYHGVNYYLALQ KNKVPATLHVYPTGGHGWGFQDSFKYKQQWTQELEKWLRDGVVFPEDSEPMLRIGKSYLG TKYVANTLDQDEKEALVIRTDAVDCLTFVEYTLAQALGSSFANNLQKIRYRDGIINGYPS RLHYTSEWIENGVRHGFLTDITAKNSLHTNKLALSYMSTHPWQYKKLADSPGNVLQMAEH EKAISGKVVHWLPKSELPEAGFPWIMNGDIIAITTKVSGLDIAHVGIAAYRHGKLHLLHA SSTLGKVVVSDEPLNHMLNNNKSWTGIRVVRMSHSKNN >gi|225935379|gb|ACGA01000013.1| GENE 27 24692 - 25609 721 305 aa, chain + ## HITS:1 COG:no KEGG:BT_1088 NR:ns ## KEGG: BT_1088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 310 310 500 78.0 1e-140 MKRVKHFLLVSCLFPTLLGAQEMANAYFLELTPDAQSAGMAGTGLATTDNGSTAIFHNAS TIAFSQEVMGASYSYANIDKDYALHSASLFYRIGREGIHGFAIGFRHFKDPKVFDYRPHI WDLEAAYFRNIAKNLSMSLTFRYLQAKAAEGVDTKNSVCLDFGATYYRSMALLDEMASWS IGFQAANLGKKLDGQKLPARLGLGGTIDLPFSIENRLQVALDFNYLLPSEIRHLQAGVGA EYNFLKYGVVRAGYHFGDKDKGVGNYGTLGCGINFWPIRADFSYALADKDCFMHRTWQLG VGIVF >gi|225935379|gb|ACGA01000013.1| GENE 28 25566 - 26249 500 227 aa, chain - ## HITS:1 COG:MT2274 KEGG:ns NR:ns ## COG: MT2274 COG0321 # Protein_GI_number: 15841708 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Mycobacterium tuberculosis CDC1551 # 11 205 30 208 240 155 46.0 4e-38 MKTVFLDWNLIPYAEAWQRQTEWFDDIVRAKVQGGSYENRVVMCEHPHVYTLGRSGKENN MLLSDEQLKAIDATLYHIDRGGDITYHGPGQLVCYPILNLEEFQLGLKEYVHLLEEAVIR VCASYGIEAGRLEEATGVWLEGDTPRARKICAIGVRSSHFVTMHGLALNVNTDLRYFSYI 
HPCGFIDKGVTSLRQELKHDVPMDEVKQRLEDELRKLFQLPAARCGA >gi|225935379|gb|ACGA01000013.1| GENE 29 26316 - 27155 556 279 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 152 279 161 293 307 72 30.0 1e-12 MKMDFPQVDLPTEVLAWTNVTEDILNIYKQSCRLQACIVAICTEGSMKASINLLDYEIRP NDLITLLPGTIIQFRERTEKVCLCFAGFSAYCAGRINLMKNIGNAYPKLIEQPVVPLNEE VASYLKDYFALLSRASCNENFEMDSELVELSLQTILTSIRLIYHKFPGENSSSNRKKEIC RELIQAITENYKNERRAQFYADKLGISLQHLSTTVRQVTGKSVLDTIAYIVIMDAKAKLK GTNMTIQEIAYSLNFPSASFFGKYFRRYVGMTPLEFRNR >gi|225935379|gb|ACGA01000013.1| GENE 30 27292 - 29502 2109 736 aa, chain - ## HITS:1 COG:alr7635 KEGG:ns NR:ns ## COG: alr7635 COG2217 # Protein_GI_number: 17158771 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Nostoc sp. PCC 7120 # 14 733 11 744 753 605 44.0 1e-172 MGNTIKKAFPVLNMHCAGCANNVEKTVKKLPGVIEASVNFATNTLAVSYEKDQLTPGEIR AAVLAAGYDLIVEEAHKEERREEEQHKRYTRLKWKVIGAWIFVVPLLVFSMILMHVPYSN EIQMVLAIPVMVFFGGGFFTGAWKQAKLGRSNMDTLVALSTSIAFLFSLFNTFFPEFWYD RGLEPHVYYEASAVIIAFVLTGKLMEERAKGNTSTAIRKLMGMQPKVARVLRNGVEEEIL IENLQIGDMVVVRPGEQIPVDGQLSEGDSYVDESMISGEPVPVEKKKGDKVLAGTINQRG SFIIHAAQVGSETVLARIISMVQEAQGSKAPVQRIVDRITGIFVPVVLGIAILTFVLWVA IGGSEYISYGILSAVSVLVIACPCALGLATPTALMVGIGKAASQHILIKDAVALEQMRKV DVVVLDKTGTLTEGHPTATGWLWAQSQEPHFKDVLLAAEMKSEHPLAGAIVSALQDEEKI KPAVLDSFESITGKGIKVSYEGHTYWVGSHKLLKDFSATVNDVMAEMLVRYESDGNGIIY FGRENELLAIIAVSDPIKATSAEAVKELKRQGIDICMLTGDGQRTALAVSSRLGIERFVA DALPDDKAEFVRELQMQGKKVAMVGDGINDSQALALADVSIAMGKGTDIAMDVAMVTLMT SDLLLLPKAFQLSKQTVKLIHQNLFWAFIYNLIGIPIAAGILFPLNGLLLNPMLASAAMA FSSVSVVLNSLSLGRK >gi|225935379|gb|ACGA01000013.1| GENE 31 29600 - 29923 414 107 aa, chain - ## HITS:1 COG:no KEGG:BT_1092 NR:ns ## KEGG: BT_1092 # Name: not_defined # Def: putative heavy-metal binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 107 1 107 119 
136 79.0 2e-31 MKTKRWMATCVVALLSVAAVLAKDIRVVVFKVSQMHCEKCEKKVKDNMRFEKGLKDIATE VKTRMVTITYDAEKTNVKNLQAGFNKFNYEAEFVKETKKDDQKTDKK >gi|225935379|gb|ACGA01000013.1| GENE 32 29996 - 32209 2250 737 aa, chain - ## HITS:1 COG:no KEGG:BF2084 NR:ns ## KEGG: BF2084 # Name: not_defined # Def: putative TonB-dependent outer membrane receptor protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 15 735 15 735 735 1181 79.0 0 MIIGLFLLFAGNLYAQVRGTVKDSTGEAIPGANVFWMNTGQGVTTKEDGSFSIVKPSKSH MLIISFIGFQNDTIHVNSKNQQLDIVLRDGVELNEVNIVTRKLGTMKLRSSVMNEDMISS AELSRAACCNLGESFVTNPSVDVSYSDAATGAKQIKLLGLSGTYVQMLTENIPNYRGAAS PYGLGYVPGPWMQSIQVSKGTSSVKNGYEAITGQINVEFKKPQLPEADWVSANLFASTTN RYEANADATLKISKRWSTSLLAHYENETKAHDGNDDGFVDIPQVEQYNVWNRWAYMGDHY VFQAGIKALSETRTSGQANHGGAIHSGDLYKVGIDTERYEFFTKNAYIFNKEKNTNLALI LSTTLHNQDATYGRKLYNVDQTNVYASLMFETEFNPQNSFSAGLSFNYDAYDQHYRLENN TNNPLKAFEKEAVPGAYVQYTLNLNDKWMVMAGLRGDYSNEHGFFVTPRAHLKYNPNDYV NFRLSAGKGYRTNHVLAENNYLLSSSRKVEIAKNLDMEEAWNYGASVSTYIPIFGKTLNV NAEYYYTDFLKQVIVDMDSNPHEVAFYNLNGRSYSHVFQVEASYPFFKGFTLTGAYRLTD AKTTYKGVRMEKPLTSKYKGLLTASYQTPLGIWQFDATLQLNGGGRMPTPYELGDGQLSW ERRYGSFEQLSLQVTRYFRRWSIYVGGENLTNFKQKNPIIDAANPWGSNFDSTMIWGPVH GAKGYIGIRFNLARNSE >gi|225935379|gb|ACGA01000013.1| GENE 33 32343 - 32741 202 132 aa, chain - ## HITS:1 COG:no KEGG:BT_1140 NR:ns ## KEGG: BT_1140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 147 68.0 1e-34 MKRLAYITELILSLLIIYSGGGVSIVRYCCARCETVQSCCDTGCPKCKKTHTCDSKKGCK DKGCTATIYKLDLMKHTTELTTSALVVDLLCEHFCYLLTPTYADKPVEYDSLTSPPPLCS RQKLALYSTYII >gi|225935379|gb|ACGA01000013.1| GENE 34 32938 - 34023 766 361 aa, chain + ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 41 356 55 384 392 177 31.0 4e-44 MLTYLLIIITLYLAGNAYIFIRAKQALKVKSLGVKIFLTVLFWICALSFFGTMLARNLEM 
PVFISHSMYTIGTSWLIFTLYMALFLLLFDLLKLFKVVCKYRFYLSLVFTLGLLGYGVYN YHHPETNVVSILTNKQYEDTPQAIKIVAISDVHLGNGTGKAALKKYVEMINAQHPDLILI SGDLIDNSVVPLYTENMAEELGDLKAPMGIYMVLGNHEYISDIDESIRYIKSTQIQLLRD SVVTLPNGIQLIGRDDRHNHKRHSLQELMVNVDKSKPIILLDHQPFDLEKTKAAGIDLQF SGHTHHGQIWPINWVTDYIFEQSHGYRQWGNSHVYVSSGLSLWGPPFRIGTHSEMVIFNF Q >gi|225935379|gb|ACGA01000013.1| GENE 35 34098 - 34331 248 77 aa, chain + ## HITS:1 COG:no KEGG:BDI_1027 NR:ns ## KEGG: BDI_1027 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 77 1 77 77 124 81.0 1e-27 MEQLHAHEVLHMMEGNSYSESSLREAIIKKFGSQQRFYACSAENMDVDTLIEFLKMKGKF MPAEDGFTVDITKVCKH >gi|225935379|gb|ACGA01000013.1| GENE 36 34476 - 35327 837 283 aa, chain + ## HITS:1 COG:no KEGG:BT_1144 NR:ns ## KEGG: BT_1144 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 283 508 88.0 1e-143 MKKQYVSLLAIILSVSGFLFSCHDKMNKNTGALQFDSIQVNETAHLFSDTAKPACNIIIN FAYPIKSSDDMLKDSLNTYFISACFGDKYIGEKPEEVVKQYTENYISEYRRDLEPMYTED EKDKEDESSIGAWYSYYKGIESHVQLYDKDLLVYRIDYNEYTGGAHGIYMATYLNMDLTL MRPLRLDDIFVGDYKDALTDLIWNQLMADNKVTTHEALEDMGYASTGDIAPTENFYLSKE GITFYYNVYDITPYSMGPVKVTIPFAMMEHLLGSNPILGELKN >gi|225935379|gb|ACGA01000013.1| GENE 37 35441 - 36061 411 206 aa, chain + ## HITS:1 COG:SA2499 KEGG:ns NR:ns ## COG: SA2499 COG0357 # Protein_GI_number: 15928295 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Staphylococcus aureus N315 # 10 149 16 164 239 100 36.0 1e-21 MEIILKYFPDLTEEQRKQFAALYDLYTDWNSKINVISRKDIENLYEHHVLHSLGIAKIIQ FRPGTSIMDLGTGGGFPGIPLAILFPEVKFHLVDSIGKKVRVATEVANAIGLKNVTFRHA RAEEEKRTFDFVVSRAVMPLADLIKIIKKNISSKQQNALPNGLICLKGGELEHETMPFKH KTVIHSLSDNFEEEFFKTKKVVYVPI >gi|225935379|gb|ACGA01000013.1| GENE 38 36105 - 36743 396 212 aa, chain + ## HITS:1 COG:VC1270 KEGG:ns NR:ns ## COG: VC1270 COG0491 # Protein_GI_number: 15641283 # Func_class: R General function 
prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Vibrio cholerae # 11 210 13 210 218 152 40.0 4e-37 MKIKRFEFNMFPVNCYVLWDDTKEAVVIDPGCFYEEEKQALKKFILTNELNVKHLLNTHL HLDHIFGNPFMLKEFGLSAEANKADEYWIDEAPKQSRMFGFQLQEEPVPLGKYLHDGDII TFGHTKLEAIHVPGHSPGSLVYYCKEDNCMFSGDVLFQGSIGRADLTGGNFDELIEHICS RLFVLPNETVVYPGHGAPTTIGMEKAENPFFR >gi|225935379|gb|ACGA01000013.1| GENE 39 36856 - 39705 2641 949 aa, chain + ## HITS:1 COG:YPO0905_2 KEGG:ns NR:ns ## COG: YPO0905_2 COG1003 # Protein_GI_number: 16121210 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Yersinia pestis # 465 941 4 484 494 581 59.0 1e-165 MKTDLLASRHIGINEQDTAIMLRKIGVNSLDELINQTIPANIRLKEPLALTSPLTEYEFG KHIAELAAKNKLYTTYIGLGWYNTITPAVIQRNVFENPVWYTSYTPYQTEVSQGRLEALM NFQTAVCDLTAMPLANCSLLDEATAAAEAVSMMYALRSRAQQKANANVVFVDENIFPQTL AVMTTRAVPQGIELRVGKYKDFEPSPEVFACILQYPNSHGNVEDYSEFTEKAHAADCKVA VAADILSLALLTPPGEWGADIVFGTTQRLGTPMFYGGPSAAFFATKDEYKRNMPGRIIGW SKDKYGKLCYRMALQTREQHIKREKATSNICTAQALLATMAGFYAVYHGQEGITTIASRI HSITVFLEKQLKKCGYTQVNAQYFDTLRFELPEHVSAQQIRTIALTKEVNLRYYENGNVG FSIDETTDVAAANVLLSIFAIAAGKDYQKVDDIPERSNIDKDLKRTTPFLTHEVFSKYHT ETEMMRYIKRLDRKDISLAQSMISLGSCTMKLNAAAEMLPLSRPEFMGMHPLVPEDQAEG YRELIKNLSEDLKVITGFAGVSLQPNSGAAGEYAGLRVIRAYLESIGQGHRNKVLIPASA HGTNPASAIQAGFVTVTCACDEQGNVEMADLRVKAEENKDELAALMITYPSTHGIFETEI KEICDIIHACGAQVYMDGANMNAQVGLTNPGFIGADVCHLNLHKTFASPHGGGGPGVGPI CVAEHLVPFLPGHGIFGNSQNQVSAAPFGSAGILPITYGYIRMMGAEGLTQATKIAILNA NYLATCLKDTYGVVYRGANGFVGHEMILECRKVHEEAGISENDIAKRLMDYGYHAPTLSF PVHGTLMIEPTESESLAELDNFVDVMLNIWKEIQEVKNGEADKDDNVLINAPHPEYEIVS DRWEHSYTREKAAYPIESVRDNKFWINVARVDNTLGDRKLLPTRYGTFE >gi|225935379|gb|ACGA01000013.1| GENE 40 39828 - 40397 530 189 aa, chain - ## HITS:1 COG:MA1774 KEGG:ns NR:ns ## COG: MA1774 COG0778 # Protein_GI_number: 20090624 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 3 180 31 214 220 105 
34.0 5e-23 MKTNEVLENIKARRSVRAYTGQQVLEEDLQAILEAATYAPSGMHLETWHFTAIQNMDKLT ELNERIKGAFAKSDDSRLQERGHSKTYCCYYHAPTLVIVSNEPTQWWAGMDCACAIENMF LAAQSLGIGSCWINQLGTTCDDPEVREFITALGVPANHKVYGCVALGYPDSKIPMKEKKV KVNTITIVR >gi|225935379|gb|ACGA01000013.1| GENE 41 40466 - 41551 939 361 aa, chain - ## HITS:1 COG:no KEGG:BT_1149 NR:ns ## KEGG: BT_1149 # Name: not_defined # Def: type II restriction enzyme HpaII # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 633 85.0 1e-180 MAFEATKKEWCELYSFFRLLADGKVVLGTAEAKAGDTFWPVAMIQREEHDGTRQYYIEED TIRIEGENGSKSMPREDFGIVADLILQAVKSSPENDVASPEGVEEFLDEAAIFDLEAKTE DRTDFSITFWHPKAPLRGFNVRSRLGVMNPLLDGGRAANLKLEQSGVKFATPTVNKINAL PESPNEVAERMMMIERLGGVLKYADVADRVFRSNLLMIDLHFPRVLTEMVRIMHLDGISR ISELTEVIKQMNPLKIKDELINKHKFYEFKMKQFLMALVLGMRPAKIYNGLDSAVEGILL VDGNGEVLCYHKSEKQIMEDFLFLNTRLEKGSLEKDKYGFLERENGVYYFKLNAKIGLVK R >gi|225935379|gb|ACGA01000013.1| GENE 42 41631 - 42434 479 267 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293373029|ref|ZP_06619398.1| ## NR: gi|293373029|ref|ZP_06619398.1| hypothetical protein CUY_0916 [Bacteroides ovatus SD CMC 3f] # 1 267 4 270 270 561 100.0 1e-158 MINKCHIIFSLFLSSFVWMSVQAQAQIQETQKLAPNNIPVPWYSQKIAGCPYSHCSLASS LMVFDYFKGMTADTQRTAQDAEKKLIEYQRNYFLKKRAPFRRRTSIGQGGYYSFEIDSLT RYYENMISAEHFQQKDYRILKDYIDRGIPVLVNVRYTGAVRGLRPGPRGHWMVLRGIDDK HVWVNDPGRSPEMRTKGENICYPIKKQPGNPSYFDGCWTGRFIIVTPREWIRNSLFAQVG KLPPLEEVTHIVPPVVSSVTLPQVINQ >gi|225935379|gb|ACGA01000013.1| GENE 43 42477 - 43094 589 205 aa, chain - ## HITS:1 COG:NMA0240 KEGG:ns NR:ns ## COG: NMA0240 COG1739 # Protein_GI_number: 15793258 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 6 178 4 176 203 184 48.0 1e-46 MTAEDTYKTIIEPSEGIYTEKRSKFIAIALPVRTLDEIKMHLETYQKKYYDARHVCYAYM LGAARKDFRANDNGEPSGTAGKPILGQINSNELTDILIIVVRYFGGIKLGTSGLIVAYKA AAAEAIAAATIIEKTVDEDVTVMFEYPFMNDVMRIVKEEEPEILNQSYDMDCSMTLRIRR SMMPKLRARLEKVETARILDDENIL >gi|225935379|gb|ACGA01000013.1| GENE 44 43350 - 44597 1328 415 aa, chain - ## 
HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 414 1 413 414 413 49.0 1e-115 MAIIKPFKGIRPPQNLVEQVASRPYDVLNSEEARAEAEGNEKSLYHIIKPEIDFPVGTDE HDEQVYAKAAENFQLFQDKGWLVQDKKENYYIYAQTMNGKTQYGLVVGAYVPDYMNGIIK KHELTRRDKEEDRMKHVRVNNANIEPVFFAYPDNEKLDVIIKKYTANKPVYDFIAPGDGF GHTFWIVDQDADIATITAEFAKMPALYIADGHHRSAAAALVGAEKAKQNPNHRGDEEYNY FMAVCFPANQLTIIDYNRVVKDLNGLTPEQFLAALDKNFIVEEKGADIYKPSGLHNFSLY LGGKWYSLTAKAGTYNDNDPIGVLDVTISSHLILDEVLGIKDLRSDKRIDFVGGIRGLGE LKKRVDSGEMKVALALYPVSMKQLMDIADTGNIMPPKTTWFEPKLRSGLVIHKLD >gi|225935379|gb|ACGA01000013.1| GENE 45 44760 - 45680 1361 306 aa, chain - ## HITS:1 COG:aq_1905 KEGG:ns NR:ns ## COG: aq_1905 COG0111 # Protein_GI_number: 15606928 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Aquifex aeolicus # 2 305 3 320 533 157 34.0 2e-38 MKVLVATEKPFAKVAVDGIRKEIEAAGFELALLEKYTDKAQLLDAVKDANAIIIRSDIVD AEVLDAAKELKIVVRAGAGYDNVDLAAATAHNVCVMNTPGQNSNAVAELALGMMVYAVRN FYNGTSGTELMGKKLGIHAYGNVGRNVARVAKGFGMEVYAYDAFCPKEVIEKDGVKALDS AEELYKTCQVVSLHIPATAETKNSINYALLKDMPKGAMLVNTARKEVINEAELIKLMEER ADFKYITDIMPAANAEFAEKFAGRYFSTPKKMGAQTAEANINAGIAAAQQIVGFLKDGCE KFRVNK >gi|225935379|gb|ACGA01000013.1| GENE 46 45788 - 46855 1093 355 aa, chain - ## HITS:1 COG:lin2957 KEGG:ns NR:ns ## COG: lin2957 COG1932 # Protein_GI_number: 16802016 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Listeria innocua # 4 351 5 352 363 332 50.0 7e-91 MKKHNFNAGPSILPREVIEDTAKAILDFNGSGLSLMEISHRAKDFQPVVDEAEALFKELL NIPEGYSVLFLGGGASMEFCMVPYNFLEKKAAYLNTGVWAKKAMKEAKGFGEVVEVASSA EATYTYIPKDYTIPTDADYFHITTNNTIYGTELKKDLDSPVPMIADMSSDIFSRPIDVSK YICIYGGAQKNLAPAGVTFVIVKNDAVGKVSRYIPSMLNYQTHIDNGSMFNTPPVVPIYA 
ALLNLRWIKAQGGVKEMERRAIEKADMLYAEIDRNKMFVGTAAKEDRSRMNICFVMASEY KDFEADFLKFATERGMVGIKGHRSVGGFRASCYNALPKESVQALIDCMQEFEKLH >gi|225935379|gb|ACGA01000013.1| GENE 47 47061 - 47876 627 271 aa, chain - ## HITS:1 COG:dbpA KEGG:ns NR:ns ## COG: dbpA COG0513 # Protein_GI_number: 16129304 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 43 262 223 448 457 124 37.0 2e-28 MLSATDAEEIPEFTGLNRTVKLDFLSDASEEQESRLKLMKVLSPSKDKIDTLYNLLCTLG SSSSIVFCNHRDAVDRVHQLLADKKLSAERFHGGMEQPDRERALYKFRNGSCHVLISTDL AARGLDIPEVGHIVHYHLPVNEEAFTHRNGRTARWDATGTSYLILHAEEKLPPYIPEDME TVVLPENPPRPPKSVWATIYIGKGKKEKLSRIDIAGFLYKKGNLTREDVGAIDVKEHYAF VAVRRAKVKQLLNLIQGEKIKGMKTIIEEAK >gi|225935379|gb|ACGA01000013.1| GENE 48 47828 - 48388 440 186 aa, chain - ## HITS:1 COG:RC1020 KEGG:ns NR:ns ## COG: RC1020 COG0513 # Protein_GI_number: 15892943 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Rickettsia conorii # 9 182 17 189 414 116 39.0 2e-26 MKNEIIQSALRNLKIEELNPMQEASLEQATGRKDVILLSPTGSGKTLAYLLPLLLTLKPN DDSVQVLILVPSRELALQIDSVFKAMGTSWKTCCCYGGHPIAEEKKSIVGNHPAIIIGTP GRITDHLSKGNFNPETIETLIIDEFDKSLEFGFHDEMAEIITQLPGLKNVYCFRLPMLKR FLNLQD >gi|225935379|gb|ACGA01000013.1| GENE 49 48393 - 48572 204 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716306|ref|ZP_04546787.1| ## NR: gi|237716306|ref|ZP_04546787.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 59 1 59 59 100 100.0 4e-20 MEYHRISFIHNDTEYSFVKAMSSSLTGYALVTACRAEVTIYMKENNLKGYYILTGMAKV >gi|225935379|gb|ACGA01000013.1| GENE 50 48814 - 50088 1479 424 aa, chain - ## HITS:1 COG:PA2994 KEGG:ns NR:ns ## COG: PA2994 COG2871 # Protein_GI_number: 15598190 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF # Organism: Pseudomonas aeruginosa # 6 424 6 407 407 434 51.0 1e-121 MDMNLILASIGIFLVVILLLVVILLVAKNFLVPSGDVKLTINGEKELEVASGSTLLNTLS VNGVFLSSACGGKGSCGQCKCQVLEGGGEILPSEAPHFSRKQQQDHWRLGCQVKVKSDMA IKIDESILGVKEWECEVISNKNVATFIKEFIVALPEGEHMDFIPGSYAQIKIPKFSMDYD KDIDKSLIGDEYLPAWEKFGLLGLKCKNDEETIRAYSMANYPAEGDRIMLTVRIATPPFK PKDQGPGFMDVMPGIASSYIFTLKPGDKVMMSGPYGDFHPIFDSKKEMMWIGGGAGMAPL RAQIMHLTKTLHTTDRKMSYFYGARALNEVFYLEDFLQIEKDFPNFTFHLALDRPDPAAD AAGVKYTPGFVHNVIYETYLKNHEAPEDIEYYMCGPGPMSKAVEKMLDDLGVPAQNLMFD NFGG >gi|225935379|gb|ACGA01000013.1| GENE 51 50107 - 50733 805 208 aa, chain - ## HITS:1 COG:HI0170 KEGG:ns NR:ns ## COG: HI0170 COG2209 # Protein_GI_number: 16272135 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE # Organism: Haemophilus influenzae # 1 208 1 198 198 209 60.0 2e-54 MEHLLSLFVRSIFVDNMIFAFFLGMCSYLAVSKNVKTAVGLGVAVTFVLLVTLPVNYLLQ TKVLAANAIIEGVDLSFLSFILFIAVIAGIVQLVEMVVERFTPSLYASLGIFLPLIAVNC AIMGASLFMQQRINVAPSDPKYIGDIWDALSYALGSGIGWLLAIVGLAAIREKMAYSDVP APLKGLGITFITVGLMAIAFMCFSGLNI >gi|225935379|gb|ACGA01000013.1| GENE 52 50764 - 51405 656 213 aa, chain - ## HITS:1 COG:HI0168 KEGG:ns NR:ns ## COG: HI0168 COG1347 # Protein_GI_number: 16272134 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD # Organism: Haemophilus influenzae # 10 211 8 208 208 216 58.0 3e-56 MGQLFSKRNKEVFSAPLGIDNPVTVQVLGICSALAVTAKLEPAIVMGLSVTVITAFSNVV ISLLRKTIPNRIRIIVQLVVVAALVTIVSEVLKAYAYDVSVQLSVYVGLIITNCILMGRL EAFAMQNGPWESFLDGLGNGLGYAKILIIVAFFRELLGSGTLLGFNILNYEPIQNIGYVN 
NGLMLMPPMALIIVACIIWYQRARHKELQEESN >gi|225935379|gb|ACGA01000013.1| GENE 53 51426 - 52100 852 224 aa, chain - ## HITS:1 COG:VC2293 KEGG:ns NR:ns ## COG: VC2293 COG2869 # Protein_GI_number: 15642291 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC # Organism: Vibrio cholerae # 2 224 4 250 257 90 29.0 2e-18 MNTNSNSYTIIYASVMVIIVAFLLAFVSSSLKSTQDKNVQLDTKKQILAALNIKNVEDAD AEYQKYVKGDMLMNVDGTLTENTGEFATNYEKEAKEQQRLHVFVCEVDGQTKYVVPVYGA GLWGAIWGYVALNEDKDTVYGTYFSHASETPGLGAEIATEHFQNEFVGKKTLEDGSIALG VVKNGKVEKPEYQVDGISGGTITSVGVDAMLKTCLNSYLSFLTK >gi|225935379|gb|ACGA01000013.1| GENE 54 52114 - 53286 1277 390 aa, chain - ## HITS:1 COG:PA2998 KEGG:ns NR:ns ## COG: PA2998 COG1805 # Protein_GI_number: 15598194 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB # Organism: Pseudomonas aeruginosa # 3 384 2 401 403 322 44.0 1e-87 MKALRNYLDKIKPNFEEGGKFHAFQSVFDGFETFLFVPNKTAKTGTHIHDAIDSKRIMSI VVISLVPALLFGMYNVGYQHFTHTGATGSFIEMFAYGFLAVLPKIIVSYVVGLGIEFVVA QWKKEEIQEGFLVSGILIPMIVPVDCPLWILAVATAFSVIFAKEVFGGTGMNVFNVALIT RAFLFFAYPTKMSGDAVWVSGDSIFGLGQSVDGLTVATPLGQAATSGSVPAFNMDMITGL IPGSIGETSVIAILIGAVILLWTGVASWKTMISVFVGGAFMAWVFNSIGMENNTMAQMPW YEHLVLGGFCFGAVFMATDPVTSARTERGKYIFGFLIGVMAIVIRVLNPGYPEGMMLAIL LMNIFAPLIDYCVVQGNISRREKRAIKSNQ >gi|225935379|gb|ACGA01000013.1| GENE 55 53408 - 54757 1361 449 aa, chain - ## HITS:1 COG:YPO3240 KEGG:ns NR:ns ## COG: YPO3240 COG1726 # Protein_GI_number: 16123399 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA # Organism: Yersinia pestis # 4 447 1 446 447 289 35.0 7e-78 MANVIKLRKGLDINLKGKAAETYATVKEPGFYALVPDDFPGVTPKVVVKEQEYVMAGGPL FIDKYHPEVKFVSPVSGVVTSVERGARRKVLNIVVEAAAEQDYEDFGKKNVASMDAEAVK TALLEAGLFAFIKQRPYDIIADPTVTPKGIFVSAFDTNPLAPDFEFALKGEETNFQTGLD ALAKLAKTYLNISVKQNAAALTQAKNVTITAFDGPNPAGNVGVQINHLDPVSKGETVWTI 
DPQAVIFIGRLFNTGRVDFTRTIAVTGSEVLKPAYCKLRVGALLTNVFAGNVTKDKDLRY ISGNVLTGKQVSPNGFLGAFHSQLTVIPEGDDIHEMLGWIMPRFNQFSANHSYFSWLMGK KEYTLDARIKGGERHMIMSGEYDRVFPMDILPEYLIKAIIAGDIDRMEALGIYEVAPEDF ALCEFVCSSKMELQRIVRAGLDMLRSEMA >gi|225935379|gb|ACGA01000013.1| GENE 56 54930 - 56324 1390 464 aa, chain - ## HITS:1 COG:TP0112 KEGG:ns NR:ns ## COG: TP0112 COG3579 # Protein_GI_number: 15639106 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Treponema pallidum # 27 463 3 440 450 278 33.0 2e-74 MNKQILSIFVFCAFSYSTQAQEVKGGISDSMMQQIKQSYANTPTDKAIRNAIGNNDIRKL ALNQDNLKGMDTHFSIKVSSKGITDQKSSGRCWLFTGLNVMRAKAIAKHNLGSFEFSQTY PFFFDQLEKANLFLQGIIDTSSKPMDDKMVEWLFRNPLSDGGTFTGVADIVSKYGLVPKD VMPETNSSENTSRMAGLIALKLREQGLQLRDLAAQGVKPAALEKTKTEMLSTIYRMLVLN LGVPPTEFTWTEYNAKGEPVSTETYTPLSFLKKYGDEKLIDNYVMLMNDPSREYYKCYEI DYDRHRYDGKNWTYVNLPIEDIKEMAISSLKDSTMMYFSCDVGKFLNSDRGLLDVKNYDY ESLMGTSFGMNKKQRIQSFASGSSHAMTLMAVDLDKNGKPTKWMVENSWGPVVGYQGYLI MTDDWFNEYMFRLVVETKYASKKALEVLKQKPIRLPAWDPMFAE >gi|225935379|gb|ACGA01000013.1| GENE 57 56571 - 57815 1029 414 aa, chain + ## HITS:1 COG:no KEGG:BT_1162 NR:ns ## KEGG: BT_1162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 414 1 409 409 723 85.0 0 MMLPSINMMSATADSTAVKSDELQMSTTSISDTTPTDIEGTPSDTIPALSKRELRRQRVA NRNLHYNILGGPSYTPDFGLLVGGSALMTFRMNPSDTTQRRSVVPMAIALIFKGGLNLMT KPQLFFKGDRFRIFGTFSYKNTIENFYGIGYSTNKDYERGEDTSEYRYSGIQVNPWFLFR LGESNFFAGPQVDLNYDKITKPAAGMVNEPSYIAAGGTEHGYKNFSSGLGFLLTYDTRDI PANAYSGTYLDFRGMMYNKTFGSDNNFYRLEIDYRQYKTVGRRKVIAWTVQSKNAFGDVP LTKYVLSGTPFDLRGYYMGQFRDKSSHVMMAEYRQMINTDKSTWVKKMLSHIGYVAWGGC GFMGPTPGKIEGVLPNLGLGLRIEVQPRMNVRLDFGRDMVNKQNLFYFNMTEAF >gi|225935379|gb|ACGA01000013.1| GENE 58 57953 - 58450 376 165 aa, chain + ## HITS:1 COG:no KEGG:BT_1163 NR:ns ## KEGG: BT_1163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 162 1 162 249 230 72.0 1e-59 MKQFIYFFSGLLLLFVSTTRVVAQEVSDYSQLNEEDYSKIVLPPLSVLFENAKNNPAYEL 
TAIKATIERKLLQKEKWSFLGFFSLRGSYQYGMFGNESTYTDVAVAPYLTYSTQAQNGYT LGAGLNIPLDNLFDLKGRVSRQRLTLRSAELEREVKFDEIKKNYH >gi|225935379|gb|ACGA01000013.1| GENE 59 58452 - 58700 220 82 aa, chain + ## HITS:1 COG:no KEGG:BF2008 NR:ns ## KEGG: BF2008 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 82 168 249 249 100 64.0 2e-20 MYVMATSQIRVLKMRSENLVLANIQYEISEKNFANGTIDSADLSTNKERQSQAREAYENS RFELTKSLMILEVISRTPIMRK >gi|225935379|gb|ACGA01000013.1| GENE 60 58712 - 60937 1584 741 aa, chain + ## HITS:1 COG:no KEGG:BF2061 NR:ns ## KEGG: BF2061 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 721 1 716 738 758 54.0 0 MNFIISIIRTLFRHRWLILLGTSIFTLLVFYYTRHMRGGYDVKATLYTGVASGYNLESDK RTDWATVQNSMDNLISIMQAESTLKRVFLRLFARVLIQGNPDHENDGITPSCYNYTYNHL KNSPHGPEILKLVDKSSEDKTVANLEAYIRPKKGNYIYEMFYYNHPFYSYNALKNIKVQR RLTSDLLDISYTSDDPGIAYNTVSILMDEFVEEYRRIRYGETDKVIKYFEEELKRIGKKL NSEEEDLTKYNVQKRIINYLDETKEIAAISKEFELREQNALFAYNSTQSMLEELEKHMDS NAKQMLKNMEFVDKLREASNITSKISEVEAISDTKQGNIESLSGDKKRLSELKQELNDLT TSYVGHKYTKEGVSRTNIIDQWLEQTLLFEKAKAELLIVQDARRELNDRYVFFAPVGTTI RQKERSINFTERSYLTVLQSYNEALLRKKNLEMTSATLKVLNEPTYPIGSNSTNRKQIVI AACVGSFLIIIAILLLIEFLDRTLRDANRTKRVTGFKVIGAVPNPSSARYGGLTKTYVQL SVKELSNSLLCFLTKRKSPGVFIINLFSTSENSSEEIIGNLICGYMQSRMLNTRFIMYGE DFNTESTQYLLAKSITDFYTLQGEDILIVAYPPLSKSNIPTALLNDANANILVSPADRGW KTIDKQLCEQLITQMGKANVPFRICLTNANRDAVEDFTGQLPPYTLLRRIGYRFSQLSLT EKIKFNLSRKAKETEEEDDDE >gi|225935379|gb|ACGA01000013.1| GENE 61 60927 - 62255 761 442 aa, chain + ## HITS:1 COG:no KEGG:BF2060 NR:ns ## KEGG: BF2060 # Name: not_defined # Def: putative transmembrane surface-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 8 433 18 442 444 525 58.0 1e-147 MTNNRIVTFHILLAIQFVLVAAGMLINIKVGLFSMVFILLFTTITLVQLNNDEQISWKPG QNIMTYMFAVWLCFYLLEILNPNNVQAAWNINLTPYALIPLICAFVVPLVIRTKKDIELL LIIWSIFVIIFTIKGYWQKNYGFNSKELYFLHALGGWRTHIIWSGIRYFSCFSDAANYGV 
HAAMSAVVFSISAFFVKSKWLRIYFILIAAGGLYGMGISGTRAAMGVIMGGMLTITIIAK NWKALLAGIFVSISVFCFFYFTNIGNGNQYIHKMRSSFHPSEDASYQVRVENRQRMKELM ARKPLGYGIGLSKAGKFESKEQMPYPPDSWLVSVWVETGIVGLILYLSIHGILFAWCSWI LMFRVRNKSLRGLVAAWLCMDAGFFIATYVNDIMQYPNQLPIYIGFALCFAAPHIDKRIS EEKEESEEKEQQLSTNETNEQE >gi|225935379|gb|ACGA01000013.1| GENE 62 62252 - 63169 518 305 aa, chain + ## HITS:1 COG:CAC3069 KEGG:ns NR:ns ## COG: CAC3069 COG1216 # Protein_GI_number: 15896320 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 8 260 4 258 299 167 35.0 2e-41 MTNTAPDISFITICYNGFKDTCELIESLHKKLKSVSYEIIVVDNASREDEAAKIRELYPT VVSIRSNENGGFSGGNNIGIRIAKGKYIFLINNDTYIESDEVACLVERLESRPEIGGVSP KIRFAFPPQHIQFAGFTPLTKVTLRNNMLGFDCPDDGSYDTPHPTPYLHGAAMIIKREVI EKVGMMPEIFFLYYEELDWSTSMTRAGYELWYEPRCTVFHKESQSTGQLSKLRTYFLTRN RLLYARRNMKGMERWMSVLYQSTVAAGKNSLSFAFKGRFDLFSAVYYGVCAGLFMSSSNK NSKNA >gi|225935379|gb|ACGA01000013.1| GENE 63 63184 - 64377 783 397 aa, chain + ## HITS:1 COG:CAC1691 KEGG:ns NR:ns ## COG: CAC1691 COG1215 # Protein_GI_number: 15894968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 41 261 47 271 425 72 26.0 1e-12 MNIIDWILYIPLAFCVCYLLLYAIASKCYRAPQYPEARMLRRIVVLFPTYKEDRVILASV ESFLEQDYPKELYDVIVISDQMRPETNEALQTLPIRLLTANYTESSKAKAMALAMDSIDR DAYDIVVIMDADNITTSDFLSTINRVFDSGIKSAQAHRTGKNLNTDISVLDSISEEINNG FFRSGHNAVGLSAGLSGSGMAFEAKWFHQQVKYLQTAGEDKELEAMLLQQHIYTVYLPDL LVFDEKTQKKEAISNQRKRWIAAQFGALRASLPHLPKALLQGNIDYCNKILQWMLPPRLV QLAGVFGLTFVFTVIGIILSLYNGGHEWMIAIKWWILSAAQVAAMTLPISGGKLFTKQVG KAILKVPMLAVTMIGNLFKLKGANRKFIHTEHGEYRK >gi|225935379|gb|ACGA01000013.1| GENE 64 64512 - 64958 267 148 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 142 2 98 116 111 52.0 
6e-25 MRTINLIVIHCSASRVDRNFTEDDLEVCHRRRGFNGTGYHFYIRKNGDIKTTREIERIGA HAKGYNRNSVGICYEGGLDCHGRPADTRTEWQIHSMHVLILTLLRDYPGCRICGHRDLSP DLNGNGEIEPEEWIKACPCFDVKKEWDA >gi|225935379|gb|ACGA01000013.1| GENE 65 64963 - 65268 378 101 aa, chain - ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 94 1 93 101 124 82.0 1e-27 MLDKIIEIIMTILPFLGSSRKKRQIMAQEVKEFSELVKDQYTFLMQQLEKVLKDYFDLSS KVKEMHAEIFSLRGQLTQAAALQCSSKECVQRVQMVAETEG >gi|225935379|gb|ACGA01000013.1| GENE 66 65325 - 65867 466 180 aa, chain - ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 142 3 140 176 120 42.0 3e-26 MATVTVVRYKRRKRLGDDKSPMMYLLKPKAGESKIYSIDSLAQEIESIGSLSVEDVSHVM KSFVRAMKKVLVAGNKVKVDGLGIFYTTLTCPGVEQEKDCTVKNITRINLRFKVDNSLRL ANDSTATTRGGDNNMMFELYTEKKSVAGGNGGDGSDDDGKGDGGEPGGGSGGGEAPDPAA >gi|225935379|gb|ACGA01000013.1| GENE 67 66061 - 67149 851 362 aa, chain - ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 358 158 514 518 599 81.0 1e-170 LDSNNNEELQAEIRGIMRDFFSPLKAQGEYLRFLFLTGISKFSQMSIFSELNNLQNISMQ DAYSAICGITENELRTQLEEDIRRMAEANGETYDEACMHLKQQYDGYHFSKKCEDVYNPF SLFNAFAQKSYENFWFSTGTPTFLIDILQQSDFDIRQLDGVSATAEQFDAPTNVITDPLP VLYQSGYLTIKEYDRDFQIYTLAYPNKEVRKGFIESLMPAYVHLPARENTFYVVSFIKDL RVGNLDQCMERIKSFFASIPNDMNNKEEKHYQTIFYLLFRLMGQYVDAEVKSAVGRADVV IKMQEAIYVFEFKVDGTPEEALAQINSKQYAIPYQADHRKVVKVGVNFDSSTRTIGEWVI GN Prediction of potential genes in microbial genomes Time: Fri May 13 06:39:35 2011 Seq name: gi|225935378|gb|ACGA01000014.1| Bacteroides sp. D2 cont1.14, whole genome shotgun sequence Length of sequence - 1942 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 422 385 ## BT_4046 hypothetical protein - Prom 450 - 509 5.8 + Prom 414 - 473 5.3 2 2 Tu 1 . + CDS 603 - 860 263 ## BT_1170 hypothetical protein - Term 919 - 955 1.1 3 3 Tu 1 . - CDS 993 - 1940 702 ## BT_1172 DNA primase/helicase Predicted protein(s) >gi|225935378|gb|ACGA01000014.1| GENE 1 2 - 422 385 140 aa, chain - ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 518 206 71.0 2e-52 MEGTFRRYPIGIQSFERLRNDNCVYIDKTELIYRLANSAKACFLSRPRRFGKSLLVSTLA AYFSGKKDLFKGLMMEQLEKDWTVYPVLHIDFSISKYMNAGMLRSAINNRLVEWERIYGC DTSEDTFSLRLKGIIKRAYE >gi|225935378|gb|ACGA01000014.1| GENE 2 603 - 860 263 85 aa, chain + ## HITS:1 COG:no KEGG:BT_1170 NR:ns ## KEGG: BT_1170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 130 80.0 1e-29 MKTKNKILPEQEEEKTFQYRTYGKGELALLYLPNILQQSAVDRFNEWIEAAPGLKERLLA TGMNPRARYYTPAQVRLIIEVLQEP >gi|225935378|gb|ACGA01000014.1| GENE 3 993 - 1940 702 315 aa, chain - ## HITS:1 COG:no KEGG:BT_1172 NR:ns ## KEGG: BT_1172 # Name: not_defined # Def: DNA primase/helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 284 598 599 570 80.0 1e-161 AIEQAAEVPLEGIFTAADLHDDLRALFDNGFGPGAETGWEEMDKICTYERGRSVYVTGVP GAGKSEWVDELVLRLCLRHQWKIGFFSPENTPIVYHLRKLIEKLTGHRFQNGCGMTEGLL ANSENFLTENVSHISLKGNVSPDRVLAKAHELVVRRGCRIIVFDPLNRFDHNPQPGQTET QYISNLLNKFTEFAVQHNCLLVVVVHPRKMNRNPVTGITPRVEMYDINGSADFYNKADYG IIVERDKEVGVTRVYVDKVKFKHLGVGGMVSFIYDPVSGRYLPCEESHDPSLSVDQRVRN TMFDNSCWLPEKELF Prediction of potential genes in microbial genomes Time: Fri May 13 06:39:56 2011 Seq name: gi|225935377|gb|ACGA01000015.1| Bacteroides sp. D2 cont1.15, whole genome shotgun sequence Length of sequence - 18350 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 4, operones - 2 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 798 567 ## BT_1172 DNA primase/helicase - Prom 820 - 879 4.6 2 2 Tu 1 . - CDS 957 - 1925 634 ## BT_1173 hypothetical protein - Prom 2013 - 2072 6.8 + Prom 1893 - 1952 5.6 3 3 Op 1 . + CDS 2074 - 2412 240 ## gi|260170549|ref|ZP_05756961.1| hypothetical protein BacD2_01665 + Term 2441 - 2483 1.5 4 3 Op 2 . + CDS 2487 - 3599 665 ## BT_1175 hypothetical protein + Term 3653 - 3689 1.6 + Prom 3773 - 3832 3.1 5 4 Op 1 11/0.000 + CDS 3935 - 4786 525 ## COG0438 Glycosyltransferase 6 4 Op 2 . + CDS 4858 - 6312 656 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 7 4 Op 3 . + CDS 6315 - 7226 794 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 8 4 Op 4 . + CDS 7274 - 8401 834 ## BF2054 hypothetical protein 9 4 Op 5 25/0.000 + CDS 8441 - 9595 913 ## COG0438 Glycosyltransferase 10 4 Op 6 . + CDS 9607 - 10761 833 ## COG0438 Glycosyltransferase 11 4 Op 7 . + CDS 10776 - 11630 791 ## BF2051 putative glycosyltransferase 12 4 Op 8 . + CDS 11635 - 12210 235 ## BT_1182 hypothetical protein 13 4 Op 9 . + CDS 12183 - 13088 701 ## BT_1182 hypothetical protein 14 4 Op 10 . + CDS 13154 - 14368 972 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 15 4 Op 11 . + CDS 14439 - 17060 1478 ## COG0642 Signal transduction histidine kinase 16 4 Op 12 . 
+ CDS 17138 - 18238 593 ## COG3021 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|225935377|gb|ACGA01000015.1| GENE 1 3 - 798 567 265 aa, chain - ## HITS:1 COG:no KEGG:BT_1172 NR:ns ## KEGG: BT_1172 # Name: not_defined # Def: DNA primase/helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 265 1 266 599 445 78.0 1e-124 MRNFANYDVDIHGKSSGVLKTICKKCLPTRKNKRDRSLRVNVDTGHCHCYHCGADFYVPD DVEERKNAERAVARKRRAAAIPQHFQRPMFDVSKTTLSEDTERWLVETRCIPQSVIAALR ITEQEEFMPQSGKKERCICFNYFEGGQLINTKFRALPKLFKMVQGAELIPYNIDSIVGQT SCIIHEGELDAASSIAAGFQSVISVPAGANSNLSWLDRFMETHFEDLKEIIIAVDADSAG IRLRNELINRLGAERCRVVTYGPEC >gi|225935377|gb|ACGA01000015.1| GENE 2 957 - 1925 634 322 aa, chain - ## HITS:1 COG:no KEGG:BT_1173 NR:ns ## KEGG: BT_1173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 328 328 333 51.0 6e-90 MKKDQYFNLEVNLLNDDNIASMMSEMNAAEALGIYVILLLHLRTKDAYEASCKPVLLKAM ARRYDVDEVAVERVLREFDLFELDEERQMFRSSYLDRVMKSLEEKRKMDIENGKKGGRPK KVAKSAETPVSKGRKPTENQKRREEESKEEESKGSVSVVNNNRSDIETPSLVSRLADEGN HGPLQPVLPWEKLVDQLSTFQSYMELAGQHSGLGKLFVDHQKLILEIFKKHIRLYDKGAG LLFPEDVKRYFSNYIAAGSVTCRTLGETLLKELENTVDKDVNRFESVVDGRRTYLGHLIP ADAPPRPDASAVWDDVKKRWAH >gi|225935377|gb|ACGA01000015.1| GENE 3 2074 - 2412 240 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170549|ref|ZP_05756961.1| ## NR: gi|260170549|ref|ZP_05756961.1| hypothetical protein BacD2_01665 [Bacteroides sp. 
D2] # 1 112 1 112 112 191 100.0 2e-47 MDTKRNQTLEEIEENKIVSEHYQNRIKLIKELLKTSQLVIGDLCVHINISEASYHRYTNF TSYMKTDIFIHACIFLKQYIESHHIPYTQEEKRLIKALDLFQISSNSNLNCN >gi|225935377|gb|ACGA01000015.1| GENE 4 2487 - 3599 665 370 aa, chain + ## HITS:1 COG:no KEGG:BT_1175 NR:ns ## KEGG: BT_1175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 366 372 568 74.0 1e-160 MSDSETDSPSIKALIVFRENGETDNLFVPILCDAIRMAGIDVRCSTKEFWESDKHYDIIH FQWPEEVVGWTCNDPDIIRRLEERISFFRSRGTRFVYTRHNVRPHYANGIISRAYDIIES QSDIVVHMGRFSRDEFLTKYPDSQNVVIPHHIYQYTYKEDISVERARQYLNLSQDAFIVT TFGKFRNREEIRMVLGAFRAWDEEHKLLLAPRLYPFSRRNSYGRNILKRWASRIGYYILV PLFNRMFRLQAGANDELIDNCDLPYYMAASDAIFIQRKDVLNSANIPLAFLFHKVAVGPD TGNMGELLNNTGNPTFCPDNKADIIRALEEARQLAAWGKGEVNYEYAMENMSIKKIGKMY VQLYESTSLK >gi|225935377|gb|ACGA01000015.1| GENE 5 3935 - 4786 525 283 aa, chain + ## HITS:1 COG:HI1698 KEGG:ns NR:ns ## COG: HI1698 COG0438 # Protein_GI_number: 16273585 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Haemophilus influenzae # 66 276 139 350 353 103 33.0 3e-22 MNTEDYDFIIGVHAFLSLQLASIRHQLKAKRVIGWMHTSFDAFFNNPGFYLYDQKKHFQH EMPKLDGMVVLTNYDRERFEQELHLFPTSIYNPLPLIPEGEAKPAYKRFLSVGRMSHLTK GFDILIEAFALFARNNKEWTLEIVGEGPEAPLLKKQIASHKLENRITISPFTKQIQKHYA SASVYILSSRWEGFGLVLAEAMSHRLPIICSDLPFTKEVLDKKNNHLFFQNENIEELAKC MTDIARLTPEQLDQMGKVSLDIAESLKLPTIIKQWENYFEKMK >gi|225935377|gb|ACGA01000015.1| GENE 6 4858 - 6312 656 484 aa, chain + ## HITS:1 COG:L13324 KEGG:ns NR:ns ## COG: L13324 COG2244 # Protein_GI_number: 15672194 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Lactococcus lactis # 5 473 1 469 475 215 30.0 1e-55 MTNNIKQQLFSGVFYTAMAKYIGIIISLVVAGILARLLSPNDFGTVAIATVIISFFAIFT DMGISPAVVQDKTLTEDDLANIYSFTFWTGIAIALLFFIASWPISIYYDSPILRTLCQLL SVNLFFASANIIPGALFYKNKEFKFIAIRSFLIQIMGGTGAVIAALSGAGLYALLINPIL SSILIYIISFRRYPQRLRMTWGLQSLRKIFSYSAYQFLFNVICYFSRNLDKLLIGKHMGM 
SDLGYYEKSYRLMMLPLQNITQVITPVMHPIFSDLQNDKGKLATSYERIIRFLAFIGLPI SVLLFFTAEEITLIIFGSQWLPSVPVFRILTLSVGIQIILSSSGSIFQAAGDTRSLFVCG LFSSVLNVAGILLGIFHFGTLTAVASCIVVTFTINFIQCYWQMYRVTFRQSAWPFIRQLI SPLVISILIALALIPMQYALEGMNIFVTIIAKGIVSFIIFGIYIQMTHEYDMIGKVRSLR KKSL >gi|225935377|gb|ACGA01000015.1| GENE 7 6315 - 7226 794 303 aa, chain + ## HITS:1 COG:SP1767 KEGG:ns NR:ns ## COG: SP1767 COG1442 # Protein_GI_number: 15901598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 39 303 549 813 814 208 41.0 1e-53 MNIRDIFITLRYKISYGSTVEWYNFWYIVLRKKRKITPQVASIDETIRKIIDDRCSVSRF GDGEVLLTSPEKEIRFQKGDPLLAKRLTEVLQSHEEGHIVCISDAFRDLYRYNRKSRRFW RTHFYLYGSWWDRLLIAGRKYYNTFVTRPYMDFARKEDSARWFHDMKGIWDNRDIVFIEG EKSRLGVGNDLFDNARSIRRILCPPRDAFERLEDIKREACKVEKEALFLIALGPAATVLA YDLFKAGYQAIDVGHVDVEYEWWRMGAHKKVKLERKYVNETAIGSEVADAGEEYRKQIIA QIV >gi|225935377|gb|ACGA01000015.1| GENE 8 7274 - 8401 834 375 aa, chain + ## HITS:1 COG:no KEGG:BF2054 NR:ns ## KEGG: BF2054 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 368 1 368 372 497 63.0 1e-139 MKALFLIFHGFNEANGISKKIRYQVKALKDCGVDVRLCHYDVTSYGERRWMVDDEVIADF GTGFAAKIWKRVYYSPIVDYARHEGIDFVYMRSFHNASPFTLRLVKSLKRIGAKVVMEIP TYPYDQEYVTRSMKFHLAIDRCFRHKLAKALDGIVTFSNAEEIFGGNTIRISNGIDFEAI PMKQQQNDTSKELHLIGVAEVHYWHGFDRLVRGLADYYRSNPEYKVYFHIVGPLSGERER AGILPVVRDNHLEPYVLLHGPLHGDELDAMFEKADFAIGSLGRHRSGITYIKTLKNREYA ARGFAFTYSETDDDFDHMPYVWKVEPDESPVDIQGLIDFRKTLKMTPAEIRESVCPLSWK TQMQKVIDAIKTKDK >gi|225935377|gb|ACGA01000015.1| GENE 9 8441 - 9595 913 384 aa, chain + ## HITS:1 COG:HI1698 KEGG:ns NR:ns ## COG: HI1698 COG0438 # Protein_GI_number: 16273585 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Haemophilus influenzae # 2 374 3 353 353 142 28.0 1e-33 MKIAYCIPSLDYPSGMERVLTLKANYFAEVFGYEIHIFLTDGEGRKPYYELHPSITVHQM 
DINYSHLNNLPLYKKIPQYMAKQKLFRNKLNEHLCRIKPDITISLLRRDINFINKLTDGS AKLGEIHFNKSNYRDFSNSRLPSFIRAIVRKFWRIQLIRQLRQLKRFIVLSHEDAREWTE LDNVEVIYNPLPFFPEQVSDGNRKQVIAAGRYVPQKGFDRLIPAWQLVARKHPDWVLRIY GDGMRAELQQLIDSLGITSSCILEPTVPNIVEKYCESSVFVLSSRFEGFGMVIIEAMACG VPPVSFTCPCGPRDIIDDGKDGLLVEDGNIEELAEKICYLIENEETRKEMGRQARVDVQR FKIENIAEQWKQLFESLVPPVKTT >gi|225935377|gb|ACGA01000015.1| GENE 10 9607 - 10761 833 384 aa, chain + ## HITS:1 COG:SMb21250 KEGG:ns NR:ns ## COG: SMb21250 COG0438 # Protein_GI_number: 16264502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 371 50 403 427 149 31.0 9e-36 MKIAYYLPSLYAPGGLERIVTFKANYLAEHCEGYEVYIITSEQRDRQIHFALSPKVKHID LDVAFDWPFNQSTIGKLLKYPFRYHKFRKRFTRLLMEIRPDITLSTIRREVKFIHSIADG SKKVGEFHVTRYSYGVGKGGFFGNMMQKRWDNELQGNLRKLDQFVVLTHEEAAFWPELNN IRVIPNPIITQTERFSECTSHQVIAVGRYAPQKGFDLLIEAWNIIAKKYPTWKLRIYGDG SLRDSFQQRIDELGMTDSCILEHTVSNIVDKYCESSIFVLSSRFEGFGMVITEAMSCGVP PISFDCPCGPKDIIDDGKDGLLVENGNIEELAEKIAYLIENEKVRMDMGKQACMNVQRFQ MENIVKQWKGLFEELTNNSQQPQN >gi|225935377|gb|ACGA01000015.1| GENE 11 10776 - 11630 791 284 aa, chain + ## HITS:1 COG:no KEGG:BF2051 NR:ns ## KEGG: BF2051 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 282 1 282 283 465 76.0 1e-130 MEWYSKYLQVYEKPFAEAPQAVIEEVRRNLAGLQSNEPLVTVSLIGYNEEKHLLACLWSL SEMQCKYPVEIIGVDNESKDRTAEIYEACGVPYFTETQHSCGFARLCGLNHARGKYHINI DSDTMYPAKYVETMVDALEKPGVVAVSSLWSYIPDKDHSWMGLKIYEAARDLHLWLQSFK RPELSVRGLVFAYRTEYARKTGIRTDIIRGEDGYLALQLKQFGKIAFVRKRRARAVTGYG TVGADGTLFNSFKVRVVSRLKGIGGLFTKKKEYKDEESNLVKKK >gi|225935377|gb|ACGA01000015.1| GENE 12 11635 - 12210 235 191 aa, chain + ## HITS:1 COG:no KEGG:BT_1182 NR:ns ## KEGG: BT_1182 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 170 3 190 500 148 43.0 1e-34 MRSVKTINKIIVFILCTINIGCIKEKNLYDDELAEKEVISDPEFFYPFINEPIDHTAEIT IQVKEGVQLPIQPTAVIPPLKYNKSWLFMLTQDDCKHAAFCYTWAAIHGKPLSKKSFYDL 
AHLQRNELPSDYYYLGKTLGSTDGAGNEVRFSFGTTLSPEESWMDKTTTINKKTMVLTDK KGWCGEMSKKC >gi|225935377|gb|ACGA01000015.1| GENE 13 12183 - 13088 701 301 aa, chain + ## HITS:1 COG:no KEGG:BT_1182 NR:ns ## KEGG: BT_1182 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 301 208 500 500 337 59.0 4e-91 MWGNVKEMLNFGVGIAFHDVMATDVYDPEDILSHYKIAQDIILNRLKGRGCKMLAEPNGN KTYVEAADKYPIIRTLTAQAGAKKLLPFKVKGDLEKVLIERVFYEPPSGSGLTTMDVIKP VILEELRYDKKERTAISIGVHSTDTGWIDLFEWLNDTYGKDGDDSMWFTSQEEYYEYNYY QVHGTTEVNYEDEQTIKLTVHLPGQEYFYHPSVTVNLSGVKKEDIKHISSNDEVTGLSYA NYENGIMLNIDCRKYLAEHAENFVKRYEANPTNASAKADANYFVNMLKDSDKKTELKKRA E >gi|225935377|gb|ACGA01000015.1| GENE 14 13154 - 14368 972 404 aa, chain + ## HITS:1 COG:PAE0419 KEGG:ns NR:ns ## COG: PAE0419 COG1215 # Protein_GI_number: 18311929 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Pyrobaculum aerophilum # 57 310 42 295 365 83 28.0 8e-16 MNETITLCLEIIFWLALFVVFYTYLGYGIVLYILVKLKELFVKPVKRSLPASDADLPEVT LFITAFNEEDVVDEKMENSLELDYPADKLHIVWVTDGSDDGTNERLQTRWQGKATVHFQP LRQGKTAAMTRGMTLVDTPLVVFTDANTMVNREAIREIVLAFQDPKVGCVAGEKRIAVQT KDGAAAGGEGIYWKYESTLKALDARLYSAVGAAGELFAVRNELFEAMEPDTLLDDFILSL RITMKGYTIAYCTNAYAIESGSADMREEEKRKVRIAAGGLQSIWRLRPLLNPFRYGILSF QYTSHRVLRWSITPFLLFALFPLNIAILLLGGSAIFYGVLLAMQVLFYGLGYWGYYLSTK QIKNKLLFIPYYFLFMNVNVLKGIRYLKKKKGSGAWEKAKRAEK >gi|225935377|gb|ACGA01000015.1| GENE 15 14439 - 17060 1478 873 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 486 723 61 308 328 171 40.0 5e-42 MNRILHDANNAQRILKLTADTMLLVDRYGVCIDIESHCDLWFLQEDILLGKNIFELLPEY TRERVMPVFQVVLEEQRSISKNFKLVLKDETFYFKCLMFPYDGMVLCQYRDITQRSNVKR RLEQANLTLRAIQKVAQIGQWAYNTNKNVFHYLGYTGVLCEESVQNLAIEKYVELIVEED RQSFIEWCHVNEKELNMESISYRVRLNGEIFYMRIQTYLRKECSDGSFNIEGYIQNITDI 
QHRRNDINTLTHAINNAKESIFAARPDGTIIFANRRFLHNHGIPESEDISQLKIYNVTAD MPTQEAWDERCKDVIHGGSSNFVAHHPSKINKGILAYEGTMYNVTNDSGEESYWSFAHDI SERIRYEAQIKRLNQIMDTTINNLPAGIVVKEINNDFRYIYRNREAYNRDLYKNDPVGKN DFDFYPPIVAEKKRQEDIQVATSGKGMHWTVEGKDRNGNLIILDKRKIRVDGDELSSPII VSIEWDVTELEMMKRELLSSKEKAEMSDSLKSAFLANMSHEIRTPLNAIVGFSHLIAESD DAEERKTYYNIVNANNERLLQLINEILDLSKIESGTIEFSFGPASLHNLCREVYDAHIFR TPQGVSLVYEPSDESLMIETDKNRVFQVISNLIGNAVKFTKEGSISYGYKLVDNQIVFHV TDTGTGIEPEKVGRVFERFAKLNNHAQGTGLGLSICKSIVERLGGKISVSSEFSKGTTFT FTLPYTVAESVSVDSEKKNDDETSGGMSSAINTRHACILVAEDTDSNFDLLEAILGKEHR IIRAHDGMEVVTMFDEVKPDLILMDIKMPNLDGLEATKIIRELSATVPIIAQSAFAYEQD RKAAKEAGCNDFIAKPIADDKLKAMIHKWLSPP >gi|225935377|gb|ACGA01000015.1| GENE 16 17138 - 18238 593 366 aa, chain + ## HITS:1 COG:DR0632 KEGG:ns NR:ns ## COG: DR0632 COG3021 # Protein_GI_number: 15805659 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 137 364 119 325 329 62 26.0 1e-09 MKAFYRLFYYISIVLTSILAGVTIAGAFVGNAAPDSFKFMPFIGLILPILLLANLASVIY WTIRWRCWVFIPLIAIFSNWSYISCVIQSPFFSPASAPMVKMNAYTPGILTVATYNVDAF NHEDTGYSCKEIASYMRNLQADILCFQEFGINDEFGVDSIAAVLSDWPYYFIPISPEGKH LLQLAVFSRYPIKEENLIVYPDSKNCSLWCDIETNGRTIRLFNNHLQTTEVSQNKRKLEK GLRTDDSQRVERAAVGLIDGLHENFRKRAAQANTLKQLIAASPYPTIVCGDFNSLPSSYV YHTVKGDKLQDGFQTSGHGYMYTFKFFKHLLRIDYILHSPELNSTDYFSPDLTYSDHNPV VMRVKL Prediction of potential genes in microbial genomes Time: Fri May 13 06:40:39 2011 Seq name: gi|225935376|gb|ACGA01000016.1| Bacteroides sp. D2 cont1.16, whole genome shotgun sequence Length of sequence - 3985 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1 - 31 2.0 1 1 Tu 1 . - CDS 66 - 3359 3103 ## BVU_2749 OmpA-related protein - Prom 3386 - 3445 7.5 - Term 3484 - 3529 8.1 2 2 Tu 1 . 
- CDS 3576 - 3872 334 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|225935376|gb|ACGA01000016.1| GENE 1 66 - 3359 3103 1097 aa, chain - ## HITS:1 COG:no KEGG:BVU_2749 NR:ns ## KEGG: BVU_2749 # Name: not_defined # Def: OmpA-related protein # Organism: B.vulgatus # Pathway: not_defined # 1 1097 1 1113 1113 1231 60.0 0 MLKRMRSFLVLVMLFIAVTMSAQVTTATMSGKVTAQDEPIIGATVVAVHEPSGTRYGTVT NVSGQFNLQGMRTGGPYKVEISYVGYQTAIYKGINLLLGENYVLNVSLKESSELLDEVVV TASKESNMKSDRAGAVTNLTSAKIASIPTVSRSMNDIMRLSPQSNTTSNGFAAGGGNYRQ SFVTVDGASFNNAFGIGSNLPAGGAPISLDALEQLSVSITPYDVRQSGFTGAAINAVTKS GTNELTGSAYTFLTNNNLIGDRVGDGSIELEKSHQYTYGATLGGAIIKNKLFFFVNGEYE DDLTAGPVARARNNDGEKYGENGIHRPVASKMDEIRDYLINTYDYDPGVYQGYNVKVPSY KFLARVDWNINENNKVNVRFSKTNNRYDASPSSSVNPMTAGTIFPGSTTAGISAEKGATS DEGLYFRNTRYRQEQRFTSIAAEWNSKWGVLNNTLRGTYSYQDEPRSYDGGIFPTTYILE DGAVYAMFGTELFTEGNIRQVKSFTITDEATWTWGNHNFMAGLQFESNKALNGYMQGANG AYVYSSWDDFVTGKKPFSFLITHSNSADLSQFTSKMRQTQYSAYIQDQWNISENFKLTAG LRFELPVYPSLKDNYNAEYAKLNFGGRKFTTDQVPGNSLSVSPRIGFNWDITGERKYVLR GGTGYFVGRMPFVWLVSAVGNSGVGQTQYGYNIGKSSTPGAVAPNFYTDAKDMLEDIYQG NFTPQVSIPGSPTIIDKDLKMPATWKTSLAFDTKLPGDIDFTLEGIYNRDFNPAVISNAN IYADGTKTISKNDIRTAYKSYNAGTGAYLIENGGSKAYYLSLTAQLHKNFDFGLDASFAY TYSRARSYGDGIGDQVTSAYRTNTYAVNGINDHELGYGTYVAPHRIIASLGYSKAYAKHF RSSVSFIYEGAPLGYAGGSYSYSRYSYTYTKNIVGDGGANNLLYVPATKDELTFKDVTSK TGEVTYSAEQQKEDFWNFVQQDKYLKKRLGKYAERGGAVMPWHHQLDFKFNQDFYMMIGG KKNLLQFGIDIQNLANLLNKDWGLYKTVNSITPLADNGDGTFTMQKASGQILNSTYKNYA STASTYRVMFSLRYIFN >gi|225935376|gb|ACGA01000016.1| GENE 2 3576 - 3872 334 98 aa, chain - ## HITS:1 COG:RSc2913 KEGG:ns NR:ns ## COG: RSc2913 COG0488 # Protein_GI_number: 17547632 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Ralstonia solanacearum # 7 96 463 554 555 136 68.0 7e-33 MALKEEGNVLLLDEPTNDIDVNTLRALEEGLEDFAGCAVVISHDRWFLDRICTHILAFEG DSNVFYFEGSYSEYEENKMKRLGNEEPKRVRYRKLMTD Prediction of potential genes in microbial 
genomes Time: Fri May 13 06:41:49 2011 Seq name: gi|225935375|gb|ACGA01000017.1| Bacteroides sp. D2 cont1.17, whole genome shotgun sequence Length of sequence - 144081 bp Number of predicted genes - 165, with homology - 161 Number of transcription units - 56, operones - 37 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1260 1356 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 1398 - 1457 76.2 + TRNA 1381 - 1453 70.0 # Lys TTT 0 0 - Term 1523 - 1567 0.3 2 2 Tu 1 4/0.000 - CDS 1576 - 2514 338 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 2558 - 2617 4.0 - Term 2516 - 2558 12.8 3 3 Op 1 . - CDS 2642 - 3949 890 ## COG0642 Signal transduction histidine kinase 4 3 Op 2 . - CDS 3946 - 4512 530 ## BT_1661 two-component system sensor histidine kinase - Prom 4594 - 4653 4.2 - Term 4622 - 4680 7.2 5 4 Tu 1 . - CDS 4705 - 6090 1478 ## COG0657 Esterase/lipase - Prom 6133 - 6192 5.5 + Prom 6051 - 6110 10.3 6 5 Tu 1 . + CDS 6182 - 7534 1025 ## COG0534 Na+-driven multidrug efflux pump + Prom 7588 - 7647 4.1 7 6 Op 1 . + CDS 7698 - 8414 1070 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 8 6 Op 2 . + CDS 8392 - 9135 561 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 9161 - 9201 1.2 - Term 9147 - 9189 8.2 9 7 Tu 1 . - CDS 9227 - 9730 555 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 9941 - 10000 7.0 + Prom 9703 - 9762 5.7 10 8 Op 1 3/0.000 + CDS 9971 - 10531 683 ## COG1704 Uncharacterized conserved protein 11 8 Op 2 . + CDS 10577 - 11539 694 ## COG0501 Zn-dependent protease with chaperone function + Prom 11571 - 11630 5.8 12 9 Tu 1 . + CDS 11666 - 12166 421 ## BT_1200 hypothetical protein 13 10 Tu 1 . - CDS 12175 - 12798 723 ## COG2860 Predicted membrane protein - Prom 12887 - 12946 4.1 + Prom 12760 - 12819 5.3 14 11 Tu 1 . 
+ CDS 12891 - 13976 1139 ## COG0381 UDP-N-acetylglucosamine 2-epimerase + Term 14001 - 14058 -0.2 + Prom 14187 - 14246 4.7 15 12 Op 1 . + CDS 14279 - 16855 2300 ## BT_1204 putative outer membrane protein 16 12 Op 2 . + CDS 16863 - 17831 851 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 17 12 Op 3 . + CDS 17786 - 18133 182 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase + Prom 18149 - 18208 2.5 18 12 Op 4 . + CDS 18231 - 19187 1156 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Term 19220 - 19282 17.1 19 13 Op 1 31/0.000 - CDS 19334 - 20476 1006 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 20 13 Op 2 . - CDS 20514 - 22061 1540 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 21 13 Op 3 . - CDS 22082 - 22318 236 ## BT_1211 hypothetical protein - Prom 22343 - 22402 6.9 + Prom 22315 - 22374 11.6 22 14 Op 1 13/0.000 + CDS 22537 - 23934 1449 ## COG1538 Outer membrane protein 23 14 Op 2 24/0.000 + CDS 23954 - 25174 1302 ## COG0845 Membrane-fusion protein 24 14 Op 3 36/0.000 + CDS 25223 - 25966 254 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) + Prom 26007 - 26066 1.9 25 14 Op 4 . + CDS 26086 - 27306 409 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + Term 27307 - 27351 5.3 26 14 Op 5 9/0.000 + CDS 27355 - 28416 689 ## COG3275 Putative regulator of cell autolysis 27 14 Op 6 . + CDS 28459 - 29157 703 ## COG3279 Response regulator of the LytR/AlgR family + Term 29174 - 29209 -0.9 + Prom 29328 - 29387 5.6 28 15 Op 1 . + CDS 29564 - 30748 452 ## BVU_2806 hypothetical protein + Prom 30750 - 30809 5.5 29 15 Op 2 . + CDS 30846 - 31016 264 ## 30 15 Op 3 . + CDS 31058 - 31507 327 ## BF2302 hypothetical protein + Term 31525 - 31564 5.1 - Term 31424 - 31461 4.1 31 16 Tu 1 . 
- CDS 31681 - 32793 444 ## gi|260170591|ref|ZP_05757003.1| hypothetical protein BacD2_01905 - Prom 32815 - 32874 8.6 - Term 33043 - 33071 -0.1 32 17 Op 1 . - CDS 33118 - 33609 190 ## gi|260170592|ref|ZP_05757004.1| hypothetical protein BacD2_01910 33 17 Op 2 . - CDS 33650 - 34153 190 ## gi|260170593|ref|ZP_05757005.1| hypothetical protein BacD2_01915 34 17 Op 3 . - CDS 34150 - 34728 461 ## COG0860 N-acetylmuramoyl-L-alanine amidase 35 17 Op 4 . - CDS 34712 - 35227 269 ## gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein 36 17 Op 5 . - CDS 35236 - 35529 209 ## gi|260170596|ref|ZP_05757008.1| hypothetical protein BacD2_01930 - Prom 35643 - 35702 8.3 - Term 35671 - 35703 4.0 37 18 Op 1 . - CDS 35727 - 36005 274 ## gi|160885943|ref|ZP_02066946.1| hypothetical protein BACOVA_03948 38 18 Op 2 . - CDS 35968 - 37389 504 ## COG3344 Retron-type reverse transcriptase 39 18 Op 3 . - CDS 37404 - 37811 272 ## gi|160885941|ref|ZP_02066944.1| hypothetical protein BACOVA_03946 - Prom 37875 - 37934 9.8 - Term 37899 - 37924 -0.5 40 19 Op 1 . - CDS 37958 - 38767 542 ## gi|160885940|ref|ZP_02066943.1| hypothetical protein BACOVA_03945 41 19 Op 2 . - CDS 38793 - 39581 597 ## gi|160885939|ref|ZP_02066942.1| hypothetical protein BACOVA_03944 42 19 Op 3 . - CDS 39584 - 45757 3478 ## gi|260170602|ref|ZP_05757014.1| hypothetical protein BacD2_01960 43 19 Op 4 . - CDS 45769 - 47082 701 ## BVU_2821 hypothetical protein 44 19 Op 5 . - CDS 47086 - 49077 1016 ## BVU_2822 hypothetical protein - Term 49108 - 49141 4.5 45 20 Tu 1 . - CDS 49182 - 49370 158 ## gi|237721085|ref|ZP_04551566.1| predicted protein - Term 49400 - 49432 1.7 46 21 Op 1 . - CDS 49487 - 50533 308 ## Sbal195_3678 hypothetical protein 47 21 Op 2 . - CDS 50520 - 50726 98 ## gi|237721087|ref|ZP_04551568.1| predicted protein - Prom 50773 - 50832 5.7 - Term 50799 - 50842 -0.8 48 22 Tu 1 . 
- CDS 50887 - 51360 61 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Term 51698 - 51739 -0.9 49 23 Op 1 . - CDS 51745 - 52326 237 ## BVU_2826 hypothetical protein 50 23 Op 2 . - CDS 52329 - 56234 3566 ## COG5283 Phage-related tail protein 51 23 Op 3 . - CDS 56237 - 56398 202 ## gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 52 23 Op 4 . - CDS 56429 - 56950 305 ## BVU_2828 hypothetical protein - Term 56952 - 56997 8.8 53 24 Op 1 . - CDS 57000 - 57155 74 ## gi|160885928|ref|ZP_02066931.1| hypothetical protein BACOVA_03933 54 24 Op 2 . - CDS 57152 - 57991 655 ## BVU_2829 hypothetical protein 55 24 Op 3 . - CDS 57988 - 58560 169 ## BVU_2830 hypothetical protein 56 24 Op 4 . - CDS 58557 - 59099 310 ## BVU_2831 hypothetical protein 57 24 Op 5 . - CDS 59108 - 59530 483 ## BVU_2832 hypothetical protein 58 24 Op 6 . - CDS 59535 - 61109 1493 ## BVU_2833 hypothetical protein 59 24 Op 7 . - CDS 61127 - 62188 617 ## BVU_2834 hypothetical protein 60 24 Op 8 . - CDS 62142 - 63587 897 ## BVU_2835 hypothetical protein - Prom 63612 - 63671 4.0 - Term 63635 - 63674 -0.9 61 25 Op 1 . - CDS 63714 - 65729 1027 ## COG1783 Phage terminase large subunit 62 25 Op 2 . - CDS 65791 - 66336 409 ## BVU_2837 hypothetical protein - Prom 66356 - 66415 5.7 - Term 66755 - 66794 1.1 63 26 Tu 1 . - CDS 66870 - 67658 333 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 67663 - 67699 3.3 64 27 Op 1 . - CDS 67732 - 68649 467 ## COG1032 Fe-S oxidoreductase 65 27 Op 2 . - CDS 68660 - 69091 448 ## BVU_0933 hypothetical protein - Prom 69112 - 69171 5.5 + Prom 69065 - 69124 11.0 66 28 Tu 1 . + CDS 69164 - 69349 218 ## + Term 69364 - 69399 4.0 67 29 Op 1 . - CDS 69405 - 70397 834 ## BVU_2845 putative type I restriction-modification system methyltransferase subunit 68 29 Op 2 . - CDS 70354 - 71460 556 ## COG0582 Integrase 69 29 Op 3 . - CDS 71477 - 72085 415 ## gi|260170630|ref|ZP_05757042.1| hypothetical protein BacD2_02100 70 29 Op 4 . 
- CDS 72097 - 72603 293 ## gi|260170631|ref|ZP_05757043.1| hypothetical protein BacD2_02105 71 29 Op 5 . - CDS 72628 - 72873 170 ## gi|260170632|ref|ZP_05757044.1| hypothetical protein BacD2_02110 - Prom 72898 - 72957 2.0 72 30 Op 1 . - CDS 72963 - 73205 227 ## gi|260170633|ref|ZP_05757045.1| hypothetical protein BacD2_02115 73 30 Op 2 . - CDS 73207 - 73758 435 ## gi|260170634|ref|ZP_05757046.1| hypothetical protein BacD2_02120 74 30 Op 3 . - CDS 73764 - 74477 259 ## Dalk_4616 hypothetical protein - Prom 74594 - 74653 2.9 - Term 74748 - 74787 -1.0 75 31 Op 1 . - CDS 74822 - 75286 193 ## gi|260170636|ref|ZP_05757048.1| hypothetical protein BacD2_02130 76 31 Op 2 . - CDS 75315 - 75962 569 ## HAPS_0636 hypothetical protein 77 31 Op 3 . - CDS 76012 - 76587 344 ## BVU_2840 hypothetical protein - Prom 76624 - 76683 4.2 78 32 Op 1 . - CDS 76699 - 77451 294 ## BVU_2841 hypothetical protein 79 32 Op 2 . - CDS 77420 - 78076 428 ## BVU_2842 hypothetical protein 80 32 Op 3 . - CDS 78137 - 78430 152 ## BVU_2843 hypothetical protein - Prom 78466 - 78525 2.4 81 33 Op 1 . - CDS 78540 - 78977 349 ## gi|160885891|ref|ZP_02066894.1| hypothetical protein BACOVA_03896 82 33 Op 2 . - CDS 78974 - 79738 441 ## PA14_58970 hypothetical protein 83 33 Op 3 . - CDS 79766 - 81484 1085 ## COG1475 Predicted transcriptional regulators - Prom 81509 - 81568 2.1 84 34 Op 1 . - CDS 81586 - 81807 214 ## gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 85 34 Op 2 . - CDS 81764 - 82411 486 ## gi|160885887|ref|ZP_02066890.1| hypothetical protein BACOVA_03892 86 34 Op 3 . - CDS 82308 - 83120 392 ## Coch_0881 hypothetical protein 87 34 Op 4 . - CDS 83156 - 83620 370 ## BDI_0857 putative recombination protein 88 34 Op 5 . - CDS 83626 - 84783 461 ## COG1061 DNA or RNA helicases of superfamily II 89 34 Op 6 . - CDS 84780 - 85115 110 ## BVU_2856 hypothetical protein 90 34 Op 7 . - CDS 85048 - 85716 733 ## BVU_2857 hypothetical protein 91 34 Op 8 . 
- CDS 85735 - 86187 245 ## BVU_2858 hypothetical protein 92 34 Op 9 . - CDS 86194 - 86643 196 ## COG0629 Single-stranded DNA-binding protein 93 34 Op 10 . - CDS 86640 - 87554 535 ## BVU_2860 hypothetical protein 94 34 Op 11 . - CDS 87560 - 88342 396 ## COG4712 Uncharacterized protein conserved in bacteria - Prom 88367 - 88426 2.8 - TRNA 88760 - 88829 27.8 # Pseudo ??? 0 0 - TRNA 89057 - 89154 22.3 # Pseudo GAA 0 0 - TRNA 89244 - 89317 65.6 # Undet ??? 0 0 95 35 Op 1 . - CDS 89380 - 89571 154 ## gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 96 35 Op 2 . - CDS 89568 - 89837 61 ## gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 97 35 Op 3 . - CDS 89843 - 90049 259 ## gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 - Prom 90293 - 90352 3.4 98 36 Op 1 . - CDS 90484 - 90720 164 ## gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 99 36 Op 2 . - CDS 90736 - 91029 285 ## gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein - Prom 91167 - 91226 6.6 + Prom 91030 - 91089 6.7 100 37 Op 1 . + CDS 91183 - 91578 297 ## gi|260170661|ref|ZP_05757073.1| hypothetical protein BacD2_02255 101 37 Op 2 . + CDS 91595 - 92173 431 ## gi|160885867|ref|ZP_02066870.1| hypothetical protein BACOVA_03871 102 37 Op 3 . + CDS 92181 - 92648 223 ## gi|160885866|ref|ZP_02066869.1| hypothetical protein BACOVA_03870 103 38 Tu 1 . - CDS 92645 - 92983 233 ## gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 - Prom 93053 - 93112 9.2 + Prom 93196 - 93255 13.9 104 39 Tu 1 . + CDS 93302 - 93514 232 ## BT_1232 hypothetical protein + Term 93685 - 93726 -1.0 105 40 Op 1 . - CDS 93537 - 94487 1091 ## COG2837 Predicted iron-dependent peroxidase 106 40 Op 2 15/0.000 - CDS 94534 - 95265 189 ## PROTEIN SUPPORTED gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 107 40 Op 3 . - CDS 95262 - 96743 1271 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 108 40 Op 4 . 
- CDS 96793 - 98319 1461 ## COG0362 6-phosphogluconate dehydrogenase - Prom 98340 - 98399 3.4 + Prom 98297 - 98356 6.9 109 41 Op 1 . + CDS 98378 - 99547 1030 ## COG1301 Na+/H+-dicarboxylate symporters + Prom 99561 - 99620 4.4 110 41 Op 2 14/0.000 + CDS 99641 - 100711 1305 ## COG1089 GDP-D-mannose dehydratase 111 41 Op 3 . + CDS 100731 - 101801 1026 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Term 101822 - 101869 11.3 + Prom 101827 - 101886 8.7 112 42 Op 1 . + CDS 102054 - 103712 1519 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 113 42 Op 2 . + CDS 103762 - 104910 1060 ## BT_1227 hypothetical protein 114 43 Op 1 . - CDS 105107 - 107101 2076 ## COG3855 Uncharacterized protein conserved in bacteria 115 43 Op 2 . - CDS 107135 - 108799 1635 ## COG2985 Predicted permease - Prom 108873 - 108932 3.5 116 44 Tu 1 . - CDS 109062 - 109505 496 ## BF1816 hypothetical protein - Prom 109525 - 109584 6.3 + Prom 109487 - 109546 6.7 117 45 Op 1 7/0.000 + CDS 109575 - 111293 1247 ## COG0714 MoxR-like ATPases 118 45 Op 2 . + CDS 111262 - 112695 811 ## COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain + Term 112758 - 112797 -0.5 + TRNA 113125 - 113214 55.5 # Ser GCT 0 0 + TRNA 113276 - 113350 57.4 # Glu CTC 0 0 + Prom 113458 - 113517 3.1 119 46 Op 1 . + CDS 113550 - 114746 507 ## BF2484 tyrosine type site-specific recombinase + Prom 114750 - 114809 1.9 120 46 Op 2 . + CDS 114829 - 114996 238 ## 121 46 Op 3 . + CDS 115038 - 115487 288 ## BF2302 hypothetical protein + Term 115506 - 115544 5.2 - Term 115637 - 115677 -1.0 122 47 Op 1 . - CDS 115685 - 116050 61 ## BT_4444 hypothetical protein 123 47 Op 2 . - CDS 116080 - 116574 327 ## COG3023 Negative regulator of beta-lactamase expression 124 47 Op 3 . - CDS 116567 - 116977 299 ## BVU_2812 hypothetical protein 125 47 Op 4 . - CDS 117017 - 117274 303 ## gi|260170686|ref|ZP_05757098.1| hypothetical protein BacD2_02380 126 47 Op 5 . 
- CDS 117279 - 120155 1770 ## BF2447 hypothetical protein 127 47 Op 6 . - CDS 120152 - 122542 796 ## BF2448 hypothetical protein 128 47 Op 7 . - CDS 122539 - 124575 1037 ## BF2449 hypothetical protein 129 47 Op 8 . - CDS 124568 - 124825 181 ## BF2450 hypothetical protein - Term 124858 - 124888 0.3 130 47 Op 9 . - CDS 124894 - 125217 266 ## BF2451 hypothetical protein 131 47 Op 10 . - CDS 125259 - 125714 519 ## BF2452 hypothetical protein 132 47 Op 11 . - CDS 125736 - 126119 311 ## BF2453 hypothetical protein 133 47 Op 12 . - CDS 126116 - 126619 408 ## gi|260170694|ref|ZP_05757106.1| hypothetical protein BacD2_02420 134 47 Op 13 . - CDS 126612 - 126929 209 ## gi|260170695|ref|ZP_05757107.1| hypothetical protein BacD2_02425 135 47 Op 14 . - CDS 126929 - 127225 197 ## gi|260170696|ref|ZP_05757108.1| hypothetical protein BacD2_02430 - Term 127228 - 127273 3.5 136 48 Op 1 . - CDS 127295 - 128425 973 ## BF2457 hypothetical protein 137 48 Op 2 3/0.000 - CDS 128429 - 129028 532 ## COG3740 Phage head maturation protease 138 48 Op 3 2/0.000 - CDS 129049 - 130029 222 ## COG4695 Phage-related protein - Prom 130090 - 130149 5.3 139 48 Op 4 . - CDS 130323 - 132038 686 ## COG4626 Phage terminase-like protein, large subunit 140 48 Op 5 . - CDS 132022 - 132384 286 ## gi|260170701|ref|ZP_05757113.1| hypothetical protein BacD2_02455 - Prom 132532 - 132591 14.8 + Prom 132519 - 132578 10.9 141 49 Tu 1 . + CDS 132615 - 133388 262 ## PputGB1_1741 hypothetical protein - Term 133368 - 133417 11.5 142 50 Op 1 . - CDS 133426 - 133761 158 ## BF2462 hypothetical protein 143 50 Op 2 . - CDS 133901 - 134182 155 ## gi|260170705|ref|ZP_05757117.1| hypothetical protein BacD2_02475 144 50 Op 3 . - CDS 134179 - 134364 144 ## gi|260170706|ref|ZP_05757118.1| hypothetical protein BacD2_02480 - Prom 134403 - 134462 9.7 + Prom 134332 - 134391 3.7 145 51 Tu 1 . 
+ CDS 134413 - 134637 320 ## gi|260170707|ref|ZP_05757119.1| hypothetical protein BacD2_02485 + Term 134653 - 134688 3.6 - Term 134634 - 134680 6.3 146 52 Op 1 . - CDS 134689 - 135036 197 ## gi|260170708|ref|ZP_05757120.1| hypothetical protein BacD2_02490 147 52 Op 2 . - CDS 135038 - 135367 201 ## gi|260170709|ref|ZP_05757121.1| hypothetical protein BacD2_02495 148 52 Op 3 . - CDS 135384 - 136052 313 ## CHU_1184 hypothetical protein 149 52 Op 4 . - CDS 135985 - 136743 377 ## BF2328 hypothetical protein - Prom 136829 - 136888 2.3 150 53 Tu 1 . - CDS 136913 - 137263 329 ## BURPS1710b_1673 GP72 - Prom 137308 - 137367 2.3 - Term 137284 - 137313 -0.9 151 54 Op 1 . - CDS 137430 - 137897 408 ## BT_1331 hypothetical protein 152 54 Op 2 . - CDS 137894 - 138259 310 ## gi|260170714|ref|ZP_05757126.1| hypothetical protein BacD2_02520 153 54 Op 3 . - CDS 138310 - 139122 614 ## gi|260170715|ref|ZP_05757127.1| putative phage-like protein 154 54 Op 4 . - CDS 139109 - 139594 389 ## gi|260170716|ref|ZP_05757128.1| hypothetical protein BacD2_02530 155 54 Op 5 . - CDS 139594 - 139980 355 ## BF2326 hypothetical protein 156 54 Op 6 . - CDS 139973 - 140233 157 ## gi|260170718|ref|ZP_05757130.1| hypothetical protein BacD2_02540 157 54 Op 7 . - CDS 140205 - 140396 198 ## - Prom 140446 - 140505 5.6 158 55 Op 1 . - CDS 140599 - 140808 158 ## gi|260170719|ref|ZP_05757131.1| hypothetical protein BacD2_02545 159 55 Op 2 . - CDS 140854 - 141084 323 ## gi|260170720|ref|ZP_05757132.1| hypothetical protein BacD2_02550 160 55 Op 3 . - CDS 141127 - 141315 279 ## gi|260170721|ref|ZP_05757133.1| hypothetical protein BacD2_02555 - Prom 141404 - 141463 8.8 + Prom 141274 - 141333 6.7 161 56 Op 1 . + CDS 141458 - 142129 413 ## Fjoh_1827 hypothetical protein 162 56 Op 2 . + CDS 142150 - 142689 210 ## gi|260170723|ref|ZP_05757135.1| hypothetical protein BacD2_02565 163 56 Op 3 . + CDS 142710 - 143153 401 ## BF2306 hypothetical protein 164 56 Op 4 . 
+ CDS 143203 - 143691 226 ## gi|260170725|ref|ZP_05757137.1| hypothetical protein BacD2_02575 + Prom 143696 - 143755 3.6 165 56 Op 5 . + CDS 143777 - 144080 234 ## gi|260170726|ref|ZP_05757138.1| hypothetical protein BacD2_02580 Predicted protein(s) >gi|225935375|gb|ACGA01000017.1| GENE 1 3 - 1260 1356 419 aa, chain - ## HITS:1 COG:PM0425 KEGG:ns NR:ns ## COG: PM0425 COG0488 # Protein_GI_number: 15602290 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Pasteurella multocida # 7 418 4 410 556 464 56.0 1e-130 MAADDKKIIFSMVGVSKAFTPNKNVLKDIYLSFFYGAKIGIIGLNGSGKSTLLKIIAGLE KSYQGEVVFSPGYSVGYLAQEPYLDNTKTVKEVVMEGVQSIVDALTEYEEINQKFGLPEY YEDQDKMDALFTRQGELQDIIDATDAWNLDSKLERAMDALRCPPEDQPVENLSGGERRRV ALCRLLLQKPDVLLLDEPTNHLDAESIDWLEQHLQQYEGTVIAVTHDRYFLDHVAGWILE LDRGEGIPWKGNYSSWLEQKTKRMEMEEKTVSKRRKTLERELEWVRMAPKARQAKGKARL NSYDKLLNEDVKEKEEKLEIFIPNGPRLGNKVIEAKQVAKAYGDKLLFDDLNFMLPPNGI VGVIGPNGAGKTTLFRLIMGLETVDKGEFEVGETVKVAYVDQQHRDIDPNKSVYQVISG >gi|225935375|gb|ACGA01000017.1| GENE 2 1576 - 2514 338 312 aa, chain - ## HITS:1 COG:YPO3444 KEGG:ns NR:ns ## COG: YPO3444 COG0454 # Protein_GI_number: 16123592 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Yersinia pestis # 154 312 9 167 167 98 35.0 2e-20 MKQEHVNSIRAFNRFYTKLLGVLNKYYLESEFGLPEVRIIQDVYLHPDRSSKDISSELNM DKGLLSRLLKQLEQKGYICRKGTERDSRMGLINLTQKGCEIYHCLNAAANQSVEEIFAHL NDGQLQEIINSMDFIYNTMNEKETNLAVKNVAPIVIRPIEEEDNTSIACVLRASVEEHDA PKVGTFYDDPHTDMMFQTFNIKNAEYWVVECDGVILGGGGFYPTKGLPKGYAELSKFHFK PELRGKGIGKRLLQLIEQRAIKAGYTYMYIVSYHQFGNAVAMYEKCGYKHISSALDQSGL YQDAPFHMVKEL >gi|225935375|gb|ACGA01000017.1| GENE 3 2642 - 3949 890 435 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: 
Mesorhizobium loti # 127 428 2 311 328 162 37.0 1e-39 MKERLKKNEDADFTFRYDFSKVGSYYQNSQKEGKIDLVTKVTTLYDNEHQPINYLLINAD KTETTVAYNKIQEFEEFFELVGDYAKVGYAHFNVLTGHGYAQKSWYDNIGEEDETPLSEI FSAYRHFHSDDRALLIQFLDDVRKGLTVKLSKEMRIYREDGTCTWTYVHLLVRKYAPQDQ IIEIISINYDITELKRTEEMLVKARDKAEASDRLKSAFLANMSHEIRTPLNAIIGFSSLL ASTEDAAEKELYNSLIGHNNKLLLNLINDVIDLSKIESGYIELHPDWVNLTELLNESVAE YVHQVPSGVELLTNYPEHAFLAELDKLRIKQIVSNFLSNALKNTTTGHVEVFYEIDHQFV RIGVKDTGRGIPQNMLEKIFERFEKLDSFVQGAGLGLSICKLIVEKMNGRILVDSQLGIG TTFVIELPCHSIPVE >gi|225935375|gb|ACGA01000017.1| GENE 4 3946 - 4512 530 188 aa, chain - ## HITS:1 COG:no KEGG:BT_1661 NR:ns ## KEGG: BT_1661 # Name: not_defined # Def: two-component system sensor histidine kinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 188 1 171 601 240 67.0 2e-62 MNRLQDFTFFSQLMANANMGWWKANLSTANYECSDFILELLGLDKTGVISFEDFNKRILK EEQRPTTVHSFGVHQRPEAVYLLDTVKGAVWMRSKICFQETDEDGNEIVYGIAEVQDAPD MASAYQALQHSERLLYNIFKHLPIGIELYNMDGVLIDLNDKELEMFHVEKKEDILGINIF ENPIFPKE >gi|225935375|gb|ACGA01000017.1| GENE 5 4705 - 6090 1478 461 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 35 244 82 305 328 138 36.0 2e-32 MKQKFSTIILLLFILLAGSRVVAQNAPKPFDIEQPSLRVFLPAPELATGRAVVACPGGGY GGLAVNHEGYDWAPYFNKQGIALIVLKYRMPKGDRTLPISDAEAAMKIVRDSADVWNLNP NDIGIMGSSAGGHLASTIATHAKPELRPNFQILFYPVITMDKSYTHMGSHNNLLGKDASA ELETEYSNEKQVTKDTPRAFIVYSDDDKVVPPANGVNYYLALNKNNVPSVLHIYPSGGHG WGIREDFLYKSEMLNELTSWLRSFKVPHKDAIRVACIGNSITYGARIKNRDRDSYPAVLS RMLGEAYWVKNFGVSARTLLNKGDNPYMNEKAYQDALAFNPNIVVIKLGTNDSKSFNWKY KADFTKDLQTMVDAFKALSAQPKIYLCYPSKAYQTGDNINDDIISKEIIPMIKKVAKKNN LAVIDLHTAMDGMPELFPDKIHPNEAGAKVMAKAVYQSLKK >gi|225935375|gb|ACGA01000017.1| GENE 6 6182 - 7534 1025 450 aa, chain + ## HITS:1 COG:VNG0727C KEGG:ns NR:ns ## COG: VNG0727C COG0534 # Protein_GI_number: 15789902 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # 
Organism: Halobacterium sp. NRC-1 # 6 434 18 465 494 133 27.0 8e-31 MQGIKNLTKGPINRQLFNLAMPIMATSFIQMAYSLTDMAWVGRLGSEAVAAIGSVGILTW MSGSISLLNKVGSEVSVGQSIGAQSQEDARSFASHNITIALIISICWGGLLFIFAEPIIR IYELEEHITANAIQYLRIVSTGLPFVFLSAAFTGIYNAAGRSKVPFFISGTGLILNIVLD PLFIFGFGLGTNGAAYATWIAEASVFLIFVYQLRCRDALLGGFPFFTRLKKKYTRRILKL GLPVATLNTLFAFVNMFLCRTASEQGGHIGLMTFTTGGQIEAITWNTSQGFSTALSAFIA QNYAAGRIERVLKSWYTTLWMTGIFGTLCTLLFVFFGNEVFAIFVPEQAAYEAGGVFLRI DGYSQLFMMLEITMQGVFYGIGRTIPPAIISISCNYMRIPLAILFVRMGMGVEGIWWAVC VTTVAKGLILLSWFIIIKKKCLSIPSTIKG >gi|225935375|gb|ACGA01000017.1| GENE 7 7698 - 8414 1070 238 aa, chain + ## HITS:1 COG:FN1265 KEGG:ns NR:ns ## COG: FN1265 COG2885 # Protein_GI_number: 19704600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 47 227 37 202 202 80 36.0 2e-15 MNKMKFMVLFMSIAMIFGSCGSMNNTGKGAAIGGGSGAALGAILGGVIGKGKGAAIGAAI GTAVGAGTGALIGKKMDKAAAEAKQIEGAQVEQITDNNGLQAVKVTFDSGILFTTGNANL SAAAKSALSKFANNVLNQNRDMDVSIYGYTDNQGWKNSTAAQSQQKNLNLSQERAQSVSS YLLSCGVSTNQIKGVQGMGESDPVASNDTAAGREQNRRVEVYMYASEQMIKDAQAAAN >gi|225935375|gb|ACGA01000017.1| GENE 8 8392 - 9135 561 247 aa, chain + ## HITS:1 COG:CC0325 KEGG:ns NR:ns ## COG: CC0325 COG0744 # Protein_GI_number: 16124580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Caulobacter vibrioides # 26 219 25 212 229 218 55.0 1e-56 MHKQLLIKKILRYTRNLLIFFFASTLLAVIVYRFMPVYVTPLMVIRSVQQIFSGDKPTWK HTWVSFDKISPNLPMAVIASEDNRFAEHNGFDLVEIKKAMKENETRKRKRGASTISQQTA KNVFLWPQSSWVRKGLEVYFTFLIELFWSKERIMEVYLNSIEMGNGIYGAQATAKVKFGT TAAQLTRGQCALIAATLPNPIRFNSAKPSSYILKRQNQILRLMNLVPKFPPEEKAVEKKK SKNKKSK >gi|225935375|gb|ACGA01000017.1| GENE 9 9227 - 9730 555 167 aa, chain - ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma 
subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 101 38.0 5e-22 MKSLSFRKDLIGVQEELLRFAYKLTANREEANDLLQETSLKALDNEEKYVPDTNFKGWMY TIMRNIFINNYRKIVRDQTFVDTTDNYYHLNLPQDSGFESTEGAYDLKEMHRIVNALPRE YKIPFSMHVSGFKYREIAEKLGLPLGTVKSRIFFTRQRLQQELKDFV >gi|225935375|gb|ACGA01000017.1| GENE 10 9971 - 10531 683 186 aa, chain + ## HITS:1 COG:lin0961 KEGG:ns NR:ns ## COG: lin0961 COG1704 # Protein_GI_number: 16800030 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 186 5 185 185 159 48.0 4e-39 MTILIIVGIIVLLGIIFASMYNSLVKLRNNRENAFADIDVQLKQRHDLIPQLVDTVKGYA AHEKETLDQVIQARNGAVSAKTIDDKIAAENQLSSALAGLKITLEAYPDLKANQNFLQLQ EEISDVENKLAAVRRYFNSATKEYNNAVQTFPSNIVAGMTGFQREIMFDLGKNERANLEQ APKISF >gi|225935375|gb|ACGA01000017.1| GENE 11 10577 - 11539 694 320 aa, chain + ## HITS:1 COG:lin0962 KEGG:ns NR:ns ## COG: lin0962 COG0501 # Protein_GI_number: 16800031 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Listeria innocua # 55 318 34 302 304 153 34.0 5e-37 MQYVGIQTQQSRNNLRSAFLLFLFPCLVAALLYLACYLLTFMGYKEDMEVEMMPIVNHFF FSSLPYTMGIVLIWFLIAYWANTSIINSATGSKPLNRQENKRVYNLVENLCMSQGMKMPK INIIYDSSLNAFASGINERTYTVTLSEGIIRKLNDEELEAVIAHELSHIRNRDVRLLIIS IVFVGIFSMLTEITFYTITHIRVRSNSKGSGGVILFILIALVIAAIGYLFASLMRFAISR KREYMADAGSAEMTKNPLALASALRKISADPAIEAVERKDVAQLFIHNPKKASKSIFSGL SGLFATHPPIEKRIEILEQF >gi|225935375|gb|ACGA01000017.1| GENE 12 11666 - 12166 421 166 aa, chain + ## HITS:1 COG:no KEGG:BT_1200 NR:ns ## KEGG: BT_1200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 164 164 226 76.0 2e-58 MRTKKQIFMLLLALLVGIPTLSAQQSKKEKKEQKKEAVMKLIESENYKIDVNTAMPMRGR SIPLTSSYSLTIRNDSVISYLPYYGRAYSIPYGGGDGLNFKAILKEYNVEMDKKGNAVIK FVARNPEDRYEFRAKVFPNGSASIDVNMQNRQSISFQGELDIKEEK >gi|225935375|gb|ACGA01000017.1| GENE 13 12175 - 12798 723 207 aa, chain - ## HITS:1 COG:VC2382 KEGG:ns NR:ns ## COG: 
VC2382 COG2860 # Protein_GI_number: 15642379 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 3 197 37 233 239 121 39.0 9e-28 MPTFVQILDFIGTFAFAISGIRLASAKRFDWFGAYVVGLATAIGGGTIRDVLLDVTPGWM TDPIYLICTGLALLWVICFGRWLIRLNNTFFIFDTIGLALFTVVGVGKSIALGYPFWVAI IMGSITGAAGGVIRDVFINEIPLIFRKEIYAMACVVGGIAYWLCDVAGLESYACQLIGGL AVFLTRILAVKYHICLPILKGSEEPEE >gi|225935375|gb|ACGA01000017.1| GENE 14 12891 - 13976 1139 361 aa, chain + ## HITS:1 COG:MJ1504 KEGG:ns NR:ns ## COG: MJ1504 COG0381 # Protein_GI_number: 15669698 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Methanococcus jannaschii # 1 359 1 362 366 197 33.0 3e-50 MKITIVAGARPNFMKIAPITRAIEAARALGKSISYRLVYTGRKDDTSLDASLFSDLDMKA PDVYLGVESSNPTSLTAGIMVAFEQELTENPAHVVLVVDDLTATMSCAIVAKKQGIKVAH LVAGTRSFDMKMPKEVNRMITDGISDYLFTAGMVANRNLNQTGTESENVYYVGNILIDTI RYNRNRLLKPIWFSVLGLQEGNYLLLTLNRRVLLNNKENLRQLLKTIIEKSAGMPIVAPL HTYVRNAIKELGIEAPNLHIMPPQNYLFFGYLINKAKGIITDSGNVAEEATFMGIPCITL NTYAEHPETWRMGTNELVGEDPALLAKAMDTLMQGEWKQGELPERWDGRTAERIVQILTS K >gi|225935375|gb|ACGA01000017.1| GENE 15 14279 - 16855 2300 858 aa, chain + ## HITS:1 COG:no KEGG:BT_1204 NR:ns ## KEGG: BT_1204 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 858 1 858 858 1651 93.0 0 MKQRYSKILILFLLVLAASNAFAQQIKGVVTDSVTHEPLMYISVYYQEKRDMGTITNIDG EYSLEARRNGGTLVFSAVGYISKTVRVGSNNQTVNVKLAPDNVLLNEVVVKPQKEKYSRK NNPAVEFMKKVIEHKKAQVLEVNEYYQYDKYEKMKMSINDLTPEKLEKGIYKKYSFLKDQ VEVSETTNKLILPISVQETASQTIYRKNPESKKTLIKGKNSNGIDEFFSTGDMLGTVLKD VFADINIYDDDIRLLQQRFVSPIGNNAISFYKYYLMDTLMVNKRECVHLTFVPQNSQDFG FTGHLYVLNDSTYAVQKCTMNLPKKTGVNFVNRMDIVQQYEQLPNGNWVLADDDMTVDLS WNSNKTAGGLQVERTTKYSNYKFDPIEQRLFRLKGSVIKEADMLSKSDEYWASVRQVPLT KKESSMDVFVNRLEQIPGFKYIIFGAKALIENFVETGSKGHPSKVDIGPINTMISSNYID GTRFRLSGMTTAHFHKHWFLNGYGAYGLKDERWKYSGTVTYSFNKRDYVVWEFPKHYLSA SYSYDVMSPMDKFLFTDKDNIFLSVKTTTVDQMSYMRDATINYELETLTGFGVKAMLRHR 
NDEPTGKLEYLRNDAAQTRVHDVTTSEASLTLRYAPGESFVNSKQRRVPVSLDAPIFTLT HAMGFKGVLGGDYSFNRTEASVWKRFWLPASWGKIDCSLKAGAEWNTVPFPLLILPEANL SYITQRETFNLINNMEFLNDRYASMSLSYDMNGKLFNRIPLIKNLKWREMFRVRALWGTL TDKNNPFKSNNPDLFRFPTRDGKFTSFVMDPKVPYIEGSVGIYNIFKLLHVEYVHRFTYR DNPGINKNGIRFMVLMVF >gi|225935375|gb|ACGA01000017.1| GENE 16 16863 - 17831 851 322 aa, chain + ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 3 311 16 327 443 310 50.0 2e-84 MQPLAERLRPKTLDEYIGQKHLVGPGAILRKMIDAGRISSFILWGPPGVGKTTLAQIIAN KLETPFYTLSAVTSGVKDVREVIDRAKSNRFFSQSSPILFIDEIHRFSKSQQDSLLGAVE NGTVTLIGATTENPSFEVIRPLLSRCQLYVLKSLEKEDLQELLQRAITTDAILKERKIEL KETTAMLRFSGGDARKLLNILELVVQSETEETVVITDEMVTERLQQNPLAYDKDGEMHYD IISAFIKSIRGSDPDGAIYWLARMVEGGEDPAFIARRLVISASEDIGLANPNALLIANAC FDTLMKIGWPKDGFLWPKRRYI >gi|225935375|gb|ACGA01000017.1| GENE 17 17786 - 18133 182 115 aa, chain + ## HITS:1 COG:PA2613 KEGG:ns NR:ns ## COG: PA2613 COG2256 # Protein_GI_number: 15597809 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Pseudomonas aeruginosa # 3 108 321 427 441 98 45.0 3e-21 MAEGRIPLAEATIYLATSPKSNSAYSAINDALELVRSTGNLPVPLHLRNAPTKLMKQLGY GQEYKYAHSYEGNFVKQQFLPDELKDKRIWQPQNNPAEQKHAERMIQLWGDKFKK >gi|225935375|gb|ACGA01000017.1| GENE 18 18231 - 19187 1156 318 aa, chain + ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 318 1 324 324 342 54.0 5e-94 MKIVVLDGYAANPGDLSWEGMKVLGECTIYDRTAPEEVLERAAGAEAILTNKVIINADHM 
AALPELKYIGVLATGYNVVDTAAAKERGIVVTNIPSYSTASVAQMVFAHILNITQQVQHH SEEVHKGRWANNKDFCFWDTPLMELREKKIGLVGLGNTGYTTARVAIGFGMQVYALTSKS HFQLPPEIKKMDLDQLFSECDIISLHCPLTPETREMVNARRLGMMKPTAILINTGRGPLI NEQDLADALNSGKIYAAGVDVLSTEPPCADNPLLTAKNCYITPHIAWATIEARERLMNIA ISNLQAYIAGKPENIVNK >gi|225935375|gb|ACGA01000017.1| GENE 19 19334 - 20476 1006 380 aa, chain - ## HITS:1 COG:Cj0082 KEGG:ns NR:ns ## COG: Cj0082 COG1294 # Protein_GI_number: 15791472 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Campylobacter jejuni # 5 380 10 374 374 306 50.0 4e-83 MYIFLQQYWWLVVSLLGAILVFLLFVQGGNSLLFCLGKTEEHRKMMVNSTGRKWEFTFTT LVTFGGAFFASFPLFYSTSFGGAYWLWMIILFSFVLQAVSYEFQSKAGNLLGKKTYRTFL VINGVVGPVLLGGAVATFFTGSDFYINKANMTDTIMPVISHWGNGWHGLDALTNIWNVIL GLAVFFLARVLGSLYFINNIDDKELTDKCRRAVRNNTVLFLVFFLAFVIRTLVSDGFAVN PDTQEIYMQPYKYLTNFIEMPVVLALFLIGVVLVLFGIGKTLLKKTFDKGIWFTGIGTVL TVLSLLLVAGYNNTAYYPSYTDLQSSLTLANSCSSEFTLKTMAYVSILVPFVIAYIFYAW RSIDRHKITEKEMDEGGHSY >gi|225935375|gb|ACGA01000017.1| GENE 20 20514 - 22061 1540 515 aa, chain - ## HITS:1 COG:Cj0081 KEGG:ns NR:ns ## COG: Cj0081 COG1271 # Protein_GI_number: 15791471 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Campylobacter jejuni # 6 512 3 504 520 543 55.0 1e-154 MIESIDTSLIDWSRAQFAMTAMYHWIFVPLTLGLAVVMGIMETLYYKTGNEFWKKTAQFW MKLFGINFAIGVATGLILEFEFGTNWSNYSWFVGDIFGAPLAIEGILAFFMEATFIAVMF FGWGKVSKRFHLASTWLTGLGATISAWWILVANAWMQHPVGMEFNPDTVRNEMVDFWAVA TSPVAVNKFFHTVLSGWVLGAIFVVGISCWYLLKKRNREFALASIKIGAIFGLVASLLSV WTGDGSGYQIAQTQPMKLAAVEGLYEGGTNVGLVGIGVLNPEKKTYNDGKDPFLFRFEIP SMLSFLAERNVDGYVPGITNIIEGGYQQKDGSTALSAAEKIERGKTAIGALAAYRAAKSA GHEEDAQIAYKVLQENIPYFGYGYIKDVNQLVPNVPLNFYAFRVMVILGGYFILFFIVVL FFIYKKDLSQMRWMHWIALLTIPLGYIAGQAGWVVAECGRQPWAIRDMLPTMAAISKLDV SSVQTTFFIFLLLFTVMLIAGVGIMVKAIKKGPEN >gi|225935375|gb|ACGA01000017.1| GENE 21 22082 - 22318 236 78 aa, chain - ## HITS:1 COG:no KEGG:BT_1211 NR:ns ## KEGG: BT_1211 # Name: not_defined # 
Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 1 78 78 124 85.0 9e-28 MKNTLLTIWNFYLEGFRSMTLGRTLWVIILLKLFVMFFILKMFFFPDFLRDHPTDDDKGT YVGNELIERAIPGKSTDF >gi|225935375|gb|ACGA01000017.1| GENE 22 22537 - 23934 1449 465 aa, chain + ## HITS:1 COG:RSp0669 KEGG:ns NR:ns ## COG: RSp0669 COG1538 # Protein_GI_number: 17548890 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Ralstonia solanacearum # 46 439 76 470 522 77 24.0 8e-14 MNMINVRRLTVMTFLGAGMLSGISAQVSPLQVDTLKETKIPAQWDLQSCIDYAKEQNITI RKNRITATSTQIDVKTAKAALFPSLSFSTSQQVVNRPYQETSSRVSGSEIISSNSKTSYN GNYGLNASWTLYNGNKRLKTIQQEKLNNQVAELDVATSENNIQESIAQVYIQILYAAESV KVNESTLQVSIAQRDRGQELLNAGSIAKSDFAQLEAQVSTDRYQLVTAQATLEDYKLQLK QLLELDGENEMNIYLPALSDENVLAPLPTKKDVYISALSLRPEIEASKLNVEASELGIGI AKSSYLPTISLSAGIGTNHTSGSDFTFGEQVKNGWNNSIGLSVSVPIFNNRQTKSAVQKA KLQYETSMLSLLDEQKALYKTIEGLWLDANSAQQRYAAANEKLKSTQISYELISEQFNLG MKNTIELLTEKNNLLQAQQEQLQAKYMAILNTQLLKFYQGDQLAI >gi|225935375|gb|ACGA01000017.1| GENE 23 23954 - 25174 1302 406 aa, chain + ## HITS:1 COG:AGc3332 KEGG:ns NR:ns ## COG: AGc3332 COG0845 # Protein_GI_number: 15889118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 371 37 432 437 200 32.0 4e-51 MKTKKIILIAVAVVVVAGAGIWFFAGSPAKHKVTYATANVSKGDISNSVTATGTIEPVTE VEVGTQVSGIIDKIYVDYNSVVTKGQLIAEMDRATLQSELASQQATYDGAKAEYEYQKKN YERSKGLHEKSLISDTDFEQALYNYQKAKSSYDSSKASLAKAERNLAYATITSPIDGVVI SRDVEAGQTVASGFETPTLFTIAADLTQMQVVADVDEADIGGIVEGQRASFTVDAYPNDV FEGVVTQIRLGDASSTSSTSTTTTVVTYEVVISAPNPDLKLKPRLTANVTIYILDKKDVL SVPNKALRFTPEKPLIGNNDIVKDCEGEHKLWTREGTTFTAHPVEVGISNGISTEIISGI SEGTKVVTEATIGAMPGENVAAEGNQESGGERSPFMPGPPGSKKKK >gi|225935375|gb|ACGA01000017.1| GENE 24 25223 - 25966 254 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress 
protein Ctc) [Campylobacter concisus 13826] # 1 224 1 221 223 102 30 1e-20 MKKVIEIQNIKRNFQVGDETVHALRGVSFNINEGEFVTIMGTSGSGKSTLLNILGCLDTP TSGEYLLDDIPVRTMSKPQRAVLRNRKIGFVFQSYNLLPKTTAVENVELPLMYNSAVSAS ERRRRAIESLQAVGLGDRLEHKSNQMSGGQMQRVAIARALVNNPAVILADEATGNLDSRT SFEILVLFQKLHAEGRTIIFVTHNPELSQYSSRNIRLRDGQVIEDTTNPKILSAAEALAA LPKNDED >gi|225935375|gb|ACGA01000017.1| GENE 25 26086 - 27306 409 406 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 8 406 10 413 413 162 29 1e-38 MNGTNLFKIALRALANNKLRAFLTMLGIIIGVASVITMLAIGQGSKKSIQQQISEMGSNM IMIHPGADMRGGVRQDPSAMQTLKLADYEALRDETSFLSAISPNVSSSGQLIAGNNNYPA SVNGVGTEYLDIRQLTVENGEMFTEADIQSSAKVCVIGKTIVDNLFPDGSDPVGKIIRFS KIPLRVVGVLKAKGYNSMGQDQDAVVLAPYTTVMKRLLAVTYLQGVFASALTEDMTDYAT DEISTILRRNHKLKASDDDDFTIRTQQELSTMLNSTTDLMTTLLACIAGISLVVGGIGIM NIMYVSVTERTREIGLRMSVGARGVDILSQFLIEAIMISITGGIIGVIIGCGASWIVKSV AHWPIYIQPWSVFLSFAVCTVTGVFFGWYPAKKAADLDPIEAIRYE >gi|225935375|gb|ACGA01000017.1| GENE 26 27355 - 28416 689 353 aa, chain + ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 132 326 334 531 565 99 32.0 1e-20 MKQTFTSARRPLEILIHIISWGIMFGFPFFFVERGNGNINWMAYIRHLAVPLSFMIAFYV NYFILVPRYLFQSQAKRYIVYNIIFLCVIGILLHLWQSLTFDPSFAPKAKRPGMPPGWLF FLRDMLSLVFTIGLSAAIRMSARWTQNEAARKEAERNRAEAELKNLRNQLNPHFLLNTLN NIYALIAFDSDKAQQAVQELSKLLRYVLYDNQQTYVPLCKEVDFIRNYIELMRIRLSANV QMITKFDIQPDSQTLIAPLIFISLIENAFKHGISPTESSFISIHLSENDKEVICEIRNSN HPKTVEDKSGSGVGLEQVSRRLEILYPGAYTWSKGISKDEKVYESRLSIKIRE >gi|225935375|gb|ACGA01000017.1| GENE 27 28459 - 29157 703 232 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 2 204 1 207 240 100 
32.0 2e-21 MMLRCAIVDDEPLALSLLESYVNKTPFLQLVGKYSSAVQAMKELPGEEVDLLFLDIQMPE LNGLEFSKMVDPHTRIVFTTAFGQYAIDGYRVNALDYLLKPISYVDFLQAANKALQWFEL VQKPEEIDSIFVKSDYKLVQVDLKKIMYIEGLKDYIKIYTEDASKPILSLMSMKAMEELL PSSRFIRVHRSFIVQKDKIRVIDRGRIVFDKTYIPISDSYKQVFQTFLDERS >gi|225935375|gb|ACGA01000017.1| GENE 28 29564 - 30748 452 394 aa, chain + ## HITS:1 COG:no KEGG:BVU_2806 NR:ns ## KEGG: BVU_2806 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 392 1 393 393 518 66.0 1e-145 MATLTLVIVPAKRLSDGTHKIRIRVAHNSETRFITTDIVVRENEFKNGKIVHRPDKDFLN TKLQQLYNLYFKRYMELDYPDSLTCTQLVKMITNPLNGEKHRKFEDIVDEYLSQIDEEER TKTYKLYRLATNKFMQFIGNGSLMEHITPIRMNQYISWLKKTKLSSTTINIYITLLKVII NYAIKMRYVTYDIDPFITARIPSAQKRETQITVEELKTIRDANLEHYNLNVTRDIFMLTY YLAGMNLVDILAYDFRTDEINYIRKKTKNTKEGDSLISFSIPEEAKPIIKKYMKKNTGKI IFGKYKNYTSCYNLLARKISQLGKVAGIRHKFTLYSARKSFVQHGYDLGIPLSTLEYCIG QSMKEDRPIFNYVTIMRKHADKAIREILDNLKNE >gi|225935375|gb|ACGA01000017.1| GENE 29 30846 - 31016 264 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKDEEKKELEQEYENLKLLASFHEAYGVPENAKEREALINDILDRMNEIQEKLKKL >gi|225935375|gb|ACGA01000017.1| GENE 30 31058 - 31507 327 149 aa, chain + ## HITS:1 COG:no KEGG:BF2302 NR:ns ## KEGG: BF2302 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 18 148 14 139 140 77 39.0 1e-13 MIDWNDCLPTKEMQADFERFKELKTTEEKEAFKKEMQDKYNKLPEAQKEAYKKVSEAGLK ATVNACNDYIERAEEAILRDKLGELPEAISFSYIAKKYFGKSRNWLYQRINGNIVNGKKA RFTDNELKTFLNALNDVSEMIHQTSLKIS >gi|225935375|gb|ACGA01000017.1| GENE 31 31681 - 32793 444 370 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170591|ref|ZP_05757003.1| ## NR: gi|260170591|ref|ZP_05757003.1| hypothetical protein BacD2_01905 [Bacteroides sp. 
D2] # 1 370 1 370 370 691 100.0 0 MIRMKRYVKDWLVLGTVILTLIIIVILVFYFIQTQGKFADKQTDWGEFGSLLGAIAGLIA FVGVLFTLRQNKQQFLNSEDRAVFFELLRIFISYRDALQVKRIDWVYDEKQCEWKITPYN EFCTPEKTYRQIYVELYHTFYLEIRRGIPENFSKEEFVRRIIPQNMSREQWMFMYGQLNV AINNIYSEHEFGIHKGKINIYPVHINTYDYLCLNAIKIYFEQNNFKPIAEACAKAADHCF APYKNQLGTYFRNAYYILEMTSEFTSPLKYSNIFRSQLSKYELVLLFFNSFSSLSTIETR RLYLNADLFNNLELKDVRLKEGINDESVSRRMEYIHFPPVLFQKANKNEYMSSDLLKKLY NVILSENNIL >gi|225935375|gb|ACGA01000017.1| GENE 32 33118 - 33609 190 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170592|ref|ZP_05757004.1| ## NR: gi|260170592|ref|ZP_05757004.1| hypothetical protein BacD2_01910 [Bacteroides sp. D2] # 1 163 1 163 163 330 100.0 1e-89 MKRRLKRNVNKRLAGKCHPKTGALFVKKDIYCGWGILEDFVGPEFEGVKVDIGLTLEQAY EKLGGTDRKFYNGTMSLGIMCIKEQIENNTLSDNLYLSQEDIDMIKKGKLPQCKTMHHCP ETTKEGTIVMQLVDRDIHHKTKHTGGSATLNIENSYAVSDDFE >gi|225935375|gb|ACGA01000017.1| GENE 33 33650 - 34153 190 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170593|ref|ZP_05757005.1| ## NR: gi|260170593|ref|ZP_05757005.1| hypothetical protein BacD2_01915 [Bacteroides sp. 
D2] # 1 167 1 167 167 302 100.0 5e-81 MKMLIYITIFLALGIWFASCKTSRNIEMQKQVDYWGEFQFLRNSIESLKTDVSKQTKITT DKLSDLKIENTTVYLSAPDSTGSQYPVKESTTTASKLEQERIEVSEELSVALQRFSNRLD SLSNKVDVILNQKDMVVELSWWNLHKYKVYCGISCMLIIVWLIYKMR >gi|225935375|gb|ACGA01000017.1| GENE 34 34150 - 34728 461 192 aa, chain - ## HITS:1 COG:BH1294 KEGG:ns NR:ns ## COG: BH1294 COG0860 # Protein_GI_number: 15613857 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 2 188 3 173 253 60 28.0 1e-09 MKILIDNGHGSNTPGKCSPDGRLREYSYTREIAGRVVFELRKLGIDAELVVKEEIDVPLS ERCRRVNEYKTSEAILISIHCNAAGNGSNWMQARGWEAWTSVGQTKADKLADCLYATAEE CLFGMKIRKDMADGDPDKESSFYILKHTKCPAVLTENLFQDNKEDVDFLLSEEGKRTIVS LHVKGICKYLKV >gi|225935375|gb|ACGA01000017.1| GENE 35 34712 - 35227 269 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|262407839|ref|ZP_06084387.1| ## NR: gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein [Bacteroides sp. 2_1_22] # 1 171 1 171 171 317 100.0 2e-85 MRWLYELFNVDQIRIIFVSMFSSLLAYLTPTKGFLIALVVMFGFNIWCGMRADGVSIIRC KNFKWDKFKNALVELLLYLIIIEVVFSFMSLIGDGENSLLVIKTITYVFSYVYLQNAFKN LIIAYPRNKGFRIIYHVIRFEFKRATPTHVQGIIDRIENELDKEERYENID >gi|225935375|gb|ACGA01000017.1| GENE 36 35236 - 35529 209 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170596|ref|ZP_05757008.1| ## NR: gi|260170596|ref|ZP_05757008.1| hypothetical protein BacD2_01930 [Bacteroides sp. 
D2] # 1 97 1 97 97 147 100.0 3e-34 MSMGIKVLYDWLLQSNRPAHVKAGMFVFVVMLVFCFLLLGIDFCKSAIVSLTTTAIAAIV VEYIQKKCGFIFDWLDALATVLLPGLITVFSILVVTL >gi|225935375|gb|ACGA01000017.1| GENE 37 35727 - 36005 274 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885943|ref|ZP_02066946.1| ## NR: gi|160885943|ref|ZP_02066946.1| hypothetical protein BACOVA_03948 [Bacteroides ovatus ATCC 8483] # 1 92 1 92 92 183 100.0 4e-45 MDQKNILPRGIAKPIEQQPDGTWIVRHHFRVVGTSENGEELVTFASSEYPEKPTLQQIQR SIDRYRVCLTMYGDTISDEIEKVDLSVYMFTD >gi|225935375|gb|ACGA01000017.1| GENE 38 35968 - 37389 504 473 aa, chain - ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. PCC 7120 # 58 426 5 347 352 99 26.0 2e-20 MSEQLELFIGHPPGDEPGKTKIADATASSSWNVNFNNGNVNTNNRQNANRVRPLAATGNI IYDILLSSIFEASEDCARQKRTSTDCVEFYNDYQSALVRLWHSIIYGEYVPDFSKVFIRT YPVYREVFAAAFIDRVVHHWIALRIEPILEERFREQGNVSKNCRKGEGCLSAVHYLNNMI VEVSENYTADAYIFKDDLFSFFMSISKSLVWEMLNIFVRDNYKGDDIECLLYLLAVTIFH CPQNKCIRRSPVSMWDKLPSNKSLFHNDPDRGVAIGNLPSQLIANFLASVYDYFVMEILG FRHYVRFVDDFCIVVKSPEEILSKVHLLAGFLKEQLLLRLHPRKLYLQHYKKGVLFVGAF ILPGRIYVSNRVVGNTYNAVRKFNRIAENGFAEAYVEKFVSTMNSYYGLMKHFATYNIRR KIAAMLLPEWWEYVYIEGHFEKFVLKNKYNHRKQLIKHIKKHGSKKYLTAWDC >gi|225935375|gb|ACGA01000017.1| GENE 39 37404 - 37811 272 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885941|ref|ZP_02066944.1| ## NR: gi|160885941|ref|ZP_02066944.1| hypothetical protein BACOVA_03946 [Bacteroides ovatus ATCC 8483] # 1 135 1 135 135 262 100.0 5e-69 MALTQDLPISNSMYKLLNLIIDARQQFPKAFRYEFGTELMMLAVHCCEYIRYANTDMNLE HRADYLMKFLCEFDALKLLLRVCEERHLTSLTQTAEICLLAESIGKQSTGWYKKTVADLQ RQKANGSQQVAKPES >gi|225935375|gb|ACGA01000017.1| GENE 40 37958 - 38767 542 269 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885940|ref|ZP_02066943.1| ## NR: gi|160885940|ref|ZP_02066943.1| hypothetical protein BACOVA_03945 [Bacteroides ovatus ATCC 8483] # 1 269 1 269 269 506 96.0 1e-142 
MDKNIANAMLLRLNKQDQIEALKSIGFTTVNENTPASDIAKYMQWSGTLLDLSLATLRIE DGEQVFFTASEWNSMSANNRSKYIRIGIRLRAECHQFIIAKSDCVDAGGNKTFKWGGYGT DLRGLKNYGSGNQGLYDTFDGKENTDVIIETLAGVKDTQGTVGAPAAEVARAYKACTLES DGIEDTTVWNLPALGELMLMAKYKTEINELITSMFGNQNIFTNDWYWSSTEYDASSSWNV GFNHGYVITGSRQGAGRVRPLAAINALSL >gi|225935375|gb|ACGA01000017.1| GENE 41 38793 - 39581 597 262 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885939|ref|ZP_02066942.1| ## NR: gi|160885939|ref|ZP_02066942.1| hypothetical protein BACOVA_03944 [Bacteroides ovatus ATCC 8483] # 1 262 1 262 262 473 100.0 1e-132 MTNEQSATLLRLNKQAQVAALNAVGFSDVTENSRASEFGQRIKWAAGLLDLHLACNRISD NSKAYFTAAEWNSLTLANKQLYIKRGLRIRAHGHSFVIAAQECYNADMTTTFYWGGQGKA IDGLNQKGLGAMYGCFTGEEDTDLIITGLKDQNNSGVIGAPAAEAARAYRAYTLESDGIE DESNWFLPSSGQMLLMYRYRDKINEMMRTFWSSDSMLMTDKYYWSSTIWDTNSAWAFELN TGRITNQNKNSALLHVRAVASE >gi|225935375|gb|ACGA01000017.1| GENE 42 39584 - 45757 3478 2057 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170602|ref|ZP_05757014.1| ## NR: gi|260170602|ref|ZP_05757014.1| hypothetical protein BacD2_01960 [Bacteroides sp. 
D2] # 1 2057 1 2057 2057 4042 100.0 0 MTETEKQQIISLVLQALKTNSLTIEQLTDTTELSKDMYVEVSGGRKISIDLLSSTIAKMV NGDFDALVENVNKIAKDLSDGDAELLKRITGVSDKSNPLTDPFKSIGSFTTIGSFKDKLK TMYSGDSSIGNYRCILSVDSSKIPVNIQIERLELNKVCQSFTSCIQLATMSDNAEGVYLG TVCTISRIGIVSNESVAWGKWTSVINDFEERIGKANGIAPLNEESKVPSECLPEPLSLGE GEEEAFPGNRGKSLEDTMKNIPSDIIKPGSFSVLSDASYLDVYFKKVSKTTGKETDDSFR LPSATLEQAGLLSAEDKQALEDMKSGTPADDVTHPIVIVDEIRPLKDGYYTLETAIAAIV SYQQESGVKYERTGLIITYKTGEYEMETRQFQGAVSDFATPSLWKPFGNGGGGSVFETSD EPAEGGKDAFSTGGAYAYVPANLDVNVETEGIVKLQMKNAAGETLGDEVQFAIGTGGGGQ TGGTIVAIAFQSTPVYGSYGSTLRTFAAIRSVTSNGVESSDNLIEKLELVDRESGLTVWT ETVNKASSGDMKDFSFELDFTTYFTAAGTRKFKLIATDESGNTGSKNVNVTAVDITCTCV QVLNYTPETLLTPTTESFSLPLYKFGNNTSDKGISAQVDIKINGEWQSLSTTVVNDNYSH SVVIRPASLGLEHGTYPLRIQGTDVASGVKGNVIYTAVMVIDPNSSTPLVALRYDDKNGG VVRLYETVELDVACYDPLEMTSPVSVKANNVQVTQIAASRNKTYQVKQQLQGYKADGTDT VNYTAVCKDVTSEPVRVTVSGSAIDAAIKEGAIYNFDFSSRTNQETDHSIVSGNYEMKVD GANWTTNGFGTFLGENCLRVAENVGVSLNHAPFAGSSIESNGAAIQFAFASKNVTDDDAL LLSCYDETSGAGFYVTGRVVGIFCNNGVSRREERAYRQGEKITVAVVVEPASNYVERDGT RYSMMKLFLNGEEVACLGYVPGGGSLIQTKYITMDGKLGDLYLYYMMAWNSYMEWAQAFK NYLVRLTDTEVMVKEYAFEDILKSQTAEGSTQSRPSAAEIYSRGMPYIVECPYEGSDIEA LDGTTSTSTKIYITLYYFDPERPWRNFKAVSVQTRNQGTTSAKRPVKNKRYYLAKSKGKN KDTRIILLNPDDTTEEGRRAIALAAINKVQVGDNTIPVDVITVKVDYSDSGNANDCGACE MMNVTYRALGGNYMTPVQRAFDGTFDSGDLHIEDLQMNHSTANHPVATYRCKDDSLQNVY FHAKGNWKEDKGEQFALGFKDTPGYNKGCLNYGDFIEFFGTPDETLDAIEIRFKQTDGLD TDSVYLLSLYCGSSYRIMRYQDSSWKKQSGSMKYENGKWNVTGDVLNPVEGFELLNYQGM DWFQGVGSVQDMMAMKTDKSSWVQKLVDNGTISADTFPAWTYYFESLVDDDQLAIDYALG KKVPYNLYRWLRFCDSCDYSKGGNWQRTWKENLYKYACPESVLSYDIFTDYLAATDQRAK NMQPMWFLEEYASVTDGVYSSEDAMRMYLNKIYDCDTLNSKDNDGGCTVDAEVDPNRTSD ETFTNPYAGYGSVLFNNIYLQQVVWTDSSGTELSLRTVAAAMRNVQATIDGVTLHPFSPE GATHFFIDKRLKKWQKLVSSYDGERKYISYTATSDAIYFYALQGLGLTALPSFIERRWRI RDGYFQTGDFFSGVISGRVSSKSNATIRIVAAKNGYFGVGNDASGNLSESCFLEAGEEYV FTNFSHEEGALLYIYQADRMKLLDLSEISLSSTVSFSAMQLVETLILGSDTHTEQSIGSY APLTSLNCGEMPFLVSLDIRNTQIATLVTDKCPRIAHINASGSKLENITLAETSPINDIS LPPTMTSLRFVGLPELTYTGLSAPSGLHIESMPNVQRLRLETSPQLDAIQMLRDVLASQA 
ASRKLSMLRISNMTLKADGSELLAILEYGVAGMDEDGNRQDKPVVNGTYELTVIRETDEI ESLESGIDGLVILTVIDAYIDLINWFNNESYGGEPYYDNVTLDNINEVLEYYNGETYEEY LERFAEDNMDINDLINK >gi|225935375|gb|ACGA01000017.1| GENE 43 45769 - 47082 701 437 aa, chain - ## HITS:1 COG:no KEGG:BVU_2821 NR:ns ## KEGG: BVU_2821 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 266 3 266 275 273 48.0 1e-71 MIISPFTPLFFSPSTDKFGAKSKYVQLFARTDRIFVELILTPREQEPIVYINNLLSNIST PVSLSSWKMNDDKILYFYNISLLPCGYYTVTVNGNTSEIFKVTDDECELSETSLIQYSMK DNKQRLDAVWWIDGMQYFFDFRVPGGFKDNGWTFGVDNEQFVTSDEDIVELFSHEYTTVL FTLGNGMGCPVWFAELLNRVLCCNYVYFDGVRYTRKESNVPELNQQIEGLKSFVFNQMLQ KVRTMNPVLEWNNQLAMRCVQSGAYRIADDEGMRSIKYGSESGVAEVGAYINMTKAIPNT GVSINSDTMVTVNSIHHPGVDENSYWDLIAIKTTDIDNKYIGRRGYGKLTVNGLDRLKND LDNGSINLRAVLYKGDSYTNLIEGSVISRDGVCVLKGINGGDIGALKEFQLYLDNVYDCD IDNLGMTIELVWVYEND >gi|225935375|gb|ACGA01000017.1| GENE 44 47086 - 49077 1016 663 aa, chain - ## HITS:1 COG:no KEGG:BVU_2822 NR:ns ## KEGG: BVU_2822 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 403 1 424 892 138 29.0 8e-31 MLCKYVLTVDSISYDIPKSCIQNWDEIKFSRKRSGLEGITRTFTSKFQFVGEAYDLILEE YLSKYLASNASITVYTITNSHTYEEFFSCRLDFGSLTYDGNTVSINLIDDSVASIIKANK GTQYEYLVDEIKDTYQLYYDRLPFNYYANYICGGYSLEDGGQYVDFSRDITGKTIFQSLP LEVVEKDLPESDSPVEINSVTLDTSVPAFLRAHKPVKVYITPEFNFYLGRGDVMLTLAKV DGNGTTSTIASWINTDYSGNAHTTEKDTYRPEQYRDVYAIDLQDGECLQFVIHDPIGNMN VNGPGKVYFSKYSLQVKWTSIASPINIDVVKPITVLNSLLKSMNGGKGGIKGEIASGVDN RLDNCLILAAESIRGILSAKLYTSYTKFVDWMEACFGFVQKIEGDIVKFVHRDSLFTFNG NKNISRSISDFQFKVDSSRIYARVKVGYDKVDYECLNGRDEFRFTAEYTTGLQVTDNTLE LVSPYRADAYGLEIVSQKRGSSSTDNESDNDVFIVGAMLAYNKVIGKAEYVLERNADWKI AGVLNPDAMFNVMYWQKAMLKANAKYIGMFADSLHYASSDGNSNVIVNDVKLTDDFILEE HLVTCGDVSFTTFDEDIPQTDDGTIKIQKGGLVYEGYIKEVSSVVERNEGVKYDLFVRSI TKA >gi|225935375|gb|ACGA01000017.1| GENE 45 49182 - 49370 158 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721085|ref|ZP_04551566.1| ## NR: gi|237721085|ref|ZP_04551566.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 62 1 62 62 95 100.0 9e-19 MIREVILEALKKRGIKQIELADHLGINKSPLNAFLKGKGKISMENIEKSFLFLGIDIILK NK >gi|225935375|gb|ACGA01000017.1| GENE 46 49487 - 50533 308 348 aa, chain - ## HITS:1 COG:no KEGG:Sbal195_3678 NR:ns ## KEGG: Sbal195_3678 # Name: not_defined # Def: hypothetical protein # Organism: S.baltica_OS195 # Pathway: not_defined # 26 339 26 335 350 80 23.0 1e-13 MKDTKQNANNNTNTTVIQVNGDYYSGITESQAKEIALATVRNEFSILYGEAQNIFEKRVQ EIVDESLLRIQHDSPESFKRFNEPAIQMILNTVYKEYAKSGDSDLKQRLIDLLIARIKVS EHTFTQVLIDDAIRITPKIRIQHLQFLTSLFFIYSGLVTFTCIAEYDECITMLAKNYPLY PKEASLFNVNDVYFLALLKHTGCITYAESSSSLVEEEILICFGGIFNKGFKISDIDSALK KELEEHNLICTSEIQPGNVRINTRNEFVLRLDVNKISKKYRKNMYDLYVNNCSTVEDLRN YNKSLDNRVHAMFESVNFLNRTEHYSLSEFGLFLGQQNFYVSSLLYFF >gi|225935375|gb|ACGA01000017.1| GENE 47 50520 - 50726 98 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721087|ref|ZP_04551568.1| ## NR: gi|237721087|ref|ZP_04551568.1| predicted protein [Bacteroides sp. 2_2_4] # 1 68 1 68 68 128 100.0 1e-28 MEAQDLITIIFSFCRNNAGWLFSGIGVSLLGFIIKKVLSRKKKTTIHQEANSNKQSKITQ VGGHYERH >gi|225935375|gb|ACGA01000017.1| GENE 48 50887 - 51360 61 157 aa, chain - ## HITS:1 COG:CC2883 KEGG:ns NR:ns ## COG: CC2883 COG1595 # Protein_GI_number: 16127115 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 3 153 9 159 189 69 28.0 2e-12 MDFEKELSEIYPWILKVARKFCCSMQDAEDLAGDTVYKLLVNRDKFDCSKPLQPWCLIIM RNTYIIRYNRNSLIHFTGLDMVDGSAISNCTAHSILFDDLVSTIQQCAKKSRCIDSVMYY AEGYSYEEISEILNIPVGTVRSRISSGRKFLLQEISS >gi|225935375|gb|ACGA01000017.1| GENE 49 51745 - 52326 237 193 aa, chain - ## HITS:1 COG:no KEGG:BVU_2826 NR:ns ## KEGG: BVU_2826 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 192 1 188 189 146 43.0 3e-34 MIERLNQITLNDFIELSCGNYACLLSDRGSMSESTLKEMASKLIIEYRSIVNPSGMQAMI MDKEDMVKERAKLLSLRICQTLVSLGFYDDVRQVLGQLNVDIRNMSDEQVISKLDYLLHS AIFEQKRNEERRSEEHKGSKATPEQIRSSFDAEIAFLMTFFKMSIDSRVINAAVYANIVH QADVEISIRKRST 
>gi|225935375|gb|ACGA01000017.1| GENE 50 52329 - 56234 3566 1301 aa, chain - ## HITS:1 COG:Z2987 KEGG:ns NR:ns ## COG: Z2987 COG5283 # Protein_GI_number: 15802339 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 EDL933 # 328 608 225 520 696 83 25.0 3e-15 MADLKLKDFVDENDLQKLVELDNTIERVRADYVNAAKELAKGLKLNVEGVADLEKLSNLY NTQAKTAGSASAELTEALRKQSEITQTVSKKIEEKLNVEKLSAAELKKLTKANSDNAASL EKAAKAEANLTKAQNAGNTTRKKAVLSEEERLKLIRTAITLTNQEVHSRSQAKEMNKQLQ KAVDVLKDTDENYIRTLARLNSTIGINTDYIKRNSDRYSQQKMTIGAYREEVKAAWVEIQ NGNKSMQNMGIIARNAGRMLKTEMAPGLSQVSAGLKGWAAGYIGAQAVVGGIVKMFTQLR EGVGSIVEFEFANSKLAAILGTTADNIKELTTDARQLGATTKYTAAQATELQIELAKLGF TRREILDSTGAILRFAQATGAELSDAAALSGAALRMFNASTKETERYVSAMAVATSKSAL SFSYLATALPIVGPVAKAFNFQIEDTLALLGKLADAGFDASMSATATRNILLNLADGNGK LAKALGEPVKTLPELVVGLKKLKEQGVDLNTTLELTDKRSVAAFNAFLTASDKIVPLRDQ ITGVDKELTDMADTMSNNVKGSIAGLSSAWEAFMLSFYDSKGIMKDVLDFLARGLRNVAT QLKGYSELQDEADNKAVAFAQKEMMKSDILEKNARNMQRLYKEYINSGMSADEAAKKAKE DYIETLKSRLEYENSDYQLAIDNRKKLEGELKDRGFFTILTSWRRTNNVIKDEIDVATKA AAGKKAISSITESLIEQLDTIDLKENGGTKGNSVKVLTDKEKREQEKALKEKLKIHETYQ ESELALMDEGLEKELAKIGVAYSKKIAVVKGNSKEEIATRQNLAKEMQEKLDEFTIKYNS DREKKDVENALAVVKKGSQEELDLKLHQLELQREAEIDTAEKTGEDVFLIDDKYAKKKQE LYERHASDQVQLIAENAAHEQEIRDAAYVMDTLALKKQLASKLITEEQYAIEEYNLQLEY VHKTTEAAIEALELELTVENITAEERTKIVTQLYVLKAALAKKEAELQISAIQNITKAED KALKERQKNLKKWLQTASQAVGTIGNLVSTLYDAQIDKIEEEQDANDEKYDKDVERVDKL AESGAISEEEAEARKRAAKSLTEAKNAELEKQKQEMARKQAIWEKATSVAQAGIATALAI TEALPNIPLSIVIGAMGAIQVATILATPIPSYAEGTKGNDRHPGGTALVGDAGKHEVIMY SGKAWITPATPTLVDIPKGAQVFPDVDKVDISNFDMPDWDFPTFSPTYFASSSGDTTVFN DYSRLEKRVDKTNLLLMKSLKMQRQDASNREFELYKLSKLK >gi|225935375|gb|ACGA01000017.1| GENE 51 56237 - 56398 202 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885930|ref|ZP_02066933.1| ## NR: gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 [Bacteroides ovatus ATCC 8483] # 1 53 10 62 62 95 100.0 1e-18 MQSELERISDLAKKAAVLDGCMYVVYQKEDGTYAFDKLGVEIKGKIVEYRHYL >gi|225935375|gb|ACGA01000017.1| GENE 
52 56429 - 56950 305 173 aa, chain - ## HITS:1 COG:no KEGG:BVU_2828 NR:ns ## KEGG: BVU_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 173 1 173 173 238 60.0 7e-62 MADFDELHRVIHSIASGFEEECIRCMEEHKNVLVDCIQEQLYSGLDGTEHLLNPDYDTDT YFNEPGPWQNRAEQYKRWKERITPPLRSEMLYLPPRPVEVPNLFITGTFYDSITADRIDS GLRFSTKGFTDGSSIEKKYGEQILGIGDTAKEYFNIMYLRPWMERFFSECGYR >gi|225935375|gb|ACGA01000017.1| GENE 53 57000 - 57155 74 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885928|ref|ZP_02066931.1| ## NR: gi|160885928|ref|ZP_02066931.1| hypothetical protein BACOVA_03933 [Bacteroides ovatus ATCC 8483] # 1 51 1 51 51 72 100.0 8e-12 MKVDNVTFVEVAVKGMTKEEFINAHIKVVWQELKEADRKKKLSEVYDAITK >gi|225935375|gb|ACGA01000017.1| GENE 54 57152 - 57991 655 279 aa, chain - ## HITS:1 COG:no KEGG:BVU_2829 NR:ns ## KEGG: BVU_2829 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 278 1 277 277 404 67.0 1e-111 MRKIRTCKGSRMNTGSSACSIDWKKVKGAILTEHGVKLPADITGEKLLELCHADRPGRIY PILPFLEYAKNGGEPQVNPVGYGASEYNGLSAQTDTFTLKKFDEVLNAQLLKCANKGWDV YFWNQDNMLIGYNDDTDILAGIPMSTVYPTVTQYPTSSAKSAMTVSFSHEDVEDSQLHFD YVQLDFNPKNFVKGLVDVVFQKLEAENTYKIVEVVGGYDRTEEFGSLIADGAAEVMNNVT SATYSDGIITIVPKAGAVPSLKAPSVLYEKGIRGIEQVS >gi|225935375|gb|ACGA01000017.1| GENE 55 57988 - 58560 169 190 aa, chain - ## HITS:1 COG:no KEGG:BVU_2830 NR:ns ## KEGG: BVU_2830 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 189 1 188 188 231 61.0 9e-60 MIDLDITELFEEIVKELPEGLEILYPNGKGGTKVMKSPRLNYIFGSSQYIKDILDEYSKS SAQSERKFPLVALFTPISEDRGDADYFSKAKVSLIIACSSCKEWSNEMRRTTSFKNILRP IYKRLLEVLYEDSRFDCDYDEKVKHSYSENYSYGRYGAYTDSGEAVSEPIDAINIRSMEI KINNLNCRRK >gi|225935375|gb|ACGA01000017.1| GENE 56 58557 - 59099 310 180 aa, chain - ## HITS:1 COG:no KEGG:BVU_2831 NR:ns ## KEGG: BVU_2831 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 180 3 173 173 215 56.0 5e-55 
MLIDVSYFMSGPRHIENVSVAEMPSPQSLAVNEVINGYIKAFQPEFLRNVVGVTLSQAIT DYLELIEREKEDSSDEVDISEEKEAPQSGYAVLCEKLCEPFADYVFYHILRDANTQATIT GLVRLKCANEYIAPLKRQVSTWNSMVEKNKQFVEWAMSNDCPFDVKITKNLLTPINAFNL >gi|225935375|gb|ACGA01000017.1| GENE 57 59108 - 59530 483 140 aa, chain - ## HITS:1 COG:no KEGG:BVU_2832 NR:ns ## KEGG: BVU_2832 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 129 1 116 140 61 37.0 9e-09 MDYILRGNDKDVTNVLKEQRIRINRGMIQLIPISECGLVTKEDARKTLECMLAEKNEEIG RLTASIAEKDKTIVELTEERETMKARIAELEVQVPSDEKNLPVADSKDLQEEDAKEVTVT DDKAVSVEDEKKTGKGKTSK >gi|225935375|gb|ACGA01000017.1| GENE 58 59535 - 61109 1493 524 aa, chain - ## HITS:1 COG:no KEGG:BVU_2833 NR:ns ## KEGG: BVU_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 20 524 59 561 562 361 40.0 4e-98 MPKKFTVSDFNLKTDGLPAEQKTFMENIVGMMCEVVNKSLEGFASPEEVTKQFGDINNLL KAYDGEKFQQLVKDNEQLVEQVKTLGESIEKMKQKGLSMDTINKFDEKLNEMLDSEKFRD FAEGKTRKSGEFDGFSLKDVVSMTDNYTGDLLITQQQKRVVTQVANKKLHMRDVLTTLTA DPAYPQLAYAQVYAFNRNARFVTENGRLPESSIKVKEIQTGTKRLGTHIRISKRMLKSRV YIRSYILNMLPEAVWMAEDWNILFGDGNGENLLGIINNTGVTSVEKIISTAIVTGAAGAV KAITGYNGDKDVIVEFAEPQDLILDGMSITFAGAAVLTELNKTHALVKMEDGRILIPGVA FSGAETATDKMTFSVHEAGFKNIEEPNSEDVVKTAFAAMTYAQYFPNAIILNPMTVNGME SEKDTTGRNLGIVKMVDGVKYIAGRPIIEYGGILPGKYLLGDFNQAANLVDYTTLTLEWA EDVETKLCNEVVLMAQEEVIFPIYMPWAFAYGDLAALKTAITKA >gi|225935375|gb|ACGA01000017.1| GENE 59 61127 - 62188 617 353 aa, chain - ## HITS:1 COG:no KEGG:BVU_2834 NR:ns ## KEGG: BVU_2834 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 329 3 336 351 321 52.0 3e-86 MEEKIKSLQYKTKANDVDEKGIVTVAVNGIGVKDSQNDISMPGSFNKTLKENIGRMRWFL NHRTDQLLGVPLSGKETEGNLVMVGQLNLEKQIGRDTLADYKLFAENGRTLEHSIGVKAI KRDSVDPCKVLEWRMMEYSTLTSWGSNPQTFLVNIKSATADQVKEAVDFVRKAFLQHGYS DERLKGYDMELSLLLKSLNGGAVVSCPHCGYQFDYDAETEHTFAQQVLDYAADYQRWITQ DIVREEMEKLTPEIRTQVISLIDSVKSEKKEFTQKGLQDLMNYVRCPHCWGKVYRSNAIL QNTSENTTGKNEPSVDTQEKNDGENGNDEVTIKAADNGTLLDFKSLNSCFENK 
>gi|225935375|gb|ACGA01000017.1| GENE 60 62142 - 63587 897 481 aa, chain - ## HITS:1 COG:no KEGG:BVU_2835 NR:ns ## KEGG: BVU_2835 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 481 4 475 475 473 49.0 1e-131 MNIFFDNLFGKKSKTKGEVEIVTSSENKDIDTQSGKAEKWSVAYIEDLTSPIVAGSNYLT LFSTIPEVFFPIDYIASRIAGANFQLKKTKDDSIVWANKRMNGILSRPNCLMRWKELIYQ HHIYKLCTGNSFIRAAMPDVFSTAEKWRYCDNYWVLPSDKTIVEPVYGNMPLFGIAQTED IIRSYRLEYGWNGSLEIPPYQIWHDRDGSAEFYSGAMFLKSKSRLASQNKPMSNLIAVYE ARNVIYVKRGGLGFIVSKKTDATGSIALTDDEKEQLLKQNFEKYGVRKGQVPYGISDADI DFVRTNLSIAELQPFEETLADAINIAGAYGIPAVLVPRKDQSTFSNQATAEKSVYCSTVI PMAKQFCKDFTAFLGLEGGGYYLDCDFSDVDCLQEGLKESEDVKTNINKRCREQFSCGLI TLNDWRAQIGESMIENPLFDKLKFDMSDEELDKVNRVFNTKSGDEKDGRENQKPSVQDKG K >gi|225935375|gb|ACGA01000017.1| GENE 61 63714 - 65729 1027 671 aa, chain - ## HITS:1 COG:BS_yqaT KEGG:ns NR:ns ## COG: BS_yqaT COG1783 # Protein_GI_number: 16079672 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Bacillus subtilis # 6 173 3 169 431 82 30.0 4e-15 MVINYKKLNPNGFYLLKYLNDETIRFIILYGGSSSGKSYSVAQTILIQTLQDGENTLVMR KVGASILKTIYEDYKVAAIGLGISHLFKFQQNTIKCLVNGAKIDFSGLDDPEKIKGISNY KRVQLEEWSEFEHPDFKQLRKRLRGKKGQQIICTFNPISESHWIKKEFIDKDKWHDVPMT VTIAGKELPKELTKVKSVKKNAPRQILNLRTKQIEEQAPNTVIIQSTYLNNFWVVGSPDG AYGFYDEQCVADFEYDRVHDPDYYNVYALGEWGVIRTGSEFFGSFNRGKHSGEHKYVPDL PIHISVDNNVLPYISVSYWQVDFTTGTKVWQFHETCAESPNNTVKKASKLVAKYLKSIQY SDRLYVHGDASTKAANSIDDEKRSWMDLFIDTLQKEGFEIEDKVGNKNPSVAMTGEFVNA IFDCTVPGIEIYIDESCSVSIEDYMSVQKDANGAILKTKVKNKTTLQTYEEHGHLSDTFR YVVVDLCSEQYIEFSNRRKRNLYACNGTINFFNPDTECKYTKKILYVMPNVNGKFVLIQA FRCGNKWHVVDVVFMDTTSTEDIRSSILSHESDSCVIECTDAYFPFIRELRSSTNKEIRV MKEFPDVDKRIAATSDYVKNSILFSASKVESDTEYVAFMNNLMDYNKDSETKEASAVLSG LVQFVVKLGLN >gi|225935375|gb|ACGA01000017.1| GENE 62 65791 - 66336 409 181 aa, chain - ## HITS:1 COG:no KEGG:BVU_2837 NR:ns ## KEGG: BVU_2837 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 181 4 181 181 219 
63.0 2e-56 MGKQEKPLTFKQEKFCKYYVDTEGNASEAYRMSYDASKMKPETIWSAASRLLANSKVSAR ISEIKQQRAKETEVERKTVEKVLMDIVLADPDDLHYVDPVTGKTKMRSPSQLPKRARNAL KKIQNNRGVVNYEFNGKTEAARILGAWNGWEADKNVNIKGGDGNKVGELRIGFEDNENSE E >gi|225935375|gb|ACGA01000017.1| GENE 63 66870 - 67658 333 262 aa, chain - ## HITS:1 COG:XF2279 KEGG:ns NR:ns ## COG: XF2279 COG0451 # Protein_GI_number: 15838870 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 4 214 22 296 342 90 29.0 3e-18 MRKMIVTGSEGFIGKALCRELAKRDVEVIGLDRKSGIEATKVCELLKNGGIDCVFHLAAQ TSVFNGNLEQIRKDNIDTFMRVADACNQYHVKLVYASSSTANPENTTSMYGISKYFDEQY ASIYCKAATGCRLHNVYGPNPRKRTLLWFLMEKENVSLYNCGQNIRCFTYIDDVIEGLIY AVGCNRQLINICNVQPVTTMYFASLVKYYKPLEIELINEKRDFDNLEQSVNRDIYLVPLS YTSVEDGVKKIFDEKKGKDMSY >gi|225935375|gb|ACGA01000017.1| GENE 64 67732 - 68649 467 305 aa, chain - ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 79 234 46 207 425 88 34.0 1e-17 MNIGILAVDSNYPNLALMKISSYHKARGDNVEWYNPLCSYDKVYIAKVFSFTPDYGYYIN ADQVEKGGTGYDIKKVLLPEIDRMIPDYDLYNVDKNLAYGFLTRGCPNRCKWCVVPAKEG NITTYMDIADVSAGRKNVILMDNNILASNYGLQQIEKIVSMGVRVDFNQGLDARLVTEDV AKLLAKVKWIKRIRFGCDTPGQIAECERATALIDKYGYKGEYFFYCILLNDFKEAFTRVN HWRVKGGRFLPHCQPYRDLNNPRQIIPQWQKDLAGWADKKWVFRSCEFKDFTPRKGFKCR EYFQK >gi|225935375|gb|ACGA01000017.1| GENE 65 68660 - 69091 448 143 aa, chain - ## HITS:1 COG:no KEGG:BVU_0933 NR:ns ## KEGG: BVU_0933 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 127 1 126 151 166 59.0 2e-40 MAKIYVASSWRNVFQQDVVGILRDLGHEVYDFKNPPHGNGGFQWSDIDPNWQNWTTEQYR EALNHPIAQKGFDSDFNGMKWADVCVMVLPCGRSANTEAGWMKGAGKRVMVYSPKEQEPE LMYKIYDFVSDSIFRINDEIIGV >gi|225935375|gb|ACGA01000017.1| GENE 66 69164 - 69349 218 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MYSYTISFTAGGRSYSFTTNISYPHNFHDRGLVKTAVMSAIGAYKRANGIEAATVESSVS Y >gi|225935375|gb|ACGA01000017.1| GENE 67 69405 - 70397 834 330 aa, chain - ## HITS:1 COG:no KEGG:BVU_2845 NR:ns ## KEGG: BVU_2845 # Name: not_defined # Def: putative type I restriction-modification system methyltransferase subunit # Organism: B.vulgatus # Pathway: not_defined # 1 330 1 330 330 504 73.0 1e-141 MMNNKETLIKTLRGSVAQLNELSDMTEGIDVYDAAGYVDTEFLMEALSCVNTFMDASNMV ITKISSLLAPDAPVDERKNQADEGKKWNVEEILKHCTLEDSVLKLPKVQFNKKSYAEAKK WIEEAGGSWQGGKIQGFTFPFNPERVFSILKEGKRCDLQKDFQFFETPADIADWLVMLAG GIHETDTVLEPSAGRGALIKAIHRSCPSVTVECYELMPENREFLHTLDNIILLDEDFTKD SVGHYTKIIANPPFSGNQDIDHVRLMYERLEEGGTLAAITSQHWKFASEKKCVEFREWLE EVHGEVFEIGAGEFKESGTTVSTMAVVIKK >gi|225935375|gb|ACGA01000017.1| GENE 68 70354 - 71460 556 368 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 118 368 14 265 265 123 32.0 5e-28 MDDKRKQILVDYISYLYTTGRSYDSIGKYIKYVTDFLENSEEINRRGYYKYKHKNADAMV RHSFMCEAVCDLLSYLKIGYGRREKAVKPLEKLEVISEKNKKLLNDFIIWLTDNNDYSSH TIDVYYTSLRKYFEYANELNMDNCRRFIKSLEEEKLSPATIRLRITAIEKFSKWVKKPIE LKRPRMKRKLDVNNVPTEEEYNRLLEYLKTKLNKDYYFFIKVLGTTGARLSEFQQFTWED IATGEVVLKGKGNKYRRFFFQKQLQREVKDYIKETGKSGTLAVGRFGPLTQRGLSQHLKV WGKHCGIDSKKMHAHAFRHFFAKMFLKKTKDVIQLADLLGHGSVDTTRIYLQKSYDEQQR DFNKNVTW >gi|225935375|gb|ACGA01000017.1| GENE 69 71477 - 72085 415 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170630|ref|ZP_05757042.1| ## NR: gi|260170630|ref|ZP_05757042.1| hypothetical protein BacD2_02100 [Bacteroides sp. 
D2] # 1 202 1 202 202 390 100.0 1e-107 MSKMNLNELRDKAYKTACEHVFHDQELSNNHFLCLVISELMEAVEADRKGRRANVDRYNK KIANSRICQGLDSDIPKERGYEVAYSETIKGSIEEELADAVIRLLDLAGLRGISLELANG DIDDCIEDMAEACKGESFTESIYSISTLPVRYDGIFDFSTAVNDMILSIFGLAKHLDVDL LWHIEQKMKYNELREKMHGKKY >gi|225935375|gb|ACGA01000017.1| GENE 70 72097 - 72603 293 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170631|ref|ZP_05757043.1| ## NR: gi|260170631|ref|ZP_05757043.1| hypothetical protein BacD2_02105 [Bacteroides sp. D2] # 1 168 1 168 168 305 100.0 6e-82 MSKKDLIEQNITRVQEYVRELIEDAKCNNGVSETLESTSIIVGNSDDIYDFAILFASNSE CVYCEFINGKIEYIDCELDCEICQFEGRLIFQYINGSFHNPTSQIIELSKLLMKGELKDT KSIFCSMVLRLMDTEEYSNNYCKSLDLVLRLFPEIDGELLEKELDRYI >gi|225935375|gb|ACGA01000017.1| GENE 71 72628 - 72873 170 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170632|ref|ZP_05757044.1| ## NR: gi|260170632|ref|ZP_05757044.1| hypothetical protein BacD2_02110 [Bacteroides sp. D2] # 1 81 1 81 81 146 100.0 4e-34 MKRNEKIEKLERLGIFNQWKYNTERANETFNIECPDFSMTNEERMNNLLDVDCSFHQFLT ISFPFYNTPEGATFWENIAKK >gi|225935375|gb|ACGA01000017.1| GENE 72 72963 - 73205 227 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170633|ref|ZP_05757045.1| ## NR: gi|260170633|ref|ZP_05757045.1| hypothetical protein BacD2_02115 [Bacteroides sp. D2] # 1 80 1 80 80 140 100.0 2e-32 MKTTIISCVILFVFLLYVGHFSITIKPFTVQLPYWHRSLGLFLLILSFIVYNVGERAKGY IDGMKEGERIVLELLKKKTE >gi|225935375|gb|ACGA01000017.1| GENE 73 73207 - 73758 435 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170634|ref|ZP_05757046.1| ## NR: gi|260170634|ref|ZP_05757046.1| hypothetical protein BacD2_02120 [Bacteroides sp. 
D2] # 1 183 1 183 183 347 100.0 2e-94 MIPLCINGKDYYDREEALAAWFEEWLMKQDFEQDLIDRELELEYRKTHPDWNTPYVMYGV RKKHKCIQKNEIAVFYDLLPRQKRARTAETHWYKVLYKRKATPEEVESLKAGEYTRRYLV YSLFIEKKMTLDKALSLIVADDKLLGIADNTISEIVTAFETFFNRKFRIYKPEFTTQLNL FTD >gi|225935375|gb|ACGA01000017.1| GENE 74 73764 - 74477 259 237 aa, chain - ## HITS:1 COG:no KEGG:Dalk_4616 NR:ns ## KEGG: Dalk_4616 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 16 236 1 222 227 130 35.0 6e-29 MKYYASVSFGKDSLAMLFMLIDKGYQLDEVVFYDTGMEFQAIYNTRDAVLPILKKLGIKY TELHPEQPFLWTMFERPVKKRGTNIIHKKGYSWCGGTCRWGTSEKLRALKAHTKDGIDYV GIAADETHRFEKEKRPNRVLPLRDWGITEADALQYCYTKGFVWHEDGVRLYELLDRVSCW CCGNKNLKELKNMYLYLPWYWKKLKELQLNTDRPYRRNSGETIFDLEERFKREMQQK >gi|225935375|gb|ACGA01000017.1| GENE 75 74822 - 75286 193 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170636|ref|ZP_05757048.1| ## NR: gi|260170636|ref|ZP_05757048.1| hypothetical protein BacD2_02130 [Bacteroides sp. D2] # 1 154 1 154 154 328 100.0 6e-89 MREFSKETSLQRVMRASGRVPVQCSCSVCKQQCHTPCLGTPDDIERIIDAGYADRLALTN WAAGIFLGVINIAIPMIQPVAGKEYCAFFENGLCILHDKGLKPTEGRLSHHTVRKDNFNP AMSIAWNVAKEWLMPENEDVLSRVVNKFLNARKP >gi|225935375|gb|ACGA01000017.1| GENE 76 75315 - 75962 569 215 aa, chain - ## HITS:1 COG:no KEGG:HAPS_0636 NR:ns ## KEGG: HAPS_0636 # Name: not_defined # Def: hypothetical protein # Organism: H.parasuis # Pathway: not_defined # 1 213 1 213 213 186 50.0 5e-46 MNTYYKFAPNVFLAKCDEKHEKGETIEVTTKYGKENECIVFNLIYERDGFYYYSIVRADG FNVQEWAKQRAERRHEWATSAVQKSCEYYNKSNKDKDFLSLGEPIKVGHHSEKRHRKAID DAWNNMGKSVEFSDKAAEHERVAKYWEKRANTINLSMPESIDFYEHKLEQAKEYHEGLKS GKYRREHTYAMAYANKAVKEAKKNYDLAVKLWGDV >gi|225935375|gb|ACGA01000017.1| GENE 77 76012 - 76587 344 191 aa, chain - ## HITS:1 COG:no KEGG:BVU_2840 NR:ns ## KEGG: BVU_2840 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 190 1 189 190 314 85.0 1e-84 MIIRTVCGYDFFEVSSAMQKAIRRADTGVAGFFALELWASGYRDYVWKRLFTISAEDCYG 
IITKEIEALWQGHELVNKTATEPKGRIFVSKAVILLCECRKNRDADHLQNFIYDRKDIDI EKWINDVRRYPIPIPDYTFDVHTRKGKKHGRTKEEFFQEEYKALQPRVPGLFDDLVQPSQ PKLFNDETTAK >gi|225935375|gb|ACGA01000017.1| GENE 78 76699 - 77451 294 250 aa, chain - ## HITS:1 COG:no KEGG:BVU_2841 NR:ns ## KEGG: BVU_2841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 250 1 250 250 476 88.0 1e-133 MPCKIVIPSHKRHDRVFAKKLVNDPIICVAESQADLYQQFNPECEIVTHPDDVMGLIPKR NWMAKHFGELFMLDDDVHACKPIYVEKGEPSRIKDKDKITNIIQSLFEMASMMDVHLFGF TARISPVMYDESAFLSLSKMITGCSYGVIYNKNTWWNEEIRLKEDFWISCYMKYKERKVL TDLRYNFEQKNTFVNAGGLASIRNQEEERKSILFIKKNFGDSILLKSATTNGKDKTKQLV QYNISCKFKF >gi|225935375|gb|ACGA01000017.1| GENE 79 77420 - 78076 428 218 aa, chain - ## HITS:1 COG:no KEGG:BVU_2842 NR:ns ## KEGG: BVU_2842 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 217 1 216 217 370 85.0 1e-101 MTEIIQVCLLDFNKGQLTGLPKNPRFFRDYRFEAMKKSIQDSPEMLELRELIVFPYNDGR YIVVCGNLRLRACKELGYKELPCKILAPDTPIKKLREYATKDNVNFGENDLDVMENEWNK AELQDWGIEFAPEKKEDEFKERFDAITDDTAIYPLIPKYDEKHELFIITSSNEVDSNWLR ERLDMQHMKSYKTGKVSKSNVIDIKDVRHALQNSNTKS >gi|225935375|gb|ACGA01000017.1| GENE 80 78137 - 78430 152 97 aa, chain - ## HITS:1 COG:no KEGG:BVU_2843 NR:ns ## KEGG: BVU_2843 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 97 1 97 97 97 51.0 1e-19 MITLNRFAQRCLNIMRKRFKMNEHSSRKAFSIRIEAVWRKFDIASKYRSDNLPKYSEDEE LAAEMIIYLVAYLKRFGCEDIEQLIKDKIEFDDRKND >gi|225935375|gb|ACGA01000017.1| GENE 81 78540 - 78977 349 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885891|ref|ZP_02066894.1| ## NR: gi|160885891|ref|ZP_02066894.1| hypothetical protein BACOVA_03896 [Bacteroides ovatus ATCC 8483] # 1 145 1 145 145 260 100.0 2e-68 MTDGAIDRLKEMVNKPFLYQNEEVVILNYCDGTGDDGTEVEIYLNNGKVLVFSMFDLASK LNRFRPITNTVVVLANERLNKVSTVNPTILQDLRNLVLQQIKDVKEDPSKVSQAKQVFQG VNTVINLAKTELEYRKYLDTTDPSK >gi|225935375|gb|ACGA01000017.1| GENE 82 78974 - 79738 441 254 aa, chain - ## HITS:1 COG:no 
KEGG:PA14_58970 NR:ns ## KEGG: PA14_58970 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA14 # Pathway: not_defined # 32 253 6 234 235 107 33.0 4e-22 MKTWTGEQLAILDSEYPTADLKELARRLDKTLSAVKTKALIRKLRRSPRISFWNSERLDK LKKLYPNHTNEEIAQILGTTYSAVNGIAFKLRLFKSKEFKFQCASKSFFPKGHQPMNKGR KQTEYMSEEQLAKTKATRFKKGHIPKNHKPVGYERITRDGYIEVKTAEPNVFELKHRLVW IEHNGEIPPGYNIQFKDGNRQNVSIENLYMISRSEQLKKENSLYARYPEDVQYLIKLKGA LNRQINKATKKNES >gi|225935375|gb|ACGA01000017.1| GENE 83 79766 - 81484 1085 572 aa, chain - ## HITS:1 COG:SMc02801 KEGG:ns NR:ns ## COG: SMc02801 COG1475 # Protein_GI_number: 15967087 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Sinorhizobium meliloti # 4 197 39 198 296 103 34.0 6e-22 MEVQNIRIDLISPSPLNPRKTFDEAALEELASNIEKQGLLQPITVRVAKSEEMTNLETGD VTPLPYTYEIVCGERRFRAVSLLKAKEDEANVAKIKAHRKKSEKFQTISCIVREMTDDEA FEAMITENLQRKDVDPIEEAFAFAQLAEKGRTLEDIALKIGKSTRFVFDRIKLNSLIPEL KERVRNGDIPLSGAMILSKLDEDTQKEFHEEEEEQCTTAMIREFVSNSFMELGNAPWIKD DSDNWENTDIKSCSQCENNTCNHGCLFYEMNSKDARCINAACYEKKQIAYVTRKIQLEYE HLVKVGEPLSFGKTVIIARRPDTYWGEDRKVFYEKTLEAVKQLGFEIVDPDEIFRCKCWY SEDDERILKMLEDGEVYRCLSFFGHYSPEFNVSFYYVRKETASSTSAVADLKEIEREKIN AQLKRAKDIVKEKSAEEMRKWAQEKTYYQRTKEFSENEQLVFDVLVLSGCSSTYLEKLNL KKWNGESDFVNYVKNNQADRHQWYRAFIAECLSSNNVNFYSYLQKCQKILFAEQYPDDFK ALSKKLADSYDKKEKKLKERLKELNNDNTEEA >gi|225935375|gb|ACGA01000017.1| GENE 84 81586 - 81807 214 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170645|ref|ZP_05757057.1| ## NR: gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 [Bacteroides sp. 
D2] # 1 73 1 73 73 125 100.0 6e-28 MANKKQKVTIYWNTRHIKLEDIPEVKRRIRERFGIPNHTTVNGETDCYIREEDMELLRET EKRGFIQIRNKPA >gi|225935375|gb|ACGA01000017.1| GENE 85 81764 - 82411 486 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885887|ref|ZP_02066890.1| ## NR: gi|160885887|ref|ZP_02066890.1| hypothetical protein BACOVA_03892 [Bacteroides ovatus ATCC 8483] # 60 215 1 156 156 297 99.0 4e-79 MENELILRPQEENRLAVLRTSPKNYCKALCPKKVEDVFQSDEPSIGTIIRKFGEPQARAV LVILIADALEFFNVSNTMSATQVATTVDLIIEEYPYMKTDDFKLCFKNAMKMKYGENYNR IDGSIIMGWLREYNKERCAVADNQSWNTHKAKLSGETSFTSGLSYEEYRNELKLRVEQGD EEAAKALSLSNEIISYLNKREYGKQEAEGDNLLEH >gi|225935375|gb|ACGA01000017.1| GENE 86 82308 - 83120 392 270 aa, chain - ## HITS:1 COG:no KEGG:Coch_0881 NR:ns ## KEGG: Coch_0881 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 109 1 110 298 105 50.0 1e-21 MERNSFIFYKGWREAIKDLPDDVRLEIYESIIEYATTGNLRGLKPMANIAFNFIKIDIDR DTEKYMSIVERNKSNGSKGGRPKSENPKEPKEPTKPTGLFGNPKEPTKPDNDNEYDNDYV DDNDSHLKKKETSPKGESKKDELSLFPEEKIDWGGLMDYFNSTFKGKLPAIKSIDAKRKK AIKARVAQYGKQAVFDVFQLVLDSPFLLGQNDKNWRCTFDWIFLPTKFTNILEGNYNGKR TDTAATRRESVSSLTDLAEELLQSSMPKEG >gi|225935375|gb|ACGA01000017.1| GENE 87 83156 - 83620 370 154 aa, chain - ## HITS:1 COG:no KEGG:BDI_0857 NR:ns ## KEGG: BDI_0857 # Name: not_defined # Def: putative recombination protein # Organism: P.distasonis # Pathway: not_defined # 14 154 17 157 157 197 70.0 1e-49 MWRNYKKKEKKKPLFEVEGVKVKKKPDLVDKLDRIFSLFIRYRDTMPNGYFQCISCGKIK PFNKADCGHYINRQHMSTRFDEMNCNAQCSHCNRFMEGNIQDYRRRLVAKYGERNVLILE AKKNVTKQFSDFQLEKLITHYKEEAKKLKEAKGL >gi|225935375|gb|ACGA01000017.1| GENE 88 83626 - 84783 461 385 aa, chain - ## HITS:1 COG:VC1636_1 KEGG:ns NR:ns ## COG: VC1636_1 COG1061 # Protein_GI_number: 15641641 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Vibrio cholerae # 3 350 65 420 420 202 37.0 9e-52 MTYQLRDYQKSASDAAVSVFKSKEKKNYVIVLPTGAGKSLVIANIAARIDGPLIVFQPSK 
EILEQNFAKLQSYGIFDCGVYSASAGRKDINRITFAMIGSVMKHMSFFKHFKHVLIDECH LVNPEKGMYKEFFEDEQRKVIGLTATPYRLCSGRGGAMLKFITRTRPKVFTDVIYHCQVS ELLAKGFLASLKYYDITKLDLSRVRTNSTGADYDEKSLLQEFERVDIYKDIVGWTKRLLN PKSGIPRKGILIFTRFIREAEKLASEIPNCAIVSGSTPKEERARILKGFKDGRIKVVANV GVLTTGFDYPELDTIVLARPTKSLSLYYQMVGRVIRPCQGKEGWVVDLSGNFRRFGRVEE LRIEQPEKGKWCIMSRGRQLTNVVF >gi|225935375|gb|ACGA01000017.1| GENE 89 84780 - 85115 110 111 aa, chain - ## HITS:1 COG:no KEGG:BVU_2856 NR:ns ## KEGG: BVU_2856 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 111 1 91 91 112 64.0 5e-24 MGTSSKKTCLINFLKKNRKTMSRLQHKKGRKSNYVKRLVNNPDWEEAKRKVRIRDGHKCQ MCGKDFNLEIHHKTYRVNGKSIVGHELEHLDCLVTLCGDCHSKVHKYHIKL >gi|225935375|gb|ACGA01000017.1| GENE 90 85048 - 85716 733 222 aa, chain - ## HITS:1 COG:no KEGG:BVU_2857 NR:ns ## KEGG: BVU_2857 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 221 1 221 233 175 49.0 9e-43 MIANLRNYEPETIEFVVPDSIREKFPPVLFQGSTNVDELIKLVNEHFNATFPESEVTQRL LDEFEISEIREEYCIKQENEVPKRERELLEAIERAKKIKSDAQDRLASIKTEIKDLAAEV KKGTREYHLSSKNTIRFALDGYFLYYSWVNGEFKLVKAEKIPDWDKRSLWAQEDRNRKAM LDLFGIEYPEVERPIDDTEDYGDKFEEDLSDKLPEEEPEDDE >gi|225935375|gb|ACGA01000017.1| GENE 91 85735 - 86187 245 150 aa, chain - ## HITS:1 COG:no KEGG:BVU_2858 NR:ns ## KEGG: BVU_2858 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 150 1 150 150 239 75.0 2e-62 MEATLTKKDGKIQMDKSFEFMCSTLRNGEYTVTIKKKTQPRTLNQNALMWKWFQCIGACL REYTGEEYWSTAAGVQDIHDLYCKKFLVKQVHVNGKVETIVRGTSKLNTLEMHNFMESVK IDAATEFGITLPLPEDQHYLDFIHEYQNRY >gi|225935375|gb|ACGA01000017.1| GENE 92 86194 - 86643 196 149 aa, chain - ## HITS:1 COG:XFa0061 KEGG:ns NR:ns ## COG: XFa0061 COG0629 # Protein_GI_number: 10956771 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Xylella fastidiosa 9a5c # 2 111 3 111 136 101 43.0 6e-22 MSLNKLMLIGHVGKDPDIRILEAGSKVATFSFATTEKGYTLANGTQVPERTEWHNIVVWR 
GLADVVEKYVHKGDKLYLEGKIRTRSYDDSRGIKRYITELFVDNMEMLSVKPQQAPPPPP LPEHTNNQTRSAVNECPPPPPPTKDDLPF >gi|225935375|gb|ACGA01000017.1| GENE 93 86640 - 87554 535 304 aa, chain - ## HITS:1 COG:no KEGG:BVU_2860 NR:ns ## KEGG: BVU_2860 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 303 1 301 306 393 66.0 1e-108 MIELVKSSVVFNEENHTYMLGEKQLQGITGMISRQLFPDKYKDVPDFVLKRAAEKGSLIH AQCQFADVTGLPPESIEAENYIRMRVNAGYKALANEYTVSDNEYFASNIDCVWEKAGRIS LVDIKTTLHLDKEYLSWQLSIYAYFFELQNPLLKVDKLFGIWVRGDKHELVEIPRKPDKE VKKLMECEKKGEQYLSNLPVPTPDDDKLLIPVQLVNTIIGIEEELADLTKIQKDYKAKLK TAMRENGVKSWDAGRLRVSYTPASTSDNFDTKKFQADYPELYSKYIKTVPKADSIRVTIR EDKS >gi|225935375|gb|ACGA01000017.1| GENE 94 87560 - 88342 396 260 aa, chain - ## HITS:1 COG:CAC1936 KEGG:ns NR:ns ## COG: CAC1936 COG4712 # Protein_GI_number: 15895209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 20 187 3 169 229 201 61.0 9e-52 MTARKNTVSTVQNEEKKKNSIRPLLASEIECRVGTMKPDGSGCSLLLYKDARVDMRILDE VFGEMNWKRHHDVVNGNLFCTLSIWDNEKKEWVSKQDVGTESSTEKEKGQASDAFKRAGF NWGIGRELYTGPFIWIPLEKNEIYQSKTGSPALYTKFSVKEIGYNEQKEIILLVIVDNKN RVRFAYGNTKEKVYAPNVSASNASGKVYTGVDLDRAIKQMTGVKSREELERVWAEHPELH NNKEFRNITIDMQKTYPPRN >gi|225935375|gb|ACGA01000017.1| GENE 95 89380 - 89571 154 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885875|ref|ZP_02066878.1| ## NR: gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 98 100.0 2e-19 MKKIKVIQYAMMFIALWTTLYLIDSIEVSKKEFIAAFVLVTVVSVNYICFRYYEDRKQNK DSL >gi|225935375|gb|ACGA01000017.1| GENE 96 89568 - 89837 61 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170657|ref|ZP_05757069.1| ## NR: gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 [Bacteroides sp. 
D2] # 1 79 1 79 89 159 100.0 4e-38 MLVEMIRGEMAEILLDNILRLFSTETFGKDKSAYYVGGEKKLMNLIEAGKIESDKPTNVQ NGKWHCNAAQVLLHCRCAGRKVKSKKRKK >gi|225935375|gb|ACGA01000017.1| GENE 97 89843 - 90049 259 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885873|ref|ZP_02066876.1| ## NR: gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 [Bacteroides ovatus ATCC 8483] # 1 68 9 76 76 122 100.0 9e-27 MEKDIQRRNVIDVLRSMDVGAIEVFPIVQKPSVTNTLNARLYKEKAEGMAWKTKSDVKNM QFIVTRIA >gi|225935375|gb|ACGA01000017.1| GENE 98 90484 - 90720 164 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885871|ref|ZP_02066874.1| ## NR: gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 [Bacteroides ovatus ATCC 8483] # 1 78 1 78 78 105 100.0 1e-21 MDKRTELEIQRDKYEAVIEERDALISSLRGENEKLKRDLESERGFYREKVSQCDDLKKFI ESQRNLMDIVLKNNQSIL >gi|225935375|gb|ACGA01000017.1| GENE 99 90736 - 91029 285 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716168|ref|ZP_04546649.1| ## NR: gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 97 1 97 97 155 100.0 5e-37 MSYNLSQIMKSAHRNYKKGGKTFSECLKSAWSFAKLQESFSPEAVKSRTDKFLAERHEAM SKTAKATPSKEYNNLNIPASAYYNPNSTHYGAHYVGD >gi|225935375|gb|ACGA01000017.1| GENE 100 91183 - 91578 297 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170661|ref|ZP_05757073.1| ## NR: gi|260170661|ref|ZP_05757073.1| hypothetical protein BacD2_02255 [Bacteroides sp. 
D2] # 1 131 1 131 131 229 100.0 5e-59 MAETSVNEKIREIISYYKLSDRQFSIKIGVTQSVIGSMFQKNTEPSSKVIRLTLNAFTDI SADWLLRNKGPMLISDIKPDPNIERMERLVDTIATLQGTINEQMKTIQLFTEENQKLKGE LAMLKNERNIG >gi|225935375|gb|ACGA01000017.1| GENE 101 91595 - 92173 431 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885867|ref|ZP_02066870.1| ## NR: gi|160885867|ref|ZP_02066870.1| hypothetical protein BACOVA_03871 [Bacteroides ovatus ATCC 8483] # 1 192 1 192 192 341 98.0 2e-92 MKKILLILMLFTPIITWGQKSDLIKFLDACKTFEFEESKEIISKYSFNLGNMYDLLNYEE PTSILFDTDTLNIKGYKAIINCKIKNKAGQYIDKKMIVVMYLNKENSLWCVEMFREATDP NKEYKISKQDVDSGKFYTKKQYVYRNLAYWAISSGKLNEAIKYMNIAEEEAAKVNDTKFN IDSQKEVLRKII >gi|225935375|gb|ACGA01000017.1| GENE 102 92181 - 92648 223 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885866|ref|ZP_02066869.1| ## NR: gi|160885866|ref|ZP_02066869.1| hypothetical protein BACOVA_03870 [Bacteroides ovatus ATCC 8483] # 1 155 1 155 155 285 100.0 5e-76 MCQIRTAPPKDEREYPLVITAEEKDKVLNYILVVANGKRTAKLNYKDIPDLRISKEQYEI VLEEFKNRRFIDYKGYGIEYLTLNFEIFNFAEKGGFTVERDLYILSFDTFQMQLERLEKE LSPDTAAKVDDVVGKAKNITELLIGLSALAEKMNL >gi|225935375|gb|ACGA01000017.1| GENE 103 92645 - 92983 233 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160885865|ref|ZP_02066868.1| ## NR: gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 [Bacteroides ovatus ATCC 8483] # 1 112 1 112 112 185 100.0 6e-46 MKSLLKNVLRRISKKQSSKEDNATAFYPQCCAKVDDSARMRIKMSYDQNVKETISSLKTL ANDMSSGFVTFKKFQTRRYQYNPDADATLYASRLLRAASILEFLLTDPDNKS >gi|225935375|gb|ACGA01000017.1| GENE 104 93302 - 93514 232 70 aa, chain + ## HITS:1 COG:no KEGG:BT_1232 NR:ns ## KEGG: BT_1232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 68 70 53.0 1e-11 MEYSVEELKSALIEKCESEGILYATVAMDRRTKEMILPDTLQGALKHPEFFVCTCKKVKD QYVVEEITKV >gi|225935375|gb|ACGA01000017.1| GENE 105 93537 - 94487 1091 316 aa, chain - ## HITS:1 COG:MT0820 KEGG:ns NR:ns ## COG: MT0820 COG2837 # Protein_GI_number: 15840211 # Func_class: P Inorganic ion transport 
and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Mycobacterium tuberculosis CDC1551 # 13 311 8 306 335 296 50.0 3e-80 MNPYQHSFGGNIPQDVAGKQGENVIFIVYTLKDSPETLDKVKDVCANFSALIRSMRNRFP DMMFSCTMGFGADAWSRLFPEQGKPKELKTFEEIKGEKHTAVSTPGDLLFHIRAKQMGLC FEFASIIDEKLQGVVEPVDETHGFRYMDGKAIIGFVDGTENPAVDENPYHFAVVGEEDAD FAGGSYVFVQKYIHDMVAWNSLPVEEQEKVIGRRKFNDVELSDEEKPQNAHNAVTNIGDD LKIVRANMPFANTSKGEYGTYFIGYASTFTTTRQMLESMFIGNPVGNTDRLLDFSTAVTG TLFFAPSYDLLGELGE >gi|225935375|gb|ACGA01000017.1| GENE 106 94534 - 95265 189 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 [Hydrogenivirga sp. 128-5-R1-1] # 31 228 34 212 228 77 31 4e-13 MKLSVFPSSIETSRALILRLVEIMNEEPDRVFNIAVSGGNTPALMFDLWANEYMEITPWD RMRIYWVDERCVPPDDSDSNYGMMRNLLLGITPILYENVFRIRGEAKPAKEAVRYSELVR QQVPFKRGWPEFDIILLGAGDDGHTSSIFPGQEDLLTSNSIYVVSAHPRNGQKRIAMTGY PIQNARYVIFLITGKNKADVVEEICNSGDTGPAAYVAHHAQNVELFMDKGAAAYIEDPNK KMS >gi|225935375|gb|ACGA01000017.1| GENE 107 95262 - 96743 1271 493 aa, chain - ## HITS:1 COG:VCA0896 KEGG:ns NR:ns ## COG: VCA0896 COG0364 # Protein_GI_number: 15601650 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Vibrio cholerae # 1 493 10 501 501 561 54.0 1e-160 MIIFGASGDLTKRKLMPALYSLYREKRLTGEFSILGIGRTVYSDDNYRSYILEELQQFVK SEEQDTALMASFVSHLYYLPMDPAKEEGYPQLRQRLVELTNEVDPDNLLFYLATPPSLYG VVPLYLKAAGLNTPHSRIIVEKPFGYDLESALELNKTYASVFNEHQIYRIDHFLGKETAQ NVLAFRFANGIFEPLWNRNYIDYVEITAVENLGIEQRGGFYETAGALRDMVQNHLIQLVA LTAMEPPAVFNADNFRNEVVKVYESLTPLNDVDLNEHIVRGQYTASGSKKGYREEKGVAP DSRTETYIAMKLGISNWRWSGVPFYIRTGKQMPTKVTEIVVHFRETPHQMFHCAGGNCPR ANKLILRLQPNEGIVLKIGMKVPGAGFEVRQVTMDFSYAQLGGVPSGDAYARLIDDCIQG DPTLFTRSDAVEASWKFFDPVLRYWKDNPDAPLYGYPAGTWGPLESEAMMHEHGADWTNP CKNLTNTDQYCEL >gi|225935375|gb|ACGA01000017.1| GENE 108 96793 - 98319 1461 508 aa, chain - ## HITS:1 COG:TP0331 KEGG:ns NR:ns ## COG: TP0331 COG0362 # Protein_GI_number: 15639322 # Func_class: G Carbohydrate transport and metabolism # 
Function: 6-phosphogluconate dehydrogenase # Organism: Treponema pallidum # 25 508 4 488 488 580 57.0 1e-165 MCISNVTKIKNIQNIQVMANQNKTDIGLIGLAVMGENLALNMESKGWHVSVYNRTVPGVE EGVVERFLNGRAKGKNIEGFTDIKAFVDSIAIPRKIMMMVRAGSPVDELMDQLFPLLSPG DILIDGGNSNYEDTNRRVQLAESKGFLFVGSGVSGGEEGALNGASIMPGGSEKAWPEVKP ILQSIAAKAPDGTPCCQWVGPAGSGHFVKMIHNGIEYGDMQLIAEAYWVMKKLLDLTNEE MADVFARWNEGKLRSYLIEITANILRHKDKSGGYLIDKILDAAGQKGTGKWSVINAMELG MPLGLIATAVFERSLSSQKDLRHLASKQFQCQHTQPIYNKAELVKNIFSALYASKLVSYA QGFAVLQRASDAFGWHLDLASIARMWRGGCIIRSIFLNDIAAAFEATDKPKHLLLAPYFK EEMKTLLPGWKSLVAESMKEELPVPAFSSALNYFYSLTSADLPANLVQAQRDYFGAHTFE RKDELRGQFFHENWTGHGGDTKSGTYNV >gi|225935375|gb|ACGA01000017.1| GENE 109 98378 - 99547 1030 389 aa, chain + ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 8 381 5 379 412 324 52.0 1e-88 MKKIKIGLLARIVIAIILGIAIGTVFPAPLVRIFVTFNGIFSEFLNFSIPLIIVGLVTVA IADIGKGAGKMLLVTALIAYFATLFSGFLSYFTGVTVFPSLIEPGAPLEEVSEAQGILPY FSVSIPPLMNVMTALVLAFTLGLGLASLNSDALKNVARDFQEIIVRMISAVILPLLPLYI FGIFLNMTHSGQVYSILMVFIKIIGVIFVLHIFLLVFQYSIAALFVHRNPFKLLSKMLPA YFTALGTQSSAATIPVTLEQTKKNGVSAEVAGFVIPLCATIHLSGSTLKIVACALALMMM QGMPFDFPLFAGFIFMLGITMVAAPGVPGGAIMASLGILQSMLGFDESAQALMIALYIAM DSFGTACNVTGDGAIALIIDKIMGKNRAE >gi|225935375|gb|ACGA01000017.1| GENE 110 99641 - 100711 1305 356 aa, chain + ## HITS:1 COG:BMEI1413 KEGG:ns NR:ns ## COG: BMEI1413 COG1089 # Protein_GI_number: 17987696 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Brucella melitensis # 1 350 1 346 362 483 67.0 1e-136 MKKALISGITGQDGSYLAEFLLQKGYEVHGILRRSSSFNTGRIEHLYFDEWVRDMKQKRT INLHYGDMTDSSSLIRIIQQVQPDEIYNLAAQSHVKVSFDVPEYTAEADAIGTLRMLEAV RILGLEKKTRIYQASTSELFGKVQEVPQKETTPFYPRSPYGVAKQYGFWITKNYRESYGM FAVNGILFNHESERRGETFVTRKISLAAARIAQGEQDKLYLGNLDSLRDWGYAKDYIECM WLILQHDVPEDFVIATGEMHTVREFATLAFKEAGIELRWEGEGVNEKGIDVATGKSLVEV 
DPKYFRPSEVEQLLGDPTKAKTLLGWDPRKTSFEELVSIMVRHDMEKVRRMIATKH >gi|225935375|gb|ACGA01000017.1| GENE 111 100731 - 101801 1026 356 aa, chain + ## HITS:1 COG:Cj1428c KEGG:ns NR:ns ## COG: Cj1428c COG0451 # Protein_GI_number: 15792746 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Campylobacter jejuni # 1 354 1 342 346 348 47.0 1e-95 MEKNAKIYVAGHRGLVGSAIWKNLQDKGYTNLVGRTHKELDLLDGTAVRNFFDEEQPEYV FLAAAFVGGIMANSIYRADFIYKNLQIQQNIIGESFRYHVKKLLFLGSTCIYPRDAEQPM KEDVLLTSPLEYTNEPYAIAKIAGLKMCESFNLQYGTNYIAVMPTNLYGPNDNFDLERSH VLPAMIRKIHLAHCLKKGNWEAVRKDMNLRPVEGVNGDSPKEEILAILQKYGISETEVTL WGTGTPLREFLWSEEMADASVFVMEHVDFKDTYKEGSKDIRNCHINIGTGKEITIRQLAE RIVETVGYQGKLTFDSSKPDGTMRKLTDPSKLHSLGWHHKIEIEEGVQRMYEWYLK >gi|225935375|gb|ACGA01000017.1| GENE 112 102054 - 103712 1519 552 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 25 546 14 499 600 219 30.0 8e-57 MEQSFIAYIENSIKNNWDLDALTDYKGATLQYKDVARKIEKLHIIFEESGIRKGDKIAVC GRNSSHWGVTFLATLTYGAVVVPILHEFKADNIHNIVNHSEAKLLLVGDMVWENLNESAM PLLEGILMMNDFTLLVSRSERLTYAREHLNEMFGKKYPKNFRTEHIAYHKDEPEELAVIN YTSGTTSYSKGVMLPYRSLWSNTKFAFEVLDLQAGDKIVSMLPMAHMYGLAFEFLYEFSV GCHIYFLTRMPSPKIIFQAFEEVKPNLIVAVPLIIEKIIKKSVLPKLETPAMKILLKVPI INDKIKATVREEMIKAFGGNFKAVIVGGAAFNQEVEQFLKMIDFPYTVGYGMTECGPIIC YEDWRRFKPGSCGKAVPRMDVKVLSSDPENIVGEIVCKGPNVMLGYYKNEEATQEVIDKD GWLHTGDLALMDAEGNVTIKGRSKNLLLSASGQNIYPEEIEDKLNNLPYVAESIIVQQNE KLVGLVYPDFDDAFAHGLKNEDIEQIMEENRVALNDTLPAYSQISKMKIYPEEFEKTPKK SIKRYLYQEAKG >gi|225935375|gb|ACGA01000017.1| GENE 113 103762 - 104910 1060 382 aa, chain + ## HITS:1 COG:no KEGG:BT_1227 NR:ns ## KEGG: BT_1227 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 567 83.0 1e-160 
MKIKKTIFTAAILMAAVCLPAQNKSAGINISIWKDICTQPHDSTQTTYVNIGLLSTMNRL NGVGINALGSVVHGDMNGVQITGLANLAGGTMRGVQLAGISNISGNNTVGLSAAGLVNIT GDRTQGVIISGLTSIGGDNTSGLMISGFMNVTGNMASGLHFSGAANITGQSFGGLMASGL LNVVGEHMNGLQMAGIANITASKLNGVQMALCNYATQARGLQIGLVNYYKEDMKGFQLGL VNANPDTRVQMMVYGGNATPVNIGVRFKNQLFYTILGIGSMYQGLNDKFSASASYRAGLS FTLYKGLSISGDLGYQHIEAFNNKDEVIPKRLYALQARANLEYQFTRKFGIFATGGYGLT RFYNKSSNYDKGAIIEAGIVLF >gi|225935375|gb|ACGA01000017.1| GENE 114 105107 - 107101 2076 664 aa, chain - ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 664 1 663 665 798 60.0 0 MTAQSNITPESIVSDLRYLQLLSRSFPTIADASTEIINLEAILNLPKGTEHFLTDIHGEY EAFQHVLKNASGAVKRKVNEIFGNTLREAEKKEICTLIYYPEEKLQLVKAREKDLDDWYL ITLNQLVKVCQNVSSKYTRSKVRKSLPAEFSYIIQELLHESSIEPNKHAYINVIISTIIT TKRADDFIIAMCNLIQRLTIDSLHIVGDIYDRGPGAHIIMDTLCNYHNFDIQWGNHDILW MGAASGNDSCIANVIRMSMRYGNLGTLEDGYGINLLPLATFAMDTYADDPCTIFMPKMNF ADAHYNEKTLRLITQMHKAITIIQFKLEAEIIDRRPEFGMANRKLLEKIDFERGVFVYEG KEYALRDTNFPTVDPADPYRLTEEERELVEKIHYSFMNSEKLKKHMRCLFTYGGMYLVSN SNLLYHASVPLNEDGSFKHVKIRGKEYWGHKLLDKADQLIRTAYFDEEGEEDKEFAMDYI WYMWCGPEAPLFDKDKMATFERYFVEDKELHKEKKGYYYTLRNREDICDQILTEFGASGP HSHIINGHVPVKTIQGEQPMKANGKLFVIDGGFSKAYQPETGIAGYTLVYHSHGMQLVQH EPFQSRQKAIEEGLDIKSTNFVLEFNSQRMMVKDTDKGKELVTQIQDLKKLLVAYRTGLI KEKV >gi|225935375|gb|ACGA01000017.1| GENE 115 107135 - 108799 1635 554 aa, chain - ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 17 552 16 557 561 334 37.0 3e-91 MEWLYSLFLEHSALQAVVVLSLISAIGLGLGRVHFWGVSLGVTFVFFAGILAGHLGLSVD PQMLNYAESFGLVIFVYSLGLQVGPGFFSSFRKGGVTLNMLALGVVLLGTLLTVVASYAT GVSLPDMVGILCGATTNTPALGAAQQTLKQMGMDSSTPALGCAVAYPMGVVGVILAVLLI RKVLVRKEDLEIKEKDDANKTYIAAFQVHNPAIFNKSIKDIARMSYPKFVISRLWRDGHV 
SIPTSDKILKEGDRLLVVTAEKDALALTVLFGEQENTDWNKEDIDWNAIDSELISQRIVV TRPELNGKKLGSLRLRNHYGINISRVYRSGVQLLATPGLVLQLGDRLTVVGEAAAIQNVE KVLGNAVKSLKEPNLVVVFIGIVLGLALGAIPFSFPGVSTPVKLGLAGGPIIVGILLGTF GPRIHMITYTTRSANLMLRALGLSMYLACLGLDAGAHFFDTVFRPEGLLWIALGAGLTII PTVLVGVVAFKMMKIDFGTVSGMLCGSMANPMALNYVNDTIPGDNPSVAYATVYPLCMFL RVIIAQVLLMFLLN >gi|225935375|gb|ACGA01000017.1| GENE 116 109062 - 109505 496 147 aa, chain - ## HITS:1 COG:no KEGG:BF1816 NR:ns ## KEGG: BF1816 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 146 1 146 148 178 63.0 4e-44 MKTNTFKYFGLALMAILMVSFTSCEVEIDSFYDDDNNGAGYYNRSADLCSRTWVSFYRDM DGNYCRQELDFFLDRTGIDYIRVEYPNGTVEQYEYNFRWSWENYAQTSIRMSYGPNDVSY LDDVYIGGNRLSGYLDGRNNFVEFQGK >gi|225935375|gb|ACGA01000017.1| GENE 117 109575 - 111293 1247 572 aa, chain + ## HITS:1 COG:VCA0763 KEGG:ns NR:ns ## COG: VCA0763 COG0714 # Protein_GI_number: 15601518 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Vibrio cholerae # 38 384 1 350 552 299 44.0 1e-80 MNFPYETENSYLSGGNCLNLGVNPPNEVYFKEFLYLCLLLTTHEIICRMKSIKSHITQLL KSLNEGVFEKEHTIALSLLSAMAGESIFLLGPPGVAKSLVARRLKLAFKDADAFEYLMSR FSTPDEIFGPVSISKLKDEDTYERITKGYLPTASIVFLDEIWKAGPAIQNSLLTVINEKI YRNGQFTVRVPLKALIAASNELPAKGEGLEALYDRFLIRQFVGCIEQEYAFDQMISSTRE VEPEIPAKLQVDDELYNQIQAESEKVGIHYTIFELIHNIKREIEQYNTGRDENTPPIYIS DRRWKKIVGLLRTSAYLNESPGIHFSDCLLMSACLWDEVSQLPIIENIVEQSIARGINTY LLGEKRLEQKLDTLKENMKSEHSLRELSDPGIQVVDTFYHRIEGYHIAGNLLIFASDYQS LRKDSNRLFYIQQDKFRPVNKILKAYDFVKNRNIAQKNIYSLRKGKRSVFVNNQEYPLLC YDNCEPLPTQQDGSTPFEFTLQEVIDLLHQMEVEYKTISERETAYTKEHLFLSSSQKSKI KRILGETAHIIENYRNELRIIAHAHEQENREY >gi|225935375|gb|ACGA01000017.1| GENE 118 111262 - 112695 811 477 aa, chain + ## HITS:1 COG:VCA0762 KEGG:ns NR:ns ## COG: VCA0762 COG2425 # Protein_GI_number: 15601517 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 194 459 202 469 481 133 33.0 7e-31 
MPMNKRTESIRLKHLQDIYYEKLQGIAYDVYDEQLHNLIIRPEELDADIHLYFRHTQPSL QDFYSRYASQWEYFHEMNEASDAKFLQFLNNSAYPFSMKYHLVDLNVKYYLQRFNAISPR SKEWKALRTLFFDKWHTLLSNNEFNYQMEHIEQLCEDFYRIQLALAKNLPVRGGSRLVWL LRNHKQIAEQILEYEETIKRNPVIRELVEILGKKHQSSRKRFKMTAGIHREQIISHATRS DIAGICEGNDLNSLLPLEYCYLAEKNLQPIFFERFIEKRLQVIDYQSHEKQTINDKKTVG NEVSEEAEGPFIVCLDTSGSMAGERERIAKSTLLAIAELTEVQHRKCYVILFSDDIECIE ITDLGSSFDRLVDFLSQSFHGGTDMEPVITHALRKISEEGYMEADIITVSDFEMRPVDKL LSQSIEHAKAKQTKMYAISLGGKSAESSYLKLCDKYWEYSVQNAENLNKNIIEESNI >gi|225935375|gb|ACGA01000017.1| GENE 119 113550 - 114746 507 398 aa, chain + ## HITS:1 COG:no KEGG:BF2484 NR:ns ## KEGG: BF2484 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 397 1 398 400 421 55.0 1e-116 MATVCYQLDTRREKKDGTYPIKLYIRHKSRILISTDFCATPRTWTGTEYSKEAKSYKAKN VAIRNLINRVEILIVMLDNNQKLKSMSDAALKDYISKSIKNESTCKTFVTYLDEFIETKT KGNTIELYKATKNKILAYDPVCTFETITKKWLESFNKWLKDTGMKTNSISIHLRNIRAVF NYAIDNEETELYPFRKFAIEKEETRKRSLRPEQLATLRDFNGEEYQKEYQDIFMLMFYLI GINAIDLFNLKQIVDGRIEYKREKTGKLYSIKVEPEAMEIINRYKGNRFLLNTLETNDYN YRKYMAAMNRGLQKLGNFERKGLGGKKIRDILFPGITSYWARHTWATIAHKIEISKDVIS LALGHEFGCKTTGIYIDYDLEQIDKANRKVIDYINSLK >gi|225935375|gb|ACGA01000017.1| GENE 120 114829 - 114996 238 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKDEERKELEQEYENLKLLASFHEAYGVPENEKEREALINDILDRMNEIREKLKE >gi|225935375|gb|ACGA01000017.1| GENE 121 115038 - 115487 288 149 aa, chain + ## HITS:1 COG:no KEGG:BF2302 NR:ns ## KEGG: BF2302 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 18 148 14 139 140 77 37.0 2e-13 MIDINACLPTPEMKADFERFKTLSTQEERDAFKKEMQAKYNALPEDQREAYKKASESGLK ATVDACNDFIERAEEAILRDKLGELPEAISFSYIAKKYFGKSRNWLYQRINGNIVNGKKA RFTDNELQTFLNALKDVSEMIHQTSLKLG >gi|225935375|gb|ACGA01000017.1| GENE 122 115685 - 116050 61 121 aa, chain - ## HITS:1 COG:no KEGG:BT_4444 NR:ns ## KEGG: BT_4444 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 13 131 132 
126 63.0 2e-28 MLAICFVSCRTQYIPVESVRTEYKTRDSIRYDSIYQRDSIYTLVKGDTVYQYRYKYLYRY LTTNRTDTILKNDSIRVPYPVEKKLNRWQSIKMELGGWAFGIIILFILIIIGRIIFKSKN N >gi|225935375|gb|ACGA01000017.1| GENE 123 116080 - 116574 327 164 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 151 1 100 116 93 45.0 2e-19 MNKIDSIIIHCSATRAGQDLTAKDIDRIHRARGFNQIGYNYVIRIDGTVEKGRSLAVDGA HCNTKGFSESSYNKHSVGICYIGGLDANGKPADTRTIAQRAALRELVAKLCKEYEIIEVL GHRDTSPDLDGSGEVEPKEYIKACPCFDVRSEFPNFLRNTVVRP >gi|225935375|gb|ACGA01000017.1| GENE 124 116567 - 116977 299 136 aa, chain - ## HITS:1 COG:no KEGG:BVU_2812 NR:ns ## KEGG: BVU_2812 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 133 3 128 128 127 54.0 2e-28 MDEWLKIIGALGGLEAIRFTVTFLANRKTNARKEKATADSMELQNLLSIIDNLNKQIERY DERLKQRDEKVDTIYREWRTAQAEAQNWMRKYYELELALKDAEHNRCDRPDSECSRRTPP RRPITINNQNKEESNE >gi|225935375|gb|ACGA01000017.1| GENE 125 117017 - 117274 303 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170686|ref|ZP_05757098.1| ## NR: gi|260170686|ref|ZP_05757098.1| hypothetical protein BacD2_02380 [Bacteroides sp. 
D2] # 1 85 1 85 85 138 100.0 1e-31 MKIDFRKIQVKDIEGNNSTVDIAKMLGNAIYQKTADLGELELAQQIYKNGEVEVSPEQAE SIKKYVSTGFVAFVQVAVNEALSVE >gi|225935375|gb|ACGA01000017.1| GENE 126 117279 - 120155 1770 958 aa, chain - ## HITS:1 COG:no KEGG:BF2447 NR:ns ## KEGG: BF2447 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 39 428 41 419 1324 183 32.0 2e-44 MIEVENKKVPHSFRNKYLRNSGSVSISTTTPTPINGGGANLDVLKIDDGRTVSDENVFSS LRSLFEIKSRIIALTDNNTAPTDDNTFSSLRIRQELYAAIDALKDSYLSKTAPDETQFLI KLLGGLIVDNGLDVTKGISTDTLTATTVTTQILNVLDKLIAKSATFSGDISSNDYAEGLI GWLIGKDGHIDAKSLRLRDFLEVPELRYNRVSIVSGEEWNAPGGGIIENIDESNRIIYLK LEPGEIAEIEVDDICKGIFNDSTGFQTAYFRITEKIGDSTFKYALRSGTTAHPCKAMHFV SYGNFTSKDRQRSSYSTQSYVRYLTGVNSWEITKEMIAMQLGDLSNLKLFGIEMTGHSAY LRNVYMTGTIKQLSNDGITEVPVPAFKGVWTPGTYWYYDEVVCNGSTWICIADKTIQEPT DNSTDWLKYVSKGETGDKGDKGDKGDKGDTGATGAKGDKGDTGPTGSQGIPGTSQYFHVK YSANANGNPMSDTPNTYIGTAVTTSATAPTGYASYKWVQLKGSQGPKGEQGIAGPTGANG QTSYLHIKYSDNGTSFTANNGETPGAWIGQYVDFTAADSTTFSKYIWTKVKGDTGDKGDK GDKGDKGDQGGKGDTGATGLPGALIRPRGEWKANTNYVNNTQYRDTIIYNGNTYSCRVDH NSGSSFDVTKWTLFNDFVNVATQLLVAQNATIDILGTSGLFIGNQAKTQGWLMTGGSIKH NVTGLELTADGKLSLPATGAILVGNKTFITNGKIVTDFIDVKTLEVEKLNGATGTFKSLQ GTKIVDNKEVVMCEIGFSTSEGKMYFEGDMQHQGTFKEPNGTNRSYRFLTADLWCRGQFG HQQMTSLSFNSASTSDFFAHIYNYGTDTTYHKYAQSGQPIDCIFLEGSGNYVIYICNSPR RKMITIVNASGYPKRVLTTWQSGGTYTLEPYRFAIFVTAETYASVNNTSSTVNLHVMQ >gi|225935375|gb|ACGA01000017.1| GENE 127 120152 - 122542 796 796 aa, chain - ## HITS:1 COG:no KEGG:BF2448 NR:ns ## KEGG: BF2448 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 391 7 380 693 88 23.0 1e-15 MKLRYYSEFKSRKDKTYRIEIHTVFATYSEELTLTDSPFTVEYESDTLYKPLKMSNSVTS ILTDRILSDLYTAEGQNIEVRLYNKTDDVLEWFGYMSPNLYSSDYITPLNIVEIQAIDTI SVLENKKYSYINSSEVYFKSFKDVIMHILDIADPGKILNKLYFQKTNRISKDVSTSLIED IYIHERNFFDEANEPMNSRDVLEEISKYIGMTFIQYQDAYYMIDYDFIKNDELHFFVYDR ISDTCESITIPSALLNVRNIGVSESAGSISLGDVYNKVSVVANMNQITNLCPELLDDDKD IVNQNSDPNKYYISGRDIDGKNYTLLNSFFKSNSNWGYLIPSFSFLDIPAEGVEVTIDNV 
NDIYSGVVWQKYSDYTTEDGEPSSLSWKTCVSFLQAYNIISASRKTLLTLKNGEYSLFKG GYFIINIAYRMSGSFLPNDIIKTSDEVYSNTKYGAGFDNTMVPCKLYIDDYYYDGEVWRN QKYYTDRVNRGYYKNTHNLTYKGATWYRYKDEFGDWRFVSKGEYDSVSGEKASGGFEDRN KVYAYRENGEDIFVEKWYHDECTLKDGFYLVHINKEGDKVFDDEKRLTNTVSYRFNLYDS TDGVAIKLPDDKILCGKIRFELSTPNHLGKYPMYRTDGGCYPCTAFHISDFTFKYTNNKV TYDIFNNAVDDSDVVYSNVINDNNVTEMDDIELLINSNAKNISSYSNCATKSGDKFDYLK TVYSPLHDKNVLPEQILIDKFYTHYKAPKFRYSNNLNRGFSILSRIYENSLKREMVVDQM SIDYANESCNVSLIET >gi|225935375|gb|ACGA01000017.1| GENE 128 122539 - 124575 1037 678 aa, chain - ## HITS:1 COG:no KEGG:BF2449 NR:ns ## KEGG: BF2449 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 644 1 692 737 290 35.0 2e-76 MSDIVTRLLLKTNDFDANLEKSKKGVNNFQGGISNMAKTAGAGIVKFAGTIGVAMGAYEG LMKVINSSQTTGDAYVKMMDQAKASVDSFFSSIALGNFDGFLSNLQNVIDKAGDLSVALD NLGTKTLFNKAEVDDLNTKYQLELNKAKARNISDEERNKHLAKAKEYLIQMATLQQSLAT ANTDTSYTALQAEIAKMGFNKNVSREMWDYLIKDSKRSDIEKTSTAHKDRLFEYNKRIES SKVRDPYTEQVIDSDETKKIRAELKRYKESRFGLYGELNRLFIEAADDEESAIAQALKMR ATANSLKVAVSNKELEVANADAKINGSYNKRNGGGSGSDTKIEPEKDSIAWYDAEISKLN KKLIATTDEQAKTTIKTTINEFEAKKIKLQVETSGNSIEAINIQLADLNKKLISVTDMQA RSTIQTTINEFEQKKINLKFVVDQEAFKIKNGGMKDGALSVPIAPTYDKVPTHGKGGKNF KLPKFESPIKKKDIDINEEYTKSLYAVGSIMSSLSGITNESTAAYLQWGAGVVSSIAQAI PAIRDLITAKQTEAVINGVTSATETPVVGWLLAGAAVASVIAAMASIPKFATGGIVPGTS FTGDKVPALLNSGEMILNGSQQSNLFQMLNSGLYGSLSQKIAPSAENGNQPANVTFRIHG RDLEGVLSNHYNQKSKVR >gi|225935375|gb|ACGA01000017.1| GENE 129 124568 - 124825 181 85 aa, chain - ## HITS:1 COG:no KEGG:BF2450 NR:ns ## KEGG: BF2450 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 80 1 82 84 84 56.0 2e-15 MDFIEVESFIDGLNRRNREAWEQTRLLGFIIAQSNSTKTLKQTDILRFPWDEEEKKDTSV TDEEMQRLRAKAKEVESQLNTNKDV >gi|225935375|gb|ACGA01000017.1| GENE 130 124894 - 125217 266 107 aa, chain - ## HITS:1 COG:no KEGG:BF2451 NR:ns ## KEGG: BF2451 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 91 1 91 113 96 58.0 3e-19 
MKTISLNGKDLSLKYTLRAFFVFESISDYPFQFGKLLDEYILFYSFLIASNKDSFNMEFD EFIELCENDLTLFEQFKEFILDEIKLRSQSAGNDVKKKKVTTRKRKP >gi|225935375|gb|ACGA01000017.1| GENE 131 125259 - 125714 519 151 aa, chain - ## HITS:1 COG:no KEGG:BF2452 NR:ns ## KEGG: BF2452 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 145 1 145 160 177 67.0 9e-44 MSKTKAALGKDLMLFTEDKAIALATSCKLGLSAETIDTQSKDSGIWTEKDIKKLSWNASS ENVFSADVDANSYDKLFALMIARKPITLKFGIVSDPNANEMPEAGWTLAAGAYTGKAVIT SLEANAPDGDKATFSVSFEGTGPLTKEADSK >gi|225935375|gb|ACGA01000017.1| GENE 132 125736 - 126119 311 127 aa, chain - ## HITS:1 COG:no KEGG:BF2453 NR:ns ## KEGG: BF2453 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 126 18 146 149 61 30.0 8e-09 MSLSIGAHVYKKLSDSTELAKLVTDKIYAISTKTETSFPFVIYKRNSLTPEYAKYSTTGD TVSVEIVVASDNYLNSVTIAEEVRKSLENKRGSYDNFDVIDSKLISADEDFIEDTFIQRL VFSFKTE >gi|225935375|gb|ACGA01000017.1| GENE 133 126116 - 126619 408 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170694|ref|ZP_05757106.1| ## NR: gi|260170694|ref|ZP_05757106.1| hypothetical protein BacD2_02420 [Bacteroides sp. D2] # 1 167 1 167 167 244 100.0 2e-63 MSNDNYTGRNLYRVEVDATRVNELLKRLNDKEAKKAISSALRKSILIIRKQAQENLISAV TDAEFSSSKNGVSFKPLKNEINVAVYRNASGARVDLIDRRKKGSRAYMLKWFESGTKERA TKKGANRGIINASHFFSNAVKSKQKEAENSLEKNIIDSITKVANKKK >gi|225935375|gb|ACGA01000017.1| GENE 134 126612 - 126929 209 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170695|ref|ZP_05757107.1| ## NR: gi|260170695|ref|ZP_05757107.1| hypothetical protein BacD2_02425 [Bacteroides sp. D2] # 1 105 1 105 105 187 100.0 1e-46 MQAGLLNEMIGFYRSESKRDNLGGTSESWVKVFDKRAYIRFKSGARKEANGEIYNTTVNT IMIRICKEINAKMRIEYDGQKYKILSINHDRKQQATVIEAEVINE >gi|225935375|gb|ACGA01000017.1| GENE 135 126929 - 127225 197 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170696|ref|ZP_05757108.1| ## NR: gi|260170696|ref|ZP_05757108.1| hypothetical protein BacD2_02430 [Bacteroides sp. 
D2] # 1 98 1 98 98 182 100.0 8e-45 MAQYVTLEELKQHLNVDFDTDDTYITGLIEPVQLLIESYLNNPLDTYVKDTKIDRRIWHA IRILIANYYANRESVTFATPQVIPGHIELLLQPLKRYT >gi|225935375|gb|ACGA01000017.1| GENE 136 127295 - 128425 973 376 aa, chain - ## HITS:1 COG:no KEGG:BF2457 NR:ns ## KEGG: BF2457 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 14 376 21 394 398 191 32.0 3e-47 MRKEFETIAQYKEQMRAMLDKAEAEKRALDASEKEQFEQLKTKKELLEMKVERRALEDIN AGLVSDRRVLFSQAVFDVVNHRSLEEYNGVVSEGGIKVVERAVTVTDTTDAASMVPVTIG EIIEPLEKGLIIDKLGIKMQSGLVGDLVFPTLAAVEATIQGENVAVTDTELNIDKIKASP KRVSISIPVSKRAINQTNYSLQDVVLKQISLGVARTLNKWMFSGTALSGASNGVFVKTKP DVEYTNALTFADIVSLESTVMDAGVDVTDGTAAYVCTPKVYGALKSTPKAAGAAEMICQN GMVNGYPVLVTNYMDADSIGFGVFSNAAIGQFGDMDLVIDPYTGAKSNIVNFVLNTDYDI VVARPEAFAIAKKKAA >gi|225935375|gb|ACGA01000017.1| GENE 137 128429 - 129028 532 199 aa, chain - ## HITS:1 COG:STM2236 KEGG:ns NR:ns ## COG: STM2236 COG3740 # Protein_GI_number: 16765564 # Func_class: R General function prediction only # Function: Phage head maturation protease # Organism: Salmonella typhimurium LT2 # 2 163 12 158 172 71 35.0 9e-13 MEIRSYTELGAPKVGDGRIIEGYAVVFGQESRVLYDREKQRAFVEVIEKGAITEELLRSC DVKALLDHNKQRLLARSNRGAGTLSLELDDYGLKYRFEAPSTPDGDFAVEMIKRGDIFGS SFAYALNEKDKTKVSYSMKDGLLLRTVHMIDRISDISPVVDPAFYGTDVTVRSMDDTIAE LSGENKDYLNEINNLRKSI >gi|225935375|gb|ACGA01000017.1| GENE 138 129049 - 130029 222 326 aa, chain - ## HITS:1 COG:ECs1592 KEGG:ns NR:ns ## COG: ECs1592 COG4695 # Protein_GI_number: 15830846 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 2 299 90 385 403 85 25.0 2e-16 MPNRRMNSFEMVRNMVVQIVNQGNAYIVIRRKFGSVSELVLCANNTVTYDKLNDVYIISD PYNRIYGRFESYEIIHLKNNSLDGGYTGVSTIMYASRIFSIAASADNQNLRTFQNGSKIK GLVSGAKEINKGLPGAGMTDIQLSTVGDRIEEQLNTGRDIISVPGDVGFHQLSINPVDAQ LLETKKFSILDICRFYGVHPDKVFAGQSTNYKASEMSNVSFLTDTLQPILKQIEAEFNYK LIPNSVAHLYSISFDLSCLYQTDLTTQASYYKALEEMGAHSPNDTRRALGKPPVEGGDKV FISCNVQPIEAASQKVELPKNEETNI >gi|225935375|gb|ACGA01000017.1| GENE 139 130323 - 
132038 686 571 aa, chain - ## HITS:1 COG:ECs1598 KEGG:ns NR:ns ## COG: ECs1598 COG4626 # Protein_GI_number: 15830852 # Func_class: R General function prediction only # Function: Phage terminase-like protein, large subunit # Organism: Escherichia coli O157:H7 # 58 566 25 528 553 194 31.0 3e-49 MEKETRDKLIALKQSVISDLHNIDVDSYKLEAADERLNLYVKGCINNPDAHNLYELLAVS RFFSFLDKYEFRIKEVKKFVTFYERLKFSGTKGKTRYKLTPIQVFQFSNILAFYKPGTNK RLIREALLFVPRKFSKTTSVASLSINDLLFGDANAQTYVAANSYNQAKVCFDEIRNILKS LDPKFRHFKINREIIYNHIKGKTSFARCLASNPDKLDGLNASMVIVDEYSQADSAALKNV LTSSMGARLNPLTVVITTASDKETAPFVEMLKMYKAILRGEIENDSIFAHIFEPDVDDEE GDPATWRKVQPHMGITVYEDFYIDAYQKALYSAPDALEFRTKLLNVFAVDSTTKWIEAKQ IEERFKDIKIENIGTYPLTMAAVDLSVRDDFSTVTYNIYSKESGSFHSHTDYYFPEGALK DHPNRELYEGWAKAGYLILCDGDIIDYQQIVNDILARAKYLQIMGVGYDPYKSAEFVNLL TYSVGGASEYIKPVKQTYGTFTSPIESFELALYRSKLTFSPNPITPYCFSNAVLDEDRNM NKKPVKKTHNAKIDSTITNLMTFYLFNNMEV >gi|225935375|gb|ACGA01000017.1| GENE 140 132022 - 132384 286 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170701|ref|ZP_05757113.1| ## NR: gi|260170701|ref|ZP_05757113.1| hypothetical protein BacD2_02455 [Bacteroides sp. 
D2] # 1 120 42 161 161 220 100.0 2e-56 MSDLDDIKEKIRAAMNSQGTYTSDLDLCITLCAGSYIAFKIALNDIAKKKRSFVTEVSRE GNKKLVAHPAFKVLFDALEVTRKQLRELGLTLQTLSSSDDDEVNDLINEVDKIDRDGEGD >gi|225935375|gb|ACGA01000017.1| GENE 141 132615 - 133388 262 257 aa, chain + ## HITS:1 COG:no KEGG:PputGB1_1741 NR:ns ## KEGG: PputGB1_1741 # Name: not_defined # Def: hypothetical protein # Organism: P.putida_GB1 # Pathway: not_defined # 1 79 1 79 303 113 58.0 5e-24 MTEKINHVKNPLTIIAIFAGIAEVSGTIITPFIDKELQGIFIYFLIGFPTILVVLFFVTL WFKSNVLYAPSDFSNEENYVIMRQFYDKNRKIEIVERKQSNTKKTDGTCFVAISSTNAPQ KANSENSVRIRLADVPTASNMAKAFRKKGYNNIDIYSGFSEEPSTDENSKAIWIGADIDI ETIKKVIKDAVRIYPNLKYISISNEYTDNVRNEIFIGGSTETAISRNAKDLEEEDFKAIE SVKNINELEQIITQKFY >gi|225935375|gb|ACGA01000017.1| GENE 142 133426 - 133761 158 111 aa, chain - ## HITS:1 COG:no KEGG:BF2462 NR:ns ## KEGG: BF2462 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 110 1 110 111 85 40.0 7e-16 MSRNPHYIKMINSNKWKLLRAKKLQSNPVCEVCEANNRSTLATEVHHTVPVESVSHELGM RQLMFDYNNLQSLCHSCHSDAHRRAFSHSKEAIQANNKRATERFADKFLKI >gi|225935375|gb|ACGA01000017.1| GENE 143 133901 - 134182 155 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170705|ref|ZP_05757117.1| ## NR: gi|260170705|ref|ZP_05757117.1| hypothetical protein BacD2_02475 [Bacteroides sp. D2] # 1 93 1 93 93 144 100.0 2e-33 MSKVKQYIEQATNERIRSRGLIRKVAIEAARIQRDETRRQAIEVYKQMCPSKNCKGCASR IHKQETQLTRCDGNCARIRLLINGLDRIETLCI >gi|225935375|gb|ACGA01000017.1| GENE 144 134179 - 134364 144 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170706|ref|ZP_05757118.1| ## NR: gi|260170706|ref|ZP_05757118.1| hypothetical protein BacD2_02480 [Bacteroides sp. D2] # 1 61 1 61 61 91 100.0 2e-17 MNQAQNQSKYYYSPRFRHFNIYRRDPDGDTKVDDAATQEEAKRKVYELNGWNYKPKNNTV K >gi|225935375|gb|ACGA01000017.1| GENE 145 134413 - 134637 320 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170707|ref|ZP_05757119.1| ## NR: gi|260170707|ref|ZP_05757119.1| hypothetical protein BacD2_02485 [Bacteroides sp. 
D2] # 1 74 1 74 74 118 100.0 1e-25 MKTKIGVSIKVNNKEYKFDFESDYLTEDLWNNHKYNHEILMGHIDAAIINYKNTHNITEE TDSNSMRFTPHFYE >gi|225935375|gb|ACGA01000017.1| GENE 146 134689 - 135036 197 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170708|ref|ZP_05757120.1| ## NR: gi|260170708|ref|ZP_05757120.1| hypothetical protein BacD2_02490 [Bacteroides sp. D2] # 1 115 1 115 115 204 100.0 2e-51 MVKKLSNTNYLHDVSADPVAANERNRKYIDRFVSENYNGLVAKFSPLDGTINSSAFGALD KLNSTIISLYTDPNLHFTDWEQAKQYLSNKFTEKAIRVSVKKPVKSEDGENEDES >gi|225935375|gb|ACGA01000017.1| GENE 147 135038 - 135367 201 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170709|ref|ZP_05757121.1| ## NR: gi|260170709|ref|ZP_05757121.1| hypothetical protein BacD2_02495 [Bacteroides sp. D2] # 1 109 1 109 109 194 100.0 2e-48 MRSRKKKLVYFKKIPVRVDLEQWQRLDKIRADYHFKSTYEIMQYILGCFLRVADPMPGDD DEEVLPDEIKEMFYDLSQAERHFEYVKPKRKLPQHKVDEMNGQKRLEGF >gi|225935375|gb|ACGA01000017.1| GENE 148 135384 - 136052 313 222 aa, chain - ## HITS:1 COG:no KEGG:CHU_1184 NR:ns ## KEGG: CHU_1184 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 46 221 30 208 212 95 31.0 1e-18 METRSKQTLPIAAILSYGLPYYDEPIESGKRPEWFKACCKYVCPNFKIDDSNRNIMNQLF LYTEGRSEKLDSNKGLLLRGDIGTGKSTIMQILNRYSYFTRGKAKGGYPIGGFRIDSASC IANGFSMRGKDALELYTYNNGTPRMICFDELGREPIPAKYFGTELNVMQYIFQCRYELRH EAITHVTTNLTIKEIQRIYGAYIADRINEMFNVLDLNGASRR >gi|225935375|gb|ACGA01000017.1| GENE 149 135985 - 136743 377 252 aa, chain - ## HITS:1 COG:no KEGG:BF2328 NR:ns ## KEGG: BF2328 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 154 7 159 295 166 54.0 6e-40 MNIDGYTLTEKMRKARRRFRFTATEQALFYELVAICNGEDWRDVFDCSNIELCFALNVNE KTLIKARESLINAGLIYYKSGKNKRIISSYSFVKEFKTTVTTTVNFTANQTANKGVNQTA NDTVDKGANDTGDSTDYNKLKQKPNRNILSKVSHGDFDFISDEFLEAFSLWLEYKKDRRQ NYKSEKSLKACYNKLVKLSKNDPVIAEQIVNESIANNWSGLFELKNDKCEYGNKKQTDPA DSGDTIIRTTVL >gi|225935375|gb|ACGA01000017.1| GENE 150 136913 - 137263 329 116 aa, chain - ## HITS:1 COG:no KEGG:BURPS1710b_1673 NR:ns ## 
KEGG: BURPS1710b_1673 # Name: not_defined # Def: GP72 # Organism: B.pseudomallei_1710b # Pathway: not_defined # 2 116 72 182 183 74 44.0 1e-12 MTKYNNVKIDGYDSKKEYRRAKELKLLEKKGIITGLQEQVKFELISPQCHFYEVQGAWKM LRKKELIERGVYYIADFVYYRNGEYIVEDTKGVRTKEYIIKRKLMLYIHGIKIKEV >gi|225935375|gb|ACGA01000017.1| GENE 151 137430 - 137897 408 155 aa, chain - ## HITS:1 COG:no KEGG:BT_1331 NR:ns ## KEGG: BT_1331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 3 144 159 210 76.0 1e-53 MNTWFLCKIRYEKVMENGMQKKVTEPYLVDALSFTEAEARIIEEVTPFISGEFTVSDISR AHYSEIFTSEEDSADKWFAGRLAFTTLDEKSGKEKRTYTNVLIQAADIHDAMKKLDEGMK GTMADYSSILLKETAIVDVYPYEVKQENYEFKHDK >gi|225935375|gb|ACGA01000017.1| GENE 152 137894 - 138259 310 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170714|ref|ZP_05757126.1| ## NR: gi|260170714|ref|ZP_05757126.1| hypothetical protein BacD2_02520 [Bacteroides sp. D2] # 1 121 1 121 121 214 100.0 1e-54 MSSKNKEEILDCIPLLHPYSMDEIYTMLKRHGCKISHEETYNLVEKNLYKIRDKYVRMTE YRSFGNSTSIENIKMSNGCLTCTLTEVSTLEQKIVDKIKFAEKYNIPYEGLIELIKEMNL L >gi|225935375|gb|ACGA01000017.1| GENE 153 138310 - 139122 614 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170715|ref|ZP_05757127.1| ## NR: gi|260170715|ref|ZP_05757127.1| putative phage-like protein [Bacteroides sp. D2] # 1 270 1 270 270 540 100.0 1e-152 MKSSEQKEIEWKEKRRGKITASTLPDLMKAGKGCPFGKAALDAMYLVRYERRTGTMRENG SNKAFDWGHENEPLAVEWVRSQLMNEIKSCTTDFKDIVFNEPFEGFRDSPDFYVYGFDGK VIALGEIKCPMSQGKIESLQFGNTIDEKDEYYWQFLGHFLGRPDVDKLYYVIYDGYTNEG RILEMNRADHADNIKKLYDRIRLASEMIDESIRSGLDLLDCVDKAKEVLKLKMQIEALKP EAKNSVPVKNQIYKIRKELKKLTKKVPSQH >gi|225935375|gb|ACGA01000017.1| GENE 154 139109 - 139594 389 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170716|ref|ZP_05757128.1| ## NR: gi|260170716|ref|ZP_05757128.1| hypothetical protein BacD2_02530 [Bacteroides sp. 
D2] # 1 161 1 161 161 318 100.0 5e-86 MTHWKTQFNYDYLGAYSLPDGKDIILTIRETKKEQVVGTSGKKEECFVAYFFENVKPMIL NRTNCKTMTKLFKNPNFEEWVNKQIQIGSVMVDAFGEKVDSLRIRPFIPKVENSLPTVET GSAIWKNILDGLAGGFTVAQVQTKYKLTKEQIKELVAHEIK >gi|225935375|gb|ACGA01000017.1| GENE 155 139594 - 139980 355 128 aa, chain - ## HITS:1 COG:no KEGG:BF2326 NR:ns ## KEGG: BF2326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 128 13 134 134 117 49.0 2e-25 MNELYWIERLDAVNVTFVIILIVALVWLVYVFIESNVESYSEKECIDKGIYKAKKVSYVI IAISLLILIFTPTTKEMYRIIGIGETINYLRQNEASKELPDKCIKALDLFLDKITGDNKE TNSNNTTR >gi|225935375|gb|ACGA01000017.1| GENE 156 139973 - 140233 157 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170718|ref|ZP_05757130.1| ## NR: gi|260170718|ref|ZP_05757130.1| hypothetical protein BacD2_02540 [Bacteroides sp. D2] # 1 86 1 86 86 128 100.0 1e-28 MAANPQCIGNCRICTVLGACPSDTLVCEDCGEEIEPGEEIELEVETYERGRYGTKIITVC ARCYESLYQGESDNFNNDFLKSKQNE >gi|225935375|gb|ACGA01000017.1| GENE 157 140205 - 140396 198 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTLKQSPAAIILMLSACSLAEGEPEPGKLIIALLIVFITVVYVLVCNYLNVKRHGGESS MYR >gi|225935375|gb|ACGA01000017.1| GENE 158 140599 - 140808 158 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170719|ref|ZP_05757131.1| ## NR: gi|260170719|ref|ZP_05757131.1| hypothetical protein BacD2_02545 [Bacteroides sp. D2] # 1 69 1 69 69 114 100.0 2e-24 MRRKRNELTALLRGMQPGETMTFPRSKRNSVRPTCTNLKYDEGLLFTTETDKDNLIVTRL NNEQWDELE >gi|225935375|gb|ACGA01000017.1| GENE 159 140854 - 141084 323 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170720|ref|ZP_05757132.1| ## NR: gi|260170720|ref|ZP_05757132.1| hypothetical protein BacD2_02550 [Bacteroides sp. 
D2] # 1 76 1 76 76 108 100.0 7e-23 MKATSTLTRKTALEILIESRDKSIINALIAKKEIALEEAVNNAEWYASLGLDGMADNEVA RQEKLIRDIERLKVAI >gi|225935375|gb|ACGA01000017.1| GENE 160 141127 - 141315 279 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170721|ref|ZP_05757133.1| ## NR: gi|260170721|ref|ZP_05757133.1| hypothetical protein BacD2_02555 [Bacteroides sp. D2] # 1 62 1 62 62 100 100.0 2e-20 MKAGMIGDVEFKKTGSETVCCVSLINTTAGQRFLACTLSSSKTFKTFKGAEKFMNSFGYQ KI >gi|225935375|gb|ACGA01000017.1| GENE 161 141458 - 142129 413 223 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_1827 NR:ns ## KEGG: Fjoh_1827 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 4 184 6 193 230 86 31.0 8e-16 MEIVDRIKLFREYLGIGQTAFEVNIGVARGYFSNVKTLGSDRILRIHTKYPELNIEWLVT GNGEMIKNAEREQKTIEISESAISETKRKGALIYDIDATCGLSGRDIEFTDEKVIGSIDA PEINSDSKIIFATGDSMLPLIASGDRVVIRKIESWDYFNYGQVYLIITNEYRLIKRVRRH PKDADNLILLRSENPDYDDIDLPKREIIHLFIVENILSIKNIL >gi|225935375|gb|ACGA01000017.1| GENE 162 142150 - 142689 210 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170723|ref|ZP_05757135.1| ## NR: gi|260170723|ref|ZP_05757135.1| hypothetical protein BacD2_02565 [Bacteroides sp. 
D2] # 1 179 1 179 179 342 100.0 5e-93 MKRLFILFGILIILASCNQKNNKNFKSLNYTKLKQPMCYFAKTDSGFVLKAESNKDYKYK YIEYVINKNEIKMYSDSLYDLYCGVSITYPIKFGRGSFMGDWEFSNRSICKLKPTGEMID KRYVYKTKFSNFNSPEIRTHIKELNFEINGHEFDIALDSIETRTLLPTFEPTLYTAKGL >gi|225935375|gb|ACGA01000017.1| GENE 163 142710 - 143153 401 147 aa, chain + ## HITS:1 COG:no KEGG:BF2306 NR:ns ## KEGG: BF2306 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 145 15 161 176 80 33.0 2e-14 MEVVLIIVVAAVIMLAIKIAMTNPKESSNNQNQPKTEIPSGEIEFPPSGYFYYEMVGMYY HGVTPKDFGIFKGKAIAETNNPKDEFAVGIYRDGDNKLVGYIPKDFRGVSNEKIHKEITE NGGSREVVFKISGSEKKCYGTVYIKNN >gi|225935375|gb|ACGA01000017.1| GENE 164 143203 - 143691 226 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170725|ref|ZP_05757137.1| ## NR: gi|260170725|ref|ZP_05757137.1| hypothetical protein BacD2_02575 [Bacteroides sp. D2] # 1 162 1 162 162 280 100.0 2e-74 MEAKIEPHEQFNSEDEELCYYGKVYESVLNKLDAVNTSLKDWLSHLLMIAATLLGVLAAL NPVKQTDPICIRICFLLAVLLLVLVLLLGGVSLYGVVFEKRTHFENYAKELGESVKYHRK MRPTMPDGKRLFLICEKGSYICFGLFFLALILYLILSLFVVS >gi|225935375|gb|ACGA01000017.1| GENE 165 143777 - 144080 234 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170726|ref|ZP_05757138.1| ## NR: gi|260170726|ref|ZP_05757138.1| hypothetical protein BacD2_02580 [Bacteroides sp. D2] # 1 101 1 101 101 197 100.0 1e-49 MGTFFGLIAVLFAVLQIILFFKIWGMTNDIREIKEKYLSSTDPKKNVSPAQPTEFSIGEL VVEIKTNKQMRIKEITQDGKYSCYTGGGASHEGDFTASEIK Prediction of potential genes in microbial genomes Time: Fri May 13 06:55:19 2011 Seq name: gi|225935374|gb|ACGA01000018.1| Bacteroides sp. D2 cont1.18, whole genome shotgun sequence Length of sequence - 13052 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 8, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 77 - 136 7.7 1 1 Op 1 . + CDS 161 - 1372 331 ## BDI_0906 putative plasmid maintenance system antidote protein 2 1 Op 2 . 
+ CDS 1386 - 1859 373 ## gi|260170729|ref|ZP_05757141.1| hypothetical protein BacD2_02607
3 2 Tu 1 . - CDS 1847 - 2041 133 ## gi|260170730|ref|ZP_05757142.1| hypothetical protein BacD2_02612
- Prom 2158 - 2217 4.0
4 3 Tu 1 . - CDS 2260 - 3090 639 ## BT_1233 hypothetical protein
- Prom 3279 - 3338 5.8
+ Prom 3119 - 3178 6.0
5 4 Tu 1 . + CDS 3199 - 3783 504 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
+ Term 3852 - 3893 6.2
- Term 3691 - 3720 0.0
6 5 Op 1 . - CDS 3786 - 4688 654 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 4713 - 4772 2.1
7 5 Op 2 . - CDS 4782 - 6524 1757 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases
8 5 Op 3 . - CDS 6521 - 8449 1311 ## COG0642 Signal transduction histidine kinase
- Prom 8524 - 8583 6.0
9 6 Tu 1 . + CDS 8568 - 10700 198 ## PROTEIN SUPPORTED gi|227384144|ref|ZP_03867559.1| SSU ribosomal protein S1P
10 7 Tu 1 . - CDS 10672 - 11568 743 ## COG1266 Predicted metal-dependent membrane protease
- Prom 11627 - 11686 7.7
+ Prom 11545 - 11604 4.6
11 8 Tu 1 .
+ CDS 11624 - 12721 1068 ## BT_1240 hypothetical protein + Term 12862 - 12902 -0.4 Predicted protein(s) >gi|225935374|gb|ACGA01000018.1| GENE 1 161 - 1372 331 403 aa, chain + ## HITS:1 COG:no KEGG:BDI_0906 NR:ns ## KEGG: BDI_0906 # Name: not_defined # Def: putative plasmid maintenance system antidote protein # Organism: P.distasonis # Pathway: not_defined # 16 361 28 362 400 181 32.0 6e-44 MYISKINTSQAPIDVSLREMLAKFIQENEIIESSIADEIGIHKETLSKFLEGKAELKFMQ AIRLMKLLDLTESQLVSAYCKDINIDEASSLDKFEKLSYIMQNFDVPTLKKIGIIKSRAK IDEYEQCICDFFGFSSIYEYDDTSLMPTLFSKSKKKILQEKEAKMTTFWLKCAISSFSKI DNPNDYDKDLLFQLLKRASEFTQDEVNGYKRFVLVLFQLGITVLTQSYVSGTKSFGVTMI LNGKPCIIITDMNKQYHKLWINLLHELYHVINDFEMLESMDYHLSNSETPELLLNENRAD QFALDVLVNPSVQSKLRKIISFPFKVHLLAKELNISPSIIYGVYLESLPNGRLKAQEFAK YNNGDNLISSDIATRNILFDAVEKRSLENAIEKMKTELFKRAI >gi|225935374|gb|ACGA01000018.1| GENE 2 1386 - 1859 373 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170729|ref|ZP_05757141.1| ## NR: gi|260170729|ref|ZP_05757141.1| hypothetical protein BacD2_02607 [Bacteroides sp. D2] # 1 157 1 157 157 283 100.0 2e-75 MELPKKKLEDLISAADKIIEANNQTQEDLFGQVKNPNAISEILDIPLQDPKRSYTLYYQN IQKFLGDFLPKDNDISKTIRELICILLAHKELSGITYGTREADSRMATTTDMENLIDVLS EWSETPTDYFKLANILLKKNKELGYIPEEREIKDYLK >gi|225935374|gb|ACGA01000018.1| GENE 3 1847 - 2041 133 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170730|ref|ZP_05757142.1| ## NR: gi|260170730|ref|ZP_05757142.1| hypothetical protein BacD2_02612 [Bacteroides sp. 
D2] # 1 64 53 116 116 120 100.0 2e-26 MTDEELRAFCLKQAIQIITHKEQPRTMGFQNTDSIYLFELTEILLEYIKTGKQNYVPVYL NYFK >gi|225935374|gb|ACGA01000018.1| GENE 4 2260 - 3090 639 276 aa, chain - ## HITS:1 COG:no KEGG:BT_1233 NR:ns ## KEGG: BT_1233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 238 3 240 244 404 79.0 1e-111 MNKIIGLAVLLFCLSGCVRDNDAIYYPVGNVDVERGGPALEAGKGDLIARSYNTEDYVLD TLAQYPGDPTLGKLTFMVSLKNRLADQEVSGFNGVGLSKLTMSLGYKDGNYPAESQVPVY TSSDVTASYAIKLRLKGELTLTGDEWMIDYIYAQLAGLFQPYPPTSFPEVFMCKGGEQSY ATFDSFRRTWTFDITYDRSNLSFSQLYFNLFVNLAGQKREDRVRLRIDKESYFEIYKEKR GNVVKLHFLFFFGSGYENRTRITCVRGVFVHHLKRL >gi|225935374|gb|ACGA01000018.1| GENE 5 3199 - 3783 504 194 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 39 190 40 192 199 77 28.0 2e-14 MDTLLRETVNAVVNSRFPEMSIEGRRQIESILIREEFPKGAIALNEGEVAHEIVFVGKGM LRQYYYKNGKDVTEHFSYEGCIVMCIESFLKQEPTRLIVETLEPSIIYLFPRDMIQKLAK ENWEINMFYQKILEYSLIVSQIKADSWRFESARERYNLLLETHPEIIKRAPLAHIASYLL MTPETLSRVRSGVL >gi|225935374|gb|ACGA01000018.1| GENE 6 3786 - 4688 654 300 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 204 298 191 286 288 74 37.0 3e-13 MYIQNVPKVGISSVVHSKHIDSDNIDVVDNDIALFDTESVISLYNGPTKLEVLTVGLCLE GAGTFNISLREFQLFPGLMVIALPSQIVEQRRFSSDFKGIFFAVSKNLLEALPKIGNVLS LFFYLKDYPCFDLTPQEQETVKEYHAFIRKQLRNKEALYRKEVVMGLMQGFFYELCNIFT NHAPANATTMKNKSRKEYIFERFYESLVESYQSERSVKFYADQLCLTPKHLSGVVKEVSG KTVGEWIDELVILEAKALLNSSSMNIQEIADRLNFANQSFFGKYFKHYTGMSPKEYRKSR >gi|225935374|gb|ACGA01000018.1| GENE 7 4782 - 6524 1757 580 aa, chain - ## HITS:1 COG:CAC0353 KEGG:ns NR:ns ## COG: CAC0353 
COG0737 # Protein_GI_number: 15893644 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Clostridium acetobutylicum # 27 568 542 1093 1193 235 30.0 2e-61 MKRIGFIYGLLFCLVLSVAAQEKVVKLKIVETSDVHGNYYPYNFITRHEWKGSLARIYSF VQKEREQYKENLILLDNGDILQGQPTAYYYNYIDTVSPHLCSEMMNFMKYDAGNMGNHDV ETGRAVFDRWIATCDFPVLGANIIDTSTGKPHLAPYKVLERDGVKIVVLGMITPAIPAWL SENLWKGLRFDDMEETARKWMKIIREKENPDLVIGLFHAGQEAFKMSGKYNENASLNVAK NVPGFDIVLMGHDHASECKKVMNVAGDSVLIIDPASNGIVLSNVDVTLKLKDGKVQSKDI KGELTETEAYGISEDFMKRFAPQYETVQKFVSKKIGTFTESISTHPAFFGPSAFIDLIHT LQLDITGADISFAAPLSFDAEIKKGDVFVSDMFNLYKYENMLYVMTLSGKEIKDFLEMSY YMWTNRMKSPEDHLLWFKEKPREGAEDRASFQNFSFNFDSASGIIYTVDVTKPQGEKITI TSMADGSPFRMDKIYKVALNSYRGNGGGELLTKGSGIPQEELKDRIIFSTDKDLRFYLMN YIEKKGTMDPKALNQWKFVPEKWTVPAAKRDSEYLFRSVQ >gi|225935374|gb|ACGA01000018.1| GENE 8 6521 - 8449 1311 642 aa, chain - ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 400 642 13 260 311 154 38.0 7e-37 MLGRLGYLFICFLWLLQSTEVLAVSKDKKPILIICSYNPAAHQTSVTISDYMDEYSKLGG QRDIIIENMNCKSFSEAPLWSAMMTQILAKYQGEKHPAQIILLGQEAWAAYLSQRDEMQV KVPVMCSLVSSNVVILPKDTVENLDCWMPESVDIFEDHLDIPELESGFINQYNIEGNISM IQAFYPKTKHIAFISDNTYGGVTMQSLVRKEMKKFPDLDLILMDGRRHSIYTIVEELRQL PENTVILVGTWRVDMNEGYFMRNATYAMMEATPTIPAFTPSSVSLGYWAIGGVLPDYRKV GGEMAMESVRMDQHPEDTGKHLSIIGSKAVLDSRKVKEWGLHPSVLPFKVQLVNQPVSFY QQYTYQIWSACALFVILVLGLCISLFYYFRTKRLKDELLKSEKDLRVAKDRAEESNRLKS AFLANMSHEIRTPLNSIVGFSDVLAMGGSTEDEQQSYYKIIKTNSDLLLRLINDILDLSR LEANRVTLTWEECDVVQLCSQVVASVSFSRQSSDNQFLFSTSFETFRMVTDVQRMQQVII NLLSNADKFTKRGKITLDFSVNEEKQMAIFSVTDTGCGIPKEKQGLVFERFEKLNEYAQG TGLGLSICKLIVHKWRGSIWIDPDYTGGARFVFSHPLNIEKE >gi|225935374|gb|ACGA01000018.1| GENE 9 8568 - 10700 198 710 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227384144|ref|ZP_03867559.1| SSU ribosomal protein S1P 
[Jonesia denitrificans DSM 20603] # 633 710 203 279 488 80 48 4e-15 MELFHKMISGFLGIPERQISSTLHLLGEGATIPFISRYRKEATGGLDEVQIEQIKEQHDK LCDIAKRKETILGTINEQGKLTAELEKRINDTWNPTELEDIYLPYKPKRKTRAEVARQKG LEPLATILMLQRENNLSAKAASFVKGDVKDVEDALKGARDIIAEQVNEDERARNAVRNQF SRQAEISAKVVKGKEEEAAKYRDYFDFSEALKRCTSHRLLAIRRAESEGLLKVSINPDDE ACIERLERQFVRGNNECSQQVDEATVDAYKRLLKPSIETEFAAQSKEKADEEAIRVFTEN LRQLLLAPPLGQKRVLAIDPGFRTGCKVVCLDAQGNLLHNENIYPHPPVNKTGEAASKLR KMIEAYQIEAISIGNGTASRETEDFINSQSFDRQIPVFVVSEQGASIYSASKIARDEFPD YDVTVRGAVSIGRRLMDPLAELVKIDPKSIGVGQYQHDVDQTKLKKALDQTVENCVNLVG VNLNTASSHLLTYISGLGPQLAQNIVNYRAENGAFSSRKELMKVPRMGAKAFEQCAGFLR IPGAKNPLDNTAVHPESYHIVEQMAKDLKCTIDELIADKELRRKINISAYITPTVGLPTL QDILQELDKPGRDPRKAIKVFEFDKNVRTIADLREGMILPGIVGNITNFGAFVDIGIKEN GLVHLSQLAERFISDPTEVVSIHQHVMVRVMNVDYDRKRIQLSMIGVQQD >gi|225935374|gb|ACGA01000018.1| GENE 10 10672 - 11568 743 298 aa, chain - ## HITS:1 COG:FN0640 KEGG:ns NR:ns ## COG: FN0640 COG1266 # Protein_GI_number: 19703975 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Fusobacterium nucleatum # 76 288 75 288 293 83 28.0 5e-16 MEADNIDGKKEPRRLPVWACIPLFIVVLFIFIGLYGTLVSGGLSLVLGVEARHPGVVGYI FVEASMLLAVLTAAVILLRLERRPFSDLGLSLKGHASGLWYGLLMAILLYLLGFGISVVL GEIEVTGFQFKPLDLLGSWVFFLLVALFEEILMRGYILGRLLHTNMNKFLALFISSALFA FMHIFNPEIAFLPMLNLLLAGMLLGASYLYTRNLCFPISLHLFWNWIQGPILGYQVSGNN FTTSMLTLRMPEENVLNGGAFGFEGSLICTVLMIVFTILIVWWGEKREAISLAVPRSY >gi|225935374|gb|ACGA01000018.1| GENE 11 11624 - 12721 1068 365 aa, chain + ## HITS:1 COG:no KEGG:BT_1240 NR:ns ## KEGG: BT_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 365 1 349 349 644 90.0 0 MYSLFFLYLSEQIYKIMKIKLLLISFLLAANALGAAAQVSKTYYVSKPGTLISMMTEEEA NSVTHLTLTGKLNAEDFRHLRDEFANLKVLDISNAEIKMYSGKAGTYPNGKFYIYMPNFI PAYAFSNVVDGVTKGKATLEKVILSEKTKNIEDAAFKGCENLKICQIRKKTAPNLLPEAL ADSVTAIFVPLGSSDAYRYKDRWQNFAFIEGEPVETTLQVGAMGKLEEEILKAGLQPRDI NFLTVEGKLDNADFKLIRDYMPNLVSVDISKTNATAIPDFTFSQKKYLLNMKLPHNLKSI 
GQRVFSNCGRLCGTLELPASVTAIEFGAFMGCDNLRYVLATGDKITTLGDNLFGEGVPSK LVYKK Prediction of potential genes in microbial genomes Time: Fri May 13 06:56:10 2011 Seq name: gi|225935373|gb|ACGA01000019.1| Bacteroides sp. D2 cont1.19, whole genome shotgun sequence Length of sequence - 25551 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 9, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 12 - 45 1.2 1 1 Tu 1 . - CDS 192 - 749 321 ## COG4430 Uncharacterized protein conserved in bacteria - Prom 773 - 832 1.6 - Term 752 - 793 7.3 2 2 Tu 1 . - CDS 856 - 2475 1759 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 2711 - 2770 4.6 + Prom 2489 - 2548 5.2 3 3 Op 1 5/0.000 + CDS 2741 - 3322 815 ## COG0576 Molecular chaperone GrpE (heat shock protein) 4 3 Op 2 . + CDS 3362 - 4546 1486 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain + Term 4569 - 4619 6.2 + Prom 4629 - 4688 4.1 5 4 Op 1 . + CDS 4709 - 5734 520 ## COG0393 Uncharacterized conserved protein 6 4 Op 2 . + CDS 5768 - 6907 765 ## COG4335 DNA alkylation repair enzyme + Term 6922 - 6988 11.5 - Term 6914 - 6970 11.0 7 5 Op 1 . - CDS 7028 - 7420 380 ## Coch_0296 hypothetical protein 8 5 Op 2 . - CDS 7457 - 7645 167 ## gi|260170747|ref|ZP_05757159.1| hypothetical protein BacD2_02697 - Prom 7665 - 7724 3.3 - Term 7664 - 7721 4.3 9 6 Op 1 . - CDS 7732 - 9372 1439 ## BT_0484 hypothetical protein 10 6 Op 2 . - CDS 9387 - 12131 2137 ## BT_0483 hypothetical protein - Prom 12152 - 12211 4.5 - Term 12153 - 12210 11.2 11 7 Op 1 . - CDS 12230 - 14305 1337 ## COG3525 N-acetyl-beta-hexosaminidase 12 7 Op 2 1/0.000 - CDS 14305 - 16629 2085 ## COG3525 N-acetyl-beta-hexosaminidase 13 7 Op 3 . - CDS 16664 - 19258 1510 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 19312 - 19371 5.5 14 8 Tu 1 . 
- CDS 19412 - 21046 1167 ## COG4409 Neuraminidase (sialidase) - Prom 21120 - 21179 5.5 + Prom 21016 - 21075 5.9 15 9 Op 1 1/0.000 + CDS 21161 - 21625 302 ## COG0477 Permeases of the major facilitator superfamily + Prom 21668 - 21727 6.3 16 9 Op 2 . + CDS 21760 - 22935 915 ## COG2942 N-acyl-D-glucosamine 2-epimerase 17 9 Op 3 . + CDS 22956 - 24350 795 ## TTE2414 hypothetical protein 18 9 Op 4 . + CDS 24382 - 25549 1057 ## Sde_2927 beta-1,3-glucanase precursor Predicted protein(s) >gi|225935373|gb|ACGA01000019.1| GENE 1 192 - 749 321 185 aa, chain - ## HITS:1 COG:alr0739 KEGG:ns NR:ns ## COG: alr0739 COG4430 # Protein_GI_number: 17228234 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 8 176 15 192 193 109 35.0 2e-24 MEIKYFENREDWRKWLADNFENASDVWFVFPSKSSGEKSVTYNDAVEEALCFGWIDSTIK ALDKEHKIQHFTPRNPKSSYSQANKERLKWLLEHNMIHPDWEDNIRKVLSAPFVFPNDII GKLKEDKTVWENYQQFSDAYRRIRIAYIEAARKRPEEFEKRLNNFINKTKENKMIKGFGG IEKYY >gi|225935373|gb|ACGA01000019.1| GENE 2 856 - 2475 1759 539 aa, chain - ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 538 1 540 540 666 57.0 0 MITVSNVSVQFGKRVLFNDVNLKFTSGNCYGIIGANGAGKSTFLRTIYGDLDPTTGTIAL GPGERLSVLSQDHFKWDSYTVMDTVMMGHTVLWDIMKQREELYAKEDFTDEDGLKVSELE EKFAELDGWNAESDAAMLLSGLGVKEDKHYVLMGELSGKEKVRVMLAQALYGNPDNLLLD EPTNDLDMETVTWLEEYLANFEHTVLVVSHDRHFLDSVCTHTVDIDYGKINMFAGNYSFW YESSQLALRQQQNQKAKAEEKKKELEEFIRRFSANVAKSKQTTSRKKMLEKLNVEEIKPS SRKYPGIIFTPEREPGNQILEVSGLSKKTEEGVVLFNDVNFNVEKGDKVVFLSRNPRAMT AFFEIINGNMKPDAGTFNWGVTITTAYLPLDNTDFFNSDLNLVDWLSQFGEGNEVYMKGF LGRMLFSGEEVLKKVSVLSGGEKMRCMIARMQLRNANCLILDTPTNHLDLESIQAFNNNL KTYKGNILFSSHDHEFIQTVANRIIELTPNGIIDKMMEYDEYITSDHIKELRAKMYGDK >gi|225935373|gb|ACGA01000019.1| GENE 3 2741 - 3322 815 193 aa, chain + ## HITS:1 COG:TP0215 KEGG:ns NR:ns ## 
COG: TP0215 COG0576 # Protein_GI_number: 15639208 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Treponema pallidum # 22 191 27 203 220 80 33.0 2e-15 MDPKEKKVKEEELNVEETHNPAEDQPQNEQAEGTAPLTHEEELEKELEKAQEALEEQKDK YLRLSAEFDNYRKRTLKEKAELILNGGEKSLGSILPVVDDFERAIKTMETATDVNAVKEG VELIYNKFMAVLAQNGVKVIETKDQPLDTDFHEAIAVIPAPSEAQKGKILDCVQTGYTLN DKVLRHAKVVVGE >gi|225935373|gb|ACGA01000019.1| GENE 4 3362 - 4546 1486 394 aa, chain + ## HITS:1 COG:ECs0015 KEGG:ns NR:ns ## COG: ECs0015 COG0484 # Protein_GI_number: 15829269 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Escherichia coli O157:H7 # 4 394 3 372 376 282 42.0 9e-76 MAEKRDYYEILEVTKTATVEEIKKAYRKKAIQYHPDKNPGDKEAEEKFKEAAEAYDVLSN PEKRSRYDQFGHAGVSGAAGNGGPFGGFGGEGMSMDDIFSMFGDIFGGRGGGFGGGFGGF SGFGGGGSQQRRYRGSDLRVKVKLTLKEISTGVEKKFKLKKYVPCDQCHGTGAEGDGGSE TCPTCKGSGSVIRNQQTILGTMQTRVTCSTCGGEGKIIKNKCKKCGGDGIIYGEEVVSVK IPAGVAEGMQLSMGGKGNAGKHNGVAGDLLILVEEEPHQDLIRDENDLIYNLLLSFPTAA LGGAVEIPTIDGKVKVKIDSGTQPGKVLRLRGKGLPNVNGYGTGDLLVNISIYVPEALNK EEKSTLEKMEASDNFKPNTSVKEKIFKKFKSFFD >gi|225935373|gb|ACGA01000019.1| GENE 5 4709 - 5734 520 341 aa, chain + ## HITS:1 COG:VCA0951 KEGG:ns NR:ns ## COG: VCA0951 COG0393 # Protein_GI_number: 15601704 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 6 111 2 106 106 91 46.0 3e-18 MRNEFIMSTTNTIEGCPIKKYIDTICSNIVIGTNIFSDFAASFTDFFGGRSDSYRRKLEI IYGEASKALKQKALNLGANAIVGFRVDFDEISGKDKSMFMVSVSGTACIIEINNKELSEE EIIRTSISQSDLQKEIDRRYIISQIKDAVPLTSEWVEFLLEYPQTEIIESLIKRYISIDL DYYAEEAANIKQVLSATPPDFVIPVIYQHIEHKKIRILIKECCLFDGESIYETIVKGKIH EGVRLLSATKNSYDAKDLKWMKTILECLNSLPDTGKIEVVKGGVFTRDNGEKYICENGHK NPVDSSFCEKCGVNMKGMTEGEVKHIDSLKERYSIIEQYIQ >gi|225935373|gb|ACGA01000019.1| GENE 6 5768 - 6907 765 379 aa, chain + ## HITS:1 COG:BS_yhaZ KEGG:ns NR:ns ## COG: 
BS_yhaZ COG4335 # Protein_GI_number: 16078046 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Bacillus subtilis # 5 377 4 356 357 271 38.0 1e-72 MAEPFKNMYNEQFFDLFTKDLKLVIDDFDAHGFVSQVMDDEWEGRELKQRCIHITTILKK FLPADYKEAIAKILELLDHVKSTRPDFSVIDDTKFGLMLEYGAILDNYVEQYGLDDYETS VKAIEKITQFTSCEFVTHPFIIKYPDEMMKQMLVWSKHEHWGVRRLASEGCRPRLPWAMA LPNLKKDPTPIIPILENLKNDPARFVRLSVANNLNDIAKDNPEIVIDLAKKWKGESKEVD WIIKHGCRTLLKQGIPEVMELFGFDSIRNNISVEDFQISSPKVKVGDSLEFGFNLLNHSN KTIKIRLEYGIYYQKANGTLTKKVHKISEKEYTGNSTTRITRKHSFRVVTTRKFHLGLHQ IAIIINGSEFEKYDFDLIE >gi|225935373|gb|ACGA01000019.1| GENE 7 7028 - 7420 380 130 aa, chain - ## HITS:1 COG:no KEGG:Coch_0296 NR:ns ## KEGG: Coch_0296 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 129 1 126 126 75 31.0 8e-13 MKTVEVIVEHAGKNLSAYIEDAPVITVGNDMREIEDNMREAIELYLEDNSNPCEVLSGEF ELKFKIDAATFINYYSNIFTKAALSRITGINERQLWHYAAGVHKPRRQQLEKIQKGIQAL SKELSAINLL >gi|225935373|gb|ACGA01000019.1| GENE 8 7457 - 7645 167 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170747|ref|ZP_05757159.1| ## NR: gi|260170747|ref|ZP_05757159.1| hypothetical protein BacD2_02697 [Bacteroides sp. 
D2] # 1 62 1 62 62 104 100.0 2e-21 MSYKSVKDVVTMLLENGFVLKGQKGSHMKFEKNGRVVIVPNHNSKGVEKGTYYSILKQAG LK >gi|225935373|gb|ACGA01000019.1| GENE 9 7732 - 9372 1439 546 aa, chain - ## HITS:1 COG:no KEGG:BT_0484 NR:ns ## KEGG: BT_0484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 546 45 577 577 967 85.0 0 MKQYMKNKSRLGFFLAMCLGLFSCSKFLEENPKDKLPEDDVYNTISEVYLNAVASLYTYV GGYSDSQGLQGTGRGVYDLNTFTSDEAIIPTRGGDWYDGGFWQGLYLHDWGIENDAIQAT WEYLYKVVMLSNKSLERIDKFAETHSATELPAYRAEVRAMRAMYYYYLMDLFGRIPLVQS SSVAMKDVVQSERKTVFDFVVKELQEAAPLLSDAHSNQSGPYYGRITRPVVTFLLAKLAL NSEVYTDNDWTDGQRPDGKNIKFTVNGSELNAWETVIYYCDQLKTMGYKLEPEYETNFSI FNEPSVENVFTIPMNKTLYTNQMQYLFRSRHYNHAKAYGLSGENGPSATIEALEAFGYET AEQDPRFDICYFAGIVHDLKGNIIKLDNGTVLEYLPWKVSLDITDTPYEQTAGARMKKYE VDPTATKDGKLMENDIVLFRYADALLMKSEAKVRNGANGDEELNEVRSRVNASPRTATLE NILAERQLELAWEGWRRQDLVRFGKFTRAYSSRPQLPDEASGYTTVFPIPEKIRVMNERL KQNPGY >gi|225935373|gb|ACGA01000019.1| GENE 10 9387 - 12131 2137 914 aa, chain - ## HITS:1 COG:no KEGG:BT_0483 NR:ns ## KEGG: BT_0483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 914 1 914 914 1639 90.0 0 MVQHARLIFYSLLLLVIPCESTLAQKIPVTPIDSLITVGYATGSLKTLSGSVEKITETQM NKDQITNPLEAIRGRVPGLTIQRGSNGPAALDAVRLRGTTSLTSGNDPLIIVDGVFGDLS MLTSIYPTDIESFTILKDASETAQYGSRGASGVIEVTTKKGMSGRTQVAYNGSFGISTVY KNLKMLSGDEYRRVASERGISILDKGNNTDFQKEIEQTGLQQNHHIAFYGGSSESSYRVS LGFMDRQGVILNEDMKNFTSNMNMNQKMFDGFLNCELGMFGSIQKNHNLVDYQKTFYSAA TFNPTYPNHKDPVTNSWDGITTASQITNPLAWMEVQDDDATSHISTHARLTFNLLEGLKL NLFGAYTYNIVENSQYLPTSVWANGQAYKGTKKRESLLGNMMLTYKKNWKKHFFDVLALA ELQKETYTGYYTTVSNFSTDKFGYNNLQAGALRLWEGTNSYYDQPRLASFMGRFNYTYAD RYVLTLNARTDASSKFGANHKWGFFPSASAAWVISEEEFMKQLPMVDNLKFRIGYGLAGN QSGIDSYTTLNLVKPNGVVPVGNSAVVSLGDLRNTNPDLKWEVKHTFNTGIDVALFGNRL LLSANYYNSRTTDMLYLYNVSVPPFTYNTLLANIGSMRNWGTEIAIGITPLKTKDMELNI NANITFQRNKLLSLSGMYNGEMLSAPEYKSLASLDGAGFHGGYNHIVYQMVGQPLGVFYL PHSTGLESDGNGGYTYGIADLNGGGVSLEDGEDRYVAGQAVPKTILGSNISFRYKRFDLS 
LQVNGAFGHKIYNGTSLTYMNMNIFPDYNVMKKAPKQNIKDQTATDYWLEKGDYVNFDYV TLGWNVPIEKVQKLKKYVRSLRLAFTVNNLATISGYSGLSPMINSSTVNSTLGVDDKRGY PLARTYTLGLSINF >gi|225935373|gb|ACGA01000019.1| GENE 11 12230 - 14305 1337 691 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 30 522 30 518 757 375 40.0 1e-103 MNIRKEYPKVCLFLWILGMCFHAHPILAQSVIPVPLKMEQGTGSFLLSEKTKLYTNLQGG EAELWENYLKALPVQLKEARMKDRKQMLFLLITPKTPQLPSPESYTLSVTPQRIEIRATS GAGLFYGMQTLLQLMQPAGTGSYSVASVEIEDTPRFAYRGLMLDVSRHFSTKEFIKKQID ALAYYKINRLHLHLTDAAGWRLEIKKYPLLTDFAAWRTDPTWKKWWNGGRKYLRYDEPGA SGGYYTQNDIREILEYARQHYITVIPEIEMPSHSEEVLAAYPQLSCSGEPYKNSDFCVGN EETFTFLENVLTEVIELFPSEYIHVGGDEAGKSAWKTCPKCQKRMKDEHLANVDELQSYL IHRIEKFLNNHGRRLLGWDEILQGGIAPNATVMSWRGEEGGIAAVTSGHHAIMTPGAYCY LDSYQDAPYSQPEAIGGYLPLKKVYTYDPVPASLTAEQAKLVYGVQGNLWVEYIPTPEHV EYMIYPRMLALAEVAWSAPERKSWPDFHTRALSAVADLQKKGYHPFDLSKEIGSRPESLQ PVSHLALGKKVIYNSSYSPHYPAQGNTALTDGIRGDWTYGDGSWQGFISDNRLDVTIDME KETPIHSVTAAFMQVVGAEVFLPETVIISISDDGINFTELQKQHFEVSKETPIRFTDISW QGEAKGRYVRYQAQAGSEFGGWIFTDEIIVK >gi|225935373|gb|ACGA01000019.1| GENE 12 14305 - 16629 2085 774 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 619 31 607 757 431 42.0 1e-120 MKQLFKLTGCLALAGLFASCQSAQQEANYQIIPMPQEIVTAQGSPFILKSSVKILYPEGN EKMQRNAKFLADYLKTATGKDFSIEAGTEGKNAIVLALGTENENPESYQMKVTGDGITIT GPTEAGVFYGIQSLRKSLPVAVGADIAMPAVEINDAPRFGYRGAHFDTSRHFFTVDEIKT YIDMQALHNMNRLHWHITDDQGWRLEIKKYPKLTEIGANRTETVIGRNSGEYDGKPYGGF YTQEQAKEIVDYAAERYITVVPEIDLPGHMQAALAAYPELGCTGGPYEVWRQWGVSEDVL CAGNDQVLKFLEDVYGELIEIFPSPYIHVGGDECPKVRWEKCPKCQARIKALGLKSDQSH SKEERLQSFVINHIEKFLNDHGRQIIGWDEILEGGLAPNATVMSWRGEKGGIEAAKQKHD VIMTPNTYLYFDYYQAKDVENEPFGIGGYLPMERVYSYEPMPASLTPEEQKYIKGVQANL 
WTEYIATFPHAQYMVLPRWAALCEIQWSSPEKKNYANFLSRLPQLIKWYDAEGYNYAKHV FDVQAEFDPNPAEGTMDVTLSTIDGAPVYYTLDGTEPTAASPVYEGVLKIKENVTLSAKA IRPNGESKTVTEKIDFSKSSMKPIVANQPINEQYLFKGASTLIDGLKGNSSYKSGRWIAF NGNDMDMTIDLQQPTEISSVAISTNVAKGDWVFDARNLSVETSDDGKTFKKIASEEYPEM KETDKDGIVEHKLTFAPVTTQYVRVIASPEKSLPAWHGGKGKNAFLFVDEIKID >gi|225935373|gb|ACGA01000019.1| GENE 13 16664 - 19258 1510 864 aa, chain - ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 50 843 59 860 891 560 39.0 1e-159 MMINLIGKKAQIACELLCCCSMAYAQSNDNSEVVVLNTGWEFSQAGTELWRPAQVPGTVH QDLIYHKQIPDPFYGINEQKIQWVENEDWEYRTAFTVTPEQLKRDDAQLVFEGLDTYADV YLNGALLLKADNMFVGYTIPVKSQLRIGENLLHIYFHSPIRQTLPQYNSNGFNYPADNDH HDKHLSIFSRKAPYSYGWDWGIRMVTSGIWRPVTIRFYDAASISDYHVKQLALTDQLANL SNELEINNILPRPLQAEVRINTSFEGSAEKSISQAITLQPGINHVSIPSEVASPVRWMPN GWGKPALYDFSAQIIVEDKVVAEQSHRIGLRTVRLVNEKDKDGESFYFEVNGVPMFAKGA NYIPQDALLTNVTTERYQILFRDIREANMNVIRVWGGGTYEDDRFYDLADENGILVWQDF MFACTPYPSDPTFLKRVEAEACYNIRRLRNHASLAMWCGNNEILEALKYWGFDKNFPPEI YQEMFRGYDKLFHQLLPAKVKELDAGRFYIHSSPYFANWGRPESWGIGDSHNWGVWYGQK TFESLDTDLPRFMSEFGFQSFPEMKTISTFAAPEDYQIESEVMNAHQKSSIGNALIRTYM ERDYIIPEKFEDFVYVGLVLQGQGMRHGLEAHRRNRPYCMGTLYWQLNDSWPVVSWSSID YYGNWKALHYQAKRAFAPILINPIRQNDSLNIYLISDCPDTKDHLMLEMKVTDFDGKKQG KPIRLNTLTVPANTSQCVYRIKTDTWLSPEEQQRCFMQLTLKDKAGNTLAETVYFFRKTK DLLLPETTVSCKMKQKDGMCELTLFSPALAKDVFIEIPLQGARFSDNFFDLLPGERKTVV ITSPQIKKGEELPLTIKHIRETYN >gi|225935373|gb|ACGA01000019.1| GENE 14 19412 - 21046 1167 544 aa, chain - ## HITS:1 COG:STM0928 KEGG:ns NR:ns ## COG: STM0928 COG4409 # Protein_GI_number: 16764290 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 193 526 57 393 412 118 31.0 4e-26 MRRIFYLLFLVLLGYSFDVKASDTVFIHETQIPVLIERQDNVLFYIRLDAKESKMLDEVV LDFNKSTNLADVQAIKLYYGGTEALQDQNKNRFAPVEYISSHRPGATLAANPSYSIKCAE 
VGPSEKVVLRGNYNLFPGVNFFWISLQIKTDASLHTKIVSDLRAVKVDGKELYCKFISPK AITHRMAVGVRHAGNDGSASFRIPGLVTTNKGTLLGVYDVRYNSSVDLQEYVDVGLSRST DGGKSWEKMRLPLSFGEYGGLPKAQNGVGDPSILVDTQTNTVWVVAAWTHGMGNQRAWWS SHPGMDINHTAQLVLAKSTDDGKTWSKPINITEQVKDPSWYFLLQGPGRGITMSDGTLVF PTQFIDSTRVPNAGIMYSKDRGKTWKMHNMARTNTTEAQVAEIEPGVLMLNMRDNRGGSR AVALTKDLGKTWTEHPSSRKALQEPVCMASLIHVDAKDNILNKDLLLFSNPDTTKGRNHI TIKASLDKGLTWLPEHQIMLDEAEGWGYSCLTMIDKETIGILYESSVAHMTFQAVKLTDL LGMK >gi|225935373|gb|ACGA01000019.1| GENE 15 21161 - 21625 302 154 aa, chain + ## HITS:1 COG:BS_araE KEGG:ns NR:ns ## COG: BS_araE COG0477 # Protein_GI_number: 16080449 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 3 152 318 464 464 96 39.0 1e-20 MTTVLALVIIDKVGRKKLVYYGVSGMVVSLILIGLYFLFGDSLGVSSLFLLVFFLFYVFC CAVSICAVVFVLLSEMYPTKVRGLAMSIAGFALWIGTYLIGQLTPWMLQNLTPAGTFFLF AVMCVPYMLIVWKLVPETTGKSLEEIERYWTRSE >gi|225935373|gb|ACGA01000019.1| GENE 16 21760 - 22935 915 391 aa, chain + ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. 
PCC 7120 # 2 388 4 388 388 470 56.0 1e-132 MDFKKLANQYKDELLDNVLPFWLKNSQDHEYGGYFTCLDREGKVFDTDKFIWLQGREVWM FSMLYNKVEKRQEWLDCAIQGGEFLKKCGHDGNYNWYFSLDRSGRPLVEPYNIFSYTFAT MAFGQISLATGNQEYANIAKKTFEIILSKVDNPKGKWNKLHPGTRNLKNFALPMILCNLA MEIENILGKDYLEQAIETCIHEVMDVFYRPELGGIIVENIDVDGNLVNCFEGRQVTPGHA IEAMWFIMDLGKRLNRLELIEKAKNITLTMLNYGWDKEYGGIYYFMDRNGCPPQQLEWDQ KLWWVHIETLISLLKGYQLTGDQQCLKWFEKVHDYTWSHFKDPEYPEWYGYLNRRGEVLL PLKGGKWKGCFHVPRGMYQCWKVLQNLSEIK >gi|225935373|gb|ACGA01000019.1| GENE 17 22956 - 24350 795 464 aa, chain + ## HITS:1 COG:no KEGG:TTE2414 NR:ns ## KEGG: TTE2414 # Name: not_defined # Def: hypothetical protein # Organism: T.tengcongensis # Pathway: not_defined # 27 455 5 417 422 310 37.0 6e-83 MKRIFIIACIILTIASTAFAGECRVGLTLIPPGTITNKVDLDIRAGIVNKGNTQQDFYVS IYLNDVNQDSLIHFSRITLAPGSSREIKFSLKTTNKIGRNRIILKVKEKNEEFTISKDFE VINSEIRSTRLIDGAWAGLYHWSEIEGKYWNPDIKKMTDDQWRELVRSMHSIGMDVIVLQ EVFRNQDYVGKHDLNINNYQGKAFYPSTLYSGRMPISAKDPIGAILSEADKLGMSVMMGV GMFAWFDFTVESLEWHKQVAKELWDMYGDHPSFYGFYVSEECAGNLYNSESTNEGQMIRK KEIVDFFREFKKYTSQFAPEKPVMLATNSMEILKGADTYPALLQNLDILCPFGFARMPEG DLTGKEAADILQKFCDDAGAHLWFDLEAFLFNPDNSLYPRPIDEIIRDLNLLDNFEKVLC YQYPGVFNNPKMSIRIGEERTIDLYNAYKTYMKKIKADRYNKIK >gi|225935373|gb|ACGA01000019.1| GENE 18 24382 - 25549 1057 389 aa, chain + ## HITS:1 COG:no KEGG:Sde_2927 NR:ns ## KEGG: Sde_2927 # Name: not_defined # Def: beta-1,3-glucanase precursor # Organism: S.degradans # Pathway: not_defined # 129 389 1178 1436 1441 111 30.0 6e-23 MKTITQFIFIILCLSDMIFTSCADGDFGANPYDPNTPITVSQLPKVLSFTPTEGKEGDVI TITGVNFTTATNVAFGGKGASSFEIIDDATIEAVVSSYGGTGAVAVTNHKGTRSLEGFLF IKEIDPTENPNLALNGYATGSAAISGSISNINDDNDKSWWQAAGNDNEWVKIDLGKIYSI NTVVMTWDMNAAGTDCDLMISEDDVNYTTIYSIKNWDAVSNDGINKVSFDNANARYVKLA NMKNSATPYNMTLFEVEIYNTPPAENLALQKIASASSNNTAAFNAVYGNTSFMWQAEGND DEWFKVDLGKIYTINNVVIQWDAGAYAADCEILISQDDVEYTSVYSITGWDSAATEGAQD MNFDNVDARYVKALLKNGATPWRMTIKEF Prediction of potential genes in microbial genomes Time: Fri May 13 06:58:24 2011 Seq name: gi|225935372|gb|ACGA01000020.1| Bacteroides sp. 
D2 cont1.20, whole genome shotgun sequence Length of sequence - 171044 bp Number of predicted genes - 138, with homology - 136 Number of transcription units - 43, operones - 30 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 5.5 1 1 Op 1 . + CDS 117 - 1181 929 ## TTE2411 S-layer-like domain-containing protein 2 1 Op 2 . + CDS 1202 - 4354 2168 ## Dfer_5330 TonB-dependent receptor 3 1 Op 3 . + CDS 4361 - 5953 1015 ## Dfer_5331 RagB/SusD domain protein 4 2 Op 1 . - CDS 6764 - 7813 548 ## BT_3593 hypothetical protein 5 2 Op 2 . - CDS 7841 - 8872 825 ## BT_3594 hypothetical protein 6 2 Op 3 . - CDS 8876 - 10294 1094 ## BT_0446 hypothetical protein 7 2 Op 4 . - CDS 10308 - 11729 779 ## BT_0447 S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase 8 2 Op 5 . - CDS 11736 - 14411 2227 ## BT_0448 hypothetical protein 9 2 Op 6 . - CDS 14430 - 15752 1190 ## BT_0447 S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase 10 2 Op 7 . - CDS 15786 - 17129 1142 ## BT_0449 putative S-layer related protein precursor - Prom 17153 - 17212 5.1 - Term 17180 - 17216 6.5 11 3 Op 1 . - CDS 17238 - 18905 1078 ## BT_0450 hypothetical protein 12 3 Op 2 . - CDS 18922 - 20553 1315 ## BT_0451 hypothetical protein 13 3 Op 3 . - CDS 20567 - 23713 2387 ## BT_0452 hypothetical protein 14 3 Op 4 . - CDS 23736 - 24809 713 ## BT_0435 hypothetical protein 15 3 Op 5 1/0.000 - CDS 24832 - 26079 1011 ## COG2942 N-acyl-D-glucosamine 2-epimerase 16 3 Op 6 . - CDS 26124 - 27527 1233 ## COG0477 Permeases of the major facilitator superfamily 17 3 Op 7 . - CDS 27533 - 28630 845 ## BT_0435 hypothetical protein 18 3 Op 8 . - CDS 28675 - 30549 1261 ## BT_0434 hypothetical protein - Prom 30796 - 30855 9.2 + Prom 30603 - 30662 9.3 19 4 Tu 1 . + CDS 30884 - 32092 356 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 + Term 32272 - 32309 6.6 - TRNA 32310 - 32382 81.4 # Thr CGT 0 0 - Term 32254 - 32303 15.1 20 5 Tu 1 . 
- CDS 32491 - 34533 1452 ## BT_0761 hypothetical protein - Prom 34625 - 34684 10.2 + Prom 34523 - 34582 5.8 21 6 Op 1 . + CDS 34746 - 35327 364 ## COG1636 Uncharacterized protein conserved in bacteria 22 6 Op 2 . + CDS 35355 - 36233 520 ## COG1864 DNA/RNA endonuclease G, NUC1 + Term 36282 - 36317 -1.0 + Prom 36282 - 36341 4.7 23 7 Op 1 . + CDS 36468 - 37505 798 ## COG1559 Predicted periplasmic solute-binding protein + Prom 37507 - 37566 4.0 24 7 Op 2 11/0.000 + CDS 37589 - 39181 1529 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 25 7 Op 3 1/0.000 + CDS 39185 - 39766 623 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 26 7 Op 4 . + CDS 39781 - 41088 1220 ## COG1541 Coenzyme F390 synthetase + Prom 41099 - 41158 2.1 27 7 Op 5 . + CDS 41178 - 41750 532 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 28 8 Tu 1 . - CDS 41737 - 42513 420 ## COG1145 Ferredoxin - Prom 42713 - 42772 4.7 - Term 42806 - 42846 10.1 29 9 Op 1 . - CDS 42872 - 43222 595 ## PROTEIN SUPPORTED gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 30 9 Op 2 . - CDS 43321 - 43518 333 ## PROTEIN SUPPORTED gi|29345834|ref|NP_809337.1| 50S ribosomal protein L35 - Term 43539 - 43576 2.5 31 10 Op 1 16/0.000 - CDS 43588 - 44199 619 ## COG0290 Translation initiation factor 3 (IF-3) 32 10 Op 2 . - CDS 44272 - 46212 2091 ## COG0441 Threonyl-tRNA synthetase - Prom 46234 - 46293 2.1 33 10 Op 3 . - CDS 46295 - 48250 2109 ## COG0457 FOG: TPR repeat - Prom 48309 - 48368 3.5 34 11 Op 1 . - CDS 48402 - 48956 660 ## COG0242 N-formylmethionyl-tRNA deformylase 35 11 Op 2 . - CDS 49002 - 49418 375 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) - Term 49421 - 49471 11.3 36 11 Op 3 . 
- CDS 49493 - 50632 1295 ## BT_0418 outer membrane porin F precursor - Prom 50871 - 50930 8.1 - Term 50875 - 50933 11.6 37 12 Op 1 . - CDS 50952 - 51920 740 ## BT_0417 hypothetical protein 38 12 Op 2 . - CDS 51946 - 53052 1087 ## BT_0416 hypothetical protein - Prom 53072 - 53131 4.5 - Term 53063 - 53118 5.5 39 13 Op 1 18/0.000 - CDS 53135 - 54595 1591 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 40 13 Op 2 8/0.000 - CDS 54608 - 55516 805 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 41 13 Op 3 1/0.000 - CDS 55524 - 56132 570 ## COG0529 Adenylylsulfate kinase and related kinases 42 13 Op 4 . - CDS 56145 - 57698 1301 ## COG0471 Di- and tricarboxylate transporters 43 13 Op 5 . - CDS 57710 - 58534 673 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase - Prom 58661 - 58720 5.3 - Term 58740 - 58785 -0.0 44 14 Tu 1 . - CDS 58786 - 59280 602 ## BT_0410 hypothetical protein - Prom 59313 - 59372 3.4 - Term 59304 - 59358 0.8 45 15 Tu 1 . - CDS 59396 - 60166 812 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity - Prom 60263 - 60322 7.5 + Prom 60280 - 60339 8.0 46 16 Op 1 . + CDS 60367 - 60576 407 ## BF1659 hypothetical protein 47 16 Op 2 . + CDS 60609 - 60971 310 ## BT_0407 hypothetical protein 48 16 Op 3 . + CDS 60975 - 61232 272 ## BT_0406 hypothetical protein + Term 61316 - 61353 -1.0 - Term 61081 - 61130 1.5 49 17 Tu 1 . - CDS 61239 - 62792 1264 ## BT_0374 hypothetical protein - Prom 62985 - 63044 6.8 - Term 63041 - 63075 3.0 50 18 Op 1 11/0.000 - CDS 63187 - 63759 404 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 51 18 Op 2 16/0.000 - CDS 63768 - 64886 917 ## COG1088 dTDP-D-glucose 4,6-dehydratase 52 18 Op 3 . - CDS 64883 - 65785 751 ## COG1209 dTDP-glucose pyrophosphorylase 53 18 Op 4 3/0.000 - CDS 65790 - 66506 443 ## COG1922 Teichoic acid biosynthesis proteins - Prom 66526 - 66585 1.8 54 18 Op 5 . 
- CDS 66587 - 67822 588 ## COG0438 Glycosyltransferase - Prom 67842 - 67901 5.6 55 19 Op 1 . - CDS 67917 - 68792 261 ## Spro_0558 acyltransferase 3 56 19 Op 2 . - CDS 68801 - 69760 545 ## BF2802 putative acyltransferase transmembrane protein 57 19 Op 3 . - CDS 69772 - 70686 368 ## BT_2147 O-acetyl transferase 58 19 Op 4 11/0.000 - CDS 70695 - 71792 706 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 71867 - 71926 5.7 59 19 Op 5 7/0.000 - CDS 72067 - 72684 409 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 60 19 Op 6 8/0.000 - CDS 72693 - 73256 431 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 61 19 Op 7 . - CDS 73253 - 74410 775 ## COG0438 Glycosyltransferase 62 19 Op 8 . - CDS 74420 - 75964 713 ## BLA_0592 polysaccharide biosynthesis protein 63 19 Op 9 . - CDS 75999 - 76367 396 ## Gmet_1497 glycosyl transferase family protein 64 19 Op 10 . - CDS 76370 - 77425 430 ## COG0438 Glycosyltransferase - Prom 77453 - 77512 1.8 65 20 Op 1 . - CDS 77581 - 78855 619 ## gi|260170823|ref|ZP_05757235.1| hypothetical protein BacD2_03077 66 20 Op 2 . - CDS 78870 - 79841 418 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 67 20 Op 3 . - CDS 79841 - 80806 509 ## Patl_2044 hypothetical protein 68 20 Op 4 . - CDS 80851 - 81831 456 ## Fisuc_0084 hypothetical protein 69 20 Op 5 . - CDS 81870 - 82751 466 ## PRU_1758 group 2 family glycosyltransferase 70 20 Op 6 . - CDS 82763 - 83686 525 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases 71 20 Op 7 . - CDS 83689 - 84642 502 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 72 20 Op 8 . - CDS 84648 - 85136 108 ## GFO_0543 hypothetical protein 73 20 Op 9 . - CDS 85105 - 85548 293 ## CA2559_07240 xanthan biosynthesis pyruvyltransferase GumL 74 20 Op 10 . - CDS 85554 - 87038 223 ## BVU_2944 putative transmembrane protein 75 20 Op 11 . - CDS 87115 - 88095 513 ## BMD_2936 hypothetical protein 76 20 Op 12 . 
- CDS 88174 - 89487 1429 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 77 20 Op 13 2/0.000 - CDS 89524 - 91947 2039 ## COG0489 ATPases involved in chromosome partitioning 78 20 Op 14 2/0.000 - CDS 91972 - 92733 504 ## COG1596 Periplasmic protein involved in polysaccharide export 79 20 Op 15 . - CDS 92798 - 94204 997 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 94231 - 94290 1.8 80 20 Op 16 . - CDS 94355 - 94504 84 ## gi|160883759|ref|ZP_02064762.1| hypothetical protein BACOVA_01731 - Prom 94529 - 94588 11.0 + Prom 94573 - 94632 1.8 81 21 Op 1 . + CDS 94656 - 95285 437 ## BT_1637 hypothetical protein 82 21 Op 2 . + CDS 95315 - 97147 1324 ## BT_1703 hypothetical protein + Term 97217 - 97273 9.8 83 22 Tu 1 . - CDS 97281 - 97499 193 ## BT_1704 hypothetical protein - Prom 97700 - 97759 6.3 + Prom 97572 - 97631 5.7 84 23 Op 1 . + CDS 97720 - 98208 634 ## BT_1705 hypothetical protein 85 23 Op 2 . + CDS 98249 - 98356 120 ## 86 23 Op 3 . + CDS 98361 - 98786 249 ## COG3023 Negative regulator of beta-lactamase expression 87 24 Op 1 . - CDS 98779 - 99528 695 ## BT_0613 putative membrane protein involved in polysaccharide export 88 24 Op 2 . - CDS 99556 - 100134 491 ## BT_0596 putative transcriptional regulator - Prom 100287 - 100346 3.9 - Term 100237 - 100287 1.3 89 25 Tu 1 . - CDS 100469 - 101434 584 ## COG4974 Site-specific recombinase XerD - Prom 101516 - 101575 8.0 - Term 101562 - 101619 2.3 90 26 Tu 1 . - CDS 101642 - 102613 935 ## COG1482 Phosphomannose isomerase - Prom 102668 - 102727 4.3 - Term 102634 - 102676 4.0 91 27 Op 1 . - CDS 102754 - 104070 1144 ## BF1621 hypothetical protein 92 27 Op 2 . - CDS 104078 - 104659 620 ## BF1610 hypothetical protein 93 27 Op 3 . - CDS 104675 - 106822 1607 ## BDI_2678 hypothetical protein 94 27 Op 4 . - CDS 106852 - 108828 1449 ## PG2135 putative lipoprotein 95 27 Op 5 . - CDS 108837 - 110282 934 ## PGN_0183 minor component FimC 96 27 Op 6 . 
- CDS 110294 - 111151 562 ## gi|260170854|ref|ZP_05757266.1| hypothetical protein BacD2_03232 97 27 Op 7 . - CDS 111194 - 112411 1043 ## gi|260170855|ref|ZP_05757267.1| hypothetical protein BacD2_03237 - Prom 112468 - 112527 6.7 98 28 Op 1 . + CDS 112730 - 113827 372 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 99 28 Op 2 . + CDS 113879 - 115219 1601 ## COG0738 Fucose permease 100 28 Op 3 . + CDS 115267 - 116421 1372 ## COG0153 Galactokinase + Term 116448 - 116495 14.2 - Term 116434 - 116481 12.1 101 29 Op 1 . - CDS 116541 - 118523 1877 ## COG3534 Alpha-L-arabinofuranosidase 102 29 Op 2 . - CDS 118550 - 120070 1200 ## COG3507 Beta-xylosidase 103 30 Tu 1 . + CDS 120247 - 124518 2743 ## COG0642 Signal transduction histidine kinase + Term 124638 - 124687 4.1 + Prom 124541 - 124600 3.2 104 31 Op 1 . + CDS 124703 - 124798 62 ## 105 31 Op 2 . + CDS 124819 - 128028 2619 ## BVU_1005 hypothetical protein 106 31 Op 3 . + CDS 128068 - 129819 1531 ## BVU_1006 hypothetical protein 107 31 Op 4 . + CDS 129852 - 131633 1267 ## BVU_1007 hypothetical protein 108 31 Op 5 . + CDS 131668 - 133596 1564 ## COG3507 Beta-xylosidase + Term 133618 - 133668 15.0 + Prom 133667 - 133726 4.8 109 32 Op 1 . + CDS 133767 - 134906 400 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 110 32 Op 2 . + CDS 134938 - 136683 1721 ## COG4146 Predicted symporter 111 32 Op 3 . + CDS 136693 - 137370 709 ## COG1051 ADP-ribose pyrophosphatase 112 32 Op 4 5/0.000 + CDS 137376 - 138062 784 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 113 32 Op 5 . + CDS 138115 - 139641 1542 ## COG2160 L-arabinose isomerase + Term 139652 - 139712 1.2 + Prom 139660 - 139719 10.7 114 33 Tu 1 . + CDS 139744 - 141339 1309 ## COG1070 Sugar (pentulose and hexulose) kinases + Prom 141345 - 141404 3.3 115 34 Op 1 1/0.000 + CDS 141453 - 143855 1777 ## COG3533 Uncharacterized protein conserved in bacteria 116 34 Op 2 . 
+ CDS 143922 - 145466 1667 ## COG3534 Alpha-L-arabinofuranosidase + Term 145529 - 145577 2.5 + Prom 145517 - 145576 3.3 117 35 Op 1 . + CDS 145609 - 147618 2344 ## COG0021 Transketolase 118 35 Op 2 . + CDS 147618 - 148052 538 ## COG0698 Ribose 5-phosphate isomerase RpiB + Term 148075 - 148125 11.3 - Term 147990 - 148024 2.5 119 36 Tu 1 . - CDS 148162 - 148644 475 ## COG2839 Uncharacterized protein conserved in bacteria - Prom 148866 - 148925 12.0 + Prom 148654 - 148713 4.4 120 37 Tu 1 . + CDS 148923 - 149360 266 ## BT_4511 hypothetical protein + Term 149370 - 149413 -0.4 + Prom 149446 - 149505 4.1 121 38 Op 1 1/0.000 + CDS 149587 - 150156 266 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 122 38 Op 2 . + CDS 150226 - 151140 473 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 151159 - 151197 9.2 123 39 Op 1 . - CDS 151250 - 153451 1786 ## COG3537 Putative alpha-1,2-mannosidase 124 39 Op 2 . - CDS 153469 - 155736 1641 ## BVU_0137 delta\-4\,5\ unsaturated\ glucuronyl\ hydrolase - Prom 155774 - 155833 5.8 - Term 155759 - 155826 23.2 125 40 Op 1 . - CDS 155855 - 157570 1339 ## BVU_0132 glycoside hydrolase family protein 126 40 Op 2 . - CDS 157601 - 159424 1548 ## BVU_0134 hypothetical protein 127 40 Op 3 . - CDS 159436 - 162762 2428 ## BVU_0135 hypothetical protein - Prom 162868 - 162927 4.6 128 41 Op 1 6/0.000 - CDS 162929 - 163906 745 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 163926 - 163985 5.4 129 41 Op 2 . - CDS 164031 - 164579 461 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 164724 - 164783 4.1 + Prom 164475 - 164534 3.1 130 42 Op 1 . + CDS 164565 - 164846 97 ## gi|260170887|ref|ZP_05757299.1| hypothetical protein BacD2_03397 131 42 Op 2 . + CDS 164834 - 165115 265 ## BT_0334 hypothetical protein 132 42 Op 3 . + CDS 165149 - 165376 245 ## BT_0333 hypothetical protein 133 42 Op 4 . 
+ CDS 165388 - 166470 1373 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 134 42 Op 5 . + CDS 166478 - 166651 181 ## BF1647 hypothetical protein 135 42 Op 6 22/0.000 + CDS 166664 - 167428 736 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 136 42 Op 7 . + CDS 167449 - 167991 619 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit + Prom 168012 - 168071 2.5 137 43 Op 1 . + CDS 168100 - 170157 1657 ## BT_0328 hypothetical protein 138 43 Op 2 . + CDS 170175 - 171014 807 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain Predicted protein(s) >gi|225935372|gb|ACGA01000020.1| GENE 1 117 - 1181 929 354 aa, chain + ## HITS:1 COG:no KEGG:TTE2411 NR:ns ## KEGG: TTE2411 # Name: not_defined # Def: S-layer-like domain-containing protein # Organism: T.tengcongensis # Pathway: not_defined # 6 343 37 369 1321 174 34.0 6e-42 MDVVYDNVMTYQLSYIRSLQLSSGAIKDKDTSNSKITPYFAHFAVLALLKDPTNQNIEAV KKYIKWYLGKLNGITNPYKGGAEIEGSVYDYFGDSETTQGTYDSVDSYAATFLEIIMEFA QTSRENKSWLEQYKSKISLVASAMVKTIDTDFNQLPGTITEDENDGLSVASYVYDVKYLM DNCEVNIGLKAAKWLKDNNLIDNDYDFAGLLAKNTEGIGTLWNGTTYDWWKSNSVTTDWK QFYPDATAQLYPGMFNVIEPNSQTANKLYTQFNKTYTSWASGVTYSSFPWTIIAYAAATI NDAVRVNTYLKHINSLNIKGQQKTNWYSMEAGCIVLAIDRIKNPVVDSEYTPIN >gi|225935372|gb|ACGA01000020.1| GENE 2 1202 - 4354 2168 1050 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5330 NR:ns ## KEGG: Dfer_5330 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 36 1050 166 1177 1177 1066 52.0 0 MKNYNNENKSRSLILTLCMVIFLLSPSVLFAQKQNITGKVIDDQGQPVIGATVVLKDTPT GVITNMDGEFVIASPTKENTITVSMIGYNSVTVKAVAGQRVSITLIEKTYELDGLVVIGY GSIAKKDITGAVGVVNGKELKDMPVSSINNVLQGKVPGLTVTSSSGTPGAGSVTHIRGIG SITGSTTPLYVVDGLPQTGIDYLNPNDIESIAVHKDASVAAIYGSRGANGIIIITTKSGS LENKAQVSYDGYVGWQTPWKRPHMLNAADYITYKNLAADNAGQERVAAFATQENIDAVLN 
FVNKNSGPNGTDWWKEIINEGAFMHNHNISVNGGTKNIGILSSLAYTDQDGIVKGSNYKR ISWRNNFNMKISKRVSLTANVGIIDEKRQLIDENNPATGTVFSAMGADPITPVFRNNLVD VPIFLSQIYNGYEPNNLYSQYSGILFSNKRNPVGQIQRMRQSTYEYLYVKAGANLEIKLL DFLKFNSRFGMDISRSLTDGFQPKYTLNANDYANENTVIQDNSRSNYFVWEQTLSYDQTF GKFKIGALVGTSAEETSVSSVNASIQGVINNDKDMAILNAGTTNPAVSGYPYDNSMLSFF GRVSLDYDSKYLIAANLRRDGSSKFADGHKWGTFPSVSAAWRFSSESFMESTRNWLSDAK LRVSYGLIGNQNISGGGYASTYGSTIYDRYSFGSPNTASIGAGIITVGNPVLKWETSKQF DIGLDLSLFNNSIEIVADYFRKNIDDMLMQEPQPTTLGFPNFPYANVGSMRNVGWELGVT YRKTWGDLNFTASANISTYKNEVTSLGNGDAIYGTAYNNNTITKTEVGQPVGYFYGYVTN GIFQNAEQVEGSAQRETAKPGDIRYKDLNNDDVLDDNDRTKIGDPWPDFVYGLTLGASWK GFDFNMFLQGSQGNDVMNMTLYDFESGTGYMNAHSDFLKRAWNGEGSTDKYHRISADQGQ NQLISDYFIEDGSYFRIKNLQIGYNFCDRLLKFKGISYLRLYVSVQNLLTLTKYSGLDPE IGSSNATLNGIDQGFYPQARTWTIGLNVKF >gi|225935372|gb|ACGA01000020.1| GENE 3 4361 - 5953 1015 530 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5331 NR:ns ## KEGG: Dfer_5331 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 529 9 541 541 595 54.0 1e-168 MKNIKYLFVSLSLISLFSCSDFLDRDPKGVMDQDRFFLSPDAGYSAVVKCYKTLNDVAGY EAPRMDLYNISTDDSEKGGSDAGDRTFAGDLSFGRALASNTDLVNLWSNMYNGIARCNIC LENIPNKPLVDADGYPLSNDVKARYIGEVKFLRAFFYFELCKIFGGVPLVDRTLTVDDSK KLVRATEAETADFIMKDLSEAALESNIPDKATLPTTELGRVTKEAVWAMQARVYMYFAKD NPSYYADARDAAKKVIDSKSCELAPNFQSLYLSDNYMLSESIFPNIRGDVPNDHIYGSFI PVYSNPRSCGAYGFDQPTQNLVDEFEEGDPRLLYTIIEPGDKFPKEKGSETLDFSTYPNT GYHSRKAFLVVSRRGNGWGDDAWTFHIIRYADVLLLYAEALVQTNGDKDEIVKYINMVRD RANDSRSGDIEATSRVLIIPNIKLKPVTINDDLLAAVKHERRVELAMEYNRLYDLKRWNC YVETMNTFSTYSYAKGRGAAFKKGINELFPIPQVEIDRSGGSIKQNPGYN >gi|225935372|gb|ACGA01000020.1| GENE 4 6764 - 7813 548 349 aa, chain - ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 349 22 357 358 497 70.0 1e-139 MLCLLGIQLQAQQPMYYDASRFPLLGKATQDTGTRYERLPDSLKNISRPPLWNLSRNSAG MAIRFRSNSTQIAVKWESLFNNHMNHMTDVGIKGLDLYCWEGNKQWRFVNSARPSGKTNQ 
ATIIANMLPTEREYMLYLPLYDGLVSLSIGVDSLATIDQPLIDYPIRKKPVVFYGTSILQ GGCASRPGMAHTNIISRRLNRECINLGFSGNAFLDLEVAKVISEVDASVFILDFVPNASV EQMKERMETFYRIIRSKHPDTPIIFIEDPIFTHTRYDERIAKEVQRKNDTLKEIFNRLKK EDEKNIIFISSKNMLGEDGEATVDGVHFTDLGMTRYADLVCPIIKKAIK >gi|225935372|gb|ACGA01000020.1| GENE 5 7841 - 8872 825 343 aa, chain - ## HITS:1 COG:no KEGG:BT_3594 NR:ns ## KEGG: BT_3594 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 341 219 531 536 87 26.0 9e-16 MLIKRLYFIATALLTFSICSCGESYPDHPWEWDQETEEEESNPDIVSQGWTAVNNFGNLP DYIHIYKSPKNLAAKKAIAYIAVADMSKAKFEVLGDVAFSQEANGYGGKSLYTPSEFYEA SKAPVIINGGLFFYSAGFYYSQNLVIREGQLLAPNQNYYSKDWVTMWYPTLGAFCQMKDG TFQTTWTYQASDGINYCYPAPAENDINKDPLQAPSSAFPSGAKALEATTGIGGVTVLLRA GEIKNTYVEEMLDISAASNQPRTAIGVTTDKKMIIFVCEGRNMTEGVAGLTTANVAKVMK DLGCTEALNLDGGGSSCMLVNGKETIKGSDGKQRKVLTAVGIK >gi|225935372|gb|ACGA01000020.1| GENE 6 8876 - 10294 1094 472 aa, chain - ## HITS:1 COG:no KEGG:BT_0446 NR:ns ## KEGG: BT_0446 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 472 2 469 469 457 56.0 1e-127 MNNNNTRASSLDALRGYAILTMVLSGSIAWGVLPGWMYHAQVGPRSNFVFDGSIYGITWV DLVFPFFLFAMGAAFPFSIGNKYRKGSSRRKIIYDSFLRGFRLTFFAIFIQHIYPWVVSS PQDVRSWLIALGGFALMFPMFMRIPFKMPEYVRMLIQILAYATGVFMLLNIHYADGRTFS LGYSNIIILVLANMAIFGSLIYTLTINKPWIRIAILPFIMAVFLGNENAESWNHSLMTFS PLPWMYQFSYLKYLFIVIPGSIAGEYLQEWLYSKNTVTINDNKDEHKRIPWILLLTIGLI VLNLYGLYMRYLLLNLAGTTAILCILYILLQVEGKNTGYWYKLFKAGTYLILLGLAFEAY EGGIRKDPSTYSYYFLASGLAFMAMIAFSIMCDIYSWTRLTRPLEYAGQNPMIAYVSTQL VVLPLLNLAGLGTYLSYLDQNAWLGFLRGVIVTSLALLITILFTKLKCFWRT >gi|225935372|gb|ACGA01000020.1| GENE 7 10308 - 11729 779 473 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 472 6 472 884 721 72.0 0 MKYIIYHLWVISLLFPLGMQAKVRLTSIWGDNMVLQQQSEVTFHGKASANSKITIEASWD 
HRKLTTRSDQHGNWDIQLPTPAAGGPYTITFSDGETLTLKNILLGEVWFCSGQSNMEMPV RGFRGQPVYGSQPYIVSADSKRELRLFTVKRDWSTTPKEEGMTGYWSELSPKEVGDFSAT AYFFGDLLQRSLDIPVGLIHCSWSASKIETWMNKETLSQFPEVQLPDIKQDKFDWPAGTP TLLWNAMVNPWKGFPIKGVIWYQGESNSANPALYKKLFPAMVIQWREFFNNPAMPLYYVQ ITPWQAEGKDKLDRAWFRQCQLELMQEVPNVGMVTTTDAGSEKFIHPPYKIKVGQRLAYW ALAKTYGIEGFLYAGPLYKSHQLKKNVVEITFEHGSDGLIPENQKLKGFELVDEKGNIVP AEAEIINGSARVKVWNDSVPHPVEVRYCFRNYMEGDLFNNAEIPASPFRIVIH >gi|225935372|gb|ACGA01000020.1| GENE 8 11736 - 14411 2227 891 aa, chain - ## HITS:1 COG:no KEGG:BT_0448 NR:ns ## KEGG: BT_0448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 438 889 27 474 477 525 54.0 1e-147 MNMKKLSIISLLLWIGIVAFASGKPKVMWLDCSANFQRFSYPDSIRYYVDKCHEAGITHL VLDIKDNTGEVLYPSKYAIQKKNWKNFDRPDFDFINTFIEAAHTHNMIIFAGMNIFADGQ NIVKRGAVFDKHKKWQAINYVPRKGLLPVTEIDGKPTMFLNPALKEVQKYEIDIIKEVVR NYAFDGIMLDRARYDCIDSDFSPESKKMFEKFIGKKVERFPEDIFEWRPNAEGGIDRVGG PYYHQWITWRTSVIYNFIKDVRTSIKKIKPECMLAAYTGAWYPTYFEVGVNWASRQYDVS KDFSWATPDYKDYGFAELLDFYTNGNYYWNVTLDDYYKSSGKLKNETDSEFSTGEYLCVE GGCKYSKYLLKDAVPVCGGLYVEDYKRDVNQFQKAVRMNLKESDGVMIFDIVHIIRNGWW GELKEALDETKMDEARMIKGTVTCDGKGIANVVVTDGLRCVTTDKNGIYHLPSLGNTRFV YITTPAGYLTDCEQTIPKFYQEIDMNKTNEYNFRLKKNPKDDKKHLFVLEADVQAGLKEH WDLYAPIIDDYKQLINQHSDRDVFGLNCGDIFWDTPATFFPPYIDKAKKLDIPIYRAIGN HDIDCNGATHETSYRTFEGYFGPAHYSFNKGNAHYIVINNNFYVGREYFYIGYVDETTFK WLKEDLSYVPKGTLVFFMTHIPTRITEQKRPFNYDYAMLAGETINAEAVHQLLDGYETHF LTGHLHSNSNIIFNNHQMEHNTAAVCGIWWHADVCIDGTPQGYGVYEVDGNQVKWYYKSA GHPKEYQFRSYAAGISKEFPKDIIANVWNWDKNWKVEWLENGKVMGLMTQYTGVDPYAHQ VCTDKKRTMQSWISAASTGHMFRATPRNPKAKIEIRVTDRFGKVYQQNVSK >gi|225935372|gb|ACGA01000020.1| GENE 9 14430 - 15752 1190 440 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 439 471 884 884 671 76.0 0 MRKITIILFSSLLIIALGACKSATPKVENNKPALMWFDAEANFERFSNPDSIDYYLAKIK 
SLGFTHAIVDVRPITGEVLFDTEFAPKMREWQGYERKDFDYLGHFIKKAHELGIQVHASL NVFVAGHNYFDRGLVYSTHLEWASIVYTPEGITSITNEKKKYSAMVNPVNEEFQTHILNV LKDLVKRYPDLDGLMLDRVRYDGITADFSALSRQKFEAYIGQKVEKFPEDIFEWKKDEND KYYPERGKHFLKWIEWRTKNIYDFMARTRNEVKIVNPDISFGTYTGAWYPSYYEVGVNFA SKKYDPSQDFDWATPEYKNYGYAELMDLYTTGNYYTDITIEESLKNKKSVWNETDSQAQS GTWYSVEGSCQKLRNIMKDNKFMGGILVDQFYDNPAKLSATIEMNLKASDGLMVFDIVHI INKGLWKEVENGMRAGGALK >gi|225935372|gb|ACGA01000020.1| GENE 10 15786 - 17129 1142 447 aa, chain - ## HITS:1 COG:no KEGG:BT_0449 NR:ns ## KEGG: BT_0449 # Name: not_defined # Def: putative S-layer related protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 447 19 449 453 435 52.0 1e-120 MGGIAACLCMACGGNDSKDYWGDTSGGGDKEPTENPNASKPRYIWIDAAANFPDFANSKE NIARDLALAKDAGFTDIVVDVRPTTGDVLFKTNLVDQVKFMYAWIGSNYTKVERTATWDY LQAFVDEARKQGLRIHAAINTFVGGNQIDGGTGLLYRDQSKAEWATQMNMQGGITSVMNT SESTKFFNPAHPEVQTFLCDLLKDLAGYDLDGIFLDRGRFLNLQTDFSDESRKQFEAYMG GIKIQNYPEDILTPSASSLPTTYPKYMTKWLEFRAKVIYDFMQKARTAVKGVNSKIKFGV YVGGWYSTYYDVGVNWAASTYDTSRYYNWATSKYKNYGYAACMDQILIGAYASPLKVHGT TEWTMEGFCLLAKDKIKGECPMVAGGPDVGNWDTNNQATQEQENQAIVQSVKACMNVCDG YFLFDMIHLKKADQWQYAKEGIKLAIE >gi|225935372|gb|ACGA01000020.1| GENE 11 17238 - 18905 1078 555 aa, chain - ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 554 16 564 565 243 34.0 2e-62 MKNKILSIVSVIVFLMGMSACHSPEDFNPTVERNIINNLTASFPDDDSDDNSFTSEIDYT NHVITVVFPYNYPKLSDNVLQLSDLKKMRVIADLDNNVYVYPTLLFMDMTKDNFITVKEQ TGTSTEYKVVAEIRKSNECAITKFNITSLGLSGIINESTKTISLISLESIGEVLADVSIS HGATISPDPTTVALDYDQDQKVTVTAQNGTTNSTYTIKKEIPEKIAAGLRANSAKLIWAK KLTDIGISSSDMTTGIAVTNDYVVINERNKNPVYLHAKNGEKAGTMNISFVGSTTNFYVT SDKDGNILFTNLTPNAGPAFTVWKASGVDQVPEKYIEFTTSLAMGRKLSVCGSLNSNAII TAPIYATSGQFARWQVENGVLKSQTPDIITAQGIGGWGTNADVIYSDPTDPTSDYYTTFY AEPRNLCWMDGKTNTIKNKGVALDGNYVPNAVDYTVFNKVGYVASNLVNPWSWGTADNIY LFDLSSGSLGTQAIDFGSTGLGINGNYGSKILGNTNGNYCGDVVLKVSDDGYYMYVYFMF CNGYVGCVQCDCIDM >gi|225935372|gb|ACGA01000020.1| GENE 
12 18922 - 20553 1315 543 aa, chain - ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 543 1 553 553 391 42.0 1e-107 MKKILLNISLALALGFSITSCNDWLDGVKQTSTVSDEIVWQDEAYVDKYVNAFYTYLNKY GQFGESQFSGSLTESLTETFKYGSVALGNRAGHPNNYVTNPDVISPDGCLYNIWDKDLAY GNIRQMNQFLVMQKKYSTFPDDKNLKWEAQVRFFRAYIYFQLAKRHGGVILYDDLPTSND KARSTAAETWQFIADDLDFASTNLPKEWDAANKGRVTKGAAYALKSRAMLYAERWQDAYD AADEVEKLQLYDLVDNYADAWKGNNKEAILEFDYNKDSGPNHTFDQYYVPQCDGYDFGAL GTPTQEMVESYEDKNGNKVDWTKWHGTTTEEPPYDQLEPRFAATVIYRGCTWKGRVMDCS VGGINGAFMAYREQSYSYGKTTTGYFLRKLLDEKLIDVKGTKSSQAWVEIRFAEVLLNKA EAAYRLNKIGEAQSLMNRVRGRQGVNLPGKSSSGEAWFNDYRNERKVELAYEGHLFWDMR RWKLAHIEYNNYRCHGLKITNGTYEYIDCDGQDRKFPQKLYVLPVPTSEIKNNALIEQYD EWK >gi|225935372|gb|ACGA01000020.1| GENE 13 20567 - 23713 2387 1048 aa, chain - ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 1048 7 1074 1074 835 45.0 0 MKKRIKWNPRFITSLFIAFMCTFNCFAQNVIKGSVNDGSGEPLPGVSVAVKGTTSGTITD MDGKFSINARNNDVLIFSYIGMTTQDIKVNNQRFISVTMKDDVASLDEVVVVGYGTQKRG SITGAISTVSDKELLKAPTMSISNIVGARIAGVAAVQSSGQPGSDNATLTMRGQTGIVYV IDGIRRTAADFNGLDPNEIESVSILKDASAVAVYGLDANGVFIVTTKQGKAEKMSITYTG SVGFSQNAEQQEWLDGPGYAYWYNKARELQGDSQVFTAEMVQKMKDGVDGWGNTNWYDKI YGTGTRTHHNISATGGSDRVRFFASIGYLDEKGNIDNYGYNRLNLRSNIDAKLTKSLTFT LGVSGRIEKRDSPRYSANPDDWHNVPQQVIRALPYVPATYEYEGKVYNVSTPTASSPVAP LASINESGYSKSNYSYIQSNFSLKYDAPWLKGLSFKFQGAYDLTYNFSKILTNPYEVMIM NLPTIESTSLTYYKGTDASGNSISLSESASRGYTFTTQSSVTYDNKFGKHTIGAMFLAET RENKSNALSATGYGLDFIQLDELNQITNKTGNGTEKFPTIGGYSGHTRVAGFVGRVNYNY DDKYYLEASLRYDGSYLFGGMNKRWVTLPGVSAGWRINNESWFNAPWINNLKLRAGIGKT ATSGVSAFQWRNTMGITKNAVIIGGSSQTMMYASVLGNPNLSWAQCLNYNVGVDATMWNG LLGVEFDVFYKYEYDKLATVTGAYAPSRGGYYFSSANVNKTDYKGFDITLTHYNHINKFN YGAKLIWSYAYGRWLKYVGDSDNAPEYQRLTGKQIGSKYGFIANGLFQSEEEIANSATVK GYKALPGYIKYVDRNGDGIITAAQDQGYVGKSATPKHTGSLNLFGNWKGFDFDLLFSWGL 
GNVVALTGQYTASGSVGTQDNTSFTKPFYHGGNSPTFLVENSWTPDNTNAEFPRLEISGP SNNNGFSSTFWYRNGNYMRLKTAQIGYNFPKKWLTPAGIEGLRLYVEGYNLLTFSGLTKY NIDPESPAVNNGYYPQQRTYTLGVKLTF >gi|225935372|gb|ACGA01000020.1| GENE 14 23736 - 24809 713 357 aa, chain - ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 355 4 363 366 371 54.0 1e-101 MDGRRNFLKTAFATCSLLASSKLLTACTPIKKMTSDEEETLKTFATPNALKGMPIKATFL DEISWDIPHQNWGTKEWDKDFKAMKKMGINTVVLIRAGLGRWIASPFQCLLGKEDVYYPP VDLVEMFLTLADKYNMAFYFGMYDSGKYWQEGLFQKEIDLNMALIDEVWKKYGHHKSFQG WYLSQEISRRTKNMSKIYAEMGKHAKSVSDNLKTLVSPYIHGVKTDQVMAGDKALSVKEH EEEWDEILDNIKGAVDIMAFQDGQVDYHELYDYLVINKKLADKYNMKCWTNIESFDRDMP IRFLPIKWEKLLLKLDAARRAGMENAITFEFSHFMSPNSAYLQAGHLYERYCEHFIL >gi|225935372|gb|ACGA01000020.1| GENE 15 24832 - 26079 1011 415 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. 
PCC 7120 # 30 415 4 387 388 471 57.0 1e-132 MNIKNNLGHSADISLTAELSIPINGNNIIDFKKLARLYKDELLDNVLPFWLEHSQDNEYG GYFTCLDRDGRVFDTDKFIWLQGREVWMFSMLYNKVEKRQEWLDCAILGGEFLKKYGHNG NYNWYFSLDRSGRPLVEPYNIFSYTFATMAFGQLSLATGSQEYADIAKKTFDIILSKVDN PKGKWNKLHPGTRNLKNFALPMILCNLALEIEHLLEPGYLEQTMETCIHEVMEVFYRPEL GGIIVENVDMDGNLVNCFEGRQVTPGHAIEAMWFIMDLGKRLNRPELIEKAKETTLTMLN YGWDKEYGGIYYFMDRNGCPPQQLEWDQKLWWVHIETLISLLKGYQLTEDKRCLEWFKKV HDYTWKHFKDKDHPEWYGYLNRRGEVLLPLKGGKWKGCFHVPRGLYQCWTVLENL >gi|225935372|gb|ACGA01000020.1| GENE 16 26124 - 27527 1233 467 aa, chain - ## HITS:1 COG:CAC1339 KEGG:ns NR:ns ## COG: CAC1339 COG0477 # Protein_GI_number: 15894618 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 9 463 13 453 469 283 37.0 6e-76 MKSTINFGYLIFLSVVAALGGFLFGYDTAVISGTIAQVTQLFQLDTLQQGWYVGCALVGS IVGVLFAGILSDKLGRKLTMVISAVLFSTSALGCALCADFTQLVIYRIIGGVGIGVVSIV SPLYISELAVAQYRGRLVSLYQLAVTVGFLGAYLVNYQLLAWAESGTQLSVDWLNKVFIT EVWRGMLGMETLPAILFFIIIFFIPESPRWLIVRGKELKAINILEKIYNSITEAKSQLKE TKSVLTSETKSEWSLLMKPGIFKAVIIGVCIAILGQFMGVNAVLYYGPSIFENAGLSGGD SLFYQVLVGLVNTLTTVLALVIIDKVGRKKLVYYGVSGMVVSLILIGLYFLFGDSLGVSS LFLLVFFLFYVFCCAVSICAVVFVLLSEMYPTKVRGLAMSIAGFALWIGTYLIGQLTPWM LQNLTPAGTFFLFALMCVPYMLIVWKLVPETTGKSLEEIERYWTRSE >gi|225935372|gb|ACGA01000020.1| GENE 17 27533 - 28630 845 365 aa, chain - ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 366 366 665 87.0 0 MKTLDRRDFLKKATLAGASALAVPTLFESCTSKASASTVVVADGLESKLIVPKNNGLKIT GTFLDEISHDIPHQNWGEKEWDMDFQHMKNIGIDTVIMIRSGYRKFVTFPSPYLLKKGCY MPSVDLVDMFLRLAEKYGMKFYFGLYDSGKYWDTGDMTWEVEDNKYVIDEIWKNYGSKYK SFGGWYISGEISRATKGAIGAFHTLGKQCKDVSNGLPTFISPWIDGKKAIMGTTKMTRED AVSVQQHEKEWDEIFDGIHDVVDACAFQDGHIDYDELDAFFSVNKKLADKYGMQCWTNAE 
SFDRDMPIRFLPIKFDKLRMKLEAAKRAGYDKAITFEFSHFMSPQSAYLQAGHLYDRYKE YFEIK >gi|225935372|gb|ACGA01000020.1| GENE 18 28675 - 30549 1261 624 aa, chain - ## HITS:1 COG:no KEGG:BT_0434 NR:ns ## KEGG: BT_0434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 624 2 627 627 950 71.0 0 MRNYFLGLCLLFALCFTACSRSDDSVDVLIIGGGASGVTAGIQSARMGAETLIVEETEWL GGMLTSAGVSAVDGNYDLPAGLFGEFRGHLADYYGGLDSLKTGWVSGVLFEPSVGNKIFH EMVGVEKNLKVWHNATLVKLERKDDSWIAQIQMKDHTIKKVCAKILIDGTELGDIAKMCG VKYDIGMESRHDTKEDIAPEEKNNIIQDITYVAILKDYGKDMTIPCPEGYNKDEFACACS NPVCITPKEQNRVWSKEMMITYGKLPNNKYMINWPIEGNDYYINLLEMTREEREEALKFA KHYTMCFVYFLQHELGFNTLGLADDEYPTADKLPFIPYHRESRRIHGLVRFNLNHVCEPF NQSQPLYRTCIAVGNYPVDHHHTRYHGYEELPNLYFHPIPSYGLPLGTLIPKDVEGLIVA EKSISVSNIINGTTRLQPMVMQIGQAAGALAALAVKENKSISEVSVREVQNAILDAKGYL LPYLDVAIGHPMFKSLQRVGSTGILKGIGKSVDWANQMWFRADTLLLANELEGLGDVYPF VNKQIIEGNNTISIQKATEMIGEIAEKEGIEIKEGRVEEIWNEFDLKNFDMDRDILRSEM AILIDQILDPFNNKKVDITGQYIQ >gi|225935372|gb|ACGA01000020.1| GENE 19 30884 - 32092 356 402 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 87 399 5 317 323 141 29 2e-32 MNQQFLKEIEKGSKSALVKKRIITHYIYNGSSTIPDLSKELDLSVPTVTKFIGEMCDDGY INDYGKLETSGGRHPNLYGLNPESGYFLGVDIKRFAVNIGLINFKGDMVELKMNIPYKFE NSIEGMNELCKHILNFIKKLTINKEKILNINVNVSGRVNPESGYSFSQFNFEERPLADVL SEKLGYKVTVDNDTRAMTYGEYMQGCVKGEKDIIFVNVSWGVGIGIIIDGKIYTGKSGFS GEFGHMSAYDNEIICHCGKKGCLETEASGSALHRILLERIQSGESSILSTRVATEENPIT LDEIIAAVNKEDLLCIEIVEEIGQKLGKQIAGLINIFNPELVIIGGTLSLTGDYITQPIK TAVRKYSLNLVNKDSAIITSKLKDKAGIVGACMLARSRMFES >gi|225935372|gb|ACGA01000020.1| GENE 20 32491 - 34533 1452 680 aa, chain - ## HITS:1 COG:no KEGG:BT_0761 NR:ns ## KEGG: BT_0761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 680 1 682 682 1155 87.0 0 MDELNCGQGEQNAGPEKKKSTSKIVKRTLVVTALALAVYVVYSVVYLFVSPDRNIQQIYL VPEDAAFIIQSSAPIEDWEKFSGSETWQCLKKAKSFEEVTQSVEKLDSVVKSNKVLLSLV 
GERDMLISLHKTRPTNWDFLLILDMQKTSKMDLLKDQVETVLVMSGFTVTNRMHNSINIL EMRDPETRDIFYIAFVDNHLVGSYTSGLVESAIDSRNKPKIGLDQSFIETEKLVSGKGLV RVFVNYARVPQFMSIYLGTRNEYIDLFSNSMNFAGLYLNTNKERMEVKGYTLKKDSADPY VTALLNSGKHKMKAHEILSGRTALYTNIGFNSPVTFVKELENALSVHNKQLYDSYQSSRK KIEGLFGISLEENFLSWMSGEFAITQSEPGLLGHDPELILAIRAKSIKDARKNMEFIEKK IKRRTPVKIKTANYKDFEINYVEMKGFFRLFFGKLFDKFEKPYYTYVDDYVVFSNKAASL LSFVEDYEQKNLLKNNPGFENALSYLKSSSTIFLYTDVRKFYSQLKPMMNPATWNEIQSN KDVLYSFPYWTMQIIGEDQSASLQYVMDYSPYQPEEVTVAIATDEDDKEMNEDAETEKEQ MSELKRFYVEKFEGNVLREFYPEGALKSEAEVKEGKRHGRYREYYEDGTLKLRGKYANNK PKGTWKYYTEDGKFEHKEKF >gi|225935372|gb|ACGA01000020.1| GENE 21 34746 - 35327 364 193 aa, chain + ## HITS:1 COG:NMA1447 KEGG:ns NR:ns ## COG: NMA1447 COG1636 # Protein_GI_number: 15794352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 10 190 20 204 241 213 53.0 2e-55 MKKKFQLEVPGGADKVLLHTCCAPCSSAIIECMMQHHITPVIYYCNPNIYPLEEYMIRKD ECTRYAQSLGLEIIDADYDHENWRCHIAGMEQEPERGGRCLRCFKLRLLETARYAHEHGL SVITTTLASSRWKSLEQINEAGQYATASYPDVTYWEQNWRKGGLSERRIAIIKEYNFYNQ QYCGCEFSMRKEE >gi|225935372|gb|ACGA01000020.1| GENE 22 35355 - 36233 520 292 aa, chain + ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 101 273 2 175 195 120 40.0 2e-27 MNRNKKDKNRKLFKKKSHSNNRLGCIIAIIVLIPILFGVYLYCQQINIQKNNELQTDASL QIPLGKDLETPISLVPRQEQIIRHSGYTVSYNKDLKIPNWVSYELTREETKGKEKRGNRF IADPLVTGPIATNADYTRSGYDKGHMAPAADMKWSPEAMKESFYFSNMCPQHPQLNRRGW KNLEEKIRNWAIADSAIIIICGPIIEKQPKTIGKNKVVVPQRFFKVVLSPFAKPIRAIGF LFNNEQAVEPLSSYVVTVDSIESLTNMDFFAPLPDEIENKIEANANYSLWPN >gi|225935372|gb|ACGA01000020.1| GENE 23 36468 - 37505 798 345 aa, chain + ## HITS:1 COG:XF0675 KEGG:ns NR:ns ## COG: XF0675 COG1559 # Protein_GI_number: 15837277 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # 
Organism: Xylella fastidiosa 9a5c # 28 342 29 343 350 127 28.0 3e-29 MKRKKRNILLSILIGAFLLCAIAGGTFYYYLFAPQFHPSKTVYIYVDRDDTADSIYHKIQ KFGHVNKFTGFQWMAKYKDFDQNIHTGRYAIRPNDNVYHVYSRFSRGYQEAINLTIGSVR TLDRLARSIGKQLMIDSAEIASQLFDSTFQAQMGYTTITLPSLFIPETYQVYWDMSVDDF FKRMKNEHERFWNKERLAQATAIGMTPEEVSTLASIVEEETNNNEEKPMVAGLYINRLHQ DMPLQADPTIKFALQDFGLRRITNENLKVNSPYNTYINTGLPPGPIRIPSKKGIDSVLNY TKHNYIYMCAKEDFSGTHNFASNYADHMANARKYWKALNERKIFK >gi|225935372|gb|ACGA01000020.1| GENE 24 37589 - 39181 1529 530 aa, chain + ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 3 529 2 521 584 355 39.0 1e-97 MSKQLLLGDEAIAQAALDAGLSGVYAYPGTPSTEITEYIQMAPITTEQNIHNRWCANEKT AMEAALGMSFVGKRALVCMKHVGMNVAADCFVNSAITGVKGGLIVIAADDPSMHSSQNEQ DSRFYGDFSLIPMYEPSNQQEAYDMVYSGFEFSEKLGEPILMRMVTRLAHSRSGVERKAQ KPQNGISFSEDPRQFILLPGNARKRYKVLLAHQDEFIKASEESPYNKYTDGPNKKLGIIA CGIGYNYLMENYPEGCEYPVLKIGQYPLPKKQLHQLIDSCDEILVLEDGQPFVEKQLKGY LGIGVKVKGRLDGTLSQDGELNPDSVARAVGKENKSEFGIPSVVEMRPPALCEGCGHRDM YITLTEVLKEEYPSHKVFSDIGCYTLGANAPFNAINSCVDMGASITMAKGAADGGLFPAV AVIGDSTFTHSGMTGLLDCVNENASVTIVISDNETTAMTGGQDSAGTGRIEAICAGIGVD PAHIRVVTPLKKNYEEMKQIIREEIEYRGVSVIIPRRECIQTLARKKRSK >gi|225935372|gb|ACGA01000020.1| GENE 25 39185 - 39766 623 193 aa, chain + ## HITS:1 COG:PH0764 KEGG:ns NR:ns ## COG: PH0764 COG1014 # Protein_GI_number: 14590633 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus horikoshii # 4 193 5 200 202 109 35.0 4e-24 MKKDIILSGVGGQGILSIATVIGKAALKEGLYMKQAEVHGMSQRGGDVQSNLRISDQPIA SDLIPSGKCDLIISLEPMEGLRYLPYLSPNGWLVTNETPFVNIPNYPEADKVMAEINKLP HKIVLNVDKVAKEVGSARVANIVLLGATIPFLDIDYEKVQDSIREIFLRKGEAIVEMNLK ALTAGKEIAEKLM >gi|225935372|gb|ACGA01000020.1| GENE 26 39781 - 41088 1220 435 aa, chain + ## HITS:1 
COG:AF2013 KEGG:ns NR:ns ## COG: AF2013 COG1541 # Protein_GI_number: 11499595 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Archaeoglobus fulgidus # 5 433 11 438 440 439 48.0 1e-123 MNTQYWEEELETMSREKLQELQLRRLKKTINIAANSPYYKEVFSKHGITADSIQSLDDIR KLPFTTKADMRAHYPFGLVAGDMSNDGVRIHSSSGTTGNPTVIVHSQHDLDSWANLVARC LYAVGIRKTDVFQNSSGYGMFTGGLGFQYGAERLGCLTVPAAAGNSKRQIKFINDFKTTA LHAIPSYAIRLAEVFQEEGLDPKGTTLKTLVIGAEPHTDEQRRKIEKMLGVKAYNSFGMT EMNGPGVAFECQEQNGMHFWEDCYLVEIIDPETGEPVQEGEIGELVLTTLDREMMPLIRY RTRDLTRILPGKCPCGRTHIRIDRIKGRSDDMFIIKGVNIFPMQVEKILVQFPELGSNYL ITLETVNNQDEMIVEVELSDLSTDNYIELEKIRKDITRQLKDEILVTPKVKLVKKGSLPQ SEGKAVRVKDLRDNK >gi|225935372|gb|ACGA01000020.1| GENE 27 41178 - 41750 532 190 aa, chain + ## HITS:1 COG:BS_xpt KEGG:ns NR:ns ## COG: BS_xpt COG0503 # Protein_GI_number: 16079265 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus subtilis # 1 180 1 181 194 177 49.0 1e-44 MQLLKKRILQDGKCYEGGILKVDGFINHQMDPVLMKSIGVEFVRRFAATNVNKIMTIEAS GIAPAIMTGYLMDLPVVFAKKKSPKTIQNALSTTVHSFTKDRDYEVVISADFLTPNDNVL FVDDFLAYGNAALGIIDLIKQSGANLVGMGFIIEKAFQNGRKMLEEQGVRVESLAIIEDL SNCCIKIKDE >gi|225935372|gb|ACGA01000020.1| GENE 28 41737 - 42513 420 258 aa, chain - ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 240 12 247 294 82 26.0 9e-16 MIFYFSGTGNSKWIANQLSKEQKEELVFIPDALKNGALEFSLQADEKIGFVFPIYSWGPP EIVLNFVRQLSLRGYKRQYLFFVCSCGDDTGLTQQVLAKALKNKGWECHAGFSVTMPNNY VLLPGFDVDNKELEEKKLADAVSTVSKINASISKREELFLCHEGSMPFIKTRIINPLFNR FQMSPKHFYATDACIGCKRCEKSCPVENVTVVDGRPVWGMDCTSCLACYHVCPQHAVQYG KRTKDKGQYFNPNQSTRL >gi|225935372|gb|ACGA01000020.1| GENE 29 42872 - 43222 595 116 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 [Bacteroides 
thetaiotaomicron VPI-5482] # 1 116 1 116 116 233 100 3e-60 MPRSVNHVASKARRKKILKLTRGYFGARKNVWTVAKNTWEKGLTYAFRDRRNKKRNFRAL WIQRINAAARLEGMSYSKLMGGLHKAGIEINRKVLADLAMNHPEAFKAVVAKAKAA >gi|225935372|gb|ACGA01000020.1| GENE 30 43321 - 43518 333 65 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29345834|ref|NP_809337.1| 50S ribosomal protein L35 [Bacteroides thetaiotaomicron VPI-5482] # 1 65 1 65 65 132 100 8e-30 MPKMKTNSGSKKRFTLTGTGKIKRKHAFHSHILTKKSKKRKRNLCYSTTVDTTNVSQVKE LLAMK >gi|225935372|gb|ACGA01000020.1| GENE 31 43588 - 44199 619 203 aa, chain - ## HITS:1 COG:RSc1578 KEGG:ns NR:ns ## COG: RSc1578 COG0290 # Protein_GI_number: 17546297 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 3 (IF-3) # Organism: Ralstonia solanacearum # 3 175 5 178 178 142 45.0 6e-34 MKNDTLKGQYRINEQIRAKEVRIVSDDIEPKVYPIFQALKMAEEKELDLVEISPNAQPPV CRIIDYSKFLYQLKKRQKEQKAKQVKVNVKEIRFGPQTDDHDYNFKLKHAKGFLEDGDKV KAYVFFKGRSILFKEQGEVLLLRFANDLEDYAKVDQMPILEGKRMTIQLSPKKKEAPKKP ATAGAPKPAAPAQKAEKPETSEE >gi|225935372|gb|ACGA01000020.1| GENE 32 44272 - 46212 2091 646 aa, chain - ## HITS:1 COG:DR2081 KEGG:ns NR:ns ## COG: DR2081 COG0441 # Protein_GI_number: 15807075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Deinococcus radiodurans # 2 641 1 647 649 643 51.0 0 MIKITFPDGSVREYNEGVNGLQIAESISSRLAQEVLACGVNGETYDLGRPINEDANFVLY KWDDEEGKHAFWHTSAHLLAEALQELYPGIQFGIGPAIENGFYYDVDPGEAVIKESDLPA IEAKMLELAAKKEDVVRKSIAKTDALKMFGDRGETYKCELISELEDGHITTYTQGAFTDL CRGPHLMTTAPIKAIKLTTVAGAYWRGHEDRKMLTRIYGITFPKKKMLDEYLVLLEEAKK RDHRKIGKEMQLFMFSETVGKGLPMWLPKGTALRLRLQEFLRRIQTRYDYQEVITPPIGN KLLYVTSGHYAKYGKDAFQPIHTPEEGEEYFLKPMNCPHHCEIYKNFPRSYKDLPLRIAE FGTVCRYEQSGELHGLTRVRSFTQDDAHIFCRPEQVKDEFLRVMDIISIVFRSMDFQNFE AQISLRDKVNREKYIGSDDNWEKAEQAIIEACAEKGLPAKIEYGEAAFYGPKLDFMVKDA IGRRWQLGTIQVDYNLPERFELEYMGSDNQKHRPVMIHRAPFGSMERFVAVLIEHTAGKF PLWLTPEQVVILPISEKFNEYAEQVKMYLKIHEIRAIVDDRNEKIGRKIRDNEMKRIPYM 
LIVGEKEAENGEVSVRRQGEGDKGTMKFEEFAKILNEEVQNMINKW >gi|225935372|gb|ACGA01000020.1| GENE 33 46295 - 48250 2109 651 aa, chain - ## HITS:1 COG:all0889 KEGG:ns NR:ns ## COG: all0889 COG0457 # Protein_GI_number: 17228384 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 10 600 50 584 605 149 25.0 2e-35 MTIARNALYFEDYVLSIQYFNQVINAKPYLYEPYFFRALAKLNLEDFQGAEADCDAAIQR NPFVVGAYQIRGLARIRQSKFDGAIEDYKKALHYDPENITLWHNLTLSHIQKKDYNAAKE DLESLLKVSPRYTRAYLMRGEVSLQQKDTIAALNDFNKALELDKYDPDAWAARAIVKLQQ AKYAEAEADFNRAIPLSAKNAGNYINRALARFHQNNLRGAMSDYDLALDIDPNNFIGHYN RGLLRAQVGDDNRAIEDFDFVIKMEPDNMMAVFNRGLLRAQTGDYRGAIQDYTTVINQYP NFLAGYYQRSEARRKIGDKKGAEQDEIKVMKAQIDKQNGVTNKDVAQNKDKENDEEGGEK TRKKSDKNMNNYRKIVIADDSEAEQRYTSDYRGRVQDKNVNITLEPMFALTYYEKMSDVK RSVNFHKYIEDLNRTGILPKRLRITNMEAPLTEEQVKVHFALIDTHTSAIVEDDKNASKR FARAIDFYLVQDFSSAVSDLTQTILLDGDFFPAYFMRALIRCKQLEYQKAEQAVETDVVP GDNKRKEITAVDYEVVRKDLDKVINLAPDFVYAYYNRANVSAMLKDYRAAIIDYDKAIEL NPDFADAYFNRGLTHIFLGNNKLGISDLSKAGELGIVSAYNVIKRFTDQSE >gi|225935372|gb|ACGA01000020.1| GENE 34 48402 - 48956 660 184 aa, chain - ## HITS:1 COG:TM1661 KEGG:ns NR:ns ## COG: TM1661 COG0242 # Protein_GI_number: 15644409 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Thermotoga maritima # 5 166 4 155 164 131 47.0 6e-31 MILPIYVYGQPVLRKVAEDITPDYPNLKELIENMFETMVHADGVGLAAPQIGLPIRVVTI TLDPLSEEYPEFKDFNKAYINPHILEVGGEEVSMEEGCLSLPGIHETVKRGDKIRVKYMD ENFVEHEEEVEGYLARVMQHEFDHLDGKMFIDHISPLRKQMIKGKLNTMLKGKARSSYKM KQVK >gi|225935372|gb|ACGA01000020.1| GENE 35 49002 - 49418 375 138 aa, chain - ## HITS:1 COG:CAC1680 KEGG:ns NR:ns ## COG: CAC1680 COG0816 # Protein_GI_number: 15894957 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Clostridium acetobutylicum # 3 135 2 134 135 86 39.0 1e-17 MSRIVAIDYGRKRTGIAVSDTMQLIANGLTTVPTHELLNFIGEYIAKEPVERIIIGLPKQ MNNEVSENMKNIEPFVRSLKKRYPDLPVEYVDERFTSVLAHRTMLEAGLKKKDRQNKALV DEISATIILQTYLESKRF >gi|225935372|gb|ACGA01000020.1| GENE 36 49493 - 50632 1295 379 aa, chain - ## HITS:1 COG:no KEGG:BT_0418 NR:ns ## KEGG: BT_0418 # Name: not_defined # Def: outer membrane porin F precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 379 1 372 372 593 82.0 1e-168 MKKGLLFILLAAVSVCLPAQEKENAAKSYRVETNRFGANWFISGGVGAQMYFGDNDGKAD FGKRLAPALDIAVGKWFTPGIGLRVAYNGLQAKGATPNANGPLVDGGQYSNGYYKEKWNV MNFHGDVMLNLSNMICGYREDRLYSFIPYAGAGFVHSGKGAGYDELGINAGLINRFRLSS ALDLNVELRGLLMKGAFGNSGPEGLAGLTVGVTYKFKKRGWDAVPTVPMVPESQLNDMRD RVNALKGENESLKRDLVEARNKKPEVIVKKEAGFVPRLVVVFNIGKSNISKREYMNIEAM AKGIKATDKVFTITGYADKGTGSAEYNMKLSKKRAEAVRDLMVNEFGVPASQLKVDYKGG VGNMFYDDAKLSRVAIVEE >gi|225935372|gb|ACGA01000020.1| GENE 37 50952 - 51920 740 322 aa, chain - ## HITS:1 COG:no KEGG:BT_0417 NR:ns ## KEGG: BT_0417 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 621 93.0 1e-176 MIQKSVHRLLMTGFVAFISSLSLMAQHKVEMLPFGDMDQWVDRQIKESSIIGGNTKNVYA IGPTTVIKGDQVYKNMGGSPWATSNVMAKVAGITKTNTSVFPEKRGDGYCARLDTRMESV KVLGLVNITVLAAGSVFTGSVHEPIKGTKNPQKMLQTGIPFTKKPVALQFDYKVKMSDRE NRIRATGFSKITDVPGKDYPAAILLLQKRWEDANGNVYAKRIGTMVTYYYHSTDWKNNAT YEIMYGDITNRPEYKSHMMRLQATESYTVNSKGESVPIHEVAWGDENDVPTHMCLQFTSS HGGAYIGSPGNTLWIDNVKLVY >gi|225935372|gb|ACGA01000020.1| GENE 38 51946 - 53052 1087 368 aa, chain - ## HITS:1 COG:no KEGG:BT_0416 NR:ns ## KEGG: BT_0416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 368 1 368 368 681 97.0 0 MGLLEFNKLPINTLVGADWKTFKAITAGREIDAAYKGKYRLTKAVCRLLSPLASLQDRRY EKLLANQPLEHDPVFILGHWRSGTTFVHNVFSCDKHFGYNTTYQTVFPHLMMWGQPFFKK NMSWLMPDKRPTDNMELAVDLPQEEEFALSNMMPYTYYNFWFLPKYQQEYADKYLLFDDI TDAELKVFEEVFTKLIKISLWNTRGTQFLSKNPPHTGRVKELVKMFPNAKFIYLMRNPYT 
VFESTRSFFTNTIQPLKLQDVSNEQLEENILSIYAKLYHKYESDKKFIPEGNLMEVKFED FEADAMGMTENIYQSLSIPGFAEARTDIEKYVGGKKGYKKNKYKYDDRTIRLVEENWDFA LKQWDYNL >gi|225935372|gb|ACGA01000020.1| GENE 39 53135 - 54595 1591 486 aa, chain - ## HITS:1 COG:PA4442_1 KEGG:ns NR:ns ## COG: PA4442_1 COG2895 # Protein_GI_number: 15599638 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Pseudomonas aeruginosa # 7 436 11 433 451 515 61.0 1e-146 MADNKLDIKAFLDKDEQKDLLRLLTAGSVDDGKSTLIGRLLFDSKKLYEDQLDALERDSK RVGNAGEHIDYALLLDGLKAEREQGITIDVAYRYFSTNGRKFIIADTPGHEQYTRNMITG GSTANLAIILVDARTGVITQTRRHTFLVSLLGIKHVVLAVNKMDLVGFSEERFNEIVADY KKFVAPLGIPDVNCIPLSALDGDNVVDKSERTPWYKGISLLDFLETVHIDNDHNFTDFRF PVQYVLRPNLDFRGFCGKVASGIVRKGDTVMALPSGKTSKVKSIVTYDGELDYAFPPQSV TLTLEDEIDVSRGEMLVHPDNLPTVDRNFDAMMVWMDEEPMDVNKSFFIKQTTNLSRTRI DTIKYKVDVNTMEHLSLENGQLTKETLPLQLNQIARVVLTTAKELFFDPYKKNKSCGSFI LIDPITNNTSAVGMIIDRVEMKDMSDTEDIPVLDMAKLGIAPEHYEAVEKAVKELERQGL AVRLIK >gi|225935372|gb|ACGA01000020.1| GENE 40 54608 - 55516 805 302 aa, chain - ## HITS:1 COG:VC2560 KEGG:ns NR:ns ## COG: VC2560 COG0175 # Protein_GI_number: 15642555 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Vibrio cholerae # 1 302 14 315 315 442 70.0 1e-124 MEEYKLSHLKELEAESIHIIREVAAEFENPVMLYSIGKDSSVMVRLAEKAFYPGKVPFPL MHIDSKWKFKEMIQFRDEYAKKYGWNLIVESNMEAFHAGVGPFTHGSKVHTDLMKTQALL HALDKYKFDAAFGGARRDEEKSRAKERIFSFRDKFHQWDPKNQRPELWDIYNARVHKGES IRVFPISNWTELDIWQYIRLENIPIVPLYYAKERPVINLDGNIIMADDDRLPEKYRDQIE MKMVRFRTLGCWPLTGAVESGAATIEEIVEEMMTTTKSERTTRVIDFDQEGSMEQKKREG YF >gi|225935372|gb|ACGA01000020.1| GENE 41 55524 - 56132 570 202 aa, chain - ## HITS:1 COG:BH3385 KEGG:ns NR:ns ## COG: BH3385 COG0529 # Protein_GI_number: 15615947 # Func_class: P Inorganic ion transport and metabolism # Function: Adenylylsulfate kinase and related kinases # 
Organism: Bacillus halodurans # 13 198 16 201 208 206 54.0 2e-53 MEEKNHIYPIFDRMMTRQDKEELLGQHSVMIWFTGLSGSGKSTIAIALERELHKRGLLCR ILDGDNIRSGINNNLGFSETDRVENIRRIAEVSKLFLDSGIITIAAFISPNNDIREMAAN IIGKDDFLEVFVSTPLEECEKRDVKGLYAKARKGEIQNFTGISAPFEVPEHPALSLDTSK LTLEESVNRLLELVLPKVKCIK >gi|225935372|gb|ACGA01000020.1| GENE 42 56145 - 57698 1301 517 aa, chain - ## HITS:1 COG:BH3384 KEGG:ns NR:ns ## COG: BH3384 COG0471 # Protein_GI_number: 15615946 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Bacillus halodurans # 1 305 2 300 589 163 33.0 9e-40 MTFEIVFVLLSLLGMVAALIADKMRPGMILFSVVVLFLCAGILTPKEMLEGFSNKGMITV ALLFLVSEGIRQSGTLGQVIKKLLPQGKTTVFKAQLRILPSVAFISAFLNNTPVVVIFAP IIKHWAKSVNLPATKFLIPLSYVTILGGICTLIGTSTNLVVHGMILEAGFEGFSMFELGK VGVFIAIAGIIYLFLFSKRLLPDARPDTAVPDEEVEEGEKLHRVEAVLGARFPGINKKLK DFNFQRHYGAEVKEIKTRNGQRFVSNLEEVVLHEGDTLVVMADDTFIPTWGESSVFVLLT NGNEPDTTGKKKRWFALLLLVLMIVGATVGELPITKEMFPDIKLDMFFFVSITTIIMAWT NLFPARKYTKYISWDILITIACAFAISKAMVNSGVADSVAKFIIGLSDDYGPHVLLAMVF IITNLFTELITNNAAAALAFPLALSISAQLGVSPTPFFVVICMAASASFSTPIGYQTNLI VQGIGNYKFTDFVRIGLPLNIITFLISIILIPLIWNF >gi|225935372|gb|ACGA01000020.1| GENE 43 57710 - 58534 673 274 aa, chain - ## HITS:1 COG:aq_337 KEGG:ns NR:ns ## COG: aq_337 COG1218 # Protein_GI_number: 15605852 # Func_class: P Inorganic ion transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Aquifex aeolicus # 10 267 6 249 268 260 52.0 2e-69 MEQKYVMAAIDAALKAGEKILSIYNDPASDFEIERKADNSPLTIADRKAHEAIVAILNDT PFPVLSEEGKHLGYEIRRGWDTLWIVDPLDGTKEFIKRNGEFTVNIALVQNSVPVFGVIY VPVKEELYFGIEGAGAYKCSGIVSLEDDGVALEQLIGKSEQIPLKEVHDHLIVVASRSHL SPETESYIADLKKKHGSVELISSGSSIKICLVAEGKADVYPRFAPTMEWDTAAGHAIARA AGMEVYQAGKEEPLRYNKEDLLNPWFIVEPKREH >gi|225935372|gb|ACGA01000020.1| GENE 44 58786 - 59280 602 164 aa, chain - ## HITS:1 COG:no KEGG:BT_0410 NR:ns ## KEGG: BT_0410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 
1 164 164 270 89.0 1e-71 MKKLVVLGMGVCLVLAFASCKSSESAYKKAYEKAKQQELAEPQVEAPVEVTPVVAAPVTT TKVADTSGVRQEKVTVVSGNEGLKDYSVVAGSFGVKANAEGLKDWLDGQGYHSTIAFNAD KAMYRVIVNSFADKTAAAEARDAFKAKYPNRSDFQGAWLLYRVY >gi|225935372|gb|ACGA01000020.1| GENE 45 59396 - 60166 812 256 aa, chain - ## HITS:1 COG:all0475 KEGG:ns NR:ns ## COG: all0475 COG4221 # Protein_GI_number: 17227971 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Nostoc sp. PCC 7120 # 1 253 4 256 257 295 54.0 7e-80 MEAKIVFITGASSGIGEGCARKFAREGWNLILNARTVSKLEELKAELEGAYGVRVYILPF DVRDRKLAAASLESLPEEWKTIDVLVNNAGLVIGVDKEFEGSLDEWDIMIDTNIRGLLAM TRLVVPGMVERGRGHIINIGSIAGDAAYPGGSVYCATKAAVKALSDGLRIDLVDTPLRVT NVKPGMVETNFTVVRYRGDKEAADNFYKGIRPLTGDDIAETVYFAASAPAHIQIAEVLLM PTYQATGTISYKKKSE >gi|225935372|gb|ACGA01000020.1| GENE 46 60367 - 60576 407 69 aa, chain + ## HITS:1 COG:no KEGG:BF1659 NR:ns ## KEGG: BF1659 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 69 1 69 69 79 91.0 3e-14 MKGLNVLAAFLGGAAVGAALGILFAPEKGEDTRHKIAEILRKKGIKLNRSEMETLVDEIA AEMKGEIAE >gi|225935372|gb|ACGA01000020.1| GENE 47 60609 - 60971 310 120 aa, chain + ## HITS:1 COG:no KEGG:BT_0407 NR:ns ## KEGG: BT_0407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 119 143 92.0 2e-33 MFADDKSIENFQQLFFEFKKYLELQKEYTKLELTEKLTILFSTLIMILVLIILGMVALFY LLFALAYILEPLVGGLMSSFAIIAGINVVLIALVIIFRKQLIISPMVNFLANLFLTDSNK >gi|225935372|gb|ACGA01000020.1| GENE 48 60975 - 61232 272 85 aa, chain + ## HITS:1 COG:no KEGG:BT_0406 NR:ns ## KEGG: BT_0406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 119 81.0 3e-26 MNAQTPQKFTLEEIAERKKKLLNEIHAQKKAMTATTREIFAPLAPATNKADALMRSFNTG MAVFDGVVMGIKIMRKIRAYFRNLR >gi|225935372|gb|ACGA01000020.1| GENE 49 61239 - 62792 1264 517 aa, chain - ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined 
# Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 832 79.0 0 MSNKIYPIGIQNFEKIRQDGYFYVDKTALMYQMVKTGSYYFLSRPRRFGKSLLISTLEAY FQGKKELFAGLMVEKLEKDWIEHPILHLDLNIEKYDASDSLDKILNDNLEYWESKYGTRP SETSFSLRFAGIIQRAYEKTGQRVVILVDEYDKPMLQAIGNEDLQKQFRDTLKPFYGALK TKDGCIKFALLTGVTKFGKVSVFSDLNNLKDVSMDERFVDICGITEKEIHDNLEEELHQL AEKQKISYEQVCAELKECYDGYHFVEHTIGIYNPFSLLNTFDKMKFGSYWFETGTPTYLV NLLKKHHYDLERMAHEETDEQVLNSIDSESSNPIPVIYQSGYLTIKGYDERFGIYRLGFP NREVEEGFVRFLLPYYANVNKVESPFEIQKFVREVESGDYNSFFRRLQSFFADTGYDVIR EQELHYENVLFIVFKLVGFYTKVEYHTSEGRIDLVLQTDKFIYVMEFKLNGTAEEALKQI NEKHYSLPFEADNRKLFKVGVNFSSQTRNIEKWIVEE >gi|225935372|gb|ACGA01000020.1| GENE 50 63187 - 63759 404 190 aa, chain - ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 182 1 183 183 192 53.0 4e-49 MEVIKTNIEGVVIIEPRLFKDDRGYFFESFSQREFDEKVRPIKFVQDNESMSSYGVMRGL HFQTMPYSQSKLVRCVKGTVLDVAVDIRKGSPTYGQHVAVELTEENHRQFFIPRGFAHGF AVLSEKAVFQYKCDNFYAPQHDGGISILDDSLGICWRIPTGKAILSEKDTKHPLLKDFES PFLYGEDLYK >gi|225935372|gb|ACGA01000020.1| GENE 51 63768 - 64886 917 372 aa, chain - ## HITS:1 COG:STM2097 KEGG:ns NR:ns ## COG: STM2097 COG1088 # Protein_GI_number: 16765427 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Salmonella typhimurium LT2 # 6 372 3 360 361 397 54.0 1e-110 MSKRNIIITGGAGFIGSHVVRLFVNKYPDYHIINLDKLTYAGNLANLKDVENKPNYTFVK ADICDFEMMLKIFKQYHVDGVIHLAAESHVDRSIRDPFTFARTNVLGTLSLLQAARLTWE YLPEGYEGKRFYHISTDEVYGALELTHPEGKSSEISAHEVYGDEFFKETTKYNPHSPYSA SKAGSDHFVRAFHDTYGMPTIVTNCSNNYGPYQFPEKLIPLFINNIRHRKPLPVYGKGEN VRDWLYVVDHARAIDVIFHNGKIADTYNIGGFNEWKNIDIIHVIIKTVDRLLGNPEGHSE GLITYVMDRMGHDLRYAIDSTKLKNELGWEPSLQFEEGIEKTVQWYLDNQEWMDNITSGA YESYYEDMYKNR >gi|225935372|gb|ACGA01000020.1| GENE 52 64883 - 65785 751 300 aa, chain - ## HITS:1 
COG:YPO3861 KEGG:ns NR:ns ## COG: YPO3861 COG1209 # Protein_GI_number: 16123996 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Yersinia pestis # 1 294 1 291 293 412 65.0 1e-115 MKGIVLAGGSGTRLYPITKGISKQLIPIFDKPMIYYPISVLMLAGIREILIISTPHDLPG FKRLLGNGSDYGVRFEYAEQPSPDGLAQAFIIGEDFIGSDSVCLVLGDNIFHGNGFSSML KEAVYMAEKERKATVFGYWVSDPERYGVAEFDDEGNCLSIEEKPVHPKSNYAVVGLYFYP NRVVDVAKRIKPSVRGELEITTVNQQFLEDSELKVQTLGRGFAWLDTGTHDSLSEASTFI EVIEKRQGLKVACLEGIAFRQGWIDADKMRELAQPMLKNQYGQYLLQVVEEVERTGKSNL >gi|225935372|gb|ACGA01000020.1| GENE 53 65790 - 66506 443 238 aa, chain - ## HITS:1 COG:CAC2317 KEGG:ns NR:ns ## COG: CAC2317 COG1922 # Protein_GI_number: 15895584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Clostridium acetobutylicum # 39 222 50 231 250 116 35.0 3e-26 MIKLKELSIMESINQLKALPQGKLLINTINAHSYNTALKDPLFAKALMKGDALIPDGASI VKVCKWLKMQSQPRERIAGWDLFVFEMNRLNEKGGRCMFMGSSEKVLALIEQRAAVDYPN LEVVTYSPPYKPEFSQEDNAAIVAAINEANPDLLWIGMTAPKQEKWIYSNWQKLNIHCHV GTVGAVFDFYAGTAERAPLWWQQHSLEWFYRLIKEPRRMWRRYLIGNVLFLWNICKEI >gi|225935372|gb|ACGA01000020.1| GENE 54 66587 - 67822 588 411 aa, chain - ## HITS:1 COG:alr0159 KEGG:ns NR:ns ## COG: alr0159 COG0438 # Protein_GI_number: 17227655 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 168 405 141 391 397 62 25.0 1e-09 MLSILINSYACSPNMGSEPGMAWNWCVNLAKYCELHIITEGEFRDKIEAVVPTLPQSGNM HFYYNPVSEEVRRMCWNQGDWRFYKHYRIWQYRTYEMTLDIMKMHHIDIVHQLNMIGFRE PGYLWKIKEVPFVWGPIGGLKQFPLAYLEGTGLKMKLFNGLKNILNVLQLKYDKRVNVAL KKADVLVSSIPDSYRAIKRYKRLESVVIPETGCFPWACVSTERFDSAIFHVLWVGKFDFR KQLPLALKAIAATKNERILLDVYGTGTDDQVSTAKALSVSLGIADQVIWHGNQPNHEVHE KMRSSHIFLFTSVNEDTSTVVLEAVSNQLPVVCFDACGMAAVIDETVGRKITLSSPTQSV KDFAKVLNELEADRDLLKQLSVNCRQRQEELSWDNKAKQMLELYESCSLTV >gi|225935372|gb|ACGA01000020.1| GENE 55 67917 - 68792 261 291 aa, chain - ## HITS:1 COG:no KEGG:Spro_0558 NR:ns ## KEGG: Spro_0558 # Name: not_defined # Def: acyltransferase 3 # Organism: S.proteamaculans # Pathway: not_defined # 1 282 1 287 335 74 27.0 3e-12 MNLTKEHSIQIKGVAILCMVLFHLFGFPERIPTSVQWMGMPIIKALQICVPIYLFMAGYG LQCMVAKGTITWMSIGKRLKKLYLSFWWVAIPFISVGCIVGYYAPDVKNIFYNLSGLTTS CNGEWCFFSLYAELLVLFYFVSKIKLGWKGYLLLMLGLLILTRGLNCALHLDEEVIVERH LKMILIDLNIFMLGCFFAKFNIFGWLHERCHWLYKKIYLAPLLLVIPILVRAYLPLIGVT ELLIVPMFCIGVVNICKRGILLFFGKYSINLWLVHSFFIFLFFEWHFFYYK >gi|225935372|gb|ACGA01000020.1| GENE 56 68801 - 69760 545 319 aa, chain - ## HITS:1 COG:no KEGG:BF2802 NR:ns ## KEGG: BF2802 # Name: not_defined # Def: putative acyltransferase transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 317 7 337 340 301 53.0 3e-80 MKQRNISIDILKCFAALIITNSHMDILYPKFGALATGGAIGDALFFFCSGFTLFLGRMGR FDNWYKRRINRIYPTIFAWAILGSLLFGYQNNINNVLLFGGGWFVSCIMLYYVLLYFIQR YMFKYLRLAFMAVAIACAVYFMLINTPMDFNMYGAGYFKWFHFFMFMLMGAMMGISQRQY KYHFVWDGLKLIGCVVAFYALYAFKDIAVYNKFQMLTWIPLLGTVFYFFKLCNSDFMKKA YHHCTVGWIIKLVGGLCLEIYLVQTALFTDKMNAVFPLNLIVMFAIIVFAAYILRCSARL FAQTFKDMDYDWRAVFKAV >gi|225935372|gb|ACGA01000020.1| GENE 57 69772 - 70686 368 304 aa, chain - ## HITS:1 COG:no KEGG:BT_2147 NR:ns ## KEGG: BT_2147 # Name: not_defined # Def: O-acetyl transferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 261 12 284 337 76 28.0 1e-12 MKRNISLDIARGLSIVLIVLSHTPFGYSYLFDFFYVQVFFFLSGVFIKSSYTVENSLYKK 
VKALIYPYLFFAVISSIAVLMLGRNTIDGIHLYDPDSFDNGPLWFLIALFTLNIMFLLLN ELPQYVRFCMLWVIFGVCYYLGLNGIDDYTDITKAGISLPILLAGNYYLKIEPFFRKYRY VICLLSILLCALFIWVIPVHIGIRWIKLPDNILLYLLASFSGIAFILSISCIVESILILR KSLSFLGGYSLFILCMHWPIIRMLYDKVHISNNDIIAFVVGAIVCFTTAGIGVILKKYCR FFFK >gi|225935372|gb|ACGA01000020.1| GENE 58 70695 - 71792 706 365 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 8 349 5 327 344 118 26.0 2e-26 MNTNKYAVSIILPCYNVSQYIERAIDSILIQDFKDYEIIIINDGSTDNLLQKCDDLGYTE KGYIKILSFKNQGLSQARNEGLKIAQGEYVYFFDPDDYINQGMLSKVYLKAKEGNYDAVH FGFQTIYEDQGGIHYDKAESPYVYQTNDEIIHNYLPRFLGITQDNINSFVDLNHLWSSKE FSGVWRFLYKRSVLIDNKILFPKGIKLIEDKLFNARFFCYAKTIALMDDVLYNYVIKEKG LLTSSLNNVKGLVEDKIIGIEQRAILRNLYLKEKNIDIFSYYIGTIVFSALELIMRLSDI SFFSSRKELKRYMNMEDVRVGIKEIDVSHLPLKLRLPIVMLKYRLTYLLLGGMRVLKMIG LKVNI >gi|225935372|gb|ACGA01000020.1| GENE 59 72067 - 72684 409 205 aa, chain - ## HITS:1 COG:SPy0787 KEGG:ns NR:ns ## COG: SPy0787 COG0463 # Protein_GI_number: 15674831 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 7 203 3 194 310 120 37.0 2e-27 MYNKSRIAILMATYNGECYLNYQIESIMGQSFEDWTLYIRDDGSSDSTVEIVREYCKKDS RIMLVDDEIKHRGAKKSFIWLLENIHSEYYMFSDQDDIWLPHKIESTYLKMLEMESVNPT TPILVHTDLAIADEDGFVVKGSRYKYDKLYPDWLDKLNWFIVSYRICGNTIMLNDKAKAV ALPFDDRIYMHDWWVVLMTLKMGGG >gi|225935372|gb|ACGA01000020.1| GENE 60 72693 - 73256 431 187 aa, chain - ## HITS:1 COG:SA0151 KEGG:ns NR:ns ## COG: SA0151 COG0110 # Protein_GI_number: 15925860 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Staphylococcus aureus N315 # 63 178 65 181 208 74 36.0 8e-14 MKIDCSNMPNGASKLNFILRYIFNQLRTWYMFHIRYPWVKYEGFVRVMKGTGFAQNMDVR 
IGHNVQFGDYCNVASNVYFGNNILMAGRVCFVGRQDHTFSIPGKTIWSGERGDNGITIIE DDVWIGTAAIIMSGVTIGRGSIVAAGSVVTKDIPSCEIWGGVPARKIRDRFETEEEKKYH ICHANFK >gi|225935372|gb|ACGA01000020.1| GENE 61 73253 - 74410 775 385 aa, chain - ## HITS:1 COG:MTH450 KEGG:ns NR:ns ## COG: MTH450 COG0438 # Protein_GI_number: 15678478 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 197 382 219 410 411 79 27.0 1e-14 MKILWISPWFGNYRVPVYENLNRLCSGNFYLICSEENTSELVRGKLKAVLGNHAIVMSGE QRMTMGSDESDFANSALVIKKQPGLYKAVKSVEADVIITEGFGGWAPAGIRYAVTHRKKL CMFYERTAYVERNSPTWRSMYRRLVGIPVDYFLINGTLTEEYLNEGLHFKRTPKVKGCMC ADSFGLSQAVEKVTSADKDALRKELNLKNGLTFLFVGQMVERKGIKELLVVWGQHIAEYP NDNLLVIGKGILEKPLKDLYAGDNSIHIMGGINYDELYKYYALCDVFIMPTLEDNWCLVI PEAMACGKPVACSIYNGGHYELVQDGVNGYKFDPLKPESIIDILAKFHQADLSAMGQKAI EIESNYTPDKAAARIFEACEKVYKR >gi|225935372|gb|ACGA01000020.1| GENE 62 74420 - 75964 713 514 aa, chain - ## HITS:1 COG:no KEGG:BLA_0592 NR:ns ## KEGG: BLA_0592 # Name: cps1A # Def: polysaccharide biosynthesis protein # Organism: B.animalis_lactis # Pathway: not_defined # 270 492 121 331 353 124 32.0 1e-26 MGVSIRKRLKDSFMVKVWRTGGFIVNSAVEDNKQYYSQSGHGTCVFGRTKALKELHFEEE LWLQDAKYALPEDMVMFYKLYLKGNVIAMNREVEFVHLDAGSSLMDDNKKLNNIYASARN GLIFWHRFIYKCRDKKWLSILCIVRRIFFTSLFALLKGVIRWDIRYFKTYMEGYRDGWKY IHTSTYNELSNIVKCSGGKIKKVYSFGLFHGLAYTLFVLSLRLFPSTRVALFFRKMHDVA VKHYAKHTCKRVVHEYAELTSLELSPLTDNLPIWIFWWQGENEMPPVVKSCLNSVRKYAG AHPVRVVSQHNIDEYLIIPQHIKDRLKQNVGDMMQGMSFIHFSDYVRMALVYEHGGIWLD ATCMLTAPLYVPQNASFVSVRTNKLDSVPNRGKWNIYYLGAAKHNVMFGYMKEMLSEYWK NNKFIIDYFFTDYCLSLAYDLYPNVRKMIDALPALNSDKDAHSIYFLLNQAYDENIYNEI VSKVNLHKLTWKGRLNEYTNDGMYTFYGWVLENS >gi|225935372|gb|ACGA01000020.1| GENE 63 75999 - 76367 396 122 aa, chain - ## HITS:1 COG:no KEGG:Gmet_1497 NR:ns ## KEGG: Gmet_1497 # Name: not_defined # Def: glycosyl transferase family protein # Organism: G.metallireducens # Pathway: not_defined # 6 112 17 123 328 91 42.0 1e-17 
MINRDYSVTIRTLGTAEDKYQRTLDSIATQSIQPKEVIIVLANGYECPKEQLGYERFVFV DKGMVNQRVACFDESSSEYTLALDDDVEFSSNFVASLFDTMEKCKADFVSPLVKEINISG GG >gi|225935372|gb|ACGA01000020.1| GENE 64 76370 - 77425 430 351 aa, chain - ## HITS:1 COG:SMb21311 KEGG:ns NR:ns ## COG: SMb21311 COG0438 # Protein_GI_number: 16264635 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 136 305 206 378 420 80 33.0 6e-15 MYDDEGICVIRLRGGDFLIPEVEKPSLWKKFRPLYRFYSYRKRIVLAISTLGDVDIIEVP EYGAEGYFLHQLNIPVVVRLHTPMLLDHYRFSLQSFSKNNWHYYWQGKKEFQEMKKAAYL SSCSTSLKIWAEQYAGVMKDRARVIYNPIDMNAWKGFQRKEEYREVKEILFAGTICDWKG CGDLAEACRILHQENASCQFRLSLVGKTGIFAEQLQAKYGDEPWFNLVGRVQREELMERY TTADVICFPSWWENMPMVCIEAMLCGGIVLGSSSGGMSEIIEDGKSGFLIEPRNPRCLAD KIRQIFNLSEGEKANVSLSAKQRIKNAFSLDMIIKQQIAYYKEVVEDYKNK >gi|225935372|gb|ACGA01000020.1| GENE 65 77581 - 78855 619 424 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170823|ref|ZP_05757235.1| ## NR: gi|260170823|ref|ZP_05757235.1| hypothetical protein BacD2_03077 [Bacteroides sp. 
D2] # 1 424 1 424 424 693 100.0 0 MIHQFPNKDSKTNWGYIIGYFILFLIAPPFIAIPTIIVYMVTKSHATKSDYYLCFFAIAG YFAAINATKRMGGDQWQYYVAYMNVPDVGFWKSLVYIYGIDYFKDSSRMQVSGEFMNGIY NYLGYYLTFGYYPLYAALLTFMDYLLVFWGLRRFCLSMKKPHIPIIAGIFTLSFFYLFFQ YTLQIQKQFLAQCIMMYVLGNYAYYGKMRKKDWIMAVCAVFTHAATLLFVPFFIFKPLHR RLTKKGLFFIGAAFAILIIIAPRLASGIATDTSSALTYGVSRLATSETLNDTEFGLVWAQ VFIIALPMALIAFRKLWLERKTLSDSSAFILNITLLLLLTIVAMYRQPLAQYRYFMMLFA FMPFIYPFAFDNIRNRDMLLKVIAVVMILWFYYQFELIVWDYAPLVDIIIKSPIFLLFGN YYTV >gi|225935372|gb|ACGA01000020.1| GENE 66 78870 - 79841 418 323 aa, chain - ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 1 264 1 261 344 112 29.0 7e-25 MSIPFISIIVPVYNVAQFLSITLDSILNQRYTNYEIILINDGSTDESAEIAQRYASNDQR IVFIDKCNEGVATTRNKGLEVSRGIYIMFLDADDIIYPNTLDKIATMLRTTNADLLRFEH QTIDEEGNELYPNYNIRRRKKYVGCELDAATFMNKILLNEYYMCLNIFKSSLIKSHHISF LPDCTYNEDTLFILKSLLYSKKNFYIPIKVYGYRKYSEAVTAKFSIKNYEDVKLVFTEII QMSENITNNLRTSFKAVAESLAIHLYLEKSNMNVLDGDLDTIICQCTNRPVMMEWKCISL LGDKGFKLWRSLFLLKRILNRLM >gi|225935372|gb|ACGA01000020.1| GENE 67 79841 - 80806 509 321 aa, chain - ## HITS:1 COG:no KEGG:Patl_2044 NR:ns ## KEGG: Patl_2044 # Name: not_defined # Def: hypothetical protein # Organism: P.atlantica # Pathway: not_defined # 14 260 2 217 278 95 27.0 3e-18 MNTEAKTRLAMNDFKHFILTRFNIPFIRKDSVDFLFSDSYLNERYSIFENTCFSSMCNQK NKDFIWLVFFDKRTPKSFVHRNEQLQKLCKNFKPIYVDMNYIANLAIESSYVIYANEIAP VNENDYGVMMGRVWLAEYYNLLIRSYCEDSVEYVITTRIDNDDCFECDMVEAVKKYATID TVDHILSFDMGIQYFNNTCIAQRFYYPNNHFTSMLEKLDKPLRTVFYWDHFFVEKFMPVI HISEKPLWLEILHNSNAINTIKLNRKNSLYLSGLDLHPFGIDLKYSSIKILVSLLINPRL YLYPWIKYLLHLAALKKYLKK >gi|225935372|gb|ACGA01000020.1| GENE 68 80851 - 81831 456 326 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0084 NR:ns ## KEGG: Fisuc_0084 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 33 322 59 356 365 137 32.0 4e-31 
MFCSKLAYMFWEDNESLRSHVGNYYTRTCKVVGEFSAPCLIMMVDGRFPHGGLADRLRGC INLYKFAKERGMDFRIYWKSPFDIHEYLTCNEYDWKIDQDEISYFMGNSFPVTLGSYYKQ YSLSEAKEAEFQYQKMNEIIDSHPHIKQFHFYTNAHFVDDKEYAELFKELFVPSEGLQEK ISFYSKVHNYSPYLSMAFRFRQLLGDFEDASGGEVLSVAERKQYIEDSLHLIDKVYEENN GEQMIFVTSDSKDFIKAASEKSYVFTVSGEIVHIDKQCTLVKTDYSKEFLDMLMLARSSF IYVCKKERMFTPGFSLNAAKLGGGNV >gi|225935372|gb|ACGA01000020.1| GENE 69 81870 - 82751 466 293 aa, chain - ## HITS:1 COG:no KEGG:PRU_1758 NR:ns ## KEGG: PRU_1758 # Name: not_defined # Def: group 2 family glycosyltransferase # Organism: P.ruminicola # Pathway: not_defined # 7 277 5 273 307 212 41.0 1e-53 MHLCNFTFIIPHKNIPSLLQRCIDSIPQREDVHIVVVDDNSDSCQVDFSHFPGLDNPLVE VIFTKEGYGAGYARNVGLKQVQSKWVLFADADDFFTPELNVFLDKYKDSTADVVFITNNT IDCDTFIPQNQDLKVAELAKECNHENEFDPLRYKSQPPWTKMVKMSLINQYHIMFDETPA SNDVWYSVQVGFYARKIEISKEIVYTRTIRQGSLQYSLDKERLLARLKVGYKVNTFLRNH GKIKYYNEIWGYLLDLRKISYPLFFIQFFKYIYKTPFFIVKKHLKYKFLKLFR >gi|225935372|gb|ACGA01000020.1| GENE 70 82763 - 83686 525 307 aa, chain - ## HITS:1 COG:mlr0961 KEGG:ns NR:ns ## COG: mlr0961 COG3594 # Protein_GI_number: 13471080 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Mesorhizobium loti # 54 302 382 619 672 147 35.0 3e-35 MWLSKLKRVFSNMLYIDEIIDDIVSYKLRFLKLDVFANINKLGGVNLHPFVDSQGYTHTV TVSLTSYGDRVYSVCYVIYSILMQSVQPNRIILWLSEQEYNDENLPQSLKDLRSRGVEIA YCPDYRSYKKIIPTLRKYPEDVIITIDDDVLYGHETLEYLINAYLQEPEYIWFMSGSKMA FDKNGNIKPYVTWYDEHLTALECNIKNFPTGIGGILYPPHSLAEEVTDETLFMKLAPTAD DIWLKAMSLKQGTNCRQIKLNKPVMRFHTGIREVQTSALCKDNIEKNLNDKAVYAVFDYF DLWKLFN >gi|225935372|gb|ACGA01000020.1| GENE 71 83689 - 84642 502 317 aa, chain - ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 2 106 6 110 329 95 42.0 1e-19 MISIIIAAYNAASFITTCIDSCLNLLGSETEIIIINDGSTDDTLAICNQMANCYSNIFIF 
SQDNQGVSAARNEGLLRAHNEWVLFMDADDWLNKRDMEFLITNLKQLDEAIDICTFGYNT ILQDKIIEHREQDRVCAPIEILNSASFKLASWNYVFRRTLLINNDINFPEGVICTEDQNF NIKALCQARVVQSFNLIVYNYNCTNIYSASHKKHSEAWIKSRLLSANNILSYCILHDVPL LNVYAQVKRLYESYMNDFTTDISVKERASFFISEYNKTVSYLPEIKNIRKFRLCYYNMWL GLLLFKIHKIIVYIKNR >gi|225935372|gb|ACGA01000020.1| GENE 72 84648 - 85136 108 162 aa, chain - ## HITS:1 COG:no KEGG:GFO_0543 NR:ns ## KEGG: GFO_0543 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 11 149 154 292 298 83 34.0 3e-15 MNLPPPKKNVKNIQIGIVPHFSEVDYFIKQYGLKYTVIDLRTFEVEKVIDEINSCQYVLS SSIHGIIVAHAYHIPCIWIQKGYIHTDGIKFYDYFSSVNIPIYEGFTDIENILSDLSSCL NFFERYSSIALPHKSLVEIQQDLLDVAPFPVKQEYRNLLKKS >gi|225935372|gb|ACGA01000020.1| GENE 73 85105 - 85548 293 147 aa, chain - ## HITS:1 COG:no KEGG:CA2559_07240 NR:ns ## KEGG: CA2559_07240 # Name: not_defined # Def: xanthan biosynthesis pyruvyltransferase GumL # Organism: C.atlanticus # Pathway: not_defined # 5 145 3 128 269 63 27.0 2e-09 MNKRINLVYWNKPNFGDQLSPYIINKLSGLPIQYKRGSVSVRRSIKEILKYLICNNRKAL KDMLFSWEHNIIGIGSVINLGNHKSHIWGSGFMNTDQTFHGGIACAVRGEYTNNKIISMG YSGCCVFGDPALLLPLVYEPTPPQKKC >gi|225935372|gb|ACGA01000020.1| GENE 74 85554 - 87038 223 494 aa, chain - ## HITS:1 COG:no KEGG:BVU_2944 NR:ns ## KEGG: BVU_2944 # Name: not_defined # Def: putative transmembrane protein # Organism: B.vulgatus # Pathway: not_defined # 1 487 17 503 510 403 48.0 1e-110 MLYLRMFVNMGVSLFTSRLILNALGETDFGIYNVVGGIVVMFTFINGSMNSSTSRFLTVS LAKGNSQELKATFSFAVTIHAILAFIILLLAETFGLWFFYHKMIIPAERMDTAFILYQLS IITTMLSIMSVPYNSSIISHEKMNAFAYISISDVVFKLLIVYIVIYLPWDKLLLYALLLF ITQIVNQLIYILYCQVHFEEAKYSLTWNKKLFVEMCGFAGWNMTGNFAFVCYTQGLNLLI NMFFGPSVNAARGIAIRVQGIISNFASNFQTAIHPQITKNYAVGNYEYMQKLIFAESKFS LYLLLLLSLPVLVEAQLILNWWLVNVPENTVVFLRIMLFTTYIETTTNPLLVSALATGNV KKLQLVICPLLLCILPLSYLFLKFGAPAYSVFIVNLLILFLSLIIRVFMLKPFIGLSPKH YFVDVIIRCMGVGILSSVLPIYLFYIIEPPFFRFFIVCVSSVLSVVSVVYFIGINHEERL FVRAKIRNVINYKR >gi|225935372|gb|ACGA01000020.1| GENE 75 87115 - 88095 513 326 aa, chain - ## HITS:1 COG:no 
KEGG:BMD_2936 NR:ns ## KEGG: BMD_2936 # Name: not_defined # Def: hypothetical protein # Organism: B.megaterium_DSM319 # Pathway: not_defined # 6 278 3 303 365 68 25.0 2e-10 MEKNSRKDSFDVVKCFAAFFVVQLHTIPATVCPLLNVIARLAVPLFFLITGYYYTSIVEK GKYGVQLKKIFLLAIASSLFYWIYYGCMALKNNMFYQWFMDTFNSISILNWVLINDTPGI GHLWYLYAMIYSFIAIYVIDKLKIKVKWVIPILFLIGLYVGCKGWPYSWYRNWAFMGVPY ILLGRIIFEYKEMLIYKLGGVKIVYFISAAIIGLLGEYKLYEIVGMDPIRDHYVFIILMS GILFILALQYPYFGKGSVVAIIGREYSAFIYILHIFILRILSFYIDFSSSLMIRFLCPII VFVCTIVVATYIIHMYKLLSKKCVTI >gi|225935372|gb|ACGA01000020.1| GENE 76 88174 - 89487 1429 437 aa, chain - ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 435 1 443 450 471 53.0 1e-133 MNIAIVGTGYVGLVSGTCFAEMGATVTCVDVDTNKISKLKAGEMPIYEPGLEELVKRNVG YGRLHFTTDLIEVLDDVEVVFSAVGTPPDEDGSADLKYVLAVARQFGQNINKYTILVTKS TVPVGTAQKVKAVIQEELDKRGADVPFDVASNPEFLKEGAAIKDFMSPDRVVVGVESKKA EEVMTKLYQPFLLQNFRVIFMDIPSAEMTKYAANAMLATRISFMNDIANLCERVGANVDH VRKGIGADVRIGQKFLYAGCGYGGSCFPKDVKALMHTGIDNGYHMEVIEAVERVNDRQKS IVYDKLIRLMGDVKGKTIAMLGLAFKPDTDDMREAPALVVIDKLLKDGATVKVFDPIAMP ECKRRIGNVVTYTENLYDCADGADALLLMTEWRQFRLPTWNVIQKVMTDKYIVDGRNIWN RVELEEMGFSYTRIGEK >gi|225935372|gb|ACGA01000020.1| GENE 77 89524 - 91947 2039 807 aa, chain - ## HITS:1 COG:VC0937_2 KEGG:ns NR:ns ## COG: VC0937_2 COG0489 # Protein_GI_number: 15640953 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 501 794 7 299 302 121 29.0 7e-27 MKENPYENNMNEQDEEKINYQELLFRYIIHWPWFLASILICLIGAWGYLYFQTPVYQVSA SIMIKDDKKGGNSGSADLENLGLGGVITSAQSIDNEIEVLRSKSILKEVVNSLELYITYY DEDEFPKREMYKTSPVVVNLTAQEADKLPSTALIDMQLSSDGGLDVNLKVGLNEYNKRFD KLPAVFPTNVGTFGFTLRDSLLNGQIEGRKDVRHISAVVSQPFGMAKGYQWALTIAPTSK ATSVATVSLVNTNIQRGQDFINKLMEMYNRNTNNDKNEVAEKTREFINERIKIIDEELGN 
TEEKLETFKRNAGLTDISSDAQLAVSGNAEYEKKRVENGTQINLVRDLAKYINNPLNEYE VLPSNIGLTDNGLTTQLERYNELVIERKRLLRTSTENNPMIINLDMSIRAMKANVKAAIN GTLQGLLIVKADLDREASRFSRRISDAPGQERQYVSIARQQEIKAGLYLMLLQKREENAI TLAATANNAKIIDEPVAEGGPVSPKPKMIYMIALVLGVGLPVGIIFLISLTKFKIEGRGD VEKLTRLPIVGDVPLTNEKAGSIAVFENQNTLMSETFRHIRTNLQFMLENDQKVILVTST VSGEGKSFISSNLAISLSLLGKRVVIVGLDIRKPGLNKIFNIPRKEQGITQYLSNPEKNL MDFVQPSDVSKNLYILPGGIVPPNPTELLARDGLDKAIETLKKNFDYVILDTAPAGMVTD TLLVGRVADLSVYVCRADYSRKAEFTLINELAADNKLPNICTIINGLDLKKKKYGYYYGY GKYGKYYGYGKRYGYGYGYGEHTVKEE >gi|225935372|gb|ACGA01000020.1| GENE 78 91972 - 92733 504 253 aa, chain - ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 39 218 76 251 387 75 32.0 1e-13 MRKLKRLTLGALLAFLLVSCQSYKKVPYLQDTAFVNDTEQSVLQTGVKVMPKDLLTIAVS CSTPELAAPFNLVNSSTDAPQQYLVDNQGNINFPVLGEIHVGGLTKLEIENLIIDKLKVY LKEAPLVTVRIVNYRISVLGEVNKPGSFVVSNEKINLLEALAMAGDLTIYGVRDNVKLIR TGQDNKQEIITMDLNKAETVLSPYYQLQQNDIIYVTPNKTKAKNSDVGTSTGLWFSGISI LMSIANLLIGILR >gi|225935372|gb|ACGA01000020.1| GENE 79 92798 - 94204 997 468 aa, chain - ## HITS:1 COG:VC0934 KEGG:ns NR:ns ## COG: VC0934 COG2148 # Protein_GI_number: 15640950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Vibrio cholerae # 110 468 111 465 465 235 38.0 2e-61 MQEVQRFNKVLKSFVLSGDIILLNLLLWVFGSLWGNRSPFEYSVSLFQNMALMTLCYLVC NIRSGVILHRPVVRPEQIMLRVARNMIPFVLIVFGLSYIFHFECVNLRQLGIFYVVLIIV IISYRLTFRSILELYRKSGKNVRKVVLVGSHENMQELYHSMTDDPTSGYRVLGYFEDFPS DRYPMNIAYLGQPCEAVDYLTRNAGKVDQLYCSLPSARSAEIVPIINYCENHLIRFFSVP NVRNYLKRRMYFEMLGNVPVLSIRREPLELLENRIVKRGFDIICSLLFLCTLFPIIYIIV GLAIKISSPGPVFFKQKRSGEDGREFWCYKFRSMRVNAQSDTLQATECDPRKTRIGNLIR KTNVDELPQFINVLKGDMSLVGPRPHMLKHTEEYSHLINKYMVRHFVKPGITGWAQVTGF RGETKELWQMEGRVQRDIWYIEHWTFILDLYIMYKTVYNVIRGDKEAY >gi|225935372|gb|ACGA01000020.1| GENE 80 
94355 - 94504 84 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160883759|ref|ZP_02064762.1| ## NR: gi|160883759|ref|ZP_02064762.1| hypothetical protein BACOVA_01731 [Bacteroides ovatus ATCC 8483] # 1 49 1 49 49 87 100.0 2e-16 MKRKWHHCILGFVMGRLLKNTGVTVRDMLKEIHMSSDTYEQLKKATIRS >gi|225935372|gb|ACGA01000020.1| GENE 81 94656 - 95285 437 209 aa, chain + ## HITS:1 COG:no KEGG:BT_1637 NR:ns ## KEGG: BT_1637 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 206 208 281 68.0 9e-75 MTNDFRMSYFMPPIAPIKNELGRIVTPATLIPSCEVSVEQVFQMITCNENLKILTEQVRN SGDIRTAKASLLPYVTPCGTFSRRNSKCFVASSHLVVVDIDHLDSYQEAVEMRNTLFNDH LLRPVLTFISPSGRGVKAFVPYDHLPMANDTNSIIENMKLAMMFTVLLYDTETPPPFGEK RKGVDFSGKDIVRSCFLSHDPGALFRNSK >gi|225935372|gb|ACGA01000020.1| GENE 82 95315 - 97147 1324 610 aa, chain + ## HITS:1 COG:no KEGG:BT_1703 NR:ns ## KEGG: BT_1703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 610 1 612 612 914 74.0 0 MDEIESLCRLCEAVEIAGADIAPTYAEYVQLAFAIATDCGEAGREFFHRLCRVSAKYERE HAERIFSNALTTRHGDVHLGTAFHLAERAGVTLCKEEMMNHRKNAKNAGNAPAKNLTHTH VYNKVDNEEPDESEELQDGSDPNQPLPSFTEADWPKILLLIMSYATSPTQRDVMLLGALT AIGASMERYVRCPYAGKLQSPCLQSFIVAPSASGKGILSFIRLLVEPIHDEIRQKVAEEV KAYKKEKAAYDTMGKERCKVEAPQMPKNKMFLISGNNTGTGILQNIMDANGTGLICETEA DTISAAIGSEYGHWSDTLRKAFDHDRLSYNRRTDQEYREVKKSYLSLLLSGTPAQVKPLI PSTENGLFSRQLFYYMHGIWTWINQFESGETDLEAIFTGIGLEWKKQLDLMKAHGLHTLR LTDEQKQEFNALFADLFFRSGLANDNEMSSSIARLAVNTCRIMAEIAMIRALECDQPYQF KDSSAPLLTPDKEIAADNIKDGIITRWDVTITAEDFHAVLELVTPLYRHATHILSFLPST EVKHRANADRDALFEIMGNQFTRAQLLEQAEKMKIKPNTALSWLNRLIKKGLLINTDDKG VYTRTHVCVC >gi|225935372|gb|ACGA01000020.1| GENE 83 97281 - 97499 193 72 aa, chain - ## HITS:1 COG:no KEGG:BT_1704 NR:ns ## KEGG: BT_1704 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 115 81.0 5e-25 MNEFKIRAYGRMELAQLYSPELTDIAAYRKMKKWISLCPGLLQRLYDLGYESKRRSFTPL EVRVIVDALGEP >gi|225935372|gb|ACGA01000020.1| 
GENE 84 97720 - 98208 634 162 aa, chain + ## HITS:1 COG:no KEGG:BT_1705 NR:ns ## KEGG: BT_1705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 164 166 256 89.0 2e-67 MIRYKIYQNQQQKGLNAGKWFARAVSDETFDLAKLAEHMSKHNSPYSGGVIKGVLSDMVD CIKELLLDGKCVKIDDLAIFGVGIRSKAADTLEDFSLEKNITGMRLKARATGNLSTTNLK LDSQLKQQAEYQKPTTAGGGSDSGDTPDPKPDGGGEAPDPAA >gi|225935372|gb|ACGA01000020.1| GENE 85 98249 - 98356 120 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNSSSPRSVWSFIIKVIITVATAIGGLIGVQSCM >gi|225935372|gb|ACGA01000020.1| GENE 86 98361 - 98786 249 141 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 129 2 97 116 77 40.0 7e-15 MHTITLIIIHCSATPEGKSLSAEACRQDHIFHRRFRDIGYHFYITRDGEICQGRPLEKVG AHCRDHNTHSIGICYEGGLDMAGRPKDTRTLAQRASLLGLLRELRKIFPKTLIVGHHDLN PMKECPCFNCTEEYRELDGII >gi|225935372|gb|ACGA01000020.1| GENE 87 98779 - 99528 695 249 aa, chain - ## HITS:1 COG:no KEGG:BT_0613 NR:ns ## KEGG: BT_0613 # Name: not_defined # Def: putative membrane protein involved in polysaccharide export # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 249 68 317 317 271 54.0 2e-71 MNTKLSGAFALIFFFFFLSACQSYKKVPYLQDAEILKQANTQVAPVQDARLIPGDEVSIL VSTSDPVVSQPFNAQGSTFLLDDQGNINYPVLGKLPLNGLTSREAENLITERLKSYVKER PTVVVRMSGFKVSVLGEVASPGVYPVVNEQINVLEALAMAGDLTIYGVRDNVKLIREDRN GHKQFVTLNLNDADLLLSPYYQLQQNDILYVTPNKTKAQSADIGTSTTMWVSGFSILVSI ASLLVNILR >gi|225935372|gb|ACGA01000020.1| GENE 88 99556 - 100134 491 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 338 86.0 5e-92 MILTKPKSISAGPSDGTGEGVAHSKRWYVALVRMHHEKKVAERLDKMGIENFVPVQQEIH QWSDRRKIVESVLLPMMVFVHVNPKERKEVLGFSTVSRYMVMRGESSPAVIPDEQMARFR 
FMLDYSEEAICMNSSPLARGEKVRVVKGPLTGLVGELVNVDGKSKIAVRLNMLGCACVNM PIGYVEAICEKN >gi|225935372|gb|ACGA01000020.1| GENE 89 100469 - 101434 584 321 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 67 288 65 278 297 60 28.0 5e-09 MRNKNGFSRCAEFYIGRLRKEGRYSTAHVYKNALFSFSKFCGTLNVSFRQVTRESLRRYG QYLYECGLKPNTISTYMRMLRSIYNRGVEAGIAPYVPRLFHDVYTGVDVRQKKALPAVEL HKLLYEDPKSERLRRTQAIAALMFQFCGMSFADLAHLEKSALEQNVLRYNRIKTKTPMSV EVLDTASEMINQLRNREDAQPDCPDYLFDILSGDQKRLDERGYREYQSALRQFNNCLKDL ARALHLQSPVTSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFGLKERTEV NKGNLSYIKNCCVGRVKSVKY >gi|225935372|gb|ACGA01000020.1| GENE 90 101642 - 102613 935 323 aa, chain - ## HITS:1 COG:CAC2918 KEGG:ns NR:ns ## COG: CAC2918 COG1482 # Protein_GI_number: 15896171 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Clostridium acetobutylicum # 1 323 1 310 326 198 37.0 1e-50 MYPLKFEPILKQTLWGGDKIIPFKHLNSDLKGVGESWEISGVENNESVVANGPDKGLTLA DMVRKYREELVGEVNYARFGNKFPLLIKFIDAKQDLSIQVHPTDELAKKRHNSMGKTEMW YVVDADKGAKLRSGFSEQITPKEYKERVLNNTITDVLQEYEIHPGDVFFLPAGRVHSIGA GSFIAEIQQTSDITYRIYDFNRKDANGKTRELHTDLAREAINYEVLDDYRTKYEPLKDEP VELVACTYFTTSLYDMTEEISCDYSELDSFVIFICMEGSCTMRDNEGNELTVSAGESILL PATTQDVTITPEGGSVKLLETYV >gi|225935372|gb|ACGA01000020.1| GENE 91 102754 - 104070 1144 438 aa, chain - ## HITS:1 COG:no KEGG:BF1621 NR:ns ## KEGG: BF1621 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 438 1 482 482 488 51.0 1e-136 MKKKLIYAVLCLVVALPAAAQKYYNDAISITGVSLWQQGESLYIDMQIDMRNLKVDNDRT LTLTPMLVSADHNLTLPEIIINGRRRQKAYVRSMALNSETSLGVPSNKKEVISYTQIIPY QPWMENASLNLEENLCGCGGHQEVVAQELIPNEISTEIKRLSAIHPILSHIQLSADRLEV RSKQYEAHLEFPVNKSVILPDYMNNASELQNIRKMLSETLNDKGLNVKGIYIEGFASPEG ALKLNEQLSVKRAETLKNYLTAQGQIPAGLCHVSFGGENWDGLLKSLEASTLKEKATLID 
IIEHTADISLRKEKLKKVDGGVPYRVMLRELYPVLRKVNCRVDYTTDITVAAQADAEDTN LNAAATALSERNLTAARQYLDKSNPQTAEYANNNGAYYLLNGQPEQAILEFNKAIQKGSE AARSNLEEMGKVMKMQKK >gi|225935372|gb|ACGA01000020.1| GENE 92 104078 - 104659 620 193 aa, chain - ## HITS:1 COG:no KEGG:BF1610 NR:ns ## KEGG: BF1610 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 193 1 195 195 281 71.0 7e-75 MKRIKYLLIAILVFTFTGSMYGQEKNYYMPKFAIKTNGLYWATTTANLGFEVGLGKKLTL DVSGNYNPWKFSDNKQIKHWLVQPELRYWLCERFYGHFFGLHAHYADFNVSNLDILGLGH HRYQGNLYGAGISYGYQWILNKRWSMEATIGVGYARINYDKYNCGHCGSKLDSGHKNYFG PTKVGINIIYIIK >gi|225935372|gb|ACGA01000020.1| GENE 93 104675 - 106822 1607 715 aa, chain - ## HITS:1 COG:no KEGG:BDI_2678 NR:ns ## KEGG: BDI_2678 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 550 1 570 632 122 26.0 7e-26 MKRFIYYITLFGLLLASCTYEDSPENPSTEKNGTTFRFSVDIPDYKTVLSRALASENTVN DLWLMAFDADGLFIGRVHPSLLIGYDNGVGTFQAEIPDHTGIIHVIANYDQWDSFDERAA LQKDERELIPSFTSTKMVFWGRQVLASATDSPRIILYRNQAKVTVENEAENFEVTGYAIC NYASSGTAAPFNPDAAVTPFVIMDGKPTLAHGSTPKASQTSDDCTMEPKYMFENENYFND QTYIIISGRLKGKTDVLYYKIQLLDSNKKPYTVVRNYHYRVVIKSFSENANGSTTFDDAK SAESSNNIYAEIFKDSPTIADNNNNVLTVDRLHFLFVRGGTLNVSAQYTQNGVADHSKIS VVLAEDQGNILHNLNYDGNGTITADVSRIIAGQQEATITVKAGILSRTITVVSSALYQFD PASLSPEVYTARDQDMTLQFTIPANIPSYLYPLKCAITTANLYPVAPNKNLQIEYTNGGY QYIYWATEPGTKTLNFRTSLENSDETILIANDIFKTQEIKVKSRHFTDVSVNGNNLVDYG TNNTAQFTFTIPSYPDYPVSYPLTVFIATGNLHTTQSGWTPVNGGYQYTYASQLSGVQTV IFTTAKAVSDENIVISAPGFSPTTIGIGTVLQRGVAVTNSIRVYQNNNLLRIPNYRVTSS DTGIVPTFTANSRSNYSFTIAAGSKASDVVRFTLQGYTAAYRVEELLQSPEIVLK >gi|225935372|gb|ACGA01000020.1| GENE 94 106852 - 108828 1449 658 aa, chain - ## HITS:1 COG:no KEGG:PG2135 NR:ns ## KEGG: PG2135 # Name: not_defined # Def: putative lipoprotein # Organism: P.gingivalis # Pathway: not_defined # 37 658 32 660 670 156 26.0 2e-36 MKQNKLYILSFMFLLLWGITSCENESPVESTGVSDVNGITLKLSIPRPVTSRAAVTEEAG EDALNENTIQTLDVFIYREGADDCLLYRHFTFSPQLTGTGEHTEVLKDVAQEKFALNVKH 
SVYVIANHTATIPEAGLTLTQLKALAAPVLDADKKQDTFLMDGEQTMVLNDGIIANKEIP VTLRRAAAKIRITLNYVNGFTPLDGETPSKKIVNYAADGSSIAQGAVIPTQLQTMNSFTA RNTGAGYKDQFILYSYANDWTKDTNRETYVLVNVPVKDADNHPIAQNYYRVPVNYRLPDG TAGQEENLYKLERNHLYDILVNIDKQGDTDPHSAVQLNASYTIQDWTTHEILVSVEGVNF IYVKDTKISMPNSTQFTTTFQSSTPDVEITNITVNGVSATNGGKDIKIVSTPNAKSGTIS ITSPLPENFLAKEITFQVKNGAGLTQQVTVSQYPALYLGSDTSADVPGGSDGQNNTKMYI INSFVADFSALPYPDEFDEPFGTGFSHYDPDPALGKSYTDYIRDNAVLGYPLRDSEGAGI DTEENNRRISPHMMLASQYGTTTADSYVKSRIKCRDYVERDATTGETYSDWRMPTQAEIY LIDILQNVKVAEVKAILEGHYYWSSSAASAVRFMDPRVGNGNSFGPLNAAVRCVRDIR >gi|225935372|gb|ACGA01000020.1| GENE 95 108837 - 110282 934 481 aa, chain - ## HITS:1 COG:no KEGG:PGN_0183 NR:ns ## KEGG: PGN_0183 # Name: fimC # Def: minor component FimC # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 22 477 37 459 462 77 23.0 1e-12 MKLIVKYRICLYLFLLAFSLYSCSESETPVSDTADNQARLNFLIRTTPKEEKLTLGGDDS FSSLALYIFNKADQRCEYSELIPEFTPQRLKELSRSVNVSPQTKIIYAIANYNDPDKTFS TPVTSALTLEQLENLTVSGNVFSGNSILMVGKKEVPINSEYVVAEIPMQRLAARLDIYMF KNQELAQSTVTVTSVEFRNQVLTTYGNPDRVTMPADAKMQNVTVPITENGTLQPMPSDLS EIIPANAKASFYTYRNMVPDGKPDANTPYIRITALFDGISYTYRGYLTDQGQTTNKYSLL HNTVYRVMAMLDHPDNQLVIQTTPYPWSVVSSEIGHEVTEGDYRLQSFNGSDTGATTGVV QFPYIWEGEARNETSYADYSFNLTAPAGAVWTATLTNGLDFAFGTAGSVAGTPAVSKGIA RDAAYEIKIGAAKPWNGTAKHTLFYITVDGVKLKINPLQNGIRPFPGDNDTDILISQTEY K >gi|225935372|gb|ACGA01000020.1| GENE 96 110294 - 111151 562 285 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170854|ref|ZP_05757266.1| ## NR: gi|260170854|ref|ZP_05757266.1| hypothetical protein BacD2_03232 [Bacteroides sp. 
D2] # 1 285 6 290 290 561 100.0 1e-158 MLYALLLVTVTGCIKENLDDCETILYFDYLGDGTRDIFLQKIEKVDMYVYNEDKVCIQKT ALNKSELHRQQGTTLNLPSGQYHVVCWGNSFGDTRINEGASLNNNLVGAPHYFTKELITT NDSLYFGEKEITITNENYKVDTVPFSSAHIKMLVELEGLDAGNARTVTSPVSMEMGNLSP TVNFTKSFSNEQISYFPLVNFDSGTQKFGAKFNVLRFNNENEVYLQLHDTQTNEKLHTLQ LKDFMKENDITVDGINEVTIGIRILFNGTAITVKPWNEEVIRPGQ >gi|225935372|gb|ACGA01000020.1| GENE 97 111194 - 112411 1043 405 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170855|ref|ZP_05757267.1| ## NR: gi|260170855|ref|ZP_05757267.1| hypothetical protein BacD2_03237 [Bacteroides sp. D2] # 1 405 15 419 419 763 100.0 0 MKIKQLLLAGMTGLILFSCSDKDDISVSQNEINSLSVSLSGIKSATNSRASSPTDIITSD IKNVNSVLINLTDANGKVITSKSVTKDEVLNSDWNKLTDPAKGLKFINTPQSVSKVYIYG NPGNAVNNNVINTKLAEQQGSAVLYYGMDDDLKPIVDEPIEPTPTSGKTYTANVTIAPIV ARLQITKISFKNAGNFDFTRSIGGATKKATVTWTQFSGNVKGLYMSNFYNTYNQPGTLEN LLMNSTAEGHIHEGMWTFDTNPIIDAAPFASYQVYSSADGTYANLPLDLSGKCYAFNFFP GTAIPQLHLDLSDLVIDGLASTDTEVFNPALANSARFANIVKYYKEINTELTAADFKPGT LYNMEIELIPMLDNDLGNVQYNVLVHVTVAPWNEETITPGFDLEQ >gi|225935372|gb|ACGA01000020.1| GENE 98 112730 - 113827 372 365 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 38 364 27 345 345 147 31 2e-34 MNNTFPTEGNLSGLSQKDFQKDINDKKTDLFILKNTKGMEVAVTNYGCAILSIMVPDKNG KHANVILGHDSIDHVINSPEPFLSTTIGRYGNRIAKGKFTLFGEEHELTINNGPNSLHGG PTGFHARVWDAVQIDESTVQFNYVSADGEEGFPGNLEVEMTYRLENEVNALTIEYRATTD KATVVNLTNHGFFNLAGISNPTPTVNNHIVTINADFYTPIDEVSIPTGEIAKVEGTPMDF RAPHTVGERIDDKFQQLIFGAGYDHCYVLNKMESGSLDLAATCKDPESGRIMEVYTTEAG VQLYTGNWLNGFEGAHGATFPARSAICFEAQCFPDTPNKPHFPSATLLPGDEYQQITVYK FTVEE >gi|225935372|gb|ACGA01000020.1| GENE 99 113879 - 115219 1601 446 aa, chain + ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 12 438 24 412 412 129 28.0 1e-29 MTQEKKNGNLVAIITMFFIFAMISFVTNLAAPFGTIWRNEYAGSNTLGMMGNMMNFLAYL 
FMGIPAGNMLVKIGYKKTALIAMAVGFLGLFTQYLSSLFGAGAEVFAFGEYVIKLNFVIY LLGAFICGFCVCMLNTVVNPMLNLLGGGGNKGNQLIQTGGALNSLSGTLTPLFVGALIGT VTSSTAMSDVAPLLFVAMGVFVAAFIVISFVAIPEPHLQKGGVKKEKFSHSPWSFRHTLL GVIGIFIYVGIEIGIPGTLNFYLADSSDKGAGIMMNGAAIGGAIAAIYWLLMLVGRTASS AISGKVSSRAQLIVVSATAIIFVLIAIFTPKDVTVSMPGYTVGEGFMMAQVPVSALFLVL CGLCTSVMWGGIFNLAVEGLGKYTAQASGIFMMMVVGGGVLPLIQQSISDSVGYMASYWL IIAALAYLLFYGLVGCKNVNKDIPVE >gi|225935372|gb|ACGA01000020.1| GENE 100 115267 - 116421 1372 384 aa, chain + ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 1 383 6 388 389 256 40.0 4e-68 MDTEYVRSRFIKHFDGTTGFLYASPGRINLIGEHTDYNGGFVFPGAVDKGMIAEIKPNGT DKVKAYSIDLKDYVEFGLNEEDAPRASWARYIFGVCREMIKRGVDVKGFNTAFAGDVPLG AGMSSSAALESTYAFALNELFGENKIDKFELAKVGQATEHNYCGVNCGIMDQFASVFGKA GSLIRLDCRSLEYQYFPFHPEGYRLVLMDSVVKHELASSAYNKRRQSCEAAVAAIQKKHP HVEFLRDCTMAMLEEAKADISAEDYMRAEYVIEEIQRVLDVCEALEKDDYETVGKKMYET HHGMSKLYEVSCEELDFLNDCAKEYGVTGSRVMGGGFGGCTINLVKDELYDNFVEKTKAA FKAKFGRSPKVYDVVIGDGSRRLE >gi|225935372|gb|ACGA01000020.1| GENE 101 116541 - 118523 1877 660 aa, chain - ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 41 619 54 623 835 421 40.0 1e-117 MRKYTNLLAVLALSTGMALHAQTNELVIQTKKLGAEIQPTMYGLFFEDINYAADGGLYAE LVKNRSFEFPQHLMGWNTYGKVTLMDDGPFERNPHYVRLSDPGHGHKHTGLDNEGFFGIG VKKGEEYRFSVWARLPQGSAKETLRIELVDTKSMGEHQAFASENLTIDSKEWKKYQVILK PGITQPKSVLRIFLTSKGTVDLEHVSLFPVDTWKGHENGLRKDLAQALADIHPGVFRFPG GCIVEGTDLETRYDWKKSVGPVENRPLNENRWQYTFTHRFFPDYYQSYGLGFYEYFLLSE EMGAAPLPILNCGLSCQYQNNDPKAHVAVCDLDNYIQDALDLIEFANGDVNTTWGKVRAD MGHPAPFNLKFIGIGNEQWGKEYPERLEPFIKAIRKAHPEIKIVGSSGPNSEGKDFDYLW PEMKRLKVDLVDEHFYRPESWFLAQGARYDNYDRKGPKVFAGEYACHGKGKKWNHYHAAL LEAAFMTGLERNADIVHMATYAPLFAHVEGWQWRPDMIWFDNLNSVRTTSYYVQQLYAQN 
KGTNVLPLTMNKKNVTGAEGQNGLFASAVYDKGKNELIVKVANTSATIQPISLNFEGLKK QDVLSNGRCIKLRSLDLDKDNTLEQPFAIVPQETPVSIEGNVFTTELEPTTFAVYKFTKK >gi|225935372|gb|ACGA01000020.1| GENE 102 118550 - 120070 1200 506 aa, chain - ## HITS:1 COG:BH1878 KEGG:ns NR:ns ## COG: BH1878 COG3507 # Protein_GI_number: 15614441 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 39 506 24 479 781 135 24.0 1e-31 MTNLFMPLFFVAGSLLTSCTSAVFSPTPSANPWDDNYLSVAKMEDYRQWGTYNVHDPSCR KLGDYYYMYSTDAIFGENRKEAEEKGVPLGYIQMRRSKDLVHWEFLGWAFPEIPEEAVQW VQTHADGKGATNIWAPYIIPYKDKYRLYYCVSAFGRKTSYIGLAESNSPEGPWTQIGSIV KTNDSIVMNAIDPSVIADEITGKWWMHYGSFFGGLYCVELNPETGLALNEGDLGHLVARR ANYRKDNLEAPEIIYNPNLKKYYLFTSYDPLMTTYNVRVSRSDAAQGPFTDYFGKAEKDT TNNFPILTAPYRFENHPGWAGTAHCGVFTDGQGNYFMAHQGRLSPQNQLMVLHVRQLFFT PDGWPVVSPERYTGTPSRKFTEVDLVGEWEMIRVQEPKYERQLEAGQILWGEGKLKDGEW NLSTRIKLAKDGTCTGEITDDRWKIVPMNGNWSFLTEKHLLMIKSDSEKIENLIIFAGHD WENETETILFTGLDSRGCSVWGKRIK >gi|225935372|gb|ACGA01000020.1| GENE 103 120247 - 124518 2743 1423 aa, chain + ## HITS:1 COG:SMc04212 KEGG:ns NR:ns ## COG: SMc04212 COG0642 # Protein_GI_number: 15965635 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 868 1137 217 489 511 145 30.0 5e-34 MKRHLLTLLIIYSVLFFAGWKSVHAQERFADRYNITYVTMNEGLPHNFIDDLYKDSRGFL WISTAGGGVSRYDGYEFVNYNPNTPNCKLKSNFIINVCEDSAQRLWMVSEGGTDIIDLTT LKPTVPRDAKGVLKNILNQPAVHVMKDSKGCIWLHCANKLNRIVFNEKGDVETVSTLDLV VLNGPNIALQDIDEDGKIWAGINGEILKVDWDSQGKLTTTPIAECLKFEPGTYISDFLMK ENEVWISTDRGLFRYNKSGNIVKRYEHDPNNPRSLSQNYLTDLSITNDKQLIIATLRGVN IYNPMTDDFECIASGDFQNGSSNLLNSNFINCILSEEDHIWFGTETGGINMLSPRRLSIR NYLHNKENPSSLSYNPVNAIYEDKYGTLWVGTVEGGLNQKEHGSEKFTHFTRDHGGLSHN SVSALTADPDDHLWIGTWGGGINLLDLKAPQEVLKVISSQTSSGFPIDFVGSLTYDPINK GIWIGANQGLYFYEPETGTISAPLADKVAENIHGCIGSIIDKEGKLWIGCLEGVYIIDLH SRSAAGEFEYRHLNYKLNDPNSRLIEKITCFFETKDGTLWLGSSGYGIYKRTTNEQGKEI FVSYSTPQGLPNSSVRGILEDGNGYLWIGTNNGLSCYHPEENRFINYTLQDGLINTQFYW 
NASCRSTQGLLYFGSVGGLVAIENNRPTISLPAAKVRFTRLRIGNEEILPGSEYLPEDIA ITTELRLHEKEKSFSLEFSALNFKPSNTAIYSYRLLGFDDKWMQVSGNRRFASYTNLPPG DYTLQVKYTPDRENEGENVTELKITIVPYFYKKVWFILLIIILALVSVWQFYQWRIRTLK RQKEYLHRTVEERTHELEQQKHLLENQTEELSRQNQMLTQQNEKITKQKAQLIRMSHKVQ ELTLDKISFFTNITHEFRTPITLIIGPIERALKLSYNPQVIEQLHFVERNSKYLLSLVNQ LMDFRKVESGKLEIVKTRGNFLKFIDSLITPFEVFAGERNIVLKRYYRMETPEILYDEEA MRKVVTNLLSNAIKFTPNGGTVSLYISSLSSGEGGKESLYICVCDTGPGIPEEDLNRIFN RFYQSQNQVKYPVYGQAGTGIGLYLCKRIVQMHGGEIKVRNNRLSGCSFRLLLPLQREEE KDDKLIIINSNDSSINATPTSQTPKEKETLTILVVEDNVDMRGYIRSILREQYNVLEAAD GEEALHILNSNPVDFIISDLMMPVMDGVELSRRVKDTFAISHIPFLMLTAKTSQEARLES YRMGVDEYLLKPFDETLLLTRIQNILENRKRYQRKFTLDMDVDVLNMEEESGDKKFMNQI MEVIKENYKNSYFEVSDFSEAVGVSKSLLNKKLQSLIGQSAGQFIRNYRLNIARELILKN RETKNMNIAEIAYEVGFNDPKYFTRCFTKYFNTTPSSLLNKEE >gi|225935372|gb|ACGA01000020.1| GENE 104 124703 - 124798 62 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNDKKMAVVSCIASAKSNLPPIAQQFNLSN >gi|225935372|gb|ACGA01000020.1| GENE 105 124819 - 128028 2619 1069 aa, chain + ## HITS:1 COG:no KEGG:BVU_1005 NR:ns ## KEGG: BVU_1005 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 1069 1 1069 1069 1934 89.0 0 MRKHLFFSVMLSMCTFGGMAVYPTPAMATVAQSPTIKVRGQVIDEQGEPMPGATIKIKGG QGGTVTDLDGNFQLEVPGNATLLISYVGFKEREVAVRNRAIIEDIQLQADNKMLEQVVVV GYGTQKKSDLTGSVAVVDAEALKQTSNSNISTMLEGKVSGVQITSDGQPGADPTVRIRGV GSFGDTSPLYVIDGVPMGTSIRDFSSNDIETIQVLKDASAAAIYGSRAANGVIIITTKRG QKDQPLKVDYNGYFGVDNIPKGVYDVMNADQYSQYLGQTAANSNTPLPTGYKLDSETGKY HFQDDTNTNWFDEVFKAGIRQNHNVNLSGGGAHNTYNISLDYFNQKGTLEGAGPNYERYT ARVNNTMETKFIKFQTSLVYSHSDQDNMGLSNASEYVQGLYGDVTNVLRGTLMMQPTIKA YDSSTWVLDNLVGIANNFNYDAYGYGVYYDTVHGDISASNPLLVNNLLQRNTRVDRFVGT GSADMDILKMIGVDNKNHKLNYKVNLSYSKTHCKDFTWIPAWVQSNRVYLSKSNERLTKG SRDYSDALIENVITYDGTIGKHHINVLAGQTYEEEDTDLLTGWGINFTEPYFLQLQNASD TYSESYEYKHSILSYIGRINYNYDDRYLFSATVRRDGSSRLTRNIRWGTFPSVSVGWRFD KEKFFPFDQNVVNLFKVRASYGELGNENIGEYMYQAVMSRNNMTYNFNGNVVTGSAVSTF VDNNLAWEKKKTYNVGIDLALFNNRLEFTAEWYKNTSEDLLYAVPVPEQAGVSNTTVTMN 
AASMNNSGFEFSATYRNRDRDFKYELSANLSTLRNRVTSLGFGTDSYISGAYITNVGQEI GQFYGWAYEGIARTQEDLDNHATQEGAQIGDCLYKDISGPDGKPDGKVDANDQVVLGSGM PKIHFGLNARFEYKRFDLSIATFGALNYHVSDDIYNSLNSCYGWGNKEVGMLDANRFSED GSTYLSNVPRTYVTNSASLGWNDLFSSRKIQNAAYWKIANVELGYNFPNEWFGKYVSDVR FYVSAQNLHTFTGYHGYNVDYAGGAFTPGYNFCSYPTARTFMCGVHFTF >gi|225935372|gb|ACGA01000020.1| GENE 106 128068 - 129819 1531 583 aa, chain + ## HITS:1 COG:no KEGG:BVU_1006 NR:ns ## KEGG: BVU_1006 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 583 1 583 583 1013 86.0 0 MKLYKYSLILILGFSTLVSCSDKMDLLNPNQQTSSTFGFNADDLEESVIAAYNHIRMEGS YARVGYTIDVCRGDEAWNSSQVWYMPFDDLNAEVTSDITWWPWREWYYTINVCNFAISRC GEDNSQLSNKMKRIKGQVLFLRGLSYYNLVGYYQNPPLITDYATYSSLDGLYTTNSTYDN VLDQVEKDFHEAMELLPSRDEGGEWAGGRATCGAAAGYYARALMMRQKYSDALTVLKDII GKKYGTYRLMSNYGDNFREGSAYENNAESLFEVQYLDYGAQGTDDEWTPVNTSPNATQGS AIESNFGPGNYGGWADISASPWLYQLFKSERTINGKLDPRLYWTIGTYENDWADFEYGNV AYTHKLTATDNIVTNNTYGGLPIAKFTNLRTGLYSTIVTGLHDGINLRLMRYSDVLLRAA ECENEVNGPTQQAIDWINEVRNRADLSDLDLSDFDTTDQLFEQIANIERPKEFGCEFGRG FDLIRWGFFYDSSRLKQLKEHGTVRRSTVGVKNPVNYSDVTNDPELKSTFDTYISGHEFL PIVQQLMNNNPNLKGNSANYSTDNSTYFRESGCKVRPVVNLSK >gi|225935372|gb|ACGA01000020.1| GENE 107 129852 - 131633 1267 593 aa, chain + ## HITS:1 COG:no KEGG:BVU_1007 NR:ns ## KEGG: BVU_1007 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 591 1 592 592 781 67.0 0 MRLLKKIAYACCSVTFGLLTLTGCTDGKLYDVNAPDWISDKIQEIEDSKKQPEDEVLEGM QEDVYTIGNTDFTSGWWSTFSKYYVVPDGEKWNAVFNLHINPTDNTYFKNYALVVTNDVD RGGTGYTEYGAIRFDATSDSIAYNSQWGKLYFKYTTSTLLLSPDASNADPNVQSLGGKVT LTVDRTNENAFTIKMTNGVATKIYKQPYKEGNLNADANNTNIRCFLVPEGSYIDFLQTNI VPIGGLTSALDKNPVSMVLQDVPDQVNVGTPLEEAMIDVSAIVTFEEGVTKTVPAAELSI SAIPDMEQPGVKTLVVIYNKTLKGENCDKPIMANTTFEVVEQIASIEVTTPPSHTQYYYY TSAATESLANRTMAFDPTGMVVTATYTDQSTRIIDNARLIFSAVPVKTGSQIVTITAEEG ITATVEVNVSKSIASEVRNTTSMVGAEDNSTSFHGASSDIFNVPVGKTKYITFTNYSDLA ANWNNFLVFLHKSITTDEYAFVRADNFGLGNGYAACVNGGTQGDWGTWLSGMNGAKVTVY 
VTNCGNSTADIQAVMVGTKGKTSTQYYLGINTIDPNDLCFTLSVEKAHLVFNE >gi|225935372|gb|ACGA01000020.1| GENE 108 131668 - 133596 1564 642 aa, chain + ## HITS:1 COG:BH1878 KEGG:ns NR:ns ## COG: BH1878 COG3507 # Protein_GI_number: 15614441 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 161 587 40 428 781 97 24.0 9e-20 MKKLTFFLISLLTVFAFAACSDGLDNIEYNRNSDIVIVTPEVTTVTGTSLTVTSSVTGNI NQVVKKGFCYSINAQNPTIKDNIVDADENFSATISGLTGNTSYYIRSYVYGNSRYTYSDV LTATTGSQSLDEQLKNYVAPAYEDNYIGIAAWNQRSKWNLANVHDPTVMKADDGYYYMYQ TDASYGNAHSGNGHFHARRSRDLVNWEYLGATMTETPPTWIKEKLNAYRAEMGLEPIDSP SYGYWAPVARKVATGKYRMYYSIVITNYIKTGKPEVDNNGNFDGSWTERAFIGLMETADP ASNIWEDKGFVVCSASDKGKTDYGRVSTSDWNGYFKINAIDPTYIITENGEHWMIYGSWH SGIAALQLNPEDGMPLHTLGNPWDITGENNSGYGKIIATRGNSRWQASEGPEVIYRNGYY YLFLAYGTLAVEYNTRVCRSVNIDGPYVDMDGTPAMGSGELYPILTAPYLFNNSYGWVGI SHCGIFEDGEGNWFYTSQGRFPANVGGNEYSNAIMMGHVRSIRWDANGWPLVMPERYGAV PQVPITEEEIAGNWEHLSLTSSTSTQRTSETLTFDAGTHKISSGSWKDASWTFDAASQTI TTSAGVVLYLQREVDWEATPRTHTIVYAAQGNKKTYWGKKVQ >gi|225935372|gb|ACGA01000020.1| GENE 109 133767 - 134906 400 379 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 40 378 12 345 345 158 30 1e-37 MKKHFLLAGIAALMLAACNNKPASELTLSGLDPVKFQTEVNNAQTALYTLKNKAGMEVCI TNFGGRIVSIMVPDKNGQMQDVVLGFDSIADYINVPSDFGASIGRYANRINQGRFVLDGD TIQLPQNNYGHCLHGGPKGWQYQVYEANLIDPTTLELTRISPDGDANFPGNVTAKVTYKL TDDNAIDIKYSATTDKKTIINMTNHSYFNLSGDPSKISTDHIMYVNADYYTPVDSTFMTT GEIAPVKDTPMDFTTPKAVGKEINNYDFVQLKNGNGYDHNWVLNTKGDLSQVAAKLTSPE SGITLEVYTNEPGIQVYTGNFLDGTVTGKKGFVYNQRASVCLETQHYPDSPNKADWPSVV LEPGQTYNSECIFKFGIEK >gi|225935372|gb|ACGA01000020.1| GENE 110 134938 - 136683 1721 581 aa, chain + ## HITS:1 COG:yidK KEGG:ns NR:ns ## COG: yidK COG4146 # Protein_GI_number: 16131549 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 20 476 16 445 571 138 26.0 2e-32 
MNWNSHEFIWIDWVIIVVGILAVTWAVWRSIQKDKRLQQGANSEDYLFGKGEPWYIIGAA IFAANIGSEHLVGLAGTGAKSGVGMAHWEMQGWMILILGWLFVPFYQLMNVKLGKIITMP DFLKYRYTPATGSWLSIITLIAYVLTKVSVTAFTGGIFMESLLGLPFWFGAIGLIVLTGI FTVLGGMKGVMTLSAIQTPILIIGSFLVLFLGLSALGAGSISDGWTHMIEYAKTLNIGTD GVAHGTNHMFHFETGDPMYDDYPGFWVFIGASIIGFWYWCTDQHIVQRVLGQRKGESSEE VMRRARRGTIAAGYFKILPVFMFLIPGMIAAALSARPESGFVLDNPDTAFGSMVKFVLPA GVKGIVTIGFISALVASLAAFFNSCATLFTEDFYKPMFKDKSEATYVLVGRIATIVVVIL GIVWIPIMQSLGSLYDYLQGIQSLLAPAMVAVFALGIFSKKITPKAGETAMIVGFIIGMV RLLTNILTNTGKDVMSGWYWECTAWFWQTNWLIFEIWLLVFLIVLMVVVSFFTPAPTAKQ IEAITFTDDYKTLIKQSWNKWDVITSIGVVVLCALFYIYFW >gi|225935372|gb|ACGA01000020.1| GENE 111 136693 - 137370 709 225 aa, chain + ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. PCC 7120 # 11 217 21 237 248 126 33.0 4e-29 MNKYYSTNPTFYVGIDCIIFGFSEGEISLLLLKRNFEPAMGEWSLMGGFVQKDESVDDAA KRVLHELTGLENVYMEQVGAFGAVERDPGERVISIAYYALININEYDRKLVQKHNAYWVN MNELPPLIFDHPEMVEKARELMKQKASVEPIGFNLLPKLFTLSQLQSLYEAIYDEPMDKR NFRKRVAEMDYIEKTDKIDKLGSKRGAALYKFNSKAYRKDPKFKL >gi|225935372|gb|ACGA01000020.1| GENE 112 137376 - 138062 784 228 aa, chain + ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## COG: ECs5174 COG0235 # Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 2 228 1 228 228 318 65.0 5e-87 MLEELKEKVFHANLELVKHGLVIFTWGNVSAIDRESGLVVIKPSGVSYDDMKAEDMVVVD LDGKVVEGRLKPSSDTPTHVVLYKAFPEIGGVVHTHSTYATAWAQAGCDIPNIGTTHADY FHDAIPCTADMTEAEVKGAYELETGNVIVKRFEGLNPVHTPGVLVKNHGPFSWGKDAYDA VHNAVVMEQVAKMASIAYAVNPNLTMNPLLIEKHFSRKHGPNAYYGQK >gi|225935372|gb|ACGA01000020.1| GENE 113 138115 - 139641 1542 508 aa, chain + ## HITS:1 COG:TM0276 KEGG:ns NR:ns ## COG: TM0276 COG2160 # Protein_GI_number: 15643046 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose 
isomerase # Organism: Thermotoga maritima # 1 505 1 495 496 504 50.0 1e-142 MNAFNQYEVWFVTGAQLLYGGDAVIAVDAHSNEMVNGLNESGKLPVKVVYKGTANSSKEV EAVFKAANNDEKCVGVITWMHTFSPAKMWIHGLQQLKKPLLHLHTQFNKEIPWDTMDMDF MNLNQSAHGDREFGHICTRMRIRRKVVVGYWKEEETLHKITVWMRVCAGWADSQDMLIIR FGDQMNNVAVTDGDKVEAEQRMGYHVDYCPVSELMEYHKDIKNEEVDALVATYFKEYDHD ASLEDKSTEAYQKVWNAAKAELAIRAILKAKGAKGFTTNFDDLGDIEYNGFDQIPGLASQ RLMAEGYGFGAEGDWKSAALYRTVWVMNQGLPKGCSFLEDYTLNFDGANSSILQSHMLEI CPLIAANKPRLEVHFLGIGIRKSQTARLVFTSKVGTGCTATIVDMGNRFRLIVNDVECIE PKPLPKLPVASALWIPMPNLEVGAGAWILAGGTHHSCFSYDLTAEYWEDYAEIAGIEMVH INKDTTISCFKKELRMNEVYYMLNKALC >gi|225935372|gb|ACGA01000020.1| GENE 114 139744 - 141339 1309 531 aa, chain + ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 529 1 531 534 648 59.0 0 MKLDAKSTIEAGKAILGIELGSTRIKAVLIDQENKPIAQGSHTWENQLVDGLWTYSIEAI WSGLQDCYADLRTNVKNAYGIEIETLAAIGVSAMMHGYMPFNKKEEILVPFRTWRNTNTG RAAAALSELFVYNIPLRWSISHLYQAILDNEAHVNEIDFLTTLAGYVHWQITGEKVLGIG DASGMLPIDPTTNNYSAEMVAKFDKLIASKEYSWKLQDILPKVLSAGENAGILTPEGSKK LDASGHLKAGIPVCPPEGDAGTGMVATNAVKQRTGNVSAGTSSFSMIVLEKDLSKPYEMI DMVTTPDGSLVAMVHCNNCTSDLNAWVNLFKEYQELLGIPVNMDEIYSKLYNIALTGDTD CGGLLSYNYISGEPVTGLADGRPLFVRSANDKFNLANFMRTHLYASVGVLKIGNDILFNE EKIKVDRITGHGGLFRTKGVGQRILAAAINSPISVMETAGEGGAWGIALLGSYLVNNEKK QSLADFLDERVFVGDAGVEVSPTPEDVAGFNTYIENYKAGLPIEEAAVKFK >gi|225935372|gb|ACGA01000020.1| GENE 115 141453 - 143855 1777 800 aa, chain + ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 30 721 4 666 758 476 39.0 1e-134 MKTTSFILALVLSTSLAKAQNAAQVSYFPLQNVKLLDSPFLQAQQTDLHYILALNPDRLL APFLREAGLQPKAPSYTNWENTGLDGHIGGHYLSALSMMYAATGDTAVYNRLNYMLNELH RAQQAVGTGFIGGTPGSLQLWKDIKAGKIRAGGFDLNGKWVPLYNIHKTYAGLRDAYLYA 
GSDLARKMLIDLTDWMIDITSGLSDEQMQDMLRSEHGGLNETFADVAEITGDKKYLKLAR RFSHKLILDPLIKDEDKLTGMHANTQIPKVIGYKRIAELSQDDKSWSHAAEWDHAARFFW NTVVNHRSVCIGGNSVREHFHPSDNFTSMLNDVQGPETCNTYNMLRLTKMLYQNSHNPNN TQEPDPNYVNYYERALYNHILASQEPDKGGFVYFTPMRPGHYRVYSQPETSMWCCVGSGL ENHTKYGEFIYAHQRDTLYINLFIPSQLTWKEQGVTLTQETRFPDDGKVTLRIDEAPKKK RTLMIRIPEWANQSKGYSISINGKRKIFIMAKGNQYLPLSRKWKKGDVITFNLPMRVSME QIPDKKDYYAFLYGPIVLAASTGTEHLDGLYADDSRGGHIAHGKQIPLQEVPMLIGNPDS IRKSLHKEQGSRIAFSYNGDVYPAQGKALELVPFFRLHNSRYAVYFRQTSEEQFKAIQEE MATAERKATELANQTIDLIFPGEQQPESDHGIQYEQAETGTNTDRHFRRAKGWFGYQLKV KEEASRILVTIRKDDRNKVAILLNNEKLAVNPTISEADKDGFITLSYVLPQKLNTGSCPI RFIPDETEWTPAIYEVRLLK >gi|225935372|gb|ACGA01000020.1| GENE 116 143922 - 145466 1667 514 aa, chain + ## HITS:1 COG:BH1874 KEGG:ns NR:ns ## COG: BH1874 COG3534 # Protein_GI_number: 15614437 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 25 514 4 497 498 590 54.0 1e-168 MKAKLLVSTAFLAASVSLSAQKSATITVHADQGKEIIPKEIYGQFAEHLGSCIYGGLWVG EKSDIPNIKGYRTDVFNALKDLAVPVLRWPGGCFADEYHWMDGIGPKENRPKMVNNNWGG TIEDNSFGTHEFLNLCEMLDCEPYVSGNVGSGTVEELAKWVEYMTSDGDSPMANLRRKNG RDKAWKLKYLGVGNESWGCGGSMRPEYYADLYRRYSTYCRNYDGNRLFKIASGASDYDYN WTDVLMNRVGHRMQGLSLHYYTVTGWSGSKGAATQFNKDDYYWTMGKCLEVEDVLKKHCA IMDKYDKDKKIALLLDEWGTWWDEEPGTIKGHLYQQNTLRDAFVASLSLDVFHKYTDRLK MANIAQIVNVLQSMILTKDKEMVLTPTYYVFKMYKVHQDATYLPIDLTCEKMNVRDNRTV PMVSATASKNKDGVIHISLSNVDADEAQEITINLGDTKAKKAAGEILTSAKLTDYNSFEN PNIVKPVPFKEVKINKGTMKVKLPAKSIVTLELQ >gi|225935372|gb|ACGA01000020.1| GENE 117 145609 - 147618 2344 669 aa, chain + ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 10 669 9 663 666 462 41.0 1e-129 MNDNKLMNRAADNIRILAASMVEKANSGHPGGAMGGADFVNVLFSEFLVYDPENPRWEGR DRFFLDPGHMSPMLYSTLALTGKFTLDELKEFRQWGSPTPGHPEVDIMRGIENTSGPLGQ GHTFAVGAAIAAKFLKARFNEVMNQTIYAYISDGGIQEEISQGSGRIAGALGLDNLIMFY 
DSNDIQLSTETKDVTVEDTAMKYEAWGWNVLSINGNDPDEIRAAIKEAQAEKERPTLIIG KTVMGKGARKADGSSYEANCATHGAPLGGDAYVNTIKNLGGDPTNPFVIFPEVAELYAKR AEELKKIVAEKYAAKAAWAKANPELAAKLELFFSGKAPKVDWAAIEQKAGSATRAASATV LGALATQVENMIVASADLSNSDKTDGFLKKTHSFKKGDFSGAFFQAGVSELSMACICIGM SLHGGVIAACGTFFVFSDYMKPAVRMAALMEQPVKFIWTHDAFRVGEDGPTHEPVEQEAQ IRLMEKLKNHKGHNSMLVLRPADAEETTIAWKLAMENMSTPTGLIFSRQNIANLPAGTDY EQAAKGAYIVAGSDENPDVILVASGSEVATLVAGTELLRKDGVKVRIVSAPSEGLFRNQP KEYQEAILPADAKIFGMTAGLPVTLQGLVGCHGKVWGLESFGFSAPYTVLDEKLGFTAEN VYNQVKAML >gi|225935372|gb|ACGA01000020.1| GENE 118 147618 - 148052 538 144 aa, chain + ## HITS:1 COG:TM1080 KEGG:ns NR:ns ## COG: TM1080 COG0698 # Protein_GI_number: 15643838 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Thermotoga maritima # 4 141 3 140 143 147 51.0 6e-36 MKTIGLACDHAGFELKEYVRGWLEAKGWAYKDFGTNSTASVDYPDYAHPLALAVESGECY PGIAICGSGNGINMTLNKHQGVRAALCWNAEIAHLARQHNDANVLVMPGRFISTEEADMI LTEFFSTKFDGGRHQNRIDKIPVK >gi|225935372|gb|ACGA01000020.1| GENE 119 148162 - 148644 475 160 aa, chain - ## HITS:1 COG:NMB0932 KEGG:ns NR:ns ## COG: NMB0932 COG2839 # Protein_GI_number: 15676826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 5 142 3 142 162 66 35.0 2e-11 MLDIILIVISALCMIAGLAGCILPFLPGPPIAYLGLVILHFTDKVEYTTTQLIVWLLIVA VLQVLDYFSPMLGSKYSGGSKWGNWGCIIGTLVGLLFLPWGIILGPFLGAVIGELLGNKE FSQALKSGFGSLIGFILGTLLKFVVCGYFCYQFIAGLFFR >gi|225935372|gb|ACGA01000020.1| GENE 120 148923 - 149360 266 145 aa, chain + ## HITS:1 COG:no KEGG:BT_4511 NR:ns ## KEGG: BT_4511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 145 10 144 144 175 57.0 5e-43 MNDNKLKNKPVPYNYARCFNEHCPKAPDCLRHVAALHTTADTSFISIVNPVSIPTDSTAC PYFKTAEKMHVAWGISHLLDKVPFKDGTSIRNQLISHFGKNLYYRFYREERPITPNDQAF ICQLFRRKGIMEEPAFDSYTDEYNW >gi|225935372|gb|ACGA01000020.1| GENE 121 149587 - 150156 266 189 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns 
## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 10 185 16 192 199 61 26.0 9e-10 MNTNIRKILNDKYGLFEKDINILLEKFTRLSLKKKETILQEGQADHYIYFVEKGIVKSTI FREGREFIVFFALENEVPFSSPDLIHSGQSLYTLETIDDCILWRISRKDLADLFGTSINL SNWGRMLLQEWLAVSSSYFSSIHWMSKKEQYQYLLEKMPQLVQRLSMRDLSAWLDITPQS LSRIRADIY >gi|225935372|gb|ACGA01000020.1| GENE 122 150226 - 151140 473 304 aa, chain + ## HITS:1 COG:FN1744 KEGG:ns NR:ns ## COG: FN1744 COG0697 # Protein_GI_number: 19705065 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 7 288 9 290 293 67 25.0 2e-11 MKSLKGIIYAMISSGTFGLIPLFTIPLMGDMGMNESNILFYRFLFSTLMMGAVCLIHKSP LRVNMKHLISITGLGALYALTALFLIYSYHYISSGVATTIHFLYPICVSFLMVVFFKEQK SKALFLAASLSLIGVAFMCWSGGSSIRLMGVGLAALTILTYGIYIVALNQSGLGKLPAEV LTFYVLLGGCIIFFIYSLCTTGISAIPGTKAGLYILALAFLSTVISDLTLILAVKYAGST TTAILGSMEPLVAVVVGVLVFAEHFSFQSLAGLLLILLSVIIVILADQRKKSQKIPEKEI VQSH >gi|225935372|gb|ACGA01000020.1| GENE 123 151250 - 153451 1786 733 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 10 729 14 761 790 482 37.0 1e-135 MTIKIPILLLNIVGGLLVLSCTTQPKVNADYSVYVNPFIGNADNGHTFPGACVPFGMIQA SPETGNDVWKYCSGYNFADDSIIGFAQTHLSGTGCPDLGDILLLPFSGKIENNVYKGKFD KRSENASPGYYAVHLTDFDVDVELTSTQRTAFHRYVYHSDAPSGLLVDLQSGVVWDREWL KKHVMYADMDLPDNHTITGHQMVENWVKRSYYYVISFDKPYTVKDTLSSVDPEKGKRLLL DFNLAKGDTLQVKIALSTVGVEGAIAALEKENPAWNFEGIRTKAHNQWNQLFSRVKVTGS DAQKTNFYTSLYHLFIQPNNIADIDGRYRGANDSIYTSKTGSYYSTLSLWDTYRAAHPLY TILCPEKVDGMIQSMLDHYKIQGYLPIWALWGKESHCMIGNHAIPVIVDAYLKGFRGFEP 
EEAYAAIRGTSTVSHQHSDWEIYDQYGYYPFDLIPKESVSRTLESGYDDYCVAQMAKALG NDEDYAYFSKRSSYYKNLLDPSTTMMRGKDSKGDWRTPFNTLLLSHAATSGGDYTEGNAW QYTWHVQHDVEGLIELLGGKEKFANKLDSLFFLESSKENTGFTQDVTGLIGQYAHGNEPS HHVAYLYNYVGQPHKTQQLIREIFDRFDLPKPDGLCGNDDCGQMSAWYIFSAMGFYPVNP IGGEYILGAPQIEEVTISLPNHKTFTVQAKGLSDENKYVQSVTLNGKPVENFRIYHSDIM KGGELVFVMTDRY >gi|225935372|gb|ACGA01000020.1| GENE 124 153469 - 155736 1641 755 aa, chain - ## HITS:1 COG:no KEGG:BVU_0137 NR:ns ## KEGG: BVU_0137 # Name: not_defined # Def: delta\-4\,5\ unsaturated\ glucuronyl\ hydrolase # Organism: B.vulgatus # Pathway: not_defined # 8 394 10 404 408 418 54.0 1e-115 MNLIKKRIFSLWASLCILCAVNAQETELTYCNKQIHKTLKAIGTSSKIPRSMDAHKLHWN MVGVDDWTSGFFPGVLWYDYEYSREPEIKEKAVYFTRLLEPLSHKVTSHDLGFQIFCSYG NAYRLTKDKYYKDVILKSAEELSKLYNSRVGTILSWPWKVAENGWPHNTIIDNMMNLELL FWASKNGGGKKYYDMALSHARVTKENHFRQDGSCYHVAVYDTISGRLIKGITHQGYSDRS MWARGQAWAIYGYTMVYRETRDKDFLRFAEKVADLYISRLPADLIPFWDFDAPDIPAAPK DASAAAVTASALLELSTLEDDPERSELYYALAKEMLHSLSSEHYQSRKKNSSFLLHSTGH YPAGEEIDASIIYADYYYIEALMRRKKIEAKQALSQENKFIHPGILHTRESLERIRHYVD QKINPAYQSYLLLQADPCASSDYQLQGPFEVIARLGVNSYTKRPSEDDHKAAYLNALMWT ITGKEAHARKSIEILNAYSAVLRLIGPNDNDDPLCASLQGSMLANAAELIKHTYDKVEKK DIASWEHMLRTVFIPVLDTFFKAKPYTNGNWGAAATKAYMAFGIFLDDESLYNQAVDFYY NGHDNGTIKNYISESGQCQESGRDQDHVMFGLGNLSEVCETGYNQGDEKMYAALDNRLLT GYEYTVKYNLGESVPFKVWTDISGRYCDWQVISEKERGAFRPIFEIVYNHYVTRKGLDMP YTRRVMSKIPIEGGSKWCDGPGYGTLLFRTQMNEK >gi|225935372|gb|ACGA01000020.1| GENE 125 155855 - 157570 1339 571 aa, chain - ## HITS:1 COG:no KEGG:BVU_0132 NR:ns ## KEGG: BVU_0132 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 43 397 30 379 1141 369 50.0 1e-100 MKLRYILPILFLLFAFVSCQEDNTPPPANPNPNYTETGPSMDFVHPGILHTTSSLNRMKN FVMGNVSPAIDCYRKLESNRLASSSYAIQGPFTVIARDGINASTKTPSEEDHKAAYLNAV MWCLTGNEAHAKKTIEILNAYAGTLREIGPDANDDPLCASLQGFLLVNAAELIRYTYEPV SQTDIDAWEKMFRNVFIPVLREFFAKPAYTNGNWGTAAIKAFMGFGVFLDDESLYNEAVD FFYNGNDNGSLTNYISETGQCQESGRDQNHTMLGLGHLAEACEIAYNQKNETLYSASENR 
LLTGYEYTAKYNLGYNDVPFVTWQDVTGRYSNWTVISTADRGSFRPVFEIAYNHYVTRKG LKMPYTQQVISRISPEGEAQWCDHPGYGTLLFRSESGMPPSEGAVDGKGTDWKVVTNNAT GKAEGDDYIVTPSLQGNGKYRGDVKRGQLALHIGNYPILAVVIKGLPATRAFTFDSSEYG YYKNSVGNQWGQNTASSIEKDYGTVYYWNFSEGNFFKDDKNVYLPTDKSFNITITLKIAD LVYPDGVSPYTIKWMKAFRNVDELNSYLEEN >gi|225935372|gb|ACGA01000020.1| GENE 126 157601 - 159424 1548 607 aa, chain - ## HITS:1 COG:no KEGG:BVU_0134 NR:ns ## KEGG: BVU_0134 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 19 607 18 607 608 629 56.0 1e-178 MKNKKYLKYVLLSLTISIGMSSCSDFLDEVNYSSQSADKYYQTKSGYESLIVGCYSNLKN IYNNTNFQIFTQQGTDVFTQNYPTEITPMNQYTTTYQSSNGTVYSMWSSYYNALKNVNAA IDRSENVILRTDDPDGMEPNTLAQRVAEAKALRAWYLLEIIRNWGKAPLEIHEAKEPSYT VEYADGAAFYKQIFEDLEAAIAVLPYRQTGSNYGRMSKAAAKHIRALAYLTRGYEDYADS KDFENAFKDAEDVYLNSGHKLLDDYAMVHRQSNEMNDEIVFAIGFADAANYNTNIWNQWF MMPYSIGGWLGLGKDSYYGNASAHVKAIPAKYAYLMYDWQKDRRPSVTYMSPLNGDKNTS VDGKDAGKNWFQCTTPVDGVFAKGDKIIYFPVPTDPEYKVWSASDKDKVRYKVFNFPAGD ASDMSDDDYYKYAYQTTNSTSRTFLPIWKFKDGNTEFKEDESGSGTRDIYLFRLAETCLI AAEAAVMNNDNDNAIKYINYVASRAAKNAPVAGLTGYSGTVTLDDILDERAKELLGEGSR WNDLQRTRKLAERVLKYNWDVSNIHGGTIKTTLTEESFKNKFILRPIPLQWLNSLSNGSE LGNNPGW >gi|225935372|gb|ACGA01000020.1| GENE 127 159436 - 162762 2428 1108 aa, chain - ## HITS:1 COG:no KEGG:BVU_0135 NR:ns ## KEGG: BVU_0135 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 137 1108 45 1024 1025 1039 55.0 0 MNKQSMTYGYPPKRSIKNIKITFTVLLLFILQLTTSNIYAQDKKVSISVKNENLENVIRM VEKQTDYLFFYNSDEINRKQKITIDKRNSTLKDVLEVITAKANLSYTIKGRHIILTQKVE KTAVGGSSSTKPSERTVTGIVKDETDFPVVGANIVIPKTKKGVITDINGAFSLNVTPGDV IEISYLGYMTQEIKVTDQKSLNIVLRENAKALSEVVVVGYGAVKKSDLTGSVASVSNANL VRGGNTNSAGALQGELSGVTIVRSNNKPGGGYDIKIRGINSITASSSPLIVIDGVPGGNL DFVNPDDIEKIDVLKDASATAIYGSSGANGVIIVTTKRGQAGKPKISYNGYVGFRTYTNL PDMMSGDEYVQLAREATRGGAPNYVYKRDEQIFTDPSELQAVKDGNYYDWLDAISDPAFI TNHSLSAVGGTEAVKYGFSGGYYYEDGMIQPQEYTRYNLRSVVDITINKHVSFGGSMYAT HSIREKGNWDLLRDVFRMRPTQHPNSLVTGEEIWKYSSNNLFNPLITSKNQRSQVKSLNL 
QSNIYLKITPIENLELTSSFSPYFTQTSMGDYIGVWTKAQQGTAKGAIANATKDNSLSWI WDNIINYTWKKAIHSITATGVFSAQKYQYDRLYGATKDLSYNSLWYNLNGGEITGLSSNF NQWTMMSYVGRVNYGLLDRYLLTASLRYDGSSRLADGHKWALFPSVAVAWRITEENFAKD LQWLSNLKLRLSYGQAGNANSVSPYASEGNISGSVYYPFGTSTSVGNLPANIANPLLTWE RTSEYNVGFDFGFLGQRISGNIEYYNRTTNDLLMGRNIPVHLGYSSVVSNVGSVRNCGFE LQLNTVNVASKNFTWNTTINLAYNKNEIVSLADEEDLSNYSVNLKGMRGRYSDKRFIGKP VDTNWNQNTIGVWQLGEEEEAAKYGCVPGNFKIKDYNNDGKLTDDDYIIDGKRTPDWTGG MTNMFKYRDFDLSFHMYFQLGATQYDRFFENFALEWNSQNFNNLRTNYWTPENPSNTMGR PSQMGSRGNIAYEKTDFLKVSYVTLGYSLNKKLLSKWGLSNARIYVTVQNPLILTKFRGL DPEQPDITNIGDTDGMTKNTLLGVNVSF >gi|225935372|gb|ACGA01000020.1| GENE 128 162929 - 163906 745 325 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 317 35 315 323 67 23.0 3e-11 MENMISDLIIGFLKGSLNEKEINQFYDWVNKAPENKKIFFEVKMIYDACLSKGSDIDMDK SWQRLLEKKQKQAPKKIYTLFRRIQTYAAVAVIAIALTSTIFLMLNTTSPAPIAEYVSGD GIVADKIILSDGTQVSMGSQTKFRYDPHYGKDKRVVYLEGEAFFDVAKQKSKPFIVVVNG QEIEALGTKFNVEAYPSDSVAVTTLLEGSIRLTSENINTPVVLRPNQQYVYNKDKGTYKV DKVEASLYTSWISGYYYFHEESLDGILNRLGHIYGVDFRVQSDNLKGRKFTGTFYRGQSI KDILDVINVSIPIQYQINKQHVIIN >gi|225935372|gb|ACGA01000020.1| GENE 129 164031 - 164579 461 182 aa, chain - ## HITS:1 COG:PA1300 KEGG:ns NR:ns ## COG: PA1300 COG1595 # Protein_GI_number: 15596497 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 16 171 13 169 175 67 32.0 1e-11 MKDEHVNDINNYQKLYMQYAPMLLRFAGKFISPFFAEDIVQDVFLRIWDKQVFLLSESEV KNILFVAVRNACIDHLRRISLEQEFADKRAIQLKLDELSFYDGADELFMRKDLMAHVMAK INELPEKRREIFLLSYMEGLKAAEIAERLNLSTRTVENQLYRTLLFLRKELQTAFVYLFM FV >gi|225935372|gb|ACGA01000020.1| GENE 130 164565 - 164846 97 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170887|ref|ZP_05757299.1| ## NR: 
gi|260170887|ref|ZP_05757299.1| hypothetical protein BacD2_03397 [Bacteroides sp. D2] # 1 93 1 93 93 166 100.0 5e-40 MFIFHFSILQGQSVLIIIIRENRTNHVGGDKNRYILVINKKDENRFFFHPDYIFIEYAIK PKPCKFPFSFAPFTQIKTIFAFQNPERRKKWSK >gi|225935372|gb|ACGA01000020.1| GENE 131 164834 - 165115 265 93 aa, chain + ## HITS:1 COG:no KEGG:BT_0334 NR:ns ## KEGG: BT_0334 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 93 93 142 91.0 4e-33 MEQIDNLKELINQGDVDTAIKQLDQLLQDTSVEKEKDTLYYLRGNAYRKKGDWKRALDNY QYAIELNPDSPAVQARKMAIDILNFYHKDMFNQ >gi|225935372|gb|ACGA01000020.1| GENE 132 165149 - 165376 245 75 aa, chain + ## HITS:1 COG:no KEGG:BT_0333 NR:ns ## KEGG: BT_0333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Metabolic pathways [PATH:bth01100] # 1 75 1 75 75 118 98.0 7e-26 MAKIKGAIVVDTERCKGCNLCVVACPLDVIALNKEVNMKGYNYAWQVKEDTCNGCSSCAT VCPDGCISVYKVKVE >gi|225935372|gb|ACGA01000020.1| GENE 133 165388 - 166470 1373 360 aa, chain + ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 6 356 7 351 356 383 54.0 1e-106 MAEEVVLMKGNEAIAHAAIRCGADGYFGYPITPQSEVLETLAELKPWETTGMVVLQAESE VAAINMVYGGAGSGKMVLTSSSSPGVSLKQEGISYIAGAELPCLIVNVMRGGPGLGTIQP SQADYFQTVKGGGHGDYRLIALAPASVQEMADFVKLGFELAFKYRNPAIILADGVIGQMM EKVVLPAQKPRRTEAEVIEQCPWATTGKSKGRKPNIITSLELKPEAMEINNLRFQEKYRV IEENEVRFEEINCEDAEYLIIAFGSMARIGQKAMELAREKGIKVGILRPITLWPFPTKAI ASYADKVKGMLVTELNAGQMIEDVRLAVNGKIKVEHFGRLGGIVPDPDEIVTALEEKIIK >gi|225935372|gb|ACGA01000020.1| GENE 134 166478 - 166651 181 57 aa, chain + ## HITS:1 COG:no KEGG:BF1647 NR:ns ## KEGG: BF1647 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 57 1 57 57 75 92.0 4e-13 
MNPDKIRNVLNILFMILAVAAIIIYFVAKEDFKMFIYVCGAAIFVKLMEFFIRFTNR >gi|225935372|gb|ACGA01000020.1| GENE 135 166664 - 167428 736 254 aa, chain + ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 5 252 6 262 296 258 48.0 7e-69 MTKEEIIKPENLVYKKPTLMNDNAMHYCPGCSHGVVHKLIAEVIEEMGMEDKTVGISPVG CAVFAYNYLDIDWQEAAHGRAPAVATAVKRLWPDRLVFTYQGDGDLACIGTAETIHALNR GENITIIFINNAIYGMTGGQMAPTTLVGMKSSTCPYGRDVELHGYPLKITEIAAQLEGTA YVSRQSVQSVPAIRKAKKAIRKAFENSMSGKGSNLVEIVSTCSSGWKMTPEKSNKWMEEH MFPFYPLGDLKDKE >gi|225935372|gb|ACGA01000020.1| GENE 136 167449 - 167991 619 180 aa, chain + ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 179 12 183 186 117 40.0 1e-26 MKEEIIIAGFGGQGVLSMGKILAYSGLMEGKEVSWMPAYGPEQRGGTANVTVIVSDDKIS SPILSKYDAAIILNQPSLEKFESKVKPGGILIYDGYGIIHPPTRKDIKVYRIDAMDAANE MNNAKAFNMIVLGGLLKLRPVVTLENVIKGLKKTLPERHHHLIPMNEEAIKKGMELIREV >gi|225935372|gb|ACGA01000020.1| GENE 137 168100 - 170157 1657 685 aa, chain + ## HITS:1 COG:no KEGG:BT_0328 NR:ns ## KEGG: BT_0328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 685 1 680 680 1178 84.0 0 MPPIMRRILLLYILLSVIGLTTVHAQFDNGRQRVDENGYDQYGNQVDPAMIPDRLDSANV EVQGLPPKLYMWRIKNQLGDRTIVPADTAFHHFQNSNLTEGLTGHYNYLANMGSPRMSRI FFDRRDPEPTIFMEPFSSFFIRPTEFNFTNSNVPYTNLTYHKAGNKVNGEERFKSYFSVN VNKKLAFGFNVDYLYGRGYYNNQNTAYFNAAVFGSYIGDKYQMQAIYSNNYLKTNENGGI EDDRYITAPEEMAQGQREYESTNIPTVLSATTNRNHDFYVFLTQRYNLGFSRDISQAEND TTPAKQEFVPVTSFIHTIQVERARHSFNSNDNVKDYYKNTYLDPDNAMVRDSTTYVGIKN 
TIGIALLEGFNKYAKAGLTAFASYKISKYTLMNMEGNPLPDKYNENEIFVGGELSKREGN VLHYHAIGEVGLAGKAIGQFNVKGDIDLNFPLWKDTVSLIARGEVSNKLAPFYMRHYHSK HFMWDDDMDKEFRTRIEGELSIARWRTRLKAGVENIKNYTYFNQEATPEQNGGSIQVLSA SLNQDFKLGIFHLDNEVTWQKSSDQTVLPLPDLSLYHNFYMQFKLAKKVLSVQLGADVRY FSKYNAPAYMPAIQNFHLQPTDDQVQIGGYPIVNVYANLHLKRTRFYVMMYHVNQGMSSP NYFLSPHYPINPRVLKFGLSWNFYD >gi|225935372|gb|ACGA01000020.1| GENE 138 170175 - 171014 807 279 aa, chain + ## HITS:1 COG:AF0231 KEGG:ns NR:ns ## COG: AF0231 COG0834 # Protein_GI_number: 11497847 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Archaeoglobus fulgidus # 70 276 62 262 264 76 29.0 5e-14 MVSRPKLRLFRYLVPVIIVLAFIFSFRQCGKQEKPAGHPRDYAAIAKEGIIRVATEYNSI SFYVDSDTVSGFHYELIQAFARDKGLKAEITPIMSFEERLEGLSEGRYDVIAYGILATSK LKDSLLLTSPIFLNKQVLVQRKEIGENDSLYIRTQLDLAQRTLHVVKGSPSILRIQNLGN EIGDTIYIKEIDKYGSEQLISMVAHGDIDYAVCDESIALAAADSLPQIDINTAISFTQFY SWAVSKQSPALLDSLNTWLDKFQKEEEYQKIYKKYYDKE Prediction of potential genes in microbial genomes Time: Fri May 13 07:07:18 2011 Seq name: gi|225935371|gb|ACGA01000021.1| Bacteroides sp. D2 cont1.21, whole genome shotgun sequence Length of sequence - 84913 bp Number of predicted genes - 65, with homology - 61 Number of transcription units - 30, operones - 17 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 4146 2897 ## COG0642 Signal transduction histidine kinase + Prom 4197 - 4256 6.6 2 2 Op 1 . + CDS 4313 - 5572 1008 ## Phep_2694 glycosyl hydrolase family 88 3 2 Op 2 . + CDS 5637 - 8126 1710 ## COG3250 Beta-galactosidase/beta-glucuronidase 4 2 Op 3 . + CDS 8161 - 9330 916 ## Cpin_0086 hypothetical protein 5 2 Op 4 . + CDS 9350 - 10552 1240 ## Slin_0557 hypothetical protein 6 2 Op 5 . + CDS 10576 - 13692 3027 ## Dfer_2137 TonB-dependent receptor 7 2 Op 6 . + CDS 13713 - 15194 1533 ## Dfer_1583 RagB/SusD domain protein 8 2 Op 7 . 
+ CDS 15214 - 16728 1455 ## Hoch_1684 cell surface receptor IPT/TIG domain protein 9 2 Op 8 . + CDS 16743 - 18167 1288 ## gi|260170904|ref|ZP_05757316.1| hypothetical protein BacD2_03484 10 2 Op 9 . + CDS 18209 - 19432 1114 ## Anae109_3266 hypothetical protein + Term 19513 - 19557 10.1 + Prom 19793 - 19852 5.5 11 3 Tu 1 . + CDS 19882 - 20868 958 ## COG0673 Predicted dehydrogenases and related proteins + Prom 20876 - 20935 6.5 12 4 Op 1 . + CDS 21024 - 21494 532 ## COG0590 Cytosine/adenosine deaminases 13 4 Op 2 . + CDS 21534 - 22340 584 ## BT_0320 hypothetical protein - Term 22360 - 22408 9.4 14 5 Op 1 . - CDS 22461 - 24056 1517 ## BVU_2305 hypothetical protein 15 5 Op 2 . - CDS 24082 - 25428 833 ## BT_0315 hypothetical protein - Prom 25457 - 25516 1.7 + Prom 25405 - 25464 4.9 16 6 Tu 1 . + CDS 25581 - 26429 1453 ## PROTEIN SUPPORTED gi|237715971|ref|ZP_04546452.1| ribosomal protein L11 methyltransferase + Term 26456 - 26519 25.3 - Term 26442 - 26508 25.4 17 7 Op 1 . - CDS 26554 - 27060 513 ## COG0716 Flavodoxins 18 7 Op 2 24/0.000 - CDS 27073 - 29109 1858 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 19 7 Op 3 . - CDS 29147 - 30565 1372 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes - Prom 30588 - 30647 5.5 20 8 Op 1 3/0.000 - CDS 30674 - 31396 322 ## COG0095 Lipoate-protein ligase A 21 8 Op 2 . - CDS 31416 - 32759 679 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 22 8 Op 3 . - CDS 32835 - 33728 749 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) - Prom 33760 - 33819 1.8 - Term 33828 - 33877 6.6 23 9 Tu 1 . 
- CDS 33883 - 35529 2066 ## COG0205 6-phosphofructokinase - Prom 35605 - 35664 8.0 - Term 35659 - 35697 9.4 24 10 Op 1 9/0.000 - CDS 35736 - 37115 430 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 25 10 Op 2 27/0.000 - CDS 37115 - 40219 3109 ## COG0841 Cation/multidrug efflux pump 26 10 Op 3 . - CDS 40223 - 41479 1346 ## COG0845 Membrane-fusion protein - Prom 41554 - 41613 5.7 + Prom 41470 - 41529 7.8 27 11 Tu 1 . + CDS 41752 - 43968 2002 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 43997 - 44037 -0.8 - Term 43981 - 44028 6.2 28 12 Op 1 . - CDS 44095 - 45558 1383 ## BT_0294 hypothetical protein 29 12 Op 2 . - CDS 45605 - 46822 922 ## BT_0293 hypothetical protein 30 12 Op 3 . - CDS 46829 - 47578 421 ## BT_0292 hypothetical protein - Prom 47826 - 47885 7.3 31 13 Tu 1 . - CDS 47990 - 48943 454 ## BT_1726 integrase - Prom 49005 - 49064 8.7 - Term 49012 - 49074 9.0 32 14 Op 1 . - CDS 49157 - 51505 1799 ## COG1874 Beta-galactosidase 33 14 Op 2 . - CDS 51420 - 51602 70 ## - Prom 51758 - 51817 10.0 - Term 51764 - 51809 8.0 34 15 Op 1 . - CDS 51876 - 53240 774 ## COG5434 Endopolygalacturonase 35 15 Op 2 . - CDS 53241 - 54509 908 ## COG4289 Uncharacterized protein conserved in bacteria 36 15 Op 3 . - CDS 54525 - 55697 947 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 55740 - 55799 2.1 + Prom 55557 - 55616 3.7 37 16 Tu 1 . + CDS 55642 - 55884 119 ## gi|260170931|ref|ZP_05757343.1| hypothetical protein BacD2_03619 - Term 55750 - 55783 -0.9 38 17 Op 1 . - CDS 55835 - 56023 166 ## 39 17 Op 2 . - CDS 56040 - 56468 184 ## gi|260170932|ref|ZP_05757344.1| hypothetical protein BacD2_03624 40 17 Op 3 . - CDS 56482 - 56943 274 ## BT_4357 hypothetical protein - Term 57528 - 57577 7.3 41 18 Op 1 . - CDS 57627 - 59186 1246 ## BT_0288 hypothetical protein 42 18 Op 2 . 
- CDS 59161 - 59586 254 ## COG0848 Biopolymer transport protein 43 18 Op 3 . - CDS 59591 - 60037 368 ## BT_0286 hypothetical protein 44 18 Op 4 . - CDS 60021 - 61406 1477 ## COG0811 Biopolymer transport proteins - Prom 61483 - 61542 6.7 + Prom 62000 - 62059 5.8 45 19 Op 1 . + CDS 62125 - 63675 1031 ## BT_3420 hypothetical protein 46 19 Op 2 . + CDS 63708 - 64991 844 ## BT_3421 hypothetical protein 47 19 Op 3 . + CDS 65055 - 65789 560 ## gi|260170941|ref|ZP_05757353.1| hypothetical protein BacD2_03669 48 19 Op 4 . + CDS 65794 - 66021 151 ## gi|260170942|ref|ZP_05757354.1| hypothetical protein BacD2_03674 + Term 66170 - 66208 0.3 + Prom 66092 - 66151 2.4 49 20 Op 1 . + CDS 66270 - 66731 208 ## gi|260170943|ref|ZP_05757355.1| hypothetical protein BacD2_03679 50 20 Op 2 . + CDS 66728 - 68413 686 ## COG3307 Lipid A core - O-antigen ligase and related enzymes + Prom 68661 - 68720 5.5 51 21 Op 1 . + CDS 68765 - 68962 186 ## 52 21 Op 2 . + CDS 68976 - 69176 116 ## + Prom 69373 - 69432 9.6 53 22 Op 1 . + CDS 69471 - 70151 111 ## BT_3534 hypothetical protein 54 22 Op 2 . + CDS 70209 - 70520 375 ## gi|260170946|ref|ZP_05757358.1| hypothetical protein BacD2_03694 55 22 Op 3 . + CDS 70552 - 70986 116 ## gi|260170947|ref|ZP_05757359.1| hypothetical protein BacD2_03699 + Term 71018 - 71062 5.4 + Prom 71166 - 71225 6.3 56 23 Op 1 . + CDS 71250 - 72482 686 ## COG0582 Integrase 57 23 Op 2 . + CDS 72500 - 72892 188 ## BDI_2139 hypothetical protein + Prom 73318 - 73377 2.2 58 24 Tu 1 . + CDS 73485 - 74852 584 ## PG1109 mobilization protein + Term 74877 - 74932 13.6 + Prom 74893 - 74952 9.4 59 25 Tu 1 . + CDS 75049 - 76278 516 ## Acry_2349 putative transposase 60 26 Tu 1 . - CDS 76606 - 77406 148 ## COG0030 Dimethyladenosine transferase (rRNA methylation) - Prom 77432 - 77491 3.0 61 27 Tu 1 . + CDS 77652 - 78377 595 ## BT_1755 hypothetical protein + Prom 78398 - 78457 5.5 62 28 Op 1 . 
+ CDS 78640 - 78819 109 ## gi|260170955|ref|ZP_05757367.1| hypothetical protein BacD2_03739 63 28 Op 2 . + CDS 78861 - 81359 1604 ## COG3250 Beta-galactosidase/beta-glucuronidase 64 29 Tu 1 . - CDS 81587 - 83155 1286 ## BT_0374 hypothetical protein - Prom 83176 - 83235 5.2 - Term 83267 - 83303 5.2 65 30 Tu 1 . - CDS 83342 - 84643 846 ## BT_0282 hypothetical protein - Prom 84672 - 84731 5.6 Predicted protein(s) >gi|225935371|gb|ACGA01000021.1| GENE 1 1 - 4146 2897 1381 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 816 1097 2 256 294 123 33.0 3e-27 ICFPRAYAAESGIFNFSNINLKNGLSQLSVLAIAQDSKGYIWFATRNGLNKYDGEKFTVY KHDNDDPHSLSDNHITCLLSDDANEGLWVGTNNGLNYINLKTNQITKYQTDDYPNMAGNN IAVLQFDQSGNLWVGTRTGLCIWQPRSGEFKKKYSDHVPENEAIMQLYIDAEKRLYIGTH RHGLFICTPEMELIQQLNKSTQPALSDDAISCIFEDSRHQVWIGTETRGLNKWNPKTKDL SIYNQKNSGLTNDYIRCITEQDQKLIIGTFDGLSILDLRNDDISKYNRFDANQNSLSHFS VRSIYIDREKTLWVGTYSGGVNYYNPLNNRFIFYHPVENNNQKLYNIFGTMVYDSTSLWI ATEGGGLMQFDPITKEYTNFLIEQTSKGLFNKNIIKSLLLDGDYIWCGTSTGGVYKFHIP SKKFTLVYSFGREKNFAIYTFYKDVENNLWIGTTADKGLVKLSPDGKITNQFPYGKEKKE FVNFTSLRSFLALNAHTFLIGTRSFGLYEYDMQKGTVLRCNMSETNPNQKLENNYITSMV RKANGEIWFSTFGGGIYLYKQGTGVIKHIGKKEGLPNNEVYAMVEHNNNLWISVDKEIAE LNTKTGQIRSYDCFVGSETLEFTPQGVISLPHGEVYFSGSNGFLSFTPGSLLKNNTMPPV VLTQLTVNNKVITPNDDTHLLDSAPDDTQTLVLAYNQNNFSIAYSALNYILHEQNQYAYK LSGHDEDWNYVGSRKEAFYTNIAPGKYTFQVIASNNDGVWNETGKKMEIIIRSPWWGTPL AYILYVLSALGIMFTIGYYMYSKHKLELDLRMKQMEKQRMEEFHQTKLHLFTNFSHELRT PLTLIISPLEELMKRAEVPQFIRTQLDMILKNARRLLLLVNQLMDLQKNQSGNLTLQLSQ SDLNAFLLEIFYTFRQIAESKQIHFEYQAPQGEVTACFDQGLLEKAVFNLLSNAFKFTVP GDTITLRLLLITDIDTLRQEHSDWLSEHSPLITSHQGSYLVIQVADTGKGIPEDERTQIF TPFYQVGHTTGSENMNGTGIGLSLTQSVVHLHHGEIGVTANTPKGSVFTILLPNHTMTEE EVANTAVLSEKNNEQVEKGMSGAIDSENTISGTTPSLPGKTVLLVEDNEEIRSYVKGHLE QYYNVLEADNGSDAFEIVLKEFPDLVVTDIMMPGIDGLELCSLIKNNLQTGHIPVILLTA 
RTMVMHVKEGFLSGADDYVVKPFNIDVLLVRIYNLLAQREKLKSVYSKNFSLQSMGIEVE TTSADEKFMQKLFKIIEKYLTDPDLKVDKICEEMGFSRSNLYRKLKAITDLPLNDLIRHK RLEIATQMLKQTEMNITEIATATGFSTLAYFTKCFKSAYGVSPSEYQKSYSGHTDIDTIR E >gi|225935371|gb|ACGA01000021.1| GENE 2 4313 - 5572 1008 419 aa, chain + ## HITS:1 COG:no KEGG:Phep_2694 NR:ns ## KEGG: Phep_2694 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: P.heparinus # Pathway: not_defined # 31 414 14 383 383 445 58.0 1e-123 METKIKNYLLVFGLGIITLTSCNHSKEQDIVASNFAFAEQQLKYALTQIDEARANESQES REKRESKGWGELTNPRSIKPDGSLELVVSKDWTSGFFPGELWYLYEYTHNPYWKEQAQKH TEILEREKNNGKTHDMGFKMYCSYGNGYRLTGDSTYKDILLQSARTLITRYKPVIGCIRS WDHHADKWQFPVIIDNMLNLELLFWAFRETNDSTYYNIAVNHALTTMKNHFREDNSCYHV VEYDTINGTVRSKMTHQGYSDESAWARGQAWALYGYTMCYRETGRPEFLQQAEKVEKYIF NNPTLPADLIPYWDFNAPGIPNEPRDASAASCMASALYELSTYVPEKKEQYKQEADKILQ NLTHSYRAQLNGDKGFLLLHSTGSKPHDSEIDVPLSYADYYFLEALLRKAKLEKEESLF >gi|225935371|gb|ACGA01000021.1| GENE 3 5637 - 8126 1710 829 aa, chain + ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 63 478 30 423 755 166 31.0 2e-40 MKRLSSIFILFILLILPLNAMIPTQQRIRLTDSWEYLKGDLGSIWEAVRPAAPGSSEAVP IWQQVTLPHCFNAEDAVDPDINYYQGAGWYRTQLSIKNPYLNGRIILEFEGAGPKTEVYV YTYKVASHTGGYDGWSVDITEAVQQALANPDCAKRFNGRIPLSVRCDNTRDTEMIPSDLS DFNLYGGLYRYVNLIYQPEVSFSNIQIENPVEKDMKSASLKIQGTFHNPNDIRQASIHLL LKAPNGEKVWEQTTTVLPLGTNPLFETTIKNPILWSPDQPQLYTCELTLTTPQSKQTITE SIGFRHFEFIENGPFRLNGERLLLRGTHRHEDHAGVGAAMTEEQMEAEMKLIKEMGVNFI RLGHYQQSDIILRLCDQLGILVWEEIPWCRGGLGGEKYQAQARRMLTNMITQHHNHPSVI LWGLGNENDWPNDFSTFEQSAIRAFMKELNDLSHQLDPSRMTSIRRCDFCNDIVDVYSPS IWAGWYRGQFTDYKSISEKEMKRVKHFLHVEWGGDSHAGRHSEYPFNPLQAITGGNSADE RAGDFALQGGDPRASRDGDWSETYIVKLFDWHLKEQETMPWLTGTAFWVFKDFSTPLRPD NPVPYVNQKGVVERDLTKKEAYYVFQSYWTKQPMIHIYGHNWRTRWGKANEEKEILVYSN CAAVELFVNGVSQGVKQRNSQDFPAAGLHWNCILKAGENNIKAVAVKGPGKGNMKAAITD 
EISFNYQTEEWSNPKQLKLKTYPDKGSIVWVEAQLTDNKGIPCLDAANVISFEAAGDGKL IQNLGTSTGSRKVQAYNGRAAIRIDLNGTCQVAVKSPGIQTAFIEVKAE >gi|225935371|gb|ACGA01000021.1| GENE 4 8161 - 9330 916 389 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0086 NR:ns ## KEGG: Cpin_0086 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 27 388 23 384 384 458 58.0 1e-127 MRKTIISIGFIIFFGTATAVAGEDTAEQIRHLLEKETIQRADAALHEEPVTVTATRATRS AGGIHDFYSEGDYWWPNPEAPDGPYIRRDGETNPENFTAHRQAMIRMNQLVGDLTSAWIL TGKQVYSDAAMQHIRAWFVNPDTQMNPNLLYAQAIKGVATGRGIGIIDTIHLIEVVQSLR LMEAKGMIDKEEVRTIKQWFSDYLTWMSTHRYGIKEMEAKNNHGTCWVMQAAAFALFTDN REMVRFCTERYKTVLLPNQMAADGSFPLELERTKPYSYSLFNLDAMATICQLLTTGEEDI WNYTTPDGKNIDKGIAWLYPYVKDKSSWKRQPDIMFWKEWPVAQPFLLFGTLHHFNKEYF KTWMQLEHFPTNEEVVRNLPIRHPLLWIQ >gi|225935371|gb|ACGA01000021.1| GENE 5 9350 - 10552 1240 400 aa, chain + ## HITS:1 COG:no KEGG:Slin_0557 NR:ns ## KEGG: Slin_0557 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 47 352 58 361 405 272 44.0 2e-71 MKNLLVCTLLFASAFTLNAQVANTRIYAAEKLAKVKEKADSPLYAPAVKTLLRDADKALK MTPPSVMDKTMTADSGDKHDYMSMGPYWWPDPSKPDGLPYIRKDGQRNPELDKLDRNKLG DMSKAVTTLGLAYYFSGDEKYAQKAVDFLNVWFLDAKTKMNPNLTYGQTIPGKNKGMGRG AGMIDIYSFTEMIDAMTLMENSKAFTPKVKKGMKEWFTQLVEWMQTSPVAAEEQRAKNNH GLAYDVQLTAYALYTGNQDLAMKTIQEFPEKRLFAQIEPDGKQPLELARTTALGYTIFNL GHMLDMCSIASTLGQDIYNATSQDGRSITAALKFLIPYIGKPQSEWPYQQIKEWDKKQEE ACWILRRASFFDPKAGYEAIGAQFRETPANKRIHLIYSLE >gi|225935371|gb|ACGA01000021.1| GENE 6 10576 - 13692 3027 1038 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2137 NR:ns ## KEGG: Dfer_2137 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 15 1038 9 1050 1050 548 34.0 1e-154 MRNKFIHYPLVMALMLLFCVPCVTVTYAQSFTVTGTVTDSQGGIPGANVKVKGGTAGTIT NMDGQFTLSVPSSKSILVISYIGYTPQEVAINGKNKLDIHLLEDTKTLDEVVVVGYGVQK KSHLTGSVSKMDINNLTDIPVTQVDQLLQGKIAGVNIQNSTSEAGAAPQIRVRGMGSISA DSSPLIVIDGYPTPDGLSTLDMADISSIEVLKDAASAAIYGSRAANGVILVTTKTGKASK 
PKYSVKASTGLKWAYKLHPIMSSQDYCSMIGYESVLKNKTMSQTEEAFGLIDNYTDWQKE GLNSSPQIHQVQLSVSGGKNDITYYISGNYAQEDGIMLNSNYTRLNLRSRITAQLSKRVD FSLNLAPSYTKTETPATNFMDFYRTPSFMPVRHTAATSALTGKPIGSYARGSDFSNVTYT REDGTTFTPTSSPFGSSNNNPRSLMDTEERFREDYNLQANASFNVQLMKGLVFTTSNGFF IKYRQNNQYRDYAAKKDEESAMGTYTNRLYIDLLSENTLNYTGKTGKHDYSILAGYTAQT TSERTAGIVGLGFPTDYIHTLNAATSFDLDGTYTQKYRTAMMSLLARATYSYDEKYLLSA SIRTDGSSLFADGHQWGYFPSVSVGWRASEESFLKQFGWLNLLKVRASFGVTGNNSIPAN SYYDLLYPNNYALGEGNGNLISGLAKTSETKGNNRITWEQTYEYNAGFDLSILNNRINLT VDGYYSITKQLLFKQPVLSFTGFNNYWNNIGRIRNSGIEVELNTHNIRTKNFEWESSFNI SSNFNKLLELSGEERLISTGERNETYLAQVGKRAISFFGYKTDGIYKSQEEVDAVPHLAS AVPGSLRIVDINDDGVINDKDRTEIGNPFPTATWGFTNTLTWKGFDLYVMIQGVHGLDVF NGDGYFTETKKFNRNYVKNRWISADYPGDGKTPSFAANGGVAWEFTDYLIEDGSYVALRN VTLGYKFNKKQLKKIGISSLRLYASGQNLFYIWSKDYRGINPEARKTSGSYSSPLINGYQ SGGFPLQSTVTFGFELNF >gi|225935371|gb|ACGA01000021.1| GENE 7 13713 - 15194 1533 493 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1583 NR:ns ## KEGG: Dfer_1583 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 493 1 477 479 276 37.0 2e-72 MKLKHTFIGILSTCALGLASCDLNITPDSYIADINFYQNEAEVNTAVIGCYGGIHAPLNI EWALTELRSDNTRMNGTRSSNDAFMQLLALDLGTMDASNPNIRTYWEATYQNINNCNSIL KPEVLAVVENEKKRAQFEGEALFIRAYHYFNLVRLFGPTFLVTEEISIAESLKKDRSSVQ ATYAQIIDDLKRSVTDLQDITYSSNDLGRATVLAAKSLLAKVYLTLGRYEDAKPLLSDVV TAKGETLNVKYADVFDTTKEMNDDIIFTVRYKSGSLGIGSPFANTFAPQNSGANVITGSG DGKNYPTKEISAAYSKEDLRKDVSMADSYYDKSKDVEAQVAYVKKFLSEVSIRYDAENDW PIIRYADVLLMYAEVLNEVDGPSAGLKYLNLVRERAGLSKIESDAVTTRNVFRTYLEKER RLELAFENQRWFDLLRWGKAIEVINKHIQETEWDFYAGYTQQINKLKDYQLILPIPQSVI DNNPGVITQNPNY >gi|225935371|gb|ACGA01000021.1| GENE 8 15214 - 16728 1455 504 aa, chain + ## HITS:1 COG:no KEGG:Hoch_1684 NR:ns ## KEGG: Hoch_1684 # Name: not_defined # Def: cell surface receptor IPT/TIG domain protein # Organism: H.ochraceum # Pathway: not_defined # 39 221 121 297 363 77 29.0 2e-12 MKNKFYSLFIFFVLTALAGCNEKESTDVIEQGDPQIIEFTPTSGKFGQEITIKGEFLRDI QKATIGGVEATIRYKLSQQEIVIVVPANAGNGKIVLSTKEKKTESEQSFTIVYPVPQVKS 
VPAEAHVGDQIEIQGENLDIVSKVCFGDKEASISYQSEREIVATIPFVITDAAPISLYYL DSTGEQFTQPEGPDFEIIKDIPTIDAMAERVTEGSLITLNGTFLNLIESIHFGDEVKVTN FVEKTANSIVFRVPELPESATVDVLAKYYEGTGSLTLRNDCYVFIPRVFSYPNLKMGAHR NEDFGNMINGTAGQVSTTCILKDVDSRALIDFAAVHNSNNDFALNGPQNVKANLRNYWCN GTPLPPLKSSSTEAEVNENFGEFTSTVTKLLVLQESKGYGELIRNIKEGNIEEISPTDEI TKALFNIDMDAEGSNSVRSRQKTEAEDKEASNIYKVGSVVVFKNLKKNKFGIMIIRSVNV DFDAVKATNDANATITFDLYYQRY >gi|225935371|gb|ACGA01000021.1| GENE 9 16743 - 18167 1288 474 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170904|ref|ZP_05757316.1| ## NR: gi|260170904|ref|ZP_05757316.1| hypothetical protein BacD2_03484 [Bacteroides sp. D2] # 1 474 1 474 474 860 100.0 0 MKRQYKFTALSAVLFCLIGLLYGCTRDNEIIIPETSVNPPATDYSVAIGDEVTFTGQNMH LISKVAFDNQVVNITTEPSNRSQTTLIVAVPDNFEVTQHISVVATYNSVHKLTLSDAFEV VVPGVTADVSSATIGDKITLTGKNMHLITKVNFGDQVVSFDPNPDRSHTSLMVTVPSTFD VTKKVQLSVTYTTNTVNVSNDFEVIVPPVIPTVTTVLEGEVGTGAIITLAGTNLNIIKKL MVNGQSYEFTATATSLSFNAPEDITENLVIDNVVLVYDNVLGDNQELAVSGSVTVKPTPT LPYVLWEDIELGSHGTENAFFDVTNGRTMTPCDVIEHKEWVTFAMDNTSGAPTARLLNPA NIGTNLLNSQFCNGTALNLNDYAIWKESTQTTPFKKVTDDALIAKVTNGEITNIRDDIGD VKASDIKTNTPTVAAGEVYYFYHKEKFGLIWVKQLNIQTDRKLNTIIMNVYYEK >gi|225935371|gb|ACGA01000021.1| GENE 10 18209 - 19432 1114 407 aa, chain + ## HITS:1 COG:no KEGG:Anae109_3266 NR:ns ## KEGG: Anae109_3266 # Name: not_defined # Def: hypothetical protein # Organism: Anaeromyxobacter_Fw109-5 # Pathway: not_defined # 31 357 46 371 411 263 41.0 1e-68 MNLRANYSKRIRLSLAFSIVAMTVCAQSIWNAEHLEQVKKSLNQPAYSATYQNLLKQADK ALNAHHLSVMMKDKTPPSGDKHDYLSQARYFWPDPTKPDGKPYISRDGVSNPELDKLDRN RLGEMANNVTTLALAWYFSGKEQYALKATKQIRVWFLDKKTCMNPHLKYAQVAPGHNNDL GRCYGVIDTYSLVEMLDAVQLLEKSRSFTSKDSEQLKKWFGQLLDWILTSEQGKEEASRP NNHSTAYDAQAIAFALYTGNHKVAERILREVPKKRVFKQIEPDGKQPEELRRTLAFGYSE FNLQHLIDIATMGEKAHVPVLSVSSADGRSIYKAADFLKPYLGKGVKEWPYKQISEWNGK QQELCKDLYRLYLLDSTRTDYLKAYQQFRKNNPKDLFNLLFLKEESK >gi|225935371|gb|ACGA01000021.1| GENE 11 19882 - 20868 958 328 aa, chain + ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # 
Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 4 244 2 244 340 110 27.0 3e-24 MSEKIIKWGFIGCGEVTKTKSGPAFQKVEHSEVVAVMSRDGAKAKAYAKERGIKKWYDDA QELIDDPEVNAIYIATPPSSHATYAIMSMKAGKPAYIEKPMAVTYEECTRINRISNETGV PCFVAYYRRYLPYFQKVKELVENGTIGNVINVQIRFAQPPRDLDYNRENLPWRVQADIAG GGYFYDLAPHQIDLLQDMFGCILEASGYKSNRGGLYPAEDTLSACFQFDNGLVGSGSWCF VAHDSAREDRIEIIGDKGMICFSVFTYEPIGLHTEKGREEICIGNPEHVQQPLIQAVVDH LLGKSVCSCDGESATLTNWVMDKILGKL >gi|225935371|gb|ACGA01000021.1| GENE 12 21024 - 21494 532 156 aa, chain + ## HITS:1 COG:MA3407 KEGG:ns NR:ns ## COG: MA3407 COG0590 # Protein_GI_number: 20092219 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Methanosarcina acetivorans str.C2A # 1 156 6 162 162 185 58.0 4e-47 MTREDLMRKAIELSKENVENGGGPFGAVIATKEGEIVATGVNRVTASCDPTAHAEVSAIR AAAAKLGTFNLSGYEIYTSCEPCPMCLGAIYWARLDKMYYGNNKTDAKNIGFDDSFIYDE LELKPEDRKLPSEILLHNEALTAFKAWAAKEDRVEY >gi|225935371|gb|ACGA01000021.1| GENE 13 21534 - 22340 584 268 aa, chain + ## HITS:1 COG:no KEGG:BT_0320 NR:ns ## KEGG: BT_0320 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 268 1 267 267 258 50.0 1e-67 MKRKSLLFVMASVCLFSCQQQEELQDPSKEGINFLTTVDNESITDAVTRSSSNLSLPLKS SFSAGDVISMSVSEQEYQPFTLGVDNQNWSETATDAEAVTFYAHYPELSGDVTTRSLFSR YRDIKGGLEYLFGTAQASKGSSNVALTFKRMTAPVILLDEKGNPYEGKAIVKLFLKNKGV QDLFNGKIEADKNATPEYISIRKISEGILTNLIPQIIKAGEKIGSVVVDDKEEPIIADQD ITIEAGSPVTMRLYGGRGIIDERTPLRR >gi|225935371|gb|ACGA01000021.1| GENE 14 22461 - 24056 1517 531 aa, chain - ## HITS:1 COG:no KEGG:BVU_2305 NR:ns ## KEGG: BVU_2305 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 531 1 523 523 399 47.0 1e-109 MKSNRILAICILLIGVGFLNAQTTIYDANRWMGSDLNGTARFVGMGGAMGALGGDITTMG TNPAGIGIYRSNDFMVSFGFDNTGTKANGASLDKFHGSFDNAGFVFSTKIGNTTALRFAN 
FGFNYRKMKSFNRSMLMSGVFNTSQTVQMANMVNFDSYGGFDPFTEAALRSDDAFQNPEL PWLGIMGYNAHLVNPVYGEVDPDNPDAEPPFEGYLPYFQAGDAVSQSYRSKESGGIHSFD LNGALNFYDRFYVGATLGLYSVNYDRTSEYNEDFTDKDGNGHGGYTLGNDFWVDGSGVDF KLGFILRPFESSSFRIGAAVHTPTFFSLKERNTAYLAFDLNEETKDITRPYDARGNDTEG EYEYKLVTPWKFNASMGYTIGSSVALGVEYEYSDRTKAKMKDPDGYELGQTEDIKAMMKP VHTLRVGAEFKLAPEFSFRLGYNHITAPLKSDAFKYLPVNSMRTDTEFSNPGATNNYTFG CGYRGESFYVDMAYLYNTYKETFYAFDSLELPGTKVTNNNHKVVLTVGMRF >gi|225935371|gb|ACGA01000021.1| GENE 15 24082 - 25428 833 448 aa, chain - ## HITS:1 COG:no KEGG:BT_0315 NR:ns ## KEGG: BT_0315 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 205 1 189 429 245 86.0 3e-63 MKKIVYFLLFALCLPIGLMAQSVDDDLYFIPSKDKQEKKETPVKKETKKQVTNIYTSPGT TVVVQDRKGRTRDVDEYNRRYDARDNEFVMDNDTLYIKEKSNPDLDGEWVTGEFNGTEDD YEYAERIIRFRNPRFAISISSPLYWDVVYGPNSWDWNVYTDGMYAYAFPTFSNPLWWDWR YGSYGWGWNYGWGWNRPYYAWGYSPGYWGGWYGGYWGGWYGGGYWGHHHHWHGGPSWGWG GGGRPYYAGRSVINGNNRSYYSGSRNYNSTSYRRGTGSSTTRPTNSSVGQSVRRSGTTTG ANSRVVGTRDYTRSSSNGSVRSGSSYSRRNTETYTRPSSTRSSGTSTRNSGSSYTRSNSS TGNSRTSTSRSSGSSYSRGSSTSTSRSYTPDRSSSRSSNSGSYSRSSGSSYSSGSSGSYS RSSGSSGSSSRSSGGGGSSRSSGGSYRR >gi|225935371|gb|ACGA01000021.1| GENE 16 25581 - 26429 1453 282 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237715971|ref|ZP_04546452.1| ribosomal protein L11 methyltransferase [Bacteroides sp. D1] # 1 282 1 282 282 564 95 1e-160 MKYFEFTFRTHPCTEIVTDVLAAVLGEVGFESFVECEGGLTAYIQQTICDENAIKIAITE FPLPDTDITYTYTEAEDKDWNEEWEKNFFQPIIIGNRCVIHSTFHQDIPKAEYDIVINPQ MAFGTGHHETTSLIIEELLDSELKDKSLLDMGCGTSILAILARMRGARPCTAIDIDEWCV RNSIENIELNHVDDIAVSQGDASSLTGKGPFDVIIANINRNILLNDMKQYVACMHTDSEL YMSGFYVDDITAIREEAEKNGLTFVHYKEKNRWAEVKFVYKG >gi|225935371|gb|ACGA01000021.1| GENE 17 26554 - 27060 513 168 aa, chain - ## HITS:1 COG:alr2405 KEGG:ns NR:ns ## COG: alr2405 COG0716 # Protein_GI_number: 17229897 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Nostoc sp. 
PCC 7120 # 2 168 3 169 170 136 44.0 2e-32 MKKIGLFYATKAERTSWVAEKIQKEFGKEKIETVPIEQAWQNDFAAYDCFIVGASTWFDG ELPTYWDELLPELRTMKLKGKKVAIFGLGDQIRYPENFADGIGLLAEVFEEDEATLVGFT SSEGYTFERSKALRGEQWCGLVVDLDNQSEQAEKKIKAWCQQVKKEFA >gi|225935371|gb|ACGA01000021.1| GENE 18 27073 - 29109 1858 678 aa, chain - ## HITS:1 COG:CT340_2 KEGG:ns NR:ns ## COG: CT340_2 COG0022 # Protein_GI_number: 15605063 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Chlamydia trachomatis # 354 678 5 328 328 252 43.0 1e-66 MKKKYDIKTTDVETLKKWYHLMTLGRALDEKAPSYLLQSLGWSYHAPYAGHDGIQLAVGQ VFTLGEDFLFPYYRDMLTVLSAGMTAEEVILNGISKATDPGSGGRHMSNHFAKPEWHIEN ISSATGTHDLHAAGVARAMVYYGHKGVAITSHGESATSEGFVYEAINGASLERLPVIFVI QDNGYGISVPKSEQTANRKVAENFSGFKNLKIIYCNGKDVFDSMNAMTEAREYAISTRNP VIVQANCVRIGSHSNSDKHTLYRDENELEYVKDADPLMKFRRMLLRYKRLTEEELQQIEA DAKKELSAANRKALAAPDPDPKSIYDFVMSEPYQPQKYKDGTHEAEGEKTFLVNAINETL KAEFRHNPDTFIWGQDVANREKGGVFNVTKGMQQEFGEARVFSAPIAEDYIVGTANGMSR FDPKIHVVIEGAEFADYFWPAVEQYVECTHEYWRSNGKFAPNITLRLASGGYIGGGLYHS QNLEGALTTLPGARIVCPSFADDAAGLLRTSMRSKGFTLFLEPKALYNSVEAAAVVPEDF EVPFGKARIRREGSDLSIITYGNTTHFCLHAAERLEKEGGWKVEVIDIRSLIPLDKEAIF ESVKKTSKALVVHEDKVFSGFGAELAAMISGEMFRYLDGPVQRVGSTFTPVGFNPILEKE ILPDEAKIYEAARRLLEY >gi|225935371|gb|ACGA01000021.1| GENE 19 29147 - 30565 1372 472 aa, chain - ## HITS:1 COG:BH2761 KEGG:ns NR:ns ## COG: BH2761 COG0508 # Protein_GI_number: 15615324 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Bacillus halodurans # 5 469 4 417 426 275 39.0 2e-73 MSKFEIKMPKLGESITEGTIVSWSVKVGDMIQEDDVLFEVNTAKVSAEIPSPVAGKVVEI LYKEGDTVAVGTVVAIIDLDGEESSGTEPVNEGVVREKADAGQVAANVSETSPSSVETAK NESANTASKPVVAEEERWYSPVVIQLAREAKIPKEELDAIQGTGYEGRLSKKDIKDYIEK KKRGGSVAPKPASAVAAPAASKPSVAVSSEQASPKVAPATSAAAIQPAATLSKSSAPVAM 
PGVEVKEMDRVRRIIADHMVMSKKVSPHVTNVVEVDVTKLVRWREKNKDAFFRREGVKLT YMPVITEAVAKALAAYPQVNVSVDGYNILFKKHINVGIAVSLNDGNLIVPVVHDADHLNL NGLAVAIDSLALKARDNKLMPEDIDGGTFTITNFGTFKSLFGTPIINQPQVAILGVGYIE KKPAVVETPEGDTIAIRHKMYLSLSYDHRVVDGMLGGNFLHFIADYLENWKG >gi|225935371|gb|ACGA01000021.1| GENE 20 30674 - 31396 322 240 aa, chain - ## HITS:1 COG:SPy1033 KEGG:ns NR:ns ## COG: SPy1033 COG0095 # Protein_GI_number: 15675030 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Streptococcus pyogenes M1 GAS # 14 240 13 242 329 134 34.0 2e-31 MIRCIYSPFTDIYFHLAAEEYLLKQGNEDIFMLWQDTPSVVIGKHQRLRSEVDKKWAEQE RVHIARRFSGGGAVYHDLGNINLTFIETAPRLPEFVTYLQRTLDFLNSIGLMAKGGERLG IYLNGLKISGSAQCVHKNRVLYHCTLLYDTDLTALHKALNPEPMVDDETLSSVYAVPSVR SEVTNIRSYLPEGTVTDFKEKAFQYFSKSQSVSAFTGKEIEAVNQLREGKYIQKEWIYSR >gi|225935371|gb|ACGA01000021.1| GENE 21 31416 - 32759 679 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 444 4 446 458 266 34 3e-70 MNFDIAIIGGGPAGYTAAERAGANGLKAVLFEKKAMGGVCLNEGCIPTKTLLYSAKILDS IKSASKYGVSAESPSFDLSKIMSRKDKTVKMLTGGVKMTVSSYGVTIIEKEALIEGEKEG KIQITCDGETYSVKYLLVCTGSDTVIPPIPGLSEISYWTSKEALEIKELPKTLVIIGGGV IGMEFASFFNSMGVKVHVVEMMPEILGAMDKETSGMLRAEYAKRGVTFYLNTKVVEVNPH GVVIEKEGKMSAIEAEKILLSVGRKANLSKVGLDKLNIELHRNGVKVDEHLLTSHPRVYA CGDITGYSLLAHTAIREAEVAINHILGVEDRMNYDCVPGVVYTNPEVAGVGKTEEELVKS GVPYRVSKLPMAYSGRFVAENEQGNGLCKLIQDEDGKIIGCHMLGNPASELIVIVGIAIQ RGYTVEEFQKTVFPHPTVGEIYHEIMF >gi|225935371|gb|ACGA01000021.1| GENE 22 32835 - 33728 749 297 aa, chain - ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 70 292 48 265 275 186 43.0 5e-47 MPQRNNPMSAVQKKRTVSTTKRKGTTSSSKTSRTLKKEQVKHRTVMPTWLRNILAVMIVG CFSVAFYYFFIRPYAYRWKPCHGLKEYGVCIPDGYDIHGIDISHYQGKIDWKKLLQNKET 
ATPLHFVFMKATEGGDHNDTTFEANFANARNHGFIRGAYHFYIPSTDALKQADFFIRTVK LVSGDLPPVLDVEVTGRKEKKELQQGIKRWLDRVESHYGVKPILYTSYKFKTRYLDDSIF NAYPYWIAHYYVDSVRYQGKWHFWQHTDVGSVPGIKEYVDLNVFNGSLEELKKLTIK >gi|225935371|gb|ACGA01000021.1| GENE 23 33883 - 35529 2066 548 aa, chain - ## HITS:1 COG:TP0542 KEGG:ns NR:ns ## COG: TP0542 COG0205 # Protein_GI_number: 15639531 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 1 546 1 559 573 640 55.0 0 MTKSALQIARAAYQPKLPKALASGAVKAVAGAATQSVADQEAIKALFPNTYGMPLITFEA GEAVALPAMNVGVILSGGQAPGGHNVISGLFDGIKKLNPENKLYGFILGPGGLVDHNYME LTADIIDEYRNTGGFDIIGSGRTKLEAESQFEKGLEIIKQLGIKALVIIGGDDSNTNACV LAEYYAAKKYGVQVIGCPKTIDGDLKNDMIETSFGFDTACKTYAEVIGNIQRDCNSARKY WHFIKLMGRSASHIALECALQVQPNVCIISEEVEAKDMSLDDVVTSIAKVVADRAAQGHN FGTVLIPEGLVEFIPAMKRLIAELNDFLAANAEEFGQIKKSHQRDYIIRKLSPENSAIYA SLPEGVARQLTLDRDPHGNVQVSLIETEKLLSEMVATKLAAWKEEGKYVGKFAAQHHFFG YEGRCAAPSNFDADYCYSLGYTASMLIANGKTGYMSSVRNTTAPAAEWIAGGVPITMMMN MERRHGEMKPVIQKALVKLDGAPFKAFASQRERWAIETDYVYPGPIQYFGPTEVCDQATK TLQLEQAK >gi|225935371|gb|ACGA01000021.1| GENE 24 35736 - 37115 430 459 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 10 456 7 457 460 170 27 3e-41 MKRKKLYIAILLLAGVLTSCKVGKSYVRPDLHLPDSLERSQDSVSFGDQDWRDIYTDATL RSLIERALDHNKDMLIAAARVKEMAAQKRISTAALLPDIKGKVTGERELENHGGDAFKRS DTFEAQFLVSWEVDLWGNLRWARSASIAEYLQSIEAQRALQMTIVAEVAQAYYELVALDT ELDIVKQTLKAREEGVRLARIRFAGGLTSETSYRQAQVELARTATLVPDLERKISLKEND IAYLAGEYPNKIARSRLLQEFNSPETLPVGLPSTLLERRPDIRQAEQKLIAANAKVGVAY TNMFPRLALTGGFGSESTSLSELLKSPYAVMEGALLTPIFGWGKNRAALKAKKAAYEAEV HSYEKSVLEAFKETRNAIVNFNKIKEVYELRANLERSAKSYMDLAQLQYINGVINYLDVL DAQRGYFDAQIGLSNAIRDELIAVVQLYKALGGGWEQNP >gi|225935371|gb|ACGA01000021.1| GENE 25 37115 - 40219 3109 1034 aa, chain - ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: 
Sinorhizobium meliloti # 6 1028 7 1030 1044 845 43.0 0 MKVTFFIDRPVFSAVISIVIVIVGIIGLTMLPVDQYPQITPPVVKISASYPGASALTVSQ AVATPIEQEINGTPGMLYMESNSSNSGGFSATVTFDVSADPDLAAVEIQNRVKLAESRLP AEVIQNGISVEKQAPSQLMTLCLTSTDPKFDEIYLSNFATINVLDVIRRIPGVGRVSNIG SRYYAMQIWAQPDKLANFGLTVQDLQNALKDQNRESAAGVLGQQPVQGLDITIPITTQGR LSTVGQFEDIVVRANANGSIIRLKDVARVSLEASSYNTESGINGENAAVLGIYMLPGANA MEVAERVKEAMDEISKNFPEGLSYEIPFDMTTYISESIHEVYKTLFEALVLVVLVVYLSL QSWRATLIPVVAVPISLIGTFGFMLIFGFSLNILTLLGLILAIGIVVDDAIVVVEGVEHI METEHLSPYEATKKAMNGLSSALIATSLVLAAVFVPVSFLSGITGQLYRQFTVTIVVSVL ISTVVALTLSPVMCSLILKPDSGKKKNIVFRKINEWLGIGSNKYVAAVTRTIKHPRRVLS AFGMVLIAIMLIHRIIPTSFLPVEDQGYFKIELELPEGATLERTRIVTERAIAYLEKNPY IEYIQNVTGSSPRVGSNQGRAELTVILKPWEERKSTTIEKIMHTVEKHLEEYPECKVYLS TPPVIPGLGSSGGFEMQLEARGEATFDNLVDATDTLMYYASKRKELAGLSSSLQSEIPQL YFDVDRDKVKMLGVPLADVFSTMKAYTGSVYVNDFNMFNRIYKVYIQAEAPYREHKDNIN LFFVKASNGAMVPLTSLGNASYTTGPGSIKRFNMFTTAVIRGAAAQGYSSGQAMEIMEQI AKDHLPDNIGLEWSGLSYQEKQAGGQTGMVMALVFLFVFLFLAAQYESWTVPIAVLLSLP VAALGAYLGVWVCGLENDVYFQIGLVMLVGLAAKNAILIVEFAKVQVDKGEDLVQSAIYA AKLRFRPILMTSLAFVLGMLPMVLASGPGSASRQAIGTGVFFGMIFAIVFGIILVPFFFV MVYKTKSKILKHKK >gi|225935371|gb|ACGA01000021.1| GENE 26 40223 - 41479 1346 418 aa, chain - ## HITS:1 COG:mll6731 KEGG:ns NR:ns ## COG: mll6731 COG0845 # Protein_GI_number: 13475614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 45 385 58 394 402 208 38.0 2e-53 MRLFFSKHELKLRRKRTIAAIICVVVVLGVYWILTRPQKAAPEMPTVIVEPVVKDDVEIY GEYVGRIRAQQFVEVRARVEGYLENMLFAEGTYVNKNQVLFVINQDQYRAKADKARAQLK KDEAQALKAERDLKRIRPLFEQNAASQLDLDNAEAAYESAEATVAMSEADLAQAELELGY TIVRSPLSGHISERNVDLGTLVGPGGKSLLATIVKSDTVLVDFSMTALDYLKSKERNINL GQQDSTRSWQPNITITLADNTVYPFKGYVDFAEPQVDPQTGTFSVRAEMPNPKQVLLPGQ FTKVKLLLDVREGALVVPMKAVTIEKGGAYIYTMRRDNAVEKRFIELGPEVGNNVVVERG LAEGEMVVVEGFHKLTPGMKVRVSQPDAEAQDSITATKNEVAGAKENVTGTKDNAKGE >gi|225935371|gb|ACGA01000021.1| GENE 27 41752 - 43968 2002 738 aa, chain + ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # 
Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 27 297 22 299 308 198 41.0 3e-50 MKKQIFSTLVLSIGLLLPFSLHSQEQRKKVGVVLSGGGAKGMAHIKALKVIEEAGIPIDY IAGTSMGAIVGGLYAIGYTPEQLDSMVRKQDWTFLLSDRIKRSAMSLTDRERSEKYTVSI PFTKTPKDAATGGIMKGQNLANLFSDLTVGYHDSIDFNKLPIPFACVAANVVNGEQIVFH DGILSTAMRASMAIPGVFTPVRQDSMVLVDGGIVNNYPADVVKAMGADIIIGVDVQNALK KADKLNSVPDILGQIVDITCQSNHEKNVDLTDTYIRVNVDGYSSASFTPAAIDTLMRRGE EAAKDQWSSLLALKKKIGIAEDYTPKQHGPYSSLSNARTVYVTDISFSGVEVDDKKWLMK KCKLEENSDISTQQIEQALYQLRGSQSYSSASYTLKETPEGYHLNFLLQEKYERRINLGI RFDSEEIASLLVNATADLKTRIPSRLALTGRLGKRYAARIDYTLEPIQQRNFNFSYMFQY NDINIYEEGDRAYNTTYKYHLAEFGFSDVWYKNFRFGLGLRFEYYKYKDFLFKKPEISDL KVESEHFLSYFAQVQYNTYDKGRFPSKGSDFRAAYSLYTDNMAQYNEHAPFSALNASWAS VIPVTRRFSIIPSIYGRILIGRDFPYPLQNAIGGDVPGFYIPQQLPFAGVTNLELMDNTI MIASIKFRQRMGAIHYLTLTGNYGLTDSNFFDILKGKQLFGISAGYGMDSIFGPLEISLG YSNQTDKGSCFVNLGYYF >gi|225935371|gb|ACGA01000021.1| GENE 28 44095 - 45558 1383 487 aa, chain - ## HITS:1 COG:no KEGG:BT_0294 NR:ns ## KEGG: BT_0294 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 486 1 475 476 447 59.0 1e-124 MKSVKWFYVGAMAIALGMLTFATVACHDDDDDPKPAEGEVVETPKPVVEYYIMGTVTSGG AAMDGAKVKVGSKNYTTDSNGKFSVTESATGTYSIEAAPNGYLSQKTSVVIADNAENRSV VTVALALTKESPKTAVSIEATGETKVEDKSESNQAIEKPGEVAPEEVVEDKPLVKVELTI PEDAIEATGDNAEIVKEGKVDISVTTFVPAPEEVTTEVKAEDVNRDVPKSIPLAAAKFEP SGLKFKKSVTISIPNPIPGITFADADMILTYQNPNTGEWGDAKDNNGNVIKNISSTTESG AVTAYTAEVDHFSAYAIENKVYSKISNETVITNILGQASRDNSENAKAVTGIELKYKEKS GWDYDKNDAGLVAEVKSQLGAGASAEDTKTVNAMVAFMKTRMFSLMGSVSGITETERVYN TVNVNGYTTMSYTCYAKVRTTTLTANVKFKGTAKSVSITATRYTGTDHQYKTVTYNPTHS GGKGGSI >gi|225935371|gb|ACGA01000021.1| GENE 29 45605 - 46822 922 405 aa, chain - ## HITS:1 COG:no KEGG:BT_0293 NR:ns ## KEGG: BT_0293 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 405 1 406 410 642 73.0 0 
MKRIVSVTAVFLCGISLLQAQPVRVSETLKELEMENISVVEKRDTITAAFETSAYRGIYN GIGIAIRHLVAIPEIPTLQLLILDNALLQLCITIPADLIQKCQSGECTLDEVYRKMGMTT STETAVRQLKGVKRKESSFSKVDFVIYPNVMLVNNVTYKLYKAALELQPALEMQLWKGAS LRMQVSLPIVSNEDGKWNCVRLGFMALRQDFRLANHWKGYLTGGSFSNDRQGLAAGIGYF SANGRWTVEGGGGITGSSHFYGSDWKMSKWKRVNGQISVGYYVPEVNTLVKVEGARFIYG DYGVRGTLSRYFGEYIVGIYGMYTDGATNAGFNFSIPLPGKKRRRHLLRVMLPEYFAFQY DMRSGNEYAHRSLGESYSVEPKSAENSHFWQPDYIRYYLIRTSEK >gi|225935371|gb|ACGA01000021.1| GENE 30 46829 - 47578 421 249 aa, chain - ## HITS:1 COG:no KEGG:BT_0292 NR:ns ## KEGG: BT_0292 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 249 249 395 75.0 1e-109 MKQTKAAFRNKPFISLKGSILFLILIPVTGLVHAQYSMGTTGQLMIPTAEMQETGTFMGG VNFLPEQVTPSVFSFPTMNYFVDMTLFSFIEFTYRMTLLKMTTGTGRTGYHNQDRSNTIR IRPIKESRYFPAVVIGGDDLLTEKKTPYWGAYYGVLTKTIGFRSGDQLAVTAGWYIHQGD CRVFNKGPFGGVRYTPSFCKELKLMVEYDTHGWNMGAAMRFWKHLSVNVFTREFTCVSAG LRYECTLIH >gi|225935371|gb|ACGA01000021.1| GENE 31 47990 - 48943 454 317 aa, chain - ## HITS:1 COG:no KEGG:BT_1726 NR:ns ## KEGG: BT_1726 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 317 1 314 316 504 80.0 1e-141 MQKMNKNGFSQCAGAYIERLRKEGRYSTAHVYKNALFSFSKFCGTLNISFRQVTRECLRC YGQHLYESGLKLNTISTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDIRQKKALPVT ELHKLLYEDPKSERLRRTQTIAALMFQFCGMSFADLAHLEKSALDRNVLQYNRIKTKTPM SLEILESAKEMMNQLRSNKPALPDCPDYLFDILHSDKKRKDEKAYKEYQSALRRFNNCLK DLARALRLNSPVTSYTFRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFGLRERT EVNRKNLSYVRDCNAGR >gi|225935371|gb|ACGA01000021.1| GENE 32 49157 - 51505 1799 782 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 1 607 1 602 612 426 36.0 1e-118 MKKPLLYLLILLVTVFSTSCSQSSKEIFEIGDKTFLLNGKPFVVKAAEIHYPRIPKEYWE HRIKMCKALGMNTICLYVFWNFHEPEEGKYDFTGQKDIAAFCRLAQENGMYVIVRPGPYV 
CAEWEMGGLPWWLLKKKDIKLREQDPYYMERVKLFMNEVGKQLTDLQISKGGNIIMVQVE NEYGSFGIDKPYIAEIRDIVKQAGFTGVPLFQCDWNSNFENNALDDLLWTINFGTGANID DQFKRLQELRPDIPLMCSEFWSGWFDHWGAKHETRSAEDLVKGMKEMLDRNISFSLYMTH GGTSFGHWGGANFPNFSPTCTSYDYDAPINESGKVTPKYFEVRNLLSNYLPEGESLPEVP DSIPTIAIPSFKLDEVAILFDNLPEPKISENIQSMEAFDQGWGSILYRTTLPASKEEQTL TITEAHDWAQVFLDGKKLATLSRLKGEGTVILPPMKEGAQLDILVEAMGRMNFGKGIYDW KGITEKVEVQSNNGVITSLKNWKVYNIPVDYAFAQNKKFVKQDNPQKYPAYYRGTFTLDK TGDTFLNMTNWSKGMVWVNGYAIGRYWEIGPQQTLYVPGCWLKKGENEVIILDMAGSVQP QTEGLQQPVLDNLRVHGAAYAHRKVGENLDLTGEKPVYEGTFKSGNGWQHVKFGKEVETR FFCLEALNAYDGKDFAAIAELELLGADGKPLSRQHWKVIYADSEETEEANNIATNVFDLQ ESTFWHTSYSSAAKHKFPHQIVINLGEDKIVTGFSYLPRAEADKTGMIKDYRVYLKMMPF KL >gi|225935371|gb|ACGA01000021.1| GENE 33 51420 - 51602 70 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVCKFAVGLNGNIDRWWIYLEIVLYHENNKLENEETFIVFAYFISDCFQHFLFPIFKGDI >gi|225935371|gb|ACGA01000021.1| GENE 34 51876 - 53240 774 454 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 36 427 18 411 448 276 36.0 9e-74 MKKFLFTIGLLAVSFATLWGQFKYVKVDAPFSMKPIKEFIYPDQDFSIVNYGAVKGGEAD VSDAIAGAITACNQAGGGRVVIPEGEWLTGPIHLKSNVNLYLAEGAVLRFTDNPSHYLPA VMTSWEGMECYNYSPLIYAFECKNVAITGTGMLSPKMDCWKKWFARPKAHMDALRKLYTM ASKDVPVEKRQMAVGENHLRPHLIHFNRCENVLLDSFKIRESPFWTIHMYMCNGGIVRNL DVKAHGHNNDGIDLEMTRNFLVEDCTFDQGDDAVVIKAGRNRDAWRLNTPTENIVIRNCN ILEGHTLLGIGSEISGGIRNVYMHDCKAPQSVRRLFFVKTNHRRGAFIENIHMENIRTGH VQRVLEIDTDVLYQWRELVPTYEERITRIDGIYLKNVICDSADAIYELKGDAKLPVKNVV IQDVHVDKVNDFVKKVHNVKNVKEDNVTYGSLRE >gi|225935371|gb|ACGA01000021.1| GENE 35 53241 - 54509 908 422 aa, chain - ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 37 420 8 385 387 296 41.0 6e-80 
MNVMMKNKNILVFILLCVFSTVTYAKKGKVEKEISDREYWAQMAYKIAAPVLSNMSKGEL KKNMQVEISPTWDGRSKDVTYMECFGRLMSGIAPWLSLPDDDTEEGKIRKQLREWALKSY AHAVDPDSPDYLLWRNEGQPLVDAAYIASSFLRAKKQLWEPLDEVTKQRYIKEFQLLRRI DPPYTNWLLFSAMIESFLMEADAQYDLFRIHTAVRKAEEWYVGDGWYSDGETFAFDYYNS YVIQPMFVQVMQTVNDHKVILHDRSLEMQKKTEDLAKKRMQRFGMILERFISPEASFPVF GRSMTYRLGVFQPLSMLCWKEMLPEELPEGQVRNALTCVMKRLFAVDGNFNEKGFLQLGF AGHQTGLADWYSNNGSMYITSEVFLPLGLPANHSFWTSAPQDWTSKKAWSGHEFPKDHAI HY >gi|225935371|gb|ACGA01000021.1| GENE 36 54525 - 55697 947 390 aa, chain - ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 125 386 89 350 352 160 35.0 5e-39 MNFRKISFWVLSCVFSISMAGCKSGKSNSQNETITDSVWTKSKVIETIKKVNDYWQAENP EPAFAFWHPAAYHTGNIAAYEVTGNETYKKYSEAWAEKNQWKGATSDDKSKWKYNYGETM EHVLFGDWQICFQTYIDLYNMDREPAEYKIARAKEVMEYEMSTPSNDYWWWADGLYMVMP VMTKLYKVTGNQQYLDKLYEYFTYAKKLMYDEESALFYRDGKYIYPKHKTNQGLKDFWSR GNGWVFAGLAKVLQDLPENDVHRAEYITIYQAMAKSLAAAQQEEGYWTRSLLDPVQAPGR ETSGTAFFTYGYLWGVNNGLLDKSTYEPFIKQGWKYLTEIALQPNGKIGYVQPIGERADQ HKNVGPETTADFGVGAFLLAGSEMVKYLNN >gi|225935371|gb|ACGA01000021.1| GENE 37 55642 - 55884 119 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170931|ref|ZP_05757343.1| ## NR: gi|260170931|ref|ZP_05757343.1| hypothetical protein BacD2_03619 [Bacteroides sp. D2] # 1 80 1 80 80 143 100.0 4e-33 MLILKTQLNTQKEIFLKFINYTVFRGYYLGNKDACFLGQKGSKMRQNRTKIIQYIENCLF EERTIRYPCYKAYKLQAHHN >gi|225935371|gb|ACGA01000021.1| GENE 38 55835 - 56023 166 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVFYLISVLSILIVFIVPCIIGTRYMIKKQIAWHIQILWLLLIVLTSYGGLVVCMLYNK DI >gi|225935371|gb|ACGA01000021.1| GENE 39 56040 - 56468 184 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170932|ref|ZP_05757344.1| ## NR: gi|260170932|ref|ZP_05757344.1| hypothetical protein BacD2_03624 [Bacteroides sp. 
D2] # 1 142 1 142 142 209 100.0 6e-53 MNLVLLNFVWSGVLIGESFIRFWYIIPATIVIEMVIIKLMTSHLWGKSLLISTVGNLVSG VIGTIVIGCASILIDFLFGSDFFPTMLITWILMFAGSFGLEYLTVRWIFKDKSRALMWAI LSGNVLSYCFIVISLLVKYWIK >gi|225935371|gb|ACGA01000021.1| GENE 40 56482 - 56943 274 153 aa, chain - ## HITS:1 COG:no KEGG:BT_4357 NR:ns ## KEGG: BT_4357 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 153 119 220 1178 74 35.0 1e-12 MGLAIKIVGVSLGMSAMGATSLYGQKVGEDSLRNDCTMLPQVEVVGWKAVKKVSVAGGIV LVSQDRHSISVEGIVMDKCGHSIIGAEIVEKGTAHRTLSDLNGRFILQISGKRAVLKVNY PGMKTKSVRVRKRKGKNIKVVLYKNRRVLDEIL >gi|225935371|gb|ACGA01000021.1| GENE 41 57627 - 59186 1246 519 aa, chain - ## HITS:1 COG:no KEGG:BT_0288 NR:ns ## KEGG: BT_0288 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 517 1 506 507 755 76.0 0 MTKTTTGSSRLTVILSWMIALICFVGGALYIWKMPSPDQEAMDLQMKQQKNQKSRKSLTE KVKQEQAQRALPANYARQLVEQSEVLVKKNLEEQVKKFQTMAETMRRRKDDMLAKVEKRK LPWGAPADANDTSKARAIPRAGNPSANASVEELYAVLRDYEAEIQQSHLAVSAAKQSLSK GQSFPEVYSSLKAGNSRMPSFDELIRMQSRDSEWATSAGSNASGGLEIKSTADLNNYRGL LGQATRQAGLAQSRLEGLFGAVRQAGNPVSGMGQGTGSPGNGGPGGNGSGNGNGTGMGMG GSGEVSASGGGKGTFSAYSGSKLSKEMVKAQALPGRRFSKDAERKGWLYINTWYMIGPWE SFGREDFSIVHPPEVSIDFDAVYTDGQIGTGIAETDSDPLKVIGDEVQLDGTLRWKFMQS ESMHNVPPVTTGHSTYYAYTELYFDEATTMLVAIGTDDSGRVWINGKDVWRDYGTSWYNI DEHIAPFQFRQGWNKVLVRLENGGGGPCGFSFLIIPQSK >gi|225935371|gb|ACGA01000021.1| GENE 42 59161 - 59586 254 141 aa, chain - ## HITS:1 COG:alr0644 KEGG:ns NR:ns ## COG: alr0644 COG0848 # Protein_GI_number: 17228140 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Nostoc sp. 
PCC 7120 # 1 119 1 121 217 59 28.0 2e-09 MRLSLFNNEDEPEVSMSPLIDCVFLLLIFFLVSTMTKVKNRDIPVDLPTSQSAVKLKPDD KQAIVGLDADGNFYWDGQPCSTNFMMEQLRQTCITDPGKRIRIDMDRHTPFGRFVEVMDA CQFYNLTNIGIRTYDENYNRE >gi|225935371|gb|ACGA01000021.1| GENE 43 59591 - 60037 368 148 aa, chain - ## HITS:1 COG:no KEGG:BT_0286 NR:ns ## KEGG: BT_0286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 256 85.0 2e-67 MRLNYDDDDKVEVQMSPLIDCVFLLLIFFLVTTMMKKWEMQIPLTLPTMTSSLSTTRAGE EAVIIAVDEKKNVYQVVGHDAYSGESTYLPVKDLSAFLSELRNSRGVETNIDVTAYRTVP VSTVIEIFDQCQIQGFTHTRVRLGSKPY >gi|225935371|gb|ACGA01000021.1| GENE 44 60021 - 61406 1477 461 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 255 448 5 196 202 92 29.0 2e-18 MNYTKMKFGLKKGIVGVCILGAILLFTTPLSAQTNGKSGKGLYELFSTKLPDGAQWVKDN KYVRKGGYSKKTENYIYVGAAKSPAEMTGLSIPIRENPGPGEYRYITFAWVKWGGEQIGM KFHVSEKSVNQKGKKYDFTYIAGEPKDLINQLKGLDLGDKPGHWMVMTRDLWKDFGNITL TGMSFICPERRDAGFDEIFLAKTQDDIKNAPKVLPSEIATAVPAVEEDDELMYDEAASDS IAADQGVQIDWGAQIKAGGTMMYPLYLLGLIALVIAIQRLITSRQGRLAPKPLCKSVNEC LANGDLKGALAACDKYPSTLANSLRFIFEHVKAGREAVSQTAGDMAARDIRTHLARIYPL SVIASLAPLIGLLGTVVGMIEAFGLVALYGDEGGASILSDSISKALITTAAGLIIAAPAI ALYFIIKNRIMRMASLVEVKVEEVITKLYLDNEEEKNETEL >gi|225935371|gb|ACGA01000021.1| GENE 45 62125 - 63675 1031 516 aa, chain + ## HITS:1 COG:no KEGG:BT_3420 NR:ns ## KEGG: BT_3420 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 516 6 508 508 580 57.0 1e-164 MNKNFSFLIFIIAMSCVLTGWTFRSNVKYPSDVETALNKAGTNRAELEKALRYFIEKDDQ QMLQALYFLISNMDIHYTETYVLKDTLDNNIPFREFDYPDIASAVAAIDSMRAFYGGLVF KDTIIEDLKSLKGEYLIDNVNQAFEAWRSSQFKEIPFEDFCEYILPYRVTVEPLEAWRET YRQKYRWMTDSLQNKPLERVLEYAGMDYRLWFTSSYGQKPLIEDEPLSRLSALQLLFRKR GACEDIAALQVFSFRSQGIPAAYDVVPWWATSMGSHFVNTVFDKRMQPVRLDVTNNTVVN 
RNMNREPAKVLRTTYSKQPDVLAAKVGWQDIPPCHLRTLNYKDVTPQWWESSDVTVGLSP DVPKADIAYAYVFNWGRWRPAWWGEVRGDSVTFSNMPKGIVILPVYYKHGRVVQAGYPQV HAYNHELQLTPDTVHRRTVEIKQQDGYLIFRPGKEYELFYWDKVWKSLGTQIAEEGKTSI FFENAPDNVLFRLIPEYSVDKERPFIIMQDGTRYWW >gi|225935371|gb|ACGA01000021.1| GENE 46 63708 - 64991 844 427 aa, chain + ## HITS:1 COG:no KEGG:BT_3421 NR:ns ## KEGG: BT_3421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 408 25 410 428 111 23.0 6e-23 MMKVYIYRLVLAGLMSLWSNAGMGQVSKSAITHFPVHIVQRIHEVISITPVSEVTQMKLG THFMQQDSLAALALHQTGLSDTLGIYFRTSVEELRSILSSLEFNAYKLKMIPRSSSRLRK IVLQKEALHLTPKQVDHLLKESDCMERMKKQKKEDIRAIEQRKADSILTHAKYLTYYEQE NQEKTSKTVKKWLTDLKRNYLMRADMDSAKLYQPLFRFESEQQAIISYWKDYGDKEKIKE AQAHYEALKPRDVQRMEIYKQLPSWSMIREVILKRDTLSLNLDENTLDTLLATYNTYLKV RQQKKAKKEKFSDRGLECKLIVPILTMERIDKLLVAKRRIQAEKNALKRIPVLEKYELVN NSNRKDIMKELTDYELKLEVAQEWVNIERSQENLFKLCDVKDHKPVVLQELDRKEKGGKK ENRENIF >gi|225935371|gb|ACGA01000021.1| GENE 47 65055 - 65789 560 244 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170941|ref|ZP_05757353.1| ## NR: gi|260170941|ref|ZP_05757353.1| hypothetical protein BacD2_03669 [Bacteroides sp. D2] # 1 244 1 244 244 329 100.0 1e-88 MKLQKLIWAVALFTLLIGLGSCSEEEYIESTSFTNDAIGDITVKVISSTDIPKGIRPLIL KDKTELDEFIKEINSIKFNDFPSIKGNIIRIKTRTEGGSSTGGGSTGGSGTGGDIDGGDT DGGDTDGGDTDGGDTDDSDNTDDTDKGQAGSKSFTAELDANGDYQIIIDLTWAQKGKGDI SVSSKNSNTWYFSSWTQDSGAASWLGSSSISYAITGTIKWYAILELQWIEMTRQSFTIKG TEQV >gi|225935371|gb|ACGA01000021.1| GENE 48 65794 - 66021 151 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170942|ref|ZP_05757354.1| ## NR: gi|260170942|ref|ZP_05757354.1| hypothetical protein BacD2_03674 [Bacteroides sp. D2] # 1 75 1 75 75 128 100.0 1e-28 MKNDLGFILGNWDAISCIIYTFSLIAVGFAITDKVNVWNKILNIFIILSFPIIGCVFYII IKSYKRRKNVKPNSE >gi|225935371|gb|ACGA01000021.1| GENE 49 66270 - 66731 208 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170943|ref|ZP_05757355.1| ## NR: gi|260170943|ref|ZP_05757355.1| hypothetical protein BacD2_03679 [Bacteroides sp. 
D2] # 85 153 1 69 69 113 100.0 5e-24 MNRLTYSLLWALPIINYLWSIILNMTPNLPFDYIAQQIKEEPPIYATARIIVSILLIIIL YSYLLVCYIRIRHGKPIKLVTISWMMIQTASHVAMYVYTFFIDYNDHREYALYINAFWFF HSLLYFIPFYLLAFYLFKKIELISKKQNKGAIQ >gi|225935371|gb|ACGA01000021.1| GENE 50 66728 - 68413 686 561 aa, chain + ## HITS:1 COG:VC0393 KEGG:ns NR:ns ## COG: VC0393 COG3307 # Protein_GI_number: 15640420 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A core - O-antigen ligase and related enzymes # Organism: Vibrio cholerae # 84 374 144 445 597 67 25.0 9e-11 MSKTLIYKTRWRFLPILNYLLIASFSILCTGSLFVNSSAFTDNQIFPKWLFVFMGLGVIG LVFSVMLLAGKKYVCNMKMLNAMILILCVLQAGYGILQFCNILPPSSATYKVTGSFDNPA GFAGSLCAGLPFTVYFLLDTNTKSIRWSGWFAMLIISIGILMSESRTGLLCIAIVSIVHF VYYNVKKLHGRKAIIGFCFSLMLLLASAILLFNAIYQWKKDSADGRLLIWKCTWEMIKDK PITGHGIGAFESHYMDYQAKYFEEHPNSQFTMLADNVKHPFNEFLSIGVQFGVIGWGCLL ALAIFLVYCYRRHPSQEGYVSLLALVSVAVFSSFSYPFTYPFPWIITVLGIGVLIGKGYG NLRFSQMNMMKKAIAICLLVASAFLLFTVGNRIKAEMEWKRISRLSLRGQTYSMLPHYQR LLAILGKEPYFLYNYAAELCVAKRYDDCLKTVSACRKYWADYDLELLQAEALIGLEQYEK ARHYLEKAVTMCPVRFVPLYRLQYVYKKMKNEKEANRLAHLIIEKPVKVDSDIVRAIKRE MRLYLQHKKQQDNIRKEKYYE >gi|225935371|gb|ACGA01000021.1| GENE 51 68765 - 68962 186 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVYWLVNITLLSGVWILPLFWGFFLLRKNRVPIGMVILWMFCILATSYIGVLGCYIYLTY LKANK >gi|225935371|gb|ACGA01000021.1| GENE 52 68976 - 69176 116 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNLKYLTLTLFSGLCMGAIFINSGLLTDSPIFPKRLLVFARLRITRLCLSYLLLTGKNV SVMSRI >gi|225935371|gb|ACGA01000021.1| GENE 53 69471 - 70151 111 226 aa, chain + ## HITS:1 COG:no KEGG:BT_3534 NR:ns ## KEGG: BT_3534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 225 24 225 226 98 33.0 1e-19 MRLFFKYILITIIVNISVNTVAQEANIQRDSFAPLHSVPEWIKTKTTPEEYQLWMTMSKI YKINYSFLNETISKEKRQKLYDDIRVICDSIEAGKYAKHKGELFVVQLIPRTDTTLNWRN EETIQIDEHIRYCKQKAIIYQSVQKKTVYLECIVWYIYNSLNKEVYIIKYETSSPLAFSR FLGNLHFSFLKETNELIGSCSGTLQYYDERHEYHGETFTKTFSLKL >gi|225935371|gb|ACGA01000021.1| GENE 54 70209 - 
70520 375 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170946|ref|ZP_05757358.1| ## NR: gi|260170946|ref|ZP_05757358.1| hypothetical protein BacD2_03694 [Bacteroides sp. D2] # 1 103 1 103 103 167 100.0 1e-40 MKRHSLLGKIVITSAFLVFLVSCKNSKQIESVASKKQFLKELHELGFSIVYGKISLESKK DEGEDDIQKLLNLFGEKNDDDSKLDKDTIWHKAYEKSKEIPLK >gi|225935371|gb|ACGA01000021.1| GENE 55 70552 - 70986 116 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170947|ref|ZP_05757359.1| ## NR: gi|260170947|ref|ZP_05757359.1| hypothetical protein BacD2_03699 [Bacteroides sp. D2] # 1 144 1 144 144 218 100.0 7e-56 MSKTLIYKTLWRLLPILNYLFCIIVDWIESLNNVNNAIHPPLYWNIYVIGNVLLIIWLYL YLWICSSRDFSKQAVLIITILSVSLQTLIHFGVFAFVTTTTYDLYICDNNAFPIGKFIYS MLYFMPFYVVAFVLRGKLFQIKLK >gi|225935371|gb|ACGA01000021.1| GENE 56 71250 - 72482 686 410 aa, chain + ## HITS:1 COG:TM0967 KEGG:ns NR:ns ## COG: TM0967 COG0582 # Protein_GI_number: 15643727 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Thermotoga maritima # 123 388 9 247 253 65 25.0 3e-10 MRSTFKVLFYVKKGSEKPNGNLPLMCRITVDGEIKQFSCKMDVPPRLWDVKNNRASGKSV EAQRINLAVDKIRVEVNRRYQELMQTDGYVTAAKLKDAYLGIGVKQETLLKLFEQHNAEF EKKVGHSRAQGTFTRYRTVCNHIREFLPHTYKREDIPLKELNLTFINDFEYFLRTEKKCR TNTVWGYMIVLKHIVSIARNDGRLPFNPFAGYINSPESVDRGYLTQTEIQTLMNAPMKNA THELVRDLFVFSVFTGLAYSDVKNLTADRLQTFFDGNLWIITRRKKTNTESNIRLLDVPK RIIEKYKGLARDGHVFPVPNNGSCNKILKDIGRQCGFKVRLTYHVARHTNATTVLLSHGV PIETVSRLLGHTNIKTTQIYAKITAQKISQDMETLSHKLEDMEKNICRAI >gi|225935371|gb|ACGA01000021.1| GENE 57 72500 - 72892 188 130 aa, chain + ## HITS:1 COG:no KEGG:BDI_2139 NR:ns ## KEGG: BDI_2139 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 128 2 125 127 143 57.0 2e-33 MKEERNIITMDGQGNISLPSDIGATAMTEREICELFGVIAPTVRAGIKALCKSGVLSIYD IKHIIRLSDKYSAEVYNLETIAALAFRVESFGAAKVRRALLERIIHGRKEKTTVFVSVVS DGKPNSRWKA >gi|225935371|gb|ACGA01000021.1| GENE 58 73485 - 74852 584 455 aa, chain + ## HITS:1 COG:no KEGG:PG1109 NR:ns ## KEGG: PG1109 # Name: not_defined # Def: 
mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 1 455 1 455 455 387 48.0 1e-106 MGFVVLHMEKAHGSDSGTTAHIERFIIPKNADPTRTHLNRRLIEYPDGIKDRSAAIQQRL EEAGLTRKIGSNQVRAIRINVSGTHEDMKRIEEEGRLDEWCADNLKYFADTFGKENIVAA HLHRDEETPHIHVTLVPIVKGERKRRKREEQTKKRYRKKPTDTVRLCADDIMTRLKLKSY QDTYAVAMAKYGLQRGIDGSKARHKSTQQYYRDIQKLSDDLKAEVVDLQQQKETAQEELR RAKKEIQTEKLKGAATTAAANIAESVGSLFGSNKVKTLERENTALHKEVADHEETIEALQ DRIQTMQADHSREIREMQQKHGREIADKDTRHKQEISFLKTVIARAAAWFPYFREMLRIE NLCRLVGFDERQTATLVKGKPLEYAGELYSEEHGRKFTTERAGFQVLKDPTDGTKLVLAI DRKPIAEWFKEQFEKLRQNIRRPIQPQRKGKGFKL >gi|225935371|gb|ACGA01000021.1| GENE 59 75049 - 76278 516 409 aa, chain + ## HITS:1 COG:no KEGG:Acry_2349 NR:ns ## KEGG: Acry_2349 # Name: not_defined # Def: putative transposase # Organism: A.cryptum # Pathway: not_defined # 23 344 7 333 397 170 32.0 1e-40 MRAMKEVLNDIYTDGCKEKKYRLSDFFNRWWDEYAQHPAEYITPEQYKAVNAIRVCRTTT LGIDTYVCPDCGEVREISHSCKHRFCPSCGWRDTLKWAARMKEKMLRVPHRHVVMTLPHI LLDLVKRNRKEMLNILMRTSAETLKDWMMHKFGLKTGVIAVLHTYGETKQLHVHTHMIMS WGGIDNDGKIVIPEHDYVHIPSICKVFRYKFEHALIGLFDAGRLEHDFHDRMEFMGFIKK VVNKKDWIVHLEQPIQMPEQVIQYVGRYSKRACLSEYKITAMDGENISFRYRDYKNSPDR RNPVEKELTLHYREFFPRLLQHVPLRYFRIVRYYGFYSNKGSLPEEYFGRDESEIEEAEL QQAESGYENPYFCERCGRTRVYSHTTVTGGGMTYTVILEHCDIHRKKAA >gi|225935371|gb|ACGA01000021.1| GENE 60 76606 - 77406 148 266 aa, chain - ## HITS:1 COG:BH0380 KEGG:ns NR:ns ## COG: BH0380 COG0030 # Protein_GI_number: 15612943 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Bacillus halodurans # 7 265 18 277 284 146 32.0 4e-35 MTKKKLPVRFTGQHFTIDKVLIKDAIRQANISNQDTVLDIGAGKGFLTVHLLKIANNVVA IENDTALVEHLRKLFSDARNVQVVGCDFRNFAVPKFPFKVVSNIPYGITSDIFKILMFES LGNFLGGSIVLQLEPTQKLFSRKLYNPYTVFYHTFFDLKLVYEVGPESFFPPPTVKSALL NIKRKQLFFDFKFKAKYLAFISCLLEKPDLSVKTALKSIFRKSQVRSISEKFGLNLNAQI VCLSPSQWVNCFLEMLEVVPEKFHPS >gi|225935371|gb|ACGA01000021.1| GENE 61 77652 - 78377 595 241 aa, chain + ## HITS:1 COG:no KEGG:BT_1755 NR:ns ## KEGG: BT_1755 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 235 51 276 408 93 32.0 8e-18 MNRILTLYLFLLLCGTASAQQIVKWDDLQTITDNARRTVYYEKGSKQPLQGEYRIIRGLD EERVKLSDGIINGDYLRYRDGVLRESGIYAKGKRNGIFTEYYQDGVTPRKETPMQQGKID GTVKTYFRNGKIEIEKEYRQSVESGRERRFDSKTGEQIFESHYIDGKKEGEEWEIFEDGR TLRSRTTRHYRNGKLDGFYRVESTRDGKPYITIEGQYTDGEKSGRWKQYNATDDTTHEWD E >gi|225935371|gb|ACGA01000021.1| GENE 62 78640 - 78819 109 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170955|ref|ZP_05757367.1| ## NR: gi|260170955|ref|ZP_05757367.1| hypothetical protein BacD2_03739 [Bacteroides sp. D2] # 1 59 1 59 59 110 100.0 4e-23 MKKRIVSLLLVCATMASSTLRSQPVQLKVNAGEQIATVPRLFADDKAWFDIHFHVILIR >gi|225935371|gb|ACGA01000021.1| GENE 63 78861 - 81359 1604 832 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 24 817 53 871 871 526 38.0 1e-148 MNRIYLLLFVCLFLSACQATVQEKERRTEFNDGWRFHLGEMPDASSPQLDDAAWKTLNLP HDWSIGGDFSEKNPSGFSGGALPGGTGWYRKTFTIGKKLAGKKICIDFDGVYMHSEVYIN GQLVGKRPYGYISFRYDLTPYIKFGEPNVIAVKADNSRQPNSRWYSGSGIYRNVWLTVLN SVHVDLWGTYVTTPKVTATQADIVINTTVKNENTDPCDVEISTTLLDAQNKSIRKAVETL RVSSEHTGVCRQTIALDNPRLWSPETPYLYQVKTELRVDGKLTDTYYTTTGVRTFTFDAE KGFSLNGRSMKIKGVCLHHDLGCLGATVNYRALERQLEILKEMGCNGIRTAHNPPTPELL ELCDRMGFIVQDEVFDMWRKTKSKYDYSNDFLEWYERDLTDFILRDRNHPSIFTWSIGNE IWEQWSDSFGDTLSVEYANQLFKLGLTPEKVKEQSDKHPYTLLTTRLTEIVRKLDPTRPI STANNETEPCNLLIQSPAPDLIGFNYNNHNWEKFNEKYPGRKLLITESTSGLMTRGYYEM PSDSIFIRPHAEKAKFDSPLKECSAYDNCHVPWGSTHEKSWDMVKTMDHIPGYYVWTGFD YIGEPTPYEYPARSSYFGIIDLAGFPKDVYYMYQSEWTNKKVLHLFPHWNWEKGQEIDIW VYYNNADEVELLVNGKSQGIKRKGENEYHLSWRVKYEPGTIKAISRKNGQMVLEQEIRTA GEPAQIRLTADRRTIKADGKDLSFVTVEILDKAGTLCPNADNLVRFEVSGNTSIAGVDNG SPYSMERFKNNKRKAFYGKCLVVLQNDGTKGMSTLKAISEGLEAAEITLSGK >gi|225935371|gb|ACGA01000021.1| GENE 64 81587 - 83155 1286 522 aa, chain - ## HITS:1 COG:no KEGG:BT_0374 
NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 519 6 516 516 659 63.0 0 MDSVIRKFPIGIQNFESLINDGYVYVDKTALVYRMATTGRYYFLSRPRRFGKSLLLSTIE AYLSGKKELFKGLAIEKLEQKWEEHPILHLDLNTENYKEPDSLRRRLDSTLSYWEQLYGT RAVETSLPLRFEGIVRRACEKTGHRVVILVDEYDKPMLQAIGNETLQAEYRNLLKAFYSV LKSQDRYIKLAFLTGVTKFGKVSVFSDLNNLNDISMDYRYMDICGITDKELHENFDGDVA LLGERNGLTKEECYVKLKEQYDGYHFDYDTVGLYNPFSIFNTLSKLKFSDYWFETGTPSF LVYLLKHSNYRLDRITEEQVSGDLLNSIDSMSRNPIPVIYQSGYLTIKGYDKEFGIYRLG FPNKEVENGFIKYLLPFYTPVTEQESSFIITSFVMDIRQGNVDSFMQRLQSMFADTDYKI VGKMELYFQNAMYLVFKMMGFYTDVERTTSNGRIDVVLQTKDYIYVMELKLDGSTDEALC QIEEKGYALPFAKDSRKLYKIGVNFSSETRGIVEWKIVEDNS >gi|225935371|gb|ACGA01000021.1| GENE 65 83342 - 84643 846 433 aa, chain - ## HITS:1 COG:no KEGG:BT_0282 NR:ns ## KEGG: BT_0282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 432 1 431 431 666 75.0 0 MDKKSVLLSLWQKFVNRFYKSGECEGLMEDVSKSSVSCYEELKERKEIKERDSLSKTKQL TESVDLFLKSHYDFRYNVLTEETEFRLLERMNEGFRPVNQRVLNTICLEAHEAGIGCWDR DLSRCIYSTRIAEYHPFRLYLDELPEWDGIDRVSALARRVSESPLWEKGFRIWMLGMTAQ WMGVMGDHANSVAPLLISTEQGYFKSTFCKSLLPPVLRRYYMDKVDLTSQGNVERRLAEM GLLNLDEFDKFSPAKMPLLKNLMQMASLSLCKAYQKNYRALPRIASFIGTSNRKDLLTDP TGSRRFMCVELEHPIDCEAIDHEQIYTQLKAEILSGKRYWFTKEEERELQENNMKFYRQG PVEDVLRACYRSAERREECDLLSAADIFQCLKRRNPAAMRGANPASLAQILIAVGIERKH TKFGNVYRVVKIG Prediction of potential genes in microbial genomes Time: Fri May 13 07:12:27 2011 Seq name: gi|225935370|gb|ACGA01000022.1| Bacteroides sp. D2 cont1.22, whole genome shotgun sequence Length of sequence - 13713 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 65 - 703 441 ## BT_0281 putative DNA-binding protein 2 2 Tu 1 . - CDS 1035 - 1727 303 ## gi|260170960|ref|ZP_05757372.1| hypothetical protein BacD2_03764 - Prom 1789 - 1848 7.6 - Term 1752 - 1817 -0.0 3 3 Op 1 . 
- CDS 1872 - 3971 1186 ## BT_0278 hypothetical protein 4 3 Op 2 . - CDS 3977 - 6412 1738 ## BT_0277 hypothetical protein 5 3 Op 3 . - CDS 6450 - 8375 1351 ## BT_0276 hypothetical protein 6 3 Op 4 . - CDS 8388 - 10223 1101 ## BT_0275 hypothetical protein 7 3 Op 5 . - CDS 10216 - 11016 636 ## BT_0274 hypothetical protein 8 3 Op 6 . - CDS 11041 - 12561 1312 ## BT_0273 hypothetical protein 9 3 Op 7 . - CDS 12581 - 13693 810 ## BT_0272 hypothetical protein Predicted protein(s) >gi|225935370|gb|ACGA01000022.1| GENE 1 65 - 703 441 212 aa, chain + ## HITS:1 COG:no KEGG:BT_0281 NR:ns ## KEGG: BT_0281 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 213 330 83.0 2e-89 MAVQFELYKTPMPKEKKNKVRYHARPISYETVNTKKLVYRIHDRCSLSPSDVTATLEELK YEVAQCLKEGKKVHIDGLGYLQVTLSCEEEIRDPKDKRVHKVKLKAIKFKADKELKAELS DMEFHRSKYRPHSAGLSEVEIDMELTKYFSENQLITRKDFQYLCGMTQITAYRHIKRLIA EKKLQNKGTTHQPIYTPVPGNYRVSVAVKYKE >gi|225935370|gb|ACGA01000022.1| GENE 2 1035 - 1727 303 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170960|ref|ZP_05757372.1| ## NR: gi|260170960|ref|ZP_05757372.1| hypothetical protein BacD2_03764 [Bacteroides sp. 
D2] # 1 230 1 230 230 422 100.0 1e-117 MKTKIKMLLCMFVGMFGYCMLTFAQPSLEEQKVKIERWKREFKSLERQYAQNHVLHNGGG LVAKLSKKNHSRELARWRALYEKKAADRKLAKAEWLRLKAIIVTTTDSVEKERLMSLEHD WLYVYQTFNGNCRTALTFMAKLDTDRILAADEQTLRIAKRLKVQEKIFPKGQEKKMKAVQ CEIRDKYMKPLYQKMALVPDSVSERDVQIDSLHLIHRALIYYKKSSLLDD >gi|225935370|gb|ACGA01000022.1| GENE 3 1872 - 3971 1186 699 aa, chain - ## HITS:1 COG:no KEGG:BT_0278 NR:ns ## KEGG: BT_0278 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 696 7 585 590 266 30.0 1e-69 MRTQKIMKWSFLAVMLLGNISISCTKYDTPDFVEETGGVSGNLSVKKNVLWINLEGAGGG DLVKNAFPDDGVVKGLLPRSRYAWNGLESEHIDDTYTPTAENAIACASMLTGNIPMRHGI ADETYLSEVYFDPDYDESMKAYPGFFQYIVDYDKSMKTLAVTPWKTQNQELLEKAGTTVT TTSDEETLNTVLQQIDNENNRVIYLSFRDVLDAAKSGGGWKDSNTQYINAMHKMDGYIGQ LLEAIRARPEYYYEDWLIMITSNHGGTADGRYGGMSLEERNMFGIFYYEHFSKGKEMNPG LVEDVLCFDQSFQGVVIDSIEPTPDKKGIETMRQIYSLDSLDSGMTVEIIMAARPSIGRS YVPGDQNGKNLFGKRRWNMSLTHTYASALGSFYGKNDGTDGQRTAAFLNPMIHTYTSTVK LYDTEAHIQKDWVDEVIDQWGNVTEGYNKMTPKRKGKVAVYSYYDGLKKTTKSTDADLDW SIADYVDNTNLTLYGDMSNLCRYILELRIWNKELSPAEVKQYSNKLKLSPSDPMYGNLIG YWQFYKGEDGQYLKDDSIVVNQIKQVKKRVKVGDREEEQFLDTEGLRLRKKFTASNNKSY YTYVEKEDLKYQTLTNTLYQTIESEGRMMESVLPVPTVLQWLDISFPLETTRETGANAFK TSKLDGVAYPWDGENNRAVWRGMFLGDYTVDLEWREYEK >gi|225935370|gb|ACGA01000022.1| GENE 4 3977 - 6412 1738 811 aa, chain - ## HITS:1 COG:no KEGG:BT_0277 NR:ns ## KEGG: BT_0277 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 668 41 601 705 273 31.0 3e-71 MKDKIVYILLAVWGVWGATACEDKDFEQIISATDDYTSATGGDYYEGGGIDVSLYDKARI FPGLVDTLTEKRVDEQVVNIDMAYRSAKAVNVNLAMTSPAIYSTGLYAGAGEKITVMLDD DVKGLTVQIGIHSRDLSSLVGSSYLERDPKVVTSMALFKGKNEIRNPYGGYIWIKRSGDA SDTGIVPLKVQGAYLAPDYVVGETEAAEWGEKIKTTTVPWIELRGKQIAFSVPVKYMKLK LQSEGQSFVTRLEQSLELWDDWVLCYNEFYGLDDAESETFPKPDFPVRVVMDAHLVTERY SYYSNTNLELLQTEELIDMIADPEQVKAGALNTSHVVGWMSLGLFVQTYWPTPAPNSFKD MYSLMPNFYFLYKHGWWGNQQDAKLFAYKLFGRDQKVINTTQYNLNADEFENLVSWAKAD SCKIYSDEAKRPSKSGNDYWPAALTFYSAILSYKQEDTGKDGWKYFAYLNRFLSNEGQNV 
SIFNRLSMSEAMLTCLSHYFERDFTPLFDRYGIEISDKMRAEALQYKAVEKRIWEHNPLK GNDTSDFDGKVFYTKSGKTPFRHLRSEWTAVAYSGTGEDLKASNYGYVYESRAIKYNTPF NLFDGDRSTLWQSYSDLYEEYTDENGDKHYPYKIDNLYYNAKAPDLPYTIVIQPGESRSI DLDGVYMAFGFTEVNGIYNSDVKDYDKYAFRPQHIIVEVTSTPLEYNDVDTIYTNIQQVQ WKQVFDSDRDPKGTPSQQFWPDRSNLFYIELDQKATNVTGIRLTMDRESHIAKDRPANFP ADEKPNRPEFTNKYLNRIQKIAEFGTFYFNE >gi|225935370|gb|ACGA01000022.1| GENE 5 6450 - 8375 1351 641 aa, chain - ## HITS:1 COG:no KEGG:BT_0276 NR:ns ## KEGG: BT_0276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 638 2 596 601 268 32.0 5e-70 MKVNNYKRITNLAGTCFLGILLLLVTACNEVMEDSLRYDYPASGSNYESGHVLLVVMDGA AGRAVQAARNAYKAPNLKSMIAHALYTDYGLADGSNNIAGGEMTNARGWANLMIGNTTHD IKTEDDLIAGTDNFISRLVEENSLVSMYAVDEKFRQTFAVKGMTAPEVNTDEAVKNGVLE ELKLPDTSDLIVAEFGGVREAAGGEFYNENGTPTEAVVNAIGVLDNYIGEMWSALKERPG YENENWLIIVTSNYGGDVQMVEGKEFADHYADVSRNTFTLMYNERLVSQIQAAPGNTALS YSFSTPAWSYDYRNPNPNRYAESARLGNTEMGEFYFNDKNEIEPVTIQFFLSSSVYNSRK YVILSKSSNMDEKTKVGNGWFFHFNADTNNRRICFGFGGKRWLIQTKDENNLDWSQWHVL TLTLEPNPDPKKPANTLLTIYIDGELNNQLSYKNSEIVNGYTQNKSFPSTDAPLRIGGTE NRDSQNSQQNTKNQQFSNYIYVTNLQIYDVAIPKEDVALYAGKNQLHLLKDSYKYWDNLK GYWPCDLEDDQMEPTLKNYAKDNGEDATDDFVIDRGAADVWLSGSSLSPAIHPIPESDKT FYVKTFNTVDVPRQIFVWLGKNVRWDWAMEGKAWKFAYEEF >gi|225935370|gb|ACGA01000022.1| GENE 6 8388 - 10223 1101 611 aa, chain - ## HITS:1 COG:no KEGG:BT_0275 NR:ns ## KEGG: BT_0275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 602 11 559 561 374 38.0 1e-102 MNKIDCYNSWLCKAASMVCLAVCVLALAGCNEDEANSIKEDPYAGGRAPLAVKLLSEKPS PESAGPREKVIFKASGLAEYCHPEEGTFDFDFYISDEKCKIESATDSTLTIVVPEAVSSG TTYLVLENQIFYGPYFSILGSVSIDEGFEYYKTGPYNGLIYSCVPWCGNTELTSEFYLCG DFRQEKTKPYGGIAMINNEKGLIKYGTADKIKIGRGIATGSFYDENTDEFYYPEVDGMNY WKADKESPRALIYGAFREYETYSASLGGFEFKNILLANNDLTIKTETKKFSDYTGKTYDI SVPAFVGGTMMSEKIIRAFSTSDGKIIAIGNFTVHRMTDYDNTTCDANKRLLAAEILTPA RSVMRMDEIGQLDKTYRRSLQNEEESLPGTTGEIKDACMLSDESLIIVGAFTSFDGKPVW 
NIVKLDKNGQVDDAFQSVVGGANGDINRITCTSFKDENGMEQERIVLVGSFTTFNGQPAQ GMVILDAEGNPDPGFVLKELEGGILNFAKIVDLNANGETAMPHVVISGTFTKYDGITRQG FLILDMKGDAIQRFNVPGRFYGQLYDAQYSLTSDNVNGLLLTGDFSSFDGKRMNNIVMLK VDLAENTNNEP >gi|225935370|gb|ACGA01000022.1| GENE 7 10216 - 11016 636 266 aa, chain - ## HITS:1 COG:no KEGG:BT_0274 NR:ns ## KEGG: BT_0274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 235 240 215 49.0 1e-54 MKNNSRKLIFGICAFLLATLSLGSCNSDDYLVDGGTSNPYFDGNVMQYLESRPDLFKDLV KVIKLTKWEDILTNEEVTFFAPTDFTIAKSVERMNYYLYNYQGMDKITELSQVKPAVWED LIGMYVMEGKYRLNDIAQIDTAAVSAFPGQTNFTYDRSYKMTMGVCYGDASGIKYAGYRQ VMYAYQDFGYPEYAYVSSCNIEPTNGIVHVLRLDHYLGFSISGLLWRALEAGVDYPKKGG GNSGLIKTVVREEEFKDMYIKEKNDE >gi|225935370|gb|ACGA01000022.1| GENE 8 11041 - 12561 1312 506 aa, chain - ## HITS:1 COG:no KEGG:BT_0273 NR:ns ## KEGG: BT_0273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 506 1 503 503 356 42.0 2e-96 MLKYISIGLLLIFTSTFCGCNLDTDDEDKMPGDKFWESAANAEAFMLSVYQNLRKATTDN GFFLYAGDVRCAPITGQNTNNYLYLIQNDMKNYKAKKDSKEEGTSSDFGAIYNWQNMYKV VQSANIMIEEVANVKELSAVEVERYRAECKFLRNLAYFFMVRIFGDVPYYTEAYAAAPLP RMDKTVVLKTCLADLQTVLDNDPDKGILPWRNGSGSLRANRGAVLTLMMHMNMWLAFFDE NNAVTYYNEVRRLAETDSWIDSSIYSLQSMSQMSEVFQGESNEGLFEIAQNVTMGEIFKT DHMWCTKVVYKIRNRTEPEFKYSEVFLKQLYPEETADMRKESWFNYLYLDDDDYVGTARR LEIIKLLNADTNGNSTIPNAGNYIVFRLADAILLYAEALNNLGESNDALREVNRIRQRAG APDFTDADNLDASIYWERVRELMGEGQYFYDLVRTQKICDTDFAVFADESGHREKKADLK QGSWTWPIYKKALDNNPYMTKNLYWE >gi|225935370|gb|ACGA01000022.1| GENE 9 12581 - 13693 810 370 aa, chain - ## HITS:1 COG:no KEGG:BT_0272 NR:ns ## KEGG: BT_0272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 370 689 1056 1056 441 58.0 1e-122 MPNINLDPVTTTTWNFGYDMSIFNNRLSANIDAYYRQVDNQLSDIELPNHAGFEKVRSTE VSLVNYGLEVSLRGKPLSVQSKWNLLIGLNFALNRDVITKLPNEARQILNSDAWVANRLG GNTTSMLLYINKGVYATDEDVPVDPATGKRLRLGGKNTDEAYFKAGDPIWVDVNGDYVID 
DKDRVVAANARPKVTGGLFFNLSYKEFSMHVNTSFVFKRDIINSVLAKSFKAYDDPVRKS IESLRKDASLSPIEKYNFWTENNRYNAIYPNPYNYHHNKVIDPFREAQTLFLEEGSYFKL NTVSLSYRFPKRWLDFFRVRGVTLKASVNNIWTFSNYSGISPESVNGLGRDTSGGYPNSR TWTMGVVLSL Prediction of potential genes in microbial genomes Time: Fri May 13 07:14:59 2011 Seq name: gi|225935369|gb|ACGA01000023.1| Bacteroides sp. D2 cont1.23, whole genome shotgun sequence Length of sequence - 212783 bp Number of predicted genes - 152, with homology - 149 Number of transcription units - 63, operones - 40 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 2086 1713 ## BT_0272 hypothetical protein 2 1 Op 2 . - CDS 2106 - 3176 949 ## BT_0271 hypothetical protein 3 1 Op 3 . - CDS 3166 - 4032 543 ## BT_0270 hypothetical protein 4 1 Op 4 . - CDS 4045 - 5658 1364 ## BT_0269 hypothetical protein 5 1 Op 5 . - CDS 5671 - 8646 2354 ## BT_0268 hypothetical protein - Prom 8668 - 8727 5.2 6 2 Op 1 . - CDS 8843 - 11692 1413 ## BT_3422 hypothetical protein 7 2 Op 2 . - CDS 11702 - 12988 568 ## ACL_1119 hypothetical protein - Prom 13087 - 13146 5.8 - Term 13188 - 13242 13.2 8 3 Tu 1 . - CDS 13279 - 17364 2276 ## COG0642 Signal transduction histidine kinase - Prom 17399 - 17458 9.2 9 4 Op 1 . + CDS 17653 - 19113 928 ## BT_0266 hypothetical protein 10 4 Op 2 . + CDS 19161 - 20642 849 ## BT_0265 hypothetical protein 11 4 Op 3 . + CDS 20685 - 22442 1053 ## BT_0264 hypothetical protein + Term 22462 - 22514 13.5 - Term 22448 - 22502 15.3 12 5 Op 1 . - CDS 22529 - 24613 1482 ## BVU_2155 hypothetical protein 13 5 Op 2 . - CDS 24640 - 26055 673 ## BT_2959 hypothetical protein - Prom 26283 - 26342 9.4 + Prom 26297 - 26356 9.0 14 6 Op 1 . + CDS 26422 - 26691 356 ## BT_0261 hypothetical protein 15 6 Op 2 . + CDS 26699 - 27253 412 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Prom 27259 - 27318 5.4 16 7 Op 1 . + CDS 27341 - 28117 589 ## COG0388 Predicted amidohydrolase + Prom 28156 - 28215 4.3 17 7 Op 2 . 
+ CDS 28266 - 30257 1792 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 30294 - 30348 9.6 + Prom 30268 - 30327 5.2 18 8 Op 1 . + CDS 30452 - 33361 2235 ## BT_0257 xanthan lyase 19 8 Op 2 . + CDS 33441 - 33758 372 ## BT_0256 hypothetical protein 20 8 Op 3 . + CDS 33758 - 33982 158 ## BT_0255 hypothetical protein + Term 34016 - 34088 7.5 + Prom 34000 - 34059 4.0 21 9 Tu 1 . + CDS 34089 - 35948 1495 ## COG1032 Fe-S oxidoreductase + Term 36125 - 36164 -0.2 22 10 Op 1 . - CDS 35953 - 36249 156 ## gi|160882453|ref|ZP_02063456.1| hypothetical protein BACOVA_00404 23 10 Op 2 . - CDS 36237 - 36449 264 ## gi|160882452|ref|ZP_02063455.1| hypothetical protein BACOVA_00403 - Prom 36473 - 36532 4.3 24 11 Tu 1 . - CDS 36560 - 39922 3515 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) - Prom 39956 - 40015 4.6 + Prom 39942 - 40001 5.7 25 12 Op 1 . + CDS 40109 - 40852 627 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 26 12 Op 2 . + CDS 40849 - 42186 1168 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 27 12 Op 3 . + CDS 42212 - 43120 679 ## COG1410 Methionine synthase I, cobalamin-binding domain 28 12 Op 4 . + CDS 43189 - 43701 548 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 29 12 Op 5 . + CDS 43707 - 44282 536 ## BT_0247 hypothetical protein + Term 44312 - 44375 10.0 30 13 Op 1 28/0.000 - CDS 44328 - 47195 2763 ## COG0419 ATPase involved in DNA repair 31 13 Op 2 . - CDS 47192 - 48445 976 ## COG0420 DNA repair exonuclease - Prom 48491 - 48550 5.4 + Prom 48430 - 48489 5.2 32 14 Tu 1 . + CDS 48566 - 49324 648 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Term 49447 - 49500 -0.5 - Term 49325 - 49379 9.0 33 15 Op 1 . - CDS 49423 - 49899 407 ## BT_0242 putative polysaccharide deacetylase - Prom 49979 - 50038 2.9 34 15 Op 2 . - CDS 50065 - 51849 1244 ## BT_0240 hypothetical protein 35 15 Op 3 . 
- CDS 51862 - 52596 516 ## BT_0239 hypothetical protein - Prom 52726 - 52785 4.0 + Prom 52671 - 52730 7.1 36 16 Op 1 . + CDS 52754 - 53998 962 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 37 16 Op 2 . + CDS 54031 - 56190 2080 ## BT_0237 hypothetical protein 38 16 Op 3 . + CDS 56191 - 58347 2050 ## BT_0236 hypothetical protein + Term 58363 - 58431 3.2 - Term 58351 - 58418 3.9 39 17 Op 1 . - CDS 58434 - 58616 210 ## BF3012 hypothetical protein 40 17 Op 2 . - CDS 58613 - 59740 390 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase - Prom 59835 - 59894 4.4 + Prom 59691 - 59750 9.6 41 18 Op 1 4/0.100 + CDS 59909 - 60208 299 ## COG0526 Thiol-disulfide isomerase and thioredoxins 42 18 Op 2 . + CDS 60226 - 60690 444 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 60746 - 60792 15.2 - Term 60732 - 60778 15.2 43 19 Op 1 1/0.300 - CDS 60800 - 61435 506 ## COG0778 Nitroreductase - Prom 61457 - 61516 6.1 - Term 61466 - 61506 7.1 44 19 Op 2 . - CDS 61530 - 62090 677 ## COG1592 Rubrerythrin 45 19 Op 3 . - CDS 62113 - 62541 251 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 62570 - 62629 8.6 - Term 62663 - 62717 12.2 46 20 Op 1 . - CDS 62745 - 63245 405 ## BT_0214 hypothetical protein 47 20 Op 2 . - CDS 63261 - 64019 262 ## BT_0213 hypothetical protein 48 20 Op 3 . - CDS 64028 - 66097 1784 ## COG1404 Subtilisin-like serine proteases 49 20 Op 4 . - CDS 66108 - 67817 1204 ## BT_0211 hypothetical protein 50 20 Op 5 . - CDS 67848 - 70475 1858 ## COG4886 Leucine-rich repeat (LRR) protein 51 20 Op 6 . - CDS 70520 - 72082 1085 ## BT_0209 hypothetical protein 52 20 Op 7 . - CDS 72102 - 72992 769 ## BT_0208 hypothetical protein 53 20 Op 8 . - CDS 73018 - 74538 1442 ## BT_0207 hypothetical protein 54 20 Op 9 . - CDS 74551 - 77484 2491 ## BT_0206 hypothetical protein - Prom 77516 - 77575 3.8 55 21 Tu 1 . + CDS 78131 - 80056 1676 ## COG0171 NAD synthase 56 22 Op 1 . - CDS 79924 - 80172 95 ## 57 22 Op 2 . 
- CDS 80174 - 80734 333 ## COG2365 Protein tyrosine/serine phosphatase 58 22 Op 3 13/0.000 - CDS 80764 - 81888 1089 ## COG0131 Imidazoleglycerol-phosphate dehydratase 59 22 Op 4 19/0.000 - CDS 81892 - 82932 873 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 60 22 Op 5 18/0.000 - CDS 82949 - 84238 1280 ## COG0141 Histidinol dehydrogenase 61 22 Op 6 . - CDS 84280 - 85131 999 ## COG0040 ATP phosphoribosyltransferase - Prom 85151 - 85210 6.9 - Term 85176 - 85215 8.4 62 23 Op 1 . - CDS 85390 - 85881 341 ## BT_0199 hypothetical protein 63 23 Op 2 . - CDS 85955 - 87859 1426 ## BT_0198 hypothetical protein - Prom 88021 - 88080 9.2 64 24 Tu 1 . - CDS 88167 - 88823 417 ## BT_0197 hypothetical protein - Prom 88952 - 89011 4.9 + Prom 88653 - 88712 2.5 65 25 Tu 1 . + CDS 88929 - 89111 112 ## - Term 88940 - 88976 6.6 66 26 Op 1 6/0.100 - CDS 89030 - 90397 1129 ## COG2271 Sugar phosphate permease 67 26 Op 2 . - CDS 90410 - 91321 888 ## COG0584 Glycerophosphoryl diester phosphodiesterase 68 26 Op 3 . - CDS 91330 - 92388 920 ## BT_0194 hypothetical protein 69 26 Op 4 . - CDS 92390 - 93271 642 ## COG1082 Sugar phosphate isomerases/epimerases 70 26 Op 5 . - CDS 93278 - 95242 1743 ## COG1520 FOG: WD40-like repeat 71 26 Op 6 . - CDS 95270 - 96661 1295 ## BT_0191 hypothetical protein 72 26 Op 7 . - CDS 96685 - 100128 3167 ## BT_0190 hypothetical protein - Prom 100255 - 100314 1.5 73 27 Op 1 6/0.100 - CDS 100337 - 101323 824 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 101352 - 101411 3.8 - Term 101337 - 101388 6.3 74 27 Op 2 . - CDS 101413 - 102015 586 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 102045 - 102104 10.9 + Prom 102021 - 102080 6.8 75 28 Tu 1 . + CDS 102211 - 102915 564 ## COG1741 Pirin-related protein + Term 102944 - 102996 8.2 - Term 102939 - 102977 7.6 76 29 Op 1 . - CDS 103021 - 105078 1934 ## COG4232 Thiol:disulfide interchange protein 77 29 Op 2 . 
- CDS 105094 - 105393 373 ## BF3194 hypothetical protein - Prom 105478 - 105537 3.1 + Prom 105374 - 105433 3.8 78 30 Op 1 . + CDS 105460 - 106113 504 ## COG0572 Uridine kinase 79 30 Op 2 . + CDS 106131 - 107522 1082 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein + Term 107524 - 107565 6.3 80 31 Op 1 . - CDS 107551 - 109098 1415 ## COG0591 Na+/proline symporter 81 31 Op 2 . - CDS 109095 - 109220 74 ## gi|153807903|ref|ZP_01960571.1| hypothetical protein BACCAC_02189 82 31 Op 3 . - CDS 109292 - 109891 598 ## COG0778 Nitroreductase 83 31 Op 4 . - CDS 109903 - 112650 2555 ## COG1410 Methionine synthase I, cobalamin-binding domain 84 31 Op 5 . - CDS 112740 - 113192 445 ## COG0691 tmRNA-binding protein 85 31 Op 6 . - CDS 113202 - 113747 441 ## BT_0178 hypothetical protein 86 31 Op 7 . - CDS 113780 - 114544 456 ## BT_0177 hypothetical protein - Term 114579 - 114618 1.2 87 31 Op 8 . - CDS 114621 - 115121 544 ## BT_0176 hypothetical protein - Prom 115184 - 115243 5.1 88 32 Tu 1 . - CDS 115723 - 116127 278 ## BT_0175 hypothetical protein - Prom 116331 - 116390 6.3 + Prom 116215 - 116274 5.3 89 33 Op 1 . + CDS 116366 - 117067 921 ## COG0822 NifU homolog involved in Fe-S cluster formation 90 33 Op 2 . + CDS 117087 - 118097 1329 ## BF3207 hypothetical protein + Term 118112 - 118180 15.9 91 34 Tu 1 . + CDS 118439 - 118792 378 ## BT_0170 hypothetical protein + Prom 118814 - 118873 3.4 92 35 Tu 1 . + CDS 118921 - 120033 541 ## Sden_2170 helix-turn-helix, AraC type + Term 120064 - 120114 14.1 + Prom 120079 - 120138 7.4 93 36 Tu 1 . + CDS 120199 - 120585 212 ## BVU_3800 two-component system response regulator - Term 120374 - 120423 5.7 94 37 Tu 1 . - CDS 120610 - 121011 334 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 - Prom 121084 - 121143 5.9 + Prom 120945 - 121004 8.2 95 38 Op 1 7/0.000 + CDS 121111 - 126741 4538 ## COG2373 Large extracellular alpha-helical protein + Prom 126748 - 126807 1.9 96 38 Op 2 . 
+ CDS 126853 - 129276 1267 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC
97 38 Op 3 . + CDS 129303 - 130478 921 ## COG0477 Permeases of the major facilitator superfamily
98 38 Op 4 . + CDS 130525 - 130959 242 ## BT_0160 hypothetical protein
- Term 130721 - 130778 -0.4
99 39 Tu 1 . - CDS 130923 - 131159 61 ## gi|293369470|ref|ZP_06616053.1| hypothetical protein CUY_4296
- Prom 131292 - 131351 2.8
+ Prom 130990 - 131049 4.3
100 40 Op 1 . + CDS 131116 - 132543 927 ## COG1757 Na+/H+ antiporter
101 40 Op 2 . + CDS 132611 - 133201 702 ## BT_0157 hypothetical protein
+ Term 133227 - 133280 -0.7
+ Prom 133206 - 133265 5.5
102 40 Op 3 . + CDS 133292 - 133951 582 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
103 41 Tu 1 . - CDS 133992 - 135059 797 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
- Prom 135198 - 135257 4.9
+ Prom 134939 - 134998 3.0
104 42 Op 1 . + CDS 135243 - 136685 940 ## BT_0154 putative periplasmic protease
105 42 Op 2 . + CDS 136692 - 137486 402 ## BF1404 hypothetical protein
+ Term 137533 - 137573 -0.8
- Term 137769 - 137817 6.5
106 43 Op 1 . - CDS 137895 - 138713 787 ## COG0627 Predicted esterase
107 43 Op 2 . - CDS 138739 - 139173 312 ## BT_0151 hypothetical protein
- Prom 139236 - 139295 6.3
108 44 Op 1 . + CDS 139487 - 141733 1971 ## BT_0150 putative ferric aerobactin receptor
109 44 Op 2 . + CDS 141766 - 142548 1070 ## COG0501 Zn-dependent protease with chaperone function
+ Term 142578 - 142629 4.4
+ Prom 142654 - 142713 6.1
110 45 Op 1 . + CDS 142734 - 144026 1281 ## COG1253 Hemolysins and related proteins containing CBS domains
111 45 Op 2 . + CDS 144084 - 144557 515 ## BT_0147 hypothetical protein
+ Term 144588 - 144633 13.1
- Term 144571 - 144624 18.3
112 46 Op 1 . - CDS 144648 - 145205 309 ## BF3831 hypothetical protein
113 46 Op 2 . - CDS 145270 - 146490 1063 ## BT_0146 unsaturated glucuronyl hydrolase
114 46 Op 3 . - CDS 146579 - 148228 1442 ## COG3507 Beta-xylosidase
- Prom 148279 - 148338 4.2
- Term 148365 - 148410 11.5
115 47 Op 1 . - CDS 148436 - 149812 1007 ## BT_0143 putative transmembrane protein
116 47 Op 2 . - CDS 149822 - 151297 1104 ## COG5492 Bacterial surface proteins containing Ig-like domains
117 47 Op 3 . - CDS 151327 - 153063 1354 ## BT_0141 hypothetical protein
118 47 Op 4 . - CDS 153074 - 156304 2410 ## BT_0140 hypothetical protein
- Prom 156331 - 156390 2.1
119 48 Tu 1 . - CDS 156437 - 156952 655 ## BT_0139 RNA polymerase ECF-type sigma factor
- Prom 156979 - 157038 2.4
- Term 157358 - 157424 18.1
120 49 Op 1 . - CDS 157654 - 160479 2225 ## COG3292 Predicted periplasmic ligand-binding sensor domain
- Prom 160621 - 160680 6.3
- Term 160651 - 160686 -0.8
121 49 Op 2 . - CDS 160716 - 162983 1624 ## Csac_2721 heparinase II/III family protein
122 49 Op 3 . - CDS 163005 - 164930 1318 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
123 49 Op 4 . - CDS 164949 - 166880 1530 ## COG3533 Uncharacterized protein conserved in bacteria
- Prom 166936 - 166995 5.5
+ Prom 166943 - 167002 10.7
124 50 Op 1 . + CDS 167045 - 167827 634 ## BT_0135 hypothetical protein
125 50 Op 2 . + CDS 167805 - 168188 247 ## BT_0134 hypothetical protein
- Term 168153 - 168200 9.1
126 51 Tu 1 . - CDS 168214 - 169830 1323 ## BT_0136 hypothetical protein
- Prom 170039 - 170098 8.0
+ Prom 169956 - 170015 6.2
127 52 Tu 1 . + CDS 170067 - 170918 943 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase
128 53 Op 1 . - CDS 171100 - 173037 1767 ## BT_0132 alpha-glucosidase, putative
- Prom 173062 - 173121 4.2
129 53 Op 2 . - CDS 173123 - 173992 589 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 174081 - 174140 4.2
+ Prom 173846 - 173905 5.4
130 54 Op 1 3/0.100 + CDS 174125 - 174940 201 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
131 54 Op 2 . + CDS 174981 - 176168 1039 ## COG1312 D-mannonate dehydratase
+ Term 176224 - 176294 10.0
- Term 176287 - 176336 4.5
132 55 Op 1 . - CDS 176479 - 176610 74 ##
133 55 Op 2 . - CDS 176628 - 178034 1142 ## BVU_0121 glycoside hydrolase family protein
134 55 Op 3 . - CDS 178049 - 179344 1243 ## Dd586_1768 exopolysaccharide inner membrane protein
135 55 Op 4 . - CDS 179406 - 181445 1815 ## gi|260171100|ref|ZP_05757512.1| hypothetical protein BacD2_04464
136 55 Op 5 . - CDS 181495 - 182688 986 ## gi|260171101|ref|ZP_05757513.1| Ig-like, group 2
137 55 Op 6 . - CDS 182725 - 184521 1562 ## BVU_0125 hypothetical protein
138 55 Op 7 . - CDS 184555 - 187803 2988 ## BVU_0126 hypothetical protein
139 55 Op 8 . - CDS 187853 - 188383 404 ## BT_0139 RNA polymerase ECF-type sigma factor
- Prom 188403 - 188462 5.8
- Term 188532 - 188574 4.2
140 56 Tu 1 . - CDS 188621 - 188995 358 ## BT_0128 hypothetical protein
- Prom 189015 - 189074 5.6
- Term 189035 - 189089 -0.9
141 57 Tu 1 . - CDS 189110 - 193192 2910 ## COG0642 Signal transduction histidine kinase
142 58 Tu 1 . - CDS 193246 - 195159 1406 ## BT_0127 putative transmembrane protein
- Term 195439 - 195500 12.0
143 59 Tu 1 . - CDS 195607 - 197019 1027 ## BVU_0121 glycoside hydrolase family protein
- Prom 197040 - 197099 4.3
- Term 197135 - 197181 9.0
144 60 Op 1 . - CDS 197356 - 198378 852 ## COG3746 Phosphate-selective porin
- Prom 198445 - 198504 2.8
145 60 Op 2 . - CDS 198515 - 199885 1161 ## BF1089 hypothetical protein
- Prom 199937 - 199996 5.3
- Term 200024 - 200083 15.2
146 61 Op 1 . - CDS 200105 - 202471 2091 ## BF1763 putative outer membrane protein
147 61 Op 2 . - CDS 202496 - 204529 1929 ## BF1827 hypothetical protein
148 61 Op 3 .
- CDS 204556 - 205113 380 ## BF1765 hypothetical protein - Prom 205169 - 205228 6.7 - Term 205152 - 205203 0.3 149 62 Tu 1 . - CDS 205342 - 208095 2011 ## BT_0126 hypothetical protein - Prom 208212 - 208271 6.7 + Prom 208057 - 208116 5.0 150 63 Op 1 1/0.300 + CDS 208258 - 210165 1648 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 151 63 Op 2 1/0.300 + CDS 210179 - 211945 1436 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 152 63 Op 3 . + CDS 211965 - 212441 588 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit + Term 212471 - 212510 -0.1 - TRNA 212530 - 212617 55.6 # Ser GGA 0 0 Predicted protein(s) >gi|225935369|gb|ACGA01000023.1| GENE 1 1 - 2086 1713 695 aa, chain - ## HITS:1 COG:no KEGG:BT_0272 NR:ns ## KEGG: BT_0272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 695 6 683 1056 599 49.0 1e-169 MKNRYILLLFTLFFVVTGSVAQDGAVIVSGTVYDKAGNNETFPGVNIICRDAKTQKRIRG TASDLEGGFSIKVPVGSELVFTYVSYKPTIYKVKKSVSGISIFLEENINEVQETVIVGQR RVSKANLTASATVIDAKELADTPVPNIMNLLQGRVAGMDIQMNNGLPGASGTYNIRGVSD ISLVGNNDSGWDLASSAPLFVVDGIPQTDVEDYNAEGLISGSGVSPISNIPVEDIANIQI LKDAAATSLYGSAGAYGVILIETKKGDSPKPRVTYSGNFTISTPPSLREVAIGNAERNLL KWQILNNDTSQLYHGYQDIMFMPAVSDSLNPYFNNNTDWQDQFYRVTYNQSHNISFSGGD NLFNYKVNGNYYTEKGIVKNTDFNRYALTGNMNYTSPNSKFTIGVDMKVGFTDNSTGSGN AVSQSGVASGASASSLLPPPSLYSASVAALQVFGVENSTVKSNYSATVNLGYALPFDIKW RGTFNYSYNSTDQEKFTPAILSDSKYSAIAYNYNSNESKVYIDTHVSRTFDLKYVMLGLT TGIRYNSRTSTGNSITYTGLPNDYIIGPVGHGKSEGSATVSQNQSTFSFMFAPNFKIKTK DSFKAGGDKYIFDPTISPELSSVYGTRTKWNINPSLGFRWNIGYESFMDRFTWLDNMSFR LTWGQVVKYKATRYDVFGTYDIDPGNTYNGESYIP >gi|225935369|gb|ACGA01000023.1| GENE 2 2106 - 3176 949 356 aa, chain - ## HITS:1 COG:no KEGG:BT_0271 NR:ns ## KEGG: BT_0271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 61 349 2 285 291 279 48.0 1e-73 MKNRREICDKLGKSGLLCAMLLVNAGCVDKYLPDEKDAFDRNVAYTRTTFDPVMGHIVFY 
TDICNVANSTLPLTFEITDMMHNDGSPAPELEEFYPVRVWKKPYLGTEKSIDEINAKRGI EYRRLFDVRKNSGEIVMWGEANSGILRCQPDSGYVFNMNISNSGGYKVVKRMRLMPKREV DFEPSIYDSQTGLAIAEYVSPETSRMRYEQNSSSFSYTIEPEDIHVYFRENKDKQDGATS LSISFYDPSWNVIDPRRFNETNWGGLFQAGFLKGKIGPKEVVYDMAYPLPLFNGQTKYTD SKGEKASVKFATSWFSKYGYRNTGYFVFDFAIYKEAHWEMLIHFAKGMPQLGEIKD >gi|225935369|gb|ACGA01000023.1| GENE 3 3166 - 4032 543 288 aa, chain - ## HITS:1 COG:no KEGG:BT_0270 NR:ns ## KEGG: BT_0270 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 287 1 261 264 147 30.0 4e-34 MNKKLLAIACSWGALLLSSCMDTSVVESDMSEIPSQGITYYDGTIYEYLENGDHSLGVTY DSLLYLLDCSGEGSSVPLKFQELKTCLQDEEGEYTLMAVPDSCIRLALKMLNNYRRLNNL VIDAKDIPADAPESEKYAAGELTLEKLLNYRKEIEHTDAQTQKPKIDIYEYKAPLDTMLC RYMTAGLYDTEKLSRVTSAEGTIIQGLYSYRMNLSYRRLPASGYMGAGPQDITFYDMRNT LNMTYWESTKVMWTDVQTKNGVIHVLFPQHEFGFGNFIHYFRNVGHEE >gi|225935369|gb|ACGA01000023.1| GENE 4 4045 - 5658 1364 537 aa, chain - ## HITS:1 COG:no KEGG:BT_0269 NR:ns ## KEGG: BT_0269 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 536 2 512 512 477 46.0 1e-133 MRYMNQILCFKRMIKPLALAAVLFYSSCSDLLENDNAFFHVSTQEQQWNSLTDTRSALFG IYGLMRTALGENNGFWAAGDLRLGDFTVRMRDDLQAIRDNRLDAPYDNIRQISNWNRFYK VVNAASVFIENAGKVLEKDNAYSETNYIYDMSQARALRALAYFYMVQMWGDVPLITQSYD NGTFPEVSRTDKEIVLNYVEKELLAVSQLLPELLGSSSDKYYDGDANTWRGLLINRYSVY AMLAHVTAWKGNYIDAEAYSGQILNKLPSFIKNGTTTPYTTTENLVSATGMFSSKYSNDF SVTRLVAFGYSYVNSDKGDVNETGTDGHLESWTLAEPIIRKQLPDIYLSKDTLEKVFMNK FTTDNRYGVDALSNPVQYFDGYITGINYQYPLFTKMKVVRDGEDKTNDLGVFSSYIVWSR LEDMLLMHAEALAVLNRPEDALVDLNTLRGVRNLRSLSYAKDLQGNVKNLLKEIFQERRR ELMGEGHYWFDRVREARLVGDDRTMVELVNNGGIYWPVAGEVLRHNTAITQNEYWKR >gi|225935369|gb|ACGA01000023.1| GENE 5 5671 - 8646 2354 991 aa, chain - ## HITS:1 COG:no KEGG:BT_0268 NR:ns ## KEGG: BT_0268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 991 10 977 977 892 48.0 0 MKNAIIRTFAIVALLGFGSEMKAVRSSEAFYFSNDSISPDENIGQLSEREDSIRKSMEEA 
VIYKLKDTLDLWNLSRSPYISVQQYLKGSVPGVYVQETTGEPGSMQNMLFRGVSTPIFSK GDVASTQPVVFLNGIPLLMNDAFVYNVKSTEVNPIGTATNILSGLNLDNVASIEVVKDAA QLAKLGPLAANGAILINLKDGFYGGKNMFIRANGGVAIPPSNVKMTNAANEHAFRMQFAD YCATGAQRAAYLEKMPAWMKDVRDMNFFGEPEWADDYYSMSPLYNFSASMGSGGSSANYI FMMGYTGSNGVADETSFDKLTASFALNMKLIDQLGMSCLINASRLSRNGNRNLRDRYAEI EYLPDLTTPLSPVASVYHSYLDLYEEYKKNDNLNNLLNGYLAMNYNWNGLFVDTRLLLDY NTNVRHVFWPMNLMESVNYVSNYSGYNRRLIWQSSADYKFNLGHKHFFNLGAQAIIMKSV QHYNYTQGYDGKDDTKPTTSGGGFLYMKRYVDKMTHNMVSTLFSVDYKYKNLLDAQVVLR GDGSSNVQKDSRWLFTPAVSAGLNLKNLFLVDTDWVSDWSLRGSWARIGRYQDNNRFAAG PQYTGEELTGFGQPVMSSYYGYASVARPYNTGWVGYGLGWPYSDKWNVGLNSVFFNNRLS LSIEYYNNTDCDLITAIPVKQEYGYKYKYANGMKVRNSGVEITLSGKPFNAPGKFSWDAS FSLAYNKNELLQLPENLNELVVGNRKLKVGHSIDQFWLYQNEGVYAGDAEVPEVNGKKLS MSSIPFAKNDPKWKDQNGDNVINDDDRVLKGHTLPPVTGNFINNFKFGRFDLGVNLFFAL GHDAINYRSSQRYNFLALENTPSLESLREIFFWQTTNDKNSYPLYNQMSGLMPYRAEQDL FLEKLSYLKLRSLTLGYTLPLGKKAYKGEGKKKEMSKKKLTDIYLYVTGSNLFTLTGFSG DDPELIDVDGYYRGYGQPLSRSVILGVKFNF >gi|225935369|gb|ACGA01000023.1| GENE 6 8843 - 11692 1413 949 aa, chain - ## HITS:1 COG:no KEGG:BT_3422 NR:ns ## KEGG: BT_3422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 688 23 683 905 351 33.0 1e-94 MKKVYILMILFSFVSGIANCYAQIGKFDKDSLMNRFVYQLGVFPQEKVYLHTDKGTYMAG DTLWFRSYIVDATLHKPLQNKYIFVELVNPLDSIVSQVLVRPGENIYQGYLPLSRELVDG DYTLRAYSRYMLKNGSECIFRRPIRIVTVAWNKVGMRALSRPAGKSESLSLSFTTGGKPM QLQRAEMSLKDKSDIALKLSEAKDGFNVEWGKNDWKGNASWLLSVRDSENNVYRRFLPMA TRNEEYDVTFYPEGGYLLNGLECRVAFKAMGRTGNAATISLDVVDEAGEIITSTRTLHEG MGTFMLTPEIGKTYLVKCTDEFGRNREFKLPPVNAKALHGLRVDALRDNFRVSLLSVAES PSEPLYLVAHVRGAIIFSEEWKEPQKKYLLPKQYFPAGVVQFLLLNQNGQALSERLAFSD SYTPAICDLTVNGPITKKRESISVNASLQDINQRPLKGVYSVSVVDGKFASVDSCYNILS HLLLASELKGNIQSPGFYFKKESTSARSCLDLLMLTQGWRRYDLTAIIQGKYKIPVLEKH TEMAIQGRTLAAGGLLSKSNNEHLVTIAGTGNLKGFKRVTSTDKDGYFCFDSIAYVDGSG FHIDAMQLNAKRTEAIELFESDYSKDMPLYPQTLLEDDSVRSVQREDVEMITRLDNLHFL LQDVIVRAPMWGSRDYRMFTDREVVRYKDIRTLLKNQGLTISTLAEEPEETQLRVNADKT 
LAVRDTAMLSGDEYDAVESVSSRDSQVEDMIYYGDQRILLFVDDNYCKPDILVNWITPGD IESMVLVKDVDRFRANTLLRGTLKWSEKFYLDRGYDLCHAYCRIPLSREKIAVLNVTTKD GFDSRCLGWWSRYYLNTRQRNHRNTTFYPLGYQAPVEFYSPKYDTAAKKDNEIPDLRTTL YWNPKLVTDERGNTSFSFYTSDQPGNYFVCIEGISEDGELIHVVKVLMP >gi|225935369|gb|ACGA01000023.1| GENE 7 11702 - 12988 568 428 aa, chain - ## HITS:1 COG:no KEGG:ACL_1119 NR:ns ## KEGG: ACL_1119 # Name: not_defined # Def: hypothetical protein # Organism: A.laidlawii # Pathway: not_defined # 28 234 11 205 5552 93 30.0 1e-17 MEKRFFLLNICWAFISILNATQHRQAENPLVISTAEDLKAFAKSVNSGNSYQGKVVKLSA DIWLNDTVGWQKWNRQTKMKSWTPIGIPRAPFEGTFDGDGHFIAGLFAKTGSETFSQGLF GFLKRATVKNVHIRFSHFISYDYVGALAGYISYNTQIHNCSNEGTIEAERNFSGGLVGFS SGQNRIISCSNYGRVYGHRCVGGIVGYFEGGSVYNTFNRGEIIGRYEHVGGIIGEFSEPY QKTVKETGLKELPNDTVANCYNTGRIIARDVAGGIAGHVNLHPIEITTWKVVFANNYNTG QIRTTYPPVTDGLVGVYAYFAISETQVIPTIDRINRDGGPCFWSEESCRVKDIKKPRFES EKIRSESWNKIMYGVMKIPESFRYFTDNEMKQQSFVDLLNKWVDKKDIFSRWSLDKEGVN KGFPVFCN >gi|225935369|gb|ACGA01000023.1| GENE 8 13279 - 17364 2276 1361 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 828 1058 1 231 294 133 39.0 2e-30 MKINRTTIIFVLFIMNIIGPITAQEFSVEILPLSKQLPSNSVQRVFQDREGFMWFGTREG LSRYDGYRILTFRSGKTTPDLLTDNQITCITDSWERILIGTKKGLNILNKKTYEISHIDN EELKDQEIRSILFDSKGYIWVGTYVALYRCSSDFSSCKRYDSSLPVTSVNSIYEDADKNI WVTFWRKGIFRYDRIKDTFVKYPALGKENNPFSVFQDDKKQHWIGTWGEGLYKFYPEESD EQVYMPVESVKEGELPENGTFFSIQQDKKYGYLWLVSSRGLYVVRKRVDNLVETIDISDI SSKLNNIFSEICLDKSGNLWIASFNEGVAYINLDKPIIQNYPMPSIKKTTGLTTNIQAIY NDNDGDIWINQNRLGLGIYKKDSNKIIWYRDIPDLRGLSGIETIGCIGYSSSNDQILVGP SYQPFIYILKKENGQVKLVSPINLQQYIKGASNNPQFFYEDKKLNVWVITSLGLFVKPAG DYNTLKETGFLQREITGLAEDNLGHIWVSTRRNGVFCLTVSDDFQIRKENVVKFDVESGL LISDNIEDLCTDNEGRVWMGSQEGYVFLYNQKSKTVEDYSDVFTTLTEGVQDMMMDKTGH LWISTNKRIIEYDPKTGGQINYQAGQDIVVNSFTKHSCFESQAGEMFYGGNRGIAVFMPY KRLADKPEKIRTHIVDVKMGDESLLTGNLNERFNLLKRTLKLHAEDQNIEIDFSSLNYSF PTKIQYAYKMEGVDKDWVYIKDGRQFAYYNRLPKGKHTFCVKATDINGLWSSLVTKVQVD KEPAFYETWWAYVIYVMLILLVCYSFYYRTKRRMQLRHELSVAQIEKEKSEELVQVKLRY FTNISHDLLTPLTIVTCLIDDAEITYKNKIPQFDMIRTNVNRLKRLLQQILDFRKVESGN MKLKVTSGDIVSLIRDVCDSNFMPLIQKKKLTFTFESPEETIQAYFDVDKIDKVVFNLLS NAYKYTGEGGEIKVALSVFLQNGHTYLSIIVSDTGKGIASEDIDKIFTRFYTNQHWVSSE TNGIGLSLTKELLELHHGTISVESEVGRGSSFTVIIPIDKESYTEAEINVESSQELKRES GIGTVNAENNILDWKQLEEGDINTTISDIRLLLVEDNEELLYLMRRILSKHYYVLTAKNG IEALEVMKEYEADIIVSDVMMPEMDGLELCRVVKGNLDTSHIPIILLTAKNSAEDRIECY NAGANAYISKPFELKVLEARIDNFLAEKKSKQEEFRSDAEDINFNLLDATDIDKEFLKKV TDVIQENLSSSTFDVVQLADALAMSKSSLYRKTKAIIGLSPVEFIRNVRLKQGVKMLKNK SISVSEVAYICGFSNPKYFSTCFKEEFGITPKEFQKSDISQ >gi|225935369|gb|ACGA01000023.1| GENE 9 17653 - 19113 928 486 aa, chain + ## HITS:1 COG:no KEGG:BT_0266 NR:ns ## KEGG: BT_0266 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 483 1 484 486 748 77.0 0 MKRKYVKIMFLSMTLLTSSITTAYATIPEDEKKEEAPKVTEDAHLPLMIINGTNKSEHFF INSESLTNKEIKITAPNGFTVSPSVIPANSGKQKVTVTLNSSKILTEGKIVLRNGDYRSY LKVKGYGTALPAKAISKSPIYKGGNDSDFNKTFTPSSKGYTLEFRVKTDEPEKCFYPYFV NEKGYGFKAYITSTEVGLYNAYKKNISNPATTGKEGGLGKFYNNDGEAHTYRIAVTPDNR AFIYRDGMPIDSVRIIDYAPQPYFASGIGEAVENLLKNPDFEGEFDTNPESKLVTAVEGW 
DVVIGDRWNSEQYIKPEEIDNMQDIDNHIFEIKPYKWAAGWSDGILMQVVDVAPNETYTL SALLKGGIAKKAGTLTGKMIIEEAQNKEKKVVTEIASDNWEMYSMDYTTSADCKQLRISF TVGRGKWGNDIGAIRVDNAKLTGVSCNYTPKFGFVDNTADIEYFTIDESGAYAPMQPTIT INLPSE >gi|225935369|gb|ACGA01000023.1| GENE 10 19161 - 20642 849 493 aa, chain + ## HITS:1 COG:no KEGG:BT_0265 NR:ns ## KEGG: BT_0265 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 493 493 858 84.0 0 MRHILFIYFMFSVISAFSQNTQISPGVLWNDINGEQINAHGGCVVFENGFYYWFGEDRTG FVSNGVSCYQSKDLYNWKRLGLSLKTKGEPKEDMNDISHGRLFERPKVIYNPKTKKWVMW THWESGDGYGAARVCVATSDRIEGPYILYKTFRPNKNESRDQTLFVDTNGDAYHFCSTDM NTNMNVSLLRDDYLEPTPTETKILKGLKYEAPAIFKVGDYYYGLFSGCTGWAPNPGKTAY TTSILNEWTTGRNFAVDKLKQVTYNSQSCYVFKVNNKTNAYIYMGDRWNSKNVEKSHHVW LPISMRSGYPVVKWYNQWDLSIFDKMYRYKRAEKIINGNIYSLLEKNSDRLVSKPINGFS IADDNDTINLSLEFIKTDIPNGYKLKDIKTGKFLESLFGTLRLNPEKQNDAQCWIFNLLE DGYYQIQNAKDKKYITVSGSNTFDGTSLYLTEYSKKLMQDFAVYFDSDKYQYKEADIFAA TYRTNNLKLMGAQ >gi|225935369|gb|ACGA01000023.1| GENE 11 20685 - 22442 1053 585 aa, chain + ## HITS:1 COG:no KEGG:BT_0264 NR:ns ## KEGG: BT_0264 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 585 1 582 582 954 82.0 0 MNIKKQRIKIYLLFIFICLPNLSTVPSFAFQSREISMETVSSWAVTTTVFKLAGSNEVNN VKLNWMQRKDADLYKIYRDNNLIGEAKGNTYDDYNLMVNKTFTYHVEAFYNGQKVATSIS QQATTFIPTGESKVYDNLNGKYITKESSGNKPQGMKINDLYFSYKIENTEKTVDGQQLKG WLVTESYSKTGLNGSWSTPRELVFYPNVKMEGNAFRYNPKTGKVVLSSHYEDENGYTAAK IYLAQITPKGKLEVGTMERPLGHDSRDQSLFIDDDNTAYLLSATNTNSDINIYKLDASWT KPVELVNTICKGLHRETPAIIKKDGEYYFFSSKASGWYPSQTMYTSTTDLAGEWTPMREI GNNTTFDAQFNRISKVGTTYGVWSYHWGAQRKYKTPDGNFPRISIAAFNKGYASMDYYRY LEFNDDYGIIPVQNGKNLTLNVPVTSAVSGAKGVKADCITDGASLDSSTYFQKSSNATIG TPYMFTIDMQKKARISEINLSTRLVNGSEAAYKYTIEGSPDGKSYKMLVDGRTNWQVGFL ILNVEDPTAYRYLRLRIYGVVNIHKGNSAMWADGIYEFAAFGTPE >gi|225935369|gb|ACGA01000023.1| GENE 12 22529 - 24613 1482 694 aa, chain - ## HITS:1 COG:no KEGG:BVU_2155 NR:ns ## KEGG: BVU_2155 # Name: not_defined # Def: hypothetical protein # Organism: 
B.vulgatus # Pathway: not_defined # 26 693 34 704 704 833 57.0 0 MEKVKKMLPKFCGFLFFGMMGTFVSAQKISYQVNSQTGALQTLSIANDPQKMNWLVATDG SQYRWVKENYGWGLGYATQVKGNRKEKVSWEVPVEIKQEGNEVVYMAGDIRICVKRKMVG DELVEEYTFRNQGEDEVVLVDIGVYTPFNDNYPGSRTCINARTNTHIWEGENAAYINALR MGGYAPHLGLVLTKGAVKSYEIWERGRDKGNSHTRGIITLNLPDLQLMPGKEYSISWCLF AHKGIDDFRQKLLEKGSVFVSCNKYVFEKGEKAFIELAAENSIQKCVLKKQDIHIPMKKR GNLWIAEVKMEQLGENRFDIFYGEGKQTHVTCLVIDNVDSLIKKRVDFICDHQQMKESNT RKDAFMVYDNEKNEIYLNNTRNCNPVDRDEGAERVGMGVLLAKYYQLHPDDHIKTALLKY AKFLRNRLQESDYKTFSSVDRKGRNRAYNYAWVADFYFQMYKITGDKQYAVDGYMTLRSM FRQFGHGFYAIGIPVHLGLQTLKAADMDVEYETLKNDYIQVGDTFVKNGLNYPASEVNYE QAIVAPSIIFLLQLYLETGIQKYLDGAKQQMPALEAFNGNQPSYHLNEIAIRHWDGYWFG KREMWGDTFPHYWSTLTGAAFYLYAQCVGDNTYKRRAENIVRNNLCLFFEDGKASCAYIY PNKVNGVKAGFYDPYANDQDWALVYYLLVNKDIY >gi|225935369|gb|ACGA01000023.1| GENE 13 24640 - 26055 673 471 aa, chain - ## HITS:1 COG:no KEGG:BT_2959 NR:ns ## KEGG: BT_2959 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 468 10 458 460 620 64.0 1e-176 MNLSFLMEKRHSYFISMILLWTSFSLVSQSSCAQNQEKIICNGIPWFDDQGKIVNAHGAC IVEDAGRYYLFGEYRSDETNSFVGFSCYSSDDLSNWKFERIVLPIQKEGLLGPNRIGERV KVMKCPSTGEYVMYMHTDDTKYCDPCIGYATSKTINGEYIFQGPFKIGNEPIRMWNMGTF QDTDGTGYLLIHEGDIYRLSEDYHSAEKRLVKNMAPGGESPAMFKKEGIYYFLFSNKTSW EKNDNYYFTAPAVGGPWKKGGLFVPEGKLTYNSQTTFVFPLTQGKDTIPMFMGDRWSFPH QASAATYVWMPMQADKGKLSIPEYWQAWDIKTLSQVDALDNGQILSLRKTRSEAGWERRG EQLCSNLKNSVLNIPFKGTQAAIIGEANSHGGYAKVSVLNRKGKTLYSSLIDFYSKYPES AIRIVTPSFAKGKYILRIEVTGISPVWTDKTKARYGSDDCLVTLDKIVVFE >gi|225935369|gb|ACGA01000023.1| GENE 14 26422 - 26691 356 89 aa, chain + ## HITS:1 COG:no KEGG:BT_0261 NR:ns ## KEGG: BT_0261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 89 162 95.0 2e-39 MASRRELKKNVNYIAGELFTECLINSMFIPGTDKVKADELMAEVLKMQDEFVTRISHTEP GNVKGFYKKFRADFNAKVNEIIEAIGKLN >gi|225935369|gb|ACGA01000023.1| GENE 15 26699 - 27253 412 184 aa, chain + ## HITS:1 COG:CC1900 KEGG:ns NR:ns ## COG: CC1900 COG0204 
# Protein_GI_number: 16126143 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Caulobacter vibrioides # 11 177 16 182 196 125 40.0 5e-29 MKKAIYSFIYYRLLGWKTNVTVPNYDKCVICAAPHTTNLDLFIGKLFYGAIGRKTSFMMK KEWFFFPLGFFFKAVGGIPVDRSRKTSLVDQMVHNFAEYKKFNLAITPEGTRKANPNWKK GFYFIALKAQVPIVLIGIDYSKKTISATKAIMPSGDINKDMREIKLYFKDFKGKHPENFA LGEI >gi|225935369|gb|ACGA01000023.1| GENE 16 27341 - 28117 589 258 aa, chain + ## HITS:1 COG:STM0308 KEGG:ns NR:ns ## COG: STM0308 COG0388 # Protein_GI_number: 16763691 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 1 258 1 255 255 234 44.0 2e-61 MESIRISIIQTDIVWENKQENLRLLHEKLQSLRGITEIVVLPEMFSTGFSMQSKILAEPN SGETITTLKQWAAKFQLAICGSYIATENEQFYNRAFFLTPEGEEFYYDKRHLFRMGREAE HFSAGDKRLIIPYHGWNICLLVCYDLRFPVWSRNVGNEYDLLIYVANWPIPRRLVWDTLL RARALENQCYVCGVNRVGTDGYQLSYNGGSKVYSAFGEEIGSFPDGKEGITTVSVNLTAL HQFREQFPVWKDADEFHL >gi|225935369|gb|ACGA01000023.1| GENE 17 28266 - 30257 1792 663 aa, chain + ## HITS:1 COG:BS_nagB KEGG:ns NR:ns ## COG: BS_nagB COG0363 # Protein_GI_number: 16080555 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 42 279 7 238 242 171 39.0 4e-42 MKTNLSSQISLHRVSPRYYRPENAFEKSVLTRLEKIPTDIYESVEEGANYIAREIAQTIR EKQKAGRFCVLALPGGDSPSHVYTELIRMHKEEGLSFRNVIVFNMYEYYPLSPDAINSNF NALKSMLLDHIDIDKQNIFTPDGSIAKDTIFEYCRLYEQRIESFGGIDIALLGIGRVGNI AFNEPGSRLNSTTRLILLDNASRNEASKIFGTLDNTPISSITMGVATILGAKKVYLLAWG ENKAAMIKECVEGPITDTIPASYLQTHNNAHVALDLSAAMNLTRIQRPWLVTSCEWNDKL IRSAIVWLCQLTGKPILKLTNKDYNENGLSELLALYGSAYNVNIKIFNDLQHTITGWPGG KPNADDTYRPERAKPYPKRVVIFSPHPDDDVISMGGTLRRLVEQKHEVHVAYETSGNIAV GDEEVVRFMHFINGFNQLFNNSEDLVINEKYIEIRNFLKEKKDGDMDSRDILTIKGLIRR GEARTACTYNNIPLERCHFLDLPFYETGKIQKNPISEADVEIVRNLLREIKPHQIFVAGD LADPHGTHRVCTDAVFAAVDLEKEEGAKWLKDCRIWMYRGAWAEWEIENIEMAVPISPEE 
LRAKRNSILKHQSQMESAPFLGNDERLFWQRSEDRNRGTATLYDQLGLASYEAMEAFVEY VPL >gi|225935369|gb|ACGA01000023.1| GENE 18 30452 - 33361 2235 969 aa, chain + ## HITS:1 COG:no KEGG:BT_0257 NR:ns ## KEGG: BT_0257 # Name: not_defined # Def: xanthan lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 969 1 970 970 1757 87.0 0 MKKIIVFFLGLAFCATFLFAQDIERNVKERLTDYFNQYTATAKISTPKLNSFDINYDRKT IAIYASESFAYQPFRPETVENIYNQVKELLPGPVHYYQLTIYADGKPIEDLVPNFYRNKK KDKERLSLNVDYRGAPWVKNTSRPNEISRGLQDRHIAIWQSHGNYFKNDKNEWGWQRPRL FCTTEDMFTQSFVLPYVIPMLENAGAIVYTPRERDTQKNEIIVDNDTPNASLYLEMGGKK INWANAPVRGFAQKKTIYKDGENPFTDGTCRFIPTERKKKKNKDQVFAEWVPTLPATGKY AVYVSYQTLPNSVSDAKYLVFHNGGVTEFKVNQKIGGGTWVYLGTFEFDKGSNDYGMVVL SNESSEHGVVCADAVRFGGGMGNIARGGKISGLPRYLEGARYSAQWAGMPYEVYAGRKGE NDYTDDINTRSNVINYLSGSSVYNPQQSGLGVPLEMTMALHSDAGCSKTDELIGSLGIYT TDFNNGKLNAGTDRYASRDLADILLTQIQKDIYSSYSLPWTRRSMWNRNYSETRLPATPS TIIELLSHQNFADMQLGHDPNFKFTVGRAIYKGILQFITNQHDKEYIVQPLPVSNFAIQF GKKKNILELSWKGEDDPQEPTARPREYIVYTRIGYGGFDNGTLVSKTSHTVKIEPGLVYS FKVTAVNRGGESFPSEILSAYKAKREQGKVIIINGFDRISGPAVVNTSDKAGFDLMQDPG VPYISNISFCGAQTGFDRTQAGKEGKGSLGHSGNELEGMKIAGNTFDYPFIHGKAIQAAG KYSFVSCSDEAVENGLVTLEDYPVVDYILGLEKEDPANKAYYKTFSSAMQRIMTSYCQAG GNLFVSGAYVGSDMSGTQGNREFTEKILKYGYQGSLTDKSSNQIKGLGRTITIPRLPNES SYAVPAADCIVPVDTAFPVFTYAPGNQSAGIAYKGNYRTFVLGFPFESIQSEADRATIMA GILGFFTQK >gi|225935369|gb|ACGA01000023.1| GENE 19 33441 - 33758 372 105 aa, chain + ## HITS:1 COG:no KEGG:BT_0256 NR:ns ## KEGG: BT_0256 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 1 105 105 176 92.0 3e-43 MYTIQANPSGTRSIEVSNENLRTIEKYALFRHLIDSTGIIDEAVLDKLKLNIRSLIASQE EDSKDLLDLCIDVIYHNNMKAFGLQQLIKLYLTWLSSPEAEEEEE >gi|225935369|gb|ACGA01000023.1| GENE 20 33758 - 33982 158 74 aa, chain + ## HITS:1 COG:no KEGG:BT_0255 NR:ns ## KEGG: BT_0255 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 129 89.0 4e-29 MITVDTCGITAYSPLIPAIKAMCTASPGETIEIIMNHADAFQDLKEYLSEQGIGFREIYD 
GEQMTLQFTINGKF >gi|225935369|gb|ACGA01000023.1| GENE 21 34089 - 35948 1495 619 aa, chain + ## HITS:1 COG:PA4928 KEGG:ns NR:ns ## COG: PA4928 COG1032 # Protein_GI_number: 15600121 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Pseudomonas aeruginosa # 9 611 23 667 747 514 40.0 1e-145 MKEYRLTDWLPTTKKEVELRGWNELDVVLFSADAYVDHPSFGAAVIGRILEAEGLKIAIV PQPNWRDDLRDFKKLGRPRLFFGISGGCMDSMVNKYTANKRLRSEDAYTPDGRPDMRPDY PSTVYSQILKKLFPDVPVVIGGIEASLRRLSHYDYWQDKVQKSILCDSGADLLIYGMGEK PLADLVKNMKSLLTTEEPVLTSSKFRTIIGSVPQTAYLCRATEWTSAEDDLQLYSHEECL ADKKKQASNFRHIEEESNKYSASRITQAVGNKIVVVNPPYPPMSQKDLDHSFDLPYTRLP HPKYKGKRIPAYDMIKFSINIHRGCFGGCAFCTISAHQGKFIVSRSKESILKEVKEVIQL PDFKGYLSDLGGPSANMYQMKGKDEAICKKCKRPSCIHPKVCPNLNTDHRPLLDIYHAVD TLPGIKKSFIGSGVRYDLLLHQSKDETTNRSTAEYTRELIVNHVSGRLKVAPEHTSDRVL SIMRKPSFEQFETFKKIFDRINREENLRQQLIPYFISSHPGCNEEDMAELAVITKRLDFH LEQVQDFTPTPMTVATEAWYSGFHPYTLEPIFSAKTQREKLAQRQFFFWYKPEERKNIIN ELRRIGRSDLIDKLYGKKR >gi|225935369|gb|ACGA01000023.1| GENE 22 35953 - 36249 156 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882453|ref|ZP_02063456.1| ## NR: gi|160882453|ref|ZP_02063456.1| hypothetical protein BACOVA_00404 [Bacteroides ovatus ATCC 8483] # 1 98 1 98 98 171 100.0 1e-41 MARLKIVWTETATIVLRRILTFYRVRNGNSKYSRSIYTMINDVLKLVAKYPYIYKATSVP NIRVFHCDYFKVYYRVLEEYILVEAIFDTRQDPDKAPF >gi|225935369|gb|ACGA01000023.1| GENE 23 36237 - 36449 264 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882452|ref|ZP_02063455.1| ## NR: gi|160882452|ref|ZP_02063455.1| hypothetical protein BACOVA_00403 [Bacteroides ovatus ATCC 8483] # 1 70 1 70 70 92 100.0 9e-18 MAAKYEPQENEMPEQVNEPIAAYQRTASDMSLYIPTEYEREIIMRSEKDYEEGRLYTQEE VDKQVKQWLD >gi|225935369|gb|ACGA01000023.1| GENE 24 36560 - 39922 3515 1120 aa, chain - ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: 
Bacillus subtilis # 34 1048 31 1097 1177 627 34.0 1e-179 MTITELQQQYAAHPNMAVMKRLLKDTSVQTVFCGGLCASAASLFSSVLVQEGGCPFVFIL GDLEEAGYFYHDLTQILGTERVLFFPSSFRRSIKYGQKDAANEILRTEVLSRLQKGEEGL CIVTYPDALAEKVVSRKELSNKTLKLNVGEKVDTTFITDVLHSYGFEYVDYVYEPGQYAV RGSIIDVFSFASEYPYRIDFFGDEVESVRTFEVESQLSREKKDGVSIVPDLAVTGDVTTS FLDFIPKETTLAMRDFLWLRERIQVVHDEALTPQAIAVQEAEENGGITLEGKLIDGSEFT VRALDFRRLEFGNKPTGTPNASVTFNTSAQPIFHKNFDLVASSFKDYLEKGYSLYICSDS MKQTDRIKAIFEDRGDKINFTPVERTIHEGFVDNTLRLCIFTDHQLFDRFHKYNLKSDKA RSGKVALSLKELNQFTPGDYVVHTDHGIGRFSGLVRIPNGDTTQEVLKLVFQNEDVVFVS IHSLHKVSKYKGKEGEAPRLNKLGTGAWEKLKERTKSKIKDIARDLIKLYSQRRQEKGFS YSPDSFLQRELEASFIYEDTPDQSKATIDVKADMESDRPMDRLVCGDVGFGKTEVAIRAA FKAVADNKQVAVLVPTTVLAYQHFQTFRERLKGLPCRVEYLSRARTAAQTKAVLKGLKDG DVGILIGTHRILGKDVQFKDLGLLIVDEEQKFGVSVKEKLRQLKVNVDTLTMTATPIPRT LQFSLMGARDLSVISTPPPNRYPIQTEVHTFNEEVITDAINFEMSRNGQVFFVNNRIANL PELKAMIERHIPDCRVAIGHGQMEPTELEKIILDFVNYDYDVLLATTIIESGIDIPNANT IIINQAQNFGLSDLHQMRGRVGRSNKKAFCYLLAPPLGSLTAEGRRRLQAIENFSDLGSG IHIAMQDLDIRGAGNLLGAEQSGFVADLGYETYQKILTEAVHELKTDEFAELYADEIKGE GQISGEEFVDECQVESDLELLLPANYVTGSSERMLLYRELDGLTLDKDVEAFRARLEDRF GPVPRETEELLRIVPLRRLAARLGAEKIFLKGGRMTLFFVSNPDSPFYQSKAFGKVIDYM MKYTRRCDLREQNGRRSMLIKDVTNVETAVSVLQEIVALQ >gi|225935369|gb|ACGA01000023.1| GENE 25 40109 - 40852 627 247 aa, chain + ## HITS:1 COG:Rv2051c_2 KEGG:ns NR:ns ## COG: Rv2051c_2 COG0463 # Protein_GI_number: 15609188 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis H37Rv # 7 239 3 229 264 212 45.0 5e-55 MQTSDSIVIIPTYNERENIENIIRAVFGLPQVFHILVIEDGSPDGTAAIVRKLQQEFPEQ LFMIERKGKLGLGTAYITGFKWALEHAYEYIFEMDADFSHNPNDLPRLYQACSEQGGDVS IGSRYVSGVNVVNWPMGRVLMSYFASKYVRFITGIPVHDTTAGFVCYRRQVLETIDLDHI RFKGYAFQIEMKFTAYKCGFKIIEVPVIFINRELGTSKMNSSIFGEAIFGVIKLKVNSWF HKFPQKK >gi|225935369|gb|ACGA01000023.1| GENE 26 40849 - 42186 1168 445 aa, chain + ## HITS:1 COG:XF0988 KEGG:ns NR:ns ## COG: XF0988 COG0044 # Protein_GI_number: 15837590 # Func_class: F 
Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Xylella fastidiosa 9a5c # 1 444 1 447 449 416 48.0 1e-116 MKRTLIQNAVVINEGRKVLGSVVIENEKIAEILVGEENATAPCDEIIDATGCYLLPGAID EHVHFRDPGLTHKADITTESRAAAAGGVTSIMDMPNTNPQTTTLEALEEKFVLLGEKSAV NYSCYFGATNNNYTQFAQLDKHRVCGVKLFMGSSTGNMLVDRMASLRNIFGGTDLLIAAH CEDQGIIKENTDKYKKEYGDDVPLALHPLLRSEEACYRSSELAVQLARETNARLHIMHIS TAKELSLFSNVPLVQKKITAEACVSHLLFTEEDYQTLGARIKCNPAIKTAQDRKALREAV NSGLIDAIATDHAPHLLSEKEGGALKAMSGMPMIQFSLVSMLELTEKGVFTIEKVVEKMA HAPAQMYEIQNRGFIRKGYQADLVLVRPGSEWTVTTDCILSKCKWSPLEGHTFDWKVEKT FVNGHLLYNNGEIDETYRGQELRFR >gi|225935369|gb|ACGA01000023.1| GENE 27 42212 - 43120 679 302 aa, chain + ## HITS:1 COG:VC0390_2 KEGG:ns NR:ns ## COG: VC0390_2 COG1410 # Protein_GI_number: 15640417 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Vibrio cholerae # 11 301 615 897 899 205 40.0 8e-53 MSTILSYKIHTVTPYINWIYFFHAWGFQPRFAAIANIHGCDACRASWLTTFPEEERNKAS EAMQLFKEANRMLDLLDRDYEVKTIFKLCKANSDGDNLIIEKEKDRFITFPLLRQQTPKR DGSPFLCLSDFIRPISSGIPDTIGAFASSIDADMEGLYEQDPYKHLLVQTLSDRLAEAAT EKMHEYVRKEAWGYAKDENLGIADLLVEKYQGIRPAVGYPSLPDQSVNFLLDELLDMKQI GISLTENGAMYPHASVCGLMFSHPASEYFSVGKIGEDQLEDYARRRGKSIEEMCKFLAAN LQ >gi|225935369|gb|ACGA01000023.1| GENE 28 43189 - 43701 548 170 aa, chain + ## HITS:1 COG:CC3310 KEGG:ns NR:ns ## COG: CC3310 COG1595 # Protein_GI_number: 16127540 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 42 166 33 158 166 82 34.0 3e-16 MREQTAEVSAPIEQEFLSVIREYERVIYKVCYLYANPNAPLNDLYQDVVLNLWKAYPKFR KECKVSTWIYRIALNTCISFYRKEKNVPEIVSLTRDTDWAIEAHDPINEMLKQLYQMINQ LGQLDKSIILLYLEDKSYEEIAEITGLTVTNVATKLSRIKDKLKRMKKEE >gi|225935369|gb|ACGA01000023.1| GENE 29 43707 - 44282 536 191 aa, chain + ## HITS:1 COG:no KEGG:BT_0247 NR:ns ## KEGG: BT_0247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 187 1 187 192 250 85.0 2e-65 MELEELKKSWNALDEHLKDKEFIKEEELEKLIRHADKGIHAIARLNIKLILISLPILILF LAEVLLHNRLNPIYIIIIFAWIPALCWDIVTTRFLQRTQIDEMPLVEVISRVNRIHRWTI RERLIAIAFLLVLAILSFIYWQVWQYGMGMIAFFILLWGGGLGLILWIYRKKFLNRIHEI KKNLSELNELM >gi|225935369|gb|ACGA01000023.1| GENE 30 44328 - 47195 2763 955 aa, chain - ## HITS:1 COG:ZsbcC KEGG:ns NR:ns ## COG: ZsbcC COG0419 # Protein_GI_number: 15800123 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Escherichia coli O157:H7 EDL933 # 1 947 1 1034 1047 235 27.0 3e-61 MKILAIRLKNLTSIEGTVEVDFMAEPLHSAGIFAISGPTGAGKSTLLDALCLALYDKAPR FATSVENVNLADVGDNQINQSDVRNLLRRGTSDGYAEVDFLGIDGRRYRSRWSVRRTRNK INGSLQPQTLEVKELDTEKEFQGTKKELLIQLVELVGLTYEQFTRTVLLAQNDFATFLKS KGAAKAELLEKLTGTGVYSRISQEVYARNKAAQEEVTLIQNRMNVIELMPEEELLALQKE KELSIEKRAAGIKLLAEQNEQLNVVRSLKIQEELWKKKQQEEQEEQAREKVLQGALASQE EGLVHFKAQWEAIQPDLKKARQLDVQIQSQQSSYIQSQQILQAANRQVAEQEQKMRVAAE QLQVSYSSLNRLLSHVGIEEALQLEQVEEILRQEESKLAAATSTNEERLLRLNSFGYPLL AEEQVKLQKELTRQQNIRQLTETQTKAKTEIERLEKEVANCLKQLTEQETALKVTQRLYE NARMAVGKDVKALRRQLQEGEACPVCGSTAHPYHQEQEVVDTLFRSIEQEYNVASTNYQQ MNNRSIALQRDLAHQKTVDGQIAEQLAALYKAGIEAGNEEQIQRRLAELAIRILEYRNLY AEWQRSDEEIKKMRAHCEALRENVSLCRLAMQKVSSAKEQLVILQNAASAELKRFEVIEK ALNVLRQERSQLLKGKSADEAEAAVAKREKELNLALEKARKEVEAVHNRLSGLQGEMKQI ALAIGELQEQYKKIESPELLPEIIKKQQEENLNIERALSTMEACLLQQAKNKLTVEQIAK ELAEKQTVAERWAKLNKLIGSADGAKFKVIAQSYTLNLLLLHANQHLSYLSKRYKLQQVP DTLALQVIDCDMCDEIRTVYSLSGGESFLISLALALGLSSLSSNNLKVESLFIDEGFGSL DAESLRTAMEALEQLQMQGRKIGVISHVQEMSERISVQVQVHKKVNGKSVLTVVG >gi|225935369|gb|ACGA01000023.1| GENE 31 47192 - 48445 976 417 aa, chain - ## HITS:1 COG:PA4281 KEGG:ns NR:ns ## COG: PA4281 COG0420 # Protein_GI_number: 15599477 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Pseudomonas aeruginosa # 2 389 1 380 409 277 39.0 3e-74 MIRILHTADWHLGQTFFGYDRTEEHGVFLNWLAEEIRQKEIDALIIAGDVFDVSNPSAAS QSMYYQFIYRVTAENPYLQIVIVAGNHDSAARLEAPLPLLQAMRTEVRGVVRKLEGGEID 
YDHLTVELKNRRGEVELLCMAVPFLRQGDYPAVQTEGNPYAEGVRELYAQLLQRLWKRRT ENQAILAIGHLQATGSEIAEKDYSERTVIGGLECVSPETFSEQIAYTALGHIHKAQRVSG RENVRYAGSPIPMSFAEKHYHHGVVMVTFDGGCAVDIERLECPKLIPLLSVPNGEPVSPE AVLEALKELPETEAVAPYLEVKVLLEEPEPILRQEIEDALADKNYRLARIVSTYRTETGN TTKENENWKRGLQEMSPLQIAQSAFEKIYQVEMPVELTGLFEEAYLAATRKEEEEEE >gi|225935369|gb|ACGA01000023.1| GENE 32 48566 - 49324 648 252 aa, chain + ## HITS:1 COG:TM1693 KEGG:ns NR:ns ## COG: TM1693 COG0204 # Protein_GI_number: 15644441 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Thermotoga maritima # 55 220 59 223 247 109 35.0 4e-24 MKILYYIYQICIALPILLVLTILTAIVTIVGSLLGGAHIWGYYPGKIWSQLICLFLLIPV KIEGREKLHDKTSYIFVPNHQGSFDIFLIYGFIGRNFKWMMKKSLRKLPFVGKACESAGH IFVDRSGPKKVLETIRQAKDSLKDGVSLVVFPEGARTFTGHMGYFKKGAFQLADDLQLAV VPVTIDGSFEILPRTGKWIHRHRMILTIHEPIPPKGKGMENIKATMAEAYAAVESALPEQ YKGMVKNEDQDR >gi|225935369|gb|ACGA01000023.1| GENE 33 49423 - 49899 407 158 aa, chain - ## HITS:1 COG:no KEGG:BT_0242 NR:ns ## KEGG: BT_0242 # Name: not_defined # Def: putative polysaccharide deacetylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 158 2 158 158 171 60.0 1e-41 MKNPNELSAASSLRGDELQLIEQKASIVVGIANSRTRLRIGILLSKYSEYQVDYISSESE IGKLIEEADLIIGAGITAYEGVLRRKPVIVVGDYGLGGLVTPDTFRKHYNNRFRGKINGV RNESFSLENLEKEIYKSFNLTFQELQMMSNQTITLQNI >gi|225935369|gb|ACGA01000023.1| GENE 34 50065 - 51849 1244 594 aa, chain - ## HITS:1 COG:no KEGG:BT_0240 NR:ns ## KEGG: BT_0240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 594 3 584 584 711 66.0 0 MCIFAQNSSKVKTTTVSVYAAFFFILVLSVSCHRDSEAPDAAFQRIEMCMESLPDTALYL LKSIPHIEKLRGKLQADYALLLTQAMDQNYVKFTSDSLIALALNYYTVEHGDSVTRAKAQ YYYGRVLRELGKDEEALTFLSSAKEMFGNIQCCKMFAMATDEIGMINRKKKLYQESLKNF RESYTIYEELKDSLSLVRAGQNIGRAYLFQNKWDSCYFYYTYALELAREKQYPSEVSILH ELGILYRSMGELKKSERYFLAAYEKETDEERKYVECMSLGYLYIQMGDVENARKYFKMSI NSSKEYTRIDAYNNLYFLEKDIDNFEEAIIYHEKADSIVNVLDEVDSQELITELQKQYEN 
EKLRNDNLQLKVDRTVFIFCGTVIFLIVAFYMCYYYYKSKNHRKKIAEIESQIRDNEEEI KRYQQEMEEIQELKDQVLEENRILEENRMKVGELNGKIVLLSMQNKNLSGLLKELGGELT VGPSSEQYISAFRLLLAIKEGTLRGKLSNGERYKLFSLFDLLYSNYVTRLLDKTPLLTKH DLEICCFLKFGLTNEELARIFQTSSDSVTKAKGRLKGRLGISPQEDLNVFLRDF >gi|225935369|gb|ACGA01000023.1| GENE 35 51862 - 52596 516 244 aa, chain - ## HITS:1 COG:no KEGG:BT_0239 NR:ns ## KEGG: BT_0239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 382 83.0 1e-105 MNVNKILPFLLLLPFLASCTSKYKIEGTSSVNSLDGKMLYLKSLRDGEWVKLDSAEVVHG LFSMKGKIDSVQMVTLYMDEESIMPIVLESGKITVTISNTDLKAVGTSLNNALYEFISKR NQLEESIGELEQKETRMVLDGGDLDEIHGQLVVEGDSLMKAMNQYVKTFISDNYENVLGP SVFMMLCSSLPYPIMTPQIDDIIKDAPYSFKDNKLVREFLSKARENMKLIEEHQRLEQNA STNK >gi|225935369|gb|ACGA01000023.1| GENE 36 52754 - 53998 962 414 aa, chain + ## HITS:1 COG:MA2647 KEGG:ns NR:ns ## COG: MA2647 COG0641 # Protein_GI_number: 20091470 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Methanosarcina acetivorans str.C2A # 13 405 9 396 446 414 48.0 1e-115 MKTSTFAPFAKPLYVMVKPVGAVCNLACDYCYYLEKANLYKDNLKHVMSDELLEKFIDEY INSQTMPQVLFTWHGGETLMRPLSFYKKAMELQKKYARGRTIDNCIQTNGTMLTDEWCEF FRENNWLVGVSIDGPQEFHDEYRKNKMGKPSFVKVMQGINLLKKHGVEWNAMAVINDFNA DYPLDFYRFFKEIGCQYIQFAPIVERILSHEDGRHLASLAENKAGTLADFSITPEQWGNF LCALFDEWVKEDVGKYYVQIFDSTLANWMGEQPGICTMAKTCGHAGVMEFNGDVYSCDHF VFPEYKLGNIYSKTLVEMMHSERQHNFGNMKYQSLPTQCKECEFLFACNGECPKNRFSQT AEGEPGLNYLCKGYYQFFKHVAPYMDFMKNELMNQRPPANIMEALKNGDLKIEY >gi|225935369|gb|ACGA01000023.1| GENE 37 54031 - 56190 2080 719 aa, chain + ## HITS:1 COG:no KEGG:BT_0237 NR:ns ## KEGG: BT_0237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 719 1 695 695 1314 91.0 0 MNRLRLYLLALAALFVCSVKADEGMWLLQLMQQQHSIDMMKKQGLKLEAQDLYNPNGISL KDAVGIFGGGCTGEIISPEGLILTNHHCGYASIQQHSSVEHDYLTDGFWATSRDKELPTP GLQFTFIERIEDITDIVNAKIAAKEITESESFSNIFLQKLAHDLYFKSDLADKKGIVPQA 
LPFYAGNKFYLFYKKIYPDVRMVAAPPSSIGKFGGETDNWMWPRHTGDFSMFRIYADANG EPAEYSESNVPLKTKKHLSISIKGLKEGDYAMIMGFPGSTSRYLTVSEVKERMESENDPR IRIRGARLAVLKEVMNASDKIRIQYANKYAGSSNYWKNSIGMNKAIIDNDVLGTKAAQEA KFAEFAKAQNNAEYAAVVKNIDDLVAKTTPLNYQYTCLRETFFGAIEFGNVMLTKTREAL LEKNDSVIEARMKALESTYESIHNKDYDHEVDRKVAKALFPLYAEMIPANQRPSIYKVIE QKYKGDYNKFVDDMYDNSIFANRANFEKFTKKPSVKAIDNDLALQYCQSKYDLMDTLASQ LKDMDQELALLHKTYIRGLGEMKLPVPSYPDANFTIRLTYGNVKPYDPKDGVHYNYYTTT KGILEKENPEDREFVVPAKLKELIEKKDYGRYALPNGDMPVCFLSTNDITGGNSGSPVLN ENGELIGCAFDGNWESLSGDINFDNNLQRCINLDIRYVLFILEKLGNCGHLINEMTIVE >gi|225935369|gb|ACGA01000023.1| GENE 38 56191 - 58347 2050 718 aa, chain + ## HITS:1 COG:no KEGG:BT_0236 NR:ns ## KEGG: BT_0236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 718 1 682 682 1336 92.0 0 MRKQILFVLFSLATLSIHADEGMWMLTDLKAQNAVAMRELGLEIPIEEVYNANGLSLKDA VVHFGGGCTGEVISSEGLVLTNHHCGYGAIQQHSNVEHDYLTDGFWAMNRDAELPTPGLT VTFIDRILDVTDYVNEQLKKDPDPDGVNYLSPSYLGNVAERFAKAENIEITPATKLELKA FYGGNKYYLFIKTVYSDIRMVGAPPSSIGKFGADTDNWMWPRHTGDFSLFRIYADKNGKP AEYSKDNVPLQVKKHLKISLAGVQEGDFTFVMGFPGRNWRYMIADEVEERMQTTNFMRQH VRGARQKVLMEQMLKDPAVRIHYASKYASSANYWKNAIGMNEGLIRLNVLDTKRAQQEEL LARGREKGDDSYQKAFDEIRSIVSHRRDALYHQQAINEALVTALDFMRIPSTTELVTALK SKDKEQIKEAKLKLKKEGDKYFASVPFPDVERMVAKEMLKTYANYIPAEQRINIFEIINS RFKGSIDAFVDACFEHSIFGNPKNFEKFIKKPSLYKIGYDWMVLFKYSVTDGILKTAIAM KEANQNYDAAHKVWVKGMMDMRQEKGTPIYPDANSTLRLTYGQVLSYEPADGVVYDAHTT LKGVMEKEDQGNWEFVVPQKLKELYKSQDYGRYGKNGEMPVCFIVNTDNTGGNSGSPVFN SKGQLVGTAFDRNFEGLTGDIAFRPSSQRAACVDIRYTLFIIDKYAGASHIIDELSIE >gi|225935369|gb|ACGA01000023.1| GENE 39 58434 - 58616 210 60 aa, chain - ## HITS:1 COG:no KEGG:BF3012 NR:ns ## KEGG: BF3012 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 60 1 60 60 89 75.0 3e-17 MKLNRTDYVLERASDGGYYAWLTVNMQCNAYGESPEEAVQNLQNIMNEMIDEMYMVEEFI >gi|225935369|gb|ACGA01000023.1| GENE 40 58613 - 59740 390 375 aa, chain - ## HITS:1 COG:MTH884_2 KEGG:ns NR:ns ## COG: MTH884_2 COG1819 # 
Protein_GI_number: 15678904 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Methanothermobacter thermautotrophicus # 2 328 1 312 348 74 22.0 3e-13 MKFLFIVQGEGRGHFTQAITLEEMLLRNGHEVVEVLVGKSSTRTLPGFFNRSIHAPVKRF ISPNFLPTADNKRANLTKSFAYNLLRLPEYLRSMYYINQRIRETGAEVVINFYELLTGLT YALFRPSVPYICVGHQYLFLHRDFEFPDKNSCQLWMLRFFTRMTALRSSKKLALSFLEME QDDMNQIVTVPPLIRQEVTAIRPEKGDYIHGYMVNSGFADSVESFHAKHPEVPLTFFWDR SDAEEVTRIDETLSFHQIDDVKFLNAMAGCKAYASTAGFESICEAMYLGKPVLMVPAHIE QDCNAYDAMKAGAGIISDSFDLQPLLRFAGEYTPNRHFVYWVRSCERRMIRELEKLAASH SEITSIPTFTNYLPI >gi|225935369|gb|ACGA01000023.1| GENE 41 59909 - 60208 299 99 aa, chain + ## HITS:1 COG:RSc1188 KEGG:ns NR:ns ## COG: RSc1188 COG0526 # Protein_GI_number: 17545907 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Ralstonia solanacearum # 6 97 16 107 108 95 44.0 2e-20 MEKFEDLIQSPVPVLVDFFAEWCGPCKAMKPVLEELKLIVGDKARIAKIDVDQHEDLATK YRIQAVPTFILFKNGEAVWRHSGVIHSSELQGVIEKHYT >gi|225935369|gb|ACGA01000023.1| GENE 42 60226 - 60690 444 154 aa, chain + ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 40 152 3 115 117 105 38.0 3e-23 MKKVLVMVALVMASVIVYAFNDSRETNQGKKEVTGNGEVVVMDKEMFLKDVFDYEKSKEW NYKGNKPAIIDLYADWCGPCRQTAPIMKELAKEYAGKIVIYKVNVDKQKELAALFNATSI PLFVFIPMKGDPQLFRGAADKATYKKAIDEFLLK >gi|225935369|gb|ACGA01000023.1| GENE 43 60800 - 61435 506 211 aa, chain - ## HITS:1 COG:PAE2336 KEGG:ns NR:ns ## COG: PAE2336 COG0778 # Protein_GI_number: 18313271 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pyrobaculum aerophilum # 21 208 50 
252 274 110 35.0 3e-24 MRKVQLLLVCLMLSVAAFAADKVIKLPKPNLNRTGAVMKALSERHSTREYASKSLSLSDL SDLLWAANGINRKESGMRTAPSALNKQDVDVYVVLPEGSYLYDAKNHQLTLVAEGDHRGA VAGGQAFVKTAPVSLVLISDLSRFGDAKSARSQLMGAMDSGIVSQNISIFCSAANIATVP RASMDNEQLKKVLKLKDSQMPMMNHPVGYFK >gi|225935369|gb|ACGA01000023.1| GENE 44 61530 - 62090 677 186 aa, chain - ## HITS:1 COG:CAC3598 KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 182 1 180 181 231 70.0 6e-61 MKKFRCTVCGYVHEGDAAPEKCPLCKAPASKFVEVVEVEGGALSFADEHVIGVAKGCDEE MIKDLNNHFMGECTEVGMYLAMSRQADREGYPEVAEAFKRYAWEEAEHAAKFAELLGDCV WDTKTNLQKRKDAEQGACEDKKRIATRAKALNLDAIHDTVHEMCKDEARHGKGFEGLYNR YFGDKK >gi|225935369|gb|ACGA01000023.1| GENE 45 62113 - 62541 251 142 aa, chain - ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 7 135 14 140 142 118 48.0 4e-27 MKPYDRLLEHNIKPSMQRIAIMEYLMDNPIHPSADDIYTALSPSMPTLSKTTVYNTLKLF SEQGAALMLTIDEKNTNFDADTSVHSHFLCKRCGHIYDLKCPEAIKKVENIDMDGHQVSE VHYYYKGICKNCLSKDKETRID >gi|225935369|gb|ACGA01000023.1| GENE 46 62745 - 63245 405 166 aa, chain - ## HITS:1 COG:no KEGG:BT_0214 NR:ns ## KEGG: BT_0214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 165 1 156 158 104 39.0 1e-21 MYMKNIFKMMLLLTMALFMSCGDDDEVKLPDTKDVNYANIAGTWRLSEWNGEKIDGDTRY YYIKFDRKEKDGKRSYTIYTNLNSATSQQIPGSFTLNKEEDYGDVISGTYYYQLDTDDEW EYSYIVSGLTDISMVWTAKEDMGEIKVYTRCEDIPSDILTGTRTSF >gi|225935369|gb|ACGA01000023.1| GENE 47 63261 - 64019 262 252 aa, chain - ## HITS:1 COG:no KEGG:BT_0213 NR:ns ## KEGG: BT_0213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 249 48 267 271 199 43.0 6e-50 MKRYLAETIELGLFLLIYLFPVHSFAQETVEQVLQFEHTVMNIGTLSEDDDPAMYHFKYH 
NVSKKPIYLSKLTTSCGCTVAKYDKKVVQPGEQGEIILVFHPMDQAGDLYREAFVYTDLS KEHPTAKLALVGKVLPTADRWRDYPVYIGNTLRLKRKDWQVRILSREGRQVERFVCVNTG KQPLRLSALLLPEYIHFHTEPEIISPGIEADMVLSIDKGLLPQKNEITFHFMLDGIPVRP CERRIQVKLLLQ >gi|225935369|gb|ACGA01000023.1| GENE 48 64028 - 66097 1784 689 aa, chain - ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 223 525 45 297 416 130 36.0 6e-30 MKRIYLILIAAVAITFASCSDQDMDAVKPDTGQGAPIIDLPEGATQGKILVKFKPEAASF LDAATTRSIGGALTRSGISDMDAVLQRIGISELERIFPVDSRTEERTRKAGLNLWYIIHF AEDTDLEQVAKDLSQVADVAKVQFTRAIQRSYDPNVRATVLTKQAMTRVVHTTRAVNTDA NDPFFNLQWGCKNDGSILQNGEKNDKGDNVVPAVMGVDVNCGEAWKLCTGNPSIVVAVLD EGVMYDHPDLEGNIWVNEDETFASKEDADGNGYAGDRYGYNFTDDKGYISYDDPNDTGHG THVAGIISAVRNNGEGISGIAGGDKANNRGGVKIMSCQVFSGSKGCDLYQEAKAVKYAAD NGAVILQCSWGYNSGLSNPISGYTPGFTSDKDWVDAAPLEKEAFDYFLHNAGSPNDVIDG GIIVFAAGNEYAAMAGYPGAYPDYISVAALAADGTPSCYSNYAMGVSIAAPGGDSDYHQS SKGKIYSTLPPSANEDGGENSHYGYMEGTSQACPHISGVAALGLSYAAELHRHFRADEFR KMILESVNPVEPYFKETKVYWYTNASFGQIAAGQMEPAAYAGNMGTGLIDAYKLLKAVEG GGVEMTVPNMYVAVEATSKINYSRYFKNGENMTFTCTVDDNSIATLTTENNITFTLKGLK VGSTKATVKASDGTKQDFFITVRKNDSWM >gi|225935369|gb|ACGA01000023.1| GENE 49 66108 - 67817 1204 569 aa, chain - ## HITS:1 COG:no KEGG:BT_0211 NR:ns ## KEGG: BT_0211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 618 660 344 36.0 6e-93 MKPLFKIYLYLFAGLLFIAACNDSDEEGITGFTIDTQEVTLGAIGGMEPVKVASGTKWVA KVNQPWVKVMPANGVGSTNCEIVVDSTLSNDVRHAVVTFVPEGQPKQELKIHQTGYGKMI GLDKYEVEVANMANDDKRYFDISVTTNVKFKVEYSQAIGSWVTTNNRTPDVSLDYGARPR TLKMRFKWDMNTDPQERIASIKFLPVNAEDELEKEVTLTVKQEAAPEITDDRRGDSIAIV IASTKMRSMMNWDASERLDYWLGVTVWERTDKDVTPEKIGRVRSVEFRLLNTKEVLPVEI GKIKYLETLVIYGNTNTSLLPSPYRIGNALAELKYLKNLTISALGITTIDKNELKEPCKV LRTLDVSGNNFTSIPYDLTPTNYPELLNLSLTGNRRYSSITDLSTETRDNPGLRIDASSS 
SFKNLLKWEKLKSLSLSYNLIYGKLPTFINSYNGSPEYGVSTYTDEDIQQNDTLMSASEE VKAKLKTIPNILPNAEHFSINLNFLTGDDLPDWLLYHPRFARFDPFTLIYTQDSGKDKSG NIPGFKNEPSNLEWFYERYPKARPTLTDN >gi|225935369|gb|ACGA01000023.1| GENE 50 67848 - 70475 1858 875 aa, chain - ## HITS:1 COG:alr0124_1 KEGG:ns NR:ns ## COG: alr0124_1 COG4886 # Protein_GI_number: 17227620 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Nostoc sp. PCC 7120 # 461 828 65 399 461 81 22.0 7e-15 MFVLSCMLFTLGFTVGSCSDDDDAVLQAGYGYAQFKLYKSCLAETKVTRASTNELNYLRD AQKMKIVLINQEDGTEVVQTVGLEAMGDDSEFGLRSEKLQLMAGTYQIVGFYLYKADGTT QDLKQILSGEPDEKTIITVINGGLAVQDIHVKVVKRGMVKFTVTKNFLPSTRAALGEDYL FSDIKYINVTVQEQFTKKDTTFSNVAVEYTEKLDGNGTKISVAVSDSIFRLSAGKYKVKN YTTIKKNKGSLEYGEVNGADFEVSDNVTTEVKIPVNFSKTTGSIKDYLVLKEIWEALGGP AIPEKNQKGWRYSGVTYPLGTNWNFDKDIDLWGDQPGVELDAKGRVVGLSIGAFAPEGDI PESLGDLTELRTLSLGNHSDQVGDNIIEETMGLELTEAQKNSVRSDFYNKFVKNDIALYF SEPIQAALKWQKEGIPTTFSPSVVNDKASRPSLKDVPANRLTNGIHKIPVTIGNLKNLQY LYIANGKFEGFEEGTDLGKLENLTDLEIYNCPSMKKLPEELQQLPNLQSFNLASNPNLGD FHEDLGEFVASENISKTLQIFYLTFNNLTVLPDMSMVKKLGKLDCAYNKIKTIEKAFGRD VNLVQLSMDHNLIEELPRDENGSFCGYADVESFSFAYNKLKKFPNIFSSQSVYVMSSVNF SFNEIDGFEGEEDGTFKGVNANTIAIGGNKLKKFPTILFKTNSQVSALGLNGNGIEEIPK GTFSASKYSYMMKTLDLTYNKLSKLPDDFNGRNMPFLYGVDVSNNRFTEVPTGPMDAATL TVYAVRSQRDENGNRLLRKWPSNISLCPSLRQFCIGGNDLRKITDTISSYIIVFEIKDNP NISLNLSNVCNMIKAGAYLLIYDPEQDIRGCDYVK >gi|225935369|gb|ACGA01000023.1| GENE 51 70520 - 72082 1085 520 aa, chain - ## HITS:1 COG:no KEGG:BT_0209 NR:ns ## KEGG: BT_0209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 396 28 421 650 132 28.0 3e-29 MKDIIKYYLQLIVLMFCLFTSACSDDDETVTPVFPDLQKIECAVGDTKTLTFEATDNWIL ISSSLWCYFEQDGEQTFTCSGGVGEQTVTIHISDDATELMKSYKAELTMTMAGSRQVIAE VTRPSTGYELHAFDAEQTIEYTAENPYVQDYGGKAYFWVSSNADWIVESSESLDLSKTNI SGEAGNNVKITPLLKQGTENRKTAWTQELIFKNRKGEVISKLPVHYDGIPADKIEFSNDN IYSNKIKASVDGESYTFKNQSYEAEGVPLTVIARNDEYTYVCVEYTSTMGPETGWNEEWS FKLLTGFKNWLWIEDDSEGNLMIAAKSNDGASRSAYLMVFPNLVYAEVENDFENKVFSKE 
GIVGEYSNYIGALIEQDAFVATSGLSIMDSYTFRPLYDGAGNAIQAEPYAGEMTENELIE KYGTSNVYTVYSFTLGMSYTQIFVLPNGYTGSNLQATTILNGKNTAWSGISLEPGQNSSG QMGINIYGMNSEANGDEMCITIKNGTEPYAVLLIETRYSD >gi|225935369|gb|ACGA01000023.1| GENE 52 72102 - 72992 769 296 aa, chain - ## HITS:1 COG:no KEGG:BT_0208 NR:ns ## KEGG: BT_0208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 293 1 297 299 258 44.0 2e-67 MKLLITGIVALITLAVGFWGCNDLSRTTYSGSDYVMFSDTLSMYPVQDSEKWFEIPVVAT NICDYDRSFGVEVDDKASNAIEKKQYVVESNTVTIKAGERVAKFRMKGIYENIDKTDSLS VTFNLLAKDENVWDLYGTRTRVQMQKACPFELSTFEGYCLLTSSFFSAYMTNTEHRLLQA ERDKTEDNTIILHDFFYKNYDLKIKYDPSDPLKPFVEFDDQIIGSTAEAFGTIYGNGKLM CTQPVAYDSYYNVCQKFVFLYSTIYVVGKGTVGTYVNILEWISDEEAEQYKKEEGL >gi|225935369|gb|ACGA01000023.1| GENE 53 73018 - 74538 1442 506 aa, chain - ## HITS:1 COG:no KEGG:BT_0207 NR:ns ## KEGG: BT_0207 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 1 511 511 583 58.0 1e-165 MIRKIKLYIALVAVLLSASSCLDKYPQDAIPQDDAIKTVSDVRQALIGIYAQFKNSSLYS GYLTLLPDIQTDQVYAVKGYTNVYGDVWRNEILAINKQVEYVYGGLYAVIGRCNFVLDNM AAVEAATSDDEQLDKLDNYKGHIYFARALAYSELLKCFCKAYDSDEEAANELGVVLQSSY VNPGPVKRASLKDSYQFVLDDLAKAAEYLAADDDDTAVIYNSAYFTVGTVNALYARMYLY MQKWEKAVEYATKVIDSKKYALADATKNSYSVTYNDFAYMWQYDNSTEIIWKVMFEVNSY GGALGTVFLNYDYTSYKPDYVPAKWVLDAYANADLRYNAYFGSVTTGFSHALTWPLLIKY MGNQDFISQRVLCVSMPKPFRLAEQYLIRAEAYCRMGTAYYGKAGIDISTLRMARYSSYG GSTTLTEENWFKTVSEERMKELFMEGFRLNDLKRWHEGFERKEQTSTVSPGNTLKLEKDD PRFVWPIPQHELDAPGVDLMPNESNK >gi|225935369|gb|ACGA01000023.1| GENE 54 74551 - 77484 2491 977 aa, chain - ## HITS:1 COG:no KEGG:BT_0206 NR:ns ## KEGG: BT_0206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 977 1 980 980 1313 66.0 0 MRKSILLFVLFTLTSIPLLLFAQGGYQVTGHIISAEDNQPMIGVSVLEKGTTNGVITDIN GNYSITVTKSPATLQFSYVGMKTIDKQVTASTRINLTMENDAQMVEEVVVVAYGVRKKGT IAGSVSTVKAEKMEDVPAPSFDQALQGQAPGLMVLSDSGEPSKAATFRIRGTNSINSGKD 
PLFILDGVAISSSDFNTISPNDIESISVLKDASSTSIYGARASNGVVVITSKRGRMGEAA KVTFRTQLGFSQLASKDWDQMNTDERIQFEKEVGLDKGQDYEKLSKTNINWLDKVYNDTA PLQNYELSVNGGTEKLNYYVSGSYYDQDGIAVGSTFERVGFRANVEAKANKWLKIGTNSM FAYQEVEQSDDGEYALWAPISASFFMLPYWNPYKEDGSLALQDDGSWKGTTENPLAWMAN NPLSNKKYKLLSTFYAEVTPVKNLTIRSQLSADYGHTTSFYQSFPSYKPNNNYGGAQRSS FDMLNLMITNTANYRFMLNDVHSFNFMVGQEGENYHYEGFQLTTRGQTNDILTNLASGST ASSWSDPVTEYSFLSFFARGEYNYDDRYYADFSIRGDGSSRFGTDNHWGAFWSVGFMWNL RKEKFMQKYDWLTNAQIAVNTGTSGNSSINNYEHLALVSGGYKYDNESGIAISQLGNEEL SWESTWATNVALHLGFIDRINLDVEFYNKKTTNMLMAVPISYTSTGFGTRWDNVGAMRNR GVEINVGADVLRIKDFKWNVNANVSYNKNEITELYNGVTEYVASDTGRMVAVGHPLGEFY LNRYAGVNPINGDALWYTKDGEITMEYNESDKVMLGKTHEAPWQGGFGTTLSWKGFSLSA QFTWVADRWMLNNDRVFQESNGLFSAYNQSKRMLYDRWKKPGDVTDIPRYGVTPQLDSRF LEDASFLRLKNLMLSYTFPQKWLKRTSFLNSARIYAQGQNLLTFTNFTGMDPESTSNVYK AQYPMSRQFTFGLEVSF >gi|225935369|gb|ACGA01000023.1| GENE 55 78131 - 80056 1676 641 aa, chain + ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 326 634 1 309 310 454 66.0 1e-127 MNYGFVKVAAAVPHVKVADCKFNVEKIESLIAVAEGKGVQIIIFPEMSITGYTCGDLFGQ QLLLEEAEMGLMQILNNTRQLDIISIVGMPVVVNSTVINAAAVIQKGKVLGVTAKTYLPN YKEFYEQRWFTSALQLTTNNVRLCGQIVPIGSNLLFETSDTTFGIEICEDLWSTIPPSSS LALQGAEILFNMSADNEGIGKNNYLCSLISQQSARCIAGYVFSSCGFGESTTDVVFAGNG FIYENGSLLAHSERFSMKEQLIISEIDVERIRAERRINTTFAANQANLGDKKAVSIATEF VNSKELTLTRKFNSHPFVPQGIELNEHCEEVFSIQVAGLAQRLIHTGAKTAVIGISGGLD STLALLVCVKTFDKLGLSRKDILGITMPGFGTTDRTYHNAIDLMKSLGISIREISIKDAC IQHFKDIDHNINVHDVTYENSQARERTQILMDVANQTWGMVIGTGDLSELALGWATYNGD HMSMYGVNASVPKTLVKYLVQWVAVNGMDENSKATLLDIVDTPISPELIPADENGEIKQK TEDLVGPYELHDFFLYYFLRFGFRPSKIFYLANIAFKDMYDEKTIKKWLSTFFRRFFNQQ FKRSCLPDGPKVGSISISPRGDWRMPSDASSAMWLKEIEAL >gi|225935369|gb|ACGA01000023.1| GENE 56 79924 - 80172 95 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPYLYHRSVTAVPPYCHNHAIMLAQLCQCGGTSLLIGRPYKASISLSHMAELASLGILQS PLGLIEILPTLGPSGKQERLNC 
>gi|225935369|gb|ACGA01000023.1| GENE 57 80174 - 80734 333 186 aa, chain - ## HITS:1 COG:PA3885 KEGG:ns NR:ns ## COG: PA3885 COG2365 # Protein_GI_number: 15599080 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Pseudomonas aeruginosa # 37 183 48 195 218 79 27.0 5e-15 MQKQIPLCLLAIVLSVSLFGQNLKADKIILSDSDLTNLYQIDSGVYRSEQPSKEGFKALE KYGIGEVLNLRNRHSDDDEAKGTSIKLHRVKTKAHSISEKQLIQALRIIKNRKAPIVFHC HHGSDRTGAVCAFYRIIFQNVSKEDAIHEMTEGGYGFHRIYKNIIRRIKEANVEQIRKEV MEGGEL >gi|225935369|gb|ACGA01000023.1| GENE 58 80764 - 81888 1089 374 aa, chain - ## HITS:1 COG:Cj1599_2 KEGG:ns NR:ns ## COG: Cj1599_2 COG0131 # Protein_GI_number: 15792904 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Campylobacter jejuni # 184 374 3 190 190 219 57.0 5e-57 MKKKVLFIDRDGTLVIEPPVDYQLDSLEKLEFYPKVFRNLGFIRSKLDFEFVMVTNQDGL GTPSFPEETFWPAHNLMLKTLEGEGITFDEILIDRSFPEDNAPTRKPRTGMLTKYLDNPE YDLAGSFVIGDRPTDVELAKNIGCRAIYLQDSTEALKEKGLEEVCVLATTDWDRIAEFLF AGERKAEVRRTTKETDIYVALNLDGNGTCDIFTGLGFFDHMLEQIGKHSGMDLTIRVKGD LEVDEHHTIEDTAIALGECIYQALGSKRGIERYGYALPMDDCLCQVCLDFGGRPWLVWDA EFKREKIGEMPTEMFLHFFKSLSDAAKMNLNIKAEGQNEHHKIEGIFKALARALKMAIKR DIYHFELPSSKGVL >gi|225935369|gb|ACGA01000023.1| GENE 59 81892 - 82932 873 346 aa, chain - ## HITS:1 COG:YIL116w KEGG:ns NR:ns ## COG: YIL116w COG0079 # Protein_GI_number: 6322075 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Saccharomyces cerevisiae # 4 342 5 376 385 223 38.0 5e-58 MKTLQELTRPNIWKLKPYSSARDEYKGVTASVFLDANENPYNTPHNRYPDPMQCELKTLL SKIKKVSPEHIFLGNGSDEAIDLVFRAFCEPGKDNVVAIDPTYGMYQVCADVNNVEYRKV LLDDDFQFSANKLLAATDEHTKLIFLCSPNNPTGNDLLRSEIVKVLCQFEGLVMLDEAYN DFSQAPSFLEELDKYPNLVVFQTFSKAWGCAAIRLGMAFASKEIIDILSKIKYPYNVNQL TQQQAISMLHKHYEIERWVKTLKEERDYLEAEFEKLPCTIKLFPSDANFFLAKVTDAVKI YNYLVGEGIIVRNRHNISLCCNCLRVTVGTRVENNTLLAALKNYQG >gi|225935369|gb|ACGA01000023.1| 
GENE 60 82949 - 84238 1280 429 aa, chain - ## HITS:1 COG:hisD KEGG:ns NR:ns ## COG: hisD COG0141 # Protein_GI_number: 16129961 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Escherichia coli K12 # 8 427 13 431 434 429 56.0 1e-120 MKLIKYPSKEQWTELLKRPALNTESLFDTVRSIINKVRTEGDKAVLEYEAAFDKVTLSAL AVTPEEVQAAGTLVNDELKAAISLAKQNIETFHSSQRFVGKKVETMNGVTCWQKSVGIEK VGLYIPGGTAPLFSTVLMLAVPAKIAGCKEIVLCTPPDKNGNIHPAILFAAQLAGVSKIF KAGGVQAIAAMAYGTESVPKVYKIFGPGNQYVTAAKQLVSLRDVAIDMPAGPSEVEVLAD ASANPVFVAADLLSQAEHGIDSQAILITTSEKLQTEVMEEVERQLAELPRREIAAKSLEN SKLILVKDLDEALELTNAYAPEHLIIETENYMEVAERVINAGSVFLGSLTPESAGDYASG TNHTLPTNGYAKAYSGVSLDSFIRKITFQEILPEGIKAIGPAIEEMAANEQLDAHKNAVT VRLKAIQNL >gi|225935369|gb|ACGA01000023.1| GENE 61 84280 - 85131 999 283 aa, chain - ## HITS:1 COG:PM1195 KEGG:ns NR:ns ## COG: PM1195 COG0040 # Protein_GI_number: 15603060 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Pasteurella multocida # 2 282 7 298 299 244 45.0 2e-64 MLRIAVQAKGRLFEETMALLEESDIKLSTTKRTLLVQSSNFPIEVLFLRDDDIPQTVATG VADLGIVGENEFMEKEEDAEIVKRLGFSKCRLSLAMPKDIEYPGLSWFEGKKIATSYPVI LRKFLKKNSVNAEIHVITGSVEVSPGIGLADAIFDIVSSGSTLVSNRLKEVEVVMKSEAL LIGNKNMSEEKKEVLEELLFRMNAVKTAEDKKYVLMNAPKDKLEEIIAVLPGMKSPTVMP LAQEGWCSVHTVLDEKRFWEIIGKLKGLGAEGILVLPIEKMIV >gi|225935369|gb|ACGA01000023.1| GENE 62 85390 - 85881 341 163 aa, chain - ## HITS:1 COG:no KEGG:BT_0199 NR:ns ## KEGG: BT_0199 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 314 92.0 6e-85 MKKIINPWKGLEGYNCFGCASNNEAGLKMEFYEDGDEVVSIWKPRPEYQGWIDTLHGGIQ AVLMDEICAWVILRKLQTTGVTSKMETRYRKSIDTKDSHIVLRASIKEVKRNIVIVEAKL YNKDGEVCTESVCTYFTFSKEKSKDEMHFSKCDVESEEILPLI >gi|225935369|gb|ACGA01000023.1| GENE 63 85955 - 87859 1426 634 aa, chain - ## HITS:1 COG:no KEGG:BT_0198 NR:ns ## KEGG: BT_0198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
634 1 634 634 724 56.0 0 MKLFYNLQRGVGLLVFCSILLSACVNHISEEEEDSVNNGDIPLKFIADIRESVGTRMANN NFAKGDEVGLFALAGTTTMQEERYVDNLHFIRSSNGEFESNESVYYPDDGVTLNLISYYP YQKSGVGIGESTMQVAVSTTQNIPDDYSHSDFLIASKEEVLASKEAVALTYNHQFFRLKI VLIPGEEENLEDMLSVKPTLSVSGFYTKTSYDFQKKTFSGYSEEKDIIPAGEWEIKDGRL IGKEFILIPQEATPGYQYITLEAAGKQYTSLLPSTLQLKSGKQRELEIKFVSAEDVLMSK VNAEIGDWDGTEIDHTESVTLHKYIDISKLTFEKSNVYKVLHSGKQVAEICKEYLVTSEF SSQAIVAYPMKKDGSVDLSQGVVAQLLGKSGKVNGGSVSWNMEDHSLAYVDGTFPIRNNV YVMADGTVSLSVTMVDDILPVLALEDIVRDVRGGIIHNYPLVKIGTQYWMRDNLEVSSYV DGEVLPRLDAVTANVVGYLQSATEHYFYTANVVLSNKILPAHWSIPDWKDWNILKDYLNG EASLLKSGTWLPLKAGEQVQPATNLSGFNGIPVGMYVGAFQTDYEKKHLAYWTLDNANAA IDIKVFYLKSDTDIIEESNAGVDTKAFAIRCIRK >gi|225935369|gb|ACGA01000023.1| GENE 64 88167 - 88823 417 218 aa, chain - ## HITS:1 COG:no KEGG:BT_0197 NR:ns ## KEGG: BT_0197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 218 1 199 199 307 75.0 1e-82 MSRIILNTKTWLLFVLLFGISFSAWAQSDDFNTWTKFKVNHKIDSRFSVSGDLELRMKDD VSKLDRWGLIVGGSYRPYSFLNLGVGYETHLRNLGDSDWKFRHRYHITATASFRYQWLKV AVRERFQQTFDRGNSETRLRSRLKLSYAPTKGIVSPYFSIEIYQSLDDASFWRASRMRYR PGVEIALAKRWSLDAFYCYQYASSQGRHIAGIEVGYSF >gi|225935369|gb|ACGA01000023.1| GENE 65 88929 - 89111 112 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFLGVKQVVPLKETTRWFKREKSLFIKHCYSNHYFGFTFHTNKAKNADNKAAPIQNQLI >gi|225935369|gb|ACGA01000023.1| GENE 66 89030 - 90397 1129 455 aa, chain - ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 453 1 440 459 373 43.0 1e-103 MLKQLINFYKVSSPRPCNGEALSSSGSQHLHSDARRLKYLKWSTFLSATFGYGMYYVCRL SLNVVKKPIVDEGIFSETELGIIGSVLFFTYALGKFTNGFLADRSNINRFMTTGLLVTAL VNLCLGFTHSFILFALLWGISGWFQSMGAASCVVGLSRWFTDKERGSYYGFWSASHNIGE ALTFLIVASIVSVLGWRYGFFGAGIVGLLGALIVWKFFHDTPESQGFPPVNAPKQKEEMS VVETTDFNKAQRQVLMMPAIWILALSSAFMYISRYAINSWGVFYLEAQKGYSTLDASFII 
SICPVCGIVGTIFSGVISDKLFGGRRNVPALIFGLMNVLALSLFLLVPGVHFWIDVLAMV LFGLGIGVLICFLGGLMAVDIAPRNASGAALGVVGIASYIGAGLQDVMSGVLIEGQKMVQ NGVDVYDFTYINWFWIGAALLSAFFALLVWNVKPK >gi|225935369|gb|ACGA01000023.1| GENE 67 90410 - 91321 888 303 aa, chain - ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 53 303 20 271 295 143 34.0 5e-34 MLELMKRICNFILLLCLVQLAVAQNRIAEIRENLLGNHPDKILVVSHRADWRNAPENSLQ GIQNCIDMGVDMVEIDLKRTKDGHLVVMHDKTINRTMSGKGLVEDYTLAELKAMRLKNGV ACKTRHQIPTLEEVMLLCKGKIMVNIDKGYDYFQEAYIVLEKTGTVDQCVIKAGLPYEQV KAENGAVLDKVIFMPIVQLHKKGAEAIIDSYKIHMKPAAYELVFDNDSPEVLNLIKKVRD TGSKLFINSLWPELCGGHDDDRAVELHQPDESWGWIINQGAKLIQTDRPALLLEYLRKKK LHD >gi|225935369|gb|ACGA01000023.1| GENE 68 91330 - 92388 920 352 aa, chain - ## HITS:1 COG:no KEGG:BT_0194 NR:ns ## KEGG: BT_0194 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 352 1 352 352 640 84.0 0 MDIIRKIQYLLFCLLAIGFVACDDDDNNSIETGHEGILTQLAEEVDATAQQLWSSSPLIV NTGRTTTLTKIQGYADKCKDDYFISYLNGFDQASTSMEKCDPIIYFYRSAFDRVMDGIKN SKVENGTAAIWLLYNMGYVVKTPSGCFAIDISHRWAKELAPYIDFLCVTHKHSDHYDNDL IQAMFGLGKPVLSNYLKDTTYPYTAKGDKDYEIGKFKIKTCITDHNNAGLSNFVTVFSID CGEDTGNFVFMHVGDSNYKPEQYTNLASHVNVLIPRYAPNALTENNILGSGAGQVEPDYV LLSHILELAHAGVDESRWSLDMALERASKINCEQTYVPMWGEKLVWKNNKLN >gi|225935369|gb|ACGA01000023.1| GENE 69 92390 - 93271 642 293 aa, chain - ## HITS:1 COG:MJ1311 KEGG:ns NR:ns ## COG: MJ1311 COG1082 # Protein_GI_number: 15669501 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Methanococcus jannaschii # 86 293 77 288 293 77 28.0 2e-14 MRVPVFILFMAMTMGLYAQKYKVGTTTALWKVPASTDFSQARSVGVEYVEVAFNQCYRGV PADEVVPRIRDMKAKIDSAGIKVWSIHLPFSRTLDISVLDEKKRKENVDFMAEMIGQCAQ FQPKCLVLHPSSEPIADSIRAQRITNASRSIAYLKKYADQIGAQLCIENLPRTCLGNTPE 
ELLEIIKDIPGVKVCFDTNHYTKGTTEHFVETVGERIGTIHASDFDFVNECHWLPTQGDI QWGKLMYALEGIGYEGVFMYEATKDHENDNTRPTPERVVETFNKIINDYKNQK >gi|225935369|gb|ACGA01000023.1| GENE 70 93278 - 95242 1743 654 aa, chain - ## HITS:1 COG:MA4287_3 KEGG:ns NR:ns ## COG: MA4287_3 COG1520 # Protein_GI_number: 20093076 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Methanosarcina acetivorans str.C2A # 340 617 27 311 337 92 27.0 3e-18 MKKILYSLLFIFTVVFTACEEDLPKASFDLYELKSLTATAGDMNVTLSWEAYENARPNEY LILWTSGSSEAEGGEMTVDAKTMTATINNLVNDVAYTFSVQPRYAGGLASKTTAACTPKN ARYPISDLTAAAGNERVRLRWTKPASERFTRYQVTVNPGNQIINLDDTSLEEYIVDGLTN DQEYTFNVVCVYPTGNSIAVETSATPGLIYPILASTELVVWEPSTFAYNDMYFMAGEVKS VSWDFGDGTTSGENNPVHAFTTTGTYTVAVTVTYVNNTTESGSLTVTVGNYKWNSVDLNF GGLTGYVKTSNPVFSPDGKTMYIPTSTPAGHLFAIDVVSGEFKWVFAISQITYGGGALVA PDGTIYQCVRNATINNVYAINPNGTQKWAVKLDAAIGAFPALSADGVLYCLTNKSTLYAL DASSGAIKWQQSLDGATGSAVAIDKAGNVYAGTSAAIYSFKPNKEQNWKLEEVNVTEQAT FALKDQVLYATLKNGGLVAVDMTNGTKKWTYPTTKGDAYFPIADKNGNVYFTEKGSQTVH AVNASGSKIWEKNVGNNLNYSGGALSTDGILYIGTQSNNKVLGLDITNGNIVFEETVGQQ VMAAVSIGPDRRLYCGTIGSNNIGSIKAFAVNKTLATDSWSIRGGDIQGTNRQK >gi|225935369|gb|ACGA01000023.1| GENE 71 95270 - 96661 1295 463 aa, chain - ## HITS:1 COG:no KEGG:BT_0191 NR:ns ## KEGG: BT_0191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 463 1 463 463 874 91.0 0 MKKYKLILFLSTALLSVACTDSFFDLEPSSSVPTDKVYKTAEDFNVAVIGCYSKLQTQVS YFTECCEYRSDNLTLSAPTAGTQDRYDIDQFADKASNGILEDAWANFNNGVYRCNLVLDR IDEANFDATLKKQYKGEALFIRALTYFNMYRLWGGIPMTNKVVTVAEALKIGRSSDQQVY DFLVGDLNQVINENMLPSSYTSADMGRVTSGAAMALLGKIYLTFHKWTEARNVLSQLVGR YSLMTTPEQVFDVNNKMNDEIIFAVRFNKDVEGEGHGYWFSIINLTDDTNQTKSLKECYK DGDKRKNLITYVKVEDKVCVMNKFKDVKSATYNTVGNDQIILRYADVLLMYAEALNEISY SNLQNSEAMVALNAVHTRAGLSPLQITELPDQDSFRKAIMLERQQEFPYEGQRWFDLVRM GGAKEAMKAEGHIIQDYQFLYPIPKTELERINNTELLWQNTGY >gi|225935369|gb|ACGA01000023.1| GENE 72 96685 - 100128 3167 1147 aa, chain - ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1147 1 1146 1146 2100 93.0 0 MNNLLICKGISKNSCSFLAFLLFTFILIAPAAAQTLKEHDITLRVQNEPVESVFNKISKQ TNFKFLYDQETVNKAPRVSFDIKNASLKQILGEITSQAKLYFNRTDHTIAVSKQPLKKEE TAQRARTIQGIVADDKGEPVIGASVQIKGEGSGTITDIDGRYSLMNVPESATLTISYIGY KTVNISAKDKNTAKITLVEDSKMIDEVVVVGYGVQRKRDVSTSISSVKAEQIADVSASDF RQALAGKMPGVQVTQPSGDPEGSVSIRVRGISTVNAGSDPLYIIDGVPVERGFANLNNND VESVEVLKDASSAAIYGSRGSNGVIIITTKQGQSEKMKVQYDGYYGIQSVSKKLSMMNAY QFAEFAKDGHDNAYLDANPGGSPDDPNGMRPNSWERIPTELFPYLNGDKGLTDTDWQDAI FRTAATTSHNVSISGRGKTVGYFISANYYDKEGIIINSDFKKYSMRMNLDGKYKRLKFGL NFSPSYSTSNRVDASGSNGIVQSALMMPPVWPVYNADGSYNYQGNGYWRIGNDYQHNAVL NPVAMANLQSDVVDRMAIVGKVFAELELYKGLTYNISFGGDYYGSHNDQYRSSELPLLGQ KYYDVKSNPTAYSSSGFYFNWLVENKINYNTTIKDAHSINVVLVQSAQKETYKGDNVTAT DFPNDYIQTISGGTVTKGASDKTQWTIASYLARMQYSYKGKYMASAAIRADGSSRFGKNN RWGYFPSASLAWRISGEDFFMKAKCLSFVDDLKLRASYGVTGNFQIGNYDHLSLMALDNY VLGTGNGQLVNGYKPSTIKNEDLSWEKNAMVNVGVDLQMFKGLLGLTVDYYNTNTSNMLL NVPVPHLTGYSTALMNIGKVNNRGWEVALTSQKNITKDFGYSFNANYATNTNEVKALGPG NAPIISTGSVDHAYYITKVGEPIGCYYLLVQDGIFSNEEELKKYPHFSNTQPGDFRFVDV DGDGVMDLDKDRTIVGNYMPDFTYGFGGKVWFKGFDLDFNFQGVYGNEILNLNRRYIDNL EGNTNGTTIALNRWKSPEDPGNGQVNRANRKSKGYNGRTSTWHLEDGSYLRLQNVTLGYT LPKNLTQRFFVEKLRVYVSGQNLWTSTNYSGYNPEVNARPSNSLSPGEDYGTYPLAKTFL FGLNITL >gi|225935369|gb|ACGA01000023.1| GENE 73 100337 - 101323 824 328 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 74 293 59 290 323 69 28.0 7e-12 MKDTKTEKDKLHRYLDDLYTRDDASQLLQSIKDSENQDILDELAAEVWEESGNLQPGNDL EREKYKREARQLLKRIEHKKRTWFRRASVVAVSTAAVIAIIIGSVNFFRYMNEQQVTLAE ITTSFGEKRQVTLPDGTLLVLNSCSQVRYPDRFVGDLREVELEGEGYFRVARNEKMPFVV RTKRLDVQVLGTRFDVKSYSTDEIVSVSVESGKVQVDLPEAMMRLTAKEQVLINTVSGEY SKKTEDRGVAVWMKGGLRFYSTPIRDVAKELERVYNCRITFAPGQDFNNLITGEHDNKSL EAVLKSIEFISGDIKYKKEGINVLLYKE 
>gi|225935369|gb|ACGA01000023.1| GENE 74 101413 - 102015 586 200 aa, chain - ## HITS:1 COG:RSc2361 KEGG:ns NR:ns ## COG: RSc2361 COG1595 # Protein_GI_number: 17547080 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 13 181 36 210 213 71 29.0 9e-13 MYATDTLMDERELVLRLIDGDEEAFCELYAAYKNRLLYFALKFVKSREFAEDIFQDAFTV VWQTRRFINPEASFSSYLYTIVRNRILNQLRDMANEDQLKEQILSQAIDSTNETNNHILL NDLKEIIARALEQLTPRQREVFKMSRELQMSHKEIAEALGVSVHTVQEHISLSLKVIRSY LTKYSSTSADVLLILLCLNL >gi|225935369|gb|ACGA01000023.1| GENE 75 102211 - 102915 564 234 aa, chain + ## HITS:1 COG:yhhW KEGG:ns NR:ns ## COG: yhhW COG1741 # Protein_GI_number: 16131311 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Escherichia coli K12 # 7 233 6 229 231 168 37.0 9e-42 MKKVIDRASSRGYFNHGWLKTHHTFSFANYYNPERIHFGALRVLNDDSVDPSMGFDTHPH KNMEVISIPLKGYLKHGDSVQNTKTITPGDIQVMSTGSGIYHSEYNRSDKEQLEFLQIWI FPRVENTRPEYNNFDIRPLLKQNELALILSPNGKTPASIKQDAWFSMGSFDTEKTIEYKM HQEGNGVYIFVLEGEITVAEERLSKRDGIGVWDTKEFPIHILKGTQLLLIEVPM >gi|225935369|gb|ACGA01000023.1| GENE 76 103021 - 105078 1934 685 aa, chain - ## HITS:1 COG:HI0885 KEGG:ns NR:ns ## COG: HI0885 COG4232 # Protein_GI_number: 16272825 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Haemophilus influenzae # 1 598 1 521 579 96 24.0 2e-19 MRKIISFLLLSFVVYALQAQIKDPVKFKTELTALSDTEAEVVFTATMDKGWHVYSTDLGD GGPISATFNVDNKSGVELVGKLKPVGKEVSTFDKLFEMKVRYFENTAKFVQKVKFTGGAY AIEGYLEYGACDDESCLPPTQVPFKYSGVAKAGNAAATKTEQPEQKVADKQKEEPVPAAT KDSSAMMELVPATTTEAATDIQPAVASGDLWKPVISDLQALGEEHGQEDMSWIYIFITGF LGGLLALFTPCVWPIIPMTVSFFLKRSKDKKKGIRDAWTYGASIVVIYVALGLAITLIFG ASALNALSTNAIFNILFFLMLVIFAASFFGAFEIRLPSKWGNAVDSKAESTTGLLSIFLM AFTLSLVSFSCTGPIIGFLLVQVSTTGSVVAPAIGMLGFAIALALPFTLFALFPSWLKSM PKSGGWMNVIKVTLGFLELAFALKFLSVADLAYGWRLLDRETFLALWIVIFALLGFYLLG 
KIKFPHDDDDNKVGVTRFFMALISLAFAVYMVPGLWGAPLKAVSAFAPPMQTQDFNLHKN DVHAKFDDYDLGMEYARLNGKPVMLDFTGYGCVNCRKMEAAVWTDPKVSDLINNDYVLIT LYVDNKTPLTEPVKIIENGTERTLRTVGDKWSYLQRVKFGANAQPFYVLLDNQGKPLNKS YAYNEDIPKYIEFLQTGLENYKKGK >gi|225935369|gb|ACGA01000023.1| GENE 77 105094 - 105393 373 99 aa, chain - ## HITS:1 COG:no KEGG:BF3194 NR:ns ## KEGG: BF3194 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 1 99 99 169 87.0 3e-41 MVKHIVLFKLKDEVPAEEKLEVMTKFKEAIEALPAKISVIRKVEVGLNMNPGETWNIALY SEFDTLEDVKFYATHPDHVAAGKILAETKEGRACVDYEL >gi|225935369|gb|ACGA01000023.1| GENE 78 105460 - 106113 504 217 aa, chain + ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 13 212 3 202 211 223 54.0 2e-58 MQKYSIWFKYTFKEMLIIGIAGGTGSGKTTVVRKIIESLPAGEVVLLPQDSYYKDSSHVP VEERQNINFDHPDAFEWSLLSKHVMTLKEGKSIEQPTYSYLTCTRQPETIHIEPREVVII EGILALCDKKLRNMMDLKIFVDADPDERLIRVIQRDVIERGRTAEAVMERYTRVLKPMHL QFIEPCKRYADLIVPEGGSNRVAIDILTMYIKKHLKN >gi|225935369|gb|ACGA01000023.1| GENE 79 106131 - 107522 1082 463 aa, chain + ## HITS:1 COG:VC0866 KEGG:ns NR:ns ## COG: VC0866 COG4623 # Protein_GI_number: 15640882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Vibrio cholerae # 31 456 30 460 530 163 28.0 7e-40 MNVKTLSIPLFLILFCCILGCRNKHHNKMDETARDLPQIKDSGELVVLTLYSSTSYFIYR GQEMGFQYELSEQFAKSLGLKLRIEVANSVDEMIQKLLAGEGDMIAYNLPITKEWKDSLL YCGEDVITHQVIVQQGRGKQKPLEDVTELVGKDIYVKPGKYYDRLVNLNSELGGGIRIHE VTNDSITIEDLITQVAQGKIPYTVADNDLARLNKTYYPNLNIDLSISFDQRSSWAVRKDS PELAAAATQWHQENMTSPAYTASMKRYFENSKMMPHSPILSLKEGKISHYDHLFKKYSKD IGWDWRMLASLAYTESNFDTTAVSWAGAKGLMQLMPATARAMGVPPGKEQNPEESVKAAI KYITATDRSFSMIPDKQERLNFILASYNAGLGHIYDAMALAEKYGKNKLVWKDNVENFIL LKSNQEYFNDPVCKNGYFRGIETYNFVRDIMSRYESYKKKIKA >gi|225935369|gb|ACGA01000023.1| GENE 80 107551 - 
109098 1415 515 aa, chain - ## HITS:1 COG:MTH1856 KEGG:ns NR:ns ## COG: MTH1856 COG0591 # Protein_GI_number: 15679844 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanothermobacter thermautotrophicus # 1 512 1 511 526 485 50.0 1e-136 MNTFTLGLIVVAYLLSLAYLGFLGYKKTTNTSDYLVGGRQMNPIVMALSYGATFISASAI VGFGGVAAAFGMGIQWLCFLNMFVGVVIAFIFFGLRTRRMGAKLNVSTFPQLLGRHFRSR NIQVFIAAVIFIGMPLYAAVVMKGGAVFIEQIFQIDFNISLLIFTLVIAAYVIAGGMKGV MYTDALQAVIMFACMLFLLFSLYQVLGMGFTEANKELTAIAPLVPEKFKALGHQGWTAMP VMGSPQWYSLVTSLILGVGIGCLAQPQLVVRFMTVESSKQLNRGVFIGCFFLIITVGAIY HAGALSNLFFLKTEGAVATEVVQDIDKIIPYFINKAMPDWFAALFMLCILSASMSTLSSQ FHTMGASVGSDIYGTYKPRSRNKLTNVIRLGVLFSILVSYIICYMLPHDIIARGTSIFMG ICAAAFLPAYFCALYWKKATKQGVMASLWVGTIGSLFALVFLHQKESAALGICKALFGRD VLITTYPFPVIDPILFALPLSVLAIVVISLMTYKK >gi|225935369|gb|ACGA01000023.1| GENE 81 109095 - 109220 74 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153807903|ref|ZP_01960571.1| ## NR: gi|153807903|ref|ZP_01960571.1| hypothetical protein BACCAC_02189 [Bacteroides caccae ATCC 43185] # 1 41 1 41 41 78 100.0 1e-13 MFGIDDPFIILPYLLSVVCVIFAAWFGLKYWNKDDEKDETR >gi|225935369|gb|ACGA01000023.1| GENE 82 109292 - 109891 598 199 aa, chain - ## HITS:1 COG:MTH120 KEGG:ns NR:ns ## COG: MTH120 COG0778 # Protein_GI_number: 15678148 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanothermobacter thermautotrophicus # 36 199 25 191 191 159 44.0 4e-39 MKKIFSFLCLIVAVTAMTSCSSAKEEKGTSGTGNAVLDNIFERKSVRAYLNKGVEKEKID LMLRAGMAAPTGRDIRPWEFVVVSDRAKLDSMAAALPYAKMLTQARNAIIVCGDSVRSSY WYLDCSAAAQNILLAAESLGLGAVWTAAYPYEDRMQVVRKYTNLPDNILPLCVIPFGYPA TKENPKQKFDEKKIHYNQY >gi|225935369|gb|ACGA01000023.1| GENE 83 109903 - 112650 2555 915 aa, chain - ## HITS:1 COG:VC0390_2 KEGG:ns NR:ns ## COG: VC0390_2 COG1410 # Protein_GI_number: 15640417 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Vibrio 
cholerae # 324 914 1 590 899 703 59.0 0 MKKTISQIVSERILILDGAMGTMIQQYNLKEEDFRGERFAHIPGQLKGNNDLLCLTRPDV IQDIHRKYLEAGADIIETNTFSSTTVSMADYHVEEYVREMNLAAVKLARDLADEYTAKNP DKPRFVAGSVGPTNKTCSMSPDVNNPAYRALSYDELAASYQQQMEAMLEGGVDAILIETI FDTLNAKTAIFAAEQAMKATGVEVPIMLSVTVSDIGGRTLSGQTLDAFLASVQHANIFSV GLNCSFGARQLKPFLEQLASHAPYYISAYPNAGLPNSLGKYDQTPADMAHEVREYVEEGL INIIGGCCGTTDAYIAEYPALVEGAKPHVPASAPDCMWLSGLELLEVKPEINFVNVGERC NVAGSRKFLRLINEKKYDEALSIARQQVEDGALVIDVNMDDGLLEAKTEMTTFLNLIMSE PEIARVPIMIDSSKWEVIEAGLKCLQGKSIVNSISLKEGEEVFLEHARIIRQYGAAAVVM AFDEKGQADTAARKIEVCQRAYRLLVDKIGFNPHDIIFDPNVLAVATGIEEHNNYAVDFI EATAWIKKNLPGAHISGGVSNLSFSFRGNNYIREAMHAVFLYHAIQQGMDMGIVNPGTSV LYTDIPADVLEKIEDVVLNRRPDAAERLIELAESLKATMSGTAGQPAVKHDAWREESVQE RLKYALMKGIGDFLEQDLAEALPLYDKAVNVIEGPLMDGMNHVGELFGAGKMFLPQVVKT ARTMKKAVAILQPIIESEKVEGSASAGKVLLATVKGDVHDIGKNIVAVVMACNGYDIVDL GVMVPAETIVQRAIEEKVDMIGLSGLITPSLEEMAHVAVELEKAGLDIPLLIGGATTSKM HTALKIAPVYHAPVVHLKDASQNASVASKLLNPQAKAELVNELETEYEALREKSGLMKRE TVSLEEAQKNKLNLF >gi|225935369|gb|ACGA01000023.1| GENE 84 112740 - 113192 445 150 aa, chain - ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 10 149 16 155 158 135 48.0 4e-32 MKQPPVNIKNKRATFDYELIDTYTAGIVLTGTEIKSIRLGKASLVDTFCYFTKGELWVKN MHIAEYFYGSYNNHTARRERKLLLNKKELEKFQREMKNPGFTIVPVRLFINEKGLAKLVI ALAKGKKEYDKRESIKEKDDRRDMARMFKR >gi|225935369|gb|ACGA01000023.1| GENE 85 113202 - 113747 441 181 aa, chain - ## HITS:1 COG:no KEGG:BT_0178 NR:ns ## KEGG: BT_0178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 310 91.0 1e-83 MKLISSPAKAWEEISMEEDRRKVYIAFVYPMIGLCGLSVFIGSLLTNGWGGPQSFQIAMT NCCAVAVALFGGYFLAAYAINEMGTRMFGMHSNMPLTQQFAGYALVVPFLLQIVTGLLPD FRIIAWLLQFYIVYVVWEGVPVLMGVEEKQRLKYTLLSSVLLILCPAVIQIVFNRLTAIL N >gi|225935369|gb|ACGA01000023.1| GENE 86 113780 - 114544 456 254 aa, chain - ## 
HITS:1 COG:no KEGG:BT_0177 NR:ns ## KEGG: BT_0177 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 247 1 258 263 145 40.0 9e-34 MFMKQVKFFLVALMAVVMGMSVTSCMNGDDNHNVTMTVPVKYNYSSFLMGDGTTKLVPTT ELGLLDGIMYIISCQYDQSQVSANSNSIPVTLLSTPLCIDPKGDESLSTTKTKPTNPLYT LDKQQSSLVYYDKNTIVLTMPYWVKVTNSSVEESELKKHSFCLSYNPDEIESNATKLNLY ISHRVEDGEESVTRSNFTYAYRAYSISSALYQFKEKTGKLPQYLVLKAETNSSKDELKDE NGETSVEYQYAFTE >gi|225935369|gb|ACGA01000023.1| GENE 87 114621 - 115121 544 166 aa, chain - ## HITS:1 COG:no KEGG:BT_0176 NR:ns ## KEGG: BT_0176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 166 1 162 162 277 93.0 1e-73 MIEVLESALQKAAGEGMDEFIQAFTDKYKEIIGGELTAETMPLLTGEQHSLLAYQIFRDE IMVGGFCQLIQNGYGGYIFDNPFAKVMRLWGAEEFSKLVYKAKKIFDSNRKDLEKERTDD EFMAMYEQYEAFDELEEAYLEMEEQVTALIASYVDDHLELFAKIIK >gi|225935369|gb|ACGA01000023.1| GENE 88 115723 - 116127 278 134 aa, chain - ## HITS:1 COG:no KEGG:BT_0175 NR:ns ## KEGG: BT_0175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 134 201 86.0 6e-51 MELNQVDIHYLIAAICVISSALIFYTIGVWGERLQRKLKFWHLIFFLLGLLVDTVGTSLM EHIAELTHLHDEMHTVTGAIAILLMFVHALWAIWTYVKGTPIEKRHFNRFSIVVWCIWLI PYLIGVYLGMRLHV >gi|225935369|gb|ACGA01000023.1| GENE 89 116366 - 117067 921 233 aa, chain + ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 233 1 230 230 371 79.0 1e-103 MTYSHEVEHMCVVKKGPNHGPAPIPEEGKWVKSKEIVDISGLTHGIGWCAPQQGACKLTL NVKEGIIQEALIETIGCSGMTHSAAMASEILPGKTVLEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEGGLIIGAGLEDLGKGLRSQVGTLYGTLAKGPRYLEMAEGYIKQIFLDKN DEICGYEFVHMGKFMDEIKKGTDANEALKKVTGTYGRVTAEQGAVKSIDPRHE >gi|225935369|gb|ACGA01000023.1| GENE 90 117087 - 118097 1329 336 aa, chain + ## HITS:1 COG:no KEGG:BF3207 NR:ns ## KEGG: BF3207 # 
Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 336 1 336 336 647 97.0 0 MIREVKFESQDRRIKGIIEALNANGIKDIEEANAICEAAGLDPYKTCEETQPICFENAKW AYVVGTAIALKKNCTNAAEAAEAIGIGLQAFCIPGSVADDRKVGIGHGNLAAMLLREETK CFAFLAGHESFAAAEGAIKIAAKADKVRKEPLRCILNGLGKDAAQIISRINGFTYVQTQF DYFTGELKVVREIAYSDGPRAKVKCYGADDVREGVAIMWKEGVDVSITGNSTNPTRFQHP VAGTYKKERVLAGKPYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSS SVPAHVEMMGFLGIGNNPMVGCTVACAVDVAQALGK >gi|225935369|gb|ACGA01000023.1| GENE 91 118439 - 118792 378 117 aa, chain + ## HITS:1 COG:no KEGG:BT_0170 NR:ns ## KEGG: BT_0170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 115 1 115 117 157 86.0 8e-38 MLLTFLNTLATIISVISLLIVTYGVLVCFIAFLRNEIKRFNGTYTVYNIRQLRADFGTYL LLGLEFLIASDILKTVVDPTLDELAILGGVVVVRTVLSVFLNKEIKELAENDSAKEI >gi|225935369|gb|ACGA01000023.1| GENE 92 118921 - 120033 541 370 aa, chain + ## HITS:1 COG:no KEGG:Sden_2170 NR:ns ## KEGG: Sden_2170 # Name: not_defined # Def: helix-turn-helix, AraC type # Organism: S.denitrificans # Pathway: not_defined # 129 365 76 323 328 67 26.0 1e-09 MGITHLSIELTLDLIALIIGIILIIRAKDNYPKLYWGIIATGIGIMFSWENIGWLTIVTD TPEYNFTELLNIEKMLKWYALANIVALFPIASLSPGYLNHFRIFTFLLLPIITITVGISY LGFNGNITPIHSIDEIIPNIHQIDVKLRACIFLLSVFTPLVLLIYPMMNNKTYRRINNNM YLFIGFLFVFLGIYILFTLNINEFVFNLFGIMAIVFTVLFSIQYLRYENPFSNHINMIHN AKNTESTIMLQAGKPSPTLPLFSTIEAYLQEQHPYTDQRYNIELLAKSLNKQEHDISAAI KSQGFTGFREYINYLRLEYFRQLAAENPKSNVKELMFACGFTSRATFYRNFSEKYGVSPS KFIENQRVEQ >gi|225935369|gb|ACGA01000023.1| GENE 93 120199 - 120585 212 128 aa, chain + ## HITS:1 COG:no KEGG:BVU_3800 NR:ns ## KEGG: BVU_3800 # Name: not_defined # Def: two-component system response regulator # Organism: B.vulgatus # Pathway: not_defined # 17 125 143 251 254 102 44.0 4e-21 MNIDNCNHTDTEKHQEEKIFLYTTITGNLQLIHLEQIGYFRYNSKSKLWEAVLNNNHTLA LKRNTNSQKILSLHPYLIQISQSHIINISYLVSIEDNNCILLPPFNEMELLQVSKSFMKK LKEKYPCI >gi|225935369|gb|ACGA01000023.1| GENE 94 120610 - 121011 334 133 aa, chain - ## HITS:1 
COG:CC3636 KEGG:ns NR:ns ## COG: CC3636 COG0545 # Protein_GI_number: 16127866 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Caulobacter vibrioides # 3 132 37 167 177 113 43.0 7e-26 MGKKKEYKEANRRFLKKLSFQEGVFALPCGIYYKVLETGESTISPGARSIVTVHYKGSLI DGRVFDNSYERTCPDALRLSDVIEGWQVALQKMHVGDKWIIYIPYAMGYGIKSFDSIPAY STLIFEVELLGVA >gi|225935369|gb|ACGA01000023.1| GENE 95 121111 - 126741 4538 1876 aa, chain + ## HITS:1 COG:FN0579 KEGG:ns NR:ns ## COG: FN0579 COG2373 # Protein_GI_number: 19703914 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Fusobacterium nucleatum # 262 1869 54 1604 1611 397 24.0 1e-109 MGQTKTTRSISATGLFLLIMMTVGLYSCTRTQKDIIPSADYAPYVNAYTGGVISQNSTIR IELTHDQPMVDMNNELKSNPFSFSPSLKGKAYWVSNNTIEFVPEEGALKPGTLYEGTFRL GDFIEVDKKLKEFNFSFRVQERNFTLQLESLPITATQPNEINIKGEIRFSDVVKKEEVEK MLTASDGKKSYPVEVTATDNHTRYLFSIRQIPREADDYPLTITANGNAAGIDRKQSEEVL IPAKDCFRFMSAERIDQPENGIEIVFSAPLSTTQDLKGLIEIPEISSSIFQISENRVFIY FEANTQNKLTLNIHEGVKDCQGKALGTSHTISFSEVSLKPQVEMSTTAAILPDSKSLIIP FRAVNLYAVDLSVIRIFENNVLMFMQTNSLASANELRRSGRLVYKKTLWLAKDASKDIHH WGDYSIDLAGLIHQEPGAIYRVILSFRQEYSAYPCGGGENQDMKFADSSTSDGLTKVSGS VLSEEDEAIWNTPEAYYYYNGGTMDWSVYRWTERDNPCHPSYYMDSDRAAACNVLASNLG MIVKRNSLNKLWIAVSNILDTKPIGKAQVTAYNFQLQPIGKGETNGEGFVEIAPNGVPFI IVAESEKQKAYVRVVDGEEQSVSRFDVGGKDIQKGLKGFIYGERGVWRPGDTLHISFILE DREKRIPDKHPVALEIYNPRGQFYTKMISTQGMNGFYTFDVPTQATDPTGLWNAYIKVGG TTFHKGLRIETIKPNRLKINLALPKVLQATDKDFYAPLTSTWLTGATASKLKAKVEMSLS KVNTQFKNYGQYIFNNPATDFTTIKTDIFDGTLDAEGKANVMLKVPTATEAPGMLNATFT TRVFEPGGDASIYTQTIPFSPFTSYVGINLNQPKGKYIETDKDHVFDIVTVNTQGQLVNS SNLEYKIYRIGWSWWWENSGESFGTYINNSSITPVASGNLQTRGGKASFKFRIDYPSWGR YLVYVKDKESGHATGGTVYVDWPEWRGRSSKTDPSGIKMLAFSLNKDSYEIGETATAIIP AAAGGRALVSIENGSTVLRQEWIEVSNGGDTKYTFKITPEMTPNVYLHISLLQPHAQTVN DLPIRMYGVVPVFVTNSQTVLQPQIQMPEVLRPETNFNVTVSEKTGKPMTYTLAIVDDGL LDLTNFKTPDPWNDFYSREALGIRTWDMYDNVLGASAGSYSSLFSTGGDATLKPADAKAN 
RFKPVVKFIGPFYLGKGKSQTHTLKLPMYVGSVRAMVVAGQEGAYGNAEKTAFVRTPLMM LSTLPRVLSIQEEITVPVNIFAMENQVKNVTVSLQASGGGVQIVGANQQSLKFSQPGDQL VFFTLKTGSKTGKATIHLTANGGGQQTKETIEIEVRNPNPVVTLRNSQWVEAGQSKELSY NLSSSSANNQIKLEVSRIPSVDISRRFDFLYNYQHHCTEQLTSKALPLLFVGQFKTIDKI EAEKIKTNIQEAIRQIYGRQLPNGGFVYWPGNAVADEWISSYAGMFLTLAQEKGYAVHSN VLNKWKRFQRAAAQNWRMPQDASGWQQWQSELQQAFRLYTLALAGVPEYGAMNRMKEQAG LSIQAKWRLATTYALTGKMKPAEELVYNAETTVSPYSSMNQIYGSSDRDEAMILETLILM NRERDALQQAKVVSKNLSQEEWFSTQSTAFALMAMGRLAEKLSGTLDFVWTWNDKQQPAV KSAKAVFEKEIATTPKSGMIAVKNQGKGALSVDLITRTQLLNDTLPAISDNLRMDIRYAN LNGTPISVNDIIQGTDFMAITSISNISGTSDYTNLALTHIIPSGWEIYNERMVAPETESG AADGSGKSVSKYNYLDIRDDRVLTYFNLRRGETKVFTVRLQATYAGNFILPAVQCEAMYD VNVQARSKAGRTTVSR >gi|225935369|gb|ACGA01000023.1| GENE 96 126853 - 129276 1267 807 aa, chain + ## HITS:1 COG:FN0580 KEGG:ns NR:ns ## COG: FN0580 COG4953 # Protein_GI_number: 19703915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Fusobacterium nucleatum # 29 806 5 721 724 374 32.0 1e-103 MNPIYKFFKRLSVTKKVMTISIALLMIGYIFCLPRQLFHVPYSTVVTDRNEELLGARIAS DGQWRFPPRKTTPEKIKQCLITFEDKHFYHHWGVNPLSTGRALYQNLKNKRVVSGGSTLT MQTIRLARNKPRTIGEKVIEMIWATRLEFRTSKEEILSMYVSHAPFGGNVVGLDAAAWRY FGHSAEDLSWAESAMLAVLPNAPAMIHLSKGRKTLLSKRNRLLKQLFEEEIIDASTYELA ISEPLPDEPHPLPQIAPHLVTRFYQERNGLYTRSTIDKGIQTHIESLAERWSNEFNRSDI RNLAILIIDIPTNQVVAYCGNVHFDRKQGGNQVDVIQAPRSTGSILKPFLYGAMLQEGSL LPQMLLPDVPVNINGFTPQNFSMQFEGAVPASEALARSLNIPAVTMLQRYGVPKFHHMLQ QMGFKTINRSASHYGLSLILGGAEATLWDVTNAYAQMGRSLSNSHSNDLPQEKEVQILLG TEEKTVSERDGSRKVTSGKTISRKTTSRKDISEGVISEGVISAGAAWLTLSALTEVNRPE EIDWKSIPSMQTIAWKTGTSYGFRDAWAVGVTPRYAVGVWVGNATGEGKPGLVGAQTAGP VLFDIFNYLPSSPWFERPTEIFVDAEICRQSGHLKGRFCEETDTVLILPVGLRTEACPYH HLVTLSADESHRIYENCANTEPTIQKSWFALPPVWEWYYKQHHPEYKPLPPFKAGCGEDS FQPMQFIYPPMNAHIKLPKQLDGSKGFITVELAHSNPNATIFWHLDDTYQTQTQDFHKIS LQPAPGKHSLTAVDGEGNTVSTTFFIE >gi|225935369|gb|ACGA01000023.1| GENE 97 129303 - 130478 921 391 aa, chain + ## HITS:1 COG:CAC3482 KEGG:ns NR:ns ## COG: CAC3482 
COG0477 # Protein_GI_number: 15896719 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 15 388 19 387 394 247 40.0 3e-65 MKQPLKENGGLPASILWTLAIVAGVSVANIYYIQPLLNMIRHELGISEFRTNLIAMVTQI GYAAGLLFITPLGDLYQRKKIILVNFTILIFSLLTIALTHNFHLILIASFLTGVCSMIPQ IFIPIAAQFSRPENKGRNVGIVLSGLLTGILASRVVSGFIGELIGWREMYHIAAGMMFIC AIVVLKVLPDIQTNFQGKYSGLMKSLLALVKEYPQLRIYSIRAALNFGSLLAMWSCLAFK MGQAPFFANSDVIGMLGLCGVAGALTASFVGRYVKQVGVRRFNFIGCGLILFAWLLFFIG ENTFIGIIAGIIIIDIGMQCIQLSNQTSIFELDPRASNRINTVFMTTYFIGGSMGTFLAG SFWQLYGWHGVIGTGVVLTGISLLITTFYKK >gi|225935369|gb|ACGA01000023.1| GENE 98 130525 - 130959 242 144 aa, chain + ## HITS:1 COG:no KEGG:BT_0160 NR:ns ## KEGG: BT_0160 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 144 228 89.0 5e-59 MKYGVNRQILLITAGTVWIIAGANILRIGIVTWLNTSQDWMFKIGEATVVFLLFFVLVFR RLYYKHTQRIEQKKEHKNCPFSFFDVKSWITMIFMISLGITIRSFHLLPETFISVFYTGL SIALILTGVLFIRYWWIRRKTFAN >gi|225935369|gb|ACGA01000023.1| GENE 99 130923 - 131159 61 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293369470|ref|ZP_06616053.1| ## NR: gi|293369470|ref|ZP_06616053.1| hypothetical protein CUY_4296 [Bacteroides ovatus SD CMC 3f] # 1 78 1 78 78 126 97.0 5e-28 MIGINDTRGEGAFFMLILITFYLFSGYCFVYSRAKIQQKRMKSFASKCQAIKIIISLYLI GICIRPAQLAKVLRLIHQ >gi|225935369|gb|ACGA01000023.1| GENE 100 131116 - 132543 927 475 aa, chain + ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 11 474 1 445 473 270 35.0 4e-72 MKKAPSPLVSLIPIIVLVLLLFATIRTFGSDALSGGSQVSLLTTTAICILIGMVFYKIPW KDYELAITNNIAGVATAIIILLIIGALSGIWMISGVVPTLIYYGMQIIHPSFFLASTCII CALISVMTGSSWTTIATIGIALMGIGKAQGFEDGWIAGAIISGAYFGDKISPLSETTILA 
SSITDTPLFRHIRYMMITTVPSLVITLIIFTVAGFSHDANNTQHIAEVATALNEKFHITP WLLIVPVVTGILIAKKVPSIVTLFLSTLLAAVFALIFQPDLLEEISGIAASGFDSLFKGL MMTIYGETNLHTDNAVLTDLIATRGMSGMMNTIWLILCAMCFGGAMTASGMVGSITSIFV RFMKKTVSVVSATVCSGLFLNLTTADQYISIILTGNMFRDIYAKKGYESCLLSRTTEDAV TVTSVLIPWNSCGMTQATILSVPTLVYLPYCFFNIISPLMSITVAAIGYKIARRS >gi|225935369|gb|ACGA01000023.1| GENE 101 132611 - 133201 702 196 aa, chain + ## HITS:1 COG:no KEGG:BT_0157 NR:ns ## KEGG: BT_0157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 301 94.0 1e-80 MKKFIALVALVLVSASTMMYAQESNAAARRAERKAERDAERAKLRAEEEVQDMVAYQQAV QALKNKQFVLEANQVVFRNGMSAFVTSNTNFVLMNGNRATVQTAFNTPYPGPNGIGGVTV DGNSSDMKMNIDKKGNVNCSFSVQGIGISAQVFINMSSGNNTASVSISPNFSNNNLTLNG NIVPLDQSNIFKGRSW >gi|225935369|gb|ACGA01000023.1| GENE 102 133292 - 133951 582 219 aa, chain + ## HITS:1 COG:RSc0292 KEGG:ns NR:ns ## COG: RSc0292 COG2197 # Protein_GI_number: 17545011 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Ralstonia solanacearum # 1 213 1 210 210 92 27.0 5e-19 MREFIIADNQDISKAGMMFLLSKQKEVSLLLEADNKAELIQQLRLYPQAVIILDYTLFNF AGADELIVLQERFKEADWILFSDELSLNFLRQVLFSSMAFGVVMKDNSKEEIMTAIQCAT RKQRYICNHVSNLLLSGTSSPLAASSMDDHLLTQTEKNILKEIALGKTTKEIAAEKNLSF HTINSHRKNIFRKLGVNNVHEATKYAMRAGIVDLAEYYI >gi|225935369|gb|ACGA01000023.1| GENE 103 133992 - 135059 797 355 aa, chain - ## HITS:1 COG:BB0682 KEGG:ns NR:ns ## COG: BB0682 COG0482 # Protein_GI_number: 15595027 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Borrelia burgdorferi # 1 353 1 350 355 297 43.0 2e-80 MDIAALLSGGVDSSVVVHLLCEQGYKPTLFYIKIGMDGAEYMDCSAEEDIELSTATARKY GLSLEVVDLHQEYWENVAAYAIDKIRQGLTPNPDVMCNKLIKFGCFEQRVGKDFDFTATG HYATTLQRNGKTWLGTAKDPVKDQTDFLAQIDYLQVSKLMFPIGGLMKQEVREIANKAGL 
PSARRKDSQGICFLGKINYNDFVRRFLGEKEGAIIELETEKKLGTHRGYWFHTIGQRKGL GLSGGPWFVVKKDMEENTIYVSRGYGVETQYGNEFRMHDFHFITDNPWKGQEKEIDITFK IRHTPDFTKGKLIQEGEKQFHILSSEKLQGIAPGQFGVIYDEEAKVCVGSGEIIC >gi|225935369|gb|ACGA01000023.1| GENE 104 135243 - 136685 940 480 aa, chain + ## HITS:1 COG:no KEGG:BT_0154 NR:ns ## KEGG: BT_0154 # Name: not_defined # Def: putative periplasmic protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 480 1 484 484 635 68.0 0 MTKLRKIILIPALSIVFISGFFSCGVDRWPEYAHQTALDTWMYDIMQQNYLWYQDLPSYD DVNLFLEPASFLSKVKSKNDSYSFVDSVMETPLPTYGFDYSLVRSADIDTAYNALITYVI PGSPAEAAGLERGNWIMKVDTSYISKKYETQLLQGTQARDLVMGVWKEVPVEPEEGEEGE EEFVYKVVPNDITLKLPAARSVEDNPVHKTKILTVKENNRDIKVGYLMYNSFTAGTNSDP DKYNNELRQISQEFKTAGVKYVILDLRYNTGGSLDCVQLLGTILTSEARLNKPMAYLEYN NKNRDKDATINFDSEILKSGVNLDLPALFAITSSTTAGAPEMLIRSLSLKDSYPVVTIGG VTKGQNVATEQFINEEFLWSINPVVCTVYDSNHDAGGAISPATDLKISETTIDGVTNYSE FLPFGDPNERMLKVAIGVIDGSYPPKDEETEETTKAQFKIEKSVISPASRRFSSNGLRLK >gi|225935369|gb|ACGA01000023.1| GENE 105 136692 - 137486 402 264 aa, chain + ## HITS:1 COG:no KEGG:BF1404 NR:ns ## KEGG: BF1404 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 15 247 19 249 274 149 38.0 8e-35 MKKIAILLLAILIFSCSDDDEKGTVENKGQWAMIFNESVKNDSNPVDKTEKFMFDGERLI QHIIKQRYFEEEISNEVNLSYSDNQVTVTTDYLTLVYTLNSEGYASQCVYSLSSQNRIYQ FSYSAEGYLTGIVENIDNTEYSSTSLTYENGDITSISSKMNGLENKFIYEPGEESSTYHL PCLGLLEIHPLTFHIEALYAGLLGKDPRHFTIRSCPAGSNDEKTIYSYEFDKKGNPSRMI CQTTYAGGQANYYPYTRNISVSFE >gi|225935369|gb|ACGA01000023.1| GENE 106 137895 - 138713 787 272 aa, chain - ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 4 272 2 265 269 174 37.0 1e-43 MKKKKLLLIALLLVWVAPSFAAKVDTLLIKSPSMNKDVQVVVVTPDAALGKKAVACPAIY LLHGYGGNAKTWIGIKPNLPQIADEKGIIFVCPDGKNSWYWDSPLNPSYRYETFISSELV KYIDEHYKTIADRKGRAITGLSMGGHGAMWNAIRHKDTFGASGSTSGGMDIRPFPKNWDM 
SKQLGEYESNKEVWDNHTVINQIDKIENGDLAIIVDCGEGDFFLNVNKDLHNRLLRKKID HDFITRPGGHTGQYWNNSIDYQILFFDKFFKK >gi|225935369|gb|ACGA01000023.1| GENE 107 138739 - 139173 312 144 aa, chain - ## HITS:1 COG:no KEGG:BT_0151 NR:ns ## KEGG: BT_0151 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 254 82.0 5e-67 MKTVKLITCNDAMKAHILQGALENEGIESILHNENFSTLYKSCVSNIAGVDILVADEDYE NAVQVLRDNDSWPEELTLCPYCGSSDIQLVLRKGKRWRALGAAIISALMVTPPGDNHWNY TCKQCHKTFEMPVSKFNPSAETEE >gi|225935369|gb|ACGA01000023.1| GENE 108 139487 - 141733 1971 748 aa, chain + ## HITS:1 COG:no KEGG:BT_0150 NR:ns ## KEGG: BT_0150 # Name: not_defined # Def: putative ferric aerobactin receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 748 45 792 792 1338 86.0 0 MVLGLNKGGVTNAEGHFTIEQVPPGIYRLQATAIGYKSVTTPEYIVSTKDLNISIEMEEN LTELAGVTVTASPFRRDLESPVSLRIIGLQEIEKSPGANRDISRIVQSYPGVAFSPIGYR NDLIVRGGSPSENRFYLDGVEIPNINHFSTQGASGGPVGILNADLIREVNFYTGAFPTDR GNALSSVLDFKLRDGDMEHNSVKATLGASEVSLASNGHIGKKTSYLVSIRQSYLQFLFDM LDLPFLPTFTDAQFKLKTRFNEQNELTVLGLGGIDNMRLNTKADSEDNEYILSYLPKIKQ ETFTLGAVYRHYAGAHVQSVVVSHSYLNNRNTKYRQNDESIPENLMLRLRSTEQETKFRF ENNSSFRNWKVNLGLNLDYSQYTNTTFQKAYTNQAQTFDYHTYLGMMRWGLFGTISYSSM DERFTASLGLRADANNYSSAMKSLSDQLSPRISLSYQLADHWFLSGNAGLYYQLPPYTAL GFKDNNGTYVNKYNLRYMKVSQESLGISWRKGDTFEVSVEGFYKDYDKIPLSVVDGIPLT CKGNDYGVIGNELLTSTAQGRSYGAEILVKWLITKKLNLASSFTLFKSEYRNDKESEYIA SAWDNRYIFNLRGTYNLPRQWSVGMKVSCIGGAPYTPYDEEKSSLVSAWDAQGKAYYDYS KYNKERLPAFAQVDLRIDKTFYLKHCMLGFYLDLQNITVSKLKQQDVLMSTGIIENPEAP ADSQHYKMKRLKQSSGTLLPTLGITFEY >gi|225935369|gb|ACGA01000023.1| GENE 109 141766 - 142548 1070 260 aa, chain + ## HITS:1 COG:YPO0927 KEGG:ns NR:ns ## COG: YPO0927 COG0501 # Protein_GI_number: 16121232 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Yersinia pestis # 1 256 1 247 250 152 36.0 7e-37 MKREIAMIAFVLLGMGMTASAQFGKKINLGKALQAGKDVVSAVTLSDADIANMSKEYMVW 
MDAHNPLTKPDTEYGKRLEKLTGHIKEVDGLKLNFGVYEVVDVNAFACGDGSVRICAGLM DVMTDEEVMAVVGHEIGHVVHTDSKDAMKNAYLRSAVKNAAGASSDKVAKLTDSELGAMA EALAGAQYSQKQESEADDYGVEFCVKNGIDPYAMANSLTKLSELSKDAPQASYLQKMFSS HPDTMKRIERAKAKAATYAK >gi|225935369|gb|ACGA01000023.1| GENE 110 142734 - 144026 1281 430 aa, chain + ## HITS:1 COG:sll0260 KEGG:ns NR:ns ## COG: sll0260 COG1253 # Protein_GI_number: 16331101 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Synechocystis # 1 422 8 430 448 319 42.0 6e-87 MEFLIILLLLVLNGIFAMYEIALVSSSKARLETLVAKGNKSARGVLKQLEEPEKFLSTIQ IGITLIGIVSGAYGGVAIADDLVPFFSLIPGAEAYARNLAMITTVAIITYLSLIIGELVP KSIALSNPERYATLFSPVMILLTKVSYPFVWLLSVSTRLLNKLIGLKSEERPMTQEEIKM ILHQSSEQGVIDKEETEMIRDVFRFSDKRANELMTHRRDLIILHPDDTQEKVMKTIEEEH YSKYLLVDERKDEIIGVVSVKDIILMIGSKKLFNLREIARPPLFIPESLYANKVLELFKK NKNKFGVVVNEYGSTEGIITLHDLTESIFGDILEEDETEEEEIVTRQDGSMLVEASMNID DFMEEMGILSYEDLESEDFTTLGGLAMFLIGRIPKAGDIFTYKNLQFEVVDMDRGRVDKL LVIKRDDEQE >gi|225935369|gb|ACGA01000023.1| GENE 111 144084 - 144557 515 157 aa, chain + ## HITS:1 COG:no KEGG:BT_0147 NR:ns ## KEGG: BT_0147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 157 11 152 153 213 88.0 1e-54 MKKVIMLAAVVAALASCQSKANKAAEAQADSLALAMTPITELTEVYEGTLPAADGPGIDY VLTLNAATDGVDTAYTLDMTYLDAEGQGQNKTFTSKGKQQTVHKVVNKKPVTAVKLTPKD GEAPMYFVVVNDTTLRLVNDSLQEAVSNLNYDIIKVK >gi|225935369|gb|ACGA01000023.1| GENE 112 144648 - 145205 309 185 aa, chain - ## HITS:1 COG:no KEGG:BF3831 NR:ns ## KEGG: BF3831 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 32 183 332 470 473 79 33.0 5e-14 MKQCIKTNGVLFFFCLISLIITAQEQSVQSNIKCLNYPSSFELLLRHFKGKIVYIDVMVS WCKPCLAELKEYEKTDDFFKKNDIVRLFISIDEPKDWNVCLKRLDERLLNGYFVTYHRPE NSVENNKFSVEVEKLFVTYDEKGNFAGLSVPQFIIVNREGTIVEYKAQRPSNPEELKNQL KQYLE >gi|225935369|gb|ACGA01000023.1| GENE 113 145270 - 146490 1063 406 aa, chain - ## HITS:1 COG:no KEGG:BT_0146 NR:ns ## KEGG: BT_0146 # 
Name: not_defined # Def: unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 405 1 405 406 768 92.0 0 MKKIVVGLAVMLGFCMCTHKPSGTLDVNKALDYCAEQTQRTLTELKTDSGIDYTMMPRNI MADEHHWNCRKATKEEWCAGFWPGVLWYDYEYTQDKHILEEAKKFTNSLEFLSRIPAYDH DLGFLVFCSYGNGYRLTKDPAYKKVILDTADSLATLFNPVVGTMLSWPREVEPRNWPHNT IMDNMINLEMLFWAAKNGGNPYLYDVAVSHADKTMKCQFRPDYTSYHVAVYDTITGNLIK GVTHQGYADSTMWARGQAWAIYGYTVVYRETKDPKYLDFAQKVTDVYLERLPEDKVPYWD FSAPGIPDAPRDASAAAVVASALLELSTYLPNGTGKRYKDVAIEMLASLNSDSYQSGKSK PSFLLHSVGHWPAHSEIDASIIYADYYYIEALLRLKRLQEGKNVIK >gi|225935369|gb|ACGA01000023.1| GENE 114 146579 - 148228 1442 549 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 10 548 16 529 531 214 30.0 5e-55 MRKKILFLAGMMACCSLAGAQEISKTWVADKGNGTYQNPVLHADYSDPDVCAAGDDFYMT ASSFNCIPGIPILHSNDLVNWSLVNYALPIQEPEEFFDKAQHGKGVWAPAIRFHNGEFYI YWGDPDYGIYMIKTKDPKGKWSKPVLVIAGKGMIDPTPLWDEDGKVYLVHAYAGSRSGVN SIVVICELNAEGTEVISDPVMVFDGNDGKNHTVEGPKLYKRNGYYYIFAPAGGVATGWQL VLRSKNIYGPYESKIVMAQGKTNINGPHQGGWVDTNTGESWFVHFQDKGAYGRVIHLNPM TWVNDWPVIGIDKDKDGCGEPVTTYKKPNVGKTYPIATPPESDEFNTRHLGLQWQWHANK KDTYGFTTDLGYIRLYAGSLSKEFVNFWEVPNLLMQKFPAEEFTATTKLTFTAKQDGEQT GIIVMGWDYSYLSIRKAGDQFILQQAVCKDAEQQNPEQIKELANIPVEHLKMPGVADNEW QTVYLQVKVRKGAVCTFAYSLDGKKYMTVGEPFTARQGKWIGAKVGVFCVTPNEGNRGWA DVDWFRMTK >gi|225935369|gb|ACGA01000023.1| GENE 115 148436 - 149812 1007 458 aa, chain - ## HITS:1 COG:no KEGG:BT_0143 NR:ns ## KEGG: BT_0143 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 458 1 458 458 866 92.0 0 MKTKYINVLFSFVIASFMMSCSSEIPTGDANKFSDMKSPEEDMVKRDYLPLNHPCMLHTQ ADINRVKSNLNRSPWAEAYAQLEASQYAQSSYTENTRALLDGYLKRMDKNNWSGKYSDYS NYTACMYDAAAAYQLALRYQLSGNTSFADAAVKLFNAWATNCKGILRMEGYTNNIPDPNL YLIPIQAHQWANAAELLRDYNGWDRDDFEKFKTWMKDTFYSVSDMFLKNHNGGQGNMHYW 
LNWDLAQMTSILSIGILCDDNVMINQAIVYFKNEEGRYKEAGNIKNAVPYLHQDPDSDEI LGQCEESGRDQGHATLCVSLMGAFCQMAYSIGEDLFAYDNYRAVAMAEYVGKYNLIKDES FNKGTLVGDDFIYDSNSFPYTSYSNPSYTNATISTEQRGTKRPSWELFYGYCKEKGISSL YSEKWADQMRPDGGGGNYGPNSGGFDQLGFGTLMHYRE >gi|225935369|gb|ACGA01000023.1| GENE 116 149822 - 151297 1104 491 aa, chain - ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 126 289 152 317 752 84 36.0 6e-16 MKYITFLSAKCVSVSLAILGLLLSAIACDDKEYGDAMSEGQLMNDIEMNIESSIALAVGM DIQVICKPVPENVTYPEMSWKSSDENIVSVSQEGKITAKAVGKAKVNISQKAAFETLKTI DVEVKPVATAIEMDDFELFEGTTKKASVKITPSNGYNVFHWKSSNEEIVKVDADGTVTGI IPGEAVVTAIATDGTSLTVSAKVTVRKVIPVESIQLDPVDDIMIGQTVVIGCHLVPEEAT SGLLSWVSSDEKVAVVDADGAVTGVGYGTVTITATDPMSNISATTSVTVGALLDYQFQST LKPCDLKPNQGGTVTFDNGYAHVIMKGPDGNGNWRQDINIVNDAYQKKLEYAPQTYKYLA IKLRRPQQSATANAGVLKFDIGDGVTAGNYCNSADYYVFDKETLTIVKKGTGVEKYGEPN IYVYDMSVDNAIKGSSPISTNEAGVATMTLLSLWIADVKEAAIDKTYDMYWIKTFKTLED LKAYIEKENNK >gi|225935369|gb|ACGA01000023.1| GENE 117 151327 - 153063 1354 578 aa, chain - ## HITS:1 COG:no KEGG:BT_0141 NR:ns ## KEGG: BT_0141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 577 1 577 579 1088 92.0 0 MKRSKITFFLTIVLGITLHSCLDMEPQDQLGGKNMWTSVNDYKQFANTFYSWTRDFSSVV YDGTHSDKRSDLITYQSYNEFSRGINSIPSSDANYTDNYKHIRRTNLLIENAETYSKPED IKQYLGEAYFFRAYSYFDLLQLYGDVIITKKPLDITDPEMKVKRNDRSEVVDLIIDDLGH AIDNLPAFKELSTEEEGRISKEGAQAFLSRVALYEGTWQKSRNNNQNTERTKNLLDIAAK AARTVMDAKYGYSFRLFGTDSETKVLGDSAQKYMFILENEKSNPAGIKKSANHEYIFARR HDQILASIGKNITQECLANVQWITRKFANLYLCDDGLPIDKSDRFQKYDEKVSEFLNRDN RMRYTLLKPGTRYWGNKFGRTSWQWDEVDLKTSKLYDPASGTCYGNQKWSAERVVPDTQE GYDFPLIRYAEVLLNYAEAVYERDDKIENEDLNISLNQVRQRVNGSMPALTKEFAQTHGL DMRTEIRRERTVELFNEGFRIDDLKRWKTAENEMPQAMLGIKWKGTQYESWNTPFSLNDD GCIVVETGRQWADKNYLYPLPSDQLQLNPNLGQNPGWK >gi|225935369|gb|ACGA01000023.1| GENE 118 153074 - 156304 2410 1076 aa, chain - ## 
HITS:1 COG:no KEGG:BT_0140 NR:ns ## KEGG: BT_0140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1076 22 1097 1097 2004 93.0 0 MKRRGLIFNSRPNKTAVFALGLMLGLCPVSQIWAFDNANENQVVLQSHKQVKGQIVDAMG EPIIGASILEKGTTNGVISDIDGNFSLNVSAPNAIIVISYIGFKSLELPASDPNLKRIVM KEDTEVLDEVVVVGYGTQKKESLTGAVTVVGAKQLENKGTMSSPLQAMQGTVPGVIITRN SGAPGDESWGMKLRGASSSNSTDPLIIVDGVEYSDGINGMRNLNPDDIESINFLKDASAA IYGSKAAGGVVLITTKKAKAGKTVVQYNGSFTGKVVGLQPELMSLDQWADAVITAQTNDG YSDSNWIRYAKLAKLYKNQYIDLNHSTHPIPEGFKDVEDFVFMDNDWQDILWGNSWSTQH DLSVSGGTERNLFRLSLGYMYDNSTLKWGNNNNQRYNMRLNNQFKLSDNVMLTSSIGYNR QDQVSPSMIGKVLSQSSPQPGLPASTIDGRPYGWGTWRALNWWAEEGGDNKLKVSAINIS ESLNWKIYSDLDMVVNVGYNTSTATREKVEKSIDWYNYAGTTLLATEPTQENSKYSDSFS RTDYYMASGYLNWHKTLAEVHNLSAMAGAQYNYTQYKYTFVSVKDINPSLEIPNGAGEVL IKDGDSKPAKWHEAMMSYFGRLNYDYKQRYLLEGNLRYDGSSKFRPENRWQFFWGVSSGW RLSEESFMHSLSSVVNNLKLRLSYGVLGNQSGVDRYDGTQLYNFASSNGAYIGSGKVSTI DTNGKIVTTDRTWERIHSYNLGVDFAFLNNRLTGTFELFMKKNNNMLIDAQYPGVLGDNA PTMNLGKFEAKGWEGNITWSDKIGPVQYHIGGTITYTTNKLIDLGATSVLKSGFVDKQQG YPLNSYFGLRYIGKIQTQEELEKYKYYYLDGNGIGMQDNIRLGDNMFEDVNGDGVLDQND YVYLGTDDPKLSYSFNVGLEWKGFDFSAIFQGVGRRTVYRGGEGNETWRVPMSAIYLNTT TQSIGNTWSPENRNAYYPTYTSVGSINNYNYQCSSWSVENGSYLRLKNLTLGYTLPVSWL AKTNVISKLRIYFTGADLWEHSKLKDGWDPEASRKTKDLGRYPFNRTFTVGVNATF >gi|225935369|gb|ACGA01000023.1| GENE 119 156437 - 156952 655 171 aa, chain - ## HITS:1 COG:no KEGG:BT_0139 NR:ns ## KEGG: BT_0139 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 258 87.0 8e-68 METTANISDNIITRSYEEYYQVILTYITYRITHRYEAEDLTQDVFVRLLDYKQMLRPDTV KYFLFTIARNIVIDYIRRYYKKQEIDSYFYDTVSTSTNETEEKIIGDDLMTMERTRLAAM PEQRRLIYTLNRFENKTSPEIANELNLSCRTVENHLFLGRREMREFFRNCI >gi|225935369|gb|ACGA01000023.1| GENE 120 157654 - 160479 2225 941 aa, chain - ## HITS:1 COG:VC1353_1 KEGG:ns NR:ns ## COG: VC1353_1 COG3292 # Protein_GI_number: 15641365 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic 
ligand-binding sensor domain # Organism: Vibrio cholerae # 8 653 18 652 675 76 22.0 3e-13 MKDTIKYLILILIYLACGVELQAKDYIFRNIVMSDGLSGLLVNAIYKDSEGFVWLGTDNC LDRFDGVKVRHFEFRGIESGRKKRVNSVTETANKQLWVGNGIGLWRLNRTNSQLERIVPE KIDFAVNTLLANEDILYIGTEKGLFIHKDGQLLQVLTDRNMLAACNRIMDICLNEDKTVL WLATVQGLFSYSLKDGKIDSWHFQENVPEADYFRCLTRIGETLYLGTMSQGVVRFDTKKE SFSHAVSLGCDVVSDISSDGKGTVYIATDGNGVHFLSHKDQQVTRRFYHDVNDKEGIRSN SIYSLLVDDRGAVWVGHFQAGLDYSLYQNGLFCTYDYPPLFNSANLSIRSFVNKGQEKVI GSRDGLFYINEATGIVKSFVKPVLTSDLILTICFYQGEYYIGTYGGGMMVLNPETLSLKY FAQGDTELFQKGHIFCVKPDAKGNLWIGTSQGLFSYNGQTKQIKSFTSANSQLPEGNVYE VSFDSTGKGWIATETGMCIYDPASQSLRSNVFPEGFVNKDKVRTIYEDAEHNLYFIREKG SLFTSTLTMDRFQNRSIFSTLPDNSLMSVIEDNLGWLWVACNDGLLRIKEEGEEYDAFTF NDGVPGPTFTNGAAYKDEKGLLWFGNTKGLIYVDPKRVDEVRGKVRPIVFTDILANGVPF TSSSLKYNQNNLTFCFTDFAYGLPSALLYEYRLEGVDKDWKLLAAQNEVSYYGLSSGTYT FRVRLPGNEQSEASCQVTVCPMIPWWGWGLSVLLIVGIIAFIRYYVWKRMRRLLDSSASS ISASAGEEIQQREQSVEQPEVISEQQSSTVEEKYKTNRLTEEECKELHKKLVAYVEKEKP YINPDLKMGDLASALDTSSHSLSYLLNQYLNQSYYDFINEYRVTQFKKMVADSQYSRYTL TALAELCGFSSRASFFRSFKKSTGVTPNEYIRSIGGTAKEE >gi|225935369|gb|ACGA01000023.1| GENE 121 160716 - 162983 1624 755 aa, chain - ## HITS:1 COG:no KEGG:Csac_2721 NR:ns ## KEGG: Csac_2721 # Name: not_defined # Def: heparinase II/III family protein # Organism: C.saccharolyticus # Pathway: not_defined # 25 746 14 741 752 452 34.0 1e-125 MKQIVLCLIGILFASFVSAQKINHPSLLYTPQRIQQVKQRMQNEPKLREAWEDIQKTADE ALQKKDFNRLDYLSLAYLMTDNKEYANIIKEILLKAVEAESWGDMEMMARIPVWRSQLGM AHKSFLSAIGYDAAYNIMSSSERKKIAEGLKRLAVEPALGDWLLEPTRIHSLNSMGHNWW TSCVCQGGILALSLQNELPEVKDWVEQLHESLPEWFDFAGDALQQKAKSFDEAGGMYESL NYANFGIQEALLFRIAWINTHPGQNPGDIPQLAKLPNYFSQVCYPRTGMLYSLNFGDSHK NVSAESSMMLLYALGLKDPTILWYIAQVEQGQHRDGFFLNRPMGFLYTPDLSKAPVTPDL KTSQLFSDFGWATMRTSWEKDATMLAVKSGHTWNHSHADANSFIVFHKGVDIIKDGGNCW YPNPAYRNYFFQSQAHNVVLFNGEGQPREQQYSGSTLRGNLYHLLDAGNVKYVLANGTGP VSNNFSRNFRHFLWMDNVIYMIDDLKTHKVGQFEWLWHTNGTYEKSGIDVNVTNGNSSVV IRPLYPRMLAKSDFVHDYPEDLYWEEIEAPTEDLKGTEKYYSFHLPAEVNRVKGLTAIIL 
KDAPDEKDLPQMERREGQDWIGLRIRHKGKITDLYINQLADGRLMHSNSWIMPDGWMTDA YMFAVSYPEGTEAKDAKDFFIAYGSALRRDNETYFSSLAKLFVIQKAEDKKLDLWINGQP KINTTFRSTKKPVAVEVNDKKIPVVYQKSQVKVKL >gi|225935369|gb|ACGA01000023.1| GENE 122 163005 - 164930 1318 641 aa, chain - ## HITS:1 COG:MA3635 KEGG:ns NR:ns ## COG: MA3635 COG0596 # Protein_GI_number: 20092435 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Methanosarcina acetivorans str.C2A # 372 636 16 279 282 97 28.0 1e-19 MKNLLIATFVVSISLLSGSCKVSSGQGKQYDFSSLDSVIQGWVDKGYYPGASICVVKNDT VIFQKNYRDYTPDTKVYVASAGKWVAAAVIGAVVDRTDLGWDDSVEKWLPEFKGDPKGRI LLRQLLSHTSGVRPYLPEPRVDNYNHLDSAVTEILPLDTVFTPGTRFEYGGLAMQIAGRM AEVAMGKEFETLFQELLAQPLEMKNSHFTPINTDGGHAPMLGGGLCTTMNDYLHFLSMIY HDGMYNGKQIISAETVKEMQADQVKGAIIPSNNLDNYVAKGLGQFHNGIYGLGEWRELID KKTGEAYQISSPGWAGAYPWINKHDKVYGFFISHVTGSSVKEDGFSSFFGSPVISRTVSE ILKGKPLVVKQGRINVGNGSLYYEEAGQGEPIIFVHGHSLDHRMWDEQFSVFAKKYHVIR YDLRGYGISSSQTEDYQFMHVEDLVTLMDSLHIKKAHIVGLSLGGFITADMLAYFPDRML SAFLASGNIRKSKGPSEPMTKEEAKVRDEEIAALKKKGVEVMKKEWFEGLMKSGGSQRER MRAPLWQMIDEWDAWQPLHKEVRVVAGLDAIEELKKSHPAVPALIVEGHSSDNKFSKNPP ILEYLPNGKLKIIEDCGHMMNMERPEEFNAALEEFLINIEQ >gi|225935369|gb|ACGA01000023.1| GENE 123 164949 - 166880 1530 643 aa, chain - ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 50 643 4 594 758 393 36.0 1e-109 MNKKVLVTVILFGSLTVSGIVSAQSVYPGQHQGKLKKETVAPLQAESFDLKDVRLLPSRF RDNMLRDSAWMTSIDVNRLLHSFRTNAGVFAGREGGYMTVKKLGGWESLDCELRGHTTGH MLSALGLMYAATGSEIFKLKGDSLVNGLEEVQNALKNGYLSAWPEELINRNIQGKGVWAP WYTLHKLFSGLIDQYLYADNKKALTIVTRMGDWAYNKLKPLSEETRKLMIRNEFGGINES FYNLYSITGDERYRWLAEYFYHNDVIDPLKELRDDLGTKHTNTFIPKVIAEARNYELTRN ETSRKLSEFFWHTMIDHHTFAPGCSSDKEHYFDPKKLSQHLTGYTGETCCTYNMLKLSRH LFCWTGDSSIADYYERALYNHILGQQDPETGMVAYFLPLLSGSHKLYSTKENSFWCCVGS 
GFENHAKFGEAIYYHNNQGIYVNLFIPSQVTWKEKGLTIRQETEFPQEETTRFTLQAENP VRTTIYLRYPSWSKDVKVSVNGKKISVKQKSGSYIAITREWKDGDQISATYPMQIKLETT PDNPDKAALLYGPLVLAGERGTEGMQAPAPFSNPALYNDYYTYNFHVPAHLRTSLKLDKK HPERALQRVGSDLKFTTEQGDVIRPLYDLHHQRYVVYWDLQSE >gi|225935369|gb|ACGA01000023.1| GENE 124 167045 - 167827 634 260 aa, chain + ## HITS:1 COG:no KEGG:BT_0135 NR:ns ## KEGG: BT_0135 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 1 260 260 481 90.0 1e-134 MDLRTANVTRYILPLREGGSLPALAEADDEFKYVVKFRGAGHGTKVLIAELIGGEIARAL GFRVPEIVFLNLDEAFGRTEADEEIQDLLQWSRGLNMGLHFLSGSLTFDPIVHQVDGKTA SQVVCLDALLTNVDRTIKNTNMLIWHKELWLIDHGASLYFHHSWTNWQKQALVPFVQIKD HVLLPFADKLEEVDIEFRQILTSDKIREIVNAIPDDWLNWTEGTETPQDLRDIYIRFLEE RIKHSETFVNEAQNARKALI >gi|225935369|gb|ACGA01000023.1| GENE 125 167805 - 168188 247 127 aa, chain + ## HITS:1 COG:no KEGG:BT_0134 NR:ns ## KEGG: BT_0134 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 233 88.0 1e-60 MQERHLYEYAVIRFVPKVEREEFINMGIVLFSKQAKYLKSLYTIDENKLKLFSSELDINC LKEGLQVFDKICMGNKEGGAIANMDIPDRFRWLTAVKSSCIQVSRPHPGFSTDLDKTLER LFKELVL >gi|225935369|gb|ACGA01000023.1| GENE 126 168214 - 169830 1323 538 aa, chain - ## HITS:1 COG:no KEGG:BT_0136 NR:ns ## KEGG: BT_0136 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 535 6 534 536 684 59.0 0 MNKYFLVVLLSLFMQVAKVAAAPWQWSVEIKELISEETNAHPSAYLWIPENCKQVRAVII GQHNMTEETIFEHPEFRKNMGKLGIAEIWITPGIDQRWDVTKGTQQIFETMMKNLSEVSG YTELKFAPVIPIGHSAMATYPWNFAAWNPERTLAVLSIHGDSPRTHLTGYGRANLDWGTR TIEGIPSLMVMGEDEWWEDRLITSFDYRREYPNAPLSFLADVGHGHFDISDELIDYLSLF LKKTVEYRLPEHSSLDAPIQLIPVEAKNGWLADRWRKNEKPTAEAASYDKYKGDKNHAFW YFDKEMADATEKYYANERGKTEQYIGFEQKGKLITFNPKSHVRMSPSFQPEADGVTFHLK AVYTDTLRNEYSKEHSTHPIRMSRICGPVEVVNDTTFTVRFYRMGLDNPKRTGGICLMAS VKQDCKYRSAVQQVEIRIPYRNKEGIPQRIIFPKLSDVKASVKEISLNGTADSGLPVYYY VKEGPAEIKGDKLVLTKIPPRAKFPVKVTVVAWQYGRSGEPKVQTAEAVEQSFYITAR >gi|225935369|gb|ACGA01000023.1| 
GENE 127 170067 - 170918 943 283 aa, chain + ## HITS:1 COG:YPO2805 KEGG:ns NR:ns ## COG: YPO2805 COG0656 # Protein_GI_number: 16123003 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Yersinia pestis # 1 275 15 290 297 335 56.0 5e-92 MEKVKLNNGIEMPILGYGVYQVTPEECERCVSDAISVGYRSIDTAQAYHNEEGVGNAISK CGVPRGELFITTKVWISNAGYEKAKASIDESLRKLKSDYIDLLLIHQPFGDYYGTYRAME EAYKAGKLRAIGVSNFYPDRFIDLAEFCEITPAINQVETHVFNQQIKLQEVMKEYGTKIM SWGPFAEGRNDFFTNPTLQEIGKEYSKSVAQVALRYLIQRDIIVIPKSTHKERMIENFNV FDFSLTPDDMAAIAALDTAKTLFFSHYDPEMVKWLINYSTKNE >gi|225935369|gb|ACGA01000023.1| GENE 128 171100 - 173037 1767 645 aa, chain - ## HITS:1 COG:no KEGG:BT_0132 NR:ns ## KEGG: BT_0132 # Name: not_defined # Def: alpha-glucosidase, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1205 88.0 0 MKKRILLLLLVIPALIKAQDVMTETRQELTSPDGAYRFTFYQRAVGEDNAQMYYTLTYKN RPVIEESKLGVLIENQLFESALGIPNDTCHFWCENLKLTETEHQKTDERWKPVYGERAEV RDCYNEMTLKFKKGEGKGNRDGGYDKRKNYFMNIIVRAYNEGVAFRYHFPEMTNGLFLHI VGEQTSFTMPEGTMAYYERWAQGPYELRPLEGWGKEESERPLTLELPDGLSVALLEAEMV DYVRGKFRLSAEKPSTLETSLYSSVDIISPYSTPWRVIMAAERPVDLINNNDIVLNLNPA CKLADTSWIKPGKVFRSGDLKQDRVKAAIDFAAERGIQYVHMDAGWYGPEMKMSSDATTV SPDKDLDIPALCKYAESKGIGLMVYVNQRALVQQLDTLLPLYKKWGLKGIKFGFVQIGNQ RWSTWLHDAVRKCGEYDLMVDIHDEYRPTGFSRTYPNLMTQEGIRGNEEMPDATHNTTLP FTRYLAGAGDYTLCYFNNRVKNTKAHQLAMAAVYYSPLQFMFWYDRPEFYQGEEELEFWK AIPSVWDDSRALDGEIGEYIVQARRSGNDWFVGAMTNTEARTITLTTDFLEPGKKYMLHL YEDDDKLNTRTKVRSTHKKIKAGDKLVLKLKASGGAALHFTPLEK >gi|225935369|gb|ACGA01000023.1| GENE 129 173123 - 173992 589 289 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 289 40 300 313 137 29.0 2e-32 MNEKLTITTSNPVRARFYEYPRFTYPWHFHSEYEIIYVEKGEGDCLVGDSIISYSKGDLI 
LFGSELPHSMQSPPDDGEESDNEEKSEPKVRGVNIQFEKDFMHYSISQYSQFIPIRNLLE DACRGIKFTITRSGKIIKLLEQIPSAKGADQIILLLSLLQMMATSNHKKYLTTSHYTPSP SIMRNERMEKVIAYLNKHYTESIGLDEIASYTAMNPAAFCRYFKENTGKTFKEYVLDMRI GYACKLLNSSMMNISQISATCGFESPVHFNRIFKRVTGMTPTLYREQME >gi|225935369|gb|ACGA01000023.1| GENE 130 174125 - 174940 201 271 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 10 263 4 238 242 82 27 2e-14 MNELFSIAGKVAVITGAGGVLGGNIAQHFVQQGAKVVAIDIRQEQLDNRVAELKQYGDDV IGIIGNVLDIASLEKVAEKIVAKWGQIDILLNIAGGNMPGATLEPEQHFYDMDISCWEKV TNLNMNGTVYPSMIFGKVMAKQGKGCIINVSSMAAYSAITRVPGYSAAKTAVANFTQWLA SEMALKYGDGIRVNAIAPGFFIGDQNRRVLINPDGSLTDRSKKVLAKTPMKRFGDIKELN GAVHFLCSEAASFVTGAMLPIDGGFSAFSGV >gi|225935369|gb|ACGA01000023.1| GENE 131 174981 - 176168 1039 395 aa, chain + ## HITS:1 COG:uxuA KEGG:ns NR:ns ## COG: uxuA COG1312 # Protein_GI_number: 16132143 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Escherichia coli K12 # 2 381 1 383 394 348 46.0 1e-95 MMEKTWRWFGKKDKITLPMLRQIGVEGIVTALHDVPNGEIWTMEAINDLKSYIESYGLRW SVVESLPVCEAIKYAGAEREQLIENYKVSLANLGKCGIKTVCYNFMPVIDWIRTDLQHPW PDGTSSLYYDHIRFAYFDIRILEREGAEKDYTEEELQKVAELDKVITEAEKDALIDTIIV KTQGFVNGNIKEGDKNPVSIFKRLLALYKDINRDALRENMRYFLSAIMPVCEEYGVNMCV HPDDPPFQVLGLPRIVTNENDIEWFLNAVDNPHNGLTFCAGSLSAGEHNDTRELAKKFAK RTHFVHLRSTAAMQGGNFIESSHLTGRGHLIDLIRIFENENPGLPTRVDHGRMMLGDEDK GYNPGYSFHGRMLALAQVEGMMAVVDDEKKRQIIL >gi|225935369|gb|ACGA01000023.1| GENE 132 176479 - 176610 74 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLILPSHSHPTLNPFIHRTFVGEGKGEGSVRVKFTLTLSTLTK >gi|225935369|gb|ACGA01000023.1| GENE 133 176628 - 178034 1142 468 aa, chain - ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 9 468 5 461 461 492 54.0 1e-137 MNRLIRIALTFLLVMTSGVIQAEIVIYPVPQGIYYARHNDDYTVKVRQVGEKDWVDLYEY 
NVKVDMDTKSDATMVQFDFSGKVEVLVQKNNGELRSAVVRPLSKGIQPEIDGNFLLFTLD KPQKLSVEFNGDRLNNLHVFANPIIENVPDKNDPNVMYFESGIHEPTDVAGKCFRIPSNT TVYLEGGAVLKGCLTCDSVENVKILGHGMLLEPQQGISVAYSKNVLIDGITVVNSRHYTV SGGQSQGITIKNLKSFSYQGWSDGLDFMSCSDVVIDDVFLRNSDDCIAIYTHRWNYYGDS RNIRVLNSTLWADIAHPINIGTHGNTKTGDEVLEDILFKNIDILEHDEDDRDYQGCMTIN VGDHNLAQNITFEDIRVEHIQEGQLFHLRVMYNQKYNTGPGRGVRNITFRNISCTGKYIN PSLMEGYDKNRKVENILFENIMLNGKRITSLEELNIDKKDFVGKIQIK >gi|225935369|gb|ACGA01000023.1| GENE 134 178049 - 179344 1243 431 aa, chain - ## HITS:1 COG:no KEGG:Dd586_1768 NR:ns ## KEGG: Dd586_1768 # Name: not_defined # Def: exopolysaccharide inner membrane protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 60 429 47 406 410 291 43.0 3e-77 MKKIYYITAVFATLFLVGCGDGIDLPGVNVETDLNKIPLPDNNVNLEQVELKPSTEPMLH EGLHTEEDFQRIRDKKAAGEEPWVSAYQLLVESQFSQKTADTYPTEWIKRGISGDENYMN AARGATIVYQQALRWKIEQDDEYAAKAVENLNKWVQTCVGVTGNTNLSLAAGLYGYEFAI AGELLRDYGGWDRADFAAFQNWLLKVFYPANDDFLKRHHDTNALHYWANWCLCNIAAKMA IGIVTDRRDIYNEGIAHLQTGDTNGRLRRAIYHDYAPDYNFAQWQESGRDQGHTLMCVGL MGIICQLAWSQGDDFFAYDDNLFLRACEYAACCNYTNETVPYTTYIWQKQSQWGYPIPEE QTTLGGGKWIKRAIWALPYYHYKGIKNISDDNLKYTKIATEYVGIEGGGGYYDANSGGYD VLGLGTLMYAQ >gi|225935369|gb|ACGA01000023.1| GENE 135 179406 - 181445 1815 679 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171100|ref|ZP_05757512.1| ## NR: gi|260171100|ref|ZP_05757512.1| hypothetical protein BacD2_04464 [Bacteroides sp. 
D2] # 1 679 20 698 698 1252 100.0 0 MKKIYNYLFPVFLSLFALAACDEDNEEIVSMSYTDPVATVTKIDPVEGYVGNEFTVSGSD FGIITEDVKVFIGSQEAVVVSCADDAILAKVPESATNGKITVEVFGQRVETDLVYRVLGK PGVSVVKPSYGFPGASIVFEGQEFVSSKTLYTLTFGTSTDKAEIVGTPIDTEFMAKIPET AVSGVMTLIMAEQTIDLASYPFTVLKHATLDTPKEDEPVPSGFAGSKFAITGTNLVQELL DKSVEGLEPLKVTFFKAGGEPVEAAIDTDNLTDKSIPLTVPASLEAGDYTITVITPFEAI GTQLKYTVLPMPTVTGISTKAGYINAEVTIIGQNFGTKAENIQVFFGETICDKVTLNDKG NIVVNVPKGVSSEAPVKIKLIIQGKEIEMGESGTFEVWETPEITSVETPYIYPNGTLVKA GQEITFTGKGFGTDKNSVTVTFEGISVPVTVKEITTTSITVTVPQGFNGGKVTMVFEGIA QPVESDMLQPLPVDGDISQYVLMNYKQPFEYVKEGGDKGFHKKGEWAKAAYWIVQNSNLT AGEGGAAVDLAFKTKYGDGSDAGLALQTDWGFDNPKNDGKVYQTSHLQKGKYKLTAHVYE YDGRGFTGYVAVCKGNEMANTSDIPSKSLANASISGTGDVVVDILVEEPTDVVIGFVCTI TVKQGRAKIDNFKLELVEQ >gi|225935369|gb|ACGA01000023.1| GENE 136 181495 - 182688 986 397 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171101|ref|ZP_05757513.1| ## NR: gi|260171101|ref|ZP_05757513.1| Ig-like, group 2 [Bacteroides sp. D2] # 1 397 1 397 397 763 100.0 0 MKFLFKYTFHLAILACVSALFSGCEQDPKYRVYDYPVPVVESIYPTDGYVTTQVVITGTN FGDRAEAVKVFFGEAQSNKVLDCKNNRLVVEVPETAVTGNLSLQIYNKKVENIGHYTVLP TPRVITVTSDSEDGEGVADTGDKVTITGENFGTDPSDISVSFNGTPAEFELVDESTIVAT TPADYKTGNVTVTIHGYTMTGSAMFNPNSKGDVTVLYLQNYKQPFAKANDENWKNGEWWT PAVWNQNKASFNAKNNTTVTGMQYKAAEGFTLAFQNGWEKEAYTNGKIWQVATLRPGKYR LEVTYAYTMVVSDAGNFISALMAKGNSESDIPNVADIEQLNGVCAIYDKAGTNDDSGVLV TPSFEVTETTDVVIGFLTSLAKGNSYFKVTELKLILE >gi|225935369|gb|ACGA01000023.1| GENE 137 182725 - 184521 1562 598 aa, chain - ## HITS:1 COG:no KEGG:BVU_0125 NR:ns ## KEGG: BVU_0125 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 598 1 675 675 337 36.0 1e-90 MKKITNYILICTCALGFSSCVNTFLDLEPLDAKTDVIYFKTPEHFREYANGLYGQLLGWQ SSYGSIFDHMDAASDLSTCFRYSYGVGTGVMGVPNDDSRWGNCYGNIRATNHMFERAVSS YTGNLADIKKELAEGHFFRAYNYFYLLKFFGGVPVVTKVLDVTSPELYGKRNSRYEVVNL ILSDLDEAIAGLPLEQNITSADKGKISKQAAQAFKARVLLYEATWRKYNGTSTDFEGSAG PASDQVNTFLEESVQLSEIVMGDAAYSLWNYNNVAAMKNLSSRYLFNIEEEASNPAGAGR 
ATNKEFIIASVYSQETRKGQVDLNQVIYTDMRPSRKLIDMFLCTDGLPVSMSDKFQGYKN PGDEFQNRDFRLTSYVGSYSTSLTVESCGYGVSKFAITDIQRQSKDESANYPVLRLAEVY LNYAEAVMERYGEISDDQLNKSINKIRARAGIANLTNALAKRIQEGVPANATKTVNQVML DEIRRERALELYMEGFRCDDLKRWGIAEKNLNESRCGAVVGNASYPTAFVDENGNATSAF NSAIFTKGTEEVETGKGKLPCVVLLKSSDCAFTKGDYLWAIPRNQINLNSNLVQNPGY >gi|225935369|gb|ACGA01000023.1| GENE 138 184555 - 187803 2988 1082 aa, chain - ## HITS:1 COG:no KEGG:BVU_0126 NR:ns ## KEGG: BVU_0126 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 18 1082 21 1108 1108 679 38.0 0 MEIKSKFLCSKGVALAFAMLLGGSPGVLLYANDSVEESLMVVQTGRIIKGLVTDANGEPL IGCNVVVVGSNAGVITDIDGRFTLNIPADAKQIKVSYIGYVDQIINLHGRSDFKVVLKED NNALDEVVVVGYGTQKKATLTGAVEQIGSQVLESRAITNVGAALQGATPGLVVTRSSSRP GNEGLNFQIRGLTSVNGGSPLIIVDGVPVLNSESFQNLNSDDIENISVLKDGSASIYGAK AANGVILVTTKKGKGKTTVDYGFNMRFTTNGIMAFSPSMQEYASMWLEANKEMPEHDWWG WGEENLKKMAQGIEGIYESTVADWGTMFVGNANRLDELFARRYSYQHNLSVAGSTDKSDY RISLAYADNQANLATAYDGQKQLNLRLNYGIKLTDWFKLETLASMIKTNTETPTHGIDRT LYGNDAPFFPAKNPYGQWYANFGNVGDRNAAAATTDGGRDEREKLTTRVDFKALVDIWKG ITFEGTASFQNEEYRRERYSLPVQCYNWFGEQTAKLVYETTQTLSTPQDVLNFKDSHQPG YLVQANNARYQYYSGLLKYKRTFAEVHNIDAMFGINAEKWVTKKVVTAREKFEDAGIYDL NLATGTQGNGGGKTHNGTYSYIARLNYNYAEKYMVELMGRRDGNSKFAPGYRFKSFGSVS LGWAFSEEQFVEFLKPVLSFGKLRLSYGSSGNDVGLGDYDYVSTVSLGTAGFGTIPANQV SSGFGGLISYDRTWEKVSQKNFGIDLNFFDNRLKATFDYFIKDNTGMLVNVTYPGVLGGK APKTNSGHLNVKGWEFTIGWRDQIKDFSYYANFNIGDTKSLLKEMEGADSYGAGWNAAVN GYPLNSYFLYRTDGYFKDQAEVDRYYALYGEGKEDLTGVGAGSASRLRPGDTKRLDLNGD YKISGAGNENSDLQYLGDSNPHFVFGFTLGGSWKGIDVNAMFQGVGKQYVIRNDWMAYPF QTRTANQNPTYLGKTWTESNPNAEFPRLTTNANLARWNYQNNDFMMQNNRYIRLKTLIVG YTLPQIWTRKVKLEKVRVYFSGNDLWEATSIRDGFDPEMGAASNNSGYPFARTWSFGLNI TL >gi|225935369|gb|ACGA01000023.1| GENE 139 187853 - 188383 404 176 aa, chain - ## HITS:1 COG:no KEGG:BT_0139 NR:ns ## KEGG: BT_0139 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 169 171 125 44.0 8e-28 
MNKQQATFDDFFSECYLKYKEYIKNYIAIRICHPHEAEDLAQDVFVRLWEHRAFVNKDTV WSLLFTIARNIVTDKIRRYYKQEDFVAYIYNRMEDTSRNTTEDTIHFRELKKMHDQVMEA LPVKRRQIYELSFNHELSCPAIAGKLSLSPRTVECQLLLARKTVRTYLKNEFSKVG >gi|225935369|gb|ACGA01000023.1| GENE 140 188621 - 188995 358 124 aa, chain - ## HITS:1 COG:no KEGG:BT_0128 NR:ns ## KEGG: BT_0128 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 10 131 132 197 78.0 1e-49 MKVLVLILLVGGLMLSSCGGSSSKNNANVKEVNPADTIYLGDLREKFANDSVFFKIVAPD LMLMDYQYLWAVTESEAVEKGLTKEYYKRVKKEITDTNEAIKRGVMKGANVKRIPDFQAS QENK >gi|225935369|gb|ACGA01000023.1| GENE 141 189110 - 193192 2910 1360 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 844 1104 8 271 294 150 35.0 2e-35 MAMKIRICLLIILMGLLYPGYISSQESPYVFRQIGVVEGLPDNYVKSVFPIPDGRIGVRS TVLLSLYDGANYSNFPFNIHGEYSIAYNHIIPEQYIDADKRLWMKERKSLRVFDLTTEQY IYNVDSLFQQFGLKDKVADLIIDSEKHCWFLTPGSSVYMYDAETKSIEQVCRNDEFMEYY GGLLGVESHGKYSWMVHQKGAIRCYDLEKKRFVHQLDFLKDQLKPDDRVVLKILDNGDFW LMWDRGVGYYDVYNKKWNQISGIQLGHYSWFTSMDVDKGGNAWVGTVIDGFYVIDMHNFS VTRTLDIPLLSGNTVRNGIQSIYCDRENNSVWIGLYNQGMCYYHPSMNKIVLYNKKMING DWKGEEIRCMLETSKGEILMGTTQGVYRYEPETKSMNRFYHEFSQKNCRVLYEDSKKRVW VGTYHDGLYCIDHGKVQSYDYPDTDYQNELDFSNIRVMVEDSSGRLWVSIYGGVGLFNPE NGQINLLSEQFPELKKYKVANALAIDNQSRLVVGSDNGLYIYDPATNKIWIPEEDGQANS IFNQGSIKYNQILKDHEGTLWFATQYGLSVLTYNGQSYTLGKEEGLSKAILNVVEDKNHD IWISTVTSIYKIKVDRRADKYSFHVISCLSEDEIRQDDLYSFPSLMTRDNQLFLGLMNGF IAFSPENMIDNQCLNRPLFTSFRLFNVPVVSGEIYNGRVLFDKALSYSDEVQLKYDENYI TLEFSGLNFPNPSQTSFRYQLEGFDKEWTETLFENGQGRVVYNNLPSGEYIFRVSAAGND RIWGPESAFKIVIHPPFWDTLAARIFYAILVILLIFGLIYVINRRNRQKMIRMQEEEALK QKEELDQMKFRFFTNISHELRTPLTLIITPLDMVIRRLTDDAMKKQLNTIYKNAQNLLSL VNQLLDFRKLEMKGERLHLMNGDMEEFIVSAYNNFMPMAVEKHLNFVNQSEHRALYMFFD RDKVHKIMNNLLSNAFKYTPQGGTVNLQLATEEIEERNYVRISVSDTGIGISESDLPYIF 
DRFYQVGNEGDEKIGSGIGLHLVREYVNIHGGRIKVDSQIDRGSVFTVWLPMDLKPEPNE LPEEIIGTETPDIKEKETTTSTVDDNLKKLLLVEDNQEFRTFLKEQLEDFYQIIEAADGE EGERKAIEENPNLIISDIMMPKVDGIELCRRIKTNVQTSHIPVILLTARTADDIKINSYE VGADSYMSKPFNFDMLMVRIEKLIEQQEKRKQEFRKNIEVNPSAITITSVDEQLIQKCLE YIEKNMDNPEYGVEELSGDLGMVRMSLYRKLQSITGHTPTDFIRSIRLKRAAQLLQGSQL PIVEIANRVGFSSPSYFSKCFREMFGMLPKQYAEESGRKE >gi|225935369|gb|ACGA01000023.1| GENE 142 193246 - 195159 1406 637 aa, chain - ## HITS:1 COG:no KEGG:BT_0127 NR:ns ## KEGG: BT_0127 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 629 478 1083 1097 1017 78.0 0 MINKIIWIIGIVWIGNIQNGVQAQRRNFIHPGITYTQGDLDRMKAMVKAKHEPYYSTFLK LKESPYSSLNTQVINRGKQIREGRFNATIGVDGRRAHDLALLWHLTGDEAYARKAVEYLN ANSYYTNTSSRGTGPLDNGKIYLLIDAAELMRDYSGWKVEDQQRFKNMLVYPGYSNTEDF SSKYANYWDDSKNGVTFYWNIFNFDAARFGNQGLFAARGMMAMGIYLDNEVMYDRAYRYL LGMKHRPDDLPYPSGPPVISKEPIKKSPTMLDYKLEGRENKISDYGYDEQLQYYIYPNGQ CQESSRDQGHVLAGIHNYVAIAEMAWNQGDSLYNCLNNRLLLGLEWNYRYNLSKVQTYEE QKEPWEPTSYSKNSNEVTFENGKFLQIKSRSGRWESVAVSPHGRGDVAGSGGTREMALAH YAVRAGLPENNYTWLKRYRDYMIEHHGCENWGIAPNWFYEWTDWGTLTKRLTPWMAGDPV SFSTGKRVSGIHLLPSEISAADYDYYCLAEDPEGHTYHNVGKKRGNEYRPDGAVELRKEE DNYVVTQIEDGEWMNYTVSIPTDGDYTVYLVYQSKGNSLLSVASDSEVKTEPMQLPSSIQ WTEREIGKLTLPAGACVLRLQIEQAGDKLEIQKIIVK >gi|225935369|gb|ACGA01000023.1| GENE 143 195607 - 197019 1027 470 aa, chain - ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 25 470 16 461 461 714 74.0 0 MNIVGVLSGKRKCLLAIAIAFSTFGNAQLVTYPEGLNTGMPHNDDYTVKVREAGGEWKDV FEYEVQVDMDRVQSASMVQFDIGSPVEVMVKKNNGTIQDVKIRPLAIGIQHTVNHNAIFF TLTRPQCLSIEFNGDRLHNLHLFANPLETETYTESSDKVMYFGPGVHRPKDLPNTQIQIP SNTTVYLAPGAVVKAKLVIDKAENVRIVGRGILDHPIRGIEVTHSKNIWIDGITVINPDH YTVFGGESTGLTINNLKSFSCKGWSDGIDLMCCSDVLIDNVFMRNSDDCIAIYAHRWNYY GGSRNVTLQNSILWADIAHPINIGGHGNPDDKVGEILENITVRNVDILEHDEDDLLYQGC MAVDCGDKNLVRKVLFEDIRVENIQEGRLFHINVRFNSKYDKQPGRGIEDIIFRNIIYNG 
VGENPSLLKGFDKERSVKNIIFDNVIINGMKMKNIDDFITNEYIKNITVK >gi|225935369|gb|ACGA01000023.1| GENE 144 197356 - 198378 852 340 aa, chain - ## HITS:1 COG:XF0975 KEGG:ns NR:ns ## COG: XF0975 COG3746 # Protein_GI_number: 15837577 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate-selective porin # Organism: Xylella fastidiosa 9a5c # 53 312 103 364 389 79 25.0 7e-15 MDGGVYLKNPNNFGNGTEFNDLRIGVKATYQNWGMKLEMGYAGNKVAIKDAFATYSYKNS SIQIGQFYEPFSLDMICSTFDLRFNQSPGAVLALTNSRRMGVAYSYRTQYYYLCGGFFTD NDLSNLKNASQGYAIDGRLVYRPLYEQAKLIHIGLAAIHRTPDGTLPEDENRNTFTYKSP GVSTIDNRTLIQADVDHAASQFKIGTELLIYYHKFFLQGEYIRAHVKREKGFENYTAQGA YLQCSWLLLGQNYLYDEEVACPGRPEGKALELCARFNYLSLNDAGIKGGTQKDLSFGLNY YINKHIAVKLNYSYFIPGSHIKEIESTNFSVVQGRFQFIF >gi|225935369|gb|ACGA01000023.1| GENE 145 198515 - 199885 1161 456 aa, chain - ## HITS:1 COG:no KEGG:BF1089 NR:ns ## KEGG: BF1089 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 426 8 424 446 580 73.0 1e-164 MKTITEAISKQSSNRRLADFLFILWAGGAALLSYSLVYTLRKPFTAATFDGIEAFGFDYK VLVTIIQIAGYLIAKFIGIKLISELKRENRLKFILVSIAVAELSLVAFGALPTPYNMFAM FFNGLSLGCMWGVIFSFIEGRRTTDILASLLGISIVISSGTAKSIGLFVMNTLNVSEFWM PALIGAFALPLLALLGYSLTRLPQPTAQDIEQKSSRVTLNGKQRKELFIDFMPFLVLLFV ANLMLVVLRDIKEDFLVKIIDMNGQSSWMFAQVDTVVTLIILALFGAMVFVKSNIKVLVA LLGLVVLGTATMSFISFNYDSLQLDAITWLFVQSLCLYIAYLCFQSIFFDRFIACFKIKG NVGFFIVTIDFIGYTGTVLVLMFKEFAHVDINWLEFYNILSGYVGLICTVAFTCSMIYLI QRYKKEKQLKKAKEAEMSNGIKFTGEGMEPTTFSQI >gi|225935369|gb|ACGA01000023.1| GENE 146 200105 - 202471 2091 788 aa, chain - ## HITS:1 COG:no KEGG:BF1763 NR:ns ## KEGG: BF1763 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 30 389 42 367 391 155 31.0 6e-36 MNIFRINNKYVAVAIFLMLAVTLQAQDYGALQYMLQKRPANEKFESNKFNEHLFFSAGIG PYSLLTSGDSQDGMGMTAHLFMGKWITPVHGLRIGVNLGYLPSSIYDSKIKMGGGSLDYL LNMSSLAYGYNANRCFELVGIAGIEAGYSKVGDNSDRSEKYPDLKGGGQLYYGAHLGLQG NVRLSSTLDLFVEPRIGWYNDGFAYTESWRNYKMAGSVLVGLTYMPAAPMGTKIHFDDFD 
KSSFLNHMFISLSGGISTLKVPGIKNTIKGLGPQFSAGIGKWFSPSSGLRLSGTVGLSDT PSGSASGYFKHVDLHADYLLNINNVLWGYDEDRIFSLIGIAGVNLAGTKGVDKTAKYAPG IGVGVQGSFRINRSVDLFIEPRLNVYNKRYAGGRGVGGNTDQLGELNFGLTYHTIDRAAR PKNGFSSNHIADNLFMTSGIGVQMFLNKTNLENLGSLGPQASVSIGKWLSPYSGLRLVGT GGFFTNYVVPGSVKAGKLRHASVSGGLDYLWNITSTMSGYNPDRIFDLIGSVGVNLAYTS KSDHKFQPGINAGIQGLWHVNDFLGLYIEPQIRLYGDKFIEGNLGFMQKDVMVGVNAGFH YRFVPYSKAANRSVFGQDDKRYFISGALGLGSLLVANKDLVKNAGVEAKGSIGKWYTPLS AWRVNGTIMYKAKTSSKMNLHYAGLGMDYMMSLATLAKGYSPDHVIDVVPFVGVTAGLVR RYGKFRAVPGLDAGVQVKLKVASSLYLYAEPKVGIRTDTYDGSEQGRPDRVASMVGGLLY RFKMPTFQ >gi|225935369|gb|ACGA01000023.1| GENE 147 202496 - 204529 1929 677 aa, chain - ## HITS:1 COG:no KEGG:BF1827 NR:ns ## KEGG: BF1827 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 672 1 670 709 491 39.0 1e-137 MKINLKRDRLTGILLVAMMFVGMNVSAQRIRVQGHITNPQGKSVPNVNVLNPVNDERIEM SDEDGRYSVLVEKNGSLKFTCVGYEDKTVKVAGKQILNVVLKDAVIELDEVTITSKVKDK VIPEPTDIEIKGNYFHLKTRVPVPKEMFNSHRRLVLQPSIYDVTLKKRLLMRPVVFDGDT YNTTQNRMYDYDMDKDPLHDYIRVKTTSSRKGDIIAYHDSIYIEYLQHDYRADVHLAMEN YRNIIYRDSFSIARGTVNPLRFLEYKFSAFSLTDEKYLPKPVMQLRDTKGEVNLTFLVGK ADLDDKNPQNQVELNRLNQELRAIETNPDASLKSFHITGVASPDGSYATNLRLAKLRTDK ALERILAQLDPETRKLLEVKSDASVASWKEVAELLKKNSKPELAKEVEDLIKQYAATPYR LNGVLKSKPFYKELAATYLPKLRKVQYTYGYSIFRSLTDDEIRELYRKNPKQLTRFEYYR MITTAKTPDEREKYCREALELYDNFTYAANELAVATIQKDTPDSRILEPFVSKSAPAELL SNQAIALLHEGKYTKADSVLTLVPEEAVSEDLQAIVQALAGYYNDAFEKVAATSPFNEVV MLLAMKKNQEAWDKISTMDVETAREYYIKAIAANRLEKIGDAIMSIEKALELDPSLLEVA KVDGDIIDLLPEEQKIK >gi|225935369|gb|ACGA01000023.1| GENE 148 204556 - 205113 380 185 aa, chain - ## HITS:1 COG:no KEGG:BF1765 NR:ns ## KEGG: BF1765 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 25 185 41 201 201 121 43.0 1e-26 MVYVMRNGRRVIAALSFLLLCCGVHVHAQRVAVKTNALGWLTASPNVEAEFVLGSHVSLN MGIAANPISTDNFKTTFTHFQPEVRYWLNRPMVSHFLGITAFVNNFDMMVKDVHHKGDAY AAGLTYGYAWVLGDHWNIEATAGVGVLRYRQFKYDKGTPKPGAVNDSKTTIAPVKLGVSF VYIIR >gi|225935369|gb|ACGA01000023.1| GENE 
149 205342 - 208095 2011 917 aa, chain - ## HITS:1 COG:no KEGG:BT_0126 NR:ns ## KEGG: BT_0126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 917 1 917 918 1699 87.0 0 MRRILLGLCFLSFLNSASGQEIPLPEKMPQTHPRVLTTPAGKQETWKLIKKEEWAKDVFN KLKERTEVYTNLTDAQPAWLLSRLAMYWKSHATEVYVKGETFDHAGGERAPYPTVRYTGT RGTAATHGRPKLADVVPYDDEDGNVTFCNNALPDRPMESVHPSKTGRNIESLNCEILGIA RDAAFLYWMTDEEKFAKLAAGVFDTYMTGIYYRNAPIDLNHGHQQTLVGLTSFEVIHEDA LHIAVPLYDFLYNYLKANYPDKMEIYAGAFKKWADNIIANGVPHNNWNLLQARFVMNVGL VLEDNKEYADGKGREYYIDYVMNRSSIRQWSLTRLADYGFDINTGIWAECPGYSSVVIND YANFVNQFDTNLQYDLVKAVPVLSKAVATTPQYLFPNRMICGFGDTHPGYLSTNFFIRMI QNAQANGKKEQENYFTALLKCLNPDLGNDKTEKKNVRVSVNSFFEDKPLTLNPKVQPGKI EDYVSPLFYAPNVSWLVQRNGMHPRNSLMISLNGSEGNHMHANGISMELYGKGYVLGPDA GIGLFLYSGLDYAEYYSQFPSHNTVCVDGISSYPVMKSNHSFDLLSCFPASAEPGKAFTS VTYSNLYFREPESRADQTRMMSIVTTGAETGYYVDVFRSRKEKGGDKMHDYFYHNLGQTL TLTTADGSDLNLQPTEELAFAGAHLYAYSYLYDKKVAVTNKDVKATFTIDMKDKDGDDIY MNLWMKGEPDREVFTALAPMTEGLSRTPNMPYNIKEQPTLTFVARQHGEAWNRPFVSIYE PSTKKEPSAIQSVSYFDAEGAGLEDFAGICVKSKNGRIDHIFSLSDAAQTATYQGMKVKA DYAVISNEYAGNRTLFLGNGTQLVAPGVMIQTDNAANVLLEKKEGKWYIISSAPCTVVIG DKKIKSDAASEPMLLRI >gi|225935369|gb|ACGA01000023.1| GENE 150 208258 - 210165 1648 635 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 46 566 8 527 527 684 61.0 0 MKILSIHDLATIRKRAEHNLSLREESNEKVTEKCYGLASGAQHLQILICGGTGCKASSSQ GITDNLLKAIKKNEITDKVEVITVGCFGFCEKGPIVKIIPDNTFYTQVTPEDAEEIINEH IIGGRRIERLLYVDPKTEHTVSDSKHMDFYRKQLRIALRNCGFIDPENIEEYIAREGYFA LADCLLNKQPTDVIDIIKRSGLRGRGGGGFPTGLKWEFANKQQSDVKYVVCNADEGDPGA FMDRSIMEGDPHSIVEAMCVCGYSIGSTKGLVYIRAEYPLAINRLKTAINQAREYGLLGD HILGTDFSFDIEIRYGAGAFVCGEETALIHSMEGKRGEPTLKPPFPAESGYLGKPTNVNN VETLANIPIILTKGADWFATIGTERSKGTKVFALAGKINNVGLIEVPMGTTLREVIYEIG 
GGIKGGKKFKAVQTGGPSGGCLTEKHLDTPIDFDNLLSTGSMMGSGGMIVMDEDDCMVSV ARFYLDFTVEESCGKCTPCRIGNKRLLELLNKITEGKGTEKDLDTLATLGQVIKDTALCG LGQTSPNPVLSTLDNFYDEYMEHVRDKTCRAKQCKSLLTYTISPERCIGCHLCAKNCPAD AISGLVRKPHVIAPDKCIKCGMCMARCKFNAILVC >gi|225935369|gb|ACGA01000023.1| GENE 151 210179 - 211945 1436 588 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 222 586 5 364 372 379 52.0 1e-104 MEEKQITLQIDGHFITVPEGSTILEAAIKIGINIPTLCHIDLKGTCIKNNPASCRICVVE VMGRRNLAPACATRCTEGMVVKTSTLRVMNARKVVAELILSDHPNDCLTCPKCGNCELQT LALRFNIREMPFNGGELSPRKREITASIVRNMDKCIFCRRCESVCNDVQTVGALGAIRRG FNTTIAPAFDRMMTESECTYCGQCVAVCPVGALTERDYTNRLLDDLANPDKVVIVQTAPA VRAALGEEFGFPPGTLVTGKMVYALRELGFDYVFDTDFAADLTIMEEGSEILNRLTRYLN GDKSVRLPILTSCCPAWVNFFEHHFPDMLDIPSTARSPQQMFGSIAKSYWAEKMGIPREK LVVVSTMPCLAKKYECARDEFKVNGIPDVDYSISTRELARLIKRANIGFPLVLDSPFDNP MGESTGAGVIFGTTGGVMEAALRSVYEIYTGETLKNVNFEQVRGLNGVRRATINLNGFEL KVGIAHGLGNARHLLEDIRNGHNEYHVIEIMACPGGCIGGGGQPLHHGNSEILYARANAL YREDANKPLRKSHENPYIKTLYEDYLGKPLGEKSETLLHTHYFNKAID >gi|225935369|gb|ACGA01000023.1| GENE 152 211965 - 212441 588 158 aa, chain + ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 4 156 16 171 176 143 45.0 1e-34 MSDIKLACDMAEQIKTICDKHGNKPGELINILHEAQHLQGYLPEETQRIIASKLGIPVSK VYGVVTFYTFFTMTPKGKHPISVCMGTACYVRGSEKLLEEFKRVLGIEVGDTTPDGKFSL DCLRCVGACGLAPVVMIGEKVYGRLQPVDVKKIIEELE Prediction of potential genes in microbial genomes Time: Fri May 13 07:25:02 2011 Seq name: gi|225935368|gb|ACGA01000024.1| Bacteroides sp. 
D2 cont1.24, whole genome shotgun sequence Length of sequence - 63477 bp Number of predicted genes - 69, with homology - 67 Number of transcription units - 27, operones - 19 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 436 - 1332 381 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 1 Op 2 . + CDS 1336 - 1878 335 ## Cpin_4451 YbaK/prolyl-tRNA synthetase associated region 3 1 Op 3 . + CDS 1911 - 2759 481 ## BT_0110 putative regulatory protein + Term 2770 - 2827 10.8 + Prom 2794 - 2853 5.7 4 2 Op 1 . + CDS 2971 - 3330 288 ## BT_0109 hypothetical protein 5 2 Op 2 . + CDS 3293 - 5335 1261 ## BT_0108 hypothetical protein 6 2 Op 3 . + CDS 5348 - 5626 278 ## BT_0107 hypothetical protein + Term 5652 - 5691 5.2 7 3 Tu 1 . + CDS 6002 - 6523 462 ## COG0262 Dihydrofolate reductase - Term 6556 - 6599 1.2 8 4 Tu 1 . - CDS 6629 - 8710 1141 ## COG0550 Topoisomerase IA - Prom 8778 - 8837 3.2 9 5 Tu 1 . - CDS 8892 - 10421 1300 ## BT_0103 hypothetical protein - Prom 10444 - 10503 4.3 + Prom 10401 - 10460 7.5 10 6 Op 1 . + CDS 10665 - 11504 491 ## Dde_0780 hypothetical protein 11 6 Op 2 . + CDS 11489 - 13615 620 ## COG1204 Superfamily II helicase - Term 13584 - 13629 12.2 12 7 Op 1 . - CDS 13635 - 15644 1139 ## COG3505 Type IV secretory pathway, VirD4 components 13 7 Op 2 . - CDS 15665 - 16903 705 ## BT_0101 hypothetical protein 14 7 Op 3 . - CDS 16888 - 17325 500 ## BT_0100 hypothetical protein 15 7 Op 4 . - CDS 17346 - 17549 58 ## BT_0099 hypothetical protein - Prom 17691 - 17750 3.1 + Prom 17408 - 17467 3.9 16 8 Op 1 . + CDS 17685 - 18002 120 ## BF1247 hypothetical protein 17 8 Op 2 . + CDS 17999 - 18760 471 ## COG1192 ATPases involved in chromosome partitioning 18 8 Op 3 . + CDS 18765 - 19235 433 ## BT_0097 conjugate transposon protein 19 8 Op 4 . + CDS 19237 - 19950 382 ## BF1244 hypothetical protein + Term 20062 - 20102 5.3 20 9 Op 1 . 
+ CDS 20113 - 20409 248 ## BF1243 putative transmembrane conjugate transposon protein 21 9 Op 2 . + CDS 20422 - 20754 219 ## BF1242 putative transmembrane conjugate transposon protein 22 9 Op 3 . + CDS 20751 - 23255 1312 ## COG3451 Type IV secretory pathway, VirB4 components 23 9 Op 4 . + CDS 23284 - 23517 259 ## BT_0092 conjugate transposon protein 24 10 Op 1 . + CDS 23645 - 24274 512 ## BF1239 hypothetical protein 25 10 Op 2 . + CDS 24295 - 25320 674 ## BT_0090 conjugate transposon protein 26 10 Op 3 . + CDS 25333 - 25956 397 ## BF1237 hypothetical protein 27 10 Op 4 . + CDS 25960 - 26265 66 ## BF1236 hypothetical protein 28 10 Op 5 . + CDS 26240 - 27592 989 ## BT_0087 conjugate transposon protein 29 10 Op 6 . + CDS 27626 - 28534 552 ## BT_0086 conjugate transposon protein 30 10 Op 7 . + CDS 28537 - 29115 341 ## BT_0085 conjugate transposon protein 31 10 Op 8 . + CDS 29137 - 29580 196 ## BT_0084 conjugate transposon protein + Term 29598 - 29654 2.4 + Prom 29630 - 29689 8.8 32 11 Op 1 . + CDS 29723 - 30496 81 ## gi|260171151|ref|ZP_05757563.1| hypothetical protein BacD2_04721 33 11 Op 2 . + CDS 30537 - 31553 529 ## PSHAa1358 hypothetical protein + Term 31721 - 31780 6.1 - Term 31611 - 31642 2.3 34 12 Op 1 . - CDS 31650 - 32129 410 ## BF1225 hypothetical protein 35 12 Op 2 . - CDS 32149 - 32364 282 ## BF1224 hypothetical protein 36 12 Op 3 . - CDS 32389 - 32601 100 ## BF1223 hypothetical protein 37 12 Op 4 . - CDS 32645 - 33178 555 ## COG4734 Antirestriction protein - Prom 33296 - 33355 6.2 - Term 33278 - 33334 -0.0 38 13 Tu 1 . - CDS 33476 - 33928 376 ## COG2003 DNA repair proteins - Prom 33956 - 34015 5.4 - Term 34126 - 34160 4.0 39 14 Op 1 . - CDS 34186 - 34728 565 ## BVU_2471 hypothetical protein - Term 34750 - 34799 13.5 40 14 Op 2 . - CDS 34805 - 36037 949 ## BT_0076 transposase - Prom 36059 - 36118 2.5 - TRNA 36312 - 36396 48.3 # Ser TGA 0 0 + Prom 36409 - 36468 6.2 41 15 Op 1 11/0.000 + CDS 36638 - 38866 2475 ## COG1882 Pyruvate-formate lyase 42 15 Op 2 . 
+ CDS 38912 - 39610 710 ## COG1180 Pyruvate-formate lyase-activating enzyme + Term 39644 - 39687 5.1 - Term 39571 - 39628 18.1 43 16 Op 1 . - CDS 39638 - 40081 411 ## COG3023 Negative regulator of beta-lactamase expression 44 16 Op 2 . - CDS 40068 - 40235 209 ## gi|160887574|ref|ZP_02068577.1| hypothetical protein BACOVA_05594 45 16 Op 3 . - CDS 40256 - 40759 533 ## BT_4735 hypothetical protein - Prom 40871 - 40930 4.0 + Prom 40726 - 40785 6.4 46 17 Op 1 . + CDS 40953 - 41183 257 ## gi|160887572|ref|ZP_02068575.1| hypothetical protein BACOVA_05592 47 17 Op 2 . + CDS 41271 - 41981 443 ## BT_4734 hypothetical protein 48 17 Op 3 . + CDS 41858 - 43387 985 ## COG5545 Predicted P-loop ATPase and inactivated derivatives + Term 43417 - 43459 2.2 + Prom 43407 - 43466 4.5 49 18 Tu 1 . + CDS 43502 - 43681 98 ## + Term 43840 - 43874 0.7 - Term 43548 - 43609 18.7 50 19 Op 1 . - CDS 43627 - 43929 434 ## BT_4733 hypothetical protein 51 19 Op 2 . - CDS 43895 - 44248 255 ## BT_4732 hypothetical protein - Term 44249 - 44289 1.7 52 20 Op 1 . - CDS 44305 - 44550 139 ## BT_4730 hypothetical protein 53 20 Op 2 . - CDS 44559 - 44693 169 ## - Prom 44728 - 44787 2.6 - Term 44713 - 44768 16.4 54 21 Op 1 . - CDS 44813 - 46084 817 ## PRU_2549 hypothetical protein 55 21 Op 2 . - CDS 46121 - 47392 781 ## Cpin_6746 hypothetical protein 56 21 Op 3 . - CDS 47429 - 48280 645 ## gi|260171174|ref|ZP_05757586.1| hypothetical protein BacD2_04846 57 21 Op 4 . - CDS 48290 - 49213 617 ## gi|260171175|ref|ZP_05757587.1| hypothetical protein BacD2_04851 58 21 Op 5 . - CDS 49236 - 50654 1011 ## Dfer_0773 RagB/SusD domain protein 59 21 Op 6 . - CDS 50674 - 54012 1605 ## BT_0190 hypothetical protein - Prom 54075 - 54134 2.5 - Term 54106 - 54161 10.1 60 22 Op 1 . - CDS 54183 - 55742 1364 ## BT_1484 hypothetical protein 61 22 Op 2 6/0.000 - CDS 55819 - 56778 800 ## COG3712 Fe2+-dicitrate sensor, membrane component 62 22 Op 3 . 
- CDS 56842 - 57438 438 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 57505 - 57564 1.9 - Term 57567 - 57623 5.5 63 23 Op 1 . - CDS 57651 - 58955 1172 ## BT_4721 hypothetical protein 64 23 Op 2 . - CDS 58962 - 59510 422 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 59651 - 59710 7.9 + Prom 59576 - 59635 5.1 65 24 Tu 1 . + CDS 59686 - 60408 759 ## BT_4719 hypothetical protein + Term 60475 - 60516 9.5 - Term 60464 - 60502 4.2 66 25 Op 1 . - CDS 60540 - 61220 744 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 61288 - 61347 4.3 67 25 Op 2 . - CDS 61368 - 62294 792 ## COG0583 Transcriptional regulator - Prom 62397 - 62456 9.8 + Prom 62223 - 62282 4.2 68 26 Tu 1 . + CDS 62431 - 62907 640 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) + Term 62941 - 62987 7.2 - Term 62953 - 62994 -0.0 69 27 Tu 1 . - CDS 63030 - 63386 221 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|225935368|gb|ACGA01000024.1| GENE 1 436 - 1332 381 298 aa, chain + ## HITS:1 COG:BH1937 KEGG:ns NR:ns ## COG: BH1937 COG0697 # Protein_GI_number: 15614500 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 4 283 3 281 298 119 32.0 5e-27 MKNIKYIFLVLLGGTMYGTMSSLVKLLYTKGFNAAELAFWQALLAALFLGVCTFITRKSD KGKLLYKDFLPLLLTGSAIGLTNFLYYQSVSYISASLAIVILMQFTWFSLLLEWIIFTKK PSRLEFLTVFFILIGTLMAGNLFDVKEWSFSPKGIMLALFSSLTYAVYIIANGRVSKGVR WQAKSMSIMIGSSLSIFIINSQTILVNNHFGSDFMLWAIFLAIVGTTIPTALFAAGISKI GAGISSILMTIELPVAVICAYVVLNEYISPLQIAGIIIMLAAITSMNYYKYLKTKKQK >gi|225935368|gb|ACGA01000024.1| GENE 2 1336 - 1878 335 180 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4451 NR:ns ## KEGG: Cpin_4451 # Name: not_defined # Def: YbaK/prolyl-tRNA synthetase associated region # 
Organism: C.pinensis # Pathway: not_defined # 1 179 1 179 179 215 61.0 6e-55 MFYISEIMNTKPTVFNTELQNLVYTTLDKLEIPFERVDTDEAISMDDCVLINQKMNMKMV KTLFLCNRQQTEFYLFITTAHKSFKSKEFSKALNISRISFTPAELLQEKLGTKIGAATIF SVLLDVKNEIQVVIDEEVISEDFYGCSDGTTTGYMKISTKLIINNFLNYAMHIPTIIQIQ >gi|225935368|gb|ACGA01000024.1| GENE 3 1911 - 2759 481 282 aa, chain + ## HITS:1 COG:no KEGG:BT_0110 NR:ns ## KEGG: BT_0110 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 282 26 286 286 297 58.0 3e-79 MEKFVNMIISQINTEIENVCMNSDVSSDKALYMLNFIRPLFERLREFIHEYTFQDANEEI SFFKNIKPFILSKLIYFNDIYTLELRKPNGSKDVLKEYYKKKQTTITEFCNNNLDFYQYY RSKATHLDKYYFLRGHENYKLCHNGGMFDKDPLFSTCCDHRVAKMLAYDMLEIYLQQRLQ EVERKEVIEISRASLPDNPFQWTGTKIAAIELGYAIYAAGVLNNGNADIKEIMTYIEASF KIDLGDYYRTYLTIRERKKDKTSFLTNLINQLLRKMDEDDKL >gi|225935368|gb|ACGA01000024.1| GENE 4 2971 - 3330 288 119 aa, chain + ## HITS:1 COG:no KEGG:BT_0109 NR:ns ## KEGG: BT_0109 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 119 119 192 95.0 4e-48 MEIITFESKAYKDLDNKITAIADYIFNHIEAESTNEDDIWVDSYEVCTFLKISEKTLQRL RVAGTIAYSNIRGRYFYKISEVKRMLEERLIRSNKENIQNLITNHQLYVKERRNIRKDK >gi|225935368|gb|ACGA01000024.1| GENE 5 3293 - 5335 1261 680 aa, chain + ## HITS:1 COG:no KEGG:BT_0108 NR:ns ## KEGG: BT_0108 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 641 1 641 666 1237 95.0 0 MLRKEEILERTNNGLSVFKHYISGNWRIGRNFLNPLYEDNKASCNIYFDRRSGIYKMKDF GNDSYSGDCFFFVGQLKGLDCNNSMDFVEILETIDRDLGLGLATGNPIPVTRTSCHIVDD MPEETPEKESKPYQFREQKFPLAELMYWQQYGITPEILELYKVCSLRDFQSETADGTPFT YTSSVAEPMYGYKSKRYIKLYRPFSKTRFLYGGSFGENYCFGLEQLPAKGDTLFITGGEK DVMSLAAHGFHAICFNSETVTISPTLIYKLTFRFKHIILLYDTDKTGKESARKQEKQLEE LGVKRLLLPLPGTKEEKDISDYFKAGNTREDFLKLFIEFLDNLYSDTLIMLKSCEIDFNN PPAKAQVIISAGDVPLGTQGNLFGITGGEGTGKSNYIAAMLAGCICQPDKEVDTLGIQIA ANSKHKAVLLYDTEQSEVQLFKNVSNLLTRAKQQNKPEELKAFCLTGMSRKERLHAIVQS 
MDKFYYQYGGIQLVVIDGIADLVKSANDEVESVAVIDELYRLAGIYNTCILCVLHFVPNG LKLRGHLGSELQRKAATILSIEKDEEPTQSVVKALKVRDGSPLDVPLMLFAWDKTTGMHL YKGEKPREEKEKRKEKELVGVARDVFGRQEHITYIDLCEQIQQILDVKERTAKSYIRFMR ERDIIIKDPSNQSYFMIGLI >gi|225935368|gb|ACGA01000024.1| GENE 6 5348 - 5626 278 92 aa, chain + ## HITS:1 COG:no KEGG:BT_0107 NR:ns ## KEGG: BT_0107 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 92 92 162 98.0 5e-39 MYIDKDDFTAWMERIMDRFDMQDKKIDKVIKGRNCLDGEELLDNQDLCLLLKVAPRTLTR YRKKGILPFLMLDGRCYYRATDVHKLIREKTD >gi|225935368|gb|ACGA01000024.1| GENE 7 6002 - 6523 462 173 aa, chain + ## HITS:1 COG:MA4540 KEGG:ns NR:ns ## COG: MA4540 COG0262 # Protein_GI_number: 20093324 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Methanosarcina acetivorans str.C2A # 4 172 8 181 198 82 30.0 3e-16 MKHLKTCIAVSLDGFIATKDNELDWMPENVKLEISAAYEQPDILLAGVNTYSYIFEHWGG WPYKSKKTFVVSHYDTNVTEKENVTFLTDMPLRAINELKSSSETDIQVIGGGKFITSLIE ASLLDEITLYIVPVMLGDGIRFIGKTFGSKWELTGHRVIDNQVVCLTYQYKGE >gi|225935368|gb|ACGA01000024.1| GENE 8 6629 - 8710 1141 693 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 649 4 677 709 413 38.0 1e-115 MIAVIGEKPSVARDIARILGANEKQDGYLSGNGYLVTWAFGHLVGLAMPEAYGIQSFRRE SLPIIPDSFQLTPRQVKADKGYKADPGALKQLKVIKEVFNQSDKIIVATDAGREGELIFR YIYQYIGCNKPFVRLWISSLTDKAIREGLQNLKAGSLYDNLFLSAQARSEADYLIGINGT QALSVAAGQGIFSLGRVQSPTLAMICTRFLENKNFVPQKYWQLKLQAAKDNVSFVALSAD KYDKQQPAIDTLQRIKEAETVQVKNVERKEVNQEPPLLYDLTTLQKEANTKLNFSADKTL SIAQKLYEGKLISYPRTGSRYISQDVFEEIPERLVNLEQYARFAGYAAGMKGKALNSRSV NDGKVTDHHALIVTENLPGKLETDEQVIYELITGRMLEAFSEKCVKDVTSVTLECAGSLF TVKGSIIKSAGWRTVFGEKEDGEDNATLPAMQDGDSLPLSGIELLEKQTKPKPLHTESSL LSSMETAGKELENAELKASMKDTGIGTPATRAAIIETLFSRQYIVREKKNLVPTEKGLAV YNIIRDKKIADVEMTGMWENTLAKIESGEMNPDTFRKGIEVYARQITAELLDVQLSFASG 
SGCICPKCKTGRIFFYPKVAKCSNVDCSVTIFRNKSDKQLTDKQITELVTTGKTGLIKGF KSKNGKVFDASLAFDEQFNVTFVFPEKKGKPKK >gi|225935368|gb|ACGA01000024.1| GENE 9 8892 - 10421 1300 509 aa, chain - ## HITS:1 COG:no KEGG:BT_0103 NR:ns ## KEGG: BT_0103 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 509 1 511 511 804 95.0 0 MDVNTANQSTSDEQMMDILLVLDKEKKTISAVKGVDENGELQTVPPENNSELLKFDRHGD FFSNFFSNMMSQLKNPTRFNFFKIPKIELPKIEPIIRDNFNNPTPENEANIERYRVTPET VKQEVKKEQQETQNGQQSAKEAGATAQALQQPDKSKYYIDPDKVDWEALKNFGLSKEQLE KAKALEPMLRGYKSPGAFTIAGNYNSAIMKMDARLSFRHDKDGNVVLAIHGIRQKPELER PFFGHEFSKEDKANLLETGNMGRIVNLKNYITGEIIPSFVSVDKVTNELVSMRASSVQIP DEIKGVKLNKEQQQALREGKGVFLENMISNRKNPFSATVQVNADKKSLEFIYPNAKQSQE QQQKQTQQNSLVTSDGVTIPKSISGIELSRQQQQDLVNDKTIFVAGLKDKRGVEYDAYIQ VNHDKKKLGFYSDNPSFDRSAVKEITPASKNRTQVAVNSEGKTNEATKKVKEPLKKGQDK PTEKQKTKQDKKEKQEQTDKPKQSRGRKR >gi|225935368|gb|ACGA01000024.1| GENE 10 10665 - 11504 491 279 aa, chain + ## HITS:1 COG:no KEGG:Dde_0780 NR:ns ## KEGG: Dde_0780 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 1 279 5 283 283 264 46.0 3e-69 MIHPFFKVIIHKDENIPDLHGICAGFEGLKWRKEQLVDHLFEYLPEFALNYSEYIDLNGE NAIAKIRQVAANIYKSKKFESRGEFGELLLHAIIRETYNTIPAISKIYYKDGPNETVKGF DAVHVIDLGETLELWLGEVKFYQDISSAMHSVIEELKQHIEVRYIKSEFIAITNKIDSKW SHSDKLKALLNPNTSLDDVFENTCIPVLLTYDSKILAKYNKSCKEYIDEITQELIQYHDK FCNDLGKFPVTVHLFLLPLNTKKELIECIDKKLRIWQSL >gi|225935368|gb|ACGA01000024.1| GENE 11 11489 - 13615 620 708 aa, chain + ## HITS:1 COG:yfjK KEGG:ns NR:ns ## COG: yfjK COG1204 # Protein_GI_number: 16130545 # Func_class: R General function prediction only # Function: Superfamily II helicase # Organism: Escherichia coli K12 # 41 683 38 701 729 292 31.0 2e-78 MAESINIQELRRILRGKVIRQIDKYSILKDIANLTALNRTIGQEFILRLLSRISDFKGFE EIIYSLVRQVGLFPYLNEDELSFRDTIAYEMHRPTGFKENIVFHHAQAEIYYALLRKENI VLSAPTSFGKSLIIDSIIASLNYSNIVIIVPTIALIDETRKRLSKFKSIYKIITHPSQVI TSKNVFILTQERALEIIKNTNVDFFVIDEFYKLSPQRTDEERCHILNQVFYTLVKTGAQF 
YLLGPNIERVTTELLDNIQYKFIKTDYKTVVSERHNITVGKNEKPINKLIELCKELDEPT LIFCQSPASANKVATAFLESQVFPKETINDDLVKWLKNNYHPQWILPNCIERKIGIHHGK IPRSIAQKCIKLFNDGNLKFLICTSTLIEGVNTKAKNVIIYDNKIARSKFDYFTFNNICG RSGRMFSHFIGNIYLFHEPPLAELPLVDFPIFSQTEDVPEKLLINIDEEDLKESSKLKLK KYIEQEVLSKEVLKKNSYIDLDKQIKLASFIKENLSKVHPILNWNQYPTNEQLKAVCILI WEFLVASNKMIYGVSSGRQLHFRINQYREAGNIKNFISMIIQDCKSNEEINEKIELSFDI QRHWINFQFPRFLISLNDIANDVFKKYRLEQCDYSYFASMVECYFISPYVVPLDEYGLPI QISEKIGKIIQLSPNIDEALQQISRFNPMMTNLPKIEKEFIEEFQEYI >gi|225935368|gb|ACGA01000024.1| GENE 12 13635 - 15644 1139 669 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 201 558 117 466 589 93 25.0 1e-18 MQNEDDLRGLAKVMDFMRAISILFVVINIYWFCYQSFREWGINIGVVDKILLNFQRTAGL FSNILYTKLFSVVFLALSCLGTKGVKEEKITWNRIYVFLTIGFILFFLNWWLLVLPLPLA ANTALYIFTMTAGYLSLLAAGVWISRLLKNNLMDDVFNLENESFAQETRLIENEYSINLP TRFYYKKKWNDGWINIVNPFRASMVLGTPGSGKSYAIVNNYIKQQIEKGFAVYIYDYKFP DLSEIAYNHLINHLDGYKVKPKFYMINFDDPRKSHRCNPINPAFMTDISDAYEASYTIML NLNRSWITKQGDFFVESPIILLASIIWYLRIYQGGKYCTFPHAIEFLNKKYADTFTILTS YPELENYLSPFMDAWESNAQDQLQGQLASAKIPLSRMISPALYWVMTGDDFSLDINNPEE PKILVVGNNPDRQNIYSCALGLYNSRIVKLINKKHQLKSSVIIDELPTIYFRGLDNLIAT ARSNKVAVCLGFQDFSQLRRDYGDKESKVIENTVGNIFSGQVVSETAKTLSDRFGKVLQK RQSMTINRNDKSTSISTQMDSLIPPSKISNLTQGMFVGAVSDNFDERIEQKIFHAEIVVD NEKVKRETAQYKKLPQIIDFRDEDGNDRMQEEIQANYNRVKQEVQQIVTDEMERIKNDPD LRHLIKTEN >gi|225935368|gb|ACGA01000024.1| GENE 13 15665 - 16903 705 412 aa, chain - ## HITS:1 COG:no KEGG:BT_0101 NR:ns ## KEGG: BT_0101 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 412 1 412 412 721 94.0 0 MVAKISVGSSLFGALSYNQNKVDEEQGKVLLSNRMFESEDGNFSIRRCMECFDMHLPADL KTEKPIIHISLNPHPDDVLSDSQLADIAKEYMDKLGYGNQPYMVYKHEDIARHHIHIVSI RVDETGKKINDKFEHIRSKQITRELEQKYGLHPAEKKQATERPELKKVDYRAGDVKHQLS 
NTVKALASSYRFQSFTEYKALLSIYNVQAEEVKGEANGKPYNGIVYSATNDKGEKRGNPF KSSSLGKSVGYEAIQRHIKKSAKGIQDKNLKERTRRTVGAVMQSARSREQLITELKAKGI DVLFRQNDTGRIYGVTFIDHENRTVLNGSRLGKDFSANVFNDLFSGSRTLTGNSKQETQE HTPEFNPTGHIENTGKTIAGLFSLLSGGDDTPPDNSQVPPPKKKKKKKQRRI >gi|225935368|gb|ACGA01000024.1| GENE 14 16888 - 17325 500 145 aa, chain - ## HITS:1 COG:no KEGG:BT_0100 NR:ns ## KEGG: BT_0100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 145 145 266 98.0 1e-70 MKQKNENASRPGGAGRKPKADPAVFRYSISFNAIDHARFLALFDQSGMRTKAHFITARIF GEPFKVIKIDKAAVEYYTRLTALYSQYRGIAVNYNQVVKALNTNFSEKKALAFLYKLEKA TMELADLNRQIIELTREFETRWLQR >gi|225935368|gb|ACGA01000024.1| GENE 15 17346 - 17549 58 67 aa, chain - ## HITS:1 COG:no KEGG:BT_0099 NR:ns ## KEGG: BT_0099 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 27 93 93 105 94.0 7e-22 MQPKGIGQANQAPAAGSVPISANSNPSVGGFLFALADSKVCSALLEILSAALNALPGRSR DRVTPKS >gi|225935368|gb|ACGA01000024.1| GENE 16 17685 - 18002 120 105 aa, chain + ## HITS:1 COG:no KEGG:BF1247 NR:ns ## KEGG: BF1247 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 105 24 148 148 96 55.0 3e-19 MDRWKYSLKSVRKESGFENLKTRFQDCNLSRKQESKPDGMIERKRSFNRERKLSVFVERL KGTRIGRKKSFLLSGFLYGTKENTLSVNLNGWNEIKMYHNKPVLQ >gi|225935368|gb|ACGA01000024.1| GENE 17 17999 - 18760 471 253 aa, chain + ## HITS:1 COG:DR0013 KEGG:ns NR:ns ## COG: DR0013 COG1192 # Protein_GI_number: 15805054 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Deinococcus radiodurans # 8 196 51 247 296 63 31.0 5e-10 MKKETLYVAFSTQKGGVGKTTFTVLVASYLYYLKGYNVAVVDCDYPQHSISAMRKRDAEQ VNSDEYYKRLAFSQFKTLGKKAYPVLCSSPDEAIKTADEYLASAGMDYDVVFFDLPGTVN SEGVINSLSGVDYIFTPIAADRVVLESSLSFAVAIDKLLVKNEACRLKGLHLFWNMVDGR EKTDLYTLYEQTIGELELPLMKVFIPDTKRFKKELDAQRKTVFRSTLFPADKRLVKGSNM EELITEIAYLIKL >gi|225935368|gb|ACGA01000024.1| 
GENE 18 18765 - 19235 433 156 aa, chain + ## HITS:1 COG:no KEGG:BT_0097 NR:ns ## KEGG: BT_0097 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 145 145 204 83.0 9e-52 MAKQSGGKPQIDEDFMKEIISQGLPVKKQETPMVAIPTEVKTKAETPDIPANVPEQEPKR ETETVKEEKAVKEPARRKKNAPGDFRETYFMRVDLTDRQPLYVSRTTHEKLMKIVTVIGG RKATVSSYVENILLRHFDQFQDEINELYESKFEKPF >gi|225935368|gb|ACGA01000024.1| GENE 19 19237 - 19950 382 237 aa, chain + ## HITS:1 COG:no KEGG:BF1244 NR:ns ## KEGG: BF1244 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 237 1 237 237 393 89.0 1e-108 MLKWFIIGTLLYYAILLWRYRDSLGTWFTSGKREPEQPGSKPTVKTGNGDSLVGASRYRM GQMRTNGDILGHLSKGVDNASIFVPQGDETVTKTLDNELADEKNTDTPQAIETEFEMEFE TEDSEEPDVSPDEIEAEEIACYMGEGEPEMAQGVTLGELGQMVQVIQVKQAPEMEERQAV QTICRTETNLFHSLVEQINGGRSRVAELLQKHEIPVPVTVPAAGNDEVAAFDMNDFL >gi|225935368|gb|ACGA01000024.1| GENE 20 20113 - 20409 248 98 aa, chain + ## HITS:1 COG:no KEGG:BF1243 NR:ns ## KEGG: BF1243 # Name: not_defined # Def: putative transmembrane conjugate transposon protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 98 1 98 98 144 100.0 1e-33 MKKKVLFSAAVLLAASSAFAQGNGMAGITEATNMVTSYFDPATKLIYAIGAVVGLIGGVK VYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|225935368|gb|ACGA01000024.1| GENE 21 20422 - 20754 219 110 aa, chain + ## HITS:1 COG:no KEGG:BF1242 NR:ns ## KEGG: BF1242 # Name: not_defined # Def: putative transmembrane conjugate transposon protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 110 1 110 110 212 100.0 4e-54 MADYPINRGIGKPVEFKGLKSQYLFIFAGGLLALFVLFIIMYMVGINQWVCIIFGVTSAT LLVWLTFRLNEKYGTHGLMKLSARKSHPFHIINRKAISRLFTKNQKQASK >gi|225935368|gb|ACGA01000024.1| GENE 22 20751 - 23255 1312 834 aa, chain + ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 
components # Organism: Salmonella typhimurium LT2 # 437 733 189 472 593 79 24.0 3e-14 MRNILKATTLENKFPLFTVENGCIVSKDADITVAFRVELPELFTVTAAEYEAIHSAWNKA VKVLPDYSIVHKQDWFIKENYAPDIQKDDLSFLSRSFERHFNERPFLNHTCYLFLTKTTK ERSRMQSNFSTLCRGFLVPKEIRDKETVTKFLEAVGQFESIMNDSGFITLTRLTSDEITG TKETAGIVEKYFSLSQTDTTTLKDISLGADEMKIGDDILCLHTLSDAEDMPGKVGTDTRY EKLSTDRSDCRLSFASPVGVLLSCNHIYNQYIFIDDHTENLKQFEKMARNMHSLSKYSRA NQINKSWIEEYLNEAHSQGLISVRCHCNIMAWSDDRDELKHIKNDVGSQLALMECKPRHN TTDTPTLFWAGIPGNQADFPAEESFYTFIEQALCLFTEETNYKSSLSPFGIKMVDRVTGK PLHIDISDLPMKKGIITNRNKFILGPSGSGKSFFTNHMVRQYYEQGAHVLLVDTGNSYLG LCEMINRKTHGEDGIYFTYTTENPIAFNPFYVEDGVFDIEKKESIKTLILTLWKRDDEAP TRAEEVALSNAVSSYIELITKDSSVTPCFNTFYEYVKTDYREHLQEKNVREKDFDIDNFL NVLEPYYKGGEYDYLLNSDKELDLLHKRFIVFELDNIKDHKILFPITTIIIMEVFINKMR KLKGIRKLILIEEAWKAIASANMADYIKYLYKTVRKYFGEAIVVTQEVEDIISSPIVKES IINNSDCKILLDQRKYLNKFDSIQNLLGLTDKERSQVLSINLANHPNRKYKEVWIGLGGT QSAVYATEVSLEEYFTYTTEETEKMELFTLSEKLGGNLELAIKRLAESKRNPEK >gi|225935368|gb|ACGA01000024.1| GENE 23 23284 - 23517 259 77 aa, chain + ## HITS:1 COG:no KEGG:BT_0092 NR:ns ## KEGG: BT_0092 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 118 134 98.0 1e-30 MKHLKFLFPVLLCTLLLGLSSCKETNADRLRAMRGDWVSVKNRPAFTLFEENGHYRVTTY RKTYRGTIQTETYQISE >gi|225935368|gb|ACGA01000024.1| GENE 24 23645 - 24274 512 209 aa, chain + ## HITS:1 COG:no KEGG:BF1239 NR:ns ## KEGG: BF1239 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 209 1 209 209 367 98.0 1e-100 MRTKIIMLLVVCSLFTGKVNAQWVVSDPGNLAQGIINASKNIVQTSSTAQNMVKNFQETV KIYQQGKEYYDRLKSVHNLVKDARKVQKSILLIGEISDIYVNSFQKMLSDENYTPDELSA IAYGYTQLLQESSDVLEEMKSVVNINGLSMSDKERMDIIDRTYNAIRNYRDLVSYYTRKN ISVSYLRAKKKKDTDRVMALYGSADERYW >gi|225935368|gb|ACGA01000024.1| GENE 25 24295 - 25320 674 341 aa, chain + ## HITS:1 COG:no KEGG:BT_0090 NR:ns ## KEGG: BT_0090 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
341 1 341 341 619 99.0 1e-176 MLLAIEFDNLHQILRSLYTDMMPLCGNMAGVAKGIAGLGALFYVAAKVWQSLASAEPIDV YPLLRPFVIGFCIMFFPTFVLGTINSVMSPVVKGCNNMLETQTFDMNAYREQKDKLEYEA MVRNPETAYLVSDEAFDKQIDELGWSAKDIATMGGMYMDRAAHNIKQSVRDWFRELLELL FQSAALVIDTIRTFFLIVLAILGPIAFAISVYDGFQATLTQWITRYISVYLWLPVSDLFS SILARIQVLMLQKDIQELSDPNFIPDGSNSVYIIFMIIGIIGYFTIPTVSNWIIQAGGMG NMSRNINSAANKTGSGVGAVAGAATGNAGGRVGGKLIKSNQ >gi|225935368|gb|ACGA01000024.1| GENE 26 25333 - 25956 397 207 aa, chain + ## HITS:1 COG:no KEGG:BF1237 NR:ns ## KEGG: BF1237 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 207 1 207 207 395 98.0 1e-109 MEFKSLKNIETSFKQIRLFGIVFVVMCTLITGYAVWNSYTFAEAQRQKIYVLDGGKSLML ALSQDLSQNRPVEAREHVKRFHELFFTLSPDKNAIESNIKRSLFLADKSAFNYYRDLSEK GYYNRIISGNISQTIHIDSVSCNFDVYPYAVATYARQMIIRESSVTERSLVTRCRLLNAV RSDNNPHGFIMESFEITENKDLNTIKR >gi|225935368|gb|ACGA01000024.1| GENE 27 25960 - 26265 66 101 aa, chain + ## HITS:1 COG:no KEGG:BF1236 NR:ns ## KEGG: BF1236 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 101 1 101 101 180 90.0 1e-44 MIRKILSPVNQAIIHVQDWADEKLRHLCGRMTPEIRVAVILLMLLFFGGLSIYFTVSSIY RIGKEDGETIRIEHIRQLQLQGKDSTNIFNQSDNGRKRSKE >gi|225935368|gb|ACGA01000024.1| GENE 28 26240 - 27592 989 450 aa, chain + ## HITS:1 COG:no KEGG:BT_0087 NR:ns ## KEGG: BT_0087 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 450 1 461 461 721 91.0 0 MEEKEVRNEVPAPETDRKKEEKTKKELTPQQIQQRKKMLVYPLMGLVFLGSMYLIFAPSD KDEAKVENVGGFNADIPQPKGDGIISDKKTAYEQEQMENKQADKMRSLQDFAFSLGEENG NGEDLTLIDDAPAEKPKTNVIDFGAGAPSNSRSSIQSSAAAYRDMNRQLGSFYETPKEDK EKEELKRQVEELTARLDAKENGVGSMDEQVALMEKSYELAAKYMNGGQSGQVAQVTPTAA IQEKGNAAPVKSVSDRTVSGLQQPMSNAEFIAEYSKPRNYGFNTAVSSSYSMGKNTIRAC IHNDQTLTDGQTVKLRLLEPLQAGNVIVPKNSLVSGSAKVQGERLDILVSSLEYAGNIIP VELAVYDSDGQKGLSVPSSLEQEAAKEAMANIGAGLGTSISFAQSAGQQVAMDITRGLMQ GGSQYLAKKFRTVKVHLKANYQVMLYAKQQ >gi|225935368|gb|ACGA01000024.1| GENE 29 27626 - 28534 552 
302 aa, chain + ## HITS:1 COG:no KEGG:BT_0086 NR:ns ## KEGG: BT_0086 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 302 1 302 302 583 97.0 1e-165 MKKLMILFALIMGVVSMKAQSNDLYQGITKKLPYRQMVTPYGVQVTFAKTVHIIFPSAVK YVDLGSNYIIAGKADGAENVVRVKATTEGFPGETNFSVICEDGSFYSFNAKYAHEPEMLN IEMKDFLENEDTTDFSHTRMNIYFRELGSESPLLVKLIMQSIYKNNDREIKHLGCKRFGV QFLVKGIYSHNGLFYFHTQTKNSSNVPFDTDFIRFKIVDKKVAQRTAIQETVIDPVRSYN EILVIGGKSTVRTVYTVPQFTIPDDKILVIELMEKNGGRHQTIRVENSDVVAAKVINELK IK >gi|225935368|gb|ACGA01000024.1| GENE 30 28537 - 29115 341 192 aa, chain + ## HITS:1 COG:no KEGG:BT_0085 NR:ns ## KEGG: BT_0085 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 386 98.0 1e-106 MKRICCIISLFVLCLTFNRAHAQRCLPGMKGLQVTGGMADGVHWNSKSDFAYYFGAAMST YTQNGNRWVIGGEYLEKNYPYKDLQIPVSQFTGEGGYYLNFLSDRKKTFFLSLGLSALAG YETSNWGDKLLPDGSTLTDKDGFVYGGALTLELESYITDRVVFLINARERCLFGSSVGKF HTQFGIGLKIIM >gi|225935368|gb|ACGA01000024.1| GENE 31 29137 - 29580 196 147 aa, chain + ## HITS:1 COG:no KEGG:BT_0084 NR:ns ## KEGG: BT_0084 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 263 87.0 2e-69 MYMRNVIYKMLTGCYIVATLLLVGACSDNVDIQQSYPFSIETMPVPKKLKVGETAEIRCQ LHRDGRYEETEYFIRYFQPDGAGTLQMSDGTVLLPNDLYPLPSETFRLYYTSASTDQQTV DVYFQDSFGTLIQLTFSFNNDSSKENK >gi|225935368|gb|ACGA01000024.1| GENE 32 29723 - 30496 81 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171151|ref|ZP_05757563.1| ## NR: gi|260171151|ref|ZP_05757563.1| hypothetical protein BacD2_04721 [Bacteroides sp. 
D2] # 1 257 1 257 257 453 100.0 1e-126 MRMYKLIINLLFIIAFIVVIWINFFEDLYITTTFFKNADKFNRLVENISMAYITSYIFYI VIYVFKSKQDERVILPFIADYVFVAMNNCVYFCYSMRSKSGLEYFPFETSIYDRNTKIYP NEEELKIICSKINPNEIDKDKPVLPGFTPIPHFFGVMIKYVHSIDYFIKIVLEKSAFLDV ELLRILTDIQTHGFHQEMMSYDKSMLLTAKHRHDNLMIYKDSLRTYFELFIKLEKYSKDH LKQYVERESLKVKSINK >gi|225935368|gb|ACGA01000024.1| GENE 33 30537 - 31553 529 338 aa, chain + ## HITS:1 COG:no KEGG:PSHAa1358 NR:ns ## KEGG: PSHAa1358 # Name: not_defined # Def: hypothetical protein # Organism: P.haloplanktis # Pathway: not_defined # 6 331 2 327 337 288 46.0 2e-76 MNNERLNRYLQYQNEEKMEELKKHSVIIDEFIKYCNSKQINLTKANFDYIQTIGIIAKYP NIVYLLNEKIIPDKENLVGMNLLESEYQMKPFASGYYYSDKYMVMAHPYFRRGYHDNCNF APRFIDVFWKFNKKDIQKHIAIDSDRVRINVDNTMYMEFDTWYGAKFQETIKDIDDGVVK LRPPLDLNSFEIEFFFGDTYALDIKWSSKNGIKTFQAEEFKSEKCKISRNGKDFFPTKYL HAEFDMNKNTFRHFDGAIHFYTSDEYYQRRDSDFNHNNKNSSQLKTLSLKLFKVNGLISI DDWTELTSQYLTGNPLVFEYFEEKFPDHIQEFIKNRKN >gi|225935368|gb|ACGA01000024.1| GENE 34 31650 - 32129 410 159 aa, chain - ## HITS:1 COG:no KEGG:BF1225 NR:ns ## KEGG: BF1225 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 159 1 159 159 297 94.0 1e-79 MKTTEVNKKIIGRRCKCIFTGLLVTGVIEDTTEDKYTVSVKVRFDTPHQWGDEFYSYDWS FGRKADDSGSLKYLELLPDKTTFDAMIVTFGDSIDTLNGIFEDAKTWGVCSLKGWIDSYE STRFTPIGTDKAVITSEYNMECVKEWLEHNTPIKNIIIG >gi|225935368|gb|ACGA01000024.1| GENE 35 32149 - 32364 282 71 aa, chain - ## HITS:1 COG:no KEGG:BF1224 NR:ns ## KEGG: BF1224 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 71 1 71 71 123 92.0 3e-27 METLNKNGVSITQTPGEEKYVKCCLGAFRGQIYFQYDYRHTDMELFSTVAKTLAECRKQR DEWIAKKEKKQ >gi|225935368|gb|ACGA01000024.1| GENE 36 32389 - 32601 100 70 aa, chain - ## HITS:1 COG:no KEGG:BF1223 NR:ns ## KEGG: BF1223 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 70 1 70 70 73 95.0 2e-12 MKVYNKITTFLRWFILAFIIYQVATSTAGMWIALIIGFFIIRFVLRLFISAVYLFCMAII FIVLLSLLII 
>gi|225935368|gb|ACGA01000024.1| GENE 37 32645 - 33178 555 177 aa, chain - ## HITS:1 COG:YPMT1.61c KEGG:ns NR:ns ## COG: YPMT1.61c COG4734 # Protein_GI_number: 16082851 # Func_class: R General function prediction only # Function: Antirestriction protein # Organism: Yersinia pestis # 1 174 1 166 168 109 42.0 3e-24 METVTLSEARVYVGTYAKYNNGSLFGKWLDLSDYSDKDEFLEACRELHEDEQEPEFMFQD IENIPEALISESWLSDKFFELRDAIEKLSETEQEAFFVWCDHHNSDISEEDADDLVSSFE DEYQGEYKDEEDYAYEIVEECYDLPEFAKTYFDYSAFARDLFMTDYWMDNGFVFRCA >gi|225935368|gb|ACGA01000024.1| GENE 38 33476 - 33928 376 150 aa, chain - ## HITS:1 COG:ECs4513 KEGG:ns NR:ns ## COG: ECs4513 COG2003 # Protein_GI_number: 15833767 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 18 150 94 224 224 85 35.0 3e-17 MKTTEFTMPEITISYKDNVKASERVKILSSETSCSFLKPFYAECMEHHEESYVMFLNRAN KALGVSLISKGGMAETVMDVKIILQTALKVHASGIILSHNHPSGNLRPSEPDKQITSKIK EACKVLDLHLFDHIILTEESYYSFADEGLI >gi|225935368|gb|ACGA01000024.1| GENE 39 34186 - 34728 565 180 aa, chain - ## HITS:1 COG:no KEGG:BVU_2471 NR:ns ## KEGG: BVU_2471 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 180 1 180 180 320 99.0 1e-86 MDLQIIQSKIYEIRGCRVMLDSDLAALYQVETKALKQAVKRNIERFPSDFMFEVTKEEVE CLRSQIVTLKNNPDETEEETSSKRGKHTKYLPYVFTQEGVAALSGVLRSPIAIQVNISIM RAFVALRQMITGYQELLKRIEELEESTDAQFSEVYQALTQLLSKPEPKPRKPIGYRTYDE >gi|225935368|gb|ACGA01000024.1| GENE 40 34805 - 36037 949 410 aa, chain - ## HITS:1 COG:no KEGG:BT_0076 NR:ns ## KEGG: BT_0076 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 410 1 410 410 761 96.0 0 MKSTFKVLFYLKKGSEKKNGEVMIMARITIDGKLCQFSTKQSILPENWNIAAGKAKGKDA GRINALLEDIKASLNNIYHEQQRRDNYVTAEKVKNEFLGHSEKHETILDLFKKHNDDVKQ LVGISKTIATYRKYEVTRRHLAEFIQSKYNISDIAINEITPMFITDFELYLRTACKCGYN TTAKFMQFFKRIILIARNNGILVGDPFANYKIRLEKVDRGYLTEDEIKIILKKKMVSERL ENVRDLFIFSCFTGLAYIDVANLTQDNIRKSFDGNLWIITKRQKTNTDVNVPLLDIPKMI 
LKKYKGKLPNGKILPVISNQKLNAYLKEIADVCGIKKNLTFHLARHTFATTTTLAKGVPV ETVSKMLGHTNIETTQIYARITNNKISNDMQGLDKKFVGIEKIYKEVSMK >gi|225935368|gb|ACGA01000024.1| GENE 41 36638 - 38866 2475 742 aa, chain + ## HITS:1 COG:CAC0980 KEGG:ns NR:ns ## COG: CAC0980 COG1882 # Protein_GI_number: 15894267 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Clostridium acetobutylicum # 7 742 8 743 743 944 62.0 0 MELNKIFKDGLWSTEINVRDFVSHNITPYYGDASFLEGPTERTKAVWNRCLEALAEERAN NGVRSLDNVTVSTITSHKAGYIDKENELIVGLQTDELLKRAIKPFGGINVVSKACHENGV EVDDRVKDIFTHYRKTHNDGVFDVYTEEIRSFRSLGFLTGLPDNYARGRIIGDYRRMALY GIDRLIEAKKEDLRNLTGPMTEARIRLREEVAEQIKALKDMKVMGEYYGLDLSRPAYTAQ EAVQWVYMAYLAAVKEQDGAAMSLGNVSSFLDIYMEYELSKGTITESFAQELIDQFVIKL RMVRHLRMQSYNDIFAGDPTWVTESLGGRLNDGRTKVTKTSFRFLQTLYNLGPSPEPNLT VLWSPELPEGFKEFCAKVSIDTSSIQYENDDLMREVRQSDDYGIACCVSYQEIGKQIQFF GARCNLAKALLLAINGGRCENTGTVMVKNISVLTSDTLNFEEVMSNYKKVLTEIARVYNE AMNIIHYMHDKYYYEKAQMALVDTNPRINLAYGVAGLSIALDSLSAIKYAKVTARRNDIG LTEGFDIQGEFPCFGNDNDKVDHLGVDLVYFFSEELKKLPVYKNARPTLSLLTITSNVMY GKKTGATPDGRAKGVAFAPGANPMHGRDKNGAIASLSSVAKLRYRDSQDGISNTFSIVPK SLGATEEDRVENLVTMMDGYFTKGAHHLNVNVLNREMLYDAMEHPEKYPQLTIRVSGYAV NFVKLSREHQLEVISRSFHERM >gi|225935368|gb|ACGA01000024.1| GENE 42 38912 - 39610 710 232 aa, chain + ## HITS:1 COG:VC1869 KEGG:ns NR:ns ## COG: VC1869 COG1180 # Protein_GI_number: 15641871 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Vibrio cholerae # 2 229 14 244 246 197 42.0 1e-50 MGTFDGPGLRLVVFLQGCNFRCLYCANPDTIAGKGGTPTPPEEIVRMAMSQRPFFGKRGG VTFSGGEPTFQAKALVPLVRELKEKGIHVCIDSNGGIWNEEVEELFKLTDLVLLDIKEFN PARHQALTGRSNEQTIRTAAWLEENEKPFWLRYVLVPGYSDFEEDIRRLGEALGKYKMVQ RVEILPYHTLGVHKYEAMEQEYKLKDVKENTPEQLEKAAEVFKEYFTTVVVN >gi|225935368|gb|ACGA01000024.1| GENE 43 39638 - 40081 411 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative 
regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 142 3 98 116 90 42.0 1e-18 MRKISLIVIHCSATRVDRDFTAKDVDTAHRFRGFSCWGYHYYIRKSGQIEPMRDEDTVGA HARGFNAISLGVCYEGGLDEDGKAADTRTSRQKESLHRLVRELLQRYPDAKVVGHRDLSP DTNYNGIVDPWERIKECPCFEVKAETW >gi|225935368|gb|ACGA01000024.1| GENE 44 40068 - 40235 209 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160887574|ref|ZP_02068577.1| ## NR: gi|160887574|ref|ZP_02068577.1| hypothetical protein BACOVA_05594 [Bacteroides ovatus ATCC 8483] # 1 55 9 63 63 85 100.0 1e-15 MKEIVTKILDVIMFLVPFFGKRKRNRIVREVRFNATHKEVCNVKTTEREKDDEKD >gi|225935368|gb|ACGA01000024.1| GENE 45 40256 - 40759 533 167 aa, chain - ## HITS:1 COG:no KEGG:BT_4735 NR:ns ## KEGG: BT_4735 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 167 228 81.0 6e-59 MAMTVTYSVVPRKNPAKKDESAKYYAQAQASGELDFEELCEGITSRSTCTETDVRAAISG ILYEAKRALKAGRIVRLGDLGSLQIGLNSEGAVSVKEFSSSLITAAHIIFRPGKTLADIT KILSYQQVTTRAVAQTGGSGNEGEDDDKGSGGGSDGGGSGEAPDPAA >gi|225935368|gb|ACGA01000024.1| GENE 46 40953 - 41183 257 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160887572|ref|ZP_02068575.1| ## NR: gi|160887572|ref|ZP_02068575.1| hypothetical protein BACOVA_05592 [Bacteroides ovatus ATCC 8483] # 1 76 1 76 76 134 100.0 3e-30 MYYPDQPFLIRTYKKSELAHLYNPNVCLKVALQILRRWIVYNLPLLQELEQEGYRARNRL LSPRQVATIIRYLGEP >gi|225935368|gb|ACGA01000024.1| GENE 47 41271 - 41981 443 236 aa, chain + ## HITS:1 COG:no KEGG:BT_4734 NR:ns ## KEGG: BT_4734 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 170 1 154 688 218 67.0 1e-55 MDHFTIFQDFYRVIAEMTEEEIVSAISNGTYRETVEDVRRIFAEQGEKAANEKKIELPEI TFSANYRGGRSNATLVKYLGYIVVDIDHQTQEALARILALAKKCAHTRIAFISPKGMGLK IVVRVCRTDGTLPETIQEIEDFHHAAYHKIASFYAQLCGVEVTLPGRMLAALACSLTIRI FISIPMPQLSSWNSHRCSSSLKRRKGHPAKRKRRQHQTTIPSPNKWPSTTIPRTLP >gi|225935368|gb|ACGA01000024.1| GENE 48 41858 - 43387 985 509 aa, chain + ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 
# Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 198 432 356 592 836 64 25.0 5e-10 MFFKSQKKKRASGKKKKTTAPDNNPLTEQVALNYHSSHASLMVTLNYYHNKSEKYVTGNR NNYLHCLACMYNRYGVPQEEAAAFIKSQFTDLPADEMDALIGSAYGHNEEFDTRKLNSTQ KRMLQIEQHIKENYDTRYNEVLHIMEYRRRKTDTEQPEPFHILDEMMENSIWMEMNELGY SCTVKTIQNLIYSDFSITYHPIREYLDSLPEWDGTDYIGILANSVHTSHQKFWVECLERY LVGMCAAATQDDVVNHTVLLLCSEIQNIGKTTFINNLLPPELRAYLSTGLINPSNKDDLA KIAQAMLINLDEFEGMSGRELNIFKDLVTRKVISIRLPYARRSQNFPHTASFAGTCNYQE VLHDTTGNRRFLCFHVDSMEFIKINYAQLYAQIKYLLNKPGYQYWFTQWENSRIEENNED FIFHSPEEELVLTHIRKPERFEKVHYLTVTEIAELIRERTGYQYSHGTKAQLGKVMSKHG FEFHKGKNGRRYTVFIIDTEQVKSNRLYE >gi|225935368|gb|ACGA01000024.1| GENE 49 43502 - 43681 98 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEIDPHLPSPFSFFFLKKTVGNLKKTVIFFKFPTMFFLIASYKDIENSILQPNAFMIR >gi|225935368|gb|ACGA01000024.1| GENE 50 43627 - 43929 434 100 aa, chain - ## HITS:1 COG:no KEGG:BT_4733 NR:ns ## KEGG: BT_4733 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 100 6 101 101 143 83.0 2e-33 MQINNHQIVDYDAVLDAKFGAEGTPERAEAEEKAYAFYTGQIIEDARKKAKITQAELARR IGSDRSYISRVESGQTEPKVSTFYRIMNALGCRIEFSMSL >gi|225935368|gb|ACGA01000024.1| GENE 51 43895 - 44248 255 117 aa, chain - ## HITS:1 COG:no KEGG:BT_4732 NR:ns ## KEGG: BT_4732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 110 2 108 110 100 56.0 2e-20 MEKKRKIRTYGGYFEAFMETLTEKEQDKIQYGLLLLKTQERLSTKFVKFVQDGVFELRTE YNGNIYRVFFIFDDGNIVVLFNGFQKKSQKTPGSEIDKALKIKEEYYADKQSSNSRL >gi|225935368|gb|ACGA01000024.1| GENE 52 44305 - 44550 139 81 aa, chain - ## HITS:1 COG:no KEGG:BT_4730 NR:ns ## KEGG: BT_4730 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 80 1 77 84 123 75.0 2e-27 MSHLTPVTIEYRGNPKQYVSVVLDAINRGRLTYDGIANCEQTFRALASVVDVISPKNGKT LSVETLVSYEKKKRAGEFEEK 
>gi|225935368|gb|ACGA01000024.1| GENE 53 44559 - 44693 169 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENSCNDELFCSLKNELREVTAQINVHSEKLMELLDQKKESSVN >gi|225935368|gb|ACGA01000024.1| GENE 54 44813 - 46084 817 423 aa, chain - ## HITS:1 COG:no KEGG:PRU_2549 NR:ns ## KEGG: PRU_2549 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 22 421 10 411 416 383 48.0 1e-105 MKRFILLTILCCLVLSISAQIARDEIFEDIHRSAANHYAYPDPHFTMTAPPKGYKPFYLS HYARHGSRYRVNPDDYTKPLAILREAEKDGVLTDLGKKALWLVDSLARGAENRYGDLTPL GARQHRGIARRMYNNFPEVFQGAAEVDARSTTVIRCILSMTAECLQLQSLNPELRIKDEA TSYDMYYMNYGNEYFKKKRQEDEVIAVKAAFRKEHLHPERLMKSLFNNDNYVKWKVDAGK LMSYLFELAAITQSHDTDLDLYSLFTKEECYDLWLISNLNWYIDYGPSPLTQGKMPYVEA NLLENILNTADTCVVKKENSATLRFGHETCLLPLACLLELGDCAYQTTDINKLSDIWRNY RIFPKACNIQFVFYRKKGNDNILVKILLNEQEMKLPLASELVPYYHWEEVERYYRNKLAT FVH >gi|225935368|gb|ACGA01000024.1| GENE 55 46121 - 47392 781 423 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6746 NR:ns ## KEGG: Cpin_6746 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 20 422 25 424 429 312 40.0 1e-83 MDIKKIVNFIFLLVFSCIIKAQMTIPPFPKWEKGYLDIHHINTGRGNCAFLIFPDGTTMM IDAGDFDGKEYAAKYAPMHAVPIFPDSSYTPGSSIINYVTNLLGKDVVIDYFLLTHFHSD HYGDVQKATGTSKNGYRITGLTEVGDVIPIKMYVDRDYPDYQFPVDLRSKNDGVDLPTFL NLLEFLNYQEKHNGLKVEKFDIGSNTQFVLKKEPKSYPGFEVRNIKSNNRLWTGEGKKTT ILFTKEELMAKNGKYSENQMSTAIVVKYGAFKYYAGGDNSGLVDQDHAEWYDIETPMAPI IGKVSAVSLNHHSNRDATNRNFLEVLDPKVVVAQSWSTDHPGAEVGQRLLSKNIGTQPRD VFMTYYDDETGVAIGPWFARSLKAKQGHIVIRVYPDGEYEVYVLDARKTDLVISNRFGPY SSE >gi|225935368|gb|ACGA01000024.1| GENE 56 47429 - 48280 645 283 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171174|ref|ZP_05757586.1| ## NR: gi|260171174|ref|ZP_05757586.1| hypothetical protein BacD2_04846 [Bacteroides sp. 
D2] # 1 283 1 283 283 548 100.0 1e-154 MKYLNILLIILPFIFSCSSDTPENDKEKIPPIIDTSIRVGTFNVDVGQTATAEEIALSLK SFYLDILSLEECPILIDKNSGVEEFYSIIGKALGMEYFYIGDISSGNHWIEWGKDKTGKY AGKYKVILSKTPILQSKEFALEGKGWTHSSTIRIETKIKNKELTFYSLHIPGSAGVVEGS VMKNLVDVVLMKDTSKRIIVMGDFNDLPETNAMQYIFNSGFKSVWDDVEDAEQYIDKYKY GQIDNILINKNSGMKAEIAYSIFKKDISLSDHPFLWSKIIIGE >gi|225935368|gb|ACGA01000024.1| GENE 57 48290 - 49213 617 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171175|ref|ZP_05757587.1| ## NR: gi|260171175|ref|ZP_05757587.1| hypothetical protein BacD2_04851 [Bacteroides sp. D2] # 1 307 1 307 307 553 100.0 1e-156 MKDILNFIIILSMSLLLASCGDDKDGINQPDPITGLTTEVFEESISLSWDVPNGEVKKYV IVYNPGDGLIDIVDPAITKYSIEKLKPGTDYEIDLYWVNNANVRSLASTVNVTIPQKEGV IEHIYVGDLLLPNQKAIDNLQLKYTSVTGKLRIGNGTSGSDITDVSMLANITEVGTNLEV DGNSLLGNLDFLKNLKKIGGNFWLRNNPMLTSITGLKNLSEIKTNLYIMGNPSLTNLNGL EGLTAVKLINIGIRSGNKSDKGNEKLTDLSALKPILTSLPTVSYHCEGNAYNPSVDDILN DRGSDSK >gi|225935368|gb|ACGA01000024.1| GENE 58 49236 - 50654 1011 472 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0773 NR:ns ## KEGG: Dfer_0773 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 6 472 7 483 483 305 40.0 4e-81 MKKILFILISGIFTITSCTNMLDLYPISSSTKENSYTKPEDFTQLLNSVYAFLQSNGQYG QNFHFLMEVPSDNSKETSATVRGGVYYQFEILQVATDNSIIEASWTDCYKAIQSCNILLE RIDNIAMDEPLKTITIGETYFLRALNYFNLVRLFGDVPLILTETKDPADLLSVGRENKSK VYEQVIIDLKEAIKRLPAIQEEKGRATIGAAYTLLGKVYLTLKQYQEAVEALHNVKGYAL VKNYSQIFGPDNEYNEESIFEVNFTSNLENEGSALANLFAPTGEGPTLGIVGAVYNQNIP TQELYDLYSSDDKRKNVTIGVSNNKLYAKKYVGPTVKDQDSDINVIVLRYADVVLMLAEA LNEIGYTGDDSGEAFTLLNSIRSRAGVSTYDSDDLPDQTSFRKAIADERRLELAFENHRW FDLIRTGTAVETMNNSSQESSKFTVKDYQLVYPIPQREINANPVVIKQNPEY >gi|225935368|gb|ACGA01000024.1| GENE 59 50674 - 54012 1605 1112 aa, chain - ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 41 1112 39 1146 1146 734 40.0 0 MKNRTTFEKFPRTNAGRSKNWKTLRALCLLLFLSVSFTAYSQITVNVKDISLRASLKKIE 
QVSNYKFFYNENLPELNQKVSLNVQNATIQQVMKQLLGKMELTYKQEQENIIVLIRKEQE KKRNINVSGTVVDEKGEPVIGASVTVVGTALGTITNLEGKYSLTDVQEDSKVAISFVGYG ALTFSAKDKQLARVVLKEDSELLDEVVVIGYGTERKSDLSTAVSQVKSKDFESISYSDPG QAIAGRMPGIYVKQASGAPGGNPQISIRGTGTISSGSSPLVVVDGLPVTDDVGLNSINPN DIESINVLKDVASAAIYGSRAANGVILITTKKGKAGKAQYTFNTYYGVQQVEKKYDLVDA YQQAELMAEAFQYKNQEIPNFIKPYLEHQQGLVNTNWQDHIFRDAPTNYYELSVSGANEQ TVYFVSGSYYKQKGIVIGSDYERISGRVNLTSKLSNKFEFGINLNPNYSMQDKITEGWEE SPLSMGIYNSYPFFPVYEPDGSYAISKQIHAVQNYGGMAEVENNVATANLTTNQVENFKI LGGAYLSYTPLKGLTFKTYVGGYFISNRAEYYRPSTIGAYRQAAPTVAKAASSTYETFNW MSENTITFNKSWGKHDFEILGGMSFQKETTNNNSIDAENFPNDDIPTLNAALTINTASAT RYQWALLSYFGRLKYDYKKKYFFTAALRMDGSSRFGRNNRFGLFPSLSAAWRISEENFFP KNDIISELKPRISWGISGNDEIGNYSSLSLMSSAKYNFNGNLVAGMRPNTSPNANLSWEK NNTVNVGLDVGLFRNALNLNFEYYVSKTSDMLLEVPVPAYSGYSTSLQNVGSVRNKGFEF AASFKNHIKDFNYQISANFSTNKNTVLSLGANQNEIITARNITKVGHPIGALYGYRIIGI FENEEQLKSLPNYRDSQKVGDYIFKNIDDSDNTITEKDREIIGNPHPEFTYGGNIQLFYK NFDLGITIQGVYGVDVINRDLPTTILSSEIWSVMSKEYYNNRWISPDKPGKYAKAGSNSN TLSRESDLMMQDGSFFRIRDISLGYTLDKKWLTRTFLTNVRVYVSVKNPLLVSKYMGLNP ESEGSSNPLNAGVTLGQYPAEKNFVVGASISF >gi|225935368|gb|ACGA01000024.1| GENE 60 54183 - 55742 1364 519 aa, chain - ## HITS:1 COG:no KEGG:BT_1484 NR:ns ## KEGG: BT_1484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 519 9 523 523 608 59.0 1e-172 MNTLERKLPIGIQTFEDIRQGNYLYVDKTSILWQIINTGKPYFLSRPRRFGKSLLISTLE AYFQGRKDLFEGLAIEKLEKKWEQYPVLHLDLNAKKYETAADLVAMLNQYLEKWEAVYGD EKKDRSPEERFSYVIEQACLKTGKGVVVLVDEYDKPLLQALLDENLLDEYRRILKAFYGV LKSSDRYLRFIFLTGVTKFAQVSVFSDLNQLNDISMKIPYANICGITKKELVSTFTPELE RLAEVQEMSFDDTVDKMTAMYDGYHFTYSEDGLFNPFSVLNVFDGLMFDNYWFQTGTPTY LVDLLKQSDYDLRLLIDGLEVGSSGFAEYRAEAKNPLPMIYQSGYLTIKNFDKSLNLYTL GFPNDEVKYGFLKFLIPYYTPISSDETDFNAVKFVRELQSGDVHSFMERMKSFFADIPYE LNTKTERHYQVIFYLVFKLMGQYVDAEVRSAKGRADAVVETKDRIYVFEFKLEGTVDEAL KQIDEKGYLLPYQTDGRELVKVGVSFNAEERNIGEYKVV >gi|225935368|gb|ACGA01000024.1| GENE 61 55819 - 56778 800 319 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # 
Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 25 287 42 294 327 81 25.0 2e-15 MDKTHYKELIEKYFEETITDVEIKELSDWIKNDRQLHDWWEQEFERSDSSMNPVLRDKLF ARIKEETLGKEVRPLRIIPWKWVAAILLPVCVAFFTYYLLDSSPTAETPFIVKAGKGDKA TIELPDGTNVVLNSASQLSYLNNFGENVRRVQLNGEAYFKVAHDEKHAFIVQIGDLEVKV LGTSFNVSAYEDAKDVTVVLLEGKVGVYAQKISHIMKPGDKIEYNKATHKITATQVHPTD YIEWTKGNMYFEKESLENIMKTLSRIYDVEIRFDSNKLPNEYFTGTIPGGGIQNALNILM LTSPFYYEMDGSVIVLKEK >gi|225935368|gb|ACGA01000024.1| GENE 62 56842 - 57438 438 198 aa, chain - ## HITS:1 COG:fecI KEGG:ns NR:ns ## COG: fecI COG1595 # Protein_GI_number: 16132114 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli K12 # 22 173 13 162 173 63 30.0 2e-10 MVDSTDEKHILIDLKGGSFQAFERLYNMYSGKLYNFIMRLSSGNQYMAEEVVQATFIRIW EVHEKVDPASSFISFLCTIAKNLLMNMYQRQTVEYVYNEYLMKSSVDCDSQTEENIDLRF LNEYIDSLAEELPAQRKKIFILSKRQNYTNKEIAEIMGISESTVATQLSLAVKFMREQLM KHYDKVITLLLAFFVNEM >gi|225935368|gb|ACGA01000024.1| GENE 63 57651 - 58955 1172 434 aa, chain - ## HITS:1 COG:no KEGG:BT_4721 NR:ns ## KEGG: BT_4721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 434 1 418 418 586 72.0 1e-166 MEEKELWMNKLKEKLADYSEPAPASGWEQLEKELMPPVEKKIYPYRKWMMAAAAIIILAV VSSVSLYFLGTPAADEIRHIQTPALASTPDALPGVQQPDMQGTAVEPVLRPVTREDRLAK VDRNHTEQKTGVDPLGIGNENKLSTGNGEDSLSEEDKNVKGNGETENVKDEPKQAENTDV SQGQSQDTERSNNRPRRPSGKDKLHIPTEKRSSQKGTWSMGLSVGNSGGASTEVGAGSHA YMSRVSMLSVSNGLMETIPNDQTLVFEDGVPYLRQAKQVVDIKHHQPISFGLSVRKGLAK GFSLETGLTYTLLSSDAKLAGEEQQIEQKLHYIGIPLRANWNFLDKKLVTLYVSGGGMVE KCVYGKLGSEKETVKPLQFSVSGGVGAQINATKRLGIYVEPGVAYYFNDGSDIQTIRKEN PFNFNIQAGVRLTY >gi|225935368|gb|ACGA01000024.1| GENE 64 58962 - 59510 422 182 aa, chain - ## HITS:1 COG:PA0762 KEGG:ns NR:ns ## COG: PA0762 COG1595 # Protein_GI_number: 15595959 # Func_class: K 
Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 2 175 6 187 193 68 25.0 5e-12 MEEIELSEQCRLGNNRARKELYEQYAGRMLGICLRYTGDRDTAQDLLHDGFLKIFDSFDK FTWRGEGSLRAWMERVMVNTVLQYLRKNDVINQSTPLEELPEEYEEPDASDVEAIPQNVL MQFIEELPAGYRTVFNLYTFEDKSHKEIAQVLGINEKSSASQLFRAKSVLAKRVKEWIMH NG >gi|225935368|gb|ACGA01000024.1| GENE 65 59686 - 60408 759 240 aa, chain + ## HITS:1 COG:no KEGG:BT_4719 NR:ns ## KEGG: BT_4719 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 237 240 324 75.0 2e-87 MKKFKFIAFIFAIMTTMPILQSCLDDDDNSSDLLAISTINMISQDSKEFYFTLDDGKKMY PSNSQGWNNEDWVEGQRAFVIFNELEEPVNGYDLNIQVKGINPILTKDIVTMGEDDNDEE KIGDDKINTTYMWINKDNKYLTIEFQYYGTHSEDKKHFLNLVINDKEETAPTADEGNAED EYINLEFRHNSEGDDPQRLGEGYVSFKLDKIKNRMEGKKGLRIRVNTIYGGPKTYEVKFP >gi|225935368|gb|ACGA01000024.1| GENE 66 60540 - 61220 744 226 aa, chain - ## HITS:1 COG:slr2057 KEGG:ns NR:ns ## COG: slr2057 COG0580 # Protein_GI_number: 16330455 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Synechocystis # 1 220 1 233 247 159 49.0 6e-39 MKKYIAEMIGTMVLVLMGCGSAVFAGGLADTVGAGVGTIGVALAFGLSVVAMAYAIGGIS GCHINPAITLGVFLTGRMNGKDAGMYMIFQVIGAIIGSAILFALVSTGAHDGPTATGSNG FGDGEMLQAFIAEAVFTFIFVLVVLGSTDPKKGAGNLAGLAIGLTLVLVHIVCIPITGTS VNPARSIAPALFQGGEALSQLWLFIIAPFVGAALSAVVWNYLGDKK >gi|225935368|gb|ACGA01000024.1| GENE 67 61368 - 62294 792 308 aa, chain - ## HITS:1 COG:STM4125 KEGG:ns NR:ns ## COG: STM4125 COG0583 # Protein_GI_number: 16767389 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 1 294 1 293 305 194 38.0 3e-49 MTIQQLEYILAVDQFRHFARAAEYCRVTQPTLSAMIQKLEDELGVKLFDRTVQPVCPTAI GQKVIDQARVILAQAAQVKEIISEDKQSLSGVFRLGVLPTIAPYLLPRFFPQLMEKYPEL DIRVTEMKTQDIQQALHAGDLDAGIIASKLEDTFLTEETLFYEQFYAYVSRKESSFKHDM 
IRTSDITGEHLWLLDEGHCFRDQLVRFCQMEAVKVNQMAYRLGSMETFMRMVESGKGITF IPELAVMQLTEEQRQLVRPFAIPRPTRQVVLVTNKDFIRHSLLCVLKEEIKAAVPKEMLS LQSIQCLL >gi|225935368|gb|ACGA01000024.1| GENE 68 62431 - 62907 640 158 aa, chain + ## HITS:1 COG:PM0817 KEGG:ns NR:ns ## COG: PM0817 COG0783 # Protein_GI_number: 15602682 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Pasteurella multocida # 7 154 5 152 159 155 52.0 3e-38 MKTLEFIKLNESGANNVVASLQQLLADFQVYYTNLRGFHWNIKGHNFFVLHSQFEKMYDD TAEKVDEIAERILMLGGTPANKFSDYLKVANVNEVDKVSNGDEALNNILQSISYLIGEER KILSIASQAGDEVTVSMMSDYLKEQEKLVWMLVAYNSK >gi|225935368|gb|ACGA01000024.1| GENE 69 63030 - 63386 221 118 aa, chain - ## HITS:1 COG:CC3054 KEGG:ns NR:ns ## COG: CC3054 COG1472 # Protein_GI_number: 16127284 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Caulobacter vibrioides # 1 110 687 799 806 92 40.0 2e-19 MSYTTFEYSALQVVQKSARCFEVSFKVKNTGKYDGEEVSQLYMRDEYASVVQPMKQLKHF ERFHLKKGEEKKVTFVLTEEDFFLVNYTLKKVVESGNFHLMIGAASNDIRLQNVILVE Prediction of potential genes in microbial genomes Time: Fri May 13 07:29:30 2011 Seq name: gi|225935367|gb|ACGA01000025.1| Bacteroides sp. D2 cont1.25, whole genome shotgun sequence Length of sequence - 11482 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 217 81 ## BT_4714 periplasmic beta-glucosidase precursor, xylosidase/arabinosidase - Prom 271 - 330 6.3 - Term 330 - 385 13.4 2 2 Op 1 . - CDS 434 - 1828 1066 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 precursor 3 2 Op 2 . - CDS 1890 - 3062 948 ## BT_3986 putative patatin-like protein 4 2 Op 3 . - CDS 3072 - 4136 796 ## BT_3985 hypothetical protein 5 2 Op 4 . - CDS 4168 - 5733 1223 ## BT_3984 hypothetical protein 6 2 Op 5 . 
- CDS 5737 - 9093 2881 ## BT_3983 hypothetical protein - Prom 9114 - 9173 4.8 7 3 Op 1 6/0.000 - CDS 9270 - 10265 775 ## COG3712 Fe2+-dicitrate sensor, membrane component 8 3 Op 2 . - CDS 10313 - 10852 492 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 10885 - 10944 8.2 9 4 Tu 1 . - CDS 11035 - 11391 251 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|225935367|gb|ACGA01000025.1| GENE 1 1 - 217 81 72 aa, chain - ## HITS:1 COG:no KEGG:BT_4714 NR:ns ## KEGG: BT_4714 # Name: not_defined # Def: periplasmic beta-glucosidase precursor, xylosidase/arabinosidase # Organism: B.thetaiotaomicron # Pathway: Cyanoamino acid metabolism [PATH:bth00460]; Starch and sucrose metabolism [PATH:bth00500]; Biosynthesis of secondary metabolites [PATH:bth01110] # 20 72 6 57 769 80 71.0 1e-14 MKYIFIILWICVWVTCTPIFAQQVSVLTYQNPNLSIDIRLADLLSRMTLEEKVGQLLCPL GWEMYEIHGSKV >gi|225935367|gb|ACGA01000025.1| GENE 2 434 - 1828 1066 464 aa, chain - ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 390 22 402 476 212 35.0 3e-53 MLTAWGGFMALSSCEDDLTAYTVDKDRYEVNEHPCAFLRNVQGDGKNLIDLYNAEGEQVE LNIQLTKPVTADTKCTLIADESLLINYNQIHGTDFAIFPIVQVVFENGKVVTVEAGNLVS ESVKIKLEARESLETGKTYVLPIRIESASDIALSEDRDTYYYLIKAQGDRINPQKDPDVK VMSCFGLDWENPLIHTQMFLKKSGKPLFDIVCLFSAGINYNAETGQAYVYKNPNIVHLLN NRDKYIKPLQDMGIKVVMGLCGNRDFTGCGSMLPEACKQFARELKNLCDQYGLDGVLFDD EYSSYGVDIKPGFSSYASSANASRLCYETKKLMPDRLIIPFLWGSMSYLQSVDGMTPGEY CDYFFPNYNLYPSAAGGAQMKQMGAGSQELTGYQWYARNYGDLSRVNSRGYGLTTVFGMG PYQEERKSGSWLPWSTQKKSLDIIARDIFDDEVVCDEVMIRKDW >gi|225935367|gb|ACGA01000025.1| GENE 3 1890 - 3062 948 390 aa, chain - ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 388 8 381 384 278 40.0 2e-73 
MKQINSWMYFLITFACAAVLSACAGDGIEESSFESKVYLNSEKEETLLIKPGEPSYTKTL QTALPKALDHDITVRFAADLSKVDEYNTVYKDNALPLPEAQYKFTTSEAIIPAGTVRSTQ AVIDFVDVDKLDREKRYVLPVRIVDAGDCTILARSATTYYIFKGAALINVVANMERNYCK VNWQSNVSAMSELTFEALVRAKSFTNPESSNDILSLMGIEGYFLLRTGDTLYPGQLMLST GTNFPGKDATKVLPANEWIHIALTFSSGNVIIYINGKPQSEGSITRKSINLKSSGSADDS NMKGFLIGASWDKSRWWVGEMCEMRVWETARTQEEIASNFYYVDPHSEGLVAYWKFDDGE GSKVTDRTGHGNDAVALNEITWIPVELPAK >gi|225935367|gb|ACGA01000025.1| GENE 4 3072 - 4136 796 354 aa, chain - ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 350 1 356 358 196 34.0 1e-48 MKNLFTKNYKWMMVAVCIVAVFTFASCDDWTEVESLTVNQPNIAEQEPALYEKYLESIRN YRHVSHKVVIAQFDNLDNQGGRRYHLTNLPDSIDIISLKQPEVLPDWMQKEMMTCRDRKG MKFVYEIDFATLSQEYKDLESASEVEQQTFADYATEYVNRHLALCEKEGYDGVIVRYVGL YTGYMTNDEKEDYIARQSIFLDIFINWYNEHTGKILIFRGTPHFIEDKYLAEGSLLLNSK YIIVETSAATAATDIDYLMANVMDKSGIPFDRFIVTASTYSLDAKDQTTGHFWDVKGNAV SAVSGCARWVVSFNNDYKKAGLYILETQNDYYHSEKVYPNIREAISIMNPPVIK >gi|225935367|gb|ACGA01000025.1| GENE 5 4168 - 5733 1223 521 aa, chain - ## HITS:1 COG:no KEGG:BT_3984 NR:ns ## KEGG: BT_3984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 521 1 537 537 409 43.0 1e-112 MKNIIKYISFGVLCGLVTACTPNYEYINQNHAGVTEEQAEADGYNITTSLITLQNNVIST STSRAQYVDLLLGGTWGRYAAESKANSWPNKFSTFDPPADWSGVEFNEVIPYIFPYVSSI ERVTDDVVPQAIAHIIKVAAMHRVTDTYGPIPYTQVGVDGQVQTAYDSQEQVYKAMFVEL NQAITELTDHQTESISQNADLIFKGDLLKWIRFANSLKLRLAIRTCYVPQFNVDGKTSQQ LAEEAVNHSVGVMTSNDDVARLTTFSNVGNPLNEAIKFNEGDHRAIADMTIYMNAYNDPR RAAYFNESGYSSQTYCGLRTGLAPIGSSDFNKFSTIKIERETPLVWMYPSEVMFLCAEGA LRGWNMGGGTAQSYYEQGIKLSFEYWGVSDDNYIHSTKAPDTYSDPSGKNSYSNKVSNIT VQWNNGGVFEENLERIIVQKWLANFLLGLESWSDMRRTGYPKVLPVVQNNSNGLLNNDEI PRRCIYPTREGQSNAQNYQAALKLLGGPDNLATHLWWDCKE >gi|225935367|gb|ACGA01000025.1| GENE 6 5737 - 9093 2881 1118 aa, chain - ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 6 1118 21 1135 1135 1227 57.0 0 MNFYRFKAVLFIIFSCFLLQTALFAQGNVKITIKKKNVTLQEALQDVEKQSAYLVAFNES KLEKTKQIHLNINAEPLEKALTIILSGTGLTYKIKDKYIMIVPAGKPAPEKKTITGVVQD ANGEPLIGVNVSVKGGSAGTVTNLNGEFSLQAAKGEILEFSYVGYVAASQTVTERNSLAI IMQEDAKTLDEVVVTALGIKRAEKALSYNVQQINADDILENKDANFVNSLSGKVAGVNIN SSSSGVGGASRVIMRGTKSIEQSSNALYVIDGVPMYNSKGEGGTEYGSSGVTEAIADINP EDIESLSVLTGAAAAALYGSDAANGAIIVTTRQGKEGKLSVTVSSSMEFMKAFVMPKFQN RYGSSSGDMSWGDLLSSSSYIGYDPASDYLQGGLVATETVSLSTGTERNQTYFSVGAVDS KGIVPNNGYHRYNFTYRNTTSFLNDKMKLDVGASYIYQKDRNMTNQGVNYNPLIGAYLFP RGNDWEDVEMFERYDPQRKIMTHYFTMDPGEYIIQNPYWINYRNLRENSKNRYMLNASLS YEITDWLNVSGRIRLDNSVNKYTERIFASTPEQLTMLSKNGLYAITRSSDMQIYGDILAN INKRFGNDWSLFATLGVSLTDLNYSDTGMRSPLRDGSIEGETTGVANVFNLSNMSNKALE KVENPWREQTRSIFASAEIGYKSTYYLTLTGRNDWPSQLAGPASTSKSFFYPSIGASVIV SEMIPNLSKDYISFLKVRASYASVGNAFKRFVANPVLEWNNTSSTFEQLTNYPVSNLKPE RTKSWEFGLTVNFLKYFNADLSYYLTNTYDQLFQPDISVGSGYSTIYVQTGNIRNQGVEL ALGFKNTWRKFTWNSNLTFSTNNNKIIELGNNLVNPVTGELFSISSLDMKGLGDVHFLLK EGGTLGDMYSSADFVRDDNGNIYVDSEGKVVYEKGIKEPENWKKLGSVLPKGNLAWRNRF DIGNLNVGFMLAARLGGVVYSRTQATLDYYGVSEVSAIARDNGGVRINQSIMDAGDLIEA HSWYNQVAVSGGIPQLYTYDATNVRLQEASIGYTIPRKWLGGVMDIDLSIVGRNLWMIYC KAPFDPENVATTGNYYQGIDYFMMPSTRNIGFNVRLKF >gi|225935367|gb|ACGA01000025.1| GENE 7 9270 - 10265 775 331 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 116 274 119 273 331 77 31.0 4e-14 MKNYIQKIINVFTASEHSENVTKEVHQWLVDEEHADEKDTALNTLWKETEGKVDAGTWTS LANVYDKLRVPQQNTKSRFRIRAWQYAAAAVVLLMVVISGTFYYTKDMYSAVAMVEKFTP AGKMNVIELPDGSKVQTNSGTILLYPEVFKGDTRTVYLIGEANFKVKKNPEQPFIVKSTT MSVTALGTEFNVMAYPENEEIVATLIHGKIKVDCNSGKESYILTPGQQVTYLRNTSKSLL ADANLEDVTAWQKGMFVFRGVSIKDIFLTLERRYAMTFQYNANLFNDDKYNFRFRENSSI KEVMDVMQEVVGGFDYKIEEDICYIKAIKKK >gi|225935367|gb|ACGA01000025.1| GENE 8 10313 - 10852 492 179 aa, chain - 
## HITS:1 COG:BMEI0371 KEGG:ns NR:ns ## COG: BMEI0371 COG1595 # Protein_GI_number: 17986654 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Brucella melitensis # 8 172 10 169 190 58 20.0 6e-09 MTHIGEQEKKKKFEQFFILTFPKVKAFAWKILRSEEDAEDIAQDIFVKLWDNPEIWENKE TWDSYIYTMARNQIYNFLKHQSVELSYQEKLVQEDSPSFEFDIYDKLYAKELQLLIKLTL DNMPEQRRKVFSMSRKNGMSNQEIADNLQLSIRTVERHIYLALQELKKVILIAFFFYFS >gi|225935367|gb|ACGA01000025.1| GENE 9 11035 - 11391 251 118 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 1 117 639 758 778 98 45.0 2e-21 MSYTTFEYSALQVVQKSARCFEVSFKVKNTGKYDGEEVAQLYMRDEYASVVQPMKQLKHF ERFYLSKGEEKLIVFTLAEEDLSIINQALEQIVESGTFQVMIGSSSDDIRLEGSILVK Prediction of potential genes in microbial genomes Time: Fri May 13 07:30:11 2011 Seq name: gi|225935366|gb|ACGA01000026.1| Bacteroides sp. D2 cont1.26, whole genome shotgun sequence Length of sequence - 12483 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 217 115 ## BT_4714 periplasmic beta-glucosidase precursor, xylosidase/arabinosidase 2 1 Op 2 . - CDS 291 - 1367 379 ## BDI_3745 hypothetical protein - Term 1374 - 1418 6.6 3 2 Op 1 . - CDS 1433 - 2713 924 ## BT_4710 hypothetical protein 4 2 Op 2 . - CDS 2738 - 3952 1015 ## BT_4710 hypothetical protein 5 2 Op 3 . - CDS 3975 - 5060 873 ## BT_4709 glycosyl hydrolase 6 2 Op 4 . - CDS 5085 - 6695 1265 ## BT_4708 hypothetical protein 7 2 Op 5 . - CDS 6722 - 10084 2752 ## BT_4707 hypothetical protein - Prom 10208 - 10267 2.0 + Prom 10006 - 10065 6.0 8 3 Tu 1 . + CDS 10116 - 10289 130 ## + Term 10481 - 10526 0.6 9 4 Op 1 . 
- CDS 10270 - 11202 851 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 11225 - 11284 5.5 10 4 Op 2 . - CDS 11286 - 12422 696 ## BVU_1801 glycoside hydrolase family protein Predicted protein(s) >gi|225935366|gb|ACGA01000026.1| GENE 1 1 - 217 115 72 aa, chain - ## HITS:1 COG:no KEGG:BT_4714 NR:ns ## KEGG: BT_4714 # Name: not_defined # Def: periplasmic beta-glucosidase precursor, xylosidase/arabinosidase # Organism: B.thetaiotaomicron # Pathway: Cyanoamino acid metabolism [PATH:bth00460]; Starch and sucrose metabolism [PATH:bth00500]; Biosynthesis of secondary metabolites [PATH:bth01110] # 17 72 4 57 769 81 69.0 9e-15 MIKRLYILVMVQMVCTLGFTQSSPSLPAYKDPSLSIDIRLSDLLSRMTLEEKVGQLLCPL GWEMYEIHGSEV >gi|225935366|gb|ACGA01000026.1| GENE 2 291 - 1367 379 358 aa, chain - ## HITS:1 COG:no KEGG:BDI_3745 NR:ns ## KEGG: BDI_3745 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 9 350 7 363 369 121 27.0 3e-26 MMKRFMSALIVLLCSIVGVCAQQQGKTVLRFDTTTWNFGNIQEVEGKVSHTFHFTNIHTS PVVIEEVISTCGCAIPVYSKQPVKPGHTGTITVTFDPKGRTNFFSKSIRVVSNSGQSVNT LWVKGTINTMNRIEDEYPYSLSSDILADRMTLSYDLLQHNGRPKQLEIRIYNRSDKIVRL SYSLLDKSGCLSISMPSSLQGRSCATIKITASPLKGFYGTFKDKIIISANSVHSSPIQIF GTVIDDMRKVSTATAPRMKCSQSYFNLGNISLKKRIQRKVKVTNEGANPLIIRKIECPEF VSTNIRGEKVLKQNECLEVLFELNVSSSNHLLDAKVKLITNDTHQPVHTVIFEGEMGK >gi|225935366|gb|ACGA01000026.1| GENE 3 1433 - 2713 924 426 aa, chain - ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 414 1 374 402 108 24.0 3e-22 MIMKNNINRLTLLLLVLLGGVGFTSCSDDDNDFIELPTIEMPQSEAGPVSIDYREILDAG SYVYDYTIKLDKPAATTLLCDIAVDESMVEAYNAANNTSYKMMPAFVYELQAKNAIVKAG QQESNSLSVKFSSLFGLVEGEEYLLPIVATIDETCVGQFVTDTRSVSYFTISIDGELDYI PGLNMSSYSTDMYRTLSFANDEVVTIEDNTHTFEMLVYPYSWHSGTNYIGTWRGKDTNNN NEFFSGCELRVTGATGASNIGNRQCDLTLANQNITLPANQWVRLTITCDGTKTGQNTEVA YRLYINGEEVASAKPTKRWGPSSSQRFKVGYTLTGIQFGNTSSSMYFDGLISDIRMWKKC 
LTAEEVKANLRTIASPSSSDLYGYWKLDEGEGNTLKDSSGNGRDLTFPASANIIWNAEFN DLPQDN >gi|225935366|gb|ACGA01000026.1| GENE 4 2738 - 3952 1015 404 aa, chain - ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 404 5 402 402 246 33.0 1e-63 MKTLNNIKKIWGAALIGIASFMLQGCQEAAQKGDGLLMTGTSSDRLVKFAIDGFPSAYAV SVTSTSRVESDIQVNLAVDASLVDTYNEEMGTNYYPIPDKSYTFENPEVTISAGQAISSA ASLSIADDSEFVPGRVYLIPVTIKSATGDLDIIEAGRTIFLKVSRTLRFHAPYVGQASMA YQFLLPDPIPSLPTYTWEVKIYATQFRSSGASGTTRVCSFGGSEASVEGGAIDDGGFKCD QNLLRFGEGTDEPNQLHVTTKQGKMSSNTRFALNTWYAVALVNDGSTLTLYINGEKDNSM TVAPYEYALYGVQIGMPSKGYQSSQLFYGRLSEMRLWTRPLSAREIKANVCGVDPSTNGL VSYWRMNEGEGTTFYDLSPSKRDISYKNGITIDWTYDDTNKCLE >gi|225935366|gb|ACGA01000026.1| GENE 5 3975 - 5060 873 361 aa, chain - ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 344 8 310 313 172 35.0 1e-41 MKRNYVYFLFTYLLTFVVSGCDTDIEPEVIQAANTYNEEYYQNLREYKKTDHAICFGWYA GYTSEASPSQGLHFTGLPDSLDIVSLWSGIPSNNPAYVETSYYNERYLPTAYEEMNYIRT VKGTRVVMCTICRIGSTEFPKTDAGIEAYALHLVRCVLRNDLDGLDLDYEPEGDWLQNDN FTKFIKVLGNYFGPQSGTGKLLIVDFYNQLPPKETEPYVDYFVRQCYTTGSAQSLQSKFD QLSSWCPPEKFVATEQMGWYWQNGGVEFTEADGNKVDSWGNPIYSVIGMARWNPTQGRKG GFGGYYFEYEYNTTRPANQAIGDKEKAAIPYYSLRRGIQEQNPAGKPVQQSTKQETEEEN G >gi|225935366|gb|ACGA01000026.1| GENE 6 5085 - 6695 1265 536 aa, chain - ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 532 20 527 531 363 40.0 1e-98 MNIRKFFKVRNYYLLLGVIMINVGCTKNFLEYNTNPNEATDEMTDWDNVRTGSLLLQMEQ NVLVVAQPGLNIGSDRYQTVEVMGGDGYVGYFGFPAPSINSAGRYNWDKRSWYGDMFTTN YLTTMNAWREIRKAINNDEDPRFSMAQILKVAAMHRVTDTYGPIPYLNFGVSKEVPYDSQ KDVYYRFFEELDGAINNLDSYAASGSKVLSSWDCVFNGDVTSWIKFANSLRLRLALHLAY VDETKAKSEAQLAIGNSYGLMNVKSDLAELQHITPIATYESPLYILKGWDDICMGATLDS YMNGYQDPRLSAYFEAGTGGKYRGIRAGMSKDVSKDKYITGIFAEPQATATSNVVWMRSS 
ESYFLLAEYALRWGTNADAKKYYEDGIRMSFDEHGASGADAYLTRTAAGGYVPANYEDPV TSSHSMDALGTVSIAWDESGDFRTNLEQIITQKYIALYPVGQEAWTEFRRTGYPKVFPVV VNESSGGSVDTNIQIRRLPYPESEYNTNRTELDKGITLLGGVDNPGVRLWWDVPNK >gi|225935366|gb|ACGA01000026.1| GENE 7 6722 - 10084 2752 1120 aa, chain - ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 1120 1 1099 1099 1166 58.0 0 MKKNHSFTALCPKYALNLKLPLVMRISLALLFAVVLQLSAEDGYAQRTRVAISMNNVSVE QVLNKIEETSDYVFLYNDKTIQKNRIVSVRNNSGKILDILDEVFKGTNITYTVVDKQIIL STNKLNVTEQDPTIQVKGTVKDTKGEPLIGVNVKVKGTTTGAITDFDGNFQIQAKKGAVL EISYIGYASQEVKVADNKSLSIVMTEDTKVLNEVVVTALGIKRETKSLTYNVQQIGGDEV AKAKDMNVMSSLAGKVAGVNINASSSGIGGGARVVMRGTKSISGNNNALYVVDGVPLANI SGEQPDDQYTGAGQSGDGLANFNADDIESISVLSGSAAAALYGSAAANGVVLITTKKGAV DKTRLTYSNNSTFYSPFRLPEFQNTYGSETGEWYSWASKLSSPTDYEVKDFFQTGYNVAN TVSLTTGTQKNQTYVSVGAVNARGIIQNNDLDRYNFSVRNTTSMLNDKLTMDLSFMYSNV KEQNMLSQGEYANPLVPVYLFPRGDDFSKYQYYERYDVDRNIKTQFWPLKENGLSMQNPY WITNRNMYTNKKDRFLASGSLKYTITDWLNVSGRVKLDKETINSEKKMYASTLNVIAENS DKGAYVQSNKNTSQIYADAMLNINKYFYHDTWNLTASLGASLLDLDYKELNFGGGLLTIP NLFTSNNANVSGKLTYKKTNYHDRTNSLFGTFSLGYKGMVYVDGSLRNDWISALAGTNHS SILYPSIGASAILTNMFTMKSNILSYAKVRLSYSEVGNAPERFRAITTYAVLGGLNTTSY FPIKDLKPERTHSWEGGFNLGLFDNKLTLDVTAYKTYTENQLFSPEISTTTGYSKLYVNA GKVTNKGIELALGFKQNIGPVDWKTTLLWSLNRNRIDQLLPEYTNEELGVTVHLDEMDVY SLGGAKQRLTVGGSMGDVYVNTMRTDEHGHIWMNSMTGALETDKNNYVYAGNTNPKYTLS WRNEFNWKGISLGFMLNARVGGIGVSATQAVMDYYGVSKTSAEARDNGGVKVNGQMIDAQ TWYQTIGANGTGYVGSMYVYSMTNVRLGELTLGYDIPINKWCKWISGLNVSFVGRNLWML YCKAPFDPESTASTGTYNQGMDYFMQPSLRSMGFSAKISF >gi|225935366|gb|ACGA01000026.1| GENE 8 10116 - 10289 130 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNYGLIPASVCETLYRESGKFGGRQLLVAAFFLFSFILLCYSMGFYFKGYYSNINLL >gi|225935366|gb|ACGA01000026.1| GENE 9 10270 - 11202 851 310 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # 
Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 93 281 123 308 331 80 31.0 4e-15 MKENMENMSPESLLRKAQALGDDIKEMESIDVLGAYYQAQAKLKTNRRKNMYNQLMRYAA FLTIPLFLSSLVLGYLYFSGTETDEKYAEVVTATGSVIRYELPDHSVVWLNAGSTLRYPT TFKKDNRLVELKGEAYFEVQADKDRPFYVNTPNGLSVYVYGTKFNVAAYEDDNYIETVLE KGKVNVITPDQETIVLAPGEQLLYDKQSQKSKKNKVDVYGKIAWKDGKLIFRNASLEEIL KRLERHFNVEIQFNNRSGKEYKYRATFRTETLSQILDYLARSADLKWKIEEPQQQADDTL SKTKIRVDLY >gi|225935366|gb|ACGA01000026.1| GENE 10 11286 - 12422 696 378 aa, chain - ## HITS:1 COG:no KEGG:BVU_1801 NR:ns ## KEGG: BVU_1801 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 378 284 661 662 615 73.0 1e-174 MFLPPIPFVVRPACDSNTPFYLHNHQPAVTWCDNGDLLAIWFSANEENGRGMVVLGSRLR AGHTDWDVASLFFKVPDRNMTGSALLNDGKGKLYHINGVEASGDWQNLAMVLRTSTDNGA SWSTPKLIAPEHTKRHQVIAGTICTREGWLVQACDAGPGSHDGAAVQISKDGGKTWCDPW DGAPLPDFKEEGTGSTIAGIHAGIVQLENGSLMAMGRGNSIRNKEGKLRMPMSISDDMGK TWKYVASELPPIDGGQRLVLMRLNEGPLLLVSFTDHPQRTPLEERGLEFKDKNGNVKKGY GMYAALSYDEGKTWPVRKLLTDGEYRFLNGGAWTGYFEMDENHAEPRGYLAGTQTPDNVV HILSSRLHYRFNLAWLEK Prediction of potential genes in microbial genomes Time: Fri May 13 07:31:15 2011 Seq name: gi|225935365|gb|ACGA01000027.1| Bacteroides sp. D2 cont1.27, whole genome shotgun sequence Length of sequence - 52548 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 17, operones - 8 average op.length - 4.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 840 408 ## COG1262 Uncharacterized conserved protein - Prom 961 - 1020 7.2 + Prom 812 - 871 11.2 2 2 Tu 1 . + CDS 1014 - 1577 350 ## BT_4705 RNA polymerase ECF-type sigma factor + Term 1609 - 1641 1.4 - Term 1920 - 1973 1.6 3 3 Op 1 4/0.000 - CDS 2120 - 3256 680 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 4 3 Op 2 . 
- CDS 3288 - 4313 725 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily
5 3 Op 3 1/0.000 - CDS 4375 - 5199 1064 ## COG0447 Dihydroxynaphthoic acid synthase
6 3 Op 4 10/0.000 - CDS 5203 - 6870 1303 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase
7 3 Op 5 . - CDS 6890 - 8005 785 ## COG1169 Isochorismate synthase
8 3 Op 6 . - CDS 8002 - 9234 1145 ## COG0561 Predicted hydrolases of the HAD superfamily
9 3 Op 7 . - CDS 9318 - 9686 268 ## COG3189 Uncharacterized conserved protein
10 3 Op 8 . - CDS 9745 - 11502 1254 ## BT_4697 transcriptional regulator
- Prom 11540 - 11599 8.8
11 4 Op 1 . - CDS 11644 - 12423 774 ## BT_4696 hypothetical protein
12 4 Op 2 . - CDS 12479 - 13408 302 ## COG3129 Predicted SAM-dependent methyltransferase
13 4 Op 3 . - CDS 13481 - 14131 526 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
14 4 Op 4 . - CDS 14201 - 15649 1310 ## Fjoh_4499 hypothetical protein
15 4 Op 5 . - CDS 15677 - 18916 2499 ## Fjoh_4500 TonB-dependent receptor, plug
- Prom 18949 - 19008 4.9
16 5 Tu 1 . - CDS 19051 - 20235 912 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases
- Prom 20462 - 20521 9.2
- TRNA 20542 - 20626 49.7 # Ser TGA 0 0
+ Prom 20455 - 20514 5.9
17 6 Op 1 . + CDS 20760 - 23261 2166 ## COG1193 Mismatch repair ATPase (MutS family)
18 6 Op 2 . + CDS 23348 - 24400 793 ## COG0598 Mg2+ and Co2+ transporters
19 6 Op 3 . + CDS 24420 - 25628 1054 ## COG1760 L-serine deaminase
+ Term 25642 - 25682 4.4
- Term 25630 - 25670 8.2
20 7 Op 1 . - CDS 25693 - 26193 459 ## BT_4677 hypothetical protein
21 7 Op 2 . - CDS 26235 - 26681 586 ## BT_4676 putative periplasmic protein
- Prom 26726 - 26785 5.2
- Term 26871 - 26925 1.5
22 8 Tu 1 . - CDS 26949 - 28523 906 ## BT_1642 hypothetical protein
- Prom 28568 - 28627 6.8
- Term 28684 - 28719 2.0
23 9 Tu 1 . - CDS 28828 - 29184 228 ## BT_0405 hypothetical protein
- Prom 29326 - 29385 3.6
24 10 Tu 1 . + CDS 29611 - 29823 94 ## gi|260171230|ref|ZP_05757642.1| hypothetical protein BacD2_05128
+ Term 29942 - 29985 0.9
- Term 30155 - 30186 2.7
25 11 Tu 1 . - CDS 30360 - 31094 523 ## COG1922 Teichoic acid biosynthesis proteins
- Prom 31116 - 31175 4.2
26 12 Op 1 . - CDS 31230 - 31352 72 ##
27 12 Op 2 . - CDS 31330 - 31683 226 ## COG0438 Glycosyltransferase
28 12 Op 3 . - CDS 31631 - 32404 424 ## gi|260171234|ref|ZP_05757646.1| hypothetical protein BacD2_05148
29 12 Op 4 . - CDS 32459 - 33247 264 ## Mfla_2015 glycosyl transferase, group 1
- Prom 33454 - 33513 5.1
30 13 Op 1 . - CDS 33586 - 34053 211 ## COG0241 Histidinol phosphatase and related phosphatases
31 13 Op 2 . - CDS 34090 - 35220 330 ## BDI_1468 hypothetical protein
- Prom 35251 - 35310 1.9
32 14 Op 1 1/0.000 - CDS 35346 - 36050 574 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon)
33 14 Op 2 3/0.000 - CDS 36057 - 36653 399 ## COG0279 Phosphoheptose isomerase
34 14 Op 3 . - CDS 36681 - 37721 734 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase
35 14 Op 4 . - CDS 37718 - 38929 554 ## COG0438 Glycosyltransferase
- Prom 38993 - 39052 3.9
36 15 Tu 1 . - CDS 39055 - 39885 252 ## BVU_2666 putative O-acetyltransferase Cps9vM
- Prom 39951 - 40010 4.0
37 16 Tu 1 . - CDS 40044 - 41093 386 ## LCABL_02380 CpsH
- Prom 41173 - 41232 5.2
38 17 Op 1 . - CDS 41278 - 41769 269 ## LAR_1296 hypothetical protein
39 17 Op 2 . - CDS 41785 - 42174 292 ## Acfer_0733 exopolysaccharide biosynthesis protein
40 17 Op 3 . - CDS 42178 - 43713 648 ## Dfer_0062 polysaccharide biosynthesis protein
41 17 Op 4 . - CDS 43770 - 45029 638 ## COG1232 Protoporphyrinogen oxidase
42 17 Op 5 .
- CDS 45017 - 45919 318 ## COG1088 dTDP-D-glucose 4,6-dehydratase 43 17 Op 6 5/0.000 - CDS 45919 - 46995 762 ## COG0451 Nucleoside-diphosphate-sugar epimerases 44 17 Op 7 2/0.000 - CDS 46995 - 47780 576 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 45 17 Op 8 5/0.000 - CDS 47787 - 49133 985 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 46 17 Op 9 . - CDS 49197 - 50609 690 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 47 17 Op 10 . - CDS 50629 - 51756 780 ## BT_1355 hypothetical protein 48 17 Op 11 . - CDS 51779 - 52546 733 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|225935365|gb|ACGA01000027.1| GENE 1 3 - 840 408 279 aa, chain - ## HITS:1 COG:all3226_2 KEGG:ns NR:ns ## COG: all3226_2 COG1262 # Protein_GI_number: 17230718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 26 248 3 241 246 127 37.0 3e-29 MKNKFFLLAITFCLPIHSQEVSKSFIPMAEIPAGSFYMGSDGLGEDFDEAPIHQVIISRP FRMGITEITNAQYESFRPEHRALRGKNGVSLEDDEAVVNVSYSDAVAFCEWLSRKEGKNY RLPTEAEWEYACRAGTYTLFSTGDGLPAVYHRNQKVVRDFDPVSLKVAQTPPNTFGLYDM HGNVEEWCLDWYASYSAEKQKDPAGPLAGEFRVTRGGSHHTPEKYLRSANRLAMLPEDKH SQTGFRIVEADTRLNVSGTSAPVPFNQESVKNTSIKWKK >gi|225935365|gb|ACGA01000027.1| GENE 2 1014 - 1577 350 187 aa, chain + ## HITS:1 COG:no KEGG:BT_4705 NR:ns ## KEGG: BT_4705 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 189 297 93.0 2e-79 MKISFSKQTKERAFKQLYEEYYAPFCLYAKRFVDDKETREDIVSDVFTSLWDKLDTDSFD LQSNTALAYIKICVKNSCLNFLKHQEYEWSYAENIQKKAPVYETEPDSIYTLDELYRMLY ETLNKLPENYRTVFIKSFFEGKTHVEIAEEMNLSVKSINRYKQKTMELLRNELKDYLPLL LLLFHHA >gi|225935365|gb|ACGA01000027.1| GENE 3 2120 - 3256 680 378 aa, chain - ## HITS:1 COG:Cgl0445 KEGG:ns NR:ns ## COG: Cgl0445 COG0318 # Protein_GI_number: 19551695 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 51 357 60 373 376 99 27.0 8e-21 MIFDRQQQRLLLEGKEYTFEEISQLIAGGAEAHSPAYWDLYLFLNEWFNDSPVITVHTSG STGIPKELMVRKDQMMQSARLTCEFLNLKKGESALLCMNLRYIGAMMVVVRSLVAGLNLI VRPASGHPLADIKESLRFVAMVPLQVYNTMQVPEEKERLKQTDILIIGGGAVDEALETEI NHLPTAVYSTYGMTETLSHIALRRLNGASASEHYYPFPSVKLSLSAENTLIIDAPLVCDE TLQTNDIARIYPDGSFMILGRKDNVINSGGIKIQAEEMEKLLRPFIPVSFVITSVPDQRL GQAVTLLIVGQLDTREIENKLQATLEPYYRPKHIFITEFIPQTENGKIDRVGCRALAASY LSHSHQFPSIELHDDAIM >gi|225935365|gb|ACGA01000027.1| GENE 4 3288 - 4313 725 341 aa, chain - ## HITS:1 COG:AGpA707 KEGG:ns NR:ns ## COG: AGpA707 COG4948 # Protein_GI_number: 16119707 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 125 
238 67 179 299 85 35.0 2e-16 MDCKIDIIPRLLHFKQPAGTSRGIYTTRKVWYLHFTSPDFPGRVGIGECAPLPALSCDDL PDYEDILKRICRQVEKEQGMWDKDVLCQYPSILFGLETAIWHFFAGSWALSDTAFSRGEV GIPINGLIWMGDFDHMLSQIEKKMEAGFRCVKLKIGAIDFEKELALLRHIRTHFSSKEIE LRVDANGAFSPGDAMEKLKRLSEFDLHSIEQPIRAGQWEEMARLTSESPLPVALDEELIG CNTLEEKKKLLATIRPQYIIIKPSLHGGICGGDEWIMEAEKQHIGWWITSALESNIGLNA IAHWCATFNNPLPQGLGTGMLFTDNIEVPLEIRKDCLWFCK >gi|225935365|gb|ACGA01000027.1| GENE 5 4375 - 5199 1064 274 aa, chain - ## HITS:1 COG:BS_menB KEGG:ns NR:ns ## COG: BS_menB COG0447 # Protein_GI_number: 16080132 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Bacillus subtilis # 6 274 3 271 271 410 69.0 1e-114 MPTQREWTTIREYDDILFDYYNGIARITINRERYRNAFTPTTTAEMSDALRICREEADID VIVITGAGDKAFCSGGDQNVKGHGGYIGKDGVPRLSVLDVQKQIRSIPKPVIAAVNGFAI GGGHVLHVVCDLSIASENAIFGQTGPRVGSFDAGFGASYLARVVGQKKAREIWFLCRKYN AQEALDMGLVNKVVPLEQLEDEYVQWAEEMMLLSPLALRMIKAGLNAELDGQAGIQELAG DATLLYYLTDEAQEGKNAFLEKRKPNFKQYPKFP >gi|225935365|gb|ACGA01000027.1| GENE 6 5203 - 6870 1303 555 aa, chain - ## HITS:1 COG:lin1783 KEGG:ns NR:ns ## COG: lin1783 COG1165 # Protein_GI_number: 16800851 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Listeria innocua # 19 541 23 561 580 182 28.0 1e-45 MYTDKKNILQLVALLEAHGITKVVLCPGSRNTPIVHTLSNHPNFTCYAVTDERSAGYFAI GLALNGGKPAAVCCTSGTALLNLHPAVAEAFYQNVPLVVISADRPAAWIGQMDGQTVPQP GVFQTLVKKSVNLPEIHTEEDEWYCNRLVNEALLETNHHGKGPVHINVPISEPLFQFTVD ALPEVRVITRYQGLNVYDRDYNDLVDRMNKYQKRMIIIGQMNLIYLFEKRYIKLLYKHFA WLTEHIGNQTVPGIPVKNFDAALYAMPEEKIDQMVPELLITYGGHVVSKRLKKFLRQHPP KEHWHVSPEGEVVDLYGSLTTMIEMDPFEFLEKIASLLDNRTPDYPRVWENYCKIIPEPD FAYSEMLAVGTLLKALPESCALHLANSSVVRYAQLYSIPPTIEVCCNRGTSGIEGSLSTA VGYAAASDKLNFIAIGDLSFFYDMNALWNINVRSNLRILLLNNGGGEIFHTLPGLDMSGT SHKFIAAVHKTSAKGWAEERGFLYLQVQNDEELTEAMKTFTQPETMEQPVLLEVFTNKNK DARMLKNYYHQLKQK >gi|225935365|gb|ACGA01000027.1| GENE 7 6890 - 8005 785 371 aa, chain - ## HITS:1 COG:VNG1083G KEGG:ns 
NR:ns ## COG: VNG1083G COG1169 # Protein_GI_number: 15790177 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Halobacterium sp. NRC-1 # 107 360 177 436 441 112 28.0 2e-24 MTDEEISNLTTIDAFIQRKQPFAVYRIPGEKVPRLLTQAEGAVRLIYDLKELNGQRGFVI APFQVSETCPAVLIQPDQWGQPLPIDNDTAEEREVALRMQGQESFLTSSTEEYASCFHTF INALRDNTFDKLVLSRHLTIDKVSGFSPLSIFRAACRRYIHSYIYLCYTPQTGIWLGSTP EIILSGEKDEWNTVALAGTQPLQDGKLPQIWDEKNRKEQAYVASYIRRQLLSLGIHSTEN GPYPAYAGALSHLKTDFQFSLKDNKGLGDLLKVLHPTPAVCGLPKEEAYRFILQNEGYDR RYYSGFIGWLDPEGRTDIYVNLRCMHIKDEQLTLYAGGGLLASSELNDEWLETEKKLQTI KRLIATPPLKS >gi|225935365|gb|ACGA01000027.1| GENE 8 8002 - 9234 1145 410 aa, chain - ## HITS:1 COG:SP0923 KEGG:ns NR:ns ## COG: SP0923 COG0561 # Protein_GI_number: 15900803 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pneumoniae TIGR4 # 4 269 3 269 269 164 38.0 4e-40 MKYKLLVLDVDGTLLNDAKEISKRTLAALLKVQQMGVRIVLASGRPTYGLMPLAKMLELG NYGGFILSYNGCQIINAQNGEILFERRINPEMLPYLEKKARKNGFALFTYHDDTIITDSP ENEHIQNEARLNDLQIIKEEEFSAAVDFAPCKCMLVSDDEEALLGLEDHWKRRLNGALDV FRSEPYFLEVVPCAIDKANSLGALLEVLGMKREEVIAVGDGVCDVTMIQLAGLGIAMGHS QDSVKACADYVTASNEEDGVAVAVEKAIISEVRVAEIPLDQLNAQARHALMGNLGIQYTY ADEDRVEATMPVDHRTRQPFGILHGGATLALAETVAGLGSMILCQPDEIVVGMQVSGNHI SSAHEGDTVRAVGTVVHKGRSSHVWNVDVFTSTNKLVSSIRVVNSVMKKR >gi|225935365|gb|ACGA01000027.1| GENE 9 9318 - 9686 268 122 aa, chain - ## HITS:1 COG:BMEII0787 KEGG:ns NR:ns ## COG: BMEII0787 COG3189 # Protein_GI_number: 17989132 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Brucella melitensis # 3 117 1 114 116 105 49.0 3e-23 MEMIQVRIKRVYEDFSETDGYRVLVDKLWPRGMKKEWLKYDYWAKDITPSPTLRKWFHED IPGHWGDFVTQYQKELDASPSMADFLTLIKPHPVITLLYASKEPVYNHARILRDYLEMRL RE >gi|225935365|gb|ACGA01000027.1| GENE 10 9745 - 11502 1254 585 aa, chain - ## HITS:1 COG:no KEGG:BT_4697 NR:ns ## KEGG: BT_4697 # Name: 
not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 583 4 588 589 679 61.0 0 MRFWVCIFVLLFVHNQFSRADKIINNDSLYTEKYIRDIYISNPKRALQLLDEAETRKAFP LRLINELRSLSYRNMYMNKLAFMYARKSYLLDSISQREPKHMLKMTVYLAELSSIMSKYN ESMHYALSGIMQAQKLKDREAEARLLFCIGENNWRLSLKDEAYNYFGRTIELLRGSKDMR EMMLLSYYYGAEMGFLMTDSRIDEALALAYEREKLLKKLEKVPEVPEGYIDGQYSYLYAN LAYISYLEKKYAQAEGYYQKYLAIKESHTPDGKMYSIPYLILSKQYETVIDNCKDFKELL RTQRDTLNAQYLTILNKEVQAYLGLNRYKEAAEIRETIIAITDSINSTDRKNAALELNAM YGASEKEEYIAEQASQLKIRNVSLCFLACIVVLTLFILWRLWRFNHIIEYKNRMLAKLIN EKFANKKDGNQLLEVYEEQEVSSELEPELISPEEQDELLDETDKESGEEEENKKIFQELN HTVVQKQLYLSPELSREDLAQIVHLNNARFARMIRECTGTNFNGYINGLRINYAIKLMKK YPNYTIRAIADESGFNSAPILYNLFKKKTGMTPYEFKKAQDTLRN >gi|225935365|gb|ACGA01000027.1| GENE 11 11644 - 12423 774 259 aa, chain - ## HITS:1 COG:no KEGG:BT_4696 NR:ns ## KEGG: BT_4696 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 254 1 254 255 456 82.0 1e-127 METNYITLTKENIANEHICCAFSDKKCKDSYELKKEWLKKEFDNGYVFRRIDERAKVFIE YGPAEKAWVPVNASNYLMINCFWVSGKYKGCGHGKALLQSAIDDAKLQGKDGLVTVVGTS KFHFMSDTKWLLRQGFQTIEKLPYGFSLLGLKLNPAAPDPSFNDCVLGGECPDKGGIVVY YTHRCPFTEFHVRGSLVSAAKSKDIPLKIIELESMEQAQNAPTPATIFTLFYNGKFVTTD LSVCLEGRLSKALGCDSVE >gi|225935365|gb|ACGA01000027.1| GENE 12 12479 - 13408 302 309 aa, chain - ## HITS:1 COG:VC1614 KEGG:ns NR:ns ## COG: VC1614 COG3129 # Protein_GI_number: 15641622 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Vibrio cholerae # 2 305 44 348 362 309 49.0 5e-84 MAKKGELHIRNKHNGQYDFSLLMENYPPLKRFVSLNPLGIQTINFFNPQAVKALNKALLV TYYGIRYWDIPKQYLCPPIPGRADYIHYIADLIQPNGTTPDLPAGDANGQKSKCRCLDIG VGANCIYPIIGHTEYGWTFVGTDIDPVSIENARKIVTCNPVLAHKIDLRLQKDSKKIFDG IIMPDEYFDVTICNPPFHSSREEAEDGTLRKLSSLKGTKINKVQLNFGGSANELWCEGGE VRFLLNMISESQKYQKNCGWFTSLVSKEKNLEKLYAKLKSVNVSEYKVIRMQQGTKSSRI LAWRFSTNS >gi|225935365|gb|ACGA01000027.1| GENE 13 13481 - 14131 526 216 aa, chain - ## HITS:1 COG:RSc0292 KEGG:ns 
NR:ns ## COG: RSc0292 COG2197 # Protein_GI_number: 17545011 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Ralstonia solanacearum # 5 210 2 210 210 85 28.0 8e-17 MNTDITFIIADNQDITRMGMHGYISAIFSGCRMIDVTDKKELMLALVECNDSVVILDYTL FDINGIEEFLIIEKRFPRVRWILFSNELSEDFIRRMSIERNIGMILKENSGEEIHSALMC TAHGERFLCHQITNLLIIGSDKPEIHSVLTVTEIDILKLIAHGKSVKEIALERTSSIHTI ITHKKNIFRKLGVNNVYEATKYTLRAGLIEMVEYYI >gi|225935365|gb|ACGA01000027.1| GENE 14 14201 - 15649 1310 482 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_4499 NR:ns ## KEGG: Fjoh_4499 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 10 477 5 473 474 526 56.0 1e-147 MKRRIYFLYIIWVQVVAISLSSCDDELTDINRNPNATENPQPAYLLAAAQYHAANLYWGS STNYNSTLLWVQHWAKIQYTEPDCYNVDNGSFTTTWDTGYATLITDLNAIISSELGNDNY RGIATVWRSWVYLLLTNLYGNIPYSEYAQSVTPSYDSQETVLRGLLEELIAADQLLSTTG GTVEGDLIYGNDITRWKQLANSLRLRIALELADRDEDTARRVIASLWDNRNSVIDSNDAI ARFVFTSSPQWNPWASAFNSRDDQRVSKTLIDKLNEWNDPRIGILAQLPQDESVKNYVGA ANSLSADAANNQGFNKVSRPGTYFLKDSSPAVFYTYAEVLFIFAESAARGWITADAETLY REAITASLNQFGIIDNRIIDSYLQQEAIRFDAAHWYESIGWQKWIAYYGQGPDAFTDWRR LGYPQLIPGPDSVLGSGELPRRFFYPVTEQSLNGKQYQIAISHQGADEITTRLWFDVANK ER >gi|225935365|gb|ACGA01000027.1| GENE 15 15677 - 18916 2499 1079 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_4500 NR:ns ## KEGG: Fjoh_4500 # Name: not_defined # Def: TonB-dependent receptor, plug # Organism: F.johnsoniae # Pathway: not_defined # 28 1079 35 1058 1058 1108 56.0 0 MGKRIHLFLLALAIGVIQGAAQVTTVRGIVTTEEDGEPVIGASVIVKGTALGTVTDVNGR FELSGLPPSATRLLISYISLMAKEVAIAPQVSVTLKSDTHLLDEVVVTALGISREKKALG YTAQEVKQDALVQGKDNNLLNSLSGKIAGVRITNTQGDVGSSRIVIRGETSIAGENQPLF IVDGIPVDNSQLNARSSGRDFKNAIADLNPEDIKTLTVLKGPNAAALYGARAAHGAIVIT TKGGDKRQKGIGITLHSSTQVSFVATLPEFQNLFGQGAGGRFSYVDGKGAGVNDGVDESW GPRLDIGLLIPQFDSPLDADGNRVATPWVSHPNNVRDYFRMGISTNNGISIARGDDKYQF RVGYNYEKQVSIVPDAGTNKTNISLNTDYHLAKWIVVGATANYIVYTAPSLPGSATPSGS 
NVRSNSPMLQFLWFGRQVDTNSLKADYTRNWNSSYYDNPFWSASYNTQSQERHRLIGDLH AEFRLTDGLNVRFRTSTDWYNDRRKSKVKWGSAGAGSPYGSYAEDAYTVKENNTEVLATY IKQLNKNWGIDALLGFNVRNKQYENNYQAAPRLAVADLYTLTNSRDPLTSSNDFYRLRQY GLYGSIQLDYRRWAFLNITGRNDWSSTLPVDNNSYFYPSVTASVLLSEAFGWRSKAVNYL KIRGGWSQVGADANPYQLATVFTSETAFNGNPLQSSSTIGMNPNLKPENTSSIEAGFEAA FWDNRLYLDFTYYKTDSRNQILKLATTAASGYTSQVRNAGHIRNRGYEIQLGAVPIQTSK GFRWNLDLNYGANSSKVVKLDDEGLITSYQLYSSGIQILASVGEAYGTLFGTSYVRDANG NVVVDANGLPKISTTNKTLGKFTPDWTGGISNTFSYRSLSLSFLIDASVGGSIFSNTNKT GKYTGVLANTLSGRDAEHGGLWYYTDAMGNNVRLPESPSYSVSSDGLYYAQVNGQSTRVY QDGIMVEGVTESGSKNEEVVSAEKYYHRIYSIAEANVYDASYVKLREVALSYRLPRLWTQ KLHLQEASVTLTGRNLWTIYKSVPNIDPESALTTGNAQGVEAYSLPTTRSFGVNLSVKF >gi|225935365|gb|ACGA01000027.1| GENE 16 19051 - 20235 912 394 aa, chain - ## HITS:1 COG:FN1419 KEGG:ns NR:ns ## COG: FN1419 COG0626 # Protein_GI_number: 19704751 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Fusobacterium nucleatum # 1 390 3 393 395 222 36.0 8e-58 MKKQTQAIHTQFQRRDAYNSISMPVYHTAAYEFDNADDMADAFCGRTDAPDYSRVMNPTV TFFENKVKELTGAAEVIALNSGMAAISNTLFSIAAAGKNIVTSRHLFGNTYSFIAGTLSR FGVEPRLCDLTDIGAVEKVVDSNTCCIYMEIITNPQMEVADLQALSRIVHERGIPLIVDT TLIPFTEFDSHSLGVDIEVVSSTKYLSGGATSIGGLVIDYGTCPGFGKRMRTEMLLNLGA YMTPHVAYMHTIGLETLDARYRIQSANAAMLAGKLRILSAVKRVNYVGLKDNPYYALAQR QFGPTAGAMITIDLESKETCFAFINRLKLIRRATNLFDNKTLAIHPASTIFGPFTDKQRQ DMDVLDTTIRFSIGLEDVDDLFDDVRQALDNKKD >gi|225935365|gb|ACGA01000027.1| GENE 17 20760 - 23261 2166 833 aa, chain + ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 13 832 10 784 785 357 30.0 5e-98 MIYPQNFEQKIGFDQIRQLLKDKCLSTLGEERVTDMNFSEQHEEVEEKLNQVTEFVRIIQ EEDGFPDQFFFDVRPSLKRVRIEGMYLDEQELFDLRRSLETIRDIVRFLHRNEDEEENNT PYPSLKRLAGDIAVFPQLIGKIDGILNKYGKIKDNASTELARIRRELASTMGNISRSLNS 
ILRSAQSEGYVDKDVAPTMRDGRLVIPVAPGLKRKIKGIVHDESASGKTVFIEPAEVVEA NNRIRELEGDERREIIRILTEFSNILRPSIPEILQSYEFLAEIDFIRAKSYFAIQTNSLK PAVENEQLLDWTMAVHPLLQLSLAKHGKKVVPLDIELDQKQRILIISGPNAGGKSVCLKT VGLLQYMLQCGMLIPLHERSHAGIFSSIFIDIGDEQSIEDDLSTYSSHLTNMKIMMKSCN ERSLILIDEFGGGTEPQIGGAIAEAVLKRFNQKGTFGVITTHYQNLKHFAEDHEGVVNGA MLYDRHLMQALFQLQIGNPGSSFAVEIARKIGLPEDVIADASEIVGSEYINADKYLQDIV RDKRYWEGKRQTIRQREKHMEETIARYQTEMEELQKSRKEIIRQAKEEAERMLQESNARI ENTIRTIKEAQAEKEKTRQARQELTDFRTSLDALASKEHEEKIAKKMEKLKEKQERKKNK KSEPKAAVSSPSSAPKIVPIAVGENVKIKGQTSIGQVMEISGKNATVAFGSIKTTVKIDR LERTNHTPKTEGIAKSTFVSSQTHDQMYEKKLSFKQDIDVRGMRGDEALQAVTYFIDDAI LVGMDRVRILHGTGTGILRTLIRQYLATVPGVSHYSDEHVQFGGAGITVVDFD >gi|225935365|gb|ACGA01000027.1| GENE 18 23348 - 24400 793 350 aa, chain + ## HITS:1 COG:MA1721 KEGG:ns NR:ns ## COG: MA1721 COG0598 # Protein_GI_number: 20090573 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Methanosarcina acetivorans str.C2A # 10 349 20 355 356 224 37.0 1e-58 MKNNLLSEKLIYTGESLTPTHLHLCTYNATEMQESSSDTFQAIKGTLNNDQINWLQIHGM KNTETIREICSHFEIDFLVLQDILNADHPTKIEEHDKYIVLILKIFYPNEHKEENELDEL LQQQVCLIIGNNYVLTFLEKETDFFDDVSSALRNDVLKIRSRQTDYLLSVLLNSVMGNYI STISSIDDALEDLEEELLTITSGDDIGIQIQALRRQYMLMKKAILPLKEQYIKLLRAENL LIHKVNRAFFNDVNDHLQFVLQTIEICRETLSSLVDLYISNNDLRMNDIMKRLTIVSTIF IPLTFLVGVWGMNYKWMPELEWQYGYLFAWIIMAIIGIIVYLYFRKKKWY >gi|225935365|gb|ACGA01000027.1| GENE 19 24420 - 25628 1054 402 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 402 1 402 408 430 52.0 1e-120 MKSIKELYRIGTGPSSSHTMGPRKAAEMFLERHPDAASFKVTLYGSLAATGKGHMTDVAI IDTLQPIAPVEIVWQPKVFLPFHPNGMTFAALDSNDKVQENWTVYSIGGGALAENNDNPT IESPDVYDMENMTEILQWCEDTGKSYWEYVKECEEEDIWDYLAEVWATMKDAIHRGLEAE GVLPGPLNLRRKASTYYIRATGYKQSLQSRGLVFSYALAVSEENASGGKIVTAPTCGSCG VMPAVLYHLQKSRDFSDMRILRALATAGLFGNIVKFNASISGAEVGCQGEVGVACAMASA 
AANQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAYAAARALDANLYSAF TDGMHRVSFDKVVQVMKQTGHDLPSLYKETSEGGLAKDYKQM >gi|225935365|gb|ACGA01000027.1| GENE 20 25693 - 26193 459 166 aa, chain - ## HITS:1 COG:no KEGG:BT_4677 NR:ns ## KEGG: BT_4677 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 166 258 77.0 4e-68 MWRFKFSAWAVVLMMTALSFGACDNDDDDTFVPPSNITEALKQVYPAAQNVEWEMKGAYY VADCWVSNDELEVWFDANANWVMTENELNSIDQLVPAVYTAFMDSKYNTWVVTDVYVLTF PQNPMESVIQVKQGSQRYALYFLQEGGLLHEKDISNGDDTNWPPTE >gi|225935365|gb|ACGA01000027.1| GENE 21 26235 - 26681 586 148 aa, chain - ## HITS:1 COG:no KEGG:BT_4676 NR:ns ## KEGG: BT_4676 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 147 147 239 82.0 2e-62 MKKILSLLVLAIVAVQFSFAKDVITKDMNQLPLPARNFINGNFTKPQIAHIKIDKDMMES TKYEVVLMDGTEIDFDSKGNWEEVSAKKGQVVPVSIIPGFAVNYLKSHNFVNEGVTKVER DRKGYEIELSTGLSFKFDKKGKFIKADD >gi|225935365|gb|ACGA01000027.1| GENE 22 26949 - 28523 906 524 aa, chain - ## HITS:1 COG:no KEGG:BT_1642 NR:ns ## KEGG: BT_1642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 521 1 518 521 854 81.0 0 MIDMENLQRLYPIGIQTFSKIREGNYLYIDKTEYVYRMTHSASDYMFLSRPRRFGKSLLT STLHSYFSGHKELFRGLAIEKLEKEWMEYPVLHFDMSTAKHVDEEQLFQELNLKLFNYEE IYGKLEEEINPNQRLMGLIKRAYQQTGKKVVVLIDEYDAPLLDVAHERENLDVLRNIMRN FYSPLKACDPYLRYVFLTGITKFSQLSIFSELNNIENISMDEPYAAICGISEDEIRSQMK EDVEGLAKKLEITPEEVLMKLKENYDGYHFTYPSPDIYNPFSLLTAMEKGKIGSYWFGSG TPTYLIKMLDKFGVVPSEIGKKKAAVEDFDAPTERMTSIIPLLYQSGYITIKDYDEELDL YTLDIPNKEVRIGLMKSLLPYYVVSKAPETNTMVAYLSRDIRNGDIDAALRRLQMFLSTI PQCDNTKYEGHYQQIFYIIFSLLGYYVDVEVRTPRGRVDMVLRTETTLYVMELKLDKGAD RAMEQINLKNYPERFALCGLPIIKIAVSFDSERCTIGEWKIMEG >gi|225935365|gb|ACGA01000027.1| GENE 23 28828 - 29184 228 118 aa, chain - ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 119 
132 50.0 3e-30 MAKEKQKPYEFLSNLVLALMGTDRIFSNSFFISEFAISPNTLGEIRRGEDMCIYQYVRVI QCMMKYLHLLVRMDMLLKELRTVLASNCDLVVATIPHRFHGTYQPKEWVVVMHWDGIK >gi|225935365|gb|ACGA01000027.1| GENE 24 29611 - 29823 94 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171230|ref|ZP_05757642.1| ## NR: gi|260171230|ref|ZP_05757642.1| hypothetical protein BacD2_05128 [Bacteroides sp. D2] # 1 70 1 70 70 120 100.0 2e-26 MTVLLKTEANANKMSLSFPNSILPPKNPDNIRNFVQELPNHFLTDKTIQVGAKHDFNIRK ATRPLKHLTY >gi|225935365|gb|ACGA01000027.1| GENE 25 30360 - 31094 523 244 aa, chain - ## HITS:1 COG:CAC2317 KEGG:ns NR:ns ## COG: CAC2317 COG1922 # Protein_GI_number: 15895584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Clostridium acetobutylicum # 39 222 50 231 250 120 35.0 2e-27 MIKLKELSIMESTEQLKLLPQGKLLINTINAHSYNTAVKDPQFAKALMKGDVLIPDGASI VKVCKWLKMQSQPRERIAGWDLFVLEMNRLNEKGGRCMFMGSSEKVLALIKQRAAVDYPN LEVVTYSPPYKPEFSQEDNAAIVAAINEANPDLLWIGMTAPKQEKWIYSNWQKLNIHCHV GTVGAVFDFYAGTTERAPLWWQQHSLEWFYRLLKEPKRMWRRYLVGNVLFLWNICREKYG YTMR >gi|225935365|gb|ACGA01000027.1| GENE 26 31230 - 31352 72 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPYMEWLNSLIIYPILTLLKMDKDYNCDIFENGNQDYDNK >gi|225935365|gb|ACGA01000027.1| GENE 27 31330 - 31683 226 117 aa, chain - ## HITS:1 COG:BH3712 KEGG:ns NR:ns ## COG: BH3712 COG0438 # Protein_GI_number: 15616274 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Bacillus halodurans # 13 102 284 373 395 67 37.0 6e-12 MQGLCQVKKKNIFLKSLDVFVLPSYFEGLPMSLLECMSYGAVPVTTNVGSIGDVVEDCKN GLFIKMKDTDSIIEVLTNLNNDRKMLSRLGNEARATIFQNFNPHKYIQNLNAIYGMA >gi|225935365|gb|ACGA01000027.1| GENE 28 31631 - 32404 424 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171234|ref|ZP_05757646.1| ## NR: gi|260171234|ref|ZP_05757646.1| hypothetical protein BacD2_05148 [Bacteroides sp. 
D2] # 1 257 1 257 257 508 100.0 1e-142 MKIIVVAPSLDPSKNVSGISSVTKFIISNNKEQEYIHFELGRKDNEEGGIFRIFSILKNM REWKNMLNKERDAIIHYNFPLSKASILRDPLFMGIALQQKRKMVVHIHGGVFLTHYKEAS SFLKWILKKVFSWNVPFIVLGENEKNIIEQTLKAKRVFLLPNCVDLQDAESFERVYNWIE PLNIGYIGRIAVTKGMENLLQACRELKAKQIPFIVKLAGAEEVKDSFLPLFDSYLGHQFI YAGIVSGEKEKYFFEKS >gi|225935365|gb|ACGA01000027.1| GENE 29 32459 - 33247 264 262 aa, chain - ## HITS:1 COG:no KEGG:Mfla_2015 NR:ns ## KEGG: Mfla_2015 # Name: not_defined # Def: glycosyl transferase, group 1 # Organism: M.flagellatus # Pathway: not_defined # 3 253 122 378 382 76 27.0 1e-12 MGIHDVKNHSSVKPPLFLIISQWIALMTARCYVLFSKNQYELFKQLYPNKRSELVGMSYK DFGRPSVQTCNTIKEEIKILFFGTLQPYKGYDLLFDAFERLIEEGMTNLRLSLYGRCGNQ EIERICREKIKHPEFYEMHLEFVDNREIPDIFASHHFAVFPYRDATNSGPLMIAVNYGLP VIAPDHSCFKDVCTDGVNSLLYDNKEKEALYEKLKYLSQISYDDYMIMRTNCDILKSTFS EATIAQNYINLFKVITIKTDGK >gi|225935365|gb|ACGA01000027.1| GENE 30 33586 - 34053 211 155 aa, chain - ## HITS:1 COG:MT0122_2 KEGG:ns NR:ns ## COG: MT0122_2 COG0241 # Protein_GI_number: 15839494 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Mycobacterium tuberculosis CDC1551 # 14 147 39 173 217 97 40.0 7e-21 MNLVDIDVAGYDTLLLDRDGVINKLCPNDYVKCWEEFEFLPGIFEALVKWNKHFRHIFIV TNQRGVGKGVMTEQALRGLHCRMCEEIISHGGRIDKIYYCTALTEKDNRRKPGNGMFLDI LRDYPDVEKERCLMIGDSESDMKFAENCGIKGILL >gi|225935365|gb|ACGA01000027.1| GENE 31 34090 - 35220 330 376 aa, chain - ## HITS:1 COG:no KEGG:BDI_1468 NR:ns ## KEGG: BDI_1468 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 19 376 22 395 418 163 32.0 8e-39 MRSNVFYTLFVVVYYPICLLLYYEWNFDYSDEILTALLVAYFCMQSKKTFNKEFCCFLLI MMAYVAYSIFIAITSVDAIFYDLQQQVKPYLIFYATFAISPQFTTGQCRFLRKWVLFCYV LYFISIIPHIIENLGYDRAEFSRNSIVCALLYYYFSTESRRAVIITLLIMTSGLLSLKSK FFGEYILMVALFLLVKKKVNFKSVKAICIEVILLVVVLFFTWTKFSFYYIEGFKDSSQEL ARPISYITSLKIFVDYFPFGSGFATFSSNAAGVYYSPLYYKYNIDGVWGLSPDNPAFLAD AFYPLLAQFGIVGVLLFCVFWRRRFSQIQNINMLKYYRLAIMASLCLLIESVGDTSYLSS 
PGIAYFLILALCLRKI >gi|225935365|gb|ACGA01000027.1| GENE 32 35346 - 36050 574 234 aa, chain - ## HITS:1 COG:Cj1423c KEGG:ns NR:ns ## COG: Cj1423c COG1208 # Protein_GI_number: 15792741 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Campylobacter jejuni # 1 221 1 213 221 183 49.0 2e-46 MEVIILAGGLGTRLRSVVSEIPKCMAPVAGKPFLWYLLKYLARYEVSRVILSVGYLREVI YQWIDEVKDEFPFTFDYAVEDEPLGTGGGIKLAMDKIEGTEALILNGDTFFDVDLNDLIC KHLNQNALLSLALKPMNNFDRYGNVKCNERGEILVFEEKRYCKQGMINGGIYVLDKVTPF FDHLPRKFSFETQVLEPNSGKGCISGIVQDRYFIDIGIPEDLAKANIEFPNLSF >gi|225935365|gb|ACGA01000027.1| GENE 33 36057 - 36653 399 198 aa, chain - ## HITS:1 COG:Cj1424c KEGG:ns NR:ns ## COG: Cj1424c COG0279 # Protein_GI_number: 15792742 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Campylobacter jejuni # 7 198 8 200 201 226 60.0 3e-59 MDSIDIVRKQVWNSAYVKNLLLENEAVIARIAQVADLCTEAYRNGKKTLFAGNGGSAADA QHFAGELVSKFYFDRPGLPSIALTTDTSIITAIGNDCGYKQIFARQIQAQGVKGDVFIGI STSGNSENIVEVLPVCREKGITSIALTGANPCKMDEFDIVIKVPSNETPRIQECQTLIGH IICCIVEENIFSEKYNNK >gi|225935365|gb|ACGA01000027.1| GENE 34 36681 - 37721 734 346 aa, chain - ## HITS:1 COG:Cj1425c KEGG:ns NR:ns ## COG: Cj1425c COG2605 # Protein_GI_number: 15792743 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Campylobacter jejuni # 3 340 4 339 339 328 49.0 1e-89 MIIRSKAPLRLGLAGGGSDVSPYSDIYGGLILNATINLYTYCTIEETYDGLITIDSDDAR CYKSYLVAKQLEIDGEASLIKGVYNRVMRDFDISLRSFKITTYNDALAGSGLGTSSAMVV CILKAFIEWLGLPLGDYEASRLAYEIERKDLALSGGKQDQYAAAFGGFNYMEFLPNDLVI VNPLKIKRWIMDELEASMVLYFTGASRSSAAIIEQQQKNTSSGNQNAIEAMHRIKQSAKD MKLALLKGDMNEFARILGQAWEDKKKMANAISNPMIQEVFDVAMSAGALAGKVSGAGGGG 
FVMFMVEPTRKKEVVNALKKLNGFVMPFQFTEGGAHGWKIYPTDKV >gi|225935365|gb|ACGA01000027.1| GENE 35 37718 - 38929 554 403 aa, chain - ## HITS:1 COG:aq_1641 KEGG:ns NR:ns ## COG: aq_1641 COG0438 # Protein_GI_number: 15606745 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Aquifex aeolicus # 223 387 143 305 316 78 29.0 2e-14 MFKKNKMKIAFLLRPWPIYGGGETVTITLSNEFVKRGISVIVFYTKDSKSQNVPLIDNRV KSVFVPNIKSDEHQFSFTKNEISFASKFLSDHVEKENIDIIINQWWPAQTTQNARKRARV VQCLHMSLFLPANYDNLTWKGLDIIKKIMGRTLYDYIHKESRCRQLEESLPFVDKYIFLA KGFMNDYVSFRKNDRNLAKLDFCNNPLPLECSFDKTDFNKKENMVLFVGRMYESHKKVSV MLKIWREIEADRNLDSWSFTLVGDGPDKPMFQNMAKEMQLKRVKFEGYQLPLDYYKKAKI FLMTSAHEGWGMTLVESQQNGVVPIVMDTFSSCHDIIENGESGFIVGMNNVTEYIKRLKS LMIDEDMRMCMALKGLDTCKRFSVEKIVDRWQQIFNELLKDKS >gi|225935365|gb|ACGA01000027.1| GENE 36 39055 - 39885 252 276 aa, chain - ## HITS:1 COG:no KEGG:BVU_2666 NR:ns ## KEGG: BVU_2666 # Name: not_defined # Def: putative O-acetyltransferase Cps9vM # Organism: B.vulgatus # Pathway: not_defined # 22 250 15 245 275 154 37.0 4e-36 MSINNFINRSFGFIERKLSVNRINWFCTLYFNFRTLPFNIACKLPVYIYGKIELVSLNGS VLFDCIPVKGMVKIGKHRDYYTPPTKTIVKISQLGQIVFRGYCNISQGAIINITDGCLTL GDRVWIGQSVLLDCNNSIQLGEGTRIAYGCKFSDSNHHYIVDGDGNIPRIKGLIEIGRYN WIGNNTVITKGCKTTDHTICTHGSLLNKNYPQKYDSNESMLLAGTPAKLLKVGYQRINSA KLENEIVDFFKTHPNESVMSMHLPLVNNTCELIYYK >gi|225935365|gb|ACGA01000027.1| GENE 37 40044 - 41093 386 349 aa, chain - ## HITS:1 COG:no KEGG:LCABL_02380 NR:ns ## KEGG: LCABL_02380 # Name: cpsH # Def: CpsH # Organism: L.casei_BL23 # Pathway: not_defined # 2 337 13 335 350 97 24.0 6e-19 MKSVLLIAPTFLNLYIDLKIEWEKRGYNVTYIEEESYSFDPSLIRNKQLSYWKEKLYIQF LKKKNIQLYEKKLKGKTFDYLMVINGKSFHPILLDQIRRDNPMVKTVLYLWDRTYKNYRF DRNFDCFDRVATFDREDAKQFNLEFLPLYWIEGNRSQCAKYKIFAFGSYRNDRKIIFELL YKVAIENELPSYIKLYIPSVPKSLLSRIKTVIKKITKRKSCNDCNQDILIHEPLSPSEFR KCIYSSLCVIDTYNDFQEGMTPRFMWALGAGRKIITTNKNAINYPFYSPDRILVINNMVD TTSLLLFLDEKINKNEQIEDLIVQYRIDNWMNYLLCGKEKRASVCKGGL >gi|225935365|gb|ACGA01000027.1| GENE 38 41278 - 41769 269 163 aa, chain 
- ## HITS:1 COG:no KEGG:LAR_1296 NR:ns ## KEGG: LAR_1296 # Name: not_defined # Def: hypothetical protein # Organism: L.reuteri_K # Pathway: not_defined # 13 130 106 217 255 70 33.0 1e-11 MRKLPNLFERGCDIVVPKPYHFYRITVEEYFDERVGTIIMRLLRLILKKQYPNYYPYFEK TLRGSRLYYANLEIMRNEIFDEYCSFIFEVFDLLENELLNKKHYIDLNKEFALYRVFGYV GELLTNTFILKKKAEGLRVKEFTVLFNSAVKGNNTTDYSNIKL >gi|225935365|gb|ACGA01000027.1| GENE 39 41785 - 42174 292 129 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0733 NR:ns ## KEGG: Acfer_0733 # Name: not_defined # Def: exopolysaccharide biosynthesis protein # Organism: A.fermentans # Pathway: not_defined # 11 83 13 80 257 64 43.0 9e-10 MNRKMSIGVCYYLPEPTYICNEYCIPIQLGFYETGIEMGIQKDNTGDNRSLKHPSYSEYS GVYWLWKNMFSEYKGMVHHRRSFTLENISLSKRFGIYTGLLINYAKNMFKHSAITYNEIV ECKSSEEYK >gi|225935365|gb|ACGA01000027.1| GENE 40 42178 - 43713 648 511 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0062 NR:ns ## KEGG: Dfer_0062 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: D.fermentans # Pathway: not_defined # 2 492 1 492 510 343 39.0 7e-93 MLSTTEQENNKRIAKNTIYLYIRMFLTMIVSLYTSRIVLRVLGDSDYGLYNVIGGILTMF SMFSGALTVGTQRFLTFALGEKNFEKLQSTFSMVLGLHLAIAVVILLLSETIGLWFFYEY LKIPDGRMTAALWVYQFTILGFLINLIQLPFQACLIAHEKMSMYAYMSIYDVVMKLMLVI ILQYVTFDKLVLYAVVVFVVNLTSAVIYNFYSRRQFSECTFRIAWDGNLAKKFASYSGWN IVGGSIGFFTNQGINILLNIFCGTIVNAARGLSITVNTFLLNFVNNFQTAVNPQIVKLYA ANEYDKLHRLVIYNCRIAEYLYLLIAIPIFIEIEFVLKIWLGDYPEYTPVFVQIILIQSA WNPLSYPVGMLIHASGRMKWPSIWASLLILIFPISYILLKTGYSPVTVYIASATIWLYLN GCNLYFANRYTYLSIKRVLHEVYFNVIPGAAVMFIIPYIISEQMETGWSSFLVVSSVSLI TSTLVIFVWGMTPNMRILVLKKLHINNQLNK >gi|225935365|gb|ACGA01000027.1| GENE 41 43770 - 45029 638 419 aa, chain - ## HITS:1 COG:YPO3111 KEGG:ns NR:ns ## COG: YPO3111 COG1232 # Protein_GI_number: 16123277 # Func_class: H Coenzyme transport and metabolism # Function: Protoporphyrinogen oxidase # Organism: Yersinia pestis # 28 415 38 423 427 327 45.0 2e-89 MHTIILGAGISGLGAGYSLAQNGNNPIILEKDETYGGLCGNFTIEGFRFDRFVHFSFTKD EKVNKFFVDSSPEIYCHKPNPFNFYNGLWIKHPAQNNLFPLPKEEKEKIIADFRGRPKVD 
SLVVENYEQWLRIQYGDYFAEHFPMVYTRKYWMTEAYDLETRWIGSRLYQPSLDEVIQGA QTADTPVTYYSKEMCYPVKGGYKQYLVTLAMGQDIRYNEEVIKISPQSKSLHTASGTEYT YNRLISSLPLPMIVKMLSDVPEEVLSAAKMLRCTCGYQISVGLKIKDIPPYLWWYIYDED ILPARVYSPSLKSPDNAPEGCSSLQLEIYCEQGKYTNEELISQSVGKLIKLNVIKQEDIL FIDVRYEKYANVIFDQNIYTMRKIVRDYLWSIGIETIGRFGEWDYLWSDQSLLSGLRKR >gi|225935365|gb|ACGA01000027.1| GENE 42 45017 - 45919 318 300 aa, chain - ## HITS:1 COG:MA3779 KEGG:ns NR:ns ## COG: MA3779 COG1088 # Protein_GI_number: 20092575 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanosarcina acetivorans str.C2A # 3 241 2 249 318 96 30.0 5e-20 MEKIFLIGGSGFIGKNLVNYLSGRYDIFVIDKLIDVNYFLERPAIKTLKIDLVEQKVPED YPLPDYIINLASIVTAERDISLFDGLISSNLKVLLNLYNRFKDCSQLKLFIQFGSSEEYG NEGVPFIETIREVPNSPYALVKQLTVNTALMLHNNYGFPVMVVRPGNLFGPYQSASKFIP YIITQLKENKQLDVTFCEQKRDFIYVEDFAWCIGKLIEHHSCIIGEIVNVASGISISLRE IIEICKANIGSTSLINYGALPYRENEVMDLKCSINKLETIVGHSLHFDIVTRLKTFSCIQ >gi|225935365|gb|ACGA01000027.1| GENE 43 45919 - 46995 762 358 aa, chain - ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 5 349 2 345 359 375 51.0 1e-104 MDIDIFKNFYRGKRVLVTGHTGFKGSWLSIWLHELGAEVIGVAKDPATDKDNYVLSGIGG KIKADLRADVRDSQRMKDIFQEYQPEIVFHLAAQPLVRLSYEIPVETYETNVMGTINILE AIRITDSVKVGVMITTDKCYENKEQLWGYRENEPMGGYDPYSSSKGAAEIAIASWRRSFF NPEQYEKHGKSIASVRAGNVIGGGDWALDRIIPDCIRALESNKPIEIRSPKAIRPWQHVL EPLNGYMLLASKMWNEPTEYCEGWNFGPRVENIISVWEVGTRILENYGKGELKDLSDSNA LHEAELLMLDISKAKFRLGWEPRMDIEQCIQLVIDWYRCYKTEEVYGLCLEEIKKFIG >gi|225935365|gb|ACGA01000027.1| GENE 44 46995 - 47780 576 261 aa, chain - ## HITS:1 COG:alr2825 KEGG:ns NR:ns ## COG: alr2825 COG1208 # Protein_GI_number: 17230317 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: 
Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Nostoc sp. PCC 7120 # 1 259 1 257 257 327 58.0 1e-89 MKAVIFAGGFGTRLSEATNLIPKPMVEIGGKPILWHIMKTYSHYGINEFVICCGYKQYVI KEYFANYFRHNSDLTVDLSNNSVEILDNHSENWKVTMVDTGLNTQTGGRLKRVQKYIGDE RFVLTYGDGVADINIAESIKEHDLSGCDISLTAYKPGGKFGALQIDLNNNKVLSFQEKPD GDRNWVNAGYFVCEPKVFEYIPACDDTVVFERKPLEDLAKDGKMHAYRHTGFWKPMDTLR DNVELNEMWDKGIAPWKVWGK >gi|225935365|gb|ACGA01000027.1| GENE 45 47787 - 49133 985 448 aa, chain - ## HITS:1 COG:YPO3113 KEGG:ns NR:ns ## COG: YPO3113 COG0399 # Protein_GI_number: 16123279 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Yersinia pestis # 4 442 3 432 437 496 54.0 1e-140 MNRKDTLKQQILQLSREYYNEVHRDSKVFEPGKSFVNYGGRFFNAEELVNLIDSSLDFWL TAGPWAHKFEKRFADWLGVKYCSLTNSGSSANLLAFMTLTSPLLGERRIKKGDEVITVAC GFPTTVTPIIQYGAIPVFVDVTIPEYNIDTTQLETAYSDKTKAVMIAHSLGNPFDLQSVK DFCDRHHLWLVEDNCDALGSEYTINGETRKTGTIGHIGTSSFYPPHHMTMGEGGAVYTND PLLNKITNSFRDWGRDCWCVGGVDNTCKYRFSKQFGELPVGYDHKYVYSHFGYNLKLTDM QAAIGCAQLDKLDSIVLARRKNFQTLLDGLNDTEGLILPEPQKNSNPSWFGFLISVKEDA GFTRNELSEYLESKKIQTRNLFAGNLLKHPAFNEMRQNGEGYRVIGDLKGTDFILNNTFW IGVYPGMTEEMLQYMISTIKEFVTSRKA >gi|225935365|gb|ACGA01000027.1| GENE 46 49197 - 50609 690 470 aa, chain - ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 68 459 70 454 464 218 36.0 2e-56 MVSTTSQINKFIEFIAVLGDLIILNLVLFFLLFFWEEAFVSLPFSCRVSWMMTSLSLCYL ACSGSRGRVWDSRGIRPDQLVLRVLKNIIAFSIFWACIMTFSGISIYSPLFFVAYFSFLF VILSIYRIIIRHFLIVYCAKGKHRRYAVFIGGGNNMQVLYEEMESSLASSLYEVVGYFDI KPNDALSSQCSYLGNPDGFSDFMSAHPGIKHVFCSLSMEEGRYNFSIMNHCENHLLYFHG 
VPNVCKGFPRRIWHSMVGNMPILNLRYEPLGKMENRMLKRIFDIVFSGLFLVTVFPFVYL IVGSIIKLTSPGPVFFKQMRTGLNGVDFVCYKFRSMKVNNEADSKQATADDPRKTRFGNF LRCSNIDELPQFINVFKGDMSLVGPRPHMLAHTETYARLIDKYMVRHFIKPGVTGWAQTH GFRGETRELSQMEERVKADIWYMEHWTMLLDLYIIYKTIANIIVGEKNAY >gi|225935365|gb|ACGA01000027.1| GENE 47 50629 - 51756 780 375 aa, chain - ## HITS:1 COG:no KEGG:BT_1355 NR:ns ## KEGG: BT_1355 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 367 379 472 67.0 1e-131 MIEGESGKEIKQKHNNHTDEEIEIDLMEIFRKIIAIRKTIYKAAGIGLMFGVIIAISIPK QYTVEVILSPEMGNNKGGGLSGLAASFLGSGVSMGDGTDALNASLSADIVSSTPFLLELS AMNIPVTKNEVMTLNTYLDEETSPWWGYVIGLPGMVIGGVKSLFIAEDESISSDKVSRQG TIELSKKELMKIGTLKKMIAASVDKKTSMTSVTVTFQDPKVAAVVADSVVKKLQEYIIDY RTFKAKEDCLYLEKLFKECRQEYYAAQKKYADYLDSHDNIILQSVRAEQERLQNDMNLAY QIYSQVANQLQVARAKVQEEKPVFAVVEPAVVPLEPSGTSRKIYVLAFVFLSVCIVISWN LFGKDLLNKFKEVCA >gi|225935365|gb|ACGA01000027.1| GENE 48 51779 - 52546 733 255 aa, chain - ## HITS:1 COG:Cj1444c KEGG:ns NR:ns ## COG: Cj1444c COG1596 # Protein_GI_number: 15792762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Campylobacter jejuni # 128 202 431 505 552 66 38.0 5e-11 LKEGFIVDGTPGFVLQPYDEVYVRRSPGYQAQQNVVVEGEILFGGSYAMTSREERLSDLI NKAGGATNYAYLRGAKLTRVANASEKKRMGDVIRLMSRQLGEAMMDSLGVRVEDTFSVGI DLEKALANPGSTADIVLREGDVISIPKNNNTVTINGAVMVPNTVSYMEGKNIDYYLDQAG GYSENAKKSKKFIVYMNGQVTKVKGSGKKQIEPGCEIIVPSKAKKKTNIGNILGYATTFS TLGMMVASIANLIKK Prediction of potential genes in microbial genomes Time: Fri May 13 07:37:01 2011 Seq name: gi|225935364|gb|ACGA01000028.1| Bacteroides sp. D2 cont1.28, whole genome shotgun sequence Length of sequence - 457885 bp Number of predicted genes - 354, with homology - 350 Number of transcription units - 158, operones - 86 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 91 - 150 3.0 1 1 Tu 1 . + CDS 173 - 346 56 ## - Term 66 - 112 3.8 2 2 Tu 1 . 
- CDS 313 - 1269 493 ## BT_0595 integrase - Prom 1453 - 1512 6.1 - Term 1450 - 1498 10.8 3 3 Op 1 . - CDS 1523 - 2701 1128 ## BT_4675 heparin lyase I precursor - Term 2727 - 2766 7.1 4 3 Op 2 . - CDS 2825 - 3376 519 ## BT_4674 hypothetical protein - Prom 3429 - 3488 4.0 + Prom 3346 - 3405 3.9 5 4 Op 1 . + CDS 3524 - 7048 2357 ## COG0642 Signal transduction histidine kinase + Prom 7073 - 7132 1.8 6 4 Op 2 . + CDS 7164 - 9773 1833 ## Cpin_6170 membrane or secreted protein + Term 9803 - 9863 8.1 + Prom 9917 - 9976 5.2 7 5 Op 1 . + CDS 9997 - 12972 2478 ## BT_4671 hypothetical protein 8 5 Op 2 . + CDS 12994 - 14571 1444 ## BT_4670 hypothetical protein 9 5 Op 3 . + CDS 14595 - 16337 1462 ## BT_4669 hypothetical protein 10 5 Op 4 . + CDS 16361 - 17449 1007 ## COG3867 Arabinogalactan endo-1,4-beta-galactosidase 11 5 Op 5 . + CDS 17521 - 19962 2111 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 19983 - 20042 15.0 + Prom 19983 - 20042 6.3 12 6 Op 1 . + CDS 20224 - 20673 393 ## gi|293374010|ref|ZP_06620349.1| hypothetical protein CUY_2814 13 6 Op 2 . + CDS 20723 - 23158 1911 ## MCP_0630 hypothetical protein 14 6 Op 3 . + CDS 23152 - 24078 388 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 15 7 Op 1 . - CDS 24110 - 25141 837 ## BF1208 putative endonuclease/exonuclease/phosphatase family protein 16 7 Op 2 17/0.000 - CDS 25173 - 25859 738 ## COG0569 K+ transport systems, NAD-binding component 17 7 Op 3 . - CDS 25864 - 27693 1357 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 27843 - 27902 5.0 + Prom 27681 - 27740 7.8 18 8 Tu 1 . + CDS 27826 - 29196 1674 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) - Term 29233 - 29282 1.1 19 9 Tu 1 . - CDS 29329 - 30582 578 ## BVU_0128 transcriptional regulator - Prom 30649 - 30708 6.8 + Prom 30422 - 30481 6.6 20 10 Op 1 . + CDS 30664 - 31425 532 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) 21 10 Op 2 . 
+ CDS 31437 - 32387 531 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 22 10 Op 3 5/0.032 + CDS 32400 - 34337 1120 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 23 10 Op 4 2/0.097 + CDS 34350 - 34757 357 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) + Term 34800 - 34843 -0.6 24 10 Op 5 . + CDS 34855 - 38916 2811 ## COG0642 Signal transduction histidine kinase + Term 38930 - 38974 3.7 + Prom 39160 - 39219 4.0 25 11 Tu 1 . + CDS 39429 - 41696 887 ## Phep_1707 hypothetical protein + Prom 41853 - 41912 8.2 26 12 Op 1 . + CDS 41937 - 44060 1714 ## BT_4662 heparinase III protein, heparitin sulfate lyase 27 12 Op 2 . + CDS 44084 - 46225 1526 ## BT_4661 hypothetical protein 28 12 Op 3 . + CDS 46244 - 48430 1893 ## BT_4660 hypothetical protein 29 12 Op 4 . + CDS 48507 - 49391 606 ## BT_4660 hypothetical protein 30 12 Op 5 . + CDS 49405 - 51096 1336 ## BT_4659 hypothetical protein + Term 51114 - 51182 14.6 + Prom 51120 - 51179 2.5 31 13 Op 1 . + CDS 51204 - 54134 1320 ## Phep_2838 lyase catalytic 32 13 Op 2 . + CDS 54136 - 56574 1169 ## Slin_5006 alpha-L-fucosidase (EC:3.2.1.51) + Prom 56638 - 56697 6.4 33 14 Op 1 . + CDS 56843 - 58039 1123 ## BT_4658 glucuronyl hydrolase 34 14 Op 2 . + CDS 58068 - 60068 1854 ## BT_4657 heparinase III protein + Term 60080 - 60122 2.1 + Prom 60072 - 60131 3.1 35 15 Op 1 . + CDS 60302 - 61738 1180 ## COG3119 Arylsulfatase A and related enzymes 36 15 Op 2 . + CDS 61806 - 63434 1425 ## BT_4655 hypothetical protein 37 15 Op 3 1/0.129 + CDS 63477 - 64448 476 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 38 15 Op 4 . + CDS 64445 - 65713 1256 ## COG0738 Fucose permease 39 15 Op 5 . + CDS 65754 - 68375 2148 ## BT_4652 hypothetical protein + Term 68394 - 68448 13.1 + Prom 68386 - 68445 5.6 40 16 Op 1 . + CDS 68576 - 69409 668 ## COG0648 Endonuclease IV + Term 69507 - 69552 -0.4 + Prom 69439 - 69498 5.4 41 16 Op 2 . 
+ CDS 69597 - 71549 1320 ## PRU_1399 hypothetical protein + Term 71568 - 71618 16.4 - Term 71550 - 71612 19.5 42 17 Op 1 . - CDS 71635 - 73206 868 ## COG0038 Chloride channel protein EriC - Prom 73234 - 73293 1.8 43 17 Op 2 . - CDS 73297 - 76053 1801 ## BT_3977 hypothetical protein 44 17 Op 3 . - CDS 76102 - 76662 598 ## BT_4649 hypothetical protein 45 17 Op 4 . - CDS 76696 - 77040 388 ## BT_4648 hypothetical protein 46 17 Op 5 . - CDS 77045 - 77584 524 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 77667 - 77726 5.0 - Term 77696 - 77744 5.1 47 18 Op 1 . - CDS 77784 - 77960 231 ## BT_4646 hypothetical protein 48 18 Op 2 . - CDS 78022 - 79566 1223 ## BT_4645 hypothetical protein 49 18 Op 3 . - CDS 79588 - 80442 559 ## BT_4644 putative anti-sigma factor 50 18 Op 4 . - CDS 80442 - 80993 315 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 51 18 Op 5 . - CDS 81044 - 82303 1103 ## BT_4642 hypothetical protein - Prom 82369 - 82428 7.5 + Prom 82493 - 82552 6.8 52 19 Tu 1 . + CDS 82585 - 83016 135 ## BT_4640 hypothetical protein + Term 83036 - 83078 -0.8 + Prom 83080 - 83139 2.0 53 20 Op 1 . + CDS 83166 - 83807 562 ## COG0586 Uncharacterized membrane-associated protein 54 20 Op 2 21/0.000 + CDS 83880 - 84527 599 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) 55 20 Op 3 . + CDS 84543 - 85559 809 ## COG0306 Phosphate/sulphate permeases 56 20 Op 4 . + CDS 85611 - 87866 1768 ## COG0475 Kef-type K+ transport systems, membrane components + Term 87993 - 88051 18.6 - Term 87980 - 88039 17.2 57 21 Tu 1 . - CDS 88042 - 89682 1197 ## Snas_3080 hypothetical protein - Prom 89798 - 89857 7.2 + Prom 90055 - 90114 2.1 58 22 Tu 1 . + CDS 90134 - 95008 1933 ## COG0514 Superfamily II DNA helicase 59 23 Op 1 . - CDS 95151 - 97232 1393 ## COG4930 Predicted ATP-dependent Lon-type protease 60 23 Op 2 . - CDS 97257 - 99797 1909 ## Slin_4848 PglZ domain protein 61 23 Op 3 . 
- CDS 99794 - 101392 1015 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 62 23 Op 4 . - CDS 101427 - 101663 287 ## gi|260171314|ref|ZP_05757726.1| hypothetical protein BacD2_05560 63 23 Op 5 . - CDS 101660 - 103999 1093 ## TherJR_1080 phosphoesterase PHP domain protein 64 23 Op 6 . - CDS 104007 - 107522 2105 ## COG1002 Type II restriction enzyme, methylase subunits 65 23 Op 7 . - CDS 107544 - 111152 3179 ## Slin_4844 hypothetical protein 66 23 Op 8 . - CDS 111156 - 111767 564 ## Slin_4843 hypothetical protein 67 23 Op 9 . - CDS 111764 - 112393 420 ## Slin_4842 hypothetical protein - Prom 112609 - 112668 3.5 - Term 112638 - 112675 8.2 68 24 Tu 1 . - CDS 112713 - 114626 2568 ## COG0443 Molecular chaperone - Prom 114747 - 114806 8.8 - Term 114881 - 114931 14.1 69 25 Tu 1 . - CDS 114955 - 115743 663 ## COG3187 Heat shock protein - Prom 115852 - 115911 6.8 + Prom 115770 - 115829 5.5 70 26 Op 1 . + CDS 115889 - 117172 843 ## BT_4613 hypothetical protein + Prom 117191 - 117250 4.6 71 26 Op 2 . + CDS 117270 - 118463 1213 ## COG1748 Saccharopine dehydrogenase and related proteins 72 26 Op 3 . + CDS 118493 - 118948 600 ## COG1225 Peroxiredoxin 73 26 Op 4 . + CDS 118975 - 120009 1362 ## COG0468 RecA/RadA recombinase + Term 120057 - 120103 1.6 - Term 120038 - 120097 5.7 74 27 Tu 1 . - CDS 120126 - 121118 977 ## COG2855 Predicted membrane protein - Prom 121175 - 121234 4.8 75 28 Op 1 . - CDS 121239 - 121613 280 ## COG3304 Predicted membrane protein 76 28 Op 2 . - CDS 121632 - 122048 381 ## COG5579 Uncharacterized conserved protein 77 28 Op 3 . - CDS 122122 - 123054 432 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 123297 - 123356 7.3 78 29 Tu 1 . + CDS 123446 - 124072 425 ## BT_4601 hypothetical protein - Term 124116 - 124181 3.2 79 30 Op 1 . - CDS 124301 - 125197 668 ## COG0583 Transcriptional regulator 80 30 Op 2 . 
- CDS 125268 - 125855 459 ## BT_4598 hypothetical protein - Prom 125879 - 125938 8.1 + Prom 126007 - 126066 7.6 81 31 Tu 1 . + CDS 126141 - 128729 2616 ## COG0542 ATPases with chaperone activity, ATP-binding subunit + Term 128750 - 128793 10.2 - Term 129434 - 129480 10.8 82 32 Op 1 . - CDS 129510 - 129938 582 ## BT_4595 hypothetical protein 83 32 Op 2 . - CDS 129959 - 130576 374 ## COG0237 Dephospho-CoA kinase 84 32 Op 3 . - CDS 130566 - 131579 595 ## BT_4593 hypothetical protein 85 32 Op 4 . - CDS 131600 - 131923 480 ## COG1862 Preprotein translocase subunit YajC 86 32 Op 5 . - CDS 131963 - 132889 1006 ## COG0781 Transcription termination factor - Prom 133028 - 133087 10.1 + Prom 132687 - 132746 7.1 87 33 Op 1 . + CDS 132970 - 133098 76 ## gi|153807761|ref|ZP_01960429.1| hypothetical protein BACCAC_02044 88 33 Op 2 . + CDS 133058 - 133459 471 ## BT_4590 hypothetical protein + Prom 133480 - 133539 3.2 89 34 Op 1 22/0.000 + CDS 133597 - 134187 969 ## PROTEIN SUPPORTED gi|160887365|ref|ZP_02068368.1| hypothetical protein BACOVA_05384 90 34 Op 2 . + CDS 134231 - 134890 503 ## COG0193 Peptidyl-tRNA hydrolase 91 35 Tu 1 . + CDS 134992 - 135408 630 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 92 36 Op 1 . - CDS 135543 - 135968 652 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 93 36 Op 2 . - CDS 136018 - 138291 1912 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 138365 - 138424 2.5 - Term 138373 - 138420 12.6 94 37 Op 1 . - CDS 138443 - 139528 1078 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 95 37 Op 2 . - CDS 139597 - 140820 1358 ## COG2195 Di- and tripeptidases - Prom 140842 - 140901 10.2 - Term 140879 - 140922 12.1 96 38 Op 1 . - CDS 140974 - 142383 1118 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 142420 - 142479 6.0 - Term 142446 - 142493 8.3 97 38 Op 2 . 
- CDS 142548 - 144707 1933 ## BT_4581 alpha-glucosidase - Prom 144854 - 144913 7.3 + Prom 144887 - 144946 4.3 98 39 Op 1 . + CDS 144976 - 145809 876 ## BT_4576 hypothetical protein 99 39 Op 2 . + CDS 145820 - 146596 408 ## BT_4575 hypothetical protein 100 39 Op 3 . + CDS 146615 - 146992 312 ## COG0239 Integral membrane protein possibly involved in chromosome condensation 101 39 Op 4 . + CDS 147023 - 147685 428 ## COG1357 Uncharacterized low-complexity proteins + Prom 147749 - 147808 4.7 102 40 Tu 1 . + CDS 147865 - 149145 1607 ## COG0148 Enolase + Term 149166 - 149221 14.0 - TRNA 149303 - 149379 73.6 # Thr TGT 0 0 + Prom 149531 - 149590 4.5 103 41 Op 1 . + CDS 149694 - 150245 407 ## BT_4565 hypothetical protein 104 41 Op 2 . + CDS 150278 - 151228 802 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 105 41 Op 3 . + CDS 151250 - 151834 320 ## BT_4563 hypothetical protein 106 41 Op 4 . + CDS 151764 - 152567 599 ## BT_4562 hypothetical protein 107 41 Op 5 . + CDS 152574 - 153155 525 ## COG1971 Predicted membrane protein 108 41 Op 6 . + CDS 153186 - 154199 1009 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis - Term 154207 - 154250 8.7 109 42 Op 1 . - CDS 154267 - 154638 292 ## COG3169 Uncharacterized protein conserved in bacteria 110 42 Op 2 . - CDS 154654 - 155400 650 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 155433 - 155492 1.5 + Prom 155461 - 155520 2.1 111 43 Op 1 . + CDS 155584 - 157005 1269 ## BT_4557 hypothetical protein 112 43 Op 2 . + CDS 157022 - 157492 514 ## BT_4556 hypothetical protein 113 43 Op 3 . + CDS 157515 - 158516 1116 ## COG4864 Uncharacterized protein conserved in bacteria + Prom 158590 - 158649 4.0 114 44 Tu 1 . + CDS 158673 - 159551 881 ## COG2820 Uridine phosphorylase + Term 159615 - 159668 -0.7 - Term 159603 - 159656 -0.7 115 45 Op 1 . - CDS 159666 - 160325 788 ## COG2910 Putative NADH-flavin reductase - Prom 160347 - 160406 4.5 116 45 Op 2 . 
- CDS 160408 - 162315 1435 ## COG0642 Signal transduction histidine kinase - Prom 162454 - 162513 4.4 + Prom 162283 - 162342 5.1 117 46 Op 1 1/0.129 + CDS 162451 - 163848 1055 ## COG0486 Predicted GTPase 118 46 Op 2 . + CDS 163901 - 164743 485 ## COG0566 rRNA methylases 119 47 Op 1 . - CDS 164754 - 165230 284 ## BVU_3777 arsenate reductase 120 47 Op 2 . - CDS 165246 - 166286 817 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 166480 - 166539 6.7 + Prom 166425 - 166484 7.6 121 48 Tu 1 . + CDS 166536 - 169313 1459 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Term 169332 - 169393 -0.2 + Prom 169333 - 169392 5.9 122 49 Op 1 . + CDS 169542 - 169772 208 ## gi|260171374|ref|ZP_05757786.1| hypothetical protein BacD2_05860 123 49 Op 2 . + CDS 169779 - 170855 373 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 + Prom 170948 - 171007 2.0 124 50 Tu 1 . + CDS 171119 - 172744 537 ## Cyan8802_4015 ABC transporter, ATP-binding protein-related protein + Prom 172778 - 172837 2.9 125 51 Tu 1 2/0.097 + CDS 172861 - 173100 135 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Prom 173157 - 173216 3.4 126 52 Op 1 . + CDS 173261 - 173560 198 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 127 52 Op 2 . + CDS 173637 - 174863 824 ## ZPR_4088 hypothetical protein + Prom 174996 - 175055 8.2 128 53 Op 1 4/0.032 + CDS 175086 - 177917 1611 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 129 53 Op 2 27/0.000 + CDS 177933 - 179477 1319 ## COG0286 Type I restriction-modification system methyltransferase subunit 130 53 Op 3 . + CDS 179530 - 180738 349 ## COG0732 Restriction endonuclease S subunits + Term 180882 - 180919 -1.0 131 54 Tu 1 . - CDS 180731 - 181096 130 ## Msm_0158 type I restriction-modification system methylase, subunit S - Prom 181260 - 181319 6.1 + Prom 181099 - 181158 3.2 132 55 Tu 1 . 
+ CDS 181284 - 181841 187 ## COG0732 Restriction endonuclease S subunits + Term 181989 - 182020 -0.5 133 56 Tu 1 . - CDS 181834 - 182982 275 ## COG0732 Restriction endonuclease S subunits - Prom 183039 - 183098 4.4 + Prom 183034 - 183093 4.3 134 57 Tu 1 . + CDS 183114 - 184040 648 ## COG0582 Integrase + Term 184047 - 184093 8.1 + Prom 184047 - 184106 4.7 135 58 Tu 1 . + CDS 184164 - 185015 620 ## A2cp1_3953 hypothetical protein 136 59 Tu 1 . + CDS 185637 - 186071 437 ## BT_4511 hypothetical protein + Term 186153 - 186190 6.2 - Term 186022 - 186075 -0.3 137 60 Tu 1 . - CDS 186172 - 188262 1254 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 188347 - 188406 6.6 138 61 Op 1 . - CDS 188868 - 189755 670 ## COG1708 Predicted nucleotidyltransferases 139 61 Op 2 . - CDS 189682 - 189984 203 ## BVU_2278 hypothetical protein - Prom 190093 - 190152 6.4 + Prom 190096 - 190155 3.4 140 62 Op 1 . + CDS 190223 - 191104 749 ## COG2367 Beta-lactamase class A 141 62 Op 2 . + CDS 191176 - 192588 860 ## COG0346 Lactoylglutathione lyase and related lyases 142 62 Op 3 . + CDS 192610 - 193137 345 ## BT_4505 hypothetical protein 143 63 Tu 1 . - CDS 193166 - 193798 272 ## COG2095 Multiple antibiotic transporter - Prom 193832 - 193891 5.0 + Prom 193920 - 193979 6.0 144 64 Tu 1 . + CDS 193999 - 194979 791 ## COG3049 Penicillin V acylase and related amidases + Term 195011 - 195059 -0.8 - Term 194923 - 194956 -0.4 145 65 Tu 1 . - CDS 195095 - 195559 575 ## COG2030 Acyl dehydratase - Prom 195609 - 195668 4.6 146 66 Tu 1 . - CDS 195680 - 196198 457 ## BT_4502 hypothetical protein - Prom 196391 - 196450 4.8 + Prom 196016 - 196075 3.9 147 67 Tu 1 . + CDS 196283 - 197968 241 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 148 68 Tu 1 . - CDS 197965 - 199326 1065 ## COG0534 Na+-driven multidrug efflux pump - Prom 199346 - 199405 2.3 149 69 Op 1 . - CDS 199450 - 199656 206 ## BF1617 hypothetical protein 150 69 Op 2 . 
- CDS 199735 - 201240 971 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 201260 - 201319 3.1 + Prom 201324 - 201383 5.2 151 70 Op 1 . + CDS 201491 - 201790 184 ## BF3088 hypothetical protein 152 70 Op 2 . + CDS 201810 - 202541 198 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 + Prom 202581 - 202640 4.2 153 71 Tu 1 . + CDS 202660 - 202929 321 ## BT_4495 hypothetical protein + Term 202946 - 202994 10.0 - Term 202933 - 202981 13.8 154 72 Op 1 . - CDS 202986 - 203270 301 ## COG3309 Uncharacterized virulence-associated protein D 155 72 Op 2 . - CDS 203258 - 203458 289 ## gi|160887278|ref|ZP_02068281.1| hypothetical protein BACOVA_05296 156 72 Op 3 . - CDS 203535 - 203801 353 ## COG2388 Predicted acetyltransferase 157 72 Op 4 . - CDS 203884 - 205197 1146 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase - Prom 205217 - 205276 5.2 + Prom 205191 - 205250 4.8 158 73 Op 1 . + CDS 205341 - 206309 911 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 159 73 Op 2 . + CDS 206379 - 207218 569 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component - Term 207086 - 207128 -0.8 160 74 Tu 1 . - CDS 207224 - 207592 139 ## BT_4485 hypothetical protein - Prom 207695 - 207754 7.6 + Prom 207659 - 207718 6.4 161 75 Tu 1 . + CDS 207767 - 208435 368 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 + Term 208503 - 208537 3.6 162 76 Op 1 1/0.129 - CDS 208468 - 209241 512 ## COG1573 Uracil-DNA glycosylase 163 76 Op 2 . - CDS 209263 - 210534 968 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Prom 210735 - 210794 9.3 - Term 210718 - 210761 7.8 164 77 Tu 1 . - CDS 210796 - 212352 1282 ## BT_1828 hypothetical protein - Prom 212413 - 212472 4.0 - Term 212447 - 212490 11.7 165 78 Tu 1 . 
- CDS 212523 - 212699 353 ## gi|167754220|ref|ZP_02426347.1| hypothetical protein ALIPUT_02513 - Prom 212753 - 212812 4.8 + Prom 212749 - 212808 6.9 166 79 Op 1 . + CDS 212847 - 213494 579 ## BT_4477 putative ATP-dependent DNA helicase 167 79 Op 2 . + CDS 213528 - 214916 799 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Prom 214949 - 215008 5.0 168 80 Tu 1 . + CDS 215029 - 215661 639 ## BT_4475 hypothetical protein + Term 215699 - 215751 13.3 + Prom 215692 - 215751 3.9 169 81 Tu 1 . + CDS 215903 - 218038 1489 ## COG1509 Lysine 2,3-aminomutase + Term 218227 - 218265 0.1 - Term 217934 - 217993 8.1 170 82 Tu 1 . - CDS 218138 - 219451 352 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 171 83 Op 1 . - CDS 219567 - 220274 630 ## BT_4472 hypothetical protein 172 83 Op 2 . - CDS 220348 - 221943 1235 ## BT_4471 hypothetical protein 173 83 Op 3 . - CDS 221959 - 225087 2699 ## BT_4470 outer membrane protein - Prom 225110 - 225169 7.8 + Prom 225049 - 225108 8.1 174 84 Op 1 . + CDS 225325 - 225774 255 ## COG3663 G:T/U mismatch-specific DNA glycosylase 175 84 Op 2 . + CDS 225823 - 226176 363 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Term 226317 - 226369 7.5 + TRNA 226242 - 226314 80.5 # Trp CCA 0 0 - Term 226406 - 226466 5.1 176 85 Tu 1 . - CDS 226574 - 228424 2195 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain - Prom 228597 - 228656 7.6 + Prom 228390 - 228449 7.7 177 86 Op 1 4/0.032 + CDS 228614 - 229270 500 ## COG0558 Phosphatidylglycerophosphate synthase 178 86 Op 2 5/0.032 + CDS 229267 - 230223 657 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 179 86 Op 3 . + CDS 230248 - 230904 457 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 180 86 Op 4 . + CDS 230905 - 231477 321 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 181 86 Op 5 . 
+ CDS 231474 - 232616 1207 ## BT_4460 hypothetical protein + Term 232722 - 232770 -0.3 - Term 232548 - 232594 1.2 182 87 Op 1 . - CDS 232627 - 233796 990 ## COG2311 Predicted membrane protein - Prom 233819 - 233878 2.5 183 87 Op 2 . - CDS 233880 - 234752 760 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 234827 - 234886 2.1 - Term 234835 - 234875 5.8 184 88 Op 1 . - CDS 234888 - 235469 555 ## BT_4457 hypothetical protein 185 88 Op 2 17/0.000 - CDS 235466 - 236851 1137 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 186 88 Op 3 . - CDS 236848 - 237588 594 ## COG0247 Fe-S oxidoreductase 187 88 Op 4 22/0.000 - CDS 237641 - 238192 228 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 188 88 Op 5 . - CDS 238207 - 238563 403 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 238622 - 238681 4.7 + Prom 238612 - 238671 4.2 189 89 Tu 1 . + CDS 238693 - 239052 198 ## COG2832 Uncharacterized protein conserved in bacteria 190 90 Tu 1 . - CDS 239022 - 239588 568 ## BT_4451 putative MTA/SAH nucleosidase - Prom 239752 - 239811 6.6 - Term 239646 - 239680 6.2 191 91 Tu 1 . - CDS 239814 - 240140 369 ## Aave_4587 hypothetical protein + Prom 240556 - 240615 5.0 192 92 Op 1 . + CDS 240670 - 241398 549 ## gi|260171443|ref|ZP_05757855.1| hypothetical protein BacD2_06215 193 92 Op 2 . + CDS 241429 - 242217 383 ## gi|260171444|ref|ZP_05757856.1| hypothetical protein BacD2_06220 194 93 Op 1 . - CDS 242267 - 242512 90 ## gi|260171445|ref|ZP_05757857.1| hypothetical protein BacD2_06225 - Prom 242555 - 242614 5.0 195 93 Op 2 . - CDS 242626 - 243087 377 ## gi|260171446|ref|ZP_05757858.1| hypothetical protein BacD2_06230 - Prom 243204 - 243263 11.8 + Prom 243406 - 243465 4.9 196 94 Tu 1 . + CDS 243531 - 244007 365 ## gi|260171447|ref|ZP_05757859.1| hypothetical protein BacD2_06235 + Term 244135 - 244168 -0.5 + Prom 244024 - 244083 6.6 197 95 Tu 1 . 
+ CDS 244183 - 244818 311 ## gi|260171448|ref|ZP_05757860.1| hypothetical protein BacD2_06240 198 96 Tu 1 . - CDS 244891 - 245409 255 ## gi|260171449|ref|ZP_05757861.1| hypothetical protein BacD2_06245 - Prom 245563 - 245622 6.4 + Prom 245833 - 245892 7.1 199 97 Op 1 . + CDS 245969 - 246382 432 ## BT_4432 putative non-specific DNA-binding protein HU-1 200 97 Op 2 . + CDS 246389 - 248188 1077 ## BT_4431 hypothetical protein + Term 248222 - 248263 0.7 - Term 248188 - 248225 1.1 201 98 Op 1 . - CDS 248333 - 250810 2029 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 202 98 Op 2 . - CDS 250855 - 251985 677 ## BT_4429 putative pteridine-dependent dioxygenase 203 98 Op 3 . - CDS 252025 - 252909 645 ## BT_4428 hypothetical protein 204 98 Op 4 . - CDS 252926 - 254728 1359 ## BT_4427 surface layer protein 205 98 Op 5 . - CDS 254725 - 257220 1961 ## BT_4426 hypothetical protein 206 98 Op 6 . - CDS 257286 - 258002 941 ## COG0274 Deoxyribose-phosphate aldolase 207 98 Op 7 . - CDS 258045 - 259079 1277 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 208 98 Op 8 . - CDS 259094 - 260620 1611 ## COG1070 Sugar (pentulose and hexulose) kinases 209 98 Op 9 . - CDS 260650 - 261129 621 ## BT_4422 hypothetical protein 210 98 Op 10 . - CDS 261143 - 261514 284 ## BT_4421 hypothetical protein 211 98 Op 11 . - CDS 261566 - 262654 884 ## BT_4420 hypothetical protein - Prom 262675 - 262734 5.1 + Prom 262943 - 263002 8.9 212 99 Op 1 . + CDS 263036 - 263329 229 ## BT_4419 hypothetical protein 213 99 Op 2 . + CDS 263341 - 263631 354 ## BT_4418 hypothetical protein + Term 263672 - 263736 7.1 + Prom 263675 - 263734 5.6 214 100 Op 1 . + CDS 263755 - 265290 1914 ## COG1418 Predicted HD superfamily hydrolase + Term 265320 - 265349 2.1 215 100 Op 2 . + CDS 265355 - 266110 613 ## COG3142 Uncharacterized protein involved in copper resistance + Term 266137 - 266192 -0.0 - Term 266125 - 266180 5.1 216 101 Op 1 . 
- CDS 266230 - 267444 1374 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 217 101 Op 2 . - CDS 267459 - 267890 335 ## BT_4414 hypothetical protein - Prom 267950 - 268009 5.5 218 102 Op 1 6/0.032 - CDS 268137 - 268979 622 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 219 102 Op 2 3/0.032 - CDS 268972 - 269985 383 ## COG1145 Ferredoxin 220 102 Op 3 2/0.097 - CDS 269995 - 270795 503 ## COG1145 Ferredoxin 221 102 Op 4 2/0.097 - CDS 270788 - 271261 609 ## COG1908 Coenzyme F420-reducing hydrogenase, delta subunit 222 102 Op 5 2/0.097 - CDS 271276 - 273276 1904 ## COG1148 Heterodisulfide reductase, subunit A and related polyferredoxins 223 102 Op 6 7/0.032 - CDS 273264 - 274130 769 ## COG2048 Heterodisulfide reductase, subunit B 224 102 Op 7 . - CDS 274133 - 274687 296 ## COG1150 Heterodisulfide reductase, subunit C - Prom 274709 - 274768 2.9 - Term 274720 - 274772 -0.5 225 103 Op 1 . - CDS 274790 - 275227 190 ## BT_4413 hypothetical protein 226 103 Op 2 . - CDS 275310 - 278066 2008 ## BT_4412 hypothetical protein - Prom 278232 - 278291 8.3 + Prom 278229 - 278288 7.2 227 104 Op 1 . + CDS 278391 - 278528 141 ## + Term 278554 - 278594 6.2 + Prom 278540 - 278599 5.1 228 104 Op 2 . + CDS 278622 - 279800 1113 ## BVU_3875 aminopeptidase C + Term 279865 - 279918 8.8 - Term 279848 - 279910 20.4 229 105 Op 1 . - CDS 279942 - 280643 888 ## BT_4411 hypothetical protein 230 105 Op 2 . - CDS 280704 - 282638 1741 ## BT_4410 hypothetical protein - Prom 282664 - 282723 7.0 231 106 Tu 1 . + CDS 282621 - 282782 87 ## + Term 282875 - 282919 10.8 - Term 282885 - 282929 0.8 232 107 Tu 1 . - CDS 282965 - 284026 829 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 284048 - 284107 2.6 + Prom 284131 - 284190 6.6 233 108 Tu 1 . + CDS 284272 - 285699 648 ## BT_4408 hypothetical protein + Term 285705 - 285752 9.5 + Prom 286249 - 286308 6.9 234 109 Tu 1 . 
+ CDS 286348 - 287529 876 ## COG2706 3-carboxymuconate cyclase - Term 287481 - 287526 3.0 235 110 Op 1 . - CDS 287684 - 290188 1566 ## Cpin_6170 membrane or secreted protein 236 110 Op 2 . - CDS 290201 - 292543 1984 ## COG1874 Beta-galactosidase 237 110 Op 3 . - CDS 292590 - 294101 1422 ## Rmar_1481 RagB/SusD domain protein 238 110 Op 4 . - CDS 294114 - 297206 2745 ## BF0536 putative outer membrane protein 239 110 Op 5 . - CDS 297219 - 298568 1205 ## BVU_0840 hypothetical protein 240 110 Op 6 . - CDS 298581 - 300170 1580 ## COG3867 Arabinogalactan endo-1,4-beta-galactosidase - Prom 300208 - 300267 3.1 - Term 300222 - 300265 9.4 241 111 Op 1 . - CDS 300273 - 303554 3012 ## BT_4398 TPR domain-containing protein - Prom 303649 - 303708 3.6 242 111 Op 2 . - CDS 303711 - 305093 1040 ## COG0477 Permeases of the major facilitator superfamily - Prom 305136 - 305195 5.8 + Prom 305031 - 305090 5.8 243 112 Tu 1 . + CDS 305265 - 306158 532 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 306055 - 306085 1.3 244 113 Op 1 . - CDS 306086 - 306874 539 ## COG0739 Membrane proteins related to metalloendopeptidases 245 113 Op 2 . - CDS 306846 - 307511 715 ## COG3382 Uncharacterized conserved protein - Term 307517 - 307561 -0.5 246 113 Op 3 . - CDS 307562 - 308086 386 ## COG1496 Uncharacterized conserved protein - Prom 308310 - 308369 3.4 247 114 Tu 1 . - CDS 308371 - 309537 1105 ## COG0536 Predicted GTPase - Prom 309577 - 309636 4.2 - Term 309548 - 309603 1.7 248 115 Op 1 . - CDS 309653 - 310222 751 ## COG0563 Adenylate kinase and related kinases 249 115 Op 2 . - CDS 310290 - 310826 770 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 250 115 Op 3 . - CDS 310868 - 312412 1324 ## COG3104 Dipeptide/tripeptide permease - Prom 312451 - 312510 5.2 251 116 Tu 1 . - CDS 312618 - 312839 222 ## BT_4384 hypothetical protein + Prom 313120 - 313179 7.2 252 117 Op 1 . + CDS 313241 - 313429 126 ## 253 117 Op 2 . 
+ CDS 313371 - 314888 608 ## PROTEIN SUPPORTED gi|225093729|ref|YP_002662469.1| ribosomal protein S15 254 117 Op 3 . + CDS 314961 - 316028 1159 ## BT_4382 hypothetical protein 255 117 Op 4 . + CDS 316052 - 316336 285 ## BT_4381 hypothetical protein - Term 316504 - 316528 -1.0 256 118 Op 1 . - CDS 316598 - 316942 408 ## BT_4380 hypothetical protein - Prom 316962 - 317021 6.2 257 118 Op 2 . - CDS 317039 - 318427 1448 ## BT_4379 putative oxalate:formate antiporter - Prom 318489 - 318548 3.1 - Term 318586 - 318634 2.0 258 119 Op 1 . - CDS 318674 - 319066 415 ## BT_4378 preprotein translocase subunit SecG 259 119 Op 2 . - CDS 319073 - 319852 510 ## BT_4377 hypothetical protein 260 119 Op 3 . - CDS 319858 - 320385 495 ## BT_4376 hypothetical protein 261 119 Op 4 . - CDS 320372 - 321613 1101 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 262 119 Op 5 . - CDS 321622 - 322716 860 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 263 119 Op 6 . - CDS 322721 - 323764 921 ## BT_4373 hypothetical protein 264 119 Op 7 . - CDS 323783 - 324817 844 ## COG0820 Predicted Fe-S-cluster redox enzyme - Prom 324851 - 324910 3.4 265 120 Tu 1 . - CDS 324921 - 328271 1897 ## CPS_1799 hypothetical protein - Prom 328334 - 328393 7.3 - Term 328379 - 328451 19.3 266 121 Op 1 . - CDS 328481 - 330622 2556 ## BT_4371 peptidyl-prolyl cis-trans isomerase - Prom 330674 - 330733 3.6 - Term 330675 - 330713 -0.7 267 121 Op 2 . - CDS 330742 - 331968 1001 ## COG1253 Hemolysins and related proteins containing CBS domains 268 121 Op 3 . - CDS 332002 - 332616 439 ## BT_4369 hypothetical protein 269 121 Op 4 . - CDS 332660 - 333985 1577 ## BT_4368 hypothetical protein 270 121 Op 5 . - CDS 334044 - 335357 1145 ## BT_4367 putative outer membrane protein 271 121 Op 6 . - CDS 335344 - 336213 545 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor - Prom 336365 - 336424 6.3 272 122 Op 1 . 
+ CDS 336519 - 337706 667 ## BT_4364 hypothetical protein 273 122 Op 2 . + CDS 337706 - 339280 1278 ## BT_4363 putative alkaline phosphatase + Prom 339311 - 339370 4.2 274 122 Op 3 . + CDS 339477 - 342794 3942 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Term 342879 - 342927 15.1 + Prom 342803 - 342862 5.0 275 123 Op 1 . + CDS 342969 - 344075 1057 ## BT_4361 hypothetical protein 276 123 Op 2 . + CDS 344137 - 344583 494 ## BT_4360 hypothetical protein + Term 344654 - 344712 12.6 - Term 344643 - 344698 8.1 277 124 Op 1 . - CDS 344721 - 346733 1397 ## Phep_1386 hypothetical protein 278 124 Op 2 . - CDS 346733 - 348001 1118 ## Dfer_0342 hypothetical protein 279 124 Op 3 . - CDS 348024 - 349775 1391 ## BDI_2518 hypothetical protein 280 124 Op 4 . - CDS 349807 - 353022 2302 ## BDI_2519 hypothetical protein - Prom 353073 - 353132 5.2 281 125 Tu 1 . - CDS 353135 - 354169 564 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 354210 - 354269 4.0 282 126 Tu 1 . - CDS 354296 - 354859 463 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 354888 - 354947 7.0 - Term 354947 - 354990 1.5 283 127 Op 1 . - CDS 355029 - 357164 1625 ## BT_4359 alpha-N-acetylglucosaminidase precursor 284 127 Op 2 . - CDS 357232 - 359565 2210 ## COG1874 Beta-galactosidase - Term 359577 - 359629 5.4 285 128 Op 1 . - CDS 359648 - 361189 1548 ## BT_3020 hypothetical protein 286 128 Op 2 . - CDS 361242 - 362552 1114 ## gi|160887164|ref|ZP_02068167.1| hypothetical protein BACOVA_05180 287 128 Op 3 . - CDS 362568 - 364088 1582 ## BF0587 hypothetical protein 288 128 Op 4 . - CDS 364095 - 367406 3087 ## BF0536 putative outer membrane protein - Prom 367458 - 367517 5.3 - Term 367429 - 367473 0.5 289 129 Tu 1 . - CDS 367653 - 368675 849 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 368787 - 368846 5.4 + Prom 368722 - 368781 7.2 290 130 Tu 1 . 
+ CDS 368810 - 369385 447 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 369557 - 369612 2.4 - Term 369533 - 369589 5.2 291 131 Tu 1 . - CDS 369701 - 371599 1181 ## COG0642 Signal transduction histidine kinase - Prom 371624 - 371683 3.7 - Term 371648 - 371687 5.1 292 132 Op 1 . - CDS 371717 - 374353 2724 ## COG0525 Valyl-tRNA synthetase 293 132 Op 2 . - CDS 374420 - 375058 650 ## BT_4352 hypothetical protein - Prom 375085 - 375144 6.1 + Prom 375056 - 375115 7.9 294 133 Op 1 . + CDS 375270 - 376253 792 ## BT_4351 hypothetical protein 295 133 Op 2 . + CDS 376266 - 377054 1075 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain + Term 377078 - 377118 3.5 - Term 377108 - 377152 12.4 296 134 Op 1 . - CDS 377160 - 377627 394 ## BT_4349 hypothetical protein 297 134 Op 2 . - CDS 377655 - 378035 383 ## BT_4348 hypothetical protein 298 134 Op 3 . - CDS 378083 - 378634 489 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 299 134 Op 4 . - CDS 378659 - 380215 1051 ## COG2989 Uncharacterized protein conserved in bacteria 300 134 Op 5 . - CDS 380190 - 381128 518 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 381186 - 381245 5.3 - Term 381215 - 381263 11.0 301 135 Tu 1 . - CDS 381282 - 383084 3066 ## PROTEIN SUPPORTED gi|160887146|ref|ZP_02068149.1| hypothetical protein BACOVA_05162 - Prom 383104 - 383163 6.8 + Prom 383080 - 383139 10.8 302 136 Tu 1 . + CDS 383253 - 389138 3768 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 303 137 Tu 1 . - CDS 389140 - 390270 728 ## COG3344 Retron-type reverse transcriptase - Prom 390312 - 390371 6.6 - Term 390317 - 390369 4.1 304 138 Tu 1 . - CDS 390476 - 392587 1579 ## BT_4341 hypothetical protein - Prom 392688 - 392747 8.8 305 139 Tu 1 . + CDS 393806 - 394285 240 ## BT_4340 hypothetical protein + Prom 394289 - 394348 5.5 306 140 Tu 1 . 
+ CDS 394418 - 396607 2249 ## COG3968 Uncharacterized protein related to glutamine synthetase + Term 396619 - 396662 8.6 + Prom 396641 - 396700 8.4 307 141 Op 1 . + CDS 396724 - 397425 584 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 308 141 Op 2 . + CDS 397460 - 397891 225 ## Dfer_2829 RNA polymerase, sigma-24 subunit, ECF subfamily + Prom 398002 - 398061 3.5 309 142 Op 1 . + CDS 398083 - 399243 864 ## COG3712 Fe2+-dicitrate sensor, membrane component 310 142 Op 2 . + CDS 399269 - 402838 1722 ## BT_3279 hypothetical protein 311 142 Op 3 . + CDS 402856 - 404397 937 ## Cpin_3049 RagB/SusD domain protein 312 142 Op 4 . + CDS 404433 - 405437 643 ## gi|260171562|ref|ZP_05757974.1| hypothetical protein BacD2_06810 313 142 Op 5 . + CDS 405464 - 406474 733 ## COG0526 Thiol-disulfide isomerase and thioredoxins 314 142 Op 6 . + CDS 406531 - 408210 1486 ## COG1404 Subtilisin-like serine proteases 315 142 Op 7 . + CDS 408220 - 409362 907 ## HCH_00467 PDZ domain-containing protein 316 142 Op 8 . + CDS 409364 - 410524 941 ## gi|160887127|ref|ZP_02068130.1| hypothetical protein BACOVA_05143 317 142 Op 9 . + CDS 410545 - 411834 1132 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 411885 - 411914 0.4 + Prom 411883 - 411942 7.2 318 143 Tu 1 . + CDS 411993 - 414422 2105 ## COG3525 N-acetyl-beta-hexosaminidase + Term 414496 - 414556 12.3 - Term 414542 - 414588 -0.9 319 144 Op 1 . - CDS 414592 - 415542 707 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 320 144 Op 2 . - CDS 415579 - 416223 597 ## BT_4335 hypothetical protein 321 144 Op 3 . - CDS 416243 - 418735 2388 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 418781 - 418840 7.8 + Prom 418728 - 418787 9.4 322 145 Op 1 . + CDS 418861 - 419526 631 ## BT_4333 hypothetical protein 323 145 Op 2 . 
+ CDS 419529 - 420188 523 ## BT_4332 hypothetical protein 324 145 Op 3 1/0.129 + CDS 420252 - 421430 575 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative + Term 421647 - 421679 2.4 + Prom 421627 - 421686 7.5 325 146 Op 1 . + CDS 421839 - 423089 1182 ## COG0477 Permeases of the major facilitator superfamily 326 146 Op 2 . + CDS 423103 - 423696 608 ## COG1259 Uncharacterized conserved protein 327 146 Op 3 . + CDS 423705 - 424403 571 ## COG1385 Uncharacterized protein conserved in bacteria 328 146 Op 4 . + CDS 424455 - 426002 1639 ## BT_4327 hypothetical protein 329 146 Op 5 . + CDS 426023 - 426667 200 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) + Prom 426677 - 426736 4.4 330 147 Op 1 . + CDS 426762 - 427967 979 ## BT_4325 hypothetical protein 331 147 Op 2 . + CDS 427991 - 430642 1531 ## BT_4324 hypothetical protein + Prom 431194 - 431253 6.4 332 148 Op 1 . + CDS 431431 - 432351 853 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 333 148 Op 2 . + CDS 432410 - 433336 829 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 334 148 Op 3 . + CDS 433412 - 434212 792 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase + Prom 434214 - 434273 1.6 335 149 Op 1 . + CDS 434294 - 437131 2914 ## COG0612 Predicted Zn-dependent peptidases 336 149 Op 2 . + CDS 437155 - 438648 982 ## BT_4319 hypothetical protein 337 149 Op 3 . + CDS 438687 - 439394 209 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 338 149 Op 4 . + CDS 439465 - 440043 232 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 339 149 Op 5 . + CDS 440070 - 440954 688 ## COG3735 Uncharacterized protein conserved in bacteria 340 149 Op 6 . 
+ CDS 440993 - 441631 558 ## COG0546 Predicted phosphatases + Prom 441681 - 441740 6.8 341 150 Op 1 32/0.000 + CDS 441826 - 442143 534 ## PROTEIN SUPPORTED gi|237715590|ref|ZP_04546071.1| 50S ribosomal protein L21 342 150 Op 2 . + CDS 442165 - 442428 449 ## PROTEIN SUPPORTED gi|160887096|ref|ZP_02068099.1| hypothetical protein BACOVA_05112 + Term 442451 - 442500 14.3 + Prom 442481 - 442540 5.0 343 151 Tu 1 . + CDS 442659 - 443933 1440 ## COG0172 Seryl-tRNA synthetase + Term 443972 - 444012 4.9 + Prom 443978 - 444037 4.3 344 152 Tu 1 . + CDS 444092 - 446392 2384 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 446476 - 446526 11.5 - Term 446457 - 446519 19.1 345 153 Op 1 12/0.000 - CDS 446538 - 446891 489 ## COG0853 Aspartate 1-decarboxylase 346 153 Op 2 . - CDS 446910 - 447755 723 ## COG0414 Panthothenate synthetase - Prom 447869 - 447928 9.0 347 154 Op 1 . + CDS 447898 - 448731 655 ## COG0297 Glycogen synthase 348 154 Op 2 . + CDS 448753 - 450447 1254 ## BT_4306 hypothetical protein + Term 450464 - 450528 14.1 - Term 450667 - 450713 4.4 349 155 Op 1 2/0.097 - CDS 450746 - 452200 1323 ## COG1449 Alpha-amylase/alpha-mannosidase 350 155 Op 2 4/0.032 - CDS 452218 - 453483 1068 ## COG0438 Glycosyltransferase 351 155 Op 3 . - CDS 453500 - 455437 1499 ## COG3408 Glycogen debranching enzyme - Prom 455550 - 455609 10.3 + Prom 455543 - 455602 8.6 352 156 Tu 1 . + CDS 455652 - 456308 412 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Term 456363 - 456412 4.5 353 157 Tu 1 . - CDS 456437 - 457036 596 ## COG2095 Multiple antibiotic transporter - Prom 457062 - 457121 9.3 + Prom 456900 - 456959 7.8 354 158 Tu 1 . 
+ CDS 457174 - 457842 586 ## BT_4300 Crp family transcriptional regulator Predicted protein(s) >gi|225935364|gb|ACGA01000028.1| GENE 1 173 - 346 56 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFQIKRKKVEEYLLTIRVFVYLCVKFKFCYLRSNTHILLLIINILYIYHLSIQQFLT >gi|225935364|gb|ACGA01000028.1| GENE 2 313 - 1269 493 318 aa, chain - ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 317 1 317 318 506 80.0 1e-142 MQMMNKNGFSRCAEFYIGRLRKEGRHSTAHVYKNAIFSFSKFCGTSNVSFRQITRERLRR YGQYLYECGLKPNTISTYMRMLRCIYNRGVEAGSAPYVPRLFHDVYTGVDVRQKKALPIG ELRRLLYEDPKSERLRRTQAIAALMFQFCGMSFADLAHLEKSALDQNVLRYNRIKTKTPM SVEILDTAKEMINRLRSNQDSHPDSPDYLFDILSSDKKRTDERAYREYQSALRRFNNRLK DLARALRLKSPVTSYTLRHSWATTAKYRGVSIEMISESLGHKSIKTTQIYLKGFGLKERT EVNKGNLSYVRNCCIDRW >gi|225935364|gb|ACGA01000028.1| GENE 3 1523 - 2701 1128 392 aa, chain - ## HITS:1 COG:no KEGG:BT_4675 NR:ns ## KEGG: BT_4675 # Name: not_defined # Def: heparin lyase I precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 392 2 376 376 699 88.0 0 MKKNIFAICVVAVGCTALTAQTKDTQTLVPLTERVNVQADSARVNQIIDDCWVAVGTKKP HAIQRDFTRMFNGKPSYRFELKEDDNTLSGYAKGETKGRAEFFYCYATSDDFKGLPADVY QKAQITKTVYHHGKGICPQGASRDYEFSVYIPSSLGSDVSTIFAQWHGMPDRTLVQTPQG EVKKLTIDEFVELEKTTIFKKNEGYEKVAKLDEQGNPVKDKQGNPVYQAGKANGWLVEQG GYPPLAFGFSGGWFYIKANSDRKWLTDKDDRCNANPEKTPIMKPVTSDYKASTIAYKMPY ADFPKDCWITFRIHIDWTVYGKEAETIVKPGMLDVQMDYQEKGKKVSKHIVDNEKIMIGR NDDDGYYFKFGIYRVGNSTKPVCYNLTNYSER >gi|225935364|gb|ACGA01000028.1| GENE 4 2825 - 3376 519 183 aa, chain - ## HITS:1 COG:no KEGG:BT_4674 NR:ns ## KEGG: BT_4674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 182 19 182 183 271 84.0 6e-72 MRKLIYWLLLPLAVAVSACGGKKGSSDNTSTLAMIDSVDAHGLQRMQTSKSETDFKFKGK DYHSLVSRTPDENLPHVTNEMGDTYVDNKIVLHLTRGNETVLNKTFTKNDFSSVVDANFL SKSILEGIVYDKTTPQGIVYAASVCYPQTDLYMPLSITITADGKMSIQKVDMLEEDYDDE APN >gi|225935364|gb|ACGA01000028.1| GENE 5 3524 - 7048 2357 1174 
aa, chain + ## HITS:1 COG:MA3405_3 KEGG:ns NR:ns ## COG: MA3405_3 COG0642 # Protein_GI_number: 20092217 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 814 1036 16 242 256 124 37.0 1e-27 MRERITTILFLILAYCSSGNVYGQKVNYQQFDNIYLGAEASVISCFLQDSEGLIWVGSNK GLFSYDGYSTQQHFTYGERTNTRIYCGIIADNTYLYLGTDNGILVYNYRTDKYEQPDTDF PTDVRTMVLQGDTLWIGSLNGLYTYQLKSRQLNHFDSKQTGLPHNTIYSVIQTKDNQIYI GTYNGLCRYVQASGKFENILLPVNRGGSNLFVNSLLEDTTRQCIWIGMEGYLFQYNPTTG EMKAIEVFHNNSIKSLALDGDENLLAGTDNGLYVYQNDKEPLQHIIHDSRNIQSLTNNII WNIFSDQEHNIWLGTDYGISLSRYNSALQFVPISQITGTGDGNQFYSLFRDSKGFYWFGG TNGVIRFTHPAGNKHDAIWYRMGDKKHPLSHNRIRHIYEDREQQLWIATDGSINRYDYNT HQFVHYNIVDSTGTYNTNWAYYLFEDTDGKLWIATCLGGIFVVDKHKLIQSAGKSYVADQ NYSIHNGLSGMFINQIVPDAEGNVWVLLYNSKGIDKINARTRQVTKLFADELSGEKSPNY LLRDEDGMLWVGFHGGVMRINPKDSSQQSISFGNFSNNEILSMTSVKEHIWVSTTNGLWI IDRRTMDARQQSLTDKRFTSLMFDPEDGNIYLGGADGFGISRPEIQAMSQPEHSILLTAL YINNQLMSPHTGENIPNIRYTDAIELKYDQNNLAFELSDLPYSLEEKHEFVYRLEGMDKE WNFLQSNTNRITYSNLGYGDYQLVISKVEKSGSPSEHPYILNIKILPPWYYTLWAKIVYI LLFLSLIAWVINFFRVKNRLKMERMEKEKILEQSRQKMAFFTNLSNELKTPLSRIIAPVS QLLPVAEEVHEKQTLEEVQRNAMKINSLIHQVLNFNRIEDNKDSLLILSRIELVSFSRSL FSVYEENKQITFHFETNKAKIYADMDAIKLGVILDNLLSNAVKFTPDGGSVRLSLFYREE TGLLDICVSDTGSGIPQQDIPYIFQRFFQSPHSGNKEGTGIGLYLVKTYTELHGGHINGV TSNEGKGTSIGLNIPVIAVEKEEIPATQVKKQLESLPVLKPIEAESQDEKFLSNIIRLIE DHLSESELNVNALCELSGISNKQIYRKIKQLTGMSPVEYIKSIRMKKAAMLLQQKKFTVA EVMYMVGFSNHSYFSKCFQAEFGKTPRQYLNEDL >gi|225935364|gb|ACGA01000028.1| GENE 6 7164 - 9773 1833 869 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6170 NR:ns ## KEGG: Cpin_6170 # Name: not_defined # Def: membrane or secreted protein # Organism: C.pinensis # Pathway: not_defined # 24 869 19 842 844 635 40.0 1e-180 MKRLLILTFICLISAFVKVQGKSSSTPIIYIDGNGVMRWSDTRREASFFGVNYTLPFAHA YRAIGYLELDRKAAIDKDVYHISRLGLNAYRIHLWDVELTDGQGNLLENEHLDLMDYLIA KLKERNIHIVITAQTNFGNGYPERNIQTGGFSYKYDKCDMHSHPEAIAAQETYLHGLVKH VNPYTGLAYKDDPSIVGFEINNEPCHSGTKEEVKAYINRMLKAINKTGNRKPVFYNVSHN 
GYVVEAYYETAIQGTTYQWYPIGLVSGQTQQGNFLPYIDRYDIPFSDKVKGFDKKTRMVY EFDPADIMYSYMYPAMVRTFRTAGFQWITQFAYDPMDIAYANTEYQTHFLNLAYTPHKAI SMKIAAEAARSLKRGESYGSYPQDTLFGDGFRVSYTEDLSELNNGKKFYYSNHTNTQPKD ASQLVSIAGCGSSPIIRYEGTGTYFMDCLEPGVWRLEVMPDAVVVNDPFAKPSLDKEVVT IAYGAWDMALQIPDLGMEFTFTALNQGNQQKGDVTDGIIRGLCPGTYLLKRKNCTPKQNW QADSQWNSIRIGEYVAPAPRVTDYKVVHTPSATTEANKDLTISAQVVGTEFPDSVIIYTD KISFWNEHNPYIKMKHTGGYTYQATIPAREIKDDCFRYNIIVCRGNSTRTYPTGNSGYRN SSSGIKENPLDWNYTSGAYWTTRVVAPDSAIPLLTITDADSRIEAYTLPEWNDLQRTLVD SSPVEKPLLRFRFTPKGENPHYFLRTFVKNLIEERKERVKDCSVLCIRVNRTKALPEGLS AGFVTSDGYTYKSPCPAPSSEGIIRIPLKDLRQTDTVLLPIAYPTFLKQYFHPETEIAFL PEKIEKLELSMSGNKKELVEIELGNIWLE >gi|225935364|gb|ACGA01000028.1| GENE 7 9997 - 12972 2478 991 aa, chain + ## HITS:1 COG:no KEGG:BT_4671 NR:ns ## KEGG: BT_4671 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 991 1 992 992 1280 65.0 0 MNRNILNLRHLTVLLMLCTAFLGGIIPAFAQQGGKKITGQVIDENKEPMIGVSILIVGTS TGTVTDFDGNYTLNVPQGSKELQFSYVGYETKVVPVPANSTTLNVQLRSDSQVLSDVVVI GYGTQRKSDLTGSVTNISSKDFNMGLVSSPEQLINGKISGVQIMSNSGSPTAGSTIRVRG GASLNASNDPLIVLDGVPLEQGGISGNDGNFLSLINPNDIESMTILKDASSTAIYGSRAS NGVILITTKKGSSDKLKFTFSTTNSIQTRTKLADMLSYDQFVNTIRTQGTTAQQSLLGTA HTDWNDEIYQNAFGTDNNLSISGKIAKNWPFRISVGYYNQSGLLRTDNAERYTGSIMVTP SFFKDHLKLNINAKGSINNNTFGNTNAIWAASTMNPTIPVYSGLDTFGGYTEAIDNTGAP ANGAVMNPVGLIKMNDSQSNVRRFIGNIDADYRVHFFPDLKLHATLGYDYAQGKGSVYVP AEAAQNYTTSGLDYSYGPQKKENRLLTLYANYNKLVESIKSSFDVTAGYDYQYWKATTPL YYEMNTLGEIQKTSKAKDERHVLLSYYGRLNYSFDSRYMLTTTIRRDATSRFAKNVRWGT FPSVALGWRVTEESFLKDNKVLSNLKIRASYGVTGQQDGIGNYNYLPIYTQSQNGAEGIM GNEYIHTYRPSSYVPDLKWETTTSWNIGFDFGFLEDRITGSFDYYTRQTKDLIATVPAAA GTTFDRNITTNVGNVDSQGIEFSVNATPIQTEDWNWDVSFNMTWQKMKVKNLSLSAGTVA TNTSVGPYIDSYQVQTLSEGYAPYMFYVYHQLYDEKTGKPVEGAYLDVNGDGQINSSDLY RYHSPAPDYILGFSTSLRWKKWTLSTSLRANIGNYVYNAMAMNAGAWETVSYNSYQLNNL STSYLKTGFQTRQYLSDYYVENASFLKMDNLSLNYNFGQICKGCSLNVSAMVQNVFCITK YSGVDPEVPNGVDNSFYPRPRTFSLNVGFNF >gi|225935364|gb|ACGA01000028.1| GENE 8 12994 - 14571 1444 525 aa, chain + ## HITS:1 COG:no 
KEGG:BT_4670 NR:ns ## KEGG: BT_4670 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 525 3 524 524 532 53.0 1e-149 MKKILYTAFYFCSALLLNSCIGDLDQYPHEDETSSTIYTNAENYKAVLGKIYAAMVTTGQ EKGGDPDLSSNSGQDYMRCFFNLQECGTDEVASTWLSGDNVSGLTYLSWDANDPWVSDMY YRIYYNIALCNEFLRNATDEKLAEFSEQDQAEIRHYRAEARFMRALFYYHAMDLFRNIPF ITENDPTVGYLPPRYTSAQIFSYIESELNAITNDMLSKGDCPYGRASQGAAYTLLAKLYL NAEVYIETSKYTECIAACQQAIAQGYSLEDDYSKLFNADNDKRTANEIIFTLPVDATHTV SWGSSTYIVCGAVSNTSDKQIPENYGVKSGWGMFRVRGEIPALFSGTDDKRYMFFTDGQT QYLDVIDNQSNGYFVEKWTNLTDAGATASNTADGGVNTDYPLFRLADVYLMMAEAVVRGG TGSDKGTVLGYINKLQERAYGDDSHNKAEADLTLDFILKERARELYWEGYRRTDLIRFGM FTTDKYLWQWKGGEKGGVAVNSKYNIYPIPATELTANPNLYNENY >gi|225935364|gb|ACGA01000028.1| GENE 9 14595 - 16337 1462 580 aa, chain + ## HITS:1 COG:no KEGG:BT_4669 NR:ns ## KEGG: BT_4669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 580 11 570 570 269 32.0 3e-70 MKTINSLFLSLAALFILSACEDDADKYYLSSLTENELIASTNNIVLTEDIAQKNVLSLAW TDRTLAINNPDFKPTNLLKTSIQVALNEDFSGTIVESTETSLSKTYNGAELNIVTNNLGV EANVATKFYFRLKGTTGNNIEPAYSNVEVITITPYELDMRFASMLDQNGTDTGMSLFSVN TDGVYKGFVGVNAWYNFLIKEANGTIWRNDNETGMPFLLTTAGDWKCWFPGVGGCYYITF DTNNKQWNGIQIPTLNVSGDIDATMEFNQTNRKWTATFNANAAGSMTIQISGTGKLYDHT CANPNDNGGYDIDDTKAKDTPMAFSGSSTALNLGETAGDITVNVPQAGECTLSIDLSNPQ QWTVSVEEGSIEPEPEPEEGQYLYLPGIGQGSDWTYDHKIEVYYKEEQKYAGIVDVNSQY GNYAMTLFSSGDAWDTDKMYTIANDDSESTAEAGTLVNGQAKNIPAPAAGLYLFDVSLKE LTYKTYAVGNNIYCYYGIEGDNNLYPIATTGTTGEYSGTITLSQDSNWGIKFYIKDDWTG AYGGSNGKLYYNSNDGIPLTAGTYTITVNLIDGTYSASNK >gi|225935364|gb|ACGA01000028.1| GENE 10 16361 - 17449 1007 362 aa, chain + ## HITS:1 COG:CAC2570 KEGG:ns NR:ns ## COG: CAC2570 COG3867 # Protein_GI_number: 15895830 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinogalactan endo-1,4-beta-galactosidase # Organism: Clostridium acetobutylicum # 44 335 34 328 360 233 41.0 6e-61 MKRFKNILMASALLCGAFFTACDDNDDKPVFPENQDQAYDMSGFAKGADVSWLTEMEKEG 
YKFYDAEGNGHECMSLLRDLGMNAIRLRVWVNPDQGWSEEEGFFNPEGWCDKDDVVTKAW RAHNLGYRIMIDFHYSDIWADPGRQEKPAAWADLSFDELKQAVADHTTEVLSAIKARGID VEWVQVGNETRTGMLKPDGAASSGKTANFAQLVTAGYDAVKTIYPEAKVIVHIDQGNNPN RYIWLFDGLKADGGKWDVIGMSLYPEPDTQDWRQLNNDCIANMKSLIERYNTPVMMCELG IYWNYEEAEEFFTDFMTKAKEIDQCLGVFTWEPQSYGGWKGYHKGMFDDSGKPTSAFNAF KR >gi|225935364|gb|ACGA01000028.1| GENE 11 17521 - 19962 2111 813 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 28 798 59 871 871 439 33.0 1e-122 MKHQKQLLAVLLMGAACTLQAQRSETLLEKNWKFSKGDFKEASQPEFNDTKWESVVIPHD WAIFGPFDMNNDLQNVAVTQNFEKKASLKTGRTGGLPYVGTGWYRTSFDAPADKEVTLLF DGAMSEARVYVNGKEACFWPFGYNSFHCNVTSLLNKDGKNNTLAVRLENRPQSSRWYPGA GLYRNVHLIVTDKVHVPVWGTQITTPHVSKDFAAVRLQTKIDNAGEKTQIRIETEILSPD GKVVTSKENTSRINHGQPFEQNFIVNAPELWSPESPSLYKAVSKIYADDKLVDTYTTRFG IRSIEYIADKGFYLNGKHRKFQGVCNHHDLGPLGAAINVAALRHQLTLLKDMGCDAIRTS HNMPTPELVALCDEMGFMMMIEPFDEWDIAKCENGYHRYFNEWAERDMVNMLHNYRNNPC VIMWSIGNEVPTQCSPVGYKVAKFLQDICHREDPTRPVTCGMDQVTCVLANGFAAMIDIP GLNYRTQRYKESYDQLPQNLILGSETASTVSSRGVYKFPVEDKKSTKYEDHQCSSYDVEA CSWSNIPDEDFALADDNHWTIGQFVWTGFDYLGEPSPYDTDAWPNHSSMFGIIDLASLPK DRYYLYRSVWNKDAETLHILPHWTWPGREGEVTPIFVYTNYPTAELFINGKSYGKQSKNN SSLKNRYRLMWMDAVYEPGEVKVVAYNKEGKAVAEKVVRTAGKPHHIELVSNRNELTADG KDLAYVTVKVVDKDGNLCPTDNRQINFSAKGTGKYRAAANGDPTNLEQFHLPKMHAFNGM LTAIIQAGETAGEIVFTAKANGVKAGNIRIQTK >gi|225935364|gb|ACGA01000028.1| GENE 12 20224 - 20673 393 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293374010|ref|ZP_06620349.1| ## NR: gi|293374010|ref|ZP_06620349.1| hypothetical protein CUY_2814 [Bacteroides ovatus SD CMC 3f] # 1 149 62 210 210 303 100.0 3e-81 MEVSLNKESDSLYVIHLNTDQPSHSVWKLPYPVYQFDYGDVTGDGMPEIIVGVIKPTRFD PKPDKRLFIYRIADEAYIRPLWLGSRVAQPLEDFRVIREHTSVLIRTMERERSGKYLIAE YAWRGFGLDFKGYLKRETEEKEAKRILER >gi|225935364|gb|ACGA01000028.1| GENE 13 20723 - 23158 1911 811 aa, chain + ## HITS:1 
COG:no KEGG:MCP_0630 NR:ns ## KEGG: MCP_0630 # Name: not_defined # Def: hypothetical protein # Organism: M.paludicola # Pathway: not_defined # 169 791 121 744 748 286 31.0 3e-75 MRRLLLLSCTLVLILCGCKNKNKNTSTALAQDTVTTATSLLTDTVLPQSIDLKQDISRYS FQELRLLRSYPYAIHGYHFMEADINAFFSANTKWYNDLVWKLWEESEANGENKFPENYDE VKLTAEEKAFVERIDARMAEMRQQQFTQRDSYYLGNANNIVNLFQFKDIDEALLAKLQQN NFAITEGSNLQLFHAYEENDYRQVPNFITTDLYLQAFHMYFSYVLKSLEKQHIIPTLERL CLSLNATCISISRQTEDESLKDMAEYAATFYAIPYYLLTKETPNLPAKYQKAYQQEIEHI NAQEDDFSEFLSYKEAYFPYSLFKPRGHYTREPQLQAYFQAMMWLQTACFCREQQEQLKQ AIFQATVLSTYKDMAETPLMELYQRVYTPLTFLMGETDNLSLLDIAQILKKNKAEYTEDA LTPVQIEKVNQALIELAKSKNRIKPKIEISCRDKINFMPQRYLADNEVLQELVDVTPNSK RAYPKGLDVFAAFGVNSAETLLTDFYKEPGNWNQYTGELQKLKDKFKASQPAQVSVYELW MKSLFTMQKTDKSQPGFMQTPEWGYKNLNTALASWAELKNDAILYGEQPMAAECGGAGPP DPIVVGYVEPNLPFWKKMSGILQATQLVLQQSNCLTDDLKGKTEQLQDYVSFLIQVTEKE LRGEKLTEQEYRTLEYMGSSIEYFTLSVLDPDLHLDNWSLVQGPDKSIAVVADIYTRNVS GCDKNGVLHVATGNANNIYVVVEIEGNLYLTRGATFSYYEFVQPLDTRLTDEEWQKMLEE KKAPAVPEWMKKILLEKEPKVDDRVFYSSGC >gi|225935364|gb|ACGA01000028.1| GENE 14 23152 - 24078 388 308 aa, chain + ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 27 305 55 335 380 145 31.0 9e-35 MLISFSRINRIIGSIGLLLLTACQHPSEECLSIAFTGDVLLDRGVRQQIQRGGVEDLFAS VAPLFHQMDATVINLECPVTSVRSPLHKKYIFRAEPRWMEELVKAGISHAALANNHTMDQ GRSGLTDTYQHLLSAGITPIGYGNTSSESCQPVVIKKGKIKVALFNSVTLPLENWVYLEN DPGICQQTTEEIEEEIKDFKQENPESYVVVILHWGIEYQSSPTLNQRKGAHRLVRAGADA IIGHHPHVIQKEEYFNGKPIFYSLGNFVFDQRKPETSQSQIVQLDFTSTTCKVKVHPVTI HQCKPEMQ >gi|225935364|gb|ACGA01000028.1| GENE 15 24110 - 25141 837 343 aa, chain - ## HITS:1 COG:no KEGG:BF1208 NR:ns ## KEGG: BF1208 # Name: not_defined # Def: putative endonuclease/exonuclease/phosphatase family protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 343 1 341 341 600 80.0 1e-170 
MRLTKLLLITLLLLVSVCAKSQKGNEFTVLQWNVWQEGTMIPGGYDAIVNEIVRLKPDFV TFSEVRNYNHTNFTARVCASLQEKGLQYYSFYGYDTGLLSKHPITDSLTVFPENGDHGSI YRLTSSVNGRKVAVYTAHLDYLDCAYYNVRGYDGSTWKEIPLPTSVEEILKINVASQRDD AIRLFITQAEKDLANGYAVIVGGDFNEPSHRDWIEKNKNLYDHNGFVVPWTVTTLLEEAG FVDSYRKIYPNPLTHPGFTYPSDNPAKTPEKITWAPKADERDRIDFVFYKGDGLDAKKAI IFGPKGSIVRAQRVQETSKDKFLLPLDVWPTDHKGLLVTFGSK >gi|225935364|gb|ACGA01000028.1| GENE 16 25173 - 25859 738 228 aa, chain - ## HITS:1 COG:aq_1503 KEGG:ns NR:ns ## COG: aq_1503 COG0569 # Protein_GI_number: 15606658 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Aquifex aeolicus # 3 181 5 183 218 105 34.0 7e-23 MKYIIIGLGNYGHVLAEELSALGHEIIGADISESRVDSIKDKVATAFVIDATDEQSLSVL PLNSVDIVIVAIGENFGASIRVVALLKQKKVPRIFARAIDAVHKAVLEAFDLERILTPEE DAARSLVQLLDFGTNMEGFRIDQDYYVVKFTVPKKFVGYFVNELNLDEEFHLKMIGLKRA NKITNCLGISLTELHVKNELPGNEKVEEGDELVCYGRYRDFQTFWKAI >gi|225935364|gb|ACGA01000028.1| GENE 17 25864 - 27693 1357 609 aa, chain - ## HITS:1 COG:BH0598 KEGG:ns NR:ns ## COG: BH0598 COG0168 # Protein_GI_number: 15613161 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 154 606 18 445 448 221 36.0 3e-57 MKIYHKFLLYQNKLLKPYVRILLGLVEALTYLASLLLIVGVVYEHGFPLSIDEVANLQTL YKTVWIIFLIDVTLHISLEYRNTKKQYRRLAWILSGLLYLTLVPVIFHRPEEEGAILQIW EFLHGKFYHLLLLLVLSFLNLSNGLVRLLGRRTNPSLILAVSFMAIILIGAGLLMLPRCT VNGITWVDSLFTATSAVCVTGLVPVDVSTTFTTSGLVVIILLIQIGGLGVMTLTSFFAMF FMGNTSIYNQLVVRDMVSSNSLGSLLSTLLYILGFTLVIEGIGMVSIWFSIHGTLGMTLE GELGFAAFHAISAFCNAGFSTLSGNLGNPMVMTNHNGLFISVSLLIIFGGIGFPILVNFK DIVLYHLRRFWKLVRTRKLDRHKMQHLYNLNTKIVLIMTFLLLLIGTLAIAAFEWNASFA GMPLADKWTQAFFNATCPRTAGFSSVDLASLSVQTLLVYLFLMWIGGGSQSTAGGVKVNA FAVVVLNLVAVLRGTERVEVFGRELSYDSIRRSNATVVMSLGVLFIFIFTLSILEPGVSI MALTFECVSALSTVGSSLNLTPHLCDASKLLVSLLMFIGRVGLITLMLGIVKQKKNTKYR YPSDNIIIN >gi|225935364|gb|ACGA01000028.1| GENE 18 27826 - 29196 1674 456 aa, chain + ## HITS:1 COG:TM0539 KEGG:ns NR:ns 
## COG: TM0539 COG1350 # Protein_GI_number: 15643305 # Func_class: R General function prediction only # Function: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Thermotoga maritima # 10 426 7 421 422 489 59.0 1e-138 MSEKRKRYILPEEEIPHYWYNIQADMVNKPMPPLHPGTKQPLKAEDLYPIFAKELCHQEL NQTDAWIEIPEEVREMYKYYRSTPLVRAYGLEKALGTPAHIYFKNESVSPIGSHKLNSAL AQAYYCKEEGVTNITTETGAGQWGAALSYAAKVFGLEAAVYQVKISYEQKPYRRSIMQTF GAQVTPSPSMSTRAGKDILTKHPTYQGSLGTAISEAIELAQMTPNCKYTLGSVLSHVTLH QTIIGLEAEKQMEMAGEYPDIVIGCFGGGSNFGGISFPFMRHTIQEGKKTRFVAAEPASC PKLTRGKFQYDFGDEAGYTPLLPMFTLGHNFAPAHIHAGGLRYHGAGVIVSQLLKDNLME AVDIQQLESFEAGCLFAQSEGIIPAPESSHAIAAAIREANKCKETGEEKVILFNLSGHGL IDMTSYDKYLAGDLVNYSLTDDDIQKNLDEIGDLAK >gi|225935364|gb|ACGA01000028.1| GENE 19 29329 - 30582 578 417 aa, chain - ## HITS:1 COG:no KEGG:BVU_0128 NR:ns ## KEGG: BVU_0128 # Name: not_defined # Def: transcriptional regulator # Organism: B.vulgatus # Pathway: not_defined # 283 417 799 936 937 68 28.0 4e-10 MVPILANLPILGRFLRVGCSRSFREKTTFVAQNKKNMVNVIVLCIELLLDIVGLVLGILV YSRSSDNNGLTKAWGVLAITLSFLLLCDNLEWMWIFSRGGEETIPRFTEVPMNHLSVWHI VRVIVFFQFFSIFPIASLKPGWMTFSRVVSLCIPILLITCIACCYEFFNGHYTTLKSFAS IRENIGEQDVRVRLMLFVISVITPSVNFLFPYMRRWIPIRRKQSQAMSIYMMCFAIIMSG YIWLMLGTSGLCFNVFGYIVILPVLFLNILYLRNENPLSLPPQPVEELEMEEIEAIREIA VSPVVLELSNQMQSLMKHSKSFTNPQYSLQEFLNDLNTNENRLNKALHYDGFSGFRDYIN FYRLQYFKEQAQLKRELTVKELMFLSGFTSRSSFYRYFASVEKMSPSEYMEMLNREN >gi|225935364|gb|ACGA01000028.1| GENE 20 30664 - 31425 532 253 aa, chain + ## HITS:1 COG:TP0233 KEGG:ns NR:ns ## COG: TP0233 COG1366 # Protein_GI_number: 15639225 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Treponema pallidum # 147 253 9 118 181 63 30.0 4e-10 MTRYLKNMEKEITIENQVEEISIIAQFIEELGMSLHLPSGITMSLNLAIEEAISNIIHHG YPQNQKGEITLKTSVAPGLLIAQIIDDGISFDPTKSENEASGNALSLEQQLTQGLGLFLI RRTMDKVEYHSTDSQNELILTKKIEMDFKPEATLKTNLCKIEEVIILTVEGRLDTANTNE 
FNALILPLLNVQNPNIIINCEGLLYISSSGLRSLITLQKSVKQHNGQLVLEAMRAEIRKI FDMTGCSGLFIIR >gi|225935364|gb|ACGA01000028.1| GENE 21 31437 - 32387 531 316 aa, chain + ## HITS:1 COG:mll1621 KEGG:ns NR:ns ## COG: mll1621 COG0715 # Protein_GI_number: 13471600 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Mesorhizobium loti # 19 288 19 294 323 122 28.0 1e-27 MRKASIICSILLSLFFLTEPVNAQSINLTPKWTAQAQFAGYYVADKLGFYKEEGLDVHVV HPSLSASSFSFLQKGYSQAVVMNLSYALTEYFAGAQVVNILQTSQENSLMLVSRSPLKGI SSLQHKKIGVWNHLSQELLDQIANKYQLQVEWIRFNGGVNIFLSKAVDICLVGSYNEFLQ LIEAGVKTDSIHIMHFSDYGYNLPEDGLYVSREFYQKYPQAVQKLARASIRGWEWANEHR EQTLDIIMEQVHQHNIGTNRYHQRKMLDEILRLQTNKKSGQRPYRLSREVFDLATSILLP DQIKNTSDIRYENFVK >gi|225935364|gb|ACGA01000028.1| GENE 22 32400 - 34337 1120 645 aa, chain + ## HITS:1 COG:FN1091 KEGG:ns NR:ns ## COG: FN1091 COG2208 # Protein_GI_number: 19704426 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Fusobacterium nucleatum # 394 637 203 444 447 142 33.0 2e-33 MKHTFSRSFATRLSIYVLSFTLIIFATIMALFYYYSHQKVTDFAIERTHGLLSNIATEIS GQLLSVETTIEQSVWVLEKNIDHPDSLHNIIESVVRNNQFIVGSGIAFVPDYYKEKGKYF MPYASFKDEINGEITYQVLGSQNYDYPCMDWYLIPQLLKQPYWSEPYYDDGGGNFIMSTY AKPLFDSDGKLVAVFTANISLTQFTDTISLLKPYPSSYTYLISRNGSFLTHMDRNKIMNE TIFSEAFAKENLAQEQIGHEMLAGHSGTMRFDNKGVDSYAFYTTIPQIGWSVCTVCPSQI ILQELDSTSRTIIYTFIVGMIALFLIVYSIIRRLVRPLEKFSESAREIATGRFDVTIPEV HSNDEIRDLYDSLIFMQHSLSTYVVELKDTTASKERIESELSIAREIQMGMLPKIFPPYP ERSDVDLHAILHPAKEVGGDLYDFYMDGNRLYFLIGDVSGKGVPASLFMAITRSLFRTLS QQVLSPAKIVTDMNNSISDNNDSNMFVTLIVGILDLETGKLKLCNAGHNPPILIRPDGQV SFLEFKTQIFVGVIEEFAYTEDETTLEKGSKLFLYTDGVTEAENTEKELYGDEKLLETLS DNTTSDVRTTVNGIVDSIAEHVKEAEASDDLTILLIQYEPGTTNN >gi|225935364|gb|ACGA01000028.1| GENE 23 34350 - 34757 357 135 aa, chain + ## HITS:1 COG:slr1861 KEGG:ns NR:ns ## COG: slr1861 COG2172 # Protein_GI_number: 16330247 # Func_class: T Signal transduction 
mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Synechocystis # 5 127 4 129 143 68 33.0 3e-12 MEREIKISNDLNEISVLASFIEELGEELSLSFETTMNINLALEEAVANIIMYAYPTQEQH TILLRVTYSEKQLVFLLTDKGASFDPTQVDEVDVTLSLEERPIGGLGIFLIRSIMNEISY QRIDNENHLIMKKDI >gi|225935364|gb|ACGA01000028.1| GENE 24 34855 - 38916 2811 1353 aa, chain + ## HITS:1 COG:CAC0323 KEGG:ns NR:ns ## COG: CAC0323 COG0642 # Protein_GI_number: 15893615 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 823 1051 376 616 654 138 35.0 6e-32 MSSLNKIAICLLFLCAGVNYAFSDIPEQINFSYISINEGLSQSTVFSIDQDQRGNMWFAT YDGVNKYDGYSFTVYQHNEDDPNSIANDISRIVKTDSQGRVWIGTRDGLSYYDEEKDKFR NFFYEKKGKKLQINGIAEISPEQLLISTQEGLTMFDIKESRFVDDSFSTAMHKIVASALY RQGDIIYIGTPVDGLYSYSIPQKKLERITPITGTKQIQAILQQSPTRIWIATEGAGLLLF NPKTQEVKAYHHSSSNPKSISSNYIRSLALDSQNRLWIGTFNDLNIYHEGNDSFVSYSSS PVESGSLSQRSVRSIFMDSQGGMWLGTYFGGLNYYHPIRNRFKNIQRIPYKNSLSDNVVS CITEDKNKNLWIGTNDGGLNLYNPKTQQFTHYILQEDEREIGIGSNNIKAVYVDEQKSLV YIGTHAGGLTVLHRNSGQMESFNQLNSQLVNENVYAILPDKEGSILLGTLSALVSFNPEK RSFTTIDKEKDGTPFTSQRITILFRDSRKRLWIGGEEGISVFQQEGIEIEKAPILPESSV TKMFTNCIYEAANGIIWVGTREGFYCFNEKEKKIKRYTTANGLPNNVVYGILEDTFGRLW VSTNRGISCFNPETEKFRNFTESDGLQSNQFNTSSFCRTSNGQMYFGGINGITTFRPELL LDNPYTPPVVITKLQLFNKTVRPDDETGILTKNINETESITLKSWQTAFTLEFVVSNYIS GQHNTFAYKLEGYDKEWYYLTDKRAVSYSNLPQGTYHFLVKAANSDGKWNTVPTMLEIIV LPIWYKTWWAIVLFLAIFIGFITFVFRFFWMRKSMEAELEIERRDKEHQEEINQMKMRFF INISHELRTPLTLILAPLQEIINKISDRWTRNQLEYIQRNANRLLHLVNQLMDYRRAELG VFELKVKKENAHQLIQDNFLFYDKLARHKKITYTLHSELEDKEELFDPNYLELIVNNLLS NAFKYTESGQSITVTLKEENNWLVLQVSDTGIGIPINKQGKIFERFYQIESEHVGSGIGL SLVQRLVELHHGRIELDSEEGKGSTFSVYLPQDINTYKSSELASNDTLNEEGQVYSTNSK EMYFIDTEKVENETIEAGDKKRGTILIVEDNNEIRHYLSSGLAELFNTLEAGNGEEALEK LKDNEVDIIVTDVMMPVMDGIKLCKNVKQNIRTCHIPVIILSAKSEVKDQMEGLQMGADD YIPKPFSLAILTTKIQNMMRTRRRMLERYSKSLEVEPEKITFNAMDEALLKRAVAIVEKN MDNIEFSTDEFAREMNMSRSNLHLKLKAITGESTIDFIRKIRFNEAAKLLKDGRYTIAEV STMVGFNTPSYFATSFKKYFGCLPTEYIKKIKG 
>gi|225935364|gb|ACGA01000028.1| GENE 25 39429 - 41696 887 755 aa, chain + ## HITS:1 COG:no KEGG:Phep_1707 NR:ns ## KEGG: Phep_1707 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 4 749 2 750 750 545 39.0 1e-153 MNSLNKLYSYLLLSLTSILFVNAHPTIIKQDIDWPQFMSQQDMIWEKLPEYWYDSAFLGN GKLGLMIYKEPGENYIRLETGNCDVHDHRAKRDVFGIPRLLTGHFALRPKGEIISGQMRL DLWNAEASTDITTTKGVIHLKTIVHADDMIIVVKATTEGDENDFKWEWIAAEANSPRYLI FKRRGQTNKIPQDYELNPAPTISRNRNISLSVQKLLAGGQTTVGWQETQQSKKERTLWIN LTHTYPQNNSSEICKTEIQNAIHKGYDSLQCTHRKWWNAYYPSSFITLPEAQKENFYWIQ MYKLASATRGDRALIDNTGPWLTETPWPNAWWNLNVQLTYWALNASDHLDLAASLENALY NHVDQLRLNIPAPYRHNSLGIGVASNLECMTSEVGIPGKGKAQVGLLPWACHNLWLIYRH KMDDDVLRHKLFPLLKESINYYLHFLKQGSDGKLHLPTTYSPEYDVVEDCNFDLALLRWG CQTLVESAHRLKIQDPLIDTWKNVLLNLTPYPMDENGLLIGKGMPYAFSHRHYSHLLAIY PLYLINKEQPNDIETIEKSLAFWQSKPKALLGYSCTGASSISSAIGKGNDALTYLNKLFG KYLSPTTMYKESGPVIETPLSGAQCIHDMLLQSWGGKIRIFPAVPDTWQDVAYCGLRTEG AFKVSASRKDGKTQFIHIKSLAGEPCIITTDMPNPIFKGERTFTVSSSDNTYQIDLKKGE EVLIYPQGVPQNFVISPIMKSSKNCFGLKIVQSNT >gi|225935364|gb|ACGA01000028.1| GENE 26 41937 - 44060 1714 707 aa, chain + ## HITS:1 COG:no KEGG:BT_4662 NR:ns ## KEGG: BT_4662 # Name: not_defined # Def: heparinase III protein, heparitin sulfate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 707 1 702 702 549 45.0 1e-154 MKNIFFICLFALLSFTGCTDDENILGAEEGGNIESLPNNKPNDVLNARLFEVINLDYPGL EKVKSYYEANEYYYAAYELLKYYRNRSDINNPNINLINPSITAFEQNIADQALDSRFYVR NFKESVDANGNEVYYSFKKDGKIDWSYIPSGMTDQEFKSQIHRHQWMLPQAKAYRVNQNE NYISSWITVYNDWLNTFPCPEGTVDKDAVQWYGLQPAERVLDQINIMPHFIQSTNFTPEW LSTFLTALADEVECIRKNYFTDGSNIYVTQVQAVTTAGMLMPEFKNATEWLNEGSLKISE QVTAQFLADGVQVELDPSYHIGVVNDLYSIYKLAQLNNKLSLFPSNYTELLKKAARFVVD IIYPNYTIDNFNDTRSVSWTKNVLLKNFKKYMEMFPDDKEIQWMATEGKQGTKPTELVQI YDASGYYMLRSGWESSATMMILKNNNNPENKWHCQADNGTFGLYRKGRNFCPDAGVYTYG GTSSSNADRAAFAATKMHNTMTREEKDIAKGYMSGKFLKQETKKGTNILVTENKSYSDLT HRRAVFFVDNTFFVLVDEGYGEGSTPSVNLSFNLCPDTKDVVIDDESANYQYGAHTLFTD NNNMLFRTFVETKTGYSATNNTAYTSNKIEEKTQRRFYRVNITKPENGAARFITVIYPFK 
TEADFEKINIDAKFIDNTETSAGTFHENGAAVEVTIDGQKYELSYTL >gi|225935364|gb|ACGA01000028.1| GENE 27 44084 - 46225 1526 713 aa, chain + ## HITS:1 COG:no KEGG:BT_4661 NR:ns ## KEGG: BT_4661 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 726 726 504 42.0 1e-141 MKSNLQNLTSRFFAILMTIGLCITGISCTDPETTDSTKFAIYYAGVTDIGPSMNFNMSGP TYIGGTPSDFAITRVTLNGEVYETSSFQISDPSTGAIKLTDTDNLPVGTYCISISCISNG KYYEFKDVITVNMLAPVPDGISVDPSEVTVDFADIYKESASAQVKTEEGTHVHISKYEII QEEGKEYFAISKTGKITVNDKYEGEILPGKYVLNLKLTTEAGAGIYENAVTFKIISSPLT LTYNPSSVKVEKDEAFTSSVPTLKGSTDGLTYKIKSISPETSAITIDEQTGVITLAGNNG LEIDNSYSVVVTATNQYGSKDFDETPFVINIVAFINPITKLQYANQEKVQGVAFEFTPED VDGDELTYSFVDLDSRLTDKLNIDPVTGAISAKKGNSIEVATYTITVKAKNNKSEQTATF TLNITKNPNSFTFIRYGNNLGLTPEENYADQFDYDKKATLLAAKLTPKTDIPEGRPVKWE VKIQNTTALSGTTISETGELSFSENKWNSNYGVSVLFVTATVGEGEEAVSKTVPVFIRQN KDKNDIFVEYTPFAVKMNPAKGGTTPAPVVKLSGSPYTDYTKFLMDYRRDFYYYSFIGDH VDGIQSTVGSFMYGLWTTYYNTIGKTPNYSARGPMSYYDNSQNTNQALGYVNNKDLSVVI NPGKWKDTEGVYANGVFIGRMSFVTDGTQNDLANDGTTKNQIFPLAIWFDESF >gi|225935364|gb|ACGA01000028.1| GENE 28 46244 - 48430 1893 728 aa, chain + ## HITS:1 COG:no KEGG:BT_4660 NR:ns ## KEGG: BT_4660 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 724 1 724 1047 889 64.0 0 MYNINFNTLSKRGYLTLLALLIAITAFGQEITVNGVVIDETDTPLIGATVQVKNSQKGVV TDFDGKFSIKANNNATLIISYIGYKNQEIKIKGTKNLNIKMEPDNAMLDEVIVVGYGSMK KSDLTGSVSSVAAKSIEGFKTGSVVEALGGQIAGVQITQSDGTPGSGFDIKIRGVGTVNG DSSPLYIVDGFEVGDIDYLANSDIESVEVLKDASASAIYGARAANGVVLVTTKSGKEGKP VITYNGSATYRNIPKKLDMLSTYEFAALQVELNPTKYGTTYYQEGNDSDGNPYRHQTLED YLTDPGIDWQGESFKPTWSQNHDFSISGGTKETKYAASFSHFDENGIFKNSGYKKNSAKL RINQKINKFITFDATINYANTVKEGIGTSGTGGTLNMLSNILRFRPTGGNSVTNEELLNS VFDPLELSENATYSQINPIKQAEAVKDWRQAELWGANASLSIQLMKNLTFKAAATYNTTN TRRDIFYGAESSQAYRSGGVYGSTQMQKDLRWQSSNTLTYKHKINKKNSFDVMLGHEFAF RSSELLYGQAKDFPFENLGNDNLGIGATPSSVSTSRSDKKLLSFFARGNYNFDNRYLFTA TIRADGSTVFSAKNKWGYFPSFSAAWRISEEKFMKNITPISNLKLRLGWGTVGNDRITNY 
LSMDLYTASKYGVGQQLVTVLNPKQLANKALKWEGSTTANLGIDVGFFENRLNLTADFFL KKHKRPVT >gi|225935364|gb|ACGA01000028.1| GENE 29 48507 - 49391 606 294 aa, chain + ## HITS:1 COG:no KEGG:BT_4660 NR:ns ## KEGG: BT_4660 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 755 1047 1047 392 65.0 1e-108 MNINSTNIRTRNFLWQTDFNISFIKNTLKALQDGTSYIQSATKFNSNFNGNDYISIVGSS LGQMYGYVFDGVYQTSDFNMLPDGTMQLKPGIADISQHAGKTVEPGMVKYKDIDGDGVIT TDDRTTIGNGQPDWYGGITNTFNYKNIDFSFMLQFQYGNDIYNATRMFNTQSQDERSNQL AEVADRWTPTHASNRVPSAKGYVKYELYSRFIEDGSFLRLKNITLGYTFPNKWTRKAYIN RLRVYGTAQNLFCLTKYSGYDPEVNMKSSPLMPGFDWGAYPKSRVFTFGVEVQF >gi|225935364|gb|ACGA01000028.1| GENE 30 49405 - 51096 1336 563 aa, chain + ## HITS:1 COG:no KEGG:BT_4659 NR:ns ## KEGG: BT_4659 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 563 1 557 557 600 55.0 1e-170 MKNFKYYLCTLACAIAVSSCSLDETSYTEIEKGNYMNNATEAENVLLGVYRNMVQDGIYG FHLSLYFTIPSDIAKVTGNSTDGLRLIPSNAYTSSQTEIATTWANLYNAIYSANDFIEAL QQKIGTYDETNYKKAAVYMAEARCLRALYYFEVVRWFGHVALITNTEQSRQHPSTFTQAD PADVYKFIEADLQYAIDNLPYAIDDNIRSDNSFRLSKGAALGLLTKVYATWAGYPVHDTS KWEDAAKTAKILVESGKHHLLDDYEQLWKNTCNGVWDEEESLIEVSFYAPTVTGVSANDP CGRIGKWNGVQASGIRGVRNAGNWRVIPTFLRDWKDRESDKRWGLSFADYKYGKATDTGE DGVKIVINSSGNIEDAIKDDAKDALKKSYIDNVCPRKWDTEDYVNSANYLIDANLSNINW YILRYADVLLLYAEALNEWKQGPTDDAYRAINMVRRRGFGFPVDTDNSKSDLSGMSYEEF QKAVRNERAYELAFEGHRRQDLVRWGIYYESIKQTAQDLVNWYSGGDGYYVCVDYTKKNK NELLPIPQQEMDLCTQFNQNPGW >gi|225935364|gb|ACGA01000028.1| GENE 31 51204 - 54134 1320 976 aa, chain + ## HITS:1 COG:no KEGG:Phep_2838 NR:ns ## KEGG: Phep_2838 # Name: not_defined # Def: lyase catalytic # Organism: P.heparinus # Pathway: not_defined # 26 909 41 1004 1077 368 30.0 1e-100 MKKNICIIALLVSFITSQSAFSQIYSFEDNKIPGDWSINKGTLNISADKYKLGNQSLQIN WKQGAILTLQSPSGISEACQNKNGGINLWIYNKFPINEHLIFSFKDTNEKEVCRFPFLLN FKGWRCIWAKFQGDMNMPPKSNIKSVEMQFPQETEEGVIFIDFLEFTPKVSWQNMSDAQY KVNRKDYSLVSDFVSYRNTTPQVKKIITAEDQQIKIIADRLTTWYLGSGQQSSDKWIKMR 
EDNEEVFIRTGLKAAQKIKIQYNEDNTPKGEPLFPMGAPTTIEGQQLKKFRTINENILLP LALDYRKNQNVQSLKKVLYIYDWFNDQGWADGSGMGTLCFEKLRSSGYFHSFFLLKEQLS TELLERELQTLNWFTMFGTCYQTPANAGEVADNLRALAIPKLIYALSINDKQKKQVALTA FKNYMDNALGIAPGFFGTFKPDFSGYHHRGPYNSAYYPHALYAGALIAYLLHDTPYALSE STLHNLKQGLLTFRFFCAGLNVPAGTVGRFPKGQQILETLLPAFAYVSLSYKEPDKELTA AFKRILESGSNRQAITNYVSNVNSNLAYTSTVGEIELLTQLASTSISKEEKVNGTLFMPY SGLLIAKDSLFHFNIKGFSRYIWDFESSATENTKGRYLSYGQIEYFDLKNKYKSFNPEES DFDWNYIPGATTKVLPEYVLQDKGGSSSGHRNFSDETFLTGIHGSKKSAMFSFRMHDITY DNSFRANKSVFVFEDFLLCLGSDIQTKDKRYPTITTLFQSFDKNTSKERIDEGYILTDPS LMYVVKGGAVRTLCEGTHTRAYIEHGAPAIDAKYQYYILKQKDKKTAKRLLSNHSPIEIV AQDNDAHIIRHKTAGIICGALFNPLKTYTEQLVTQVNIPLSYILEKEEENDSFRLSICEP DMRRASRAHMGLLTEEDVVQEEKAFNTQLTINGIYNVKCLQKSIKVSHDKEKNKTYVTIS TIRGENYTLLLHQTNI >gi|225935364|gb|ACGA01000028.1| GENE 32 54136 - 56574 1169 812 aa, chain + ## HITS:1 COG:no KEGG:Slin_5006 NR:ns ## KEGG: Slin_5006 # Name: not_defined # Def: alpha-L-fucosidase (EC:3.2.1.51) # Organism: S.linguale # Pathway: not_defined # 24 794 25 803 864 706 45.0 0 MKHRLITLLLICFYILDSNAQSDLTLWYKSPAKVWEEALPVGNGRLGAMIFGEPQKERIQ FNENTLYSGEPETPKDINVASDLGHIRQLLNEGKNTEAGNIIQQKWIGRLNEAYQPFGDL YIEFASKGAITDYIHSLDMNNSIVTTSYKQNGIAIRREVFASYPAQAIIIHLSASKPVLN FTAHLESPHPVTQDSDSQAIYLKGQAPAHAQRRDIEHMKRFNTQRLHPEYFDQTGHVIQK KQVIYGNELGGKGTFFEACLLSSHKDGKLVIENNQFIAQDCSEVTLVLYAATSYNGLHKS PSKEGKNPHQEINNYRKISEKHSYKKLKEEHITDYQSLFKRVSFNLHTNKQLKKTPTDQR LKLFKKKEDQTIITQLFQFGRYLMIAGSRGEGQPLNLQGLWNNEVLPPWNSGYTLNINLE MNYWPAEVTNLSECHQPLFKLIEEIADKGKNLARDMYGLNGWAIHHNISIWREAYPSDGF VYWFFWNMSGPWLCNHIWEHYLYTKDIDFLKKYYPILKGSATFCSEWLVENSEGELVTPV STSPENAYLMPDGISASVCEGSTMDIAIIRSLFSNTINASKVLQTDSLFCAELTQKVNKL KKYQIGSKGQLLEWDKEYMENEPQHRHVSHLFGLYPGCDITDYTPELFDAARKSLNARGN KTTGWSMAWKISLWSRLYNSLKAYEALSNLINYVDSDTKAENQGGLYRNLLNALPFQIDG NFGATAGIAEMLLQSHKGNIHLLPALPPTWEKGNIKGLKARGGFTVDMEWEKGKITVAYV TSPYEQTTNITYKDMIRKTHFNAGERKKISFK >gi|225935364|gb|ACGA01000028.1| GENE 33 56843 - 58039 1123 398 aa, chain + ## HITS:1 COG:no KEGG:BT_4658 NR:ns ## KEGG: BT_4658 # Name: not_defined # 
Def: glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 397 33 427 434 714 87.0 0 MDKKRLDVASAQLQLSAEEVSGTGMLPRSIRTGYDMDFLCRQLERDSLTFKDSLRAQPTA EQLGKRRLCNVYDWTSGFFPGSLWYAYELTGNDTLKTQAIEYTNLLNPVRYYKGTHDLGF MVNCSYGNAERLSPNDTIAAVMRETADNLCGRFNDSIGAIRSWDFGTWNFPVIIDNMMNL DLLFNVAKTTGDNRYKDIAIKHAMTTMHNHFRPDYTCWHVVSYNSDGTVERKQTHQGKND DSSWARGQAWAVYGYTACFRETNDSIFLNFAKDIADMIMDRVKTDDAIPYWDYDAPVTKE TPRDVSAASVTASAMIELSTMVPDGQKYLDYAEKILKSLSSDAYLAKVGDNQGFILMHSV GSLPNGSEIDTPLNYADYYYLEALKRFMDLKGINYKDI >gi|225935364|gb|ACGA01000028.1| GENE 34 58068 - 60068 1854 666 aa, chain + ## HITS:1 COG:no KEGG:BT_4657 NR:ns ## KEGG: BT_4657 # Name: not_defined # Def: heparinase III protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 666 1 666 666 1291 93.0 0 MNKTLKYIVLLAIACFVSKGYAQELKSEVFSLLNLDYPGLEKVKALHQEGKDEDAAKALL DYYRARTNVKTPDINLNKVTISKEEQQWADDGLKHTFFVHKGYQPSYNYGEDINWQYWPV KDNELRWQLHRHKWFTPMGKAYRISGDEKYAKEWAHQYIDWIKKNPLVKMDKKEYELVSD GKIKGEVENVRFAWRPLEVSNRLQDQTSQFQLFLPSPSFTPDFLTEFLVNYYKHAVHILA NYSDQGNHLLFEAQRMIYAGAFFPEFKEAPAWRKSGIDILNREIHVQVYEDGGQFELDPH YHLAAINIFCKALGIADANGFRKEFPQDYLDTIENMIMFYANISFPDYTNPCFSDAKLTT KKEMVKNYKSWSKLFPKNQAIKYFATEGKEGALPDYMSKGFLKSGFFVFRNSWGMDATQM VVKAGPKAFWHCQPDNGTFELWFNGKNLFPDSGSYVYAGEGEVMEQRNWHRQTCVHNTVT LDNKNLETTESVTKLWQPEGAIQTLVTENPSYKNLKHRRSVFFVDNTYFVIVDEMAGSGK GSINLHYQMPKGEIANSREDMTFLTQFEDGSNMKLQCFGPAGMSMKKEPGWCSTAYRKRY KRMNVSFNVRKDGEEAVRYITVIYPVKKSADAPKFDAKFKNKAFDENGLEVEVKVNGKKQ SLKYKL >gi|225935364|gb|ACGA01000028.1| GENE 35 60302 - 61738 1180 478 aa, chain + ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 1 437 34 426 503 116 24.0 8e-26 MDRIANEGIRFDNCYAVNALSGPSRACILTGKFSHENGFTDNASTFNGDQQTFPKLLQQA GYQTAMIGKWHLISEPQGFDHWSILSGQHEQGDYYDPDFWEDGKHIVEKGYATDIITDKA IKFLEGRDKNKPFCMMYHQKAPHRNWMPAPRHLGIFNNTTFPEPANLFDDYEGRGRAARE 
QDMSIEHTLTNDWDLKLMTREEMLKDTTNRLYSVYKRMPIEVQDKWDSVYAGRIAEYRKG DLKGKSLISWKYQQYMRDYLATVLAVDENIGRLLNYLEKIGELDNTIIVYTSDQGFFLGE HGWFDKRFMYEECQRMPLIIRYPKAIKAGSTSNAISMNVDFAPTFLDFAGVDIPSDIQGA SLKPILVNEGKTPTDWRKAAYYHYYEYPAEHSVKCHYGIRTQDFKLIHFYNDIDEWEMYD MKADPREMNNVFGKPEYAKVQKELMELLQDTQKQYKDTDPDEKEKVLFKGDRRQMQNR >gi|225935364|gb|ACGA01000028.1| GENE 36 61806 - 63434 1425 542 aa, chain + ## HITS:1 COG:no KEGG:BT_4655 NR:ns ## KEGG: BT_4655 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 542 1 505 505 1043 96.0 0 MDRRKFLKNTGWSFLGLAASGSLLGSCAAGSKEAKKIMPSASNLKMYWGDLHNHCNITYG HGDMRDAFEAAKGQLDFVSVTPHAMWPDIPGADDPRLKWVIDYHTGAFKRLREGGYEKYV KMTNEYNKEGEFLTFVGYEAHSMEHGDHVALNYDLDAPLVECTSIEDWKQKAKGHKVFIT PHHMGYQGGYRGYNWKCFTEGDITPFVEMYSRHGLAESDQGDYPYLHDMGPRQWEGTIQY GLELGNKFGIMASTDQHSGYPGSYGDGRIGVMAPSLTRDAIWEALRTRHVCAATGDKIII DFRLNDAFMGDVVRGNSRRIYLNVTGESCIDYVDIVKNGQILARMNGPLTPIAPEGDTVR CKVKVDFGWNREEKYVHWQGKLSVDKGQIHSVTPCFRGAAFTSPQEGETEFHTHVNRIVS VGDKETELDMYSSKNPNTTTAAMQAVILDVEMPKDGKIIAEFNGKKFEHTLGELLEGSRS HFMIGWLSEAILFNRAMPESCFTLEHYMEDKDPQRDTDYYYVRVHQRDGQWAWSSPIWAE RV >gi|225935364|gb|ACGA01000028.1| GENE 37 63477 - 64448 476 323 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 7 311 6 317 319 187 37 4e-46 MEKEYAIGIDLGGTSVKYALIDNEGVFYFQGKLPSKADVSAEAVIGQLVTAINEVKAFAQ EKGYKIDGIGIGTPGIVDCTNRVVLGGAENINGWENIHLADRIETETGLSALLGNDANLM GLGETMYGAGQGATHVVFLTVGTGIGGAVVIDGKLFNGYANRGTELGHVPLIANGEPCAC GSVGCLEHYASTSALVRRFSQRIADAGISYPNEEINGELIVRLYKQGDQIAKVSLEEHCD FLGHGIAGFINIFSPQKIVIGGGLSEAGDFYIQKVSEKARSYAIPDCAMNTQIIAAALGN KAGSIGAASLVFTQLSAPNLIKL >gi|225935364|gb|ACGA01000028.1| GENE 38 64445 - 65713 1256 422 aa, chain + ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 2 402 18 407 412 141 29.0 2e-33 
MRKNLGMLTLIMAFWFTISFITNILGPLIPDIIHNFNLSDLAMAGFIPTSFFLAYAIMSI PAGLLIDRFGEKPVLFGGFLMPFIGTVLFACMHTYPILLASSFIIGLGMAMLQTVLNPLQ RTVGGEENYAFIAELAQFMFGIASFLSPLVYTYLIRELDPATYTAGKGFFIDLLADITPR EMPWVSLYWVFTILLLVMLIAVGISRFPKIELKEDEKSGSKNSYLALFKQKYVWLFFLGI FCYVSTEQGTSIFMSTFLEQYHGVNPQTDGAQAVSYFWGLMTAGCLVGMILLKLIDSKRL LQISGILTIILLLLALFGSKEVSIIAFPAVGFSISMMYSIVFSLALNSASQHHGSFAGIL CSAIVGGAGGPMIVSTLADATSLRTGMLFILVFVGYITFIGFWARPLINNKTIRLKDLFK QR >gi|225935364|gb|ACGA01000028.1| GENE 39 65754 - 68375 2148 873 aa, chain + ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 873 1 872 872 1645 89.0 0 MMKQRISIFLLFTILLSASGYAQKGIMRLTQQTLMHEVRETPSPLNGQHITVNPPRFMWP DKFPHLGAVLDGVEEEDYKPEVTYRIRIARDPEFKSEVITAERKWAFFNPFKLFEKGKWY WQHAYVDKEGKEEWSPVYHFYIDEHIRTFNPPSLQEVLAKLPKTHPRILLDAEDWDNIIE RNKNNPEAQAYIRKADKCLNHPLKHLEEEIDTTQVVKLTNIVQYRSALIRESRKIVDREE ANIEAMVRAYLLTKDEVYYKEGIKRLSEILSWKSSKYFAGDFNRSTILSMSTSAYDAWYN LLTSDERKLLLRTIRDNGKKFYHEYVNHLENRIADNHVWQMTFRILNMAAFATYGELPMA STWVDYCYNEWVSRLPGLNADGGWHNGDSYFQVNLRTLIEVPAFYSRISGFDFFADPWYN NNVFYVIYQQPPFSKSAGQGNSHESKMKPNGTRVGYADALARECNNPWAAAYVRTILQKE PDIMEKTFLGKSGDLTWYRCTTKKALPKEGHTLAELPMAKVFNETGIGTMNTSLGDIDKN AMLSFRSSSYGSTSHALANQNAFNTFYGGKAIFYSSGHRTGFTDDHCMYSYRNTRAHNSI LVNGMTQKIGTEGYGWIPRWYEGEKIAYMVGDASNAYGKETSPLWLKRGELSGTQYTPEK GWDENKLDMFRRHIIQLGSTGVFVIYDELEGKEAVTWSYLLHTVELPMEIQELPNEVKVT GRNKAGGISVAHLFSSAKTEQAMVDTFFCAPTNWKNVTNAQGKAVKYPNHWHFSSTTVPC KTARFLTVMDTHGDNRLDMQVVRKGNIIQVGDWTITCNLTEKGKAAIHVSNKVEKVSLNY DPDKKEGATTVTDQVKGKQVNKVLTDYLPDFEI >gi|225935364|gb|ACGA01000028.1| GENE 40 68576 - 69409 668 277 aa, chain + ## HITS:1 COG:STM2203 KEGG:ns NR:ns ## COG: STM2203 COG0648 # Protein_GI_number: 16765533 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Salmonella typhimurium LT2 # 1 277 1 279 285 384 67.0 1e-107 MKYIGAHVSASGGVEFAPINAHEIGANAFALFTKNQRQWVSKPLTEESISLFKENCEKYG 
FQREYILPHDSYLINLGHPEEEGLQKSRAAFLDEMQRCEQLGLKLLNFHPGSSLNKVSVE DCLSLIAESINLALEKTKGVTAVIENTAGQGSNLGSEFWQLKYIIDRVNDKSRVGVCLDT CHTYTAGYDIVNEYDKVFDEFDKEVGFNYLRGMHLNDSKKALGTHVDRHDSIGEGLIGKA FFERLMQDSRFDNIPLILETPDESKWKEEITWLRSME >gi|225935364|gb|ACGA01000028.1| GENE 41 69597 - 71549 1320 650 aa, chain + ## HITS:1 COG:no KEGG:PRU_1399 NR:ns ## KEGG: PRU_1399 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 36 647 39 690 692 260 32.0 1e-67 MKKFWLACTMIIAMGVSSCVDSDKDLYQEDPGTGTNPSSFSTVQKVQVEIDYSDSESRVP FSIYDGNPLIESENTITLKENVQALDGAWTDEQGKFTATVDLPAYVSDVYIVSTSPFARR AIPGKIVNGVLKVSDTDEQPTTRASYRESTKFDENRFDNLGWKTNLGKYDEYSGVINYAY KGKDPKLTLSKSEMNELRTTVNKVLNTFKDCPEDYRTQADLYVEKDETAVVLTALKGWTC WNSSLGYYYYRPNQIPTSLKDVKVYAIFPNTQMTWNNGSLKASPQGIEEGTAVQLKYFDD PEHPEGTNFPKGYSIGFVLACNAWNTYFTGFNSHTLTYGFYACSTKGFSTKVNSNIDVRT AMFRDKNNNIAIAFEDFMDDQNFTDVVFSLKANPEITNVPPVDEDLNTTIEKTGVYAFED EWPKAGDYDMNDVLVQYTYQKVFNIYNEILSESFTFKTLYNKYTVFTNGLGFTLSNAGNA QSTEYLIKKEGEKEFTVASGTDKFTRESNAIILTDNVKTNPNAEYKVTFKYGDKNSNKKQ ETSIDAFIYRPSKEGNRLEVHCPMQKPTSKVDTSFFGQNDDRSKPNEGIYYVSNQENIYP FAFYLANANADDIAELKNFNKNEKKPINEIYPKFIDWAKYGTNSDWYKKK >gi|225935364|gb|ACGA01000028.1| GENE 42 71635 - 73206 868 523 aa, chain - ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 18 522 10 520 521 260 33.0 5e-69 MFRLIKKIKDKGRWRIFKLKLIDARLYFVSIFVGLLTGLVAVPYHYLLQFFFNLRHDFFD SRPKWYWYIPLFLLMWGILVFVSWLVKKMPLITGGGIPQTRGVINGRVDYKHPFLELVAK FVGGILALSTGLSLGREGPSVQIGSYVGYLVSKWGRVLSGERKQLLSAGAGAGLAAAFAA PLASSLLVIESIERFDAPKTAITTLLAGVVAGGVASWIFPINPYFHIDAIVPGMTFWGQV KLFLLLAAVVSVFGKFFSVTTLQVKRIYPAIKHPEYVKMLYLLFIAFLISMAEFNLTGGG EQFLLSQAMRPDTHILWIVGMMLLHFVFSTFSFSSGLPGGSFIPTLVTGGLLGQIVGLIM VQQGVIAYENISYIMLICMSAFLVAVIRTPLTAIVLITEITGHLEVFYPSIVVGGLTYYF TEMLQIKPFNVILYDDMIHSPAFKEEPRYTLSVEVMSGSYLDGKIVDELRLPERCVIINV 
HRDRKNLPPKGQTLMPGDQVQIEMDSQDIEKLYEPLVSMANIY >gi|225935364|gb|ACGA01000028.1| GENE 43 73297 - 76053 1801 918 aa, chain - ## HITS:1 COG:no KEGG:BT_3977 NR:ns ## KEGG: BT_3977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 917 3 909 937 622 39.0 1e-176 MNIDSHIYSFQYRLITCYVLLLISLTVFAQSGKERFSGRVIDTETYQPVPFATVRLLALP DSTLLGGGATDAIGKFQLSISIPNHVTKAKSLLLQVSYIGYKQVFHPVSISGKSSSYELG DINLTPESYALDEAVVVGQAPMAVTEGDTTVFNASAYRTPEGSMLEELVKQLPGGEIDAD GKLLIHGKEVKKILVDGKEFFSDDPKAALKNLPVEMVEKLKAYERQSDLARLTGIDDGEE EMILDLSVKKNMKRGWMENFMGGYGSKDRYELANTLNRFRENSQLTIIGNLNNTNNQGFS ELQNESSSSTGNIRSRMGLTTSRSLGLNATYDWKRVKLRSNVQYVGTDRLEDSRTTVDNF LREDKSINLGTSHSRQQNHELVANASLEWKMDSVTTLIFRPQYRFAANDRENNGFQEGWG NDVLLNERESSGTNHNSRYNLALMLQLSRKLSRLGRNVALKVDYGTNASATDRTNLSTTR YFKNNTKKIQNQKIEDDVDGFNYRVQLVYVEPLPWRHFLQLRYSYQYRVNNSDRFVYDWD KELEEFAPDFDEESSNRFENQYTNHLVNLAIRTSQKKYNYNIGVDFEPQKSVSHSLLSDA PEDQLERSVMNFSPTVNFRYKFSKRTRLQIVYRGKGKQPNIRDLQPVTDRTNPLNIRVGN PLLKPSYMNTFTLNFNSYNTKHQRNMVATALFENTINNVTNQVTYDSETGVRTTSPVNMN GNWRAMGSFSLNTPFKNRNWIFRTYSYLQYRNQNGYTTLNKEEPVKTSVQHLTGRERLRI TYRTRQMELTGLVGLTYNNSYNDVREKRTETFDYQAGTELQLYLPWGMELYNDLTYFLRT GYGYEGYAKANFMWNCQLSKAFLKRKQLLIRFKIYDILHQDISMIRTITATAIRDTDYNA LGSYFMVHAILRLNMMGK >gi|225935364|gb|ACGA01000028.1| GENE 44 76102 - 76662 598 186 aa, chain - ## HITS:1 COG:no KEGG:BT_4649 NR:ns ## KEGG: BT_4649 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 194 194 192 73.0 5e-48 MKTKFIYVILAVLLMGSQMALSAQNTDNKERKQRPTPEQMVQMQTSQMVKILMLDDATAA KFTPVYEKYLKDLRECRMMSKPRTEKAKKQGTDANAKEERPSMTDDEIATMLRNQFTQSR KMLDVREKYYNEFSKILSQKQILKIYQQEKMNANKFKKEFDRRKGQKPGQGHHQGQRPRA PHPDQK >gi|225935364|gb|ACGA01000028.1| GENE 45 76696 - 77040 388 114 aa, chain - ## HITS:1 COG:no KEGG:BT_4648 NR:ns ## KEGG: BT_4648 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 119 119 118 64.0 5e-26 
MKKDFDFDDIGKRTPYRTPDGFFEDVQRKVMERAGVKQQRKSHMKLIISTVITMAAVLIG FLFVPSLRQADEVKTSSSKVLANGTESVDKWIKELSDEELEELVSFSENDIFLN >gi|225935364|gb|ACGA01000028.1| GENE 46 77045 - 77584 524 179 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 1 176 3 184 187 79 30.0 3e-15 MINENKIREACASNRERGFKMLMDSFQVPIYNYIRRLVVSHEDAEDVLQEVFIRVFRHID QFREESSLSTWIYRIATNESLRLLNGRKDEGVVSAEDVQEELMGKLKASDYIDYENGLAV KFQEAILSLPEKQRLVFNLRYYDELEYEEIARVLDSKVDTLKVNYHYAKEKIKEYILNR >gi|225935364|gb|ACGA01000028.1| GENE 47 77784 - 77960 231 58 aa, chain - ## HITS:1 COG:no KEGG:BT_4646 NR:ns ## KEGG: BT_4646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 1 58 58 96 82.0 3e-19 MELPKDPMMLFSVINMKLRDCYSSLDELCEDMNVDKDELVNQLKAVGFEYSVEHNKFW >gi|225935364|gb|ACGA01000028.1| GENE 48 78022 - 79566 1223 514 aa, chain - ## HITS:1 COG:no KEGG:BT_4645 NR:ns ## KEGG: BT_4645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 512 513 723 69.0 0 MKQAYHYLSVLLILFFFVDMMRADGSDVLERIVRLPKIKGTVYSLLSDVSQQSGYMFIYD SKVIDNDAVVKIKGGERSIRQAVYDIVGNTSLEFQVMGTHILITFPSEKKVRVQQDFVPE QITNFVIIGTLLDKETGIPISSATVGVRGTSIGSITNQNGDFKLSLPDSLKNDSITFSHI GYLSQDIEFALLVGRHNILSLEPKVVPLQEVVIRRSDPKKLLREMIERRNKNYSHTPVYL TTFYREGVQLKNKFQNLSEAVFKVYKTSSYSSVPDQVKLLKMSRLSNVEAKDSLLVKVKS GIQACIQMDIIKDMPEFLTPSVEKGIYDYTSEGVTFLEDRFVNVVHFEQKKGISEPLFCG ELFLDSETSALLQARLEVHPVYVKNAAGMFVERRTRNVRMIPQKVVYTISYKPWQGTYYI HHIRGDLHFKVKRTKMLFGSRDLHIWFEMITCKVDTEQVVVFPRTDRLPTRTIFSDTYFK YDENFWKDFNVIPLEEEISKLIEKISLKIEEIGD >gi|225935364|gb|ACGA01000028.1| GENE 49 79588 - 80442 559 284 aa, chain - ## HITS:1 COG:no KEGG:BT_4644 NR:ns ## KEGG: BT_4644 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 284 2 280 280 353 
65.0 3e-96 MKTDIYKIKTDQAWNRLYNRLDNDRLLAKVDEGHHTRKYSQWIRYGAVAAVLIGVIWGTL YWVTGSDKELAQHLLTQENQGVSTLATTLEDGSVVLLAKETSLLYPKHFIADKREVSLQG NAFFDVAKKQGQPFWIDTEQAKIEVLGTAFSVQSDENAPFRLSVQRGIVKVTLKKGNQEC YVKAGETVVVRSQQLVVLDTDKENEEWGSFFKHIRFKDESLANILKVMNLNSDSLQIQVV SPALEERRLTVEFSDESSEVIANLIASALGLQCIWQGDILLLSE >gi|225935364|gb|ACGA01000028.1| GENE 50 80442 - 80993 315 183 aa, chain - ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 39 169 33 161 181 66 32.0 3e-11 MLDELLILKKIKGGDIKAFEELFRRYYFPLCCYAAGITGQMEVAEEIVEELFYVLWKERE RLQIFQSVKSYLYRATRNQSLQYCEHEEVRNRYRETVLTTSNPEQSTDPHQQMEYEELER FINGTLEKLPVRRRRIFEMHRLEGWKYVEIATQLSLSVKTVEAEMTKALRTLREEVEIYI HMK >gi|225935364|gb|ACGA01000028.1| GENE 51 81044 - 82303 1103 419 aa, chain - ## HITS:1 COG:no KEGG:BT_4642 NR:ns ## KEGG: BT_4642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 419 1 418 418 622 73.0 1e-176 MKNVEKHLVGWLSVCLLLCSFPVWAQGDAGDYLTIVGMVKDKQNKKALENVNVSVHGSNI GTVTNAEGEFALKIKKTEALRELEISHIGYVNNHISLEKETPSKLTVWLTPHANLLNEVV VFAENPRMIVEKAISKISLNYSDKRDMLTGFYRETVQKGRRYIGISEAVIDVSKTAYTNR NTNYDKVRVVKGRRLLSQKASDTLAVKVVGGPNLSITLDVVKNKGALLDMEELNNYEFWM AESMLIDNRMQYVINFRPKVILMYALLYGKLYIDRERLSFTRIEMSLDMQDKSKATTAIL YKKPLGLRFKPQELSYLVTYKDVEGKTYLNYISNTIRFKCDWKKKLFSTSYTVASEMVVT DKKEGMLETIPNKEAFSLNQIFYDKVDEYWSPDFWGNYNIIEPTESLEHAVDKLKKQSN >gi|225935364|gb|ACGA01000028.1| GENE 52 82585 - 83016 135 143 aa, chain + ## HITS:1 COG:no KEGG:BT_4640 NR:ns ## KEGG: BT_4640 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 146 146 134 52.0 7e-31 MMKYIFCILLGVFLCWGFHLISDEAQEAYRLEAGSNGIQTYSNSYTSKASIIQLDQEQKL VKTKEYAGKQNYNNDNGVLNPFSFRHLSPLKILRFNIPSVTIRILSSLKIQLPKNQWTGF SYYTNLFKYSDRYHIYSLGHILI >gi|225935364|gb|ACGA01000028.1| GENE 53 83166 - 83807 562 213 aa, 
chain + ## HITS:1 COG:STM2367 KEGG:ns NR:ns ## COG: STM2367 COG0586 # Protein_GI_number: 16765694 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 3 213 6 216 219 266 62.0 3e-71 MDFLLDFILHIDQYMVMIVRDYHAWTYAILFFIIFCETGLVVTPFLPGDSLLFVAGAISA LPDMPISVHILVIILFAAAVLGDSCNYMIGHFFGRKLFNNPNSRIFKQSYLEKTHEFYKK YGGKTIILARFVPIVRTFAPFVAGMGKMNYYYFMMYNLVGGAVWVAVFCYAGYFFGDLPF VQENLKLLIVAIIFISVLPAIIEVVRAKLKSQK >gi|225935364|gb|ACGA01000028.1| GENE 54 83880 - 84527 599 215 aa, chain + ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 9 215 4 210 210 104 33.0 1e-22 MKNSFFSRFTPKEPKFFPLLKQLSEVLCEASVVLIESLQHDSPTERSDYYKKIKDLEREG DRLTHLIFDELGTTFITPFDREDIHDLASCMDDVIDGINSCAKRISIYNPRPISENGKEL SRLIQEEAIYICKAMDELETFRKKPTLLREYCSKLHEIENQADDVYEFFITKLFEEEKDC IELIKIKEIMHELEKTTDAAEHVGKILKNLIVKYA >gi|225935364|gb|ACGA01000028.1| GENE 55 84543 - 85559 809 338 aa, chain + ## HITS:1 COG:RSc1313 KEGG:ns NR:ns ## COG: RSc1313 COG0306 # Protein_GI_number: 17546032 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Ralstonia solanacearum # 12 333 19 331 336 259 47.0 6e-69 MELLVTIIILALIFDYINGFHDAANSIATIVSTRVLTPFQAVIWAAFFNFVAFFIAKYII GGFGIANTVSKTVVEQYITLPIILAGVIAAITWNLVTWWKGIPSSSSHTLIGGFAGAAIM ANGFEAIQLNIILKIAAFIFLAPFIGMVIAFGFTLFVLYICRRAHPHTAEQWFKRLQLVS SALFSVGHGLNDSQKVMGIIAAAMIAGHSMGLGMGINSINDLPDWVAFSCFTAISLGTMS GGWKIVKTMGTKITKVTPLEGVIAETAGAFTLYITEMLKIPVSTTHTITGAIIGVGATKR LSAVRWGVTKSLMTAWILTIPVSGLLAAAIYYIVSIFL >gi|225935364|gb|ACGA01000028.1| GENE 56 85611 - 87866 1768 751 aa, chain + ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, 
membrane components # Organism: Pseudomonas aeruginosa # 3 429 2 427 585 354 43.0 3e-97 MSHLPTLIADLALILISASIITLLFKWLKQPLVLGYIIAGLLAGPYINIFPTVGDIENIN IWAEIGVIFLLFALGLEFSFKKLMNVGSTAFITAITEVISMLLIGYLVGQLLGWGTMNSI FLGGMLSMSSTTIIIKAFNDLELRNQRFTGIVFGTLVVEDLIAILMMVLLSTMAVSQDFV GEDLLISVLKVVFFLILWFLIGIFVIPAFLKKAKKLMNNETLLIVSLGLCLGMVVLATYT GFSTALGAFIMGSILAETIEAEHIEHIIQPVKDLFGAIFFVSVGMLVNPAVLVEYAWPVI IITLVTIIGKAIFSSFGVLLSGEPLNTSIKSGFSLAQIGEFAFIIAGLGVSLKVLDPFIS PIIVAVSVITTFTTPYFIRLANPFAEWLYKILPTKVQETLDRYASGKKTMNHDSDWKKLL KNMIGRVVIYSVLLTAIWLLSIQIIYPAISEMFAPVTVWINLVMCLGTLLLMTPFLWALI SNKYNSSELFLKLWRDENYNHGRLVSLVLGRVSVALFFITSVVISYFKLNWGISVIIAIA VVALILILREDLTQYSRLETRFLANLNRREEVAKKRHPLKTSFNSEFNDKDLDLTSILVS PYSDYIGKSLGELSFRQNFGVNVVAITRGDLNIYIPKSSEHIYPQDKLTVVGTDTQLQEF RNRIEDVKNTSDMDAVDKKITLHSFTVDEEFRFLNQTIAQSHLGEKHDSIVVAIERNDEL ITLDKDTTFQLGDLVWIVGNREKIRKILYLT >gi|225935364|gb|ACGA01000028.1| GENE 57 88042 - 89682 1197 546 aa, chain - ## HITS:1 COG:no KEGG:Snas_3080 NR:ns ## KEGG: Snas_3080 # Name: not_defined # Def: hypothetical protein # Organism: S.nassauensis # Pathway: not_defined # 213 492 58 320 422 78 27.0 7e-13 MRTIYLLLFPLFLFFIVSCTEDEVDPNFSIENVGDVVMGSAKGSQVTISFTSTREWKAST VADWFTIAPVSGEAGTCNITLTAISENITGNVRTATLTLTSGTLTQDITIEQESAEFVNL EQNIYNVSVEGGELDIRFSTNIAEDELLIYGSLGTGTWLTQETKTRASSSYMLNLTVLPN TDGVSRTAYIYFVKVTDMESIVVEMVTIIQRGEVASESTDYSSDKKVRVLQTAKLGKGLP IVLMGDGFIDTEINDGTYDAVMDKAFENLFTEEPIKSLRDYFNVYAVTAVSKHNIFGTGY ETALGCELAGGNSTGISGEDNAVQRYVQCVDNIDMSETLAVVILNSPVYAGTTYFGYTNQ TKVVEFAIAYCPVIYDLQSESFRQVLVHEAVGHGFAKLEDEYAYQENGTISSKEIKNVQY LQTLGWAQNVDFTSDPSQVLWSAFLNDNRYASEKLGVFEGACTYIKGAYRPSEESMMNSN TEGFNAPSRKAIYDKIMERSLRKQMSYEEFAVFDLQNKSQTRSTKSTVGPSKPFTRPHFV NRMLNN >gi|225935364|gb|ACGA01000028.1| GENE 58 90134 - 95008 1933 1624 aa, chain + ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. 
PCC 7120 # 275 626 11 343 718 225 40.0 5e-58 MSIWSKLIGIKKTEDRNNAIGNKITSVPFSAPNYAIVDVEVGLKDHKIHDIGALLHDGTT FHKTSKDELFSFLCDIDYICGHNIIHHDAKYLFTDKTCRWILVDTLYVSPLLFPERPYHR LVKDDKLISEQMNNPVNDCKKANDLLLDEIACWNSLPKEKRILFASLLKGKKEFEGFLRI MNAEYINEGIPELIKKLYAGKICQHADLDMLIEQYPCGLAYALALIDTTDYRSITPGWVL YNYPEVEFIVKLLRHTSCREGCDYCNTQLDVLHNLKVFFGYERFRTYEGEPLQEQAAQAA VKGKSLLAIFPTGGGKSLTFQLPALMAGRSVHGLTVVISPLQSLMKDQVDNLADRGITDA VTINGMLDPITRALSIQRVQDGEASLLYISPEMLRSKTIEKILIARHVVRFVIDEAHCFS SWGQDFRVDYLYIGKFIREYQQKKKCKNPIPVSCFTATAKQKVVQDICDYFKQTLNLDLE LFASTASRTNLRYSVIHADSDNDKYLKLRELVAESNCPTIIYVSRTKRTEELATKLTRDG YKALPFNGKMESDEKIANQDAFMNDQVHIIVATSAFGMGVDKKDVGLVVHYDISDSLENY VQEAGRAGRDPSLSARCYVLYSDNDLDKHFILLNQTKLSISEIQQVWKAIKDLTRQRMKV SCSALEIARQAGWDDSVSDIETRVRTALAALEQSGYLIRGNNVPHVYATGITVKNMDEAR KRISMSLLFDRDEIEKAVRIIKSLISQKHIAKAQDSEAESRIDYLADILGISKREVISVV ERMRQEGILADSKDISAYLLDAGDSERKSQTLLERFAKLEQYILNHIPDESLRISCKQLN DNAVNNGINTSKEKDIRTLLYFLTVKGYTRKKEDAVRNIEISRQADLESTIKRFEKRLEI SRFAIEWLYRIASDTETGNSPNKAIQFSVVELLNQIKSSPQSLFGGLDDIQLEDVEEALL YLSKIGALKLEGGFLVLYNAMNIQRIKDNKSRYKQDDYRMLNEFYKQKIQQVHIVGEYAN LMVRDYNTALRYVQDYFQMDYRKFVTKYFKGDRVSEIQRNLTPQKYKQLFGQLSKRQMEI ISDKDSRCIVVAAGPGSGKTRVLVHKLASLLLLEDVKHEQLLMLTFSRAAATEFKQRLME LIGNAAHFIEIKTFHSYCFDLLGRIGNLEDAKSVVAKAAEMICQGEVEPNKIGKTVLVID EAQDMGVEEYALLRALMANNEEMRVIAVGDDDQNIYEFRGSDSDYMYRLTQENGSKFVEM TENYRSAHHPVNFANGFLRTIDKRIKSTPIISMRKEEGWVEVTRHQSKYMYQPLVEKLLQ HKDKGTSCVLTQTNEEAVILMALLRKHGINSKLIQSMDGLRFWNMAEIRYLLRYIDKRAK TPLIPEELWEEAKHATFSAYDGSLSLTYVKRCVEQFEQTNKAKYLCDFKEFVFESSVEDF CDVSGAEVVVSTIHKAKGREFDDVYILVSDNYSKDAYLMRKYYVGITRAKNRLFIHTNGD CFNRLSTDRYFVDQRQYDMPEEIVLQLSHKDVYLGFFKERKQEVLALRGGDSLNYNDFFL YSSSTNKPVARLSLKMQDTLSEWEERGYKVKSASVRFVVAWKPKDAQKNESETAVLLADL VLSL >gi|225935364|gb|ACGA01000028.1| GENE 59 95151 - 97232 1393 693 aa, chain - ## HITS:1 COG:MA2364 KEGG:ns NR:ns ## COG: MA2364 COG4930 # Protein_GI_number: 20091197 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent Lon-type protease # 
Organism: Methanosarcina acetivorans str.C2A # 5 515 15 520 682 416 43.0 1e-115 MTLQQKIMNAFLGKVVRKDLAFLVKGGLPVPTYVLEYLLGQYCACDDEATIEEGLEKVRQ VIQDNYVHRAESEVIKGKIREQGCHRIIDKVTVTLNEKADEYQAHFANLGLTNVPIGTQY VTNNPKLLSGNGVWCIVTIGYISGEDIKVRWEIQTLKPVQISNVDVQDYIDKRKNFTTEE WLDFMMHTVGLNPDTLNRREKFITLARLLPHVENNFNFMELGPKGTGKSHVFQELSPYGV LVSGGDVTSPRLFVKMSGNKEILGLVGYWDVVAWDEFEQQKGRAVDAVLIDTMQNYLANK SFNRGKGTHEASASICFVGNTKHTVPYMLKNSHLFESIPTSFIKGAFLDRIHLYNPGWEI RMLKKDSFSKGYGLITDYIAAIMHELRNKDLTALLKDYAKFDGSLSERDHLAIRKTFSGM VKLIYPDGNMTDEEAFELIDFAAESRKRVKDQLYIVDETFNAEPAKFRYVREKDGVEINV ETLEKVSNALIIPNNIHQESTIMANLDDSDIPLFQEGTETTMQVEPSSERQSRKRFIPLQ EKNQFFRLGQTGVTYQSLFAEYLKDATEITIEDPFIRTSWQVKNLMEFLTMLIDTRDVDD LKVHLVTNEEEDKLPDLIDKLDDIKNDMIGYGIEFDYNFREFHDRCIKADNGWVISMGRG LDIYEKYSPYSVAATRPDKRKCKEFMVTFTKDK >gi|225935364|gb|ACGA01000028.1| GENE 60 97257 - 99797 1909 846 aa, chain - ## HITS:1 COG:no KEGG:Slin_4848 NR:ns ## KEGG: Slin_4848 # Name: not_defined # Def: PglZ domain protein # Organism: S.linguale # Pathway: not_defined # 3 846 5 868 868 295 26.0 6e-78 MTQERIYSYFQRNPQLHVLFIFDKANIIMNDLADCSWETEYIYKVFDGAWFNTKYNIEYA WKEKRVVLLFPLGTYPMSEEQQLRFPLMDMLKANMEYKEEDYAAFMQQYKLPEKYRAFIS RHIGELMSNKINAMLKDRFMPEAFSEDVVVRGFISSYLGEKRLLEWENIIIRMFILGLDS ENKKRLDFYHKLERNKDAKTAVDERLTKIFGFSYKPNQEAKVKELVESLKYNSITQLLDV IADDSYKAYKIKSSIALEQMNRIYELGTRDREFADKFTKVMKELGADIREKELTTIYGMD ASFYYLTEELGWPILQEIAGSKFVTEPAEMQERLRLLSQKLPADSVLQQAISFLMQMAFY YEMVRGLGSLKLNTPEAYVQLYTNDLYRLDTFYRCALEEYHELLSKDVPILTCLNGLKQQ FDGEYARMVNVFNLEWMICVIEKGNYFNDLSLKKQEDFYANECVSNSKQVVIISDALRYE VAAELMQELAKEKHIAKLSAYRAMLPTETKYCKPALLPHTSLIWKNKEMLVDGEVLDTLE SRSAQVAKYKESACCVDYETVIKADVKTARALFKSPLVYIFHDTIDAASHGAGAGDVIAA CRKAIEQLAVLVRRLHASWNVTNVVLTADHGFLYNDVEFAEKDKHAVTVAGIIEKKTRYY VSDQVSAQEGVVTMPLDKVSGMKAETPIYIGVPMGTNRLAASGGYSFAHGGATLQEMLIP VIHSSQKRSDKTNKVGVALVDHNLVMVSSRLKFQLIQSEAVSMTVVERKVVCQVYQGDTP VTGKQTITLDSADTINLNNRVYEVVLILNHSVHSGMLQLRVYDEEDCLNPLIREVVKNNT MIEQDF >gi|225935364|gb|ACGA01000028.1| GENE 61 99794 - 101392 1015 532 aa, chain - ## HITS:1 
COG:Cgl0025 KEGG:ns NR:ns ## COG: Cgl0025 COG2865 # Protein_GI_number: 19551275 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Corynebacterium glutamicum # 34 380 44 379 563 105 27.0 2e-22 MTAEDIQKLRDLGEVSKVQFKERILDKYDIGCELVAMSNARGGQLVIGINDKTGAFNPLS YAEVQETTNLLSNMASENVVPNILLEIDTVSVEGGSVVIATIKEGINKPYHDNKGIVWMK NGADKRKVFDNAELAEMMTECGSFTPDEAAVKDAMLDDLDEDTIKQFLQNRFAMVLEKKG MIGDALQEASLDEIANTIAKGHDLNKLMRNLRFIRPDGSLTVAAILLFGKYTQRWLPVMT AKCISYVGNSIGGKVFRDKVNDAVMEGNLLHQFETIMSFFTRNLKNVQVKKEFNSLGELE IPYVSLMEFVVNALVHRSLNWKAPIRIFIFDNRVEIHSPGTLPNGLQVEDITNGTSMPRN NFLFSNAIYLLPYTGAGSGIQRALEEGVQVEFTNDERIHEFLITIKRNKDLDSNLADSDT TPNTFPEHLDTTPNTSDECPNTFSEHLDTTPNTSDECPNTHQMKLSNKQKDIVNFCSVPR SSREILERVGVKYHNSNIKRYVTELVEAGYLERTIPDNPFDMNQKYKKKQSL >gi|225935364|gb|ACGA01000028.1| GENE 62 101427 - 101663 287 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171314|ref|ZP_05757726.1| ## NR: gi|260171314|ref|ZP_05757726.1| hypothetical protein BacD2_05560 [Bacteroides sp. 
D2] # 1 78 1 78 78 112 100.0 6e-24 MKLIFEKNEKGDIEVQIQKGTNLISFNYVEMLKQLLDDNTIAEDCEFENLDEEEISVLKN MLNDITKVISDGQGTQSE >gi|225935364|gb|ACGA01000028.1| GENE 63 101660 - 103999 1093 779 aa, chain - ## HITS:1 COG:no KEGG:TherJR_1080 NR:ns ## KEGG: TherJR_1080 # Name: not_defined # Def: phosphoesterase PHP domain protein # Organism: Thermincola_JR # Pathway: not_defined # 2 772 12 885 887 95 21.0 7e-18 MNPIYTDIHIHTSENPDNLNMDYDWRLLLEKVKVRACGNDLLLSLTDHNTINKSVYIDIM AHISDYPMLHLLLGVELHIKYRNNCPAYHCHAFFKNKIDVENIDRINVILDKLYPHKLVE KKDESIPTLESIIHELDEYDYIMLPHGGQSHATFNKAIPRDTKFDTTMERSIYYNQFDGF TARSKDKLEDTTEYFQKLGIKDFVNLITCSDNYIPENYPSAKATDAAPFIPTWMLAEPSF DGLRMALSESTRLIYSDNKPESWAEYIRSAKLKSDFADIDVIFTPGLNVIIGGSSSGKTL LVDSLVKKIKGGENPSIFNTSNYKKFEVEHITIDNPTGRHPHYIEQNYIMQIVNSPNSSE LTSIDILKSLFPKDEDFAQKVEEGIATLRKDVTTLMKCVERIEEIEGELNNLSQIGKLLV KGEIRKNVIVTFRSQDKERENIKYSRNKYEEDVSSLSVIKKFLKENPLVIDCDNEVEIIL DRLSQAFLYAQVDEGVSNVIDRHYRNYNEKLKDRNSESQSKAQEFQQVLALIKEYAVKAK LFEQTLQKIASYNVKEMSKQVVAMGHTLDVSNRFNLTKDKVRNAFNEFLKSEFRFASYID ITPDQLYKSHFSQRPKVEGYEQFIEKVVGIFMIENKLTYQITTKDGQDFHNLSAGWKTSV LLDLILGYEKDTAPIVIDQPEDNLATSYINDGLVQAIKSVKSSKQVILVSHNATIPMMGD AQNIIYCRNENDRIIIRSAALEGQINGEPVLDLVAKITDGGKASIKKRVKKYNLKKFRQ >gi|225935364|gb|ACGA01000028.1| GENE 64 104007 - 107522 2105 1171 aa, chain - ## HITS:1 COG:STM4495 KEGG:ns NR:ns ## COG: STM4495 COG1002 # Protein_GI_number: 16767739 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Salmonella typhimurium LT2 # 1 1171 1 1215 1225 620 33.0 1e-177 MDTNRLKKFAIEARNILKAGIAAKLTSLGFDNKGEVAENLRPQLLQGGTLWNGRTLSEGF FHQWMRLYEEVQAKGVNEVYEQAAYTWFNRLVAIRILQKNGLCEPVSTYADSAHTPQIVN QARQGLFPPMEEASKRHLLELLDDDTKVTEQFAVLLVAWCHANPILQRCFGTMEDYTELL LPANILAENGFIDLLNHTEFITDEEFRSPELIGWLYQFYISERKDEVFAKKGKFEADEIP AATQIFTPNWIVKYMVQNTVGRIYLDNNPYTILTYKEKWQYLVEPAEPTPAEAILHYNEL TDLKVADLACGSGHILNECFDLLYDLYITEGYTRVEAITHIFQDNLTGIDLDPRAQQLSL FALTLKACQRDHSFADAHCLPRVLTMPKVQLSLPTDTRLAVEIDAVNTLLKDADSLGSIM 
KFDLTSVAREWVASLVAENEAAALVIALTEQYDALVMNPPYMGSSKMNKNLYNYIQSKYP DGKTDLCNTFMFVGMNSLSPNGKMGMINMQSWMFQSSFLNLRKYFLNSYNFDSLLHLEKH FFEELSGDVVKVVSFICSKHPPLYNGVYFRLTDGKCEDKRLNFLSKSNVFTNVSQFGFES IPGIPMGYWATSKMLKCWDKYTQISTKMVTREGMAPASSDRFIRLWHEVSKRKTDFVLSD NKSTSQKKWFPYNKGGEKRKWYGNFEYLVEYENDGYNIKHNYDIKTGRLRSHNYNGDLAF KEGITWSVLCDAAFAVRYTPCGFLWDSAGAMGFTEENLVYYLGLLNSNVIDKFLRVLAPG LKFNVGDINRLPLIEKTSAVIEYAVQHSVYISKLDWDSHETSWDFQQNGLISLIDFAEQD LTTESLIALCSTTAHSDAVTQQADKNLRAVRLQDLMERYHQKWSTLFMQLHANEEELNRQ FIDIYGLQDELTPDVPLNEITILQQDEVSIEGNALKWHDEVILKQLFSYAAGVWMGRYRL DKSSLHIAHPNPTAEELAPYTYNGHTIEIDDDGIFPLMAADSGFTDNACLRTAQFVSDVF GAEYQVENLNYIEHTLGKTIEQYWQKDFWKDHKKMYQNRPIYWLFASKKGSFQCIAYLHR MTPYTPERIRSKYLLPFIETLGRKIADLTTRAADLSTAERKKLDTLNKQLEECREYHDRL AVVAEQAIPLDLDDGVVVNYAKLGDVVSKLK >gi|225935364|gb|ACGA01000028.1| GENE 65 107544 - 111152 3179 1202 aa, chain - ## HITS:1 COG:no KEGG:Slin_4844 NR:ns ## KEGG: Slin_4844 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 3 1199 5 1184 1186 469 29.0 1e-130 MQLQDLYIKSINRPVNPAVSATKFDKETIDIEIKEYVFTDEILNGLYRILNAIKNNKPYD HVGIWIDGYYGSGKSHFLKYLDYCITPETRDVALERLLDAVKEIDPFDGKHNLQFEFSQV QDIASWLKTATIDTCIFNLETSYNAATDKKKAFLQVFWDEFNGKRGLNKFNLTLAQLLEK PLMERGLLEQFKEVIAEMGGDWNNPSEAADLVDQALEMVVDAACELNPQFDRNSLYNHID KRDYAISIDRFGMELADYIKDKGEDYRLILLADEVSQFINKERDRYLNLQEIITKLSESC HNQVWVACTAQQDLSEIMDDCNIGEEKDKEGKIKGRFEVKVSLKGTQPEVITQKRILEKK PEVKTYLAAMYEQQQGAFSLQFKLPNAYKSYDTETDFIDFYPFVPYQFKLIMQVFDSFLA LGYVAKEVKGNERSIIKVIHTTAKAPNNAQAEVGKFVSFDELYNNMFEEGLQARGQKAVD NAIRIARTYTDPKLAVRVANVLFMVCNISQTDQLVFPATLDNITTLLINDMTTPRLNLKN EVEKVVDFLCDNNIIRREQGRQGAPDFFSFYSEEEMKVAELIKSQTADNNLQAGELKEIF NRYLKGLKNKEQYKTRSFAVGMTINQRNYLSNNPDVVLEFMMDDDYDAPEQLALQNQTNR LVYYVGPQFRANARLRNNFYWYCKVQSYMTANPATSDENAAVREEFKKRASEVMTGIIER EFQKILDTCPLVSGLSVIDDVELGNKKGNDRYLAAIEKHLSGIYTKAGLVDLPQIPRNSD KLKIAILRPVNPGDYEGLNDSLTEAEHEVEIYLNKQYEELNVGDVVAKFSKAPFGWDGIC TLYVVNELVRRHRRDYSYSNNPNVEVSIVANRLLSETNKFTIRPAITISSEVINGFIAAW 
KNIFGVSESFSTTDSTQLFRLSREKENVRSVENVVARYKQHYKDNAAYSFSAPLDEMVTL LEGWLAERDPLKYFRRVIDEQQQARTLFDTCKEVVQFVQDQMGNYKDVLHFVRDNRDNFS FVPIELHEAVTQLMGIENAAWPIGIRTYMKLRNTLAGAIDAKRTELRKQIAEEYAKTYAQ LEQGCKENEVSTSILSDQEVIVKLKTQPDSIPVLQTNLDTNTFFAEQTAKILAAKPKPKP TVGDDKNKKEEIPAKKYPTQVALSTRTTLPIGSEADVDKYLAQLKKQLMAHIDKGETVMI IK >gi|225935364|gb|ACGA01000028.1| GENE 66 111156 - 111767 564 203 aa, chain - ## HITS:1 COG:no KEGG:Slin_4843 NR:ns ## KEGG: Slin_4843 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 5 200 7 198 202 110 32.0 3e-23 MTIKELDDKLNSPGFQDTENGDLFYNFFIYQYPAEKEYDIRKQIQEFKQNLIRPTNYVDV LTLDLFEEFCNYLNGQKFLKFDSMLQYLLEKEQEQPDTSDKIFSTLERHAHSEKFVKYLH ERIVAHITIDDEYRRPYVFIYGIGTMYPYLRANELLAKYEDYNETGRYKIILFYPGHREQ SSFKLFDTLNDHHTYRATFLINE >gi|225935364|gb|ACGA01000028.1| GENE 67 111764 - 112393 420 209 aa, chain - ## HITS:1 COG:no KEGG:Slin_4842 NR:ns ## KEGG: Slin_4842 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 9 204 8 205 211 89 28.0 9e-17 MAMKINSPYTAAITGGGFLLNETLALLPLLQSEDREELLKDERLNNRVLMINAETSRKKA ILEISRRYDTMSPAFWQDFLVMTAAEQPIALFFVMLKTYKILFDFHVNVTIRRWNSVAKS VKREDILMEFNEISARDAFVDSWSEATKDKVISAYLSILRKIGILDRTNQLQVLDCTNYP YYLQIGESWFLEACLLQPYQIEKIKNSLA >gi|225935364|gb|ACGA01000028.1| GENE 68 112713 - 114626 2568 637 aa, chain - ## HITS:1 COG:TP0216 KEGG:ns NR:ns ## COG: TP0216 COG0443 # Protein_GI_number: 15639209 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Treponema pallidum # 1 599 1 599 635 683 63.0 0 MGKIIGIDLGTTNSCVAVFEGNEPVVIANSEGKRTTPSVVAFVDGGERKVGDPAKRQAIT NPTRTIFSIKRFMGETWDQVQKEVTRVPYKVVKGDNNTPRVDIDGRLYTPQEISAMILQK MKKTAEDYLGQEVTEAVITVPAYFSDSQRQATKEAGQIAGLEVKRIVNEPTAAALAYGLD KAHKDMKIAVFDLGGGTFDISILEFGGGVFEVLSTNGDTHLGGDDFDQVIINWLVQEFKS DEGADLTQDPMALQRLKEAAEKAKIELSSSTSTEINLPYIMPVGGVPKHLVKTLTRAKFE SLAHELIQACLEPCKKAMSDAGLSNADIDEVILVGGSSRIPAVQKLVEDFFGKTPSKGVN 
PDEVVAIGAAVQGAVLTDEIKGVVLLDVTPLSMGIETLGGVMTKLIDANTTIPARKSETF STAADNQTEVTIHVLQGERPMAAQNKSIGQFNLTGIAPARRGVPQIEVTFDIDANGILKV SAKDKATGKEQAIRIEASSGLSKEEIEKMKAEAEANAEADKKEREKIDKLNQADSVIFST ENQLKELGDKLPADKKAPIEAALQKLKDAHKAQDLAAIDTAMAEINTAFQAASAEMYAQS GAQGGAQAGPDMNGAAGQQDNSKHGDNVQDADFEEVK >gi|225935364|gb|ACGA01000028.1| GENE 69 114955 - 115743 663 262 aa, chain - ## HITS:1 COG:DR1940 KEGG:ns NR:ns ## COG: DR1940 COG3187 # Protein_GI_number: 15806938 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Heat shock protein # Organism: Deinococcus radiodurans # 30 222 182 365 403 71 28.0 2e-12 MKKVFVSICIAGAALAMSSCRSVEKAIPLSSINGEWNIIEVNGSKITPGESKSLPFIAFD TATGRVSGNSGCNRMMGSFDVNAKPGSLELTGMASTRMMCPDMTTENNVLNAFAQVKGYK KAGKDKVYLCNSSNRPVVVLQKKEADVKLSVLNGEWKIKEVNGEAISSGMEKQPFIAFDV KKKTIHGNAGCNLINGGFETSTSNAKSISFPGVASTMMACPDMETEGKVLKAINEVKSFD VLSGGGIGLYDANGALVIVLEK >gi|225935364|gb|ACGA01000028.1| GENE 70 115889 - 117172 843 427 aa, chain + ## HITS:1 COG:no KEGG:BT_4613 NR:ns ## KEGG: BT_4613 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 637 71.0 0 MRIIVLSLILFCCGTSPITAQSDYIVTTPSAQEIPVGQEEQFIKSNFPLLPLGKWTPGMK FMFVPSPRSMFLPTLSSYETEKGVDNSLLKHKILTFTGTEEKAQNIPNGTNYSTRFIFEC EGGKYYYEIKNMRLEEISEKAPRAGINGLVYLKDVDTAKELLVGKTVYIQAESVRIDDAN NYSGYRDIAIPVNTEATITAIGVGSQAYPAKIVFKDTQGHSYYLEVALSRTNSGMDLNDF QGEKRMKYFSNAFSFTNKSLGTIESLKNKYMGMTVYPKKVLPAKRIISFEDKQTESRVHL PRYTVLQIKDIRLSPPGSLAVLSLEDKDGAIYELETDLKYDVIVRNENYIEDFFGFEDIH KKYPGITENRWQIISRGDLETGMSTVECRLSIGDPIEIELKKDNRFETWFYNGKTLEFEN GTLQRYK >gi|225935364|gb|ACGA01000028.1| GENE 71 117270 - 118463 1213 397 aa, chain + ## HITS:1 COG:slr0049 KEGG:ns NR:ns ## COG: slr0049 COG1748 # Protein_GI_number: 16331467 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Synechocystis # 1 389 1 388 398 555 66.0 1e-158 MGRVLIIGAGGVGTVVAHKVAQNADVFTDIMIASRTKEKCDKIVEAIGNPNIKTAKVDAD 
NVDELVALFNDFKPEMVINVALPYQDLTIMEACLKAGVNYLDTANYEPKDEAHFEYSWQW AYHDRFKEAGLTAILGCGFDPGVSGIYTAYAAKHYFDEIQYLDIVDCNAGNHHKAFATNF NPEINIREITQNGRYYENGKWVTTGPLEIHKDLTYPNIGPRDSYLLYHEELESLVKHYPT IKRARFWMTFGQEYLTHLRVIQNIGMARIDEVDYNGVKIVPLQFLKAVLPNPQDLGENYE GETSIGCRIRGLKDGKERTYYVYNNCSHQEAYRETGMQGVSYTTGVPAMIGAMMFFKGEW KRPGVNNVEEFNPDPFMEQLNKQGLPWHEEFDKNLEL >gi|225935364|gb|ACGA01000028.1| GENE 72 118493 - 118948 600 151 aa, chain + ## HITS:1 COG:HI0254 KEGG:ns NR:ns ## COG: HI0254 COG1225 # Protein_GI_number: 16272212 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Haemophilus influenzae # 2 150 4 150 155 143 45.0 1e-34 MINVGDKAPEILGINEKGEEIRLSAYKGKKIVLYFYPKDSTSGCTAQACNLRDNYSELRK AGYEVIGVSVDNEKSHQKFIEKNNLPFTLIADTDKKLVEEFGVWGEKKLYGRAYMGTFRT TFLINEEGIVERIITPKEVKTKEHASQILNQ >gi|225935364|gb|ACGA01000028.1| GENE 73 118975 - 120009 1362 344 aa, chain + ## HITS:1 COG:AGc3441 KEGG:ns NR:ns ## COG: AGc3441 COG0468 # Protein_GI_number: 15889174 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 334 66 384 416 427 66.0 1e-119 MAKKDELNFETDNKMASSEKLKALQAAMEKIEKSFGKGSIMKMGDDSVEQVEVIPTGSIA LNVALGVGGYPRGRIIEIYGPESSGKTTLAIHAIAEAQKAGGIAAFIDAEHAFDRFYASK LGVNIDDLYISQPDNGEQALEIAEQLIRSSAIDIIVIDSVAALTPKAEIEGDMGDNKVGL QARLMSQALRKLTAAVSKTRTTCIFINQLREKIGVMFGNPETTTGGNALKFYASVRLDIR GSQPIKDGEEILGKLTKVKVVKNKVAPPFRRAEFDIMFGEGISHSGEIIDLGAELGIIKK SGSWYSYNDTKLGQGRDAAKQCIMDNPELAEELEGLIFEELKKK >gi|225935364|gb|ACGA01000028.1| GENE 74 120126 - 121118 977 330 aa, chain - ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 45 325 34 331 339 133 33.0 5e-31 MISTATKTLQANNKMIYIALLSTLTFFLFLDYIPGLEAWSAWVTPPVALFLGLIFALTCG QAHPKFNKKTSKYLLQYSVVGLGFGMNLHSALASGKEGMEFTVISVIGTLVIGWFIGRKL 
FKIDRNTAYLISSGTAICGGSAIAAVGPVLKAKDSEMSVALGTIFILNAIVLFIFPAIGH ALNMDQQQFGTWAAIAIHDTSSVVGAGAAYGEEALKVATTIKLTRALWIIPMAFATSFIF KSKGQKISIPWFILFFVLALVVNTYLLDGVPQLGAAINGIARKTLTITMFFIGASLSIDV LKAVGIKPLIQGILLWIIISLSTLAYIYFV >gi|225935364|gb|ACGA01000028.1| GENE 75 121239 - 121613 280 124 aa, chain - ## HITS:1 COG:MT0892.1 KEGG:ns NR:ns ## COG: MT0892.1 COG3304 # Protein_GI_number: 15840283 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 1 124 1 123 129 89 44.0 2e-18 MGCLMNLLWLLLGGIFTAVEYLISSLLMMLTIIGIPFGMQTLKLAGLALWPFGKEVRSGN RSGGCLYILMNILWIFLGGIWICLSHLVFGAILCITIIGIPFGLQHFKLASLALSPFGKD IVTV >gi|225935364|gb|ACGA01000028.1| GENE 76 121632 - 122048 381 138 aa, chain - ## HITS:1 COG:SMc01703 KEGG:ns NR:ns ## COG: SMc01703 COG5579 # Protein_GI_number: 15964216 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 4 133 6 136 143 125 47.0 3e-29 MTNYNLQRFLDAQRGDYEQALAEVRNGRKYSHWIWYIFPQLKGLGMSYNSQYYGLSGKEE AEAYLAHPILGERLREITSVFLQLKNKTAQEVFGSLDAMKVLSCMTLFNEVASDDLFQQV IGHYYQGKVDETTKRKLE >gi|225935364|gb|ACGA01000028.1| GENE 77 122122 - 123054 432 310 aa, chain - ## HITS:1 COG:PA4794 KEGG:ns NR:ns ## COG: PA4794 COG0454 # Protein_GI_number: 15599988 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Pseudomonas aeruginosa # 154 310 2 158 160 141 40.0 2e-33 MSHTNIMQAEENDLVNILSLQKKSFMVVAERMNKFDLPPLLQNQDEICDEYQTGIILKYV SDDGQIVGSVRGCMDENRVCHIGKLIVHPDFQNKGIGRELMTEIERLFPHCHKFSLFTGE ETPNTLHLYSKVGYNIVCRKEMEGISMIIMEKEKLTYRDFTFDDLETVCKLPRNKLELFF MFPKADFPLTIDQLKNTIKDRFDSTVVLFNNEVVGFANFYEVRESQYCAIGNVIVSPCFR NRGVGTFLINAMEDIGKKKYNVSELHLSCFDANTSGLLLYTKLGYKPYEIEKYINKENEV SALIELKKVL >gi|225935364|gb|ACGA01000028.1| GENE 78 123446 - 124072 425 208 aa, chain + ## HITS:1 COG:no KEGG:BT_4601 NR:ns ## KEGG: BT_4601 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 212 343 82.0 2e-93 MARYIIEEMPDLQKTGKRITYPKFARIDNTSIKELAQRVGDVSGFSAGDIEGVLLQTAIE MAHLMAEGRSVKIDGIGTFTPSLTLGRDKEREDPEEGGSHRNAQSIFIGGVNFRVDREMV RNISERCQLERAPWKKQRSSNKFTSEQRLALALKYLDEHPFLTVWEYRKLTGLLQTAATN ELRQWGHQPDSGIEITGRGAHRVYIKKK >gi|225935364|gb|ACGA01000028.1| GENE 79 124301 - 125197 668 298 aa, chain - ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 6 275 6 272 299 142 30.0 1e-33 MSDFRLKVFQSVAKNLSFTKASQELFVSQPAITKHIQELETYYQARLFDRQGSKISLTKA GELLLKHSEKILDDYKQLEYEMHLLHNEYIGELKLGASTTIAQYVLPPLLANFIAKFPQI NLSLINGNSRGVEAALQEHRIDLGLVEGIFRLPNLKYTPFLQDELVAVVHTQSKLAVTDE ITPEDLLNIPLVLRERGSGTLDVFERALLQHNLKLSSLNVLMYLGSTESIKLFLEHTDCM GIVSIRSVHKELVAGHLRVVEIKGMPMLREFNFVQLQGQEGGLSQVFMRFAGHHSKSL >gi|225935364|gb|ACGA01000028.1| GENE 80 125268 - 125855 459 195 aa, chain - ## HITS:1 COG:no KEGG:BT_4598 NR:ns ## KEGG: BT_4598 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 194 1 194 194 332 88.0 3e-90 MSDKPQIKLAETVVLIDAAFLNFVITDIKGYFEETLHRSLQEIDLSMLTTYITLDAGITE GKNEVQFLFVYDKESSRLQYCQPSGLQEELNGVAFQSPYGEYSFASVPSEGMVSREDLFL DLLSIVSDSADVKRMIVISFNEEYGKKVTDALHEVKGKEVIQFRMNEPELPVDYKWDMLA FPVMQALGIKAEELQ >gi|225935364|gb|ACGA01000028.1| GENE 81 126141 - 128729 2616 862 aa, chain + ## HITS:1 COG:slr1641 KEGG:ns NR:ns ## COG: slr1641 COG0542 # Protein_GI_number: 16331048 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases with chaperone activity, ATP-binding subunit # Organism: Synechocystis # 4 860 7 862 872 881 56.0 0 MNFNNFTIKSQEAVQEAVNLVQSRGQQAIEPVHIMQGVMKVGENVTNFIFQKLGMNGQQV ALVVDKQIDSLPKVSGGEPYLSRESNEVLQKATQYSKEMGDEFVSLEHIILALLTVKSTV STILKDAGMTEKELRNAISELRKGEKVTSQSSEDTYQSLEKYAINLNEAARSGKLDPVIG RDEEIRRVLQILSRRTKNNPILIGEPGTGKTAIVEGLAHRILRGDVPENLKNKQVYSLDM 
GALVAGAKYKGEFEERLKSVVNEVKKSEGDIILFIDEIHTLVGAGKGEGAMDAANILKPA LARGELRSIGATTLDEYQKYFEKDKALERRFQIVQVDEPDNLSTISILRGLKERYENHHH VRIKDDAIIAAVELSSRYITDRFLPDKAIDLMDEAAAKLRMEVDSVPEELDEISRKIKQL EIEREAIKRENDKPKLEIIGKELAELKEQEKSFKAKWQSEKTLMDKIQQNKVEIENLKFE AEKAEREGDYGKVAEIRYGKLQALDKEIEDTQKQLRDMQGDKAMIKEEVDAEDIADVVSR WTGIPVSKMLQSEKDKLLHLEEELHQRVIGQDEAIEAVADAVRRSRAGLQDPKRPIGSFI FLGTTGVGKTELAKALAEFLFDDETMMTRIDMSEYQEKHSVSRLVGAPPGYVGYDEGGQL TEAIRRKPYSVVLFDEIEKAHPDVFNILLQVLDDGRLTDNKGRVVNFKNTIIIMTSNMGS SYIQSQMEKLHGSNKEEVIEETKKEVMNMLKKTIRPEFLNRIDETIMFLPLNEKEIKQIV LLQIKGVQKMLAENGVELKLTEGALDFLSQVGYDPEFGARPVKRAIQRYLLNDLSKKLLS QEVDRSKAITVDAGGDGLVFRN >gi|225935364|gb|ACGA01000028.1| GENE 82 129510 - 129938 582 142 aa, chain - ## HITS:1 COG:no KEGG:BT_4595 NR:ns ## KEGG: BT_4595 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 1 142 144 221 85.0 7e-57 MLKTILSISGKPGLYKLISQGKNMLIVESVNAEKKRFPAYGNEKIISLADIAMYTDDAEV PLYDVLEAIKEKEKSAQASIDPKKATPEQLREYLAEVLPNFDRERVYVADIKKLVAWYNI LISNGITEFKPEGEVKEEAVAE >gi|225935364|gb|ACGA01000028.1| GENE 83 129959 - 130576 374 205 aa, chain - ## HITS:1 COG:lin1598 KEGG:ns NR:ns ## COG: lin1598 COG0237 # Protein_GI_number: 16800666 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Listeria innocua # 1 195 1 199 200 107 33.0 2e-23 MSIKIGITGGIGSGKSVVSRLFGIMGIPVYISDIEAKRITQTDPVICQELCALVGQDVFQ NGVLNRSLLASYMFGHPDRVQKVNEIIHPQVKEDFRRWAARFGGEQLVGMESAILVEAGF RSEVDFLVMVYAPLEVRVERTIKRDCSSREQVMKRIEAQMSDEVKRSHADFVIVNDDETP LIPQVLELISLLSKNNHYLCSAKNN >gi|225935364|gb|ACGA01000028.1| GENE 84 130566 - 131579 595 337 aa, chain - ## HITS:1 COG:no KEGG:BT_4593 NR:ns ## KEGG: BT_4593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 335 1 335 337 544 86.0 1e-153 MFDRRKIEYTYLKLSKKIKDFLLSDKSREFLIFLFFFLIAGGFWLLQTLNSDYEAEFSIP VRIKDVPNNVVLTSEPPSELRVRVKDKGTVLLNYMLGKSFFPVNLGFLDYKGKDNHVKIY 
ASDFEKKILSQLNVSSKILSIKPDTLEYIYSEGKSKLVPVRFQGKVTAGLQYYVSDTICN PDSVLVYAPEGILDTITTAYTQKINLENISDTTRRRIPLNSERGVKFVPASVEMIFPVDI YTEKTVEVPLHGVNFPADKALRTFPSKVQITFQVGLKRFRSIKASDFIINISYEELLKLG SDKYTVKLKSFPSGINQIRIVPEQVDFLIEQITSNVD >gi|225935364|gb|ACGA01000028.1| GENE 85 131600 - 131923 480 107 aa, chain - ## HITS:1 COG:XF0224 KEGG:ns NR:ns ## COG: XF0224 COG1862 # Protein_GI_number: 15836829 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Xylella fastidiosa 9a5c # 10 95 21 107 120 75 43.0 2e-14 MNVLTVLLQAPAGAAGGGSMMWIMLIAMFVIMYFFMIRPQNKKQKEIANFRKSLQVNQNV ITAGGIHGTIKEITDDYIVLEIASNVKIKIDKNSIFADASAASNQSK >gi|225935364|gb|ACGA01000028.1| GENE 86 131963 - 132889 1006 308 aa, chain - ## HITS:1 COG:TM1765 KEGG:ns NR:ns ## COG: TM1765 COG0781 # Protein_GI_number: 15644510 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Thermotoga maritima # 197 298 32 133 142 63 36.0 6e-10 MINRVLIRLKIIQIVYAYYQNGSKNLDSAEKELFFSLSKAYDLYNYLLMLMIALTEYAQK RIDAAKAKLAPTKEELYPNKKFVDNKFIAQLEVNKQLVEFIANQKRTWANDQDFIKELYE KIVETDIYKDYMASDDHSYEADRELWRKLYKTYIFNNDSLDQVLEDQSLYWNDDKEIVDT FVLKTIKRFDEKKGANQELLPEFKDDEDQEFARRLFRRTILNSDYYRHLVSENTKNWDLD RIAFMDVIIMQTALAEILSFPNIPVSVSLNEYVEIAKLYSTAKSGSFINGTLDGIVNQLK KEGKLTKN >gi|225935364|gb|ACGA01000028.1| GENE 87 132970 - 133098 76 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153807761|ref|ZP_01960429.1| ## NR: gi|153807761|ref|ZP_01960429.1| hypothetical protein BACCAC_02044 [Bacteroides caccae ATCC 43185] # 1 42 1 42 42 63 97.0 4e-09 MQYLILCAFSFDSLKKISNFEADYKRTERYGRFKKENERRHE >gi|225935364|gb|ACGA01000028.1| GENE 88 133058 - 133459 471 133 aa, chain + ## HITS:1 COG:no KEGG:BT_4590 NR:ns ## KEGG: BT_4590 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 134 134 158 81.0 5e-38 MEDLKKKMSADMNDKEIVFSKSIKAGKRIYYLDVKKNRKDEMFLAITESKKVITGEGDDS 
QVSFEKHKIFLYREDFQKFMAGLEEAVNFIECSDANEYIARLNTEADEESEQKAIEEARA NKLESEIKIDIDF >gi|225935364|gb|ACGA01000028.1| GENE 89 133597 - 134187 969 196 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160887365|ref|ZP_02068368.1| hypothetical protein BACOVA_05384 [Bacteroides ovatus ATCC 8483] # 1 196 1 196 196 377 99 1e-103 MKSIEVKGTARTIAERSSEQARALKEIRKDGGVPCVLYGGNEVVHFTVTNEGLRNLVYTP HIYVVDLVIDGKKVNAILKDIQFHPVKDTILHVDFYQIDEAKPIVMEVPVQLEGLAEGVK AGGKLALQMRKIKVKALYNVIPEKLIVNVSHLGLGKTVKVGELSFEGLELISAKEAVVCA VKLTRAARGAAAAAGK >gi|225935364|gb|ACGA01000028.1| GENE 90 134231 - 134890 503 219 aa, chain + ## HITS:1 COG:slr0922 KEGG:ns NR:ns ## COG: slr0922 COG0193 # Protein_GI_number: 16331675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Synechocystis # 31 219 1 190 194 140 41.0 2e-33 MQIKADVQSALICINLRFSFYINELFYGKTMIKYLVVGLGNIGPEYHETRHNIGFMTVEA LARINNAPPFMDGRYGFTTSFSIKGRQLILLKPSTFMNLSGLAVRYWMQKENIPLENVLI VVDDLALPFGTLRLKGKGSDAGHNGLKHIAATLGTQNYARLRFGIGSDFPRGGQIDYVLG HFTDEDWKTMDERLEMAGEIIKSFCLAGIDITMNQFNKK >gi|225935364|gb|ACGA01000028.1| GENE 91 134992 - 135408 630 138 aa, chain + ## HITS:1 COG:Cgl2072 KEGG:ns NR:ns ## COG: Cgl2072 COG1188 # Protein_GI_number: 19553322 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Corynebacterium glutamicum # 5 121 10 122 126 92 42.0 1e-19 MAEARIDKWMWAVRIFKTRTIAAEACKKGRITINGSLVKAARMIKPGDVIQVKKPPITYS FKVLQPIEKRVGAKLVSEMMENVTTPDQYELLEMSKISGFVDRARGTGRPTKKDRRELEE FTTPEFMDDFDFDFDFEE >gi|225935364|gb|ACGA01000028.1| GENE 92 135543 - 135968 652 141 aa, chain - ## HITS:1 COG:RSc0455 KEGG:ns NR:ns ## COG: RSc0455 COG0537 # Protein_GI_number: 17545174 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # 
Organism: Ralstonia solanacearum # 30 106 27 103 147 61 35.0 4e-10 MKSDPKECLYCQNNETLHNLMIEIAQLSVSRVFLFKEQTYRGRCLVSYKDHVNDLNELSD EDRNAFMADVARVTRAMQKAFQPEKINYGAYSDKLSHLHFHLAPKYVDGPDYGGIFQMNP GKVYLTDAEYQELIDAVKANL >gi|225935364|gb|ACGA01000028.1| GENE 93 136018 - 138291 1912 757 aa, chain - ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 7 403 6 402 585 352 47.0 2e-96 MSQLPTLIADLALILICAGVMTLLFKKLKQPLVLGYVVAGFLASPHMPYTPSVMDTANIK TWADIGVIFLLFALGLEFSFKKIVKVGGSAIIAACTIIFCMILLGIGVGMGFGWHRMDSL FLGGMIAMSSTTIIYKAFDDLGLRKKQFTGLVLSILILEDILAIVLMVMLSTMAVSQHFE GTEMLESIGKLLFFLILWFVVGIYLIPEFLKRCRKLMGEETLLIVSLALCFGMVVMAAHT GFSAAFGAFIMGSILAETIEAESIDRLVKPVKDLFGAIFFVSVGMMVDPAMIVEYAVPII VITLAVILGQAVFGTFGVILSGKPLKTAMQCGFSLTQIGEFAFIIASLGVSLHVTSDFLY PIVVAVSVITTFLTPYMIRFAEPASTFVDAHLPESWKKIMMRYSSGSQTALNHENLWKKL ILSMVRITVVYSIVSMSIIALSFRFVVPFFKENLPHFWASLLGAVFIILCIAPFLRAIMV KKNHSVEFMTLWHDSRANRAPLVSTIVIRIMIAALFVIFVISGLFKASIGLIIGVAVLVV LLMVWSRRLKKQSILIERRFFQNLRSRDVRAEYLGEKKPEYAGRLLSHDLHLADMEIPGE SSWAGKTLMELNLGKKFGVHVASILRGKRRINIPGGSVRLFPMDKIQVIGTDEQLSVFNE AMQNGAKIDWEVYEKSEMALKQFIIDSDSVFLGKTIRESGIRDKYHCMIAGVESEDGTLM VPDVNAPLEEGDVVWVVGEKEDVYQLVDQKNEKVQAG >gi|225935364|gb|ACGA01000028.1| GENE 94 138443 - 139528 1078 361 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 1 361 4 362 365 333 46.0 4e-91 MKTTPFTEKHIALGAKMHEFAGYNMPIEYSGIIDEHITVCQGVGVFDVSHMGEFWVKGPQ ALAFLQKITSNNVAALAPGKIQYTCFPNEDGGIVDDLLVYRYEPEKYMLVVNASNMEKDW NWCVSHNTEGAELENSSDNIAQLAIQGPKAISVLQKLTDIDLASIPYYTFKVGTFAGEEN VIISNTGYTGAGGFELYFYPSAADSIWKAIFEAGEEYDIKPIGLGARDTLRLEMGFCLYG NDLDDTTSPIEAGLGWITKFVEGKDFINRPMLEKQKAEGVTRKLVGFEMIDRGIPRHGYE 
LVNDEGEKMGVVTSGTMSPTRKIGIGMGYVKPEYSKVGTEICIDMRGRKLKAVIVKPPFR K >gi|225935364|gb|ACGA01000028.1| GENE 95 139597 - 140820 1358 407 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 3 407 4 408 408 498 58.0 1e-141 MTLVDRFLKYVSFDTQSDESTGLTPSTPKQMIFAEYLKKELESLGLEDITLDEHGYLFAT LPANIDKEVPTIGFIAHMDTSPDMTGKDVTPRIVEKYDGSDIVLCAEENVVLSPSQFPEL LDHKGEDLIVTNGKTLLGADDKAGIAEIVSAMVYLKEHPEIKHGKIRIGFNPDEEIGEGA HKFDVEKFGCAWGYTMDGGEVGELEFENFNAAAVKITFKGRNVHPGYAKNKMINSIYLAN QFITLLPSQERPEHTTGYEGFYHLIGIQGDVEQSTVSYIIRDHDRAKFENRKKEMERLVA QMNAEHGAGTATLELRDQYYNMREKIEPVMHIIDTAFAAMEAVGVKPNVKPIRGGTDGAQ LSFKGLPCPNIFAGGLNFHGRYEFVPIQNMEKAMKVIVKIAELVASK >gi|225935364|gb|ACGA01000028.1| GENE 96 140974 - 142383 1118 469 aa, chain - ## HITS:1 COG:MJ0204 KEGG:ns NR:ns ## COG: MJ0204 COG0034 # Protein_GI_number: 15668376 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Methanococcus jannaschii # 1 465 1 456 471 185 30.0 2e-46 MGGFFGTVSKTSCVTDLFYGTDYNSHLGTKRGGLATYSEEKGFIRSIHNLESTYFRTKFE GELDKFKGNAGIGIISDTDAQPIIINSHLGRFAIVTVAKIVNIQELEEELLSQNMHFAEL SSSNTNQTELIALLIIQGRTFVEGIENVFKHIKGSCSMLLLTEDGSIIAARDRWGRTPVV IGRKDGAYAATSESSSFPNLDYEIERYLGPGEIVRMYDDHVDQLRKPNEDMQICSFLWVY YGFPTSCYEGKNVEEVRFTSGLKMGQTDESEVDCACGIPDSGVGMALGYAEGKGVPYHRA ISKYTPTWPRSFTPSNQEMRSLVAKMKLIPNRAMLQNKRLLFCDDSIVRGTQLRDNVKIL YDYGAKEVHMRIACPPLIYACPFVGFSASKNALELITRRIIKELEGDENKNLEKYATTGS PEYDKMVSIIAERFGLSSLKFNTLETLIEAIGLPKCKVCTHCFDGSSHF >gi|225935364|gb|ACGA01000028.1| GENE 97 142548 - 144707 1933 719 aa, chain - ## HITS:1 COG:no KEGG:BT_4581 NR:ns ## KEGG: BT_4581 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 719 1 719 719 1363 90.0 0 MKNMKIGTAIWVCLLLSFFFGTSAKAESITSPDGQLKLNFSVNTQGEPVYELSYKGKEVI 
NPSKLGLELKNDPGLMNGFTLADIKTSAFDETWEPVWGEVKSIRNHYNEMAVTLNQKAQD RNLIICFRLYNDGLGFRYEFPQQKNLNYFVIKEEHSQFAMAGDHKAFWIPGDYDTQEYDY TESKLSEIRGLLQGAITSNASQTPFSPTGVQTSLQMKTADGLYINLHEAALVDYSCMHLN LDDQNLIFESWLTPDAKGDKGYMQTPCRSPWRTVIVSDDARDILASKLTLNLNEPCAYED VSWIKPVKYIGVWWEMITGKSSWSYTDDVYSVKLGQTDYSKTKPSGRHAANNDKVKRYID FASEHGFDQILVEGWNEGWEDWFGNSKDYVFDFVTPYPDFDVKMLNAYAKEKGVKLMMHH ETSASVRNYERHLDKAYQFMIDNGYNAVKSGYVGNIIPRGEHHYGQWMNNHYLYAVEKAA EYKICVNAHEATRPTGLCRTYPNLIGNESARGTEYEAFAGNKPFHTTLLPFTRQIGGPMD YTPGIFDTKLSFLSGDHSFVRTTLAKQLALYVTMYSPLQMAADLPESYERHMDAFQFIKD VAVDWDDSKYLEAEPGDYITVARKAKGTDNWFVGGITDENPRTSAFTLDFLEPGKQYVAT LYADGKEADFEKNPTSYQIKKGLVTNKNKMSVKLARSGGFAVSLIEATPADLKTIKKWK >gi|225935364|gb|ACGA01000028.1| GENE 98 144976 - 145809 876 277 aa, chain + ## HITS:1 COG:no KEGG:BT_4576 NR:ns ## KEGG: BT_4576 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 272 20 273 278 327 63.0 4e-88 MAQHLKLIALLIFTLSASYTSAQSLRKLLERPYGIIESKWEKDAQGATELNYYNEDYCDY YLYRANDRSYNLSNGKNTIFKIEKNAQVDNPFKSASSYTFYRGKFPKDFNIQTPYALPVK NNQKTAWKTDPREPFKTLNFHIKQGDTIYATRSGIACKTSNPQQLLIYHPDQTFAAYLVM NENFIEPGEEVQVGQAIGKAGPTGAAISFFFLDENKFKGGLSSGYPYSHFIPVFRTTEGD VKPEEKTIYQAVTDHDLIMQDMSKRDKKKYMKLHNLK >gi|225935364|gb|ACGA01000028.1| GENE 99 145820 - 146596 408 258 aa, chain + ## HITS:1 COG:no KEGG:BT_4575 NR:ns ## KEGG: BT_4575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 255 1 255 258 329 61.0 5e-89 MRIFLLFFPVFFFCGLLHAQTVKVEYGGDPLPDKDRKKIEQFLQHEVDFYSQFGLPDTLS LQLYVFENRREAIDYLESINVSLPIKASGAYSPKLQKAVILGRENGRERSLAIIYHELSH HFVSQILGKRPPSWLNEGLSEYFEHCTIHKKAVRHTFTEYEQGRVRTMYMLGEVNLPTFL NSSQGKFMKQQMTDEQYSYILSHALVTFWIESVPREIFKKFISVLQNKNDPSTVSEQINL VYPGGFQQFEKDFEAAYK >gi|225935364|gb|ACGA01000028.1| GENE 100 146615 - 146992 312 125 aa, chain + ## HITS:1 COG:AGc2712 KEGG:ns NR:ns ## COG: AGc2712 COG0239 # Protein_GI_number: 15888794 # Func_class: D Cell cycle control, cell division, chromosome partitioning # 
Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 125 10 124 125 74 43.0 5e-14 MSKEVIYIFIGGGTGSALRYFIQLLMHERIVPYHFPWATFTVNLLGSFLIGLFYALSERF HLPFEVRLFLTTGLCGGFTTFSTFSSDGVGLLKGEFYGAFVLYTLLSIGIGLAATLAGGY VGKQI >gi|225935364|gb|ACGA01000028.1| GENE 101 147023 - 147685 428 220 aa, chain + ## HITS:1 COG:CAC1657 KEGG:ns NR:ns ## COG: CAC1657 COG1357 # Protein_GI_number: 15894934 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 56 219 53 216 216 118 39.0 9e-27 MIKRNPTKPVRVVPPMMEEQEVSTTTLQEWLDREETVSHLLFCKGKEEGIDKSYKSFKNC TFQHQTFSECKFRSSQLTDVRFENCDLSNISFAESSLYRVEFISCKLLGTNLSETTMNHV LLHDCNAGYINLAMSKMNQVRFAHSQLRNGSLNDCRFSSVAFESCDLVEADFSHAPLRGI DLRTSRISGITLNISDLKGAVITSLQAMDLLPLLGVIIED >gi|225935364|gb|ACGA01000028.1| GENE 102 147865 - 149145 1607 426 aa, chain + ## HITS:1 COG:SP1128 KEGG:ns NR:ns ## COG: SP1128 COG0148 # Protein_GI_number: 15900994 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Streptococcus pneumoniae TIGR4 # 3 420 4 423 434 614 73.0 1e-175 MKIEKIVAREILDSRGNPTVEVDVVLESGIMGRASVPSGASTGEHEALELRDGDKQRYGG KGVQKAVDNVNKIIAPKLIGMSSLNQRGIDYAMLALDGTKTKSNLGANAILGVSLAVAKA AANYLDLPLYRYIGGTNTYVMPVPMMNIINGGSHSDAPIAFQEFMIRPVGAPSFREGLRM GAEVFHALKKVLKDRGLSTAVGDEGGFAPNLEGTEDALNSILAAIKAAGYEPGKDVMIGM DCASSEFYHDGIYDYTKFEGEKGKKRTSAEQVDYLEELINKFPIDSIEDGMNENDWEGWK KLTERIGNRCQLVGDDLFVTNVDFLAMGIEKGCANSILIKVNQIGSLTETLNAIEMAHRH GYTTVTSHRSGETEDATIADIAVATNSGQIKTGSLSRSDRMAKYNQLLRIEEELGDLAVY GYKRIK >gi|225935364|gb|ACGA01000028.1| GENE 103 149694 - 150245 407 183 aa, chain + ## HITS:1 COG:no KEGG:BT_4565 NR:ns ## KEGG: BT_4565 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 60 183 1 124 124 220 91.0 2e-56 MTENRSYLQKYAMHFGTYMGIYWILKFILFPLGFHIPFLSLLFVILTLSVPFIGYHYAKM YRDKICGGSIQFSHAMLFTIFMYMFASLLVAVAHYAYFQFIDHGFIINSYIQLWDELMTN 
TPALIENKEVIKETIDTARSLTSINITMQLLSWDVFWGSILAIPTALMVMKKAKPENDGV AQS >gi|225935364|gb|ACGA01000028.1| GENE 104 150278 - 151228 802 316 aa, chain + ## HITS:1 COG:mlr7556 KEGG:ns NR:ns ## COG: mlr7556 COG0463 # Protein_GI_number: 13476277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mesorhizobium loti # 3 308 4 301 326 255 42.0 9e-68 MDISVVIPLFNEEESLPELYAWIERVMQANGFSFEVIFVNDGSTDHSWEVIEKLKAQSEH VKGIKFRRNYGKSPALYCGFAEAEGNVVITMDADLQDSPDEIPELYRMITKDGYDLVSGW KQKRYDPISKTLPTKLFNATARKVSGIKNLHDFNCGLKAYRKEVVKHIEVYGEMHRYIPY LAKNAGFKKIGEKVVQHQARKYGTTKFGLNRFVNGYLDLLSLWFLSRFGIKPMHFFGLLG SLMFMIGMISVIIVGASKLYAMYNGLPYRLVTDSPYFYLSLTAMIIGTQLFLAGFLGELI SRNAPERNNYQIEKKI >gi|225935364|gb|ACGA01000028.1| GENE 105 151250 - 151834 320 194 aa, chain + ## HITS:1 COG:no KEGG:BT_4563 NR:ns ## KEGG: BT_4563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 179 183 247 70.0 2e-64 MKNLIRIFLILTIIGAAVSLHSACSDENDCSLAGRPMMYCTIYTINPDNPTIALKDTLDS LTITALGTDSIILNNEKNVHTLMLPLRYTSDTTVFIFRYDPKRRDKDVDTLYIVQQNTPY FQSMECGYMMKQNIISAKFGKPGRPGNPPPYGNGQPDQIDSLHIKNKEANTNEIENLQIF YNYRPERTPDETTN >gi|225935364|gb|ACGA01000028.1| GENE 106 151764 - 152567 599 267 aa, chain + ## HITS:1 COG:no KEGG:BT_4562 NR:ns ## KEGG: BT_4562 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 80 267 1 188 188 344 89.0 2e-93 MKSRTCRYSIIIGQNAHLMKRLISLLLLFCIGLPLLAQQQRPAQTPKRDQKKKEVAEIDT IPLYNGTYVGVDLYGIGSKLLGGDFMSSEVSVAVNLKNKFIPTIEFGMGGTDTWSETGIH YKSKMAPFFRIGVDYNTMAKKKEKNSYLYAGLRYAFSSFKYDVSTMPADDPIWGDVIGNP SLEDGYWGGSVPFSHLGMKGSVQWLEIVLGVKVRIYKNFNMGWSVRMKYKTKASTGEYGD PWYVPGYGKFKSSNMGITYSLIYKLPL >gi|225935364|gb|ACGA01000028.1| GENE 107 152574 - 153155 525 193 aa, chain + ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Campylobacter jejuni # 8 192 8 186 187 119 47.0 3e-27 MTGLEIWLLAIGLAMDCFAVSIASGIILKRTRWKPMLVMAFAFGFFQAIMPFIGWMCAKT FSRLIESVDHWIAFAILAFLGGRMILESFKEEECCKLFNPANPKVVLTMAIATSIDALAI GISFAFLGVQDYTEILPPISIIGFVSFVMSLIGLIFGIQCGCGIARKLKAELWGGIILVI IGLKILIEHLFLQ >gi|225935364|gb|ACGA01000028.1| GENE 108 153186 - 154199 1009 337 aa, chain + ## HITS:1 COG:YPO3234 KEGG:ns NR:ns ## COG: YPO3234 COG1477 # Protein_GI_number: 16123393 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Yersinia pestis # 32 334 27 336 340 202 37.0 8e-52 MKTKKSFLWLAFLILATIWILVRHNQQVDYYSVKGLVFGTVYKITYQHDGDLKPEIEAEL KRFDQSLSPFNDSSVISRVNRNEELVTDSFFQKCFHRSMEISRETKGAFDITVAPLANAW GFGFKKGAFPDSLMIDSLLQITGYEKVKMDNGKVIKQDPRTMLSCSAVAKGYSVDVIAQL LDRKGIKNYMVDIGGEVVVKGKNPSKGLWRIGINKPIDDSLAINQDLQTILEITDLGLAT SGNYRNYYYKDGKKYAHTIDPRTGYPVQHNILSSTVIAKDCMTADALATAFMVMGLEEAE AFCKADTTIDAYFIYSGENGEFKTYYTEGMKKYITTP >gi|225935364|gb|ACGA01000028.1| GENE 109 154267 - 154638 292 123 aa, chain - ## HITS:1 COG:XF0449 KEGG:ns NR:ns ## COG: XF0449 COG3169 # Protein_GI_number: 15837051 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 4 121 9 114 116 90 52.0 6e-19 MKGFYTILLLIVSNVFMTFAWYGHLKMKQEYSWFAALPLIGVIAFSWAIAFFEYSCQIPA NRIGFIGNGGPFSLMQLKVIQEVITLVIFTVFTTIFFKGEALHWNHLAAFVCLIAAVYFV FMK >gi|225935364|gb|ACGA01000028.1| GENE 110 154654 - 155400 650 248 aa, chain - ## HITS:1 COG:NMA1465 KEGG:ns NR:ns ## COG: NMA1465 COG0037 # Protein_GI_number: 15794367 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Neisseria meningitidis Z2491 # 1 245 5 252 319 138 35.0 1e-32 MTQFTEEEKTIRRIERRFSKGVVQYGLIEEGDKILIGLSGGKDSLALVELLGRRARIYKP RFSVVAVHVVMKNIPYQSDTEYLKAHCEAYGVPFVQYETAFDPATDTRKSPCFLCSWNRR KALFTVAKEHGCNKIALGHHMDDILETLLMNITYQGAFSTMPPRLVMKKFDMTIIRPMCL 
VHEADLVELATLRHYQKQVKNCPYESQSSRSDMKGILRQLEVMNPEARYSLWGSMTNVQE ELLPDKID >gi|225935364|gb|ACGA01000028.1| GENE 111 155584 - 157005 1269 473 aa, chain + ## HITS:1 COG:no KEGG:BT_4557 NR:ns ## KEGG: BT_4557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 473 1 473 475 816 87.0 0 MKGKYILFLICSLFLCEVSAQTLEQARTLFTKGEYEQAKPVFKKYAKSQPSNGNYNYWYG VCCLKTGEAEEAVKPLEIAVKKRITGGQLYLGQAYNETYRFEDAVNCFEEYIADLSKRKK PTEEAEKLLEKSKSDLRMLKGVEDVCIIDSFVVDKATFLNAYKISEESGKLFTFNEFFKT EGDHPGTVYETEIGNKIYYSEKGEKGNLDIFSKNKLLNEWSDGRPLPGSINASGNANYPF VLSDGVTVYYASDGEGLGGYDIFVTRYNTNTDTYLVPENVGMPFNSPYNDYMYVIDEYNN LGWFASDRFQPEGKVCIYVFIPNTSKQTYNYEAMEQQEIIRLAQIHSLKETWKDKQAVTE ALQRLKAAISHKPKERRAMDFEFVIDDHTTYYLMSDFKSAKAKTLFQRYQQMEKDYYQQE EKLNSLRQQYAGANQQGKAKMAPAILDLEKRILQMSEELDALEVNVRNAEKTK >gi|225935364|gb|ACGA01000028.1| GENE 112 157022 - 157492 514 156 aa, chain + ## HITS:1 COG:no KEGG:BT_4556 NR:ns ## KEGG: BT_4556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 156 249 94.0 3e-65 MDILIIAVLIIAAVILFLVELFVIPGISLAGISALVCILYANYYAFDNLGMAGGFATLGI SAVACIGSLIWFMRSKMLDKLALKKDIDSKVDRSAEDSVKVGDTGISTTRLAQIGSAEIN GKIVEVKSIDGFLNEKTPIIVSRITDGTIMVEKHKE >gi|225935364|gb|ACGA01000028.1| GENE 113 157515 - 158516 1116 333 aa, chain + ## HITS:1 COG:BS_yqfA KEGG:ns NR:ns ## COG: BS_yqfA COG4864 # Protein_GI_number: 16079592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 322 1 321 331 390 66.0 1e-108 MESSFYLPIFLIAGGIIFLIIFFHYVPFFLWLSAKVSGVNISLIQLFLMRIRNVPPYIIV PGMIEAHKAGLKNITRDELEAHYLAGGHVEKVVHALVSASKANIELPFQMATAIDLAGRD VFEAVQMSVNPKVIDTPPVTAVAKDGIQLIAKARVTVRANIRQLVGGAGEDTILARVGEG IVSSIGSSENHKSVLENPDSISKLVLRKGLDAGTAFEILSIDIADIDIGKNIGAALQIDQ ANADKNIAQAKAEERRAMAVASEQEMKAKAQEARAKVIEAEAEVPKAMAEAFRSGNLGIM DYYRMKNIEADTSMRENIAKPTTGNAGNQPLSK >gi|225935364|gb|ACGA01000028.1| GENE 114 158673 - 159551 881 292 aa, chain + ## HITS:1 
COG:VNG0893G KEGG:ns NR:ns ## COG: VNG0893G COG2820 # Protein_GI_number: 15790029 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Halobacterium sp. NRC-1 # 19 268 14 227 273 100 30.0 5e-21 MKKYFPSSELIINEDGSVFHLHVKPEWLADKVILVGDPGRVALVASHFENKECEVESREF KTITGTYKGKRITVVSTGIGCDNIDIVMNELDALANIDFVTREEKDQFRQLELVRIGTCG GLQPNTPVGTFVCSQKSIGFDGLLNFYAGRNAVCDLPFERAFLNHMGWSGNMCAPAPYVI DASEELIDRIAKDDMVRGVTIAAGGFFGPQGRELRIPLADPKQNDKIEAFEYKGYKITNF EMESSALAGLSRLMGHKAMTVCMVIANRLIKEANTGYKNTIDTLIKTVLDRI >gi|225935364|gb|ACGA01000028.1| GENE 115 159666 - 160325 788 219 aa, chain - ## HITS:1 COG:PA0741 KEGG:ns NR:ns ## COG: PA0741 COG2910 # Protein_GI_number: 15595938 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Pseudomonas aeruginosa # 6 219 2 213 213 201 51.0 7e-52 MEKVKKIVLIGASGFVGSALLNEALNRGFEVTAVVRHPEKIKIENENLKVVKADVSALDE VAAVCKGADAVISAFNPGWNNPDIYDETIKVYLTIIDGVKKAGVNRFLMVGGAGSLFIAP GLRLMDSGEVPENILPGVKALGEFYLNFLKKEKEIDWVFFSPAADMRPGVRTGRYRLGKD DMIVDIVGNSHISVEDYAAAMIDELEYPKHHQERFTIGY >gi|225935364|gb|ACGA01000028.1| GENE 116 160408 - 162315 1435 635 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 205 509 4 323 328 167 35.0 7e-41 MEPAINLYSSNEKSELIRENSQEILSVLAAHQIALWEYDIITGECSFSDDYFSTLGLKNA GVLFEDIDSFYRFVHPDDTEAYKQAFAEMLSSDSKTSQIRVRCIGEQGKVVWLEDHFLSY KESCKSHPGKLLAYTVNVTSQCEKEQHIKYLEEYNRKIIEALPEFIFIFDDNFFITDVLM APGTILLHPVEVLRGADGRSIYSPEVSDLFICNIRECLQDGQLKEIEYPVEVDGGVHYFQ ARIAPFEGNRVLALIHDIGDRILRSQELIEAKRRAEDADKMKSVFLANMSHEIRTPLNAI VGFSEIIAVTEGEEEKREYLEIIQRNSNLLLQLINDILDLSRIESGKSEMHFQQVEIAGL VEEVEKVHQLKMKSNVELKVIRPEGEYWTSTDRNRVMQVLFNFLSNAIKNTEKGSITLGL KHEGPWLKLYVSDTGCGISKEKLPQIFTRFEKLNDFVQGTGLGLSICKSIVERLGGRIEV TSELGQGSVFALYLPYQEIPKEVVERRLPHKKNVDEDKRKKILVVEDIESNFAQLNILLN 
KEYIISWVRNGQEAINSFIREKPDLILMDIRMPIMDGIQATEKIRTISLTVPIIAVTAYA FYTEQQQAIQAGCNAVISKPYSLERLKETIESYIG >gi|225935364|gb|ACGA01000028.1| GENE 117 162451 - 163848 1055 465 aa, chain + ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 465 5 459 459 300 38.0 4e-81 MIQDTICAIATAQGGAIGSIRVSGPEAISITGSIFKPAKTGKLLSEQKPYTLTFGRIYDG DEMIDEVLVSLFRAPHSYTGEDSTEITCHGSTYILQQVMQLLIKNGCRMAKPGEYTQRAF LNGKMDLSQAEAVADLIASSSAATHRLAMSQMRGGFSKELTDLRNKLLNFTSMIELELDF SEEDVEFADRSALRKLADEIEQVISRLAHSFSVGNAIKNGVPVAIIGETNAGKSTLLNVL LNEDKAIVSDIHGTTRDVIEDTVNIGGITFRFIDTAGIRETNDTIESLGIERTFQKLDQA EIVLWMVDAVNAASQIEQLSEKIIPRCEGKHLIVVFNKADLIEDKQKENLLSLLKDFPKE STESIFISAKQRKNTSELQKMLIDAAHLPTVTQNDIIVTNVRHYEALNKALEAIHRVQNG LDSQISGDFLSQDIRECIFFISDIAGEVTNDMVLQNIFQHFCIGK >gi|225935364|gb|ACGA01000028.1| GENE 118 163901 - 164743 485 280 aa, chain + ## HITS:1 COG:Rv0881 KEGG:ns NR:ns ## COG: Rv0881 COG0566 # Protein_GI_number: 15608021 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Mycobacterium tuberculosis H37Rv # 11 262 23 276 288 150 37.0 3e-36 MPIIEISSLTDSGVEIFSTLTEAQLRNRIEPDKGLFIAESPKVIHVALNAGYEPLALLCE QKHITGDAAGIIERCGDIPVYTGERKLLATLTGYTLTRGVLCAMRRRALPSVEEVCRKAR RIVVIEGVVDATNIGTIFRSAAALGIDAILLTRNSCDPLNRRAVRVSMGSVFLIPWTWLD GSPYELRKLGFRTVAMALTEKSISIDNPILATEPKLAIVMGTEGDGLQNETINEADYVVR IPMANGVDSLNVAAASAIAFWQLRVQNDAILPPLPDDVHI >gi|225935364|gb|ACGA01000028.1| GENE 119 164754 - 165230 284 158 aa, chain - ## HITS:1 COG:no KEGG:BVU_3777 NR:ns ## KEGG: BVU_3777 # Name: not_defined # Def: arsenate reductase # Organism: B.vulgatus # Pathway: not_defined # 3 158 2 157 157 192 60.0 4e-48 MKNILIVSNNDMCRSRMAQEILNSFGRGMKISTAGILAGNSVPDVVCQVMEQNGYDFSRR KPCDVATYAQQTWDYVITLCPEAEEVQKEMQGVVRKYVSFNFTDPFQGGIYVEDEQEERV RALYDIMHKELYEFFRSELMEKLLPRCSCGANTYCRCE >gi|225935364|gb|ACGA01000028.1| GENE 120 165246 - 166286 817 346 
aa, chain - ## HITS:1 COG:RSc0194 KEGG:ns NR:ns ## COG: RSc0194 COG1063 # Protein_GI_number: 17544913 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 1 332 1 332 345 217 37.0 3e-56 MLAYTYIEHGRFELQEKARPEIKESRDAIVRVTLGSICTSDLHIKHGSVPRAVPGITVGH EMVGVVEQVGADVTSVKPGDRVTVNVETFCGECFFCKHGYVNNCTDPNGGWALGCRIDGG QAEYVRVPYADQGLNRIPDTVSDEQALFVGDVLATGFWAARISEITAEDTVLIIGAGPTG ICTLLCVMLKKPKRIIVCEKSPERIRFVCEHYPDVLVTEPENCKDFVLKNSDHGGADVVL EVAGSDDSFHLAWNCARPNAIVTIVALYDKPQLLPLPDMYGKNLIFKTGGVDGCDCNEIL MLIEEGKIDTTPLITHRFPLNEIEEAYRIFENKLDGVIKVAISGNK >gi|225935364|gb|ACGA01000028.1| GENE 121 166536 - 169313 1459 925 aa, chain + ## HITS:1 COG:FN1445 KEGG:ns NR:ns ## COG: FN1445 COG1112 # Protein_GI_number: 19704777 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 62 921 2 847 849 756 50.0 0 MIDTSENLIIIKGQIKTPEIESCQNHNGNYKVVFRNVPSAYTYKEENVLWLTDPDKPDPN NYQIISKGRTLSNIKALSIFKAGSHSYWHIRFQNGKEYDYREKDLEIIESCLGESRSKSI FEYLKKVADANELKADDGTKLLAKQYEKIHFIANNRAIAVYLNPQKYKMQTRTASTLIFP FGCNASQQKAVQAAFENQISVVQGPPGTGKTQTILNIIANILVRGKTVQVVSNNNSAIVN VLEKLSKYDMGFIVALLGSTANKEKFIETQEEEKQYPENFESWHDADADQPQFLNQIHHQ TEELKSIFYKQERLAMARQEIQALKIEWQHYLQEFGTKEITLQQRKSSSSADLLNLWNEC QQFAEKEQSLSLQGIAAFIQRLKWLFFKFRSKTICRIPDKGFYKREMSLIIADFQILFYQ TKYAELEVEIDTLEKELANKDAAEMARQMADTSMKYLKNQLFHTYGNNHDKPIFTLPDLK NNWREVQKEYPIILSTTFSSLSSLQRDAVYDYIIMDEASQVSVETGALALSCAKNAIIVG DTMQLPNVVTEENKEKLNFIANACLIKPEYDCANMSFLQSICKVIPNIPQTLLREHYRCH PRIINFCNQKFYGGDLVIMTRDKGEEDVICTIRTAKGNHSRSHMNQREIDVIKEEVLPSL SYETDEIGVIAPYNKQVDAVKSALGEDIDVATVHKFQGREKDAIIMTTVDDVITSFSDDP NLLNVAVSRAKSQFYLVVSGNEQPKDCNISDLIAYIEYNNGTVSTSKIHSIFDYLYEQYT DARIAYLKKHKKISEYDSENLTFALLEDILKENINMRHLNIICHLPLYMLIQDYSLLNEE ESKYAANINTHIDFLIYNRVSKQPVLAIETDGYTFHKSGTSQSERDIKKDHILELYGIPL VRLSTIGSNEKKIIGDKLSEVLNLH 
>gi|225935364|gb|ACGA01000028.1| GENE 122 169542 - 169772 208 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171374|ref|ZP_05757786.1| ## NR: gi|260171374|ref|ZP_05757786.1| hypothetical protein BacD2_05860 [Bacteroides sp. D2] # 1 76 1 76 76 110 100.0 2e-23 MTKNYLIVNTLIKMRIRYNTYNNKNHLFYKNRQINKSYFDQILRNEQESIRLTILRDTLL PKLMSGELKINATEVL >gi|225935364|gb|ACGA01000028.1| GENE 123 169779 - 170855 373 358 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 5 331 1 313 339 148 32 4e-34 MNKSINILDKEYLQWVKDLCGRYRQSQIKAAVKVNVEQLKFNWLLGRDIVELHVEERWGE SVITQLSKDLREAIPNAAGLSKSNIYYCRKFYLLYKDALKTFHQLGGKNETELFHQVGGI FEVPWRHHCLIMDKVKDDIDKALFYVHQTVENGWSRSMLLNFISTDLYERQGKALTNFTK TLPDAMSDLAQELTKDPYNFAFTGITGRYNERLLKNALLNNITRFLIELGTGFAYVGKEY RLEIGETENFIDLLFYNLSLSCYVVVEVKIGKFAFADIGQLGGYVVACNHLLRKEGRDNP TIGLLICKEKDRIQAQYALESSSQPLGISEYDLEKFYPEKVEGTMPTIEEIEAKLRDN >gi|225935364|gb|ACGA01000028.1| GENE 124 171119 - 172744 537 541 aa, chain + ## HITS:1 COG:no KEGG:Cyan8802_4015 NR:ns ## KEGG: Cyan8802_4015 # Name: not_defined # Def: ABC transporter, ATP-binding protein-related protein # Organism: Cyanothece_PCC8802 # Pathway: not_defined # 19 531 30 544 556 342 37.0 3e-92 MSNKSIIIPVNNKPTKIESVQNFVIVGANGSGKSHLGAWIEQQSANGEVLRISAQRALSI PDSITIKSEEAAWNKIYYGEELHHDKNYKWNWGNGLTTKLIDDYDSVLSAIFARLNKEDR AYVIDCKDKEKRGETKADVPQMIIDKITSIWNAIYPHRQIILEDAKIKAKTTSSEEYHAK EMSDGERVTIYLLGQCLIAPNNMIIIIDEPEIHLHKSIMYRLWDKIEEFCPNKTFIYITH DLDFAASRKEATKIWVKSYFGNNRWDIKILDPDENIPDSLMFEVLGNRKPVLFVEGERGS YDNQLYPFVYSNYNIIPCHDCSKVIEMTKSFNNERIKNMHNYSIKGLIDRDYMTEAEISC YKESNIYTLDVAEVENLYLIEDIIKLVAENQALEPNETFNQVKQFLFNKFKEEYDLQLCS ICSREIRHKLQCYTKPSENTMEALEAQAQKIVNSIDIPQIYNQSKQKIDAIVTSQDYDNL LKIYNRKSLHLQISGILKLSSNEYPKLILRMMKTDKKERIIEALKKHMPILDEKEPSIEV I >gi|225935364|gb|ACGA01000028.1| GENE 125 172861 - 173100 135 79 aa, chain + ## HITS:1 COG:FN1445 KEGG:ns NR:ns ## COG: FN1445 COG1112 # Protein_GI_number: 19704777 # Func_class: L Replication, recombination 
and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 1 79 547 623 849 74 48.0 5e-14 MTRDKGEEDVICAIRTAKGNHSRSHMNQREIDVIKEEEVLPNLSYETDEIGVIAPYNKQV DAVKSALEEDIDVATVHKF >gi|225935364|gb|ACGA01000028.1| GENE 126 173261 - 173560 198 99 aa, chain + ## HITS:1 COG:FN1445 KEGG:ns NR:ns ## COG: FN1445 COG1112 # Protein_GI_number: 19704777 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Fusobacterium nucleatum # 1 87 755 842 849 98 60.0 3e-21 MLIQDYSLLNEEESKYAANINTHIDFLIYNRVSKQPVLAIETDGYTFHKSGMSQSERDIK KDRILELYGIPLVRLSTIGSNEKKIMEDKLNEILHLQTQ >gi|225935364|gb|ACGA01000028.1| GENE 127 173637 - 174863 824 408 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4088 NR:ns ## KEGG: ZPR_4088 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 41 274 8 251 324 65 25.0 5e-09 MKSCIIPRNDSLCALCPIREADKTGSHMVPNLLTAVTFSFDGKTKRDREIVELYHINNPE DNAIYYGSQVAPEKIAEDLGHEITDEELEKNTNLLCYDNIFCYQCENRFGVLETTYGEYY KGLKNDINPRIAYLFWLSVYWRMAIGYMGIFMDGEDEFALRDILNKNIHSYNEIINSKEK LGDYGYVIFRVKDGIIKGDSGILGTRTPHCPYVILVADYVVALFSNYKKLHSKVHIFNWE IYKEDINTPDKPFDYIEISIEEFYEFRDNIIDNGYNEGLGAEREKLARKIREYERSQGKP VNKYEVKKLMDMAHLVDSENVHLRVRKLYRFEAAYMKMIEAQKNGISYDFLKDRQLMLNQ EDINNYIVDLQNLRKHNHSIDGFPFAKEFLEDETITSFEEIINKYRPT >gi|225935364|gb|ACGA01000028.1| GENE 128 175086 - 177917 1611 943 aa, chain + ## HITS:1 COG:SA0189 KEGG:ns NR:ns ## COG: SA0189 COG0610 # Protein_GI_number: 15925899 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Staphylococcus aureus N315 # 1 929 1 916 929 633 40.0 0 MVTQSEQVLESGLIKTLMEMNYEYISIKEEDNLYANFKKQLEKHNKRELSLHKRKHFTNK EFEKICIYLEGGTRFEKAKKLRDLYPLETEDGERIWVEFLNKNKWCQNEFQVSNQITVEG RKKCRYDVTILINGLPLVQIELKKRGVELKEAYNQIQRYHKTSFHGLFDYIQLFVISNGV NTRYFANNPNGGYKFTFNWTDAENIPFNDLSKFAYFFFDQCNLGKMISKYIVLHEGDKCL 
MVLRPYQFYAVERILERVQNSNKNGYIWHTTGAGKTLTSFKAAQLVSELDGIDKVMFVVD RHDLDTQTQSEYEAFEPGAVDGTDNTYELIKRLSGNSKIIITTIQKLNCAITKDYYNKYL QEVRHQKVVMIFDECHRSHFGDCHKNIVKFFSNLQIFGFTGTPIFVENAKQEHTTKEVFS DCLHRYLIKDAIADENVLGFLVEYYKGKDESGIDYMNEARMKEIARFILTNFNKSTVDGE FNALFAIQSVPMLLQYYKIFKELNPKIKIGAVFTYAANSSQDDEQTGMNQGYANDKVTAD ELQVIMNDYNNTFGTSFTTDNFSAYYDDINLRMKKKKKDMEPLDLLLVVGMFLTGFDAKK LNTLYVDKNLEYHGLLQAFSRTNRVLNEKKRFGKIVCFRDLKNNVDAAIKLFSNNNPADT ILREPFPVVKEKFNELSLKFKEKYPDVQSIDKLQSEYEKRDFVLAFREIIKKRAEMQIYE DFESDDKEFILSEQEFMDFRSKYLDITTGVINPNPDDKKNTGDTDVPPYGKDERTLDDID FCLELLHSDVINVAYILTLINDLDPSSNDYQERRQQILDTMIKDAVMRNKAKLIDGFIRQ NVDNDKDGFSKSKSDGSIDLESRLTNYVSQQRYKAIQELAEEEGIDEEALIKFLNEYDFL QKEKPEILQEAVKKKRIGLKERRTLLKRIMDKLHSIIELFNWE >gi|225935364|gb|ACGA01000028.1| GENE 129 177933 - 179477 1319 514 aa, chain + ## HITS:1 COG:SA0391 KEGG:ns NR:ns ## COG: SA0391 COG0286 # Protein_GI_number: 15926109 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Staphylococcus aureus N315 # 6 514 11 516 518 584 57.0 1e-166 MSEELQQKLRSQLWTVANTLRGNMSASDFMYFTLGFIFYKYLSEKIELYANEILEEDHIT FKEVWNGKDEELKQDVKEECIQNLGYFIEPEYLYSTIIELISKKENILPSLERSLKKIED STIGQDSEDDFGGLFSDLDLASPKLGKTADDKNKLISDVLIALNGIDFGLQEAGDIDILG DAYEYMISQFAAGAGKKAGEFYTPQEVSQILAEIVITGKVRLKDVFDPTCGSGSLLLRTA KSGKADSIFGQEKNPTTFNLCRMNMLLHGVKYNDFDIQNGDTLEADAFGDRQFDAVVANP PFSADWTAADKFNNDDRFSKAGVLAPRSKADYAFILHMIYHLNDGGTMACVAPHGVLFRG AAEGKIRQFLIEKKNYIDAIIGLPANIFYGTSIPTCILVIKKCRKEDDNILFIDASKEFE KVKTQNKLRPEHIQKIIDTYRERKEIEKYSHCATLQEVKENDYNLNIPRYVDTFEEEEEI DIHAVMTEIKELEAKRAELDKQIDVYLKELGLIQ >gi|225935364|gb|ACGA01000028.1| GENE 130 179530 - 180738 349 402 aa, chain + ## HITS:1 COG:MJ0130m KEGG:ns NR:ns ## COG: MJ0130m COG0732 # Protein_GI_number: 15669898 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 217 401 22 207 425 112 36.0 1e-24 MANNNNKDKCNVPHLRFPEFSGEWKETTLGKIAEITKGSGISKDQLSEQGSPCILYGELY 
TKYKSEIINEVYSRTELDSSPLVKSKANDVIIPCSGETAIDISTARCVLFNNILLGGDLN IIRLKYDDGGFFAYQLNGARKKDIARVAQGVSVVHLYGENLKQIRVYYPNIEEQRKITHL LSLIDGRIATQNKIIDKLKSLIKGLIDDIITLECGLLVTFETLYSKAGEGGTPTTSNMEF YDNGNIPFIKIEDLNNKYLLTNKDCITELGLKKSSAWLIPTNSIIYSNGATIGAISINKY PICTKQGILGIIPNSNIDVEYLYYFMRSSYFQKEVERIVTEGTMKTAYLKDINHIKCPIP DSDKQKEISHALSTLSLKEDIENQLLKKYQIQKQYLLSQMFI >gi|225935364|gb|ACGA01000028.1| GENE 131 180731 - 181096 130 121 aa, chain - ## HITS:1 COG:no KEGG:Msm_0158 NR:ns ## KEGG: Msm_0158 # Name: not_defined # Def: type I restriction-modification system methylase, subunit S # Organism: M.smithii # Pathway: not_defined # 1 120 1 120 199 89 38.0 5e-17 MNGDAILVIKDGSGVGTVSYAQGKFSVIGTLNYLTVIGNNNLRYLYFALSVFNFQPYKTG MAIPHIYFKDYGKAKIYFPPITEQKRVANVLDKLENKLLLEQGILASFNWQKRYLLGQMF I >gi|225935364|gb|ACGA01000028.1| GENE 132 181284 - 181841 187 185 aa, chain + ## HITS:1 COG:MPN285 KEGG:ns NR:ns ## COG: MPN285 COG0732 # Protein_GI_number: 13508024 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Mycoplasma pneumoniae # 14 161 117 265 306 122 42.0 5e-28 MSAISLNLLHSETWEQFKIKDIAQIGRGRVISFIEISQQKNPTYPVYSSQTSNDGIMGYL DDYMFEGEYISWTTDGANAGTVFYRNGKFNCTNVCGLLKLRKEFDTHFVSLVLAEATKKY VSINLANPKLMNNTMGNIQIRLPKLDEQKRISIIFRKLQKLLTTHNSLLAEYTKQKQYLL SQMFI >gi|225935364|gb|ACGA01000028.1| GENE 133 181834 - 182982 275 382 aa, chain - ## HITS:1 COG:MJ0130m KEGG:ns NR:ns ## COG: MJ0130m COG0732 # Protein_GI_number: 15669898 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 206 381 31 207 425 113 37.0 6e-25 MDFYSTNSLCWEQLEYETNTVQNLHYGLIHVGLPTMIDLSKDKLPNIKEGNMPKNFELCK NGDIAFADASEDTNEVAKAVEFYDLDEKDVVCGLHTIHGRDNADRTVIGFKGYAFSSDTF HHQIRRIAQGTKVFSISTKNFSECYIGIPSKEEQTKIVTLLRLINERIATQNKIIEDLKK LKSAISAKLFSQEPIVWNRLNSYFIKGKAGGTPTSTNKKFYDGDIPFLSINDITKQGKYI WQTENHISQNGLDNSSAWIVPKHSLIMSMYASVGLVTINQVPIATSQAMFSMLLKDESLL DYLYYYLSYFKRRHIHKYLETGTQSNINADIVCGIMIPDYEYRHNIKIASMLQSIDVKLD NESLILNQYNQQKQYLLSQMFI 
>gi|225935364|gb|ACGA01000028.1| GENE 134 183114 - 184040 648 308 aa, chain + ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 5 300 68 371 381 106 30.0 5e-23 MIQKKTIKEISVLWKEDKKQYVKQSTFAAYVLILENHIYPTFGEMYELEENKVQEFALQK INNGLSKKSIKDILIVLKMILKFGVKHGYLNYKEWEIRFPTEDEKQHLEVLSISHQKRIM SFIQEHFTFKNLGIYICLSTGLRIGEVCALTWDDINIELGIISIKRTIERIYIIDGEKRH TELIINTPKTKNSIREIPITKELIRILKPLKKIVNGNYYILTNEEKPTEPRTYRNYYKKL MKDLNIPELKFHGLRHSFATRCIESNCDYKTVSVILGHSNISTTLDLYVHPNMEQKKKCI DRMIKGLK >gi|225935364|gb|ACGA01000028.1| GENE 135 184164 - 185015 620 283 aa, chain + ## HITS:1 COG:no KEGG:A2cp1_3953 NR:ns ## KEGG: A2cp1_3953 # Name: not_defined # Def: hypothetical protein # Organism: A.dehalogenans_2CP-1 # Pathway: not_defined # 3 283 1 282 282 181 36.0 2e-44 MAIIKIGELLKLRGLDINKRIKLVRHKDARQKQFINGVEVEGNPYDWYRNDKDKFIAYQS EQHRDVFKNVDYIVSFIGENGTIARFIGIYKIEGPDNERNTNKYCYKITEVEGFDELKER IIIDWGPSTISWHQWLNDKNDKEIIEITPGFDHIFPGYEKIALTLAQLKNIILEKEYPEW KKMLSAVNCIYIITDRKTGKNYIGSTYGKEGIWGRWKEYAKTGGHGNNVTLQKLYDQDNS YPNNFSWSILETLSISISSYEAINIEKCYKQKLGTLAFGLNNN >gi|225935364|gb|ACGA01000028.1| GENE 136 185637 - 186071 437 144 aa, chain + ## HITS:1 COG:no KEGG:BT_4511 NR:ns ## KEGG: BT_4511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 269 90.0 2e-71 MKEEPDSLSVPYNFARCFNAQCPQAPKCLRHIATQLDTADNLYITIINPARYPADGNQCE CFKTAVKVHVAWGLKRLLDRIPYEDAVSIRIQLVGHYGKTGYYRFYRGERGLMPKDQAYI KQLFRNKGIKEEPTYQRYTEEYIW >gi|225935364|gb|ACGA01000028.1| GENE 137 186172 - 188262 1254 696 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 317 639 61 389 480 72 22.0 4e-12 
MKITQFRKNEDTIALSVMDLDILVNKIKTEIKSRPVSTFREHLRYTLPDERCMFANKLPQ IIPAAEFRKVNGQKQMKNYNGIVELTIGPLSNKSEIALVKQKACEQPQTRCVFMGSSGKT VKIWTTFTRPDNSLPNTREEAELFHAHAYRLAVKCYQPQIPFDILPKEPTLEQYSRLSHD PDIIYRPDSVQFYLSQPSSMPEETTFREAVQAEKSPLTRAVPGYDAENAFLMLFEAAFRK AYADLREAGLELREGTWHPLVVQLAKNCFASGLPQEEVVRRTVFHFYMYKQEELIRQMIG NIYTECKGFGKNISLSKEQQLALQTEEFMKRRYEFRHNTQIGEVEYRERLSFRFRFNPLD KRALNSIALDAQMEGIPLWDRDISRYIYSNRVPVFNPLEDFLYRLPAWDGKDRIRALAAT VPCKNPYWMDLFHRWFLNMVSHWKGSNKKYANSVSPLLVGPQGTRKSTFCRSIMPPSERS YYTDSIDFSRKKDAELYLNRFALINIDEFDQVSSTQQGFLKHILQKPVLNVKKPHGSAVL EMRRYASFIATSNQKDLLTDPSGSRRFICIEVTGVIDTNRPIDYEQLYAQAMYELEHGER YWFDQEEEKIMVENNREFEQVPPEEQLFFRYFRAAQPEEGEWLSPAEIMEDIQKGSSIPM SVKRVNSFGRILKKQEIPSKHTRSGTLYHVVRLITR >gi|225935364|gb|ACGA01000028.1| GENE 138 188868 - 189755 670 295 aa, chain - ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 1 291 27 324 331 156 32.0 6e-38 MKKSIKRLPKRTQEELTVLLDLVRKDIRNCQMVILFGSYARGNYVLWDSNIEFGVHTSYQ SDYDILVVITGPAKQVEEKLHRITNKYHDLFADRRHAFPQFIVEHINTVNRNLEVSQYFF TDIVKEGIMLYDSGKCELAKPRKLSFREIRDIAQKEFDELYPYACGLLEGVKEYYLPKKQ TKISAFLLHQTCEKLYNCILMVFTNYRPKSHKIKELGGMVKRFSMELTTVFPQNTDAEKE CFDLLCRSYIEARYNKDFSISQEQLEYLISRVDILKNITERLCKEKIVEYDTMTE >gi|225935364|gb|ACGA01000028.1| GENE 139 189682 - 189984 203 100 aa, chain - ## HITS:1 COG:no KEGG:BVU_2278 NR:ns ## KEGG: BVU_2278 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 53 3 55 594 94 84.0 1e-18 MITLGNFENFVPYKMLMRGEEYYDTDTVSELEEISPSECTATVEGTDDYNVEIFSASLLF GDSKVRTFVRLSNKLGYEKVYKTFTQTYAGRIDSLAGFSA >gi|225935364|gb|ACGA01000028.1| GENE 140 190223 - 191104 749 293 aa, chain + ## HITS:1 COG:SMa1953 KEGG:ns NR:ns ## COG: SMa1953 COG2367 # Protein_GI_number: 16263522 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Sinorhizobium meliloti # 3 268 10 311 334 72 
25.0 7e-13 MRSFIVFLCLIPTLLFARQTQLETQLKEAIKGKKAEIGIAVIIDGKDTVTVNNDIHYPLM SVFKFHQALALADYMGKQKQSLETRLPIKKSDLKPDTYSPLRDKYPQGGIEMSIADLLRY TLQQSDNNACDILFNYQGGPDAVNKYIHSLGIRECAIVGTETAMHEDLNLCYENWTTPLA AAELVEIFRKKPLFPKVYKDFIFQTMVECQTGQDRLVAPLLDKKVTVGHKTGTGDLNAKG QQIGCNDIGFVLLPGGRTYSIAVFVKDSEENNQANSKIIADISRIVYEYVMQH >gi|225935364|gb|ACGA01000028.1| GENE 141 191176 - 192588 860 470 aa, chain + ## HITS:1 COG:lin0429 KEGG:ns NR:ns ## COG: lin0429 COG0346 # Protein_GI_number: 16799506 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Listeria innocua # 1 124 1 123 126 140 53.0 6e-33 MKLHHIAIWTFRLEELKEFYVRFFGGKSNEKYINPKKGFESYFISFGEGTDLELMSRTDV QNTPIEENRVGLTHFAFTFPSQEEVLRFTEQMRSEGYTIAGEPRTSGDGYFESVVLDPDG NRIECVYRRTANESKNKARQETETESIPPVTLHTERLFLRPFEERDAETFFACCQNPNLG NNAGWPPHRTLDESRRILHSTFINQEGIWAVILKDTKQLIGSVGIIPDPKRENPQVRMLG YWLDESYWSKGYMTEAVQGVLNYGFEELRLSLITATCYPHNKRSQKVLKKNGFIYEGTLH QAELTYNGNIYDHQCYYLPGISQPTPEDYDEILHVWEMSVRHTHDFLTEEHIQFYKPLVR KHYLPAVELFVIRNANGKIAAFMGLSDELIEMLFVHPDEQGKGYGKRLMEYARDKKQMDK VDVNEQNEKALQFYLHLGFQVIGRDETDSMGKPFPILHLQLPEADSANRD >gi|225935364|gb|ACGA01000028.1| GENE 142 192610 - 193137 345 175 aa, chain + ## HITS:1 COG:no KEGG:BT_4505 NR:ns ## KEGG: BT_4505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 180 244 80.0 7e-64 MNLRARLSERVHIEDIREVLHFIQDDERLREEIYQLIFDEDAIVSYQALWVCTHFSKADV AWLSRKQEELIDAAMTCPHSGKRRMILNLICQQPAADPPRVDFLDFCMERMISREEPPGV QSLCMKLAYQLTRSIPELQQELRTILEIMEPDLLVPAIRSVRKNTLKAMKAKKNR >gi|225935364|gb|ACGA01000028.1| GENE 143 193166 - 193798 272 210 aa, chain - ## HITS:1 COG:PAB0863 KEGG:ns NR:ns ## COG: PAB0863 COG2095 # Protein_GI_number: 14521504 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Pyrococcus abyssi # 1 205 13 199 201 66 23.0 4e-11 MALFPVINPLGNGFVVNGFFTDLDPQQRKAAIQKLTLNFIMIGVGTLVIGHLFLLIFGLA 
IPVIQLGGGILICKTAMELLGDSGSSDKEEASKNVDGFRWKNIEQKIFYPITFPISIGPG SISVIFTLMASASVKGKLLQTGINYLVIALVIICMAAIFYVFLSQGQRFIQRLGPVGNQI INKLVAFFTFCIGIQISVTGISQIFHLNIL >gi|225935364|gb|ACGA01000028.1| GENE 144 193999 - 194979 791 326 aa, chain + ## HITS:1 COG:mlr8141 KEGG:ns NR:ns ## COG: mlr8141 COG3049 # Protein_GI_number: 13476735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Mesorhizobium loti # 2 324 25 350 350 400 59.0 1e-111 MCTRVVYSEKNGMVATGRSMDWKTEMHSNLWVFPKGMERNGETGANSLQWTSKYGSVVTS AFEIASTDGMNEKGLVANLLWLPETEYPVRDQSKPGLAITAWVQYMLDNFATVDEAVAFI DENTFQVVSDLMPDGSRLATLHLSISDATGDCAIFEYTGGKLTVYHSKEYKVMTNSPTYN KQLALNEYWKSIGGLSFLPGTNRPSDRFARASFYINALPQTDDVRMAIASVFSVIRNTSV PYGISTPEFPEISTTQWRTVSDSKNLLYFFESSLTPNTFWVNLRETDLSEGAPVLKLSIA NGETYHGNATKEFKPAQPFRFMGVKG >gi|225935364|gb|ACGA01000028.1| GENE 145 195095 - 195559 575 154 aa, chain - ## HITS:1 COG:CC0942 KEGG:ns NR:ns ## COG: CC0942 COG2030 # Protein_GI_number: 16125194 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Caulobacter vibrioides # 8 148 5 145 148 117 44.0 6e-27 MEKVIINSYEEFEKLVGQQIGVSDYVELSQERINLFADATLDHQWIHVNTERAKVDSPYH STIAHGYLTLSMLPYLWNQIIQVNNLKMMINYGMDKMKFGQAVLSGQSVRLVTTLHSLTN LRGVAKAEIKFAIEIKDQPKKALEGIAVFLYYFN >gi|225935364|gb|ACGA01000028.1| GENE 146 195680 - 196198 457 172 aa, chain - ## HITS:1 COG:no KEGG:BT_4502 NR:ns ## KEGG: BT_4502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 1 169 176 300 82.0 2e-80 MGEKPKFAFLGNSHMAFWALDVYFPSWECLNYGAPGEGLAYVESFTVDTSDCQVVIQFGT NDIYQLNDENMDDYVERYVKAVLAIPSLKTYLFCIFPRNDYDDYSTAVNQFIRVLNRKIY EKLQGTDIVYLDVFDRLLQDGRLNPELTLDDLHLNGRGYSILSEALKRASGL >gi|225935364|gb|ACGA01000028.1| GENE 147 196283 - 197968 241 561 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 349 556 82 277 285 97 33 7e-19 
MIHFFKQPISHLALPDKFTYPFHYTPHPLCVLAAEEVKEYIASREEWQEELAFGKMFGVL IVQKENKQETAKKEAVNEIGYLAAFSGNLAGKNLHPYFVPPVYDLLQPEGFFKIEEEQIS SINIRIRELENNRSYLDLIEKWKTETEQAKAILNQAKAALKAAKEAREMRRQSSSALSEE EQASLIRESQYQKAEYKRLEKEWKKRLEELEMETRHFETEIEQLKTERKERSAALQRKLF EQFRMLNARGEVKDLYTIFEQTVQKVPPAGAGECALPKLLQYAYLHQLKPLAMAEFWWGD SPKNEIRHHGYYYPSCKGKCEPILQHMLQGLKVDENPLLNSIHEDEELEIVYEDEWLVVV NKPAGMLSVPGKEEDRDSVYHRLKKKYPDATGPMIVHRLDMATSGLLLVAKTKEVHQHLQ AQFASRSIKKRYVAVLDGATATVEKTVPLPGKAGRIELPLCLNPLDRPRQIVSREHGKEA ITEYRIICESEKHTRIAFYPLTGRTHQLRVHAAHPEGLGCPILGDELYGKKADRLYLHAE YIEFRHPISGKILRIQKEADF >gi|225935364|gb|ACGA01000028.1| GENE 148 197965 - 199326 1065 453 aa, chain - ## HITS:1 COG:SP1939 KEGG:ns NR:ns ## COG: SP1939 COG0534 # Protein_GI_number: 15901763 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Streptococcus pneumoniae TIGR4 # 6 426 8 428 456 248 34.0 2e-65 MATSKEMTAGPALPLILKFTLPLLLGNLLQQTYSLVDAAIVGKFLGINALASVGASTSVV FLILGFCNGCCGGFGIPVAQKFGARDYSTMRSYVAVSLKLAAGMSVVIALLTCILCEDIL RIMRTPENIFEGAYAYLLVTFIGVPCTFFYNLLSSIIRALGDSKTPFWFLLFAAVLNIIL DLFCILVLGWGVAGAAIATVFSQGLSAVLCYIYMYRKFEILQGTPKERRFQSKLAKTLLY IGVPMGLQFSITAIGSIMLQSANNALGTACVAAFTSAMRIKMFFICTFESLGIAMATYSG QNYGAGKPECVWLGIKASALMMIVYAAFTFVLLMVGAKYFALIFVDPSETEILLDTELFL HISCMFFPMLGLLCILRYTIQGVGYTNLAMFSGVAEMIARILVSIYAVPAFGFIAVCYGD PMAWIAADLFLVPAFIYVYRRLKKQVFTSTVTV >gi|225935364|gb|ACGA01000028.1| GENE 149 199450 - 199656 206 68 aa, chain - ## HITS:1 COG:no KEGG:BF1617 NR:ns ## KEGG: BF1617 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 67 1 67 68 116 77.0 2e-25 MKKLLCPQCKIAAMYVKNEQGDRLLVYVLEDGEVVPKYPEDSMEGFDLTEVFCLGCSWHG SPKRLVKR >gi|225935364|gb|ACGA01000028.1| GENE 150 199735 - 201240 971 501 aa, chain - ## HITS:1 COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans 
str.C2A # 45 498 26 474 476 287 40.0 3e-77 MEEYISDNSLYIRSKNNSMINRKMIGRVLGMLLFIELGMFLLCAGVSAGYGESDYKYFLY TCIINAVVGGLLLLYGRGAENKMSRRDGYCVVTLSWVFFTFFGMLPFYFSGSIDTITNAF FETMSGFTTTGATILDDIESLSHGMLFWRSLTQWIGGLGIVFFTIAILPIFTTGGVQLFS AESTGVTHDRTHPKINVMAKWLWTIYLVLTVSETVLLMFGGMSLFDAICQSFATTATGGY STKQNSISYWNSPYIEYVVAIFMIVSSINFSLFLMCLKGKVGRLFKDEELHWFLASVGIL TFLITLALVFQNHYDWELAFRKALFQVSTVHTSCGFATDDYNMWPPFTWLLLFIAMLSGG CTGSTSGGIKNMRLLIVARAIRNEFKHLLHPNAVLPVRINKQTVSSSIVTTVLIFFAFYL VLILIGWTALLFLGVGFSESVSTVISSIGNVGPGLGTCGPAYSWNSLPDLAKWILSFLML VGRLELFSVLLLFYPGFWKNR >gi|225935364|gb|ACGA01000028.1| GENE 151 201491 - 201790 184 99 aa, chain + ## HITS:1 COG:no KEGG:BF3088 NR:ns ## KEGG: BF3088 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 97 1 97 98 142 77.0 4e-33 MKASRIAHEKKTVELMIRLYCRKKEKNTILCADCEELLRYAHARLDHCPFGEKKSACKEC TVHCYKPVMRERMRQVMRFSGPRMLLYAPWQAIRHLLNL >gi|225935364|gb|ACGA01000028.1| GENE 152 201810 - 202541 198 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 5 241 8 254 259 80 28 7e-14 MKKAIIIGATSGIGQEVAKCLLLDGWQIGVAGRRQSALENLQRAAPDQIQIQTLDVTQED AGEKLNMLIDKVGGMDLFLLSSGIGFQNTNLNMEVELNTAYTNVEGFIRMVDTAFIYFRK SGGGHLAVISSIAGTKGLGVAPAYSATKRFQNTYIDALEQLSYLQKLNIHFTDIRPGFVA TDLLNDGKHYPLLMDATKVGRHIAWSLERKQRIVVIDWRYRILVFFWKMIPRWMWKRLPV KTN >gi|225935364|gb|ACGA01000028.1| GENE 153 202660 - 202929 321 89 aa, chain + ## HITS:1 COG:no KEGG:BT_4495 NR:ns ## KEGG: BT_4495 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 89 141 86.0 8e-33 MEQKFCQSCSMPIDDSTFGKETDGSKNEDYCHYCYADGHFTKECTMDEMIELNLNYLDEF NKDSEVKYTVEEARKTMKEFFPQLKRWKQ >gi|225935364|gb|ACGA01000028.1| GENE 154 202986 - 203270 301 94 aa, chain - ## HITS:1 COG:HI0450 KEGG:ns NR:ns ## COG: HI0450 COG3309 # Protein_GI_number: 16272398 # Func_class: S Function unknown # Function: Uncharacterized virulence-associated protein D # Organism: 
Haemophilus influenzae # 1 91 1 90 91 89 50.0 2e-18 MYAIAFDMVITDLRANYGEPYNNAYFEINKVLRQYEFYNTQGSVYLTEKTDMANLFRAID ALKRIPWFQASVRDLRAFRVEDWSNFTDFIKENK >gi|225935364|gb|ACGA01000028.1| GENE 155 203258 - 203458 289 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160887278|ref|ZP_02068281.1| ## NR: gi|160887278|ref|ZP_02068281.1| hypothetical protein BACOVA_05296 [Bacteroides ovatus ATCC 8483] # 1 66 7 72 72 129 100.0 5e-29 MIKEGEKVKVDPRLTGLDEWIEGTVIDIEHNPFKGLVIAIRDKFNRIFFGEQKYFIPVNE DNVCMQ >gi|225935364|gb|ACGA01000028.1| GENE 156 203535 - 203801 353 88 aa, chain - ## HITS:1 COG:PA1749 KEGG:ns NR:ns ## COG: PA1749 COG2388 # Protein_GI_number: 15596946 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Pseudomonas aeruginosa # 14 82 83 152 161 57 44.0 8e-09 MDYEIIHQPEQKLFKTEVDGRTAFVQYRLIGDSLDIIHTIVPQPIEGRGIAAALVKAAYA YALANGMKPKATCSYAVKWLERHPEING >gi|225935364|gb|ACGA01000028.1| GENE 157 203884 - 205197 1146 437 aa, chain - ## HITS:1 COG:SA0724 KEGG:ns NR:ns ## COG: SA0724 COG1090 # Protein_GI_number: 15926446 # Func_class: R General function prediction only # Function: Predicted nucleoside-diphosphate sugar epimerase # Organism: Staphylococcus aureus N315 # 5 279 6 291 300 173 36.0 5e-43 MNIAMTGATGYIGKHLSNYLTEKGGHRIIPLGRSMFREGMSGYLIQTLTHCDVVINLAGA PINKRWTPEYKQELFNSRIVVTNRIIRALNAVKTKPKLMISASAVGYYPSEAEVDEYTRT RGEGFLSDLCYAWEKEAKHCPEPTRLVITRFGVVLSPDGGAMQQMLRPLQATKIATAIGP GTQAFPWISIRDLCRAMEFFITHEETHGVYNLVAPQQISQYAFTRAMGKAYRAWTTMVAP QRIFRILYGEAASFLTAGQRVRPTRLTEAGFHFSIPNVERFFRGTNHSTVTSLDLHRYMG LWYEIARYENRFEYGLVDVTATYTLRPDGMIRVENRGCKRNSPYDICKTANGHAKIPDPA QPGKLKVSFFLSFYSDYYILELDEENYNYALVGSSTDKYLWILSRTPQLPEEIKKKLVTA AERRGYDTSQLKWIEQL >gi|225935364|gb|ACGA01000028.1| GENE 158 205341 - 206309 911 322 aa, chain + ## HITS:1 COG:aq_1420 KEGG:ns NR:ns ## COG: aq_1420 COG0741 # Protein_GI_number: 15606599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins 
(some contain LysM/invasin domains) # Organism: Aquifex aeolicus # 89 283 81 268 299 109 35.0 6e-24 MKINYILTLILVFCIGASIPILTGSSQVNEQHSAKSEVPYCVTPPTVPAQVTFDGETIDL RRYDRRERMDREMMAFTYMHSTTMLLIKRANRYFPIIEPILKANGIPDDFKYLMVIESNL NNIARSPAGAAGLWQFMPATGREFGLEVNDNVDERYHIEKATVAACKYFKQAYAKYGDWM AVSAAYNAGQGRISSQLEKQLASHAMDLWLVEETSRYMFRLLAAKEIFNNPQRYGFLLKR EHLYPPIPYKEVTVNTSIDDLNDYAKSQGITYAQLRDANPWLRDTSLKNKTGKTYILYIP TQEGMYYNPQKTVAYNKQWVID >gi|225935364|gb|ACGA01000028.1| GENE 159 206379 - 207218 569 279 aa, chain + ## HITS:1 COG:MA2034 KEGG:ns NR:ns ## COG: MA2034 COG1226 # Protein_GI_number: 20090882 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Methanosarcina acetivorans str.C2A # 7 269 19 279 279 270 51.0 2e-72 MHDEKLKRKLYVIIFESDTPAGKLFDVILIACILVSVLLVIIESLKGLPTYLTTPFVIME FLFTGFFTFEYLTRIYCSPRPRKYIFSFFGIVDLLATLPLYIGLLFPGARYLLIIRAFRL IRVFRVFKLFNFLNEGERLLTALRESSKKIAVFFLFVVILVTSIGTLMYMIEGTQPNSQF NNIPNSIYWAIVTMTTVGYGDITPATGLGKFLSACVMLIGYTIIAVPTGIVSASMMKEYK RRRDKECPNCHRSGHEDNAEFCKYCGHSLDPSETKAEEK >gi|225935364|gb|ACGA01000028.1| GENE 160 207224 - 207592 139 122 aa, chain - ## HITS:1 COG:no KEGG:BT_4485 NR:ns ## KEGG: BT_4485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 158 82.0 5e-38 MKKRIFYIISFLVIFCIEVLIALYVRDRFIRPYVGDMLVVVLVYSFVRIFLPTGIPRMPF YVFLFACFVEVLQYFRLVETLGVTNRVARIVLGSTFDWGDIACYAVGCVFIVLFEHFVRR RS >gi|225935364|gb|ACGA01000028.1| GENE 161 207767 - 208435 368 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 8 216 15 229 236 146 39 1e-33 MMLNMDFVIRLLVAGILGTIIGLDREYRAKEAGYRTHFLVSLGSALIMIVSQYGFQEIIK ESSVTLDPSRVAAQVVSGIGFIGAGTIIFQKQIVRGLTTAAGIWATAGIGLAVGTGMYTI GIAATVLTLVGLELLSYLFKSIGMKSSMVSFSTPNKDILKQIADRFNSKDYLIVSYEMET QHTGETEYYQVTMVIKSKRNNDEGHLLSLIQEFPDVTVQRIE >gi|225935364|gb|ACGA01000028.1| GENE 162 208468 - 209241 512 257 
aa, chain - ## HITS:1 COG:CC2333 KEGG:ns NR:ns ## COG: CC2333 COG1573 # Protein_GI_number: 16126572 # Func_class: L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Caulobacter vibrioides # 87 254 78 231 479 67 29.0 2e-11 MNVYIYDKTFDGLLTAVFDAYFRKTFPDALLSEGDALPLFCDELHTVVTDEEKAGRVWRG LQKKVSSSALGCLTQSWLSELPEIGILIFRYIRKAIDAPRSIETNFGDPDVLQLAQIWKK VDGERVHLMQFVRFQKAADGTFFAAFEPQYNALPLTVHHFKDRFADQKWIIYDMKRRYGF YYDLQEVTTISFDDDSRESHLITGMLDESLMDKDEKLFQQLWKTYFKAICIKERMNPRKH RQDMPVRYWKYLTEKQK >gi|225935364|gb|ACGA01000028.1| GENE 163 209263 - 210534 968 423 aa, chain - ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 5 382 2 378 440 446 55.0 1e-125 MMNENVLAKLKILAESAKYDVSCSSSGTVRSNKPGTLGNTVGGWGICHSFAEDGRCISLL KIMLTNYCIYDCAYCVNRRSNDLPRATFSVSELVELTMEFYRRNYIEGLFLSSGVVRNPD YTMERLVRVAKDLRQVYRFNGYIHLKSIPGASRELVNEAGLYADRLSVNVEIPKEENLKL LAPEKDHKSVFAPMKYIQQGVLESKEERQKFRHAPRFAPAGQSTQVIVGATSESDKDILF LSSALYGRPTMKRVYYSGYVSVNTYDKRLPALKQPPLVRENRLYQADWLLRFYQFKVDEI VDDSYPDLDLEIDPKLSWALRHPEQFPVDINKADYEMLLRVPGVGVKSAKLIVASRRFSR LGFYELKKIGVVMKKAQYFITCKELPLQMQTVNELSPQRVRSLLLPKPKKKVDERQLLLD FGE >gi|225935364|gb|ACGA01000028.1| GENE 164 210796 - 212352 1282 518 aa, chain - ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 520 895 86.0 0 MSNKIYPIGIQNFEKIRNDGYFYIDKTALIYQMVKTGGYYFLSRPRRFGKSLLLSTLESY FSGKKELFEGLAMGKLEKDWIKYPVIHLDLNAKKFDTVDDLIRLVDRQLLVYEAQYGSCS KDETIDDRFVTLIRMAAEKTGERVVILVDEYDKPMLQAIGRDELQEEYRNTLKAFYGVMK SMDGYIKFAMLTGVTKFGKVSVFSDLNNLDDISMRQIYVDICGVSEQELHDNLESELHEL AEIRGVSYDDVCNELRACYDGYHFTHNSIGIYNPFSLLNAFKYQEFSSYWFETGTPTYLV ELLKKHHYDLHRMAHEETTAEVLNSIDSTSDNPIPVIYQSGYLTIKGYDPRFGNYRLGFP 
NREVEEGFVKFLLPFYANTNAVESSFAIQKFVREIESGDYESFFRRLQSFFADTPYELVR DLELHYQNVLFIVFKLVGFYVKAEYHTSEGRVDLILQTDKFIYVMEFKLNGTAEEALQQI NEKHYALPFEHDERKLFKIGVNFTSATRNIEKWMVEAE >gi|225935364|gb|ACGA01000028.1| GENE 165 212523 - 212699 353 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754220|ref|ZP_02426347.1| ## NR: gi|167754220|ref|ZP_02426347.1| hypothetical protein ALIPUT_02513 [Alistipes putredinis DSM 17216] # 1 58 33 90 90 71 74.0 1e-11 MKELVEKVAALYADFSKDANAQIENGNKAAGTRARKASLEIEKAMKEFRKASLEASKK >gi|225935364|gb|ACGA01000028.1| GENE 166 212847 - 213494 579 215 aa, chain + ## HITS:1 COG:no KEGG:BT_4477 NR:ns ## KEGG: BT_4477 # Name: not_defined # Def: putative ATP-dependent DNA helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 383 91.0 1e-105 MKLLTDTDYIHALIAEGEHQQQDFKFEISDARKIAKTLSAFANTDGGRLLIGVKDNGKIA GVRSEEEKYMIEAAAQLYCVPEVEYSLQTYIVEGRQVLVATIEETPHKPVYAKDETGKPL AYLRIKDENILATPIHLRVWQQSDSPRGEFIRYTEREQLLLDQLEHGTLLSLNRYCRQTG LSRRAAEHLLAKFVRYDIVEPVFENHKFYFRIKDE >gi|225935364|gb|ACGA01000028.1| GENE 167 213528 - 214916 799 462 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 2 434 5 440 456 312 39 1e-83 MTFINEINDILWTYILIIMLLGCAVWFSIRTRFVQFRMIREMIVLLSESAGKGKQGEKHV SSFQAFAISIASRVGTGNLAGVATAIAIGGPGAIFWMWVIALLGASSAFIESTLGQLYKI RGKDSFIGGPAYYMKKGLKQPWMGMLFAVLISITFGFAFNSVQSNTICAAAEHAFGFNHI ILGGVLTLLTLLIIFGGIQRIARVSSIIVPVMALGYVGLALVIVALNITHLPGVIALIVS HAFGWEQALAGGVGMALMQGIKRGLFSNEAGMGSAPNAAATAHVSHPVKQGLIQTLAVFT DTLLICTCTAFIILFSGVPLDGSANGVQLTQQALTNKIGSSGSIFVAVALFFFAFSSILG NYYYGEANIRFITHRKWVLHGYRILVGGMVLFGSLATLDMVWSLADVTMGLMAICNLIAI LFLGKYAIRLLNDYRAQKKAGIQSPVFKKETMPDIEKDLECW >gi|225935364|gb|ACGA01000028.1| GENE 168 215029 - 215661 639 210 aa, chain + ## HITS:1 COG:no KEGG:BT_4475 NR:ns ## KEGG: BT_4475 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 210 210 400 98.0 1e-110 
MGLFGLFKKKSDETKVGNVEDFISLTRVYFQSVIATNLGITNIRFLPDVANFKRLFKVPT QNGKLGLAEKSAARKMLMQDYGLNENFFKEIDSSVKRNCRTQNDIQSYLFMYQGFSNDLM MLMGNLMQWKFRMPSIFKKALYGMTQKTVHDVCTKTVWKADDVHKTAATVRQYKERLGYS EQWMTDYVYNIVLLAKKEPKRKNDDTETKK >gi|225935364|gb|ACGA01000028.1| GENE 169 215903 - 218038 1489 711 aa, chain + ## HITS:1 COG:MJ0634 KEGG:ns NR:ns ## COG: MJ0634 COG1509 # Protein_GI_number: 15668815 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Methanococcus jannaschii # 22 679 24 619 620 137 23.0 7e-32 MKQKKMLALTFSQLKQIYNQEIPEVVEIADKSSSAEEFKAGMLRFLEICGIENEAAEEAR EQIRLLLDYDGQNVHELSTGQDMSVQTIRLLYEFLTGTLENMEMPTDLFIEIFQMFKRLK GEVVPLPSPQRIKSRNDRWETGLDEEVREVRDENKERMLHLLIQKIENRKSKPSVRFHFE EGMSYEEKYRLVNEWWNDFRFHLAMAVKSPGELNRFLGNSLSSETMYLLYRARKKGMPFF ATPYYLSLLNVTGYGYNDEAIRSYILYSPRLVETYGNIRAWEKEDIVEAGKPNAAGWLLP DGHNIHRRYPEVAILIPDTMGRACGGLCASCQRMYDFQSERLNFEFESLRPKESWDRKLR RLMAYFEEDTQLRDILITGGDALMSQNKTLQNILDAVYRMAARKQRANLERKDGEKYAEL QRVRLGSRLLAYLPMRINDGLVDVLREFKEKASAIGVKQFIIQTHFQTPLEVTPEAKEAI RKILSAGWIITNQLVYTVAASRRGHTTRLRQVLNSLGVVCYYTFSVKGFNENYAVFAPNS RSMQEQQEEKIYGRMTPEQADELYKILETKVGMEEEPKEDVAKQLRRFMRKHHLPFLATD RSVLNLPAIGKSMTFQLVGLTEEGKRILRFEHDGTRHHSPIIDQMGQIYIVENKSLAAYL RQLAKMGEDPEDYASIWNYTKGETEPRFSLYEYPDFPFRTTDKMSNLSIKE >gi|225935364|gb|ACGA01000028.1| GENE 170 218138 - 219451 352 437 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 9 433 7 418 447 140 26 1e-31 MKTDLIYGVEDRPPFKDALFAALQHLLAIFVAIITPPLIIASALKLDVEKTSFLVSMSLF ASGVSTFIQCRRFGAIGAKLLCIQGTSFSFIGPIIATGLVGGLPLIFGSCMAAAPIEMIV SRTFKYLRNIITPLVSGIVVLLIGLSLIKVGIVSCGGGYAAMDNGTFATWENLSIAGLVL LSVLFFNRCRNKYLRMSSIVLGLCLGYALAFALGKVDMSSLNVEMLMSFNIPQPFKYGVD FNVSSFIAIGLVYLITAIEATGDVTANSMISGLPIEGDSYLKRVSGGVMADGFNSFLAGV FNSFPNSIFAQNNGIIQLTGVASRYVGYYIAAMLVLLGLFPIVGAVFSLMPDPVLGGATL LMFGTVAAAGIRIVSSQEIGRKETLVLAVSLSLGLGVELMPDVLQQTPEAIRSIFSSGIT TGGLTAIIANIVIRVKE >gi|225935364|gb|ACGA01000028.1| GENE 171 219567 - 220274 630 235 aa, 
chain - ## HITS:1 COG:no KEGG:BT_4472 NR:ns ## KEGG: BT_4472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 430 91.0 1e-119 MKQIISALFLCMLLSGIPGLQAQNIQLHYDFGRSLYDKDLQGRPLFTSTVEKFHPDTWGS TYFFVDMDYTSEGVASAYWEIAREIKFWKGPFSAHLEYNGGLSKGMSYKNAYLAGATYTF NNASFSKGFTLTAMYKYIQKHQSPNNFQLTGTWYVNFCRNLLTFSGFADWWREETNYGKT IFLSEPQFWVNLNKIKGVNKNFNLSVGSEVELSNNFGGRDGFYVIPTLALKWTLN >gi|225935364|gb|ACGA01000028.1| GENE 172 220348 - 221943 1235 531 aa, chain - ## HITS:1 COG:no KEGG:BT_4471 NR:ns ## KEGG: BT_4471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 920 80.0 0 MKRKFVYLYLLGLLLFSVACTGNFRDYNTDLSGITDDDLIIDDNGYGIRLGIIQQGIYFN YDYGKGKNWPFQLIQNLNADMFSGYMHDGKPLNGGSHNSDYNMQDGWNSAMWTHMYSYIF PQIYQSENATRNTHPALFGITKILKVEAMHRVTDYYGPIIYKNFANAEKHYRPDKQEDVY YEFFNELDSAVVALTNYIDEKPESNGFARFDILLDGKYASWVKFANSLRLRLAMRIASVA PDKACAEIQKIKENDYGFFEAETGGAIVSTKSGYTNPLGELNCVWNETYMSANMESILVG YDDPRLGAYFEHCTDEALKGQYRGIRQGTCFAHSHYSGLSKLFVLQSTDAPLMTASEVWF LRAEAALRGWTDEDEEACYRNGVTTSFHQWGIYGVEDYLKSELTASGFIDTYDEENNIEA RCKVTPKWNPQDDKETKLEKIITQKWIAMFPEGCEAWAEQRRTGYPRLFPVRFNHSRNGC IDTEIMVRRLNFPGTLQTEDPEQYSALMEALGGDDHGGTRLWWDTGNNNLE >gi|225935364|gb|ACGA01000028.1| GENE 173 221959 - 225087 2699 1042 aa, chain - ## HITS:1 COG:no KEGG:BT_4470 NR:ns ## KEGG: BT_4470 # Name: not_defined # Def: outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 1042 3 1035 1035 1694 79.0 0 MNKQYFIISLRKEVFVSCFLMGLFFLEGGNCYASEQQVLRGTIEVPSDNNTVRGRVIDIY GEPLIGATIREKGGTNGTVTDIDGKFFLSVPDSAVLQVSFVGYKTVEVNVTGRNVLEIRL QEDAVMLEHVIVTALGLEKDEATLAYSVQKIKGEELNRVKEVNMIAALAGKAAGVQINKN SSGMGGSAKVSLRGIRSVSGDNQPLYVIDGVPMSNTSSEQAYSAIGGTANAGNRDGGDGI SNLNPEDVESISILKGAPAAALYGSMAANGVILITTKKGNSAGQRNINFSTGLTFEKAFS MPKMQNRYGVSDVVDSWGEKENLMAYDNLNDFFRTGLTSMTSVSISYGNENLQTYFSYAN TTGKGIIDKNKLKKHNINLRETATMFDKRLKLDGSVNVMKQTVENKPVSGGFYMNPLVGL YRFPRGEDLSYYKDHFETYDEERKLGVQNWHTFTEDFEQNPYWITNRIQSKETRTRVILS 
LSANFKVNDWLTIQARGNMDYWADKLRQKFYASTATALCGANGRYIEMDYQETQMYGDVM AMVKKTWGDFTLDAAIGGSINDKVVNSTRYDSKNASLKFANVFNIANIVMNSSASIDQRI DQRRQLQSIFGTAQIGYKEKLFLDLTARNDWASTLSYTKHEKSGFAYPSVGLSMLLDKWV KLPEWVSFAKLRGAYSKVGNDIPPFITNSFSHINAGGEIQANDAAPFKDMEPEMTHAIEF GTEWRFFQHRLGVNLTYYRTNTYNQFFKLPALAGDKYAYRYVNAGNIQNRGWELTLNGTP VLTPDFNWKTSVNFSTNKNKIVKLHEDLKEMIYGPTSFSSSYAMKLIKGGSIGDIYGKAF VRDDAGNIVYETEGDNKGLPLVEGDGNTVKVGNANPVFMMGWDHTFSYKGFTIYFLLDWR YGGEVLSQTQAEMDLYGVSEITADARDRGYVMLDGQRIDNVKGFYKIVGGRAGVTEYYMY DATNLRLREVSLSYNFSKKWIQKTKVLKDVQLSFVARNLCFLYKKAPFDPDLVLSTGNDN QGIEVFGMPTTRSLGFTLKCEF >gi|225935364|gb|ACGA01000028.1| GENE 174 225325 - 225774 255 149 aa, chain + ## HITS:1 COG:NMB0698 KEGG:ns NR:ns ## COG: NMB0698 COG3663 # Protein_GI_number: 15676596 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Neisseria meningitidis MC58 # 1 143 76 220 229 148 53.0 3e-36 MWRIYGILFFNDKNHFLNSTLKSFCREQIIDFLNEKGIALFDTASSIRRLQDNASDKFLE VVEATDVAALLRQLPECKAIVTTGQKATDTLRQQFDIEEPKVGDYSEFVFEGRALRLYRM PSSSRAYPLALDKKAAAYRIMFQDLQILK >gi|225935364|gb|ACGA01000028.1| GENE 175 225823 - 226176 363 117 aa, chain + ## HITS:1 COG:FN0052 KEGG:ns NR:ns ## COG: FN0052 COG1393 # Protein_GI_number: 19703404 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Fusobacterium nucleatum # 4 116 5 117 120 123 60.0 7e-29 MATLFLQYPACSTCQKAKKWLTENNIEFTNRLIVEENPTVEELKAWIPRSGLPVKKFFNT SGLVYKELKLSEKLPAMSEEEQIALLATNGKLVKRPLVVTDSFVLVGFKPDEWEKLK >gi|225935364|gb|ACGA01000028.1| GENE 176 226574 - 228424 2195 616 aa, chain - ## HITS:1 COG:STM2315 KEGG:ns NR:ns ## COG: STM2315 COG2304 # Protein_GI_number: 16765642 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Salmonella typhimurium LT2 # 139 610 111 591 593 415 46.0 1e-115 
MKTNQFRAMMLVLLMAVISLGRMNAQVITVSGTVTDAKDGNPLVGCSVQIKGTSKGAITN MKGQYTIQAKKGETLLFQYIGYKQERRVVKSATLDVKMKADDVVLEECVVVGYGHELKAT KSVSAAYMAVCPTPGIMYDAVNAEEYGQIQENGFKSVSDAPLSTFSIDVDAASYSNMRRF INKGELPPVDAIRTEELVNYFSYDYPKPTGSDPVKITMEAGACPWNANHRLVRIGLKAKE IPTDNLPASNLVFLIDVSGSMWGANRLDLVKSSLKLLVNNLRDKDKVAIVTYAGSAGVKL EATPGSDKQKIREAIDELTAGGSTAGGTGILLAYKIAKKNFISNGNNRIILCSDGDFNVG VSSAEGLEQLIEKERKSGVFLTVLGYGMGNYKDKKIQVLAEKGNGNHAYIDNLQEANRVL VGEFGATLHTVAKDVKLQVEFNPSQVQAYRLIGYESRLLKDEDFNNDAKDAGDMGAGHTV TAFYEVIPTGAKNEYVGKIDDLKYQKKEKVTVKPTGSNDLLTVKLRYKAPDKDVSKKMEL PFVDNKGNNVSSDFRFASAVAMFGQLLRDSDFKGNASYDKVINLAKQGLNNDDKGYRREF IRLVEAAKGLERTNKN >gi|225935364|gb|ACGA01000028.1| GENE 177 228614 - 229270 500 218 aa, chain + ## HITS:1 COG:VC1935 KEGG:ns NR:ns ## COG: VC1935 COG0558 # Protein_GI_number: 15641937 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Vibrio cholerae # 2 206 27 233 252 156 43.0 3e-38 MKNEVDGRREIASRNTAWATIIARKLTRWGVTPNQISMMSVFFAMVGCLLLIGTVIDPGF NKYVAYILFIVCMQSRLLCNLFDGMVAIEGGKKSANGDLYNDMPDRFADALFIIPIGYIA GGFGIELGWLAALLAVMTAYFRWIGAYKTHQHFFNGPMAKQHRMALLTLAFVVATCTIHA GYDRMVCLIALIIINVGLVATLIHRLYLMSHTTNNEIK >gi|225935364|gb|ACGA01000028.1| GENE 178 229267 - 230223 657 318 aa, chain + ## HITS:1 COG:VC1936 KEGG:ns NR:ns ## COG: VC1936 COG4589 # Protein_GI_number: 15641938 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: Vibrio cholerae # 22 311 14 303 310 261 48.0 1e-69 MKELLDIIFPTLSDELIIVISLIIGLLVTASLILFLVKKISPKTNISELSARTRSWWIMA GMFIGAVFISYNISYFFLAFLSFIAFRELYSVLGFREADRGALFWGILAIPIQYYLAYLA WYGAFIIFIPVVMFLVLPLRLVLKGDTHGITKSMALLQWILMLSVFGISHLAYLLSLPEL PGFNAGGRGLLLFLVFLTEINDVMQFIWGKLLGRHKILPKVSPNKTWEGFLGGVISTTVI GYFLGFLTPLSAPNVILVSALVAIAGFSGDVVISAIKRDKGIKDMGNSIPGHGGVFDRID SLAYTAPVFFHLVYYIAY >gi|225935364|gb|ACGA01000028.1| GENE 179 230248 - 230904 457 218 aa, chain + ## HITS:1 COG:VC1937 KEGG:ns NR:ns ## COG: VC1937 COG0204 # 
Protein_GI_number: 15641939 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Vibrio cholerae # 22 210 22 202 223 124 37.0 1e-28 MQAAAMQIIYKGVFQWFLKLVVGVQFTDCQFLKKEKQFIILANHNSHLDTLSLLSSLPGG LLWKVKPVAAEDYFGKTRFQASISNFFINTLLIRRKGEKDSEHDPIRKMLEAIDAGYSLI LFPEGTRGKPEQMGKIKSGIARILSLRPEVKYIPVFMTGMGRSLPKGKMILLPYKASIYY GMPALVKSTDTHEILDQITGDFEAMKEKYQVIIDEEEE >gi|225935364|gb|ACGA01000028.1| GENE 180 230905 - 231477 321 190 aa, chain + ## HITS:1 COG:mll4824 KEGG:ns NR:ns ## COG: mll4824 COG1595 # Protein_GI_number: 13474039 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 17 187 9 179 179 77 25.0 2e-14 MLFFKRKISKLSDEELLIHYTKSGDTEYFGELYNRYIPLLYGLCLKYLHDEDRAQEAVMQ LFEDLLPKLGNYEIKVFKPWLYRVAKNHCLQLLRKENKEITLDYTVNVMESDEFLHLLSE EESSEEQLKALHHCLEKLPEEQRTSITRFFLEEMSYADIVEQTGFTLNNVKSYIQNGKRN LKICIKKQAL >gi|225935364|gb|ACGA01000028.1| GENE 181 231474 - 232616 1207 380 aa, chain + ## HITS:1 COG:no KEGG:BT_4460 NR:ns ## KEGG: BT_4460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 379 1 447 449 392 53.0 1e-108 MKLLDYIRGLRKGKEAHRLEKESMQDPFLADAMDGYNQVEGNHEQRIEKLRMQVSAHSAK KKNTYAITWSIAACLVIGFGISSYFLFLKKSMTDEVFIAEESVPAKLPETTTPATPTNPA TPAAPVTPRADKKEMSASAVIEPMMEEALEQTAELQEVAATMDTSESVSDKKMRMAKVVT PPNSNIIQGKVTDEKGEPIIGASVAYKGTNIGTITNMNGEFSLVKKEGKKQLTAQFIGYD PVEIPVDTSQTMLIAMNENKQTLNEVVVVGYGTNKNKKSTTVVTAKEQADKDITPQPVIG KRKYQKYLKENLVRPTDEKCAQVKGKVVLTFLVNKEGRPFYIKVKESLCESSDKEAIRLI QEGPDWIYGNKSVEITVKFE >gi|225935364|gb|ACGA01000028.1| GENE 182 232627 - 233796 990 389 aa, chain - ## HITS:1 COG:BS_yxaH KEGG:ns NR:ns ## COG: BS_yxaH COG2311 # Protein_GI_number: 16081049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 3 389 5 391 402 110 27.0 5e-24 MTQTLKTSERLGVVDALRGFALLAIVLLHNLEHYNLFFIPENMPAWLQTIDKYAWDTMFF 
LFAGKAYATFSLLFGFSFYIQFHNAEKRGIDFRGRFAWRLCLLFLFAQLHALFYNGDILL LYAVVGFALIPVCKLKDKTVFWIALILLLQPYEWGRAVYAMINPDYVAVTGHCVPYAIRA QEATLNGNFLEVLRSNIIDGQLYSNIWQVENGRLFQTAALFMFGMLLGRRKYLIKSEESV CFWKKMLIGAILAFIPLYCLKTFVPDLLTNPSIQVPYNIAVPSYANFAFMIILVSIFTLL WFKKEKGYSWQSLLIPYGRMSLTNYISQSIMGVTIYYGFGLAMYKYAGATASLLIALLIF TVQLIFSRWWLARHKQGPLEFLWRKGTWI >gi|225935364|gb|ACGA01000028.1| GENE 183 233880 - 234752 760 290 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 6 290 5 290 290 318 50.0 5e-87 MYANKVKKIAAVHDLSGMGRVSLTVVIPILSSMGFQVCPLPTAVLSNHTQYPGFSFLDLT DEMPKIIAQWKKLEVQFDAIYTGYLGSPKQIQIVSDFIKDFRQPDSLIVADPVLGDNGRL YTNFDMEMVKEMRHLITKADVITPNLTELFYLLDEPYKADNTDEELKEYLRLLSDKGPQV VIITSVPVHDEPHKTSVYAYNRQGNRYWKVTCPYLPAHYPGTGDTFTSVITGSLMQGDSL PMALDRATQFILQGIRATFGYEYDNREGILLEKVLHNLDMPIQMASYELI >gi|225935364|gb|ACGA01000028.1| GENE 184 234888 - 235469 555 193 aa, chain - ## HITS:1 COG:no KEGG:BT_4457 NR:ns ## KEGG: BT_4457 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 193 356 92.0 3e-97 MSSREEILASIRKNTQTRYDKPDIADMERLSYPDKIEQFCAISRAVGGTAVVLGEGEDVN AVIRRTYPDAMRIASVLPDISCATFNPDNLDDPKELDGTDVAVVKGEIGVAENGAIWIPQ TVKYKAIYFISEKLVILLDRNKIVNTMYDAYRELDGQEYQFGTFISGPSKTADIEQALVM GAHGARDVLVILI >gi|225935364|gb|ACGA01000028.1| GENE 185 235466 - 236851 1137 461 aa, chain - ## HITS:1 COG:ykgF KEGG:ns NR:ns ## COG: ykgF COG1139 # Protein_GI_number: 16128292 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Escherichia coli K12 # 29 458 34 473 475 313 38.0 7e-85 MSTKHSKAAEKFLQDSKMAAWHNETLWMVRAKRDKMSKEVPEWEELRNRACELKLYSNSH LEELLLEFEKNATANGAIVHWAKDADEYCAIVYEILNEHNVHHFIKSKSMLAEECGLNPF LMERGIDVVESDLGERILQLMHLEPSHIVLPAIHIKREQVGELFEKEMGTEEGNFDPTYL 
THAARKNLRPLFLNAEAAMTGANFAVASTGDIVVCTNEGNADMGTSYPKLNIAAFGMEKI VPDLDALGVFTRLLARSATGQPVTTYTSHYRRPREGGEYHIIIVDNGRSTLLSKPDHIKT LNCIRCGACMNTCPVYRRSGGYSYTYFIPGPIGINLGMAHDPEKYYDNLSACSLCMSCSD VCPVQVDLAEQIYKWRQDLDGLGKANTGKKIMSGGMKFLMERPALFNAALWAAPVVNGLP RFMKYNDLDDWGKGRELPEFAKESFNEMWKKNKVQGKEESK >gi|225935364|gb|ACGA01000028.1| GENE 186 236848 - 237588 594 246 aa, chain - ## HITS:1 COG:BH1832 KEGG:ns NR:ns ## COG: BH1832 COG0247 # Protein_GI_number: 15614395 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Bacillus halodurans # 1 245 1 238 244 161 36.0 1e-39 MKVGLFIPCYINAIYPNVGVASYKLLKSLGVDVDYPLDQTCCGQPMANAGFEDESMKLAL RFGDLFREYDYIVGPSASCVAFVKENHPGILEKEGHVCQTAGKICDICEFIHDVLKPSKI PARFPHKVSIHNSCHGVRELFISAPSEMNIPYYNKLRDLLDLVEGIEVFEPSHIDECCGF GGMFAVEEQAVSVCMGRDKVKDHMATGAEYIVGADSSCLMHMQGVIKREHLPIQIIHIVE ILASQS >gi|225935364|gb|ACGA01000028.1| GENE 187 237641 - 238192 228 183 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 2 182 16 224 225 92 34 2e-17 MTMRKINEIFYSLQGEGYHTGTPAIFVRFSGCNLKCDFCDTQHEEGTMMTDEEIITKVKK YPAVTVVLTGGEPSLWIDDQLIDLLHQAGKYVTIETNGTHPLPASIDWVTCSPKQGAKLA IDRMNEVKVVYEGQDISIFELLPAEHFFLQPCSCNNTALTVDCVMRHPKWRLSLQTHKLI DIR >gi|225935364|gb|ACGA01000028.1| GENE 188 238207 - 238563 403 118 aa, chain - ## HITS:1 COG:mll5797 KEGG:ns NR:ns ## COG: mll5797 COG0720 # Protein_GI_number: 13474825 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Mesorhizobium loti # 1 100 2 110 119 65 37.0 3e-11 MFTVIKRMEISASHKLVLPYRSKCASLHGHNWIITVYCRSMRLNSEGMVVDFTRIKEVVM EKLDHQNLNEVLPFNPTAENIARWVCKQLPQCYKVEVQESEGNIVIYEKDSASAEGQE >gi|225935364|gb|ACGA01000028.1| GENE 189 238693 - 239052 198 119 aa, chain + ## HITS:1 COG:PA1439 KEGG:ns NR:ns ## COG: PA1439 COG2832 # Protein_GI_number: 15596636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 8 117 21 129 135 92 45.0 2e-19 MKTLYIVLGSISLALGILGIFLPLLPTTPFLLLTAALYFKGSPRLYNWLLNHRHFGPYIR NYRENKAIPLRAKVISLVLMWGTMLYCIFFLIPLIWVKILLGLIAAGVTYHILSFKTLK >gi|225935364|gb|ACGA01000028.1| GENE 190 239022 - 239588 568 188 aa, chain - ## HITS:1 COG:no KEGG:BT_4451 NR:ns ## KEGG: BT_4451 # Name: not_defined # Def: putative MTA/SAH nucleosidase # Organism: B.thetaiotaomicron # Pathway: Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 188 1 188 188 370 93.0 1e-101 MLKILVTYAVQGEFVEIKWPDVEPYYIRTGIGKVKSAFHLAEAICQVQPDLVLNIGSAGT VNHQVGDIFVCRKFVDRDMQKMKEFGLECEIDSSALLEEKGYCEHWTEEGICNTGDGFLT ELTHVSGDVVDMEAYAQAFVCRSKEIPFISVKYVTDIIGHNSVKHWEDKLADARQGLSHY FNVLKERI >gi|225935364|gb|ACGA01000028.1| GENE 191 239814 - 240140 369 108 aa, chain - ## HITS:1 COG:no KEGG:Aave_4587 NR:ns ## KEGG: Aave_4587 # Name: not_defined # Def: hypothetical protein # Organism: A.avenae # Pathway: not_defined # 3 105 1 105 122 61 34.0 9e-09 MKIMDGEDIWRQVIQFVEEERWGGEFTRETDLARDLKLQGDDAYEFISLFSRIFNVNVEK 
FVFEEYFYPEGDWILPKLLDLILGRKEKVKKRITLGDLERAVKEGKLV >gi|225935364|gb|ACGA01000028.1| GENE 192 240670 - 241398 549 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171443|ref|ZP_05757855.1| ## NR: gi|260171443|ref|ZP_05757855.1| hypothetical protein BacD2_06215 [Bacteroides sp. D2] # 1 242 10 251 251 406 100.0 1e-112 MKKVSLLLLTILLTIGCNQKPKEQVVSQESNPQLTAAEIRNQEIQDSLKKVRNDSLALIA WGDVKFGMSMREALATETFKGGDKFPDSNRISMNFNDERNFRKAFGLNESSEIWAKFQEN ELTRIYIESHYLTANSINDLVSDCDIFIKNFTEKYGEPSYKKSKVNISEFNSGEEFEYAK FQIGDKTITIVLGELSSEVKFYYHVYIDNSKFPQKKHVMTDKEKREEQKQKEDTEKIRNN SF >gi|225935364|gb|ACGA01000028.1| GENE 193 241429 - 242217 383 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171444|ref|ZP_05757856.1| ## NR: gi|260171444|ref|ZP_05757856.1| hypothetical protein BacD2_06220 [Bacteroides sp. D2] # 24 262 1 239 239 442 100.0 1e-122 MFMPHHHNSEINNSMKKVFFLLFMATIVSLVGNAQSRTNSDPTALSYKSKEIKSALYWQQ SSKTGRWESRKNTKRVYLGEGVAIENFNNIFIGEYNNKRYLFLDFYRYFWRYPALKKEWM YSRTIMAALLTDDDYSHLSSLATNQTLSIIPRFYHSMYKGHAEYSFPFFLTLGETLLSAT ETLYKSNKRAYGEEYANKEWKKDYPPINFIVLKRITDSDGQDVVRFILYPKALPELIDYT YFEVDYSVYKNLLTKDKKVSYK >gi|225935364|gb|ACGA01000028.1| GENE 194 242267 - 242512 90 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171445|ref|ZP_05757857.1| ## NR: gi|260171445|ref|ZP_05757857.1| hypothetical protein BacD2_06225 [Bacteroides sp. D2] # 1 81 26 106 106 155 100.0 6e-37 MGRRAVYDKINSVLSSGKIESESHYVIENGNVYWEVEYHHWSGHPGTQIAFMHFMSNGNY AVDITKQEFEKQTGKKAKKGK >gi|225935364|gb|ACGA01000028.1| GENE 195 242626 - 243087 377 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171446|ref|ZP_05757858.1| ## NR: gi|260171446|ref|ZP_05757858.1| hypothetical protein BacD2_06230 [Bacteroides sp. 
D2] # 1 153 33 185 185 270 100.0 2e-71 MKNLLLLMITALMFAGCSKDEEKDEFYEKQITANELESGTGTYVMNKGSYTYYLVFENGK LGYYTYKGGKFTSIHSVDYSINKNDLNLVKHPYMEEVLGGEREYYTLYISFVHWGHEKMT DYGTGGEQLLIRGDNYPYEFATGYYDKSSISLK >gi|225935364|gb|ACGA01000028.1| GENE 196 243531 - 244007 365 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171447|ref|ZP_05757859.1| ## NR: gi|260171447|ref|ZP_05757859.1| hypothetical protein BacD2_06235 [Bacteroides sp. D2] # 15 158 15 158 158 259 100.0 3e-68 MKKILLLLAILPILTTACSKDDKSTEQTFFVNVYTKWENDEEEISKQAFVYIFANENKSI DNAKSAESVADDGVITYTDGSKSSKPKYATKYQSGVFNIENMPNGEYILWVTDMNEYGGA CYSSYKKISVNESYRGTSEKKVFLRTAQDRGLYLYQNW >gi|225935364|gb|ACGA01000028.1| GENE 197 244183 - 244818 311 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171448|ref|ZP_05757860.1| ## NR: gi|260171448|ref|ZP_05757860.1| hypothetical protein BacD2_06240 [Bacteroides sp. D2] # 1 211 1 211 211 343 100.0 3e-93 MTKSTGNELYTTFYNKYIKNKTLTPRSYALTLKNLKTGESSYIRGYWNKKEGVKLMEGTY EVTGTSSPIYNSYLYQKLDTVYLAFKENIAINSNTTSVNLSAKYNSFMLMFDTDNTKSIE YGYGENSSNNIVLSKVDNIYYMFLDKLSIAGNDRLRIKRTSGSESNIGISKTPFENGKYY YFNDITNSFDVPPNGTRKLIQSASQVLIFTV >gi|225935364|gb|ACGA01000028.1| GENE 198 244891 - 245409 255 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171449|ref|ZP_05757861.1| ## NR: gi|260171449|ref|ZP_05757861.1| hypothetical protein BacD2_06245 [Bacteroides sp. 
D2] # 1 172 1 172 172 333 100.0 2e-90 MKKGIYLFMFIAVLGGCKQQLNFVKVANNIYMNQIQAFGDAMLLKGLQAYREKSNILERL RYSAANDTVFALEMLGFQGDLYLTYWNKVDTISYTNTEDKPGYVSNLLFTKYMMGLVSQW NILKIKEEEKDNSSLIPKELVYATRIIIRKNTYKVECVRFNDFFNLERDCHY >gi|225935364|gb|ACGA01000028.1| GENE 199 245969 - 246382 432 137 aa, chain + ## HITS:1 COG:no KEGG:BT_4432 NR:ns ## KEGG: BT_4432 # Name: not_defined # Def: putative non-specific DNA-binding protein HU-1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 137 1 137 137 224 85.0 6e-58 MSILYKTTRFADTFTGDKETNVRVQLLSWDTLDTKAFVEYLADKNNITKGDAYRNLSMIL EGIEAILKDGNILNLDDFGSFSLNGSFCEDKEPGENHRAESIEVKNVVFKAEKRLKKAIT AAGFEKYNPERHNKRKY >gi|225935364|gb|ACGA01000028.1| GENE 200 246389 - 248188 1077 599 aa, chain + ## HITS:1 COG:no KEGG:BT_4431 NR:ns ## KEGG: BT_4431 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 599 1 599 599 1077 87.0 0 MEYDRIYSIRKGEYFADALKRAGKDFIPTNCIINKLLPGLGATHCELTALRKSIIIEPNV PVIESKAKVHKNSLAIYKGVSVRQITDFLEKNQDKHYKLLTTPEGFTKIKEAMLAVDIDM YTECFILFDECEKLVQDVHYRDSIREPMNDFFHFQHKALISATPIVPEKDNSFDSFMRVL IQPDYEYRQKLKLITTNNVLETLQEVIEAKRGTVCIFCNSIDSIDSFYRLIPELNNACTF CSEDGQYKLWKGNRRKKSMMITELERYNFFTSRFYSAVDILCKNPPHVILISDLHGATQS VIDPTTEAIQIIGRFRGGVNTVTHIASIRPDLECMSANEIDSWIQGASHIFNGWKTQLAQ TSNIGERTLLQEAIGENSYLPYLDANGKPDPFLIANLYEKEQVKRLYTSTDLLCSAYQQT DYFIFSHEERLMPVSDNERMAIQHRLAKKKRAELIVRKLEEMEKMSRTTDKKVQKRYQRM LGNLLTTINDRYIYDCFCRFGGDFIRESDYNENKLRAALNVSSEHTIKQSVQMRSSIQRT FPIGSELSVQEVKSMLKQVYKKMGLNTGRGITTKELEQYAEIEKGWKHDNRTIKILKFK >gi|225935364|gb|ACGA01000028.1| GENE 201 248333 - 250810 2029 825 aa, chain - ## HITS:1 COG:Cj1013c_2 KEGG:ns NR:ns ## COG: Cj1013c_2 COG0755 # Protein_GI_number: 15792340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Campylobacter jejuni # 544 813 12 280 287 251 50.0 4e-66 MKLIRLIASPVLMYILAGLYALILAVATLVENFYGPAVARESFYYAPWFILLQLLQAANL 
LAMFLQGSYFKRISKGSLIFHGAFLFIWLGAAVTHYVGVTGIMHIREGETANSMMKEEGT GMENASLPFSVTLEDFRLQRYPGSHSPMSYESDLVIKRENESPLQATIRMNKVIDVNGYR LFQSSFDSDEQGTVLSVSYDRPGMQLTYTGYFLLLVGFVLTLFSKKSRFGRLRRELGEMK KNTPFCLLLFLALSGISNMQALSAQQPVASSQQSLLVSQLPCVSSSHAEKFGSLVVLNPN GRLEPVNSYTSAILRKLYGADKLNGINSDQFFLNLLSFPDEWGAFPFIKVDNKELLQRFG RDGKYIAWQDVFDADGNYILTNELNTIYAKPAAERKRLDSDLLKLDESVNIIYRIMQHQL LPLFPDGNDLQGKWYSPGDDLSAFHGKDSLFVTKIMDWYIYELGNGVRSNNWKEADKVVE MMNVYQQAKAKVPAIDNRKVKAELLYNQLDLFFWCRLAYLILGGILLFIACGEIIADFKW GRKISSILIALLAIAFLTHTGGVLLRWYICGHAPWANAYESMICTSWMLVGSGLLFARRF RILPALAGLLGGIMLFVAGLNHLNPEITPLVPVLQSYWLMSHVAIIMIGYVFFALCALTG LFNLVLMSLLSATNRVKLLFRIRELTLLNEMAMILGLFFMTAGTFLGAIWANVSWGRYWG WDPKETWALISIVVYALVLHIRFIPLLKGKTTWCFNLLSVVAVLSVIMTWFGVNYYLSGL HSYGKTEGGDFLLWIWGLGVCLVLVLALFARRRLKKLSATFSSFF >gi|225935364|gb|ACGA01000028.1| GENE 202 250855 - 251985 677 376 aa, chain - ## HITS:1 COG:no KEGG:BT_4429 NR:ns ## KEGG: BT_4429 # Name: not_defined # Def: putative pteridine-dependent dioxygenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 381 381 607 79.0 1e-172 MKYHRITCNLYKTTGNSWEQMAEELLSSAVREERVLRLVLFTRCTDNQEYKFQRSFLEQW VERHFVSPRPVVSLVAQKPLVADLVLEVHSLVEITDEAVTIEEQVTSSSVRYLRIMTPHY REIIVGGLYADDLNLPVREQSEQALRKAEEILKTEQMNFGDIVRQWNYLENITDIAHGNQ CYQDFNDVRTLFYASSEWESGYPAATGIGTQHGGVLIDFNAVSGEVDIAPLDNDWQRAAH VYSDEVLISHRADRKKGTPKFERGKSLSDRRQEVIYISGTAAIRGEESVTTGDVLSQTEI TLENIQHLIGLEEGREKLPEHSGKLGLLRVYLKNEEDASAVKADLDKLCPDIPIVYLYAD VCREELLVEIEGIAYL >gi|225935364|gb|ACGA01000028.1| GENE 203 252025 - 252909 645 294 aa, chain - ## HITS:1 COG:no KEGG:BT_4428 NR:ns ## KEGG: BT_4428 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 294 22 296 296 541 89.0 1e-152 MTRIYSAILIFLFSASLLAQTRTFAGTDYSQGIVFVMENNQIVWQHKAPDSNDLWVLPNG NILFTTGHGVLEMTRQNDTIFHYESKSPVFACQRLKNGNTFVGECTTGRMLEISPKGKIV KEVCILPEGVKEKGFAFMRNARRLDNGHFLVAHYGPQCVTEYDTNGKEVWKLDVPGGPHS LTRLPNGHTLIAVADKDRNPRIIEVTADGKTVWELSNADIPGKPLKFLGGFQYFSDGRIL 
ITNWTGHVNPKEKVHMLLVDRQKKVLYSLENTPGLKTMSSVYSMDVPAGVTSYH >gi|225935364|gb|ACGA01000028.1| GENE 204 252926 - 254728 1359 600 aa, chain - ## HITS:1 COG:no KEGG:BT_4427 NR:ns ## KEGG: BT_4427 # Name: not_defined # Def: surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 600 2 599 599 1110 91.0 0 MKKKILIPVGMAAIFLCASWQEAETTVSPDRLFLADNGKSLFVTNRAGCELIKMSSDGQK MEKKVSFSSPVNAMTQDANGKLWVVCDGNYGTMYELDGKKLSVQSKTKSGATPSDILYNP LSKSLWVTQRFNNELWEIDPATRKVKTKIAVGREPVSMAAFAGDSCLLIANNLPEMPSTA YPIAVQLDMVDVLSKKVSGRVMLPNGSTDVKSVAVDKNHTFAYVTHLISRYQLPTNQLDR GWMATNTLSIIDLKARKWLTSVILDTPQKGAANPWSVIVTPDDKQIIVAAAGSQELVRID RIALHERLGKAKQGEMVTPSMKAWGNIPNDAGFLYGIRDFIPTQGKGPRSVVATGGKIYT ANYYTSELVSMDLNGKNVQKQILGAPLAFTKVGKGDMYFHDATICFQNWQSCATCHPNDA RMDGLNWDLLNDGMGNPKNTKTLLLSHQTPPCMATGIRKNAEVAVRSGVKYILFMEGNDE IYESIDEYLKSLKPLPSPYLENGKLSAKAKRGKKIFEENCASCHSGEYYTDQKQYKVDWT TGPDKGLAMDVPALNECWRTAPYLYDGRSYSMKDMLKVHGPHKPVSDKELEELEEYVLSL >gi|225935364|gb|ACGA01000028.1| GENE 205 254725 - 257220 1961 831 aa, chain - ## HITS:1 COG:no KEGG:BT_4426 NR:ns ## KEGG: BT_4426 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 830 7 844 844 1500 82.0 0 MKLLKCGMACCVFLSIVAWQTKDTSLQPTDAKGFIVEIQKKYAEIQAIKQKGNQEETENK IKAVHRRLTRAYPVYYDWWLQDGTTGDVDWFNKSFNQELSVRLQKLNIKAAVTNTPESIE SAFLSYLKACEQRRIKRLEAFTADKPEIVFTKYRTLRPSFFAYTEGVSDARAECNYIAGG ALAKLKMNGIWAEVETMLTDEEGVVRDPNLHFDGQHLLFSWKKSPKEDDFHLYEMDLKTR EIKQLTFGKGHADIEGIYLPDDNILFNSTRCGSTVDCWFTEVSNMYLCDREGRYMRQVGF DQVHTVTPTLLDDGRVVYTRWDYNDRGQVWAQPLFQMNPDGTGQAEYYGMNSWFPTTVAQ IRQIPGTRKLMAVFMGHHTPQHGKLGIIDPEAGRDENEGVMFVAPVHKPEPERIDGYGKF TDQFQHPFPLSETEFLISYTPLGYYVGHPMEFGVYWMNADGERELLVSDARISCNQPVLV APRKRPFRRSSSIDYTKNEGVYYMQNIYEGNGLKGVKPGTIKQLRVVEIQFRAAGVGEVN GNDKGGGAIMSSPVGVGNAAWDVKRVLGVTEVYPDGSAFFKVPARRPLYFQALDENGRVV QTMRSWSTLQPNEVQSCVGCHEHKNTVPVAGHPVSMAMNKGVKALAPEDEMGERNFSYLK EIQPIWDKHCISCHDGVKQPMSLKGELKVMDKRSKRKYTDSYLNLTHATQKKEEGSWRGN AHHPEVNWISALSEPTLLPPYFAGSNTSNLIKRLESGHGGTKLTPQEIRKVALWIDLLVP 
QIGDYREANNWSQKDLDFYNYYDKKREAARAEDQENIRQYIQSLQTKQEKK >gi|225935364|gb|ACGA01000028.1| GENE 206 257286 - 258002 941 238 aa, chain - ## HITS:1 COG:BS_dra KEGG:ns NR:ns ## COG: BS_dra COG0274 # Protein_GI_number: 16080993 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Bacillus subtilis # 18 224 3 208 211 196 53.0 3e-50 MEKKNINEVIAELSVEQLAGMIDHTFLKPFGDASPIEKLCAEARQYQFAMVAINPAEVET CVKLLEGSGVRVGAAIGFPLGQNTVECKAFETRDAIAKGATEIDTVINVRALQKGQTEIV KKEIEDMVSICKPAGVICKVILETCYLTDEEKETVCRIAKEAGVDFVKTSTGFGTAGATV HDVALMRRVVGPTIGVKAAGGIRDLDSALALIQAGATRIGTSSGIQIVESYKELKKGL >gi|225935364|gb|ACGA01000028.1| GENE 207 258045 - 259079 1277 344 aa, chain - ## HITS:1 COG:SMa1156 KEGG:ns NR:ns ## COG: SMa1156 COG1063 # Protein_GI_number: 16263079 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 1 342 4 355 357 107 26.0 2e-23 MKTVIVPAPGKIEIRQVETPVINAYQALVKTEMVALCNATDSKLIAGKFPGVDTYPLALG HENAGIVVAVGEKVRNFKVGDRAIGGLISDFGAQGVDSGWGGFSEYVVVNDFEVLKEEGL ATPEQGCWDSFEIQNTVPSHVQPEEAVISCTWREVLGAFKDFNLTPGKKVIVVGSGPVGL SFVKLGKLFGLGQIDIVDMLPAKLEVARRMGADNGYTPAEISTPEFIAAANRSYDAVIDA VGLDVVVNSVLPLVKMGGDVCVYGVMTKNPTFDLSKAPYNFDLHMHQWPTRSEEKAAMTT LAQWIEEGRLSASDFITHRFAIEEIEEAFAAVKRGEVLKCVLTF >gi|225935364|gb|ACGA01000028.1| GENE 208 259094 - 260620 1611 508 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 2 496 3 500 500 317 34.0 5e-86 MYILAHDLGTSGNKATLFDESGLLIASRTAAYPTDYASGNRAEQNPLHWWKAIVDTTQAL LKLVSPNDIAGVALSGQMMGCLCVDKNGHPLRPHMLYCDQRSQEEETILSEKIDPLHFYE ITGHRISASYSIEKLMWVKKHEPEIFAQTAKMLNAKDYINFRLCGTLATDPSDASGTNAY DLNRWQWSEEIIEAAGLDLSLFPEVRSSIDVIGEVTNEAARETGLLAGTPVICGGGDGSC 
AGVGVGCVAPGSAYNYLGSSSWVALTVEKPIVDEQRRTMNWAHVVPGMLHPSGTMQAAGS SYNWMINQLCQHEQALATQLGRSVFELIDEQILSSPIGANKLLFLPYMLGERTPRWNVDA KGAFIGLTLGHTHGDMLRAVMEGITLNLGFIVNIFRKHVPIDQVTVIGGCAQNPVWRQMM ADIYQAEIRVPNYLEEATSMGAAILAGIGAGVFKDFSVIDRFVRIEQTVQPIQENVKKYE AWMPVFDKAYHALCDMYTEIAKTELDSI >gi|225935364|gb|ACGA01000028.1| GENE 209 260650 - 261129 621 159 aa, chain - ## HITS:1 COG:no KEGG:BT_4422 NR:ns ## KEGG: BT_4422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 254 96.0 5e-67 MSEWLQTYNEFFGMYHAGIITSIFGAFAVTFTVLMSWPKLVKDFGPIGGFMAAALIIGTF WVVNHKLPGFGFSTGLLNDADGLPMQFCLIHQGNRGSAPWVDMGWAIAMGFILADVLCAP KGTRGGLLKEAFPRWLVIILGGIVGGIFVGLTGYTNAAL >gi|225935364|gb|ACGA01000028.1| GENE 210 261143 - 261514 284 123 aa, chain - ## HITS:1 COG:no KEGG:BT_4421 NR:ns ## KEGG: BT_4421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 1 122 123 211 96.0 5e-54 MKQYLSTFAGSAICGGFAFGIWPELWKTYGLMGGWLAATLIIGIMWYMNHYHGAILNPPG KIWLDQGWCIGSAGIAWGIVRFQGDITNFFYAVPTLICCLIGGALAGILVWKIRSCDCAR KIN >gi|225935364|gb|ACGA01000028.1| GENE 211 261566 - 262654 884 362 aa, chain - ## HITS:1 COG:no KEGG:BT_4420 NR:ns ## KEGG: BT_4420 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 362 694 91.0 0 MKKYYLLSLVAASLMGATCIQCTSPKQKGGENALPQKSELFANLPDCCPTPDGMAIAPNG DLILACPNFADITQPACLMRITKNGEITKWMDVPVLEETGWASPMGIAFNEEGDLFICDN QGWSGAEKAQNKGRVLRLKFENDQLKETITVASGMEHPNGLRIRNGKLYVTQSSLSQIKD PSGLLVSGVYCFDMNDRDVAVTNTLEDKNLITTVITKNPEVQYGLDGIVFNEAGDLFVGN FGDGAVHRIKMDVEGNVFTNDVWAKDTTQLRTTDGMCIDDKGNIWVADFSANAVARIDKD GKIQRIAQSPDCDGSDGGLDQPGEPIVWNGQVIVSCFDLVTGPDKVNRKHDKPFTLAKLS LE >gi|225935364|gb|ACGA01000028.1| GENE 212 263036 - 263329 229 97 aa, chain + ## HITS:1 COG:no KEGG:BT_4419 NR:ns ## KEGG: BT_4419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 121 92.0 8e-27 
MTEEEKKLLNSFETQLRHLIYLHDELKRENAELKKLLDNEKLKNEKVQAQYDELEVSYTN LKTATAISLNGSDVKETKLRLSKLVREVDKCIALLNE >gi|225935364|gb|ACGA01000028.1| GENE 213 263341 - 263631 354 96 aa, chain + ## HITS:1 COG:no KEGG:BT_4418 NR:ns ## KEGG: BT_4418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 130 91.0 1e-29 MNDKIKINLQIADSNYPLTINREEEQMVREAAKQVNVRLNKYREVYKNLEPEKIIAMVAY QFSLERLQLMQRNDTSPYVEKVKELTELLEDYFKEE >gi|225935364|gb|ACGA01000028.1| GENE 214 263755 - 265290 1914 511 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 34 511 44 514 514 397 48.0 1e-110 MIAIIATAIACFIVGGILSYILFRYVLKSKYDNVLKEAETEAEVIKKNKLLEVKEKFLNK KADLEKEVALRNQKIQQAENKLKQREMVLSQRQEEIQRKKMEAEAVKENLEAQLVIVDKK KDELDKLQQQEIDKLEAISGLSAEEAKERLVESLKEEAKTQAQSFINDIMDDAKLTASKE AKRIVIQSIQRVATETAIENSVTVFHIESDEIKGRIIGREGRNIRALEAATGVEIVVDDT PEAIVLSAFDPVRREIARLALHQLVTDGRIHPARIEEVVAKVRKQVEEEIIETGKRTTID LGIHGLHPELIRIIGKMKYRSSYGQNLLQHARETANLCAVMASELGLNPKKAKRAGLLHD IGKVPDEEPELPHALLGMKLAEKYKEKPDICNAIGAHHDETEMTSLLAPIVQVCDAISGA RPGARREIVEAYIKRLNDLEQLAMAYPGVTKTYAIQAGRELRVIVGADKIDDKQTESLSG EIAKKIQDEMTYPGQVKITVIRETRAVSFAK >gi|225935364|gb|ACGA01000028.1| GENE 215 265355 - 266110 613 251 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 5 245 2 242 244 204 43.0 1e-52 MEKFQIEICTNSVESCLAAQEGGANRVELCAGIPEGGTTPSYGEIAIAREVLTHTRLHVI IRPRGGDFLYSDIEIRTMLKDIEIARRLGADGVVFGCLTADGEVDLTSMQILMEASKGLS VTFHRAFDVCRNPQKALEEIIELGCNRILTSGQQPTAEQGIGLLKELQGLAAGRITLLAG CGVNENNIAWIAAGTGINEFHFSARESIQSEMKFRNEVVSMGGTVHINEYERNVTSVRRV KETVEALYKGM >gi|225935364|gb|ACGA01000028.1| GENE 216 266230 - 267444 1374 404 aa, 
chain - ## HITS:1 COG:PAB1772 KEGG:ns NR:ns ## COG: PAB1772 COG1883 # Protein_GI_number: 14521092 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Pyrococcus abyssi # 23 401 19 399 400 170 31.0 4e-42 MENIDFATLFQGIGTMMASGWVLASARVFLALLGLLLIYLGWKGILEPMVMIPMGLGMVA INCGTLMMPDGTLGNLFLDPMLSDTDQLMNTMQIDFLQPVYTLTFSNGLIACFVFMGIGA LLDVGFLLQKPFASIFLALCAELGTFLTVPIASALGLTLKESASVAMVGGADGPMVLFTS LALAKHLFVPITVVAYLYLGLTYGGYPYLVKLLVPKRFRAIKMVTKKAPKNYDAKVKLAF SAVLCAILCFLFPVASPLFFSLFLGVAVRESGMKHIYDFVSGPLLYGSTFMLGVLLGVLC DAHLLLDPKILKLLVLGIVALLLSGIGGIIGGYIMYIIKRGNYNPVIGIAAVSCVPTTAK VAQKIVSKDNPDSFVLGDALGANISGVITSAIITGIYITVIPYL >gi|225935364|gb|ACGA01000028.1| GENE 217 267459 - 267890 335 143 aa, chain - ## HITS:1 COG:no KEGG:BT_4414 NR:ns ## KEGG: BT_4414 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 41 183 183 238 82.0 5e-62 MIHLLLSTLAMMGLTFLIGFFVAAVIKLIAYAADSFDFYSSHRLELLRLRRWRQHRQKVE RLVRQIPIADEDALGDSREDYSKGIDRNFTGYAGYYHGVSPGASENNLMNYYYSQDTREF FLREEEQAHVNKKNSKKLTNNKQ >gi|225935364|gb|ACGA01000028.1| GENE 218 268137 - 268979 622 280 aa, chain - ## HITS:1 COG:PAB1785 KEGG:ns NR:ns ## COG: PAB1785 COG0543 # Protein_GI_number: 14521075 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 277 8 288 292 206 39.0 4e-53 MNEQSNIYLPHLMVIEKITHEAPGVKTFRLRFKDEKEGEAFHFKAGQFGEYSAFGEGEST FCIASSPTRKGYIECTFRQAGRVTTGLAKLEEGATVGFRGPFGNTFPLDEWKGKNLLFVA GGIALPPMRCVIWNALDRREDFKDITIVYGAKSVNDLVYKEELKEWENRPDVNLITTVDP GGETPDWTGKVGFVPSVLEAAAPASADTVAIVCGPPVMIKFTFPVLEKLGFADENIYTTL ENRMKCGVGKCGRCNVGKLYVCKDGPVFTKAQLNDIPPEY >gi|225935364|gb|ACGA01000028.1| GENE 219 268972 - 269985 383 337 aa, chain - ## HITS:1 COG:PH1290 KEGG:ns NR:ns ## COG: PH1290 COG1145 # Protein_GI_number: 14591101 # Func_class: C Energy 
production and conversion # Function: Ferredoxin # Organism: Pyrococcus horikoshii # 15 337 19 344 372 157 31.0 2e-38 MENLHISKSSLQEWFHQMVKKEMHIFAPVHSGDKVDFKRVTSYDEVATDYVQTTQSAKRF AFPKTEVLFSYQKDGKEATLQEADINAIPETILWKIRPCDAAGFAPLSGIFNWDYKDKLY NARREKMTLISFSCAQCDESCFCTSVHGGPGNTAGSDIQITELPDQSALVEVLTAKGKAL VKFFVKEYTPAEEIDKEQYLASVPTRFNVDNVREKLVGAFDSPVWKQQSERCLGCGACAY VCPTCACFDIQEDAKGSSGHRIRCWDSCGFSLFTLHTSGHNPRPTQSTRWRQRILHKFSY MPERIQETGCTGCGRCSRACPVDMNILEHLISISSHE >gi|225935364|gb|ACGA01000028.1| GENE 220 269995 - 270795 503 266 aa, chain - ## HITS:1 COG:MJ0005_2 KEGG:ns NR:ns ## COG: MJ0005_2 COG1145 # Protein_GI_number: 15668177 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanococcus jannaschii # 144 247 7 105 115 79 38.0 9e-15 MSKLTEKAVALLQEGEVNLVIGYEEGNHGTRPLFCRQAENADRLILDDRCTNNIAVYLIK RELTGTGKVAITATVPALRTIVQLAFENQLKEDNLLVLTVDGHGEVIQFKGFDEIKTYLA DFPLTISEENLQLIDRLKKMSREERWKYWMEEMSKCIKCYACRAACPLCYCSRCIVEVNC PQWVQPWSAPLTNMEWQINRVMHMAGRCIGCGACKQACPVGIPLHLMIQSMMEDIQGEFG VAPGAINPAGNVLSTFKAEDKENFIH >gi|225935364|gb|ACGA01000028.1| GENE 221 270788 - 271261 609 157 aa, chain - ## HITS:1 COG:MA2868_2 KEGG:ns NR:ns ## COG: MA2868_2 COG1908 # Protein_GI_number: 20091692 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, delta subunit # Organism: Methanosarcina acetivorans str.C2A # 14 139 1 126 145 144 46.0 6e-35 MNENKIENNQEFEPRIVAFVCNWCTYAGADLTGTSRLKYATNVEIVRFPCTGRIDFMLLL KAFAGGADGIIVSGCHPNDCHYTSGNFHARRRWIVFRGLLDFLGIDVRRIRYSWVSAAEG AKWADLVNETVAGIRELGPYLEYQKASDYLEKEAIYE >gi|225935364|gb|ACGA01000028.1| GENE 222 271276 - 273276 1904 666 aa, chain - ## HITS:1 COG:MK0265 KEGG:ns NR:ns ## COG: MK0265 COG1148 # Protein_GI_number: 20093705 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit A and related polyferredoxins # Organism: Methanopyrus kandleri AV19 # 3 660 7 648 656 631 50.0 1e-180 MSKIGVFICHCGENISATVDCEKVAEAAKEMEGVAYAIDYKYMCSDPGQTLIKNAIKEYG 
LDGVVVAACSPRMHEPTFRRACAEAGLNPYLCEIANIREHCSWVHEKGEATTRKAVDIVK SLVEKVKRNHPLVPIQVPITKKALVIGGGIAGIQASLDIANCGHQVILVEKEPSIGGHMS QLSETFPTLDCSQCILTPRMVEVAQHPNITLYTYAELEHVEGFIGNFKASIRLKAKSIDE KLCTGCGLCTTKCPNKKIPSEFNAGLGMRTAIYVPFPQAVPNKPVIDKEHCTHYRNGKCG VCEKLCPTGAIRFGQEDRIITEEVGAIVVTTGFNVLNTDFFPEYGYGKYKDVITGLQFER LASASGPTFGEIRRPSDGQIPQKIVFVACAGSRDPAKGIPYCSKICCMYTAKHAMLYQHK VHGGESYVFYMDIRAGGKNYEEFVRRAIEEDGVNYVRGRVARIYEKNGKLIVKGVDTLLG ASPVEIEADMVVLATAGVANKGAEELAQKMHISYDPYQFFAESHPKLKPVETNTAGIYLA GACQAPKDIPETVGMASGAAVKVAGLFSQANLVREPLIAVVNRTAPPLFSTCVGCFMCQT ACPYQAIEREEIKDRNGNVIKTVAKVNPGLCQGCGTCVAFCRSKSIDIQGYSNEQVYAEV MALLNH >gi|225935364|gb|ACGA01000028.1| GENE 223 273264 - 274130 769 288 aa, chain - ## HITS:1 COG:SSO2358 KEGG:ns NR:ns ## COG: SSO2358 COG2048 # Protein_GI_number: 15899116 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit B # Organism: Sulfolobus solfataricus # 1 278 4 286 293 179 31.0 4e-45 MRIGFYPGCSLNGTSREYNESVKALAQAMGLELIELKDWNCCGATAAHSMSKQLSLALPT RVLALAEQQGFQEIVVPCASCYNRLSVAQYELMHNEALKDEILTTVGMRYLGNVKILNVI QMIEKYMLDALKEKIVRPFAHQVACYYGCLLVRPHKILNFDRLEDPQSMDKIMSLIGATP IDWAFKTECCGAGLSVSRTDLVGRLSGNILKDADDRGAEAVIVACPMCHSNLDMRRPAIN HYLAKPVTIPVLYITQAIGLAVGLTPKELGLGRHFVAVNLKEAEVCLK >gi|225935364|gb|ACGA01000028.1| GENE 224 274133 - 274687 296 184 aa, chain - ## HITS:1 COG:SSO1134 KEGG:ns NR:ns ## COG: SSO1134 COG1150 # Protein_GI_number: 15897991 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit C # Organism: Sulfolobus solfataricus # 16 181 57 225 230 90 32.0 1e-18 MENKECISYDLEQKTGIDVSYCYQCGKCTAGCVLSEEMDYAPSYILRLLQTKNAQNDRRV LSSNAIWICLNCENCIARCPKEINIPEVMDYLREKSRKEGCISKESRPVVAFHSAFLESV KRTGRLYEVGLVAGFKARTLRLTQDLNVAPVMFVKGKLNLLPERVKDENKIKKIFAQTIE KTQK >gi|225935364|gb|ACGA01000028.1| GENE 225 274790 - 275227 190 145 aa, chain - ## HITS:1 COG:no KEGG:BT_4413 NR:ns ## KEGG: BT_4413 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 145 1 145 145 202 73.0 3e-51 MTKYIFCILACAFLVLGLSNVSDKELQGFQKETEISSNCMQYTAESSCRISSADLNKSSK VSHSQFVDETSNCKVSDCIFDNISFPRSIVPFKNLKFNTNTVIIQILSSLNVLLPENRLG HTDLDVNYTKYSCKYYVYTLAHILI >gi|225935364|gb|ACGA01000028.1| GENE 226 275310 - 278066 2008 918 aa, chain - ## HITS:1 COG:no KEGG:BT_4412 NR:ns ## KEGG: BT_4412 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 918 1 905 905 1176 64.0 0 MKTFTHLLCVLILSIALFACNNAHFLKEEAYRNQVIQDFELKKQALPHGDLFAVFADSVL SVYEREALMFLYAYMPIGDVTDYPGDYYLENVRLSKQTRSEMPWGKEIPDEVFRHFVLPI RVNNENLDDSRRVFYGELKDRVKGLPMKEAILEVNHWCHEKVVYRPSDARTSSPLASVKT AYGRCGEESTFAVAALRSVGIPARQVYTPRWAHTDDNHAWVEAWADGHWYFFGACEPEPV LNLGWFNAPASRGMLMHTKVFGRYNGPEEMMLETPNYTEINVTENYAPTAKASVTVRDKN GQPVAGARVDFKVYNYAEFYTVATKYTDTNGQVSLTAGKGDMLVWASDKGTFGFSKLSFD KQPELTLTLDKKEGDIFEEDIDIVPPVENPILPEVTPEQRAENDRRMMQEDSIRNAYVAT FPTAEQADSIVSCLKGKLSSFAGKALASFLLDSRGNHDVLVRFLNEADRQGKLMKGAALL SILTKKDLRDVRYEVLIDHLLNTKDVDTYLYDCVIPPFHCMDASTEYVYDILAPRASTEA LTPYKSFFQSKFSEAEMDTFRTRPQALVEWVNRNITIDEENNFQRIPISPEGVWRAKVAD SYSRDLFFVALARSMNIGADIRSTDGRVRYVSWPENRWGSEFMEVDFDKQEAVEASRGIY HFYEGDKAIARDDKRVKYYSKFTISRLQEGRPELISYEEQDPRLRNMGVLDAGYYLLVTG TRLADGGVLARISSFVLPAQKDEFKPVATKVPYHLRESGEKVAVIGNFNSESLFTPVERI GEKVMPLARQSILQTCGRGYFVVAVLGAGQEPTNHALKDIAALGNDFEQWGRKMVFLFPS EEQYKKFNAAEFKGLPSTITYGIDTDDSIRKGIVQAMNLNNSILPVFIIADTFNRLVFVS QGYTIGLGEQLMKVVHGL >gi|225935364|gb|ACGA01000028.1| GENE 227 278391 - 278528 141 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKNANEKIMMLQYRIKRYQAMGNGAMCQTLNGKLQKLLSQQVAM >gi|225935364|gb|ACGA01000028.1| GENE 228 278622 - 279800 1113 392 aa, chain + ## HITS:1 COG:no KEGG:BVU_3875 NR:ns ## KEGG: BVU_3875 # Name: not_defined # Def: aminopeptidase C # Organism: B.vulgatus # Pathway: not_defined # 3 392 4 396 396 581 71.0 1e-164 MKKLFMSVALLLLAVTSFAQTPGYEFTTVVSHKATPVKDQGSTGTCWCFATASFMESELL RMGKGEYDLSEMFIVRQKYMNQIEDNYLRRGKGSIGEGSLAHTFKNAYKKAGIVPEEVYT GLLNDSKDHNHGALSRYLKALVDANIASKKRTPEYDALINNLFDIYLGKLPEKFTYKGKE 
YTPQSFTESLGLNMDDYIELTSFTHKPYYETFSPEVPDNWENQPMYNLPLDELIGAIDYA LNKGYTVCWDGDVSEQGFSFKNGIAINPQVEDVKDYSTTDRARFEAMPKYQRMDEVFKFE HPYPEINVTPEIRQDGYEKFVTTDDHLMHITGIVKDQNGTKYYITKNSWGAESNKSGGYL NMSESYVRAKTICVMVHKDSLSKELKKKLGIQ >gi|225935364|gb|ACGA01000028.1| GENE 229 279942 - 280643 888 233 aa, chain - ## HITS:1 COG:no KEGG:BT_4411 NR:ns ## KEGG: BT_4411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 233 382 90.0 1e-105 MKKVLFAGLLVIAGVSVSAQNLIKNEKFATEVKTKVTNANKATAGEWFIMNNEADGATTI AWEQTGDAKYPNAMKLDNSGADKNISWYKAFLGQRITDGLEKGVYVLTFYAKAKEAGTPV GVYIKQTNEEKNDNGKYNTTFFMRRDYDADAQPNASGAQYNFKIKDADKWTKVVVYYDMG QVVNAISSKKANADLEVSDTDDDAAILKDCYLAILSQNKGGVVEISDVTLRKK >gi|225935364|gb|ACGA01000028.1| GENE 230 280704 - 282638 1741 644 aa, chain - ## HITS:1 COG:no KEGG:BT_4410 NR:ns ## KEGG: BT_4410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1256 91.0 0 MTRMKHLLCVFLVAALGSLSFKANAYTERNMLQKAADETTLKNVLVMKQAWVPYPAYTDR TAWDSLMGPNKQRLIEAGEKLLDYKWQLIPATAYLEYERSGNRKVMEVPYDANRQALNTL MLAELAEGKGRFIDQLLNGAYMSCEMNSWVLSAHLPRQSSKRSLPDFREQIIDLGSGGYG ALMAWVHYFFRKPFDKINPVVSLQMRKAIKERILDPYMNDDEMWWMAFNWKPGEIINNWN PWCNSNALQCFFLMENNKDKLAKAVYRSMQSVDKFINFVKSDGACEEGTSYWGHAAGKLY DYLQILSDGTGGKLSLFNEPMIRRMGEYMSRSYVGNGWVVNFADASAQGGGDPLLIYRFG KAVNSDEMMHFAAYLLNGRKPYATMGNDAFRSLQSLLCCNELAKVTPKHDMPDVTWYPET EFCYMKNKNGMFVAAKGGFNNESHNHNDVGTFSLYVNTIPVIIDAGVGTYTKQTFGKDRY TIWTMQSNYHNLPMINGVPQKFGQEYKATNTVCNEKKRTFSTDIATAYPAEAKVKSWIRS YALDNNKLIIGDSYTLNEVIAPNQLNFLTWGNVTFPSEGKIRIEVKGQKVEMTYPSQFKA ELETIKLDDPRLSNVWGKEIYRITLKTEEKKATGKYEFVIQQVK >gi|225935364|gb|ACGA01000028.1| GENE 231 282621 - 282782 87 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFHSSHGIFIHLSIAKIVYLIKNEVSKSYEPNDKKVHTSKFILIIASFQAFRL >gi|225935364|gb|ACGA01000028.1| GENE 232 282965 - 284026 829 353 aa, chain - ## HITS:1 COG:SMa2355 KEGG:ns NR:ns ## COG: SMa2355 COG0389 # Protein_GI_number: 16263727 # Func_class: L Replication, 
recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Sinorhizobium meliloti # 1 330 34 363 379 341 53.0 1e-93 MDAFYASVEQRDNPELRGKPLAVGHAEERGVVAAASYEARRYGVRSAMSSQKAKRLCPQL IFVSGRMDVYKSISRQIHEIFHEYTDLIEPLSLDEAFLDVTENKKDISLAVDIAKEIKQK IREQLNLVASAGVSYNKFLAKIASDYRKPDGLCTIHPEQALDFIARLPIESFWGVGPVTA KKMHLLGIHNGLQLRKCSLEMLTGHFGKAGTLYYECSRGIDERPVEAIRIRKSIGCERTL ERDISARSSVIIELYHVAVELIERLQRKDFKGNTLTLKIKFHDFSQITRSLTQSQELTTL DRVLPLAKELLKSVEYEQHPIRLIGLSVSNPKEEADEQHGVWEQLSFEFSDWD >gi|225935364|gb|ACGA01000028.1| GENE 233 284272 - 285699 648 475 aa, chain + ## HITS:1 COG:no KEGG:BT_4408 NR:ns ## KEGG: BT_4408 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 466 1 462 463 636 68.0 0 MKNILFGLLISCSVPIFAQNGITFKVERLSKPEELLDLDSQDDIYKSLILSYMDIEPYEI KKKNIEIPYHIIAKSQSPDSLVSFGTNSFFNGMYQAYADHRPFVLSPDMIWLLISQGFAR HINANQESMRNDLVDFSSKLSLIVRADKKLEDPTFSWEELFPQFTKQISEHVGNHLTELL TCNFSTTTSLEKVASEITIMEAVKPYFEFIIIRMVCGIPKITLEGTPEDWEKVLQKARSL KEYKLEWWISELEPLLEEFVKASKGQADKSFWRNMFKYHSQKQYGAPNIIDGWIVKFFPY DKEGRRNNLKQLEGRNCLPDEIVKVDVKYQEVYTDAVKETPLEFWAGFIGLEQNSKNFAL RPQIGWMVRKKDVNNDGLKNKLSADAQNDGWGSGINIRVKEFPTVLLELKEIKRLDIQFI DKIDIPDEISKIKIESLTLYGKITKEGIERIKKLLPDTDIKINGSRGGINSLIAP >gi|225935364|gb|ACGA01000028.1| GENE 234 286348 - 287529 876 393 aa, chain + ## HITS:1 COG:PA4204 KEGG:ns NR:ns ## COG: PA4204 COG2706 # Protein_GI_number: 15599399 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Pseudomonas aeruginosa # 48 392 30 385 388 199 34.0 6e-51 MKKTFLSKTFIFNSFAAVCILGISISSCTSKKKTHMETTGTTENELTMLVGTYTSGTSKG IYSFRFNEKDGIAVPLSEVEIENPSYLVPSADGKFVYAVSEFNDERAAANALAFDKEKGT FQLLNSQKTGGEDPCYIITNGKNVVTANYSGSSISVFPIAKDGSLLPASDILKFEGSGVD KERQEKSHLHCIRITPDGKYLFADNLGTDQIHKYVINPNADITNQESFLKEGQPAAYKVK AGSGPRHLTFAPNGNYAYLINELSGTVIAFEYKDGNLKEMQTIAADTVGAKGSADIHISP DGKFLYASNRLKADGIAIFRIHPDNGMLTKTGYQLTGIHPRNFIITPNGKYLLVACRDSN VIQVYERDADTGLLTDVHRDIKIDKPVCIRFVP 
>gi|225935364|gb|ACGA01000028.1| GENE 235 287684 - 290188 1566 834 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6170 NR:ns ## KEGG: Cpin_6170 # Name: not_defined # Def: membrane or secreted protein # Organism: C.pinensis # Pathway: not_defined # 17 790 15 807 844 450 34.0 1e-124 MIKRILFILLCSLGYLCQIGAQELVYMDSSGVIRWKKNKQEIALFGANYCLPSACDFRAA DYVGGDRKQMVVEDLDHFKRMNWDALRLCFWGDFQNTDRLGNLQENEHLELFDYLIAEAC KRDIYMLLSPIVTYDSQWPEMRDTTNTGLAKYYPKTTLIHDENAVRAQENYMKQLLNHRN PYTGRCLKDEPNILFVELINEPTQFPEDIPGMVCYINRMCKAIRSTGCKKLTFYNVSQDF RVSPAIKKSDIQGSTYAWYPSALNNNHSIEGNGLLFVDRYEQMLHPDLKGKAKLVYEFDA TDMASGFMYPAMVREYRRGGIQFAALFSYDMLRTAPMNLGWQTHFFNMVFTPSKAVSGMI AAEVMRRVPRGKHFGYYPDNRTFGDFRVSYDERLSEINSGDLFYYSNTTTTQPKDLKALK HIAGVGSSPVVHYSGTGIYFLDKQDNNNWQLEVYPDIMDVDDPFKMLNKHRVSRKSAYNE RAIKIQLPGFEIETTILPGKYLFKEGKLVSKEELPAKDFYQIPMKEWKIANHTWSEFTSD KEITFCCEVFGPKRPKQVDVYLMLKPWGCKRIPMTAEDGFYYKAQVDLSWLAKGNYEYHF GVDTGEDTLLFPAKTYCTPERWDYYEQATYVMRLVNETTPLSLLGAQDNWKHIRRTRTFR SPESQFSPVVSGTELTPAFQLSVSDLEKKDDYTAPCDVTFSHYIGGRIAWRSKSKTVPSY IRIRAYGVNNTDKAICNLVDREGRGYGAVFNLKDEVSDILIPVSELIPTKAAMLPQDWPG VNPYWYPASAQNNNGVALDWRIIDFVQISLREELYDVENQKNKGIVVEKIDLLF >gi|225935364|gb|ACGA01000028.1| GENE 236 290201 - 292543 1984 780 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 61 205 38 183 612 122 40.0 3e-27 MKSIILTSALCAMTATVMAQQVYELSAPVKEKVIYTGHLKLGGTRPGGGSIDVNSYYMSM DGKPVIPVMGEFHYSRYPHEQWEEEILKMKAGGINVLPTYIFWSLHEEQEGVFNWSGNLD IRRFFELCKKHDMNVIVRIGPFGHGEIRNGALPDWLFAKPLEVRSNDENYLFYAKRLYEE IAKQLEGLYYKDGGPIIGTQIENEHQHSAAPWGVNYPGEPLDMTTATYDSSITMIGVSVQ DKKITTAELGDLHMKTLKQMAEDAGIITPIYTATGWGNAAVIGNEAIPVTAAYTYPFWAE PQMSPFCMFKDIQKNPDYAPVRYDTDKFPSFCAEMGAGIQMIYKRRPIVTAKAAEALMVR TLGSGANGIGYYMYHGGSTPKMIGGVASFNDEPMGMPKVSYDFQAPLGEFGLEHESYRTL RLLHSFLADFADRLAPMETVLPEGYEKITPDNRETLRYAARMKAGSGFVFMVNFQDHDTA RIDQKDLQLRLKLDNETIQIPAKGSFTLPKDESVILPFNFLMEDALLKYATAQLLMKIDD 
NGKEHYFFFTPDGMNTEYVFDKATVRGKNQFAPVPGLKSTFSIRTKAGKEVKITTLTREQ ALNSCKIDGKLLITEATVLPGSGEVQLLSLGNNRFEYVLYPSKQGFKPYIKEVEIVKPDF SYKKVGARRLTVHFDDKLQAPQVQEYFLRLNYTGDVAMAFMNGTLVLDHFYHGAPWTIGL KRFSQQMEKEDLSFYFRPLRSDAPFLIDLPKEAIPDFSKGAVCTINHVEVVPEYITTIKF >gi|225935364|gb|ACGA01000028.1| GENE 237 292590 - 294101 1422 503 aa, chain - ## HITS:1 COG:no KEGG:Rmar_1481 NR:ns ## KEGG: Rmar_1481 # Name: not_defined # Def: RagB/SusD domain protein # Organism: R.marinus # Pathway: not_defined # 21 499 29 496 499 268 37.0 4e-70 MKTMKYIATALMLFSMTSCELERLDYTEISPENFFKTETDLKLAVNSLYYDFNPGDFKSV YSADYSGYQIIGDMTTDVLWSCWAWESDELYFQQWYATVGGSIQNHAYSNFERYNFLSKA RNTIRRIENSPVSEEAKTLYSGEAKALRGWMGLYLYDMFGPVPVAPDEVLDDPQTFVYLP RLTEEEYDTMMETDLRDAIRDLPEVAEARGRMTKGAARMILLKYYMIKGYFEKAEILARE LMEMEGRVYSLQPDYNYVFSKEGIGNNEVILQLACNSTASWYSNYMTAEVLPADYPWAEK ATGWGGYVMPWDFYGTFEEGDSRLKNVVTGYTNKKGEKIDRTNSSQLAKGALPLKYGMDP DMKDAQSGVDVVIYRYSDVLLTLAECINRNEGSPTTEAIGLVNRVRNRAGLSALDDSQTA SKETFNEALLLERGHEFYLEGLRRQDLVRFGKYVEYANARIDAINRSEGRGYFNVHEGHN RFWIPQSFIDESKGAIKQNNYDR >gi|225935364|gb|ACGA01000028.1| GENE 238 294114 - 297206 2745 1030 aa, chain - ## HITS:1 COG:no KEGG:BF0536 NR:ns ## KEGG: BF0536 # Name: frrG # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 37 1030 119 1137 1137 710 39.0 0 MEKYIQLVKKAGFMLSFLLMTTSLIAQGIQVKGVVKDPAGEVIIGASVVVKGTTNGSITD LNGHFNLSDVPSGAVITVSYIGYISQEKKVTTAPMEFTLKEDNKTLDEIVVVGYGTAKKS SISGAVASVKADELPTAASASVGSMLRGRSSGMNITQNSANPGASMNISIRGGLSGQTPL IVIDGVPQASSKTVTAGTAYSGGEKDNALINLNPNDIETIDILKDASAAAIYGSDASGGV ILITTKRGKTGKPDISYSGSVAFQYIKDAPDFLNARDFMIEQNKVFDELGRGDEKKYTDR QIANFVGDGTDWMDEVTRVGIVNEHNLSVSAGTDATKYLFSLSYYDHQGIAKNNSMNRIT GRLNVDQTINRLLKAGINSTFSQIKYHDVPLGNERQDNSALIYSAMTFIPTVPVYDEDGK YSVNPIRDIYPNPVSLLDITDQTVSRDLFVSGYLEFRPIKDLLLKATVGFDMKDVQADQY IPTTTKKGYSMDGQASKQNAKSQMNLFNAIANYTKVFNEIHDVNLMAGFEYKKSSWEGMG IVASQFPYDGSLMNNIGTSEQEKPAISSYKGASEMASFIARLNYSLASKYVLTVNLRVDG SSNFSKEHQWGAFPGVSAAWRMNEESWMKNINWLSNLKLRAGMGQTGNAGNLTGINTYYK 
VSQGAFAPNGSLVNGMAISKLGNDKLKWETLTDYNLGIDFGFFNNRLSGSIDLYQRLRKD VIMSKSLMSYHEVKTIDYNSATVYRARGIDIGIHSVNFDNKNFGWTTDINFSYYRNHTTK RDPDFIPAVYQDYVEEWNNIYGYKTNGLIQMDQSYKHLQKSGAGAILYQDLYGYKLDENG EKMRDSEGRYIRIAGEDGILDDADIVVLANNTPIPFSINNTFRWKNWDANIYLYGSLNGW KVNDIKLQSIYGIEDITYGVNVLKEVKNRWSPANPMGTLPGVAEASSGVNPSKSDFFLEK AWYLRLDNVSIGYTFPAQWFKNKIRSVRLYAAGRNLAVFTPYKGMDPETGNGIGAYPNQS SFAIGLDVKF >gi|225935364|gb|ACGA01000028.1| GENE 239 297219 - 298568 1205 449 aa, chain - ## HITS:1 COG:no KEGG:BVU_0840 NR:ns ## KEGG: BVU_0840 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 11 376 11 383 499 107 25.0 1e-21 MKNVLKTIGKLYLLLTLVCLAVGCEDDKEDVAPGLYVANEEIETFPGDTVLVSGTASNYA GLASITLSCESWGIHKAYDLNGQKPKVFNYDYRLIVPKTATFEEHLLITIRDVNGWETKK NVLLTYIADMESPVMQTQLPARIAVDFNTATNKGSWNLNMKFTDDRELKNIRLQIPGMQI DETVEVSGRSGELKRTIDFTTGEFPVTLTVTDAGGNETVVNTTVVVMLAEEEDPIQDYAQ MYIVNADENPNDYVNGYYRYMDRKGEYQYEGKFYAPTDNAQVYFVPTKSMEGDLYGTSPY VSSKLMNKNGYVVPVVLPKKGYYSVWIDLNAHTYSYWNMDIPSGTCTEPLWMSGTGFSFA DWGASDQMTQTDTYRYEVETSVLGDYAGDRQYYFYTSGWARVFRADAAGNWWFEAATGSC ITYKTDYAGKVKVTFDTAALWATIKKVTE >gi|225935364|gb|ACGA01000028.1| GENE 240 298581 - 300170 1580 529 aa, chain - ## HITS:1 COG:TM1201 KEGG:ns NR:ns ## COG: TM1201 COG3867 # Protein_GI_number: 15643957 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinogalactan endo-1,4-beta-galactosidase # Organism: Thermotoga maritima # 183 526 28 376 606 246 37.0 7e-65 MKITSFIVMGFSVVCSMTACSDDPVTVADPEPAVVVVKVPNGSFEEDAAETVSPKGWTIS GDNSAVKVVEGGCEGIYALHYGATSAYTVSTRQTVNGLEDGIYDLEFYYKSSGGQISCCV AAGTDNKKMTSLQASPATWIRSYVRGIKVEGGKCDIEIHSESAETNWSRFDGLRLKKTEK EYNLLKGGDISQLTYVEQMGGKFYENGEEKDCIDILKNNGFNIVRLRLYNDPGNPDYSPS NRLPEGISGPDDILRLAKRAKQAGMQIQLTFHYSDYWTNGETQTKPHDWEGLDFAGLKQA LYDFTFNFMNKMKAQGTTPEFVALGNETQAGMLYPEGSYENFAQLSELYNAGYDAVKAVS QDSKVIIHLNAAGDKSQYNWYFGELKNRHTKYDVIGASYYPFWTDKSAAQMREWADYVSA KFDKDILIMETGYSWNKTLPDGSPGQLADNGPYADFSPLGQKNFMLELIKEIKQAKDCRI LGFLYWDPIFIEVEGMGWELGAKNYVSNTTLFDFSGNRMEVLDAFKYNN 
>gi|225935364|gb|ACGA01000028.1| GENE 241 300273 - 303554 3012 1093 aa, chain - ## HITS:1 COG:no KEGG:BT_4398 NR:ns ## KEGG: BT_4398 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1093 1 1093 1093 2082 90.0 0 MVKAWKEKVVIPTYEVGKPEKNPIFLEKRVYQGSSGVVYPYPVIESMSDEKVDKEYNAVF IENEYIKVMILPELGGRVQMAYDKIKQRHFIYYNHVIKPALVGLVGPWISGGIEFNWPQH HRPSTYMPVDTTIEENADGSVTVWVNEMERMFHQKGMAGFTLRPGHAFLEIKGVLYNRTE VPQTFLWWANPAVAVNDYYQSVFPPDINAVFDHGKRAVSSFPIATGTYYKMDYSAGVDIS NYKNIKVPTSYMAVNSRFNFEGGYENDTRAGMLHVANHHISPGKKQWTWGNGDFGRAWDR NLTDEDGPYIELMAGVYTENQPDFTWLQPYEEKSFVQYFLPYRELGVVKNASKDLLMNIE PEGEDSVRFKIFATSRQMVNVVLKNEDGKVYYNKEKTITPEELLDETVNVKGEEFNRLNL EITASGKELLYWHAEPEEVKPIPDAAEAALLPEEIKTTEQLYLTGLHLEQYRHATYNPVE YYEEALRRDPIDVRNNNALGLWYIRKGRFCRAEQYLLTAVKTLQKRNPNPYDGEPIYNLG LALKYQGRYNEAYDRFYKSCWNAAWQDAGYFACAQISIMQNRLEDALDEVDRSLIRNWHN HKARALKTTILRRMGKTEEALRLIDDSLAIDQFNFGCRYEKYLIMSTSEELVTLKEMMRG EAHNYDEIALDYCAAGCWQEAASLWKIAIEEAAVTPMTYYYLGWCLFQGGLPGAHQAFAD AKMACPDYCFPNRLEAIIALQCAMEQNPDDAKAPYYLGNLYYDKRQYDLAMEAWETSAKL DDSFPTVWRNLALAYFNKKNEETKAIKYMERAFRLDTTDARILMELDQLYKRVRRLHKER LAFLQQYPELIMQRDDLILEEITLLNQTGEYKKAKTLLDAHNFHPWEGGEGKVPAQYQLA RVELAKQALTVGNYKEAIALLEECLEYPHHLGEGKLYGAQENDFYYFLGCAYEGLGQNEK AAECWEQAILGPTEPAAAMYYNDAKPDKIFYQGLALLKLGRLDEANGRFHKLTSYGEKHL FDKIKMDYFAVSLPDLLIWEDDLTIRNVIHCKYMMALGYWGLNQKEKSIRLLNEVERLDI NHQGIQAFRTLIG >gi|225935364|gb|ACGA01000028.1| GENE 242 303711 - 305093 1040 460 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 4 455 5 475 491 290 36.0 4e-78 MKSYNKKFVYFICLVSAMGGLLFGYDWVVIGGAKPFYELYFDIADVPAMQGLAMSVALLG CLIGAMVAGMMADRYGRKPLLIISAFIFLSSAYATGAFSVFSWFLAARFLGGIGIGIASG 
LSPMYIAEVAPTSIRGKLVSLNQLTIVLGILGAQITNWLIADPIPADFTPADICASWNGQ MGWRWMFWAAAFPAAVFLLLACFIPESPRWLAMKGKEDRAWKVLGQIGGDDYADQELRLV EETKSSKSEGGLRLLFSRPFRKVLILGIIVAVFQQWCGTNVIFNYAQEIFQSAGYSLGDV LFNIVVTGVANVIFTFVAIYTVDRLGRRALMLLGAGGLAGIYLILGTCYFFEVSGFFMVV LVVLAIACYAMSLGPITWVLLAEIFPNRVRAVAMATCTFALWVGSFTLTYTFPLLNNFLG SSGTFWIYAAICAVGYLFFFRALPETKGKSLEALEKDLIK >gi|225935364|gb|ACGA01000028.1| GENE 243 305265 - 306158 532 297 aa, chain + ## HITS:1 COG:PA3571 KEGG:ns NR:ns ## COG: PA3571 COG2207 # Protein_GI_number: 15598767 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 14 293 26 298 307 151 31.0 2e-36 MVKLKDGFTGERALVLPRIIVDKMEEDPLTSMLHITDIGYYPKAKHHYRERIEPINQYVF IYCIDGAGWYRIGEQEYTVSANQYFILPAGVPHAYASNKSLPWTIYWIHFKGILAPFYAQ EASRPMDIQPELHSRISTRINLFEEIFNTLKNGYSNENLRYAFSMFHFYLGTLRYIQQFR NAAAGNDATDDENISELAIHYMKENMEKHLSLQDIADQIGYSPSHFSMLFKKKTGHSPLT YFNLLKIQQACLLLDTTDMKINQICYKIGIEDTYYFSRLFSKIMGMSPREYRKSKKG >gi|225935364|gb|ACGA01000028.1| GENE 244 306086 - 306874 539 262 aa, chain - ## HITS:1 COG:SMc00539 KEGG:ns NR:ns ## COG: SMc00539 COG0739 # Protein_GI_number: 15965497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Sinorhizobium meliloti # 74 193 271 398 413 89 39.0 4e-18 MEAKSCSLNNEAFFSGLKKKKYMKNILILLTVLLLPLVANGQNKSSFSAREVSDVRVATP GLFAKDNHIYLRLDSLKDHEYAFPLPGGKVISAYGTRGGHSGTDIKTCAKDTIRAAFDGV VRMAKPYYAYGNIVVIRHPNGLETLYSHNFKNLVKSGDIVKAGQPIALTGRTGRATTEHV HFETRINGEHFNPNLIFNLKEGTLRKECIKCTRNGSKIVVKTHIPDNRIAQSPKEVKLPY PFLDLRYSRGDMPIILLNNREK >gi|225935364|gb|ACGA01000028.1| GENE 245 306846 - 307511 715 221 aa, chain - ## HITS:1 COG:SSO0658 KEGG:ns NR:ns ## COG: SSO0658 COG3382 # Protein_GI_number: 15897568 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 44 206 42 208 224 84 35.0 1e-16 MFEIKVSKEIKDACPVFAGAAVYAEVKNTAYSEGLWHEINAFTEQLTSTTQMEDIKHQPV 
IFATREAYKRCGKDPGRYRPSAEALRRRLMRGIPLYQIDTLVDLINLVSLRTGHSIGGFD ADKIRGIHLELGIGKAGEPFEGIGRGVLNIEGLPVYRDSFGGIGTPTSDHERTKMDLGTT HILAIVNGYNGKDGLKEAAEMIQTLLRNYAESDGGEIMFFE >gi|225935364|gb|ACGA01000028.1| GENE 246 307562 - 308086 386 174 aa, chain - ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 6 174 110 277 278 115 36.0 3e-26 MRRLLLRGMDALITNVPGYCVCVTTADCVPVLLYDKKQHVVAAVHAGWKGTVKHIVSHVM DHLNQKFGTQGEDVVACIGPSISLESFEVGDEVYDAFKESGFDMSLISMKKKETGKYHID LWEANRIELLNAGVPAEQIEVAGICTYIHHDEFFSARRLGIDSGRILSGIMIRK >gi|225935364|gb|ACGA01000028.1| GENE 247 308371 - 309537 1105 388 aa, chain - ## HITS:1 COG:aq_2069 KEGG:ns NR:ns ## COG: aq_2069 COG0536 # Protein_GI_number: 15607036 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Aquifex aeolicus # 6 335 4 342 343 271 48.0 1e-72 MAESNFVDYVKIYCRSGKGGRGSTHMRREKYCPNGGPDGGDGGRGGHIILRGNRNYWTLL HLKYDRHAMAGHGESGSKGRSFGKDGADKVIEVPCGTVVYNAETGEYLCDVTEDGQEVIL LKGGRGGQGNWHFKTATRQAPRFAQPGEPMQEMTVIMELKLLADVGLVGFPNAGKSTLLS SISAAKPKIADYPFTTLEPNLGIVSYHGGKSFVMADIPGIIEGASQGKGLGLRFLRHIER NSLLLFMVPADSDDIRKEYEVLLNELRTFNPEMLDKQRVLAVTKSDMLDQELMDEIEPTL PEGIPHVFISSVSGLGISVLKDILWEELNKESNKIEDIVHRPKDVTRLQQELKDMGEDED IVYEYDDNVDDDDDDIDYEYEEEDWEEK >gi|225935364|gb|ACGA01000028.1| GENE 248 309653 - 310222 751 189 aa, chain - ## HITS:1 COG:CC1269 KEGG:ns NR:ns ## COG: CC1269 COG0563 # Protein_GI_number: 16125518 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Caulobacter vibrioides # 2 187 1 186 191 174 45.0 9e-44 MLNIVIFGAPGSGKGTQSERIVEKYGINHISTGDVLRAEIKNGTELGKTAKGYIDQGQLI PDELMIDILASVFDSFKDSKGVIFDGFPRTIAQAEALKKMLAERGQDVSVMVDLDVPEEE LMVRLIKRGKDSGRADDNEETIKKRLHVYHSQTAPLIDWYKNEKKYQHINGLGTMEGIFA EICEAVDKL >gi|225935364|gb|ACGA01000028.1| GENE 249 310290 - 310826 770 178 aa, chain - ## HITS:1 COG:DR1376 KEGG:ns NR:ns 
## COG: DR1376 COG0634 # Protein_GI_number: 15806393 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Deinococcus radiodurans # 13 173 10 171 176 124 40.0 9e-29 MDTIQIKDKQFTVSIKEQDIQKEVIRVANEINRDLAGKNPLFLSVLNGSFMFTADLLKHI TIPCEISFVKLASYQGVTSSGVIKEVIGLNEDIAGRTVVIVEDIVDTGLTMQRLLDTLGT RNPETIHIASLLVKPEKLKVNLNIEYVAMEIPNDFIVGYGLDYDGFGRNYPDIYTVVD >gi|225935364|gb|ACGA01000028.1| GENE 250 310868 - 312412 1324 514 aa, chain - ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 7 510 24 520 521 278 35.0 2e-74 MSALKGHPKGLYLIFATSTAERFSYYGMRAIFILFLTQALLFDKEAAASIYGSYTGLVYL TPLIGGYIADKYWGIRRSVFWGAMMMAVGQFLMFMSASTLDNTELAHWMMYGGLGFLILG NGCFKPTVSSLVGQLYEPGDRRLDAAYTIFYMGVNVGSFAAPLICGYLGDTGDPHDFKWG FLAAGIMTLFTVVLFETQKNKYLFSPSGEPIGIIPDAKREKKEDKAEHISHPKMDKRTKL RNTLVITILTVALIAFFNYAFEGDWVSIGIFTACIVFPVLILLDGSLTKIERSRIFVIYI VAFFVIFFWAAYEQAGASLTLFASEQTNRDILGWEMPASWFQSFNPLFVVVLAYIMPGVW SFLNKRKMEPASPTKQAIGLLLLSLGYLFICFGVKDAIPGVKVSMIWLTGLYFIHTMGEI ALSPIGLSMVNKLSPLRFASLMMGIWYLSTATANKFAGMLSGLYPEAGKVKSIFGYQIAT MYDFFMVFVVMSGVASLILFLLSKKLQKMMHGVE >gi|225935364|gb|ACGA01000028.1| GENE 251 312618 - 312839 222 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4384 NR:ns ## KEGG: BT_4384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 2 74 74 114 80.0 1e-24 MEKTKIGLNAGKVWRILNEKGELSMFDLCRELSLTFEDVALAIGWLARESKVFLRKREGM LFASVENVEFTFG >gi|225935364|gb|ACGA01000028.1| GENE 252 313241 - 313429 126 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGQEQLISLFPAAHLLIINKFSSHNKREKSEESCTFVIRKLKEHRNENISKQQYQEIGRL YY >gi|225935364|gb|ACGA01000028.1| GENE 253 313371 - 314888 608 505 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225093729|ref|YP_002662469.1| ribosomal protein S15 [gamma proteobacterium HTCC5015] # 1 496 4 491 497 238 
31 2e-61 IAMKIFPSSSIKKLDAYTIENEPIASIDLMERAATALTKAITDRWNMETPVTIFAGPGNN GGDALAVARMMAEKGYKIEVYLFNPKGELSPDCQTNKELVEMMEEVTFHEISTQFVPPVL TPDHLVIDGLFGSGLNKPLNGGFAAVVKYINSSPAMVVAIDIPSGLMGEENTFNVKTNII RADVTFSLQLPKLAFLFAENTEFVGEWELLDIQLSEDGIEETETNYEMLEIEDIRSLIKP RKQFAHKGNFGHALLIAGSKGMAGASALAARACLRSGVGLLTVHAPMCNNDILQTSIPEA IVETDANETCFAVPTDTDDYQAVGIGPGLGRNEETEAALLEQLEHCQAPVVVDADALNIL ANHRHALTHLPKGSILTPHPKELERLTGKCQDSYERLMKACELARAAHVHIILKGAYSAI ITPEGKCFFNPTGNPGMATGGSGDVLTGVILALLAQGYPTEEAAKIGTYIHGLAGDVAQK KQGMIGMIASDIITCLPTAWRLVSE >gi|225935364|gb|ACGA01000028.1| GENE 254 314961 - 316028 1159 355 aa, chain + ## HITS:1 COG:no KEGG:BT_4382 NR:ns ## KEGG: BT_4382 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 1 354 354 571 87.0 1e-161 MKKLIIAAGLLMATSAYAQTEVLTGVTRGKDYGVVYSLPKTQIELEIKANKVNYTPGEFS KYADRYLRLANVSTEPEEYWELASVKVKSVGVPNSETTYFVKLKDKTVAPLMELTEDGIV KTINVPYSKSSSDKKAAPAPTVLSKIANPREFLTEEILMASSTAKMAELVAKEIYNIRES KNALLRGQADNMPSDGAQLKIMLDNLNAQEEAMTQMFSGTRDKEEKTFTVRLTPDKEFNN EVAFRFSKKLGVVANNDLAGTPFYISLKDLKSVKIPQEDGKKKKDLDGIAYNVPGQAMVT LTDGKKKLYEGELPITQFGVIEYLAPVLFNKNSTIKVYFDPNTGGLLKVDREEGK >gi|225935364|gb|ACGA01000028.1| GENE 255 316052 - 316336 285 94 aa, chain + ## HITS:1 COG:no KEGG:BT_4381 NR:ns ## KEGG: BT_4381 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 94 1 94 95 129 78.0 4e-29 MGGEGHMLDMIRRLSEGREASRLRRERANDKLKHLNRTNEPYPLPNTTPEEMGRIINDSE KKKEKDSNYFVWGTLIIMGVLIATAVIFWAIFLN >gi|225935364|gb|ACGA01000028.1| GENE 256 316598 - 316942 408 114 aa, chain - ## HITS:1 COG:no KEGG:BT_4380 NR:ns ## KEGG: BT_4380 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 113 1 113 115 187 82.0 8e-47 MAAKEKINLLEVIPCRSEHITAEREGETIVLSFPRFKYPWMQRFLVPAGMSKELHVRLEE HGTAVWELIDGKRTVREIIEELADHFQNEAGYESRVSAYLSQMQKDGFIKLFIN >gi|225935364|gb|ACGA01000028.1| GENE 257 317039 - 318427 1448 462 aa, chain - ## HITS:1 COG:no 
KEGG:BT_4379 NR:ns ## KEGG: BT_4379 # Name: not_defined # Def: putative oxalate:formate antiporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 462 462 823 96.0 0 MTEHLKQKLNDSPVLRWSVLALVAFTMLCGYFLTDVMSPLKPMLEKELLWDSLDYGFFTS AYGWFNVFLLMLIFGGIILDKMGVRFTGMGACLLMVFGCGLKYYAISTTFPEGAMLFGFK TQVTLAALGYAIFGVGVEIAGITVSKIIVKWFKGKEMALAMGLEMATARIGTTLAMVLTV PLADFFGYTDEGGAFHTNIPAPILFCLIMLCVGTIAFFLYTFYDKKLDASLDAEGLEPEE PFRMKDIVYIITNKGFWLIALLCVLFYSAVFPFIKYAADLMVQKYNVDPKLAGTIPGLLP IGAIILTPLFGSLYDRIGKGATLMVIGSLMLIFVHTMFALPILNIWWFATIIMIILGFAF SLVPSAMWPSVPKIIPEKQLGTAYALIFWVQNWGLMGVPLLIGWVLNTYCKGPVVDGAQT YDYTLPMTIFALFGVLALIVALMLKAENKKKGYGLEEANIQK >gi|225935364|gb|ACGA01000028.1| GENE 258 318674 - 319066 415 130 aa, chain - ## HITS:1 COG:no KEGG:BT_4378 NR:ns ## KEGG: BT_4378 # Name: secG # Def: preprotein translocase subunit SecG # Organism: B.thetaiotaomicron # Pathway: Protein export [PATH:bth03060]; Bacterial secretion system [PATH:bth03070] # 1 130 1 131 131 178 85.0 5e-44 MYLLFVILMVIAALLMCFIVLIQNSKGGGLASGFSSSNAIMGVRKTTDFLEKATWGLAIF MVVMSIATAYVVPRSAVVKDAVLEQAQKEQQTNPYNLPAGTAAPQTEAPATNAPATETPA PAETPAPKAE >gi|225935364|gb|ACGA01000028.1| GENE 259 319073 - 319852 510 259 aa, chain - ## HITS:1 COG:no KEGG:BT_4377 NR:ns ## KEGG: BT_4377 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 259 1 264 264 363 79.0 4e-99 MTSVNFQQWIQHPETLNRDTLYELRNLLARYPYFQSLRLLYLKNLYILHDISFGGELRKA VLYIADRRQLFQLIEGDRYNVQARKKGVPLTEVLKDEPSVDRTLALIDAFLSTVPEEVTA KTSFDYSMDYTSYLLEETLDTEQSSEEMPKLKGYELIDDFIEKSESDSPLCMKPLREEMS SSTTSSDELSESEEETKEEEEEDDSCFTETLAKIYVKQQRYSKALEIIKKLSLKYPKKNA YFADQIRFLEKLIINANSK >gi|225935364|gb|ACGA01000028.1| GENE 260 319858 - 320385 495 175 aa, chain - ## HITS:1 COG:no KEGG:BT_4376 NR:ns ## KEGG: BT_4376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 317 95.0 1e-85 MNWIKKIIRPLVLFVLPLVVIACTVSYKFNGSSINYDKVKTISIADFPIKSEYVYAPLAT 
KFNEDLKDIFIRQTRLQLLKPSQNADLQIDGEITGYNQYNQAVSADGYSSETKLTITVNV RFVNNTNHAEDFEQQFSAFRTYDSTQLLTAVQDGLIAEMSKEITDQIFNATVANW >gi|225935364|gb|ACGA01000028.1| GENE 261 320372 - 321613 1101 413 aa, chain - ## HITS:1 COG:BMEI0866 KEGG:ns NR:ns ## COG: BMEI0866 COG2204 # Protein_GI_number: 17987149 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Brucella melitensis # 15 263 181 428 528 241 48.0 2e-63 MTKAEIQQVKQRFGIIGNTEALSRAIDVAIQVAPTDLSVLITGESGVGKESFPQIIHQYS RRKHGQYIAVNCGAIPEGTIDSELFGHEKGAFTGAIGERKGYFGEADGGTIFLDEVGELP MPTQARLLRVLESGEFIKVGSSKVQKTDVRIVAATNVNLTQAIAEGRFREDLYYRLNTVP IQIPPLRERGDDVLLLFRKFSADFAEKYRMPAIQLTEDAKKELLAYPWPGNVRQLKNITE QISIIETNREITAAILQNYLPAQNIQRLPALMGTRESKGFESEREILYSVLFDMRQEVAE LKKMVHNLMAERAGQVGQVGQVGTVVTTPVVTTHQPSVPAIIHTMQPAVCKDDDDIQDTE EYVEEPPLSLDEVEKEMIRKALERHHGKRKSAAKDLNISERTLYRKIKEYELD >gi|225935364|gb|ACGA01000028.1| GENE 262 321622 - 322716 860 364 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 2 344 3 343 346 335 47 1e-90 MEDNKIRIGITQGDINGVGYEVILKTFSDPTMLELCTPIIYGSPKVAAYHRKALDIQTNF SIVNSATEAGYNRLSVVNCTDDEVKVEFSKPDPEAGKAALGALERAIEEYREGLIDVIVT APINKHTIQSEEFSFPGHTEYIEERLGNGDKSLMILMKNDFRVALVTTHIPVREIATTIT KELIQEKLMIFHRCLKQDFGIGAPRIAVLSLNPHAGDGGLLGMEEQEVIIPAMKEMEEKG ILCYGPYAADGFMGSGNYTHFDGILAMYHDQGLAPFKALAMEDGVNYTAGLPVVRTSPAH GTAYDIAGKGLACEDSFRQAIYVAIDVFRNRQRERVARANPLRKQYYEKRDDSDKLKLDT VDED >gi|225935364|gb|ACGA01000028.1| GENE 263 322721 - 323764 921 347 aa, chain - ## HITS:1 COG:no KEGG:BT_4373 NR:ns ## KEGG: BT_4373 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 345 620 90.0 1e-176 MKKYFFYLSIALVAFVVASCGLKGNHTSSGRAYELLVVIDHGVWDRAAGRALHDALDSDM PGLPQSEPSFRIMYTSPKDYDSTLKLIRNIIIVDIQDIYTKASFKYAKDVYANPQMILTI QAPNEEEFEKFVEENKKTIVDFFTRAEMNRQITLLEEKHSNFISNKVDSLFGCNIWLPSE 
LTNSKTGEDFFWASTNTGTADRNFVMYSYPYTDKDTFTKEYFVHKRDSVMQANIPGFKKG VYMSTDSLLTDVRPINVQNSYTMEARGLWRMKGDFMGGPYVSHTRLDEKNQRIITAEIFV YSPDKMKRNLVRQMEASLYTLKLPNEVQQNQIPLGGVSKAAAEQTKK >gi|225935364|gb|ACGA01000028.1| GENE 264 323783 - 324817 844 344 aa, chain - ## HITS:1 COG:yfgB KEGG:ns NR:ns ## COG: yfgB COG0820 # Protein_GI_number: 16130442 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Escherichia coli K12 # 3 334 17 359 384 253 39.0 3e-67 MSKYPLLGMTLVELQSLAKRLGMPGFTAKQIVSWLYEKKVASIDEMTNLSLKHRELLKQN YEVGAAAPVDEMRSVDGTVKYLYKVGENHFVESVYIPDDDRATLCVSSQVGCKMNCKFCM TGKQGYTANLTASQIINQIHSLPERDKLTNVVMMGMGEPLDNLDEVLKALELLTANYGYA WSPKRITLSTVGLRKGLQRFIEENDCHLAISLHSPLTAQRSELMPAERAYSITEMVELLK NYDFSKQRRLSFEYIVFKGLNDSQVYAKELLKLLRGLDCRVNLIRFHAIPGVDLEGADMD TMTRFRDYLTSHGLFTTIRSSRGEDIFAACGMLSTAKQEENNES >gi|225935364|gb|ACGA01000028.1| GENE 265 324921 - 328271 1897 1116 aa, chain - ## HITS:1 COG:no KEGG:CPS_1799 NR:ns ## KEGG: CPS_1799 # Name: not_defined # Def: hypothetical protein # Organism: C.psychrerythraea # Pathway: not_defined # 14 562 7 532 1241 141 25.0 2e-31 MIKQLAQYSVPAENSRAIRVFLSSTFRDMEMERSALVKLFKGLQVKAASRGVTISLVDLR WGITEEDAKSGKVVEICLKEIVHSRPFFIGMVGERYGWCPSYEDLSQTFSDSLEYSWVED DLNNRLSVTEMEMQFGVLRNPNPLHAYFYIKQSATDPSCPKEEFHKLQHLKECLLQQDRY PVQEYDSPDQLCQFVESAFTELLEQEFPSVLTVEDNQALQEQLTRNELLFNYHPIPEADQ AFSDFLAADEQRCLVVTGDGGLGKSALLAHWSDWVNNDTQMIPIYHRLDNTTLSSYPETL VRALAAKCQNALKCQTLGEEDLFTAEVSKESDSTQGTARSILDIVRESMVFGNNVLDGFS ENTLRNIKEIEEDLNSIQKFSQLWAALGRSKYAIILLIDDISYLNSTEVSLFSLFGSIPS NVKVVLSFAASSTAYLPFVQNRYVHYRLNGFSQADAKSFSKQYLSTYSKALSAQQEDILS SWVLAKQPRCLNVLLNELVSFGQYDTLSEYMSGYCQINEIAQFYDSVLRRLSSDYGFDRI GRALLMLSLTLEGFTEDEVKSMTGINQLLWSQLRVEMSSWLTNKGGRYCICDTQMIEAIE RSFAQDDASIDECRKEIIVALLDDSDTLSHPLTFADYNYRMKQFCYHDSYRYQVEITYQS YKMQHWDFLRDWICDAETFEILYRTNYSLLEDCWKAIMNDNPSLTPAAYVELDFGQIDSF LIPVIANDMATFLSSSFHLTKAAMEVSEKSMEGEAIPLIAKSVLKMNEGCRYARNEEYET ACDCFLKALVMQESIVPTPELEIANTCRNLGLAYYYNEQYNEAVIYLNRALNYHAASTDE 
KSQAEVIELSEFLAYCDYYRDEEESAAKKFRKVAEMHESLNGRLSSGVAKCLRMQGKCLY YIKRYDEAWSLVNEALGIAIQIDNRKQIVACHKELYHLCKEFKRVMKERGDEQSSTLFFR ESLLHEKFFSEKPRLAELTVRYEALRCDIMRQYYENKEYGNVIRIATTLDIHEDIDPNVS SEVCYYEAQAYAKTENYPMAKDAFSRELELRKKYLGWEDDHTILACQNLGVLHSFCNERE DALACFREAYDHEVKKNGSDSAVAKKLLQYIDVVRS >gi|225935364|gb|ACGA01000028.1| GENE 266 328481 - 330622 2556 713 aa, chain - ## HITS:1 COG:no KEGG:BT_4371 NR:ns ## KEGG: BT_4371 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 712 712 1210 89.0 0 MATLQNIRSKGPLLVIVIGLALFAFIAGDAWKVMQPHQAHDVGEVNGDALSAQEYQNLVE EYTEVVKLMRGVTALNDEQTNQVRDEVWRSYVNNKLVEKEAKALGLTVSAAEIQDILKAG VHPLLRQTPFQNPQTGNFDKDMLNKFLVEYAKMSESQMPAQYAEQYNNMYKYWTFIQKTL IESRLAEKYQALVSKALLSNPVEAQDAFDARVNQYNVLMAGIPYSSVVDSTIVVKESELK DLYNKKKEQFKQYQETRDVKYIDVQVTASAEDRAAIQKEVDEATAQLATTTDDYTSFIRS VGSEAPYVDLFYNKTAFPSDVVARLDSASVGAIYGPYYNGADNTINSFKVVAKAAAADSV EFRQIQVYAEDAVKTKTLADSIYNAIKGGANFADVAKKYGQTGDANWITSAQYEGAQVDG DNLKFISAINNTGVNELVNLPMGQANVILQVTNKKSVKDKYKVAVIKREVEFSKETYNRA YNDFSQFIAANPSVEKMVANAEEAGYRLLDRMDLSSSEHNIGGVRGTKEALRWTFDKAKP GDVSGLYECGESDHLMVVGLVSIKPEGYRPLKAVQDQLRAEIVKDKKAEKIMADMKAANA TSFNQYKDMANAVSDSLKMVTFAAPAYVSALRSSEPLVGAYASVAPENQLSAPIKGNAGV FVLQVYGRDKLNETFDAKAEEATLTNMHARFANRLMNDLYLKANVKDTRFLFF >gi|225935364|gb|ACGA01000028.1| GENE 267 330742 - 331968 1001 408 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 4 408 17 420 426 156 27.0 7e-38 MVFSAFFSGMEIAFVSVDKLRFEMERKGGITSRILSIFFKNPNEFISTMLVGNNIALVIY GILMAQIIGDNLLAGFIDNHFLMVLAQTVISTLIILVTGEFLPKTLFKINPNLVLNVFAI PLIICYVILYPISKLASGLSCLFLRVFGMKINKDASDRAFGKVDLDYFVQSSIDNAENEE ELDTEVKIFQNALDFSNIKIRDCIVPRTEVVAVDLTTSLDELKSRFIESGISKIIVYDGN IDNVVGYIHSSEMFRAPKNWHENVKQVPIVPETMSANKLMKLFMQQKKTIAVVVDEFGGT 
SGIVSLEDLVEEIFGDIEDEHDNTSYISKQIDEREYVLSARLEIEKVNETFGLDLPESDD YLTVGGLILNQYQSFPKLHEVVRVGRYQFKIIKVTATKIELVRLKVLE >gi|225935364|gb|ACGA01000028.1| GENE 268 332002 - 332616 439 204 aa, chain - ## HITS:1 COG:no KEGG:BT_4369 NR:ns ## KEGG: BT_4369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 204 1 193 193 342 87.0 4e-93 MSITITSGAIVMLLLLSSCGGKKKAMGEAITERDSLPMMTTLGVTTLISDSGVTRYRVNT EEWSVYDRKKPSYWAFEKGVYLEQFDSIFHIEASIKADTAYYYDKERLWKLIGNVDIQNR KGERFNTELLYWNEATQKVYSDKFIRIQQPDRIITGHGFDSNQQMTIYTIHNIEGIFYVD EEAGGPPQPETKALPDSVRKDSVK >gi|225935364|gb|ACGA01000028.1| GENE 269 332660 - 333985 1577 441 aa, chain - ## HITS:1 COG:no KEGG:BT_4368 NR:ns ## KEGG: BT_4368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 441 1 433 433 724 92.0 0 MKIKTLVAMLFLSAGATTVVAQDASNCNSNSSISHEAVRAGNFKDAYTPWKAVLENCPTL RFYTFTDGYKILKGLMGQIKDRNNADYQKYFNELMNTHDLRIKYTDEFLAKGTKVSSADE ALGIKAVDYIAFAPKIDVNQAYQWLSQSVNAVKGESAGATIFYFLQMSLDKLKTDPNHKE QFIQDYLAASEYIDAAIAAEASEAKKKPLLGIKDNLVALFVNSGTADCESLQSIYGPKVE SNQTDLAYLKKVIDIMKMMRCTESDAYQQASFYSYKIEPTAEAATGCAYQAFKKGDIDGA VKFFDEAVNLETDNLKKAEKAYAAAAVLASAKKLSQARAYCQKAIGFNENYGAPYILIAN LYAMSPNWSDESALNKCTYFAVIDKLQRAKQVDPSVAEEANKLIGRYSGHTPQAKDLFML GYKQGDRITIGGWIGETTTIR >gi|225935364|gb|ACGA01000028.1| GENE 270 334044 - 335357 1145 437 aa, chain - ## HITS:1 COG:no KEGG:BT_4367 NR:ns ## KEGG: BT_4367 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 437 1 431 431 644 77.0 0 MVGFKHTLCALLLTMVTGMAIAQNNTNSPYTRYGYGDLSDQSFGNSRAMGGIAFGLRDGA QINPLNPASYTAIDSLTFLFEGGVSLQNMNISGSGVKLNAKNSSFDYLAMQFRLHPRIAM SIGLLPFSNVGYSVSDTQAATDPTTGNTADYARSYTGDGGLHQLYAGVGVKVLKNLSVGV NASYFWGDINRTRVLYFPSVSGAYNYNHQSVASVSSYKLDFGAQYTLDINKKHSVTIGAI YSPKLKLGNDYSVTTQMVSNSTGTAVSTTTLKPDATFEVPNTFGVGFTYNYDKRLTVGLD YSLQQWSKTKFDVNTSDEAVREDFNETYTYCDRHKVSVGAEYIPNLMGRSYLSHIKYRLG AYYTTPYYKIGGKEATREYGVTAGFGLPVPRSRSILSISGQFVRISGQESAFVNENIFRV 
SIGLTFNERWFFKRRVE >gi|225935364|gb|ACGA01000028.1| GENE 271 335344 - 336213 545 289 aa, chain - ## HITS:1 COG:TM0883 KEGG:ns NR:ns ## COG: TM0883 COG1521 # Protein_GI_number: 15643645 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Thermotoga maritima # 49 278 3 239 246 85 29.0 1e-16 MIYLKIDSSIFLFLGISKMKKEDNSSSMLQLFFFYFTFAPAKEPKDLNLIIDIGNTKAKI AFFDGGEIVDIVAESNQSLGCLKAFCSKYPVEQGIVATVIDLSEKVLADLAALPFPLLWL NHQTPLPVVNLYETPETLGYDRMAAVVGANEQFPHRDILVIDAGTCITYEFIDSKGQYHG GNISPGMQMRFKALHQFTGRLPLVDTNGRKLLMGRDTETAIRAGVMKGMEYEISGYIESM KHKYPELLVFLTGGDDFSFDSSVKSIIFADRFLVLKGLNRILNYNNGRI >gi|225935364|gb|ACGA01000028.1| GENE 272 336519 - 337706 667 395 aa, chain + ## HITS:1 COG:no KEGG:BT_4364 NR:ns ## KEGG: BT_4364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 395 395 682 85.0 0 MKRFFSYTILSFFLLLLPTGQASANNNDSIRLSLLTCAPGEEIYSLFGHTAIRYENPSQG IDVVFNYGLFSFNTPNFIFRFSLGETDYQLGATDYERFAAEYAFFGRSVWQQTLNLTDEE KTELIRLLQENYRPENRVYRYNFFYDNCATRPRDKIEESIAGKVIYPAEPQDGSLTFRDI VHQYCKGHPWARFGIDLCIGSEADQPITQRQMMFAPFYLMDAFDGAQIKSDSIQRPLITA NELVVDATPEPDESGWMPTPLQCTLLLFILTTAATIYGIRRRTGLWGIDLFLFGIAGIVG CVLAFLALFSQHPAVSSNFLLLVFHPGQLLFLPYIIYCVRKGKKCWYLTLNLAVLTLFIV LFPVIPQRFDFAVVPLALVLLIRSASNLIVTSKKK >gi|225935364|gb|ACGA01000028.1| GENE 273 337706 - 339280 1278 524 aa, chain + ## HITS:1 COG:no KEGG:BT_4363 NR:ns ## KEGG: BT_4363 # Name: not_defined # Def: putative alkaline phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 524 1 524 524 996 88.0 0 MKGLLTSLITVLTFTGLQAQSLPAAPKLVVGLTIDQLRTDYLEAFSSLYGEKGFKRLWKE GRVFHNAEYTFCGIDRASAIAAIYSGTTPSMNGIISQRWMDASTLRPVNSTDDTAFMGYY TDQTVAPTKLLTSTIADELKIATQGKGLVYAIAPDCDAALFAAGHAGNAAFWLNPNTGKW SGTTYYGEFPWWASQYNDRQAIDFRIAGMTWEPIFPRGMYTFLPDWRDMLFKYKFDDDRS NKYRRFIASPFVNDEVNALAEEALNKSSIGMDDITDLLALTYYAGNYAHKSVQECAMEIQ DTYVRLDRSIADLLEVLDKKVGLQNVLIFVTSTGYIDSESPDSGLYKVPGGEFYLNRCAA 
LLNMYLMATYGEGKYVEAHHNQQIYLNHKLLEKKELNLAEIQQKSAEFLMQFSGVNEAYS ANRLLLGSWTPEIYKIRNGYHRKRSGDLVIDVLPGWTIVNENGGDNKVVRHSYIPSPLIF MGHSVKPAIIQTPVTIDHIAPTLAHFMRIRAPNACTSAPITDLR >gi|225935364|gb|ACGA01000028.1| GENE 274 339477 - 342794 3942 1105 aa, chain + ## HITS:1 COG:PA4403 KEGG:ns NR:ns ## COG: PA4403 COG0653 # Protein_GI_number: 15599599 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Pseudomonas aeruginosa # 3 1103 2 914 916 632 37.0 1e-180 MGFNEFLSSVFGNKSTRDMKEIKPWVEKIKAAYPEIEKLDNDALRAKTEELKKYIHESAT AERAKVEELKASIETLELEDREEVFVQIDKTEKEILEKYEKALDEVLPVAFSIVKATAKR FTENEEIVVTATDFDRQLAATKDFVRIEGDKAIYQNHWVAGGNDTLWNMVHYDVQLFGGV VLHKGKIAEMATGEGKTLVATLPVFLNALTGNGVHVVTVNDYLAKRDSEWMGPLYMFHGL SVDCIDRHQPNSDARRQAYLADITFGTNNEFGFDYLRDNMAISPKDLVQRQHNYAIVDEV DSVLIDDARTPLIISGPVPKGDDQLFEQLRPLVERLVEAQKVLATKYLSEAKKLIASNDK KEVEEGFLALYRSHKALPKNKALIKFLSEQGIKAGMLKTEEIYMEQNNKRMHEVTDPLYF VIDEKLNSVDLTDKGVDLITGNSEDPTLFVLPDIAGQLSELENLNLTNEQLLEKKDELLT NYAIKSERVHTINQLLKAYTMFEKDDEYVVIDGQVKIVDEQTGRIMEGRRYSDGLHQAIE AKERVKVEAATQTFATITLQNYFRMYHKLSGMTGTAETEAGELWDIYKLDVVVIPTNRPI ARKDMNDRVYKTKREKYKAVIEEIEKLVQAGRPVLVGTTSVEISEMLSKMLTMRKIEHKV LNAKLHQKEADIVATAGLSGTVTIATNMAGRGTDIKLSPEVKAAGGLAIIGTERHESRRV DRQLRGRAGRQGDPGSSVFFVSLEDDLMRLFSSDRIASVMDKLGFQEGEMIEHKMISNSI ERAQKKVEENNFGIRKRLLEYDDVMNKQRTVVYTKRRHALMGERIGMDIVNMIWDRCANA IENNDYEGCQMELLQTLAMETPFTEEEFRNGKKEQLAEKTFNIAMENFKRKTERLAQIAN PVIKQVYENQGHMYENILIPITDGKRMYNISCNLKAAYESESKEVVKSFEKSILLHVIDE AWKENLRELDELKHSVQNASYEQKDPLLIYKLESVTLFDAMVNKINNQTISILMRGQIPV QEAPDESAARRVEVRQAAPEQRQDMSKYRENKEDLSDPNQQAAASQDTREQQKREPIRAE KTVGRNDPCPCGSGKKYKNCHGKNL >gi|225935364|gb|ACGA01000028.1| GENE 275 342969 - 344075 1057 368 aa, chain + ## HITS:1 COG:no KEGG:BT_4361 NR:ns ## KEGG: BT_4361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 368 1 369 369 544 76.0 1e-153 
MAKSYSLVFIVLFLLTLGSCQSLEQIPIDYLQPADLSFPPQLRKVAIVNNTSDAPDNKLI TETEKIKEGTPLISRATAYANGDPKAATESLAEEIAQQNYFDEVVICDSALRANDKLARE STLSQEEVRQLTSNLGVDFIIALENLQLKATKTVRYLDEFNCFQGAVDVKVYPTVKVYLP ERSRPMTTLHPNDSIFWEEFGGTAIEAATRMIPDKQMLEEAAAFAGTVPVKNIVPIWKKG TRYLYTGGSVPMRDAAIYVRENSWDDAYELWRQAFEGTKNQKKKMRAALNIAVYYEMKDS LAKAEEWAEKAQQLAKKIDKKNITDSPQATIDDVPNYYLTTLYLAELKERNAQLPKLKLQ MSRFNDDF >gi|225935364|gb|ACGA01000028.1| GENE 276 344137 - 344583 494 148 aa, chain + ## HITS:1 COG:no KEGG:BT_4360 NR:ns ## KEGG: BT_4360 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 200 85.0 1e-50 MKLSQQSLSIIESAIQKAVAKYVCSCEQTVVTDIHLQPDQASGQLNIYNDDDEELANIMI EEWATYEGDDFLENVEPSLRNILCRMKDAGDFDKVTILKPYSFVLVDEEKETVAELLLID DDTILVNDELLKGLDKELDDFLKDLLEK >gi|225935364|gb|ACGA01000028.1| GENE 277 344721 - 346733 1397 670 aa, chain - ## HITS:1 COG:no KEGG:Phep_1386 NR:ns ## KEGG: Phep_1386 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 14 670 18 692 692 130 25.0 2e-28 MNMKIKDSIVWTAFLAVMAIIMFPSCSDEHEAGILEITNNEIILQAEGDPVQVEVKSNTE WRIDFVESTWFSTDIRGAQSSRTYFTVTYDENISDSERFCDIRVFTKDGKTADVIKIKQL SRYPFIVPASDKMELFTKGGEYEMEISTNVPETDIVITPTVNWVQEYRISDGKLYFNTET NSQSPRTGSIVLSYDDQYQRKVTATINLSQAYSEYANAELVSYPTVHGYAVGGITNNVYV EGTIVANGTSKNFPSNRYVIQNEAGETVVFESESLITFAQFAKVSLCLKDGMIREESEGS FTYRLISGITAAHVISSELSTFTIPERTIAELTDNMVFSLVTLKDVEIASPVGAFTNFKT TDPGAADRKINKNYWVEKFPAYYRYYPTCIRDKNGSNTYMLTAFEAPYAHETLPKGSGSI TGMVAKVKLTNFDISESRLCILPLNREDINISDINEITDVLVEWDCNVPDWKEIGSSTFT DYHPTGGEASQANALLSKDGNKYFQQTYADYILGFQDDFRGDVNLNTTDGYYGRMRGGAF NSKPWSTSSYFYVDKISTVGISTSLSLQVEMNASWGGGPVAVVEYAYSMNGEWIKVDNSE FTILGQFDRTAAGGQTEKNIPGYKVYDFKLPDALLNRDNICIRLRPVRLPAGVSSFMPLR LANISIKYNK >gi|225935364|gb|ACGA01000028.1| GENE 278 346733 - 348001 1118 422 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0342 NR:ns ## KEGG: Dfer_0342 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 38 411 32 409 419 303 
42.0 9e-81 MNSKIFFLGAIMLLPTFASCQQEKPFPDDAVDKVYEYLPEWQAGYLDIHQISTGRGNAAY LIFPDGTTMLLDAGDLGVHSGTQEIMKAVPNDSKRPAEWIAQYIKHFSLPLKNNGALDYA LLTHFDTDHIGQNGKLAIEKVGLDYKLTGITHVGNLLNISTLIDRGYPTYDYPTAAKVSG AHISNYKLYVAARDREGKKNEGFVIGSNSQIKLLKDPGSYPTFEVRNIVGNGKIWTGSGT TAKELVPSTASSSEQLNENRCSCGIRITYGNFDYFSAGDILGVEKAPEWFDIETPVARLL GETDVVVANHHAYSDAMCDTYISQVKPQVYVIPVWDYYHPQPATLSRMLSQTLYPGERSV FAAGMVDSNRSRLGEDGLKIKPAGHVVTRVYPGGEKFQIFVLNDRNEAYEILYKTGEIKS NN >gi|225935364|gb|ACGA01000028.1| GENE 279 348024 - 349775 1391 583 aa, chain - ## HITS:1 COG:no KEGG:BDI_2518 NR:ns ## KEGG: BDI_2518 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 14 563 17 568 568 450 43.0 1e-125 MRRNILIIISFLTLLVSSGCTNLDEKWYSEVTPDTYFTSKETVYSFLVRSFTHWRWFHGF DRAILQECTTDEMCVTQKGIHYNDVRYSQLQHHDWTPLHPNNEETWRGVGMGVAMALACK EDLSGVDYVSLGMTEELKADHQMQLQTLVAYFYLRGLDFYGGMPIYRRSTTEEVPRSTAR ETFNYVEELLLAAIPKLEKKRADMLEEGYLRQGTAAALLAQLYFNAEVYMGENRFAECAQ VCQDLLDGKYGYYELEEDWFGPFTFDNNKSKEVMWSVQSQYAKGTLFQWQFERYNHYNAK NYFDLSGYSSTNGMHLQPSLKPNGDPYTDKLGRPFAKFNNKDLRKKLYLYKGNGKYEGMF LYGKLQRTSRSGTEVKCTGLYEYPGEVLEFVDQVAQFKKVKDGEYSSVNELPSNISTGEE NSGIRLCKLPVPDNIDKTIAFNPDYPVIRFAEIYYMLAECKYRSGYKKEAANLFNEVRKR NFENKADPDPVTETNIDKYRILDEWMVEFLGEQRRRTDLRRWGLYTTGSWWDHKPTNDDH YELFPIPEKSISVSNVLKQNPGYGGGNEMTKEEAGIYSVKQID >gi|225935364|gb|ACGA01000028.1| GENE 280 349807 - 353022 2302 1071 aa, chain - ## HITS:1 COG:no KEGG:BDI_2519 NR:ns ## KEGG: BDI_2519 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 116 1071 50 1020 1020 855 46.0 0 MNFNSYFTIKHTSNLRILLLTLFFIVPLGTIYGQNQLITIAGKNITLLKAFDAIESQTQL SITYSRTKIDVNRTLSVDFKKQKLSIVLDKLLAGTGFTYKIEKGYIVLVPVPVKQEQTEA TSKKIQGVIVDTNGDPVIGASIQEKGVAGKGTVSDLDGKFNLSVASNAVLLISYIGFQSL EVKVGNRTSLSLVLKEDHQLLDEVVVIGYGTMEKRAVTSSITSISSKDLVTGLGGSTIAT ALKGKISGMNVSETSSPNASASFQLRGVASINASSSPLIVIDGIPGGDLRSISQEDIQSI DVLKDASAGAIYGTRAAGGVILVTTKKAKEGPITLSYTGELSTEQVSRRPQVLDRDAFVR FGVGTDLGASTDWYGELLNEGALSQRHVVTLSGGGHTARIYATIMAQDQKGIAIGDNRKD 
YSGRINANFNLLNDLLEIGLHTEYREAHRDQRSSSSCFDMALKMNPTEHVYDSTSETGYN VLVGGSEYYNPLAEVMLKQTDNVDKWLKADATVKLNLPAGFSAQATLGWEDRQYQQTHYV SALHRTSLNGSYKGKGFHGYSKTVNVSFEPTINFMRVFADDHTVSAVAGYSYWENYSENF DMSNFDFPVDGVGAWDMGTGSWLSDGKAAMSSHKYPRERLISFFGRANYSYKDRYMVTGS VRHEGSSKFGKNHRWGTFWAVSGGWRISNEAFLRDVSFLDDLKVRVGYGVTGNNNFSAGA STAMYSSNSMWPYNGTWITSYGPARNVNNDLHWEKKAELNIGLDYAFLNNRLFGKFDIYK RRVSGMLYNISVPNPPAVYEKTTMNYGNLENTGWEFEIGGVPVKTKDFNWTTTMRLSHSS SKITSLWGNNTYQDRVGFPSPGTNGSAGRIEAGTKIGSYYIWKYAGITDNGRWLLYDKND NVILSDKKTYDDKRYIGNAIPALMVSWDHTFTYKNLSLGINLRSWIDFDVFNTINMYYGL SEVAGQNVLRNAYIDNRHIKEVKQLCDYWLEDGTFLKIDAISLGYSLNMKKWQKYIDKID LYLTVRNVACFTKYSGLNPEVNVNGLDPGYEWFNNIYPETRRYTVGMKIQF >gi|225935364|gb|ACGA01000028.1| GENE 281 353135 - 354169 564 344 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 276 35 278 331 72 28.0 1e-12 MKKKSISELIHFFVRHPLQNSLQMKFWRWLVSPADAEDKEEAIQEEWMNIIAEPDEATRR SWYEVRRKAGLANPIIKPHWEIKPLLRIASMILIPLFSVLLSWHYINDYTDSCKLVEYIV PMGERGELTLPDGTKVQINSESSIIYPKSFRGDTRTVYLSGEANFDVHKDKKHPFIVKTS LLSVRALGTKFNIQAYSEDRKTTTTLENGKVQINNLLAPDSCFILTPGEQLEYNHLTKNY EKRKIDVMMASGWTRGELNFVDCHLEDILNTLGRHYNVEIKAEPHLYTNDLYTIKLRKGE PLQAAIHIVTMTVGGMESKVTDNKVVILTAASPANKSKKGGTHP >gi|225935364|gb|ACGA01000028.1| GENE 282 354296 - 354859 463 187 aa, chain - ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. 
PCC 7120 # 6 181 18 192 201 59 25.0 4e-09 MEFTEEQILLRALSRGDEKAFEVLFMRYFPRVKRFISGLLQDDITAEDFAQDILLKLWQK REEMAKIENLNAYLYRTSRNAVYQHLRHILLVNEYGEKQQENLSRQQNEGAGNIEENMFA EELLLLIQHTVEQMPAQRKRIYEMSRKEGKNNDEIAQLLAINKRTVENHLTQALADIRKM LKHFYLF >gi|225935364|gb|ACGA01000028.1| GENE 283 355029 - 357164 1625 711 aa, chain - ## HITS:1 COG:no KEGG:BT_4359 NR:ns ## KEGG: BT_4359 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 705 27 717 744 1253 87.0 0 MYTRLLFILLLGATWMCSCQTEQSPVNTLADRVTEGTSKDRILFRMIADENNPLKDYFEI SSEDGKVLITGNSDLSLATGLNWYLKYVAGIHLSWNNLSQKLPEILPLPQEKIRKETSMQ NRYYLNYCTYSYSMAFWDWERWEKEIDWMAMHGINMPLSITGMEVVWYNLLKRLGYTTEE VNEFISGPAFMAWWQMNNLEGWGGPNPDSWYQQQEALQKKIVARMRELGIEPVFPGYAGM VPRNIGEKLGYQIADPGKWCGFPRPAFLSTEDEHFDSFAAMYYEELEKLYGKANYYSMDP FHEGGNTEGVDLAKTGASIMAAMKKANPKAVWIIQAWQANPREEMIASLNQGDLLVLDLY SEKRPQWGDPDSMWYREKGFGKHDWLYCMLLNFGGNVGLHGRMNQLVNGYYDACAHTNGK MLHGVGATPEGIENNPVMFELLYELPWREERFSSDEWLQTYLKARYGREVSPEIMEAWRA LEHTVYNAPKDYQGEGTIESLLCARPGFHLDRTSTWGYSKLFYAPDSTAKAARLFTSVAD QYKGNNNFEYDLVDIVRQSNADKGNVLLEEISQSYDRKDKEDFRKQTQQFLDLILAQDRL LSTRKEFSVSSWLNAARSLGTTEEEKRLYEWNASALITVWGDSIAANQGGLHDYSHREWS GLLKDLYYQRWKAFFEQKQAELDGKPAGQEINFYGMEKAWAEKSKAQTLKN >gi|225935364|gb|ACGA01000028.1| GENE 284 357232 - 359565 2210 777 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 14 601 10 602 612 476 41.0 1e-134 MKMNKNHKTVLVILNIIVSFLISSCSSPKEQVRIGNGTFTIEGKDIQLICGEMHYPRIPH EYWRDRLKRARAMGLNTVSAYVFWNFHERQPGEFDFSGQADIAEFIRTAQEEGLYVILRP GPYVCAEWDFGGYPSWLLKEKDMTYRSKDPRFLSYCERYIKELGKQLSPLTINNGGNIIM VQVENEYGSYAADKEYLAAIRDMIKEAGFNVPLFTCDGGGQVEAGHVEGALPTLNGVFGE DIFKVVDKYQKGGPYFVAEFYPAWFDEWGRRHSSVAYERPAEQLDWMLSHGVSVSMYMFH GGTNFEYTNGANTGGGYQPQPTSYDYDAPLGEWGNCYPKYHAFREVIQKYLPVGTVLPEV PADNPTTTFATVELKESAPLRTAFHPTTQSENVLSMEDLGVDFGYIHYQTTLQKAGKQKL 
VIQDLRDYAVILIDGKQVASLDRRYNQNSVTLNVSKTPATLEILVENTGRVNYGPDILFN RKGITSQVLWGNEKLTGWSITPLPLYKEKVSEMEFGETIKGVPAFHKGTFTVEKKGDCFV DMSQWGKGAVWVNGKSLGRFWNIGPQQTLYLPAPWLKEGENEIVVFEMEDTGKRVLQGLN QPILDSLGIDKNYQKGQRRAVVGTPILEDGDLALKTTLQETNEWQSFDLPVATTLRHFCI ETLASYTEDNQACISEVELIDDKGQPIDKTKWEVVYVSSEQADKNLGIAENLFDGDISSF WHTNAAVESNHPHRVIIDMKEIYKVSAFRVKVRKGSFLSGKVKDINIYGRPQFFLFH >gi|225935364|gb|ACGA01000028.1| GENE 285 359648 - 361189 1548 513 aa, chain - ## HITS:1 COG:no KEGG:BT_3020 NR:ns ## KEGG: BT_3020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 511 36 510 513 213 31.0 1e-53 MTILSTLLLGIMCSFCSDTIDVYAGQYGEEDGTSEPETPEVTGNIVPIESLRNPDRGFHL ECNLLADQMKSPYNDYEVYGNDLYTKKVEQFDAKDDNLTLVQQYIYLTNWVSKDLDAEAL SNIRKIFELMKAQGYKAILRFAYNHAGLNTSGGESKQWILRHIEQLTPLLNEYIGQIATM QVGFIGAWGEWHTSPLMNDQSAKNAIVSALLRALPAPYCVEMRYPNHKKALTLEQEGSRG RIGYANDYFTAGEHPLAPGNDFVPNTDDYKQITEEVKVNNFYMSGEIPYNEDTEWGLAEL ISPIKSLRILREHRYSAFDVTQNYDLNIMSWKRVKVNPALLNDNHILFDESYFKDEEGNE VVRSFYDFVRDHLGYRLNLQSESKVEAKGGNLEYDLTITNTGFATVINPKEVYLVLVSES GQVVKEFKLDVDPKTWIPATEQEPNQAAKYTIKGSVAAGVSGTYKVGIWMPEKVADWKYN SVYAIKFAKTENVTHWYDDADKYAVNLFGTVTF >gi|225935364|gb|ACGA01000028.1| GENE 286 361242 - 362552 1114 436 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160887164|ref|ZP_02068167.1| ## NR: gi|160887164|ref|ZP_02068167.1| hypothetical protein BACOVA_05180 [Bacteroides ovatus ATCC 8483] # 1 436 1 436 436 857 100.0 0 MKHLLYSLLGILLLAGCKEDKYNVIIPMSDIYLSAPQDGAIIDLNDLSIEKYSFSWEKPL ENGAKLLICTDRKFKEPVIIDAGKSTSVAISALTADQSFSQLGIKAGQEAVLYWTVKETG NITAAASEARTIRVKRMTSKLVQPEDLTKISLAENAPETAVQFEWDTQEWPESTSYSLCF SLDPEMKQTVAEHSVGVVNGKSSLTHEELQALLDKLSIKRWTSNSVYWNVKTDDGQWVSR SSGVLNMTEMMRFIDVRGDEKITYRVVRIFYSDKTSLVWLADNLRATKYADGTDIEANNF KKTPASLGEGRVKAYGVHYHYDIRDKIAPKGWRLPTIQEYKNLFAEAGTAEGQWNVLKDP EYYESVKGQAHLNDWKFNLCASGQWSGDAITNHTGPYCYLLVTDDMSHQCILHDGGATLW SPWTTGAPARFIYNEN >gi|225935364|gb|ACGA01000028.1| GENE 287 362568 - 364088 1582 506 aa, chain - ## HITS:1 COG:no KEGG:BF0587 NR:ns ## KEGG: 
BF0587 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 506 1 495 495 393 43.0 1e-108 MNMKKTIYTAIATLLFLISSCTLEQEVYNKINPGMFPQNAEDVNALVNSSAYYVFSPWGI FEVAAGYVTTSEMVTDYVENTWGWTTVYNSYEANDWHIDGDYRRVYDYSQYLASMTLTMD RIKDVNMDETLKARYMAELKCGRGFLAFLMYDMYGPIPIADLETLQKPEAEIVIPRLSEE AMREYIVSNLKDAANVLPYKYEDTNYGRFTKGLANTLLLKFYMMTKQWDEAEKIGRELTK PEYGYKLVDDYNSLFSLSGEKNSEVIFSCVAEAGVMEQKWFAHVLTSDYPLPAGMSVTTW GGFKISWPAYESYDPKDKRRERIIGEYTGTEGVHHSRALDRDGGTTGILFYGAIPVKYKI EGVVGEKCEIDMPIYRYADVLTLLSEAIVRNGNSITQEAIGLLNQVRVTHGGLKAYQLSD FSSVEDFLDKLLEERGHEFYFEGVRRQDLIRHGKFIEAALAKARFAGQPTGKIETKVDGQ YKYEKFPLPTHLITEGQGIIIQNPGY >gi|225935364|gb|ACGA01000028.1| GENE 288 364095 - 367406 3087 1103 aa, chain - ## HITS:1 COG:no KEGG:BF0536 NR:ns ## KEGG: BF0536 # Name: frrG # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 40 1103 42 1137 1137 964 47.0 0 MRLLVLFLVCSIGLTYAADSYAQKALISIDVRNQRVEDILKEIEEQSDFDFFFNNKHVDL NRRVSVSADKSNIFSVLKEIFSGTNVKYSVLDKKIILSIEAQSPEQEKKITVSGTISDAQ GEPIIGASVLEKGVQGNGTITDIDGNFKISVSSSKAQLDISYIGYQSQTVAVQPGKNMKI ILQEDSKQLDEVVVVGYGTQSKKTLTGAVSMVNMGDVEMNTVPNVSRALAGKAAGFRVNQ VSAQPGGEAKFRIRGEASTGAGNEPLFVIDGFPVSSTSNLSSGNGFYESGNIDNVLESLN PDDIESISVLKDAASTAIYGARAGHGVVLITTKRGKSQKPRVTYSGMGSVQVARSNYKML DTRMYMDMYNKQMHEEWLKLNGMGIYEGYVEKQDTPPTQYKPTFNNDEILTASGTDWLDA VMRNGYTQQHNLSVNGGTEKTRYLASVNYMNQEGIVKNNGVSRFSARLNLDQELSEYVSF GLTATYSQNKYDNVPLGDNANEYSGVLTGAIQSNPTIPIYDSEGNYFIDPKRPFVANPVS LLDIKDNTVKDRIIGSAFVSVKPLKDLEVKLQLGVDRSFQKRSSYLPKTTLQGKTKNGRA DISQETSTAYLMELTAQYSKNFGDHNVKAIGGYSFQKFKNDGLSAGNEDFLVDGFGYSSL GSGNYKKPSVGSWASINSIASLFARANYSYKGRYLLEATVRADAASNFAPENRWGYFPSV SAGWMISEENFMKGASNWLSMLKLRTSYGQTGNSNVGYRIYDYYEVGRSAIIGGAESTGV YASDLGNRSLTWETTTEFNVGVDLGILNNRFKLTAEYFKRKITDLLVTNKPLPFYNEINK IAGNIGSTQSQGVEITVNTVNIVTKDFEWDTTLTLSHYNDRWLTRDPNWKPKPYEKENDP IRAWWEYEALGILQPGEKVPDAQKDLVPGMMKLKDRDGNGVLGDEDKVYMGNGDPKIIYG LNNSFRYKNFDLNIYFYGEAGRKRGASYYEGWTRMDNGINVSTYALKAFNSNNLTATDPT 
FVRGGSGWGDYYVKSIYYVRCGSITLGYKVPISQKIAKNLRVYVDVNNPFVITNWTGLDP ETDNGTYAYPNVTSYNFGVSISF >gi|225935364|gb|ACGA01000028.1| GENE 289 367653 - 368675 849 340 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 86 301 85 293 327 77 25.0 3e-14 MDKTDKNTIEELLPRYCEGLATEEERLQVETWMDESEENRRMAKQIHALYLATDTVHIMT KVDTEKALTKVKSRMTGNRQRKTTWWEWAQRAAAVLFIPLLVTLMVQHWGGSEQELAQMM EVKTNPGMMTSLTLPDGTLVFLNSESTLSYPSRFDSDTRNVTLQGEAYFEVAKNPEKKFI VSTSHQSQIEVLGTHFNVEAYEKEDRIAATLVEGKIGFIYSSDNGSKKVLMDPGQKLVYD TRDSKVQLYATSGESEIAWKEGKIIFRNTSLEEGLRMLEKRYNVEFIIKNDRLKGDSFTG TFTNQRLERILEYFQLSSQIRWRYLDSPDIKDEKSKIEIY >gi|225935364|gb|ACGA01000028.1| GENE 290 368810 - 369385 447 191 aa, chain + ## HITS:1 COG:STM2640 KEGG:ns NR:ns ## COG: STM2640 COG1595 # Protein_GI_number: 16765960 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Salmonella typhimurium LT2 # 7 170 5 177 191 66 27.0 2e-11 MKEIHLLENNYLLSAIQQGDQKAFDALFRRYYPTLCAYGHRFVDLEDAEEIVQDSLLWIW ENRENLFIETSLSSYLFKMIYRKALNKLAHIDATQRADTRFYEEMQEMLQDTDLYQVEEL TQRIKDAIATLPESYREAFVMHRFRDMSYKEIAEILGVSPKTVDYRIQQALKQLRVDLKD YLPLLLPILFP >gi|225935364|gb|ACGA01000028.1| GENE 291 369701 - 371599 1181 632 aa, chain - ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 392 628 218 462 478 164 37.0 4e-40 MKQSLRIIPFLLGLFLFIEGFLSPASARTENTNKEVLVLNSINFNLPWAKNFYWCLHDAL QKQGISVKAESLSVPALVNEMEANAVVTHLRQKYPEPPTAIVMIGDPGWIVCRELFDDVW EDVPVVVTNARDRLPASLDILLSHAPLTESNSVSAEEWRREYNITSLKQHYYVKETVDLI YKLIPDMERLAFISDDRYISEETRRDVKEAVEENFPDLRLELLSTTQLSTEMLLDTLRSY KSNTGIIYYSWFESHNENDNNYLFDHIQEIITNFTPSPLFLLSHEDLSNNTFAGGYYVSA 
ESFSDSLLEILDRILKGEQARNIPGGVGGKGSAYLCYPVLKAHNILSYRYPADAVYVNEP KTFFQQYSVEIIVCSIFLLILIVAIIYYISILRKAYSRLHEAMEKAEQANQLKSAFLANM SHEIRTPLNAIVGFSNMLPHVENREEMSEYADLIETNTDLLLQLINDILDLSKIEAGTFD FYPALIDVNRTLEEIEQSMQLRLRHNNVTCTFVERLPECTLYTDKNRLIQLLANFIVNAI KFTEVGTIRMGYRLLDSETIYFYVSDTGCGMSSEQCIHVFERFVKYNSFVQGTGLGLSIC HTIVDRLGGKIGVESKENKGSTFWFTLPYRQN >gi|225935364|gb|ACGA01000028.1| GENE 292 371717 - 374353 2724 878 aa, chain - ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 877 3 887 887 748 45.0 0 MELASKYNPADVEGKWYQYWLDHHLFSSKPDGREPYTIVIPPPNVTGVLHMGHMLNNTIQ DILVRRARMEGKNACWVPGTDHASIATEAKVVNKLAAQGIKKTDLTRDEFLKHAWDWTDE HGGIILKQLRKLGASCDWDRTAFTMDEKRSESVLKVFVDLFNKGLIYRGVRMVNWDPKAL TALSDEEVIYKEEHGKLYYLRYKVEGDPEGRYAVVATTRPETIMGDTAMCINPNDPKNEW LKGKKVIVPLVNRAIPVIEDNYVDIEFGTGCLKVTPAHDVNDYMLGEKYNLPSIDIFNDN GTLSEVAGMYIGMDRFDVRKQIEKDLKAAGLLDKTEAYTNKVGYSERTNVVIEPKLSMQW FLKMQHFADMALPPVMNDELKFYPAKYKNTYRHWMENIKDWCISRQLWWGHRIPAYFLPE GGYVVAATAEEALAKAKEKTGNAALTIEDLRQDEDCLDTWFSSWLWPISLFDGINNPGNE EINYYYPTSDLVTGPDIIFFWVARMIMAGYEYEGQMPFKNVYFTGIVRDKLGRKMSKSLG NSPDPLELIEKYGADGVRMGMMLSAPAGNDILFDDALCEQGRNFCNKIWNAFRLIKGWTN AKGTIEIPTDAHLAVQWFDQRLDAAAVEVADLFSKYRLSEALMLVYKLFWDEFSSWLLEI VKPAYGQPVNGFIYSMVLSSFERLLELLHPFMPFITEELWQQLREREPGESLMVTQLFEP VGANEQFLAEFEVAKEIISNVRSIRLQKNIALKEELELQVVGTHPVEKLNPVIIKMCNLS SVTVVESKSEGASSFMIGTTEFAVPLGNMIDVEAEIARMEAELKHKEGFLQGVMKKLSNE KFVNNAPAAVLELERKKQADAESIINSLKESIAALKKS >gi|225935364|gb|ACGA01000028.1| GENE 293 374420 - 375058 650 212 aa, chain - ## HITS:1 COG:no KEGG:BT_4352 NR:ns ## KEGG: BT_4352 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 333 78.0 3e-90 MATSEEIEKYCRNCVSRDFVNGKGLVCKRTRELPAFEEECESFEKDEELERLAPPKPEDF PVSMTEEEMLAEENLSKGVLYAVAACIVGAVAWGLISVSTGRQIGFMPIAIGFMVGFAMR 
KGKGIRPIFGIIGAALSLISCVLGDFFSIIGYISQDYDMSYFDVLVSVDYGEIFSIMLEN VMSMTALFYGFALYEGYKFSFRAQKHPEGGKI >gi|225935364|gb|ACGA01000028.1| GENE 294 375270 - 376253 792 327 aa, chain + ## HITS:1 COG:no KEGG:BT_4351 NR:ns ## KEGG: BT_4351 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 324 48 348 355 518 82.0 1e-145 MDSCSGNNLFANNLSFYLLRFVVFSNLFPFAIGDLFIFLSIAGVIIYPIYARLRKKLPWK KILLRDGEYLLWIYVWFYLAWGLNYSQKNFYQRTEIPYTAYTPENFQEFVDDYITQLNRS YTPVNSINQDLIREETVRIYNQLSDSLGVHRPPHDHPRVKTMLFTPFISMVGVTGSMGPF FCEFTLNGDLLPANYPATYAHELAHLLGITSEAEANFYAYQVCTRSQAMGIRFSGYFSVL GHVLGNAKRLLSEEEYTKLFQRIRPEIIELAKNNQAYWAAKYSPVVGAVQDWIYDLYLKG NKIESGRQNYSEVVGLLISYQEWKKMN >gi|225935364|gb|ACGA01000028.1| GENE 295 376266 - 377054 1075 262 aa, chain + ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 12 259 233 484 489 216 49.0 2e-56 MSHTRQEQMEAFGRFLDILDELRVKCPWDRKQTNESLRPNTIEETYELCDALMRDDKKEI CKELGDVLLHVAFYAKIGSETGDFDIKDVCDKLCDKLIFRHPHVFGEVKADTAGQVSENW EQLKLKEKDGNKSVLSGVPAALPSLIKAYRIQDKARNVGFDWEEREQVWDKVKEEIGEFQ EEVANMDKEKAEAEFGDVMFSLINAARLYKINPDNALELTNQKFIRRFNYLEEHTIKEGK SLKDMSLEEMDAIWNEAKKKGL >gi|225935364|gb|ACGA01000028.1| GENE 296 377160 - 377627 394 155 aa, chain - ## HITS:1 COG:no KEGG:BT_4349 NR:ns ## KEGG: BT_4349 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 154 155 231 84.0 8e-60 MRKLIALVVLLCGFMPVLWAADGCDQHLSREEFRAKQKAFIIEQAGLNKEEAAKFFPVYF ELQDKKKKLNDESWDLMRKGKDDKTTEAQYAEINEKVANNRIAADQLDKTYLGKFKKILS SKKIFLVQRAEMRFHREMIKGMNRSKDKGDDSKKK >gi|225935364|gb|ACGA01000028.1| GENE 297 377655 - 378035 383 126 aa, chain - ## HITS:1 COG:no KEGG:BT_4348 NR:ns ## KEGG: BT_4348 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron 
# Pathway: not_defined # 1 126 1 128 128 183 86.0 1e-45 MKEEDILLKKLGKENSFKVPDGYFENLTSEVMNKLPEKEKVAFKEEPVSTWTRLKPLLYM AAMFVGAALIIRVASTDHKPAAADDVAVTEVDTEVVVSDEMIDVAVDRAMLDDYSLYVYL SDASVE >gi|225935364|gb|ACGA01000028.1| GENE 298 378083 - 378634 489 183 aa, chain - ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 5 182 2 190 199 82 28.0 6e-16 MTPYNEREVLKLLQEESTQRKGFEMIVGQYSEQLYWQIRRMVLSHEDANDLLQNTFIKAW TNIDYFRAEAKLSTWLYRIALNECLTFLNKQRAMTTVDIDDPEAVVVQKLESDAYFSGDK IQMCLQKALQTLPEKQRMVFNLKYYQEMKYEEMSEIFGTSVGALKASYHHAVKKIEKFLE EID >gi|225935364|gb|ACGA01000028.1| GENE 299 378659 - 380215 1051 518 aa, chain - ## HITS:1 COG:BMEI1014 KEGG:ns NR:ns ## COG: BMEI1014 COG2989 # Protein_GI_number: 17987297 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Brucella melitensis # 239 512 272 521 531 148 32.0 3e-35 MEELSMKSSRLLWGVIILLFGCLAACKKEKPQSPIPDSLSSLRELFHPAYQINADSIHQM IRICLNENPVTPWDSVLLAHYQEKDEFFWLNDSLISDKPAVQVADSMLFWLGNISQHGVN PNLYPVDSIREELQQIRTLKLREGKTMNRLLADVEYQLTAAYLSYVCQLKFGFLPSERRW NDSIDHILLKQCDVKFAMAALDSLRANPNAAFHRAQPASPLYHKMQEELVRVNGWGVTDT TDYYRDRLLVNMERARWQYALDKGRKYVIANVAAFMLQAVNEETDSILEMRICVGTVKNK TPLLSSRIYYMELNPYWNVPQSIIRKEIIPTYRRDTTYFTRNRMKVYDKNGLQVDPHSIK WAKYAGSGVPYTVKQDNKTGNSLGRIIFRFPNPHSVYLHDTPSRWAFTRNNRAVSHGCVR LQKALDFAFFLLKEPDELLEDRIRIAMDIKPVSEEGKKLPISAAYRELKHYSLEQYIPLF IDYQTVYLSTDNNLRYCEDIYKYDPSLLEAMNNLNLKP >gi|225935364|gb|ACGA01000028.1| GENE 300 380190 - 381128 518 312 aa, chain - ## HITS:1 COG:slr0050 KEGG:ns NR:ns ## COG: slr0050 COG1234 # Protein_GI_number: 16331469 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Synechocystis # 5 304 2 307 326 198 37.0 1e-50 MEKFELHILGCGSALPTTRHFATSQVVNLREKLFMIDCGEGAQMQLRRSRLKFSRLNHIF 
ISHLHGDHCFGLPGLISTFGLLGRTADLHIHSPRGLEELFAPMLAFFCKTLTYKVFFHEF ETKEPMLIYDDRSVTVTTIPLKHRIPCCGFLFEEKQRPNHIIRDMVDFYKVPVYELNRIK NGADFVTPEGEVIPNLRLTRPSAPARKYAYCSDTIYRPSLAEQIKNVDLLFHEATFAQTE QARAKETYHTTAAQAAHLALDANVRQLVIGHFSARYEDESVLLHEASAIFPQTILAKENL CIDVDGGTVYEK >gi|225935364|gb|ACGA01000028.1| GENE 301 381282 - 383084 3066 600 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160887146|ref|ZP_02068149.1| hypothetical protein BACOVA_05162 [Bacteroides ovatus ATCC 8483] # 1 600 1 600 600 1185 100 0.0 MENLKNVAPIEDFNWDAYENGETVTNVSHDELEKAYDGTLNKVNDREVVDGTVIAMNKRE VVVNIGYKSDGIIPLNEFRYNPDLKIGDTVEVYIENQEDKKGQLVLSHRKARATRSWDRV NAALENEEIIKGYIKCRTKGGMIVDVFGIEAFLPGSQIDVKPIRDYDVFVGKTMEFKVVK INQEFKNVVVSHKALIEAELEQQKKEIIGKLEKGQVLEGTVKNITSYGVFIDLGGVDGLI HITDLSWGRVSDPKEVVELDQKLNVVILDFDDEKKRIALGLKQLTPHPWDALDTDLKVGD KVKGKVVVMADYGAFIEIAPGVEGLIHVSEMSWSQHLRSAQDFMKVGDEVEAVVLTLDRE ERKMSLGIKQLKQDPWETIEEKYPVGSKHTAKVRNFTNFGVFVEIEEGVDGLIHISDLSW TKKVKHPSEFTQIGADIEVQVLEIDKENRRLSLGHKQLEENPWDVFETVFTVGSVHEGTI IEMLDKGAVVALPYGVEGFATPKHLVKEDGSQAQLDEKLEFKVIEFNKDAKRIILSHSRI FEDVAKAEERAEKKAASGAKKSSSSNKREDSPMIQNQAASTTLGDIDALAALKEQLEGKK >gi|225935364|gb|ACGA01000028.1| GENE 302 383253 - 389138 3768 1961 aa, chain + ## HITS:1 COG:MA3490 KEGG:ns NR:ns ## COG: MA3490 COG1112 # Protein_GI_number: 20092301 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanosarcina acetivorans str.C2A # 377 1797 11 1475 1939 568 30.0 1e-161 MDERLDKVKVQCDYLPLINFAIQQNGASIIHQLSIENTTPAPLKDIQVQITTEPTFGNAA PIAVAQIPPNESICLSSFNLTLSANYFTQLTERLSGNLKIEITSKAESVFCQTYPIDILA YDQWGGLNVLPEMLAAFITPNHTAIVPIIKRAASILGQWTDNPSLDEYQSRTPDRVRKQM AAIYTAITEQQIIYSTIPASFEEYEQRVRLADSVMAQKLGTCLDMALLYASCLEAIGLNA LIIITQGHAFAGAWLVPETFPDPTIDDVSLLTKRTAEGIYDITLVETTCMNMGHSSDFDD AVKKANGKLTDGNSFILAIDVKRARHSGIRPIPQRILHGQVWEVEEKETDIQRSAVHATP QSINPYDLSGNETQAVITKQLLWERRLLDLSLRNNLLNIRITKNTLQLIPANLSCLEDAL ADGEEFRILHRPADWESPAMDFGIYSSIPESDPMVGFINSELSQKRLRFYLPENDLGKAL 
THLYRSSRTSIEENGANTLYLALGLLKWYETPSSERPRYAPILLLPVEIIRKSAAKGYVI RSREEETMMNITLLEMLRQNFGITISGLDPLPTDESGVNVKLIYSIIRNSIKNQRKWDVE EQAILGIFSFNKFIMWNDIHNNANKLVQNKIVSSLINGKIEWEAATEEIDATDMDKQVSP ADIVLPIIADSSQLEAIYEAVHDKTFILHGPPGTGKSQTITNIIANALYKGKRVLFVAEK MAALSVVQNRLAAIGLAPFCLEIHSNKTKKSAVISQLKETTEIIRQTPPEEFRKEAERLL NLRAELNQYIEALHKEYPFGVSLYDAIIHYQSVDVESCFDIPQAYLDTLDKDTFAQWEDA IELLVRTANACGHPYQHPLTGISITEYSSAVKEESSQLLTGFIDLLTTIRQKLDVFSILL KDTDIHPTRKDFQTIAHIIRRILDIPELTPGLLTLPLLNETLNEYREVVVHGRKRDEQRK EIETGFTQEILSINAKQMLAEWNRVSGQWFLPRYFGQRKIRKAINIYALKTIETEDIKPL LHRIIRYQEEAEAVRKYIGQLPSLFGRPGKNEDWNTIEQIIDDMASLHSHLLNYAKDIAK VSQIKQNLSAQLTEGIQTFRDIHAHSFNELYQLADTLTATEKKLSGMLGISIEELYTDSA DWITIALSKARTWKENLDKLKDWYQWLQAYQTLHSLGIGFIATEYKEKNIPTGQLTDSFR KSFYQAAIRYIIAKEPTLELFNGKIFNDIIAKYKQISAKFEETTQRELFARLASNIPSFT HEAIQSSEVGILQKNIRNNARGISIRKLFDQIPTLLSRMCPCMLMSPLSVAQFIDTDADK FDLIVFDEASQMPTYEAVGAIARGKNVIIVGDPKQMPPTSFFSVNTIDEDNIEMEDLESI LDDCLALSIPSKYLLWHYRSKHESLIAFSNSEYYDNKLMTFPSPDNIESKVRIVNINGYY DKGKSRQNRAEAQAVVDEIARRLRSEELRKKSIGVVTFSIVQQALIEDLLSDLFIFYPEL ETLALECDEPLFIKNLENVQGDERDVILFSVGYGPDAEGRVSMNFGPLNRAGGERRLNVA VSRARYEMIIYSTLRSDMIDLNRTSSIGVAGLKRFLEYAEKGTRSTISSVPRQLSEATAS IETIIADRLRSLGYTVHTDIGCSGYKIDIGIVDTENTSNYQLGIICDGKNYKRTKTARDR EIVQNNVLKALGWDIYRIWTMDWWEKPDEVMATIQEAIARKKSSKVGSMTAAEINSTPTE VAVPAPTAQITNNEISFVLKASPAAPEKQAASVLSTQNRIEQKYKFAKITPYNYSPEDFF FTDSYSILLSQIRKIMESEAPISKSLLCKKILSEWGISRLGTRVEAQIETALDTLNIYRT EYEGLVFCWNDKEQCASYSIYRPVSDREATDIPPEEIANAIRQLLTDSISLPAADLIKAC AQQFGFARMGSNIDAAMQRGIREAVKRNYAKIENERVTIAN >gi|225935364|gb|ACGA01000028.1| GENE 303 389140 - 390270 728 376 aa, chain - ## HITS:1 COG:SA2010 KEGG:ns NR:ns ## COG: SA2010 COG3344 # Protein_GI_number: 15927789 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Staphylococcus aureus N315 # 81 333 24 273 338 164 37.0 4e-40 MDGILITIIVITAFLVYKMISSSRKQEGYKRWKATSSGKEEKTTKVKWEHGAWLRHALGG TVPRAQYVQKFDESAVVWCANLLGVETARLKEILRDVSEHYREFWMRKRKGGYRMISAPD 
KELQSIQTTINSRILSSVTMIHPAAVGFRSGHSVVDNVSPHLGKRYILKMDIHDFFFSIR SRRVKKTFEKIGYPENVSKVLGVLCCLYRHLPQGAPTSPALSNIVGYEMDKKLAALAAEY GLTYTRYADDLTFSGDVFPKEQIIPRIKQIIRDEKFEPNHKKTRFINEYGRKIITGVSVS SGAKLTIPKARKREIRKNVYFILTKGLAEHQRRIGSSDPVYLKRLIGELCYWRSIEPDNA YVSDSINRLKCLEKGY >gi|225935364|gb|ACGA01000028.1| GENE 304 390476 - 392587 1579 703 aa, chain - ## HITS:1 COG:no KEGG:BT_4341 NR:ns ## KEGG: BT_4341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 703 17 719 719 1289 92.0 0 MNPYTHQEISADSIIERVMTFAPLYETIVSDYRANLYIKGKMDIQKKNFILRYVPSMFRL QKGVREYLLETYSDLHYTAPNIYDQKVKASQGTVRGNRGLPGLLEYFNVNIYSSSLLNDE RLLSPLAKNGQKYYKYRIDSVMGDPNNLDYRVRFIPRTKSDQLVGGYMVVSSNVWSVREI RFSGRSELITFTCWIKMGDVGKKNEFLPVRYEVEALFKFLGNKVDGNYTASLDYKSIELK EKKVRKKEKKNYNLSESFSLQCDTNAYKTDASTFAIMRPIPLSESEKKLYSDNLLRRDTT TVQKHSKSQAFWGTMGDLMVEDYKFNLSNIGSVRFSPFINPLLFSYSGSNGLSYRQDFRY NRLFRGDKLLRIVPKLGYNFTRKEFYWSLNADFEYWPQKRGFFRLNVGNGNRIYSSKVLD ELKAMPDSIFNFDLIHLDYFKDLYFNFRHTVEVVNGLDIGLGFSAHKRTAVEPSRFVITG DYPMPPPEFMDKFKNTYISFAPRIRIEWTPGLYYYMNGKRKINLHSLYPTFSVDYERGIK GVFKSTGEYERIEFDLQHQIRMGLMRNIYYRFGFGAFTNQDELYFVDFANFSRHNLPMGW NDEIGGVFQVLDSRWYNSSRRYVRGHFTYEAPFLILRHLMKYTRYVQNERIYISALSMPH LQPYLEVGYGIGTHIFDVGVFVSSENWKFGGIGCKFTFELFNR >gi|225935364|gb|ACGA01000028.1| GENE 305 393806 - 394285 240 159 aa, chain + ## HITS:1 COG:no KEGG:BT_4340 NR:ns ## KEGG: BT_4340 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 104 159 1 56 66 68 64.0 9e-11 METKQVTVQPAVGGMIKASKDSFVIPKEELKDFYLKFAFLLNPDSCSINRTEFELLNILL KDLKKILAALTHLNTHPWDEGIAEIQISCGVYSLQDNLSKKERMQMNTSTGKHLQFLTQM AVNSPVFKLLFKNYHNHYIQVESLVKQMAKEIDQQQRSE >gi|225935364|gb|ACGA01000028.1| GENE 306 394418 - 396607 2249 729 aa, chain + ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 5 729 7 724 724 
621 44.0 1e-177 MSKMRFFALQELSNRKPLEVIAPSNKLSDYYGSHVFDRKKMQEYLPKEAYKAVTDAIEKG TPISREIADLIANGMKSWAKSLNVTHYTHWFQPLTDGTAEKHDGFIEFGEDGGVIERFSG KLLIQQEPDASSFPNGGIRNTFEARGYTAWDVSSPAFVVDTTLCIPTIFISYTGEALDYK TPLLKALAAVDKAATEVCQLFDKNITRVYTNLGWEQEYFLVDSSLYNARPDLCLTGRTLM GHSSAKDQQLEDHYFGSIPPRVTAFMKELEIECHKLGIPAKTRHNEVAPNQFELAPIFEN CNLANDHNQLVMDLMKRIARKHHFNVLLHEKPYSGVNGSGKHNNWSLCTDTGINLFAPGK NPKGNMLFLTFLVNALMMVYKNQDLLRASIMSASNSYRLGANEAPPAILSCFLGSQLSST LDEIVRQVGNEKMTPEEKTTLKLGIGRIPEILLDTTDRNRTSPFAFTGNRFEFRAAGSSS NCAASMIAINAAMANQLNEFRASVEKLMEEGVGKDEAIFRLLKETIIASEPIRFEGDGYS EEWKQEAARRGLTNICHVPEALMHYIDNQSKSVLIGERIFNETELNSRLEVELEKYTMKV QIEGRVLGDLAINHIVPTAVAYQNRLLENLRGLKEIFPAEEYEVLSADRKELIREISHRV TSIKVLVRDMTEARKVANHMENYKERAFAYEEKVRPYLDQIRDHIDHLEMEVDDEIWPLP KYRELLFTK >gi|225935364|gb|ACGA01000028.1| GENE 307 396724 - 397425 584 233 aa, chain + ## HITS:1 COG:slr0449 KEGG:ns NR:ns ## COG: slr0449 COG0664 # Protein_GI_number: 16332256 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Synechocystis # 18 233 16 233 238 79 25.0 6e-15 MVKMNMSDIDISEPLSDMLAPLNNEQKEFLMNNYTIQTYKKNETIYCEGETPHHLMCLIS GKVKIFKDGVGGRSQIIRMIKPREYFAYRAYFAKQDFVTAAAAFEPSVVCLIPMSAITTL VAQNNDLAMFFIRQLSIDLGISDERTVNLTQKHIRGRLAESLIFLKESYGLEEDGSTLSI YLSREDLANLSNMTTSNAIRTLSQFATERLITIDGRKIKIIEEEKLKKISKIG >gi|225935364|gb|ACGA01000028.1| GENE 308 397460 - 397891 225 143 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2829 NR:ns ## KEGG: Dfer_2829 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: D.fermentans # Pathway: not_defined # 8 128 17 141 206 75 33.0 7e-13 MNIHIENIIADIKRGNKQAFKKLFDDYYPILCVFSSHYIEDKEVCKDIVQDALLVYWERK EDFDDILKVKSFLYTVTRNKCLNHLKHEQLDIPDFAEQEEFDSGFEAAIIEQETFRIVRK AVEELPTQMRNIILYSMKRVKKP >gi|225935364|gb|ACGA01000028.1| GENE 309 398083 - 399243 864 386 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # 
Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 189 385 132 327 331 90 30.0 6e-18 MNSFKEKFKIAKILASLFTHSSTPEEEKNYRAWLDENPEHQKIADRILNKETYEENSRLI KSFSSQKAWEKVYPLLGGNQSGTVFSWKKSLRYAALILLLLVPASYFIYHWISEETISDI TPGTLGGELILSDGKSFNLSDNNLPENAVKAFVIDSKGINYQIPANKPQVKEIQNTLRTL QGMECLITLSDGTRVHLNAETRLTYPVCFSSKERIVQIEGEAYFDVAPDKEHPFIVKTSH TSIRVTGTSFNVRAYADEDTESTTLISGTVRINSRNEEFELVPNQHYTYNKNTGTNTVAN VNTDLYTSWESGSFIFLNVPLENVMSYLSKWYGFQYSFEDEAAKQVRIGAYLNRYANMNP IIDMITELNLVNIKQREGILHISYKQ >gi|225935364|gb|ACGA01000028.1| GENE 310 399269 - 402838 1722 1189 aa, chain + ## HITS:1 COG:no KEGG:BT_3279 NR:ns ## KEGG: BT_3279 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 1187 14 1180 1182 680 35.0 0 MKKRVKSLPWVVSYASRNFVKFCLLWIFSLSSLMIQAQDNQRISIDLRGETLEAALWYLQ NRTKFVFMYGTEDIANVTDITIKAKDKTITEILDECLEGTNLTYEISGSAIVIKKRQSQK ITISGWVRDTNGEALPGATITIRGSKHGAIAGLDGQYTFNIPAQEGLILTYSFIGMEKKT VKYTGKKTINVTLNSSTSTEIDEVVVTGYQNIQRRDLVGSITTVKAKDIMMPSYTSIDQM LQGRVAGMVVTNTSSRVGTAPKIQIRGTSTLLGNQDPLWVVDGIIQEDPLELDNTSLMTE DLRNIIGNQISWLNPADIESISILKDASATAIYGSKASNGVIEITTKKNTTDRLSINYTS NFIIGTRPNYDQFNYMNSKERVLFSQEAFNWGTPYNAEPIKQIYTYEGLLNMYLSHDISS EEFLAQRNVLETQNTDWFKLLTQRSFSHNHNLSVSGGTNKYSYSASLGYSNSEGQEIGNN SERMTGRIAITIRPIQKLTINATINGSVSTNEGFAGGVNPMSYATTTNRSIDPDAYYQKK ASYTYNSGMKSLSYNFINERENSGSKSKSSFMSASLDLKWNILDWLTYQFTGGYSDNNST NEAWETEQTFYIADTYRGYDFNSVSPNSKEFKSALLPFGGELYTNNTHQYSYNIQNKLQF SKAFNNENRLNALLGMELRSSTNKGLTNTVWGYAPDRGEIITSPTTLQTFEPITGSLQEG WGILQKIYNNGWRKTNTTDNYFSVFATLAYSFKNRYVVNANVRNDASNRFGQDTNHRIDP TYSFGFSWRASEEHFMKKYLNWITSLNFRGTYGIQGNALTRLSPDLILNQGKVADLYNRY QSTISQIPNPNLSWERTKTWNFGVDLELFHLFYMNLEYYTRRSNAIVELELPFEYGISSM KRNGGIIKNRGIEYTLTFTPIQKRDYALSVSLNASKNWNEGGHTNIEVNAAHFLNGRSDI ILKEGYPLSGFWSYSFAGLDGKTGEAQFNLLDIPEDQRSRQIDPTTYLVYSGQKEPYFTG GLSLSFRYKSLTLNTSFSLLLGNKKRLPSPYEQFASSYFMPDPYTNVNRDLLNRWKEPGD 
ESHTIIPSLPKAGMTYVQLPNLENVYRIPIWEQSDAMVVSGSFLRCRNIGLSWQMKREWC EKISMRNLSLNFNMDNIFVIASKRFNGFDPEVSNSVLPRNYSLGINIGF >gi|225935364|gb|ACGA01000028.1| GENE 311 402856 - 404397 937 513 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3049 NR:ns ## KEGG: Cpin_3049 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 506 1 465 475 234 33.0 4e-60 MKTIIYIMTGIVLFSLSSCSNFLEPQSQSEYIPKDANALQEMLIGSAYPKHAGGNSLLAF LSFFDDDVQFHKTNYEFNTNTLGNIEAKKAVYTWQPDMFFIMDKNNYNIQNIWEGYYNYI LGANAALDYINEVNGTEAEKNYVVAQALGLRAFYYFMLVNHFGAPYNYNKQALGVPLKLN SNLLPEDQLLMARNSVEEVYNQIIIDLNESERLFLTLSKDKQYEPNYLVSLPMVQLLKSR IFLYMENWKDAAIYAQKVINDWTFSLINLNGLPAPSVAEPYYNFTSLKSSEVIWLYGKIA DLTVFNDETVSHEEDGYFGKVTHYRKAFIASDDLIESFQDGDLRKEKYIAKEYDKDNNVF LEDSYSSFGKYKVSAVGEPNGSENFALSFRLSEAYLNLAEAAAHNNDEKTALSTLKTLLE NRYEPGKFVEPNGLTGEVLKTFIKNERRKELCFEGQRWFDLRRYGMPQITHRWGEQVYTL KQNDPSYTMPIPDAVLLKNKKLEQNPLAPKREN >gi|225935364|gb|ACGA01000028.1| GENE 312 404433 - 405437 643 334 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171562|ref|ZP_05757974.1| ## NR: gi|260171562|ref|ZP_05757974.1| hypothetical protein BacD2_06810 [Bacteroides sp. 
D2] # 1 334 1 334 334 679 100.0 0 MKKHILWGLFTCIVLASCYKEDALSPTEGGIELRFEVPQGNNSWDDDINQIYKDFGVYLI YTDLKDADFNRSWTGTAVGSDGLYGQGCISNEMTNYYVEFMKKHVFAYINPQVTSKVLPM YWYLAYNVYSKSVIEFGGIIFGTYIVPIHEIDDGLDYWSTCMFGEDNPDDPYFIPTDRAT LDKRRKMILGPIIEKAVKAGNIVIPTDFEIGFDHVTSLVNSLGMEDDPNYYLTRGYPGTV NTYKFLRITTPDHNKLPPTNEETFIGYILISMFYNQAKIAEVYPADKYPLIAEKFSFVQK YMKEKYQIDLEAIANGPDWDLPLPTIPEKPNEGE >gi|225935364|gb|ACGA01000028.1| GENE 313 405464 - 406474 733 336 aa, chain + ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 196 317 42 154 181 67 28.0 5e-11 MTMKKVKAVLLVLSVICNVAVWAGNGLKISGKMKVLKSTTLQIKDINENIILSCNIGING VFTTNEKNIIPDIYKLYIGETEQIIYLENNPITIKGYFDEKNPEQSSLSFTGIDPFLTLQ NYMPTERDPDKATISTSAKEKLTPAMASALAYLADVNDYPSNKMLLDMIPEQDRNSLSAK WLINKVEVLSHQIIGAECPGFTFIDSNGKSVSLKDFRGKIVVLDFCASWCGPCRKEMRSM LTIYNDLKADDLEFISVSLDDSEAKWRKMLDEEKLPWVMLWDKTGFPKNSKTPSAIQAAY GFYSIPFLVVIDKEGKLAARNVRGEQVREAILKIRK >gi|225935364|gb|ACGA01000028.1| GENE 314 406531 - 408210 1486 559 aa, chain + ## HITS:1 COG:alr0996 KEGG:ns NR:ns ## COG: alr0996 COG1404 # Protein_GI_number: 17228491 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 266 506 123 360 488 82 30.0 3e-15 MLFYQHYSIGAVSQTRSRNTKKQSELDWFNCSFDKDSVYGAEVNKAYEYLKANKKKAKKR PVVALIGTGMDVEHEDLKQAIWVNPKEKLNQKDDDKNGLIDDINGWNFIGGKDGQVMESL TREGEREFFRLKDKYADYIFDGKKYYKIINGKRQEVPAPENMEEYNYYRYKVMPESRIGG SYGGLQLSYVIEEYVEKFDKDMKKRFPGKELTVDDFQSCYDPKAERDSLSEIAFVFTAYY FSLYNTDKWEPVYQNMGKKSVATAKTSYEDALKKYGTDNRKEIVGDNPLDINDTQYGNNV LLTSDAATGVMKAGVIAAKRDNGVGSNGIADNAEIMTLRIHPGEGEPYLKDMALAIRYAV NHGADVILLPEQNSLYPEEQRQWIADALKEAEKKGALVIVPVWDLSVDMDKDEFFPNRKM RKDGELTNFMVVASSDKNGNPVLNTNYGATALDLYAPGTDIYSSYMGDTYQKGTGEGMAS ATVAGVAALVKSYFPKLTGSQIRDILLKSVTSRKGVEVEKGIRVNDSPSQDLFLFDDLCI SGGIVNAYQAILEAEKVSK >gi|225935364|gb|ACGA01000028.1| GENE 315 408220 - 409362 907 380 aa, chain + ## HITS:1 COG:no KEGG:HCH_00467 NR:ns ## KEGG: HCH_00467 # Name: not_defined # Def: PDZ domain-containing protein # Organism: H.chejuensis # Pathway: not_defined # 20 367 36 381 395 97 25.0 7e-19 MKYTTRMIGTVVAFCMAVPVFAQQYKATIPYRMVGEKMIIEMKVNGTARPFIFDTGGRTA LTTQACQALQMAATDSTKVTDVNNVESYYKTTRIENLTTPDHVINFKNAPSLVIDEVKGW ECFGVDGIIGSDLFANTIVTIDSQAKNIIVTSAEKPSTVSLRKMLNFTKGGGMPIVSVQI APVSNINVLFDTGSPSLLSLIESDFEKIKPEASMEVVSEGYGEGSIGVSGQADKASSYRV RIPLLSVGATKFRNVTTSTNNHPYTLLGVKLLQYGKVTIDYPRGRFYFEAFQPDNEINNQ GNNFDLTVKDGDLFVSTVWSSTKGKIAVGDKVVKINGKPTKKYDFCESILNGIPELKEKK KTKLTIETASGVKDIIYEKE >gi|225935364|gb|ACGA01000028.1| GENE 316 409364 - 410524 941 386 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160887127|ref|ZP_02068130.1| ## NR: gi|160887127|ref|ZP_02068130.1| hypothetical protein BACOVA_05143 [Bacteroides ovatus ATCC 8483] # 1 386 6 391 391 721 100.0 0 MKSIGILTGLWLLLFLNCGQRMAAQIRNKVCDTIPYEFIQEKIIIPVTVNGIKVKYIVDT GGRTGTMYDAATEMKATAAGYMRISDVNAQGSNYQEAHVQNVSIGENYKIKQLKTMVLPK NPFFTGLGVVGILGGDAFAQSVVTFDSRSKIMVINYPYRPEGLKVTDGIPLLDETDHHSI VNVRLGDNDFKVLFDTGAGGFLLYSTEDYERLSDISKVTNHGYGIVAAGITGLGKPVDIK KVTVPPINIMGKEFTNVGSTTTVMNGSIIGVDLLEYGKVIIDYMRRRFYFFPFEEGKTDM GGAPALWNVSILPRNDRFEITTIWDSMKDKVAFGDQVININGTSLDDCPMSQMAVEDIMN AIPGDTGYIIVKKDNQEKKIEIRKEK >gi|225935364|gb|ACGA01000028.1| GENE 317 410545 - 411834 
1132 429 aa, chain + ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 318 417 5 104 117 111 47.0 2e-24 MKRIITKMYLCLLAFCIAGGISAQTQNSMTEVIPFKTIDGKIIVEAAINGEVADFVLDLS GHNALLPEALKKLRIDTGKNGTFSAYQDFVFKQVPVRKVYEMATVAIGNNTFNNDLPAFT LEDEPYLRKLGVMGVLSGAIFRTSVLTIDMQRKKLTITQPYRPSYMKLNYRENFELITGL GIVCPISIQGKPVSLVLDTWSEGLVNLTEKDFNTWSTQYTKGTNQKVSNGYKEATQEEES LILPETMFVKTKIEDAMAVKNPYLKRSVLGKKILDYGIISIDYIHQKIYFQPFDMVPIPE AEAKVTETKVEDGKLNPITRQFFLEHIFDYRKGNDFVYNGDKPVVIDFWATWCGPCMRLL PEMEKLAEKYKGKVVFYKVNADKEKDLCNHFGVQALPTLFFIPAGGKPIIEVGATPEKYV QIIEEQLLK >gi|225935364|gb|ACGA01000028.1| GENE 318 411993 - 414422 2105 809 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 26 636 26 605 757 471 40.0 1e-132 MKKSISLLLLSLLMITPSCQKTKEAANEYNIVPKPNQILPQEGRFELNNKVCLVVPSDAP EVKSIADSLAGQLKLTAGISLKEAESADGKTAISFVVQEGMPKEGYKLSVTPSFITLTAS QPNGFFYGVQTLYQLLPPAIYGKQLDKKADWSVPAVEIEDSPRFVHRGLMLDVCRHYVPI DYIYKFIDLLAMNKMNVFHWHLTDDQGWRIEIKKYPKLTEIGSQREKTLVDYYYVNYPQV FDGKEHGGYYTQEQIKDVVAYAASKYINVVPEIEMPGHALAALAAYPELSCDSTQTYKVS PTWGVFEQVFCPSETTFIFFEGVMDEVIELFPSEYIHIGGDECPKTAWKNSAFCQQLIRQ LGLKDDTTPSKIDGIKHSKEDKLQSYFVTRMEKYLNSKGKNIIGWDEILEGGLAPNATVM SWRGVEGGMNAAKAGHNAIMTPNPYVYLDYYQEEPEISPTTIGGYNTLKKTYSYNPVPDD ADELAKKHIIGIQGNIWREYMQTSERTDYQAFPRAMAIAETAWTQNANKDWKNFCERMVT EFERLKVMNTQPCLNFYDVNINTHADENGPLMVLLESFYPNAEIRYTTDGSEPTKASVLY EKPFVLEGNIDLKAAAFKDGKMLGKVAHKPLYGNLLTGKPYTVNYTMGWTGDIFDENDVL GADKTTFGLTNGKRGNNASYTPWCSFGIVEGKDLEFIVHLDKPTQVSKVIFGSLFNPAMR ILPAGGVAVEVSADGKQYTRIAEKALKHDYPETGRIAFTDSIEFEPAQATFLKVKIKNGG TLRNGVNFEKNNGPEVIPAELWIDEIEAY >gi|225935364|gb|ACGA01000028.1| GENE 319 414592 - 415542 707 
316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 310 5 303 306 276 48 7e-73 MAEIEKVKCLIIGSGPAGYTAAIYAGRANLCPVLYEGLQPGGQLTTTTDVENFPGYPEGI SGPQLMEDLRAQASRFGTDVRFGIATAADLSKAPYKITIDGDKVIETESLIIATGATAKY LGLEDEKKYAGMGVSACATCDGFFYRKKVVAVVGGGDTACEEAVYLAGLASKVYLIVRKP FLRASKIMQERVMNHEKIEVLFEHNAVGLFGDNGVEGVNLVKRWGEPDEERYSLPIDGFF LAIGHQPNTEIFKEYVDTDEVGYIITEGDSPRTKVPGVFAAGDVADPHYRQAITAAGSGC KAALEAERYLSAKGLI >gi|225935364|gb|ACGA01000028.1| GENE 320 415579 - 416223 597 214 aa, chain - ## HITS:1 COG:no KEGG:BT_4335 NR:ns ## KEGG: BT_4335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 1 215 215 311 79.0 1e-83 MRKYIFSVLIALLSLPVIAQQQSQAKVILDKTAEAFRKAGGVKADFTVKAVTNGLVEGAE NGTIQLKGEKFVLKTSDIITWFDGKTQWSYVTKNDEVNVSNPTQEELQQINPYTFLYMYQ KGFSYKLGTTKTYRGKAVWEVVLTARDKKQELEHITLFVTKDTYEPLYILLQQRGQQTRN EITITSYQTKQNYADQVFTFNKKQYPNAEVIDLR >gi|225935364|gb|ACGA01000028.1| GENE 321 416243 - 418735 2388 830 aa, chain - ## HITS:1 COG:BS_spoIIIE KEGG:ns NR:ns ## COG: BS_spoIIIE COG1674 # Protein_GI_number: 16078743 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 242 815 218 776 787 399 41.0 1e-110 MAKKKLDKEAERTPSSPSKIVAVFKNETIHFVIGLMLVIFSVYLLLAFSSFFFTGAADQS IIDSGSSADLAAVNNQVKNYAGSRGAQLASYLINDCFGVSSFFILVFLAVAGLKLMRVRV VRLWKWFIGCTLLLVWFSVFFGFAFMDHYQDSFIYLGGMHGYNVSRWLISQVGVPGVWMI LLITAICFFIYISARTVIWLRKLFALSFLKREKKEEKENVPEGEGDPEFTTSQPQEVEFN LKRTYKQTPPPAPVMDIQAEEPEDDFPINKPEKEDTSVSDESEGVTMVFEPTVSNPAPIV QEDSLEEAEPGFEVEPAASEEEYQGPELEPYNPTKDLENYRFPTIDLMKHFENDDPTIDM DEQNANKDRIINTLRSFGIEISTIKATVGPTVTLYEITPEQGVRISKIRGLEDDIALSLS ADGIRIIAPIPGKGTIGIEVPNKNPKIVSGQSVIGSKKFQESKYDLPIVLGKTITNEVFM FDLCKMPHVLVAGATGQGKSVGLNAIITSLLYKKHPAELKFVLVDPKKVEFSIYSVIENH FLAKLPDGGEPIITDVTKVVQTLNSVCVEMDTRYDLLKMAHVRNIKEYNEKFINRRLNPE 
KGHKFMPYIVVVIDEFGDLIMTAGKEVELPIARIAQLARAVGIHMIIATQRPTTNIITGT IKANFPARIAFRVSAMMDSRTILDRPGANRLIGKGDMLFLQGADPVRVQCAFIDTPEVEE ITKFIARQQSYPTPFFLPEFVSEDGGSEVGDIDMGRLDPLFEDAARLVVIHQQGSTSLIQ RKFAIGYNRAGRIMDQLEKAGIVGPTQGSKARDVLCMDDNDLEMRLNNLQ >gi|225935364|gb|ACGA01000028.1| GENE 322 418861 - 419526 631 221 aa, chain + ## HITS:1 COG:no KEGG:BT_4333 NR:ns ## KEGG: BT_4333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 219 1 219 219 345 88.0 5e-94 MEKESQTIFDKNVIEFVTVAAEFCAFLERAERMKRSTFVDTSLKILPLLYLKASMLPKCE TIGDEALETYVTEEIYEILRINLSGLMADKDDYLDVFVQDMVYSDQPIKKSISEDLADIY QDIKDFIFVFQLGLNETMNDSLAICQENFGTLWGQKLVNTLRALHDVKYNQEEEEEEVGN EEGFYEPSDDNDCCEEDGCHCHDDDCHCHEDGCHCHDDELK >gi|225935364|gb|ACGA01000028.1| GENE 323 419529 - 420188 523 219 aa, chain + ## HITS:1 COG:no KEGG:BT_4332 NR:ns ## KEGG: BT_4332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 206 216 378 91.0 1e-104 MIVRRTIDKDELKELPKTVFPGRIYVIQSEAETEKAVAYLQSRPVIGIDSETRPSFTKGQ SHKVALLQISSEECCFLFRLNMTGLTQPLVDLLENPAVIKVGLSLKDDFMMLHKRAPFTQ QSCIELQDYVRQFGIQDKSLQKIYAILFKEKISKSQRLSNWEADVLSDGQKQYAATDAWA CLNIYNLLQELKQTGDWEVSPLTATVSPAEEVMSNKISD >gi|225935364|gb|ACGA01000028.1| GENE 324 420252 - 421430 575 392 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 392 1 393 396 226 36 1e-57 MHKVYLKPGKEDSLKRFHPWIFSGAIARFDGEPDEGEVVEVYTSKKEFIAEGHFQIGSIA VRVLSFRQEPIDHDFWKRKLEIAYDMRRSIGIATNPTNNTYRLVHGEGDNLPGLVIDVYA KTAVMQAHSAGMHVDRMAIAEALSEVMGDKIENIYYKSETTLPFKADLFPENGFLKGGSS DNIAQEYGLKFHVDWLKGQKTGFFVDQRENRSLLERYAKDRSVLNMFCYTGGFSFYAMRG GAKLVHSVDSSAKAIDLTNKNVELNFPGDSRHEAFAEDAFKYLDRMGDQYDLIILDPPAF AKHKDALRNALQGYRKLNAKAFEKIKPGGILFTFSCSQVVSKDNFRTAVFTAAAMSGRSV RILHQLTQPADHPVNIYHPEGEYLKGLVLYVE >gi|225935364|gb|ACGA01000028.1| GENE 325 421839 - 423089 1182 416 aa, chain + ## HITS:1 COG:STM3113 KEGG:ns NR:ns ## COG: STM3113 
COG0477 # Protein_GI_number: 16766414 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 409 1 406 418 415 54.0 1e-116 MSIKVRLIIMNFLQFFVWGSWLISLGGYMGRELHFEGGQIGAIFATMGIASLVMPGIIGI IADKWFNAERLYGLCHIAGAGCLFYASTATGYDQMYWAMLLNLLVYMPTLSLANTVSYNA LEQYKCDLIKDFPPIRVWGTIGFICAMWAVDLTGFKNSSAQLYVGGASALLLGLYSFTLP ACRPAKSENKSWLSAFGLDALVLFKKKKMAIFFLFSMLLGAALQITNTYGDLFLGSFASI PEYADSFGVKHSVILLSISQMSETLFILAIPFFLRHFGIKQVMLISMFAWVFRFGLFGFG DPGSGLWMLILSMIVYGMAFDFFNISGSLFVEQEAKSSIRASAQGLFFMMTNGLGAIMGG YASGAVVDAFSVYADGRLVSREWMNIWLIFAAYALVIGILFALVFKYKHQQESKTN >gi|225935364|gb|ACGA01000028.1| GENE 326 423103 - 423696 608 197 aa, chain + ## HITS:1 COG:MT1877 KEGG:ns NR:ns ## COG: MT1877 COG1259 # Protein_GI_number: 15841299 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 6 155 3 151 164 87 39.0 2e-17 MDKKVELQVINITNSQAQVGAFAMLLGEVDGERQLPIIIGPAEAQATALYLKGVKTPRPL THDLFITSLTILGTSLIRVLIYKAKDGIFYSYIYLKKDEEIIRIDARTSDAVALAVRADC PILIYESILEQECLHMSSEERTRSEETDNDEGAEEEHDLPGATSRTLEEALEQAIKDENY ELAAQIRDQINSRNKNQ >gi|225935364|gb|ACGA01000028.1| GENE 327 423705 - 424403 571 232 aa, chain + ## HITS:1 COG:BH1350 KEGG:ns NR:ns ## COG: BH1350 COG1385 # Protein_GI_number: 15613913 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 223 1 238 250 98 30.0 1e-20 MHVFYTPDIQKSNELPEEEAQHCIRVLRLSIGDEITLTDGKGNFYKAEITVATNKRCFVN IKETIFQEPLWPCHLHIAMAPTKNMDRNEWFAEKATEIGFDELTFLNCRFSERKVIKTER IEKILVSAIKQSLKARLPKLNEMIEFDQFIRQEFKGQKFIAHCYEGEKPLLKNVLKPGED ALVLIGPEGDFSEEEVKKAIERGFVPISLGKSRLRTETAALVACHTLNLQNQ >gi|225935364|gb|ACGA01000028.1| GENE 328 424455 - 426002 1639 515 aa, chain + ## HITS:1 COG:no KEGG:BT_4327 NR:ns ## KEGG: BT_4327 # Name: not_defined # 
Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 515 1 515 515 764 80.0 0 MAKKTISRLSVLAVLIVFLAACSKTSEYTNVIPADASVVASINLKSLASKAGLDDKENEA AKQKVLEALKSGMNAATFQQLEKVMKNPGESGIDVESPFYVFSSSSFPYPTVVGKVNNED KLHASLDVMAKEQICQPVGEADGYSFTTMNSGLLVFNSSTILVVNVSGTTQTDKAKEAIT NLLKQTASNSIVKSGAFQKMEKQKSDINFFASMTAIPSTYRDQITMGLPTEVKAEDITLI GGLNFEKGKIALKTENYTENEAVKALLKKQMESVGKANNTFVKYFPASTLMFFNVGVKGG ELYNLLSENKEFRNTVSIAKADEVKELFSSFNGDISAGLINVTMSSAPTFMMYADVKNGN ALEMIYKNKESLGLKRGEDIMQLGKDEYVYKTRGMNIFFGIKDKQMYATNDELLYKNVGK AADKSVKDAPYASDMKGKSLFIAINAEAILDLPIVKMVAGFGGQEAKTYIELANKVSYLS MSSEGEVSEIDLCLKDKDVNALKQIVDFAKQFAGM >gi|225935364|gb|ACGA01000028.1| GENE 329 426023 - 426667 200 214 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 18 210 22 211 223 81 28 4e-14 MNSIHLQQTLPQVFADRNSVTSDVWHQDLFFRKGEMYLIEAASGTGKSSLCSYIYGYRND YQGIINFDETNIKAYSVKQWVDLRKHSLSMLFQDLRIFTELSALENVQLKNNLTGCKKKK EILSFFEQLGIADKINVKAGKLSFGQQQRVAFIRALCQPFDFLFLDEPISHLDDDNSRIM GELIIAEAKAQGAGVIATSIGKHIELPYNHTLQL >gi|225935364|gb|ACGA01000028.1| GENE 330 426762 - 427967 979 401 aa, chain + ## HITS:1 COG:no KEGG:BT_4325 NR:ns ## KEGG: BT_4325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 401 401 731 94.0 0 MPTLVWKLLRQHISIGQLAGFFLANLFGMMIVLLSVQFYKDVIPVFTEGDSFMKKDFIIA TKKISALGSFAGKSNTFSAEEIADLKKQPFTKTIGAFTPSQFKVSAGLGMKEAGIHLSTD MFFESVPDEFVDIKLDKWHFDENTHTIPIIIPRNYLNLYNFGFAQSRSLPKLSEGLMSLI QMDIMMRGNGRVEQYKGNIVGFSNRLNTILVPQSFMKWANENFAPNAEAQPARLIIEVSN PADSAIASYFQKKGYETEDGKLDAGKTTYFLRLIVGIVLGVGLFISILSFYILMLSIFLL LQKNTTKLESLLLIGYSPNKVALPYQLLTVGLNVIVLVLSIGLVSWLRSYYIDSIRLLFP QLETGSLWAAISMGILLFIVVSVINILAVKRKVLSIWMHKS >gi|225935364|gb|ACGA01000028.1| GENE 331 427991 - 430642 1531 883 aa, chain + ## HITS:1 COG:no KEGG:BT_4324 NR:ns ## KEGG: BT_4324 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 881 1 879 880 1406 76.0 0 MKNKTTFSIHRGLIAFFFGILFCFAPLSGQNVQDTIIAYFNLLEKVPQEKLYLHLDKPFY GAGEKIWFKGYLVNSVTHQDNTQSNFIITELVNRSDSIVERKKIRRDSLGFHNAFTLPPT LPAGDYYLRGYSNWMLNQEPEFFYSRNLKIGNSIDNTIVSNIEYQQEDESHYTAKVRFTS NTQEAFGNTAIRYRYIENGKIKDKGKRKTDENGLISISLPDLKTIATRYIEVEFDDPQYT YKKTFYLPSFTKDFDVKFFPEGGALLTVAHQNIAFKAQGSDGFSTEIEGFLFDANGDTLT AFRSEHDGMGVFTLNPTVGNSYYVIAKSGDGISKRFNLPAVEEKGITLSMTHYKKEIRYE IQKTEATEWPQKLFLIAHTRGKLVILQPVSVDRTFGRMNDSLFNAGITHFMLIDQQGNAL SERLVFIPDRNPHQWQILADKPTYGKREKVSLQISAKDDNGTPVEGSFSVSITDRKSIRP DSLADNIVSNLLLTSDLKGYVENPGYYFLQQDLRTLRTVDFLMMTHGWRRHHIQNVLTAP SMNLTNYMEKGQTISGRIKGFFGGNVKKGPICILAPKQNIIATTTTDEKGEFIVNTSFRD STTFLVQARTKRGFAGVDIVIDTPQYPATSHKSPFHDGTATFMEDYLLNTRDQYYMEGGM RVYNLKEVVVTGSRKKASSESIYTGGINTYTIEGDRLEGFGAQTAFDAVTRLPGISVTNG NEIHIRNNPEQPVIVVDDVVYEDDNDILTMIQTSDMSSLSLLRGADAAILGSRGSAGAIV ITLKDGKDIPARPAQGIITCTPLGYSDSVEFYHPTYDTPEKKNDQRSDLRSTIYWNPTLQ LDANGKATIEYYTPDSTAPEDITIEGVDKNGKVCRIIQTINNR >gi|225935364|gb|ACGA01000028.1| GENE 332 431431 - 432351 853 306 aa, chain + ## HITS:1 COG:TP0637 KEGG:ns NR:ns ## COG: TP0637 COG0324 # Protein_GI_number: 15639624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Treponema pallidum # 4 286 20 306 316 224 43.0 1e-58 MPDYDLIAILGPTASGKTPFAAALAYELNTEIISADSRQIYRGMDLGTGKDLADYTVNGR TIPYHLIDIADPGYKYNVFEYQRDFLNSYESIKQKGCLPVLCGGTGMYLESVLKGYKLMP VPENQELRNRLANHSLEELTEMLSQYKVLHNSTDVDTVKRAIRAIEIEEYYAAHPVPERE FPELNSLIIGVDIDRELRREKITRRLKQRLDEGMVDEVRRLIEQGIAPDDLIYYGLEYKY LTLYVIGKLTYEEMFNGLEIAIHQFAKRQMTWFRGMERRGFTIHWMNAELPMEEKIAFVK EKLQGN >gi|225935364|gb|ACGA01000028.1| GENE 333 432410 - 433336 829 308 aa, chain + ## HITS:1 COG:TM0358 KEGG:ns NR:ns ## COG: TM0358 COG1597 # Protein_GI_number: 15643126 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Thermotoga maritima # 10 284 6 281 304 
114 28.0 3e-25 MSVEPKKWGVIYNPKAGTRKVKKRWKEIKEYMDSKGVDYDYVQSEGFGSVERLAKILANN GYRTIVIVGGDGALNDAINGIMLSDAEDKENIALGMIPNGIGNDFAKYWGLSTEYKPAVD CIINHRLKKIDVGYCNFYDGNEHQRRYFLNAVNIGLGARIVKITDQTKRFWGVKFLSYVA ALFSLIFERKLYRMHLRINDEHIRGRIMTVCIGSAWGWGQTPSAVPYNGWLDVSVIYRPE FLQIISGLWMLIQGRILNHKVVKSYRTRKVKVLRAQNAAVDLDGRLLPKHFPLEVGVLPE KTTLIIPN >gi|225935364|gb|ACGA01000028.1| GENE 334 433412 - 434212 792 266 aa, chain + ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 12 266 30 284 286 263 50.0 2e-70 MIELKNNPAGNFFLLAGPCVIEGEEMAMRIAERVVKITEALQIPYVFKGSYRKANRSRLD SFTGIGDEKALKVLRKVHETFGVPTVTDIHTADEAAMAAEYVDILQIPAFLCRQTDLLVA AAKTGKTINIKKGQFLSPLAMQFAADKVVEAGNKNVMLTERGTTFGYQDLVVDYRGIPEM QSFGYPVILDVTHSLQQPNQTSGVTGGMPQLIETVAKAGIAVGADGIFIETHENPAVAKS DGANMLKLDLLEGLLTKLVRIREAIK >gi|225935364|gb|ACGA01000028.1| GENE 335 434294 - 437131 2914 945 aa, chain + ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia burgdorferi # 8 894 15 885 933 300 28.0 7e-81 MKHLLRGLFIAVLFICCNFQLVLAQPMQELPVDKNVRIGKLDNGLTYYIRHNALPEKRVE FYIAQKVGSILEEPQQRGLAHFLEHMAFNGTKHFPGDETGLGIVPWCETKGIKFGTNLNA YTSVDQTVYNISNVPTENINVVDSCLLILHDWSSAINLADKEIDKERGVIREEWRSRNSG MLRIMTDAQSTLYPDSKYSDCMPIGSIDVINNFPYQDIRDYYAKWYRPDLQGIVIVGDIN VDEIEAKLKKVFADVKAPVNPAERIYYPVADNQEPLIYIGTDKEVKNPYVNIFFKQDATP DSLKNTIAYYATQYMVSMAMNMLNNRLNELRQTANPPFTSASAEYGNYFLAKTKEALALD ASSKIDGIDLAMKTVLEEAERARRFGFTASEYERARANYLQAVESAYNEREKTKSGSYVN EYINNFLEKEPIPGIEVEYTLVNKLAPNIPVEAVNQVMQQLITDNNQVVLLAGPEKEGAK YPTKEEIAALLKQMKSFDLKPYEDKVSNEPLISEDIKGGKIVSEKAGEIYGTTKLVLSNG VTVYVKPTDFKADQIVMKGVSFGGTSVFPNEEIINISQLNGVALVGGIGNFSKVDLGKAL AGKRANVGAGIGNTTETVSGSCAPKDFETMMQLTYLTFTSPRKDNEAFESYKNRLKAELQ 
NADANPMTAFSDTITSVLYGHHPRAIRMKENMVDQIDYDRILEMYKDRYKDASDFTFFLV GNVDLATMKPLIAKYLGGLPSINRKETFKDNKMDIRKGEIKNVFAKAQETPMATIMFLYS GTCKYDLRNNVLLSFLDQALDLVYTAEIREKEGGTYGVSCNGSLGKYPKEELVLQIVFQT DPAKKDKLSAVVVEQLHKMAKEGPSAEHMQKIKEYMLKKYKDAQKENGYWLNNMDEYLYT GVDNTKDYEKLVNSITAKEVQDFLAKLLKQNNEIQVIMTVPEENK >gi|225935364|gb|ACGA01000028.1| GENE 336 437155 - 438648 982 497 aa, chain + ## HITS:1 COG:no KEGG:BT_4319 NR:ns ## KEGG: BT_4319 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 497 1 496 496 797 87.0 0 MMIFNELRKHGRLAAKRHPMYEKNKVAKILGYVMGAFWAGYLIFFGTTFAFGFSDMVPNR EPYHVMNAVVLIFILALDFLLRVPFQKTPTQEVKPYLLLPVKRIRVIDFLLIRSGLSLFN LFWLFLFVPFSFITITKFFGIPGVITYLIGILLLIIANNYWYLLCRTLINERIWWILLPI AFYGGIGCLLFIPEDSPLFYFFMDLGDGYIQGNILYFLGTILVIAILWLINRKIMSGLIY AELAKVEDSQIKHVSEYKFFERYGEVGEYMRLELKMLLRNRRCKGALRHIAIVVMAFSLA LSFSSVYDNNFMTSFICVYNFAVFGMIILSQIMSFEGNYIDGLMSRKESIMSLLKAKYYT YSIGEIIPFILMVPAIIMNKLTLLGAFAWFFYTIGFIYFCFFQLAVYNKQTVPLNEKVAS RQTNSAIQMVVNFAAFGVPLILYSLLNAFLGETITYIILLVVGLGFTLTSPLWIKNVYHR FMKRRYENMEGFRDSRQ >gi|225935364|gb|ACGA01000028.1| GENE 337 438687 - 439394 209 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 231 1 240 245 85 25 4e-15 MIIIDKLKKNFGEKIAVDIEHYEINQGDMLGLVGNNGAGKTTLFRIMLDLLKADDGKVII NDIDVSQSEDWKSITGAFIDDGFLIDYLTPEEYFYFIGKMYGLKKEEVDERLIPFERFMS GEVIGHKKLIRNYSAGNKQKIGIISAMLHYPQLLILDEPFNFLDPSSQSIIKHLLKKYNE EHQATVIISSHNLNHTVDVCPRIALLEHGVIIRDIINEDNSAEKELEDYFNVEEE >gi|225935364|gb|ACGA01000028.1| GENE 338 439465 - 440043 232 192 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 81 191 70 174 175 94 39 8e-18 MMMRYITEKSDSHKSSIKKPDKKKRYFLQFILLLGLVASLSSCRTSAPRLDYQALARASI LLGVDINMEDNHKLYLEAADWIGVPYRGGGDSKRGADCSGLVYQVYRKVYRTQVPRNTED LKKESNKVAKRNLREGDLVFFTSSRSKNKVAHVGIYLKNGKFIHSSTSKGVIVSNLNENY YTKHWISGGRIR >gi|225935364|gb|ACGA01000028.1| GENE 339 440070 - 440954 688 
294 aa, chain + ## HITS:1 COG:SMc02488 KEGG:ns NR:ns ## COG: SMc02488 COG3735 # Protein_GI_number: 15966800 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 21 294 84 358 358 89 27.0 6e-18 MKSFIGAVLFICVAFSANAQLLWKVSGNGLNQPAYIIGTHHLAPFSIMDSIAGLKKAMNE TQQVYGEMKMSEMQSPATMEKMQKAMMIESDTTLNSLLSPKDFETANKFCKENLMVDLNM APKLKPAFLLNNVVVMAYVKHIGKFNPQEQLDTFFQSQAIQNGKKVDGLETAEFQFNLLF NGTSLQRQAQLLVCTLNHIETEVENLKRLTNAYMKQDLNTMLKISEERKGDQCDALPGEE DAMIYNRNKTWAEKLPAIMKAAPTFVAVGALHLPGEKGLLNLLKRQGYTVEAVK >gi|225935364|gb|ACGA01000028.1| GENE 340 440993 - 441631 558 212 aa, chain + ## HITS:1 COG:MA2967 KEGG:ns NR:ns ## COG: MA2967 COG0546 # Protein_GI_number: 20091785 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Methanosarcina acetivorans str.C2A # 1 210 63 272 279 186 47.0 2e-47 MNYKTYLFDFDYTLADSSRGIVKCFRIVLTRHQYLTVTDEAIKRTIGKTLEESFSILTGI TDPAQLEAFRQEYRLEADVHMNVNTRLFPDTLSTLKELKKRGARVGIISTKYRFRILSYL EEFLPKDFLDIIVGGEDVKAPKPSPEGVKFALEHLGSSPEETLYIGDSTVDAETAQNAGV DFAGVLNGMTTAEELRVYPHKIIMQNLGELVQ >gi|225935364|gb|ACGA01000028.1| GENE 341 441826 - 442143 534 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237715590|ref|ZP_04546071.1| 50S ribosomal protein L21 [Bacteroides sp. 
D1] # 1 105 1 105 105 210 100 8e-53 MYAIVEINGQQFKAEAGQKLFVHHIEGAENGSTVEFEKVLLVDKDGNVTVGAPTVEGAKV VCQVVSNLVKGDKVLVFHKKRRKGHRKLNGHRQQFTELTITEVVA >gi|225935364|gb|ACGA01000028.1| GENE 342 442165 - 442428 449 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160887096|ref|ZP_02068099.1| hypothetical protein BACOVA_05112 [Bacteroides ovatus ATCC 8483] # 1 87 1 87 87 177 100 6e-43 MAHKKGVGSSKNGRESQSKRLGVKIFGGEACKAGNIIVRQRGTEFHPGENIGMGKDHTLF ALVDGTVSFKVGKEDRRYVSVVPATEA >gi|225935364|gb|ACGA01000028.1| GENE 343 442659 - 443933 1440 424 aa, chain + ## HITS:1 COG:aq_298 KEGG:ns NR:ns ## COG: aq_298 COG0172 # Protein_GI_number: 15605830 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Aquifex aeolicus # 1 423 1 422 425 327 43.0 4e-89 MLTIKQITENTEAVLRGLEKKHFKNAKETIDQVIALNDKRRSTQNELDKNLAEVNSLSRT IGQLMKEGKKEEAETARARVAELKEGNKELDAVMTQAATDMQNILYTIPNIPYDSVPEGV GAEDNVVEKMGGMETELPKDALPHWELAKKYDLIDFDLGVKITGAGFPVYKGKGAQLQRA LINFFLDEARKSGYTEIMPPTVVNAASGYGTGQLPDKEGQMYHCEVDDLYLIPTAEVPVT NIYRDVILEEKQLPIMNCAYTQCFRREAGSYGKDVRGLNRLHEFSKVELVRIDKPEHSKQ SHQEMLDHVEGLLQKLELPYRILRLCGGDMSFTAALCFDFEVYSEAQKRWLEVSSVSNFD TYQANRLKCRYRNAEKKTELCHTLNGSALALPRIVAALLENNQTPEGIRIPKALVPYCGF DMID >gi|225935364|gb|ACGA01000028.1| GENE 344 444092 - 446392 2384 766 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 307 760 5 459 468 429 51.0 1e-120 MNKIISKERFSEKVFKFEIEAPLIAKSRKAGHFVIVRVGEKGERMPLTIAGSDLKKGTIT LVVQEVGLSSTRLCELNEGESITDVVGPLGQATHIEKFGTVICAGGGVGVAPMLPIVQAL KAAGNRVITVLAGRNKDLIILEKEMRESSDEVIIMTDDGSYGRKGLVTEGVEEVIKREKV DKCFAIGPAIMMKFVCLLTKKYEIPTDVSLNTIMVDGTGMCGACRITVGGKTKFVCVDGP EFDGHQVNFDEMLKRMGAFKNIEREEMHKLESECEATKEIDEKSRNAAWRQELRKSMKPK ERTAIPRVEMNELDAEYRSHSRKEEVNQGLTAEQAVTEAKRCLDCANPGCMEGCPVGIDI 
PRFIKNIERGEFLEAAKTLKETSALPAVCGRVCPQEKQCESKCIHLKMNEKPVAIGYLER FAADYERESGQISVPVIAEKNGIKIAVIGSGPAGLAFAGDMAKYGYDVTVFEALHEIGGV LKYGIPEFRLPNKIVDVEIDNLGKMGVNFIKDCIVGKTIGVEDLKAEGFKGIFVASGAGL PNFMNIPGENSINIMSSNEYLTRVNLMDAASEDSDTPVAFGKNVAVIGGGNTAMDSVRTA KRLGAERAIIIYRRSEEEMPARIEEVKHAKEEGVEFLTLHNPIEYIADEQGCVKQVILQK MELGEPDASGRRSPVAIPGATETIDIDLAIVSVGVSPNPIVPSSIKGLELGRKGTIAVND NMESSIPMIYAGGDIVRGGATVILAMGDGRKAAAAMHEQLKTNADN >gi|225935364|gb|ACGA01000028.1| GENE 345 446538 - 446891 489 117 aa, chain - ## HITS:1 COG:NMA1492 KEGG:ns NR:ns ## COG: NMA1492 COG0853 # Protein_GI_number: 15794392 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Neisseria meningitidis Z2491 # 1 115 1 114 127 130 60.0 4e-31 MMIEVLKSKIHCARVTEANLNYMGSITIDEDLLDAANMIPGEKVYIADNNNGERFETYII KGERGSGKICLNGAAARKVQPDDIVIIMSYALMDFEEAKSFKPAVIFPDPATNKVVK >gi|225935364|gb|ACGA01000028.1| GENE 346 446910 - 447755 723 281 aa, chain - ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 279 1 279 281 234 42.0 2e-61 MKVVHTIKDLQAELAVLRAQGKKVGLVPTMGALHAGHASLVKRSVSENGVTVVSVFVNPT QFNDKNDLEKYPRTLDADCRLLEECGADFAFAPSVSEMYPEPDTRQFSYAPLDTVMEGAF RPGHFNGVCQIVSKLFDAVQPDRAYFGEKDFQQLAIIREMVRQLKYNLEIVGCSIVREED GLALSSRNKRLSAEERENALNISRTLFKSRNFAATHTVSETQKMVEDAIEAAPGLRMEYF EIVDGNTLQKISNWEDTSYVVGCITVFCGEVRLIDNIKYKE >gi|225935364|gb|ACGA01000028.1| GENE 347 447898 - 448731 655 277 aa, chain + ## HITS:1 COG:TM0895 KEGG:ns NR:ns ## COG: TM0895 COG0297 # Protein_GI_number: 15643657 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Thermotoga maritima # 6 177 2 169 486 83 26.0 4e-16 MTKANKVLFITQEITPYVSESEMANIGRNLPQAIQEKGREIRTFMPKWGNINERRNQLHE VIRLSGMNLIIDDTDHPLIIKVASIQSARMQVYFIDNDDYFQNRMQTTDENGIEYDDNDS RAVFYARGVLETVKKLRWCPDVIHCHGWMTALAPLYIKKAYKDEPSFRDAKVVFSVYEDD 
FKSTLSDDFATKLMLKGITKKDLGDLKEPVDYAALCKLAVDYSDGVIQNSEKVDESIIEY ARQSGKLVLDYQNPENYADACNEFYDQVWDATVNEEE >gi|225935364|gb|ACGA01000028.1| GENE 348 448753 - 450447 1254 564 aa, chain + ## HITS:1 COG:no KEGG:BT_4306 NR:ns ## KEGG: BT_4306 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 564 1 540 542 614 61.0 1e-174 MKAKYALIALLAITFWGCDDNTAGLGLGMFPGSDQNINGKLSTFDVTTESVHAGEIYAMT NVGYVGKFTDETFGTYQAGFLAELNCPSGMTFPGVYDGTALDEKKKATRVMVGDDNEDNK DVTFIRDDNNKIIGNIHTIELYLWYSSYFGDSLTACRLSVYELSENLDTEHAYYTNINPE NYYDHADPNSLLGTKAYTAVDLSVKDSIRNLSTYVPSVHVSFRDEAAKEIGKEIIKKANE LGVNLDNKEFRKIFKGIYVKSDYGDGTVLYINQAQMNVVYKCYAVDTITGVKLQKKVAEN GVYKDSTYYGYRTFATTREVIQANQLENDKAAIQNCINKSEWTYLKSPAGIFTQLTLPVS QIAEKLQGDTLNAVKLGIPIYNETSDKKFGMSTPKNVLLIRKKYKDSFFKGNQLSDGITS SLFTPTTTSFTQYTFNNITQMINDCLGDGEREKAEKSLPMTLEIINSLGEKEEKVVTTIE EWEEYSDWNKFILIPVLVTTDSSSSNSYYGSSSNVISIQHDLKPGYARLKGGTNATNASL PKEEQDKYKLKLEVVSTNFGTKSK >gi|225935364|gb|ACGA01000028.1| GENE 349 450746 - 452200 1323 484 aa, chain - ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 387 1 390 396 263 36.0 4e-70 MRTICLYFEIHQIIHLKRYRFFDIGNDHYYYDDYANETGMNEVAERSYIPALNTLIEMAK NSGGAFKVALSISGVALEQLEIHAPAVIDLLHQLNDTGCCEFLCEPYSHGLSSLANEDCF REEVLRQRDKMKQMFGKEPKVFRNSSLIYSDEIGGLVASMGFKGMLTEGAKHVLGWKSPH YVYHCNQAPSLKLLLRDFKLSDDISLRFSNSDWAEYPLFADKYINWIDVLPQEEQVINIF MELSSLGMAQPLSSNILEFLKALPECAKAKGITFSTPTEIVTKLKSVSQLDVAYPMSWVD EERDTSCWLGNVMQREAFNKLYSVAERVHLCDDRRIKQDWDYLQASNNFRFMTTKNNGMW LNRGIYDSPYDAFTNYMNILGDFIKRVDALYPADVDSEELNSLLTTIKNQGDEITELEKE VAKLQAKVEAAKKATVKKTADAKEPTVKEKAVAKSKPAAKKAEVKKAAAEPKKATGKAKK AAAK >gi|225935364|gb|ACGA01000028.1| GENE 350 452218 - 453483 1068 421 aa, chain - ## HITS:1 COG:Ta0340 KEGG:ns NR:ns ## COG: Ta0340 COG0438 # Protein_GI_number: 16081471 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma acidophilum # 1 415 20 388 388 219 31.0 6e-57 MKVLMFGWEFPPKIYGGLAVASYGITKGLSLQGDMETIFCMPKPSGEEEKFLKIIGMNQV PIVWRDVNYDYLKSRLLEMTPEEYYSFRDHIYADFSYMHVNDLGCMEFAGGYPGNLHEEI NNFSIIAGVVARQQEFDIIHAHDWLTYPAGVHAKMVSGKPLCIHVHATDFDRSRGKVNPT VYSIEKNGMDHADCIMCVSELTRRTVINEYHQDPRKVFAMHNAVYPLSQELLDIPRPDHS KEKVVTFLGRITMQKGPEYFVEAAALVLRRTRNIRFVMAGSGDMLNAMINLVAERGIADR FHFPGFMKGKQVYEVYKNSDVFVMPSVSEPFGIAPLEAMQCGTPSIISKQSGCGEILDKV IKTDYWDIHAMADAIHSLCTNPSLFEYLKEEGKKEVDGITWEKVGLRIRALYEAVLRNYG K >gi|225935364|gb|ACGA01000028.1| GENE 351 453500 - 455437 1499 645 aa, chain - ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 636 22 669 680 282 31.0 2e-75 MSYLRFDKTLMTNLEESLQREILRTNKAGAYHCTTIVDCNTRKYHGLLVIPVPNIDDENH VLLSSLDETVIQHGAEFNLGLHKYQGNNFSPNGHKYIREFDCEHIPATTYRVGGVILRKE KIFVHHENRILIRYTLLDAHSATTLRFRPFLAFRSVREYTHENAQASREYQLVENGIKTC MYPGYPELYMQLNKKCEFHFMPDWYRGIEYPKEQERGYDFNEDLYVPGYFEVDIKKGESI VFSAGTSEVTPRRLKQTFEAEVLDRTPRDSFYHCLKNSAHQFHNQQEDEHYILAGYPWFK CRARDMFIALPGLTLALDEVDQFEDVMKTAEKAIRNFINEEPVGYKIYEMEHPDVLLWAV WALQQYAKETSREQCRQKYGELLKDIMEFIRQRKHENLFLHDNGLLFANGTDKAITWMNS TVNGHPVIPRTGYIVEFNALWYNALRFIADLVREGGDVYLADELDAQAEVTGKSFVEVFR NEYGYLLDYVDGNMMDWSVRPNMIFTVAFDYSPLDRAQKKQVLDIVTKELLTPKGIRSLS PKSGGYNPNYVGPQIQRDYAYHQGTAWPWLMGFYLEAYLRIYKMSGLSFVERQLISYDDE MTSHCVGSIPELFDGNPPFKGRGAVSFAMNVAEILRVLKLLSKYY >gi|225935364|gb|ACGA01000028.1| GENE 352 455652 - 456308 412 218 aa, chain + ## HITS:1 COG:Rv1337 KEGG:ns NR:ns ## COG: Rv1337 COG0705 # Protein_GI_number: 15608477 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Mycobacterium tuberculosis H37Rv # 7 190 35 223 240 92 33.0 7e-19 MKQDIQRIMLAAAKPLFLIFILYILKILEVGMDWDFSRLGVYPMEKRGLVGILTHPLIHS 
GFSHLLTNTIPLFFLSWCLFYFYRGIADKIFILIWLGSGLLTFLIGKPGWHIGASGLIYG IAFFLFFSGILRKYIPLIAISLLVTFLYGGIIWHMFPYFSPTNMSWEGHLSGGIMGTLCA FVFVNHGPQRPEPFADEMEEEENTEEETAAEEEEKETL >gi|225935364|gb|ACGA01000028.1| GENE 353 456437 - 457036 596 199 aa, chain - ## HITS:1 COG:aq_540 KEGG:ns NR:ns ## COG: aq_540 COG2095 # Protein_GI_number: 15606002 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Aquifex aeolicus # 11 194 14 213 214 76 31.0 2e-14 MFAGFNWQQMISAFIVLFAVIDIIGSIPIIINLKEKGKDVNAMKATVISFALLIGFFYAG DMMLKLFHVDIESFAVAGAFVIFLMSLEMILDIEIFKNQGPIKEATLVPLVFPLLAGAGA FTTLLSLRAEYASVNIIIALVLNMIWVYFVVSMTGRVERFLGKGGIYIIRKFFGIILLAI SVRLFTANITLLIEALHQS >gi|225935364|gb|ACGA01000028.1| GENE 354 457174 - 457842 586 222 aa, chain + ## HITS:1 COG:no KEGG:BT_4300 NR:ns ## KEGG: BT_4300 # Name: not_defined # Def: Crp family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 220 220 385 86.0 1e-106 METMFDTLLQLPLFQGLCHEDFTSILDKVKLHFIKHKVGETIIESGSPCKQLCFLLKGEV SIVTSSKENIYTVIEQMEAPYLLEPQSLFGMNTNYTSAYVAHTEAHTVSISKAFVLSDLF KYEIFRLNYMNIVSNRAQNLYSRLWEEPTQDLREKIIRFFLLHCEKAQGEKIFKVKMDDL ARYLDDTRLNTSKALNELQDKGLLELRRKEILIPDAQKLVSE Prediction of potential genes in microbial genomes Time: Fri May 13 07:55:19 2011 Seq name: gi|225935363|gb|ACGA01000029.1| Bacteroides sp. D2 cont1.29, whole genome shotgun sequence Length of sequence - 42436 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 17, operones - 10 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 31 - 1554 1379 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 1606 - 1665 5.2 - Term 1622 - 1673 13.2 2 2 Tu 1 . - CDS 1686 - 2132 466 ## COG1970 Large-conductance mechanosensitive channel - Prom 2160 - 2219 7.3 + Prom 2156 - 2215 6.8 3 3 Tu 1 . 
+ CDS 2275 - 3285 1128 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 3302 - 3353 10.4 + Prom 3330 - 3389 6.1 4 4 Op 1 . + CDS 3458 - 5518 1811 ## COG0339 Zn-dependent oligopeptidases 5 4 Op 2 . + CDS 5550 - 6065 472 ## BT_4261 hypothetical protein 6 4 Op 3 . + CDS 6074 - 6523 423 ## COG2131 Deoxycytidylate deaminase 7 4 Op 4 . + CDS 6554 - 8302 1720 ## COG0793 Periplasmic protease 8 4 Op 5 . + CDS 8280 - 8807 156 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 9 4 Op 6 . + CDS 8752 - 9543 460 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Term 9266 - 9292 -1.0 10 5 Op 1 . - CDS 9530 - 9817 279 ## BT_4256 hypothetical protein 11 5 Op 2 . - CDS 9804 - 10922 1015 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) - Prom 11016 - 11075 6.0 + Prom 10952 - 11011 5.0 12 6 Op 1 . + CDS 11083 - 11766 892 ## BT_4254 hypothetical protein + Prom 11831 - 11890 4.3 13 6 Op 2 . + CDS 11923 - 12417 468 ## COG0054 Riboflavin synthase beta-chain + Term 12498 - 12568 30.1 + TRNA 12468 - 12553 63.9 # Tyr GTA 0 0 + TRNA 12559 - 12631 68.9 # Gly TCC 0 0 + Prom 12478 - 12537 80.4 14 7 Op 1 . + CDS 12695 - 13954 1000 ## COG0673 Predicted dehydrogenases and related proteins + Term 14039 - 14074 -1.0 + Prom 13994 - 14053 3.7 15 7 Op 2 . + CDS 14261 - 15670 1301 ## COG0673 Predicted dehydrogenases and related proteins + Term 15710 - 15745 2.6 16 8 Op 1 . + CDS 15785 - 16675 728 ## COG1284 Uncharacterized conserved protein 17 8 Op 2 . + CDS 16748 - 20089 3186 ## COG3250 Beta-galactosidase/beta-glucuronidase 18 8 Op 3 . + CDS 20120 - 21208 1078 ## BT_4240 hypothetical protein + Term 21227 - 21275 5.0 - Term 21275 - 21317 4.1 19 9 Op 1 . - CDS 21371 - 22036 549 ## BT_4239 hypothetical protein 20 9 Op 2 . - CDS 22081 - 22983 899 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 21 9 Op 3 . 
- CDS 23013 - 23816 503 ## COG0101 Pseudouridylate synthase - Prom 23852 - 23911 3.0 + Prom 23754 - 23813 5.9 22 10 Op 1 . + CDS 23946 - 24443 373 ## BT_4221 hypothetical protein 23 10 Op 2 . + CDS 24464 - 26116 2044 ## COG2268 Uncharacterized protein conserved in bacteria + Term 26135 - 26192 13.6 + Prom 26182 - 26241 7.1 24 11 Tu 1 . + CDS 26279 - 27409 587 ## COG1672 Predicted ATPase (AAA+ superfamily) + Term 27495 - 27548 11.1 - Term 27697 - 27764 3.1 25 12 Tu 1 . - CDS 27794 - 28231 314 ## BF3357 hypothetical protein - Prom 28308 - 28367 7.2 + Prom 28272 - 28331 6.8 26 13 Tu 1 . + CDS 28426 - 29286 416 ## BVU_1438 hypothetical protein + Prom 29355 - 29414 4.3 27 14 Tu 1 . + CDS 29458 - 30048 377 ## BT_4219 hypothetical protein + Prom 30069 - 30128 2.8 28 15 Op 1 . + CDS 30150 - 31166 1049 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 29 15 Op 2 . + CDS 31238 - 32182 1050 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase + Term 32229 - 32279 5.5 + Prom 32189 - 32248 5.2 30 16 Op 1 . + CDS 32345 - 33082 557 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 31 16 Op 2 . + CDS 33100 - 34128 884 ## BF3771 hypothetical protein 32 16 Op 3 . + CDS 34135 - 34881 673 ## COG0169 Shikimate 5-dehydrogenase 33 16 Op 4 . + CDS 34900 - 35850 631 ## COG1073 Hydrolases of the alpha/beta superfamily 34 16 Op 5 8/0.000 + CDS 35923 - 36840 650 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 35 16 Op 6 . + CDS 36888 - 37469 681 ## COG1704 Uncharacterized conserved protein + Term 37671 - 37706 1.1 - Term 37480 - 37544 5.0 36 17 Op 1 . - CDS 37600 - 39678 1867 ## BT_4122 hypothetical protein 37 17 Op 2 . 
- CDS 39690 - 42434 2382 ## BT_4121 hypothetical protein Predicted protein(s) >gi|225935363|gb|ACGA01000029.1| GENE 1 31 - 1554 1379 507 aa, chain - ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 193 507 1 318 318 418 62.0 1e-116 MQEKIIILDFGSQTTQLIGRRVRELDTYCEIVPYNKFPKEDPTIKGVILSGSPFSVYDKD AFKVDLSEIRGKYPILGICYGAQFMSYTNGGKVEPAGTREYGRAHLASFCKDNVLFKGVR DGSQVWMSHGDTITAIPDNFKKIASTDKVEIAAYQVEGEQVWGVQFHPEVFHSEDGTQIL KNFVVDVCGCKQDWSPASFIESTVAELKAQLGDDKVVLGLSGGVDSSVAAVLLNKAIGKN LTCIFVDHGMLRKNEFKNVMKDYECLGLNVIGVDASAKFFAELAGVAEPESKRKIIGKGF IDVFDVEAHKIKDVKWLAQGTIYPDCIESLSITGTVIKSHHNVGGLPEKMNLKLCEPLRL LFKDEVRRVGRELGMPEHLITRHPFPGPGLAVRILGDITPEKVRILQDADDIFIQGLRDW GLYDKVWQAGVILLPVQSVGVMGDERTYERAVALRAVTSTDAMTADWAHLPYEFMGKVSN DIINKVKGVNRVTYDISSKPPATIEWE >gi|225935363|gb|ACGA01000029.1| GENE 2 1686 - 2132 466 148 aa, chain - ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 5 148 2 134 136 151 56.0 3e-37 MGKSTFLQDFKAFAMKGNVIDMAVGVVIGGAFGKIVSSLVANVIMPPIGLLVGGVNFTDL KWVMKAAEIGADGKEIAPAVTLDYGQFLQATFDFLIIAFAIFLFIRLITKLTTKKQAEVP ATPPAPPAPTKEEVLLTEIRDLLKEKNS >gi|225935363|gb|ACGA01000029.1| GENE 3 2275 - 3285 1128 336 aa, chain + ## HITS:1 COG:VC2000 KEGG:ns NR:ns ## COG: VC2000 COG0057 # Protein_GI_number: 15642002 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Vibrio cholerae # 2 332 3 330 331 494 77.0 1e-139 MIKVGINGFGRIGRFVFRAAMERNDIQIVGINDLCPVDYLAYMLKYDTMHGQFNGTIEAD VENSKLIVNGQAIRITAERNPADLKWNEVGAEYVVESTGLFLSKDKAQAHIEAGAKYVVM SAPSKDDTPMFVCGVNEKTYVKGTQFVSNASCTTNCLAPIAKVLNDKFGILDGLMTTVHS 
TTATQKTVDGPSMKDWRGGRAASGNIIPSSTGAAKAVGKVIPALNGKLTGMSMRVPTLDV SVVDLTVNLAKPATYAEICAAMKEASEGELKGILGYTEDAVVSSDFLGDARTSIFDAKAG IALTDTFVKVVSWYDNEIGYSNKVLDLIAHMASVNA >gi|225935363|gb|ACGA01000029.1| GENE 4 3458 - 5518 1811 686 aa, chain + ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 9 685 36 716 716 467 38.0 1e-131 MNNITNAQNPFYGQYHTPHETVPFDRIETEHYEPAILEGIKLQNTEIEAIIQNPEKADFT NTIEAFEESGELLDKVVAVFGNMLSAETNDDLQELAQKIMPLLSEHSNNITLNEKLFARV KEVYNQKETLQLTQEQKQLLENAYNSFIRHGANLEGEAREEYRRLTTELSKLTLTFSENN LKETNAYQMLLTKKESLAGLPEIIIEAAAETAKNEEKEGWAFTLHAPSYIPFMTYSDNRD LRQKLYMAYNTKCTHDNEFNNIDIVKKIANTRMKIAQLLGYKDYAEYTLKKRMAENSKSV YKLLNQLLEAYTPTAQQEYKEIQELARKEQGADFVVMPWDWSYYSNKLKDKKFNINEEML RPYFELEQVKKGVFGLAEKLYGITFRKNTEIPVYHKEVEAFEVFDKDGKFLAVLYTDFHP RPGKRAGAWMTSYKDQWIDKKTGKNSRPHVSVVMNFTKPTENKPALLTFNEVETFLHEFG HSLHGMFANSTYRSLSGTNVYWDFVELPSQIMENFAIEKDFLNTFARHYQTGEVLPDELI KRLVDASNFNVAYACLRQISFGLLDMAWYTRNTPFEGDVKAYEQEAWKDAQILPVVQEAC MSTQFSHIFVGGYAAGYYSYKWAEVLDADAFSLFKQKGIFNQEVADSFRNNILSKGGTEH PMILYKRFRGQEPSIDALLIRNGIKK >gi|225935363|gb|ACGA01000029.1| GENE 5 5550 - 6065 472 171 aa, chain + ## HITS:1 COG:no KEGG:BT_4261 NR:ns ## KEGG: BT_4261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 168 17 168 169 159 56.0 3e-38 MKRSIFQIVGLLLLLPLFSGCNDSDDVAAIFTGKTWKLNYITVDGGHEMFGFWENEEQEK ASIKELNKNGTYNIVFDGTVDGDVINGNIKGTVIATSTFEGKWNANAKNNSFKATVTTAG SYGDDKLAKNFIEGLNAATSYEGDSNNLYLLYKPASGKQTFRMVFRVVSSK >gi|225935363|gb|ACGA01000029.1| GENE 6 6074 - 6523 423 149 aa, chain + ## HITS:1 COG:AF1764 KEGG:ns NR:ns ## COG: AF1764 COG2131 # Protein_GI_number: 11499353 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Archaeoglobus fulgidus # 6 149 2 156 157 117 42.0 8e-27 MDTEKKQLELDKRYIRMASIWAENSYCQRRKVGALIVKDKMIISDGYNGTPSGFENVCED 
ENNLTKPYVLHAEANAITKIARSNNSSDGATMYVTASPCIECAKLIIQAGIKRVVYSEHY RLEDGIELLKRAGIEVIYTELDDNSSPNK >gi|225935363|gb|ACGA01000029.1| GENE 7 6554 - 8302 1720 582 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 51 358 43 346 408 211 40.0 2e-54 MSTKNSSRFTPVIIAVSVVVGILIGTFYAKHFAGNRLGIINGSSNKLNALLRIVDDQYVD TVNMADLVEKAMPQILAELDPHSTYIPAQNLEEVNSELEGSFSGIGIQFTIQNDTIHVNA VVQGGPSEKIGLMAGDRIVTVDDSLFVGKKVTNERAMRTLKGPKGSQVKLGIKRTGEKDL LHFNITRGDIPQNTVDAAYMVNDDIGYVKVSKFGRTSHVELLNALAQLNHKKCKGLIIDL RGNTGGYMEAAIRMVNEFLPEGKLIVYTQGRKYPRAEEFANGTGSCQKMPLVVLIDEGSA SASEIFTGAIQDNDRGTVVGRRSFGKGLVQQPIDFSDGSAIRLTIARYYTPSGRCIQRPY ESGKDRNYELDLYTRYEHGEFFSRDSIKQNESERYNTSLGRTVYGGGGIMPDIFVPQDTT GVTSYLSTVINRGLTIQFTFQYTDNNRKKLSQYETEEELLNYLRHQGLVEQFVRFADSKG VKRRNILIQKSYKLLEKNLFGNIIYNMLGLEAYLQYFNKTDATVIKGIEILEKGEAFPKA PVAVEEEVTKDKKDGKKKRTAQAYSITEDPTRGFNYAKAAIS >gi|225935363|gb|ACGA01000029.1| GENE 8 8280 - 8807 156 175 aa, chain + ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 2 156 22 174 186 107 36.0 2e-23 MRKLQSANILAALEAHPAFRAANTVLLYHSLNDEVDTHAFIQKWSSEKRILLPVVVGDDL ELRIYTGPEDMSISGIYGIAEPTGEIFTDYAAIEFIVVPGVAFDAKGNRLGRGKGYYDRL LPRIPSAYKAGICFPFQLVEEVPAESFDIRMDIIITINEDELSHPHHPLPSCDRE >gi|225935363|gb|ACGA01000029.1| GENE 9 8752 - 9543 460 263 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 6 254 8 251 260 92 28.0 1e-18 MKTNYHTHTTRCHHATGSDEEFVLSAIKGGYQELGFSDHTPWKYHTDYISDIRMLPEELP 
GYVESIRSLQEKYKNQISIKIGLECEYFPEYIHWLKGIIKEYKLDYIIFGNHHFHTDEKF PYFGRNTDSVDMLELYEESAIEGMESGLFAYLAHPDLFMRSYPEFDRHCKLVSRHICRTA VRLNLPLEYNIGYEEYNDIHGITTIPHPDFWKIAAHEGCTAIIGVDAHNNQYLENPFYYN RATKTLRKLGIKVIDRISFFNEK >gi|225935363|gb|ACGA01000029.1| GENE 10 9530 - 9817 279 95 aa, chain - ## HITS:1 COG:no KEGG:BT_4256 NR:ns ## KEGG: BT_4256 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 162 95.0 4e-39 MKRNDAEQIGKLIQQFLRQESLESPLNEQRLLDAWPQILGPAAAYTSNLYIRNQTLYVHL TSAALRQELMMGREVLVRTLNQRVGATVITNIIFR >gi|225935363|gb|ACGA01000029.1| GENE 11 9804 - 10922 1015 372 aa, chain - ## HITS:1 COG:BH0004 KEGG:ns NR:ns ## COG: BH0004 COG1195 # Protein_GI_number: 15612567 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Bacillus halodurans # 1 364 1 369 371 176 31.0 5e-44 MILKRISILNYKNLEEVELGFSAKLNCFFGLNGMGKTNLLDAVYFLSFCKSSGNPIDSQN IRHEQDFFVIQGFYEAEDGTPEEIYCGMKRRSKKQFKRNKKEYSRFSDHIGFLPLVMVSP ADSELIAGGSEERRRFMDVVISQYDKEYLEALIRYNKVLAQRNTLLKSEFPVEEELFLVW EEMMAQAGAIVFQKREAFIREFIPIFQSFYSFISQDKEVVGLSYESHARDASLLEVLKQS RERDKIMGFSLRGIHKDELNMLLGEFPIKKEGSQGQNKTYLVALKLAQFDFLKRTGRTVP LLLLDDIFDKLDASRVEQIVKLVAGDNFGQIFITDTNRGHLDRILHKVGSDYKIFRVEEG TIQETEADNEAQ >gi|225935363|gb|ACGA01000029.1| GENE 12 11083 - 11766 892 227 aa, chain + ## HITS:1 COG:no KEGG:BT_4254 NR:ns ## KEGG: BT_4254 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 377 95.0 1e-103 MAEQKNQNEHLNVEDALTQSEAFLIKYKNAIIGGVVAVIIIVAGFIMYKNLYAEPREEKA QAALFKGQEYFEQDAFEQALNGDSIGYTGFLKVADDYSGTKAANLAKAYAGICYAQLGKY EEAVKMLDSFNGKDQMVAPAILGAAGNCYAQLGQLDKAASTLLSAADKADNNTLSPIFLI QAGEILVKQGKYDDAVNAYTKIKDKYFQSYQAMDIDKYIEQAKLMKK >gi|225935363|gb|ACGA01000029.1| GENE 13 11923 - 12417 468 164 aa, chain + ## HITS:1 COG:BH1557 KEGG:ns NR:ns ## COG: BH1557 COG0054 # Protein_GI_number: 15614120 # Func_class: H Coenzyme transport and metabolism # 
Function: Riboflavin synthase beta-chain # Organism: Bacillus halodurans # 19 164 11 156 156 139 50.0 3e-33 MATAYHNLSEYDFNSVPNAEAMKFGIVVSEWNFNITGALLKGAVDTLKKHGAKDENILVK TVPGSFELTFGANQMMENCDLDAIIAIGCVIKGDTPHFDYVCMGATQGITELNATGDIPV IYGLITTNTMEQAEDRAGGKLGNKGDECAITAIKMIDFVWSLNK >gi|225935363|gb|ACGA01000029.1| GENE 14 12695 - 13954 1000 419 aa, chain + ## HITS:1 COG:lin2266 KEGG:ns NR:ns ## COG: lin2266 COG0673 # Protein_GI_number: 16801330 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 20 285 2 248 358 68 24.0 3e-11 MKNSPVQSHVLRLAHPPIPTIRIGIIGLGNRGLLTLQRYLQIEGVEIKALCEIREGNLSK AQQLLKEADRSEATGYTGVEGWKKMCESDEVDLIFICTDWLMHTPMATFAMECNKHVAIE VPAAMSVEECWQLVDTAEKTRRHCIMLENCCYDPFALMTLNMAQRGLFGEIMHVEGAYIH DLRSMYFAEESEGGYHNHWGKRYSIEHTGNPYPTHGLGPACQILDIHRSDRMEYLVSMST HQAGMSEYARRLFGEYSPEAHQKYQLGDVNTTLIHTLKGKTIMLQYDISTPRPYSRLQTV CGTLGFAQKYPVPCIALDPNGDTPLEGELLEKMMARYKHPFSATIGEEAHRRGLPNEMNY VMDYRLIHCLRNGLPLDMDVYDAAEWSCITELSEKSVLNKSMAVEIPDFTRGAWKKYKY >gi|225935363|gb|ACGA01000029.1| GENE 15 14261 - 15670 1301 469 aa, chain + ## HITS:1 COG:lin2262 KEGG:ns NR:ns ## COG: lin2262 COG0673 # Protein_GI_number: 16801326 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 110 314 42 225 349 66 29.0 1e-10 MKKLFTVTAIGLALLTWHTSCTQQPKAPEAFTPIKVETPARPAGQEDVIQLVTPKIDTVR VGFIGLGMRGPGAVARWTHIPGTKIVALCDLLPERVEKSQEILKNAGLPEAASYSGSEEA WKQLCERDDIDLVYIATDWKHHAAMGVYAMEHGKHVAIEVPAAMTLDEIWQLINTSEKTR KHCMQLENCVYDFFELTSLNMAQQGVFGEVLHVEGSYIHNLEDFWPEYWNNWRMDYNHLH RGDVYATHGMGPACQVLNIHRGDRMKTLVSMDTKAVNGPAYIKKQTGEEVTDFQNGDQTS TLIRTENGKTMLIQHNVMTPRPYSRMYQIVGADGYASKYPIEEYCLRPSQVDSKDVPNHE NLNAHGSVPENVKKALMDKYKDPIHIELEETAKKVGGHGGMDFIMDYRLAYCLQNGLPLD MDVYDLAEWCCMAELTRLSIENNSAPVEVPDFTRGGWNKVQGYRHAFAK >gi|225935363|gb|ACGA01000029.1| GENE 16 15785 - 16675 728 296 aa, chain + ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # 
Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 13 294 1 281 283 150 32.0 2e-36 MKTAIPKPSKQSIIREARDYVMIAIGMILYGIGWTVFLLPNDITTGGVPGIASIVYWATG FPVQYTYFSINFFLLLLALKLLGLKFCIKTIFGVFTLTFFLSVIQKLTAGFGLLHDQPFM ACVIGASFCGGGIGVAFSANGSTGGTDIIAAIINKYRDITLGRVVLICDMIIISSSYFVL KDWEKVVYGFVTLYICSFVLDQVVNSARQSVQFFIISNKYEEIGRHINEYPHRGVTIINA TGFYTGREVKMMFVLAKKRESPIIFRLIKDIDPNAFVSQSAVIGVYGEGFDHIKVK >gi|225935363|gb|ACGA01000029.1| GENE 17 16748 - 20089 3186 1113 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 19 1077 5 966 1087 704 38.0 0 MGIFSLTTLSTMAEKAVKPYWQDVQVVEVNKEYPRTSFMTYNNRADALSGKFERSKYYRL LNGTWKFYFVDSYKKLPENITDPNTNTDSWDDIQVPGNWEVQGHGIAIYTNHGYEFKPRN PQPPTLPEANPVGVYRRDIDIPADWDGRDIYLHLAGAKSGVYVYINGQEVGYSEDSKNPA EFLINNYVKPGKNVLTVKIFRWSTGSYLECQDFWRMSGIERDVFLYSQPKAALKDFRIKS TLDDSYKNGIFGLNVDLRNHEKAATNLTLVYELLDAQGKVVATEEKTAYIPSNEVRTLSF DKNLTDVSTWTSEHPNLYKLLMTVKENGKVNEIIPFNVGFRRIEIKPIEQKAANGKPYVC LFINGQPLKLKGVNIHEHNPSTGHYVTEDLMRRDFELMKQHNLNSVRLCHYPQDRRFYEL CDEYGLYVYDEANVESHGMYYDLRKGGSLGNNPEWLKPHMDRTINMFERNKNYPSVTFWS LGNEAGNGYNFYQTYLWLKEADKDLMDRPVNYERAQWEWNSDMYVPQYPGADWLENIGKN GSDRPVAPSEYAHAMGNSTGNLWGQWQAIYKYPNLQGGYIWDWVDQGLLQKDKNGREYWA YGGDFGVDAPSDGNFLCNGLVNPDRGPHPAMAEVKYVHQNVGFEAVDAAAGIFKITNRFY FTNLKKYQIHYNVLANGKTIKGGKVSLDIAPQASKEFTVPVNGLKNQPGTEYFVNFSVTT TEPEPLIPTGYEIAYDQFQLPIQAEKSIYKANGPALKTTIQGDELIISSSKVNFVFNKNS GLVTSYKVDGTEYFKDGFGIQPNFWRAPNDNDYGNSTPKRLQIWKQSSKNFRITDATMTA EDKAVSLNVTYLLAAGNLYVVTYKIYPNGIVNVNAKFTSTDMQPTETEVSEATRMATFTP GSDAARKAASKLEVPRIGVRFRLPAQMNNVQYFGRGPEENYIDRNHGTLVGVYKTTADQM YFNYVRPQENGHHTDTRWIALSPNKGNGLVLVADSLIGFNALRNSIEDFDSEEALPHPYQ WNNFSPEEVANHDENAARNVLRRMHHVNDITPRDFVEVCVDMKQQGVGGYDSWGARPEPF HQIPANRDYQWGFTLVPVRSANQANEAAKYDYR >gi|225935363|gb|ACGA01000029.1| GENE 18 20120 - 21208 1078 
362 aa, chain + ## HITS:1 COG:no KEGG:BT_4240 NR:ns ## KEGG: BT_4240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 363 703 94.0 0 MKDLSSIVAKFKVQGTIEEIKPLGAGLINDTYKVNTKEADAPDYVLQRINHAIFQNVEML QSNISAVTNHIRKKLTEAGESDIDRKVLSFLETEEGKTYWFDGDSYWRVMVFIPRAKTYE TVNPEYSNYAGEAFGNFQAMLADIPETLGETIPDFHNMEFRLKQLREAVAKDAAGRVAEV KYYLDEIEKRADEMCKAERLYREGKLPKRVCHCDTKVNNMMFDEDGKVLCVIDLDTVMPS FIFSDYGDFLRTGANTGDEDDKDLDRVNFNMEIFKAFTKGYLKGAKSFLTQIEIENLPYA AALFPYMQCVRFLADYINGDTYYKIKYPEHNLVRTKAQFKLLQSVEANTPEMIAFINECL LN >gi|225935363|gb|ACGA01000029.1| GENE 19 21371 - 22036 549 221 aa, chain - ## HITS:1 COG:no KEGG:BT_4239 NR:ns ## KEGG: BT_4239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 330 77.0 2e-89 MKINKLVITIFFSAVGLFVLTALQAQEAKTLFVNMPDSLSPLLTKVNREDCIDFLESKMK AQVENRFGKKSEMTDLSKDYIRMQMSAQSTWQMKVLALNDSTNVICTVSTVCAPACDSSI HFYTDDWKPLTTSLFITLPLMDDFLNAPDSAGVYEFDEARRSADMLLMKADFNKENTELT LTLTTSDYMSKETAEKLKPFLRRPVVYHWKNGAFIKLRIEN >gi|225935363|gb|ACGA01000029.1| GENE 20 22081 - 22983 899 300 aa, chain - ## HITS:1 COG:CAC1984 KEGG:ns NR:ns ## COG: CAC1984 COG0697 # Protein_GI_number: 15895255 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 2 296 4 283 285 128 31.0 2e-29 MWLLLAFLSATLLGFYDVFKKKSLKDNAVLPVLFLNTFFSSLIFLPFILISVYQPDLLGG TIFNVPVAGWEQHKYIIIKSFIVLSSWIFGYFGMKHLPITIVGPINATRPVMVLVGAMLV FGERLNLYQWIGVMLAIASFFMLSRSGKKEGIDFKHNKWIFFIVLAAITGAISGLYDKYL MKSLNPMLVQSWYNVYQVFIMCPILLLLWWPKRKSTTPFRWDWTIILISIFLSAADFVYF YALSYDDSMISIVSMVRRGSVVVSFTFGALFFREKNLKSKAIDLILVLIGMIFLYLGSKS >gi|225935363|gb|ACGA01000029.1| GENE 21 23013 - 23816 503 267 aa, chain - ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, 
ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 1 243 1 244 244 159 37.0 4e-39 MQRYFIYLAYDGTNYHGWQIQPNGISVQECLMKALSTFLRREIEVIGAGRTDAGVHASLM VAHFDSDELLDTTSVTDKLNRLLPPDISIYRVCRVRPDAHVRFDATARTYKYYVTTSKYP FNRQYRWRLYNQLNYERMNEAAHILFEYNDFTSFSKLHTDVKTNICHITHAEWTQEEDAT WVFTIRADRFLRNMVRAIVGTLIEVGRGKLTVEGFRRIIEQQDRCKAGTSAPGQALFLVN VEYPESIFECDDQQSSITGEPSVADNL >gi|225935363|gb|ACGA01000029.1| GENE 22 23946 - 24443 373 165 aa, chain + ## HITS:1 COG:no KEGG:BT_4221 NR:ns ## KEGG: BT_4221 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 248 90.0 7e-65 MSSTIFLIIALVTTGIFVIQFVLSIFFGDIDADVDVDADISSVVSFKGLTHFGIGFGWYM YLAGNTEIQSYVIGILVGLFFVLAVWFLYKKAYQLQQVNHSEQTDQLVGRECTIYFKQSD SKYTVQTSRDGAMREVDVISESGKAYQTGDRTIITSYKDGTLFIQ >gi|225935363|gb|ACGA01000029.1| GENE 23 24464 - 26116 2044 550 aa, chain + ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 369 1 377 509 112 27.0 1e-24 MTQEMLIMAAILVAVILLTFIGILSRYRKCKSDEVLVVYGKTGGDKKSAKLYHGGAAFVW PIIQGYEFLSMKPMQIECKLTGALSAQNIRVDVPTTITVAISTDPEVMQNAAERMLGLTM DDKQNLITDVVYGQMRMVIADMTIEELNSDRDKFLAKVKDNIDTELRKFGLYLMNINISD IRDAANYIVNLGKEAESKALNEAQANIEEQEKLGAIKIANQIKERETKVAETRKDQDIAI AETKKQQEISVANADKDRISQVAIANAEKESQVAKAEAEKNIRIEQANTEKESRVAELNS DMEIKQAEAAKKAAIGRNDAQKEVALSNSELAVTQANADKQAGEAAAKSEAAVQTAREIA QKEVEEAKARKVESSLKAEKIVPAEIARQEAILQANAIAEKITREAEARAKATLAQAEAE AKAIQMKLEAEAEGKKRSLLAEAEGFEAMVRAAESNPAIAIQYKMVDQWKEIAGEQVKAF EHMNLGNITVFDGGNGGTSNFLNTLVKTVAPSLGVLDKLPIGETVKGIINPDSKTEEKPA TKAEEKKDKK >gi|225935363|gb|ACGA01000029.1| GENE 24 26279 - 27409 587 376 aa, chain + ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: 
Methanosarcina acetivorans str.C2A # 1 376 1 388 390 98 25.0 2e-20 MKPYNPFLVYGYNSPEYFCDREKETDKMISALQNGRNLTLISPRRMGKTGLIKNVFYRMK QEKNPNTAYFYMDIYPTRDLKAFIQLLAQNVLGELDTLSQNVLRQMTAFFKSCRPIISAD ERSGMPTVTLDFVPAHAEQTLKEVLDYIAASKKHCYLAIDEFQQITEYPEKGIEALLRSH IQFMPNVHFIFSGSKKHVMEEMFSSAKRPFYQSTQIIVLKEIPLENYYSFAHSFFAKEKR ELPLETFSYLYQLENGHTWYIQSILNRLHEKKINPIDNRLVDSCINDILDEQETIYQSNL TLLTNNQVDLLKAIATEGCIKSINANDFIKRHHLKTPSSVNVALKSLLNKELIYNTSEGY IVYDRFFGKWLKDTVI >gi|225935363|gb|ACGA01000029.1| GENE 25 27794 - 28231 314 145 aa, chain - ## HITS:1 COG:no KEGG:BF3357 NR:ns ## KEGG: BF3357 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 143 1 143 144 131 42.0 6e-30 MKTDFDYNSMPVSFAHCLNGHCLRADKCLRRQVTLRMPKERAAVTVINPEHITSDGGDCT YFIDEKPLLFARGMKHILDRVPLADANIIKRQMISYFGKTVYYRCCNKERLIKPKEQEYI QGLFRKRGVAGSPQFDEYIEYYDLG >gi|225935363|gb|ACGA01000029.1| GENE 26 28426 - 29286 416 286 aa, chain + ## HITS:1 COG:no KEGG:BVU_1438 NR:ns ## KEGG: BVU_1438 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 14 286 13 277 277 173 38.0 5e-42 MKTKHLLGALLSFLCTIFLFSCGNDDEKEIYPLSFEKEYYERPLLGATNIMIRGGNRDYT VTVERTEILSISVDLSSSTGMGSLLVYPKKKGETKVSVKDNITNETVDLKIKITDAYLAY SIKESNHPALSNGTAVYLINNEAKDCYFFHYNYSQHELSPTPIAKGTYDFSVKLENGSGN SSPTYPIPYLTLNYASDEQGNFTDASTPPTSHKLRFELFDGVTSVNAVVNLIQRYLKVDW EQLINKALTRSDYLIIPTLKTTIDNTDYTIIGTLDTHPEIPENILE >gi|225935363|gb|ACGA01000029.1| GENE 27 29458 - 30048 377 196 aa, chain + ## HITS:1 COG:no KEGG:BT_4219 NR:ns ## KEGG: BT_4219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 30 225 225 347 85.0 2e-94 MSLKLNKPHNIRGVVSYKRSFPDLNDAHLEVAKKIGISPLADREEAEAMKEKLTHITDNE FYAVDSLTHSIPYLVPRASALLDTIGSNFLDSLAAKGLNPNQIIITSVLRTENDVKRLRR RNGNASANSAHCFGATFDVSWKRFKKVEDKDGRPMQDVSADTLKLVLSEVLRDLRQAEKC YIKYELKQGCFHITAR >gi|225935363|gb|ACGA01000029.1| GENE 28 30150 - 31166 1049 338 aa, chain + ## HITS:1 COG:DR1988 KEGG:ns NR:ns ## COG: DR1988 
COG1702 # Protein_GI_number: 15806986 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Deinococcus radiodurans # 18 323 65 370 380 269 46.0 5e-72 MIEKLIVLEDIDPVIFYGVNNANIQLIKALYPKLRIVARGNVIKVLGDEEEMCAFEENIT KLEKYCAEYNSLKEEVIIDIIKGNAPQAEQTGNVIVFSVTGKPIIPRSENQLKLVEGFAK NDMVFAIGPAGSGKTYTAIALAVRALKNKEIKKIILSRPAVEAGEKLGFLPGDMKDKIDP YLQPLYDALQDMIPAAKLKEYMELNIIQIAPLAFMRGRTLNDAVVILDEAQNTTTQQIKM FLTRMGMNTKMIVTGDMTQIDLPASQTSGLVQALRILKGVKGISFVELNKKDIVRHKLVE RIVDAYEKFDKEAKAEREKRKNEQLVINGERPVKLAKD >gi|225935363|gb|ACGA01000029.1| GENE 29 31238 - 32182 1050 314 aa, chain + ## HITS:1 COG:CC3242 KEGG:ns NR:ns ## COG: CC3242 COG0152 # Protein_GI_number: 16127472 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Caulobacter vibrioides # 10 312 13 318 320 271 45.0 9e-73 MKALTKTDFNFPGQKSVYHGKVRDVYNINGEKLVMVATDRISAFDVVLPEGIPYKGQMLN QIAAKFLDATTDICPNWKMATPDPMVTVGVLCEGFPVEMIVRGYLCGSAWRTYKSGVREI CGVKLPDGMRENQKFPEPIVTPTTKAEMGLHDEDISKEEILKQGLATPEEYEILEKYTLA LFKRGTEIAAERGLILVDTKYEFGKHNGTIYLMDEIHTPDSSRYFYSEGYQERFEKGEPQ KQLSKEFVREWLMENGFQGKDGQKVPEMTPAIVQSISDRYIELFENITGEKFVKEDTSNI AERIFKNVETFLNS >gi|225935363|gb|ACGA01000029.1| GENE 30 32345 - 33082 557 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 24 245 1 221 221 219 47 3e-56 MDYPQQHIKPYDEEGKKTEQVERMFDNIAHAYDKLNHTLSLGIDRSWRKKAIAWLHPFQP QRMMDVATGTGDFAILACRKLQPAELIGTDISEGMMNVGREKVKKEGLSDKISFAREDCT SLSFADNDFDAITVAFGIRNFEDLDKGLSEMCRVLKPGGHLVILELTTPDRFPMKQLFSI YSKVVIPLLGKLLSKDNSAYHYLPDTIKVFPQGEVMKGVIARAGFSEVNFKRLTFGICTL YTATK >gi|225935363|gb|ACGA01000029.1| GENE 31 33100 - 34128 884 342 aa, chain + ## HITS:1 COG:no KEGG:BF3771 NR:ns ## KEGG: BF3771 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 30 172 29 168 172 70 32.0 1e-10 
MKTTKSIGLFLLCIFCCINFTSCDPANNGEDDLIWDFSPIVLYISVQDAQGNDLLNPLTK GSIANQGIKAIYKGETYEKDAPLNERTRAYMAYFTGLQTGVSKDGKYYLTFGEFNGDHTF DNEKVEIDWNDGKEPSVITFSSKLTWKSKKEPVFDRKFCLNGQEIDQKQGLVITRTPSQS EQKFDIVAIEYGIDVETDEIKEKIKADLESKSPYTNGESYSISIQEKNSGTYTLLNSDGF PITEKEFAIEEAEAHGMYGITTEIAKTCRLIPPDDQIYNHIKLKLGIDGEKSSNTFNIFI GRPYNFWIYEDLTEYYKDKYPDGKVKEIVRLLKSKPNNPTKQ >gi|225935363|gb|ACGA01000029.1| GENE 32 34135 - 34881 673 248 aa, chain + ## HITS:1 COG:MK0117 KEGG:ns NR:ns ## COG: MK0117 COG0169 # Protein_GI_number: 20093557 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Methanopyrus kandleri AV19 # 5 246 15 271 290 139 33.0 4e-33 MEKYGLIGYPLRHSFSIGYFNEKFKSEGINAEYVNFEIPSINNFMEVIEENPNLCGLNVT IPYKEQVIPFLDELDRDTAKIGAVNVIKIIRQQKGKVKLVGYNSDIIGFTQSIQPLLQPH HQKALILGTGGASKAVYHGLKNLGIESIFVSRTHKADDMLTYEELTPEIMAEYTIIVNCT PVGMFPKVDFCPNIPYELLTPNHLLYDLLYNPNVTLFMKKGEAQGAVVKNGLEMLLLQAF AAWEIWHK >gi|225935363|gb|ACGA01000029.1| GENE 33 34900 - 35850 631 316 aa, chain + ## HITS:1 COG:SPy1892 KEGG:ns NR:ns ## COG: SPy1892 COG1073 # Protein_GI_number: 15675706 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Streptococcus pyogenes M1 GAS # 11 313 12 305 308 215 39.0 7e-56 MRRKVVYSIIIIMLALTGCTIGGSFYMLNFSLTPNAKILSKDADSYPFMYRNYPFLRPWV DSLKQVDALKDTFIINPHGIQLHAYYVAAPQPTSKTAVIVHGHTDNAIRMFMIGYLYNRD LGYNILLPDLQHQGESEGPAIQMGWKDRWDVLQWMNIANEIFGDSTQMVVHGISMGGATT MMVSGEEQKPFVKCFVEDCGYTSVWDEFSHELKASFHLPPFPLMYTTSWLCEKKYGWNFK EASSLKQVAKSQLPMLFIHGDKDTYVPTWMAYSLYEAKPEPKELWIVPGAAHAVSYKENK QEYTDRVRAFVGRYIH >gi|225935363|gb|ACGA01000029.1| GENE 34 35923 - 36840 650 305 aa, chain + ## HITS:1 COG:TM0962 KEGG:ns NR:ns ## COG: TM0962 COG1512 # Protein_GI_number: 15643722 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Thermotoga maritima # 4 161 6 150 238 87 32.0 3e-17 MKSILTFILATFLLFPLQAQEKVYTVDNLPKVHLQNKMQYVCNPAGILSQAACDSIDSML 
YALEQQTGIETVVAVVPSIGEEDCFNFCHQLLNQWGVGKKGKDNGLVILLVTDQRCIQFY TGYGLEGVLPDAICKRIQTKYMIPYLKDGNWNAGMVAGLKATCQRLDGSMENDALSDSND GGSFDFVLAILCFIVIGGGLAFFSARRQSRCPKCGKHQLQRTGSAVVSRINGVKTEDVTY TCRNCGHTIIRRQQSYDSDYHHRGGGGGGPFIGGFGGGSLGGGGGFGGGSFGGGMGGGGG AGSRF >gi|225935363|gb|ACGA01000029.1| GENE 35 36888 - 37469 681 193 aa, chain + ## HITS:1 COG:PM0785 KEGG:ns NR:ns ## COG: PM0785 COG1704 # Protein_GI_number: 15602650 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 1 193 1 192 193 195 56.0 5e-50 MKKSIIIILAVVAILVIWAVSVYNGLVTMDENVSGQWANVETQYQRRADLIPNLVNTVKG YATHEKETLEGVVAARSQATQIKVDAADLTPEKLAQYQKAQGAVTSALGKLLAITENYPD LKANQNFLELQAQLEGTENRINVARKNFNDAAQAYNTNIRRFPKNIFAGMFGFDKKAYFE AEEGSEKAPKVEF >gi|225935363|gb|ACGA01000029.1| GENE 36 37600 - 39678 1867 692 aa, chain - ## HITS:1 COG:no KEGG:BT_4122 NR:ns ## KEGG: BT_4122 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 692 1 691 691 1244 91.0 0 MMKKRNFLYAGTLALGMFMSGCSDSFLDMNNYGAYDDFDSETKITWYLAGLYQNCFENYT SPTSQYLGLHASYAQDFNEFTDEMWGITSTSRIDPSTRYSTIDDIKTQTDASSGKSYDPL FASYFGKALGSSVTNNAYTRIRNCNILLRDIDASSVSQETKDKAKGQALFLRAMQLFDLV RMYGCVPIVTTVLNAEATDAGLPRASVTQCVEQIVKDLTEAATLLPDEWGTNDYGRLTRG GALAYKSRVLLFYASPIFNKGWNDAGNDRWQKALNATLNAMSGITASGLNGVTDAASWSK ILADDDNEHSNRETLVVRLLAKESNSSLGYKNNRWEKAIRLSSQGGSGGKGVPIELIDVF PMADGTLPDVAHRVEDGSLRFMENRDPRFYQTFAFNGLKWGHKSLTNDTVWAYRWRTTTS TTSGFAYSEGVNISSPVFVRKMSGLTTASDNNYEASGVHIYDYRYAELQLNLAECYAATN QIDLCKQAIGKLRARVGIPADNNYGLDTYVTDRASALAACLRERQVELAYEGKRYWDIWR WMLYDGGQGGTMELSSTNTCSFLGVTPLTEKYRTAKYVDVKDGYTPASKDVLADLRKNIF ADPESADFQNQLKKVADFWEANFQYGEPNTQPDKNNNNEWIKIGWNTNYYVLGLSKDILD NNSWLGQTKGWTDQNGASGTIDWQDNEILTVE >gi|225935363|gb|ACGA01000029.1| GENE 37 39690 - 42434 2382 914 aa, chain - ## HITS:1 COG:no KEGG:BT_4121 NR:ns ## KEGG: BT_4121 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 914 229 1142 1142 1713 94.0 0 
TVLRDASAAIYGSRAANGAILVKTKRGKKGVPLISYSGKFAVNDAVSHSKVLKGSDYGRF YNSLAIGSNKASGYDDFDVLYSDMELAEMDNLNYDWLDKAGWKSAFQQTHTLNVSGGSER ATYYAGATYFDQGANLGNQSYKRYTYRAGVDVRLTNDIKLSATVAGNEGKSDQIYTKGAR FKLYGMGGSTEKSDYSALHHMPNHMPWSVTLPNENGEDQEYWLGPIENTYNNPNFNRDYV TSWNYFALNNSGSFSKERTNSWNADISLTYEVPFVKGLSLRATYSSSHSSEATEQASFPY ELAYVNTRMASDQHLVYTIPNSSFTKAIFDKNSTLSFKDKQAERRQMNFYVNYDRTFGKH SISAMASVERYESFYDSRDIEYADLAHDISDTYLGVGGPSIVGQDGKSALTSDNTVTSKG ESGSLSYLGRVAYSYADRYMLQFIFRSDASTKFAPENYWGFFPGISAGWIMSEESWFKRS LPWFEFMKVRASWGRTGRDNIKMWKWKEQYKMDLKGMQFGAESGKPGTSLIPQASPNRNV KWDVSDKFNLGFDTRFFDGRLSAVFDFYYDINDNILNQYMASQPGIPVYAGGSYAEENFG RVDTYGGELSLTWRDKIGQVNYNIGMDFGLSGSRVREWVPSLRYNKYPCNSSWEEGMSTY LPVWGFKVWKNTSGGDGILRTQEDINNYWNYLESWTPEGGQTKYLDKTSKDDLRPGMVAY QDLGGEMVNGVQQGPNGQIVLEQDYGKLCEKNKTYNVSTRLGASWKGLAISASISTSWGG VRFIDRASMGGSKSTMIWAPDSFWGDMFDEVYNPDGKYPNLGVESLISSSALANSDLWMI STFRCYIRNLSVSYTLPKKWIAPLKMSAVRLNLTGNNLWDLYNPYPNHYRNMYDSSSTEY PTLRTWSLGVNVTF Prediction of potential genes in microbial genomes Time: Fri May 13 07:58:29 2011 Seq name: gi|225935362|gb|ACGA01000030.1| Bacteroides sp. D2 cont1.30, whole genome shotgun sequence Length of sequence - 327499 bp Number of predicted genes - 203, with homology - 200 Number of transcription units - 85, operones - 47 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 706 485 ## BT_4121 hypothetical protein 2 1 Op 2 . - CDS 742 - 2409 1044 ## BT_4120 hypothetical protein 3 1 Op 3 . - CDS 2443 - 4128 1331 ## COG3866 Pectate lyase - Prom 4310 - 4369 6.0 + Prom 4274 - 4333 6.5 4 2 Op 1 . + CDS 4512 - 5678 1074 ## COG0150 Phosphoribosylaminoimidazole (AIR) synthetase 5 2 Op 2 . + CDS 5682 - 6794 1114 ## COG0216 Protein chain release factor A 6 2 Op 3 . + CDS 6871 - 7695 933 ## COG0284 Orotidine-5'-phosphate decarboxylase 7 2 Op 4 . + CDS 7726 - 8955 841 ## COG1078 HD superfamily phosphohydrolases + Prom 8957 - 9016 1.8 8 3 Op 1 . 
+ CDS 9038 - 10078 877 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase 9 3 Op 2 . + CDS 10090 - 11475 1443 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 10 3 Op 3 . + CDS 11495 - 12262 878 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 11 3 Op 4 . + CDS 12284 - 12838 626 ## BT_4204 hypothetical protein + Term 12854 - 12912 14.0 + Prom 12874 - 12933 4.4 12 4 Tu 1 . + CDS 12954 - 13856 882 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase + Term 13876 - 13916 -0.7 - Term 13802 - 13839 5.5 13 5 Op 1 . - CDS 13863 - 15197 1109 ## BVU_0113 hypothetical protein 14 5 Op 2 . - CDS 15100 - 15978 428 ## BF0821 hypothetical protein 15 5 Op 3 . - CDS 16002 - 16337 387 ## BT_4198 hypothetical protein - Prom 16414 - 16473 4.7 16 6 Tu 1 . - CDS 16510 - 16812 219 ## BT_4196 hypothetical protein - Prom 16947 - 17006 4.3 + Prom 16875 - 16934 4.6 17 7 Op 1 . + CDS 16967 - 19177 1785 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 18 7 Op 2 . + CDS 19177 - 20025 346 ## COG0320 Lipoate synthase 19 7 Op 3 . + CDS 20089 - 21207 681 ## BT_4191 hypothetical protein - Term 20924 - 20957 -0.9 20 8 Tu 1 . - CDS 21173 - 21877 528 ## COG0313 Predicted methyltransferases 21 9 Op 1 . - CDS 21932 - 22939 894 ## BT_4189 hypothetical protein 22 9 Op 2 . - CDS 22946 - 23803 1075 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Prom 23968 - 24027 5.2 + Prom 23946 - 24005 7.1 23 10 Tu 1 . + CDS 24156 - 25775 1449 ## COG5434 Endopolygalacturonase 24 11 Tu 1 . - CDS 25780 - 25941 83 ## + Prom 25805 - 25864 3.7 25 12 Tu 1 . + CDS 25940 - 27166 1137 ## BT_4186 hypothetical protein + Term 27217 - 27262 8.1 + Prom 27228 - 27287 3.9 26 13 Tu 1 . + CDS 27314 - 28990 1234 ## COG3507 Beta-xylosidase + Term 28992 - 29029 -0.5 + Prom 29166 - 29225 2.7 27 14 Tu 1 . + CDS 29251 - 29913 640 ## COG0546 Predicted phosphatases + Prom 29942 - 30001 3.1 28 15 Op 1 . 
+ CDS 30042 - 30950 635 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 29 15 Op 2 . + CDS 31006 - 31182 185 ## BVU_0178 hypothetical protein - Term 31182 - 31241 15.2 30 16 Tu 1 . - CDS 31244 - 35530 2574 ## COG0642 Signal transduction histidine kinase - Prom 35723 - 35782 4.9 - Term 36169 - 36209 1.3 31 17 Op 1 . - CDS 36404 - 37834 1285 ## COG5434 Endopolygalacturonase 32 17 Op 2 . - CDS 37847 - 41110 2437 ## BVU_1870 hypothetical protein 33 17 Op 3 . - CDS 41126 - 43240 1729 ## BVU_1871 hypothetical protein - Prom 43266 - 43325 5.7 - Term 43272 - 43315 10.1 34 18 Op 1 . - CDS 43356 - 45035 1181 ## BT_4169 hypothetical protein 35 18 Op 2 . - CDS 45038 - 48181 2305 ## BT_4168 hypothetical protein 36 18 Op 3 . - CDS 48203 - 49870 1332 ## BT_4167 hypothetical protein 37 18 Op 4 . - CDS 49874 - 51586 1103 ## BT_4166 putative lipoprotein 38 18 Op 5 . - CDS 51604 - 53145 1153 ## BT_4165 hypothetical protein 39 18 Op 6 . - CDS 53159 - 56422 2391 ## BT_4164 hypothetical protein 40 18 Op 7 . - CDS 56443 - 58704 1384 ## BT_4163 hypothetical protein 41 18 Op 8 . - CDS 58740 - 59315 298 ## gi|260171680|ref|ZP_05758092.1| hypothetical protein BacD2_07433 42 18 Op 9 . - CDS 59345 - 61030 992 ## Fjoh_2628 fibronectin, type III domain-containing protein 43 18 Op 10 . - CDS 61108 - 63639 1839 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 63777 - 63836 4.2 - Term 63745 - 63780 1.3 44 19 Tu 1 . - CDS 63839 - 65149 687 ## COG5434 Endopolygalacturonase - Prom 65312 - 65371 6.0 - Term 65339 - 65381 7.0 45 20 Op 1 . - CDS 65441 - 65875 300 ## BT_4171 hypothetical protein 46 20 Op 2 . - CDS 65922 - 67466 706 ## BT_4183 pectate lyase L precursor - Prom 67682 - 67741 7.2 - Term 67805 - 67838 -0.2 47 21 Tu 1 . 
- CDS 67981 - 69444 1088 ## COG5434 Endopolygalacturonase - Prom 69519 - 69578 6.9 + Prom 69567 - 69626 5.1 48 22 Tu 1 . + CDS 69701 - 71566 1411 ## BVU_0152 polysaccharide lyase family protein 11, rhamnogalacturonan lyase + Term 71611 - 71655 4.1 + Prom 71603 - 71662 4.9 49 23 Tu 1 . + CDS 71704 - 76041 3243 ## COG0642 Signal transduction histidine kinase + Term 76074 - 76114 0.2 + Prom 76133 - 76192 4.5 50 24 Tu 1 . + CDS 76242 - 79205 2643 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 79224 - 79283 3.3 51 25 Tu 1 . + CDS 79337 - 80215 741 ## BVU_2332 hypothetical protein + Prom 80341 - 80400 3.1 52 26 Op 1 . + CDS 80423 - 83554 2551 ## Slin_6567 TonB-dependent receptor plug 53 26 Op 2 . + CDS 83574 - 85178 1679 ## Dfer_1583 RagB/SusD domain protein + Term 85219 - 85267 13.1 54 27 Tu 1 . - CDS 85562 - 86473 541 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 86681 - 86740 3.6 + Prom 86620 - 86679 1.5 55 28 Op 1 . + CDS 86710 - 87087 442 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 56 28 Op 2 . + CDS 87147 - 88316 1034 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 88353 - 88391 1.1 + Prom 88336 - 88395 3.2 57 29 Op 1 . + CDS 88416 - 89600 939 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 58 29 Op 2 . + CDS 89607 - 90632 550 ## BT_1114 hypothetical protein 59 29 Op 3 . + CDS 90634 - 90807 133 ## gi|260171698|ref|ZP_05758110.1| hypothetical protein BacD2_07523 + Term 90857 - 90899 6.1 - Term 90828 - 90900 9.6 60 30 Op 1 . - CDS 90917 - 91690 777 ## gi|260171699|ref|ZP_05758111.1| hypothetical protein BacD2_07528 61 30 Op 2 . - CDS 91758 - 93044 1285 ## COG2755 Lysophospholipase L1 and related esterases 62 30 Op 3 . - CDS 93060 - 94664 1190 ## BVU_2520 hypothetical protein 63 30 Op 4 . - CDS 94686 - 95576 738 ## BF3838 hypothetical protein 64 30 Op 5 . 
- CDS 95608 - 96558 1056 ## gi|260171703|ref|ZP_05758115.1| hypothetical protein BacD2_07548 - Prom 96657 - 96716 3.6 65 31 Op 1 . - CDS 96731 - 97867 1077 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 66 31 Op 2 . - CDS 97851 - 98843 979 ## BT_4179 polysaccharide deacetylase - Prom 98936 - 98995 4.8 67 32 Tu 1 . - CDS 99010 - 103293 3020 ## COG0642 Signal transduction histidine kinase - Prom 103341 - 103400 4.8 - Term 103363 - 103419 5.3 68 33 Op 1 . - CDS 103446 - 103760 334 ## COG3254 Uncharacterized conserved protein 69 33 Op 2 . - CDS 103779 - 106979 2852 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 107086 - 107145 6.5 + Prom 106986 - 107045 3.9 70 34 Tu 1 . + CDS 107188 - 108363 912 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins + Term 108376 - 108432 -0.5 + Prom 108366 - 108425 2.3 71 35 Tu 1 . + CDS 108494 - 110128 1618 ## COG2755 Lysophospholipase L1 and related esterases + Term 110205 - 110237 -0.8 - Term 110263 - 110325 7.4 72 36 Tu 1 . - CDS 110447 - 113818 2344 ## BVU_0159 hypothetical protein - Prom 113924 - 113983 10.2 - Term 114002 - 114042 -0.0 73 37 Op 1 . - CDS 114050 - 116161 1564 ## Ndas_0923 cellulose-binding family II 74 37 Op 2 . - CDS 116235 - 117962 1397 ## PRU_2229 putative lipoprotein 75 37 Op 3 . - CDS 117985 - 121032 2921 ## BT_2820 hypothetical protein 76 37 Op 4 . - CDS 121074 - 123092 1517 ## PRU_2227 putative lipoprotein 77 37 Op 5 . - CDS 123115 - 126309 3054 ## BT_2818 hypothetical protein - Term 126328 - 126357 -0.9 78 37 Op 6 . - CDS 126369 - 127940 1121 ## BVU_2520 hypothetical protein 79 37 Op 7 . - CDS 128001 - 129053 797 ## BF3847 hypothetical protein - Prom 129177 - 129236 7.8 - Term 129224 - 129265 0.9 80 38 Tu 1 . 
- CDS 129332 - 130510 382 ## BVU_0168 tyrosine type site-specific recombinase - Prom 130543 - 130602 2.2 + Prom 130714 - 130773 7.5 81 39 Op 1 . + CDS 130954 - 134085 2666 ## PRU_1591 putative receptor antigen RagA 82 39 Op 2 . + CDS 134110 - 136191 1689 ## BVU_0172 hypothetical protein - Term 136063 - 136099 -0.6 83 40 Op 1 . - CDS 136294 - 136980 372 ## BVU_1770 hypothetical protein - Prom 137007 - 137066 3.4 - Term 137022 - 137055 -0.5 84 40 Op 2 . - CDS 137069 - 138451 1226 ## COG5434 Endopolygalacturonase 85 40 Op 3 . - CDS 138511 - 140658 1831 ## COG1874 Beta-galactosidase 86 40 Op 4 . - CDS 140690 - 143047 1747 ## BVU_0180 glycoside hydrolase family protein - Prom 143242 - 143301 6.8 + Prom 143188 - 143247 4.3 87 41 Tu 1 . + CDS 143290 - 146166 2383 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 146210 - 146269 4.0 88 42 Op 1 . + CDS 146295 - 146489 201 ## BT_4150 putative rhamnogalacturonan acetylesterase 89 42 Op 2 . + CDS 146465 - 147523 880 ## COG2755 Lysophospholipase L1 and related esterases 90 42 Op 3 . + CDS 147572 - 149107 1188 ## COG5434 Endopolygalacturonase 91 42 Op 4 . + CDS 149058 - 149660 380 ## BT_4147 hypothetical protein 92 42 Op 5 . + CDS 149657 - 151060 1030 ## COG5434 Endopolygalacturonase 93 42 Op 6 . + CDS 151057 - 153822 2139 ## BT_4145 hypothetical protein + Term 153869 - 153932 -0.5 - Term 153973 - 154020 13.6 94 43 Op 1 . - CDS 154169 - 155341 1140 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Prom 155457 - 155516 3.1 - Term 155440 - 155482 7.3 95 43 Op 2 . - CDS 155527 - 156516 954 ## gi|260171734|ref|ZP_05758146.1| hypothetical protein BacD2_07703 - Prom 156697 - 156756 2.3 96 44 Tu 1 . + CDS 156819 - 156995 99 ## gi|260171735|ref|ZP_05758147.1| hypothetical protein BacD2_07708 + Prom 157004 - 157063 6.1 97 45 Op 1 . + CDS 157196 - 157981 854 ## COG0561 Predicted hydrolases of the HAD superfamily 98 45 Op 2 . 
+ CDS 157984 - 159978 1671 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 99 46 Op 1 . + CDS 160349 - 162877 2607 ## BT_4129 outer membrane assembly protein 100 46 Op 2 . + CDS 162897 - 163511 684 ## COG0009 Putative translation factor (SUA5) + Term 163532 - 163604 2.3 + Prom 163524 - 163583 2.8 101 47 Tu 1 . + CDS 163657 - 165783 1422 ## BF4291 hypothetical protein + Term 165829 - 165876 9.0 - Term 165815 - 165864 11.0 102 48 Op 1 . - CDS 165886 - 166698 886 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 103 48 Op 2 . - CDS 166747 - 167943 1183 ## COG0426 Uncharacterized flavoproteins 104 48 Op 3 . - CDS 168002 - 168979 828 ## COG1242 Predicted Fe-S oxidoreductase - Prom 169069 - 169128 5.9 + Prom 169241 - 169300 7.1 105 49 Tu 1 . + CDS 169456 - 173796 3156 ## COG0642 Signal transduction histidine kinase + Term 173889 - 173921 -0.2 106 50 Tu 1 . - CDS 173930 - 174697 204 ## COG0500 SAM-dependent methyltransferases - Prom 174738 - 174797 2.6 - Term 174734 - 174782 3.2 107 51 Tu 1 . - CDS 174921 - 176927 1328 ## COG3507 Beta-xylosidase - Prom 177078 - 177137 4.7 + Prom 177064 - 177123 4.5 108 52 Tu 1 . + CDS 177265 - 179856 2119 ## COG1472 Beta-glucosidase-related glycosidases + Term 180010 - 180063 12.1 - Term 180422 - 180482 14.2 109 53 Op 1 . - CDS 180579 - 182135 1041 ## COG5434 Endopolygalacturonase 110 53 Op 2 . - CDS 182179 - 184590 1889 ## COG4677 Pectin methylesterase 111 53 Op 3 . - CDS 184632 - 186488 1721 ## BF3847 hypothetical protein 112 53 Op 4 . - CDS 186544 - 188196 1523 ## gi|260171751|ref|ZP_05758163.1| Fibronectin type III domain protein 113 53 Op 5 . - CDS 188217 - 190118 1922 ## BVU_1119 hypothetical protein 114 53 Op 6 . - CDS 190144 - 193356 3316 ## BVU_1120 hypothetical protein 115 53 Op 7 . - CDS 193401 - 195788 2388 ## BT_4115 hypothetical protein 116 53 Op 8 . 
- CDS 195823 - 197628 1386 ## gi|237717967|ref|ZP_04548448.1| conserved hypothetical protein 117 53 Op 9 . - CDS 197641 - 198525 712 ## gi|260171756|ref|ZP_05758168.1| hypothetical protein BacD2_07813 - Prom 198553 - 198612 5.5 - Term 198823 - 198881 18.1 118 54 Op 1 . - CDS 198921 - 200450 1289 ## BT_4115 hypothetical protein - Prom 200495 - 200554 4.2 119 54 Op 2 . - CDS 200569 - 201387 442 ## BT_4118 two-component system response regulator 120 54 Op 3 . - CDS 201391 - 202185 519 ## BT_4117 hypothetical protein + Prom 202449 - 202508 1.6 121 55 Op 1 . + CDS 202541 - 205789 2988 ## BT_4114 hypothetical protein 122 55 Op 2 . + CDS 205815 - 207722 2102 ## Phep_0771 RagB/SusD domain protein 123 55 Op 3 . + CDS 207741 - 209615 1404 ## gi|260171762|ref|ZP_05758174.1| Fibronectin type III domain protein + Term 209632 - 209688 7.3 124 56 Tu 1 . + CDS 209700 - 211328 1155 ## COG4677 Pectin methylesterase + Term 211349 - 211410 4.5 + Prom 211332 - 211391 4.5 125 57 Op 1 . + CDS 211422 - 212999 1408 ## BT_4115 hypothetical protein 126 57 Op 2 . + CDS 213031 - 214692 1266 ## BT_4115 hypothetical protein 127 57 Op 3 . + CDS 214728 - 216224 1366 ## BT_4115 hypothetical protein + Term 216278 - 216323 7.4 - Term 216311 - 216342 -0.1 128 58 Tu 1 . - CDS 216369 - 220871 3665 ## COG0642 Signal transduction histidine kinase - Prom 220893 - 220952 3.4 129 59 Tu 1 . - CDS 221070 - 222809 1579 ## COG4677 Pectin methylesterase - Prom 222830 - 222889 3.8 - Term 222816 - 222866 1.5 130 60 Op 1 . - CDS 222898 - 223869 844 ## COG4677 Pectin methylesterase 131 60 Op 2 . - CDS 223879 - 225078 1060 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 132 61 Op 1 . + CDS 225380 - 226786 1674 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 133 61 Op 2 . + CDS 226811 - 227731 1106 ## COG3717 5-keto 4-deoxyuronate isomerase + Term 227771 - 227831 16.5 + Prom 227846 - 227905 7.2 134 62 Tu 1 . 
+ CDS 227998 - 229515 1542 ## COG0477 Permeases of the major facilitator superfamily + Term 229546 - 229591 13.2 - Term 229745 - 229774 1.4 135 63 Op 1 . - CDS 229840 - 233310 2437 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 136 63 Op 2 . - CDS 233261 - 234109 658 ## COG0805 Sec-independent protein secretion pathway component TatC 137 63 Op 3 . - CDS 234127 - 234348 316 ## BT_4102 putative sec-independent protein translocase + Prom 234389 - 234448 4.6 138 64 Op 1 . + CDS 234623 - 236788 856 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 139 64 Op 2 9/0.000 + CDS 236807 - 237835 353 ## COG3275 Putative regulator of cell autolysis 140 64 Op 3 . + CDS 237832 - 238527 424 ## COG3279 Response regulator of the LytR/AlgR family - Term 238316 - 238353 -0.5 141 65 Op 1 . - CDS 238529 - 241069 2135 ## COG0787 Alanine racemase 142 65 Op 2 . - CDS 241045 - 242064 384 ## BT_4100 hypothetical protein - Prom 242158 - 242217 9.5 + Prom 242143 - 242202 6.7 143 66 Op 1 . + CDS 242232 - 244136 1676 ## COG1154 Deoxyxylulose-5-phosphate synthase 144 66 Op 2 17/0.000 + CDS 244138 - 245478 1496 ## COG0569 K+ transport systems, NAD-binding component + Prom 245499 - 245558 3.5 145 66 Op 3 . + CDS 245578 - 247029 890 ## COG0168 Trk-type K+ transport systems, membrane components + Term 247191 - 247242 1.4 - Term 247245 - 247296 15.5 146 67 Op 1 . - CDS 247400 - 247966 455 ## BT_4096 hypothetical protein 147 67 Op 2 . - CDS 247857 - 248783 813 ## BT_4096 hypothetical protein - Prom 248820 - 248879 1.9 148 68 Op 1 . - CDS 248897 - 250444 907 ## COG3507 Beta-xylosidase 149 68 Op 2 . - CDS 250450 - 251583 975 ## COG2152 Predicted glycosylase 150 68 Op 3 . - CDS 251625 - 253733 2071 ## COG3537 Putative alpha-1,2-mannosidase 151 68 Op 4 . - CDS 253807 - 256089 1937 ## COG3537 Putative alpha-1,2-mannosidase 152 68 Op 5 . - CDS 255977 - 256213 81 ## 153 68 Op 6 . - CDS 256258 - 258948 1744 ## BT_4076 alpha-rhamnosidase 154 68 Op 7 . 
- CDS 258975 - 260354 773 ## BT_4075 hypothetical protein - Prom 260377 - 260436 4.0 - Term 260497 - 260539 7.0 155 69 Op 1 . - CDS 260548 - 261564 849 ## Cpin_4815 hypothetical protein 156 69 Op 2 . - CDS 261576 - 262043 340 ## gi|260171792|ref|ZP_05758204.1| hypothetical protein BacD2_08003 - Prom 262063 - 262122 1.7 157 70 Op 1 . - CDS 262124 - 263497 703 ## Bcav_2704 hypothetical protein 158 70 Op 2 . - CDS 263540 - 265051 944 ## BDI_0242 hypothetical protein 159 70 Op 3 . - CDS 265067 - 268210 1954 ## BDI_0241 hypothetical protein - Prom 268233 - 268292 9.3 - Term 268222 - 268276 -0.8 160 71 Op 1 . - CDS 268330 - 270669 1794 ## COG3537 Putative alpha-1,2-mannosidase 161 71 Op 2 . - CDS 270706 - 273243 1338 ## COG0383 Alpha-mannosidase 162 71 Op 3 . - CDS 273254 - 273484 237 ## gi|260171798|ref|ZP_05758210.1| hypothetical protein BacD2_08033 163 71 Op 4 . - CDS 273556 - 274080 446 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 274179 - 274238 4.9 - Term 274138 - 274173 4.3 164 72 Tu 1 . 
- CDS 274272 - 275903 1083 ## BT_4069 putative regulatory protein - Prom 275967 - 276026 7.0 + Prom 276417 - 276476 5.8 165 73 Op 1 30/0.000 + CDS 276635 - 276985 206 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 166 73 Op 2 9/0.000 + CDS 276976 - 277578 435 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B 167 73 Op 3 8/0.000 + CDS 277606 - 279198 1631 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 168 73 Op 4 31/0.000 + CDS 279221 - 280297 1072 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 169 73 Op 5 28/0.000 + CDS 280310 - 280798 477 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 170 73 Op 6 30/0.000 + CDS 280801 - 281313 481 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 171 73 Op 7 26/0.000 + CDS 281347 - 281655 371 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 172 73 Op 8 30/0.000 + CDS 281711 - 283642 1465 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 173 73 Op 9 22/0.000 + CDS 283656 - 285140 1309 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 174 73 Op 10 . + CDS 285178 - 286617 1213 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) - Term 286628 - 286694 20.3 175 74 Tu 1 . - CDS 286717 - 289527 2418 ## COG0642 Signal transduction histidine kinase - Prom 289594 - 289653 8.2 - Term 289628 - 289681 12.3 176 75 Op 1 . - CDS 289720 - 290928 1284 ## BT_4056 hypothetical protein 177 75 Op 2 5/0.000 - CDS 290972 - 293686 2719 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 293749 - 293808 3.4 178 75 Op 3 2/0.200 - CDS 294037 - 294933 755 ## COG2207 AraC-type DNA-binding domain-containing proteins 179 75 Op 4 . - CDS 294960 - 296300 1083 ## COG0668 Small-conductance mechanosensitive channel 180 75 Op 5 . 
- CDS 296338 - 297354 727 ## BT_4052 putative ABC transporter ATP-binding protein 181 75 Op 6 . - CDS 297332 - 298186 782 ## COG3950 Predicted ATP-binding protein involved in virulence - Prom 298214 - 298273 5.9 + Prom 298224 - 298283 5.5 182 76 Tu 1 . + CDS 298309 - 299229 830 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 299230 - 299288 4.5 - Term 299207 - 299278 12.1 183 77 Op 1 . - CDS 299316 - 300407 938 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family 184 77 Op 2 . - CDS 300419 - 301486 964 ## BT_4048 hypothetical protein - Prom 301537 - 301596 5.0 - Term 301609 - 301663 16.2 185 78 Tu 1 . - CDS 301723 - 307458 5021 ## COG2373 Large extracellular alpha-helical protein - Prom 307578 - 307637 6.8 - Term 307920 - 307972 1.3 186 79 Tu 1 . - CDS 308025 - 308195 98 ## gi|160886822|ref|ZP_02067825.1| hypothetical protein BACOVA_04836 - Term 308516 - 308564 6.5 187 80 Tu 1 . - CDS 308589 - 308681 74 ## - Term 308983 - 309026 10.1 188 81 Op 1 . - CDS 309079 - 310536 1469 ## COG2195 Di- and tripeptidases - Prom 310562 - 310621 3.0 189 81 Op 2 . - CDS 310658 - 311644 583 ## BT_4044 putative dolichol-P-glucose synthetase - Prom 311692 - 311751 4.7 + Prom 311559 - 311618 4.9 190 82 Op 1 . + CDS 311715 - 312515 821 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 191 82 Op 2 . + CDS 312581 - 313921 1429 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) + Prom 313930 - 313989 3.8 192 82 Op 3 . + CDS 314010 - 315860 2036 ## BT_4041 hypothetical protein + Term 315888 - 315940 13.4 - TRNA 316106 - 316176 54.1 # Gln CTG 0 0 + Prom 316189 - 316248 5.6 193 83 Op 1 . + CDS 316441 - 316800 313 ## COG0799 Uncharacterized homolog of plant Iojap protein 194 83 Op 2 . + CDS 316810 - 318960 1275 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 195 83 Op 3 . 
+ CDS 318977 - 319825 753 ## COG0575 CDP-diglyceride synthetase + Term 319867 - 319909 9.2 - Term 319850 - 319903 12.0 196 84 Op 1 . - CDS 319927 - 320709 735 ## BT_4005 hypothetical protein - Prom 320741 - 320800 6.3 - Term 320740 - 320782 4.7 197 84 Op 2 . - CDS 320803 - 321939 1145 ## COG0763 Lipid A disaccharide synthetase 198 84 Op 3 . - CDS 321983 - 322750 680 ## COG0496 Predicted acid phosphatase - Prom 322907 - 322966 6.3 + Prom 323025 - 323084 1.8 199 85 Op 1 25/0.000 + CDS 323107 - 323871 735 ## COG1192 ATPases involved in chromosome partitioning 200 85 Op 2 . + CDS 323883 - 324773 980 ## COG1475 Predicted transcriptional regulators 201 85 Op 3 . + CDS 324774 - 325661 641 ## BT_4000 hypothetical protein 202 85 Op 4 . + CDS 325696 - 326991 1138 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 203 85 Op 5 . + CDS 327064 - 327499 446 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases Predicted protein(s) >gi|225935362|gb|ACGA01000030.1| GENE 1 1 - 706 485 235 aa, chain - ## HITS:1 COG:no KEGG:BT_4121 NR:ns ## KEGG: BT_4121 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 1142 423 86.0 1e-117 MDKHVIFMKLISSLFVKKIGVLLCFIAFALSVNAQTRKVTGQILDESGQAIIGATIRLQD SSAGTVTDIDGHFSLNVPEGKKVVISYIGFVTQVITPKANTLHIVLQEDSQKLDEVVVVG YGSMKQKNITGSVSTISASELEDLPVSNLSEALQGMVNGLTVELGSSRPGTNANEVYIRQ NRTFTGISKDGGNSTPLIIIDDVIQLGTNGQPSMEQFNMLDPSEVESITVLRDAS >gi|225935362|gb|ACGA01000030.1| GENE 2 742 - 2409 1044 555 aa, chain - ## HITS:1 COG:no KEGG:BT_4120 NR:ns ## KEGG: BT_4120 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 555 1 555 555 909 81.0 0 MNLMKLMAVGLLLSCFSALLTGCSDNDDSSSKVLLRPVATMEVVQNKAYLSWKTVEGATE YVIDVYKVTDKGEELYKTEIVPASQSSCVIDLDWEENYKFKVKCTGDGRLSGYWETEVTG ILYRPLSIELKESRAIDTKALVAWIPNDTVVITALTAVPMGTEEVSTQTMKVYPVSAEEY 
LSGSKVIDGLSPETAYRVSLYSGEEQTSDTYQARVEVKTEATENLDADYGTANRVDLRNE AFDPDYFNKLDWNSLEEGTTFLLPAGKTFVVNSGEVVAEFAHSVNFVTPQTLEDYCTFSF ENAFRVVEGGAIDKVTFKRINLKASKSLDEATNNSLSGKQVICPETTNFLINKIDFSNCY IENFRAIVRSKSTGGNVEAISFQECTINAIGNQGIVSTDGKKGNYINNVSFDNCTITNIC GIADLRNSDSGKNISITNTTFCYAPMENSFLFRVDASIAVNIENCVFGGSMKINGNLPMF NELGSGGQDDYTGTYRFSPVNSFQSNDHSSSKGSLGLADSKMSTISLFTDPGNNNFKLNE LFSGCSSVGALKWRQ >gi|225935362|gb|ACGA01000030.1| GENE 3 2443 - 4128 1331 561 aa, chain - ## HITS:1 COG:TM0433 KEGG:ns NR:ns ## COG: TM0433 COG3866 # Protein_GI_number: 15643199 # Func_class: G Carbohydrate transport and metabolism # Function: Pectate lyase # Organism: Thermotoga maritima # 72 275 33 226 367 60 27.0 8e-09 MKNYAFIEKWKTRMRTERVFEFLKPRHVSVSNTMERCLVKSLFLIGCTLSLGACTDIKGF TNMELGPYAKDPEVLKAFPTAEGFGKNATGGRGGKVVIVTNTDDDGEGSFRWALKQCSEN EATTVVFAISGKIELKSDIRCKAKNFTIAGQTAPGGGICVIKNEVNFGGSENFIIRHMRF RVGDKDANGKDHNAACLRVENANNFIIDHCSFSWASEENTDFIDTHFSTVQWCISSEGLY YSVNKKGSRAYGGAWGGTSSTYHHNLFAHCNSRTPLMNGARGKDPGQDINVYMEYINNVN YNWGSQMATYGGMDESQDPEHHGWACNFVNNYYKPGPATTSRVKDLKFFRQSSARDSEKA PLRAVSKWYFSGNVMEGNSELTSDNWKGVYTDGSYPYSFDEMKAPSFIVASGRDNYEQYW FDWETYTLSEKYESAEQAFQSVIGQNGAGAFPRDKVDERIMKEVKNGSCTYTGAGNETSG SIPGIINSPDEAEGFGGLTYKTSGAITDADKDGMDDAWEKKVGLDPNNPEDRNRTTDVGY TALEVYLNSLVDESISYNFKK >gi|225935362|gb|ACGA01000030.1| GENE 4 4512 - 5678 1074 388 aa, chain + ## HITS:1 COG:MJ0203 KEGG:ns NR:ns ## COG: MJ0203 COG0150 # Protein_GI_number: 15668375 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole (AIR) synthetase # Organism: Methanococcus jannaschii # 47 369 49 323 350 114 30.0 2e-25 MSNQRYMMRGVSASKEDVHNAIKNIDKGIFPKAFCKIIPDILGGDPEYCNIMHADGAGTK SSLAYMYWKETGDLSVWKGIAQDALIMNIDDLLCVGAVDNILVSSTIGRNKLLIPGEVIS AIINGTDELLAELREMGVGVYATGGETADVGDLVRTIIVDSTVTCRMKRSDVIDNANIRP GDVIVGLASYGQATYEKEYNGGMGSNGLTSARHDVFGKYLAEKYPESYDAAVPEELVYSG KLKLTDSVEDSPIDAGKLVLSPTRTYAPVVKKLLDALRPEIHGMVHCSGGAQTKVLHFVE NVRVVKDNLFPVPPLFKTIQEQSGTDWAEMYKVFNMGHRLEVYLSPEHAEEVIAISESFG IPAQIVGRVEACEQTELIIKSEFGEFRY 
>gi|225935362|gb|ACGA01000030.1| GENE 5 5682 - 6794 1114 370 aa, chain + ## HITS:1 COG:VC2179 KEGG:ns NR:ns ## COG: VC2179 COG0216 # Protein_GI_number: 15642178 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Vibrio cholerae # 5 365 3 355 362 332 47.0 1e-90 MADNSTILEKLDGLVARFEEISTLITDPAVIADQKRYVKLTKEYKELDDLMKARKEYIQL LGNIEEAKNILANESDAEMREMAKEEMDNSQERLPVLEEEIKLMLVPADPQDSKNAILEI RGGAGGDEAAIFAGDLFRMYAKFCETKGWKMEVSNANEGTAGGFKEVVCSVTGDNVYGIL KYESGVHRVQRVPATETQGRVHTSAASVAVLPEAEEFDVVINEGEIKWDTFRSGGAGGQN VNKVESGVRLRYIWKNPNTGIAEEILIECTETRDQPKNKERALARLRTFIYDKEHQKYID DIASKRKTMVSTGDRSAKIRTYNYPQGRITDHRINYTIYNLAAFMDGDIQDCIDHLIVAE NAERLKESEL >gi|225935362|gb|ACGA01000030.1| GENE 6 6871 - 7695 933 274 aa, chain + ## HITS:1 COG:RSc2773 KEGG:ns NR:ns ## COG: RSc2773 COG0284 # Protein_GI_number: 17547492 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Ralstonia solanacearum # 4 268 22 285 288 202 40.0 8e-52 MDKQQLFENIKRKKSFLCVGLDTDIKKIPEHLLKEEDPIFAFNKAIIDATADLCIAYKPN LAFYESMGVKGWIAFEKTVKFIKDNYPDQFIIADAKRGDIGNTSAMYARTFFEELNIDSV TVAPYMGEDSVTPFLSYEGKWVILLALTSNKGSHDFQLTEDVNGERLFEKVLRKSQEWAS DDRMMYVVGATQGRAFEDIRKIVPNHFLLVPGVGAQGGSLEEVCKYGMNSTCGLIVNSSR GIIYVDKTEKFAEAARTAAQEVQAQMAEQLKAIL >gi|225935362|gb|ACGA01000030.1| GENE 7 7726 - 8955 841 409 aa, chain + ## HITS:1 COG:BS_ywfO KEGG:ns NR:ns ## COG: BS_ywfO COG1078 # Protein_GI_number: 16080812 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus subtilis # 4 406 10 410 433 191 28.0 2e-48 MPYERKIINDPVFGFINIPKGLLYDIVRHPLLQRLTRIKQVGLSSVVYPGAQHTRFQHSL GAFHLMSEAITQLASKGNFIFDSEAEAVQAAILLHDIGHGPFSHVLEDTIVKGVSHEEIS LMLMERMNKEMNGQLSLAIQIFKDEYPKRFLHQLVSGQLDMDRLDYLRRDSFYTGVTEGN IGSARIIKMLDVADDRLVIESKGIYSIENFLTARRLMYWQVYLHKTSVAYEKMLISTLLR AKELASQGVELFASPALRFFLYNDINPTEFYNNPDCLENFIQLDDNDIWTALKVWSTHAD 
KVLSTLSTGMINRNIFKVEISSEPISEDRKKELTLHISQQLGITLSEANYFVSTPSIEKN MYDPADDSIDIIYKDGTIKNIAEASDMLNISLLSKKVKKYYLCYQRLHR >gi|225935362|gb|ACGA01000030.1| GENE 8 9038 - 10078 877 346 aa, chain + ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 1 336 1 331 332 222 36.0 7e-58 MEFSAKQIAAFIQGEIIGDENATVHTFAKIEEGMPGAISFLSNPKYTPYIYETQSSIVLV NKDFTPEHEIKTTLIKVDNAYESLAKLLNLYEMSKPKKQGIDSLAFVAPSAKIGENVYIG AFAYIGENTVIGDSTQIYPHTFVGDGVKIGNSCLLYSNVNVYHDCRIGNECILHSGAVIG ADGFGFAPTPNGYDKIPQIGIVILEDKVDIGANTCVDRATMGATVVHSGVKLDNLIQIAH NDEIGSHTVMAAQAGIAGSTKIGEWCMIGGQVGIAGHSKIGDKVGLGAQSGVPGDIKSGS QLIGTPPMELKQYFKASVAQRSLPDMQKELRNLRKEIEELKQLLNK >gi|225935362|gb|ACGA01000030.1| GENE 9 10090 - 11475 1443 461 aa, chain + ## HITS:1 COG:XF0803 KEGG:ns NR:ns ## COG: XF0803 COG0774 # Protein_GI_number: 15837405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Xylella fastidiosa 9a5c # 1 319 1 297 304 182 35.0 1e-45 MLKQKTLKDSFSLSGKGLHTGLDLTVTFNPAPDNHGYKIQRIDVEGQPTIDAVADNVTET TRGTVLSKNGVKVSTVEHGMAALYALGIDNCLIQVNGPEFPILDGSAQYYVQEIERVGTV EQNAVKDFYIIKSKIEFRDETTGSSIIVLPDENFSLNVLVSYDSTIIPNQFATLEDMHNF KDEVAASRTFVFVREIEPLLSAGLIKGGDLDNAIVIYERKMSQESFDKLADVMRVPHMDA EQLGYINHKPLVWANECARHKLLDVIGDLALIGKPIKGRIIATRPGHTINNKFARQMRKE IRLHEIQAPSYDCNREPVMDVNRIRELLPHRYPFQLVDKVIEIGANYIVGIKNITANEPF FQGHFPQEPVMPGVLQVEAMAQVGGLLVLNSVDEPERYSTYFMKIDGVKFRQKVVPGDTI IFRVEMLAPIRRGISTMKGYAFVGEKVVCEAEFMAQIVKNK >gi|225935362|gb|ACGA01000030.1| GENE 10 11495 - 12262 878 255 aa, chain + ## HITS:1 COG:VC2248 KEGG:ns NR:ns ## COG: VC2248 COG1043 # Protein_GI_number: 15642246 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Vibrio cholerae # 1 255 1 262 262 209 43.0 5e-54 
MISPLAYIHPEAKIGENVEIAPFVFIDKNVVIGDNNKIMANANILYGSRIGNGNTIFPGA VIGAIPQDLKFRGEESTAEIGDNNLIRENVTINRGTAAKGRTIVGNNNLLMEGVHVAHDA LVGNGCIIGNSTKMAGEIIIDDNAIVSANVLMHQFCHVGSHVMIQGGCRFSKDIPPYIIA GREPIAFSGINIIGLRRRGFANEVIESIHNAYRIIYQSGLNTTEALKKIEDEFEKSPEID YIIDFIRNSERGIIK >gi|225935362|gb|ACGA01000030.1| GENE 11 12284 - 12838 626 184 aa, chain + ## HITS:1 COG:no KEGG:BT_4204 NR:ns ## KEGG: BT_4204 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 184 184 285 93.0 4e-76 MIYRFTIISDEVDDFVREIQIDPEATFYDFHEAILKSVGYKNDQMTSFFICDDDWEKGKE VTLEEMDDNPEMDSWVMKDTTISELVEDEKQKLLYVFDYITERCFFIELSEIITGKDMDG AKCTKKSGDAPKQTVDFEEMAAASGSLDLDENFYGDQDFDMEDFDQEGFDIGGDASTPYE EEKF >gi|225935362|gb|ACGA01000030.1| GENE 12 12954 - 13856 882 300 aa, chain + ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 4 282 5 287 314 189 36.0 5e-48 MPTLIVLIGPTGVGKTELSLRLAETFQTSIVSADSRQLYAELKIGTAAPTPDQLKRVPHQ LVGTLHLTDYYSAAQYEIEALEILEKLFTQHEVVILTGGSMMYVDAICKGIDDIPTVDAE TRQLMLQKYEEEGLEQLCAELRLLDPEYYRIVDLKNPKRVIHALEICYMTGRTYTSFRTQ QKKQRPFRILKIGLTRDREELYDRINRRVDQMMEEGLLEEVRSVLPYRHLNSLNTVGYKE LFKYLDGEWELPFAIDKIKQNSRIYSRKQMTWFKRDEEIQWFHPEQETEILAYLRQQINA >gi|225935362|gb|ACGA01000030.1| GENE 13 13863 - 15197 1109 444 aa, chain - ## HITS:1 COG:no KEGG:BVU_0113 NR:ns ## KEGG: BVU_0113 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 82 248 426 627 628 125 39.0 3e-27 MMLKEKSNPWARLKYLYVLPLAAIAVTAFARPEISEKMEEISAVKVNDLAEIVQEKVLQD TVKVSKDEKRDAVVVSGVKSKAEEEIIVFEVVEQMPEYPGGMNALQKYLMDKITSSPMKG KAGGRVMVGFTVAETGKIKDVYVLQSDEEALNREAERIVSEMPDWIPGKQRGRPVPVKYT IPVRFGSFRFAENKPPLILSDGKEISMEAMEKLDPSTIESLSVLKDSASIKVYGKRGANG VILVNTQRGSKDKKHNIEISFSKKATNADAIPDFPVSGTVVDEEGRPKAGVSIIVPNTNH GTITDINGHFSLKAIKDGNLWFSFIGYKPVKAPVTSTMNIRMEQEVVKLFPELSGSALNV 
SSGTKGVNGVSIYGIKGEQPLVIIDGKEAMDKDALSKLAPDHIKSISVLKDKSAQAVYGD KGKNGVIIVEMLTDEEYQTRQNKK >gi|225935362|gb|ACGA01000030.1| GENE 14 15100 - 15978 428 292 aa, chain - ## HITS:1 COG:no KEGG:BF0821 NR:ns ## KEGG: BF0821 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 266 2 265 520 362 69.0 8e-99 MGVFFIYILKSSVCLVLFYLFFRLLLSKETFHRFNRMALLGVLFFSLLIPCIEVTTRHQV EVQQAMLSIEQLLLMAELEATPVEAGAVQETVASWIQIVLLVYLAGILFLACRSIYSLIS LFRLIHSGKQEKLEKGVTLVVHHHEIAPFSWMKYIVISQKDLEENGREILIHEMAHIHHR HSIDLLLADICIFFQWFNPGAWLLKQELQNIHEYEADETVINEGVNAKEYQLLLIKKAVG TRLYSMANSFNHSKLKKTYHYDVKRKIESVGTVEVFVCTSVGSYCSNCFCPS >gi|225935362|gb|ACGA01000030.1| GENE 15 16002 - 16337 387 111 aa, chain - ## HITS:1 COG:no KEGG:BT_4198 NR:ns ## KEGG: BT_4198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 12 121 121 194 91.0 1e-48 MGFFWEKGPLFVKEMLAFYEEPKPHFNTLSTIVRGLEEKGFLAHYTFGNTYQYYPVVSEE DFRKGTLRNVISKYFNNSYLNAVSSLVKEEDISLDELKQLIHEVEQADKKH >gi|225935362|gb|ACGA01000030.1| GENE 16 16510 - 16812 219 100 aa, chain - ## HITS:1 COG:no KEGG:BT_4196 NR:ns ## KEGG: BT_4196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 167 89.0 2e-40 MNSKRKDLQYASVFLLAVALTFLLGKGDNLWLVSWGDLIPSLFVLFIAGDCLHSSLLRIK RGEEDGGARWSTCFTFLVFSIVFMGDLFYIGNFIVNKLWG >gi|225935362|gb|ACGA01000030.1| GENE 17 16967 - 19177 1785 736 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 138 721 141 719 738 302 32.0 1e-81 MKKISITLLLCLLCLTGMAQGQKALDLKDITSRRFRPENIQGVIPMPDGEHYTQMNADGT QIIKYSFKTGEKVEVIFDVSTTRECDFKNFDSYQFSPDGQKLLIATKTTPIYRHSYTAVH YIYPLKRNDKGVTTNNIIERLSDGGPQQVPVFSPDGTMIAFVRNNNIFLVKLLYGNSESQ ITEDGKQNSVINGIPDWVYEEEFGFDRALEFSADNTLIAFIRFDESEVPSYSFPVFAGQA 
PRIDALKDYPGEYTYKYPKAGYPNSKVEVRTYDIKSHVTRTMKLPLDADGYIPRIRFTKD ANKLAIMTLNRHQDRFDLYFADPRSTLCKLMLRDESPYYIKENIFDNIQFYPEYFSMLSE RDGYSHLYWYSMGGNLIKKVTNGKFEVKDFLGYDEEDGSFYYTSNEESPLRKAVYKIDKK GKKTKLSQQAGTNTPLFSKSMKYYMNKFSSLDTPMLVTLNDNSGKTLKTLITNDALKQTL SGYAVPQKEFFTFQTTDGVKLNGWMMKPVNFSASKKYPVLMYQYSGPGSQQVLDTWGISW ETYMASLGYIVVCVDGRGTGGRGEAFEKCTYLKIGVKEAKDQVETALYLGKQAYVDKDRI GIWGWSYGGYMTLMSMSEGTPVFKAGVAVAAPTDWRFYDTIYTERFMRTPKENAEGYKES SAFTRADKLHGNLLLVHGMADDNVHFQNCAEYAEQLVQLGKQFDMQVYTNRNHGIYGGNT RQHLYTRLTNFFLNNL >gi|225935362|gb|ACGA01000030.1| GENE 18 19177 - 20025 346 282 aa, chain + ## HITS:1 COG:SA0785 KEGG:ns NR:ns ## COG: SA0785 COG0320 # Protein_GI_number: 15926513 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Staphylococcus aureus N315 # 5 281 9 288 305 297 50.0 2e-80 MTDRVRKPEWLKINIGANERYTETKRIVDSHCLHTICSSGRCPNMGECWGKGTATFMIGG DICTRSCKFCNTQTGRPHPLDANEPTHVAESIALMKLDHAVITSVDRDDLPDLGAEHWAR TIREIKRLNPQTTIEVLIPDFQGRMELVDLVIDARPDIISHNMETVRRISPLVRSAANYD TSLQVIRHISEKGVKSKSGIMVGLGETPEEVETLMDDLLATGCQILTIGQYLQPSHRHYP VAAYITPLQFAKYKTVGLEKGFNIVESAPLVRSSYHAEKHIR >gi|225935362|gb|ACGA01000030.1| GENE 19 20089 - 21207 681 372 aa, chain + ## HITS:1 COG:no KEGG:BT_4191 NR:ns ## KEGG: BT_4191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 372 1 373 373 605 87.0 1e-172 MDHKGILSSAFNMSLGFIPVIISILLCELITQDTAIYIGTGIGIIGIYLSYRRKGLLIPN FILYIATGVLALLSVAALIPGDYVPPGALPLTLEVSILIPMLSLYMHKKRFINHFLKQIG SCNKRLYAQGAEAAVVSARIALIFGILHFIIISITVIFQNPLSKTSLFVLYKVLPPTVFL MSILFNQIAIRFFNHLMSHTEYVPIVNTKGDVIGKTPAIEAVNYKNAYINPVIRIAISTH GMLFLCDRPSTAILDKGKVDIPMECYLRYGESLEAGATRLINNAFPHEKDIKPEFNIVYH FENEVTNRLIYLFIVDIKDDSILCTPRFKNSKLWSFKQIEENLGKGFFSSCFEDEYEHLK GVIYIREKYRES >gi|225935362|gb|ACGA01000030.1| GENE 20 21173 - 21877 528 234 aa, chain - ## HITS:1 COG:NMA0547 KEGG:ns NR:ns ## COG: NMA0547 COG0313 # Protein_GI_number: 15793541 # Func_class: R General function prediction only # Function: Predicted 
methyltransferases # Organism: Neisseria meningitidis Z2491 # 1 233 6 239 241 211 47.0 1e-54 METALYLLPVTLGDTPIEKVLPSYNKEIISGIRYFIVEDVRSARRFLKKVDREIDIDALT FYPLNKHTSPEDISGYLQPLVGGASMGVISEAGCPAVADPGADVVAIAQRKKLKVVPLVG PSSIILSVMASGFNGQSFAFHGYLPIEPGERAKKLKTLEQRVYAENQTQLFIETPYRNHK MIEDILLNCRPQTKLCIAADITCEGEYIQTRTVKDWKGHVPELSKIPCIFLLYK >gi|225935362|gb|ACGA01000030.1| GENE 21 21932 - 22939 894 335 aa, chain - ## HITS:1 COG:no KEGG:BT_4189 NR:ns ## KEGG: BT_4189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 1 333 335 593 87.0 1e-168 MKLRISLLIILMSMLFTSCGISTGKGTEQKEEEISVLRYDKLLSEYVRSNSFSAMQKLTM DYRMPTKILIEDVLSIGTVKDDTISQRLQKFYSDTTLVRLLSDVEAKYPNLDEVEKGLSK GFRKLKKEVPDTKVPFIYSQVSAFNESIILVDSLLGISLDKYMGEDYPLYKRFYYDYQCR SMRPERIVPDCFAFYLLSRYGMNYHEGTCLIDLMMHTGKINYVVQNLLGYSDIGEAMGYS KEESDWCKDNEKEIWNYICTNDHLHARDPMVIRYYMKPAPAVDMLGAQAPALIGTWMGAR IIASYMKKHKDMKLKDLLEFTDYHEMLSESNYLAS >gi|225935362|gb|ACGA01000030.1| GENE 22 22946 - 23803 1075 285 aa, chain - ## HITS:1 COG:BMEI1958 KEGG:ns NR:ns ## COG: BMEI1958 COG0623 # Protein_GI_number: 17988241 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Brucella melitensis # 5 260 4 257 272 127 34.0 2e-29 MSYNLLKGKRGIIFGALNDQSIAWKVAERAVEEGATITLSNTPMAIRMGEVDALAQKLNC QVVPADATSVEDLENVFKTSMDILGGQIDFVLHSIGMSPNVRKKRTYDDLDYGMLDKTLD ISAVSFHKMIQSAKKLNAIADYGSIVALSYVAAQRTFYGYNDMADAKALLESIARSFGYI YGREHSVRVNTISQSPTFTTAGSGVKGMDKLFDFSNRMSPLGNATADECADYCIVMFSDL TRKVTMQNLFHDGGFSSVGMSLRAMATYEKGLDEYMDENGNIIYG >gi|225935362|gb|ACGA01000030.1| GENE 23 24156 - 25775 1449 539 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 50 524 18 444 448 238 31.0 2e-62 MNTLTKRLFWMAVCCLPFISSGCKQSETAVKESSISDALYQNLPFEMPKVQQPVFPAYEV 
NIEKFGAKGDGLFLNTKAINDAIKDVNQRGGGKVIIPEGVWLTGPIELLSNVNLYTEQNA LVLFTGNFEAYPIIATSFEGLETRRCQSPISARNAENIAITGHGTFDGNGDCWRPVKKGK LTASQWKKLVNSGGVLDEKQEIWYPTAGSLKGAMACKDFNVPEGINTDEEWAEIRPWLRP VLLSIVKSKKVLLEGVTFKNSPSWCLHPLSCEDFTVNNIMVINPWYSQNGDAIDLESCKN ALIINSVFDAGDDAICIKSGKDEDGRRRGEPCQNVIVKNNTVLHGHGGFVVGSEMSGGVK NIYVEDCTFMGTDVGLRFKSTRGRGGVVENIYINNINMINIPNEPLLFDLFYGGKGAGEE SEEDLLSRMKTAIPPVTEETPAFRNIHISNIVCRGSGRAMFFNGLPEMPISNITVKNVVM TEAADGVVISQVDGVTLENIYVESSKGNNILNVKNAKNLTIDGKVYEELGAKEEILSLK >gi|225935362|gb|ACGA01000030.1| GENE 24 25780 - 25941 83 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLRLKLYVVFLLAKIAEKIADCVSFEEIIVCYLSFLLYIMILWFLFCITEWQ >gi|225935362|gb|ACGA01000030.1| GENE 25 25940 - 27166 1137 408 aa, chain + ## HITS:1 COG:no KEGG:BT_4186 NR:ns ## KEGG: BT_4186 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 406 1 406 410 749 87.0 0 MKKNLFIITFFLGAFSLSAQAQKQEKTITVEVQNNWNQAKTDAPIVINLHELHAGFKVKS AIVMEGTKEIASQLDDLNRDRKMDELAFVADLPAHARKTFQITLSSEKSTKTYPERVYAD MFIVDNRKGKHQRVQAITVPGTSNIYSMVRPHGPVLESELVGYRLYFNEKQTPDIYGKFN KGLEIKESQFYPTDEQLAKGFGDDVLRVFDSCGPGALKGWDGQQATHITPVDTRTERIIS YGPVRVIAEIEVTGWKYQDTELNMMTRYTLYAGHRDLHIEAFFDEPLNKEVFCTGVQDIV GTSKSFSDHKGLVGSWGTDWPVNDTVKYAKETVGLGTCIPQRYVKSEEKDKANFLYTITA PGNKYFQYHTTFTSMKETFGYKTPEAWFAYLREWKEELAHPVTVKVIR >gi|225935362|gb|ACGA01000030.1| GENE 26 27314 - 28990 1234 558 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 18 556 22 529 531 194 30.0 6e-49 MKRLTQTLAFCLLTVFTAVAQKNYVSEVWVSDLGNGKYKNPVLYADYSDPDACRVGDDFY MTSSSFNCLPGLQILHSKDLVNWTIIGAAVPNALSPIETPERPEHGNRVWAPAIRHHNGE FYIFWGDPDQGAFMVKAKDPKGPWSEPVLVKPGKGIIDTCPFWDEDGKVYMVHAYAGSRA GLKSVITICELNAEATKAITPSRIIFDGHEAHQTCEGPKMYKRNDYYYIFHPAGGVPTGW QVVLRSKNIYGPYEWKTVLAQGNSPVNGPHQGAWVDTPTGEDWFLHFQDVGAYGRIMHLQ 
PMKWVNDWPVIGIDKDGDGCGEPVLTYKKPNVGKTYPICTPQESDEFDGYTLSPQWQWHA NINEKWAYYAGDKSYVRLYSYPVLNDYKNLWDVANLLLQKTPSDNFTATMKLTFTPNPKN KGERTGLVVMGRDYAGIILENTDKGLVLSQVECKKADKGKPEQANASVDLSQNTVYLKVR FSYDGKKIKGSEGGHDLIVMCNFSYSLDGKKYQPLGNPFQAREGQWIGTKVGMFCTRPAI VTNDGGWADVDWFRITKK >gi|225935362|gb|ACGA01000030.1| GENE 27 29251 - 29913 640 220 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 216 4 221 222 145 36.0 6e-35 MKKLVIFDLDGTLLNTIADLAHSTNYALNKLGYPTHEIEKYNFMVGNGINKLFERALPEG EKTEENVLRVRNEFVPYYDIHNADDSRPYPGIPELLSYLQSAGIQIAVASNKYQAATEKL VAHYFSGIHFTAVFGQREGINVKPDPAVVFDILKLANVQKEDVLYVGDSGVDMQTAANAG VTACGVTWGFRPRTELEKFAPQHITDTAEEIKQFLSTGKV >gi|225935362|gb|ACGA01000030.1| GENE 28 30042 - 30950 635 302 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 42 300 17 283 352 144 32.0 2e-34 MKKHLILFFIGVILSYGNIKAQTVPDKKEILKVTLQVNDYFMKKYADYRTPSFVKKVVRP SNIWTRSVYYEGLMALYSIYPADEYYLYAKEWADYHQWGFHRGTTTRNADNYCASQIYLD LYNICPDPEKIRKTKANMDMLVNTPQVNDWWWIDAIQMGMPIFAKMGKLTGEQKYFDKMW DMYEYTRNQHGENGMFNPKDGLWWRDHNFDPPYKEPNGKDCYWSRGNGWVYAALVRVMNE IPSDEKHRQDYINDFLTMSKALKQCQREDGFWNVSLHDPTNFGGKETSGTALFVYGMSWG GT >gi|225935362|gb|ACGA01000030.1| GENE 29 31006 - 31182 185 58 aa, chain + ## HITS:1 COG:no KEGG:BVU_0178 NR:ns ## KEGG: BVU_0178 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 55 322 376 376 114 92.0 1e-24 MVKDAVHPNGFLGYVQGTGKEPKDSQPVTYDKVPDFEDFGTGCFLLAGSEIYKLDLHI >gi|225935362|gb|ACGA01000030.1| GENE 30 31244 - 35530 2574 1428 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 
15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 896 1126 52 288 318 135 32.0 4e-31 MKYKLLSLFALLFLFVSESYADIELSSKQMRTSDGLPSNSVRCMYQDSKGFLWLGTLDGL TRYDGNSFLTYQLETGKHDQVSLADNRIKHVAEDKNGFLWIKTVPELFSCYDLQKACFVD FTGTGDDGENYSEIYMSSNGDVWLWHRASGVRQIVCKENRTMTSVKFRAESGSLPDDRIK FINEDSSGRIWIGTMHGLASVYKGKVEIVDRELNFSSSFPHGGDMYFLTKGGDVYRCVNG SKKIDCLASLSTIAGNTSVSAHSLIKDKWVIFTSNEGVYEFDFQTIQITPYRKLKINNGK VIHDNYGDCWAYNNTGRIYYINVKSGEIKDFQLIPEEKMKYIDYERYHVVRDDRGMVWIS TYGNGLFTYDIQEDRLEHFVSEGKNAGPIGSDFLLCLMKDRTGGIWVSSEYSGLSHISIS NKGITHVYPESPDVFDRSNTIRLLTKMSNGDIWVGTRRGGLYNYDSHLKTKIDNHYLPYN IYAISEDSQENIWIGTRGDGLKVGDTWYKPDPSNPFALSHSNIYSILRDKKNRMWVGTFG GGLDLAEQTAGEQYKFRRFFQQKKYGLRIVRVMAEDEKGMIWMGTSEGICIFHPDSLITD EDNYRLFSYTDGNFCSNEIRCLFRDSKGRMWVGTSGAGLNLCELTDDCQSLKYTHYGISE GLVNNMVQAIQEDYSGQLWIATEYGISRFNPNSHSFENYFFSSYTLGNVYSENSACMGAD GKLIFGTNYGLTIIDPKKIPTERLLSPIVITGLSVNGIQVKPNMPGSPLQESLAYSDKIT LKYFQNSFMLDFSTLDYSDNGQIKYMYWLENYDKGWNVPSSLSFASYKYLEPGTYIFHVK YCDGAGIWNDTETTLKIVIVPPFWKTSWAMLGYFILILIALYFTYRIIINFNRLRNRINV EKQLTEYKLVFFTNISHEFRTPLTLIQGALEKMYRIDDIPQALLHPLKVMDKSTQRMLRL INQLLEFRKMQNNKLALSLQETDVVAFLYEIYLSFSDVAEQKNMDFRFLPSVPSYKMFID KGNLDKVVYNLLSNAFKYTPSGGTILFSVNVDEAKKCLQIQVADTGVGIPKEKQSELFKR FMQSSFSGDSIGVGLHLSHELVQVHKGTIEYEDNEGGGSVFIVCIPTDKTVYSEKDFLVP GNVLLEEANMQAHHLQELSEEYQEHEKAVTSSRKHKILVIEDDNDIRGFLQEELSAYFEV EVAADGISGFEKACNYDADLIICDVLIPGMTGFEVTKKLKTDFATSHIPIILLTALTSPD KHLEGLEAGADIYIAKPFSFKLLLAQVFRLIEQCEKLRKKFSSEPGIVRPALCNTDRDKE FADQLVAVVEKIYSQYNISIDEFAQMMGMGRTVFYKKLRGVTGYSPNEYLRVVRMKKAAE LFLTERNLTVAEVSYKVGINDPLYFSKCFKAQFGVSPSAYQKGERNKA >gi|225935362|gb|ACGA01000030.1| GENE 31 36404 - 37834 1285 476 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 36 445 7 408 448 334 42.0 3e-91 
MIKKRCIMMRRTIYLLLILLFVAEVTYANTVDFDKAFKESARIEKQIKRTSFPKRTFLIT DFGAKTDDEANPCHEAINQAILQCSLAGGGTVIVPKGTFYTGPVTLKSNVNFHLEEGAVL KFSTDQSLYFPAVLTRWEGIDCYNAHPLIYAYGESNIAITGKGIIDGQGSMETWWPMCGA VKYGWKEGMVAQRNGGRERLLMYGETSTPVYKRLMKPEDGMRPQLLNLHSCHTILIEGVT LLNSPFWVIHPLFCESLIVSGVTVFNRGPNGDGCDPESCKNVLIENCTFDTGDDCIAIKS GRNEDGRKWNIPSENIIVRGCMMRNGHGGVVIGSEISGGYRNLFVEDCRMDSPNLDRVIR IKTSTCRGGLIENVYVRNVTVGQCREAVLRINLQYENREKCKRGFDPIVRNVHLKNVTCE KSKLGVLIIGLEDDKHVYNISVEDSHFNNVAKGGNDIKGAKDVTFKNLYINGELVK >gi|225935362|gb|ACGA01000030.1| GENE 32 37847 - 41110 2437 1087 aa, chain - ## HITS:1 COG:no KEGG:BVU_1870 NR:ns ## KEGG: BVU_1870 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 18 1087 9 1085 1085 1481 65.0 0 MKYFRLNLILKGGLTAYCLFMLPAMQLNAQREAKQIGDFKESISLNEHLRGAKRTLQYHP DGDEFVCINGKNRYTRALYGSHSPFRVETSDRPVFAFYNNGRGGNISFKIILRDGTEVAL DRTGHCESRYSAGKRTYYLTDPSWGKGELRIVVLALADMDGAIWRFSASNMPKGAILRWQ HGGATGKRLSRNGDMGVDPADCFELPAEAADLVTGELPLQKEAYLVRGEVPEQSNARKLY QKAEAASLALASRLKIETPDPYLNALGGALVAAADGIWDGQVWLHGAIGWRMPLSGWRAG YTGDALGWHDRARTHFDAYAASQVTDVPNTIPHPAQDSTMNLARSEKRWGTPQYSNGYIC RNPGRNNQMHHYDMNLCYMDELLWHFNWTGDTAYVRKMWPVITRHLAWEKLNYDPDNDGL YDAYACIWASDALYYNSGAVTHSSAYNYRSNKMAAFLASLIGEDSVPYQKEAEQILKAMN KRLWMQSKGCWAEYQDFMGHRRLHESPGLWTIYHALDSEVADPFQAYQATRYVDTEIPHI PVCADGLKEGYATIATTNWLPYSWSINNVAFAEVMHTALAYFQAGRPEEGCRLMKSSFLD GMYLGNSPGNFGQVSFYDAARGECYRDFGDPIGVASRLLVQGLYGILPDVLNGKMVIRPG FPAEWSKASISLPDMTYSFVREKNVDTYQIEQRFETPLALTLQVNAGREEIHSVKVNGRE VDWSFAEAASGYPVIIIPASSVKKSTVEIVWGGKSLSSVLPEIQADALAEINIPSTSGAL FGEIYDPQGVLVQPNVSDTSIRSKVNDHLGHHTFFVRMKQGQMEWWQPVNIQISELEKSS VILPFSQVNTLECRMINMDSLFNANLTDIFRNKYLTPRSPYTTLSLPIQGIGEWCHPKLT ADIDDTGLRALVRNEVLTTQLGVSFRTLAQGKNIAFTSLWDNYPDSLSIPLSGRASHAYL MMAGSTNHMQCRIANGIIRVFYTDGTSDVLELVNPDNWCPIEQDFYVDGQAFTVSSPRPY RIHFKTGMVSNDLGKNLGIKGVYGRSIEGGAGILLDMPLNPSKELSHLTLETLSNDVVIG LMGITIQ >gi|225935362|gb|ACGA01000030.1| GENE 33 41126 - 43240 1729 704 aa, chain - ## HITS:1 COG:no KEGG:BVU_1871 NR:ns ## KEGG: BVU_1871 # Name: 
not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 704 6 702 702 1066 72.0 0 MMKTLAAYIAALLVWTACGDGKQSVIDREALVERNSPVVTAFDSLASLSVGNGEFAYTVD ITGLQTFPDNYSKGVPLGTQSQWGWHSFANPEHLTPEETLKEYDFGRGKKELYATQFKEK GRQQDAANWFRVNPHRLHLGIVGFDVKEDTDIEQVTDIHQKLCLWNGKIESRFKLNGEDY QVETVCHPSSDMIAAHITSKAHTGICFRFPYPTGGHCDDACNWEAIDKHTTTIVTQNESS AVLKHTLDSTEYYITLHWEGEATLNEKVKHCFVLTPMDDKLAFTCAFTSEAPSTQIVTVE QIQEEAKNYWNSFWKEGAAVDFSSCTDSRAKELERRVVLSQYLLAIQSAGTTPPQETGLT YNSWFGKFHLEMIWWHEAQFALWNRSKLLDRTLAWYEKAEPIARQIAQRQGFDGVRWMKM TDPSGTEAPSKVGSFLIWQQPHLIYLAELFYRADPTKEVLEKYNRLVQETAEFMYSFATY DELEGRYVLKGAIPAQETLRAAETVNPPFELSYWHFAMQTAQQWRERMGQQRNLEWDEML DKLSPLAYNADGLYLAAETATDTYKDIRYTSDHMAVLGSVGILPMNKLIREDYMKNTLHW IWDNWNWGKTWGWDYPMTAMNAARLGEPEKAVGALLMDKRTNTYLINGHNYQDGRLRIYL PGNGGLLTAIAMMCAGWDGCIETNPGFPKDGTWNVRWEGLKPLP >gi|225935362|gb|ACGA01000030.1| GENE 34 43356 - 45035 1181 559 aa, chain - ## HITS:1 COG:no KEGG:BT_4169 NR:ns ## KEGG: BT_4169 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 559 17 577 583 392 38.0 1e-107 MQMKKILLNAAFILAVTLLGGCDDFFNPDTDVTLDNEDYISEESEMYSGYIGIMTKMQAI GDKVIYLNELRGEMVVPTATAPTELYNLYNYDDDLSGNSYADPSGFYEVVNACNDYLRKL KTYEEKNSINESHYKALVSSTLRIKAWMFMTIAKIYGEVAWVNKPMTSLRDLSQFDILNL DETMVACKNLLDIGYDNIDGTYETAWKDWVDPDTELANSEYRRWDMMTPPYYAIYAEICL WLGRYQQCINLILNKMNSIYQSTTNQSIAFLRNDMLFSHYTNFFNNETPYDYESASAIMY DYQNRQTNSLLKHFDSDYPNKYWLAPAEVAVERFRNNEFGPLGNKDKDFRMGYTVSEYNG KWVISKFRPTSKPVRNAYRDDVFVYTYRGADLYFMLAEAFNQLGRRSVVDALINVGVNAY SSEFDIDEEGTYSGNWYGITPHWTNNSTVYYYSNGTTGIGSRKYGDRGIRGVEVSYSNLG ARTFTSNVQNNDEEILKEMMLEMACEGKVYPAMIRMAKRYKDNSFMAKYISEKYESTGNA AAIRSKIMSGDYFIKWKLK >gi|225935362|gb|ACGA01000030.1| GENE 35 45038 - 48181 2305 1047 aa, chain - ## HITS:1 COG:no KEGG:BT_4168 NR:ns ## KEGG: BT_4168 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1047 1 1050 1050 1177 57.0 0 MKKIQVLFFTLLICLSAANLYAQEKGKISGKVLSTLNVPIKGAVISVTGSEDVTTDKNGV 
FQIECKDPQKANISVWAAGYYTVLQTINNRKEISIILIPESEYKYNETTVLPFRVETNEV TTSAENITKKDFVPAGMKIDRALAGQIAGLQVTRGSGMPGEGSFINLRGTRSFVGNNAPL VVINGVPYIPDVNDSPLINGLSRNIFQAYNLNDIQNITVLKGAEAALYGSMGANGVILIE TDGANSDNMNTEISYYGQFGVNWNNKRMPLLSGINYRSYLSDMGMSYFGNMDEFFTEFPF MSDPTNSRYANYYNNNTNWQDEIYRSGFVTENLFRVEGGDAIAKYDLSLGYSKEDGILKS TQQNRYHTQLNGNFLISKNLEVYATVGLAYINGNYQEQGMNTRTNPLLTAYAQSPLLAPY QKADNGVFTPVYSTYYYGISENMDYAVSNPAAIVNTLDANNRQYDVNIKAGFNYKIFPGF SLNGVLGLYYNYNKEHIFVPGRTDFTIIPITDTYGTEENTIREGVAEATNFFYNMNACYN KLFNNRHALNVLAGVQILTTQQEYDGAYARNTTNDFYQVLGSADAIGRYFDGYQNKWNWA NIYGHVDYTYNNVLNAALNMSIDGSSANGRYTNHFQVYPSGGVTWMVKNMPFLIDKDWIN RLDLRAEYSLTGNSRFSSNYGKSYYSSSPYMAVSGIIRTQIPNTYLKPEVTSQMNLSLDA NLLRNRISVGVDYYRGRSKDVIMNIGKSAVYGTSAYYANVGKIDNNGVELSLQASLIRLR DFEWIVGGNIAFTESKIKSLGGNEQEVVEYSDGSMVVSRVGGNPYEFYGLQATGVYSTQA EAEEAGLLNSSNQAYSAGDVHYIDQNNDGRIDSKDYVSLGSATPDYFGGFYTNIRYKHFA LSAEFSYSKGNEAYNAVRRTLESASSFSNQSIAVANRWNVEGQVTNIPRASYGDAIGNNN FSSRWIEDASFLRMKSVTLSYSFDKPIWKFFRSGTLYVTGENLLTATKYLGMDPEFAYSS GNSATQGFDYAKVMQPKSVKLGINLKF >gi|225935362|gb|ACGA01000030.1| GENE 36 48203 - 49870 1332 555 aa, chain - ## HITS:1 COG:no KEGG:BT_4167 NR:ns ## KEGG: BT_4167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 555 1 580 580 371 37.0 1e-101 MKARKYILGIALIGLIGSSCTDTWNEHYTQDSSTVNNTEISIVNATVTDYLAQEPSLSNM YQLFNETGMVKQLLAKEQMYTILAAESSIAAGDDPKYTAQTYISDASISPSNLKDGQRIL MWSGKYLNISVTSPEARGTTGIRFNNANVTRVIKLTNGYLYLLDQAIESPRSMYEIIENL GDDYSIFRDMVRSRYVLTFDKNASTIVGVDNTGNTLYDSVFTVKAPYFENKGFNIMSENL TATMLLPSNDVVNKALNTARQNLANWNMTRADSILENWVFQAAFFNRTYSKVDFENNEDL TSIFSKQWRTTVQKVDLDNPISMSNGIAYHVTEMKIPTNVLIYRLKELFRNYEYLGAAEK DAYFKSTNLSFEKISETDEATCTGWAALGFPTIGYRVLYYTLTDAGNQTYTLDYTPFRGE TMGSNYVATPYKIPPGTYTFTMGFRQRKDLGAIAISIIKDGVEIPINTLSQSVLNNAASY HYDRDGGGYPEGIDEAVAQGFKNKSKYDRDGGAVGTVTIPGDEATEIVIRFNASGSNLVR AALYHWCLKPTKDCY >gi|225935362|gb|ACGA01000030.1| GENE 37 49874 - 51586 1103 570 aa, chain - ## HITS:1 COG:no KEGG:BT_4166 NR:ns ## KEGG: BT_4166 # Name: not_defined # Def: putative 
lipoprotein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 569 1 581 581 362 37.0 2e-98 MKKIKRLYWMLLIILCAACNDPYDGDTYVVFDMQPAATYLSNRSEDFSEWIHIMKYADLY NAVNQATQSFTLFVPNNAAVQEFYTRKGVSSIEDLGAEYARNLVSYHIVQDTINQATFIE KEGALAKRTVSDDVLMVSFGSAEVGGGGIQSVYLNSEARVLEFANPVSNGYVYVLESTLT PLTESVYARISESGRPYTILKAALDATGVGAELNVIYDDVVDDLGQVTQQKRNYTLLAVS DAVFQEAGVNSLQDLVQLLGASSDYTNPENALYQYLAYHVLDGSYDLSKLQSFDTSDATS KIWNTLNTGSVVRISKEDRIYYLNYRDDNRAIFLEDYCNLQAKNGYIHQVSSYLPVAEAE PEIVLFDVCNYSIIGDWIAAGNGEEGIKFQESFGTAEKKCDVAGLNCYEYSLNNPSGTFS SYFNVTYFTTRTNNGWNTANNMDFLMLNLGNTGWISMQTPSIIKGKYKVTLQFGYATSMD FIRTQDSKGGSNGGRMTFSFDGENSVTHAPYTTISSNTLGCYSYVIYDELEFSETSTHTF RLVINDPAASKNASYRIYIDYLLFEPIFDE >gi|225935362|gb|ACGA01000030.1| GENE 38 51604 - 53145 1153 513 aa, chain - ## HITS:1 COG:no KEGG:BT_4165 NR:ns ## KEGG: BT_4165 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 513 6 515 519 452 47.0 1e-125 MKKIFTILLLSGGLLMSGCSDWLDVLPKDKQSTDMYWESSEDVEAILAQGYSRLRTCVPY IINWGELRGGSVIVPGRSTSTGLIQNFTVLPSVTTVQWGTFYQVIGMANAVLQYAPTVME KDASYYESQMNSHRTEAYFLRGLSYLYLVRNFREVPLILEPYVDDAMPTDVAKSSEAEIL EQIKSDVRAALATNAAKETFDGTWATKGRATKWALYALMAETCLWAEDYEECITYADYLI NATAAMRPVFMSTPGQWFNIFYPGNSNESIFEIQYDETNYAQGSGSPSRLFPYGTDITTS TYMYSEPMTVRLMNEYAQHTGDEVSRTYYGSFAGTSYVSYPENGIIWKYSGMGVADREAV RTILDANYIIYRMADVMLMKAEALIRIGGSTNWTEALAIVNQIRERSELSPRSEITAENI NEATEEMLLEILLDERDMEFAAEGKRWYDLLRYGKQQNFKYRARFKAIIMENNLTAESRW LSSVLDNDNAWYLPIPANDIFNNSLLTQNPYYE >gi|225935362|gb|ACGA01000030.1| GENE 39 53159 - 56422 2391 1087 aa, chain - ## HITS:1 COG:no KEGG:BT_4164 NR:ns ## KEGG: BT_4164 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 1087 9 1088 1088 1233 59.0 0 MKLLKYILIAIVFAATTGEAFGQAKVLSGKVTEMFGTTAEPMIGVNVNILNSQNRSLGGT ITNMDGIYNLSIPNESNLTIVFSYIGKKSERIKYNGQTRLNVSMKDDTQTINEVVISGKR IERNDLGISSREQSSATQKVSMEDLVAIAPVTSIEDALQGQLGGVDIVLGGGDPGARASI QIRGGSTLNANAEPLIVVDGVPYPAEIDDNTDFSTINNDDLGALLNISPQDIESIEVLKD 
AAATAVWGTSGANGVLVIKTKQGVQGKTRFSFSSKFSAKFEPESIPMLDGNQYSALISEA LWNSANYIGLANSSSYLQMLYGSPEIGYQPNWTYFDEYNQNTDWLSEVRRNTQSWENTFA MSGGGERATYRFSLGYLDEGGTTVGTDFSRLNSSLNVNYKFSDRLNFNAQFSYVQSKRNN NYYNVRTEAFRKAPNKSPYYIDDETGSHTSQYFNHSEHSTLDPAFKQDGDKSKYYNPIAM ANESVDRTIQRESKMNFRIEYDILSGLTYQGYVNMNIRNTKGRMFLPQVATGLAWTDSYA NRSTDTSSDQLTINTENKLMYRKNWNDKHAIIATALFRTTETNKSSYGSETSGNASSGLS DPTIGSAVRKISSGDSKVRSINGVALMNYTLLNRYIVNASLGLESNSTMGKSERLGMFPT VGLAWHLADEKFMDFSNNWMDEFKLRFSLGQTGNAASGASLYLGSYASGDNYMDMSAITP STMQLNRLKWETTTEYNTGLDASFLRGKLRFTVDVYQKYIKDLLQKKVEMPTSIGYGNGY EVSYMNSGKMTNKGWEFRVDAVPFENKDWRVGFYFNIAHNENKITEMPDNYTEENYTFNN GNYAYRREEGRPMGSFYGYRYKGVYQNKDATYARDADGNVMRDISGNAIVMRNGNATVAP GDAIYEDINHDGVINQYDIVYLGNSNPRFTGGAGINISYKQFKLSAIFYGRYGQDVVNST RLNNESMRGIDNQSTAVLRRWRNEGDQTDIPRALYGEGYNTLGSDRFVEDASYLRLKSLS FTYRLPKKAIQRWGFNNIDVYVTGYNLLTWTKYTGQDPEVKTENTYARDSATTPASVQLV AGFNLSF >gi|225935362|gb|ACGA01000030.1| GENE 40 56443 - 58704 1384 753 aa, chain - ## HITS:1 COG:no KEGG:BT_4163 NR:ns ## KEGG: BT_4163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 753 1 749 750 528 41.0 1e-148 MKDTTFLRKMLWLLCLPLLLLSCEDKMDEHYEVPDWVADNAWEVLSSGEHGNYSIFLQGL EIAGYKQMLEGKAILTIMAPDDDAFRAYLNERGYTSISDMPVTEVSKLIGYHILYYSYNK EKLVNFRPTGSTETEEEQNVAAGLYYKHRTRSSDAPTIETTSAGSTVMVYHLERYLPVFS YRYFQTKGINAKSNYEAFYPNSTWTGDNGFNVSNASVKEYGIIANNGYIHTVDRVVEPLE TIYTELKSQEEYSTFLNLYDSFGYYVADNELSKSYAAAYGVDTLYQYMHDGLPNIACEWP TTSYLNFTALTATAYSIFAPSNTAINHFFDSFWKVGGYSSMQEVDELALTYFLYQFIYGG SMLFPDELGNDELKNVSGSTLKINPAMLNERIMCVNGAFYGMNEIQDPSAFASVIGPVFQ YKSARSFLYALVGSSLFSSYVSDLAKYIVLVPTADQFEASGIRTVYSTQGLEAQGDDGWS EISSSAKQSIVYLHSASIPSNQSSELPESGTRVVPTESPWNYWFVKDGTITCSSIFNQQL NPQFNGEIFTPFTKLKDGSNGSAYSFDGNQLFSTESGDLAYNIAICADHNYPYYCFAQLL RQANLVSNQILMNLFGKGRFIAFIPTNDAIQQALANDKIPGATNASFDAEGNLTGTFDAA ILANYLNSYFITAAQNAIPSYPYIGSDFRSGRYWSERAATIEGVTVPQLIYTDNGTSLSI QLEGFNACHVVSTYDYFPFAYEGGCFHLIDDVF >gi|225935362|gb|ACGA01000030.1| GENE 41 58740 - 59315 298 191 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|260171680|ref|ZP_05758092.1| ## NR: gi|260171680|ref|ZP_05758092.1| hypothetical protein BacD2_07433 [Bacteroides sp. D2] # 1 191 1 191 191 359 100.0 4e-98 MKQRYFLQFILGILLFTFFIVACNSTDLFDTSACIQKIEKNVKSISEQDSTFENLISVIA STHNGYIYNAQDRQDYCHIYLDGVEYPTFPNLTIPLENDKYVDIWYKMSFGKHHVKLIII IDEGLRYLYRKCTILATSEITALGEEFYYNTVDLDEDYTSTTTAGFVAEFDIDYPYMKEN NYALSINLNFE >gi|225935362|gb|ACGA01000030.1| GENE 42 59345 - 61030 992 561 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2628 NR:ns ## KEGG: Fjoh_2628 # Name: not_defined # Def: fibronectin, type III domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 84 531 33 447 999 291 40.0 4e-77 MKMKNTVFLEKKGISIATKCFTSIFYFFLFSLSSLAACVDKEPVDEEDADPAINPHSSFL EEEENGGNTSSGPAFPAFADNARAFPGAVGYGRNAEGARASSTREVYVVTNLNNSGAGSF KDAISQSNRIIVFNVSGIIDLNKEQLVFKSNQTILFQTAPGDGIELYNGYTSSSGASDVI IRYMRVRVGRQVSGSDNIDAGGCANGKNQIYDHCSFTWGTDECFSLNRDGKGDGLFNITL QNSILGQGCQNHSCGGLIQTEDSEGITIFRNLLIDNKTRNFKVKGLNQYVNNIVYNWGNG AAYNMGGESKGHSNTVIENNYFIKGPAYTWINTAYPIATTDLSMFDDETKYHYNGISGSD YLADTYQQVNPTKPFIGGDGDGDFDTYCVGNYYDDDKDGILNGFELTQGNWQTYCSATPV FLSKPDSQHPEISVKTSATAAYHWVVENVGPVLPNRDLVDQFMIDELITLGTKGTIFRNQ NYETQYPLATTWQNINIGAPKTDTDGDGIPDDFEDKWGLNKNSAGDAVRIADNGYTMIEN YALSLEFPNEYEKAWKEAYGE >gi|225935362|gb|ACGA01000030.1| GENE 43 61108 - 63639 1839 843 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 506 840 22 357 361 298 41.0 4e-80 MKFCRIIFCLWLLVCFFPIEIHADIQLPSILSDNMVLQQNAKVRFWGKARPGEKIQVKTS WDHKKYKVVASENGHWELMIQTPAATTGQSVILKGDNKIHINNILIGEVWLCTGQSNMEF PVARNLQVKWKTGMLNEAEEMKDADFPEIRLFHVEHQMAPDSEKEDCVGKWMVCNPENLK DFSAIGFVFGRKLYKELSTPVGLIQSTWGGTHAESWTSMKVMENNPLYVDVLKQYSKEKV SREKDKCKVPATLWNGMIAPILGYTVKGNIWYQGESNSVRYEKYQEVFTNLINSWRKEWN 
QPDMPFYFVQIAPHYKQPAGIREAQLKTWLSGLENIGMAVVTDAADSTDIHPRNKVAPGE RLAAWALAKQYGKKIVYSGPLYKSMKVNGREITLDFEFAEGGLQTPGNEPVKGFFIAGND ARFYPAEAVIDGSSITLSSTYVSAPVAVRYGYGTFFRVNLFNKAGLPAVPFRTDTFAPDT YYRQFADSEIRRFPEAWQLDHGKRLYFGYAQGVGCCAMLQVWKKTGDRRYFDYVEAWADS LVDDKGEIHLYKKETYNLDYINSGKVLFDLYKETKKEKYKLAIENLIDQLKKQPRTTDGG FWHKKIYPYQMWLDGLYMASPFMARYGAEFNHPEWIDEAVKQFALCHQHTYDAKTGLYYH AWNEDRSQRWADPETGHSPNFWGRSIGWWFMALVDALEYIPQDHSGYADMIKWTKELAET LSKYQDKNGLWYQVIDQPSRTGNFPEASVTTQCMYAYMKAVNKEYIDSQYRAIAEKAFKG LCDKLLFSNPDGTLTLTRCCQVGGLGGKPYRDGSFEYYIGEKMRDNDAKATGPFIMGCIE LNY >gi|225935362|gb|ACGA01000030.1| GENE 44 63839 - 65149 687 436 aa, chain - ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 50 328 86 363 513 144 33.0 5e-34 MIKSVILIVATCCSLQLFGQEKFPDGKIIPDWFYKNDKTDINGLGKKFLITDYGVVNDST LLQTEKIQAVVDLASNNGGGVIVIPEGTYLSGALFFKPGTHLHLEEKAVLKGSDDISDFP VIETRMEGQNLKYFSALINVDKVNGFTLSGKGKVDGNGERYWKSFWLRRSVIPKCTNMDE LRPRLLHISHSNNVQVSDVRLVNSPFWTTHIYKCDSVKLLDLHIFSPSSPVKAPSTDAID IDACKNVLVKGCYMSVNDDAIALKGGKGPWADQDPDNGGNCDIIIEDCTFGFCHGVLTCG SESIYNHNIILRRCDLDKAKRLLWLKMRPDTPQQYKYILVEDIKGNVRNCIFIAPWTQFY DLKDRKDMPVSYSGYITMRNIHLDCDSFFAVEKSNQYKLSNFCFDNLAIKAKKDVKIDEN IIDSLIIRKVEITKVN >gi|225935362|gb|ACGA01000030.1| GENE 45 65441 - 65875 300 144 aa, chain - ## HITS:1 COG:no KEGG:BT_4171 NR:ns ## KEGG: BT_4171 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 144 114 217 217 62 33.0 7e-09 MNRKTIINGSLGILWKAIFALFLCIFVASCSKEDVITEETGITGTPRFIWDIAIELEQPN NWCMTLAIEKVSITNLTEGKQYILTWKGGLSTGQKTDGILKTVIRGEQTKKTDLDLLEIK ESGNNTYEIFLRGDGRKGEIVFTK >gi|225935362|gb|ACGA01000030.1| GENE 46 65922 - 67466 706 514 aa, chain - ## HITS:1 COG:no KEGG:BT_4183 NR:ns ## KEGG: BT_4183 # Name: not_defined # Def: pectate lyase L precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 433 3 429 437 621 70.0 
1e-176 MKRMSLFLFFIGCFLLSLSATNYFVATNGNDSNSGTIDKPFATLGKAQSKVMAGDTVYIR QGTYRVTEAEIMEHYTAGSTTWSRVFKMSKSGTGPDKRICYSGYKNERPVFDLSAVKPEN ERVIVFYVNGSYLHFRNMEVVGTQVTIVGHTQSECFRNEGGSDNIYEHLSMHDGMAIGFY IVTGKNNLVLNCDAYNNYDPVSDGGKGGNVDGFGGHLTSPQYTDNVFRGCRAWYNSDDGF DLINCQAVFTIDDCWSFLNGYTKDGGKAGDGTGFKSGGYGMSDNPKAPSMIPMHIVQYCL AYMNKNKGFYANHHLGGIAWHNNTGYQNPSNFCMLNRKTATEAVDVPGYGHIIKNNLSHK PRSSGKHIIDVNQAECKITNNSFLPVEMIVTDDDFVSLDASQLTLPRKSDGSLPYIEFLR PKTNSKLYNAGMGCFLTGSSEDTSYDWLEDAAIFVERDVAKMVGKGADAFVHFYINGKEV SFSDRRVDLSIYNGEIDLRATTDNGVITKLKVIR >gi|225935362|gb|ACGA01000030.1| GENE 47 67981 - 69444 1088 487 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 58 441 33 403 448 147 27.0 4e-35 MKYLLSIVLLALIGFTTPKKTIPAYDWGSVSAEPDLSWADQVGAQRTPENTEWDAGKFGL RNDTSVFSTHAIQAAIDACHQQGGGTVVVAPGYYKIGALFVKSGVNLHLSKGTTLLASEN IQDYPEFPSRIAGIEMTWPSAVINIMDAENAALTGEGFIDCRGKVFWDKYWAMREDYEKK NLRWIVDYDCKRVRGVLVSNSKHITLKDFTLVRTGFWACQILYSDHCSVSGLTINNNVGG HGPSTDGIDIDSSTNILIENCDVDCNDDNICIKAGRDADGLRVNRPTENVVVRNCIARKG AGLLTCGSETSGSIRNVLAHDLTAYGTGTTLRLKSSMNRGGTVENIYMTRVVADSITHIL SVDLNWNPKYSYSTLPKEYEGKEVPEHWTTMLTPVEPKEKGYPNFRNVYFSHVKAGGAKR FISASGWDASHRIENFYLSNINAEVESVGKITYGKNFQLRDIHLKVKDKSRLRQTDNVDS KIEIHYK >gi|225935362|gb|ACGA01000030.1| GENE 48 69701 - 71566 1411 621 aa, chain + ## HITS:1 COG:no KEGG:BVU_0152 NR:ns ## KEGG: BVU_0152 # Name: not_defined # Def: polysaccharide lyase family protein 11, rhamnogalacturonan lyase # Organism: B.vulgatus # Pathway: not_defined # 10 621 22 633 635 985 74.0 0 MSLPLITMTAQPGYNYSKLQREKLNRGVVAIRENPSEVIISWRYLSSDPIQTGFNVYRDG KKLTDTPVTVSTLFRDKNNSQKAAVYEVRPVIKGKETHHIDGTYTLPENAPLGYLEIPLQ KPADGVTPAGDTYTYSPNDASIGDVDGDGEYEIILKWDPSNSHDNSHEGYTGEVYIDCYR MSGEQLWRINLGKNVRAGAHYTQFMVYDLDGDGKAEVVMRTADGTIDGKGKVIGDANADY REEGTFDPGRNQIMKQGRILKGKEYLTVFSGDTGEALHTIDYIPARGNVADWGDSKGNRS 
DRFLACVAYLDGVRPSVVMCRGYYTRTVLAAFDWNGKELKNRWVFDSSHPGCEDYAGQGN HNLRVGDVDGDGCDEIIYGSCAIDHNGKGLYSTRMGHGDAIHLTHFDPSRKGLQVWDCHE NKRDGSTYRDAATGEVLLQIKSNTDVGRCMAADIDPTHPGVEMWSGDSQGIRNVKGEIIA PKMRNMPTNMAVWWDGDLLREMLDRNMVIKYDWENKKFVPLVKFTGTLSNNGTKSNPCLQ GDIIGDWREEVLVRSENNASLRLYVSTIPTEYRFHTFLEEPIYRISIATQNVGYNQPTQP GFYFGPELIKMKGTFRGYQFK >gi|225935362|gb|ACGA01000030.1| GENE 49 71704 - 76041 3243 1445 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 891 1141 52 314 318 142 33.0 5e-33 MKKNTFLILSLLLLYLYPLSASVEIRSNKLTTGDGLANNSTRYMMQDSKGFIWLGTLNGL SYYDGNSFVNIYPDINNPLSLADSRIRSMEEDPNGFMWISTISSFISCYDLRHGCFVDFT GNGEYKENYSKFIILSDNSIWLWHNQNGCRRVVYQDGKFSSQAYKEKSGNLPSDKVQFVL ESAKGEVWIGTDKGLLKYQNGKTNNLDPQQQNWEHIISYDKYTCIITDKNEIYRHTYSTD KLEKVASLAESFDDTGHVTGNFLLQDKWVLFTVNGSYILDIPTGKLSRYSELNIKNGAVT RDNKKNVWVHNYTGNLWYVDITTGKIKPFQFLSSKHIGYIDVERYSIVHDSRDILWISTY GNGLYAYELSTGNLQHFTFEVNQSSHINSNYLQFVIEDRSGGIWVSSEFSGLSHLEILNK GTQRIHPSGENSSDRSNTIRMLFQTQNGNIYMANRMGSLYEYNSALTKVIRQEDFTHNVY SMNEDQEGNLWLGMRGIGLRIGANKWYKNDPRNANSLSNDQVYSIYRDRKGRMWLGTFGG GLNLAIKTADGYQFKHFLQDNYGEKRIRVIEEDKNGRMWVGTNNGIYIFHPDSLIASPKN YVIYNHANGTFPSNEIRCLMNDDKGNMWIGTSGAGFAVCHPGDSYQHLTFDCYDIKDGLS NAVVQSIVQDKDHKMWIATEYGISRFTPATKQIENYFFSSYTLGNVYSENTACVSNDGKL LFGTNYGLVVLDANKIETLDKPASVIFTGLQINGAHMLPGVEDSPLQEAMPYINELNLKY YQSSLTISFSTFNYLDGVSKYSYSLPPYDTEWSSPSSLNFATYRNLPPGKYELHVKACNA AGIWGEENVMEIVIAPPFWKTGWAYLIYAILIAIAGYFSFRIIRNFNALRNRIAVENQLT EYKLEFFTNISHEFRTPLTLIQGALERIVNMGNHSKDMQHSLKIMDKSTKRMLRLINQLL EFRKMQKNKLALSLEETDVVAFCYEIFLSFKDVSESKNMDFRFEPSQNSYKMPVDKGNLD KVIYNLLSNAFKYTPSNGKIIFKVDVQEQKKQLCIQVIDTGLGIPKEKRAELFKRFMQSS FSHSSVGVGLHLTYGLVNIHKGTISYNENEGGGSIFTVELPTDASVYEEKDFLVPNQLLI EEEKQRHKEFVTDENTDEQAAPPVPLNKRKVLIIEDDNDVREFLKEEIGHYFEVVAEADG ISGFERAQTYDADLIICDVLMPGMTGFEVTKKLKNEFATSHIPIILLTALNMEEKYLEGI 
ESGADAYITKPFSISLLLARISKLIEQRDKLREKFSNEPGMVHAAICTNNKDSKFLAKLN EMLNEHMVETEFSVDDYANLMGLGRTVFYKKVRGVTGYSPNEYLRVIRLKKAAELLLTED LTVSEISYKVGINDPYYFSKCFKNQFGIAPSVYQKNGGKAPASNDTAENTEQTSEEKEEG NKTDV >gi|225935362|gb|ACGA01000030.1| GENE 50 76242 - 79205 2643 987 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 21 784 50 871 871 153 24.0 2e-36 MKHKLFLFVFLLTALSVINGQASERKKYNFNSEWKLRIGDFPKAKDAKFDDSKWKQVTLP HAFNEDEAFKLSIEQLTDTVVWYRKNFRIPELKSNQKVFVEFEGVRQRGDFYLNGHYLGR HENGVMAVGFDLTPHIKEGENVIAVRTDNDWMYREEGTNSKFQWNDRNFNANYGGIPKNV FLYVTDNVYQTLPLYSNLKTTGVYVYAQDFDIKGRKATIHAESEVRNDSKATRQFSYQVT VLDADGKLMKTFQSDKVTLKAGETKTVKASAILHNLHFWSWGYGYLYTVKTALKDENNQV FDEVSTRTGFRKTRFAEGKIWLNDRVIQMKGYAQRTSNEWPAAGLSVPAWLSDFSNGLMV KGNANLVRWMHVTPWKQDVESCDRVGLIQAMPAGDAEKDREGRQWEQRVELMRDAIIYNR NNPSILFYECGNKAISREHMIEMKAVRDKYDPFGGRAIGSREMLDIREAEYGGEMLYINR SEHHPMWATEYCRDEGLRKYWDEYSYPFHKEGDGPLYKGQPATDYNRNQDELAITMIARW YDYWRERPGTGNRVSSGGTKIIFSDTNTHYRGAENYRRSGVTDAMRIEKDAFYAHQVMWD GWVDTEKDQTYIIGHWNYPENTVKPVQVVSTGEEVELFLNGNSLGKGKRQYNFLFIFDNV VFKPGKLEAVSYNKAGKEISRYAVNTAGEPASLKLTAIQNPEGFHADGADMALIQVEVVD KDGQRCPLDNRTIQFTLKGQAEWRGGIAQGENNHILDTNLPVECGINRALIRSTTTAGKV TLTAQAKGLPTATLALETVPVKVIEGLSTYLPQATLKGRLDRGETPSTPSYKDSKKGVRI VSAKAGCNNNDAEKSYDDIELTEWKNDGKLSTAWITYTLERDAEIDDICIKLQGWRSRSY PLEVYAGNTLIWSGNTDKSLGYIHLDVEKPVRANTITIRLKGNTSDKDAFGQIIEVEAKA ANKMELEKSSSRNQLRIIEVEFLETIK >gi|225935362|gb|ACGA01000030.1| GENE 51 79337 - 80215 741 292 aa, chain + ## HITS:1 COG:no KEGG:BVU_2332 NR:ns ## KEGG: BVU_2332 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 291 1 291 500 383 64.0 1e-105 MKNLLILFILAFSYNCMAGNNEPAHVIITAGQSNTDGRVPNDRLPDYIKTMATDTAFTTG AYKYCRIAQNRTDGKFRPFWPKSKRRAKPNTWGYDAITYYWLEQLWQEPFYVIKWAIGGT SIEPSNASDKSIHWSANPEWLANNTATSEKGRSLLLSFINDIDGCIDNTLSKLKNGYQID 
AFLWHQGESDHAHGDKYYENLKAVVTYVRNHLSEKTGKDYSNLPFIFGTVAKKNKQYGSE VEAAMKRFAKEDKNAYLIDMSDAELMGDRLHFNQNSAEYLGKQMYEQIKAIR >gi|225935362|gb|ACGA01000030.1| GENE 52 80423 - 83554 2551 1043 aa, chain + ## HITS:1 COG:no KEGG:Slin_6567 NR:ns ## KEGG: Slin_6567 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 27 1043 135 1155 1155 651 38.0 0 MKNILHGNGSLGRRLQYLLLMTLFVVQSTLSFAQNRTVTGVVVDEKDEPIIGANVTVKGT AKGTITDLDGKFVLSAPKSAKLLEVSFIGYNTQTVTITDKILQIKLSESAIALDEVVAIG YGSAKKGNVTGAIAKVNAEKLEDRASTNIASSLQGQLAGVEVRSTTGEPGSELQIRVRGA ASINADATPLYVVDGIPVDDLGSLNPGDIQSIEVLKDASSSAIYGSRGANGVVLITTKQA GTDEKVRVQFQASFGIQSLERKVDVLSPEEWIEFRTAYNNNRYISQYGSKGATIEDDYAT RLAMIGGKESYYFLNDPRWTEPGYGNLKLIDWQDEFFRLAPIQNYQLSLSSGRGNTKYRV SMGYTDQEGIAIESNYKRLNFRANVESKVLNRITVGVNLAPSASWNEGGRVDGKDRQATN VLTMAPVAEPEDGIYTGAGAYGYYRWTSSSKISPVAYMEHVQNKGEKVQLSSSAYVKADI WNGIKAEVTGSYNFSSSQSRSFVPSTITKYRTDVEGYRTTASRTESRSNKFLLQGVLHYD KTFGRHTIGAMAGYSMESSGGSSMRLSATQFPDNSLEVIDMADVVLTAATASLSTPSRLM SYFGRLQYEYDDRYLLTASIRRDGSSRFGKANRWGIFPAVSGAYRISNEKFWPKDFVMNS LKLRASWGANGNNSIPTNTALAKLSSANYSSGSIINGFAPTSLANPDLGWEKTESWNVAF DMGLFNNRIFISADYYVKTTKDLLYQVTVPALLGFTKAWGNIGSIRNKGFELEVTTQNLT GRLKWSTSLNVSYNQNKVLSLGDDNSTVFTGYDSTTQVFMVNQPLRSFYMYDAVGVYQTQ ADLQKYPVMQNSKVGDVRYRDANGDGVINDSDRTLMGKPDPDYTFGITNTFKYKNFDLSF LITAQTGGQIYSVLGRAMDRPGMGANGNVLSHWKNMWKSEAEPGDGKTPGIDNANTGQFY DSRWLYSTDFIKLKNVTLGYRIPFKKKFIQNARVYLTGENLLMWDKYEGGFSPEANNGGS TGDYDYGSYPQARVITLGVNVTF >gi|225935362|gb|ACGA01000030.1| GENE 53 83574 - 85178 1679 534 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1583 NR:ns ## KEGG: Dfer_1583 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 534 1 478 479 182 30.0 2e-44 MKLYKYLISGGLALTLALSSCESWLEVEPNDTRTTDYFYSTPSEMEQAIIGIYNGLLPIS DYSWLMSEVRSDNAWVDKTTDKQRDYIDIGTFNPNISTISTLSGAWNNLYEIIARANMFL SKTDGVTFSSEAIKNQFIGEAKFLRALAYFDLVRYFGRIPIVLQPVSVNEAMTIKQSEPV EVYETAIVPDLEDAVKKLVDTPLNYMGNSASAGRATQVAAKSLLGRVYLTMAGYPVQDAS 
KKALAEELFSEVIDYSFANNKYWASTADEWIKIWISDNDNKYHIFEIQYIAAKNYGNPMV FNSVPSVNDSYTKIQMSGNRIWCENQLDDILKQTDETGVFIDKRCVGTINTNEFVDEDGT PYTGGDFFLKFFEHKMKRKLLGYEDIDDQIADRTYFPINYPLIRLEDVMLMYAEIAGPTG KGKEMVNKIRTRAGLDALDDDITIENFADSVDIERRRELASEGIRWHDLVRQNRYVSVLQ DMFKRNGSDANGVITKPETYELYKLVTKDSYIYPIPDSQMKVKEGLYEQNKGYN >gi|225935362|gb|ACGA01000030.1| GENE 54 85562 - 86473 541 303 aa, chain - ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 213 296 23 106 130 65 36.0 2e-10 MGITRIDTVQQYCDLFGIEALHPLISVVNCYEMQPIRHSKRLYNVYAVLLKDTDCGTMNY GRSLYDYEKGTVLFASPGQVMGADDDGQLHQPAGWALVFHPELLRGTSVARMMKDYTYFS YNASEALHLSEHERKTFIDCIERLQEELKYPVDKHSRSLIIDNVKLLLDQCIRFYDRQFI TREPMNSDLLARFEALLDAYYNSKEPAMSGIPTVQYCADKLCLSTNYFSDLVKKETGISA IKHIRQKIMDIAKERIFNTDKSISQISDEMGFQYPQHFTRWFKKIEGCTPNEYRNSIKTQ AIN >gi|225935362|gb|ACGA01000030.1| GENE 55 86710 - 87087 442 125 aa, chain + ## HITS:1 COG:XF1766 KEGG:ns NR:ns ## COG: XF1766 COG0599 # Protein_GI_number: 15838367 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Xylella fastidiosa 9a5c # 27 116 50 140 260 77 42.0 8e-15 MKQLFLTIACLTLSTSTVWAQNEKGMLSLKERYIVAISMYTAQGNTGCLKTALDEGLNAG LTVNEEKEVMLQLYAYCGFPRSMNALATLTNLMKERTAKGIKDNLGLEPTPITATDRATF GERTD >gi|225935362|gb|ACGA01000030.1| GENE 56 87147 - 88316 1034 389 aa, chain + ## HITS:1 COG:MA0419 KEGG:ns NR:ns ## COG: MA0419 COG1073 # Protein_GI_number: 20089312 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 80 389 14 327 327 356 54.0 5e-98 MLKSHLFGDIFGRDNLDWKTRELVTVAAVSVMDKLDNELRIHIAHAKYNGVTDAQINEVL TMAARCRNDMVLIEEQEPPKTFKADPAITVRKFFYKNRYDITLCAEVYLPADMDETRKYA ALIVGHPFGAVKEQCSGLYAQEMAKRGFVTLAFDASYQGESGGEPRHTVSPDALVEDFSA 
SVDWLGLQAYVDRNRIGVIGICGSGGFAVCAASLDPRIHALATVSMYDMGRATRNGLGNA LTDEQRKSMFAEVAQQRWKEAEGAEPRIRFGTPEQLPENASAVQKEFFDYYRNPQRGQHP RYQGTRYTSLAALTNFYPFAQIKEISPRPVLFITGENAHSRYFSEYAYKQANEPKELYIV PGANHVDLYDQMDKIPFDKLTEFFEEHLK >gi|225935362|gb|ACGA01000030.1| GENE 57 88416 - 89600 939 394 aa, chain + ## HITS:1 COG:TM1006 KEGG:ns NR:ns ## COG: TM1006 COG0667 # Protein_GI_number: 15643766 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 69 384 9 323 333 312 49.0 6e-85 MKKEQIQENGMDRRGFLKRTALAGAALCIVPALEKVNAAEKVISDRQKSLNTGMPSLAAL HDTRTLGSGKAAFQVSVMGFGCMGLNHNRSWHPDRKQEIALIHEAVERGVTLFDTAESYG YHVNEKLVGEALKGYTDRVFVSSKFGHKFVNGVQVRTEENSTPQNIRQVCENSLRNLGVE TLGIFYQHRIDPNTPIEVVAETCGELIKEGKILHWGMCEVNVNTIRSAHKVCPVTVIQSE YHFMHHSVEENGVLALCEELGIGFVPYSPLNRGFLGGMINEYTQFDPKNDNRQTLPRFQP DTIRANYRIVEVLNAFGRTRGITPAQVALAWLINKRPFIVPIPGTTKLSHLEENLRAADI RFTPEEMQELEAAVAAIPVVGSRYDALQESKIQK >gi|225935362|gb|ACGA01000030.1| GENE 58 89607 - 90632 550 341 aa, chain + ## HITS:1 COG:no KEGG:BT_1114 NR:ns ## KEGG: BT_1114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 342 343 540 74.0 1e-152 MKTAIVISMLVALVSCHNGGNTLQIAGQGSFSVGGTVLTDSAGRQYHGDHAYVFYQKPVD ARRYPLVFAHGVGQFSKTWETTPDGREGFQNIFLRKGFSTYLADQPRRGNAGRSTDTATI APRFDEEEWFNRFRIGIWPDYFKGVQFKQDEETLNQFFRQMTPNIGSKNLDVYADAYAAL FDKIGKAIFMVHSQGGAVAWQTVPKTRNIKAIIALEPGGNVPFPEGMLPEEGKIETHSKQ NTEGVEVPADIFKEFTRIPIIIYYGDNLPETDEYPELYEWTRRLYLMRMWAKIVNEMGGD VTVIHLPEIGLHGNTHFIMSDLNNMEVAAHISKWLHEKGLD >gi|225935362|gb|ACGA01000030.1| GENE 59 90634 - 90807 133 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171698|ref|ZP_05758110.1| ## NR: gi|260171698|ref|ZP_05758110.1| hypothetical protein BacD2_07523 [Bacteroides sp. 
D2] # 1 57 1 57 57 92 100.0 6e-18 MNQQLWTDRDMVETEVVEPYLSIYNEVTALVRQELGNNIRPALRTHVENIDELQDKQ >gi|225935362|gb|ACGA01000030.1| GENE 60 90917 - 91690 777 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171699|ref|ZP_05758111.1| ## NR: gi|260171699|ref|ZP_05758111.1| hypothetical protein BacD2_07528 [Bacteroides sp. D2] # 1 257 1 257 257 488 100.0 1e-136 MKKVMVSMALVVLALGSQAQVLKNDLLKGYKEGDRLEKAVYDDKRAPINVDTWCGAFTAT SVEGVVSPTVGKALTYAGYNETGPSINFGFPQGVKGSRVSVYSLVESGKVYSKGTYYLAC LVNFAKVGSSNLADILAASASYVGGGNRGQVYVRREGNDKIKFAVGLMKERNEAPMVYDY NTTHLLVLKVDYDKNEVSLFVDPELSQNEPKADAVVAGEEGALKAGLKAISFRNRSGFKG NIGNFRFAKDWAGSIGK >gi|225935362|gb|ACGA01000030.1| GENE 61 91758 - 93044 1285 428 aa, chain - ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 175 388 4 217 232 175 41.0 2e-43 MMMKRILFLISLLSVTSCLLVRAQVYKFDFTPDKKVEEGYIKVTPETLFNNEQGFGYDLQ PAWDGKSNQPFFFSVNVPDGNYKVTVTLGSKDSAGNTTVRAESRRLFIENLPTKKGEMIT ESFTVNKRNMKISEREKVKIKAREKNKLNWDEKLTLEFNGDAPRMTSLVIERVDKVTTIF LCGNSTVVDYDNEPWAAWGQMFPRWFTDQVAIANYAESGESANTFIGAGRLKKALTQMKK GDYLFMEFGHNDQKQKGPGKGAFYSFMYNLKIYIDEARSRGAQPVLITPTQRRRFDKNGK IVNTHLDYPDAIRWLAARENVPLIDLNEISRTLYEAMGVDGSEHAFVHYPANTYPGQTKE LKDNSHSNTFGAYELAKCIIEGMKKADLQVVKFLRKDYNGFNPKHPDNFEDFKWSLCPFT EIEKPDGN >gi|225935362|gb|ACGA01000030.1| GENE 62 93060 - 94664 1190 534 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 14 189 13 202 568 83 29.0 2e-14 MKQYKFMQAALLTILLCGWTGCSQNEEEVPGNVRDGIVLNVTDTGLINNEPSTRTEDTGF VTTFTQGDQIGLFAVKDGAILDEINNMPFTFNGSSWSGKPILYDDRLAGVTFYAYYPYQP EMTEKIDLASDDFFAPLVAGWELTTEQSDQKAYAKQDLMTSNATALIGDNGNYTLSFQLT HRMSLLVVKLPSTRYIFTDAEGVAMPEATPYVAMPVNVAFYLDNVEEGTKISPYYDAKKD EYRLLRKPSSENQIIGHYNDKQCTFDTAEKMKEGRYKRFVVDGGYKEVTHHLQVGDYYYA 
DGSVVSGNEAEPAKENCIGIVCWVGNPMPSVLYKDVAGTPYTANNDALLRSHPNCVHGLV MSLYTETGKFSPALTQSIHDWFMTTSFTSSYVSVTGYYDVNENNKNKPLRFLGYNNSEVL DLYYDTFKTDFECFRYQDDCESSFPAPSITTGWYVPSSGELVALQDKDNSLESKVNSKLI KVSDKTMDMSATYWSSTERNNKNMYIVTYSKTAGSAGTGGVKTNTYTYRFFLGF >gi|225935362|gb|ACGA01000030.1| GENE 63 94686 - 95576 738 296 aa, chain - ## HITS:1 COG:no KEGG:BF3838 NR:ns ## KEGG: BF3838 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 283 17 321 337 78 27.0 3e-13 MKRNELWFGGLLAVAGFFGACTNVAEDEFQNNQYPMELMAEKSAMTFTHSIGEKTEWEGG ESVSLYDGFSQKLYEVSQTGEMSAKNNDPLYWYTTTEEETLFAWYPSSETLLTEWTVASD QTIEEAYKNSDFLYAYEKMKFRSERKLKFHHGTVKLVINVKGDGDTVSEEDLKDIVFTVS AVTKGTMAEGKLQSAAGVEATVMKPYLLENAREGYAYSFQLLMIPQDMSGKLFFKVTLKD GRNFGHMPGNGEGILEGGHQYTYNVGVGKPGLKVTIEKDSVSWEGDDEEVEGTDRE >gi|225935362|gb|ACGA01000030.1| GENE 64 95608 - 96558 1056 316 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171703|ref|ZP_05758115.1| ## NR: gi|260171703|ref|ZP_05758115.1| hypothetical protein BacD2_07548 [Bacteroides sp. 
D2] # 1 316 5 320 320 587 100.0 1e-166 MKNMYLAAAMVVSLAFSACSDEESGKADRAYKPIEIFGNINEVVSNVQETRAVGAAWGSD DRIGVTVEADEDNATANAVDTYINIQYRNETGGSFRVVNEGSTDNNIRLKGEGEFTLSAY YPYQGANGTLPGTEGVIVKTISGADQATDKQPQIDFLFAKATGVHAESPVTFDFSHKMTK IILKFKATNGATLNNMKVYLKSLQLEGSFNVTTGEAVAKSGATPNSELSMDIAKPAEGEM TASIILFPQDMPEKVLLEVKMNDETYTQYMPVQNLESGHAYPYNVTFENPAMTITKAEIE DWIVEDDKDVTASVTE >gi|225935362|gb|ACGA01000030.1| GENE 65 96731 - 97867 1077 378 aa, chain - ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 65 377 47 351 352 180 35.0 3e-45 MYMRNKLVLLFFFILTVAVSVNAQTLPDQKETLEVMKKVNGYFMKKYADYTIPSFYGRVR PSNIWTRGVYYEGLMALYSIYPREDYYKYAYDWADFHKWGMRNGNTTRNADDHCCGQAYI DLYKICPSDPNMIRNIKASIDMVVNTPQVNDWWWIDAVQMAMPIFAKFGKMTGEQKYYDK MWDMYYYTRSQHDETGMFNQKDGLWWRDQDFNPPYKEPNGEDCYWSRGNGWVYAALVRVL DEIPSDEKHRQDYINDFLTMSKALKKCQREDGFWNVSLHDPTNFGGKETSGTALFVYGMA WGVRNVLLDRKEYLPILLKAWNAMVKDAVHPNGFLGYVQGTGKEPKDGQPVTFKSIPDFE DYGVGCFLLAGTEVYKLK >gi|225935362|gb|ACGA01000030.1| GENE 66 97851 - 98843 979 330 aa, chain - ## HITS:1 COG:no KEGG:BT_4179 NR:ns ## KEGG: BT_4179 # Name: not_defined # Def: polysaccharide deacetylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 320 246 541 541 475 75.0 1e-133 MKTLSLFILTSFFYAFGINTLSAADWKVYVAKYKQDKTCAITYTFDDGLAEHSTVAAPEL EKRGFRGTFWVCGYYTEQGVSSKLPRMTWNELKKMAKNGHEISNHSWSHQNAKRLTLEQV KSEIEKNDSAIFANIGVMPVTYCYPYNYKTDEIVALASKNRVGTRIKQISIGGKSTPQRF AKWLDDLMKKEEWGVGMTHGINYGYDAFKDPSLFWAHLDKVKSLEDKIWVGTFREVSAYI KERDDIQLNVSIHKRGMTITPEMTLDKKMYTEPLTMVLVGEAVEKVSVKQGKKQLSAHIA GDKVLFDFNPYAGKIKVSFNYKQSKNVHEK >gi|225935362|gb|ACGA01000030.1| GENE 67 99010 - 103293 3020 1427 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: 
Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 899 1119 8 227 294 132 39.0 4e-30 MTQRHLLVFTFLVSFVLNAYSAIELRSTQMRTSDGLPNNSIRYIYQDSKGFLWLATLNGL SRYDGNSFLTFRPEAGDKVSLADNRIYDLTEDKDGFLWISTTPELYSCYDLQRARFVDYT GCGELRQNYSTVFVAANGDVWLSHQGNGCRRMVHQKNGEMTSTVFRTECGNLPDNRVKFV NEDTSGRIWIGTQCGLVSVSNGQYRIEDRLIHFTSSLAYKDDMYFLTADGDIYYYHSATQ KMQKLAALSTVAGQTSPTGNFLLKDKWMILTTTGVYTYDFTTGEVAADPRLNIKKGELIR DNHGDYWIYNHTGHLTYVVAATGESKDFQLIPQDKIGYIDFERYHIVHDSRGIIWISTYG NGLFAYNIAEDKLEHFVANINDQSHISSDFLQYVMEDRAGGIWVASEYSGLSRISVLNEG TSRIYPESRELFDRSNTIRMLTKMSNGDICVGTRKGGLYTFDANLQSKMTNQYFHSNIYA IAEDRQGRMWTGTRGNGLKVGDTWYYNTPSDPTSLSDNNVFAIYRDRKERMWVGTFGGGL ELAEPTSDGKYKFRHFFQQTFGMRMVRVIEEDENGMVWVGTSEGICIFHPDSLIADGDNY HLFSYTNGKFCSNEIKCIYRDTKGRMWIGTSGSGLNLCTPQDDYHSLKYEHYGTSEGLVN DVIQSVLGDKKGNLWVATEYGISKFNPSTHSFENYFFSSYTLGNVYSENSACMREDGKLL FGTNYGLIVIDPEKIQDSETFSPVVFTDLYVNGTQMNPQMEDSPLKQSLAYSDEITLKYF QNSFLIDFSTFDYSDSGHTKYMYWLENYDQGWSTPSPLNFASFKYLNPGTYILHVKSSNG SGIWNDSETTLKIVIVPPFWKTTWAMLCYVLLLIVALYFAFRIVRNFNGLRNRINVEKQL TEYKLVFFTNISHEFRTPLTLIQGALEKIQRVTDIPRELIYPLKTMDKSTQRMLRLINQL LEFRKMQNNKLALSLEETDVIAFLYEIFLSFGDVAEQKNMNFRFLPSVPSYKMFIDKGNL DKVTYNLLSNAFKYTPSNGTIILSVNVDEGKQTLQIQVSDTGVGIPKEKQNELFKRFMQS NFSGDSIGVGLHLSHELVQVHKGTIEYKDNEGGGSVFTVCIPTDKTVYSEKDFLVPGNVL LKEADGHAHHLLQLSEELPDPEKMATPLNKRKVLIIEDDNDIREFLREEIGAYFEVEVAA DGTSGFEKARTYDADLIICDVLMPGMTGFEVTKKLKTDFDTSHIPIILLTALNSPEKHLE GIEAGADAYIAKPFSVKLLLARVFRLIEQRDKLREKFSSEPGIVRPAMCTTDRDKEFADR LAAILEQNLARPEFSIDEFAQLMKLGRTVFYRKLRGVTGYSPNEYLRVVRMKKAAELLLS EDNLTVAEVSYKVGISDPFYFSKCFKAQFGVAPSVYQRGVNSEGDSM >gi|225935362|gb|ACGA01000030.1| GENE 68 103446 - 103760 334 104 aa, chain - ## HITS:1 COG:mll5702 KEGG:ns NR:ns ## COG: mll5702 COG3254 # Protein_GI_number: 13474745 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 2 103 3 104 105 110 53.0 5e-25 MKREAFKMYLKPGCEAEYEKRHAAIWPELKALLSQNGVSDYSIYWDKETNILFAFQKTEG GAGSQDLGNTEIVQKWWDYMADIMEVNPDNSPISIPLPEVFHMD 
>gi|225935362|gb|ACGA01000030.1| GENE 69 103779 - 106979 2852 1066 aa, chain - ## HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 816 1009 164 357 379 99 31.0 5e-20 MKCISAFFSLCLVAAFVVAQPNYDFSKLKREHLGRGVIAIRENPSTVAVSWRYLSSDPMD ESFDVYRDGEKINKHPVRNATFFQDIYKGTNSVLYTVKAIKSKTESSYQLPSDAPAGYLN IPLNRPENGTTPAGQSYYYAPNDASIGDVDGDGEYEIILKWDPSNAHDNSHDGYTGEVYF DCYKLSGKLLWRINLGRNIRAGAHYTQFMVFDLDTDGKAEVVMKTADGTVDGKGKVIGDA QADYRNEQGRILTGPEYLTVFNGLTGEAMQTIDYVPGRGNLMDWGDNRGNRSDRFLACVA YLDGIHPSVVMCRGYYTRTVLAAYDWNGKELKERWMFDSNHPGCEDYAGQGNHNLRVGDV DGDGCDEIIYGSCAIDHNGKGLYTTKMGHGDAIHLTHFDPSRKGLQVWDCHENKRDGSTY RDAATGEILFQIKDSTDVGRCMAADIDPTQPGVEMWSLASGGIRNVKGEVVKARVRGLSC NMAVWWDGDLLRELLDRNIVSKYNWEKGICERIAIFEGALSNNGTKATPCLQGDIVGDWR EEVLLRTADNTALRLYVSTIPTDYRFHTFLEDPVYRISIATQNVAYNQPTQPGFYFGPDL QGTIFRGCKIPLPVSAQKKKTVVNDSNTPLHLLQPAYQGTYGDLTPEQVKKDVDRVFAYI DKETPARVVDKNTGKVITNYATMGEEAQLERGAFRLASYEWGVTYSALIAASEATGDTRY MDYVQNRFRFLAEVAPHFKRVYKEKGTTDPQLLQILTPHALDDAGAVCAAMMKVRLKDPS LPVDELIRNYFDFIIHKEYRLADGTFARNRPQRNTLWLDDMFMGIPAVAQMSCYDKEQKD KYLAEAVRQFLQFADRMFIPEKGLYRHGWVESSTDHPAFCWARANGWAMLTACELLDVLP EDYPQRAKVMDYFRAHVRGVTALQSGEGFWHQLLDRNDSYLETSATAIYVYCLAHAINKG WIDAIAYGPVAHLGWHAVAGKINAEGQVEGTCVGTGMAFDPAFYYYRPVNVYAAHGYGPV LWAGAEMIRLLKKQYPQMNDSAVQYYQVKQKTTAPIFAIDTEEKKD >gi|225935362|gb|ACGA01000030.1| GENE 70 107188 - 108363 912 391 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 78 390 47 351 352 181 34.0 2e-45 MNTQLIKQNYFTMKKSLLSFFTVTLLCLMTGKPVVAQELPAQKETLETIVKVNDYFMKKY 
ADYTLPSFYGRVRPSNIWTRGVYYEGLMALYGIYPREDYYKYAYDWADFHKWGMRNGNTT RNADDHCCGQVYIDLYNMCPSNPNMIRNIKASIDMVVNTPQVNDWWWIDAIQMAMPIYAK FGKMTGEQKYYDKMWDMYSYTRNVHGEAGMYNPKDCLWWRDQDFDPPYKEPNGEDCYWSR GNGWVYAALVRVLDEIPANETHRQDYINDFLAMSKALKKCQREDGFWNVSLHDPTNFGGK ETSGTALFVYGMAWGVRNGLLDRKEYFPVLLKAWNAMVKDAVHPNGFLGYVQGTGKEPKD GQPVTYKSVPDFEDYGVGCFLLAGTEVYKLK >gi|225935362|gb|ACGA01000030.1| GENE 71 108494 - 110128 1618 544 aa, chain + ## HITS:1 COG:BH3963_2 KEGG:ns NR:ns ## COG: BH3963_2 COG2755 # Protein_GI_number: 15616525 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus halodurans # 315 543 12 233 294 98 26.0 3e-20 MKSKLFLYICWLIAITFSLQLQAQNKVSVPMEDVNQVIDNTLDSLNKARTSRPEAGSSRK GNNPVLFLVGNSTMRTGTLGNGNNGQWGWGYYAGDYFDSNRITVENHALGGTSSRTFYNR LWPDVIKGVRPGDWVIIELGHNDNGPYDSGRARASIPGTGKDTLNVTIKETGVKETVYTY GEYMRRFIHDVKAKGAHPILFSLTPRNAWEDKDSTIITRVNKTFGLWAKQVAEEQHVPFI DLNDISARKFEKFGKNKVKYMFYIDRIHTSAFGAKVNAESAADGIRAYEGLELANYLKPI EKDTVTGSSRKDGRPVLFTIGDSTVRNEDKDKNGMWGWGSVIADEFNLNKISVENRAMAG RSARTFLDEGRWDKVYNALQPGDFVLIQFGHNDAGDINVGKARAELRGSGDESKVFLMEK TGKYQVIYTFGWYLRKFIIDVQEKGAIPIVLSHTPRNKWKDGKIERNTESFGKWTREAAE ATGAYFIDLNKISADKLEKKGVKKAAAYYNHDHTHTSLKGAHMNAKSIAEGLKKNDCPLK NYLK >gi|225935362|gb|ACGA01000030.1| GENE 72 110447 - 113818 2344 1123 aa, chain - ## HITS:1 COG:no KEGG:BVU_0159 NR:ns ## KEGG: BVU_0159 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 1122 25 1103 1106 1866 79.0 0 MNAQDNQVSGLNARQFHKYWKIESESPDYKVTFVGDTAEIVSPKGLTLWRKEKMSGRVTI EYDACVVVEREGDRLSDLNCFWMASDPEYPDNIWKREKWRNGIFQNCYSLQLYYMGYGGN YNSTTRFRRYDGNEAAIMEAGLRPSILKEYTDAEHLLKANHWYHIKITNENNRVSYYIDG KRLVDFRDAELLTEGWFGFRTTLSRTRITNFHYEYSPHQTSTVPLHWIGDTPEQDKAVSF GIPFDEGDVFPATPLRLTTSRNQEIPVDTWPLAYWPDGSVKWSGVAGVIPAGTGRLTLEK APQKVKTINKQPNASIAITETPENIQIETGVLSVFIPRRGDFLMDSLLYKGTKVGEKARL ICSTQSEPIQENTSQISFTRYIGEIKSVTIERFGSVRALVKLEGVHRNRNREIETSHAEN NQVSHSKGNQVNHSDENSLNNREWLPFVVRLYFYGGSEQVKMVHSFVYDGDQKKDFIRSL 
GIRFDVPMREALYNRHIAFSCADGGVWSEPVQPLIGRRMLTLNKADNKKNSHEKNDTKLI PTDEPSLQQQQMEGKRIPPYEFFDEKNRSLLGNWASWNDYRLSQLTADAFSIRKRANNDN PWIGTFSGTRSDGYIFVGDITGGLGLCMHDFWQSYPSSIEISDARTPVATLTAWLWSPES EPMDLRHYDRIAHDLNASYEDVQEGMSTPYGIARTTTFTLIPQGGYTGKKAFADYAKQFS NPSLLMPTPNYLHARQAFGIWSLPDRSTPFRTCVEDRLDAYIDFYQKAIEQNKWYGFWNY GDVMHAYDPVRHTWRYDVGGFAWDNTELASNMWLWYNFLRTGRIDIWRMAEAMTRHTGEV DVYHVGPNAGLGSRHNVSHWGCGAKEARISQAAWNRFYYYLTTDERCGDLMTEVKDSDHK LYELDPMRLAQPRSEYPCTAPARLRIGPDWLAYAGNWMTEWERTGNTMYRDKIIAGMKSI AALPNRLFTGPKALGFDPATGIVTTECDPKLETTNHLMTIMGGFEIANEMMRMIDIPEWK DAWLDHAACYKKKAWELSHSRFRISRLMAYAAYHLRDRQMAEEAWKDLFTRLEHTPAPSF RITTILPPEVPAPLDECTSISTNDAALWSLDAIYMQEVIPMDN >gi|225935362|gb|ACGA01000030.1| GENE 73 114050 - 116161 1564 703 aa, chain - ## HITS:1 COG:no KEGG:Ndas_0923 NR:ns ## KEGG: Ndas_0923 # Name: not_defined # Def: cellulose-binding family II # Organism: N.dassonvillei # Pathway: not_defined # 69 695 180 732 746 347 35.0 8e-94 MKKYYILWGLFGLLAVTACSDNDPVVEVNNGNQTEEELPPLPTEVITGSRAMWVSYDPTP NADTNNSSGISSALISWRLLKTDPSNVAFDIYKSVDGGAEEKLNENPIASTTSWFDADID VTKTNIYRVTLTNQTETLCDFTFAPEMATKFYREISLNMNVPDASVTYSPDDIQVGDLDG DGELEIVVKREPYDGANQGVWFNGTTLLEAYKMDGTFLWQIDMGINIRSGSHYTSYILYD FDGDGLCEIAFRSSEGTKFADGKIITDAYGKVNDYRIRQSDAKGWYSGAAIQRDPNDAST ATTCGLILEGPEYISICRGYDGREITRIDNIPRGGEGSKTSRAKYWSEYWGDDFGNRMDR FFIGVAYLDGIPDETTGMRTSNPSLIISRGIYHNWQVWALDLRGSELVPRWKFDTKDHSS KWLGMCSHCFRVADLDGDGKDEILYGSAAIDDDGSELWCTGNGHGDCLYVGKFIKDRSGL QIVASFEEPSNYNGQGHGYACQVIDARDGSLITGHGAGSTTDVGRCIVADIDPDSPNFEY WSSLNSGVYSCGDGRLVSSTYPTGIGSGIMYNVAIYWSGQPTREMLDRACIVSYKENPDV NKTNKTRLVYFGAYGSNDGNHSTKYNPCYYGDFLGDYREEVILGSSDGKSLYIFSTNHPT EFRLPHLMTDHNYDMSQAMQNMGYNQGTNLGYYVGAETLKKAE >gi|225935362|gb|ACGA01000030.1| GENE 74 116235 - 117962 1397 575 aa, chain - ## HITS:1 COG:no KEGG:PRU_2229 NR:ns ## KEGG: PRU_2229 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 2 575 3 559 559 409 42.0 1e-112 MKLNKLIYGVLSCFFLFSCADQMEYKEYSNYDADYVKRTFGDVGGLVANIYLGLDTDFGN 
YSGAILGSATDESEYAYSGNQIEDFYNGSWSPTNAKSSMWTTCYTQIANCNLYLDEFTGL TFSDYELISDYKQEMFRYNNYQYEVRFLRAYYYFYLARQYGAVPFTDHVLTTGEVNSLER RPAEEIFDYVISECDAIKDLIVEDYGKLGDMAPSGESLETGRANKRAVLALKARAALYAA SPLFNTNNDVELWHRAATANKELIDNCEAAKMKLIDDYSALWSASSYSDALDELIFGRRA TRTTNSFEGYNFPIGLENCKGGNCPTQTLVDAYEMQATGLRPDETADYDSSQPYYEGRDP RFYLTIAKNGDEKWPNWNTTPLQTYQGGVNAEPLSGGTPTSYYLKKYCQTAVDLRSSTAS TTYHTWITFRLGEFYLNYAEALFKYLELSGHEKAADITTDEFTMSAAAAVKKVRDRAKMP GFPTGMANDAFWKKYQNERMVELAFEGHRFWDVRRWKEGDKHFKSITEMKISKNADTYTY IRKTVNRQWDDKMYLFPIPQSERSKNPNLMQNPGW >gi|225935362|gb|ACGA01000030.1| GENE 75 117985 - 121032 2921 1015 aa, chain - ## HITS:1 COG:no KEGG:BT_2820 NR:ns ## KEGG: BT_2820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 73 1015 5 926 926 523 35.0 1e-146 MRKNKILILALLACLCIPVGAQNTDNGNITGVVVDKWGNPVYGASVYVVDAPDNRVETDK DGKFEIAAEAGKNLQVVTVDKGSAIVEAAAGKTMTIRMDYAGQAIDVGANRTFSRHESTA AVSTVYNEDFNKRASKNVSNSLFGQGAGLTTLQNAGNYASVEPTFYVRGLQSLSSSSPLV LVDGLERDMSLVTPEEVESVSILKDAAAVALYGYKGINGAILITTKRGKYNAKEITFTYD HVTNFQSRRPKFVDAATYAEAVNEARGYEGLSARYTPEEVDAFRYGAGNGGASHLYPYIY PNVNWMDETFKDTGISNKYTIEFRGGGSKFRYYTMVGLLTDKGFVKSPNENDGYSTQNKY SRANLRTNLDIDLTSTTKLKLNILGTLSESSRPGNSVDLWDMIYSLPSAAFPVKLEDGTW GGSTTWAGTSNPVAQSQGAAYSKGHTRNLYADLTLSQDLSGFLKGLGANFRLSYDNYSSI WENHSKTYAYGGYTTSWSDNGPFYTYISGGAPGEMLKDANTNDWARQFNFAGSLDYGRSF GKYDVYSQLKWDYEYRDTYGLNTTIYRQNVSWYTHLGYDKRYYLDLALVGSQSSLLAPGH KWAFSPTVSAAWVLSEEDFLKDVSWIDFLKLRASFGVINADYLPKDGSTTINNYWDQIYT TTGTMYRFDEGYGSAFGSTYISRLATANSTHEKAFKYNVGIDATLFKGLNLTVDGYYQRR KDIWVASSGQYSEVLGMDAPFENAGIVDSWGTEIGLDYTKRMGDVVFNVGGNLAWNRNEI KEQLEEPRIYNNLVQTGNRLNQVYGLIAEGFFKDEKDIADSPTQNFSTVVPGDIKYRDVN GDGIIDSNDQTAIGYSTAAPEIYYSFHLGAEWKGLGFDAMFQGTANYSAVLNTKSMFWPL INNTTLSNHYYENRWTPENQNAKYPRLSSQSNANNYQTNTIWLADRSFLKLRNLEVYYKF PKALLQKTKILGSAKLYVRGVDLLCFDKIDVADPESYGATNPLNKSVIVGLTIGF >gi|225935362|gb|ACGA01000030.1| GENE 76 121074 - 123092 1517 672 aa, chain - ## HITS:1 COG:no KEGG:PRU_2227 NR:ns ## KEGG: PRU_2227 # Name: not_defined # Def: putative 
lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 672 4 676 676 328 34.0 6e-88 MKQKYKWLTGAVMGVAVSLASTSCVDEIKFGNAFLDKAPGGSATIDTVFNSAAYTRQFLN TCYSRQYYGLPYVNDSSGDLPDSSNPYLGKFEALTDCWQLHYTSTTIYGSYYSGSHTSNY DLRGDVFGYTREMVWQVVRWCWLLIENIDRVPGMDDTEKRKLVAEAKCLIASRYFDMFRH YGGLPLVYASFTGTESSYSMPRATVEGTVNYMISLLDDAINSGGLEWAYTGADATTETGH WTKAGAMALKCKIWQFAASPLFNANQGFAGGGSEAERQNLVWYGSYRSDLWDNCLKACEE FFKAVDANGTYRLNTASGTNPTPAEYRYAYRMGYIKLDSPEVLHSVRVTTTDAFNSSTYC WHSWSDNGRNSYTPTQEYVEMFPWSDGTPFDWEKTKSEGKLDEMFLTGTFKPGEQMLSDI VLTRDPRLYESVIVNGLPGNLGWSSVSMGGDPYELWVGGTHAGTNSQNETMRYATGYDNM KYYLGDNDYLRQYTQWVALRLSDIYLTYAEALLQANNDARGAIEQVNIVRSRVGLLGLVE CNPDKNLLTDKDALLEEILRERACELGMEDSRFFDLVRYKRADRFEKRLHGLLIYRLGED GSRILTAWRDGTGSNSKLGGALQPVNFEYVPFELKNPVRHWWTHGFDSKWYLSPFPLAEI NKGYGLVQNPGW >gi|225935362|gb|ACGA01000030.1| GENE 77 123115 - 126309 3054 1064 aa, chain - ## HITS:1 COG:no KEGG:BT_2818 NR:ns ## KEGG: BT_2818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 1064 32 1057 1057 714 41.0 0 MNKKNFIKGTALPVALFFSMVTLAPALGFQPASAAIDVVQQKGVIKGTVVDTQGEPVIGA NVVAQGTSVGTITDIDGVFRLNVSPGTKLRISFIGYSDQTVTAKNDMRVVLAEDATALQE VEIVAYGVQKKITMTGAIASVKSEALTRTSVGSVSNVLGGQMTGLTTVQYSGEPGSDAAN IFIRGKATFTKDGASPLIQVDGVERNMNEVDPNEIESITILKDASATAVFGVRGANGVVL ITTKRGKEGKAKISFTTSASILAPTKMIEQANSYQYATFYNAMNKNDGVEQVFSDAVIQK FRDGSDPIRFPSIQWTDYIMGDATLQSQHNMNISGGTDKVRYFISAGAYTQGGLFNEFSL PYNISYQYRRFNYRSNLDIDVTKTTMLSFNISGNVNSSDKPRTSQGSSGMVKNIYYSTPF KSAGFVDDKLVYTTAETYDDGLQLPFVGDTDPFSYYGGGSTQTSNNSLNADLILDQKLDF LTKGLTFKLKGSYNSSFTVYKYASGGTEMTYNPAYLSDGTVGLKPIDGSKYTDVSYSSGT GKARDWYMEAGFNYNRTFGNHSVGALLLYNQSKSYYVAGTYSDIPRGYVGLVGRVTYDWK NRYMAEFNIGYNGSENFAPDKRFAPFPAGSVGWIMSEEKWFKSLKPYVSFLKLRASLGLV GNDKIGGERFMYTADPYNVNLNNLANRIPNSSDNANAWGYVYGVDLGTVSMGARELAKNN PDVSWEKAFKRNYGIDINFFDDRLSTTFEYYREHRWDILVRDGTAPGMLGFTTPYSNLGE VNNWGWELSLKWNDKIGKNFRYWAGINLSYNQNEILEMKESPKDYGYQYQQGHRIGARSQ YVFWRYYDEQTPELYQKTFNRPFPTHSVTLQNGDAVYVDLNGDRTIDGNDMSYDYGFTDD 
PEYMVGLNFGFAWKNWELSTQWTGAWNVSRMISDVFRQPFVSSSGNTAGGLLAYHLANTW TPENPSQSAEYPRATQVNATNNYATSTLFEKDSKYLRLKTLQVAYNFQFPLMKKLGLTTC QLAFSGYNLLTFTPYLWGDPEATASNAPSYPLTKTYTLSLKLGF >gi|225935362|gb|ACGA01000030.1| GENE 78 126369 - 127940 1121 523 aa, chain - ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 17 275 15 296 568 66 26.0 3e-09 MKVYNYISMLLLAGTAIFASCSQDEEMSGGNMDSMQGFQINVLDEGYQSIDGNKTRATEN DYSTKFAEGDAIGIFAVKSGNIVDEIKNRKFTMQDGLWTLDDGGDPIEYKGSEYQRMSFY AYYPYDENVIFEPAKTDPFETYVSNWKIGENQSEGEYTKYDLMTSTGTVEGDRLKGKISF TMKHQMALAVIQMPELVYSFTNGNIDDYKLPVSVGSFTLNEVEATPYYQESTDTYRFLVN PKKTFSIKGTYNGVKEMEYTAGGTLDGGTAKMYTINDESKINHTLQVGDYYCADGKIISV ESETVPENVIGLVCYVGNIQPSVTHDEYTETQDALRRDYPNCTHGLVVAVNYAKYNDANT SVFSPNSRDYFHGNWFSNDENWAGKFINTDTKTTDSDGVAAMPFLGYNHTVLMINSPTLV NACEAGVNFVQAYRTDVTAPAVTSDWFLTSLKELDLLFRLKSTINARLKIVGGEELLEGS RHWSNAERTGNTQIVYQHNFSTGVINDKRRNEGAGYFRMVLAF >gi|225935362|gb|ACGA01000030.1| GENE 79 128001 - 129053 797 350 aa, chain - ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 13 317 1 303 309 63 27.0 9e-09 MKIYLFNFKKHGMKINYVTAAFALLLLAGCSNDVKESDIKFAGESGAKVSFSAVINNQEA SELVTRATETSWEMGDLVGITCGTRQVNIEYEYTGGETSLFSAKSGYTEDIWVLGSQEYD VTAYYPFTGTSGEEPLAIEVSTDSENQATAEKREQIDFLYAAGKATASTPNVRLAFNHVM SRIKLTFTAGENVTLSDITCYLINLKTKGTFNPNTGVTTVTEEPATADDDIVWTKVGSAD NYTIQAILLPQMVGNEAYIQAGMNGYYYEVHFPNLKELKPGVSYNYTIQANEYKDNPFVL TITEETQIVGWQNEDGGTIESDPSVAGTDAETTNPSWNITEETVTPTPIK >gi|225935362|gb|ACGA01000030.1| GENE 80 129332 - 130510 382 392 aa, chain - ## HITS:1 COG:no KEGG:BVU_0168 NR:ns ## KEGG: BVU_0168 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 392 1 392 393 561 72.0 1e-158 MANFSIVIVPTKKLSNGRHRIRIAVAHHSKTRYISTQFTLDSASQLKNGRVIRHENAANM NACLRKLINEYEEIVTSISYLPAISCTELIHIIAYEQKKKGITFQTVAKEYMDFMKGEER 
EKSYKLYKIASERFIKYMKGDFPLIQLTPLHIQEFANVLHEENLADTTIRIYLTLIKVIL NYATKMNYVTYSIHPFTLFKMPASSVRELDLSIDELKRIRDVHLLKSSLSIVRDIFMLTY YLGGINLRDLLAYNFKDKDYMKYVRHKTRNSKKGENGITFTLQPEAKAIIDKYQTREGYL KFGKYSSYRQIYSLVFRHIDKMTQLSEINKKVTYYSARKTFAQHGYNLGIQIEKIEYCIG HSMKNNRPIFNYIKIMQEHADKVFREILDQLL >gi|225935362|gb|ACGA01000030.1| GENE 81 130954 - 134085 2666 1043 aa, chain + ## HITS:1 COG:no KEGG:PRU_1591 NR:ns ## KEGG: PRU_1591 # Name: not_defined # Def: putative receptor antigen RagA # Organism: P.ruminicola # Pathway: not_defined # 20 1015 1 1009 1027 903 47.0 0 MKSNKKGNHLYAYKDTIRRLFLLTLFSFLIVESYAQNKTISGTVTDFTGEPVIGASVLVN GTTNGTITDLNGKFSLSNVPIKGTITITYIGYKKQEVSVAGNTNFKITLQEDTETLDEVV VVGYGVQKKSDVTGAMARVGEKELKAMPVRNALEGMQGKTAGVDITSSQRPGEVGNINIR GQRSINAEQGPLYVVDGMVIQNGGIENINPSDIEAIDILKDASATAIYGSRGANGVILVT TKKGKEGKVTLNYSGTVTFETLHDVTEMMSAAEWLDYARLAKYNAGSYASATPTYEADKA AFGSVTASWKNIEKAWVNGVYDPSLVGSYDWASHGKQTGITHEHTLSASGGSDKFQGYAS FGYLDQKGTQPGQAYERYTLKTSFDVTPVNWFKMGSSINASYSTQDYGYSFSKSVTGSGD FYSALRSMIPWTVPYDENGEYVRYPSGDVNIINPIRELDYNTNQRRTFRASASLYSQIDF GKIWKPLEGLSYRLQFGPEFQFYTLGVANAADGINGDGNNGAQYKNEQKRAWTLDNLIYY NKALGQHNLGMTLMQSASAYHYEMGDMRATNVASSDELWYNIGSAGTLNSFGTGLTETQM ASYMIRLNYGYKDKYLLTASMRWDGASQLAEGHKWASFPSAAIAWRMDQEDFLKDISWLN QLKLRIGMGVTGNAAIKAYATKGAITGLYYNWGQNESSLGYVPSDPSQKEPAKMANPTLG WERTTQYNVGVDYGFFNNRLTGSIDAYKTKTDDLLLEMSIPSLTGYVSTYANVGKTSGYG IDLQVNAIPIQTKDFSWSTTLTWSMDRNRIDELSNGRTEDVNNKWFVGEEIGVYYDWVYD GIWKTEEAEEAAKYGRKPGQIKVKDLNNDDTIDANDDKKIVGHTRPRWTGGWSNTFSYKN FELSFFILSRWGFTVPQGAVTLDGRYMQRKIDYWVAGTNENAKYYSPGSNGEGADAFNSA MNYQDGSYIKVRNISLGYNFTPQQLKNLGINNLKLYVQAMNPFNIYKACDFLDTDLVNYD NNTKTFGSPTTLKSFVIGVNIGF >gi|225935362|gb|ACGA01000030.1| GENE 82 134110 - 136191 1689 693 aa, chain + ## HITS:1 COG:no KEGG:BVU_0172 NR:ns ## KEGG: BVU_0172 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 693 4 712 712 341 35.0 5e-92 MKLNRILFAALIASITGSSITSCSESFLDENLTTQYSTDRFKTQEGLDELVTGAYQKLKF KFNYIWGIQCYNMGVDEFTDANNVIPAWNHYSQDLNSSENAANQPIWDNYYGLVEPANIL 
IQNIPQYYNQSSPTYNTRLGEAHFLRAYAYFELVKQFGGVPLKLVPSTSAETYFTRNSAE EIYTQVISDFGEAYRLLPDKGESIGRINKYAAAHFLAKAHLFRASELYSDWNSNYIASDL DAVIQYSSEVVDAHPLCNDYVELWDYEQPNGANEKVSEVILAAQFSNDESTWGRYGNQMH LYYPAVYQGNDIGGCKRDISGGREFSYVSATEYTMQVFDRVNDSRFWKSFITCYGANETK SAPTWTAEDMPYAPAGVKEGDKRFSGGELGMKYIVNDPRDNRYEKYPNAPAYTVLKDGKM CNTYTYVRYFKGQEHSWNINEKTGNYYDIIPHKRSVALSKFRDGYRVSIASQFGTRDAII ARSADDVLMVAEAYIRKGEANYDKAVEWMNKLRERAGYKTGEDRSKNVDGGQAYKNNPYC SGKGGGHSSEGAIYWEENTYYESNNIEQETTASTKTTMKLNSVADVYNSTVDTPIYNELG CTSNADKMMCFLLNERTRELCGELQRWEDLARTKTLDTRWHKFNDGASRGLGEFKSEKHY YRPIPQAFLDGITNSNGSALSNEEKKAMQNPGY >gi|225935362|gb|ACGA01000030.1| GENE 83 136294 - 136980 372 228 aa, chain - ## HITS:1 COG:no KEGG:BVU_1770 NR:ns ## KEGG: BVU_1770 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 224 1 226 231 340 71.0 2e-92 MKKSFYLLLLITLYSSISFAQSLKSISILGDSYSTFEGYLQPDTNSIWYYVSPRQQTDVT SVKQTWWHKFIKENNYRLCVNNSFSGATICNTGYNQADYSDRSFITRMDKLGCPDIIFIF GATNDCWAGSPLGDYKYEGWTKEDLYTFRPAMAYLLDHMIDRYPNVEIYFLLNSGLKEEF NESVRAICHHYNIDCIELHDIDKKSGHPSIKGMEQISDQIKMFMKKTK >gi|225935362|gb|ACGA01000030.1| GENE 84 137069 - 138451 1226 460 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 48 457 24 438 448 271 35.0 2e-72 MKTNLLKYMAGICFGLLTVSLAFAGDNYKTVKVKAPFPMQPIKIFIYPDKDFPITDYGAQ PGGEADNTKAIAAAIEACNQAGGGRVVVPAGTWLTGPIHFKSNVNLCLEENAVLNFTDNP SDYLPTVMTSWEGLECYNYSPLLYAFECENVAITGKGTLQPKMDTWKVWFKRPQPHLEAL KELYTKASTNIPVIERQMAIGENHLRPHLIHFNRCKNVLLDGFKIRESPFWTIHLYMCDG GLVRNLDVKAHGHNNDGIDFEMSRNFLVEDCSFDQGDDAVVIKAGRNQDAWRLNTPCENI VIRNCQILKGHTLLGIGSEISGGIRNIYMHDCTAPNSVMRLFFVKTNHRRGGFIENVYMK NVKAGTAQRVLEIDTEVLYQWKDLVPTYEERITRIDGIYMDKVTCESADAIYELKGNSKL PVKNVMIRDVKVGEVKEFVKKVNNVENVVEKNVTYDRKGK >gi|225935362|gb|ACGA01000030.1| GENE 85 138511 - 140658 1831 715 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 
# Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 57 713 1 644 649 346 31.0 1e-94 MKYMVKSKILATFCAGLLTVSTGIQAAERLATGTQQKQAQKTVIETSPWFDDKDLTLTGV YYYPEHWDESQWERDFKKMHELGFEFTHFAEFAWAQLEPEEGRYDFAWLDRAVALAAKYD LKVIMCTSTATPPVWMSRKYPEILLKNEDGTILDHGARQHASFASPLYRELSCKMIEKLA QHYGNDSRIIGWQLDNEPAVQFDYNPKAELAFRDFLRAKYNNDIQLLNNAWGTAFWSEAY SSFDEITLPKRVQMFMNHHQILDYRRFAASQTNDFLNEQCLLIKKYAKNQWVTTNYIPNY DEGHIGGSPSLDFQSYTRYMVYGDNEGIGRRGYRVGNPLRIAWANDFFRPIQGTYGVMEL QPGQVNWGSINPQPLPGAVRLWMWSVFAGGSDFICTYRYRQPLYGTEQYHYGIVGTDGVT VTPGGREYETFIKEIRELRKHSSSRETKPADYLARRTAILFNHENSWSIERQKQNRTWDT FAHIEKYYRTLKSFGAPVDFISETKILFDYPVVVIPAYQLADKELVDKWIAYVKNGGNLI LTCRTAQKDRYGRLPEAPFGSMIAPLTGNEMNFYDLLLPEDPGTVVMNGKEYAWNTWGEI LNPPTDAQVWATYKNEFYEGSPAVTFRKLGKGTITYVGVDSHDGALEKDILKKLYAQLNI PVMDLPYGVTMEYRNGLGIVLNYSDCPYTFNLPKGGKVLIGTAEIPTAGVLVFSM >gi|225935362|gb|ACGA01000030.1| GENE 86 140690 - 143047 1747 785 aa, chain - ## HITS:1 COG:no KEGG:BVU_0180 NR:ns ## KEGG: BVU_0180 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 784 1 784 784 1166 70.0 0 MKKVVIWIALSLWSVMTVFAGETAYLFSYFINDSKDGLHLAYSYDGLNWLPLHGGRSCLT PAVGKDKLMRDPSICQSPDGTFHMVWTSSWTDRIIGYASSRDLVHWSEQQAIPVMMHEPD AHNCWAPELFYDEPSQTYYIFWATTIPGRHKEVATSESEKGLNHRIYYVTTKDFRTFSKT KMFFNPDFSVIDAAIVKDPKRKDLIMVVKNENSNPPEKNLRVTRTKKIEKGFPTKVSAPI TGKYWAEGPAPLFVGDTLYVYFDKYRDHRYGAVRSLDYGETWEDVSDQVFFPRGIRHGTA FAVDASIVESLIADRNYNPLIPDNLADPSVSKFGDTYYLYGTTDLDYGLGRAGTPVVWKS KDFVNWSFEGSHISGFDWSKGYDYTNDKGEKKKGYFRYWAPGKVIEQDGKFYLYVTFVKP DDKMGTYVLVADRPDGPFHFTAGQGLLPPGEEGTDSPAVVDDIDGEPFINDDGSGYIFWR RRNAGRLSADRLHLDGEPVTLATARQEYSEGPVMFKRKGIYYYIYTLSGHQNYVNAYMMS RESPLTGFVKPEGNDIFLFSSPENQVWGPGHGNVFYDEGTDEYIFLYLEYGDGGTTRQVY ANRMEFNDDGTIKTLIPDMRGVGYLAASQETRPNLALQSHFYASSEKSPRTSVVNIETQP NQPLPEKGSVKSYTRTHTYQATHVADESNGTRWMAADTDSSPFITVDLKEIRKVGECQLY FTRPTEGHTWRLEKSIDGKHWQMCAEQGKVQVCSPHIAKEIGETRYLRLHIRRGDAGLWE WKIYE 
>gi|225935362|gb|ACGA01000030.1| GENE 87 143290 - 146166 2383 958 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 26 454 44 444 1087 88 24.0 8e-17 MKRLLVSIAIVFISILTVAGQNHSFSLSGKWNFQIDREDTGVKEQWFKKSLDDSINLPGS MPEKLKGDEVTVRTQWTGSLYDSSYYFNPYMEKYKIEGQVKLPFFLTPDKHYVGVAWYQK KVTIPADWKGERITLFLERPHIETTVWVNQQELGMQNSLCVPHVYDLTAATTPGKTYLIT IRIDNRIKEINVGPDSHSITDQTQGNWNGIVGQIELQATPKVHLEDIQVYPDLSNQKALV RMNIQSASSIKGEITLSAESFNTDIQHKVAPVHQSFNIRPGDNPVEMELPMGKEFLTWDE FSPALYKLTAKLTNGKQTDTQQVQFGMRDFKIEGKWFYVNGRKTMLRGTVENCDFPLTGY APMDVASWERVFRICRNYGLNHMRFHSFCPPEAAFIAADLVGFYLQPEGPSWPNHGPRLG NGQPIDKYLMDETIALTKEYGNYASYCMLACGNEPSGRWVAWVSKFVDYWKATDPRRVYT GASVGNSWQWQPHNEYHVKAGARGLSWAGAQPESESDYRTRIDTVKQPYVSHETGQWCVF PNFNEIRKYTGVNKAKNFEIFRDILNDNQMGSQSHDFMMASGKLQALCYKHEIEKTLRTP DYAGFQLLALNDYSGQGTALVGVLDVFFEEKGYINAEQFRRFCSPTVPLARIPKFVYANN ETFHADIEVSHFGAAPLESAKTVYTIKDKFGKVYAHGTVGTRHIPIGNLYSLGSVDMTLE GIDTPQKLNLEVRIEGSDAVNDWDFWVYPAQVELVQGTVYTTDTLDAKALAVLQDGGNVL ITAAGKIQYGKEVKQYFTPVFWNTSWFKMRPPHTTGIFLNEYHPLFREFPTEYHSNLQWW ELLNKAQVMQFTDFPATFQPTVQSIDTWFISRKIGMLFEAKVLNGKLMMTSMDITSQPEK RIVARQMHKAILNYMNSDAFRPADKIAPELIQALFTKVAGDVKSYTKDSPDELKPKIN >gi|225935362|gb|ACGA01000030.1| GENE 88 146295 - 146489 201 64 aa, chain + ## HITS:1 COG:no KEGG:BT_4150 NR:ns ## KEGG: BT_4150 # Name: not_defined # Def: putative rhamnogalacturonan acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 56 16 56 412 78 82.0 9e-14 MKTTIIGLLLLATASVNAQEKAQTYQLADAPRYSEETGYGYDLVATPEKGSKAPFFACRT VTIK >gi|225935362|gb|ACGA01000030.1| GENE 89 146465 - 147523 880 352 aa, chain + ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 102 319 6 223 232 182 44.0 
8e-46 MPDGNYQVTVRLGSKRQAGVTTVRGESRRLFIDNLATKKGQFVDETFIINKRNPRISEKE SVRIKPREKAKLNWDDKLTLEFNGDAPVCQSISIEPADPSVITVFLCGNSTVVDQDNEPW ASWGQMIPHFFGTDVCIANYAESGESANTFIAAGRLKKALSQIKKGDYLFMEFGHNDQKQ KGPGKGAYYSFMTSLKTFIDEARARGAYPVLVTPTQRRSFDATGHIRDTHEDYPEAMRWL AAKENVPLIDLNEMTRTLYEALGTETSKRAFVHYPAGTYPGQTKAFEDNTHFNPYGAYQI AQCVIEGMKKAVPELAKHLKIDPSYNPAQPDDVNGFHWNESPFTEIEKPDGN >gi|225935362|gb|ACGA01000030.1| GENE 90 147572 - 149107 1188 511 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 55 366 22 351 448 139 28.0 2e-32 MKTKSIFPLFLLAGTLLTAACATPTAKQAGNNSFEWGQVPQQPDLSWADSVGSRQMPGNH VILSANSFGAVADSTVLSTEAIQKAIDSCAVIGGGTVVLQPGYYQTGALFIKSGVNLQLD KGVTLLASPYIHHYPEFRSRIAGIEMTWPAAVINIVNEKNASVSGEGTLDCRGKVFWDKY WEMRKEYEAKGLRWIVDYDCKRVRGILIERSSDITLKGFTLMRTGFWGCQILYSDYCTID GLTINNNIGGHGPSTDGIDIDSSCNILVENCDVDCNDDNICIKSGRDADGLRVNLPTENV VIRNCIARKGAGLITCGSETSGSIRNVLGYNLEAIGTSAVLRLKSAMNRGGTIENIYMTE VKAENVRHVLAADLNWNPSYSYSTLPKEYEGKEIPEHWRIMLTPVTPPEKGYPRFRNVYV SKVKAENVDEFISASGWNDSLRLENFYLYAIEAQTDKPGKICYTKNFNLSEITLKTEEKN VIELKENEQSNINFNYVKTSPDHRTAGNLAH >gi|225935362|gb|ACGA01000030.1| GENE 91 149058 - 149660 380 200 aa, chain + ## HITS:1 COG:no KEGG:BT_4147 NR:ns ## KEGG: BT_4147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 199 1 190 196 321 74.0 8e-87 MLRHLLTIVLLGILLIDAQAQALTPPAGTFRLGISKGNESHWLKPKEKIQGVSFQWKALP DSRGFILEVEVASAPEADALFWSFGDCQPDSDINVFSVEGQAFTCYYGESMQLRTLQAVT PTDDIRLSDGHKDATPLMLYESGKRTDRPVLAGRCSLSPASSRNEGSKPFRLYFCFYEQN EKADYNYYMLPDIFAKIDKK >gi|225935362|gb|ACGA01000030.1| GENE 92 149657 - 151060 1030 467 aa, chain + ## HITS:1 COG:CC0572 KEGG:ns NR:ns ## COG: CC0572 COG5434 # Protein_GI_number: 16124826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Caulobacter vibrioides # 20 449 34 474 527 509 59.0 
1e-144 MKRIYLLFSLLIGCIYLHAAIYNVMDFGAKADGKTIDSPVINRAIEAAAQAGGGTVYLPA GEYACYSIRLKSNIHIYLEQGTRIIAAFPGKEEGYDTAEPNEHNKYQDFGHSHWKNSLIW GIGLENITISGPGLIYGKGLTREESRLPGVGNKAISLKDCRNVTLKDLSMLHCGHFALLA TGVDHLTIMNLKVDTNRDGFDIDCCRNVRISQCTVNSPWDDAIVLKASYGLGRFQDTENV TISDCYVSGFDKGSVMDGTWQLDEPQAPDHGFRTGRIKLGTESSGGFRNIAITNCIFEHC RGLALETVDGGHLEDIVISNITMRNIVNAPIFLRLGARMRSPEGTPVGTMKRILISDINV WNADSRYASIINGVPGACIEDVTFRNIHLYYKGGYSEEDGKRIPPEQEKVYPEPWMFGTI PAKGFYIRHARNITFDGIRFHFEQPDGRPLFVTDDAENIEYYNTPKE >gi|225935362|gb|ACGA01000030.1| GENE 93 151057 - 153822 2139 921 aa, chain + ## HITS:1 COG:no KEGG:BT_4145 NR:ns ## KEGG: BT_4145 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 920 1 919 924 1772 89.0 0 MMNKRELIVLCLLAAGGVMQAQQWPDTPVEARPGARWWWLGSAVDEKNLTYNLEEYARTG MGAVEITPIYGVQGNDANEIQFLSPRWMEMLKHTQAEGKRTGIEIDMNTGTGWPFGGPEV SIEDAATKAIFQTYEIEGGKEIEQDINVTDPKQQPFSVLSRVMAYDEKGKCINLTAHVKK DKLQWKAPAGKWKVIALYIGKTRQKVKRAAPGGEGYVMNHLSKKAVKNYLSRFDRAFKSS KTSYPHTFFNDSYEVYQADWTDDFLEQFARRRGYKLEEHFPEFLDKNRPEVSRRIVSDYR ETISDLLLENFSHQWTDWAHKNGSITRNQAHGSPGNLIDIYASVDIPECEGFGLSQFHIE GLRQDSLTKKNDSDLSMLKYASSAAHIAGKTYTSSETFTWLTEHFRTSLSQCKPDMDLMF VSGVNHMFFHGTPYSPKEAEWPGWLFYASINMSPTNSIWRDAPSFFNYITRCQSFLQMGR PDNDFLIYLPVYDMWNEQPGRLLLFTIHHMDKLAPKFIDAIHRINNSGYDGDYISDNFIR STRFKDGQLVTSGGTGYKALVVPAAHLMPSDVLAHLYELAKQGATIVFLENYPTDVPGYG QLEQKRQSYQRTLRQLPAVSFSETTVTSIGKGKIITGTDYARTLASCNISPEEMKTKFGL QAIRRVNDTGHHYFISSLQNKGVDGWITLGTNAAAAALFNPMTGECGEAKVRQANGKTQV YLQLKSGESIILQTYQQPLQASKPWKYVKEQPFSLRLDHGWKLHFAESKPEIQGTFDIDR PCSWTHIDHPAAQTNMGTGVYSLDIELPTLQADDWILDLGDVRESARVRINGQEAGCAWA VPYQLKVGQFLKPGKNHIEIEVTNLPANRIAELDRQGVQWRKFKEINIVDLNYRPANYGH WSPLPSGLNSEVRLIPVNVMP >gi|225935362|gb|ACGA01000030.1| GENE 94 154169 - 155341 1140 390 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon 
repressor activities # Organism: Yersinia pestis # 1 389 1 392 393 370 45.0 1e-102 MNYNFDEIINRHGTDSVKWDAVENRWGRNDLIPMWVADMDFRTAPFVIEALKKRLEHEVL GYTFACKEWSESIINWVKERHGWAIREEMLTFTPGIVRGLAFVIHCFTQKGDKIMVMPPV YHPFFLVTQKNEREVVFSPLVLKDEQYYIDFNRFRQDIQGCKLLILSNPHNPGGRVWTKE ELSQIADICYESGTLVISDEIHADLTLPPYKHPTFALISEKARMNSLVFMSPSKAFNMPG LASSYAIIEKDELRHRFQTYMEASEFSEGHLFAYLSVAAAYSHGTEWLDQVVAYIKGNVD FTETYLKEHIPAIKMIRPQASYLIFLDCRALGLNQEELNRLFVEDAHLALNDGTTFGKEG EGFMRLNVACPRATLEKALKQLEQAVMDLK >gi|225935362|gb|ACGA01000030.1| GENE 95 155527 - 156516 954 329 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171734|ref|ZP_05758146.1| ## NR: gi|260171734|ref|ZP_05758146.1| hypothetical protein BacD2_07703 [Bacteroides sp. D2] # 1 329 1 329 329 619 100.0 1e-176 MKQIKYLSFIPFLALLCLMTGCENDKDVYYPYVNVPLTTHNPVLVEGEVLHIGIGLDGVN YRVESDNEDVVTAEVVGNEILLTGSDIGSTIVRLSDDSYNRALMTVEVRKLQELTLESLP EGVEVLRLDNDGVILREMKILTGNGGYTVKSLNEGVATVAISGTTVTVTVVGDGGTTITV TDQEKESRSIDLWVPLNLDDPTPRIYWDGYRADINTVGVSSTGSYATPKDMYWYQQNGEV KDTYYVYFNGGWANNLNCVNAANRNPKFETTINDTNTSYTDKAASAVKLTEFILEKRIGT GPTSHPHIYFISLKTEDGKRGFIVHRWNE >gi|225935362|gb|ACGA01000030.1| GENE 96 156819 - 156995 99 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171735|ref|ZP_05758147.1| ## NR: gi|260171735|ref|ZP_05758147.1| hypothetical protein BacD2_07708 [Bacteroides sp. 
D2] # 1 58 1 58 58 82 100.0 7e-15 MSDVLFSFDGMSETALTFKIKALKKLGQKAYAQSVYDRFQKEYQQLYGEKYKENSLEE >gi|225935362|gb|ACGA01000030.1| GENE 97 157196 - 157981 854 261 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 261 1 256 256 137 32.0 2e-32 MTKALFFDIDGTLVSFETHRIPSSTIEALEAARAKGLKIFIATGRPKAIINNLSELQDRN LIDGYITMNGAYCFVGEQVIYKSAIPQDEVKAMGDFCEKKGVPCIFVEEHNISVCQPNDM VKKIFYDFLHVNVIPTVSFEEATSKEVIQMTPFITEEEEKEIRPSIPTCEIGRWYPAFAD VTAKGDTKQKGIDEIIRYFDIKLEDTMAFGDGGNDISMLRHAAIGVAMGQAKEDVKAAAD YVTAPIDEDGISKAMKHFGII >gi|225935362|gb|ACGA01000030.1| GENE 98 157984 - 159978 1671 664 aa, chain + ## HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 14 435 318 777 805 169 29.0 2e-41 MSVDTNNAAFQDALNLIQYTRQSVFLTGKAGTGKSTFLRYVCENTKKKHVVLAPTGIAAI NAGGSTMHSFFKLPFYPLLPDDPNLSLQRGRIHEFFKYTKPHRKLLEQIELVIIDEISMV RADIIDAIDRILRVYSHNLREPFGGKQLLLVGDVFQLEPVVKNDEREILNRFYPTPYFFS ARVFGQIDLVSIELQKVYRQTDPVFVGVLDHIRNNTAGAADLQLLNTRYGSQIEESEADM YITLATRRDTVDSINEKKLAELPGDPITFEGVIEGDFPESSLPTSQELVLKPGAQIIFIK NDFDRRWVNGTIGVIAGIDEEEETIYVITDDGKECDVKRESWRNIRYRYNEKTKEIEEEV LGSFTQYPIRLAWAITVHKSQGLTFSRVVIDFTGGVFAGGQAYVALSRCTSLDGIQLKKP INRADVFVRPEIVNFAGRFNDRQAIDKALKQAQADVQYAAASRAFDKGDMEECLEQFFRA IHSRYDIEKSVPRRFIRRKLGVINTLKEQNKKLKEQMREQQERLRQYAHEYLLMGNECIT QAHDSRAALANYDKALSLDPNYVDAWIRKGITLFNNKEYFDAENCFNTAVTLYPANFKAV YNRGKLRLKTENTEGAIADLDKATSLKPEHAGAHELFGDALLKVGKEGEAALQWRIAEEL RKKK >gi|225935362|gb|ACGA01000030.1| GENE 99 160349 - 162877 2607 842 aa, chain + ## HITS:1 COG:no KEGG:BT_4129 NR:ns ## KEGG: BT_4129 # Name: not_defined # Def: outer membrane assembly protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 842 1 838 838 1406 90.0 0 MKKGLKIAAITVGVIIILMFLLPFAFQGKIADIVKTEGNKMLNAQFDFKNLNISLFRNFP QASVTLEDFWLKGTGEFANDTLVQAGEVTAAVNLFSLFGDSGYDISKIFIEDTRLHAIVL PDGRANWDIMKPDTTATAEAPVSEEEASSPFKVKLQRFVIKNMNLIYDDQQGKMYADIRD FNAVCAGDLGSDRTTLKLEAETKSLTYKMNGIPFLANANISATMDVDADLANNKYTLKDN TIRLNAIQAGIDGWVALKDPAIDMDLKLNTNDIGFKEILSLIPAIYATEFSSLKTDGTAT LTATAKGILQGDTVPAFNIDMQVKNAMFRYPALPAGVDQINISANVQNPGGNIDLTTVNI NPFSFRLAGNPFSLTANVKTPISDPDFKAEAKGILNLGMIKQVYPLGDMELNGTIDADMQ MSGRLSYIEKEEYERMQASGTIGLTGMKLKMKDMPDVEIKKSLFTFTPKYLQLSETTVNI GKNDITADSRFENYIGYVLKGTTLKGNLNIRSNYFNLNDFMAASADEATASETASTDSVA TAATGIMEVPRNIDFQMDANLKQVLFDKMSFNNMNGKLVVKDGKVDMKNLSMNTMGGNVV MNGYYSTANVKKPEMKAGFKLSNIGFAQAYKELDMVQKMAPIFENLKGNFSGSINILTDL DATMSPVLNTMQGDGSLSTRDLSLSGVKAIDEIADAVKQPSLKDMKVKDMTLDFTIKDGR VETKPFDIKMGDYTLNLSGSTGLDQTIDYSGKVKLPASVGNISKLMTLDLKIGGSFTSPK VSVDTKSMANQAVEAVADEAISKLGQKLGLDSAATANKDSIKQKVTEKAAEKALDFLKKK LK >gi|225935362|gb|ACGA01000030.1| GENE 100 162897 - 163511 684 204 aa, chain + ## HITS:1 COG:YPO2212 KEGG:ns NR:ns ## COG: YPO2212 COG0009 # Protein_GI_number: 16122440 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Yersinia pestis # 5 199 7 200 206 160 42.0 2e-39 MLLKLYDKNNNPQDLQRIIDILNDGGLIIYPTDTMYAIGCHGLKERAIERICRIKEIDPR KNNLSIICYDLSNISEYAKVDNNVFKLMKHNLPGPFTFILNGTNRLPKIFRNRKEVGIRM PDNNIIREIARLLDAPIMTTTLPYEEHEDLEYMTDPELIDEKFGDIVDLVIDGGIGGIEP STVVKCTDDELEIIRQGKGWLEEV >gi|225935362|gb|ACGA01000030.1| GENE 101 163657 - 165783 1422 708 aa, chain + ## HITS:1 COG:no KEGG:BF4291 NR:ns ## KEGG: BF4291 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 533 11 507 843 377 41.0 1e-103 MKKLNHIGIFLVCLFITIQSFASEPIRVACIGNSITYGAFIPNREMNCYPAQLQAYLGDG YEVKNFGASGRTILSKGDYPYSETDIYKASLEYQPDIVLIKLGTNDTKPQNWKYKDEFKD NYQTLIDTYRNLKSHPRIILLTPIRCFLPEGSEINAQLIENEVRPTVEELAWKNQLEIIN LFNLFGDQWDSVMLPDKLHPSSIGAGVMAQKMYEYLAVKTTASPTKLQTSLGIQDAKRFN 
FHGHQGYEFENEGVKCLVVEPAKEAIGKPWMIRARFWGHEPQTDIALLEHGFHIVYCDVA DLYGSDKAVQRWNSFYKRMVKAGFNKKVALEGMSRGGLIVYNWAAQNPEKVACIYADAPV MDLKSWPMGQGKSAGSTMDTKQLLNAYGFKNEAEALNWKKNPIDCAPTMAKAGIPILHVV GDADQVVPVVENTAIFEQRMEELHAPITIIHKPGVDHHPHSLNNPEPIVQFILKATNRAE NMCVHPVPGNEFRSAAGWTQNSDWNSVAKDITATLNGKHLKLLLLGNSITQGWGGNRKEV TYKPGKEAMDNAIGKDNWESAGISGDRTQNLLWRVRYDNYNSCHPENIVIAIGINNLISG KDSPENTAEGIIAVANEVRKQFPESRIILLGLFPSGKEQQSKVRTQCDKIHDILQHHRFE KVEYINPTKWFTEADGTMKDGLYGNDYIHFTGEGYKVAVTEIARILAR >gi|225935362|gb|ACGA01000030.1| GENE 102 165886 - 166698 886 270 aa, chain - ## HITS:1 COG:BB0152 KEGG:ns NR:ns ## COG: BB0152 COG0363 # Protein_GI_number: 15594497 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Borrelia burgdorferi # 1 264 1 264 268 370 69.0 1e-102 MRLIIQPDYQSVSLWAAHYVAAKIKAANPTPEKPFVLGCPTGSSPLGMYKALIDLNKKGI VSFQNVVTFNMDEYVGLPKEHPESYYSFMWNNFFGHIDIKPENTNILNGNAPDLDAECAR YEEKIKSYGGIDLFMGGIGPDGHIAFNEPGSSLTSRTRQKTLTMDTIIANSRFFDNDINK VPKTSLTVGVGTVLSAKEVMIIVNGHNKARALYHAVEGAITQMWTISALQMHEKGIIVCD DAATAELKVGTYRYFKDIESAHLDPESLIK >gi|225935362|gb|ACGA01000030.1| GENE 103 166747 - 167943 1183 398 aa, chain - ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 5 398 5 402 403 363 45.0 1e-100 MESKTRIKGNVHYVGVNDRNKHRFEAMWPLPYGVSYNSYLIDDEMVALVDTVDICYFEVY LRKIKQVIGERPINYLIINHMEPDHSGSIRLIKQHYPEIIIVGNKQTFGMIEGFYGVTGE QYLVKDGDFLALGRHKLRFYMTPMVHWPETMMTFDETDGVLFSGDGFGCFGTVDGGFLDT RINVDKYWGEMVRYYSNIVGKYGSPVQKALQKLGGLPISAICSTHGPVWTENIAKVIGIY DRLSRYDADEGVVIAYGSMYGNTEQMAEAIAEELSAQGVKNIVMHNVTKSHPSYIIADIF RYKGLIIGSPTYSNQIFPEVEALLSKILLREVKGRYLGYFGSFAWAGAAVKRLAEFAEKS KFELIGDPVEMKQAMKDLTYTQCENLARAMADRLKKDR >gi|225935362|gb|ACGA01000030.1| GENE 104 168002 - 168979 828 325 aa, chain - ## HITS:1 COG:SA1581 KEGG:ns NR:ns ## COG: SA1581 COG1242 # 
Protein_GI_number: 15927337 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Staphylococcus aureus N315 # 9 315 14 317 317 236 39.0 5e-62 MNMSTQLLYNDFPTFLRKYFPYKVQKISLNAGFTCPNRDGTKGWGGCTYCNNQTFNPDYC RTEKSIAIQLEEGKCFFAHKYPEMKYLAYFQAYTNTYAELEGLKRKYEEALAVDGVVGLV IGTRPDCMPESLLHYLEDLNKHTFLMVEYGIESTCDETLKRINRGHTYADTVEAVCQTAA CGILTGGHIILGLPGETHDTMVAQAEILSYLPLATLKIHQLQLIRGTRMAHEYDVTPAGF HLFNEVEEYIDLVIDYVEHLRPDMVVERFVSQSPKDLLIAPDWGLKNYEFVARLQKRMKE RGAYQGKKYRDSEKRIIFANDKLTT >gi|225935362|gb|ACGA01000030.1| GENE 105 169456 - 173796 3156 1446 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 907 1140 52 290 318 141 34.0 1e-32 MKRLHTLFLLLTIAIDIAAQPICQVKHFSVSDGLAQGNVMSILQDQKGLVWFSTWNGLNK FDGYTFKTYKTSQESKYAFGSNRMGTISESKYGDIWCPTYDGQACLFDVETEKFIDVLQP IELSTKQTNNVTRIYSLEKGIAWILCESGYAFRVDEQLCKKGEGITLYSASNHNLKGNQI FNIYQDSEEDEWILTDKGVSIIGKKTLDTDFPFQFITQIKETIYLIAENDKLARYDFHTK KLKFVDIPYPHHKINNVTTIGKDMLALGTDNGVILYSIPKNSFQQIDIRTATQTSNDVES VYQDHLGDIWIFSKDPGIVHLNLATNEKEHLFTPKDEIIKHGRKSRKLIFEDNAKNLWLL PTEGNFCYYDRKERTLKPLLTDINNPKSIFSPLVRSYTLDNQGNCWFATARGVEKLCFFP QSYQFNLTDYEAETRAFLQDSNKRLWTASKSNYIQIFAPDGNLVGYLSKQGNIIKEKQSF YNGVYSILEDKDGNIWLGTKEIGLFQLKKTGANHYSIHHFEHQTNDPYSLSSNSIYAIFQ DSRNNIWIGCYGGGLNLLAQAKDGKVSFIHSNNELRNYPIAYGMKVRNIAEAPGGVILVG TTNGLLTFSNNFERLEEIKFYRNIRRPGDKNSLSANDIMHIYTDKNKITYIISFTGGVNK VISPNLLNENIQFKNYDKNNGLASDLALSMIEDTQNQLWVVSEIALSKFDPAKETFENYE LSSIYQEFNFSEALPIINARNQIILGTDKGFLEVSPEKMRKSSYVPPIVFTGLKIQGHLT DHSIDNLEELELEPSQRNVTFQFAALDYVNPKGILYAYRLQGLEEEWNEADNNRSASYIN LPAGKYQLQIKSTNSDGVWVDNVRTLSIHVLPTFWETYWAWLLYFILFILFTASIVYVLF YIYRLRHRVDMEQQLANIKLRFFTDISHELRTPLTLISSPVTEVLENEPLSPSAREHLTL VHQNTERMLRLMNQILDFRKIQNQKMKLLIEETDLIPLLQKVMSSFKLIAEEKNINYQLT STIQSVYSWVDRDKFEKIFFNLLSNAFKYTPADKSITVNITTKEKTVEIEVADEGIGIAV 
EKQHSLFQRFESLVKQNILQPSSGIGLSLVKEMVEMHHGTITVNSQPGVGSRFTVSLPLQ REIFEEDVQVEFILNDSQSSAPHPVDSMKAPEEVEEKEDLETNSDGFSILVVEDNEELKA FLKSILSENYTVITASNGEEGLQHAVDDLPDLIISDVMMPVMDGLEMIRQIKENNNICHI PIIVLSAKASLDDRIAGLEQGIDDYITKPFSATYLKTRVASLLRQRKALQELYMNRLMEG KNTSTPDPLTPSQPQITPYDEQFMEKVMAYMEEQMDNAELTIDEFAEQLMLSRTIFYRKL KSIVGLTPVDFIREIRIKRAVQLIDSDEYNFSQVAYMTGFNDPKYFSKCFKKVIGITPSE YKERKK >gi|225935362|gb|ACGA01000030.1| GENE 106 173930 - 174697 204 255 aa, chain - ## HITS:1 COG:BH3955 KEGG:ns NR:ns ## COG: BH3955 COG0500 # Protein_GI_number: 15616517 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus halodurans # 13 250 11 252 255 119 31.0 4e-27 MNSDYKVGDLIYDANIYDGMNTGMDDLHFYKRWLPKNKDARILELCCGTGRLTLPIAKDG YDISGVDYTASMLHQAKMKAAEAGLRINFIQADIRTLDLQEEYDLIFIPFNSIHHLYENE DLFKVLHVVKNHLKDGGLFLLDCFNPNIRYIVEGEKEQQEIATYTTGDGREVSIKQTMRY ENRTQINRIEWHYFINGEFNSIQNLDMRMFFPQELDSYLEWNGFSIIHKYGGFEEEVFND DSEKQVFVCQCKFGY >gi|225935362|gb|ACGA01000030.1| GENE 107 174921 - 176927 1328 668 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 18 535 12 530 531 239 31.0 1e-62 MMYHSTSSFKCLPYLLILGCILTFLQACAPVQRHVLFSSDNGDGTYTNPVINADYPDPDI IRVGEDYYMVSSSFVAMPGIPVCHSKDLINWQIIGHAYDSITFQPQYRMENEKTAYSRLC WAPTIRYDEGIYYIGVNIADDGFVMFKSTRPEGPYTMHKFEKRLYDPGFFIDDDGKKYVT HGKGKIYLTRLKDDATGVLDPQDKGTLIITAPEGYGHLFEGCHTYKRNGWYYVFNPALGY DGVQMISRSRNLYGPYETKVLIDDDINYAGAGVHQGGYIETAEGESWAYTFQDRDYMGRC LMLYPMKWENEWPVVGPEGRPGKGVVTYRKPAVKGKHKMNYPHHSDSFDKPELAPVWEFN HVPYKEKWSLTDRPGYFRIYAQHAKGFYWARNSLTQKVTGPYSTGTVLLDLSGLKEGDFA GNGIMGRMMYQFGVRKKDNRFWLEMREGNRDAAEMIVDSLELKDVQRIYLRTETTKEGAT RFHYSLDNRQYHRFGPEGVSNFWGFLGIRHALCCYNVMKGNPCGYADFDSFELESAHRGN HYDAFTEIDFSRYDDREGMTLVRPVGKRPMQFLTDIKDGDWLVFNNLTFKKRPGKITFEL 
QALNPGGVLEMRRGSLQGELLASCTIQSTNGEWTKQSFEVKKLRKKEKVYFVFGGSNKSL SIKNFIFE >gi|225935362|gb|ACGA01000030.1| GENE 108 177265 - 179856 2119 863 aa, chain + ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 36 814 31 825 882 567 41.0 1e-161 MKKQWIIACLIGIQGVNVQAQQPSKYPYQDTKLTVEQRADDLLQRLTLEEKVALMQNNSP AIPRLGIKPYEWWNEALHGVARAGLATVFPQAIGMAASFNDELLYEVFDAVSDEARAKNR QFNEKGQYKRYQGLTMWTPNVNIFRDPRWGRGQETYGEDPYLSGRMGMAAVRGLQGPEDA EYDKLHACAKHFAVHSGPEWNRHSFNAENIAPRDLWETYLPAFKELVQKAGVKEVMCAYN RFEGDPCCGSNRLLTQILRNDWGFKGIVVTDCGAIGDFFQRKKHETHPDAAHASADAVLS GTDLECGGNFKSITDAVKKGLISEEKINTSVKRLLKARFELGEMNSTHPWSNIPFSVIDC PKHKELALKMAHESLVLLQNNNNILPLNRQMKVAVIGPNANDSVMQWGNYNGFPSHTVTL LEGIRAKLPDAQIIYEPVCGYTNDTTLHSLFNQCSIDGEAGFNATYWNNREYKGKIAATD RLTTPFHFSAEGSTVFAPGVGLKNFTAIYRSTFRPTDSGAATFRVMTNGGVTLFLNGKQI AEATNIKNHTNLYSFNYEAGKSYDIELRFIQVKDNPALNFDLAKQTPMDAREILNKLQSA DVVIFAGGISPLLEGESMRVSDPGFKGGDRTEIELPAIQREVLALLKKNGKKTVFVNFSG SAMAIVPETQNCDAILQAWYPGQAGGTAVADVLFGDYNPAGRLPITFYKSMQQLPDYEDY SMKGRTYRFMTETPLYPFGYGLSYTRFSYGKATLNQSKLTKGEKAILTIPVSNVGQRDGE EVVQVYICRPDDKEGPQKTLRGFQRVSIAKGKTQNVQIELPYDSFEWFDAATNTIRPLNG TYKILYGNSSNEKDLQTCSIQIQ >gi|225935362|gb|ACGA01000030.1| GENE 109 180579 - 182135 1041 518 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 42 481 18 429 448 258 35.0 2e-68 MTGLVPVIIGLFCACQGTAKQVDVETDLKAMYADLPFSMPAIERPVFPDYQVNICDFGAK SDGVTLNTEAINKAIKVVHDKGGGKVIIPEGLWLTGPIVLQSNVNLHAEKNALIVFSGDT SLYPIITTSFEGLDTRRCQSPISAMNVENIAITGYGVFDGAGDRWRPVKKDKMTDRQWKN LVNSGGNVDENGKVWYPNEGALKASVLMSGQGNQQAEITSEEWEEMKSWLRPVLLSIVKS KKVLLEGVTFKNSPSWCLHPLSCESLILNDVKVFNPWYSQNGDALDVESCKNVLIANCFF DAGDDAICLKSGKDEDGRRRGEPCENVIVRNNTVLHGHGGFVIGSEMSGGVKNVYVSECS 
FIGTDVGLRFKSARGRGGVVENIYINNINMIDIPNDALIADLYYAAKSAPGEPIPSVSEE TPAFRNIYISDVFCRGAGRAAYLNGLPEMPIENISIKNMVVTGAKEGIVVNQVAKLNIEN VEIDTPDQSMIQIENTTDITINGKDLGTISEKRLLTKN >gi|225935362|gb|ACGA01000030.1| GENE 110 182179 - 184590 1889 803 aa, chain - ## HITS:1 COG:YPO0424 KEGG:ns NR:ns ## COG: YPO0424 COG4677 # Protein_GI_number: 16120757 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Yersinia pestis # 509 700 62 255 361 65 31.0 3e-10 MKLFHSLLYPLLGGLFCLSCSDDEQDTGTKQPTTAGEVTFGVDLTGFSTRVTQDGSSWND GDKIGTYVLDMKTLEPVTEAANVPYVCSEEGASVSFTSTTPLKVQNDGTPVKFVAYYPYN ADVRNFNYPVQLAAQERGSTACDLLYAATKEEYTYSPENEPHISLNFTHRLAKVILKFVN MEKEPLEVSDVRIEGMQTAASFNIQTDVLTVDESSVATINPYHNATTGFYEAIILPSALT DSYKVSFVLDDREKEWIFTNLDIALPQFHKGYSYTFALYIDDSGFVEMGRLENVDGGNSS APWEDGSSEDGTAEGDKTPVSGYAFTPADGTQQALADTELKIAFEGTAPELGTSGCIRIY RMSDHKQVDEINMAERRQSIVNGQTQLNTWMDIIGVTPTGSSVSRRIVNYYPARVEGKSF IIKPHQQRLQPDTEYYVTIEQAAVKQTDFKGVYGRAWTFKTKPAPTLTGPNYEVKISHTD PNADFYTLQGAIDFCATHVDLNAAKTFRMDDGIYQEIIYLRDQSNITVKGNASDNTAVNI QYDNSNDINGGIGGGTNIDQFAPTGTIVPSSGGRSVVILDGNSDKIRFENVTIENAYGWT LGKNGQAEALYINNKSAAFINCCVLSFQDTLLPGGGYNWFKDCFIAGATDFIWGSGKVVL FEDCQIHAPSGTRAVMQARVSAGYLGYVFLNSRFTVGEGVTNSTLIYQFEPDNLTFLNCT FADVYGPNFVGENKPLTPAVPTVATGCKLYNCKTESGSDIYQSIPATVRNTVLQLSKEQY DQYFGTRETIMSWDGYTDAAWFK >gi|225935362|gb|ACGA01000030.1| GENE 111 184632 - 186488 1721 618 aa, chain - ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 15 306 16 309 309 79 29.0 3e-13 MNLKHLFFLAVAGTLAVACSEVNENSLTDGDGQRMLRTISSGNDRFTTRLNSETSEWENG DAIGIYMFDTEDKNVLNDALNVKYTTIGEGLTANFSSDPGIAIYDMPTNFVAYYPHTTSA DVIDATAALYKVDVSDQSNGISAHDLMWAKSANQSTESLLAGGLSFTFHHQLVLLRVKIT NENVSNVTSITVGGMNTTAIFSLIDGKLTNMYTQKSISLQKTGDKSFIGIMLPTEELINK MSLTIMADGGKYQYTVPEGSKIDKFVAGNEYTFNINVGKETSGEIGGGSGSNTPWGDGGS EDGDGDKVSENEAIPADYAQKAINAETNLSTILSGASGKVALVFAANAEAYTFSDAIVVP EAVTELLLIGDTEKQVKMNLKQIQYTSLQKIALNNLDITGDNSTALLTNNETAQLATDAV 
VDFKKCNFSNMKTVCDWPTRNDGVQNLLSAVLIDNCLFENMESIFNYYGSKAITITNSTL YKMTERAVYVKDANGVVITVENCTLADLGKTPFESQYGNGNLYYKNNISACFVTSNPNIG YKMDVREFSGNYAAAATEAGQLPVLNVHGKAIDANTFPNAWIDTSKTVTELFEDAGNGNF KLKIDAQVGDPRWYKNVK >gi|225935362|gb|ACGA01000030.1| GENE 112 186544 - 188196 1523 550 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171751|ref|ZP_05758163.1| ## NR: gi|260171751|ref|ZP_05758163.1| Fibronectin type III domain protein [Bacteroides sp. D2] # 1 550 1 550 550 1106 100.0 0 MIKLLRKLATAMLPALLCGTLFIGCEADDKYTKVDDLFQPRFVLEKPEVKANSVTLVWYK VNDATSYTVQLHQDQYYTSLFMEIETTDPYVFIDDIPYGTTFYIRVRSNAAKTINNSQWS YVSASTEARPEYAKLVEDVSKTEITESSAIIRWKKDNKQNPVDSISIMPMMDTTLPGVSR YLTIEEMMQGYAEVDGLTKNTLYAVNLYDTSKPRKYDKPYNQVTFRTAGPSAMSIQVGLE DDLSAMLLDNDVDPEVPEGTEYYLPAGSSYRVTPFSLMKGFRLAGSRDGVKPVVVLEGSW SIAEGSYLSSLEFDNIEFRHEANNNYFMNTSKAYTIENVSFVNCDFISLRRGFWRHQSAN AKYIMNLEMEGCRFEGCGWQTSAYGTFNLQSFDKDNGVSYDQVDRAIFRNCTFSNDNDGT NGYGWGNLFYAPYMDKPIELEYKNVTIYNYSRNQRLINIESAVGSKLVMQGILLASPCGD LYAIGANTTTTFSDNYTTADYALGGSKMNATDLEITADKLFADPVNGDLTIKDTSSPIVN SRAGDTRWLP >gi|225935362|gb|ACGA01000030.1| GENE 113 188217 - 190118 1922 633 aa, chain - ## HITS:1 COG:no KEGG:BVU_1119 NR:ns ## KEGG: BVU_1119 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 633 8 678 679 296 32.0 1e-78 MNKKLIYSLIAGTLFTLTSCDDFLDVQPGSGFTPEYIFSSESEMKALMTRIYSSMTEDGL YGSNLASGLNTNTDVEMSSFKNNTVSTNGSDIGCFDSRPTWSVLNSTWNNLYYAINYAND FLQAVQESPLFSDKVTGDTPSETQQMYGEVKTLRAMLYLDLIRTWGDVVFVTKPTEATDD FFSLGTTDRNVILEYLIDDLIAVEPMMKYAVDLDYGVERASREYCQALIGQLALYRGGWT LRADKEDVTHVGYMERGDNFEHYYEIAVTYLGKVIKESKHDLTQSFENLWVNECNWKTAN NDDVMFAVPMLKSVTSRYGYNIGVTIAEGKHSYGSARNYVTFCGTYVYSFDKDDLRRDMT CVPYKYDKDLNQEIDMGVTGMGVGKWSKLYMQSPLGASSGSNTGINSVRMRFADVLLMYA EAVNELYGPRDDAKEALKRVRRRAFDAAQWTDKVESYVESLTNEADFFQAVMDERKWEFG GEGIRKYDLARWNKYGEVIYNLYNEMTNWGLVANGAYVPGIEKVPSNIYYKQVTDPEHSD RKVLDIVGIDEYGPGVGRPAGYTVLEYALGWRVLNKETQLYETLDAISWSFRGFINKNND QLVKPTDPVRYLCPYPAKVITDHRGLIRNYYGY >gi|225935362|gb|ACGA01000030.1| GENE 114 190144 - 193356 3316 
1070 aa, chain - ## HITS:1 COG:no KEGG:BVU_1120 NR:ns ## KEGG: BVU_1120 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 14 1070 23 1111 1111 763 40.0 0 MNKIIINIFILLWVAVGAMAQTATVTGTVTDDMGPVIGATVTVQGTSIGATTDLDGKFTL TGVSNVAKAVLIVRYVGMEEAKEPLKGRTSNINIHMKEAVGMLDDVVVIGYGTQKRGNLT GSIASVSGKILEKVQTTSAAEAIVGKLPGVQVTAVDGSPDAEIKILVRGGGSITQDNSPL IILDGFEVGSLNDVPPTDIESIEVLKDAASTAIYGARGANGVILVTTKRPVEGRVVISLN TYVQTKELSNKLDVMDPYEFVLMQYENARQKSSNPTAFNNKYGHAYEHYIYQGNAGKDWQ DEVFGSNPVAKYVDLSVNGGTEKAKYKLSFIHQDQPSVMVGNGLKQNNMNASFNFKPFKF LTLEYRTRLLHKEVDGSGTEGVSLLTALRQKPTKGLDEYMKLPEDDTYFDPDQLEEETLF DPKEESKRNYKKRINKSLNTMGAVTWDIMKGMTFRSEFGFENNSEEQRRFWGMETSKARS NNNQPVAEWSMKQGTKWQLTNVWNYNFMLKDVHDIRLMLGQEIKHNQTTTKTYSTRFFPE NITAEKVFDNLSLGTPYENSSSAASPSRISSFFGRINYGYNDKYLATVTLRADGSSKFAK GNRWGFFPAGALAWRISNEDFLKDNQVVSNLKLRVSYGTSGNDRIDADLYQKLYGVSSSR PAGWGENSHYYYNFYNSKYVYNPDVRWETTITRNIGFDFGFFKERISGTVDVYWNTVKDL LVPSDIPGYTGYTKLMTNVGQTSNRGIELQLNAWLVETKDFSLNATFNIGHNKNKIDKLA SGEKEWILTSGWRNDVVNSDDYRAYVGGTSGLIYGYVNDGFYTVADFESFDSKSRTWKLK EGVVDSSPLSDTPRPGNAKFKKLTPVDPNSTNPYQLTEADRTVIGNTTPKFSGGFGLNAT YKGFDMSMFFNFMYDFDVLNANKIMLTTWADNKENNFLMDIASDKRWRNFDDMGNEIRYS PEVLAEFNKNATMWNPTSIGRPICMSYAVEDGSFLRLNSATLGYTLPSVWTKKVGLSKVR FYVSGNNLFTLTGYSGYDPEVNIATGLTPNIDYDRYPRSRTYTFGAQLTF >gi|225935362|gb|ACGA01000030.1| GENE 115 193401 - 195788 2388 795 aa, chain - ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 311 781 37 490 497 312 40.0 3e-83 MRYFLFILLVGLLSACSSDDESNAGATAAKLEVSKNEVKLSNVDGSFTINVTATSTWTAE VTSTGDWLTISKSSGEGNGDLRLFFTENTDGPKRTGTVKVSMSGAGSTLEQEISVEQLGA DPDILFDYSSDPLSFRGGTFTCKVVANVEWELEIAAEYDWIKLKEATPRTRSFVTDEVTF AVDANANKTRTAVLIFKSVGDYTLQRVLKVTQDGVSGEVTIEQDEYIIPYKCPKLVISAP QGENPVDYDAVISESWITQDKKNSTANEVVLNIENNETVFPRTATVEMLDKVITIFQYGK PDTSIGDDHSTSILAFPGAEGGGRFTTGGRGGEIYRVTTLADYNKNETPIEGSLRYGIEK SNQPRTIIFDVSGIIELKRGLYLNEYPNLSIIGQTAPGDGITLKNYNFTFNLSKDPAIGA 
GGSLNAIVRFLRCRPGDQFADYGEDAIGGRYFKDAIIDHITAGWSVDETLTFYGVQNFTA QWCIASESMNLSNHAKGAHGYGAMFSGDNASFHHILLAHHGSRCPRISDLSAPGTQESYD FTGYFDVRNNVYYNWSGRGQGSYGGKYAAFNLTNCYYKPGPATGTNNRSYRILSSDPTAR AYINGNYVLGNTGVTADNWTEGVWGQFDSSLGTVPEAEKQAMKMADYQPYSKLTNHTAEQ AYDRVLEYAGASLRRDVIDQRVVREVRNGTYTYIGSKPEEDGKAKQPGIIDTVSDTEGYI DVKSLKPWPDTDGDGIPDIWEEAYGLDPNDPSDAQKISSSVDPNGRYPNIEVYFHNLVQH IIYYQNQGGIVMEKK >gi|225935362|gb|ACGA01000030.1| GENE 116 195823 - 197628 1386 601 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717967|ref|ZP_04548448.1| ## NR: gi|237717967|ref|ZP_04548448.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 601 1 601 601 1207 100.0 0 MRNIFYSLTGIMLLLGITACSEEERYATSVVKEIQLFLDDEPWAVNTGASNKPLFIYTAD GEYVANYSSLYRFQLPNGSYNIISTTQSDSIPSPKNLNDIVIHQDPTAKTKYAISAPVTY KSPFDEPLSVRMYNRTGVLRLKSTDRKSDKSYSKLRAVVSSPISGYKLSDATFVKAPTDV QREKETTTGGVNYTDDFVLFQTQSEGGTVSVRIDYLDKNNTVVQSKAIDGTFPILPDDTV QVAFALNNVDEPIIQDYTVTIASEGWNEEDINPEAPMRIPDGYRYVSPEENLESICKSML ADASVNEVKLFLKAGATYKLGTQTEIPKALYIMGQAPGDGEELAFMEMGNMSINCPDAII EAFHFENLKIKVTTSDFFKFKNQAFHVSTISWKNCEISDLVRTMWYQEVDAAQKQIADHI VIENCRFLGLNSGGSGLFGLSTKQDAPVHNFVFKNSTFHANNLTRALITGLGSMTGELNV TIENCTFVSMAPAAMTFFDLNPKNTSSFHLVVRNNLFSGVCEVGQGTWFTTRNVTSKTFE NNYRTNGFVVANWGVDTAEIPVETALPMETLFKDVAGRDFTITDKNSEVYTNGIGDPHWI K >gi|225935362|gb|ACGA01000030.1| GENE 117 197641 - 198525 712 294 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171756|ref|ZP_05758168.1| ## NR: gi|260171756|ref|ZP_05758168.1| hypothetical protein BacD2_07813 [Bacteroides sp. 
D2] # 1 294 1 294 294 535 100.0 1e-150 MKHLASLFLLSFLLCCSCSQQMDDEWKPNEQLPNVPETHPVGVSFSRASLETFAEHGITE VGVYVYLQDSMVYGKNLPLNNGDLKVDLPLGENLQTFIVANADHLVDTDSLSKVVVYQDA HIQKPVYISDVVGFTSDNSVSSLNVELKRLVGQAVFQPKETEEELSAITRFDQLNVTFTN VAIGYKVKSKECITENVTISTNLSTGFGASVYSFPTVNGDSRTSIDVVYLKGGEEVNRII SPLDTGIGFESSKRSTVHMKITDENYLDEPWPSVRTVMYKSTSTQPFTIEVSEF >gi|225935362|gb|ACGA01000030.1| GENE 118 198921 - 200450 1289 509 aa, chain - ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 509 3 496 497 363 42.0 1e-98 MKRKLVYVCLLAMVAGGMMSFSYDIPEKQETGGREDRVISFPGAEGHGRYTTGGRGGKVY HVISLEDDGSQGTLRWALKQNGPKTIVFDVAGTIHLKSELRTGKDHLTIAGQTSPGGICL ADYGFSINSNNVIIRFLRFRPGEASGKEPDGLGGCDKKDIMVDHCSVSWSVDECLSVYGM ENSTVQWCIGSEALRKATHVKGAHGYGGNWGGHKASYHHNLIAHCESRVPRLGPRPSTLA LGECVDIRNNVFYNWAGNGCYGGEDQHVNIVNNYYKPGPATKQASKQVQYRIAKVGVYPQ AYVYVDGEPKKNLAFQPYLQKWGTFYIDGNKIEGNNKVTADNWTDGVYAQLKNDEKVDFL WTEDAKESIRLKEPLDFGVITTYSADKAYEQVMNYAGCCNYRDEVDKRIISDTRKGTATF TGEGNKPGFINSPKDTGDSPWPELSIAGLSAPVDSDGDGIPDEWEIKNGLDPNNPVDGNT KTKDPNGQYTNLEVYLNSLVQHIVDAQNK >gi|225935362|gb|ACGA01000030.1| GENE 119 200569 - 201387 442 272 aa, chain - ## HITS:1 COG:no KEGG:BT_4118 NR:ns ## KEGG: BT_4118 # Name: not_defined # Def: two-component system response regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 272 1 270 270 380 73.0 1e-104 MKPHPIIDYPFRWIPMLVLAFILVATQVALVCGYTGNDYLPALVDGVATIGWLAAIAYLA WFVVGLVSLFQTDIIMIIVGSLLWLAGSFMICDIMVRIVGISYIPFVQTIPFRLLFGLPT LIAITLWYRLIVTKEEVQNQELDKELAAHQLTVTEQQGEPQTELIDRITVKDGSRIHLVK TDELIYIQACGDYVMLITPTGEYLKEQTMKYFETHLSPDTFVRVHRSTIVNVTQISRVEL FGKETYQLLLKTGVKLKVSLSGYRLLKERLGI >gi|225935362|gb|ACGA01000030.1| GENE 120 201391 - 202185 519 264 aa, chain - ## HITS:1 COG:no KEGG:BT_4117 NR:ns ## KEGG: BT_4117 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 419 77.0 1e-116 
MEEKQNFPRATGFSGKILIALLFILSGILLFARNMGWITSEVFDLIVSWHSLLILLGIYA MIRRHFIGGIILFLAGVYFLIGGLSWLPENSQAMVWPLALIIAGVLFIFKPGRNKRAQWA RQHTMEHRRQWMKMHQGRPGMNFESEQQQSESVDGFLRSENVWGAARHVVLDELFKGAMI RTSFGGTTIDLRHTHIAPGETYIDLDCSWGGVEIYVPSDWTVVFKCNAFFGGCDDKRWQN GNVNKENILVIRGTLSFGGLEVKD >gi|225935362|gb|ACGA01000030.1| GENE 121 202541 - 205789 2988 1082 aa, chain + ## HITS:1 COG:no KEGG:BT_4114 NR:ns ## KEGG: BT_4114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1082 1 1071 1071 831 44.0 0 MSNKVKNMRSLLLILFSAISLSVSAQTITVKGNVKDTSGEPVIGASVVEKGNTTNGTITD LDGNFSIKVDGKKTLVISYIGMKTQEVAVQGRKTINVQMVDDSKALDEVVVIGYGTVNKR DLTGSVASVSAKDLAAIPVASASEALTGKLAGVSVTTTEGSPDADIKIRVRGGGSLSQDN SPLYIVDGFPVSSISDIAPSEIQSIDVLKDASSTAIYGARGANGVIIITTKSGKEGKTQV DFGASFGFKQVTKLTEVLSPYDFVAYQREIGSLDYGNFADMDIWRSIDGTDYQDEMFGRT GNQQQYNINVSGGTKQMTYSVSYAHNEEKSIMLNSGFKKDNVNAKIKSELNKWMTLDFNA RLSYSTIDGLGGGADTNESNAANSTVANATVFRPVDSLKYSDDDEENSSAQQKSPLERLL ATDKTRNTFNQNYNVGLNWKPFKNWTFRSEFGYGWKFDDTEQYWGVDAVSNSKYGYNGQP QAYLLREKTMSWRNANTLTYDNKKLFKGRDKLNVLIGHEVSSSQRKSIENVSVAFPVTMN FDDMKANMGSGKALANQSTIAAKENILSFFGRVNYTMMDKYLLAVTVRADGSSKFGSGNR WGVFPSAALAWRISDEAFMSNTQDWLSALKLRLSFGTAGNNRINSGLLSTTYSLGGNDAR NPFFNGESTTMLEHGTNLYNPDLKWETTVTRNIGIDYGFWNNRISGAIDFYWNTTRDLLM RTEIPSLSGYNYQYKNFGQTSNKGVELSVSAVLFDKKNFSLNFNANIAYNRNRIDKLNTD SPWQSSNWSGSTMAKYEDFRVEKGGRLGEVWGYKTNGYYTVYDPVTNPTGELVWAGSEWG LKDGMQDNSPTITGGKYYPGGLKLECDKDGNPLKQRLGNTIAPTTGGFGFDGRVGNFDFN VFFNYSLGNVIVNGTKLAASFRSGSRTGYNLNNDFRLSNRYTWIDPETGLNLSSSSTDVL NTYGDMTTAGLRLNEINANANMYNPASATTMQLTDYAVEKASFLRLNNITIGYSLPKTIV RRAFMQNVRIYLTGYNLFCWTNYSGADPEVDTSSKKNAMTPGIDYAAYPKSRTFVGGINV TF >gi|225935362|gb|ACGA01000030.1| GENE 122 205815 - 207722 2102 635 aa, chain + ## HITS:1 COG:no KEGG:Phep_0771 NR:ns ## KEGG: Phep_0771 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 6 564 11 531 591 219 32.0 4e-55 MKIYKTLFLGLGLIALSSCSDFLNQTSPSELDNESTFNNAYYTELALNKVYGSLTQDQTY 
SNFLPIIAGTNTDCELIDGLGTDASNTSSERGNMNYNANPGWSQLSKVWDAMYGVIENAN LVVDGINNSQLIQQAGATRTSMMRFRAEAMTLRAMIYFDLIRLFGDIPFKTESSNSDLSN VYIGKADRDDIMDELIIELEEAIGYLPWAGEDGYTTEHVSKGYAHALLANIAMTRAGYAI REQAKDGYITGDNSDATYPTQRCSDTKRRELFELAEKHLAAVVSSGKHKLNPSVEEYWRL INIGQLDQTYQENLFEIPMGLNKSGELGYTIGYRINGASSLFGPKGNSSGKLKLTAPYYL SFGEGDIRRDLTCAISQLSTDKNTKVFKEYMLGNAPFGLYCGKWDYRKMMENSEWYAAVL ASDQKVCSGINVVKMRYPQVLLMYAEVVNELYGKGATAEGCTLTATAALKEVHDRAFTDA TKRDAAWTALMGKDFFDAIVDENAWELAGEGVRKFDLIRWNLLSEKIDEFKNEYTNAVYN GTYPQYVNYKYRTDNPMYIDMTSLVFGAKVGGEYENKSFFGAETSDKEQKNLKVNLPSIS GGLNKAVKNRYLLPIASTTISTSNGKLHNSYGYSD >gi|225935362|gb|ACGA01000030.1| GENE 123 207741 - 209615 1404 624 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171762|ref|ZP_05758174.1| ## NR: gi|260171762|ref|ZP_05758174.1| Fibronectin type III domain protein [Bacteroides sp. D2] # 1 624 1 624 624 1151 100.0 0 MKNIKNILGMGAFMLLASLAVSSCTEKSDWDIDSSYSRPFGTDENGISVETDSKIARAVV TWSSTSNTDYYIIEISPNEMTDETPMGSEENGNIVYGNDPANRIKQSPYTMDNLAVNTTY YMRIKSISGEKESRWVNYKKTFASVKEEAILNIPTTEDLPEGQGKVRMSWEAGLAVDHFE IMETGATEATSRVISSTEAAAGEAWVENLKSFTEYTITIYNGNNPRGSQTVTIPGLEIES TISDITANSAVFSWEETVDVDEYACVLSTEGVPESGTQLSPADIAAHKVTITGLASSTEY TAYAFANGSICSRITFTTKKGKPTGYTEMTWEDALANWDNLSGKVLINVSGTEGFAQEKE SIAAGVTHLIFWGDSQDGQVNMTIKKGVGASGICDKVEFHNLNITDEGNTTLIYQNGASG CIKEIEVTSCTITNIRGIVRMNASTSNAMSVTIDDCIIKGLGRAATSNHYGLLLSDKVTL TTLNLVVSNTSIIVSKGASASQFIRHKSGQTGTITIKDCTFYDMSASDAFCRDTKDMTMT ISNTLFAKGGVKPFYNTSSVATTLNVNGLYKASDFAFGATDWGKDYTSLPLTSDQLFPNG SSEDLTFGADVPEEYRVGDQRWNK >gi|225935362|gb|ACGA01000030.1| GENE 124 209700 - 211328 1155 542 aa, chain + ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 141 467 2 318 321 77 27.0 6e-14 MNTNSINTISKYLLLFLLILTGASCNDNDDAEDTSIPVLISQNINDGDVVGPSGYVELTF SKAMRQAPDTEIYFNGGVVRVSINYEKVRYTFSGMENKECTFEVPAGALTDMQGRAYDED FFLSFTAKSEISGGGKVFDAIVDSKGNGDYTTLQAAINAITTPPTSPYKIFIANGTYNEC 
VRINKNKPFVHLIGESRDGVKIQFAVNRVDDSSNATSWPYSIFNENSPARKAGYSEEQNT VVLIEATDFYAENISIINLYGAFSNRHTGGLGKNGQAEALINREDRFALNNCLLVSYQDT WWTRYWNNTTPHRAYVYNSWIEGHTDYIWGSGDVLIENSTFYNTGNDGGSVITASRTSES DKYGYVIKDCTVNGDDTKFSFGRSQATTTKTVWINTKLKMDIIDSHWGYGGQIPTLYAEY NTIDKNGNMIAESKTITSGNVSFTSSVLTASEAAKYTYENIITIDSWNPKEYMETPLAAP TNVNLSGNTLTWDAVSGAAGYLIFMNGNYAGQTTDTTVTLTNTDESNIYTVKTVSQYGTV SE >gi|225935362|gb|ACGA01000030.1| GENE 125 211422 - 212999 1408 525 aa, chain + ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 522 20 491 497 548 59.0 1e-154 MLKSIKMGLLCFCVINLSVACSSEDPSITDETPGQNTGNGGNENEGGEEETPAETRYAFP SAYGAGRYTTGGAGGEVYTVTSLEDNTTQGTLRYALNRTGKRTIVFAVSGLIELKSPLKI TNGDVTIAGQSAPGDGICLKGHPVSVQADNVIIRFMRFRMGSDNFTTEAEADSGDALWGK QHKNIIIDHCSMSWSTDECASFYDNTNFTMQWCIISESLNRSVHTKGNHGYGGIWGGSPA TFHHNLLAHHSSRTPRLCGSRYTGKPENEKVDLRNNVFYNWGPTNGGYAGEGGSYNFVNN YYKPGPVTNTKKNIVNRIFQPNGDDGTNKNTKGIWGTFYLKGNYFDGTCPELKAEYQSLL TSVNNDNWQGLHPNATEAVPLPDGGEKALQSSNEFTISEDASEFTQSAKEAYESVLKYAG ASLKYDDVDKRIIANVRNGDYTADGSNGSEKGLIDKASDVGGWPEYKKETGPKDTDGDGI PDEWETANGLNPKSKADGAKYTLSKTYTNLEVYLNSLVETLYLNK >gi|225935362|gb|ACGA01000030.1| GENE 126 213031 - 214692 1266 553 aa, chain + ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 541 20 490 497 288 38.0 3e-76 MKQKISLFYVAIAFSIAGLCTSCSSDDPVDDTGGGTNNPGGTSSDVKQLDYGELLAFPYA EGHGRNTTGGRGGKVYHVTSLEDDASGSISGSLRWAMKQDGPKTIVFDVSGTIYLKSELK TQKDDLTIAGQTSPGGICIANYPFTINSSNIIIRFIRFRPGNSNVDCDGLGGCDKQNVII DHCSVSWGSDECLSVYGMQNSTVQWCLAYQALRVTDVKINAATGKFASHGYGGNWGGNYA SYHHNLIAHCESRVPRLGPRYTTLALNNNDGERVDMRNNVYYNWGGEGCYGGEAQHVNIV NNYYKPGPGTDESGKADRAYRIAKPDVYPENYSGEAYKKWLQTWGKFYIDGNKVVGNTTV SNDNWTKGVFEQMDEKNCATTELWNQHTQIKSNSPVVKAGNVRTHTPDEAYERVMSYAGA SNYRDKVDELIISDVKNRKASCTGDASKWEGLSGYSQNKSGYINDPKDVCTALGVNDPYD VLKSVTNANVKDTDGDGIPDYWEEEYGLNPKKSADGKETTIDKNGKYTNLEMYLNSLVHE 
IMVNGATGGSVIE >gi|225935362|gb|ACGA01000030.1| GENE 127 214728 - 216224 1366 498 aa, chain + ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 496 3 492 497 853 83.0 0 MKKNILAICSLILAFPLVGQAMQPAISVEKTSKDYPTPDRSKVLAFPGADGAGKYTTGGA GGAVYTVTSLADDGSEGTLRWAISKKGPRTIVFAVSGIIELQKALKLSNGDVTIAGQTAP GDGICLKNYTFSIQADNVIIRFIRSRMGVDIKQKGDDAMNGTKAHQNIIIDHCSMSWCTD ECATFYDNRNFTLQWCIISESLANSIHEKGAHGYGGIWGGQPATFHHNLLAHHTNRTPRL CGSRYTGRPEDEKVELFNNVIYNYGSDGAYAGEGGSYNFINNYYKPGPFSATKGSFKRLF TAYADDGKNNNKAGVHGVFYFKGNYMDPTCPKLTDKQKEALYKVNMDNSYGLVIKKDFAT EKEVLSKKAFDIAEHTSLQPAKKAYKDVLEFAGASYRRDAIDQRIVDETLKGTYTYEGSH GSTNGMIDQPSDVGGWPEYKSETALVDTDGDGIPDEWEKKHNLNPNDPSDGAKYTLSPEY TNLEMYMNSLVNHLYPKK >gi|225935362|gb|ACGA01000030.1| GENE 128 216369 - 220871 3665 1500 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 933 1160 56 287 318 142 35.0 5e-33 MLAIVFSTHAQQHCFFTHYSTEDGLSQNTVMNILQDHKDNLWFATWDGINRFNGYTFKTY KARQGNYISLTNNRVDRIYEDRYGFLWLLTYDNRVHRFDPKTETFEQVPAAGEEGSAFNV HTIEVMPNGTVWLLTENDGAIRILTHPNENNRLTWDIYSSKTELFPSLHVFKVHQDKAGN DWLLTDNGLGMIRPGKKEPDSYFTETKGKFGGMNQAFYAVQERDKDICFASDHGRIWSYQ KDSGEFTLIELPTKGQITSIHPVAPDVSVITTDSDGFFTYNLRTKTNVHYSFLTCKALPA KPILSAYVDRSSEVWFEQEEPGVVAHFNPSTGVVTREQILIEYSNPERSRPAFHIHEDVN GYLWVHPYGGGFSYFDPQKKRLVPFYNGLGSRDWRFSNKIHSAFSDKQGNLWLCTHSKGL EKVTYRNVPFAMMTPMPHEHESLHNEVRALCEDKLGNLWVGLKDGMLRMYDADKTYKGYL TESGIVSTTGTPMLGTVYFVIQDSKGIIWIATKGDGLVRAEPTSSNGMSYKLTRYLHREE DMYSLSDNNVYCIYEDHYGRIWVATFAGGINYITQNEAGKTLFINHRNNLKGYPIDPCYK ARFITSDNNGLLWVGTTTGAVAFDENFKKPEDVQFYHFSRMPNDTQSLSNNDVHWIISTK KKELYLATFGGGLNKLISISKDGHGEFKSYSVLDGLPSDVLLSAREDSKENLWISTENGI CKFIPSEERFESYDERSITFPVRFSEAASTLTAKGSMLFGASGGVFIFNPDSIRKSSYIP PIVFSKLMVTNEDITPGDNSLLKVDIDDTDPLVLSHKENIFSVHFAALDYTNPQNIQYAY 
ILDGFEKQWTFADKQRSVTYTNLPKGEYVLRVRSTNSDGVWVDNERILNIVILPSFWETP VAYVLYVLFILIIILVAVYILFTIYRLKHEVSVEQQISDIKLRFFTNISHELRTPLTLIA GPVEQVLKNDKLPADAREQLVVVERNTNRMLRLVNQILDFRKIQNKKMKMQVQRVDIVPF VRKVMENFEAVAEEHRIDFLFQTEKEHLYLWVDADKLEKIVFNLLSNAFKYTPNGKMITM FIREDEKMVSIGVQDQGIGIAENKKKSLFVRFENLVDKNLFNQASTGIGLSLVKELVEMH KATISVDSHLGEGSCFKVDFLKGKEHYDKEAEFILEDADAPARMGQVVDIANSSIQSETL VSDDSEKIDDVYGEEFAKEENSKELMLLVEDNQELREFLRSIFSPMYRVVEAADGKEGAS KALKYLPDIIISDVMMPEKDGIEMTRELRADMTTSHIPIILLTAKTTIESKLEGLEYGAD DYITKPFSATYLQARVENLLMQRKKLQSFYRDSLMHINISTGQEEVPVATDMPSAEEDVS ETPPTTLEMSPNDRKFMDKLVELMEQNMDNGELVVDDLVRELAVSRSVFFKKLKTLTGLA PIEFIKEMRIKRATQLIETGEFNMTQISYMVGINDPRYFSKCFKAQVGMTPTEYRDRVGR >gi|225935362|gb|ACGA01000030.1| GENE 129 221070 - 222809 1579 579 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 278 568 2 321 321 246 43.0 9e-65 MKTQRHKMWGLLAAVFILFSAFRADKPVITIFMIGDSTMANKKIDGGNPERGWGMVLPGF FSEDVRIDNHAANGRSSKSFISEGRWEKVISKVKKGDYVFIQFGHNDEKADSTRHTDPGT TFDEILRRYVNETRAKGGIPVLFNSIVRRNFVQPKDDVIAKDVRRTPGEKEQPKEGTVLF DTHGAYLDAPRNVAKELGVTFIDMNKITHDLVQGLGPIESKKLFMFVEPNQVPAFPKGRE DNTHLNVYGARTIAGLAVDAIGKEIPELAKYIRHNDYVVAQDGSGDFFTVQEAINAVPDF RKEVRTTILVRKGTYKEKLIIPESKINISLIGEEGVVLTYDGFANKKNVFGENMGTSGSS SCYIYAPDFYAENITFENSSGPVGQAVACFVSADRAFFKNCRFLGFQDTLYTYGKQSRQY YEDCYIEGTVDFIFGWSTAVFNRCHIHSKRDGYVTAPSTDKGKKYGYVFYDCRLTADAEA TKVYLSRPWRPYAQAVFIRCELGKHILPVGWNNWGKKENEKTVFYAEYESWGEGANPKAR AAFSQQLKNLQGYEITTVLAGEDGWNPVADGNKLITVKR >gi|225935362|gb|ACGA01000030.1| GENE 130 222898 - 223869 844 323 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 33 319 1 318 321 257 44.0 2e-68 MKRTILKGMMCLLLLGVGATSVYSQQQQRKDTIIVTRDGTGDYRNIQEAVEAVRAFMDYT 
VTIFIKNGVYKEKLVIPSWVKNVQLVGESAEKTIITYDDHANINKMGTFRTYTVKVEGND ITFKDLTIENNAAPLGQAVALHTEGDRLMFVNCRFLGNQDTIYTGSEGTRLLFTNCYIEG TTDFIFGPSTALFEYCELHSKRDSYITAASTPQSEEFGYVFKNCKLTAAPDVKKVYLGRP WRPYAATVFINCEFGNHIRPEGWHNWKNPENEKTARYAEFGNTGAGADTTGRVTWAKQLT KKEVAKYTPENIFKESSNWYPYK >gi|225935362|gb|ACGA01000030.1| GENE 131 223879 - 225078 1060 399 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 59 397 23 361 361 331 46.0 1e-90 MRQTPFYNFFILALLVTMTVSVSAQQVDSKLPWSVRLTESEMIRYPQSWQLDFQPKLKWD YCHGLELGAMLDVYDTYGDKKIRDYAIAYADTMVHDDGSITAYKLTDYSLDRINSGKILF RIYEQTKNPKYKKALDLLYSQFEGQPRNEDGGFWHKKIYPHQMWLDGIYMGAPFYAEYAF RNNLPQDYADVINQFITCARHTYDPKNGLYRHACDVSRTERWADPVTGQSKHTWGRAMGW YAMALVDALEFIPLHEAGRDSLLDILNNVAVQVKKLQDPKTGGWYQVMDRSGDKGNYVES SCSAMFIYSLFKAVRLGYIDKSYLDVALKGYKGFLDNFIEVDKNGLVTITKACAVAGLGG KVYRSGDYDYYINETIRNNDPKAVGPFIMASLEYERLQK >gi|225935362|gb|ACGA01000030.1| GENE 132 225380 - 226786 1674 468 aa, chain + ## HITS:1 COG:Z4877 KEGG:ns NR:ns ## COG: Z4877 COG3775 # Protein_GI_number: 15804015 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli O157:H7 EDL933 # 4 457 21 458 462 286 36.0 9e-77 MEEVFKYIIGLGAAVMMPVIFTILGVCIGIKFSKALKSGLLVGVGFVGLSVVTALLTSSL GPALSKMVEIYGLELGIFDMGWPSAAAVAYNTSVGAFIIPVCLGVNLLMLLTKTTRTVNI DLWNYWHFAFIGAIVYFASDNIFWGFFAAIICYIITLVMADMTAPAFQKFYDKMEGISIP QPFCQSFVPFAIVINKLLDKIPGFDKLNIDSEGMKKKFGLMGEPLFLGIVIGCGIGALGC ASWKEVLDGIPGILGLGIKMGAVMELIPRITSLFIEGLKPISDATRELIAKKYKNNTGLS IGMSPALVIGHPTTLVVSLLLIPVTIFLAVILPGNRFLPLASLAGMFYLFPMILPITKGN VVKSFIIGLVALIVGLYFVTELAGFFTMAAKDVYAATGDPTVNIPAGFEGGALDFASSLF CWGIFHLTYSLKIIGPAILVALALGMAIYNRIRMTRNDAKNAAMNNKE >gi|225935362|gb|ACGA01000030.1| GENE 133 226811 - 
227731 1106 306 aa, chain + ## HITS:1 COG:BH2166 KEGG:ns NR:ns ## COG: BH2166 COG3717 # Protein_GI_number: 15614729 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Bacillus halodurans # 28 306 6 276 276 281 47.0 9e-76 MKKLAIAMMLGIAAMSASAQVNYKMQVACNPQDVKTYDTNRLRSSFLMEKVMVPNEINVT YSMYDRLIFGGAVPATKELVLETIDPLKAKFFLERRELGVINVGGEGIVTVDGKEYVLKF KDALYVGRGKQKVTFKSKDSSNPAKFYINSATAHKEYKTQLITIDGRKGSLKANSFPAGK MEESNDRVINQLIVNNVLEEGPCQLQMGLTELKPGSVWNTMPAHTHTRRVEAYFYFNVPT GNSICHFMGEPQEERIVWMQNEQAIMAPEWSIHAAAGTSNYMFIWGMAGENLDYGDMDKI KYTEMR >gi|225935362|gb|ACGA01000030.1| GENE 134 227998 - 229515 1542 505 aa, chain + ## HITS:1 COG:CC1508 KEGG:ns NR:ns ## COG: CC1508 COG0477 # Protein_GI_number: 16125755 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 9 430 18 395 431 245 35.0 2e-64 MNAFQKTGEKMTNYRWTICAMLFFATTVNYLDRQVLSLTWDEFIKPEFHWDESHYGTITS VFSIVYAICMLFAGRFVDWMGTKKGFLWAIGVWSAGACLHAVCGVITEAQVGLHSAAELA GATGDVVVTIATVSMYCFLAARCILALGEAGNFPAAIKVTAEYFPKKDRAYATSIFNAGA SIGALIAPLTIPILAKMFGWEMAFIVIGGLGFIWMGFWVFMYDAPSKSKHVNQAELDYIE QDQREAGSAPMADEKDEKRMKFWQCFSYKQTWAFIFGKFTTDGVWWFFLFWTPSYLNTQF GIKTSDPLGMALIFTLYAVTMLSIYGGKLPTIFINRTGMNPYAARMKAMLIFAFFPLVVL LAQPLGTVSPWFPVILIGIGGAAHQSWSANIFSTVGDMFPRTAIASITGIGGMAGGFGSM ILQKVAGNLFVYASGTTMVDGKEVEMTKELLEQGAQFVHPAMTFMGFEGKPAGYFIIFCV CAVAYLIGWIIMKALVPKYKPIVLD >gi|225935362|gb|ACGA01000030.1| GENE 135 229840 - 233310 2437 1156 aa, chain - ## HITS:1 COG:sll1582 KEGG:ns NR:ns ## COG: sll1582 COG1112 # Protein_GI_number: 16329815 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Synechocystis # 207 1132 203 1098 1118 294 29.0 8e-79 MKLEMKLEMKLPCPKSEAIESYEILLAVCRAEDAYLAVGYKQMRDLLERICRGQMQNESL 
QMTDLSARISFVSAKVGLSVTEQNRLHTFRLTSNAILNRQQEPNREQLLRDAKTLAFFIR KLLDEDIPLELYRLLPRADATYLVAPPARERVQRMRVCFQYADEQYLYVTPLDEVSEQPY LVRYNVPQINEEFAETCKLLWRHAQVNLLDVAIDEAGILTPSFIVLEPDYLLDISSLAEC FRDYGHHPANYFLSRLQPIENARPLLLGNIANLFLDEWIHAKSEEIDYRTCMQKAFRRYP IELAACSDLRDKEKERQFFEDCKLHFDHIRETVNDTFHAAGYELDKTDAVLEPSYICEAL GLQGRLDYMQRDMSSFIEMKSGKADEYAIRGKVEPKENNKVQMLLYQAVLQYSMGMDHRK VKAYLLYTRYPLLYPSRPSWAMVRRVIDLRNRIVADEYGIQLRNSLEYTSQKLEEINAST LNERGLKGRFWETYLRPSIDNFQSKLKALSALEKNYFYAVYNFITKELYTSKSGDVDYEG RTGAASLWLSTLAEKCEAGEIIYDLKIKENHAADEYKAGLTLTAGSEMLHAETFLPNFRQ GDAIILYERNCDTDNVTNKMVFKGNIEYLTENEIGIRLRATQQNPSVLPAESLYAIEHDT MDTTFRSMYQGLYAYLSARKERRDLLLSQRPPRFDKSLDSMIFCSEDDFTRVALKAKAAQ DYFLLIGPPGTGKTSCALKKMVETFHADKDAQILLLSYTNRAVDEICKSLASIAPAVDFI RVGSELSCDEAYRGHLIENELSSCNRRSEVYERIRNCRIIVGTVAAISGKPELFRLKHFD VAIIDEATQILEPQLLGILCARGEDEKDAIDKFVLIGDHKQLPAVVQQNVEQAAIYDESL LSIGLSNLKDSLFERLYRNCTAACSSSAIHRSYDMLCRQGRMHPEVALFANRAFYGGRLI PVGLPHQIEDSDTICRLAFYPSVPEKAGASAKINYSEARIVADLAVRIYEHHQSDFDESR TLGIITPYRSQIALIKKEIESVGIPALNRILVDTVERFQGSERDVIIYSFCVNYPYQLKF LSNLTEEEGVLIDRKLNVALTRARKQMFITGVPELLERNPLYKSLLKLIEGSWKIASFLQ MLVFILSLCNSLTPLY >gi|225935362|gb|ACGA01000030.1| GENE 136 233261 - 234109 658 282 aa, chain - ## HITS:1 COG:DR0806 KEGG:ns NR:ns ## COG: DR0806 COG0805 # Protein_GI_number: 15805832 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Deinococcus radiodurans # 1 254 19 251 270 131 35.0 1e-30 MAEMTFWDHLDELRKVLFRVVGVWFVLAIGYFIAMPYLFDHVILAPCHNDFIFYDLLRHI GKTFDLTDDFFTQQFYVKLVNINLAAPFFIHMSTAFWMSVVTAMPYIFFEIWRFINPALY PNERKGVRKALTIGTVMFFIGVLLGYFMVYPLTLRFLSTYQLSSEVENILSLNSYIDNFM MLVLCMGLAFELPLVTWLLSLLGVVNKSFLRRYRRHAVVVIVIAAAIITPTGDPFTLSVV AIPLYLLYEMSILMIKDKKKTEEEVEDEAGDEAGDEVALSEE >gi|225935362|gb|ACGA01000030.1| GENE 137 234127 - 234348 316 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4102 NR:ns ## KEGG: BT_4102 # Name: not_defined # Def: putative sec-independent protein translocase # Organism: 
B.thetaiotaomicron # Pathway: Protein export [PATH:bth03060]; Bacterial secretion system [PATH:bth03070] # 1 72 1 72 73 99 91.0 3e-20 MTNLLLLGFLPSGSEWIIIALVILLLFGGKKIPELMRGLGKGVKSFKDGVNEAKDEINKA KDELDKPVDPSKN >gi|225935362|gb|ACGA01000030.1| GENE 138 234623 - 236788 856 721 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 28 665 47 686 737 142 24.0 3e-33 MEKCYGWMLLLWLTNFSGSVLGQTDTIADQKIQEVVVIAKLPPVEMKPGKMTYHVDATIT QSQGSLFDVLSSLPGVILQSDGSVYLNGKTGANIMMDSKLTYLSGHDLVSFLKATPAGMI DKIDLISYPSTRYDASGNAGIIDIRTKKIMRKGMNLSLNGSYEQWRYGTGFGTASLNMRT NKFNFYLSYSYFQSRTRHKFIANRNIFPPSWQEQEAIRIEQDSYRFRRNCSNYYRVGVDY QISDKTIIGFSTNGNLWNDLELGDVSSLVASPKTEWKSPLRTLNNTHRSRNNFTAVLSLT HTFNPDGGTLDASADCLRYRFREDQLLHSKDTLRGIMNGSIHLYSGQANLVWPFSKTFTL RTGVKTTFVSIGNGADNNRLQGSLWNPAREISNDFKYDENINAAYTQLDVNLASFQIEAG LRMENTQIEGMQSGSSSQRDSTFKNQYLHLFPTLSLQYRLRNGNSFVLTYGKRIIRPNYR DLNPFIYIFDDYTYEQGNTMLQPELTDNVELAYIHKDLFRVGLQFSYTRDAIIKSYLDKG SYRVYVTPENLSSHLSVGTQISTSQLSITSFWDLNLSASFTYNRYRLPDNYQTDVNKRLT ASTNLSNQFSLSKNWSVELSGFYNGKMAMGQAIISPLWQVNAGIQKKIWNGKGTIQLFAR DVFHSNVSKISIQAPNLLGHIKEWQDNAVIGISVSYRLNRGLEVKESHRKNSINESKRIN L >gi|225935362|gb|ACGA01000030.1| GENE 139 236807 - 237835 353 342 aa, chain + ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium meliloti # 135 338 171 377 383 91 32.0 2e-18 MKLLYRLFSPMILGIVMFNCIRLTTDLLRDGTLWPGSMRYHLSALLVTITMCYVFDFENR YFLFRIKPQLQLSAVTEYALMILKVLVVLNLVLFVGQRTELLYLGNLLTDYVVVNVICVP FFLIYYIILRSNQIEADYNHQTLQLEKVRSEQLEMELKFLKSQYHPHFLFNTLNTVYFQI DEKNLLPRRTLEMISDLLRYQLYGGNQMVSVQEEIDYIRTYIDLQKMRMSERLQLYVELP SGLEYTTIYPLLFLPLIENAFKYVGGAYRLGIRMTWEPDKISFHIRNSVPDIPISIDPVK KGIGLENLRRRLALLYPGKYVLETSLLDNEYTAQLILKINES 
>gi|225935362|gb|ACGA01000030.1| GENE 140 237832 - 238527 424 231 aa, chain + ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 2 218 6 234 244 102 30.0 9e-22 MIITCVITDDEPIARKGLQSYVEKVDFLSLTGVCEDAMQLNTLLKTQRPDLLLLDIEMPY LSGLDLLGTLNNPPKVIVTSAYERYALKGYELDVADYLLKPISFERFLKAVNKVHNLLQK ESLPIQEDFLFIKSDKQMRKVFLKDILFVEALENYVSIYTTSGKILTHSTLKRIGESLPE GNFLQTHKSYIVNIDLIDLLEGNMLRIGTFQVPIARNYREEVFKRVLRNTL >gi|225935362|gb|ACGA01000030.1| GENE 141 238529 - 241069 2135 846 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 488 846 11 376 386 207 36.0 5e-53 MSYAIESIAKSIGARRMGKHKATIDWLLTDSRSLSFPEETLFFALTTKRNSGVRYIPELY DRGVRNFVITEEDFKLIENGELKMEKSGQDDDAQPILNSQLSTLNFLIVPNPLKALQKLA EAHRDKFKIPVIGITGSNGKTIVKEWLHQLLSPDRCIVRSPRSYNSQIGVPLSVWQLSEE AELGIFEAGISEMGEMGALKRMIKPTIGILTNIGGAHQENFFSLQEKCMEKLTLFKDCDV VIYNGDNELISNCVAKSMLTAREIAWSRRDIERPLYISRVIKKEDHTVISYRYLEMDNTF CIPFIDDASIENVLNCLAACLYLMTPADQITERMARLEPIAMRLEVKEGKNNCVLINDSY NSDLASLDIALDFLVRRSEKKGLKRTLILSDILETGQSTATLYRRVAQLVQSKGINKLIG VGQEISSCSARFDDDLERYFFPNTEALLTSGILKSLHSEVILVKGSRVFNFDLVSEELEL KVHETILEVNLGAMVANLNHYRSMLRHSETKVICMVKASAYGAGSYEIAKSLQEHHVDYL AVAVADEGSELRKAGITASIIIMDPELTAFKTMFDYKLEPEVYNFHLLDALIKAAEKEGI TNFPIHVKLDTGMHRLGFAVEDIPLLIRRLKNQSAVIPRSVFSHFVGSDSPQFDAFTREQ IELFEKGSQELQAAFSHKILRHICNTAGIERFPDAQFDMVRLGIGLYGVSPIDNSIIHNV STLKTTILQIRDVPQEDTVGYSRMGHLARPSRIAAIPIGYADGLNRHLGRGNAYCLVNGK KAPYVGNICMDVCMIDVTDIDCREGDQAIIFGDDLPITVLSDKLDTIPYEVLTSISTRVK RVYYQD >gi|225935362|gb|ACGA01000030.1| GENE 142 241045 - 242064 384 339 aa, chain - ## HITS:1 COG:no KEGG:BT_4100 NR:ns ## KEGG: BT_4100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 11 339 1 329 329 522 76.0 1e-147 MKRNEMMENRMNFQTSVELPAGMPPVSHADRILLMGSCFAENIGRQLMDAGFQLDLNPFG ILYNPLSVSSALREIMGNKEYTSQDLFAYKDLWHSPMHHGSFSAFTPEETLHAINARLHH AHQRLPELNWLMVTLGTAYVYKQKESGQVVANCHQLPENHFLRYRLTVEEIVEDYTALIT GMAACNPELKWLFTVSPIRHIRDGMHANQLSKSTLLLAIDRLQQLFPERVFYFPSYEIIL DELRDYRFYADDMLHPSPLAIRYLWERFSETFFSAETKQVIMAVQDIRRDLAHKPFHPES EAYQRFLGQIVLKIERLIGKYPYLDFQKETELCHMRLNP >gi|225935362|gb|ACGA01000030.1| GENE 143 242232 - 244136 1676 634 aa, chain + ## HITS:1 COG:PM0532 KEGG:ns NR:ns ## COG: PM0532 COG1154 # Protein_GI_number: 15602397 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Pasteurella multocida # 7 633 4 614 614 551 43.0 1e-156 MKNEPIYNLLNTINYPDDLRRLDVEQLPEVCSELRQDIIKELCCNPGHFAASLGTVELTV ALHYVYNTPYDRIVWDVGHQAYGHKILTGRREAFSTNRKLGGIRPFPSPEESEYDTFTCG HASNSISAALGMAVAAARKGDAKRHVVAIIGDGSMSGGLAFEGLNNASATSNNLLIILND NDMAIDRSVGGMKQYLVNLTTSNRYNQLRFKLSRMLFKLGILNEERRKALIRFGNSLKSM AAQQQNIFEGMNIRYFGPIDGHDVKNLARVLRDIKDMQGPKILHLHTIKGKGFGPAEKHA TEWHAPGKFDPVTGERFIANTEGMPPLFQDVFGNTLVELAEANPKIVGVTPAMPSGCSMN ILMEKMPKRAFDVGIAEGHAVTFSGGMAKDGLQPFCNIYSSFMQRAHDNIIHDVAIQNLP VVLCLDRAGLVGEDGPTHHGAFDMACLRPIPNLTISSPMDEHELRRLMYTAQLPDKGPFV IRYPRGRGVLVDWKCPLEEIPVGKGRKLKEGNDLAVITIGPIGNIAARAITRAEADFGLS IAHYDLRFLKPLDEELLHEVGRKFQRVLTIEDGIIKGGMGSAILEFMADNEYKPTVKRIG IPNLFVEHGSVAELYQLCGMDEEGILTKIKEFIN >gi|225935362|gb|ACGA01000030.1| GENE 144 244138 - 245478 1496 446 aa, chain + ## HITS:1 COG:PA0016 KEGG:ns NR:ns ## COG: PA0016 COG0569 # Protein_GI_number: 15595214 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pseudomonas aeruginosa # 1 445 1 450 457 222 32.0 1e-57 MKIIIAGAGNVGTHLAKLLSREKQDIILMDDDEEKLSALSANFDLLTVAASPSSISGLKE VGVKEADLFIAVTPDESRNMTACMLATNLGAKKTVARIDNYEYLLPKNKEFFQKLGVDSL IYPEMLAAKEIVSSMRMSWVRQWWEFCGGSLVLIGTKMREKAEILNIPLHQLGGPNIPYH 
VVAIKRGTETIIPRGDDVVKLHDIVYFTTTRKYIPYIRKIAGKEDYADVRNVMIMGGSRI AVRTAQYVPDYMQVKIVDNDLNRCNRLTELLDDKTMIINGDGRDMDLLIEEGLKNTEAFV ALTGNSETNILACLAAKRMGVEKTVAEVENIDYIGMAESLDIGTVINKKMIAASHIYQMM LDADVSNVKCLTFANADVAEFTVPEGAKITKHLIKDLGLPKGTTIGGMIRNGEGILVTGD TQIRPGDHVVVFCLSMMIKKIEKFFN >gi|225935362|gb|ACGA01000030.1| GENE 145 245578 - 247029 890 483 aa, chain + ## HITS:1 COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans str.C2A # 2 482 1 476 476 301 38.0 2e-81 MINSKMIFRILGFLLLIETAMLLCCGAVSLFYKENDLQSFLISSAVTACISFLLLAIGKD AVKSLNRRDGYVIVSAAWVTFSLFGMLPYYIGGYIPNVADAFFETMSGFSSTGATILNNI ESMPHGILFWRAMTQWIGGLGIVFFTIAVLPIFGVGGIQVFAAEASGPTHDKVHPRIGIT AKWIWGIYAGMTGTLIILLIGGGMSIFDSICHAFATTSTGGFSTKQASIEYYHSPYIDYV ISIFMFLSGINFTLLLLMFNGKIKKFLHDAELKFYFICVSFFTLFIAIWLHQTSSMGVEE AFRKSLFQVISLQTSTGFVTADYMLWPSILWGCLIIVMIIGACAGSTTGGIKCIRMVILF QVVKNEFKHILHPNAILPVRVNKQVISPSIQSTVLAFTFLYVVIVIISVLVMMGLGVGFL ESIGTVISSIGNMGPGLGTCGPAFSWSALPDAAKWLLSFLMLLGRLELFTVLLLFTSDFW KKS >gi|225935362|gb|ACGA01000030.1| GENE 146 247400 - 247966 455 188 aa, chain - ## HITS:1 COG:no KEGG:BT_4096 NR:ns ## KEGG: BT_4096 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 188 294 460 460 273 78.0 3e-72 MIIWQKLFLILWILSLVDGEQKKGQAEATKVIPFMDEMNREVLKVTGLKGDYKLLIDEEE IGTWSGDDLAKGINLAAESKTPQYQQALAVMHLNEYRWEIERSFRDYAWTQFGFFQQKGL LFANDRQAIKVMDENLEKNGWLKGHRDGYTRMMSDAVREAREQEMDVLINKIYEVNKPVV RKILLRKI >gi|225935362|gb|ACGA01000030.1| GENE 147 247857 - 248783 813 308 aa, chain - ## HITS:1 COG:no KEGG:BT_4096 NR:ns ## KEGG: BT_4096 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 460 534 85.0 1e-150 MKKIFVLIAAVCMTYTTAFAQTVKPFKEGERAVFLGNSITDGGHYHSYIWLYYMTRFPNM PLRILNGGIGGDTAYDMNKRLDGDIFPMKPSVLMVTFGMNDSGYFEYNGDNPKEFGEQKY 
QESIKNYQQMEKRFKDLPDTRIVMVGTSPYDETVQLKENVPFKTKNETIKRIVEYQKESA AKNNWEFTDLNAPMVALNQQNQQKDPAFTLCGSDRIHPDNDGHMVMAYLFLKAQGFVGKE VAEVEINANKKQAVKAENCTISNIKKNGKDLSFDYLAEALPYPLDTIARGWGTKKRSGRS YESHSFYG >gi|225935362|gb|ACGA01000030.1| GENE 148 248897 - 250444 907 515 aa, chain - ## HITS:1 COG:CAC3452 KEGG:ns NR:ns ## COG: CAC3452 COG3507 # Protein_GI_number: 15896693 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 29 473 4 484 533 116 27.0 9e-26 MFMKKIIIFLFGICLAPQLVAQEQQYFMNPVIRGDMPDPSMIRIDDTYYATGTSSEWAPF YPVFTSKDMVNWKQVGHIFSKQPSWTSNSFWAPELYYHNNKVYCYYTARQKSTGISYIGV ATSDSPLHEFTDHGPIITYGTEAIDAFIYDDNGQLYISWKAYGLDKRPIELLGSKLSADG LRLEGEPFSLLVDEEEIGMEGQYHFKQGDYYYIVYSAHGCCGPSSNYDVYVARSKNFCGP YEKYAGNPILHGGEGDYISCGHGTAVKTPDGRMFYLCHAYLKGENFYDGRQPVLQEMEVA ADHWVRFKTGNIAVAKQPMPFKTTRQKPLTDFEDRFQDSKLKIDWTWNYPYSDVNIALKK EKLSLSGTPKKDNKQGTALCLRPQSSHYSCETKVIRANESLQGLTLYGDDKNLVTWGVKG NKLQLKMLKDNSESILHESSLTTDKDIYLKIEVEQGCIFNFYESKDGKTWKSVLDTSLKG AFLTRWDRVQRPGLLHCGAEDVPAEFAYFKMRNLK >gi|225935362|gb|ACGA01000030.1| GENE 149 250450 - 251583 975 377 aa, chain - ## HITS:1 COG:PH1107 KEGG:ns NR:ns ## COG: PH1107 COG2152 # Protein_GI_number: 14590938 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus horikoshii # 66 373 16 287 299 155 35.0 1e-37 MKRKLQNIAYLLMAAAFVTSCGEKKQTSEFPDWAWADFQRPEGVNPIISPDTTTIFYCPM RQDSVAWEASDTFNPAATVYDGKVVVLYRAEDNSATGIGSRTSRLGYASSNDGIHFQRMT VPIFYPADDSQKELENPGGCEDPRVAVTEDGLYVMHYTQWNRKQARLAVATSRDLQTWEK HGPAFAKAYNGRFLDEFSKSASILTKLVDGKQVIAQIDGKYWMYWGEKFVNVATSTDLVN WEPMLDENGEFLKVMIPRPGKFDSDLTECGPPAILTDKGILLFYNGKNKSGAEGDTLYIA NSYCAGQALFDSKDPTKLIDRLDKPFYTPESDFEKSGQYPAGTVFIEGLVFHNQKWFLYY GCADSRVAVAVYDPLKK >gi|225935362|gb|ACGA01000030.1| GENE 150 251625 - 253733 2071 702 aa, chain - ## HITS:1 COG:Rv0584 KEGG:ns NR:ns ## COG: Rv0584 COG3537 # Protein_GI_number: 15607724 # Func_class: G Carbohydrate transport and metabolism # Function: Putative 
alpha-1,2-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 16 701 44 768 877 399 35.0 1e-110 MGSLLAQNAGSNYARQVNTLIGTKGVGLTSGYLYPGATYPYGMVQFTPSYFSKRSGFVIN QLSGGGCEHMGNFPTFPVKGKLKMSPDNILNYRINISEEKGHAGYYEAMVQEDIKAKLTV TERTGMANYEYPADQQYGTVIIGGGISATPIDQAAIVITAPNKCEGYAEGGNFCGLRTPY KVYFVAEFDADALESGTWKRDELKPNTTFAEGEYSGVYFTFDVSKKKNIQYKIGVSYVSV ENARENLKAENTGWDFLQIQNQAESKWNDYLGKIEVEGTNPDRTTQFYTHLYRSFIHPNV CSDVNGEYMGADFKVHKSRSKHYTSFSNWDTYRTQIQLLSMLDPEVASDIVISHQLFAEE AGGAYPRWVMANIETGVMQGDPTPILISNAYAFGARNYDPKPIFKIMRKGAEEPGAMSQN VEARPGLKQYLDKGYYNASIQLEYTSADFAIGQFALHAVGDEFASWRYFHFARSWKNLYN PETGWLQSRNPDGSWKPLTEDFRESTYKNYFWMVPYDIAGLIEIIGGKAAAEKRLDEFFT RLDAGYNDAWFASGNEPSFHIPWIYNWVGTPYKAQEIINRVLNEQYSSKIDGLPGNDDLG TMGAWYVFACIGLYPEIPGIGGFTVNTPIFSSVKVHLKKGDIVIKGGSEKNIYIKSMKLN GKPYDSTWIDWDQLNNGGTIDFATSAKPDVKWGTKVVPPSFP >gi|225935362|gb|ACGA01000030.1| GENE 151 253807 - 256089 1937 760 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 28 759 3 716 717 396 33.0 1e-109 MITIKKLLVGIIAVFFPSGMQAQAKDLVQYVNTLQGTDSHFGLSYGNTYPTTGMPYAMHT WSAQTGKNGEGWKYQYAVDAIRGFCQSHQCSPWMSDYAVYSFMPMVGELVVNQDKRATKF SHDNEIAKPHYYKVTLNNGITTEMAPTTRGVHLRFSYPSTGDAYLVIDGYTDMSEIKIDP AKRQISGWVNNQRFVNNSKTFRSYFVVQFDKAFEDYGVWENQKDEIFSKKQEGAGKGYGA YIKFKKGSKVQAKAASSYISAEQAVITLNNELGKDSNLETTKVRGQKTWNELLNRIQVEG GTEEQMKTFYSCLFRANLFSRKFYERKANGEPYYYSPYDGKVYDGYMYTDNGFWDTFRSQ FPLTNILHPTMQGRYMNALLAAQEQCGWLPSWSAPGETGGMLGNHSISLLADAWAKGIRT FDPEKALKAYAHEAMNKGPWGGANGRGFWKEYFELGYVPYPESMGSSAQTMEYAYDDFCG YQLAKMVGNKHYQEVFARQMYNYKNVFDPSIGFMRGKGVDGKWQEPFDPLEWGGPFCEGN AWHYTWSVFHDTEGLINLFGSDEKFTTKIDSVFTLPSTIKPGTYGGVIHEMKEMEIAGMG QYAHGNQPIQHMPYLYSYAGQPWKTQYWVRQIVERLYNSTERGYPGDEDQGGMSSWYILS SLGIYAVCPGTDEYVIGSPLFKKATITLENGNKFVIEANNNNKESVYIQSATLNGRVLDK NFIKYDDIADGGTLRFEMGGQPNKERCTSKYAAPFSLSKE >gi|225935362|gb|ACGA01000030.1| GENE 152 255977 - 256213 81 78 aa, chain - 
## HITS:0 COG:no KEGG:no NR:no MSKEGDVVCYTSFSFLFFEREFSGNYTFIRLNKVNHHNIHINDNYKKTISWDYSSFFSFR DASSSERLGAIRQYVAGH >gi|225935362|gb|ACGA01000030.1| GENE 153 256258 - 258948 1744 896 aa, chain - ## HITS:1 COG:no KEGG:BT_4076 NR:ns ## KEGG: BT_4076 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 894 1 892 892 1441 74.0 0 MFMRNIRILLYVGLIGCFFCQSIFVYSKAKKYSFKLIDLKCEGLVNPLGIDSRIPHFSWK LKGDGLKEGQAYYEIQVSSDSLLLMQDKADLWNSGKIKSGASVMVPYQGRPLSSRSLCYW RVRIWDFKKQVSEWSPVARFSIGLLNKSQVHGEYIGASIEGGQICAPILRKKINIETVET AFLHVNTLGYHEIYLNGKKISEAVLTPAVSHLTKRSLIVTYDVTPYLKKGENDLLIWLGQ GWYKNTTYGAVYDGPLVKAELNVLRNGKWKVLTKTDNSWFGRESGYSDRGTWRALQFVGE KVDGRILPTDLSTQALNQMEWAPVVTVEVPEHIVSPQMCEVNKIHRILSPVSIKKLAKST WLIDMGQVQTGWFEMRMPVLPIGHEVIMEYSDNLTKEGEFDRQGESDIYISGGREGECFR NKFNHHAYRYVRISNLPVKPEGEWMKSLQICGDYQQTATFECSDSDLNAIHNMIQYTMKC LTFSGYMVDCPHLERAGYGGDGNSSTMSLQTMYDVAPTFENWVQTWSDSMREGGSLPHVG PNPGAGGGGPYWCGFFVMAPWRTYVNYNDSRLIEKYYSQMKEWFKYVDKYTVDGLLKRWP DTKYRDWYLGDWLAPVGVDAGIQSSVDLVNNCFISECLGVMSKIAFTLGKKEEAEEYATR KEKLNKLIHQTFYQPNEGIYSTGSQLDMCYPMLVGVVPDDLYNTVKEKMMVVTEQKYKGH IAVGLVGVPILTEWVIRNKKVDFFYQMLKKRDYPGYLYMIDNGATATWEYWSGERSRVHN CYNGIGSWFYQAIGGIRPDEAAPGYRHVYIDPQVPNGLTWARVTKESPYGTIVVNWELKD NLMHLQLSVPAGTTAIVCVPDTAVSCKMNNKKVNIGKKIVPVEKGDYDFVFSLKIQ >gi|225935362|gb|ACGA01000030.1| GENE 154 258975 - 260354 773 459 aa, chain - ## HITS:1 COG:no KEGG:BT_4075 NR:ns ## KEGG: BT_4075 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 457 12 466 467 738 77.0 0 MLICLFYMSSGCLQAQSVIPPFKKGERVVFTGNSITHGGHYHSFIWLYYMTRFPDKPITI MNAGIGGESTWDMKARLDDDVFDRKPTYVTLTFGMNDTGYDIYMKDNANELSEQRIVKSL DSFREIEGRLLAKNKITKVLIGGSPYDETSKFNNFILHQKNNAILKIIDAQRVSAKKNGW GFVDFNEPMRKISLEEQKRDSTFTFCRVDRIHPDNDGQMVMAYLFLKAQGLAGYEVSEFS IDAQHLDVIVHKNCKISKLKRGKKNLTFDYLAYALPYPLDSIPRHGWGNMKSQRDAMQLV PFMKEFNQERFQVHNLEKGIYQLTIDDQYIDNLSSGQLMKGINLADYPTTPQYQQAMKIM YLNEERFEIEKRFREYLWTEYSFLRKEGLLFVDNQKAIDKLQEYLPKDLFLRASYGWYIK 
AMHPEIRKVWKNYMNSIIEAIYKINKPVTHRVKLIRMEQ >gi|225935362|gb|ACGA01000030.1| GENE 155 260548 - 261564 849 338 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4815 NR:ns ## KEGG: Cpin_4815 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 18 338 3 310 310 72 22.0 3e-11 MKLMKKSKHLLLLLVASFLVATCMTIEDIIHPDNAQVDSDINISVKIKIAAETDGNSKLA FGILVPKSWNVANNATLTLTTAAGFAGNVVTNEPMVLVGAGEKNPSDALPWAASFQSKIG VLENTGPVEWVVFKSATTFQINDKVAEQKEVSGTVNIKLHTGGRAIKFFMGYTFCGEAFG FNSEKYPDDYVKASKVLEVTGGDEPMMDFTADPAISFVPATFGFGDIFSIKYNEPHYVTE GGLKGGEVYLMGKVTYEENGVEKEKTVDETSSKTLMDELGDMGQVTSWQKYIYPKEFFGL SKESVILNIKVHFTNKDKSIVILDNETNSDFVVEETCE >gi|225935362|gb|ACGA01000030.1| GENE 156 261576 - 262043 340 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171792|ref|ZP_05758204.1| ## NR: gi|260171792|ref|ZP_05758204.1| hypothetical protein BacD2_08003 [Bacteroides sp. D2] # 1 155 1 155 155 302 100.0 3e-81 MKKNTLIFAIVSLLFSLSGCNTDDRVKMSDNYPVDGDINYGIRQVMQYDEVNINGIDVAS KFPFFETLRLTIHYTDSKISGVTFSNGEVPFSPYNFEVPSGKVDCYLDTDALPNELRITG TDHVIAYYQNGEFSVPFKLDCPSLDYKYTFKSINQ >gi|225935362|gb|ACGA01000030.1| GENE 157 262124 - 263497 703 457 aa, chain - ## HITS:1 COG:no KEGG:Bcav_2704 NR:ns ## KEGG: Bcav_2704 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 140 453 77 387 390 147 30.0 8e-34 MKCNSKYMVFIAFCLCFFFSCSDDEGTKENPFGNGESDTFIKVTNSSNADTLYLSWELTD TKVSFDNYRFELEKPKTIKTVGKDETKFYFTHVPYNEPVSISISLIKKDEVIKNFSTKVK IDGLDKKISGIIIPDQGSVTGGDGMYSIPLPDGRSIFLMGDSYIGTVTNGQRSASDHMYR NTYIVYNKGKVNAIYGANSNKNASAAVPPEFLDEKKWYWPGHGFVDDNKLYIFQTLMYQG GDDMWGFRYETTHVLEYNLPDFELVKTTPIPFKGSEDIHYGMAVLKESDYVYIYAQVDVS NDMDPISEVLVARSTMNSLYTGWEYYTDSGWSTNASAAVKLKGLTSVPVSSQFNVFKLCD KYVLLTQKKTFNSGEIYTFIADTPYGPWRNKQLIFKTYEQDIPHLFTYNAMGHPQFEKDG MILVSYNVNTEVFAEQFSDVTTYRPRFFWVEIDKILK >gi|225935362|gb|ACGA01000030.1| GENE 158 263540 - 265051 944 503 aa, chain - ## HITS:1 COG:no KEGG:BDI_0242 NR:ns ## KEGG: BDI_0242 # Name: not_defined # Def: hypothetical protein # Organism: 
P.distasonis # Pathway: not_defined # 3 503 4 501 504 467 49.0 1e-130 MKKIYSILCGLITLVVIATSCDAFLEEKPKTFLTPELYFQNEGQVVAAVNGLYTFLDDIF DGDIEPGSQTFIFVEYLSGYGERLRGSGTQDLAQANNLNVADNNGYVQKFWETGYKAIEN SNSIIEGIESMEQGILSEDKKDELLGEVYFMRAYYYFNLVRLYGPVPLKLISTSDLSNVE IKLTSIEGVYDQIDKDLTKAGELMEKSAWTNANGRVSKGAVKSLHAKVYLTMAGYPLQKG TEYYQKAYAKANEVYKSKAFRLFDTYEELRTTENTGEHIWSIQREADNAGSPVHGNMLPY PAPAKAISANAVYGGALVPTMLFYNSYPAGDKRKDEQAFYYTEHEALDKSAIVQLGRPYI YKYWDDEAASTGKSGANYTLLRYADVLLMMAEAKAQADGGTTTDADAVDAYYAVRKRALP AEVKPTSVKTNDILKERFWEICFEGQTWYDMLRTRKALHAVTGQIVDMIGYQTPGHGAAF KEADLLLPYPIREKRLNPNLKRD >gi|225935362|gb|ACGA01000030.1| GENE 159 265067 - 268210 1954 1047 aa, chain - ## HITS:1 COG:no KEGG:BDI_0241 NR:ns ## KEGG: BDI_0241 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 29 1047 108 1104 1104 808 44.0 0 MKRRQRADAFSAFPGIVKMMLFLTLLLWSGIQAKASSDNILQQAEKTISGLVTDESGEPL IGVTVMVKGTRIGTTTNLDGVFTLKIPTNAGTLQFSYMGMKTMELPIGNKSQFKVAMESD AVLIDDVVVIGYGTRSKKDMSISVSSVKGDELNNRPSAFNIMQSVAGKVAGVQNISMSGR PGGSSSLRVRGMGSINAGKDPIYVLDGVIGVDPDIINSANVESIDILKDAAATAMYGAQG SNGVVLITTKKGKKGKGTITYDGKVGFGFMNRKLDMLNADEYMEVQKRAYAYSGQTMPHL TTPYENLFYYAKDDAGNYQYDDKGLLIASPKYNTDWQKAVTQTAITNDHILSFSSGNEDT SIYASIGYQNQEGLIKESEYERYSATLNVRTKIKDWFRIQLVGTVGSQKGNNNDMEGSFN QGAIRNMIEMPPIVPIKYDDGTWGRKHDYPLSETAENPLLLLQDRKNEWKSNFSLFSLNA TFDITKKLTFTTQGDYQVSNRKEMSYAKAGLFDVSENNGGYADIKNMDTQKLTTENYFTY TDTFFDGKLSSNFVLGASWYYNHAEESSSGSEQYFDDSFDYHNLSAGTTWHEPKSGMNQN TMNSYYFRMNHSFLDRYLLGFTFRADGASNFGTNNKYGYFPSVSAGWRISEEKFFSPVKD VLSQMKLRASYGIVGNASIPNYRTISQYSNGSTVFNKSLNSYVVLSNLGNKDLKWESSRQ FNIGVDMSFWDNRLEVIMDYYHKSTKDLLFEKQVPFTTGYTTTWTNLGEIVNKGFEATVT SRNIHTRDFMWVTDLVFSTNKLVVADINGETIDTGNNTIAREGEKWASYYVYKRLGTWGL AEVAEAAKYGKKPGDIKYEDVNGDYKIDEEDRQIMGSGTPKGSLTMVNTFTYKGFSLMVD LNYTYGFKIMDITNSMLENRPLYSNNVKSILNAWTPENQNTMIAALRLPSDNYFGENEKD SYMLHKGDFLRLRNISLSYTFTPKVIKALKIFDSLSVGVSAENLLVITKYPGYDPEVGAF NADSGQSIDFYAYPRPTTITGNIKVVF >gi|225935362|gb|ACGA01000030.1| GENE 160 268330 - 270669 1794 779 aa, chain - ## 
HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 26 778 42 770 790 342 30.0 2e-93 MYKATANLLIACSFLWMIACSSPNAATADLIPTDYVNPFIGASTSVGAAGVYHGLGKTFP GATTPYGMVQVSPNTITGGDNGSGYSDEHKTIEGFAFTQMSGVGWFGDLGNFLVMPTTGK LLKVAGKENNDSIEGYRSAYNKVTETAKAGYYSVELTDYKIKVETSATPHCGILRFTYPS NNQSRIQIDLARRVGGTSTSQYVKVIDDYTIQGWMKCTPDGGGWGNGEGKADYTVYYYAQ FSKPLIDYGFWSADIPDHWIRKRDEVVSIPYLTRVSQAPVITGKKELEGKHLGFFTEFPT KEGEDVEMKVGISFVDMEGAANNFKQEIASKNFDQVKKEANELWNKELSRIQISGGTDDE KTIFYTALYHTMIDPRIYTDVDGRYIGGDYKVHTSDSTFTKHTIFSGWDVFRSQFPLQTI INPGLVSDELNSLISMADQSGREYYERWEFLNSYSGCMLGNPALSVLADAYVKGIRTYDA GKAYKYAVNTSCRFGNDSLGYTPGVLSISHTLEYAYADWCISQLAKALGKEDDAKRFFEK GQAYHNIFDKEKGWFRPRNVDGTWQAWPENARLIEWYGCIEANAYQQGWFVPHDVLGMVN LMGGKEKVIADLVDFFNKTPSSLLWNEYYNHANEPVHFVPFLFNQLAVPWYTQKWTRYIC EKAYANEVEGIVGNEDVGQMSAWYILASSGLHPSCPGNTRMEITSPVFDKVEFRLDPKYY SGTKFTVIAHNNNIDNLYIQKALLNGREYHKCYLEFEDIAAGGVLELYMGNTPNKKWGE >gi|225935362|gb|ACGA01000030.1| GENE 161 270706 - 273243 1338 845 aa, chain - ## HITS:1 COG:all0848 KEGG:ns NR:ns ## COG: all0848 COG0383 # Protein_GI_number: 17228343 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Nostoc sp. 
PCC 7120 # 63 811 278 1015 1047 96 21.0 2e-19 MTRIYIILLAIFFCSNVRSQQAYFVDGYHGGIYGHYPVSWKTQFIVDQLSMHPDWRICLE IEPETWDTVRVQTPEAYQRFKNVVAGKQVEFTNPAYAQPYCYNISGESIIRQFQYGIAKI NKHFPKVDFVTYSVEEPCFTSCLPQILKLFGFKYAVLKCPNTCWGGYTAAYGGELVNWIG SDGTSILTVPRYACEELEQKSTWQTTAWGNSDTYLKACRDAGIKHPIGMCFQDAGWKNGP WIGSGKNTKNSSIYMRWRDYIKNVSIGKTDDNWYFSQEDMHVNLMWGSQVLQKIAQEVRV SENKIVMAEKMSVMAYLMNGYSCARADVDEAWRTLMLAQHHDSWIVPYNRLNKQGTWADA IKRWTDQTNVIADKITELSMQSFDKGNGVVDKQQKYIRVYNTLGSERTEIVNVLLPYEFV DMDWEVYDWRKKKVGCSVEREGKQIRLLFEAKVPPFGYSTYCIKRKGLGKKIVSRSVKVN DNEWTVENDMYKIIFDLSKGGIIKSLVAKKEGNKEFAKQSGEYALGELRGYFYEEDKFHS STETPARMIILQDNEYGKKVRIEGEIASHPFTQTVTLLQGDKCIDFNLTIDWKKNVGIGE YKEKSWRNNRRAYCDDRFKLSVLFPVDLCSPRIYKNAPFDVCESKLDNTFFNTWDQIKHN IILHWVDLAEQEGDYALALLSDHTTSYSHGQDYPLGLTVQYSGPGLWGPDYKITGPLTIK YAIIPHRGKWDKASIATSSDRWNEPLLYSYHSSAKLESMSFVDLQDTGYQVSAAYLQDGK IVLRLFNTEGDQTQRKITFGMPLSSVEEIDLNGNSIERKKIEIHTGKTEMVISMPRFGIK TFVLK >gi|225935362|gb|ACGA01000030.1| GENE 162 273254 - 273484 237 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171798|ref|ZP_05758210.1| ## NR: gi|260171798|ref|ZP_05758210.1| hypothetical protein BacD2_08033 [Bacteroides sp. 
D2] # 1 76 1 76 76 85 100.0 1e-15 MEDNLNRYFADEFNSEEKIEFLMMVENNEELKEEFIEKQNLLALIDWVSSEYENKQEYVQ QKLREFMYKMNETKNK >gi|225935362|gb|ACGA01000030.1| GENE 163 273556 - 274080 446 174 aa, chain - ## HITS:1 COG:PA1912 KEGG:ns NR:ns ## COG: PA1912 COG1595 # Protein_GI_number: 15597108 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 4 163 13 162 168 61 28.0 7e-10 MNFSELYLIYYPKLVRFAKEFVMLEEDAENITQDVFTDLWEKREAMDHIENMNAYLFRLV RNKCLDYLKHKVFEQKYVENIRMAFEIELNLKFQSLDRFDVSAISEENETKKLVQAAINS LPKKCRDIFLLSRMEGLKYREISERLGISVNTVECQMGIALKKLRMKLNTYLAA >gi|225935362|gb|ACGA01000030.1| GENE 164 274272 - 275903 1083 543 aa, chain - ## HITS:1 COG:no KEGG:BT_4069 NR:ns ## KEGG: BT_4069 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 543 1 534 534 786 82.0 0 MQTKKEYMIMKRTLWFIFFLLNSLFYLYADSENDSLLKVLDKVISERLIYTQKKEATIKE LKKKKVGLNSLEDIYNLNKEIIHQYETFVCDSAEQYIHENIDIAKTIGNKEYLLEEQLRL AFVYSLSGLFIQANDIFKSIKCADLPDHLKASYCWNRIRYYENLIKYTDDIRFSNEYITE KEAYRDTVMGVLVEESEEYKKEKAVKLQDKGSIKEALQILTKIYNKQELTSHGYAMMAMG LARVYRLIGNDALEEKYLILAAMTDIKLAVKENEALLTLAVNLYQKGDIDRAYNYIKVAL DDAIFYNSRFKNTVIARIHPIIENTYLIRLEKQKQNLRFYIFLTSLFVVTLAITLYFTYK QTKVVSRAKRHLKAMNEELVGLNKNLDEANLIKEKYVGYFMNQCAVYINKLDEYRKNVNR KIKTGQIDDLYKSSSRPFEKELEELYHNFDKAFLKLYPNFVEEFNSLLKPEEHYSLEKDQ LNTELRIFALIRLGIIDVGQIAVFLHYSVQTIYNYKSKVKRMSTLDSNIFEEEVKKLGSL SQK >gi|225935362|gb|ACGA01000030.1| GENE 165 276635 - 276985 206 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A [Campylobacter curvus 525.92] # 5 116 14 126 129 84 39 7e-15 MNFTFLVVVLLTALAFVGVVVALSRAISPRSYNLQKFEAYECGIPTRGKSWMQFRVGYYL FAILFLMFDVETALLFPWAVVMHDMGPQGLISVLFFFIILVLGLAYAWRKGALEWK >gi|225935362|gb|ACGA01000030.1| GENE 166 276976 - 277578 435 200 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 
525.92] # 33 190 12 169 170 172 50 2e-41 MEITKKPKIKSIPYDEFIDNESLEKLVKELNAGGANVALGVLDDFINWGRSNSLWPLTFA TSCCGIEFMALGAARYDMARFGFEVARASPRQADMIMVCGTITNKMAPVLKRLYDQMPDP KYVVAVGGCAVSGGPFKKSYHVVNGVDKILPVDVYIPGCPPRPEAFYYGMMQLQRKVKIE KFFGGVNRKEKKPDYLKNEE >gi|225935362|gb|ACGA01000030.1| GENE 167 277606 - 279198 1631 530 aa, chain + ## HITS:1 COG:SMa1529 KEGG:ns NR:ns ## COG: SMa1529 COG0649 # Protein_GI_number: 16263284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Sinorhizobium meliloti # 163 530 11 404 404 307 39.0 4e-83 MQEIQFIAPAALHDEMLRLRNEKQMDFLESLTGMDWGVADEKDAPEKLRGLGVVYHLEST ITGERIALKTATTNRELPEIPSVSDIWKIADFYEREVFDYYGITFIGHPDMRRLYLRNDW IGYPMRKDNDPEKDNPLCMTNEETFDTTQEIELNPDGTIKNKEMKLFGEEEYVVNIGPQH PATHGVMRFRVSLEGEIIRKIDANCGYIHRGIEKMNESLTYPQTLALTDRLDYLGAHQNR HALCMCIEKAMGIEVSERVKYIRTIMDELQRIDSHLLFYSALTMDLGALTAFFYGFRDRE KILDIFEETCGGRLIMNYNTIGGVQADLHPNFVKRVKEFIPYMRGIIHEYHDIFTGNIIA QSRMKGVGVLSREDAISFGCTGGTGRASGWACDVRKRMPYGVYDKVDFKEIVYTEGDCFA RYLVRMDEIMESLNIIEQLIDNIPEGPYQEKMKPIIRVPEGSYYAAVEGSRGEFGVFLES QGDKMPYRLHYRATGLPLVAAIDTICRGAKIADLIAIGGTIDYVVPDIDR >gi|225935362|gb|ACGA01000030.1| GENE 168 279221 - 280297 1072 358 aa, chain + ## HITS:1 COG:RP796 KEGG:ns NR:ns ## COG: RP796 COG1005 # Protein_GI_number: 15604628 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Rickettsia prowazekii # 30 349 15 327 339 246 43.0 6e-65 MFDFSIVTNWIHELLLSIMPEGLAIFIECVAVGVCLVALYAILAIILIYMERKVCGFFQC RLGPNRVGKWGSIQVVCDVLKMMTKEIFMPKGADHFLYNLAPFMVIIASFLTFACIPFNK GAAILDFNVGVFFLLAASSIGVVGILLAGWGSNNKFSLIGAMRSGAQIISYELSVGMSIM TMVVLMGTMQFSEIVEGQADGWFIFKGHIPAVIAFIIYLIAGNAECNRGPFDLPEAESEL TAGYHTEYSGMGFGFFYLAEYLNLFIVASVAATIFLGGWMPLHIVGLDGFNAVMDYIPGF IWFFAKAFFVVFLLMWIKWTFPRLRIDQILNLEWKYLVPISMVNLLLMACCVAFGFHF >gi|225935362|gb|ACGA01000030.1| GENE 169 280310 - 280798 477 162 aa, chain + ## HITS:1 COG:SMa1519 KEGG:ns NR:ns ## COG: SMa1519 COG1143 # 
Protein_GI_number: 16263279 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Sinorhizobium meliloti # 19 146 18 140 188 93 37.0 1e-19 MEYKDKKYTYLGGLVHGISTLATGMKTSIKVYFRKKVTEQYPENRKELKMFDRFRGTLNM PHNENNEHRCIACGLCQTACPNDTIKVISETVETEDGKKKKILATYEYDLGACMFCQLCV NACPHDAITFDQNFEHAVFDRTKLVLKLNHDGSKVIEKKKEV >gi|225935362|gb|ACGA01000030.1| GENE 170 280801 - 281313 481 170 aa, chain + ## HITS:1 COG:HP1269 KEGG:ns NR:ns ## COG: HP1269 COG0839 # Protein_GI_number: 15645883 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Helicobacter pylori 26695 # 5 164 2 162 182 69 30.0 2e-12 MGSTLETVVFYFLAAFIIAMSIMTVTTQRIVRSATYLLFVLFGTAGIYFLLGYTFLGSVQ IMVYAGGIVVLYVFSILLTSGEGDRAEKVKRSKLLAGLITMIAGLAIILTITLKHNFMQT ANLAPQEIDIHAIGNALLSSDKYGYILPFEAVSILLLACIIGGIIIARKR >gi|225935362|gb|ACGA01000030.1| GENE 171 281347 - 281655 371 102 aa, chain + ## HITS:1 COG:VNG0643G KEGG:ns NR:ns ## COG: VNG0643G COG0713 # Protein_GI_number: 15789840 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Halobacterium sp. 
NRC-1 # 1 102 1 100 100 73 43.0 9e-14 MIHMEYYLVVSTIMMFAGIYGFFTRRNTLAILISVELMLNATDINFAVFNRFLFPEGMEG YFFALFSIAISAAETAVAIAIMINIYRNLRSIQVRNLDELKW >gi|225935362|gb|ACGA01000030.1| GENE 172 281711 - 283642 1465 643 aa, chain + ## HITS:1 COG:slr0844 KEGG:ns NR:ns ## COG: slr0844 COG1009 # Protein_GI_number: 16331732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Synechocystis # 2 643 6 680 681 380 38.0 1e-105 MELTILILLLPFFSFLILGIGGKWMSHRTAGTIGTLILGAVAILSYVTAIQYFSAPRLED GTFATLIPYNFTWLPFTETLHFDLGILLDPISVMMLIVISTVSLMVHIYSFSYMKGEVGF QRYYAFLSLFTMSMLGLVVATNIFQMYLFWELVGVSSYLLIGFYYTKPAAIAASKKAFIV TRFADLGFLIGILIYGYYGGTFGFTPDTVSLISGGASMLPLALGLMFVGGAGKSAMFPLH IWLPDAMEGPTPVSALIHAATMVVAGVYLVARMFPLFIAYAPNTLHMVAWVGAFTAFYAA SVACVQSDIKRVLAFSTISQIGFMMVALGVCTSMDPHEGGLGYMASMFHLFTHAMFKALL FLGAGSIIHAVHSNEMSAMGGLRKYMPITHWTFLIACLAIAGIPPFSGFFSKDEILAACF QYSPVMGWVMTVIAAMTAFYMFRLYYGIFWGSEPLRAGSHSEHNTPKGNLEAAPCRPHES PLAMTFPLMFLAVVTCGAGFIPFGHFISSNGESYSIHLDPSVAITSVVIAIISIAIATWM YKNAKQPVADSLEKQFKGLHKAAYNRFYIDDIYQFITHKIIFRCISTPIAWFDRHVVDGF FDFLAWATNTTSDEIRGLQSGQVQQYAYVFLCGALALILLLIL >gi|225935362|gb|ACGA01000030.1| GENE 173 283656 - 285140 1309 494 aa, chain + ## HITS:1 COG:slr1291 KEGG:ns NR:ns ## COG: slr1291 COG1008 # Protein_GI_number: 16329430 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Synechocystis # 69 439 71 443 559 253 36.0 8e-67 MNFLSIFVLIPLLMLAGLWAARGIKAIRGVMVTGASALLIASVVLTFLYLGERSAGNTAE MLFRADTLWYAPLHISYSVGVDGISVAMLLLSAVIVFTGTFASWRLQPLTKEYFLWFTLL SMGVFGFFISVDLFTMFMFYEIALIPMYLLIGVWGSGRKEYAAMKLTLMLMGGSAFLLIG ILGIYFGSGATTMNLLEIAQLHNIPFAQQCIWFPLTFLGFGVLGALFPFHTWSPDGHASA PTAVSMLHAGVLMKLGGYGCFRIAMYLMPEAANELSWIFLILTGISVVYGAFSACVQTDL KYINAYSSVSHCGLVLFAILMLNQTAATGAILQMLSHGLMTALFFALIGMIYGRTHTRDV RELAGLMKIMPFLSVCYVIAGLANLGLPGLSGFIAEMTIFVGSFQNNDQFHRVLTIIACS 
SIVITAVYILRLVGKILYGTCTNKHHLELTDATWDERVAVICLIVCVAGLGMAPFWVSHM IGESVLPVVSQLIP >gi|225935362|gb|ACGA01000030.1| GENE 174 285178 - 286617 1213 479 aa, chain + ## HITS:1 COG:BMEI1145 KEGG:ns NR:ns ## COG: BMEI1145 COG1007 # Protein_GI_number: 17987428 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Brucella melitensis # 68 431 65 430 478 238 39.0 2e-62 MDYSQFLYMREELSLIAVLVLLFLADLFMSPDAHKNGGKARLNTMLPVILMAIHTAINLI PGTAADAFGGMYHYVPMHTVVKSILNVGTLIVFLMAHEWMKREDTSFKQGEFYVLTLSTL FGMYLMISAGHFLMFFIGLETASIPMAALIAFDKYRHNSAEAGAKYILTALFSSALLLFG LSMIYGSAGTLYFDDLPAHIDGNPLQIMAFVFFFAGMAFKLSLVPFHLWTADVYEGAPSA VTAYLSVISKGSAAFVLLAVLIKVFAPMINDWQEVLYWVTIASITIANIFAVRQQNLKRL MAFSSISQAGYIMLGVIGGTAQGMTALVYYVLVYAAANLGVFAVITIVALRSQKFTLEDY AGLYKTNPKIAFLMTLSLFSLAGIPPFAGFFSKFFIFMAAFDAGFHLLVFIALINTVISL YYYLLIVKAMYITPSDSPIPTFRSDRCTKWGLALCTLGIIGLGIASIVYQSIDKLSFGI >gi|225935362|gb|ACGA01000030.1| GENE 175 286717 - 289527 2418 936 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 675 923 50 308 328 191 41.0 6e-48 MQTDNERYKKMASLAQIGWWEADLTAGYYLCSDYLCDLLGLDGNTISASDFLNLIREDYR KQIAREFQANTSIHKDFYEQTFPIHSKYGEVWLHTRLAFREKGTGANGGDKSFGVIQCVE APMEEDQRDALRRVNDLLRRQNFVSQSLLRFLRDEEVDSCIADILKEILNLYNREGRVYI FEYDECHTHHSCTYEVVSEGVSAEKETLQNIPISQSKWWSEQILAGKPIVLDTLEQLPEE AADEYQILAVQGIRSLMVTPLMAGDYVWGYMGIDLVKNYHDWTNEDFQWFSSLGNIISIC IELRKAKDKVAREQTFLNNLFHFMPMGYIRMSIIRDENNEPCDYRVTDANEVSSTFFGHP LESYIGSLASEKHSDYLQKVDFLKEILGSNSYREKDEYFARTERYTHWVIYSPGKDEIVG LFLDSTGSVQANRALDRSEKLFKNIFANIPAGVEIYDKDGYLMDLNNKDLEIFGVVNKSD VIGINFFENPNVPQSIRDRVRNEDLVDFRLNYSFEQIGGYYKTSRSNVIELYSKVSKLYD NEGNFSGYILISIDNTERIDAMNRIRDFENFFLMISDYAKVGYAKLNLLNRKGYAIKQWY KNLGEEEDIPLSEVVGVYGKMHPEDRKRFFDFYDEVKKGKRRHFQGEVRIRRPGTKNEWN WVNSNVMVTNYKPEENEIEIIGINYDITELKETEAELIQARDKAEMMDRLKSAFLANMSH 
EIRTPLNAIVGFSDLLVETEDVAERQEYIKIVRENNDLLLQLISDILDLSKIEAGTFEFT NGDVDVNLLCEDIVRSMRMKAKEDVELVLDNPLPVCHVISDRNRIHQVISNFVNNAMKFT SEGSIHVGYELKEGELEFYVKDTGIGIEKEQLPHIFERFVKLNSFVHGTGLGLSICQSIV EQLGGRIGVDSEKGKGSRFWFTIPGVIVTEEMSCAQ >gi|225935362|gb|ACGA01000030.1| GENE 176 289720 - 290928 1284 402 aa, chain - ## HITS:1 COG:no KEGG:BT_4056 NR:ns ## KEGG: BT_4056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 402 1 403 403 569 86.0 1e-161 MKRLFVNFMTFAVMATALTLTACSSDSDGDGDGNGNGGNNGGTGSSIVVGENILSGTLTG EQTLEAKEYILNGTVVIEDGGRLNIPAGTTIKAREGFSSYLLVAQGGKLYADGTAAKPIV FTANSTSPTSGYWGGVIINGKAPISGSNANKSDTGLTEIDNNYKYGGSVDNDNSGSLTYV KICYAGARSTADIEHNGLTLNGVGNGTRIENIYILESADDAIEFFGGTVNVTNLLAVNPD DDMFDFTQGYSGTLKNCYGVWESGYTSTEADPRGIEADGNLDGIYPDHLRQSDFTVENMT IVNNAANTTDNVDRMQDVIKIRRKAKATITNALVKGSGGTIDLIDMNDGKDDGNAASTIS ITNTLQYRTKLNGTLNTFTEPTTNTGADASLFTWTGYNFSSL >gi|225935362|gb|ACGA01000030.1| GENE 177 290972 - 293686 2719 904 aa, chain - ## HITS:1 COG:CC0171 KEGG:ns NR:ns ## COG: CC0171 COG1629 # Protein_GI_number: 16124426 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 115 904 57 888 888 132 23.0 2e-30 MKKPFLLRFSVAVFFSIVCVLSALAANIKVKGAVKDKLSKEPLIGATIRLIGTQAGAVTD MDGNFELNVAGVLEGMYDVEIKYVGYKTEVRRKVRVENGKLVILNLELETDAQELSDVVV VAKKNRENENMLLLEQQKAVIAVQSVGVRELSRKGVSDAEGAVTKVAGVSKQDGVKNVFV RGLGDRYNATTFNGFALPSEDPEYKNISLDFFGTDIIQSVGVNKAFNAGGSSDVGGATID IVSKELIGSGHLGFGISGGLNTQTVAADFLKQDGVNFIGFANRTEPVDENSWNFKNKLDP SGQHLQINRSYSISGGKRFYVGEDKNPLSFFLTAGHTTDYQYTDEIIRNTTTSGTIYKDM NGKKYTENISQLALANVDFDMQNRHHISYNFMMIHANTQSVGDYNGKNSIFSDDYENLGF TRRQQANDNMLIVNQLMTNWGLTKSLSLDAGASYNMVKGYEPDRRINNLTKAENGYTLLR GNSQQRYFSTLDENDLNVKAGLVYRLKDDMEEISNVRFGYAGRFVDDNFKATEYNLTVGH VSTIPSLDGFSLDDYYNQENFSSDWFKIQKNIDEYTVKKNIHSAYAEATYQFTSRWIVNL GLKYDKVDIEVDYNVNRGGSKGNNTIQKDYFLPSLNLKYNLNEKNSLRLGASKTYTLPQA KEISPYRYVGVNFNSQGNPNLKPSDNYNLDLKWDFNPTPTELISLTAFYKLIKNPISRIE 
VASAGGYLSYENIADKATVAGVEVEIRKNLFVRPVGNAANGINKLSLGLNGSYIYTNAKM PLATVTTGSQLEGAAPWIVNFDLSHNFTKGKRSFVNTLVLNYVSDKIYTIGTQGYQDMME QGILTLDFVSQAKLNKHLSLNLKARNLLNPSYKLSRKANENGEKVILNDYKKGINISLGV SCTF >gi|225935362|gb|ACGA01000030.1| GENE 178 294037 - 294933 755 298 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 296 36 304 313 127 30.0 3e-29 MTTPLSNQIIREITPLSDKDCFYIAERYKTEFTYPIHNHSEFELNFTEKAAGVRRIVGDS SEVIGDYDLVLITGKDLEHVWEQNDCRSKEIREITIQFSSDLFFKSFINKNQFDSIRRML DKAQKGLCFPMSAILKIYPLLDTLASEKQGFYAVIKFMTILYELSLFEEEARTLSSSSFA KIDIHSDSRRVQKVQEYINEHYQEEIRLGQLADMVGMTDVSFSRFFKLRTGKNLSDYIID IRLGFASRLLVDSTMSIAEICYECGFNNLSNFNRIFKKKKSCSPKEFRENYRKKKKLI >gi|225935362|gb|ACGA01000030.1| GENE 179 294960 - 296300 1083 446 aa, chain - ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 57 437 29 407 412 357 46.0 2e-98 MKEVKEVVDTVSVALANGQPEEAGNVMMQEVNHALLSMGVDEVWADKIDNFIILLCIIGI ALLANLICRKIILRTVAKLVKQTKATWDDIVFNDKVMLNISRLVAPILIYISIPIAFPEH ADSGLLDFLRRLCMIYILAVFLRFVSALFTAVYLVYSAREQYKDKPLKGLLQTAQVVLFF IGAIIIISILIKQSPVVLLTGLGASAAVLMLVFKDSIMGFVSGIQLSANNMLKVGDWITM PKYGADGTVIEVTLNTVKVRNFDNTITTIPPYLLISDSFQNWQGMQESGGRRVKRSINID MTSVHFCTPEMLAKYRKIQLLTNYVDETEKVVEEYNKEHHIDNSVLVNGRRQTNLGVFRA YLTNYLKNLPTVNQDLTCMVRQLQPTETGIPLELYFFSANKVWVAYEGIQADVFDHVLAI IPEFDLRVFQNPSGADLHRIGVKIEN >gi|225935362|gb|ACGA01000030.1| GENE 180 296338 - 297354 727 338 aa, chain - ## HITS:1 COG:no KEGG:BT_4052 NR:ns ## KEGG: BT_4052 # Name: not_defined # Def: putative ABC transporter ATP-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 297 629 629 622 92.0 1e-177 MATSLRDNLTSSYFNAAHKLYPKKARRRIIAYVESYDDIAFWRTLLEEFEDDEHYFQVML 
PSATSLAKGKKMVLMNTLNTAELGRSLIACVDSDYDFLLQGATNTSRKINRNRYIFQTYT YAIENYHCFAESLHEVCVQATLNDRSILDFNSYLKRYSEIVYPLFLWNVWFYRQRDTYTF PMYDFHTYTSLREINLRYPEKSLESLQQRVNQKLAELKKKFPHNINQVNGLQAEFKELGL VPETTYLYMQGHHVMDNVVMKLLIPVCTVLRREREQEIKRLAEHNEQFRNELTCYQNSQV NVEIMLKKNVAYKRLFHYDWLRQDISEYLEEGKNKQKS >gi|225935362|gb|ACGA01000030.1| GENE 181 297332 - 298186 782 284 aa, chain - ## HITS:1 COG:STM2746 KEGG:ns NR:ns ## COG: STM2746 COG3950 # Protein_GI_number: 16766058 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 156 260 302 410 427 62 33.0 1e-09 MEQQANYIRRIEIHGLWERFNIGWDLRPDVNILSGINGVGKTTILNRSVGYLEELSGEMK SDEKNGVRLFFDNPQATYIPYDVIRSYDRPLIMGDFTARMADKNVKSELDWQLYLLQRRY LDYQVNIGNKMIEMLSSDNEEERRKATTLSVAKRRFQDMIDELFSYTRKKIDRKRNDIAF YQDGELLFPYKLSSGEKQMLVILLTVLVQDNSHCVLFMDEPEASLHIEWQQKLIAMIREL NPNVQIILTTHSPAVIMEGWLDAVTEVSDISTNINQTNGNFTTR >gi|225935362|gb|ACGA01000030.1| GENE 182 298309 - 299229 830 306 aa, chain + ## HITS:1 COG:BH0390 KEGG:ns NR:ns ## COG: BH0390 COG0697 # Protein_GI_number: 15612953 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 4 290 2 291 311 67 22.0 3e-11 MDKSKNLHGHLFALTANVMWGLMSPIGKSALQEFSAISVTTFRMVGAAAAFWLLSIFCKQ EHVDHRDMLKIFFASLFALVFNQGVFIFGLSMTSPIDASIVTTTLPIVTMVVAAIYLKEP VTNKKVLGIFVGAMGALILIVSSQAGNNGNGSLIGDLLCLVAQISFSIYLTVFKGLSQRY SAVTINKWMFIYASICYIPFSYYDISTIQWASISTVAILQVLYVVLGGSFLAYLCIMTAQ KLLRPTVVSMYNYMQPIVATIAAITMGIGSFGWEKGIAIALVFLGVYIVTQSKSRADLEK AEKLHS >gi|225935362|gb|ACGA01000030.1| GENE 183 299316 - 300407 938 363 aa, chain - ## HITS:1 COG:BH2954 KEGG:ns NR:ns ## COG: BH2954 COG1703 # Protein_GI_number: 15615516 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Bacillus 
halodurans # 33 363 4 334 340 334 48.0 2e-91 MEHPENSEEYKGLVVNKGIEQPSSVNPYLKRKPKKRQLSVAEFVEGIVKGDVTILSQAVT LVESVKPEHQAVSQEIIEKCLPFSGNSVRIGISGVPGAGKSTSIDVFGLHVLEKGGKLAV LAIDPSSERSKGSILGDKTRMEQLSVHPKSFIRPSPSAGSLGGVARKTRETIILCEAAGF DKIFVETVGVGQSETAVHSMVDFFLLIQLSGTGDELQGIKRGIMEMADGIVINKADGDNL ERAKLAATQFRNALHLFPAPESGWTPKVLTYSGFYNLGVKEVWDMIYEYIDFVKGNGYFE YRRNEQSKYWMYESINEQLRDSFYHNPKIEAMLLEKEQQVLKGNLTSFIAARSLLDTYFE DLK >gi|225935362|gb|ACGA01000030.1| GENE 184 300419 - 301486 964 355 aa, chain - ## HITS:1 COG:no KEGG:BT_4048 NR:ns ## KEGG: BT_4048 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 19 373 373 622 92.0 1e-177 MKRSLLSILTLLTVALAAVAQPRISSNKETHNFGQIEWKRPVTVEYTITNTGNQPLVLTN VTTSCACAVADWTKEPIAPGGKGTVKATFDAKALGHFEKSVGIYSNASPSLVYLKFTGEV VQEIKDYTKLLPHVIGNIRLDRDEFAFPDVYRGQQPSLTFSIANLSDRPYEPVLMHLPPY LKMEAEPKVLLKGKKGTVKLTLDASQLKDYGLTQTSVYLSRFSGDKVSEDNEIPVSAILL PDFSRMTEKDSLNAPAMHISETNIDLSIPLIKKNKVSHDILIANSGKTPLVISKLQVFNS SVGVSLKKTVLPPDGMTKLKVTIRKRDVGNKKHHLRILMITNDPLRPKVEINIKR >gi|225935362|gb|ACGA01000030.1| GENE 185 301723 - 307458 5021 1911 aa, chain - ## HITS:1 COG:all5100 KEGG:ns NR:ns ## COG: all5100 COG2373 # Protein_GI_number: 17232592 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Nostoc sp. 
PCC 7120 # 464 992 554 1131 1906 68 20.0 1e-10 MKVKQICMMVLLWLGVIPAVQAQTFDKLWKEVEQAEKKSLPKTVIKLTDEIYQKGEKEKN SPQMLKAYTWRMKYREMLNPDSLYADLKGLEQWVKQTDQPMDRAILHSLIAGIYADYAAS NQWQLRQRTEIVDQTPATDMREWTANMFIEKVRTNIKEALADSVLLLKTSSRGYIPFVEL GETSEYYHHDMYHLLASRSIEALQRVEELSNRITNDGTVNPVKQDIIAIYGNMIPAYKAT GLKEGYVLTALNYLEWRWNADRNIRPLQAKGELPVLTEDTYLKALNTLKSKYASEPICAE VYLAEARYTIGKQQQLNALQLCDGAIRLYPGYRRINALKNLREEILAPYLNVNASDLAFP NEEIELRVSHKNLDGFTVRLYQAKKLIKEQHYAVLRPKDYRTQDTVFTFKAPELGSYVMR IIPDIRAKRDSESKFDVTRFKVLTCRLPDKQYQVVTLDGQTGHPIPHAKVTMYSNDEKVL QEFTTNEEGKVVFPWKSEYRYLKASKGTDTAMPKQGIYAGSYGYYGDEDKVTENMTLLTD RSLYRPGQTVYVKGIAYSQQSDTANVLPNKEYTVTLLDVNNQEVGQKSVRTNEFGSFTTD FALPSACLNGMFSLKAGRGRTSIRVEDYKRPTFDITFEKQQGSYKLGDEVQVKGKVQSYS GVLLQDLPVKYTVKRSAYSLWRFAESVQIASGEVMANENGEFTIPVRLQESDSYKNNDKV YYRYSIEATVTNVAGETQSSTDVISAGNRSLVLQVELQDKTCKDQPFETIFKVQNLNEQP VEVKGNYYLYPAKDKDFKQLEEKPVATGTFTSNEDMMLDWKNLPSGPYVLKASVKDNQGK EVTADTNTILFSVEDKRPPIETTMWFYGANTEFDAAHPAVFCFGTSKKDAYVMMNVFSGD KLLESKTLNLSDTIVRFEYPYRESYGDGVFVNLCMVRDGQVYQEQVRLTKRIPDKTLTMK WEVFRDKLRPGQKEEWKLTIKTPQGQAANAEMLATMYDASLDKIWNRQQNFQIYYNQIVP YSNWMSGYSGNNSFNYWWNTKSLKVPSLEYDHFVMLSDYYNNGRDLGEVIVRGYGSTRKL TVTGSVSTLDVATLRSNAPKMKSAMAADAMTNVEFQSEMIPTGEKADEASDNEMLPEASA DLRTNLAETAFFYPQLRTNEQGEISFSFTMPESLTRWNFRGYSHTKGMLTGTLDGEATTS KEFMLTPNLPRFVRVGDKTSLAASISNMTGKPQAGTVSLVLFDPMTEKIISTQKQKFSLA AGKTMGVDFQFTVSDKYEIVGCRMIADSGTFSDGEQQLLPVLSNKEHLVETLPMPVRGEE TRTFSLDHLFNQQSKTATDRKLTVEFTGNPAWYAVQALPSLSLPTSNNAISWATAYYANT LASFIMNSQPKIKAVFESWKLQGGTKETFLSNLQKNQEVKNIILSESPWVLEAQTEEQQK ERIATLFDLNNIRSNNIAALTRLQELQNSNGAWSWYKGMNGSRSVTTYIAELNARLAMLT GEKLSRSALSLQQKAFAYLHQSALDEYKEILKAQKDGVKFTGVSGSILQYLYLIAISGEQ VPAVNKAAYTYYLSKVGELLTSPSMNTKAIAAIVLDKAGRKKEAQEFVASLKEHLTKTDE QGMFFAFNENPYTWGGMQMQAHVDVMEALEQTGGNNDTVEEMKLWLLKQKQTQQWNSPVA TADAVFALLMKGANLLDNQGDVRIVIANEVLETVSPSKTTVPGLGYIKRSFTQKNVVDAR KIEVEKRNPGIAWGAVYAEFESPVSDVKQQGGELNVEKQLYVERTVNNVPQLQPITSKTV LQVGDKVVSRLTIRVDRPMDFVQLKDQRGACFEPIGSISGYRWNNGLGYYVDIKDASTNF FFDHLGKGVYVLEYSYRVSRAGTYETGLATMQCAYAPEYASHSASMTIIIK 
>gi|225935362|gb|ACGA01000030.1| GENE 186 308025 - 308195 98 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160886822|ref|ZP_02067825.1| ## NR: gi|160886822|ref|ZP_02067825.1| hypothetical protein BACOVA_04836 [Bacteroides ovatus ATCC 8483] # 1 56 1 56 56 99 100.0 9e-20 MKDDVFVSDNWRNIIWNNRLNEIIDTDELIVFRKVSIGFSIGENQFQACEQFVSYA >gi|225935362|gb|ACGA01000030.1| GENE 187 308589 - 308681 74 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKKMGSLVRKILSEVGRNEMLSWGYRIEK >gi|225935362|gb|ACGA01000030.1| GENE 188 309079 - 310536 1469 485 aa, chain - ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 50 533 534 457 46.0 1e-128 MSTILSLAPQNVWKHFYSLTQIPRPSGHMEKITEFLLGFGKGLGLESFVDEAGNVIIRKP ATPGMENRKGVILQAHMDMVPQKNNDTVHDFEKDPIETYIDGDWVKAKGTTLGADNGLGV AAIMAVLEAKDLKHGPLEALITKDEETGMYGAFGLKPGTLNGEILLNLDSEDEGELYIGC AGGMDVTATLEYKEVAPEAGDIAVKVTLKGLRGGHSGLEINEGRANANKLLVRFVREAVA SYEARLASWEGGNMRNAIPREAHAVITIPAENEEELLGLVKYCENLFNEEYSAIETPISF TAERVELPAGEVPEEIQDNLIDAIFACQNGVTRMIPTVPDTVETSSNLAIITIGEGKAAI KILARSSSDSMKEYLTTSLESCFSMAGMKVEMTGGYSGWQPDVNSPILHAMKASYKQQFG VEPAVKVIHAGLECGIIGAIIPGLDMISFGPTLRSPHSPDERALIPTVQKFYDFLVATLE QTPMK >gi|225935362|gb|ACGA01000030.1| GENE 189 310658 - 311644 583 328 aa, chain - ## HITS:1 COG:no KEGG:BT_4044 NR:ns ## KEGG: BT_4044 # Name: not_defined # Def: putative dolichol-P-glucose synthetase # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 327 14 327 328 496 90.0 1e-139 MKKLLKKTLKLILPVVLGGFILFWVYRDFDFTKAGDVLLHGTNWWWMLFSLVFGVFAQVF RGWRWRQTLEPLDAFPKKSDCVNAIFISYAASLVVPRIGEVSRCGVLAKYDNVSFAKSLG TVVTERLVDTLTILLITGITVLLQLPIFVTFLQQTGTKIPSLVHLLTSVWFYIILFCFIG VAILLYYLRKTLFFYERVKGFVLNIWEGIMSLKGVRNIPLFIFYTLAIWACYFLHFYFTF YCFAFTAHLGILAALVMFVGGTFAVIVPTPNGAGPWHFAIISMMMLYGVNVTDAGIFALI VHGIQTFLVVLLGIYGFAALSFTNRHRG >gi|225935362|gb|ACGA01000030.1| GENE 190 311715 - 312515 821 266 aa, chain + ## HITS:1 COG:PA0592 
KEGG:ns NR:ns ## COG: PA0592 COG0030 # Protein_GI_number: 15595789 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Pseudomonas aeruginosa # 5 257 8 263 268 167 37.0 3e-41 MKLVKPKKFLGQHFLKDLKVAQDIADTVDTFPELPILEVGPGMGVLTQFLVKKDRLIKVV EVDYESVAYLREAYPSLEDHIIEDDFLKMNLHRLFDGKPFVLTGNYPYNISSQIFFKMLD NKDIVPCCTGMIQKEVAERIAAGPGSKTYGILSVLIQAWYKVEYLFTVSEHVFNPPPKVK SAVIRMTRNETKELGCDEKLFKQVVKTTFNQRRKTLRNSIKPILGKDCPLTEDILFNKRP EQLSVAEFIDLTNKVEEALKTAGNNQ >gi|225935362|gb|ACGA01000030.1| GENE 191 312581 - 313921 1429 446 aa, chain + ## HITS:1 COG:BH0511 KEGG:ns NR:ns ## COG: BH0511 COG2239 # Protein_GI_number: 15613074 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Bacillus halodurans # 9 441 14 441 452 229 32.0 7e-60 MNEEYIDNVKHLIEQKDADTVKGLLIDLHPADIAELCNDLNPEEAKFVYRLLDNEIAADV LVEMDEDARKELLEMLPSETIAKRFVDYMDTDDAVDLMRELDEDKQEEVLSHIEDIEQAG DIVDLLKYDENTAGGLMGTEMVLVNENWSMPECLKEMRQQAEELDEIYYVYVIDDDERLR GIFPLKKMITSPSVSKVKHVMQKDPISVHVDTPIDEVVQAIEKYDLVAIPVIDSIGRLVG QITVDDVMDEVREQSERDYQLASGLSQDVETDDNVLKQTTARLPWLLIGMIGGIGNSMIL GNFDATFAAHPEMALYIPLIGGTGGNVGTQSSAIIVQGLANSSLDAKNTFKQVTKEAVVA LINATIISLLVYTYNFIRFGATATVTYSVSISLFAVVMFASIFGTLVPMTLEKLKIDPAI ATGPFIAITNDIIGMMLYMGITVLLS >gi|225935362|gb|ACGA01000030.1| GENE 192 314010 - 315860 2036 616 aa, chain + ## HITS:1 COG:no KEGG:BT_4041 NR:ns ## KEGG: BT_4041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 616 1 624 624 889 92.0 0 MDAHDTNQPLNQGELEEEKKTVEVSEAITETPTEEVTAEVQPEAAPKPATKEDVLNQLKE LAQDAENANKQEIDNLKQSFYKLHNAELEAAKVQFTDNGGNIEDFVAQEDPTEEEFKRLM GVIKEKRGKQVAELERQKEENLQVKLSIIEELKELVESGDDANKSYTEFKKLQQQWNDTK LVPQGKVNELWKNYQLYVEKFYDLLKLNNEFREYDFKKNLEIKTRLCEAAEKLADEEDVV SAFHQLQKLHQEFRDTGPVAKELRDEIWNRFKAASTAVNRRHQQHFESLKEAEQHNLDQK TVICEIVEAIEYDELKTFSAWENKTQEVIALQNKWKTIGFAPQKMNVKIFERFRHACDDF 
FKKKGEFFKSLKEGMNENLEKKKALCEKAEALKDSTDWKVTADALTKLQKEWKTIGPVAK KHSDAIWKRFITACDYFFEQKNKATSSQRSVETENMEKKKALIEKLSAIDENMDTEEASN LVRDLMKEWNSIGHVPFKEKDKLYKQYHGLIDQLFDRFNINASNKKLSNFRSNISNIQGG GTQSLYREREKLVRTYENMKNELQTYENNLGFLTSTSKKGSSLLTELNRKVDKLKADLEL VLQKIKVIDESIKAEE >gi|225935362|gb|ACGA01000030.1| GENE 193 316441 - 316800 313 119 aa, chain + ## HITS:1 COG:slr1886 KEGG:ns NR:ns ## COG: slr1886 COG0799 # Protein_GI_number: 16330295 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Synechocystis # 4 118 27 140 154 78 34.0 2e-15 MNDTKVLIEKIKEGIQEKKGKKIIVADLTSIEDTICKYFVICQGNSPSQVSAIVDSIKEF TRKGADSKPYAIDGLRNAEWVAMDYADVLVHVFLPETRDFYNLEHLWADAKLTQIPDLD >gi|225935362|gb|ACGA01000030.1| GENE 194 316810 - 318960 1275 716 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 19 662 3 627 636 495 44 1e-139 MDNNNSNNNNSNKPNNKVNMPKFNLNWMYMIIALMLLGLWWGSDSKGAGSKAVTYSEFQD YVKKGYVSKVLGYEDKSIEAFLKPTAVGAVFGADSTKVGRNPIITSRTPSTDKLEEFLQA EKEAGHFDGSSDYPPKSDIFPAILIQILPLVLLVALWIFFMRRMSGGGSGGPGGVFNVGK SKAQLFEKGGAIKITFKDVAGLAEAKQEVEEIVEFLKEPQKYTDLGGKIPKGALLVGPPG TGKTLLAKAVAGEANVPFFSLAGSDFVEMFVGVGASRVRDLFKQAKEKAPCIVFIDEIDA VGRARGKNPAMGGNDERENTLNQLLTEMDGFGSNSGVIILAATNRVDVLDKALLRAGRFD RQIHVDLPDLNERKEVFGVHLRPIKIDDSVDVDLLARQTPGFSGADIANVCNEAALIAAR HGKKFVGKQDFLDAVDRIIGGLEKKTKITTEAERRSIALHEAGHASISWLLEYANPLIKV TIVPRGRALGAAWYLPEERQITTKEQMLDEMCATLGGRAAEDLFLGRISTGAMNDLERVT KQAYGMIAYLGMSDKLPNLCYYNNDEYSFNRPYSEKTAELIDEEVKRMVNEQYDRAKRIL SEHKEGHNELTQLLIDKEVIFAEDVERIFGKRPWASRSEEIMAAKESQDAARAERKLAQK LKEEEKEIKEEEAENTAEEQAPIDTKVAAAGNKVTVEGKVTVEGKSNGEEQANGSN >gi|225935362|gb|ACGA01000030.1| GENE 195 318977 - 319825 753 282 aa, chain + ## HITS:1 COG:HI0919 KEGG:ns NR:ns ## COG: HI0919 COG0575 # Protein_GI_number: 16272856 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Haemophilus influenzae # 13 270 11 278 288 105 30.0 8e-23 
MINNFIKRAITGVLFVAILVGCILYDAFSFGILFTAISALTIYEFAQLVNMRAEGVKINK TINMLGGAYLFLAIMGFCIDAADSKIFIPYVLLLLYMMISELYLKKENPVLNWAYSMLSQ LYIGLPFALLNVLAFHNDPASEFSSISYNPILPLSIFIFLWLNDTGAYCIGSLIGKHRLF ERISPKKSWEGSIGGGVVAIGVSFILAHYFPFMSMMEWAGLALVVVIFGTWGDLTESLLK RQLHVKDSGNILPGHGGMLDRFDSSLMAIPAAVVYLYALTWF >gi|225935362|gb|ACGA01000030.1| GENE 196 319927 - 320709 735 260 aa, chain - ## HITS:1 COG:no KEGG:BT_4005 NR:ns ## KEGG: BT_4005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 1 258 258 404 80.0 1e-111 MKKFKWLLGMLLLALVPMLQSCDDDGYSIGDFSWDWATVRATGGGGYYLEGDRWGLIDPV ATSIPWFKPVDGERVVAFFNPLYDMEGGKGVQVKMEGVQELLTKEVEDMTTEEEAVEFGN DPILIYQGDMWLGGKFLNVIFRQELPRSEKHRISLVQNKIETGEPSEPGTLNVAEDGYVH LELRYNTYEDVTDYWGWGRVSYNLEKFFPTEKDAESTMKGFKVTINSREHGEGRVIVLDL DHPVGVPEAAKDVHSTSSIR >gi|225935362|gb|ACGA01000030.1| GENE 197 320803 - 321939 1145 378 aa, chain - ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 341 1 323 356 155 31.0 1e-37 MKYYLIVGEASGDLHASHLMAALKAKDPQADFRFFGGDLMAAVGGTMVKHYKELAYMGFI PVLLHLRTIFANMKRCKEDIVSWQPDVVILVDYPGFNLDIAKFVHAKTQIPVYYYISPKI WAWKEYRIKNIKRNVDELFSILPFEVEFFEGKHQYPIHYVGNPTVDEVAAYQEAHPKNKD QFIAENQLEDKPVIALLAGSRKQEIKDNLPDMLKAASAFPDYQLVLAGAPAIAPDYYKKY VGEAKVKIIFDQTYRLLQHADVALVTSGTATLETALFRVPQVVCYYTPIGKVVSFLRRHI LTVKFISLVNLIADREVVKELVADTMTVKNMQSELKNIIENEAYRNEMLLGYEYVAERLG PAGAPRHAAREMLRLLKK >gi|225935362|gb|ACGA01000030.1| GENE 198 321983 - 322750 680 255 aa, chain - ## HITS:1 COG:alr4846 KEGG:ns NR:ns ## COG: alr4846 COG0496 # Protein_GI_number: 17232338 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Nostoc sp. 
PCC 7120 # 8 253 3 260 265 151 35.0 1e-36 MENEKPLILVSNDDGVMAKGINELVKFLRPLGDIVVMAPDAPRSGSGCALTVTQPVHYQL VKKEVGLTVYKCSGTPTDCIKLARNTVLDRTPDLVVGGINHGDNSATNVHYSGTMGVVFE GCLNGIPSIGFSLCNHAPDADFEAARPYIRSIATMVLEKGLPPLTCLNVNFPDTTDIKGI KICEQAKGRWTNEWAACPRLNDPNYFWLTGEFTDHEHENEKNDHWALANGYVAITPTTVD VTAYHFMDELNNWFN >gi|225935362|gb|ACGA01000030.1| GENE 199 323107 - 323871 735 254 aa, chain + ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 250 1 250 253 276 56.0 3e-74 MGKIIALANQKGGVGKTTTTINLAASLATLEKKVLVVDADPQANASSGLGVDIKQSECTI YECIIDRANVQDAILDTEIDSLKVISSHINLVGAEIEMLNLPNREKILKEVLTPLKKEYD YILIDCSPSLGLITINALTAADSVIIPVQAEYFALEGISKLLNTIKIIKSKLNPALEIEG FLLTMYDSRLRQANQIYDEVKRHFQELVFNTVVQRNVKLSEAPSYGVPTILYDAESTGAK NHLALAKEIINRNK >gi|225935362|gb|ACGA01000030.1| GENE 200 323883 - 324773 980 296 aa, chain + ## HITS:1 COG:ML2706 KEGG:ns NR:ns ## COG: ML2706 COG1475 # Protein_GI_number: 15828464 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium leprae # 32 295 64 334 335 196 42.0 6e-50 MATQKRNALGRGLDALLSMDDVKTEGSSSINEIELAKITVNPNQPRREFDETALQELADS IAEIGIIQPITLRKLSDDEYQIIAGERRYRASQRAGLKTIPAYIRTADDENMMEMALIEN IQREDLNAVEIALAYQHLLDQYELTQERLSERIGKKRTTIANYLRLLKLPAPIQMALQNK QLDMGHARALISLGDPKLQVKIFEEIQEHGYSVRKVEEIVKSLSEGEAVKSGTRKITPKR AKLPEEFNLLKQQLSGFFNTKVQLTCSEKGKGKISIPFGNEEELERIMEIFDTLKK >gi|225935362|gb|ACGA01000030.1| GENE 201 324774 - 325661 641 295 aa, chain + ## HITS:1 COG:no KEGG:BT_4000 NR:ns ## KEGG: BT_4000 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 295 492 85.0 1e-138 MTRKSKKYQLLVSALLLCFLQVAGIDVYAQSTVTPVRKDTTIIQREAPKARARRHREPIA QDSTRQDSIRIIPSKELPAIDSLSAAKIQIADSLDAANKKELKKIEQPASIVVKTDSVPP TTQDINKKIFIPNPTKATWLAVVFPGGGQIYNRKYWKLPIIYGGFAGCAYALSWNGKMYK 
DYSQAYLDIMDSNPNTKSYEDLLPPNSTYNEEQLKNTLKRRKDMFRRYRDLSIFAFIGVY LISIIDAYVDAELSNFDITPDLSMKVEPAVIDNNNQFRSNSFKNKSVGLQCVLRF >gi|225935362|gb|ACGA01000030.1| GENE 202 325696 - 326991 1138 431 aa, chain + ## HITS:1 COG:PA1812 KEGG:ns NR:ns ## COG: PA1812 COG0741 # Protein_GI_number: 15597009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pseudomonas aeruginosa # 126 429 118 462 534 161 30.0 2e-39 MKKLVNYCSIIFLLLVATSQVKAQSVDVVIRENGTERKESIDLPKSMTYPLDSLLNDWKA KNYIDLGKDCSTAEINPLFSDSVYIDRLSRIPAIMEMPYNDIIRKFIDMYAGRLRNQVSF MLSACNFYMPIFEEALDAYGLPLELRYLPIIESALNPSAVSRAGASGLWQFMIGTGKIYG LESNSLVDERRDPIKATWAAARYLKEMYDIYGDWNLVIAAYNCGPGTINKAIRRANGETD YWKIYNYLPKETRGYVPAFIAANYVMTYYCDHNICPMETNIPASTDTVQVNQNLHFEQIA DLCNVPLDQIKSLNPQYKKQMIPGDSKPYTLRLPIDAISTFIDKQDTIYAHRADELFRNR KTVAVKDITPATRKTTSAVAGKGNLTYHTIKSGDTLSTIAGKYGVTIKDIQRWNGMSSTK IAAGKRLKIYK >gi|225935362|gb|ACGA01000030.1| GENE 203 327064 - 327499 446 145 aa, chain + ## HITS:1 COG:Cgl1614 KEGG:ns NR:ns ## COG: Cgl1614 COG0317 # Protein_GI_number: 19552864 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Corynebacterium glutamicum # 16 145 33 159 760 115 42.0 4e-26 MDNLAPKEIADEEMINQAFHELLNDYLNTKHRKKVEIITKAFNFANQAHKGIKRRSGEPY IMHPIAVASIVCNEIGLGSTSICAALLHDVVEDTDYTVEDIENIFGPKIAQIVDGLTKIS GGIFGDRASAQAENFKKLLLTMSND Prediction of potential genes in microbial genomes Time: Fri May 13 08:13:16 2011 Seq name: gi|225935361|gb|ACGA01000031.1| Bacteroides sp. D2 cont1.31, whole genome shotgun sequence Length of sequence - 242559 bp Number of predicted genes - 181, with homology - 179 Number of transcription units - 72, operones - 44 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 13 - 1776 1651 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Term 1803 - 1860 14.4 - Term 1790 - 1848 10.8 2 2 Op 1 . - CDS 1900 - 2244 417 ## COG0789 Predicted transcriptional regulators 3 2 Op 2 . - CDS 2265 - 3233 809 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 3301 - 3360 3.1 + Prom 3173 - 3232 3.4 4 3 Tu 1 . + CDS 3341 - 5959 2532 ## COG0013 Alanyl-tRNA synthetase + Prom 6219 - 6278 9.3 5 4 Op 1 . + CDS 6298 - 8535 1753 ## COG3537 Putative alpha-1,2-mannosidase 6 4 Op 2 6/0.000 + CDS 8561 - 9115 415 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 7 4 Op 3 . + CDS 9168 - 10160 822 ## COG3712 Fe2+-dicitrate sensor, membrane component 8 4 Op 4 . + CDS 10205 - 12484 1826 ## COG3537 Putative alpha-1,2-mannosidase 9 4 Op 5 . + CDS 12481 - 14751 1893 ## COG3537 Putative alpha-1,2-mannosidase + Term 14825 - 14886 9.3 + Prom 15259 - 15318 10.6 10 5 Op 1 . + CDS 15408 - 18791 3356 ## BT_3983 hypothetical protein 11 5 Op 2 . + CDS 18803 - 20428 1464 ## BT_3984 hypothetical protein 12 5 Op 3 . + CDS 20458 - 21486 832 ## BT_3985 hypothetical protein 13 5 Op 4 . + CDS 21495 - 22661 755 ## BT_3986 putative patatin-like protein 14 5 Op 5 . + CDS 22693 - 24660 1166 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 precursor + Term 24668 - 24712 8.0 - Term 24649 - 24708 19.4 15 6 Tu 1 . - CDS 24767 - 26185 697 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 26227 - 26286 7.3 + Prom 26133 - 26192 8.2 16 7 Tu 1 . + CDS 26281 - 26967 901 ## BT_3981 hypothetical protein + Term 27003 - 27038 5.4 + Prom 26988 - 27047 2.4 17 8 Op 1 . + CDS 27177 - 27737 450 ## BT_3980 hypothetical protein 18 8 Op 2 . + CDS 27728 - 28261 227 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 19 9 Tu 1 . 
- CDS 28286 - 29725 879 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Prom 29752 - 29811 3.1 20 10 Op 1 . - CDS 29858 - 32782 2478 ## BF0745 hypothetical protein - Prom 32805 - 32864 1.8 21 10 Op 2 . - CDS 32866 - 33270 290 ## BT_3976 hypothetical protein - Prom 33304 - 33363 12.1 + Prom 33115 - 33174 2.9 22 11 Op 1 . + CDS 33203 - 33334 89 ## 23 11 Op 2 . + CDS 33350 - 34411 998 ## COG0337 3-dehydroquinate synthetase + Term 34441 - 34476 6.0 - Term 34480 - 34513 4.4 24 12 Tu 1 . - CDS 34518 - 35387 358 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 35553 - 35612 5.6 + TRNA 35863 - 35936 59.6 # Pro GGG 0 0 + Prom 35863 - 35922 76.9 25 13 Op 1 . + CDS 35994 - 37214 848 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 26 13 Op 2 . + CDS 37261 - 37779 383 ## COG1443 Isopentenyldiphosphate isomerase + Term 37899 - 37942 1.0 - Term 37628 - 37665 3.0 27 14 Op 1 . - CDS 37757 - 38299 580 ## COG0386 Glutathione peroxidase 28 14 Op 2 . - CDS 38359 - 40305 1590 ## BT_3294 putative alpha-glucosidase 29 14 Op 3 . - CDS 40367 - 42700 1881 ## COG3537 Putative alpha-1,2-mannosidase 30 14 Op 4 . - CDS 42710 - 43507 828 ## BT_3964 putative secretory protein 31 14 Op 5 . - CDS 43520 - 45865 1853 ## COG3537 Putative alpha-1,2-mannosidase 32 14 Op 6 . - CDS 45872 - 48157 1960 ## COG3537 Putative alpha-1,2-mannosidase - Prom 48182 - 48241 7.3 - Term 48266 - 48320 11.2 33 15 Op 1 . - CDS 48344 - 49582 887 ## BT_3961 hypothetical protein 34 15 Op 2 . - CDS 49608 - 51134 1193 ## BT_3960 hypothetical protein 35 15 Op 3 . - CDS 51189 - 53153 1745 ## BT_3959 putative outer membrane protein 36 15 Op 4 . - CDS 53176 - 56328 3085 ## BT_3958 hypothetical protein - Prom 56486 - 56545 3.4 - Term 56357 - 56401 1.6 37 16 Tu 1 . - CDS 56569 - 60513 2353 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 60750 - 60809 3.3 38 17 Op 1 .
- CDS 61488 - 62876 1802 ## COG1109 Phosphomannomutase 39 17 Op 2 . - CDS 62919 - 63614 640 ## BT_3949 hypothetical protein - Prom 63636 - 63695 4.1 40 18 Op 1 . - CDS 63733 - 64770 759 ## COG0618 Exopolyphosphatase-related proteins 41 18 Op 2 . - CDS 64817 - 66910 534 ## COG0658 Predicted membrane metal-binding protein 42 18 Op 3 1/0.000 - CDS 66919 - 67569 767 ## COG0036 Pentose-5-phosphate-3-epimerase 43 18 Op 4 . - CDS 67668 - 68609 912 ## COG0223 Methionyl-tRNA formyltransferase - Prom 68639 - 68698 1.8 44 19 Op 1 . - CDS 68711 - 70504 1387 ## COG0038 Chloride channel protein EriC 45 19 Op 2 . - CDS 70505 - 71068 516 ## COG0009 Putative translation factor (SUA5) - Prom 71200 - 71259 7.3 - Term 71185 - 71232 10.0 46 20 Op 1 . - CDS 71271 - 72896 1309 ## BVU_3705 hypothetical protein 47 20 Op 2 . - CDS 72921 - 76001 2492 ## BF4062 putative TonB-linked outer membrane protein - Prom 76071 - 76130 6.6 + Prom 76264 - 76323 4.2 48 21 Op 1 1/0.000 + CDS 76437 - 76868 365 ## COG0824 Predicted thioesterase + Prom 76973 - 77032 1.7 49 21 Op 2 . + CDS 77053 - 78576 1210 ## COG2986 Histidine ammonia-lyase 50 21 Op 3 11/0.000 + CDS 78575 - 79306 199 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 51 21 Op 4 . + CDS 79333 - 80556 1070 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 52 21 Op 5 . + CDS 80578 - 80829 338 ## BVU_1013 hypothetical protein 53 21 Op 6 . + CDS 80832 - 81743 776 ## COG4261 Predicted acyltransferase 54 21 Op 7 . + CDS 81769 - 82839 960 ## COG0500 SAM-dependent methyltransferases 55 21 Op 8 . + CDS 82836 - 83273 314 ## BVU_1016 putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase 56 21 Op 9 . + CDS 83270 - 83713 386 ## COG0824 Predicted thioesterase 57 21 Op 10 27/0.000 + CDS 83698 - 85470 994 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 58 21 Op 11 27/0.000 + CDS 85463 - 85744 460 ## COG0236 Acyl carrier protein 59 21 Op 12 . 
+ CDS 85748 - 86938 785 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 60 21 Op 13 . + CDS 86963 - 87961 696 ## BVU_1021 3-oxoacyl-[acyl-carrier-protein] synthase 61 21 Op 14 . + CDS 87958 - 88638 339 ## COG0726 Predicted xylanase/chitin deacetylase 62 21 Op 15 . + CDS 88651 - 89286 601 ## BVU_1024 hypothetical protein 63 21 Op 16 . + CDS 89261 - 89839 358 ## BVU_1025 hypothetical protein 64 21 Op 17 . + CDS 89843 - 90307 425 ## BVU_1026 putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase 65 21 Op 18 . + CDS 90265 - 91494 756 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 66 21 Op 19 . + CDS 91491 - 95348 2599 ## COG4258 Predicted exporter 67 21 Op 20 . + CDS 95350 - 96840 1311 ## COG1233 Phytoene dehydrogenase and related proteins + Prom 96874 - 96933 2.0 68 22 Op 1 . + CDS 96965 - 98530 1106 ## BVU_1032 putative choloylglycine hydrolase 69 22 Op 2 . + CDS 98514 - 99809 980 ## COG1541 Coenzyme F390 synthetase 70 22 Op 3 . + CDS 99866 - 100435 689 ## BVU_1034 hypothetical protein 71 22 Op 4 . + CDS 100453 - 101193 753 ## Slin_5315 hypothetical protein + Term 101290 - 101342 7.0 - Term 101337 - 101371 -0.0 72 23 Op 1 . - CDS 101471 - 102331 571 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase 73 23 Op 2 . - CDS 102373 - 104373 1398 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase 74 23 Op 3 . - CDS 104385 - 106133 1354 ## Dfer_2403 RagB/SusD domain protein 75 23 Op 4 . - CDS 106155 - 109643 2240 ## Slin_4979 TonB-dependent receptor plug - Prom 109697 - 109756 3.5 76 24 Tu 1 . - CDS 109886 - 110821 669 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 110896 - 110955 6.1 77 25 Tu 1 . - CDS 111001 - 111564 381 ## BVU_1779 putative ECF-type RNA polymerase sigma factor - Prom 111747 - 111806 5.2 + Prom 111540 - 111599 4.5 78 26 Op 1 . 
+ CDS 111764 - 113482 1685 ## COG0608 Single-stranded DNA-specific exonuclease 79 26 Op 2 1/0.000 + CDS 113475 - 115385 1324 ## COG0514 Superfamily II DNA helicase 80 26 Op 3 . + CDS 115426 - 116388 1266 ## COG0457 FOG: TPR repeat + Term 116433 - 116471 0.2 + Prom 116497 - 116556 5.0 81 27 Op 1 . + CDS 116689 - 117537 794 ## COG0077 Prephenate dehydratase 82 27 Op 2 . + CDS 117512 - 118690 1181 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 83 27 Op 3 . + CDS 118778 - 119839 1153 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Prom 119895 - 119954 6.8 84 28 Tu 1 . + CDS 119995 - 120768 705 ## BT_3933 chorismate mutase/prephenate dehydratase (TyrA) + Term 120793 - 120837 9.1 + Prom 120789 - 120848 4.4 85 29 Tu 1 . + CDS 120875 - 122956 1597 ## COG0358 DNA primase (bacterial type) + Term 123147 - 123184 0.2 - Term 122798 - 122836 -0.8 86 30 Op 1 . - CDS 122986 - 123576 637 ## COG0302 GTP cyclohydrolase I 87 30 Op 2 . - CDS 123584 - 124027 459 ## BT_3930 hypothetical protein - Prom 124063 - 124122 4.9 - Term 124089 - 124158 4.7 88 31 Tu 1 . - CDS 124196 - 124951 1063 ## COG0149 Triosephosphate isomerase - Prom 124972 - 125031 7.2 89 32 Op 1 . - CDS 125058 - 126416 846 ## BF3957 hypothetical protein 90 32 Op 2 . - CDS 126406 - 126969 588 ## BT_3927 hypothetical protein - Prom 127005 - 127064 6.0 - Term 126985 - 127032 11.0 91 33 Op 1 . - CDS 127071 - 127943 718 ## COG0739 Membrane proteins related to metalloendopeptidases 92 33 Op 2 . - CDS 127964 - 128428 364 ## COG0105 Nucleoside diphosphate kinase - Prom 128476 - 128535 6.5 - Term 128584 - 128624 2.6 93 34 Op 1 . - CDS 128652 - 130748 1402 ## COG1200 RecG-like helicase 94 34 Op 2 . - CDS 130749 - 131408 296 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 95 34 Op 3 . - CDS 131416 - 131967 720 ## COG0693 Putative intracellular protease/amidase 96 34 Op 4 . - CDS 131995 - 132870 940 ## BT_3921 hypothetical protein 97 34 Op 5 . 
- CDS 132877 - 133293 312 ## BF3738 putative tansport related protein 98 34 Op 6 . - CDS 133301 - 134017 843 ## COG0811 Biopolymer transport proteins 99 34 Op 7 . - CDS 134014 - 134730 753 ## COG0854 Pyridoxal phosphate biosynthesis protein - Prom 134876 - 134935 5.1 + Prom 134694 - 134753 6.8 100 35 Tu 1 . + CDS 134843 - 135748 529 ## COG0061 Predicted sugar kinase - Term 136089 - 136147 20.0 101 36 Tu 1 . - CDS 136381 - 137253 981 ## BT_3912 hypothetical protein - Prom 137351 - 137410 4.3 - Term 137308 - 137346 4.5 102 37 Op 1 . - CDS 137428 - 139122 1233 ## gi|295086521|emb|CBK68044.1| hypothetical protein 103 37 Op 2 . - CDS 139156 - 141024 1431 ## gi|260171941|ref|ZP_05758353.1| hypothetical protein BacD2_08750 104 37 Op 3 . - CDS 141011 - 143224 1431 ## BT_3590 alpha-N-acetylglucosaminidase precursor - Prom 143245 - 143304 6.0 - Term 143251 - 143297 4.9 105 38 Op 1 . - CDS 143328 - 144788 1365 ## gi|260171943|ref|ZP_05758355.1| hypothetical protein BacD2_08760 106 38 Op 2 . - CDS 144809 - 146659 1543 ## BDI_3134 hypothetical protein 107 38 Op 3 . - CDS 146666 - 150097 2734 ## BDI_3133 hypothetical protein - Prom 150120 - 150179 6.0 108 39 Op 1 6/0.000 - CDS 150223 - 151365 871 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 151402 - 151461 6.9 - Term 151439 - 151487 6.4 109 39 Op 2 . - CDS 151510 - 152100 364 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 152155 - 152214 2.3 + Prom 152629 - 152688 7.3 110 40 Tu 1 . + CDS 152735 - 153676 1330 ## COG0039 Malate/lactate dehydrogenases + Term 153704 - 153749 9.5 111 41 Tu 1 . 
- CDS 153693 - 153824 69 ## gi|260171950|ref|ZP_05758362.1| hypothetical protein BacD2_08795 + Prom 153793 - 153852 5.4 112 42 Op 1 13/0.000 + CDS 153949 - 155412 1122 ## COG1538 Outer membrane protein + Prom 155434 - 155493 1.9 113 42 Op 2 9/0.000 + CDS 155519 - 156511 1125 ## COG0845 Membrane-fusion protein 114 42 Op 3 22/0.000 + CDS 156514 - 157695 551 ## COG0842 ABC-type multidrug transport system, permease component 115 42 Op 4 . + CDS 157696 - 158955 921 ## COG0842 ABC-type multidrug transport system, permease component + Term 158972 - 159034 6.8 - Term 159031 - 159076 7.1 116 43 Tu 1 . - CDS 159110 - 159697 652 ## BT_0646 hypothetical protein - Prom 159874 - 159933 10.0 + Prom 159906 - 159965 7.0 117 44 Op 1 . + CDS 160096 - 160461 367 ## BT_3899 transcriptional regulator 118 44 Op 2 . + CDS 160480 - 162297 1081 ## BT_3898 TonB + Term 162314 - 162357 3.0 + Prom 162408 - 162467 6.8 119 45 Tu 1 . + CDS 162487 - 164400 1213 ## BT_3897 putative thiol:disulfide interchange protein DsbE + Term 164426 - 164479 13.3 - Term 164417 - 164464 10.3 120 46 Op 1 . - CDS 164518 - 165618 1311 ## COG0489 ATPases involved in chromosome partitioning - Prom 165643 - 165702 10.2 121 46 Op 2 . - CDS 165709 - 166467 707 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase - Prom 166491 - 166550 2.7 - Term 166505 - 166555 10.3 122 47 Op 1 . - CDS 166580 - 167203 578 ## COG5523 Predicted integral membrane protein 123 47 Op 2 . - CDS 167249 - 168268 1155 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 124 47 Op 3 . - CDS 168319 - 168528 229 ## BT_3891 hypothetical protein 125 47 Op 4 . - CDS 168589 - 169878 906 ## COG1570 Exonuclease VII, large subunit 126 47 Op 5 . - CDS 169891 - 171255 895 ## COG1404 Subtilisin-like serine proteases 127 47 Op 6 . - CDS 171280 - 172365 864 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 172445 - 172504 5.3 128 48 Tu 1 . 
+ CDS 172476 - 173339 421 ## BT_3886 hypothetical protein - Term 173283 - 173315 -1.0 129 49 Op 1 . - CDS 173329 - 174336 896 ## COG1409 Predicted phosphohydrolases 130 49 Op 2 . - CDS 174356 - 174838 500 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 131 49 Op 3 . - CDS 174877 - 175491 635 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 132 49 Op 4 . - CDS 175491 - 176141 643 ## COG2344 AT-rich DNA-binding protein - Prom 176162 - 176221 6.5 + Prom 176113 - 176172 4.5 133 50 Op 1 . + CDS 176307 - 176657 384 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins 134 50 Op 2 . + CDS 176675 - 177184 495 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 177211 - 177259 6.4 - Term 177205 - 177239 6.0 135 51 Op 1 38/0.000 - CDS 177306 - 178298 1317 ## COG0264 Translation elongation factor Ts - Prom 178318 - 178377 3.1 - Term 178353 - 178392 2.3 136 51 Op 2 . - CDS 178420 - 179256 1397 ## PROTEIN SUPPORTED gi|237719097|ref|ZP_04549578.1| 30S ribosomal protein S2 137 52 Op 1 59/0.000 - CDS 179378 - 179764 640 ## PROTEIN SUPPORTED gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 138 52 Op 2 . - CDS 179771 - 180232 795 ## PROTEIN SUPPORTED gi|160883129|ref|ZP_02064132.1| hypothetical protein BACOVA_01098 - Prom 180329 - 180388 5.0 139 53 Op 1 . - CDS 180436 - 180729 132 ## Phep_2828 RagB/SusD domain protein 140 53 Op 2 . - CDS 180726 - 182129 1246 ## Phep_2828 RagB/SusD domain protein 141 53 Op 3 . - CDS 182151 - 185363 3072 ## Phep_2829 TonB-dependent receptor plug 142 53 Op 4 . - CDS 185407 - 187953 1850 ## BT_4652 hypothetical protein - Prom 187982 - 188041 2.2 - Term 188004 - 188048 5.2 143 54 Op 1 . - CDS 188081 - 189631 1537 ## gi|260171980|ref|ZP_05758392.1| hypothetical protein BacD2_08955 144 54 Op 2 . - CDS 189680 - 190690 1240 ## BT_3148 hypothetical protein 145 54 Op 3 . - CDS 190728 - 191465 555 ## BDI_3526 hypothetical protein 146 54 Op 4 . 
- CDS 191492 - 194095 2092 ## BT_4652 hypothetical protein - Prom 194264 - 194323 3.6 147 55 Tu 1 . - CDS 194896 - 198993 2562 ## COG0642 Signal transduction histidine kinase - Prom 199130 - 199189 3.9 148 56 Tu 1 . - CDS 199191 - 200399 1144 ## BT_2913 unsaturated glucuronylhydrolase - Prom 200560 - 200619 5.7 - Term 200570 - 200606 -0.6 149 57 Tu 1 . - CDS 200657 - 201148 526 ## BT_3874 hypothetical protein - Prom 201184 - 201243 8.6 - Term 201224 - 201272 10.7 150 58 Op 1 . - CDS 201290 - 202693 1475 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 202737 - 202796 2.3 151 58 Op 2 . - CDS 202798 - 204246 1899 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 152 58 Op 3 . - CDS 204320 - 205666 1523 ## COG0015 Adenylosuccinate lyase - Prom 205722 - 205781 7.0 - TRNA 205809 - 205883 87.4 # Val TAC 0 0 - TRNA 205919 - 205993 87.4 # Val TAC 0 0 - Term 206106 - 206140 -1.0 153 59 Op 1 . - CDS 206180 - 206725 471 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 154 59 Op 2 . - CDS 206729 - 207502 599 ## COG3022 Uncharacterized protein conserved in bacteria - Prom 207528 - 207587 9.0 + Prom 207535 - 207594 6.4 155 60 Tu 1 . + CDS 207641 - 209623 1510 ## COG3525 N-acetyl-beta-hexosaminidase + Term 209665 - 209704 5.3 - Term 209640 - 209701 13.8 156 61 Op 1 . - CDS 209725 - 212946 3814 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) - Prom 212982 - 213041 6.6 - Term 212987 - 213027 4.0 157 61 Op 2 . - CDS 213221 - 213808 443 ## gi|260171994|ref|ZP_05758406.1| hypothetical protein BacD2_09025 - Prom 213982 - 214041 7.1 + Prom 213840 - 213899 3.7 158 62 Tu 1 . + CDS 214012 - 215106 1346 ## COG0180 Tryptophanyl-tRNA synthetase + Term 215131 - 215193 4.0 - Term 215126 - 215175 11.3 159 63 Tu 1 . - CDS 215197 - 217539 1552 ## COG5373 Predicted membrane protein - Term 218278 - 218344 16.5 160 64 Op 1 . 
- CDS 218396 - 219583 1290 ## BT_3852 major outer membrane protein OmpA - Prom 219624 - 219683 2.5 161 64 Op 2 . - CDS 219687 - 221588 1623 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 162 64 Op 3 . - CDS 221606 - 221905 321 ## BF4070 hypothetical protein 163 64 Op 4 . - CDS 221907 - 223670 1313 ## BT_3849 hypothetical protein 164 64 Op 5 . - CDS 223693 - 225075 1546 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 165 64 Op 6 . - CDS 225085 - 225930 859 ## BT_3847 hypothetical protein 166 64 Op 7 . - CDS 225985 - 227532 1030 ## BT_3846 peptidyl-prolyl cis-trans isomerase - Prom 227572 - 227631 2.8 - Term 227546 - 227608 7.8 167 65 Op 1 1/0.000 - CDS 227633 - 229111 1883 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 229135 - 229194 3.2 - Term 229121 - 229183 14.3 168 65 Op 2 . - CDS 229198 - 231378 1827 ## COG0514 Superfamily II DNA helicase - Prom 231406 - 231465 3.8 + Prom 231058 - 231117 3.8 169 66 Tu 1 . + CDS 231296 - 231493 135 ## - Term 231431 - 231474 0.1 170 67 Op 1 24/0.000 - CDS 231547 - 232791 250 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 171 67 Op 2 29/0.000 - CDS 232794 - 233456 812 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 233476 - 233535 5.6 - Term 233515 - 233573 6.4 172 67 Op 3 . - CDS 233600 - 234955 1797 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 234975 - 235034 4.6 + Prom 235316 - 235375 6.7 173 68 Tu 1 . + CDS 235397 - 235642 314 ## COG0724 RNA-binding proteins (RRM domain) + Term 235673 - 235715 8.1 - Term 235663 - 235699 5.0 174 69 Op 1 23/0.000 - CDS 235708 - 236478 324 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 175 69 Op 2 . - CDS 236475 - 237218 597 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component - Prom 237273 - 237332 3.7 + Prom 237166 - 237225 2.7 176 70 Tu 1 . 
+ CDS 237294 - 238055 274 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 238108 - 238161 7.2 - Term 238092 - 238153 12.2 177 71 Op 1 . - CDS 238162 - 239475 1384 ## COG1160 Predicted GTPases 178 71 Op 2 . - CDS 239524 - 240405 933 ## COG1159 GTPase 179 71 Op 3 . - CDS 240492 - 241499 814 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Prom 241519 - 241578 4.8 - Term 241518 - 241574 6.7 180 72 Op 1 . - CDS 241587 - 241772 338 ## PROTEIN SUPPORTED gi|160882088|ref|ZP_02063091.1| hypothetical protein BACOVA_00026 181 72 Op 2 . - CDS 241785 - 242363 610 ## BT_3832 hypothetical protein - Prom 242474 - 242533 3.9 Predicted protein(s) >gi|225935361|gb|ACGA01000031.1| GENE 1 13 - 1776 1651 587 aa, chain + ## HITS:1 COG:BH1242 KEGG:ns NR:ns ## COG: BH1242 COG0317 # Protein_GI_number: 15613805 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Bacillus halodurans # 1 585 147 726 728 333 35.0 4e-91 MRTLGSMLPNKQYKIAGETLYIYAPLANRLGLYKIKTELENLSFKYEHPEEYAEIEEKLN ATAAERDKVFNDFTAPIRTQLDKMGLKYRILARVKSIYSIWNKMQTKHVPFEEIYDLLAV RIIFEPRNEEEELNDCFDIYVSISKIYKPHPDRLRDWVSHPKANGYQALHVTLMGNNGQW IEVQIRSERMNDVAEQGFAAHWKYKEGGGSEDEGELEKWLRTIKEILDDPQPDAIDFLDT IKLNLFASEIFVFTPKGELKTMPQNSTALDFAFSLHTDIGSHCIGAKVNHKLVPLSHKLQ SGDQVEILTSKSQRVQPQWEVFATTARARAKIAAILRKERKANQKIGEEILSEFLKKEEV RPEEAVIEKLRKLHNAKNEEELLAAIGSKAIVLGEADKNELKEKQTSNWKKYLTFSFGNS KEKQEEKEPQEKEKINPKEVLKLTEESLQKKYIMAECCHPIPGDDVLGYVDENDRIIIHK RQCPVAAKLKSSYGNRILATEWDTHKELSFLVYIYIKGIDNMGLLNEVTQVISRQLNVNI RKLTIETEDGIFEGKIQLWVHDVDDVKTICNNLKKIQNIKQVSRVEE >gi|225935361|gb|ACGA01000031.1| GENE 2 1900 - 2244 417 114 aa, chain - ## HITS:1 COG:AGc2183 KEGG:ns NR:ns ## COG: AGc2183 COG0789 # Protein_GI_number: 15888519 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 82 26 105 203 66 41.0 1e-11 
MLNTDKELKLYYSIGEVADMFGVNPSLLRFWEKEFPQISPKTAGRGIRQYRKEDVETIGL IYHLVKEKGMTLPGARQRLKDNKEATVRNYEIVNKLKAIKEELLAIKRELDGRE >gi|225935361|gb|ACGA01000031.1| GENE 3 2265 - 3233 809 322 aa, chain - ## HITS:1 COG:mll8577 KEGG:ns NR:ns ## COG: mll8577 COG0739 # Protein_GI_number: 13477076 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Mesorhizobium loti # 84 297 222 423 434 133 34.0 6e-31 MRKVYYIYNSQTQTYDRIYPTVRQRALSILRRLFYGMGLGAGCFIVLLLIFGSPSEKELR IENSRLLAQYNVLSRRLDDAMGVLQDIQQRDDNLYRVILQADPVSPAIRQAGYGGTNRYE ELMDLANSKLVVNTTQKLDVLSKRLYIQSKSFDDVIDICKNHDEMLKCIPAIQPISNKDL RQTASGYGTRIDPIYGTTKFHAGMDFSAHPGTDVYATGNGTVVKVGWETGYGNTIEIDHG FGYLTRYAHLQGFNTKVGKKVVRGEIIGKVGSTGKSTGPHLHYEVHVKGQVVNPVNYYFM DLSAEDYEKMIQLAANHGKVFD >gi|225935361|gb|ACGA01000031.1| GENE 4 3341 - 5959 2532 872 aa, chain + ## HITS:1 COG:ZalaS KEGG:ns NR:ns ## COG: ZalaS COG0013 # Protein_GI_number: 15803211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 3 870 4 871 878 650 43.0 0 MLTAKEIRDSFKNFFESKGHHIVPSAPMVIKDDPTLMFTNAGMNQFKDIILGNHPAKYQR VADSQKCLRVSGKHNDLEEVGHDTYHHTMFEMLGNWSFGDYFKKEAINWAWEYLVEVLKL NPEYLYATVFEGSPEEGLSRDDEAASYWEQFLPKDHIINGNKHDNFWEMGDTGPCGPCSE IHIDLRPAEERAKISGRDLVNHDHPQVIEIWNLVFMQYNRKADGSLEPLPAKVIDTGMGF ERLCMALQGKTSNYDTDVFQPMLKAIAAMSGTEYGKDKQQDIAMRVIADHIRTIAFSITD GQLPSNAKAGYVIRRILRRAVRYGYTFLGQKQAFMYKLLPVLIDNMGEAYPELVAQKTLI EKVIKEEEESFLRTLETGIRLLDKTMEDTKASGKTEISGKDAFTLYDTFGFPLDLTELIL RENEMTVNIDEFNEEMQQQKQRARNAAAIETGDWIILKEGTTEFVGYDYTEYEASILRYR QIKQKNQTLYQIVLDYTPFYAESGGQVGDTGVLVSEFETIEVIDTKKENNLPIHITKKLP EHPEAPMMACVDTDKRAACAANHSATHLLDAALREVLGEHIEQKGSLVTPDSLRFDFSHF QKVTDEEIRKVEHIVNARVRANIPLKEYRNIPIEEAKELGAIALFGEKYGDKVRVIQFGS SIEFCGGTHVAATGNIGMVKIMSESSVAAGVRRIEAYTGARVEELLDTVQDTLSDLKALF NNAPDLGIAIRKYLDENAGLKKQVEDFMKEKEAALKERLLKNIQEIHGIKVVKFCAPLPA EVVKNIAFQLRGEITENLFFVAGSIDNGKPMLTVMLSDNLVAGGLKAGNLVKEAAKLIQG 
GGGGQPHFATAGGKNTDGLPAAVEKVLELAGI >gi|225935361|gb|ACGA01000031.1| GENE 5 6298 - 8535 1753 745 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 20 745 37 777 790 540 39.0 1e-153 MRTPAYLCLLLTCMLISCTPTQEKHKIDYTSYVNPFIGTDFTGNTYPGAQAPFGMVQLSP DNGLPGWDRISGYFYPDSTIAGFSHTHLSGTGAGDLYDISFMPVILPYKEAEAPLGIHSK FSHKDESAHAGYYQVRLTDYNINVELTATERCGIQRYTFPEAQSAIFLNLKKAMNWDFTN DSHIEVVDSVTIQGYRFSDGWARDQHIYFRTRFSKPFEKMELDTTAIIKDNKRIGTAVIA RFYFNTQKDEQILVNTAISGVSMEGAAKNLQAEVPENDFDKYLAETKANWNHQLGKIEIK GDNENDKVNFYTALYHSMIAPTIYSDVDGAYYGPDKKVHQADGWVNYSTFSLWDTYRAAH PLFTYTEPERVNDMVKSFIAFYEQNGRLPVWNFYGSETDMMIGYHAVPVIVDAYLKGIGN FDAKKALDACIATANLDNYRGIGLYKELGYIPYNVTDHYNAENWSLSKTLEYAFDDYCIA ETAKKMGNQDIADEFYKRSQNYKNVYNPATSFMQPRDDKGVFIKDFKADDYTPHICESNG WQYFWSVQHDIDGLIDLVGGKNRFAEKLDSMFTYHPAADDELPIFSTGMIGQYAHGNEPS HHVIYLFNAIGQENRTQEYVAKVMNELYKNEPAGLCGNEDCGQMSAWYVFSAMGFYPVNP VSGKYEIGTPLFPEMQLHLANGKTFTVLAPKVNKENIYIQSIKVNGQLYDKTYITHEQIM SGATVEFEMSKTPKISGGTFKERNP >gi|225935361|gb|ACGA01000031.1| GENE 6 8561 - 9115 415 184 aa, chain + ## HITS:1 COG:CC0981 KEGG:ns NR:ns ## COG: CC0981 COG1595 # Protein_GI_number: 16125233 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 10 171 14 175 201 63 31.0 2e-10 MNENFDVTYKSLFRRYYPNLIFYATRLVGEEEAEDVVQDVFVELWKRKDNIEIGEQIQAF LYRAVYTRALNVLKHRNVEDGYCAAMEEINQRRTEFYQPDNNEVIRRIEDRELRKEIHDA INELPDKCKEVFKLSYLHDMKNKEIADILGVSLRTVEAHMYKALKYLRGRLNPLWTILFL FLWR >gi|225935361|gb|ACGA01000031.1| GENE 7 9168 - 10160 822 330 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 18 317 
29 313 331 75 27.0 1e-13 MSNLSEEIINRYLTGQCSEEELIEVNAWMKESEENARQLFRMEEIYHLGKFDQYANEQRI LRAEKQLYKKLDEEKSKQKIALNMQRWMKYAAMIAVILVIGGGAGYWLYQNGNNQQMMVA VANEGIVKEVILPDGTKVWLNNSATLKYPREFSEKERNVYLDGEAYFEVTKNRHKPFTVQ SDAMRVRVLGTTFNFKSDKNCRIAEATLIEGEIEVKGNKEEGQIILAPGQRAELNKNNGR LTVKQVNAKLDAVWHDNLIPFQQADIFTISKALERFYDVKIILSPDIQADKTYSGVLKRK SNIESVLKSLQNSIPIDYKIVGNNIFISPQ >gi|225935361|gb|ACGA01000031.1| GENE 8 10205 - 12484 1826 759 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 29 757 45 772 790 541 39.0 1e-153 MKLRTFLTVGCLGGLFTLSSCTAPTNVKDYSAYVNPFIGTGGHGHTFPGAIVPHGMIQPS PDTRIDGWDACSGYYYADSTINGFSHTHVSGTGCCDYGDVLLMPTVGKQQYLTTDPQSQT LVYASSFSHENEVAEPGYYSVFLDTYQVKAEISATKRGAIHRYTFPENLESGFIIDLDYS LQRQTNSEMEIEVISDTEICGHKKTTYWAFDQYINFYAKFSKPFSYTLITDSITMDDGKR LPVCKALLHFDTKKNEQVLVKVGVSAVDIAGARKNVESEIPGWDFDKVRKDARQAWNQYL SKIDITTSDKEDKTIFYSALYHTGISPNLFSDADGRYLGMDLEVHQGDTVNPVYTVFSLW DTFRALHPLMTIIDPDLNNQFINSLIKKHQEGGIYPMWDLASNYTGTMIGYHAVPVIVDA YMKGYRNFDAKEAYKACLRAAEYDTTGIKCPDLVLPHLMPKAKYYKNAIGYIPCDRENES VAKALEYAYDDWCISIFAEAMNDFESKAKYERFAKAYEFYFDKSTRFMRGLDSKGEWRTP FNPRASTHRNDDYCEGTAWQWTWFVPHDVEGLVNLMGGEDAFVQKLDSLFTVDSSLEGET TSSDISGLIGQYAHGNEPSHHVIHLYNYVNRPWRTQELVDSVYRSQYANSVDGLSGNEDC GQMSAWYILNSMGFYQVCPGKPVYSIGRPAFDKAVINLPGGKTFSIIAKNNSKKNKYIES ISLNGKSLETPFFNHQDIVNGGTMEIRMTDKPNYKTHIP >gi|225935361|gb|ACGA01000031.1| GENE 9 12481 - 14751 1893 756 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 31 754 13 716 717 436 34.0 1e-122 MKRILLIACVLGCTLFAKAKDWTQYVNPLMGSQSTFELSTGNTYPAIARPWGMNFWTPQT GKMGDGWQYTYTANKIRGFKQTHQPSPWINDYGQFSIMPIVGKPVFDEEKRASWFAHKGE IATPYYYKVYLAEHDVVTEMAPTERAVLFRFTFPENDHSYVVVDAFDKGSYIKIIPEENK 
IIGYTTRNSGGVPENFKNYFIIEFDKPFTYKATVANGNLQENIAEQTTDHAGAIIGFQTQ KGEQVHARIASSFISFEQAAANMKELGKDNIEQVAKKGKEAWNQVLGKIEVEGGNLDQYR TFYSCLYRSLLFPRKFYELDANGEPIHYSPYNGQVLPGYMYTDTGFWDTFRCLFPLLNLM YPSVNKEMQEGLINTYKESGFFPEWASPGHRGCMVGNNSASVLVDAYMKGVKVDDIKILY EGLLHGTENVHPEVSSTGRLGHEYYNKLGYVPYDVKINENAARTLEYAYDDWCIYKLAKE LKRPKKEINLFAKRAMNYKNLFDKESKLMRGRNEDGTFQSPFSPLKWGDAFTEGNSWHYT WSVFHDPQGLIDLMGGKDEFITMMDSVFAVPPIFDDSYYGQVIHEIREMTVMNMGNYAHG NQPIQHMIYLYDYAGQPWKAQYWLRQVMDRMYTPGPDGYCGDEDNGQTSAWYVFSALGFY PVCPGTDEYIVGAPLFKKATLHFENGNSLIIDAPNNSKKNFYINSMNFNGTDYTKNYLRH EDLFKGGTIKVDMSNKPNQDRGIKEEDMPYSFSKEK >gi|225935361|gb|ACGA01000031.1| GENE 10 15408 - 18791 3356 1127 aa, chain + ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1127 9 1135 1135 1540 70.0 0 MESNHLIRRTKFVRALLILLFMVVPVQWTVAQLTLSTPRTTLGTVIKQIQSQSKYQFFYS DKLSSIAVDALQVKDASLENVLSALLKGKNISYKVEDNIVYLSDRESAQAAQQQVGKERT ISGNVVDSKGEPLIGVSVLIKGTTSGGITDFDGNYKVTTNEVNPVIVFSYIGYKTQEVSV KGQTSINITLQEDTQVIDEVVVTALGIKRSEKALSYNVQQVNADDITRNKDVNFVNSLSG KVAGVNINASSSGVGGVSKVVMRGTKSIMQSSNALYVIDGVPMYAGTNQGGTEFSSKGAT EPIADVNPEDIESMSVLTGAAAAALYGSDAANGAIIITTKKGKEGRMSITVNSNTEFSNT FVTPSFQNRYGTGSSLSSNDQIHSWGKLLNSANSYGFDPVSDYFQTGITGTESISFSTGT DKNQTYASAAAVNSKGIVPNNKYARYNFTFRNTTTFLDDKMTLDVGATYVLQKDQNMINQ GTYNNPVVGAYLYPRGNDWEEIKMYERYDQTRKLNTQYWPVGDAGIVMQNPYWINYRQLR NNDKDRYMLNASLNYKILDWLSLSGRVRLDNSMNTYTEKYYAGTNTQMTEQSNRGLYTNA TSKDKQLYADFLVNINKTFGELWSLQANIGGSFTDMRSDVMEIRGPIADGSAAFEGETPG LANEFNIQNLSAKKTSRMQTGWREQTQSLFASAELGYKSTYYLTLTGRNDWPSQLAGPNS KSKSFFYPSVGASVVLSELMPNLNRNYLSYMKLRASFASVGTAFERYIANPRYEWNSSTG QWSNTTQYPVYNLKPERTQSWEVGLTMRFLNNFNLDVTYYNTITKNQTFNPQLGVSGYSA LYIQTGAVRNQGVELSLGYDKTWNKFNWNSNFTFSTNQNKILELADNAINPATGESFSIN TLNMGGLGSARFLLKEGGSMGDVYSNRDLKRDANGNIYVDANGQLSTETISNMDDYIKLG SVLPKANMAWRNDFSWNNFRFGFMISARLGGIVFSRTQAMLDEFGVSEATAEARDRGYIQ INGGDKINPENWYRTIGSGDAVAQYYTYSATNIRLQEASIGYTFSRKMLGNVCDLTVSLV GRNLWMLYSKAPFDPESVASTNNFYQGIDYFMMPSTRNIGCSVTLKF 
>gi|225935361|gb|ACGA01000031.1| GENE 11 18803 - 20428 1464 541 aa, chain + ## HITS:1 COG:no KEGG:BT_3984 NR:ns ## KEGG: BT_3984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 537 1 536 537 645 61.0 0 MRSIINKFILGTLTVASVSCTGNYMDINSNPYQPGDLSADDYALGSAMNNLASCVVSSDV NTAQFTDCLLGGPMGGYFADSNSSWNNTIANYNATDNWTNVFLKSDRIIPILFTNLTAVE TIARNSGNEVPLAIAKIIKVAAMSRVTDTYGPIPYSQIGKDGSVTTPYDSQEVVYDAFFQ ELSEAVKTLQANPEASLTATADYVYSGNLTKWIKFANSLKLRLAMRIVYANETKARQWAE DAVKADNGGVIEANADNAQWNYFGTVTNPLYTATRYNAAADTETGGDTHAAADIICYMNA YSDPRLKAYFVKSEWGGDNEYVGIRRGIDLSTISNIARKYSGVKVAQSDPIYWMNAAEVA FLRAEAVGVFGFTNVGSDAKTLYEKGVALSFEQWGAGSADGYFTSNNQDKKLSYTDPSGA NSYGQTDKFIAIAAKWDASASKEQMQQRIITQKWIANWMLGNEAWADFRRTGYPYLMPAT ESGNKSGGVISNLEAGARRMPYPTDEFISNKENVEYAVANLLKGANTYATRIWWDCKSKA N >gi|225935361|gb|ACGA01000031.1| GENE 12 20458 - 21486 832 342 aa, chain + ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 339 13 357 358 178 34.0 2e-43 MKNIIRPIIGLFILSSLACVSCTDVESLDITRPTAQEQNPEAYAKYLSNLREYKSTEHKV VYAWFDNSTKTPASPAHHITNIPDSVDVISMLTPTLADFEKTDIETVHQKGTKVVYTISY DDIKAAYDELQTAEEENGGTSTLEAFDIYLKSEIEKLLTHASSYDGLIAKYIGQNPEFMA DDVKAEYKKYQDVFLTTIKSWKDANKEKMLVWMGKPQNLITRTILTDCKNIILDTEGVND TNQLRLSVTKALVENVPSNNLIFAVSTTAADTSNKETGYWGTGDTALRALSEVAYWVTID DTHAYTKAGIAIYNVQNDYYATGGTHTYVREAINIMNPSPIK >gi|225935361|gb|ACGA01000031.1| GENE 13 21495 - 22661 755 388 aa, chain + ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 388 3 384 384 181 32.0 4e-44 MKLKNIYLVSFALLAGNLISSCADDIENFDNQLFMTAKTPESVLVMSTSPDEERSFTLSI AKPESQDVTFTMRVAPELVSVYELQNYTSNVEMLPEAHYELADANGKILAGSLNSDPITV KFKDIKGLDVKKVYVLPLTIGNSNIGILASANTYYYVFKEGHLVNLAANIKENAICIDKW ATPDLLKNMHTLTAEALICIHSLTNTVNTLMGIEGHFLLRFGDANLDPTTLEIACANNKR 
VPTPIKVGEWVHVAVTFNSDEGTIKVYYDGQQVGNFTNVGQGPVDWSPEFSNDNDGKPRS FWVGHSYDNGRWLDADIAEIRIWNRELDETEIQEPNHFYQVDPASDGLVVYWKLDERSDE IKDATSNGNNGKAYKAMEWVEASLPANN >gi|225935361|gb|ACGA01000031.1| GENE 14 22693 - 24660 1166 655 aa, chain + ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 387 3 382 476 153 31.0 2e-35 MKHLFNIRLASMAVALAFAGATFTACEDDIEITNSTSNPWGELDGTFGSVRSAAGAKMGT TLTLRNNEDATGYVYFELSKVSDQDTKVTFKIDNSALSTYNAANGTSYEMYPNVTLENEG KTTITAGKRKSELLAVNVPAGKTGNNYAVAITATADNGTTIASNNASYVYLIKSIQIPSY TPRDIKNIVYVEVNNENPLNAGEYTIGDEKAPFFDIVSIFAANINLDKEGSPYISCNEQT TFVLHNADRIIRPLQEKGIKVHLSILGNHDDAGMRSLSDEGAKVFAKELKAYADIYGLDG FDFDDEYSSYDKGAYQGSSKAVVNNASQCTQERYANLIYEYRQLMPKESNISFGIYWYRT SDYPKGDINGVSAADMVDYTVFGSYGSFRALSGFSNKIQAPYAITLAGASTSDGSPIGIR SNSTYLNNVKNNGYGMFAFYNLNNSRSVTRVFNEMSNIIYGQPVEWNGKYYERTEFVAKT GKSTNYEDYLGTWTVTASKSLYWNGSSWAQGDNESFIINIIEKENGKSYNVYGWDGREET QKYPFELTYDNGMIKIASQQTIHTPDTDDGNEWVMSFASGSTKANWNATTTSRDNAFNGE MAPNGTLMLLDYSSMTASKNANKYAFSLFKKDGTTYQAAIKAENERLAGWYILSR >gi|225935361|gb|ACGA01000031.1| GENE 15 24767 - 26185 697 472 aa, chain - ## HITS:1 COG:mll1421 KEGG:ns NR:ns ## COG: mll1421 COG0507 # Protein_GI_number: 13471448 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 21 464 6 374 375 85 26.0 2e-16 MINNYLERQIKENFPYQPTLEQEIAVKSLSEFLLSTLADEVFILRGYAGTGKTSLVGALV KTMDQLQQKSVLLAPTGRAAKVFSAYAGHPAFTIHKKIYRQQSFSNELSNFSINDNLATN TLFIVDEASMISNEGLSGSVFGTGRLLDDLVQFVYSGQGCRLLLMGDTAQLPPVGEELSP ALFSDALKGYGLEVREIDLTQVVRQVQESGILWNATQLRQLIAADDCYSLPKIKIAGFPD IKLVPGTELIEELTNCYDHDGMDETIVVCRSNKRANLYNNGIRAQILWREDELNTGDMLM IAKNNYYWTEKYKEMDFIANGEIAIVRRVRRTRDIYGFRFAEVTLRFPDQNDFELDANLL LDTLHSDSPALPKEDNDRLFYTVLEDYIDIPIKRDRMKKMKADPHYNALQVKYAYAITCH KAQGGQWQNVFLDQGYMTDEYLTPDYFRWLYTAFTRATKTLYLVNYPKEQVE 
>gi|225935361|gb|ACGA01000031.1| GENE 16 26281 - 26967 901 228 aa, chain + ## HITS:1 COG:no KEGG:BT_3981 NR:ns ## KEGG: BT_3981 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 228 1 228 228 408 87.0 1e-113 MKTVFNIVLVLCAAALIYICYTSIMGPINFENAKKDREKAVIARLIDIRKAQQEYRTLHR GMYTDKFDTLIDFVKNQKLPFVMKMGMLTDKQLEDGLTEKKAMAIIEKAKKTGKYDEVKK WGLENFKRDTMWVAVMDTIFPKGFNPDSMKYIPYGGGAQFEMNVRNDTAKSGAPVFLFEV KAPYDTYLSGLDKQEIINLKDVQSKLGRYCGLMVGSIDTPNNGAGNWE >gi|225935361|gb|ACGA01000031.1| GENE 17 27177 - 27737 450 186 aa, chain + ## HITS:1 COG:no KEGG:BT_3980 NR:ns ## KEGG: BT_3980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 71 256 256 314 86.0 8e-85 MMASKRFTIVPLDLFEEEQAELLFYHNHQKRENETVLYNILRKNNVAVIFGIDKSAQIFL NEQYPEARFYSQSTPFIDYFSVKSRLGNSKKMYASVRKDGIDIYCFERGHLLLANSFECT HTEDRIYYLLYAWKQLEFDQERDELHLTGTLSEKEVLMSELKKFILQVFIMNPATNIDMQ ALLTCE >gi|225935361|gb|ACGA01000031.1| GENE 18 27728 - 28261 227 177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 177 13 193 199 92 33 2e-17 MRVISGIYKRRRFDVPRTFKARPTTDFAKENLFNVLNNYIDFEEGVTALDLFAGTGSISI ELVSRGCDRVISVEKEPAHHSFICKIMKEVQTDKCLPIRGDVFKFIKNGREQFDFIFADP PYALKELETIPELIFQNDLLKEGGLLVLEHGKDNNFEENPHFIERRVYGSVNFSLFR >gi|225935361|gb|ACGA01000031.1| GENE 19 28286 - 29725 879 479 aa, chain - ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Bacillus subtilis # 11 479 3 482 482 357 37.0 3e-98 MIDWNYMLSQIATVAFDILYFGAIIGTIVIIILDNRNPVKTMAWILILLFLPIVGLVFYF FFGRSQRRERIIGQKSYDRLLKKPMAEYLAQDCSDVPYEYSRLIQLFQQTNQAFPFEGNR VAIYTEGYTKLQSLLRELQKAKQHIHMEYYIFEDDAIGRLVRDVLIEKASQGVEVRVIYD DVGCWHVPNRFFEEMRNAGIEVRSFLKVRFPLFTSRVNYRNHRKIVVIDGRVGFVGGMNL 
AERYMRGFSWGIWRDTHIMLEGKAVHGLQTAFLLDWYFVDRTLITASRYFPKIDSCGNSL AQIVTSEPIGPWKEIMQGLTVAITSAKKYFYMQTPYFLPTEQILAAMQTAALSGVDVRLM LPERADNWITHLGSRSYLADVMQAGVKVYFYKKGFLHSKLMVSDDMLSTVGSTNVDFRSF EHNFEVNAFMYDVETALEMKEIFLQDQRESTQIFLKNWIRRSWRQKAAESIVRLLAPLL >gi|225935361|gb|ACGA01000031.1| GENE 20 29858 - 32782 2478 974 aa, chain - ## HITS:1 COG:no KEGG:BF0745 NR:ns ## KEGG: BF0745 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 950 1 944 970 1251 71.0 0 MKRFSAGLVLMLLCTFSIFAQNKVITVSGRVVEADTKEPAAQATVQLLSLPDSAYAAGIA SSNQGWFTLPKVKAGKYVLKVSYIGFRTKLVPIQLSANATDKKMGTIALDPDAVMLKEAV ITAEAPQVTVKEDTLEYNSAAYRTPEGAMLEELVKKLPGAEIDDDGNVKINGKEVKKIMV DGKEFFGGDVKTGLKNLPVNMIDKLKTYDKKSDLARVTGIDDGEEETVLDLKVKKGMNQG WFGNASVAGGTEDRYGSNLMLNRFVDNSQFSLIGSANNVNDQGFSGGGGGPRFRNSNGLT ATKMLGANFATQTDKLELGGSARYNFSDRDATSTNYSERFLQNGNSYSNSNSKGRNKNTN FNADFRLEWKPDTLTNIIFRPNVSYGKSNGYSISESGTFNGDPFNLVSNPNDYLNKVLWG SADDPLDEIRVNASNSESQSEGQDLSANASLQVNRRLNNQGRNITFRGTFSYGDNDSEQF SESLTRYFDKAANKADDERKQYITSPTKSYDYTAELTYSEPIAKATFLQFRYKFQYKYSE SDRSTYSLIPDADKGQDWFWNFGDGLPEGYEENKDRDLSKYAQYKYYNHDINAGLNIIRE KYRLNFGVSLQPQNTRLDYKKAEVDTVVKRNVFNFAPNVDFRYRFSKVSQLRFTYRGRAS QPSMENLLDVTDDSNPLNIRKGNPGLKPSFSHSMRLFYNTYNADSQRGIMAHANLNMTQN NITNATTYNQSTGGVTVKPENINGNWNAMGMFGFNTALKDKRFTINSFSRANYTNAVSYL FNDDTKINDKNTSTTLTLGENLNGTFRNDWFEFTLNGSINYNFERNKLRPENNQEPYTFG YGASTNISLPWSMTLSTNITNNARRGYRDASMNKNELIWNAQIAQNFLKGNAATISFEVY DILRQQSNISRSLTADMRSVSEYNGINSYCMLRFSYRLNIFGNKEARGNMRHGGFDGGGP RGPRGGGYGGGRPF >gi|225935361|gb|ACGA01000031.1| GENE 21 32866 - 33270 290 134 aa, chain - ## HITS:1 COG:no KEGG:BT_3976 NR:ns ## KEGG: BT_3976 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 203 89.0 2e-51 MKALLPVYCRPLGYVVLLVALFIPFILVMRGVVTDHNLLFYKECTKLLMMAGCLLIIFAF SKNESRETEQIRNSAVRNAIFLTFLFVFGGMLWRVMQGDVINVDTSSFLTFLVFNVLCLE FGLKKALVDRFFKR >gi|225935361|gb|ACGA01000031.1| GENE 22 33203 - 33334 89 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MNRATNKTTYPKGRQYTGNNAFMIERFNETKVSKKNILYTFAE >gi|225935361|gb|ACGA01000031.1| GENE 23 33350 - 34411 998 353 aa, chain + ## HITS:1 COG:FN0871_1 KEGG:ns NR:ns ## COG: FN0871_1 COG0337 # Protein_GI_number: 19704206 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Fusobacterium nucleatum # 26 348 26 349 350 187 35.0 3e-47 MSKQEVILCESLETSLGRAIELCPHDKLFVLTDEHTQRLCLPSLKASGLLKDAVEICIGA EDVHKTLETLASVWMALSTQGATRHSLLINLGGGMVTDLGGFAAATFKRGISYINIPTTL LAMVDASVGGKTGINFNGLKNEIGAFAPANSVLIETEFLRTLDAHNFFSGYAEMLKHGLI SNTAHWVELLNFNTSNIDYTALKQLVGQSVQVKEDIVEQDPFEHGIRKALNLGHTVGHAF ESMALAEDRPVLHGYAVAWGIVCELYLSHLKVDFPKEKMRQTIQFIKDNYGVFTFDCKKY DQLYAFMTHDKKNTSGTINFTLLKDIGDICINQTADKDTIFEMLDFYRECMGI >gi|225935361|gb|ACGA01000031.1| GENE 24 34518 - 35387 358 289 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 178 283 180 285 288 62 31.0 1e-09 MLDIKNRETLEIYVLGDDFKFYQNLKYLPLTSYPSNNQSGLIIYCLSGRAKITVHEDVHW IQPEELIILLPGQFVSFCEPSEDFSTITMVISTSLFSDALSGVPRFSPHFFFYMRSHYWY PQSERDIPRMYNYLGMIKDKVTSQDIYRRELIIHLLRYLYLELFNAYQKEATLMTARRDT RKEELANKFFGLIMKHFKENKDVAFYADKLCITSKYLTMVIKETSGKSAKDWIVEYIILE IKALLKNTSLNIQEIAIKTNFANQSSLGRFFRKHTGMSLSQYRMSNLEQ >gi|225935361|gb|ACGA01000031.1| GENE 25 35994 - 37214 848 406 aa, chain + ## HITS:1 COG:HI0245 KEGG:ns NR:ns ## COG: HI0245 COG0809 # Protein_GI_number: 16272205 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Haemophilus influenzae # 8 404 1 353 363 230 36.0 3e-60 MKEDPRHIRISEYNYPLPDERIAKFPLPTRDQSKLLIYRRGEVSEDIFTSLPEYLPQGSL MIFNNTKVIQARLHFRKETGALIEVFCLEPIQPNDYALNFQQTEHAAWLCMIGNLKKWKD GELKREMTVKGFPITLTAVRGECKGTSHWVDFSWDNPEVTFADILEVFGELPIPPYLNRD TEESDKETYQTVYSKIKGSVAAPTAGLHFTPRVLDALQEKGIDLEELTLHVGAGTFKPVK 
SEEIEGHEMHTEYISVNRSTIKKLIDHDGCAIAVGTTSVRTLESLYHIGVILADHPDATE EELHVKQWQPYEKYDQIPPVVALQKILGYLDRNGLEALHTSTQIIIAPGYQYKIVKAMVT NFHQPQSTLLLLVSAFVKGNWRAIYDYALAHDFRFLSYGDSSLLMP >gi|225935361|gb|ACGA01000031.1| GENE 26 37261 - 37779 383 172 aa, chain + ## HITS:1 COG:MT1787 KEGG:ns NR:ns ## COG: MT1787 COG1443 # Protein_GI_number: 15841209 # Func_class: I Lipid transport and metabolism # Function: Isopentenyldiphosphate isomerase # Organism: Mycobacterium tuberculosis CDC1551 # 8 155 12 166 203 64 25.0 8e-11 MLSDNNQEMFPVVDEQGNITGAATRGECHSGSKLLHPVVHLHIFNTRGELYLQKRPEWKD IQPGKWDTAVGGHIDLGESVEIALKREVREELGITDFIPELLTNYIFESEREKELVFVHK TVYDGEIHPSEELDGGRFWTIEEIKENLGKGIFTPNFESELQKVSLIPSLSK >gi|225935361|gb|ACGA01000031.1| GENE 27 37757 - 38299 580 180 aa, chain - ## HITS:1 COG:PA0838 KEGG:ns NR:ns ## COG: PA0838 COG0386 # Protein_GI_number: 15596035 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Pseudomonas aeruginosa # 23 179 3 157 160 165 49.0 4e-41 MKTFVLMMVSLLFAVSLEAQNKSFYDFTVKTIDGKDFPLSSLKGKKVLVVNVASKCGLTP QYAQLEKLYEKYKDKDFVIIGFPANNFMGQEPGSNEEIAQFCSLNYDVTFPMMAKISVKG KEIAPLYQWLTEKKLNGKEDASVQWNFQKFMIDENGNWVGFASPKESPFSEKIVTWIEKE >gi|225935361|gb|ACGA01000031.1| GENE 28 38359 - 40305 1590 648 aa, chain - ## HITS:1 COG:no KEGG:BT_3294 NR:ns ## KEGG: BT_3294 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 648 1 650 650 1095 79.0 0 MRKYLLLSMFLCLVSLLHAQNKFELVSPNGEIKVSFNLSDKIYYNIDYNGEVLLKDNFLQ LILKNQVLGQNPKLRRQKRTSIDEQLTPVVPLKYSKVNNRYNQLLLTFKDYSVEFRAFDD GIAYRFITSQKGDVEVMGEEFAINFPSDYLLHLQQPGGFKTSCEEPYTHIQSNTWKPEDK MAILPALIDTKKDYKILISESDLTDYPCMFLKGTGANGIVSTFPKTPLAFAEDGDRSLKI TEEADYIAKTKGTRNYPWRYFVISKNDKQLIENTMTYRLAEKNQLQDVSWIKPGQVSWEW WNDASPYGPDVNFVSGYNLETYKYYIDFASKFGIPYIIMDEGWAKSTRDPYTPNPKVNLH ELIRYGKEKNVGIVLWLTWLTVENNFDLFKTFNEWGVKGLKIDFMDRSDQWMVNYYERVA REAAKHNLFVDFHGSFKPAGLEYKYPNVLSYEGVRGMENMGGCYPDNSLYLPFMRNAVGP 
MDYTPGAMISMQPNVYRSERPNSASIGTRAYQLALFVVFESGLQMLADNPTLYYRNEDCT RFITQVPVTWDETVALEAKAGEYVIVAKRKGDKWFIGGMTNNGEREREFTIKLDFLNKDR SYQMTSFEDGINAGRQAMDYRCKSSQVKAGEQLTVKMVRNGGFAAIIE >gi|225935361|gb|ACGA01000031.1| GENE 29 40367 - 42700 1881 777 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 30 765 46 781 790 509 37.0 1e-143 MICRFGKKRLFLVWSCLVSVLPGMAQTEKLIDYVNPFVGTDGYGNVYPGAQIPFGGIQMS PDTDSKYYDAASGYKYNHSTLLGFSLTHLSGTGIPDLGDFLFIPGIGEMKLDPGTREEPE KGYRSRYSHEKEWASPNYYAVELTDCGVKAEMTSGVRSGMFRFTYPQSEKAFIMIDMNHT LWQSCEWANLRMENDSTITGYKLVKGWGPERHVYFTATFSKKLTGLRFMQDKKPVIYNTS RFRSSYETWGKNLMACISFETKAGEEVIVKTAISSVSTNGAKNNMNELAGLTFNDLKARG EALWEKELGKYTLTADRKTKRTFYTSAYHAALHPFIFQDADGQFRGLDKNIEKAEGFTNY TVFSLWDTYRALHPWFNLVQQDVNANIANSMLAHYDKSVEKMLPIWSFYGNETWCMIGYH AVSVLADMIVKGVKGFDYERAYEAMKTTAMNSNYDCLPEYRTMGYVPFDKEAESVSKTLE YAYDDYCIAQAAKALGKEDDYRYFLNRALSYQTLIDPETKYMRGKDSKGNWRTPFTPVAY QGPGSVNGWGDITEGFTVQYTWYVPQDVQGYINEAGEAWFRNRLDELFTVELPDDIPGAH DIQGRIGAYWHGNEPCHHVAYLYNYLKEPWKCQKWVRTIVDRFYGDTPDALSGNDDCGQM SAWYMFNCIGFYPVTPSSNVYNVGSPCVEAITVRMSNGKSIEMVADNWSPKNVYVKELYV NGKKYDKSYLRYEDIRDGVKLRFVMSGKPNYKRAVSDEAVAPSLSFPGKTMRFQYGL >gi|225935361|gb|ACGA01000031.1| GENE 30 42710 - 43507 828 265 aa, chain - ## HITS:1 COG:no KEGG:BT_3964 NR:ns ## KEGG: BT_3964 # Name: not_defined # Def: putative secretory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 263 1 263 264 494 88.0 1e-139 MKKRHLIYVACLLVAMGACSATSKERVKITPDAWGKYNVGTILFEDKAPETKGSDIYHRI IPDAESYIKAQAREVLATLYHSPEDSIPTVNKIHYTLEDIEGVSAKGGGNGDVTIFYSTR HIEKSFAANDTAKLFFETRGVLLHELTHAYQLEPQGVGSYGTNRVFWAFIEGMADAVRVA NGGFDGPNARPKGGNYMDGYRTAGYFFVWLRDNKDPEFLRKFNRSTLEVIPWSFDGAIKH ILGNEYNIDELWHEYQVAVGDIQAQ >gi|225935361|gb|ACGA01000031.1| GENE 31 43520 - 45865 1853 781 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # 
Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 29 776 49 781 790 529 40.0 1e-149 MYGIKKKIGIIAVAAIAMSSAMAQQPVDYVNPIIGTNGMGHTFPGACTPFGWVQLSPDTD TIPHNINGAYQKNAYEYCAGYQYRDKTIVGFSHTHLSGTGHSDLGDILLMPAVGDLKLNC GRADHPDEGYRSRFSHATEKSTPGYYEVMLDDYGIKAQLTATQRVGVHKYTFPKGKDGHL ILDLVHGIYNYDGKVLWANLRVENDTLLTGYRITNGWARTNYTYFAISLSQPIRDYGYKD KEKVLYNGFWRRFNMDKNFPEITGRKIVAYFNFNTAQEPELVVKVALSAVSTEGAVKNLR AEASGKSFEQLAEAARTDWNNELDHFEVEGTADQKAMLYTSLYHTMINPSVYMDVDGAYR GLDHNIHQAKGFTNYTIFSLWDTYRAEHPFLNLVKPERNSDMVESMIKHEQQSVHGMLPV WSMMGNENWCMSGYHAVSVLADAITKGVFSNVDEALAAMVSTSTVPYYEGVADYMKLGYI PLDKSGTAASSTLEYAYDDWTIYQTALKSGKKDIAETYRKRALNYRNIYDTTIGFARPRY SDGSFKKDFDVLQTYGEGFIEGNSWNFSFHVPHDVFGMMDLMGGERVFVDKLDKLFSMHL PEKYYEHNEDITAECLVGGYVHGNEPSHHVPYLYAWTSEPWKTQYWLREILNKMYKNDIN GLGGNDDCGQMSAWYLFSVMGFYPVCPGTDQYVLGAPYLPYLKLKLPNGNTLEIKAPGVS DKRRYVQSLKLNGKIYDKMYLTHEDILKGGVLEFKMSASPNKRRGLSAEDKPYSLTNGIN K >gi|225935361|gb|ACGA01000031.1| GENE 32 45872 - 48157 1960 761 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 27 757 3 715 717 434 33.0 1e-121 MNKKNLLSLGTALLLGTSLLFAQKEPVDYVNPLMGTDSKISLSNGNTYPAIALPWGMNFW MPQTGKMGDGWAYTYASDKIRGFKQTHQPSPWINDYGQFSIMPMTGQLKIDQEQRASWFS HKAEKATPYYYSVYLSEYNLTTEIAPTERCAYFRFTFPETKDAYVVIDAFDRGSYVKVIP EENKIIGYTTRNSGGVPQNFRNYFVIEFDKPFTFNKVWADYHLVEKHLELQSNHVGAAIG FATKKGEQVHARVASSFISPEQAELNLKEIGKKTFEQTKEVGRKAWNDVLGRIRIEDNDE NRMRTFYSCLYRSVLFPRMFHEVNAKGETVHYSPYNGEVRPGYMFTDTGFWDTFRCLFPF VNLVYPSMGEKMQEGLLNTYLESGFFPEWASPGHRGCMVGNNSASVVADAFMKNVTKADA EKMYEGLLKGANSVHPRVSTTGRRGYEYYNKLGYVPYDVKINESAARTLEYAYDDWCIYR MGEKLGRPAEELELYKKRSQNYRNLFDPETKLMRGKNSDGTFQTPFNPFKWGDAFTEGNS WHYTWSVFHDPQGLIDLMGGKKIFVSMLDSVFNLPPVFDDSYYGGVIHEIREMEIANMGN YAHGNQPIQHMIYLYNYAGEPWKAQYWLREVMTRLYFATPDGYCGDEDNGQTSAWYVFTA 
LGFYPVCPGSNEYVMGAPYFKKATITLENGRKLEISAPKNSDENRYIRSLNYNGKNYTKN YLNHFDLLKGRRLLFEMDNKPNTGRGVNESDYPYSFSLDKK >gi|225935361|gb|ACGA01000031.1| GENE 33 48344 - 49582 887 412 aa, chain - ## HITS:1 COG:no KEGG:BT_3961 NR:ns ## KEGG: BT_3961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 410 411 471 57.0 1e-131 MKKLTYIMLSLCCFALYSCGMTEPWKDWENEGNMSGDRLRPSEVKDLLCTAGGWKMTYEG VTFYFQFSEDGAVTSDSDESLLKNAVETDYSLDFEGEKVVLLTLQNGGMLQYLGENQENT FVITGYSDYQITATGQANGKTMILNSVTTADLQQAKERKRLAIIAYNKAQSMEILKTDLS NGLLRNASTNQFAAHYAMSCDENDNWKIKISLLNGKILKHTEYAMTINTTNDEKATLSID GLTVNGVAVGALYYKYDTGDLSTDNPAFKVDLNKSSDMLKTYTSSWKTHIVDRNNICDKL TGLLTQIEFDDRSPRNIIVCPGETGAGKWHYVGFVINATANDATGCVYLENTGINYILGS YGDDAGVVRSIPIYSAFLDFCFSDKGVWMYEDSDSYFYVISPVSDEWFRMKI >gi|225935361|gb|ACGA01000031.1| GENE 34 49608 - 51134 1193 508 aa, chain - ## HITS:1 COG:no KEGG:BT_3960 NR:ns ## KEGG: BT_3960 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 508 1 508 508 920 88.0 0 MKIPQIRFNDTNLVKRFFFLFVLFIPVYFAGCSSDDDEGGNGGEDPNGSAPASYVLGSGN TNMPSSGTIIAEYADAPAGSEIKSLVDDNADTKYVTYHSSFHITWNGNSSKAVTAYSLTS AADTPEMDPKTWTLYGSNDNATWTKLDVQTNQIFAARKEEKNYEVDNATAYRYYQLSVEA NNGGTATQIAEWKLVAMRSYTENINDLISSKGSSSFSAITPMGKQHEDDLEATVADLKWL ADPAEEPEPFGDNGTKMAWNTFNVVSLYPNGSPALSDVNQRWVGDCCACAVFASMAYLYP RFIKHIIKDNKDKTYTVTMYDPTGKQIPVSVGGDYFIGNSGDLGALGGRNKEVTWATILE KAMMKWRQVYLGSSNVGGIGTEYVSAIFTGDGESVGFGAGALIAEDLQRAVEVSLKQGRL VIGGFTRSGEQVDENWQTTSGHAFTFILPDDDSHLFKMRNPWGGTTDGVMKVKNDNRIPP MIDLRICAPGAAKNYGVGPDLGGYIPNF >gi|225935361|gb|ACGA01000031.1| GENE 35 51189 - 53153 1745 654 aa, chain - ## HITS:1 COG:no KEGG:BT_3959 NR:ns ## KEGG: BT_3959 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 654 1 654 654 1253 92.0 0 MKNIFKIGTLALSLAVGLSSCNDFFDAIPGQQYDLEDTFTNRSKTEQFLNNVYNYVPDET LERNAKNTMSGIWTTGSLECKLSWDGNNGSEWASGATYAGSSWINFWYIEYYKGISRAST 
FIMNVDRCLEASAAQRKQWKAQARALRAFYYFMIFRSYGPFVILGEEPIPLDISTSELLK ERNSVDECVDFMVKEFDDAANDLPDRYDGSNLGRIDRAACKAFKAKMLLYAASPLFNCNT DYAGIVNPESGKQLFPQDKSQEKVKWETARDAYKAFFDEYGSTFSLYTEKTSDGKTDFYE SYRKATSGVLYGAENKEQIFIRLADHDYRAYETTPYHKGYDDNNGALRGGLGFGVPQEMV DLYFMKDGRRIVEDQNYEEYEGVPSSGYLGWSTDYKDAVVNSRTYFKANSNKTLKQWADR EPRFYANITFNGSTWLKTDTPRGEVTTELTFNGNSGYANANWDAPYTGYGMRKMAPKEGR NGANRHCATLLRLADMYLGYAETLSACDQRGEAIKYVNKIRARAGIPGYGVAGTTDDNGF ACIPYEDTRDEVDKRIRRERLIELMFEWNHFFDVRRWKVANMAVGDDWIYPNYHRGGEGG AIHGMAFRSDAPAFFEKVVVETRTFLPKHYLFPIPDEDVRRNPKMVQNLGWTTE >gi|225935361|gb|ACGA01000031.1| GENE 36 53176 - 56328 3085 1050 aa, chain - ## HITS:1 COG:no KEGG:BT_3958 NR:ns ## KEGG: BT_3958 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1050 1 1050 1050 2016 94.0 0 MNLFGMEKHHTKSLWLLAILIAFSTTVCAQGTSVTGRVSDEKGELLIGVSVQEKGTTNGT ITDTNGQYNLKLSSKNPILVVSYIGYKPQEVKVAKQKVLDVILVEDVSSLDEVVVVAYGH QRKVSVVGAQSSMKIEDIKMPTANLSSAIAGRLPGVVAVQRTGEPGHDDSDLWIRGISTL AGQNSKPLVLVDGVERSFNNIDPEDIESFTVLKDASATAVYGVRGANGVILIKTKPGKVG KPQFSVDYYEGFVTLTKKPEMADAFTYMDAANEAYMDTKGSMLYSPQYIEATKKAHGLLP NDNPLMFNPYLYPNVNWMDELFNDWGHNRRVNVSVRGGVPNATYYVSLSYYNEKGLTRTA EMENYDANIRYDRYNYTANLNLKPTETTTIDLGFNGFLSMGNYPQQSTGDLFASAMEINP VYLPLMMPDGSVPGISTNGDLRNPYADLIRRGYKNEARNQLNSNIRLTQDLGFWKWSKGF SVSAMLAFDVHNSRDLKYNKREDTYNFAGTKDENGLWNDDVFDENGDYRYALTYTGHKDL AFDQSASDSRSTYFEASLNYDRSFGLHRIGGLLLYNQKIYRSSSDNLVGSLPYKQQGLAA RVTYSWNDRYFFEANAGYNGSENFSPDKRFGFFPAFGIGWAVSNESWWTPLQDVISYFKV RYTDGLVGTDAVTGRRFMYLDQMASVDGYRFGDQNNSVGGWGFSKYGANVGWSTSRKQDL GVDLKFFKDNLSLTFDVFKEHRKDIFITRRVIPDYSGFVEMPYANLGVVDNKGFEATLEY TQQLGKQCFLTVRGNFSWNEDKIIENDDPRVQYPWMEKRGTNVNGRWGWIAEGLFTSEEE IMDHAKQFGEGHPGQISKVGDIKYKDLNGDGVIDDYDQCLIGQGDVPKIYYGFGADLQLG DFSIGALFAGNAKVDRCLSGNAIYPFNDGSGITNLFANITDRWSADDPTNQDVFYPRLHH GNNANQNNMKTSTWWQKDVSFLRLKQLTIAYQLPKKLINRSFLKSARIYLMGTNLLTFSK FKMWDPELNTNNGTSYPNVRTYSVGVNVSF >gi|225935361|gb|ACGA01000031.1| GENE 37 56569 - 60513 2353 1314 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: 
BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1203 1308 182 286 299 70 38.0 3e-11 MKIKLIALFIMMSSLAAVADNDTYFFSKVDYQQGLSNSAVLCLFQDNVGLMWFGTYDGVN CWDGKTMEVFRSDFSESKTLSNNVIHSISQADSSCLWISTHLGVNRFSQKTRQVVGNYDF SGDYYVHSNNQGNTWILAGDGIYYYNTKHQSFIRAQTSALPMDNMEQRAFVTDDGVLWLF PLHTGQIQNYSLNAFDADSLSVKLSVSTNKFHSKAIEDIFYQNGIFCFIDCDKDLYMYDI SRKSKIYIRNLSSLVQKYGVIKGIVPFYEDIVIGFRANGLIRLRTSKKYEEEVVNRNIRI YGIYREPHQGLLWIASDGQGAVMYAKKYSIATNLMLNRMSANLSRQVRSVMTDKYGGLWF GTKGDGLLHVPDYHQATRNTPLGATVYSPGKKQVVSSYIKWNQEFHAYKLVQSRYMDGFW IGAGDPGLFYYSFRDDAVHPVECSSGGQPAIEIHDIYEESDSVLYVVTAGVGFYKMVLEK KAGQISIRRQKRYRFFHEQQEITMFYPMLAEGDSILWLGSREKGLVRFDKRTEEYQVISL KEMLRKSVDDILSLYRTQDGRMYVGTTSGLVCLTFHEKKIEAAYIGREQGLLNDMIHGVL EDGNGFLWLGTNRGLIKYNPQNGSSHDYYYSAGVQIGEFSDDAYYQCPYTGSLFLGGVDG LLYLDRKVAAAPEFNPDILLRRLWIGRTAVNLGDYYAADGKSLQLEGAVVSFSLSFAVPD YLTGEEVEYSFILEGYDKQWTSFSSLNEASYTGVPSGDYIFKVRYKKDVFDTEYKFFSIP IHILPPWYQSMWAYVFYILLGLLFVIYLLHLLRKYILHEQILKRLLTTESNKEVSESGSY NRDLLDRFTLIYHACDQLRAENVSYNQRCEQVELIRETAMNALFRPETLYSEKLKHFYPI TFVLSARMCMQELSVEVLRALKAEGVNTTPVLSSIPESFIFPVYKNALRCILYYCYQFVC QRTSSEVVVSAQEEDNKMLLVMSGEEDMLKELRDQLIGTSCLVYDKKDVDESFGIQLMLC FVQSALEQLHTAIHYVESEGHRLMLTFKPAILIEGQSNDKKTILLLEDREEMVWLITGLL SEEYSIRSVKSVQVAFDEIRNSAPAVFLVDMAMYTKEESTFMEYVGKNRSVLSKTAFIPL LTWKASSSIQRELIKWSDSYIVLPYDILFLKEDVHKAIYGKREAKQIYLEDLGELAGQIV CTTTEQVDFIRKLLQVIEQNLAEEELGSTLIADRMAMSPRQFYRKFKEISSIAPGDLIKN YRMEKAARLLREEDISIQDVIADVGIASRSYFYKEFARKYGMTPKDYREQRKNI >gi|225935361|gb|ACGA01000031.1| GENE 38 61488 - 62876 1802 462 aa, chain - ## HITS:1 COG:PH0923 KEGG:ns NR:ns ## COG: PH0923 COG1109 # Protein_GI_number: 14590777 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Pyrococcus horikoshii # 12 448 6 441 455 251 37.0 2e-66 MTLIKSISGIRGTIGGGAGEGLNPLDIVKFTSAYATLIRRTCKAKSNKIVVGRDARISGE MVKNVVVGTLMGMGWDVVDIDLASTPTTELAVTMEGACGGIILTASHNPKQWNALKLLNE 
HGEFLNAAEGNEVLRIAEAEEFDYADVDHLGSYRKDLTYNQKHIDSVLALDLVDVEAIKK ANFRVAIDCVNSVGGIILPELLERLGVKHVEKLYCEPTGNFQHNPEPLEKNLGDIMNLMK GGKADVAFVVDPDVDRLAMICENGVMYGEEYTLVTVADYVLKHTPGNTVSNLSSTRALRD VTRKYGMEYSASAVGEVNVVTKMKATNAVIGGEGNGGVIYPASHYGRDALVGIALFLSHL AHEGKKVSELRATYPPYFIAKNRVDLTPEIDVDAILAKVKEIYKNEEINDIDGVKIDFAD KWVHLRKSNTEPIIRVYSEASTMEAAEEIGQKIMDVINELAK >gi|225935361|gb|ACGA01000031.1| GENE 39 62919 - 63614 640 231 aa, chain - ## HITS:1 COG:no KEGG:BT_3949 NR:ns ## KEGG: BT_3949 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 229 1 212 214 267 62.0 3e-70 MKKLVFLFLSLLTAGSLFQACDNSKTYAEMLEDEKNAVNKFIKDNDIRVISLEEFERDTI TASKEAGNGYDEYVAFSNGVYMQIVDRGGKEDKDGVEVINEVDTFANNNVICTRYVEQDM MTGDTTCFNVPLEKWMDISEYYKSPLTFRYVQNTSTVYGIVLSGDFDYDYLWTVANGYGT AIPSGWLIALPYLRNNAHVRLIVPSKMGHTTAQQYVNPYFYDIRKFEKAKS >gi|225935361|gb|ACGA01000031.1| GENE 40 63733 - 64770 759 345 aa, chain - ## HITS:1 COG:aq_1630 KEGG:ns NR:ns ## COG: aq_1630 COG0618 # Protein_GI_number: 15606737 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Aquifex aeolicus # 24 338 19 319 325 132 31.0 9e-31 MLTKVIEQAKIDHFTKWFERADKIVIVSHVSPDGDAIGSSLGLYHFLDSQEKTVNVIVPN AFPDFLRWMPGSKDILLYDRYKDFADKLIAEADVICCLDFNALKRIDDMADAVAASPARK IMIDHHLYPEDFCKIVMSYPKISSTSELIFRLICRMGYFSDISKEGAECIYTGMMTDTGG FTYNSNNREIYFIISELLSKGIDKDDIYRKVYNTYSESRLRLMGYVLSNMTVYSDYNSAL ITLTKAEQSKFNYIKGDSEGFVNIPLSIKNVCFSCFLREDTEKPMIKISLRSVGTFPCNQ LAAEFFHGGGHLNASGGEFFGTMEEAKAVFEKALEKYKPLLTAKS >gi|225935361|gb|ACGA01000031.1| GENE 41 64817 - 66910 534 697 aa, chain - ## HITS:1 COG:BS_comEC_1 KEGG:ns NR:ns ## COG: BS_comEC_1 COG0658 # Protein_GI_number: 16079611 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Bacillus subtilis # 172 475 149 431 469 108 30.0 5e-23 MSTFYIHRYPYIRLIIPWITGVFCGDHFFDRSREPFWSILTLGLCIALLFVLYFLKRHSL RWCFGLAVSILCFIGGWLGITWQLQHAVYSFPEEETVYRVLITDAPQAKEYTYLCQTLLK 
ERRDTTGTYPIERAAILYLQQDSAVTRLKSGDELLISARISPPLNNRNFDEFDYARFLMR KGISGTGYVASGKWTKQDGMNNLDLKSIASSCRRKMISLYQKLGFFGDELAVLSALTIGD KTELSDSVRESYSVAGASHILALSGLHIGLLYTLLFFILKPIARRGNIGRVIRSVLLLIL LWAFAFFTGLSPSVVRSVSMFSILAMADMVGRQPLSLNTLAAAAWLMLFCNPAWLFDVGF QLSFLAVASILLIQKPIYHLITVKGRIGKYIWGLISVSVAAQIGTAPLVMFYFSRFPVHF LLTNLVVIPFITIILYAAVIMLLLTPLSWLQIVVAEGVKKLLEGLNFFVRWVEQLPYASI DGIWLYQSEILGIYIVGSLLTYYFLNRRYRNLLICLFSILLLGTYHATLYWLDRPRTSLV FYNVRGCPAVHCIESDGRSWINYVDTIPNEKRLKRMTANYWKHHHLLPPREITGDCRYME LNRQQQIISYHGCHICVINDNHWRNKTTVSPLYIQYLYLCKGYDGHLEELTRIFSFSYVI LDASLSEYRRHLLESECKQSGLRFISLSDEGSVRFLL >gi|225935361|gb|ACGA01000031.1| GENE 42 66919 - 67569 767 216 aa, chain - ## HITS:1 COG:BH2502 KEGG:ns NR:ns ## COG: BH2502 COG0036 # Protein_GI_number: 15615065 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Bacillus halodurans # 5 211 4 210 216 206 51.0 2e-53 MKPIIAPSILSADFGYLAKDIEMVNRSEAEWVHIDIMDGVFVPNISFGFPVLKYVAKLSQ KPLDVHLMIVNPEKFIPEVKALGAHTMNVHYEACPHLHRVIQQIREAGMQPAVTINPATP VALLQDIIRDVYMVLIMSVNPGFGGQKFIEHSVEKVRELRALIERTGSKALIEVDGGVNL ETGARLVEAGADVLVAGNAVFAAPDPEAMIRQLHEL >gi|225935361|gb|ACGA01000031.1| GENE 43 67668 - 68609 912 313 aa, chain - ## HITS:1 COG:BS_fmt KEGG:ns NR:ns ## COG: BS_fmt COG0223 # Protein_GI_number: 16078636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus subtilis # 1 309 7 307 317 219 39.0 7e-57 MGTPDFAVEALRQLVEGGYNVVGVITMPDKPAGRGHKIQYSPVKQYALEQNLPLLQPERL KDEAFVEALREWKADLQIVVAFRMLPEVVWNMPRLGTFNLHASLLPQYRGAAPINWAVIN GDTETGITTFFLKHEIDTGEVIQQVRVPIADTDNVEVVHDKLMVLGGKLVLETVDAILNG TVKPIPQEEMAVVGELRPAPKIFKETCRIDWNQSVKKIYDFIRGLSPYPAAWSELVTSEE GAAVVVKIFESEKIYESHQLATGTIVTDGKKFMKVAVPDGFVSILSLQLPGKKRLKIDEL LRGYHLEDGCLMK >gi|225935361|gb|ACGA01000031.1| GENE 44 68711 - 70504 1387 597 aa, chain - ## HITS:1 COG:RSp0020 KEGG:ns NR:ns ## COG: RSp0020 COG0038 # Protein_GI_number: 17548241 # Func_class: P Inorganic ion transport and 
metabolism # Function: Chloride channel protein EriC # Organism: Ralstonia solanacearum # 21 448 28 447 461 150 27.0 1e-35 MKGEKLSLLQRCIKWREANIKEKQFILILSFLVGIFTAFAALILKFFIHQIQNFLTDNFN ATEANYLYLVYPVVGIFLAGWFVRNIVKDDISHGVTKILYAISRRQGRIKRHNIWSSTIA SAITIGFGGSVGAEAPIVLTGSAIGSNLGSVFKMEHRTLMLLVGCGAAGAIAGIFKAPIA GLVFTLEVLMIDLTMSSLLPLLISAVTAATVSYIITGTEAMFKFHLDQAFELERIPFVIL LGIFCGLISLYFTRAMNSVEGVFGKLNNPYKKLAFGGVMLSILIFLFPPLYGEGYDTINL LLNGTSAAEWDTVMNNSMFYGYGNLLLVYLMLIILLKVFASSATNGGGGCGGIFAPSLYL GCIAGFVFSHFSNDFAFSAYLPEKNFALMGMAGVMSGVMHAPLTGVFLIAELTGGYDLFL PLMIVSVSSYLTIIAFEPHSIYSMRLAKKGQLLTHHKDKAVLTLMKMENVVEKDFVVVHP EMDLGELVKAIAASHRNVFPVTDKKTGGLLGIVLLDDIRNIMFRQELYHRFTVNKLMTSA PAKIFDTDGMEQVMQTFDDTKAWNLPVVDEEGRYQGFVSKSKIFNSYRQVLVHFSED >gi|225935361|gb|ACGA01000031.1| GENE 45 70505 - 71068 516 187 aa, chain - ## HITS:1 COG:MK0635 KEGG:ns NR:ns ## COG: MK0635 COG0009 # Protein_GI_number: 20094073 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Methanopyrus kandleri AV19 # 5 187 13 193 212 99 33.0 4e-21 MIEDIKKACQVMNEGGVILYPTDTVWGIGCDATNEEAVRRVYEIKKRADSKAMLVLVDSP VKVDFYVQDVPDVAWDLIEVADKPLTIIYSGARNLASNLLAEDGSVGIRVTNEDFSRRLC QQFRKAIVSTSANVSGQPGAANFSEISDEIKSAVDYIVGFRQDDMSRPRPSSIIKLEKGG VIKIIRE >gi|225935361|gb|ACGA01000031.1| GENE 46 71271 - 72896 1309 541 aa, chain - ## HITS:1 COG:no KEGG:BVU_3705 NR:ns ## KEGG: BVU_3705 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 23 534 19 532 534 399 45.0 1e-109 MKTAYTFIMKALPVLLTALCWTSCSDFLEEDPKGQLPSDTYFSNKEDLDASLTALYSVIA SSQASNNLCGTNFLVGDDISTHPSSNKQPLREHDQFDVKDNNSWLSSMWEQRFKVIKAAN FIINNAERTPEVSKDDIKVAIAQAHYWRAYSYFYLVTTWGRVPIMLKEEIDYNAPLKTEE EVYELILSDLKIAETDCPALYTKEPYGRNGMNIAVSEGAVKATMAYVYMCMAGWPLNKGV EYYKLAAAKAKEVIDGADGGSYYNKLLPEYSQVYSWEYNNKNTELLLGIYYNRDAAGQSA PLTDFLQDMKQAGWGDTNGEINFWMNFPEGPRKDATYFPKIMLSDGKLYDWWYDTDPASR EVVAPVFMKTVEGAVRGTEFDYTNPAVVNASGEKTFQLLRLSQVYCWYAEATGRAGEINE 
QAVKVLNEVRNRADGEKTNRYTTDMSPDELAEAAYNEHGWEMAGYYWGGIASRARDMFRM YRYKDHFEFRKKNEPIEVAPGVFRKEAVAVTGTWDDSKMYVPYPYEDVILNPNLDNSWKN R >gi|225935361|gb|ACGA01000031.1| GENE 47 72921 - 76001 2492 1026 aa, chain - ## HITS:1 COG:no KEGG:BF4062 NR:ns ## KEGG: BF4062 # Name: not_defined # Def: putative TonB-linked outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 79 1026 43 998 998 810 46.0 0 MVIYPNNYSNFRTIMEVIMKNKWLLCILLMLFLPFGLFAQVAPPLEKQKTETQEAPENNN EVTTNSDTTTDGDAGKTIKGVINDEQGETIIGASVIIKGEDTGTTSDMDGRFTLEAPEGA ILVISYIGYHTQEVKVRKRSLLRVVLKEDNQLLDEVVVVGYGTVKKSDLTGAVSGVSNRQ YKNQPVQRVENILQGRTPGVEVTATSGMPGASMKVRVRGTTSINKSSDPLYVIDGIISSS GLDGINPSDIQSMEILKDASSTAIYGSRGSNGVILITTKQGSEGKAQVTFDASVGLSTVR KQYDLLNAYEYATALNDIRGSSTISAEDLEAYKNGTKGINWTDLLTRTGITQDYRLAISG GNEKVKYLISGNVLDQEAITIMSDYKRYGIRANIDSEVKPWLTISAKLNASSLHKHNEGG VNWLHVTNFSPTMELKDPETGVYNTDPYNMIGSSPYGEMIVNNSDSYSYNLNANLTLLFK IMKGLTLSVQGGYDYDNSPSYSFRSKLDSPGAINSASNTNALHNYWQNTNNLTWQKQFGD HSFTAMGVWEISRSWDSQLKGTGSNLNNESVGYWNLGNAAIRDASNSYTEFSLASGIVRA NYDYKKRYFITAALRADGSSKFQGDNKWGYFPSAAVAWDIAQESFMSNQHVLDQLKLRAS FGVTGNQDIAAYSTLGMLSGASYGWGTSTSSTGYWGYQFATPGITWEKTYQYDLGLDMSI GGFNITVDWFKKQTKDLLFQKQVPKYNGGGTYWVNQGKLNNTGVELSLTTFPVKGAVTWE TSLNASYVKNEVADLAGDDFVLTANYSDLGGPMQIMKPGYPMGSFYVYQWKGFDDKGANL YQKADGSLTTNPTSDDLVVKGQASPKWTVGWNNTVTWKNWTLNVFFNAATGYDRLNISRF MAASMTGVSRFVTLRDAYFKGWDHVANKADALYPSLTNTDNKSYANSDFWLEDASFIKLK NISLSYRIPRRVLKFASVQLSVSAQDLFTITRYKGMDPEVYTSYDGLDYGAYPIPRTITF GAKFRF >gi|225935361|gb|ACGA01000031.1| GENE 48 76437 - 76868 365 143 aa, chain + ## HITS:1 COG:TVN0706 KEGG:ns NR:ns ## COG: TVN0706 COG0824 # Protein_GI_number: 13541537 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Thermoplasma volcanium # 12 102 11 101 133 69 32.0 2e-12 MEEIVFHHTLPIQLRFNDVDKFGHVNNTVYFSFYDLGKTEYFGSVCPGVDWEKIGIVVVH IEANFVKQIFASDHIAVQTAVSKIGTKSFHLVQQVIDTKTNEVKCVCKSIMVTFDLERHE SMPLTKEWIEAICKYEGRDLQKA >gi|225935361|gb|ACGA01000031.1| GENE 49 77053 - 78576 1210 507 aa, chain + ## 
HITS:1 COG:BS_hutH KEGG:ns NR:ns ## COG: BS_hutH COG2986 # Protein_GI_number: 16080986 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Bacillus subtilis # 6 505 8 492 508 316 36.0 6e-86 MIADKSINLDTLHKVLFDNEKLELSEECIRKVEESFDFLQSFSSDKIIYGINTGFGPMAQ YRIEDQSLIDLQYNIIRSHSTGAGKPLPELYVKAAMIARLYTFLQGKSGVHLELVSLLCE FINRGIYPFIPEHGSVGASGDLVQLAHIALTLIGEGEVFYQGKLCNAATVLQENGLKPFS MRIREGLSVTNGTSVMTGIGIVNLIYAKKLLRWSVAASVMMNEIAASYDDFMAQALNEAK HHKGQQEIAAMMREWVAGSKCVLQRENELYNQVHKEKIFEHKVQPYYSLRCVPQILGPIY DELENAEEVLINEINSACDNPIVDPDTQNIYHGGNFHGDYISFEMDKLKIAVTKLTMLCE RQINYLFHDRINGILPPFVNLGVLGLNYGLQASQFTATSTTAECQTLSNPMYVHSIPNNN DNQDIVSMGTNSALLAKTVIENSYQVMAIQFMGMAQAIDYLKIQDRLSSKSRQVYEEIRS FFPVFTNDTPKYKEIEMMIDYLKKEDK >gi|225935361|gb|ACGA01000031.1| GENE 50 78575 - 79306 199 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 239 2 238 242 81 29 4e-14 KMKYALVTGGSRGIGRAVSCKLAEMGYFILINYQNNDAEAEKTLQLVQEKGSNGELMKFD VTDPAAITLALGNWASQHPDEYIEVLINNAGIRKDNLMLWMTGEEWSKVLDISLNGFFNV TQPLLKNMLVKRYGRIVNIVSLSGIQGMPGQTNYSAAKGGVIAATKALAQEVAKKKVTVN AVAPGFIRTDMTEGIDENEWKKHIPAGRFGTPEEVADLVGFLASPASSYITGEVISINGG LYT >gi|225935361|gb|ACGA01000031.1| GENE 51 79333 - 80556 1070 407 aa, chain + ## HITS:1 COG:PM0339 KEGG:ns NR:ns ## COG: PM0339 COG0304 # Protein_GI_number: 15602204 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Pasteurella multocida # 1 404 1 403 406 293 42.0 5e-79 MKRVVITGMGIYSCIGKNLDEVKDSLYNGKSGIGIDPVRKELGYFSALTGILERPDLKKL LDRRKRLCLPEQGEYAYLATLEAFRNAGINEDFLEANEVGILYGNDSSAAPVINAVDIIR EKKNTALVGSGSIFQSMNSTVTMNLSVIFKLRGVNFTIAGACASGSHAIGMGYLLIKSGL QDCILCGGAQEVNPYAVGSFDGLSAFSTQEAVPEKASKPFDKRRDGLIPSGGAASLVLES YESAVKRGAPILAEVIGYGFSSNGDHISVPNVDGPKRSLQMAIKDAGIALEQISYINAHA TSTPVGDLNEAKAIAEIFEGHHPYVTSTKSMTGHEMWMAGASEVIYSTLMMNNGFIAPNL 
NFEEPDEASAQLNIPTQRVNLEFDTFLSNSFGFGGTNSTLIIRKINP >gi|225935361|gb|ACGA01000031.1| GENE 52 80578 - 80829 338 83 aa, chain + ## HITS:1 COG:no KEGG:BVU_1013 NR:ns ## KEGG: BVU_1013 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 81 1 81 83 113 81.0 2e-24 MTNEEIIEKIRTTLAEEFEVDIDVIQPDAPLMETLELDSLDLVDMVVLVEKNFGFNVTGQ DFAGIRTFQDFYDLVITRMQEAK >gi|225935361|gb|ACGA01000031.1| GENE 53 80832 - 81743 776 303 aa, chain + ## HITS:1 COG:Z4858_2 KEGG:ns NR:ns ## COG: Z4858_2 COG4261 # Protein_GI_number: 15803996 # Func_class: R General function prediction only # Function: Predicted acyltransferase # Organism: Escherichia coli O157:H7 EDL933 # 11 293 12 304 312 102 24.0 1e-21 MSTWKGKTRGGTFGYLFFIYLIKYLGITAAYIFLSLVVLYFIPFAPKATKSTWFYARHIL KHNRIRSLGMLLRNYYRLGQILIDKVAIGNGKVDQYRFEFERYPEFLQLLNSEQGVIMIG AHVGNWEIGVPFFDDYGKKINIVMYDAEHRRIKEILEKNGQDKDFKIIPVNEDNLTHVFR ITEALNKKEYVCFQGDRYLNKEKLLTGTLLGQKAPFPAGPFLLGSRMKVPVVFYFAMREP GRTYRFHFIRTEPVIRTKEKKAETALLEQYTAALDQILKRYPEQWFNYYSFWETTSDGSL SKD >gi|225935361|gb|ACGA01000031.1| GENE 54 81769 - 82839 960 356 aa, chain + ## HITS:1 COG:Z4850 KEGG:ns NR:ns ## COG: Z4850 COG0500 # Protein_GI_number: 15803988 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 EDL933 # 6 356 2 352 352 314 43.0 2e-85 MQQNQYDKEPLTAVEAQRLAQEIAFGPIVFQVSRLMLKFGIFQLLADERAGMTQEEISKA CGLSSYATQVLLEASLTIGTVLQHENRYRLAKAGWFLLNDKMVRVNLDFNHDVNYLGMFH LEEALTNGRPEGLKVFGEWSTIYEGLSSLPSQVQKSWFGFDHYYSDCSFDEALAIVFARH PKTLLDVGGNTGRWATKCVSYDDAVEVTIMDLSQQLEMMRQQTKKLPGAMRIHGYGANLL DPEVPFPTGFDAIWMSQFLDCFSEEEVTSILTRAARSMSRESRLYIMETFWNRQKFDTAA YCLTQISLYFTAMANGNSKMYHSDDMERCIKAAGLEVEKIDDHLGMGHSIVQCRLK >gi|225935361|gb|ACGA01000031.1| GENE 55 82836 - 83273 314 145 aa, chain + ## HITS:1 COG:no KEGG:BVU_1016 NR:ns ## KEGG: BVU_1016 # Name: not_defined # Def: putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl 
carrier protein) dehydratase # Organism: B.vulgatus # Pathway: not_defined # 1 145 1 145 145 199 65.0 3e-50 MNNREVIIQGEGILNLIPQRPPIVMVDSFFGIEENHSYSGLTVTADNIFCETRKLQEAGI IEHIAQSAAARIGFLYTRQGEKVPLGFIGSVDKLKIYDLPKIGMKLFTEITVVQEVFDIT LISAQVKVEDKLIAECRMKIFIKKE >gi|225935361|gb|ACGA01000031.1| GENE 56 83270 - 83713 386 147 aa, chain + ## HITS:1 COG:CAC0271 KEGG:ns NR:ns ## COG: CAC0271 COG0824 # Protein_GI_number: 15893563 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Clostridium acetobutylicum # 16 124 6 114 138 73 33.0 2e-13 MKRKTNQQVATLTNRTTFRVRFSEIDSMQIVWHGEYVRYFEDGREAFGKQYGLDYMSIYR EGYMVPIVDLTCQFKQSLSFGEEAIVETRYIACEAAKIKFEYVIYRATDQSIVATGSTIQ VFLNLNKELELMNPPFYLEWKKKWNIL >gi|225935361|gb|ACGA01000031.1| GENE 57 83698 - 85470 994 590 aa, chain + ## HITS:1 COG:RSc0427 KEGG:ns NR:ns ## COG: RSc0427 COG0304 # Protein_GI_number: 17545146 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Ralstonia solanacearum # 115 380 126 398 405 147 35.0 5e-35 MEYPLNIYITAHTLISSLGFGIPENLEAIHNYRSGIRMQEAGLISDHPLLAGMIDSVELE KRAKLMQITDYTRMEQLFILAIQKVISQSGTDLREPDCALLLSTTKGNVDLLSELPADSP VFLWKMAERIGDFFGAANQVEVISNACISGVSALIVAKRWIESGRYKRVIVAGGDILSHF ITSGFLSFRSVSAHPCRPYDIQRDGLSLGEACGAVLLETQGNANHIILSGGAISNDANHI SGPSRTGDGLALAINQAMEEAGALPEDISFINAHGTATVYNDEMESKAIHLAGLAAVPVN SLKPYFGHTLGASGIIETILCIEQLKEGRYYGTLGYETLGVPMPITVYATHQPIPMKCCI KTASGFGGCNAALVLSLPDAHLKQKANSQATDKASAPSVCKAVVESGNMVTIRPGAVESK GTTVFSSSETDFAPFIREAYKHLGENNMKFYKMDNLCKLGYVAAEYLLKDTNYRPKEIGI ILANASSSLDTDCKHQAIISKEGDKAASPAVFVYTLPNVVLGEICIRHKIQGENTFFVRR QSDAASLEDYARIVMAKSKLRTCIIGWCELLDGHYQAEFKQLNNISTIYG >gi|225935361|gb|ACGA01000031.1| GENE 58 85463 - 85744 460 93 aa, chain + ## HITS:1 COG:ECs4328 KEGG:ns NR:ns ## COG: ECs4328 COG0236 # Protein_GI_number: 15833582 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and 
catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 14 91 7 84 85 76 58.0 1e-14 MDNLDQTKKELMDEVKGKLIEELNLEEITPEDIDNEAPLFGDEGLGLDSIDALEIILILE REYGIKIENPSEGKQIFYSVRTLADYIIANRKA >gi|225935361|gb|ACGA01000031.1| GENE 59 85748 - 86938 785 396 aa, chain + ## HITS:1 COG:CAC2008 KEGG:ns NR:ns ## COG: CAC2008 COG0304 # Protein_GI_number: 15895278 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 2 389 6 403 406 199 34.0 5e-51 MKIYVTGLGVVSGIGIGVSENIEALRQRKHGIGKVTLFPTALDVPVSEVKRSNEELKKLI SLPPQRTVSRTALLGMIAAEEAMKDAGLTPPLRIGFISATSVGGMDLSEHFYESFKKNPR QGRLREVISHDCGASTELIASYLGINDFITTISTACSSAANAIMLGARMIKHGLLDAAIV GGTDALCRFTLNGFNSLMILDKTHCRPFDRSRTGLNLGEGAGYLVLQSESSLQRTPYCEL SGYANTNEAYHQTGSSPEGDGAFLSMSEAIASSGISPKEIDYINVHGTGTPGNDASEGMA LRRIFGEHVPPFSSVKAFIGHTLGASEGIEAVYSVLSIDKGLIYPNLNFTDAMPETGLIP ETSFQEGIPIRHVLSNSFGFGGNDSSLLFSATNFPA >gi|225935361|gb|ACGA01000031.1| GENE 60 86963 - 87961 696 332 aa, chain + ## HITS:1 COG:no KEGG:BVU_1021 NR:ns ## KEGG: BVU_1021 # Name: not_defined # Def: 3-oxoacyl-[acyl-carrier-protein] synthase # Organism: B.vulgatus # Pathway: not_defined # 37 330 1 282 285 350 57.0 3e-95 MQPVYIQRIASIHPPKDHSPGNNRPYLQACEPDYKDIITNATLRRRMSRIVKMGVACGLE CMGELSPEKIQGIITATGLGCLTDTEKFLNNLLDNKERMLNPTPFIQSTFNTIGAQIALI HQIHAYNMTYVHRGLSFESALLDAMMKIGEGSENILVGAINEMTETSYTIQQRLGVLKGI AAGEGAQFFLLSREAGEHPLAEIQGIETFIGKQTTEEISSRIIRFLQRNGLECQDIQWLV TGKNKKPHNQDDSHEQTVDNGNSIYEELETNLFPESVHLSFKNECGEYPTATSYAVWKAV NESANCTTSTHILIYNHHHSINHSLILIRKSV >gi|225935361|gb|ACGA01000031.1| GENE 61 87958 - 88638 339 226 aa, chain + ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 17 225 244 448 463 129 34.0 3e-30 
MIIVCLLLCLLLLLSFLVYASYSIQSGIYLRSFCKKHTAEKIVALTFDDGPDSLQTPKVL QVLKDYQVTACFFCIGQKVKGNEAILQKMVAEGHLIGNHSHTHSGLFPLYGLSKMKKDLQ TCQCELERVTSQPVSLFRPPFGVTNPTIAKAVRQLGYTSIGWSIRTLDTQQPAPDKVLAR IRKRLEPGAIILLHDRMPDSDQLVKQILDLLKEQGYTAVHPDKLLA >gi|225935361|gb|ACGA01000031.1| GENE 62 88651 - 89286 601 211 aa, chain + ## HITS:1 COG:no KEGG:BVU_1024 NR:ns ## KEGG: BVU_1024 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 6 211 3 208 208 261 69.0 1e-68 MRTLLIYLSLFFLSIASIHAQSMKKMAQKTEFESRLAKEAQTVESIESDFTQVKYLDVFD EKVTSKGKFYYQKTHKICMEYFRPMDYLIVINGSKLKIVSDGKKSIMNLSSNKMMAQMQD MLTACMIGDLSKISSNYLLEYFEDARYYLVKIKPTNKAVQAYIAGIEIYLNKKDMSVHKL RLSETATNYTEYEFYNKKFNSLKNETKFAIR >gi|225935361|gb|ACGA01000031.1| GENE 63 89261 - 89839 358 192 aa, chain + ## HITS:1 COG:no KEGG:BVU_1025 NR:ns ## KEGG: BVU_1025 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 183 1 176 180 192 58.0 4e-48 MRRSLRFVSLTALLFLWIQIQTSCSPRMSGEGSTVPPAIELTQGESSQKYNIQLDFMRHH MSGMLIVRRMPDNEIRIVASTYFGLSLFDFSLRNNQFTVNSCIEPMKKEKVLKLLEMDFK RLFLSEKGTRVKASKDSATEKRTSGKGFGKSVVYTTGNTPGDPAQIKIKHPWIRLTIRLD KLSNLDKLSKEF >gi|225935361|gb|ACGA01000031.1| GENE 64 89843 - 90307 425 154 aa, chain + ## HITS:1 COG:no KEGG:BVU_1026 NR:ns ## KEGG: BVU_1026 # Name: not_defined # Def: putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase # Organism: B.vulgatus # Pathway: Fatty acid biosynthesis [PATH:bvu00061]; Metabolic pathways [PATH:bvu01100] # 1 106 1 106 126 114 54.0 1e-24 MLLDNFYTILSSELSDSTAWIIQIELNPGHPVYQGHFPGHPVVPGVCLLQLIKECVEDIR QQKLQVAQVSSCKFLSAINPIVTPYISVALTFKETGEGILQLQAEGTVKEGIVKENIVKE NSVKEYTIEENSVEENTVKNECFIKLKAALTPTV >gi|225935361|gb|ACGA01000031.1| GENE 65 90265 - 91494 756 409 aa, chain + ## HITS:1 COG:XF1638 KEGG:ns NR:ns ## COG: XF1638 COG0463 # Protein_GI_number: 15838239 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Xylella 
fastidiosa 9a5c # 34 245 13 227 274 121 34.0 2e-27 MFHQTKSCTYPDSMKEQISTAYWHEKLKQLGIIVIIPSYNNGKTLADVIESVRFYAPDIL VVNDGSTDETAEILSRESNLHTITHPVNQGKGVALRHGLSFAKKQGFRYAITIDSDGQHF ASDIPALIEAIEKEPDTLLVGARNLASDNMPGKNTFANKFSNFWFTLETGIKLQDTQSGY RLYPIQRMNVDKWYYTAKYEFELEALVFAAWGGITVKNIPVHVYYPPQEERVSHFRPFRD FTRISILNTVLVLVTFLWIIPRNFFRKLTWKNCKQFFSDHVTHSPESNLRITAAITLGVF MGIVPVWGYQMLITLFLAHLFRLNKVIAIVAANISIPPMIPFLLYGSYVTGCKVLGDPVN LHLNELSFENVKSVIEQYLIGSVIFAVVCSILAGTIAFILLTACRKKKI >gi|225935361|gb|ACGA01000031.1| GENE 66 91491 - 95348 2599 1285 aa, chain + ## HITS:1 COG:XF0777 KEGG:ns NR:ns ## COG: XF0777 COG4258 # Protein_GI_number: 15837379 # Func_class: R General function prediction only # Function: Predicted exporter # Organism: Xylella fastidiosa 9a5c # 142 807 124 771 788 84 20.0 1e-15 MTQFFIGLYDYFERHKILFYLSLISCVLLMGFFALQVRFEENITQFFPDTKDSQNTIKVF DNLKIKDKIIIMLSSADTCHRVEPNSLIEAAGQLQQTLTEKAGGKLIKGIFAQVDQSLIQ GATDFVYEHLPLFLTDTDYQRFDSLLTDKGIQAVMQKNYTNLLSPAGIALRSYILRDPLG LGSETLKHLQDFQLEANYEIYDEHIFSKDGSTLLMFITPVFSTGSTGKNDELIKILEEEL KHVQGESPTIRAEYFGGPSVGVYNARQIKKDTILTSSLALLIIIVFISLVFKRKRSIPLI ITPVLFGGLFALFLIFFIKGSISAIAVGAGSAVMGIALSYSIHMLAHQNHVSTVQQLIKE IAYPLTVGSFTTIGAFLGLIFTSSDLLRDFGLFASLALVGTTLFCLIYLPHFLKGQADVK QGRILRIIEKINAYSYEKNKWLVGGILLITVICLFTSQKVGFNNDMMSLNYEPRHLQQSE EKLLQLFDNDEKTVLFVSVGKDMNQATETYAMTNQKLSALKEQGLIKEYASASQFLISPQ EQQKRLKKWKDYWTDEKQQQVREQLETAAAEYRFRPGSFEPFYQWMNQPFGEYHYTAQGD DLSGKLLNEWQTSADSITMLISQIRISEPNKEAVYQNFNKDPNVVIFDRSYFANKWVSAI NDDFYLILYISSFLIFFALWFSYGRIELTLMSFLPMLISWVIILGLMGILGIEFNIINII LSTFIFGIGDDFSIFIMDGLQNKYRTGQKVLNSHKTAIFFSAFTTVVGMGALVFAKHPAL QSISLISILGMIAVVLVAYTIQPLIFRFFIAGPASKGLPPYTLIGLIRTVLLFLLFFIGC IVLRILIILLYPVPVRKSSKQRLVCRLIQITCKGILLLATAVKKEHINKANERFRHPAII IANHQSFIDILVLLSLSSKILMVTNHWVWHSPFFGAIIRYVDFYYIGEGYEQYMERMRKK VKEGYSIAIFPEGTRTYNGKMKRFHKGAFYLAEALKLDILPILLYGNNKIIAKAQPFNIR KGIIYTEILPRIPLDDLSFGSTYQERTKRISAYMKEVYARICREQNTTDNPAFYEALIQN YIYKGPVVEWYIRIKVKMEKNYRLFNRLIPVQGQITDIGCGFGPLCYMLSLLSEDREILG 
IDYDEDKIALAQHGWLRNEHLQFKHGNALEYPLPESDVFILNDMLHYMSYEHQQTLLLKC AGRLRSQGMIIIRDGNSANASKHRLTRFTELLSTRIFNFNRTTGELYFTTETQLREIAVA CGMAVEIIPNDKYTSNTIYIFRKPN >gi|225935361|gb|ACGA01000031.1| GENE 67 95350 - 96840 1311 496 aa, chain + ## HITS:1 COG:alr4631 KEGG:ns NR:ns ## COG: alr4631 COG1233 # Protein_GI_number: 17232123 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phytoene dehydrogenase and related proteins # Organism: Nostoc sp. PCC 7120 # 1 491 14 522 533 117 23.0 4e-26 MSKYDIIIIGSGLGGLECGAILSKEGFNVCVVEKNAQFGGCFQTYQRKGHLLDTGIHYVG SLDEGQVMNQYFRYFGIMDKLKIKRLDDAAFDTIYYKGKTYDYAMGYPQFIETICRSFPH EKENLKCYAEQLQAVGNLISVDHLKQGTLALEGMKYFCTSAAHLIADITPDPALRNVLAG SALLYGGLKDVSTFYQHAMINHSYIEGAYRFVDGSMQVSLELINVIRANGGTVLNNSEAT RIIVENEKVQGVIINGEERLESDFVISNMHPQRTLEILDKNRSIKNAYISRIRSLENTYG IFTLYLVMKKKNTPYQNRNLYLHANNKVWYDKLLYPGRTTNCMISMQASSEDSQYASVIS ILTPMYIDELSAWKGTTPEHRGEDYQSFKKEKAEQILRFIRGHGIDLSENIEEMYTTTPL SYRDYTGTVDGSAYGIIKDYKCPQIGFVSTRSRLGNLFLTGQNLNVHGALGVTLTAMLTC AEFVGQEYLAKKVGNA >gi|225935361|gb|ACGA01000031.1| GENE 68 96965 - 98530 1106 521 aa, chain + ## HITS:1 COG:no KEGG:BVU_1032 NR:ns ## KEGG: BVU_1032 # Name: not_defined # Def: putative choloylglycine hydrolase # Organism: B.vulgatus # Pathway: Metabolic pathways [PATH:bvu01100]; Biosynthesis of secondary metabolites [PATH:bvu01110] # 3 511 47 552 554 781 73.0 0 MTTDKVIDIDSLHLRHYGDNFLRHSDSGLWELFVKGDAFQRGEAIGQLSSDLLHHQEKVF VDQIREIVPSDSYLKFLRFFIVLFNRNLGENVLEEYRDEIYGISLSCTHEYDFIGTPYER QLNYHSAHDLGHAMQDYMLVGCSSFATWGTQSADSSLLIGRNFDFYVGDAFAENKQVAFY VPEQGYRFASVGWPGMIGVLSGMNETGLTVTINAAKSAVPTGSATPISILTREILQYAAT IDEAFAIAKKRETFVSESILIGSAKDGKAAIIEKSPEKTVLFTGKEANRLICTNHYQSEE FSKDERNMENIRTSDSPYRFARLTELINEDLPIDVSKAASILRDHKGLQNTDLGLANEMA INQFIAHHSVIFQPEKRLMWVSTSPWQCGKYVAYDLNKIFKDTIDWQHEIYSSDLTIPED KFIDTPEFQHLLTYKKLTPLLLKKIRKKEQIEESVLKTHQASNPSLYYVYEVIGDYYEAM QQSKQAIAYWQQALKKSIPKLQEKERIQQKIQKQSKDGKES >gi|225935361|gb|ACGA01000031.1| GENE 69 98514 - 99809 980 431 aa, chain + ## HITS:1 COG:MA3853 KEGG:ns 
NR:ns ## COG: MA3853 COG1541 # Protein_GI_number: 20092649 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 431 6 432 434 182 30.0 2e-45 MEKNPEIQFRSPEDIKSYQEEQLAKVLAYLQAHSKFYQQMFAEHHIDINQIKTIEDLQQI PVTTKTDLQLHNDDFICVDKEEIIDYVTTSGTLGDPVTFVLTSEDLDRLSYNEYLSFTTT GCTKQDILQLMTTIDRRFMAGLAYYMGARELGMGVIRVGNGIPELQWDTIRRIHPTCGMV VPSFLIKLIEFAEKNHINHNTCSMQKCICIGEALRNQEFQLNTLGQKIHDKWPALQLYST YASTEMQSSFTECNEFHGGHLQPELIIVEFLDDNNRPVEEGEAGEVTITTLGVKGMPLLR FKTGDICYHHTDPCGCGRNTIRLSSILGRKGQMIKYKGTTLYPPALFDILDNIPSVKNYI IEVYTNELGTDEILIRIGSENRSEAFAKEIKDLFRSKVRVAPSINFESAEYIAKIQMPPM SRKTIKFIDLR >gi|225935361|gb|ACGA01000031.1| GENE 70 99866 - 100435 689 189 aa, chain + ## HITS:1 COG:no KEGG:BVU_1034 NR:ns ## KEGG: BVU_1034 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 189 1 190 191 189 53.0 5e-47 MKKLILFLFLSTSLFCFGQGKSALKDIQSIKFYGVDYSQVKVFGADESPAQFKDAFRRIN ELFITEAKKYNVGKQLKKEVTEISLDAVNQVNENIDLTELMTAKREYTLSKEQIKAAINA LPIQKTPGVGMVFIAQFLDKSNNRGTYEVVFFNTETKEIIEEWITDGKARGFGLRNYWAG SIYSALKKL >gi|225935361|gb|ACGA01000031.1| GENE 71 100453 - 101193 753 246 aa, chain + ## HITS:1 COG:no KEGG:Slin_5315 NR:ns ## KEGG: Slin_5315 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 19 220 17 216 231 97 33.0 3e-19 MRKLLLVLTLLLSISTYVAAQNYTEVVYLKNGSIIKGVIIEQVPNVSLKIKTGDGSLIIC QMSDVTKITKEERYTRDYRKDTNDRKAARNTLKGYKGFVDFAYIGDVSDYNASKIELSTT HGYQFNNYFFVGGGVAFNYYTDADLYAAPIYANFRANFINKKVTPFADVKAGYAVGDIEG AYASIGVGVRFSLKGKKALNLALMYNFQDYDATTDYSYSYGGHYYHNSWNDTWELHGVGA RFGFEF >gi|225935361|gb|ACGA01000031.1| GENE 72 101471 - 102331 571 286 aa, chain - ## HITS:1 COG:CAC2630 KEGG:ns NR:ns ## COG: CAC2630 COG4632 # Protein_GI_number: 15895888 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: 
Clostridium acetobutylicum # 85 283 147 344 347 70 26.0 3e-12 MKKNLLIVVILAVCSIGIVKGQTVADSLAIVSACWKIENPQKGIVYKYASIPQLYQGPQS ISLIEIDPVAGLKVGVAVSDKMKETSKIASEYNAIAAINGSYFDMKRGNSVCFLKTDRQV IDTTLQSEFKQRVTGAISVRKRGMKLIPWNKQIEKNYKGKAGTILASGPLMLKDGQVCDW NSCEPNFIRTKHPRSAVATTKDGKVLLMTVDGRFPEQAGGVNIPELAHLIRVLGGEDALN LDGGGSTTLWLSGASDNGVVNYPCDNGRFDHAGERKVPNILYITHQ >gi|225935361|gb|ACGA01000031.1| GENE 73 102373 - 104373 1398 666 aa, chain - ## HITS:1 COG:CAC2633 KEGG:ns NR:ns ## COG: CAC2633 COG4632 # Protein_GI_number: 15895891 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 107 321 152 353 354 67 28.0 1e-10 MKEYRIMILLVSTLFTIAMSCSSQLDNEDFSDMIGKGQNYTPPPVPVEEPTLKDKEGWKI DSIDGKSFIWYSYNKSAFNARQQVNVLEIDLSSPDYELEFVSAPQLDSLSSVALKHDAVA GINGTYELEASFVKVNGSIISPITLPEGHLRYWKHEGAIAYDGYKVEIGYGTKESYSYNS MPNIFSGAPVLIDDYQPVGKTFIGDITGINLNSLDGEDYRRHQGVRHPRTAVALTEQNKL LLVTVDGRADLAAGMTAKELTSFINQYFKPQHALNVDGGGSTTMYIRDSNLSATDVVNYP CDNKKFDHYGQRSVRTFILVKKHSNGQLFDSGDGSEDNPYIIKTARHMQDMHKVNYSKGM VYFRMEADVNMSGIDWQALNVSEPYDRLVHFDGNGHVIKGLKSQGNYASLFGVLCGVCKN LGIVDADIVAQNGGGILAGYVGIKIPTSDVLTGSVENCYTSGKVSGFDIIGGISGNIGKP SGSVYSSIKNCYSTATVTAKNETGNSRAGGIVGIVWAGGILENCYAAGEVISMNSGAAGI GAYFDTAPKRCVALNKLVENKKNGVNLGRIAAFMGAANMTVVDCWATDDMKILNAGVPKT EFSGELVGVKQPHDGETKSKEFLSNMQNYLDAGWGNSWYYKVNAKGYPILLWQYNREDYN TDITGH >gi|225935361|gb|ACGA01000031.1| GENE 74 104385 - 106133 1354 582 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2403 NR:ns ## KEGG: Dfer_2403 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 576 1 575 576 597 51.0 1e-169 MKTKFSIIFMWLFTILFISCDKLDLYPEDSLSPTTYFKNENELQLYSNQFYYNILPTAAD IYKDNADVLIVSPLDDEVTGQRIIPSTGGGWNWEALRSLNFLLENSHNCEDQDVRNKYDA ITRFFRAYFYFKKVQRFGDVPWYDKVLDSSDAELLYKPRDSREFVMQKIIEDLDFAITTL SKDKELYRVSKWAALALKSRVCLFEGTFRKYHGIADYEKYLDACITASDDFMKNSGYSLY 
KTGSTPYQTLFSSLNAIQQEIVLARDYNGTLNIRHDVQGFENTASKGRPGLSKKIVNMYL NKNGSRFTDMQGYATKVFFDECQNRDPRLAQTIRTPGYIRPGETKQSGPNFSSAMTGYHL IKYSGATKYDVGDTSENDFPIFRTAEVFLNYVEAKAERGTLEQNDIDRTIKLIRDRVGMP NLNMSDANTGPDPYLLNVYPNVSETNKGVILEIRRERVIELLMEGFRYYDLMRWKAGSCM DDEFLGMYFPGPGNYDLNHDGTIDVCLYKGAKPSGVNALEFLEIGVTVDLTEGESGNVIC YKDVPRQWNEERDYLYPIPRQDRILTNGVLTQNPGWNDGLNF >gi|225935361|gb|ACGA01000031.1| GENE 75 106155 - 109643 2240 1162 aa, chain - ## HITS:1 COG:no KEGG:Slin_4979 NR:ns ## KEGG: Slin_4979 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 24 1162 30 1188 1188 946 45.0 0 MRLTVFFLLFIVFETYSLNVRSQNQKVTMNKGTATLSDIIRQIEKQTDYLFIYNEHEIAL DKRIAVSTRETTVAEILNRVLQGTGFSYTMEGNHIILVKSMEKMKNVTLGRKITGTVKDD LGEAIVGCNVAVKGEKIGAITDMDGRYTIEVPDNSILVFSFLGYKSVESPVKNNSVINIT MQEDSRSLEEVVVVGYGTQKKVNLTGAVSVVEGDQLAGRSASTMSQLLQGAVPNMTVSFS SGRAGDGGSFNIRGVNSISGSAKPLVLIDGVEGDVNRVNPNDVASVSVLKDASSAAIYGA RAAYGVILVTTKTGEKGKVTLNYNGRYSFSDVTTSTDYETRGYYAAGISDMFFSTWQGTP LLTYTEKDYHELWIRRNDKTEQPERPWVVTENGQYKYYGNFDWYNYLYDHKRPTHEHNVS ISGGSEHLKFRLSGGYYNQKGVLKIQPDKYERYNFKSRLEAKVTPWLSVSNNTSYFKSNY SYPGLGGVNTIFKNASYRALPILVPVHPDGTLVYETNIMNYTLAGTTVALSNKKHNNKDE IDEFMTTFEVVINPVKHVEIIGNYTYSKYDKKCTNRSVDMQYSKVPGVTITMPESTSGGN KLTVLNTKNLYYAYNAYGTYGNLFAGSHNVKLTGGINYETKSYEDLTVSKDGLLSDELND FNLAKGENTVVTGGKNKYALFGVFYRANYDYKGRYLFEASGRYDGSSRFKKGHRYGFFPS FSAGWRISEEPFYGALKNTVDNLKLRLSYGTLGNQQVGYYDYLQQIETGKVLNYSFGDKE KASYAYETAPNASDLTWETVVTKNIGLDIGIFNNRLNISADAYIRDTKDMLMPGKALPSV YGAKSPNMNAADLRTKGWELVVAWNDRFNVMDKPFNYGVSFGIGDNVSKITKYDNPNKEI SSPYVGQRLGDIWGYMVDGYFATDEEAANYGVDQSVVNYIINNAVVDRGLHAGDMKYLDL DGNNKIEQTVSANDIKDQRIIGNSLPRYTYSIRLNAEWNGIDFSVFFQGVGKQDWYPSSD AFAFWGPYSSPAPSFIPKDFLADVWSVDNPDAYFPRPRGYIAWTDGRSLSSVNNRYLQSL AYCRLKNLTVGYTLPVKWLSKIHVQKARLYFSGENLLTLDRLDTDYIDPEAAAAGTNWKT GKTNALSYPFSKTYSFGIDITF >gi|225935361|gb|ACGA01000031.1| GENE 76 109886 - 110821 669 311 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion 
transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 11 263 25 279 331 65 24.0 1e-10 MDTKLLNKYLAGDALPEEKREVVRWMKESEEHREQLMQMRHIYDATIWNGNLQEKKAENK KIMMRYLWASMKIAAVIAMIAFIIHKEYQEYRFEHSTEMQMMTVPAGQRASLVLADGTIV WLNSNSTLKYPATGFHAKERKVILEGEGYFEVAHNEKHPFIVETEKYDIRVLGTTFNVSA YPNSGLFEASLIEGKVTVYHPETQNEIILNPHEKVEVRDGKLYKETFTSDHDFLWRMGIY SFKDEPLETVFKKLEQYYEVKIINKNIEITSHPCTGKFRQKEGIEHVMRVLQKYIKFNYI QDDEKNQIIIY >gi|225935361|gb|ACGA01000031.1| GENE 77 111001 - 111564 381 187 aa, chain - ## HITS:1 COG:no KEGG:BVU_1779 NR:ns ## KEGG: BVU_1779 # Name: not_defined # Def: putative ECF-type RNA polymerase sigma factor # Organism: B.vulgatus # Pathway: not_defined # 10 187 2 181 184 168 49.0 1e-40 MRRQNGAISSNMTFDDIYIEYYQRCFLFAKSYLHDEMQSKDIASEAMITLWSTMKTEEVK NIRAFLMTVVKNQALNYLRNEHLRMEARENILDDELYELDFRISSLDSSDPNQLFSEEII AIVNHTLNELPEKTRRAFMMSRYENKSMKEIAESLNVTVKGVDYHIGKALQALRKNLKDY LYTLLFF >gi|225935361|gb|ACGA01000031.1| GENE 78 111764 - 113482 1685 572 aa, chain + ## HITS:1 COG:lin1560_1 KEGG:ns NR:ns ## COG: lin1560_1 COG0608 # Protein_GI_number: 16800628 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Listeria innocua # 5 568 8 561 562 363 37.0 1e-100 MNHKWNYRPITPEQAETSQTLAQELGISPILGQLLVQRGITKAADAKKFFRPQLPDLHDP FLMKDMDIAVERLNRAMGKKERILIYGDYDVDGTTAVALVYKFIQQFYSNLDYYIPDRYN EGYGISKKGVDYAAETGVGLIIVLDCGIKAVEEITYAKEKGIDFIICDHHVPDDVLPPAV AILNAKRLDNTYPYTHLSGCGVGFKFMQAFAISNGIEFHHLIPLLDIVAVSIASDIVPIM GENRILAYHGLKQLNSNPSIGMKAIIDVCGLSEKEITVSDIVFKIGPRINASGRIQNGKE AVDLLTEKDFSIALEKAGQINQYNETRKDLDKSMTEEANNIVANLEGLADRRSIVLYNEE WHKGVIGIVASRLTEVYYRPAVVLTRTDDMATGSARSVSGFDVYKAIEHCRDLLENFGGH TYAAGLSMKVENVNAFTKRFEEYVSQHILPEQTSAVIEIDAEIDFRDISSKFFSDLKKFN PFGPDNTKPIFCTHHVYDYGTSKVVGRDQEHIKLELVDNKSNNVMNGIAFGQSSHVRYIK TKRSFDICYTIEENTHKRGEVQLQIEDIKPIE >gi|225935361|gb|ACGA01000031.1| GENE 79 113475 - 115385 1324 636 aa, chain + ## HITS:1 COG:CAC2687 
KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 7 481 8 471 714 298 37.0 3e-80 MNKYQEILKQYWGYDSFRDLQEEIITSIGEGKDTLGLMPTGGGKSITFQVPALAQEGICI VITPLIALMKDQVQNLRKRGIKALAIYSGMTRQEILTALENCIFGNYKFLYISPERLDTD IFRTKLRSMKVSMITVDESHCISQWGYDFRPAYLKIAEIRTLLPGIPVLALTATATPEVV KDIQARLDFREENVFRMSFERKNLAYIVRQTDNKTQELLHILRKIPGSAIIYVRNRRRTK EITELLVNEDITADFYHAGLDNAVKDLRQKRWQSGEVRVMVATNAFGMGIDKPDVRIVLH LDLPDSLEAYFQEAGRAGRDGEKAYAVILYTKTDRTTLHRRVVDTFPDKEYILNVYEHLQ YYYQMAMGDGFQCVREFNLEEFCRKFKYFPVPVDSALKILTQAGYLEYTDEQDNASRILF TIRRDELYKLREMGTEAEALIQTILRSYTGVFTDYAYISEATLSIRTGLTREQIYNILVT LTKRRIVDYIPHKKTPYIIYTRERQELRFVHIPPSVYEERKARYEARIKAMEEYVTSENV CRSRMLLRYFGEKNEHNCGQCDVCLSHRATDALTENSFDFEELKKKISELLTQKPLTPVE IADKIEAEKESISEVIQYLLEEGEWKMQDGMIHISK >gi|225935361|gb|ACGA01000031.1| GENE 80 115426 - 116388 1266 320 aa, chain + ## HITS:1 COG:alr0622 KEGG:ns NR:ns ## COG: alr0622 COG0457 # Protein_GI_number: 17228118 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 98 297 256 451 547 77 28.0 4e-14 MPNFFKSFFSGKSETPESEKQKNDQKNFEIFKYDGLRAQRMGRPDYAVKCFIEALAIKEE FETMGYLSQLYIQMGETAKARELLEKMAAMEPDVTSTFLTLANVCFIQEDYQAMEEAANK AIAIEEGNAVAHYLLGKARKGQDDDLMTIAHLTKAITLKDDFIEARLLRAEALMNLKQYK DMMEDIDAVLAQNPEEETAMLLRGKVKEADGKDEEAEEDYKLVTEINPFNEQAYLYLGQL YINQKKLTEAIGLFDEAIELNPNFAEAYKERGRAKLLNGDKDGSVEDMKKSLELNPKEEA GLNGEFKNLGPKPEALPGIF >gi|225935361|gb|ACGA01000031.1| GENE 81 116689 - 117537 794 282 aa, chain + ## HITS:1 COG:VC0705_2 KEGG:ns NR:ns ## COG: VC0705_2 COG0077 # Protein_GI_number: 15640724 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Vibrio cholerae # 11 275 3 265 278 139 35.0 6e-33 MKKIAIQGTLGSYHDIAAHKYFEGEEIELICCANFEDVFTSIRKDSQVIGMLAIENTIAG SLLHNNELLRQSGTQIIGEYKLRISHSFVCLPDENWEDLTEVNSHPIALMQCREFLNQHP QLKVVEGEDTARSAEIIKNENLKGHAAICSKAAAERYGMKVLQEGIETNKHNFTRFLVVA DPWQVDELRQHHANATNKASIVFTLPHTEGSLSQVLSILSFYNINLTKIQSLPIIGREWE YQFYVDVAFNDYLRYKQSIAAITPLTKELKLLGEYAEGKSNV >gi|225935361|gb|ACGA01000031.1| GENE 82 117512 - 118690 1181 392 aa, chain + ## HITS:1 COG:aq_273 KEGG:ns NR:ns ## COG: aq_273 COG0436 # Protein_GI_number: 15605813 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 13 392 5 385 387 301 42.0 2e-81 MQKESQTYKIAPADRLASVSEYYFSKKLKEVAQMNAEGKDVISLGIGSPDMPPSRETIET LCNNAHDPNGHGYQPYVGIPELRKGFAAWYQRWYGVELNPNTEIQPLIGSKEGILHVTLA FVNPGEQVLVPNPGYPTYTSLSKILGAEVINYDLKEEDGWMPDFEALEKMDLSRVKLMWT NYPNMPTGANATPEIYERLVDFARRKNIVIVNDNPYSFILNEKPISILSVPGAKDCCIEF NSMSKSHNMPGWRIGMLASNAEFVQWILKVKSNIDSGMFRAMQLAAATALEAEADWYEGN NENYRNRRHLAGEIMKALGCTYDEKQVGMFLWGKIPASCKDVEELTEKVLHEARVFITPG FIFGSNGARYIRISLCCKDNKLAEALERIKRI >gi|225935361|gb|ACGA01000031.1| GENE 83 118778 - 119839 1153 353 aa, chain + ## HITS:1 COG:DR1001_2 KEGG:ns NR:ns ## COG: DR1001_2 COG2876 # Protein_GI_number: 15806024 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) 
synthase # Organism: Deinococcus radiodurans # 1 242 13 245 270 146 38.0 7e-35 MELESILLPGVEAKRPIVIAGPCSAETEEQVMDTAKQLAAKGQKIYRAGIWKPRTKPGGF EGIGVEGLAWLKEVKKETGMYVSTEVATAKHVYECLKAGIDILWVGARTTANPFAVQEIA DALKGVDIPVLVKNPVNPDLELWIGALERINNAGLKRLGAIHRGFSSYDKKIYRNLPQWH IPIELRRRIPNLPIFCDPSHIGGKRELVAPLCQQAMDLNFDGLIVESHCNPDCAWSDASQ QVTPDVLDYILNLLVIRTETQSTESLAQLRKQIDECDDNIIQELAKRMRVAREIGTYKKE HGITVLQAGRYNEILEKRGAQGEQCGMDSEFMKKIFEAIHEESVRQQMEIINK >gi|225935361|gb|ACGA01000031.1| GENE 84 119995 - 120768 705 257 aa, chain + ## HITS:1 COG:no KEGG:BT_3933 NR:ns ## KEGG: BT_3933 # Name: not_defined # Def: chorismate mutase/prephenate dehydratase (TyrA) # Organism: B.thetaiotaomicron # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:bth00400]; Novobiocin biosynthesis [PATH:bth00401]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 257 1 257 257 488 98.0 1e-137 MRILILGAGKMGSFFTDILSFQHETAVFDVNPHQLRFVYNTYRFTTLEEIKEFEPELVIN AVTVKYTLDAFRKILPVLPKDCIISDIASVKTGLKKFYEESGFRYVSSHPMFGPTFASLS NLSSENAIIISEGDHLGKIFFKDLYQTLRLNIFEYTFDEHDETVAYSLSIPFVSTFVFAA VMKHQEAPGTTFKKHMAIAKGLLSEDDYLLQEILFNPRTPGQVTNIRTELKNLLEIIENK DAEGMKKYLTKIREKIK >gi|225935361|gb|ACGA01000031.1| GENE 85 120875 - 122956 1597 693 aa, chain + ## HITS:1 COG:BH1375 KEGG:ns NR:ns ## COG: BH1375 COG0358 # Protein_GI_number: 15613938 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Bacillus halodurans # 2 450 5 453 599 280 36.0 8e-75 MIDQITIDRILDAAQIMDVVSDFVTLRKRGVNYVGLCPFHSDKTPSFYVSPAKGLCKCFA CGKGGNAVHFIMEHEQMSYPEALKYLAKKYNIEIKERELSDEEKFVQSERESLFIVNNFA RDYFQNILKNHVDGRSIGMAYFRNRGFRDDIIEKFQLGYCTEAHDAFAKEAIQKGYKKEY LVKTGLCYETDDHRLRDRFWGRVIFPVHTLSGKVVAFGGRVLASATKGVKVKYVNSPESE IYHKSNELYGIYFAKQAIVKQDRCFLVEGYTDVISMHQSGIENVVASSGTALTPGQIRMI HRFTNNMTVLYDGDAAGIKASIRGIDMLLEEGMNIKVCLLPDGDDPDSFARKHNSTEFQT FISEHETDFIRFKTNLLLEDAGKDPIKRAELIGNLVQSISVIPEAIVRDVYIKECAQLLH VEDKLLVSEVAKRRETQAEKRAEQTERERRMAERTAMMSQGSTPPEDVPIPNGDIPLPPE 
VDGGYTDVPPALQEDSYASFIPQEGKEGQEFYKFERLILQAVVRYGEKIMCNLTDEEGNE IPVTVIEYVVNDLKEDELAFHNPLHRQMLSEAAAHMHDSNFIAERYFLAHPDPVISKLSV DLINVRYQLSKYHSKSQKIVTDEERLYEMVPMLMINFKYAIVTEELKHMLYALQDPALAH DNEKCDSLMQRYNELRTVQSIMAKRLGDRVVLR >gi|225935361|gb|ACGA01000031.1| GENE 86 122986 - 123576 637 196 aa, chain - ## HITS:1 COG:slr0426 KEGG:ns NR:ns ## COG: slr0426 COG0302 # Protein_GI_number: 16331608 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Synechocystis # 6 193 41 229 234 215 59.0 5e-56 MLEKEEIVSPNLEELKSHYRSIITLLGEDAEREGLLKTPERVAKAMLSLTKGYHMDPHEV LRSAKFQEEYSQMVIVKDIDFFSLCEHHMLPFYGKAHVAYIPNGYITGLSKIARVVDIFS HRLQVQERMTLQIKECIQETLNPLGVMVVVEAKHMCMQMRGVEKQNSITTTSDFTGAFNQ AKTREEFMNLIQHGRV >gi|225935361|gb|ACGA01000031.1| GENE 87 123584 - 124027 459 147 aa, chain - ## HITS:1 COG:no KEGG:BT_3930 NR:ns ## KEGG: BT_3930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 147 6 149 149 251 83.0 6e-66 MKVLLTLLFVLTATFAQAQSIIKSLERNIPGQGKVTIHQDPRIEALIGMERPATGEQKVI KTSGFRIQAYAGNNTRQAKNDAYHVASRVKEYFPELTVYTSFNPPRWLCRVGDFRSIEEA DAMMRRLKATGVFKEVSIVRDQINIPL >gi|225935361|gb|ACGA01000031.1| GENE 88 124196 - 124951 1063 251 aa, chain - ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 236 49.0 3e-62 MRKNIVAGNWKMNKTLQEGIALAKELNEALANEKPNCDVIICTPFIHLASVTPLVDAAKI GVGAENCADKASGAYTGEVSAEMVASTGAKYVILGHSERRAYYGETVAILEEKVKLALAN GLTPIFCIGEVLEEREANKQNEVVAAQMESVFSLSAEDFSKIVLAYEPVWAIGTGKTASP EQAQEIHAFIRSIVANKYGKEIADNTSILYGGSCKPSNAKELFANPDVDGGLIGGAALKV SDFKGIIDAFN >gi|225935361|gb|ACGA01000031.1| GENE 89 125058 - 126416 846 452 aa, chain - ## HITS:1 COG:no KEGG:BF3957 NR:ns ## KEGG: BF3957 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 444 1 442 443 695 74.0 0 
MKDKNLHIIQEVVANTCRFLLAASFIFSGFVKAVDPLGFQYKIQDYLTAFGIASWFPSFF PLLGGIILSAVEFFIGISLFFATRRTLATSLALMLMIFMTPLTLYLAIFDPVSDCGCFGD AWVLTNWETFGKNIILLFAAMMAFRHRRMLIRFISVKMEWLVSLYTLFFVFTLSFYCLDR LPVLDFRPYKIGKNILEGMTMPEGAKPSVYESIFILEKNGEKKEFTLDNYPDSTWTFIDT RTILKEKGYEPAIHDFSMIDLNTGEDITDDVLTDIGYTFLLVAHRIEEADDSNIDLINEI YDYSVEHGYKFYCLTSSPEEQIELWKDKTGAEYPFCQMDDITLKTMVRSNPGLILIKNGT ILNKWSDEDIPDEYVLTDKLENLPLGKQKVSSDTHTVGYVFLWFVIPLLLVLGVDVLVVR RRERKNAKRKQQEEEMKNKELKAENSKIEEQE >gi|225935361|gb|ACGA01000031.1| GENE 90 126406 - 126969 588 187 aa, chain - ## HITS:1 COG:no KEGG:BT_3927 NR:ns ## KEGG: BT_3927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 187 1 180 180 342 96.0 5e-93 MRTIWKKMKDTKQQFEHVIALCRDLFSKKLHDYGPAWRILRPASVTDQIFIKANRIRSIE TKGVTLIDEGIRAEFIAIVNYGIVGLIQLELGYAESADISNEEAMALYDKYAKEALDLML AKNHDYDEAWRSMRVSSYTDLILMKIYRTKQIESLAGNTLVSEGIDANYMDMINYSVFGL IKIEFEG >gi|225935361|gb|ACGA01000031.1| GENE 91 127071 - 127943 718 290 aa, chain - ## HITS:1 COG:HI0409 KEGG:ns NR:ns ## COG: HI0409 COG0739 # Protein_GI_number: 16272358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Haemophilus influenzae # 98 206 328 443 475 102 44.0 7e-22 MNFNCIIKTGLVAVAAMVSLSSFSQDLIARQAPIDKKLKSVDSLALQKQIRAEQSEYPAL SLYPNWNNQYAHSYGNAIIPETYTIDLTGFRMPTPSTKITSPFGPRWRRMHNGLDLKVNI GDTIVSAFDGKVRIVKYERRGYGKYVVIRHDNGLETIYGHLSKQLVEENQLVKAGEPIGL GGNTGRSTGSHLHFETRFLGIAINPIYMFDFPKQDIVADTYTFRKTKGVKRAGSHDTQVA DGTIRYHKVKSGDTLSRIAKLRGVSVSTLCKLNRIKPTTTLRIGQVLRCS >gi|225935361|gb|ACGA01000031.1| GENE 92 127964 - 128428 364 154 aa, chain - ## HITS:1 COG:AF0767 KEGG:ns NR:ns ## COG: AF0767 COG0105 # Protein_GI_number: 11498373 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Archaeoglobus fulgidus # 2 149 1 148 151 145 45.0 2e-35 MLEKTLVILKPCTLQRGLVGEITHRFERKGLRLAGMKMMQLTDELLSEHYAHLSGKSFFQ 
RVKDSMMTAPVIVCCFEGVDAIQTVRTLAGPTNGRLAAPGTIRGDYSMSFQENIVHASDS PETAAIELKRFFKPEEIFDYKQATFNYLYANDEY >gi|225935361|gb|ACGA01000031.1| GENE 93 128652 - 130748 1402 698 aa, chain - ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 18 670 146 805 831 496 43.0 1e-140 MFDLATRDIKFISGVGPQKAAVLNKELEIYSLHDLIYYFPYKYIDRSRIYYIHEIDGNMP YIQLKGEILGFETIGEGRQRRLTAHFSDGTGVVDLVWFQGIKYILGKYKLHEEYIIFGKP TVFNGRINVAHPDVDKPDDLKLSSVGLQPYYSTTEKMKRSFLNSHAIEKMMATVIQQIQE PLPETLSPKLLTEHHLMPLTEALRNIHFPTNPDVLRRAQYRLKFEELFYVQLNILRYAKD RQKRYRGYIFEKVGDVFNTFYTKNLPFQLTGAQKRVLKEIRNDVGSGRQMNRLLQGDVGS GKTLVALMSMLLALDNGYQACMMAPTEILANQHYETIKELLFGMDIRVELLTGSIKGKRR EAILSGLLTGDVQILIGTHAVIEDTVNFSSLGFVVIDEQHRFGVAQRARLWSKNVQPPHV LVMTATPIPRTLAMTLYGDLDVSVIDELPPGRKPITTIHQFDNRRESMYRSVRKQIDEGR QVYIVYPLIKESEKIDLKNLEEGYQHILEEFPKCTVCKVHGKMKPAEKDEQMQLFVSGKA QIMVATTVIEVGVNVPNASVMIIENAERFGLSQLHQLRGRVGRGAEQSYCILVTNYKLTE DTRKRLEIMVRTNDGFEIAEADLKLRGPGDLEGTQQSGIAFDLKIADIVRDGQLLQYVRA IAESIVEQDPAAQSPENEILWRQLKALRKTNVNWAAIS >gi|225935361|gb|ACGA01000031.1| GENE 94 130749 - 131408 296 219 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 1 219 1 223 234 118 37 2e-25 MKKYVIIVAGGKGLRMGSDLPKQFLPMGDKPVLMHTLEVFRRYDEALQIILVLPQEQQSF WKQLCDEHHFTVKHVLAEGGETRFHSVKNGLALVQEPGLVGVHDGVRPFVSVEVIRRCYE LAEVQKAVIPVVDVVETLRHLTDAGSETVSRIDYKLVQTPQVFDVELLKQAYAQEFTPFF TDDASVVEAMGMPVYLAEGNRENIKITTPFDLKVGSALL >gi|225935361|gb|ACGA01000031.1| GENE 95 131416 - 131967 720 183 aa, chain - ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 7 182 6 180 188 132 42.0 4e-31 MGTVYAFFADGFEEIEAFTAIDTLRRAGLNVEIVSVTPDEIVVGAHDVSVLCDINFENCD 
FFDAELLLLPGGMPGAATLDKHEGLRKLILDFAAKGKPIAAICAAPMVLGKLGLLKGKKA TCYPSFEQYLDGAECVNAHVVRDGNIITGMGPGAAMEFALTIVDLLVGKEKVDELVEAMC VKR >gi|225935361|gb|ACGA01000031.1| GENE 96 131995 - 132870 940 291 aa, chain - ## HITS:1 COG:no KEGG:BT_3921 NR:ns ## KEGG: BT_3921 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 291 3 292 292 374 87.0 1e-102 MDRRKKGEYIGALGALLVHVAVIALLILVSFTVPQPDEDAGGVPVMLGNVDTASGFDDPS LVDVDIMDEDAAAPPAETEPQLPSEQDLLTQTEEETVTLKPKTEEPKKETVKPKEVVKPK EPVKKPEKTEAEKAAEAKRLAEEKAERERKAAEEAARKRVSGAFGKGAQMTGNKGTAASG TGTEGSKEGNSSTGAKTGTGGYGTFDLGGRSLGTGSLPKPVYNVQEEGRVVVNITVNPAG QVISTSISPQTNTVNSALRKAAEDAAKKARFNTIDGVNNQTGTITYYFNLR >gi|225935361|gb|ACGA01000031.1| GENE 97 132877 - 133293 312 138 aa, chain - ## HITS:1 COG:no KEGG:BF3738 NR:ns ## KEGG: BF3738 # Name: not_defined # Def: putative tansport related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 138 6 143 145 234 87.0 9e-61 MGLKRRNRVSPNFSMASMTDVIFLLLIFFMITSTVVSPNAIKVLLPQGKQQTSAKPLTRV VIDKDLNFYAAFGNEKEQPVALNDLTSFLQSCAEKEPEMYVALYADESVPYREIVRVLNI ANENHFKMVLATRPPENK >gi|225935361|gb|ACGA01000031.1| GENE 98 133301 - 134017 843 238 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 35 223 1 190 202 91 31.0 1e-18 MNAMILLAQGAMNMADSLATANPVLTEVNAPEMNMLDMAVKGGWIMIVLGVLSVICFYIL FERNYMIRKAGKEDPMFMERIKDYIHSGEIKAAINYCRTMNTPSARMIEKGISRLGRPIN DVQVAIENVGNIEVAKLEKGLTVMATISGGAPMLGFLGTVTGMVRAFYEMANAGSGNIDI TLLSGGIYEAMITTVGGLIVGIIAMFAYNYLVMLVDRVVNKMESRTMEFMDLLNEPAK >gi|225935361|gb|ACGA01000031.1| GENE 99 134014 - 134730 753 238 aa, chain - ## HITS:1 COG:XF0060 KEGG:ns NR:ns ## COG: XF0060 COG0854 # Protein_GI_number: 15836665 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # Organism: Xylella 
fastidiosa 9a5c # 2 238 5 252 260 207 47.0 1e-53 MTKLSVNINKIATLRNARGGNVPDVVKVALDCESFGADGITVHPRPDERHIRRSDVYDLR PLLRTEFNIEGYPSPEFIDLVLKVKPHQVTLVPDDPSQITSNSGWDTKANQEFLTEVLDQ FNSAGIRTSVFVAADPEMVEYAAKAGADRVELYTEPYATDYPKNPEAAIAPFIEAAKTAR KLGIGLNAGHDLSLVNLNYFYKNIPWVDEVSIGHALISDALYLGLERTIQEYKNCLRS >gi|225935361|gb|ACGA01000031.1| GENE 100 134843 - 135748 529 301 aa, chain + ## HITS:1 COG:PA3088 KEGG:ns NR:ns ## COG: PA3088 COG0061 # Protein_GI_number: 15598284 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Pseudomonas aeruginosa # 75 297 64 289 295 166 38.0 6e-41 MPNLHPELMMLLMKFAIFGNTYQAKKSSHAVTLFKLLKKQGAEIGMCREFYQFLVSENMD IEADQLFDGDDFTADMVISIGGDGTFLKAARRVGRKGIPILGINTGRLGFLADISPEEME ETFDEIQNGRYSVEERSVLQLICKDKHLQDSPYALNEIAILKRDSSSMISIRTAINGAYL NTYQADGLVIATPTGSTAYSLSVGGPIIVPHSNTIAITPVAPHSLNVRPIVIRDDWEITL DVESRSHNFLVAIDGRSETCKETTQLTIRRADYSVKVVKRFNHIFFDTLRSKMMWGADGR R >gi|225935361|gb|ACGA01000031.1| GENE 101 136381 - 137253 981 290 aa, chain - ## HITS:1 COG:no KEGG:BT_3912 NR:ns ## KEGG: BT_3912 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 290 1 290 290 414 86.0 1e-114 MSKKSLLIAAVAILVIAIIGITYLLFTEKQANRELVQEFQLDKEDLENEYTRFVQQYDEL KMTISNDSLSQLLEQEQLKTQRLLEELRTVKSTNATEIRRLKNELATLRKVMVGYINQID SLNKLTAQQKQVIAEVTQKYNQASQQISNLSEEKKNLDKKVTLAAQLDATNIRIEPRNKR GKAAKKVKDVVKLAISFTVVKNITAENGERTIYIRITKPDNDVLTKSASNTFPYENRTLV YSIKKYIEYNGEEQNINVFWDVEEFLYAGNYRVDIFEGGNLIGSQTFMLD >gi|225935361|gb|ACGA01000031.1| GENE 102 137428 - 139122 1233 564 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295086521|emb|CBK68044.1| ## NR: gi|295086521|emb|CBK68044.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 564 1 564 564 1129 99.0 0 MNIKQLMVTFFIALLAGGEIGARVLTDKFVYSQGEKVVFTFDGKSEGKTIILKYLSKKGE PVLAEIGGEPFVWEVPSEFTPAAVGVYQKEEGQLTYSSYFRVVTPGMLTTYQIAKEEYKG LNVFMLDGGMSAEYAVQKSLANLTAGVSHTWQIGPGGGPKPVWGTPDFLQQSVQHTVDLY NEYLGKSKKLKTVIIATGVPTVPYLSAAMEAPVLPLHFLVSVNSTKEVSSILEYSSQAGV 
PCYATLGYDASMDDVGVAWIKLLALPDEYRKFIIEHEVENVIIAGIGEDVKSESYCRKLS KTGVDGQEYADGSLYILYTQSGSEHDIKTISRNVVDYDTLSLEKGKDLADWESGVVNRQI DNISKGICEHTPAQVYSLIATHDMMDMYNLGANMGMYFMYKNREQTKVSVQGTYLNEYLI SQPLYELTQGYIPLLFWQFVPPVSTIDRIKRDIQKVVDVYEKGILLENKTVHVNARIGKE ELVQELKKRGFRFVTKRKDNVEELWNLSDGINSPCEEVVQNIVEQIGVKQYQTQCKNALY LNMGDLKLVTNNIPGLVFHSFKKK >gi|225935361|gb|ACGA01000031.1| GENE 103 139156 - 141024 1431 622 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171941|ref|ZP_05758353.1| ## NR: gi|260171941|ref|ZP_05758353.1| hypothetical protein BacD2_08750 [Bacteroides sp. D2] # 1 622 1 622 622 1228 100.0 0 MRIISYISSIIITFSALAIMSGCKDSEVVDDGFRTTMAKASDVKLNKVAYWPGDPVNCSF TLKNETNYPMDIREVKVVIQNLDDGGCVLIEKSVASHIQIEPGQSVPVDAGTLYTLPAAL KPSSFCAVKFLLDFEDGITTTIDGTYFRAVNEQSLLTYDIQKLDYQGLPVYRQIGDMSAG FGVLKTIVAFDQGIAATMEEAPQGGTYPVAPTPEFLQRSVRKTVELYNSEIGAATKIKRV VVGTGIASVSYFATMMGAAYLPIHYLVSANSASEVQAILDYSNQNGYASYATLGYDGSMP GVGVAWIKLLDLPEEYKQFIKDHQVEEVYIYGVGQEGHGESYSRRVLTQNTITDEYAPGA LYILYTNFGSDADIDALKHRLYDYNQLKLGEGQYISDWESGIVDDQIANISGSAQAMTNV KAYTIETDDMMALYNISSFLTLQYIKKNQSKLQAPFVNGVIFNEYLTNHPQYEAFVGYVP LLYWQFNSAASTVERIDGYLKPAIAGYFPDVVDHLYEGSFYLNSNMRRYEFYDELIARGV TSENIRIRQSVDKWNPEDDGETEEYLGRINHKIGSAEEFAYDIIERIGVQKYRNTVKSME YLTLEELRTICAQVGNMRLVEH >gi|225935361|gb|ACGA01000031.1| GENE 104 141011 - 143224 1431 737 aa, chain - ## HITS:1 COG:no KEGG:BT_3590 NR:ns ## KEGG: BT_3590 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 732 16 732 732 845 54.0 0 MCKIKCLFLLMLIVWCSSCKESTEDEKAMQQMVERLFPEYASQFSFEQSEKIDKDWYEIE AQGGTVRIRGNNANSMAVGLNYYLNHYCLTSVSWYVNDTVEMPEVLPMPPAKITSTARCK NRFFLNYCTFGYTMPWWTWKDWERLIDWMALNGINMPLAITGQESVWYRVWTKLGLTDEE IRNYFTGPAHLPWHRMSNLDYWQGPLPKEWLDTQEALQKQIVARERQFNMRPILPAFAGH VPSELKRIYPEAKISRMSSWGGFEDKYRSHFLDPLDPLFATIQKEFLEEQTKLFGTDHIY GADPFNEVAPPSWEPEFLANCSKHIYQSMTHVDPDATWLQMTWLFYIDRHLWTNERVEAF LKAVPQDKLLLLDYYCENTEVWKQTDRYFGQPYLWCYLGNFGGNTMLAGNTKEVGKRIEN VYTNGGENFSGLGSTLEGFDVNPFMYEYVFSKAWDCNLPDSVWIEQLADRRIGLRNQQMR 
RAWKLLYDSIYTAPAALGQGTLMNARPCLKGNGNWTTTSTVAYSNETLFEVWEMLLKAGE HRHSAYEYDVVNIGRQVLGNYFGKLRDEFAEAYSRKQLPLLKQKGAEMKQLLRDVDTLLS TQSSFLLGKWIEDARSLGTDGASKNYYEENARTIVSTWGDKDQSLNDYANRTWGGLVSGY YAPRWEMFIDEVIRSVSNKQPFNADAFHQRVTQFEIDWVKSHERYPSEPVGNAVEIATLL MNKYKDSILKEKHNENN >gi|225935361|gb|ACGA01000031.1| GENE 105 143328 - 144788 1365 486 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171943|ref|ZP_05758355.1| ## NR: gi|260171943|ref|ZP_05758355.1| hypothetical protein BacD2_08760 [Bacteroides sp. D2] # 1 486 1 486 486 922 100.0 0 MNRARLYINIKLLLLAVLTLNLSGCELDERVDDLTGGYEGAFIDRLTGEKVATEYYGAKL KLLDLEYGNVAVPLEYNTLPEGTYRNTKVYPSRYKVWANGPFFELDTIYGDIRSFKKMDL IVTPNVTLKIKKVEVLYGITANVTFTYQVNDERSKNQEIGLVYGKEQYPGQRTAMNESES GSHTYKRIKKNLAELSGEFTETLFLNPNSTYYLRALGRTESAGDYWNYSEQTVINTTDID LSSLPIEAAVGVSSATSAVLQWAFPPVVDEIKVSYTDRDGEEVMDKFKPTDYSYVANLPH NQKSTIKVQLLAKGVAGPEQTIEVQTKSLTDKYVPASNTRPENVPFYNDSEFKKSLSGEW ALIYGPTIGEDWSTTDLRFEYFDWWDTWLIGFADRMPACQDIENFKSLTIQGEIQTLVDI LPFVNLETLSIIKGKGFSVDKTINPKVDLTVLKKLKKLNTVIIGPDVPLTKKNFDDAGLT HLTITN >gi|225935361|gb|ACGA01000031.1| GENE 106 144809 - 146659 1543 616 aa, chain - ## HITS:1 COG:no KEGG:BDI_3134 NR:ns ## KEGG: BDI_3134 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 143 598 136 584 605 187 32.0 8e-46 MKVKSIIYMVLALSLCCGCSDWLTVDSKTILSEEDIAKYPELAEAQFLSNYAELRKSIHC IGDGAMSYRQHHLDAFTDDGASNITYENGVMRNNTPGTVFGGVFSQSKGEIFEAVWNYKT INIVNKFIATYKNSDNEGVLSTVGEAYFIRAYLYFEMVKRYGGVPLYSSPLDDVSSINNR STEEKSWDYIKDNLDSALVLLPKVQRIASEDRDRANRYTALALKSRAMLYAGTIAKYGKV SNNSFQGIRKEMAKTYLLEAAKAAKEIVDDGKYALSTEFGDLFNGKDENNNEIIFRFANV AKTGVAVYEDYWYQSYRIKRSSYCAFMVPPLDVVEQFETLDGKIQPLDYAASKNNPEDFF ANRDKRLDATVIYPGGEFLGERFSIYRKTLVKRTDGTTEEYSYEKSEDWMGAGKVPGHEK YMKSGADGIFLNLSAAGTTNWGFFLKKTLYGVKRLDDYLIQENDQDAVVIRYGEVILNLA EAAVELSTYGVNDYLAVAQVAFDLLRSIHGGLPAKTMDLEVVRHERRIDLMYEGFRYWDL KRWRIGEEKMHNKTLKALYPILHIDETTSPASVYYTLEKVEAPDLATRVKWFEERDYYCP LPLSKSPGIVQNDGWN >gi|225935361|gb|ACGA01000031.1| GENE 107 146666 - 150097 2734 1143 aa, chain - ## 
HITS:1 COG:no KEGG:BDI_3133 NR:ns ## KEGG: BDI_3133 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 14 1143 7 1117 1117 578 33.0 1e-163 MNEKHQKKNHTCLSVRRPFICFFLALGVFLLSAFPTELYAQKKTITLDYKELGLMELFKK IEEKSDYVFFYYDVLIDKDVKVPARFKNMTVEQILDKVFANMELTYSINKKQITIKKKVL QEKTKKEGQSSIVKGVVFDQSREPLPGVSIVVQGTNIGTVSDLNGAYSIHVPSDDSKIIF SYIGFAPVTYSAKEMNKLSEVVLVEDTKTIDEVVVVGYTTRTREKLISSVSTINNQELVK STVPNLENALTGRVSGVFSRQTSAEPGSDGADLKIRGFGSALVVVDGIPGRNYSDIDPSE IESVSVLKDASAAAVYGMQGANGVILVTTKRGGKNKPTTLDINTRFGLQMPHNYPQPAST PLWQTLVGEYYANMKLINDKNAVITPADMATRDYAYNTNWYEEMIKNAPITQSNINISGG TDKVSYFISAGYLYQGGIWSTNFTDKNRFNFRSNLDADILKNLKLSVGVGAVINSLNYPR SASYEIARKMKDMAPNIPVKWPGHDDYYAFGGEGTVNPMALADKEASGYSKKIAKNLNVD FSLEYKVPFVEGLSLKATMGYTQSDSWNKNWNMNIVYMGYREDAQEYYESASASNANKAS LSLEDGFSYNITGQGFINYIRSFGNHNINSGLVFEFSDAENRSTVTSRGEFPSTVLDMMA GGIANKQVTNSEVFRKYRTASFIGRFSYDYRSKYFVDFNFRYDGAQYFADKWGFFPSASV GWMLTNEEFMNPLKKVLNEFKIRASWGELGDLSAASQYYANNEQYYFQSGYQYPGTPMNF GDRTIYGLNPTLNPNPDFTWATSSMINAGVDFKLWNGLLSGSADVFYRQRKGLPAQKAND NAGALATWYNLDHDNTRGFEFSLNHQYKIGEVNYFVGGNMSWSRTRKGNIEHGRFTSGYD EWKWNTEGHWNNVRWGYNCIGRYQSYGEIANAPMHNNSNNNSAILPGDLKYEDWNGDGYI DNYDQRPIGRNAYPELVYGINLGLSWKGVDFSMFWQGGALSDFQIGAFDMDAFQEGATNL NTWEYFGDRWHRADYTDPNSEWIPGYFPAVRDFTSVTINRLSSNFWMWNGSYIRLKNVEL GYTLPQRITQKANIKQLRIYANLYNCLTFSSQKFFDPEQLESQYSFASYPQIMSFNVGIN LKF >gi|225935361|gb|ACGA01000031.1| GENE 108 150223 - 151365 871 380 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 172 369 65 260 274 75 27.0 1e-13 MKEYSSYTTADFLNDDLFLFWYYSGEGDYYRKIIADCPDREPYLKEAMERLAALKWEKPV LPPKEVDNACLKLEQAIQNRKMSQRKYRLYKMRWWSASVAAAILILFVSVEVIWKSAPAI DYLALLQVNDSVFMSGKTQLFVDDQLKETFEGNLDLAYDQMATDAGEGEFNKLVVSYGKR ARVTLCDGTKIWANAGTVLLYPTHFEDKKREIYVDGEIYIDVTPNPEKPFIIKTSDMGVK 
VLGTSFNVSAYREDVEKSVVLVTGKVEVTASNGESVRILPNDRFRQSTDKYVVDKVNVED YVSWKEGRLSFKNTELGGILKQLSRYYNVRIDYDKQQQITCSGKLNLDDTIEQILNTITE TAPVVISKENNVYKVTIKKK >gi|225935361|gb|ACGA01000031.1| GENE 109 151510 - 152100 364 196 aa, chain - ## HITS:1 COG:DR0180 KEGG:ns NR:ns ## COG: DR0180 COG1595 # Protein_GI_number: 15805216 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Deinococcus radiodurans # 5 175 48 221 229 64 30.0 1e-10 MKFWESNHQSVQEGDMQFIALYKFYYQDLYAYGVSLGFNTEDVKDAIQEVYLKLYFNERL CIDEKKIKFYLLRSVRNQLIDWERTKKDTSSIEEEERSFKLSVSVEESFISDEEDLLLKK RVNRILDLLTDHQREIVYLHFIEEMPYEEIAVMLDMKIQTVRGQVFKAMEKLRKLDSKDY FLFFLILYLHGVSDFK >gi|225935361|gb|ACGA01000031.1| GENE 110 152735 - 153676 1330 313 aa, chain + ## HITS:1 COG:BH3158 KEGG:ns NR:ns ## COG: BH3158 COG0039 # Protein_GI_number: 15615720 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 3 307 7 311 314 266 46.0 3e-71 MSKVTVVGAGNVGATCANVLAFNEVADEVVMLDVKEGVSEGKAMDMMQTAQLLGFDTTLV GCTNDYAQTANSDVVVITSGIPRKPGMTREELIGVNAGIVKSVAENLLKYSPNAIIVVIS NPMDTMTYLALKALGLPKNRVIGMGGALDSSRFKYFLSQAIGCNANEVEGMVIGGHGDTT MIPLTRFATYKGMPVANFISAEKLEEVAAATMVGGATLTKLLGTSAWYAPGAAGAFVVES ILHDQKKMVPCSVLLEGEYGESDLCIGVPVILGKNGIEKIVELNLNEDEKAKFAASAKAV HGTNAALKEVGAL >gi|225935361|gb|ACGA01000031.1| GENE 111 153693 - 153824 69 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171950|ref|ZP_05758362.1| ## NR: gi|260171950|ref|ZP_05758362.1| hypothetical protein BacD2_08795 [Bacteroides sp. 
D2] # 1 43 1 43 43 82 100.0 1e-14 MQMYVLSVRTQNELAFFRNIVGMDSVEKKEGYSSWRGTAFLDI >gi|225935361|gb|ACGA01000031.1| GENE 112 153949 - 155412 1122 487 aa, chain + ## HITS:1 COG:HP1489 KEGG:ns NR:ns ## COG: HP1489 COG1538 # Protein_GI_number: 15646098 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Helicobacter pylori 26695 # 23 455 48 471 510 82 24.0 2e-15 MKKVFLLTFLLSFTFTVKAQSFLSLDSCRALALANNKDLLISNEKISAAHYQRKAAFTNY LPNFSATGAYMRNQKEFSLLNNDQKAALSGLGTNLAGPIQQAATEIATAHPDLAPLISSL SGKLGAVLPALDQAGNSLVDALRTDTRNVYAGAITLTQPLYMGGKIRAYNKITKYAEELA QEQHHGGMQEVIMSTDQAYWQVISLVNKKKLAEGYLKLLQQLDSDVEKMINEGVATKADG LSVRVKVNEAEMTLTKVEDGLSLARMLLCQLCGIDLSSPITLADENMEDIPLLTTDPHFD LSTAYENRPEIRSLELATQIYKQKVNVTRAEHLPSIALMGNYMVTNPSVFNSFENKFKGM WNVGVMVQIPIWHWGEGIYKTRAAKAEARIAQYQLQDAREKIELQVNQAAFKVKEAGKKL VMSSKNMEKAEENLRYATLGFKEGVIATSNVLEAQTAWLSAHSEKIDAQIDVKLTEIYLK KSLGTLK >gi|225935361|gb|ACGA01000031.1| GENE 113 155519 - 156511 1125 330 aa, chain + ## HITS:1 COG:VC1607 KEGG:ns NR:ns ## COG: VC1607 COG0845 # Protein_GI_number: 15641615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 17 328 6 322 324 217 45.0 3e-56 MAPIKSQNSNMLLAFLTLLGVIAIVAVVGFFMLRKGPEIIQGQAEVTEYRVSSKVPGRIL EFRVKEGQSVNAGDTLAILEAPDVVAKMEQARAAEAAAQAQNAKAIKGAREEQIQAAYEM WQKAQAGVTIAEKSYQRIKNLYEQGVMPAQKLDEVTAQRDASIATEKAAKAQYTMAKNGA EREDKMAAEALVNRAKGAVAEVESYIKETYLIAPAAGEVSEIFPKVGELVGTGAPIMNIA ELNDMWVTFNVREDLLKNLTMGSEFEAIIPALDNKKIQLKVYYLKDLGTYAAWKATKTTG QFDLKTFEVKASPVEKVENLRPGMSVIIDK >gi|225935361|gb|ACGA01000031.1| GENE 114 156514 - 157695 551 393 aa, chain + ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 28 355 22 348 387 126 27.0 8e-29 
MKEREKKYIALWQVMQRECRRLVSRPLYLFCMVIAPLFCYIFFTTLMDSGLPKDLPAGVV DMDDSSTSRNIVRNLDAFSQTGVVAHYSNVTDARIAMQEGKIYGFFYLPKGLSAEAQSQR QPTISFYTNYSYLIAGSLLFRDMKMMGELTSGAAARTMLYAKGATEDQAMAYLQPIVIDT HPLNNPWLNYSVYLCNTLIPGVLMLLIFMVTVYSIGVEIKDRTAREWLRMSNNSIYIALA GKLLPHTIVFFIMGIFYNVYLYGFLHFPCNSGIFPMIFATLCLVLASQCCGIVMIGTLPT LRLGLSFASLWGVISFSISGFSFPVMAMHPVLQALSNLFPLRHYFLIYVDQALNGYSMAY SWTNYMALLIFMMLPFFVVHRLKEALVYYKYIP >gi|225935361|gb|ACGA01000031.1| GENE 115 157696 - 158955 921 419 aa, chain + ## HITS:1 COG:VC1609 KEGG:ns NR:ns ## COG: VC1609 COG0842 # Protein_GI_number: 15641617 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 17 381 29 388 408 152 28.0 1e-36 MKDIKLKDKIAQGINDLFYIWKREFQTTFRDQGVLIFFILVPLGYPLLYSFIYDNEVVRE VPAVVVDDSHSSLSREYLRKVDATPDIQIVSYCADMEEAKQMLKNRRAYGIIYIPSDFSD NIAKGKQTQVSIYCDMSGLLYYKSMLLANTAVSLDMNRDIKIARSGNTTDRQDEITAYPI EYEEISIFNPTAGFAAFLIPAVLVLIIQQTLLLGIGLAAGTARENNRFKDLVPINRHYNG TLRIVLGKGLSYFLVYILVAFYVLYVVPRLFSLNQIGQPSSLILFVVPYLAACIFFAMTA SIAIRNRETCMLIFVFTSVPLLFISGISWPGAAIPPFWKYVSYIFPSTFGINGFVKINNM GATLSEVAFEYKALWLQAGIYFLTTCWVYRWQILMSRKHAIERYKELKEKANLSKQISD >gi|225935361|gb|ACGA01000031.1| GENE 116 159110 - 159697 652 195 aa, chain - ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 190 190 189 55.0 6e-47 MKKGLIFVLFALVSIVSYSQMSWNAKVGMNMSNITDSEMDMKIGFQAGVGMEYAFSDMWS IRPSLMFTTKGAKVSEEGVDVTYNPMYLELPIMAAASFAIADNQNIVVKAGPYFAYGIAG KAKFSAGGESEKVDLFGDGEDQMGMKRFDAGIGVGVAYEINKFFVDLTGEFGLAKLGDEG SSKNMNFSIGVGYKF >gi|225935361|gb|ACGA01000031.1| GENE 117 160096 - 160461 367 121 aa, chain + ## HITS:1 COG:no KEGG:BT_3899 NR:ns ## KEGG: BT_3899 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 121 205 92.0 4e-52 MEKLTIQEEEVMIYIWELQCCFVKDIVAKYTQPAPPYTTVASIVKNLERKGYVTPKRVGN 
TYQYTPAIRENEYKRHFMSGVVRNYFENSYKEMVSFFAKDQKISTDDLKDIIELIEKGKE N >gi|225935361|gb|ACGA01000031.1| GENE 118 160480 - 162297 1081 605 aa, chain + ## HITS:1 COG:no KEGG:BT_3898 NR:ns ## KEGG: BT_3898 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 605 1 609 609 887 71.0 0 MTPELAYFLKINVAIALFYAFYRLFFHKDTFFHWRRMALLCFFAISMLYPLLNIQGWIKA HEPMVAMADLYATIILPEQVVTPLQEPVMNWQEFIMQLAKVIYWSGMLLLATRFFVQLGS IIRLHFQCSKSKIQGVRIHLLKKETGPFSFFHWIFIHPQSHTESEISEIITHEETHARQY HSVDVLISEIMCIFCWFNPFIWLMKREVRGNLEYMADHRVLETGHDSKSYQYHLLGLAHH KAAANLSNSFNVLPLKNRIKMMNKRRTKEIGRTKYLMFLPLAAILMIVSNIEMVARTTEK FAKEMMGQATEEVAMQAETTDIPELPTEDIQGTTLPRDKQEKEMAETQIKSVPDSVVFQV VEEMPDFPGGVQALMDYLSKNVRYPAEAHAIGAQGRVIVSFTVKKDGSIADTKVERSVNP YLDKEAMRVIAAMPKWKPGKQRGEAVNVKFTVPVAFRLSGPELPKAEEVKQSDMDEVVVV GYAAKDDPTPEGGSVKGENEDEVFTMVEAMPKFPGGQAGLFQYLARSIKYPVIAQKSKEQ GRVIIQMVISKDGSLSNIKVLRSVSPSLDAEAVRVVGNMPKWEPGMQKGQPVSVKYTIPI VFRLQ >gi|225935361|gb|ACGA01000031.1| GENE 119 162487 - 164400 1213 637 aa, chain + ## HITS:1 COG:no KEGG:BT_3897 NR:ns ## KEGG: BT_3897 # Name: not_defined # Def: putative thiol:disulfide interchange protein DsbE # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 637 1 634 634 737 58.0 0 MKRFVWIIGLIFCITCTIQAKDRVIERPPFLAWSSNSIEVDKIVMSDTVTTVYIKAFYHP KYWIKIATGSFLKDNNGMLYPIRRGVGITLDKEFWMPESGEAEFQLQFPPIPENVTSLDF SEGDFDGAYKIWGIQLDKDAFYKQKLPKEAVVHKINKKAILPTPKLVYGTATLKGKILDY QKEMIKQVKMHIESPALNIHNEQNIIKIKEDGTFLAEVKVVSVTSVALEFPFGWIECLIA PNEETSLIINTKELCRRQAHLQRKDKTYGEPVYFNGYLASLQQELASVDIDIVLKSVYYM DMYNDIVGKSADEYKAYVLERLPSIRKEIAQSQYSNACKELLNIQVDLAATGKIAMTERE LKSAHIAVNKLNKEQTDDYFYNTRIDIPAGYYDILKEFTSINTLKALYGKYYASTIYLIS FLPNSLDILKETLGTGQGPLFDNIKFNKLYQSIKDFTPLTTEQNAELKTFSSPVYAEMLT QTNKEIIKKIELNKRKTGFTVNEAGQVSNEDLFPSIISKFRGHTLLVDFWATWCGPCRTA NKAILPMKEELKDMDIIYLYITGETSPKGTWENMITDIHGEHFRVTNEQWSFLMSNFNIQ GVPTYFVVDPEGNITFKQTGFPGVDTMKEQLMKAMNK >gi|225935361|gb|ACGA01000031.1| GENE 120 164518 - 165618 1311 366 aa, chain - ## HITS:1 COG:alr0652 KEGG:ns NR:ns ## COG: alr0652 
COG0489 # Protein_GI_number: 17228148 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. PCC 7120 # 8 349 10 350 356 265 44.0 9e-71 MTLYPKLILDALATVRYPGTGKNLVEAEMVADNLRIDGMTVSFSLIFEKPTDPFMKSMLK AAETAIHTYVSPDVQVTITAESKQAARPEVGKLLPQVKNIVGISSGKGGVGKSTVSANLA VALAKLGYKVGLLDADIFGPSMPKMFQVEDARPYAERIDGRDMIIPVEKYGVKLLSIGFF VDPDQATLWRGGMASNALKQLIADAAWGDLDYFLIDLPPGTSDIHLTVVQTLAMTGAIVV STPQAVALADARKGINMFTNDKVNVPILGLVENMAWFTPAELPENKYYIFGKEGAKKLAE EMNVPLLGQIPIVQSICEGGDNGTPVALDEDSVTGRAFLSLAASVVRQVDRRNVEMAPTQ IVEMHK >gi|225935361|gb|ACGA01000031.1| GENE 121 165709 - 166467 707 252 aa, chain - ## HITS:1 COG:CAC2627 KEGG:ns NR:ns ## COG: CAC2627 COG0220 # Protein_GI_number: 15895885 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Clostridium acetobutylicum # 30 215 23 206 211 118 36.0 1e-26 MGKNKLEKFADMASYPHVFEYPYSAVDNVPFDMKGKWHKEFFKNDHPIVLELGCGRGEYT VGLGKMFPEKNFIAVDIKGARMWTGATESLQDGMKNVAFLRTNIEIIERFFAEGEVSEIW LTFSDPQMKKATKRLTSTYFMERYRKFLQPNGIIHLKTDSNFMFTYTKYMIEANKLPVEF MTEDLYHSDLVDNILGIKTYYEQQWLDRGLDIKYIKFRLPQEGKLQEPDVEIELDPYRSY NRSKRSGLSTSK >gi|225935361|gb|ACGA01000031.1| GENE 122 166580 - 167203 578 207 aa, chain - ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 6 195 4 243 345 123 32.0 2e-28 MLKQNSELRAQAREALRGKWPMAAVAALIYSAIVGGLSAIPMIGSLCSLFVGLPVAYGFT IVMLGVCRGKEIDFGVLFEGFQDYGRIFVTMLLQAVYTALWSLLLVIPGIIKSYSYAMTS FILKDEPEMKNNAAIEKSMAMMEGNKMKLFMLDLSFIGWAILCIFTLGIGLLFLQPYVAI SRAAFYEDLKAQQGGNVEVNVEVNVEI >gi|225935361|gb|ACGA01000031.1| GENE 123 167249 - 168268 1155 339 aa, chain - ## HITS:1 COG:L0086 KEGG:ns NR:ns ## COG: L0086 COG0115 # Protein_GI_number: 15673270 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and 
metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Lactococcus lactis # 4 339 5 340 340 340 51.0 3e-93 MKEIDWANLSFGYMKTDYNVRINFRNGAWGELEVSSEEHLNLHMAATCLHYGQEAFEGLK AFRGKDGKVRIFRLEENAARLQSTCQGILMAELPTERFKEAILKVVKLNERFIPPYETGA SLYIRPLLIGTSAQVGVHPAEEYMFVVFVTPVGPYFKGGFSTNPYVIIREFDRAAPHGTG IYKVGGNYAASLRANKKAHDLGYSCEFYLDAKEKKYIDECGAANFFGIKDNTYITPKSSS ILPSITNKSLMQLAEDMGIKVERRPIPEEELETFEEAGACGTAAVISPIQRIDDLENGKS YVISKDGKPGPICTKLYNKLRGIQYGDEPDTHGWVTIVE >gi|225935361|gb|ACGA01000031.1| GENE 124 168319 - 168528 229 69 aa, chain - ## HITS:1 COG:no KEGG:BT_3891 NR:ns ## KEGG: BT_3891 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Mismatch repair [PATH:bth03430] # 1 68 1 68 68 69 94.0 3e-11 MAAKKETYSQAMERLEKIVRQIDNNELDIDILSEKIKEANEIIAFCKDKLTKADREVEKL LQEKRQSEE >gi|225935361|gb|ACGA01000031.1| GENE 125 168589 - 169878 906 429 aa, chain - ## HITS:1 COG:all1774 KEGG:ns NR:ns ## COG: all1774 COG1570 # Protein_GI_number: 17229266 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Nostoc sp. 
PCC 7120 # 43 427 36 405 412 158 30.0 2e-38 MINAINKSKQSGSLSLGEGGERLSLLELNALVRRSLEQCLPDEYWIQAELSDVRSNTTGH CYLEFVQKDPRSNNLVAKARGMIWNNIYRLLKPYFEETTGQLFTSGIKVLVKVTVQFHEL YGYSLTVLDIDPAYTLGDMARRRREILLQLEEEGVLTLNKELEIPVLPQRIAVVSSATAA GYGDFCHQLQHNSGGFFFYTELFPALMQGNQVEESVLAALDRINARINEFDVVVIIRGGG ATSDLSGFDTYLLAAACAQFPLPIITGIGHERDDTVLDSVAHTRVKTPTAAAELLIHRVA EVAEHLEELSIRIQQGAYMLLDLERRRLETLQTRIPNLVHRKLADARFSLLAAKKDLSQV TKALVARQSHRLELLQQRIADASPDKLLSRGYSITIKDGKAVTDASSLKPGDHLITRLSK GEVRSVVEK >gi|225935361|gb|ACGA01000031.1| GENE 126 169891 - 171255 895 454 aa, chain - ## HITS:1 COG:BS_aprX KEGG:ns NR:ns ## COG: BS_aprX COG1404 # Protein_GI_number: 16078789 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus subtilis # 169 438 146 431 442 85 28.0 2e-16 MKKFIVLILALNMYLGVFAQFTPEDTLKYRISLKDKAATEYSLQKPEKYLSIKSIERRKK QGLPIDSTDLPVCKKYVDAIRKTGVHVLVTGKWDNFVTVSCNDSTLVDEIAKLPFVRSTE RVWKGITQRAFQRDSLINKPVRTDSLYGPAITQVAMSRVDLLHDAGFKGQGMTIAVIDAG FHNVDKIDAMKNIRILGVRDFVNPEADIYAESIHGMSVLSCMAMNQPNVMIGTAPEASYW LLRSEDEYSENLVEQDYWAAAIEFADSVGVDLVNTSLGYYSFDDPSKNYRYRDLNGHYAL MSREAAKAADKGMVVVCSAGNSGAGSWKKITPPGDAENIITVGAVNKRGELAPFSSVGNT ADGRVKPDVVAVGLNSDVMGTDGNLRRANGTSFSSPIMCGMVACLWQACPKLTAKQIIDL VRQSGDRADFPDNIYGYGIPDLWKAYQSFSKEKK >gi|225935361|gb|ACGA01000031.1| GENE 127 171280 - 172365 864 361 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 5 345 2 354 355 254 38.0 1e-67 MEERNKRVLVGMSGGIDSTATCLMLQEQGYEIVGVTMRVWGDEPQDARELAVRMGIEHYV ADERIPFKETIVKNFIDEYKQGRTPNPCVMCNPLFKFRVLTEWADKLNCAWIATGHYSRL EEKNGYTYIVAGDDDKKDQSYFLWRLGQEVLKRCIFPLGDYTKVKVREYLAEKGYEAKSK EGESMEVCFIKGDYRDFLREQCPEMDSEIGPGWFVNSEGVKLGQHKGAPYYTIGQRKGLE 
IALGKPAYVLKINPQKNTVMLGDADQLNTEYMLAEQDKIVDEQELFGCENLTVRIRYRSR PIPCRVKRLEDGRLLVRFLETASAIAPGQSAVFYDGRRVLGGAFIASQRGIGLVIIENEG L >gi|225935361|gb|ACGA01000031.1| GENE 128 172476 - 173339 421 287 aa, chain + ## HITS:1 COG:no KEGG:BT_3886 NR:ns ## KEGG: BT_3886 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 286 1 286 292 503 87.0 1e-141 MANPRLPGISENEQALLYAKLNEYNRGRASFKEVGVYLVVLPRTGKPNYSLWLYSPLPEK QSILYIHDLSPDINESMRMASTMFYYSKRCIILVDYNEKRMQSNGDDLIFFGKYRGHFLH EILKIDPAYLSWVAYKFIPKIPKQERFVKIAQAYHSIHLDIMIRKSREKRSSSRYLGELG EKLTDFKLKVTRVRLEDDPYKTRVNGTTPQFFVKQVLTLTDASGNLVTMSIPSKNPSAVS CTLSGIEHEYRLGDIIYVASAKVSRRYESYGSKYTRLSHVKFASLNV >gi|225935361|gb|ACGA01000031.1| GENE 129 173329 - 174336 896 335 aa, chain - ## HITS:1 COG:CAC2806 KEGG:ns NR:ns ## COG: CAC2806 COG1409 # Protein_GI_number: 15896061 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 21 302 24 303 317 162 38.0 1e-39 MIRLLKIYLTFAFLLVTTFCMAQKSELKFSKDGKFKIVQFTDVHFKCGNRASDIALERIN QVLDDERPDLVIFTGDVVYSAPADSGMLQVLEPVVKRKLPFVVTFGNHDNEQGMTREQLY DIIRQVPGNLLPDRGTVLSPDYVLTVKSSSNLKKDAALLYCMDSHSYSPLKDVKGYAWLT FDQINWYRQQSAAYKVQNGGQPLPALAFFHIPLPEYNEAARSENAILRGTRMEEACAPKL NTGMFAAMKEAGDVMGMFVGHDHDNDYAVMWKDILLAYGRFTGGNTEYNHLPNGARIIVL DEGTRTFTSWIRQKDGVVDKISYPASFVKDDWTKR >gi|225935361|gb|ACGA01000031.1| GENE 130 174356 - 174838 500 160 aa, chain - ## HITS:1 COG:RSc1644 KEGG:ns NR:ns ## COG: RSc1644 COG0245 # Protein_GI_number: 17546363 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Ralstonia solanacearum # 1 158 4 161 168 179 56.0 2e-45 MKIRVGFGFDVHQLVEGRELWLGGILLEHTKGLLGHSDADVLLHAVCDALLGAANMRDIG YHFPDTAGEFKNIDSKILLKKTVELIATKGYKVGNIDATICAERPKLKAHIPLMQETMAT VMGIDMGDISIKATTTEKLGFTGREEGISAYATVLIEKIA >gi|225935361|gb|ACGA01000031.1| GENE 131 174877 - 175491 635 204 aa, chain - ## HITS:1 COG:PH0643 KEGG:ns NR:ns ## COG: PH0643 
COG0179 # Protein_GI_number: 14590533 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Pyrococcus horikoshii # 2 201 22 220 230 153 40.0 2e-37 MKIIAVGMNYAQHNKELGHTQVNTEPVIFMKPDSAILKDGKPFFIPDFSKEIHYETELVV RINRLGKNIAPRFANRYYDAVTVGIDFTARDLQRKFREQGNPWELCKGFDSSAAIGTFVP VEHYKDIQNLDFHLLIDGKEVQRGCTADMLFKIDDIIAYVSQFVTLKIGDLLFTGTLVGV GPVSIGQRLQGYLEGEKLLDFYIR >gi|225935361|gb|ACGA01000031.1| GENE 132 175491 - 176141 643 216 aa, chain - ## HITS:1 COG:lin2178 KEGG:ns NR:ns ## COG: lin2178 COG2344 # Protein_GI_number: 16801243 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Listeria innocua # 7 207 3 203 215 139 35.0 3e-33 MSTYIRKEADKVPEPTLRRLPWYLSNIKLMKEKGEQYVSSTQISKEINIDASQIAKDLSY VNISGRTRVGYNIDALIEVLESFLGFTNMHKAFLFGVGSLGAALLRDSGLHHFGLEIVAA FDVNPELVGKDLNGIPIFHSDDFEAKMKEYDVNIGVLTVPINIAQEITDKMVDGGIKAVW NFTPFRIRVPENIVVQNTSLYAHLAVMFNRLNFNEK >gi|225935361|gb|ACGA01000031.1| GENE 133 176307 - 176657 384 116 aa, chain + ## HITS:1 COG:alr3795 KEGG:ns NR:ns ## COG: alr3795 COG0023 # Protein_GI_number: 17231287 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Nostoc sp. 
PCC 7120 # 33 116 33 115 115 68 48.0 3e-12 MKNNDWKDRLNVVYSTNPDFGYEMDNDEEQVTLDKDKQNLRVSIDKKNRGGKVVTLITGF VGTDNDLKELGKLLKSKCGVGGSAKDGEIMVQGDFKTKIIDLLIKEGYTKTKGIGG >gi|225935361|gb|ACGA01000031.1| GENE 134 176675 - 177184 495 169 aa, chain + ## HITS:1 COG:TP0100 KEGG:ns NR:ns ## COG: TP0100 COG0526 # Protein_GI_number: 15639094 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Treponema pallidum # 51 166 83 199 200 72 27.0 4e-13 MRKVILFCAATLFSLLSFAQESDADIVKVGDNMPAFTLHSTANGTVNSTDLKGKVVLINI FATWCGPCQSELAEVQKILWPKYKNNKDFCMLVIGREHTDDQLAEYNKRKGFTFPLYPDP KREVTGKFATKMIPRSYLIDKEGKVISATTGYEDGAIDTLIKQIDKALK >gi|225935361|gb|ACGA01000031.1| GENE 135 177306 - 178298 1317 330 aa, chain - ## HITS:1 COG:BS_tsf KEGG:ns NR:ns ## COG: BS_tsf COG0264 # Protein_GI_number: 16078713 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor Ts # Organism: Bacillus subtilis # 1 225 1 233 293 118 40.0 1e-26 MAVTMADITKLRKMTGAGMMDCKNALTEADGDYDKAMEIIRKKGQAVAAKRSEREASEGC VLAKTTGDYAVVIALKCETDFVAQNADFVKLTQDILDLAVANKCKTLDEVKALPMGNGTV QDAVTDRSGITGEKMELDGYMTVEGASTAVYNHMNRNGLCTIVAFNKNVDEQLAKQVAMQ IAAMNPIAIDEDGVSEEVKQKEIEVAIEKTKAEQVQKAVEAALKKANINPAHVDSEDHME SNMAKGWITAEDVAKAKEIIATVSAEKAAHLPEQMIQNIAKGRLAKFLKEVCLLNQEDIM DAKKTVREVLAAADPELKIVDFKRFTLKAE >gi|225935361|gb|ACGA01000031.1| GENE 136 178420 - 179256 1397 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237719097|ref|ZP_04549578.1| 30S ribosomal protein S2 [Bacteroides sp. 
2_2_4] # 1 278 1 278 278 542 100 1e-153 MSRTNFDTLLEAGCHFGHLKRKWNPAMAPYIFMERNGIHIIDLHKTVAKVDEAAEALKQI AKSGKKVLFVATKKQAKQVVAEKAQSVNMPYVIERWPGGMLTNFPTIRKAVKKMATIDKL TNDGTYSNLSKREVLQISRQRAKLDKTLGSIADLTRLPSALFVIDVMKENIAVREANRLG IPVFGIVDTNSDPSNVDFVIPANDDATKSIEVILDACCAAMIEGLEERKAEKIDMEAAGE APANKGKKKSVKARLDKSDEEAINAAKAAAFIKEDEEA >gi|225935361|gb|ACGA01000031.1| GENE 137 179378 - 179764 640 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 [Bacteroides ovatus ATCC 8483] # 1 128 1 128 128 251 100 3e-65 MEVVNALGRRKRAIARIFVSEGTGKITINKRDLAEYFPSTILQYVVKQPLNKLGAAEKYD IKVNLCGGGFTGQSQALRLAIARALVKMNAEDKAALRAEGFMTRDPRSVERKKPGQPKAR RRFQFSKR >gi|225935361|gb|ACGA01000031.1| GENE 138 179771 - 180232 795 153 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883129|ref|ZP_02064132.1| hypothetical protein BACOVA_01098 [Bacteroides ovatus ATCC 8483] # 1 153 1 153 153 310 99 3e-83 MDTLSYKTISANKATVTKEWVIVDATDQTLGRLGAKVAKLLRGKYKPNFTPHVDCGDNVI IINADKVKLTGNKWNDRVYLSYTGYPGGQREMTPARLITKPNGEERLLKKVVKGMLPKNI LGAKLLNNLYVYAGSEHKQAAQNPKMIDINSYK >gi|225935361|gb|ACGA01000031.1| GENE 139 180436 - 180729 132 97 aa, chain - ## HITS:1 COG:no KEGG:Phep_2828 NR:ns ## KEGG: Phep_2828 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 13 90 468 544 549 67 47.0 1e-10 MTFIDDSTPEGKFRTVLYWERGFELAFEGQRKYDLIRWGVLGKALKLFGEISSVNQKENK PYPAYRNFTEGKHELFPIPLKEIQSNPKLNGMNNNGY >gi|225935361|gb|ACGA01000031.1| GENE 140 180726 - 182129 1246 467 aa, chain - ## HITS:1 COG:no KEGG:Phep_2828 NR:ns ## KEGG: Phep_2828 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 454 10 454 549 253 35.0 8e-66 MKKIHLLYAVFALLATGFSSCEDLLTEEPNSKYDRDRYFDSEDKAEMAVMGIYSSLSDFN HYGWYEMASPASDDTYYTARTQSDNQVHDIAHYQLNSTNTWIESIWKLKYEGIDRANLTI DGICGMTGYAENTRLKALEAEARFLRAFLAFDLVRYWGDVPFKTSYSSSYESAFGERVDR EVIYDEIISDLTFAKNNLDWATASSSPERVTQGAARALLMRVYLQRAGYSLQSNGQLKRP 
EDSKRMEYFDAVIEEWEAFEKKGYHDFYDSGYEALFKSYSQGVLNNKESLWEIAFYHSQG RRNGGAWGIYNGPQVAEPTGVSASEANQYMGRANGFFIVVPEWRNFFEASDKRRDVAICT YRYTWNGTKKEHVKEERSAGSWYVGKWRREWMPKESWNKNINYADVNYCPLRYADVVLMA AEAYNETGADRQKAWSLLNSVRTRAEATSITETNYVEMMSARKKHTT >gi|225935361|gb|ACGA01000031.1| GENE 141 182151 - 185363 3072 1070 aa, chain - ## HITS:1 COG:no KEGG:Phep_2829 NR:ns ## KEGG: Phep_2829 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 3 1070 2 1053 1053 833 43.0 0 MKSRIICILLLLVGVSGIYAQSLTVTGKVVDNEGLEVIGGNVTVKGKQNTGTITDINGKY TITVSDPQKDVLVFSFIGLENMEVAVKGRKQIDVTMKAASVLLDEVVAIGYATVKRKDLT GSVASVRSDDLLKVPSSDVTQALAGRMAGVQIIQTDGQPGATMSVRVRGGISITQSNEPL YIIDGFPTEDGMSSLDPADIESIDVLKDASATAIYGARGANGVVVITTKSGAKSEGKATL TFDSYVGVRTLAKRLDVLSVEEFVLADYERTLGDATDPEESMRSWQNRYGGFVDLHENYG NRKGIDWLDRTMGRTTVTQNYRVGVNGGNDKLNYNMSYGYFKDEGAMVYSGSDKHNIALS VKSEVNKRLSVTGRINFDYLKVYGAGVAGNGTNEGGSNVDAKFNKMVQILQYRPTIGIRG SDSDLLAGEDPVLSDADGNVMQNPLIAAAEEKDNKETRTLQANGGLTFKIIKGLTFRNNT GMRYQLYRRELFYGDQSIMGRRNGIYGSIRNTETGSFQTSNVLTYDKRFQKKHKVVVQLG QEFVKRWTRVLESGVSGLPTDEFILGDMSLGTPSVASSDENYDDNLLSFFARLNYDFTDK YLFSATFRADGSSKFGKNNKWGYFPAVSAAWRVGEEDFIKKLNVFSDLKFRIGYGLAGNN RIGSYNSLALMSSITTAMGDQLTPGYASKQIPNPDLKWEANKTFNMGVDLGFLNQRITIS PEFYINRSSNLLLNAQLPYSSGYQSMLINAGETKNVGVELTVNTVNFSTKKFSWNTTLTL SHNKNSVKALTGEAVQLYEAKFGFNQNTHRIAVGEPLGQFYGYITEGLYQVDDFNYDAST QAYTLKDGIPYHGDKGQIRPGMWKFKNLTGDDNVIDENDKTVIGNAQPKFYGGLNNSFTY KGFDLSIFLTFSYGNEVLNATKMVASKVGSQNYNALDVMSSSNRWMTINSEGQKVTDPGE LAALNIGKTVAAYYDAQQGDNYIHSWAVEDASYLKLSNVTLGYTFPKNLIARVGLKNLRL YATGNNLLTWTKYSGFDPEVSTMKSGLTPGVDFGAYPLSRSFIFGLNVAF >gi|225935361|gb|ACGA01000031.1| GENE 142 185407 - 187953 1850 848 aa, chain - ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 840 36 868 872 374 30.0 1e-101 MHTTFRLSVFLLFIVQLFCSCTSDSLPEPDSTQPETPAGIPDAYQDKIRTQPYPKADNEL YLNPAPLIVPQAMKTGERLQFSLSRTEDFSSSETLLSEPQEWCMYNLHRRLEVGTWYWRF 
RSTNLNGTTPGEWSAIYRFEVKNDTPEFVTPPFQTFLANAPRLHPRIYCFLDDRIGEARN RVTSHPEYTELQSRASQALKAEYTGMTDLYNRAEELRQHATYLYQAYHLTQKEIYAEKLR QLLEALIVAPPADGQLFASNFTASNIAWCLVAAYDLLYNNLSASDRTAAEELMMRVARYY YKVNCGFQENHLFDNHFWQQNMRILFQVALSLYDKPAYSSEVLPMMEYYYELWTARAPAS GFNRDGIWHNGTGYFSANILTLAYMPSLLSFISQYDYLSHPWYQNVGRSLVYTCPPGSKS NGFGDNSEKGSEPNRLVAAFADYLACETGDPYAGWYAGECRELLRRDYELRLYRMCTEQD YNTAFPAGADKMVWYKDAGEVAMHSAPEDVEKDLALSFRSSTFGSGSHTTASQNAFNLLY KGVDVYRSTGYYQNFSDAHNLMSYRHTRAHNSLLVNGIGQPYSTEGYGSVMRAMGGQHIS YCLGDASHAYRSISNDPMWVDYFKQAGIEQTPENGFGATPLTKYRRHVLMLHPHTVIVYD ELEASEAVRWEWLLHSPTEFKIDAVKKTLSTDNKTKGWVAVTQLFGGHVFTLSQTDRFVV PPAITGAEYPNQWHLTARVDGCSATRFLAIIQVGDEAVSIINRDGDTFNVGDWTIKAVLD ASKAPELTVSHRTEQAVFSYGTDNPALNGNFYSRQYTGSSLLYDEIDGAYQVVEMTDRSP ISTRVVNQ >gi|225935361|gb|ACGA01000031.1| GENE 143 188081 - 189631 1537 516 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171980|ref|ZP_05758392.1| ## NR: gi|260171980|ref|ZP_05758392.1| hypothetical protein BacD2_08955 [Bacteroides sp. D2] # 10 516 1 507 507 924 100.0 0 MKKNLFYSLMALFVVLFASCGQEEIVSDNGTEINAPVTISVQAPVNNVFSRAVTIPDGYT MQCIMQLLNADGDKIGDQVTKPVTDGKVSFTISVDEQKEVSKALFWAEYVPESGAANKVY NTADLRAVGYNTASFDLTNDALMAASDAFCGKLETIGNASVTLKRPFANVSVKPKNPEVA AAANKLEITYNALSGYDILEGKCSATTPVTYTNANFASADGNWFANFFFAPSNVGKFTEE ITMALSGGYSKEIKIPANTLPLDANMQIMAKFEIGDGNFDIEVGVDPDYEALEMKVGSYI NAEGKVVRDAADAVGIVFKMEAIGDDVPANYPVALQGKTIAGYAVAIENVATGRQTLNTE LLTTLTETAAGMTNGTQTTDVLLTGLGDVTFKTTYEKWVGEHASASENLSAWYIPTLSQL SAFMGTLFTMKEVPATGSEDFKNIPEFEFVNGKMFDRETIATVNYASSTINNQGNVSGVR INVADGVVTNAQEAGISVKTALNQQALCRPMITIFK >gi|225935361|gb|ACGA01000031.1| GENE 144 189680 - 190690 1240 336 aa, chain - ## HITS:1 COG:no KEGG:BT_3148 NR:ns ## KEGG: BT_3148 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 105 327 202 424 430 80 29.0 8e-14 MRRNLYILAQCTFLLFITLLVNGCSLHEEPEMTPEGELGVDPTAVTLNLNLAMNLSLAER APVTITRASETNYLRRFIVEAYLDRQVAARQTVYEEDFNRASLSVSMKLHARNYRILVWA DYVNAETPEQGLVYDAGNLAFILPAGKYIGNSRYKDVFAASTMADLTSFRNHWGAETSLD 
VELYRPVARYELVAKDVATFLNKLSTGGLKGESFTARVKYSDYLPTGYNLWDDVPKNSLM YMEYKVAFERPADGTKELILGFDYVLTDAGETVSIPVELEILNEKNEVLAHTAFRIPCER GKNTTVRGNFLTSDANGGIGIDPDYDGDLEVDLGEL >gi|225935361|gb|ACGA01000031.1| GENE 145 190728 - 191465 555 245 aa, chain - ## HITS:1 COG:no KEGG:BDI_3526 NR:ns ## KEGG: BDI_3526 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 87 243 22 182 185 113 38.0 7e-24 MRHSTFNMRTICLLAALSGITVLSAQTPGKPRVQTSAAPEETTVYKPGGKKQETAVATSS KKEEDKRQSGKENTGKKAAAFDSQRYLALKTNVIYDACALLNLAVEMQVHKKITVELPLT CSLWDLGDKRGVRTVALQPEARWWIGNETGRGHFVGLHAHVAWFNVKWNDDRYQDTDRPL LGAGISYGYKLPLSKHWGAEFNLGVGYANMKYNTYYNVDNGALLDTRVRHYWGITRVGAS LVYRF >gi|225935361|gb|ACGA01000031.1| GENE 146 191492 - 194095 2092 867 aa, chain - ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 867 1 872 872 805 47.0 0 MKKVLLISFVILSVTAQMTHAQKQAVIKLTEKTLMHEMRATPYPLDKAVVNDRAVSFQWP LRSDMNSQDSPLDGFEHKVKKVDKTKVTYRLRYSQDAGLKSGVVQVETRWPFYNPEQPLA PGVWYWQFGYVENGQVTWGSTQQVTVEDRPGKFCPPSLKTVLAKLPADHPRVWIMKNEWK DFINHSKQKAERQWYLERADQVIQTPMKSVKDINVSQVKNLKNEMQINSYLTRESRRIID AEEGNTEALIRAWLLTQDTKYADEAIKRVFIMADWDKDKNVKGDFNASSLLSLCSMAYDS FYDRLNTSQKKALLEAIKNKGGEMYENFNNRMENHIADNHVWQMTLRILTMAAFSVYGDL PEANTWVDYCYNVWLARFPGLNKDGGWHNGDSYFTVNTRTLVEVPYYYSKLTGYDFFSDP WYQGNIMYTIFQQPPFSKSGGNGSSHQNVGRPNSIRIGYLDALARLTGNTYAADFVRRTL KVEPDYMKKALLNKPGDLAWFRLQCDKPLPEGEGLTALPAGYVFPATGLASFQTNWDRVG GNAMWSFRSSPYGSTSHALANQNAFNTFYGGKPLFYSSGHHIEFTDVHSMLCHRATRAHN TILVNGMGQRIGTEGYGWIPRYYASEKIGYVLGDASNAYGKVISPLWLTRGEQSEVHYTP ENGWDENHVKTFRRHIVNLGKTGLIFIYDELVADEPVNWSYLLHTTENPMTVDKSNHRFV HIQATNRGGASDAYLFSTGTLQTDTTSRFFYPAVNWLRADDKGVFKKYPNHWHFTATSEK AQVYRFATVINTHALKYPAKDPEILSDGRIKVGGWLISVNLKSDGAPSFFIRSTQEKVNI TYKGEATVINEDGYETVMRDTVPELEI >gi|225935361|gb|ACGA01000031.1| GENE 147 194896 - 198993 2562 1365 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 812 1236 274 668 676 149 27.0 3e-35 MKDRHLLFKYLFGLTVCILFFPLAVSAADSFIHFKRLSVDDGLSQNTVLALTQDHNNKLW VGTIDGLNWYEGSRFVSYYKAPDDTTSLANNHVYSLHTDSKGTVWVGTQVGLSRYNIVGN NFTNYSSPDNQPMQVLAIGEPEEGDRLLLATNIGLVVFDKKTGRMKVLPELAGKTIYSVC RMNDGFLLGTSEGVYFYYVRNENVTRLLLQLKGETISDMLYDDKTGNCWLASLTNGVYCV DNNFQIKHHYNKQNTPAYFLTNSVRTLSGDDKGRVWIGTMEGLLILEPETGTFRICRFSP EDPTTLGHNSIRSILKDNQGGMWIGTFYGGLNYYHPLAPAFGRLQHSAWRNSLSDNTVSC IAEDPDNGNLWIGTNDGGLNYYNRKTGVFSYYRTGTSANALKSDNIKCIWTDKDGSVYIG THGGGMSRLRHQSGRIETYSFPHSTSLTNSCYSLLDGTDGTLWVGGMSGLYLFDKQTSEL SQHPLAKKHKKLENVLIYTLFRDSKGRIWIGTEESLFLYAGGKLEELHLSSSAYLHGLIQ AFCVQEDSRHEIWIGSSTGLYCYKEGVPTAWKHYTMKDGLPNDFIYNILEDERGRLWLTT NKGLACFNTEEGTFLNYSKQDGLPHDQFNYFGACKAHDGTFYLGSLGGVAYFKPYELGDN PYSPDAVVTGAVVLNQVITDMKSERVRYYQDEQGRMLGMSFPSDQKLFNIRFAVINYLAG KRNQFVYKLEGFETEWNYSRHVSFARASYSNLPPGEYVFKVKACNNNGKWSEATTEFFVH IIPMWYQTWWAKTLFIFFSLGLLVFVIYFFIARAKMKMQVRIEQIERNKIEEISQEKVRF YMNMSHELRTPLSLILAPLEELLGQSNLKGTPVQQKLDYVYKNGRKLLHIVNQLLDFRKA EAGAMPIHVAQVNVEELLQDAFALFKENAQKRAISFHIKSDLEGRLFPADRTYVETILMN LLSNAFKFTPDGGSISLSLWTEGDTYGFTVRDSGIGMSPEQLTRIFERFYQVDGQRKGTG IGLSLVKMLVEKHHGTITVASEPAQYTEFKVTLPADMAAFTEKERELPAHEVETSASLRE LPVADEYFSGDASAVVSEEKSDDDQIEVGSEEERPTILLVDDNKEMVDYLKDNFRQNYVT LTAGNGEEALAIMKEHRVDIVLSDVMMPGIDGIKLCQLIKRNLQTCHIPVLLLSAKGSVD AQTEGIQAGADDYIAKPFSIHLLKGKIANQLKARQRLKHYYSNTIDIDTAKMTSNNLDEE FMSKAIQVVEENISNEDFTSDELASQLCMSRSSLYLKMNSISGEPPANFIRRIRFNKACK LLLEGRYSISEISGMVGFGSSSYFSTSFKKYVGCLPSEYVKQHTK >gi|225935361|gb|ACGA01000031.1| GENE 148 199191 - 200399 1144 402 aa, chain - ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 396 8 400 402 381 51.0 1e-104 MKNKLLLAVGSMALLTACDASKGNEMGWFDHAVKTAGHQLLYMAEQLKNEPDTACFPRSI KEGKYRLEHPTDWTSGFYPGSMWLAYELTGDEALAKEARKYTNRLQDMQYYTGNHDLGFM 
MFCSYGQGIRLKPEPTDSLILIHSSESLCSRFRPEVGLIRSWDFGDWSYPVIIDNMMNLE MLFWASEQTNNPKYREIAISHANKTLKNHFREDMTSYHVVSYLADSGEVESKGTFQGYAD SSAWARGQAWGVYGYTMCYRFTKQQNYLDAAHKIARFIIDHRPSEYDYVPYWDYDAPKIP NEPRDASAAAVTASALLELSGYGDKKQGEEYFRYAEHILKQLSSDDYLAQEGENHGFVLL HSVGSFPHDSEIDTPLNYADYYYLEALKRYKDLKEKSENPSY >gi|225935361|gb|ACGA01000031.1| GENE 149 200657 - 201148 526 163 aa, chain - ## HITS:1 COG:no KEGG:BT_3874 NR:ns ## KEGG: BT_3874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 318 96.0 6e-86 MKKLYFFTMLSIMLLAVTGATAQKKTKFKAADLKGIWQLCHYVSESPGVPGALKPSNTFK VLSDDGQIVNFTIIPGADAIITGYGTYKQLTDDSYKESIEKNIHLPMLDNQDNILEFEIK DNDYLHLKYFIKNDLNGNELNTWYYETWKRVEMPAKFPEDIVR >gi|225935361|gb|ACGA01000031.1| GENE 150 201290 - 202693 1475 467 aa, chain - ## HITS:1 COG:sll0495 KEGG:ns NR:ns ## COG: sll0495 COG0017 # Protein_GI_number: 16332045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Synechocystis # 4 467 52 513 513 558 56.0 1e-159 MEKIGRTKIVDLLKRTDIGAMVNVKGWVRTRRGSKQVNFIALNDGSTINNLQVVVDLANF DEEMLKLITTGACISVNGEMVESVGSGQKVEVQAREIEVLGTCDNTYPLQKKGHSMEFLR EIAHLRPRTNTFGAVFRIRHNMAIAIHKFFHEKGFFYFHTPIITASDCEGAGQMFQVTTM NLYDLKKDERGSISYEDDFFGKQASLTVSGQLEGELAATALGAIYTFGPTFRAENSNTPR HLAEFWMIEPEVAFNDITDNMDLAEEFIKYCVKWALDNCADDVKFLNDMFDKGLIERLQG VLKDDFVCLPYTDGIKILEEAVAKGHKFEFPVYWGVDLASEHERFLVEEHFKRPVILTDY PKEIKAFYMKQNEDGKTVRAMDVLFPKIGEIIGGSEREADYNKLMTRIEEMHIPMKDMWW YLDTRKFGTCPHSGFGLGFERLLLFVTGMANIRDVIPFPRTPRNADF >gi|225935361|gb|ACGA01000031.1| GENE 151 202798 - 204246 1899 482 aa, chain - ## HITS:1 COG:SPy0369 KEGG:ns NR:ns ## COG: SPy0369 COG1187 # Protein_GI_number: 15674518 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pyogenes M1 GAS # 247 476 1 232 240 167 39.0 3e-41 
MSTENEEWRENSFNEENTGAGRDGNRSYNREGGERPYRPSYNREGGDRPYRPRFNANNEG GERPQRSYGDRSYGDRPQRPSYNREGGDRPYRPRFNNNNEGGERPQRPYNREGGSYDRPQ RPSYNREGGSYDRPQRPRFNSGEGGDRPYRPRFNNGEGGDRPQRPSYNREGGDRPYRPRF NNGEGGGGYRSNNGGGGGGYRPRYNNDRSQGGYRPRPRTGDYDPNAKYSVKKQIEYKEQF VDPNEPIRLNKFLANAGVCSRREADEFITAGVVSVNGEVVTELGTKIKRSDVVKFHDEPV SIERKVYVLLNKPKDTVTTSDDPQERRTVMDLVKGACNERIYPVGRLDRNTTGVLLLTND GDLASKLTHPKFLKKKIYHVHLDKNLTKADMDQIAAGIQLEDGEIHADAISYTDDFKKDQ VGIEIHSGKNRIVRRIFESLGYKVVKLDRVFFAGLTKKGLRRGDWRYLSEAEVNYLRMGS FE >gi|225935361|gb|ACGA01000031.1| GENE 152 204320 - 205666 1523 448 aa, chain - ## HITS:1 COG:PA2629 KEGG:ns NR:ns ## COG: PA2629 COG0015 # Protein_GI_number: 15597825 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Pseudomonas aeruginosa # 1 447 1 447 456 453 50.0 1e-127 MELDVLTAISPIDGRYRGKTKALAAYFSEFALIKYRVQVEVEYFITLCELPLPQLKGIDS NVYETLRNIYRNFSEADAQRIKDIESVTNHDVKAVEYFLKEEFDKMGGMDDYKEFIHFGL TSQDINNTSVPLSIKEALEQVYYPLIEELIAQLKTYATEWANIPMLAKTHGQPASPTRLG KEVMVFVYRLERQLAMLKACPITAKFGGATGNYNAHHVAYPQYDWKQFGNRFVAEKLGLE REEYTTQISNYDNLSAVFDSMKRINTIMVDMNRDFWQYISMEYFKQKIKAGEVGSSAMPH KVNPIDFENAEGNLGIATSILEHLAVKLPVSRLQRDLTDSTVLRNVGVPFGHIVIAIQSS LKGLRKLLLNEPAIYRDLDNCWSVVAEAIQTILRREAYPHPYEALKALTRTNQAITESSI KEFIEELNVSEDIKKELRAITPHTYTGL >gi|225935361|gb|ACGA01000031.1| GENE 153 206180 - 206725 471 181 aa, chain - ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. 
PCC 7120 # 2 180 10 189 192 161 46.0 5e-40 MTEVEKMRSSQLADMSAPELQVRFEHAKKLLARMRGMSTYDEGYRELLDKLVPGIPETSI ICPPFHCDHGDGIKLGGHVFVNANCTFLDGGYITIGAHTLVGPCVQIYTPHHPMDYQERR GSKEYAYPVTIGEDCWIGGGAIICPGVTIGNRCVIGAGSVVTKDIPDDSVAVGNPARVIR K >gi|225935361|gb|ACGA01000031.1| GENE 154 206729 - 207502 599 257 aa, chain - ## HITS:1 COG:PA3539 KEGG:ns NR:ns ## COG: PA3539 COG3022 # Protein_GI_number: 15598735 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 251 1 252 259 147 33.0 1e-35 MLVLLSCAKTMSETSKVKIPLKTVPRFQKEASGIALQMSQFSVDELERLLRVNAKIAVEN YKRYQAFHAEGAPELPALLAYTGIVFKRLNAKDFSKEEFEYAQEHLRLTSFCYGLLRPLD VIRSYRLEGDVVLPELGNQTMFSYWQSRLTDVFIEDIKKAGGILCNLASDEMKSLFDWRR VEKEVHVITPEFQVWKNGKLASIVIYIKMSRGEMTRFILKNRIENPEDLKSFFWEGFEFN ESLSNEKKFVFTNGKEI >gi|225935361|gb|ACGA01000031.1| GENE 155 207641 - 209623 1510 660 aa, chain + ## HITS:1 COG:SMb21160 KEGG:ns NR:ns ## COG: SMb21160 COG3525 # Protein_GI_number: 16264574 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Sinorhizobium meliloti # 92 480 201 622 639 108 24.0 3e-23 MTHTLIPLKVLCLLTIFCLYGSLSYASVNSKPFVVPELKQWTGKDGNFTPKTNAKIVCTS ANPELLRIAQMFADDYQQMFGKTLSVTQGKATPGDFILSLSADKKLGEEGYEIKITDRIT TSAPTPTGLYWSTRTLLQIAEQSQEHSFPKGIIRDYPDYSIRGFMIDCGRKFIPMSYLQD LVKIMAYYKMNTLQVHLNDNGFKQYFDNNWDKTYAAFRLESETYPGLTARDGSYSKKEFI DFQKQAATNFVEIIPEIDIPAHSLAFTHYKPEIGSKEYGMDHLDLFKPETYQFADDLFKE YLKGDDPVFVGKRVHIGTDEYSNAKKEVVEKFRAFTDHYIRLVEGFGKQAVIWGALTHAK GDTPVKSENIIMNAWYNGYADPATMIKDGYQLISIPDEMVYIVPLAGYYQDYLNEAFLYK EWTPAHIGKAVFEEKHPAILGGMFAIWNDHAGNGISVKDIHHRVFPALQTLAVKTWTGKE TSLPFEVYNEKRSAISEAPGVNQLGRIGKSPALVYERSTVAPGSTSTYPEIGYNYTVSFD ITGAPEKSGTELFRSPNAVFYLADPIRGMMGFARDGYLNTFPYKVNPGEKATIQIEGDHR STTLRVNGKVVEEMNIQKCYFNAGKDSMSYIRTLVFPLEKAGNFNSRIENLKVHNYRVSK >gi|225935361|gb|ACGA01000031.1| GENE 156 209725 - 212946 3814 1073 aa, chain - ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # 
Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 4 1055 4 1051 1070 1143 53.0 0 MEKEIKKVLVLGSGALKIGQAGEFDYSGSQALKALKEEGISSVLVNPNIATIQTSEGIAD KVYFLPVNTYFVEEIIKKERPDGILLAFGGQTALNCGAELYTQGILDKYGVKVLGTSVEA IMYTEDRDLFVKKLNEIEMKTPVSQAVENMEDAIAAARRIGYPVMVRSAYALGGLGSGIC ADEEEFLKLAESSFAFSKQILVEESLKGWKEIEFEVIRDANDHCFTVASMENFDPLGIHT GESIVVAPTCSLDDKELTLLKELSTKCIRHLGIVGECNIQYAFNSDTDDYRVIEVNARLS RSSALASKATGYPLAFVAAKVALGYTLDQIGEMGTPNSAYSAPELDYYICKIPRWDLTKF AGVSREIGSSMKSVGEIMSIGRSFEEIIQKGLRMIGQGMHGFVGNDELHFDDLDKELSRP TDLRIFSIAQAMEEGYTIERIHDLTKIDPWFLGKLKNIVDYKAKLSAYNKVEDIPADVMR EAKVLGFSDFQIARFVLNPTGNMEKENLAVRAHRKALGILPAVKRINTIASEHPELTNYL YMTYAVEGYDVNYYKNEKSVVVLGSGAYRIGSSVEFDWCSVNAVQTARKLGYKSIMINYN PETVSTDYDMCDRLYFDELSFERVLDVIDLEQPRGVIVSVGGQIPNNLAMKLYRQSVPVL GTSPISIDRAENRNKFSAMLDQLGIDQPAWMELTSLEEVKGFVEKVGYPVLVRPSYVLSG AAMNVCYDDEELENFLKMAAEVSKEYPVVVSQFLENTKEIEFDAVAQNGEVVEYAISEHV EFAGVHSGDATLVFPAQKIYFATARRIKKISRQIAKELNISGPFNIQFLARNNEVKVIEC NLRASRSFPFVSKVLKRNFIETATRIMLDAPYSRPDKSAFDIDWIGVKASQFSFSRLHKA DPVLGVDMSSTGEVGCIGDDFSEALLNAMIATGFKIPERAVMFSSGAMKSKVDLLDASRM LFAKGYQIYATAGTAAFLNAHGVDATPVYWPDEKPGAENNVMKMIADHKFDLIVNIPKNH SKRELTNGYRIRRGAIDHNIPLITNARLASAFIEAFCELKLGDIQIKSWQEYK >gi|225935361|gb|ACGA01000031.1| GENE 157 213221 - 213808 443 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260171994|ref|ZP_05758406.1| ## NR: gi|260171994|ref|ZP_05758406.1| hypothetical protein BacD2_09025 [Bacteroides sp. 
D2] # 1 195 1 195 195 379 100.0 1e-104 MKTGKLLGMLLFVLGIFSLMGCNDEMTDEEEGWYDSIPDGNFFSRGNTVRFYYIDGNGNS LINPEDKNTLPVSWREELVNPIERTEDYNAEHGNYNGNHNWVVYDEEEGLYYCTVSAYGD ERQSTYSFPIYVNGEKDAMQITYKYTDRDVIGGKYWAKMISWKYNGVHVYSDDDEPYKKV FIKKANGKTTVSLTR >gi|225935361|gb|ACGA01000031.1| GENE 158 214012 - 215106 1346 364 aa, chain + ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 7 348 6 340 341 417 61.0 1e-116 MGKEKIILTGDRPTGRLHIGHYVGSLKRRVDLQNAGDYSKMFIFIADSQALTDNIDNPEK VRQNVIEVALDYLACGIDPSKATIFIQSQIPELCELSFYYMDLVSVSRLQRNPTVKSEIQ MRNFEASIPVGFFTYPISQAADITAFRATTVPVGEDQEPMLEQAREIVRRFNYIYGETLV EPEILLPDNAACLRLPGTDGKAKMSKSLGNCIYLSEEPEEIQKKIMSMYTDPGHLRVQDP GKIEGNTVFTYLDAFCLPEHFERYLPDYPNLAELKAHYQRGGLGDVKVKRFLNSIMQEIL EPIRNRRKEFSKDIPAIYDMLQQGCEVARAAAAETLADVKKAMKINYFDDKELIEEQVKR FSQE >gi|225935361|gb|ACGA01000031.1| GENE 159 215197 - 217539 1552 780 aa, chain - ## HITS:1 COG:RSc0786 KEGG:ns NR:ns ## COG: RSc0786 COG5373 # Protein_GI_number: 17545505 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 139 467 165 487 938 132 29.0 3e-30 MDDFNTVYALLVLVMLFVLWSRLDNRFGKIEKELNEIKKRMDDYLKLQQKTATEEHKPKE EILVKETEDVPLMSQMTEHVTVVKAAQQGTVVQEKQESVIETVREALEETPEKTAETELE DVCVEAVPRQKKQVNYEKFIGENLFGKIGILIFVIGVGFFVKYAIDKNWINETFRTVLGF LTGAVLLAVAERLQKKYRTFSSLLAGGAFAVFYLTVAIAFHYYHIFSQTMAFIILIGVTV FMSVLSVVYNRRELAIISLVGGFLAPFIVSSGEGSYLVLFTYVSILNLGMFGLSIYKKWG ELPIISFVFTWLIMGIFLLFSYTSSSTVISGHLFLFTTLFYFIFLLPVFSILRGEDMRTM SRGLVFVIITNNFIYLLSGALFLRNMGWPFKASGLLSLFIALVNLGLVLWLWKSRKDYKF LVYTTLGLVLTFVSITVPIQLDGNYITLVWASEMVLLLWLYIKSRIRVYEYAAKILVGLT FISYLMDIYNVVMHEHHAVSTIFLNSSFATSLFVGLATGAFALLMGYYRPFFSTARQLKY GFWNPFMLFVSVAILYYTFMMEFHLHFEGATRSGAMFLFTAIAISSVCYAFRKRFPITQY LTFYMLAIGINTLVYIINIWGDQWENMALVPVVLRWFTAAFVIANIYYVARQYYLLIGIK SRFTIYLNILVTLLWVTMVRSFLWQVGVDDFSAGLSLSLSIAGFVQMGLGMRLHQKVMRM 
ISLSTFGIVLLKLVFDDLWAMPTIGKIIVFIILGLILLILSFLYQKLKDVLFKNDEDEVS >gi|225935361|gb|ACGA01000031.1| GENE 160 218396 - 219583 1290 395 aa, chain - ## HITS:1 COG:no KEGG:BT_3852 NR:ns ## KEGG: BT_3852 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 394 394 702 92.0 0 MKKILMLLAFAGVASVASAQQTMTVTEYEVIQVQDKYQVITNPFWSNWFFSVGGGAQVLY GNNDHIGKFRDRVAPTFNVSVGKWVTPGFGLRLQYSGLQAKGFTTSENANYVVGGPREDG SYKQRWDYMNLHGDLMINLNALFGGYNPNRVYEIIPYIGAGWAHAYSRPHTNSATFNAGI INRFRLSNAVDLNLELSATGLEGKFDGEHAGKPDYDGILGATIGVTYYFPTRGFQRPVPQ IISELELNQMRNQMNAMAAANMQLQQQLANAQQPVEVEDTEEVVITDTNIAPRTVFFKIG SDKLSPQEEMNLSYLASKIKESPNATYTINGYADSATGTPAFNQKLSLERAQVVKDLLVK KYGISADRLKVAAGGGVDKFGQPILNRVVLVESAQ >gi|225935361|gb|ACGA01000031.1| GENE 161 219687 - 221588 1623 633 aa, chain - ## HITS:1 COG:BS_mutL KEGG:ns NR:ns ## COG: BS_mutL COG0323 # Protein_GI_number: 16078768 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Bacillus subtilis # 1 633 1 624 627 301 32.0 2e-81 MSDIIHLLPDSVANQIAAGEVIQRPASVIKELVENAIDADAQNIHVLVTDAGKTCIQVID DGKGMSETDARLSFERHATSKIREAADLFALRTMGFRGEALASIAAVAQVELKTRPESEE LGTKLVIAGSQVESQEAISCSKGSNFSVKNLFFNVPARRKFLKANSTELSNILAEFERIA LVHPEVAFSLYSNDSELFNLPVSQLRQRILAIFGKKLNQQLLNIDVNTTMVKISGYVAKP ETARKKGAHQYFFVNGRYMRHPYFHKAVMEAYEQLIPAGEQISYFIYFDVDPANIDVNIH PTKTEIKFENEQAIWQILSASVKESLGKFSAIPSIDFDTEDMPDIPAFEEKISSEPPKVH YNTDYNPFKVSAGGGGSGSYSRSKVEWEDLYGGLTKASKMNNPQPEPEMDWEDSSIGGEP AFVEEKIETVTSAASSTLYANEPVMEKGNQHLQFKGRFILTSVKSGLMLIDQHRAHIRVL FDRYMVQIQQKQGVSQGVLFPEILQLPASEAAVLQSIMDDLSAVGFDLSNLGGGSYAING IPSGIEGLNPVELVRNMLHTAMEKGNDVKEEIQNILAITLARAAAIVYGQVLSNEEMVSL VDNLFACPSPNYTPDGKTVLTTIKEEDIERLFK >gi|225935361|gb|ACGA01000031.1| GENE 162 221606 - 221905 321 99 aa, chain - ## HITS:1 COG:no KEGG:BF4070 NR:ns ## KEGG: BF4070 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 1 98 98 169 83.0 3e-41 
MGMFNSMRKPRGFNHQYIYVDERKEKLAKMEENAKRDLGMLPEKEFNPEDIRGKFVEGTT HLKRRKASGRKPVSFGIILIIIAFLLYLWHYLATGSWSF >gi|225935361|gb|ACGA01000031.1| GENE 163 221907 - 223670 1313 587 aa, chain - ## HITS:1 COG:no KEGG:BT_3849 NR:ns ## KEGG: BT_3849 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 587 1 576 576 1035 86.0 0 MLKKNKKNKQQDRHRWLLIGFLCLFGVCLVAQVRPTGQKPTAGASTKSTADTSKTAPKTK TPAGNKKQSDNKKQPENKKTKVYLLHADEGQADKLARPDVQVLIGNVKLRHDSMYMYCDS ALIFEKTNSVEAFSNVRMEQGDTLFIYGDYLYYDGMTQIAQLRENVKMINRNTTLLTDSL NYDRLYDLGYYFEGGTLMDEENVLTSDWGEYSPATKQSVFNHDVKLVNPKFVLTSDTLRY NTENKIAVILGPSNIVSDNNHIYSERGFYNTLTEQAELLDRSVLTNQGKKLVGDSLFYDR IIGYGEAFDNVKMTDSINKNMLTGDYCFYNELTDSAFATKRAVAIDYSQGDSLYMHGDTL QMVSYNLNTDSLYRLMKAYHKVRMYRTDVQGVCDSLVYNSKDSCMTMYTDPILWNDGQQL LGEQIKIYMNDSTIDWAHIINQALTVEMKDSIHYNQVSGKEMKAYFVNGDMRHIEVIGNV LTAFYPEEKDSTMTGFNCLEGSMLHLYMKDKRMEKGLFIGKSNGTMYPMDQIPPDKLRLP TFAWFDYVRPLNKDDIFNWRSKRAGETLKPTTDRRPKTEKRNLINMK >gi|225935361|gb|ACGA01000031.1| GENE 164 223693 - 225075 1546 460 aa, chain - ## HITS:1 COG:STM0092 KEGG:ns NR:ns ## COG: STM0092 COG0760 # Protein_GI_number: 16763482 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Salmonella typhimurium LT2 # 13 338 9 339 428 77 27.0 5e-14 MKKFVNFRFVVTLVLAIFANVATYAQDNVVDEVVWVVGDEAILKSDVEEARMDALYNGRR FEGDPYCVIPEEIAVQKLFLHQAKLDSIEVSEAEIIQRVDMMTNMYIQQIGSREKMEEYF NKTSTQIRETLRDNARDGLTVQKMQQKLAGDIKVTPAEVRRYFKDLPQDSIPYIPTQVEV QIITLQPKIPVSEIEDVKRTLRDYTDRVTKGEIDFSTLARLYSEDKASAIKGGECGFMGR GMMDPAYANVAFSLQDPKKVSKIVESEFGFHIIQLIEKRGDRVNTRHILLRPKVSEKELT EACARLDSIADDIRANKFTFDDAAAVISQDKDTRNNHGIMVNINEHSGITTSKFQMQDLP QDVAKVVDKMNVGEISKAFTMINEKDGKEVCAIVKLKTKINGHKATIAEDYQDLKEIVMD KRREEVLQKWILNKQKHTYVRINENWQKCDFKYPGWVKND >gi|225935361|gb|ACGA01000031.1| GENE 165 225085 - 225930 859 281 aa, chain - ## HITS:1 COG:no KEGG:BT_3847 NR:ns ## KEGG: BT_3847 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 14 281 14 281 281 463 91.0 1e-129 MRILVLLLITLLGCGACKEQHDHKGKTPLVEVDGNFLYKEDLMSVLPVGLSKDDSILFTE HYIRSWAEEILLYEKAANNIPDNVDVDKLVENYRKALIMHTYQQELINQKLTNDISEHEI AEYYGKNKELFKLESPLIKGLFIKVPLTAPQLNNVRRWYKSEKQDAIESLEKYSLQNAVK YEYFYDKWVSVTDVLDMIPLKVEAPEEYVNKHRQVELKDTAYYYFLNVSDFRGVGEEKPY EFARSEVKDLLVNQKRVSFMEQVKNDLYQQAVSKKKIIYNY >gi|225935361|gb|ACGA01000031.1| GENE 166 225985 - 227532 1030 515 aa, chain - ## HITS:1 COG:no KEGG:BT_3846 NR:ns ## KEGG: BT_3846 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 515 1 514 514 703 71.0 0 MMKRSLLLGCISLFVVAVFAQEDPVLMRVNGREILRSEFEYAYRRYAERSNARLSSKEYA ALFAQSKLKVEAARAAGLDTTSVFRKQQEKCRTELVESYLIDRQVMDSCARAIYQKMGLK ARSGRVQVMQIFKRLPQTITSRHLEEEKTRMDSIYRMIQNQPDLNFNRLVEIYSDDKQSR WIECLETTSEFENVAFSLAKGMASQPFFTPEGIHILKVMDREETAAYENVSARLMERLRR KEILDKGAGAVLERLKKAWQYAPNQAAMEELLAKGRTEQTLFTIDGQAYTGTMFTHFASS HPQAVKRQLEGFIAKSLLDYESRNIDKKHPEIPYALRESDENYLVEEITRQKIDLPAAND RAGLATYFKFHSSDYRWESPRYRGVVLHCVDKKTAKQAKKMLKKVPEKEWADKLRQTFNT SGEEKIQVEQGIFADGDNKYIDKLVFKKGDFDPLMSYPFTIVVGEKMKGPDDYREVIDRV RKDYRSYLDTCWMRELSESGKVEINQEVLKTVNNN >gi|225935361|gb|ACGA01000031.1| GENE 167 227633 - 229111 1883 492 aa, chain - ## HITS:1 COG:BH0020_3 KEGG:ns NR:ns ## COG: BH0020_3 COG0516 # Protein_GI_number: 15612583 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Bacillus halodurans # 207 489 1 281 282 347 61.0 5e-95 MSFIADKIVMDGLTYDDVLLIPAYSEVLPRTVDLSTKFSKNIELKIPFVTAAMDTVTEAK MAIAIAREGGIGVIHKNMSIEEQARQVAIVKRAENGMIYDPVTIKRGSTVQDALDIMAEY KIGGIPVVDDEGYLVGIVTNRDLRFERDMAKHIDLVMTPKERLVTTNQSTDLESAAQILQ KHKIEKLPIVGMDGKLIGLVTYKDITKAKDKPMACKDAKGRLRVAAGVGVTADTLDRMQA LVDAGADAIVIDTAHGHSKFVIEKLKEAKKRFPNIDIVVGNIATGEAAKALVEAGADAVK VGIGPGSICTTRVVAGVGVPQLSAVYDVAKALKGTGIPLIADGGLRYSGDVVKALAAGGY CVMIGSLVAGTEESPGDTIIFNGRKFKSYRGMGSLEAMENGSKDRYFQSGTADVKKLVPE GIAARVPYKGTLFEVVYQLTGGLRAGMGYCGAANIEKLHDAKFTRITNAGVMESHPHDVT 
ITSESPNYSRPE >gi|225935361|gb|ACGA01000031.1| GENE 168 229198 - 231378 1827 726 aa, chain - ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. PCC 7120 # 6 720 6 712 718 494 39.0 1e-139 MAGKINLTDELKKYFGFNKFKGNQEAIINNLLDGKDTFVLMPTGGGKSLCYQLPSVLMEG TAIVISPLIALMKNQVDAMRNFSEEDGVAHFINSSLNKGAIDQVRSDILAGKTKLLYVAP ESLTKEENVEFLRSVKISFYAVDEAHCISEWGHDFRPEYRRIRPIINEIGKAPLIALTAT ATPKVQHDIQKNLGMVDAQVFKSSFNRPNLYYEVRAKTANIDRDIIKFIKNNPEKSGIIY CLSRKKVEELAEILQANGINARPYHAGMDSLTRTKNQDDFLMEKVDVIVATIAFGMGIDK PDVRFVIHYDIPKSLEGYYQETGRAGRDGGEGQCITFYTNKDLQKLEKFMQGKPVAEQEI GKQLLLETAAYAESSVCRRKTLLHYFGEEYTEENCGNCDNCLNPKKQVEAQELLCAVIEA IIAVKENFKADYIIDILQGKETSEVQAHLHEDLEVFGSGMGEEDKTWNAVIRQALIAGYL SKDVEHYGLLKVTEEGHKFLKRPKSFKITEDNDFEETEEEVPARGGGSCAVDPALYSMLK DLRKKLSKKLEVPPYVIFQDPSLEAMATIYPVTLDELQNIPGVGAGKAKRYGEEFCKLIK RHCEENEIERPEDLRVRTVANKSKMKVAIIQAIDRKVALDDIALSKGIEFGELLDEVEAI VYSGTKLNIDYFLEEIMDEDHMLDIYDYFKESTTDKIDDALDELGDDFTEEEVRLVRIKF ISEMAN >gi|225935361|gb|ACGA01000031.1| GENE 169 231296 - 231493 135 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIASWFPLNLLNPKYFFSSSVKLIFPAILTKVVLFSSWFNFKYVLFFKRAKQSYKQLRNY QAKCD >gi|225935361|gb|ACGA01000031.1| GENE 170 231547 - 232791 250 414 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 154 404 248 453 466 100 30 4e-20 MADSKTKKRCSFCGRSENEVGFLITGMNGYICDSCATQAYEITQEALGEGKKSAGATKLN LKELPKPVEIKKFLDQYVIGQDDAKRFLSVSVYNHYKRLLQKDSGDDVEIEKSNIIMVGS TGTGKTLLARTIAKLLHVPFTIVDATVLTEAGYVGEDIESILTRLLQVADYNVPEAEQGI VFIDEIDKIARKGDNPSITRDVSGEGVQQGLLKLLEGSVVNVPPQGGRKHPDQKMIPVNT KNILFICGGAFDGIEKKIAQRLNTHVVGYTASQKTATVDKNNMMQYIAPQDLKSFGLIPE IIGRLPVLTYLNPLDRDALRAILTEPKNSIIKQYIKLFEMDGIKLTFEDDVFEYIVDKAV EYKLGARGLRSIVETIMMDVMFEIPSENKKEYKVTLDYAKMQLEKANMARLQTA >gi|225935361|gb|ACGA01000031.1| GENE 171 232794 - 233456 812 220 aa, chain - ## HITS:1 
COG:sll0534 KEGG:ns NR:ns ## COG: sll0534 COG0740 # Protein_GI_number: 16332068 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Synechocystis # 32 217 25 210 226 227 58.0 2e-59 MDDFRKYATKHLGMNGMVLDDVIKSQAGYLNPYILEERQLNVTQLDVFSRLMMDRIIFLG TQVDDYTANTLQAQLLYLDSVDPGKDISIYINSPGGSVYAGLGIYDTMQFISSDVATICT GMAASMAAVLLVAGAEGKRSALPHSRVMIHQPMGGAQGQASDIEITAREIQKLKKELYTI IADHSHTDFDKVWADSDRDYWMTAQEAKEYGMIDEVLIKK >gi|225935361|gb|ACGA01000031.1| GENE 172 233600 - 234955 1797 451 aa, chain - ## HITS:1 COG:PA1800 KEGG:ns NR:ns ## COG: PA1800 COG0544 # Protein_GI_number: 15596997 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Pseudomonas aeruginosa # 1 445 1 425 436 71 21.0 4e-12 MNVSLQNIDKVSAELTVKLEKADYQEKVDKELKSLRQKAQIPGFRKGMVPTSLIKKMYGK SVIAEVVNKALQEAVYNYIKENKVNMLGEPLPNEEKQQNIDFDTMEEFDFVFDIALAPEF KAEVSAKDKVDYYTIEVSDEMIDNQVKMYTQRTGKYDKVDAYEDNDMLKGLLAQLDEEGN TKEGGIQVEAAVLMPAYMKNDDQKAIFANAKVNDVLVFNPNVAYDGHAAELGSLLKIDKE IAKDVKSDFSFQVEEITRFVPGELTQEVFDQAFGEGVVKTEEEFRAKIKEEIAARFVADS DYKFLIDIRKVMMEKVGKLEFSDALLKRIMLLNNEEKGEEYVAENYDKSIEELTWHLIKE QLVEANDIKVEQEDVLKMARETTKAQFAQYGMLSIPDDVLDNYAQEMLKKKETINNLVSR VVEVKLAAALKAQVTLENKNVSIEEFNKMFE >gi|225935361|gb|ACGA01000031.1| GENE 173 235397 - 235642 314 81 aa, chain + ## HITS:1 COG:asl4022 KEGG:ns NR:ns ## COG: asl4022 COG0724 # Protein_GI_number: 17231514 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. 
PCC 7120 # 1 80 1 80 94 86 56.0 1e-17 MNIYVGNLNYRVKEGDLQQVMEDYGAVSSVKVVMDRETGKSKGFAFIEMEDDAAAAKAIA ELNGAEYMGRTMVVKEARPRA >gi|225935361|gb|ACGA01000031.1| GENE 174 235708 - 236478 324 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 239 245 129 32 1e-28 MIEIKGLYKSFDDKTVLSDINADFANGKTNLIIGQSGSGKTVLMKCIVGLLTPEKGEVLY DGRNLVLMGKKEKKMLRKEMGMIFQSAALFDSMTVFDNVMFPLNMFSNDILRDRIKRAMF CLDRVNLGEAKDKFPGEISGGMQKRVAIARAIALNPQYLFCDEPNSGLDPKTSLVIDDLI HDITQEYNMTTIINTHDMNSVLGIGEKVIYIYEGHKEWEGTKDDIFTSTNERLNNFIFAS DLLRKVKDVEVQGMEG >gi|225935361|gb|ACGA01000031.1| GENE 175 236475 - 237218 597 247 aa, chain - ## HITS:1 COG:aq_355 KEGG:ns NR:ns ## COG: aq_355 COG0767 # Protein_GI_number: 15605864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Aquifex aeolicus # 1 245 1 244 245 118 33.0 1e-26 MIKALRTVGRYFMLMGRTFSRPERMRMFFRQYLNELEQLGVNSIGIVLLISFFIGAVITI QIKLNIESPWMPRWTVGYVTREIMLLEFSSSIMCLILAGKVGSNIASELGTMRVTQQIDA LEIMGINSANYLILPKITAMVTVIPILVTFSIFAGIIGAFCTCWFAGVMNAVDLEYGLQY MFNEWFIWAGIIKSLFFAFIIASVSAFFGYTVEGGSIAVGKASTDAVVSSSVLILFSDLI LTKLLMG >gi|225935361|gb|ACGA01000031.1| GENE 176 237294 - 238055 274 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 230 1 229 245 110 25 7e-23 MEESKMVLRTEDLVKKYGKRTVVSHVSINVKQGEIVGLLGPNGAGKTTSFYMTVGLITPN EGRIFLDDLEITKYPVYKRAQTGIGYLAQEASVFRQMSVEDNIASVLEMTNKPKEYQKEK LESLIAEFRLQKVRKNKGNQLSGGERRRTEIARCLAIDPKFIMLDEPFAGVDPIAVEDIQ QIVWKLKDKNIGILITDHNVQETLSITDRAYLLFEGKILFQGTPEELSENQIVREKYLSN SFVLRRKDFQLEK >gi|225935361|gb|ACGA01000031.1| GENE 177 238162 - 239475 1384 437 aa, chain - ## HITS:1 COG:SPy0341 KEGG:ns NR:ns ## COG: SPy0341 COG1160 # Protein_GI_number: 15674498 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: 
Streptococcus pyogenes M1 GAS # 5 437 6 435 436 370 45.0 1e-102 MGNLVAIVGRPNVGKSTLFNRLTKTRQAIVNEEAGTTRDRQYGKSEWLGREFSVVDTGGW VVNSDDVFEEEIRKQVLLAVEEADVILFVVDVMNGVTDLDMQVAAILRRANSPVIMVANK TDNHDLQYNAPEFYKLGLGDPYCVSAMTGSGTGDLMDLIVSNFKKESSEILDDDIPRFAV VGRPNAGKSSIVNAFIGEERNIVTEIAGTTRDSIYTRYNKFGFDFYLVDTAGIRKKNKVN EDLEYYSVVRSIRSIEGSDVCILMLDATRGVEGQDLNIFSLIQKNQKGLVVVINKWDLVE DKSVKVQKAFEEAVRSRFAPFVDFPIIFASALTKQRILKVLEEARSVYENRTTKIPTARL NEEMLPLIEAYPPPSNKGKYIKIKYITQLPNTQVPSFVYFANLPQYVKEPYKRFLENKMR EKWNLTGTPVNIYIRQK >gi|225935361|gb|ACGA01000031.1| GENE 178 239524 - 240405 933 293 aa, chain - ## HITS:1 COG:lin1499 KEGG:ns NR:ns ## COG: lin1499 COG1159 # Protein_GI_number: 16800567 # Func_class: R General function prediction only # Function: GTPase # Organism: Listeria innocua # 3 290 6 296 301 256 46.0 3e-68 MHKAGFVNIVGNPNVGKSTLMNVLVGERISIATFKAQTTRHRIMGIYNTDEMQIVFSDTP GVLKPNYKLQESMLNFSTSALTDADILLYVTDVVETPDKNNEFMEKVRQMTVPVLLLINK IDLTDQEKLVKLVEEWKELLPQAEIIPISATSKFNVDYVMKRIKELLPDSPPYFGKDQWT DKPARFFVNEIIREKILLYYDKEIPYSVEVVVEEFKEEPKKIHIRAVINVERDSQKGIII GKQGKALKKVATEARRELERFFGKTIFLETYVKVDKDWRSSDKELRNFGYQLD >gi|225935361|gb|ACGA01000031.1| GENE 179 240492 - 241499 814 335 aa, chain - ## HITS:1 COG:lin2305 KEGG:ns NR:ns ## COG: lin2305 COG0332 # Protein_GI_number: 16801369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Listeria innocua # 4 328 1 311 312 281 42.0 1e-75 MEKINAVITGVGGYVPDYILTNDEISKMVDTNDEWIMTRIGVKERHILNEEGLGSSYMAR KAAKQLMKKTGANPDDIDLVVVATTTPDYHFPSTASILCDKLGLKNAFAFDLQAACSGFL YLMETAANFIRSGRYKKIIIVGADKMSSMVNYTDRATCPIFGDGAAAFMVEPTTEDYGIM DSILRTDGKGLPFLHMKAGGSVCPPSYFTVDNKMHYLHQEGRTVFKYAVSNMSDVSAAIA EKNGLTKDSINWIVPHQANVRIIEAVAHRMDVPMDKVLVNIEHYGNTSAATLPLCIWDYE DKLKKGDNIIFTAFGAGFTWGAVYVKWGYDGKKES >gi|225935361|gb|ACGA01000031.1| GENE 180 241587 - 241772 338 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160882088|ref|ZP_02063091.1| hypothetical protein BACOVA_00026 [Bacteroides ovatus ATCC 8483] # 1 61 1 61 61 
134 100 3e-30 MAHPKRRQSSTRQAKRRTHDKAVAPTLAICPNCGEWHVYHTVCGACGYYRGKLAIEKEAA V >gi|225935361|gb|ACGA01000031.1| GENE 181 241785 - 242363 610 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3832 NR:ns ## KEGG: BT_3832 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 192 1 179 179 328 91.0 9e-89 MGKFDKYKIDLKGMQTDSAKYEFVLDNLYFAHIDGPEVQKGKVNVTLTVKRTSRAFELSF QTEGMVSVPCDRCLDDMELPISSSDKLMVKFGHEYAEEGDNLIVIPEEEGEINVAWFMYE FVALSVPMKHVHAPGKCNKAMTGKLNKHLKTNANEDSDDTFDTGGDDIVIEEEVEEQIDP RWNELKKILDNN Prediction of potential genes in microbial genomes Time: Fri May 13 08:22:08 2011 Seq name: gi|225935360|gb|ACGA01000032.1| Bacteroides sp. D2 cont1.32, whole genome shotgun sequence Length of sequence - 47458 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 12, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1062 - 1121 1.7 1 1 Tu 1 . + CDS 1150 - 1662 712 ## COG2193 Bacterioferritin (cytochrome b1) + Term 1689 - 1736 3.7 + Prom 1756 - 1815 2.7 2 2 Op 1 . + CDS 1850 - 2803 780 ## COG0685 5,10-methylenetetrahydrofolate reductase 3 2 Op 2 1/0.000 + CDS 2805 - 3929 1091 ## COG2812 DNA polymerase III, gamma/tau subunits 4 2 Op 3 . + CDS 3944 - 5449 1471 ## COG1774 Uncharacterized homolog of PSP1 5 2 Op 4 . + CDS 5427 - 5900 265 ## BT_3818 hypothetical protein 6 3 Op 1 19/0.000 - CDS 5903 - 7360 1137 ## COG0772 Bacterial cell division membrane protein 7 3 Op 2 . - CDS 7341 - 9203 1481 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 8 3 Op 3 . - CDS 9206 - 9703 299 ## BT_3815 hypothetical protein 9 3 Op 4 22/0.000 - CDS 9700 - 10545 701 ## COG1792 Cell shape-determining protein 10 3 Op 5 . - CDS 10553 - 11575 1044 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Prom 11638 - 11697 4.1 - Term 11629 - 11679 -0.4 11 4 Tu 1 . 
- CDS 11702 - 13225 1711 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Prom 13267 - 13326 8.7 - Term 13295 - 13349 12.6 12 5 Op 1 . - CDS 13377 - 15413 2572 ## COG3590 Predicted metalloendopeptidase - Term 15513 - 15552 2.0 13 5 Op 2 . - CDS 15623 - 17578 1811 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 17626 - 17685 7.1 - Term 17652 - 17702 10.5 14 6 Tu 1 . - CDS 17721 - 18632 1010 ## BT_3808 hypothetical protein - Prom 18688 - 18747 4.9 15 7 Op 1 . - CDS 18761 - 21103 1992 ## BT_3807 hypothetical protein 16 7 Op 2 . - CDS 21106 - 25548 3648 ## BT_3806 hypothetical protein - Prom 25770 - 25829 8.8 17 8 Tu 1 . - CDS 25887 - 27410 813 ## ZPR_3550 calcineurin-like phosphoesterase - Prom 27439 - 27498 6.5 - Term 27423 - 27479 2.2 18 9 Op 1 . - CDS 27511 - 29004 886 ## ZPR_3551 hypothetical protein 19 9 Op 2 . - CDS 29062 - 31863 1114 ## COG0612 Predicted Zn-dependent peptidases 20 9 Op 3 . - CDS 31876 - 32331 329 ## Phep_0439 hypothetical protein 21 9 Op 4 . - CDS 32345 - 33760 877 ## ZPR_3554 RagB/SusD domain-containing protein 22 9 Op 5 . - CDS 33781 - 37125 1820 ## ZPR_3555 protein containing TonB-dependent receptor Plug domain 23 9 Op 6 . - CDS 37142 - 38314 569 ## COG3712 Fe2+-dicitrate sensor, membrane component 24 9 Op 7 . - CDS 38348 - 38923 259 ## BT_3277 RNA polymerase ECF-type sigma factor 25 9 Op 8 . - CDS 38989 - 41319 1612 ## COG0642 Signal transduction histidine kinase 26 9 Op 9 . - CDS 41360 - 42718 1274 ## COG0534 Na+-driven multidrug efflux pump - Prom 42741 - 42800 5.7 27 10 Op 1 . - CDS 42804 - 44042 1115 ## COG0612 Predicted Zn-dependent peptidases 28 10 Op 2 . - CDS 44048 - 44887 991 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 45115 - 45174 4.5 + Prom 44835 - 44894 5.8 29 11 Tu 1 . 
+ CDS 44975 - 46348 1193 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 46366 - 46409 1.1 + Prom 46473 - 46532 3.3 30 12 Tu 1 . + CDS 46564 - 47458 611 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|225935360|gb|ACGA01000032.1| GENE 1 1150 - 1662 712 170 aa, chain + ## HITS:1 COG:PA4880 KEGG:ns NR:ns ## COG: PA4880 COG2193 # Protein_GI_number: 15600073 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Pseudomonas aeruginosa # 14 161 32 177 177 79 35.0 5e-15 MARESVKILQGKLDVESLISQLNAALAEEWLAYYQYWVGALVVEGAMRADVQGEFEEHAE EERRHAQLLADRIIELEGVPVLDPKQWFELARCKYDAPQGFDSVSLLKDNVASERCAILR YQEIADFTNGKDFTTCDIAKHILAEEEEHEQDLQDYLTDIARMKKSFLEK >gi|225935360|gb|ACGA01000032.1| GENE 2 1850 - 2803 780 317 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 316 1 287 296 177 33.0 2e-44 MKVIDLINNSKGTAFSFEILPPLKGTGIEKLYQTIDTLREFDPKYINITTHRSEYVYKDL GNGLFQRNRLRRRPGTVAVAAAIQNKYNITVVPHILCSGFTREETEYVLLDLQFLNITDL LVLRGDKAKHESVFTPEGDGYHHAIELQEQINNFNKGIFVDGSEMKVTASPFSYGVACYP EKHEESPNIETDLYWLKKKVETGAEYAVTQLFYDNKKYFDFVEQAKAAGINIPIIPGIKP FKKISQLSMIPKTFKVDLPEDLVKEALKCKNDAEAEQVGVEWCVAQCKELMEHGVPSIHF YSIGAVDSIKEIAKIIY >gi|225935360|gb|ACGA01000032.1| GENE 3 2805 - 3929 1091 374 aa, chain + ## HITS:1 COG:DR2410 KEGG:ns NR:ns ## COG: DR2410 COG2812 # Protein_GI_number: 15807400 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Deinococcus radiodurans # 3 214 13 180 615 100 31.0 6e-21 MFFRDVIGQEEIKQRLIQEVNEGRIPHAQLICGPEGVGKMPLAIAYARYISCTNRGEEDA CGVCPSCVKFNKLVHPDVHFVFPIVKSAKGKKEVCDDYIADWRPFVINNPYFNLNHWLGE MDAENSQALIYAKESDEILKKLSLKSSEGGFKITIVWLPEKMHPVCANKLLKLLEEPPEK 
TIFLLVSEAPDMILPTILSRTQRMNVRKIDEASIDRVLQSKYHVQPADSISIAHLANGNF VKALETIHLNEENQLFFELFVNLMRLSYQRKIKEMKMWSEQVASMGRERQKNFLEYCQRM IRENFIFNLHQRNLTYMTINEQNFATRFAPFVNERNVMGIMDELSEAQLHIEQNVNAKMV FFDFSLKMIVLLKQ >gi|225935360|gb|ACGA01000032.1| GENE 4 3944 - 5449 1471 501 aa, chain + ## HITS:1 COG:BH0045 KEGG:ns NR:ns ## COG: BH0045 COG1774 # Protein_GI_number: 15612608 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus halodurans # 42 275 4 231 275 164 39.0 3e-40 MEYKLHNGSGGLCCKGCSRQDKKLNTYDWLADIPGNAEESDMVEVQFKNTRKGYFRNSNK IKLEKGDVVAVEAAPGHDIGVVTLTGRLVPLQMKKANFKADTEIKRVYRKAKPVDMEKFN EAKAKEHATMIRARQIALNLNLDMKIGDVEYQGDGNKAIFYYIADERVDFRQLIKVLAEA FRVRIEMKQIGARQEAGRIGGIGPCGRELCCATWMTSFVSVSTSAARFQDISLNPQKLAG QCAKLKCCLNYEVDCYVEAQKRLPSREIELETKDGTFYFFKADILSNQVSYSTDKNFPAN LVTISGKRAFEVISMNKKGMKPDSLIEEEKKPEPKKPIDLLEQESVTRFDRSRNNKDGGN NANRNNKKKKKGNNNSNNNGNRPQQQAEGGNRPQQPQRENDNRPQSSENGNRGERDNRPR NNNNNRNRGQNQGRNNENRRPERGQNQERPQGQERPQQQDRQREQQGQERQERRPNQERP SRPERNQNQEKQSTNEKPTQE >gi|225935360|gb|ACGA01000032.1| GENE 5 5427 - 5900 265 157 aa, chain + ## HITS:1 COG:no KEGG:BT_3818 NR:ns ## KEGG: BT_3818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 157 159 259 75.0 2e-68 MKSLLKNSILSLFSACLLTACNEHTVYHSYQSLPNKGWGKSDTLSFQIPITDSVPTTLRL FAEVRNSIEYPYHDLHLCISQNLQDSTVWRTDTIAFCLADSTGRWTGHGWGSIYQSETFI TSVHPLHPANYTIKIMSGMKDEKLQGLSDVGIRIEKQ >gi|225935360|gb|ACGA01000032.1| GENE 6 5903 - 7360 1137 485 aa, chain - ## HITS:1 COG:TP0501 KEGG:ns NR:ns ## COG: TP0501 COG0772 # Protein_GI_number: 15639492 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Treponema pallidum # 52 479 46 430 433 168 32.0 2e-41 MVTRNDSLWKTLDWVTIFIYLLLIVGGWFSVCGASYDYGERDFLDFSTRAGKQFVWIICS FGLGFVLLMLEDRMYDMFAYIIYVGMILLLIVTIFIAPDTKGSRSWLVMGPVSLQPAEFA KFATALALAKYMNSYSFSIKKEKCAFILGFIILLPMLLIIGQRETGSALVYLAFFLVLYR 
EGMPGVVLFAGVCAVIYFVVGIRFDEVFIADTPTPLGEFIVLLLILLFAGGMVWVYRKKW SATRNIIGGSLAILLIAYLISEYWVHFSLVWVQWALCIVVIGYLIYLALSERQRTYFLIA LFTIGSVGFLYSSNYVFDNVLEPHQQVRIKVVLGLEEDLTGAGYNVNQSKIAIGSGGLTG KGFLNGTQTKLKYVPEQDTDFIFCTVGEEQGFVGSAAVLLAFLILILRLIFLSERQTSNF GRVYGYSVVSIFLFHLFINIGMVLGLTPVIGIPLPFFSYGGSSLWGFTILLFIFLRIDAG RGRRL >gi|225935360|gb|ACGA01000032.1| GENE 7 7341 - 9203 1481 620 aa, chain - ## HITS:1 COG:RSc0062 KEGG:ns NR:ns ## COG: RSc0062 COG0768 # Protein_GI_number: 17544781 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Ralstonia solanacearum # 24 600 31 633 801 281 33.0 3e-75 MAKDYILEKRKFVIGGIAISIVLIYLIRLFVLQITTDDYKKNADSNAFLNKIQYPSRGAI YDRTGKLLVFNQPAYDITIVPKEIENLDTLDLCQSLNITRAQFLKIMSDMKDRRRNPGYS RYTNQLFMSQLSAEECGVFQEKLFKFRGFYIQRRTIRQYSYNAAAHALGDIGEVSAKEME ADEEGYYIRGDYVGKLGVEKSYEKYLRGEKGIEILLRDAHGRIQGHYMDGEYDRPSVPGK NLTLSLDIDLQMLGERLLKNKIGSIVAIEPETGEILCLVSSPNYDPHLMIGRQRGKNHLA LQRDMTKPLLNRALMGVYPPGSTFKTAQGLTFLQEGIITEQSPAFPCSHGFHYGRLTVGC HAHGSPLPLIPAIATSCNSYFCWGLFRMFGDRKYGSPQNAITVWKDHMVSQGFGYKLGVD LPGEKRGLIPNAQFYDKAYRGHWNGLTVISISIGQGEILSTPLQIANLGATIANRGYFVT PHIVKEIQDNQLDSIYRVPRYTTIEKRHYESVVEGMRGAATGGTCRMLSVMVPDLEACGK TGTAQNRGHDHSVFMGFAPMNKPKIAIAVYVENGGWGATYGVPFGALMMEQYLKGKLSPE NELRAEEFSNRVILYGNEER >gi|225935360|gb|ACGA01000032.1| GENE 8 9206 - 9703 299 165 aa, chain - ## HITS:1 COG:no KEGG:BT_3815 NR:ns ## KEGG: BT_3815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 248 90.0 4e-65 MIITYIHRIGWFIGLVLLQVLILNSVHIAGYATPFLYIYFILKFSSGTSRNELMLWAFFF GLTIDIFSDTPGMNAAATVLLAFLRPSLLRLFMPRDNPDSLIPSFKTMGISPFLKYTTAS VFVHSLALLSIEFFSFTSIWLLLLRVLLCTILTVTCIIAIEGIKK >gi|225935360|gb|ACGA01000032.1| GENE 9 9700 - 10545 701 281 aa, chain - ## HITS:1 COG:lin1582 KEGG:ns NR:ns ## COG: lin1582 COG1792 # Protein_GI_number: 16800650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: 
Listeria innocua # 46 268 62 278 295 89 30.0 5e-18 MRNLLNFLLKYNYWFLFILLEVASFVLLFRFNRYQQSAFFTSANTVVGAVYEVSGGISSY FHLKSVNEDLLDRNMVLEQQITNLEKALREQQLDSMAINSIRQVPQADYQLFKAHVIKNS LNLVDNYITLDKGSSSGIRSEMGVVDGNGIVGIVYETSPSYSVVISVLNSKSNISCKIIG SDYFGYLKWEHGDSRYAYLKDLPRHAEFNLGDTVVTSGFSTVFPEGIMVGTVDDMSDSND GLSYLLKIKLATDFGKLSDVRVVARTGQEEQKKLENKVMKE >gi|225935360|gb|ACGA01000032.1| GENE 10 10553 - 11575 1044 340 aa, chain - ## HITS:1 COG:CAC1242 KEGG:ns NR:ns ## COG: CAC1242 COG1077 # Protein_GI_number: 15894525 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 335 1 332 335 284 45.0 2e-76 MGLFSFTQEIAMDLGTANTIIITNGKIVVDEPSVVALDRRTEKMIAVGEKAKLMHEKTHE NIRTIRPLRDGVIADFYACEQMMRGLIKQVNTRNHLFSPSLRMVIGVPSGSTEVELRAVR DSAEHAGGRDVYLIFEPMAAAIGIGIDVEAPEGNMIVDIGGGSTEIAVISLGGIVSNNSI RTAGDDLTEDIREYMSRQHNVKVSERMAERIKINVGAALTELGDDAPEDYIVHGPNRITA LPMEVPVCYQEVAHCLEKSISKIETAILSALENTPPELYADIVHNGIYLSGGGALLRGLD KRLTDKINIPFHIAEDPLHAVAKGTGVALKNVDRFSFLMR >gi|225935360|gb|ACGA01000032.1| GENE 11 11702 - 13225 1711 507 aa, chain - ## HITS:1 COG:aq_1963 KEGG:ns NR:ns ## COG: aq_1963 COG0138 # Protein_GI_number: 15606962 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Aquifex aeolicus # 10 507 3 506 506 434 47.0 1e-121 MSESKRIKTALVSVYHKEGLDEIITKLHEEGVEFLSTGGTRQFIESLGYPCKAVEDLTTY PSILGGRVKTLHPKIFGGILCRRGLEQDMQQIEKYEIPEIDLVIVDLYPFEATVASGASE ADIIEKIDIGGISLIRAAAKNYNDVIIVASQAQYKPLLDMLMEHGATSSLEERRWMAKEA FAVSSHYDSAIFNYFDAGEGSAFRCSVNSQKQLRYGENPHQKGYFYGNLEAMFDQIHGKE ISYNNLLDINAAVDLIDEFDDLTFAILKHNNACGLASRATVLEAWKDALAGDPVSAFGGV LITNGVIDKEAAEEINKIFFEVIIAPDYDVDALEILGQKKNRIILVRKEAKLPKKQFRAL LNGVLVQDKDTNIETVADLKTVTDKAPTPEEVEDMLFANKIVKNSKSNAIVLAKDKQLLA SGVGQTSRVDALKQAIEKAKSFGFDLNGAVMASDAFFPFPDCVEIADQEGITAVIQPGGS VKDDLSFAYCNEHGMAMVTTGIRHFKH >gi|225935360|gb|ACGA01000032.1| GENE 12 13377 
- 15413 2572 678 aa, chain - ## HITS:1 COG:MA2001 KEGG:ns NR:ns ## COG: MA2001 COG3590 # Protein_GI_number: 20090849 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Methanosarcina acetivorans str.C2A # 32 678 16 665 665 550 41.0 1e-156 MKVTKYLPILAVCLMTTGCNSKKEAVLTSGIDLANLDTTAMPGTSFYQYACGGWVKDHPL TDEYSRFGTFDMLRENSREQLKALIAELAAKKDNAPGSAAQKVGDLYNIAMDSVKLNQEG VAPIKAELAAIDALKDKGEIYAYIAESQKKGIRPYFTMFVSADDMNSSMNIVQTYQGGIG MGQRDYYLENDEQTKNIRNKYQEHIAKMFQLAGYDEATAQKAVKAVMNIETRLAKAARSQ VELRDPHANYNKMDRATLKKNFPTFDWDTYFTVSGLKDLEEVNVGQPAAMKEVADVINTV SLDDQKLYLQWGLIDAAASYLSDDFEAQNFDFYSRTMSGKKEMQPRWKRSVSTVDGVLGE VVGQMYVEKYFPAAAKERMVTLVKNLQTSLGERIKGLEWMSEPTKEKALEKLATFHVKIG YPDKWKDYSALEIKDDSYWANIERANEWDYNEMIAKAGKPVDKDEWLMTPQTVNAYYNPT TNEICFPAAILQPPFFDMNADDAMNYGAIGVVIGHEMTHGFDDQGRQYDKDGNLKDWWTE EDAKKFEERAQVMVNFFDSIEVAPGVHANGSLTLGENIADHGGLQVSFQAFKNATEAAPL EIVDGFTPEQRFFLAYANVWAGNIRPEEILRLTKLDPHSLGKWRVDGALPHIQNWYEAFK ITEQDSMFVPKEKRVSIW >gi|225935360|gb|ACGA01000032.1| GENE 13 15623 - 17578 1811 651 aa, chain - ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. 
PCC 7120 # 1 532 1 531 564 400 42.0 1e-111 MISVEGLSVEFNATPLFEDVSYVINKKDRIALVGKNGAGKSTMLKILAGLQSPTRGVVAT PKDVTIGYLPQVMILSDNRTVMGEAELAFEHIFELQAKLERMNQELAERTDYDSEEYHQL IDRFTHENDRFLMMGGTNFQAEIERTLLGLGFSREDFERPTSEFSGGWRMRIELAKLLLR RPDVLLLDEPTNHLDIESIQWLENFLSTRANAVVLVSHDRAFLNNVTTRTIEITCGQIYD YKVKYDEFVVLRKERREQQLRAYENQQKQIQDTEDFIERFRYKATKAVQVQSRIKQLEKI DRIEVDEEDNSALRLKFPPASRSGNYPVICEDVRKAYGSHVVFHDVNLTINRGEKVAFVG KNGEGKSTLVKCIMDEIDFEGKLTIGHNVQIGYFAQNQAQMLDENLTVFDTIDRVATGDI RLKIRDILGAFMFGGEASDKKVKVLSGGERTRLAMIKLLLEPVNLLILDEPTNHLDMRSK DVLKEAIREFDGTVILVSHDRDFLDGLATKVYEFGGGLVKEHLGGIYEFLQKKKIDSLNE LQKGAGLSASPTASAKGNEPETVQPSENRLSYEAQKELNKKIKKLERQVADCEASIEETE SAIAIVEAKMATPEGASDMQLYERHQKLKQQLDGIVEEWERVSMELEEAKN >gi|225935360|gb|ACGA01000032.1| GENE 14 17721 - 18632 1010 303 aa, chain - ## HITS:1 COG:no KEGG:BT_3808 NR:ns ## KEGG: BT_3808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 607 97.0 1e-172 MKPTLFLLAAGMGSRYGGLKQLDGLGPNGETIMDYSIYDAINAGFGKLVFVIRKDFEQDF RDKIISKYEGHIPCELVFQSIDDLPEGFICPADRTKPWGTNHAVMMGADVIKEPFAVINC DDFYGRDSFQVMGKFLSALPENSKNVYSMVGFRVGNTLSESGTVSRGICSTDAKGLLTSV VERTKIQRLDGEVKYIGDDGEWTATPDTTPVSMNFWGFTPDYFAYSQEFFKTFLSDPKNM ENLKSEFFIPLMVDKLINDGTATVEVLDTTSKWFGVTYPEDRQSVVDKIQALVDAGEYPA KLF >gi|225935360|gb|ACGA01000032.1| GENE 15 18761 - 21103 1992 780 aa, chain - ## HITS:1 COG:no KEGG:BT_3807 NR:ns ## KEGG: BT_3807 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 780 1 788 788 1455 89.0 0 MKGIYSNILASCLIGIILFSGCSVTKHLPEGEVLYTGGKTVVENKSATPVGETALTEIDA ALDKTPSTKMLGGLLPIPFKMWMYNDFVKYKKGFGKWMFNRLAANPPVFISTVNPEVRIK VATNLLRDYGYFNGKVTYETLVDKKDSLKASILYTVDMKNPYFIDTVYYQRFTPQTLRIM ERGRRMSYISPGEQFNVVDLDEERTRISTLLRNRGYFYFRPDYMTYQADTTLVPGGHISL RLIPVSGLPAAAQRPYYVGDASVYLYGKNGEMPNDSMLYKNLNIHFYKKLQVRPNMLYRW LNYQQFVRNTQMRASNRTRLYSQYRQEQIQEKLSQLGIFSYLDLQYAPKDTTAVCDTLQV TMQATFAKPLDAELELNVVTKSNDQTGPGASFGVTRNNVFGGGESWNVKLKGSYEWQTGG 
GEKSSLMNSWEMGVSTSLTFPRVVFPHLGKREFDFPATTTFRLYVDQLNRARYYKLLSFG GNATYDFQPTRTSRHSITPFKLTFNVLQHQSDEFIAIAEANPALYISLKDQFIPAMEYTY TYDNASARGIKNPIWWQSTVTSAGNLTSLIYRAFGQPFNKEDKRLLNVPFAQFVKLNTEF RHLWNMDKNNKIASRVALGALFSYGNATIAPYSEQFYVGGANSIRAFTVRSIGPGGYHPS ESRYSYLDQTGTFRFEANVEYRFRIFKSIWGATFLDAGNVWLMRKDEARPDSQLELKTFP KQIALGTGVGIRYDMDILVFRLDFGIPLHLPYDTERSGYYNVTGTFMKNLGIHFAIGYPF >gi|225935360|gb|ACGA01000032.1| GENE 16 21106 - 25548 3648 1480 aa, chain - ## HITS:1 COG:no KEGG:BT_3806 NR:ns ## KEGG: BT_3806 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1480 1 1479 1479 2499 86.0 0 MVLLYVPPVQNLLRREVTAYASKVTGMQIQVERIDLRFPLNLLVRGVEVIQQPDTLLSLE SLNVRVQAWPLIKGKVEVDEVTLSRVAVNSADLMEGMKIKGVLGRFFLQSHGVDLSNELA VINQVELSDTHMQLLMNDTTTTPKDTTASAPINWKVALHQLKLKNVSFGMQLPADSMRMT AHIGEAAINNAQADLKNQYYDLKKFLLSGTSASYDTGTAQPTEGFDASHIAVRDIRIALD SLLYKGRDMNAVIREFTMNERSGLSVTALTGRAYSNDSVICVPSLKLNTPHSEIDLSAHT YWELVNIPTTGRLSARLNAYIGKEDVMLFAGALPESFKEAYPFRPLVIRAGTDGNLKEMQ ISRFTVDLPGAFALEGGGLLENLADSLTRTGTIGLKMTTQNLNFLTALSGEAPNGTLVIP DSMNLVAKVDIKGPEYKANLRLKEGQGAMDVNAALNTATEVYKADLKIDNLQLHSFLPKD SIYELSLSAAANGRGLDVMSYHSFAKLNLALDQLHYAKYHLSDLDLTGELKGALVTAHLT SDNALLKMITDAEYNLAHSYPDGKITIDVTQLDLHELGIMPQSMKRPLTFNLVAEARRDL VSAHLISGDMKLDLSARSGVNPLIRQSTHFVDVLMKQIDEKALNHAELRKALPTAVFSFS AGQENPLAYFLATKKIAYHDASVKFGTAPDWGINGKAAIHALKVDTLQLDTIFFTVKQDT TLMKLRAGVINGPKNPQFSFSTTLTGEIRDRDAELLVDFKNGKGETGVRLGVNARPLFEG QGKGDGLAFTLIPEEPIIAFQKFHFNENHNWIYVHKNMRVYANVDMWDDEGMGFRVHSVR GDTVSLQNIDVEIRRISLAELSKVLPYFPEITGLFSAEAHYVQTEKDLQLSVESSIDELT YERQRIGDVTLGATWLPGEQGKQYLNAYLNHDNVEVMVADGKLVPTRTGKDSLEVNATLE HFPLRVANVFIPDQMVTLAGDMDGNLNITGSTEQPLINGELILDSVTVLSRQYGANFRFD NRPVQIKNNRLELDKFAIYTTGKNPFTIDGSVDFRDMSRPMANLNLLAQNYTLLDAKRTR ESLVYGKVYADFRATVKGPLDGLNMRGNMSLLGNTDVSYILTDSPLTVQDRLGSLVTFTS FSDTTTVVQQEVPTVSLGGLDMVMMVHIDPSVRLKVDLDASNDNRVELEGGGDLSMKYTP QGDLTLTGRYTLSGGLMKYALPVIAAKEFAIDNGSYVEWTGNPMDPMLNFKATDRIRASV SEGENGGTRMVNFDVSIVVKNRLDNLSFAFDVSAPEDATIQNELTAMGAEERGKQALYIM 
VMKTYLGTGPIGGGGGGLGKLNMGSALNSVLSSQINSLMGNLKNASVSVGIEDHDLSDTG GKRTDYSFRYSQRLFNNRFQIVIGGKVSTGENATNDAESFIDNISLEYRLDRTGTRYVRL FYDKNYESVLEGEITETGVGLVLRKKLDKLSELFIFKKKK >gi|225935360|gb|ACGA01000032.1| GENE 17 25887 - 27410 813 507 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3550 NR:ns ## KEGG: ZPR_3550 # Name: not_defined # Def: calcineurin-like phosphoesterase # Organism: Z.profunda # Pathway: not_defined # 1 491 3 482 499 273 32.0 1e-71 MKKFMIIYIFFLLGSFVSTKCLAQDVAWIEGCVYEDGNQNGIQDKGEKGISGIAISNGDT VLLADKQGHFRIRLSKGNSIFPILPNNWTLLSNQIVNSGFYYWNSHGDTETQQINFGLNK KKVNKHFSINAIGDVQVGNSQELDYASRSLWPELLQADSSSFNIFLGDLVNNNLNLFPAI KQMMELLPVQSWTVIGNHDRDADSIRINQTFSYNTAFGSATYAFNEGNVHFIVLNNVYGK GTRSYVGKISDSQLRFVSNDLKLVPKNAQIILCMHIPLVHTTNSSALIEILEGRGNVLAL TGHMHQVERNFLHGQDVCVHELVTGASCGFWWVGEKDWEGIPSALMQCGTPRNYFVFDFT EKDYSFRYKGIGMDASRQMNIWIAGIDSTDVYIDELRNKHQGEMLVTVYGASDSTIVRCR LDNGEWLLCEKKEELDVNVARVRSWNQLKIYPTRFNRRNPFRRQFSPQIWGLQLPKECCE GVHLIAVEASDQWGFKASGERCFYYQR >gi|225935360|gb|ACGA01000032.1| GENE 18 27511 - 29004 886 497 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3551 NR:ns ## KEGG: ZPR_3551 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 5 486 7 468 481 180 27.0 1e-43 MKDLFWMAVIVLFAAFSSCSDDKNDGTDEGVEAKLLTFGFYAEDNSSILSDDYIATLSST TAKVTMPAFIDKSALVARFTTNDGNIVLVDGVTQVSGATANDFTAPVDYIVSNGKQNVKY TITVAKSSNMAWVRMPDFTATSVFSGSVLKINPVDQVPYLGFKLKEEEKMAVIKLVDGTT WTAVGSLEGFGGEVSLSNYDFDIDKDGIPYVVYSDNSATTVAGAASLMKWNGTTWNYVGN QGFVNAQSQQLHLRVLDNGETIVSQVNNSNKVSFPRRVLVMSIYQNSWASSELPLLASGT PIYNCNLAKTEHAAYLLVVNRGAVAGVNYGHSVYEYKNGTWSILLNNYVEPNATQTGIVG LDIEAEENGTLYILTSDDAVTSGVYNLRLKKYDPVTKQWSTVGGNPLPLDFKTSTSTSVA ISIAPDGTPFVAYRDEQDQDYPKVIYLDNETKQWSDPHKLADIASSNLNIAFSSTGVGYI SFTDDNNHAIVFKYTDK >gi|225935360|gb|ACGA01000032.1| GENE 19 29062 - 31863 1114 933 aa, chain - ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia 
burgdorferi # 8 886 15 889 933 261 26.0 5e-69 MKKFLWLIFIGLLLSPVVYGQSSEFLQYDKNLIVRKLDNGLTYYIYPNTNPKGEAVYRLF VKAGSVMEKENQRGLAHFLEHMAFNGSYHFPSDGMVRFLESKGAKFGKDLNAHTSFNETV YKLQLPSSNPQMVDSTLTILADWAGGLSIDSMQVEKERGVILSEWLSKRDAKRDSDTAFL LELLNSSHYSERMTIGDTAVIRNCKREDILDYYQTWYHPSLMAVAVVGDINPEQVETLIK EKFGKLSSLASPIWKQCHIPVYKKEAVKILTNESLKTIELDMIQLLPLSKPVQTAKDYKA YLTRTLLNRLFKMRMNAWAFENPSYRKASIQYSSFLNATGVLLCSVELLPGKMEKGISEF IAQQQQIFRYGFTETEIERAKKVIYNNLENKLQNQQNPASVELMNDIYADFYVGNRFTSL QEEYRLVQRYFPELDSVALVRNLAKVYSPQKMHYLLRANEKANKEVNGDATLMSIIKEAR KQSSKRYNKYFSVPDELCQLPSGGHIVREENIPEIGAVSLWLNNGTRIIFKSSELDKGKV LLTGFRKGGLYSLDSLHYYTGLFAPSIISLSGAGNFSRDALNYFLAGNSASMRLLVDKTR TGLAGVSQVKDMETMFQLLYTKWMYPQLDTAICKQTIEKTKENYRVKQKSPTEFFQEELM WLLNGRNYTNTILSDSLITRYVKQEDMLPLFNRFYGAAKDYTFVILGDCTIQDIKPLITT YIGGLPKGSNDTDWCYTERNIPYKSCSLIRHTGDSPKASVSLIFQQDSLLEEFSSFTLKS NVMKAMLRTCLLNRLREEMGKVYSVSVASSAGLYPSFLSRTMIGFVCLPEDVDSLVNATQ EELQRLYEYPESFDGILTDVKRNLLKDFELDKQKNSFWTSWIRNSIFNQQEDWKYLNNYA QTVNSITAKDISSFAKSLLSTTPMIKAVLYPKN >gi|225935360|gb|ACGA01000032.1| GENE 20 31876 - 32331 329 151 aa, chain - ## HITS:1 COG:no KEGG:Phep_0439 NR:ns ## KEGG: Phep_0439 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 18 150 24 155 155 123 48.0 2e-27 MRSLYFILLISCLWACTTDEKEPYDTPFIHIMTDKGVSKVIVKSDVNNINTYSVYLSSKP LTENLEVNYQVIVGDGLKSGVDFELVTKGSTLTFLPGIYDMPIRIRWMPNHLDENKDNTI TIRLTGNNQGLTMGLPGPDGLQRELVIEKQN >gi|225935360|gb|ACGA01000032.1| GENE 21 32345 - 33760 877 471 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3554 NR:ns ## KEGG: ZPR_3554 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: Z.profunda # Pathway: not_defined # 1 466 5 471 474 355 42.0 2e-96 MKKRIYLLVGALCIIMQLVSSCTNFLEVEEKGKTTIPSFLSDPDGLNAGLVGAYNKMYAY YDNEFTKYPDVAGNMLSMKYVSAGADMLDQYNFTSDALQETGAVGYIWRKIYEALANVNN VIQYQPQVISAYPNAKDMCQRILGEALFLRALCHFDLCRVYAQPYNYTSDASHLGVPILL KTPGPDDNVSRESVKKVYLQILADLERAADCFRGIESSGIYYASLQAVNALYSRVYLYME DWNNALKYAKLAIGTGKLSQGDTYLNMYQDLSTTGEAIFRLNGIDQSGKLKAFYDASCVP 
ADTLFTLFDEGDIRLGLLRNKDGIAYCSKYYSLKQPDNQVNRDDPFVFRLSEMYMNAAEA AWHLKDYTAASGYLKSILERAVDTDYAVNTLSQYSDAKLIQLIEKERVKELCFEGHNFFD IIRWKQDLKREENTNSSVEKIVYPSDYFVLPIPQVELNANTNMQPNPTVNN >gi|225935360|gb|ACGA01000032.1| GENE 22 33781 - 37125 1820 1114 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3555 NR:ns ## KEGG: ZPR_3555 # Name: not_defined # Def: protein containing TonB-dependent receptor Plug domain # Organism: Z.profunda # Pathway: not_defined # 31 1112 30 1118 1120 1033 49.0 0 MRILIIIMLLCTTLVPNYLYAISHENYSEESFLFVGEDYMQENYKVTLHLSGVSVSTLFN EIQKQTKLDFVYNTELLDTITPISVTAEKESVFDVLNRLFNGTGFIFKKSGNIITVNKKE ELQQDDKFLVTGEVKDDTGEPLVGVTIQQKDTRNITITDMDGRYSIQVSSKAEAFLTFSF VGMKTKTVPVKGRVLNVKLVQDAISIDDVVVVGAYGTAQKRLDQVGSAFQVNADQLKALP ALRVDKMLDGLIPGVKIDPNTDSPDNTRARYNVRVRGDASLSASNEPLWVVDGTPIYTGE HTNLIPGMSTTISPLSFINPDDIESITVLKDATATSIYGANGANGVILITTKKGKEGDLR IHLTAQYGVAKIDKSTSPKVLNANQYLMLAKEAYQNAGNDLKYFPYTDNEMNKYSLTNTD WTDVFYDTANTFQTNLSLMGGARNTAYYLSASYLENTATVKGNKQQRFSIRSNLDFTFLR KFKVSVDLATSYNVNDLFNPGRDYYEYLPIFEPYNEDGTFRLYNKVISGKETDGSPKWSE NRFLNSVAEREENIYNQKTLYTNANLMLRYDILSGLSYTGQFGVDYQSGKEEIYYARTNW SGMTSADGPIGYSTRNSLNMMNWTTVHRINYKQSFDKHDISGLLGIEAGSQDYVTLGVTG SGFINDHIQDVSYAKERKGTNTSKTKRKASMLGQLSYSYDHRYYVTLNGRRDGNSQFGSD VRWANFGSIGVSWNVQNESFYNIPWMNILKIKASYGANGNSRLGSQEALGLYSYGESYSY AGEIGGVMSGCPNSRLSWETTYMTNIGVRVRLFDRLDIEVEGYHNKTTNLLSNLDVSRTT GDTRVYRNVGTILNKGIEVTITSHNFRPKKDGDFSWLTDLNLAHNSNKLLELYNGIQKNM GEMVWREGYDIHTYNLVRWAGVDPRDGAPLWYDAKGNITRIYSTDNRVPYKSSTPVLAGG LTNTLTYKDFSLRFMFNYSIGGYGFTSFGRASNSDGLNIMTENQSIDQLNRWQKSGDLVL NPKPIWGVSTQSVMNSTRYLYNKTQVRLQNLVFSYRIPRTILHSTGLKDCSISLIGDNLW VWTPYSGKDHNSYKTCMSGYPMERYFSIALNVGL >gi|225935360|gb|ACGA01000032.1| GENE 23 37142 - 38314 569 390 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 160 370 110 312 331 73 30.0 
5e-13 MEIEKLIVKYLVGCLSKEEDQILKKWLQEDESHLRFFQKINLDKTAVEDYRQYHRFDAKS DEAFRKVWTSLHKGTRRISTQRKIGFFSRRWLKIACALLILLFSVGGGLYFYYEASRITP GESKAILTLEDGSAKQLKKSGQEHWIYIGDTPIAREYDGMIVYHIPETSKVEISQQNTLS VPRGGEFRLTLSDGTRVHLNSLSELKYPVSFKGMNERTVELKGEAFFEIAKDSLHPFKVE TQGILIQQFGTAFNVKSRVKGKVEVALVQGSIGIYTQSQKLHKLSPGQLAIWDASTDLLS VENKDLLLNTAWHSNRFIFYDESLGSLMEELALWYNVDVDFLDKSLKELHFTGSLYRYDD IAVILNAIEETVNVDFKISGLRIKIDRKNK >gi|225935360|gb|ACGA01000032.1| GENE 24 38348 - 38923 259 191 aa, chain - ## HITS:1 COG:no KEGG:BT_3277 NR:ns ## KEGG: BT_3277 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 167 11 176 189 131 46.0 1e-29 MIREPGFSINRKDIFERLFSDYYGILVCYAQKYTKREDIAEDIVQDVFASLWEENRIFPS QANFRSFLYISIRNAAFDYLRHQNVESRYIEEALTANRFLSDDSFQKEEVFRLLFKQIDL LPERCREIFLLHLEGYDNDAIAKKLSLSIETVKTQKKRAMKTLRNNLKEKLQKKYPDTSF FILFLYLDLYL >gi|225935360|gb|ACGA01000032.1| GENE 25 38989 - 41319 1612 776 aa, chain - ## HITS:1 COG:rcsC_1 KEGG:ns NR:ns ## COG: rcsC_1 COG0642 # Protein_GI_number: 16130155 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 243 492 424 675 700 149 33.0 2e-35 MASFRFTKLKITAGYTLLLAILLFSLVFVHREMEALSAADDQQNLRTDSLLTLLHEKDQN TIQMLRVLSEANDSLLSASEIEEIISEQDSVITQQRVQHRVITKRDSLITTPKKKGFFKR LAEVFSPSKQDSAVLVNTSLEVATDTILQPTTSKDSLQQKIRMATEEKRLQRRKTIRRTS TKYQRMNTQLTARMDSLIKQYEEEMTLRARQDAELQQEVRMRSARIIGGIAVGAVLLSAF FLILIMRDISRSNRYRQQLEVANKRAEDLLIAREKLMLAITHDFKAPLGSIMGYTELLSR LTEDERQRFYLDNMKSSSEHLLKLVSDLLDFHRLDLNKAEVNRVTFNPSQLFDEIYVSFE PLTAAKGLALQCHVVPELNGRYISDPLRLRQIVNNLLSNAVKFTQKGEISLTASYDSSKL TIAIADTGKGMASEDRERIFQEFTRLSGAQGEEGFGLGLSIVKKLVTLLEGTIDVQSTLG KGSCFTVTLPLYPVGKFLAESESPESESSENESPYAPKQSAAIPPMKVIRVLLIDDDKIQ LNLTAAMLKQHGIDAVCCEQLEQLIEQLRSSVFDVLLTDIQMPAINGFDLVKLLRASNIP QAKTIPVIAVTARSEMDKAALHEHGFAGCLHKPFTVKELLLTVNEGQLSADEAHITEDMQ LNVNALTSFSEDDPEATHSIIQTFIEETQKSADRMVQALNAKEVDEIAAIAHKLLPLFTL IGAGNAVILLSWLEARRGEDFSTEINEKVESILQEIQKILKEVNGVECSNILNSEI 
>gi|225935360|gb|ACGA01000032.1| GENE 26 41360 - 42718 1274 452 aa, chain - ## HITS:1 COG:VC1540 KEGG:ns NR:ns ## COG: VC1540 COG0534 # Protein_GI_number: 15641548 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 1 444 1 449 461 182 29.0 9e-46 MRHIYSTYKEHYKALIFLGLPIVIGQIGVIVLGFADTLMIGHHSTIELGAASFVNNVFNL AIIFSTGFSYGLTPIVGGLYGTRQYAPAGQALRNSLLANLMVALLLTICMTVLYLNIERL GQPEELIPLIKPYYLVLLASLVFVMLFNGFKQFTDGITDTKTAMWILLGGNVLNIIGNYI LIYGKLGLPELGLLGAGISTLFSRIVMVIIFIIIFMRSPRFVRYKIGFFRLGWSRAIFGR LNGLGWPVAFQMGMETASFSLSAIMIGWLGTIALASHQVMLAISQFTFMMYYGMGAAVAV RVSNFKGQNDIINVRRSAYAGFHLMMTLGVVLSLIVFLCRNYLGGWFTDSQEVAAMVTSL IFPFLVYQFGDGLQITFANALRGISDVKLMMVIAFVAYFIISLPVGYFCGFVMGWGVVGV WMAFPFGLTSAGLMLWWRFHHMTKLPEPHPKT >gi|225935360|gb|ACGA01000032.1| GENE 27 42804 - 44042 1115 412 aa, chain - ## HITS:1 COG:CC3584 KEGG:ns NR:ns ## COG: CC3584 COG0612 # Protein_GI_number: 16127814 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Caulobacter vibrioides # 5 412 47 459 948 195 31.0 1e-49 MKINRHILDNGLRLVHSQDESTQMVALNILYNVGARDEDPEHTGFAHLFEHLMFGGSVNI PDYDMPLQLAGGENNAWTNNDITNYYLTVPRQNVETGFWLESDRMLSLDFSERSLEVQRG VVMEEFKQRCLNQPYGDIGHILRPLAYQTHPYQWPTIGKELSHIANATLEEVEAFFFRFY APNNAILAVTGNISFEEAVALTEKWFGSIPRREVPQRNLPQEQEQTEERRLTVERNVPLD SLFMAYHMPAHCHPDYYAFDILSDVLSNGRSSRLSQRLVQQKQLFSSIDAYISGSVDAGL FHISGKPSAGVTLEQAEAAVREELYLLQQELVDEQELEKVKNKFESTQIFGNINYLNVAT NLAWYELLGRAEDMEKEVDRYRSVTAEQLRAVAQSAFRKENGVILYYKKQQN >gi|225935360|gb|ACGA01000032.1| GENE 28 44048 - 44887 991 279 aa, chain - ## HITS:1 COG:SPy0457 KEGG:ns NR:ns ## COG: SPy0457 COG0652 # Protein_GI_number: 15674576 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Streptococcus pyogenes M1 GAS # 31 278 74 261 268 95 31.0 9e-20 MKKILLIILTISFCGLTACKTGTKKGGDMDKETLVKIETTVGDIEVKLYNETPKHRDNFI 
KLVKDGVYEGTLFHRVIKDFMIQAGDPDSKNAPKGKMLGTGDVGYTVPAEFVYPKYFHKK GALSAARQGDNVNPKKESSGCQFYIVTGKVFNDSTLLGMESQMNENKINVIFNTLAQKHM KEIYKMRKANDENGLYDLQEKLFAEAQEMAAKQPEFHFTPEQIEAYTTVGGTPHLDGEYT VFGEVVKGLDIVDKIQQVKTDRSDRPEEDVKITKVTILD >gi|225935360|gb|ACGA01000032.1| GENE 29 44975 - 46348 1193 457 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 5 450 8 439 441 323 40.0 4e-88 MMSSILIVEDDITFGMMLKTWLGKKGFEVSSVSNIARARKHIESQNVDLILSDLRLPDHE GIDLLKWMNEQGMDIPLIIMTGYADIQSAVQAMKLGARDYIAKPVNPEELLKKISECLQS EKSPATHNVAKSSSKKGASTSSKGSTENHRAYLEGESDAAKQLYNYVGLVAPTNMSVLIN GSSGTGKEYVAHRIHQLSKRNDKPFIAVDCGSIPKELAASEFFGHVKGSFTGALTDKTGA FVAANGGTIFLDEIGNLSYEVQIQLLRALQERKIRPVGSTQEISVDIRLVSATNENLEQA IEKGTFREDLYHRINEFTLRMPDLKERKEDILLFANFFLDQANKELDKHLIGFDAKASQA LMNYHWPGNLRQMKNIVKRATLLAQGSFITLLELGTELLETPASSNTSIALRNEETEKEH ILEALRQTGNNKSKAAQLLNIDRKTLYNKLKLYNIDL >gi|225935360|gb|ACGA01000032.1| GENE 30 46564 - 47458 611 298 aa, chain + ## HITS:1 COG:CC1172 KEGG:ns NR:ns ## COG: CC1172 COG3119 # Protein_GI_number: 16125424 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Caulobacter vibrioides # 34 298 33 306 521 86 26.0 5e-17 MNKKTLLPLVFVPLAATNLQAQSNMQIERADKRPNIILFMVDDMGWQDTSLPFWTQKTHY NELYETPNMERLAKQGMMFTQAYANSISSPTRCSLITGTNAARHRVTNWTLQKNTMTDRK DSILAVPDWNYNGVSQVSGTNHTFVGTSFVQLLKNSGYHTIHCGKAHFGAIDTPGEDPHH WGFEVNIAGHAAGGLASYLGEENYGHTKDGKAVSLMSVPGLEKYWGTETFVTEALTLEAI KALDKAKKYNQPFYLYMSQYAIHIPLNKDMRFYEKYKKKGMTDHEAAYATLIEGMDKS Prediction of potential genes in microbial genomes Time: Fri May 13 08:23:49 2011 Seq name: gi|225935359|gb|ACGA01000033.1| Bacteroides sp. 
D2 cont1.33, whole genome shotgun sequence Length of sequence - 68983 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 16, operones - 9 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 15 - 665 612 ## COG3119 Arylsulfatase A and related enzymes + Prom 798 - 857 4.4 2 2 Tu 1 . + CDS 940 - 1206 388 ## gi|237719593|ref|ZP_04550074.1| conserved hypothetical protein + Term 1318 - 1354 -0.7 - Term 1165 - 1197 -0.9 3 3 Op 1 . - CDS 1327 - 2892 1580 ## BT_3791 hypothetical protein 4 3 Op 2 . - CDS 2929 - 4857 1820 ## Phep_3405 RagB/SusD domain protein 5 3 Op 3 . - CDS 4871 - 8116 3667 ## BT_2894 hypothetical protein 6 3 Op 4 . - CDS 8153 - 9544 1370 ## COG4833 Predicted glycosyl hydrolase 7 3 Op 5 . - CDS 9559 - 10794 1210 ## Cpin_1591 hypothetical protein 8 3 Op 6 . - CDS 10825 - 12006 981 ## BT_3791 hypothetical protein - Prom 12032 - 12091 5.6 9 4 Tu 1 . - CDS 12195 - 16235 3025 ## COG0642 Signal transduction histidine kinase - Prom 16393 - 16452 3.5 + Prom 16193 - 16252 5.3 10 5 Op 1 . + CDS 16446 - 18728 2147 ## COG3537 Putative alpha-1,2-mannosidase 11 5 Op 2 . + CDS 18782 - 19732 894 ## COG3568 Metal-dependent hydrolase 12 5 Op 3 . + CDS 19785 - 20942 905 ## COG4833 Predicted glycosyl hydrolase 13 5 Op 4 . + CDS 21002 - 22447 1378 ## COG3538 Uncharacterized conserved protein + Term 22493 - 22553 9.1 + Prom 22837 - 22896 5.0 14 6 Tu 1 . + CDS 22944 - 24104 864 ## COG2152 Predicted glycosylase + Term 24114 - 24174 15.1 - Term 24236 - 24295 3.1 15 7 Tu 1 . - CDS 24406 - 26676 1856 ## COG3537 Putative alpha-1,2-mannosidase - Prom 26711 - 26770 2.5 - Term 26809 - 26851 3.3 16 8 Op 1 . - CDS 26896 - 27567 204 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 17 8 Op 2 . - CDS 27577 - 28323 287 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 18 8 Op 3 . 
- CDS 28340 - 28930 524 ## BT_3770 transcriptional regulator - Prom 29060 - 29119 4.5 19 9 Tu 1 . + CDS 29084 - 29320 152 ## gi|237714317|ref|ZP_04544798.1| conserved hypothetical protein - Term 29608 - 29662 -0.8 20 10 Op 1 . - CDS 29722 - 31926 1466 ## COG3537 Putative alpha-1,2-mannosidase 21 10 Op 2 . - CDS 31971 - 33416 1168 ## BT_3171 sialic acid-specific 9-O-acetylesterase - Term 33421 - 33458 1.6 22 11 Op 1 . - CDS 33469 - 34881 916 ## COG5434 Endopolygalacturonase 23 11 Op 2 . - CDS 34894 - 36258 1171 ## COG0366 Glycosidases 24 11 Op 3 . - CDS 36287 - 37855 1036 ## Fjoh_1407 hypothetical protein 25 11 Op 4 . - CDS 37866 - 39029 1059 ## Fjoh_1407 hypothetical protein 26 11 Op 5 . - CDS 39043 - 40614 1571 ## GFO_2139 SusD/RagB family protein 27 11 Op 6 . - CDS 40627 - 43614 2646 ## BDI_1558 hypothetical protein - Prom 43721 - 43780 3.8 - Term 43740 - 43776 -1.0 28 12 Tu 1 . - CDS 43838 - 45508 1130 ## BT_3309 transcriptional regulator - Prom 45597 - 45656 5.2 + TRNA 45901 - 45976 71.3 # Met CAT 0 0 - Term 46166 - 46206 9.6 29 13 Op 1 . - CDS 46233 - 47030 485 ## BT_4180 acetyl xylan esterase A 30 13 Op 2 . - CDS 47048 - 49723 2398 ## COG3250 Beta-galactosidase/beta-glucuronidase 31 13 Op 3 . - CDS 49743 - 51971 1629 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 32 14 Op 1 . - CDS 52109 - 53902 1492 ## gi|260172079|ref|ZP_05758491.1| hypothetical protein BacD2_09461 33 14 Op 2 . - CDS 53959 - 54945 773 ## Phep_2142 hypothetical protein 34 14 Op 3 . - CDS 55007 - 56545 799 ## Acid_0712 hypothetical protein 35 15 Op 1 . - CDS 56660 - 58201 1212 ## gi|260172082|ref|ZP_05758494.1| hypothetical protein BacD2_09476 36 15 Op 2 . - CDS 58229 - 58837 482 ## gi|260172083|ref|ZP_05758495.1| hypothetical protein BacD2_09481 37 15 Op 3 . - CDS 58851 - 60311 1510 ## Phep_0446 RagB/SusD domain protein 38 15 Op 4 . 
- CDS 60325 - 63462 2900 ## Phep_0445 TonB-dependent receptor plug - Prom 63618 - 63677 8.9 - Term 63641 - 63672 -0.7 39 16 Op 1 3/0.000 - CDS 63798 - 67823 2758 ## COG0642 Signal transduction histidine kinase - Prom 67873 - 67932 4.2 40 16 Op 2 . - CDS 67934 - 68833 616 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 68897 - 68956 7.4 Predicted protein(s) >gi|225935359|gb|ACGA01000033.1| GENE 1 15 - 665 612 216 aa, chain + ## HITS:1 COG:MT0738 KEGG:ns NR:ns ## COG: MT0738 COG3119 # Protein_GI_number: 15840118 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 1 184 309 533 757 81 27.0 1e-15 MNWLEKNGKANNTIIIFMSDNGGLASESGWRDGKLHTQNYPLNSGKGSTYEGGIREPMIV SWPGVVAPGSKCNNYLLIEDFYPTILEMAGVKNYQTVQPLDGISFIPLLKQTGNPAKGRS LFWNMPNNWGNDGPGINFNCAVRNGDWKSIYYYGTGKKELFNIPDDIGESNDLSAQHPDI VKKLSKELGNYLRKVDAQRPTFKATGKPCPWPDEIK >gi|225935359|gb|ACGA01000033.1| GENE 2 940 - 1206 388 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237719593|ref|ZP_04550074.1| ## NR: gi|237719593|ref|ZP_04550074.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 88 1 88 88 112 100.0 7e-24 MKKLVLVVAMFMFVCGGSFLVKAQSSAEAVTAPTEINATVVNDTVVKDTVTKEDTPAKAE LVALAAIVNDTVVTDTTSKDKPAEPVKE >gi|225935359|gb|ACGA01000033.1| GENE 3 1327 - 2892 1580 521 aa, chain - ## HITS:1 COG:no KEGG:BT_3791 NR:ns ## KEGG: BT_3791 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 482 13 499 545 125 27.0 6e-27 MKRIYTYMCLLLLSVLSFAVTGCDDDDDVAKDLIELIVNKPVIRIGQNEEAKVNVVVGNG NYTVRSFNTAIATATVSGELITVKSGSQNGATTVEVMDGEGVVANISVNVGVFELEVNEP EVILEVADEKQLIVSMGNFSSNDELSYEVEDETVVSMVNTDQFRPFYTLTGLKNGHTTVT FTDHKGKQAVVQVTVNPISIDVSNLTPRVGVNNKIMITVEKGNGGYSLTAENEEIVAIQQ VDDTRFNLIGKKAGITTVFVRDEAEQELSLTVTVVQADKVANLGSGNYFKVPFEYNGTAD ESLKNLSTLTFEARFNIESLNGNDNGNARINTVMGIEKKFLLRVDVHKGGSNDEERFLQL AADDKGSIRYEGSTKIETNKWYDVAVVLDNSKSGSERIALYVNGVRETLQLSNGTPDDLK EINLTSDFYIGQSDGKRRLNGAISYARIWTKALSDQQISEQSGKLLSEDKDGMVANWLFN NGNGNTKTFVSLAGKSFEAEAANIVSSWKTDPILETSTPTE >gi|225935359|gb|ACGA01000033.1| GENE 4 2929 - 4857 1820 642 aa, chain - ## HITS:1 COG:no KEGG:Phep_3405 NR:ns ## KEGG: Phep_3405 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 642 1 600 600 180 27.0 2e-43 MKKLYRFSAIALLVLPLVFTSCSDYLDRDDDDNITEADVFARYEKVNGLVSDVYAAAKKA DRPLVFFEHFSNSAITDECEGTNVEGNITNNFNNGAWNPNSLPGSVGQYWEALYEGIRKA NLIIENVQKYNTPDNPQQDGDLRNRIGEMYFMRGYFHMLLLRMYGEAPYIDRVINAGDNM DFKKESVHSMVEKIVTDAQTAYGMVPNKYVKTSENFGRVDKGACLGLISFVRWVAATPLW NGASQYGYNLRRVFENEYAYDATRWRKAKEAAKAVLDFEVGGTKRYSLYTKHDANDFKDP ADGNLNDSRVYARLWDMFYDMDAFANEYVFFMTKSKDQAWQGDIYPPSREGSSRQQPVQE QVDEYEYIVGDYGYPVYSAEARKGGYDDTNPYVKGTRDPRFYRDVIYHGAPYRNNKNESK TINTASGSDKIGATNATTTGYYLRKFQQESWNKSGNFSINAPAIWRLPEFIYIYAEACNE LGEDIDEAYKLVNTVRERSFMKPMPPEVKTNQQLMREYIQRERRVELFYEGKRPWTCRLY LEPTSKEELAKESLWKSSGSDNSKRTQKYWAANNGALPRCQRMINGMRPVQDENGAITVD GVKYRMERFCVEERTFSIQHYLFPIRQSELQKTPTIEQNPGW >gi|225935359|gb|ACGA01000033.1| GENE 5 4871 - 8116 3667 1081 aa, chain - ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 1081 25 1017 1018 575 35.0 1e-162 MIHIKRNICLVAVSCTLLAGIPLQGVAQTGRTAKVQATQNHKITVSGTVLDKTTNDPLIG VSVVVKGVANAGTITDMDGKFTLKLPYAEAPLVFSYLGYQPQEIVPGAKKELTVLLQEDT KALQEVVVVGYTKQRKETMIGSVATITTKDLTQSPTANINNALAGRLPGLIVNQYAGGEP GVDQSELFIRGKATYGNQSAIVIVDGIERDMSYLAPDEIETFTILKDASATAAYGIRGAN GVIVITTKRGKAAEKATVNLKASIGINQPIGFPEYLGSADYATLYNEARLNDAKMTGADI SSLNLFSQQAIDNFRRAKGDNSDGLGYDWDYYDFAFKPGLQEDVSLSIRGGTDKVRYYVL ANYFSQGGNYKYSNAGEYDSQTKFTRYNFRSNIDININRYLSTRLDLGARITDRNAPGTT AGRLMTICATQPPYLPILVEENAHPQNEEYIQQNPRGMLYGDNIYRYNLLGELSRTGYLN EKNTYLNGSFAMNLDMEFLTKGLKAEVMFSYDASEGRWINRKLDTYKDGYREYPKYATFM PIEGSDAYMAGGHYTGAYKTGNKYDIDQTIGNGFSHNASDGRTYIQARLDYNRLFSNRHE VTAMLLANRGNRTVNNELAYHSQGITGRFAYYYNQKYLMEFNFGYNGSENFAPGKRYGFF PAGSIGWVVSEEEFMKKASWIDFLKVRASYGLVGSDNVSSRFPYLAFYGGGSGYDFGNNF GTNVGGTSEGNLANANLTWEKARKLNVGIDFTTLNQRLALTIDAFYEYRFDIITDMNSDG IMGYPDIVGKDAALQNLGEVSNRGVDIELSWNDKIGKDFRYYIRPNLTFSRNRLEYKAEV ARKNSWRKETGKRLYENFVYVFDHFVADQEEADRLNKIGYQPWGQLIPGDVVYKDLDRNG VIDDEDRTAMGNPRSPELMFGIPFGFQYKNFDFSVLLQGATKSSILLNGAAVFDFPQFEQ DKIGRVKKMHLDRWTPETAATAKYPALHYGTHDNNKNGNSSLFLYDASYLRLKNVEIGYN VSPKLLRKFHVQQARIYVQGLNLLTFDKLGDVDIDPETKSGDGASWYPIQKVFNFGIDIT F >gi|225935359|gb|ACGA01000033.1| GENE 6 8153 - 9544 1370 463 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 99 439 46 341 341 107 30.0 4e-23 MKNYIVSLSKRALLLSGALAIFTACDDGDLGDIYKGNSSGEYVSDVNWTDAADKSTGTYI KYFYENASGRDTFCGSIYWEKPNTPETGEPASTGGSGGWSQGHALDIVIDAYIRHANNPE YQAHLYENIMKPFLPAFDDWNEHCGYGGKDFWNNFYDDMEWMALSCLRVYELTGDQDYYS ALMKMWNHIKGAKNDYKGVGGMAWKTDAPASRMSCSNGPGCLLAMKLYQLTVKEAKDGWE EQAAYYLNFAKEVYNWMTAYLCDISTGQVYDNLSIKDDGTPGDPDKVALSYNQGTFMAAA LELYNATGEDEYLRNAVAFGSYQVNKKMDSNYPVFSGEGNSGDNLLFRGIFVRYFLDMVK QPTNSLYPEKTKNKFIAALRSCSDVLWTLAHPEGYYVWEYDWAKAPTFGNRDNREDRLTI 
SLNAEVPGATMIEIRARYEDWVQGKATEKANWVGPDFGKKAEE >gi|225935359|gb|ACGA01000033.1| GENE 7 9559 - 10794 1210 411 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1591 NR:ns ## KEGG: Cpin_1591 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 10 401 15 376 381 122 25.0 2e-26 MITKKLIKFALLFFLIGLTACDNKLNPGFDPDDYEPIVPPEPVRTMDAMASGERVVLSGA SLTDIVLKWTPTEKHGNTVYRYEVLIDTIGGDFTKPVETIFSDNNGLESQLTLTHYQANT IGKLAKFRCNTNGTLRWKVRAYCGLDQALSSLEGYFVIFMMDGIDDMPSENEPVYITGVG TEDDGDEAEAQQMLRQNEGIYQTFTQLKADQPFIFMSTVEGRKCYYYVDDNGVLRERNDG EDYTVTVPQSGIYRITINMGEQTISYDEIGAVYLYNHSGGYRQDFNYLGYGKWGVKNYTA RKQTESWAGSGETRHSFKMEINGTTYRWGHKEKDKGQPNLTTDNSYYNLYQLALGTDPWD YSFRFCDELLQWGEKQGNVYYATVKTDVTLYFNAEFGTYTHRWVASETNED >gi|225935359|gb|ACGA01000033.1| GENE 8 10825 - 12006 981 393 aa, chain - ## HITS:1 COG:no KEGG:BT_3791 NR:ns ## KEGG: BT_3791 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 156 13 152 545 68 31.0 3e-10 MKRKLSFLFLSALTIPFFMVSCSDDKEEQEVKPATDLVVDNNSVTLLQGDEITVSITSGN GDYYVKPFDENVATATINGNKVTIKATENSQLEENNRTETTILIVDGRKKVARVLVRVAK LWDLTVDAPEEGFDLFIGEKRLVKILTGNGDYQISIPEGADKFLEVGELSGQVIPLTAKF ETGADPVNITITDKNGKSDPRFRSQQTIYCDDSEGGRKTPGKWYHIAIVYDGTKSSTKEA YKMYINGVRETLTPADNSYEDCAPNSSLNLTDVGGNDKALLIGRSGDSYRVGYCKVYQAR MWKRALDESEIKANMCKILNAEEHSDLMGYWVFSKGVGGTTVFENWGNGGSGLDAQVCLQ NISENKPAWGAELPATYNGDKSRFEPIECPHSY >gi|225935359|gb|ACGA01000033.1| GENE 9 12195 - 16235 3025 1346 aa, chain - ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 827 1050 43 266 269 101 29.0 1e-20 MGKQSVFILFFFLNLLIHNAYAYNLKQVADKEYMSNSSITSLCQDERGLMWIGTCDGLNI YDGQEIEEFKTRDKEDYLSGNLIDNIVYTGEDIYWIQTYYGLNRLNRKTNTITHYNEFQK LFFMNKDSHGNLFIIQDSNCIYYYHKKEGVFKKINITGIPISDIVNFFIDGNNRMWVVMK GYNRCYDIQQEAASGDITLLPQKTNLIYQTSLIYCFNDEQSLYYIDKEFNFYAFHIPTQK 
NEFIANLGKEIQERGKISSIVHYHNSYFVGFLMDGILLLEKQKETNHYQIQSLPINSGVF CLKKDCFQDVVWIGTDGQGVYLYSNPLYSIKSMVLSNYTEKIERPVRAIYLDDERTFWVG TKGNGILKIYDYEFDKNISDCRAEVLTTSNSALSSNAVYCFAKSHRNLLWIGDEEGLTYY SYREKRIKSIPIRIGNEDFKYIHDIYETADSELWLASVGMGVVKARIAGTPDNPVIVDAQ RYIINDGELGSNYFFTIYAENEANLLFGNKGYGVFRYNETTNGLEPVSTHKYENMTLNNI LAISKDSSNNYLFGTSYGLIKYTSETSYQLFNAKNGFLNNTIHAILRNSSDDFWLSTNLG LINFDTKRNVFRSYGFGDGLKVVEFSDGAAFRDSQTGTLFFGGINGVVAIRADGRPEQLY MPPVYFDKLSIFGEQYNLGEFLTRKKETEVLNLQYDQNFFSVSFASVDYLNGNNCTYFYK LKGLSDQWVNNGSESGVSFTNMAPGEYTLLVKYYNSVFDKESDVYSLVIRIGDPWYASWW AYLIYALCLLLLAALLIRSFILRSKRKKQELLNEIEKRHQKNVFESKLRFFTNIAHEFCT PLTLIYGPCGRILSSKGLSKFVVDYVQMIQTNAERLNNLIHELIEFRRIETGNREVRVES LNVSSIVKGIAKTFVEMAKSRNITFLSKIPEQVMWNSDKGFLNTIIINLISNAFKYTPDG QSIKIEVDTSGENMLTLRVANEGSTIKEKDFQYIFNRYAILDNFENQDEKNFSRNGLGLA ISYNMAKLLNGTLKVENTSDGWVMFTLTLPVMELTTGVSETKRLMAEYIPKIDTQPILKL PQYEFDKMRPTLLVVDDEIEMLWFIGEIFSADFNVVTLQDPERLDQVMNEVYPNVIICDV MMPGMGGIELTRRIKSVKETAHIPIIVVSGRHEMEQQMEALSAGAEMYITKPFSAEYLRI SVCQVIERKEVLKNYFSSPISSFEKSDGKLTHKESKKFLQSVLKIINDNITNKDLTPRYI ADRLAISPRSLYRKMEEIGEDSPTDLIKECRLHIAKDLLLTTKKTIDEIVFDSGFSNKVT FFKVFREKYECTPKEFRMKHLEEVQQ >gi|225935359|gb|ACGA01000033.1| GENE 10 16446 - 18728 2147 760 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 30 758 3 715 717 422 33.0 1e-117 MKTHFSFKHLLFLGGAVLYSLQSSAVKNPVDYVSTLVGTQSKFELSTGNTYPATALPWGM NFWTPQTGKMGDGWAYTYDADKIRGFKQTHQPSPWMNDYGQFAIMPITGGLVFDQDRRAS WFSHKAEVAKPYYYKVYLADHDVTTELAPTERAVMFRFTYPETKNAYVIVDAFDKGSYVK VIPEENKIIGYSTKNSGGVPENFKNYFVIQFDKPFTFVSTVFENNILPNETEAKGNHTGA VIGFATKKGEIVHARVASSFISPEQAELNLKELGKNSFDQLVANGREIWNREMSKIEIED DNIDNLRTFYSCLYRSMLFPRSFYEIDAKGQVMHYSPYNGEVRPGYMFTDTGFWDTFRCL FPFLNLMYPSMNQKMQEGLVNTYKESGFLPEWASPGHRDCMVGNNSASVVADAYIKGLRG YDIETLWEALKHGANAHLRGTASGRLGYESYNQLGYVANNIGIGQNVARTLEYAYNDWAI 
YTLGKKLGKPESEIDIYKKHALNYKNVYHPERKLMVGKDNKGVFNPNFDAVDWSGEFCEG NSWHWSFCVFHDPQGLINLMGGKKEFNAMMDSVFVIPGKLGMESRGMIHEMREMQVMNMG QYAHGNQPIQHMVYLYNYSSEPWKAQYWIREIMNKLYTAGPDGYCGDEDNGQTSAWYVFS ALGFYPVCPGTDEYIIGTPLFKSAKLHLENGKTITIKADNNQLDNRYIKEMKVNGKSQTR NFLTHDQLIKGANIQFQMSPVPNKQRGTTEKDVPYSLSFE >gi|225935359|gb|ACGA01000033.1| GENE 11 18782 - 19732 894 316 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 29 310 3 256 257 163 34.0 3e-40 MKLKNLLLIALVAIVFCGCQSNYQPTSITVASYNLRNANGGDSINGNGWGQRYPVIAQIV QYHDFDIFGTQECFIHQLKDMKEALPGYDYIGVGRDDGKEKGEHSAIFYRTDKFDVIEKG DFWLSETPDVPSKGWDAVLPRICSWGHFKCKDTGFEFLFFNLHMDHIGKKARVESAFLVQ DKMKELGKGKELPAILTGDFNVDQTHQSYDAFVSKGVLCDSYEKAGFRYAINGTFNDFDP NSFTESRIDHIFVSPSFQVKRYGVLTDTYRSIVGKGEKKQANDCPEEIDIKTYQARTPSD HFPVKVELEFDQRQQK >gi|225935359|gb|ACGA01000033.1| GENE 12 19785 - 20942 905 385 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 139 361 90 314 341 88 33.0 2e-17 MRNICFVACMLFCLTSAVGKTPGNTRYLSIADSILSNVLNLYQTNDGLLTETYPVNPNQK ITYLAGGTQQNGTLKASFLWPYSGMMSGCVALYKATGNKKYKKILEKRILPGMEQYWDNS RLPACYQSYPTKYGQHGRYYDDNIWIALDYCDYYQLTHKPASLEKAVALYQYIYSGWSDE MGGGIFWCEQQKEAKHTCSNAPSTVLGVKLYRLTKDAKYLEKAKETYAWTKKHLCDPTDH LYWDNINLKGKVSKEKYAYNSGQMIQAGVLLYEETGDEQYLRDAQQTAAGTDAFFRTKAD KKDPTVKVHKDMAWFNVILFRGLKALYKIDKNPTYVDAMVENALHAWENYRDENGLLGRD WSGHNKEQYKWLLDNACLIEFFAEI >gi|225935359|gb|ACGA01000033.1| GENE 13 21002 - 22447 1378 481 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 37 468 61 497 516 468 49.0 1e-131 
MNITKAFCLSIALFGASNMQAITNSDFVIQQDNTKINNYQTNRPEASKRLFVSQAVEQQI AHIKQLLTNARLAWMFENCFPNTLDTTVHFDGKDDTFVYTGDIHAMWLRDSGAQVWPYVQ LANKDAELKKMLAGVIKRQFKCINIDPYANAFNMNSEGGEWMSDLTDMKPELHERKWEID SLCYPIRLAYHYWKTTGDASIFSDEWLTAIAKVLKTFKEQQRKEDPKGPYRFQRKTERAL DTMTNDGWGNPVKPVGLIASAFRPSDDATTFQFLVPSNFFAVTSLRKAAEILNTVNKKPD LAKECTTLSNEVEAALKKYAVYNHPKYGKIYAFEVDGFGNQLLMDDANVPSLIALPYLGD VKVNDPIYQNTRKFVWSEDNPYFFKGTAGEGIGGPHIGYDMIWPMSIMMKAFTSQNNAEI KTCIKMLMDTDAGTGFMHESFHKNDPKNFTRSWFAWQNTLFGELILKLVNEGKVDLLNSI Q >gi|225935359|gb|ACGA01000033.1| GENE 14 22944 - 24104 864 386 aa, chain + ## HITS:1 COG:PAB1622 KEGG:ns NR:ns ## COG: PAB1622 COG2152 # Protein_GI_number: 14521331 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus abyssi # 49 375 8 287 305 165 34.0 2e-40 MNNMKSTFLFLLTTTMMTCTAYGQSSNHKENKLPDWAFGGFERPKNVNPVISPIENTKFY CPLTKDSIAWESNDTFNPAATLYNGEIVVLYRAEDKSGVGIGHRTSRLGYATSTDGTHFQ REKTPVFYPDNDSQKELEWPGGCEDPRIAVTDDGLYVMMYTQWNRHVPRLAVATSRNLKD WTKHGPAFAKAFDGKFFNLGCKSGSILTEVVKGKQVIKKINGKYFMYWGEEHVFAATSDD LIHWTPIVNIDGSLKKLFSPRDGYFDSHLTECGPPAIYTPKGIVLLYNGKNHSGRGDKRY TANVYAAGQALFDANDPTRFITRLDEPFFRPMDSFEKSGQYVDGTVFIEGMVYFKNKWYL YYGCADSKVGVAVYDPKRPAKADPLP >gi|225935359|gb|ACGA01000033.1| GENE 15 24406 - 26676 1856 756 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 32 748 46 777 790 573 41.0 1e-163 MKAFKLLALTSCVLLAAGNGVAQQNYSKSEGLLQYVDPYIGSGYHGHVFVGTSVPYGMVQ LGPTNIHKGWDWCSGYHYSDSILIGFSHTHLSGTGCTDLCDILIMPLNEIRTPRGNQDDI RDGYASRYSHANEIARPEYYSLLLDRYNIKAELTATDRVGFHRYTYPEGKPASILIDLRE GNGSNAYDSYIRKVDDYTVEGYRYVRGWSPSRKVYFVLKSDKKIEQFTAYDDNNPQPWDQ LKVASVKSVLTFGNVKEVKIKVALSSVSCDNAAMNLQSELTHWDFDKVVDMSADRWNKQL EKMTVETDDEASKRVFYTAHYHTMIAPTLFCDVNGEYRGMNDMIYTDPKKANYTTLSLWD TYRALNPLMTITQPEMVDHVVNSMISIYRQQDKLPIWPLMSGETDQMPGYSSVPVIADAY 
LKGFTGFDAEEALQAMIATATYEKQKGVPYVVKKGYIPADKVHEATSIAMEYAVDDWGIA AMARKMGKTEDAETFSKRAHYYKNYFDSSIHFIRPKLEDGSWRTPYDPARSIHTVGDFCE GNGWQYTFFAPQDPYGLIALFGGDKPFTTRLDGFFTNTDSMGEGASSDITGLIGQYAHGN EPSHHVAYLYAYAGEQWKTAEKVRFIMSDFYTDQPDGIIGNEDCGQMSAWYLLSSMGLYQ VNPSDGVFVFGSPCFKKVEVKVRGGNTFTVEAPNNSKENIYIQKVYLNGKPYDKSYITYQ DIINGSTLKFVMGKKPNKNFGKAPANRPVVLNKING >gi|225935359|gb|ACGA01000033.1| GENE 16 26896 - 27567 204 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 210 83 279 287 83 29 4e-15 MTVVYEDNHIIVVNKTASEIVQADKTGDTPLSETVKQYLKEKYQKPGNVFLGVTHRLDRP VSGLVIFAKTSKALTRLNEMFRTSEVKKTYWAVVKNAPQEPEGELVHFLVRNEKQNKSYA YDKEVPNSKKAILHYRLIGHSENYYLLEVDLKTGRHHQIRCQLAKMGCPIKGDLKYGSPR SNPDGSICLHARRVRFIHPVSKELIELEAPLPEGNLWKGFAID >gi|225935359|gb|ACGA01000033.1| GENE 17 27577 - 28323 287 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 248 4 242 242 115 34 9e-25 MGLLDGKTAIVTGAARGIGKAIALKFAAEGANIAFTDLVIDENAENTAKELEAMGVKAKG YASNAANFEDTAKVVEEIHKDFGRIDILVNNAGITRDGLMMRMSEQQWDMVINVNLKSAF NFIHACTPIMMRQKAGSIINMASVVGVHGNAGQANYAASKAGMIALAKSIAQELGSRGIR ANAIAPGFILTDMTAALSDEVRAEWAKKIPLRRGGTPEDVANIATFLASDMSSYVSGQVI QVDGGMNM >gi|225935359|gb|ACGA01000033.1| GENE 18 28340 - 28930 524 196 aa, chain - ## HITS:1 COG:no KEGG:BT_3770 NR:ns ## KEGG: BT_3770 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 355 95.0 5e-97 MTVSKTKAKLVDVARQLFAKMGVENTTMNDIALASKKGRRTLYTYFKSKDEIYLAVVESE LDILSDMMKRVAEKNISPDEKLLEMIYTRLDAVKEVVYRNGTLRAYFFRDIWRVEKVRKK FDAKEVQLFKAVLLEGQAKGVFHIDDVEMTADLIHYCVKGIEVPYIRGHIGAHLDEDTRN RYVSNIVFGALHRTEI >gi|225935359|gb|ACGA01000033.1| GENE 19 29084 - 29320 152 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237714317|ref|ZP_04544798.1| ## NR: gi|237714317|ref|ZP_04544798.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 78 1 78 597 108 65.0 1e-22 MKRQIFITQMQCNFNLRQPKTNRPTNIYLVVYLNNKQVKLSTGVKVYPEHWNIRRQQAYV NARLSKLDNNNNTITNDR >gi|225935359|gb|ACGA01000033.1| GENE 20 29722 - 31926 1466 734 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 25 732 42 766 790 480 37.0 1e-135 MKNFIIGLLFLCGFYACSSGKENISEVDCAQYVNPFIGNADNGHTFPGACAPFGMIQASP ESGNASWRYCSGFNYEDDEIFGFAQNHLNGTGCPDLGDILMLPFSGIIENREYKARIDKS RQTARPGYYAVTLPDFGINAEVTATEHTAFYRYTYQNDEEANLFIDLQSGLVTSPQGIRD RVLFSDMQMPDGRTITGHNETNGWVRRHFYYVIAFDKSYKVKEMLPVREGEKAKRFVLSF DLKRGEALQVKIALSTVSVNGAKASLQKENPDWNFDKMKGETYQRWNTLLQRVRVEGTDK QKTNFYTSLYHLYIQPNNIADTDGRYRGVNDSVSTSPSGEYYSTLSLWDTYRAAHPLYTI LTPERVNGMVNSMLAHHKAYGYLPIWTLWGKENYCMIGNHAIPVIVDAYLKGFDGFDKRE AYKAIKESSTLSHRNSDWEVYNKYGYYPFDIMTAESVSKTLESAYDDYCVAQMAKSLDEM WDFEYFMKRSGYYKNLFDPETKFMRGKDSKGNWRTPFRAMRLFHAGEVGGDFTEGNSWQY TWHVQQDVEGLIELMGGKDAFANKLDSLFTLEANPQEMGEVLDVTGLIGQYAHGNEPSHH VIYLYNYANRPWKTQELVREVFDRFYLDKPNGLCGNDDCGQMSAWYIFSAMGFYPVNPCG GEYVIGAPQLEEVIIDLPDRKQFTVKAINLSETNKYVGTVLLNGKPLKGFILHHSDIMKG GLLEFVMKSQPDKR >gi|225935359|gb|ACGA01000033.1| GENE 21 31971 - 33416 1168 481 aa, chain - ## HITS:1 COG:no KEGG:BT_3171 NR:ns ## KEGG: BT_3171 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 480 20 476 477 421 46.0 1e-116 MKRVIWLALFVCMVLSLWGKIRLPSILGDNMVLQRSDTVNIWGWAAPSQNVTVKPSWDNR IYTTKAENNGKWLLQIQTPEAGGPYQIDISDGELLTLKDILIGEVWICSGQSNMEMPVHG FYGQPVVGSLEEIVEASQYPDIRMFTLPPTPAAEPQDDCRGSWLKSTPESVRDFSAVGYF FGKNLNKVLNIPIGLITPNCGGIAIEPWMTAEAIRETAGINQKLAFTPQVQTEAANASYL FNGMIAPIRNFTGRGFIWYQGESNQHNYFDYDKLQVSMVNLWRKEWKNEDMPFYYVQLVP FPFDGAERISLSLVIEAQYKALQHISNAGIVATTDLGHFTCIHPPRKKEIGQRLAALALR KTYQINGVLPDAPMIDKVVFEGNKAVLTFKGVPDYSPAAVGSLDFYGGELRGFEIAGEDR QFYPAKALLIQGQNRMEVMSEKVSRPVAVRYAFKNYHDANVMTTEGQPLVPFRTDHWDDV Y 
>gi|225935359|gb|ACGA01000033.1| GENE 22 33469 - 34881 916 470 aa, chain - ## HITS:1 COG:CC0572 KEGG:ns NR:ns ## COG: CC0572 COG5434 # Protein_GI_number: 16124826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Caulobacter vibrioides # 28 455 33 496 527 127 26.0 5e-29 MKRSMMQTVFLLVFLLYPLCAGASVIQEKVYDICKYGAKGDGKTLNTNAINSAIEYCAEN GGGTVLIPKGTFLSGTIYLKDNISLVLRKGAVLKGTADVSQYKSYTPLGNLSMFDSGGDG ENANSAIDPHWNRALILGVGVSNISIEGEGIIDGNHVFDSEGEENMRGPHTIIIAESRNI LMRGITLNCASNYAFMAYKIEDMVFHDLEFNEGWDGIHIRGGKNITIRNCRFFTGDDAIA GGYWENMVISDCHINSSCNGVRMIMPATGLTISNCTFAGPGKYPHRTSKEKKRNNMLSAI ILQPGGWGKAPGKIRDIYIHDVSIDNMNNPLMIVLNEGNTGDNILVERMKATRISQAAVS IESWKGGTFGNVTFRDISISYEGNKNPELKNLQVSQPPADSRLLPCWGWYARNVQNMVLE NVELNYRGEDVRPAFWLNNVGKTELIQVNYTGSKNVGAIMQENTGTVIKR >gi|225935359|gb|ACGA01000033.1| GENE 23 34894 - 36258 1171 454 aa, chain - ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 45 406 7 365 422 212 36.0 1e-54 MNTIKIFTLLLFSLLILSSCNDEDEYTWTDIPNSEVKPVTAHNKVIYEVNVYSYSSGHNF KGLESDLPRLKELGIDILWLMPIHPRGEENRAGTLGSPYSVKDYKAINPDYGTSEDFKSL VNTAHAMGMEIWLDWVANHTAWDNVWVAGHLDYYAEKDGERPYAPGGWLDVIQLDHTNAE MRTAMADAMKYWLTEFDIDGFRFDAADFVPLDFWRELRKEVDKVKKVTWLSEGSDPAYME VFDYDYAWDFATALKDFGAENDVPVLIDECKKLFNDAAYKNKGRMVYITNHDLNAYEGSE FDRYGNNVLPLAVLSFTIYDMPLIYNGQEIGMNKSMNFAEPVMVDWNPANKVYVNLYQKL TRLKRTQPALEDGANRGALKIYSTNDESLFAYSRIKGDNEILVLLNFAQVPKRLRFTQES PAGKFKDYLNGGYREFSAGNGISLHENGYAIFVK >gi|225935359|gb|ACGA01000033.1| GENE 24 36287 - 37855 1036 522 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1407 NR:ns ## KEGG: Fjoh_1407 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 160 518 28 363 366 96 26.0 2e-18 MNMKKIQFILTLFCVLALLPACNDDEQGRLTALPSVISDLESTDYTLPKFVEGQNPYLFR MTWTKAKYFSESESPVYVGDVVYEVEVDLAGNEFENPRMIFSTQGLYMDVYEGTLRTILS 
ELAGENKEESQVVGIRIKTTGSGLVVYSEPILLSITPYVPEPSVEAVSGVIAELTEDNYQ LQRPTGEDNPQLFTIGWTATGFYLEGTGTPAPVPPAVEYTLQIDQADNNFASPQTLAVTS LISVNILTREFNNLLIEQLGATPGESMDLQIRLLSRYNEGGVAKEVLSNSISLSATPYVE VDPVRPIYMVGDMNNWNTTNTDFMMFKENSDVRNYVYTFTGYFDGNVSFKIIPEESLGTN KAYCRKEEGTLTYADTQEGKIWIATAGYKTITVNLEEMTYTIVDYDASGATEWEAMSVVG AYCGWNPSNSIAQMTKMNGNPHIWRLKIDMPFVEASDNGVKFVGNNAFGNNYVPVDQWSN PYGVCELNPAGRDVNIVRDEGGDFLFILNTLTGHYVMMKLND >gi|225935359|gb|ACGA01000033.1| GENE 25 37866 - 39029 1059 387 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1407 NR:ns ## KEGG: Fjoh_1407 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 382 1 361 366 107 26.0 6e-22 MKTYKKYIIGILCLLGMAACSEKDAEPVLTRVVPSALDDFPFDDYVLQNPEDEDPLLFTV TWTETLFYLDGSSTPMPIAPVEYTLQVDKAGNNFESPQTLATTSSLVANIYTKELNNLME ESLDAIPGEKMDIELRVLTSYGQNVVREVVSSNTMKLAVTPFKDRDPLQLVYIIGDMNGW NNSNTDGMYVMFKNNSDSKNHVYTYTGYMPENCYYKFLPKESLGTYKAYCFKEDGKLEYV ESDGGALHNATAGYKTITINLKELSYTVEDYDISGAKVWGSIGLIGEFCGWDNEPLMTRF SVDNSHLWKMDLTLPALTGNNTHPVKFRANKSWDSRWAATDPESVPYGKTIFLMGDEYDP NIILREGGDYQVIFNDLTGHYIFRKKE >gi|225935359|gb|ACGA01000033.1| GENE 26 39043 - 40614 1571 523 aa, chain - ## HITS:1 COG:no KEGG:GFO_2139 NR:ns ## KEGG: GFO_2139 # Name: not_defined # Def: SusD/RagB family protein # Organism: G.forsetii # Pathway: not_defined # 8 523 8 533 533 436 43.0 1e-120 MKKLYNLLLLSVCLPMFLSSCLGDLDTVPLTDDKLLPEKAWEDPNAYEQFLAKIYAGLTL SGNEGPFGLPDISASDQGEATFLRSYWNLQQLGTDEVVCAEDNETMRGLQFCQWNSNNNF VALNYTRIYMNVAYANEYLRETTDEKLGNRGVGDELRAKIEGFRAEARVLRAMSYYFLMD LYANVPFIDENVPVGSRKLDQKDRVFFFSWIESELKNVEGKLPPADKDHYGMVNDPTVWM LLAKMYLNAEVYIGEKRYDDCLIYLKKLLGAGFTIDPVYKNMFCADNEKSPEIIFSLVYD GRRATTFGGTTYLIAAACKSDMNPLTTLGFSQAWSNIRAKETLSTLFEESDKRAMFWKQD RTLESSVWYDFTKGWSVIKYSNLKSDGTPGSNSAHADTDFPFFRLADAYLMYAEAVLRGG QGGTKEQALLYLKELRERAGISPVSSVELTLDFILEERSRELYWEGHRRIDLIRFGKFTK NYKWPWKNGVFSGVANIDSKYNIYPLPATELTSNPDLKQNTGY >gi|225935359|gb|ACGA01000033.1| GENE 27 40627 - 43614 2646 995 aa, chain - ## HITS:1 COG:no KEGG:BDI_1558 NR:ns ## KEGG: BDI_1558 # Name: 
not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 16 995 16 989 989 939 51.0 0 MEKQSKIKRITQFFVLVLFVLFSSQALAQQSTIKGLVKDETGVPIIGANVSVKGTTNGVV TDTNGRFILSKVKSGAAIVFSYVTYITKEVTYKGQADMIVTLVEDNKLLDEVVVIGYGQV RKGDATGALTSLKADDKIRGFAPNTQDLLVGKVAGVAITSEGGSPNGASYIRIRGGSSLT ASNDPLIVVDGVFIDNQGLNGAGNILSTINPTDIESFTILKDASATAIYGSRASNGVILI TTKKGTEGKVKITYDGNVSIGTPKKKIDVMTGDEFRVFLKENYSDLSIYEQMAKKQGLVN TDWQDEIFRTTLNTEHNLSVYGTAAKVMPYRVSFGYTNIDGILKTSKTERYTGSFSVTPS LFDDHLKINLNGRGMYVKNRFADWGAIGAAIVMDPSQSVYDKDSPYGGYFTWTGDDDNVI QVATKNPVSMLEMTHDESKVRNFIGNAQLDYKLHFFPDLRLNMNVGIDYSKGEGIKYISE FAPSDYMYGGYDSNWNQKIRNSSFDFYAQYAKDFNFLDSNFDLMGGYSWQHYWRAGDGVG HRITKFDAYGDPVLVTQNNYETEHYIVSFFGRMNYNVKDRYLLTFTLRNDGSSRFHKDNR WALFPSVALAWRMSEEDFIKQLDFVNNLKLRLGWGITGQQDINQGDYPYMATYLHTVGDQ ANYLRGYNNGVPVWVSLLRPEAFNPDLKWESTITYNVGLDYSLFNNRVDGSIDFYHRKTK DLINAETKTTAGTNFKEFVAANIGNLENTGFEFAINGRPVVNRNFTWEVGANFAYNKNKI TKLSVGDDKDTKSVNGMTVHMVGHAANMYYVYEQIYDENGKPIEGLYKDRNNDGQINEQD LRPYNKSSPDVTLGLNTRLNWKAWDLSIAGHGSFGTYNYNAIAANHAGLSPTTIYASESL SNRVKSAFDTNFQIAQPLSDYYVQNASFFRIDNIVLGWSFKNSKRIPFNGRIYGSVQNPF VFTKYKGVDPEVFGGYDGTLYPRPMTFLLGVNLNF >gi|225935359|gb|ACGA01000033.1| GENE 28 43838 - 45508 1130 556 aa, chain - ## HITS:1 COG:no KEGG:BT_3309 NR:ns ## KEGG: BT_3309 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 549 10 544 547 331 39.0 5e-89 MSKFLGIRFVAFIICGLFVASPMQAHLRFDSLLIELDHVIANEKEYKSRKENYLEKLKEQ LRNQNLSRESRYSIYQSLAGEYETFICDSAIVYANRALYEAAELKNTSWMNDSRIQLARG EAKAGMFSKTLDILNSIDRTQLNRHQLIDYYKTYIDVYIYMIEYNDGYDLADLIAKKVVC QDSLIQIVDTTSFEYVTRYGFRCIETGDLAGAERILLSYFPKVKPDTKECAGITSILSFL YEQKGDSEKEKEYLAISAISDIKASVKENISLRALAFILFNNDVDIDRANRYIKKSLDDA NFYNARLRNIQTAKILPIIDKAYQLDREKQQNRLRFLLIIVSLLSIVLLIAILLVVKQMK KLARAKQHIEEINARLNELNTVLQDVNKQLKQSNLSLAESNHIKELFISSFLEICTQYIE KLDAFKGTVNRKVKAGQVADILKLTSKTENSALELKELYANFDHAFLNIYPTFVEEFNTL LRIEERYPVMSDKSLNQELRVFALIKLGIKDINKIAVFLHYTPRTVYNYRSKIKSKALNT DEDFEEKVKLLCSDSF 
>gi|225935359|gb|ACGA01000033.1| GENE 29 46233 - 47030 485 265 aa, chain - ## HITS:1 COG:no KEGG:BT_4180 NR:ns ## KEGG: BT_4180 # Name: not_defined # Def: acetyl xylan esterase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 263 1 266 267 323 61.0 4e-87 MLRSIYLLSFLLLQSLFAFAGNVKESVPSSFDLYVCIGQSNMAGRATLTPEVMDTLQNVY LLNDKGNFEPAVNPLNRYSTVRKDLSMQRLGPAYGFAKEMVRQTKRPVGLVVNARGGSSI NSWLKGSKDGYYEEALSRVRIAMKQGGVLKAILWHQGEADCSNSEAYKQKLISLVKDLRE DLDMPDLPVVVGQISQWNWTKREAGTVPFNQMIKKVSSFIPHSDWVSSKGLGWYKDEKDP HFNTEAQLLLGKRYAKKVLKFYKHQ >gi|225935359|gb|ACGA01000033.1| GENE 30 47048 - 49723 2398 891 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 50 594 36 554 570 177 27.0 8e-44 MKKFLYIFILLFIVGWAQAQQRVVYTINDGWKFAKGSPFEAQLAGCDDSSWETVNIPHTW NNKDADDETPGFYRGPAWYRKQLFVDKSQEGRQAVIYFEGANQEVRFYLNGQFVGEHKGG YTRFCFDITSHLRYGQENLFAIYVNNVYNPNIPPLSADFTFFGGIYRDVYLQFMNPVHIA TNDYASSGVYIRTPEVNNSAASVEITTLLTNNTLQPAEIRVENVICDADGKEVKKTHAEI KLASGETKTDISKKIKIDSPRLWDIDDPYRYMVYTRILDKKKGTLLDEVVNPLGLRWFKF DSEKGFFLNGKWRKLIGTARHQDYFQKGNALRDELHVQDVLLLKEMGGNFLRVSHYPQDP VIMEMCDKLGIVTSVEIPVVNAVTETEEFLQNSVEMAKEMVRQDFNRPSVMIWGYMNEIF LRRPYTEGKQLEDYYRFTEKVARALEATIREEDPSRYTMMAYHNMPQYYEDAHLTEIPMI QGWNLYQGWYEPDINEFQRLLDRAHKVYKGKVLMVTEYGPGVDPGLHSYQPERFDFSQEY GLVYHKHYLREMMKRPFIAGSSLWNLNDFYSESRVDAVPHVNNKGVVGLDREKKDVYWFY KTALSRRPILVIGNREWKNRGGVVNTAQKECIQSVPVFSNAEEVELFVNNKSLGKKKVED NYALFDVPFVGGENLLEAVAVAGDSKLRDMLRIQFQLVGSQLKDEAIPFTEINVMLGSPR YFEDRTANVAWIPEQEYKPGSWGFVGGTSYRRKTGFGSMLGSDIDIHGTDMNPIFQTQRV GIKSFKADVPNGEYSVYLYWAELESDKEREALVYNLGADSEQTFAGNRSFGISMNGTTVL DDFNIARDYGYACAVIKKFVVTVKDGKGLSVDFHKKEGEPILNAIRIYRNY >gi|225935359|gb|ACGA01000033.1| GENE 31 49743 - 51971 1629 742 aa, chain - ## HITS:1 COG:SSO3022 KEGG:ns NR:ns ## COG: SSO3022 COG1501 # Protein_GI_number: 15899728 # Func_class: G Carbohydrate transport and 
metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Sulfolobus solfataricus # 6 737 18 724 731 497 39.0 1e-140 MSAGCVRILSFPEGDTLVTKRLVVDSEKPAFKDYKWEDTGQKLLFRTRELSVVFDKADAA FTFQETSTGKLLLKEKGDRKARNFKRSVAGGEQCLEVTQRFVPTEDEAIYGLGQYQNGIM NYRGKSVLLLQANMDIVNPFLISTNGYGVLWDNYSSTKFDDTKEGYSFTSEVGDASDYYF VYGKNMDEVVVGYRELTGDVPMFGKWVYGFWQSKERYKSFDELKAVVKEYRKRGIPLDNI VQDWEYWGDKPHWNSLTFHPANFNHPRQVIDELHQQDHVHFMLSVWPGFGPETAVYQSLD SIGALFSEPTWAGYKVFDAYNPVARDIFWQYLKKGLYDMGVDAWWMDATEPSFRDGFTQL KQEEKTKSAGNTYLGSFHRYLNTYSLEMLKDFYQRLRAESDQKRIFILTRSAFASQQHYG TAVWSGDVSASWENMHKQLVAGLNLSMSGIPYWTSDTGGFFVTERDAKYPDGLKSNDYKE LYSRWFQFSAFTPIFRAHGTNVPREIWQFGEKGTLSYDNQVKYIHLRYRLLPYIYSTSHQ VTANNYTMLRGLAMDFTTDTRTFDIDNAYMFGPSLLVRPVFHPQSEEKNICIYLPEHSGK YWYDFWTGEAFEGGREQMQTNVLDILPLYVKAGSILPLAEVKQYAMEYPDRELELRIYGG ADASFLWYEDEGDSYRYEEGVCSKVPMQWKDSERTLTIGLREGTYPGMPEQIKMHVKLYL PEGAALDSKECVYTGREVKIKF >gi|225935359|gb|ACGA01000033.1| GENE 32 52109 - 53902 1492 597 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172079|ref|ZP_05758491.1| ## NR: gi|260172079|ref|ZP_05758491.1| hypothetical protein BacD2_09461 [Bacteroides sp. 
D2] # 1 597 1 597 597 1242 100.0 0 MRKLLLFLLVSIALPALAQNGKKVYADFHGVRYTRQHDGKLGRWEMYANTEKSSTGRKSL CYNADLIDSEGRHEIAAVAYPQVGMQSNFDPDYIEYQILSAKAAKIDGFFIEWGFKPHEN DILLREMQKVAAKYDFEIGVNWCDGWLYYDWITKLYPEINTREAKTEYMAKCYQYLVDSV FTGPTAPMVKGMPVFYHFGPGAKVDEYKKVLSLAKLPQGMKQPVALRRWADWGKLENDKY IPVTRSDEMDTWKEVGEIPTAWLPARVRTRDQAHAEWDNYATQDDVIEFMKPFRDSIWHS NNPAYTIKSGFAMPGMDNRGCAGWGRGHFYYIPRNNGETYQSMWKFCMAEKDSLDMMFIA SWSDYTEGHEIEPTIENGDRELRTTLKYVAEFKGEQADERGLTLPLMLFRLRKEARFLEK TKMDVSGCHRSLDKAALLISQGRYPVAIGLLSQIENDVKTAKSALAVEMMRLRDSDIKIQ GKRKSGGYSAEETLSVSLPKELVSRLQMNNYVGYLYFEYLDKGNESLFIRSSTLREPKEP FKIVSRIRTDNTGEWKSAKVELYKDNIVNGFNIPTFYLKGNVVVRNLSLGYTIYTVK >gi|225935359|gb|ACGA01000033.1| GENE 33 53959 - 54945 773 328 aa, chain - ## HITS:1 COG:no KEGG:Phep_2142 NR:ns ## KEGG: Phep_2142 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 39 318 51 324 333 218 43.0 2e-55 MKKISLLALFLGFNIALGACSDDDAAPIVKNESFQLPAKAYVLAEQTKRAIVIRDAETQR NVWSWDPYTARVPSAHQDWFINPSEVKPVFNKRYILMTASGGAVALIRLSDHKLMFYANC GQNPHSAEILPDGNIVTAESKSGEINTFVVDTVKVLGMKANTIKLGNAHNVVWDKERECI YATATIQAGVTALFRMNYNGNRNNPQLTNQTRIYTFDKESGGHDLFPVYGEEDKLWLTAA SAVYKFDISTDVPTCEKVYSIADIKSICNGPDGILMLKPTEEWWAEGLVNEKGEELFKMD GAKIYKGRWMIDNLFSYPEKHDFVLSED >gi|225935359|gb|ACGA01000033.1| GENE 34 55007 - 56545 799 512 aa, chain - ## HITS:1 COG:no KEGG:Acid_0712 NR:ns ## KEGG: Acid_0712 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 71 511 42 461 462 216 32.0 2e-54 MEKHTYRYFCSFCLASLILLFAISCSGSAEEPSEPETPTTPTDEYTYLNVEYRKWQNGIF QSWTTADSRETRIIDNMNRYAPSGDYSRTAWGGRIGLQPSSVTGKEGFFRVAYCGGRAYL LDPDNGAVILHGIQHVRPGESTAHQKAFSTKYGSEVRWSEVTGKMLADNHINYISYGSNR IEAFPVAVRANLLTPKTQKIAYAETLYLLRTFMWDMTKNLGYAFDDDKYNRLVLLFEPTF AAYIDNLVQEKSALFAGDKHFVGYYLDNELPFASYQNGDPVKGIDLKHFLSLPDRYKAAR IYAEKFMQERGIASSAAISKKDQEDFRGVVSDYYYQLATTTVRRHDAEHLILGTRLHDWS KYNQKVVEACARYCDVVSVNYYGRWQPETDFLANLKAWCGAKPFLVSEFYTKAEDASYQG VKYANTEGGGWLVRTQKNRGEFHQNFCLRLLETQNCVGWVHFEYNDGYASDGGASNKGIV 
SLEYEPYESFLSYVRQLNLAVYPLIDYYDAKQ >gi|225935359|gb|ACGA01000033.1| GENE 35 56660 - 58201 1212 513 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172082|ref|ZP_05758494.1| ## NR: gi|260172082|ref|ZP_05758494.1| hypothetical protein BacD2_09476 [Bacteroides sp. D2] # 1 513 1 513 513 926 100.0 0 MIRKLYTILLIGLCLNLVACGDDNEGIDPNASAPVIKFPMEQLDVDLNKVDNLPVVAVIK SQAGLQSVTMRIQTVEGTVEYKTVTDFFNPNSYSLSENLEYNANYQAFIIEATDKLNHAI TGTLPISVTDVVERPVITFDPEEIIYDEMDENPTIPRTTFKITSEAGLKTVEMYLVSASG QEAKETINLSGEKEYTFDEMINYKEGDRGFKVKAEDTYGYITISTLPVTYKTIPGPSLTL TESTIFAGTDAKKGVSMQIESVRGVHEVVIYRIENGSEVEALRETRSGEHTLNYAPEIDF TEATSKLKVVVSDGREGKEATGYVKAYVNMDVATLNVGSQPLANNAHEKYPDAFGMVSLN DLKTYSVDYAIANEANAKNIDFKFYCFGAAGSPRLYSMDNTGKDGEFSGSTGKLSAIKVK NLTRFAILSNFDYENATVTSISSEILSSSITQSLLDPIAVGNIIAFRTGGSSAAGGGRIG VMKVINITEPKDLVSSNATARVMTVEIKFPQKK >gi|225935359|gb|ACGA01000033.1| GENE 36 58229 - 58837 482 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172083|ref|ZP_05758495.1| ## NR: gi|260172083|ref|ZP_05758495.1| hypothetical protein BacD2_09481 [Bacteroides sp. 
D2] # 1 202 13 214 214 405 100.0 1e-112 MKKTIYRTLFFCLSVIALAGCDLELQKNYDYEASVDDPHVNVTAWEYFQDHQDVFSEFTT AIEYTGLKDYYTQTENKYTYLALNNTAMQSYRENAFPGIASITDCDKETVKNMLLYHIVD GEYSSYGQLQVEAMFVLTMLPGEKGLMTMSVWKNPWQAAVGKILVNETGSNGKSPQRNAK TSNILPTNGVIHIFEKYCYYQK >gi|225935359|gb|ACGA01000033.1| GENE 37 58851 - 60311 1510 486 aa, chain - ## HITS:1 COG:no KEGG:Phep_0446 NR:ns ## KEGG: Phep_0446 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 20 445 23 469 530 207 32.0 9e-52 MKNIKYFVGAICLALSLNSCSDFLNEEPVSEIPAGDMWQTARDAKAGINEIYGLLRSTLR ENYFYWGEFRSDNVAPGAPVMADQARVINNLMSTDEKCAKWTTLYQMINQTNLAIKYVPN ISMPDVADRNDYLGQAYALRALAYFYAIRVWGDVPLFIEPTEKYSEAIYKERTDKNYIID NVILPDLKKAESLINRNKNYERKRISICGVWAIMADVYMWTEQYNLADQTIDKMADIKSS KGRLVDFEPSIQTWHTMFTEELNNKPSDDTPENDEYSTKEFIFLIHFNMDEVGTNGYSYM YQWFSGSGNRAAVLSDKLMSVFNEPDMQGDLRKAYTVKDYQNGNELRKYMAGDISNSLNK TCEVAYPIYRYTDMLLLQAEARARLGKWEEALDLVKKVRDRAGLVTPTALSFASEDEVIN YILRERQVELVGEGRRWFDLLRTGKWKEVMKPINGMSQDGNELFPIHYSHILENPKITQN AYYGNN >gi|225935359|gb|ACGA01000033.1| GENE 38 60325 - 63462 2900 1045 aa, chain - ## HITS:1 COG:no KEGG:Phep_0445 NR:ns ## KEGG: Phep_0445 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 11 1045 3 1043 1043 696 39.0 0 MKNLQSRTKVRLLQSFLLISLSLFFLAHPAYAQGGFAVKGVVVDKTGFPLPGANVMEKGT SNGTITDLDGNFSLNVSKKGVTLTVSFMGYTPKDVVVKDKTTNITLEENSKLLDEVVVVG YGTMKKRDVTGAITSISSEAIEQKMATNVFEALQGTTAGVQVVSGSGQPGESSSIKIRGT STFSAEGVTPLYIVDGVPLESIDGINTNDITSMEILKDAASAAIYGSRSANGVIIITTKS GQEGKARIDIKYNHSWGTLSHKVPQANRKERLLYDQYRKEYFETYGGGNPDESIDILNDP LNSFFNVDNDYLDMITSTAQKDQVDISVGGGTKKLKYFINTGYYNEKGIISNTGFQRLNT RINSDYSPTDWMNMGSRISLTYSKKKGLDEGTLLSAVLTRRPYFNTYYPDGSLVGVFNGQ KNPIAQINYTTDFTDSYKANFFQFFEIKFNKYLKFRANINANFYLDKRKKLEPSLITDEW QKQNRGYSYNYLNWNWMNEDFITYARKIKGHNFSAMVGVSAQQWRYENETFVGLNSSTDF IYTMNAFAANLDLSSTGSTLSNHSMASVFARVTYDYLGKYLFTANIRRDGSSRFAKENKW GNFPSVSVGWRFSDEKFMKFSKKFLEDGKIRVSYGITGNEAIGNYDYIYSYSPNSIYDGV 
GGVIPTRIGKDNLKWEETKQFNLGLDLNFWNSRLTITADYYDKYTDGLLANYQLPKESGF AYMKTNVGEMSNRGFEIAVTGDIIRTKDWKWNASFNISRNINRIEKLSEGKAYMEGDIWW MQEGGRVGDFYGFKSAGIFAYDESNAFTDKWEQLTPVFENGVFQYKYLLNGKEYAGDIRQ KTLPNGKPFRGGDYNWEEPEGTRDGVIDDNDRMVIGNAMPDVTGGLNTTVTWKNVSLYLG FYYSLGGQIYNAAEHNRNMFKYTGTTPSPEVIHNMWLHPGDQAIYPRPYNDDYNNARMGN SFYLEDASFIRLQNVRIAYDLPENWIKKLMLKNINIYAFVNNALTWTNYSGFDPEFSTSN PLQVGKDSYRYPRKREYGIGFSANF >gi|225935359|gb|ACGA01000033.1| GENE 39 63798 - 67823 2758 1341 aa, chain - ## HITS:1 COG:VCA0709_1 KEGG:ns NR:ns ## COG: VCA0709_1 COG0642 # Protein_GI_number: 15601465 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 806 1035 494 725 738 149 36.0 4e-35 MREQRLSKIHFFILLSLFGFVFPLRSQEYKFRTLGVEDGLSQITVSDICQDEKSRIWIAT LDGLNCFDGNHIKVFNHFHNDSISYGNLYVTQMVEDGQGSLFLLASTGLFQFDLETEKYY ILPVASPTTLAKGKTGVWIAEGGKLFLYDKNTRLLKPMYADLQLPDSGPTMVEGSEGSLW IALKESGVMRVDTCGSMSLHLPGIKVMKLIKSNDQNIWIGSQDQGVFCFSPQGTIIHHYE YNDKSVYTVRDDMARALCQDLEGNIWVGYRSGLSKIEVATGKIFHYQADPNRVGAMSNRS VTSLYTDKQGTVWVGTYWGGVNFFSPEYQHFVHYHASDTGLSFPVVGAMAEDKAGNIWIC TEGGGLDLYQPEQGKFKHFNAHTGYHFSTDYLKDVVFDEANNCLWIAADFTNKVNCFHLN NYRNDIYNLEPLGEESVGEALFALADTPRKLYVGTTSAIVSLDKQTLKSEVLFHQKELFT HNYNTLLLDSKNRLWFAADDGCVAYVIDEKRFETYRISLKKRVRSQKELVNVIYEDRKGN IWVGTHGNGLFLLDKKERLFRLHTPESVLSGENIRVLGETPSGNLLIGTGHGLSMLEQKE GKVINFNSKTGFPLTLVNRKSMHVSRNHDIYMGGATGLVTIRESSLYYPPKIYDLELAHL YVNNKEITTGDQTGILNKSFAYTGRIKLNYLQNVFSIGFSTDNFLHIGGGEVEYRLMGYN DEWSENRLGNDITYTNISPGDYVFEIRLKNFPEVIRSLDITITPPFYATWWAYTIYVCVI LTILFFIVREYRIRLFLKTSLDFELREKQYIEEMNQSKLRFFTNISHEIRTPITLILGQV DLLLNSGKLSTYAYSKLLNIHKNAGNLKSLITELLDFRKQEQGLLKLKVSQFDLYSLLKE HYVLFKELAANRNISFVLHADCEQCLVWGDRMQIQKVVNNLLSNAFKYTSDGGTIRMELA DGTEECMFSVSDNGAGISEEDYVKIFERFYQVENIGQYGGTGIGLALSQGIVKAHQGDIT VESQLGKGSCFKVTLKKGDAHFDSSVARIEPEQDKEYIYYSEDKELLVKEVQSAQSENGT TDCKLLVIDDNEEIRNILVDIFSPLYTVETASDGKEGYEKVKVMQPDLVISDIMMPGMPG TELCAKIKNNIETCHIPVVLLTALSAPERELEGLRIGADIYVVKPFNMRRLVMQCNNLIN 
TRRLLQNKYAHQLDSKAEKIATNELDQKFIEQATQVVEDNMENPEFSVDAFSREMGVGRT VLFQKIKGITGSTPNNFIMNLRLKKAAYFLLNAPEMNISDIAYRLGFGNPQYFNKCFKEL FDIAPTQYRKSHDAPSDPSVK >gi|225935359|gb|ACGA01000033.1| GENE 40 67934 - 68833 616 299 aa, chain - ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 11 293 8 287 295 154 30.0 2e-37 MIEDNSLGLEFKYLIVNDMDRKFGLWVNTVGYQSIPPDSPYPLKEHPSGYFFNAEKGRVL REYQLVYITKGRGLFSSDSTSEKQVCKGRLMVLFPGQWHTYRPLRQTGWTEYYIGFEGPM IDAIVNDAFLSQEQQILEIGINEELVSLFSRALAVAEADKISAQQYLSGIVLHMIGMILS VSKNKVFEMSDVDQKIEQAKIIMNENVSGNVDPEELAMRLNISYSWFRRVFKEYTGYAPA KYFQELKLRKAKQMLVGTSQSVKEISFFLGFQSTEYFFSFFKKRTGLTPLEYRSFGREE Prediction of potential genes in microbial genomes Time: Fri May 13 08:27:42 2011 Seq name: gi|225935358|gb|ACGA01000034.1| Bacteroides sp. D2 cont1.34, whole genome shotgun sequence Length of sequence - 124815 bp Number of predicted genes - 90, with homology - 89 Number of transcription units - 41, operones - 23 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 2 - 1156 1242 ## COG1454 Alcohol dehydrogenase, class IV 2 1 Op 2 . - CDS 1245 - 2054 992 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Prom 2075 - 2134 7.3 3 2 Op 1 . - CDS 2148 - 3167 1050 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 4 2 Op 2 6/0.000 - CDS 3171 - 4427 1361 ## COG4806 L-rhamnose isomerase 5 2 Op 3 . - CDS 4474 - 5934 1479 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 6154 - 6213 5.4 + Prom 6091 - 6150 5.6 6 3 Op 1 . + CDS 6171 - 6644 564 ## COG1438 Arginine repressor 7 3 Op 2 . + CDS 6672 - 7250 566 ## BT_3761 hypothetical protein 8 3 Op 3 . + CDS 7264 - 8472 1576 ## COG0137 Argininosuccinate synthase 9 3 Op 4 1/0.000 + CDS 8469 - 9437 788 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 10 3 Op 5 . 
+ CDS 9450 - 10571 1108 ## COG4992 Ornithine/acetylornithine aminotransferase + Prom 10623 - 10682 3.6 11 4 Op 1 . + CDS 10708 - 11481 883 ## COG0345 Pyrroline-5-carboxylate reductase 12 4 Op 2 1/0.000 + CDS 11521 - 12075 628 ## COG1396 Predicted transcriptional regulators 13 4 Op 3 . + CDS 12083 - 13738 1703 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases + Term 13754 - 13807 12.7 + Prom 14020 - 14079 6.7 14 5 Op 1 . + CDS 14113 - 14769 413 ## gi|260172101|ref|ZP_05758513.1| hypothetical protein BacD2_09573 15 5 Op 2 . + CDS 14819 - 15712 240 ## gi|260172102|ref|ZP_05758514.1| hypothetical protein BacD2_09578 + Prom 15843 - 15902 8.2 16 6 Op 1 . + CDS 15922 - 16860 363 ## gi|260172103|ref|ZP_05758515.1| hypothetical protein BacD2_09583 + Term 16883 - 16940 8.1 + Prom 16994 - 17053 6.0 17 6 Op 2 . + CDS 17131 - 17331 98 ## BT_3747 hypothetical protein + Term 17439 - 17491 13.4 - Term 17429 - 17477 11.0 18 7 Tu 1 . - CDS 17509 - 17943 294 ## BT_3564 hypothetical protein - Prom 17971 - 18030 3.3 + Prom 18736 - 18795 2.0 19 8 Tu 1 . + CDS 18919 - 21003 1392 ## COG5545 Predicted P-loop ATPase and inactivated derivatives + Term 21017 - 21065 12.3 - Term 21005 - 21053 10.2 20 9 Op 1 1/0.000 - CDS 21289 - 21801 242 ## COG1472 Beta-glucosidase-related glycosidases 21 9 Op 2 . - CDS 21713 - 22198 392 ## COG1472 Beta-glucosidase-related glycosidases - Prom 22225 - 22284 1.8 - Term 25818 - 25847 1.4 22 10 Op 1 4/0.000 - CDS 25857 - 26189 226 ## COG3512 Uncharacterized protein conserved in bacteria 23 10 Op 2 . - CDS 26189 - 27121 638 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 24 10 Op 3 . - CDS 27114 - 28139 850 ## COG3943 Virulence protein 25 10 Op 4 . - CDS 28155 - 32465 2984 ## COG3513 Uncharacterized protein conserved in bacteria - Prom 32498 - 32557 7.0 - Term 32700 - 32762 13.2 26 11 Op 1 . - CDS 32800 - 35151 1609 ## COG1472 Beta-glucosidase-related glycosidases - Prom 35172 - 35231 1.8 27 11 Op 2 . 
- CDS 35278 - 36492 1059 ## BT_3986 putative patatin-like protein 28 11 Op 3 . - CDS 36504 - 37481 914 ## BT_4711 hypothetical protein 29 11 Op 4 . - CDS 37501 - 37938 412 ## BT_4709 glycosyl hydrolase 30 11 Op 5 . - CDS 37794 - 38513 477 ## BT_4709 glycosyl hydrolase 31 11 Op 6 . - CDS 38556 - 40157 1308 ## BT_4708 hypothetical protein 32 11 Op 7 . - CDS 40170 - 43532 3042 ## BT_4707 hypothetical protein - Prom 43559 - 43618 3.1 33 12 Tu 1 . - CDS 43717 - 44688 737 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 44735 - 44794 6.8 - Term 44737 - 44775 3.8 34 13 Tu 1 . - CDS 44838 - 45380 325 ## BT_3748 RNA polymerase ECF-type sigma factor - Prom 45428 - 45487 6.0 35 14 Tu 1 . - CDS 45511 - 46992 821 ## BDI_1707 hypothetical protein - Prom 47012 - 47071 2.6 - Term 47123 - 47174 13.1 36 15 Op 1 . - CDS 47280 - 48410 726 ## BT_4407 hypothetical protein 37 15 Op 2 . - CDS 48443 - 49558 1023 ## BT_4406 hypothetical protein 38 15 Op 3 . - CDS 49626 - 51278 1477 ## BT_4405 hypothetical protein 39 15 Op 4 . - CDS 51294 - 54629 2854 ## BT_4404 hypothetical protein - Prom 54651 - 54710 3.9 40 16 Op 1 6/0.000 - CDS 54793 - 55779 609 ## COG3712 Fe2+-dicitrate sensor, membrane component 41 16 Op 2 . - CDS 55837 - 56391 391 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 56490 - 56549 10.6 - TRNA 56670 - 56756 61.5 # Leu TAA 0 0 - TRNA 56773 - 56845 84.5 # Gly GCC 0 0 42 17 Op 1 . - CDS 56939 - 58279 1319 ## COG0165 Argininosuccinate lyase - Prom 58300 - 58359 3.8 43 17 Op 2 . - CDS 58367 - 58777 356 ## BT_3732 hypothetical protein - Prom 58809 - 58868 3.0 - Term 58787 - 58845 4.0 44 18 Op 1 . - CDS 58873 - 59511 659 ## COG0461 Orotate phosphoribosyltransferase 45 18 Op 2 . - CDS 59590 - 60072 525 ## BT_3730 putative regulatory protein 46 18 Op 3 . - CDS 60069 - 60905 323 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase + Prom 60679 - 60738 5.4 47 19 Tu 1 . 
+ CDS 60942 - 61997 448 ## COG0117 Pyrimidine deaminase + Prom 62004 - 62063 6.5 48 20 Op 1 . + CDS 62140 - 63618 682 ## BF0506 hypothetical protein 49 20 Op 2 1/0.000 + CDS 63644 - 64378 372 ## COG0020 Undecaprenyl pyrophosphate synthase 50 20 Op 3 . + CDS 64409 - 67063 2787 ## COG4775 Outer membrane protein/protective antigen OMA87 51 20 Op 4 . + CDS 67087 - 67602 533 ## BT_3724 cationic outer membrane protein precursor 52 20 Op 5 . + CDS 67659 - 68174 618 ## BF0502 putative outer membrane protein OmpH + Term 68205 - 68254 15.6 + Prom 68204 - 68263 4.0 53 21 Op 1 . + CDS 68283 - 69125 629 ## COG0796 Glutamate racemase 54 21 Op 2 . + CDS 69202 - 69435 293 ## BT_3721 hypothetical protein 55 22 Tu 1 . - CDS 69404 - 70552 922 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 70661 - 70720 4.1 + Prom 70560 - 70619 2.5 56 23 Tu 1 . + CDS 70658 - 71740 812 ## COG0263 Glutamate 5-kinase + Term 71887 - 71934 8.2 + Prom 72056 - 72115 7.3 57 24 Op 1 . + CDS 72209 - 73234 438 ## gi|260172144|ref|ZP_05758556.1| hypothetical protein BacD2_09798 58 24 Op 2 . + CDS 73269 - 74315 777 ## gi|160885535|ref|ZP_02066538.1| hypothetical protein BACOVA_03535 + Term 74371 - 74423 11.1 - Term 74037 - 74074 2.4 59 25 Tu 1 . - CDS 74195 - 74455 70 ## - Prom 74543 - 74602 5.1 + Prom 74370 - 74429 6.3 60 26 Op 1 . + CDS 74541 - 75797 1002 ## COG0014 Gamma-glutamyl phosphate reductase 61 26 Op 2 . + CDS 75831 - 76787 849 ## COG0078 Ornithine carbamoyltransferase + Term 76798 - 76835 6.6 - Term 76790 - 76819 1.4 62 27 Op 1 . - CDS 76821 - 77213 285 ## COG0607 Rhodanese-related sulfurtransferase 63 27 Op 2 . - CDS 77228 - 77929 503 ## BT_3715 hypothetical protein 64 27 Op 3 . - CDS 77926 - 79074 972 ## BT_3714 hypothetical protein - Prom 79105 - 79164 5.2 - Term 79101 - 79160 1.0 65 28 Op 1 . - CDS 79175 - 80149 1007 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 66 28 Op 2 . 
- CDS 80146 - 81219 1061 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 67 28 Op 3 . - CDS 81246 - 81899 667 ## BT_3711 hypothetical protein - Prom 82136 - 82195 5.1 - Term 82197 - 82252 18.2 68 29 Tu 1 . - CDS 82270 - 82428 266 ## PROTEIN SUPPORTED gi|160885524|ref|ZP_02066527.1| hypothetical protein BACOVA_03524 - Prom 82448 - 82507 4.7 + Prom 82539 - 82598 3.7 69 30 Tu 1 . + CDS 82621 - 83187 765 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 83202 - 83253 16.9 - Term 83186 - 83243 16.1 70 31 Op 1 . - CDS 83258 - 84658 1542 ## COG1785 Alkaline phosphatase - Prom 84747 - 84806 3.4 71 31 Op 2 . - CDS 84820 - 86415 1417 ## BT_3705 regulatory protein SusR + Prom 86476 - 86535 7.6 72 32 Tu 1 . + CDS 86660 - 88513 1352 ## COG0366 Glycosidases + Prom 88519 - 88578 6.9 73 33 Tu 1 . + CDS 88711 - 90927 1928 ## BT_3703 alpha-glucosidase SusB + Prom 90948 - 91007 2.7 74 34 Op 1 . + CDS 91033 - 94080 3326 ## BT_3702 SusC, outer membrane protein involved in starch binding 75 34 Op 2 . + CDS 94110 - 95720 1375 ## BT_3701 SusD, outer membrane protein 76 34 Op 3 . + CDS 95756 - 96928 949 ## BT_3700 outer membrane protein SusE 77 34 Op 4 . + CDS 96954 - 98435 1128 ## BT_3699 outer membrane protein SusF + Term 98493 - 98528 1.0 + Prom 98477 - 98536 2.3 78 34 Op 5 . + CDS 98599 - 100875 1473 ## COG0296 1,4-alpha-glucan branching enzyme + Term 100942 - 100996 9.2 79 35 Op 1 . - CDS 101041 - 101805 556 ## COG2908 Uncharacterized protein conserved in bacteria 80 35 Op 2 . - CDS 101878 - 102189 352 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme - Prom 102209 - 102268 5.4 - Term 102232 - 102275 10.1 81 36 Tu 1 . - CDS 102350 - 104989 2318 ## COG3250 Beta-galactosidase/beta-glucuronidase - Term 105072 - 105130 5.7 82 37 Op 1 2/0.000 - CDS 105155 - 107194 1454 ## COG3210 Large exoproteins involved in heme utilization or adhesion 83 37 Op 2 . 
- CDS 107222 - 109417 1832 ## COG3210 Large exoproteins involved in heme utilization or adhesion 84 37 Op 3 . - CDS 109465 - 111417 1665 ## BT_3176 hypothetical protein 85 37 Op 4 . - CDS 111447 - 113237 1572 ## BT_3175 hypothetical protein 86 37 Op 5 . - CDS 113248 - 116574 3178 ## BT_1631 hypothetical protein + Prom 116545 - 116604 5.7 87 38 Tu 1 . + CDS 116832 - 120767 2344 ## COG0642 Signal transduction histidine kinase - Term 120735 - 120781 0.5 88 39 Tu 1 . - CDS 120801 - 121496 470 ## COG2003 DNA repair proteins - Prom 121651 - 121710 6.2 + Prom 121558 - 121617 5.0 89 40 Tu 1 . + CDS 121641 - 123656 1283 ## COG3525 N-acetyl-beta-hexosaminidase + Term 123704 - 123743 7.7 - Term 123691 - 123731 9.1 90 41 Tu 1 . - CDS 123780 - 124769 638 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|225935358|gb|ACGA01000034.1| GENE 1 2 - 1156 1242 384 aa, chain - ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 2 384 4 383 383 415 56.0 1e-116 MNRIILNETSYFGAGCRSVIAVEAARRGFKKAFFVTDKDLIKFGVAAEIIKVFDENQIPY ELYSDVKANPTIANVQNGVAAYKASGADFIVALGGGSSIDTAKGIGIVVNNPDFADVKSL EGVADTKHKAVPTFALPTTAGTAAEVTINYVIIDEDARKKMVCVDPNDIPAVAIVDPELM YSMPKGLTAATGMDALTHAIESYITPGAWVMSDMFELKAIEMIAQNLKAAVDNGKDVAAR EAMSQAQYIAGMGFSNVGLGIVHSMAHPLGAFYDTPHGVANALLLPYVMEYNAESPAAPK YIHIAKAMGVDTTGMSEAEGVKAAIEAVRALSISINIPQKLHEINVKEEDIPALAVAAFN DVCTGGNPRPTSVADIEALYHKAF >gi|225935358|gb|ACGA01000034.1| GENE 2 1245 - 2054 992 269 aa, chain - ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 26 265 21 259 274 127 33.0 3e-29 MKSILENRPGLAKEVNKVAEVAGYLWQKGWAERNGGNITVNITEFVDDEIRQMKPISEVK 
PIGVTLPYLKGCYFYCKGTNKRMRDLARWPMENGSVIRILDDCASYVIIADEAVAPTSEL PSHLSVHNDLLSKNSPYKASVHTHPIELIAMTHCPKFLEKDVATNLLWSMIPETKAFCPR GLGIIPYKLPSSVELAEATIKELQDYDVVMWEKHGVFAVDCDAMQAFDQIDVLNKSALIY IAAKNMGFEPDGMSQEQMKEMTVAFNLPK >gi|225935358|gb|ACGA01000034.1| GENE 3 2148 - 3167 1050 339 aa, chain - ## HITS:1 COG:YPO0334 KEGG:ns NR:ns ## COG: YPO0334 COG0697 # Protein_GI_number: 16120671 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 3 332 5 334 344 165 32.0 1e-40 MDILIGLLIIAIGSFCQSSSYVPIKKVKEWSWESFWLIQGVFAWLVFPFLGSLLGVPQES SLFDLWGAGGAGMSIFYGILWGVGGLTFGLSMRYLGVALGQSISLGTCAGFGTLLPALFA GTNLFEGNGLILLLGVCITLAGIAVIGYAGSLRAQNMSEEEKRAAVKDFALTKGLLVALL AGVMSACFALGLDAGTPIKEAALAGGVEGLYAGLPVIFLVTFGGFLTNAVYCLQQNVANK SMGDYAKGKVWGNNLVFCALAGVLWYMQFFGLEMGKSFLTESPVLLAFSWCILMALNVTF SNVWGIILKEWKGVSNKTITVLIAGLIVLIFSLVFPNLF >gi|225935358|gb|ACGA01000034.1| GENE 4 3171 - 4427 1361 418 aa, chain - ## HITS:1 COG:STM4046 KEGG:ns NR:ns ## COG: STM4046 COG4806 # Protein_GI_number: 16767312 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Salmonella typhimurium LT2 # 7 418 5 418 419 494 55.0 1e-139 MKKEELIQKAYEIAVERYAAVGVDTEKVLKTMQDFHLSLHCWQADDVAGFEVQAGSLTGG IQATGNYPGKARNIDELRADILKAASYIPGTHRLNLHEIYGDFQGKVVDRDQVEPEHFKS WIEWGKEHNMKLDFNSTSFSHPKSGDLSLSNPDEGIRQFWIEHTKRCRAVAEEMGKAQGD PCIMNLWVHDGSKDITVNRMKYRALLKDSLDQIFATEYKNMKDCIESKVFGIGLESYTVG SNDFYIGYGASRNKMITLDTGHFHPTESVADKVSSLLLYVPELMLHVSRPVRWDSDHVTI MDDPTMELFSEIVRCGALDRVHYGLDYFDASINRIGAYVIGSRAAQKCMTRALLEPIAKL REYEANGQGFQRLALLEEEKALPWNAVWDMFCLKNNVPVGEDFIAEIEKYEAEVTSKR >gi|225935358|gb|ACGA01000034.1| GENE 5 4474 - 5934 1479 486 aa, chain - ## HITS:1 COG:BS_yulC KEGG:ns NR:ns ## COG: BS_yulC COG1070 # Protein_GI_number: 16080172 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) 
kinases # Organism: Bacillus subtilis # 5 471 3 467 485 394 42.0 1e-109 MKQNFFAVDLGATSGRTILGTFIEGGLNLEEINRFPNHLIEVGGHFYWDIYALYRHIIDG LKLVAHRGESIASIGIDTWGVDFVCVGKDGNLLRQPYAYRDPHTVGAPEALFSRISRSKV YGKTGIQIMNFNSLFQLDTLRRNHDSALEAADKILFMPDALSYMLTGEMVTEYTIASTAQ LVNAQTRRLEPELLKAVGLSEKNFGRFVFPGEKVGVLTEEVQKITGLGAIPVIAVAGHDT GSAVAAVPALDRNFAYLSSGTWSLMGVETDAPVINAETEALNFTNEGGVEGTIRLLKNIC GMWLLERCRLNWGDTSYPELISEADACEPFRSLINPDDDCFANPADMEKAITEYCRATGQ SVPEKRGQVVRCIFESLALRYRQVLENLRSLSPRPIETLHVIGGGSRNDLLNQFTANAIG IPVVAGPSEATAIGNVMIQAMAAGEATDVAGMRQLINRSIPLKTYQPQDTEVWDAAYIHF KNCVRK >gi|225935358|gb|ACGA01000034.1| GENE 6 6171 - 6644 564 157 aa, chain + ## HITS:1 COG:BH2777 KEGG:ns NR:ns ## COG: BH2777 COG1438 # Protein_GI_number: 15615340 # Func_class: K Transcription # Function: Arginine repressor # Organism: Bacillus halodurans # 4 138 3 134 149 94 36.0 7e-20 MKKKANRLDAIKMIISSKEVGSQEELLQELNREGFELTQATLSRDLKQLKVAKAASMNGK YVYVLPNNIMYKRSTDQSAGEMLRNNGFISLQFSGNIAVIRTRPGYASSMAYDIDNNEFS EILGTIAGDDTIMLVLREGVAISKIRQLLSLIIPNIE >gi|225935358|gb|ACGA01000034.1| GENE 7 6672 - 7250 566 192 aa, chain + ## HITS:1 COG:no KEGG:BT_3761 NR:ns ## KEGG: BT_3761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 397 98.0 1e-109 MDTQQIDVMVADASHEVYVDTILETIRNAAAVRGTGIAERTHEYVATKMKEGKAIIALCG DTFAGFTYIESWGNKQYVATSGLIVHPDFRGLGLAKRIKQASFQLARLRWPKAKIFSLTS GAAVMKMNTELGYVPVTFNELTDDEAFWKGCEGCINHDILVAKNRKFCICTAMLYDPTEP RNIKKEQERNNI >gi|225935358|gb|ACGA01000034.1| GENE 8 7264 - 8472 1576 402 aa, chain + ## HITS:1 COG:XF0999 KEGG:ns NR:ns ## COG: XF0999 COG0137 # Protein_GI_number: 15837601 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Xylella fastidiosa 9a5c # 5 385 3 384 401 204 34.0 2e-52 MEEKKKKVVVAFSGGLDTSFTVMYLAKEKGYEVYAACANTGGFSEEQLKTNEENAYKLGA VKYVTLDVTQEYYEKSLKYMVFGNVLRNGTYPISVSSERIFQALAIARYANEIGADAIAH GSTGAGNDQIRFDMTFLVLAPNVEIITLTRDMALSRQEEIDYLNKHGFSADFTKLKYSYN 
VGLWGTSICGGEILDSAQGLPETAYLKHVEKEGSEQLRLTFEKGELKAVNDEKFDDPIKA IQKVEEIGAAYGIGRDMHVGDTIIGIKGRVGFEAAAPMLIIGAHRFLEKYTLSKWQQYWK DQVANWYGMFLHESQYLEPVMRDIEAMLQESQRNVNGTAILELRPLSFSTVGVESEDDLV KTKFGEYGEMQKGWTAEDAKGFIKVTSTPLRVYYNNHKDEEI >gi|225935358|gb|ACGA01000034.1| GENE 9 8469 - 9437 788 322 aa, chain + ## HITS:1 COG:AF2071 KEGG:ns NR:ns ## COG: AF2071 COG0002 # Protein_GI_number: 11499653 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Archaeoglobus fulgidus # 2 319 1 329 332 215 38.0 1e-55 MIKAGIIGGAGYTAGELIRLLLNHPETEIVFINSSSNAGNRITDVHEGLYGETDLRFTDQ LPLDEIDVLFFCTAHGDTKKFMESHNIPEDLKIIDLSMDYRIMSDDHDFIYGLPELNRRA TCTAKHVANPGCFATCIQLGLLPLAKNLMLTDDVMVNAITGSTGAGVKPGATSHFSWRNN NMSVYKAFEHQHVPEIKQSLKQLQNSFDAEIDFIPYRGDFARGIFATLVVKTKVALEEIV RMYEEYYAKDSFVHIVDKNIDLKQVVNTNKCLIHLEKHGDKLLIISCIDNLLKGASGQAV HNMNLMFNLEETVGLRLKPSAF >gi|225935358|gb|ACGA01000034.1| GENE 10 9450 - 10571 1108 373 aa, chain + ## HITS:1 COG:BH2897 KEGG:ns NR:ns ## COG: BH2897 COG4992 # Protein_GI_number: 15615460 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Bacillus halodurans # 3 373 4 377 384 252 39.0 9e-67 MKLFDVYPLYNINIVKGDGCKVWDENGTEYLDLYGGHAVISIGHAHPHYTAMISNQVAKL GFYSNSVINKLQQEVAERLGKISGYEDYSLFLINSGAEANENALKLASFHNGRTKIVSFN KAFHGRTSLAVEATHNPSIIAPINNNGHVTYLPLNDVEAMKQELSKGDTCAVIIEGIQGV GGIKIPTTEFMQELRKACSETGTILILDEIQSGYGRSGKFFAHQYNDIKPDIITVAKGIG NGFPMAGVLISPMFKPVYGQLGTTFGGNHLACSAALAVMDVIEQENLVENAKAVGDYLLE ELKKFPQIKEVRGRGLMIGLEFEEPIKELRSRLIYDEHVFTGASGTNVLRLLPPLCLSME EAKEFLARFKKVL >gi|225935358|gb|ACGA01000034.1| GENE 11 10708 - 11481 883 257 aa, chain + ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 2 256 3 259 266 137 35.0 1e-32 MKIAIIGAGNMGGSIARGLAKGSLIDDSDIIVSNPSAGKLEKLKKEFPGISTTNNNLEAA 
TGADIVILAVKPWFMESVMRELKLKSKQILISVAAGISFEELAHYVVAPEMPMFRLIPNT AISELESMTLVAARNTNDEQDNFILRLFSEMGTVMLIPEDKIAAATALASCGIAYVLKYI QAAMQAGIEMGLRPKDAMQMIAQSLKGAAALIQNNDTHPSVEIDKVTTPGGITIKGINEL EHNGFTSAIIKAMKASK >gi|225935358|gb|ACGA01000034.1| GENE 12 11521 - 12075 628 184 aa, chain + ## HITS:1 COG:MTH700 KEGG:ns NR:ns ## COG: MTH700 COG1396 # Protein_GI_number: 15678727 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 1 184 1 182 182 166 49.0 3e-41 MNEQIKQIAERLRGLRDVLELTAEDIARDSDISAEEYRLAETGDYDISVSMLQKIARTYN VALDALMFGEEPKMSSYFVTRAGKGISIERTKAYKYQSLASGFMNRTADPFIVTVEPKAI DEPIHYNKHNGQEFNLVIEGRMLINIEGKEIILNQGDSIYFNSKLPHGMKALDGKTVRFL AVIM >gi|225935358|gb|ACGA01000034.1| GENE 13 12083 - 13738 1703 551 aa, chain + ## HITS:1 COG:MA2912 KEGG:ns NR:ns ## COG: MA2912 COG0365 # Protein_GI_number: 20091733 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Methanosarcina acetivorans str.C2A # 1 550 7 558 560 708 59.0 0 MVERFLSQTSFSSQEDFNKNLKINVPENFNFGYDVVDAWAAEQPDKKALLWTNDKGESRQ FSFAEMKRYTDMTASYFQSLGIGRGDMVMLILKRRYEFWYSTIALHKLGATVIPATHLLT KKDIIYRCNAADIKMIVAAGEGIILQHIKDALPECPTVEKLVSVGPEIPEGFEDFHQGIE NAAPFVRPRHANTNDDISLMYFTSGTTGEPKMVAHDFTYPLGHIVTGSFWHNLNENSLHL TIADTGWGKAVWGKLYGQWIAGANIFVYDHEKFTPAAILEKIQEYHVTSLCAPPTIFRFL IHEDLTKFDLSSLKYCTIAGEALNPAVFETFKKLTGIKLMEGFGQTETTLTIATMPWMEP KPGSMGLPNPQYDVDLIDHEGRSVEAGEQGQIVIRTSKGKPLGLFKEYYRDAERTHEAWH DGIYYTGDVAWKDEDGYLWFVGRADDVIKSSGYRIGPFEVESALMTHPAVVECAITGVPD EIRGQVVKATIVLSKDYKARAGEELIKELQNHVKKVTAPYKYPRVIEFVEELPKTISGKI RRVEIRENDKK >gi|225935358|gb|ACGA01000034.1| GENE 14 14113 - 14769 413 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172101|ref|ZP_05758513.1| ## NR: gi|260172101|ref|ZP_05758513.1| hypothetical protein BacD2_09573 [Bacteroides sp. 
D2] # 1 218 1 218 218 396 100.0 1e-109 MIMNEYIKRISFISIIEINNITKKILMILFIVVLFGLLFPIFTYAQIKINKEEFEKEYSS KLVLDKDDNLVFQKILEFPQYTKEKLYEMAKNYMDSRIQKKSVPYSTNTDYSYSNYSLII TERKNDFICFSKKIAGTCYAAIEYIYKLEIKDYRIRATIIAHRLGIAGSWGDIKDCVSNK HNKYMRRILSQFVDYTEELFAHTKKGIETPQMKMEDDW >gi|225935358|gb|ACGA01000034.1| GENE 15 14819 - 15712 240 297 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172102|ref|ZP_05758514.1| ## NR: gi|260172102|ref|ZP_05758514.1| hypothetical protein BacD2_09578 [Bacteroides sp. D2] # 1 297 1 297 297 564 100.0 1e-159 MRRTLITFFFFLWGVSEVCMGQYSIISSRRNSLIEQIQSLNLQIKKENNQSEKKMLTIRR DTLQKKLDYINQAIEKQNIRSTPHFVGEETSQEENKAPGSAVSTGKSQPSTNDLDRFKRT VTNYINQGNSITAYNYIKKTFEDSYPSDPNEQFAFAALLQDIIIGLTNDGTKAAYSMDFS TALVTTSCLSGAHQMQTSLYAKAASNGHEGAMLMLQVQLGLSGGGVNTNPPINNSSNKTP STTQKTCTLCNGKGWIAGSKTPTYGNSDTHWCNECSRNVNASHSHDRCPSCSGRGYR >gi|225935358|gb|ACGA01000034.1| GENE 16 15922 - 16860 363 312 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172103|ref|ZP_05758515.1| ## NR: gi|260172103|ref|ZP_05758515.1| hypothetical protein BacD2_09583 [Bacteroides sp. 
D2] # 1 312 1 312 312 579 100.0 1e-164 MKGLIFLTTLFFMLFTACSNNDTLDENNENDGNKTDETLKYPVIINSTFHYPKADLDIQF NNTDSDPVTEVGCYIGGEKNFSSGFMMNKKAVNPQESTITLTTGSVTALADRYLLPYIIT TQKGKIFGRVHIFNTNHQECSEYCDFNPKDPNIMSEILKGYRGETSSGGSSSGGGSSVSP EGVSSFVINATRKERVLNAKYIGTNKKYLLTIREYSEVTYNWSCSDPKVEGILITRSAVK PTNGVKCTDFKAIYQPRNGEYIVKEWQETTSYADFYIWTYSSGYQYSTTSYNVRKYIEGV RDYTKWSSTPYL >gi|225935358|gb|ACGA01000034.1| GENE 17 17131 - 17331 98 66 aa, chain + ## HITS:1 COG:no KEGG:BT_3747 NR:ns ## KEGG: BT_3747 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 100 87.0 1e-20 MITSRKISQVNVFAGSPWEAASVKSLLNAAYIQVSTKDNGLNSILISVPCEYYTAAMRVI RNRKVS >gi|225935358|gb|ACGA01000034.1| GENE 18 17509 - 17943 294 144 aa, chain - ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 1 110 146 64 32.0 1e-09 MSQWISIEAAAEKYRLEKEYIWLWVEMKKVTVSYENDTVSVDDDSIQQFIKRTKLGITSE YVDELEQLCMEKNKTSRLYASLLNMRDQELMAIRGQSSRLDGLWKMVEEQYERLRSFEKN SMSDNAICSNCWIRKICRRLKRIL >gi|225935358|gb|ACGA01000034.1| GENE 19 18919 - 21003 1392 694 aa, chain + ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 307 687 53 425 480 64 22.0 5e-10 MKITQIRENGDTEALSVTDIDLLIEKMKKETKLRPVTGLRQALHFVLPDEPCSLASKLPR VIPAAAFGRVNGVKRMKTYNGIVELTIGPLAGKTEVEIVKQKAAELPQTMLAFMGASGKS VKIWTCFTRPDGTLPQTTEEAEVFQAHAYRLAIKCYQPQLPFNILLKEPKLEQFSRLSYD PDLIYRPAPVPFYLSQPIGMPGEMTYHEKSSTEASPLNRALPGYDTEDTVALLYEAALRK TFEEMETDWHRNDHDLQTLVVPLAENCYYSGIPEEEVTRRTITRYYKRKNPMLVREMIRN VYKECKGVPKSSCLTKEQRLSLQMDEFMNRRYEFRYNTQIGEVEYRERFSFQFYFHPIDK RAQNSIMLDAQSEGIGVWDRDIDRYLHSNRVPIYNPLEEFLFHLPHWDGKDRIHALANRV PCKNPHWELLFHRWFLNMVSHWRGVDKMHANNTSPILVGAQGTHKSTFCREMIPPALRAY YTDSIDFSQKRDAELYLNRFALINIDEFDQITLPQQGFLKHILQKPVVNLRKPHGRSVLE 
LQRYASFIGTSNQKDLLTDPSGSRRFICIEVTGNIDTTQPIDYEQLYAQAIHEIYHGERY WFDSEDEQIMTENNREFEQTPAMLQLFYQYFKAAQTKEEGEFLTPVEILNYLKKKSGMSL SDNKVYHFGRLLQKCGIPSKHTYKGTVYQVIKLS >gi|225935358|gb|ACGA01000034.1| GENE 20 21289 - 21801 242 170 aa, chain - ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 30 170 593 736 754 119 42.0 2e-27 MFYSAILIRQDDCHSLCPVPSDKFHSIIIKKAPQSHDYVELSASPLYPFGYGLSYTSFDY SDLHLSALTPRSFEVSFKVRNTGKYDGEEVAQLYLRDEYASVVQPLKQLKHFARFYLKRG EEREVKFILSEEDFSLVDWNLKRIVEPGTFQIMIGAASDDIRLQTKVEIK >gi|225935358|gb|ACGA01000034.1| GENE 21 21713 - 22198 392 161 aa, chain - ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 2 161 437 592 754 127 41.0 9e-30 MQVEYVKDCFIRDTVTTDIEQAIAVAQRSEVIIAVVGGSSARDFKTSYKETGAAIADEKT ISDMECGEGFDRVTLSLLGKQQELLNALKATGKPLVVIYIEGRPLDKNWASENADAVLTA YYPGQEGGIAIADVLFGDFNPAGRLPFSVPRSVGQIPLYYN >gi|225935358|gb|ACGA01000034.1| GENE 22 25857 - 26189 226 110 aa, chain - ## HITS:1 COG:Cj1521c KEGG:ns NR:ns ## COG: Cj1521c COG3512 # Protein_GI_number: 15792834 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 6 94 3 91 143 79 46.0 2e-15 MDRFSEYRVMWVLVLFDLPTETKKEKKAYADFRKNLQKDGFTMFQFSIYVRHCASSENAI VHIKRVKSFLPEHGHVGIMCITDKQFGDIELFYGKKVSCVNTPGQQLELF >gi|225935358|gb|ACGA01000034.1| GENE 23 26189 - 27121 638 310 aa, chain - ## HITS:1 COG:PM1126 KEGG:ns NR:ns ## COG: PM1126 COG1518 # Protein_GI_number: 15602991 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Pasteurella multocida # 14 294 53 317 343 164 34.0 2e-40 
MIKKTLYFGNPVYLSLRNAQLVIKLPDVEKAAALPETLKKQAEITKPIEDIGIVVLDNKQ ITITSGVLEALLENNCSVITCDSKSMPVGLMLPLYGNTTQNERFRKQLDASLPLKKQLWQ QTIQAKINNQASVLKDCMDEEVKCMKVWAADVRSGDPDNLEARAAVYYWKSLFADVDGFT REREGIPPNNLLNYGYAILRAVVARGLVISGLLPTLGIHHHNRYNAYCLADDIMEPYRPY VDELVFSLTQEFGKNAELVKEVKARLLTVPTLEVIIGGKRSPLMVAVGQTTASLYKCFSG ELRRVVYPER >gi|225935358|gb|ACGA01000034.1| GENE 24 27114 - 28139 850 341 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 13 333 13 333 345 298 48.0 9e-81 MDKNNNNPPIPSDFFLYKDSNGEVKVEIYIFNETVWLPQDKIAQLFGVDRSVVTKHLKNV YKSGELSKEVTCAKIAQVQTEGTRQVTRQIEFYNLDAILSVGYRVNSIQATQFRIWANSV LKEYLIKGFAMNDERLKNPQTLFGKDYFEEQLARIRDIRSSERRFYQKITDIYSQCSADY EAGSDVTKKFFATVQNKLHWAISGQTAAEIIVSRVDADKPNMGLTTWKNAPNGMIRKPDV SIAKNYLNEMEVDDLNRIVSMYLDYAERQAKKGQVMYMRDWVKKLDAFLQFNEEAVLQHS GWVSHEIAKALAEQEYDKFHVRQLQNYESDFDILLKGMGHD >gi|225935358|gb|ACGA01000034.1| GENE 25 28155 - 32465 2984 1436 aa, chain - ## HITS:1 COG:PM1127 KEGG:ns NR:ns ## COG: PM1127 COG3513 # Protein_GI_number: 15602992 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 470 992 275 723 1056 139 28.0 6e-32 MKRILGLDLGTNSIGWALVDEAENKDERSSIVKLGVRVNPLTVDELTNFEKGKSITTNAD RTLKRGMRRNLQRYKLRRETLMEVLKEHKLITEDTILSENGNRTTFETYRLRAKAATEKI SLEEFARVLLMINKKRGYKSCRKAKGAEEGTLIDGMDIARELYNNNLTPGELCLQLLDAG KKYLPDFYRSDLQNELDRIWEKQKEYYPEILTEVLKEELRGKKRDAVWAICAKSFVWEEY YTEWNKEKGKTEQQEREHKLEGIYSKRKRDEAKRENLQWRVNGLKEKLSLEQLVIVLQEM NIQINNSSGYLGAISDRSKELYFNKQTVGQYQMEMLDKNPNASLRNMVFYRQDYLDEFNT LWEKQAEYHKELTGELKKEIRDIVIFYQRRLKSQKGLVGFCEFESRQIEVEIDGKKKIKT VGNRVIPRSSPLFQEFKIWQILNNIEVTVVGKKGKRRKQKDNSPTLFDGLDDTEQLELNG SRYLYQEEKELLAQELFVRDKMTKSDVLKLLFDNPQELDLNFKTIDGNKTGYTLFQAYSK MIEMSGHEPVDFKKTVEKVVEHVKAVFDLLNWNTDILGFNSNEELDNQSYYKLWHLLYSF EGDNTPTGNARLVQKITELYGFEKEYAIILANVTFQDDYGSLSAKAIHKILPHLKEGNRY 
DVACAYAGYKHSESSLTKEEIANKALKDRLTLLPKNNLRNPVVEKILNQMVNVINAIIDT YGKPDEIRVELARELKKNAKEREELTKFIAETTRKHEEFKILLQSEFGLTNVSRTDILRY KLYKELESCGYKTLYSNTYIPREKLFSKEFDIEHIIPQARLFDDSFSNKTLEARSINLEK GNKTAYDFVKEKFGESGADNSLDHYLNNIEDLFKSGKISKTKYNKLKMAEQDIPDGFINR DLRNTQYIAKKALSMLNEISRRVVATSGTITDKLRQDWQLVDVMKELNWEKYKALGLVEY FEDKDGRQIGRIKDWTKRNDHRHHAMDALTVAFTKDIFIQYFNNKNASLNSNTNEYAIKN KYFQNGRAIAPIPLREFRVEAKKHLENTLISIKAKNKVVTGNINKTKKRGGVNKYLQQTP RGQLHLETIYGSGKQYLTKEEKVNASFDMKKIATVSKSAYRDVLLKRLHENDNDPKKAFT GKNSLDKQPIWLDKEQTRKVPEKVKTVTLETIYTVRKEISPDLKVDKVIDAGVRKILTGR LNEYGNDAKKAFSNLDENPIWLNKEKGISIKRVTISGISNAQSLHVKKDKDGKPILDEDS RNIPVDFVNTGNNHHVAVYRRPVIDKRGQLVIDEAGNPKYELEEVVVSFFEAVTRANLGQ PIIDKDYKTSEGWQFLFSMKQNEYFVFPNEKTGFNPKEIDLLDAENYGVISPNLFRVQKF AHKNYVFRHHLETTIKDTSSILKGITWIDFRSSKGLDAIVKVRVNHIGQIVSVGEY >gi|225935358|gb|ACGA01000034.1| GENE 26 32800 - 35151 1609 783 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 31 782 4 758 778 563 42.0 1e-160 MLNRICKSLLFTAIMTGVSYSFAQQLPVVTYKNPTLPVETRVADLLSRMTLEEKVGQLLC PLGWEMYEIKRNDVFPSDKFKQLIKDRKAGMLWATYRADPWTKKTLSTGLNPALAAKAGN ALQRYVIENTRLGIPLFLAEEAPHGHMAIGATVFPTGIGMAATWSPQLIREVGKAIGKEI RLQGGHISYGPVLDLARDPRWSRVEETFGEDPVLTGEIGKAMVEGLGSGDLSHPYSTLAT LKHFLAYGISESGQNGNPSFAGIRELHENFLPPFRQAIDAGALSVMTSYNSMDGIPCTAN HSLLTELLRNEWKFSGIVVSDLYSIEGIHQSHFVAPTMEAAAILALSAGVDVDLGGDAYM NLMNAVNTGRISKTALDASVARVLRLKFEMGLFENPYVDPEKAKKEVRSEESVTLARRVA QASITLLKNEHSLLPLNKNRKVALIGPNADNRYNMLGDYTAPQEEENIKTVLDGIRTKLS SSQVEYVKGCSIRDTVTTDIEQAVAAAQRSEVIIAVVGGSSARDFKTSYKETGAAIADEK TISDMECGEGFDRATLSLLGKQQELLKALKATGKPLIVVYIEGRPLDKTWASENADAVLT AYYPGQEGGNAIADVLFGDYNPAGRLPLTVPRSVGQIPIYYNKKAPQNHDYVELSASPLY AFGYGLSYTTFEYSDLRVSAISPHSFEVSFKVKNTGRYDGEEVSQLYLRDEYASVVQPLK QLKHFERFCLKRGEVKEVKFVLSESDFTIIDRNLKTVVESGTFQVMVGAASDDIRLQAKV VVK >gi|225935358|gb|ACGA01000034.1| GENE 27 35278 - 36492 1059 404 aa, chain - ## 
HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 376 384 139 31.0 2e-31 MKKYTLLSLLALLFCVSGCKTEPDYSDAVYMTGTLTSSNVKFLVEGQSTLGLTVTSTDKT ETDVAVGVQVAPQLLEAFNASTGRNCQMPPEGSYSFEAGDVIIPAGSNQSTQIKVTADAD KLQEGVSYCLPVSITSVSNSDLKVMESSRTAYVMLTKVIKIKAAYLARRGYFNIPSFGDQ EKSPVKALGQMTLEMKVLPVSFPVGSERSANGISSLCGCEENFLFRFGDGAGNPVNKLQF VKGSIGSASHPDKKDHYESWVEKEFPTGHWLHFAAVYDGQYLRLYLDGEQIHFVETKNGG SINLSMAYDGHTWNDTFSIGRSAGNARFFDGYISECRVWNVARTSAQIEDGVCYVDPTSE GLIAYWRFDGETQDDGSVLDMTGHGHNAVPGGTITYVDNQKCPF >gi|225935358|gb|ACGA01000034.1| GENE 28 36504 - 37481 914 325 aa, chain - ## HITS:1 COG:no KEGG:BT_4711 NR:ns ## KEGG: BT_4711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 319 4 309 320 75 25.0 2e-12 MTMKSRKNLFYIGMFALCAIGFAACDDDETYDVYGDPFNRVYISDNSKAYKIVQTPISTV SNVDFKWPAKCSQKASGTIKATVVVDNSLIAAYNEEHGTVFEAMPAEAIVLENAEMTIPA GKMVVADTVHVKLTDDANVLSTLKSEKGYIIPLRLVSAEGPNTQLSTNMVAPSFLTVTVT EDNVNHEATEADITGTLVANQAGWSVNVLSGSGSNLDRWFDGNPQSTASISDWSNYKAAF TVDMGKSYTFSAIVAYRGTSWGEYNSISAGTKISTSDDGTNFKSAGEVTGSSKFIVFYAP LTARYIRVETTSYSVNCGVFNVYAK >gi|225935358|gb|ACGA01000034.1| GENE 29 37501 - 37938 412 145 aa, chain - ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 145 184 312 313 72 37.0 7e-12 MAKYMGPNPDITREERLQLIEERYGKEIASREGSCDKMLNIDQTSTSMTSLIPYSNYCFL QAYGGGTGAGGWPDEKVVYCCNMGDNWQGDMQSMYNQARYKPANGKRKGGFGAFFIHRDY NVHEYNPEPYYRFRQCIQIQNPAIH >gi|225935358|gb|ACGA01000034.1| GENE 30 37794 - 38513 477 239 aa, chain - ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 175 20 176 313 97 34.0 4e-19 MKTMKYILAALTVGGMMASCNTDIESLTIQRPLTYDDQYYQNLRDYKESDHEIAFGWFAQ YGAQNSAAVRFMGLPDSLDICSMWGGIPSTENTEIWEEIRFVQKVKGTKMLCVAITRIDA 
ETDDHAFKQAYNEAKAMPNGEERTAALNRSFEMYAEYFLDQVFLNDLDGFDADYEPEGDF LSGSNFEYFYKQWQSTWDRIRTLPEKNVYSLSKNVMVRRSLRGKVAVIKCSISTRPVPA >gi|225935358|gb|ACGA01000034.1| GENE 31 38556 - 40157 1308 533 aa, chain - ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 532 26 529 531 356 40.0 1e-96 MKHILKCMKVFNGVLLAGGLLFTACTDHYESWNINPNEVTPEQMERDNLLTGAYFTQMER GIFVVGKDKGGLFEETQMLTGDIFASYCAPIKTWNYAGEQDNDCYKLYPQWYNSPFSTAY TEVMQPWRKVVDCTSGTSPARAMATIVKVFGMSRITDKYGPIPYSKFGTGIQVAYDSQKD VYYRFFEELSNAIEVLTSYNSRTSEPYLEDYDYIYNGNVANWIKFANTLRLRLAMRISYV DQTKAETEAKAAIDHSIGLMTSVSDNAVLKQSVSFSFINEWWEAYESFNDFRMSATMDCY LKGLQDPRLAAYFKVAKKDGEYHGVRNGQTSRSQGTLSEVASSMNVGQASSILWMDAAET YFLLAEAKLRLNLGDKTVKEYYEEGVKTSFSSKGASGAESYLDNDVNVPSISFIDPTNGR NTDVSRMVSNLTVKWDESVSDDKKLERIMVQKWIALFPDGQEAWSEMRRTGYPGIVTINT NASGGEVATGELISRLKFPTKEYSDNGENTQAAVSLLNGTDIAGTRLWWDVKR >gi|225935358|gb|ACGA01000034.1| GENE 32 40170 - 43532 3042 1120 aa, chain - ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 1120 1 1099 1099 979 49.0 0 MQNYCIVQFLGGLKSLRRTLKEISLAMKLLFILLICSAGLAYASDGYAQKTSITLRVNDC TIEEVLHKIERESGFSFFINSKNLNLNRRVSVSASEKNIFQVLEQVFAGTNVEYKILDNK IVLTAKEMKVTQQKTKPVTGTVKDRHGEPLIGVSIMEKGTTSGTVTDLDGNYTLSTQTAS PVLVFSYVGYQKKELPVSGPILNVTLEDDSQVLNEVVVTALGIRKEAKALSYNVQEVSAS EIVGVKDANFVNSLSGKIAGVVINSSSSGIGGGAKVVMRGAKSLSGNNNALYVIDGIPMP SLETTQPDDFMTGMGQSGDGASMINPEDIETMSVLSGAAASALYGSDAANGVIMITTKKG SKDKLRVNYANNTSFFNPFVTPEFQNTYGATTGEIRSWGQKLGQASSYDPLDFYQTGWNE TNSLTISNGNEKNQIFLSMAATNAAGIVPNNTLDRYNFTIRNTTSMLNDRLRLDLSASYM NVREQNMVSQGQYFNPIVPIYLMSPSYSLEMLQQFEMYNEARNFKTQYWPWGNQGIAMQN PYWIVNRDNFINHKNRFLMSGGLSFEIMKGITLGARAKMDYTSALYEKKYAASTDNIFCQ KFGGYYKGDASTRQLYGDVMLNIDKYFGDFSLTATLGTSIQDVNYQYYDVGGSLNSVADG FTMLNLMQSAVKFQQDGYHDQTQSIFATAQLGWKSKLYLDVTGRVDWSSALAWTDTKSVA YPSVGVSAILTELLPIKNDVLTFLKVRASYSEVGNAPTRYIAYQTYPYESASPVTATTYP 
NTDIKPERTKAWEVGLQSHLWNDKLELNVSLYKTSTYNQLFNPSISASSGYSSIYINGGQ VDNKGIEASLTLNQPLGPVKWNSTFTYTINRNKIKKLLKPTTLSSGEVVKQDMMDLGGLE IVNSRLFEGGSIGDLYVTALRTDSHGYIDVDYVNNTVAIDNKAGERQDGWIYAGNSQAKY MMGWRNSFSWKGLTLSCLVNARIGGIVVSQTQAMMDAYGVSTATAEARDLGYVLIDGYKV PAVQKYYSTVGSGVGSMYVYSATNVRLAELSLGYNIPVQKWVPWIQGMNVAFTGRNLLMF YCKAPFDPELTASTGTHFSGMDYFMLPSLRNLGFSVKLNF >gi|225935358|gb|ACGA01000034.1| GENE 33 43717 - 44688 737 323 aa, chain - ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 107 297 74 257 280 71 27.0 2e-12 MDKELLLKYIAGKASQKEKEDVATWIDADAANLKEFISLRKSYDAFIWQDTGVFSKKTKK TISLHPVTQRILRIAAVFVVAFGLSYAMIQVLQKEDLEMQTVYVPAGQRTLVTLADGTTV WVNGKSTLTFPSYFSSRTRTVELDGEAYFDVRKNPEKQFIVSTAHQSAIKVLGTKFNVKA YKEADEVITTLVEGKVNFEFNNASQQPQYIVMAPGQKLVYYSQNGKTELYTTSGERELSW KDGKIIFRQTSLRDALDILADRYNAEFVIQENVPHDDSFSGTFTNRNLEQVLNFISASSK VRWRYLNNNADAGKEKIKIEIFI >gi|225935358|gb|ACGA01000034.1| GENE 34 44838 - 45380 325 180 aa, chain - ## HITS:1 COG:no KEGG:BT_3748 NR:ns ## KEGG: BT_3748 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 1 180 180 292 85.0 4e-78 MAQINFNSIYTAYYRKAFLFTLSYVHNDLVAEDIVSEAIIHLWELSKEREIPSVEAILIT YIRSKSLNYLKHIQAQENVFQTLLDKGQRELEIRISTLEACDPKEILSEELRAKVHALLE SMPEKTRTAFIRDRLDGKSHKEIAEELGISVKGVEYHISRAVKILRDNLKDYAPFLFFFI >gi|225935358|gb|ACGA01000034.1| GENE 35 45511 - 46992 821 493 aa, chain - ## HITS:1 COG:no KEGG:BDI_1707 NR:ns ## KEGG: BDI_1707 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 490 3 488 493 360 41.0 6e-98 MKKLLFILFISFVSIGLAHSQFAIRQQVATPRAADSLDISFYSQKKGLRAATMTFGINMG VWAFDRYIRKADFAYIDFHTIKDNIKHGFVWDNDAMGTNMFMHPYHGSLYFNSARSNGYN YWQSGLYAFGGSFMWEMFMENEYPSTNDIIATPIGGMALGEVFYRASDLILDDRKTGASR 
FGLEFASFVVSPMRGLTRIINGDAWRKRSTSGKQFGVPDVSVDISAGVRALELQDEIFDK GVGFASEIDVEYGDRFDADNRRPYDYFSVRAKLNIQASQPVLGQFNIVGRLAGKELVDNR KHYLSLGLYQHFDYYDSDTISSVSAKTPYKFCTPASFGGGLIYKRKTGKRWDIDGYAHLN AILLGGALSDYYRVEERNYNLASGYSTKVGFNLIYKKDKFSISTTYDVYHMFTWKGYPQD MDWEHYNSKTLNAQGDRSMAILHATGLRLDIKLRKGLYLTGDYMNYTRHTRYKHYENVFS STSEGRLMLTCKL >gi|225935358|gb|ACGA01000034.1| GENE 36 47280 - 48410 726 376 aa, chain - ## HITS:1 COG:no KEGG:BT_4407 NR:ns ## KEGG: BT_4407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 346 21 345 391 231 40.0 4e-59 MNKHIFNYLILGGIALLLGACNDSESKLLEPKVYFESKEYNLSVEDDDVLTFDLTSRLST MTSSQVDVSYTIADPSVVDEYNAKYGTVYEMFDASNVKMSSTTSAIPSGKLYADNVKMEL SSLKTLKEGKSYVLPVRVQSASVPTLSGTSIAYFFLSKPIKITKVGKFSNTYISIKYPVG TYFSSFTYEALIYVNSFASNNTIMGTEGVMIFRIGDAPGVPKDVLEAAGTQNYHTTEALK TGRWYHVALTYDQPSGKTVIYVNGSKWAGSDWGLPGFDPNSDVGFNIGRIPKFPWGERPL NGKMSEVRVWSVARTENQLKQNMLGVDPASDGLVLYYKLDGSETQEGGVIKDATGRLNGT TSGVTIETLPTPIAIN >gi|225935358|gb|ACGA01000034.1| GENE 37 48443 - 49558 1023 371 aa, chain - ## HITS:1 COG:no KEGG:BT_4406 NR:ns ## KEGG: BT_4406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 370 4 388 388 274 44.0 5e-72 MKALRKLLYFLSLAGLMTTTSCTDVENMEVEHIGGYNTLGDGEYYANLRAYKATAWNYGR PVAFGWYSNWSPAGAYRRGYLTSMPDSMDIVSMWSGAPGRYEITPEQKADKEFVQKVKGT KLLQVSLLSYIGKGATPGSVYADAEKQAEAEGWTDKQLEEAKKQARWKYWGFEGQFESEN HYQCLAKFAKALCDSLYANEWDGYDVDWEIGSGVFDMDGTLSANKHLIYLVKEMNNYIGP KSDPEGKGHKMICIDGSIGGLTRELDEYVDYWIIQSYGSSRPGLEGYGVDPKKIICTENF EAYAPTGGGLLSQAATMPSKGYKGGVGAYRFEKDYDNTPDYKFMRQAIQINQQVFNEWKA KQNEAENKPQE >gi|225935358|gb|ACGA01000034.1| GENE 38 49626 - 51278 1477 550 aa, chain - ## HITS:1 COG:no KEGG:BT_4405 NR:ns ## KEGG: BT_4405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 550 1 539 542 544 52.0 1e-153 MKRIKQYKFFAVAFAILALASCNYEEINTNPFEMTDGEGVMDGIVVGGYITAMQKSVFPT GTQADDTGPVNAYQTAYNLSADAWSGYIGINNTFEGGNCHLNYVIVNSWVSTTFSNSYTN 
LLDPWKKLTASAKENDTPEIAALAQILKISGWHKVLESFGPIPYTAAGKGAIDVPFDSEE TVYTEMLKDLAGAVEVLTPKAVNNVKVMSDYDLVFNGDVTKWVKYANSLMLRLAIRLRSV KPELAKQYAKQAVEHSIGVMTEAGDAAGAGPGPVIALRNPLYWIADNYNDARVGTSILAY LMGYKDPRLSAYCEPANSQCTVAVTAFDNNKYQGVPLGHTNTRSKDTDPTDSYYFYSKPK IQGNTPLYWMRASEVYFLRAEAALFWGTEYGKGDPETLYKQGIETSFQENGATGSVDTYM ASGNKPAANKVTSSKFGFDYQAPSQATAKFEGTQEQKLEKIIIQKYIALYPNGQEAWTEF RRTGYPKLNPIISGGNHNSTQITAARGMRRMTYPNSFNGTGQSHEIYLDAVQKLGGVDNA ATDLWWAKKN >gi|225935358|gb|ACGA01000034.1| GENE 39 51294 - 54629 2854 1111 aa, chain - ## HITS:1 COG:no KEGG:BT_4404 NR:ns ## KEGG: BT_4404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1111 1 1106 1106 1731 79.0 0 MLQIYEKIAEQIAHKRVFLTFLSLLLLIQTQAFAQNDSKITIQQKNITVIDALKTVEKQS KMSINYSDSELKGKQIVDLNLQNASVLAALDTILKGTGFAYQIQGNYIIITGKKPTAVAQ TVKDIKGKVTDEKGEPLIGVNITVDGSSTGTITDLDGNFSIKAPDNSQFKVSYIGYATQV IPVSKKDFYQIVMKQDTEVLEEVVVTALGIKRAEKALSYNVQQIKGDDLTTVKDANFVNS LNGKVAGVSINRSATGVGGATRVVMRGAKSIEGNNNALYVVDGIPLFNTDMGNTDSGIMG EGKAGTEGIADFNPEDIESISVLSGPSAAALYGSSAASGVILITTKKGKEGKLSVQFSSS SEFSKAYMTPDFQNTYGNKKDTYESWGEKLPTPSSYDPKNDFFNTGTNFINSVTLTTGTQ KNQTFASFSSTNSKGIVPNNSYNRLNFTIRNTATFFDDKLQLDLGASYVKQDDTNMVSQG LYWNPIVAAYLFPRGEDFEDIKTFERFDDSRKLPVQYWPVSDATYASQNPYWTAYRNVST NEKSRYMFNVGLTYKITDWLNITARYRMDDTYVLFERKISASSDQVFAEGKKGHYEYINY NDRQEYADAMLNINKRIQDFSVSANFGWNYSNYWALQRGYKGTLLGVPNKFATSNIDPSN GRISEKGGDSRVRNHAIFGNVELGWRSMLYLTLTGRNDWNSRLVNTSEESFFYPSVGLSA IVSEMVKLPEFLSYLKVRGSYTEVGAPVSRSGLTPGTVTTPIVGGNFKPTYIYPFTDFKA ERTKSYEFGLSLRLFSKFNAEVTYYKSNTYNQTFLGNLPESTGYNSIYLQAGNVENRGWE ASFGYSDRFKNGLSISSTLTFSKNINEIKEMVKDYNTEVMGVPVSINIPEVLKDKGRTIL KEGGSIHDIYATRFFKKDSQGYVLVSSDGKYEMEDGDPVYLGKTAPDFNMGWNNAISYKG FGLSFLINGRFGGVVTSSTEAILDRYGVSQRTADARESEGALFPGQGRVDAKTYYQMIGT GNYQTSGYYVYSATNIRLQELTFSYTMPNKWFGNVLKDVSVSFIANNPWMLYCKAPFDPE LTPSTATYGQGNDYFMQPSVRSFGFGLKFKL >gi|225935358|gb|ACGA01000034.1| GENE 40 54793 - 55779 609 328 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # 
Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 102 273 103 272 323 90 33.0 3e-18 MKNYIQQLIELFGHNNYSADTQKKVQQWLADEEHVDEKNEALRELWKQAEGQRVPDGMQQ SIQRMRQNLGMQSVTSRRNYQLLIWRAAAIFLLAVSSISIYLMLEKDRPEKDLVECYIPT AEIHELTLPDGTHVMLNSKSTLLYPEQFTGETRSVYLIGEANFKVKPDKKHPFIVKANDY QVTALGTEFNVNAYPESNELIATLLEGCVKVEFNNLMSNVILKPNEQLIYNKQTKEHNLR LPEISDVTAWQRGELVFSNMHLEDIFTNLERKFPYAFVYSLHSMKKNTYSFRFRNQATLE EVMEIISQVVGDVNYVIKGNKCYVTNKK >gi|225935358|gb|ACGA01000034.1| GENE 41 55837 - 56391 391 184 aa, chain - ## HITS:1 COG:PA2426 KEGG:ns NR:ns ## COG: PA2426 COG1595 # Protein_GI_number: 15597622 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 35 172 37 171 187 60 30.0 2e-09 MENNDKQIELKFQRFFTVNFPKVKNFAQMLLKSESDAEDVAQDVFCKLWLQPELWLNNDK ELDNYIFIMTRNIVLNIFKHQQVEQEYQAEVIEKTFLYELTEKEEILNNVYYKEMLLIIQ LTLEKMPKRRRLIFELSRFRGLSHKEIADKLDVSIRTIEHQVYLALIELKKVLLFFIFFS NIFK >gi|225935358|gb|ACGA01000034.1| GENE 42 56939 - 58279 1319 446 aa, chain - ## HITS:1 COG:XF1003 KEGG:ns NR:ns ## COG: XF1003 COG0165 # Protein_GI_number: 15837605 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Xylella fastidiosa 9a5c # 1 416 6 424 445 250 34.0 4e-66 MAQKLWEKSVQVNKDIERFTVGRDREMDLYLAKHDVLGSMAHITMLESIGLLTKEELEQL LAELKSIYASAERGEFIIEDGVEDVHSQVELMLTRRLGDIGKKIHSGRSRNDQVLLDLKL FTRTQIKEVAEAVEQLFHVLIRQSERYKNVLMPGYTHLQIAMPSSFGLWFGAYAESLMDD MLFLQAAFKMCNRNPLGSAAGYGSSFPLNRTMTTDLLGFDSMNYNVVYAQMGRGKMERNV AFALATIAGTISKLAFDACMFNSQNFGFVKLPDDCTTGSSIMPHKKNPDVFELTRAKCNK LQSLPQQIMMIANNLPSGYFRDLQIIKEVFLPAFQELKDCLQMTTYIMNEIKVNEHILDD DKYLLIFSVEEVNRLAREGMPFRDAYKKVGLDIEAGKFSHTKEVHHTHEGSIGNLCNAEI SALMQQVIDGFNFCGMERAEKALLGR >gi|225935358|gb|ACGA01000034.1| GENE 43 58367 - 58777 356 136 aa, chain - ## HITS:1 COG:no KEGG:BT_3732 NR:ns ## KEGG: 
BT_3732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 221 86.0 4e-57 MSNFESSVKVIPYSQERVYNKLSDLSNLEAVKDRLPKDKVQDLSFDSDTLSFSVPPVGQL TLQIVERDPCKCIKLATTNSPLPFNMWIQLVETAEEECKVKVTIGMELNPFMKAMVQKPL QEGLEKMVDMLAVIEY >gi|225935358|gb|ACGA01000034.1| GENE 44 58873 - 59511 659 212 aa, chain - ## HITS:1 COG:lin1945 KEGG:ns NR:ns ## COG: lin1945 COG0461 # Protein_GI_number: 16801011 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Listeria innocua # 3 208 2 207 209 224 53.0 7e-59 MKNLERLFAEKLLKIKAIKLQPANPFTWASGWKSPFYCDNRKTLSYPSLRNFVKIEITRL ILERFGQVDAIAGVATGAIPQGALVADALNLPFVYVRSTPKDHGLENLIEGELRPGMKVV VVEDLISTGGSSLKAVEAIRRDGCEVIGMVAAYTYGFPVAEEAFKNAKVTLVTLTNYEAV LDVALRTGYIEEEDIQTLNEWRKDPAHWDAGK >gi|225935358|gb|ACGA01000034.1| GENE 45 59590 - 60072 525 160 aa, chain - ## HITS:1 COG:no KEGG:BT_3730 NR:ns ## KEGG: BT_3730 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 160 275 91.0 4e-73 MSAQLTDEEALNRVASYCSTAEHCRAEINEKLQRWGIAYDTIARILDRLESEKFIDDERF CRAFVNDKFRFAKWGKMKIAQGLYMKKIPSDVAWRYLNEIDEEEYLSILRDLLASKRKSI HAADDYELNGKLMRFAMSRGFELKDIKRCIDIPDEEEQID >gi|225935358|gb|ACGA01000034.1| GENE 46 60069 - 60905 323 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 1 278 16 290 294 129 34 9e-29 MNRITAYIRQSLQEIYPQEEIKALSMLICCDMLGLDALDIYMGKDIILSECKQRELENII FRLQKNEPIQYIRGFAEFCGRNFKVASGVLIPRPETAELVELIVKENPNARHLLDIGTGS GCIAISLDKKLPDVDVEAWDISEEALAIARKNNEDLEAGVRFLQRDVLSDDWEKVPSFDV IVSNPPYVTETEKNEMDANVLDWEPGLALFVPDEDPLRFYNRIACLGSDLLLPGGKLYFE INQAYGRETAHILEMNQYRDVRVIKDIFGKDRIVTANR >gi|225935358|gb|ACGA01000034.1| GENE 47 60942 - 61997 448 351 aa, chain + ## HITS:1 COG:BH1554_1 KEGG:ns NR:ns ## COG: BH1554_1 COG0117 # Protein_GI_number: 15614117 # Func_class: H Coenzyme transport and 
metabolism # Function: Pyrimidine deaminase # Organism: Bacillus halodurans # 7 149 1 141 143 159 51.0 9e-39 MAKSTKMEEEKYMRRCIELAKNGLCNVPPNPMVGAVIVCNGRIIGEGYHIRCGEAHAEVN AIRSVKDESLLKHSTIYVSLEPCSHYGKTPPCADLIIEKQIPRIVIGCQDPFSEVAGRGI QKLRDAGREVIVGVLEEECQSLIRRFITFNTLHRPFITLKWAESADHFIDMERTSGMPVK LSSPLTSMLVHKKRAEADAIMVGRRTALLDNPSLTVRNWYGHNPIRVVLDRTLSLPNDLQ IFDENVPTLVFTEKKQPARTNITYISIDFNHNTLKQIMEVLYQRKIQSLLVEGGSQLLQS FINDELWDEAYIEKCPSRLHSGVKAPQINNNFSYSIEEHFERQFWHYVHRI >gi|225935358|gb|ACGA01000034.1| GENE 48 62140 - 63618 682 492 aa, chain + ## HITS:1 COG:no KEGG:BF0506 NR:ns ## KEGG: BF0506 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 486 1 447 452 288 40.0 2e-76 MKIKFLSFIASFFMVSFVITSCLDDDNNIEYSPDATIHAFALDTAGLGSYKFTIDQLSRE IYNEDSLPVHADTIIDKILIKTLTTASGVVTMKDKSGNDSVININDSIDLREPLIIKVWS TEALAGVSPNQTKEYTIKVNVHRHDPDSLRWNYCGNIDEANLTGEQKSVALENDILTYSV VNGSLMVYRAQIGNTMNWSSSVVSDSKSKFDDKLPSSILSYKKKLYATSATEDGKVYEST DGTTWLESDLFSGNHVNLLLAPLSNKITYIKTDDGKKVFDSIAEIASIAEAEKKLQEVPD DFPIGNISYTTYTTATNQEGIMLIGKYKDQPTVGETETIVPWAYMGEIWIPLPPNNVDTS CPALTNPSIIYYNNKFYIFGEKFESFYISEAGIAWKKANKKFYLPYQDWSESDFKPSPEK PEFRGRTSYSTAVNKDNNSIYILFSAGNASFDEEVEDDDSTQATTPHTYSYKSEVWRGRL NQLWFDLAKAGK >gi|225935358|gb|ACGA01000034.1| GENE 49 63644 - 64378 372 244 aa, chain + ## HITS:1 COG:SPy1965 KEGG:ns NR:ns ## COG: SPy1965 COG0020 # Protein_GI_number: 15675763 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Streptococcus pyogenes M1 GAS # 12 238 16 248 249 246 48.0 2e-65 MSYIEQIDKTRIPQHVAIIMDGNGRWAKQRGEERTYGHRAGAETVQNITEDAARLGIKYL TLYTFSTENWNRPQDEIAALMNLLLESIEEETLMKNNIRFNVIGDFKKLPVEVQKSLASC IERTSQNSGMCMVLALSYSSRWEITEAVRQIATLIKTGEISPEQITDECIASHLETKFMP DPDLLIRTGGEIRLSNYLLWQCAYSELYFCDTFWPDFNKEELYKAIWEYQQRERRFGKTS EQIS >gi|225935358|gb|ACGA01000034.1| GENE 50 64409 - 67063 2787 884 aa, chain + ## HITS:1 COG:RSc1412 KEGG:ns NR:ns ## COG: RSc1412 COG4775 # Protein_GI_number: 17546131 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Ralstonia solanacearum # 46 884 30 765 765 151 23.0 7e-36 MHYRISFIFITFVCLCCFATTGIAQNANTDEDSKPVILYSGTPKKYEIADIKVEGVKNYE DYVLIGLSGLSVGQTITVPGDEITGAIKRYWRHGLFSNVQITAEKIEGDKIWLKISLTQR PRIADVRYHGVKKSERTDLESKLGMVKGMQITPNTVDRAKTLIKRYFDDKGFKNAEVIIS QKDDPSSENQVIVDIDIDKKEKIKVHEIQIVGNHAIKASKLKKVMKKTNEKGKLRNLFRT KKFVPENFEADKQLIIDKYNELGYRDAMIVKDSVSQYDEKTVNVYLNIDEGQKYYLRNVT WVGNTLYPSEQLNFLLRMKKGDVYNQKLLNERVSTDDDAIGNLYYNNGYLFYNLDPVEVN IVGDSIDLEMRIYEGRQATINKIKISGNDRLYENVVRRELRIRPGQLFSKEDLMRSLREI QQMGHFDPEKLQPDIQPDPMNGTVDIGLPLTSKANDQVEFSAGWGQTGIIGKLSLKFTNF SVANLLHPGENYRGILPQGDGQTLTISGQTNAKYYQSYSISFFDPWFGGKRPNSFSVSAF FSVQTDISSRYYNSSYFNNYYNSMYSGYGGYGMYNYGNYNNYENYYDPDKSIKMWGLSVG WGKRLKWPDDYFTLSAELAYQRYNLSDWQYFPVTNGKCNDLSISLTLARNSIDNPIFPRS GSDFSLSVQFTPPYSLMDGKDYKGYYSNPETGSITQDNMNKLHKWIEYHKWKFKGKTYTP LMDPVAHPKCLVLMTRTEFGLLGHYNQYKKSPFGTFDVGGDGMTGYSTYATESIALRGYE NSSLTPYGSEGYAYARLGIELRYPLMLETSTNIYVLGFLEAGNAWHDIKKFNPFELKRSA GVGVRIFLPMIGMMGIDWGYGFDKVFGSKQYGGSQFHFILGQEF >gi|225935358|gb|ACGA01000034.1| GENE 51 67087 - 67602 533 171 aa, chain + ## HITS:1 COG:no KEGG:BT_3724 NR:ns ## KEGG: BT_3724 # Name: not_defined # Def: cationic outer membrane protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 239 84.0 2e-62 MRKSVLLIMMLFAVSMAANAQKFALIDTEYIMKNIPAAQSANEQMQEATKKYQSEVEALA KEAQKMFQDYQAKSSTLSAAQKTKTEDAIVAKEKEAAELKRNYFGPEGELAKMRDKLITP IQDDIYEAVKAISQQHGYDMVIDRASATGIIFANPRIDISDEILRKLGYSN >gi|225935358|gb|ACGA01000034.1| GENE 52 67659 - 68174 618 171 aa, chain + ## HITS:1 COG:no KEGG:BF0502 NR:ns ## KEGG: BF0502 # Name: not_defined # Def: putative outer membrane protein OmpH # Organism: B.fragilis # Pathway: not_defined # 1 171 1 169 169 206 76.0 2e-52 MLKKIALVMLLALPMGVFAQNLKFGHINAQEIITVMPEFTKAQNDIQTLEKQLTAELQRT QEEFNKKYQEFQQAIAKDSLPPNIAERRQKELQDMMQRQEQFQQDAQQQMQKAQNDAMAP IYQKLDNAIKAVGAAEGVVYIFDLARTSIPYVNESQSINLTNKVKANLGIK >gi|225935358|gb|ACGA01000034.1| 
GENE 53 68283 - 69125 629 280 aa, chain + ## HITS:1 COG:lin1200 KEGG:ns NR:ns ## COG: lin1200 COG0796 # Protein_GI_number: 16800269 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Listeria innocua # 12 279 5 264 266 174 38.0 2e-43 MKQHLSHTPGPIGVFDSGYGGLTILDKIREVLPEYDYIYLGDNARAPYGTRSFEVVYEFT RQAVNKLFDMGCHLVILACNTASAKALRSIQMNDLPQIDPARRVLGVIRPTVECVEEISK NQHIGVLATAGTIKSESYPLEIHKLFPEIQVSGTACPMWVSLVENNESQDEGADYFIRKY IDQLLSKDPQIDTVILGCTHFPILLPKIRQYIPDHISVIAQGEYVAESLKDYLKRHPEMD AKCTKNGNCQFYTTEAEEKFSESASTFLKQQISVKHITLE >gi|225935358|gb|ACGA01000034.1| GENE 54 69202 - 69435 293 77 aa, chain + ## HITS:1 COG:no KEGG:BT_3721 NR:ns ## KEGG: BT_3721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 129 92.0 4e-29 MKEEDYSKAIEVFSGSPWEAEIIKGLLESNNIRCVIKDGIMGTLAPYIAPSVSVLVTEEE YEAATELIRSRDEKGSD >gi|225935358|gb|ACGA01000034.1| GENE 55 69404 - 70552 922 382 aa, chain - ## HITS:1 COG:MA0636 KEGG:ns NR:ns ## COG: MA0636 COG0436 # Protein_GI_number: 20089523 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanosarcina acetivorans str.C2A # 7 379 13 390 394 410 53.0 1e-114 MKHTNPQVEQMTSFIVMDVLERANELQKQGVDIIHLEVGEPDFDVPSCVSEAAKAAYDRH LTHYTHSLGDPELRREIAAFYLREYGVTVDPDCIVVTSGSSPSILLALMLLCNTDSEVIL SNPGYACYRNFVLATQAKPVLVPLSKEYLQYDIEAIRKCVNPHTAAIFINSPMNPTGMLL DEKFLREVAALGVPVISDEIYHGLVYEGRAHSILEYTDQAFVLNGFSKRFAMTGLRLGYL IAPKSCMRSLQKLQQNLFICASSVAQQAGIAALRQADSDVERMKQIYDERRRYMITRLRE MGFEIKVEPQGAFYIFANARKFTTDSYRFAFDVLEHAHVGITPGVDFGTGGEGYVRFSYA NSLENIKEGLDRINRYLSRPDF >gi|225935358|gb|ACGA01000034.1| GENE 56 70658 - 71740 812 360 aa, chain + ## HITS:1 COG:BS_proJ KEGG:ns NR:ns ## COG: BS_proJ COG0263 # Protein_GI_number: 16078908 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Bacillus subtilis # 7 341 9 342 371 197 33.0 3e-50 
MKQEFTRIAVKVGSNVLTRRDGTLDVTRMSALVDQIAELHKSGVEIILISSGAVASGRSE VHPQKKLDSVDQRQLFSAVGQAKLINRYYELFREHNIPVGQVLTTKESFGTRRHYLNQKN CMTVMLENNVIPIVNENDTISVSELMFTDNDELSGLIASMMDTQALIILSNIDGIYNGSP SDPNSSVIREIGHGKDLSNYIQTTKSSFGRGGMLTKTNIARKVADEGITVIIANGKRDNI LVDLLQHPEETLCTRFIPSDEPVSSVKKWIAHSEGFAKGEIHINECATEVLNSEKAVSIL PIGITHIEGEFEKDDIVRIMDFQGNQVGVGKVNCDSKQAQEAIGKHGKKPVVHYDYLYIE >gi|225935358|gb|ACGA01000034.1| GENE 57 72209 - 73234 438 341 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172144|ref|ZP_05758556.1| ## NR: gi|260172144|ref|ZP_05758556.1| hypothetical protein BacD2_09798 [Bacteroides sp. D2] # 1 341 1 341 341 653 100.0 0 MLLKHWNLILIFLTFSIISCESSKNEPEVNSTPSFPVGREFNVSLNIAPFLEVAEQPLSR STSTVADGIYAVNVFWKGKGLTSFQPYASGLFDNPYRVEIGLIEGYAYRFDCSFLGYKDR PYYTIQSDSILYGLPFSSTLQKEVNGLVNNDLKISINPLNINGAFHQHIYKGEMHIKCDS ISTHPTVKRFFGSEQLDFSNPNMNTSVSMTLKRAYYSIQFVTDELAPGDSIKIKAADVAP FYLLYSKDGTSRTEERIISMYDISDYYTAKLKEEETIAFTISYRPVNEEKWYSLYTNQSI KMKRNKKNIVKIVKIDEHIGDATISFGEEAEMEESEQEIGK >gi|225935358|gb|ACGA01000034.1| GENE 58 73269 - 74315 777 348 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885535|ref|ZP_02066538.1| ## NR: gi|160885535|ref|ZP_02066538.1| hypothetical protein BACOVA_03535 [Bacteroides ovatus ATCC 8483] # 1 348 1 348 348 706 100.0 0 MKIKHLMIAGCLFTAFTLQSCIKEKDLYQAPPTTYSELNLQLDGDYLSTELPMARATTPT KINESETLVGIAITMTPKEGQNPSSKPYAYGIFQLDKAKDPNNLKIKVIDGYTYRISCSM IANAKDSIIKDEKGFFGAPFDLERSGKIKGECLNKFLTAEQADGIKFLYNIENPRIQTRN GSQDACNRPFIERYHGLNDKVEVTDGMQNHIMLYRRFFGVKFKQTGLEKGRLRIKLEDAP SIYLNANADIHKTVESELKMVSMRNLTANIPTGDNQHLTENVKVEVFWEETPGAEEKTII SSSITFKRNYTHSIALTNIEHIGTPSDIGIEVESGEMGEDIEQDIPWQ >gi|225935358|gb|ACGA01000034.1| GENE 59 74195 - 74455 70 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCSLLIHWGFYKINTKKIRDSILNLVFFIYALYWYSACYSNSYLALSHCHGMSCSISSPI SPDSTSIPMSLGVPICSILVSAIEWV >gi|225935358|gb|ACGA01000034.1| GENE 60 74541 - 75797 1002 418 aa, chain + ## HITS:1 COG:SP0932 KEGG:ns NR:ns ## COG: SP0932 COG0014 # Protein_GI_number: 15900812 # Func_class: E Amino acid transport and 
metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Streptococcus pneumoniae TIGR4 # 7 416 6 419 420 343 47.0 5e-94 MTTNLNETFAAVQVASRELALLNDNVINQILNAVADAAIAETPFILSENEKDLARMDKND PKYDRLKLTEERLKGIASDTRNVATLPSPLGKVLKESVRPNGMKLTKVSVPFGVIGIIYE ARPNVSFDVFSLCLKSGNACILKGGSDADCSNRAIISVIHKVLKKFKINPHIVELLPADR EATAALLNAVGYVDLIIPRGSSSLIHFVRENARIPVIETGAGICHTYFDEFGDTNKGADI IHNAKTRRVSVCNALDCTIIHEKRLADLPLICEKLKDSHVIIYADPQAYQALEGRYPAEL LEHAKAESFGTEFLDYKMAVKTVKSFEDALGHIQENSSKHSECIVTENGERAALFTRIVD AACVYTNVSTAFTDGAQFGLGAEIGISTQKLHARGPMGLEEITSYKWVIEGDGQTRRN >gi|225935358|gb|ACGA01000034.1| GENE 61 75831 - 76787 849 318 aa, chain + ## HITS:1 COG:XF0998 KEGG:ns NR:ns ## COG: XF0998 COG0078 # Protein_GI_number: 15837600 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Xylella fastidiosa 9a5c # 1 302 3 322 336 195 36.0 1e-49 MKKFTCVQDIGDLKSALAEAFEIKKERFKYVELGRNKTLMMIFFNSSLRTRLSTQKAAIN LGMNVMVLDINQGAWKLETERGVIMDGDKPEHILEAIPVMGCYCDIIGVRSFARFEDRDF DYQETIINQFIQYSGRPVFSMEAATRHPLQSFADLITIEEYKKTARPKVVMTWAPHPRPL PQAVPNSFAEWMNATDYEFVITHPEGYELDPQFVGNAKVEYDQMKAFEGADFIYAKNWAA YSGDNYGQILSKDRDWTVSDRQMAVTNNAYFMHCLPVRRNMIVTDDVIESPQSIVIPEAA NREISATVVLKRLLEGLK >gi|225935358|gb|ACGA01000034.1| GENE 62 76821 - 77213 285 130 aa, chain - ## HITS:1 COG:MA0746 KEGG:ns NR:ns ## COG: MA0746 COG0607 # Protein_GI_number: 20089631 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Methanosarcina acetivorans str.C2A # 27 125 39 146 151 67 35.0 8e-12 MFKLNQLIVGIFLFLSSLFSCQQKGDFQSMNVEEFDSLIQNEDIQRLDVRTLAEYSEGHI TKTININVMDDSFASMADSLLQKDKPVAVYCRSGKRSKKAAAILSEKGYKVFELDKGFNS WEEAGKEIEK >gi|225935358|gb|ACGA01000034.1| GENE 63 77228 - 77929 503 233 aa, chain - ## HITS:1 COG:no KEGG:BT_3715 NR:ns ## KEGG: BT_3715 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 237 237 313 64.0 3e-84 
MKRRKLHVVWLWGLLFLMTACGDDDYYYPSVKLEFVTVEAGEDGRIQTLIPDKGEVLTVS EDRTGSTISPNTSRRVMSNYEVLSGENTATIYSLQSLITPVPKPEDDPIYKDGIKLDPVE MVSIWLGRDYLNMILNLKVSTGKGHVFGIVEDVSELKTNGIVNMLLYHDANSDGEYYNRR AYISVPLAQYIDEENPGRTIKVKFKYYTYDKDGSPIESDKYCDPGFDYTPGQN >gi|225935358|gb|ACGA01000034.1| GENE 64 77926 - 79074 972 382 aa, chain - ## HITS:1 COG:no KEGG:BT_3714 NR:ns ## KEGG: BT_3714 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 706 89.0 0 MDLTEFNEIRPYNDEELPQIFEELIADPAFQKAATGAIPNVPFELLAQKMRACKTKLDFQ EAFCYGILWKIAADHTAGLTLDHTAIPDKSKAYTYISNHRDIILDSGFLSILLIDQGMDT VEIAIGDNLLIYPWIKKLVRVNKSFIVQRALTMRQMLESSARMSRYMHYTISEKNQSIWI AQREGRAKDSNDRTQDSVLKMLAMGGEGDLIDRLMEMNIAPLAISYEYDPCDFLKAQEFQ LKRDIEGYKKTTQDDLINMQTGLFGYKGRVHFQVAPCLNDDLQGLDRSLPKPDLFARISA CLDRRIHRNYWIYPGNYVAYDWLNGTTEFVSNYTEEEKQQFMNYIEQQLAKIKIPNKDED FLREKLLLMYSNPLVNYLAACR >gi|225935358|gb|ACGA01000034.1| GENE 65 79175 - 80149 1007 324 aa, chain - ## HITS:1 COG:HI1140 KEGG:ns NR:ns ## COG: HI1140 COG1181 # Protein_GI_number: 16273066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Haemophilus influenzae # 2 321 5 303 306 179 32.0 7e-45 MKRNIAIVAGGDTSEIVVSLRSAQGIYSFIDKEKYNLYIVEMEGRRWEVQLPDGNKVPVD RNDFSFTNGTEKVVFDFAYITIHGTPGEDGRLQGYFDMMRIPYSCCGVLAAAITYDKFTC NQYLKAFGVRIAESLLLRQGQSVSDEEVVEKIGLPCFIKPSLGGSSFGVTKVKTKEQIQP AIVKAFGEAEEVIVEAFMDGTELTCGCYKTKEKSVIFPPTEVVTHNEFFDYDAKYNGQVD EITPARISDELTKRVQMLTSAIYDILGCSGIIRVDYIITAGEKLNLLEVNTTPGMTTTSF IPQQVRAAGLDIKDVMTDIIENKF >gi|225935358|gb|ACGA01000034.1| GENE 66 80146 - 81219 1061 357 aa, chain - ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 26 346 1 300 305 239 43.0 9e-63 MIEEELPDELENDLDDIEPVGDESQLYEHFRVVVDKGQAMVRVDKYLFERIVNASRNRIQ 
KAAEDGFVMANGKPVKSSYKVKPLDVITVMMDRPRYENEIIPENIPLTIVYEDPYLMVVN KPAGLVVHPGHGNYHGTLVNALAWHMKDIPDYDANDPHVGLVHRIDKDTSGLLVIAKTPD AKTNLGIQFFNKTTKRKYRALVWGVMEQDEGTIVGNIARNPRDRMQMAVMSDPTVGKHAV THYRVLERLGYVTLVECILETGRTHQIRVHMKHIGHVLFNDERYGGHEILKGTHFSKYKQ FVNNCFDTCPRQALHAMTLGFVHPVTGEEMYFTSELPDDMTRLIEKWRGYISNRELE >gi|225935358|gb|ACGA01000034.1| GENE 67 81246 - 81899 667 217 aa, chain - ## HITS:1 COG:no KEGG:BT_3711 NR:ns ## KEGG: BT_3711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 217 1 216 216 310 70.0 3e-83 MTIKEFFSFKTNKYFWVNIIAMIVVMVVMIFGVLKWLDIHTHHGESVVVPDVKGMTVEEA TKMFRNHGLVYVISDTKYVKDKAAGIILELKPGAGEKVKEGRTVYLTVNTLDVPLRAIPD VADNSSLRQAQAKLLNAGFKLNQIQLVNGEKDWVYGVKYQGRQLAAGEKIPLGASLTLMV GDGSGDMSEEEDSVDVSVDTEKPVTSESSPAQDDSWF >gi|225935358|gb|ACGA01000034.1| GENE 68 82270 - 82428 266 52 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885524|ref|ZP_02066527.1| hypothetical protein BACOVA_03524 [Bacteroides ovatus ATCC 8483] # 1 52 1 52 52 107 100 4e-22 MKRTFQPSNRKRKNKHGFRERMASANGRRVLAARRAKGRKKLTVSDEYNGQK >gi|225935358|gb|ACGA01000034.1| GENE 69 82621 - 83187 765 188 aa, chain + ## HITS:1 COG:MT2609 KEGG:ns NR:ns ## COG: MT2609 COG0231 # Protein_GI_number: 15842068 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycobacterium tuberculosis CDC1551 # 1 188 1 187 187 179 44.0 2e-45 MINAQDIKNGTCIRMDGKLYFCVEFLHVKPGKGNTFMRTKLKDVVSGYVLERRFNIGEKL EDVRVERRPYQFLYKEGEDYIFMNQETFDQHPIAHDLINGVDFLLEGSVLEVVSDASTET VLYADMPIKVQMKVTYTEPGMKGDTATNTLKPATVESGATVRVPLFISEGETIEIDTRDG SYVGRVKA >gi|225935358|gb|ACGA01000034.1| GENE 70 83258 - 84658 1542 466 aa, chain - ## HITS:1 COG:TM0156 KEGG:ns NR:ns ## COG: TM0156 COG1785 # Protein_GI_number: 15642930 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Thermotoga maritima # 1 465 1 421 434 224 34.0 2e-58 
MKKLVYTLFFVLISVVANGQAKYVFYFIGDGMGVNQVNGTEMYQAEIQNGRIGVEPLLFT QFPVATVATTFSAKNSVTDSAAAGTALATGKKTYNGAISVGEDKNAIQTVAEKAKKAGKK VGVTTSVSVDHATPAAFYAHQPDRNMNYEIALDLPKANFDFYAGGGFLKPTTTYDKKEAP SIFPIFEEAGYTVARGYNDYKAKAAKAEKMILIQEEGANPSCLPYAIDRKENDLTLAQIT ESAIDFLTKGNNKGFFLMVEGGKIDWACHANDAATVFNEVKDMDNAIKVAYEFYKKHPKE TLIVITADHETGGIVLGTGKYALNLKALQYQKHSADGLSQRISELRKSKGNKVTWEDMKT FLGEEMGFWKQFPLSWEQEKTLRDEFEKSFVKNKVVFAESMYSKSEPMAARAKEVMDEIA MIGWVSGGHSAGYVPVFAIGAGSQLFGEKIDNTEIPKRIAKAAGYK >gi|225935358|gb|ACGA01000034.1| GENE 71 84820 - 86415 1417 531 aa, chain - ## HITS:1 COG:no KEGG:BT_3705 NR:ns ## KEGG: BT_3705 # Name: susR # Def: regulatory protein SusR # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 51 582 582 833 88.0 0 MKNLTLYMLFLLLLFPFSLCAMHTDNAAALKKLDEVISKKETFQIRKEKEINNLKLELAH STDPVRKYELYASLFGAYLHYQADSSLYYINREMEILPLLNRPELEYEIIINRATVMGVM GMYIEAIEQLERIDPKKLNEWTRLSYYQTYRACYGWLADYTTNKNEKEKYLKKTDLYRDS IIAAMPPEANKTIVLAEKCIMNGKADVAVDMLNNALKEIQDERQKVYIYYTLSEAYSMKK DIEKEVYYLILTAIADLETPVREYASLQKLAHLMYESGDIDRAYKYLSCSMEDAVACNAR LRFIEVTEFFPIIDKAYKLKEEKERAVSRAMLISVSLLSLFLLIAIFYLYRWMKKLSVMR RNLSLANKQMSAVNAELEQTGKIKEVYIARYLDRCVNYLDKLETYRRSLAKLAMASRIED LFKAIKSEQFIRDERDEFYNEFDRSFLKLFPNFISAFNNLLVEEGRVYPKSDELLTTELR IFALIRLGVVDSNKIAHFLGYSLATIYNYRSRMRNKAAGDKDMFEQNVMNL >gi|225935358|gb|ACGA01000034.1| GENE 72 86660 - 88513 1352 617 aa, chain + ## HITS:1 COG:BH2927 KEGG:ns NR:ns ## COG: BH2927 COG0366 # Protein_GI_number: 15615490 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 127 615 134 577 578 172 28.0 2e-42 MKRNLLFAILLLLLSGYHQAFAATNIKKVAPTFWWAGMKNPELQILLYGDGISSAEVSIS SNDITLQDVVKQENPNYLILYLDLSKAIPQHFDILLKQGKKQTKIPYELKQRKENASAVE GFNSSDVLYLIMPDRFANGNPSNDIIPGMLEANIDRNEPFARHGGDLKGIEKHLDYIADL GVTSIWLNPIQENDMKEGSYHGYAITDYYQVDRRLGSNEEFRNLVKEANAKGLKVVMDMI FNHCGSNNYLFKDMPAKDWFNFEGNYMQTSFKTATQMDPYTSDYDKKLAIDGWFTLTMPD FNQRNRHVATYLIQSSIWWIEYAGINGIRQDTHPYADFEMMAHWCKAVNDEYPSFNIVGE 
TWLGSNVLISYWQKDSKLAYPKNSYLPTVMDFPLMEEINKAFDEETTEWNGGLFRLYEYL SQDIVYADPMSLLTFLDNHDTSRFYRSEEDTKNLNRYKQALTFLLTTRGIPQIYYGTEIL MAADKANGDGLLRCDFPGGWQNDAKNCFEATNRTPQQNEAFSFMQKLLQWRKGNEVIAKG KLKHFAPNKGIYVYERKYGNQSVVVFLNGNDREQTIDLHPYQEILSTSSAHDLLTDKKIE LRNELTFPSRGIYLLAF >gi|225935358|gb|ACGA01000034.1| GENE 73 88711 - 90927 1928 738 aa, chain + ## HITS:1 COG:no KEGG:BT_3703 NR:ns ## KEGG: BT_3703 # Name: susB # Def: alpha-glucosidase SusB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 738 1 738 738 1435 92.0 0 MKKKKFFSIIAFLCISFIANAQQKLTSPDGNLVLTFQVNKEGAPTYDLTYKGKVVIKPST LGLELKKEDNTRTDFDWVDRRDLTKLDSKSNLYNGFKLKDAQTTTFDETWQPVWGEEKEI RNQYNELAVILFQPMNDRSIVVRFRLFNDGLGFRYEFPQQKSLNYFVIKEEHSQFAMAGN HIAYWIPGDYDTQEYDYTISRLSEIRGLMQQAITPNSSQTPFSPTGVQTALMMKTDDGLY INLHEAALTDYSCMHLNLDDKNMIFESWLTPDAKGDKGYMQTPCNSPWRTIIVSDDARNI LASRITLNLNEPCKIADAASWIKPVKYIGVWWDMITGKGSWAYTDELTSVKLGVTDYSKT KPNGKHSANTANVKRYIDFAAANGFDAVLVEGWNEGWEDWFGNSKDYVFDFLTAYPDFDV QEIHRYAASKGIKMMMHHETSASVRNYERHLDKAYQFMVDNGYNSVKSGYVGNIIPRGEH HYGQWMNNHYLYAVKKAADYKIMVNAHEATRPTGICRTYPNLIGNESARGTEYESFGGNK VYHTTILPFTRLVGGPMDYTPGIFETHCNQMNPANNSQVRSTIARQLALYVTMYSPLQMA ADIPENYERFMDAFQFIKDVALDWDKTIYLEAEPGEYITIARKAKGTDDWYIGCTAGENG HDSQLTFDFLEPGKQYVATVYADAKDADWKDNPQAYTIKKGILNNKSKLNLHAANGGGYA ISIKEVKNKSEVKGLKRL >gi|225935358|gb|ACGA01000034.1| GENE 74 91033 - 94080 3326 1015 aa, chain + ## HITS:1 COG:no KEGG:BT_3702 NR:ns ## KEGG: BT_3702 # Name: susC # Def: SusC, outer membrane protein involved in starch binding # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1015 1 1003 1003 1248 63.0 0 MKQVKFRIVQTILPLLIGMFLSLGAYAQQITVKGHVKDAMGEPVIGANVIAKGTTTGTIT DFDGNFTLNVPQNSILSITFVGYKAAEIKAAPSVMVTLEDDSQVLDAVVVVGYGTVKKND LTGSVTAIKPDKISKGVTTSAQDMITGKIAGVNVISSGTPGGGATIRIRGGSSLNAKNDP LIVIDGLAMDNSGVQGLTNPLAMVNPNDIETFTVLKDASATAIYGSRASNGVIIITTKKG KAGSKPQVNYEGNVSAGILQKTIDVMDANEFKGYVSKLYGEGNAPSPFGEANTDWQKEIF QTAVSTDHNVTVSGGLKNMPYRVSFGYTNQNGILMTSNFERYTASVNLTPSFFKDHLKFN INAKMMWANQRYADDGAIGAALTMDPTQPVYDSSDMYKNFGGFYQPTSDGSSYNDPEWPL 
TLESNSTANPVSLLKLKKHTSRNTSFISNVEVDYKFHFLPDLHIHANVGGDYSEGKEKNV NSPYAPGSYYYGWNGTDYGYKYNLSVNAYAQYSKEIGDHYVDVILGGEEQHFHYTGYKVG QGTNPLTGEAYNPNLRSQTAWGHHSTLVSYFARVNYTLLSRYLLTATFRQDGSSRFSKDN RWSSFPSVALGWKLKEENFLKNVEVLSDLKLRLGWGITGQQNLGDDYDFPYMALYRVNAA GGYYPFGDTYYGTMRPKAYNEDLKWEETTTYNAGFDFAFLNGRISGSMDYYYRKTDDLIN TVKIAAGTNFNTQLISNIGSLKNTGLEFTINAKPIVTKDFVWDLGYNITWNKNEITKLTG GDDSNYYVETGGVSTGISGATCQVQKVGYPMNSFFVYQQVYDKDGKPIENMFVDRNGDGV INASDKYIYKKPAADVLMGLTSKFTYKNWDLSFALRASLNNYVYNDVLASKSSVGKGGIF NHGYYSNRPTAAVNLGFEGKGDYYLSDRFVENASFLRCDNITVGYSFKNLLKSQAYKGIN GRIYGTVQNPFVITKYTGLDPESVISSGNDAGVAGIDRNIYPHPITILFGLSLQF >gi|225935358|gb|ACGA01000034.1| GENE 75 94110 - 95720 1375 536 aa, chain + ## HITS:1 COG:no KEGG:BT_3701 NR:ns ## KEGG: BT_3701 # Name: susD # Def: SusD, outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 536 1 550 551 565 53.0 1e-159 MKFKYIKSIVSAALFLSLTTGMTSCINDLDISPIDPQMTATFDQDMYFTKLYASLGLTGQ KLSEDPDIAVKDEGQSCFYRALFTNNEYGTDEMIWTWQENAGIPELTYMRWNSSHQQTEI LYNRLAYNITLCNFFLDQIAGKEDATSVQQRAEARFLRSLFYYYLMDTFGKAPFTEHFSK ENPPQKTASELFAYIESELESIENDMSEPRQAPFGRADKAACWLLRARLYLNAEVYTGQP RWNDAITYAGKVLDPSNGYGLCGNYEQLFMADNDENPDAKKEIILSIRQDGVQAKSYGGS YFLIAATQKSDMPNRGTNDPWECIRTRKALVDKFFANSEDIPFTEYTDNKWQNVRDVQAA AKDERALFYTTGRKAELESVGKFTDGLSFMKWSNLRSDGQPAHDAKIPDTDIPFFRLAEA YLIRAEAYLRAGGANAQQNAWLDIKALRDRAKATEIPSANNLTLDYILDERARELYLEGF RRTDLIRYGYFTSSTYLWDWKGGSFEGNGVSSIYNLYPIPKTETLTNTNMTQNPGY >gi|225935358|gb|ACGA01000034.1| GENE 76 95756 - 96928 949 390 aa, chain + ## HITS:1 COG:no KEGG:BT_3700 NR:ns ## KEGG: BT_3700 # Name: susE # Def: outer membrane protein SusE # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 386 8 384 387 355 50.0 1e-96 MKNLYKLFTLTMGLLALSACEADRDSNPVLNEPDTFVLNVPAFASNNVYDLKNSESLELT CTQPDYGIPMATTYSVQISLEENFVDAHAETNTEANYTTLGTTHSSAKMEVKALEFALAL GDLWSASSDEEFPTTPIPVYVRLKAELTDSGRGIAFSNVIELPKVLGYKAVPPLELPSSI FINGSMAGSNWSNWVPLAAVNGMSKFFGLFYFGGTDMFKFGTKEGEYIGFNDPRLTIASD AFTGSDDGFGGQNISVNVTGWYTVIMSVSIKGTDYAFTLDIAPGEVCLIGNAIGDWTFGD 
KGKFQAPTTADADFVSPVCTGGGELRMSVKVPGEDWWRTEFAIPNGKIVFRENKSVIDSW SEIGPEYAINVKAGQKINLNFVQKTGSVTQ >gi|225935358|gb|ACGA01000034.1| GENE 77 96954 - 98435 1128 493 aa, chain + ## HITS:1 COG:no KEGG:BT_3699 NR:ns ## KEGG: BT_3699 # Name: susF # Def: outer membrane protein SusF # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 485 485 273 37.0 9e-72 MKRLSYMACLLLASAAFTACDEDFKDWADPQSNPQEEAITAMVDITPVASMKLEEQPGDS VVIASVSSIAENFSLTACNIELVAEGQILNLPSKVKDGNIKVRLTELDQKVASLYKSQKS LEREVTLKLTPVVMTTNGEATSLTEYPEMQSAITPVATPAVDTEYYIVGDLNSWQMDKST ATKLEVDKDNQYLFSVVVESEEKFDFKIVPGSAIEAPDAWQRALGASKVIEDPDPGLLAF RDKEGADPDNLTCAGGKKMKITINVEDYTYTIKEDLPEHMYINGSPYSLGWDWAVAPEMV PVTQTPGMFWSIQYYTAGDQIKFAPVRKWEGDFGYDEEILSPEAIDFAELTSSGGNIGIG KSGWYLVIVAVTTEGKTISFRLPEVYLLGGVINNSWNCDETTLFRIPTDKTSDFISPAAT VTGMARISTTAVDAGGWWKSEFTLDLANEGDGTIVYRENKNVSDNLSELGYECNVKAGQK VHINFTTGKGKVE >gi|225935358|gb|ACGA01000034.1| GENE 78 98599 - 100875 1473 758 aa, chain + ## HITS:1 COG:MA3032 KEGG:ns NR:ns ## COG: MA3032 COG0296 # Protein_GI_number: 20091850 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Methanosarcina acetivorans str.C2A # 290 691 134 549 627 130 27.0 9e-30 MKDFKYIWLLLLLILNSFGACSDDDPLMPGERPSSGTDPAPEEQVLHDGFNFDPAIPKAD EPLTITFKAPEGSNFYGYADDLYLHSGTGANWTGAPTWGDNQNKYRLKKTKDNVWSITLS SSIRHFYSVAPSTPLQTINLIVRDAEGSQQTYDYATLVEDSQNGFIWEEPQKTPLPISGE EKEGIHIHSATSITLVLYDKDSQGGHKDCVFVTGNFNNWKLDSRYMMKYDETNHCWWITL EELTAGETQFQYFVYSASDGGTYLCDPYCEQALEKGVDTNFPTGAQAPYVSVVSTNPQPY QWSAGEFEMKNKENPVIYELLLRDFTSSGNLAGAMEKLPYLKELGIDAIELMPVQEFAGN DSWGYNTGLYFALDASYGTQNEYKAFIDACHQNGIAVIFDVVYNHTNNDNPFARMYWDTF NNRPSTKNPWLNAVTPHQKYVFSPDDFNHTSEQTKAFVKRNLKYLLDTYHIDGFRFDFTK GFTQKQTTGDDDLAATDPARVSVLKEYYEAVKAVKEDAMVTMEHFCANEETTLATEGIHF WRNMNHSYCQSAMGWKDNSDFSGLYDTTRPNQFVGYMESHDEERCAYKQIEYGNGALKTN LSERLKQLSSNAAFFFTVPGPKMLWQFGEMGYDISIDENGRTGKKPVLWEYQTERKSLVD IYTKLITLRTTHSDLFNASSQFTWKVSYNDWDNGRTLTLKAVNGKQLHVYANFTNASIDY TIPEGTWYLYLENGNPVEGEKKISVPAHEFRLYTNFAE 
>gi|225935358|gb|ACGA01000034.1| GENE 79 101041 - 101805 556 254 aa, chain - ## HITS:1 COG:NMA0723 KEGG:ns NR:ns ## COG: NMA0723 COG2908 # Protein_GI_number: 15793700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 233 1 222 240 87 29.0 2e-17 MKNIYFLSDAHLGSRAIEHGRTQERRLVNFLDSIKHKASAIYLLGDMFDFWYEFRLVVPK GYTRFLGKLSELTDMGVEVHFFIGNHDIWCGDYLTKECGVIMHREPLTTEIYGKEFYLAH GDGLGDPDKKFKLLRSMFHSKTLQTMFSAIHPRWSVELGLTWAKHSRQKRADGKEPDYMG ENKEHLVLYTKDYLKSHPNINFFIYGHRHIELDLMLSATSRVLILGDWINFFSYAVFDGE NLFLEEYIEGETQV >gi|225935358|gb|ACGA01000034.1| GENE 80 101878 - 102189 352 103 aa, chain - ## HITS:1 COG:CC1859 KEGG:ns NR:ns ## COG: CC1859 COG2151 # Protein_GI_number: 16126102 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Caulobacter vibrioides # 6 100 21 115 118 100 52.0 6e-22 MEKIEIEEKIVAMLKTVYDPEIPVNVYDLGLIYKIDVSDSGEAALDMTLTAPNCPAADFI MEDIRQKVESVEGVNSATINLVFEPEWDKDMMSEEAKLELGFL >gi|225935358|gb|ACGA01000034.1| GENE 81 102350 - 104989 2318 879 aa, chain - ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 37 799 3 734 755 207 27.0 8e-53 MRRIFVSLLLLSLFFMGYAHEPEFSIAGFFRLPDTGRDVYSMNPAWRFYKGSAVGAEAKE FNDKAWQVVSLPNGIEYLPTEASGCINYQGEVWYRKHFTPSEALKGKKLFLHFEAIMGKS KIYVNGKLLAEHFGGYLPVSVDVTDALKWGEDNVIAVWADNSDDPTYAPGKPQDVLDYAY LGGIYRDCWLIAHNYVFITDPNYENEIAGGGLFIATDRVSEQSADILLKAHIRNEDKQGF AGKISYSLVDREGKEVASSDVKLNIRKGTATSHNGKMKVEQPHLWTPESPYLYNLCVRIY DKNGNVVDGYRRRVGIRSIEFKGKDGFWLNGKPYGKPLMGANRHQDFAVVGNAVANSIHW RDAKKLKDLGLEVIRNAHCPQDPAFMDACDELGLFVIVNTPGWQFWNDEPIFAKRVYNDI RNMVRRDRNHPCVWLWEPILNETWYPEDFAGRVKGIVDEEYPYPSCYSGCDVQAKGSQYF PVQFAHPMDLSKRDPKITYFTREWGDNVDDWSSHNSPSRTARNWGEQPMLIQAAHYASPY YNYICYDGLYKESPWHVGGCLWHSFDHQRGYHPDPFYGGLMDVFRQPKYSYYMFKAQRPA 
VVSESLAESGPMVYIAHEMTPFSSRDVTVYSNCDEVRLTVNKDGQTYTYKKDKTRKGMPS PVITFPGIFDFMVDKKMTREKHDADVYFLAEGLMDGKVVATHKVMPARRAEQIRLRVDNE GIGLRADGSDFVTVVAEITDKNGNVKRLNNYYIKFFVEGEGRILGGANVLANPAPVKWGS APVLIQSTTTPGKIKVRASVLFEGSQMPVSGELELTSSAPMYPLIFDKKEEALPKPSESF SMDKKSDAELELERRRRELNELKLKEVEQQQTDFGEKVD >gi|225935358|gb|ACGA01000034.1| GENE 82 105155 - 107194 1454 679 aa, chain - ## HITS:1 COG:PA4082 KEGG:ns NR:ns ## COG: PA4082 COG3210 # Protein_GI_number: 15599277 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Pseudomonas aeruginosa # 365 575 530 748 1018 61 34.0 5e-09 MNKIYSLLGIGLLSAATLSSCKEDVFIEGGDELQRGESQTYVAVASIRGYENTDKESSTR ANVQDDGSSFMWNADDKVTLWNGTNGYDFTTINYDESEPSGNVEFTGNGNFEEGATVWGI YPKKDAPTLGNVFTFTLGDATQSAQKAELQNTMHMLAKGTVNGTTVTNLKFEHLTALYQF KFTNRRPDAYKVTKVVVSADAAIFPKTLTVSGEEKTYGDKSNSLTLSMTSLEMAKNEVAY GYLSFFPMADMTKDTELTFTATIEKVGDSSSTETIEKKGKISELYNAESVVAGDEYKYVA GKRYGISFMLVADLGYEETEAGKYLVKKEDGLINLASEPTVMTNAATVITLDADLDMSTK EAWVPVTEFKGTLDGNGKTIFGLTIEATGNDAGLFITNNGIIKNLTLKDVTVKQVPGAAG AFAAINTGTIQNCVIDGGALTVNGADAKLGAITGHNQTGESMIKDCKVKGNVVLTVAGGK VNAGGLAGVNGWWSKAQIQGSSIDKEVSFIYRGNGEGAIGGLVGWNVQGTITGCYSLMTI TAFTAVNAGGLVGGNEGPVTASFAAGEIVAKASGNIGGLVKNGGTLTGCYSTSVLSGTAS VTICGISTGSVTANECYFMSDGVSNPGGNLPTSTKVSDAAALIDKITSMNQAVAGSGYKY VENTGTDSARVPLLIQPDE >gi|225935358|gb|ACGA01000034.1| GENE 83 107222 - 109417 1832 731 aa, chain - ## HITS:1 COG:PA4082 KEGG:ns NR:ns ## COG: PA4082 COG3210 # Protein_GI_number: 15599277 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Pseudomonas aeruginosa # 386 640 530 816 1018 60 27.0 9e-09 MKKVYHLLSVALLGAMALTSCEEDKIVNENNGEGNETDKNLTDYSFIASIKQSAPLGRSN LQNGVYTWNKGDAVTLWNRSFGAGYDFSITPGYNDNQPDKSAEFTGKAAVENGHKLIAVF PRKEAKTFNDLATFSMPETFTQTGKTAELAATTYMVATGDVTDNKIPALTFSPLTALIQF GLKNTSDRELKIRYITLESDDDVFPAELKIDEDGVVQSLSGMRNKLTLDMSGQALAQNET 
LNGYLNILPTTYGDTRLMKSTTELNITVSVLNNEVEQDIILLKKVKVKDLEDNIGLDMDA TANQFAAGKHYKMDFEVDYRFRIPDEGYMIDDDGNIHIYNKTGLFGWNKIADEYRKATVT LEKEYIDEPAGDGIKVIDMGNELWEPISAFGGVFEGNGVTIRNLQIANKGFIATNTGTIR NLTLENVSFSADITEGAGALAAESSTSVIQNCTVKGVTATVIKPAVFGGLIGRNSEGRIE GCQVISGTINLNLSGAGNSNYGGLVGEHFNGTALIINSYVGADVTIKHPSNSSGASCVGG LVGWNNSGKVKGCYSLAKLEVSCSGQVGGLIGANSNGTVLACYVAGSISGTIYNNTGGFI AQNTINNADATVTACYSTTQINVTNNAGSNKLGAFVASNSTGINQCYFVGETVTNPVGSG SVNGISKVTATQLKDKKRQMNLAIEAKDPEFGFSFKVNEDAGSNLYCPLILQGAVKEPGF GGSDFGDGGDI >gi|225935358|gb|ACGA01000034.1| GENE 84 109465 - 111417 1665 650 aa, chain - ## HITS:1 COG:no KEGG:BT_3176 NR:ns ## KEGG: BT_3176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 637 2 611 623 317 33.0 8e-85 MKKYFYRLLPICLFILVMTGCNDTETVISIGEARNLVAEISTQSAAIGDNVTFTVKVDQQ DNQLALEEDIDVVLTFAGKNVEGKDVPVSDVFEGFAGHLFMKKGEKQGFTEFKVKNDLAK YPISGTITAYVRGYKINAAERPIVVSDKHYTILSLKNNSDNTVKEYGSFILVATVGASAK EDVVVHIDAGEDADKYENLPSELVVKAGYKTAESELIWVKGEKGPNTFTSVSMSFATDSE IHPVYGDELEIKVTDTDAGLTPGTELTNEQWVYTDPDQIFVSASNKKAVEKWDEVRAASA LLIKEGDPHPNEALAAEGWTFLNSYEFHPIDALTEGTGLPNQYGNRPPRFMAAQNVANTQ KVQAVVNEKYATMIQDGYLKMWCAYDPGISVTGEITGTRDFGVSSLYASKFDGVPTGADS WESSNVRILPGTRVEVRIRVRGKKHSFNSAVWFQGNIRGVQWSTYGEVDLLENPATNANP NGAWQTFHWNDISTSSGDKYKPSSGQLIISDMDEFDIYWMEWRDNNEIALGINGKENVRM RTDGSYTGIANGTASWNSSTHWPFTDKYNTEGLHLLLTFAGCNEWALGDAAAEQAAKDGS WANDFKHISYQESKSSNDTPRMEIDWIRFYKKPTYDYFGSGTPTRNKPMY >gi|225935358|gb|ACGA01000034.1| GENE 85 111447 - 113237 1572 596 aa, chain - ## HITS:1 COG:no KEGG:BT_3175 NR:ns ## KEGG: BT_3175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 596 4 616 621 674 56.0 0 MKRIYIMAFVAASILPFSSCSDFLDREPITKPEAGNFLTGAIQVENYINGLYMTLPSFSK FGLGVRSEEKNSDNIIAEKYDARLHGQNNQFSGASDWQTGYQNLRKVNYFFHNYKVPEAQ ENEDVLSLKGEAYFLRAYWHFDLLRKFGSIPVMDAFWDENATIAGLQIPAKTRNEVARFI LSDLVEAKNLLHSRGKYSGIRINKEAAMVLAMNVALYEGTWEKYHSSDDFASSTNESNYF LGEVINWGNELFGCGIDLYKTGQNPGDAFAALFNSKDLSGMGEVLLWRKYSSDEGVFHDV 
NGNLKAGVVDSEGAAGITQSLVDNYLNADGTFIDPTNEKFKDFKETFEGRDPRLIQTVMN EGAKFASATTATPMHLEEYTDEKKKNTISPPKLAGDGNTRSLTGYHIRLGIDTTFVSGNG ETALPIIRYAEGLLAYAEAAAELEMWSDDIANKTLKALRERAGVKYLAPAKDANFTDFGY TLTPVLQEIRRERRSELALQGFRLDDLMRWKADKLIVGKRGKGAYVGDESILFKSYSPDN QKRIRERLTLDDNKWADPMAGTLPSGYQFHADRDYLLPIPPSELELNKKLKQNPKW >gi|225935358|gb|ACGA01000034.1| GENE 86 113248 - 116574 3178 1108 aa, chain - ## HITS:1 COG:no KEGG:BT_1631 NR:ns ## KEGG: BT_1631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 1108 29 1119 1119 1505 68.0 0 MKKMQIRMQLVCLFSMLLLFFPVELFAQQGTITGKVVDERGESVIGATVMVKGSTDGTIT DLDGNFKINGKVGTTITVSYVGYAPLEVKITKLTGNRFVMKEDSKTLDEVVVVGMGTQKR NTITAAVATVNADAIATRPVTDVTSALQGNVAGLNFASDASESGTGGEVGGEIKFNIRGI GSINGGEPYVLVDGVEQSLQNVNPADIESISVLKDASAAAVYGARAAYGVVLVTTKSGKQ DKARVTYRGSVGMSAPINMPEMMNSVEFANYKNAYRSAIGESPYFSQETIDLMNQFLQNP YGEGLPGITANQENTGWRGGESQYANTDWFDYFYKNKSLRHSHNLSISGGSDKVTYYVGL GYTYQGGLLDRVEDSLDKYNVNTKFQIKPNDWLKFNFNNNITLNLISRPLPDMSILYHQI ARSQPNMVTEVPISGLYNLPSWNEALYLDNVGYSQNRISDAMTFAATVTPLKGWDIIGEM KVRFDVQNDEFLQKQPRTTRPDGTIENVTAPRQGYSYPGIEYSNSLWGSMTRGNQFNYYL SPSVSSSYTNQWGDHFFKAMAGFQMEVQQNSSMYAYKDGVMSDDTFSFDNANGKAYNSEA RDHWATMGFYTRLNWNYQNVYFLEFSGRYDGSSRFASGNRWGFFPSFSAGYDIARTDYFK QWSLPVSQLKVRLSYGRLGNQNGAGLYDYLNFMTLRPDYSNAWLLSGVTSATPVRGTVAL TPSMVSPYITWEKVDNANLGFDLTLLNNRLTITADIYQRTTKDMIGPAEAIPDIGGIAVD QRAKVNNATLRNRGWELSVNWSDQLKNGFSYGIGFNIFDYKAVVTKYNNPEGLIYNNHTG LVRNKGYYQGMDLGEIWGYEANDLFLTNQEVDEYLRGVDMSFFKPNKDWQRGDLKYIDSN GDGKIDPGSGTLKDHGDLKIIGNTTPRYSFGLNLHAGYKGFEVSALFQGVMKRDFPMAAS TYLFSGDSNFFKEHLDYYNVYNPGAYLPRLTKPDSPEYNANVGYNTSRYLLNAAYMRLKN LMISYNFQPKLVKKMGLENLKVYFTCDNLFTIDNLPDVFDPETLNQVNTWAGGSNETAPG LTSPMKQNGNGKVYPLNKNYVFGIEFTF >gi|225935358|gb|ACGA01000034.1| GENE 87 116832 - 120767 2344 1311 aa, chain + ## HITS:1 COG:slr1393_3 KEGG:ns NR:ns ## COG: slr1393_3 COG0642 # Protein_GI_number: 16329802 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 791 
1022 45 296 301 128 32.0 8e-29 MKKSIWIKHFFCLLAFLIVGTFPAISIGNHYDFKLIGSQNGMPSIISCIYTEQKGFIWIG TPTGLIRFDGTELKKYTAQPDNENSLPNDSILQIIEDNQQTTWILTTTGIARYSLIHDNF FIPKLNNQPIIAKSICKIADGLLFGGINKIYKYSYATNTISLIREFKTDEPFIIRSIQMW KPDTILCLSWQQGILQMNLKNFEITPFPFQYGNDNVGIIIDSQQRIWLSTYNNGVKCFSA QGKLIASYTTQNSRLSSDIVLCMTEFKQKIWMGTDGGGINILDPTTGNITVLEHISGDNH SLPTNTILCLHSDSAGNIWAGSKREGVINIREVSMRTYTNVVLGHNKGLSQSTVLQLYQE PASEELWIATDGGGINKFIPSTETFVHYPDTWGDKMVSITGFTPNELLVSAFSKGIFIFD KKTGKKRPFAIMSPSLEHHIRYSGQPINLYRDTPNSILILSSPPYRYHTSTRQLTPVPCA KEVKIKGLLSSIGYDSTAIYLNDPYNIYRLDRKKDTVEIFASAPQGFFNAVSRDDKGVFW IAGTRGLYTYHPDTRKFERIPTTLFNEVNTVLCDNKGKVWIGAEQKLFIWLTDTKRFVWF GEADGISPNEYLAEPRLISPKGNIYMGGVKGLLCINADTQIEKSSDSPEIRLAELTINGE NRMYQLNQDKISIPWNSKNIKIRVMTYGADILRLKVYHFQMGDLKTEGYNPELMIPSLAP GSYPIMVSCNTQEGNETTPQFLFTLNVLPPWYRSWWFVSLCIIACIGIVVQIVFVLLRRK ENKMKWMMKEHEQNVYEEKVRFLINISHELRTPLTLIYAPLSRILKTLPSTDAIYPPLKN IHKQAQRMKDLINMVLDVRKMEVGETKLKLQAYPFNSWIKEIGADFTDEGAAQEVQIDYQ LDDSIKEVVFDKGMCTIVVTNLLTNALKHSPKNTTVTIRTSKNDSYVRVSVLDQGEGMQE EDMEKVFIRFYQGKQEIGGSGVGLSYAKMLVELHKGKIGAMNNEAGGATFFFDLPLGLES GEVICQPKPYINELITPDVDHEINAPETLDFSTQNYTVLLVDDNHNLTDFLSKELKSLFK AIYIAHDGREAFEIAQKQVPDIIVSDVMMPVMNGYELCKAVKENIGISHIPVILLTARND EQSRLYGYKIGADAYLGKPFEIDTLVKIIQNRLYNREQTKEHYQHIGNFPQPLESTFSQA DETFLLKFNKLISENISNPSLDIPFICREIGMSKTSLYNKLKAITDMGANDYINKFRLEQ AITLIKTTDLTFTEISDQIGFTTLRYFSTAFKQYTGLTPTQYKNECRSENK >gi|225935358|gb|ACGA01000034.1| GENE 88 120801 - 121496 470 231 aa, chain - ## HITS:1 COG:MA1979 KEGG:ns NR:ns ## COG: MA1979 COG2003 # Protein_GI_number: 20090827 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Methanosarcina acetivorans str.C2A # 5 231 4 228 229 133 30.0 3e-31 MESKHKLSINQWALEDRPREKMMEKGAAALSDAELLAILIGSGNTEESAVELMRRLLLSC DNNLNSLAKWEVCDYSSFKGMGPAKSITVMAALELGKRRKLQETKERLRITCSKDIYDIF QPIMCDLEQEEFWVLLLNQATKLIDKVRISTGGIDGTYTDVRTILREALLQRATQIAVVH NHPSGNIHPSQPDRSLTEHIHKAAETMNIRLIDHVIVCEDGFFSFADEGLL >gi|225935358|gb|ACGA01000034.1| GENE 89 121641 - 123656 1283 671 aa, 
chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 37 472 41 495 757 101 24.0 5e-21 MKKRISILSILLLLGMTVYAQTDIPQVIPSLQHWNGAKGKLTLPETGKIIIAPEAESLLK ETAEIMAKDLKDMFGWNYQVTSGKPKNHSIYLSVKKSPSSLGEESYELDIRNHVTIEAST VKGVFWGTRTLLQMIHNQPFGLMKGKALDYPQYAHRGLMIDVARKFFTMDYLQDYVKILS FYKMNELQIHLNDNGFVEFFDNDWNKTYAAFRLESDRFPGLTSKDGSYTKEEFRNFQQMA ARYGINIIPEIDVPAHSLAFTHYNPILAADKKEYGMDHLDLYKKEVYDFLDTLFDEYLSG DHPIFVGPDVHIGTDEYNLKEAEQFRYFTEYYLKYITKYGKNPRLWGSLKHMKGNTPVNL KGKTVNAWNYSWLDLETALQEGAKAINTCDAFLYIVPAVNYYHNFLDHQWIYESWSPRMM QEGEMIEQSTNLLGAMFAVWNDRVGNGISQQDVHIRTFPAMQVMSEKLWKGENTRNIPFE TFETWCRTTPEAPGVNLQASIDDRKDLTLSGQEIILQGNDSVLTSIPEIGYPYSVEFEIY ADPKPNIDAVLFKGPHSVFTANWQNTGKFAFSRDGYEFIFHSYRLPVEKWTKVRVEGDAK GTSLYINGELQERLEGRIGVVYNQKSLRKDSIWYQETLIFPLKQIGDKLLGFKGRIRNVI CTPLNEKRYSL >gi|225935358|gb|ACGA01000034.1| GENE 90 123780 - 124769 638 329 aa, chain - ## HITS:1 COG:TVN0547 KEGG:ns NR:ns ## COG: TVN0547 COG0463 # Protein_GI_number: 13541378 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma volcanium # 4 236 7 221 250 80 29.0 5e-15 MRYSVIIPVYNRPDEVDELLQSLTVQHFKGFEVVVVEDGSSIPCKGVVDRYADRLNIKYF SKPNSGPGQTRNYGAERSEGEYLIILDSDVILPEGYFDAVEKELTTSPADAFGGPDRAHD SFTDIQKAINYSMTSFFTTGGIRGGKKKMDKFYPRSFNMGVRRAVYEALGGFSKMRFGED IDFSIRIFKNGYTCRLFPDAWVYHKRRTDLKKFFKQVHNSGIARINLYKKYPDSLKLVHL LPAVFTLGVALLLLGTPFCLFSFTPIILYALLVCIDSTIQNKSLSIGVYSIAAAFIQLIG YGTGFWRAWWQRCIRGKDEFEAFQKNFYK Prediction of potential genes in microbial genomes Time: Fri May 13 08:34:49 2011 Seq name: gi|225935357|gb|ACGA01000035.1| Bacteroides sp. 
D2 cont1.35, whole genome shotgun sequence Length of sequence - 285623 bp Number of predicted genes - 193, with homology - 188 Number of transcription units - 83, operones - 44 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 - CDS 29 - 1228 1416 ## COG0282 Acetate kinase 2 1 Op 2 . - CDS 1268 - 2287 1207 ## COG0280 Phosphotransacetylase - Prom 2495 - 2554 7.0 + Prom 2272 - 2331 7.8 3 2 Op 1 23/0.000 + CDS 2484 - 2927 340 ## COG1380 Putative effector of murein hydrolase LrgA 4 2 Op 2 . + CDS 2924 - 3619 588 ## COG1346 Putative effector of murein hydrolase 5 2 Op 3 . + CDS 3676 - 4593 1006 ## COG4866 Uncharacterized conserved protein 6 2 Op 4 . + CDS 4607 - 5623 626 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases - Term 5511 - 5538 -0.1 7 3 Op 1 . - CDS 5706 - 6839 1046 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 8 3 Op 2 . - CDS 6844 - 8139 802 ## BT_3686 hypothetical protein 9 3 Op 3 . - CDS 8149 - 9195 917 ## BT_3685 hypothetical protein - Prom 9234 - 9293 7.3 10 4 Op 1 . - CDS 9412 - 11145 1154 ## BT_3676 hypothetical protein 11 4 Op 2 . - CDS 11195 - 13195 1161 ## COG4289 Uncharacterized protein conserved in bacteria - Prom 13259 - 13318 5.1 + Prom 13163 - 13222 8.0 12 5 Op 1 . + CDS 13462 - 15993 1262 ## BT_3660 transcriptional regulator 13 5 Op 2 . + CDS 16076 - 18556 2007 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 18615 - 18654 6.2 - Term 19086 - 19133 -0.9 14 6 Tu 1 . - CDS 19168 - 20313 646 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 20334 - 20393 10.3 15 7 Op 1 . - CDS 20410 - 22380 1765 ## BT_3661 alpha-glucosidase 16 7 Op 2 . - CDS 22416 - 24881 2092 ## COG3534 Alpha-L-arabinofuranosidase - Prom 25040 - 25099 8.2 - Term 25146 - 25190 5.1 17 8 Op 1 . - CDS 25202 - 26176 585 ## BT_3655 arabinosidase 18 8 Op 2 . 
- CDS 26225 - 28561 2474 ## COG1874 Beta-galactosidase - Prom 28581 - 28640 6.3 - Term 28678 - 28745 19.3 19 9 Op 1 1/0.000 - CDS 28767 - 29531 536 ## COG1624 Uncharacterized conserved protein 20 9 Op 2 . - CDS 29543 - 30409 875 ## COG0294 Dihydropteroate synthase and related enzymes - Prom 30438 - 30497 5.4 + Prom 30327 - 30386 4.5 21 10 Op 1 . + CDS 30527 - 32542 1329 ## COG0642 Signal transduction histidine kinase 22 10 Op 2 . + CDS 32620 - 33915 1208 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase + Term 33951 - 33994 5.5 - Term 33939 - 33982 5.5 23 11 Tu 1 . - CDS 33997 - 34389 380 ## BT_3643 hypothetical protein - Prom 34452 - 34511 6.5 + Prom 34410 - 34469 7.6 24 12 Op 1 . + CDS 34493 - 35863 1079 ## COG0733 Na+-dependent transporters of the SNF family 25 12 Op 2 . + CDS 35863 - 36780 339 ## COG1555 DNA uptake protein and related DNA-binding proteins - Term 36877 - 36939 6.5 26 13 Op 1 . - CDS 36963 - 37619 372 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 27 13 Op 2 . - CDS 37627 - 38343 425 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 28 13 Op 3 . - CDS 38343 - 40478 1987 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 40705 - 40764 6.7 + Prom 40604 - 40663 7.2 29 14 Tu 1 . + CDS 40731 - 41738 1210 ## COG0136 Aspartate-semialdehyde dehydrogenase + Term 41773 - 41832 9.2 + Prom 41792 - 41851 5.9 30 15 Op 1 . + CDS 41925 - 44813 1416 ## BT_3633 hypothetical protein 31 15 Op 2 . + CDS 44818 - 46050 725 ## BT_3632 hypothetical protein 32 15 Op 3 . + CDS 46043 - 47551 619 ## BT_3631 hypothetical protein 33 15 Op 4 . + CDS 47591 - 48508 641 ## BT_3630 hypothetical protein + Term 48519 - 48576 8.4 + Prom 48581 - 48640 4.8 34 16 Op 1 . + CDS 48690 - 49277 601 ## BT_3629 hypothetical protein 35 16 Op 2 . + CDS 49336 - 49830 304 ## BT_3628 hypothetical protein 36 16 Op 3 . 
+ CDS 49853 - 50740 227 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 37 16 Op 4 . + CDS 50765 - 51433 386 ## BT_3626 hypothetical protein 38 16 Op 5 . + CDS 51476 - 52510 1016 ## BT_3625 hypothetical protein + Term 52651 - 52688 -0.8 + Prom 52582 - 52641 2.9 39 17 Op 1 . + CDS 52772 - 54046 750 ## BT_3624 hypothetical protein 40 17 Op 2 . + CDS 54138 - 55610 1133 ## COG0591 Na+/proline symporter 41 17 Op 3 . + CDS 55624 - 58074 1499 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 42 17 Op 4 . + CDS 58087 - 59457 850 ## COG2385 Sporulation protein and related proteins 43 17 Op 5 . + CDS 59474 - 60757 1028 ## COG0477 Permeases of the major facilitator superfamily 44 17 Op 6 . + CDS 60796 - 61911 883 ## COG4299 Uncharacterized conserved protein 45 17 Op 7 . + CDS 61908 - 62312 123 ## BT_3618 hypothetical protein 46 17 Op 8 . + CDS 62192 - 62758 421 ## BT_3618 hypothetical protein 47 17 Op 9 . + CDS 62790 - 63605 795 ## COG2103 Predicted sugar phosphate isomerase + Prom 63697 - 63756 4.7 48 18 Tu 1 . + CDS 63871 - 66408 2207 ## BVU_0030 hypothetical protein + Term 66570 - 66603 -0.6 + Prom 67116 - 67175 6.8 49 19 Tu 1 . + CDS 67396 - 69423 1660 ## PRU_2308 putative glycosyl hydrolase - Term 69753 - 69808 11.2 50 20 Op 1 . - CDS 69876 - 72497 1455 ## COG3250 Beta-galactosidase/beta-glucuronidase 51 20 Op 2 . - CDS 72508 - 73407 694 ## COG0627 Predicted esterase 52 20 Op 3 . - CDS 73463 - 75043 904 ## COG3507 Beta-xylosidase - Prom 75096 - 75155 5.5 - Term 75093 - 75142 -0.7 53 21 Tu 1 . - CDS 75160 - 77775 1822 ## CJA_3286 endo-beta-galactosidase, putative, ebg98A (EC:3.2.1.-) - Prom 77923 - 77982 5.6 54 22 Op 1 . - CDS 78005 - 80236 1660 ## PRU_2739 endo-1,4-beta-xylanase (EC:3.2.1.8) 55 22 Op 2 . - CDS 80273 - 81307 960 ## Slin_2105 hypothetical protein 56 22 Op 3 . - CDS 81335 - 83104 1611 ## ZPR_0751 hypothetical protein 57 22 Op 4 . - CDS 83129 - 85972 2465 ## Slin_2103 TonB-dependent receptor plug 58 22 Op 5 . 
- CDS 86005 - 87876 1372 ## Slin_2102 RagB/SusD domain protein 59 22 Op 6 . - CDS 87893 - 91051 2897 ## Slin_2101 TonB-dependent receptor plug - Prom 91222 - 91281 4.0 - Term 91791 - 91821 2.0 60 23 Tu 1 . - CDS 91913 - 95893 2562 ## COG0642 Signal transduction histidine kinase - Prom 95913 - 95972 8.7 61 24 Op 1 . + CDS 95984 - 96292 129 ## 62 24 Op 2 . + CDS 96380 - 96829 327 ## gi|260172240|ref|ZP_05758652.1| hypothetical protein BacD2_10282 63 24 Op 3 . + CDS 96849 - 99386 1718 ## BVU_0030 hypothetical protein + Prom 99391 - 99450 2.0 64 25 Op 1 . + CDS 99470 - 103534 2773 ## COG3534 Alpha-L-arabinofuranosidase 65 25 Op 2 . + CDS 103583 - 104977 1068 ## COG5498 Predicted glycosyl hydrolase 66 25 Op 3 . + CDS 104983 - 106935 1005 ## Fjoh_3873 hypothetical protein 67 25 Op 4 . + CDS 106960 - 109395 1627 ## BVU_2979 glycoside hydrolase family protein + Term 109583 - 109619 5.8 - Term 109671 - 109720 3.2 68 26 Tu 1 . - CDS 109728 - 112313 2163 ## COG1472 Beta-glucosidase-related glycosidases - Prom 112489 - 112548 7.9 + Prom 112479 - 112538 10.2 69 27 Tu 1 . + CDS 112560 - 113765 768 ## COG1373 Predicted ATPase (AAA+ superfamily) 70 28 Op 1 . - CDS 113874 - 114890 1044 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 71 28 Op 2 . - CDS 114915 - 116207 1039 ## COG0738 Fucose permease 72 28 Op 3 . - CDS 116246 - 117169 753 ## BT_3615 hypothetical protein 73 28 Op 4 . - CDS 117185 - 118117 948 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 118235 - 118294 3.1 + Prom 118198 - 118257 2.1 74 29 Tu 1 . + CDS 118297 - 119331 780 ## COG1609 Transcriptional regulators + Term 119435 - 119481 2.9 - Term 119423 - 119469 6.7 75 30 Op 1 . - CDS 119496 - 120056 484 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 76 30 Op 2 . - CDS 120068 - 121609 1699 ## COG0423 Glycyl-tRNA synthetase (class II) 77 31 Op 1 . 
- CDS 121713 - 122981 653 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 123001 - 123060 4.8 78 31 Op 2 . - CDS 123063 - 123560 357 ## BT_3610 hypothetical protein - Prom 123716 - 123775 5.6 + Prom 123511 - 123570 4.9 79 32 Tu 1 . + CDS 123794 - 124939 613 ## COG1609 Transcriptional regulators + Prom 124963 - 125022 4.4 80 33 Tu 1 . + CDS 125048 - 126235 775 ## BT_3608 hypothetical protein + Prom 126268 - 126327 9.2 81 34 Op 1 . + CDS 126404 - 129553 2210 ## BT_0452 hypothetical protein 82 34 Op 2 . + CDS 129566 - 131200 1220 ## BT_0451 hypothetical protein 83 34 Op 3 . + CDS 131220 - 132917 1132 ## BT_0450 hypothetical protein 84 34 Op 4 . + CDS 132959 - 134527 956 ## Thit_2257 copper amine oxidase domain protein 85 34 Op 5 . + CDS 134551 - 136755 1566 ## COG1472 Beta-glucosidase-related glycosidases 86 34 Op 6 . + CDS 136762 - 138078 1079 ## COG0738 Fucose permease 87 34 Op 7 12/0.000 + CDS 138096 - 139283 910 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 88 34 Op 8 . + CDS 139264 - 140055 531 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 89 34 Op 9 . + CDS 140062 - 141066 488 ## BT_3586 putative dehydrogenase 90 34 Op 10 . + CDS 141071 - 142480 936 ## BT_3585 putative oxidoreductase 91 34 Op 11 . + CDS 142532 - 143302 567 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 92 34 Op 12 . + CDS 143304 - 144629 897 ## COG0673 Predicted dehydrogenases and related proteins 93 34 Op 13 . + CDS 144633 - 144872 138 ## BT_3582 hypothetical protein + Prom 144877 - 144936 3.8 94 34 Op 14 . + CDS 144960 - 146303 955 ## COG0673 Predicted dehydrogenases and related proteins + Term 146327 - 146366 4.5 - Term 146311 - 146358 1.9 95 35 Tu 1 . - CDS 146403 - 146990 463 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 147131 - 147190 7.1 + Prom 147012 - 147071 5.7 96 36 Op 1 . + CDS 147292 - 150636 2263 ## BT_4606 hypothetical protein 97 36 Op 2 . 
+ CDS 150649 - 153087 2023 ## BT_4606 hypothetical protein 98 36 Op 3 . + CDS 153100 - 153981 738 ## BT_4606 hypothetical protein 99 36 Op 4 . + CDS 153995 - 154933 769 ## COG0584 Glycerophosphoryl diester phosphodiesterase 100 36 Op 5 . + CDS 154973 - 156055 547 ## Phep_1387 hypothetical protein + Prom 156078 - 156137 2.9 101 37 Op 1 4/0.000 + CDS 156179 - 157165 576 ## COG3712 Fe2+-dicitrate sensor, membrane component 102 37 Op 2 . + CDS 157189 - 160731 2420 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 103 37 Op 3 . + CDS 160754 - 162505 1557 ## Slin_4978 RagB/SusD domain protein + Term 162532 - 162579 6.7 - Term 162307 - 162346 -0.9 104 38 Tu 1 . - CDS 162450 - 162608 66 ## - Prom 162727 - 162786 3.5 105 39 Tu 1 . + CDS 162607 - 163935 731 ## COG2271 Sugar phosphate permease + Term 163996 - 164052 4.2 + Prom 163949 - 164008 5.9 106 40 Op 1 . + CDS 164071 - 166716 2354 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 107 40 Op 2 . + CDS 166716 - 167576 544 ## BT_3578 hypothetical protein 108 40 Op 3 . + CDS 167644 - 168657 743 ## BT_3577 hypothetical protein + Term 168826 - 168868 9.6 + TRNA 168731 - 168815 76.5 # Leu TAG 0 0 + Prom 168740 - 168799 80.4 109 41 Op 1 . + CDS 168871 - 169761 790 ## COG0524 Sugar kinases, ribokinase family + Term 169836 - 169882 11.2 + Prom 169879 - 169938 5.9 110 41 Op 2 . + CDS 170012 - 173077 2952 ## BT_3569 hypothetical protein + Term 173084 - 173127 1.4 111 41 Op 3 . + CDS 173142 - 174674 1618 ## BT_3568 hypothetical protein + Term 174679 - 174732 2.3 + Prom 174737 - 174796 4.8 112 42 Op 1 . + CDS 174817 - 177126 2386 ## COG1472 Beta-glucosidase-related glycosidases 113 42 Op 2 . + CDS 177137 - 178543 1274 ## COG5368 Uncharacterized protein conserved in bacteria + Term 178738 - 178781 -0.1 114 43 Tu 1 . - CDS 178571 - 181129 1680 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 181258 - 181317 4.7 115 44 Tu 1 . 
+ CDS 181149 - 181310 75 ## gi|295086358|emb|CBK67881.1| hypothetical protein 116 45 Tu 1 . - CDS 181327 - 181803 214 ## BT_3564 hypothetical protein - Prom 182012 - 182071 6.0 + Prom 182725 - 182784 6.4 117 46 Tu 1 . + CDS 182953 - 184986 1075 ## COG5545 Predicted P-loop ATPase and inactivated derivatives + Term 185023 - 185064 7.3 - Term 185007 - 185055 8.3 118 47 Tu 1 . - CDS 185185 - 186039 622 ## COG1864 DNA/RNA endonuclease G, NUC1 - Prom 186059 - 186118 2.8 - Term 186053 - 186103 11.1 119 48 Op 1 . - CDS 186120 - 188078 1680 ## COG4085 Predicted RNA-binding protein, contains TRAM domain 120 48 Op 2 . - CDS 188099 - 188941 729 ## BT_3561 hypothetical protein 121 48 Op 3 . - CDS 188989 - 191517 2298 ## BT_3560 hypothetical protein - Prom 191590 - 191649 5.1 + Prom 191480 - 191539 6.6 122 49 Op 1 . + CDS 191612 - 191836 117 ## 123 49 Op 2 . + CDS 191718 - 192743 930 ## BT_3559 hypothetical protein 124 49 Op 3 . + CDS 192740 - 193885 665 ## COG1864 DNA/RNA endonuclease G, NUC1 + Term 193902 - 193955 7.1 - Term 193891 - 193940 4.2 125 50 Op 1 . - CDS 194042 - 195340 1036 ## COG3174 Predicted membrane protein 126 50 Op 2 . - CDS 195394 - 195861 484 ## COG2954 Uncharacterized protein conserved in bacteria - Prom 195889 - 195948 1.7 127 51 Tu 1 . - CDS 195981 - 197006 766 ## BT_3553 hypothetical protein - Prom 197125 - 197184 9.1 128 52 Op 1 . - CDS 197269 - 197403 138 ## - Prom 197426 - 197485 2.9 129 52 Op 2 . - CDS 197488 - 198447 1066 ## COG1186 Protein chain release factor B - Prom 198624 - 198683 5.3 + Prom 198564 - 198623 4.7 130 53 Tu 1 . + CDS 198704 - 201622 3246 ## COG0612 Predicted Zn-dependent peptidases + Term 201681 - 201729 14.1 - Term 201661 - 201724 15.9 131 54 Tu 1 . - CDS 201810 - 203624 1563 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 203697 - 203756 4.3 + Prom 203644 - 203703 4.5 132 55 Tu 1 . 
+ CDS 203734 - 204804 786 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Term 204797 - 204865 13.2 133 56 Op 1 . - CDS 204888 - 205310 371 ## COG0526 Thiol-disulfide isomerase and thioredoxins 134 56 Op 2 . - CDS 205326 - 206069 430 ## gi|260172309|ref|ZP_05758721.1| hypothetical protein BacD2_10627 - Prom 206089 - 206148 4.1 135 56 Op 3 . - CDS 206169 - 207188 833 ## BT_3536 hypothetical protein - Prom 207353 - 207412 7.9 - Term 207243 - 207286 -0.3 136 57 Tu 1 . - CDS 207466 - 207756 239 ## BT_3546 glutaminase - Prom 207815 - 207874 1.6 137 58 Tu 1 . - CDS 207976 - 208182 84 ## - Prom 208356 - 208415 7.9 138 59 Op 1 . - CDS 208724 - 209239 141 ## BDI_2696 hypothetical protein 139 59 Op 2 . - CDS 209226 - 210335 514 ## BT_3533 hypothetical protein 140 60 Tu 1 . - CDS 210976 - 211596 275 ## HMPREF0868_1350 antioxidant, AhpC/TSA family (EC:1.11.1.15) - Term 211745 - 211788 -0.8 141 61 Tu 1 . - CDS 211961 - 212725 458 ## gi|260172316|ref|ZP_05758728.1| hypothetical protein BacD2_10672 142 62 Op 1 . - CDS 213121 - 213555 121 ## Phep_1402 protein of unknown function DUF1573 - Prom 213581 - 213640 4.8 143 62 Op 2 . - CDS 213644 - 213862 204 ## gi|260172318|ref|ZP_05758730.1| hypothetical protein BacD2_10682 144 62 Op 3 . - CDS 213895 - 216783 2172 ## BT_3546 glutaminase 145 62 Op 4 . - CDS 216807 - 218378 818 ## BT_3545 hypothetical protein 146 62 Op 5 . - CDS 218392 - 218748 240 ## gi|260172321|ref|ZP_05758733.1| hypothetical protein BacD2_10697 147 62 Op 6 . - CDS 218758 - 219060 215 ## gi|260172322|ref|ZP_05758734.1| hypothetical protein BacD2_10702 148 62 Op 7 . - CDS 219113 - 219385 145 ## gi|260172323|ref|ZP_05758735.1| hypothetical protein BacD2_10707 149 63 Tu 1 . - CDS 219529 - 219795 227 ## gi|260172324|ref|ZP_05758736.1| hypothetical protein BacD2_10712 - Prom 219849 - 219908 7.5 150 64 Tu 1 . 
- CDS 220142 - 220444 147 ## gi|260172326|ref|ZP_05758738.1| hypothetical protein BacD2_10722 - Prom 220464 - 220523 6.1 151 65 Op 1 . - CDS 220581 - 220778 130 ## gi|260172327|ref|ZP_05758739.1| hypothetical protein BacD2_10727 152 65 Op 2 . - CDS 220807 - 221520 364 ## BT_3540 hypothetical protein 153 65 Op 3 . - CDS 221582 - 222076 343 ## gi|260172329|ref|ZP_05758741.1| hypothetical protein BacD2_10737 - Prom 222281 - 222340 4.2 154 66 Op 1 . - CDS 222393 - 222803 284 ## gi|260172330|ref|ZP_05758742.1| hypothetical protein BacD2_10742 155 66 Op 2 . - CDS 222846 - 224870 1452 ## BT_3542 hypothetical protein - Prom 224890 - 224949 7.4 156 67 Tu 1 . - CDS 225056 - 225517 184 ## BT_3543 hypothetical protein - Prom 225702 - 225761 6.8 - Term 225767 - 225810 1.1 157 68 Tu 1 . - CDS 225823 - 226308 336 ## gi|260172333|ref|ZP_05758745.1| hypothetical protein BacD2_10757 - Prom 226381 - 226440 6.4 + Prom 226633 - 226692 9.7 158 69 Op 1 . + CDS 226726 - 227445 508 ## BT_3540 hypothetical protein 159 69 Op 2 . + CDS 227448 - 227726 195 ## BT_3539 hypothetical protein 160 70 Tu 1 . - CDS 227741 - 228538 222 ## BT_3538 hypothetical protein - Prom 228563 - 228622 3.1 - Term 229612 - 229654 3.0 161 71 Op 1 . - CDS 229836 - 230924 659 ## BF0670 putative transmembrane acyltransferase protein 162 71 Op 2 . - CDS 230950 - 233004 2013 ## COG3533 Uncharacterized protein conserved in bacteria 163 71 Op 3 . - CDS 233029 - 234177 891 ## COG3274 Uncharacterized protein conserved in bacteria 164 71 Op 4 . - CDS 234185 - 235315 1187 ## COG2017 Galactose mutarotase and related enzymes - Prom 235373 - 235432 4.7 165 72 Op 1 1/0.000 - CDS 235445 - 236908 1561 ## COG3538 Uncharacterized conserved protein 166 72 Op 2 . - CDS 236911 - 239202 2159 ## COG3537 Putative alpha-1,2-mannosidase 167 72 Op 3 . - CDS 239215 - 241692 2449 ## BT_3526 glutaminase 168 72 Op 4 . - CDS 241724 - 243895 1659 ## BT_3525 hypothetical protein 169 72 Op 5 . 
- CDS 243924 - 245183 1170 ## COG4833 Predicted glycosyl hydrolase - Prom 245216 - 245275 5.6 - Term 245327 - 245375 10.6 170 73 Op 1 . - CDS 245415 - 246881 1297 ## BT_3523 hypothetical protein 171 73 Op 2 . - CDS 246904 - 248046 1084 ## BT_3522 hypothetical protein 172 73 Op 3 . - CDS 248072 - 249391 1347 ## BT_3521 alpha-1,6-mannanase 173 73 Op 4 . - CDS 249418 - 251337 1606 ## BT_3520 hypothetical protein 174 73 Op 5 . - CDS 251349 - 254789 3369 ## BT_3519 hypothetical protein - Prom 254848 - 254907 3.5 175 74 Tu 1 . - CDS 254939 - 256138 1020 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 256173 - 256232 3.6 176 75 Tu 1 . - CDS 256255 - 256827 310 ## BT_3517 RNA polymerase ECF-type sigma factor - Prom 256851 - 256910 8.4 177 76 Tu 1 . - CDS 256933 - 257922 936 ## COG3507 Beta-xylosidase - Prom 257977 - 258036 4.8 + Prom 257875 - 257934 4.8 178 77 Tu 1 . + CDS 258088 - 259197 409 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase + Term 259294 - 259337 5.2 + Prom 259329 - 259388 8.1 179 78 Tu 1 . + CDS 259408 - 261231 1849 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 261267 - 261313 5.2 - Term 261255 - 261299 6.4 180 79 Tu 1 . - CDS 261331 - 262851 1184 ## COG3119 Arylsulfatase A and related enzymes - Prom 262876 - 262935 1.9 - Term 263259 - 263309 11.1 181 80 Op 1 . - CDS 263414 - 264793 1215 ## COG3119 Arylsulfatase A and related enzymes 182 80 Op 2 . - CDS 264828 - 266204 1343 ## COG3119 Arylsulfatase A and related enzymes 183 80 Op 3 . - CDS 266259 - 267629 1153 ## COG3119 Arylsulfatase A and related enzymes 184 80 Op 4 . - CDS 267648 - 269240 1306 ## COG3119 Arylsulfatase A and related enzymes - Prom 269269 - 269328 5.0 + Prom 269320 - 269379 3.4 185 81 Op 1 . + CDS 269501 - 271291 1039 ## COG3250 Beta-galactosidase/beta-glucuronidase 186 81 Op 2 . + CDS 271298 - 273859 1691 ## BT_3508 hypothetical protein + Term 273902 - 273962 5.2 - Term 273890 - 273948 4.3 187 82 Op 1 . 
- CDS 274008 - 275693 1507 ## COG3119 Arylsulfatase A and related enzymes 188 82 Op 2 . - CDS 275731 - 277083 1065 ## COG3119 Arylsulfatase A and related enzymes - Prom 277221 - 277280 3.9 - Term 277237 - 277287 10.7 189 83 Op 1 . - CDS 277303 - 278322 915 ## BT_3473 hypothetical protein 190 83 Op 2 . - CDS 278343 - 280391 1787 ## BT_3474 hypothetical protein 191 83 Op 3 . - CDS 280407 - 283496 2680 ## BT_3483 hypothetical protein 192 83 Op 4 . - CDS 283509 - 285008 1001 ## BT_3476 hypothetical protein 193 83 Op 5 . - CDS 285026 - 285550 398 ## BT_3477 glutaminase A Predicted protein(s) >gi|225935357|gb|ACGA01000035.1| GENE 1 29 - 1228 1416 399 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 399 1 400 403 462 57.0 1e-130 MKILVLNCGSSSIKYKLFDMTTKEVIAQGGIEKIGLKGSFLKLTLPNGEKKILEKDIPEH TVGVEFILNTLINPEYGAIKSLDEINAVGHRMVHGGERFSESVLLNKEVLEAFAACNDLA PLHNPANLKGVNAVSAILPNIPQVGVFDTAFHQTMPDYAYMYAIPHELYEKYGVRRYGFH GTSHRYVSKRVCEFLGVNLVGQKIITCHIGNGGSIAAIKDGKCMDTTMGLTPLEGLMMGT RSGDIDAGAVTFIMEKEGLNTTGVSNLLNKKSGVLGISGVSSDMRELLAACAAGNEKAIL AEKMYYYRIKKYIGAYAAALGGVDIILFTGGVGENQMECRREVCKDMEFMGIELDNEVNA KVRGEEAIISTPASKVKVVVIPTDEELLIASDTMDILNK >gi|225935357|gb|ACGA01000035.1| GENE 2 1268 - 2287 1207 339 aa, chain - ## HITS:1 COG:CAC1742 KEGG:ns NR:ns ## COG: CAC1742 COG0280 # Protein_GI_number: 15895019 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 2 332 1 329 333 316 52.0 4e-86 MLNLINQIVARAKANRQRIVLPEGTEERTLKAANMILTDEVADLILLGKPDEINELATKW GLGNIGKATIIDPETSPKHEEYAQLLCELRKKKGMTIEEARKLTSDPLFYGCLMIKSGDA DGQLAGARNTTGNVLRPALQIIKTAPGITCVSGAMLLLTHAPEYGKNGILVMGDVAVTPV PDANQLAQIAICTAQTAKAVAGIENPKVALLSFSTKGSAKHEVVDKVVEATKIAKEMAPT LDLDGEMQADAALVPEVGASKAPGSPVAGEANVLIVPSLEVGNISYKLVQRLGHADAVGP ILQGIACPVNDLSRGCSIEDVYRMIAITANQAIAAKAKK >gi|225935357|gb|ACGA01000035.1| 
GENE 3 2484 - 2927 340 147 aa, chain + ## HITS:1 COG:NMA0437 KEGG:ns NR:ns ## COG: NMA0437 COG1380 # Protein_GI_number: 15793442 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Neisseria meningitidis Z2491 # 1 109 3 111 114 97 49.0 1e-20 MIRQCAILFGCLALGELIVYLTGIKLPSSIIGMLLLTLFLKLGWIKLHWVQGLSDFLVAN LGFFFVPPGVALMLYFDVIAAEFWPIVIATIVSTALVLVVTGWVHQIVRKFRLARQIKLA RKLHLSDFHLPEKLHLKDKINLTNKDK >gi|225935357|gb|ACGA01000035.1| GENE 4 2924 - 3619 588 231 aa, chain + ## HITS:1 COG:NMB2004 KEGG:ns NR:ns ## COG: NMB2004 COG1346 # Protein_GI_number: 15677832 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Neisseria meningitidis MC58 # 10 229 11 229 230 177 45.0 2e-44 MSFLENNFFLLAITFGIFFFAKLLQKKTGLVLLNPILLTIALLIIFLKMTNISYETYNKG GHLIEFWLRPAVVALGVPLYLQLEMIKKQLLPILLSQLAGCIVGVISVVLIAKFMGASQE VILSLAPKSVTTPIAMEVTKAIGGIPSLTAAVVVAVGLLGAICGFKTMKIMRVGSPIAQG LSMGTAAHAVGTSTAMDISSKYGAYASLGLTLNGIFTALLTPTILRLLGIL >gi|225935357|gb|ACGA01000035.1| GENE 5 3676 - 4593 1006 305 aa, chain + ## HITS:1 COG:jhp0277 KEGG:ns NR:ns ## COG: jhp0277 COG4866 # Protein_GI_number: 15611347 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 4 293 2 286 290 164 36.0 3e-40 MIPFKDVTLADRDTITSFTMKSDRRNCDLSFSNLCSWRFLYDTQFAVVDNFLVFKFWAGD QLAYMMPVGTGDLKAILGELIEDARKENQHFCMLGVCSNMRADLEAILPGQFTFTEDRDY ADYIYLRSDLSTLKGKKFQAKRNHINRFRNTYPDYEYTPITPDRIQECLDLEAEWCKVNH CDQQEGTGNERRALIYALHNFEALGLTGGILHVNGKIVAFTFGMPINHETFGVHVEKADT SIEGAYAMINYEFANRIPEQYIYINREEDLGLEGLRKAKLSYQPVTILEKYMACLKEHPM NMVKW >gi|225935357|gb|ACGA01000035.1| GENE 6 4607 - 5623 626 338 aa, chain + ## HITS:1 COG:BH1812 KEGG:ns NR:ns ## COG: BH1812 COG4552 # Protein_GI_number: 15614375 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Bacillus 
halodurans # 42 302 50 323 386 70 24.0 3e-12 MMIKEQVKALWKVCFDDSEEFVEMYFRLRYKTEVNVAIKSGDEVISALQMLPYPMTFGGE TVQTSYISGACTHPDFRSKGVMRELLSQSFARMLRNGVHFSTLIPAEPWLFDYYARMGYA SVFKYSTKEIVLPEFIPSKEIAVSVVSEFQEEVYSYLNKKLSERACCIQHTPEDFQVIMT DLAISGGYLFVARQENEIKGITIVYKRDKHIIINELCADDKDVEYSLLYAIRKHTGCKRM VQLLPPEDKQPQHPLGMARIINAKEVLQIYAAAFPEDEMQLELSDKQLSVNNGYYYLCKG KCMYNTERLPGAHIQMNISELTNRILQPLKPYMSLMLN >gi|225935357|gb|ACGA01000035.1| GENE 7 5706 - 6839 1046 377 aa, chain - ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 56 373 47 350 352 162 32.0 1e-39 MKKLYATLFSALFLGGAICASCTDKKDASAEEVINTIHKVNNYWQTNHPEHGRSFWDNAA YHTGNMEAYFLTKQPEYLEYSKAWAEHNEWKGAKSDHKENWKYSYGESDDYVLFGDYQIC FQTYADLYNLEPDTQKIARAREVMEYQVSTPNHDYWWWADGLYMVMPVMTKMYNITKNPL YLEKLHEYLAYADSIMYDEEAGLYYRDGKYVYPKHKSVNGKKDFWARGDGWVLAGLAKVL KELPETDKYRPEYIDRFRTLAKSVAACQQPEGYWTRSMLDPLHAPGPETSGTAFFTYGLQ WGINNGFLDAAEYQPVVEKAWKYLSTVALQPDGKIGYVQPIGEKAIPGQVVDANSTSNFG VGAFLLAACERVRYLNK >gi|225935357|gb|ACGA01000035.1| GENE 8 6844 - 8139 802 431 aa, chain - ## HITS:1 COG:no KEGG:BT_3686 NR:ns ## KEGG: BT_3686 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 1 431 431 773 85.0 0 MKRILLLLCGIMLVPVIACSQHLVEVGKGYSCTSVNTTVFRNNSLVTHGDEQYISYYDND GYLVLGKRKLDSEQWTLHRTQYQGNVKDAHNIISMMIDGEGYIHVSFDHHGHPLNYCRSI APGSLELGDKMPMTGVDEANVTYPEFYPLSGGDLLFVYRSGSSGRGNLVMNRYSLKEHKW TRVQDILIDGENKRNAYWQLYVDEKGTIHLSWVWRETWHVETNHDICYARSFDNGVTWYK TSGERYELPIKLSNAEYACRLPQNCELINQTSMSADAGGNPYIATYWREPNSDVPQYRIV WNDGKMWHQRQITDRQTPFTLKGGGTKMIPIARPRIVVEGGEVFYIFRDEERGSRVSMAH ATDVGTSKWTITDLTDFSVDAWEPSHDTELWKKQRKLHLFVQHTRQGDGERTAEIEPQMI YVLETNMDTNK >gi|225935357|gb|ACGA01000035.1| GENE 9 8149 - 9195 917 348 aa, chain - ## HITS:1 COG:no KEGG:BT_3685 NR:ns ## KEGG: BT_3685 
# Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 346 1 338 355 646 90.0 0 MNMNRHCGIIKFILLGVVCISLLPVSVHAQKYTEFIPGEVWKDTDGNPINAHGGGLLYHN GTYYWYGEYKKGKTILPDWATWECYRTDVTGVGCYSSKDLLNWKFEGIVLPAVKDDPNHD LHPSKVLERPKVVYNKKTGKFVMWAHVESADYSKACAGVAVSDSPVGPFVYQGSFRPNNA MSRDQTVFVDDDGRAYQFYSSENNETMYISLLTDDYLKPSGSFTRNFVKESREAPAVFKY NGKYYMLSSGCTGWDPNIAEIAVADSIMGTWKTIGNPCTGPDADKTFYAQSTYVQPVVGK KNAYIAMFDRWKKKDLEDSRYVWLPVLVKDGKITIPWHEKWNLSIFDK >gi|225935357|gb|ACGA01000035.1| GENE 10 9412 - 11145 1154 577 aa, chain - ## HITS:1 COG:no KEGG:BT_3676 NR:ns ## KEGG: BT_3676 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 575 1 584 586 965 77.0 0 MKEKIVWIFVVLYLCSFHLNAQNVENNPTGGGYQVAAEGAWCWFADPRALHYENKEGTIN KTYIGYIDIHGNIKAMQYDFQKKEQEEVLIRSYFQPDDHNNPTFLVLPDERIMIFYSRHT DEPCFYYRVSRIPGDITTLGEEKRIETKNNTTYPSPFILSDDPEHIYLCWRGIGWHPTIA KLSLPDDKDEVDVVWGPYQIVQSTGARPYAKYMSNGKDKIYLTYTTGHPDNELPNFLYFN YIDIKTLQLTDVRGTVLSTIADGTFKVNKTGDYAKQYPSTVVDNPSERDWVWQIASDRNG NPVIVMVRISDNKESHDYYYAKWNGHEWKKTFLINAGGHFHQTPNLEKCYSAGMAIDPSN VNEVYCSLPVEGKYGKVYEIVRFIMSEDGEVISKEAVTKDSQLNNVRPYMIPASEGTPLR LTWMYGNYYDWIVSLQHPQGYSTGIACDFKGFPDRKKKKMIAASGKEIRFNPEKPFVLEQ TITLGADNYQGCLLQLGDLEYYLNGETMKPEIRYKGKVYTSTNVLGTSDCWKTQARGTGG KWYTPQKYGEVRLRMEYKKGVLCVYINDLLDQRIDFD >gi|225935357|gb|ACGA01000035.1| GENE 11 11195 - 13195 1161 666 aa, chain - ## HITS:1 COG:AGl3401 KEGG:ns NR:ns ## COG: AGl3401 COG4289 # Protein_GI_number: 15891818 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 90 394 67 365 627 122 29.0 2e-27 MKARFALFLLFAILSLKLSVVFAQENIFTVWQPDYQKSPYTGMTRQHWIQAGEYLLKGAF NYIHTLDDQMYFPKQLDKTYPRNTGEIPVAKLEGLARTLFVAAPLLKDNPELEMNGIKVA DYYRYQLINISNPESRSYIPHRTGGPSQTLLELGSLAISMKAAQEVLWNPLTKKQKDSLA ATMLSYGEGPTIGSNWMFFNVFILSFLKDQGYAVNESYLESNLQKLLARYRGEGWYNDAP AYDYYSAWAYQTYGPIWAEMFGKKQYPQYARLFMENQHDMVDNYPFLFSRDGRMNMWGRS 
ICYRFAVTAPLSLYEYDKSGNVNYGWMRRIASSTLLQFLERPEFLEDGVPTMGFYGPFAP AVQIYSCRGSVYWCGKAFLSLLLPENSNYWSATENNGPWEKELKKGEVYNKFQQGTNLLI TNYPNCGGSEMRSWCHETVAKDWQKFRSTENYNKLAYHTEFPWMADGKNGEISMNYGTRN KKGEWEVLRLYTFKSFENGIYRRDAVLETDSTVKYQLADIPLPDGILRIDKVSVSEPTEI CLGHYSLPRLDSDIKETGCKVGKQNIPVLSNGKYELAMIPLTGWEKTYTVYPEGVHPVSE KCALNMVSDQLSGEKIYVTLQLWKKNEKRGFTSKELTPVKSVHVSEDKKQVTVCLSNGEI KTISFE >gi|225935357|gb|ACGA01000035.1| GENE 12 13462 - 15993 1262 843 aa, chain + ## HITS:1 COG:no KEGG:BT_3660 NR:ns ## KEGG: BT_3660 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 843 21 904 904 1208 67.0 0 MTPFIYAGEFLFTTINASQGLSDNQIRYILQLPDGRMVFTTSGNVNLYDGVHFSYLHRTT EYVYPLNQYNGHYRIYQSGDSLLWIKDTHKLMCIDLYREEYIPDLDAYFKNREIQAPVED LFVDDSGHIWLLIANELLELNNKTRITLPGKYSGKLQDLNADSTSLYLFYDTGEVSCYHI ADQKTVYNIAAYPASEQAEYQNTSLVVKSADGFYQLRNGRKGGLFYFNSHKRTWKKLFEQ NYTLNTLIITPNGEKAYISCVHGFWMIDLHTGTQKYIPLLETGNGQIVSTEISTIFQDRQ GGLWLGTFNRGLLYHHPSMHKLTHIGRNAFPVSPEEEINIESFAEDKDGNIYLKAHSRIY RLTANEQKSYVLKSAAIPSVSPEILNRLNPNKNHHFRNKVYNTLYTDTRGWTWAGTPDGL ELFTSENDSAPRIFYRENGLSNNFIQGIIEDKYRDIWVTTSNGVTRIHINPENKNISFTR FNQLDGALDGEYIKDAVFSSSDGTLYLGGIDGFSIFHPDKDSIHPMLPDPPVFTALRLYG EKVNTGKEYGNRIILSKAAPYTTEIELDYNQNFLTFEFSALNYINHERTYYRYQLEGIDS QWMSTFAGRQGNATVGKGLLQAVYTNLPPGDYTFKVMASDNPLQWNGKVTAIKLTIHAPW WKTTTAYILYAVILLLIAFTGSRLYIYWSRKEMERKHKEEILLLRIRNLIEQNNSLTCGI EETSPDHQQQDTEESAFLAQAIKLVEQNLHVNGYSVEQLSRDLCMERTGLYRKLVTMLDQ SPSLFIRNIRLQRAAQLITEGKLSITEIAEHTGFSSSSYLSKCFQEMYGCRPSEYAEKAK KST >gi|225935357|gb|ACGA01000035.1| GENE 13 16076 - 18556 2007 826 aa, chain + ## HITS:1 COG:SSO3022 KEGG:ns NR:ns ## COG: SSO3022 COG1501 # Protein_GI_number: 15899728 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Sulfolobus solfataricus # 43 820 6 723 731 489 36.0 1e-138 MKMNLIRKAGFLLVCATMVAAGNGTAQNVQRTSQGIKCATQGMDVNVEFYSPSIVRVYKT PSQKSCNKESLVIVKTPESTPVSFGEKGKNVTLSSRLIQVEVNPETGGIHFFDSSGQRLL 
TDKDYGTQFTPFNDAGVPSFNVRQAFLLDKDEVIYGLGQQQTGKVNQRNQKLFLRNQNMS ICIPFIHSVKGYALYWDNYSPTTFLDNPQEMSFDSEVGDCADYYFIYGGNADGVIAGVRE LTGQAPLYPLWTLGFWQCRERYKSPDELCEVVDKYRELKVPLDGIIQDWQYWGCNENWNS MKFQNPRYINKMGDPEYMKFLPNGEDRNANYGTPRIKSPKEMIDYVHKQNAHIMISVWAS FGPWTEMYQKMDSLKALLHFETWPPKAGVKPYDPFNPTARDMYWAEMKKNIFDLGMDGWW LDSTEPDHLEIKDKDFDTPTYLGSFRRVHNAFPLMSNKGVYEHQRATTSDKRVFLLTRSS FLGQQRYASHSWSGDVVSTWEVMKKQLAAGLNYSLCGIPYWNTDLGGFFAWKYNNNVHNI AYHELHVRWYQWGAFQPIMRSHNSSPVAVEIYQFGKKGDWAYDALEKYTHLRYRLLPYLY STSWEVTNKAGSIIRPLMMDFPKDKKVLEMDTEYMFGRNFLVRPVTDSLYTWQDDKQNGY QKNMNKIGKTDVYLPAGAQWIDFWTGKSLKGGQTIQREVPIDIMPVYVRAGSILPWGPAV QYSTEKKWDNLTLRIYPGADAEFTLYEDEFDNYNYEKGAYTTITMKWNDKDRTLTINDRQ GNYKGMLKNRKFNIIIVEPGKGCGDGDATTFDQSVSYRGKRVDLKL >gi|225935357|gb|ACGA01000035.1| GENE 14 19168 - 20313 646 381 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 8 372 7 384 390 77 23.0 6e-14 MNDVNNPFLIYAYAGPKYFCDRIEETEHLISALRNGRNVTLMSPRRMGKTGLIQNVFHQI RREYPEAACFYMDIFSTTCLDDFIIQFGQTVIGKLDNLSQKTLAAISGFFKSCRLVFSPD VLTGVPQATLDFQPSQAQATLKEIFDYLEHSGKECYIAIDEFQQITEYPEKGVEGLLRSY IQFLPHVHFIFSGSKQHLMAEMFGSAKRPFYRSTEKMNLASIPLEHYSLFAIRWMQAGGK ELSDELFQMIYQRFNGHTWYIQYVLNRLYEQKEPILNETIFEKCLADIVRSEVDEYQRLY GMLTENQSTLLRAIAREQFVAAINSSIFIKKYGLKGSSSINAALKFLINKEYVFKAEEGY CVYDRFMELWLQTLPYAGILK >gi|225935357|gb|ACGA01000035.1| GENE 15 20410 - 22380 1765 656 aa, chain - ## HITS:1 COG:no KEGG:BT_3661 NR:ns ## KEGG: BT_3661 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 646 1 646 647 1206 86.0 0 MKKIISTLLASCCLTSLIAQEVVVKGPDEKLQLVVSASPAEKPSYSITYNGKTMLEKSPL GMNTNIGDFAKGMKLTGHAVTPIDTVYHQDRIKTSKVHYQANELICNFENPKGQKIDVVF RVSNHDVAFRYTLPRQDGKGSVTVTAEETGFRFPQQTTTFLCPQSDAMIGWKRTKPSYEE EYKADAPMSDRSQYGHGYTFPCLFRIGDDGWVLVSETGVDSRYCGSRLSDVSEGNLYTVA 
FPMAAENNGNGTSAPAFALPGTTPWRTITVGETLKPIVETTVIWDVVRPLYETKHDYRFG RGTWSWILWQDGSINYDDQVRYIDFAAAMGYEYALIDNWWDTNIGRDRMKSLIEYARSKG VELFLWYSSSGYWNDIEQGPVNHMDNAIIRKREMKWLQSLGVKGIKVDFFGGDKQETMRL YEDILSDADDHGLMVIFHGCTIPRGWERMYPNYVGSEAVLASENMVFGQHFCDEEAFNAC LHPFIRNAVGCMEFGGCFLNKRLNRNNDGGTTRRTTDIFQLATTVLFQNPVQNFALAPNN LKDVSPVCMDYMKTVPTTWDETRFIDGYPGKYVVLARRHGDTWYLAAVNAGKEIVKLKLD LEMFAGKTVSLYKDDKKGEPQLVTLKVKESGKVQLEILPQGGVVLVNRSVAESYGK >gi|225935357|gb|ACGA01000035.1| GENE 16 22416 - 24881 2092 821 aa, chain - ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 206 715 51 547 835 413 43.0 1e-115 MRKLFLFLVVLFLSFQQVTLAAIKEMASTPDSVYLFSFATSGDDGRSGLRFAWSTDKENW FEIGRNYGYLRCDYSRWGSQKKMLDPYLKQSSDGEWICTWKLNDHDGYGQATSKDLINWT SQKYPRTTPDFDGVRVKAIVAGEEQKGNINRVAWTLVDGLDKNYGWNQYRNSLHGERPVQ DGERFAGLKPVKATVTVQPEGAKAISDVLLGAFFEDINYSADGGLYAELIQNRDFEYDPS DREGDKNWNSTHSWALKGEKTTFTIDTTDPIHINNPHYAVLNVEQPGAALENTGFDGIAL NTGEKYDFSIFARVPQGKSNKLQVRLVDGKGNVCGETSLTVSSRQWKTYKAVITAKATAD THLEIIPQSVGELNLDMISLFPQNTFKGRKNGLRKDLAQVLADIHPRFIRFPGGCVAHGD GLKNIYQWKNTVGPLEARKAQRNLWGYHQSMGLGYFEYFQFCEDIGAEPLPVLAAGVPCQ NSACHGDLRGGQQGGIPMSEMGAYIQDILDLIEWANGDAKKTKWGKVRAEAGHPKPFNLK YIGIGNEDLITDIFEERFTMIFNAIKEKYPEMIVVGTVGPFNEGTDYVEGWKLADKLGIP MVDEHYYQTPGWFLNNQDFYDKYDRSKKTKVYLGEYATHIPGRKANIETALTEALYLAAL ERNGDVVHMTSYAPLLAKEGHTQWNPDLIYFNNREVKPTTGYYVQKLYGQNAGNEYLPSK ITLDNRDDQVRKRIAASIVRDSASGDVIVKLVNLLPVEVNTNVDLSGIGAIQSSAKRTVL TGKPADTPLPVEDTMEVAEKFDYQLPAYSFTVIRIKKANEK >gi|225935357|gb|ACGA01000035.1| GENE 17 25202 - 26176 585 324 aa, chain - ## HITS:1 COG:no KEGG:BT_3655 NR:ns ## KEGG: BT_3655 # Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 322 1 321 323 568 89.0 1e-161 MMNMKKNIILFAFLFVGVLTGYCQQSAYLFVYFTGNRMSEEAVRMAVSLDGYNYKALNGN QPVLDSRVISSTGGVRDPHILRCEDGKTFYMVVTDMVSGNGWSSNRAMILLKSKDLVHWT 
SNIVNIQKKYPNQEDLKRVWAPQTIYDREAGKYMVYWSMQHGNGPDIIYYAYANKDFTDI EGEPKPLFLPENKKSCIDGDIIYKDGLYHLFYKTEGNGNGIKKATTSSLTSGQWTESDDY KQQTKDPVEGSGIFPLVGSDKYILMYDVYTKGKYQFTESSDLEHFKVIDHAISMDFHPRH GTVIPITPKELKRLFKAYGKPEGF >gi|225935357|gb|ACGA01000035.1| GENE 18 26225 - 28561 2474 778 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 29 608 28 602 612 434 39.0 1e-121 MKNRLIALLVLFTVIFFSTAQAQTTARKFEAGKNTFLLDGKPFVVKAAELHYTRIPQAYW EHRIEMCKALGMNTICIYIFWNIHEQEEGKFDFSGQNDIAAFCRAAQKHGMYVIVRPGPY VCAEWEMGGLPWWLLKKKDIALRTLDPYYMERVGIFMKEVGKQLAPLQVNKGGNIIMVQV ENEYGSYGIDKPYVSAVRDLVRESGFTDVPLFQCDWSSNFTNNALDDLIWTVNFGTGANI DQQFKKLKELRPETPLMCSEFWSGWFDHWGRKHETRPAKDMVQGIKDMLDRNISFSLYMT HGGTTFGHWGGANNPAYSAMCSSYDYDAPISEPGWTTDKFFLLRDLLKNYLPAGETLPAV PAALPVIEIPEFHFHKVAPLFSNLPEAKQTVDIQPMEQFNQGWGTILYRTTLPETTLAGT TLQITEVHDWAQIYADGKLLARLDRRKGEFTTILPALKKGTQLDILVEAMGRVNFDKSIH DRKGITEKVELISGNQTKELKNWTVYNFPVDYSFIKDKKYSDKKILPTMPAYYKSTFTLD KVGDTFLDMSTWGKGMVWVNGHAMGRFWEIGPQQTLFMPGCWLKEGENEILVLDLKGPTR ASIKGLKKPILDVLREKAPETHRKDGEKLKLTGEKVGHEGAFTPGNGWQEVRFATPVKGR YFCLEALSPQANDNIAAIAEFDVLGADGKPVSREHWKIRYADSEETRSGNRTADKIFDLQ ESTFWMTVDNVPYPHQLVIDLSKVENVTGFRYLPRAEKGYPGMIKEYRVYVKPADFNY >gi|225935357|gb|ACGA01000035.1| GENE 19 28767 - 29531 536 254 aa, chain - ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 11 253 16 252 273 167 38.0 1e-41 MFFEFGIKDFIDILLVAFVLYYTYKLMKASGSIKVFTGILVFILIWLVVTQVLEMKLLGS IFDTLMNVGVIALIVLFQDEIRRFLLTLGSHRHVSALARLFSGSKKESLKHDDIMPVVMA CLSMGKQKVGALIVIEHNTPLDEVVRTGELIDAAINQRLIENIFFKNSPLHDGAMVISKK RIKAAGCILPVSHDLNIPKELGLRHRAAMGISQQSDAHAIIVSEETGAISVAYRGQFYLR LNAEELESLLTKEN >gi|225935357|gb|ACGA01000035.1| GENE 20 29543 - 30409 875 288 aa, chain - ## HITS:1 COG:ECs4056 KEGG:ns NR:ns 
## COG: ECs4056 COG0294 # Protein_GI_number: 15833310 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Escherichia coli O157:H7 # 13 278 21 285 297 225 44.0 8e-59 MMKPISPIYINVKGRLLDLATPQVMGILNVTPDSFYSGSRMQTKEDIAARARQILDEGAS IIDIGAYSSRPNAEHISAEEEMNRLRIGLEILNRNHPEAIISVDTFRADVAEECVKDYGV AMINDIAAGEMDHRMFQTVADLGVPYIMMHMKGTPQSMQKEPSYDNLIKDVFLYFARKVQ QLRDLGVKDIILDPGFGFGKTLEHNYELMAHLEEFHIFELPVLVGVSRKSMIYKLLGGTP QDSLNGTTVLDTVALMKGAHILRVHDVREAVEAVRITEKLKIESGYDK >gi|225935357|gb|ACGA01000035.1| GENE 21 30527 - 32542 1329 671 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 429 668 64 310 328 130 32.0 1e-29 MSRLTLFHVGFLFLILFFTTTAKAQKEAETFNVDFTLYEYYQRCQEYLLEPVVLSMSDTL FRMAGERHDERMQAVAIATRLDYYYFQGTNEDSVIHYTNKVKEFAKATHQPKYYYFAWAN RLVTYYLKTSRTNIALYEVQNMLKEAQEEDDKTGLSRCYNIMSQIYTIKRFDSMAFEWRL KEIELTEKYNIENYNISQTYAQIANYYISQKKQKEALRAVEKAISTANSSTQQISAKLEY VNYYSKFGDFQAAEKILKECQIAFEQDKRLESIKKRLYNIECLYYQQTRQYQKALEAAEM QEKEEHRLSESILSSSHYRTQGEIYQKMGNMNMAVKYLQMYINTDDSLKIANEQVASSEF ATLLNVEKLNAEKKELMLQAQEKELHNKTTLIISLIILLGILFLFLYRENFLKRKLKVSE AELKTKNEELTVSREELRKAKDIAEASSRMKTTFIQSMTHEIRTPLNSIVGFSQVLSDHY SNSPETQEFVNIIKSNSNDLLRLVTDVLALSELDQYEQLPTDDETDMNTICQLASEVAKD NRQKDVEVLFEPAKENLLIRSNSERISQVLNNLAHNAAKFTTHGSIRIAYSVLEAEKKIE ISVTDTGTGIPKDQQEAVFERFYKMNSFTQGTGLGLPICRSIAEKLGGSLRIDSSYTDGC RMILTLPLVYA >gi|225935357|gb|ACGA01000035.1| GENE 22 32620 - 33915 1208 431 aa, chain + ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Bacillus subtilis # 14 431 28 451 457 217 35.0 3e-56 MKLSALYQIFLDCQSVTTDSRNCPDGSLFIALKGESFNGNAFAKLALDSGCAFAIIDEAE 
YAVEGDKRYILVDNCLQTMQQLANYHRRQLGTRVIGITGTNGKTTTKELISSVLCQAHNV LYTLGNLNNHIGVPTTLLRLKPEHDLAVIEMGANHPGEIKFLCEIAEPDYGIITNVGKAH LEGFGSFEGVIKTKGELYDFLRKKDAITFIHHDNAYLMNIAQGLNLISYGTEDDLYVNGQ ITDNSPYLAFEWKAGKDGERHQVRTQLIGEYNFPNALAAITIGRFFGVEAKKIDEALASY TPQNNRSQLKKTEDNTLIIDAYNANPTSMMAALQNFRNMTVPHKMLILGDMRELGAESPA EHQKIVDYIKESGFEKVWLVGELFAASEHSFKTYANAQEVIKDLQADKPKGYTILIKGSN GIKLSSTVEYL >gi|225935357|gb|ACGA01000035.1| GENE 23 33997 - 34389 380 130 aa, chain - ## HITS:1 COG:no KEGG:BT_3643 NR:ns ## KEGG: BT_3643 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 252 91.0 4e-66 MLSYIKKYPISLFIILTVIYLSFFKPPKTDLDEIPNLDKLVHVCMYFGMSGMLWLEFLRA HRRDNAPMWHAWVGAFLCPVLFSGCVELLQEYCTTYRGGDWLDFAANTTGAILASLIAYY VVRPRMMNKQ >gi|225935357|gb|ACGA01000035.1| GENE 24 34493 - 35863 1079 456 aa, chain + ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 6 449 9 446 453 327 46.0 3e-89 MAKNDRANFGSKLGVILASAGSAVGLGNIWRFPYETGNHGGAAFILIYLGCILLLGLPIM IAEFLIGRRSRANTARAYQKLAPGTHWRWVGRMGVLAGFLILSYYAVVAGWTLEYIFEAA TNGFAGKTSGEFISSFQQFSSNPWRPVVWLVAFLLITHFIIVKGVEKGIEKSSKIMMPTL FIIILILVVCSVTLPGAGAGIEFLLKPDFSKVDGNVFLSAMGQAFFSLSLGMGCLCTYAS YFSKETNLTKTAFSVGIIDTFVAILAGFIIFPAAFSVGIQPDSGPSLIFITLPNVFQQAF SGVPILAYIFSVMFYALLAMAALTSTISLHEVVTAYLHEEFNLSRGKAARLVTGGCVFLG IFCSLSLGVMKGFTVFGLGMFDLFDFVTAKIMLPLGGLCISLFTGWYLDKKIVWSEITND GSLKVPVYKLIIFILKYIAPIAISLIFINELGLIKL >gi|225935357|gb|ACGA01000035.1| GENE 25 35863 - 36780 339 305 aa, chain + ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Thermotoga maritima # 180 284 47 161 181 68 36.0 1e-11 MWKDFLYYTKTERQGIIVLVVLILGVYAAPKLFSFFTHAEDTDCRENEKFDKEYNDFISS 
LRETQPHQKSGHSFQSSPQREIKLAVFDPNIADSTTFLSLGLPSWMIKNILHYRYKQGKF RHPEDFRKIYGLTEEQYQTLRPYIQITEDFSSTNKDTVRLLTTPSIQRDTLVKYLPGTII SLNSADTTELKKIPGIGSSIARMIVNYRERLGGFFRIEQLQEIHLKAEKLRPWFSIDTHQ TRRINVNKAGMERMMHHPYINYYQAKVIIEYRKKKGFLKSLKQLSLYEEFTPIDLERLEP YICYN >gi|225935357|gb|ACGA01000035.1| GENE 26 36963 - 37619 372 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 217 245 147 40 3e-34 MIKLEGITKSFGSLQVLKGIDLEINKGEIVSIVGPSGAGKTTLLQIMGTLDEPDAGVVQI DGTVVSRMKEKELSAFRNKNIGFVFQFHQLLPEFTALENVMIPALIAGVSSKEAHERAMK ILDFMGLVDRASHKPNELSGGEKQRVAVARALINDPAVILADEPSGSLDTHNKEDLHQLF FDLRDRLGQTFVIVTHDEGLAKITDRTVHIVDGMIKKD >gi|225935357|gb|ACGA01000035.1| GENE 27 37627 - 38343 425 238 aa, chain - ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 8 236 4 229 234 164 39.0 1e-40 MERYDWQQRTELLLGEEKMERIRSAHVLVVGLGGVGAYAAEMICRAGVGRMTIVDADTVQ PTNMNRQLPAMHSTLGRTKAEVLAARYKDINPEIELAVLPVYLKDENIPELLDADKYDFI VDAIDTISPKCFLIYEAMKRHIKIVSSMGAGAKSDITQVRFADLWETYHCGLSKAVRKRL QKMGMKRKLPVVFSTEQADPKAVLLTEDEQNKKSTCGTVSYMPAVFGCYLAEYVIKRL >gi|225935357|gb|ACGA01000035.1| GENE 28 38343 - 40478 1987 711 aa, chain - ## HITS:1 COG:all3567_1 KEGG:ns NR:ns ## COG: all3567_1 COG0475 # Protein_GI_number: 17231059 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Nostoc sp. 
PCC 7120 # 8 396 22 404 413 303 44.0 9e-82 MNLFDLNLTLPISDPTWVFFLVLIIILFAPMILGRLHIPHIIGMILAGVLIGEHGFHVLD RDSSFELFGKVGLYYIMFLAGLEMDMEDFKKNRTKSFVFGWLTFLIPMALGIWSSMSMLG YGFLTAVLLASMYASHTLIAYPIISRYGLSRLRSVNITIGGTAVTVTLALIILAVIGGMF KGTVDGWFWVFLVAKVAFLGFLIVFFFPRIGRWFFRKYDDSVMQFVFVLAMVFLGGGLME FVGMEGILGAFLAGLVLNRLIPHVSPLMNRLEFVGNALFIPYFLIGVGMIIDVRSLFTGG EALKVAIVMTVVATFSKWLAAWITQKIYRMQSNERSMIFGLSNAQAAATLAAVLIGHEII MENGERLLNDDVLNGTVVMILFTCVISSLVTERSARRFALDENVQAEEEAKKVNTEQILI PVANPETIEDLINLALVIKDAKQKNALVALNVINDNNSSEKKEQQGKRNLEKAAMIAAAA DVPVTMVSRYDLNIASGIIHTIKEYEATDIVIGLHRKANIVDSFFGHLAESLLKGTHREV MIAKFLMPVNTLRRINIAVPPKAEYESGFSKWVEHFCRMGSILGCRVHFFANERTLMRVQ QLVKKRHAGTPTEFSILEEWEDLLLLTGQVNYDHLLVVISARRGSISYDPSFERLPAQLG KYFSNNSLIILYPDQFGEPQEIVSFSDPRGHNESQHYEKVGKWFYKWLKKN >gi|225935357|gb|ACGA01000035.1| GENE 29 40731 - 41738 1210 335 aa, chain + ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 2 331 4 336 340 371 59.0 1e-103 MKVAIVGVSGAVGQEFLRVLDERNFPMDELVLFGSKRSAGTTYTFRGKQIEVKLLQHNDD FKGVDIAFTSAGAGTSKEFEKTITKYGAVMIDNSSAFRMDADVPLVVPEVNAEDALERPR GVIANPNCTTIQMVVALKAIEKLSHIKTVHVSTYQAASGAGAAAMDELYEQYRQVLANEP VTVEKFAYQLAFNLIPQIDVFTENGYTKEEMKMYNETRKIMHSDVKVSATCVRVPALRAH SESIWVETERPISVEEAREAFANGEGLILQDNPAEKEYPMPLFLAGKDPVYVGRIRKDLT NENGLTFWIVGDQIKKGAALNAVQIAEYLIKVKNI >gi|225935357|gb|ACGA01000035.1| GENE 30 41925 - 44813 1416 962 aa, chain + ## HITS:1 COG:no KEGG:BT_3633 NR:ns ## KEGG: BT_3633 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 40 962 23 945 945 1495 80.0 0 MRLLYIIHLENFAQNLYMGYFRLNIAFFGLLFFIVFKATANPGTYTISGRITEEKTNNPL PGASIIIKGTYLWAVSNQNGDFTIQGVREGKCNLEVSFLGYVTTTILVNIRSSIKDITIQ LKENTLALDDVVITAQAPKSELNTTLIIGSNALEHLQVSNVSDISALLPGGKTKVPDLTT NNVFSLRDGGSTVGNATFGTAVAVDGIRIGNNASFGNMSGIDTRSIAVTNIESVEVITGV 
PSAEYGDLNSGMVNIRTKKGKTPWEILLAINPRTEQFSFSKGLDLGNDKGSINISGEWTK ATRKLSSPYTSYTRRGFSANYNNTFRNIFRFNIGFTGNIGGMNTKDDPDAYTGEYTKVRD NVFRANTSLSWLLNKSWITNLKFDASLHYNDNKSHAHTPYTYASEQPAVHAEQEGYFLAD KLPYSYFADQIIDSKELDYAASLKYEWNRRFKKVNSNLIAGIQWKSTGNIGEGEYYLNSS LAPNGYRPRPYTTYPYMHNVSLYAEENLTVPLGSTTLKLMAGIRWEKIFISGTEYKNLNT FSPRFNAKWQLNRNISIRGGWGVAEKLPSYYILYPRQEYRDIQTFGVSYNNNESSYVYYS QPYVLLHNKNLRWQRNQNAELGVDIEIAKTKISLVGYFNRTKNPYKYTNAYTPFSYDVLQ LPDGFSMPANPQININNQTGMVYIRGNESEEWVPMDVKVTNRTFVNSVSPDNGPDIKRRG VEMIVDFPEITPIRTQLRLDASYDYMKYVDNSLSYYYQTGWSHTGIPNRSYQYVGIYANG DNSSTTANGKRTHSLDANITAITRIPKARLIISFRLEASLLKRSQNLSEYNGKEYAFNVS DNSNAGTGGSIYDGNSYTAIYPVAYLDLNNEIRPFTEKEAQNSAFANLIRKSGNAYTFAA DGYDPYFSANINITKEIGDHVSLALNAINFTNSRKYVTSYATGVSAIFTPDFYYGLTCRI KF >gi|225935357|gb|ACGA01000035.1| GENE 31 44818 - 46050 725 410 aa, chain + ## HITS:1 COG:no KEGG:BT_3632 NR:ns ## KEGG: BT_3632 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 410 3 411 411 655 74.0 0 MKQLAYILLLMTFLTTGSCTDIDKDNPYDNQLHTLQVNAVYPNEYSDYLREGVTVKIEDI DRGNSYTSKTDKNGTVRFSLTKGIYRIQISDKAEQDIFNGLADKVKLVNGDLALNLPLVH SRSGDIVIKEIYCGGCTKLPFEGNYQSDKYMILHNNTSETQYLDGLCFGSLDPYNSQATN VWVTQDESTGATIFPDFLPVAQCVWQFGGTGQTFPLAPGEDAVVVICGAIDHAAQYIQSV NLNKPGYFVCYNPVYFWNTLYHPAPGDQITPDHYLNVVIKTGQANAYTFSVFSPATVLFK AKDTTIQDFVSQADNVIQKPGSTVDRIVKVPINWVLDAVEIYYGGSSNNMKRMPPSVDAG YVTQSALYDGRTLYRHTDEEASREAGYEILEDTNNSSSDFYEREKQSLHE >gi|225935357|gb|ACGA01000035.1| GENE 32 46043 - 47551 619 502 aa, chain + ## HITS:1 COG:no KEGG:BT_3631 NR:ns ## KEGG: BT_3631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 502 1 502 502 785 77.0 0 MSNMKRIYITCIFLYMASTVSAQSYDIVERRNPWNAGANVTGIMVDSTTVSYAELYGRNN HGDFHNYYEADKLWNAGAIAKSITHLKSYSLTGSFSFDHTSGRNMSGSMFIHPGFYPVDI LEFTPGRKDLQTYAFMGGIAKDIAPCWRIGGKIDFTSANYSKRKDLRHTNYRLDLKVAPS IMYHSDDYAIGFSYIFGKNSESVKAEEIGTAATSYYAFLDKGLMYGAYETWEGSGIHLNE SGINGFPIKELSHGAAVQFQWKAFYVDVEYSHSSGSAGEKESIWFKFPTNRVTSHLSYRF 
SKGNVAHFLRLNLTWSRQFNNENVLGRETSNGITTTHVYGSNRIFERSVFSVQSEYEFIA PRRELRFGANVSSLKSLTTQMFPYSVSQTMTCGRIYLAGTFHTKLFDLKTSGIFSAGDYT EKNKTVKTESEAGEPPYRLTEYYNLQNEYTTAPRLTFEVGLRYKFYRGIYAEIQAEYTHG FNLKYIAGVNRWSETIKLGYTF >gi|225935357|gb|ACGA01000035.1| GENE 33 47591 - 48508 641 305 aa, chain + ## HITS:1 COG:no KEGG:BT_3630 NR:ns ## KEGG: BT_3630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 302 262 540 541 304 56.0 2e-81 MKKYLMFMLAAALFASCSDDEKLPKEDETPGDTNFEEHDGYFVYYGETYKTIKLANGTTW MAEPLRYVPEGYTPSSDPTADSHIWYPYQLTGVTDKITATGAEALTDEASIKKLGYFYDI YAALGGKEVTAENCYEFESAQGICPKGWHIPTRAEFLDLCGLAKGGQGESDTTKEDALFY DKAYSGGNMSKYNAAGWNYVLSGVRMQNNFAATPTYQLTTFYSGNTTSETLEKYKGQPAL TYIMSSTCYAPLYLDKTDPTKLTNIQFFAQMTTFTKSYPEGRINVPYISIKSGQQLRCVK DQAAN >gi|225935357|gb|ACGA01000035.1| GENE 34 48690 - 49277 601 195 aa, chain + ## HITS:1 COG:no KEGG:BT_3629 NR:ns ## KEGG: BT_3629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 311 89.0 8e-84 MKIKKLTIAVAIVAMTFIGTSCGNKQQKSASEATTEQSASSALEIDSLLANAESLAGQEV TIEGVCTHTCKHGAKKIFLMGSDDTQVIRVEAGTLGAFDPKCVNSIVRVTGTLKEQRIDE AYLQNWEAQLKAQAAEKHGTGEAGCDSEKKARGETANSPEARIADFRAKIADRKAETGKE YLSFYFMEADSYEVE >gi|225935357|gb|ACGA01000035.1| GENE 35 49336 - 49830 304 164 aa, chain + ## HITS:1 COG:no KEGG:BT_3628 NR:ns ## KEGG: BT_3628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 25 188 188 289 87.0 3e-77 MVLIYAISGIVMNHRDTINPNFSITRKEYKIAEKLPDKAGMNKEKVLTLLEPLGESGNYT KYYFPKTDVMKVFLKGGSNLLVNVKTGEAVYESVTRRPLIGAMSRLHYNPGQWWTYFADI FAIALIIITLSGIIMLKGNKGIIGRGGIEVIVGILIPILFLFFF >gi|225935357|gb|ACGA01000035.1| GENE 36 49853 - 50740 227 295 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 21 291 26 304 309 92 26 2e-17 MEHIIECNNLTHYYGKRLIYENLSFTVPKGRILGLLGKNGTGKTTTINILSGYLKPRSGE 
CRIFGQEIQTMAPALRRNIGLLIEGHVQYQFMTIIEIEKFYAAFYPGQWKKEAYYELMNK LKVAPGQRISRMSCGQRSQVALGLILAQNPELLVLDDFSLGLDPGYRRLFVDYLRDYARS EGKTVFLTSHIIQDMERLVDDCIIMDYGKILIQKPIVELLEKGRRYTFTIPEGYELPASD DFYHPSVMRNQLETFSFLQPTETEAKLKSMSVPYTNFHSEQVNLEDAFIGLTSKY >gi|225935357|gb|ACGA01000035.1| GENE 37 50765 - 51433 386 222 aa, chain + ## HITS:1 COG:no KEGG:BT_3626 NR:ns ## KEGG: BT_3626 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 222 1 222 222 371 94.0 1e-101 MYAIFYKEWIKTRWYFLLAVLATLGFTGYCMLRINRVVEMKGAAHVWEVMLQRDVIFIDM LQYIPLIAGILMAIVQFVPEMQRKCLKLTLHLPYPELKMTGNMLLSGLVLLLVCFASNFL LMEVYLNGILAHELKNHILLTALTWYLAGISGYLLVAWICLEPAWKRRILNLIFAVLLLR IFFLSPTPEAYNKFLPYLVVYTLLTASFSWLSIVRFKAGKQD >gi|225935357|gb|ACGA01000035.1| GENE 38 51476 - 52510 1016 344 aa, chain + ## HITS:1 COG:no KEGG:BT_3625 NR:ns ## KEGG: BT_3625 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 344 1 344 344 632 90.0 1e-180 MRRFSTILLYVTIAFLLLWQLPWCYNFFVVKPEKTPFTLYSFVIGDFAQMGQEEGKGSVR RDLAGNIYSEAAFDSILPMFYFRQLMSDERFPDTIKGIAVTPKMVQTENFNFRSVPSDIN APSIGLYPLMESMSGRVDLKMPDDVFRITTKGIEFIDMASNSVKEDKSLQFTEAMTKKEF RFPAREIVGNPTVKKEYDEGYLLLDADRRLFHLKQVKGRPYVRAITLPEGLTLEHLYLTE FRNKKTLAFMTDVNNAFYVLQSRTYEVVKTGIPVFNPETDALTIIGNMFDWTVRVTTPTS DNYYALNADDYSLIKKLENASSTHSMPGITFTSYTDKYVMPRFE >gi|225935357|gb|ACGA01000035.1| GENE 39 52772 - 54046 750 424 aa, chain + ## HITS:1 COG:no KEGG:BT_3624 NR:ns ## KEGG: BT_3624 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 423 1 420 421 547 65.0 1e-154 MKVYKAFAFLSIFIILFYNCSLPTDDEEKGDSEGEDTVTSELPWEGEMDKFTINSKEGIH LNDPYEDAGTAYVTIPSTSVKNTRWEFGVHLTFNPSANNYARFYLTSSSNILSENLNGYY IQIGGAKDNVTLYRQNGDQSKLLASGRELMKGDSSPKLYIKVERDNNGYWTFWTRRESEN EYVKEKQIKDTDIQTSRYCGIYCIYTKTRCKGFTFHHIQLSNNVETDTTPDETPDHPGTD IPDNPNTPELPKDVRGMLLFNEIMYNNATDGAEYIEIYNPTEQAIILPALYLYKMYKDGA IYNTTILQNESPSTPLTIPAKAYLCFTKYFNRVVQKHKVGGENIIIIPNFPALNNNGGYL 
ALSSSKETAPGHTFDTCCFRDEMHTIDKITGVSLEKKSPELSSLNKNWRSSKHATGGTPG IKNM >gi|225935357|gb|ACGA01000035.1| GENE 40 54138 - 55610 1133 490 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 26 421 25 420 512 129 28.0 2e-29 MSPAVISITTVAYFIILFTISYIAGRKADNEGFFVGNRKSAWYIVAFAMIGSTISGVTFV SVPGMVQASSFSYLQMVLGFIVGQIIIAFVLVPLFYRMNLVSIYEYLENRFGSSSYKTGA WFFFISKMLGAAVRLFLVCLTLQFLIFDPFHLPFLLNVILTVFIVWLYTFRGGVKSLIWT DVLKTFCLVVSVVLCIYYIASSLHLNFSGLISTISDSDFSKTFFFDDVNDKRYFFKQFLA GVFTVIAMNGLDQDMMQRNLSCKNFRDSQKNMITSGISQFFVILLFLMLGVLLYTFTAQQ GIENPEKSDELFPMIATGNYFPGIVGILFIIGLIASAYSAAGSALTALTTSFTVDILHAQ IKGEAALSKIRKQVHIGMAVVMGAVIFVFNLLNNTSVIDAIYTLASYTYGPILGLFAFGI FTKKQVYDKYIPLVAITSPILCYILQRNSEAWFNGYQISYELLLFNAAFTFIGLCFLIRK KSTMDNTTVL >gi|225935357|gb|ACGA01000035.1| GENE 41 55624 - 58074 1499 816 aa, chain + ## HITS:1 COG:SPy0791 KEGG:ns NR:ns ## COG: SPy0791 COG0463 # Protein_GI_number: 15674834 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 246 354 2 110 335 64 40.0 7e-10 MKKKINCFIPFGTPEDTMQTVKELQVSELVNKIYLLGSEPGKKALPGCEYLSVKGFYSTD TMKTIAANANTEYTLFYLKQTPLKLGLYALERMVQIMENDKKNGIVYADHYQLINGELKQ APVIDYQLGSVRDDFDFGSMLLFSPSAFTKIADALREEYKYAGLYAMRLFISYKYSIVHI NEYLYTEIETDTRKSGEKQFDYVNPKNREVQIEMEAACTEYLKCIDAYFMPTSSRPVNLH SENFEFEASVIIPVRNRAHTIRDAVNSALNQRTTFSFNIIVIDNHSTDGTTEILQELSSD KRLIHIIPQEHDLGIGGCWNKGICHEKCGKFAIQLDSDDLYKDESTLQKIVDTFYKESCA MVIGTYLMTDFQLNEIPPGIIDHKEWTPENGKNNALRINGLGAPRAFYTPILRNIKLPNT SYGEDYAIGLRISREYKIGRIYDVIYLCRRWEGNSDAALSTEKVNRNNFYKDRIRTWEIK GRIQMHTIDEEFQELVEEMIENQKENWELAKRNYEALEENLEKKKVLKLKEEDREMKVRI FPNPQRILSTMAKTDSRSIQERPCFLCGKNRPAEQTYLPFGHYEVCLNPYPIFQRHLTII DKEHTPQSMKGRFEDMLHLAENLDEFYILYNGPECGASAPDHMHFQAAGKEEELTNPFAL 
NFLKSILENENGVTTYVDNVFTTCIGMTSGLKVDLMQQFEKVYQNLSIIYSDKEPLINMI TWYGLDKISHFGGDEIEVWNCIIFLRSKHRPDCYYTPNEKGLLISPAVAEMGGIFPIVRE EDMDKLNAKKLTEIYKEISLSPQQLNTLCDQLFKKK >gi|225935357|gb|ACGA01000035.1| GENE 42 58087 - 59457 850 456 aa, chain + ## HITS:1 COG:sll1283 KEGG:ns NR:ns ## COG: sll1283 COG2385 # Protein_GI_number: 16329811 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 88 453 127 388 391 107 25.0 5e-23 MRVPLISVGILSGKEIEFSFPIKFISSDETESSGMQKAIYRDGKIHWQGKEYNELSFTPQ QGAPAFFELKDVTIGINFHWERKEVQKFKGELKIIIEGEQLTAINIISIEEYLISVISSE MSATASLELLKAHAVISRSWLLNKCKTESRKQEMKNEKTGKSIQKDSATCSPSFVSDSQF IKWYDHEAHKNFDVCADDHCQRYQGITRASTPQAIEAVTATRGEVLMYAGTICDARFSKC CGGAFEEFQNCWENVKHPYLIGQRDCKIEAQLPDLTIETEADKWIRTSPVAFCHTQDKKI LSQVLNNYDQETTDFYRWKVSYSQQELSTLIHQRSGIDFGQILDLIPIERGTSGRLVRLK IVGTLRTLIIGKELEIRRTLSTSHLYSSAFVIDKEYKEKGHKKDKNPSRFILIGAGWGHG VGLCQIGAAVMGEQGYKYEEILSHYYPGSTLEKQYQ >gi|225935357|gb|ACGA01000035.1| GENE 43 59474 - 60757 1028 427 aa, chain + ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 17 422 20 409 492 111 24.0 3e-24 MTTPTIKRNPWSWIPTLYFAEGLPYVAVMTIAVIMYKRLGLSNTEIALYTSWLYLPWTIK PLWSPFVDLVKTKRAWIIAMQGFIAAGFAGIAFFIPTAHYVQLTLAFFWLLAFSSATHDI AADGFYMLGLNNKEQSFFVGIRNTFYRLANIFGQGILVMLAGWLETSQNNIPLAWSITFY LLAGLFLALTIYHRLILPHPDSDIKRPGLTPGKLLGDFLLTFVTFFKKKNLGLMFFFLLT YRLGESQLAKIASPFLLDATDKGGLALSTATVGMIYGTIGVIALLVGGIISGFLVSRDGF RKWILPMALAINLPDLLYVWMAAATPDNPIFIAICVAIEQLGYGFGFTAYMLYLIYIAEG EHKTAHYAIGTGFMALGMMLPGMPAGWIQEHLGYTNFFIWVCICTLPGIVASLMIRNRLE DSFGKKQ >gi|225935357|gb|ACGA01000035.1| GENE 44 60796 - 61911 883 371 aa, chain + ## HITS:1 COG:CC0541 KEGG:ns NR:ns ## COG: CC0541 
COG4299 # Protein_GI_number: 16124796 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Caulobacter vibrioides # 8 371 7 372 372 192 34.0 7e-49 MNTTSNKRLLALDVMRGITIAGMILVNNPGSWGHAYAPLKHAQWNGLTPTDLVFPFFMFI MGISTYISLKKYNFTFSTPAALKIIKRTIVIFLIGIALNWFALLCYTHNPLPFEQIRILG VMQRLALCYGASALIALLLKHKYIPYLIVVLLVGYFIILITGNGFAYNETNILSIVDRSI LGDAHMYQDNHIDPEGLLSTIPSIAHVLIGFCVGKLLMEVKDIHEKLERLFLIGTILTFA GFLFSYGCPFNKKIWSPTFVLATCGLGSSLLALLVWIIDIKGCKKWSRFFESFGVNPLFI YVMAGVIAVLVGAITVTYQGESVSIQQVVYRCALQPVFGDEGGSLAYAILFVLLNWSIGY ILYKKKIYIKI >gi|225935357|gb|ACGA01000035.1| GENE 45 61908 - 62312 123 134 aa, chain + ## HITS:1 COG:no KEGG:BT_3618 NR:ns ## KEGG: BT_3618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 283 113 71.0 2e-24 MILIADSGSTKTDWCIVFNDTPIKRIGTKGINPFFQSEEEIQQELTHSLLPQLPEGTINS VFFMVQDALPKELPFSDGPLQTVCPSLETSKPIRICLLLPADYADMKLGLPASSAQVPIL VSTTVKRLSTTSPL >gi|225935357|gb|ACGA01000035.1| GENE 46 62192 - 62758 421 188 aa, chain + ## HITS:1 COG:no KEGG:BT_3618 NR:ns ## KEGG: BT_3618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 96 280 283 333 86.0 2e-90 MLAAARGLCGHEAGIACILGTGSNSCFYNGEEIVNNISPLGFILGDEGSGAVLGKLLVGD ILKNQLPPAIKEAFLKQFDLTAPEIIDRVYRQPFPNRFLASLSPFIAQHLEEPGIRQLVL GSFIAFFRRNVMQYDYTQYPAHFIGSVAHCYKEILQEAAQETGIRIGKILQSPMEGLILY HTASSQSL >gi|225935357|gb|ACGA01000035.1| GENE 47 62790 - 63605 795 271 aa, chain + ## HITS:1 COG:HI0754 KEGG:ns NR:ns ## COG: HI0754 COG2103 # Protein_GI_number: 16272695 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Haemophilus influenzae # 3 253 13 264 303 242 52.0 7e-64 MQITEQPSLYDNLEKKSVREILEDINREDQKVALAVQKAIPQIELLVNQIVPRMKQGGRI FYMGAGTSGRLGVLDASEIPPTFGMPPTWIIGLIAGGDTALRNPVEGAEDDMSRGWEELT EYHINPKDTVIGIAASGTTPYVIGALREARKHGILTGCITSNPDSPMAAEADVAIEMIVG PEYVTGSSRMKSGTGQKMILNMISTSVMIQLGRVKGNKMVNMQLSNHKLVERGTRMIMEE 
LGLDHDHAQKLLLLHGSVKKAIDAYQHSKDY >gi|225935357|gb|ACGA01000035.1| GENE 48 63871 - 66408 2207 845 aa, chain + ## HITS:1 COG:no KEGG:BVU_0030 NR:ns ## KEGG: BVU_0030 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 31 842 35 858 864 1210 67.0 0 MKKLFLFFYLSVVSIAAFAAEPFVVFTATGNHFPLVANGVPCPIYIDSSEDKGVMIAAGN LQQDILQVCGKKPELLTSASSKRCIIAGTYGTPFIKKLVSAGKIDKKELDGKNEKYILQV ISNPCEGVDEAVVIIGSDRRGTIYGIYELSEQMGVSPWYWWADVPVMKQPNVYIKPGQYS DGEPAVTYRGIFLNDEAPCLTRWVKHTYGTNYGDHRFYARVCELILRLKGNFLWPAMWSW AFYQDDPQNSKTASEMGVIIGTSHHEPMARNHQEWSRKRKEYGAWDYSTNQKVIDQFFRE GIERMQGTEDIVTIGMRGDGDAAMSESTNVKLLENVVKNQRKIIEEITKRPAKETPQVWA LYKEVLDYYDKGMRVPDDVIMLLCDDNWGNVCRLPNAKERKHPGGWGMYYHVDYVGAPRN SKWLNVTPIQNMWEQLQLTYDYGVEKLWILNVGDLKPMEYPITLFMDMAWNPKQFNVSNL LDHPRRFCAQQFGEDQADEAMRILNLYSKYNGRVTGEMLDWNTYNLETGEWKQVSDEYLK LEAEALRQYISLKPEYKDAYKQLILFPVQAMANLYEMYYAQAMNHKLYKENNPQANEWAD KVEQTFVRDKALSDDYNNVMSSGKWKNMMIQKHIGYTSWNDNFPADTQPKIYRIEKPEKA VGGYVFTGQDGYIAIEAEHYYSAKAAPGTEWTVIPYMGRTLSGMALMPYTQPTDGASISY KINLPKGIDKVTVHVIVKSTLAFHDRKGHEYSIGFEGGKDQTINFNHNLNELPENVYSIY YPTVARRIVEKKAKLNVPNTSDGMQTITFKPLDPGIVLEKLVVDYGGYKRSYLFMNESKS KRESR >gi|225935357|gb|ACGA01000035.1| GENE 49 67396 - 69423 1660 675 aa, chain + ## HITS:1 COG:no KEGG:PRU_2308 NR:ns ## KEGG: PRU_2308 # Name: not_defined # Def: putative glycosyl hydrolase # Organism: P.ruminicola # Pathway: not_defined # 26 671 21 676 677 790 60.0 0 MTLLTTQNMVGANKNNKELNIIGTGNPYLPLWEHLPDGEPRVFEDPDNPGKYRIYIIGSH DVRFTSYCGADIRMWSAPVEDLTNWRDEGAIFTYHVQNQWDVMYAPDLVEIKRKDGTKEY YLFPHSRGRNREAMVCKGSRPDGPFTPINLNEDGTRTLPGSFLGFDPSVFVENITDPNDP DYEIGFRAYGFWGFQRSSAAQLDQNTMYSVRPGTEIIPYFIPASTNSGIIRDPAGTTYPA LYKEQNPKDFKFFEASSIRQIGNKYVMIFSGYSGPEYGLGNSNSTLRYAFGDSPLGPWRS GGVLVDSRGVVPNQDGSKLQATNAAHNTHGSLQEINGQWYVFYHRPPRGFGFARQAMVAP VTIKWDEKSVAEGGKVVIHGYDPYAKEGIWEAKATNGSEYTGAEVTSEGFQIFGLDPYKY YSAGYACYLSNIGSQQDSWDIWDNNMPVDVRGNDIIGYKYFGFGGLKKAQKGLKPFEGTQ KGNKTSFNLFLTPMTDASFKINIWLDGPWDNSVWKGKKIGEITVPAGSIQEITKYTIDVS 
EVVDKLDKKHAIFLVAESNSKEKLCVLQGLGFSKKGKSLEYPTIPTVEIMVDGQKVDLPT TPVRSTNANGITGFNQYETTYTIPANTEDTPKVSATSNDPSVKIDVIQPTSGNKKAVVNF EYKGTVKTYTILLAE >gi|225935357|gb|ACGA01000035.1| GENE 50 69876 - 72497 1455 873 aa, chain - ## HITS:1 COG:SPy1586 KEGG:ns NR:ns ## COG: SPy1586 COG3250 # Protein_GI_number: 15675473 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pyogenes M1 GAS # 28 476 54 515 1168 192 29.0 3e-48 MIKRVLFLGILIGLLHSAAYSVDFVRLKRNINREWKFVLSDIKGAEVTDFNDGDWQNISL PHSFSIPYFMWSKIYNGYGWYRKVIDIPTEWKDKNVTLEFEGSFIETEVYINGTYLGKHI GGYTGFFFDITRYLNAGKNVIAVRVNNLWKPDVAPRAGDHQFSGGIYRDVYLNVTDKLHV DVYGTFAYTAAISKKEALCEIQTEIRNNYPSDKNCIINTEIESPVGKVVSKVQSKMLVKG NEVKIVRQVFPRISEPQLWSPESPQQYKAVTTVMADGKVMDKYETTFGLRWFDWTADKGF FLNGEHYYLLGANVHQDQAGWGDAVTNAAMRRDVQMIKDAGFNCIRGSHYPHDPAFVQAC DEIGLIFFSENAFWGTGGAFGDRFSWNHPTSSCYPANPEYRPKFDESVLAQLKEMIKIHR NHASIAAWSMCNEPFFTDWETFKDMKSLLNQVTDSASVWDPTRKVAIGGAQRGEIDRLGK NVIAFYNGDGASRTEFQNPGVPNLVSEYGSTTSYRPGRFFAGWGDVVKTPGYADPWNPPT WRSGQIIWCGFDHGTIFGTGLATMGMIDYFRLPKRQYYWYVEALKKGNRNPQEPEWPSKG IPARLGLKASNNIISSTDGIDDAQLVVTVLDEYNRHISNNVPIELRILSGPGEFPTGRMI RFVPPSQEEASDIAVRDGQAAIAFRSYHAGKTVIQAITDGLEPAVIEIITLGTPMWEEGV TVPVAGRPYHRYEGEVCERVSDTNKMLLATYRPTWVSSSLEGTNKAYVNDGDVTTFWKPV VTDREKWWKLALEASYCISKIQVELPEADVVYQYKIEVSVDDLIWKEVISDRVSSKDIKV RTFQGDFGCDIAFVRISFISEEAGLTEVRIGGK >gi|225935357|gb|ACGA01000035.1| GENE 51 72508 - 73407 694 299 aa, chain - ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 34 290 30 263 269 91 28.0 2e-18 MEMRKLGVLVALSIYTLSLQAQSNFQSRVITDTVYSEFLKTKRAFTVYLPKSFEHNKEKK YPILYLLHGMWEKNDVWANRGHIKDIMDCLTANGEVCEMIVVTPDAGGGDPNIYQNGYFD MAGWKYETFFFMEFLPFVEGRYRVIGDKQHRAIAGLSMGGGGAISYGQRHSDMFCAVYAM SALMDIPQEGAARFDDPNGKLAVLTRSVIEKSCVKYITASNEERKTELRSVAWFVDCGDD DFLLDRNIEFYRAMRGANIPCQFRVRDGGHTWEYWHSALYTCLPFVSRIYHSLCFTQIK 
>gi|225935357|gb|ACGA01000035.1| GENE 52 73463 - 75043 904 526 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 23 522 39 528 531 346 38.0 7e-95 MNRIIILLLILSLCGSCKVASVSSAFTNPVVYADVPDMDVIRVGDDFYMMSTTMHLMPGA PVMRSKDLVNWEIISYVFDRLTDTSNYDLINGTVYGRGQWATSIRYHKGKFYVLFSPNDK PYRSYIYTATDPAGKWELLTRTQHFHDSSLFFDDDGRVYVFSGTGSLKELNADLSDVKPD GVDMKIFERDETETGLLEGSRVVKYDGKYYLLMISWPKGGNRRQVCYRADKITGPYEKKV ILEDNFAGFPYVGQGCIVDDTKGNWYGMIFQDRGGVGRVATLMPCRWIDGWPILGDENGK VPLLMEKPVQGCPEKSLVVSDDFSDSKLKLNWQWNHNPVNEAWSLTERPGFLRLKTNRVV DNLFLAPNTISQRMEGPECSGVVALDVAHMKEGDVAGFSAFNGDAALLSVIVRNGKKYLT MSVDSVSLDSSNKAVLDVKSEEKEQLELNQSVVYLRIDGNFRLNHDIASCYYSLDNKNWK KIGEDFKMKFDYRRLFIGTRFAIFNYATNSLGGYVDVDYFDYKKTE >gi|225935357|gb|ACGA01000035.1| GENE 53 75160 - 77775 1822 871 aa, chain - ## HITS:1 COG:no KEGG:CJA_3286 NR:ns ## KEGG: CJA_3286 # Name: ebg98 # Def: endo-beta-galactosidase, putative, ebg98A (EC:3.2.1.-) # Organism: C.japonicus # Pathway: not_defined # 161 870 306 1014 1016 743 50.0 0 MVVAQEKGLDVRSCVALTASSYNVELSDGNSFIVHTEITPLDKGESPVYAPKLGVKKEGD IYYWTIDDDWLNIKDAKVAVTPELSSIPVVGMDEDGYWTVKCNDDVKTLKRRAEAGTVKS IFNQVDIKGKTVTFRFSDGSLPIAFTIDDKDNPDEPVMGDLRRPISPDKPAWIFHIDTWN HADPQRIIDMIPADIRPYVIFNISLSVLHDDDTGKWGLVEYGYEVAKSWLRTCAENNVWA MVQPSSGGFSHFPDFATYGQMEGSLYEEFYRDYPNFLGFNYCEQFWGFDDPFSVSYPQRL KHWTNLMKLNAKYGGYLTISFCGPHWGASLTPVAMFKRDAEFAAVCREHPENLIVCEKYT STYGFFDIESACLGAWLSGYAGQYGMRFDECGWNAIYWNGDEKFPVAAGAIPAIEHIMFT GQTIFDGPETIPEQVCKEVNAVSAGDGYTRRNWEFYPQLYNINMDIYRKILDGTIRIMNR QEVIDRTKYVIVNDITPNNTASDPGYLAPKTLYDGLYKLDEDGTNYEQRLYFKKTGRYPT LPVVYGLVDDIANSFRYKINASQYNGTYGDVKLKQSIFNRDFPEEYTGNLYVGRHENAWV AYNAYAEIRNAAIPFKYNTCEKMELAFAKYSVSAIKEYSDKVTFYLTNYNEKGIKQTDVI KIYGSTNKPKMTYTDRATPASCSISEDWTNGVYTMTVIHNGAVDITVNCAGNAIGRETQY TKATISVPANPNIYHGARQYEAENFEYKNISRIITKKPYSDTEGILPDYTAMGFLNFGSN GNAAVRDEVSVTDTGNYSLRIRYCAAATVNTVDLYVNGVKVTTLEFAQTGVGNWETTSAG 
VSLNAGKNKVELIANGSSASCDLYLDNIVIE >gi|225935357|gb|ACGA01000035.1| GENE 54 78005 - 80236 1660 743 aa, chain - ## HITS:1 COG:no KEGG:PRU_2739 NR:ns ## KEGG: PRU_2739 # Name: not_defined # Def: endo-1,4-beta-xylanase (EC:3.2.1.8) # Organism: P.ruminicola # Pathway: not_defined # 454 742 385 704 707 140 31.0 1e-31 MKYINNKIISLVLVATVVVSCKDEYACHLPLEKPEEVANSEYLSTFDLLKSYVNSNSSFK LAANMSASLLAKKDIAYSTLLANFNAVDVNGSFTPLNTLKVDGTYDFGGMQSVTDLAAGA GIALYGGSLCSDQGQRAAYYNKLIEPIDIPVETETGKTKLFNFDDDIIGKTYPMTGNSSA TVEEDPAGESGHVLHVGTDAVKANDSYAKLHVVLPSGRMLGDYVRLNIDLRHVGEDGIYG QGMKVIINGTKFDLGKSAADLGATNNNWKRSLVIKLNDAASPGFVIPENMRSLTEFDLSI GSQSGTAQYYLDNISMEYEVSGKGSTTINFEADELNATYPMTGSGTATVVKDPAGESGNA LYINQAAWAFPKFTVKLKEGMTLKDYTGMMMDMRLIKGMYGGGMSVIINGQTISLNQNAA GYGFQENDTWKRGGILVTFVKEGTYTALNEKVPAGTIEIPDAMKDLNEMEFSIGSSSSNW TAYIDNLKFMWEAQPQHIEKTPEEKKEIFTKEMEKWIGGMVYAGVNEVNSVKLWNIISEP LDNTTDENTFKWSEYLGEMDYARTAVKIARDTVKNVGVELELFVSQTINQFDDMGKVAND LITLVDSWEADNVTKIDGYNILLHAIYSQNVTDQKGNEEIVTGLFEKLAASGKSVRVSDL SMMVQDANRNFIVANKLTLEERAAAADYLAFIMKEYRRLIPADKQYGISISGISESNGGN TVCPWTSGYNRNGMYEGVVEGLK >gi|225935357|gb|ACGA01000035.1| GENE 55 80273 - 81307 960 344 aa, chain - ## HITS:1 COG:no KEGG:Slin_2105 NR:ns ## KEGG: Slin_2105 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 17 333 4 321 331 289 48.0 1e-76 MKKNLIYIGLASLFFTFTSCENGENEFPDFDYQTVYFANQYGLRTIELGEENFYDNTLDN QHKMLIKAAWGGGYTNRKNVIIDYVVEESLCDGLYFKDTDIPVTPMPASYYTLASNQINI PVNEISGGVEVQLTDAFFADEKSIDRYYVIPIRMNDVQGADSILQGRSAVDSPAWNKADD WSILPKNYVLYAVKYVNPWHGQYLRRGVDQVTINGESKKLIRHAEFVEKDEDVDVNTAAY KEDLLTLQVKDGTGDAHSFTLRLTFNEDGVCSITSGSQDVVASGNGKFVSKGEKNSLGGK DRDAIYLEYNVELKNPGIQLATKDTLVLRTRNVYGGGTFEVERK >gi|225935357|gb|ACGA01000035.1| GENE 56 81335 - 83104 1611 589 aa, chain - ## HITS:1 COG:no KEGG:ZPR_0751 NR:ns ## KEGG: ZPR_0751 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 588 1 580 580 556 50.0 1e-156 MNKKILLFVAGALMLGGCDDLFKPGLENFKDVEQMYEDAQYAQGFLMSTYSHIPGYYDNS 
EYATDDAVVNQKSDAFLTIATGGWTASTWTSLNQWTNSFSSIQYLNLFLENVDQVKWADD VEKNALFARRTKGEAYGLRGMFLYYLLRAHAGFGENGELLGVPILTESQTVESNFNLPRA SFQACVEQIYSDLSKAEEMLPWEYEDVTTVPAEFQSLTQDIGKYNTVMGAKARQLFNGLI ARAYRVRTAILAASPAFQTPSNTSTWEEAAMAAASVIDYNGGIDGLAADGITYYEATVAD AVKDGINPKEILWRENISSGDTNQEGRHLPPSLFGSGNMNPSQNLVDAFPMANGYPIDDV TNSGYDKSKPYDKRDPRLAKYIIYNGSTAGSSNKTIYTSSKSATDDGINTVEQKSTRTGY YMKKRLRMDVNCDPTSTSAKAHYNPRVRYTEMYLDYAEAANEAWGPKNANGNSYSAYDVV KAIRKRAGVGGDNDPYLEKCANDPNEMRKLIRNERRLELCFEGFRFWDLRRWNENLNEAV RGIDWTSDGQSYTTFTVEERAYEDYMRYCPIPYSETLKFSNLVQNKGWK >gi|225935357|gb|ACGA01000035.1| GENE 57 83129 - 85972 2465 947 aa, chain - ## HITS:1 COG:no KEGG:Slin_2103 NR:ns ## KEGG: Slin_2103 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 13 947 42 966 966 798 47.0 0 MNRKFIYIGCTALAMAFLNVRSIVAQEAQDSLVNVAFGKVAKEDLTHAVSTVNTSELSKI LYNGNSLMGLNGMVGGYNGNIWGQGALVLVDGVPRSASNIRATEVESVSVLKDAAAVALY GSRASKGVILITTKRGKNAPMSIDIRGTVAANVPVSYPRYLDSDCYMTLYNEACRNDGIA EKYDASTIYQTTLGTNPYRYPDTDFYSSDYLRKCNVTSGVVGEVYGGNEKTHYYLDFGMN YNNSLLKYGEAKKAYDMSFNVRGNVDMSLASWLKASTNAAVIFNNSYAGRGDFWGAASTL RPNWFAPLLPIDMMDRTVATIDEYITNSNHLIDGQYLLGGTSADTTNPFSELLAAGYTKE KSRMFMFDVVLMADLGNLLKGLSFKTAYSVDYTSYYSEAFAENYAVYQPTWANVNGRDMI VGLTKFGEDKKGTNEYVGKSTYDQTLTFSAQFDYARTFNKYHNVSATLLGWGYQTQSSSD ENNNAIGSSVAGSNYHRTSNVNLGLRTSYNYAHKYYIDFSGALIHSAKLPEGKRNAFSPS VTLGWRLSKEKFMKNVAFVDDLKLTASYAKLHQDLDISDYYMYKGYYLMEQKKTGWYQWH DGTAGGVLSGSIRGGNPELTFITREEFRVGIDAALFNKLLQLNANYFTQTTGGLLTRGAS TIFPSYFNLNDTSSFLPWINNNEDKRSGFDFGVTANKKFGQLDVTLGFNGMVFSSKATKR DEVTEYDYLLAEGHALDAVRGYVCEGFFQSEEDIAKHPRQTFGTVKPGDLKYKDINEDGV IDSKDQIDLGKGGWSTAPFTFGINLKLKYKNFSLFALGNGQTGAIGMKNNSYYWNRGTSK FSEVVWNRWTYKTAETATYPRLTTGNGDNNYRNSTFWMYKTNMFKLANVQFNYDFPESTF NGTFVRGLSLYCGGSNLLTISKERQYMELSTGYPQMRNFYVGFKAAF >gi|225935357|gb|ACGA01000035.1| GENE 58 86005 - 87876 1372 623 aa, chain - ## HITS:1 COG:no KEGG:Slin_2102 NR:ns ## KEGG: Slin_2102 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 
24 623 22 617 617 522 46.0 1e-146 MNKYINKIFLASAIALIGVLTTTSCTDYLDKSPNSDIAENDPYKNFKNFQGFIEELYKRV PIVSNNDYHTCFNYGEEEYWEPQEIRLFARNIDYGDFWGWTTCYYSYPTTRSSGGLARNS HGNLWQDCWYGIRKANIGIANLDKLVNVTEEERQLLEGQLYFFRAWFHFMLMEWWGGMPY IDEVIPSDVTPALPRLTWQVCAEKCVADFDHAIPLLPVDWDQTTVGKVTLGNNNGRINKI MALAYKGKTLLWAGSPLMSWASGGTKEYNAGLCKRAADAFGDALKIVEETGRYELAPMAS YNEIFLVHNSNGKLNGLKEAIFRENLVDYDGRWRWNMINDFRPTSIESTGIKCYPTANYV NYFGMQNGYPIKNMTQADPESGYDPAYPWKNRDPRFYKTIMFDGVKWQSTGGAGGSSELF TNGRESEEKDEKKGCFTGYMNCKLCPQLMNTVDGYKENNIMVLSLMRLADVYLMYSEATA VGYGTPQSKASTFQKTAVDAINEIRRRAGVEDVLDRFTGNTTDFLGEVRRERAVELAFEG FRFMDLRRWMLIGQAPYNKKTKIEFDRGVSADDYNYDKPEENMIKNLREEVLVERKFTDR HYWLPLPKNDVYLYEGFGQNPGW >gi|225935357|gb|ACGA01000035.1| GENE 59 87893 - 91051 2897 1052 aa, chain - ## HITS:1 COG:no KEGG:Slin_2101 NR:ns ## KEGG: Slin_2101 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 20 1052 16 1038 1038 1082 54.0 0 MVMKNVKLQIQKIPSLVLLLIFSCISAYAQHGFVVSGTVVDSNGESIIGASVALKGNKTV GTITDLDGNFKLKVPDDKAVLVVSFIGMTPQEIKVAGKKTIKVTLVDSNVQLEEVIVVGY GQQKKVSVVGAITQTSGKTLERAGGVSSLGAALTGSLPGVITSASTGMPGEEDPQIIIRT QSSWNNSEPLVLVDGIEREMGSVDISSVESISVLKDASATAVYGVKGANGVILITTKRGK EGKASVQIKANITAKVASKLPEKYDAYDTFYLMNNSIEREACLNPSGWANYNPVAIIDKY RHPANQEEWDRYPNTDWEDVLFKKSTMSYNASVNVAGGTKVVSYFAAADFVSEGDLFKTF DNHRGYNSGFGYNRINVRSNLDFQLTPTTKFSTNLFGSNAERTVPWDYGDQDASYWASAY KSAPDAMRPIYSNGMWGYYAPRNADVPNSAYSLATSGIEKRTTTKITTDFIFDQDLKMIT KGLRFKANFSMDYTFKEAKRGVNDMYHDAQRMWVDPDTGLVKYQQDPDTGTGLDYPVNKI YWASQAGSVDVNATYRKLYYSMQLDYARTFGKHEVTALGLFSRLKEAQGPVFPIYREDWV FRLTYNYALRYFFEMNGAYNGSEKFGPEYRFDLFPSFSLGWMLSEEKFMKNNLKFLDMLK FRASWGRVGDDSLVLPWQRFGMENRFLYKDQLSYGGNTLMGDINPANTPYTYYSISKLGN PDISWETVEKRNFGIDYALFNGLIAGSVDIFNDTRTDILVKGSERAIASYFGQSAPDANL GKVNSHGYELELRLNHTFNNGIRTWLNTSMTHAVNKVKFRDDAPLKPTYQRGAGHTIDQT YAYIDHGNLATWDDVIGSSAWTTGNDMKLPGDYNIVDFNGDGIIDANDRAPYQYAKMPQN TYNASIGAEWKGFSIFAQFYGVNNVTREVNFPTFRSTAHVAYVEGEYWTPNGTAVLPTPR WGTTVDPAAAGTRYLYDASYLRLKNVELSYTFNNAKWLKKIGLKTCRIYLNGDNLYMWTD 
MPDDRESNTGYSSSDGAYPTMRRFNLGIDITL >gi|225935357|gb|ACGA01000035.1| GENE 60 91913 - 95893 2562 1326 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 786 1181 285 659 676 136 29.0 3e-31 MRKNLILITGFLLLLAGACYAQQVRSYGSGKLSCSLIRNIVQDAHGFVWIGTENGLNKFD GWTFVNYFHNKRDSTSLLNNLIESLLCDSQGNLWVGSGNGVQLYSPYEDSFKKVRFLDGS KPSVLHLLELDSGEIWAVTAGYGAYSIDRKTMSATSLVRVNDLCSSPYMHNIFQDSRRVI WIALPNGRIARITSDLKEVEFLSSSTDVLGKIYTILEDLKGRVWITAVSGVYLWNEEERL LVKMGQPTNEFIGVRGMVCTGHGELYVNTVNDGLYVVNVEDKMLKPFGGKQELEKDKNYA LMEDKNGNLWLGGLKKGIGMLSNESSQFKFRSLSSFSSSMGSTLSFVYEDKKNRAWIGTI DGKLIRMDGDMQECSLYTVGEGILAMLHDDEGGIWLGSYSGFSRFDERTGSVYKVPLLQG KAIHNIVEGTGQTLYISVSGEGFAEYNRRTGTLKQISDTTRLRTTMRLANNWINKILYDS GGLVWLGHCMGVNCYDPVKQEFLKLECEKDLLSSLCFSLLEDKEGHIWMGTNNGLFEYDK KTLKMNHYGIEEGMPSNMVCGLGQSKNGDIWCSTFNGLCKLNYQERKIANYFSGNGLVDK EYLKSVYFQRECGNIYLGGLHGITKLRPDSVGKQLPLCMPQLTHVYLNDKEVSANTQLGG RVISDHVWMDARQINLTYKNDIFSFEFSTMEYHDRENIKFEYRLAELDGVWRSTSSGENR ITYNYLPPGHYTFEVCVNENERKSPVRSFSIYIAPPWYDTLWAKIGYLLICAGLLFWVFY AWYMRQRRRRQEEMNEEKLKFFINIAHELRSPITLIVSPLAALIKNEQEEGRKKALLTMQ RNANRILNLINQLLDIRKIDKGQMKIECRETEMIGFIEELFQMFDYQAAKRNIHFSFIHT MEYLPVWIDRNNFDKVLMNLLVNAFKYTPDGGEVTLSLTVGEDKDVRGPLSKYVEITVTD SGMGLDEKKLEKIFERFYQASTNSHGFGIGLNLTKMLVELHHGSIFAVNRMDKQGSCFIV RIPLGKGHLTSEELAEEVMLSGEEPIRLILGEETCWEGEEESKVSMSKSKTQWRILVVDD DEEIREYLKFELGIYYKVRTACNGAEAYQIALNQRIDLIISDVVMPEMDGFELLKKTRGN ANVSHIPFVLLTSQAEYDSRLKGWNVGADAFLAKPFQIEELLLICENLIAGRIRLKGRFG MDQEVEEKMKTIEVKANDEYFMERLMRAINENLEDSKFSVEDLAQAVGVSRVQLHRKLKV LTGNTTTEFIRNIRLKQAAKLLKEKKVNISQIAYLVGFTNPTLFSIAFKKFYGCAPSEYV ERKTEE >gi|225935357|gb|ACGA01000035.1| GENE 61 95984 - 96292 129 102 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPYITYIFSLPTFSKLNTCLNLILFWLKFHQLSYFDPTLQRYGTATQKAKTPDTRTRVLC NNKQTFRFIQAIQMLHHRHKQNECNMSDKIGNKYKTGSNFRL >gi|225935357|gb|ACGA01000035.1| GENE 62 96380 - 96829 327 149 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|260172240|ref|ZP_05758652.1| ## NR: gi|260172240|ref|ZP_05758652.1| hypothetical protein BacD2_10282 [Bacteroides sp. D2] # 1 149 1 149 149 277 100.0 2e-73 MKKLFLSLFICSVSLMSHTQESKEVPLNQLSFSRSVFSETQSKVTFKGGKWNSFIKLPAS INQELPNYKFMIINIPQSTVMIRIKFEGGNGLYKEFYQPAVKSELKREINLSLVPFIKDV KDIRIEAAQSIDESGNYHTLHIKYIYLIN >gi|225935357|gb|ACGA01000035.1| GENE 63 96849 - 99386 1718 845 aa, chain + ## HITS:1 COG:no KEGG:BVU_0030 NR:ns ## KEGG: BVU_0030 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 49 844 48 858 864 898 52.0 0 MRKKTSIVIALLIFLTSVSYLKATETFVHFTSKNTGLALVTPQIRSFAIWCDEREHKGVL TAVHNLQTDFGKVTNITPDLTNEKKQDIRIIIGSFDKSPLIRQLVRAGKLDAKELKNKNE KFIITTLHAPLEDLDGEVLLIAGSDKRGTIYGIYELSRQIGVSPWYWWADVPVAHHNSIY IKKGTYTDGEPAVKYRGIFINDEWPCMGNWTTRKFGGFNSKMYVHIYELLLRLKANFLWP AMWSAAFYADDPQNSSLADEMGIIVGTSHHEPMARNHQEYARSRKDYGPWNYQTNKEGID RFFREGIQRMKGKEEVVTIAMRGDGDAPMGPDTDTRLLEDIVKEQRKIIAEITGKPAYKT PQLWALYSEVLEYYDQGMKIPDDVMILLCDDNWGNVRRLPELNGKRHPGGYGMYYHVDLH GAPRAYQWLNMTQIQHMWEQLYLTYTYGVDKMWILNVGDLKPNEFPTDFFLNMAWNPNQF TPDNLNEYTYNFCKQQFGSQYAKEAARILNLYCKYSARITAEMLDHKTYNLANGEFKAVT DEFLALEAHAYRQFTMLPKDMRDAYQELILFPVQAMANLYEMYYAVAMNRKLASDDDPRA NEWADRVVYCFQRDAELCSHYNEHIADGKWKHMMDQTHIGYTSWDEPQNGNIMPKVIQVN ASLYQPGNYEYEEKGGVIVMEAERFAKCTQGEDTQWTIIPDLGRTLSGITLMPYTQSVKG ASLTYQMNFKSDLRHLRLRLILDSTLPFVKEGHSYAISLDGSKEQIINYNSEMTWANCYS KMYPAGAARIIESVVNFPNVNLQKGKHTLSIRPLSPAIVLHKIIIDCGSDETSRLNLQES PYRKL >gi|225935357|gb|ACGA01000035.1| GENE 64 99470 - 103534 2773 1354 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 715 1237 51 550 835 333 35.0 1e-90 MRNIVLLLLMSISILCPLAAQTSWTADNGNGTFTNPLFYDEFSDPDIIRVGEDFYMAGTT MHCMPGLVVLHSKDLVNWKFLSYAFNQLNMGDEFRLSNGKEAYGQGIWAPCIRYHKGKFY IFSNINGHGMQVFISDNPAGPWEHKNMGGNIYDLSVLFDDNDKIYAVYKYDEVKMIEIKP 
DFSGYVEGSERVIIPAGNAMGEGHHIYKIKGKYYIISADYSPMGRMQCARADKIEGPYET AAISVKETMGTERVHWTDNISLGSSVPEMGFKFKISKPSKDNMGCATLHQGGIVDLPNGD WWGVSMLDFRAVGRTTCLSPVTWEDGWPYFGLPGNLGRSPRTWVKPAVNAKVEPHAPYQR SDNFDGKALLPVWQWNHEPDHSKWSLTKGKLRLHTLPAKDFLWAKNTLTQRGIGPVSSAT VTLDASRLKDGDIAGIGLLNIPYAWLGVSRMNKNLVIRWYDQAHNKWMEAASQPLASQPI FLRVTGNFDEDTAWFSYSVDGESFTNIGDSVLMPYQLKTFQGTRYALFAYNSEGKQGGYA DFDNFQIDEPLADRSKNIPTGKIITLFNRGNDTRVFANPHGTLHFRGKGSKEYETEACRF RVHDRGQGRVALEAMNGTGFMTVVGAGLSADVRLTKEESEGSLFQWQDMLHGQCMLLSLK TNRYVGLDPTTGEPYGADWTGASPSRENGTVFIWNDCQPKEDIVIHLQEKGAEVSPTMYG VFFEEINHAGDGGLYAELVQNRSFEELEMPEGYHAEGKTLYPKPVKNHITGEVRPERYRW TTSPVPAWSLQAKDSLAVQMQLTKYQPKFTTAPNNLELTIKDASEPVLLINEGYWGMNLV AEENYLLRVIARTTPEYKGHIAAKLLSEKNEELASAPITINTSGEWNDIKIKIKANGTAA KGKLALVFDAPGKIWLDYVSLFPENTFNQRNNGLRKDVAQMLVGLKPAFVRWPGGCVVEG ISLENRFEWKKTLGDPAARPGEYSTWGYRCSYGFGYYEMLQFCEDINAKAMYVCNVGLGC QFRMGDACSEDKIDFYLNDCLDAIEYALGDKTTEWGARRTADGHPKPFPLQYVEIGNENW GDEYDKRFDIFYKAIKEKYPQLTLISNHGINGTGKISQTDMVDPHWYVAPDFFYQNSTIF DQHPRGKYTVYVGEYACNRGVGGGNMTAALSEAAFISGMERNGDLVKMASYAPLFENRND RSWSTNLIWIDSDQVLGRSSYYVQKMAAENRPTYNIKYNRSVSMDKNSNLTHSSDSIPLQ FISSGYDEETKEIIIKVVNAADVPYSTSFQLDGVTKVDKKGNIITLSATSGKDENSFDEP EKIYPRQIEYNEFNKHFSYEFLPFSYTIFRISAR >gi|225935357|gb|ACGA01000035.1| GENE 65 103583 - 104977 1068 464 aa, chain + ## HITS:1 COG:BH0236 KEGG:ns NR:ns ## COG: BH0236 COG5498 # Protein_GI_number: 15612799 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted glycosyl hydrolase # Organism: Bacillus halodurans # 336 462 793 921 1020 77 35.0 4e-14 MKNLQKVMAISLFSLFLNHVCQAQNPIIQTWFTPDPAPMVHNDTVYLFTDHDEDDAQYFK MKDWQLYSTTDMVNWTYRGTPISTETFQWAIQGDNAWASQAIERDGKWYWYICANDTIKK LHGIGVAVADRPEGPYSDPLKKPLVPGAFGYIDPSVFIDDDGQAYLFWGNNGLWYAKLNK DMVSLGSEVIPVKELNDSIAFGPLVMKRDYQLNKRVLKTNYEEGPWVFKRNGLYYMVYAA GGVPEHMAYSTAKSIHGPWKYQGRIMNEAANSFTIHGGSIEYHNRHFMFYHNGALPNGGG FRRSTCIEEFKFGEDGKIPFIPFTKEGVRTPVRNLNPYQRVEAETMAFSYGLKTDRRAGS QHYITSIHNGDWLKVRSVDFGEKGATNFSACVAGLMQGGSIELRIDALGGSVIGTLEVSP TDGSEQWKVQTTQVNAVKGVHDLYLLFKGGDEELFNVDYWIFNH 
>gi|225935357|gb|ACGA01000035.1| GENE 66 104983 - 106935 1005 650 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3873 NR:ns ## KEGG: Fjoh_3873 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 11 650 9 647 647 903 66.0 0 MSRIILLTGILWLFSQSIHAEQSQITSPDGKLSVHITDENGVPSYSVSYNHIPFILASPL GLNTNIGNFTREMKLSTTSPVQLLNEEYNLPTIKQSNVHYQANRGVFTFSQNGKVIYDVV FQVSNRDVAFKYNMYPQKEILCCVINEEMTGFVLPSGSTTFLCPQSKPMGGFARTSPSYE TSYTVDDAIGKNGWGEGYTFPCLFRNNDKGWVLISETGVSSAYCGSRLIGKPDGLYQIGF PQTGEYNGNGTTNPGIILPGETPWRTITIGETLAPIVETTVAFDVVKPLYEASQEYKYGK GSWSWIIGGDSSCNYDEQIRYIDFSAAMGYQSVLIDALWDKQIGRKGIEKLAEYAKSKGV ALYLWYNSNGYWNDAPQSPRGIMNTTINRRKEMKWMQSIGIRGIKVDFIGSDKQVTMQLY EDILADANDYGLLVIFHGCTLPRGWERMYPNFASSEAVLASENLHFSQGSCDAEAFNACL HPFIRNTVGSMDFGGSTLNKYYNARNDSTTRTSRRITSDVFALATAVLFQSPVQHFALTP NNLTDAPEWAINFMKEVPSTWDEVRFLDGYPGKYIILARRHQNKWYIAGINASSEEIKVK LHLPMISAGETISVYTDNARLVGKTSTEKLNKHQQLQVSIPCNGGIVIAN >gi|225935357|gb|ACGA01000035.1| GENE 67 106960 - 109395 1627 811 aa, chain + ## HITS:1 COG:no KEGG:BVU_2979 NR:ns ## KEGG: BVU_2979 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 2 804 1 815 819 1081 61.0 0 MMNRNTLIFLLLLISNISFGQDLKLWYSQPARNWSEALPIGNSRLGAMVYGGTEREELQL NEETFWAGGPYSNNNSNAKYVLPVVRNLIFDGKNREAQSLVDANFLTKQHGMSYLTLGNL YIDFPGHKDASGFYRDLNLENATTTTRYEVNGVTYTRTTFASFTDNVIIVHIQADKTQAL NFNMTYNCPLEYNVNAQDDKLIITCQGKEQEGIKAAIQAECVVQVKTNGAISPAGKVLQV EKATEATLYIAAATNYVNYQNVSANASERANKFLEKAIQTPYNKALKDHIAFYKKQFDRV RLNLPSSEASKAETPRRIENFNKGEDMAMAALLFQFGRYLLISSSQPGGQPANLQGIWNN STHAPWDSKYTININTEMNYWPAEVANLSETHSPLFSMLKDLSVTGAETAQSMYNCRGWV AHHNTDLWRICGVVDFAAAGMWPSGGAWLAQHIWQHYLFTGDKEFLKEYYPILKGTAQFY MDFLVEHPDYKWLVVAPSVSPEHGPITAGCTMDNQIAFDALHNTLLASRITGETSSFQDS LQQILDKLPPMQIGKHHQLQEWLEDVDNPKDEHRHISHLYGLYPSNQISPYANPELFQAA RNTLLQRGDKATGWSIGWKVNFWARMQDGNHAFQIIKNMIQLLPSDNLAKEYPEGRTYPN MFDAHPPFQIDGNFGYTAGVAEMLLQSHDGAVHLLPALPDAWKEGNVKGLVARGNFTVDM DWKNSQLNKAVIHSKIGGTLRIRSYVPLKGKGLKEANGECPNSLFATTSVRKPLVSKEII 
PQSPELRKVYEYDIATQAGKIYVINMIQDKK >gi|225935357|gb|ACGA01000035.1| GENE 68 109728 - 112313 2163 861 aa, chain - ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 34 846 31 858 882 570 40.0 1e-162 MNTGLKFALGVCSLSLLFSCAQKLPYQDTSLTAEQRAEDLLPRLTLEEKVALMQNASPAI PRLGIKEYDWWNEALHGVGRAGLATVFPQSIGMGASFNDSLLYEVFDAVSDEARVKSRIF SENGVLKRYQGLTFWTPNVNIFRDPRWGRGQETYGEDPYLTGQLGMAVVRGLQGPENGKY DKLHACAKHFAVHSGPEWNRHSFDAENITPRDLWETYLPAFKDLVQKADVKEVMCAYNRF EGEPCCGSNRLLMQILRDEWGYKGIVVSDCGAISDFYRPGTHGTHPDKEHASAGAVLSGT DLECGGEYGSLADAVKAGLIDEKQIDVSLKRLLTARFELGEMDEQPAWAEIPASTLNSKE HQDLALRMARESLVLLQNKNDILPLNTDLKVAVMGPNANDSVMQWGNYNGIPGHTVTLLE AVRSKLPEGQVMYEPGCDRTSREALQSLFSECSINGKPGFSAEYWNNRICEGEVVATDQI STPFHFATTGATTFAPGVEITNFSARYESVFRPSQSGDVAFRFQLDGEVTLMLDGKQVAQ KVYVKNPTSLYTLQAKAGKEYHIEILFKQRNERATLDFDLGKQVEINLNLAVEKVKDADV VLFAGGISPSLEGEEMPVEVPGFKGGDRTDIELPAVQRDLLKALKKAGKKVVFINYSGSA IGLVPESNTCEAILQGWYPGQAGGTAIVDVLFGDYNPAGRLPVTFYKDAGQLPDFEDYSM KGRTYRYMQQQPLFPFGHGLSYTTFTYGEADLSKNTIGDGGTVTLTIPVSNAGQRDGDEV VQVYLRCMADKEGPHYTLRAFKRVHIPAGETKQVTIPLTYESFEWFDTATNTVHPLKGTY ELLYGGSSDKNKLKTIVMNVQ >gi|225935357|gb|ACGA01000035.1| GENE 69 112560 - 113765 768 401 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 7 400 1 401 402 310 47.0 4e-84 MDFPKILKKRKGYIERIKPFMRKSVAKVLTGQRRVGKSFLLYQLIEEILGEKPNANIIYI NLEDFEFSSLQTAQDLHSYIISQSQKKDKNYIFIDEIQDIPGFEKVIRSLLLNEDNDIYI TGSNANMLSGELATYLSGRYIEFKIYSLSYLEFLEFHGLTDNEKSYELYSRYGGLPYLLN LPLEDETVNEYLKSVYSTIVFRDVVSRYKLRNTLFLEKLIQFLSENIGNLFSAKNISDYL KSQHATVSVNQIQNYTEYLSNAFLIHRVERYDLIGKRVFEIGEKYYFENMGIRNIVIGYR ITDKAKILENLVYNHLLYRGYEVKVGYYGDKEIDFIGEKNGEKLYIQVALKIDSEKTAER EFGNLLKIQDNYPKMVITEDNFNGNSYEGIRHCSIRQFLME 
>gi|225935357|gb|ACGA01000035.1| GENE 70 113874 - 114890 1044 338 aa, chain - ## HITS:1 COG:STM1542 KEGG:ns NR:ns ## COG: STM1542 COG1063 # Protein_GI_number: 16764887 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 338 3 340 341 196 34.0 5e-50 MKAVQIVNPSEMKVVELEKPTVGVGEVLVRIKYVGFCGSDLNTFLGRNPMVKLPVIPGHE VGAVIDEIGPNVPAGFEKGMNVTLNPYTNCGKCASCRNGRVNACEHNETLGVQRNGVMCE YATLPWTKIIPAGNISSRDCALIEPMSVGFHAVSRAQVIDNEFVMVIGCGMIGIGAIVRA ALRGATIIAVDLDDEKLELAKKVGASHVINSKSENVHERMQEITEGFGADVVIEAVGSPV TYVMAVDEVGFTGRVVCIGYAKSEVAFQTKYFVQKELDIRGSRNALPADFRAVINYMKEG NCPVEELISKIAKPEGALEAMQEWTANPGKVFRILVEF >gi|225935357|gb|ACGA01000035.1| GENE 71 114915 - 116207 1039 430 aa, chain - ## HITS:1 COG:fucP KEGG:ns NR:ns ## COG: fucP COG0738 # Protein_GI_number: 16130708 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Escherichia coli K12 # 2 415 20 422 438 254 38.0 2e-67 MKKNTYTIPLALVFCLFFLWAISSNLLPTMIRQLMKTCELNTFEASFTETAYWLAYFIFP IPIAMFMKRYSYKAGIIFGLVLAAIGGLLFFPAAILKEYWAYLCIFFIIATGMCFLETAA NPYVTVLGAPETAPRRLNLAQSFNGLGAFIAAMFLSKLILSGTHYTRETLPIDYPGGWQA YIQLETDAMKLPYLILAILLIAIAVVFIFSKLPKIGDEGERASSDKTTSSGKTKEGSQKE KLIDFGVLKHSHLRWGVIAQFFYNGGQTAINSLFLVYCCTYAGLPEDTATTFFGLYMLAF LLGRWIGTGLMVKFRPQDMLLIYALMNILLCGVVMIWGGMIGLYAMLAISFFMSIMYPTQ FSLALKGLGNQTKSGSAFLVMAIVGNACLPQLTAYFMHANEHIYYMAYCVPMICFVFCAY YGWKGYKVID >gi|225935357|gb|ACGA01000035.1| GENE 72 116246 - 117169 753 307 aa, chain - ## HITS:1 COG:no KEGG:BT_3615 NR:ns ## KEGG: BT_3615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 307 1 307 307 563 88.0 1e-159 MDYTIIDAHAHLWLRQDTVVDGLPIRTLENGRSEFMGEIRQMVPPFMIDGVNSAEVFLSN MDYAQVAAAVITQEFIDGIQNDYLSEIVSHYPNRFFVCGMCEFRKPGFLEQAKELIAKGF KAIKIPAQRLLLKEGRVMLNNEEMMQMFHSMEERNVMLSIDLADGATQVSEMEEIIQECP 
RLKIAVGHFGMVTRSGWKEQIRLARHPNVMIESGGITWLFNDEFYPFKGAVKAIREAADL VGMEKLMWGSDYPRTITAITYKMSYDFVVKSSELTEEDKRLFLGENARNFYGFTDLPVLP YIKNMSE >gi|225935357|gb|ACGA01000035.1| GENE 73 117185 - 118117 948 310 aa, chain - ## HITS:1 COG:YMR041c KEGG:ns NR:ns ## COG: YMR041c COG0667 # Protein_GI_number: 6323684 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Saccharomyces cerevisiae # 13 259 14 266 335 120 33.0 3e-27 MQYHEIGKTGMKVSSLSFGASSLGGVFHDLKEKEGIQAVFTAVEAGMNFIDVSPYYGHYK AETVLGKALKDLPRDRYYLSTKVGRYGKDGVNLWDYSAKRATESVYESMERLNIDFIDLI NVHDVEFADLNQVVNETLPALVELREKRVVGHVGITDLQLENLKWVIDHSPSGTVESVLS FCHYCLCDDKLADFLDYFESKEIGVINASPLSMGLLSERGVPAWHPAPKPLVEACRKAME HCKAKNYPVEKLAMQFSVSNPRIATTLFSTTNPENVKKNIAFIEEPIDWELVREVQEIIG EQKRVSWANS >gi|225935357|gb|ACGA01000035.1| GENE 74 118297 - 119331 780 344 aa, chain + ## HITS:1 COG:YPO0108 KEGG:ns NR:ns ## COG: YPO0108 COG1609 # Protein_GI_number: 16120455 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 1 335 1 337 342 173 30.0 6e-43 MENRKHTSLKDLAQALGVSIPTVSRALKDSPEISRELCAKAKALAKEMNYRPNPFAMSLR KNAPRIIGVVVPDIVTHFFASILNGIENMAIANGYFVIITTSHESYEHEKRNIENLVNMR VEGIIACLSQETTDFSHISALKDINMPLILFDRVCLTDQFSSVIADGAQSAQMATQHLLD NGSERVAFIGGANHLDIVKRRKHGYLEALRENRIPIEKELVVCRKIDYEEGKIATETLLS LPQPPDAILAMNDTLAFAAMEVVKNHDLRIPNDVAIIGYTDEQHANYVEPKLSAVSHQTY KMGETACQLLIDQIKGDKTIKQVTIPTHLQIRESSIKKQKCNLL >gi|225935357|gb|ACGA01000035.1| GENE 75 119496 - 120056 484 186 aa, chain - ## HITS:1 COG:STM4397 KEGG:ns NR:ns ## COG: STM4397 COG0545 # Protein_GI_number: 16767643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Salmonella typhimurium LT2 # 80 184 123 220 220 62 34.0 7e-10 MSKKIYLFSLVLLALAFTACSETEETSRYDNWQARSVAFIDSIASVYDSPKNQALDDDDP EKLHAFPDPTNSQTIYVKKIKKGEGTESPKYTSTVSAHYRMSYFNGDVVQQTYTGTEPTE 
FDSPTNFTLNGVISGWSYTLMYMKVGDFWTLYIPYQSGYGSSTNDGNLQAYSALVYNVRL EKIVER >gi|225935357|gb|ACGA01000035.1| GENE 76 120068 - 121609 1699 513 aa, chain - ## HITS:1 COG:SA1394 KEGG:ns NR:ns ## COG: SA1394 COG0423 # Protein_GI_number: 15927145 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 10 502 8 460 463 470 49.0 1e-132 MAQEDVFKKLVSHCKEYGFVFPSSDIYDGLGAVYDYGQMGVELKNNIKKYWWDSMVLLHE NIVGIDSAIFMHPTIWKASGHVDAFNDPLIDNKDSKKRYRADVLIEDQLAKYDDKINKEV AKAAKRFGESFDEAQFRSTNGRVLEHQAKRDALHTRFAKALNDGNLDELRQIIIDEEIVC PISGTKNWTEVRQFNLMFSTEMGSTSEGAMKIYLRPETAQGIFVNYLNVQKTGRMKVPFG IAQIGKAFRNEIVARQFIFRMREFEQMEMQFFVKPGTELDWFKKWKEIRLKWHKALGFGD ASYRYHDHDKLAHYANAATDIEFLMPFGFKEVEGIHSRTNFDLSQHEKFSGKSIKYFDPE LNESYTPYVIETSIGVDRMFLSIMSASYCEEQLENGESRVVLKLPAALAPVKLAVMPLVK KDGLPEKAREVIDSLKFHFHCQYDEKDSIGKRYRRQDAIGTPYCVTVDHQTLEDNCVTLR NRDTMQQERVAISELNNIIADRVSITSLLKTLQ >gi|225935357|gb|ACGA01000035.1| GENE 77 121713 - 122981 653 422 aa, chain - ## HITS:1 COG:MJ1544 KEGG:ns NR:ns ## COG: MJ1544 COG1373 # Protein_GI_number: 15669739 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanococcus jannaschii # 1 420 15 439 441 211 35.0 3e-54 MIRREILKEVLLDNRNEVVKHQVIPRSFQFEEFGNYVFVGIRRAGKSFLLYQRIQELLHN GHTWNEMLYVNFEDERLIGMTAADLNLILEVHAEMSTDRPMLFLDEIQNIDGWDKFVRRL ADNKYRAYVTGSNAKMLSQDVATTLGGRYITVHVYPYDFKEFLYANGVEVTENSLFATES RAEIKRIFNDYFRSGGFPEGASLAAKRDYLTSVYQKIYLGDIAARHSIENTFALRILFRK LAESIKQPVSFTRITNIVASTGVKISKNTVINYMEYAKDACLLIPIQNIADKLVERETNP KYYFADNGILNLLLLDADTSLLENLVAVTLLRRYGTDDAVFFYNRNVEVDFYLPDTGVAI QVCYTMNLSDETFQREVQALVKLNKVFECRQLLIITYDDEERIELEKCAIEVIPVWKWLL GK >gi|225935357|gb|ACGA01000035.1| GENE 78 123063 - 123560 357 165 aa, chain - ## HITS:1 COG:no KEGG:BT_3610 NR:ns ## KEGG: BT_3610 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 156 215 73.0 4e-55 
MSIPYVVRKKADLSSGERKELWYGVPSKMQRRGGVKNKELAVRMAKTTGFHRGQIEGILS ELVESIQDLLSSGHSVTIEGLGTFQTALTSQGCVLPEQVTPGKVAISRVYFVANSKFSRE LKKTKCTRIPFKYYMPESMLTKNMKKADRELEQTEYDIEEPEINE >gi|225935357|gb|ACGA01000035.1| GENE 79 123794 - 124939 613 381 aa, chain + ## HITS:1 COG:YPO4034_1 KEGG:ns NR:ns ## COG: YPO4034_1 COG1609 # Protein_GI_number: 16124154 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 245 7 255 265 82 23.0 2e-15 MIKILLLIDYSSEFDRKLLRGLVQYSKENGPWLFYRLPSYYSTMHGEQGILRWAKEWKAD AIIGQWNNDTIDLQKELNIPVVLQNYHHRSVTYSNLTGDYKGTGRMAAQFFAKRMFRNFA YFGVKGVVWSDERCEGYRQEVKRIGGEFFSFESDKQEDEIRMEVSQWLQQLPKPVALFCC DDAHALFISETCKMTNIPIPEEIALLGVDNDELMCNISDPPISSIELEVERGGYSIGRLI HQQIKKEHEGTFNIVINPIRIELRQSTEKHNIKDPYILEVVKYIESHYGSDLTIESLLAN IPLSRRNFEVKFKNALNTSIYQYILNCRCNHLADLLLTTDRPLADLAMEVGFTDYNNIAR IFKKFKGCSPIEYRQKKTRQR >gi|225935357|gb|ACGA01000035.1| GENE 80 125048 - 126235 775 395 aa, chain + ## HITS:1 COG:no KEGG:BT_3608 NR:ns ## KEGG: BT_3608 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 387 31 383 392 511 69.0 1e-143 MQRHTIRGKTRTIIYPIIIYWGITSIMGCKRLEINQTKRNWEKLSAYFTPPPLYQEKFGT YRNPLLFYNGDTVKNADDWLKRRKEIKDKWLNLIGHWPAIITNQKLEIIKTTEREDFKQH LVRFYWTPLEQTYGYLLEPNKKGKHPAVITVFYEPETAIGWGGKANRDFAYQLTKRGFVT LSLGTRQTTKDKTYSLYYPTINNSTMQPLSVLAYAAANAWEALARVESVDSTRIGIMGHS YGAKWAMFASCLYEKFACTAWSDPGIVFDETKDNYINYWEPWYLGYYPPPWKKIWSNNGN NSSTSVYARLCKEGHDLHELHSLLAPRPFLVSGGYSDNVDRWIPLNHSVAVNRLLGYHHR VAMTNRPKHDPTPESNETIYKFFEWFLKRKTPKED >gi|225935357|gb|ACGA01000035.1| GENE 81 126404 - 129553 2210 1049 aa, chain + ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1049 7 1074 1074 823 43.0 0 MKKIRWNPRLVTSLFITLLMCTLDCLAQEVIKGFVKDGNDEPLPGVSVAVKGATNVGTIT NVDGKYSINAHKNDVLVFSYIGMVSQEVKVSNKTNINITLKEDVSSLDEIVVVGYGTQKR GSLTAAISTVSDKEILKAPTMGISNIIGARVAGISAVQASGQPGADNASLSIRGQSGIIY 
VIDGIRRSASDFNGLDPNEIESVSVLKDASAVAVYGLDANGAFIVTTKKGQTDKVTISYT GTVGISQNAEEQEWLDGPGYAYWYNKARLLQGDTEVFTVDMVRKMREGVDGWGNTNWYDK VYGTGVRQHHNISASGGSEKIRFFTSIGYLEEKGNIDKFKYRRMNLRSNIDAQLAKGLSL SLGVSGRIEKRDAPKFSADPDDWMNIPQQVCYALPYVQDTYEYNGKIYDVSTPTSGSPVA PIASIYDSGYNRSNQSYMQSNFSLKYDTPWFKGLSLKFQGAYDLVHGMTKQLTKPNEVMI MDLPNAATTTLTYHKDYSVLKDTPILSESASRAQEFTTQTSITYDNKFGDHSIGVLLLAE TRERNSNNLGVTGTGLDFIQLDELNQITGFTKEGKEQPAIPSGSSNQTRVAGFVGRLNYN YADKYYLEASLRHDGSYLFGGMNKRWVTLPGLSLAWRINNEKWFHATWVDNLKLRAGIGK TATSGIQPFQWRNTMSTSPNSVVIGGASQTAIYPSVLGNPNLTWAQCLNYNIGVDVTLWN GLLGMELDAFYKYEYDKLSSVTGSHTPSMGGYYFSTANVNKADYKGFDVTFTHQNRIGSF SYGAKLIWSYAYGRWLKYAGDSENTPEYRRLTGKQIGSKMGFIAQGLFQSEEEIANSATM PDRPAYPGYIKYVDRNGDGIITVNQDQGYVGKSSRPTHTGSFNLFGNWKGFDFDILASWG LGSDVALTGVYTATGSSGIQSATAFTRPFYQNGNAPVYLVANSWTPENTNAEFPCLEINP RSLNNGLASTFWYRNGNYLRIKTAQIGYNFPKKWLSPLGVEALRLYVEGYNLLTFSAVSK YNIDPESPAVNNGYYPQQRTYTLGAKITF >gi|225935357|gb|ACGA01000035.1| GENE 82 129566 - 131200 1220 544 aa, chain + ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 544 1 553 553 383 41.0 1e-104 MKKLSINISLFFVLLTLSSCNDWLDEVKQTTTVSDEIIWQDETQVDKYVNSFYPLLHDYG QFGEAQFYGSFTESLTDAFKYGSYTLGHRAGHPNLYVLTPDAISPDNCLYSIWLRSTAYK QIRETNQFLSLQRKYSEFSADRNKLWEAQVRFFRAFVYFQLAKRHGGVILYDDLPVSTNK ARSSAEETWQFIADDLDFAANNLPKEWDAANKGRITKGAAYALKSRAMLYAKRWQDAYDA ANNVIALKLYGLTDNYEDAWKGNNKEAILEFDYDAANGPNHLFDRYYVPQCDGYDNGSTG TPTQEMVECYESKNGEKIDWTPWHGITDETPPYDQLEPRFAATVIYRGCTWKGKKMDCSL DGKNGVFMPYREQGTSYGKTTTGYFLRKLLDETLTDVKNGKSAQPWVEIRYAEVLLNKAE AAYRLNKIGEAQSAMNEVRARVKVNLPGKSSTGEAWFKDYRNERKVELAYEGHLFWDMRR WELAHIEYNNYRTHGFKIIGATNTYEYVDCDGQDRKFIKKLYVLPVPSEELKNNSLIEQY DEWK >gi|225935357|gb|ACGA01000035.1| GENE 83 131220 - 132917 1132 565 aa, chain + ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 564 1 564 565 211 30.0 5e-53 
MKNKIISIIQALVLLWGVTSCHSPEELGSDIQREGINSITASFPNDDSSENSFKGEIDYT NHSITFVFPYNYPKLSDNVLPTSALKRVRLSASLANNVSVTPPLLYMDLTQDNYITVKDQ TSGTSTEFKVIGEIRKSNECSITKFDIPAIGLLGVINENEKTISLVTIDNVGEQIANIDI SHGATCSPNPETEALDYEQEQTIVVTAQNGIDKATYTVKKDIPQKTVAGIRQGSGKLLWS KRLSEISGILLPGKVTGLAVVDKYVVINERANDRAIYLNSQTGEIAGSMDISQFAGDNSN FHATADRGNNILFCSYTPSGGTFTVWKANGVNEKPQKYIEYKTGTNIRFGWKISIQGDLD ANALITTPVFQKDSKVQFARWRVINGTLQSQSPEFVIMSSSLLTSNWIKWADVIYADDTD TQSDYFLASHVTDTSAKRYFYWFKGTDNSIKAANTGAPGNTIINAVDYAVFNKVPYVIYN HVNSFNYAVTGSDAVRMYDLSSGSFDNQIVVCPDKIYGGLENSGQNTEGTGDVVFKVAKN GYYLYVYLVFSNGGIACYQYDCIDM >gi|225935357|gb|ACGA01000035.1| GENE 84 132959 - 134527 956 522 aa, chain + ## HITS:1 COG:no KEGG:Thit_2257 NR:ns ## KEGG: Thit_2257 # Name: not_defined # Def: copper amine oxidase domain protein # Organism: T.italicus # Pathway: not_defined # 300 503 118 336 452 94 33.0 1e-17 MKLKNLWLIAVAILIYACNAEETRSWEVTFDPNKPKDPNEQSIINVAAGKFYKMNVVANS AYADKAPLDPWTSGTKLTDEETGTLSTKNRGVGWDAQTVEVIIDLGSLRNITEVSVHAIS DPTSQIVFPAQIEVSTSKDKSQWEKAASPITYSDSNGKSDAWGKTDFSNVACRYVKATLK SSASTSTMMLIDEIKVMGEFHNDMKYVPEKGCYHGAFPPLYGFDPEDREGSTDQCAVALF EKLVGKQLSMILWYQNMEPGRNFSEMQTVREKYWGKNYQGKYRFFLYGWLPVIPTQQMAN GELDDFHKSYFAEVAAQKVRDMGPIWFRPANEMNGSWTPYYGDPTNYVKAWRRMYNIAEQ LGVTAYNVFVWSPNSVSMPGTEANAMKNYYPGDMYVDWLGVSCYPPSLSATYPEERRYPL TLMQGIKQVSADKPIMISEGGYSSTCDHQRWVREWFKLKDEEPRVKAVVWENHENAENGD RRLQSDPLALELYKELVQDPYWLDLIPDAVYSEIETRKNNSK >gi|225935357|gb|ACGA01000035.1| GENE 85 134551 - 136755 1566 734 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 29 724 29 705 793 416 36.0 1e-115 MKRLTPYSWLILIIFSSCLTGNKVASTDKQIESLLSQMTLEEKIGMLHSNTMFSSTGVPR LGIPDLHYSDGPHGVRFEGVANGWESARWDNDACSYLPALSALASTWNRDLAQLYGEVLG AECKARGKHVSLAPGVNIHRSLLNGRNWEYFSEDPFFSGELAVPYIQGVQSQGVASCVKH FALNSQAYNQYKVSVEVDERTLHEIYLPAFEAAIQRGGAMAAMAAYNKVRGLWCTESPYL 
LDTLLRDELGFDGLVVSDWNAVHNTERTALCGMDVEMGTSIKENGKYAFNKYYLADPLLK KVRNGEIPEEAVNKKVRNILKLMIRLDLIGQAPYDTTGMAAKLAIPAHTKAAREIAEESL VLLKNSKDMLPLDPAQYKNVAVIGANATEVFAAGGGSTKLKAKYEVTPLEGLQNLLDGKA RIEYAPGYQLSKKAYKVGHWFTNEFDKSDEELYKKAINTAAQAELVIYIGGTSHEHGSDC EGYDKPNLKLPYQQDRLLKGILEVNPNTVVVLISGGPVEIGEWYNDATALLHGSFLGMEG GNALARTLFGEVNPSGKLTTTWCKRLEDMPDHVFGEYPGINDTVRFKEGLMVGYRYFDTY RVVPLFEFGYGLSYTTFTYSNIKMKPVWKDSDTEFAVSFTVTNTGKRYGQEIAQLYLHQN KCSVERPFKELKGFTKVGLKPGESKQVTIKLPRRALQYYDTESKRWKDEPGMFTVLIGAS SRDIKLQKDFELKK >gi|225935357|gb|ACGA01000035.1| GENE 86 136762 - 138078 1079 438 aa, chain + ## HITS:1 COG:XF1462 KEGG:ns NR:ns ## COG: XF1462 COG0738 # Protein_GI_number: 15838063 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Xylella fastidiosa 9a5c # 59 429 3 360 377 241 37.0 2e-63 MGNNNLISSKKTYYISIAILAGMFFIFGFVSWVNAILIPYFRISCELTHFESYFVAFAFY IAYFVMAIPSGVLLKKVGFKRGIMYGFMLTALGAFLFVPAALARQFEIFLAGLFSIGTGL AILQTAANPYVTIIGPIDSAARRISIMGICNKFAGIVSPLIFAALILNVTDKELFATIES GTLDIVTKNAMLDELIQRVIVPYAVLGVILLFTGIGIRYSILPEINTDEENSTEDTGSHH HTRKSIFDFPYLILGAVAIFLHVGTQVIAIDTIINYANSMGMDLLEAKTFPSYTLACTMI GYLLGILLIPKYVSQKNALIGCTIIGLLLSFGVVFADFEVTLFGHHANASIFFLNALGFP NALIYAGIWPLSIHGLGKFTKTGSSLLIMGLCGNAILPLIYGHLADMYSLRFGYWVLIPC FLYLVFFAVKGHKIDSWK >gi|225935357|gb|ACGA01000035.1| GENE 87 138096 - 139283 910 395 aa, chain + ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 391 1 389 393 218 35.0 1e-56 MERLIITNGKLILPTGIEMGQTLICENGKIEQIIPNGSYQPLVGDKVIDARQNYVSPGFI DMHIHGGGGHDFMDGTVEAFLGVAETHAKYGTTAMVPTTLTSTIEELMTTFTVYRKAKEM NINGSQFIGLHLEGPYFSPKQCGAQAPNFLKKPQAEEYNAILEASKDIIRWSVAPELEGA LALGQTLQQHHILPSIAHTDAIYEEVEKAFTAGYTHVTHLYSAMSSVTRKNAFRYAGVVE ATYLIEDMTVEIIADGIHLPKPLLQFVYKFKGVDKTALCTDAMRGAGMPDGESILGSLNN GQKVIIEDGVAKMPDRKAFAGSVATTDRLVRTMIYLAGVPLIDAVRMMSLTPARILHIDK 
EKGSLEIGKDADIVIFDNQININNTILKGHVIYTK >gi|225935357|gb|ACGA01000035.1| GENE 88 139264 - 140055 531 263 aa, chain + ## HITS:1 COG:all0727 KEGG:ns NR:ns ## COG: all0727 COG0363 # Protein_GI_number: 17228222 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Nostoc sp. PCC 7120 # 10 258 5 253 258 213 43.0 2e-55 MLSILNETKTKVFQQEQLTVRIFPSIQEMGSAAAKEVGDQICRLLESKPEINMIFAAAPS QNEFLSHLIHDKRIDWTRINAFHMDEYIGIHPEAPQSFGHFLRIRIFDKVPFKKVNYLNG LAENLEEECQRYADLLTKHPVDIVCLGIGENGHIAFNDPDVADFNDPKLVKIVELDPICR QQQVNEKCFKTLDLVPKEALTLTIPALLKAEWMFCIVPFKNKAQAVYQTVYGEVSEKCPA SILRRKENSSLYLDPKSAERINL >gi|225935357|gb|ACGA01000035.1| GENE 89 140062 - 141066 488 334 aa, chain + ## HITS:1 COG:no KEGG:BT_3586 NR:ns ## KEGG: BT_3586 # Name: not_defined # Def: putative dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 1 334 334 561 82.0 1e-158 MKNRISLLILLAISFSIHAQKVMRIGIIGLDTSHSTAFTELINSGSDETFSKGFRVVAAY PYGSKTIQSSYERIPGYIEKVKANGVEITSSIADLLEKVDCVLLETNDGRLHLEQAVEVF KSGKICYIDKPIGATLGDAIAIYEMAEKYNAPVFSSSALRFTPQNQKLRNGEFGKILGAD CYSPHKVEPTHPDFGFYGIHGVETLYTIMGTGCESVNRMSSDRGDVVVGRWKDGRIGTFR AIIKGPQIYGGTAYTSKGAVAAGGYQGYKALLEQILKYFQTGISPISKEETIEIFTFMKA SNMSKAENGRIVTLEEAYQKGLKDARKLIKTYKK >gi|225935357|gb|ACGA01000035.1| GENE 90 141071 - 142480 936 469 aa, chain + ## HITS:1 COG:no KEGG:BT_3585 NR:ns ## KEGG: BT_3585 # Name: not_defined # Def: putative oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 466 1 466 468 691 71.0 0 MKSIYKGTAIIAILFIVVSCSFSKKQAANRMETNDLPQLEIVVLDPGHFHASLLQKETLT DVSDTIRIYAPEGIAVNQYLESIDSYNQRAESPTTWKKQVYTGDDYLQKMLADHKGNIVV LAGNNQKKTRYIMESIKAGYHVLADKPLAINPQDFKLLTEAYQLAKEKNLLLYDLMTERY DILNIIEKELLHQTELFGDLQKGSPDNPSVIMESVHHFFKTVSGKPLIRPAWYYDVEQQG EGIADVTTHLIDLINWQCFPDKTIHYQSDVTVNAAKHWPTPITLAEFSQSTQVDSFPAYL NRYIKNDVLEVMANGSLNYTVKGICIGMKVTWNYTPPTNGGDTFTSIKKGSKATLKIVQD EKNGFVKELYIQKEPDIDNRTFEAQLQKTVEQLQITYPFLSVKNKKNGTYLIDIPQEKRL 
GHEEHFSKVAKAFLHYVHNQDMPEWENENTLAKYYITTTAVEMAKKGNK >gi|225935357|gb|ACGA01000035.1| GENE 91 142532 - 143302 567 256 aa, chain + ## HITS:1 COG:BMEII0967 KEGG:ns NR:ns ## COG: BMEII0967 COG1477 # Protein_GI_number: 17989312 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Brucella melitensis # 34 251 58 309 323 77 24.0 3e-14 MFHGFIPHIMGTRFDILLIHSDIDRLNTLWADIAYELERLDKILNRFDPHSEVSKINNHA SQSKIQISKELKSILQLCSYYYQTTSHLFDITLKDFSRIQLHEHSYISFTSPDISLDFGG FAKGYALKKIRKFIEQENISDAFANFGNSSILGIGHHPYGDCWKVSFLNPYNQSLLKEFD LQDTALSTSGNTLQYTGHIMNPLTGTFNEQRKASSILSPDPLEAEVLSTVWMIANEQEQK KISEKFNNIQATLYNL >gi|225935357|gb|ACGA01000035.1| GENE 92 143304 - 144629 897 441 aa, chain + ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 53 198 5 150 340 102 38.0 2e-21 MKEHKEKGVDLGLRQFIKSLGYVAGGTALLATTPWLTSCTPEKLQEIKHEKARIALIGTG SRGQYHIHNLKEIPHAQIVAVCDNYAPNLQQALELCPDAKSYTDYRKLLESKDIDGVIIS TPLNWHAPIVLDALAAGKHVFCEKAMARTLDECKAIYDTYNQSDKVLYFCMQRMYDEKYI KGMQMIHSGLIGDVVGLRCHWFRNADWRRPVPSPELERKINWRLYKESSGGLMTELACHQ LEVCNWAAQKMPESIMGMGDIVYWKDGREVYDSVNVTYRYSDGVKIAYESLISNKFNGME DQILGSKGTMEMAKGIYYLEEDHSTSGIRQLIDQVKDKVFAAIPTAGPSWRPETKMEYTP HFIIDGDIHVNSGLSMIGADKDGSDIILSSFCQSCITGEKAQNVVEEAYCSTVLCLLGNQ AMDEQRHILFPDEYKIPYMKF >gi|225935357|gb|ACGA01000035.1| GENE 93 144633 - 144872 138 79 aa, chain + ## HITS:1 COG:no KEGG:BT_3582 NR:ns ## KEGG: BT_3582 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 80 70.0 2e-14 MKDIIITSKKLKQERNIYLLSFLLAFIINVIAIIIYSRPWIEIISQIGYVIVISFFIYLI LWIPRGILIFLSHLFRRKK >gi|225935357|gb|ACGA01000035.1| GENE 94 144960 - 146303 955 447 aa, chain + ## HITS:1 COG:MK0248 KEGG:ns NR:ns ## COG: MK0248 COG0673 # Protein_GI_number: 20093688 # Func_class: R General 
function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Methanopyrus kandleri AV19 # 42 189 3 144 317 82 32.0 2e-15 MTTRRDFIKKTVAGTAALSLGSILPGFGSSRYQDILGANEKIRIGVIGVNSRGKALAQGF AKLPDCEVTYICDVDSRALEKCQAAIHKITGRTPKGEKDIRKMLESNDFEAVVIATPDHW HAKAAIMAMQAGKHVYLEKPTSHNPAENEMLVRAALKYNRIVQVGNQRRSFPNVIKAMEE IKSGSIGKVRYAKSWYVNNRPSIGTGKVVPVPDYLDWDLWQGPAPRVADFKDNFIHYNWH WFWNWGTGEALNNGTHFVDILRWGLGVDYPTKVDSIGGRYRFQDDWQTPDTQLITFQFGD EASFSWEGRSCNTMPVDGYGVGTAFYGENGTLFIGGGNEYKIADIKGKTIKEVKSDLKFE TGNLLNPSEKLDAFHFRNWFDAIRKGTKLNSGIVDACISTQLVQLGNIAQRVGHSLQIDP GSGRILNDLEANKLWGREYEKGWEIRV >gi|225935357|gb|ACGA01000035.1| GENE 95 146403 - 146990 463 195 aa, chain - ## HITS:1 COG:mll5456 KEGG:ns NR:ns ## COG: mll5456 COG1595 # Protein_GI_number: 13474550 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 8 180 6 172 182 70 26.0 2e-12 MNKTDLSSATEEQLLLQRLREGDMGSYETLFHRYYPTFFAFAKGMLKDAGAAEDIIQNVF MKIWIHREALDETMSIKNYIYVLSKREVFNYLRAKYNTHVVLTEDMMTLERPSSIDEPTT DYRELREAVQSVINTMPPKRRSVFCLSRFKSLTNQEIADKLGISIRTVEKHIELALRTFK EQLGSFFALFVGWFL >gi|225935357|gb|ACGA01000035.1| GENE 96 147292 - 150636 2263 1114 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 169 6 181 394 74 29.0 2e-11 MRNFLLLFLLLMPVIGSCTDDYDDSAAWKDIDGIYKDLDQLKEKLNSLQLQANALSQIVK GGAITSVTEAANGGYVISYKGSDNIEHSFTIATTDQMVSSPIIGIQEEAGTYYWTTTTKG QTTFLLDANKQKIPVSGSAPQIRVDENGYWIINGQQILDSNQKPIKAEGKTTSLITKVEM NDNGTASITLGNGETLSVNTFTLFNVEFKNTDQTAISPIIIEEGTKNLTLNYNIIGKKAA QALMLITRNDDGLEARLNSSNKTLVVTFADDFEEGVTMIMLYDTEDNVLIKPMRFTLPII ENGGIATATDFKAFIDAVTSGSSLRKFKDTEGNVILLNDIDMKDITLTSGAGSNVTSNTT NANTKVVYTIGEQTFNDVFDGKGHSVINLTFTYNLEDGNIAHGLFNALGSSGVIRNLVIS GNATITGKAPQGAAIGGLVGYCEGSILACTNQINLSFEGTDAANVGVRMGGLAGVLYGNK IGDTTQANGCSNEGNLTCSNIVNTASGAYSAFNQGGIAGYIENDEAYIGYAINKGNISAP 
SGRGGGIAGTLQEGIIENSTNEGVIQDDVNGVFASTSKRYNVKRIGGLAGGINTDKYLKN CINNGNVYSQNGSRAGGFVGHNAGFVQSCTNNGIILSDATADGANKHGAGWACGYSGTKN GTNYITDCHIGGKVGDYSIYKNNPEDTPGATYSNAVRHGAFSKEANNFSNQDEAYYDWQV TEDRELASGIVYKHYSFINFNQNIYAIEIDMNNPKVTFETVMADEICPNPNGNNNSNNGK VLRETLSETCTRRRDEGRNIIVGINTGFFNSHDGFPRGMHIEEGEPVFINNPYVRSILTN HVWGFTFFDNRTVSFEKRDFTGKLKVGTKEYEYYSVNDTIVRLSGKPSYDANLYTFRYVK EPHPGLTNPIGTKALFIIGKNNQPLKVNSGDFEATITKIIDGRGTTVEAPYVTDKNEWVL QVTGDKADELVQNLKTGDKVQISAELKIGSSTNPIKVHNSSMYRYVYNGVYSAPPKKEDA ETINPTTNLGMTQDKSKIVIFCVDGRTDSDRGLDFYEAYRVCKKLGLYDVIRFDGGGSTV MWTYENGIGKVINHVSDTKGERSCMNYLHVRVLE >gi|225935357|gb|ACGA01000035.1| GENE 97 150649 - 153087 2023 812 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 206 19 214 394 69 27.0 6e-10 MLITEYMKKFISISLFISACFLGSCSSYDDERIWQDLNEIEKTIGDYEAQAETLTKQMTS LSKVIGSSFITLISQDADGNHVISYSDGGGETHTVTIATQEDAIKLPIITAKQDTDSKFY WAQTTDKGKTYTFILDGDGKKFPIGGTMPDVKINENGYWSVNGASTGVLANDLSNLLFKS AYIDDKTGEAVFILADGQELRMSLQEALGIRFNSPVYNAVTDYATPVSIPYEIYGTQSEN AYVDLFTAYNMEVKIDKASSTLIATMKEGATEGNILLLASAGNNTVLKPIYFTYGTAILD DPLYQGHVGPIQLKGTQMDIEMQISANIFYQVSTENEWITYKGTRALITTTHAFTILANE TGDERSGKIIFSNSLYNISSSIDVIQEAKEVEAKGGISTAADLVNFAKAVNNGTSTSRWQ NDAGEIVLLNDIDMSSVTSWTPIGDIDASNYTTAEPYVSIHPFTGTFNGQGHAIKNLNCS ADITNGGLAYGLFGSIENATIKNLVLGDASTTITWMMSGTASKYTVIAPLVCFAKKSVIE GCTNYYNIDFTADNKSGEFNALSGLVGTIANTTIGGESKAQGCSNRGFVRTGRISNTANG GTGMQTAGICAFMAKAEGGKLNYCTNYGDISCPSGRTGGIVATLMYGNIYNCDNRGTIED DKVGQHEGKEASVTYNYKRMGGIVGGTDDLKTKPEYTVESCTNYGNVMTHLSVRTGGIIG HSNIQIIGCVNKGAVLGDVFTEGNGTNRHGPGWLCGYSGASTATWTNCKACVCGGYVGDY SKYKDDPTSAPDATNQNAFCHANQNFDPSINF >gi|225935357|gb|ACGA01000035.1| GENE 98 153100 - 153981 738 293 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 225 394 86 28.0 1e-15 
MKRIYFVCSVLFSLLLTSCGDYDDSSIQNKLNDFKERIAALQTKADKLNEDISKLGYLTE GNVITSVSRNSDGQYVITYKDNNNEEKAVVVATQEDVIEAPILGVRLNDDDQLYYWTTTI GNETNWLTDDTEKKVPVCGYTPEMGVNADGYWTVNGEILKDNKGTPITATTDETAIFKNI TKTDEGYLKITLGNGETLTLEVFSSLNLRLKANAVTKITDLSSPLKIEYEVTGASAEEAL VTIAQAVNVKATIDKETHTLTVIFENNFDEGHVIITAYDLQHLVLRPLLFKKN >gi|225935357|gb|ACGA01000035.1| GENE 99 153995 - 154933 769 312 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 48 306 36 294 306 117 29.0 3e-26 MNMKKRILFTGLLLLLTLGTFAVTPVAPDKTHAQKIVEAIHNPKTDYVVVVSHRGDWRDY PENSIPAIESIIRMGVDVMELDLKLTADSVLVLCHDGTIDRTTTGKGRVSDVTYDYIKSC FLKTGHNCPTKYKMPTLKEALAVCKDRIVVNIDQGYQYYDLVMAITEELGVTEQILIKGK KSVDFVDAQAKKYKHAMMYMPIIDINKPSGQDLFRQYMDKKIIPLAYEVCWQQETSEVKK CMKDILAQGSKIWVNTLWASLCGGEEAGMYDDYAFEHGAEVYQKVLDLGTSIIQTDRPEL LISYLKKIGRHN >gi|225935357|gb|ACGA01000035.1| GENE 100 154973 - 156055 547 360 aa, chain + ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 26 356 24 356 358 181 32.0 3e-44 MRNFRLLLLASLLAPTILWGQSIKDIRYVDASQLTLVGKALPTPHLYHRIDTVAFKGFSK SENQQARCSAGLAVVFRTNSPQIDLLPSYKWEYRKDNVTGIAAAGFDLYIRQNNEWIYAN SLAPAKRNEAFTLMYGMEPKEKECLLYLPMYSELESLKIGIQPGSSIETIPNPFGQKVVF FGSSFTQGIGASRPGMSYPLQIERNTNLHVCNLGFSGNAKLQSYFAEVIAATEADAFVFD VFSNPDAMQIKERLQAFVDIIVAKHPKTPLIFVQTIQRGNEAFNTLIRARESDKLEIVET LMKEIIRKYPNIYLIGNPLPSPENRDTCTDGTHPSDLGYYFWAKNLEKKIIEILNKNKCL >gi|225935357|gb|ACGA01000035.1| GENE 101 156179 - 157165 576 328 aa, chain + ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 126 279 127 278 
323 66 29.0 7e-11 MKITNDILKDYFYDKCSHEEEIEIQKWLTENDNSLVDQSFREIISEIQMENHELSQKAFE KFQKATQPQQVRTPRPGLQRSIRWMQRVAAVLFIPLLFLAGYLLLDKKANTHWNDITVPH GMHQTLTLSDGTTLHVNSGTRVIYPSDFTENKREIFVSGELFADVAKDPDKPFIISAGDV HVQVLGTKFNLRAYENIETVEVALVEGSVLFKTPTHPNEILKKGEMIQYNRTSQKIVRDT FLANLYKCPAKNEGFYFSNLPLNDIVKELEYYFDTRIIILDQKLGDSTYIAYFTNGETLD EILSNLNTDGQMSITRSQGVILITSAAP >gi|225935357|gb|ACGA01000035.1| GENE 102 157189 - 160731 2420 1180 aa, chain + ## HITS:1 COG:CC1623 KEGG:ns NR:ns ## COG: CC1623 COG1629 # Protein_GI_number: 16125870 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 196 330 50 202 970 61 37.0 1e-08 MKQTNIYKKYISTIWFACVMIFFSQPILAQDITLQLKNVTVKEAIEALHKTKNYSVVIKS AEINMSKKVSVNATNAPIKAVLDQIFVGQNVSYAINGHSIIISKKSDTSQQKPGEKKKQT ITGIVYDEEGNPVIGASVMNKETAQGAITDLDGKFSLEAFIPSTIEISYVGYEMATVNVK DNQTKTIRLIPSSLMIDEVVVVGYGSQRRSNLTGAVSTISSKDLNNRPVVSAANALQGAD PSVNLTFGTGSPESGYSLNIRGGISVNGGTPLVLCDGVEVPLNQVNANDIESISVLKDAS SCAIYGAKASAGVVLITTKSGSAATKGKAKISYNGRFGWTQNTTSTDFIRTGYDYVTFAN KFYHAYNGVNMYLYEDEELQKLYDRRNDMTENPERPWVEIGEDGKYYYYGNTDWYGHFYN RTRPQMEHNVSITGGGEKVNYYISGRYYQQYGMFNIDKDLYKDYSFRAKMDAQLNKWIKW STNIGLDNNNYRYNGTSNYAMTIARLESNISPSFVPFNPDGTIVQYTNQLYANSPLGAGD GGYLTSQRGHNTKSRTLLSVVNQIDITLLEGLTLTANYSYQQRKQLYRYRNNSFEYSRSQ GVTQTFTSGSIFNNYEEDESFPVTHMLNYYATFEHSWAKKHNLKVVAGSQYETYRNVNKD TSMTNLSNDNLDSFSAVTPESVLTVSQDISAYKTLGFFGRINYDYMGKYLLEVSCRADGS SRFAEGDRWGIFPSVSAGWRISEENFFKPVSDWWSSLKLRASVGSLGNQQVDYYAYLQTI TSDNQFSYTFDGEGKAYYAKISNPISSGLTWETVTTYNVGLDMSFLRNRLSMSADYYVRK TTDMLTTSLTLPDVYGASTPKANCADLRTNGWELSVSWNDSFKVANKPFRYGIQATLGDY QRTITKYNNPDKLISDHYVGKKMGEIWGYHVDGLFKTDKEAAEYQAKINDKAVNGRVYSS KVDGYLRAGDVRFADLNGDNVIGAGAGTVDDPGDKRIIGNTTPRYNYSFRLDASWNGFDV SAFFQGIGKRDWYPSSSSSSQGANSFWGPYSFPSTSFIEKSFPEDCWTEDNRNAFFPRIR GYQSYSGGSLGTVNDRYIQNIAYLRFKNLSIGYTLPINKRFFEKVRVYVSGENLYYWSPL KKHNKTIDPELAISSSTYSSNTGSGYAYPRVYTVGVDITF >gi|225935357|gb|ACGA01000035.1| GENE 103 160754 - 162505 1557 583 aa, chain + ## HITS:1 
COG:no KEGG:Slin_4978 NR:ns ## KEGG: Slin_4978 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 4 578 1 576 576 468 44.0 1e-130 MKAINKIIILGMVTTLFSSCDLTLLPENAVTPENYFQNKSDLELWTNQFYTLLDEPDASA GTNADDMIDKGMGQVIEGTRSAASETGWSWSKLRHINYFLQHSSNCDDETARSQYNGVAQ FFRAYFYFVKVRRYGDVPWYDQVLGSEDQELLAKARDSREFVMDRVLKDFEDAATSLPTK STDTRNTRVTKWAALAFASQAALYEGTYRKYHGLDNYEKYLEIAASTARQFIDESGFSLY KEGTEPYRDMFCADNAKTTEVVLARAYNFEGLQLSHSVQFSIANLQMGFTRRFMNHYLMA DGTRFTDKQGYETMFYTDEVKNRDPRLQQTVLCPNYIQKGETTVTANDLTAYCGYRPIKF VGTKDHDGAAKSTSDWPLMRAAEVYLNYAEAKAELGTLKQEDLDISINKIRERAKMPDLN LTDANSNPDPYLAACYPNVEQGTNKGVILEIRRERTIELVMEGLRQWDLFRWKEGKQMFN HYVPYYGIYVPGVGTYDMDGDGKPDLEIYETTATSQCDNKKKLDKDIYLSNGTSGYIIGF PKVTYGKDWKEERDYLWPIPADQRVLTQGILTQNPGWEDGLSY >gi|225935357|gb|ACGA01000035.1| GENE 104 162450 - 162608 66 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVVGFDSLVKGDSNGKVYRLSSDTPSDEEINKELVSSLVHLPNPDFGLECLV >gi|225935357|gb|ACGA01000035.1| GENE 105 162607 - 163935 731 442 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 23 439 23 440 459 356 43.0 5e-98 MLKQLIKFYHISTPATPLKKDKKKLKRLQWATFLSATAGYGIYYVCRLSLNVVKKPIVEE GIFSETELGIIGAVLFFTYAIGKFTNGFLADRSNIRRFMSTGLLITALANLCLGFTHSFI LFAILWGISGWFQSMGAASCVVGLSRWFENKKRGSFYGFWSASHNIGEAMTFIIVASIVS ALGWRYGFIGAGSIGIIGALIVWRFFHDSPESEGLPAVNHPQMQSETGDAADFNKAQRQA LMMPAIWILAISSALMYVSRYAVNSWGVFYLETQKGYSTLDASFIISIGSVCGIIGTMFS GVISDKFFSGRRNAPALIFGLMNVMALCLFLLVPGVHFWVDALSMVLFGTAIGVLLCFLG GLMAVDIAPRNASGAALGIVGIASYIGAGIQDIMSGILIGGHKSIVDGKEIYDFSYINIF WIGAALLSVFFALLVWNVKSKD >gi|225935357|gb|ACGA01000035.1| GENE 106 164071 - 166716 2354 881 aa, chain + ## HITS:1 COG:BB0035 KEGG:ns NR:ns ## COG: BB0035 COG0188 # Protein_GI_number: 15594381 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), 
A subunit # Organism: Borrelia burgdorferi # 37 615 11 583 626 437 42.0 1e-122 MSDEINEITEGHSDYKPADTRDENVKHQLTGMYQNWFLDYASYVILERAVPHINDGLKPV QRRILHSMKRLDDGRYNKVANIVGHTMQFHPHGDASIGDALVQLGQKDLLIDCQGNWGNI LTGDGAAAPRYIEARLSKFALDVVFNPKTTEWKLSYDGRNKEPVTLPVKFPLLLAQGVEG IAVGLSSKILPHNFNELCDASISYLRGEEFQLYPDFQTGGSIDVAKYNDGERGGAVKVRA KINKIDNKTLAITEIPYGKTTSTVIDSILKAVDKGKIKIRKVDDNTAANVEILVHLAPGT SSDKTIDALYAFTDCEVSISPNCCIIDDSKPHFLTVSKVLKKSADNTMDLLKQELEIKKG EILESLHFSSLEKIFIEERIYKDKEFEQSKDMDAACAHIDERLTPYYPTFIREVTKEDIL KLMEIKMGRILKFNSDKADELIAKMKEEIAEIDNHLAHIVDYTVNWYQMLKNKYGKNFPR HTELRNFDTIEAAKVVEANEKLYINREEGFIGTTLKKDEFVACCSDIDDVIIFFRDGKYI VTPVADKKFVGKNVLYVNVFKKNDKRTIYNVAYRDGKEGTTYVKRFAVTSVVRDREYDVT LGTPDSRITYFSANPNGEAEIIKVTLKPNPRVRRIIFEHDFSEVSIKGRQARGIILTRLP VHKISLKQKGGSTLGGRKVWFDRDILRLNYDGRGEYLGEFQSDDSILVVLNNGDFYTTNF DLSNHYEDNVSIVEKFDPHKVWTAALYDADQQNYPYLKRFCFEASNRKQNYLGENKNNRL ILLTDEYYPRLEVIFGGHDSFRDPLDIDADEFIAVKGFKAKGKRITTYAVDTINELEPTR FPEPAQEQQESPEEEPENLDPDNDKSEGDIIDEITGQMKLF >gi|225935357|gb|ACGA01000035.1| GENE 107 166716 - 167576 544 286 aa, chain + ## HITS:1 COG:no KEGG:BT_3578 NR:ns ## KEGG: BT_3578 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 286 1 286 286 501 92.0 1e-141 MKKKLIYLGLTGCILFALCTRLQAQIDSLQTHRYVTRATMYGVGFTNVFDTYLSPQEYKG IDFRISRETIRMTKLFDGNVSVQNFFQADIGYTHNRADNNNTFSGLVNWNYGLHYQFRLT ENFKLLAGGLIDANGGFVYNLRNTNNPASARAYVNLDASGMAIWHLKIKRYPMVLRYQVN LPVMGVMFSPHYGQSYYEIFSLGNSSGVIKFTSLHNQPSLRQMLSVDLPIGYTKMRFSYL ADLQQSNVNNIKTHTYSHVFMVGFVKDLYRIRNKKRTALPSSVRAY >gi|225935357|gb|ACGA01000035.1| GENE 108 167644 - 168657 743 337 aa, chain + ## HITS:1 COG:no KEGG:BT_3577 NR:ns ## KEGG: BT_3577 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 337 16 340 340 574 83.0 1e-162 MLWLLCCLPVLTGCIGEEEYANDPVGNFEQLWKIIDEQYCFLEAKGIDWDAVHDQYSPLI IPTMSNDDLFDILSQMLYILKDGHVNLSSAKRTSFYDEWYQGYDWNYREDILYQTYLGSA STGYYTSAGLKYKIFDNNIGYIRYESFSSGVGDGNLDEVLLYLATCNGLIIDVRDNGGGN 
LTNSTRIAARFTNEKALTGYIQHKTGPGHNDFSKMEPIYLEPSNSIRWQKKVIILTNRRC YSATNDFVNAMRSLNKDNEDKRIIQLGDWTGGGSGLPFSSELPNGWSIRFSASPHFDKNK QPLEEGIEPNIAINMSESDQLKHKDTLIEKAFEILSE >gi|225935357|gb|ACGA01000035.1| GENE 109 168871 - 169761 790 296 aa, chain + ## HITS:1 COG:TM0415 KEGG:ns NR:ns ## COG: TM0415 COG0524 # Protein_GI_number: 15643181 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 9 285 5 277 286 125 30.0 1e-28 MNKHQLCCIGHITLDKVVTPQNTVYMPGGTAFYCSHAIRHFNDIDYALVTAVGATEMKVV EQLREVGINVTALPSKHSVYFENIYGANPDDRTQRVLAKADPFTASQLKEIDAQIYHLGS LLADDFSLEVIKALSQKGLIAVDSQGYLREVRDTHVYPVDWTDKREALQYIHFLKVNEHE MEVLTGLSDPHEAARQLYKWGVKEVLVTLGSMGSLIFNGKEFYRIPAYKPKEVVDATGCG DTYTIGYLYQRVSGADIEEAGRFAAAMSTLKIEKSGPFSGSKEDVIRCMTTAEQMF >gi|225935357|gb|ACGA01000035.1| GENE 110 170012 - 173077 2952 1021 aa, chain + ## HITS:1 COG:no KEGG:BT_3569 NR:ns ## KEGG: BT_3569 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1021 1 1021 1021 1851 90.0 0 MEKFNAMRTRLLQHLQKKAIRSRSIMTLVCLLLASASVFAQTKTVTGTVTDAANEPLIGA SVLVQGTSTGTITDMDGKYSISVTPEDVLAFSYVGMTSQSVKVGTQSVINVTLKEDSQVL AETVVIGYGSAKKRDLTGSITNVKGEEIANKPAMNPLSSLQGKVAGVQIVNSGRAGSDPE IRIRGTNSINGYKPLYIVDGLFNDNINFLNPEDIESMEILKDPSSLAIFGVRGANGVIII TTKKAKEGQTLVNINTSFGFKKVVDKVKMVNGSQFKELYSEQLANQGDAPFDYTGWDANT DWQDEIFQTAFITNNNISITGASPKHSFYLGVGYSYEQGNIEHEKFSKVTINASNDYKIT DFLKVGFQFNGARMLPADSKQVLNALRATPIAPVYNEEYGLYTSLPEFQKAQINNPMVDI GLKANTTKAENYRASGNIYGEVDFLKHFKFKATFSMDYASNNGRTYTPIVKVYDAAVSGN VSTLGTGKTEVSQFKENETKVQSDYVLTYTNSFDNGNHNLTATAGFTTYYNSLSRLDGAR KQGVGLVIPDDQDKWFVSIGDAATATNGSTQWERSTLSVLARVIYNYKGKYLFNGSFRRD GSSAFSYTGNEWQNFFSLGGGWLMSEEEFMKDIKWLDMLKIKASYGTLGNQNLDKAYPAE PLLTNAYSAVFGKPSIIYPGYQLAYLPNPNLRWEKVEAWEAGFETNVLRNRLHFEGVYYM KNTKDLLAEVPGISGTIPGIGNLGQIQNKGVEMAVTWRDQIGEWGYSVSANLTTIKNEVK SLVQDGYSIIAGDKQQSYTMAGYPIGYFYGYKVAGVYQSQADIDASPENTLATVTPGDLK FADVNGDGKITPEDRTMIGNPTPKVTYGLSLGVDYKNWSLGIDMMGQGGNKIYRTWDNYN 
FAQFNYLEQRLGRWHGEGTSNTQPLLNTKHSINTMNSDYYIENGNFFRIRNIQLAYAFDK SLLAKIRLQALKVYVNIQNLKTWKHNTGYTPELGGTATAFGVDNGSYPVPAVYTFGINLT F >gi|225935357|gb|ACGA01000035.1| GENE 111 173142 - 174674 1618 510 aa, chain + ## HITS:1 COG:no KEGG:BT_3568 NR:ns ## KEGG: BT_3568 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 509 1 509 509 943 88.0 0 MKLNKYIFATLITSTALFLGGCSDFLDRSPQGQFTEDDNPNALVNGKIYNVYTMMRNYDV TAGPPALAIHCFRSEDSEKGSISSDGSDVAEMYDDFLYTPTNGLLGAYWGKNYAIIYQCN DILTAIADKETAGQTEAEDIINKGEASFFRAYCYFNLVRAFGEVPLVTYKINDASEANIP KTTVDKIYEQIDNDLKTAEESLPETWSTEYTGRLTWGAARSLHARTYMMRNDWNNMYTAS TDVIKKGLYNLKTSYDEIFTDDGENNGGSIFELQCTATAALPQSDIIGSQFCEVQGVRGA GKWDLGWGWHMATEYMAQAYEPGDPRKNATLLYFRHSDDEPITPENTNEPYGESPVSPAI GAYFNKKAYTDPALRKEYTNKGFWVNIRLIRYADVLLMGAESANEKGIPGEAIDYLEQVR ARARGTNTNILPKVTTTDQGELRDAIRHERRVELGMEFDRFYDLVRWGIAKEVLHAAGKT NYQDKNALLPLPQTEIDKSKGVLVQNPDYQ >gi|225935357|gb|ACGA01000035.1| GENE 112 174817 - 177126 2386 769 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 21 768 26 763 764 679 48.0 0 MIYKKLSLSILLFAAGFLTASAQKSPQDMDRFIDVLMNKMTLEEKIGQLNLPVTGEITTG QAKSSDIAAKIKKGEVGGLFNLKGVEKIREVQKQAVEESRLGIPLLFGMDVIHGYETMFP IPLGLSCTWDMTAIEESARIAAVEASADGISWTFSPMVDISRDPRWGRVSEGNGEDPFLG AMIAEAMVRGYQGKNMERNDEIMACVKHFALYGAGEAGRDYNTVDMSRQRMFNDYMLPYE AAVEAGVGSVMASFNEVDGIPATANKWLMTDILRGQWGFNGFVVTDYTGISEMIDHGIGD LQTVSARAINAGVDMDMVSEGFVGTLKKSVQEGKVSMETLNTACRRILEAKYKLGLFDNP YKYCDPKRPARDIFTKAHRDAARRIAAESFVLLKNDSPDGNPNGNPLLPFNPKGNIAVIG PLANSRSNMPGTWSVAAVLDRCPSLVEGLKEMTAGKANIMYAKGSNLISDASYEERATMF GRSLNRDNRTDQQLLDEALNVARRSDIIIAALGESSEMSGESSSRTDLNIPDVQQNLLKE LLKTGKPVVLVLFTGRPLTLSWEQEHVPAILNVWFGGSEAAYAIGDALFGYVNPGGKLTM TFPKNVGQIPLYYAHKNTGRPLKEGKWFEKFRSNYLDVDNDPLYPFGYGLSYTTFSYSDI DLSHSSMDMTGSLTAAVEVTNTGTWPGTEVVQLYIRDVLGSSTRPVKELKGFQKIFLEPG 
EMKIVRFKIAPEMLRYYNYDLQLVAEPGDFEVMIGTNSRDVKTAKFTLN >gi|225935357|gb|ACGA01000035.1| GENE 113 177137 - 178543 1274 468 aa, chain + ## HITS:1 COG:AGl3503 KEGG:ns NR:ns ## COG: AGl3503 COG5368 # Protein_GI_number: 15891871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 52 468 9 414 425 322 39.0 1e-87 MMKDKYQRIIEMNALPLMLLFLSAIALTFQSCKGKSKKDNLSAATDSLSDDALMDTVQRR TFLYFWEGAEPNSGLAPERYHVDGVYPENDANVVTSGGSGFGIMAILAGIDRGYVTREEG RTRMERIVSFLEKADRFHGAYPHWWYGDTGKVKPFGQKDNGGDLVETAFLIQGLLAVHQY YVNGNEKEKALAQRIDRIWRDVDWNWYRKGGQNVLYWHWSPTYGWEMNFPVHGYNECMIM YILAAASPTHGVPAAVYHDGWAQNGAIVSPHKVEGIELHLRYQGTEAGPLFWAQYSFLGL DPVGLKDEYCPSYFHEMRNLTLVNRAYCIRNPKHYKGFGPDCWGLTASYSVDGYAAHSPN EQDDKGVISPTAALSSIVYTPEYSMQVMHHLYGMGDKVFGPFGFYDAFSETDNWYPQRYL AIDQGPIAVMIENYRTGLLWNLFMSHPDVQAGLTKLGFNTNKQDVRQK >gi|225935357|gb|ACGA01000035.1| GENE 114 178571 - 181129 1680 852 aa, chain - ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 116 816 45 719 737 168 25.0 4e-41 MKHRLLLLLLFVFATSVAGWAQKSTAPSYTIKGVLLDSLTQEGEPYATIKITPKNAPDKA VKMAVTGANGKFQEKLNAAAGSYIITIFSIGKSPIVKEFTLKPSVREIDLGTMISSEANN VLKGVEVVAQKPLVKVDVDKIEYNIEDDPDSKSNSILEMLRKVPLVTVDGEDNVQVNGSS SFKIHVNGKPNNMMSNNPKEVLKSMPANTIKYIEVITSPGAKYDAEGIGGILNIVTVGTG FEGYTATFRGNANNNGVGAGTYAMVKQNKLTVSVNYNYNYDNSPRGYSSGYREIYNPSGE NEKYLESENSSKSKGNFQYGNLEASYEIDTLRLLTVAFGMYGSNRKDDSGGNTILHGADF QDFAYRYRTDNHGKDSWYSINGNIDYQRTSRKNKERMITFSYKINSQPQTSDLYNTYLDI EFDENKQEVIDRLNLKNFHSDGKTNTMEQTFQVDYTTPIGKLHTIETGAKYIFRRNSSDN RLYDAPRGSEDYTYSEDRSSEYRHLNHIISAYAGYTLKYKDFTFKPGLRYEQTIQEVKYL VGPGENFSSDFSDLIPSVSLGIKLGKTQNLRGGYNMRIWRPGIWNLNPYFDNQNPMFINQ GNPNLKSEKSHSFDLSYSSFTAKFNINLSLRHSFNNNGIERVSRLIDAEEGENFEGGHHA PKDALYSTYDNIGKSRETGLNFYLNWNASPKTRIYINGRGNYNDLKSPSEGLHNYGWNAS 
AYGGIQQTLPGKIRLSLNGGGSTPRINLQGRSSGYNYYSLGLSRSFMKEERLSLNVYCSN FFEKYRTYNSHTEGANYISRSSSKYPSRYFGFSVSYRIGELKASVKKAARSIDNDDVKGG GGGGNTGGGGGQ >gi|225935357|gb|ACGA01000035.1| GENE 115 181149 - 181310 75 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295086358|emb|CBK67881.1| ## NR: gi|295086358|emb|CBK67881.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 53 1 53 57 68 79.0 1e-10 MFIRCITFKNAAKSEAKSNWFIHTYEVFGVKNMKRMDKISYYSIFLVYLNIPV >gi|225935357|gb|ACGA01000035.1| GENE 116 181327 - 181803 214 158 aa, chain - ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 115 146 84 41.0 1e-15 MSKWISVEEAAVKYGINKEVIWLWAEMKRFPMSCEEGITVVDEESVEGFLYQNKDRITAE YIDTLEELCIEKANICNLYAEIIGCQDKELLCQREQIARISEILLAMERQNARLRDCKKV LTKYEQELDGCWIGRICADLRRLKRPIQRYISSLFHFK >gi|225935357|gb|ACGA01000035.1| GENE 117 182953 - 184986 1075 677 aa, chain + ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 433 630 419 618 836 61 26.0 7e-09 MKHETKARPISNLRTSIRYASPDSKLEAAQKLPKVIPAANFRKTTNGAQMTGYNGIVQIE VNHLANRTEVNRIKQEAAELTQTFAAFMGSSGHSVKIWLRFTRPDKSLPKDREEAEIFQA HAYRKAVSLYQPALSYSIELKNPTLEQFCRQTCDPELYYNPDSTVIYMRQPLEMPSDTTY KESVQAETSPFKRLIPGYDSFETLSALFEAALNKAYHSLSELQPNTHLHSDEDLKPLLVH LAENCFQAGIPEEETARWAIAHFYTKKKEFLIRQTVQNVYTGAKGFGQKLPLSTEQELEF RTEEFMQRRYEFRYNTMTTVTEYRERNTFCFCFRPITNRIRNSIAMNARLEGLNLWDRDV IRYLDSDRIPIFNPIEDFLFGVDVRWDGHDRIRELASRVPCNNIHWPDLFYRWFLNMVEH WRHTDRKYANCTVPLLVGPQAYRKSTFCRSLLPPELQIYYTDRIDFSNKRDAELALNRFA LINMDEFDQNRVSQQAFLKHILQKPVVNIRRPHGTATQEMRRYASFIGTSNHKDLLTDTS GSRRYIVINVTGPIDCSPIDYEQLYAQAIHDIYRGERYWFDTEDEKVMTEANQEFQVIPI AEQLFHQFFRAAKEEEEEYELLLAIEILEQVQHDSKIHISDCNIIQFGRILQKNQVPSIH TKRGNVYKVVRIKPKRE >gi|225935357|gb|ACGA01000035.1| GENE 118 185185 - 186039 622 284 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 90 280 12 192 195 95 28.0 1e-19 MNKTIYIYSSLLFFGFILLSCSKDDNDNSNNFATGIVELPALRNGANDVFITHSTTFKGQ KVTSFSMEYDKSKKHSRWVAFRFDNQTRLQEATRGDNFIPDPSLDTEYQRTQTDFGRKGY DRGHLCASADRLYKQDANDQTFYYTNMSPQRNNFNTGVWLALEGQVQSWGRSCTSLDTLY VVKGGTIDKEEQVKEYIGGDRSKPVPKYYYMALLFKKGDSFKAIAFWAEHTDDKPSKTIK LVDYALSIDELEEKTGIDFFPNLNDNLENALEATYSTKAWPGLE >gi|225935357|gb|ACGA01000035.1| GENE 119 186120 - 188078 1680 652 aa, chain - ## HITS:1 COG:BH1015_2 KEGG:ns NR:ns ## COG: BH1015_2 COG4085 # Protein_GI_number: 15613578 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein, contains TRAM domain # Organism: Bacillus halodurans # 63 156 19 107 430 72 45.0 2e-12 MKKILNALFLTLLTVFTFSSCSDVPAPYDILGEGDVPGLTGDGTKENPYNIATASLKQDG SVAWVQGYIVGSADGASLADGSKFEAPFVGASNILIADDVNEKDYKKCIPVQLVAGTDLR AKLNLVDNAANLGQVVIIKGTLENYFKQAGVKAPTAAVLNGQEIGDSGEPTPGGNLVELL DPSNPVNQVVNTFDDAETDKDYVKEGYVNLAEVGGRTWRGKPFNNNGLIQATAYGSKEPS 
IISWFVTPAVNIAQMEVKKVTFDCISAYYKDGNKLEVYFLEKDGNNLKQTLLNVGTLPQN AEGYSDAKTLTGDLTSIGDKVGFIGFKFVGSETASGTYQIDNLYVGVEPGEGPGPGPDPD PVGDGTKENPYDVTTALSLSTATGTTVAWVKGYIVGSVNSDNASSSVDGPEDIIFGVTGI RATALVIAGSANETDYKKCMVIGFGNDSQAAKTALNLVDNPGNLGKEVLLQGTLKYAFNA PGMKTITDHELVSGGEEPEPEGKVYTSNIALPTEDNSTDSYYGGKVKIDEVSYDILKLGT SSKIGTWTSPIFGKSATKLSFYALGWKGQTATLIVTIDGGTFAGGTTTQTISLSGNDGIS GNPPFTINAPASSDFFEYTLNDVTANTKIKFATDEQETAKNRRAVIFGVNIE >gi|225935357|gb|ACGA01000035.1| GENE 120 188099 - 188941 729 280 aa, chain - ## HITS:1 COG:no KEGG:BT_3561 NR:ns ## KEGG: BT_3561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 277 277 296 56.0 4e-79 MKKLLSILCTGLLLCGAGLTSCENYDDPVTGNAYGNNSIPEGRTISIAALKEKYNDFIDT SKDTYTTIEGETRIEGVITCDDESGNLYKKLVVADETGAIVIGVNATGLYAFCPVGQKVV IDCKGLQIGSYRKQAQIGTVYNNAVGRMPEYVWKQHVRLINEPKLYYSELTPIEITTPAE LAAINLKEAPVLVTFKNVKLTEADGTATYAPGDEGSVKRYFTYADGTESGSNLFLYTSAY ANFSMEVMPQGSVNITGILLRYNNQWEVVVRTLNDIKRNN >gi|225935357|gb|ACGA01000035.1| GENE 121 188989 - 191517 2298 842 aa, chain - ## HITS:1 COG:no KEGG:BT_3560 NR:ns ## KEGG: BT_3560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 842 1 846 846 1278 79.0 0 MKQRLGLAIVLFCLMPALFAQQKAKQNAREDNASFMFTESQLNEDDDAAQSTSALVTSNN DVYLSNVGYLFSPMRFRVRGYDSQYSDMYINGVQFNDAETGRFSYGLIGGLNDATRNKEG IGPFEINNFTFGAIGGASNINLRASQYAAGSKLTLSGSNRNYILRGMYTYSTGLMNNGWA FTGSLGYRWGNEGNIEGIKYNSFSYFLGAEKVFNDRHSLSLATWGTPTERGQQMAATEEA YYLANSHYYNPNWGYQNGEKRNARIVRQFEPSAIASWNFTIDDNKKLVTSAGFKYSNYGK SALGWNGNAADPRPDYYKKLPSSIFDVWESVPTADELQQFNEVTDNWKNNKAYRQLDWDA LYFANKQANALGKETLYYVEERHDDQLAFNLSSVFNHQWNERNSYVAGIAVNTTKGMHYK KMKDLLGGQLYTDVDKFAVRDHGASSSMVQNDLDNPNRRIGEGDKFGYDYNIYVNKQSAW VRYQGNNGGSLNYFASGKIGSTQMFRDGLMRNGRAPLKSLGSSGTAKFLEGGIKAGLNWA INGNHSFTLNAGYEERAPLAYNSFIAPRIKNDFVRDLKTERIIGGDLTYNFNTPWVMGRL TGYYTRFQNQVEMDAFYNDSEARFTYLSMNGIEKEHWGIEAAATFKLTSELSLTAIGTWS EAKYTNNPDAVLTYESENESNLDRVYAKGMRANGTPLSAYSLALDYNVKGWFFNLTGNYY 
DRVYIDFSSYRRLGSVLDKNGAGVDANGNPVLNVPGQEKLDGGFMLDASIGKYIRLRNGK SISLNLSLTNILNNTDLRTGGFEQNRDDNYKDGDARVYKFSKNSKYFYAFPFNAFLNIGY RF >gi|225935357|gb|ACGA01000035.1| GENE 122 191612 - 191836 117 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFTKKVTRKGGFFVFFLDICHTNDKLPTFKINFKDEKKLNDFRCTRSFRPHGLWTGKEI CPLQCSVLQYGEFV >gi|225935357|gb|ACGA01000035.1| GENE 123 191718 - 192743 930 341 aa, chain + ## HITS:1 COG:no KEGG:BT_3559 NR:ns ## KEGG: BT_3559 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 342 343 575 80.0 1e-163 MKKSLMTLGVLALFVLMAYGQEKKFALYSVAFYNMENLFDTIHDEGKNDYEYLPNGTNQW NTMKYKAKLKNMSEILSLLSTDKLPMGPAIIGVSEIENYRVLEDILKQPALADRGYQYVH YEGEDQRGVDCAFFYNPKLFELTNSKLVPYVYINDTIHKTRGFLIASGNIAGEKMHFIVN HWPSRAAASPARERAGEQVRAIKDSLLREDSAAKIVIMGDMNDDPMDKSMAVALGAKRKP ADVGPTDLYNPWWDTLKKGYGTLMYKGKWNLFDQIVFTGNLLGTDRSTLKFYKHEIFRRD FMFQKEGKYKGYPKRTQAGGVWLNGYSDHLPTIIYLIKEMK >gi|225935357|gb|ACGA01000035.1| GENE 124 192740 - 193885 665 381 aa, chain + ## HITS:1 COG:all7362 KEGG:ns NR:ns ## COG: all7362 COG1864 # Protein_GI_number: 17233378 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Nostoc sp. 
PCC 7120 # 169 372 65 270 274 75 28.0 1e-13 MTKALFKLFILFITCNTAISCSEQDSPELPDNPGNTNQGIASIDQTQINGNGGGFIIRVK ADGTWQASSSETWCTLSRTSGNGNGSISGYMKANTGAERSVIITITAGKEEAKFTLKQLA GNGSNPDPDPDPNPSGYAGRIEIPKLKGGSMNIFHTWTTTENGKKTVTYSYEYDCTKKHV RWVALTFDNVTSQKNVDRKDDYKPDTNIPAQYRTDKQDYYSPYNRGHMVASSDRLYSREA NSQTFYYSNISPQLITGFNQGGSTWDAIENKVQEWAKVSNPQDTTYIVKGTSIDYAILET GTYGVKIPKYYFSTILSYKNGQYKAIGFYIEHKSDKSKNIKACAKSIDELESITGLDFYH NLPDEIETAVEANYKESDWSW >gi|225935357|gb|ACGA01000035.1| GENE 125 194042 - 195340 1036 432 aa, chain - ## HITS:1 COG:MA1450 KEGG:ns NR:ns ## COG: MA1450 COG3174 # Protein_GI_number: 20090309 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 14 348 4 329 413 82 26.0 1e-15 MDMEQLYGYVPRELVTFVLVTLFSLLIGLSQRRISLKREGETTLFGTDRTFTFIGILGYL LYILDPTDMRLFMGGGAVLGLLLGLNYYVKQSQFHVFGVTTIIIALITYCMAPIVATQPS WFYVMVVVTVLLLTELKHTFTEFAQRMKNDEMITLAKFLAISGIILPMLPHKNLIPDVNL TPYSIWLATVVVSGISYLSYLLKRYVFHESGTLVSGIIGGLYSSTATISVLARKSRKASE QEATDYVAAMLLAISMMFLRFMILILIFSREIFLSIYPYLLTMAVVAAIVAWFIHSRQKR SNDQPVEEEEDDSSNPLEFKVALIFAVLFVIFTFLTHYTLVYAGTGGLNLLSFVSGFSDI TPFILNLLQNTGSVAALIITACSMQAIISNIMVNMFYALFFAGKGSKLRPWILGGFGVVI ACNLVLLLFFYI >gi|225935357|gb|ACGA01000035.1| GENE 126 195394 - 195861 484 155 aa, chain - ## HITS:1 COG:XF2357 KEGG:ns NR:ns ## COG: XF2357 COG2954 # Protein_GI_number: 15838948 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 155 1 161 165 140 47.0 8e-34 MAQEIERKFLVTGEFKSLAFAQSRIVQGYISSARGRTVRVRIRDDKGYLTIKGASNASGT SRYEWEKELPLSEAEELMKLCEPGIIDKTRYLVRSGKHIFEVDEFYGENEGLIVAEVELE SEDEVFVKPGFIGEEVTGDIRYYNSQLMKKPYKTW >gi|225935357|gb|ACGA01000035.1| GENE 127 195981 - 197006 766 341 aa, chain - ## HITS:1 COG:no KEGG:BT_3553 NR:ns ## KEGG: BT_3553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 10 350 350 634 91.0 1e-180 
MLLLATMAFGQAKEDAGDGKMLDKQTMEFKDYLPEIHGTIRGKYEYQTETSESRFEVRNA RFSVSGNVHPIVAYKAEIDLSDEGSIKMLDAYARVFPVKDLNFTIGQMRVPFTIDAHRSP HQQYFANRSFIAKQVGNVRDVGFTAGYTNKGGFPFILEGGLFNGSGLTNQKEWHKTLNYS IKAQLLPNKNWNLTLSTQMIKPENVRINMYDAGIYYQNDRFHIEAEYLYKMYGHETFKDV HAVNSFINYDLPLKKVFNKISFLARYDMMTDHSDGKMDETTKALIINDYARHRVTGGITL SLSKAFIADLRLNFEKYFYKNSGIPKESERDKIVIEFMTRF >gi|225935357|gb|ACGA01000035.1| GENE 128 197269 - 197403 138 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKVKKKVAISKKEEEQAQRVVKIVFVSLIILALIMLIAFSFFG >gi|225935357|gb|ACGA01000035.1| GENE 129 197488 - 198447 1066 319 aa, chain - ## HITS:1 COG:SA0709 KEGG:ns NR:ns ## COG: SA0709 COG1186 # Protein_GI_number: 15926431 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Staphylococcus aureus N315 # 7 318 23 330 330 247 44.0 2e-65 MKLVKDLQKWIEGYNELKTLADELELAFDFYKEELVTEEDVDAAYAKASEAVEALELKNM LRDEADQMDCVLKINSGAGGTESQDWASMLMRMYLRYAETNGYKATIANLQEGDEAGIKT CTINIEGDFAYGYLKGENGVHRLVRVSPYNAQGKRMTSFASVFVTPLVDDSIEVNILPAN ISWDTFRSGGAGGQNVNKVESGVRLRYQYKDPYTGEEEEILIENTETRDQPKNRENAMRQ LRSILYDKELQHRMAEQAKVEAGKKKIEWGSQIRSYVFDDRRVKDHRTNYQTSDVNGVMD GKIDGFIKAYLMEFSSQES >gi|225935357|gb|ACGA01000035.1| GENE 130 198704 - 201622 3246 972 aa, chain + ## HITS:1 COG:sll0915 KEGG:ns NR:ns ## COG: sll0915 COG0612 # Protein_GI_number: 16330991 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 42 489 62 514 524 204 28.0 7e-52 MNKRLKLSCLSLFLALVICSCSSQKKYSYETVPNDPLKARIYTLDNGLKVYLTVNKETPR VQTFIAVRVGGKNDPAETTGLAHYFEHLMFKGTDKFGTQDYATEKPLLDAIEQQFEIYRK TTDEAERKAIYHTIDSLSYEASKYAIPNEYDKLMAAIGSTGSNAYTWYDQTVYQEDIPSN QIENWAKIQADRFENNVIRGFHTELEAVYEEKNMSLTRDNSKVQEAIFSSLFPKHPYGTQ TVLGTQENLKNPSITNIKNYYKQWYVPNNMAICMSGDLDPDATIALIDQYFGGLKPNLEL PKLDLPKEAPITQPVVKEVLGPDAESVALAWRFPGVSDKDFEILQVVSQVLYNGKAGLID LDLNQQQKVLNSYGYPMGLADYSALLLGGLPKQGQTLEEVKDLLLSEIKKLRAGEFDEKM LEANINNFKLGELQNMESNEGRADMFVNSFINGTDWKNEVTAIDRMAKLTKEDIVAFANK 
YLKEDNYAVIYKKQGKDPNEKKMTKPEITPIITNRDVASPFLVEVQESAVKPIEPVFLDY QKDMSQLKAKSDIPVLYKQNVANDLFQLIYVFDMGNNHDKALGTAFDYLEYLGTSDMTPE ELKSEFYRLACTFYVSPGNERTYVVLSGLNENMPAAVQLFEKLLADAQVNKEAYTNMTSD ILKARSDAKLNQGQNFSRLMSFAMYGPKSPATNLLTEAELTNMNPQELVDRIHNQNSYKH RILYYGPSSSKDLLATINQYHQVPATLKDIPAGNEYSYLKTPVTKVLVAPYDAKQIYMAQ ISNLDKKYDPAIEPIRALYDEYFGGGMNSIVFQEIRETRGLAYSAWASIMPPSYLKYPYV LRTQIATQNDKMIDAVTTFNDIINNMPESEAAFKLAKDGLTNRLRTERIIKGDIIWSYIN AQDLGQNVDPRIKLYNDIQNMSLKDIVDFQKQWVKGRTYVYCILGDKKDLELDKLKAVGP IEELTQEQIFGY >gi|225935357|gb|ACGA01000035.1| GENE 131 201810 - 203624 1563 604 aa, chain - ## HITS:1 COG:HI0002 KEGG:ns NR:ns ## COG: HI0002 COG1022 # Protein_GI_number: 16271978 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Haemophilus influenzae # 5 603 14 606 607 498 42.0 1e-140 MTYHHLSVLVHRQAEKYGDKVALKYRDYETAQWIPISWKQFSGTVRQAANAFVALGVEEQ ENIGIFSQNKPEWFYVDFGAFANRAVTIPFYATSSPAQAQYIINDAQIRFLFVGEQYQYD AAFSIFGFCSSLQQLIIFDRSVVKDPRDVSSIYFDEFMATGKGLPHNDTVEERTERASYD DLANILYTSGTTGEPKGVMLHHSCYLEQFHTHDDRLTTMSDKDVSMNFLPLTHVFEKAWC YLCIHKGVQICINLRPADIQTTIKEIRPTLMCSVPRFWEKVYAGVQEKINETTGLKKALM LDAIRVGRIHNLDYLRLGKTPPVMNQLKYKFYEKTIYSLLKKTIGIENGNFFPTAGAAVP DEINEFVHSVGINMVVGYGLTESTATVSCTLPVGYDIGSVGVVLPGLEVKIGEDNEILLR GKTITKGYYKKAEATAAAIEPDGWFHTGDAGYFKNGQLFLTERIKDLFKTSNGKYVAPQA LETKLVIDRYIDQIAIIADQRKFVSALIVPVYGFVKEYAKEKGIEYKDMTELLQHPKIVG LFRARIDTLQQQFAHYEQIKRFTLLPEPFSMERGELTNTLKLKRAVVAKNYSEQIEKMYE ESEK >gi|225935357|gb|ACGA01000035.1| GENE 132 203734 - 204804 786 356 aa, chain + ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 66 354 69 383 387 105 27.0 2e-22 MKYDIPYMSTEAVSLLKSLISIPSISREETQAADFLQNYIEMAGMQTGRKGNNVWCFSPM FDLKKPTILLNSHIDTVKPVNGWRKDPFTPREENGKLYGLGSNDAGASVVSLLQVFLQLC 
RTSQKYNLIYLASCEEEVSGKDGIESVLPGLPPVSFAIVGEPTEMQPAIAEKGLMVLDVT ATGKAGHAARNEGDNAIYKVLDDIAWFHDYRFEKESPLLGPVKMSVTVINAGTQHNVVPD KCTFVVDVRSNELYSNEELFAEIKKHISCEAQARSFRLNSSRIDEKHPFVQKAVKLGRVP FGSPTLSDQALMSFPSVKIGPGRSSRSHTAEEYIMLKEIEEAIGLYLELLDGLLIQ >gi|225935357|gb|ACGA01000035.1| GENE 133 204888 - 205310 371 140 aa, chain - ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 12 122 45 155 181 76 31.0 2e-14 MDAKIKIGEHFPDAKVKDNTGNMKLLSDYVGKGKYVLIDFWASWCGPCRHEMPNVKAAYE KYVSKGFEVISISTDRKLKLWRAAIEELGMNWTQLLDVDAGDVYGIYAIPRTFLVDPTGI VIDKNLRGEKLEEALSKLFE >gi|225935357|gb|ACGA01000035.1| GENE 134 205326 - 206069 430 247 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172309|ref|ZP_05758721.1| ## NR: gi|260172309|ref|ZP_05758721.1| hypothetical protein BacD2_10627 [Bacteroides sp. 
D2] # 1 247 1 247 247 465 100.0 1e-130 MILCTFCAIAIFSSAQEKYSIKGIANEELNNQLLYLCLMGEGDGKNAKEVVLDSTTVEKG KFSFSGVYQMPDIAIIKDMDGETYPLILEKGKISVNTATNERGGTPLNDSLNIALNRMQL IMDNMLKTSDSIYKLMTGMKSEEFADKMVNDTAFRARYDKIEKKFLAQMDSVSHCIRDYK NSIVGIYLFSIGGMMMPFEDMEILMKEASPIFSQNNLVRNIVEKKNQAQLRMKAEVEKRI TPEQREE >gi|225935357|gb|ACGA01000035.1| GENE 135 206169 - 207188 833 339 aa, chain - ## HITS:1 COG:no KEGG:BT_3536 NR:ns ## KEGG: BT_3536 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 339 2 329 329 341 56.0 3e-92 MKNIFLPKVMMCIIALFTLILPASAQNTNATNEYKETLKKIMDFSGTSTTTDDFFQKLSS IMKLNVPKENEAYWNEFAKKWKQKMEKKIFEMYMPIYEKHLTLEELKAVAAFYESPVGEK YKEASLIVMREAMPLLVQQLQTEMFKEVMPERSERVKRDEQRLKEYEQKKKRDKELYAQA YMLPSDSIVVVPEEVYEKAYENGRSTSPSLYSIERRKNDTKVTFIQPIYWDWQWLYYSPG FKIVDKKSGDEYNVRGYDGGAPMGRLLAVKGFNHKYIYISLLFPKLKKSVKEIDILELPH EKDKEQLPSNDDGRSKSYFDIKVKDYQAVSDKKDKKIYY >gi|225935357|gb|ACGA01000035.1| GENE 136 207466 - 207756 239 96 aa, chain - ## HITS:1 COG:no KEGG:BT_3546 NR:ns ## KEGG: BT_3546 # Name: not_defined # Def: glutaminase # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 88 435 509 965 119 73.0 2e-26 MGGYRFMGSEKLFMQAIARPVNYISYQTKALDGKGHDVAIYFEMDSHKAFCAGQSTQMYE KNGLVLMKTGRENQKLWVNKGKKYPAWGRSERVPIS >gi|225935357|gb|ACGA01000035.1| GENE 137 207976 - 208182 84 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKGLIVAKWSRWSFPFDLHLGCNMNELITSSPSSVRIGPIILAVLLLLLPMMVFKIKDK RINNRFMK >gi|225935357|gb|ACGA01000035.1| GENE 138 208724 - 209239 141 171 aa, chain - ## HITS:1 COG:no KEGG:BDI_2696 NR:ns ## KEGG: BDI_2696 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 12 170 16 174 427 133 39.0 3e-30 MKKISFFLILSKIVLGLTFLFSGFVKSVDVYGTTFKIVDYFHAFHLDIFQPLSKLLAFGI VGFEFLLGILIIIGIYAKVVSKLVISIVFFMMLLTLYIALFNNVEDCGCFGDVLILSNWA TFFKNIVLLIFAFIFVIYHRLVIPLFSIKIRKYVLRYSFIYISGILLYSYI >gi|225935357|gb|ACGA01000035.1| GENE 139 209226 - 210335 514 369 aa, chain - ## HITS:1 COG:no KEGG:BT_3533 NR:ns ## KEGG: 
BT_3533 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 349 193 526 537 201 35.0 5e-50 MLYCGYKRFFCCSKLIKQVWLCSGMFVLIAIGIILFVIKKDSAVGRILIWSNTLELINDR PLLGYGPCGFTANYMSAQAVYFENNSESIYSQLADNIIMPFNDYLFIAVKYGIIGLLMCL VVAYFVFKESDKLELGHFCIISIGVFACFSYPLRYPYVLFLLAYSIAISSKKIGMHNLNI MVKTLLTGILIIGMYMLCLDIRFESKWNILVEMSVLGKTRTLIPEYNKLYKIWNYNPSFL YNYAAVLNKASDFRTSNAVIKECIKYVNDYDIQILLANNYYNLNDLSLAEKYYMNASNMC PNRFIPLYGLFLVNQKRGDQKKCYELASLILNKPIKTMSSTIKNIKREVYVFNKSKKAIN RLDERNEEN >gi|225935357|gb|ACGA01000035.1| GENE 140 210976 - 211596 275 206 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_1350 NR:ns ## KEGG: HMPREF0868_1350 # Name: not_defined # Def: antioxidant, AhpC/TSA family (EC:1.11.1.15) # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 72 187 59 173 207 63 25.0 6e-09 MSLLDLPCRWLAIIVAYFCYRWQNYMMRLFSLVILYLIFAFWLTFEGFDLWVHKLNFGTF TGKVIEDVAENDIMFTNELGEKQSINQLHGEYIVLDFWFTGCRICFDTMPNFQKLYNLSN SNKVGLFSVCCYNEKQGEDYLTGRKVLSERGYDFPVLSINIKDSLLQNLGVSAMPMVLIF NSERKLIFRGSLEYAEDFMKDFKCKI >gi|225935357|gb|ACGA01000035.1| GENE 141 211961 - 212725 458 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172316|ref|ZP_05758728.1| ## NR: gi|260172316|ref|ZP_05758728.1| hypothetical protein BacD2_10672 [Bacteroides sp. 
D2] # 1 254 1 254 254 482 100.0 1e-134 MKNIYFILIALSYFTVSCSNIDDELRYSCNEKIDSWVKENFTDIQKMGYTEILEYDMGTQ KAIYNAMSLEQRYNLWITKLNNVLKLEWNEKEEEHLNNLLKFVEDNKMLFDNKKNDDIND EFELFMYQWKEYARNELGWDQTSLYSIYCTLMTPTKKLENDTVRLFVEEDLLSSNIPQTR SSTELIKKHDTSEVCYCSTSSDYCGNNNSSSPAGSYTYYCSSGCRSTGKPKGCGILWQYE CDGGCDMYYTPVGN >gi|225935357|gb|ACGA01000035.1| GENE 142 213121 - 213555 121 144 aa, chain - ## HITS:1 COG:no KEGG:Phep_1402 NR:ns ## KEGG: Phep_1402 # Name: not_defined # Def: protein of unknown function DUF1573 # Organism: P.heparinus # Pathway: not_defined # 1 144 1 150 153 74 33.0 8e-13 MKKTIALIFVSLFFLSSCINDTTDKKASLPTEKSVIENREAYLEIVNRKYDFGRISKKEH SYLDVEFELRNTGEIPLIISKVDVSCGCLSVNYSKEPINPGKIRKLVVHIDTKNQYGMFN KVIFINSNAENNLELIRITGEVEK >gi|225935357|gb|ACGA01000035.1| GENE 143 213644 - 213862 204 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172318|ref|ZP_05758730.1| ## NR: gi|260172318|ref|ZP_05758730.1| hypothetical protein BacD2_10682 [Bacteroides sp. D2] # 1 72 1 72 72 115 100.0 1e-24 MPYYTYYKGKLLDQYLTYYNLLLQFKGKKIKPKILADWEKDIYQKYNPLLDSLCASDVLD KEEPIVAIEKLL >gi|225935357|gb|ACGA01000035.1| GENE 144 213895 - 216783 2172 962 aa, chain - ## HITS:1 COG:no KEGG:BT_3546 NR:ns ## KEGG: BT_3546 # Name: not_defined # Def: glutaminase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 958 8 963 965 1367 68.0 0 MKKLLIALLILLLAACSKSVPYSDETIKNDLRVPAYPLLMLHPHLRLWSTTDQLTEKNMI FTNGKNLPFVGFLRVDGTIYRFMGGRDLPMQAIAPMAYDYESWEGKFTSLRPDEGWEQPD FNDQYWQVSEGAFGTRDRRETRTPWLSTDIWVRREIVDIDPYLLENKKIYLRYSYDDILQ LYINGKLVVSADRAAANLKVELPDSILNTMKEGKALIAAHCENKKGSALIDFGLFAEESG ILVEGIAPVAKEKEWIGKYTTEQPEEGWEIAAFNDSTWAQGSAAFGTEGGPGVGTPWNTN RLWIRREVSFDPSLVRNRQLFMRYSYNDGMQLLINGKELVRTGTKARNDVKVQIPDSILE TMKDGKALFAARCVNWGGTSFADFGLYGELKEAGQKTVDVQATQTHYIFDCGDVELKLTF TAPYLLDDLELLSRPVNYISYQAKALDGKEHDVAIYFEMDPHKAFRAGQSTEMYEKDGWV MMKTGRENQKLWVDKLKDAPAWGYFYLGAKENATYAQGDAAEMRAYFMKEGDLKEMRRSN EKRYAAIAQKLEMNSEFPQHLIVAFDGLYTMAYFGEDLRPYWNKDGEKTIEGLYEDAEKV YKETMAKCYAFDRQLMENACRAGGKEYAELCASAYRQAVASFQMSKNSSDELLYFTTLVG 
SLDIYYAASPLFLCYNPDLLKAMLNPFFYYSESGKWNKPFPAHDLGGYPFVNGQAKGGDL PVEHAGNMLIMVAAIAKAERDASYAKAHWEALSKWAGYLMENGVDTGKQIDTDSFAGRYS HNANLSAKGILGIASYALLAKMLDKQEDAEKYLAAAKRMAEEWEKQASDGEHYRLAFDQT DSWGQKYNLIWDKLLDLHIFPNRVVELETAFYRTKLNTYGCPLHSKTDYAKADWTVWTAA LQNDRLMFREFILPLYNYMNENKWRVPMADTYNVVNQKTRVTSWGRPVLGAYFIKLLEAV IE >gi|225935357|gb|ACGA01000035.1| GENE 145 216807 - 218378 818 523 aa, chain - ## HITS:1 COG:no KEGG:BT_3545 NR:ns ## KEGG: BT_3545 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 518 4 519 529 498 53.0 1e-139 MIRLFHILFAAACIGSMFVYSHQFTDAYIVPKWCCVLFVLLWMLVCAAILALQRKSILVD MAIWGSIIVFSCWLQAVYGILQYVGLFSSHATFHVTGSFDNPAGFAACLCAGLPFVVFLI IHRNKYIRYAGWLAGGVMMLAIFLSHSRSGMVSVIAVCVMYLCGRFVHGRLWRRYLLSVS MIGLLIIGSYWLKKDSADGRMLIWRCGLEMVKDAPWTGHGIGSFEAKYMDYQADYFKEYD SQSRYAMLADNVKQPFNEYLGVLINFGIVGLALLLGIVGALVYCYEQNPTQEKRIALYVL LSIGIFSFFSYPFMYPFTWMVTFLAVIMLTADYLKRIKIGTWGRNIMYTATMMCFFWGLV RLGERTQSERSWQEASELALCHLYDEALPYYVSLKHRFKDNPYFLYNYAAVFTEAKEYEK ALEIALECRKYWADYDLELMIGENYQEQKDLISAEKYYKNASMMCPSRFTPLYQLFKLYK QLGENIRAYETAEMIINKPVKINSMTIRMMKREMEKEILQRKG >gi|225935357|gb|ACGA01000035.1| GENE 146 218392 - 218748 240 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172321|ref|ZP_05758733.1| ## NR: gi|260172321|ref|ZP_05758733.1| hypothetical protein BacD2_10697 [Bacteroides sp. D2] # 1 118 1 118 118 223 100.0 3e-57 MKGLLVKSKDKISKAGIPQNGVGLVGNITWHSGASWSVGGLRVSDEMHLVWDGGMLEVGD VIEVEVVEFDEASATVWEEKHCCPTRTASNDIDNSKEWEHKLDLYYRLKKVLEDENLI >gi|225935357|gb|ACGA01000035.1| GENE 147 218758 - 219060 215 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172322|ref|ZP_05758734.1| ## NR: gi|260172322|ref|ZP_05758734.1| hypothetical protein BacD2_10702 [Bacteroides sp. 
D2] # 1 100 1 100 100 179 100.0 7e-44 MIGYEVAINDQSPVVITSPDVAVVMVHSNCSFGDSIYVGGLDTSRRIVWVDEKLKMGDRV RIKVVEVSVVSPTVKMTYDREELKAKYERLKAELEAKGLI >gi|225935357|gb|ACGA01000035.1| GENE 148 219113 - 219385 145 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172323|ref|ZP_05758735.1| ## NR: gi|260172323|ref|ZP_05758735.1| hypothetical protein BacD2_10707 [Bacteroides sp. D2] # 1 90 1 90 90 150 100.0 3e-35 MIWEDKTPTKKMLSTKNNIYIGLCILYILLSMINRKYPIGDITLHIAYLSTWFFLLLWEV VDVLINRKKDFGCIVIISMVLVASIANMLR >gi|225935357|gb|ACGA01000035.1| GENE 149 219529 - 219795 227 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172324|ref|ZP_05758736.1| ## NR: gi|260172324|ref|ZP_05758736.1| hypothetical protein BacD2_10712 [Bacteroides sp. D2] # 1 88 1 88 88 152 100.0 5e-36 MNRLKIYLAIVLALIFMSSCSSSAKLALAKNHDEMKSDPRIWDYLNVSERERALVDSIVA LDEYAVFLDSYKKYRTKKNAMREGEDVE >gi|225935357|gb|ACGA01000035.1| GENE 150 220142 - 220444 147 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172326|ref|ZP_05758738.1| ## NR: gi|260172326|ref|ZP_05758738.1| hypothetical protein BacD2_10722 [Bacteroides sp. D2] # 1 100 1 100 100 195 100.0 7e-49 MRRIAAYTIILLMSCLCSLSACTSSEQKENFTSLESEMIAFFQATLEKNYGDKDAVCQFA RALKRYHFNYLLAVDRGKLRRINEKLYGEGILKWRISPCH >gi|225935357|gb|ACGA01000035.1| GENE 151 220581 - 220778 130 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172327|ref|ZP_05758739.1| ## NR: gi|260172327|ref|ZP_05758739.1| hypothetical protein BacD2_10727 [Bacteroides sp. 
D2] # 1 65 1 65 65 129 100.0 5e-29 MKSTSPLLKDMYKAYVTTGSLHITIVYGKLRSNQHGCLDAFETDQDVQFLITWIFLAIFV LLREC >gi|225935357|gb|ACGA01000035.1| GENE 152 220807 - 221520 364 237 aa, chain - ## HITS:1 COG:no KEGG:BT_3540 NR:ns ## KEGG: BT_3540 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 236 5 233 235 90 33.0 5e-17 MDMRKKRTNLCFFTLIVAAFSVLLVHGCSFSEYDLVEDSSGMVLEEANSTRALSEQNCRN DLVISISESEEYLDYLMSLHMFFDKFDSYYSSLNDKEKIQLEENLNNDDYIEDIIDESCI RNELEQMINAQNQLKNTAYFHLNKLERSMLISLDCTYIYKTVLLKTRGEGDDARKCAEIR DKAIQAASDLAIEEKEACDDAYESGTTAHAYCYFKAAKKFSKAKEEAKEEYESCIGK >gi|225935357|gb|ACGA01000035.1| GENE 153 221582 - 222076 343 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172329|ref|ZP_05758741.1| ## NR: gi|260172329|ref|ZP_05758741.1| hypothetical protein BacD2_10737 [Bacteroides sp. D2] # 1 164 1 164 164 293 100.0 4e-78 MKMKRRSCSLGFKVAFMIIFIMYFMQSCSSDTVISDDLFCMTVMTKSNDFIEPELTHTDS ILVDSITKLDEFKSYATATRQLIKKIQPLMESDLWDKEPLGGHQNRKDSINALVLKISKH KDVQNDWNSYVESYKSFIKLMENAQLSSAIKTILVMKVITSSDD >gi|225935357|gb|ACGA01000035.1| GENE 154 222393 - 222803 284 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172330|ref|ZP_05758742.1| ## NR: gi|260172330|ref|ZP_05758742.1| hypothetical protein BacD2_10742 [Bacteroides sp. 
D2] # 1 136 1 136 136 280 100.0 3e-74 MRERLVKRNVKHKRLRPIYMKNFIQWFIGVFVLMAFFIGCQATNDKTGKITISAEEKKLN LQAPNGEYLAGGDLVRLKEMLAPCIGTIEKDSVVYDFEILSIQYDSLHYGFIADIEFVTK SGYHNHLIMEHDGKEE >gi|225935357|gb|ACGA01000035.1| GENE 155 222846 - 224870 1452 674 aa, chain - ## HITS:1 COG:no KEGG:BT_3542 NR:ns ## KEGG: BT_3542 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 276 674 1 400 400 576 66.0 1e-162 MKAMTKLIPIICLSVIIGGSLFYSCSQEKKEETAIVLPLDAALSQARENRVELEKVLHRY QSNPSDSLKYRAARFLIENMPSYTYYKGKLLDQYLTFFTLLQEARSKKIYPQAMIDSIRR MYGPFSLDSLQYYEDITTVDSAYLCSNIDWAIKVWQEQPWGKNVSFADFCEYILPYRIGD ETLSYWREDIYRKYNPLLDSLRASMVLDIEDPLVAARCLCDSLRKRSRFFTTTVPQGLPH VGPEIAQSVSGSCRELSDYVVYVCRALGIPCAIDFMPLHGGGNDGHQWVSFADKYGTLYF QEFPDKIKEVRKDKMCEASKIKVYRNTFSLNRAMQAEMQRLDTAVVPFFRDPHIVDVTTD YAKTYKKKLEIPVSMLYSGKPHSRIAYLCGSSRMDWEPVAWAEFDGKRLAFSDVQIEPVM RIATYERGRLRYWTDPFEITVSGEFHVFTPSDSVQDVTLFAKYPLWQDEKYQKRMIGGVF EGSNDPDFRQKEVLFLIEKQPERLRTMAYSRSLTPYRYVRYIGPEKGHCNVAEIEFYEAG GLLPLSGRIIGTPGCYQQDGSHEYTNAFDDNTETSFDYTEPYGGWTGLDLGTPKVIDKII YTPANRDNYVRSLDEYELSYCTKRGWRTLGQQTAMLDSLVYKRVPKGALLLLQNHTRGNQ ERIFVYEDRKQVWK >gi|225935357|gb|ACGA01000035.1| GENE 156 225056 - 225517 184 153 aa, chain - ## HITS:1 COG:no KEGG:BT_3543 NR:ns ## KEGG: BT_3543 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 80 234 240 77 32.0 2e-13 MDEYISALSQKEKEELISNANNDDFVLKFIDCIDIEKEVLAISLAKRRLLDNTSFLRLNE SEKSQLFDGCFVTRGHMLMKTRGGEGVTQQECEKAKKKAYDSAYDAFIRRLLLCDDLPDR EKYACQVATKLNYEAEKSQADTEYKRCMEKVKK >gi|225935357|gb|ACGA01000035.1| GENE 157 225823 - 226308 336 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172333|ref|ZP_05758745.1| ## NR: gi|260172333|ref|ZP_05758745.1| hypothetical protein BacD2_10757 [Bacteroides sp. 
D2] # 1 161 1 161 161 283 100.0 3e-75 MKIKKRKCYFITLVLMALVVIVACNCSSDVKVDSNKFNLITMAAMNADSWPMPKLSEKEA SLVDSISRLDVFINYHNAYQQLIKKTSSYFANLSSEEIEEFTQNGTDTAFISDFMKKMDS CINIEIEIKAVEEAGKDFDKMIKGLELTEAERIALIIKKFK >gi|225935357|gb|ACGA01000035.1| GENE 158 226726 - 227445 508 239 aa, chain + ## HITS:1 COG:no KEGG:BT_3540 NR:ns ## KEGG: BT_3540 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 1 235 235 206 54.0 7e-52 MKSENKKQSSIYVITLFGILLTAFFMYSCSTDEYNYTNETATEVNSIAKTRSLVSQTWNT GNVLIDSVANSDEFYEFERCSEQLADKFSAYTSKLSDEEYDKLMEKLNDDEYMEDFIKKA NLEEVLRQMDEAKKGLLEHTVFLRLSEEERLLLFRQFAESRELAKRKILKTRKEGNGNSK CEELRQAAYEQAKTEYDNAIATNCKGMGPLSPCYLKEAAVYKANIRIANKDYENCINNQ >gi|225935357|gb|ACGA01000035.1| GENE 159 227448 - 227726 195 92 aa, chain + ## HITS:1 COG:no KEGG:BT_3539 NR:ns ## KEGG: BT_3539 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 3 93 95 84 52.0 2e-15 MIRYIILYCSTCAVCIIMCYLDLFIDNIDSILQLFLIHFFDFLSWIIVTIGAIKCMPEKS YSNKRVWFYCAAMSGMLAAIKSFVKLIEILDT >gi|225935357|gb|ACGA01000035.1| GENE 160 227741 - 228538 222 265 aa, chain - ## HITS:1 COG:no KEGG:BT_3538 NR:ns ## KEGG: BT_3538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 264 31 269 270 310 62.0 4e-83 MKLSAYSFKQLFSGGLLCCLILLAAYLFQVFFPEMYVDAFSSSIHLLMWLTLFVGALLWT FREKGAHGIPSEITDWQCIAAFLLITADQIYVVMPKEIHQEVAPALSTYTIPTLVVAGIL IAGIIKYSILLYQKLQNERQQSIIQSQKLQQLLASPLSIQRGISLVRKLIENKSIAQLSA QDYLSLVEGCRTIDPDFFCWLKSQNLQLPPRDIVLCVLIRMYKTKEEILSVLCVSDGSYR TMRSRARKRLGIVDKELEAFLLEID >gi|225935357|gb|ACGA01000035.1| GENE 161 229836 - 230924 659 362 aa, chain - ## HITS:1 COG:no KEGG:BF0670 NR:ns ## KEGG: BF0670 # Name: not_defined # Def: putative transmembrane acyltransferase protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 362 1 362 362 598 86.0 1e-170 MKRIVFLDYVRVFACFLVMVVHASENFYGAPGSTDMAGPQSFLANEADRLWVAVYDGFSR 
MAVPLFMIVSAFLLAPMKEEQTMWQFYRQRCLRILPPFFIFMILYSTLPMLWGQLDAETS IKDLSRIFLNFPTLAGHFWFMYPLISLYLFIPIISPWLRKATAKEELFFIGLFVLSTCMP YLNRWCGEVWGQCFWNEYHMLWYFSGFLGYLVLAHYIRVHLTWNRSKRFIIGAILMVIGA AWTIYSFYVQAVPGELHSTPVIEIGWAFCTINCVLLTSGTFLLFTCIEGPKAPVLVTETS KLSYGMYLMHIFWLGLWVSVFKDTLGLPTVAAIPCIAVVTFISCFVTTKIISYIPGSKWI IG >gi|225935357|gb|ACGA01000035.1| GENE 162 230950 - 233004 2013 684 aa, chain - ## HITS:1 COG:SMb20631 KEGG:ns NR:ns ## COG: SMb20631 COG3533 # Protein_GI_number: 16265291 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 326 528 309 525 640 94 30.0 6e-19 MKKIKTKSVFASIVLGVAVTACASAPSSGTVTVVDRPDIQSTNTNYTGYRAPLRPLNFIK LPVGSIQPEGWVKKYLELQRDGLTGHLGEISAWLEKDNNAWLTTGGDHGWEEVPYWLKGY GNLAYILNDPKMIEETKYWIEGVFASRQPDGYFGPVNERNGKRELWAQMIMLWCLQSYYE YSQDQRVIDLMTNYFKWQMTVPDDQLLEDYWEKSRGGDNIISIYWLYNHTGDAFLLELAE KIHRNTADWTKSTSLPNWHNVNIAQCFREPATYYMQTGDSAMLKASYNVHHLIRRTFGQV PGGMFGADENARLGYIDPRQGVETCGLVEQMASDEIMLCMTGDPMWAEHCEEVAFNSYPA AVMPDFKALRYITCPNHAISDSKNHHPGIDNRGPFLSMNPFSSRCCQHNHAQGWPYFTEH LVLATPDNGVATAIYAACKATVKVGDGKEITLHEETNYPFEESIAFSVSTGEKVTFPFYL RIPSWTKGAEVRVNGKKVNVAPVAGKYLCIHREWSNGDRVELTLPMSLSMRTWQVNKNSV SVDYGPLTLSLKIAEKYVEKDSRETAIGDSKWQKGADPQKWPTTEIYPDSPWNYSLVLDK TEPLKNFKVIRKSWPADNYPFTVASVPLEVKATGRLVPEWKIDETGLCGVLPEEDAVKGN KEEITLIPMGAARLRISAFPNTRE >gi|225935357|gb|ACGA01000035.1| GENE 163 233029 - 234177 891 382 aa, chain - ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 11 372 1 329 336 69 24.0 1e-11 MNQLPALDHRMVWLDVIRAVAMLMVIGVHCIDPFYISPTLGVIPEYTHWAAIYGSLLRPS VPLFVMMTGLLLLPVKQQPLGTFYKKRIFRVLFPFLIWSVLYNMFPWFTGVVGLDKSIIG QFFCYVQGNESQELMDSLKDVAMIPFNFSFKENHMWYIYLLIGLYLYMPFFSAWIERADQ KMKRTYLLIWFISLFLPYMAEYISGYLFGTSTWNAFGMFYYFAGFNGYLLLGHYGKKGND WGIWKTLLICAVLFAIGYYVTYSGFSAAAADPNHTESDMELFFTFCSPNVVLMTLAVFLL 
LQKVTVTNRIVIKALANMTKCGFGIYMVHYFVVGPFFLLIGPSSIPIPLQVPLMAAGIFL CSWAFTALMYKLMPNKAHWIMG >gi|225935357|gb|ACGA01000035.1| GENE 164 234185 - 235315 1187 376 aa, chain - ## HITS:1 COG:CC1418 KEGG:ns NR:ns ## COG: CC1418 COG2017 # Protein_GI_number: 16125667 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Caulobacter vibrioides # 10 376 11 378 378 287 41.0 2e-77 MKNLCVWAVAALLMASCTPKAEKTTDSGLLQSNFQTEVDGKKTDLFTLRNKNNMEVCITN FGGRIVSVMVPDKDGQMRDVVLGFDSIQDYISKPSDFGATIGRYANRINQGKFTLDGVEY QLPRNNYGHCLHGGPQGFQYRVFDAVQPNPQELQLTYLAEDGEEGFPGNITCKVVMKLTD DNAIDIQYEAETDKPTIVNMTNHSYFNLEGDAGNNAGHLLTVDADYYTPVDSTFMTTGEI VTVEGTPMDFRTPTPVGERINDYDFVQLKNGNGYDHNWVLNTKGDVTRKCASLKSPKTGI VLDVYTNEPGIQVYAGNFLDGSLTGKKGITYNQRASVCLETQKYPDTPNKPEWPSAVLRP GEKYTSQCIFKFSVDK >gi|225935357|gb|ACGA01000035.1| GENE 165 235445 - 236908 1561 487 aa, chain - ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 21 474 45 497 516 462 47.0 1e-130 MKKQIKYISAGMLAGMLLCGGELQASSRMTEMHVCLADAIQKDNRPEISNRLFRSNAVEK EILRVQKLLKNAKLAWMFTNCFPNTLDTTVHFRKGSDGKPDTFVYTGDIHAMWLRDSGAQ VWPYVQLANSDPELKEMLAGVILRQFKCINIDPYANAFNDGAIPNGHWMSDLTDMKPELH ERKWEIDSLCYPLRLAYHYWKTTGDASIFNEEWIQAIVNVLKTFKEQQRKEGVGPYKFQR KTERALDTVSNDGLGAPVKPVGLIVSSFRPSDDATTLQFLVPSNFFAVSSLRKAAEILEK VNKKTALSKECKDLAQEVETALKKYAVYNHPKYGKIYAFEVDGFGNHYLMDDANVPSLLA MPYLGDVNVNDPIYQNTRRFVWSEDNPYFFKGKAGEGIGGPHIGYDMVWPMSIMMKAFTS QNDAEIKTCIKMLMDTDADTGFMHESFHKDNPKKFTRAWFAWQNTLFGELILKLINEGKV DLLNSIQ >gi|225935357|gb|ACGA01000035.1| GENE 166 236911 - 239202 2159 763 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 34 756 11 715 717 417 33.0 1e-116 
MKKLALFAFTLLSAWSMTAKTITGPVDYVSPLVGTQSKHALSTGNTYPAIALPWGMNFWV PQTGKMGDGWAYTYDADKIRGFKQTHQPSPWINDYGQFAIMPVTGEAVFNQDQRASWFSH KAETATPYYYKVYLADHDVVTEIAPTERAAAFRFTFPENDHSYVVVDAFDKGSFVKVIPS ENKIIGYTTKNSGGVPANFKNYFVLVFDKPFTYTAAVASGVIDTNKLEVTDNHAGALIGF KTRKGEQVNVRVASSFISPEQAELNLKELGTANVEQIAAKGRKVWNDVLGRIEVKDDDID HLRTFYSCLYRSVLFPRSFYEIDAKGDVMHYSPYNGEVLPGYMFTDTGFWDTFRCLFPFL NLMYPSMNTKMQEGLVNTYKESGFLPEWASPGHRGCMVGNNSASVVADAYLKGLKGYDIE TLWEAVKHGANAVHPKVGSTGRLGHEYYNKLGYVPYNVGINENAARTLEYAYDDWCIYQL GKALKKPKKEIEIFAKRAMNYKNLYDPEHKLMRGKNEDGTFQSPFNPLKWGDAFTEGNSW HYTWSVFHDPQGLIDLMGGKDGFNQMMDSVFILPPIFDESYYRAVIHEIREMQIMNMGNY AHGNQPIQHMLYMYNYSGQPWKAQHWIREVMDKLYTPAPDGYCGDEDNGQTSAWYVFSAM GFYPVCPGTDEYVLGTPYFKEMKLHLENGKTVTISAPNNGDDKRYISSMTLNGKEHTKNY LTHQDLMNGATISFKMDAKPNQRRGTKESDFPYSFSNEFKKKK >gi|225935357|gb|ACGA01000035.1| GENE 167 239215 - 241692 2449 825 aa, chain - ## HITS:1 COG:no KEGG:BT_3526 NR:ns ## KEGG: BT_3526 # Name: not_defined # Def: glutaminase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1513 89.0 0 MKQQLMMLLLGTASVFCSCETQIEQHEKNELRAPAYPLVTIDPYTSAWSTTDNLYDGSVK HWTGKDFPLLGVAKVDGQTYRFMGTEELELRPLVKTSEQGSWTGKYTTQQPADGWQNAGF NDKAWKEGEAAFGTMENEHTAKTQWGEEFIWVRRVADIQEDLTGKNVYLEFSHDDDAIIY INGIKVVDTGNACKKNERVKLPEEVVASLKPGENLIAGYCRNRVGNGLLDFGLLVELDGY RSFHQTAQQTSADVQPMQTYYTFTCGPVDLKLTFTAPMFMDNLDLLSRPVNYISYEVASN DGKKHQVELYFEASPQWAIDQPHQESVADSFTDGDLLFLRTGSRNQEILKKKGDDVRIDW GHFYLAAEKENSTSAIGDGRELRKSFVANKLEAPTTNGYDKLALVRSLGETQKADGHLLI GYDDIYSIQYFGDNLRPYWNRQGNETIVSQFQKAEKEYKTQMKNCAAFDKKMMEEAIAAG GRKYAELCALAYRQALAAHKLVEAPNGDLVFLSKENFSNGSIGTVDLTYPGSPLLLYYNP ELVKATMNHIFYYSESGKWEKPFAAHDIGTYPLANGQTYGGDMPIEESGNMVVLAAAIAK VEGHAEYAQKHWETLTIWTDYLVEYGLDPANQLCTDDFAGHFAHNANLSIKAIMGVASYG YLADMLGKKEVAEKYTQKAKEMAAEWVKMANDGDHYRLTFDKPGTWSQKYNLVWDKLLNL QIFPKNVAETEIAYYLSKQNKYGLPLDNRETYTKTDWIMWTATLADDKATFEKFIEPVYL FMNVTPNRVPMSDWVFTDEPNQRGFQARSVVGGYFIKMLEGKLIK >gi|225935357|gb|ACGA01000035.1| GENE 168 241724 - 243895 1659 723 aa, chain - ## HITS:1 COG:no KEGG:BT_3525 NR:ns ## KEGG: 
BT_3525 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 723 1 721 721 1325 89.0 0 MNNKQLLIIFSLLLTGNNVSFAQTQAVNGKEAKTTFQTAEPWKSETDVRADATMVYGTLD KKGVSFEQRVQSWRDKGYQTQFMTGVAWGDYQDYFLGKWDGIDGHLKEGQRDRNGNEIAH GHLIPYIVPTESFIRYMQETQIKRVIDAGITSIYLEEPEFWMRGGYSEAFKSEWQKYYNF PWRPQHESPENTYLSNKLKYHLYYHALDKIFTYAKEYGKSKGLDVKCYVPTHSLINYTSW QIVSPEASLASLDCVDGYIAQVWTGTAREPNFYNGVKKERVFENAFLEYGCMKSMTAPLN RKMYFLTDPIEDRAKDWLDYKINYQATFAAQLMYPMVDTYEVMPWPDRIYQGLYRIAGTD QKERIPRSYSTQMQVMINTLNDIRTSDKQINGTHGIGVLMANSLMFQRFPDHNGYDDPQF SSFYGQTLPLLKRGIPVELVHMENTPFQETFKGLQVLVMSYSNMKPMKPVYHNYLADWVK KGGTLVYCGEDVDPYQTVLEWWNTEGNAYKAPSEHLFEAMGLSRTPGDGTYHSGKGTVIV MREDPKHFVLKNGNDRKYFETIASAYLNQTGKKIEIKNNFVVERGPYTIAAVMDESSSKE PLKLSGLYIDLFDKDLPVLTVKQINPGEQGYLYDLSKVSGKVKAKVLCGASRIYDEKVGK QSYSFVAKSPLHTTNVSRILLPRKPGKVLVNGKTEQSEWDESSKTLLLSFENDPAGVNVS IEW >gi|225935357|gb|ACGA01000035.1| GENE 169 243924 - 245183 1170 419 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 84 411 45 334 341 72 25.0 2e-12 MKSKILLALTLLLGASTTIWAVGNLGKANQKKHAYTNEDVWAAYEGFNNTLLDSSKYIYK TSSSYPSAVDRGNGAAAIWCQPIYWDMAMNAYKLAKAQKDKKKTREYKALCEKIFAGNKA QYCQFDFDDNNENTGWFIYDDIMWWTISLARAYELFGVDEYLKLSEASFKRVWYGSEKVG DTGSYDKENGGMFWQWQPIRNPKPNKFGDGKMACINFPTVVAALTLYNNVPENRKESTSK RPDYQTKAQYLAKGKEIYEWGVENLLDKETGKIADSRHGNGNPAWKAHVYNQATFIGASI LLYKATGEKRYLDNAILAADYTVKDMSAEHKVLPFESGIEQGIYTAIFAEYMAWLVYDCG QTQYLPFLKHTIKTGWANRDETRNVCGGEYHKKLPAGAEINSYSASGIPALMLLFPAKK >gi|225935357|gb|ACGA01000035.1| GENE 170 245415 - 246881 1297 488 aa, chain - ## HITS:1 COG:no KEGG:BT_3523 NR:ns ## KEGG: BT_3523 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 484 4 489 491 525 58.0 1e-147 MKCLTNKWREGAMLLSFLLISCLAGIFTACDDIEDEYITDTQLSILQENRTSLNYLLKNS 
TYGTAPGTYPETGKDVLNAAIAELDALIERVEGGEELDETTLEAAVAKVNKAIDEFKNSK FYNLSPEAQQYINNLLAKADEILTIVNDDTKWGNHQGQYPVDNKSVLESAAQDLESLAER IKSGSITDMTQEIYDEAIAAADKKVEEVEDSAWPDNSQITWNLFVDGNAGSYIDFGYSED YVKFGEDDNQAFTIELWVNIKEYCNKQGEDNCTFLSTMTNDPYWSGWRAQDRMKGLLRTM VAHWEDDNHTNPQEWEPGWKKSDNWTKDRWTHYAFLFRDKGLPGFDTPTDVKCYSMIDGT RQGDPIRVGESWRTYINEQSIVNQVKMTGFCMMDNNGNRNEWFSGYIKKIRIWKTNRTEN QVYASYMGNEEGVSADNPNLVEAWDFEVKGDQPTQSATRTITGLKGHKATLKGDDWQWIE STDITDNK >gi|225935357|gb|ACGA01000035.1| GENE 171 246904 - 248046 1084 380 aa, chain - ## HITS:1 COG:no KEGG:BT_3522 NR:ns ## KEGG: BT_3522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 380 5 390 390 513 68.0 1e-144 MNTKYSNWKMWYYVLCMVLTLQLAACSEETHDEYTAAPEIEDAYIDQLDALIAEMTDLQQ NSEYGDKKGQYPTESRAILTDAIDDANRAVLLIKYQKPSPSESEKQRYVAEAEAAIEQFE STIRTEDAETTPAELFVDGRGDGGSYIDFGRSEEYVNFGTEGNQAFTVEFWVKVTKGGGK DQNVFLSTYMGGDGWRNGWMMYWRNADGGIYRASWGETGGNICEPSLKAPEDGEWQHFLF VYSDKGLPGSPEYRAKLYVNGEMKTTEGSVGSRFYNSSNYASYNTPMTAFGRYMRTSDNL FEEGFAGYMKKIRIWKSAKDNEYIQSSYNGTAEVTGKEEDLAAAWDFTTKPSGSGNEVID LTGRHTAKIIGTYEWQRIVE >gi|225935357|gb|ACGA01000035.1| GENE 172 248072 - 249391 1347 439 aa, chain - ## HITS:1 COG:no KEGG:BT_3521 NR:ns ## KEGG: BT_3521 # Name: not_defined # Def: alpha-1,6-mannanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 439 2 429 429 691 77.0 0 MKQYIFSAVCLMSGVFCMSSCNEDKQAKPYTPDYEIVPEYTNADTWTAYEAFNDYLLDPD KNIYKTSTAYTAAVDRNNGAAAIWCQPIYWDMAMNAYKLAKAEGDTERENKYKQLCDDLF AGNKAHYANFDFDDNNENTGWFIYDDIQWWTITLARAYELFKVDEYRSLAEASFARVWYG SEKVGDTGSYADPEKNLGGGMFWQWQPISNPNENVASEGKMACINFPTVVAALTLYNNVP ADRVADPNPESWSNEYGNFTRPHYETKDAYLAKGKEIYEWAVKNLVDSNTGEVADSKHGE GNPAWSDHVYNQATFIGASLLLYKATGEKTYLDNAVLGADYTMNTMSATYDLLPFESGVE QGIYTAIFAEYMAMLVNDCGQTQYLPFLKRNINYGWANRDQTRNLCGGEYHKAQIEGATI DSYSASGIPALMLLFPADK >gi|225935357|gb|ACGA01000035.1| GENE 173 249418 - 251337 1606 639 aa, chain - ## HITS:1 COG:no KEGG:BT_3520 NR:ns ## KEGG: BT_3520 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 639 1 642 642 1058 79.0 0 MKKYNYIIVSLLTCLLTTSCNDYFDQVPDDRLSLKEIFTTRDGALRYLSNVYTFLPDEFN QRQVHETSLYRTPGPWTGSSDEAEWTNDNKGKLINNNSIDATEGTMVLYRWKSWFSGIHE AAVFTENVDQAPLTVTERNQWKAEARALRAIYYFYLVRTYGPVPLLEKDFAMDTPSDELQ LPRNTVDECFDFIVSELKGAQNDGLLDDASTDKVSGYGRIDKAIAQAFIIEALTYRASWL FNGESTYYSDLANNDGTKLFPSRPDEATKRANWQRVIDECNTFFSNYGSRYHLMYTNKDG VSVSGSDSEGFSPTESYRRAVRTLFSEMGNNKEMIFYRLDNSAGTMQYDRMPNRSGNTTD YRGGSLLGATQEMVDAYFMSNGESPISGYSADGVTPIINEESDYVEEGVSTTEYKGTDGT LYAPVGTRMMYVNREPRFYADITFSNSKWFDGTEGGYVVDFTYSGSCGKEQGSNDYSSTG YLVRKCMDSGDRNQNLVCVLLRLTNIYFDYVEALAHVSPTHEDIWTYMNMIRKRAGIPGY GETVNLPKPTTTDEVMELIRKEKRIELSFENCRYFDVRRWGLVNEYFNKAIHGMNVNYDG NEFFKRTEIVKRIFDRQYFFPIPQGEIDIDKNLVQNTGF >gi|225935357|gb|ACGA01000035.1| GENE 174 251349 - 254789 3369 1146 aa, chain - ## HITS:1 COG:no KEGG:BT_3519 NR:ns ## KEGG: BT_3519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1146 6 1151 1151 1988 87.0 0 MFNMKNAIRERKSKVLGLFLCFLLLGIDYSFASYNNYSQFKTLSVSVNNSTLREVLKTIE KSSQFVFFYLDDAVNLDRKVSIDSKNKKIEEILSELFEGTSCTYRISDRQIFISGKAPAP NEQQQNKRKITGRVTDVKGEPLIGVNVTVDGDANGSITNMDGLYEIFVTKKSVVLKFTYI GFKTSEIRTNASTNIYDVALEEQVNELEETVIVGYGTQRKISNIGAQSSMKMEDIKTPSA SLTTTLAGRLAGVVAVQRTGEPGKDAADIWIRGISTPNTSSPLVLVDGVERSFNDIDPED IESLTTLKDASATAVYGVRGANGVILIKTKPGKVGKPTVSADYYESFTRFTKMVDLADGI TYMNAANEAMRNDGIATKYTEDQIRNTIAGKDPYLYPNVDWLKEIFNDWGHNRRVNVNVR GGSEKVAYYASVSYFNETGMTVTDKNINTYDSKMKYSRYNFTTNLNIDVTPTTKVEIGAQ GYLGEGNYPAISSADLYNAAMSISPVEYPKMFFVNGEAYVPGTSTNNNFNNPYSQATRRG YDNLTKNQIYSNLRITQDLDMLTKGLKLTAMYAFDVYNEIHVHQDRAESTYNFLDTSVPY DMDGQPILQRIYEGSNVLSYTQETSGNKKTYLEASLNYDRTFNDDHRVSALFLFNQQSKL LYPKGTLEDAIPYRMMGIAGRATYSWKDRYFAEFNIGYNGAENFSPKHRFGTFPAFGVGW VISNEKFWQPLSKAVSFLKIRYTDGKVGNSEVSDRRFMYLDQMKENGDYGYKFGPNGTKW SGYETVNMAVDLIWEESRKQDLGIDIKLFNDDLSIVFDLFKERRENILLKREHSIPSFLG YNTSAPYGNIGIIENKGFDGTIEYNKRINKDWVLALRGNITFNKDKWIQGELPEQKYEWM NQYGRNINGVKGYVAEGLFTQAEIDDMARWESLSDANKAITPKPFASQFGTVKAGDIKYK 
DLNNDGQIDAYDQTYISRGDVPTTVYGFGFTVGWKDLSVGMMFQGVAGAERVLNGSSINP FNGGGGSGNLYSNIGDRWTEENPDQNAFYPRLSYGSETTSNINNFQKSTWWVRNMNFLRL KTLQISYNLPKPWVNKVHLKNAAVYVMGTNLFTLSRFKLWDPELNTDNGASYPNTTSYSV GINFTF >gi|225935357|gb|ACGA01000035.1| GENE 175 254939 - 256138 1020 399 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 196 384 71 256 274 79 30.0 2e-14 MNSEQKHTDFSLYTFEEFLQNDFFISSMNYPTEETQKFWDEFEQMNPSNIDEYIAAKRYL KVFSKEKDEVLSNQETDDLWTRIQATNINKEKARRKNYFLIGLSAAASVAILVGSFFFLK NYSSALDPDIATFAVNTKADLPLTEETLLILAEDNVVSLKEKETEITYDSVEIKTNQESI QKEKSATYNQLVIPRGKRSVLTFADGSKVWVNAGTRVIYPVEFEQDKREIYVDGEIYIEV ARDENRPFYVRTKDMNVRVLGTKFNVTAYESEAIRSVVLAQGCVQVETERTPKAILAPNQ MFSSADGKENITQVDVEEAISWVNGLYYFQSADLGIVLKRLSTYYGINVEFDPALSKIKC SGKIDLKDNFETVINGLTFVAPISYAYDEQYKTYRVVKK >gi|225935357|gb|ACGA01000035.1| GENE 176 256255 - 256827 310 190 aa, chain - ## HITS:1 COG:no KEGG:BT_3517 NR:ns ## KEGG: BT_3517 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 273 78.0 2e-72 MKSDIHTLSDSLLWKRFLEGDSSAYTQIYNQTVQDLFRFGLLYTSDKELIKDCIHDVFVK IHMNRAKLAPTDNITAYLTVALKNTLFNALKKTTDSLPFDEIGEREDTVDNSPSTPETIY INNEQEKLVQTTVHSMMSVLTDRQREIIYYRYIKEMSIDEISEVTDMNNQSVSNSIQRAL GRIRDLFKRK >gi|225935357|gb|ACGA01000035.1| GENE 177 256933 - 257922 936 329 aa, chain - ## HITS:1 COG:CC0813 KEGG:ns NR:ns ## COG: CC0813 COG3507 # Protein_GI_number: 16125066 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 32 291 63 312 540 70 27.0 4e-12 MKLKHLILFSIICCSQSVFGQKAGQQPKTSGNPVFPGWYADPEGIVFGDEYWIYPTYSAP YDEQTFMDAFSSKDLVNWTKHPKVLSKENISWFKRALWAPAVIHANDKYYIFFGANDIQS NNELGGIGVAVADNPAGPFKDALGKPLIDKFVNGAQPIDQFVYKDDDGQYYMYYGGWGHC 
NMVKLAPDLLSIVPFEDGTIYKEVTPEKYVEGPFMLKRNGKYYFMWSEGGWTGPDYCVAY AIADSPFGPFKREAKILERDPNIGTGAGHHSVVKGPGADEWYIIYHRHPLGEMDGNARVT CIDRMIFDKDGKIKPIKMTFEGVEASPLK >gi|225935357|gb|ACGA01000035.1| GENE 178 258088 - 259197 409 369 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus degradans 2-40] # 39 351 5 310 314 162 34 2e-38 MKSGISLLTVLLIIICASCQPKKKTEPEKETITGDTYTNPLREKGAEPWGVFHEGKYYYT QGSESRIVLWETSDITNLNDSLKKPVWIPNDPSNSHHLWAPEIHRINNKWYIYFAADDGN MDNHQIYVIENEAAIPTEGKFVMKGRIPTDKNNNWAIHASTFEHNGQRYMIWCGWQKRRI DSETQCIYIASMENPWTLSSDRVLISKPEYEWECQWVNPDGSKTAYPIHVNEAPHFFQPK NKDKVCIFYSASGSWTPYYCVGLLTADANANLLDPASWKKHPTPVFQQEPKNEVFGPGGS SFIPSPDGKECYMVYHARQIPNDAPGAMDSRSPRLQKIEWDKDGMPVLGVPDKAGTKLPK PSGTKTADK >gi|225935357|gb|ACGA01000035.1| GENE 179 259408 - 261231 1849 607 aa, chain + ## HITS:1 COG:L0025 KEGG:ns NR:ns ## COG: L0025 COG3250 # Protein_GI_number: 15673962 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Lactococcus lactis # 109 485 109 458 996 91 25.0 4e-18 MNMKKNFLTMLLALALCSSTFAQWKPAGDKIKTSWGEQLDPKNVLPEYPRPIMERSDWKN LNGLWKYAITKKGAPAPVAYQGDILVPFAVESSLSGVGKMINEKEELWYQRTFDVPSNWR GKQVLLHFGAVDWKAEVWVNDVKVGEHTGGFTPFYFDITSVLNKGNNDLVVKVWDPSDRG EQPRGKQIANPHGIWYTPVTGIWQTVWLEPVATQYIANLKTTPNIDNNSVKVEVAANTTS ADKVEVKVFDGKNLVAKGAALNGVPVELAMPANAKLWSPDSPFLYNMEVTLYKDGKAIDQ VKSYTAMRKYSVRKSPNGITRLQLNNKDYFQFGPLDQGWWPDGLYTAPTDEALVYDLKKT KDFGYNMVRKHVKVEPARWYTHCDQLGLIVWQDMPNGGPSPQWQARNYFNGTEVIRSAAS EANYRKEWKEIIDCLYSYPSIAVWVPFNEAWGQFKTPEIVAWTKEYDPSRLVNPASGGNH YTCGDILDLHHYPGPNMFLYDPRRATVLGEYGGIGLVVEGNTWVNDKKNWGYVKFNTSDE VTNEYIKYGRHLLELIQKGFSAAVYTQTTDVEGEINGLLTYDRKVIKMDEAKIREINQKI CNSLNKE >gi|225935357|gb|ACGA01000035.1| GENE 180 261331 - 262851 1184 506 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # 
Organism: Salmonella typhimurium LT2 # 11 399 20 408 497 157 30.0 5e-38 MNKLITISSALLAVTVAKAQTPNIVFILADDLGYGDISAFNPDSKIHTPNIDKLAEHGIA FTDAHASSALSTPSRYSLLTGRYPWRTKLKRGGLDGDSPAMIDPERRTIAQMFSANGYNT ACIGKWHLGWDWGYTDNGRSMKDIDFSLPIKNGPTDRGFDYYFGIPASLDISPYVYVENN KATSIPDHVIEPQKKNLALLMHGGMAGADFKPEECFPNIIRHSLNYINEQKDSKKPFFLY LPITAPHTPILPSKEFQGKTSIGPYGDFVVMIDDMVRQIVETLKKNKQLDNTIIVFASDN GCAAYIGVKEMEKQGHFPSYIYKGYKSDIYEGGHRIPLIVSWKGKYGKETNNSLVSLIDF YATFAQMLNHNLETEEAVDSYSMWPILSKKGTSARKDLVHEAGEGYLSLRTPQLKLVFYG GSGGWTYPVKPVDVAKFPPMQLFDIVKDPSEKENIIGDKRYENEVKEMKRTMKKYLEEGR STPGEKVSNDTENSWNQVKIFMQEEE >gi|225935357|gb|ACGA01000035.1| GENE 181 263414 - 264793 1215 459 aa, chain - ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 23 438 7 421 465 122 25.0 1e-27 MEKSMISMCLLGGLAIGNGFAQTPSKPNILLIIADDCSYYDIGCFGAVNNKTPHIDALAE QGIKFNNAYNSVSMSTPTRHCVYTGMYPMHHGGYANHSSVNADVKSLPTYLGNLGYRVGL AGKWHIKPLANFPFEDVPGFPKGCTSTNTDYHTKGIEKFMERDSSQPFCLVLASINSHAP WTGGDASVFDRKKLQLPPQFVDTEVTREYYARYLAEVGLLDQQVGDAMQILKDKDLLQNT LVIFISEQGTQFAGAKWTNWSAGVKSAMVASWPGVIKPGVETSAIVQYEDLLPTFIDVAG GEIPDVIDGKSLLGVFQGKTKTHHKYAYHVHNNVPEGPAYPIRSISDGHYRLIWNLTPEE TYVEKHIEKAEWYLSWKAQDSDRAHKILNRYKNRPEFELYDIKKDPFEMNNLADVKKYSK KKAELTMELQKWMKQQNDTGADKDQPRTPKNKQKKAANA >gi|225935357|gb|ACGA01000035.1| GENE 182 264828 - 266204 1343 458 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 9 429 14 474 497 178 30.0 2e-44 MKKEFFGLLCGCTLLPAFLHAQTERPNIVIVLADDLGWGDVGFHGSEIKTPCLDALVGEG VELERFYTSPISTPTRAGLMTGRYPNRFGVRSAVIPPWREDGLDENEETMADMLARNGYK NRAIIGKWHLGHTKKVHYPMNRGFSHFYGHLNGAIDYFDLTREGELDWHNDWETCHDKGY STELITKEAIRCIDAYEKEGPFMLYVAYNAPHTPLQAQEKDIKLYCDNFDSLTPKEQKKV 
TYSAMVSCMDRGIGAIVDALKKKGIMDNTFFIFFSDNGSAGVPGSSSGPLRGHKFDEWDG GVHAPAVLCWKKAEKQYKNLSSQVTGFVDLVPTLKELVGDHSRPKREYDGISILPVLNGK KSCIDRDFYLGHGAVVNKDYKLIRKGMKPGLDLKQDFLVEYKTDPYEKKNASIGNEKIVK ALYQVAVKYDTITPCLPEVPYGKGRDGFKAPKEWKVVR >gi|225935357|gb|ACGA01000035.1| GENE 183 266259 - 267629 1153 456 aa, chain - ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 24 438 9 421 465 122 26.0 1e-27 MNKLLLSASVALGATSLSFAQQTEKPNFLLFIADDCSHYDLGCYGSVDSKTPNIDRFATQ GVRFTQAYQAVPMSSPTRHNLYTGLWPVRSGAYPNHTCADKGTLSVVHHLQPLGYKVALI GKSHIAPKSVFPFDLYVPPLKGGELNFEAIQKFISDCKANGEPFCLFVASNQPHTPWNKG DASQFNADKLTLPPMYVDIPQTRELLTHYLAEINFMDQEFGNVLSILDKEKMTDKSVVVY LSEQGNSLPFAKWTCYDAGVHSACIVRWPGVIKPGSVSDALVEYVDIVPTFVDIAGGKPQ AKVDGESFKPVLTGKKKEHKKYSFSLQTTRGINAGSPYYGIRSVYDGRYRYIVNLTPEAT FKNVETNSPLFKEWESLAETDAHAKAMTTKYQHRPAIELYDVKNDPYCMKNLAEDATQTA TISRLDKELKRWMKDCGDEGQATEMRAFEHMPGKKK >gi|225935357|gb|ACGA01000035.1| GENE 184 267648 - 269240 1306 530 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 22 524 3 527 536 314 34.0 3e-85 MVKKNFLMTAPLLMASFCSAQEKPNVVLIMVDDMGYSDIGCYGGEILTPNIDALASKGVR FTQFYNTSRSCPARASLMTGLYQHQAGIGQMSEDPFKQENQKSPNDWGVPGYKGFLNRNC VTIAEVLKEGGYHTYMAGKWHLGMHGEEKWPLQRGFERFYGILAGACSYLRPSGGRGLTL DNTKLPEPEAPYYTTDAFTDYAIRFVDEQKDDKPFFLYLAFNAPHWPLQAKEEDIQKFTK IYREKGWNEIREARRKRMAKLGIIDSNTEFAEWENRNWDELTEKEKDEVAYRMAVYAAQV HCVDYNVGKLLDYLKKNHKLDNTLVMFLSDNGACAEPYAELGGGKVSEINDPTHSGMPSY GRAWAQVSNTPFRKYKCRSYEGGISTPLIVSWKNNLNNKKGEWCRVPGYLPDIMPTILEA TGAAYPETYHGGNKIHPLVGSSLFPAIHKKTDSLHEYMYWEHQNNRAIRWGNWKAIRDEK GKEWELYDVVKDRTERNNLAEQHPEVLTKLVTEWEKWANANFVLPKHPKK >gi|225935357|gb|ACGA01000035.1| GENE 185 269501 - 271291 1039 596 aa, chain + ## 
HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 41 573 15 578 603 128 24.0 3e-29 MRKTQLVLVGLLLSFTVNATNDIPRPEYPRPQFERAEWINLNGTWTYEFDLSNSGKNRQL TTAKQLSNTITVPFCPESKLSGVNHTDFIEQMWYQRSISIPENWKDKKIMLNFGAVDYHA EIYIDGNYLGNHDGGSSSFSLDITPAVEPGNTHSLVVFVADKTRSGLQAVGKQSTQYNSY GCFYTRVTGIWQTVWLEAVSPYGLRSARTNPDIDQQQLIVTPEFYRSSNDETLEITLYDN SRQVSRKVVKCNNGSSVVLPVKDMKLWSPENPYLYDITYRIKNANGEVIDEVKSYVGMRK VHTSNGQIYLNNEPYFQRLVLNQGYYPDGIWTAPTDEALKNDILLSKEAGFNGARLHQKV FEERFHYWADKLGFITWGESANWGMDYKEEEAARYFLTEWSEILMRDYNHPSIIAWVPFN QPLENPYTLISGKMPRLIIDTYRLTKAIDPTRLVNGIAGDTHFLTDIWGIRNYESDTARF ARYLKPNEKQAFYNHQPFFIGEFGGMLWTGAHKDKTSWGYGKTITSEEGFYERLEGFMDA IANAKEVTGFCYTQLTDIEQEKNGIYYYDRRPKLDMKRIKTIFEKIPSNRFLKIKE >gi|225935357|gb|ACGA01000035.1| GENE 186 271298 - 273859 1691 853 aa, chain + ## HITS:1 COG:no KEGG:BT_3508 NR:ns ## KEGG: BT_3508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 853 1 851 851 1398 78.0 0 MIKKRILVISVLCTLLTVYVSGEVPPAKVVFQQYMNQAQTFANNFPREKAYLHFDNTSYY VGDTIWFKAYVTLAEKQTFSLISRPLYVELVDQTGHIADKQIIKLTQGQGNGQFILPHSM LSGYYEVRAYTRWMLAFNEPQYFSRTFPIYQLSNSDKLERSITTYELSSSMENRPSETEE KLNVRFFPEGGQLVEGVTSQVAFKAESKNGGNIELSGTIYTKEGTEITSFETLHDGMGHF EYTPSAQPAIAKVSFQGKKYEFTLPQALPNGYVLSTVSNAGALLVRVSCNAATPQDTLAV FISHQGRPSIHQLISCRADAPQEFILPTRKLPAGVLQVSLINRAGNTLCERFVFANPRAP LQISTEGLKEVYAPYAPIRCELQVKNAKGEPVSGELSVSIRDGVRSDYLEYDNNIFTDLL LTSDLKGYIHQPGYYFASPSPRKQTELDILLMVHGWRKYDISQAISTAPFTPLQLPEAQL VLNGQVKSTILKNKLKDIALSVIVKKDDQFITGGTVTDENGRFTIPVEDFEGTTEAVIQT RKVGKERNKDASILIDRNFSPAPRAYGYKELHPEWKDLTYWQQKAESFDSLYMDSIRRVE GLYVLDEVEIKSKRRQGSNMATKISEKSVDAYYDVRRSVDVLRDNGKIVTTIPELMEKLS PQFYWDRTNDKHTYRQKPICYIMDNHILSETETQMMLTEVDGLASIIISKGTGGIDDDII QNTKMSEVTDSTGVDISKLDKYSVFYLIPLPRRDVLNKSQSAVLGTRQTVIQGYTRPLEY YSPAYPTKELYMDKVDKRRTLYWNPSVQADENGKAVIECYNNQYSTPLIIQAETLGKDGQ IGSMRYSTIGQIE 
>gi|225935357|gb|ACGA01000035.1| GENE 187 274008 - 275693 1507 561 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 29 546 3 520 536 293 36.0 7e-79 MNSKLLLLPTALMVAGHSAAEAKGKKSDKRPNILVILADDLGYSDLGCYGSEIHTPNLDK LAQEGVRFNHFYNASRSCPTRASLLTGLYQHQAGIGRMTFDAHLSGYRGTLSRNAVTIAE VLKEAGYTTSMVGKWHVAETPLRKDQREWLAHRVFHDTFSDLCHYPVNRGFDSHYGVIYG VVDYFDPFSLVEGEVPIKEVPKGYYITQALSDRAVQEVEEYAKDDKPFFMYLAYTAPHWP LHALPEDIEKYKDTYKVGWEAIRNARYERQKQLGIFPGMDNFLSERQFHDKWEDNPHAEW DARAMAVHAAMIDRVDQGIGQVIEALKKTGQLDNTLILFLSDNGCSNEDCQNMSGGENDR PDMTRDGKKIIYPRNKQVLPGPQTTYASLGARWSNVANTPFRFWKAKSYEGGICTPMIAH WPKGIKKNVGGMTSEIGHVMDIMATCVDLADAEYPTTYKGHDILPMEGKSLLPIFKTGHR KGHDYLGFEHFNERAFLSNDGWKLVRPKNNSQWELYNLNEDRSEQHDLAAKYPEKVTEMA KAYEAWAKRCMVEPYPGQKKK >gi|225935357|gb|ACGA01000035.1| GENE 188 275731 - 277083 1065 450 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 11 440 17 474 497 205 30.0 2e-52 MNIKKHLPWLGALLPVASMQAETKPNVVVIYIDDMGIGDIGCYGGKFVPTPNIDKIAQDG LLFNQYYSSAPVSSPSRCGLTTGKFPIEVGINTFLNNKASNKKCEQRNFLSDKNPSMARA FQSAGYVTGHIGKWHMGGGRDVHNAPSIKNYGFDEYISTYESPDPEPAITATNWIWSAKD SVKRWRRTEYFVNKSIDFVKRHKDQPFFLNLWPDDMHTPWVPEFKQKDNKSWNTEEAFIP VLAEMDKQIGRFIKALDDMGLSENTIVIFTSDNGPAPSFQSARAAYLRGTKNSLYEGGIR MPFLIKYPKKIKAGQVNNESVLCAVDLYPSLCAIAGIETEKGYRGDGQNYDKVLLGKSNA KRKTDLMWDFGRNQFFKKPGNANDRSPHLAIRSGNWKLLINSDGSDAQLYDIEKDKFEKN DVAQSHPEVVAKLSKKVCKWFAENKDKGKE >gi|225935357|gb|ACGA01000035.1| GENE 189 277303 - 278322 915 339 aa, chain - ## HITS:1 COG:no KEGG:BT_3473 NR:ns ## KEGG: BT_3473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 339 1 333 333 301 49.0 3e-80 
MKNIYIYLSLLAVIVLGTACNNEWEDEQYEQYVSFKAPIASGGDGVTTIYVRYKDNGKVT YQLPVIISGSTVNGQDRDIHIAVDKDTLKTLNIERFSLYRPELWYTEMEEDKYEFPETVH IPAGSCVEQLNIDFNLQGIDMLEKWVLPLTIVDDGAYDYQSHPRKNYAKALLKVVPFNDY SGSYTASSMKVYTYINGKPDTNARTTNKRTGYVIDNNSVFFYAGLINEDMDKDIRKKYKI NVHFREDGTLDMKPDDPNNEMEFELIGTPIYSSTSIMDATRPYLERRYVQIMFEYDFQDF TYGGSDTEVIPIKYRVEGSMTLQRNINTQIPDEDQQIEW >gi|225935357|gb|ACGA01000035.1| GENE 190 278343 - 280391 1787 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3474 NR:ns ## KEGG: BT_3474 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 680 1 679 681 663 53.0 0 MRKIYKLYIGIMLSAIAVACSDKLDSDKYFKDRRSLEDVFTDKESTEEWLANAYSYLGGY NLEVSNMLNTITNFADDIYFGGNSLDAYKTFKTGTYDEGYRHSWGDSYKGIRQASIFIQN IDMNTKYTEEERADLKAQARFVRAYYYWLLLRKYGPVPLLPDDGLDYTADYDDLEIQRNS YDECVEYIESEMCLAAKDLPLARAINQVARPTRGAALAARARALVYSASPINNPRPGDPD KFSDLVDYEGRCLMSQEYNEYKWARAAAAARDVMELPGENNGRRYELYHKKATNIAEPGY PATIAPYQDKDFSTKTWNEGGYSDIDPYESYRAVFNGSLAMYQNPELIFSRGRNQGANSI AEMVKLQMPKTLGGGSNAYGMTQKMCDAYYMANGDEFSREHFKEEYPSGTRFVTKEEVEA GTYPQLKEGVYKEYANREPRFYASVSYNGCVWALLKNAETTDYKNDVEKQVNYYYGINTD GFSGTGVYLRSGIGIMKYVHPDDTNRKEIKAKAEPAIRFAEILLIYAEALNELEDGSSYD IASWDGSTSYSVKRDIDEMKKGIRQIRRRAGVPDYTMSEYQDRDVFRKKLKRERQIELMA EGQRYFDLRRWKDAEVEESKKNYGCNFYMSDTEEQREQFYTPIEIADLPTAFSRKLYFWP ISHDELKRNKRLTQNPGWTYND >gi|225935357|gb|ACGA01000035.1| GENE 191 280407 - 283496 2680 1029 aa, chain - ## HITS:1 COG:no KEGG:BT_3483 NR:ns ## KEGG: BT_3483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1029 1 1046 1046 1375 66.0 0 MRIKEFITICLLLVSVSLFAQEKTIEVTGVVTDTNKEPLVGVNVTVKDKPGLGAITDING RYKINIEEFSRLVFSYIGFDTQEILVKRQSLVNVSMKESDAREIDEVVITGTGAQKKLTV TGAVTNVDVEVLKSNPSANLSNALAGNVSGVLAMQTSGQPGVNTSEFWIRGISTFGANSS ALVLVDGFERDLDEINIEDIESFSVLKDASTTAIYGPRGANGVVLITTKHGKEGKIKINA KVETSYNARTITPEFADGYSYAMLMNEARITRNQERIYQEDEMEILRLGLDPDLYPNVDW MDVLLKDGAMTYRANLNMSGGGSTARYFVSLSYVNEEGMYKTDESMRKDYNTNPSSQRWN YRLNTDIDVTKTTLVKVGVSGSLKKRNAPGQGSNVWTSLMGQNPVSIPVMYSNGYIPAHG 
VEDNRKNPWVLATQTGYKEIWNNKIQTNISLEQKLDFITKGLRFEGRFGFDTNNQSEINR IRMPETWRAQRERDDDGNLIFKQQSAEKPMEQTSSSTGERREFLEAYLQYNRAFKAHHVG GTLKYSQDAYRTTVDIGTDVKNGIAKRHMGLAGRASYNWNYRYFADFNFGYNGSENFADG HRFGFFPAFSLAWNIAEEKLVKKHLKWMNMFKLRYSYGKVGNDKMDVRFPYLYTIKDDGS GWTWSDYGSTDNVYTGMQYTQLASNNVTWEIATKHDAGVDLSLFNDKFTATVDYFHEQRD GIYMERKYLPGIVGVNSNPKANVGSVRSKGFDGNFAYKQKIGKVNLTVRGNFTYSKNEIL EKDEMNAVYPYQKEAGYRVNQAKGLIALGLFKDYDDIRNSPQQTNWGKVQPGDIKYKDVN GDGIINASDEVAIGATTKPNLIYGFGISAQWKGFDFNAHFQGAGKSSFFINGSTVYAFKD SQWGNILTNLVKDRYVDAETAATLGIPANENPNASYPRLSYGGNDNNYRASTYWLRDGSY LRLKTLEVGYTLPKSIVNKMRFNKIRVFFIGTNILTFAKFKEWDPEMGVSNGEKYPLAKT FTLGLTVNI >gi|225935357|gb|ACGA01000035.1| GENE 192 283509 - 285008 1001 499 aa, chain - ## HITS:1 COG:no KEGG:BT_3476 NR:ns ## KEGG: BT_3476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 480 15 457 457 346 41.0 1e-93 MKCKTMKHGSIGLQCVMLAFIFAVCFSSCKDSKDETSAPYDPSQPVEVTDFTPKSGGGKK KMIIYGSNFGIDRSIISVKVGGKEAIVISAKGTSIYCSTPESCFEGTIEIKVGEQTVIAS EKYQYEPQMVVTDLCGDVDELGEGNIVETGPFDDCGKIDYPFWFSFDPQHSNILYLSQNN GLLRVLDLEQEMIYTKKLDNINRATTITWTNDDDMILAAPQNGGAKNRNNIILKRESSTD GQDFKESSWVALARGNACNGSMVLPQTGELYFNHRATGNVYRYNFEENGYNNNPEVPGGA GLNELLAFSVPNQTVDFSMVPHPTGKYVYIIMHESHYILRSNYDEETKKLVTPYIVCGQS GQADYKDLVGINARINKPGQGVFVLNEEYKAANKDDWYDFYFADKENHCIRILTPDGVVS TFAGRGSASASSYKWGKQNGEVRERARFNQPVALAYNEATKTFYVGDSGNYKIRKIAKEQ APDDLGGEESNDNQNQENQ >gi|225935357|gb|ACGA01000035.1| GENE 193 285026 - 285550 398 174 aa, chain - ## HITS:1 COG:no KEGG:BT_3477 NR:ns ## KEGG: BT_3477 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 173 667 839 840 292 79.0 4e-78 MGVAGYSEMARMLGLNDVADKYAATAKQMAIKWEEMANEGDHYRLAFDRKNTWSQKYNMV WDKLWDLKLFPNNVIGKEINYYLTKQNPYGLPLDSRKEYTKSDWIMWTAAMASDKETFQK FSDPVYKYINETVSRVPISDWHHTDSGRWVGFRARSVIGGYWMKVLMDKVQNNQ Prediction of potential genes in microbial genomes Time: Fri May 13 08:46:51 2011 Seq name: gi|225935356|gb|ACGA01000036.1| Bacteroides sp. 
D2 cont1.36, whole genome shotgun sequence Length of sequence - 4445 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1913 1363 ## BT_3477 glutaminase A + Prom 2332 - 2391 6.3 2 2 Tu 1 . + CDS 2447 - 3730 359 ## BT_3478 integrase + Term 3732 - 3776 1.1 Predicted protein(s) >gi|225935356|gb|ACGA01000036.1| GENE 1 2 - 1913 1363 637 aa, chain - ## HITS:1 COG:no KEGG:BT_3477 NR:ns ## KEGG: BT_3477 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 637 1 637 840 1125 84.0 0 MKKLLTVMAFSVGLFANAQTDNLFKPVKEVALRTPSVPIVVSDPHFSIWSPYDKLMEGST EHWTTAKKPLVGALRVDGKVYRFLGKDQVALIPIAPMTNVERWEAAYTNRQPANGWQEFQ FDDSSWKKGKAAFGSRDMPRVRTEWKGDNTDIYIRRTFEINDLDLTENIFLIYSHDDVFE LYLNGEKLVATDLVWKNNVNLKLSDEAKKKLRNGKNVIAAHCHNTTGGSYVDFGLYREKK NAVTFENEAIQKSVDVLATSSYYTFTCGPVELDVVFTAPQLIDDLDLLSTPINYISYCVR PLDKKEHDVQFYLETTPELAVNETTQPTIARTLSKNGISYVEAGTINQPICDRKGDLICA DWGYVYLGSVNGTGKSVSLGDYSGMKESFAKNGTLTSSKTKWITRREENTPAMAYVHNFG TVTKDGKDGFLMIGYDDIYSIEYMYEKRMGYWKHDGKVTIFDAFEKLRDNYQFIMERCRA LDELIYSDAEKAGGKKYAEICSAAYRQVISAHKLFTDKEGNLMWFSKENNSNGCINTVDL TYPSAPLFLIYNSDLAKAMMTSIFEYSASGRWDKPFAAHDLGTYPVANGQVYGGDMPIEE SGNMVILTAAVSKIEGNADYAKKYWDILTTWTNYLAE >gi|225935356|gb|ACGA01000036.1| GENE 2 2447 - 3730 359 427 aa, chain + ## HITS:1 COG:no KEGG:BT_3478 NR:ns ## KEGG: BT_3478 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 757 88.0 0 MQYESNTDFTIAKYTPYIKPRYVTREGTTVLYVRYNYNRTKRTLISTGYNIKPEHWDSKK RWIKRACPNYDEIDACLIRITSKLGEILTYAKINGISPTVDFVLLELKKNREYELRPNRV DIFDALERYITEKAPVVSADQIKDYRTLRKHLIAFKEHSSQPITFHNLNLIFYNEFMDYL FYKVIKPDGSVGLLTNSAGKIVRLLKGFVNYQIDKGVIPPIDLKHFKVVEEETDAIYLSE KELSTIHELDLSDDKQLEEIRDVFITGCFTGLRYSDLSTLSPEHIDLDNEIINLKQRKVH KAVIIPMIDYVPEILKKYNYDLPKIPRYIFNERVKELGRRAKLKQKIEVVRKKGKEREKR VYEKWEMISSHTCRRSFCTNMYLSGFPAGELMRISGHKSPSAFMRYIKVDNLQAAKRLKE LRAKLAK 
Prediction of potential genes in microbial genomes
Time: Fri May 13 08:47:51 2011
Seq name: gi|225935355|gb|ACGA01000037.1| Bacteroides sp. D2 cont1.37, whole genome shotgun sequence
Length of sequence - 97513 bp
Number of predicted genes - 68, with homology - 67
Number of transcription units - 23, operones - 15 average op.length - 4.0

N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 74 - 598 421 ## BT_3477 glutaminase A
    + Prom 610 - 669 4.4
2 2 Op 1 . + CDS 841 - 3993 2221 ## BT_0364 hypothetical protein
3 2 Op 2 . + CDS 4015 - 5895 1154 ## Slin_2102 RagB/SusD domain protein
4 2 Op 3 . + CDS 5914 - 8985 2192 ## Phep_3875 TonB-dependent receptor plug
5 2 Op 4 . + CDS 9005 - 10801 1091 ## ZPR_0751 hypothetical protein
6 2 Op 5 . + CDS 10816 - 11754 797 ## ZPR_0752 hypothetical protein
7 2 Op 6 . + CDS 11830 - 13395 1318 ## COG3119 Arylsulfatase A and related enzymes
    + Prom 13408 - 13467 5.0
8 3 Op 1 . + CDS 13491 - 16298 1603 ## Phep_3406 TonB-dependent receptor plug
9 3 Op 2 . + CDS 16303 - 18057 1336 ## Phep_3874 RagB/SusD domain protein
10 3 Op 3 . + CDS 18090 - 21206 2102 ## BT_2894 hypothetical protein
11 3 Op 4 . + CDS 21226 - 22935 1214 ## Phep_2240 RagB/SusD domain protein
12 3 Op 5 . + CDS 22970 - 23911 704 ## gi|260172387|ref|ZP_05758799.1| hypothetical protein BacD2_11029
    + Term 23964 - 24016 5.1
    + Prom 24037 - 24096 5.7
13 4 Op 1 . + CDS 24148 - 25500 1189 ## COG3119 Arylsulfatase A and related enzymes
14 4 Op 2 . + CDS 25538 - 27223 1679 ## COG3119 Arylsulfatase A and related enzymes
    + Term 27273 - 27328 5.4
15 5 Op 1 . - CDS 27344 - 31360 2909 ## COG0642 Signal transduction histidine kinase
16 5 Op 2 . - CDS 31381 - 32682 1250 ## COG4942 Membrane-bound metallopeptidase
17 5 Op 3 . - CDS 32679 - 33257 404 ## BT_3463 hypothetical protein
18 5 Op 4 . - CDS 33254 - 35026 1645 ## COG0457 FOG: TPR repeat
19 5 Op 5 . - CDS 35051 - 35482 472 ## COG0756 dUTPase
    - Prom 35588 - 35647 6.2
    + Prom 35447 - 35506 5.3
20 6 Tu 1 . + CDS 35607 - 36953 844 ## COG0232 dGTP triphosphohydrolase
    - Term 36826 - 36866 7.3
21 7 Op 1 . - CDS 36925 - 37941 922 ## COG3176 Putative hemolysin
22 7 Op 2 . - CDS 37968 - 38786 632 ## COG3176 Putative hemolysin
    - Prom 38806 - 38865 6.6
    + Prom 38764 - 38823 8.4
23 8 Tu 1 . + CDS 39046 - 39549 505 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
    + Term 39629 - 39667 5.1
    + Prom 39934 - 39993 4.2
24 9 Op 1 29/0.000 + CDS 40014 - 40484 177 ## COG2001 Uncharacterized protein conserved in bacteria
25 9 Op 2 . + CDS 40456 - 41385 898 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis
26 9 Op 3 . + CDS 41392 - 41748 361 ## BT_3454 hypothetical protein
27 9 Op 4 26/0.000 + CDS 41825 - 43951 2115 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
28 9 Op 5 4/0.000 + CDS 43981 - 45429 1627 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase
29 9 Op 6 28/0.000 + CDS 45520 - 46788 1023 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase
    + Prom 46814 - 46873 3.5
30 9 Op 7 25/0.000 + CDS 46930 - 48264 1281 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase
31 9 Op 8 31/0.000 + CDS 48297 - 49625 905 ## COG0772 Bacterial cell division membrane protein
32 9 Op 9 26/0.000 + CDS 49654 - 50778 995 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase
    + Prom 50781 - 50840 3.8
33 9 Op 10 . + CDS 50867 - 52243 1345 ## COG0773 UDP-N-acetylmuramate-alanine ligase
34 9 Op 11 . + CDS 52314 - 53051 256 ## PROTEIN SUPPORTED gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12
    + Prom 53053 - 53112 3.6
35 10 Op 1 35/0.000 + CDS 53132 - 54583 1514 ## COG0849 Actin-like ATPase involved in cell division
36 10 Op 2 . + CDS 54641 - 55951 1363 ## COG0206 Cell division GTPase
    + Prom 55954 - 56013 9.1
37 10 Op 3 . + CDS 56037 - 56486 282 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21
    + Term 56593 - 56649 12.3
    - Term 56757 - 56790 0.0
38 11 Op 1 . - CDS 56907 - 58136 864 ## BT_3434 hypothetical protein
39 11 Op 2 . - CDS 58129 - 59073 430 ## BT_3433 hypothetical protein
    - Prom 59113 - 59172 4.8
40 12 Tu 1 . - CDS 59238 - 59966 443 ## BT_3431 DNA repair protein
    - Prom 60117 - 60176 4.6
    + Prom 60023 - 60082 4.4
41 13 Tu 1 . + CDS 60108 - 61067 337 ## BT_3916 site-specific recombinase IntIA
    + Prom 61113 - 61172 7.2
42 14 Op 1 . + CDS 61418 - 62089 425 ## BVU_2683 hypothetical protein
43 14 Op 2 . + CDS 62115 - 63146 633 ## BVU_2682 hypothetical protein
44 14 Op 3 . + CDS 63159 - 63728 382 ## BVU_2681 hypothetical protein
45 14 Op 4 . + CDS 63740 - 63931 132 ## gi|298384857|ref|ZP_06994416.1| hypothetical protein HMPREF9007_01498
46 14 Op 5 . + CDS 63980 - 65395 889 ## BVU_2680 hypothetical protein
47 14 Op 6 . + CDS 65420 - 66913 1169 ## BVU_2679 hypothetical protein
48 14 Op 7 . + CDS 66945 - 67919 374 ## BVU_2678 hypothetical protein
    + Term 68045 - 68102 3.2
    + TRNA 68450 - 68524 47.7 # Glu TTC 0 0
    + Prom 68451 - 68510 79.3
49 15 Op 1 . + CDS 68564 - 68818 416 ## PROTEIN SUPPORTED gi|153809175|ref|ZP_01961843.1| hypothetical protein BACCAC_03485
    + Term 68852 - 68886 4.0
    + Prom 68820 - 68879 2.8
50 15 Op 2 . + CDS 69017 - 70975 2139 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
    + Term 71020 - 71074 13.3
    - Term 71123 - 71168 5.2
51 16 Op 1 . - CDS 71392 - 71973 465 ## BVU_1376 hypothetical protein
52 16 Op 2 . - CDS 72022 - 72387 317 ## BT_0352 hypothetical protein
53 16 Op 3 . - CDS 72412 - 73926 1808 ## COG0696 Phosphoglyceromutase
    + Prom 74013 - 74072 5.0
54 17 Tu 1 . + CDS 74099 - 76549 1413 ## BT_3418 putative thiol:disulfide interchange protein
    + Prom 76602 - 76661 4.1
55 18 Tu 1 . + CDS 76728 - 77708 551 ## COG0598 Mg2+ and Co2+ transporters
    - Term 77751 - 77811 4.9
56 19 Op 1 . - CDS 77866 - 81279 1676 ## Phep_2759 alpha-L-rhamnosidase
57 19 Op 2 . - CDS 81287 - 82240 688 ## BT_1873 endo-arabinase
58 19 Op 3 . - CDS 82285 - 83496 909 ## COG3754 Lipopolysaccharide biosynthesis protein
    - Prom 83516 - 83575 2.9
    - Term 83501 - 83535 1.0
59 20 Op 1 . - CDS 83607 - 85031 712 ## DICTH_1900 F5/8 type C domain protein
60 20 Op 2 . - CDS 84970 - 85107 61 ##
61 20 Op 3 . - CDS 85124 - 86962 1426 ## BDI_3134 hypothetical protein
62 20 Op 4 . - CDS 86976 - 90335 2079 ## BDI_3133 hypothetical protein
    - Prom 90365 - 90424 6.7
    - Term 90403 - 90453 -0.7
63 21 Op 1 6/0.000 - CDS 90519 - 91508 555 ## COG3712 Fe2+-dicitrate sensor, membrane component
64 21 Op 2 . - CDS 91505 - 92119 378 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
    - Prom 92307 - 92366 7.4
    - Term 92253 - 92310 15.5
65 22 Op 1 . - CDS 92382 - 93917 1557 ## BT_3413 hypothetical protein
    - Prom 93956 - 94015 4.6
66 22 Op 2 . - CDS 94098 - 94700 550 ## COG0164 Ribonuclease HII
    - Prom 94729 - 94788 1.9
    - Term 94728 - 94770 6.4
67 22 Op 3 . - CDS 94790 - 96994 2382 ## COG3808 Inorganic pyrophosphatase
    - Prom 97022 - 97081 6.0
68 23 Tu 1 . 
- CDS 97176 - 97409 95 ## gi|260172443|ref|ZP_05758855.1| hypothetical protein BacD2_11309 Predicted protein(s) >gi|225935355|gb|ACGA01000037.1| GENE 1 74 - 598 421 174 aa, chain + ## HITS:1 COG:no KEGG:BT_3477 NR:ns ## KEGG: BT_3477 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 173 667 839 840 291 80.0 5e-78 MGVAGYSEMARMLGLNDVADKYAATAKQMATKWEEMANEGDHYRLAFDRKNTWSQKYNMV WDKLWDLKLFPNNVIGKEINYYLTKQNPYGLPLDSRKEYTKSDWIMWTAAMSSDKETFQK FSDPVYKYINETVSRVPISDWHHTDSGKWVGFRARSVIGGYWMKVLMDKVQNNQ >gi|225935355|gb|ACGA01000037.1| GENE 2 841 - 3993 2221 1050 aa, chain + ## HITS:1 COG:no KEGG:BT_0364 NR:ns ## KEGG: BT_0364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 1050 5 1027 1027 552 35.0 1e-155 MKHQNLEFKWMCLLFLALLLSPPLLWGQNHSIKGQIVDAKSNEPLIGVNITVEGTSNGTI SDIDGRFTLSVAANAVLKISYIGYQEKKIKVDDLKKEPIISLEEDSKQLEEVVVVGYGIQ KKVSSVGAITQTKGEELLKGGNITSVSEALQGKLNGLVAINTSGKPGASDTKMYIRGKAS WQNSDPLVLVDGIERDMNDVDMNEIESISILKDASATAVYGVRGANGVILLTTKRGQDKK PNINFSTSFGIKQTTANMEWADYVTSMKMYNEAVANDNNWNQQIKQSVINAWSNAFDTGN YGPYNDVFPQVDWWDELVKPGFSQQYNVNISGGTNFMRYFASIGYQNDGDIYDLQKQENF DPRNYYRRYNWRSNLDFNLTKTTSLSVNIAGKMGYRNDNFADDVYTRIIAAPTNSFPIKY SDGYWGDGSTAGYNPVCNMNTNGSQLYKTFQGWYDIRLEQKLDFLTKGLKAAAKVSYTSA STTRSSIKTGEIWGNNDFESRNSIIKYHREYDYSNPTVNADGSLTYPMILEKRWPNDEDI ELPPNVSYDNLDGYNRKLYYEFSVEYNRSFGSHNVTALALMNRQVSDLKDDKVIKFPSYR EEWVGRVTYNWKERYLAEMNISYTGSEKFARGQRFGLFPSYSLGWRASEEPFIKKHVGDV LTNLKFRYSYGKVGSDAAAARWNYIQLFNSLGSIELGNTQGVTHSPLYTEGDIANLTSTW EKATKHNLGIEIGLWNKLDLTLDLFKENRKDILMTPQTTSSIVGATFNAINRGETKNHGL ELELRWNDKIGRNFNYWGALTFATSENRIVYKDDPRNADEHLKAAGKPIDYQNRFIAIGN YGSIDDVFNYAQTAINGTTASQVIPGDLIYIDFNGDGIINSNDKVVVDELNYPLTTIGFS FGFNWKGLNFSAMLYSPRNIYKLQFDQYLWDFPASNIRVQPKSMERWTPETANSSGVMRP ATHLNNTYNKVESTYRYSDYSYIRLKNMEIGYTFPKKWLSEAHISNLQVYMNGNNLLTFW KGDKRVDPETGGVGSYPIVRTYTLGLRVSF >gi|225935355|gb|ACGA01000037.1| GENE 3 4015 - 5895 1154 626 aa, chain + ## HITS:1 COG:no KEGG:Slin_2102 NR:ns ## KEGG: 
Slin_2102 # Name: not_defined # Def: RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 2 626 5 617 617 196 28.0 4e-48 MRKVLLYICSVFILGGMTSCEEFLDKREDVGLTEDDIFRDYYSLRGFLDQSFNQLENVMV MDNWENGRAFVGLFADEMATTDNSSAVFTLHSGNWLSNAKSSTTFEIGNGGSTAISRSYK ALRINNRIINDIDKAPLTPEQKNEILGQAYFYRSWFYFQIIKRYGGMPIIDKVFEGGDDD IPRMTYHESHDWMMEDIQKAIYMLPDSWDDPNYGRPTKIAAMALKEWAQLFDASPLMQND LNSTENKGYDTERAKSAAKSAYEVIRYMDGSKSAPYPYGLASKEEYTNIFYFKYPPVHQP EYIWVKRQFPNADNQNQKRTIRTFWQYEDLAFGSGPDGNSMCCPSLNIVNMFDKKGADGI YYPIDDPRSGYALDYDHKPFEDRDPRFYNNILLPGTTWGHYADGRTYYITTYKGGAAYQH MLTQKDCNKRMFTGFMCKKFMWPECNQYLEKNNPNCWWDYRFLTIYIRASQIYLDYAEAL FEGCGSATATIEGCPISAAEAINILRRRIGLTDLPSDIVADPDKFRAAYRRERAVELMFE NHRWWDIRRWMILEETFKDTYPIKGALFTPRESNHGSITDKSTLTYDYEQIDITPEVRSF TKKNYWYPLPQHDVDALNNLQQNPYW >gi|225935355|gb|ACGA01000037.1| GENE 4 5914 - 8985 2192 1023 aa, chain + ## HITS:1 COG:no KEGG:Phep_3875 NR:ns ## KEGG: Phep_3875 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 154 1023 73 937 937 390 31.0 1e-106 MKRYNKFIFILFVTAILSGIATSLEAQDKPKKKKTALIEVKGLVTDNTGKPLSGVTVLSG EGSIINYTDANGKFALKTKADGTLLIEAFGYKDFVINLTKEQPTVIKLQNEDLYASERDI HERADGGKTYQRDLVGTVSKLSMENVLKYPDLQLSNALQGQAAGLIAISGDGGLGYNAST LYVRGQHNNGTNTALVIIDGIERPIDDILPEEIESIEVLKDATAKILYGAAATNGVVLVR TKRGEAHKRIVRVGVEYGVQSSTRVPKYLDSYNYSKLFNEARINDGMNPYYTETQIEGYR NSSGVNDVLYPNVDYYNEFLLNQNIYRKGTIEFNGGNEGVKYALVGGYTGGSGLEKVGER SALHRMNARGNLDIKITDFLTVTADVAARVELENWGAKDGAGIFSTLSSNRPNEYPFIIP NETLSGQFTPNEDGTPFFGASTRIVDNLYADMVYGGDTSERYVNSQTNLGANFDFNKYVK GLTFNAYVTFDNYSYLRQELRNTYPTYAIDTYNDLDGETITRYTQMKKLDLPKTQKIASN NTYRYFGMRADIGYERTLGVHDFSAIGAFRYTKNEMTGMTQDFKDANISLRLNYSYDKRY LAEFTLAGMGSNKFDKNDRFFFSPAVGASWIISNESFMKEVKAVNFLKLKASFGVLGYTG NTGFFLYQTGWNNNGNYNFFQDQTDHKVSLARWGNPDLTWEYSQEFNIGVEGLFFNNRLS TELNYFHECRKDIIGVNNAQYAATAGNYTMYENIGQVTNQGIDIAINWKGNIGRDFLYTV GANMTYSKNKLDKWNEIEGVESYRKAIGRPTSTIFGLQALGLFGKDIPLEGHSLQSYGIY QNGDIAYADLNNNGIVDDNDRMSLGQSFPVTTWGINVDLKYKGFGPYMLGTLHTGITQLC 
TNAYYWNNGLNGYSELALNRYHEVNNPSGTMPRLTTTTESNNFRDSSFWTENGSFFRLKN VELSYTFENKAGRFFANKCKLFVRGTNLLTFSKIKDLDPERLNAGITNYPAYMTVTGGLS VSF >gi|225935355|gb|ACGA01000037.1| GENE 5 9005 - 10801 1091 598 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0751 NR:ns ## KEGG: ZPR_0751 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 4 595 3 580 580 260 33.0 1e-67 MKISKLILLTCVWGSIVSCEDLLDPKLTNEWSSETVWTNPDMAQAVLTQVYADLMVVPDH YDDNFLDAATDNALTRNYGSSVYRASMGAFSRSTNPLGNWDNMYDKIQSINLFMEKGVTD DVIYNRVSKETDQAIKTRLLGEAYFLRAWCSFKLLQTYGGRTDEGEALGYTITNHFIGDK ESAKPSLFKRDSYKDCVSQIVSDCEEAARRLPATYTGDDVVVGKSKIGRACGLAANALKA RTLLYAASPAYQDMDVIRINGMGSFTVLNEATYQAGWERAALFANEVLKDAGVNYTFTAM AAKDLADAGSDTPADFIFRTYMGLVHGMESRHYPPFYLGNAQTIPSHNLAAAFPAKNGYP ITDSRSLYDEQDPYLIARDNRFNMNLYYQGKKFGAYNSNIDVSEGGKDSESFHIYASRSG YYLSKFLSTTQKNMLDPIQTLNSRHFNPLLRKTEVWLNYAEAANEAWGPEGKGDDCQYSA YDVLKIIREKSGGITDVTYLSEVASEGKAAFRKLIQNERRIEFAFEDQRFWDLRRCLLPL DTCIEGMQVTRTDNGLSYEIKEIEQRPLDALRYYYLPLPNDELLKNPNLKNNMGWNNN >gi|225935355|gb|ACGA01000037.1| GENE 6 10816 - 11754 797 312 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0752 NR:ns ## KEGG: ZPR_0752 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 299 1 330 344 74 25.0 5e-12 MKIAKIILFALTTCLVSCYEDYTHDYETTNVGFALQTPLRTVISDRDMPIYVGVSLGGKR EVDMNDWAKFTLDASLLEGTSLTLLPEEYYILEDPEIFKVRKSNLPVADVEIKFTDAFYT DPLSLTTHYALPFKVTESSMNSIREGADYSIVAIKYVSSFSGVYYLQGEVSEVDNNGNII EGTTVVYKEKDLINNNTLELSTLAQNKLLRPGVANLAKSNINRFQLTIDNNGESGSDYNV QIEGIEGGITIVEGNGSYIAQSPTYTFNSGDKPCPEINLNYTYELADKRYKVKEQLVLRR DPVNDLRVESWK >gi|225935355|gb|ACGA01000037.1| GENE 7 11830 - 13395 1318 521 aa, chain + ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 25 496 3 442 497 130 25.0 7e-30 MKKELAVSVVSSILCITTGVAQTNKPNVVIIMTDQQRADLCGREGFPMAVTPFADSLALS 
NVWFDKAYTVAPASMPARCSMFTGRFPTATHVRTNHNTPDMYYKKDMLEVFKEQGYKTAL VGKNHAHVKGKDFDYCEEYFHWGKNRRDTQEDKDFAFFLNNKARGQYLEATPFSSEAQNP VKIVTKALAWAAEQKDSAFFMWVSMPEPHNPYQISEPYFSMFSPEKIPATLTSRKDLSKK GDKYAILAELEDTSCPNLAEDLPRLRGNYLGMLRLIDDQTKRLIDGFQAVGLYENTIFVI LSDHGDYCGEYGLIRKGAGVPECLTRIPMIWAGCGIHPQSGPMDAHVSIADVFPTICTAI GADIPLGVQGRSLWPMLTGRAYPEREFSSIIVQQGFGGEDFTRDEPLTFVQEGALQPNKI AHFDELNTWTQSGTERMVRKDDWKLVLDNYGRGELYNLKADPSEINNLYNKKKYASKQME LLEELMTWELRVQDPLPVPRNRYHFKRNPFNYHFTNEKETP >gi|225935355|gb|ACGA01000037.1| GENE 8 13491 - 16298 1603 935 aa, chain + ## HITS:1 COG:no KEGG:Phep_3406 NR:ns ## KEGG: Phep_3406 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 25 935 95 1005 1006 453 33.0 1e-125 MKFLKKMLLVGLLPGYLICQAQNVQNKEDVPLDLGYRVLPTGDYNGSAYTISGKQLRNLP VTNLSAVLAGLVPGYFARQVQGGGLVNEQNSFWIRGQRSNSPDVLVLVDGQERDFAVLSS HEVESITVLKDAAATALYGMRAGSGAILVTTRKGAKGKPQVELTAQIIAQQPLKELNSLN ATDYARQHNIARHNDNMDPLYSNYDIMNYAKNDPNSILYPNVDWADKYLKDTRWSQRYNL NIQGGTEKSTYFVNAMYTRNNGYFNTDDSHDYSTNHSAERFNIRSNIDFAVTRTTQLDVN LYGWYQSQNGPGSGAENIYRNLVTLPQGIFPEWYNDQGYTDQYGNVVNAEDGKIVAGSTV RENPWAMLNRSGYFQDKQLYGSFRTKLSQDLSFITEGLKASVALSMDSRTISSIRRTITF AYYEKDATNENVLRRTREDDSMINKVDNTSSFRRTGIVAQLDYNRTFGKHGVSALAFYEQ YESNDEMVLPTRFQSVNGWFGYNYDKKYGIDVIGAYQGSHKFGPGHKFGFFPTISAGWTV SNESFWKDAKKIVPYLKLKASYGQVGNSSGVDAFYYRGRMWPQNDVYITGVNMGTKLGGY IQDILPNPGLTWEKARILNIGIDTRLFSDRLSVTAEFFKDNRHDMYVVNNKISSLVGNVK EFKQNIGKINSHGVDLSAMWNSNIADWSYFIGGTFSYSTNELIANGEVDQPYEWLKNQGR PLGENRGYIAKGFFNSWEEIAASPVQTFSDVQPGDVRYEDINKDGQIDVNDMVPIGYGDI PKIMYGINLGVGYKGFSITALFQGAAKVSRQYSDIVMNPFTDNGTIFEHQLDYWTPEHQN ATFPRLTTLDNANLNNRQISTLNVKDADFLRLKTLEVAYDFPTKMIQKIHLKGLRLFLSG TNLITWTKYKWVDPEAPSLAAPLTRNMSIGCSLKF >gi|225935355|gb|ACGA01000037.1| GENE 9 16303 - 18057 1336 584 aa, chain + ## HITS:1 COG:no KEGG:Phep_3874 NR:ns ## KEGG: Phep_3874 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 584 1 568 568 257 33.0 1e-66 
MKKSIIYLTISATLTFSGCDYLDDMPYDWAQPDDIFTNEQNYLKPINQVYAYIPEGFNHV GNAFLDAATNDGISTIIDSDIHKLSRGYVTSSNPIEECWNKSYKGIQQAIFARKYLREAD LVLQNKTEEDIQTFKDTYCAETECLQALFEFNLLRHYGGFPIVDRIYEVDDPELQTKARD SFKDCVNHIVQLCESAAAVLKVDPEGGNGSYGRMTKGMALAIKAKTLLYAASPLYNRTDN NDPLLGYTDGSDVTERWKLAAKACADVINLNADGTISPNGNKKYNLIALTAAKTYDKIFI NANPNPEYILFYTAAKDNALENRHYPPTISKDQGGGTVPSQQLIDAFTMKDGSDYTHSSD GASMYTNRDPRLAAIIGYDGSVYGKNTIYTRISDNTTIDGLNQVKNRSTNTGYYLAKFLD KTLNFGQANVGTVFHLFPMIRLSDILLSYAEAMHHAYGMSADPEGYGLTAIEAVQKIRTR AGFGTNDKFLNGVTDNNFMEKVKQERRIELCFEEHRYFDLRRWMDGNQLREPIIGMKVEE NASGLHYTRITVDEARNFKDYMYYHPIPLKVIKESPSIQQNPGW >gi|225935355|gb|ACGA01000037.1| GENE 10 18090 - 21206 2102 1038 aa, chain + ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 1038 9 1017 1018 627 36.0 1e-178 MKKIIILFICLTSSIWHVCAQTKVSGIVKDAQGMEMIGATVLQKGTNNGIVTGLDGKFTL TLSNKAEQTLLISMIGFIEQSIAIKKNHPFINITLEEDIAQLDEVVVVGYGTQKKVSLTG SISNVGTEDLKSMPVSSVTNALGGRIPGLVTRQESGRPGGDQATMFVRGRASLNDSSPLV LIDGVERPMAQIDPDDIETISVLKDASATAVYGVRGANGVILVTTRRGREGETRISFSSE FGVTSFNRISQTLNAESVSRFMREGAINDGFDPSDTGNTRGLFLSEYDNYLYRTQKSPFT HPDNDFVDMFTKNGLQQKYNVNLSGGNKTVRYFVSVGYFTQTGMFETDVDKIKEHETIQA LLAASPDVAKGLYKEGYNSEYKYSRLTTRSNIDINLTEDFKVSVNLAYRFGSQNRPYGYD ADGQEALRLFGMFYRNSPQAFPILNANGTYAAADGIWRQNPLVTLCYSGFFLNFNNKLET DFAFKYNLRKLLKGLSIDGKFSYDAGWSNNRSIQQRPNIYQYNPVNGTYKQGLEMVLPTK ATNKTAATHRKYAEAAVRYKQSFSGHNISGLVLYNMSYTSTPGGRYSYVPHIYQALVGRV NYDYENRYLFEVNAGYNGSNRFAEGHRYQLFPAASVGWALTNEPFFKENPILSFTKLRFS YGEVGNDKLGGFSYYYRSGYDEGLGYTFGETHNPAIKGLIQGKSANENITWEVARKYNLG LETKWLKDKISASIDFFKERRSNILCEPERYSQAAGSNGLAPINYGVVTNQGYDLEIGYQ DQKGDFGYSIKGIYGYAHNEIVEKSESVKPYAYMSQTGNPIGQFIGYISDGFFSSYEDIA SSPVQFGQVSRPGDIKYKDLNHDGVIDSNDQAPIGYNPVPEMTFSLAAGLNWKGIDFSIL LQGAARSSIYLQQDIAWDNFWGNYYEEHIGRWTPETAATATYPRFTKAATAAHPNYYKSD YWLKDSKYLRLKNIQLGYTIPRKLLKKFGVRSLRVYANAYNLFTWDNVKKVDPESSNNSN GQFYPQQKVINFGINLNF >gi|225935355|gb|ACGA01000037.1| GENE 11 21226 - 22935 1214 569 aa, chain + ## 
HITS:1 COG:no KEGG:Phep_2240 NR:ns ## KEGG: Phep_2240 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 120 569 107 584 584 243 35.0 2e-62 MRKLFISFVTAVLLASCSADSLMKETELDLTDTEKVYSDITLTRKVAYDLYARMRMGRDR LGSFGFLANMGSSPCMLDNATDDGAGNTTRAGGAVIPALEKFIKGGIDASKGLIGDDHPW TFYYKAIRSANTFLNNVDRSPLEDAEKTALKNQVRFLRALYHHELFRFYGALVIGDKELD PLLYDEIKRESLETTVRWIANEFKELAEPEVLPDKYDAADYGRATRGAALGYLARTLLYA ASPLHKASGVTWKEAADAAYDMIEYSDGGDFYRLYEDPTTPEKSYTRLFNTRANGEVIMG FLRAPANDLYAMMPAFDPWNVNKELLTCPNQWLVDCYDMLDGSQPILGYESNTKAIINPN SGYDENSPYANRDPRLAQSILCDGATWPLVNGKPATIDLSKSYRWGSGYFLTKFLDDRID HRKGGTTYMDFPMMRYAEILLDYAEAENEAEDTNTAREKAIAQLNRIRRRAGITTDLLAA DYNQATLRERIRKERRVELCFEDHRFFDIRRWLIAKDVMRLPAVGIKKINGEYQRITLDT RNYNERMNLAPIPQDEVNNCPNIYQNPGY >gi|225935355|gb|ACGA01000037.1| GENE 12 22970 - 23911 704 313 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172387|ref|ZP_05758799.1| ## NR: gi|260172387|ref|ZP_05758799.1| hypothetical protein BacD2_11029 [Bacteroides sp. D2] # 1 313 1 313 313 591 100.0 1e-167 MKTNSIIALILSISLFGLFGCADKYEVDYEALVKIEFTGVDQNNRVSLEKGVAEYTATVK VQGEIMSFEIYQADSKTGMQGSLIEETAQSFEDGTANYETTYKFTSLKENACITVVVLGT DGHTYQRNLLVEITPSVLFSDPDYGKDGEIVETASAYYGCYYATWLLGRTYMAADAMKYT NEVDFSLGDVILSPGSEAVPAFVSPAKRSEYGLMTISGLQHTLFAETSLSQAEFNAISQV DATPIESLADPTSEVLAIQANKVYLFKTANGKKGLICIQKITAKTGTIEVSPDNWVENTK YSWVQLLTKTVVK >gi|225935355|gb|ACGA01000037.1| GENE 13 24148 - 25500 1189 450 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 11 446 17 480 497 203 31.0 6e-52 MNLKKHLPWLGALLPIANTQAETKPNVVIIYIDDMGIGDIGCYGGKFAPTPNIDKLAQDG LLFNQYYSSAPVSSPSRCGLTTGLFPLEVGINTFLNDKAANKRCEQRNFLDDKLPSMARA FQNAGYSTGHIGKWHMGGGRDVHDAPSIKNYGFDEYISTYESPDPEPAITATKWIWSDKD SVKRWRRTEYFVDKSIEFVKHHKDEPFFLNLWPDDMHTPWVPEFKQKERKSWETQEAFAP VLAEMDKQLGRFIKALDELGLAENTIIIFTSDNGPAPSFKSVRSAYLRGTKNSLYEGGIR 
MPFIVRYPKKIKAGQVNNESVLCAVDLYPTLCSVAGIKTEKGYKGDGQNYAKVLLGKSEA KRKTDLMWDFGRNKHFGFPGNPYDKSPHLAIRSGKWKLLVNGDGSDAQLYDMEKDKFEKN NIANEHPDLVAKLSKKVCKWFEENKNKGKE >gi|225935355|gb|ACGA01000037.1| GENE 14 25538 - 27223 1679 561 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 29 546 3 520 536 294 35.0 3e-79 MNSKFLLLPTALMVAGHSAAEAKGKKSDQRPNILVILADDLGYSDLGCYGSEIHTPNLDK LAKQGVRFNHFYNTSRSCPTRASLLTGLYQHQAGIGRMTFDDHLPGYRGTLSRNAVTIAE VLKESGYATSMVGKWHIAETPLRKDQREWLAHHVYHETYSDLCHYPVNRGFDTHYGTIYG VVDYFDPFSLVEGEVPVKEVPDGYYITQALSDRAVKEVKEYAKDDKPFFMYLAYTAPHWP LHALPEDIEKYKDTYKVGWEAIRNARYERQKQLGIFPNMDNFLSDRQFKDKWEDNPHAEW DARAMAVHAAMIDRMDQGIGQVIDALEKTGQLENTLILFLSDNGCSNEDCQNYSPGENDR PDMTRKGEKMVYPHNKEVLPGPETTYASLGARWANVANTPFRFWKAKSYEGGICTPMIAH WPKGIKKNVGKMTPEIGHVMDIMATCIDMAGATYPTKYKGNDIIPLEGKSLVPIFKNGHR EGHDYLGFEHYNERAFLSNDGWKLVRPGENAKWELYNLNEDRSEMHNLAAQYPEKVAEMT KAYEAWAKRCMVEPYPGQKKK >gi|225935355|gb|ACGA01000037.1| GENE 15 27344 - 31360 2909 1338 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 811 1053 46 295 318 137 31.0 1e-31 MKLTKILICLLVFWTGTLSASPYFSFKKYQVEDGLSHNTVWCALQDSYGFIWLGTSDGLN RYDGRGNKVYRNVLNEKFSLENNFVEALIEVDKNLWVGTNSGLYIYDRDTDRFSYFDKTT QYNVYISSEIKKIIKTENGLIWIATLGQGFFIYDPKTEVLTQNSVQTSFVWDLCQSADRK RVYISSLQEGLLCFDENGKFLRTYEISLDINASDSYKVNCIQNIDGEIWIGAGSNLLSRL DERTEAIDNYSGSAFNFGAVHCLLKYTDKELLVGTDNGLYLFDQNTNTFQRADNPADPRS LSDQTINGMMWDAEGALWVLTNLGGINYMSKQTKRFDYYSPAYLSGVPGAGEVVAPFCEN KDGNIWIGTQSGLYFFNAATRELSPYAIGGTKNQKYDIRSLMLDGDYLWIGTYAGGIRVI NLRTGAVKAYTHSRGIPNTICSNDVLCIYKGRKGEIYVGTSWGLCRYDAARDNFMTITSV GSMVSVVDIYEDMYNHLWIATSNSGVFSYNTMNAHWKHYQHEREDSTTITSNSVITLFED 
TKGTMWFGTNGGGLCSFDAKEKRFIEFDPHNTLLPNKVIYAIEQDQGGDFWVSSNAGIFK INPVTKDHFRQFTINDGLQGNQFIARSSLKSSEGKLYFGGINGFNVFQPEQFVDNKYIPP VYVTDIRLPYQTDEQEVKKLLQLDKPLYMADKVTLSYENNSFSIRFVALSFEDPGKNRYS YILRGVDKEWILNTDNNMASYTNLPPGEYLFEVRGSNNDRQWNENTTTLKVVITPPWWRS TFAYFVYVLMLLGWIGWMAWRWNLRVKRKYKRRMEKYQTAKEQEVYKSKISFFINLVHEI RTPLSLIRLPLEKLLEKEHEGKDLKYLSVIDKNVNYLLGITNELLDFQKMESGTLHLNLK KSDIKELVSDVYNQFTSPAELKGIDLQLIVPEQELVSTVDKDKLSKILVNLMGNAIKYAH ARIDLKLLVTDGGYEIQVNDDGPGIPNEQKQKIFEAFYQLPDDKVATAVGTGIGLAFAKS LAEAHQGDLRLEDNVGGGSSFILSLPFKEWETEKMGDIVEISPEHADASETISSEYAGKK FTVLLVEDNVDLLNLTRESLAEWFRVLRASNGREALEVLAKENVDVIVSDVMMPEMDGLE LCSKVKSEISYSHIPVILLTAKTTLESKVEGLECGADVYVEKPFSVKQLHMQIENLLKLR QSFHKLMVSLAGDTVAVSTTDFAMSQKDCEFIAKIQGVIAEQLADENFSIDTLAEQMNMS RSNFYRKIKALSGMSPNDYLKTLRMNRAAELIVSGTRISEVAAQVGFTSSSYFAKCFKAQ YGVLPKEYTGQPPISGEA >gi|225935355|gb|ACGA01000037.1| GENE 16 31381 - 32682 1250 433 aa, chain - ## HITS:1 COG:YPO0063 KEGG:ns NR:ns ## COG: YPO0063 COG4942 # Protein_GI_number: 16120414 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Yersinia pestis # 29 433 62 450 450 82 24.0 1e-15 MMKRFFVVLISCLWLAVPLFAQSNKLIRELESKRGALQKQIAESESILKDTKKDVGSQLN SLAVLTGQIEERKRYIMAINNDVEAIQRELASLERQLRGLEKDLKDKKKKYEASVQYLYK NKSIEEKLMFIFSAKNLGQTYRRMRYVREYATYQRLQGEEILKKQEQIRKKKAERQQVKA AKENLLQERESEKVKLEEQEKEKRTLVTNLQKKQKGLQNEINKKRREANQLNARIDKLIA EEIERARKRAQEEARREAAARKKEESKEGKSVATETTAKSKPLEAYTMSKADRELSGNFA ANRGKLPMPISGAYIITSRYGQYAVEGLRNVKLDNKGIDIQGKPGAQARAIFDGKVAAVF QLNGLFNVLIRHGNYISVYCNLSSASVKAGDTVKTKQSIGQVFSDGTDNGRTVLHFQLRR EKEKLNPEPWLNR >gi|225935355|gb|ACGA01000037.1| GENE 17 32679 - 33257 404 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3463 NR:ns ## KEGG: BT_3463 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 193 193 238 75.0 9e-62 MKRIIYLLLLVVVVLAGCKSSKHLATSETKTSTKAPTSSYLASKLQLTIPGKKGSMSVGG TMKMKTHERVQISLLMPILRTEVARIEVTPDEVLLVDRMNKRFVRATKAELKNVLSKNVE 
FSRLEKILMDASLPGGKTELTGKDIGIPSLEKAKVQLYEFSTQEFSMTPTELTSKYNQIP LEELVKMLVALL >gi|225935355|gb|ACGA01000037.1| GENE 18 33254 - 35026 1645 590 aa, chain - ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 126 576 89 528 545 98 22.0 3e-20 MKIKIGWLFVTVLMLTSCGGIRSVRTAKTTAKADGASLMKETLLSAEQQRKYDYFFLEAM RMKGKNEYDAAFGLLQHCLDINPTASSALYEISQYYMFLRQVPQGQAALEQAVAFAPDNF WYSQGLVSLYQQQNELDKAVTLLEKMVTRFPSKQEPLFSLLDIYSRQEKYNDVISTLNRL EKRLGKNEQLSMEKFRIYLQMKDDKKAFQEIESLVQEYPMDMRYQVILGDVYLQNGKKQE AYDAYQKVLAVEPDNPMALFSMASYYEQTGQKELYQQQLDTLLLNKKVTSDTKISVMRQV IVENEQSSAKDSTQVIALFDRMMKQDIDDPQIPMLYSQYLLSKNMEQEAVPVLEQVVDLD PTNKAARLMLVSAAVKKEDYKQIIKVCEPGIEATPDALELYYYLAIAYHQAEQTDSVLSV CSRALEHVTADTRKEVISDFYSIMGDIYHTKKQMTEAYAAYDSALVYNPSNIGALNNYAY YLSVERRDLDKAEEMSYKTVKAEPNNSTYLDTYAWILFEKGNYAEARIYIDNAMKNDGEK SDVIVEHCGDIYFMTGDVEGALKYWKKALEMGSESKTLKQKIEKKKYIAE >gi|225935355|gb|ACGA01000037.1| GENE 19 35051 - 35482 472 143 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 1 142 4 145 146 171 61.0 5e-43 MNIQVINKSKHPLPAYATELSAGMDIRANLSEPITLEPLQRCLVPTGLYIALPKGFEAQV RPRSGLAIKKGITVLNSPGTIDADYRGEVCIILVNLSSETFVIEDGERIAQMVIAKHEQP AWQEVEVLDETERGAGGFGHTGV >gi|225935355|gb|ACGA01000037.1| GENE 20 35607 - 36953 844 448 aa, chain + ## HITS:1 COG:sll0398 KEGG:ns NR:ns ## COG: sll0398 COG0232 # Protein_GI_number: 16331575 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Synechocystis # 2 445 1 440 440 289 38.0 9e-78 MMNWNQLISAKRFGMEEFHEERQENRSEFQRDYDRLIFSAPFRRLQNKTQVFPLPGSVFV HNRLTHSLEVASVGRSLGDDVAKALLERHPELQDSFLPEIGSIVSAACLAHDLGNPPFGH SGEKAISTFFSEGKGVRLKEKQPNGEQLSPMEWEDLIHFEGNANAFRILTHQFEGRRKGG FVLTYTTLASIVKYPFSSSLAGTKSKFGFFVSEEESFRKIATELGLILLNEHPLKYARHP 
LVYLVEAADDICYQMMDIEDAYKLKILTTEETKELLMAYFSEERQEHLRKTFLIVNDVNE QIAYLRSSVIGLLIRECTRVFLDHEQEILSGTFEGSLIKRIAERPAAAYKHSVEVSINKI YRSRDVLDVELAGFRIISTLLELMIDAVTSPEKTYSKLLIDRVSSQYNIKAPVLYERIQA VLDYISGMTDVFALDLYRKINGNSLPAV >gi|225935355|gb|ACGA01000037.1| GENE 21 36925 - 37941 922 338 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 4 222 304 514 605 134 36.0 2e-31 MEEIIKPVSKELLKAELTEDRRLRMTNKSNNQIYIITHQNAPNVMREIGRLREIAFRAAG GGTGLSMDIDEYDTMEHPYKQLIVWNPEAEEILGGYRYLLGTDVRFDEKGAPILATSHMF HFSDAFIKEYLPQTIELGRSFVTLEYQSTRAGSKGLFALDNLWDGLGALTVVMPNVKYFF GKVTMYPSYHRRGRDMILHFLKKHFYDKEKLVTPIEPLKLETSEDELSALFCKHSFKEDY KILNCEIRKLGYNIPPLVNAYMSLSPTMRMFGTAINYEFGDVEETGILIAVDEILEDKRI RHIQTFIESHPDALRLSSCEGEEVFTPKVVTPQANCSR >gi|225935355|gb|ACGA01000037.1| GENE 22 37968 - 38786 632 272 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 55 267 68 273 605 80 29.0 2e-15 MTDDSLFLIDVDKILRTKAPKHYKYIPKFVISYLKKIVHQDEINVFLNESKDKLGVDFLE ACMGFLDAKVDVKGIENLPKDGLYTFVSNHPLGGQDGVALGYVLGKHYEGKVKYLVNDLL MNLRGLAPLCIPINKTGKQAKDFPKMVEAGFQSDDQLIMFPAGLCSRRQNGVIRDLEWKK TFVVKSIQAKRDVIPVHFGGRNSDFFYNLANVCKALGIKFNIAMLYLADEMFKNRHKTFT VTFGKPIPWQTFDKSKTPAQWAEYVKDIVYKL >gi|225935355|gb|ACGA01000037.1| GENE 23 39046 - 39549 505 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 97 35.0 9e-21 MKSLSFRKDLVGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYTPDTNFKGWMY TIMRNIFINNYRKVVRDQTFIDQTDNLYHLNLPQDGSSENTERAYDLKEMHRVVNKLPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR >gi|225935355|gb|ACGA01000037.1| GENE 24 
40014 - 40484 177 156 aa, chain + ## HITS:1 COG:CAC2133 KEGG:ns NR:ns ## COG: CAC2133 COG2001 # Protein_GI_number: 15895402 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 127 2 119 142 67 33.0 1e-11 MIRFLGNIEARADAKGRVFIPATFRKQLQAASEERLIMRKDVFQDCLTLYPESVWNEELN ELRSRLNKWNSKHQLIFRQFVSDVEIVTPDSNGRILIPKRYLQVCSIHGDIRFIGIDNKI EIWAKERAEQPFMSPEEFGAALEEIMNDDNRQDGER >gi|225935355|gb|ACGA01000037.1| GENE 25 40456 - 41385 898 309 aa, chain + ## HITS:1 COG:BS_ylxA KEGG:ns NR:ns ## COG: BS_ylxA COG0275 # Protein_GI_number: 16078578 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Bacillus subtilis # 14 309 4 310 311 239 43.0 7e-63 MIIDKMEKDELTYHVPVLLKESVDGMNIQPDGTYVDVTFGGAGHSREILSRLGEGGRLLG FDQDEDAERNIVNDTHFIFVRSNFRYLHNFLRYHNIEQVDAILADLGVSSHHFDDSERGF SFRFDGALDMRMNKRAGMTAADIVNTYDEERLANILYLYGELKNSRKLASVIVKARSGQN IRTIGEFLEVVKPLFGREREKKELAKVFQALRIEVNQEMEALKEMLLAATEALKPGGRLV VITYHSLEDRMVKNIMKTGNVEGKAETDFFGNLQTPFRLVNNKVIVPDEAEIERNPRSRS AKLRIAEKK >gi|225935355|gb|ACGA01000037.1| GENE 26 41392 - 41748 361 118 aa, chain + ## HITS:1 COG:no KEGG:BT_3454 NR:ns ## KEGG: BT_3454 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 168 95.0 6e-41 MEEEVVNKKAEEDKKKKRTSLKSILGGDILATDFFRRQTKLLVLIMVFIIFYIHNRYASQ QQQIEIDRLKKELTDIKYDALTRSSELMEKSRQSRIEDYISSKESDLQTSTNPPYLIK >gi|225935355|gb|ACGA01000037.1| GENE 27 41825 - 43951 2115 708 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 3 707 15 723 729 178 26.0 3e-44 MTRYFFVILLMALIGVAIVVKAGITMFAERQYWQDVADRFVKENVTVKPNRGNIISSDGK 
LMASSLPEYRIYMDFMSGEKDEKRRQKDQARRDSILNANMDSICIGLHKIFPDKSAAQFK AHLKKGRQAKSRNYLIYPKRISYIQYKEVKRLPVFCLNRYKGGFKEQAYNQRKKPFGSLA ARTLGDVYADTAKGARNGIELAFDTILKGRDGLTHRQKVMNKYLNIVDVPPVDGCDLIST IDVGMQDICEKALVDKLKELNASVGVVVLMEVSTGEVKAIVNMMQGKDGEYYEMRNNAIS DMLEPGSTFKTASIMVALEDGKITPDYVVDTGNGQMPMYGRVMKDHNWHRGGYGKLTVTE ILGVSSNVGTSYIIDHFYGSNPQKFVDGLKRMSIDQPLHLQIAGEGKPNIRGPKERYFAK TTLPWMSIGYETQVPPINILTFYNGIANNGVTVRPKFVKAAMKDGEVVKEYPTEVINPKI CSDKTLAQIREILRKVVGEGLAKPAGSKQFHVSGKTGTAQISQGAAGYKTGRTNYLVSFC GYFPSEAPKYSMIVSIQKPGLPASGGLMAGSVFSKIAERVYAKDLRLPLTNAIDTNSVVI PNVKAGEMREAQRVLEELQINVQGKIADSGKEVWGNTHSAPQAVVLESRSNMQNFVPSVI GMGAKDAVYLLESKGLKVNLVGVGKVKSQSIANGTIVKKGQTVTLTLN >gi|225935355|gb|ACGA01000037.1| GENE 28 43981 - 45429 1627 482 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 482 1 479 482 335 39.0 2e-91 MLLNELLKAIQPVEVAGDSNIEITGVNIDSRLVEAGQLFMAMRGTQADGHAYIPAAIAKG AIAILCEDMPEEPVAGITYVRVKDSEDAVGKIATTFYGDPTSKLELVGVTGTNGKTTIAT LLYNTFRYFGYKVGLISTVCNYIDDEPIPTEHTTPDPITLNCLLGRMADEGCKYVFMEVS SHSIAQKRISGLKFAGGIFTNLTRDHLDYHKTVENYLKAKKKFFDDLPKNAFSLTNLDDK NGLVMTQNIRSKVYTYSLRSLSDFKGRVLESHFEGMLLDFNNHELAVQFIGKFNASNLLA VFGAAVLLGKKEEEVLVALSTLHPVAGRFDAVRSPQGITAIVDYAHTPDALINVLNAIHG VLEGKGKVITVVGAGGNRDKGKRPIMAKEAAKASDRVIITSDNPRFEEPQDIINDMLAGL DAEDMKKTLSIADRKEAIRTACMLAEKGDVILVAGKGHENYQEIKGVKHHFDDKEVLKEI FK >gi|225935355|gb|ACGA01000037.1| GENE 29 45520 - 46788 1023 422 aa, chain + ## HITS:1 COG:YPO0552 KEGG:ns NR:ns ## COG: YPO0552 COG0472 # Protein_GI_number: 16120880 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Yersinia pestis # 1 422 1 360 360 211 33.0 2e-54 MLYYLFEWLHKLNFPGAGMFGYTSFRALMAVILALLISSIWGDKFINLLKRKQITETQRD 
AKTDPFGVNKVGVPSMGGVIIIVAILIPCLLLGKLHNIYMILMLITTIWLGSLGFADDYI KIFKRDKEGLHGKFKIIGQVGLGLIVGMTLYLSPDVVIRENIEVHNPGREMEVIHGTNDL KSTQTTIPFFKSNNLDYADLVSFMGEHAQTAGWILFVIVTIIVVTAVSNGANLNDGMDGM AAGNSAIIGATLGVLAYVSSHIEFAGYLNIMYIPGSEELVIYICAFIGALIGFLWYNAYP AQVFMGDTGSLTIGGIIAVFAIIIHKELLIPILCGVFLVENLSVILQRAYYKAGKRKGVK QRLFKRTPIHDHFRTSMSLIEPGCTVKFTKPDQLFHESKITVRFWIVTIVLAAITIITLK IR >gi|225935355|gb|ACGA01000037.1| GENE 30 46930 - 48264 1281 444 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 6 444 14 450 451 243 37.0 6e-64 MKRIAILGAGESGAGAAVLAKVKGFETFVSDMSAIKDKYKELLDSHQIAWEEGHHTEELI LNADEVIKSPGIPNDAPIILKLKAQGTPVISEIEFAGRYTDAKMICITGSNGKTTTTSLI YHIFKSAGLNVGLAGNIGKSLALQVAEDYHDYYIIELSSFQLDNMYNFRANIAVLMNITP DHLDRYDHCMQNYIDAKFRITQNQTPDDAFIFWNDDPIIRQELAKHGLKAHLYPFAAVKE DGAIAYVEDHEVKITEPIAFNMEQEELALTGQHNLYNSLAAGISANLAGIAKENIRKALS DFKGVEHRLEKVARVRGIDFINDSKATNVNSCWYALQSMTTKTVLILGGKDKGNDYTEIE DLVREKCSALVYLGLHNEKLHEFFDRFGLPVADVQTGMKDAVEAAYKLAKKGETVLLSPC CASFDLFKSYEDRGDQFKECVRAL >gi|225935355|gb|ACGA01000037.1| GENE 31 48297 - 49625 905 442 aa, chain + ## HITS:1 COG:PA4413 KEGG:ns NR:ns ## COG: PA4413 COG0772 # Protein_GI_number: 15599609 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Pseudomonas aeruginosa # 27 383 37 375 399 140 30.0 6e-33 MDLLKNIFKGDKVIWIIFLCLCLISIIEVFSAASTLTYKSGDHWGPITQHSIILMVGAVV VVFLHNVPYKWFQVFPVFLYPVSLVLLAFVTLMGIITGDRVNGAARWMTFMGLQFQPSEL AKMAVIIAVSFILSKRQDEYGANPNAFKYIMILTGLVFLLIAPENLSTAMLLFGVVCMMM FIGRVSSKKLFGMLGILGLVGGVAVGILMAIPAKTLHNTPGLHRFETWQNRVSGFFEKEE VPAAKFDIDKDAQIAHARIAIATSHVVGKGPGNSIQRDFLSQAFSDFIFAIVIEEMGLVG GIFVVFLYLWLLMRAGRIAQKCERTFPAFLVMGIALLLVSQAILNMMVAVGLFPVTGQPL PLVSKGGTSTLINCAYIGMILSVSRYTAHLEEQKAHDAQIQLQIEADAAANSEAQAAAEP TAQILNSDAKFEDEHDSRNNER 
>gi|225935355|gb|ACGA01000037.1| GENE 32 49654 - 50778 995 374 aa, chain + ## HITS:1 COG:BH2565 KEGG:ns NR:ns ## COG: BH2565 COG0707 # Protein_GI_number: 15615128 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Bacillus halodurans # 6 367 1 363 363 241 36.0 1e-63 MMEKELRIIISGGGTGGHIFPAVSIANAIIELRPDAEILFVGAEGRMEMQRVPDAGYRII GLPIAGFDRKHLWKNVSVLIKLMRSQWKARKVIKNFRPQVAVGVGGYASGPTLKTAGMMG VPTLIQEQNSYAGVTNKLLAQKARKICVAYDGMEKFFPADKIIMTGNPVRQNLTKDMPSK EEALGSFHLQSGKKTILIVGGSLGARTINNTLTASLTTIKENTDVQFIWQTGKYYYPQVT EAVKAAGALPNLYVTDFIKDMAAAYAAADLVISRAGAGSISEFCLLHKPVILVPSPNVAE DHQTKNALALVNKQAAIYVKDSEAETTLMDVALSTVNDEQKLKELTENIAKLALPDSARI IAQEVIKLAEAKNR >gi|225935355|gb|ACGA01000037.1| GENE 33 50867 - 52243 1345 458 aa, chain + ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 7 444 11 448 458 233 33.0 5e-61 MNIETIKSVYFVGAGGIGMSALVRYFLFKGKVVAGYDRTPTPLTETLISEGAQIHYEENV DLIPAACKDKESTLVIYTPAVPQEHEELVYFHNNGFEIQKRAQVLGTITHSSKGLCVAGT HGKTTTSTMTAHLFHQSHVECTAFLGGISKNYGTNLLLSQASPYTVIEADEFDRSFHWLS PYMTVITSTDPDHLDIYGTEQAYLESFEHYTTLIQPGGALIIRKGISLQPKVQPGVRVYT YSRDEGDFHAENIRIGNGEIFIDFVAPDTRINDIQLGIPVSINIENGVAAMALAHLNGVT DEEIKRGMASFRGVDRRFDFKLKNDRIVFLSDYAHHPSEIKQSVLSMRELYKDKKITAIF QPHLYTRTRDFYQDFADSLSLLDEVILVDIYPAREAPIPGVTSKLIYDNLRPGIEKSMCK KEEILNILKDKNIEVLITLGAGDIDNYVPEIKELLEKR >gi|225935355|gb|ACGA01000037.1| GENE 34 52314 - 53051 256 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12 [Kordia algicida OT-1] # 49 244 47 239 239 103 31 4e-21 MIKRILLSIVMLVLIAYLAVAITAFNRKPADQTCRDMELVIKDTAYAGFITKEELKGILQ QKGIYPIGKKMERISTKSLERELSKHPLIDEAECYKTPSGKVCVEVTQRIPILRVMSANG QNYYLDNKGTIMPPEAKCVAHRVIVTGNVEKSFAMKDLYKFGVFLHNNKFWDAQIEQIHV 
LPDQNIELVPRVGDHLVYLGKLENFEDKLARLKEFYKKGLNRVGWNKYSRINLEFSNQII CTKRE >gi|225935355|gb|ACGA01000037.1| GENE 35 53132 - 54583 1514 483 aa, chain + ## HITS:1 COG:RSc2840 KEGG:ns NR:ns ## COG: RSc2840 COG0849 # Protein_GI_number: 17547559 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Ralstonia solanacearum # 5 333 7 338 410 137 26.0 6e-32 MATTEFIAAIELGSSKITGVAGRKNSDGSMQVLAYAQEDSSTFIRKGVIFNLDKTAQSLT SIINRLEGELKNSIAKVYVGIGGQSLRTVRNVVSRDLEEEAIISEELVSAIGDENIAIPV VDMDILDVAPQEYKVGNNLQANPVGLVGSHIEGRFLNIVARASVRKNLEHCFQQAKIDIA DQLIAPLVTANAVLTESERRSGCALIDFGADTTTISVYKNNILRFLTVLPLGGNSITRDI TTLQMEEEEAERLKKAYGDALYEEDPEQEEATCKLDDDNRIIKVADLNNIIEARAEEIVA NVWNQIQLSSYEDKLLAGIILTGGAANLKNLDETLRKRSKIEKIRMAKLPRNTVHAPNNI LKKDGSQNTLFGLLFEGNQNCCLTETAPQAAPAPSVSKPEPEVHKTVDMFEDDQELKEQA RIARLKKEEEEREAKLAAKEAEKLRKQKEKEEKERRKREAGPSWIQRKIDSLTKEIFSDD DMK >gi|225935355|gb|ACGA01000037.1| GENE 36 54641 - 55951 1363 436 aa, chain + ## HITS:1 COG:BB0299 KEGG:ns NR:ns ## COG: BB0299 COG0206 # Protein_GI_number: 15594644 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Borrelia burgdorferi # 11 405 22 396 404 226 41.0 7e-59 MDEIVQFDFPTDSPKIIKVIGVGGGGGNAVNHMYREGIHDVTFVLCNTDNQALAESPVPV KLQLGRSITQGLGAGNRPERARDAAEESIEDIRNQLNDGTKMVFITAGMGGGTGTGAAPV IARIAKEMDILTVGIVTIPFIFEGEKKIIQALDGVERIAQHVDALLVINNERLREIYADL TFMNAFGKADDTLSIAAKSIAEIITMRGTVNLDFADVKTILKDGGVAIMSTGFGEGENRV TKAIDDALHSPLLNNNDIFNAKKVMLNVSFCPSSELMMEEMNEIHEFMSKFREGVEVIWG VAIDNSLETRVKITVLATGFGVEDVPGMDSLHAARSQEEEERQLQLEEEKEKNKERIRKA YGESASNIGSKSLRKRRHIYLFNTEDLDNDDIIAMVEDSPTYMRDKTTLTKIRTKAALEE EVATEEATDDNGVITF >gi|225935355|gb|ACGA01000037.1| GENE 37 56037 - 56486 282 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 148 1 146 147 113 43 4e-24 MDLFDQVSEDIKTAMKAKDKVALETLRNIKKFFLEAKTAPGANDILTDDAALKIIQKLVK 
QGKDSAEIYIGQGRQDLADVELGQVAVMEKYLPKQMTAEELEAALKEIIAETGATSGKDM GKVMGVASKKLAGLAEGRAISAKVKELLG >gi|225935355|gb|ACGA01000037.1| GENE 38 56907 - 58136 864 409 aa, chain - ## HITS:1 COG:no KEGG:BT_3434 NR:ns ## KEGG: BT_3434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 411 411 678 79.0 0 MNRNMNTGRLLYMICLLAFSAVGGKLEAQTVVQPVEQQHDSIKYAPIHQIGFDIRPGYVA PTNSFLEGDNAQQKRINQSLSLHLKYAFRFSKDSNLGRLYPHAYQGIGISYNTFYCPAEL GNPVAVYAFQGARIMQVSPRLSFDYEWNFGASFGWKKYDGQYYPQNEVIGSKINAYINLG LVLNWQLHPQWKLAAGVDLTHFSNGNTHYPNGGLNVIGGRIGVVRTFGEEDASTDVPKRL YVKPHISYDLVIYGATRKRGFVKDGVPSLVPGSFGVAGLNFAPMYNFNNYFRAGLSLDAQ YDESANIKDYKLEGSFMDDLKFHRPPFREQFAVGLSLRAEWVMPIFSINAGIGRNLICSG DDTKGFYQILALKTYVTRHLFLHVGYQLSKFKDPNNLMLGLGYRFHDKR >gi|225935355|gb|ACGA01000037.1| GENE 39 58129 - 59073 430 314 aa, chain - ## HITS:1 COG:no KEGG:BT_3433 NR:ns ## KEGG: BT_3433 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 314 315 384 63.0 1e-105 MKTFINFTTYFLILLGLYSCNGDVFVDDFRSSDSELTLDGNGDVATIRFASSNWDLFGMY NYDESFSYPYKVFDENGNLIMTDQIPYLKGLGKIVCDEELIGFTVDRNNPKELKITVDEN ARSTHFRLMLVVGNEYESQDIYVEISPSDRYVFDHITYSLDMYSYGKVIETQNSFVQPNY WDIPYPYLLSPYEDVQHEVIFTSYMPEAFQLLGESNLTVEIPSIKSRYLEMNGKKAQYIS TKQSFPYHNTEQKEILIPPFTTQRITLMLEYERFETQYTLYAFHPKTKKQRIITGMLQSK MPVDYHIKREDINE >gi|225935355|gb|ACGA01000037.1| GENE 40 59238 - 59966 443 242 aa, chain - ## HITS:1 COG:no KEGG:BT_3431 NR:ns ## KEGG: BT_3431 # Name: not_defined # Def: DNA repair protein # Organism: B.thetaiotaomicron # Pathway: Homologous recombination [PATH:bth03440] # 1 242 1 242 242 450 95.0 1e-125 MLQKTKGIVLHTLKYNDTSIIVDMYTELSGRASFLVTVPRSRKAAVKSVLFQPLSFIEFE ADYRPNATLYRVKEAKSFYPFSSIPYDPYKSSMALFLSEFLYRAIREEAENRPLFAYLQH SIIWLDECGEGFANFHLVFLMRLSRFLGLYPNLEDYHTGDYFDLLNACFTSIRPQLHSSY INPEEAARLRQLMRMNYETMHLFGMSRAERTRCLTIMNDYYRLHLPDFPALKSLEVLKEL FD >gi|225935355|gb|ACGA01000037.1| GENE 41 60108 - 61067 337 319 aa, chain + ## HITS:1 COG:no KEGG:BT_3916 NR:ns 
## KEGG: BT_3916 # Name: not_defined # Def: site-specific recombinase IntIA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 1 315 316 486 79.0 1e-136 MKRKSNKNGFTGCAKAYIRNLQEEGRYSTAHVYKNAILSFTRFCGTPSITFGQITRDSLR RYGQHLHEHGLKPNTVSTYMRMLRSIYNRGVEAGVARFIPRLFRDVYTGVDVRQKKALPV SELHTLLYKAPQSQHLIRTQAIARLMFQFCGMPFSDFAHLEKSALEDGILRYNRIKTGTP ISLEILDTSKKEINRLRNSNPPREDCPDYLFNIMRGDKRRKEEGIYKEYQSALRRFNNQL KSLSRELNLKSPVSSYTIRHSWATNAKYQGIPIEMISESLGHKSIKTTQTYLKGFGLKKR TEANQLNCFYVESYNTSHQ >gi|225935355|gb|ACGA01000037.1| GENE 42 61418 - 62089 425 223 aa, chain + ## HITS:1 COG:no KEGG:BVU_2683 NR:ns ## KEGG: BVU_2683 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 25 223 1 188 188 228 56.0 1e-58 MTHKPNISKNERSAASNRWRHAANIRHGLAIIAFVACILPASNLCGQNIAIKSNLLYDLT TTLNMGGEVRCDDTHTISLSLNYNPWNFGGNKKMKHFLLQPEYRKWFNEAFTGSFIGFQL HYALFNFGGMLPWGFSDGKMLGIENRQIAHNRYQGNLAGFGISYGYQWIISPQWNLEAGI SLGYAHLNYKRYGQSEGAALLEKSSCNYWGPTQAGISIIYFIR >gi|225935355|gb|ACGA01000037.1| GENE 43 62115 - 63146 633 343 aa, chain + ## HITS:1 COG:no KEGG:BVU_2682 NR:ns ## KEGG: BVU_2682 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 341 1 360 364 284 46.0 3e-75 MKTKAFIFFSLLICNLLPAQNIKLDITPHFLGVRADSLLLSMDVSVEIENMESKNAVRLT PVLTGQERKILLPSILLNGKQKQKLYLRNQILRKKSEHKEENATYLVAGIDESHSRTIAY RTSLPAEDWMNHATLLLRRTIIRPEGEQALKDTLLITSQPAALPSSQKNPDEVTSTAENT SPADMTTTKMKYKGSYVSPATDDVDIRNQKELNFNLEEAKIMADVNPQMLSLRELYTVAL SYADNKAKFYQIINISVKLYPVHPIANLNAAAAAIEQGDTQSISKFLSMAPHDSLAYKNC RGVYELMTGNTYEGIRMLKAAKTEGSEEASYNLNVFFENNKRP >gi|225935355|gb|ACGA01000037.1| GENE 44 63159 - 63728 382 189 aa, chain + ## HITS:1 COG:no KEGG:BVU_2681 NR:ns ## KEGG: BVU_2681 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 189 16 189 190 236 66.0 2e-61 MKYFLFLFLFLSVGWLHCRSQRVAVSVNALPAIDGAFEAGVSYAIRNKSTVELTGSLRPW KREEKYVNRYWLIQPEYKYWTCQKFNGFFWGAYLNGAQFNIGGKKLPFGIFAGLKKYRYE 
GWLAGGGISAGYHWMLDNHWNIETSLGVGYDFIRYKQYNCVKECAGLRDKGDYHYIGPSK ASISLVYLF >gi|225935355|gb|ACGA01000037.1| GENE 45 63740 - 63931 132 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298384857|ref|ZP_06994416.1| ## NR: gi|298384857|ref|ZP_06994416.1| hypothetical protein HMPREF9007_01498 [Bacteroides sp. 1_1_14] # 1 63 1 63 63 85 74.0 7e-16 MKSLWKVWFSRRRSIYIRIARQCHSTPWRVYYLGHGGISRSIKDIKILKALQQQGIISRI YPW >gi|225935355|gb|ACGA01000037.1| GENE 46 63980 - 65395 889 471 aa, chain + ## HITS:1 COG:no KEGG:BVU_2680 NR:ns ## KEGG: BVU_2680 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 7 470 5 468 469 486 50.0 1e-135 MKTGGTFIIAAILFLSLQAQTPVTGDDGTVRATSSSLHKAGKELLVSIEIEITRDFLPNE SMTLVPVVSDSLEHRLELPAIYINSRKQHIIFLREAGKKEKEAEALQRKNGSRQTMHYLQ SVPFESWMNHATLSLVEKSCGCGIPGEENLTCIARLHPQPTPIPQLVFLTPQVETSKIRE EKGCAFIDFPVNVTAIYETYSNNTVELNKIIETINTIKNDTNVTITHISIHGYASPDGPY RLNEKLARERTQSVKEYVNQLYAFDGAHIQTDYTPEDWEGFEALLCDTTFQEKEAIIKTI TSDIHPDSKERKLKKHFPAFYDFVLKHWFTLLRHSDYTIEYRVRPFTLAESRKVFTTNPK NLSLEEMFRLALASTPGSETYNQIFMTAVQLFPDNPTANLNAACIALMQRDVQTAATYLE KAPKVPETILAKGVLCFLKNNYEEAEYLFRQAQKAGLSQADNNLQLVRALK >gi|225935355|gb|ACGA01000037.1| GENE 47 65420 - 66913 1169 497 aa, chain + ## HITS:1 COG:no KEGG:BVU_2679 NR:ns ## KEGG: BVU_2679 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 497 18 512 512 365 45.0 2e-99 MKRFSKLMSLLLFAAVTQYSCINDADVVIADDTHSESNGNLALFFEIPNVATSRSAESSG VTTEGSKEEYAVKSLTVYLFDSTTKTLKDQQELKNINRTVTSGQNIQYTADKITVNPGTY NIFAIANGKAITGNISTQEAFLNAVDAVTYSAGKIPSVPENGFVMTNRGAANLNVEVKKP TDSDKVTSVSIGLERVVAKVELTQTQETFPLTDPSGKTYCTIKLNNFRMLNLATQFYTFR HTAVLNSFQEPDSYTDENFGDINDNNGYVIDPYFFKKTVENAGDFTNADGFFAQALVQLD INDGNWAGMAQANSWSHIYCLENCMFLDAQLNAYSTGVMFKANMDIAPDRVFNENGDVVS NQSNWPTNMFYFNYNFYTSVDAIRKLVLNNLPDDVTDNSTTEELARYSIKRFKKTENYSC YYNYWIKHLDNNSPEMGVMEFGIVRNNIYRLSVNKIAGLGSGEPFIDPDQPDEYKAELDI NFNVFPWAVRNQDVELE >gi|225935355|gb|ACGA01000037.1| GENE 48 66945 - 67919 374 324 aa, chain + ## HITS:1 COG:no 
KEGG:BVU_2678 NR:ns ## KEGG: BVU_2678 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 323 17 314 315 277 44.0 5e-73 MRPFLLYIISAIALFSSCNWVNDDLSDCPSGTWLKISYTYNILDVDAASTQVSDITILAF DKDDKYVDRVDVDSITLHQGYCMVRVPFPEGTYRLLIWGGTSNHMYQLPNLKAGQTERRS LNISLACDNKNQSNKKLNALFYSSLENITISQKYQVITANLVKDTNYFSCILQDEANAPL SQEDFSFTLESANGIIDYTNTPVGTTPVYYLPYQKEVAVMSEQTPVIHARLNTLRIMKGD QTTLSIKHIPSGQNILRLPLTQYLLLSKIYNDIGEMSDQEYLDRQDSYTLLFFIQPSDTG IPQICPRMQVNGWIIRLNNSELES >gi|225935355|gb|ACGA01000037.1| GENE 49 68564 - 68818 416 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153809175|ref|ZP_01961843.1| hypothetical protein BACCAC_03485 [Bacteroides caccae ATCC 43185] # 1 84 1 84 84 164 98 1e-39 MANHKSSLKRIRQEETRRLHNRYYGKTMRNAVRKLRATTDKAEAVAMYPGITKMLDKLAK VNIIHKNKASNLKSKLALYINKLA >gi|225935355|gb|ACGA01000037.1| GENE 50 69017 - 70975 2139 652 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 6 646 1 630 637 690 55.0 0 MSEEQITPNNGSYSADSIQVLEGLEAVRKRPAMYIGDISVKGLHHLVYEIVDNSIDEALA GYCDHIEVTINEDNSITVQDNGRGIPVDYHEKEKKSALEVAMTVLHAGGKFDKGSYKVSG GLHGVGMSCVNALSTHMTTQVFRDGKIYQQEYEIGKPLYSVKEVGTADITGTRQQFWPDN TIFTETVYDYKILASRLRELAYLNAGLRITLTDRRVVNEDGSFKGEQFYSEEGLREFVRF IESSREHLINDVIYLNSEKQGIPIEVAIMYNTGFSENVHSYVNNINTIEGGTHLAGFRRA LTRTLKKYAEDSKMLEKVKVEISGDDFREGLTAVISVKVAEPQFEGQTKTKLGNNEVMGA VDQAVGEVLAYYLEEHPKEAKTIVDKVILAATARHAARKAREMVQRKSPMSGGGLPGKLA DCSDKDATKCELFLVEGDSAGGTAKQGRNRMFQAILPLRGKILNVEKAMYHKALESDEIR NIYTALGVTIGTEEDSKEANIQKLRYHKIIIMTDADVDGSHIDTLIMTFFFRYMPQIIQN GYLYIATPPLYLCKKGKVEEYCWTDAQRQKFIDTYGGGSENAVHTQRYKGLGEMNAQQLW ETTMDPENRMLKQVNIDNAAEADYIFSMLMGEDVGPRREFIEENATYANIDT >gi|225935355|gb|ACGA01000037.1| GENE 51 71392 - 71973 465 193 aa, chain - ## HITS:1 COG:no KEGG:BVU_1376 NR:ns ## KEGG: BVU_1376 # Name: not_defined # 
Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 193 1 193 193 329 84.0 3e-89 MIQIGDVVVSLDVFQEKFLCDLGACKGACCIEGDAGAPVELDEVMELEEVLPVIWDELAP EARAVIEKQGGVYTDQEGDLVTSIVNNKDCVFTCYDENGCCYCAIEKAYREGKTAFYKPV SCHLYPIRIGDYGPYKAVNYNRWDVCKAAVLLGKKENLPVYQFLKEPLIRKFGEEWYKEL VTVAEELKKQQYI >gi|225935355|gb|ACGA01000037.1| GENE 52 72022 - 72387 317 121 aa, chain - ## HITS:1 COG:no KEGG:BT_0352 NR:ns ## KEGG: BT_0352 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 118 6 120 120 121 56.0 6e-27 MIAKNNAIHIKSYAFAIRIVNAYKFLAESEKEFVLSKQLLRSGTAIGALVAEAHHAQSGA DFLNKMNVALKEANETSYWLSLLKDTHYMDEIVYQSISSDCNELVALLVCIVKSMKDSLK K >gi|225935355|gb|ACGA01000037.1| GENE 53 72412 - 73926 1808 504 aa, chain - ## HITS:1 COG:MA4007 KEGG:ns NR:ns ## COG: MA4007 COG0696 # Protein_GI_number: 20092802 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Methanosarcina acetivorans str.C2A # 6 503 15 518 521 501 50.0 1e-141 MSKKALLMILDGWGLGDQKKDDVIFNTPTPYWDYLTTTYPHSQLQASGENVGLPDGQMGN SEVGHLNIGAGRVVYQDLVKINRACADNSILKNPEVVSAFSYAKENGKSVHFMGLTSNGG VHSSLVHLFKLCDIAKEYNIDNTFIHCFMDGRDTDPKSGKGFIEELSAHCEKSAGKIASI IGRYYAMDRDKRWERVKEAYDLLVNGIGEKATDMVQAMQESYNAGVTDEFIKPIVNANCD GTIKEGDVVIFFNYRNDRAKELTVVLTQQDMPEAGMHTIPGLQYYCMTPYDASFKGVHIL FDKDNVSNTLGEYLASKGLSQLHIAETEKYAHVTFFFNGGRETPFDKEDRILVPSPKVAT YDLKPEMSAFEVKDKLVAAINENKYDFIVVNFANGDMVGHTGIYEAIEKAVIAVDACVKD VIEAAKAQDYEAIIIADHGNADHALNEDGTPNTAHSLNPVPCIYVTENKAAKVENGRLAD VAPTILKIMGLAAPAEMDGNVLIN >gi|225935355|gb|ACGA01000037.1| GENE 54 74099 - 76549 1413 816 aa, chain + ## HITS:1 COG:no KEGG:BT_3418 NR:ns ## KEGG: BT_3418 # Name: not_defined # Def: putative thiol:disulfide interchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 811 1 810 823 1168 70.0 0 MKTTLFCILLLLSGIVSAQTIDHPPFKARSGSISNITRIERTPENTRVYIHAIFRPHWWI MEDGDTYLEDAATGKKYLFKSAEGIELKKEVYMPDSGTMDYVLVFEPLPSETQSIHFLNP 
TDPEGNIYDISLVPQKKKDSSPLVTIKGNWFKTDGSGSWEYGVYDSISILNNRIYTNENI RKKGKRIEMTLKDRESQEEMTLSFTPQKDGTCKIQQKGAEELVYSKERTPITQVAAEPDF KQFFRQDSTYLQGYINGYDPRLGFDTGLIYLSNELTREDYPTVIQIAPNGSFSCRFIINH PIESSVVLGHNWIPFYIEPGQTLTMYIDWEAVMARSRARDHYFPIRNTAYMGPSASLSYL LKDFDNLITYRYEDLSKSQKTLTPDQYKEHMKPIIAQWKQVADSVSQIYQPSLKAVHLIK NKVDLQAGSMLFDFLMSRDYYAKQDSTNQALKVKEDDSYYSFLKDMPLNDVTVLANTNAS TFINRFEYMDLFREAYSGQSFSPSDSIDYTYPKKPLLTFLKEKGVKLNKEQEAIRLRQEK LAGTTAKIIMRQLIAENEKMASLYEKEQKLIQEYVALYSEKKEESQQDKDRIFIKMNQKY DFKKDSIIAQLYPTPNPLLWQIAKVRSLNFNLGNIKDSQIAHEYVDSIKQIFTEPFLAAE AERVLEKAHPKDRARSYQLPEGKATEVFRNIIKNHSDKVLFVDFWATTCGPCRAGIEATA DLRKKYKDHPEFQFIYITSQKDSPEKDYQKYVEKNLKGEACYYVSEAEFNYLRQLFQFNG IPHYELVEKDGSISKERLSSYNIRKYLDNHFEGKTE >gi|225935355|gb|ACGA01000037.1| GENE 55 76728 - 77708 551 326 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 36 291 24 279 315 193 40.0 5e-49 MKPDASARVKQTKGGNKMRTYLYCEAGFVEKAQWLPNSWVNVVCPNNDDFEFLTQTLNVP ESFLNDIADTDERPRTDTEGNWLLTILRIPVQIAQNSTLPFSTVPIGIITNNEIIVSVCY HNTDLLPDFIEHTRRKGIVVRNKLDLILRIIYSSAVWFLKYLKQINIDISAAEKELERSI RNEDLLRLMRLQKTLVYFNTSIRGNEVMIGKLRTIFQDTDYLDTELVEDVIIELKQALNT VNIYSDILTGTMDAFASIISNNVNTIMKRMTSLSIVLMLPTLIASFYGMNVDIHLEEVPF AFSLIVLFSIGLSTLAFVIFRKIKWF >gi|225935355|gb|ACGA01000037.1| GENE 56 77866 - 81279 1676 1137 aa, chain - ## HITS:1 COG:no KEGG:Phep_2759 NR:ns ## KEGG: Phep_2759 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: P.heparinus # Pathway: not_defined # 34 921 34 915 916 872 49.0 0 MYRITLLLMLLTTVNMSAGNFFSVYCLTTEQAGNPIGIDSQSPRFSWKIYAQKRNFKQYA YQVCVADSEDRLTNGEANIWDSGKVLSEKSILIPFEGVKLKSSEVYYWKVRIWDDENKAS AWSQINTFTTGLLEDSDWGNALWISMEKDRELVRGIHYQEKEALPAQKVGMYRLPQFRKQ FSVKAKNISRAFAYVSGLGHFDFFLNGKKVGDHFLDAGWTLYDKEAFYVSFDITDQLQRG ENVLGVMLGNGFYNVPQERYFKLLISYGAPKMRLHLRIVYDDNSVQEIISDKSWRVSESP VTFSSIYGGEDYDATREQPGWMDVGFDSSNWKDVLVSNYVPKMVSQQTEPIKQREEIPVV 
EYYKNEKGNWVYDLGQNFSGIIRLSIRGERGQSVRLIPAELLNRDHTVNQSASGEPFYFA YKLRGGQCIETWQPQFTYYGFRYVEVEDAVPAGEENPDKLPIITGLVGLQTCLATPETGS FSCSNPLFNKIHNLIDWAMRSNMASVLTDCPHREKLGWVEQAYLMQYSLQYRYNMSRMYN KIIRDMCLSQTEEGIIPSISPEYVRFKEGFEDTPEWGSAFIISSWYAYLWYGDDRALIEY YPAMKNYMNYLASRAKDHIISYGLGDWFDIGPDVPGNSQLTSNGVTATATYYYNAVIMQK IARLLDVSEDVETYKKLADSIKVAFNRIFFDSSSNIYDHNSQTTNAIVLFMDLVDEAHKP IVLGNLVRDIQNRNYALTAGDIGYRYVLRTLEANELSELIYKMNCRYDVPGYGWQLAHDA TALTESWQAFGFVSNNHFMLGHLMEWLYSGIGGIRQAENSLGYKTVVITPQLVGDITSAV TSYESPYGMIRCEWKKGREKYELKVSIPANSEAIISLPAATFEDVADYGVDLTSVTSIIK MGTCQEGIKLKVGSGNYLFTVNNPVYKSTTSLDVSKVTNVLCLGNSITKHGVKRDIDWLS DWGMAASKEEYDYCHQLQSMLKQYNDLSTVTPLNIAYWEQNLNCNIDSLIGEECQNKDLI VIRLGENVRNKELFKTKILDLVEVCKKYASNIIITGCFWPDADKEEALISAASRSGAEYV PLAWISEQQGVYPEIGDILYSTSNKPYKVKQDFIITHPNDKGMRMIARRIFEAIDRK >gi|225935355|gb|ACGA01000037.1| GENE 57 81287 - 82240 688 317 aa, chain - ## HITS:1 COG:no KEGG:BT_1873 NR:ns ## KEGG: BT_1873 # Name: not_defined # Def: endo-arabinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 312 14 322 327 367 57.0 1e-100 MRISKLLCKSFALLSLLFIACSNTDDIQQKVTEVEASQLKVRDPFIFYDVDTDYYYLCVN GTLKVKSYKSKDLLTWQENGYSFLPSAGFWGKEDFWAPDLYKYKDKFYLFITLSAPNIKR GTSVLASDRVMGTFQPLVNKAVTPQEWTCLDGSLYVDQEGTPWLFYCREWVEVGDGEIYV QQLKDDLTVTDSEPYLLFKASEAPWVGPVTLGENTGNVTDAPTVYRLENGALLMLWSSFT KDGKYCIGQALSESGNVRGPWVQIPEPLNDDDGGHAMLFKDKAGNLKISYHSPNTTGLER LTIKDVVIENNKVRIIE >gi|225935355|gb|ACGA01000037.1| GENE 58 82285 - 83496 909 403 aa, chain - ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 62 394 240 564 818 117 27.0 5e-26 MIKKNRLKYIVFCWLNITAVFAFVGQVAAQTAEMNTHPDYSVAAYVWPSCHNDHPKGCKL LWPEGMGEWEIIKKATPRFEGHYQPKQPLWGYEMDNDPQVMEKWIDAATRHGVNVFIFDW YWYDEGPFLESCINDGFLKAHNNDKMQFYIMWANHDVKKNYWNVHKYGNDESLLWDATVD WKNFKTIVDRVINQYFVKSNYFKIDGAPVFSIFGLEKFIQSFGTVDEARKALDYFREQVK 
KYGFPELHIQSIIGGGIPDKTMLKQIEQLGINSMTHYNWGGPHPEDYMTWGRESMERMEK WVEALSVPYFPNASIGWDDTPRFPHKTKKDVVHYNNSPQSFATYLQKAKEYVDARPDLPK LITVFSWNEWIEGGYLLPDMKYGFGYLEAVKEVMLDGKYDRYK >gi|225935355|gb|ACGA01000037.1| GENE 59 83607 - 85031 712 474 aa, chain - ## HITS:1 COG:no KEGG:DICTH_1900 NR:ns ## KEGG: DICTH_1900 # Name: not_defined # Def: F5/8 type C domain protein # Organism: D.thermophilum # Pathway: not_defined # 32 471 45 481 685 187 30.0 7e-46 MYLKYVFIVFLVISITCCTDKSNPRNIGELLSGNFKIEVGGEPVFVYQARVSKYPVNQNW PGYQRPLEQTEIASFASFDYEAGKTVRITTDKKIESCDIRPKEFGIIPEVKGNTIEFEVA NPCQFVVEINGHHEALHLFVNPKEYTQVDKEDSRIHYYGPGVHEAGVINVKSDETVYIDE GAVVYGVIRSENSSNIKIIGKGILDASRIARGTAPNMISLHKVRNAYIGGIILRDAHEWG VVPSCCNQVIFDNIKLIGFWRYNSDGIDIVNSSNIVIKNSFIRAFDDNIAIKGLQWAYEE QRIIEGIRVDNCVLWNDWGKIFEFGAETVVDTIRDVVISNCYVPRFTMVAMDMQNGDRGH IENVIFDNISIEEPIRERAMLGETPMDTKDWGRAISLGVYGTQWSSDVIRGSIDNVQFRN IRCVGSSSSIELKGYDEKHRVSNVHIEGYLVNGTKITGDDFVQKNEFVTNVVWE >gi|225935355|gb|ACGA01000037.1| GENE 60 84970 - 85107 61 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPEKYISLGIFCSIVPSLQYCKKDNYVFEICIHCVFSDKYNLLYG >gi|225935355|gb|ACGA01000037.1| GENE 61 85124 - 86962 1426 612 aa, chain - ## HITS:1 COG:no KEGG:BDI_3134 NR:ns ## KEGG: BDI_3134 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 16 610 14 605 605 286 35.0 2e-75 MHAMKKIKSLLFLGLFMMTSCLDILDKEPLDIISDNAVWSDPVLIDSYLAECYYQTSVMV NETPGYFTDGGNFWQSELGMGMCWINEIADEAKVNWAYNTDAVRTYKAGGLTIGGGFLEW WELPYNTIRALNEFIQRVPGSPVAEELKNERVAEARFLRAYNYFAMVKRYGGVPLITVPQ ALDEPWEELYPSRNSEQEVYDFILSEMDDIINNEYLYETVGDDNLGRPSKYAALALKSRA ALYAGSIAQFGKVQMNGLLGIPSSLAEHYYRESYKASKEIGKKHSLYDVNADKVSNFKNI FLVKNNCEVILARRHNDSPTSINSGDYSGGHSWGYDFAQCPKPQAWGAGNKDAPYLEMVE AFEHIDGTSGILDRMQIQQGLWTTDELWANKDPRFFATIYTQNTAWKGTMVDYHNGLRLP DGTIQNDGSYQGVLALGTQSVDNGFGTGFGIMKYLDEGNNTLQFPGISSTDYLVFRYGEV LLNLAEAAFELGETGEALGAVNEIRERAGIAPLESIDRDKIRHERKVELAFEGHRYWDVR RWRIAVDVLSKPNSGLRYILDYETGKYRLQVLDNVDGTGTSPNFYEHNYYFPITLSRTGN NPNLQENPGYQN >gi|225935355|gb|ACGA01000037.1| GENE 62 
86976 - 90335 2079 1119 aa, chain - ## HITS:1 COG:no KEGG:BDI_3133 NR:ns ## KEGG: BDI_3133 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 28 1119 11 1117 1117 659 36.0 0 MKNFQQKRLAGKIPWYRRGMAGWSLALVCFCVILFGNIQTMVAQTRNVTIDVVNTPIGKV FDELSKQTRYKIFYSDNVIDSKQKVSINCVNKPLKDVLEKILEGTGVGFEIKGRKVLLFK LKEPSSNSKKESNIMINGRVTDENGEPIIGATITSEKTLKGTITDLDGKFMLEIPENTNL TVSYVGYRNQIVRGLPNMDVRLQEDSKMLDEVVVVGYGTQRKGNLTGSVSSVKADKLTIA PVATASNTLVGQLPGLIATQSSGQPGSDSATLNVRGFGSALIIVDGIEASLDNIDANQIE SISILKDGAASIYGARAGNGVILITTKRGTDQKPMITLNTAFTWQGVTKMLKPASSGQRA EMEREAWLQSGQPEASAPFTEEQIQKYYDGTDPLFPNTNWYKELIRDWAPEQQHNISIRG GNDRLKFYGFFGYLNQETMIKKNGGNYERFNLQSNIDAKILDNLTLRLDLAYSQEQRNYT TRSMGVGGTIWQDYWNTLPYYPSTLPDPDRIPYAFGAGVGGLHVSSNRDISGYDDARGED LRGTASLEYKFKSVKGLSIKMFESYRKVYSFNKIFNKPVDLYTYDPASDIYTLAGTYGSK ASLSQAHSRSETFTQQYSINYENSFNDTHHLTAMALYEAINYSGDYLFASRKDFMMTAID QMFAGSTEGMTNNGYANEMGRMSFVGRVNYSYKNKYLLETILRADASAKFPPGKRWGYFP SVSVGWMLSEEKFISQIRFIDNLKLRLSYGQSGNDGVGNFQYLSGYTYGHAYLLGKESVQ GMVSTGLANPNLTWEKVEIANVGIDYSFFDRKLYGEVDVFYRERTGIPATRLMSLPSTFG ANLPPENINSLNDRGFEVKLGTSGTVNQLRYDVSGNISWSRAKWKYYEEPDYTDSDQARI FKQSGKWTDCVYGYVSDGLFTSQTEIDVAPYYEDLQGNAMLKPGDVRYKDLNGDGIINWK DQKEIGGSTLPHWIYGLNVALSYKGFDLSMLFQGAFGYNQNISNGLYSTLLYEERWSPEN NDANAIYPRLGGVSTNSYTSDFTYKKAGYLRLKVASLGYSLPRQLLEKCHINNARVYLSG TNLLTFNRLGKYGIDPEAQNVGYYYPQQRTISLGLNISF >gi|225935355|gb|ACGA01000037.1| GENE 63 90519 - 91508 555 329 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 101 260 105 258 331 70 31.0 5e-12 MISDIDNSLLVKYLKNELNEEESQKVIEWLEQNKENLEFLFGLKDLYMLGRWEELSKKAD TAHGWEKLVDSIKKQRQSKNVFRTYLKYAAILIVFFSMGYGYKNYFHQPSPMMNTIITAE GERTTVILDDGTKVKLNQNSKLVYPSSFDGKNRKVALSGEAYFEVFHNDENPFLVDVGIY IVKVLGTKFNVEAYPGDICSYTSLKEGRVQILENGTEEVHILSELKPGTQLVYNVQTGQY 
QMNRVNIEEIGDWMKGQIVVKQKTLLELTDILKQKYGYHFEIHTDSISNIIYNIVLEQES LEEILNDMTIITPQVHYSVHHESRTVIFR >gi|225935355|gb|ACGA01000037.1| GENE 64 91505 - 92119 378 204 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 16 192 11 184 208 65 25.0 4e-11 MKTETDSFYEDICKGDLKAFEYFYKKYQPRLFAYGVGVLGDEEASKDLVQETFIAFWENK ERLVTCYSVSSYLFKIFQSKCLNYLRKRTLLSDFSSLSELKLKEIEMSYYSSDNIDGGTV FMKEVEELYTKTMNDLPDKCREIFILSKEQDVKAADIADKLGVSVRTVENQLYKAIKIMR QAMREYAIPVIFVVLSNLLNTLSR >gi|225935355|gb|ACGA01000037.1| GENE 65 92382 - 93917 1557 511 aa, chain - ## HITS:1 COG:no KEGG:BT_3413 NR:ns ## KEGG: BT_3413 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 511 1 485 485 529 55.0 1e-149 MATNKLLWSSKLIGVFFIMLVCTLSANAQFLRTSYFMEGTHYRQQLNPALTPTKGYFNLP VIGAVNATVGSTSLGYQDILDIIDNGDDFYKSNDFMSRLKDKNKLNVNFSTEILSAGWYK GKNFWSFNIGLRTDIGANLTKSMFQFLHDMDGLETTWQDQQFNIGGQQLNIQAYTEVGLG LSRQINNRLTVGARVKALLGIGNMEMKINKAYMNAVLPSNARMQEFEHMQITDIAQINQL KNEIGNYRAELAVGASLESSFKGLNLVQEEGQDYISDFDFDAGDMGIAGYGFGIDLGASY KIMDNLTVSASILDLGFISWSKGSTKIASADSGIDMKGSDYVNELAGINPADPNALQQAQ AVIDGFTNKATDYLNRVSNGDVLDYDMLQLRTEDASKSRKSRLASTLVLGAEYGFFNNKL AVGVLSTTRFVQPDALTELTFSANYRPKSWFNVALSYSAIQSAGKSFGLGLKLGPLFVGT DYMFLGKNSNSVNGFVGVSIPLGGRKASKEG >gi|225935355|gb|ACGA01000037.1| GENE 66 94098 - 94700 550 200 aa, chain - ## HITS:1 COG:NMA0075 KEGG:ns NR:ns ## COG: NMA0075 COG0164 # Protein_GI_number: 15793104 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Neisseria meningitidis Z2491 # 9 196 2 193 194 179 50.0 3e-45 MLLPYLNENLIEAGCDEAGRGCLAGAVYAAAVILPKDFKNELLNDSKQLTEKQRYALREV IEKEAIAWAVGIVSPEEIDEINILRASFLAMHRAVDQLSTRPQHLLIDGNRFTKYPGVPH TTVVKGDGKYLSIAAASILAKTYRDDYMNRLHEEFPYYDWDHNKGYPTKKHRAAIAERGT TPYHRMTFNLLGDGQLTLSF >gi|225935355|gb|ACGA01000037.1| GENE 67 94790 
- 96994 2382 734 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 3 727 11 683 685 526 48.0 1e-149 MDNILFWLVPVASVLALCFAYYFHKQMMKESEGTPQMIKIAAAVRKGAMSYLKQQYKIVG WVFLGLVILFSIMAYGFHVQNAWVPIAFLTGGFFSGLSGFLGMKTATYASARTANAARNS LNAGLRIAFRSGAVMGLVVVGLGLLDISFWYLLLNAVIPADALTPTHKLCVITTTMLTFG MGASTQALFARVGGGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGM GADLYESYCGSILATAALGAAAFIHSADTVMQFKAVIAPMLIAAVGIILSIIGIFAVRTK ENATMKDLLGSLAFGTNLSSVLIVAATFLILWLLQLDNWIWISCAVVVGLVVGIIIGRST EYYTSQSYRPTQKLSESGKTGPATVIISGIGLGMLSTAIPVVAVVIGIIASYLLASAGDF GNVGMGLYGIGIAAVGMLSTLGITLATDAYGPIADNAGGNAEMSGLGAEVRKRTDALDSL GNTTAATGKGFAIGSAALTGLALLASYIEEIRIGLTRLGNVDLTFADGSSISVANATFID FMDYYEVHLMNPKVLSGMFLGSMMAFLFCGLTMNAVGRAAGHMVDEVRRQFRDIKGILTG EAEPDYERCVEISTKGAQREMVIPSLIAIVAPILTGFIFGVPGVLGLLIGGLSSGFVLAI FMANAGGAWDNAKKYVEEGNFGGKGGEVHKATVVGDTVGDPFKDTSGPSLNILIKLMSMV AIVMAGLTVAWSLF >gi|225935355|gb|ACGA01000037.1| GENE 68 97176 - 97409 95 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172443|ref|ZP_05758855.1| ## NR: gi|260172443|ref|ZP_05758855.1| hypothetical protein BacD2_11309 [Bacteroides sp. D2] # 1 77 8 84 84 114 100.0 2e-24 MQVVIFKKQKRKRRKNRGRTCDSKSTVTVLSHAAVTPLLIHCQKFKYKHVTDDSKFSLFN SLVNARWDYISVYFFMT Prediction of potential genes in microbial genomes Time: Fri May 13 08:53:28 2011 Seq name: gi|225935354|gb|ACGA01000038.1| Bacteroides sp. D2 cont1.38, whole genome shotgun sequence Length of sequence - 224120 bp Number of predicted genes - 155, with homology - 151 Number of transcription units - 73, operones - 35 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 99 - 2927 2297 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 3070 - 3129 3.9 + Prom 2917 - 2976 4.3 2 2 Op 1 . + CDS 3170 - 5539 2166 ## BF3178 hypothetical protein 3 2 Op 2 . 
+ CDS 5622 - 7811 2240 ## COG3345 Alpha-galactosidase + Term 7838 - 7877 1.3 - Term 8140 - 8180 8.1 4 3 Op 1 24/0.000 - CDS 8187 - 9404 1145 ## COG0520 Selenocysteine lyase 5 3 Op 2 41/0.000 - CDS 9418 - 10761 1208 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 6 3 Op 3 41/0.000 - CDS 10775 - 11527 213 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 7 3 Op 4 . - CDS 11570 - 13069 1417 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 8 3 Op 5 . - CDS 13050 - 13577 423 ## BT_3405 hypothetical protein - Prom 13699 - 13758 7.1 - Term 13647 - 13700 1.6 9 4 Op 1 20/0.000 - CDS 13765 - 16863 3649 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) - Prom 16891 - 16950 1.9 10 4 Op 2 . - CDS 16968 - 18251 595 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 11 4 Op 3 . - CDS 18254 - 18721 436 ## BT_3402 hypothetical protein - Prom 18875 - 18934 4.8 + Prom 18669 - 18728 6.5 12 5 Tu 1 . + CDS 18894 - 19289 462 ## BT_3400 hypothetical protein 13 6 Tu 1 . - CDS 19355 - 20053 399 ## COG1451 Predicted metal-dependent hydrolase - Prom 20075 - 20134 5.0 14 7 Tu 1 . + CDS 20183 - 20965 575 ## BT_3398 hypothetical protein + Term 20982 - 21028 8.1 - Term 20969 - 21016 5.3 15 8 Op 1 . - CDS 21133 - 21612 403 ## BT_3397 hypothetical protein 16 8 Op 2 . - CDS 21602 - 22111 290 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 17 8 Op 3 . - CDS 22173 - 22946 780 ## COG0548 Acetylglutamate kinase 18 8 Op 4 . - CDS 23003 - 24895 1798 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) 19 8 Op 5 . - CDS 24965 - 26035 1103 ## BF0195 hypothetical protein 20 8 Op 6 . - CDS 26045 - 27811 1221 ## COG0326 Molecular chaperone, HSP90 family 21 8 Op 7 . - CDS 27821 - 30319 3113 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 30373 - 30432 5.7 - Term 30473 - 30512 -0.5 22 9 Tu 1 . 
- CDS 30576 - 31103 476 ## COG0703 Shikimate kinase - Prom 31150 - 31209 4.3 - Term 31134 - 31196 3.2 23 10 Tu 1 . - CDS 31219 - 31824 510 ## COG3560 Predicted oxidoreductase related to nitroreductase - Prom 31856 - 31915 4.7 + Prom 31794 - 31853 8.5 24 11 Op 1 . + CDS 31962 - 32597 661 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein 25 11 Op 2 . + CDS 32627 - 34399 1199 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family + Term 34417 - 34479 7.2 - Term 34243 - 34273 -0.9 26 12 Op 1 . - CDS 34386 - 36365 1120 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 27 12 Op 2 . - CDS 36412 - 36831 256 ## BT_3390 hypothetical protein 28 12 Op 3 3/0.182 - CDS 36815 - 37771 893 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 29 12 Op 4 . - CDS 37768 - 38571 564 ## COG0726 Predicted xylanase/chitin deacetylase 30 12 Op 5 . - CDS 38586 - 40421 215 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Prom 40444 - 40503 4.6 - Term 40488 - 40532 3.6 31 13 Op 1 . - CDS 40542 - 41675 785 ## COG1672 Predicted ATPase (AAA+ superfamily) 32 13 Op 2 . - CDS 41754 - 43832 1496 ## COG5545 Predicted P-loop ATPase and inactivated derivatives + Prom 44355 - 44414 5.2 33 14 Tu 1 . + CDS 44435 - 45046 316 ## BT_3384 hypothetical protein + Term 45090 - 45155 8.2 - Term 45084 - 45134 8.1 34 15 Op 1 . - CDS 45159 - 45482 187 ## BT_1828 hypothetical protein - Prom 45512 - 45571 2.1 35 15 Op 2 . - CDS 45597 - 46796 697 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 46850 - 46909 1.9 36 16 Op 1 . - CDS 47389 - 47739 170 ## BDI_3447 hypothetical protein 37 16 Op 2 . - CDS 47818 - 48168 297 ## BDI_3447 hypothetical protein - Prom 48190 - 48249 2.5 38 17 Op 1 . - CDS 48655 - 50853 1612 ## BT_3382 hypothetical protein 39 17 Op 2 . - CDS 50885 - 52609 938 ## BT_3381 hypothetical protein - Prom 52772 - 52831 9.8 40 18 Tu 1 . 
+ CDS 52944 - 53816 281 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 53929 - 53988 3.2 41 19 Op 1 . + CDS 54008 - 54823 220 ## Mmol_1034 glycosyl transferase family 2 42 19 Op 2 . + CDS 54876 - 55874 431 ## gi|260172485|ref|ZP_05758897.1| hypothetical protein BacD2_11531 43 19 Op 3 . + CDS 55874 - 56713 258 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 44 20 Op 1 . - CDS 56786 - 58111 782 ## COG0438 Glycosyltransferase 45 20 Op 2 . - CDS 58176 - 59105 338 ## BVU_1071 glycosyl transferase family protein - Prom 59191 - 59250 5.0 + Prom 59206 - 59265 4.4 46 21 Tu 1 . + CDS 59458 - 61764 472 ## COG1216 Predicted glycosyltransferases - Term 61749 - 61791 9.0 47 22 Op 1 2/0.182 - CDS 61852 - 62955 465 ## COG3754 Lipopolysaccharide biosynthesis protein 48 22 Op 2 . - CDS 62984 - 63976 444 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 63875 - 63934 6.7 49 23 Op 1 1/0.182 + CDS 64165 - 64974 423 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 50 23 Op 2 . + CDS 64976 - 66025 820 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 51 23 Op 3 . + CDS 66022 - 66753 489 ## COG1213 Predicted sugar nucleotidyltransferases 52 23 Op 4 3/0.182 + CDS 66769 - 67581 569 ## COG3475 LPS biosynthesis protein 53 23 Op 5 . + CDS 67611 - 68744 816 ## COG0438 Glycosyltransferase 54 23 Op 6 . + CDS 68741 - 69553 386 ## gi|160883972|ref|ZP_02064975.1| hypothetical protein BACOVA_01946 55 24 Tu 1 . - CDS 69554 - 70465 345 ## BT_3364 hypothetical protein - Prom 70591 - 70650 2.5 + Prom 70583 - 70642 9.5 56 25 Tu 1 . + CDS 70697 - 71755 491 ## COG0859 ADP-heptose:LPS heptosyltransferase 57 26 Op 1 5/0.091 - CDS 71738 - 72811 612 ## COG0438 Glycosyltransferase 58 26 Op 2 . - CDS 72872 - 73903 557 ## COG0859 ADP-heptose:LPS heptosyltransferase 59 26 Op 3 . - CDS 73907 - 74512 580 ## BDI_2820 hypothetical protein - Prom 74602 - 74661 7.3 + Prom 74463 - 74522 7.7 60 27 Tu 1 . 
+ CDS 74630 - 75670 430 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 75884 - 75923 0.2 - Term 75568 - 75598 -0.9 61 28 Tu 1 . - CDS 75729 - 76313 257 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN - Prom 76445 - 76504 7.8 + Prom 76384 - 76443 5.2 62 29 Op 1 27/0.000 + CDS 76463 - 76699 401 ## COG0236 Acyl carrier protein 63 29 Op 2 1/0.182 + CDS 76715 - 77977 1279 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 64 29 Op 3 . + CDS 78015 - 79007 636 ## COG0571 dsRNA-specific ribonuclease - Term 78865 - 78905 4.3 65 30 Tu 1 . - CDS 78949 - 79959 828 ## COG0205 6-phosphofructokinase - Prom 80076 - 80135 5.6 + Prom 80011 - 80070 2.0 66 31 Tu 1 . + CDS 80097 - 81623 1130 ## BT_3355 putative auxin-regulated protein + Term 81682 - 81718 -0.2 - Term 81727 - 81766 0.1 67 32 Tu 1 . - CDS 81863 - 83017 861 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 83054 - 83113 2.0 + Prom 82978 - 83037 2.7 68 33 Tu 1 . + CDS 83074 - 85914 1320 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Prom 85945 - 86004 7.0 69 34 Op 1 . + CDS 86060 - 88648 1716 ## BT_3328 hypothetical protein 70 34 Op 2 . + CDS 88670 - 90673 1665 ## BT_3329 hypothetical protein + Term 90696 - 90745 8.3 + Prom 90685 - 90744 4.1 71 35 Tu 1 . + CDS 90837 - 91649 722 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 91792 - 91851 5.4 72 36 Tu 1 . + CDS 91881 - 93362 1348 ## COG0215 Cysteinyl-tRNA synthetase + Term 93402 - 93442 5.1 - Term 93388 - 93432 11.4 73 37 Op 1 . - CDS 93503 - 96364 2442 ## BT_3350 putative chondroitinase (chondroitin lyase) - Prom 96416 - 96475 1.6 74 37 Op 2 . - CDS 96478 - 98028 1270 ## COG3119 Arylsulfatase A and related enzymes - Prom 98076 - 98135 5.1 75 38 Tu 1 . - CDS 98194 - 99396 949 ## BT_3348 putative unsaturated glucuronyl hydrolase - Prom 99497 - 99556 5.8 + Prom 99620 - 99679 5.5 76 39 Tu 1 . 
+ CDS 99858 - 100427 105 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 100670 - 100729 3.9 77 40 Tu 1 . + CDS 100866 - 102365 988 ## gi|260172521|ref|ZP_05758933.1| hypothetical protein BacD2_11711 + Prom 102412 - 102471 4.2 78 41 Tu 1 . + CDS 102530 - 103474 359 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 103581 - 103640 3.7 79 42 Op 1 . + CDS 103694 - 107185 1598 ## Dfer_2402 TonB-dependent receptor plug 80 42 Op 2 . + CDS 107200 - 108966 681 ## Dfer_2403 RagB/SusD domain protein + Prom 109048 - 109107 7.6 81 43 Tu 1 . + CDS 109128 - 110027 346 ## COG3568 Metal-dependent hydrolase + Term 110126 - 110176 -1.0 + Prom 110115 - 110174 6.1 82 44 Op 1 . + CDS 110256 - 112121 686 ## COG3533 Uncharacterized protein conserved in bacteria 83 44 Op 2 . + CDS 112182 - 112874 252 ## RB343 hypothetical protein + Prom 112903 - 112962 2.3 84 44 Op 3 . + CDS 112986 - 116264 1290 ## Slin_0358 coagulation factor 5/8 type domain protein + Term 116297 - 116355 9.7 - Term 116276 - 116349 18.7 85 45 Op 1 . - CDS 116449 - 118257 1160 ## COG3250 Beta-galactosidase/beta-glucuronidase 86 45 Op 2 . - CDS 118297 - 119373 782 ## COG1409 Predicted phosphohydrolases 87 45 Op 3 . - CDS 119377 - 120462 768 ## COG3507 Beta-xylosidase 88 45 Op 4 . - CDS 120489 - 122321 1466 ## PRU_2073 hypothetical protein 89 45 Op 5 . - CDS 122342 - 125842 2842 ## PRU_2074 hypothetical protein 90 45 Op 6 . - CDS 125900 - 127624 1164 ## gi|260172534|ref|ZP_05758946.1| hypothetical protein BacD2_11776 - Prom 127854 - 127913 7.1 + Prom 127800 - 127859 9.7 91 46 Tu 1 . + CDS 128030 - 128626 381 ## BT_1877 RNA polymerase ECF-type sigma factor + Term 128650 - 128708 0.4 + Prom 128647 - 128706 1.8 92 47 Op 1 . + CDS 128728 - 129720 534 ## COG3712 Fe2+-dicitrate sensor, membrane component 93 47 Op 2 . + CDS 129753 - 130163 502 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism - Term 130239 - 130280 -1.0 94 48 Op 1 . 
- CDS 130328 - 131203 748 ## BT_3342 hypothetical protein 95 48 Op 2 . - CDS 131148 - 133103 1528 ## BT_3341 hypothetical protein - Prom 133178 - 133237 8.4 + Prom 133140 - 133199 6.1 96 49 Tu 1 . + CDS 133402 - 134013 555 ## BT_2534 hypothetical protein + Term 134202 - 134236 3.1 + Prom 134183 - 134242 11.2 97 50 Tu 1 . + CDS 134491 - 134835 349 ## gi|237720495|ref|ZP_04550976.1| conserved hypothetical protein + Term 134885 - 134931 1.8 + Prom 134988 - 135047 7.5 98 51 Op 1 . + CDS 135171 - 135647 586 ## BT_2538 hypothetical protein + Term 135743 - 135799 1.1 + Prom 135831 - 135890 4.7 99 51 Op 2 . + CDS 135931 - 139038 2898 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 139101 - 139146 12.3 + Prom 139077 - 139136 2.9 100 52 Op 1 27/0.000 + CDS 139163 - 140398 1205 ## COG0845 Membrane-fusion protein 101 52 Op 2 9/0.000 + CDS 140414 - 143845 3134 ## COG0841 Cation/multidrug efflux pump 102 52 Op 3 . + CDS 143857 - 145248 388 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 103 52 Op 4 . + CDS 145267 - 146037 675 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase + Term 146058 - 146101 11.2 - Term 146046 - 146089 11.2 104 53 Tu 1 . - CDS 146122 - 146763 452 ## BT_3335 hypothetical protein - Prom 146821 - 146880 7.0 + Prom 146844 - 146903 9.6 105 54 Op 1 . + CDS 146951 - 147961 582 ## COG1609 Transcriptional regulators + Prom 147965 - 148024 6.7 106 54 Op 2 . + CDS 148087 - 150723 1393 ## BDI_1318 glycoside hydrolase family protein + Prom 150740 - 150799 3.7 107 55 Op 1 . + CDS 150846 - 153248 955 ## COG3250 Beta-galactosidase/beta-glucuronidase 108 55 Op 2 . + CDS 153254 - 154222 364 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 109 55 Op 3 . + CDS 154228 - 154806 580 ## COG0279 Phosphoheptose isomerase 110 55 Op 4 . + CDS 154829 - 156430 1332 ## gi|260172554|ref|ZP_05758966.1| hypothetical protein BacD2_11876 + Term 156442 - 156469 -0.4 + Prom 156435 - 156494 5.7 111 56 Op 1 . 
+ CDS 156567 - 157721 748 ## COG0738 Fucose permease 112 56 Op 2 . + CDS 157734 - 159833 1581 ## COG3345 Alpha-galactosidase 113 56 Op 3 . + CDS 159830 - 163339 2427 ## COG3345 Alpha-galactosidase 114 56 Op 4 . + CDS 163377 - 165125 1324 ## gi|260172558|ref|ZP_05758970.1| hypothetical protein BacD2_11896 115 56 Op 5 . + CDS 165154 - 168393 2674 ## Coch_1022 TonB-dependent receptor plug 116 56 Op 6 . + CDS 168413 - 170251 1316 ## Phep_1301 RagB/SusD domain protein 117 56 Op 7 . + CDS 170270 - 173296 2375 ## Phep_3406 TonB-dependent receptor plug 118 56 Op 8 . + CDS 173308 - 175080 1261 ## Phep_3874 RagB/SusD domain protein 119 56 Op 9 . + CDS 175103 - 176173 782 ## BF1058 hypothetical protein 120 56 Op 10 . + CDS 176191 - 177474 569 ## COG3458 Acetyl esterase (deacetylase) + Term 177540 - 177589 14.3 + Prom 177584 - 177643 5.4 121 57 Tu 1 . + CDS 177670 - 181728 2953 ## COG0642 Signal transduction histidine kinase + Prom 181792 - 181851 3.5 122 58 Op 1 . + CDS 182021 - 183556 1249 ## COG3119 Arylsulfatase A and related enzymes 123 58 Op 2 . + CDS 183560 - 183676 73 ## 124 58 Op 3 . + CDS 183716 - 186874 2960 ## BT_3332 hypothetical protein 125 58 Op 4 . + CDS 186902 - 188674 1752 ## BT_3331 hypothetical protein 126 58 Op 5 . + CDS 188705 - 189805 844 ## BT_3330 hypothetical protein + Term 189884 - 189927 10.1 + Prom 189829 - 189888 4.7 127 59 Op 1 . + CDS 189950 - 190894 749 ## COG0042 tRNA-dihydrouridine synthase 128 59 Op 2 . + CDS 190949 - 192184 988 ## BT_3325 hypothetical protein + Term 192405 - 192469 3.9 - Term 192402 - 192442 3.4 129 60 Op 1 . - CDS 192572 - 195643 2214 ## BT_3324 chondroitinase (chondroitin lyase) precursor 130 60 Op 2 . - CDS 195679 - 196371 475 ## BVU_0159 hypothetical protein - Term 196399 - 196440 7.4 131 61 Op 1 . - CDS 196487 - 197857 1092 ## BF3314 hypothetical protein 132 61 Op 2 . - CDS 197927 - 198115 67 ## - Prom 198229 - 198288 7.0 133 62 Op 1 . + CDS 198183 - 199355 826 ## Cpin_2255 hypothetical protein 134 62 Op 2 . 
+ CDS 199361 - 200836 1155 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 135 62 Op 3 . + CDS 200867 - 203836 2202 ## Cpin_2252 TPR repeat-containing protein 136 62 Op 4 . + CDS 203811 - 205964 1419 ## Cpin_2251 coagulation factor 5/8 type domain protein 137 62 Op 5 . + CDS 206003 - 206518 412 ## gi|237716783|ref|ZP_04547264.1| conserved hypothetical protein - Term 206524 - 206591 17.4 138 63 Tu 1 . - CDS 206612 - 207172 579 ## BT_3323 hypothetical protein 139 64 Op 1 . + CDS 207459 - 209264 523 ## BDI_3446 hypothetical protein 140 64 Op 2 . + CDS 209295 - 210071 527 ## gi|260172582|ref|ZP_05758994.1| hypothetical protein BacD2_12016 141 64 Op 3 . + CDS 210001 - 210348 152 ## gi|237720541|ref|ZP_04551022.1| predicted protein 142 64 Op 4 . + CDS 210356 - 211897 712 ## gi|237720542|ref|ZP_04551023.1| predicted protein 143 64 Op 5 . + CDS 211902 - 212360 234 ## gi|237720543|ref|ZP_04551024.1| predicted protein 144 64 Op 6 . + CDS 212357 - 212803 227 ## gi|237720544|ref|ZP_04551025.1| predicted protein + Term 212869 - 212926 10.3 145 65 Tu 1 . - CDS 213347 - 213574 65 ## - Prom 213778 - 213837 6.5 146 66 Tu 1 . - CDS 213935 - 214063 64 ## - Prom 214134 - 214193 5.9 + Prom 214098 - 214157 6.0 147 67 Tu 1 . + CDS 214215 - 214589 226 ## gi|298480418|ref|ZP_06998615.1| O-antigen polymerase superfamily 148 68 Tu 1 . - CDS 214760 - 216103 487 ## gi|260172588|ref|ZP_05759000.1| hypothetical protein BacD2_12046 - Prom 216152 - 216211 10.1 - Term 216167 - 216220 -0.8 149 69 Tu 1 . - CDS 216243 - 217526 851 ## BT_3321 hypothetical protein - Prom 217559 - 217618 5.2 + Prom 217524 - 217583 4.3 150 70 Op 1 . + CDS 217610 - 218374 850 ## COG0289 Dihydrodipicolinate reductase 151 70 Op 2 2/0.182 + CDS 218410 - 219894 1473 ## COG0681 Signal peptidase I 152 70 Op 3 . + CDS 219937 - 220878 514 ## COG0681 Signal peptidase I + Prom 220886 - 220945 6.4 153 71 Tu 1 . 
+ CDS 220970 - 221596 604 ## BF0179 hypothetical protein + Term 221724 - 221763 -1.0 154 72 Tu 1 . - CDS 221651 - 223360 1175 ## COG1874 Beta-galactosidase - Prom 223396 - 223455 5.6 - Term 223395 - 223442 7.5 155 73 Tu 1 . - CDS 223556 - 224044 320 ## BT_3316 hypothetical protein Predicted protein(s) >gi|225935354|gb|ACGA01000038.1| GENE 1 99 - 2927 2297 942 aa, chain - ## HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 27 738 28 739 740 125 24.0 5e-28 MKRKLLLFLLSICFFLPDFAATGQYNFIRVDGGSGLSNSHVKSIIQDSYGFIWLGTRNGL NRYDGVSMKLYNCYDETLQHGNQVISALFEDNHRQLWVGTDDGVYIQELATGKFSFFDAR TENGEQIRYNWIEDILADHSGNIWVNAPNQGVFRYQVETGKLFRYIPCPSKDKSKDFPQS ICVDKEGTVWVGTYGAGIYRYSPEQDKFVPYATEALKGDFIFTLCDYGDELIVGVHEEEL KRFNKKTGEVSIFPAPEVHRKIIRYAVCFGDELWVGTQNGVYVINEKHNSVQHIPADAGG KYGLGDAIVDKIYRDCEGGTWICTQFGGVSYLPVRTLDFSVYLPGAPGTVSGRRISELAE GKDGVVWVSTQDGGVCYWNPGTQTFTKIPQSPDRQNVLSLFASDDLVGAGYFKGGIDLIV SQDQVQTFYPAQLGISEGSVFALYRDRGGAIWLGDGWNIFRSADKGRTFEKMEQFGYAYM RDILEDKSGNIWVATMGSGIFRYHPQTNQMVSYKCIPEDSTSIGTNEVTGISEDSKGFIW FSTDRGGLLRFNPETGRFRTYTKANGLPDNVTYKVVEDRQHRIWFGTDRGLVCLHPETDS LQVFNRNDGLPDNQFNYKSALAASDGTIWMGTINGLVSFNPQIVHPNTFVPPVYITGMYV QGRETPFTADGVQLPYRSNVSFDFVALSYTSPSANRYAYKMEGIDNDWNYTSDVHTASYA QLPPGDYLFRVRGSNNNGVWNQEEATLSVRILPPWWRTVWAYLIYIIVVSGSFVLTLRAY RRREVQKIREQQLLAELARERESHRTHEMFINQITYGACTPQGGAMSRADEQLMSQLIAK VRENLSDANYNVEALAAAMNMSRSSLHRKIKALTDLSSLDFIRIIRLKRAAELLQEGELR INEISDRVGFQSPSYFAKIFQKQFGVTPTEFAQQNKQRMAED >gi|225935354|gb|ACGA01000038.1| GENE 2 3170 - 5539 2166 789 aa, chain + ## HITS:1 COG:no KEGG:BF3178 NR:ns ## KEGG: BF3178 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 46 789 51 783 783 696 50.0 0 MKKVTFSCLTLATLLQLIPSNGNAANIQVNKKKTNRIEMKQDTIPANSTKKESEEGERNV MLNASDANKPREIQIGLPSEDVNVYENGLPAVYSSTVHKLAAHWRSDSSLGEVGLLSPSE 
SAITTGNIAYSVNSFSKLGQKDFQGILNYRTNHFGMQNFDFNVSGAINDQWLYTASIYQN FDPGSFDLKFTNYADRTEIYHAGLTRLFNNGRGKISLLYKHSRSENPASYANAAPFIYVG DGSVKEIDGFKLGTNSYVPQNGSFPYMDVRDGKIKTWNLGDGSENRANEIALISDYRFRN DLLWKFNLKYMDAPRANYVDFGGSTISEVTANDGFTLSNGDPYEGLAEGRRTWLHVGKVK NFLITSELNKTFGSHNLRLGVNEWYYHLDYYSSSLQWMATVQNYPQLLNSTTTSSLDPTL TGQRVQTYGYNELSPEYTKGYENKLALYFTDNWQVTPQFNIYYGGRLEYYRMSADQISAS RFPGFHIGDFTTYSKEEETGNIIATQRSIHPAKVVKDKLNYAATLRATYNVTGQFGLTAD GTVATRFPRINEYAGTGPTEEQYKRVTIPLIRGGLFYKNKWINLTSMVTYISKSNNIDQQ NLTKPGTEEGKTVLLIYNIKTLGWTTSAEIDPFKGFHLHALFTYQKPVYKNYNASVTFND GSEMSVNANGMIVKEIPQILVELDPSYNITKDLRLWLSFRYFGKTYANLQEALYFNGRWE TFGGINWNVNKHLSLGATVINFLNQKGASGTINGSELITKEEAAQYAGNYMSGNYLRPFT VEFSASIKF >gi|225935354|gb|ACGA01000038.1| GENE 3 5622 - 7811 2240 729 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 20 711 7 712 748 346 30.0 8e-95 MKKLFVLLMALFTLAQVDAQEKNVIRIATDNTDLILQVAPNGRLYQAYLGDKLLNEKDIN NFSPYVKGGSDGSVSTRGWEVYPGSGAEDYFEPAVAITHNDGNPSTILRYISSEQKAVAG GTETIIQLKDNQYPVEVTLHYIAYPKENVIKTWSEIKHAEKKPVTLWRYASTMLYFSGNE YYLTEFSSDWAKEAQMSTQPLLFGKKVIDTKLGSRAAMHTHPFFELGFEQPAQEAQGRAM LGTIGWTGNFQFTFEVDNVGNLRVIPAINPYASDYELKPNEVFTTPEFIFTFSNNGTGEA SRNLHAWARNYQLKDGQGDRMTLLNNWENTYFKFNEELLAELMKEAKHLGVDMFLLDDGW FGNKHPRNSDNAGLGDWEVMRSKLPGGIPALVQSAKEAGVKFGIWIEPEMVNPKSELFEK HPDWAIQLPNRETYYYRNQLVLDLSNPKVQDFVYGVVDKILTENPEVAFFKWDCNSPITN VYSPYLKNKQGQLYIDHVRGIYNVLKRIKDKYPNVPMMLCSGGGARCDYEALKYYTEFWC SDNTDPIERLFIQWGFSQIFPAKAMCAHVTSWNKNTSVKFRTDVASMCKLGFDLGLKELN ADEQTYCQNAVANWTRLKKVILDGDQYRLVSPYDGNHMSLMYASPDKNKAVLYTYDIHPR FGEKLLPVKLQGLDAKKMYKVKEINLMPNSKSNLAANEKTYSGDYLMKVGINAFTTNQTF SRVIELTAE >gi|225935354|gb|ACGA01000038.1| GENE 4 8187 - 9404 1145 405 aa, chain - ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: 
Mesorhizobium loti # 4 405 11 412 413 450 51.0 1e-126 MNVDIQKIREDFPILSRTVYGKPLVYFDNGATTQKPRLVVDALVDEYYSVNANVHRGVHY LSQQATELHEASRETVREFINARSTNEVVFTRGTTESINLLVSSFGDEFMEEGDEVIVSV MEHHSNIVPWQLLAARKGIAIKVIPMNDKGELLLDEYEKLFSERTKIVSVVHVSNVLGTV NPVKEMIATAHAHGVPCLIDAAQSIPHMKVDVQELDADFLVFSAHKIYGPTGVGVLYGKE EWLDRLPPYQGGGEMIQHVSFEKTTFNELPFKFEAGTPDYIGTTGLAKALDYVNGHGIEQ IAAHEHELTTYALQRLKEIPHIRIFGEAAERGAVISFLVGDIHHFDLGTLLDRLGIAVRT GHHCAQPLMQRLGIEGTVRASFAMYNTKSEIDTLVAGIDRVSKMF >gi|225935354|gb|ACGA01000038.1| GENE 5 9418 - 10761 1208 447 aa, chain - ## HITS:1 COG:alr2494 KEGG:ns NR:ns ## COG: alr2494 COG0719 # Protein_GI_number: 17229986 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Nostoc sp. PCC 7120 # 26 425 45 440 453 187 32.0 4e-47 MMVEQQYIELFSQTEAMICKHSAEVLNAPRASAFADFERLGFPTRKMEKYKYTDVSKYFE PDFGLNLNRLAIPVNPYEVFKCDVPNMSTSLYFVVNDTFYNRALPTGNLPEGVIFGSLKE VAEQHPELVKKYYGQLADTSKDGVTAFNTAFAQDGVVFYVPKNVVVEKTIQLVNILRADV NFMVNRRVLIILEDGAQARLLICDHAMDNVNFLATQVIEVFAGENAIFDMYELEETHTST VRISNLYVKQEANSNVLLNGMTLHNGTTRNTTEVLLAGEGSEINLCGMAIADKNQHVDNH TSIDHAVPNCTSNELFKYVLDDQSVGAFAGLVLVRPDAQHTNSQQTNRNLCATRDARMYT QPQLEIYADDVKCSHGATVGQLDEGALFYMRSRGIAEKEARLLLMFAFVNEVIDTIRLEA LKDRLHLLVEKRFRGELNRCQGCAICK >gi|225935354|gb|ACGA01000038.1| GENE 6 10775 - 11527 213 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 2 235 12 232 318 86 29 8e-16 MLEIKDLHASINGKEILKGINLTVNPGEVHAIMGPNGSGKSTLSSVLVGNPAFEVTKGSV TFYGKDLLELSPEDRSHEGIFLSFQYPVEIPGVSMVNFMRAAVNEQRKYKGLPALTASEF LKLMREKRAVVELDNKLANRSVNEGFSGGEKKRNEIFQMAMLEPRLSILDETDSGLDIDA LRIVAEGVNKLKTPETSTIVITHYQRLLDYIKPDIVHVLYKGRIVKTAGPELALELEEKG YDWIKKEVGE >gi|225935354|gb|ACGA01000038.1| GENE 7 11570 - 13069 1417 499 aa, chain - ## HITS:1 COG:SMc00530 KEGG:ns NR:ns ## COG: SMc00530 COG0719 # Protein_GI_number: 15965488 # Func_class: O Posttranslational 
modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Sinorhizobium meliloti # 25 499 11 489 489 714 70.0 0 MQQEPNKMKPNQKDELEKKTDNEFVRKFAEEKYKYGFTTEVHTDIIERGLNEDVIRLISS KKDEPEWLLEFRLKAYRHWLTLEMPTWAHLRIPEIDYQAISYYADPTKKKEGPKSMEEVD PELIKTFNKLGIPLEEQMALSGMAVDAVMDSVSVKTTFKETLMEKGIIFCSFSEAVREHP DLVKKYMGSVVGYRDNFFAALNSAVFSDGSFVYIPKGVRCPMELSTYFRINARNTGQFER TLIVADDDSYVSYLEGCTAPMRDENQLHAAIVEIIVHDRAEVKYSTVQNWYPGDAEGKGG VYNFVTKRGNCKGVDSKLSWTQVETGSAITWKYPSCILTGDNSTAEFYSVAVTNNYQQAD TGTKMIHLGKNTRSTIVSKGISAGHSENSYRGLVRVAEKADNARNYSQCDSLLLGDKCGA HTFPYMDIHNETAVVEHEATTSKISEDQIFYCNQRGIPTEDAIGLIVNGYAKEVLNKLPM EFAVEAQKLLTISLEGSVG >gi|225935354|gb|ACGA01000038.1| GENE 8 13050 - 13577 423 175 aa, chain - ## HITS:1 COG:no KEGG:BT_3405 NR:ns ## KEGG: BT_3405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 173 1 173 176 249 87.0 2e-65 MATIDIIILIVIGAGAIVGFVKGFIRQLASILGLIVGLLAAKALYASLAEKLCPTVTDSM TVAQVLAFIMIWIAVPLIFVLIASLLTKAMQAISLNWLNRWLGSGLGALKFLLLTSVVIG AIEFVDSDNKLISATKKEESLLYYPMETFAGIFFPAAKNMTQQYILENKDATRTQ >gi|225935354|gb|ACGA01000038.1| GENE 9 13765 - 16863 3649 1032 aa, chain - ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 460 1031 157 730 730 533 52.0 1e-150 MTIRLNKVTRDLNVGIATVVEFLQKKGHTVEANPNTKISEEQYAILVKEFSTDKNLRLES ERFIQERQNKERNKASVSIEGFEKQPEKPKSEDVIKTVVPEDARPKFKPVGKIDLDKLNG RKPEKVEKEPEQKQEEPVVERPVVKPEVKKEPEKREPEVKKEEVVTPPVSVVEPTPVVVE PVVVPEPVVETKPVEVEKVVEEVKKEEPKVVVAAPVKAEEHKEEEKVETAQAEVTPVAEK APEDDGVFKIRQPELGAKINVIGQIDLAALNQSTRPKKKSKEEKRREREEKEKIRQDQKK LMKEAIIKEIRKDDSKLAKSGPKDSAEAAANKKKRNRINKEKVDVNNVATSNFAAPRPNV QGKGGNSNGQGGQANGQGNNNRRNNNNNKDRFKKPVIKQEVSEEDVAKQVKETLARLTTK GKNKTSKYRKEKREMASNRMQELEDQEMADSKVLKLTEFVTANELATMMDVSVNQVIATC 
MSIGIMVSINQRLDAETINLVAEEFGFKTEYVSAEVAQAIVEEEDAPEDLQPRAPIVTVM GHVDHGKTSLLDYIRKANVIAGEAGGITQHIGAYNVQLEDGRRITFLDTPGHEAFTAMRA RGAKVTDIAIIIVAADDNVMPQTKEAINHAMAAGVPIVFAINKVDKPTANPDKIKEELAA MNYLVEEWGGKYQSQDISAKKGMGVEDLLEKVLLEAEMLDLKANPDRNATGSIIESSLDK GRGYVATVLVSNGTLKVGDIVLAGTSYGRVKAMFNERNQRVKEAGPSEPALILGLNGAPA AGDTFHVVESDQEAREITNKREQLAREQGLRTQKILTLDELGRRIALGNFQELNIIVKGD VDGSVEALSDSLIKLSTEQIQVNVIHKGVGAISESDVSLAAASDAIIVGFQVRPGAAGKM ADQEGVDIRKYSVIYDAIEEVKAAMEGMLAPEVKEQITATIEIREVFNITKVGLVAGAMV KTGKVKRSDKARLIRDGIVIFTGNINALKRFKDDVKEVGTNFECGISLVNCNDMKVGDMI ETFEEIEVKQTL >gi|225935354|gb|ACGA01000038.1| GENE 10 16968 - 18251 595 427 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 1 403 1 426 537 233 33 4e-60 MAKKEETISLIDTFSEFKELKNIDRTTMVSVLEESFRSVIAKMFGTDENYDVIVNPDKGD FEIWRNREVVADEDLTNPNMQISLSEAQKIDASYEEGEEVTDEVIFAKFGRRAILNLRQT LASKILELEKDSIYNKYIDKVGTIINAEVYQIWKKEMLLLDDEGNELLLPKTEQIPSDFY RKGETARAVVARVDNKNNNPKIILSRTSPVFLQRLFEMEVPEINDGLITIKKIARIPGER AKIAVESYDDRIDPVGACVGVKGSRIHGIVRELRNENIDVINYTSNISLFIQRALSPAKI SSIRLNEEEKKAEVFLKPEEVSLAIGKGGLNIKLASMLTEYTIDVFRELDENVADEDIYL DEFRDEIDGWVIDAIKAIGIDTAKAVLNAPREMLIEKTDLEEETVDEVIRILKSEFEEED PSIENKD >gi|225935354|gb|ACGA01000038.1| GENE 11 18254 - 18721 436 155 aa, chain - ## HITS:1 COG:no KEGG:BT_3402 NR:ns ## KEGG: BT_3402 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 155 255 97.0 3e-67 MIEKKTVCQIVEEWLEGKDYFLVEVTVSPDDKIVVEIDHAEGVWIEDCVELSRFIESKLN REEEDYELEVGSAGIGQPFKVLQQYYIHIGQEVEVLTGDGRKLAGILKDADEEKFTVGVQ KKVKTEGSKRPKLVEEDETFTYEQIKYTKYLISFK >gi|225935354|gb|ACGA01000038.1| GENE 12 18894 - 19289 462 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3400 NR:ns ## KEGG: BT_3400 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 209 83.0 3e-53 MAHRLNTNKQFMVGNGLLAFAVIFVVVIFVYMSMRLQREKQEERHFIESYTISLVKGFAG 
DSISLFVNDSLISNKTMSEEPYTIEVGRFAEQSALLIVDNNTELVSTFDLSEKGGTYQFE KESDGIKQLAK >gi|225935354|gb|ACGA01000038.1| GENE 13 19355 - 20053 399 232 aa, chain - ## HITS:1 COG:SMc02768 KEGG:ns NR:ns ## COG: SMc02768 COG1451 # Protein_GI_number: 15963779 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Sinorhizobium meliloti # 13 209 30 223 251 79 30.0 6e-15 MDKIIEDDELGRLIVRVNSRARSLVFRTKSDAVYVSVPPGTTLKEVKQAIENLRGKLLAS RQKLARALVDLNYKIDAEHFKLSLVSGEKDQFLANSRLGVMEIVCPPHADFTDEKLQSWL HKVIEESLRRNAKSILPSRLASLSKQCGLPYSSVKINSSQGRWGSCSARKDINLSYYLVL LPSHLIDYVLLHELCHTREMNHSERFWALLNQFTEGKALALRGELRKYRTEI >gi|225935354|gb|ACGA01000038.1| GENE 14 20183 - 20965 575 260 aa, chain + ## HITS:1 COG:no KEGG:BT_3398 NR:ns ## KEGG: BT_3398 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 259 1 260 261 261 56.0 1e-68 MKTSLLLVLVTGLSLVLTAGSCLQGKHVVGSKNYISQEVKADHFNEIKLLGSANISYHQD TRSHVEIHGSDNIIPLVETYVDGNTLIIKFKKNVSIWKGKLEIKVFAPELNKLTINGSGN IKLINGIQTSKDIEFHINGSGNIQGEGLNCRRMAVSINGSGDVRLQQIESQECQAGISGS GNINLKGKAIQAKYSIAGSGNIQAADLEAENTDASISGSGNISCYASQKLVARVKGSGDI AYKGNPQEVDAPRKNIRQIK >gi|225935354|gb|ACGA01000038.1| GENE 15 21133 - 21612 403 159 aa, chain - ## HITS:1 COG:no KEGG:BT_3397 NR:ns ## KEGG: BT_3397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 158 158 264 88.0 8e-70 MDYKDIEQLLERYWQCETSVEEEATLRDFFAKEEVPAHLLRYKNLFVYQQVQQEVGLGED FDARILAEVEPTVVKAKRLTLTGRFIPLFKAAAVIAIILSLGNVAQHSFSGDDGSVLATD TIGKQVTAPSVAISNDVKAEQVLADSLARVNHKVQVINE >gi|225935354|gb|ACGA01000038.1| GENE 16 21602 - 22111 290 169 aa, chain - ## HITS:1 COG:MT1259 KEGG:ns NR:ns ## COG: MT1259 COG1595 # Protein_GI_number: 15840665 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 15 168 93 247 257 65 28.0 4e-11 
MQEISFRNDILPLKDKLFRLALRITLDRAEAEDVVQDTMIRVWNKRDEWSQFESVEAYCL TVAKNLAIDRSQKKEAQNVELTPEMEEEPDANSPYDQMIHDERMNIINRLVNELPEKQRL IMQLRDIEGESYKKIAGLLNLTEEQVKVNLFRARQKVKQRYLEIDEYGL >gi|225935354|gb|ACGA01000038.1| GENE 17 22173 - 22946 780 257 aa, chain - ## HITS:1 COG:MK1631 KEGG:ns NR:ns ## COG: MK1631 COG0548 # Protein_GI_number: 20095067 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Methanopyrus kandleri AV19 # 5 256 1 246 246 159 40.0 4e-39 MREKLTVIKVGGKIVEEEATLRQLLNDFAAITGHKVLVHGGGRSATKIAAQLGIESKMVN GRRITDAETLKVVTMVYGGLVNKNIVAGLQARGVNALGLTGADMNVIRSVKRPVKEVDYG FVGDVEKVDATLLSDLIHKGVVPVMAPLTYDGHGNMLNTNADTIAGETAKALSALFDVTL VYCFEKKGVLRDENDDDSVIPQITRAEFEQYVADGVIQGGMIPKLENSFEAINAGVSEVV ITLASAINNSGGTRIKK >gi|225935354|gb|ACGA01000038.1| GENE 18 23003 - 24895 1798 630 aa, chain - ## HITS:1 COG:all3401 KEGG:ns NR:ns ## COG: all3401 COG1166 # Protein_GI_number: 17230893 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Nostoc sp. 
PCC 7120 # 2 629 51 677 679 609 45.0 1e-174 MRKWRIEDSEELYNITGWGTSYFGINDKGHVVVTPRRDGVTVDLKELVDELQLRDVASPM LIRFPDILDNRIEKMSSCFKQAAEEYGYKAENFIIYPIKVNQMRPVVEEIISHGKKFNLG LEAGSKPELHAVIAVNTDSDSLIVCNGYKDESYIELALLAQKMGKRIFLVVEKMNELKLI AKMAKQLNVQPNIGIRIKLASSGSGKWEESGGDASKFGLTSSELLEALDFLESKGMKDCL KLIHFHIGSQVTKIRRIKTALREASQFYVQLHSMGFNVEFVDIGGGLGVDYDGTRSSNSE GSVNYSIQEYVNDSISTLVDVSDKNGIPHPNIITESGRALTAHHSVLIFEVLETATLPEW DDEEEIAPDAHELVQELYGIWDSLNQNKMLEAWHDAQQIREEALDLFSHGIVDLKTRAQI ERLYWSITREINQIAGGLKHAPDEFRGLSKLLADKYFCNFSLFQSLPDSWAIDQIFPIMP IQRLDEKPERSATLQDITCDSDGKIANFISTRNVAHYLPVHTLKKTEPYYVAVFLVGAYQ EILGDMHNLFGDTNAVHVSVNEKGYNIEQIIDGETVAEVLDYVQYNPKKLVRTLETWVTK SVKEGKISLEEGKEFLSNYRSGLYGYTYLE >gi|225935354|gb|ACGA01000038.1| GENE 19 24965 - 26035 1103 356 aa, chain - ## HITS:1 COG:no KEGG:BF0195 NR:ns ## KEGG: BF0195 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 350 1 350 359 552 82.0 1e-156 MRYTLEIQKLLLQTQNNNLHPREKANLLKEAIRIADENEDVEWATELRLDLIYELNLLSA DAEEITVFSKILDDYENHKDVIKEDDLLWKYKWIWACTFDLPEIPMEQVQAIGEDYKTRI LRNGYSLRSYYHRWSVECVWMRQYDKAKEYIDKMLNEKIDDQSCEACELNFMLDYYLETG QFDEAYSRAQPLINKQVTCYEANLRAYLKLAYYAQKAGKPEIAADMCARAEEALVGREKD EYLLLYLGLFIAYNMMTKPERAWKYAERCIGWSLRTNTLKSYRFSCDMVEALKYETRPEV SLSLPEEFPLYRPDGIYQVNELRNYFYQQAEELARRYDARNGNSGYMDRLKDLMSN >gi|225935354|gb|ACGA01000038.1| GENE 20 26045 - 27811 1221 588 aa, chain - ## HITS:1 COG:lin0941 KEGG:ns NR:ns ## COG: lin0941 COG0326 # Protein_GI_number: 16800010 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Listeria innocua # 9 534 8 556 564 311 33.0 3e-84 MEKEGNNLFQVNLKGMIALLSEHIYSNPNTFVRELLQNCVDAITALRNIDENYKGRIDVF LNENKTVVFRDNGIGLKEEEVYRFLTVIGESSKRDTPDADDFIGRFGIGLLSCFVVTNEI TVESRSAMGGQPVCWCGKVDGTYQLTLSDEERPIGSQVVLHPKGDWMHLFEYETFKKILV GYGEVLPYPIYLHYQGEEELVNTPSPVWLDPKATRKELLDYGAKVFQSSALDAFRIYTES GKVEGVLYVLPFRTQFSVRNSHKVYLKRMLLSEDDCNLLPSWAFFIRCLVNADGLLSTAS 
RESLVSNDQLKDARKEIGVAIKDYLRGLVQNDRAMFNRILDVHHFHIKAIASEDNELLRL FMDYLPFETNKGLRSFGSIRSASNVICYTKNLEDFRQVRRIAGAQGWLVVNAAYTFDETL LKKYVRLNPELTLDEISPSRLLEQFGEVEANQEFQAFEVKANELLKRFGCICRLKHFTPV DIPVIFVAEEKENAAKSANNPLAAVLGAVNTTKQIPPTLTFNADNEMVKTLLQIQGDNKL FQHVVHILYVQSLLQGKYPVNSEEMELFNHSLSELMTSKMNDFINFLN >gi|225935354|gb|ACGA01000038.1| GENE 21 27821 - 30319 3113 832 aa, chain - ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 372 829 65 537 590 157 27.0 8e-38 MKTLKEKFGELSAKIKASGQPARVWFPQYTPASLLSAENWWEALAVCEYALDTKEDEKLT EDFFELIFSAFDCNVEVDLNAEEYEFWWEKVMQVCDRVAEFSGAGWAQKGAQYSEARYGK RDMSYLLPYYEKAADMGWAEAEATVAYWRYMGFYCEQDKEEGERRFAALTSPEAILWGKH YRAFAEEFTGDKAKALQIRNELLAELPEGERLRAHVYASLGDALDRAEGNVAEEAAYYEK ALEIVPNLYSLKNLATLYFRYPELNKPKELCFELWEKAWHAGVWSAANFLGYNYQEEEWQ DMPKAIEWLEKGMLYCEPYSAYELALIYLYNDEYKNVERGLMCLNRCVEDDYIQGIEGLA NIYFNGDLVPEDMNRAKELLEKAIELGSGSAAYRLGWMYERGFLSEEPDYVKALEFYEKA ASLNNADGYCRVALYLANGYSGVKDPVKSREYYEKAAELGACFALVELAFLYENGDGVEK NYEKSFELISKAAEQGYPYAMFRVGLYMEKGVLGEVKPEEAFAWYTKAAEADDNDAIFAL GRCYREGIGTEENWDKALEWFSKGAEKNEARCLTELGMAYENGNGVEENPQKAVEYMMKA AEQDYGYAQFKMGDYYFFGCGPCLEDNKTAVEWYEKAVANEIPMAMLRVGEYYLYDYDSL NESEKAFAYFKKAAEYEWYSEGLGICYEMGIGVEENETEAFKYYTLAADNGNTTSMYRTG LCYYNGVGVKQNYAEAYRWFTDAAGNENVAAIYYLGKMMMYGEGCNPDPEAAVQWLLKAA EKNNDKAQFELGNAYLTGNGVEENDEIAMEWFEKAAENGNEKALKITGRRRK >gi|225935354|gb|ACGA01000038.1| GENE 22 30576 - 31103 476 175 aa, chain - ## HITS:1 COG:alr1244 KEGG:ns NR:ns ## COG: alr1244 COG0703 # Protein_GI_number: 17228739 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Nostoc sp. 
PCC 7120 # 2 159 8 162 181 102 35.0 4e-22 MVRIFLTGYMGAGKTTLGKAFARYMNIPFIDLDWYIEERFHKTVGELFIERGETGFRELE RNMLHEVAEFENVVISTGGGAPCFYDNMDFMNRTGKTVFLEVHPDVLFRRLRVAKQQRPI LQGKEDEELKAFIVQALEKRAPFYHQAQYIFNADELEDRWQIETSVQCLRQLLGL >gi|225935354|gb|ACGA01000038.1| GENE 23 31219 - 31824 510 201 aa, chain - ## HITS:1 COG:CAC3314 KEGG:ns NR:ns ## COG: CAC3314 COG3560 # Protein_GI_number: 15896557 # Func_class: R General function prediction only # Function: Predicted oxidoreductase related to nitroreductase # Organism: Clostridium acetobutylicum # 2 200 1 198 198 232 54.0 3e-61 MMERSFSEALKQRRTYYSITNQSPISDQEIECIVNMTVRHVPSAFNSQSTRVVLLLGESH KKLWQIVKDALKRIVPAEAFVKTEEKIDHSFACGYGTVLFFEDQKVVKGLQEAFPSYQEN FPGWSLQTSAMHQLAIWVMLEDVGFGASLQHYNPLIDEEVRRAWDLPEHWHLIAEMPFGL PVGKPGEKEFQPLEERVRIFK >gi|225935354|gb|ACGA01000038.1| GENE 24 31962 - 32597 661 211 aa, chain + ## HITS:1 COG:BH0863 KEGG:ns NR:ns ## COG: BH0863 COG3341 # Protein_GI_number: 15613426 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: Bacillus halodurans # 1 211 1 196 196 185 47.0 6e-47 MAKQKFYVVWEGVTPGVYTSWTDCQLQIKGYEAAKYKSFDTREEAERALTMSPYAYIGKN AKAKSGGPKPSSDTLPSCVIDNSLAVDAACSGNPGPMEYRGVHIASRQEIFHFGPMKGTN NIGEFLAIVHGLALLKKKGFDMPIYSDSANAISWVRQKKCKTKLPRTPETEELFLLIERA EKWLQGNTYTTRILKWETKEWGEIPADFGRK >gi|225935354|gb|ACGA01000038.1| GENE 25 32627 - 34399 1199 590 aa, chain + ## HITS:1 COG:all2870 KEGG:ns NR:ns ## COG: all2870 COG1807 # Protein_GI_number: 17230362 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Nostoc sp. 
PCC 7120 # 35 500 55 561 641 85 25.0 2e-16 MKTLTSNKAFWLLLVICVVTILPYLGLSEYHTKGEPRESIVSYSMLDSGNWILPRNNGGE MAYKPPFFHWSIAAVSAAVNGGQVTEMTSRLPSAIALIAMTLCGFLFFAKRKGVELALLA AFITLTNFELHRAGANCRVDMVLTALTVGALYCFYKWYEKGLKGIPWLAILLMSCGTLTK GPVGTIIPCLVVGIFLLLRGVNFFKAFLLLSAWAILSLILPFCWYVAAYQQGGEEFLALV MEENLGRMTNTMSYDSCVNPWHYNFVTLFAGYVPWTLLVVLSLFSLTYHKFSIQPAAWWK RFTTWIKNMDPVDLFSFTSIVVIFVFYCIPQSKRSVYLMPIYPFIAYFLAKYLFYLVKKQ SKVIKVYGSILAVISLLLFTCFIVLKCGLIPETIFQGRHAPDNINFMRAIQNISGAGALF LIAIPTILGIYWWFYQRKNALSNRFLYALVVLTMGLYLALDGAYQPAALNSKSVKFIATE IEKVAPESEGTMYEFIEESLHAAGDPVHYFELNFYLRNRLDNFYEKRPSEGFLLIGTNDA EKYLPEFEKEGYQFEEVYESPKRVLRQIAKVYKFVKKQQPENTETTPIVE >gi|225935354|gb|ACGA01000038.1| GENE 26 34386 - 36365 1120 659 aa, chain - ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 19 575 35 607 700 195 27.0 2e-49 MKRFLNLVVYILTIHVSALLIAGLFRLVLFISSYHQLTSEALSDKTLSMLAFVHGVWFDN VIGCYILLLPLVVAVVCGVCNYYGKALFRFFTIFFSVFYGLVYLISASDIPYFAYFFKHI NSSIFEWFGYAGTTAGMILGESAYYLSIGLFLLFLAGFVVWLIYLARYFHHRSLTISTPF PYWKRGGAVLIGACLIGLCIFGIRGRTGYNPIKVSAAYFCQDAFLNQLGVSPTFNLLTSV MDDRRPENKYLHLMDEQEAITKAQALLNRPGEPNVSPLAVYRHTSKMDSVQQRRPNVVLI MMESMSSKFMKHFGQSETLTPFLDSLYTRSISFRNFYSAGIHTNHGLYATLYSFPAMMKR NLMKGSVIPRYSGLPTVLKENGYYNLFFMTHEGQYDNMNAFFRTNGYDEVFSQEDYPADK VVNSFGVQDDFLYDYAIPVLNQRAATGQPFFATLLSISNHPPYVIPPFFHPKTSEPETQI VEYADWALRQFFEEARKQPWFDNTIFVLEGDHGKLVGDAECELPESYNHIPLMIYSSRIQ PEEKTAFGGQVDIQPTILGLLNIDYLQNNFGVDLLKEERPCMFYTADNMVAGRNDTLLYL YNYETQQELTYHIGNGKLNAVPMDDSFLPLKEYSFSMLQSAEFLVKHGKTLNSIPFTQQ >gi|225935354|gb|ACGA01000038.1| GENE 27 36412 - 36831 256 139 aa, chain - ## HITS:1 COG:no KEGG:BT_3390 NR:ns ## KEGG: BT_3390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 136 11 133 143 125 49.0 5e-28 
MIGISKAILEKHPDWRDKFWQFVRFGVVGTISSAIHYGVYCLVLLVANANISFTAGYAVG FVCNYFLTTFFTFRSKPSSHNAIGFGFSHLINYLLEIGLLNLFLWIGAGELLAPILVMII VVPINFLILHFVYIYKGRK >gi|225935354|gb|ACGA01000038.1| GENE 28 36815 - 37771 893 318 aa, chain - ## HITS:1 COG:lin1066 KEGG:ns NR:ns ## COG: lin1066 COG0463 # Protein_GI_number: 16800135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 4 318 6 323 329 315 47.0 6e-86 MIKLAIVSPCYNEEEVLEDSASRLTALFDELVAKEKISADSFVLFVNDGSKDRTWSIIKK LHGTNPYIKGMNLARNVGHQYAIMAGMMTAKDWSDAVITIDADLQDDLNAIEEMIDAYTE GYDVVYGVKVSRQADPMLKRLSATAFYKLQHRMGVETIYNHADFRFLSRRVLEQLSHYQE RNVYLRGIIPLLGFPSTTVDDVIRERTAGTSKYTVRKMFSLALDGITSFSVKPIYGIVYL GGIFVFISILIGIYVLYALISGTAEHGWASLMLSIWFVGGVVLLSIGAVGLYIGKIYKEV KRRPLYNVEEVLYDDRNK >gi|225935354|gb|ACGA01000038.1| GENE 29 37768 - 38571 564 267 aa, chain - ## HITS:1 COG:MA0797 KEGG:ns NR:ns ## COG: MA0797 COG0726 # Protein_GI_number: 20089681 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Methanosarcina acetivorans str.C2A # 2 257 16 248 250 68 26.0 9e-12 MILLSFDTEEFDVPREHGVDFPLDEAMKVSVYGTNRILDCLKSNGVKATFFCTSNFAENA PEVMRRIMDEGHEVAAHGCDHWQPQASDVSRSKEILERLTGRTIQGYRQPRMFPVSDTEL GRMGYVYNSSLNPAFIPGRYMHLSEPRTCFMTGKLLQIPASVTPWIRFPLFWLSCHNLPM WLYQLLVNRVLKHDGYFVTYFHPWEFYPLGEHPEFKMPFIIRNHSGKGMEERLDMLIRKL KEKGYAFMTYSEFAQIKLAELNKPDEK >gi|225935354|gb|ACGA01000038.1| GENE 30 38586 - 40421 215 611 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 370 607 1 229 311 87 28 5e-16 MKEFLQLMRRFVSPYKKYIGWAVLLNILSAVFNVFSFTFLIPILSILFKTEGADKVYHFM EWGSGDLADVAKNNFYYYISQMIVDNGPTVALIFLGLFLMVMTLFKTGCYFASSAVMIPL RTGVVRDIRIMVYAKVMRLPMSFFSEERKGDIIARMSGDVGEVENSITSSLDMLMKSPIL IIIYFVTLVTVSWQLTLFTIIVLPGMGWLMGVVGRKLKRQSLEAQAKWSDTMSQLEETLG GLRIIKAFIAEDKMINRFTKCSNELRDATNKVAIRQAMAHPMSEFLGTILIVAVLWFGGT 
LILGKNATIDAPTFIFYMVILYSVINPLKDFAKAGYNIPKGLASMERVDKILKAENKIKE IPNPKPLKGLNDKIEFKDISFSYDGKREVLKHVNLTVPKGKTIALVGQSGSGKSTLVDLL PRYHDVQEGDITIDGTSIRDVRIADLRSLIGNVNQEAILFNDTFFNNIAFGVENATMEQV IEAAKIANAHDFIMEKPEGYNMNIGDRGGKLSGGQRQRISIARAILKNPPILILDEATSA LDTESERLVQEALERLMKTRTTIAIAHRLSTIKNADEICVLYEGEIVERGKHEELIELNG YYKRLHDMQQL >gi|225935354|gb|ACGA01000038.1| GENE 31 40542 - 41675 785 377 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 4 376 2 386 390 72 22.0 1e-12 MKTMIRNPFITSGYVSADYFCDRRFESEQLVREMMNGNNLALVSTRRMGKTGLIRHCFQF PEIKQGYYTFFIDIYDSRSLRDLVFALSKEILEVLKPVGKKALQSFWECVKSLQASISFD VNGMPSLNLGLGDIQAPATTLDEIFRYLEQADKPCLVAIDEFQQISGYVEKNVEATLRTY VQHCNNARFIFAGSQRHVMGNMFLTPSRPFYQSVSMMHLESIPLEEYIRFAGTHFKRAGK EIEENAIIAIYQQFEGITWYIQKVLNTLYDMTPEQGVCRVEMVPGAIRQIIDSFRYTYSE ILFRLPEKQKELLIAITKEGKAKAVTSGAFIRKYRLASASSVQAALKGLLEKDFVTQEMG ICQIYDRFLGIWLKENY >gi|225935354|gb|ACGA01000038.1| GENE 32 41754 - 43832 1496 692 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 332 674 306 634 836 76 25.0 2e-13 MRITLVRDDGKVNTMRTLRIEQLVEQMKVETKTQPVSKMREVLPFMLPGDKNDYVQKVPK LIPAAAFFRKGGITTMSEYNGIVMIQVNNLSGHMEADEVKERVKELPQTYLAFTGSSGKS VKVWVRFTYPNDLLPTTSEEAELFHAHAYRLAVKFYQPQLPYDIELKVPSLEQYCRLTFD PNLYFNPEAMPIYMKQPAAMPGEVTYRERVQTETSPLQRLAPGYEKCNALSVLFEAAFAR ALDEETDYQPEGDKQSLLINLAGHCFRAGIPEEDTVRWSRAHYRLPKDDTLVRETVRNVY RTCEGFASKSSLLPEQLFVMQMDEFMKRRYDFRFNQLTSQVECRERNSFNFYFHPVDKRL MASIAMNAHYEGLKLWDKDVVRYLNSDHVPVYQPIEEFLYDLPHWDGKDHIGDLAKRVPC DHPHWAKLFRRWFLSMIAHWRGMGKNHANSTSPILIGPQAYRKSTFCRLILPPCLQAYYT DSIDFSRKRDAELYLNRFLLINMDEFDQIGINQQSFLKHILQKPVVNTRRPNASAVEELR RYASFIGTSNHKDLLTDTSGSRRFIGVEVTGVIDVVRPIDYEQLYAQAMALLRSNERYWF DEKEEAIMTEANREFEQSPVIEQLFQVYYRAAEEEEEGEWLLAADILQRIQKASKMKFSS GQVNYFGRILQRLGVKSFRKTRGVYYHVVPVE >gi|225935354|gb|ACGA01000038.1| GENE 33 44435 - 45046 316 203 aa, chain + ## HITS:1 COG:no KEGG:BT_3384 NR:ns ## KEGG: BT_3384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 203 302 75.0 4e-81 MAIQFEFYKNPQPEKEGEEPSYHPRVVNFQHVTTQKLAREIHMATTFGKAEVEAMLMELS RCMGNHLCEGERVHLDGIGYFQVTLQATEPVHSLTTRADKVRLKSINFQADRDLKSLCMS THLRRSKYKPHSASLSEEEIDKKLTGYFANHPVLTRSNMQSLCCFTQSMASRQIRRLKAQ GYLQNIGKPTQPIYIPTPGHYEK >gi|225935354|gb|ACGA01000038.1| GENE 34 45159 - 45482 187 107 aa, chain - ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 336 428 520 159 81.0 4e-38 MIYQSGYLTIKGHDERFGIYRLGFPNREVEEGFIRFLLPYYANVNKVESPFEILKFVREV RAGDYESFFRRLQSFFSDIPYELARELELHCQNGYKIFVSHILINMT >gi|225935354|gb|ACGA01000038.1| GENE 35 45597 - 46796 697 399 aa, chain - ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 34 399 40 387 387 116 27.0 7e-26 METLFAKQDRLLLLTSTEIIRTLMHRINWDAQLVAIRGPRGVGKTTLMLQYMKLHYEVYS 
REVLYCTLDSVYFSNHTLLELADVFVKNGGKHLFLDEVHKYPTWSKEIKEVYDMYPDLKV VFSASSLLNILNADADLSRRCIPYEMQGLSFREFLLFYKQMSFPVCTLEEVLTSPEKICS EVNKVCRPLPLFKEYLQYGYYPFYLKNQIDYYTSIEQVVNFIVETELPQLCGIDVGNVRK IKALLGILATSVPFEVDISKLSTTIGIHRNTVIEYLNSLEKAKLLHLLYADLLSVKKMQK PDKIFLDNPNLLYALASHPVKIGTARETFVVNQLSCDNEVEYGKKTGDFRVNGRYILEVG GEGKTYDQIADVPDSYILADGIETPYRCKLPIWIVGFLY >gi|225935354|gb|ACGA01000038.1| GENE 36 47389 - 47739 170 116 aa, chain - ## HITS:1 COG:no KEGG:BDI_3447 NR:ns ## KEGG: BDI_3447 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 104 146 247 390 136 63.0 3e-31 MVGLSFCEYLAISYGYHLPVYSLEDILKHKVDFPYVEARPILLFKEYLQHGYYPFFQEKG YLLRLQSIIKQTLENDIPTFANMNIATALKLKRLLYIIAKSVPLNLILPSWRLCLI >gi|225935354|gb|ACGA01000038.1| GENE 37 47818 - 48168 297 116 aa, chain - ## HITS:1 COG:no KEGG:BDI_3447 NR:ns ## KEGG: BDI_3447 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 116 1 116 390 166 67.0 3e-40 MDTLYRSYRMLLSNISTEFIRYLHDEIEWSSRLIAILGPRGVGKTTMLLQHIKLYDNIDE TLFVTADDLYFAEHKLIDLAMDFYQHGGKKLYIDEIHKYAGWAREIKNIYDLIPKL >gi|225935354|gb|ACGA01000038.1| GENE 38 48655 - 50853 1612 732 aa, chain - ## HITS:1 COG:no KEGG:BT_3382 NR:ns ## KEGG: BT_3382 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 583 1 581 582 955 75.0 0 MKMLSIICAKKLEYVVRRCVMPLLIPFLIVNVISCGDDFVEDTGGGSIPPEEIITPPVYM LLDGPYKSGVVLTRNEDDSYIIETIDGDPWVTGGRFKEDIPEECNVLEFEYQTMMGISNL ELFFADAETNIDASHSMSAGEVPGTETWKSFSVRLKQYRKDFDWGKKGDYLRIDFGNVPD NTIQIRNICLRVMNEEERKEEEEEDNEILNKEKYEQNIKDYLNKDYNCHVTDITVGKDII SVRGNYQGDGIFFLGEIPPFVDMFKAKKIESIYKTVLSQNSFEIHLNRYAAIGGYQYDRL LSKWAIFKEGVEKDEQVSHARYVGADGIYVKQNVEAIPLKSKKGLGGLINHEFLASDLDE LGISSATINIPITNFMHLSQQSGDIPYVYGGVTYYFNEEYLRSAFDVVLEQTSQRNISVA GILLVSPEGDAGELLKHPDFNGIAPYTMPNMTTMESTQCYAAALDFLAQRYSKPGMRIAH WIIHNEVDGGSHWTNMGDKPIATFMDTYLRSMRMCYNIVHQYDQNSEVFISFSHGWNIAA GGGWYKVRDMLDFMNLFSKAEGDFFWSLACHSYPAQLGNPCTWDDAQATFSMDTEYVTLK 
NLEVLDKWVSVPQNQYKGGIRRSVWLSEAGTCSPSYADKDLQNQAAGFAFGWKKINALEG INGIQWHSWFDHLGDGARLGLRKYNDAEYQGEAKPVWMTYQKAGTDAENEYFEQYLQRIG IDSWEGIIQKIP >gi|225935354|gb|ACGA01000038.1| GENE 39 50885 - 52609 938 574 aa, chain - ## HITS:1 COG:no KEGG:BT_3381 NR:ns ## KEGG: BT_3381 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 574 19 575 575 831 69.0 0 MGIFENRSLVSSDRLKRLLLFVLFLLSGLMLLVGCHEDQEEEVISSIYNEDDLKYEDSLK QYLRNTYPSVINKVSVTESSIEIVGNCVGEGHFYLGEIPPYLDVTKIDKAPYKVELTEAS FSIVLDRFIEREGVLYDRLLSKWAIFQMTDLADQLVSHARHADEIWAYQHLSPITLTSKK GLGGLTANSFISDLSLLGISSATINVCITHFMHLTPKAGDIEHLYGGKTYYMDEEYLNNT LDKVLLEATKKRNISVAAIILIDPASRSVDTGIGELLQHPDYSEGTYTMPNMTTLESVNC YAAALDFLARRYCRSDNRYGRISHWIMHNEVDGGLSWTNMGVKPVTIFSDTYIKSMRMCY NIVRQYDEHAEVFASFSHSWTDISNVGWYTSKDIVDLLNTYSRVEGDFQWAMAYHSYAQS LFNPCTWLDPDATYSMDTKYITFKNLEVLNKWALSKENKYKGTVKRSVWLSEAGVNSPTY SDEDFQKQAAGFAYAWKKINALEGIDGIQWHNWFDHPGDGACLGLRKYLDATYNGEAKPV WYVYQKANTEEEDEYFEQFLSVIGISDWNIIEKF >gi|225935354|gb|ACGA01000038.1| GENE 40 52944 - 53816 281 290 aa, chain + ## HITS:1 COG:Rv2957 KEGG:ns NR:ns ## COG: Rv2957 COG0463 # Protein_GI_number: 15610094 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis H37Rv # 7 203 23 222 275 95 32.0 1e-19 MNLNNTPQLSIITINRNDAQGLEKTLESIWKKQSFKDFEHIIIDGASTDNSINIIKKYAS HLSYWVSEPDKGIYNAMNKGIIKAKGNYLLFLNSGDWLENDILARVFKENFTEDIVYADL YQYRNADDIQISPYPDKLTLPFIYNYSLGHPSTFIKRELFKNMLYEEKYRIISDWAFFIT QILLFNRTTKHLNFAVSYFNVYGISSDPKSGNLIMQERSDFFKNNFPYLISEFYQNHTTL QEKVRTQEQALDILSKHRVQQLIDTVWIQRKARQYIKFLFLIERFLKRKR >gi|225935354|gb|ACGA01000038.1| GENE 41 54008 - 54823 220 271 aa, chain + ## HITS:1 COG:no KEGG:Mmol_1034 NR:ns ## KEGG: Mmol_1034 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: M.mobilis # Pathway: not_defined # 44 258 362 574 626 161 39.0 2e-38 MRRKLIRKYKAFNQDFQRSKELFLKWFIPEIKKRFFGVKKINYKDIPIIINNYNRLEMLT 
KLIHSLESKGYHNLYIIDNQSTYPPLLEYYTRLPYPVYMLNKNVGHLSLWETGIFKQFKD SYYAYTDSDLEILPNCPDDFIEKFILLLQKYPKALKAGFSICIDDLPDHYKLKEKVIEWE SVFWKEEIEPNIFKALIDTTFAVYKPYFIGEPIDPDCFCIRTGYPYSVRHLPWYMNSAKP TEEELYYLGHIKTLTHWSKQNQTNTNSAKKE >gi|225935354|gb|ACGA01000038.1| GENE 42 54876 - 55874 431 332 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172485|ref|ZP_05758897.1| ## NR: gi|260172485|ref|ZP_05758897.1| hypothetical protein BacD2_11531 [Bacteroides sp. D2] # 71 320 1 250 262 457 100.0 1e-127 MKCYICLTDSIVTRKDYLDLLKVMLISARKNTSLHLVCLYDGNTNDPVYNLLKEFNVEII LHQLPYKLELMEIYPREWMLQNLGKEIEYNRIFGTFMRMEIPVVEKEEKYVLYSDIDVIF NADILLEELPHPTYLAAAPEYERNVEDMEYFNAGVLVMNIQGMKEKYEEFILKMKNRERN ISGLFDQGYLNELCFKDMELLPIEYNWKPYWGINDKAKLIHFHGMKPSSNLNEAGFITDN SFFRIVFDANPGGYAGYVYYFTQFYDYLGRKDDKWLYNHLQEVFNLYKDPSFFFSKYNKY KLKYQKYKKYYLVTLGISCVLLILLCLTLILQ >gi|225935354|gb|ACGA01000038.1| GENE 43 55874 - 56713 258 279 aa, chain + ## HITS:1 COG:CAC2174 KEGG:ns NR:ns ## COG: CAC2174 COG0463 # Protein_GI_number: 15895443 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 224 1 220 336 90 31.0 4e-18 MKKQPLISVIVASYNYAKYIKETLDSLIRQTNKSFEVIVVDDGSKDQSLPIIEEYANRFK NIKLYTHPGNQNRGLAETVILGIEKSNGEYIAFCESDDYWTNNHIEYLQDTIQQNPLANF IVNGIKVINLSNNPEYDSYIEFSSSFLKKHSGSNIFPYLDSNYIPTFSAVCVKKEVIQNI NFSTPYPAWLDFWLWRQICVFNKVYYIPQELTLWRKHNESYDAVSDNKDFNGFIKSSDQY LVSQYSLSQIIANANIPKDKKSLKQFIRSFCHFISFKNE >gi|225935354|gb|ACGA01000038.1| GENE 44 56786 - 58111 782 441 aa, chain - ## HITS:1 COG:XF0885 KEGG:ns NR:ns ## COG: XF0885 COG0438 # Protein_GI_number: 15837487 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Xylella fastidiosa 9a5c # 171 421 188 429 443 93 28.0 7e-19 MKKILFIIPCVPYPLNSGGNQAFFQMVDYIRHKMSVSVLFYAWTIDEAKRVEKLEDLWED VDFYTFVKETKEEPDSPLVRNPFYYKWLKKLKMSIERKMRRQLLSTSNSTVDLLKDKDFV REKSVLPNSVYKEFDTRYIDYVTKVAHTGFDIIQVEFYELIALGYVLPQNVQTIFVHHEL 
RYIRNENEMNLFQAIRPSDRMNFLVAKDFEWNALQKYKHIIALTEVDRLLLASYLGREDH IYASPAVVKFESNSETEFIPCVHRRFTFVGYGEHFPNLDAVVWLCKEIVPYLRKCNFEFT LQVVGMGYERYSAELRTACPEIELVGFVENLTSFLQGSIALVPIRIGSGMRMKILDMVSS NVPFITTSKGLEGIHFSDGVDCLIADNTVDFADAMIKLSNDLELQKILVHQANDKMKNMY NPQEMLDRRLSVYTSILKDEF >gi|225935354|gb|ACGA01000038.1| GENE 45 58176 - 59105 338 309 aa, chain - ## HITS:1 COG:no KEGG:BVU_1071 NR:ns ## KEGG: BVU_1071 # Name: not_defined # Def: glycosyl transferase family protein # Organism: B.vulgatus # Pathway: not_defined # 4 294 6 298 310 307 51.0 3e-82 MKNKLAIIIPAYKACFFREVLDSIVRQSNRDFTVYIGDDASPDDLESIVSDYKDKLDIFY FRFEQNWGGRDLVAHWERCIELSDEPLVWLFSDDDLMPPDAVERVIKGWKRSGECDAVFR FPLAIVDAYGELKYTNPPFETERISGYDFLLDKLSGKISSAACEYVFTRDVWKKTGGFVR FPLAWCSDDATWAKFADYASGIISLPGTPVYWRNAEDKNISNSTRFDGEKLKATGLFLKW IGMNYRLNLRESRFQDALVTYVNVILECSVRGNYSLKDLVCLYAILRKFSPTVASRILRS HILKAKLFI >gi|225935354|gb|ACGA01000038.1| GENE 46 59458 - 61764 472 768 aa, chain + ## HITS:1 COG:CAC2347 KEGG:ns NR:ns ## COG: CAC2347 COG1216 # Protein_GI_number: 15895614 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 1 230 1 219 243 103 29.0 1e-21 MDTDIIIPIYNAFDFTKKCIETVIEHTDLTKHTLLLINDKSTDQRILPLLNLFTTEYPSL NITIINNESNQGFVRTVNIGMQHSSRDVVLLNSDTEVTKNWLPKIQKCAYSKAAIATVTP LSNNATLASVPDFMSENTIPSDFTIEEYAGIVERCSMNLFPEIPTANGFCMYIKREAINN IGLFDEKTFGKGYGEENDFSYRCLQAGYRHLLCDNTYIYHKGTQSFSQEKTELINSHLQI LKSRYPSCVENTESFVQQNPISDIQLNIRYAINSHSKKNVLIVIHDFKEAEKKNIGGTTL HVHDLITNMKEEFNFHVLYYSDDDFKYHVTSFLPFDKITSTLGAYSQYTTLNLYNDTFNR DIKILIDTLKIDLIHIHHLKHMYLDIFKVAKERSIPVIYTLHDFYSICPSVKLFNKETFL CNYANAAGCGSCIAKTFNLNINFIPLWRKEFYENLKTVKKIIVPSYSTKNIFLNTYKDLT IEVVEHGYDKINGNPNNDNPDKKKNKKFNIAFIGYITEEKGLKYLEELTEKVKGTDINVH LFGQTTNKKCNKNKKNYVYHGKYIQQDLPNLLLENDIKLICLLSMWPETYSYTLSESLIS EIPVISFDLGAIAERVKKADVGWILPINSTLDDIFKLISTIKSAPQEYKQKVERIRHLLK NMKSLKDMGNEYTEIYNKTINVFPVKNHDIYYIQSRNEFYRKGKETPTLDLKEEKKEYKR VKHIIKSSVPLKQAFNEVQNFRHTYTNSKCRNKIFFKFIWYRILRINI >gi|225935354|gb|ACGA01000038.1| 
GENE 47 61852 - 62955 465 367 aa, chain - ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 5 361 221 566 818 309 43.0 6e-84 MVDKRIIAFYLPQYHPFPENDEWWGKGFTEWRNVVKARPLYRGHYQPHLPADLGFYDLRV PEVRQQQADMARMYGINGFCYYHYWFNGHQLMERPLEEMLSSGNPDFPFMLCWANENWTR AWDGGSRHILIAQNYSEEDDRAHIRYLLDNVFSDSRYIRVDGKPVFLIYRSMLFPNMKET IRVWREEASSKGVELYLCRVETMDCYGEEYLQDGFDAAVEFQPFTHQMNEFQKKRNPLRK FAYNINRHLFNTCKKKKIDYSEYVDYICKTHFPDYKMYPGVTPMWDNTSRRKQKMFILDK STPEKYGEWLYSVMNKFVPYSKDENFVFVNAWNEWAEGNHLEPDLKWGFRYLEETEKVVK SMQEDGF >gi|225935354|gb|ACGA01000038.1| GENE 48 62984 - 63976 444 330 aa, chain - ## HITS:1 COG:AGl534gl_1 KEGG:ns NR:ns ## COG: AGl534gl_1 COG0463 # Protein_GI_number: 15890377 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 235 113 342 365 102 29.0 7e-22 MFVMENYPIEVSIIIPNYNYARFLQQRIESVLAQTYTDYEIILLDDASTDDSVSILNHYK TNSRVAHLEINSVNTGSPFIQWQKGISLSRGKYIWIAESDDVADSSFLEKAVSILNQYPH ASFCFLGSHCIDEKGNKLSTDFDRWTSKQLRRPYNIGVFDSEDYIKQNLYWRNYIYNASG VVFRKQCFEQIKDLSCFSMRYSGDWLFWIEMARQGSVIELYEKLNFFRLHSTSTTVEGNA SGNAILEDIQVVHYVESFSYPIGRYKRLMRHGMLYKNIKRAKVDPKMRLLLFEKLKGCFG TSVWTYRWERVNKYMSFLNPWQPTRDRDRL >gi|225935354|gb|ACGA01000038.1| GENE 49 64165 - 64974 423 269 aa, chain + ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 10 261 257 514 515 188 41.0 8e-48 MNKPMASNISTSLIISTYNRSDALELCVKSVLRQSLLPDEIIIADDGSKEDTRELIHQLA ASSEVPIIHVWHEDLGFRLASIRNKAIAKASKEYIIQIDGDIVLHKDFVKDHVHFAKKGS FVTGSRVLIREGLTKKMLAERNCIISIHDKGTKNTINGVHLPWLSPLLQHYRQWDISYSR GCNMAFWKEDLLKVNGYNEAITGWGSEDHELVCRLINSGVRKRTIKFAGIVFHLHHELHG 
TDNLSNNRSILNETKVRKLTWCDKGIIQN >gi|225935354|gb|ACGA01000038.1| GENE 50 64976 - 66025 820 349 aa, chain + ## HITS:1 COG:BS_hisC KEGG:ns NR:ns ## COG: BS_hisC COG0079 # Protein_GI_number: 16079319 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Bacillus subtilis # 8 346 34 359 360 132 30.0 1e-30 MEKNLNFLDRNEFNYSPSKEVVEALKNFDINKLCFYTRIYDEGKKSILSVFLSELYDIDE TQVLLGYGGEDILKQAVHYFLTQEDGNKTMLIPKFSWWYYKSIADEVNGHTLQYPLYEDG NTFKYDFETLKDMIQKENPKILLLASPNNPTGNGLTPKELDELLAEVPSQTVVLIDEAYA SFVSTDTSYIKKLVNKYPNLIISRTLSKFYGLPGLRMGFGFMSKELEKFSRYSNKYLGYN RISEDIAIAALKSDAHYRNIAKLMNEDRERYEKEIGVLPGFKVYESVANFILIKYPIELK EALQKSFAEQSYKVKFMNEPDINTHLRITLGRPEQNRIVIDTIKEIASK >gi|225935354|gb|ACGA01000038.1| GENE 51 66022 - 66753 489 243 aa, chain + ## HITS:1 COG:AF1142 KEGG:ns NR:ns ## COG: AF1142 COG1213 # Protein_GI_number: 11498742 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar nucleotidyltransferases # Organism: Archaeoglobus fulgidus # 1 235 1 232 241 120 35.0 2e-27 MKAVILAAGIASRLRPLTDTTPKCLLKIGERCLLERAFDALIQNGFDEFIIVTGYRQQQI VDFLQAHYPTQDITFIYNDRYESTNNIYSLWLTRPYTDGEAILLLDSDIVFDPQIVEKLL HSDKDDILALNRHELGAEEIKVIVDDAQKVVEISKVCSISDAIGESIGIEKMSAEYTKAL FRELEIMITTEGLDNIFYERAFERLIPQKYSFYVMDTTEFFSAELDTVEDFQQAQKLIPA SLY >gi|225935354|gb|ACGA01000038.1| GENE 52 66769 - 67581 569 270 aa, chain + ## HITS:1 COG:L15884 KEGG:ns NR:ns ## COG: L15884 COG3475 # Protein_GI_number: 15672196 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Lactococcus lactis # 1 257 4 268 278 128 30.0 1e-29 MANYDIRPLQLRILKNLLAVDKVCKEHNLRYYIMAGTMLGAVRHKGFIPWDDDLDIGMPR ADYDLLMANAKEWLPEPYEAVCAENDKEYPLPFAKVQDANTTLIERMHLKYLGGVYIDIF PLDGVPESRMAQRMHFAKYEFYKRVLYLIHRDPYKHGKGPSSWIPLLCRKFFTLTGAQES IRKVMKKYDFDQCALVCDYDDGMKGIMSKDILGTPTPIRFEDEEVWGVQKYDAYLSQKYG DYMTIPKQSGQRQHNFHYLDLNKPYRNFEV >gi|225935354|gb|ACGA01000038.1| GENE 53 67611 - 68744 816 377 
aa, chain + ## HITS:1 COG:sll1231 KEGG:ns NR:ns ## COG: sll1231 COG0438 # Protein_GI_number: 16330676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 86 376 97 398 399 145 30.0 1e-34 MRIGFDGKRAVQNFTGLGNYSRYIVDILCQFYPENEYVLYAPKKRENKRLDKLTKQYQQL QLSYPTTSSWKKLSSLWRVWGVTQQLEKEKIDIFHGLSNELPLNIHQSEVKSIVTIHDLI FLRYPQYYHSIDRKIYTYKFRKACENADKIIAISECTKRDIIEYFRIPADKIEVVYQGCD LSFIHPVAAEKKREIRAKYQLPDHYILNVGSIEERKNALSAVQALMMLPEQIHLVIVGRH TEYTDKIERFIKENKLEERVHIISNVPFDDLPVFYQLAEIFVYPSRFEGFGIPIIEALYS GIPVVAATGSCLEEAGGPDSIYVDPDDIKGMANAFKQIYSDTERKKEMIEKGQKFAKRFS EEKQAEEILNIYKKLMK >gi|225935354|gb|ACGA01000038.1| GENE 54 68741 - 69553 386 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160883972|ref|ZP_02064975.1| ## NR: gi|160883972|ref|ZP_02064975.1| hypothetical protein BACOVA_01946 [Bacteroides ovatus ATCC 8483] # 1 270 1 270 270 511 100.0 1e-143 MNVLKHTIKKFIYGTLPYYFMKGYKPGSPYLKYYEYIKEHGYSRHLYEFKDEYANMPVDV QKDEEKGLYYVQKEKKRLYFRKSTPAKKIQKYYRALSMEQDRRSPHHYFNSVKEVTGKVF VDVGCAEGYSSLEIIEEAKHVYLFEQDEQWLEAIRATFEPWQDKVTIVQKYVSDHNSSRE QTLDDFFNNQTNEHLFLKMDIEGAERHALAGCNNLFQNCQKLDFAICTYHLRDDEEVISA FLDKHNCTYINQKGFFRHKIRSVVMRGSKK >gi|225935354|gb|ACGA01000038.1| GENE 55 69554 - 70465 345 303 aa, chain - ## HITS:1 COG:no KEGG:BT_3364 NR:ns ## KEGG: BT_3364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 1 238 247 189 39.0 1e-46 MRVIVNPKYAHLQKEIEEVPRSFQNEGDVVYDGRNVLKRISLGSIDVVVKSFKKPHIINR VVYSFFRQSKAERSYIYSMEIQQHGFDTPEPVAMIEQFQDGLLSHSYYICCYDGGETVRS LMDGKVEGNEDKLSAFARYTAALHQAGILHLDYSPGNILIHQNETNEYSFSLVDVNRMQL LSDIDCDTVCRNMCRLCISREVLTYIMTEYASLRGWDVESTVSLALRYSDQFFTHYIYRR AARKEKEKHIVSLILFFRLYRSVRKFFSWEPHISRFLLRKEKHIYDTYLCKYDYCELLSA DYQ >gi|225935354|gb|ACGA01000038.1| GENE 56 70697 - 71755 491 352 aa, chain + ## HITS:1 COG:FN0992 KEGG:ns NR:ns ## COG: FN0992 COG0859 # Protein_GI_number: 19704327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 7 340 10 344 358 210 39.0 3e-54 MKPIKRILIIRFRQIGDSILAVALCSTLKKSFPDAEIHFVLNKNIAPLYEGHPDIDKVIT FDKNENKPFTAYIKKVWQVMHQNKYDIIIDMRSTIRTLFFSLFSLRTPFRIGRIKGYTRF LLNYRTDTYSKDLTTDMVQRNLLLAAPLEKIRPIQYTKEFKLYLTDREKEDFRYYMEKEG INFAHPVLLIGVTTKLIHKKWNTEFMIATLKRILEEHKDIQMIFNYAPGYEEEDARNIYK ELGCPERIKIDIQASSLRQLAALCANCSFYFGNEGGARHIAQALGIPSFAIYSPSASKSM WLPANSVLAEGISPDDILSPEQQATLTYEERFALITPEKVYGQLTSILRQLS >gi|225935354|gb|ACGA01000038.1| GENE 57 71738 - 72811 612 357 aa, chain - ## HITS:1 COG:SMb21078 KEGG:ns NR:ns ## COG: SMb21078 COG0438 # Protein_GI_number: 16264405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 132 353 179 399 402 96 31.0 7e-20 MKEIQRKKKVLIDLTNYGSLTAGFGQIAANYATAFSSMPVDDLHFVYLLRQKYMQEFGPN VTSVPVRRINKFLPFTLPKVDVWHAVNQQRKLLRIAGGTKFIFTIHDFNFLTEKKPWKAK MYLRRMQNKVNKAAVVTTISHYVADVIRQHVDLKGKEIRVIYNGVERIDTLEGTKPSFAT GRPFFFTIGQIRRKKNFHLLVDVMRHFPEYDLYICGDAHFAYAEEVRNLIRENQLTNVFL TDVISQSEKIWLYRNCEAFLFPSEGEGFGLPVVEAMQFGKAVFAANRTSLPEVCNGHAIM WEHLDTESMVQSIREHLPDFYKDKERLEKIKEYAASFSYEKHIQAYLDLYRELAQLP >gi|225935354|gb|ACGA01000038.1| GENE 58 72872 - 73903 557 343 aa, chain - ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 3 339 6 332 335 91 26.0 2e-18 MARILIIRFSALGDVAMTIPVIHSLAVQYPQHEITVLSRAVWQPLFQGLPANVGFVGADL TGKHKGLWGLNSLYSELKAMHFDYIADFHHVLRSKYLCLRFRLANKPVASICKGRAGKKK LVRRHDKVMENQKSSFRRYADVLEKLGLPVLLNFSSIYGEGKGNFAEIEPVTGPKEDQKW IGIAPFAKHAGKIYPLELQEQVIAHFAANPKVKVFLFGGGKSEQDVFDAWIAKYPSVVSM IGKLNMRTELNLMSHLDVMLSMDSANMHLASLVNIPVVSIWGATHPYAGFMGWRQLPVNT VQLDLSCRPCSVYGQKPCWRGDYACLRDIKPEQVIAKIEGIVD >gi|225935354|gb|ACGA01000038.1| GENE 59 73907 - 74512 580 201 aa, chain - ## HITS:1 COG:no KEGG:BDI_2820 NR:ns ## KEGG: BDI_2820 # Name: not_defined # Def: 
hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 201 1 201 201 295 75.0 6e-79 MTFSNLCNEIFWKSTTDYHVTDSVDAPMNNPYELKTIEYYLYLKNWIDAVQWHFEDIIRD PQIDPVEALALKRRIDKSNQDRTDLVELIDSYFLDKYKEVKPLSDATINTESPAWAIDRL SILALKIYHMQQEVERTDTTEEHRAQCQTKLNILLEQRKDLSTAIEQLLADIEAGRKYMK VYKQMKMYNDPALNPVLYAKK >gi|225935354|gb|ACGA01000038.1| GENE 60 74630 - 75670 430 346 aa, chain + ## HITS:1 COG:STM2370 KEGG:ns NR:ns ## COG: STM2370 COG0111 # Protein_GI_number: 16765697 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 338 1 348 378 248 39.0 1e-65 MKIIIDNKIPYIKEAVQRIADEVVYAPGKDFTPELVRDADALIVRTRTHCNRDLLEGSRV RFIATATIGFDHIDTEYCKQAGIEWTNAPGCNSASVAQYIQSSLLVWKSVRNKRLDELTI GIIGVGNVGSKVAKVAQDFGMRVLLNDLPREEKEGTKRFSSLEKIAEECDIITFHVPLYK EGKYKTFHLADDVFFQSLKRKPVIINTSRGEVIQTDALLKALNSRMISDTIIDVWEHEPE INRDLLEKAFIGTPHIAGYSADGKANATRMSLDAICKFFQIKGDYEINAPAPASPIIHAK NHEEAVLQMYNPTEDSNRLKNQPELFETLRGNYPLRREEKVYIIKY >gi|225935354|gb|ACGA01000038.1| GENE 61 75729 - 76313 257 194 aa, chain - ## HITS:1 COG:MA0316 KEGG:ns NR:ns ## COG: MA0316 COG0299 # Protein_GI_number: 20089214 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Methanosarcina acetivorans str.C2A # 2 178 8 187 204 140 45.0 1e-33 MKKNIAIFASGSGSNAENIIRYFQKNDSVQVSLVLSNKSDAYVLERAHRLGVPSNVFPKE DWIAGDEILAILQEYRIDFVVLAGFLVRVPDLLLHAYPDKIINIHPALLPKYGGKGMYGD RVHEAVVAAGEKESGITIHYINEHYDEGNTIFQVTCPVLPTDSPDDVAKKVHALEYEHYP KIINQILSNKYYVL >gi|225935354|gb|ACGA01000038.1| GENE 62 76463 - 76699 401 78 aa, chain + ## HITS:1 COG:SMc00573 KEGG:ns NR:ns ## COG: SMc00573 COG0236 # Protein_GI_number: 15964896 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Sinorhizobium meliloti # 1 75 1 75 78 77 64.0 6e-15 
MSEIASRVKAIIVDKLGVEESEVTTEASFTNDLGADSLDTVELIMEFEKEFGISIPDDQA EKIGTVGDAVSYIEEHAK >gi|225935354|gb|ACGA01000038.1| GENE 63 76715 - 77977 1279 420 aa, chain + ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 1 418 1 411 413 421 53.0 1e-117 MELKRVVVTGLGAITPVGNSVPEFWENIVNGVSGAGPITHFDASLFKTQFACEVKDFDAT KYIDRKEARKMDLYTQYAIAVAKEAVSDSGLDVEKEDLNKIGVIFGAGIGGIHTFEEEVG NYYTRQEMGPKFNPFFIPKMISDIAAGQISIMYGFHGPNYATCSACATSTNAIADAFNLI RLGKANVIVSGGSEAAIFPAGVGGFNAMHALSTRNDEASKASRPFSASRDGFVMGEGGGC LILEELEHAKARGAKIYAEVAGVGMSADAHHLTASHPEGLGAKLVMKNALEDAEMDPKEV DYINVHGTSTPVGDISEAKAIKEVFGDHAFELNISSTKSMTGHLLGAAGAVESIASILAI KNGIVPPTINHEEGDDDENIDYNLNFTFNKAQKREVNVALSNTFGFGGHNACVIFKKYAE >gi|225935354|gb|ACGA01000038.1| GENE 64 78015 - 79007 636 330 aa, chain + ## HITS:1 COG:SA1076 KEGG:ns NR:ns ## COG: SA1076 COG0571 # Protein_GI_number: 15926816 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Staphylococcus aureus N315 # 16 231 24 241 243 107 36.0 4e-23 MFRKDRESYLCFYRILGFYPRNIQLYEQALLHKSTSVRSDKGRPLNNERLEFLGDAILDA IVGDIVYKRFEGKREGFLTNTRSKIVQRETLNKLAVEIGLDKLIKYSTRSSSHNSYMYGN AFEAFIGAIYLDQGYERCKQFMEQRIINRYIDLDKISRKEVNFKSKLIEWSQKNKMEVSF ELIEQFLDHDSNPVFQTEVRIEGLPAGTGTGYSKKESQQNAAQMAIKKVKEPTFMSTIEE IKTQHSATATETETEMEVELATGSDTELATELENELETELNANSEIKPENVSEENILVDE ISTVQDTEAPATSQSKSPCDDPESPNRDLQ >gi|225935354|gb|ACGA01000038.1| GENE 65 78949 - 79959 828 336 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 1 318 4 333 346 267 45.0 3e-71 MRIGILTSGGDCPGINATIRGVCKTAINYYGMEVIGIHSGFQGLLTKDVESITDKSLSGL LNLGGTMLGTSREKPFKKGGVVSDVDKPSLILQNIHEIGLDCVVCIGGNGTQKTAAKFAA 
MGVNIVSVPKTIDNDIWGTDISFGFDSAVSIATDAIDRLHSTASSHKRVMVIEVMGHKAG WIALYSGMAGGGDVILVPEIAYNIKNIGNTILERLKKGKPYSIVVVAEGIQTDGRKRAAE YIAQEIEYETGIETRETVLGYIQRGGSPTPFDRNLSTRMGGHATELIANGQFGRMVTLQG DDIASIPLEEVAGKLKLVTEDHDLVIQGRRMGICFG >gi|225935354|gb|ACGA01000038.1| GENE 66 80097 - 81623 1130 508 aa, chain + ## HITS:1 COG:no KEGG:BT_3355 NR:ns ## KEGG: BT_3355 # Name: not_defined # Def: putative auxin-regulated protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 502 1 502 503 993 95.0 0 MNITKIISKTFDSRLKQIDLYASQASEIQHRVLSRLIHQAAQTEWGRKYDYSSIRNYEDF RKRVPIQTYEEIKPYVERLRAGEQNLLWPSEIRWFAKSSGTTNDKSKFLPVSKEALEDIH YRGGKDAAALYFRINPDSHFFSGKGLILGGSHSPNLNSNHSLVGDLSAILIQNVNPLINF IRVPSKKIALMSEWETKIEAIANSTIPVNVTSLSGVPSWMLVLIKRVLEKTGKQALEEVW PNLEVFFHGGVAFTPYREQYKQVIQTPKMHYVETYNASEGYFGTQNDLSDPAMLLMIDYG IFYEFVPLEEVGKESPRAYCLEEVELNKNYAMVISTSCGLWRYMIGDTVKFTSKNPYKFV ITGRTKHFINAFGEELIVDNAEKGLAKACAETGAQVSEYTAAPVFMDENAKCRHQWLIEF AKMPDSVEKFAAILDATLKEVNSDYEAKRWKDIALQPLEVIVAREGLFHDWLAQKGKLGG QHKVPRLSNTREYIETMLALNDSIPSEE >gi|225935354|gb|ACGA01000038.1| GENE 67 81863 - 83017 861 384 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 4 380 2 354 355 235 37.0 1e-61 MEENKRVLLGMSGGTDSSVAAMRLLEAGYEVIGVTFRFYELNGSTEYLEDARNLAERLGI RHITYDAREIFSKQIIEYFVQEYLAGRTPVPCTLCNNYLKWPLLAKIADEMGIFYIATGH YAQNIQLNNTFYITYAADSDKDQTFFLWGLKQDILCRMLLPMGDITKVEARVWAAEHGFR KVATKKDSIGVCFCPMDYRTFLRNWLAQNSAALAGEEQMMEEDQVLVEDCQRLTGSNQVG LNKVKRGRFVDEKGDFIAWHEGYPFYTIGQRRGLGIHLNRPVFVKEINPEKNEVVLASLS ALEKTEMWLKDWNLVNQERTLGHSDIIVKIRYRKQENHATITITPNHLLHVQLHEPLTAI APGQAAAFYKDGLLLGGGIIVNAR >gi|225935354|gb|ACGA01000038.1| GENE 68 83074 - 85914 1320 946 aa, chain + ## HITS:1 COG:PA0799 KEGG:ns NR:ns ## COG: PA0799 COG0553 # Protein_GI_number: 15595996 # Func_class: K 
Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Pseudomonas aeruginosa # 359 930 76 639 663 327 37.0 8e-89 MKERPTNGQVIIVFTEHPILGILLIPYIAEKLDDGTLQLVEQAFHASPEAMSKMSEAERQ AIHIASYYTEKHLMSVYSREKTVSRFLHKLSEDPERIKNDIRPSIEKKLLEMLALIRDNG LPFYQKQAGSKILYAHHAYHINPHNVEIRVTFHVDNKTFRYQLQCYYEGQPFSLSELKPV VVLTSAPATLLLGMELYFFPHIESARILPFTKKRSISVDASQIEKYIDNIVIPIARYHEI ETHGLNIMEEKCPCEAILSFEDTTYNGQALQLGFRYGDQTFTPDSALEMKKIIYRKTSGE IFFFRRNITAEEQAVQLLTDAGLRQLNDTHFSLSPEAPEKTIVEWINNHREMLQQSFHLT SNMGNTPYCLDEIRIEQSCDDEVDWFELHITVVIGNLRIPFSRFRKHILEEKREYLLPDD RMILLPEEWFSKYANLLEMGIQTEKGIRLKHTFVGAAQTALGKEELKKFPAKQQIQNVAV PKNLKATLRPYQQKGFSWMVHLHKQGFGGCLADDMGLGKTLQTLTLLQYIYKPSTPRQPA TLIIVPTSLLHNWRREAKRFTALSMAEYNNTMAIDKEQPEKFFGHFHLIFTTYGMMRNNI DILCSYRFEYIVLDESQNIKNSESLTFRSAIRLQSKHRLVLTGTPIENSLKDLWAQFHFI QPDLLGTESAFQKQFIMPIRQGNARAKVLLQQLTAPFILRRSKKEVAPELPALTEETIYC DMTEEQNTCYEQEKNSLRNILLQHPQSTDRLHSFSVLNGILRLRQLSCHPQLILPDYTGT SGKTAQIIETFDTLQSEGHKVLIFSSFVRHLEVLAEAFHERGWKYALLTGSTNNRPSEIA HFTDQKDVQAFLISLKAGGVGLNLTQADYVFIIDPWWNPAAESQAIARAHRIGQDKQVIA YRFITQNSIEEKILHLQDEKRKLAETFVADSEPLPILSNEQWVDLL >gi|225935354|gb|ACGA01000038.1| GENE 69 86060 - 88648 1716 862 aa, chain + ## HITS:1 COG:no KEGG:BT_3328 NR:ns ## KEGG: BT_3328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 862 8 868 868 513 36.0 1e-143 MKNRSITYLLLFGICSMLAIMQSCQKTDDLEADINSLKDRVAALEKATEGLNTSFASLQA LMQKNKVIIGITPTKDGLGYLLELSDGTSIKVMESEAVQASVPEFSVDEEGYWIYKTSNN TNFKYLPGADGEKVSAWPRDEEGNVVATPLISVSSSGYWQVSYDNGQTYTSLGTKAEGGS QGGTSIFSKVEYNEANHTFSFTLSDGGKTYTFPVDDTFGLIIYGLNDADSEQTVQVFAPN ENHKEYKVEQNDVQQAVVQAPKGWNVLLSENLLTITPQATATKDMEETIKIVLTSSKNYI RIVSIEVKQLSSEAGAEAWQQFVNADQQNVLLDFSYAGYKHGEIAPPETETLIAQGYKVY DVTDPKYGAIPNDGKSDRAAFMKVLEEIASETKQEDLNMTDRYIKENAKAIIYFPEGNYI LQDEDSKDRRIRISMSDIVLKGAGRNKTTLEMTAANNSPKPTEEMWNAPVMMEFKHNTGL KESIGVITEDAPIGSRTITASLTGVSAGSWVCLVLENTDDNVINSELYPHKWEDIKIQQG 
GTPNIKTKGIQIYEYHQIEKISGNSVTFKEPIMHAINKDWGWNVHKFANYANVGVEDLTF KGHAKEKFIHHGSDIDDGGFKLIDFVRLTNSWMRRVNFESVSEAMSITNSANCSAYDITI GGNRGHASIRSQASSRIFIGKVTENSNGYTLRKGEGESTLMEYKTNVGQYHACGVSKQSM GAVIWNVRWGDDSCFESHATQPRATLIDCCSGGFMHWRQGGDSAQMPNHMENLTIWNFYA TNAQTDQDIDTGGKFTWWDGNGFWWKFMPPIIVGFHGSPLDFDDTQMKRLESNGTAVEPY SLYEAQLRKRLGYVPSWLSSLK >gi|225935354|gb|ACGA01000038.1| GENE 70 88670 - 90673 1665 667 aa, chain + ## HITS:1 COG:no KEGG:BT_3329 NR:ns ## KEGG: BT_3329 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 664 32 644 645 173 28.0 2e-41 MKKVWSMFMLLAVCLVACTNIDDLEDDVDALKKRVTALETQVRDINSNTEALRELYNEGT FITNIEEKSDSYTLTLSNGKTVNLYMKNDNNLLCPIIGIDSEGYWTVLYNKNETPERLTV NGQPVKANGESGKTPTFNVDSEGYWQVSYDGGKNYSYIYKEGTNDKVSATGDGSAPTEDK NFKSVTVENNELVLVLAGEDAPTIRIPIVRDFECSFAAEDLKQVQEFSAGEVKEFTMTVR GVENTMITAPEGWSAKFSKEAGKENVLVVTAPASSAKMMTRATADNSTDIAVLATSGKYA MIAKIQVKVIDTPQSKDFYTEYNTNGNITIGNVSITKGANEAILLDGSDESASNIRNLIH QKTEQTVIFLEKGAYDFNTPSIAEITGTVVIIGRYSDSKPILKPELLWKLKSGKLIFKNI QLDLTKIDASGGNNKYMFCNADATADFEDFVFEDCEITNIQKNFYYNNVATYTIKNIYAK NSRFQLNTTVDGILIFNIYKTTHLDAMQMFTFENNIVYNKTSVAGQILSWDNKIVQTPDQ QQIKVSFCNNSIVNYVGKNYHLKFYDATKMTISKNIFYADPNMNSDATMCGIYKTESTPE FDVKDNIAYGLTETNKWNSFGATVKPTNPIEEQLLRLTNSPFTSMDLDKGIFIPDNAYTG FGSTINQ >gi|225935354|gb|ACGA01000038.1| GENE 71 90837 - 91649 722 270 aa, chain + ## HITS:1 COG:VC1364 KEGG:ns NR:ns ## COG: VC1364 COG0561 # Protein_GI_number: 15641376 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Vibrio cholerae # 3 266 2 265 273 173 35.0 3e-43 MKYKLIVLDLDGTLTNSKKEITPRNSETLIRMQEQGIRLVLASGRPTYGIVPLANELRMN EFGGFILSYNGGEIINWETQEMVYENVLPNEVVPVLYECARTHQLSILTYDGAEIITENS QDPYVLKEAFLNKMAVRETNDFLTDITLPVAKCLIVGDADKLIPLEAELCLRLQGRINVF RSEPYFLELVPQGIDKALSLAVLLKEIGVAREEVIAIGDGYNDLSMIRFAGLGIAMGNAQ EPVKKAADYITLSNEEDGVAEAIKKFCNQQ >gi|225935354|gb|ACGA01000038.1| GENE 72 91881 - 93362 1348 493 aa, chain + ## HITS:1 COG:DR1670 KEGG:ns NR:ns ## COG: 
DR1670 COG0215 # Protein_GI_number: 15806673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Deinococcus radiodurans # 5 490 52 531 532 451 46.0 1e-126 MEHQLTIYNTLDRKKELFIPLHAPHVGMYVCGPTVYGDAHLGHARPAITFDVLFRYLSHL GYKVRYVRNITDVGHLEHDADDGEDKIAKKARLEELEPMEVVQYFLNRYHKAMEALNVLS PSIEPHASGHIIEQIQLVQKILDAGYAYESEGSVYFDVAKYNKDYHYGKLSGRNLDDVLN TTRDLDGQSEKRNPADFALWKKAQPEHIMRWPSPWSDGFPGWHAECTAMGRKYLGEHFDI HGGGMDLIFPHHECEIAQSVASQGDDMVHYWMHNNMITINGTKMGKSLGNFITLDEFFTG THKLLAQAYTPMTIRFFILQAHYRSTVDFSNEALQASEKGLQRLMEAIDALDKITSSSAT SEGINVKELRAKCYEAMNDDLNTPIVIAQLFEGARIINNIIAGNATISAEDLKDLKETFH LFSFDIMGLKEEKGSSDGREAAYGKVVDMLLEQRMKAKANKDWATSDEIRNTLTALGFEI KDTKDGFEWRLNK >gi|225935354|gb|ACGA01000038.1| GENE 73 93503 - 96364 2442 953 aa, chain - ## HITS:1 COG:no KEGG:BT_3350 NR:ns ## KEGG: BT_3350 # Name: not_defined # Def: putative chondroitinase (chondroitin lyase) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 953 1 953 953 1720 85.0 0 MKNSCFIVLFLLGIIPSAFAQLIGFEEEVPEAFTVSGKGEVKISSLFYKEGESSLEWDFQ PGSTLDVQIAPLSLNTRNEKQFGITLWIYNEKPQQDSIRFEFLNKEGEVSYWFSYRLQAA GWRACWISFEYMQGDKKDKKIVAYRLVAPQRKGRIFLDRLIFPEKKMNLRTTPDQQLPTN NGLSNRDLWHWCLVWKWEQQSYDIPLPSKLTSEQKKELKTIEQRLTDFLEVKKAPQGPIN AAYKTFEKAAISPSIAGTGFIGTPIVTPDEQDKKKGEMSWNDIETMLSGFAYDAYYNQNE TSKKNYFTVFDYAIDQGFAYGSGMGTNHHYGYQVRKIYTTAWLMRDAIYKHPHRDAYLST LRFWAALQETRQPCSPTRDELLDSWHTLLMAKFISAMMFPDAREQAQALSGLSRWLSSSL RYTPGTIGGIKVDGTTFHHGGFYPGYTTGVLATVGEYIAFTNGTSFELTEDARKHMKSAF IAMRNYCNFYEWGIGISGRHPFGGKMGSDDIEAFANIALSGDLSGQGNTFDRGLAADYLR LIRNSDTPNARFFKKEGIQPAQAPQGFFVYNYGSAGIFRRADWMVTLKGYTTDVWGSEIY TKDNRYGRYQSYGSVQIMGKGNPVSRAGSGFVQEGWDWNRLPGTTTIHLPFDLLDSPLKG TTMARSKENFSGSSSLDGKNGMFAMKLAERDYENFTPDFVARKSVFCFDNRMVCLGTGIS NSNVDYPTETTLFQTKYNGKKPKVGEDNYWLHDGYDNYYHVVDGTVRAQVAEQESRHEKT REVTKGKFSSAWIEHGKAPKEGTYEYMVLIQPSASDLDELRKTPAYEVLQRDQTAHVVYD KKTGITAYAAFEAYQPATDKVFVAIPAETMVMYAKESDKGIRLSVCDPNLNIEEKTYTTK EPSRPITKEIRLKGHWTLTSPMENVRLEQQGDQTVLTVTCQHGQPVEMLMENK 
>gi|225935354|gb|ACGA01000038.1| GENE 74 96478 - 98028 1270 516 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 26 512 37 511 517 306 38.0 6e-83 MNKPINLLVGGLTLFAAQGCKAPKQASQQAEHPNIIYVFPDQYRNQAMGFWRQDGFRDKV NFEGDPVHTPNLDAFARESMVLSSAQSNCPLSSPHRGMLLTGMYPNKSGVPLNCNSTRPI SSLREDAECIGDVFSKAGYDCAYFGKLHADFPTPNDPEHPGQYVEEKRPAWDAYTPKERR HGFNYWYSYGTFDEHKNPHYWDTDGKRHDPKEWSPLHESGKVISYLKNDGNVRDTKKPFF IMVGMNPPHSPYRSLNDCEEQDFNLYKDQPLDSLLIRPNVDLKMKKAESARYYFASVTGV DRAFGQILTTLKELGLDKNTVVIFASDHGETMCSQRTDDPKNSPYSESMNIPFLVRFPGK IQPRVDDLLLSAPDIMPTVLGLCGLGDSIPAEVQGRNFAPLFFDEKAEIVRPTGALYIQN VDGDKDENGLVQTYFPSSRGIKTAQYTLALYIDRDTKQLKKSLLFDDVKDPYQLHNLPLE ENKEIVAQLCGEMGAMLKEINDPWYTEKILSDRIPY >gi|225935354|gb|ACGA01000038.1| GENE 75 98194 - 99396 949 400 aa, chain - ## HITS:1 COG:no KEGG:BT_3348 NR:ns ## KEGG: BT_3348 # Name: not_defined # Def: putative unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 400 1 400 400 794 96.0 0 MKTILSALGLSLLILTSCGGQKKVEVDFIQDNIDNAVAQNTIQTDIIEKSGKILNPRTIN KDGSISYIPIDDWCSGFFPGSIWLTYKLTGDQKWLPLAEKYTEALDSVKYLKWHHDVGFM IGCSYLNGYRFADKKEYKDVIIETAKSLSTRFRPNAGVIQSWDADRGWQGTRGWKCPVII DNMMNLELLFEATAFSGDSTFYNIAVKHADTTIAHHFRPDNSCYHVVDYDPETGEVRKRQ TAQGYADESAWARGQAWALYGYTTCYRYTKDKKYLDQAQKVYNFIFNNKNLPEDLVPYWD YDAPNIPNEPRDASAAACTASALYELDGYLPGNHYKETADKIMESLGSPAYRAKVGTNGN FILMHSVGSIPHGQEIDVPLNYADYYFLEGLMRKRDLEKK >gi|225935354|gb|ACGA01000038.1| GENE 76 99858 - 100427 105 189 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 15 175 17 168 181 62 27.0 7e-10 MFNKNNYTNNFNQFFIDYQRRFIHFASTYVHDEAVAEDFVIESIMYYWENKERLPSDINI 
PAYVLTVLKHKCIDYLRNQQVRQMASDKIFQIYSWELSNRIATLEELEPNEIFTAEIQEI VNRTLETLPEQTRRIFAMSRYENKSHKEIADLLNMTTKGVEYHINKATKVLRIALKDYLP TTLLLFFLN >gi|225935354|gb|ACGA01000038.1| GENE 77 100866 - 102365 988 499 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172521|ref|ZP_05758933.1| ## NR: gi|260172521|ref|ZP_05758933.1| hypothetical protein BacD2_11711 [Bacteroides sp. D2] # 1 499 1 499 499 897 100.0 0 MKTTSAFMAVCTALLMFSCSGDDENSQSTPLQLKTTTSSPAKGEIGTKITIIGENFTDQA KVYLKNAKATITSVTATQLQVIAPANEAGECIIEVMVGAQKTENLRFTYVDYTPVVSSIS PSSGTENTEITIIGENFSTTPEENIVKIGDAIATVKYATETELKIIAPQNEIGTYAVTVS VGVKTGKNPALFTYEDTRERIYECTQNFITVPSDINTQDLKSVTFLKDGRLAYSTNGGSA TEAWAIDLRTMEREKIVPNGTGTVLLKITTNPTNGKLYLAYKGEDKISVWDPNTKQVSDL LTRNGLDNLMDVKFDQYNNMYAVCRNSGTIQKYPSGNYNSSSRLEFVKITGRQIQAIAFD AAGNLIAAATDGSNAGYIYKVDSSGAYTLIAGGGNKVITEDITEDPKTAKLEKAEGLMVD KDGYIWFSDGNSGTRKTKILKPGKNGYQDATIILVCKAENGWKDSGTTSPVDFVQDTDGT IYIADGPNKAIHKVTIGYK >gi|225935354|gb|ACGA01000038.1| GENE 78 102530 - 103474 359 314 aa, chain + ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 110 299 74 260 280 78 27.0 2e-14 MDKNSLYRFFEGQASIEEMRAVKEWAESSEEHSKQFHRERKLFNAMILIGNPKWTAINTT KRNRYFIREFLKIASVIIIAVSITATIFSITNEHHKHVDIAMQTIIVPAGQRVNIELPDG SNVWLNAGTQMQYPVSFMEDKREIILDGEAYFDVARNEKCPFIVHTHAMDIEVLGTKFNV EAYAKQQNFEASLIQGKIKIKSPSDNNISQVLLPDYKSTLKNGKLVISKIEDYNIYRWKE GLYCFKNKSFIEIIKDLEKYYNLKIIVDKPSITNIILTGKFRISDGLDYALRILQKDVPF IYNKHTTDDIIYIK >gi|225935354|gb|ACGA01000038.1| GENE 79 103694 - 107185 1598 1163 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2402 NR:ns ## KEGG: Dfer_2402 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 27 1163 62 1217 1217 926 42.0 0 MRTSIFLLLFCSFSLMAKNANSQNAKVTINKQDVRLESILSEIESQTNYLFIYKKNVDVN LHKSISVKARPVSEVLSILLHESPIIYKIEGNHIILTKKEITQTPTIKKITGVVTDRAGE 
PLIGVTITVKGISQGSITDINGKFSLSADIGDVLQFSYIGYIPQTIKLKNLNSLKVFLVE DVKTLDEVVVVGYGSQKRTNLTGAISTVTSQELVDRPASSVTHMLQGRVPGLNITTSSGV PGNTASFNIRGTNSINGGSPLVLVDGVEGDLDRVNPNDIESISVIKDASAAAIYGGRASF GVILVTTKSGSSDGKVQISYSGRVGWTEPTTSTDYENRGYYHVSLVDQFYIPNNGRRYTS YTEEDMKELWERRNDKTEDPARPWTVLDKRNGQNVYLYYANTDWWQYLFKDKKPLQSHNI SISGGTNKLKFYLSAGYNSEEGMYRRNTDKFRRINFHSRISADITDWLNISNNTEFFNSN YTYPGLGGINSNISQAQNHGLASFPTQNPDGTSLYTTSLTDYAIMDGLLVILDSEGFRNK TRKDNFKTTTELTLKPLKGLEIKANLTYMLYNQYDIRRQSNGTYSKYPGETSILNTGNFQ NQMFEQFAPNEYWATNVFATYQGSVSNKHNYKVTGGFNYETRYNKTVRGTGYNLLSDVLD DFNLIGVDPATNQKRMEVTGGQNEYKLAGFFGRINYDYKGIYLLELSGRYDGTSRFASGH RWGFFPSMSVGWRVSEEKFFSNMKDWCNNLKFRYSFGSLGNQQVGYYDYIRTISIGTTGY LFGGARPSYADISAPSADNLTWETVQQHNAGVDMTFLDNRLTFTGEYYIRNTKDMLTAGI TLPAVYGTSAPKMNSADMRTKGYELSVAWRDQFKLGNKPFSYQLSFTFSDYISEITKFDN PDKLLAKTYYEGMRIGEIWGYRVDGFFATDEEAKNYEVDQRSVNTTINSSSGEYRGVRAG DLKYRDLDGNKTISIGENRVGSSGDREILGNSSPRYQYGTNLAFQWYGFDFNIFFQGIGH IDWYPGREAQAFWGYYNRPYGTFIPKDFQKLCWSEDNPNAYFPRPRTAVALTENSELSTV NDRYLQNIGYCRLKNLTIGYSLPKFLTNKIGLDLVRLYFSGENLAYWSPIKTDYVDPEQA RQGGTLKVYPWQKSFTFGVNINF >gi|225935354|gb|ACGA01000038.1| GENE 80 107200 - 108966 681 588 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2403 NR:ns ## KEGG: Dfer_2403 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 5 583 4 576 576 533 50.0 1e-150 MKNNIFLLIAASIFILSFSSCDDLLDVTPKDKITPETFFSTEADLQAFSTNFYNSLPKDR DLYKDDADNITHSTPNSAATDSRTVPASGGGWEWGYLRRFNTLLQYSVNCKDKTVRTRYQ ALTRFFRAYFYFEMVKRFGDVPWYDTELFVSETDQLNKPRDSRELVMKNILEDLDFAITN LSTTPSVYTVTKWTALSLKSRVCLFEGTFRKYHNLNLSTNADFFLQESITASRELMKESN YTLYTEGGTNQAYRKLFTSQNAPTTEVIMARNYNLTYDIYNTVGTYHTSVGEGRPGVTRK IIASYLMNDGSRFTDNPNWETMTFVDECKNRDPRLAQSICTPSYQPIVSNRKNIPDLNAC VTGYQLIKYEDEAVGGKSDIDLFYFRLAEVYLNYAEALAEINQLTQDDLDISINKLRDRV GMPHLIMKTANANPDPYLLANTTGYPNVSGANQGVILEIRRERTIELLSEHFRYDDILRW KAGQNMKQAILGMYFPSPGEYDLNGDGQNDICLYTDTKPGNAQGITYLKIDSDIKLSDGN KGYLSPHKGLTLFWNEQRDYFYPIPSNERLITNGALTQNPGWDDGLNF >gi|225935354|gb|ACGA01000038.1| GENE 81 109128 - 110027 346 299 aa, 
chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 46 298 4 255 257 196 40.0 5e-50 MNSKRNFIYNFFLLSFFFIGCSNSTIEDEPEMPPFNNETPSNSLCVAQFNIRYDTPEDGQ YVWANRKTMAKEIITSHDFDIFGVNECLLNQLNDLLELKQYEYIGTGRDDGKEAGEFCPI LYKKERVELLYHGQFWYSETPDKPSKSWNSFCNRICTWGKFKDKKTDKDFFFFSSHFDHV SNEARVNSAKLLVQKVQEIAGDLPYFCTGDLNCDPDEEPISFILNSGLFKDSYSISETTP KGPAGTLHYWNFDFNPEHRIDYILVEKSIKVLSFETITDDARQGRFSSDHYPIMIKAEL >gi|225935354|gb|ACGA01000038.1| GENE 82 110256 - 112121 686 621 aa, chain + ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 42 522 37 559 656 106 23.0 1e-22 MNYKKWIIGLCILNILPALSQEYNIENIHIDGYIGNRINTCIEHRVKSQNTDHLIEPFKH RNEDHLWQSEFFGKWLLGAIASYQYTKDKELYNLITNSVEKLMNTQTSDGYIGNYKREAQ LTNWDIWGRKYTSLSLLSYYRLTGDKKALNAVERLINHLMEQLQIHNINIAATGYYLGMA SCSILEPVVYLYDITRNPRYLSFAKSIVSSIEREGSSQLITKTLKNIPVSERSAFPKSWW SFENGQKAYEMMSCYEGLIELGTIVNDPFYIRIAEKAVNNIQEDEINIAGSGAAFECWYK GKEKQTLPTYHTMETCVTFTYMQLCHRLLCKTGNSFYAEEFEHTMYNALMATMKNDGSQI SKYSPLEGRRQPGEEQCGMHINCCNANGPRGFALIPKTACTIKDNHIYLNLYLPLQATIS LNKKNKVHLNVESDYPIHGKVNVNIGVQKKEKFTLALRIPTQIEKMKAYINGEEQEITHK GGYLYIERIWENADKVTLDFKIETKVVKLNNSQAIVRGPLLFARDSRFNDGDIDECATIK CNNQGVIQAKIKKNKTGTFPWITLTVPAVLGTDLENVENKTPKLINFCDFSSSGNDWNPN GRYRVWIPQTFHAMSEPYKKY >gi|225935354|gb|ACGA01000038.1| GENE 83 112182 - 112874 252 230 aa, chain + ## HITS:1 COG:no KEGG:RB343 NR:ns ## KEGG: RB343 # Name: not_defined # Def: hypothetical protein # Organism: R.baltica # Pathway: not_defined # 34 219 41 226 244 75 30.0 1e-12 MKKYNKIKKILFAAICGGLFIACQQPHNVNDPLQVLCLGNSITKHPIKEDIEWFSDWGMA ASNIEYDYCHQLEKMLCKYHPNSTVTPLNIASWERNLSCDIDSLIGSYCKNKDIIIIRLG ENVQDVTTFPDAISRLVKYCQSKAKRVIITGCFWKNDSKERSIIHAARTNDLTYVPLYWI 
DNLYNVRPQIGDTLYDINKKPYVITQDFIITHPNDEGMQMIAETIWGVIR >gi|225935354|gb|ACGA01000038.1| GENE 84 112986 - 116264 1290 1092 aa, chain + ## HITS:1 COG:no KEGG:Slin_0358 NR:ns ## KEGG: Slin_0358 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: S.linguale # Pathway: not_defined # 41 1090 40 1110 1114 947 46.0 0 MKHFNNIPIVTILLFSYILTLYSIKTHASALTSEHEIYKGFINPPQEAKPRVWWHWMDGN ITEEGIYKDLLWMNRIGIGGFHQFDASLYTKPIVKERLVYMTPKWKKAFRYAIHLADSLN MEVGIASAPGWSSTGGPWVNPKNAMKKLVWRTFTVEGQKMFSGKLPSPYTTTGEFQNIPI RNMNNYTEYYEDIATIAVRLPSEELSLNELDAEITSSGGNFSLHKLTDGDFTNAELLPFN PFDKYGWIEYSFPKPQTIRALSFVSGRLRTEWRSESPQTVNFLQCSDNGVDYRMVCGIPD GGNAQQTISIPETTARYFRVLVPNPTNNAPGTQIPEFVLHPTVKINHTEEKAGFSSPHDL MNYPTPETNTPIDKKNVIILTNKLDSNGILKWNVPEGKWRIYRFGYSLTGKRNHPAPAEA TGLEVDKLDPESWLSYFRTYMDMYKEAAGGFMGKRGIQYIITDSYEAHWQTWTPSLPSFF KHKYGYDLLPWLPVLTGEIIENTSESECFLRDWRLAIAELYRKNYDRINSIVKEYGLKGR YTEAHENGRVYVGDGMEIKRTATFPMAALWMPNSGACSSQQMGQADIRESASVAHIYGQN IVAAEFGSAIAHHAYACCPENIKPIADKALANGLNRFVIHETSHQPVDDKIPGLGLIQYG QWFNRHETWAEQAKVWIDYLARSSYMLQQGNNVADILYYYGEDNCITGLYAHTLPDIPAG YEYDFIDPYGFVHDIKMDKKCLLAPSGRKYSILILGENTKVMSLKVLKKIASLVEEGVAV LGREPIFRGSNLDNADEFKALVTSIWHTNRKNVYTDTPIEEILESTGVQPDFIYRNNKHI TLKYRHRLLDHGHIYWVSNPSDKELYTEVSFRVSGLKPEIWHPETGLSEDVSYEMKDGRT IVKFSMVQNDAVFVVFIKPTKRIKQEIPLPHKKELTTIDSNWTVKFQPKIGNSESLLMDS LVSFTDYSSPSIRYFSGTATYTNSFIIKPENYKKSDKLLLDMGSVKNIAEVYVNGKICGT FWKAPFIVDITAAINPGENTLEIKVTNLWRNGLIGDQQKGVTPQYYTSYHFFKSDSALLP SGLLGPIRLIKY >gi|225935354|gb|ACGA01000038.1| GENE 85 116449 - 118257 1160 602 aa, chain - ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 106 471 124 484 1014 90 24.0 1e-17 MRKIVCSLVILINSVYLFAQWKPVDGRIMTEWAAKIDVENVLPEYPRPIMERADWINLNG LWNYAISPIGQAMPQSYDGRILVPFAVESSLSGVGKSLGEKNELWYQRQFSVPSKWKGHR ILLHFGAVDWKADVWVNKVKVGQHTGGFTPFSFDITPALLNGENELIVKVWDPTDKGPQP 
RGKQVSKPGGIWYTPVSGIWQTVWLEPVPIRSIVNIKTTPDIDRNKLTVEVMTDHQMLSD KLEVKVLEGKQLVAVGSSVNGIPVEIAMPSDVKWWSPDSPFLYDMEICLYSENKLIDKVK SYTAMRKYSTKRDKNGIVRLQLNNKDLFQFGLLDQGWWPDGLYTAPSDEALVYDIQKTKD FGYNMIRKHIKVEPARWYTYCDRLGVIVWQDMPSGDKNPEWQNRKYFEGTEFKRSAESEA IYRKEWKEIIDCLYSYPCIGTWVPFNEAWGQFKTPEIAEWTKQYDPSRLVNPASGGNHYT CGDMLDLHNYPGPEMYLYDAQRATVLGEYGGIGLVLKEHLWEPNRNWGYVQFSTSKEATD EYLKYANMLKDMIARGFSAAVYTQTTDVEIEVNGLMTYDRKVIKLDEQNLKRVNTEICES LK >gi|225935354|gb|ACGA01000038.1| GENE 86 118297 - 119373 782 358 aa, chain - ## HITS:1 COG:AGl909_1 KEGG:ns NR:ns ## COG: AGl909_1 COG1409 # Protein_GI_number: 15890570 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 29 334 352 660 1299 103 26.0 7e-22 MKLFNYLLAGILCASATAVSAQHRADPQKLVDPESFSMILLGDPQGYTKYDINQPLFDLC TAWIADNIEPLKIKAVLCTGDLVEQNDNNVLNRKMLNQTSREMWEAASQALKRLDNKVPY IIAAGNHDYGYKAAENGRTYFPDYIPFERNSTWRNICVSEFPNREGRASLENSAFEFDEP GWGKILVIAVEFVPRDEVLQWAKNLVNKPEYKNHKVIFITHSYLDIGNKRVTKDGYKISP QNSGQAIWEKLIYPSSNIRLVLCGHVGRGTGEYENNVAYRVDKNSAGKDVSQMTFNVQYV GGGPEGNGGDGWLRILEFMPDGKTIKVRTYSPLFGISKLTRHLAHRTAPYDQFDIILE >gi|225935354|gb|ACGA01000038.1| GENE 87 119377 - 120462 768 361 aa, chain - ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 55 294 42 284 313 90 30.0 6e-18 MKKLSFLIATALCIMACSGGSDEQPDGPGNGNGNGGGGSTPNPVSFSNPIVGKQLPDPTI IHAADGNFYLYVTQQDNIYIPIYKSTNMTQWTYAGAAFLQKPNWQPDTRLWAPDINYING KYVLYYTLGHWDLPKNSCIGVAVSDSPVGPFTDHGKLLDYNSGTMQCIDPCYFEDNGKKY LFWGSYNSTSKGYEGGIRYVELSADGLSIKGAVSEKIAGPSIEGIMVHKRGNYYYLFGST GSCCEEERSTYRVVVGRSKNIEGPYVGKNGTVMKENSSYNEVILQGNNLFAGTGHNSEII TDDEGNDWFFYHAWQKAKIDNGRQLMCDRIQWSSDGWPYITNGTPASVSRAPVFKNEEEK E >gi|225935354|gb|ACGA01000038.1| GENE 88 120489 - 122321 1466 610 aa, chain - ## HITS:1 COG:no KEGG:PRU_2073 NR:ns ## KEGG: PRU_2073 # Name: not_defined # Def: 
hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 7 608 4 615 615 480 43.0 1e-134 MMKNKIKIFRKLVRCASIALLSFSMAGCGADFLDTTPTNNISGSSIWTKATLAEQAVIGV YNVFLDQYCPNSGILDSHRIPWDAYSSVMDTDKNWIKRMPNCTGAATPSSGLFSDIYKRF YTFIYRANDVIDNIGQVPDMSTGEKQRAIAECKFLRAFAYMRLNILYKGVPLYMNAITSP ENGNKARSSEADVWAAILQDLTDCVNEPNLPGKYNSGDSNYGRVTKGAAYALRGQVYMWQ KNWEAAVDDFNSVADCGFGLYTKAGATSYKQLLKIANEQCEEMIFSVQCIKQYGYSNTRN ITYGNRCTAGSMWNNYLPNPHFVESFETATGEKFNWDNYLPGYSAMKEKERVVFFLRDGL SAAEKERMTAYGADMSKYLDSGNEARIKKAYESRDPRLQMAVITPYAQYNGASSGIAHTY TLRWPYRGSDSAEPFDIRTDTNDKFYYLWRKFVPEGLEQTERDTYGLDIPLIRYAEVLLL KAEALNEWGSTHESEAIRCVNEVRRRAGHVELNNATYPATKVNGQSDLRERIRNEYYWEI GGEDSMYFHELRWGTWKDKKFDNNRNGLMQMWGETTYTWYWLGDQCWSWPIPAKEMEMNS NLEQNEGWIN >gi|225935354|gb|ACGA01000038.1| GENE 89 122342 - 125842 2842 1166 aa, chain - ## HITS:1 COG:no KEGG:PRU_2074 NR:ns ## KEGG: PRU_2074 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 101 1166 23 1091 1091 1056 50.0 0 MSKIIKCMFFVLLAAVLPLSVHAQQVTLHLQDVTVRKAFRELEVKGNVSLVYEKNDVDLT RKVTVKVDNQPISKALDQILKGQELIYKIKDNHIIITRPVVQKSDKKRRINGVISDVNDE LLIGVSVSIPGTNIGTVSNMEGKFELEIPEDAKKLQVSYIGFDTQEISLSPGKTTFNIVM GENVKMLSEVVVVGYGTQKKVNLTGSVATVNLEKESLSRPLVSASQALSGMTAGLQVMQS SGTPYNEGFSFNIRGVGTLNSSSPLVLVDGMEQSLNNVDPNDIASISILKDAASCAIYGN RGANGVILVTTKTGQTGKVSVSYNATFSLNQPTKLIKTVSNYAKYMELMNEAATNIGESA PFSDITINQWREAEKDPNGISASGYPNYVAYPNTDWYDEIYSNDWMMKHSLSVTGQEGRT GYNLSISYTDNPGLIKDTGYQRYFLRANVYSDITKWLRIGTRVWGYHTDQKKSDTGSLTN INTQKMIPGVYPYYDGKYGAPEANEEDPQSHNPLWDMAQSEGHIKNTQFFTTFYANVKFL KHFSYDVNLNYKDYRYESMSVNTDYGKYSFSTDQWIAAPKDPADLYTRMAYVRENHWKLT HLLNYNQTFYKHDIGMLLGFEEERFVKRTTDTAKLGLIDSSVGDSSSATTPSSIGGTGEE FTSRSYFGRINYAFDSRYLFEVNMRYDGSSRFAPDNRWGLFPSFSAGWRISEERFMKGLP IDNLKLRASWGKLGNNAIGNYEWQAVYNSAKYAFGGSLNNGLAITSISNNLLEWESTAIT NIGIDFATLRNRLTFEAEFYNKVTDGILYRPDMYMSLGTASGPRQNIAEVTNRGIEMTLN WQDRIGKVEYRVSGNFAYNQNKVSKYKGELQRGWVTDENGNRVYQTNIGDVSTGGSTRVI EGKMINEYYMLQPYKGSGNGFNADGTVNINGGPRDGMIRTVDDMKWLNAMVGAGYKFYPR 
QNVSKSGIWYGDYIYADLNGDGVYGNDADNEFQGCSNVPKYNFGLQASANWKGFDFSMNW SGAAGFKIYYYRLALNSSATIKGYAIGEAIANDHYFFDPKNPNDPRTNLSSKNPRLTNNS GNDQSGMTQSSVHLQKGDYIKLKNLTIGYTLPAVLSKKVYAQNIRFFASGENLFTITGYE GMDPEMRTAVGYSTMRQYAFGVNITF >gi|225935354|gb|ACGA01000038.1| GENE 90 125900 - 127624 1164 574 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172534|ref|ZP_05758946.1| ## NR: gi|260172534|ref|ZP_05758946.1| hypothetical protein BacD2_11776 [Bacteroides sp. D2] # 1 574 1 574 574 1117 100.0 0 MKKLHFLLSTFLVFIFTSCGEDELKGVVLSEDPGYVKEPLVAIQAEDGTGNWINGLIDQN SRVIALDFRILDDQSAVNVKLKLADEWAKPIDPLTTDAVLDLSSGITRIKVNDGADDIEY TIFSTSTQLLRGITATCNTEQVSANFINGIASLRFKQNTVFANVVLMPTLADNAEIISTD PSSQKDENGNLIVDLNSTLTITVKDNTNNMTKKYQLSAVNGIIDAGVNWKNITADLKSQH SSIRIPDFMIVYENKNLHGRSGNTGWLITIPAGKINMKVSWDMNKDNVTWAEPTAQHRTT AIMNNNMDYSLFVPGLSGQFWVPIFYSLGWNDSGLLSAPRLGYNLSTSLKYCPGTLGITA DGKAEISYAEVIDNNLYKFTSGGAGKQAANGVQWSPTSAVSGYSYPLQKGQIMIAGENAE LYKTFATNEGRNTSTRNRGMDGALSGSNSWAANPVSIWTFDGKPVGRRAVGITEEGDLVI FVSNRFTNSYNSVKAAWNVFSDGSSLREVAVALQEVGCTDAIVFGENYHSPVVIRDDSRG VPLGKVAGRYDWETNGNVKNADNEASSQSWIMFK >gi|225935354|gb|ACGA01000038.1| GENE 91 128030 - 128626 381 198 aa, chain + ## HITS:1 COG:no KEGG:BT_1877 NR:ns ## KEGG: BT_1877 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 192 2 193 196 145 44.0 1e-33 MEKNLETEIIEALKSGKEEAFRYIYKTYYTDLCRIAKGYLTDSYLSESIVEDLVYSLWEN RSKIAINTSLKNYLFRSVANKCINYLQLEYVKRETACSSEDLVIYSDLWSLGENPVEHLE GKELHAIIQKTINGLSPETRKVFLLSRFENKRQEEIAEIMGITIHTVKYHMRSALDKLKE AAKSYLALIVMFITNMFN >gi|225935354|gb|ACGA01000038.1| GENE 92 128728 - 129720 534 330 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 18 308 29 306 331 64 24.0 2e-10 MKEEFDRLIYDFLSGSITKEDLHKLNEWIYSDPENQKYFEQQKRIWLLSAEYKNVSVHEE 
QAYNRFAKRIRKHKSAPAGKMRRALFVKYAGYAAAIIILLFTTPFIIYNFTTPDDSLISE VYAPRKSKLKMKLPDGSIVWLNADSKLSYSESFSRKNRNVRLEGEGYFEVEHGEHPFVIQ TDSAQIKVLGTKFNVKNYGDENYIKVSLLEGSIVLLCINQEFIMKPNQTMTVNKLDQTYK LTESADYAEQWINCKVFWDEVPMSIISKELERQFDVTFAFESEQLKNLIFHGSFIIETNN LEKILDIMSETNKFSYSIEKDKVYIHSLKK >gi|225935354|gb|ACGA01000038.1| GENE 93 129753 - 130163 502 136 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 4 132 16 143 146 110 49.0 7e-25 MTAQEFFKNDLFAENAGVVLLEVRKGYSKAKLEIKAEHLNAGARTQGGAIFTLADLALAA AANSHGTLAFSLSSTITFLRASGPGDTLFAETRERYIGRSTGCYQVDVTNQNGDLIATFE SSVFRKDQKVPFEVQE >gi|225935354|gb|ACGA01000038.1| GENE 94 130328 - 131203 748 291 aa, chain - ## HITS:1 COG:no KEGG:BT_3342 NR:ns ## KEGG: BT_3342 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 33 291 1 259 259 503 92.0 1e-141 MKTRRKRIGNEIVRTAVVSVRRNFIIGLVALLMGAFALPASAQCEAKNDAFQSGEHVMYD LYFNWKFVWVKAGLASLTTNATTYHSQPAYRINLLALGSKRADFFFKMRDTLTCVIGEKL EPRYFRKGAEEGKRYTVDEAWFSYKDGLCLVNQKRTYRDGAFNESEDSDSRCIYDMLSIL AQARSYDPADYKVGDKIKFPMATGRKVEEQTLIYRGKENVKAENGVTYRCLIFSLVEYDK KGKEKEVITFFVTDDLNHLPVRLDLFLNFGSAKAFLNNVTGNRHPLTSIVK >gi|225935354|gb|ACGA01000038.1| GENE 95 131148 - 133103 1528 651 aa, chain - ## HITS:1 COG:no KEGG:BT_3341 NR:ns ## KEGG: BT_3341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 628 1 616 642 1036 85.0 0 MWCLLNKNPYFCSMLQFENQKIKQIVRKVLPVVTLGAMLYSCASIGRPDGGPYDETPPRF IGSTPAAGALNNERTKISLMFDEFIKLEKATEKVVVSPPQIQQPEIKASGKRIQVNLLDS LKPNTTYTIDFSDAIVDNNEGNPLGNFAFTFSTGTEIDTMEVAGTMLDASNLEPIKGMLV GLHSNLNDSAFNKLPFDRVARTDSRGRFSIRGVAPGKYRIYGLMDADQNFTFNQKSEMIA FHDSLIIPRMEERIRMDTAWVDSLTYDTIVEKKYMHYLPDDVILRAFKELNYSQYLIKSE 
RLVPHKFTFYFAGKADTLPVLKGLNFDEKDAFVIEKNQRNDTIHYWVKDSLLFKQDTLAM SLTYLYTDTLNQLVPRTDTLNLVSKQKYKKEESDKDKKKKKKKKGEEDEPEPTKFLPVNV SAPSSMDVYGYISLNFEEPIASYDTAAIHLRQKVDTLWKDIPFEFEQDSVNLKKFNVYYD WEPTLEYEFSVDSTAFHGIYGLFTDKIKQGFKVRSLEEYAKITFVVTGAEPNAFVELLDA QDKVVRRRRVEEGGYADFYYLNPGKYSARLINDRNGNGEWDTGNFEKGVQPEEVYYYNQI LEPKANWDLEQSWDIHAVPLDKQKLDELKKQKPDEDKKKKDRERNSQNRRR >gi|225935354|gb|ACGA01000038.1| GENE 96 133402 - 134013 555 203 aa, chain + ## HITS:1 COG:no KEGG:BT_2534 NR:ns ## KEGG: BT_2534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 200 1 200 202 317 76.0 1e-85 MSIQYDLYDTPDIQQTGDAQPLHPRVVFKGTVDQEEFLDRVHKFTGISRSLLAGAMQSFQ NELRDLIANGWIVELGDIGYFSVSLKGPRVMKKKDVHAQSIELKNVNFRVSSQFKKEVGQ QMRLERGESMTRHHGKGRSEEECLTIINQHLNKYPCLTRTDYSRLTGHDKKRALKELNAF IERGLLIRYGTGKQVVYAKKTES >gi|225935354|gb|ACGA01000038.1| GENE 97 134491 - 134835 349 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720495|ref|ZP_04550976.1| ## NR: gi|237720495|ref|ZP_04550976.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 114 1 114 114 200 100.0 3e-50 MYDDLKENIILVMQHPIARRPISNLSDEEREKAFDLLNYLSTLSVDENYTLLDYIQMARL EYALGELEYKTTNDTEKVIRHFRTALQHLEKGGFDLSISKWTELVSLRTKEDTE >gi|225935354|gb|ACGA01000038.1| GENE 98 135171 - 135647 586 158 aa, chain + ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 158 160 186 66.0 2e-46 MALKYVVKKRVFGFDKTKTEKYVAQNVITNTVNFRDLCKEVSMFGMIPEGAVKHVIDALI DALNTNLNKGLSVQLGDFGCFRPGMSCKSQKEKKDVDADTVRRVKIVFTPGYKFKDMLDN VSIYKTEGNGGTSSGNGGGNKPENPDEGGSGEAPDPAA >gi|225935354|gb|ACGA01000038.1| GENE 99 135931 - 139038 2898 1035 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 25 1027 7 983 1087 702 38.0 0 MKKQLLSCCLAALGLTAIQAQSFNEWKDPEVNSANRSAMHTNYFAFASADEAKAGIKENS ANFMTLNGLWKFNWVRHADARPTDFYQTNFNDKGWDDLQVPGVWELNGYGDPIYVNVGYA WRSQFKNNPPLVPTENNHVGSYRKEIILPADWKGKEIFAHFGSVTSNMYLWVNGRYVGYS EDSKLEAEFNLTNYLKPGKNVIAFQVFRWCDGSYLEDQDFFRYSGVGRDCYLYARDKKYI QDIRLTPDLDSQYKDGTLNIAVDLKGSGTVALDLTDTQGNSVATADLKGSGKLNTTLSIS NPAKWTAETPNLYTLTATLKNGNNVVEVIPVKVGFRKIELKGGQILVNGQPVLFKGADRH EMDPDGGYVVSLERMLQDIKVMKELNINAVRTCHYPDDNRWYDLCDQYGLYVVAEANVES HGMGYGDQTLAKNPSYAKAHMERNQRNVQRGYNHPSIIFWSLGNEAGMGPNFEECYTWIK NEDKTRAVQYEQAGTSEFTDIFCPMYYDYDACIKYSEGDIQKPLIQCEYAHAMGNSQGGF KEYWDIIRKYPKYQGGFIWDFVDQSCHWKNKDGVNIYGYGGDFNKYDASDNNFNDNGLIS PDRVPNPHAYEVAYFYQDIWTTPADLAKGEINIFNEYFFRDLSAYYMEWQLLANGEVVQT GIVSDLKVAPQQTVKVQIPFDVKNICPCKELLLNVSYKLKAAETLLPAGTTIAYDQLSIR DYKAPELKLENQQASNIPVIVPSILDNDRNYLIVKGENFSMDFNKHNGYLCRYDVNGMQM MEDGSALTPNFWRAPTDNDFGAGLQHKYAAWKNPELKLTSLKHAIENDQAVVRAEYDMKS IGGTLSLTYTINNKGAVKVTQKMVADKSKKVSEMFRFGMQMRMPVNFNEVEYYGRGPGEN YADRNHGAMLGKYRQTVEEQFYPYIRPQETGTKTDIRWWRLLNISGNGLQFISDAPFSAS ALNYTIESLDDGDGKDQRHSPEVEKADFTNFCIDKAQAGLACVNSWGAIPLEKYRLPYQD YELSFIMTPVYHKIK 
>gi|225935354|gb|ACGA01000038.1| GENE 100 139163 - 140398 1205 411 aa, chain + ## HITS:1 COG:ECs4393 KEGG:ns NR:ns ## COG: ECs4393 COG0845 # Protein_GI_number: 15833647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 25 395 8 382 385 167 31.0 4e-41 MDEQHTGNKRIKLKNRYLLSMKSRIVLFAFCVALLSSCGNKGNDTGKAPEYAVQELQKTT ANLTTAYPATIKGKQDVEIRPQVSGFITKLCVDEGATVRKGQVLFIVDPTQYEAAVRTAK AAVATAEAAVSTQQMTYDNKKELNKKQIISDYDLAMAENSLAQTKAQLAQAKAQLTTAQQ NLSFTQVKSPSDGVINDIPYRIGALVSPSIATPMTTVSEIDEVYVYFSMTEKELLAMTKS GSTIKEEISKIPTIKLQLIDGSTYDVDGKVDAITGVIDQSTGSVSIRAIFPNKEHVLRSG GTANALIPYTMENVITIPQSATVEIQDKKFVYVLQPDNTVKYTEIKIFNLDNGKEYLVTS GLNSGDKIVIEGVQNLKDGQKVQPITPAQKEANYQQHLKDQRDGNLATAFN >gi|225935354|gb|ACGA01000038.1| GENE 101 140414 - 143845 3134 1143 aa, chain + ## HITS:1 COG:all3143 KEGG:ns NR:ns ## COG: all3143 COG0841 # Protein_GI_number: 17230635 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Nostoc sp. 
PCC 7120 # 1 1123 1 1036 1057 624 34.0 1e-178 MKLDRFINRPVLSTVISILIVILGAIGLATLPITQYPDIAPPTVSVRATYTGASASTVLN SVIAPLEEQINGVENMMYMTSTASNTGSGEISIYFKQGTDPDMAAVNVQNRVSMAQGLLP AEVTKVGVTTQKRQTSMLVVFSLYDETDTYSESFIENYAKINLIPQVQRVQGVGDANVLG QDYSMRIWLRPDVMAQYKLVPSDVSTALAEQNVEAAPGQFGERSNQTFQYTIRYKGRLQQ PEEFENIVIKSLPDGEVLRLKDIAEIQLDRLGYNFTNRVDGHKSVTCIVYQMAGTNATQT ISDIEALLDEASKTLPTGLKLNISMNANDFLFASIHEVLKTLIEAFILVFIVVYIFLQDL RSTLIPTIAIPVALIGTFFILSLVGFSLNLLTLCALVLAIAIVVDDAIVVVEGVHAKLDQ GYTSARLASIDAMNELGGAIVSITLVMMAVFVPVSFMGGTAGTFYRQFGMTMAIAIGLSA LNALTLSPALCAVLLKPHKKEDGTTEDSTLKERMKVAYTAAHTTMINRYTEAIGKMLHPG ITLTFTLIAILGMIFGLFSINPIVTAIFVVLSVLALIGMSTKKFKNRFNDTYESILKRYK KRVLFFIQKKWLSMGLVVASIAILVFFMNTTPTGMVPNEDTGTLMGAVTLPPGTSQDRSE EILARVDSLIASDPAVLSRTMISGFSFIGGQGPSYGSFIIKLKDWDERSMIQNSDVVVGS LYMRAQKIIKEAQVLFFAPPMIPGYSASTDIEVNMQDKTGGDLNKFFDVVNDYTAALEAR PEINSAKTSFNPNFPQYMIDIDAAACKKAGISPSDILTTMQGYYGGLYASNFNRFGKMYR VMIQSDPLSRKNLESLKNIKVRNSAGEMAPIAQFISVEKVYGPDIISRFNLYTSMKVMVA PASGYTSGQALTALAEVAQENLPTGYTYELGGMAREEAQSSGSATGLIFVLCFVFVYLLL SAQYESYILPLAVLLSIPFGLLGSFLFVNGVSAIGNISALKMILGTMSNNIYMQIALIML MGLLAKNAILIVEFALDRRKMGMSITWAAVLGAGARLRPILMTSLAMVVGLLPLMFAFGV GAHGNRTLGTASIGGMLIGMICQIFIVPALFVIFQYLQEKVKPMEWEDIDNADAVTEIEQ YAK >gi|225935354|gb|ACGA01000038.1| GENE 102 143857 - 145248 388 463 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 460 2 455 460 154 26 4e-36 MKKQILYMLCATALLSSCHIYKSYDRPEDITTSGLYRDPVADNDTLASDTANFGNLPWRE VFTDPQLQSLIEAGLKQNTDLLSAAQNVKAAEASLMSARLAYAPSLGLSPQGTISSFDKN AATKTYSLPVTASWQIDLFGQLLNAKRSAQVTLKQTKAYRQAVQTQVISNVANMYYTLMM LDRQLEITKATAEILKKNAETMEAMKDAAMYSINSAGVEQSKAAYAQVLASIPDIEQSIR ETENALSTLLGEAPHAIKRGELEAQVLPTELSAGVPIQLLSNRPDVKAAEMSLASCYYNT NSARAAFYPQITLSGSAGWTNNSGAGIVNPGKLLASVVGSLTQPLFYRGQNIARLKAAKA QEEQAKLSFQQALLNAGSEVSNALSLYQKTSEKVESRQMQVESAKKASEDTKELFNLGTS TYLEVLSAQQSYLSAQISQVSDCFDRMQAVVSLYQALGGGREE >gi|225935354|gb|ACGA01000038.1| GENE 103 145267 - 146037 675 256 aa, chain + ## HITS:1 
COG:CT531 KEGG:ns NR:ns ## COG: CT531 COG1043 # Protein_GI_number: 15605260 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Chlamydia trachomatis # 2 254 4 256 280 166 33.0 4e-41 MISPLAYVDPEAKLGKNVTVLPFAYIEKNVEIGDDCVIMSYASILQGTKMGKGNKVHQNA VLGAEPQDFHYTGEESSLIIGDNNDIRENVVISRATFAGNATKIGNGNYLMDKVHLCHDV QISNNCVVGIGTTIAGECMLDDCAILSGNVTLHQYCHIGSWTLVQSGCRISKDVPPYVIM SGNPVAYHGVNAVVLSQHRNTSERVLRHIANAYRLIYQGNFSVQDAVQKIIDQVPMSEEI ENIVNFVKNSERGIVK >gi|225935354|gb|ACGA01000038.1| GENE 104 146122 - 146763 452 213 aa, chain - ## HITS:1 COG:no KEGG:BT_3335 NR:ns ## KEGG: BT_3335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 1 213 213 413 89.0 1e-114 MKVKKISAANVEACSLPKLFDEEKIDFQPVQCVNWAEYPYKPKVNFRIAHTQNSILLHFK VKEESVRAKYGEDNGSVWTDSCVEFFSIPAGDGIYYNIECNCIGTILVGAGPVRNNREHA PKEVTALVQRWSSLGNQPFAERIGETDWEVALIIPYSVFFKHQIDSLDGKEIKANFYKCG DELQTPHFLSWNPIQIEQPDFHRPDFFGTLEFE >gi|225935354|gb|ACGA01000038.1| GENE 105 146951 - 147961 582 336 aa, chain + ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 322 2 315 331 143 29.0 5e-34 MKTSLKDIAETLKLSKTTISWVLSGKGDEKGISLATQEKVFQCAKQLNYQPNLLARSLNT GISGTIGLIIPDITDSFYSKVARSIEKEAETQGYSLMICSSESEIERENRMIRLFKAKQV DGIIVAPTKVSKIEIQNLVKEGYPLVLFDRYFPEMQTNYIIIDNEGSSYELVKKMIDNGS SKIAIITTNPHLRTMNMRREGYARALIEANLPVDPDLYGEVTFVDYEKNVFKTLDQIFSK VPDVDGFFFTTHILALEAFRYFYEKGININKGYELACIHGVSAFRVLAPNMNIARMPIEE IGKNAVRILLGDIKYRLENSTEKKEVETLVLPCSLP >gi|225935354|gb|ACGA01000038.1| GENE 106 148087 - 150723 1393 878 aa, chain + ## HITS:1 COG:no KEGG:BDI_1318 NR:ns ## KEGG: BDI_1318 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 4 878 11 867 869 932 52.0 0 
MKFQIKNEKYMLALLGLLVWACSSNNIDIENMRCEYQISPIGIDSPAPRFTWNYTTNNED ATEFSQDSYQLYIATREQDLRNSSDNSWQSDTIYSDTPHATYQGSEKLKSRTRYYWQVIA WDRTKEHKIISPISSFETAQLNQSDWTGVWITDNHDKDYTPAPMLRKSFIAQDDIQQARL YVSAAAYYKMTLNGKSITSSHLNPGFTHYDKRNLYNTYDVTSQLLKGENVLSAILGNGFY NESAPVATWSYEQARWRNRPRMICEMEILYKNGEKQTIHSDSTWKTSTGPYIQNNIYSGD TYDACLAIAGWDKPGFDDSKWTNAIQVAAPSPLLVSQNMPAIETEQFITPINMRSFGDTV YVYDFGVNMSGVCTLSINGKKGTKVSMQHGELLKDNGRLEMRNLDIYYKPLPGLAFQTDT YILDGKQGTFTPDFTYHGFQYVEVRSDRPVKLTKESLTAQFIHTAVPPVGKFSCSNELLN KIWKAANQSYLSNLMSIPTDCPQREKNGWTADAHITMDLGLLNFDGITFYEKWLDDMIDN QNEEGRISGIIPSSGWGYDDWIGPVWDAAMFIVPMAIYHYYGDTRSIEKLWPVCTKYLAY LAGREDAEGTVTYGIGDWVFHKTQTPTEFTTTCYYYLDNLYMAKFAELIGKDGSSYAKKA EELKLLINHKYFDTSKAIYANGSQAAQGVALYLGIVPKEYEQQVADNLSTMIKANKNLLD FGVLGSKTVLRMLSKYGYADLAYQMAAQEEAPSWGNWIKQGFTTLAETWILSPEFRDASV NHMFLGDINAWMYNILAGINYDEQKPGFKHILIQPHFVKGLDWVKAEYKSVNGIIRSEWE RKGEQVILKVTIPVNTNATVEYNGKKIDLRSGKHQLTF >gi|225935354|gb|ACGA01000038.1| GENE 107 150846 - 153248 955 800 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 63 779 25 714 785 317 30.0 4e-86 MNKISTLLFLIFLTCTSGVFADSGSSISLNGTWELSYWMQPEEPITSPKDMQLAEVKHLS AQVPGNVELDLMAANLIKDPMIGSNVNELRKWEGYQWCYSKSFVAPQLKPGQQYQLFFAG IDCLADIWLNGKHIGKAENMMIEHAFDVTKEIKAGESNQLQVILRSSVIEGQKHLLGTFS IGNFPSEESVFIRKAPHTYGWDILPRLVSAGLWRDVELRVLNPARLTDVHYMVANVDTAT RNVRLYTDVQVKLPFEKFDKVKAVYTLSRHGKEIYKGSSVVVSPAFRYIMEIKNADLWWP RGYGEPALYDAKVELVDSDGTILSTDNKRVGLRTIQLDITDINLPPDHPGKFCFIVNGEP IFIHGTNWVPMDALHSRDHSFVDESIRMAVELNCNMIRCWGGNVYEDHHFFNLCDENGIM VWQDFTMGCTFYPQRSSFTQALEEEAISVVCKLRNHPSLVLWSGNNEDDCALRWSLQPFN INPNQDVVSRKVLPAVIYEFDPTRPYLPSSPYYSQAVYERGSADQYLPENHLWGPRGYYK DKFYTDATCCFVSEIGYHGCPNLESLQKMMTKDAVYPWTKNHEWNDEWVTKSVRRFPEWG KTFDRNNLMINQVRLLFGEVPSKLDDFIFASQSVQAEAMKYFIEMWRGKKFDDKTGIVWW NLRDGWPVISDAIVDYYNSKKMAYYFIKNVQQDVCVLINDAEGGNYPLIGTNDTRNVQSG 
NVTVTDASSGRKIYESTFRIPANQKVRIASLPEESGQGIYLIQYQIGNQKFMNHYLYGKA PFNLKEYKRLLQKTGLYAKK >gi|225935354|gb|ACGA01000038.1| GENE 108 153254 - 154222 364 322 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 316 1 316 319 144 33 2e-33 MKENLLGIDVGGTKCAIIYGIKENDELHIIDKKKFDTTTVDETIDRILCETEKMMNLHQL TPTNTKAIGICCGGPLNSETGIVMSPPNLPGWDNIPIVAMVEKKTGIKTSLHNDANACAL AEWKFGAGKGTKNMVFLTFGTGLGAGLILNGKLYTGTNDNAGELGHIRLSDFGPIGYGKK GSFEGFASGGGIAQLSKMYVMEKLQTGQKVEWCTLQELDQLTARKVAEEAAKGDKLAQSI YETSAIYLGKGLSMVIDILNPEVIVIGGIYTRNKNMMEPIMQKIIDQEALSCANRVCKVK PAALGEQIGDYAALSVAANLTD >gi|225935354|gb|ACGA01000038.1| GENE 109 154228 - 154806 580 192 aa, chain + ## HITS:1 COG:ECs0249 KEGG:ns NR:ns ## COG: ECs0249 COG0279 # Protein_GI_number: 15829503 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Escherichia coli O157:H7 # 2 189 3 190 192 215 57.0 5e-56 MKELIETTLEEARQTLESFLEETTTVGIISQAASTCAEALKAGNKIISCGNGGSLCDATH FAEELTGRYRNNRIPLPAIAINDPAYMTCVGNDFSFNEIFSRYVEAMGCKGDVLLAISTS GHSENILKAAVQAHNNGMKVIALTSKGENELSRLADIAVCAPRAAHSDRIQEIHIKVIHI LIQAIEAELGFE >gi|225935354|gb|ACGA01000038.1| GENE 110 154829 - 156430 1332 533 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172554|ref|ZP_05758966.1| ## NR: gi|260172554|ref|ZP_05758966.1| hypothetical protein BacD2_11876 [Bacteroides sp. 
D2] # 1 533 1 533 533 1089 100.0 0 MAMKMTHISKTLLSIVCTLWGMSACSDDYNKYDDCPACNYYPVAVGNSTGTPVHAIGVEM DPHFFSQNITRNDGSKAEDWENIVVQRVKKMKVQQFRIMMQPQWWEPSNDNNDPNVADMS KFTFDSEEMQSVYKVLDLAQENNIGVTLVVWGAITNIDLLSGINNGQKHFLCDARSYNVN PGWIAGIDNYEEFAENFSTMVKYLIEEKHYTCINQITPFNEPDSHIAGYGRIMWQGDFET MGWQDTYAPMVKALDAKFKADGIRSKVHFNLSDNTDGTPGYIAACVSAFTNDEADLYNSH VYKFDYNTPNSTLVNWERQNIASAGGKRHFVGEFGFPGYGSARQYGIDTYTRGVQLIRVA LNYLNAGACGVSYWSLIDQYYNRNASYSEMQQLGLWKYLKSAYTEDPDVYSKIKEDYEVR PQYYAYSLLTRFVRQGDEVYPLDLGDELIAGSAFLNTEGKWTYVLSNATDKDKMIQLEND KEGANGEYNVYKYMEGRLPEGDNLIESTETVNTQENNLKLKLSRSSIRVLVQK >gi|225935354|gb|ACGA01000038.1| GENE 111 156567 - 157721 748 384 aa, chain + ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 10 374 25 402 412 87 24.0 5e-17 MKKEISIVKMLSVMFGFFVMGFVDIVGITTSYVKNDFSHLTDTMVNLISLSCFLWFLVLS IPTGMLMNRIGRKKTVLLSFAFHVLAMCLPLVAYDFTAILIAFALIGIGNTLLQVSLNPL VTDVVANDKLTGTLTLGQFVKAVSSFLGPILAAWVTGSFFGWKMIFPVYAGLSLLALIWL WLTPIQEKQQERKNISFKVTLELFKDKYILAFFIGILVLVGVDVGINMTFPKFLMERCDL PLTDAGMGNSVYFFARTVGAFLGGILLMKYSESKFYIYSVWIALAGLILMTISTNLWFIL GCVAVFGIGYANLFSIIFSLSLKRVPEKANEVSALLIVGVSGGAVLPPILGVITDSFHSQ LSAIIVLAIVWCYLIGIMRKIKTT >gi|225935354|gb|ACGA01000038.1| GENE 112 157734 - 159833 1581 699 aa, chain + ## HITS:1 COG:YPO1581 KEGG:ns NR:ns ## COG: YPO1581 COG3345 # Protein_GI_number: 16121851 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Yersinia pestis # 164 540 167 525 708 121 26.0 4e-27 MKHFKKIESKFLFSLLIMSVWSIIAVGQTTAKISKHIPEIEKWIQKQFAKGKTPPFSFIC DGKPSAEFIRQWDYSQQKIESEEADVIKYLFTYYNPTNGLKVECTVKGYPSYQAAEWVLN FTNKGTSNSPTLEQVKVVDLAMQYASKGEFKLHYADGNHISKYDFHPRSVTLATGETKQM SPQGGRSSEGDYLPFFNIESPAGQGVILGVGWSGTWYADVCAQDNRTISMASGMKTMKLY LRPQETIRTPSISLMFWENIDRMAGHNKFRRFVLAHKSRKINGKFAEYPLSSGFNYRDPA PCTEYSCLTADYAIAMVKRYIQFGLKPEVFWLDAGWNTDAADFEHGKTWANTAGNWTVDT 
LRFPKGLRPVADEIHKVGAKFMVWFEPERVIRGTQWAVEHPDWMLDIPEHNNDTYLLFDL GNPEACHWMSKYIGDMLEENSIDYYRQDFNMQPDIYWAANDEPGRTGMKEIRHIEGLYYF WDYLLSRFPNLLIDNCASGGRRIDWETIGRSAPLWRSDYYHYDDPDGYQCHTYGLNFFLP LHGTGSLQTDPYSFRSSVSTALIYNWKITDKNASVYEMQRCLKEFHEIRPYYYEDYYPLT GTEDMTRDDIWLAYQMHRPSDDSGIIVAFRRTASKDKKITVRLSGVDPGKYYTVTDMDSR QSKIYQGKELTSQLPLILEKPHSSLLLKYKVTNEKPNKE >gi|225935354|gb|ACGA01000038.1| GENE 113 159830 - 163339 2427 1169 aa, chain + ## HITS:1 COG:YPO1581 KEGG:ns NR:ns ## COG: YPO1581 COG3345 # Protein_GI_number: 16121851 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Yersinia pestis # 304 541 310 525 708 107 29.0 1e-22 MKKRILYYLFILLLFPHNGTGQTVARMSDRPVNTSQWITTHFARGVVPPFSFEYGGKPSQ EFIKSWSYTATRQKSDDPKVLKYLYTYREPSGGLKVECEVKGFTDFNTVEWVLRFTNQGK KNTPEIANVKVSDITFRYQHPGAFNLHHSEGSHASKSDFSQQLTILSPGDNLYMRPEGGR SSQMRMPFFNIESPANQGVIVAIGWTGTWFADVRCADQQSVALTSGIERLKTYLYPEESI RTSSVCLSFWNGSDRMKGHNQFRRFVQAHHTWKVGGKPTVYPISTSFNYGDPTPCNEYTC LTTDYAIAMVKRYEQFKLVPEVFWLDAGWYNHSADVANHKNWANTVGNWTVDSIRFPEGL RPIADEVHRVGSKFMVWFEPERVMKGSAWALQHPQWMLDARGKAKQEDWTRDGEHDSYLF NLGNPEACRWMSKYIGDFLEENGIDYYRQDFNIEPEGFWAANDEPGRQGICEIRYIEGLY SFWEYLLNRFPGLLVDNCASGGRRIDLESISRSAPMWRTDYSYGEPIGYQCHTYGLNLYL PLHGTGTVSADKFTFRSSLGTSIIYNWKITEAGQSIYDMRDRQAEFKELRPYFYEDYYPL SGINNITSENIWLAYQLYRPSDDSGYVVAFRRKDNPDKSYTVNLSGLHPDHTYILTNKDT GEAIKKTGKELANGFTLTLDNPQSSLIIKYQSSTTAIQKLSVGKKTGVKLRAIGAELDPH FLSQNVTRNDGAKAEDWERIVVKRVKEMGLQSLRVMVMPQWYEPKNDNPDASKIDWHNFT FNSVEMQSLYKVLDMAQEQKMEVTLVLWGAPPGHFLAEGNYGNWVVAPTNYEEWSENFSA LVQHLLNNKKYTCAKEITPINEPDWSYIIKGKAAPTADYIEMCKVLDRRFKEDGIRNKVH FSLSDNSDGGTGTHKYLAACTKGLANVADVFNSHTYIFGYETPNSTILDWEKQNSQLASS VGKAHFIGEFGGNQCVGATRQKDIDLYERGVLMTRIVINLLNAGASGVSYWSLIDQYYGK DADYGAMQQLDLWKYVKKTYASEPYYNDIKSDYEVRPQYYAYSLLTRFIRPGAEIHPIAT PEEWYAGTAVRNTNGKWVYVFANGMDQEKTISLINEHTQAKGNYQVYRYIKDGLPTTDQM LAPDAQPLKVNDKILCLLPAHSVIVLKQE >gi|225935354|gb|ACGA01000038.1| GENE 114 163377 - 165125 1324 582 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172558|ref|ZP_05758970.1| ## 
NR: gi|260172558|ref|ZP_05758970.1| hypothetical protein BacD2_11896 [Bacteroides sp. D2] # 1 582 1 582 582 1130 100.0 0 MKTHFYILLMLGMVFLLGCEDEKLGTDLGVTNVVLPDISEESLGTEITIQGNGFIDCDVL ALSPLSGGTEQPIYMETREVASDHITVLYPSTATKDSYGLVLVRGSKMRTLGVINSTVGV MPDENLRNALSALFPDIFKGEKISSSAKYVTFTDGTLNISDKNITSLEGLEHFSNIRKLI CNNNDISEIPAEVLSRLSELTAQNTGLTKLELATSEQPNTTLVSLNIDGSTKLESVDLYY CYNIEKFSALNCKLVYLDVRNYHSIYGGCLNYNSTDFKFTFSDDASKERLLKMESWWMDS YYSNSGSIVDAINNGVTVEGYDWMHDYPDGNNNYYYSYGKYQKTMKKYGEIPDTNLRNAL KALVPDVFDGDKVLTVAALNTEYFKNNTTLDLSNKGITNLEGLQYFCGYKNLILDGNNLG EIDLSKYAISTSYTAGPVDEKGIQTFSAKNAGLTKLISGDQYMITSIDVSNNPGLAYLDI NRCKSITSLNASGCPLTYVDLRNLAGTYSVLGYSGGAVDASKVQFSFTNSSSTQRKLLVE EWWMDSPWNGTSPCITAKNQGVRIERYEYIGYDKDKMLSSFN >gi|225935354|gb|ACGA01000038.1| GENE 115 165154 - 168393 2674 1079 aa, chain + ## HITS:1 COG:no KEGG:Coch_1022 NR:ns ## KEGG: Coch_1022 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.ochracea # Pathway: not_defined # 42 1079 31 1094 1094 561 34.0 1e-158 MKRSYWLYKMCLLLGLGICIPVTGNASDIGDATLTQQQSKITASGVVVDESGESLIGASV MEKGAATNGTITDIDGRFSLSVAPNAILEISYVGYQKQDIPAKKDMRIIMHPDVANLEEV VVVAYGTQKKVSVIGSVASIDRKDLMKSSSPNLSSALSGKLPGLTTIQTSGEPGRDEVTM FLRGAATTNGTNPLIMVDGVAVDDMRSIDPNEIANISVLKDASATAVFGVRGANGVIMIT TRRGEKGTARISANVEFSMQEIAFKPERLDSWDWVRLRNEALVNDGNSAEFFGPDIDKFD SWKTGNPVDPDFYPNNNWQDILFRDYAPMTRANMNVSGGSDKLQYFVSAGYLHQGGMFNV EPKSKLGYNAQSSLDRYNFRSNIDYKVNKSVKINLNASSYLERINGTSASMSSVFNSALT SRPTSMYLTPEGAYATDAIRTFPIGAGLSVEDPANNSLSAYPLINRSGYQLETRSGINVI GGVEIDLGFITKGLSVKGQVSFDSKGFGKTIGKRSYTWYTYQTLASGEHLFINRHPSLED EDGPIELTKSSESYWVMNLQGQINYNQTFAGKHNVTAMFLAQRDIKESKETSGDLLLPYN VIGIAGRATYDYDRRYFAEVNVGYNGSEQFSPDKRFGLFPAASFGWLITNEGFLQDNPVL TNLKLRASWGKVGNDAFGSARFLYLDNISQYSVVTDSKGDHWLSPSLGYASNGSGWGQGY KIAENYIGNKSITWETAEKQNYGIDISLYHDLSFSFDYFVEKRKNILIIPQTTPMIQGLP SSALPLMNDGEVKNQGFEMVLGYQKQFKNGLSVSVNANFSYAKNKVLEYDEPLLGKDYAY RTRTTGFSLGQNWGYIIDRSYDPEKGRDGTGFFYSDESIAKSGLTYEGVGTPEPGDFIYK DLNGDGVINDRDKAPIGYSSLLPRINYGFSLSANWKGFDVSIMLQGVGQYSKTYSGAGIY 
ESSGNFYKMHMERWSEERYNNHEKITYPRLSSSGGPSLQPNDFFIMDASYIRLKNAEVGY TLPESISKKIGASNVRFYLSGNNLYTWTHLKTDSFDPEQNSPAAYPTMRTYNVGLNITF >gi|225935354|gb|ACGA01000038.1| GENE 116 168413 - 170251 1316 612 aa, chain + ## HITS:1 COG:no KEGG:Phep_1301 NR:ns ## KEGG: Phep_1301 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 118 612 107 571 571 187 32.0 1e-45 MKKRFIYICMFASLLTFSSCNEALDVAPDGRLSLDEVFQDPDLTKAYFSTCFDYLPKKSL RYHFWSNYPVALSDEAWDCTDGSGAGFAHAASGNCTTTDFYLDVRRDMGDTYSEGGYWQL YWGQIRIINTFLQRAATAAIPSESDRDRWVAEAHVLRAYFYMELLKWYGPVPIEKEPYGL DYDYSTLSRPTFEDCARFIVDDCEIALQSSNLPWRLTTLNEKIRMTKGIAMAIRSEASLF VASTYNNGGKDLWEWAYTINKDCLEQLKANGYELYTKCADTEKYYNNAYMEYFTLNNYIG TDPSDKETIWQSTEGYWPDPYTNPLYDVIGAPVLGNAQLTIVPSQELVDAYDMLATGKPV LDLSKPYNDEKHLSPNYNPNSGYDPANPYKGRDPRFQATIFYNNSPVLLGKEPAVVETYV GGNSEIRTSGNTNTRTGYYWRKRMVNGNCKAAGVAGYDGRFRFYRLGKIYLNAAEAAIES GHLTEGLEWINEIRHRAGFDPSVDLSTNDKNEARLLVRHERQIELACEEDRYFDIRRWTP ANENMENEKFTTGMRITKNGNSFSYERINLGTDGSKPSKMSYEKKWHVYPIPASEAAVLE GTTGQTWQNAGW >gi|225935354|gb|ACGA01000038.1| GENE 117 170270 - 173296 2375 1008 aa, chain + ## HITS:1 COG:no KEGG:Phep_3406 NR:ns ## KEGG: Phep_3406 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 27 1008 18 1005 1006 428 31.0 1e-118 MKNRIITLLCIVCLFTGNEVCRAQVQEVISGVVRNESNEPLDGAIISVPDNPEYTTSSDK DGKFIMEIPSGASMVICKIPGYRNQTMKFTLNKPVEFRLISDIAGQDTSLEVAMLQKERK GGYTGAISTVQGEELVKTPNSGFSATLTGRLAGLTSIQSTSTPADDATTMYIRGLNSING NSPLVVLDGIPAPLFDMNTLDANTVESVSILKDAAAKALYGPRASSGVILITTKRGEIGR TKVNVNMDFSIQQATQRPKIVSTGEYAQMRNQALLNDGGTPLYSLYEIAGFTDGTMPGHD WYDTFMNDAASMQRYNVNISGGNNRVKYLVNAGYLHQNSLIDAEKNDNYNPALKLHRFNI LANVDVTLHRYLNCFLNTNVTIDRMNQGYNGTGDIMNSIYQMAPTVPGPITEDGKVITAE YNEYPTYGLINRSGYSHQTGINLNVAYGMNLDLGFLTKGLSVKGIVGYEANYDGTIYGST DYARYAANGSGLFGTHTTLPLSLSKTANMRYFINFQGFINYNRIFGGRHEVDAFVSYFNE NMMKNGDLPYDRLSLMGHVKYGYDSRYYIQGDFTYDGTEQFRPGNRFKFFPTVSAAWVVS NESFLKDTDWLSFLKIRASAGWLGNDQISDTRFLYATDVRYKDAGYLLSNYYGFHTVEGM 
QGNPYIHYEESFQQNYGIDVTLFNSLGITFDYWNVKQTQMAIQDNSVSSAQGIASENLPF ENLGKMNNKGFDIELSYIKNLKCGLRMTIRGNMGYNKNEVTDIKELNRTASGYYYPYRKT GYSAGQQFGYLIDYSNGNGYYNSQEEIDKSGLRFSGASPRPGDFIYKDLNNDGNIDERDQ APMDKAQTLPTLSYGGSIQLEYKNFDLYVQLQGVSGTAAYYSGCGIFDNAYQGVYTDLHR NAWTADRYAANEKISYPALTTGSSSSLQSNEFFYSKNNYMRLKNVVLGYTLPKRWSSKLK LEKMRFYISGANLLTTSSLKFDNLDPEQYSYSVYPIYRTFNIGLNLNF >gi|225935354|gb|ACGA01000038.1| GENE 118 173308 - 175080 1261 590 aa, chain + ## HITS:1 COG:no KEGG:Phep_3874 NR:ns ## KEGG: Phep_3874 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 590 1 568 568 232 32.0 4e-59 MKINKIISGVFSLLILSSFIGCTEIEDHPDGRTDYSDLFTTNRKTYTYMNQCYGWILNYG MNYNYTMLAGCTDEAKDSWELQNGVTRKWNEGQLSPFSNPLEGIEGNPENYNYYYQGIRA CNIFLANIPTASVYSEDIRNSFKAQVLTLRAFYYLQLVKRYGGVPIITTPDYDYTKVKRG TFGECARQILADCQAAIDIPTVEEWGWRSLDKENYRHVMTKAICAAIRSQISLYAASPLY NDGTITWTEAAEITKKSLDDCLANNYELYKKQPNATAGYSPYDVYFYSRTDLPVVNDKET IMEVGQMYMWNYAGLPTTDGQTDAGACPSQELLDAYEVVNGDMTESYPLLNLETPYLDAN HLQPNLNSAVQGLYNQAKPYENRDPRLKASIYYDGSKLNLETGALLSTKTGGNCALDPSN ARYTCTGYYTRKFSHYKSGRSGNLDGYFKEFRLAELYLNYAEAANEASTTGTVPNDAVDA VNAVRSRVGMPDIPYGLSCEQFRLRIRNERRVELAFEEHRFFDVRRWKILDQTDKVITGM KANSDGSYSRFVVDNNRKAYSEKFLLYPIPGDEAIRLQNASGTNCQNPGW >gi|225935354|gb|ACGA01000038.1| GENE 119 175103 - 176173 782 356 aa, chain + ## HITS:1 COG:no KEGG:BF1058 NR:ns ## KEGG: BF1058 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 34 355 20 342 342 276 43.0 1e-72 MQKWLLTIIIAILPVITACSKTDTEVSVPEEETGKKTEIKFLQLNLWVECTKVEHAPEYL IEQIVALQPDIATFCELYKGPQDDPVMPKLMQGLKNKGLIYYDARIDGRAVISKYPIKET ERINKWMFKAILDVNGKRVAVYPAHSEYRYYTCYYPRGYNDGSVNWDKLPAPITDADKIL AVCEESDRIESAQAFINNAAKELEQEALVFFAGDLNEPSYLDWQADTKDLFDHRGCIVNW GTSKLLVQRGYKDAYRVIHPDPVKCPGFTFPADNKSVIPENLSWAPEADERERIDFVYYY PNKNLQIKNAQIVGPTGSIVRGKRIEEQTKDPIIPPVNNQWPSDHKGVLITFYIKE >gi|225935354|gb|ACGA01000038.1| GENE 120 176191 - 177474 569 427 aa, chain + ## HITS:1 COG:BS_cah KEGG:ns NR:ns ## COG: BS_cah COG3458 # Protein_GI_number: 
16077387 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Bacillus subtilis # 139 425 23 316 318 114 32.0 2e-25 MQKWIYIIILLTGYSIKSYPQETINLNIPVEQNWMFDNEHPVRFMLELTSAQSGFSDRIR MDIATDFGNAVSSTFITYSFGTDSIAPGGGFRSRASVTLTNLTPGFYQACFIAGTDTLKH FFFGYEAEQILSKPDGLPNLKRFWNKTITELKRVQPVYKMTRLPESSKPERDVYEVEMCS LEGEILRGYWAVPTDGQSHPAVVVYQGYDAVTWIPGPDNYPGWCILVIPPRGQGLNKPYN RFGEWIAFGLDSPKHYYYRGAFADTVRSIDFIFAQPCFDGQNLFATGISQGGALTLAAAS LDHRISAAAPIVPFLSDYPDYFRLVHWPASLVFEGAARQNIDKDKIFSLLTYFDLKNLTG WIKCPVLMGFGLQDNITPPHTNFAAYNHIRTPKRWICYPTSGHFAAYENMDSWIAESTRF FNQYVVW >gi|225935354|gb|ACGA01000038.1| GENE 121 177670 - 181728 2953 1352 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 813 1060 4 247 294 139 36.0 3e-32 MRKVILLFLLFLSVAGVRTQGQNITFSHLTTDDGLSQFSVNSLYIDERGIIWIGTREGLN RYNGNDIKSFKLNKNDPNSLFSNTVLRITGNKNGKVYLLCTDGVAEFDLTTQRFKTLLQG NVDAIYFNEKLYIGKREEVFVYNESTGNFDLYYHLAGKDITLSCLHLDEKKNLWMGTTSN GLYCLSGDKKISQPVTRGNIASIYEDSSKELWICTWEEGLYRIKTDSTIENFRHDPKNPN SICADFVRSCCEDNAGNLWIGTFHGLNRYEKSTGKFQLYTANANKPDGLTHSSIWCIVKD EQGTIWLGTYFGGVNYFNPEYEIYTRYKTGDTEKEGLSSPIVGRMTEDKDGNLWICTEGG GVNVYDRKNNTYRWYRHEEGKNSISHNNVKAIYYDRTNEIMWIGTHLGGLNKLDLRTNRF TVYRMKAGDPTSLPSDIVRDIVPYKDKLVVATQNGVCLFNPATGTCQQLFKETKEGRGIG MVASLCIDKDGTLWIAATGEGVYSYRFDSGKLTNYPHNPANPNSLSNNNINSIMQDSNGN LWFCTSGSGLDRYRKASDDFENFDVQTDGLSSDCIYEVCESSIQKGDLLLITNQGFSQFD YPSKKFYNYGTENGFPLTAVNENALFVTHDGEVFLGGIQGMISFWEKKLHFTPKSYNIIL SRLLVNGKEVVPGDESGILEQSICHTPEISLKANQSMFSIEYATSNFIPANRDEILYRLE GFSDEWNHTYRKQTLITYTNLNPGKYTLVIKSQREGIKEARLLIIVLPSWYETWWAYLIY TIVTISLLWYLIQNYNSRIKLRESLKYEKKHIEDLEALNQSKLRFFTNISHEFRTPLTLI VGQVETLLQVQTFTPNIYNKVLGIYKNSLQLRELITELLDFRKQEQGHMKIKVSQHNLVN FLYENYLLFLEYASSKQINFKFNKQKDDIEVWYDQKQMQKVINNLLSNAVKHTKAEDTIS 
INVSQEKDHVIIEIKDTGTGIAAAEIDKIFDRFYQTEHLNSLNTGAGTGIGLALTKGIVE LHHGTIRVESEPGKGSSFIITLKLGKEHFTEEQIAKDDTETIQQTETIVPSVEIIPDSEW KEEDNKRIEDAKMLIVEDNESIKQMLVGIFETFYQVSTASDGVEALEMIQKDMPSIILSD VVMPRMSGTELCKQIKTDFNTCHIPVVLLTARTAIEHNIEGLKIGADDYITKPFNTNLLI SRCNNLVNSRRLLQEKFSKQPQAFAQMLATNPMDKEMLDRAMAIIERHLDNTDFNVNIFA REMGMARTNLFTKLKAVTGQTPNDFILSIRLKKGAVMLRNNPELNITEISDRIGFSSSRY FSKCFKEIYHVSPLAYRKGEESEEEGDGEETD >gi|225935354|gb|ACGA01000038.1| GENE 122 182021 - 183556 1249 511 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 13 501 11 467 497 160 28.0 8e-39 MKNVSRLLPLLSGIVTLSGCSHAPQKSNGQNNQKPNIIYIFADDLGIGDLSCYGATKVST PNIDRLAGQGVQFTNAYATSATSTPSRFGLLTGMYPWRQENTGIAPGNSELIIDTTCVTM ADMLKDAGYATGAVGKWHLGLGPKGGTDFNNRITPNAQSIGFDYEFIIPATVDRVPCVFV ENGHVVGLDPNDPITVSYDHKVGDWPTGEENPELVTLKPSQGHNNTIINGIPRIGWMTGG KSALWKDEDIADIITHKAKNFIASHQEEPFFLYMGTQDVHVPRIPHPRFAGKSGLGTRGD VILQLDWTIGEIMHTLDSLHIADNTILIFTSDNGPVIDDGYQDQAYELLNGHTPMGIYRG GKYSAYEAGTRVPFIVRWPARVKPNKQQALFSQIDVYASLASLLDQPLRKGAAPDSQEHL NVLLGKNNTNREYVVQQNLNNTLAIIKGQWKYIEPSDGPAIEHWTKMELGNDKQPQLYDL SSDPSEKTNVSKQHPDIVKELSELLESIKEK >gi|225935354|gb|ACGA01000038.1| GENE 123 183560 - 183676 73 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTEKLYDFAPILNNSIHIRIFNPSIFASVDSIKSLNF >gi|225935354|gb|ACGA01000038.1| GENE 124 183716 - 186874 2960 1052 aa, chain + ## HITS:1 COG:no KEGG:BT_3332 NR:ns ## KEGG: BT_3332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1052 1 1053 1053 1547 73.0 0 MLSSIGLILFSVSFVLAQVLVKGTVKDNLGEGVPGASVQVQGTSQGTITDLDGKFAFNVP NKNSILVISFIGYVTVEIKADTQKPMVITLKEDTKTLDEVVVVGYQEIRKRDLTGSVAKA SMAELLSTPTASFGETLGGRIAGVNVSSGEGMPGGQMNIVIRGNNSLTQDNSPLYVIDGF PVEDPSIAAAINQNDIESLDFLKDASATAIYGARGANGVVMITTKKGTIGQPKIKYDGSF GIQHITKTIPMMDAYEFVKLQAERSPKDMETTYFMNYDGKKWGLEDYRNIPQYNWQDEIF 
RSAWMQSHNVSLTGGSEGVRYNASLSYYDQDGILLESNYKRIQGRMGTTIQKKKLKIYLT TNYSSTTTTGGSPSQNSYSGMNNLFYSVWGYRPVTEPDRPLNSLMDNIMDDAINNTNDYR FNPIMSLKNEYRKTYANYIQFNGFAEYEFIKGLKLKVSGGYTFDTRKGETFNNSKTRYGN PKSSDKVNAEIYHSQRATWLNENILTYQTNIKRKHFFNSMAGVTLQNSDYEYYSYKTVQI PNEALGMAGMSEGTPSTTKSLKSSWSMLSFLGRLNYNYKSLYYATVSFRSDGSSKFRGDN RFGYFPSSSLAWGFMEEDFMKPLKPVVSSGKLRASWGLTGNNRVGEYDTYALYQMLKDKV GDFISVGSTPSGVYPFENSLTSVGMVPTSLRNRKLKWETTEQWNLGLDLGFLDERIGLTI DWYRKTTRDLLLNASTAPSSGFTSAMKNIGKVRNQGIEFTLNTTNIKNRNFSWTSNFNIA FNKNKVLALAENQVSMPTAAKFDQNYNSQYSYIAKVGYPMGMMYGFIYEGTYKYEDFDKV GDTYTLKRNVPYFSSESNTQPGMPKYADLNGDGIIDDNDRTMIGNGMPKHTGGFTNNFEY KGFDLSIFFQWSYGNDVLNANRLFFENSNKTRDLNQYASYADRWTTENPESNIPRATDSG SNKVFSTRIIEDGSFLRLKTVSLGYTLPKQLTKKWKIDNARVFVAGQNLWTCTGYSGYDP EVSIREGALTPGLDFSAYPRAYSISFGVNLGF >gi|225935354|gb|ACGA01000038.1| GENE 125 186902 - 188674 1752 590 aa, chain + ## HITS:1 COG:no KEGG:BT_3331 NR:ns ## KEGG: BT_3331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 590 1 572 572 695 60.0 0 MKTIRIIIISLLVGLNLTSCDFLEKDPTYTTPENFFKNEAEATSWLTGTYAILGQSSFYG NEFLYLVGGDDLGHYGGANRGPNKSGLICNNANTSDPTVAALWSTLYAGVNRANIFLENI DAVPDMNDDTRKQYKAEARFLRAFYYFTLVECWGDVPFKTTSTEDVYNHSIPRTDKQTIY DFIIKEMYGSAEDLKSAQDLNYLPGRISKSAAWGMLARVYMFRAGEPKRDKEVGLANSTT SAEITEYFKKASYYAQLVKNEGHSLTAKYWDFFIDICSDKYNTALNKDGAKANESIWEVE FAGNRSTDVRAEGRIGNIIGIQGKDLSSKASITGKGDPGYAYAFIWNTPKLLELYEANGD IDRCNWNIAPFTYTQSAGEGTPVDGREFVKGKRDEVKQQYWDKSFSYGKTEPGSTYGDRE SKNDANKNRNRAAAKYRREYEADKKSKNDTSINFPLLRYSDILLMIAEAENEVNHGPNDL AYECINAVRERAGINKLAANLDETSFRKAIKDERAMELCFEYTRRFDLIRWGDYHKLMQE QVDKAQADESWKFGTNVYTYFNIPKSYNYFPIPANEIGSNSAIKTNNPGW >gi|225935354|gb|ACGA01000038.1| GENE 126 188705 - 189805 844 366 aa, chain + ## HITS:1 COG:no KEGG:BT_3330 NR:ns ## KEGG: BT_3330 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 347 347 189 34.0 1e-46 MKTIYKSLMTIAFAGLCLASCDKELKEDTTMEVGVVTDSNVSFDGKTVTVKKGSPVTFSF 
DGDPDFISFFSGEIGHEYKHRNRIEMQPEDVEKCEINFSVLYDYGSAKTIEGSTHILISD QFGGISGNNVEKDKEAVTNCEWTELVSQDELPKATKDTKDYSCPLISYLGKEISIAFRLN PLDNSSTMPVIHIKGLQLNLEFNNGKSTTINAKNFEFSALNVTYNLDDLSKNNTHLTKLK EALGNKNLTLEEMKSAEYEDKIAYATVDGNIPYFWRINQPSDFVTSGGAAGYTKGDTWII SNPILLNGSCDPDAGVAIKNISQSLEIYSHTYEEAGTYTATFVANNANYVHQGGQVVREL TINVVE >gi|225935354|gb|ACGA01000038.1| GENE 127 189950 - 190894 749 314 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 8 313 4 307 311 181 33.0 1e-45 MPKTLPIHFAPLQGYTEVFYRNAHAACFGGIDTYYTPFVRFEKGGFRHRDVRGIDPGNNQ VAHLIPQLIAPSFEKAEKILSLFIEKGYKEVDINMGCPFPMLAKRHNGSGILPYPEEVQA LLSLITEYPQISFSIKMRLGWEDPEECLKLAPIINELPLRQVVMHPRLGKQQYKGEVDLK AFEAFQNACKHPLIYNGDINSVEDIHRIQEQFPGLAGIMIGRGLLANPALALEYRQNRAL EFDEMREKLQSMHKCVYNQYAEQLEGGDEQLLNKMKTFWEYLMPQADRKLLKAIHKSGNL NKYNQAILAFFNQR >gi|225935354|gb|ACGA01000038.1| GENE 128 190949 - 192184 988 411 aa, chain + ## HITS:1 COG:no KEGG:BT_3325 NR:ns ## KEGG: BT_3325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 411 1 409 409 648 78.0 0 MKKYILIFLCLITVSGCFAQMQDFKFKFYGQIRTDFYYNSRANEETVDGLFYMYPKDKVR DPEGNDLNSTPNSNFYTLYSRLGVDVAGPKLGTAKTSAKVEVDFRGTGTSYSVIRLRHAY LNLDWGKSALLLGQNWHPLFGDVSPQILNLSVGAPFQPFSRAPQIRYRYTNKNFQLTGAA VWQSQYTSQGPEGKTHKYLKQSCIPEFYVGADYKNDGLLAGVGIELLSLKPRTESIVNTD KYKVDERITTLSYEAHVKYTNKDWFIAAKSVLGSNLTQASGLGGFGIKSVNEQTGEQEYT PIRFSSSWFNVVYGQKWKPGIFVGYAKNLGTSDALYAPKGNDAKLYGTGTDLNQLVTAGA ELTYNVPHWKFGLEYTLSSAWYGSLNTSNGKIQDTHAVCNNRIVAVAMFMF >gi|225935354|gb|ACGA01000038.1| GENE 129 192572 - 195643 2214 1023 aa, chain - ## HITS:1 COG:no KEGG:BT_3324 NR:ns ## KEGG: BT_3324 # Name: not_defined # Def: chondroitinase (chondroitin lyase) precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 1023 3 1014 1014 1841 84.0 0 
MMKQPFTKFGVTTLFSLLCSAFLHAQVVTDERMFSFEEPQIPDCITATHSRLSVSDLHYK DGKHSLEWTFEPGGILELKKDLKFEKKDPTGKDLYLSAFIVWVYNEVPQDATIEFQFLKD GKRCTSFPFGINFSGWRAAWVCYERDMQGTPEEGMNELRIIAPNSKGSLFIDHLITATKV DARQQTADLQAPFVNAGTTNHWLVVYQHSLLKPDIELTPVDDKQRAEMQLLEKRFRDMIY TKGKTTDKEVETIRKKYDFYQITYKNGQVSGIPIYMVRASEAYERIIPNWDKDMLTKQGV EMRAYFDLMKRIAVAYNNAANPVIREEMKKKFLAMYDHITDQGVAYGSCWGNIHHYGYSV RGLYLAYFLMKDVLKETGKLQEAERTLRWYAITNEVYPKPEVNGIDMDSFNTQTTGRIAS ILMMEDTPEKLQYLRSFSRWIDFGCRPALGLSGSFKVDGGAFHHRNNYPAYAVGGLDGAT NMIYLFSRTEFAVSELAHETVKNVLLTMRFYCNKLNFPLSMSGRHPDGKGKLVPMHFAMM ALAGSPDGKEEYDSEMASSYLRLISDPSIENDSPEYMPKVSNAEERKVAKRLVEKGFRPE PDPQGNIAMGYGCVSVQRRSNWSAVARGHSRYLWAAEHYLGANLYGRYLAHGSLQILTAA PGQTVTPATSGWQQEGFDWNRIPGVTSIHLPLEQLQAKVLNVDSYSGMEEMLYSDEAFAG GLSQQKMNGNFGMKLHEHDKYNGSHRARKSYHFIDGMIVCLGSDIENTNTEFPTETTIFQ LAVTDKAGHDYWKNYQGDKKVYVDHLGTGYYVPTPIRFEKNFPQYSRMQNTGKETKGDWV SLVVDHGKAPKNGSYEYAVLPQTNEALMKKFAKKPTYRVLQQDRNAHIVESVSEQIISYV LFETPETTLPGGLLQRVDTSCLVMTHKESVDKIKLTVAQPDLALYRGPSDEAFDKDGKRI ERSIYSRPWIENASGEIPVTVTIKGQWNVERTPFCKVISSDKKQTILQFSCKDGASFEVE LRR >gi|225935354|gb|ACGA01000038.1| GENE 130 195679 - 196371 475 230 aa, chain - ## HITS:1 COG:no KEGG:BVU_0159 NR:ns ## KEGG: BVU_0159 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 30 225 38 233 1106 193 45.0 5e-48 MTRRFFIGYSLITFFLLNSVVLLAQHKASFVQQWKIEDASHALQIIERADTLELIVPDGL TMWYRQRLTGDYEISYRICMVMQGGKYDRLSDLNCFWAANDPKYPDDLFARSQWRDGIFK NYNTLNLFYVGYGGNDNSTTRFRRYKGEYYGVADDKVKPLLKEYTDASHLLVPNQWYEVR IRVEKGITTYSVNDEELFRYTLAGSEGDGHFGLRLLQNHVLFTDFKATIL >gi|225935354|gb|ACGA01000038.1| GENE 131 196487 - 197857 1092 456 aa, chain - ## HITS:1 COG:no KEGG:BF3314 NR:ns ## KEGG: BF3314 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 44 453 66 429 430 97 24.0 1e-18 MKIKSLMYVLMGTMLVTTSCSDNELEKGNDGSGTVDPVNASALVNVYSDKSGSEASLLVG DVLVKDSRTLTLNVPAACEKVYMKYNTVSGTEATKEFALSPVSRGVDQSTGFNFETNRLA LVTLALPEDAVQPTNETDQGYLFYHNTGVVMFEDGWPIQLDSWYDEDFNDVVFEYDLKVT 
ECHSQQMMETVGGKEELLLTLDVRAVGGIYPTVLGVVLDGLKSEYVDRITASLVLKGGQG TMTDLAKEELSTKNIVKVENKNWNWSNDTRKEPRFAILTVDKAQAEGTVITLDGLTSLMD NNQDMFQVTQGKVREGLPMLRAEVRLIGKEGLTGAERDAQLAAFRELILDTNRQNFFIKV NGGKEIHMRGYAPTSAYKAEYEALVAGDTTLDANVYYSNTKGSTWGVKLPVGTRHAYERV PFREAYPDFTKWVDSKGVSNQKWYENFVDEKTIRYW >gi|225935354|gb|ACGA01000038.1| GENE 132 197927 - 198115 67 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFCLMQIYEITCIYYPLFMNRLFEPYFSSCDYSRWKGLKLQIKASKKLLGVKYIFDKLTN EA >gi|225935354|gb|ACGA01000038.1| GENE 133 198183 - 199355 826 390 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2255 NR:ns ## KEGG: Cpin_2255 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 155 386 145 373 375 117 28.0 1e-24 MDQVLYGIDYIEYYFYWIYYKFIGYPLIIRICSIAVMFCIIAYLFLLFHIIYGIFKRRKE KRKYNKAFDKYYEEMKSISLDSNTLSEEEIADRLEYDTKKRPKPNELRIITQLLTEIKSV HEDEINELNYQTIQTVFQITRFLERELQFGSKRSKIQALKLIQSINGYASEAVLVRFLYH RELELRNSARYTYMWLSQGDPFRFFDEDIGMKLRQWDMMELHAILEHRKKVGYNTPTFIK WVNTSAEENVKIFFINEIRLYNETESAPILAKQLNARSVEIRGEAIKTLGKLKYKEVEPK LIEMYNVQPEEVKRQIISAIADLKTDKALGFLYNAYDEADNWGTKRVILKALYEYSAMGR KTFDQLERKADSHTAILFAHTKHPLINQLN >gi|225935354|gb|ACGA01000038.1| GENE 134 199361 - 200836 1155 491 aa, chain + ## HITS:1 COG:mlr6694 KEGG:ns NR:ns ## COG: mlr6694 COG1215 # Protein_GI_number: 13475588 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Mesorhizobium loti # 58 469 63 470 475 292 37.0 8e-79 MRDIIFTFFNYFVFFYTSMLAIFFVTFAFLSFISLKRRKDYYVESYMRKTIKESPYTPGI SVIAPAFNEEKTIIDNVNSMLALEYPLFEVVIVNDGSTDSTLENMTEYYELVEVPYAYIE RIKTRPFRRLLKSSNPKYSRLIVVDKENGGTKADASNAGINVASHPYFICTDVDCILEKY ALYRCISPIISSEKQVIAVSGTMLMANGCVVKDGQIIDVRTPRTPIPLFQNLEYMRSYLI GKMGWSAINGMPNVSGGFGLFDRSVAIAAGGYDAPSFAEDMDLITRMVGYMCDFSRPYKI VQIPDTCCWTEGPPNLAMLYRQRTRWARGLFQTLNIHRKMIFKKTYKQMGLLTLPYMFVF EFLAPIIELVGLIVFIYLAFTGAVNWNTAWMIYLTIYTFCQFLSIVVITYDYYVGMLYKR GYEYLWIIIASILEPIFYHPIITFCSLRGYLSYLTNRDFKWKNMERKGFKQKEESADGTD ITAMKPEPATI 
>gi|225935354|gb|ACGA01000038.1| GENE 135 200867 - 203836 2202 989 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2252 NR:ns ## KEGG: Cpin_2252 # Name: not_defined # Def: TPR repeat-containing protein # Organism: C.pinensis # Pathway: not_defined # 28 989 27 954 954 151 22.0 1e-34 MNREFLHKITILGCLFLLITSSSGTDNGFQTPEQYAQIVQEHFANEEWEAGKELLEEGLQ KYPNVSDLEWLMGKYWFHEKNYDQSRYHLVKAIDDNYNNVNAKHLLVDVEDITENYSSAI CYVNELLEVNPYWRGLWRRKIELYRKQNNDVEADRLLKRINQIYPNDTILRKDYIYSMEV GYQQMKKGGNRKEAIEKLTELIKVSPQNEEYYLDIINLHLQEGNREAALGWSSNGLAAIP GSGALIVKRASILSELARYPEALVFIREQMRRNNSPAIRRMYNDLLMEAARAEKQRDPYV LYGMAYEGGNKNKEALDYLLNTSVTRGYTDDALFYIREAKKQYGNNDKGILYKEYMLYRQ MNEDDLAYSTLKKMYEMYPDDYDITLAMSAQHMKKAEKLMELGLYAEALPHVLFVSQKHV DDNEVNGAAWEKALSCYINMKRYNEALATLDTITLHFPDYENGTLKRAFILDKMDKTEEA LQLYLSAIEQSDEDMRIFYVIGYEELAVPYIKKCMEAGATKKAYEESVKLISLNPSSDLG LRYAINSSGLLGKYDEFEKYTVQGINYYPEEPFYQAKRATVLERDKRYEASLEFLKPILN KYPSNKEIIGAFSQSSEYRALQLTKAKEPEKALAVLDTALLYDSQNKSLKYTKGVVYEAN RQADSAYYYQKYYEPSIMEYRSFQRHLSGLRSMMLKNEIALTYLRARYGEEDIITSVATA EYTRKGQKNSYTGRLNYAGRSGSASDSMEAEEQTPGGVGIQVQGEWTHHFSPKWSLTANA AFATKYFPDITADVALRHYLKNDWEIGAHVGYRRVAAYTKRYEWNDEFFAGGTGDNGYLF TGWNESKTNLLTVGGELAKTIEVVRLNTKIDMHFFNSNFYYNAQIGAKYFPASDTKTNIN AMASIGSAPETAVLDYALPGSFSHTNTMVGLGGQYMVSPNITIGLMGTWNTYYNQTNTVR GTSPSNQIESISTRYKNLYNIYAQVYISF >gi|225935354|gb|ACGA01000038.1| GENE 136 203811 - 205964 1419 717 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2251 NR:ns ## KEGG: Cpin_2251 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: C.pinensis # Pathway: not_defined # 157 654 165 674 752 181 26.0 1e-43 MHKFIFHSDTLGPVLAGMICICCLLSGCRMQARTEMFPSKEGYLVTIGEDPTDRDTRWAK YLYEHLKKRANDDEIVAFGVSEMDMWRIIIQIDPTLQRGFKVACKGSDIRLTASDDKQML WLQYQLIKKISKEDPRIDGSDLPPALINLNDTCGAFAFDYQSIYSPYGLNADHTGVIGLN NFDDSWGIWGHNLRKVLGKDAEKVYATIHGKTDDSQLCFSSEDMYRQIESYIVDNFGEKG NFRFVIAPDDTPYACTCATCTALGNTEKNATPAVTELILRLSQRFPKHTFFTTSYLTTQQ VTDKQLPSNIGVIVSAIDYPLRRTDGKDEQDKKFAEQLDNWKKVTNKIYIWDYINNFDDY LTPFPILKIAQQRLQLFKQHGASGIFFNGSGYSYSSFDEMRTFVLSALLINPELPVDELI 
KSYFNQEYPVSKKWLYDYYTELENNAQSGKRLGLYAGIRESEKGFLYPEKFIKFYDEMGD FVSEAKGKERKKLHELQTALSFTRMELARDHSFDAYGYAKRNGKDIQPLPQARKWVTQLK EHQAFAGMGYYNESAYEIDYYIKEWEQYILASDIKKSLFLGMHPSATPKLNKNDSKKLTD GTHGLPGDYHCGWVIIPGEECTINLPVKGLNASGTFYISFLNLPRHRIYAPQQVQLLKDG VAYKTIDLKPEDSPEKGEMMKATVPADLNGTEQLSIKISCLKKPGTQMGIDEIAFIP >gi|225935354|gb|ACGA01000038.1| GENE 137 206003 - 206518 412 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716783|ref|ZP_04547264.1| ## NR: gi|237716783|ref|ZP_04547264.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 171 4 174 174 351 100.0 8e-96 MNKSIFYILLLTALPLYFTGCRKEVRPTSMTIKDSVRHYYPIKQGQQLDIMFTITNTGDA PLIISEMQPSCGCIILDKSSHIIIPEDGIRQFKATYNSIKNVGEVVHRIRIFGNMLPDGR AELKFDVNVVPDADYTRDYEELYQEFNTKNGIVREMVDGKESELGYYVGEP >gi|225935354|gb|ACGA01000038.1| GENE 138 206612 - 207172 579 186 aa, chain - ## HITS:1 COG:no KEGG:BT_3323 NR:ns ## KEGG: BT_3323 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 186 1 173 173 302 86.0 4e-81 MRAKKFVCCLLAMMLLAGVSFISCGNSSKAKADSELTTQDGEDFKSFLDKFTSSAAFQYT RIKFPLKTPITLLADDGETEKTFPFTREKWPLLDSETMKEERITQEEGGIYVSKFTLNEP KHKIFEAGYEESEVDLRVEFELQSDGKWYVVDCYTGWYGYDLPIGELKQTIQNVKEENAA FKEIHP >gi|225935354|gb|ACGA01000038.1| GENE 139 207459 - 209264 523 601 aa, chain + ## HITS:1 COG:no KEGG:BDI_3446 NR:ns ## KEGG: BDI_3446 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 601 1 660 660 305 32.0 5e-81 MKTIYTVFLLLLLLSCTSSSEKLAWEIANNSQTNKKELTRFLEHYKTNKDKDKYKAACFL IENMPNKYSINGKEQKIYDIDIVKADSLIKSLEHSFFLKEKSPYLKNYTFEQFCEYILPY RVADESLQYYWKWDCSRKFEKQCTNDIIQTAQNINAQIKIELSPEFYKDTLKSYSSIIKT GYGKCDDRTALVTMALRSVGIPAAFEFVPYWGSNNNGHSFVSIILPDNKIYPLQNTDKQA NGDYYLSRKTPKIYRKMYSIQDLAKHIDNIPELFRHNDLLDVTKLHNIGSCDVTVSTNIN KEKENFLSVFSPKRWVPVAFSSSQTFHHIGTGNIYNVDRNKEAIDLGDGIVYLPTHWVNE EAIPIGSPIIVSEDSVREIKPDTKHLERVVCKRKFPLNMRIVDFSKLMIMGVFEGANKAD FSDTTELYKITKTPESKMQKIEISAEKAYRYIRYRKPKGTFSIAEFCLYQSDEKLLPFHP 
IACDAIYEDSTMLNIFDGQPLTYYQVSGGIDLWVGVDLYKPVKISKIGFAPRNDDNAIVS TDTYELFYWQDQWISLGRKRPIGDSVVYDNVPQKALLWLRNLTKGREERPFTYENGKQIW W >gi|225935354|gb|ACGA01000038.1| GENE 140 209295 - 210071 527 258 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172582|ref|ZP_05758994.1| ## NR: gi|260172582|ref|ZP_05758994.1| hypothetical protein BacD2_12016 [Bacteroides sp. D2] # 1 258 1 258 258 445 100.0 1e-123 MKKITLFLSLLLIGSVGFSQIIPGVNIGKRKEYMMRTYKINSQKADEHEQILFSLQKEND QLKNRKISSTQFKAEQKKLYKKYGTIISQAFSGGKHKKWSSCTQEMERYQILSENKFIPY EKMRALYKAESEWVKERDKMHKDTGEAWEKYENSDTMVSELNIKIKQILGTENGTWYIEY KRLFFRALDNMDKYGVTYKDAFTIAKIEDTYKQKRANILNSNKKNAEREVELMAIDDEMA KKNSKDCSVCFCKVGKSK >gi|225935354|gb|ACGA01000038.1| GENE 141 210001 - 210348 152 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720541|ref|ZP_04551022.1| ## NR: gi|237720541|ref|ZP_04551022.1| predicted protein [Bacteroides sp. 2_2_4] # 6 115 241 350 350 186 99.0 4e-46 MMRWQKKIAKTVPSVSVKWEKVNNAALDHTLKSRYGLNQEQINKFKTAYNKYAIEEYKIL NQKKLSDSDKYDQLSQLGETFCKTVNPLFKVDNYKKWYGWWKYDFERKMKRKGLK >gi|225935354|gb|ACGA01000038.1| GENE 142 210356 - 211897 712 513 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720542|ref|ZP_04551023.1| ## NR: gi|237720542|ref|ZP_04551023.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 513 1 513 513 967 100.0 0 MKKSTYIYTVLCVIAIVISIVNCKDEDLLESGLQTDMKDFSLQEAKNFFQTQAHANLTLS RSLDNKRNKTVSPGDFVPNWDAAVSSTNNGLACYDIPITPTYHFKAIYVDERNGKPSAGK VNVYQKLVIVKDVKSNRMDQYILTLIPSKLYDSRNGAQTCNNFINCADKGGFTGVALYSC VYSQVTARISTFKNGVKTRGVFLLNASGKTNLSDKYEQARALASTVYIQKKKMVLTRGED DYNYDYDYGNEDDYTYIGETLEEVIITPESNNNETSGGNDEWEIIAPPDSGTIDPEPTEP ESTSTEDDTVTENNNGDQNSDEKSIPLSTAEKKAVNSLLIQLEKLKNIDRTKYTIEKQNY CRSTARTSKDGVLQLCQLFFSSDNLTEIDRIATIWHEMYHIDHKHYGKLEMTILEKTIVL NPPPYIEKILNERLDIMYGKYIMTPETREADFKQELIIDRYGTIEYYKNELETHKAEREN FPEVSHYYENERTWLEWTYEQLLIIATEQSSNK >gi|225935354|gb|ACGA01000038.1| GENE 143 211902 - 212360 234 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720543|ref|ZP_04551024.1| ## NR: gi|237720543|ref|ZP_04551024.1| predicted protein [Bacteroides sp. 2_2_4] # 1 152 1 152 152 283 100.0 2e-75 MNTIKSFFISILLCMSIPSKGQTQDLEITFAYNRQDNALILKLFNNTDKEIIVLNQSLLN ESSGSCIILTEKHDNGQSDLIISLYDYEDGQWIRSKTINPNERLELFYSFEAIPANNVTR ARLFLSTYFRDRKTGKLVSKRYKNDLPIKQIK >gi|225935354|gb|ACGA01000038.1| GENE 144 212357 - 212803 227 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720544|ref|ZP_04551025.1| ## NR: gi|237720544|ref|ZP_04551025.1| predicted protein [Bacteroides sp. 2_2_4] # 1 148 1 148 148 276 100.0 2e-73 MITSKHLFIGIVLLGISIFCTAQEVKIEFSYDKPNNSLTLILTNNTDKEILVMNQGRLSE FSGSYIVLTESSNGKSADLTICLFTLESGKWILHKSLSPKGRIELSYPLDSIPANNVVRA HLFLSTYSNDEKTGKLTSKRYEKNLYID >gi|225935354|gb|ACGA01000038.1| GENE 145 213347 - 213574 65 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKTVMISRKPDFKRVGLTLIVCFLFNHCHRNIMLPTNAAVHPAREPDNSKPYKIKQAIM QNDNVNIRCLFCMFS >gi|225935354|gb|ACGA01000038.1| GENE 146 213935 - 214063 64 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEAILLTMYFAKIDIVNTFMVLLFIIPIMLMNNISNGKTWPG >gi|225935354|gb|ACGA01000038.1| GENE 147 214215 - 214589 226 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298480418|ref|ZP_06998615.1| ## NR: gi|298480418|ref|ZP_06998615.1| O-antigen polymerase superfamily [Bacteroides sp. 
D22] # 1 124 407 530 530 255 100.0 5e-67 MYIYSKSSLEDYPLNTKPQILQTAARIAPNSELYIKMGDFWKQKRDYAQAEACYQTAAAM IPHHITPSYKLFQLYIDKGNINAAIDMGNYLLKQPIKKKGTKALRMEAEILEFLHKEKNI KKTQ >gi|225935354|gb|ACGA01000038.1| GENE 148 214760 - 216103 487 447 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172588|ref|ZP_05759000.1| ## NR: gi|260172588|ref|ZP_05759000.1| hypothetical protein BacD2_12046 [Bacteroides sp. D2] # 1 447 1 447 447 890 100.0 0 MNKIISLFIAVFFNVYTSAQINAGSTKESLMTLDEMPSWILSNIQFPQEAYKYGIAGIEQ VCISASWDGKVFITSILNTLNPAFEKEIMDVISKAPRCRYNGSQPKDIYKYMLIDFHQYI PEDKREQIQQVTMHIPPRLSNIPTSPFNSRDKFVQWIHNNIQIPSTLKCYSETLLFQYTI TKKGKVNNISILQCKNDIVKCAIEDLLKKSPKWEPAIADRTTPIDVTICDKIIIKTDNDG MLLPLIVYRDDVFCNTRSKPTDPDMIVFNPEIKAKYNEEGNFLKNIMCDVIVDKKMVLNG SFVIEKDGTTSHIEISNSPDAETDSIVTEAIARTKWIAAMQGESAVRTIYSFGVNKQPRK QNQSSKYSYYDIFGKYFIALQANPMRTSYRFIQGDGTIQNYPFNNQGLFDYKAYYQGMLY YYKNMAGKNSNISRDYFDKLYKMYAGY >gi|225935354|gb|ACGA01000038.1| GENE 149 216243 - 217526 851 427 aa, chain - ## HITS:1 COG:no KEGG:BT_3321 NR:ns ## KEGG: BT_3321 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 763 83.0 0 MEHLLHYVWKHKLFPLKVLQTTNGLPVEVIDSGLQNPNAGPDFFNAKLKIDGALWVGNIE IHTHSSDWFRHGHHSDKAYDSVILHVVSEADTEITRTNGEQIPQLLLTCPDNVQLHYHEL CVADQYPACHPILASLPKLTIHSWLTALQTERLEQKAQLITQRLKHCNSNWEDAFFITLA RNFGFGLNGDAFETWAGLLPFRAMDKHRNDLFQIEAFFYGLAGLLEETFLKKEQEDEYSL RLCKEFRYLQRKFEIGQGMDATLWRFLRLRPENFPHIRLAQLAYLYQKGDKLFSRLLEAE TLADVRNLLDARTSPYWENHYLFGRPSSQKEKTMGERSKDLIIINTVVPFLYTYGLHKAD ERMCERAGRFLEELKAESNHIIRSWSDAGLPVVSAADSQALIQLQKEYCDKRKCLYCRFG YEYLRKK >gi|225935354|gb|ACGA01000038.1| GENE 150 217610 - 218374 850 254 aa, chain + ## HITS:1 COG:MK1422 KEGG:ns NR:ns ## COG: MK1422 COG0289 # Protein_GI_number: 20094858 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Methanopyrus kandleri AV19 # 44 249 69 266 275 110 35.0 3e-24 MKIALIGYGKMGKEIEKVARSRGHEIVCIIDINNQDDFESEAFKSADVAIEFTNPMVAYS 
NYMKAFKAGVKLVSGSTGWMAEHGEEIKKLCTEGGKTLFWSSNFSLGVSIFSALNKYLAK IMNQFPAYDVTMSETHHIHKLDAPSGTAITLAEGILEKLDRKDKWVKGTFLAPDGTISGT NDCAPNELPIASIREGEVFGLHTIRYESDVDSITITHDAKSRGGFVLGAVLAAEYTATHE GFLGMSDLFPFLND >gi|225935354|gb|ACGA01000038.1| GENE 151 218410 - 219894 1473 494 aa, chain + ## HITS:1 COG:STM2582 KEGG:ns NR:ns ## COG: STM2582 COG0681 # Protein_GI_number: 16765902 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Salmonella typhimurium LT2 # 438 489 269 320 324 73 57.0 1e-12 MRQATRAQWIKCATAILLYLIFLIWVRSWWGLIVVPFIFDIYITKKIPWSFWKKSKNPAV RSVMSWVDAIVFALVAVYFVNIYIFQNYQIPSSSLEKSLLVGDFLYVSKMSYGPRVPNTP LSMPLAQHTLPVFNTKSYIEWPQWKYKRVPGFGKVKLNDIVVFNFPAGDTVAVNYQQTTD FYTLAYGEGQRIYSKQIEMDSLTRSQQRAIYDLYYDAGRKQILNNPRTYGEVLWRPVDRR ENYVKRCVGLPGDTLQIVDGQVMIDGKAIENPENLQFNYFVQTTGPYIPEDMLRELGISK DDTMLIEDSGWEGGLLDMGLDNRNAQGKLNPVYHLPLTKKMYDTLLGNKKLISKIVMEPE EYAGQMYPLNLYTKWNRNNYGPIWIPAKGATITLTEDNLPIYERCIVAYEGNKLEVKPDG IYINGEKTNEYTFKMDYYWMMGDNRHNSADSRYWGFVPEDHVVGKPIVVWLSLDKDRGWF DGKIRWNRLFKWVD >gi|225935354|gb|ACGA01000038.1| GENE 152 219937 - 220878 514 313 aa, chain + ## HITS:1 COG:YPO2717 KEGG:ns NR:ns ## COG: YPO2717 COG0681 # Protein_GI_number: 16122921 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Yersinia pestis # 16 307 78 327 332 78 28.0 1e-14 MNIRKFKWILAFAGAVVVVLLLRGFAFTSCLIPSTGMENSIFQGERILVNKWSYGLRVPF MSLFSYHRWCESPVRQQDIVVFNNPAGIREPIIDRREIYISRCLGVPGDTLLVDSLFSVI SPEAQFNPDKKRLYSYPASKENLITSLMHTLSITNDGLMGSNDSTHVRSFSRYEYYLLEQ AMNGKESFVQPLSNREDAEPNPLIVPGKGKFIRVYPWNITLLRNTLVMHEGKQAEIKNDT LYVDGKPTQHCYFTKDYYWMGSNNTVNFSDSRLFGFVPQDHIIGKASIIWFSKEKETGLF DGYRWNRFFRTVK >gi|225935354|gb|ACGA01000038.1| GENE 153 220970 - 221596 604 208 aa, chain + ## HITS:1 COG:no KEGG:BF0179 NR:ns ## KEGG: BF0179 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 208 1 208 214 362 85.0 4e-99 
MKIAYLSSAYLAPVEYYTKLLAYDKVFVEQHDHYIKQTYRNRCTIAGPSGELALSIPTVK PDTLKCPMKDIRISDHGNWRHLHWNAIESAYNSTPFFEYYKDDFRPFYEKKYEFLIDFNE ELCRMVCELIDIHPTMERTSEYKMEFAPGEFDFREVIHPKKDFREVDTEFIPQPYYQVFE PKLGFLPNLSIIDLLFNMGPESLLVLGK >gi|225935354|gb|ACGA01000038.1| GENE 154 221651 - 223360 1175 569 aa, chain - ## HITS:1 COG:CC2801 KEGG:ns NR:ns ## COG: CC2801 COG1874 # Protein_GI_number: 16127033 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Caulobacter vibrioides # 23 569 125 628 628 385 39.0 1e-106 MRNTLFGILFLFILPLQAHQKLPYLQKQGSTTQLMVDGKPFLVIGGELGNSSASSIEDIE RIFPKLQRMGLNTVLVPAYWDLTEPQEGKFDFTLTDKVIQQARANDLKVVFLWFGAWKNS MSCYAPIWFKEDYKKYPRAYTKAGKPLEIASSFSENVFQADSRAFSQWMKHIASVDKEEG TVIMIQIENEIGMLEDARDYSKEADKLFYAPVPSLFIDYLQKNKRSLHPEMLAKWESQGF KKKGTWQEVFGADVYTDEIFMAWSYAQYVERMAKLARSIYNIPLYVNAAMNSRGRKPGEY PSAGPLAHLIDVWHYAAPNIDFLAPDLYDKGFVDWVAKYKLHNNPLFIPEIRLEDNDGVR AFYVFGEHDAIGFCPFSIESGSDRADAPLVQSYIKLKELMPLLTKYQGKGVMNGLLFDEE NKERILSYDDLEITCRHYFTLPWDPRARDGTVWPEGGGVLLRLAPDEYIVAGSGLVLEFK KQGENKMKSSPALGEDGFASVGGKASKTENSWQGGMRAGIGSVDEVNVNEDGSLKYIRRL NGDQDHQGRHVRIPVGEFSILHVKLYEYK >gi|225935354|gb|ACGA01000038.1| GENE 155 223556 - 224044 320 162 aa, chain - ## HITS:1 COG:no KEGG:BT_3316 NR:ns ## KEGG: BT_3316 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 148 150 196 68.0 2e-49 MEEVLSNQQARPGDATQLMHVIFSSDDEMMSFYLTLNRFMNPESYLVERTDRKRLEDLAS TLCSNVAAFEAIRNYKSISVKEVIRGFGAHMMNTLISNTNRFQSADAVGTLMNCILNTTK NSWQFKKMDRNNDIHLQNVRYLLNRLDAAESNEEKNCEEVAI Prediction of potential genes in microbial genomes Time: Fri May 13 09:04:37 2011 Seq name: gi|225935353|gb|ACGA01000039.1| Bacteroides sp. D2 cont1.39, whole genome shotgun sequence Length of sequence - 78296 bp Number of predicted genes - 53, with homology - 52 Number of transcription units - 25, operones - 12 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 125 - 934 712 ## COG3177 Uncharacterized conserved protein - Prom 959 - 1018 5.8 - Term 1039 - 1093 11.2 2 2 Tu 1 . - CDS 1148 - 3394 2340 ## COG1472 Beta-glucosidase-related glycosidases - Prom 3524 - 3583 4.0 - Term 3504 - 3551 12.7 3 3 Op 1 . - CDS 3586 - 5583 1815 ## BT_3313 hypothetical protein 4 3 Op 2 . - CDS 5606 - 7096 1320 ## COG5520 O-Glycosyl hydrolase 5 3 Op 3 . - CDS 7127 - 8647 1377 ## BT_3311 hypothetical protein 6 3 Op 4 . - CDS 8662 - 11667 2820 ## BT_3310 hypothetical protein - Prom 11740 - 11799 9.5 - Term 11778 - 11815 2.2 7 4 Tu 1 . - CDS 11878 - 13521 1168 ## BT_3309 transcriptional regulator - Prom 13649 - 13708 4.8 8 5 Tu 1 . + CDS 13703 - 14491 439 ## COG1712 Predicted dinucleotide-utilizing enzyme - Term 14327 - 14376 -0.6 9 6 Tu 1 . - CDS 14474 - 15358 732 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 15455 - 15514 6.1 - Term 15644 - 15684 -0.4 10 7 Op 1 . - CDS 15754 - 15924 56 ## gi|260172606|ref|ZP_05759018.1| hypothetical protein BacD2_12136 11 7 Op 2 . - CDS 15867 - 17543 933 ## COG3291 FOG: PKD repeat - Prom 17589 - 17648 2.6 - Term 17573 - 17609 3.0 12 8 Tu 1 . - CDS 17752 - 18852 440 ## BDI_2291 putative transcriptional regulator - Prom 18973 - 19032 3.5 13 9 Tu 1 . - CDS 19109 - 19732 676 ## BF4289 hypothetical protein + Prom 20033 - 20092 4.5 14 10 Op 1 6/0.000 + CDS 20127 - 20693 495 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 20737 - 20796 3.5 15 10 Op 2 . + CDS 20859 - 21806 795 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 21859 - 21900 2.8 + Prom 21854 - 21913 3.3 16 11 Op 1 . + CDS 21939 - 25277 2205 ## Phep_1362 TonB-dependent receptor 17 11 Op 2 . + CDS 25283 - 26569 1100 ## Phep_1361 RagB/SusD domain protein 18 11 Op 3 . + CDS 26583 - 27521 712 ## Phep_1360 exopolysaccharide biosynthesis protein 19 11 Op 4 . + CDS 27534 - 29315 1455 ## COG3391 Uncharacterized conserved protein 20 11 Op 5 . 
+ CDS 29322 - 30581 868 ## Phep_1359 NHL repeat containing protein + Term 30588 - 30621 5.5 21 12 Tu 1 . + CDS 30643 - 31545 645 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 31624 - 31685 4.5 - Term 31623 - 31661 5.0 22 13 Op 1 . - CDS 31705 - 32802 937 ## Phep_1387 hypothetical protein 23 13 Op 2 . - CDS 32816 - 34384 1423 ## BF2880 hypothetical protein 24 13 Op 3 . - CDS 34470 - 35690 1084 ## COG0612 Predicted Zn-dependent peptidases - Prom 35768 - 35827 4.6 + Prom 35649 - 35708 3.0 25 14 Op 1 . + CDS 35796 - 36329 393 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 26 14 Op 2 . + CDS 36336 - 36941 488 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 27 14 Op 3 . + CDS 36929 - 37849 746 ## COG0524 Sugar kinases, ribokinase family + Term 37911 - 37961 4.2 + Prom 37859 - 37918 6.8 28 15 Tu 1 . + CDS 38002 - 39618 632 ## BT_3091 putative regulatory protein + Prom 39708 - 39767 6.0 29 16 Op 1 . + CDS 39833 - 40996 811 ## COG4833 Predicted glycosyl hydrolase 30 16 Op 2 . + CDS 41032 - 42501 1298 ## COG3538 Uncharacterized conserved protein + Term 42528 - 42568 6.2 + Prom 42583 - 42642 4.0 31 17 Op 1 . + CDS 42665 - 43732 773 ## COG2365 Protein tyrosine/serine phosphatase + Prom 43764 - 43823 4.9 32 17 Op 2 . + CDS 43848 - 45782 1753 ## COG0513 Superfamily II DNA and RNA helicases + Term 45956 - 46018 13.1 - Term 46273 - 46318 -0.8 33 18 Tu 1 . - CDS 46541 - 49198 1930 ## BT_2524 alpha-rhamnosidase - Prom 49320 - 49379 4.3 + Prom 49309 - 49368 6.6 34 19 Tu 1 . + CDS 49423 - 50208 575 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 50041 - 50094 3.2 35 20 Op 1 . - CDS 50237 - 52378 1952 ## BT_3289 hypothetical protein 36 20 Op 2 . - CDS 52398 - 53753 1257 ## COG1808 Predicted membrane protein 37 20 Op 3 . - CDS 53756 - 55252 1270 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 38 20 Op 4 . 
- CDS 55283 - 56059 621 ## COG4099 Predicted peptidase 39 20 Op 5 . - CDS 56122 - 57153 1028 ## COG2255 Holliday junction resolvasome, helicase subunit - Prom 57195 - 57254 5.3 + Prom 57088 - 57147 5.0 40 21 Op 1 . + CDS 57280 - 58512 1053 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 41 21 Op 2 . + CDS 58529 - 58945 457 ## COG0319 Predicted metal-dependent hydrolase + Term 59056 - 59114 1.1 - Term 58987 - 59034 10.1 42 22 Op 1 . - CDS 59103 - 59507 437 ## gi|160882930|ref|ZP_02063933.1| hypothetical protein BACOVA_00892 43 22 Op 2 . - CDS 59521 - 62238 1806 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 44 22 Op 3 . - CDS 62222 - 64579 1868 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 45 22 Op 4 . - CDS 64598 - 66922 1432 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Term 66924 - 66978 8.2 46 23 Op 1 . - CDS 67005 - 68678 1433 ## BT_3274 hypothetical protein 47 23 Op 2 . - CDS 68697 - 69443 636 ## gi|160882925|ref|ZP_02063928.1| hypothetical protein BACOVA_00887 48 23 Op 3 . - CDS 69465 - 70988 1170 ## BT_3272 putative outer membrane protein 49 23 Op 4 . - CDS 71003 - 74359 2683 ## BT_3271 hypothetical protein 50 23 Op 5 6/0.000 - CDS 74411 - 75625 971 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 75791 - 75850 8.2 51 23 Op 6 . - CDS 75870 - 76439 326 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 76566 - 76625 9.9 52 24 Tu 1 . + CDS 76514 - 76645 59 ## - Term 76625 - 76682 10.3 53 25 Tu 1 . 
- CDS 76808 - 78118 713 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 78170 - 78229 3.7 Predicted protein(s) >gi|225935353|gb|ACGA01000039.1| GENE 1 125 - 934 712 269 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 39 252 41 240 263 89 32.0 8e-18 MEKGIWLEIEELYKEFQQLGISQSVDYEKYYLYSLITHSTAIEGSTLTEMDAQLLFDEGV TAKGKPLVYHLMNEDLKKAYELAKKEAQRNTVITPAFLQKLNATLMRTTGGRHNTIGGSF DSSRGEFRLCGVTAGVGGRSYVGYQKVPVKVEELCFLLQERQKNVETFREQYELSFNAHL NLVTIHPWVDGNGRAARLLMNYIQFCYHLFPAKIFKEDRADYILSLQQAQDDETNQPFWD FMAVQLKKSLSLEIQKYKASHNKGFSFMF >gi|225935353|gb|ACGA01000039.1| GENE 2 1148 - 3394 2340 748 aa, chain - ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 19 748 14 714 793 427 35.0 1e-119 MKLKAMLLGLSVITALPAFAQKPVYLDTGKPIEERVKDALNRMTLEEKVKMIHAQSKFSS AGVPRLGIPEVWATDGPHGIRPEVLWDEWDQAGWTNDSCIAYPALTCLSATWNPEMSHLY GKSIGEEARYRKKDILLGPGVNIYRTPLNGRNFEYMGEDPYLSATMVVPYIKGVQENGVA ACVKHYALNNQEFNRHTTNVQLSDRALYEIYLPAFKAAVQEGGTWSIMGSYNLYQGEHAC HNKRLLRDILRDEWGFDGVVVSDWGGVHNTEQAIHNGMDLEFGSWTNGLSAGTRNAYDNY YLAFPYLKLIKEGKVGTKELDEKVSNVLRLIFRTSMDPHKPFGSLGSPEHGQAGREIAEE GIVLLQNNGNVLPIDLNKAKKIAVIGENAIKMMTVGGGSSSLKVKYEISPLDGLKSRVGS KAEVVYARGYVGDPTGEYNGVKTGQDLKDNRSEDELLAEALQVAKDADYVIFFGGLNKSN HQDCEDSDRASLGLPYAQDRVISELAKVNKNLIVVNISGNAVAMPWVNEVPAIVQGWFLG SEAGTALASVLVGDANPSGKLPFTFPAKLEDVGAHKLGEYPGNKEELAQSKHRGDTINEI YREDIFVGYRWADKEKIKPLFPFGHGLSYTTFAYGKPSADKKTMTVDDTISFTVNVKNTG TREGQEVVQLYISDKKSSLPRPIKELKGFQKVKLAPGEEKVVTLTIDKKALSFFDDAKHE WVAEPGKFEAIIGSSSRDIKGTVPFELK >gi|225935353|gb|ACGA01000039.1| GENE 3 3586 - 5583 1815 665 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 663 1 666 667 885 68.0 0 MKKNSYIILALAGMLSMNSCNDDEFLPGNPSMEIKAENADALFGDSLPFTIKASDVDVPL STLKAQLFYGEEQVFETVIRTKTSGNDYTGKIFVPYYANIPNGTATLKYILQNIHFTTTE MTKELALARPDFPYLTLVDEEGKEYRMERQSMYKYSVTGDFSQKMKAYIKTPKVGENGNE LTFGWENGAIEAGSTNAISFSNTEPGNYAVKFNTLTYEAEPFAKLKVNGEDMELVENDIY AIKLTLKKNDILAFEGVPDYDNWWIDQDYFEKQEDGTLKFLPIDGSYQITANGKMKYFSV IALKNGEAAKLQDDGTGAIWAIGTGIGKPSVALSEVGWTPENGLCMPQLTAKKYQLTFIA GVTMKVDDINFKFFHINKWDNGEFKGDAISTTSELVKISSDGNLGLEEGQKFERGGIYRF IVDVTKGNTKAVLTVEKVGKVDLPAPDIFFGNDKMEVTDTDIYKSDQAFTQGQMITVTGI DNLNEWWIDPDFFEKQSDGALKFLPINGDYRVTANAVLKYFSVMALKDGKPAKLQDDGTG AIWAIGKGIGKPSVTSSEVGWEPGKALCLAQVASKKYQLTLKAGETLKTSGDPEVISFKF FHQNDWGGEFGNYASNTLVEQLKLADSGNLEMQDNKAFEEGAVYRFTIDVTNGNANADLK VEKIN >gi|225935353|gb|ACGA01000039.1| GENE 4 5606 - 7096 1320 496 aa, chain - ## HITS:1 COG:CC1757 KEGG:ns NR:ns ## COG: CC1757 COG5520 # Protein_GI_number: 16126001 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: O-Glycosyl hydrolase # Organism: Caulobacter vibrioides # 27 492 15 469 469 240 32.0 5e-63 MIDMRNIALTFLGCFTILAACSNSDDAEKPVTPVPTGDVTIYATTSSLTRDLTRDAVNFS SKDNLAPTSITLNPTEQYQTMDGFGAAITGATCFNLLQMKPEDRHAFLTETFSDDKGFGF SYIRISIGCSDFSLSEYTCCDTKGIEHFALQSEEKDYILPILKEILSINPSIKVIAAPWT CPKWMKVKSLTDLTPLDSWTNGQLNPAYYQDYATYFVKWVQAFNAEGIDIYAVTPQNEPL NRGNSASLYMSWEEQRDFVKTALGPKFKTAGLATKIYAYDHNYDYSDIATEKNYPGKMYE DATASQYLAGAAYHNYGGNREELLNIHKAYPEKELLFTETSIGTWNSGRDLSKRLLEDMK EVALGTINNWCRGVIVWNLMLDNDRAPNREGGCQTCYGAVDISNSDYKTIIRNSHYYIIA HLSSVVKPGAVRIGASGYADSNIMYSAFENPDGTYAFVLMNNNEKTKKITLSDGKRHFAY DVPGKSVTSYRWAKSE >gi|225935353|gb|ACGA01000039.1| GENE 5 7127 - 8647 1377 506 aa, chain - ## HITS:1 COG:no KEGG:BT_3311 NR:ns ## KEGG: BT_3311 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 1 506 506 924 89.0 0 MKLRTIFYGLTSGLLLGLSSCSLNYEPLDTYSDVTEGVTSDGTKIVFKDKAAVESHLTTL YNQMRDRQEHWYVDLLLISDSHSDNAYAGTTGAEVVPFENNSIEGSNSVLERDWNRYLED 
VARANKLICNIDLVTDNSLTTAERAQYKAEAKIFRAMVMFDMVRLWGDFPVITTVADDIT SENIDEVYPQYFPKQNTELEAYQQIEKDLLDAVLYAPDNTPGNKTLFTKSVARTLLAKIY AEKPLRDYTKVIQYCDEVKADGFDLVDDFSDLFGMNAAGTDAKMRNTKESILEAQFTSGA GNWCTWMFGRDLVNWNNNFTWAKWVTPSRDLISAFKQEGDEVRFKESIVYYDCNWSNYYP SDNYPFMYKCRSANSSIIKYRYADVLLLKAEALIMQDTPDLEGAADIIDKVRDRAKLGAL PTSVRSNKNAMLNALLKERRLELAFEGQRWFDLVRLDKVEEVMNAVYAKDSGRKAQIYTF DKNSYRLPIPQSVIDANDKIHQNPGY >gi|225935353|gb|ACGA01000039.1| GENE 6 8662 - 11667 2820 1001 aa, chain - ## HITS:1 COG:no KEGG:BT_3310 NR:ns ## KEGG: BT_3310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1001 1 1001 1001 1811 93.0 0 MKKNRRKILSGSRKIFFAILAMFLSLSASAQQITASGQILDAQKEPLIGVSVQEKGTSNG AITDLDGNFTLNVKQNAILIFSYVGYKSQEVKAAHQMKITLQEDNEVLDEVVVIGYGSVK RKDVTTAISSVSTKDLDMRPIVSAGQAIQGKAAGVSVIQPNGTPGGEMSIRVRGTTSMNG SNDPLYVVDGVPVDNIKFLSPNDIESMQILKDASSASIYGSRAANGVILITTKAGAAGNA KVSLTAQFGLNKVADKVESLNAAQYKELQDEIGLVSLPDGLPDRTDWFDETYTTGKTQNY QVAVSNGNEKMKYYLSAGYLKEQGVLDISYYKRYNFRVNLENQVRKWLTVSANISYSDYT SNGGGAMGTGSNRGGVILAVINTPTYAPVWDALNPNQYYNNFYGVGNITNPLENMARAKN NKDKENRLLASGNILLTPFPELKFKSTLTLDRRNAVNTTFLDPISTAWGRNQYGEASDNR NMNTVLTFDNVLTYNKNFKKHGLEVMAGSSWTDSDYSNSWINGSHYRSDQIQTLNAANKI SWDNTGTGASQWGIMSFFGRVAYNFDSKYLVTANLRADGSSKLHPDHRWGVFPSFSAAWR ISSEKFMENLTWIDDLKLRGGWGQTGNQSGIGDYAYLQRYNIGRIEWFKKGGEGDSTDYA NAVPTISQANLRTSDLTWETTTQTNIGLDLTLLNGRLTFNADYYYKKTKNMLMNVSLPAG AAAATSIARNEGEMVNKGFELSISSKNLRGGAFTWDTDFNISFNRNKLTKLELQKVYYDA KTADVVNDYVVRNEPGRALGGFYGYISDGVDPETGELMYRDLNNDGKISSSDRTYIGDPN PDFTYGMTNTFSWKGFNLSIFIQGSYGNDIYNASRIETEGMYDGKNQSARVLNRWKIPGQ ITDVPKANFKLLNSTYFVEDGSYLRLKDVSLSYNVKGKLLKKWGITRLQPYFTATNLLTW TNYSGMDPEVNQWGNSGTVQGIDWGTYPHCRSYVFGINVEF >gi|225935353|gb|ACGA01000039.1| GENE 7 11878 - 13521 1168 547 aa, chain - ## HITS:1 COG:no KEGG:BT_3309 NR:ns ## KEGG: BT_3309 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 547 547 801 81.0 0 MKKKYLIFLLLLSFPLYTRADKTLDSLLNVLDLTIQEHETYVVQRESRIKHLKELTHGIE 
SNSAEQYNLNSQIYKEYKAFICDSAIHYLNENIRIAERLRDTDRKIESQLQLSLLLSSTG MYKESLDVLESVDRRKIIPRLIADYYTCFDHVYGELGVYTQDKTLSGRYWSISQAYRDSL YAILPPESEEYLLMREASFRDQRQYEDALKVNDLRLTKIEPYTPQYAMATYHRSLIYKYS NDSLGEKRNLCLSAISDIRSAIKDHASLWMLAQLLYEDGDMERAYQYMRFSWNATKFYNA RLRSWQSADVLSLIDKTYQAMIEKQNDRLQQNLLLITALLVLLIVALGYIYRQMKKLADA RNHLQVANKQLNGLNEELRQMNSCLSSTNIELSESNQIKEEYIARFIKLCSTYINRLDAY RRMVNKKVSAGQIAELLKITRSQDALDEELEELYANFDTAFLHLFPNFVGKFNDLLQENE QILPKKGELLNTELRIFALIRLGIEDSSQIAEFLRYSVNTIYNYRAKVRNKARGSREDFD DLVRKIR >gi|225935353|gb|ACGA01000039.1| GENE 8 13703 - 14491 439 262 aa, chain + ## HITS:1 COG:MA0958 KEGG:ns NR:ns ## COG: MA0958 COG1712 # Protein_GI_number: 20089836 # Func_class: R General function prediction only # Function: Predicted dinucleotide-utilizing enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 260 1 267 271 97 30.0 3e-20 MKKLVIVGCGRLAEIVADAVVKGLLPDYNLVGVYSRTASKAAHIVHKMQQHGKPCIACAT LEELLALKPDYLVESASPAAMRELALPALKNGTSVITLSIGALADETFYREVAETAKVNG TRIYIASGATGGFDVLRTASLMGNTTARFFNEKGPNALKGTPVYDDSLQTEQRTVFSGSA AEAIRLFPTKVNVTVAASRASVGPENMQVSIQSTPGFVGDTQRVEIKNDQVHAVVDIYSA TSDIAGWSVVSTLINIVSPIVF >gi|225935353|gb|ACGA01000039.1| GENE 9 14474 - 15358 732 294 aa, chain - ## HITS:1 COG:PH0520 KEGG:ns NR:ns ## COG: PH0520 COG1052 # Protein_GI_number: 14590422 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Pyrococcus horikoshii # 27 244 26 254 333 80 29.0 3e-15 MFQKLVAIEPVSLVPSAEKALYSFAGQVVMYPDIPSNDDEIIARIGDADAVLLSYTSRIN RYVLECCPNVKYIGMCCSLYSPESANVDIRYANERGITVTGIRDYGDEGVVEYVVSELVR CLHGFGQETWEDLPREITGLKVGIVGLGKSGGMIADALKFFGADISYYARSEKEAATAKG YRFLPLRELLAESEVICCCLNKNTILLHEEEFKQMGNRKILFNTGLSPAWDEAAFTEWLE GDNLCFCDTIGALGGEQLLNHPHVRCMQVSTGRTRQAFDRLSAKVLANLSEYNG >gi|225935353|gb|ACGA01000039.1| GENE 10 15754 - 15924 56 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172606|ref|ZP_05759018.1| ## NR: gi|260172606|ref|ZP_05759018.1| 
hypothetical protein BacD2_12136 [Bacteroides sp. D2] # 12 56 1 45 45 96 100.0 6e-19 MDCGAAWRSYEMEEHHATVIRAFPEVELQGGLIYFLKSNFMPDAVDDICCRKKIIS >gi|225935353|gb|ACGA01000039.1| GENE 11 15867 - 17543 933 558 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 361 477 1187 1305 1734 87 45.0 7e-17 MKKIFSLLVLCATVVFASCSKDDPVTDPVPEGIVVTTYEALLDALQTGGTSADAPTLITL GGNITVPAGGDYDTPPINGSGHFKIDGGGHTITWESTANNHHLLGNTSPDADAVYIELTN INLDRQDMNAAVGVYNGKITLGKDVALSGQDGNMILANGEKVALELGDGCELSYAAGSSP CCASAWFGAILVLNGGKTAAGAYIGLDCNFYPAVSYPLISVPKALTGDVHLLLKMIGTTP VAQGIGNYQLTQADCDRLKVNPESTVSLYMGQRDKYDGNFELHLDPAASFQIKLRPKNFT PPTSGNIDVTNMTDVVAQITIRAALEAGHTDLKLTGELSKIGIGGQWGTFANNTQITTCD LTEVTGWGTTPTLPELAFKDCTKLQEVTLPDGVQVIGEYAFIRCAALTTVNLSQVTRIDE YAFWECTSLTALTLDNVTTIDHDAFYGCTGLETLKIPKCTWFGNYIVTGCKALTRIEATA AGNFVDISDGRSSIERTAVFHNRTAHSGDNAFDPAKCDLVLNPDKYENGGAVPTASANNE WTVAQHGGLMKWKSITPP >gi|225935353|gb|ACGA01000039.1| GENE 12 17752 - 18852 440 366 aa, chain - ## HITS:1 COG:no KEGG:BDI_2291 NR:ns ## KEGG: BDI_2291 # Name: not_defined # Def: putative transcriptional regulator # Organism: P.distasonis # Pathway: not_defined # 11 366 16 390 393 139 29.0 2e-31 MLDIIYYFNIFGCLTMGMALLLIQTAPQLINPHYRNAKRFLGIASIIVALCNALIFYNRV QESVAEIFAMPVLVAAQLQAALFTFIVLILFHSPYVCRRNILRHLCPTLFFVGLYLLTIL LFPDVRIYSVDEYIANITNPVLLLRTVFAATYLTQIVIYVRLFRREQRNYIAKIENYFSD TDKYEFRWASRLFYEAACIGIAVLVFSIFPAPLFDGIITVVITVYYFDFGVRYINYQYKL YYEALPAIEEKEESQLTKESEGDKELEDEMAKLLLYLQQGVVLGDYAEALHIPERKLSVF INSTYGVSFKRWVNNKRVEYATEQMAKHPDYTMERIAELSGFAHKSHFCKIFREITGGSF TEYKNR >gi|225935353|gb|ACGA01000039.1| GENE 13 19109 - 19732 676 207 aa, chain - ## HITS:1 COG:no KEGG:BF4289 NR:ns ## KEGG: BF4289 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 203 3 204 206 134 38.0 2e-30 MAFYNLKKKPALTTKEGETETMYADIVYSGTILAERLIRGVAKRTGFKEGVIEGILMELK 
DDVLQYLGEGYRVELGEFGFFSAKVKASRLVANKNDIRSESVAFNGVNFRASKSMRVGIR GDLERRKCVDFNTSRKWGRDNLKKLVLQYIGEHGFITRATYTQLTGRLKNTALDDLKSFA AEGIIKREGRGNQMHFIAPPRKEPDGE >gi|225935353|gb|ACGA01000039.1| GENE 14 20127 - 20693 495 188 aa, chain + ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 9 177 9 184 187 63 27.0 2e-10 MKIDEIQCIKELRNGSYQAFTQIYEAYADRLYSFVLKQLKNRSLTQDIVQDTFLRLWDNR NQLNSFGNLQAFIFTIAKHQVIDYFRKQVNELQFEDFMEYCENQATDVSPEDILLYDEFL QQLQQSKKVLSQREHEIYELSREKHIPIKQIAEQLDLSEQTVKNYLTSALKILRSEMMKY NILFIFFL >gi|225935353|gb|ACGA01000039.1| GENE 15 20859 - 21806 795 315 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 122 285 157 321 354 79 32.0 1e-14 MKALFKDLYRQYLEKGVSPTDFGQFKEELNHIPDEELWNVMIDMEKDATTEIGMPPMMKK QIRKEMHQIIWRRRWIQITKYAAVIALLVTSSLGVYSLLNTPETQQMITANVKSGSKSEI ILPDGTKVQLNGATTITYDVNSSKQRLVQLSGEAFFDVAKNPDCPFRVIANGLQIEVVGT SFNVNTYKKGVVETSLLTGQIKISGGSLPQEYILTPGEKATYSSINNALKITQADVHVET GWCDDYLIFDSEPLIDVIEEIERWYGVEIELRYPQIGQDLLSGSFRHENIQNVIHSLSLQ YKFKYEIHKDKITIY >gi|225935353|gb|ACGA01000039.1| GENE 16 21939 - 25277 2205 1112 aa, chain + ## HITS:1 COG:no KEGG:Phep_1362 NR:ns ## KEGG: Phep_1362 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.heparinus # Pathway: not_defined # 3 1112 22 1140 1140 948 46.0 0 MNYTKLIFSFRKRRQNLLIFRNFTILLVALVLLPAGVSAQKGNVSVNINNGTVKTFIKEI EKQTRYTFVYRNNVLNDQAKVTVNCKNKPLDQVLSQVFTPLNVSYSLNNNTIVLVKKEVQ QQKKSEKKTIKGTVTDGRGEPIIGANVIQEGGIGTITDVDGNFTVTADPSKPLDISYIGY KKKSVRIGASPTVRIALEEDAHVMDEVIVIGYGSKTKRDVTSSIGTYKPGEVNVRQVLGV DELLQGRVSGVNITSASGVPGSKNRVSIRGIGSITAGNEPLYVIDGVPINNTSGDTGAWG 
AQSMNGLNDFNPSDVESIQILKDAASAAIYGSRATNGVILITTKKGSKGQAKVSIDTNVS FSNLTRTDKLDMADTDLFLEVLNEAIDNYNLQTNSTQARIDNPAPGKAQTNWLDLVLRTA VTYTTTASVSGGTDKTNYYLSANYKHNEGVIINNLLKRYNLKVNLDTEIKKWLKVGTSLN LSYSRNNRVPTGYNIGTSVITRAIEQRPWDSPYRPDGEYAVGGQELANHNPIQALNEEDV YIDNYRALGSLYMLFNITKDLNFKTTLGEDFNYTEEHIYYSADHPYGNKVGKLIDGRKSY ASTLWENVLTYKHSFAEDFSLDVMLGHSIQKDVTSSAAQTGIGFPSPSFDVNSVAAEFSD VSTGLSSFLLQSFFGRLSLNYKNRYLLTGTMRADGSSKFISNNRYGYFPSVSAGWNLGEE SWWKFPQTDVKLRASWGCTGNQGGIGSYAYQALAGGGYNYNGENGLGLTTAGNRDLKWEK AQQGDIGVDLSFFRGAITFTADAFIKDTKDLLYQKPTPATSGYTSQVCNIGSMRNKGLEF TLGANLGKGSFSWHSDFNISFIRNKLTALLDNNEILTTSSMHALKVGEPIGSFYMIKWKG IYQSDDEIPAKIYDQGVRAGDCIYEDVDGNDVIDENDKQFVGSANPKFTGGFNNTFKYKG FDLSMFFTFSSGNKLYELWTGGLRMGNGTWPILKSSAESRWTGPGSTNKNPRAIYGYTWN STKFVNTRMLHDASYIRCRTASIGYTLPKSWINRLHIDNLRIYFQADNLFVLTKWPYLDP EVNVSLSATNMGYDYLYPSQPRTFTIGVNLKF >gi|225935353|gb|ACGA01000039.1| GENE 17 25283 - 26569 1100 428 aa, chain + ## HITS:1 COG:no KEGG:Phep_1361 NR:ns ## KEGG: Phep_1361 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 18 428 18 426 426 405 51.0 1e-111 MKNFIISLLTVLLVFATSCNEMDQYPHNAVSSDNLTEEDAQLLLTGLYFYIQNKPTVNGY LTQDIVGGDLVRGGATGLKDPVLLVKDLVTPESGFVSGPWDGFYTALYQVNSLIVALDKL AASQSRNEILGVASFFRGLIYYHLVSRYGEVPILEAPFSGDIAASTEEEGWSFVEKNFQV AIDYAPTFSDKYYVSKQAAKALMARTKLAQGKLTEAAKLAEEVIGDANFSLADFDQIFRG KANREEIFSFVNLLNESSVNLSASLYSRASANGGSYTYAPTTKVMNMFEPDDKRTAISID MQETNEVINKYPGGEVTTDPIIITRLGEMYLISAEAQGLSKGLSRLNELRNFRGLPSVHP ATEEDFIDAILNERHTELLAEGFRWFDLVRLNRLESDLGFERKYNRLPIPAKERSLNKLL NQNSYWAN >gi|225935353|gb|ACGA01000039.1| GENE 18 26583 - 27521 712 312 aa, chain + ## HITS:1 COG:no KEGG:Phep_1360 NR:ns ## KEGG: Phep_1360 # Name: not_defined # Def: exopolysaccharide biosynthesis protein # Organism: P.heparinus # Pathway: not_defined # 16 307 7 297 303 226 40.0 1e-57 MNLLKYMQILSFVLPLAFLSCSNDTIEDVHFIPQTKIGQKLLAGSETVARIYTDTSFVVA LGVTETDVHFQKADSRSTHIFIIDIDLNEPGVSLEVGMPYDADVRNNFQRQTLTEMADYA DRPWHRVAAMINADFWDVSTMDIRGPIHRSGVILKNSFIFKESLPQQALSFIALTKDNKM 
VIADSVEYRGMQYNLKEVTGSGVIVLRDGEISGATYPGIDPRTCLGYSDDGHVYFMVVDG RVEFYSYGLTYPEMGSIMKALGCSWAVNLDGGGSTQMLIRHPIADIFQIRNRPSDGQERP VVNAWMVTVNEP >gi|225935353|gb|ACGA01000039.1| GENE 19 27534 - 29315 1455 593 aa, chain + ## HITS:1 COG:MA2021 KEGG:ns NR:ns ## COG: MA2021 COG3391 # Protein_GI_number: 20090869 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 311 471 142 315 341 68 30.0 3e-11 MKNSIFILSILLCCNLFCGCSESDSSDSRQLPPKLISIIPKAGSTGGTAIISGVYFSETI TDNEVFINGVRAEITDATQNRLVIALPDNPEGTYSIKVSVKGETVEGLKFTYATPQAPPE LTVLQVMPSSAYAGDLVTLIGQCFSTVVSENQVTINGAVAEVKEATSSQLKIIIPDTEEG SYPIRVKVGTKEAESPLFTYLHTVTLTTSSLAPTRGKAGEEIVISGEGFGMTIEENIVSI NGKPATVKAVTATTLSIIIPENPSGTYPVKVTVADKTVENLSFTYEDLSYTVETVAGNSA TTSTDGKGTAASFKFPQGLALAPNGDIWIAERGNNTIRKMDQEYNVSTVAKSGTVTFNAP WQGEFDPSGIYYVANKALNNIIKVTQDGTCSVFSAETTFKSPMSITFDANGNMYIADRDN KAVKKITSGGAVTNYDMSSLKAGPNCMAVDKKGRIFVGTGGTYQLHMFDTNGTLKTIFGT GVVPTVATYSDGEQNDLSKATMGATFGIAFGPDEILYITDYTMHTIRTLTPDAEGDYTKG TLKTIAGIPGTKGKIDGSALTATFNCPASVLVSDKVYIADEQNHLIRSITVNK >gi|225935353|gb|ACGA01000039.1| GENE 20 29322 - 30581 868 419 aa, chain + ## HITS:1 COG:no KEGG:Phep_1359 NR:ns ## KEGG: Phep_1359 # Name: not_defined # Def: NHL repeat containing protein # Organism: P.heparinus # Pathway: not_defined # 1 416 1 435 439 149 31.0 2e-34 MRKIYILLLAYSLLSLFAACNDNDYKASVTELRLVLVKPTNVYSGEIATILGRNFSTVPE ENAVFINDQQATVIEAFKDELKIILPEMAPGKYSIRVKSPSGELTGLELNYLKTPDQEYI VQTIVGQKGVFEMTDGVGTEATTKLPTGITFAPDGSLWFTERGYNYIRRISPDFLVTSLL DVAVDGSSAIWQGGFDSKGNYYFIDKGKGMLRKIETNSMTVSTIASGMKSPMNVAFDDED NIYVSARDNKAVYKFTPSGAKTTFATLNVSPNYIVFDKNKNMIVGTSNGYVLIQISPDGT QKTIAGDGVKGQEYYDGDPGNPLSAKVGATFGVAAGSDGCLYLSDNTYNCIRKLTPDASG DYSKGTLETIAGSGKSGFSDGKGLKATFNQPYEIIITKDCKTMYVAGAVNYLIRRITVK >gi|225935353|gb|ACGA01000039.1| GENE 21 30643 - 31545 645 300 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: 
Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 292 36 294 306 119 33.0 7e-27 MTKKLFSLLLILLTAFSISAQTRTDKLLKNLHDNKSKYIFVIAHRGDWRNAPENSLQSIE KAIAMKVDMVELDIQPTKDGNFICMHDETLDRTSTGKGTIKNYTTEELKKFVLRSGNGIK TRQPIPTLKEVLNVCKDRILVNIDKGGTYIKEIMPIIKECGMEKQVIIKGYYPVEKVKKE YGSNESILYMPIVNLWDKEAVATIQTFIKDFTPIAYELCFKDDTTPSLRIIDEIIKSGSR IWMNTLWDSLCGGHDDENALLEGKDRHWGWMLKHKATMIQTDRPQELIHYLEEKGLRNLQ >gi|225935353|gb|ACGA01000039.1| GENE 22 31705 - 32802 937 365 aa, chain - ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 25 362 20 358 358 324 46.0 4e-87 MKNLIRNVWVCLGLMVLPFTASGQQKDNITYVPAQELLLVGKATTEGGYFHRVDTAKYCT MPPTVKKLFTNSAGLAISFTTNSPVIKAKWMVPDNYQLPNLTRVAQEGLDLYIKRDGKWQ FAGVGIPGGVTTEKVLVDNMGTEEKECLLYLPLYDELKSLEIGVSSDAHIRKGENPFKEK IVVYGSSILQGASASRPGMAYPARLSRSSGYNFINLGLSGNGKMEKEVAEMLADIDADAF ILDCIPNPSPKEITDRTVDFVMTLRQKHPDTPIIVIQTLIRETGNFNQKARENVKRQNEA IAEQVEVLRKKNVKNLYFIKEDQFLGTDHEGTVDGTHPNDLGFDRMLKKYKPAISKILKI KFKTE >gi|225935353|gb|ACGA01000039.1| GENE 23 32816 - 34384 1423 522 aa, chain - ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 6 519 10 524 525 635 59.0 0 MLVTGRKLPIGIQTFEDIRNDGYLYVDKTALMWTIANIGKPFFLSRPRRFGKSLLISTFE AYFKGRRDLFTGLAVEQLEKKWEEYPVLHLDLNAEKYDSPDRLDAILSNQLTQWEAIYGR GEDETTLSSRFLGVIRRASEQAGRGVVVLVDEYDKPLLQAIQNEPLLDSYRSTLKAFYGV LKSADRYLRFAFLTGVTKFSQVSVFSDLNQLNDISLNYDFSTLCGITREELLANFEPEIA ALSKANDINTKEVVETMTQQYDGYHFHPNGEGVFNPFSVLNAFFSKEFGNYWFQTGTPTF LVELLKESDYDLRLLMDGIETAASAFTEYRADRKNPIPLIYQSGYLTIKDYDREFRLYRL GFPNDEVRYGFLNFLLPFYTAVTDEERSFYIGKFVQELRTGNVDAFMHRFEAFFADFPYE LNDQTERHYQVIIYLIFKLMGQFTQAEVHSSRGRADAVVRTPKFIYIFEFKLNGTVEQAM EQIEEKGYAFPYTAEDQQVIKVGVEFSAEKRNVERWMVAKEF >gi|225935353|gb|ACGA01000039.1| GENE 24 34470 - 35690 1084 406 aa, chain - ## HITS:1 COG:BMEI1451 KEGG:ns NR:ns ## COG: BMEI1451 
COG0612 # Protein_GI_number: 17987734 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Brucella melitensis # 1 395 60 453 490 211 34.0 2e-54 MQGNEYTLPNGLRIIHEPTLSKVAYCGFAIDAGTRDEAENEQGMAHFVEHLIFKGTEKRK AWHILNRMENVGGDLNAYTNKEETVVYAAFLKEHLERALELLGDIVFHSTFPQHEIEKET EVIIDEIQSYEDTPSELIFDDFEDMIFRNHPLGRNILGKPELLRSFRTEDVLSFTRRFYQ PGNMVFFVQGQYEFKRIIRLVEKYLLDIPDVKVENRRTPPPLYVPEHLTVARDTHQAHVM IGSRGYNAYDDKRTALYLLNNVLGGPGMNSKLNVSLRERRGLVYNVESNLTSYTDTGAFC IYFGTDVDDMDTCLKLTYKELKRMRDTKMTSSQLAAAKKQLIGQIGVASDNFENNALGMA KTFLHYHKYESSELVFKRIEELTAEMLLEVANEMFAEEYLSTLIYK >gi|225935353|gb|ACGA01000039.1| GENE 25 35796 - 36329 393 177 aa, chain + ## HITS:1 COG:BH3084 KEGG:ns NR:ns ## COG: BH3084 COG1611 # Protein_GI_number: 15615646 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Bacillus halodurans # 3 161 2 160 187 114 38.0 6e-26 MEKIGIFCSASDNIDKMYFESASQIGKWMGQTGKTLIYGGASLGLMECIARVVKENGGKV IGVVPAKLEENGKVSSLLDEEIHTRNLSDRKDIITEKSEVLVALPGGVGTLDEIFHVIAA ASIGYHRKKVIFYNEYGFYDELLKALHTLENKGFARQPFSTYYEVANTLNELKEKIN >gi|225935353|gb|ACGA01000039.1| GENE 26 36336 - 36941 488 201 aa, chain + ## HITS:1 COG:YPO3577_1 KEGG:ns NR:ns ## COG: YPO3577_1 COG0794 # Protein_GI_number: 16123721 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Yersinia pestis # 6 194 17 201 212 142 42.0 6e-34 MIDSIKQLLQQEAQAVLNIPVTDAYEKAVKLIVEQIHQKKGKLVTSGMGKAGQIAMNIAT TFCSTGIPSVFLHPSEAQHGDLGILQKNDLLLLISNSGKTREIVELTRLAHNLDPDLKFI VITGNPDSPLAKESDVCLSTGKPAEVCVLGMTPTTSTTAMTVIGDILVVQTMKETGFTIA EYSKRHHGGYLGEKSRSLCEK >gi|225935353|gb|ACGA01000039.1| GENE 27 36929 - 37849 746 306 aa, chain + ## HITS:1 COG:VCA0656 KEGG:ns NR:ns ## COG: VCA0656 COG0524 # Protein_GI_number: 15601414 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Vibrio cholerae # 1 
257 18 276 323 90 24.0 3e-18 MRKVIGIGETILDIIFRGDQPSAAVPGGSVFNGIVSLGRMGINVGFISETGNDRVGNIIL QFMRENHIPTDHVNVFPDGKSPVSLAFLNEQSDAEYIFYKDYPKQRLDVLFPKLEEDDIV MVGSYYALNPVLREKILELLDQAREKKAIIYYDPNFRSSHKNEAMKLAPTIIENLEYADI VRGSLEDFLYMYNMQDIDKIYKDKIKFYCPRFICTAGAEKVALRTNLVNKDYPIEPLQAV STIGAGDNFNAGLIYGLLKYDVRYRDLNNLNEEIWDKIIQCGKDFAAEVCRSFSNSVSVE FAQKYK >gi|225935353|gb|ACGA01000039.1| GENE 28 38002 - 39618 632 538 aa, chain + ## HITS:1 COG:no KEGG:BT_3091 NR:ns ## KEGG: BT_3091 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 534 4 547 550 318 36.0 5e-85 MKRFILPLLIIFIFISLPAKANDEVKTLLKVLDKSLQNKASYTQQKQRQIDSLKIILRQS RDIRKKVGTAQSLCFEYSSFQKDSALAYAIHMNRFAQESNDKELLIEAKLDYSRILSSMG FFKEALAITNSMQQKQLSPKLKAEYFLGQVTIYNHQKAFASNEDDSQENDLIAQIYRDSL LQCKEVPSNVRAFITAPTLLFHKKYDDAIHILDSTYQSYTPYSRNAGIIAYSLASAYQGK NDHENTIKYFAISAISDVLNGARENLSLKILAKLIFESGDIDRASKYMKNAMEDAILCNA RINTIEASDMYLFIDKAFQEKEKHKFIIITALLIALCIVCILLFILSVQLKKQKGKVEQA NESLSYHLYEIQQMNSILADNNKIKEEYVGLYMEQYTSYISKIANFKKRALKIAKSEDIK KVTSFLHSSLNTEEDLAEFYNNFDKAILNLFPNFVEDFNALLLPENAIIPGPGKLLTPEL RIFALIRLGITDSVKIAHFLQYSLSTIYNYRSKMRIKANGDRNEFEEKVARIGQQIRE >gi|225935353|gb|ACGA01000039.1| GENE 29 39833 - 40996 811 387 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 133 331 90 287 341 82 31.0 2e-15 MNGLLLNVICAFTIANTNPNIEKAQQTLDALYQNYAAPNTCLLRENYPFDQDSKATYLAS EEQAKRRNEYSYLWPYSGTFSAVNALLESTGNKKYKKLLENKVLPGLEEYFDTRREPFAY SSYISSQPLSDRFYDDNVWLGIDFTDFYRMTGKQAYLEKAKLIWKFILSGKDDVLGGGVY WCEQKKESKNTCSNAPGAVFALKLFQATQDDAYLKEGKELYEWTKKNLEDSKDHLYFDNI SLNKKTGRAKFAYNSGQMMQAAALLYQITKQKSYLEDAQNIAEACHKHFFTQFTSPDGQT FQLLKKGNIWFTAVMLRGFIELYQIDHNKTYLTDFQRSLDYAWHHARDERGLFQTDWSGT DKNEKKWLLTQAAFIEMYGRLGGIDLQ >gi|225935353|gb|ACGA01000039.1| GENE 30 41032 - 42501 1298 489 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: 
XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 28 476 46 497 516 479 48.0 1e-135 MKKRNIGICIITTTLLMGGPAKATELILAKDHTNVVNQAAEIAAYKSNRPPVNKRLFKSK AVEAEIIRVKKLLTNQKLAWMFENCFPNTLETTVHYRTTDGKPDTFVYTGDIHAMWLRDS GAQVWPYIQLVNKDPELKKMLEGVIRRQFKCINIDPYANAFNDGAVGGDWMSDLTDMKPE LHERKWEIDSLCYPLRLAYQYWKETGDASIFDNEWIQAIANILTTFKEQQRKEGVGPYKF QRKTERALDTLNNNGLGAPVNPVGLIVSAFRPSDDATTLQFLVPSNFFAVSSLKKAAEIL NVVNKNTSLAKQCTDLAQEVETALKEYATYNHPKYGTIYAFEVDGFGNHLLMDDANVPSL LAMPYLGDVDINDPIYQNTRRFVWSKDNPYFFKGKAGEGIGGPHIGYDMIWPMSIMMKAF TSQDDQEIKSCIKMLMDTDAGTGFMHESFHKDDPKNFTRAWFAWQNTLFGELILKLVNEG KIDLLNNIQ >gi|225935353|gb|ACGA01000039.1| GENE 31 42665 - 43732 773 355 aa, chain + ## HITS:1 COG:lin2049 KEGG:ns NR:ns ## COG: lin2049 COG2365 # Protein_GI_number: 16801115 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 39 352 12 325 326 125 31.0 8e-29 MYRNLLSWLTVLLVLPSCSGTSPAISVVCEENNVGNCIIKWETAPVLKGQVKVYTSTSPE AIPEDSPIAMANISSGKMTIVTNDPSQRYYYLMVFNNKYRVKVATRNINIPGIQNFRDLG GYESAGTGKSLRWGMIYRSAQIDSIPPCSRRELKNMGVRTIIDLRSENERHNYPQLHDDE FNIIHIPILTGNMEEILQGIQEEKIKSDTIYRLVEQMNRELVINYQKEFKKLFTVLLDRT HYPVVIHCTSGKGRTGVVSALLLAALGVNEDVIMEDYRLSNDYFNIPKASQYAYKLSVNS QEAITTIYSAKEDFLNAAKEQIEAEYGSVQTYLKKGIGLSAEEIEQLRSILLTDN >gi|225935353|gb|ACGA01000039.1| GENE 32 43848 - 45782 1753 644 aa, chain + ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 1 544 5 524 539 372 38.0 1e-102 MKTFEELGVSPEIRRAIEEMGYENPMPVQEEVIPYLLGENNDVVALAQTGTGKTAAFGLP LLQQVDVKNRIPQSLILCPTRELCLQIAGDLNDYSKYIDGLKVLPVYGGSSIDSQIRSLK RGVHIIVATPGRLLDLMERKTVSLSTIHNVVMDEADEMLNMGFTESINAILADVPQERNT 
LLFSATMSPEIARISKNYLRNAKEITIGRKNESTNNVKHVVYTVQAKDKYEALKRIVDYY PQIYGIIFCRTRKETQEIADKLMQEGYNADSLHGELSQAQRDAVMQKFRIRNLQLLVATD VAARGLDVDDLTHVINYGLPDDTESYTHRSGRTGRAGKTGTSIAIINLREKGKMREIERI ISKKFIVGEMPTAAGICQKQLIKVIDDLEKVKVNEEEIADFMPEIYRKLEWLSKEDLIKR MVSHEFNRFAEYYRNRAEIEVPTDIRGERTGRGDRKEGGFEKRSRQAAPGFNRLFINLGK TDSFFPSDLIGLLNSNTRGRIELGRIDLMQNYSFFEVPEKEATNVIKALNRAKWNGRKVV VEIAGAGEESGKGRENGSNERKGGKRFGGKSDERAPRYENKDRKPKDASAKDSKSTKKDK PSRADRGYSDARGPKKKDDWQEFFKDKEPDFSEEGWARRKPKKK >gi|225935353|gb|ACGA01000039.1| GENE 33 46541 - 49198 1930 885 aa, chain - ## HITS:1 COG:no KEGG:BT_2524 NR:ns ## KEGG: BT_2524 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 877 1 877 881 1553 83.0 0 MKRLTLLPGLLFLMLQAAFSQESGSCKPVNLVCDYLVNPLGIDNPAPRLSWMLNDARKGA RQTACQVIVDKDSLELIAGKGTIWDSGKKDSEQILITYSGQELRPFTKYYWRVNVWDKDG VKSSSAINSFETGMMGMEHWQGAWISDNKDINYRPAPYFRKVFDARKKIRSARAYIAVAG LYELYINGEKIGNHRLDPLYTRFDRRNFYVTYDITSQIQKGKNAIGVLLGNGWYNHQSMA VWDFHRAPWRNRPAFCMDVRITYEDGSVEVVSTERDWKTSSGALIFNSIYTAEHYDARLE QKGWNTVNFDDSKWNGVGYRAVPSQNVVSQQVQPIRAVETIPVKIWKKLNDTTYVFDFAR NMAGVTRIKVSGEEGTVVRLKHGERLYDNGRVNTSNIDVYHRPVDNSDPFQTDILVLSGK GEDEFMARFNYKGFRYVEVTSTNPLVLNENNLTAYFVHSDVPQKGMIHTSNALINRLWWA TNNAYLSNLMGYPTDCPQREKNGWTGDGHFAIETALYNYDGITVYEKWLADHRDEQQPNG VLPDIIPTGGWGYGTDNGLDWTSTIALIPWNIYMFYGDHKLLADCYENIKRYVDYVDRTS PTGLTSWGRGDWVPVKSHSSKELTSSVYFYVDTKILANAAKMFNKTEDYKYYSALANKIK NAINDKFLNRETGIYGSGVQTEQSVPLQWGIVPEELKRKVARNLAKQVEAAGFHLDVGVL GAKAILNALSENGEAETAYKLAAQDTYPSWGCWIANGATTLLENWDLNATRDISDNHMMF GEIGGWFYKGLGGIFPDPENPGFKHILLRPNFPSGLNELEARYQSPYGEICSKWERKKNR IVYHVTVPANSTATFYAPDNVKGERAVKLEAGKHILELPIKRAVY >gi|225935353|gb|ACGA01000039.1| GENE 34 49423 - 50208 575 261 aa, chain + ## HITS:1 COG:PAB0040 KEGG:ns NR:ns ## COG: PAB0040 COG0697 # Protein_GI_number: 14520295 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily 
# Organism: Pyrococcus abyssi # 9 256 53 291 295 68 29.0 2e-11 MQAGMQFESILFYRFLFACLALGGILLVNGQSFRIKRQDIPSLFLLALLYLMSAVFLFWG YKFMASGVATTIHFMYPVLTTLIMMLFFKEKKSGWRIAAIASAVAGVYFLSGGDTKTGSF SFLGLFIVLLSALGYALYLVTMSQLKIGQMKGLLLTFYVFLFGGILLFIGTETISQLQPI SKWHTAGNLILLALIPTVVSNLALVRAVKSIGSTLTSVLGAMEPVTAVCVGIFLFGEAFT TSIGVGIALIIAAVIVIILKR >gi|225935353|gb|ACGA01000039.1| GENE 35 50237 - 52378 1952 713 aa, chain - ## HITS:1 COG:no KEGG:BT_3289 NR:ns ## KEGG: BT_3289 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 713 713 1363 91.0 0 MKFRLTVLIVFSLCLSNVFADEGMWLLGNLRKNKQTDRVMKELGLQMPVNKIYDPKKPCL ADAVVSFGGFCSGVVVSEDGLVFTNHHCGFSSIQQHSSVEHDYLKDGFFARSLEEELPNP ELYVRFLLRTEDVTKRVLSAARHAKTETERRVAVDSIMNVISMEVSEKDSTLTGIVDAYY AGNEFWLSVYRDYNDVRLVFAPPSSVGKFGWDTDNWMWPRHTGDFSVFRIYANTKNGPAD YSPENVPYHPEYVAPISLDGYKEGSFCMTLGYPGSTERYLSSYGIEEMMNGINQAMIDVR GVKQTIWKREMDRRPDIRIKYASKYDESSNYWKNSIGTNKAIKHLKVLEKKRVAEAELRN WIQSHPEEREKLIRLFSSLELSYSNRRETNRALAYFGESFINGPELVQLALEILNFDFEA EEKLVITRMKKLLEKYDNLDLSIDKEVFAAMLKEYQSKVDKKFLPAMYEKIDTLYNGNIQ TYVDSLYATSNITSPKGLKRFLERDTTYNLIEDPAVSLSLDLIVKYYEMNQSISEASEQI EEGERLFNAAMRRMYADRNFYPDANSTMRLSFGTVGGYTPFDGATYDYYTTVKGIFEKVK EHAGDIDFAVQPELLSLLSSGDFGRYANAQGDMNVCFISNNDITGGNSGSAMFNAKGELL GLAFDGNWEAMSSDIVFEPDLQRCIGVDVRYMLFIIEKYGKAAHLIQELKMGR >gi|225935353|gb|ACGA01000039.1| GENE 36 52398 - 53753 1257 451 aa, chain - ## HITS:1 COG:SP1264 KEGG:ns NR:ns ## COG: SP1264 COG1808 # Protein_GI_number: 15901124 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 33 332 14 311 347 231 41.0 3e-60 MKTDERNKFAIKSFLGEYLDLRKDKDNELATVDSIRKGVEFKGANLWILIFAIFMASLGL NVNSTAVIIGAMLISPLMGPIMGVGLSVGLNDFELMKRSLKSFLITTAFSVTTATIFFLL APIAGSQSELLARTSPTIYDVFIALFGGLAGVVALSTKEKGNVIPGVAIATALMPPLCTA GYGLASGNLIYFLGAFYLYFINSVFISLATFLGVRVMHFQRKEFVDKTREKTVRKYIVLI VVLTMCPAVYLTFGIIKSTFYEAAANRFINDQLSFENTQVLDKKISYDHKEVRVVLIGPE VPDASISIARSKLKEYKLEDTKLIVLQGMNNEAVDVSSIRAMVMEDFYKNSEQRLQQQAV 
KISQLETTLEQYRTYDAMSRTLVPELKVLYPSITTLSIAHSLEVRVDSMKTDTVTLAVLK FARHPSVAEKEKISEWLKARVGTKKLRLITE >gi|225935353|gb|ACGA01000039.1| GENE 37 53756 - 55252 1270 498 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 5 458 6 442 479 75 20.0 2e-13 MAGLKSLAKDTAIYGLSSIVGRFLNYMLVPLYTAVLPASTGGYGVVSNVYAFTALMLVLL TFGMETGFFRFANKSGEDPMKVYANSLLSVGGVSLIFVFLCLLFLQPISNLLDYGDHPEF IAMMAVVVALDSFQCIPFAYLRYKKRPIKFAAIKLLSIVGGIGLNLFFLLVCPWLNVHCP STISWFYDPDYLVGYIFISNLIISVVQMFFFIPELTGFAYKLDRVLLKRMVVYSFPVLIL GLVGILNQTVDKMIYPFLFEDRQEGLVQLGIYAATSKIAMVMAMFTQAFRYAYEPFVFGK DREGDNRKMYAAAMKYFLIFSLLAFLAVMFYLDLLRYLVARGYWEGLGVVAIVMLAEICK GIYFNLSFWYKLTDKTYWGAYFSVIGCVIIVVLNILFVPVYGYLASAWASVAGYAVILLL SYWIGQKEYPIHYDLKSLGLYVLLAAVLYVIGEQVPIPNIVLRLAFRTVLLLLFIAYIIK KDLPLSQIPVINRFIKKK >gi|225935353|gb|ACGA01000039.1| GENE 38 55283 - 56059 621 258 aa, chain - ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 34 256 170 395 395 172 40.0 5e-43 MMKQWTTLLVFLFLSLSLSAQQEYGRDIFVSSKGDSLPYRMIHPESVKPGEKYPLVLFLH GAGERGNDNEKQLTHGGQMFLNPVNQEKYPAFVLIPQCPTDGYWAYTGRPKSLIPTEMPV GQEISPILQTLKQLLDSYLVMPEVDTQRVYIIGLSMGAMGTYDLVVRYPEIFAAAVPICG TVNPSRLSVAKDVKFRIFHGDADDVVPVKGSREAYKALKAAGADVEYIEFPGCNHGSWNP AFNYPGFMDWLFKQKKKR >gi|225935353|gb|ACGA01000039.1| GENE 39 56122 - 57153 1028 343 aa, chain - ## HITS:1 COG:NMB1243 KEGG:ns NR:ns ## COG: NMB1243 COG2255 # Protein_GI_number: 15677115 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Neisseria meningitidis MC58 # 14 330 22 338 343 388 61.0 1e-108 MEQEDFNIREHQLTSKERDFENALRPLSFEDFSGQDKVVENLRIFVKAARLRGEALDHVL 
LHGPPGLGKTTLSNIIANELGVGFKVTSGPVLDKPGDLAGVLTSLEPNDVLFIDEIHRLS PVVEEYLYSAMEDYRIDIMIDKGPSARSIQIDLNPFTLVGATTRSGLLTAPLRARFGINL HLEYYDDDILSNIIRRSASILDVPCSVRAASEIASRSRGTPRIANALLRRVRDFAQVKGS GSIDTEIAQFALEALNIDKYGLDEIDNKILCTIIDKFKGGPVGITTIATALGEDAGTIEE VYEPFLIKEGFMKRTPRGREVTELAYKHLGRSLYNSQKTLFND >gi|225935353|gb|ACGA01000039.1| GENE 40 57280 - 58512 1053 410 aa, chain + ## HITS:1 COG:PA5478_1 KEGG:ns NR:ns ## COG: PA5478_1 COG2715 # Protein_GI_number: 15600671 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Pseudomonas aeruginosa # 2 246 1 245 245 198 45.0 1e-50 MVLNYIWIGFFVIAFIIALIKVIVLGDTGIFTAIMNSTFDSSKTAFEISLGLTGVLALWL GIMKIGENSGLINALAHFLSPVLCRLFPDIPKGHPVLGSIFMNMSANMLGLDNAATPLGL KAMKELQELNPNKDTASNPMIMFLVINTSGLIIIPISIMVYRAQMGAAQPTDVFIPILLS TFISTLVGVIAVSIAQKINLINKPILILMGVICLFFSGLIYLFLSVSREDMGTYSTLIAN ILLFSVIILFILTGVRKKINVYDSFVEGAKEGFTTAVRIIPYLVAFLVGIAVFRTSGAMD FLVGGIGYIVGSCGVDTSFVGALPTALMKSLSGSGANGLMIDTMKELGPDSFVGRMSCVV RGASDTTFYILAVYFGSVGITKTRNAVTCGLIADFSGIIAAILISYLFFF >gi|225935353|gb|ACGA01000039.1| GENE 41 58529 - 58945 457 138 aa, chain + ## HITS:1 COG:TP0650 KEGG:ns NR:ns ## COG: TP0650 COG0319 # Protein_GI_number: 15639637 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Treponema pallidum # 37 122 40 135 160 70 38.0 8e-13 MAVTYQTEGVKMPDIKKRETTEWIKTVAASYGKRLGEIAYIFCSDEKILEVNRQYLQHDY YTDIITFDYCEGDRLSGDLFISLDTIRTNAEQFGASYEDELHRVIIHGILHLCGINDKGP GEREVMEAAENKALGMLR >gi|225935353|gb|ACGA01000039.1| GENE 42 59103 - 59507 437 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882930|ref|ZP_02063933.1| ## NR: gi|160882930|ref|ZP_02063933.1| hypothetical protein BACOVA_00892 [Bacteroides ovatus ATCC 8483] # 1 134 1 134 134 257 100.0 1e-67 MKKLLITAMALSCFALSYGQKKVEKQIIGKWCNPYTYKSTGELKGFEFKKGGKCSAINIP SLDLKTWKVDNGYLIVEGFSKEDDGKVEVYKTRERIGYLTSDSLQLIVNEGTPRLAFLYL NTKSIKELVTPEVK >gi|225935353|gb|ACGA01000039.1| GENE 43 
59521 - 62238 1806 905 aa, chain - ## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 634 877 419 660 660 84 26.0 1e-15 MTGEDKMNRIFVYIGIIAGLLLGGSLSVEAQKKPLDIEACTSWKRIDAPDISPTGRWVTY RISLMEYNPASKEEKKLHLFDSRTRKEILLNGDIERLEFYNNDQGAFYRLADSAGVMKTF LLSLPSGVKTEWKHKEAFRPVEGTPYSISVTNVSKDTVNHVPAFNRLVVRHLKTEVAFHI DSIGYHTLYDGGRSILFIRKKSDRNELCYGPLAGPYKTLYQSSVKSEPSSYSFNEKEMIG EFSVNDSLWYAFSLKKPGCNLLFDRKEIVLPDGMSVGRIDLSKSHNFLILELRDSQQISR RREESEKKPDKSFELELWTWNEMEVPTLQRGGRYRQDKSVTYIYDIFSKKLTEVAPFTVD LLLPSGAENLNYVLYTDESPYKMQREWLDRLPFDIYSVNVHTGAKRLIGRSYRTAPKWSV NGKWAVMYDPIAQVWNKFDGATGKVVNISDAIGYPMFMESYDKPAPAPAYGIAGWTADGN NVFLYDAYDWWKIDLTGERQPECLTKGYGRKHGKSIRKMTSNIDKDVFQKDEKVIVSLWD KDTMDEGVYQLDMKGRLRKLMEGNYVYTIHRFSDNHEYCVWNRQNVSEFRDLWWSKSDFS NPVKVTNANPQQADYKWGTVKLIKWTNYENKENKGLLYLPEDYDPQKEYPALVQFYETHS GELNIYHAPLLSSALGDPMYFASNGYIVFMPDVHFTVGTPGQSCYDAVVSGTKYLIEQGI AHPGKIGLQGHSWSGYQTSYLVTKTDLFTCANIAAPITDMVTGYLGIRNGSGLPRYFMYE ETQSRMGKTLWEAKDKYLASSAILEADKIHTPLLILHNDEDEAVAYEQGRALYLAMRRLQ RPAWLLNYKGEGHFVLGRGAQKDWTIRMMQFFDYYLKGTKEPRWMKEGIHLRERGIDQKY DLLKK >gi|225935353|gb|ACGA01000039.1| GENE 44 62222 - 64579 1868 785 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 101 754 102 735 738 127 24.0 6e-29 MKNSLFRYVCLVVALLICSFADAQQKANYKLAEKFRLLEQNPIIKYSTEVKPTFINGTDC FYYSFTTREGKKYYYVNPKKKEKRLLFDTAELLSKIAVYTKKAYSSADPYLSFTFMKDNE TIRIDFDRGLYTYNIHTKALKQLNEKPSYGNSDPYWMKYSPDSLYFLYASKDNLYFVGNQ KKGQDTIPVQLTTDGEPNYTFNREDEGKLEGRFGAESTHWIPGSHRFYAVREDNRKVRDL WLINSLSTPYPTLKTYKAELAGDKHVTQYELLIGDVDTREVKKIDINRWPDQYIDILYIS KDGKRLYFQRYNRPWNQSDICEVDVETGKVRVVIHEENKPYLDYQMRNVSFLNDGKEILF 
RSERNGWGHYYLYDTATGNLKNQLTDGIWVAGPVTKIDTIGRKMYFYGYGREKGIDPYYY ILYEAQLDRPNAVRLLTPENASHDVSISPSYRYMVDSYSTVSQEPVNVVRNRNGKVIMTL EKPDLQPVYEMGWKAPERFKVKAADGVTDLYGVMWKPADFDSTKVYPIISNVYPGPFFEY VPTRFTINDVYNTRLAQLGFIVITVGHRGGTPMRGKAYHTYGYNNMRDYPLADDKYAIEQ LAARYPFIDATKVGIYGHSGGGFMSAAAICTYPDFYSAAVSSAGNHDNRIYNKGFVEIHF GVDEKVKTTKDSLGVESTMYDYSVRVRPNQELVKNYKHGLLLFTGAMDKTVNPANTLRLV DALIKADKDFEMFVLPKCTHGFFGESEDFFEHKMWRHFARLLLHDNSADSDVDLNKDMIK DDRRR >gi|225935353|gb|ACGA01000039.1| GENE 45 64598 - 66922 1432 774 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 159 760 155 737 738 129 25.0 3e-29 MRKRILFIVLFAWSGILHAFSQQQANYELANEFQAFGLGGNFTANSLNLWPNEINGTDKF WFEFHTTVGKDYYFVDPERRLKEPLFDKGKLASGIAQITRGVVDRNKLELSDIKFSENLA SFSFSYKGRNYEYNRLTHQIVEKKKEKSRSLFDRYSWMKYSPDRKYLVYVKKHNLYVKGN GEMGMDTTEVQLTTDGVRYYSYAHDGGTDEEGEVMSDICWCPDSRHLYLVREDERLLRDF WVINSLDDRPSLTTYRYEFPGDKNVTQNELVIVDVIGRTVKKTDISKWPDQYINPLCVTK DSKYLFFERTKRTWDEVDLCSVNLSTMEVKEIIHEVDKPYRDPHARSVAILNDGKDILFR SERTGWGHYYHYDGNGKLKNVMTSGEWVAGQIASIDTLGRTVYLYGYGREKDIDPSYYML YKVHIDREGVTPLSTEDGQHQVKFLKSNRYYIDTYSRVDMEPKIMLKDNKGQGILELAKP DIELAYKAGWKKPERFVVKAADNITDLHGVMWKPADFDSTKVYPIISVVYPGPYFGQVPT SFTLEDRCCTRLAQLGFIVIAVSHRGDTPMRGKAYHRFGYGNMRDYPLADDKYAIEQLAR RYSFINGKKVGIYGHSGGGFMAAAAICTYPDFYKAAVSCSGNHDNNIYNRGWGECYNGVK EVEKVVKDSLGNETKEYGYKFNVKPNTEIAKNLKGHLMLVTGDMDKNVNPAHTYRMAQAL IEAGKDFDMLVIPGAGHGYGSADKYFEKKMYRFFAKHLLGDTRADYWGDINRNK >gi|225935353|gb|ACGA01000039.1| GENE 46 67005 - 68678 1433 557 aa, chain - ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 285 1 262 534 87 30.0 9e-16 MKKITVYILLLLLLSSICSCLDDKNNYDYTPINELKGTISNIEKNYSVGIDEDLVITPTF EFTIDKTDPDVSYEWRLDGELLNVKAPSYTFNSHKIGLFELTFSVIDNKTGVKYSASTDI 
LVRSPYQRGWVVLSDVDGKSTLSFIKIKTLYGVSETVNIWGEKVVRDSIAYHSVEKYLVK DLGTNPKGVFEHLGYPSTFGQVETVYDELVVMQDRWVELNGNTLEREVYTEDEFYGDLPV GGFKPVEAAMSYSAKFIRDENGYIYMHTKPVANDFHAGAYMSIPLWNYTRFSALYPVQKF KDRNMDAVLALREEDNSLVIIRNGGGVKNSSNDPTFYDNTVKDTGEIVEFEGEDAKNFQN MRDESIVTMKTASVNVNGDMMQAWVALMKGNDMSYQLRYFRMKNKKWDIYDYYENDLTSL KNKEYTDMAVFEQKQYVVIAMGNELWYCQYVKGMASAPKLIHTFDKKVIALSSNDIYLYP KYGVYNGQLGVALEDGSFFIYGVEEKDKQESELVNDVKIKQLYPNEESEKEGDNNFGKIV DVLYKIGNANQIVQYDL >gi|225935353|gb|ACGA01000039.1| GENE 47 68697 - 69443 636 248 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882925|ref|ZP_02063928.1| ## NR: gi|160882925|ref|ZP_02063928.1| hypothetical protein BACOVA_00887 [Bacteroides ovatus ATCC 8483] # 1 248 1 248 248 496 100.0 1e-139 MKRISFYIVIVLLGVSFFTSCEEQGLLTHTNDVSYIAFEKNMTTDTTGVSFKFYNEGENA KILLGVTISGKVQDKDLEFTVSVDPERTTLPATQYELPEKCVIKAGELTGEILVVLKYYE KLDTKAELLTLQVNDGGEVRQGPSVYSRAIISVSNLLFAPEWWTRNDGDVNNPYNIVEEW YLGRYSEKKYLMFLEELKKDGVVFDGKDMFILRKYALRLKNRIKDYNTQHPDEPMSDEYG KMEIPVAG >gi|225935353|gb|ACGA01000039.1| GENE 48 69465 - 70988 1170 507 aa, chain - ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 476 4 470 488 258 35.0 4e-67 MKKIYTIIFVSCLAAISFSSCSDWLDVRPADEIKEEYLFETGNGYRTALNGIYRKLATFD LYGSNLSWGLIDGWGQVYDLDKAPDSGGGKAMKKLSKFLYKNSELTPTTDAMWSAAWNII ANCNNLAQQVAEEDTMLFYKRTQEKNMILGEAIALRAYMHFDLLRIYAPSLAMNPGDRLF IPYVDKYPSYLSDRQTVGYCLERIIADLVEAQKILKDVDKSSEFTSGNRFTKSPSGEERF VWYRGFHLNYHAVTAELARVYLYAGQSEKAYETAKLLIDINADKGYYKAVTSSYSGPMNI ENGNIKMYEDIIFALYSTDQTDWDLEINHASDNATKPDDEKYLALSDAVITKFFGTESDK DWRLKYQLGPNTSSFYRSLKYKKQDEGSGFGKVNSTMVPMIRMSEVYYIAAEAIYDTDKE LAKTYLKTVKQGRGISSPDLSKSGTKQDFINLIVDDARREFIGEGQTFFLYKRLKRNLEG SDEKQSVEYPAIEDNLVMPLPDSESNI >gi|225935353|gb|ACGA01000039.1| GENE 49 71003 - 74359 2683 1118 aa, chain - ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
1118 1 1116 1116 1059 49.0 0 MREKRWLLCFLVAMCCTFSAWALPSQDKTVTLKLHNVSIETVLDAVKKQTGVNMLYNSQM FKGVPSVSLDVKNERWDTTLKLVLNPQGFDYVMKDGIVVIRKQKRENRVRGMVTDSHGEP IPGASIIVKGTRTGTSTNIEGEFTLDVKSDKVVLEVSFIGMKKQTVQVDATRRKTVEITL VDDVKTLEDVVVTGYSNVRKSSFTGSSTQISGDDLRKVSQTNIIGAIQSFDPSFRLVDNV QFGSDPNALPEMYIRGRSGFGVKELDKDQLSKSNLENNPNLPTFIMDGFEVSIEKVYDLD PTRIESLTVLKDAAATAMYGSRAANGVVVITTVAPKAGEVRVSYNFTGTLEMPDLRDYNL ANASEKLEIERLAGLFDKGQANIPSTAVGMNNYYKKYALIQKGVDTDWMSIPLQNSFNHK HSIYLEGGTPNLRYGVDGSFNIADGVMKGTERNRYSAGFSLDYRIKNLQVRNYVSFGHTK SKESPYGSFSDFSGLQPYDSPYKDDGTLREKLTYSKISGRDNNNPLYEATLGNYNWNSYD EVIDNLSFNWYLNDYWTVKGQFSVTKKYSKSEKFIDPLSSKTSVTGSKDNNLAGDLYTTN GESLDWNSNAFLYYTRSFNKQHNLNVSAGWEAASGTTESTNAHYRGFPSGEFHSLNYAAE VYKKPTRTENTTRRVSVLASANYTWNDIYLVDASVRFDGSSEFGANQKWAPFFSGGLGVN IHNYDFFKSNGYINKLKLRASYGRTGKVNFPAYAATTMYETLFDEWYITGYGAVLKALGN KDLTWEKTDKLSVGIETKTFHDRLTVDFEYYYNKTIDLVNDVTLSQTSGFSTYKDNMGQV ANKGVELKLRADIYRDRNWNVALWGNMAHNKNEILKISDSQKAYNERVAAYYKNEALYQE TLGLSHGAEYSVPLPQYEEGASLTSIWAVRSLGIDPTTGKEIFLNRDGSIADKWNAAQEV VVGNTEPKCSGAFGLNLSYKNWTLFASFLYEWGGQEYNQTLVNNVENADLVNKNVDLRVL TDRWKKPGDIAQFKDIKDRGMTTLPTSRFVQDKALVRLNSLNVSYDFNRDWIKKHLRMNL LRLEASTSDLINWSSIKQERGLSYPRSWKVEFSLKAQF >gi|225935353|gb|ACGA01000039.1| GENE 50 74411 - 75625 971 404 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 194 344 125 275 331 81 34.0 3e-15 MKKFENVYQDAALMRKVLLGEADEVEQKDLEKRLAECPDLRNVYEQLQSGETLKVAFGEY QKYSSKKAYQSFLQSIGQMESKGRKSRSRRVGWYAAAAVVTLAVALSFYMSNYISSEKEG KTLIQPGTQQAQLTLSDGSVIDVHRKEVSVVVDGVQVKYKKGVLSYQPTVTTQHEEKNIE EQSGKSNELVIPRGGENTVILADGTTVHLNAGSKLTYPVRFAGKRRIVRLEGEAYFDVAG DENHPFVVQTHLGEITVLGTEFNVNAYADTPVCYTTLVHGKVKFSTLNAETVTLSPGEQA VVFANSSTKRKVDLEEYVGWVDGMYIFNDRPLGDIMKTFERWYDIQVYYETPNLRDITYS GNLKRYGTINSFLDALELTGDLTYKISGRNILIYDKVEEQEWKR >gi|225935353|gb|ACGA01000039.1| GENE 51 
75870 - 76439 326 189 aa, chain - ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 7 176 9 187 199 59 25.0 4e-09 MNDKIDIIVAGVNRKDKKMWGDFYDRFYTALCVYVSKILPVPDAVEDLVQEVFISVWEGK RTFSDIKELTNYLYRACYNNALLYIRNNQIHDTILSSLAEEESMVDEDTIYALTVKEEII RQLYCYIEELPAEQRRIILMRIEGHTWEEIAERLEISINTVKTQKTRSYKFLRERLGDSI HSIILFLFF >gi|225935353|gb|ACGA01000039.1| GENE 52 76514 - 76645 59 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNANILFFSETQTLPTYKIKNKMQQGTKGASFKEEYKNKKYFD >gi|225935353|gb|ACGA01000039.1| GENE 53 76808 - 78118 713 436 aa, chain - ## HITS:1 COG:XF0506 KEGG:ns NR:ns ## COG: XF0506 COG5545 # Protein_GI_number: 15837108 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Xylella fastidiosa 9a5c # 84 348 78 351 488 64 23.0 3e-10 MKTLEYNSKRLAKRIQALMKESSNVIVSPNTSKNVPKMQEIDEQEAYKQEVHKHPVLLTR QVIDFLKVRYDFRYNLLTEETEFRPSGQREIPFCRIDKRELNTFCLEAHKEGINCWDKDL QRYIYSTLVESYHPFRLYMNELPAWDGTDRLKPLAKRVSDTPIWIKSFHTWMLGLAAQWL GVNQTQANSVAPILISEEQGRHKSTFCRMLMPPQLARYYSDNLKLTAQGNPERLLAEMGL LNMDEFDKFGAQKMPLLKNLMQMSSLNICKAYQKNFRSLPRIASFIGTSNRTDLLCDPTG SRRFICIEAEHDIDCTGIEHSQIYAQLKEELLAGARHWFNKEEEHALQRHNSAYYHVNPI EDIVCNIYAPAMLDEPDCLSLSAAIIHQELKKRFPAVLRNCTPIQLAQILTATGISRQHT RLGNVYLVKKVKGCEE Prediction of potential genes in microbial genomes Time: Fri May 13 09:07:33 2011 Seq name: gi|225935352|gb|ACGA01000040.1| Bacteroides sp. D2 cont1.40, whole genome shotgun sequence Length of sequence - 789 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 48 - 107 2.4 1 1 Tu 1 . 
+ CDS 142 - 780 545 ## BT_4231 hypothetical protein Predicted protein(s) >gi|225935352|gb|ACGA01000040.1| GENE 1 142 - 780 545 212 aa, chain + ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 207 212 238 54.0 1e-61 MAIVFDWYENPNASSEEEAALHPRIFMNGKVDTDTLCYKIHDYSSLTVGDVKNVLDNLSK ILGESLREGKEVHIEGIGYFYPTLAATGKVTRSTPHKTNKVAFKTVRFRPDSNLKGHFVG VRANQSKYVRHSEKVSEVEIDMLLKEYFAEHQMMTRRDFQEVCGLARTTAKTHLVRLRGE GKLVNIGLRNQPMYVPAPGYYGVSRDAAHPSR Prediction of potential genes in microbial genomes Time: Fri May 13 09:08:27 2011 Seq name: gi|225935351|gb|ACGA01000041.1| Bacteroides sp. D2 cont1.41, whole genome shotgun sequence Length of sequence - 186019 bp Number of predicted genes - 136, with homology - 134 Number of transcription units - 64, operones - 33 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 706 - 765 3.5 1 1 Op 1 . + CDS 788 - 2674 1457 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 2 1 Op 2 . + CDS 2725 - 3255 598 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 3 1 Op 3 . + CDS 3325 - 5160 985 ## COG0322 Nuclease subunit of the excinuclease complex + Prom 5313 - 5372 4.1 4 2 Op 1 . + CDS 5399 - 5851 380 ## COG1490 D-Tyr-tRNAtyr deacylase 5 2 Op 2 . + CDS 5871 - 6209 406 ## COG1694 Predicted pyrophosphatase 6 2 Op 3 . + CDS 6196 - 7098 910 ## COG0274 Deoxyribose-phosphate aldolase + Prom 7362 - 7421 8.4 7 3 Tu 1 . + CDS 7493 - 8173 461 ## BT_3262 hypothetical protein 8 4 Tu 1 . - CDS 8174 - 9148 720 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 9202 - 9261 7.4 + Prom 9128 - 9187 4.8 9 5 Tu 1 . + CDS 9237 - 12086 2384 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains + Term 12277 - 12313 6.5 + Prom 12354 - 12413 6.0 10 6 Tu 1 . 
+ CDS 12477 - 12872 463 ## BT_3259 hypothetical protein + Term 12911 - 12951 8.2 - Term 12899 - 12939 4.4 11 7 Op 1 9/0.000 - CDS 12961 - 13740 711 ## COG3279 Response regulator of the LytR/AlgR family 12 7 Op 2 . - CDS 13737 - 15770 1365 ## COG3275 Putative regulator of cell autolysis + Prom 15823 - 15882 6.5 13 8 Op 1 . + CDS 15912 - 16817 678 ## COG1045 Serine acetyltransferase 14 8 Op 2 . + CDS 16834 - 18273 1619 ## COG0116 Predicted N6-adenine-specific DNA methylase + Prom 18315 - 18374 2.3 15 9 Op 1 . + CDS 18418 - 18981 239 ## BDI_3007 hypothetical protein 16 9 Op 2 . + CDS 18996 - 21194 1824 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 17 9 Op 3 . + CDS 21260 - 22534 1600 ## COG0151 Phosphoribosylamine-glycine ligase 18 10 Op 1 . + CDS 22715 - 23725 615 ## BT_3252 hypothetical protein 19 10 Op 2 . + CDS 23710 - 24189 278 ## COG1238 Predicted membrane protein 20 10 Op 3 . + CDS 24253 - 25002 587 ## COG4121 Uncharacterized conserved protein 21 10 Op 4 25/0.000 + CDS 25047 - 25976 743 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 22 10 Op 5 . + CDS 26014 - 26787 218 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 23 11 Op 1 . - CDS 26861 - 27499 352 ## Athe_0520 hypothetical protein 24 11 Op 2 . - CDS 27496 - 27828 335 ## gi|160883372|ref|ZP_02064375.1| hypothetical protein BACOVA_01341 25 11 Op 3 . - CDS 27864 - 28214 325 ## BF2945 hypothetical protein - Prom 28270 - 28329 2.3 - Term 28583 - 28645 14.1 26 12 Op 1 . - CDS 28675 - 29940 1120 ## BT_1283 hypothetical protein - Prom 29986 - 30045 4.4 - Term 30006 - 30050 3.1 27 12 Op 2 . - CDS 30134 - 30364 258 ## Arnit_1148 DNA-binding domain-containing protein - Prom 30395 - 30454 6.8 28 13 Op 1 . - CDS 30570 - 32240 812 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) 29 13 Op 2 . - CDS 32267 - 33220 1110 ## BT_1282 hypothetical protein 30 13 Op 3 . 
- CDS 33285 - 34886 1614 ## BT_1281 hypothetical protein 31 13 Op 4 . - CDS 34905 - 38249 3086 ## BT_1280 hypothetical protein - Prom 38301 - 38360 4.4 - Term 38293 - 38321 -0.0 32 14 Op 1 6/0.000 - CDS 38472 - 39485 595 ## COG3712 Fe2+-dicitrate sensor, membrane component 33 14 Op 2 . - CDS 39539 - 40084 395 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 40129 - 40188 3.0 + Prom 40251 - 40310 4.6 34 15 Tu 1 . + CDS 40336 - 43749 3066 ## BT_3247 hypothetical protein + Prom 43764 - 43823 6.5 35 16 Tu 1 . + CDS 43858 - 44472 613 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 45036 - 45095 2.3 36 17 Op 1 . + CDS 45158 - 45388 163 ## gi|237720641|ref|ZP_04551122.1| conserved hypothetical protein 37 17 Op 2 . + CDS 45488 - 45979 508 ## gi|160882201|ref|ZP_02063204.1| hypothetical protein BACOVA_00147 38 17 Op 3 . + CDS 46000 - 46635 587 ## BT_2676 hypothetical protein + Term 46671 - 46729 10.1 39 18 Op 1 . + CDS 46741 - 47460 660 ## BT_2675 hypothetical protein 40 18 Op 2 . + CDS 47509 - 47979 502 ## gi|160882198|ref|ZP_02063201.1| hypothetical protein BACOVA_00144 + Term 48030 - 48082 16.2 - Term 48013 - 48077 18.5 41 19 Tu 1 . - CDS 48198 - 49439 1590 ## BT_3233 hypothetical protein - Prom 49610 - 49669 8.4 - Term 49614 - 49677 15.1 42 20 Op 1 9/0.000 - CDS 49698 - 50501 1068 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 43 20 Op 2 . - CDS 50542 - 51384 1004 ## COG3717 5-keto 4-deoxyuronate isomerase - Prom 51543 - 51602 6.7 + Prom 51681 - 51740 7.8 44 21 Op 1 . + CDS 51814 - 55176 2408 ## PRU_1888 TonB dependent receptor 45 21 Op 2 . + CDS 55181 - 56770 1468 ## PRU_0155 putative lipoprotein 46 21 Op 3 . + CDS 56798 - 58903 1629 ## COG0642 Signal transduction histidine kinase + Term 59037 - 59089 14.4 - TRNA 59463 - 59536 48.6 # Gln TTG 0 0 - Term 59562 - 59597 2.8 47 22 Op 1 . 
- CDS 59645 - 60937 712 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 60959 - 61018 6.8 48 22 Op 2 . - CDS 61024 - 61716 475 ## COG0084 Mg-dependent DNase - Prom 61746 - 61805 2.5 - Term 61748 - 61788 3.9 49 23 Op 1 . - CDS 62013 - 62402 173 ## PROTEIN SUPPORTED gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 50 23 Op 2 . - CDS 62408 - 63154 581 ## BF0075 uroporphyrinogen-III synthase 51 23 Op 3 . - CDS 63159 - 63992 337 ## BT_3225 hypothetical protein 52 23 Op 4 . - CDS 63989 - 64576 533 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 53 23 Op 5 . - CDS 64656 - 66896 1185 ## Cpin_4389 TonB-dependent receptor plug 54 23 Op 6 . - CDS 66896 - 67903 747 ## gi|237716900|ref|ZP_04547381.1| conserved hypothetical protein 55 23 Op 7 . - CDS 67919 - 68911 583 ## COG1651 Protein-disulfide isomerase - Term 68913 - 68972 17.4 56 24 Op 1 . - CDS 68985 - 69680 293 ## gi|260172708|ref|ZP_05759120.1| hypothetical protein BacD2_12646 57 24 Op 2 . - CDS 69683 - 69886 252 ## gi|237716902|ref|ZP_04547383.1| conserved hypothetical protein - Prom 69947 - 70006 5.6 - Term 69954 - 70006 1.1 58 25 Tu 1 . - CDS 70056 - 70226 80 ## gi|294644389|ref|ZP_06722152.1| hypothetical protein CW1_0710 - Prom 70404 - 70463 5.6 59 26 Tu 1 . - CDS 70816 - 71187 429 ## gi|260172710|ref|ZP_05759122.1| hypothetical protein BacD2_12656 - Prom 71253 - 71312 5.6 + Prom 71084 - 71143 2.7 60 27 Tu 1 . + CDS 71193 - 71393 62 ## 61 28 Tu 1 . - CDS 71314 - 72981 1003 ## BT_3570 TPR repeat-containing protein - Prom 73025 - 73084 6.8 - Term 73065 - 73130 12.0 62 29 Tu 1 . - CDS 73162 - 74454 1550 ## COG0192 S-adenosylmethionine synthetase - Prom 74477 - 74536 4.4 + Prom 74413 - 74472 3.5 63 30 Tu 1 . + CDS 74685 - 74873 271 ## BT_3217 hypothetical protein + Term 74931 - 74964 0.2 64 31 Op 1 . - CDS 74848 - 75357 278 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 65 31 Op 2 . 
- CDS 75333 - 76391 1202 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 66 31 Op 3 . - CDS 76409 - 77131 643 ## COG0130 Pseudouridine synthase 67 31 Op 4 . - CDS 77131 - 77985 743 ## COG1968 Uncharacterized bacitracin resistance protein 68 31 Op 5 . - CDS 77988 - 78230 271 ## BT_3211 hypothetical protein - Prom 78254 - 78313 2.8 69 32 Op 1 . - CDS 78344 - 79225 580 ## COG2177 Cell division protein 70 32 Op 2 . - CDS 79229 - 80128 807 ## COG2227 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase - Prom 80167 - 80226 5.8 - Term 80554 - 80612 11.0 71 33 Op 1 . - CDS 80636 - 80893 265 ## gi|295085889|emb|CBK67412.1| hypothetical protein 72 33 Op 2 . - CDS 80898 - 81128 206 ## gi|260172723|ref|ZP_05759135.1| hypothetical protein BacD2_12721 - Prom 81307 - 81366 3.0 - Term 81297 - 81356 11.1 73 34 Tu 1 . - CDS 81371 - 81553 200 ## gi|260172724|ref|ZP_05759136.1| hypothetical protein BacD2_12726 - Prom 81600 - 81659 3.2 74 35 Op 1 . - CDS 81695 - 81970 262 ## gi|260172725|ref|ZP_05759137.1| hypothetical protein BacD2_12731 75 35 Op 2 . - CDS 81970 - 86280 2171 ## COG3209 Rhs family protein 76 35 Op 3 . - CDS 86277 - 89282 1429 ## Cpin_4144 YD repeat protein - Prom 89302 - 89361 4.8 - Term 89331 - 89371 8.0 77 36 Op 1 . - CDS 89381 - 89482 65 ## 78 36 Op 2 . - CDS 89482 - 90009 371 ## COG3023 Negative regulator of beta-lactamase expression - Prom 90030 - 90089 2.9 79 36 Op 3 . - CDS 90094 - 90597 577 ## BT_3199 putative non-specific DNA-binding protein - Prom 90622 - 90681 4.3 80 37 Op 1 . - CDS 90720 - 91610 462 ## BT_3198 hypothetical protein 81 37 Op 2 . - CDS 91598 - 91918 197 ## BT_3197 hypothetical protein - Prom 91939 - 91998 2.4 + Prom 92278 - 92337 1.8 82 38 Tu 1 . + CDS 92360 - 93073 717 ## BT_3196 hypothetical protein - Term 93105 - 93167 3.1 83 39 Tu 1 . 
- CDS 93270 - 94652 472 ## PROTEIN SUPPORTED gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase - Prom 94868 - 94927 5.8 + Prom 94787 - 94846 5.1 84 40 Op 1 . + CDS 94989 - 95471 329 ## gi|260172735|ref|ZP_05759147.1| hypothetical protein BacD2_12781 + Prom 95475 - 95534 3.3 85 40 Op 2 . + CDS 95623 - 97122 1513 ## COG0427 Acetyl-CoA hydrolase + Term 97167 - 97229 12.1 - Term 97155 - 97217 8.3 86 41 Op 1 . - CDS 97345 - 98751 865 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) 87 41 Op 2 . - CDS 98798 - 100000 934 ## PRU_2053 hypothetical protein - Term 100012 - 100055 3.3 88 42 Op 1 . - CDS 100065 - 101408 667 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 89 42 Op 2 . - CDS 101450 - 101902 384 ## BT_3185 hypothetical protein - Prom 101925 - 101984 4.3 + Prom 101703 - 101762 4.7 90 43 Tu 1 . + CDS 102009 - 103580 1598 ## COG0029 Aspartate oxidase + Term 103632 - 103680 10.6 - Term 103621 - 103668 7.2 91 44 Tu 1 . - CDS 103691 - 104269 868 ## COG1592 Rubrerythrin - Prom 104444 - 104503 6.5 + Prom 104271 - 104330 5.7 92 45 Tu 1 . + CDS 104519 - 106198 1669 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 106234 - 106278 4.1 - Term 106220 - 106265 8.1 93 46 Op 1 9/0.000 - CDS 106307 - 107671 398 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 94 46 Op 2 27/0.000 - CDS 107684 - 110863 3003 ## COG0841 Cation/multidrug efflux pump 95 46 Op 3 . - CDS 110878 - 111984 1360 ## COG0845 Membrane-fusion protein - Prom 112087 - 112146 10.1 + Prom 112046 - 112105 4.7 96 47 Tu 1 . + CDS 112208 - 113122 373 ## BT_2939 putative transcriptional regulator - Term 113503 - 113572 13.3 97 48 Op 1 . - CDS 113661 - 115700 1638 ## gi|260172750|ref|ZP_05759162.1| hypothetical protein BacD2_12856 98 48 Op 2 . - CDS 115730 - 116203 471 ## gi|260172751|ref|ZP_05759163.1| hypothetical protein BacD2_12861 99 48 Op 3 . 
- CDS 116216 - 117397 1047 ## BT_2913 unsaturated glucuronylhydrolase 100 48 Op 4 . - CDS 117412 - 119343 1545 ## Phep_2654 hypothetical protein 101 48 Op 5 . - CDS 119368 - 121182 1648 ## Slin_2455 heparinase II/III family protein 102 48 Op 6 . - CDS 121203 - 123014 1657 ## BF0333 hypothetical protein 103 48 Op 7 . - CDS 123044 - 126316 3020 ## BF0387 hypothetical protein + Prom 126275 - 126334 6.5 104 49 Tu 1 . + CDS 126583 - 128016 1094 ## BT_3171 sialic acid-specific 9-O-acetylesterase + Term 128085 - 128132 6.1 + Prom 128098 - 128157 2.4 105 50 Tu 1 . + CDS 128184 - 132152 2633 ## COG5002 Signal transduction histidine kinase - Term 132244 - 132301 8.0 106 51 Op 1 . - CDS 132496 - 133269 1027 ## COG0731 Fe-S oxidoreductases - Prom 133289 - 133348 7.5 - Term 133292 - 133329 2.1 107 51 Op 2 . - CDS 133350 - 135407 1252 ## Cpin_1424 hypothetical protein 108 51 Op 3 . - CDS 135433 - 137781 1454 ## Cpin_6153 hypothetical protein 109 51 Op 4 . - CDS 137797 - 138570 508 ## gi|260172762|ref|ZP_05759174.1| hypothetical protein BacD2_12916 110 51 Op 5 . - CDS 138536 - 139465 658 ## gi|293369292|ref|ZP_06615879.1| hypothetical protein CUY_3865 111 51 Op 6 . - CDS 139482 - 140162 580 ## gi|260172764|ref|ZP_05759176.1| hypothetical protein BacD2_12926 112 51 Op 7 . - CDS 140172 - 141695 1403 ## Cpin_1098 hypothetical protein 113 51 Op 8 . - CDS 141712 - 145062 3056 ## Cpin_5147 TonB-dependent receptor plug - Prom 145102 - 145161 5.0 114 52 Op 1 . - CDS 145195 - 146370 829 ## COG3712 Fe2+-dicitrate sensor, membrane component 115 52 Op 2 . - CDS 146410 - 146913 442 ## BVU_0609 RNA polymerase ECF-type sigma factor 116 52 Op 3 . - CDS 146957 - 149005 1019 ## BT_3167 hypothetical protein - Prom 149094 - 149153 5.8 + Prom 148972 - 149031 6.0 117 53 Tu 1 . + CDS 149150 - 149728 550 ## BT_3166 hypothetical protein - Term 149668 - 149704 -0.1 118 54 Tu 1 . - CDS 149766 - 150578 174 ## CHU_3539 hypothetical protein 119 55 Tu 1 . 
- CDS 150680 - 152959 1182 ## COG3537 Putative alpha-1,2-mannosidase - Prom 153039 - 153098 4.1 + Prom 153259 - 153318 7.0 120 56 Op 1 . + CDS 153373 - 156432 2409 ## BDI_2469 hypothetical protein 121 56 Op 2 . + CDS 156452 - 158062 1170 ## BDI_2468 hypothetical protein 122 56 Op 3 . + CDS 158065 - 159831 1289 ## Oter_2906 hypothetical protein 123 56 Op 4 . + CDS 159836 - 161473 1148 ## Phep_3982 alpha-L-rhamnosidase + Term 161689 - 161729 2.1 - Term 162046 - 162097 11.2 124 57 Op 1 . - CDS 162142 - 164742 1534 ## COG3250 Beta-galactosidase/beta-glucuronidase 125 57 Op 2 . - CDS 164724 - 165389 314 ## COG3250 Beta-galactosidase/beta-glucuronidase 126 57 Op 3 . - CDS 165425 - 169057 2425 ## gi|260172779|ref|ZP_05759191.1| hypothetical protein BacD2_13001 127 57 Op 4 . - CDS 169084 - 170895 1343 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 171022 - 171081 6.9 - Term 171044 - 171081 4.2 128 58 Tu 1 . - CDS 171100 - 172497 1070 ## Slin_3640 hypothetical protein - Prom 172542 - 172601 8.3 + Prom 172615 - 172674 7.4 129 59 Tu 1 . + CDS 172711 - 174663 583 ## PROTEIN SUPPORTED gi|90021240|ref|YP_527067.1| ribosomal protein S32 + Term 174767 - 174823 7.5 - Term 175052 - 175091 -0.9 130 60 Tu 1 . - CDS 175105 - 175632 663 ## COG0566 rRNA methylases - Prom 175725 - 175784 4.8 + Prom 176036 - 176095 6.3 131 61 Tu 1 . + CDS 176133 - 177071 717 ## COG0379 Quinolinate synthase + Term 177149 - 177201 11.8 132 62 Op 1 . - CDS 177216 - 178058 825 ## COG1360 Flagellar motor protein 133 62 Op 2 . - CDS 178090 - 178671 309 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 134 62 Op 3 . - CDS 178722 - 179639 912 ## COG1284 Uncharacterized conserved protein - Prom 179659 - 179718 5.4 135 63 Tu 1 . - CDS 179762 - 182596 2958 ## COG0495 Leucyl-tRNA synthetase - Prom 182616 - 182675 7.9 + Prom 183045 - 183104 7.5 136 64 Tu 1 . 
+ CDS 183332 - 185902 2016 ## PRU_1162 putative glycolsyl hydrolase, family 18/alpha-rhamnosidase + Term 185956 - 186008 9.2 Predicted protein(s) >gi|225935351|gb|ACGA01000041.1| GENE 1 788 - 2674 1457 628 aa, chain + ## HITS:1 COG:RSc3328 KEGG:ns NR:ns ## COG: RSc3328 COG0445 # Protein_GI_number: 17548045 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Ralstonia solanacearum # 4 624 6 627 647 590 48.0 1e-168 MDFKYDVIVIGAGHAGCEAAAAAANLGSKTCLITMDMNKIGQMSCNPAVGGIAKGQIVRE IDALGGQMGLVTDETAIQFRILNRSKGPAMWSPRAQCDRAKFIWSWREKLENTPNLHIWQ DTVCELLVENGEVTGLVTAWGVTFKAKCIVLTAGTFLNGLMHVGRHKLPGGRMAEPASYQ LTESIARHGITYGRMKTGTPVRIDARSVHFDQMETQDGESDFHKFSFMNTSTRHLKQLQC WTCYTNEEVHRILREGLPDSPLFNGQIQSIGPRYCPSIETKIVTFPDKDQHQLFLEPEGE TTQELYLNGFSSSLPMEIQIAALKQIPAFKDLVIYRPGYAIEYDYFDPTQLKHTLESKII KNLFFAGQVNGTTGYEEAGGQGLIAGINAHINCHGGEAFTLARDEAYIGVLIDDLVTKGV DEPYRMFTSRAEYRILLRMDDADMRLTEKAFKLGLAKEDRYQLVRGKKEAVEQIISFARN YSMKPALINDALERIGTTPLRQGCKLIEILNRPQVTIENIAEYVPAFQRELEKATSSDQD RKEEILEAAEILIKYQGYIDRERMIAEKLARLESIKIKGKFDYSTIQSLSTEARQKLVKI DPETIAQASRIPGVSPSDINVLLVLSGR >gi|225935351|gb|ACGA01000041.1| GENE 2 2725 - 3255 598 176 aa, chain + ## HITS:1 COG:YPO3123 KEGG:ns NR:ns ## COG: YPO3123 COG0503 # Protein_GI_number: 16123288 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Yersinia pestis # 11 162 18 169 187 159 49.0 3e-39 MIMSKETLIKSIREVPDFPIPGILFYDVTTLFKDPWCLQELSNIMFDMYKDKGITKVVGI ESRGFIMGPILATRLNAGFIPIRKPGKLPAETIEESYDKEYGKDTVQIHKDALDENDVIL LHDDLLATGGTMKAACELVKRLKPKKVYVNFIIELKELNGKSLFGDDVEVESVLSL >gi|225935351|gb|ACGA01000041.1| GENE 3 3325 - 5160 985 611 aa, chain + ## HITS:1 COG:lin1197 KEGG:ns NR:ns ## COG: lin1197 COG0322 # Protein_GI_number: 16800266 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: 
Listeria innocua # 9 590 2 573 603 409 41.0 1e-114 MNTEPESRANEYLKGIVANLPEKPGIYQYLNTEGTIIYVGKAKNLKKRVYSYFSKEHEPG KTRVLVSKIADIRYIVVNTEEDALLLENNLIKKYKPRYNVLLKDDKTYPSICVQNEYFPR VFRTRKIIRNGSSYYGPYSHIPSMYAVLDLIKHLYPLRTCSLNLTPENIRAGKFNVCLEY HIKNCAGPCIGLQSQEDYLKNIDEIKEILKGNTQEISRMLLERMQTLAGEMKFEEAQKVK EKYLLIENYRSKSEVVSAVLHNIDVFSIEEDESNSAFINYLHITNGAINQAFTFEYKKKL NESKEELLTLGIIEMRERYKSQSREIIVPFELDLELNNIVFTVPQRGDKKKLLDLSILNV KQYKADRLKQAEKLNPEQRSMRLMKEIQQELHLDRPPLQIECFDNSNIQGSDAVAACVVF KKAKPSKKDYRKYNIKTVVGPDDYASMKEVVRRRYQRAIEENTPLPDLIITDGGKGQMEV VREVIEDELNLTIPIAGLAKDNRHRTSELLFGFPPQTIGIKQQSPLFRLLTQIQDEVHRF AISFHRDKRSKRQVASALDNIKGIGEKTKTALLKEFKSVKRIKEASFEEISAVIGEAKAK TVKEGLDNEQQ >gi|225935351|gb|ACGA01000041.1| GENE 4 5399 - 5851 380 150 aa, chain + ## HITS:1 COG:L110564 KEGG:ns NR:ns ## COG: L110564 COG1490 # Protein_GI_number: 15672090 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Lactococcus lactis # 1 147 1 145 151 154 54.0 7e-38 MRIVIQRVSHASVTIEGHCKSSIGKGMLILVGIEEADGQEDIDWLCKKIVNLRIFDDENG VMNKSILEDGGEILVISQFTLHASTKKGNRPSYIKAAKPDVSIPLYEQFCKDLSGALGKE IGTGIFGADMKVELLNDGPVTICMDTKNKE >gi|225935351|gb|ACGA01000041.1| GENE 5 5871 - 6209 406 112 aa, chain + ## HITS:1 COG:SA1292 KEGG:ns NR:ns ## COG: SA1292 COG1694 # Protein_GI_number: 15927040 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Staphylococcus aureus N315 # 2 99 3 101 105 80 44.0 6e-16 MTLEEAQKQVDQWVKTYGVRYFSELTNMAVLTEEVGELARVMARKYGDQSFKEGEKDNID EEIADVLWVLLCIANQTGVDITTAFQKSMDKKTKRDNKRHINNPKLKDHGRE >gi|225935351|gb|ACGA01000041.1| GENE 6 6196 - 7098 910 300 aa, chain + ## HITS:1 COG:SMb21300 KEGG:ns NR:ns ## COG: SMb21300 COG0274 # Protein_GI_number: 16264552 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Sinorhizobium meliloti # 54 293 71 314 334 183 39.0 4e-46 MEENNSQPNKYNAALAKYNTNLSDADIQARVAELIEKKVPENNTEDVKKFLFNCIDLTTL 
NSTDSDKSVMHFTEKVNQFDDEYPDLKNVAAICVYPNFAAIVKNTLEVDRVNIACVSGGF PSSQTFIEVKVAETALAIAEGADEIDIVISIGKFLSGDYEGMCEEIQELKEVCKERHLKV ILETGALKSASNIKKASILSMYSGADFIKTSTGKQQPAATPEAAYVMCEAIKEYYQKTGI KIGFKPAGGINTVNDAIIYYTIVKELLGEEWLDNQLFRLGTSRLANLLLSDIKGEEIKFF >gi|225935351|gb|ACGA01000041.1| GENE 7 7493 - 8173 461 226 aa, chain + ## HITS:1 COG:no KEGG:BT_3262 NR:ns ## KEGG: BT_3262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 221 1 221 226 340 71.0 2e-92 MDAYTRTLRFNHNPLNLILGAEKKKGLRIGYMEAGLQGFYLNSMETGIHPQKLSKLLTEE FHCTDNESATGLFQFLINEGDRVSYQIMLPYLLSTENINEFENIIQKRFFGVERFIRQGK NLYKFVKYTEERRDPIIWINDLEKGIIGWDMGLLVSLARASQACGHITKEKAWDYIEQAA KLCSLDLHTAEEIDKSFLLGKAMKSEKIEDWDRLLLCYSLLAKYRK >gi|225935351|gb|ACGA01000041.1| GENE 8 8174 - 9148 720 324 aa, chain - ## HITS:1 COG:PA4569 KEGG:ns NR:ns ## COG: PA4569 COG0142 # Protein_GI_number: 15599765 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Pseudomonas aeruginosa # 26 322 25 320 322 177 35.0 2e-44 MDSISLIRTPIEAELGDFKELFDSSLSSSNALLDSVVSHIRQRNGKMMRPILVLLVARLY GAICPSTLHAAVSLELLHTASLVHDDVVDESSERRGQLSVNAIFNNKVAVLTGDYLLATS LVHAELTNSHRIIQLVSTLGQDLADGELLQLSNVSNHSFSEEVYFDVIRKKTAALFAACT KAAAFSVGVGEGEAELARLLGEYIGICFQIKDDIFDYFDNKEIGKPTGNDMLEGKLTLPA LYVLNTTKNEEAQEIAIKVKEGTATLDEIAHLISFIKENGGIEYAVQTMNVYKQKAFNLL ASLPDSDICAALRAYLDYVVDREK >gi|225935351|gb|ACGA01000041.1| GENE 9 9237 - 12086 2384 949 aa, chain + ## HITS:1 COG:YPO0017_2 KEGG:ns NR:ns ## COG: YPO0017_2 COG0749 # Protein_GI_number: 16120370 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Yersinia pestis # 356 949 45 645 645 518 47.0 1e-146 MDSGNKLFLLDAYALIYRAYYAFIKNPRINSKGFNTSAILGFVNTLEEVLKKENPTHIGV AFDPAGPTFRHEAFEQYKAQREETPEAIRLSVPIIKDIIRAYRIPILEVAGYEADDVIGT LATEAGRQGITTYMMTPDKDYGQLVSDKVFMYRPKHTGGFEVMGVEEVKAKFDIQSPTQV IDMLGLMGDSSDNIPGCPGVGEKTAQKLVSEFGSIENLLEHTDQLKGALKTKVETNREMI 
TFSKFLATIKIDVPIQLEMDALVREEADENSLRSIFEELEFRTLIDRVLKKDISGNGITS ATGSKVATGKSAPSPLPLFPEEGGGMQGDLFANFTPDEPGEAKKSNLETLESLTYSYQLI DTEEKRREIIQKLLTSKILSLDTETTGTEPMDAELVGMSFSIAENEAFYVPVPSDQDEAL KIVNEFRPVFENENSLKVGQNIKYDMIVLQNYGATVKGPLFDTMIAHYVLQPELRHGMDY LAEIYLHYQTIHIDELIGPKGKNQKNMRDLDPKDVYLYACEDADVTLKLKNVLEKELKEN DAERLFYDIEMPLVPVLVNIERNGVLLDTEALKQSSAHFTAQMEQIEKEIYELAGETFNI ASPKQVGEVLFDKLKIVEKAKKTKTGQYVTSEEVLESLRHKHPVVEKILEHRGLKKLLGT YIDALPQLINPRTGRVHTSFNQTVTATGRLSSSNPNLQNIPIRDENGKEIRKAFIPDEGC LFFSADYSQIELRIMAHLSEDKNMIDAFLSNHDIHAATAAKVYKIDLKDVGSDMRRKAKT ANFGIIYGISVFGLAERMNVDRKEAKELIDGYFETYPGVKAYMDKSIQVAQEKGYVETIF HRKRFLPDINSRNAVVRGYAERNAINAPIQGSAADIIKVAMARIYQRFQTEGIQAKMILQ VHDELNFSVPVNEKERVEEIVIEEMEKAYRMHVPLKADCGWGKNWLEAH >gi|225935351|gb|ACGA01000041.1| GENE 10 12477 - 12872 463 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3259 NR:ns ## KEGG: BT_3259 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 184 82.0 9e-46 MKAKVSLKMFVLSAALLVVSLATSARSYDNQLIYNPIEENGMTVGQTVYKMDGNTLANYM KYNYKYDDNKRMIESEALKWNNSKDAWEKDLRINYTYEGKTVTTNYYKWNAKKQAYVLVP EMTVTMDNTNM >gi|225935351|gb|ACGA01000041.1| GENE 11 12961 - 13740 711 259 aa, chain - ## HITS:1 COG:VC0693 KEGG:ns NR:ns ## COG: VC0693 COG3279 # Protein_GI_number: 15640712 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Vibrio cholerae # 11 254 5 232 237 85 28.0 7e-17 MMETEEKYKVVIVDDERTAIDALRRELEPYREFEVKGIAGNGAKGKKMIMELHPDLLFLD VELPDTLGINLLSEIREDILWDMKVVFYTSYDKYLLQALRESAFDFLLKPFEAEDLKVIM DRYRKAMTSTSLPLLPSFASSISALMPQQGMFMISTVTGFKLLRLEEIGFFEYLKDKRQW QVVLFNQTCLNLKRNTKAEDIISYSQAFVQISQSAIVNINYLAMIDGKCCQLYPPFHDKN DLIISRSYLKELQERFFVL >gi|225935351|gb|ACGA01000041.1| GENE 12 13737 - 15770 1365 677 aa, chain - ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: 
Escherichia coli O157:H7 # 463 670 350 551 565 91 27.0 4e-18 MKLFGLLGLLMLLSCGYADRHRNSSSCEQALVDSLEVRVQDSLFSNVHYSRSQLLEALTQ TQDSLVYYRLLALYGKTFFVSSDFDSILYYNRRVKEFSRNASQSSESLQSPRWNDVLSDV YNIEGNVWMQLNRPDSAIIDYKKAYGYRLEGKKLHLLPDICINIADAYLHSSDLAHTASY YRRALFLCDSLNLSEHTKFPVYYGLGQTYMELRDFDLSNHYYELAGQFFDEMNVSEQWTY LNNRGNHYYYRKNYQEALKYMRRANALVSSYPQMVFEQNFIKVNLGELYLLTDKLDSAQI CLDESYRFFSGIQHNSAVHYIETQMIELALKKGNIAQAKTMIARTAPVGHLDANMLTIRN QYLQHYFEHTGDYRRAYEYLKRDCHLDDSIRSERIQMRVAELDMRYRQDTIVLRKEMKIQ RQAAEMRVLKLSVSIWVLVCVLLVAGVVIVVWYMRKKREFLRQRFFQQINRVRMENLRSR ISPHFTFNVLGREINQFNGSEEVKHNLMELVKYLRRSLELTEKLSVSLQDELDFVRTYIE LERGRVGEDFVATITVEDGLDATRIMIPSMIVQIPVENAIKHGLAGKDGKKELMVYVSRE TNGIRITVTDNGRGYLPQVVSATRGTGTGLKVLYQTIQLLNTKNKSDKIRFNITNRSDGQ TGTEVSVYIPFRFSYDL >gi|225935351|gb|ACGA01000041.1| GENE 13 15912 - 16817 678 301 aa, chain + ## HITS:1 COG:PA3816 KEGG:ns NR:ns ## COG: PA3816 COG1045 # Protein_GI_number: 15599011 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Pseudomonas aeruginosa # 133 295 8 161 258 140 44.0 3e-33 MSPLNFTHILTQAVDELSESESYKGLFHQHKDGEPLPSAKVLYEIIELSRSILFPGYYGN STINSRTINYHIGVNIEKLFDLLCEQILAGLCFSTSIEGKCNVCSDSKREEAARLAAKFI SKLPAMRRILATDVEAAYNGDPAAESYGEVIFCYPAIKAISNYRIAHELLELGVPLIPRI ITEMAHSETGIDIHPAAKIGTHFTIDHGTGVVIGATSIIGNNVKLYQGVTLGAKSFPLDA DGKPIKGIPRHPILEDNVIVYSNATILGRITIGRDATVGGNIWVTENVPAGARIVQTKAK K >gi|225935351|gb|ACGA01000041.1| GENE 14 16834 - 18273 1619 479 aa, chain + ## HITS:1 COG:slr0064 KEGG:ns NR:ns ## COG: slr0064 COG0116 # Protein_GI_number: 16331495 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Synechocystis # 9 374 16 384 384 271 39.0 3e-72 MSEQFEMIAKTFQGLEEILAEELTALGANDIQIGRRMVSFTGDKRMMYKANFCLRTAIRI LKPIKNFTAKDADEVYNQIQAIPWEEYLDVNKTFAIDAVVFSEEFRHSKFVSYKVKDAIV DYFREKTGKRPSVRINNPDVLLNIHIAQTTCTLSLDSSGESLHRRGYRQEAVEAPLNEVL AAGMILMTGWRGECDLIDPMCGSGTIPVEAALIAKNIAPGVFRKGFAFEKWVDFDADMFD 
EIYNDDSQEREFTHKIYGYDNNPKANEIATHNIKAAGVSKDIVLKLQPFQQFEQPKEKSI IITNPPYGERISTNDLLGLYQMIGERLKHAFVGNEAWVLSYREECFDQIGLKASQKVPLF NGPLECEFRKYEIFDGKYKEFKSQEEGEEKKEGEGEQAPRFRERKEFKPRREGELKPRRD GESRPRREGEYKPRREGGDFKGGDRDRKPQGEFRGGRDSRGPREFKGNREPRIPKKEGE >gi|225935351|gb|ACGA01000041.1| GENE 15 18418 - 18981 239 187 aa, chain + ## HITS:1 COG:no KEGG:BDI_3007 NR:ns ## KEGG: BDI_3007 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 167 2 170 185 160 56.0 2e-38 MNDLIIVENKIYEIRGQKVMLDFDLAEMYEIETKYLKQMVRRNIERFPADFMFQLSNEEV NYLEISLRLQIATSNKGGNRYAPFAFTEQGVAMLSGVLRSPKAIEVNINIMRAFVHMRQY LLSHAPKQELEELRKRIEYLEEDISSDRESYEKQFDDLFTAFAKLSAAIQIKQTPLDRVK IEGFKNK >gi|225935351|gb|ACGA01000041.1| GENE 16 18996 - 21194 1824 732 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 150 714 161 715 738 282 33.0 1e-75 MSKIGKRAILLLLALPIGFNVMAQETKKPTLEELIPGGESYRYAENLYGLQWWGDECIKP GVDTLYSIQPKTGKETMVITREQINKVLEENKAGKLSHLYSVRFPWTDKAQMLFTIAGKF IVYDFKNNQVVSTFKPKDGANNEDYCAASGNVAYTIDNNLYVNEKAVTNEPEGIVCGQTV HRNEFGINKGTFWSPKGNLLAFYRMDESMVTQYPLVDITARVGEVNNVRYPMAGMTSHQV KVGIYNPATDKSIYLNAGDPTDRYFTNISWAPDEKSLYLIEVNRDQNHAKLCQYNAETGE PMGVLYEEMHPKYVEPQTPIVFLPWDPTKFIYQSQRDGYNHLYLFETNAANMKGETYNSA NGGSYFQAGKVKQLTKGNWLVSEILGFNTKRKEVIFTAVEGLRSGHFAVNVSNGKISQPF ENCKESEHSGTLSASGTYLIDRYSTKDQPRVINLVDTKNFKETANLLTAENPYDGYQMPS IETGTIKAADGTTDLHYRLMKPANFDPAKKYPVIVYVYGGPHAQCVTGGWQNGARGWDTY MASKGYIMFTIDNRGSSNRGLTFENATFRRLGIEEGKDQVKGVEFLKSLPYVDSERIGVH GWSFGGHMTTALMLRYPEIFKVGVAGGPVIDWGYYEIMYGERYMDTPESNPEGYKECNLK NLAGQLKGHLLIIHDDHDDTCVPQHTLSFMKACVDARTYPDLFIYPCHKHNVAGRDRVHL HEKITRYFEQNL >gi|225935351|gb|ACGA01000041.1| GENE 17 21260 - 22534 1600 424 aa, chain + ## HITS:1 COG:VC0275 KEGG:ns NR:ns ## COG: VC0275 COG0151 # Protein_GI_number: 15640304 # Func_class: F Nucleotide transport and 
metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Vibrio cholerae # 1 422 1 420 429 378 47.0 1e-104 MKILLLGSGGREHALAWKIAQSPKVEKLYIAPGNAGTTAVGENVNIKATDFEAISAFALK ENIQMVVVGPEDPLVEGIYDYFQNRPELKHIAVIGPSAQGAELEGSKEFAKGFMMRHNIP TARYKSITSANLEEGLAFLETLEAPYVLKADGLCAGKGVLILPTLDEAKVELKEMLGGMF GSASATVVIEEFLSGIECSVFVLTDGEHYKVLPVAKDYKRIGEGDKGLNTGGMGSVTPVP FADEVFMEKVRTRIIEPTINGLKEENITYKGFIFLGLIKVKGEPMVIEYNVRMGDPETES VMLRIQSDFVELLEGTAAGNLNEKTLVMDPRSAGCVILVSGGYPEAYEKGFPISGLEQAA ATDSIIFHAGTAMKDRQVVTNGGRVIAVCSYGATKEEALAQSYKVADMIDFDKKYFRRDI GFDL >gi|225935351|gb|ACGA01000041.1| GENE 18 22715 - 23725 615 336 aa, chain + ## HITS:1 COG:no KEGG:BT_3252 NR:ns ## KEGG: BT_3252 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 336 1 336 336 445 75.0 1e-123 MRNKRLQSQVTAGRWTLPAVIFICALCWVLTYFLFPGSIASTVLEGSPSAWQPARNFLLP GWADRIVSFLIYATIGYFLIELNNQFSIIRMRASMQTAIYFLLVTVCPKMHFLYTGDIVA LGFLISIYFLFKSYQQTQAAGYLFYSFFFIGAGSILFPQFTILSVLWLLEAYRFQSLTPR SFCGALLGWMLPYWMLFGHAFFYNEMDLFYRPFNQLLTFGEFFNLQILQPWELAILGYLL VMFIVSAVHCIAAGFEDKIRTRAYLQFLIDLTIFLFLLIALQPIYCSALLPLLIISNSIL IGHFFVLTNSKSSNVFFIISLVGLILLFAFNVWTLL >gi|225935351|gb|ACGA01000041.1| GENE 19 23710 - 24189 278 159 aa, chain + ## HITS:1 COG:Cj0341c KEGG:ns NR:ns ## COG: Cj0341c COG1238 # Protein_GI_number: 15791709 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 19 152 13 146 147 84 35.0 6e-17 MDAFIDSTIQLLIEWGLPGLFISALLAGSIVPFSSELVLVALVKLGLPPIACLISATLGN TVGGMTCYYMGRLGKISWIEKYFKVKKEKVDKMVKFLQGKGALMAFFTFLPAIGEVIAIA LGFMRSNTWLTIVSMFVGKLIRYILLLYALESAWDAIAG >gi|225935351|gb|ACGA01000041.1| GENE 20 24253 - 25002 587 249 aa, chain + ## HITS:1 COG:DR1672 KEGG:ns NR:ns ## COG: DR1672 COG4121 # Protein_GI_number: 15806675 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 9 249 22 232 234 97 33.0 2e-20 MMKRIIEKTDDGSATLFVPELNEHYHSTKGARTESQHIFIDMGLKASSATTPRILEIGFG 
TGLNAWLTLEEAERSRWNIHYTGLELYPLEWQTIEQLGYISNDEQLTTSDRQQPAIELFK QLHTSPWEKDVQLTPHFTLRKVETDVNKWVENRERTMSNINDSVTNAESPALNLSFNLIY FDAFAPEKQPEMWSQELFNRLYVLLDRDGILTTYCAKGVVRRMLQTAGFTVERLPGPPGG KREILRARK >gi|225935351|gb|ACGA01000041.1| GENE 21 25047 - 25976 743 309 aa, chain + ## HITS:1 COG:MTH604 KEGG:ns NR:ns ## COG: MTH604 COG0803 # Protein_GI_number: 15678632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Methanothermobacter thermautotrophicus # 10 302 13 292 295 200 35.0 4e-51 MKKQLKDMRTLFLLSACLLMAACTGRSSQASNDDEAKPVITVTLEPQRYFTEAIAGDKFK VVSMVPKGSSPETYDPVPQQLVSLGDSKAYFRIGYIGFEQTWMDRLMNNTPHIQVFDTSK GIDLILNNDDHDHAHGHNSHDGHIHAVEPHVWNSTGNALIIAGNTYKALSQLDKANEVYY RNRYDSLCQRIQHTDSLIRRQLSVPEAAKAFMIYHPALSYFARDYGLHQISIEEGGKEPS PAHLKALIDLCQAEDVRVIFVQPEFDKRNAETIAQQTGTKVIPINPLSYDWEEEMLNVAK ALAPQAATK >gi|225935351|gb|ACGA01000041.1| GENE 22 26014 - 26787 218 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 213 1 210 305 88 28 2e-16 MKPIIEIKNLSAGYDGRTVLHDVNLSIYERDFLGIIGPNGGGKTTLIKCILGLLKPTGGE IIFHTPEKASTKTTSTFLGYLPQYNSIDRKFPISVEEVILSGLSIQKSLTSRFTPEQKEK GKQIISRMGLEGFEHRSIGQLSGGQLQRALLGRAIISDPAVLILDEPSTYIDKRFEARLY ELLAEINKECAIILVSHDIGTVLQQVKSIACVNETLDYHPDTGVTTEWLERNFNCPIELL GHGTLPHRVLGEHHHHH >gi|225935351|gb|ACGA01000041.1| GENE 23 26861 - 27499 352 212 aa, chain - ## HITS:1 COG:no KEGG:Athe_0520 NR:ns ## KEGG: Athe_0520 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophilum # Pathway: not_defined # 10 151 13 157 237 67 31.0 4e-10 MKGLAPHTLQIFEAVSQLDCIKPYLLVGGTALSLQIGTRQSEDLDFMKWRTSKIEKMEVA WYQIEKELATIGEIQHKDILDIDHVEYLVSGVKFSFYACPKYSPVSEPVSYLNNLKLADV KSIGAMKMEVLLRRSNFRDYYDIYSILKSGVPVNDLISLALSYSGHRLKSKNLLAMLTNS NRFTRDSHFEQLEPVYDVTAQEIEGYIKNCLL >gi|225935351|gb|ACGA01000041.1| GENE 24 27496 - 27828 335 110 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160883372|ref|ZP_02064375.1| ## NR: gi|160883372|ref|ZP_02064375.1| hypothetical protein BACOVA_01341 [Bacteroides ovatus ATCC 8483] # 1 110 1 110 110 191 100.0 9e-48 MSVDVIKQELLSKLKQEHCFWSYNEDSIKDIPDDMLIEKTLLHLDLEEINQLFLVYPFKK IKQVWLDYLIPQEEYLYTLNRFFAWYYFKAKKPDAYIKSMATRRLNKILL >gi|225935351|gb|ACGA01000041.1| GENE 25 27864 - 28214 325 116 aa, chain - ## HITS:1 COG:no KEGG:BF2945 NR:ns ## KEGG: BF2945 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 116 1 104 104 132 67.0 5e-30 MEEKARFNVKYMEEMKLYTHEEMLDRVIGTKGTPAREKYETDINNFLIGEAIKRAREAKN LTQEQLGELMGVKRAQISKIESGKSISFSTIVRAFKAMGVKTASLELGSLGKVALW >gi|225935351|gb|ACGA01000041.1| GENE 26 28675 - 29940 1120 421 aa, chain - ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 421 12 439 440 366 45.0 1e-99 MKTKKILRGMLTVISPAIITTVLLLCATGCDNKDYSNASPFDNVVYLDAAKLKDVSNFTF NRTIETGQKEISALLARPAGEDINVGIKVDASLVNTYNARLGANYTMLDAKHYKLSAGQT VIPQGEVSSKPVTIDFSGLTDLEIDAGYLLPITIDQVSGGMGTLGGSKTICYVVRRSSAI TTAVSLKNNFFEVPGFDKGSSTADVVNNMKQLTYEAIIRVNNFKNGVTAREISTIMGIEQ YCLFRLGDAGFPAQQLQFSCNEIKFPNADNSKLLQEGEWYHVAVTFDTEKKVAVIYVDGR EQSRIEDYGKGEPINLGMQNRGKDFMFKIGHSYGESTDFSRQLDGEICEVRVWNVMRTQQ EIFDNMYNVDPATTGLCAYWKFNEGHGDIAEDYTGHGNDAHVHISPAVWPQGIEVTQKNK E >gi|225935351|gb|ACGA01000041.1| GENE 27 30134 - 30364 258 76 aa, chain - ## HITS:1 COG:no KEGG:Arnit_1148 NR:ns ## KEGG: Arnit_1148 # Name: not_defined # Def: DNA-binding domain-containing protein # Organism: A.nitrofigilis # Pathway: not_defined # 4 75 6 77 151 94 69.0 9e-19 MDQLQIIQNRIYEFRGQKVMLDRDLAEMYGVQTKVLNQAVKRNIERFPSDVMFQISSEEI QDWRSQFVTSNAIKME >gi|225935351|gb|ACGA01000041.1| GENE 28 30570 - 32240 812 556 aa, chain - ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 31 472 20 436 508 148 25.0 5e-34 MKPVKHYNDMKNNPFKYFVFLMLSLLMVATTGCTEDDISMPAGQLPDETPMNSVGGQLCS GKTFSNKITVSMYEGDGAVTEEICYALTKPAITAVTVKAIPSPELVAQYNSDNETDMKEF PVANVTLGNGGSLTVAAGKKESGTISITLSPDGLKPETLYLLAIALTQNPVGVDAQGNKQ VIYYRVNFREKTITCIPGTTGQIQDIPPLLPEMITVFYVNTETYQPLIASAWGIMLDDMI TRPYPIYSLGNIVNLKRATIGYDGVSQRALFELGSDLAYVLEHRDKYIRHLQEYKRKVCL CIENGGKGIGFCNMNDTQIADFVRQVKDVIERYQLDGVNLWDEDGKYGKAEMPGMNTTSY PRLIKALREALPDKLLTLVDKGDATEYFYDVSRCGGIEVGGYIDYAWHGYFSSTEELQII NPNLDGSVQTYSKYTRRSIAGLTETRYGCVNVPRYSSNDPNIRNRAADIICKWKSGGNKK SNILIYGDDLIGNEYGDRENAARIMLGDYSLLQFMDDGDGWDFSRDELIWGDVVYSGVSL DPGVEHAAGNTYRKDW >gi|225935351|gb|ACGA01000041.1| GENE 29 32267 - 33220 1110 317 aa, chain - ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 316 26 324 327 151 32.0 3e-35 MNMRISIANIFTHLFLLLALLATASCSDWTDQKTVDIDPQHAKEQNPELWARYMETLRTY RQSKHFVTYGSFDNSAEKSKNEGDYLRSLPDSLDIVTPTHPESLTSYDCEDILLLQEKSI KVLYLVDYTAQMPALTDAAKLGAWLDKAVAAASQLGMNGFAIKGTPLYGGTEAEQAARRE AAKLIVSKLSTAVGDGKLLVFEGDPAFMDAADLNKLDYVVLNTATITSAVDLKLYVAGVI ENFSIPKEKLLLSAKINEKLVDEEGEKLDAVTEMTNRVVSLGPVGGLAIYALGDDYYHTK MNYETSRLAIQIMNPSK >gi|225935351|gb|ACGA01000041.1| GENE 30 33285 - 34886 1614 533 aa, chain - ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 530 1 528 531 506 49.0 1e-142 MKQVKSFLKIFSLGLLLVGGAACTGNFDEINRKEYEVTKDEQGRENYNIGSTLRGLQGLV VPTKEHLYQFIEALAAGPFAGYYGTTLVRTDKFETYNPSVDWQDKTYGDIFTESYPLYRD LQDQSDDPVALALAKLLRVAIMHRMTDMYGPIPYSKVIDEQGSVSLNVPYDSQEAVYKQM LKELDEVSSVLKENLTIGSEAFRKFDDVYYGDVSKWYKFANSLKLRMAIRMVYVDPTTAQ QVAQDAVAAGVITDNADNAEMKVEENRAAMVFNGWSDHRIGADLLCYMNGYQDPRREKMF TQVEITETVGGKPTKVSGFAGIRIGIDVVNKESVIDRYSKPIISTASPYPWMNAAEITFL RAEGALRGWAMGGDAKSLYEEAIALSFEQYGLPATDALSYAANASNTPQAYTDPVDGTYS AGAVSNLTVAWQEGDEYAEKNLERIITQKWIAMFPSTVEAWSEYRRTDYPHLLPVVVNNS 
GGTIDDTKAKIRRLWYPPSEYSGNRQYILEAVGMLGGPDNGATSLWWDKKPRP >gi|225935351|gb|ACGA01000041.1| GENE 31 34905 - 38249 3086 1114 aa, chain - ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1114 1 1110 1110 1509 67.0 0 MLQNYTFIADKRGVRRSFLLFFSLFLLQVTAFAQNDVRITIRENNITVIEALKKVEKQSG LSIGYNNSLLRDKPALNLNLDKAGLDYSLSTILKGTGCTYELKGKYIKIIPQPAQEKPSS DKQIKGKVTDETGEPLIGVNIQVQGSASGVISDIEGRYSIEAPVGSTLNFTYIGYTPQNV KVTDRSVYNVTLATAVEQLNEVVVTALGIKREQKALSYNVQQVKADEISGIKDANFINSL NGKVAGVTINSSSSGVGGASKVVMRGAKSIEQSSNALYVIDGIPMYNFGGGGGMEFDSRG VTESIADINPDDIESISVLTGAAAAALYGSNAANGAIVITTKHGQVGKLQVTVNSNTEFA RPFVLPEFQNRYGTGSRGKDGGSTILSWGAKLNDASRTNYEPKDFFDTGLIFTNSVTLST GTEKNQTFFSVASVNSEGIVPNNRYNRFNFTFRNTTNFLNDRMKLDIGASYIIQNDRNMT NQGIYSNPIVPVYLFPRGDDFGLVKVFERWDPARKINTMFWPQGEGDYRMQNPYWIAYRN LRLNQKKRYMLSAQLSYDITDWLNISGRVRVDNTHTKYEQKLYASSNLTITEESTQGHYT ISKPDETQTYADVLANINKRFNDFSLVANVGASIVNNRYEDLSYRGPIREKGIPNVFNVF DLDNTKKKARQDEWQEQTQSIFASVEVGWKSMLYLTLTGRNDWASQLANSSTPCFFYPSV GLSGVISEMLTLPEFIDYMKVRGSFSSVGMPYPRNLTSPTYEYDEANQQWKPKTHYPIKD LKPERTNSWELGLDMRLFKDFSLGFSWYLANTFNQTFDPKVSVSSGYSKIYLQTGYVRNS GVELSLGYGHTWNNHLHWESNFTLSHNKNTIKDLVTSYIHPETGLPITQNRLDVGGLGKA RFILKKGGTLGDLYTQSDLKRDNSGMVEIDPSGALTTEDNLPDIKLGSVFPKANLAWNNR FEWRGISLSALFTARIGGIAYSATQAAMDQYGVSERSAQARDNGGVLLNGRTLVDAQTYY SLIGNSSGLPQYYTYSATNVRLQEASVGYTIPRKWLGGVCDINVSVVGRNLWMIYCRAPF DPEAVANTGNYYQGIDNFMLPSTRNIGVNVKINF >gi|225935351|gb|ACGA01000041.1| GENE 32 38472 - 39485 595 337 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 17 277 28 272 331 80 30.0 6e-15 MKNYFRQILTSFTKNHYSETIRQNFYGWLTDKEHAVEKDKALKEIWTTARAMGEVPHVEK ELERWKRNNGLHTASSVPVTPNRKIRILRFWQSVAAVLLLAAVSLGYLVMQAERTQNDLI 
QEFIPVAGMRHLSLPDGSQVQLNSKSTLLYPKQFTGKERCVYLVGEANFKVKPDKKHPFI VKSDDFQVTALGTEFNVSAYPENQEISTALLSGSVLVEWGNLTQRTVLQPDEQLTYDKEN RQFRVVHPDMADVTAWQRGELVFNEMTVEDIIRVLERKYNYTFVYSLNQLKKDRLSFRFK DKAPLAEVMDIIVDVAGNLKFKIEGDKCYITRKGNKK >gi|225935351|gb|ACGA01000041.1| GENE 33 39539 - 40084 395 181 aa, chain - ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 11 171 33 192 201 68 26.0 4e-12 MDKDITLEQEFDLVFKAHYSLVKNFALMLLKSGQDADDIAQDVFTRLWAKPQIWQDNPGI DKYIYAMTKHAIFDFLKHKRIERSYQQAQMEESLFKDLSPSGDTLDAIYYKEIQLALQMA VEQFPERRRLIFEMSRIQGMSNLEIAEKLDISVRTVERQIYLSLVELKKIVYILFFFISF E >gi|225935351|gb|ACGA01000041.1| GENE 34 40336 - 43749 3066 1137 aa, chain + ## HITS:1 COG:no KEGG:BT_3247 NR:ns ## KEGG: BT_3247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1137 1 1124 1124 2125 90.0 0 MKQYRTVNNLMGWLTFIIAATVYCLTIELTASFWDCPEFITTGYKLEVGHPPGAPFFMLV ANLFSQFASDVTTVAKMVNYMSALMSGACILFLFWSITHLVRKLIITDENNITKGQLITV MGSGLIGALVYTFSDTFWFSAVEGEVYAFSSLFTAVVFWLILKWEDVADQPHSDRWIILI AYLTGLSIGVHLLNLLCLPAIVLVYYYKKTPNATAKGSLIALFGSMVLVAAVLYGIVPGI VKVGGWFELLFVNGLGMSFNSGVVVYIILLAAALIWGVYESYTEKNKTRMAISFILTIAL LGIPFYGHGASSIVIGILVIAALGLYLAPNIQTKIKEKWRISARTMNTALLCTMMIVIGY SSYALIVIRSTANTPMDQNSPEDIFTLGEYLGREQYGTRPLFYGPAFSSKVALDVKDGYC IPRQSEAGSKFVRKEKTSPDEKDSYIELPGRVEYEYAQNMFFPRMYSSSHAPLYKQWVDI KGHDVPYDQCGEMVMVNMPNQWENIKFFFSYQLNFMYWRYFMWNFAGRQNDIQGSGEIEH GNWITGIPFIDNLLVGNQKLLPQDLKNNKGHNVFYCLPLILGLIGLFWQAYHSQRGIQQF WVVFFLFFMTGIAIVLYLNQTPAQPRERDYAYAGSFYAFAIWVGMGVAGIIRMLRDYAKM QELPAAVLASVLCLFVPIQMAGQTWDDHDRSGRFVARDFGQNYLMTLQEEGNPIIYTNGD NDTFPLWYNQETEGFRTDARTCNLSYLQTDWYIDQMKRPAYDSPSLPITWDRVEYVEGQN EYISIRTEMKAFIDSYFKQANELAAQGDTTILSLVHSIFGENPYELKEIINRWMLGKNDQ LKELLKKTGKEIQLPLIPTDSIVMKIDKEAVRRSGMKIPEALGDSIPEYMTITLRDANGN PKRALYKSELMMLEMLANANWERPIYMAITVGSENHLGMGNHFTQEGLAYRFTPFDTDKL 
DSKIDSEKMYDNLMNKFKFGGIDKPGIYIDENVMRMCYTHRRIFTQLVGQLIKEGKKDKA LAALDYAEKMIPTINVPYDWANGAFQMAESYYQLGQNEKANKIIDELANKSLEYMIWYLS LNDNQLAIASENFVYNASLLDAEVRLMEKYKSEELAKHYSTQLDQLYNEYVTRMKGK >gi|225935351|gb|ACGA01000041.1| GENE 35 43858 - 44472 613 204 aa, chain + ## HITS:1 COG:all4345 KEGG:ns NR:ns ## COG: all4345 COG0726 # Protein_GI_number: 17231837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Nostoc sp. PCC 7120 # 20 201 100 284 305 123 38.0 3e-28 MFIEQPPWLFRALYPQAIFRMDPNERAVYLTFDDGPIPEVTPWVLEILEKHHIKATFFMV GDNIRKHPDEYRMVVEHGHRIGNHTFNHIRGFEYSNPDYLANARKVDDIIHSDLFRPPHG HMGFRQYYTLRYHYRIIMWDLVTRDYSKRMRPEQVLNNVKRYARNGSIITFHDSLKSWNN GNLQYALPRAIEFLKEEGYEFKVL >gi|225935351|gb|ACGA01000041.1| GENE 36 45158 - 45388 163 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720641|ref|ZP_04551122.1| ## NR: gi|237720641|ref|ZP_04551122.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 76 19 94 94 160 100.0 3e-38 MIHAGQLIERTLHEQGRTVTWFATQLCCTRPNVYKIFRKENIDIHLLWRISCILGHDFFR DLSDSINTGNFPSVSK >gi|225935351|gb|ACGA01000041.1| GENE 37 45488 - 45979 508 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882201|ref|ZP_02063204.1| ## NR: gi|160882201|ref|ZP_02063204.1| hypothetical protein BACOVA_00147 [Bacteroides ovatus ATCC 8483] # 1 163 1 163 163 321 100.0 1e-86 MKTKKLFLIIALVVSCAVGAHAQKTVFKFRDAQARAGDAVTEVCVKPTVVEVKILEDKGR IKDEWTLSKEEVEIAMKGELDNIRAWGTYLSTIKYNCDVIMGATFKVEDNEKTGGYTVTV VGYPGIFVNWHPATQDDYEWIRLQKLSPTDGKSQIAPVVKNKN >gi|225935351|gb|ACGA01000041.1| GENE 38 46000 - 46635 587 211 aa, chain + ## HITS:1 COG:no KEGG:BT_2676 NR:ns ## KEGG: BT_2676 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 210 210 359 84.0 4e-98 MKKVLSMIAALLLCVGTQAQIVSSRSAIVKTEKQASSTQWFLRAGLNIMNFSGDGAEGAD SNIGYNATFGYQKPLGSTGGYWGMEFGLGSRGFKVEDTKCMAHNIQYSPFTFGWKFAVAD NVHIDPHVGVFASYDYTSKMKTDGESISWGDYANYMEVDYNHFDAGMNIGVGVWYNRFNL DLTYQRGFIDTFSDADGFKTSNFMIRLGIAF 
>gi|225935351|gb|ACGA01000041.1| GENE 39 46741 - 47460 660 239 aa, chain + ## HITS:1 COG:no KEGG:BT_2675 NR:ns ## KEGG: BT_2675 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 2 241 241 379 77.0 1e-104 MTKRQFLAIMYLLTGVIPLIAQTFTEQKKSYPISADGNKYVVSGFTPFSSMQDENIYANA LLWTIKNVCPQLREGITEVNVPAKNFSCDLILASQADSNQKNTYYCKALFQVKDGKLVYY LSNIRIESSAVIMKKITPMEKLQPDKKASHKEIMDDFVQVESQVLNKMFDFIVMKQLSPI THWNEISINKPVKGMTEDECLLAFGKPQTIQESNGEVQWMYSSSFYLFFKNGHVETIIK >gi|225935351|gb|ACGA01000041.1| GENE 40 47509 - 47979 502 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882198|ref|ZP_02063201.1| ## NR: gi|160882198|ref|ZP_02063201.1| hypothetical protein BACOVA_00144 [Bacteroides ovatus ATCC 8483] # 1 156 27 182 182 268 98.0 9e-71 MKTIWKATVCIVAVLLSFSIYSCGDDDDDTVGSRDLLLGTWNGVYYLSQEWEDGEKVSDS KEDFVNGTNRYSIEFKEDGTYVEKDVYNSSGSTNYYHGTWSYSGNKLTLIDTEEDNYTEV WTVTTMTENELVYELREKEKEDGTTYEYYEQHAFKR >gi|225935351|gb|ACGA01000041.1| GENE 41 48198 - 49439 1590 413 aa, chain - ## HITS:1 COG:no KEGG:BT_3233 NR:ns ## KEGG: BT_3233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 413 1 414 414 727 88.0 0 MRKILILFAVALIGFASCADSKQSMTVTVTNPLALERAGEMVEVPMSDVVAKLKLADTAQ IVVLDVDGQQVPYQVTYDEKVVFPVTVGGNSAVTYTIQPGTPAPFDVIACGKYYPERLDD VAWENDLGGFRAYGPALQARGERGFGYDLFTKYNTTEPILESLYAEELNPEKRAKIAELK KTDPKAASELQNAISYHIDHGYGMDCYAVGPTLGAGVAALMAGDTIIYPYCYRTQEILDN GPLRFTVKLEFNPLVVRGDSNVVETRVISLDAGSYLNKTVVSYTNLKEAMPVTTGLVLRE PDGAVVADAANGYITYVDPTTDRSGANGKIFVGAAFPAQVKEAKVVLLSEKEKKDRGGAD GHVLAISEYEPGAEYTYYWGSAWDKAAIKTPDAWNKYMAEYAQKLRTPLAVAY >gi|225935351|gb|ACGA01000041.1| GENE 42 49698 - 50501 1068 267 aa, chain - ## HITS:1 COG:CAC2607 KEGG:ns NR:ns ## COG: CAC2607 COG1028 # Protein_GI_number: 15895865 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to 
short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 1 267 1 267 267 427 78.0 1e-120 MNQFLNFSLEGKVALVTGASYGIGFAIASAFAEQGAKVCFNDINQELVDKGMAAYAAKGI KAHGYVCDVTDEPAVQAMVATIAKEVGTIDILVNNAGIIRRVPMHEMDAADFRRVIDIDL NAPFIVAKAVLPAMMEKRAGKIINICSMMSELGRETVSAYAAAKGGLKMLTRNICSEYGE YNIQCNGIGPGYIATPQTAPLREPQADGSRHPFDSFICAKTPAGRWLDPEELTGPAVFLA SEASNAVNGHILYVDGGILAYIGKQPK >gi|225935351|gb|ACGA01000041.1| GENE 43 50542 - 51384 1004 280 aa, chain - ## HITS:1 COG:YPO1725 KEGG:ns NR:ns ## COG: YPO1725 COG3717 # Protein_GI_number: 16121985 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Yersinia pestis # 6 280 2 278 278 296 49.0 2e-80 MKTNYEIRYAAHPEDAKSYDTTRIRRDFLIEKIFVPNEVNMVYSMYDRMVVGGALPVGEV LTLEAIDPLKAPFFLTRREMGIYNVGGPGIVKAGDAEFELDYKEALYLGSGDREVTFESK DAAHPAKFYFNSLTAHRNYPDRKVTKADAVVAEMGSLEGSNHRNINKMLVNQVLPTCQLQ MGMTELAPGSVWNTMPAHVHSRRMEAYFYFEIPEDHAICHFMGEVGETRHVWMKGDQAVL SPEWSIHSAAATHNYTFIWGMGGENLDYGDQDFSLITDLK >gi|225935351|gb|ACGA01000041.1| GENE 44 51814 - 55176 2408 1120 aa, chain + ## HITS:1 COG:no KEGG:PRU_1888 NR:ns ## KEGG: PRU_1888 # Name: not_defined # Def: TonB dependent receptor # Organism: P.ruminicola # Pathway: not_defined # 14 1120 4 1063 1063 763 41.0 0 MTIIKGRTEDTLIKIGVICIYMLLSINLTYAQIKNITGIIIDEESREPLTGASVTVKGSR QGCISDLDGKFTLQQTLPGQQMLVISYIGYQTIEVPARHNMMEIRLRPNVNELDEVVVQV AYGTALKRSITGAVSVVDSKQIEMRPVSSVISVLNGAVPGLQIIDGVGQPGIEAEVRIRG YSSVNGSNKPLYVVDGIPYTGWITDLNPADIESVSVLKDAASCALYGSRASNGVILITTK KAKKQGVSLQLDIRHGFSARGQGDYERMNANQFMETMWQGYRNQLISNGSSPEEATIATN NDIISKVGINIYNKADNALFDANGHLVSDARILDGYKDDLDWHSPYTRNGHRQEYNLSGE SGNEKNRIRFSLGYLNEDGYTRKSDFNRLSGSLNADFTPRPWLKTGLSLGGTHQKTNWDI GAAGSARSNNLSNAFYFARRIAPIYPVHLHYTEDVFASDGSLLHSKGDYILNEEGGKQYD DGSESRSESDAASNGRHLLWESEKNKLWNAANTLQGNAYVDVSFLRDFTFSLKGNISLRN IEDSQYGNAEIGAYKDTGFISKIDQEYKEYTLQQLLAWKHQFGRHFVEWMVGHENYDYKL NFDAIQKKNESFPGIDELSNFTTTTYSEGYKDTYRTEGHFTRARYDYNETYFAEASFRRD GSSIFHTDHRWGNFWSAGIGWMLSNEAFLKNVSWLNRLKLRVSYGQVGNDNFGSSNGLYQ 
WMSLYGSAVNGGEAAYYKVQNENPELKWETNSSLNIGLETRLFNRVNLSFEYYNKHSDDL LFKFIQPLSAGATDSSTGLSTVWRNIGDVSNKGWEFSADGDIIRNREWEWNVGLNLSKVK NKIGKLPDKDREEGITNGDFQKFKEGHSIYEFWLYQYAGVDQMTGRSVYLPDFNAYYIAG EDGKTPVNGEESTEGKNPIPTDSWVDINGQYYTGDPRYARKDWSGSSLPKINGSVTTSLR WKDLTLSALMIFSCGSKVFDQPYQTLTSVGVHSLTPDLLNAWTAIPEGMTPTSPNRLDPS GIPQVNLDATINGYNSQKASTRYIVSGDYLSIKNITLSYRLPATWSKRLTLGGIRLHAAI ENVALFSKRKGLNPVQTFDGIVNNYTSIARVFSFGVNINL >gi|225935351|gb|ACGA01000041.1| GENE 45 55181 - 56770 1468 529 aa, chain + ## HITS:1 COG:no KEGG:PRU_0155 NR:ns ## KEGG: PRU_0155 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 20 527 9 514 514 339 41.0 2e-91 MKKEAIHMKFMNHSIEFCLLLTLLCYSCASDYLDTAPTNQVSPADLFKDEAYAAYAVSGL EKLMKTTYPTNKLDGVSFNGEGSVKLMYAEYQGADMYCPRNNFYTIFNGASHAIPTSGYT EYMWHYYYKVISNANAIINYVDPEGSHKFKYIYAQALTYRAYCYLQLLQFYSPRWCDSQN GSADGVVLRLNTSSTGDCPLSSMLACYEQIYNDLNQAITYYQASGIARKEDENHKINVNA AYATYARAALTREDWSTAAHYAALARAGYPLMNADEYFDGFSTVNREWIWSIYDSEEESL GNSSLAARLAYNSSSTLVCTYPACINRELYDALPESDIRRGLFLDPLEYTYNTGGITNNG LGGSALTSYAQGLYPDLNTSAKIYAYMSFKFKCIDKVGAMPFNLFRSSEMYLIEAEANCH LTPPKEAEARQLLKELIRDSGRDPQYTCDKSGQALLDEIKFYRRIELWGEGFSWFDFKRR KDTIVRHTFEDGGNYMTNAAVTINPEDANNWMWVIPAKEYEYNNAINKQ >gi|225935351|gb|ACGA01000041.1| GENE 46 56798 - 58903 1629 701 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 446 700 52 310 328 143 32.0 1e-33 MKIVRYISICLLFTFIGSLLPTKAQDNPYKIDHSLYLLYEEATKHRNNATGLEIADTIYT KAIRINDKKAQCLALTIPVIYYFNNGKTELLEKAISRLQEISRKNNYLQYYYFGSIYKVN YLMNTGNTLRALQEAELTKEQAFADDYPYGISTCLRMMGNIYFARRENRTALDYYQQSLV YTQEKLPEQDISYIYWNISMLQQNLKQYEAAYENAEKGIKCAKTSTNKYACMLRKCTLLY ALDREEEFKSYYQECLKATEKHGETRRNELNKLKIYNYILNQQYDKAHALADSTSILHER IAFQANIYAKEQKYKDAYQALQKLQSLQDSLNQLIQTADLSELNVRIGNEQLKRKAQALQ LENTQLNLQKTTLELQQTKSQVEIEKMNAENNELLLRNRNLELAQFKAETERTQSLMVAK 
QAESERQLMILKFILIFFCFFAIALTLYLYLRRKSIRQLQEKNEELTIARDRAEQADKMK THFMQNMSHEIRTPLNAIVGFSQLLSNPDLPLEDEEKLEFSSLVQHNSELLTTLVNDILD LSALESGKYTMNLTSCHCNEMCRLILSTVMDRKAEGVKLYYTSEVADDFQITTDEQRLQQ VLINFLTNAEKHTEKGEIHLHCSLTEHPDRITFSVSDTGPGIPADKMDSVFERFKKLDEF KQGSGLGLNICRTIAERLHGEVKVDKSYTNGARFLLILPLK >gi|225935351|gb|ACGA01000041.1| GENE 47 59645 - 60937 712 430 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 2 427 8 414 418 278 36 9e-74 MNFVEELRWRGMLQDIMPGTEELLSKEQVTAYLGIDPTADSLHIGHLCGVMILRHFQRCG HKPLALIGGATGMIGDPSGKSAERNLLDEETLRHNQACIKNQLAKFLDFESDVPNRAELV NNYDWMKDFTFLDFVREVGKHITVNYMMAKDSVKRRLNGEARDGLSFTEFTYQLLQGYDF LHLYETKGCKLQMGGSDQWGNITTGAELIRRTNGGEVFALTCPLITKADGGKFGKTESGN IWLDPRYTSPYKFYQFWLNVSDSDAERYIKIFTSIEKEEIEALIAEHQEAPHLRLLQKRL AKEVTVMVHSEDDYNAAVDASNILFGNATSEALRKLDEDTLLAVFEGVPQFEISRDALAE GVKAVDLFVDNAAVFASKGEMRKLVQGGGVSLNKEKLAAFDEVITTADLLDEKYLLVQRG KKNYYLLIAK >gi|225935351|gb|ACGA01000041.1| GENE 48 61024 - 61716 475 230 aa, chain - ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 53 191 57 203 255 79 33.0 8e-15 MKKKVTDILDIHTHKQEVDSQGKSIINYPLLADSPLYMPLAENVEVAVGRGSYYSIGIHP WEVRENNVSQQLSFLQQQLQRKQFVAVGEAGLDKLAKASMELQLAVFKEQVKLSEKLGLP LIIHCVKAMEELLGVKKESRPQQPWIWHGFRGKPEQAVQLLKKGFYLSFGEYYPDETIRI VPDERLFLETDDSLLDIEDILCQAAGVRGVEVEALREVIRRNIQNVFFRA >gi|225935351|gb|ACGA01000041.1| GENE 49 62013 - 62402 173 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 [Algoriphagus sp. 
PR1] # 4 127 3 122 130 71 36 3e-11 MGIYTLCKAERLNSKILIGKMFEGGVSKSFSIFPIRVVYMPVEQGEAPASILISVSKRRF KRAVKRNRVKRQIREAYRKNKSLLVDELQRREQRLAVAFIYLSDELVATAELEEKMKIAL ARISEKLFS >gi|225935351|gb|ACGA01000041.1| GENE 50 62408 - 63154 581 248 aa, chain - ## HITS:1 COG:no KEGG:BF0075 NR:ns ## KEGG: BF0075 # Name: not_defined # Def: uroporphyrinogen-III synthase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 248 1 248 250 453 93.0 1e-126 MKIKKVLVSQPKPASEKSPYYDIAEKYGVKIDFRPFIKVESLSAKEFRQQKISILDHTAV IFTSRHAIDHFFTLCTELRVTIPETMKYFCVTEAVALYIQKYVQYRKRKIFFGATGKIED LIPSIVKHKTEKYLVPMSDVHNDDVKNLLDKNNIQHTEAVMYRTVSNDFTPDEEFDYDML VFFSPAGVTSLKKNFPDFNQKEIKIGTFGSTTAQAVRDAGLRLDLEAPTVQAPSMTAALD MFIKENNK >gi|225935351|gb|ACGA01000041.1| GENE 51 63159 - 63992 337 277 aa, chain - ## HITS:1 COG:no KEGG:BT_3225 NR:ns ## KEGG: BT_3225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 276 6 239 240 349 89.0 8e-95 MIGDTLSSQNTDTLSLLQQKQISPAKADSDSLQLADLHAVQEVDSGFEGTPISYSPRTDD AIALTLLVCFFLSSIALARGKKFLSQQVKDFVLHRERTSIFDSSTAADVRYLLVLVLQTC VLSGITFLNYFHDTCPALMDHVSSLLLLGIYVGFCLAYFLLKWLLYMFLGWTFFDKNKTN IWLESYSALIYYVGFALFPFVLFLVYFDLSLTNLVIIGSIILIFTKILMFYKWIKLFFHQ FSGLFLLILYFCALEIVPCLLLYQGMIQMNNILLIKF >gi|225935351|gb|ACGA01000041.1| GENE 52 63989 - 64576 533 195 aa, chain - ## HITS:1 COG:PA4923 KEGG:ns NR:ns ## COG: PA4923 COG1611 # Protein_GI_number: 15600116 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Pseudomonas aeruginosa # 4 189 3 185 195 166 46.0 3e-41 MNQINSVCVYSASSTKIDAVYFQAAETLGRLLAEHHIRLINGAGSIGLMCSVADAVLKNG GEVTGVIPRFMVEQNWHHTGLTELIEVESMHERKQKMANLSDGIIALPGGCGTLEELLEI ITWKQLGLYLNPIIILNTNRFFDPLLEMLEKAIDENFMRRQHGDIWKVAQTPEEAVQLLY ETPVWDISIRKFAAI >gi|225935351|gb|ACGA01000041.1| GENE 53 64656 - 66896 1185 746 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4389 NR:ns ## KEGG: Cpin_4389 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # 
Pathway: not_defined # 25 746 27 775 775 422 35.0 1e-116 MEKYLLLFILSCCGNFTICNAQTITVRGRVTDRKTGEVLSDANIGDLMSRTGVAANTYGL YSIRIKGGRCVLRCSMLGYVTQMDTLTLTANSVHNFALMPDNYQLSDVEVMGNQKAGGQL TLNQKDIQALPTLGSEPDVLKSLQYLPGVISGNEGSNNISVRGSNQWGNLILLDEAMVYN PNHALSFFSVFNNDAIQQVSLYKSYFPLKYGGRTSSVIDVKMREGNNQEKHRSATIGVVA SKIQLEGPIKKGKTSYLVAGRFAYPGAVLNVLKQFRGTKMSFYDVNAKINSTLNDRNRIF FSVYNGGDHTFFNQLVRSYGMNWGNTTATFRWNHVWTDRLSGNFSAIFSNYYYRYKSITD GMKFLWKSNIQSYQLKYDADYAVNNVLHIRSGLSAHVFTTMPGSISSWGDFSNVVPYRMD RRSLLDMAAYGEATYKISSVWRLSGGIRFPVFYTPKVGELKQKGYIMPEPRAELSYSPGT GNRLHAAFTQSSQNLHMLSNSSVGIPSDMWVPANRQLKPAVMKQVALGYEKSLEKGMYTF SLETYYRKTDHIVDFVDNANIFLNNQIETQLNTGYSKAYGAEFYVSKNRGRLTGWISYTL SHARNYIAALEDKEYPPVYDRPHSLKIFLNYEAGRKRRCAFAATFSYNSGMNLTLPIAHY RVNGTAFYIYSTRNGYRAPAFHELNLSMTCKTGKRGRLILSVMNVYNRKNVFTIYTSRDD YDFSDIGMHKMYLYGALPSISYQFTF >gi|225935351|gb|ACGA01000041.1| GENE 54 66896 - 67903 747 335 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716900|ref|ZP_04547381.1| ## NR: gi|237716900|ref|ZP_04547381.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 335 1 335 335 673 100.0 0 MRLYFNLNKITYLFLMQALYLCIACEEPFDVGMPIPEDAIVFDGVITDEPPPYYFVLSKP STKLKYPENRSFDRINDAEIVIVDLTTGIRDTLQNAKLTGYQDFRFYDHYRDKDVTVYMK WLPGETPGGLYVTNKIYGVENHTYELHIKYKGKEYTACERMVPKTPIDKIVMKRIDTGEG EPNETPCISFYNPPEEHNYYLLKTDFCSSKVLRVASVYNLYYGTTNSAGWPYSILDDEYL AENVIDYVVSEGEQFVLPNRPGFSYPVSDSIWIKMQSISENCYQVFDQMIKQIRSDGGTF SPRPTSVKSNIDNGAYGIFRVSAISEIYFYKKHRI >gi|225935351|gb|ACGA01000041.1| GENE 55 67919 - 68911 583 330 aa, chain - ## HITS:1 COG:AF1354 KEGG:ns NR:ns ## COG: AF1354 COG1651 # Protein_GI_number: 11498950 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Archaeoglobus fulgidus # 176 330 133 301 305 73 29.0 5e-13 MKILYFFLFSFLFIACNTPVINKNDVVTEVNGEQILLSELASQSKQEIFDILNTAYEIKS RVLAGLIKQKLLEDAAKEENMSLEEFIDWFVQQKICVGQDSLKKRYGFNTQSFYVKGELI PLVKGSLEEKLSYQQKLRSRIVQALVDSLYQKADIKRFLYPPKQPECVVRDLCVYYRGNL DSPVSFIVASDYNCERCVQFEKTLSKLYDNYKERVKFGFVHFADAPSLAALACEAAGEQK QFWTFHDTIFNYSGVADSAFIYNLAKSKRLNMTEFDAYLHSSDKYKKMDKVINQLVERGL MATPTIIINDRLVYVTNSYEELSRLLEYEL >gi|225935351|gb|ACGA01000041.1| GENE 56 68985 - 69680 293 231 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172708|ref|ZP_05759120.1| ## NR: gi|260172708|ref|ZP_05759120.1| hypothetical protein BacD2_12646 [Bacteroides sp. D2] # 1 231 1 231 231 470 100.0 1e-131 MGRMEVAKEKVMRTAYGATSPRVGVFRVGSCGNYKYLSIKIDCENGNSKTSVSGNVGDTY VDGGDNMRLEFCMVDAGFRYPGGVFLFEDVPLETMTLVRYHDTEDGGHNGVWSDDPNYYD VMHISGMSKLDSNATLAWNINRNMTKWGDIPMGPAGINYGVIAPADMASGNLYFDDEDHN NKNWAQVWMGHQHQSDPHGTFYGVQLDKNTRYHVCLNTDKTNFTKLVRSVI >gi|225935351|gb|ACGA01000041.1| GENE 57 69683 - 69886 252 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237716902|ref|ZP_04547383.1| ## NR: gi|237716902|ref|ZP_04547383.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 67 1 67 303 131 100.0 2e-29 MKEKLLVLGTMVALCCSCAGSSVLLDTESELNGLNNEYELGTLEAGDNGLKCETSGDEDS IQVLKDN >gi|225935351|gb|ACGA01000041.1| GENE 58 70056 - 70226 80 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294644389|ref|ZP_06722152.1| ## NR: gi|294644389|ref|ZP_06722152.1| hypothetical protein CW1_0710 [Bacteroides ovatus SD CC 2a] # 1 56 1 56 56 91 100.0 1e-17 MEMKLLFFLLSAILFQSSISIYAMEQIKRGWVALKDVTQMRSHSIFVFPNNFLGSS >gi|225935351|gb|ACGA01000041.1| GENE 59 70816 - 71187 429 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172710|ref|ZP_05759122.1| ## NR: gi|260172710|ref|ZP_05759122.1| hypothetical protein BacD2_12656 [Bacteroides sp. D2] # 1 123 1 123 123 195 100.0 7e-49 MKAKFLPFLLLTVLLQVSMFSYADEMVTARQIKLRTKTQVQHRSIPVSPDAFIENSLLTI DLLSTVPTVTVTIKDAETGEVVYTSTDLNVDKVYIDLAGEEKGKYTLEIQLPKEAFIGDF ELD >gi|225935351|gb|ACGA01000041.1| GENE 60 71193 - 71393 62 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGLKFRSKNTQMKQKKRFLCGNNGNSKLFKSKKSKNESIYLTDYQHQKHVQKFVYPFYSL YDFYVS >gi|225935351|gb|ACGA01000041.1| GENE 61 71314 - 72981 1003 555 aa, chain - ## HITS:1 COG:no KEGG:BT_3570 NR:ns ## KEGG: BT_3570 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 547 20 556 566 267 33.0 8e-70 MKTRCCYPKRYLLLFFSLLVSVGSILMSVSLSSCSSSVKSPLLLSADSLMEIYPDSALSI LESISSPQKLPRADRALYALLLTQARHKNYIVLGDDSLIKTAVEYYGDKKKSVRAAKAHY YWGATYLEKGYTSFAVEEYLAAIRLMPIRNEFLAMIYDNLAECYEKDGLNNVAMEAYRTA YQILEGESAQVYPLRGIAGIFFSQNEKDSALYYYQQALDCALVVQDSSMIGAIYHDFAMI YNEKKDYIRADQYISKAIMVMGHDEAANAYLLKAEIMFNLNKLDSASYFFNKEMGSLDIN GRAVCYDGMYQIAKKRGEWKTATENMDAYKKLYDSIQFMTDNEELNRLMDKHQLEEHKRL LSERTRTLIFILITAFFSLMIICVFYFMWNDRKRKKYYIALQHELTQKRVDTMLLKEEEV SESNKEHIDKKRSELTEQQIQLCISVLKTTDCYEQLEILEKATPKQLLAMRSLRKEIRST ISNAFVDVMVNLKERCPALTGDDVFYCVLSLLCCSKTVMMELMDATSDALKTRKNRIKNK MDTQIFERVFGADNQ >gi|225935351|gb|ACGA01000041.1| GENE 62 73162 - 74454 1550 430 aa, chain - ## HITS:1 COG:TM1658 KEGG:ns NR:ns ## COG: TM1658 COG0192 # Protein_GI_number: 15644406 # 
Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Thermotoga maritima # 1 430 1 395 395 427 53.0 1e-119 MGYLFTSESVSEGHPDKVADQISDAVLDKLLAYDPSSKVACETLVTTGQVVLAGEVKTKA YVDLQLIAREVIKKIGYTKGEYMFESNSCGVLSAIHEQSPDINRGVERQDPMEQGAGDQG MMFGYATNETENYMPLSLDLAHRILQVLADIRREGKVMTYLRPDAKSQVTIEYDDNGTPV RIDTIVVSTQHDDFIQPEDDSQAAQLKADEEMLSIIRRDVIEILMPRVIASIHHDKVLAL FNDKIIYHVNPTGKFVIGGPHGDTGLTGRKIIVDTYGGKGAHGGGAFSGKDPSKVDRSAA YAARHIAKNMVAAGVADEMLVQVSYAIGVARPINIFVDTYGRSHVNMTDGEIARVIDQLF DLRPKAIEERLKLRNPIYQETAAYGHMGREPQVITKKFSSRYEGDKTVEVELFTWEKLDY VDKIKAAFGL >gi|225935351|gb|ACGA01000041.1| GENE 63 74685 - 74873 271 62 aa, chain + ## HITS:1 COG:no KEGG:BT_3217 NR:ns ## KEGG: BT_3217 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 1 62 63 78 77.0 9e-14 MILNKKFKTLLVLALYIGFTIAIYAIVCHFIDKPFQEIHLLYAVLIGCIAYLPVFIAEKK KK >gi|225935351|gb|ACGA01000041.1| GENE 64 74848 - 75357 278 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 162 111 264 278 111 39 2e-23 CNADFGQIMAKVYLGLGTNLGDKEQNLRDAVQKIEEQVGKIVSLSAFYITAPWGFSSDNS FLNAAVCVDTELAPIDVLQRTQAIEQELGRTKKSVNGIYSDRLIDIDLLLYGDLILSTTS PSGAKLILPHPLMAERDFVMKPLAEIAPGLVHPVLGKTMKELTSSFSPQ >gi|225935351|gb|ACGA01000041.1| GENE 65 75333 - 76391 1202 352 aa, chain - ## HITS:1 COG:SP1416 KEGG:ns NR:ns ## COG: SP1416 COG0809 # Protein_GI_number: 15901270 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Streptococcus pneumoniae TIGR4 # 1 349 1 341 342 296 43.0 3e-80 MKLSQFKFKLPEDKIALHPMKYRDESRLMVLHRNTGKIEHKMFKDVLDYFDDKDVFIFND TKVFPARLYGNKEKTGARIEVFLLRELNEELRLWDVLVDPARKIRIGNKLYFGPDDSMVA EVIDNTTSRGRTLRFLYDGPHDEFKKALYSLGETPLPHSIINRPVEPEDAERFQSIFAKN EGAVTAPTASLHFSRELMKRLEIKGVDFAYITLHAGLGNFRDIDVEDLTKHKMDSEQMFV 
NEMAVKTVNRAKDNGRNVCAVGTTVMRAIESAVSTDGHLKEFEGWTNKFIFPPYEFTVAN SMISNFHMPLSTLLMIVAAFGGYDQVMDAYHVALKEGYRFGTYGDAMLILDK >gi|225935351|gb|ACGA01000041.1| GENE 66 76409 - 77131 643 240 aa, chain - ## HITS:1 COG:MT2862.1 KEGG:ns NR:ns ## COG: MT2862.1 COG0130 # Protein_GI_number: 15842331 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Mycobacterium tuberculosis CDC1551 # 8 216 8 210 298 174 45.0 1e-43 MNFKKGEVLFFNKPFGWTSFKVVGHARYHICRRIGVKKLKVGHAGTLDPLATGVMILCTG KATKRIEEFQYHTKEYVATLRLGATTPSYDLEHEIDATYPTEHITRELVEEVLTHFIGAI DQVPPAFSACMVDGKRAYELARKGEEVELKAKQLVIDEIELLECRLDDPEPMIQIRVVCS KGTYIRALARDIGEALHSGAHLTGLIRTRVGDVRLEDCLNPEHFKEWIDGQEIENEEENN >gi|225935351|gb|ACGA01000041.1| GENE 67 77131 - 77985 743 284 aa, chain - ## HITS:1 COG:aq_2195 KEGG:ns NR:ns ## COG: aq_2195 COG1968 # Protein_GI_number: 15607126 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Aquifex aeolicus # 4 268 1 247 256 178 42.0 1e-44 MGDLTTFETIIIAIVEGLTEFLPVSSTGHMIITQNILGVESTEFVKAFTVIIQFGAILSV VCLYWKRFFRLNHTPAPAGASALKCFLHKFDFYWKLLVAFIPAAILGFLFSDKIDEMLES VAIVAVMLVIGGIFMLFCDKIFSKGSEDTVLTEKKAFNIGLFQCIAMIPGVSRSMATIVG GMSQKLTRKDAAEFSFFLAVPTMFAATGYKVLKLFLDGGTEILVNNMPALIIGNVVAFVV ALLAIKFFISFVTKYGFKAFGWYRIIVGGTILVMLLLGYNLEIG >gi|225935351|gb|ACGA01000041.1| GENE 68 77988 - 78230 271 80 aa, chain - ## HITS:1 COG:no KEGG:BT_3211 NR:ns ## KEGG: BT_3211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 130 96.0 1e-29 MSDKQKFAFDKVNFILLAIGMAIVIIGFLLMTGPTSSETLFEPDIFSVRRIKVAPVVCLF GFISMIYAVLRKPKTQKTEE >gi|225935351|gb|ACGA01000041.1| GENE 69 78344 - 79225 580 293 aa, chain - ## HITS:1 COG:L2 KEGG:ns NR:ns ## COG: L2 COG2177 # Protein_GI_number: 15672955 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Lactococcus lactis # 23 285 29 311 311 82 26.0 1e-15 
MGNQNKYKSGSIFDMQFITSSISTTLVLLLLGLVVFFVLTAHNLSVYVRENISFSVLISD DMKEADILKLQKKLNQEPFVKQSEYISKKQALKEQTEAMGTDPEEFLGYNPFTASIEIKL HSDYANSDSIAKIEKLIKKNTNIQDVLYRKELIDAVNDNIRNISLVLLALAVVLTFISFA LINNTIRLAIYSKRFLIHTMKLVGASWGFIRGPFLRKNVWSGILAAIVADSILMGTAYWA VTYEQELLQVITPEVMLIVCASVLAFGIVITWLCAYFSMNKYLRMKANSLYYI >gi|225935351|gb|ACGA01000041.1| GENE 70 79229 - 80128 807 299 aa, chain - ## HITS:1 COG:SPCC162.05 KEGG:ns NR:ns ## COG: SPCC162.05 COG2227 # Protein_GI_number: 19075739 # Func_class: H Coenzyme transport and metabolism # Function: 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase # Organism: Schizosaccharomyces pombe # 104 196 80 180 271 60 35.0 5e-09 MEKLSINACPVCGGTHLKRVMTCTDFYASGEQFELHSCEDCGFTFTQGVPVEAEIGKYYE TPDYISHTDTRKGAMNNIYHYVRSYMLGRKARLVAKEAHRKTGRLLDIGTGTGYFSDTMV RRGWKVEAVEKSPQAREFAKTHFELDVKPESALKEFAPASFDVITLWHVMEHLESLNETW ETLRELLTEKGVLIVAVPNCSSYDAKRYGEYWAAYDVPRHLWHFTPGTIQQLASRHGFIM AARHPMPFDAFYVSMLSEKHRGSSCSFLKGMFAGTLAWFNALGRKERSSSMIYVFRKKR >gi|225935351|gb|ACGA01000041.1| GENE 71 80636 - 80893 265 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295085889|emb|CBK67412.1| ## NR: gi|295085889|emb|CBK67412.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 85 1 85 85 129 100.0 4e-29 MISVITSGMESRVYLDNEGIDDMIHYLTYLREKNETTYDLIEGNELDRLDEDLVLDGFEH ITHLELIYINAFDRDDGTVRIIEED >gi|225935351|gb|ACGA01000041.1| GENE 72 80898 - 81128 206 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172723|ref|ZP_05759135.1| ## NR: gi|260172723|ref|ZP_05759135.1| hypothetical protein BacD2_12721 [Bacteroides sp. D2] # 1 76 1 76 76 127 100.0 2e-28 MDKHVDAGKLKVVVVDRETGLNKTYSNNKSGYSYNLDEGGPTRIGRKKMIEPPHIDVNYP KPKPKNIEKKKLFVNY >gi|225935351|gb|ACGA01000041.1| GENE 73 81371 - 81553 200 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172724|ref|ZP_05759136.1| ## NR: gi|260172724|ref|ZP_05759136.1| hypothetical protein BacD2_12726 [Bacteroides sp. 
D2] # 1 60 1 60 60 102 100.0 8e-21 MNKYEGMTVNECLYVSGLIDEFYKAVEKKDIEAVISILKRVELSNNQIEPILVTLNLPYI >gi|225935351|gb|ACGA01000041.1| GENE 74 81695 - 81970 262 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172725|ref|ZP_05759137.1| ## NR: gi|260172725|ref|ZP_05759137.1| hypothetical protein BacD2_12731 [Bacteroides sp. D2] # 1 91 1 91 91 171 100.0 2e-41 MIYSNETEFAGSWLLENGNVKDDNVSMRIKDLIVNYLSEIAMTDDGWQRLYQDPNDGRYW ELSYPHSDWEGGGPPSLKNISQLEAKNKYKI >gi|225935351|gb|ACGA01000041.1| GENE 75 81970 - 86280 2171 1436 aa, chain - ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 1004 1200 37 231 440 65 28.0 6e-10 MIKQLRIIILLILLGTGTCLTAQNNTVTPLFAGKIGYHQYFGSDILLKDVVSPSGGNGTY SISWEYRYTDSNEWSEITRLAQSDYEGGCLVSSQYFNTYRAEGVYIRRKVVSGTLVAYSS DIRVKRLTEPLSFTNRGGEAYIDLTKHCPSLPGDFICTELWLGENTSDWCSFLYTGWENT DHPATVKIEVMRNFGDNRSGQIELLYRGFSRNSNFSIDMSNYQKYKVTVKVEQNWNFTDD VFLLLLPTQEEPINPGETSYGVWVDNDGMRDDLFRWWQYSHDQIHWVDVFDVPDLRTVYQ PGPLYQTTYFRMMANDGVDVYTSNIVSIKVREKLNLSSKNYIHTRTYTTPDGMEFRDDVE YFDGLGRSIENVKVKHSVDDTDLLSITEYDKMGRLFREWLPGVSEPGNCGAFVEAEELRN SSVLSNGNDVMPYTKTLYENSAIGKVESRYGPGMEWHNRGKGVNSRYLTNVSKDTTLLDL CDSLVCLRYTVPTDRTRLFVSCAGTYASGELYVVKEVDEDNHVRYEFTDKEGRIVLIRSI NGNEGMDDTYYIYDIFGNLRVVLPPMCSANFCSGSLTDSASVLSDYAFLYKYDTRNRCIG KKQPGCEWVELVYDRCDRLIFTQNGKERANNEWAFHLEDLSGRSVLTGVYRGTPDFQNCA SEDIYAEFTSNTDLPCYGYTIYGLQNHYLDIQQVYFYDTYEYQNLLPDSLSSLWYVFDNN YGEHYENSQSSPHCKEQLTGSIVRILGTEEYQYASYYYDYYHNLIQERKTTSGGNKKVNK SLFNFLKQPVSVRSEYEGGVLNKLYSYDRAGRLIHERHCVVDKDTIDLLYGYDKLGRLKR LERIHGKDSVITENAYNIRSWLTGIDSNAGFVEQLHYVDGLGIPCYNGNISSMTWEADGM KRGYKFMYDGRSRLLRAVYGEGEGLNVNQDRFNEEITGYDKNGNILGIKRCGKRSEDEYG LIDNLVMSYNGNQLKAVSDSVTGSAHADSFEFKDGADLPVEYYYDSNGNLIQDLNKKISE IQYNYLNLPSRIEFEDGGVISYLYDAKGTMFRTTHVIAGKTTTTDYFDDAIYENGVLTTL LTELGYISLVDGKYHYYLKDHQGNNRVVVGQNGSVEEVNHYYPFGGTFASTSSVQPYKYN 
GKELDRQGGLDWYDYGARHYDVALGRWHVVDPMAEKYYLWSPYAYCLGNPVRFIDPTGMV TEIPPGFWASFGKGFSQPFVSFWNAVTHPVETVTNVVNSIKSVTPTEAGFGVGEQVLRSP MSPLGSTFNQLDVAQAIAYDKANGTMTSAEVIGNQWGDIAFDAVTTAVGAGVGRGTGLYK GQTLVTNSIPQKVARVIPDGIKTSMLGAPNQSDVFVTAAKDIKGLNAMQIANKLTIPQSS SGFKVIEFRTPMNGLASPINRTNPGFVGKGRTLGGAREFTIPNQQIPKDAIIKIVK >gi|225935351|gb|ACGA01000041.1| GENE 76 86277 - 89282 1429 1001 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4144 NR:ns ## KEGG: Cpin_4144 # Name: not_defined # Def: YD repeat protein # Organism: C.pinensis # Pathway: not_defined # 736 1000 831 1087 1092 105 30.0 7e-21 MQKTIYIQKIFLLIVVTFSYSIAMAQQFKSPDVMPSSPQSQLIENFYEKYFNNDISARGE KTIDINLYDIEVKGLNIPINISYNTCGVKYKQPSGDVGVGWTLSPAARISRTIIGYPDEA MKRVPTLFDDLNKLNNHIDVDKYLSAFSSGIMGNVSYFSVASWDQGCDIFTFSTLSESGH FVFPDPSTLDKVEILEKINCQITPILTGGLLTGFQIVDGHGITYNYGGNASEHVHEDAYR TDISSGTTGWALKSIVTAKNDTVRFDYVGYTDRSITEPEYKTLSVNDSYAYRMYVQDEGD VRVQQEYELSTALNFNITLPNYTTFLLSGIEANGIKVNFFRASADGIYSNAVDSITVVDK QSGDIIKQINLSYSGMNKRPHFFLDRLEMDGQKYYLSYYTPSDLGLGEYDNTHYTSDLWG YYLYSSTSLDKVDLPVISREFVDDSFVFFKPVPGASLYDDVIKKLSRIIPNVNFKNDSDN TKAHLFSLKKIYFPTGGSQEFIYEPNEYLLNSARVKGAGIRLSNVKTYDDNGNLHEQRYK YGENEDGIGVPSFHLSCETFADVKQNRIQYRTSPGLYDYVSFCSRIYSPVAFGDANISSN FFVEYPQVHVYNKQEKGENKISYLFNLPQPLYIDVMPENTNEKYVGSPDLLNNRGWKSYI HRYQYGIKTNIASCRYFNSDGRLVKKEDYTYCRSAGLVMDGGIKMEQVVISVYQDTYDFA HTIELSRPSLFKYMTYSVMSGRDLLSKKVVSEYFDGGTVVSEEGYVYDDKNQLIKVIRTS QNGGGDCRIKNIKYPYNYTDAISNLMVERNMLDYPVETITEYNGKEVYREQIKYGLYQNL LLPAYVDISHDGPLGLKREMTYVSYDKKGNILEYVDKNNNHTCYIWGYNYKYPIAEIQGT TYDSVKSVLGHENNDLSYLQGYGEYQLERELNKLRMAFRDNHPVFIKTFLYKPYVGVTSI TTSDGFTTRYGYDQFGHLKESYLERNGRKQLIQEFKYNYKL >gi|225935351|gb|ACGA01000041.1| GENE 77 89381 - 89482 65 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAMKKSVWDMILKVVIAVASALAGVLGANAMNL >gi|225935351|gb|ACGA01000041.1| GENE 78 89482 - 90009 371 175 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: 
Haemophilus influenzae # 46 142 2 98 116 96 45.0 2e-20 MRIINLIVVHCSATRGDCTLSPEDLDRLHRRRGFNGTGYHYYIRKDGTVHLTRPIERVGA HAKGFNAHSIGICYEGGLDCRERPADTRTPAQRATLRQLVGQLQEKFPGCRVCGHRDLSP DLNRNGEIESEEWIKSCPCFEVAKEFKELEEFAIKTENTEEHRVSQHIKEQKGGK >gi|225935351|gb|ACGA01000041.1| GENE 79 90094 - 90597 577 167 aa, chain - ## HITS:1 COG:no KEGG:BT_3199 NR:ns ## KEGG: BT_3199 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 166 168 236 82.0 3e-61 MPVLYKPFQSVLEDKNKKKLFHPRVIYTANVSTTQLAKEIAAYSSLSTGDVKNTLDNLVT VAAQHLQASESVTLDGFGTFRMVMKSNGKGVELPEKVSAAQASLTVRFLPNYTKNPDRTT ATRSLVTGAKCVRFDLADTSASGGGNSGEPDDGGGSGGSGEAPDPAA >gi|225935351|gb|ACGA01000041.1| GENE 80 90720 - 91610 462 296 aa, chain - ## HITS:1 COG:no KEGG:BT_3198 NR:ns ## KEGG: BT_3198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 296 1 302 302 454 75.0 1e-126 MGRIKQGLDYFPMSTSFMHDRVVRRVMKREGDAAFATLVETLSYIYAGKGYYISAGDEFY DELTDSLYNTDMDDVKRIIALSVECGLFDAGLFRQYGILTSADIQRQYLFITKRRSSSLI EPDYCLLEAEELASYHPSQSSKSCADDTDNKTVDAVTPTADPVTSTTDSATSEAEMSTSG TQNKEKQIKTNQNKLNHLSDSPQGENGGGKILKRRKAMTQDDIDNLQPPSDGTQRNFGGL LENLHSYKIPPSEQYAIILKSNFGAIGNPVWKGFSNIRGSNGKIKLPGHYLLSVIN >gi|225935351|gb|ACGA01000041.1| GENE 81 91598 - 91918 197 106 aa, chain - ## HITS:1 COG:no KEGG:BT_3197 NR:ns ## KEGG: BT_3197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 106 1 106 106 148 77.0 4e-35 METIMNQSLHICGCCKRKLPQEAFYMNQRTRASDNYCKECRKANTRRHRNQDKNISFENK PLSYPVITEVKDYALRMLLIRHARQVVADSIRRKQEKEKLHALWEE >gi|225935351|gb|ACGA01000041.1| GENE 82 92360 - 93073 717 237 aa, chain + ## HITS:1 COG:no KEGG:BT_3196 NR:ns ## KEGG: BT_3196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 237 1 236 236 388 78.0 1e-107 MDTFLDVTGIVKRAKQALNFKNDSELAEYLGVSRATVSNWGARNSIDFRLLLDKFGDKVD 
YNWLLLGKGNPKHQSRYCESELVQGEVEIIHNPKTPEPIDDRSVTLYDITAAANLKTLFT NKKQYALGKILIPNISVCDGAVYVNGDSMYPILKSGDIIGYKEISSFDNVIYGEIYLVSF MIDGDEYLAVKYVNRSDKEGYLKLVSYNTHHEPMDIPFAVINAMAIVKFSIRRHMMM >gi|225935351|gb|ACGA01000041.1| GENE 83 93270 - 94652 472 460 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase [Haliangium ochraceum DSM 14365] # 22 452 1 450 461 186 29 6e-46 MNELTGADFKSATEMNDDNKKLFIETYGCQMNVADSEVIASVMQMAGYSVAETLEEADAV FMNTCSIRDNAEQKILNRLEFFHSLKKKKKHLIVGVLGCMAERVKDDLITNHHVDLVVGP DAYLTLPDLIAAVETGEKAINVELSTTETYRDVIPSRICGNHISGFVSIMRGCNNFCTYC IVPYTRGRERSRDVESILNEVADLVAKGYKEVTLLGQNVNSYRFERPTGEVVTFPMLLRT VAEAAPGVRIRFTTSHPKDMSDETLEVIAQVPNVCKHIHLPVQSGSSRILKLMNRKYTRE WYLDRVAAIKRIIPDCGLTTDIFSGFHSETEEDHALSLSLMEECGYDAAFMFKYSERPGT YASKHLEDNVSEEVKVRRLNEIIALQNRLSAESNQRCIGKTYEVLVEGVSKRSRDQLFGR TEQNRVVVFDRGTHRVGDFVNVRVTEASSATLKGEEVVSN >gi|225935351|gb|ACGA01000041.1| GENE 84 94989 - 95471 329 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172735|ref|ZP_05759147.1| ## NR: gi|260172735|ref|ZP_05759147.1| hypothetical protein BacD2_12781 [Bacteroides sp. 
D2] # 1 160 1 160 160 249 100.0 4e-65 MEDFLKFLLIAGVILVGIFKEVNKNSKSKKAQNKRPVPPMSSPTEIDPDVVPMPEAWGRG KSLDELFPPASSYEPTVAKPAAKPISKPVAQPASKSKKKKEEVSVAASLANSAAQDERNT RQGSHYNTPHESPNEQDFTIHSAEEARRAIIWGEILQRKY >gi|225935351|gb|ACGA01000041.1| GENE 85 95623 - 97122 1513 499 aa, chain + ## HITS:1 COG:ygfH KEGG:ns NR:ns ## COG: ygfH COG0427 # Protein_GI_number: 16130821 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Escherichia coli K12 # 3 489 5 490 492 503 49.0 1e-142 MSFNRISAAEAASLIKHGYNIGLSGFTPAGTAKAVTAELAKIAEAEHAKGNPFQVGIFTG ASTGESCDGVLSRAKAIRYRAPYTTNADFRKAVNNGEIAYNDIHLSQMAQEVRYGFMGKV NVAIIEACEVTPDGKIYLTAAGGISPTICRLADQIIVELNSAHSKSGMGMHDVYEPLDPP YRREIPIYKPSDRIGLPYVQVDPKKIIGVVETNWPDEARSFAAADPLTDKIGQNVADFLA ADMKRGIIPSSFLPLQSGVGNIANAVLGALGRDKTIPAFEMYTEVIQNSVIGLIREGRIK FGSACSLTVTNDCLEGIYNDMDFFRDKLVLRPSEISNNPEVVRRLGIISINTAIEVDLYG NVNSTHISGTKMMNGIGGSGDFTRNAYISIFTCPSVAKEGKISAIVPMVSHLDHSEHSVN IVITEQGVADLRGKSPKERAQAIIENCAHPDYKELLWDYLKLAGNRAQTPHAIHAALGMH AELAKSGDMKNTNWAEYAK >gi|225935351|gb|ACGA01000041.1| GENE 86 97345 - 98751 865 468 aa, chain - ## HITS:1 COG:BS_pbp KEGG:ns NR:ns ## COG: BS_pbp COG2027 # Protein_GI_number: 16078896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Bacillus subtilis # 36 461 51 484 491 158 29.0 2e-38 MKKSLLLVALSVCLLPLWGQQNFSRIDSLIKKMLPEASEIGISVYDLTAKKSLYNYRAEK LSRPASTMKLLTAITALSRPGANEPFRTEVWHDGVIEHDTLQGNLYVVGGFDPEFDSQSM DSLIEEVITFPFSVINGQVYGDVSMKDSLYWGSGWAWDDTPAGYQPYLSPLMFCKGTVQV SVVPSTVQGDTASVSCQPLSSYYTVTNQTKTRTSSAGKFSFTRDWLTNGNNLLISGNVTS IRKDDVNIYDSPRFFMHTFLERLRGKGITTPQSYGFAELPRDSVRVERMACWNTSVQKVL NQLMKESDNLNAEAFLCRLGAQATGKKQVAAEDGIVEIMKLIRRLGHDPKDYKIADGCGL SNYNYLSPALLVDFLKYAYSQTEVFQMLYKSLPVGGVDGTLKFRMKGTPAFRNVHAKTGS FTAINALAGYLKMKNGHEVAFAIMNQNVLSAAKARAFQDKVCEAIIGK >gi|225935351|gb|ACGA01000041.1| GENE 87 98798 - 100000 934 400 aa, chain - ## HITS:1 COG:no KEGG:PRU_2053 NR:ns ## KEGG: PRU_2053 # Name: 
not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 393 1 357 484 153 29.0 1e-35 MRKYIITPVLLAIVCMVMCRCTSNRKASVAANEYLIQGELANLPDSIVIGLYEEDGNILN CVLRDTLMNGQFSFRDTVSTTRKMLIMSDNRGFPGTWLEVWIAPGEYIEIKGQDKLVKTW EVVSDVPEQQEENRFTACAMAQQKELMQYMAAEYDWQRMMFIDHPGDREFESQAWAKIDS IRKLSMPLQQEIWKKEMEYMEEAPVSPVWMDRLLFFASMMKYKTIMPYEEEVKKLYARMS EADKQTGDGQEITAYVYPPATVGVGDMMVDGDLYDANDSLRHISEFKGKFILLDFWSSGC GPCVESIPEMEKVMDLYKDKMTVISISEDPKARWKEYIKTKGMGGNQWNELRKGRTGLAL NYQVRGIPHYVLIAPDGKIQDVWSGYGAGSLLGQVEKNLK >gi|225935351|gb|ACGA01000041.1| GENE 88 100065 - 101408 667 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 2 447 3 448 458 261 33 2e-68 MKYQVIIIGGGPAGYTAAEAAGKAGLSVLLIEKNSLGGVCLNEGCIPTKTLLYSAKTYDS AKHASKYAVNIPEVSFDLPKIIARKSKVVRKLVLGVKAKLTANNVTIVSGEAQIIDKNTV CCGEETYEGENLILCTGSETFIPPIPGVDAVNYWTHRDALDSKELPASLVIVGGGVIGME FASFFNSLGVQVTVVEMMDEILGGMDKELSALLRAEYAKRGIKFLLGTKVIGLSQTVEGA VVSYENAEGNSSVIAEKLLMSVGRRPVTKGFGLENLNLEKTERGIIKVNEKMQTSVSGVY VCGDLTGFSLLAHTAVREAEVAVHSILGKEDTMSYRAIPGVVYTNPEIAGVGETEESAST KGIDYQVIKLPMAYSGRFVAENEGVNGVCKVLLDEQQRVIGAHVLGNPASEIITLAGTAI ELGLTAAQWKKIVFPHPTVGEIFREVL >gi|225935351|gb|ACGA01000041.1| GENE 89 101450 - 101902 384 150 aa, chain - ## HITS:1 COG:no KEGG:BT_3185 NR:ns ## KEGG: BT_3185 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 131 196 82.0 2e-49 MNRIFHARIAWYQYFLLVVLTVNAVGALWCKYILVAVLFMLMLIVVIEQIIHTAYTLTTD GNLEVSRGRFIRKKVIPLSEITTVRKYHSMKFGSFSVTDYILIEYGKGKFVSVMPVKEQE FAELLEKRMFEKKIKDLEAIDSVDPQGDEP >gi|225935351|gb|ACGA01000041.1| GENE 90 102009 - 103580 1598 523 aa, chain + ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 15595958 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 519 6 520 538 483 46.0 1e-136 
MVRKFDFLVIGSGIAGMSFALKVAHKGKVALICKSGLEEANTYFAQGGVASVTNLLVDNF DKHIEDTMIAGDWISNRAAVEKVVREAPAQIEELIKWGVDFDKNEKGEFDLHREGGHSEF RILHHKDNTGAEIQDSLIKAVQRHPNITVIENQFAIEILTQHHLGVTVTRQTPDIKCYGA YILDPKTGKVDTYLAKVTLMATGGVGAVYKTTTNPLVATGDGIAMVYRAKGTVKDMEFVQ FHPTALYHPGDRPSFLITEAMRGYGGVLRTMDGKEFMQKYDPRLSLAPRDIVARAIDNEM KNRGDDHVYLDVTHKDPEETKKHFPNIYEKCLSLGIDITKDYIPVAPAAHYLCGGILVDL DGQSSIERLYAVGECSCTGLHGGNRLASNSLIEAVVYADAAAKHSLQAVDQYSYNEDIPE WNDEGTRSPEEMVLITQSMKEVNQIMSTYVGIVRSDLRLKRAWDRLDIIYEETESLFKRS VASREICELRNMVNVGYLIMRQAMERKESRGLHYTIDYPHVKK >gi|225935351|gb|ACGA01000041.1| GENE 91 103691 - 104269 868 192 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 3 192 2 195 195 201 57.0 9e-52 MAKSIKGTQTEKNLLTSFAGESQARMRYTYFASVAKKEGYEQIAAIFTETADQEKEHAKR MFKFLEGGMVEITASYPAGVIGTTLENLRAAAAGEHEEWSLDYPHFADVAEQEGFPMIAA MYRNISIAEKGHEERYLAFVNNIENMTVFAKEGEVVWQCRNCGYITVGKEAPEVCPACLH PQAYFEVKKENY >gi|225935351|gb|ACGA01000041.1| GENE 92 104519 - 106198 1669 559 aa, chain + ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Chlamydia trachomatis # 8 559 13 560 567 409 43.0 1e-114 MKLFEFKPKLVSCLKNYSKETFMADLMAGIIVGIVALPLAIAFGIASGVSPEKGIITAII AGFIISLLGGSKVQIGGPTGAFIVIIYGIIQQYGEAGLIVATLMAGVLLILLGVFKLGAV IKFIPYPIIVGFTSGIAVTIFTTQIADIFGLSFGDEKVPGDFVGKWMIYFRHFDTINWWN TIVSIVSIIIIAITPKFSKKIPGSLIAIIVVTVAVYLMKTFGGIDCIQTIGDRFTIKSEL PDAVVPALDWEAIRELFPVAITIAVLGAIESLLSATVADGVIGDRHDSNTELIAQGAANI IAPLFGGIPATGAIARTMTNINNGGKTPIAGIIHAVILLLILLFLMPLAQYIPMACLAGV LVIVSYNMSGWRVFRALLKNPKSDVTVLLITFFLTVIFDLTVAIEVGLIIACVLFMKRVM ETTEISVITDEIDPNKESDLAVHEENLMIPKGIEVYEINGPYFFGIATKFEETMAQLGDR PEVRIIRMRKVPFIDSTGIHNLTTLCEMSQKEKTTVILSGVNENVHNVLEKAGFYELLGK ENICPNINVALERAKSLIK 
>gi|225935351|gb|ACGA01000041.1| GENE 93 106307 - 107671 398 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 43 452 37 455 460 157 29 2e-37 MKKNIITLVAISLSLSGCGIYTKYKPATVVPDHLYGEEVVAEDTASLGNMDWRELFTDPY LQSLIEVGLQTNTDYQSAQLRVEQAQAALMSAKLAFLPAFALSPQGTVNSFDTHKATQAY SLPVTASWELDVFGRMRNAKKQAKALYAQSEDYRQAVRTQLITGIANTYYTLLMLDEQLD LSRQTETAWKETVASTRALMNAGLANESAVSQMEAAYYQVQSSVLDLQQQISQVENSLAL LLAETPRNYERGMLAKQQFPTDLSIGIPVRMLSSRPDVRSAERTLEAAFYGTNAARSAFY PSITLSGSAGWTNSAGSLILNPGKFLASAVGSLTQPLFNRGQVVAQYRIARAQQEEAALG FQQTLLNAGSEVNDALIAYQTSQGKRLLLDKQIASLQTALQSTSLLMEHGNTTYLEVLTA RQSLLSVQLSQTANHFTEIQSLINLYRALGGGQE >gi|225935351|gb|ACGA01000041.1| GENE 94 107684 - 110863 3003 1059 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1042 5 1035 1051 731 39.0 0 MKGNMFIKRPVMAISISVLILAIGLISLFTLPVEQYPDIAPPTVYVTASYTGADAEAVMN SVIMPLEESINGVEDMMYISSSASNAGLAIIQVYFKQGTDPDMAAVNVQNRVAKAQGLLP AEVTKVGVSTMKRQTSFLQIGALVCTDGRYDQTFLANYLDINVIPQIKRIEGVGDVMELG DTYSMRIWLKPERMAQYGLVPSDITAVLGEQNIEAPTGSLGESSQNVFQFTMKYRGRLKS VEEFQNTVVRSREDGSILRLKDVAEVELGTMTYSFRSEMDSQPAVLYMIFQTAGSNATAV NKEITAQIERMEKNLPEGTEFVTMMSSNDFLFASIHNVVETLIIAIILVILVVYFFLQDL KSTLIPSISIIVSLVGTFACLVAAGFSLNILTLFALVLAIGTVVDDAIVVVEAVQSKFDA GYKSPYLATKDAMGDVTMAIVSCTCVFMAVFIPVTFMGGTSGVFYTQFGITMATAVGISM ISALTLCPALCAIMMRPSDGTKSAKSINGRVRAAYNASFNAVLGKYKRGVMFFIRHRWMV WTLLAVAVALLVYLMSTTKTGLVPQEDQGVIMVNVSISPGSTLEETTKVMDRLENILKDT PEIEHYARVAGYGLISGQGTSYGTIIIRLKDWSERKGKEHSSDAVVSRLNAQFQSLKEAQ VFSFQPAMIPGYGMGNSLELNLQDMTGGDLATFYEAAVQFLGALNQRPEVAMAYTSYAIN FPQISVEVDAAKCKRAGISPSAVLDAVGSYCGGAYISNYNQYGKVYRVMMQASPEYRLDE QALNNMFVRNGTEMAPVSQFVTLKQVLGPETANRFNLYSTITANVNPADGYSSGEVQKVI EEVAAQSLPAGYGYEYGGMAREEASSGGAQTVFIYAICVFLIYLILACLYESFLVPFAVI FSVPFGLMGSFLFAKFLGLENNIYLQTGVIMLIGLLAKTAILITEYAIERRRKGMGIVES 
AYSAAQVRLRPILMTVLTMIFGMLPLMFSSGAGANGNSSLGTGVVGGMVVGTLALLFVVP VFYIIFEFLQEKIRKPMEEEADVQVLLEKEKSEVERERN >gi|225935351|gb|ACGA01000041.1| GENE 95 110878 - 111984 1360 368 aa, chain - ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Campylobacter jejuni # 15 368 12 362 367 138 28.0 2e-32 MITVSKKWIRLIGIVGCTVWMASCKQASDAGVKSSSYAVMQIEAVDKEFSSSYSATIRGR QDIDIYPQVSGTIEKLCVTEGQKVRRGQLLFVIDQVPYKAALKTATANVEAARAGLGTAE LTYKSNKELYAQKVVSEFSLKTAENTYLTAKAQLSQAEAQEISARNNLSYTEVKSPSDGV VGALPYRVGALVGANMPYPLTTVSDNSDMYVYFSMTENQLLALTRQYGDMDEALKNMPEV ELHLNDNSVYDKKGVIESISGVIDRQTGTVVARVVFPNESRLLHSGASGTVVVPSIYKDC IAIPQTATVRMQDKIIVYKVVDGKAVSTLITVAGINDGREYVVLSGLKAGDEIVSEGAGL VREGTQVK >gi|225935351|gb|ACGA01000041.1| GENE 96 112208 - 113122 373 304 aa, chain + ## HITS:1 COG:no KEGG:BT_2939 NR:ns ## KEGG: BT_2939 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 512 81.0 1e-144 MEKENIRPIDIKDFKDSQHALDYIDNDFAIVNSLEGIPDNNDIVRLECFIIAVCIEGCIQ LDINYRTYQLKAGDLLLGLPNTVISHTMLSPKYKVRLAGFSTRFLQRIIKMEKETWNTAV HIHNNPVKSVSDEEDNSVFGFYRDLIVAKINDEPHCYHREVMQHLFSALFCEMLGQLHKE IECSDKSDRSRENIKQVNYTLRKFMELLSRDKGIHRSVSYFANELCYTPKHFSKVIKQAC GRTPSDLINETAMEQIKYRLKHSDKSIKEIAEEFNFPNQSFFGKYVKAYLGTSPASYRSR KEDI >gi|225935351|gb|ACGA01000041.1| GENE 97 113661 - 115700 1638 679 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172750|ref|ZP_05759162.1| ## NR: gi|260172750|ref|ZP_05759162.1| hypothetical protein BacD2_12856 [Bacteroides sp. 
D2] # 1 679 4 682 682 1307 100.0 0 MKNRLFAIGAMILMMTSCMQDELISSDKGKKPATAGSCLTLVGLSSPQTRVSIGDKTGDV YPVLWSEGDALGVFSRTAGTDINNVQSLLSDESIGQNSGVFTSDDVKMAEEGTTELLIYY PYRASAELAENDNKITSTLSVEQEQSRPGDSRHIGKYGFAFAKATVSGPDMLAKFTLNHA MAYVKFSISSQELSTYKLKSVSLYDKETKTPLSGVFTADLDTDELTYGTDVKPYATVSLT TPELLASAQDIYLTTYPADLSGKEVYIVITLENDQQTVTIPILKEGKQLKANAVNTIAVN NLKLSDNSCEWYEPVETRLLAGGWAYGESNCLLTNISTSGVSNTMSVKARGNFMEVEEPK YAKTILGCDLNVNHKMIAVNGSTTDISPIGSDYNITINTYKVSGGYDGGCGQVAIYGADQ TTVIWSFIIWMTPTPAEHPYGNTGYVVLDRNLGTYMTCEGDNWKQNGVYFQWGRPTPVGW SGTVGTNIPTEATNVRFSIENPRALLYTNNVENTKSDWYLGAWTGARTDRKDDFWGNPNE SSTYLNPSDGHKSIYDPCPKGYRVVSPRVLDEIEQKGEFVKQSATAVFKYCYDGTNYAYW PLAGCKWGSNGGNNGNNTGLDASKGAACYWSNSSASSYGNDKDQGATSLYYKVSDKTWTH SSGRSHAFSVRCMKDTENR >gi|225935351|gb|ACGA01000041.1| GENE 98 115730 - 116203 471 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172751|ref|ZP_05759163.1| ## NR: gi|260172751|ref|ZP_05759163.1| hypothetical protein BacD2_12861 [Bacteroides sp. D2] # 1 157 1 157 157 313 100.0 2e-84 MIMKNLHISMAFIFVLLLISASCSDKDDSAPRVIPTMNLTVSEIADNTALITSEQQTGTT FGAKVIDFYPVADIGFDYNIEVKLVKFVEENGEPVSLPYTRKITEGLRPGVNYISAIIAY NAEGRAVCSAFQTWKASGTEGAWSDGGSAGDLEENEW >gi|225935351|gb|ACGA01000041.1| GENE 99 116216 - 117397 1047 393 aa, chain - ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 389 7 393 402 402 52.0 1e-110 MRILLISILGLLTLASCTKQEPMEALIDRVFTVAEQQYTAMDTHLTEKTLPRTLSADGEF VPSNIYWWCSGFYPGSLWYIYEYTRKDAVKTLAEKNTLKLDSIQYVTRDHDVGFQLNCSY GNAFRLTGNEAYKQVLYQGAKSLSTRFNPAAGVIRSWDFVRKGCDWKFPVIIDNMMNLEL LLSMSKAYADDSLQNIACTHANTTIQHHFRDDYSTYHLVDYDPETGAVRGKQTVQGFSDD SSWSRGQAWALYSYTMMFRLTGYQNYLLQAGHIADMLLRRLPADGIPYWDFDAPVEEQTY RDASAAAIMASAFIELSRYIPGTEAKESYLAMAEKQLRTLASKEYLAEPGTNECFILKHS VGALPDKSEVDVPLTYADYYFLEALLRYKNLQK >gi|225935351|gb|ACGA01000041.1| GENE 100 117412 - 119343 1545 643 aa, chain - ## HITS:1 COG:no KEGG:Phep_2654 NR:ns ## KEGG: Phep_2654 # Name: not_defined # Def: 
hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 48 640 34 628 847 474 41.0 1e-132 MKTILFKTIIIAFFVGTAVSCSDEDENRFRPGSTHEKPNPTEPEGGLDYSKLTADNHPRL LMNAEAFTALKAKVDANSSANLTLLHNTIMGVCNSKGMNATALTYKLDASNKRILDVSRD ALLRIFTCAYAYRMTGDAKYLTKAETDINAVCNFPDWNSKRHFLDVGEMATAVAFGYDWL YNELSVATRTKAANALLKFAFQQAQDKNWNLNFYEATNNWNQVCNGGLVCAALASYENNP SEAKDMIEKALESNKPALEVMYSPDGNYPEGSGYWCYGTLYQVLMLAALNSTLGTDNGLS DTPGFSKTAEYMLYMTGLNSKFFNYSDCAPSSTAALASWWFADKYSNPSLLYNELKMLKN GEYASCAENRLLPMIMAFANNLNLDAISAPSNKLWSGKGETPVVMVHTDWTYTDTDKYLG IKGGKAGSSHGHMDAGSFVYDAYGVRWSMDFGLQSYTTLESVLAGLGGNLWDMGQNSMRW DVFRLNNLNHSTISINDARHRVNGAATLTTTINTATELGATFDLTEVVSDQAASATRTVK IVNDKDLVVMDEIKARTDKSAKVRWCMVTPAVPTVESNRIVLTNGSKVMYLTASGSVKPT YKQWSTTSENSYDQANPGTYMVGFEATVTANQTATFTTTLSPK >gi|225935351|gb|ACGA01000041.1| GENE 101 119368 - 121182 1648 604 aa, chain - ## HITS:1 COG:no KEGG:Slin_2455 NR:ns ## KEGG: Slin_2455 # Name: not_defined # Def: heparinase II/III family protein # Organism: S.linguale # Pathway: not_defined # 28 602 34 609 628 597 48.0 1e-169 MKKILLLLLIFVSGCTGVVAQQFDYGKIAPHPRLLLPEGGEEAIRKAIAEYPPLATVHQR IMELCDRTLTEQPVERIKEGKRLLAISRIALKRIYYLSYAYRMTGDQKYALRAEQEMLAV SHFTDWNPTHFLDVGEMVMALAIGYDWLYDSLQPDTRRVVREAIIAKGFDAAKNTRHAWF YTAKNNWNSVCNSGLAYGALALFEEIPEVSKGIIEKCMETNPKAMVGYGPDGGYPEGFGY WGYGTSFQVMLIAALESAFGTDNGLSQAPGFMESARFMQYMTAPSGDCFCFSDSPVEAEC NMMMFWFAGKAKDLSLLWIERQYLDRPDMQFAEDRLLPSLLVFCSQLDLNRIGKPKKNFW FNRGDTPVFIYRGGWDSKKDTYLGVKGGSPSTSHAHMDAGSFIFERDGVRWAMDLGMQSY ITLESKGVDLWNMSQNGQRWEVFRLSNIAHNTLTINGERHLVESNAPITRTFESKKQKGA EVDLSSVFANSVKKAVRTVILDQKDHLEVTDRLETGDKEATVSWIMVTPAEAKITGKNRM ELTKDGQRMLLTVDADTEVEMKTWSNVPPHEYDFRNPGTIRVGFETVIPANRTAQLKVRL IPLK >gi|225935351|gb|ACGA01000041.1| GENE 102 121203 - 123014 1657 603 aa, chain - ## HITS:1 COG:no KEGG:BF0333 NR:ns ## KEGG: BF0333 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 601 4 614 615 307 33.0 8e-82 MKKILMILTMALVATSCMDILDVAPEDQIASENMWTTEELADKGMAGLYFPFYATQLSST 
QLRRADGLNRQGIEAMSFATDYYSNNYPVELLSLATKPANDFQVWYEWKFCYTIIHACND AIANLHKADMSADKLARYQCEARFLRAWAYNRLNMLYQGVPVYLEPINNEDCTRGQSSVD EVWQVILDDLTYCINNPDFPNNTLNENYGRPSKGAAYALRGMVYMWKKQYKEAGNDFKEV ETCGYGLWTGEYADFFKYENEKDKEMIFSLQFSEETGYCDNIQQMTGARDTYDGWTEIKP SADFVDYYKNADGSDFKWSEVDGLEDWDLLTPQQREIFFCRDGLESMSSQKNALIKRVGE DIYQKYYLNSGNEARIKKAYDNRDPRLQQTVVTPYVPVDCYKPNYAGDANQIGKQLRWPL KEQGTNGGDFWLDKRTSAFYCYRKYNEFEKGRLISRSRCHTDWPLIRYTDVLLQYAEALA QTDQLGEAIRLVNKVRTRAHMPALTEGGSGPCAVNGKEDMLERIRYERRVEFCLEGINFF DEVRWGTYKETKFQGKDVNGGKSWWGDMVEYNWYYTDYMWPWTAPIVETQKNPNLTKRSG WAY >gi|225935351|gb|ACGA01000041.1| GENE 103 123044 - 126316 3020 1090 aa, chain - ## HITS:1 COG:no KEGG:BF0387 NR:ns ## KEGG: BF0387 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 24 1090 50 1117 1117 800 42.0 0 MKKKHYLLFLLASLFLLTDVVWAQTSATVSGVVIDENGETLPGVSVVEVGTTNGVLTDLN GHYTLKTTSAKPSVSFSYIGYQTTTLPLNGRTKLDVQMKVETKILDEVVVVGYGVQKKVN LTGSVTSINFADQTEGRPIMSVSSALSGLAAGMNVTQASGQPGSDGATIRVRGNGTFNTN SPLVLVDGIEWSMDNVNPNDIESISVLKDAASTAIYGTRAANGVILITTKSGKGKPQISY SYSGVVQMPYNNLSFVSDYARYMGLVNEACENVNTKGIFSQESIDRWRAASADPNGLNEY GVPNYVAYPNTDWFDEVFDTGYSQEHNLSVSGSSEKVKYMLSLGYLDNQGVMNRWNLDSS TQKINFRTNLEAKIVKWMTVGTRLYGQKQDYGMANISNGFKYLYQTTPAVYPGEPNYWGR PALASEESSNANNIFGQMAGATGFNTVWRLNASVYGIITPYKGLNIEGTFNYSPTFTDKS SYSRQNGYWDYVTDQRVSESALENASITNTSARTWRQSAEILVRYQTTIKKDHDLGALLG YSAQEYYSKSFAVSRKGATDWTLNELSTYETLVSSSSSAPAKWGLLSYFGRVNYGYKGRY LFEANLRADASSRFGVNQRWGYFPSFSGGWRISEESFMQGASDYLSNLKLRVSWGKTGNN STGNYDWQANYATGNVVIEGEGTKGLVRKKLSNDKLHWESTATTDIGLDFGFFNNRLTGE IDYYNKYTSDILYHPELYLSMGVVGSAPENLGEVRNRGVEFTLNWNDRIGKDFEYRVGMN FSFNANKVMKFKGDLQKYWTYDAQGNKVSYVNNFSDVSESGFGGYICEGRQLGETYMYKV YRGSGEGYTGGAVDIHAGPKDGMIRTKEDMVWVQAMIDSGYSFGGMKTIAKDQLWYGDIL YADSNGDMNYGDTNDRDFSGHTSVPKFNLGFNCAFSYKNIDFSMLWSGAFGHYLNWNTDY YNSTLVSHGYGIIEHIADNHYFFDPSNPDDPRTNQSGKYPRLTYGTTYNNRIQSDWNEYK ADYFKLKNIQIGYTLPQRISSKFFVSKLRAFVSMDNILTITSYPGLDPEIGTEIGYPLMR QISFGGQITF >gi|225935351|gb|ACGA01000041.1| GENE 104 126583 - 128016 1094 477 aa, chain + 
## HITS:1 COG:no KEGG:BT_3171 NR:ns ## KEGG: BT_3171 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 796 79.0 0 MKKHFLLLILSLLFLPLAQGKVKLPAMMGDHMVLQQNSSVKLWGWADGKKVTVTTSWNNR TYQASTDKDGAWLVKVDTPKGSYTPYSITISDGTPVTLSDILIGEVWICSGQSNMEMRMM GNAAQPIDNSLETLLNSGNYRDRIRFITVPRTNDTERRTDFEKRKWEVSSPETTIDCSAA AYFFARQLTESLHLPVGLVINSWGGSAIEAWIDEPTLKTVEGMDVEAAKDPKRGVHQRLE CLYNSMLWPVKNFTAKGFLWYQGESNISNYQFYAPMMTAMVQLWRNVWEAPDMPFYYVQI APYKYENSSNTGAALLREAQMEALKTIPNSGMIPTTDIGDEFCIHPSPKDVVGLRLATLA LTKTYSIGRLPSNGPMMTKVDYEGNKAIVTFNNAPAGLFPTFAQLEGFEIAGADKKFYPA KAKIIGRTNTVEVSSEEVAQPVAVRYAFRNYVGNITLRNTFGLSAFPFRTDTWDDVK >gi|225935351|gb|ACGA01000041.1| GENE 105 128184 - 132152 2633 1322 aa, chain + ## HITS:1 COG:BS_yycG KEGG:ns NR:ns ## COG: BS_yycG COG5002 # Protein_GI_number: 16081092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 814 1046 369 603 611 126 28.0 3e-28 MKRSILSICILLASVSLVFANIYKKYDVRSGLSGNCVRSILQDSIGYMWFATQDGLNRFN GIEFTNYGHSSENGGNSYMNIVTICRHQDNNQIWVASTEKLYLFDSQEEKFSVFDKQTED GVTVNSVFGMAYDNDGQLWIGTTNGLFVYNEKKGTLRQYLHSLSDPHSLPDNHVWVIYND SFGTIWIGTRNGLAKYNQRTDNFTGYISEETSFGRPACNEIISLMESSQGVLWAGTWYGG LARFNKETGQFRYYFGEGDTLTIPRIRTLFQRTANSFYLGSDDGLYTFNTATGECLPTDD EQNKESIYACYQDREGGIWIGSYFSGVSYLSPKHKDIEWYYPNGTNSLSGNVISQFCEDP DGNIWIATEDGGLNLFDPRTRKFKNHLLKSSNLNIGYHNIHALLYNEGKLWIGSFSRGLY ILDIQTGKTKNYRHNRTNPHSIPNDHIYSIYQTKDGSIYLGTLSGFCQYDPNSDSFRTLE PLSHIFIYDMVEDQYGDLWLASKRDGIWRYNRQTGKLHNYRNDPANPASPCSNWVIRVYI DHKQNLWFCTEGGGICRYHYQEDRFENFSTKENLPNNIIYGILDDQSGNYWLSSNRGLIR YEPQNKRAQLYTIEDRLQSNQFNFRSSMQARDGKFYFGGVNGFNSFYPFKLSINKVKPTA SISAVYMHSPDDKVSLSKRIPALSGQVTIPYQVVSFDITFESLSYVAPSKNLYAYKLDGI HKEWIYTDKHNVSFLNLPPGEYTFRVKSSNNDEYWSNDDCCLHIEVLPPPWKTIYAKIFY LLIACGLAYFLVQLYLRKQQAVKIRKMKEMEQIKNQELFQSKITFFTQVAHEIKTPVSLI KAPLEAILETHEWNSEVESNLSVIRKNTNRLMELIKQLLDFRKVDKEGYTLSFNEVDINR MIEDIIDRFRAISLTGISFSVSLPKEHLQYNVDQEALTKIVSNLLTNAMKYARTRIMVIM 
DEHLSSEGRTLSLCVRDDGPGIPQEECSKVFEPFYQVGNTGNNGSGVGIGLSLVKLLVEK HKGKVYINPGYTEGCEVCVEIPYLEKSISVSPSITSMPDKVPASEEEGEPAGYSLLVVED TTDMLEFLAKNLGNTYTIHTATNGKEALECLETTTVDLIISDIVMPHMDGFELLKSIRSD NMLCHIPFILLSALDSIDSKIAGLDYGADAYIEKPFSLSHMKATINNLLENRRMLFNHFT TVPNMSYDQTLMNKTDVKWLNTINEIITRNFTNEEFTIDKMAEEMAISRSNLQRKLKGLT GMPPNDYIRLIRLKTAGELLREGEYRINEVCYIVGFNNPSYFARCFQKQFGILPKDYVKK GS >gi|225935351|gb|ACGA01000041.1| GENE 106 132496 - 133269 1027 257 aa, chain - ## HITS:1 COG:Cj1244 KEGG:ns NR:ns ## COG: Cj1244 COG0731 # Protein_GI_number: 15792568 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Campylobacter jejuni # 9 247 4 229 300 99 32.0 6e-21 MTIIFPSPIFGPIHSRRLGVSLGINLLPEDGKVCSFDCIYCECGFNAERRTKKLLPTREE VRTALEEKLKDMQANGSAPDVLTFAGNGEPTAHPYFPEIIEDTLALRDKYFPKAKVSVLS NSTFIDRPAVFEALNKIDNNILKLDTVDEEYIHLLDRPNGKYSVKKIIEKMKEFKGNCIV QTMFLKGGYQGKDMDNTSDKYVLPWIEAVKEIAPRQVMIYTIDRETPDHDLQKATHEELD RIVALLEKEGIPATASY >gi|225935351|gb|ACGA01000041.1| GENE 107 133350 - 135407 1252 685 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1424 NR:ns ## KEGG: Cpin_1424 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 25 611 61 755 853 123 21.0 3e-26 MNKRKYVIIILLCMAAWLQAQDSVKAFDEFFVTGMKKIEGVFPVYVAEKEVYLEIPREYI GREIEVRGQIDRGFDLLNRPVTGLGVVRIMSPDKATICFQKPFYTERILDEKSVYQRSFS LSNIQPAGDSYPVVAYSKEQGAIIRITKYLINGNDWFSYNHGFIRSLVPELSEVTKIHPF EGGVSFTVRRYHGVEAERYMFSSSAIILPEGSIPLEVTCVVRLLSQKKDQIRLADHRIPY QTLTFKDYSQDPYCMVEDSLILRWDMSKPLTFYVDTLFPKEYFQVVKEGILVWNTAFRKA GIRDALQVKYADSKIIPAEQRAFVSYDLMMPGIKSDFTCHPRTGEILSCRVNIGHGFQKG KLDDYLLSCGASDPRIIADRYAKEVEKELLQNEITGEIGHLLGLRGNLSKNSCGTTVKVG EDACRAIYFGYNPLKGSRNCYDEREQLRRWIDKNLPDGTHSLPPSDYAAKVSNLQIILNQ LDKIVYKGGKRDKGSSLTDIYRKAIRQYGSYLMGIAETVGSSQPADAQHQAMLELDNYLF HPVEKMECTYVKENLLEARNNILYPELVKMFKHLLSIETISALRLQALHSNQKGYSDDDF FRDLYKGLFNDFDPSAAVSYEQMDIQLICLDAWLDIIQENAKHTSTTKRLKDELHSLYNR LEKLSTIHPQTEVRDMYTLLMGRIK >gi|225935351|gb|ACGA01000041.1| GENE 108 135433 - 137781 1454 782 aa, chain - ## HITS:1 
COG:no KEGG:Cpin_6153 NR:ns ## KEGG: Cpin_6153 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 40 743 51 759 812 334 31.0 1e-89 MLKGIISAIFLAVACTFLYGQQPNMATFLKEGAPVEVIPGMFTTYRSGKHIYWEIPDSLI GREFAVTTTILTAPARPDRDMEKKFGYSGDMIGPVFFSFRKQGDELWIVDPLNERIIENP AGIYAKIAAQRGNSRLYKRLPVKAKTQGSSLIEIGEVLKDFPLFTLDIVSFDLLVGSRLK EKDCIKEIKGYDNRLLVHISRAYRSSSIGMPGKPVSPSYIGDWDTGVCIKLLSKKPLEAV SANSGAYFSIGKECFQGDQPAVRKAVIKRWRLEIKPEDKEKYMKGELVEPIQPIIFYIDR NTPEKYIGCIIEAVRDWRPAFEQAGFKNAIDARLAPTAEEDPDFSIYDSTYPFISWKISG QNNAYGPTPCESRSGEIIACHVGIFSSVLNLEQKWYFAQCGANDPQAWNIEFPDSLQLEQ IKQVFIHEVGHTLGLEHNFLGSSHYSIDQLRDNDFLSRYSIGSSIMDYVRYNYALRPQDK VDLKNRRVRVGEYDKWAIEWGYRIFPGKDASEREENRERWNQEKQKDPSLHFSGGIDVRA QAEDLGNDHVIVNTQGIENLKYLCEHPDVWHVTDKTSLYVLQGRYEAVLNHYKQWVQHVL SHLGGKRLAEADDENIYIPEKADYNKKVMNFIQTYVLQPPAWMFNKSFTHKLEIDASQEF DRFYEELMSEIIRSLRKVEESENACEDMLSVNEFLESMHEGLFVEWTDNVPVSEAKHKIQ TLYVNKLCDLLDRSEKITSSKLLVSVMQALNRIKKEGLDYSNRVAEPVAKKRAMFLVDSI IF >gi|225935351|gb|ACGA01000041.1| GENE 109 137797 - 138570 508 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172762|ref|ZP_05759174.1| ## NR: gi|260172762|ref|ZP_05759174.1| hypothetical protein BacD2_12916 [Bacteroides sp. 
D2] # 1 257 1 257 257 486 100.0 1e-136 MVRMEEAFLFFDSESHRFLSASIPGYYDYMMNQATQNIRNYGTKWSDQKPVSTYSMSDES NLFDPDVIDPSLEIHDIVTGGNWGNFAYAIASPRNGKELTVFKFSAQDEDPICAAQYTIA LPSEVNVETAKFAASYAYTANLIFMTSGNKLYRIDLDRGRAIELYTYETDPSAQIVALKF KDSESVREEDDDEETGEYKEKLGMSLGLGINTADKGVVVELQLTVAGDVSREENSICVYE DPEQLIGKIVDISYNYE >gi|225935351|gb|ACGA01000041.1| GENE 110 138536 - 139465 658 309 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293369292|ref|ZP_06615879.1| ## NR: gi|293369292|ref|ZP_06615879.1| hypothetical protein CUY_3865 [Bacteroides ovatus SD CMC 3f] # 1 309 1 309 555 627 99.0 1e-178 MRKYIIYIASFVGSALLQTGCYDDKGDYDYHDVNTMAIVIPETKVRMPKEEAVEISIIPE ISQTLEQNEENLVFQWKKTIEGKKAGSDRLSDYKDYSVGKECKVTVEPYESENIGLMLVI TDKKNGTTWYQIGEVAIIRPLNPCWFVLQEKEGKGVLGAIEGTPEGYYVYPDVFKSELNQ SFPLEGKPLAVSARKNYGDSFLSSMLGFFGFKVSPALMVVTDRGLALLTPSTLITRYPSN KILFEPAGKGEPLNIEFYKMSTHGELFVNSGKAYCAPMDGFCVPFSVKKESEFPAISAYG SYGGGFFVF >gi|225935351|gb|ACGA01000041.1| GENE 111 139482 - 140162 580 226 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172764|ref|ZP_05759176.1| ## NR: gi|260172764|ref|ZP_05759176.1| hypothetical protein BacD2_12926 [Bacteroides sp. 
D2] # 1 226 1 226 226 433 100.0 1e-120 MKRIIYFVLAILVCGIYVGCSENEIDLYDQTPRINFYSSTHVRTLVDTDYVKIDDPYAVD SFTVRIQGDLLKENRDFCVKVTPNSDYQNSVDVLLESKYTYAELDTVCQVFYYKINRPKV ESGRNVYGCYLEFDLDNPLHQFDKGLVEKNQEVLNVRWELKPDEWEDWIFGSYSDNKYMF IMDVCQRVWNDLEDEDVDKVKQAYKEYIEAGNPPILGEDGDEIDYE >gi|225935351|gb|ACGA01000041.1| GENE 112 140172 - 141695 1403 507 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 6 504 5 487 488 238 34.0 3e-61 MKQIIYILLATTMLGLMSCSDWLDVSPKTSIPTDKQFESESGFKDALTGIYLKLGTQTLY AGDLTYAYLDELAGLYSDYPGYNTNAVFDQSIVFDYENMFLSKKNGIYSTMYNIIANINN FLEYVDKNKDVLVTERYYETMKGEALGLRAFLHFDLLRMFGPVYKEHPASKAIPYRTTFD KDATPVLAANEVVDAILKDLNDAEKLLKENDPLDFFTDQTDEDFTKKNSFLVNREFRMNL YAVKAMLARVYCYKGDAESKGLATEYAKEVIAASKYFALYKSQTASNYNSIRYAEQIFGI TVNEFSNLLIGNYMDMENTNNQQRFYLDGDKFKFFYETADAGNTDWRKNTEMFEVVNGAS QTDVFCRKYNQKPLNGGYAYSGANAVPLIRLPEMYYIVAECASSASESADALNTVRFARG ISYSDEIITTGYDDLDTTSEEDKNQTKRINEIMKEYRKEYFAEGQLFYFLKAHNYSTYHG CGIETMTEAHYQMTLPDDEYIFGNNSK >gi|225935351|gb|ACGA01000041.1| GENE 113 141712 - 145062 3056 1116 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5147 NR:ns ## KEGG: Cpin_5147 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 6 1116 45 1164 1164 650 34.0 0 MKRNQAQKFIVLFLLFLFAYSWRVEAQTGKEKQITMEFKNEGLPSIFKRFEKVSGYKVLF IYDEISSYTSTGKVEKATVDEALKVIIGKNPLKYHIDGQFINITQKDSKKFFSQVKGKVL SEEDGLPVVGATIIVEDPSNIRTITDNNGNFQLSDVPKDSRVRISYVGLETQFLHPSSYM SVVMKSDTRALDEVVVTGMFNRKKEGFTGSAVTIKGEDLKKYSTNNVAKAIAAVAPGLRI VDNINMGSNPNGLPDMRMRGSANMDMGTQSVDFNSTSNDVLAVQGEYETYANQPLLIMDG FEISIQTLADMDPDRVASIVLLKDAAATAIYGSRAANGVIVIESKTPKPGRIWVTYGGEL RIEAPDMTGYNLMNAREKIDAELKSGLYTYGGETVEKWQLYQSKLREVLAGVNTYWLDKP LQTAFQQRHTVTLEGGDEALRYRMYVGYNSSPGVMKDSKRDVLTGSLDFQYRLKKVLLKN SITLDNSVANESPWGSFSEYTRLNPYLRPYGENGEIQKRLDNFEGVGGESSYLNPMYNTT FNSKDQSKNFTVRELFKVEYNPTNELRFEGAFNLSKSVGHRDIFRPAQHTLFDNVTDPTL RGDYRRSQSEAVSWGIDLTGSWNKQLKDHYLTANARMSVLENNSETYGNYVTGFPNDNMD 
NLLFGKKYNEKVTGDERTTRSIGWVAAGGYSYKYKYSFDFNIRLDGSSQFGKNNRWAPFW STGLRWDLKKENFMKDVSFISDFILRGTYGTTGSQGFDPYQAHGYYTYSNLLLPYYSSDA TGSEILAMHNESLKWQTTKSTNLALELGFFDQRLTARVEYYRKITDNMVTSISLAPSLGF GSYPENLGKIENKGWEISLSAIPYKNTAKQAYWTITVNGSHNTDKLLEISEAMKHRNDMN ASNLTDTPLPRYEEGESLSRIWVVRSLGIDPASGDEILLKRNGEMTSAVNWSANDVVPIG NTEPTWQGYINSSFTYKGWGADVSFRYQFGGQVYNQTLLDKVENANLKYNVDRRVSQLRW AKPGDKAQFRTLTPSGWETKATSRFIMDENIFQGSSLSVYYRMDRTNTKFISHWGLSSAK VTFNMEDFFYWSTVKRERGLYYPYSRQFTFALNVAF >gi|225935351|gb|ACGA01000041.1| GENE 114 145195 - 146370 829 391 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 190 376 125 309 331 86 33.0 1e-16 MDEKNINISKSEEELLEIMENRSRITAGQLHNLEEDEECLQACSDLTEVVIEMQKEQNLL AIDVRKELADFRNKHSKNDRRKNTRILWASVTGVAAAVAIILVLRAMMISSQPEIIKVFQ ANHIAREVTLQVNDEKEIKPLKEVVESLSSYSTAQLSSKEIDYSRALLQTETKEVGKQKV QIHRLSIPRGETFKVVLSEGTEVFLNSDSRLAYPTVFKGKERVVSLEGEAYFKVAKDAAH PFIVKSGNLQIRVLGTEFNVRSYSPTDVRVTLITGKVAVSDTCGVHSVEMVPGQSVQLSS DGTFAVNEVDIESFLYWKEGFFYFDDVALVDMMKEIGRWYNIDIEFRNSKIMDLRMHFFA NRHQDIFHLIELLNRMERIHAYFEAGKLIIE >gi|225935351|gb|ACGA01000041.1| GENE 115 146410 - 146913 442 167 aa, chain - ## HITS:1 COG:no KEGG:BVU_0609 NR:ns ## KEGG: BVU_0609 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 7 166 15 174 175 86 28.0 4e-16 MDDRATFDKMFNEWYAQFVYFAYYFINDAEVCRDIVSDAFEYLWRNYEKIEEATAKTYLY TIIRTRCIDYLRKQNIHEEYVEFTAQLTDKMIEGDSQNSDSRVLRIREAMKKLTPYNYHI LEACYIHNKKYKEVAEELNVSVAAIHKNIVKALRILREELGQKGNRN >gi|225935351|gb|ACGA01000041.1| GENE 116 146957 - 149005 1019 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3167 NR:ns ## KEGG: BT_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1164 83.0 0 
MKTIQLLRLISIINSLLIIPACSAQNPSESLLEDILEDLSVNNGTDNSVNTPNWENELEE LSNRMQEPVNLNVATREQLEQFPFLSDIQIEHLLAYIYIHGQMKTIYELQLVEEMDKQTI QYLLPFVCIKAINNESAFRWKSLLKNAAKYGKNELLTRFDIPFYKRKGYEHTYLGPSVYN SVKYSFRYSDRLYAGVVAEKDAGEPFGALHNRYGYDYYSFYLLLKDCGRLKALAVGNYRL SFGQGLVISTDYLMGKTVYASSFNNRNSGIKKHSSADEYNYFRGVAATVSLTKDWDISGF YSHRSLDGVITDGTITSIYKTGLHRSQKEADKKNLFTMQLTGGHVSYQHNRIRLGITGIY YLFNRPYEPELTGYSKYNLHGNNFYNLGIDYAYRWHRFSFQGETAIGKQGWASLNRLQYS PVQNTQIMLIHRFYSYNYWAMFAHSFGEGSTTQNEQGYYIGLETSPFAYWKFFASFDLFS FPWKKYRVNKPSRGTDALFQTTFTPYSNLSMYLRYRYKQKERDWTGSKGTLTLPIFHHQL RYRLNYSLGDVLSSRTTLDYNHFHSQDRAANKGYQVTQMISSQLPWARLFADVQGSYFFT DDYDSRVYASESGLLYTFYTPSFQGRGFRCSIRLRYELNKHFLLITKFGETIYLDRNEIG SGNDLIYGNKKADVQMQLRIKF >gi|225935351|gb|ACGA01000041.1| GENE 117 149150 - 149728 550 192 aa, chain + ## HITS:1 COG:no KEGG:BT_3166 NR:ns ## KEGG: BT_3166 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 192 3 190 190 328 92.0 7e-89 MTLFAIVCCSLRLQAQEKQSINGYLVPMCIYNGDTIPCVQLRTVYIFRPLKFKNEKERQE YYRLIRNVKKVYPISREINQAIIETYEYLQTLPNEKARQKHIKRVEKGLKDQYTPRMKKL SFAQGKLLIKLIDRQSNSTSYELVKAFMGPFKAGFYQTFAALFGASLKKEYDPQGEDKLT ERVVLMVENGQI >gi|225935351|gb|ACGA01000041.1| GENE 118 149766 - 150578 174 270 aa, chain - ## HITS:1 COG:no KEGG:CHU_3539 NR:ns ## KEGG: CHU_3539 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 2 244 50 294 320 134 31.0 3e-30 MYVLSFANLSFSVSAGKGGRIVSFKCEDRELLTSDSVHSKYYGATFWLSPQSEYWPQYQC VDELPYQAEIDKQILRLVSSPDSISGVSVTKEFSISERDSSILIHYSVRNVSRQLKRLAP WDVTRVYGGLSFFPVGETDQMNKSDVTGGYEDKGMVWVPCPDGTNERGQKLFSTVYGGWM AHYYRGLLFVKCFPDIRPDEVPPRQGEVEIFVAPKGRYLELENHGKYVELQPGGSLMYRQ KWFLKDVSDKKQSDWIDNLKRQMNEPKVHG >gi|225935351|gb|ACGA01000041.1| GENE 119 150680 - 152959 1182 759 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 
23 754 40 781 790 613 43.0 1e-175 MRKIIWTIMALCTLSGCKYENSTISADSLLDFVDPLIGTGGHGHTFPGAAYPFGMLQLSP DTGLEGWDWCSGYHYSDSSIIGFSHTHLSGTGRSDLMDVMLMPVTGNLKLSPGSKSKPDE GYRSRFSHDEESASPGYYKVRLKDYDIMAELTVSPRCGFHRYTFPESESSHIILDLSHHF ATDSVLFTSINKIDSCTIIGERKTKGWGEPGEKYWSEQQLFFALKVSKVFDLSIATDEQF IQEKQASGKNIKAILNFETSLNEMVLVKVGISAVSAENALQNLSEEIPHWDFNKTLSETQ SVWEKELSKIKVGAPDKNKTIFYTALYHSLLAPYLYNDVNKEYLGFDKQTHMAEGFDNYT VLSLWDTFRAENPLLTLIAPDKVNDLIQSMLAQYEQYGLLPVWPLWSNETICMIGYHAVP VIVDAYFKGIRNYDVEKVYQAMKTSAMQDNFGVKELKQYGYIPYDVYNKSVSTALEYCYD DWCIAQMAKDLGKIDDYNYFMRRSSGYRTYFDKEYKLMNGFSSQGSFRRPFDPFFSSYGE CDWVEGNSWQYSFFVPHDVQGLINLYGGSEEFASVLDTLFSMPVGMSGHDVPIDITGLIG QYAHGNEPSHHVAYLYNYAGQPHKAQQRLHEIMSTLYTDQPDGLCGNEDCGQMSAWYVFS AMGFYPVNPAEGIYILGKPMVEKAEISIGDKSFKVKTVGWSDENIYVQSVKLNGRAYNKL FITHADIVNGGTLEFTMSPVPGNFKHAILPPSMSETLNE >gi|225935351|gb|ACGA01000041.1| GENE 120 153373 - 156432 2409 1019 aa, chain + ## HITS:1 COG:no KEGG:BDI_2469 NR:ns ## KEGG: BDI_2469 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 28 1017 114 1092 1094 533 35.0 1e-149 MKKCLKVKLLILIFLTGIQYVSAQGTAVNGTVVSDQNEPLVGVTVTEAGTTNGVITDMDG KFSIKLKNATGNLTFSYVGFATQTIAVNNQSKNLKIVLAEDSKQLDEVIVVGYGVQKKAN VSGSITSLGSRDLHTMSTNDASQALQGKAPVYISRQSGQPGASSSIIMRGVGTLNKATPL WIIDGVPGMPLDNFNEVESIQLLKDAASAAIYGIEAANGVVLVTTKKGSKGKIAVNYNGY VKVNHALGLPETLGTQGYIDMYKARWMSNNPDKGEPTTNDIKSFYFLTPNEVSQLPNTDW VEVMFNAGIEHSHSIDISGASDRSSYFLSAMYSNDEGTFVNTNYKKWAIKARFEQTPLKW LKFSQTVNFNHSKRKHNALDWQHILRANPAMNVYDDTNPMNTGYGYFTDEFKETIDWQGG NPLESADLKDHWEKWNTAWGNLQAIITPIKGLVWTTNLTGTLSNHATSQFLYNTFGGIST NSIDFVEGKNIQGHQLDYAHNQSTSYLLNTYVNYNTLIGKHDLGAMIGFEVRESRNDDAS GYAEWGIPAQDLRSTALTDHRDGTNAWSTGSSYSLFGRITYAYDNRYLLTANFRNDASDI FAPGKRSAFFPSVSIGWNIANEKFFKVEKINDLKLRFGIGEIGNNSIDKNWWRQEYKLQT NGTWLAQKTPNKDVTWEKTRITNIGIDLGAWNNAFTATIDLYNKKTRDALIEQKLPSTIG VGSNTYKLNKGEISNKGIELALSYRGSINHFNYLVSGNISYNKNKVLNIGNASYLSGGNF NRTLVNGPVAAFWGYVADGLYQTQAEIDALNAISMEKWGVAYDAGNIGPGDIKFKDLNGD GRINDEDMTSIGNPWPTYVYGFNVNLEYKGFEFNMNWQGVADRDIYNNTKQCLENMNADW 
NSTPDVWNAWTPNNTHTSQPRLGNATHNYQLPNSYMVEDGSYLRLKNIQLGYSFGKSIVS KMKLSKLKIYVGVENALTFTKFKGFDPEFIGNNNAEQGVYNLTQYPQSRSISFGLNVGF >gi|225935351|gb|ACGA01000041.1| GENE 121 156452 - 158062 1170 536 aa, chain + ## HITS:1 COG:no KEGG:BDI_2468 NR:ns ## KEGG: BDI_2468 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 536 1 526 529 195 29.0 5e-48 MKIKYMFIAGCLLLTSACSDFLDHPLTTSVSDDNIGEIIQRNPTKLGEFLGSAYRSLGSI HLYGRQTEYMATMSHEMDIDWLGEEQRNQWAINGLTSTNKNVEEAYIKYYKALASTNLTL DLIDHIDLEALDEEKKTMVMNFQGESLFLRAFIHFDLLRLFGEQGPSFGGTYPNNKDAKG IILREGVADASNAITARSTIEECYQLIIKDLKKAETCIGDNQIPVNTTIPGPGYTDTDYT KDQGWAQRPAVHALLGKVYLYMSDFTNAKIEFEKVIGDNRFKLDKPVNFTDYIQHTDNNP ECIFSLQYYLSSGSAYEDAPQHHLVRIFGGAPGAWNNYFVDQRTAARFGNDPRLNEATLY GYNVKKWSTATTKPEFEQVNTSDPTYRYYQRKYIDFYNVSSPVFSTKNVDIIRLADIYLM YAEVMLKLNSADVATEYVNKVRRRAWGESNYDVPGTKGEDFTTVTMEILQEERYKELFFE NTRWYDICRWGILEQELAKYPSTNAGTVHYDPQDYYLPIPESEMRSNPLMKQSKGY >gi|225935351|gb|ACGA01000041.1| GENE 122 158065 - 159831 1289 588 aa, chain + ## HITS:1 COG:no KEGG:Oter_2906 NR:ns ## KEGG: Oter_2906 # Name: not_defined # Def: hypothetical protein # Organism: O.terrae # Pathway: not_defined # 57 584 76 623 628 120 24.0 2e-25 MKKLLILLAFAFLFLSLTSASSFDERKKYLLDYYSKARPNDQYWGDNDIKTAMGFVLARL ETKKDVKYALNMLNRMQEDAPFDMFDCHQNIDAYLRFQSVYPKELKEKVRKRMTNEDYLA DGSTENHRLMFKTAGYLTALAFPDWGKADTVMAHCRSVLMDVMDKTVRYGIKEFDSPTYG TFYITCLLSLYDHSKDAEFKKQVQMTLEWHLLNMAPEWLNGYFISSSLREYYFACSPHMM SPYPLLGWLFFGGGPNPILEQKYENGELIVNNEGFYCVLAAVSSYRVPEIIQHIASDRKK AYVHKESHDMTPFAQLNYPWGFKKYTYINKTYGLASQWDGISLGWSAQMRRWKLVWESDA PASTFFLTHISHYGRSAESLFGATCREQVLQHNGTLLAMYKIESEEPHPYVTGVVPIDAI KQMKEDESGWIFFDGGSVLFAVKFCHPYTWDEDRVFRGVKHKMLRCDQRGTAAVIETCLP DKYQPTGTKTALDLFAEDILKTTKLEYIVTNPEYRQTIYHSLSGDTLKIIYNYGRFVNGE KVDYENWPLYSNPWMEQAVNGRFLNVSHGKESRVYDFRNWNVIDSKEN >gi|225935351|gb|ACGA01000041.1| GENE 123 159836 - 161473 1148 545 aa, chain + ## HITS:1 COG:no KEGG:Phep_3982 NR:ns ## KEGG: Phep_3982 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: P.heparinus # 
Pathway: not_defined # 13 545 19 553 553 640 56.0 0 MMKSILFSLTLVFTALACFAQDYYGNNRKQWLQKAEQYKPKLIITEKKPLKVVDIVPDTQ SFQKYKAVDVSPIDSLYSMPFRLKKEITIDFGEHLTGYFSFSVRSTGLAADGPLRFKLTF GEVPSEVAVPFDSYKEGLSRAWLQDEVVSVMYVPQTVTLSRRLAFRYVKIELLGSSPYYD FNIHDMKCMAQTSAKNIPEELPQAVDPMIRKIDRVGLNTLKECMQTVYEDGPKRDQRLWI GDLYLEALGNNYSYKQHDLTKRCLYLLAGLSDLNGYLLATVIENPVPRAQDKQFLYEYAL LYNVTLKDYLEATGDMETVKDLWPVAKKQLDIVRTNVKADGLMDFEKVKKDWWVFFDWKE DLYKEAPLQGVSIFALKESYKLAQLLGKEKEVADLPALTNKMIKAARKNLYNRKTGLFVG TGDKQISYASQIWMILSGVASKAEGKKALSALDTTQNVCYPGTPYMYHYYIQSLIDCGMN LEAKEALINYWGGMIAKGADTFWEAYDPTNDFISPYDFYPINSYCHAWSCTPVYFIRKYP EIFQK >gi|225935351|gb|ACGA01000041.1| GENE 124 162142 - 164742 1534 866 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 1 856 177 983 1087 491 37.0 1e-138 MAVQVLRWSDGSYLEDQDHWRFHGITRDVYIESRPDVFIQDFAVITDLDKDYKDAKLRIR PVISSKKTVDVSDWMLEARLYTKEGTPALGVNMSMPVKTITTEKYQQNFSLAKYLETNVK APLLWSSETPNLYIIVLTLKDHKGNVVESRSSRIGFRKVEFKNKHELCVNGKREYIYGVN RHDHDAWEGKTVPYERMVQDVTLMKQFGFNSVRTSHYPADPAFYDLCDEYGIYVMDEANV ETCGADAELSNNECWLFAQMERVAGMVKRDKNHPSIIFWSLGNESGVGANNAARASWVKD YDPTRLVHFEAYMHNGGSRQYGYGIDFMKTNRPAVNPPEPPAVDVVSTMYPSVEGIIKLA TQEGETRPVLMCEYAHAKGNALGNHQEYWNAVKKYPRLIGGYIWDWVDQSVIRKDSVTGK EYFSSLNGTNGLVFVDRKIKPAINECKKIYQHIRFDYSNGELTIRNEYNYLPLSAFRFSW KLMAGGELIKKGELTDIEAFPGTSARVKIDTGTSDSGQKGELILEINAYLKKDVIWAPKG FEIAWEQFTLQEGKVNPPVEEKTGNGGKLTVQKKANEIGIYNDRINIVFNKKAGLIQSWI VGGKEFLEKGPQINLWRAPTHNDGGYRPKTENEISRQWVEAGLDSLQHKLKSFKLVEEKN GTVSVATTFVAQKTGNKSYVEYTTKYIVDPSGKVQIDTDLKPFGNIISFPRIGYTMTVKS GNDTFSWYGYGPYDTYNDRHSGARLGRFSGTVDEQFTHHAYPQENGNKYHCSWVSLTDNE GIGLVAEGMPFIESSVMHYSLENLSEAIDESQLKRTDNITWNIDYKTYPIGNRSCGPPPL EQYVLFAEPVSFSFSIYPVLKKKVFQ >gi|225935351|gb|ACGA01000041.1| GENE 125 164724 - 165389 314 221 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # 
Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 42 211 7 171 1087 135 41.0 4e-32 MKMKKQLIIIFLLCLSVKGISQIPFDGSDSYATPPIGFGQSEFENPLINSINREPYGATS ISFPTETEALQVKRSSSSRYQSLNGTWKFKFITDWDNLPSDFMKAETNDDSWDNIPVPST WEMKGYGDQVYCGQGYEFRPVNPPFVPRKDNHIALYRKTFEVPASWEGQNVLIHFAGVRG AFYLYVNGKKVGYNEDGGTLPAVFDMTPFLKKERTNWLSRC >gi|225935351|gb|ACGA01000041.1| GENE 126 165425 - 169057 2425 1210 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172779|ref|ZP_05759191.1| ## NR: gi|260172779|ref|ZP_05759191.1| hypothetical protein BacD2_13001 [Bacteroides sp. D2] # 1 1210 1 1210 1210 2491 100.0 0 MKKILVLIALSFMTVSTLTAQMKDPQNWVGYEEIMGVKNGLRFYDFDVNLIESSAPANVF WPGDDIRLKFQLINNTSQSIDIDAKVHVFRYGTKGIPNDIWLPQMIKLDYEKVIPVHLSI LPNGYVNTSVSVDDIKDFGGYAVVFDLGKYGRRLGTSFAYSMKPSLVKMQYPKQSLDYLG VDFLNRVGVQSIRYGIPFVSPDNPDYQGFRQELKKLMKDFMDNNITVMLMFGEGRMAQSM PLGTTRPHLDENGKFLHTKQDLVWLPELDEDFKKFVKELCIDFGWPKGPVTAVCLWNEPW EGTSISGWQADMIRYKEIYTKMAEAVIEAREKDIDVLVGGGDSNSNALDKFFADGTMDML PIFDFLSIHYQGMEAPVLYPEWNKRKDNKGRVKIWDTESWVGNTDDRVGLVIAANRSAGY DRSMGIFGGYMYSGDPNRSVRSMEVRTEKGKETMPKLHNTWSAAAAVGAAQSMIGEREFN RLLFKNGLPWVMIFDGYENKKDDGTIVIAGDLGEAFGAENILFRNVRSLSEARKKVDLHH QLKTLPANSAERKKIENELNTYYPITDGKMILKANPSFLLYDFYGNAIAPKNGIYEIPLN YQGYYMRVNGEKGAFDKLVSAISKADIVGYEPIEIIAKDFTAPIASKPEMELQLTNILNR PVKGVLSVSIGNLDISYPQNVSFKPNETKTIRAKVTNGTASVDNNYPLEVHFDAGKDGFA VHWENMHVNYIAKKTIKIDGNLNDWQNMISQTIEGSSKASISLTEAAWYPYQKFDSNAEG LAHTYLAYDDDYFYFAAKVADKTPNKGTLRFETRNDDDYFYPDTAYMQTIYAMHSTIVTQ PAAESDQKALQLPSAKGRMMNYMENTSTTLSMGMDIHLPKDKYTRTSFYFPSINQNGLSV TVYDKDSGKELLSTKIDKLWNGTYLTLDLCGNVRIRCSSYGWWYTTKLSGIFFDSSDNVV NSGNAGASAKLVNRDFDTVGNWLGKYGQLGYYLIGSDSSLPQDVTCHVVSQDDLVPLVWP EQVRRFTYRKRPTLPDGTNGIATDNILIAFNVIPIGEDGMEAETKGTMPRYIGYKCTDYE YALNTVAPEYGGGFEIWRMLVPGMPRKHFYPRQPKSSFDGAVKDGKLITTREGNTLYYEC AIPWSEIPDVKKAIDKGDKIKFSARINDDGAGAACMELARGRSVSKKNSRAFHPDWKEHW ANEVEFGVEK >gi|225935351|gb|ACGA01000041.1| GENE 127 169084 - 170895 1343 603 aa, chain - ## HITS:1 COG:BS_yteR 
KEGG:ns NR:ns ## COG: BS_yteR COG4225 # Protein_GI_number: 16080064 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 256 595 38 365 373 104 25.0 4e-22 MKKKKWICLLLSALLGVGTALPQQSKESPVEKVKKIGDKLIRETPFAYKLGLASCSTSFN TLNFVDFGRTFGVGQEAVAYAYTQLSFPRDTTIRVETEHNDACKIWCNGQLVYENKGKRD IKINRGERSMKMTSSFLLPLKSGNNNVLIKSATFGKEWCVFLQPPSDNDAVLSTARSYPQ IGLTNLNHVDKQVSDLTNWLIIGPFQGGIDVVHEPEQEFKFGYMYKGLHGPVTWTIPKIE VLGEMIDPEVWGTTYQWNYHNGGVAWAMQQLGELTGAHQYTQWATDFCDYQMEGMPFVDY QVNHLRAYNSANAMVINSTLLDFTLAPSLPILYRLRMDKDFKNRDIYQAYIDKMINYARF GQIRSEGMTNYTRDTPEKYTVWVDDMFMGIPFLIQAGLYSDSLELRKIFFDDAANQIIDF TKHVWNKDTRLYMHANYTSRPDVKLPHWARANGWAIWAMSDVLMALPKNHPKYKAILKQY QTFVYSLIKYQSPDGFWHNVIDRSDSPKEVSGTAIFTMGIIRGIRYGWLDKKKFMPIALK GWDAVSSEIEDDGTVHNICIGTMCTEDVNYYMNRPFFDNDTHGSFAVIFAGIEAQRMMDE DTK >gi|225935351|gb|ACGA01000041.1| GENE 128 171100 - 172497 1070 465 aa, chain - ## HITS:1 COG:no KEGG:Slin_3640 NR:ns ## KEGG: Slin_3640 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 25 461 28 435 445 94 21.0 8e-18 MDEYKDKRYIKEIIKEYNILLESWGSSESWDLNQFKTLENMIFETTKIDVNANTLKRFFQ QKTGNPQLATKDALCRFLGYSGYTDFVMKKTKKEETTNIQTEDVVEETKEHTNDTDKKAD SSQLEKAPNKVTKDPEEVSNRIYKKNKWYVNFAIILLLIIGGYMLYTYKLKDMYVEYLLS KIEFTTSQVKGICPLTVTFSYHIPASLFEDITVMYEEANGDISERKLNENIDKINATYIY EGEAFCHLKYKDRIIKTIPVECRKTGWSVFVREERKKIFRVFPMEQAYNDSGYVSLPIEN VPQEAHTNHMFVSYVFYKEKLIDGDNFIYEARVRNSKQENAVPCSDIIMYIHSDTAMHGF AMNENGYAYIKFISGENTIKGDEYNLSRFNFNSSEWHVMTIKVVNKKTTFYVDGEEVLGM DYKEPLGYANELTLRFKGCGAVDYVKVSNLDGKVVYEENFDPDKR >gi|225935351|gb|ACGA01000041.1| GENE 129 172711 - 174663 583 650 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021240|ref|YP_527067.1| ribosomal protein S32 [Saccharophagus degradans 2-40] # 269 650 28 407 408 229 35 9e-59 MKKYFSALLIILLSSFTTAPRIVWLDELDLSNVDQSAGKAIANQSMWKTPLSIAGEKFDR 
GVGTHAASVFRIKLDGKTVSFKASAGIDDSAPEHELKQASAEFIILGDGKIIWRSGIMHA GEKAKQIDISVKGIKSLILRVDHAGDGIAGDRTNWVNARFEVKGADPVSVKKEREQEYIL TPAVARQPMINAPYIYGARPGNPFLFTVPVSGERPLNITATNLPEGLSIDSNTGIITGKA INPGTYTVHISAKNQYGETTKDLTLEIGEKISLTPPMGWNSWNIFGADIDDKKIRRMADR MVELGLVNYGYAYINIDDGWQGVRGGKYNAIMPNDKFPDMKGLVDYVHSKGLKIGIYSSP WVQTFAGYIGSSADTRNGKVVNSSRRYGEFSFAKNDVKQWAEWGFDYIKYDWVTNDIAHT AELSYLLRQSGRDILYSISNAAPFELAEDWSNLTNVWRTTGDIYDSWCSMTTIGFLQDKW QPFAKPGSWNDPDMLIVGKVGWGKNIHSTHLSPDEQYTHITLWSILAAPLLIGCDLEQMD DFTMNLLSNREVIAINQDIAGIQGSRVYADNNKEIEVWSKPLKDGSIAVGLFNLSDNKQD ISIFWDQLNIQGKQKVRNLWEQKDKGVYINKYQSDVPSHGVLFIKIEPVK >gi|225935351|gb|ACGA01000041.1| GENE 130 175105 - 175632 663 175 aa, chain - ## HITS:1 COG:FN1519 KEGG:ns NR:ns ## COG: FN1519 COG0566 # Protein_GI_number: 19704851 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Fusobacterium nucleatum # 25 167 89 228 234 80 33.0 2e-15 MRKLKITELNRISAEEFKQVEKLPLVVVLDDIRSLHNIGSVFRTSDAFRIECIYLCGITA TPPHPEMHKTALGAEFTVDWKYVNNAVDVVDNLKNEGYIVYSVEQAEGSIMLDELQLDKT KKYAIVMGNEVKGVQQEVIDHSDGCIEIPQYGTKHSLNVSVTTGIVIWDLFKKLR >gi|225935351|gb|ACGA01000041.1| GENE 131 176133 - 177071 717 312 aa, chain + ## HITS:1 COG:all4673 KEGG:ns NR:ns ## COG: all4673 COG0379 # Protein_GI_number: 17232165 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Nostoc sp. 
PCC 7120 # 3 307 20 323 324 370 57.0 1e-102 MNELIKAINELKKEKNAIILGHYYQKGEIQDIADYVGDSLALAQWAAKTEADIIVMCGVH FMGETAKVLCPDKKVLVPDMAAGCSLADSCPADQFAQFVKEHPGYTVISYVNTTAAVKAV TDVVVTSTNAKQIVESFPKDEKIIFGPDRNLGNYINSVTNRNMLLWDGACHVHEQFSVEK IVELKAQHPEALVLAHPECKSTVLKLADVVGSTAALLKYAVNHPENTYIVATESGILHEM QKKCPQTTFIPAPPNDSTCGCNECSFMRLNTLEKLYECLKNESPEITVDPEVAKKAVKPI QRMLEISAKLGL >gi|225935351|gb|ACGA01000041.1| GENE 132 177216 - 178058 825 280 aa, chain - ## HITS:1 COG:PA1461 KEGG:ns NR:ns ## COG: PA1461 COG1360 # Protein_GI_number: 15596658 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Pseudomonas aeruginosa # 91 267 69 245 296 98 32.0 1e-20 MKKITLFTLLTLLLCTSCVTKKKFMQAEMAATISKDSLQGLLTDCRNINAQMSAQIKNLL RDTTKMGNSIRQYQSMLNVNMTEQEKLNALLSQKKNELNERERTINELQDMIKAQNDKVQ NLLSNVKDALLGFSTDELTVREKDGKVYVAMSDKLLFQSGSARLDKRGEEALGKLAEVLN KQTDIDVFIEGHTDNKPINTVQFKDNWDLSVIRATSVVRILIKNYNVNPLQIQPSGRGEY MPVDDNETAEGRSKNRRTEIIMAPKLDKLFQMLQSSEESK >gi|225935351|gb|ACGA01000041.1| GENE 133 178090 - 178671 309 193 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 193 1 197 200 123 41 5e-27 MKRKLVFATNNAHKLEEVAAILGDQVELLSLNDISCQTDIPETAETLEGNALLKSSYIYK NYHLDCFADDTGLEVEALNGAPGVYSARYAGGEGHDAQANMLKLLHELEGKENRKAQFRT AISLILDGKEYLFEGVIKGEIIKEKRGDSGFGYDPIFKPEGYDRTFAELGNDIKNQISHR ALAVQKLCEFLQS >gi|225935351|gb|ACGA01000041.1| GENE 134 178722 - 179639 912 305 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 9 303 1 281 283 150 31.0 4e-36 MHKLSKAEVMREVKDYIYITLGLISYSLGWAAFLLPYQITTGGTTGIGAIIYYATGFPIQ WSYFIINAVLMTFAIRVLGPRFSIKTTYAIFTLTFLLWLFQLVVNNYVEAPDMTPDGKPL LLGTGQDFMACIIGAAMCGVGLGITFNYNGSTGGTDIIAAIVNKYKDVSLGRMIMICDVF IISSCYFIFHDWRRVIFGFVTLFIIGVVLDWIINSARQSVQFFIFSKKYDEIADRIIKDA DRGVTVLDGTGWYSKNNVKVLVVLAKKRQSLDIFRLVKRIDPNAFISQSSVIGVYGEGFD KLKVK 
>gi|225935351|gb|ACGA01000041.1| GENE 135 179762 - 182596 2958 944 aa, chain - ## HITS:1 COG:SP0254 KEGG:ns NR:ns ## COG: SP0254 COG0495 # Protein_GI_number: 15900189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 3 944 4 832 833 727 42.0 0 MEYNFREIEKKWQKRWVEEKTYQVTEDDSKQKFYVLNMFPYPSGAGLHVGHPLGYIASDI YARYKRLQGFNVLNPMGYDAYGLPAEQYAIQTGQHPAITTVNNIDRYREQLDKIGFSFDW NREIRTCDPEYYHWTQWAFQKMFNSYYCNDEQQARPIEELEKAFAIYGNKGLNAACSEEI SFTAEEWNAKSEKEKQEILMNYRIAYLGETMVNWCAELGTVLANDEVVDGVSERGGFPVI QKKMRQWCLRVSAYAQRLLDGLDTIEWTDSLKETQRNWIGRSEGAEVQFKVKDSDLEFTI FTTRADTMFGVTFMVLAPESELVAQLTTPEQKAEVDAYLDRTKKRTERERIADRSVTGVF SGSYAINPFTGEAVPVWISDYVLAGYGTGAIMAVPAHDSRDYAFAKHFGLEIRPLVEGCD VSEESFDAKEGIVCNSPRPDVTPYCDLSLNGLTIKEAIETTKKYVKDHNLGRVKVNYRLR DAIFSRQRYWGEPFPVYYKDGMPYMIDEASLPLELPEVAKFLPTETGEPPLGHATKWAWD TVNKCVVENEKIDHVTVFPLELNTMPGFAGSSAYYLRYMDPHNNKALVDPKIDQYWKNVD LYVGGTEHATGHLIYSRFWNKFLYDMDVSVMEEPFQKLVNQGMIQGRSNFVYRIKDTNTF VSLNLKDQYEVTPIHVDVNIVSNDILDLEAFKAWRPEYKTAEFILEDGKYVCGWAVEKMS KSMFNVVNPDMIVEKYGADTLRMYEMFLGPVEQSKPWDTNGIDGVHRFIRKFWSLFYSRT DEYLVTDEPATKEELKSLHKLIKKVTGDIEQFSYNTSISAFMICVNELFNLKCSKKEILE QLVITLAPFAPHVCEELWDVLGHETSVCDAQWPAYNEEYLKENTVNYTISFNGKARFNME FAADEASDVIQAAVLADERSQKWIDGKTPKKIIVVPKKIVNVVI >gi|225935351|gb|ACGA01000041.1| GENE 136 183332 - 185902 2016 856 aa, chain + ## HITS:1 COG:no KEGG:PRU_1162 NR:ns ## KEGG: PRU_1162 # Name: not_defined # Def: putative glycolsyl hydrolase, family 18/alpha-rhamnosidase # Organism: P.ruminicola # Pathway: not_defined # 18 856 344 1191 1193 1271 70.0 0 MKMNLHCFENLIPIMILALFSNSGLINATEVGKRTDPLEASAWNESKWISAVDAPVVKGH NNGRAADGASWFVSTVKNEQKIVSAKWMTAGLGVYELYVNGKPVGGEFLKPGFTHYAKTK RSFTYDITDVIHTKPNAENMLSVQVTPGWWADKIITPGGHDGMIGKKCAFRGVLELTFSD GSKKRYGTDLENWKAGIAGPVKHAGIFDGEEYDAREPMGFECVNKLSIPEENTEFSGDIL PSDGAEVYLRTDLALAPIRAYIWKNIEGAKENEFGKVIIAREFASGKEMTVSPGETLVVD FGQNCAGVPSFVFKAAEGTVLTCLPAELLNDGNGAKIRGMDGPEGSCHRENLRIPYTGIR 
LDYTFAAGDNYVTYYPRCTFFGYRYVSITATGNVSIKSLKSIPVTSITKELETGTITTGN DLVNKLISNTYWGQLSNYLSIPTDCPQRDERLGWTADTQVFAETGTFFANTMKFFHKWMR DMRDTQNSLGGFPGVAPLAQYGDEKMRLGWADAGIIVPWTVWKQFGDTQIIEESWNAMDL FMNHINDTKYNHETLCGENGNYQWADWLSYEPLESCSGLAFSPQGPLPDAVSYWNYLSAS YWVIDAFMMRDMAAATGRNASKYQQMADSAKAYIKENFLNEDGTFKTAILNTMQTPALFA LKNQLVEGEAKAKMIDRLRENFAQHDLCLQTGFLGTSILMATLTENGMEDIAYELLFQRK NPSWLYSVDNGATTIWERWNSYMIDKGMGPRGMNSFNHYAYGCVCEWIWETVAGIAADPA TPGFKHIIMKPIPDKRLGHVTAEYRSAAGLIKSAWKYEGDTWIWEFTIPKGVTATVTLPG EMKSKEYGSGTYKVTK Prediction of potential genes in microbial genomes Time: Fri May 13 09:18:37 2011 Seq name: gi|225935350|gb|ACGA01000042.1| Bacteroides sp. D2 cont1.42, whole genome shotgun sequence Length of sequence - 239895 bp Number of predicted genes - 143, with homology - 141 Number of transcription units - 49, operones - 27 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 4 - 921 530 ## COG1708 Predicted nucleotidyltransferases 2 1 Op 2 . - CDS 980 - 2731 1495 ## BVU_3461 hypothetical protein - Prom 2872 - 2931 3.1 3 2 Tu 1 . + CDS 2862 - 3758 189 ## PROTEIN SUPPORTED gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily 4 3 Tu 1 . - CDS 3755 - 7747 2502 ## COG0642 Signal transduction histidine kinase + Prom 7730 - 7789 5.0 5 4 Op 1 . + CDS 7984 - 8424 309 ## gi|260172794|ref|ZP_05759206.1| hypothetical protein BacD2_13078 6 4 Op 2 . + CDS 8457 - 8897 479 ## gi|260172795|ref|ZP_05759207.1| hypothetical protein BacD2_13083 7 4 Op 3 . + CDS 8953 - 12198 2354 ## BF0833 hypothetical protein 8 4 Op 4 . + CDS 12222 - 14018 1573 ## BF0834 hypothetical protein 9 4 Op 5 . + CDS 14052 - 15224 1043 ## BF0761 putative lipoprotein 10 4 Op 6 . + CDS 15266 - 16081 435 ## COG0657 Esterase/lipase + Term 16127 - 16173 6.1 + Prom 16098 - 16157 4.5 11 5 Op 1 1/0.111 + CDS 16321 - 17520 672 ## COG4124 Beta-mannanase 12 5 Op 2 . + CDS 17556 - 18626 840 ## COG4124 Beta-mannanase 13 5 Op 3 . 
+ CDS 18645 - 19823 1190 ## COG2152 Predicted glycosylase 14 5 Op 4 . + CDS 19820 - 21187 949 ## COG2211 Na+/melibiose symporter and related transporters 15 5 Op 5 . + CDS 21195 - 22478 665 ## COG3458 Acetyl esterase (deacetylase) 16 5 Op 6 . + CDS 22525 - 23682 739 ## COG2942 N-acyl-D-glucosamine 2-epimerase 17 5 Op 7 . + CDS 23710 - 24528 580 ## gi|260172806|ref|ZP_05759218.1| hypothetical protein BacD2_13138 + Term 24547 - 24578 1.8 + Prom 24649 - 24708 5.2 18 6 Tu 1 . + CDS 24845 - 27463 2318 ## COG0249 Mismatch repair ATPase (MutS family) + Term 27617 - 27660 -1.0 - Term 27437 - 27477 2.5 19 7 Op 1 . - CDS 27531 - 28175 418 ## COG4845 Chloramphenicol O-acetyltransferase 20 7 Op 2 . - CDS 28172 - 29005 882 ## COG0682 Prolipoprotein diacylglyceryltransferase 21 7 Op 3 . - CDS 29055 - 29963 923 ## COG1893 Ketopantoate reductase 22 7 Op 4 . - CDS 30011 - 31114 1240 ## COG0012 Predicted GTPase, probable translation factor 23 7 Op 5 . - CDS 31181 - 32323 1060 ## COG2814 Arabinose efflux permease - Prom 32343 - 32402 4.5 24 8 Tu 1 . - CDS 32576 - 33883 765 ## COG3177 Uncharacterized conserved protein - Prom 33911 - 33970 4.7 - Term 33914 - 33980 14.8 25 9 Tu 1 . - CDS 34033 - 35442 1225 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 35462 - 35521 7.6 - Term 35503 - 35561 8.7 26 10 Op 1 . - CDS 35617 - 36828 844 ## Slin_0557 hypothetical protein 27 10 Op 2 . - CDS 36890 - 38140 1001 ## Phep_2694 glycosyl hydrolase family 88 28 10 Op 3 . - CDS 38149 - 40530 1478 ## CPE1875 hypothetical protein 29 10 Op 4 . - CDS 40536 - 43376 2356 ## COG1472 Beta-glucosidase-related glycosidases 30 10 Op 5 . - CDS 43423 - 45033 1290 ## Phep_2282 RagB/SusD domain protein 31 10 Op 6 . - CDS 45046 - 48195 2737 ## BT_3332 hypothetical protein - Prom 48215 - 48274 4.5 - Term 48697 - 48734 -1.0 32 11 Op 1 . - CDS 48769 - 52845 2634 ## COG0642 Signal transduction histidine kinase 33 11 Op 2 . - CDS 52858 - 53073 89 ## - Prom 53097 - 53156 5.6 + Prom 52827 - 52886 3.4 34 12 Op 1 . 
+ CDS 53107 - 54555 1420 ## gi|260172822|ref|ZP_05759234.1| hypothetical protein BacD2_13218 35 12 Op 2 . + CDS 54589 - 55356 877 ## gi|260172823|ref|ZP_05759235.1| hypothetical protein BacD2_13223 + Term 55390 - 55451 4.2 - Term 55378 - 55439 0.4 36 13 Tu 1 . - CDS 55491 - 57350 1164 ## Phep_2687 hypothetical protein 37 14 Tu 1 . + CDS 57659 - 59185 1536 ## COG3119 Arylsulfatase A and related enzymes + Prom 59192 - 59251 2.8 38 15 Op 1 . + CDS 59305 - 60957 1431 ## Cpin_2848 hypothetical protein 39 15 Op 2 . + CDS 60979 - 61383 311 ## gi|260172827|ref|ZP_05759239.1| hypothetical protein BacD2_13243 40 15 Op 3 . + CDS 61405 - 63036 1413 ## HM1_0138 multidomain protein with S-layer homology region, glug motif, ig motif, I-set domain 41 15 Op 4 . + CDS 63065 - 64051 782 ## COG3507 Beta-xylosidase + Prom 64105 - 64164 4.7 42 16 Op 1 . + CDS 64192 - 67377 3491 ## Slin_2101 TonB-dependent receptor plug 43 16 Op 2 . + CDS 67374 - 69284 1833 ## PRU_2735 hypothetical protein 44 16 Op 3 . + CDS 69307 - 72393 3069 ## BVU_0505 hypothetical protein 45 16 Op 4 . + CDS 72419 - 74215 1483 ## PRU_2737 putative lipoprotein 46 16 Op 5 . + CDS 74245 - 75147 938 ## Slin_2105 hypothetical protein + Term 75179 - 75243 12.1 47 17 Tu 1 . - CDS 75251 - 79261 2772 ## COG0642 Signal transduction histidine kinase - Prom 79368 - 79427 3.4 + Prom 79303 - 79362 2.9 48 18 Op 1 . + CDS 79470 - 81527 1745 ## COG3534 Alpha-L-arabinofuranosidase 49 18 Op 2 . + CDS 81616 - 83163 1293 ## COG3119 Arylsulfatase A and related enzymes 50 18 Op 3 . + CDS 83194 - 84324 827 ## BT_3094 putative secreted xylosidase 51 18 Op 4 . + CDS 84321 - 85988 1364 ## COG3119 Arylsulfatase A and related enzymes 52 18 Op 5 . + CDS 85994 - 88198 1787 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 88318 - 88384 5.3 + Prom 88714 - 88773 5.0 53 19 Op 1 . + CDS 88980 - 89441 415 ## BT_3564 hypothetical protein + Prom 89461 - 89520 4.3 54 19 Op 2 . 
+ CDS 89687 - 91414 1521 ## BT_4293 hypothetical protein + Term 91456 - 91513 11.6 - Term 91707 - 91743 0.2 55 20 Tu 1 . - CDS 91848 - 92981 778 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 93006 - 93065 6.6 + Prom 92980 - 93039 8.7 56 21 Tu 1 . + CDS 93153 - 94802 1529 ## BT_3091 putative regulatory protein + Term 94854 - 94902 8.1 + Prom 94815 - 94874 8.0 57 22 Op 1 . + CDS 95008 - 98013 3017 ## BT_3090 hypothetical protein 58 22 Op 2 . + CDS 98029 - 99519 1450 ## BT_3089 hypothetical protein 59 22 Op 3 . + CDS 99556 - 101070 1611 ## BT_3088 hypothetical protein 60 22 Op 4 . + CDS 101085 - 102863 1774 ## BT_3087 cycloisomaltooligosaccharide glucanotransferase 61 22 Op 5 . + CDS 102925 - 105432 2160 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 105438 - 105484 6.0 62 23 Tu 1 . - CDS 105479 - 109540 2203 ## COG0642 Signal transduction histidine kinase + Prom 109413 - 109472 1.6 63 24 Op 1 . + CDS 109529 - 109846 112 ## gi|260172853|ref|ZP_05759265.1| hypothetical protein BacD2_13373 64 24 Op 2 . + CDS 109875 - 111074 734 ## BT_2913 unsaturated glucuronylhydrolase 65 24 Op 3 . + CDS 111100 - 112350 936 ## Dd586_1768 exopolysaccharide inner membrane protein 66 24 Op 4 . + CDS 112375 - 115467 2297 ## Phep_3172 TonB-dependent receptor plug 67 24 Op 5 . + CDS 115479 - 117200 1565 ## Phep_3171 RagB/SusD domain protein 68 24 Op 6 . + CDS 117247 - 117558 339 ## COG3254 Uncharacterized conserved protein 69 24 Op 7 . + CDS 117586 - 119070 945 ## gi|260172859|ref|ZP_05759271.1| hypothetical protein BacD2_13403 70 24 Op 8 . + CDS 119088 - 120329 802 ## gi|293368591|ref|ZP_06615199.1| putative lipoprotein + Term 120346 - 120398 10.1 + Prom 120382 - 120441 3.0 71 25 Tu 1 . + CDS 120482 - 121834 842 ## BT_3171 sialic acid-specific 9-O-acetylesterase 72 26 Op 1 . - CDS 122023 - 122895 717 ## COG3568 Metal-dependent hydrolase 73 26 Op 2 . 
- CDS 122917 - 126840 2403 ## COG0642 Signal transduction histidine kinase - Prom 127035 - 127094 9.7 + Prom 126963 - 127022 1.9 74 27 Op 1 . + CDS 127058 - 128104 742 ## COG2152 Predicted glycosylase 75 27 Op 2 . + CDS 128122 - 129369 1106 ## COG0477 Permeases of the major facilitator superfamily + Prom 129405 - 129464 2.9 76 28 Op 1 . + CDS 129536 - 130696 857 ## COG2152 Predicted glycosylase 77 28 Op 2 . + CDS 130705 - 133872 3313 ## BDI_3133 hypothetical protein 78 28 Op 3 . + CDS 133887 - 135827 1994 ## BDI_3134 hypothetical protein 79 28 Op 4 . + CDS 135860 - 136588 638 ## BDI_3135 hypothetical protein 80 28 Op 5 . + CDS 136615 - 137448 731 ## COG3568 Metal-dependent hydrolase 81 28 Op 6 . + CDS 137466 - 138845 1143 ## BT_3787 hypothetical protein 82 28 Op 7 . + CDS 138907 - 139701 661 ## Phep_2528 endonuclease/exonuclease/phosphatase + Prom 139741 - 139800 4.6 83 29 Tu 1 . + CDS 139901 - 140386 479 ## BT_3062 hypothetical protein + Term 140405 - 140468 6.5 - Term 140402 - 140446 5.1 84 30 Tu 1 . - CDS 140471 - 141190 518 ## COG2186 Transcriptional regulators - Prom 141238 - 141297 4.6 + Prom 141228 - 141287 5.8 85 31 Op 1 . + CDS 141431 - 144586 2813 ## Dfer_2137 TonB-dependent receptor 86 31 Op 2 . + CDS 144600 - 146117 1449 ## Dfer_0773 RagB/SusD domain protein 87 31 Op 3 . + CDS 146139 - 147824 1158 ## gi|260172877|ref|ZP_05759289.1| hypothetical protein BacD2_13493 88 31 Op 4 . + CDS 147828 - 150104 1690 ## Sde_3285 TonB-dependent receptor (EC:4.2.2.3) 89 31 Op 5 . + CDS 150168 - 152348 1776 ## CA2559_11508 putative chondroitin AC/alginate lyase 90 31 Op 6 . + CDS 152375 - 152719 418 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 91 31 Op 7 9/0.000 + CDS 152724 - 154190 1468 ## COG0477 Permeases of the major facilitator superfamily 92 31 Op 8 . 
+ CDS 154213 - 154965 222 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 155026 - 155086 -0.3 + Prom 155081 - 155140 3.3 93 32 Op 1 . + CDS 155191 - 156393 1204 ## Cpin_4528 hypothetical protein 94 32 Op 2 . + CDS 156421 - 157410 867 ## COG0657 Esterase/lipase 95 32 Op 3 . + CDS 157415 - 158311 705 ## Phep_0714 fibronectin type III domain protein + Term 158341 - 158398 10.7 - Term 158401 - 158460 6.3 96 33 Tu 1 . - CDS 158465 - 160600 1531 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 160631 - 160690 9.3 + Prom 160659 - 160718 8.3 97 34 Op 1 . + CDS 160804 - 161607 800 ## COG0657 Esterase/lipase + Term 161653 - 161694 7.3 + Prom 161662 - 161721 2.7 98 34 Op 2 . + CDS 161755 - 162702 855 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 162731 - 162780 11.5 99 35 Tu 1 . + CDS 162782 - 164548 1354 ## BT_3079 hypothetical protein - Term 164559 - 164601 -0.5 100 36 Tu 1 . - CDS 164790 - 167561 242 ## PROTEIN SUPPORTED gi|163788005|ref|ZP_02182451.1| 50S ribosomal protein L33 - Prom 167684 - 167743 7.2 + Prom 167614 - 167673 12.9 101 37 Op 1 . + CDS 167897 - 170926 2745 ## BDI_3062 hypothetical protein 102 37 Op 2 . + CDS 170972 - 172438 1310 ## BDI_3063 hypothetical protein 103 37 Op 3 . + CDS 172462 - 174570 1355 ## BDI_3065 beta-glycosidase 104 37 Op 4 . + CDS 174602 - 175963 1107 ## COG5368 Uncharacterized protein conserved in bacteria 105 37 Op 5 . + CDS 176001 - 178286 2158 ## COG1472 Beta-glucosidase-related glycosidases + Term 178344 - 178403 9.4 - Term 178330 - 178391 2.2 106 38 Op 1 1/0.111 - CDS 178446 - 180848 2180 ## COG1472 Beta-glucosidase-related glycosidases 107 38 Op 2 . - CDS 180856 - 183168 1933 ## COG1472 Beta-glucosidase-related glycosidases 108 38 Op 3 . - CDS 183212 - 184846 1389 ## COG2730 Endoglucanase 109 38 Op 4 . - CDS 184867 - 185247 431 ## gi|255693558|ref|ZP_05417233.1| conserved hypothetical protein 110 38 Op 5 . 
- CDS 185283 - 186905 1363 ## PRU_2518 putative lipoprotein 111 38 Op 6 . - CDS 186918 - 190097 3185 ## PRU_2517 TonB dependent receptor - Prom 190222 - 190281 8.4 112 39 Tu 1 . + CDS 190547 - 190702 66 ## + Term 190946 - 190977 0.1 - Term 190605 - 190642 8.4 113 40 Op 1 . - CDS 190728 - 194810 2437 ## COG0642 Signal transduction histidine kinase 114 40 Op 2 . - CDS 194807 - 195286 410 ## COG3467 Predicted flavin-nucleotide-binding protein 115 40 Op 3 . - CDS 195304 - 196173 366 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 196243 - 196302 7.5 + Prom 196290 - 196349 6.7 116 41 Op 1 . + CDS 196442 - 199498 2342 ## ZPR_0351 TonB-dependent receptor Plug domain protein 117 41 Op 2 . + CDS 199510 - 201033 1233 ## ZPR_0352 hypothetical protein 118 41 Op 3 . + CDS 201061 - 202017 805 ## ZPR_0353 hypothetical protein 119 41 Op 4 . + CDS 202022 - 203494 1107 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 120 41 Op 5 . + CDS 203497 - 205014 1168 ## Csac_2519 coagulation factor 5/8 type domain-containing protein 121 41 Op 6 . + CDS 205023 - 206459 782 ## Fjoh_4232 sialate O-acetylesterase (EC:3.1.1.53) 122 41 Op 7 . + CDS 206456 - 207877 814 ## Coch_1349 hypothetical protein 123 41 Op 8 . + CDS 207885 - 208904 848 ## COG2152 Predicted glycosylase 124 41 Op 9 . + CDS 208901 - 210496 1175 ## COG4146 Predicted symporter + Term 210503 - 210575 18.2 125 42 Tu 1 . - CDS 210595 - 212751 1245 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Prom 212843 - 212902 2.8 - Term 212845 - 212899 12.2 126 43 Op 1 . - CDS 212972 - 214549 1089 ## BF3505 hypothetical protein 127 43 Op 2 . - CDS 214605 - 215657 532 ## BT_3244 hypothetical protein 128 43 Op 3 . - CDS 215690 - 217042 789 ## BT_3243 hypothetical protein 129 43 Op 4 . - CDS 217055 - 217957 680 ## BT_3242 hypothetical protein 130 43 Op 5 . - CDS 217976 - 219535 1162 ## BT_3241 hypothetical protein 131 43 Op 6 . 
- CDS 219571 - 223161 3053 ## BT_3240 hypothetical protein - Prom 223190 - 223249 5.2 132 44 Tu 1 . - CDS 223313 - 224434 750 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 224458 - 224517 1.8 133 45 Tu 1 . - CDS 224539 - 225102 323 ## BVU_0609 RNA polymerase ECF-type sigma factor - Prom 225122 - 225181 3.6 + Prom 225092 - 225151 8.5 134 46 Op 1 . + CDS 225186 - 226193 644 ## COG0451 Nucleoside-diphosphate-sugar epimerases 135 46 Op 2 . + CDS 226184 - 227149 622 ## BT_3074 hypothetical protein + Term 227187 - 227234 4.6 - Term 227172 - 227225 7.2 136 47 Tu 1 . - CDS 227240 - 228235 496 ## PROTEIN SUPPORTED gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase - Prom 228341 - 228400 9.2 + Prom 228212 - 228271 3.0 137 48 Op 1 . + CDS 228360 - 229649 1272 ## COG0826 Collagenase and related proteases 138 48 Op 2 . + CDS 229695 - 230096 442 ## COG0824 Predicted thioesterase 139 48 Op 3 . + CDS 230093 - 231217 894 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake + Term 231255 - 231286 1.8 + Prom 232009 - 232068 6.3 140 49 Op 1 . + CDS 232101 - 236633 2466 ## COG3513 Uncharacterized protein conserved in bacteria 141 49 Op 2 . + CDS 236630 - 237643 796 ## COG3943 Virulence protein 142 49 Op 3 4/0.000 + CDS 237636 - 238568 658 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 143 49 Op 4 . 
+ CDS 238595 - 238900 156 ## COG3512 Uncharacterized protein conserved in bacteria + Term 238910 - 238939 1.4 Predicted protein(s) >gi|225935350|gb|ACGA01000042.1| GENE 1 4 - 921 530 305 aa, chain - ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 1 290 27 320 331 126 28.0 5e-29 MKRSIKRLPKRTQEELAVLQELILSNLTNVRMIILYGSYARGKYVIWDETYDERGVTTYY QSDLDILVICDTRDANKAERHAREVIVPKYDTRMEGKRHPAPPNIIVETPVTINRAIRRK HYFFYEIIKDGILLYDNGTFQIGKPEKLPYREIKQYAEEEYEECFYMGECFLDSGHTAYE KGNFKYGSFLLHQACERYYKTFTLVYSGIRPKSHELKVLGAMVRSCSRGFANVFPTNTPE ENKAFDKLCRAYIEARYNRLFTVNKEEYEYMLARTEVLREVTIRECAARMTYYDEMIEKE EKDKI >gi|225935350|gb|ACGA01000042.1| GENE 2 980 - 2731 1495 583 aa, chain - ## HITS:1 COG:no KEGG:BVU_3461 NR:ns ## KEGG: BVU_3461 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 583 2 582 590 963 83.0 0 MEKYIAPDRKRIPYGMMNFAVIRRDDCYYVDKTRFIPMIEEADKFFFFIRPRRFGKSLTV NMLQHYYDILAKDKFEALFGDLYIGKHPTRDRNSYLVLYLNFSGIVGELHNYRKGLDAHC QTMFDYFCDIYADYLPKGIKEELDKKEGAVEQFEYLFTECNKTNQRIYLFIDEYDHFTNA ILSDIESLHRYTDETHGEGYLRAFFNKIKAGTYSSIERCFITGVSPVTMDDLTSGFNIGT NYSLTPEFNEMIGFTEEEVRQMLTYYSTTSPFNHSVDELIEIMKPWYDNYCFAEECYGET TMYNSNMVLYFVKNYIQRGKAPRDMVEDNIRIDYEKLRMLIRKDKEFAHDASIIQTLVSE GYVTGELKKGFPAVNITNPDNFVSLLYYFGMLTISGTYKGKTKLTIPNQVVREQIYTYLL STYNEAELNFSSYEKNELASALAYDGDWKAYFGYIADCLKRYTSQRDKQKGEFFVHGFTL AMTAQNRFYRPISEQDTQAGYVDIFLCPLLDIYSDMKHSYIVELKYAKYKDPENRVEELR QEAIDQANRYADTDTVKRAVGTTQLHKIVVVYKGMDMPICEEV >gi|225935350|gb|ACGA01000042.1| GENE 3 2862 - 3758 189 298 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631297|ref|ZP_00156835.1| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Haemophilus influenzae R2866] # 10 282 2 273 290 77 25 5e-13 MELKHGYYHLIAILVVAIWGLTFISTKVLINHGLTPQEIFFYRFLIAYLGIWVISPKRLF AGNWKDELWLMAGGFFGGSLYFFTENTALGITQASNVAFIICTAPLLTTILSLLFYKSEK 
ATKGLIYGSILALIGVGLVVFNGSFVLKLSPVGDLLTLLAALSWAFYSLVIKKMTGRYPT VFITRKIFFYGVLTILPAFLLHPLQPDFDVLLKPVVLSNLLFLAVLASLVCYVLWNVVLK QLGTVRASNYIYLNPLVTMVASIIILHEQITWITLLGAGCIIFGVYQAEKKWKSKTCI >gi|225935350|gb|ACGA01000042.1| GENE 4 3755 - 7747 2502 1330 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 820 1048 8 231 294 132 37.0 6e-30 MKKHLLSALFCLFVLTLQASERLNFRTFDVKSGIADNYVQSILCDRYGFMWFSTLNGLSR YDGYQFKTYTTTQLGAYNNDIEFTAEDASGTIWIKTPKSYYFYNREKDEIDDHITPILEQ FGIYGTPDNLYIDKDLNLWCTVNSTLYYYVFRQKELYTLSLPQNCQVLQIDCYQSNAYVL LSDSYIYHIDWRAQSLCKEVQIVLPLGWKQRMYIDTFSRLWLYVPHTKGVCCYDPNAKEW FSFPGEKEITNELITAVIDDGKGNIWIGTDNKGIYISYHQDGKGYERLSKEVDNPFSLPN NHICCFYKDHSDIMWVGTSKLGATFTSLHNVTFETCRLPQQEDVSCLVEDKDGNLWLGFD GEGLAYYDRKKNNYTFISKKQDAIPSDIIVCSYKDSKGRLWFGSYGNGAFYEQNGKFSKV ATTYEQKYSIDYVRCITEDKYGNIWLGTIMNGIYCLDNTGNITPYTMEKTGMLTNSITTF SCADGINLYIGTSSGLYCMNTETKEISLLENNNSASNLFEEVHINCIYQDTRGLLWIGTR KGMDVYNKNTKERIHLSTKNGLSHDYIRAITEDKEKNIWVTTDHGVTNIIATNSPEPSSK FLCYPYFEEDGIGNMAFNNHSITCTQQGEILMGGSGGYLRINPRFINYRHESHPVIYTSL YLANQRMEVGSKNSSGRILLNKNIQLLNEITMDYSDSNFALEVSSMDYNSLHKLQYAYRL NKDEEWVKLEGNRIYFNKLSPGTYQLEVKVYENNNSYKNSKASCLTIHVLPPFWLSVPAY IGYTLLLAGVIIFFFLKMKRKHIRNLSQQKREMEIIQQHEMDEAKMRFFTNVSHDLRTPL SLIIIPLEKLLSSNLEKGVKEELELIHRNSETLLNEVNQLLDFRKLDQQKTQLMLSHGNL SDFVKEVCGSFIPQAVKKGINIRLHINETKMDMDFDRNKIQRILLNLLSNAVKYNYENGE VIVALDKITANGTENAQIQVADTGIGIKDENKDKIFDRFFQEQHSTTTYIGNGIGLHIVK EYVTMHHGTITVENHIPQGTIFTITLPVTHNYAVNDEKQDEAEIREKAVDEVSDTTDTHK TSLLIVEDNDDFRNFLINCLKGTYQVFDAPNGKEALEILAHQSIQIVISDVMMPEMDGME LCRKIKTDIRYSHIPVILLTARTADEHELSGLKEGADDYITKPFNLEILLLRIQKILKWT KNNHEKFKTIDISPSEITVSSLDEQLIEKAIRAVEENMDNSEFSVEELSSYVGMSRGHLY KKLIMITGKSPLEFIRILRVKRGKQLLEQSQLNVSQIAYQIGLSPKQFAKYFKEEFGYVP SEYIKNNNTN >gi|225935350|gb|ACGA01000042.1| GENE 5 7984 - 8424 309 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172794|ref|ZP_05759206.1| 
## NR: gi|260172794|ref|ZP_05759206.1| hypothetical protein BacD2_13078 [Bacteroides sp. D2] # 1 146 1 146 146 284 100.0 1e-75 MKRKLLIAATSFMIASSVCGKDFNVLTKDFQPWGNTQFDMEGMKITWTQKWQGGGWWLAQ DCSEYESARIEFREVLPMDVVFIVIYSAKDERNEKLKSKVTIPAGKKEAAIMLDITYKKS IDGLGISGSKPGTVCVKSLVLEEMKR >gi|225935350|gb|ACGA01000042.1| GENE 6 8457 - 8897 479 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172795|ref|ZP_05759207.1| ## NR: gi|260172795|ref|ZP_05759207.1| hypothetical protein BacD2_13083 [Bacteroides sp. D2] # 1 146 1 146 146 240 100.0 2e-62 MKKNLLIAVLLLLTVSSVWAKDYNVETKKFKSWGEAQFDAETATITFTKNWQGGGWWLSR LDCSAYDCIVIEFEEALPMDVMFMATYSAKDDKDQKIKSKVKVPAGKKKAVLELEPAYKN SVDGLGISATKAGVVKLKSLNLKEKK >gi|225935350|gb|ACGA01000042.1| GENE 7 8953 - 12198 2354 1081 aa, chain + ## HITS:1 COG:no KEGG:BF0833 NR:ns ## KEGG: BF0833 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 39 1081 20 1045 1045 1113 55.0 0 MKKNFHKNMLITGACLALLFGGIVNVRAIAPEEYEAVQQSRTIKGTVVDKNGEAVIGANV VVKGTSKGTITDVDGRFTLNGVPEKAVLSVTFVGYKSKEITLKSGQSQIKVELSDDAELL DEVVVVGYGTMKKRDLSGAVSQIKGDDLMRGNPADLSQGLAGKIAGVVVNQSDGAPGGGI SIQIRGTNSFSTDSQPLYIVDGVPYDAGGTPSSDANSGNNESNPLAFINPHDIQSIEVLK DASATAIYGSRGANGVVLITTRRGEPGADKVRFTANFSFSRIANRVDVLDAYQYALYRNE QVENSYKYHGKAYTSLPYPGKWSYSNSDKSQGKYNPAPEDFLNPGIYTDEYGNATKVGVA DWQDLIYQDGFSQEYNISVSGGSRNGSWHSFSGNYLKQQGIIKESGFTRYSIRANMGRKI KSWLDMGVNISFSHTDTDFSRTNSNDYGVIRSALVFPTTYDPSDDETLNSDELSWLASNP YAYINSAKDNLKSNNVFTSSYLEIKLFPFLKFRQNLGINYTNRERGTYYGRRTSEGSEAN NINGKAGQSTNWSMGITAESLFTFDKTFKKIHSVNAVGGFTVEKRDYGSKSMSATGFPSD LTQEYDMSLGTLPGKLKSDRTDSSLASFLGRINYTLMDKYIFTASYRVDGSSKFTENNKW ANFLSGAVAWRLSEEKFIKNLNIFSNLKLRLSYGETGNQGIGSYRTLPMLNPANYPFSGG AISSGFAEVEWRGPVAEDLHWETTAQYNAGLDIGFFNNRVNLTVDYYYKKTRDLLQEVKI PSSTGFSSMMVNRGYVTNEGLEISGKFYVLRNTPLKWDIDANISFNKNKVGGLDSDQFSA RLWYKADDVFLQRNGYPIGTIFGYVEDGFYNNLAEVMASPDPSVRAKGKSMIGEIKYRNF DDDPAITNADRVVIGDTNPDYVYGITNNFRWKNFTLSFFLQGSQGNDIFNGNLMEVKMGN TANIPVDAYNTRWTEANRASAKWPKAVNSYERTMLISNRYVEDGSYLKLKNLSIGYTWRP 
KFKGISSINIYGNATNLFTITGYSWFDPEVNAFGSDTSRRGVDIYSYPSSRTFSLGLKVD F >gi|225935350|gb|ACGA01000042.1| GENE 8 12222 - 14018 1573 598 aa, chain + ## HITS:1 COG:no KEGG:BF0834 NR:ns ## KEGG: BF0834 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 597 10 555 555 356 38.0 1e-96 MKLYKLYVVLSLWATGSLGLASCSDFLKEEPYAFVGPDQLGNDNAAVDLWLTGVYSKWGN NMFRYSSFPRCLELDSDYITGPDWAFSNLGAGNFQADEYATSIWTGCYNLIHRANVAIHY VNEITGADEKVKRNALGELYFQKAFSYFLLTRAYGEIPLFDVAISEGASYNQPRRPIPDV YAEIIRLLEQAAGMMYKNTDAEYQKGHVAAGTAIGLLAKVYATMGAGALPAGEKMIVKSG PSYSYNAGTKILTNPVSNTFEKTQVAGYESLNWKDCYEKAAYYAGIIMGRNPDVSYGTYR LLSFDELWKKSGFDESEHMFSVRTRSADEEYGNGVHQWYCGMQDAGGVLQKGPWVGNRFH WYCLFDNEDYRITKGVKHRFQYDSQVKNGRGYYYPGTPEYKLMATGKNMDDVKVQEPVAP FNDGLTYTNSISSNCLAFTTKYADVTDPTIGRTDAYWPFLRYADIVLIYAEAQCELNDGI SQDAIDALNDIRIRSNAKEASVTGDGAIETKVELRSVIFEERAKELAYEGDRRWDLLRWG IYLDVMNSLGGINEDGTRTPYDEGSINKRREFRHLLYPLPSLEVSTNMAIDKNNPGWN >gi|225935350|gb|ACGA01000042.1| GENE 9 14052 - 15224 1043 390 aa, chain + ## HITS:1 COG:no KEGG:BF0761 NR:ns ## KEGG: BF0761 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 52 348 42 330 368 110 27.0 1e-22 MKTLNRILIGTVLLATLFSCQDVVTYNDGYDDGMNSFGAPDVQGIYSPDDTDLQVPLTKG GFGEMIVLVGENLSNVTKITFNNMEVDLSEVYATSKKACFPLPGDMPEEITNKLYYETER GNTTCEFKLDVPVMEINGLYNEFALPGTSLQIKGKYFELYGFGKNSSSVIKFGDTELEIE SVTDQVITAKLPGDIPDNSMIVVEWNGTEGHCSYQLPYRQTDNLVWDLANPSDYGIWGGK DFITDGSNTGDPVALAGSFFRVKGTYKAWSYTGLPSGSIILDEEVAANPSDYLFKFEVNS ASDYPFYDITSKTGGYLIKLNGGNYQWKYSLDENMNTFGEWYTVTLELENVATAGLVSGK IALTLAMQPNVEWNVDHSFANIRIEKKIGQ >gi|225935350|gb|ACGA01000042.1| GENE 10 15266 - 16081 435 271 aa, chain + ## HITS:1 COG:DR0821_2 KEGG:ns NR:ns ## COG: DR0821_2 COG0657 # Protein_GI_number: 15805847 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 43 209 6 176 242 108 35.0 2e-23 MIKRICFLLSLLMCGVLYAQDITYRTVKDLSYSSSKDPYATERCKLDVYYPENKTGCPVV 
VWFHGGGLTQGNKSIPGRLKKNGMVVIAVNYRLLPKVAISECLDDAAASVAWAFREVEKY GGDKNKIFISGHSAGGYLTAMVGLDKRWLKKYDIDADSIAGLIPFSGQVISHFSYRKMNG IDNLQPTVDEFAPLFHVRKDAPPLVLITGDRELELFGRYEENAYMWRMMKLVGHKETFLY EIGGHGHGPMGDPAFYILKQHIKRILGEPEK >gi|225935350|gb|ACGA01000042.1| GENE 11 16321 - 17520 672 399 aa, chain + ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 67 355 55 326 362 87 28.0 5e-17 MKKSRPTIGIVKSRVFLTGTAFVLMACSSGEEFAGHSGKGEEPVITPVLCTSGASEQAVK VYDFLRENRNKKTLSGTMACPSWNVNEAEWVYQHTGKYPAIAFFDYLSLEYSPCSWIDYS KTKIVEDWWNKNGLVGAGWHWRVPCVQGSAERHYTPGDGSVDPGTGKRTTTVFSAANATK EGTWENEVVKADLKKLAGYLKLLRDKRIPVIWRPLHEAAGNIYNYKNGKAWFWWGNDGAE AYKKLWIYIFNYFKKEGINNLIWVWTTQTKDSEFYPGDEYVDMVGRDMYPAKDEYTTGEY CFRQYGTITASCPGKLVALSECGNGEQSGKVYHLARISAQWEAGAKWTYFMPWYDYSRTK ELDSEAFTATSHRYADKDWWVDAMSQDYVITRDQLPSFK >gi|225935350|gb|ACGA01000042.1| GENE 12 17556 - 18626 840 356 aa, chain + ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 66 319 79 326 362 102 28.0 1e-21 MAGVFLLALGMGGCVIREQQSASKTAETEALLDKLIRLPKKGFMFGHQDDPVYGIRWDGD ENRSDVKSVCGDYPAVMAFDLGRIERGGEKNLDDVLFERIRAEIIAHYNRGGVCSLSWHV DNPVTGEASWDVSDSTAVASVLPGGENHGKFLGWLDRVADFMNSLVTKEGVKIPVIFRPW HEHTGSWFWWGQNLCSAEEYKSLWKMTYDYLQEKGVNHLLYAYSPGSEPDNVNEYLERYP GDGMVDLFGFDTYQFEREKYVDTMEKSLTILTEAGRLHNKPVAVTETGYEAIPDSTWWTE TLFPIVDKYPVSYVLVWRNAREKEAHYYAPYPGQISALDFVEFYKHPKTIFVSDLK >gi|225935350|gb|ACGA01000042.1| GENE 13 18645 - 19823 1190 392 aa, chain + ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 44 360 19 320 326 125 30.0 2e-28 MKTFKEKIRDLFDDYEKLVTRKNIPVEGGNGIFTRYQYPVLTAAHTPVFWRYDLDEKTNP 
FLMERIGMNATLNSGAIKWNGKYLLVVRVEGADRKSFFAVAESPDGIDNFRFWDYPVTMP EDIIPATNIYDMRLTVHEDGWIYGIFCAERHDDTAPAGDLSSATATAAIARTKDLKTWER LPDLKTRSQQRNVVLHPEFVDGKYALYTRPQDGFIDAGSGGGIGWALVDDITCAEVKEET IIDRRYYHTIKEVKNGEGPHPIKTSKGWLHLAHGVRGCAAGLRYVLYMYMTSLEDPAKVI ASPAGFFMAPEGEERIGDVSNVLFANGWIADEDGTVFIYYASSDTRMHVATSTIDRLVDY CMNTPVDGLFTSASVETLKKQIDKNLKLMHTL >gi|225935350|gb|ACGA01000042.1| GENE 14 19820 - 21187 949 455 aa, chain + ## HITS:1 COG:yagG KEGG:ns NR:ns ## COG: yagG COG2211 # Protein_GI_number: 16128255 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 4 454 6 444 460 277 35.0 3e-74 MIGLREKIGYGFGDMASSMFWKLFGSYLMIFYTDVFGMPAAVVGTMFLITRVWDSAFDPI IGIIADRTQSRWGKFRPYLLYLAVPFAVIGVLTFTTPGFSDDGKVIYAYFTYSLMMMVYS AINVPYASLLGVMSPEPKDRNMLSTYRMTFAYIGSFIALLLFMPMVNRFSMGHDEQHGWM MSVIVIAVLCTLLFYGCFAWTAERVKPIKEQQNSLKSDLQDLLHNRPWWILLGAGVAALV FNSIRDGATVYYFKYYVIEEEYASVSLFGISFVLSGLYLAVGQAANIVGVVLAAPLSNRI GKKKTYMWAMSIATVLSVIFYWFDKEQLMLMFIFQVLISICAGSIFPLLWSMYADCADYS ELKTGNRATGLIFSSSSMSQKFGWAIGSAVTGWLLAFFGFEANAVQSEEAIHGIRMFLSW LPAMGTVLSVIFISLYPLSEKEMRKITNQLNDKRK >gi|225935350|gb|ACGA01000042.1| GENE 15 21195 - 22478 665 427 aa, chain + ## HITS:1 COG:BH3326 KEGG:ns NR:ns ## COG: BH3326 COG3458 # Protein_GI_number: 15615888 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Bacillus halodurans # 134 407 21 302 319 114 29.0 3e-25 MKIHRWIISLVFCCCVTFLSAQIRGTEITVVVSPDHTDWKYQLKEKCSFTIQVYKAQNLL SEVTIDYELGPEWYPVEKKKGILLKDGKTSVTGCMDVPGFLRCKVRASVGGHVYEGLATV AYAPEELRPHASVPADFSQFWNTVLKEARQIPLLPVMELIPDRCTDKVNVFHVSFQNICN GSRTFGILCIPKKPGKYPALLRVPGAGVRPYSGDVHMASKGVITLEIGIHGIPVTMPQMV YDVLSKGALNGYPHQNDNSRDKSYYKRVFAGTLRAVDFIASLPEYDGKNMGVTGSSQGGA LSVVTAALDKRITFYAAVHPAMCDHQAHLKKTAGGWPHYFYYFPQPSKERITTAAYYDVA NFARLITVPGWFSWGYNDEVCPPTSMYAVYNIITAKKELHPYLETGHYLYQEQSDEWNKW LCRQLGL >gi|225935350|gb|ACGA01000042.1| GENE 16 22525 - 23682 739 385 aa, chain 
+ ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 5 385 16 384 388 76 24.0 9e-14 MQCMLTGNILPFWMNHMVDSEHGGFYGRISGTGERIPGASKGVVLNARILWTFSSAYRLL HKDEYLKMANRAKQELITHFYDHEYGGVFWSVCEDGSPLDTKKQIYALGFAIYGLSEYHR ATGDAEALEYAIRLFYDIEEHSFDRIKNGYCEALTREWNEIEDRRLSEKDENERKTMNTH LHILEPYTNLFRVWNNEYLERQLRNLIELFTDRILDLETGHLRLFFDDDWNSRHDIVSYG HDVEASWLLHEAALLIGDKRLITRIEPLVIEIAEAASEGFLPGAGMIYEMNGSCGSIDAE RHWWVQAEAIVGYINLYQHFDDRLSLLRAVRCWEFVKRHLVDYDNGEWFWGIRADGTVNR AEDKAGFWKCPYHNGRMCMEIMERF >gi|225935350|gb|ACGA01000042.1| GENE 17 23710 - 24528 580 272 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172806|ref|ZP_05759218.1| ## NR: gi|260172806|ref|ZP_05759218.1| hypothetical protein BacD2_13138 [Bacteroides sp. D2] # 1 272 1 272 272 568 100.0 1e-160 MKKWILIILLFTGCSADHQAQEGGVQTQVKVDFSKMHFGCDGNSITAGNQWSKTVVDILG FATHHNVAVGSAKWACYIDTQEYGSKDFAGISGGWKSTDDKVEIQKRHNNVAKVHIQKFI SEVENGSFPAPDIFVFSMGTNDTKIGSALDALKEEILDKVDLTTMAGGARWCIQTIIERF PECRVFLCTPIQSGSASHNELNLKKIAVLREICNAFSVPVIDCYSECGIKAEDEVWGEHG RYLKDGLHPDVEGQQLMGRYIAKKIQDYLEVN >gi|225935350|gb|ACGA01000042.1| GENE 18 24845 - 27463 2318 872 aa, chain + ## HITS:1 COG:MA0523 KEGG:ns NR:ns ## COG: MA0523 COG0249 # Protein_GI_number: 20089412 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Methanosarcina acetivorans str.C2A # 7 869 4 899 900 619 40.0 1e-177 MNEEEIVLTPMMKQFLDLKAKHPDAVMLFRCGDFYETYSTDAIVASEILGITLTKRANGK GKTVEMAGFPHHALDTYLPKLVRAGKRVAICDQLEDPKMTKKLVKRGITELVTPGVSIND NVLNYKENNFLAAVHFGKASCGVAFLDISTGEFLTAEGPFDYIDKLLNNFAPKEILFERG KRLMFEGNFGSKFFTFELDDWVFTETTAREKLLKHFETKNLKGFGVEHLKNGIIASGAIL QYLTMTQHTQIGHITSLARIEEDKYVRLDKFTVRSLELIGNMNDGGSSLLNVIDRTISPM GARLLKRWMVFPLKDEKPINERLNVVEYFFRQPEFKELIEEQLHLIGDLERIISKVAVGR VSPREVVQLKVALQAIEPIKEACLEADNASLNRIGEQLNLCIPIRDRIAKEINNDPPLLI 
NKGGVIKDGVNADLDELRQISYSGKDYLLKIQQRESESTGIPSLKVAYNNVFGYYIEVRN VHKDKVPKEWIRKQTLVNAERYITQELKEYEEKILGAEDKILALETQLYTDLVQALTEFI PQIQVNANQIARLDCLLSFANVARENRYIRPVIEDNDVLDIRQGRHPVIEKQLPIGEKYI ANDVMLDSTTQQIIIITGPNMAGKSALLRQTALITLLAQIGSFVPAESAHIGLVDKIFTR VGASDNISVGESTFMVEMNEAADILNNVSSRSLVLFDELGRGTSTYDGISIAWAIVEHIH EHPKAKARTLFATHYHELNEMEKSFKRIKNYNVSVKEVDNKVIFLRKLERGGSEHSFGIH VAKMAGMPKSIVKRANEILKQLESDNRQQGISRKPLTEVSENRGGMQLSFFQLDDPILCQ IRDEILNLDVNNLTPIEALNKLNDIKKIVRGK >gi|225935350|gb|ACGA01000042.1| GENE 19 27531 - 28175 418 214 aa, chain - ## HITS:1 COG:MA1703 KEGG:ns NR:ns ## COG: MA1703 COG4845 # Protein_GI_number: 20090555 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 3 213 6 206 209 104 33.0 1e-22 MKHIIDIETWERKENFNFFRHFQNPQLSITSEVEYGGAKKRAKEAHQFFFLHYLYAVLRA ANEIPELRYRIDPEGRVVLYDKIDILSPIKIKENGKFFTIRFPYHEDFDTFHQEARKIID SIPEDGDPYAAKNGEVAHGDYGLILLSATPDLYFTSITGTQEKQSGNNYPLLNAGKAITK EGKLVMPIAMTIHHGFVDGHHLSLFYRKVEELLK >gi|225935350|gb|ACGA01000042.1| GENE 20 28172 - 29005 882 277 aa, chain - ## HITS:1 COG:VC0674 KEGG:ns NR:ns ## COG: VC0674 COG0682 # Protein_GI_number: 15640693 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Vibrio cholerae # 10 268 10 257 271 125 34.0 8e-29 MYSLLNINWNPNPELFNLFGISIRYYGLLWAVGIFFAYVVVHYQFRDKKIEEKKFDPLFF YCFFGILIGARLGHCLFYDPGYYLSHFWEMILPIKFMPDGNWKFTGYEGLASHGGTLGLM ISLWLYCRKTKLHYMDVVDMIAVATPITACFIRLANLMNSEIIGKPTDVPWAFVFERIDM QPRHPGQLYEAIAYLILFFIMIYLYKNYSKKLHRGFFFGLCLTYIFTFRFFIEFVKENQE AFENGMTFNMGQWLSVPFVIIGVYFMFFYDRKKVEKK >gi|225935350|gb|ACGA01000042.1| GENE 21 29055 - 29963 923 302 aa, chain - ## HITS:1 COG:BH1763 KEGG:ns NR:ns ## COG: BH1763 COG1893 # Protein_GI_number: 15614326 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Bacillus halodurans # 1 291 1 287 304 107 26.0 4e-23 MKYLIAGTGGVGGSIAGFLSLAGKDVTCIARGAHLQAIQTDGLKLKSDLKGEHTLRIPAT 
TAEEFNGKADVIFVCVKGYSVDSIVELIQRAAHKDTVVIPILNVYGTGPRIQKLVPGVTV LDGCIYIVGFVSGTGEITQMGKIFRLVYGAHRGTIVKPGLLEAIQQDLQEAGIKVDLSPD INRDTFIKWSFISAMALTGAYYDVPMGEVQKPGKVRDTFIGLTTESAALGKKLGVEFPEA PVTYNLKVIDKLDPESTASMQKDLARGHESEIQGLLFDMIAAAEEQGIDIPTYRMVAEKF KE >gi|225935350|gb|ACGA01000042.1| GENE 22 30011 - 31114 1240 367 aa, chain - ## HITS:1 COG:PA4673 KEGG:ns NR:ns ## COG: PA4673 COG0012 # Protein_GI_number: 15599868 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Pseudomonas aeruginosa # 1 367 1 366 366 430 59.0 1e-120 MALQCGIVGLPNVGKSTLFNCLSNAKAQAANFPFCTIEPNVGVITVPDERLNALAELVHP QRIVPTTVEIVDIAGLVKGASKGEGLGNKFLANIRETDAIIHVLRCFDDDNVTHVDGSVN PVRDKEIIDYELQLKDLETIESRIQKVQKQAQTGGDKAAKQAYDVLVQYKDALEQGKSAR TVTFETKDEQKIAHELFLLTSKPVMYVCNVDEASAVNSNKYVDMVREAVKDENAEILVVA AKTEADIAELETYEDRQMFLAEVGLEESGVARLIKSAYKLLNLETYFTAGVQEVRAWTYE KGWKAPQCAGVIHTDFEKGFIRAEVIKYEDYIQYGSEAAVKEAGKLGVEGKEYVVQDGDI MHFRFNV >gi|225935350|gb|ACGA01000042.1| GENE 23 31181 - 32323 1060 380 aa, chain - ## HITS:1 COG:araJ KEGG:ns NR:ns ## COG: araJ COG2814 # Protein_GI_number: 16128381 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 378 1 383 394 309 48.0 5e-84 MKKSLIALAFGTLGLGIAEFVMMGILPDVAKDLGISIPTAGHFISAYALGVCVGAPVLTL ARKYPLKHILLVLITLIMIGNICAAMAPNYWVLLAARFISGLPHGAYFGVGSIVAEKLAD KGKGSEAVSIMIAGMTIANLFGVPLGTSLSTMLSWRATFLLVGIWGIVIMYYIWRWVPHV EGLQDTGFKGQFRFLKTPAPWLILGATALGNGGVFCWYSYINPLLTNVSGFSVESITPLM ILAGFGMVMGNLISGRLSDRYTPGKVGTAAQALICVMLLLIFFLSPYKWAAALLMCLCTA GLFAVSSPEQILIIRVAKGGEMLGAACVQVAFNLGNAIGAYVGGLAVSGGYRYPALTGVP FALIGFTLFLIFYKKYQAKY >gi|225935350|gb|ACGA01000042.1| GENE 24 32576 - 33883 765 435 aa, chain - ## HITS:1 COG:MA1868 KEGG:ns NR:ns ## COG: MA1868 COG3177 # Protein_GI_number: 20090718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 23 411 67 458 473 269 37.0 
7e-72 MIELPPQISQKEYLSVLLTEPLEKLKAIIEKNNDKYSYWSDVKYQRTPDGISNKELWCYI KASRKAKRMIVWKKYGISISITNAMQRMCHEFDMNFGGSWGSSSIIPAENKEQYLISSLM EEAISSSQMEGAATTRKVAKEMLRKSLSPQDKSQQMIFNNYQTIRFITQHINTPLSSGLL QQIHVLMTEKTLENPADAGRYRTNDEVVVEDAISHEVVHTPPPYEDILPFIEDLCTFFNE NNPKVFIHPIIRGIIIHFMIAYMHPFVDGNGRTARALFYWYMLKSGYWLTEYLSISRVIA KSKKSYENAYLYTEADEKDMGYFIAYNLRVLELAFKELQTYIKRKLDQKHQASVFYKLDN INERQADILRLINDNPKILFTVKELQNRFSVTHTTAKNDIDGLVERKLIDEISLNKVKKG YVKGEMFDEIVKSIT >gi|225935350|gb|ACGA01000042.1| GENE 25 34033 - 35442 1225 469 aa, chain - ## HITS:1 COG:MA2933 KEGG:ns NR:ns ## COG: MA2933 COG1073 # Protein_GI_number: 20091752 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 30 466 42 489 496 355 41.0 1e-97 MKIIKAARIVTLITGLVIAGMFFSPPVSAQDISGTWHGKLSIPTGSLTIVFHISQTGQGT YVTTLDSPDQGANGIKTQTTSFNDSILTIQIPIIHASYKGKLDSNQTISGTFTQGMPIPL NLEKGEASRPKRPQEPQPPFPYKNEEVIVRNEQDGINLAGTLTLPQKGTKFPAVVMVTGS GAQNRDEEIMGHKPFLVIADYLTQNGIAVLRCDDRGTAASQGNHATATNEDFATDTEAAI NYLRSRKEINAKKIGIIGHSAGGIIAFIVAKKDPSIAFVVSLAGAGVKGDSLMLKQAELI SKSQGMPDAVWQGVKPSIRNRYAILQQTDKTPEELQKELYADVTQTMSPEQLKDLNTIQQ ISAQISSMTSPWYLHFMRYDPAEDLKELKCPVLALNGEKDIQVDAAMNLTAIQERITGNG NKNVTIKAYPNLNHLFQTCEKGTLAEYGQLEETISPEVLKDIMEWIKKQ >gi|225935350|gb|ACGA01000042.1| GENE 26 35617 - 36828 844 403 aa, chain - ## HITS:1 COG:no KEGG:Slin_0557 NR:ns ## KEGG: Slin_0557 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 26 360 31 366 405 268 41.0 3e-70 MKKSAKLWMVILFLFTGAGRVTAQIPSTRIYDGKKLAKVKARMESKEYASAITKLMNEAD KALKSKPVSVMDKTMVAGSGDKHDYVSMGPYWWPDPTKPDGLPYIRKDGVRNPNATSDCT NIGKTINDISTLGIAYYFSGNEKYAAKAAELTRIWFLNPETRMNPNMNYAQMIPGHNENK GRGFGMIDVYAFINLLDAVEMMNTSTDFTIADRSGLKDWFTDYLEWIRTSPVADEARTSE NNHGVSFDVQQTVYALFTGDSVLARKTISEFAKLRLFPQIEPDGRQLRELERTNGLSYTN YNLALIMDMCAIGHTLGIDVYNSTSKDGRCIAKALKYMASFIGKPQSEFPYQQSRDWEKE LQTSCWILHRASFYDNKSGWEEICQKYLKPSSTDRRRLLYSLE >gi|225935350|gb|ACGA01000042.1| GENE 27 36890 - 38140 
1001 416 aa, chain - ## HITS:1 COG:no KEGG:Phep_2694 NR:ns ## KEGG: Phep_2694 # Name: not_defined # Def: glycosyl hydrolase family 88 # Organism: P.heparinus # Pathway: not_defined # 71 411 45 383 383 457 63.0 1e-127 MRLKKQIAILALGAGFAACTPQPATNLIEDNMEFAQQQLRFAFDEIDYAIVNESPESRAK REKNGWGELTNPRNSEPDGTLNLVPSKDWTSGFFPGELWFLYEYTQNNFWKKKAQQHTDI LEKEKMNGSTHDMGFKVYCSFGNGYRLTQDEHYKEVLLQSARTLATRFKPAAGIIRSWDH STAKWACPVIIDNMMNLELLFWATKESKDSTFYRIAVDHARTTMKHHFRPDFSSYHVIDY DTITGQVLKKNTHQGFADESAWSRGQAWALYGYTMCYRETRLPEFLEQAQNIEKYLFTHP NMPEDLIPYWDFDAPGIPDEPRDVSAATVIASALYELSLYDPEKGERYRSNADKIIENLT KHYRAMLKKDNGFLLLHSTGTKPTNTEVDVPIVYADYYFIEALMRKNKLEKTGKQF >gi|225935350|gb|ACGA01000042.1| GENE 28 38149 - 40530 1478 793 aa, chain - ## HITS:1 COG:no KEGG:CPE1875 NR:ns ## KEGG: CPE1875 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 27 789 49 809 1479 526 37.0 1e-147 MKTRLFLLFIMCLFILPSASAQQKEYSLWYSQPAPNEGAENIVKSRGFPYDKYWERWSLP IGNGYMGACIFGRTDTERIQLTEKTFGVKGPYKKGGIGNFAEIYIEGIHHDQPLNYKRSL RLNDAISRVNYQYEGVNYTREYFANYPSNVIVVKLKADQPGKISFTLRPVLPYLHEYNDE GTGRTGKVSAQNDLITLTGDIQFFRLPYEAQIKVIPSGGQLKAMNDELGNNGTIRIQQAD SVVLLINAQTAYQLKSSVFTASPENKFTGNEHPHRAVSQCIQKAADKGYEALCKEHIADY QSLFSRVDLHLCNETPGIPTDSLLHDYQRGKESLYMDELLFQYGRYLLIASSRKGSLPPH LQGAWSQYEYAPWSGGYWHNINIQMNYWAAFNTNLAEVFIPYVEYNEAFRQSANEKATGY IKKNNPDALSAIPEENGWTIGTGANAFSIDSPGGHSGPGTGGFTTKLFWDYYDFTRDEDI LKKHSYPAMLGMAKFLSKTLKPTEEEYLLADPSSSPEQYHNGTTYQTKGCAFDQGMIWES FHDVLKAADILKEESPFLRTIKEQIGKLDAIQIGESGQIKEYREEKKYSDIGDPRHRHIS HLCALYPGTLINAETPEWLKAATVTLNNRGDKSTGWGVAHRLNLWARVKDGDMAYQRYQL LLKKYILENLWNMHPPFQIDGNLGGTAGVAEMLIQSHEGYIDPLPALPAAWRDGSYEGLV ARGNFVVSVFWKQGLMTQMNVLSRAGGECVIQYKDIANFTIKDAKGKKVKTIRESKNRIR FATQKGNTYYLNH >gi|225935350|gb|ACGA01000042.1| GENE 29 40536 - 43376 2356 946 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: 
Thermotoga maritima # 51 813 2 758 778 513 39.0 1e-145 MKQLLITVLLSGSCGFLAAQKTVKTPAIYKPVRTEMYKKGWIDFNKNGMKDIYEDPSAPV DARIEDLLKQMTLEEKTCQMVTLYGYKRVLKDDLPTPEWKNQLWKDGIGAIDEHLNGFQQ WGLPPSDNEYVWPASRHAWALNEVQRFFIEETRLGIPTDFTNEGIRGVESYKATNFPTQL GLGHTWNRELIRQVGVITGREARMLGYTNVYAPILDVGRDQRWGRYEEVYGESPYLVAEL GIEMVRGMQQDYQVAATGKHFIAYSNNKGGREGMSRVDPQMSPREVEMVHVYPFKRVIRE AGLLGVMSSYNDYDGFPIQSSYYWLTTRLRGEMGFRGYVVSDSDAVEYLYTKHNTAKDMK EAVRQSVEAGLNVRCTFRSPDSYVLPLRELVKEGGLSEEVINDRVRDILRVKFLVGLFDH PYQTDLKGADEEVEKAENEEVALQASRESIVLLKNDQDVLPLDISGIKKIAVCGPNADEC SYALGHYGPLAVEVTSVLKGIQEKTDGKVEVLYSKGCELVDANWPESELIDFPLTEEEQK EIDRAVSQAKEADVAVVVLGGGQRTCGENKSRSSLDLPGRQLDLLKAVVATGKPVVLVLI NGRPLSINWADKFVPAILEAWYPGAKGGKAVADVLFGDYNPGGKLTVTFPKTVGQIPFNF PCKPSSQIDGGKNPGMDGNMSRANGALYAFGHGLSYTSFEYSDLKITPAVITPNQKTYVT CKVTNTGKRAGDEVVQLYVRDVLSSVTTYEKNLAGFERIHLKPGETKEVFFPIDRKALEL LNADMHWVVEPGDFTLMVGASSTDIRLNGTLTVTDRINDSTPQSKENESPISASTNQEMV NNVVDNDLTTFWEGNKGDYITFALQNEAKVDGISIAFHRDNGLETDFEIQLSSGGGQFLT VYSGTVKEYHKLLNFPFKGTTASDLRIVLGSDRVGITEIKLPQIKK >gi|225935350|gb|ACGA01000042.1| GENE 30 43423 - 45033 1290 536 aa, chain - ## HITS:1 COG:no KEGG:Phep_2282 NR:ns ## KEGG: Phep_2282 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 528 1 537 540 303 38.0 8e-81 MKKIYRNLKTAVLALGCMSLLLSSCSDFLDRNPHDFVSPEIFYKNESDCMMALAGVYWTL ATDHVYGERYSNILSNTDDLSYYARLNQSGQVYTNSHNSSNNDIYQTWTQLYSGINNANV LLDNIDAANIPDAAVKNRIKGEAKFLRAYYHFLLVQSWYEVPIRTETVTDINNSSLAATP HAEAIDWIIKEMEDCIDLVDDSKYDLSPSHVKKTTVQGILARVCLWRAGSPSNGGKEFYE KAAKYAKAVYDSHKHKLYQGDIYAIWKNIASDKYDTEYNESMWEVEFIGTRVDGKFTYSR IGNTIGNWQENTSATGKGYAYGFYCGSLILWDLFEKNPGDLRRDLSMATYKLNTNDAAVY WKDTEIVTRRCGKYRREWTTTSPKEKNNDQINYPVLRYADVLLMLAEAENEAGKAPTDLA YDAINEVRERAGIDKVENLSYAEFQQEVRDERARELCFESLRKYDLIRWGIYYDAAHNKL TEATNDKRWTTSGYYKAAKEYAANTEERHQFLPIPMKELGVNLLLKQNSYWSNTAE >gi|225935350|gb|ACGA01000042.1| GENE 31 45046 - 48195 2737 1049 aa, chain - ## HITS:1 COG:no KEGG:BT_3332 NR:ns ## KEGG: BT_3332 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 1049 21 1053 1053 905 50.0 0 MKQKVAFMLLVISFWCASAIQAQTGSQITGKVVDASGELAGVSVVVKGTTNGVTTNMDGE FKLGNVKSGDVLKFSFVGYKTQEVKVGNKKRFDIVMEEDAQTLEEVTVVAVGYGDVRRRD LTGSIGKANISDMTRTPVNNIAESLGGRIAGVQVTSTDGALGDNFNIVIRGAGSLTQSTA PLYVIDGFPQESSSMSALNPNDIASIDILKDASATAIYGSRGANGVVIITTKQGQASKAK VTYNGSVSISKVNKTMDMMSGYDFALLQQEIMSADDFSNYYLKNGVTLEDYKTFRDYDWQ DEIYRTAVSHNHYVGLNGGSDKMKYSASLSYSNQQGIVINTDLSRYQGRFNFSQQINKKI KVNANANFASNVQNGVSPSAQASSMSHSLLYSVWGYRPVSPSGSDLLAAMYDEDVSMSDD YRFNPVLSARNEYRRNTTNHLQANLGVEWEIIKNLKFKTTAGYTGRDIKREEFNGSQTRT GNSHPKNTQSKGINAKLTQQETRNYLNENTLTYQFNKNKHSFNALAGLTFQKYSDYTTSI TQDYITNESFGMAGIGKSEATPTTTASLGDNAMMSYFTRFNYNYKSKYYANFTIRADGSS KFAKDNRWGYFPSGSLAWTFSRESFVSGNLPWLSNGKLRASWGLTGNNRIGNYDHMAQLI THSDYYKYPWDSSFTTGYVLNSMANPSLKWETTEQFDLGIDLGFLKGRINLTLDYYIKTT KDLLLNAEVPASSGYATATLNIGKLRNKGFEITLESTNITTKDFTWSSNFNIAFNNNEII ALNSGQKEMLSYVRWDNAYNNMPAYISRVGESAGKLYGYIYEGTYKYDDFNQTTNSDGSI SYKLKDGIPRISDSVQPGDPRYKKLSNDGTNKITDDDRTIIGNGQPKHTGGFTNNFVYKN WDLNIFLQWSYGNDILNANRMVFENPSNRTNTNMFASYNNRWSAANPTSDMPRAKALDAK HYSSLYVEDGSFLKLKTISLGYNLGRKALYKLGIQAARVYFSAENIATLSGYSGSDPEVS VRNSVLTPGFDWSSCPRSFNASFGVNITF >gi|225935350|gb|ACGA01000042.1| GENE 32 48769 - 52845 2634 1358 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 796 1094 39 317 318 123 31.0 2e-27 MKRLFCTLLLAISVQFLFANQPYHFSHLSVKDGLSQLNVTCIYQDKLGYIWFGTRNGLNK FNGNSFEVYWNHADDPHSISSNIINCMAEDAEGNLWIGTENGLNRLDKNKNEFKRYHINQ QENKDKNKITSLCLDQQGTLWVGAASEGLFAYTPQGDSLKNISTKELKGNWVNDIIERDH KLYIASYNKGLLIYDLQQQAVVRSFQQNSESLPIPSNRIRKVHIDRKGNIWLGNDTHGVY VIKHDPAEVVCYNTRNGLSNNNIRCINESPSGKIWIGTFDGLNILNPVTNRIQTYRYGTQ GTLSHHSIYSILFDRMQTIWIGTYAGGISYYNPYGHLFDFYSPSSSLNKTLGILGPAIEH DGTLYIGAEGGGLITFDLEQQTFQQYILPHNPGESMQSVIKTVYYDKNRILCGTSIGEIY EFNLHNHHFSLIFQNNDRNPIYHISKSISGELIVGSTSPKEGLMFISDTPSLHVQTEFAV 
KGEGTFRFASVICVCEISNDVFLIGTKEKGLFFYDKINHSLLSYNHTNGLTAKHISSILK DRSGRIWISSMDGGLSEFNITTEQFTTYNEKSGLQSNQVCKVVEGADNSLWISTLNGISR LDLRTRSITNYNKESGIQIQEFSPCAGFRLSDNRIFFAGNNGFTLFNPNDIVTNPNVPPI ILDKLYINNKLVCPNDKENVLQENISTQKKIVLNHNQTNITIEYCALNYIFNSKNEYSYK LEGFDSDWNKVGNRRTAYYTNIPAGTYHFIVKGSNNDGIWNEQGATLEIVVLPPLWKTWW AYLIYSAVAIALIAFIIRYFTEKKRLQNNIRLKQMEAEVHEEFYQERNRLFTNFSHELRT PLTLIMAPLEEFVRRTDLVDDVHYKSQLMLRNAQRLLRIVNNLMDLQKNESGTMKLQVSE HDIVKFTHEAVSSFQDLALYRNIHLRFKHAVDEQPVWFDWNLLEKVYFNFLSNAFKNVPD GGSIRVELNVKSLSELAIFVPARLNKYKNPEIKYLTVSIQDNGVGIDPDELEKIFRPFYQ VAQNEHSKSGTGIGLSLSKAIIEMHHGVVWADNAPDSGAVFQFVLPIDKNGFDPETMVES SSNETVLFNVELPDTAFEAEHSGKKQATILIVEDNRDLNNYICSCLADKYNVIGVTNGEE ALAKAINLLPNLIITDLMMPKMNGIELIRQLKKDMNSSHIPIIMVTAKTGTNDIKEGYAA GADEYITKPFDASILKVRVDNIIQSRERLKDLYSKNFSLESLGVDMTSVDEKFMQNLYAT LQQNIANSDLNLDAFCRELGLSKSNLYRKIKQITGYSPNEFIRNFRLETAAKMLKETDMT ITEVYCAVGYNSLAYFSNCFKALYGVSPTEFKNKAEGK >gi|225935350|gb|ACGA01000042.1| GENE 33 52858 - 53073 89 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKIGTSFPIYFLICTKELQIPFRTYNFPSKPYKNRSTIYLIGKRAQNNGSLWEVFYCFL LFLLQVKTLFP >gi|225935350|gb|ACGA01000042.1| GENE 34 53107 - 54555 1420 482 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172822|ref|ZP_05759234.1| ## NR: gi|260172822|ref|ZP_05759234.1| hypothetical protein BacD2_13218 [Bacteroides sp. 
D2] # 1 482 1 482 482 911 100.0 0 MKKSILFNRTLHLVSYVLLSLTSCLFVSCDKDDDWDKNLGNNLLEGIWTRESISQKFVAT FNADHTSYICTYNIETEALEHVDLQGKYRVVDESVLIYQSGDKHRFKLSEDGNTVEITYG YDTGSPETEKTYTYQRFVEVQEPEPEPEEKELEFADVNAAKATLGSLKAGASVTVGMSNW EDVVVTKVSLRKQLTIEDTPMTNSSLASGNLTFTVPENMPVGSYSLIVAYTVNGKAKEVM FDAVTCTVKEEVTPPEPGAKVLVFKNQMMGSAQNRDFGCLLTVTDAGQLDIQTACYMNDD PTLNADENKKRRSEIDLIANTYSGPAFAFGNFDKIAHNLRNFRCNGTNLFTTSKDDAKDK ETAITMFPGYADIQTKFLVLQESIEKEKNIIDLVRNDQLAEISETATPALFDGTSKIDRI TVQSIKPGESVGRDRFDLNGVVVFKSSKNGKIGIMLIRQINELDGSNDAADATIVFDLYY QK >gi|225935350|gb|ACGA01000042.1| GENE 35 54589 - 55356 877 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172823|ref|ZP_05759235.1| ## NR: gi|260172823|ref|ZP_05759235.1| hypothetical protein BacD2_13223 [Bacteroides sp. D2] # 1 255 2 256 256 502 100.0 1e-141 MKKVCLLVIMVAVALMGHAQSLRNNLLKGYKPGDQLEKGVYTSDQDTPRFNTWYGAFNAK AVEVEEGPVIGEPLSYKGYNESGPSINLAGEGQLNRVSVYPLTKSNKEYARGAYYLAFLV NFEKLGSTKFYDFAGLDINPLSRAVRGKVFVAGEGKDKIKFGVAVRTDCTEGTETYDYNQ THLLVLKVDYDKKQAVLYVNPDLSQGEPANGLVANAEGDELKNGLKSIYYRYRKNYKGNI GNFRFATTWDAVIGK >gi|225935350|gb|ACGA01000042.1| GENE 36 55491 - 57350 1164 619 aa, chain - ## HITS:1 COG:no KEGG:Phep_2687 NR:ns ## KEGG: Phep_2687 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 118 611 22 502 508 506 52.0 1e-141 MKKLIHTLALQSIKVKLLGLLLFLAGNLQACSDEETASIQVNADVLYFNGTGETQELQIE SSGEWTVTVAKGGEWCHYKMKENAPSVLSVSTDANQTGKERLTRLIISSGSAEKRVEVYQ RAASSSTTPTYPQVDSDLPLSSLTDEQGNILPDFSNIGYMGSEQEIPDVKVVETIEAPAN GADATRLIQNAIDKVAQITSATNGFKGAILLKKGRYSISGTLNIRSSGIVLRGEGEDSKT GTVLIAAGKGQRTLIKFEGTGSSSPSSPSLFNIKDDYVPVGQFWVRVLNPSSFQVGDEVT IYRPGTAQWISDLKMNQIPERTTGDPIVQWTPESYNLSYERTITHIINDALHFDNPIMMA METPYNKGAVFKSSFKGRINRCGVEDMLIESEYTSDTDEDHGWVAIEFNKMEQSWVRNIT SRYFGNGLVSLNSGTRYVTVKDCKCLDAKSIITGGRRYSYNMNQAQQCLVIDCEATEGRH DCVTGSKGVGPNAFVRVNIRNAHGDTGPHQRWNVGTLYDNINSDGPINIQDRSNWGTGHG WAGANQVLWNCTAKEVCVQNPWVSAKNYSIGTKGTKHAGDFKNPVRPDGVWVRANATVSP ASLFEAQLELRKRTGRLYH >gi|225935350|gb|ACGA01000042.1| GENE 37 57659 - 59185 
1536 508 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 23 507 29 483 497 169 26.0 1e-41 MKTLIYLVPTLSLVGCTSQQVEEKPNVIVILADDLGFGDVSAYGSTTIHTPNIDSLAHGG VCFTNGYATSATSTPSRYALMTGMYPWKNKDAKILPGDAPLIINEHQFTLPKMMQQCGYV TGAIGKWHLGMGDGNVNWNETVKPGAREIGFDYSCLIAATNDRVPTVYVENGDVVGLDPA DPIEVSYEKNFEGEPTAISHPEMLKMQWAHGHNNSIVNGIPRIGYMKGGQKARWKDEDMA DYFVDKVKNFVTEHKDAPFFLYYGLHEPHVPRAPHQRFVGKTTMGPRGDAIVEADWCVGE LLAHLKKEGLLENTLIIFSSDNGPVLNDGYKDGAPELAGNHLPAGGLRGGKYSLFDGGTH IPLFVYWKGKIQPVVSDALVCQVDILASLGSMVHADLPEGLDSRNYLDAFFGKASTARKD VVLEAQGRMAYRSGDWIMMPPYKGSERNLTGNELGNLGEYGLFNVKADRTQHQNEAAQHP QLLDSLKQNFFAEVDGYYRSEVEEEPLK >gi|225935350|gb|ACGA01000042.1| GENE 38 59305 - 60957 1431 550 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2848 NR:ns ## KEGG: Cpin_2848 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 11 392 9 377 386 139 29.0 3e-31 MKMKAFLMFVLLFLCSMGTKAQDLGANYNENIDYPITEIEMLKQSKVNWVRGFVNIPTLF LQNENGKIVGVKEEAIKTHIPTLKFIQAKKALGDRVKFILSLKIPFELYTDTVPKVGTKE MEYIFRATEVLLQTYQMAQNIDILVMGNEPEWENALDTDLCHADGDDYRAFLNEFANRLT TWKQANGWTFDIYAGALNRVSELPKSETVPAVVSVVNNNPNVVGLDLHVHALNINQAEDD FRIIRNKYGVTKKLICTEFSMVRALNPHVADALGEWGTKNGYTAGMKIYEYLNLIAEKAN AGTPVSATEFKSLFESYSWYPKNWYKTFYEVFKKYDTYAITGRFSVVPGGARAVYDANTE MWELGGIYFSRYLGLDANGFYNPNPLLYPDFIAAQNGLAISSLVEGQQELFISWGNGADK TGVLSVTDKDGTEVVRQSLDSESDYTLVENLVPGTTYHVALLKSDDSVLWEDDVKTKLVT GEFPLLKYQQVDGYALVQLLNLPADAASYKLKVDGREIDIVNKNLNGQVLTAEVTYKDGS VETLSATIRK >gi|225935350|gb|ACGA01000042.1| GENE 39 60979 - 61383 311 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172827|ref|ZP_05759239.1| ## NR: gi|260172827|ref|ZP_05759239.1| hypothetical protein BacD2_13243 [Bacteroides sp. 
D2] # 1 134 1 134 134 268 100.0 1e-70 MKVIYVLLACLLVFASCAETDYAENGVDRKDITGYTEIGFYSLDGVCTFAEQEQVQAAIN SKRLTYRLQNSDQSKYIHVKFTGKPATVGQKIRVAFSFRGVSFLKEEMEIEVVRIDGGKV WLSGDNVGMVIPVF >gi|225935350|gb|ACGA01000042.1| GENE 40 61405 - 63036 1413 543 aa, chain + ## HITS:1 COG:no KEGG:HM1_0138 NR:ns ## KEGG: HM1_0138 # Name: not_defined # Def: multidomain protein with S-layer homology region, glug motif, ig motif, I-set domain # Organism: H.modesticaldum # Pathway: not_defined # 159 542 41 440 2976 86 24.0 2e-15 MKRYIPIAASLMAFSLVSCGEVMDLTQPEKAEVTYSGITLSLNQTGKYDLYLDEPEYEYT IMVEKSHCEKEAKAEFAVVDAHSFGEEYTLLPAANYDLDANSLDFKGDDVLHAVGLRFHD LTTLDNTKKYVLGLKLKSDDLAVNEEKSTMTFYLQQKQGGIGNPYIITAPKDLAKLGEYL KDGQTTYVRLGADIDLQGMDWTPVEATAVRPVDFDGCGHVINNLKITSSSSGYQGFFGML TGRCANVTFTNAQVTADKKLAGIVAGQAGNVSGAGIVENVRVSGTINLTSGNNAWDDGQA GGICGRLHGAESKIYQCGSETKITALWSAGGICGEVREGAAIEQCYHIGDITTQSCVGGI ASRLLGSTISHCYSHGVMKAVPMVVANPGGGIAGWVQPMSGASTTSTISYCWSDCDVSAQ NQVGGIMGNANNTTGSGITVHHCVAWNTYLFSQAAPKSGKVCGRYSQNVAYSCYANPMME CVFSGNPALPDQASVNVGTITTVDRYNGLTTISNLMEAVRTLDWDNSIWNLDGEQPRLAW ELN >gi|225935350|gb|ACGA01000042.1| GENE 41 63065 - 64051 782 328 aa, chain + ## HITS:1 COG:BH3683 KEGG:ns NR:ns ## COG: BH3683 COG3507 # Protein_GI_number: 15616245 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 37 284 16 265 528 72 26.0 1e-12 MNTFNKILLRMTLALGATLMSVSQLQADDWKEISFADPTIFVENGKYYLTGTRNQEPLGF SILESTDLEHWTVPNGSSLQLILRKGDRTYGEKGFWAPQYFKEKGTYYFTYTANEQTVIA SSKSVFGPFRQKEVRPIDASAKNIDSFLFKDDDGKYYLYHVRFNKGNYLWVAEFDMKKGS IKPETLKQCMDCTEPWEKTPNYKSAPVMEGPTVMKWDGVYYLFYSANHFMNIDYAVGYAT ATSPFGPWKKHPNSPIIHRSLVGENGSGHGDVFKGLDGKYYYVYHVHRSDSTVQPRKTRI VPLILEKGNDGLYSITVDKEHVIKPMWK >gi|225935350|gb|ACGA01000042.1| GENE 42 64192 - 67377 3491 1061 aa, chain + ## HITS:1 COG:no KEGG:Slin_2101 NR:ns ## KEGG: Slin_2101 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 15 1061 15 1038 1038 492 34.0 1e-137 
MKRNRLYFSFFCLCCFLLCTLSVHAQQIYWKGQVKDAVSGEPMIGVSVRVKGTGSGTITD FDGNFSVKASKGDILVISYVGYKTLELDLKNKTTLGVISLGEDAETLEEVVVVGYGVQKK VSSVGSIATAKGDDLLKVGSVNSVSEALQGQMPGVVAINSTSKPGADKASLLIRGKSTWG EAAPLVLVDGIERDFNDVDVNEIESISVLKDASATAVYGVKGANGVILLTTKRGLEQKPE ISFTANFGFKQPSAAPEWSDYVTSMKQYNRAQANDANWGALVPESTIAAWENAYATGNYG PYNDAFPQVDWWKEMVKNVGYEQNYNLNVRGGTKKMSYFVSLGYLHDGDIFNTTKQEEFD PSFSYRRYNWRSNFDFNITNTTKLSFNVAGKMGYQNQPSYYENVDSPDERFFGTFFTAPS NEFPIKYSNGIWGDGLSSDQNIVCLMTEGGSRNIKQHQGFYDVILNQKLDFVTKGLSLKA SLSYTTSSSWSTQIMPGKILGKDDLVAQRTHIRINRVYDYANPIYNPDGTITYNYTEKRY PDENAPGDLPVGGAYDGFKAYGRKLYYEVALNYNRQFGDHDVSALFVFNRKMNESTSDNV AVMNFPAYEEDWVGRVTYNFKERYLAEFNGAYTGSEKFAPGHRFGFFPSASIGWRISEEP WVKKWTKGILTNLKVRYSYGVVGNDKGATRFNYIQKFEQLSENTQFGKYQTSNWGPLYKE GKLADPDATWEESVKQNIGIEIGLWGKLNFTLDLFDEKRNNILMTRNTIPSWADSGIAFP QVNLGKTKNHGLELDIAWNDRIGKFNYYAKFNFATSENRIVFIDDPKNQSEYLKKAGKSI GYVNKYLATGNFQSLDDIYNSAQSTIANGAHNTLIPGDLYYIDYNGDGMIDAKDMVPMKD LNYPVTTLGFTLGGSYKGFGFNMLWYSAMDVYKEAIPSYLWDFPAGNIKAQPNTLNTWTA DAPIQSGAVRPSIHVQRNYNSVGSTYTYTNHAYLRLKNLEVNYEIPKRWLQPVHLTKLQI YINGTNLLTFSKGDSRRDPEHSGQNVYPMVRRYNIGFRLGL >gi|225935350|gb|ACGA01000042.1| GENE 43 67374 - 69284 1833 636 aa, chain + ## HITS:1 COG:no KEGG:PRU_2735 NR:ns ## KEGG: PRU_2735 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 20 636 7 624 624 187 27.0 1e-45 MRHFKFTDMRRHKLNWIWIIMLGLFFGSCEDYLDRSPDDGLSEDDVYKDYNSLLGFMDRI FLNGDILIFTHGINSYSNTYVTVGNLSDEYASVRDDDPSKFVNAGNWLENASTRFEIGSK SDGANFKSAISRAYTGLRIVNRVINGVDQVKSITEDQKRKLLGQAHFLRAYFYFEIIKRY GGMPIFDQLWGASDDFDFPRKTYQESNAWMQTDLDKAIEYLPLSWSDVEHGRPDRVGAMA LKAMTQLYAASPLMQNDLNSIENKGYGKEMAAEAARSAQKAINAIESHEYYRLMNHDEYR SIQLMPNSNQFAQPEYLWFLRWHHGNWSAFVRAQWLTQPYDNKTGAEGTPYNAPTQNAVD MYERMGADGKYYPITDPRSGYDAVKTTDPYSDRDPRLTNNILVPGEQWGKNLQGVPYYVT TYSGGYSENFISTNQFTRGSQQTGYMCKKFIWPEASVPLFGDAGFQLYRLVAVYIRVSQV YLDMAEASYEATGDPDAIVTGCTMSARQALNKIRVRAGIGELPDGVDFREAYRRERSVEL MFEGHRWYDIRRWMIAEDLFKGEYPIMGVKATPINHSYTPDQLKVEKACTYKLTDFTYEY VPVRTAVRTFNKRNYWYPLPMDEVAALDNLQQNPGW 
>gi|225935350|gb|ACGA01000042.1| GENE 44 69307 - 72393 3069 1028 aa, chain + ## HITS:1 COG:no KEGG:BVU_0505 NR:ns ## KEGG: BVU_0505 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 41 1028 37 1006 1006 381 29.0 1e-104 MRNNNRYIRLIGLIFVLIVGSNLTFAQNKSKRKSAKKPLVEIVSIVTDEKGSPLKNVSII CGEGAVVLYTDAKGRFQTKVENDATVMVEALGYEDKVYKLSGSNIVPEKIVLKKEPLFLT EESLVNRADGGKTYKGNEVGATSVLENGQFGTFPDLTLTNMLQGKMLGLQVRSTVSGLGN NTPDLFIRGQHGMSENTAIVIIDGVERPAADLIPEEIERIELLKDATAKILYGARAANGV LWVTTRRGKANRRIYNATAEVGVVQMTRTPDFLNSYQYANLYNEARANDGMTPYYNQKQL EGYKNSKGANDLLYPNVDLADQLLNQNANYRKVSFDMTGGTDRVRYALIAGYVGGSGFED VGYTPQLNRLTLRGNLDFNVTDFLTISADVAGRMEMRKWGQLDCGQVFTALSTHRPNEYP LTMSPEETGLAGGSDGIPLFGASLRQPMNAYAETMYGGYTDERYTRSQTNIGLKFDLDML TKGLKAGAFLSFDNYDYLQLSLSKVYPTYAIKTYRDFAGEEQIMYTQMKKTDVATSQSRK STTLQQTLGWNAFAGYENTFNQKHDVSARLTYMYSKTTNQGVTQDIINANYALRLNYMYD HRYAVEADMALMGSNRFKPGNKYFFSAAGGLAWILSNEDFLKDNEYVNFLKLKTSAGILG YDRSTEHLLYERAWAQDGSFRFGTTNNGATAYYSTFVRAGNPNLKWEKAAEWNIGVEGLF LNNRLYTEINYFREKHTDIIGSVDASYGDYTGNFTYQDNMGSVLNHGIEGMFTWSDRVND WSYSVGANFVWSKNKVLKWNQVKHGEEYRYTVGRSTDAMMGLVAEGLFGKDVAINGHATQ TFADYQEGDIAYRDLNGDKVIDGRDVKELGNSFPRTTLGIDFSVSYKGWGLYLQGYSELG VHTWATNAYYWNNGESKYSKLALDRYHPVNNPTGSYPRLTTTAGENNFRNSSFWLENTSF FRMKNVELSYTFNQFSPSSVVKKIKVFARGANLFVLSSVKDLDPELLNAGVTNYPVTRSF TGGVSFVF >gi|225935350|gb|ACGA01000042.1| GENE 45 72419 - 74215 1483 598 aa, chain + ## HITS:1 COG:no KEGG:PRU_2737 NR:ns ## KEGG: PRU_2737 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 597 1 589 589 251 31.0 8e-65 MKHILGSCVLAGLFLFCACDDTLDTKVTWQIGDSDTWRVPELAQGVLYKAYNGIANRPDC FEDNFLDCATDNAMTNLRSSSVYKLSMGGMTAFNNPIGNWSNAYNMLNYVNSFLENGLTD QVQYNRTDTEVDKQIKLRLKGESYFLRAWWHFELLKMYGGKGKNGKALGIPLADHFISQE EAAQNGEFLRPTYQATVNFIVNDLNNAIELLPNAYQGDDLEFGNTQIGRATKAAAAVLKS RALLYSASPAMQDDDVTKITGMGQFEILNPTVYQAKWEAVAKEINKILGMEGFGTFVPVT ASSIADVQTESSDYAFRRYFNNNLLEGFHFPPFYYGSARATPSHNLVKAFYAKNGYPATD VRSGIDLSDPDFDMMQLYAVLDNRFALNVYYHSATFGDSGQSLDMSEGGKDSPSFSENAT 
RSGYYLAKFVSKKSAMLNPIQTLNSVHYNPLLRKSEVLLNFAEAANEAWGNPTVKGEGCL YSAYEIMKTIRLQAGGINFDLYLDEVAQSKDSFRKLIQNERRLEFAFENHRYFDMRRWVL PLNEEVEGVAVTRNEDGTFSFKVQKVEQRKYEVKNYFTPLPYSELEKNKNLINNQGWE >gi|225935350|gb|ACGA01000042.1| GENE 46 74245 - 75147 938 300 aa, chain + ## HITS:1 COG:no KEGG:Slin_2105 NR:ns ## KEGG: Slin_2105 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 26 226 16 226 331 69 28.0 2e-10 MKKLNILWIGLISVLMTACYDDYEKDYDKSAVYFASQKPLRTLVADTDMSIKVGVAIGGK REVHTDDWATFEIDPSLLDGTGLTLMPENYYQLANPNKMTISNPNLAIADVKVTFSDAFY NDDAALDKHYAIPFRLVDHNQDEVSTDVNGNLKDYSIVVVKFVSQYHGTYFVKGKVTNLS TQQVTEYNNKDLSQNMTRDFVSLGRNKVRRPGFGQTLENNESVNLTVNPDGSVTIEAGGS VAIADASATLDPAAESLEFVGKQPKFTLSYKYTKGGVTYQVNEELIRRQNPEADLRFQEW >gi|225935350|gb|ACGA01000042.1| GENE 47 75251 - 79261 2772 1336 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 773 1073 12 315 318 141 32.0 8e-33 MRTTALKIVLSFSLVLLNASLSTGQIQYNNIRFKQLSIAEGLPHNTINAITQDNHGFIWF GTRNGLCRFDGYNINLFAHNEADSTSLCHDFITRLYNDSLRNVLWISTDQGICSYDYRTE DFTRYRIKGNTKDDVCFLNTSDGMLLAACSNGLYRYNEQDSLFVPFLLNEEKPHVRYFAE DGDKTLWIDTNKGLMRYSLEKKQFVSLPTLIQPFAHQCNNAVLISSNQLLFNTNNDFFIY HIQSNTLCNLSKDLEVKDFRCASTDHTGNIWVGTEYGIFVFNKLYQLIAHYEQSERDLSA LNDSPIYSLYQDKAHNMWVGTYFGGVNYYIFGSDQFQIYPYGGSFNHLSGKAVRQIINAP DNGLYIATEDGGLNYLNSKKEITRAERLHKQMQIYAKNIHSLWLDKDNSLWLGLFLKGAL HYIPHLNRTVDYNLLSEEVSSGFCIIEDKNDHVWYGGPSGLFLIDKKKANARPEKISPLR VFNLVQFNDSILWAGTRKGSMFQINIRTLKVTSLPILPHTDLYVTYIYPDSHQRIWIGTD NNGLYVSDRNGSIIASYSKEQLGSNAIKGIIEDEMNNIWVGTGSGLCCINPQMENIDRYT TADGLPINQFNYSSACKKPDGELYFGTINGMISFYPEQVRSVNPHFNIALTAIWSNSEYM SPNNEKASIPSSISELTEITLTHDQAQSLRLEYSGLNYQYTYNTQYAMKMEGIDKDWQFV GNQHQVRFSNLPSGRYILKIKASKDGIHWDETGQKDLAIRILPPWWLSPGAYFVYALLSL LIIYAAYRYTKTRIILLMRLKTEHEQRVNMENMNQQKINFFTYISHDLKTPLTLILSPLQ 
RLIQQPQISNNDKEKLEVIYRNANRMNYLINELLTFSKIEMKQMHISVRKGDIMHFLEEL SHIFDIVAGEREIDFIVSLEDTKEEVWFSPSKLERILYNLLSNAFKYTQPGGYVRLSAKL IKEEKETMVQISVKDSGRGIPKDVQEQIFESYYQVEKRDHREGFGLGLSLTRSLIHMHKG EIRVESEVNKGSDFIVTLNVSEDAYAPDERSSENITTEEIQKYNQRIKETIELIPEKLTN KEKDSGRESIMIVEDNKEMNDYLASIFGEKYDIIRAYNGAEACKKIARQLPNLIISDLMM PVMDGLEFTERVKQDVTTSHIPVILLTAKTDENDHTEGYLRGADAYITKPFNAKNLELLV QNIQKSRKQNIEHFKQAEELNIKQITNNPRDEVFMKELVELIMANLTKEDFGVTEITAHL RISRSLLHMKLKSLTGCSITQFIRTIKMKEAKTHLLNGMNVSEASFAVGISDPNYFTKCF KKEFNITPTEFLKQLK >gi|225935350|gb|ACGA01000042.1| GENE 48 79470 - 81527 1745 685 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 6 536 9 510 835 269 31.0 1e-71 MNHPTLKSLTLGIGLFFTLPLVYANSSFPSSSDGTLYINKSKTRKVAPVKYGFHYEEIGM MGEGALHAELIRNRSFEEATLPAGLSVKNGLYENVPAPRVKEKKVFQADPLIGWTTYPLS YAPVFVSRTEEDPMSEENKYSMLVNVTEDIANHPDALILNRGYYGMNLKTDTSYRLSLFL KSRNYSAPLRVFLVDELGQRVSNVVEVNIENRDWTKYTGELKPEKNVQRGMLAIQPMSKG QFQIDVVSLFPSDTWNEGKSVFRKDIVQNLKEFAPSFIRFPGGCIVHGVNEETMYHWKKT LGPIENRPGQWSKWAPYYRTDGIGYHEFYELCEYVGADAMYVMPTGMICSGWVKQSPQWN FRHIDVDLDVYIQDALDAIEYAIGDTTTKWGAERAKNGHPAPFPLKYIEIGNEDFGPVYW ERYEKIYQALSTKYPDLIYIANSVIRVVGRENDDKRKDIGNFINPKNVKVFDEHYYNSIE WACQQHYRFDNYKRGVADLFIGELGISGKYPYNLLATGAIRMSIERNGDLNPLFAERPVM RHWDFLEHRIFLPMLINGVDSSVKTSFFYLAKMFREHTFDVCLDAAIKDMEGMQNIFVTM GYDTASKQYILKLINLQDKKVTLQPEVSGFKRPVKAHKTSLVLVPGKENTPATPNEVEPV ETEVGLDLNKPLELEAASMTVYRFK >gi|225935350|gb|ACGA01000042.1| GENE 49 81616 - 83163 1293 515 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 41 497 1 456 467 163 28.0 1e-39 MKNTLFCITGLAIQGLAMSAAAQTGKPNIVVIMTDQQRADLCGREGFPLEVTPYVDQLAQ ENVWFNKAYTVMPASSPARCSMFTGRFPSATHVRTNHNIPDIFYKQDLVGVLKENGYQTA 
LVGKNHAYLKPADLDFWSEYGHWGKNKKATPAEKETARFLNQQARGQWLEPSPISLEEQH PTKIVNEALAWIEKQKENPFFVWVSFPEPHNPYQVCEPYYSMFSPDKLPVLKTSRKDLAK KGEKYRILAELEDASCPNLEQDLPRIRANYIGMIRLIDDQIKRLIESLKASGQYENTIFV VLSDHGDYWGEYGLIRKGAGLSESLARIPMVWAGYQIKNQPAPMDGHVSIADLFPTFCSA IGAEIPTGVQGRSLWPMLTGKAYPKEEFSSMVVQQGFGGADVGLDASLTFEQEGALTPGK IAHFDELNTWTQSGTSRMIRKDDWKLVMNHYGNGELYNLKKDPSEVHNLFGEKKYSEIQT ELLTRLLAWELRLQDPLPLPQRRYHFKQNPFNYLK >gi|225935350|gb|ACGA01000042.1| GENE 50 83194 - 84324 827 376 aa, chain + ## HITS:1 COG:no KEGG:BT_3094 NR:ns ## KEGG: BT_3094 # Name: not_defined # Def: putative secreted xylosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 376 376 697 88.0 0 MKLIGRLGLILLTGFISICTASAQLYGTANTNAPELRVPKFVKPAFDYWMRDTWATLGPD GYYYITGTTSTPDRHFPGQRHCWDWNDGLYLWRSKDMKSWEAMGHIWSMEKDGTWQKKPK VYKAGEKYQKKSINGDPMDNRFHAVWAPEMHYIKSVKNWFIVACMNESAGGRGSFILRSK TGKPEGPYENIEGNKDKAIFPNIDGSLFEDTDGTVYFVGHNHYIARMKPDMSGFAEELKT LKEKKFNPEPYVEGAFIFKYDNKYHLVQAIWSHRTSEGDTYVEKKGVTSPKTRYSYDCII STADNVYGPYSERYNAITGGGHNNLFQDKNGNWWATMFFNPRGAQAAEYKVTCRPGLIPM IYENGKFKPNFNYNTK >gi|225935350|gb|ACGA01000042.1| GENE 51 84321 - 85988 1364 555 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 22 532 4 527 536 284 34.0 3e-76 MKTSLLIAGSLALVPMTYAQDKPNIIIILADDLGFSDLGCFGGEIHTPVLDKLAKNGVRM TQMYNSARSCPSRANLLTGLYPHQTGLGHMDGSHPAWPKGYSGFRSNSDNVTIAEVLKDA GYFTAMSGKWHLGNKSNPILRGFQEYYGLLGGFNSFWNPEVYTRLPKDRTPRHYEEGTFY ATNVITDYAIDFIDQAHQEKKPLFLYLAYNAPHFPLHAPKEVTDKYMSLYMQGWDKIRDA RWKRIVDLKLMQGKPELSPRGVVPESLFEDETHPLPAWDSLTKDQQTDLARRMSIFAAMV DVMDANIGRVVDELKKNGELDNTFIMFMSDNGACAEWHEFGFDKQTGVEYHTHTGAELDQ MGLPGTYHHYGTGWANVCCTPFTLYKHYAHEGGISTPCIISWGNQVKNKGGLDHQPAQFS DIMSTCVELAGATYPKEYQGRAILPTAGKSILPIVKGKKMPERYIYAEHEGNRMVRKGDW KLVSANFKGDEWELYNIREDRTEQHNLIGKYPEMAKELETAYFEWADKSDVLYFQKMWNT YNKNRRKDFKEYKTR >gi|225935350|gb|ACGA01000042.1| GENE 52 85994 
- 88198 1787 734 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 108 501 117 495 1087 86 25.0 2e-16 MDKRILFTLALGAISQVFAHAWEPKGDKIKTVWAEQVTPENVWQSYPRPQLQRAEWINLN GLWKYAVTDQNTVRKNVSFEGEILVPFAIESSLSGVQKTFLPTDKLWYQREFTLNPSWKN KSTILHFGAVDYECQVWVNNRLAGTHKGGNNPFSFDITKFLKKSGPQSIEVAVTDPTDME SISRGKQQLNQEGIWYTPVSGIWQTVWLEAVDKTYIRQVLPSTDIEKKSVKLNFDIAGAK GNEEVKIEILDDGKVIRTVEQKLSNTMEIDVPDAVLWSPESPKLYRLNISLSNGGRVLDQ VKSYFALRKVDVRKDECGYNRICLNNQPIFQYGPLDQGWWPDGLLTPPSEEAMLWDMVQL KKMGFNMIRKHIKVEPEQYYYYADSLGLMMWQDMVSGFATSRKKEEHVNPLATTDWNAPE EHTRQWQKEMFEMIDRLRFYPCITTWVVFNEGWGQHNTVEIVNKVIKYDDTRLINGVTGW TDRGVGDMYDVHNYPVTSMILPENNGNRISVLGEFGGYGWAIKEHIWNPNMRNWGYKNID GAMALIDSYGRLLYDLETLIAQGLSAAVYTQTTDVEGEVNGLITYDRKVTKIPEGLLHLM HNRLYEITPVKAVTLIADGQNGSKNTRLVGVNGQELKMTSLPFECPPRSTIVSEATFNVD KDFNHLSLWLNVAGEAKVWLNGVEVFVQEAKQTRQYNQYNISDYSRYLRKGNNLLKIEVK DSKKMRFDYGLRAY >gi|225935350|gb|ACGA01000042.1| GENE 53 88980 - 89441 415 153 aa, chain + ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 146 76 40.0 2e-13 MSEWISVEEAATKYGIEKEYIQLWADMQVIMSYFKDYRTVVNDQSLRGFLKIREKGISPE YVKVLEQLCISKSEVCSAYAFLLGARDKELEMYREAKSQRDALRGMWIELNERTRDLEIE LELGRSGCYKCPLKKLCIGIKRIKLNWATKMQK >gi|225935350|gb|ACGA01000042.1| GENE 54 89687 - 91414 1521 575 aa, chain + ## HITS:1 COG:no KEGG:BT_4293 NR:ns ## KEGG: BT_4293 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 575 2 576 581 1013 84.0 0 MEEVKRIPYGVSNFVEVVEQNQYYVDKTMYLPLLEKQPSNLFFIRPRRFGKSIFLSMLRT YYDIAQKEKFEKRFSNLWIGSHPTQLQGTFQILFLDFSRVGGLDGTLTQNFDDYCCGGLD DFASIYESYYYPGFEEEMKAQCGTTNKLNFLDRKARNNGSKLYLIVDEYDNFTNVVLNEQ GDRVYHALTHASGFYREIFKKFKGMFERIFMTGVSPVTLDDLTSGFNIGWNISIDHQFNM 
MLGFSETDVREMLQYYKDAGQLPGNTNIDAMIEEMRPWYDNYCFAEESLERDPKMFNCDM VFYYLRHYMTLGKSPKEMIDPNTRTDYNKMKKLIRLDKLDGNRKGVLRKITEDGQIVTTL TTTFPATDITKPEIFPSLLFYYGMLTITATRGNYLVLSIPNNNVRKQYYEFLLEEYQDNR HINLNDLGLMYYEMAYNGHWRETLEFIAHAYKENSSVRSAIEGERNLQGFFTAYLSTNAY YLIAPEVELNHGYCDLFLMPDLIRYDVKHSYIIELKYLSAKDSGAKAEAQWKEAVEQIKG YAAGPKVRRMIYGTELHGIVMQFRGWELERMEEVI >gi|225935350|gb|ACGA01000042.1| GENE 55 91848 - 92981 778 377 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 8 376 7 386 390 89 24.0 1e-17 MAIVNNPFIVGGYLSPHYFCDREVETEQLIRNITNGRNVVIISVRRMGKTGLIRHCFYQD EIKEHYYTFFIDIYATASLREFVFALGKEIFEKLKPKGMKFIERFFSVISSLRAGFKLDS VTGEPTFDIGLGDIHTAETTLDEIFAYLEQADKPCIVAIDEFQQIGNYTEKNVEAILRTK VQHCQNARFIFAGSQKHIMMNMFNSPARPFYQSVNMMQLKSIPLTAYRPFVERLFLENEK RIEEELIDEVYNFFEGHTWYVQLMFNELYILTGKGEVCDRSLQSIALTNILQMQDFTYQE IFSRLPEKQKEVLIAIGKEQKATGVTSGKFIKKYKLSTPSSVQAALKGLLEKNLVSQEQN HYEIADKLLGAWLQKNY >gi|225935350|gb|ACGA01000042.1| GENE 56 93153 - 94802 1529 549 aa, chain + ## HITS:1 COG:no KEGG:BT_3091 NR:ns ## KEGG: BT_3091 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 549 1 550 550 887 90.0 0 MKKVILIFVTIVLSGLLYAKDNKSTDALLREIDGIIKNRQTYGAEKEARIADLKKLLVEA ASDEQRYGFCGRLFDEYRAYNLDSSFVYAQRKEELAHRLNKLDYLDDSAMNMAEVMGTTG MYKEALELLGKIDKKTLPDYLYGYYYHLYRTIYGLMGDYAVTEKAKKEYYRMTDLYRDSL LQINASDSLGHALVMADKYIVHARYDEAIDMLMKYYSKPSLDDHAQAMITYTISEGYRLK GDKQGQKHYLALSAIADLKSAVKEYVSLRKLASLVYEAGDIDRAYNYLKCSLEDATLCNA RLRTLEISQVFPIIDQAYQLKTKRQQQEMKVSLICISLLSVFLLVAIFFVYKQMKKVAAA RREVVDTNTLLQELNEELHDSNSQLKEMNHTLSEANYIKEEYIGRYMDQCSTYLDKMDLY RRSLNKIAAAGRVEELYKAIKSSQFLDEELKEFYANFDVTFLQLFPSFVEEFNALLTEPM QPKPGEQLNTELRIFALIRLGITDSTKIAQFLRYSVTTIYNYRTRVRNKAVGERDEFEAK VMQIGKVEE >gi|225935350|gb|ACGA01000042.1| GENE 57 95008 - 98013 3017 1001 aa, chain + ## HITS:1 COG:no KEGG:BT_3090 
NR:ns ## KEGG: BT_3090 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1001 1 999 999 1820 94.0 0 MYMKQSMKSKGFERRLLLIMWGLFLSLSAFAQQISIKGHVVDATGEPVIGASVVEKGTTN GTITDVDGNFSLSVPANSTLTISFVGYKMQTVPVNGKNSLKVTLQEDTEVLDEVVVVGYG TMKKSDLTGAVSSVGVKDIKDSPVANIGQAMQGKVSGVQIIDAGKPGDNVTIKIRGLGTI NNSNPLIVIDGIPTDLGLSALNMADVERVDVLKDASATAIYGSRGANGVVMITSKRGAEG AGKVTVNANWAIQNATKVPDMLNAAQYAALSNDMLSNNDDNTNPYWADPSLLGTGTNWLD EMLRTGVKQSYSVSYSGGTEKAHYYVSGGFLDQSGIVESVNYRRFNFQANSDAQVNNWLK FTTNLTFSTDVKEGGTYSIGDAMKALPTQPVKNEDGSWSGPGQEAQWYGSIRNPIGTLHM MTNETKGYNFLANITGEITFTKWLKLKSTFGYDAKFWFVDNFTPAYNWKPNPVEESSRYK SDNKSFTYLWDNYFVFDHTFAQKHRVGVMAGSSAQWNNYDYLNAQKNIFMFDNIHEMDNG EKMYSLGGSQSDWALLSLMARLNYSYEDKYLLTATVRRDGSSRFGKNNRWGTFPSVSLAW RISQENWFPKDNFPVNDLKLRIGYGVTGNQEIGNYGFVASYNTGVYPFGNNNSTALVSTT LSNPNIHWEEVRQTNFGVDMSLFNSRVNLSLDAYIKKTADMLVKASIPITSGFEDTTETF TNAGKMRNKGVEMTLRTINLKGLFSWESALTATYNKNEIQDLNSETPMFINQMGNSYVTM LRAGYPINVFYGYVTDGLFQNWDEVNRHATQPGAAPGDIRFRDLNNDGVINDEDRTVIGN PNPNWFFSLSNNFSYKGWELSVFLQGVSGNKIYNANNVDNEGMAAAYNQTTAVLNRWTGE GTSNSMPRAIWGDPNQNCRVSDRFVENGSYLRLKNITLSYTLPKKWMQKIQLENARISFS CENVATITGYSGFDPEVDVNGIDSSRYPISRTFSMGLNFNF >gi|225935350|gb|ACGA01000042.1| GENE 58 98029 - 99519 1450 496 aa, chain + ## HITS:1 COG:no KEGG:BT_3089 NR:ns ## KEGG: BT_3089 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 1 496 496 954 90.0 0 MKKITTLVMMFAILALTSCNDFLDKYPKYGVDPESEVTNEIAVALTTACYKTLQSSNMYN QRIWSLDIIAGNSEVGAGGGTDGLETVQASNFIAQSDNGFALYVWRSPWVGIGRCNIVLS NLPSAAVSDEVKTRCMGEAYFLRAHYYYILVRLYGGVPLRLTPFEPGESTDIARNTVDEV YAQIFSDCKNAVNMLPPKSSYGDSDRGRACKEAAMAMLADIYLTLAPNHREYYNEVVTLC DNIAAMGYDLTQYNYADNFNATINNGPESLFEVQYSGSTEYDFWGGDNQSSWLSTFMGPR NSSLVAGAYGWNLPTEEFFKQYEDGDLRKDVTVLYQGCPAFDGMEYRRSWSNTGYNVRKF LVSKTVSPEYNTNPNNFVVYRYADVLLKKAEALNEQGHPDQAAEPLNIVRKRAGLADVPT TLTQTEMREKIIHERRMELAFEGHRWFDMIRVDNGNYALTFLKSIGKNNVTKERLLLPIP QTEMDSNHLMTQNPGY >gi|225935350|gb|ACGA01000042.1| GENE 59 99556 - 101070 1611 
504 aa, chain + ## HITS:1 COG:no KEGG:BT_3088 NR:ns ## KEGG: BT_3088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 504 1 504 504 837 82.0 0 MKKYIYQLLCSLFIGGTMVACAEDYMETNKGHDTLTLSVNQQDLVLNEKNHSQEALALSW TTGTNYGSGNRISYTLELAKAGTDFANAYSVDLGTGTYQWTKKVEELNQFLNTQFGIGYA VKAELEARITAVVAGMEEQKQSATTAFTVTTYQPVTTTLYLIGDAAPNGWAADNATAMER TDNGQFTWTGKLNAGSLKFITTLGQFLPSYNRDATAVEALKLTYRTSGDEPDEPFVIDEG ATYIVKVDLLNLTLTLTETEDIGWRFDEFFIVGSFTGNGGWEFEALSKDAVQMDLFHYGA VIPWKADGEFKFSSVADFGQSDAFFHPAVANAPYTSTSVVLGGDDNKWQMKEAECGKPYK VWFYTGKGKEKMLMRPFTPYAGLYLVGEATPNGWSLDNATPMTKSVDSPYIFTWSGMLKA GEMKISCDKQTDWNGDWFMADKDGKTPTGEVETALFVTKSDADLSSMYPDADLGGLDYKW NIQEAGSYQITIDQLKETISIVKQ >gi|225935350|gb|ACGA01000042.1| GENE 60 101085 - 102863 1774 592 aa, chain + ## HITS:1 COG:no KEGG:BT_3087 NR:ns ## KEGG: BT_3087 # Name: not_defined # Def: cycloisomaltooligosaccharide glucanotransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 592 1 592 592 1095 88.0 0 MKKIIYLVAAFLCMACSDDNESNQQNGGASGGVTEVTPVTSDLSVDLSTDKAFYKPGEKV VFTADDALPAGTKVRYRLSGETVGEESVNGTSWTWQPPTTDFKGYMAELYRQENGTDVIV GTIAVDVSSDPARFPRYGFVADFSQEKTAEKTQEEMEYLNRHHINWVQFQDWHNKHHWPL GGTRTQLDEVYMDIANREVYTSSVKNYIEAQHRFGMKSMFYNLCFGALKDAATDGVKEEW YLFKDASHTTKDSHDLPGGWKSNICLVDPSNKEWQKYLGERNDDVYANFAFDGYQIDQLG RRSTLYNYSGIPVNLREGYASFIDAMKQVHPDKSLVMNAVSRYGARQIGETDKVDFFYNE VWADEADFTNLKAILYENGVYGDYQLNTVFAAYMNYNKADNRGEFNTPGILLTDAVMFAL GGSHLELGGDHMLCKEYFPNENLTMSEELKTAMVHYYDFLTSYQNLLRDGGTENSITMNC TNGEMRLNVWPPQQGSVTTYAKQVGSKQVIHLLNFSQANSLSWRDVDGTMPEPTLITKAA LQMNLPAKVNKLWVASPDVHGGALQELAFTQENGVVSFTLPSLKYWTMIVAE >gi|225935350|gb|ACGA01000042.1| GENE 61 102925 - 105432 2160 835 aa, chain + ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 17 736 13 726 779 424 34.0 1e-118 MKIRYRLWVWVLCLMFPTWVWAENAIYTCTSFTQQGRQVTFHLTDSAALQLQLCSPSVVK VWFSPDGQLQRKNASFAVINEELEDVGTVHVDEQAACYEIFTPKLRIRVNKSPLSLQIFD KYQKLLFSDYADKGHVSEGTKKTEYKVLRRDEHFFGLGEKTGKMDRRGESYKMWNSDKPC YSIVEDPLYKSIPFFMSSYRYGIFLDNTYKTEFKFGTESRDYYSFEAPNGEMIYYFIFGK DYKEIIGQYVGLTGKPIMPPKWALGFAQCRGLLTSEKLSREIAEGYRKRGIPCDIIYQDI GWTEYLQDFEWRKGNYENPKKMLSDLKGMGFKVVVSQDPVIAQANKRQWEEADRLGYFVK DSTNGKSYDMPWPWGGNCGVMDFTLPAVADWWGTYQQKPIDDGISGFWTDMGEPAWSNEE QTERLVMKHYLGMHDEIHNVYGLTWDKVVKEQFEKRNPDRRVFQMTRAAYAGLQRYTFGW TGDSGNGDDVLQGWGQLANQIPVMLSAGLGLIPFSSCDITGYCGDIEDYPAMAELYTRWI QFGAFNPLSRIHHEGDNPVEPWLFGPEAEKNAKAVIELKYRLLPYIYTYAREAYDTGLPI MRPLFLEYPMDMETFSTDAQFMFGRELLVAPVVKKGARTKNVYLPEGTWIDYNNKKTVYT GEQWTTVDAPLSSVPMFVKQGSIIPTMPVMNYTHEKPVYPLTFEVFPAQEGAQATFTLYE DEGEDLRYQRDEFVKTPIVCNTLANGYELTVSVREGKGYTVPGPRNLLFRIYSAKAPKEV TAKGKEIKKVKPERLEENLENDTESVLWSWDKETGVCSVRIPDKGIDEQIMIAFK >gi|225935350|gb|ACGA01000042.1| GENE 62 105479 - 109540 2203 1353 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 812 1044 9 236 294 148 36.0 8e-35 MPEHNRLVAIIFMLVSCVSSIAGFETINFQHLSLNSMFPQITINGLYQDESGILWIGTKD GVKKYNGNYIESVNFMGINNWIQSNLVPTICGDKKGHLYVNTDYSIIEYDLIKEESRIIF SQPNTQILPSIAFNYGTNSLWIGLLDSVYNYKEGICQAKYKIDGNNLSISSLKETSGKLL YIGTKHDGVFTIDPSGRQQKVLTTRSEIISINEDSRKNIWVSTLNEGLFKLTPSGDVIQY TEPTLASNYVRTVCEDNNGNIWIGTMLGLNVINSQTGDISFYGLEKEGSAGLSNLSVWII MKDDQGTMWFGTYYGGLDYYNPHTDIFEYNNLKLGHAGIGPVISKIIEDKTGRIWISTEG DGLVSYLPETDAYTYYTKENNTISHNNIKTLYYDDTSSTIWIGTHLGGLCSYSIDRKEFK HFTIDPSDHTKRSEIVQAIIRHKNILYLGTLSGVYYMNLEDGSIQKIHLLDKYIYAVNSL MIDKQDNLWIGGNNLCYYSIKQNYVRSLDKYLAKITASSKNTITSLIQDNSNRIIATTLG AGILIYHPTQDRLEQINSKNSNLDSDYLSAVYPLSDHQLLVTSANGLSHIDLKTKRSYNY KSQNKFVLPSMIPGDIMKSSDNRIIMGGINGLAMISEKNLFPESLSTKMFFSKLFVNNKE VSANDSSRILPRSLFATDCIRLKHHENNISVEIGTNDFVNLGQSRYQYKLDGYNNSWIEF HPQKSINYMNLPYGDYKLKVRNISFKDEAELNEISLGITIIPPVYATWYAYLLYTLLILA IIISIAYFYRSKLLLRNSLELERRDKLQKDAINESKMRFLENISHELKTPIALISGQLEL ILMSNYSLATIQNSLREVHNRAAKMGALINELLDFLKYNKENFTLRIKQQDIVLFTQEIY DSFVSYAELKNIHFNFVFDQKSQYTWFDEVQLQKVFNNILSNAFKFTPENGTVEIKITTS ASSVIITFNDSGIGIPKDMTERIFERFFQVNNSVNRELSNTGTGIGLSLSLNIVKAHHGQ ITVESEVGKGSSFIVELPIGNEYFKEDKNITIIENQRNTGIINSPISIENEEESLENLED FIIQQKKDFDHSSTLLIVEDDDELRKLLIEIFEPIFEIYEAKNGEDAFKIAQLNSPDLIL SDVMMPGISGINLCARIKSHFDTSHIPVILLTALSSVEHNIKGLNCGADDYITKPFNIRI LIAKCINLLNNRRKLQERYKTLENNSAEQLTNNKLDQDFIDHIIRIVQTKLEEGNGDINV TLLCSELGLSRTKLFLKMKKITGESPHYFIQNIKLKTAAKMLRENEEYNISDISFQLGFS SLNYFGKSFKEYFGMSPTAYRKFHQEQKENHSI >gi|225935350|gb|ACGA01000042.1| GENE 63 109529 - 109846 112 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172853|ref|ZP_05759265.1| ## NR: gi|260172853|ref|ZP_05759265.1| hypothetical protein BacD2_13373 [Bacteroides sp. 
D2] # 1 105 1 105 105 162 100.0 7e-39 MFRHILKALIKFLLFSELPIFSFGISLRDSYNGGLLLYILYLIAGSIITNVSLFFEIMKQ KTVCNRDMNKIIEPLELFYPMLLPVISYFCSCYQEKNLWMDLLNF >gi|225935350|gb|ACGA01000042.1| GENE 64 109875 - 111074 734 399 aa, chain + ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 399 11 399 402 431 55.0 1e-119 MKNYVAICLLSLMVSGCTTTLEPSMEHMVDYALNRAVEQYKHMYSVMDSIPEMLPRSIDK NGALATSDSYFWTSGFYPGTLWNLYEYTGDEQLKEMASKMSDRVEIQKYNKDNHDVGFMI NCSFGNKYRLLGDTSCMDIVTTTARSLSTRYRSAVGCTRSWSHPGTLHWQFPVIIDNMMN LELLCRATEFSGDKRFYDMAISHADKTMENHFRPDYSCYHVVNYDTISGNALQKCTWQGY SDESVWSRGQSWALYGYTMMYRETKQPRYLEQAKHIAGFLLNHPKMPQDKIPYWDFDDPK VPDALRDASAAAIMASALLELSGYVDKVTGQGYIAVAETQLRTLASDEYLAKPGENCNFI LKHSVGFLPQNSEVDVPLTYADYYFVEALMRYRAMHLKE >gi|225935350|gb|ACGA01000042.1| GENE 65 111100 - 112350 936 416 aa, chain + ## HITS:1 COG:no KEGG:Dd586_1768 NR:ns ## KEGG: Dd586_1768 # Name: not_defined # Def: exopolysaccharide inner membrane protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 34 242 46 254 410 72 29.0 3e-11 MKNKLMKVSLLLFLMVLIAGKSLSQNQSVRIKAGHPRLILSGTDIELMRGNALSDIEPWK TAWKKLKGEIDGYADKKWKPNVYRGDASMSFYKAAIRDGSAARDLAIGYQITKDKRYAHK AIEIINEWSSPKNAPGTYFDPDKFYPNTGMLVSRGVFAFLYAYDLLCADNLIEKSKQIQF EAWLRILLPHIEEGVKRWVENDYFGKQYFQNHIVAEVVGLMSIGIILRDNELVNYVYDGE TNPHNIKKVIEGIILMKGQPPYCGEPGSWPTQDGEIMDRYRHFALTHYGQTTKPNRALQY AGLSTNLLMIAAEMGRLNGLDLHHYVAPTGESIKLPLLFYADFYITKDASIKGGFYTGED SWINYNDQSVFTLWEVGHARYPEEKVFNEVLRTNDRTAHNLHLLGPVVLTHGRCIE >gi|225935350|gb|ACGA01000042.1| GENE 66 112375 - 115467 2297 1030 aa, chain + ## HITS:1 COG:no KEGG:Phep_3172 NR:ns ## KEGG: Phep_3172 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 35 1030 134 1138 1138 926 48.0 0 MKQRNISKGIYSLYVIMFMLMIPIGDLWAQDPLNISGKVVDVNNEPLIGATVQVKGKQTG TITNIEGRFSVNVQAKDILVISYMGYASQEMIASKANGATITLVEDGKTLEEVVVVGYAT QKKVNLTGSVSTVNLTEQAESRPMTSLSTGLAGLSSGLYVNQGSARPNNDGATLLIRGQG 
TLNNSSPLIIIDGAEGNINEVNPQDVESVSVLKDASSSAIYGSRAANGVILITTKKGSEG KASISYNGYVSFQKPSNTIETVYNYADYMEYYNEAAYNVDPTAMPVYSERKIAEWRVHPN EPYLYPNTKWEDEVFSTGVSTNHNLSFSVGNQKLKAFGSVAYLNNPGIVENSAYERFTAR LNVNAELKPWITMGMNISGLKSNAEMGSKYMSSLFSNISSPGIVYRHPDGRYGSAENPEE NQQVQSPLYHLNMYQGDIEVNNFTSRFWTNIKVFKGLNIEASYTYRRNNSQEEETPVYAD RWNFQSNTITQMAAGRTFVRNKNWTNTHHVADVVARYQKTFFDKLDVGILAGASQEKDNV KWFEAKRYDLLADNLSVINGATGDSETSGAATDWVMRSYFGRINLSWADKYLFEGNIRRD GSSRFHKDHRWGTFPSFSLGWRMSEENFMKSIKWLDMLKLRLSWGALGNNAVDNYEYQSV YNKDNYVLGNSVVSGLAQIQLANAMVSWETTYVTNIGLDFSLFGSKLDGTIELFNKDTHD ILIDLPAPLLVGNASIPTQNAARVRNRGVEMNLKWNHRIGKVNYFVGGNFTFIDNEVTRF KGEERSISGTNMIQEGYPINIQYVLAVDRILQTDEDMLFVEKMIKDAPLDPTTGQRVNPF ASYGTPKKGDFLYKDLNGDGVIDDNDKYAVGHGLAPRLTYGFSLGAEWNGFDFSCLFQGN AGLQVYWQDKFYMGYINYGDVINKEIAEGAWREGRTNATYPRLLTGLNTLNAQPSDFWVQ NKSYLRLKNVQIGYKIPKKLLQKLDISQLRLYTSLENMLTFTSYKGIDPEISGTTYPTMK QITVGVNVVF >gi|225935350|gb|ACGA01000042.1| GENE 67 115479 - 117200 1565 573 aa, chain + ## HITS:1 COG:no KEGG:Phep_3171 NR:ns ## KEGG: Phep_3171 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 2 573 3 579 579 473 45.0 1e-132 MKTVNYIMVAVAVTLGFSSCYDLDVAPYDKVAQNNYWKTEADAKSGVMGVYAQLKDYGAY GYMPLFDTYSDIGHGPGGPVEQGTYNGAYDFLVQNWRDTYDGVQRANTVIKNVSGMSIDD QVKNNVLGEAHFLRALYYFHLADFFGGVPIYDESWEVSESFNEMLLPRNSREEVWNFIIK DLTFAIANLPLKWAVSDYGRATKGAAYALRGKAYLYTKNWSEAIADFEEIVYNKTNQYGY QLYPDYLTLFTSAGPVPNDNETVFAIQNKGTTDNLYGMPLCTLYGTRGSYGGGRATCMPS VTLADMYEEKDGQKFNWNNYIPGYNESDAVKKKAFQATLNSTKQKLETIPDTTLLGEIYR GRDPRMMQSLIVPYSYYNGYIVGTGAKKQLYAIAAGTTVANGFIQNDRGWNVYFYRKFVP IEDMGGAITNRNHTPINFPIIRFADVLLMLAEAYNEDNQLDEAVAELNKVRKRQSTNMPA LNSGPAWLQVSDKEDMFERIMHERAVELVGEGHRFSDLRRWGVAAQYLNNRQEKDFTGEI RFTRKFTERDYLWPIPSEEIQRNPALKPNNPGW >gi|225935350|gb|ACGA01000042.1| GENE 68 117247 - 117558 339 103 aa, chain + ## HITS:1 COG:mll5702 KEGG:ns NR:ns ## COG: mll5702 COG3254 # Protein_GI_number: 13474745 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 2 
102 3 104 105 108 52.0 3e-24 MKREAFKMYLKPGFEKEYQKRHSEIWPELVKLLKSNGVSDYSIFWDKETNVLYAVQKNDG AGSQDMGDNEIVKKWWDYMADIMAVNPDNSPVSIPLEEVFYME >gi|225935350|gb|ACGA01000042.1| GENE 69 117586 - 119070 945 494 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172859|ref|ZP_05759271.1| ## NR: gi|260172859|ref|ZP_05759271.1| hypothetical protein BacD2_13403 [Bacteroides sp. D2] # 1 494 1 494 494 953 100.0 0 MKLEYKHSIILTLLLLTQMLMACNDNAKNTEIALVSDDAVSIGYGGGEELIKFICYDNWT ISSDVSWITFGGLTEGSGNAIIKIHIEKNTSGGDRTGKLSITCGGNIKIIEIRQSIKTID IEHKHPSILYTKEELLNIKQMVEGNSSASITTTYNNLMKRCNNALTYTATPYTGQDPTKF IEESYVPGSNSRDLALAYWFTGDKKYARKSIEIIEAWAKACKDISYVADAGSAMYLTRGM YPMVCAYDMLISENIMSDETKKNITDWFQVLYKEGMISINLWEDNDYFNKQYYQNHLVAH SMGILMLGLVTDDDELVQFAIDSPANPRDVKELLSGCILMDGDTPCSREKAGSAPPVKGE IYDRYRHDTGPLKGLQYTHLTLTLLSTTARMCYNNGLDLFAYTAPTGENLRYCFEYYSDF YRSMDSCIKSGYYCGETERMTKAGDNPGMYEMGLRYYPDSEPIRQLINSGTFNRESSYMD LLGYTRLLSAEINE >gi|225935350|gb|ACGA01000042.1| GENE 70 119088 - 120329 802 413 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293368591|ref|ZP_06615199.1| ## NR: gi|293368591|ref|ZP_06615199.1| putative lipoprotein [Bacteroides ovatus SD CMC 3f] # 1 413 1 413 413 792 96.0 0 MRKYISDHWIKSFMIMIGIGLFGVSAGVLQSCDDDDEIQNKPTLYIITEDIEALSTYGGT IPIEFLCNLDWQVSTDAAWITLDAKAGSQSATINAKVTKNDEGVDRMGVIKIVAGELEKR LNIHQKYRDTSTPQLSITTKSPIKLDFEGGRNSISFVCNVGWEASTDVEWISFTSETAGL EDGTLNFVVEKNGGDVIREGKVTITAQGITKSVDIVQQTEAMGGVNLLDTPEEDYSFERI TTNKYWPALGEWGGSDSQGIGKFGASATAVRVAANSPISHTGSSYLFVRMRENETANSLD WLWRKVKGLTPGKSYTFSFWFKTPSAAEMPQPGNIRLGAVINETDIPTLSNPLAPEVTFG FVAASDGVLSSSDEHKKVSYTFTMPADKTDVYIVWRRNGNQQPFLDDMSLVMN >gi|225935350|gb|ACGA01000042.1| GENE 71 120482 - 121834 842 450 aa, chain + ## HITS:1 COG:no KEGG:BT_3171 NR:ns ## KEGG: BT_3171 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 450 28 477 477 717 73.0 0 MGDHMILQQNSSVKLWGWADNKKVTVTTSWNNQTYQVLTDKNGAWLVKVNTPGASYTPYF ITISDGEEVILSDILIGEVWICSGQSNMDMRMMGNTGQPIDRSLETILHAGNYRNRIRFI 
AVSRTKDVQQRTDFEGRKWEVSAPEAVMTCSAVAYFFAKQVTEVLDIPVGLVISSWGGSR IESWMNEKTLASIDGVDIEAVRSSKLKMHHRLECMYDTMLWPVRNFTARGFLWYQGESNI FNYYCYAPMMTAMVQLWREVWEAPDMPFYYVQIAPHKYKDSQDTDAALLREAQTKALEAI PNSGMVSTTDIGDEFCIHPPQKDIVGLRLATLALTKTYNICRLPPTGPTMTKVDYLEGKA IVTFDNASAGLTPAFCNLEGFEIAGADKKFYPAQAQIVDRTPTVRVWSEQVNQPIAVRYA FRNYVGNVLLRNTFGLAAFPFRTDAWDDVK >gi|225935350|gb|ACGA01000042.1| GENE 72 122023 - 122895 717 290 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 21 287 2 256 257 145 32.0 6e-35 MRKLLIVLTLCGVSILNAQQLNVASYNVRNSNPNDAKAGNGWEQRCPVLTQLITFHDFDI FGAQEVKHNQLEDMLNALPAYNYIGVGRDDGKTKGEYAPIFYRKDKFKLLKSGNFWLSED TTKPNKGWDAAYTRICTWGEFKDKTGKFKFWFFNLHMDHIGVVARRESAKLVISKIKEMC GKDPVILTGDFNVDQTSESYQVLHESGILSDSYEVAQMRYATNGTCTGWNPNTYTPNRID HIFVTPNFTVEKYGVLTDTYRTKNETTGKYESHIPSDHFPVKAVLRFNRK >gi|225935350|gb|ACGA01000042.1| GENE 73 122917 - 126840 2403 1307 aa, chain - ## HITS:1 COG:lin2643 KEGG:ns NR:ns ## COG: lin2643 COG0642 # Protein_GI_number: 16801705 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 781 1019 350 588 591 123 33.0 2e-27 MPVLLHSQNVVKQISNADGLSNNSVNCFLEDSEHTLWVGTWDGLNAYNGRSFKTYSYNKK NAGSISNNVIWQIIEQNDSVLWVSTDYGVNRWKRSTQQFTPYYLGTQNNPPKQEKSFLLG ITSGKYIICYVKEQGLFYFDDRKQDFVPLKNNLPDDIKNFVIDSKDQVFFLTGHGQLLHY QLSVHSSNPELSFKKEIKQPASISGIYLSQDYLIINDDRALTVSLDNRILNSIDIPENKT VSQVICHKEYLLISFIEGGCIRYNLEDNASMELPQLPAKASIFTIYIGSQNILWVGTDGQ GVLEVYEHSSPFHTVKTDYPVRCFCEEDNGNILVGTKGEGILLLDKQERQVEPYLSIGNG LISNSVYTIRKNMSGDIFIGTEGTGINYIPLNSSQVKKLSIPAEYPTFKAVYSILFTHND SLLWLGTSGYGLIKLTLQREGKSYKVTEMKQYKSPGPSSPSNNIIYSVIAGYNENELWLG TRGGGINKFDIASECFQQIHEIDSTLSLTNNDILYLTKGDSASIWIGTSYGLNRLFPADI PPSIMEYTDHNGLPNNTIHGILKDENGNIWASTNQGISFINLSSGKITNYSSRNGLQNDE FSDGAIFKDKAGWLYFGGVSGLNYFDENKIRLRDHIATLSLNSLKINNTSQNIYERIFNH 
TLRLDYDEPYITLGFTAHDFINNENCEFSYRIIDFADEWIYNENNPNIVITKLPPGKYKL EVKCTNGDRVWSNQIYSLHLDVAYPWWLSTTSFIIYFILIAIAIYITQSVIKNRIRLNRQ ILLEHIEKQNQQRIHESKLNFFTNVAHEFFTPLTLIYGPAQHLLEKADLDSYTKRYIYII KNNADRMQKLINELMEFRKAESGHTAIYAEKVDIQLLVDYVSDNYTEIAEENKIDFSFKS KEVSSFTTDRNALEKIIFNLLSNAFKYTPSGGYIRAEIRQNATTGTLHFRIRNSGKGLTE KQTSEIFSRFKIFESSNLKHAGSTGVGLNLTKSLTELLGGEITIESTLGEYVEFNVSLPP MHMNSEKESQPTEEETEISEMLFIPKQKEITILIVEDEKNIRELLKDILLPYYQVREAAD GEEALKEVEQKQPDIIISDVLMPKLDGITLTDILKSNERTAHIPVIHISAKNSIEDQINA YNHGTDLYIPKPFHPRHVLSAVENMINKYSLMKEYFKSGRSSLIVRDGITMHKEDELLLN KIIKFIEDNIDDESMNPDSLADFIGVSKAGLYRKLKELTEKTPSEFVRTIRLEYAASLLK TTKLTVTEIMYKSGFSNKSYFYREFAKLYNTSPKEYRSEQTEEKDTK >gi|225935350|gb|ACGA01000042.1| GENE 74 127058 - 128104 742 348 aa, chain + ## HITS:1 COG:lin0857 KEGG:ns NR:ns ## COG: lin0857 COG2152 # Protein_GI_number: 16799931 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Listeria innocua # 6 347 5 351 355 308 45.0 7e-84 MDIAKRFHQNPLLKPSDLQPGIEGMEITCFLNPGVFRFDGKIWLLLRVAERPVQKQGVIS FPVYNKDGKIEVLSFDENDPKLDASDPRVIGYAGQNYLTTMSYLRLVSSEDGIHFKEEPD YPPIFGKGALEAFGIEDCRVATTADGYYLTFTEVSSVAVGVGLIHTYDWKNYTRYGMIFP PHNKDCALFEEKVNGKYLALHRPSSPELGGNYIWLAESLDRLHWGNHCCIATTRPDSWDC ARVGAGAAPICTEEGWLEIYHGADYQNRYCLGALLLDLNDPSKVIARSKEPIMEPVAPYE QTGFFGNVVFTNGHLVEGDKIRLYYGASDEVICGAELSIAEILRSLKS >gi|225935350|gb|ACGA01000042.1| GENE 75 128122 - 129369 1106 415 aa, chain + ## HITS:1 COG:BS_yqgE KEGG:ns NR:ns ## COG: BS_yqgE COG0477 # Protein_GI_number: 16079556 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 21 407 20 400 430 73 20.0 6e-13 MNKLKREYQFFKQQTSNVRVLLMTNLLYALVLPVVEIFVGAYVMRHTSEPVSVAFYQLFM YIGIITTSFVNGFLLRHVSVKMLYAGGILVSGLSMFAMMLVKSLGFVELGIAGFVLGAAS GFFWTNRYLLTLNNTTDDSRNYFFGLESFFFSITSITVPLLIGAFISQIDGREIMGCLID 
INGAYRLVTVGVIIVTLFAVAVLWRGKFANPVQKNFLYFRFCTLWKKMLLLASLKGMVQG FLVTAPAILVLKLVGDEGVLGLIQGISGTLTAVLVYVLGRITKPEDRSKVFIAGLLVFFI GTLFNGILFSATGVIIFVLCKVIFQPLFDLPYYPIMMQTIDAVVKIEKRNEYTYILSHEF GLFLGRAFGLILFMVLAFVISQDFALKYALILVGALQLIAYPLARNIIRQNQAIR >gi|225935350|gb|ACGA01000042.1| GENE 76 129536 - 130696 857 386 aa, chain + ## HITS:1 COG:PAB1622 KEGG:ns NR:ns ## COG: PAB1622 COG2152 # Protein_GI_number: 14521331 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus abyssi # 74 378 22 290 305 149 35.0 6e-36 MIKLKAVIYLISMMTLAGCSGQKQTSAVVEEQNEHWIIGPFVRPEGVNPVISPQPTTFQC PMRKQLVKWEESDTFNPAATVKDGKIVVLYRAEDNSAQGIGKRTSRIGYAESTDGVTMKQ SDAPVLFPSEDDYKEIEWEGGCEDPRVAMTEDGLYVMLYTAWNRHLPRLAVATSTDLKNW TKHGLAFAKAYNGRFANIASKSASIVTGVKDGKLVIEKVNGKYFMYWGENAVCAATSDNL TDWTPVLNENNELREIAKPRSGYFDSRLTECGPPAIKTTDGIVLLYNGKNGYKEERDSEY PAGAYCAGQFLFDADDPYQVLGRLDKPFFVPEAAFEKSGQYKDGTVFIEGLAYFKNKLYL YYGCADSQVAVAICDNNMDLKLKTHN >gi|225935350|gb|ACGA01000042.1| GENE 77 130705 - 133872 3313 1055 aa, chain + ## HITS:1 COG:no KEGG:BDI_3133 NR:ns ## KEGG: BDI_3133 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 47 1055 104 1117 1117 818 43.0 0 MKHGCKMNTVVSKSFNDRLKIATVSALMAGTFLCLPFTVYAENPLISDVEQTVRNISVKG VVVDANGEPVIGASVQLKGSTGIGTITDLDGKFTLSVPANGVLQISYIGYKTAEVKVNGQ AGLKVTLQEDTETLDEVVVVGYGIQKKASVTGSVAAISSDKLMEVKAPSVTNMLAGRLPG LRAVQRSGSPGDDGASVDIRGYGSMLVIVDGIERDYAQLDPNDIESISILKDAAAAVYGF KGSNGVLLVTTKKGTEQKAKIEYNGYVGFQKVTRYPEMMNAYEYASLYNEAIHNANPWRG ASAYSQEQLEAYRNGTAGTDWWNETMRSSAPQTSHNLSMTGGTEKVKYYMSIGYMDQGGI IRSGDWNYQRYNVRSNLSVEVAKGVNVELRLSGRFDNRKKPYNGDNLFRSAQMAIPTYSM YANDNPDYWGAVGDMANPVHVSSSDDSGYEDRLRREFNSSLAITWKLPWVKGLMAKALVA YDYTNKEWKTWRKDLSEYTYDYANEEYIEKVINTAHLESKLENYDKPTYQFSMNYNNTFA KKHNIGAMLVWEMYNDKRSWVTGTRDFAIGLIPDLDYGDKTNQEAFGKTQETAHAGLVGR LNYDFSNRYLVEFNFRYDGTYKFREGNRWGFFPGVSLGWRVSEEAFFKKLLPDMDNLKIR ASYAKVGDEGDFDAFQYLDGYTSHGSYIMGSNGVTSGMTTVGMANPWLTWYESKIMNIGF 
EASYHRGLISVEFDWFRRNRSGLPATRVGSLPTIFGESMPQENLNSDINTGFEIVVGHKN RIGNFNYNVSANFSTTRIKYDYVERAASTNMYDDWRNNTNGRYKDIRWGKKVIGQFSSFE EILNSPVQDKDGNRSLMPGDLKFEDYNGDGIIDDNDTQPLGHGATPRMYYGLNMSGEYKG FDLTVFFQGAAGHDIYVSGDILDPFIQQGLGNGLAIMTDRWHREDPTDPYSKWISGYMPA ARVAGVADNRSGNSWSLHNASYLRLKTLELGYTLPKALTKKAAIDRVRFYINCNNLLTFT NRDGLMKNVDPESNSSGVRYYPQMKTYNFGVNVTF >gi|225935350|gb|ACGA01000042.1| GENE 78 133887 - 135827 1994 646 aa, chain + ## HITS:1 COG:no KEGG:BDI_3134 NR:ns ## KEGG: BDI_3134 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 40 644 34 605 605 373 37.0 1e-101 MNRNYLKYIWIAGVSILGLTGCDDYLNRESTSGVNTPDLIWQNPKAITAVLADMYDSGLK LDEFDDWYGSKKANLANQTSLSDEATASYQKESAFDKSNSTYSYGDYVFNDDMVTRYKQI RIVNNFLLNIEKTSVLSEDEKEEIGAEARFIRAMQYFGLVKRYGGVPLQTIPQEYTAGNT QALYQARDTEAATYDFIIKECKAIYGSLPEVRNSDAKYRANRGAVLALWSRAALYAGTIA KYSKTLTLTGEAVSKGYVYIPETEAERYFDECYTASSKILDEMVPRVYSLYKSTSTDPEE LAQNFYNLFSKTVNGDNGEYIFQKQYNVAAGKGHMWDKLNVPFSYRGDGWGCGMSPVLEM VEEFEYIDGTEGKLKMKDSGGKAISYDSSYDIFKNKDPRLLGSVYLPGADYKGYGGGKIE WMRGVINGQDGIGTKYEASAQPDKENKVVIDGQTYNTSGKDGGSLSVGDASKTGFYQRKF LDESLTDYTDIDAKRSSTPWVVFRLAEIYLNRAEACMELNQHLDVALKDINEIRGRAGIK LLTAGDLTLDKVRHERKVELAFEKHRYWDLKRWRLAHLDVSKGGLTNFRGTALCPYYNVK SGKYTFETGVPEKRKRLFLEKNYYTVFRAEDLSTNPLMTQNPGYGN >gi|225935350|gb|ACGA01000042.1| GENE 79 135860 - 136588 638 242 aa, chain + ## HITS:1 COG:no KEGG:BDI_3135 NR:ns ## KEGG: BDI_3135 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 239 1 254 255 102 29.0 8e-21 MNKIINIGLTALFACFMVACENDIDNYDAPNGGVHGTIYDKETNEPIPMPVPGSSGVMMS LYEQNTGATASVDFRARQDGTFEHTKVFNGNYRIQAKDGPFVGVCEGYVTVNGQTQVDLY TIPFSRISLDVTVSADNKLTLTYDAKTSNETLQLTDVSVIWNYAPGVDVNNANHATLSSL GTKASGTHVIDLMSDTEFIENHYKIVSNKNRVYVRVAATVTAESKPYVNYSRVVEVTVND IR >gi|225935350|gb|ACGA01000042.1| GENE 80 136615 - 137448 731 277 aa, chain + ## HITS:1 COG:CC0523 KEGG:ns NR:ns ## COG: CC0523 COG3568 # Protein_GI_number: 16124778 # Func_class: R General function prediction only # Function: 
Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 39 275 5 246 259 65 27.0 1e-10 MKHKFYLLLVLLINGLFIGCSSDDDGVTPPDDKDGVLVKVMSYNIYSGQKVYSGKKGMEA IAQVIKKINPDLAGLQEFETKTNKVENADIIALMKEVTGMQYAFFVKTRDVDGGEYGNLI LSKYPISDEVNYDLPRIETVEDVYPRSMGVVKTEKEGKNFYFGVTHLSHVGNETNRINQT STILEKTKGMDAPMILTGDFNALADSGPMKILYERFDIGCLNGNYGLTTGTPVPVKAIDF VLYTPDKGMAPKAYDVYYDAYVESDHFPVVATFSIND >gi|225935350|gb|ACGA01000042.1| GENE 81 137466 - 138845 1143 459 aa, chain + ## HITS:1 COG:no KEGG:BT_3787 NR:ns ## KEGG: BT_3787 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 456 1 449 452 178 28.0 3e-43 MKTLIHHLKRFSIYTLLLGAVGFAGCEDENRQDMTPEIGEGEVTPKITMYTPTAGGKSTS LTLYGTHFGTDLDNIRVTVNGVDAEVTGALGNIITANVQRGSGSGPVKIYIGQGENIQEL TYKTEFEYSQSPIVSSYIGSYVPAAKTEKKEGSLMEAVLWKPGSIAFDKEGTLYIVEDDD RDIRIAKDNQVATFLRGDATGGVAFRMMNIAFSLDGNTLFLSNDANGSGSAHIATMSWSN DAHQYDTGSLAAIWSITSSLGNGVTNVGVHPVTGEVFSVAHGNAMIYKYDPEQNTMVTTG QQLPDADGKNASKVKIRCILFDKAGTTVYMSSQEKDVIYKGDYDMVTGLFSNLHIWIGQY DKTGFVEGQGNDARLEEPCQMDLDEEGNIYVAVRKKHRIAKITPDGVVTSYTGTGTSGTT DGPLDKAQFNHPEGLQFGLDGALYVSDYWNHKIRKIEKD >gi|225935350|gb|ACGA01000042.1| GENE 82 138907 - 139701 661 264 aa, chain + ## HITS:1 COG:no KEGG:Phep_2528 NR:ns ## KEGG: Phep_2528 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: P.heparinus # Pathway: not_defined # 1 258 34 305 312 135 34.0 1e-30 MNFRICILLFLVCCLLSPVMAQRDKNKQKEVTVKAMTYNTYSGRKQGIDKIAEVIKLEDP DIVSLQEIERNTEINPWDTPERLSVLTGMKYYYFAHALDIPTGGDYGNVILSKFPISEEK SFKLSVLKENDYVRSFGYVKVMKEGKEFYFATTHLDHEYEDAARLKQIDEILACMEQLDK PIILGGDLNSRRGSATMAVFQKYFTVNCLSDAAPWTVPVPSPTYTCDWLIYAPNEAFTVK AYNVCYWADKESDHYPVVATYLIK >gi|225935350|gb|ACGA01000042.1| GENE 83 139901 - 140386 479 161 aa, chain + ## HITS:1 COG:no KEGG:BT_3062 NR:ns ## KEGG: BT_3062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 163 165 131 44.0 7e-30 MGLRYAVKERVFNFDETKTKKYVAAPVSDGVIDFSKLCKNVSLICSTHCGQVKLVLDGLL 
DSLEGYMDEGKTVKLGEFGTLRPTFNAKSGIEAKDVDSSNIIVREIVFTPGTQLKTMLNK MSISKYVPLDVVATSGSNSGSDPGNGENPGGNQGEAPDPAA >gi|225935350|gb|ACGA01000042.1| GENE 84 140471 - 141190 518 239 aa, chain - ## HITS:1 COG:mll6865 KEGG:ns NR:ns ## COG: mll6865 COG2186 # Protein_GI_number: 13475721 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 5 216 7 218 250 82 26.0 7e-16 MKDEVNSQPVKLVIEHIKKQIAERQILPGERLPSERKLSELLKVSRSHVREALQKMELYG IVKTYPQSGTVVSEFSKDQLDTMITDALKISKYDFSSLVYVRVLLEIEVCKLCASNRTKE DLENIEKTLIELEEKFDTELRVEKDFAFHQAIAQGGHNPVISSLLLIITPDILKYYQKYK VCAVPQKTVHAEHREMLQKIKDKDKDGMKELVLRHLSNLIDFAKLSAKGEIPEFEYGHI >gi|225935350|gb|ACGA01000042.1| GENE 85 141431 - 144586 2813 1051 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2137 NR:ns ## KEGG: Dfer_2137 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 1 1017 3 999 1050 553 36.0 1e-155 MNKLLHFKKVVLVLLLLVGALNVYSQHVVTGKVIDEQGPLIGASVTIKDAAVPTGTVTDA EGKYSIKVPNSKTILVYSYIGYVDKAEAVGKRTVVNVTLEEDSKMLDDVVVIGYGTQAKS HLTGSISKLEGEKLINAPVSDMTTALQGSMSGLTVSNETSEVGVTPSIRVRGTGSISAES EPLVIIDGFPVAGGLSSINAADVKSIEILKDAASAAIYGSRAANGVIMVTTKSGTPDKPK YSFKFYQGFKYAYQLHDMMTSSEWLNLLTQEAEMGGPSVPAAARGAAYLESQMGTTDWQK EGLRDMAGITNVQMSVSGGRKETKYFISAAYTKDEGVMLQNSLDKLNFRTKLDAKLSNIV SVGVNLSGTYTKTERPKNNFIDFYRTPSFLPVYHNDWSTEMTGYSGFARGSHFNNIMTPT GTPDAEGNPTLEKSSPFSSANNNPRSVMANTSRWSESMQGLASMYLTIDLCKGLQFKTSN GLNVRFSPSYYYGNKDATKDKEESRATYFSTLYVDLLSENTLNYHIDFGRNNAHSIDALL GYTVESTRNQRVAMTATGFATDDVHTLNAATVYSLASKGNGNTDGTGTFRYPDVVLESYL GRINYSYLGRYLLSTSLRLDRSSLFSKGNRNAWFPSVSLGWRVNEEKFMKNIEVISNLKL RASYGVTGNNRISYDAALEVLNSANYVTGTGNGQLVNGSANISSSLANPHITWEKTDEFN YGLDLGFFKNRINLSIDAYYSVTRALLFEQPTQSFTGYSYYWNNIGKVRNSGVEIQIDTH NIKNRKFSWDTNINFSLSRNKLLELGGEQQVINQGERSECYIAKVGSPLIQFYGYKTNGV WNSVEEINANPHFSNDVPGGLRIVDTNNDGSLTPEDRVPLGNPYPDFTWGITNTFQIKNF DISFLIQGVQGIDVYNGDVYYNETVKWNKAYTKDRWVSAENPGNGKVPYLKLGYDICLTD YPLQDASYACLRNFTLGYTLPSTAARKLKLSGVRFYVSGSNLLYIWGSSYKGINPESRLT SSQYSSSMISGYQRGGFPLTSTISAGFDINF 
>gi|225935350|gb|ACGA01000042.1| GENE 86 144600 - 146117 1449 505 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0773 NR:ns ## KEGG: Dfer_0773 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 9 505 9 483 483 237 35.0 7e-61 MKKVNYIGLLLVTLLFASCDLDKYPYSEVAADEYVKNASSVNNLVLGCYNSLHDVMYYEW AMTELRSDNGRMYATGSSASTTKLVEQLDQGTIVAEHQWVEEYWNSCYATIARVNNAISY LDVVEDETTRNQYEGEVLFLRSLEYFNLVRLWGPVFMVTSKVPSDVARDMQRSTVEEVYA LIEGDLERILDNGMLPERMADADMGRADRNAAKALLAKVYATHYKSGDAKYARAAQLCKE VLESAAVGNPQTGADLVAYNKIFDITNEMNKEIIFAARYLSGNVGLGSPFGNMFAPVNNG ANVIIGTSSGYNTPSDNIITAYTMRGATDKRLDVNIAQKYFNSTTQEWVTTGNCRYCKKY TNPVSTQYDGESDWPIIRVGDIALLYAELTNEISGPSADNLKYLNMICERAGVSTYTLAD LSNRYDFREAVRNERRLELAFENQRWFDMLRWGTAAQTVNNYFKSETFYSEYTYVVNDIA DWQTFLPIPVSVININPDIAQNTGY >gi|225935350|gb|ACGA01000042.1| GENE 87 146139 - 147824 1158 561 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172877|ref|ZP_05759289.1| ## NR: gi|260172877|ref|ZP_05759289.1| hypothetical protein BacD2_13493 [Bacteroides sp. 
D2] # 1 561 5 565 565 1073 100.0 0 MKQYRKWSLASLFACVIFWTSCDSISMKDVVVSAPQIVSFSPESGSIGSEIVVTGEYLDD VVSATIGGEKVTILQKVSNERLSLKVTGNAKSGKIVLSNSVGEGVSEGNFTIEYPAPTIS STGMPTEVEMGNKLLISGSHMNVISAVLFTAEGHTTGNEASILSQNEDEILVKIPYVESD KAAITFRYFNGTSQVETPIESAPQMTVARYEPNVTTSSFEPANIGDIVVLNGTYLNKIDK VILGTIECNIALQTENELKFAVPSSENYVDGDNTMALKISYFDGREVHTLTDAFVVKVPF VYFWENKKVYAQGRDVEELSSFFSPETGLVYANADWRTKVDPISYQYKAATCSANNKPAV SESEYNSVNPYFFFSGVNAGTLQINSPAGSNGQLKNFYMINNSADENRVPGINGNCYGTP VLTFLYLDPTKSGYKALIDEVKNGTLDNIDETTFPIDIEAKTCRGFSISSMKTSINTDVW APGIFEVGKEQKVNVGAVLLILYYNVNGSTSNVADNVKRIGLLHIKTIDFKMYNNTNAPS SSSIEFDMYWQKKDYDYSKVQ >gi|225935350|gb|ACGA01000042.1| GENE 88 147828 - 150104 1690 758 aa, chain + ## HITS:1 COG:no KEGG:Sde_3285 NR:ns ## KEGG: Sde_3285 # Name: not_defined # Def: TonB-dependent receptor (EC:4.2.2.3) # Organism: S.degradans # Pathway: Fructose and mannose metabolism [PATH:sde00051] # 10 673 16 705 760 431 36.0 1e-119 MMTRKIHKAIIASLLIGLPFTSFAKRYDVTLPEVASCLKAAQPGDQIYIKDGQYKDMQLK WTGKGTEKAPIKIEALNPGKVKIEGGSTLRIAGEWMSVGGLHFTDGYAPKGSVIEFRNGQ ELANHCRLTNCVIDGFNPSRRDQAYSYILLYGRHNRVDHCSLTGKLNLGVTLIVILNDER CLENHHQIDHNYFGERPVYGSNGAETMRVGTSQQAYSSSNTVIENNLFERCSGEVEVISI KSSDNVIRNNILLECEGVVALRHGDRNTVNNNLFIGNGLRNTGGIRVVNAGHQIYDNTLV GLAGTRFFSALGVMDAVPNSLPNRYCQVVDVKMYRNTFVDCTNIEFGTGKDMERTLAPDN VSFTDNIIINKELSQPYIAVDDVSGIQFKGNKVQLAKNYSAPGFTTEKLKVPQLPDQVAI RKDKGASWFENRVAQPSAKTHKEYNAAPGTDLSEIIRSAEPGGIIVLVEGTYPIQSAMFI DKPLTIRAANAANKPLVRFNGEKPDNMVTIADGGELIIENIAFDGVLEPGKALAKAGIST ATDMIQPYTLTVDGCEFQNFGEGGFFAIKGTKATFAKSVTIKNCFFRDLSGDAINYAAEK DDIGRYNADDMLIENCSFYRLLGLPINIYRGGSDESTAGPYITIRHCNFADCCNKERGSV MRLIGPQVLTVENCNFDNSGRGGATIRLDEATWEKVRIANCNLWNSGRMVTTTSQAIQGK MYNIRPAYINADAYNYTPVPGSELEKLSIGLKKNSLPQ >gi|225935350|gb|ACGA01000042.1| GENE 89 150168 - 152348 1776 726 aa, chain + ## HITS:1 COG:no KEGG:CA2559_11508 NR:ns ## KEGG: CA2559_11508 # Name: not_defined # Def: putative chondroitin AC/alginate lyase # Organism: C.atlanticus # Pathway: not_defined # 12 719 36 751 759 577 42.0 1e-163 
MVLGGCLLSFAQHPSLLFTQEEVNEMREGKGTVPAFDKTLSEVLSAADAALNSPISVPVP TDGGGGVVHEQHKSNYYAMFHCGVAYQLTGDKKYARYVADMLEAYAKLYPTLSFHPVSLS PVPGRLFWQTLNESVWLVHTAVAYDCIYHTLSAKQRTTIEKNLFAPMADFIMDGMGDNHA NNKTFNKMHNHATWATAAVGMIGFAMNREDYVKKALYGSDETGKHGGFIRQMDYLFSPDG YFTEGAYYQRYAIWPFVIFAQCIENKLPELKIFSYRDSILSKALSTLIQLSYEGEFFHIN DALLKGLSAQELVYAADILYNVHPSDKSLLSVANEYQHTYLPTIGGFRVARDIARGEAAP IVYRSSVFRDGRKGDEGGIAVIRSTDPKLNSALTLKATSHGLSHGHYDKLTMAYYDNGNE ILTDYGASRFLNIEAKNKGHYTRENESFAKQTIAHNTLVVDETSNFGGDIKVSSRYHSDI IYSDFNGDHFQVMVAKETNAYSGVEMKRTLVYVTTPFLQFPLILDVLQANSDKEHQYDYP LWYNGHFVSLNFPYTKASNGLQTLGTKNGYQHLWLEAWGQNEKTNTSSFTFVKDNRLYTI SAATTPQTELKMLRLGANDPDFNLRNETAFLIREKEQKNHTFATSIETHGDYDVVMETSN NLTSSCEEVKVVMDTAEYTVIKAIYKGGHSVTLCLANAEESKENKHHLSVEGKNYDWNGR CGVFAK >gi|225935350|gb|ACGA01000042.1| GENE 90 152375 - 152719 418 114 aa, chain + ## HITS:1 COG:CAC3376 KEGG:ns NR:ns ## COG: CAC3376 COG1917 # Protein_GI_number: 15896618 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Clostridium acetobutylicum # 5 112 2 109 114 101 40.0 4e-22 MKTCSETFQFEKDLKWENPAPGVNRQIMAYDGQLMMVKVKFDKGAVGTMHEHYHSQATYV VSGKFELTIGDKKEILSAGDGYYVAPDEWHGCVCLEAGILIDTFSPVRADFLNL >gi|225935350|gb|ACGA01000042.1| GENE 91 152724 - 154190 1468 488 aa, chain + ## HITS:1 COG:CC1508 KEGG:ns NR:ns ## COG: CC1508 COG0477 # Protein_GI_number: 16125755 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 120 455 66 391 431 219 37.0 7e-57 MKVKGLRWVVIGLIMLITIINYLDRGTLNYMWVANIDYRLLPEAEALSDKGNHAALVGDQ YILTAANGNRDSVSVANVQMKEKNGVTFVTNREGIAIDLGLIDRNDPDASQKAKDILGLI TIFFMIAYGISQLVSGKLYDKIGCRKGFFWSVLVWGAADALASLSRGIFSLTFFRMMLGL GEAGPWPGTTKSNAEWFPQKERAFAQGLFGAAASIGSILAPIIILMLFIAFGWKLTFIVV GGLGLIWLIPWLIINKATPKEHPWITEEERTYILSGQPEKEIKTEDKGKSWGELLRVKKN 
WSVILGRFFLDPIWWMFVTFLPLYLADVFHLNIKEVAFSAWVPYVGAALGSVAGGWYSGW LINRGKTVNYARKAAMLIGGFIIIPSILAAIMSTTAPVAILFMALVLGGFQFFMTNLQTI PSDLHSGKSVGSLAGLGGASAVLGTILAILFASYITNWILLFSLLGALVPLSLCSIFLTV GEIKQINK >gi|225935350|gb|ACGA01000042.1| GENE 92 154213 - 154965 222 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 245 4 238 242 90 29 8e-17 MKLEGKVAIVTGGARDLGRAISVKLAAEGAKVCLNYFDNEADAQETLALIKNVGGEAIAV QGDMTKAAAVKNLFAECNKAFGDKVDVLVNVVGGIVGRKKITEQDEDWYDFLMDVNMRSV FLCTREVVPMMPEGGSIVNFSSLAARDGGGNGASMYATAKGAVMTFTRSMAKELGPQGIR CNAICPGTIATSFHDRFNTPENRERMKGSYALRREGTADEVAELVCFLACSESSYLTGTN IDINGGCFFS >gi|225935350|gb|ACGA01000042.1| GENE 93 155191 - 156393 1204 400 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4528 NR:ns ## KEGG: Cpin_4528 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 34 351 41 358 402 243 38.0 8e-63 MRRIFAIWALCLCWGVLNAAPLGPFNATLLEKLKTDYQKGDKEVTQYIELQEKAAEKYIK MSPLSVTAKKKLPPSKDPRDYMTLSPYWWPDSTKTDGLPYIRKDGERNPEVYEYPERENA NRFGDAAYCLGVLYYITGKEVYAKACADHLRTWFTDPKLGMNPNMTYAQSVPGMKNMRGS GFIDSRRFSRALGVAKLIESSKSWTASDKKKLDDWATAFCYWMENSTQGQRESHAANNHG LWYEAIHLMVLAYLDRTDRIREVAEQSILPKMGAQIADDGSLPQELERTLSLHYSTFALE ALMEANQITSQIGINLWNTPAANGKVASQAVDYLYPYYMNPESWKFKQIKPFDQSRAAIL LYEAGTALGNQKYIDTAKRIGLKYSTSDVETIPYLILKKK >gi|225935350|gb|ACGA01000042.1| GENE 94 156421 - 157410 867 329 aa, chain + ## HITS:1 COG:alr0079 KEGG:ns NR:ns ## COG: alr0079 COG0657 # Protein_GI_number: 17227575 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Nostoc sp. 
PCC 7120 # 78 303 144 381 411 78 25.0 2e-14 MKLKLILCLCMLPLFCLAQEAQPQQFSNRYGMKMTQNEDGSYALLYAKKYAKFIDVNTIP DVMVPYTYQPDRLKSKDYKGVTTKDIVYKKHKDYELILTVDFAETDKPAPFVVYIHGGGW ARGDNGSSRSLSQYLAKQKGITGVRVSYTLAPQSDATVKVSIQDILDAVKYVQEHAAELN VNPACVGFLGTSAGAHLAAVAAMTVPGTKALVGYSGIYDLEKAAITMKTKDAQRIAYFCD RNPKVLREASPINLIPKKNVPASMLICGTCDVTVECEQSEMFASALKKKGGVCNLLTYKY YDHNVSSKTSDKMEEIFFKSVDFLTTYMK >gi|225935350|gb|ACGA01000042.1| GENE 95 157415 - 158311 705 298 aa, chain + ## HITS:1 COG:no KEGG:Phep_0714 NR:ns ## KEGG: Phep_0714 # Name: not_defined # Def: fibronectin type III domain protein # Organism: P.heparinus # Pathway: not_defined # 33 297 230 483 484 188 43.0 2e-46 MKTGILFLALFAFIACSNSSSEVIDDKEQPKPGQPEQQVDGSLIADGNSAKTYDLIKRSG YNHEAPDSSREHKTEHFQHIQQVRDNQLNKYVFAFFIHATIDDDRSLPNITDRQRNEIKT DNKSPQSLVGQQGETMVFRWKFCLPAGFRTTTKFSHLHQLKGIDNASGTADVSSPLITLT AYSNSKGGQQLRVRYDKRGESTTTIASTDLADFLGNWVEVEEKARFGEDGSYEVTITRVK DGRVLLKLDPQKMDMWRTDCTGLRPKWGIYRYLGENRSWQDQLRDEEIRFADFSIKKL >gi|225935350|gb|ACGA01000042.1| GENE 96 158465 - 160600 1531 711 aa, chain - ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 91 677 142 723 779 175 25.0 2e-43 MNKLLIYLFLLTTGLLSTAQVFAQKEIAPGVIKLEKGEIDTFTPYSLFGGKPIIEAMKSL PTAKLPFDTEEVQIKITDRGCLIEVPLEDYEQIYGFGLQFETFGQRGLRKCPIVNDNPLN GLGYTHAPQTFYVSTKGYGILVNTARYTTFLCGSNQKTRQSMQSTEVRKPISTTTEDLYK NRSNGDKVFIDVPGAKGIEVFIITGPELLDVVKRYNLLSGGGCLPPMWGLGFKYRVKGDA TQDSVMRFANYFRNNNIPCDVLGLEPGWQTATYSCSYIWNKERFPQHKEMLTRLQEKGYK VNLWEHAYVHPTSPIRKELEPYSGDFLVWNGLVPDFIKPEAHKIFTGYHRTLIDEGISGF KLDECDNSNISFASATWCFPDLTQFPSGIDGEKMHQIFGSLYVNAMDSIYRAKNTRAYQD YRSSGMFMSPRSAVLYSDTYDPKEYIQALCNSAFGGLLWCPEVREAHSAEDFFHRLQTVI LSPQAMVNAWYLQYAPWLQFDRGKNERGEFLPEAKQYEEYARTLINLRMELIPYLYTAFR TYQQEGIPPFRPLLMDDPKDERLRTISDQYMIGNGMMAAPLYENKKSRKVYFPEGVWYNF NTNEKYEGNREYEITTELNQLPLYVRQGTLLPLAEPIPYINTQTVFNLNCKIYGTPTATC QLFEDDGVSYDFQKGQFNQVTLNAAKGKVKLTRTKGYKIKRYQLKGYEFIN >gi|225935350|gb|ACGA01000042.1| GENE 97 160804 - 161607 800 267 aa, chain + ## HITS:1 COG:DR0821_2 KEGG:ns NR:ns ## COG: DR0821_2 COG0657 # Protein_GI_number: 15805847 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 44 253 6 216 242 107 30.0 2e-23 MKKNCLLLLFLLCVCVVQAQTVYRTDKDISYVSGSEIDTYRLERCKLDIYYPEKKKGFST IVWFHGGGMEGGNKFIPKEFTEQGFAVVAVNYRLSPKAKNPTYIEDAAEAVAWVFKNIEK YGGRKDRIFVSGHSAGGYLSLILAMDKKYMATYGADADSVAAYLPVSGQTVTHFTIRKER GLPDGIPVVDEYAPVNKARKETAPLVLITGDKHLEMAARYEENALLEAVLKSIGNKKVVL YEMQGFDHGQVLGPACYLIVDYVKRFK >gi|225935350|gb|ACGA01000042.1| GENE 98 161755 - 162702 855 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 333 56 3e-90 MEKIAKKLTDLVGNTPLLELSNYNKNNDLKARLIVKIESFNPAGSVKDRIALAMIEDAET KGVLQPGATIIEPTSGNTGVGLAFVSAAKGYKLILTMPDTMSIERRNLLKALGAELVLTP GADGMKGAIAKAEDLKGVTPGAVILQQFENPANPAMHLRTTGLEIWRDTEGKVDIFVAGV GTGGTVSGVGEALKMRDPSIKIVAVEPSDSPVLSGGKPGPHKIQGIGAGFIPKTYKASVV DEIIQVQNDDAIRTSRELAKQEGLLVGISSGAAVYAATELAKRPENAGKMIVALLPDTGE RYLSTILYAFEEYPL >gi|225935350|gb|ACGA01000042.1| GENE 99 162782 - 164548 1354 588 aa, chain + ## HITS:1 COG:no 
KEGG:BT_3079 NR:ns ## KEGG: BT_3079 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 588 1 586 586 887 75.0 0 MLNKTLGILVTILLLFSACKDEDSPLSSTKQLNSFSILKGDNQGKIGNDVNALIVGNVLT LSMNKYDDLKSLVATFEYDGKSITVDGIEQESGVTLNDYSQPLAFIIEAEDGSKETYTVK VVLEEKAGFTSFRFLKKNNSFLTADATCLIEGNKIVSLYEFPQSKLVAEFSSNAVKVLVD EVEQVSGVTENDFSSPVVYKLVMRNGDILQYTVRMEFLLDLIPLLTITTDDSSISEIPSK DSYLNATLTVNGKSFYESYTGKTEIKGRGNSTWGYPKKPYRLKLNKKAEICGFGEAKNYV LLANHIDPTLMLNSVAFKIGRLLELPFTNHAIPVDVVLNGKYKGSYLLTEQIEVKKNRVD LDDKNSVMWELDSYFDDEPRFKSTAFNLPVMVKDPDLTTEQFEYWKKDFNAFATQFAKEP LQGNTYVDMIDIESVAKYLITFNLVHNMEINHPKSIFLHKEGNGKYVMGPIWDFDWAYDY EGKGKHFASYTIPLFSNSMNGVGTAFFQRFLRDSRVKTIYKEIWQDFKKNKLNALLQYVD DYAVMLKPSIERNSVLWENTRSFDTKVAELKSWLKNRANYIDSEVNSY >gi|225935350|gb|ACGA01000042.1| GENE 100 164790 - 167561 242 923 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788005|ref|ZP_02182451.1| 50S ribosomal protein L33 [Flavobacteriales bacterium ALC-1] # 774 921 476 621 622 97 33 4e-19 MPIVTNYTAKDYQAGLQNWALAQGKNGEMYIGNNTGLLCFDGYTWSKYQMPGNQLVRSIL IDGDRIYVGTYEDFGYFSRNSLGILEYTSLWSQLKNIKTHNDEIWNILKIGECIYFQSFS SWFKYDGKKVTAHYNSQHLPLYFHQANGQIYVQMVNGDFYLLENDEYKLLIKRKALKDDS VVALIPTAGGKMILCTEWNGLFDYDGKTVSPHPTAIDKELKSQQMNRAIMIPSDSTIVLG TIRNGIYAVDKEGKEKWHYDMDNRLYNNSVLRLFCDRDNNVWAALDIGIALIHTGSPYSI LIPNRNSQSFGMVYGVNAFNNSLYIATNQSAWLYSFADQTIVPIRGTEGQNWHISTFDSQ ILLGNNFGTKIITGTVASNIPETETSSTCLRKCIINGQEVLIESSYYNLRVYRKYNGKWS FSNSIDGFWNPVRQFEVDHSGNIWAAHMSLGIYKIELSRDLKKVEKCTYIKSLSDEENNA SLMHVMKIRGRVVLSDSKRTYTYDDINQRIIPFVQLNSILKNGINMAIPVDDNLYWLTDY RGYTLIRYDNDNFRMERFIPSSFFGLECNENNNNVYVNGNVTYFCLNNGIGRLDMNLKKD TLLQRSSLLIREVTSLSQDHQLYLMSTSAQKKDNEKIWGDITFHLSYPNFNCEPLRFYYQ LTGSSLNLVSESADPAITFGSLGYGEYHFTALVKNVDGQVLSSVEYYFNKPRPFYLSIYA WIIYALLASVIVYFYSCWHAAKMLRKRNREFEKEKMKQDFKMLEQEHIIAQQREQLLEAE LQVKSKELASLALDAVVQRKAVESLKEVMSEQEHKGIINQHDIDTILKQINGNLNEEEFW DIYHKNFDMIHKNFFRNLRKQYPSLTASDLRFCALLRLNLSTKDIAQFTNLTIRGVETAR YRIRKKLAIPGNINLVDFLIDFT >gi|225935350|gb|ACGA01000042.1| GENE 101 167897 - 170926 2745 
1009 aa, chain + ## HITS:1 COG:no KEGG:BDI_3062 NR:ns ## KEGG: BDI_3062 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 1007 1 1015 1017 1058 54.0 0 MKRKITFLVVVVLCLQTLLAQNKTIRGTIVDSFSEPIIGASAHVKGTYIGTISDLNGNYT LENVPDDAIITFSYIGMIPQEIAVKGKNVINVQLKDDVQKLEEVVVIGYGSAKAKDLTSP ITVVKGEALLSTPASSPMAAMQGKVAGVNVTNSGTPGEGPKVAIRGKGSFSNSSPLYVVD GMFYDDINFLNSNDIQDMSVLKDASAAAIYGVRAANGVVIITTKKGQRNQKAKITYNGYV GVQKATNVLEMANSSEYATMLLEANYDAYVSTMKASIDKFGGDYSDPDFHNWKFDSDTDW YKELLRSALITNHSLGISGGTEKSTYSVGMSYLYQDGVMDVENNYKRLNFRAALDYEATN WLKVGFNGVFSNSTQILPQNKAWQQAFNAPGIYPVYDTTNDNTFPDKYASPDAVGFTSNF YNPVATANYYDSQNENYQVLTNFYAQFQILPEKLNFRTSYSYDYSAIRGREYTAPYYVSS WQQQAVSELTKKDTNYYNYIWDNILTYNNQWGKHNFSAMLGYSMRQQQYRYMWGKANNVP DGKDEWLYLSQGNAEGVTLGDDGYCYRGQSYFTRLSYDYAGKYLLTFTMRADGSSKYQEH WGYFPSVGAAWVISEEDFMKDQKFFDYLKLRASWGRLGNDHVAASDGFASITTGNSASGV FGNSTFPGYQNTTYYSWLKWELVDETNIGFNFSTFKNRLNVDWDYFYRLTKRAVISPRLP FSNDVLAGNYGKILNQGFDLSLNWNDNIGRDFKYNLGVNLSYLKNKVKDLGGLSSIKGGK TINMVGEEMNSYYGYKVVGVYQTLEECAEDPIAVANNLVPGDFKYEDVNGDNVIDGDDRQ VLGSYIPNFTYGINLGLNWKNLDFELTTYGQTGGQIYNRKRALRYAQSNYNFDKAQYENR WTGPGSTNSHPSAAALVKGWNVSDQRVNSYFVESADFFRIQNITLGYSLRNIKLGNYTLP GIRFSLTADRPFTTFKANSFTPELSDAEGWDTEVYPLTSTYSFGIQIDF >gi|225935350|gb|ACGA01000042.1| GENE 102 170972 - 172438 1310 488 aa, chain + ## HITS:1 COG:no KEGG:BDI_3063 NR:ns ## KEGG: BDI_3063 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 486 4 490 499 441 48.0 1e-122 MKILKNILIYALPALMLCNTSCDYLDKEPENKVPEENVDFTQIENMYQAVSGVYAKVRTG GMHWVIWPLSVVRDDDVWSGRIDDQATLVDMGNYIYDNSFWGLNEMWNQYYGIIKVANAA LESLDSYAENITSDNDMTNYRSYCGEVKFLRAYAYYRLVQAFGPVTILRNNTQADMTRST INAVYKYALEDLQYTMDHTPRLRPNEMAHYGAVTAFSAEMLAAKIHLNLGNYGEVETLTD DIIGSKKFKLYDDYYNLFKIPGKVCDESLFECQCTDFGVGSGDMVDADNWFVFQGPANDG NISGWGFIGIYKELRDWAAARGETVRATTSFLLAGTTTPSGDVIRPLQNPTQTDCWNGKA YTPTDQLTPGRTKYGANNNVRIFRYADVLLMNAEAKVRLGKDGDESLNLVRDRAGMSEID GATVDQILDERRMELVCEWGERYNDLIRTGKAASVLGSKGWTADKTYYPLPFDQVSNIPS LTNEPIDE 
>gi|225935350|gb|ACGA01000042.1| GENE 103 172462 - 174570 1355 702 aa, chain + ## HITS:1 COG:no KEGG:BDI_3065 NR:ns ## KEGG: BDI_3065 # Name: not_defined # Def: beta-glycosidase # Organism: P.distasonis # Pathway: not_defined # 9 702 71 764 764 927 64.0 0 MTGIVKKILLSSGIAFSALAGNALPPEDSIKVVKGQVSDYYIPVVNQYEITGMVVDEQGN PLEGATVMFFSSPVHCNTDAEGRYKLKATDNDVHLYVYYPGKSFADVKRAVADRQVKIVM RPEKHKSVQRQPAQATRWYDPVHPVTRTYCNPMNISYNYEPYNNNVQSGGSFRSSADPMG LTYKDEYFLFSTNQGGFHYSKNLSDWEFAPASFQRRPTDDDMCAPAAFVSGDTLFYTGST YEGLPVWYSTSPKSGRFKRAIERNTLPSWDPCLFLDDDGKLYLYYGSSNEYPLKGVEISR DDFRPVSKIYDIMMLRPEEHGWERFGMNNDDEVTLRPFTEGAYMTKHDGKYYFQYGAPGT EFKVYADGVYVSDSPLGPFTYQQHNPMSYKPGGFVQGVGHSGTFQDLKGNYWHVGTCMLS LKYKFERRIGLYPTAFDPDGVMYSTTAFGDYPCWNADYDIKNPSDRFTGWMLLSYEKPVK VSSTDSIYSASNLTDENMRTYWAAKTGEPGEWVEIDLGGMKHIKAIQLNYYDHKSVQHNR ANDLYYQYRIYSSDNGTDWTLVVDKSDNDKDVPHDYIELSETLDARYLKLENIHVPSGNF CLSEFRVFGFADGEKPLPVRNFKVVRDKQDKRNAMISWSPSSGAYGYNIYYGIAPEKLYN CITVNGADHYDFRGLDLGTTYYFAIEALSESGRSALSKVVKQ >gi|225935350|gb|ACGA01000042.1| GENE 104 174602 - 175963 1107 453 aa, chain + ## HITS:1 COG:AGl3503 KEGG:ns NR:ns ## COG: AGl3503 COG5368 # Protein_GI_number: 15891871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 44 451 11 408 425 310 38.0 4e-84 MEWTRTGIFITLLVVICACTQKNKTVIDAEPDRPVAFANDNELLDYIQKTHFNYMWEGAE KTSGLACERIHLDNIYPQQDQNVITIGGSGFGVAGLLVAIERNFIDREEGVARLTKIVDY LAKADRFHGVWPHWLNGPTGKVKPFGTKDDGGDLVESSFLMQSLLCVRQYVKDGNEKEKA LASKIDELWHGMEFDWYRNGGQNVLYWHWSPNYGWEMNFPLEGYNECLITYILAASSPTH SVPAECYHEGWARSGGIKSASKPYGYPLELKHNGAEEKGGPLFWAHYSYIGLDPRNLTDQ YANYWNVVRNHAMSDYQYCVTNPKGYKGYGPDCWGLTASYSINGYSAHMPDNDLGVITPT AALSSFPYTPEESMAALKGFYKQGSWIWGKYGFYDAFSPNEKWTVPHYLAIDQCTIAPMI ENYRTGLLWRLFMSCPEIQQGLKKLGFTSTATD >gi|225935350|gb|ACGA01000042.1| GENE 105 176001 - 178286 2158 761 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and 
metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 7 760 6 763 764 688 47.0 0 MRIGRFLLAAVISASAATAMATPQADKGKMDKFIDNLMGKMTLQEKIGQLNLPVSGEIVT GQAKSSDVAGKIRKGQVGGLFNVKGVDNIREVQKIAVEQSRLKIPLLFGMDVIHGYETVF PIPLALSCSWDMKAIEESARIAAKESSADGICWTFSPMVDICRDPRWGRMSEGGGEDPYL GSEISVAMVKGYQGDDLTDKNTIMACVKHFALYGAPEAGRDYNTVDMSHLSMFNNYFPPY KAAIDAGVGSVMTSFNVVDGIPATGNKWLMTDVLRDRWGFDGFVVTDYTAISEMIAHGMG DLQQVSAMSLSAGTDMDMVADGFLTTLEKSLKEGKVTMTEIDKACRRILEAKYKLGLFDD PYKYCDASRVKKDIFTAENRAVARKIATETFVLLKNENNLLPLQRKGKIALVGPLANTKA NMPGTWSVAAAFDKYNSLYDSMKQSLAGKAEVLYAKGSNLMYDAQREAEATMFGREMRDP RSAQELLDEALNIASQADVIVAAVGESSEMSGESSSRTNLEMPDAQRDLLIALKKTGKPI VLVYFAGRSTVMTWEQENFPAILNVWFGGSEAADAICDVVFGDVSPSGKLTTTFPKNVGQ IPLYYNHLNTGRPLEAGKWFTKFRSNYLDIDNEPLYPFGYGLSYTTFRYGDLQLSNNSMN ENGKITASVTVTNTGNYDADEIVQMYIRDMVGSVARPVKELKGFERIHLKKGESRTVSFD ITAEQLKFYNSALNWVCEPGEFEVMVGGNSRDVQTKKFSLE >gi|225935350|gb|ACGA01000042.1| GENE 106 178446 - 180848 2180 800 aa, chain - ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 67 800 4 736 754 500 38.0 1e-141 MKKLLCLALLLSAGSVYTGSVSASNQPAGKKSGNNSKDIYKKTWIDFNKNGVKDVYEDPS APIEARIADLLSQMTLEEKTCQMATLYGSGRVLKDAWPTDGWSTEIWKDGIGNIDEQANG LGKFGSEISYPYANSVKNRHTVQRWFVEQTRLGIPVDFTNEGIRGLCHDRATMFPAQCGQ GATWNKKLIREIAKVTADEAKALGYTNIYAPILDIAQDPRWGRVVESYGEDPYLVGELGK QMILGLQSEGIVATPKHFAVYSIPVGGRDGGTRTDPHVAPREMKTLYLEPFRKGIQEAGA LGVMSSYNDYDGEPVSGSYHFLTEILRQQWGFKGYVVSDSEAVEFLHTKHRITPTEEEMA AQVVNAGLNIRTNFTPPQDFILPLRRAISEGKVSLHTLDQRVGEILRVKFMMGLFDNPYP GDDRRPEVVVHNAAHQDVSMRAALESIVLLKNEKEMLPLSKSFSKIAVIGPNAEEVKELT CRYGPANASIKTVYQGIKEYLPNAEVRYAKGCDIIDKYFPESELYNVPLDTQEQAMINEA VELAKASDVAILVLGGNEKTVREEFSRTNLDLCGRQQQLLEAVYATGKPVVLVMVDGRAA TINWANKYVPAIIHAWFPGEFMGDAIAKVLFGDYNPGGRLAVTFPKSVGQIPFAFPFKPG SDSKGKVRVAGVLYPFGYGLSYTTFNYSNLKISKPVIGAQENITLSCTVKNTGKKAGDEV 
VQLYIRDDFSSVTTYDKVLRGFERIHLQPGEEQTISFTLTPQDLGLWDKNNQFTVEPGSF SVMVGASSEDIRLKGSFEVQ >gi|225935350|gb|ACGA01000042.1| GENE 107 180856 - 183168 1933 770 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 28 762 4 752 778 516 39.0 1e-146 MRQTFIAMGICCAIGISTLACQDKSKDYTDPTLPVSERVSSLMSQMTLEEKVAQMCQYVG LEHMKKAEKDMSAEDLKHSHSQGFYPNLHSSDVEEMTKKGLISSFLHVVKAEEANYLQSL AQQSRLKIPLLIGIDAIHGNGLYRGSTIYPTPIGQAATFDPALVERMSRETAIEMRASGM HWTFTPNVEVARDARWGRVGETFGEDPYLVGQMGAATVRGFQTKDFTGNDKVIACAKHLV GGSQPANGINGAPAELSERTLQEVFFPPFKDCLEAGVFTVMTAHNELNGIPCHGNKYLMT EVLRNQWKFDGFVVSDWMDIERMHDYHNVAETLKDAYRISVDAGMGMHMHGPEFYEAIIE CVKEGSIPEKQIDAAVSKILEVKFRLGLFENPFIDLKKKDEIVFNEKHQQTALEGARKSI VLLKNEGNMLPLDASKYKKVFVTGHNANNQSILGDWAMEQPEEHVTTVLKGLKAISPETN YNFLDLGWNVRLLSDNQIKEAVQQARNSDLAILVVGENSMRYHWNEKTCGENSDRYELSL PGRQQELVKAVAATGVPTVVILVNGRPLTTEWIDENMPCIIEAWEPGVAGGQALAEILYG KVNPSGKLPITIPRSTGQIQCMYNHKFTNHWFPYATGNSLPLYEFGYGLSYTTYKYENLK LSEATITPDKSVKVTVDVTNTGKMDGEETVQLYIRDEYSSATRPVKELKDFARIPLKAGE TKEVSFTLTPEMLSYYDANMHYGVEKGTFKIMVGASSRDTDLQSIILTVK >gi|225935350|gb|ACGA01000042.1| GENE 108 183212 - 184846 1389 544 aa, chain - ## HITS:1 COG:TM1751 KEGG:ns NR:ns ## COG: TM1751 COG2730 # Protein_GI_number: 15644497 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 318 542 110 310 317 110 30.0 8e-24 MKKRLIIYLACLFACLIASCSEKDEVISNGGEPQSYVPFAINKGVNISNWLSQVDAIPAN GFSQAEAQKLAGYGFDHIRLPIDEKLFFTEDGQKIPEAFTLLHNAIGWCRDAGMKVIVDL HVLRDHNFNTTVTTSGDPIESFYDTFEDMKWAGWFANGSGVTVNNVDNPSKSGINTSNKV FSVVRKSGVDTWSNAKKKDFTIIPVGSGKMQFKYLRAKIYKSTKSTITVVLSDENNTNDG YYGYTNTTANEWEEIAVDVSNYNKNCGRVALRPEEGETMYFDDVFFSDSKTGDNKIYPES GTTTVVPKLWTDKDAQYKYLDLWKTLAEELNQYPNDLVAYELLNEPVAPYASQWNSLSAQ LIRELRQTEPERKLIIGSNRWQSVNTFNELTIPSGDPNIILSFHFYNPHPLTHYQAEWTE 
EKDLNVPIHYPGELIEEADFNQLSEAMQKLVTPYMGTYDKTTLESLVLKAEMKASSLGVQ LYCGEFGCYKKTPAADRMEWIRDVVSILKDRHISYSYWEYKAGFGFCDSQGNVIEQEVLD LLTK >gi|225935350|gb|ACGA01000042.1| GENE 109 184867 - 185247 431 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255693558|ref|ZP_05417233.1| ## NR: gi|255693558|ref|ZP_05417233.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 1 126 1 126 126 248 100.0 9e-65 MKKIIFICTLLVSLFAMSGCSDDIDTPVLTTDEYPRILGRWPDKQEDGTLGKFSIPLNQV LSINVQYTPAELCEGTWYLDGREIHKGVGLTYIPSTIGTFHLELIVKTPTKETTREAIIQ VTEATE >gi|225935350|gb|ACGA01000042.1| GENE 110 185283 - 186905 1363 540 aa, chain - ## HITS:1 COG:no KEGG:PRU_2518 NR:ns ## KEGG: PRU_2518 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 2 539 4 573 580 304 34.0 6e-81 MKTIYKIAFGMLCICGLLLGACTDFEELNTDPSKSSSTDPNQQLSMIQLQTWGHWQMCQP YPFYLAAFAQYMQGDWNTTNYGGQYRKNDAEMGNTWNLMYPALIKNIVDILDKTKDNERE VNIHSVARIYKVYLFSILTDMYGDCPYFEAGKGFITGNVKPAYDKQELIYKDFLKELGEA ADALTASGDKVTGDIIFQGNIDKWKRFANSLHLRYAMRIVNADPELAKAEAIKAVGQEAG LMQSAADDALIAYTDIQDWASNEFRRNGLAQLWRGREAYPTAYLCSTFWKQLDATSDPRQ FVFGRCYDESSANNPFGRVDLTEEMQNTEAAKFQPCNPGYFWYSNGTWPEGYWSKLTNKW QDKATRPQLNNIFLKGDMPGVIMTYAEAELLLAEAKARWAGDITTGADASTHYKNGVRAA IHFLEKFGAKTFDDQVIDTYLQANSLPASGLDAQLTAINTQLWILHFNNIPEGYANWRRT DIPVLLPSPHYGAVTIDSQTTPLRLCYPLFESSYNPEGYQSAIQAMGGKDDWNSPVWWDK >gi|225935350|gb|ACGA01000042.1| GENE 111 186918 - 190097 3185 1059 aa, chain - ## HITS:1 COG:no KEGG:PRU_2517 NR:ns ## KEGG: PRU_2517 # Name: not_defined # Def: TonB dependent receptor # Organism: P.ruminicola # Pathway: not_defined # 2 1059 5 1057 1057 1144 56.0 0 MKKQLTLFLCLLLFIPMYGYAEDEVNHQIVQQTTKVKGTVTDEQGEPLIGASIAVKGTSQ GVITDFNGQFSIDASRNATLIVSYVGYRSEEVQVKGQSNLKIVLKEDSKIIDEVVVTALG IKRERKALGYSIGEVKGEELEKAKETNVINSLAGKIPGLVISQTAGGPSGSSRVIVRGST EMTGNNQPLYVVDGVPLDNSNYGSAGQYGGYDLGDGISSINPDDIESMSVLKGPAASALY GSRASHGVILITTKKASTKKKFAVELNSTTTFEKQLTKWDDVQYVYGQGTGGRINGTDDQ YSSNKNWGPKIDPGLNLTYFDGVTRPYVVIPNNIDGFFRTGMTTTNTIVVSTVKDDTGIR 
ATYTDMRNKDILPNTKMSRNTLNLRANTTINKKVDLDFKVTYTREDVKNRPALSDHRANP AKNLMSLATTYDQKWLRDNYKDADGNYYDWNGRDVWNLNPYWVLNEMTNESGKDKFMGSA LVRYNVNEHLKIQVTGGADINFMNFQEYAAPTSPGFEQGQLQISDFRNRMYNVEALAIYN NSYKKFDYGVTIGGNLYKVDNKTQIVTAKEMVMRDVIALQSFTSKEITEGTYRKQINSLY GSINLAYGNFVYLDATLRGDHSSTLPSGNNTYLYPSVSGSFLFSEFFKINPTLLPYGKVR VSWAQVGSDTDPYQLGLSYELSPKNYSGYALSQIANTTIPNKDLKPTKTNSAEVGLELKF LKNRIGLDFTYYTQKSSNQIMRLNTTGTSGYNSMLINAGEIENKGVEIALNTRPIQTKDF SWDLNINFSKNSNKVKRLASGIKEFELESARWINVKVAAVEGQNYGSILGKDFLRNDAGQ IIVDASTGLPKVTEDLRVLGNATWKWTGGITTNLTYRDISLSAIFDIKVGADVYSMSARS SYMTGKDKATLAGRDGWYESEEQRLQAGVKESNWEATGGYVVEGVVEMPDGSFAENKKFV DPEVYWKHIADQTPVPFIYDNSYVKVREITLAYRLPKRWISKVFDAVSVSFVARNPFIIY KNVPNIDPDSNYNNGSGMGLEYGSLPSRRSYGFNVNVKF >gi|225935350|gb|ACGA01000042.1| GENE 112 190547 - 190702 66 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVVFPMYIGCISYVIQGIRTRHRVRVILTLTFTLTPAMPMNKEMQKRVRV >gi|225935350|gb|ACGA01000042.1| GENE 113 190728 - 194810 2437 1360 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 810 1086 12 290 294 157 34.0 2e-37 MKRFPALCLLTIFLLLPALSIFAQSEKQVNSHSFTYQYLTTKEGLSNQRVFSILEDKKGF IWISTRSGVDCFNGRTVKNYSLFGEDIIVDGAGRMIYLTKDSHETLWAYTSAGKVFRYDP ISDAFTLEIDVTELTESGIFLNNLFIDSSDHFWFGLKNGLFCYDNTNRHTENILKKKCIN SIHFSPSNETLYIATTEGLYERNLQNGTVSTLLSDLYIQSVFYDPATHLLWIGTFNSGIK VKDTRTGEFLSNGSLNQLPHLPYRSIIAYDAHTLLLGVDGAGVYAATRDGVHSWSFLNAN LEEEGELKGNGIYALCKDSSSNLWIGSYTGGVAYANPKKYLFELTQHEYKNPQSLINNHV NAILEDHEGDLWYATNQGISVHLIKSGTWKHFLKENVFLTLCNDENGNVWTGGYGTGVYC LNKQTGIRQHLTTERPGTLTTNYIYSIVEDNNKDLWFGGMYGNLIRYTPPQAGKAEKFTP YSITLINSITTVGKDTIALATANGFYLLNKQTGNFKQYFTSPSNADTRSNSFIYSMYFPT PDKVWFGTDGGGINLLDLKTGKAVTYSTADGLPSNYVYSILPDNEGHLWLSTDKGLAYIT TSPSPAITNIGFLDGLANEFNFMSYTRLRNGDFVYGSTNGAVRFSPKNFTRHLYKAPLLF TSFEVPQKSREKTEKKKIQFNRMLNEGKTIELKYNENSFLLSFISVSYQYQQDIQYSYQL EGFEQSWSAPTNELSIRYTNIPPGNYTFHVKSMSKNGGQQLDEQTIRIHIAQPFWNTAIA WLIYIILLAGITYFIWRFFANKMEKKHFSEKIQFFINTAHDIRTPVTLIMAPLSDLSKED GLSGEGKRYLQIARKNTEKLYNLITQLLDFQKIDTTHLTLQVAEYDLKSYLQEKVFSFQS LCESKQIRMELSVPEAPVSLWMDKDKADKIFDNLLSNAVKYTPSGGDISITVEQNDKKIT IEVRDNGIGIPRKAQRYIFSNFYRAENAVNSKETGSGIGLLLTRRLMKLHKGNISFTSNE GEGSTFLLTFRKGNRHLARYILPEKTSPLSASILSASISNIPTEDIDNELSPEESIDQEA GTVENTDLYGENDPPQTARNAKERIMIVEDNDDLRFYLKKTFASIYTVIDKPDGESALEY LQDKSVDLIISDVMMPGIQGDELCRRIKSDFTTSHIPVILLTAKTEKDAILEGLESGADD YLTKPFDTEILKTKIKGVLQNRKIMRQYFLSHSLPSVPAAESSEKKEDTECANLLSAMDK EFLERCTRLITENLANPDFTINQLSRELALSRTIFYEKLKALTGQAPNEFIKLMRMTEAA NLLKQGLPVQDVALLVGFTDSKYFSTAFKKHFGVSPSKFL >gi|225935350|gb|ACGA01000042.1| GENE 114 194807 - 195286 410 159 aa, chain - ## HITS:1 COG:MA2197 KEGG:ns NR:ns ## COG: MA2197 COG3467 # Protein_GI_number: 20091038 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Methanosarcina acetivorans str.C2A # 5 153 6 152 152 92 34.0 2e-19 MKTVIIEDKQRIESIILHCDACFVGITDLEGNPYVVPMNFGYENGILYLHSGPEGSKLEM LEHNNNVCITFSVGHKLVYQHEKVACSYSMRSESAMCRGKVTFIEDMDEKRRVLDIIMRH YTDSEFNYSEPAVRNVKVWQVPIEQMTGKVFGLRANEKP >gi|225935350|gb|ACGA01000042.1| GENE 115 195304 - 196173 
366 289 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 279 32 296 313 151 30.0 1e-36 MGTIQCEVTPLTDEDLFVLLNNPKAKFDYPIHFHSDYELNLVMNTSGKRIIGDSIEDFEN CDLVLLGPGLPHKWKAPTLDETQVITIQFHEQMYNSFLLNKRVFEPIRELLIRSTRGIVY TGKTFERVKERLLNLSNARGFNTALEFYSILYDLSIADEQRLLTSASYDSSLLVRESRSR RIEKICKYIEVNFKKDITLKEIAELVNMSESAISHFFKKRTNRSFITYLTDVRIGYASRL LAETTQSVADIAFECGFSNLSNFNRIFKKYKNQTPTDYRIGVHKTITKF >gi|225935350|gb|ACGA01000042.1| GENE 116 196442 - 199498 2342 1018 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0351 NR:ns ## KEGG: ZPR_0351 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 13 1018 46 1052 1052 1030 52.0 0 MKKVFMCMIPLLIFAMGVFAQTIDLSGTVKDDKGEPIPGASILVKGTQNGALSNVNGEFS IVVKKGQTLVCSFIGYVTEQAIVNQSRINFVLREDVAQLNEVVVVGYGGVKKGDLTGSVS TIKMDKLEDLPANNSVFSSLQGRVAGLQIVNSGQGPGSNPAFIVRGISSINGTQSPLVVV DGFPLGEGADLKQINPADIADIVVLKDASSTSIYGSRGANGVIMVTTRKAGKGTTHINFS HQTIISQFSSKLNLWRDPVLMAQLDNESRVNAGQLPLYVGRTDNGTYYPSVEEISSGAWP YFTRWDDEILRTPVTNNTSLSISGANDKLIYNLSVNYFDDRGTYIKDNYRKLSAKLDVDY KAFKNFSIRTSNILSKNWRNANSGDIGRNPLFPVYDEEGNYYQSSPTDYGNPIALANTVK NKNQGMDVLSSWLATWEIIDGLTWKGQLNYKYGTTVNDLYNPKKYTEDGTFNNGHAYIGN WYGQDVIPETYLTYDRMLGMHGHLTAMAGYSYKYSMERSSALDSYDFVNESLGNENIGAG NPQKNQVSNGFSESKLVSYYGKINFSWMDKYLLTATFRSDGSSKFGDNNKWASFPSGALA WKLHNEKFMSNLKFINEAKLRASYGISGNQGIAPYQTLSRYGNEKYYDNGAWNTAIGPGY VIGSYGSDGRYKYWGGIPNKDLKWETTRQLNFGFDVTLLDNRIRLVFDWYKKHTFDLLRQ RYLPLSSGYDKMWVNDGEVQNRGFEFTIDADVVRTKDFSFNSTFIFSRNRSKVLSLGSVA SSGLNVDPNTGMQYEFTGATLTEQPVGSVNILAVGQPLNVFYGYKTNGIIQSNAEGIEAG MSGDEAKAGELKYMDINNDRAVDEKDRTIIGNPNPDFTASLNLSFKYKKFDLSIFLNGVF GNDVLYQYGMTNPATMPLRWTVDNPNNEYPSLRQNRTPKVSDWFVRDGSFVRIQDINFGY TFDHLCKGVSSLRLYGSINNLYTFTSFDGYDPEVGLDGIYWGGYPRFRKFTLGMNITF >gi|225935350|gb|ACGA01000042.1| GENE 117 199510 - 201033 1233 507 aa, chain + ## HITS:1 
COG:no KEGG:ZPR_0352 NR:ns ## KEGG: ZPR_0352 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 505 1 497 497 402 46.0 1e-110 MRTKILSIFLWSTLMVGVVSCDLSESPYGFYSEKNFYKTPEDAESALMYAYNALTFLEYS RGIYYIGEAASETVSLKSGEDANNPGAQALDEWKISDNANNQTLQIYFKYCYIAINRANA VIYNVEQSELPKEVKDRILGEAYFLRAYNHFNLVKVFGLVPMQKEMVQTVSGTTPSMAKS MDDIYNFMLEDLKQAESLLDYTQKTGRANRAAAQGLMAKLYLTAASSKESGVANYTEMAA SVEDLYANAATCAKNVLDASQRGECNFGLSENLADIYDVNKPDGPEHVFIMSMDRTGLNE GNYSKIGMLFLPYNNGGNFYVKAGSSLYPSHYGFEVFPTNMAFYRTYDETDKRRTDLINT EIYNADGSVYATSAYPYTLKYTDPDQVTGNTGDKSSVKPYLLRFSDIALMYVEAKGSDDG AWLQRIRARAGLGNIPTPASVSEFRDMVVRERAWELAFEGNRLYDLRRKAMVTKVDPNAK SAGITEAEAAFYPIPQREVDLNPNLRN >gi|225935350|gb|ACGA01000042.1| GENE 118 201061 - 202017 805 318 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0353 NR:ns ## KEGG: ZPR_0353 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 2 316 6 311 312 194 42.0 3e-48 MKVLQYIYWLIVSAFLLGACQVNPLDEVEEGDWNKERRILGLTFENQAGDATISLNVDDP TKGTVEVTIVNPDFSQAIKIKKMEVSYKASSSVNSGDALNFDPVSHSATIIVTAVSGEKR EYTVRVTPLVESLVGKWEIHELDVFGGTGPYYGGVDFVNLSSDPSWWNEQTGVKAELDNT LEFTFEGITENGQTYGTCIHDAGVDGKFADFIWAGGLPDGQTVIDVNYNYRKVPQGTSHW VRDYTTETVTFTKGDKTYVASLVNSGDIVYWGKTLTIQDSALKFGNLKSAGDWGPIYSAY DKIVYAPWDFYVQIRKKQ >gi|225935350|gb|ACGA01000042.1| GENE 119 202022 - 203494 1107 490 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 3 489 16 492 628 197 30.0 9e-49 MKVLLQILFLLFCIGCQKPVQNAEVIFRINKDSIISTSYLGNGVQWDPYQLDYGNGRVTI SESDWNKLYARLDHMRPQFIRMMVYTTDYLKNGKLDEMYDFDQVSKILTYCQERGITVMM GDWGGRMVDPVTNRIDTFMLSNAARYADFLVNRKGYDCIKYYNMINEPNGDWSSNQANYD LWARAIRYFDRRMHDYGLADKVGLAGPDAAIWDQSEAWWIDSCATRFNEAIKLYDIHTYP PKSTVNSGEYSKIIRAYKERVPQGSQIIMGEIGLKFLKTDTLLEKENNERISKVPHASRE DSQMFVYDHFYGIDMADALFQTMNEGFSGALIWMLDDAMHSKTEEGPDKLKIWGFWNILG 
EEYFGGAKEEEVRPSYYAWSLLSKYIPKGSTVYAVETGEDKGVRAVAVEHKGKYTIGVVN VSGQERTVLFKSNSLPVLEDIRQYNYIENEILKEGDCKQLPNVTGIKLNLEKGLFLELPT DGLVVYTNME >gi|225935350|gb|ACGA01000042.1| GENE 120 203497 - 205014 1168 505 aa, chain + ## HITS:1 COG:no KEGG:Csac_2519 NR:ns ## KEGG: Csac_2519 # Name: not_defined # Def: coagulation factor 5/8 type domain-containing protein # Organism: C.saccharolyticus # Pathway: not_defined # 25 504 30 492 628 169 30.0 3e-40 MKNIIKGLICLALFLSGCGKDNGGDTPPVPEPDVNTLLVTPKVLTNSYLGNGPQWGGYDI VNAWTGNATLSEQDWNTLFKRVSFMRPSLIRIMVSQGWNYMNGEVYSPEKSDPVLGKILS YCQEKGITVQLGEWGHVGGSGIDATWVDHATDFLSYLVKEKGFTCIKYYTIVNEPNGDWS STVGSYPLWKNIIQQFYRKMNEKQLTDKVRIMGPDVAIWSIAETSWVVNTRNELAEEVKA FDIHAYPNNDDVHTTSFLQLLKAYKAASNLDAPIIMGELGFKYSATSSLGKANASRIKAD PFAADDSQMHVYDAFYGIDMADATIQVMLAGFGGVTYWDLDDAMYNDDGSSSSTKLKRWG FWNILGSEKFHNPDDEKIRPWFYTTSLLCRYFPSGSTIYDVTLPKPVKSGLRAIAGMRNG KVTIAIVDSSTNDYLFNLGMENGALLQGIKTYKYQSQQGAHFIGTVDADGFASPVDAGET IDLSGGKLREINLPAQSFMLLTNME >gi|225935350|gb|ACGA01000042.1| GENE 121 205023 - 206459 782 478 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4232 NR:ns ## KEGG: Fjoh_4232 # Name: not_defined # Def: sialate O-acetylesterase (EC:3.1.1.53) # Organism: F.johnsoniae # Pathway: not_defined # 23 473 25 501 511 322 38.0 2e-86 MRRICTFIMILCSSYLFLQAQDHLRFASAFTDNLVLQQKSRVKIWGYAPPRSTLQVLASW SRKEKTVKADTCGKWMIELSTPAGSYQTYNLSIANTEQTVVLKNICIGEVWFCSGQSNME MIMRNDPQWRLYVDNANEEIAVADYPGIRFMTVQRNESFTALDEVLTEGWQVCSPQTVGG LSAVGYYFARKLLSSLDVPVGLVVDAYGGSPIQSWIPYAETLKPLYKAEHETLQEAVEKG KEKPEYNMLSSLYNAMVHPLIDYKIRGWLWYQGEANVGDAGRYIAMMKDLVSSWRKKWKA KLPFYYVQIAPFQYPGYQKEKWAELAEVQSMALQTISSSGMVVTADLGDSTNIHPGKKKP VGERLALIALSDTYHQKIKSQSPSLKRLTLEQGKLRAEFDFAYHGLRLEGVHHEFEISSD GMTYYKAIVEIKGSCVWLSSPEVPSPRYVRYGWRDACVSTLYNSENLPLGPFKASVDI >gi|225935350|gb|ACGA01000042.1| GENE 122 206456 - 207877 814 473 aa, chain + ## HITS:1 COG:no KEGG:Coch_1349 NR:ns ## KEGG: Coch_1349 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 45 467 60 483 488 179 28.0 3e-43 
MRIVGLLFIVGFLLSACQSEEGDKQMGESDNELIDIVKIQDDPDNVLRIDVEVTMKESAT LKLVYWKDGNESVKRELVFDDQKQSWATKLILLEEQTKYWLKVYASNQNGMVEESKPFQF ITKALPDGLMKFTNLMPDYTYTFNGYIHVGDKQQGTLYLINAEGKVVWYQPTDDLSVICS GFDPKTKTFQAILGFNPNESFTGEYIYVVDLYGNVKLKKYYAHLDNPYFHHDIQMLDNGD LVVVNQIRQSFDLTRWGGSSNEMVTGDGFSILDFSGNTKWTWSAFDFISPEDDPDIMGDR GEFQYTPREDWLHANTAYPCPDGDFLVSFNRIRQVWKVDGRTGEVIYKLGRNGDVHLTNP EDFSDRQHAASLTPDGDVMIYDNGYTNKRSRVMAYRINEITKQAEVTMRIVLPKEDFSVN QSSAYCMDNDRILFGSVVPKTIGVIDKNGNLMWHYKTNRPFYRALYIDFKLLK >gi|225935350|gb|ACGA01000042.1| GENE 123 207885 - 208904 848 339 aa, chain + ## HITS:1 COG:TM1852 KEGG:ns NR:ns ## COG: TM1852 COG2152 # Protein_GI_number: 15644595 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 1 333 3 293 296 189 39.0 7e-48 MKLTKSTFNPILSPNPTNHWENLVVCNPGVWYENGKFYMLYRAAGDDKEHIIRFGLAVSE DGFHFERVSDEPAFGPSVDGEDAGCVEDARVVKFDDYYYVTYAFRPCAPGQYWKFAHDEV IVRDFGENAPYFLKRNMANTALAMTKDFKKWIRLGRITQSNLDDRDVILFPEKINGKYAM LHRPKEWIGPQYGPKHPAIWLRYSDDLLVWNEPSHLLIEGIDGGWEEKIGGSTPPLKTDK GWLVLYHGVENGGCGYYRVGAMMLDLNDPTKVLGRTKDWILEPEFSYEIDGFYKGCVFPT GNVIVGDTLYVYYGGADKYIGVATANVDELVTFILTQKL >gi|225935350|gb|ACGA01000042.1| GENE 124 208901 - 210496 1175 531 aa, chain + ## HITS:1 COG:yidK KEGG:ns NR:ns ## COG: yidK COG4146 # Protein_GI_number: 16131549 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 5 479 2 483 571 199 30.0 9e-51 MNWTNTLTITDLVIIALYFVFIVYAGLRYRKTSDSESYFLAGRSMTWPVIGFSMFAASIS SSTLIGQAGDAYSTGIAVFNYNLMSIFVMIIFAWFFLPFYIKSRIFTLPEFLERRFDVRS RYYFSGITILINIFLDAAGSLYAAALVMKLVFPEVSLTTLAVIFAVIVAIYTIPGGLSAA IRVDLIQGIILTIGAIALTWILAEKGGAAYVAEQFNQGVMMKLVRPLDDPSVPWLGMILG IPILGFFFWGNNQQLVQRALTAKSIDEGRKGVLLVGLLTLITLFIIIIPGVMAQKFFPGL EKPDMVYPSLVIEMMPVGMVGFLLAALVAALTSSISGLLNSVATLFTMDFYLKMKKNVSS KEQVFVGRIVSAVVLIIAVAWAPQIGEKFGSLLKYYQEMLSMLAPPIVSVFFLGIFWKRA TSQGAFYGLIGGALLGMTNLVFKIYFGYSIFGDIHFLLTVPIYLAWSMLIMLVISYLTPA PDYKIVKPYIWTKADFDKETVALQKLPFYKNYRKLSYMLIALCLIVLLVFA 
>gi|225935350|gb|ACGA01000042.1| GENE 125 210595 - 212751 1245 718 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 89 717 70 703 730 484 39 1e-135 MAKKKEKKEKKAGKRMSKKELAALLIDFFHAKPNETLSMKYIFSELRLTTHPQKMLCVDI LHDLLDDDYISEIEKGKFRLNNHGTEMVGTFQRKSNGKNSFIPEGGGEPIFVAERNSAHA MNNDKVKITFYAKRKNKDAEGEVIEILERANDTFVGTLEVAKSYAFLVTENRTLANDIFI PKDKLKGGKTGDKAIVKVTEWPDKAKNPIGQVIDILGQAGDNTTEMHAILAEFGLPYVYP KAVETAADKIPAEISAEEIAKREDFRKTTTFTIDPKDAKDFDDALSIRKLKDGLWEVGVH IADVTHYVKEGGIIDKEAEKRATSVYLVDRTIPMLPERLCNFICSLRPNEEKLAFSVIFD ITEKGEVRDSRIVHTVINSDRRFTYEEAQQIIETKEGDYKEEVLTLDTIAKALREKRFAA GAINFDRYEVKFEIDEKGKPISVYFKESKDANKLVEEFMLLANRTVAEFVGKVPKNKKPK VLPYRIHDLPDPEKLENLSQFIARFGYKVRTSGTKTDISKSINHLLDDIHGKKEENLIET VSIRAMQKARYSTHNIGHYGLAFEYYTHFTSPIRRFPDMMVHRLVTKYMDGGRSVSEAKY EDLCDHSSNMEQIAANAERASIKYKQVEFMSERLGQIYDGVISGVTEWGLYVELNENKCE GLVPVRDLDDDYYEFDEKNYCLRGRRKNKIYSLGDAITVRVARANLEKKQLDFELIEK >gi|225935350|gb|ACGA01000042.1| GENE 126 212972 - 214549 1089 525 aa, chain - ## HITS:1 COG:no KEGG:BF3505 NR:ns ## KEGG: BF3505 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 220 525 329 629 629 127 30.0 1e-27 MTTKYVNFFFSALLALCVITSCGDKDDDAPAMTVSSNSVTILAAGGEESIEINTNQSEWT ATRPELDSWCTLKMNGNTLKISASTNETITSRSTLVTVTAGIGTNAKIQEIKVTQKAADP SLEITGTPVALDAAGTAVELTVTTNTGSWNASRPAADTWCLLSQEGNKLTVSAEAYTVNA ERKTTITITYGEDATLTPKTFEVTQQGAAPIYAIEIPTDFETGDVQKAMYQNVKVAEICW EYIKTGSTDKRMVVIYPVAEDGKTNLAKGLTVEDGGSIVWDVETNTCTYTAGTASAISKV YLADGNFSTTTTAASPIKTTVEADLLIDTRPNDTKFSYKIVKIGTQYWMAENLKAQSYLN GTEIPRVTDGKEWDANTTGAYRLPFADTQTFLIHGAYYHGYTMYNEAGLAPEGWIVPSDD EWNKLRKYIGTPYGTKLKSSSSSYWSTNKGTNITGFNALSSGYYTSATGDSGSGTDIYFW STTKTRDLLARQDVPHYYRLANASTGMPTDTHTFNFGHCIRCVRK >gi|225935350|gb|ACGA01000042.1| GENE 127 214605 - 215657 532 350 aa, chain - ## HITS:1 COG:no KEGG:BT_3244 NR:ns ## KEGG: BT_3244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 328 1 341 354 181 36.0 3e-44 MKKVINIFLFVLCFTVISCHDDDTISPSLTIIQTDLNLKAIGGEASIQIQATGEVKATSD VEWCQVTEVTTETVKLSVQENRDYPGRSAQIIITNGSQTKQATLIQEGAILTYNKAEQIQ STNNQAAVLPIKLSSSFPIQVTIPEKNKSWLSFTHSTDGSGGSFIVSANNTGKARGSDVT ITSGERTLTYQVLQYGAENFVGSWKGTYVNQGSQYSLPDISISAADENGIYTISGLYKTS VYDYKVQATYQNNMFVITGQEVGVYVSSIVPMNVYFCIVDNIGFPLWTANNSIGLIPVLL SDGSFVLAFADNGGVAGEVVSKISFAGFLGDPSLDTFQGHLRLMSNCFFF >gi|225935350|gb|ACGA01000042.1| GENE 128 215690 - 217042 789 450 aa, chain - ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 436 1 426 443 300 40.0 6e-80 MKKILYFLLASICISLQSCLYQEDSYFDDSSANRATSDVERCNELLKNAPNGWKLEYYAG KNYSMGGITLLCKFDGKNVSIISEIGSTTTKPGVKITSLYKVVSEQSTILTFDTFNELLH CFSTPILSQNSNFQGDYEFAIMSASENEIVLQGKKYRNEMVMTPMPADTDWNTYINNLNQ IANEAFLNTYILKAGGEKVGEVERFSHVFNISTQTESQDAAFAYTTDGFRFRQPIIIGGK ALQNFKWNKTRMTFTCTDAGAEHVSMEGVYPAGYKKYEDYTGFYYFYYKTLKVNDDGTFN FVDAMPFIVQLKRKVDQESYVMTGSDLDADIVVTYEKSIGKLVIKPQQPGSIGQFYGTYI LGNNETYALPAHMGANFGYITSYVDTDYGLVAEASADGILSFVEYNTFFSSIVGSPATSL VLTAYSSNKITGSTYLGYLNWMDSIQLMPY >gi|225935350|gb|ACGA01000042.1| GENE 129 217055 - 217957 680 300 aa, chain - ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 304 304 451 74.0 1e-125 MKKQIIYTLLCMISFSLVACNNDDDIDTSHSIFTDTPAIEENAFDLWLLKNYTYPYNIDF KYRMQDIESDQKYKLVPAEYDKSVMLSKIIKHVWMEAYTELAGPNFVRSYVPKTFHLIGS PAYDSSGTMVLGTAEGGKKITLYNVNDLDPNEIDIDMLNKYYFETMHHEFAHILHQKRNY DPSFDRICEGKYIGTDWYQETDRDGGNTKAWKKGFVTAYSMSEAREDFVENIAMYVTHSE AYWNSMLAKAGEEGASIINEKFTIVYTYMLEIWGINLDDLRDIVLRRQQEIPELNLSTIE >gi|225935350|gb|ACGA01000042.1| GENE 130 217976 - 219535 1162 519 aa, chain - ## HITS:1 COG:no KEGG:BT_3241 NR:ns ## KEGG: BT_3241 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 517 4 515 517 748 73.0 0 
MKHINILHTLLLISGLGLGTSSCNDFLDKLPDNRTELDTEQKITKLLISAYPEVTANELF ELYSDNSDDNGPKYSYYQLAEEDCYNWKDTQEEYQDTPNYLWQGYYQAITAANMVLKAIE QKGNPESLNPQRGEALVCRAYSHFMLANIFCNAYNSHAAEELGIPYMEEVETTVNPQYKR GTLKEVYEKIEHDLLASIPFISDDNYSVPKYHFTKKAVYAFATRFYLNYMQTDFSNCDKV IDYATRVLGNDASGQLRDWESWGKLTANDNVQPDAYIASTNKANLLISSTTSNWGILISN YMAGKRYMHNKLIASNETSQSSGLWGNADYLYVKPFTTSGYECSFIRKSGVYEGSNYLYY MPVIFNTDEILLYRAEAYALKKNYTQAAADITTWQKAYTKNKNTLTPEMIHEYYGKMPYY SPLTSLTPKKKIHPDFTIEEGMQESMIQCILHIRRITSLHEGLRWPDIKRYGIVIYRRFI SDGYGSITDEMPVNDLRRAIQIPKSVILAGMQPNPRTNE >gi|225935350|gb|ACGA01000042.1| GENE 131 219571 - 223161 3053 1196 aa, chain - ## HITS:1 COG:no KEGG:BT_3240 NR:ns ## KEGG: BT_3240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 142 1196 2 1058 1058 1854 86.0 0 MRKNWNLFIVCFLCVLYFLPYQVGAQTQPQKKITIEFSNERLPSVLKRLEKISGYKILFT YDDVKKFTVSGSVKDKSLEQTLDVILSNKPLEYHIEDQFVTITPKGPSKQAKVFNVKGVV ISGDDNQPLIGATVLIKGSKSGVLTDMDGKFSIENVSNKSVLQFSYIGMKPQDLTPAPSM SVTLMPDVQTLSEVVVTGMQKMDKRLFTGATNQLTAENVKLDGLPDISRGLEGRAAGVSV QNVSGTFGTAPKIRVRGATSIYGSSKPLWVVDGVIMEDAIDVGPDDLSSGDAETLISSAI AGLNSDDIESFQILKDGSATSIYGARAMAGVIVVTTKKGKAGVSKISYTGEFTTRMIPSY NEFNIMNSQEQMGIYKEMEQKGWLNSGDLFRAKNSGVYGRMYRLIDQFDESNGQFGLANT LETRNAYLREAEMRNTNWFDILFENNMTQNHSVSITSGSEKSSFYASLSAMLDPGWYKQS EVKRYTANLNTSYNIYKNLSISLISNASYRKQKAPGTMNSSLDVASGEVTRDFDINPYSY ALNTSRTLDPSTDYIANYAPFNILQELNNNYIELNVTDVKFQGELKWKVIPELEISALGA VRYQASSQEHNILDDSNQAIAYRSGLDDATIRKENGWLYTNPDNPYALPISVLPEGGIYQ RQDRKMLGLDFRSTISWNHLFAEKHITNFFAGMEINNLQRSYSSFQGWGMQYSMGEIPSY VYQFFKQGIETGNKYYSLSHSETRSVAGFANATYSYDGRYTINGTFRYEGTNRMGRSRSS RWLPTWNVSGAWNAHEEKFFQALEPTLSNLTLKASYSLTADRGPADVTNSQAIIRSFSPY RPFTDIQETGLYIEDIENSELTYEKKHELNLGIDIGFINNRINLSADWYTRNNYDLIGLI PTQGVGGSVYKFANVASMQSHGIEFTLSTKNIKKENFSWSSDFIFSYAKNEVTELKGRTR MMELVSGSGFAREGYPVRALFSIPFAGLNSNGMPQFNINGNITSTDINFQEREKLDYLKY EGPTDPTITGSFGNIFTYKGFKLNVFMTYSFGNVVRLNPYFNYKYSDLSAMPREFKNRWT VSGDESKTNIPVILSSPQYEANRTLYKAYNAYNYSTERIAKGDFIRMKEISVTYDFPQKW 
ISPIKVSNLSLKLQATNLFLIYADKKLNGQDPEFFNTGGVASPVPRQFTLTVRLGL >gi|225935350|gb|ACGA01000042.1| GENE 132 223313 - 224434 750 373 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 201 338 159 294 331 72 31.0 1e-12 MEKALDIMESLSEISDEQLQEILEDKEALQTCRDIMDSSLFLRQKSRIELPNVDMELERF KKRQYSVRMHSNLWKTGIGIAAMIAILFGAYYLINSLTTPVLEPITVFTADNTPQHITLQ KDNGEKIVLDEPQSTNQALPQKVISKSEKKELDYRQVISTTTQTHVLTVPRGESFKVVLC DGTEVWLNANSNFVYPTAFIGNERIVTLEGEAYFKVTKDPKRPFIVKTKTVQTRVLGTEF NIRSYTPEDTHVVLINGKVEVSNTKGGSYTRLYPGEDAHLQSDGNFVLAEVDLDSYVYWK DGYFYFDDVTLKDIMQNLGRWYNVNIEFRNKEAMEYKMHFISDRTKDLEHTISLLNRMKK VTVTLQGNTLTID >gi|225935350|gb|ACGA01000042.1| GENE 133 224539 - 225102 323 187 aa, chain - ## HITS:1 COG:no KEGG:BVU_0609 NR:ns ## KEGG: BVU_0609 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 22 186 7 171 175 101 35.0 2e-20 MLFSIIICLFARQIIYYSTLLRHVEDKADFDFLFKEYYPQLYYYAFHLINNMEASKDIVS DAFEFIWTNYAKIDKATAKSYLYVYVRNKSIDFLRHQNTHEQYIQIYSELTKSYVETEYQ EQDERMMHISKAMEKLTPHTRHILEECYIQRKKYQEVAEELNISVSAVRKHIVKALQVIR EECAKKT >gi|225935350|gb|ACGA01000042.1| GENE 134 225186 - 226193 644 335 aa, chain + ## HITS:1 COG:PH1742 KEGG:ns NR:ns ## COG: PH1742 COG0451 # Protein_GI_number: 14591500 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Pyrococcus horikoshii # 4 329 6 301 306 82 27.0 9e-16 MESVLITGASGFIGSFIVEEALKRKFGVWAGIRPTSSKRYLKNRKIHFLELDFAHPNELR AQLSGHKGTYSKFDYIIHCAGVTKCSDKKTFDYVNYLQTKYFIDTLKELNMVPKQFIYIS TLSVFGPVREKDYTPISGEDTPMPNTAYGLSKLKAELYIQSIPGFPYVIYRPTGVYGPRE SDYFLMAKSIQKHVDFSVGFRRQDLTFVYVKDIVQAIFLGMEKKVVQKAYFLTDGKIYKS RAFSDLIQKELGNPFVLHLKCPLIVLKVISLFAEFIATRSGRSSTLNSDKYKIMKQRNWQ 
CDISPVMKELGYVPEYDLEKGVRETIAWYKNEGWL >gi|225935350|gb|ACGA01000042.1| GENE 135 226184 - 227149 622 321 aa, chain + ## HITS:1 COG:no KEGG:BT_3074 NR:ns ## KEGG: BT_3074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 321 1 321 321 539 96.0 1e-152 MALDLFKRVETRKGLFAVEKITLIYNLLTSILILFLFQRMDHPWHMLLDRAMIAAMTFLL MYLYRLAPCKFSAFVRIVIQMSLLSYWYPDTFEFNRFFPNLDHVFATAEEFIFNGQPAIW FCHTFPHLIVSEAFNMGYFFYYPMMLIVALFYFIYKFEWFEKMSFVLVTSFFIYYLIYIF VPVAGPQFYFPAIGIDNVSKGFFPAIGDYFNHNQELLPGPGYQHGFFYSLVEGSQQVGER PTAAFPSSHVGISTILMIMAWRGSKKLFACLIPFYMLLCGATVYIQAHYVIDAIVGFFSA FLLYVVVTWMFKKWFAQPMFK >gi|225935350|gb|ACGA01000042.1| GENE 136 227240 - 228235 496 331 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae PittGG] # 1 287 1 284 326 195 38 1e-48 MKIGQIDLGKYPILLAPMEDVTDPAFRLMCKRFGADMVYTEFVSSDALIRAVSKTAQKLS ISDAERPVAIQIYGKDTETMVEAAKIVEQAQPDILDINFGCPVKRVAGKGAGAGMLQNIP KMLEITRAVVDAVRIPVTVKTRLGWDANNKVIVELAEQLQDCGIAALTIHGRTRAQMYTG EADWTLIGEVKNNPRMHIPIIGNGDVTTPKRCKECFDRYGVDAVMIGRASFGRPWIFKEV KHYLETGEELPPLSFDWCMEVLRQEVVDSVNLLDERRGILHVRRHLAASPLFKGIPNFRN TRIAMLRAETKEELFRIFEEITSQRKENPEI >gi|225935350|gb|ACGA01000042.1| GENE 137 228360 - 229649 1272 429 aa, chain + ## HITS:1 COG:aq_1015 KEGG:ns NR:ns ## COG: aq_1015 COG0826 # Protein_GI_number: 15606313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Aquifex aeolicus # 9 415 5 401 409 271 37.0 1e-72 MSLSLKDFEIMAPVGSRESLAAAIQAGADSIYFGIENLNMRARSANTFTIDDLREIARTC DEHGMKSYLTVNTIIYDKDIPLMHTIVDAAKEAGISAVIAADVAVMNYARQIGQEVHLST QLNISNAEALKFYAQFADVVVLARELNLEQVAEIYRQIQEEHICGPSGEQIRIEMFCHGA LCMAVSGKCYLSLHEMNHSANRGACMQVCRRSYTVRDKETDVELDIDNEYIMSPKDLKTI HFMNKMLDAGVRVFKIEGRARGPEYVRTVVECYKEAIKAYLEGTFTDEKIAAWDERLKTV FNRGFWDGYYLGQRLGEWTRNYGSAATERKIYVGKGIKYFSNIGVSEFLVEAAEVSVGDK LLITGPTTGALFMTLEEARVDLESVQTVKKGQHFSMKSDKIRPSDKLYKLVSTEELKKFK GLDIEQKRG 
>gi|225935350|gb|ACGA01000042.1| GENE 138 229695 - 230096 442 133 aa, chain + ## HITS:1 COG:CC3234 KEGG:ns NR:ns ## COG: CC3234 COG0824 # Protein_GI_number: 16127464 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Caulobacter vibrioides # 6 117 13 126 147 61 35.0 4e-10 MNYIYELEMKVRDYECDLQGIVNNANYQHYLEHTRHEFLTSVGISFAALHEQGVDPVVAR INMAFKTSLKSGDEFVSKLYMKKEGIKYVFYQDIFRKSDNKVVVKSTVETVCVVNGRLSD SELFDSVFAPYLK >gi|225935350|gb|ACGA01000042.1| GENE 139 230093 - 231217 894 374 aa, chain + ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 84 371 10 285 288 196 38.0 8e-50 MSSSEEEQIYSIALTMVPGIGHIGAKHLIDGMNNAVDVFRLRKEIPERIPEVSQRVVDAL DCPQAVIRAEQEYEFIRKNRISCLTFHDEAYPSRLRECEDAPIVLFFKGNTDLNSLHIIN MVGTRNATDYGTRICASFLRDLKTLCPDVLVVSGLAYGIDIHAHREALVNDLPTVGVLAH GLDRIYPYVHRKTAIDMLEKGGLLTEFLSGTNPDKHNFVSRNRIVAGMCDATVVIESAEK GGSLITAELAEGYHRDCFAFPGRINDEYSKGCNRLIRDNKASLLLSAEDLVQAMGWNIPT TSSEKVNVQRSLFLDLSEEEQKIVSILEKQGNLQINSLVVEADIPVHKINAILFELEMKG VVRVLAGGMYQLLN >gi|225935350|gb|ACGA01000042.1| GENE 140 232101 - 236633 2466 1510 aa, chain + ## HITS:1 COG:PM1127 KEGG:ns NR:ns ## COG: PM1127 COG3513 # Protein_GI_number: 15602992 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 542 1130 340 773 1056 100 22.0 3e-20 MKRILGLDLGTNSIGWALVDSEEQRILGMGSRIIPMDQGVLDTFSGGNPVETQTAARTAY RGTRRLRERALLRRERLIRVLNMMKFLPEHYASQIDFEKRLGQFFDEKEPKLAWKQNDEG RFEFIFQQSFNEMIEEFKNKGVVREDKKIPYDWTIYYLRKKALSQEIEKEELAWILLNFN QKRGYYQLRGEEEEENPNKLVEYYSLKIVDVIADEKPNSKGDIWYSLHLENGWIYRRSSK IPLFGWKDKTRDFIVTTDLNDDGTVKKDKEGIEKRSFRAPSENDWVLLKKKTESDIEKSH KTVGAYVYETLLQNPMQKVRGKLVRTIERKFYKDELKQILEKQREFHPELRDEDLYNDCV 
RELYRSNETYQFILSKRDFVHLFLDDILFYQRPLRSQKSAIGDCILEFKTYKDKDGNNIK EYLKVISKSNPLYQEFRIWQWLYNLKIYRKEDDEDVTLQFIANIEDKEQLFDFLSNRKSI EQKPLLEYLIKAKGLKKQVKTEAYRWNYVEDKIYPCSETKTMISTRLAKVENIPDNFFTK EIEQKLWHIIYSVTDKNEFEKALRTFAKKYDLDIASFVDNFKKIPPFKNEYGSFSEKAIK KLLPLMRLGKYWKWNDIDNNTQNRISKIITGEYDEEIKDIVREKSIALTNENDFQGLQLW LAQYLVYGRHSEADIAGKWHSVADLEKYLNEFKQHSLRNPIVEQIITETLRVVRDIWRKY GQGAENFFSEIHVELGRDMKNTADERKKIVNVVTENENTNLRIKALLMELKNNSDGKLEV ENVRPYSPTQQDILKIYEEYAISTGLDNEKDEKVKEDIKKISRVAQPTTTELQRYKLWLE QKYCSPYTGKVIPLGKLFTEEYQIEHIIPKSRYYDDSFSNKVICEAAVNKLKDKCLGLEF IKNYHGQIVETGFGQKVTIFDEEAYQNFVKQHYAYNRSKRTKLLLEEIPEKMIERQMNDT RYISKFVLPLLSNLVRAEENDNGVNSKNVLPVNGKITTMLKQDWGLNDVWNDLILPRFVR MNELAKTIAFTSWNEQHQKYLPTVPLELSKGFQKKRIDHRHHAMDALVIACATRDHVNLL NNKHANTDTIRYDLQRKLRLFERVTYIDPQTKNNVTKDIPKEFKKPWDNFTVDARNELEK IIVSFKQNLRIINKATNIYTKYENGKKIKVGQKGLNWAIRKPLHKETVFAKVSLRKRKTV RLSEALKDWKKIVDKKLKQEIKRLTCQYGKFDVDTILRYFKDRKYQFGEVDVSKVKMYYF DEENAAVRKNVDTSFTEKFIQGSVTDTGIQKILLNHLEAKGNKVEIAFSPEGIEEMNKNI IQLNEGKLHLPIFKVRVYETIGNKFSVGVKGNKKDKYVEAAKGTNLFFAIYVDENGIRSY ETIPLNIVIERLKQGLSVVPEKNRKEHSLLFYLSPNDLVYVPIEDERENIHAVNLDQLNN EQRKRIYKMVSCTGSECHFVPYYVASPIVNKVEYSSLNKIGRSLTGEMIKDVCIKLKVDR LGNVKGISGL >gi|225935350|gb|ACGA01000042.1| GENE 141 236630 - 237643 796 337 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 12 334 13 336 345 269 43.0 5e-72 MNKNNNTPIPSDFFLYKDSNGEVKVEIYIFNETVWLTQDKIAQLFGVDRSVVTKHLKNIF QTAELQEDSVSAKIALTAADGKKYQTKLYNLDAILSVGYRVNSIQATHFRIWANSVLKEY LIKGFAMNDERLKNPQMLFGKDYFEEQLARIRDIRSSERRFYQKITDIYSQCSADYEAGS EVTKTFFATVQNKLHWAISGQTAAEIIVARVDAEKPNMGLTTWKNAPNGMIRKPDVSIAK NYLNETEMDDLNRIVSMYLDYAERQAKKGQVMYMKDWVKKLDAFLQFNEEAVLQHSGQVS HEVAKALAEQEYDKFHTRQLRNYESDFDKLLKGMNHD >gi|225935350|gb|ACGA01000042.1| GENE 142 237636 - 238568 658 310 aa, chain + ## HITS:1 COG:PM1126 KEGG:ns NR:ns ## COG: PM1126 COG1518 # Protein_GI_number: 15602991 
# Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Pasteurella multocida # 45 294 70 317 343 163 36.0 3e-40 MIKKTLYFGNPVYLSLRNAQLVIKLPDVEKATALPEVLKKQAEVTKPIEDIGIVVLDNKQ ITITSGVLEALLENTCSVITCDSKSMPVGLMLPLYGNTTQNERFREQLDASLPLKKQLWQ QTIQAKINNQASVLKDCMNEEVKCMRVWAADVRSGDPDNLEARAAAYYWKSLFADVDGFT RDREGIPPNNLLNYGYAILRAVVARGLVISGLLPTLGIHHHNRYNAYCLADDIMEPYRPY VDELVFSLIQEYGMGVELTKEVKTRLLIIPTLEVIIGGKRSPLMIAVGQTTASLYKCFNG ELRRVAYPER >gi|225935350|gb|ACGA01000042.1| GENE 143 238595 - 238900 156 101 aa, chain + ## HITS:1 COG:Cj1521c KEGG:ns NR:ns ## COG: Cj1521c COG3512 # Protein_GI_number: 15792834 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 1 85 7 91 143 78 47.0 4e-15 MWVLVLFDLPTETKKEKKAYADFRKNLQKDGFTMFQFSIYVRHCASSENAAVHIKRVKSF LPELGHVGVMCITDKQFGDIELFYGKKAQSVNTPGQQLELF Prediction of potential genes in microbial genomes Time: Fri May 13 09:29:17 2011 Seq name: gi|225935349|gb|ACGA01000043.1| Bacteroides sp. D2 cont1.43, whole genome shotgun sequence Length of sequence - 268963 bp Number of predicted genes - 166, with homology - 164 Number of transcription units - 65, operones - 39 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 103 - 162 5.2 1 1 Tu 1 . + CDS 375 - 596 178 ## BT_3064 hypothetical protein + Term 623 - 667 3.0 + Prom 647 - 706 3.2 2 2 Tu 1 . + CDS 808 - 1002 106 ## + Term 1008 - 1039 -1.0 3 3 Tu 1 . - CDS 1096 - 1320 218 ## gi|260172935|ref|ZP_05759347.1| Cro/CI family transcriptional regulator - Prom 1380 - 1439 6.2 + Prom 1325 - 1384 5.2 4 4 Tu 1 . + CDS 1444 - 1836 206 ## gi|260172936|ref|ZP_05759348.1| hypothetical protein BacD2_13788 + Term 1978 - 2030 5.2 + Prom 1867 - 1926 5.3 5 5 Tu 1 . + CDS 2066 - 3844 1654 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase + Term 3891 - 3932 8.0 + Prom 4157 - 4216 6.4 6 6 Tu 1 . 
+ CDS 4242 - 4730 533 ## BT_3062 hypothetical protein
+ Term 4770 - 4824 10.3
+ Prom 4788 - 4847 6.3
7 7 Op 1 . + CDS 4962 - 6239 580 ## BT_3061 hypothetical protein
8 7 Op 2 . + CDS 6249 - 6986 359 ## BT_3060 hypothetical protein
9 7 Op 3 . + CDS 7022 - 7249 175 ## BVU_1244 hypothetical protein
10 7 Op 4 . + CDS 7304 - 8296 728 ## BT_3059 hypothetical protein
+ Term 8355 - 8400 8.3
+ Prom 8382 - 8441 8.2
11 8 Tu 1 . + CDS 8642 - 9481 491 ## BT_3058 transcriptional regulator
+ Term 9528 - 9566 9.2
- Term 9513 - 9557 7.5
12 9 Op 1 36/0.000 - CDS 9603 - 10358 857 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit
13 9 Op 2 . - CDS 10395 - 12374 2225 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
14 9 Op 3 . - CDS 12412 - 13116 719 ## BT_3053 putative cytochrome b subunit
- Prom 13149 - 13208 3.2
15 10 Tu 1 . - CDS 13237 - 14109 625 ## BT_3052 transcriptional regulator
- Prom 14215 - 14274 4.6
+ Prom 14101 - 14160 5.6
16 11 Op 1 . + CDS 14265 - 15707 1069 ## COG3119 Arylsulfatase A and related enzymes
+ Prom 15729 - 15788 3.7
17 11 Op 2 . + CDS 15969 - 18083 1372 ## COG3525 N-acetyl-beta-hexosaminidase
+ Term 18118 - 18162 8.4
18 12 Op 1 . - CDS 18230 - 19810 1177 ## COG3119 Arylsulfatase A and related enzymes
19 12 Op 2 . - CDS 19803 - 21296 849 ## COG3119 Arylsulfatase A and related enzymes
- Prom 21422 - 21481 7.3
- Term 21444 - 21487 -0.3
20 13 Op 1 . - CDS 21492 - 22622 783 ## BVU_0616 hypothetical protein
21 13 Op 2 . - CDS 22629 - 23561 730 ## BT_1048 putative secreted endoglycosidase
22 13 Op 3 . - CDS 23591 - 25147 1254 ## BF1327 hypothetical protein
23 13 Op 4 . - CDS 25161 - 28493 2449 ## BF1326 hypothetical protein
- Prom 28576 - 28635 2.5
24 14 Op 1 6/0.000 - CDS 28656 - 29558 645 ## COG3712 Fe2+-dicitrate sensor, membrane component
25 14 Op 2 . - CDS 29637 - 30206 457 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 30409 - 30468 8.7
+ Prom 30221 - 30280 4.6
26 15 Tu 1 . + CDS 30429 - 31391 868 ## BT_3050 chitinase precursor
27 16 Tu 1 . - CDS 31509 - 35627 2654 ## COG5002 Signal transduction histidine kinase
- Prom 35660 - 35719 9.2
+ Prom 35717 - 35776 4.9
28 17 Op 1 . + CDS 35818 - 38949 3508 ## ZPR_4652 TonB-dependent receptor Plug domain protein
29 17 Op 2 . + CDS 38953 - 40815 1737 ## ZPR_4651 RagB/SusD family protein
30 17 Op 3 . + CDS 40828 - 41502 505 ## ZPR_4650 hypothetical protein
+ Term 41527 - 41571 8.3
+ Prom 41537 - 41596 3.6
31 18 Op 1 . + CDS 41622 - 42578 661 ## COG3507 Beta-xylosidase
+ Term 42589 - 42623 -0.6
+ Prom 42584 - 42643 1.6
32 18 Op 2 . + CDS 42663 - 44255 1164 ## BT_3043 putative xylanase
+ Prom 44292 - 44351 5.4
33 19 Tu 1 . + CDS 44478 - 48545 2754 ## COG0642 Signal transduction histidine kinase
+ Term 48553 - 48593 0.0
+ Prom 48617 - 48676 2.2
34 20 Op 1 . + CDS 48702 - 51902 2711 ## Slin_6567 TonB-dependent receptor plug
35 20 Op 2 . + CDS 51915 - 53444 1464 ## Dfer_0773 RagB/SusD domain protein
36 20 Op 3 . + CDS 53473 - 54522 821 ## COG1262 Uncharacterized conserved protein
37 20 Op 4 . + CDS 54562 - 55734 1142 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
38 20 Op 5 . + CDS 55747 - 59856 3103 ## gi|260172971|ref|ZP_05759383.1| hypothetical protein BacD2_13963
39 20 Op 6 . + CDS 59917 - 61185 1011 ## Fjoh_4229 hypothetical protein
+ Prom 61224 - 61283 2.4
40 21 Op 1 . + CDS 61402 - 63171 1138 ## gi|260172973|ref|ZP_05759385.1| hypothetical protein BacD2_13973
+ Prom 63186 - 63245 1.7
41 21 Op 2 . + CDS 63265 - 65607 1373 ## gi|260172974|ref|ZP_05759386.1| hypothetical protein BacD2_13978
+ Term 65660 - 65719 18.3
- Term 65652 - 65704 18.3
42 22 Tu 1 . - CDS 65712 - 66374 491 ## BT_3041 hypothetical protein
- Prom 66402 - 66461 5.9
+ Prom 66868 - 66927 6.3
43 23 Tu 1 . + CDS 67139 - 67483 332 ## BT_3039 hypothetical protein
44 24 Op 1 . - CDS 67668 - 70025 2179 ## COG1472 Beta-glucosidase-related glycosidases
45 24 Op 2 . - CDS 70022 - 70201 77 ##
- Prom 70360 - 70419 3.6
+ Prom 70004 - 70063 5.5
46 25 Tu 1 . + CDS 70228 - 71817 1339 ## COG3507 Beta-xylosidase
+ Term 71936 - 71972 -0.6
+ Prom 71896 - 71955 3.3
47 26 Tu 1 . + CDS 71986 - 73551 1148 ## COG3507 Beta-xylosidase
+ Prom 73575 - 73634 3.6
48 27 Op 1 . + CDS 73739 - 75247 1360 ## COG2730 Endoglucanase
49 27 Op 2 . + CDS 75273 - 78446 2858 ## Dfer_1573 TonB-dependent receptor
50 27 Op 3 . + CDS 78470 - 80098 1518 ## Cpin_6367 RagB/SusD domain protein
51 27 Op 4 . + CDS 80112 - 81578 980 ## Dfer_1571 hypothetical protein
+ Term 81592 - 81641 2.1
52 28 Op 1 . + CDS 81644 - 83137 918 ## DICTH_1791 endo-1,4-beta-xylanase D (EC:3.2.1.8)
53 28 Op 2 . + CDS 83191 - 84954 1198 ## Dfer_5683 cellulose 1,4-beta-cellobiosidase (EC:3.2.1.91)
54 28 Op 3 . + CDS 85019 - 89047 2990 ## COG0642 Signal transduction histidine kinase
+ Term 89090 - 89144 12.2
+ Prom 89118 - 89177 2.4
55 29 Op 1 . + CDS 89197 - 92055 2710 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases
56 29 Op 2 1/0.000 + CDS 92074 - 94599 1726 ## COG3250 Beta-galactosidase/beta-glucuronidase
57 29 Op 3 . + CDS 94615 - 96843 2015 ## COG1472 Beta-glucosidase-related glycosidases
58 30 Op 1 . - CDS 96997 - 98577 1601 ## COG0793 Periplasmic protease
59 30 Op 2 . - CDS 98612 - 99070 263 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19
60 30 Op 3 . - CDS 99067 - 100944 1867 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
61 30 Op 4 . - CDS 100954 - 102090 992 ## BT_3032 hypothetical protein
- Prom 102132 - 102191 3.8
- Term 102190 - 102246 7.8
62 31 Op 1 . - CDS 102269 - 104188 1052 ## Slin_1080 protein of unknown function DUF303 acetylesterase putative
63 31 Op 2 . - CDS 104207 - 104986 500 ## gi|260172996|ref|ZP_05759408.1| hypothetical protein BacD2_14088
64 31 Op 3 . - CDS 104989 - 107655 1275 ## BT_2524 alpha-rhamnosidase
65 31 Op 4 . - CDS 107681 - 109078 541 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9
66 31 Op 5 . - CDS 109086 - 110255 714 ## COG2152 Predicted glycosylase
67 31 Op 6 . - CDS 110283 - 111644 789 ## Dtur_0402 hypothetical protein
68 31 Op 7 . - CDS 111649 - 112689 858 ## COG2730 Endoglucanase
69 31 Op 8 . - CDS 112709 - 115270 1301 ## COG3250 Beta-galactosidase/beta-glucuronidase
70 31 Op 9 . - CDS 115290 - 116375 736 ## COG2730 Endoglucanase
71 31 Op 10 . - CDS 116435 - 117607 970 ## BF0761 putative lipoprotein
72 31 Op 11 . - CDS 117637 - 119400 1248 ## BF0834 hypothetical protein
73 31 Op 12 . - CDS 119413 - 122664 2630 ## BF0833 hypothetical protein
- Prom 122684 - 122743 6.2
74 32 Tu 1 . - CDS 122792 - 123763 443 ## COG2730 Endoglucanase
- Prom 123879 - 123938 10.4
+ Prom 123841 - 123900 9.2
75 33 Tu 1 . + CDS 124074 - 126071 989 ## MAP0339c hypothetical protein
- Term 125912 - 125956 1.4
76 34 Tu 1 . - CDS 126068 - 127051 924 ## COG0530 Ca2+/Na+ antiporter
- Prom 127148 - 127207 9.1
+ Prom 127095 - 127154 5.3
77 35 Tu 1 . + CDS 127178 - 129028 1714 ## COG0668 Small-conductance mechanosensitive channel
- Term 129018 - 129064 11.1
78 36 Op 1 . - CDS 129113 - 130051 983 ## BT_3017 acid phosphatase
- Prom 130152 - 130211 2.0
79 36 Op 2 . - CDS 130216 - 133026 2673 ## COG1629 Outer membrane receptor proteins, mostly Fe transport
- Prom 133051 - 133110 2.9
80 37 Op 1 . - CDS 133195 - 133656 324 ## BVU_1305 hypothetical protein
81 37 Op 2 . - CDS 133676 - 134329 261 ## gi|237721873|ref|ZP_04552354.1| conserved hypothetical protein
82 37 Op 3 . - CDS 134378 - 135568 863 ## BT_3008 hypothetical protein
83 37 Op 4 . - CDS 135637 - 136410 433 ## COG2173 D-alanyl-D-alanine dipeptidase
84 37 Op 5 . - CDS 136478 - 139327 2011 ## BT_3006 hypothetical protein
85 37 Op 6 . - CDS 139341 - 142205 1722 ## BT_3005 hypothetical protein
86 37 Op 7 . - CDS 142193 - 143548 943 ## BT_3004 hypothetical protein
- Prom 143617 - 143676 5.9
+ Prom 144056 - 144115 3.3
87 38 Tu 1 . + CDS 144143 - 144646 458 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Term 144678 - 144732 6.2
+ Prom 144699 - 144758 5.4
88 39 Tu 1 . + CDS 144864 - 145640 648 ## COG0561 Predicted hydrolases of the HAD superfamily
+ Term 145719 - 145761 -0.3
- TRNA 145828 - 145901 73.9 # Met CAT 0 0
- Term 145774 - 145825 14.1
89 40 Op 1 . - CDS 145992 - 146303 192 ## BT_2979 hypothetical protein
90 40 Op 2 . - CDS 146309 - 146773 392 ## COG1522 Transcriptional regulators
- Prom 146822 - 146881 8.3
- Term 146865 - 146907 11.2
91 41 Op 1 . - CDS 146933 - 147802 1020 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1
92 41 Op 2 . - CDS 147823 - 148407 687 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1
- Prom 148427 - 148486 2.2
+ Prom 148729 - 148788 4.7
93 42 Tu 1 . + CDS 148881 - 149579 686 ## COG0846 NAD-dependent protein deacetylases, SIR2 family
+ Term 149652 - 149692 6.1
94 43 Op 1 . - CDS 149600 - 149824 267 ## BT_2974 hypothetical protein
95 43 Op 2 1/0.000 - CDS 149838 - 150188 175 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Term 150204 - 150253 5.2
96 43 Op 3 . - CDS 150259 - 151035 752 ## COG0500 SAM-dependent methyltransferases
+ Prom 151380 - 151439 5.2
97 44 Op 1 . + CDS 151477 - 152334 817 ## BT_2961 hypothetical protein
+ Prom 152347 - 152406 5.1
98 44 Op 2 . + CDS 152427 - 154820 1885 ## BT_2958 hypothetical protein
- Term 154662 - 154701 -1.0
99 45 Op 1 26/0.000 - CDS 154807 - 155556 504 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
100 45 Op 2 . - CDS 155549 - 156814 758 ## COG0438 Glycosyltransferase
101 45 Op 3 . - CDS 156825 - 157943 835 ## BVU_0890 putative glycosyltransferase
102 45 Op 4 . - CDS 157916 - 158932 753 ## BVU_0889 hemolysin hemolytic protein
103 45 Op 5 . - CDS 158935 - 159798 587 ## COG3475 LPS biosynthesis protein
104 45 Op 6 . - CDS 159878 - 160939 584 ## BDI_2786 hypothetical protein
105 45 Op 7 . - CDS 160926 - 162383 962 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
- Term 162384 - 162440 12.6
106 46 Tu 1 . - CDS 162456 - 164495 1949 ## COG0143 Methionyl-tRNA synthetase
- Prom 164629 - 164688 5.7
+ Prom 165307 - 165366 5.9
107 47 Op 1 . + CDS 165393 - 167453 1810 ## COG1042 Acyl-CoA synthetase (NDP forming)
+ Prom 167462 - 167521 1.8
108 47 Op 2 . + CDS 167547 - 171569 2889 ## COG3292 Predicted periplasmic ligand-binding sensor domain
109 47 Op 3 . + CDS 171626 - 172633 616 ## COG3507 Beta-xylosidase
110 47 Op 4 . + CDS 172673 - 173968 996 ## COG0673 Predicted dehydrogenases and related proteins
111 47 Op 5 . + CDS 173965 - 174207 94 ## BT_3471 hypothetical protein
112 47 Op 6 . + CDS 174234 - 175652 1351 ## COG0673 Predicted dehydrogenases and related proteins
113 47 Op 7 . + CDS 175718 - 177064 1450 ## BT_3469 hypothetical protein
+ Prom 177100 - 177159 3.8
114 48 Tu 1 . + CDS 177208 - 178023 456 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis
+ Prom 178025 - 178084 5.3
115 49 Op 1 . + CDS 178209 - 181163 2090 ## BT_3514 hypothetical protein
116 49 Op 2 . + CDS 181190 - 182986 882 ## BT_2892 hypothetical protein
117 49 Op 3 . + CDS 183035 - 184147 828 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
118 49 Op 4 . + CDS 184214 - 185284 590 ## BT_3593 hypothetical protein
+ Prom 185567 - 185626 5.3
119 50 Op 1 . + CDS 185650 - 187551 1239 ## COG3507 Beta-xylosidase
120 50 Op 2 . + CDS 187608 - 189041 1006 ## BT_3476 hypothetical protein
121 50 Op 3 . + CDS 189065 - 192145 2743 ## BT_3475 hypothetical protein
122 50 Op 4 . + CDS 192159 - 194237 1703 ## BT_3474 hypothetical protein
123 50 Op 5 . + CDS 194262 - 195329 1099 ## BT_3473 hypothetical protein
+ Term 195356 - 195411 9.6
+ Prom 195378 - 195437 2.1
124 51 Op 1 . + CDS 195481 - 197358 1219 ## gi|260173057|ref|ZP_05759469.1| hypothetical protein BacD2_14393
125 51 Op 2 . + CDS 197410 - 199137 1065 ## BDI_1871 hypothetical protein
+ Term 199184 - 199249 9.0
+ Prom 199385 - 199444 7.8
126 52 Tu 1 . + CDS 199469 - 202180 2067 ## COG3250 Beta-galactosidase/beta-glucuronidase
+ Term 202219 - 202255 -0.8
+ Prom 202182 - 202241 4.1
127 53 Op 1 . + CDS 202284 - 203429 764 ## COG2152 Predicted glycosylase
128 53 Op 2 . + CDS 203444 - 206374 2103 ## COG3250 Beta-galactosidase/beta-glucuronidase
129 53 Op 3 . + CDS 206388 - 207491 1014 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
+ Prom 207530 - 207589 9.2
130 54 Op 1 . + CDS 207667 - 209700 1347 ## BT_2899 hypothetical protein
131 54 Op 2 . + CDS 209720 - 211111 1012 ## GYMC10_3408 fumarate reductase/succinate dehydrogenase flavoprotein domain protein
+ Term 211317 - 211358 8.1
- Term 211394 - 211434 1.5
132 55 Op 1 . - CDS 211589 - 213142 1294 ## COG0642 Signal transduction histidine kinase
- Prom 213162 - 213221 9.7
133 55 Op 2 . - CDS 213240 - 214778 1015 ## COG0606 Predicted ATPase with chaperone activity
134 55 Op 3 . - CDS 214792 - 215862 776 ## BT_2845 hypothetical protein
- Prom 215907 - 215966 8.6
- Term 215902 - 215946 10.9
135 56 Op 1 . - CDS 215976 - 217670 1723 ## BT_2844 hypothetical protein
- Prom 217734 - 217793 4.8
136 56 Op 2 . - CDS 217806 - 218765 601 ## COG4974 Site-specific recombinase XerD
- Prom 218797 - 218856 4.9
+ Prom 218729 - 218788 3.3
137 57 Op 1 . + CDS 218828 - 219247 255 ## COG0757 3-dehydroquinate dehydratase II
138 57 Op 2 . + CDS 219284 - 220741 1690 ## COG0469 Pyruvate kinase
139 57 Op 3 . + CDS 220762 - 221418 587 ## COG4122 Predicted O-methyltransferase
140 57 Op 4 . + CDS 221434 - 221766 306 ## COG0858 Ribosome-binding factor A
+ Term 221810 - 221854 4.3
+ Prom 221768 - 221827 6.3
141 58 Tu 1 . + CDS 221937 - 223010 835 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component
+ Prom 223061 - 223120 7.5
142 59 Op 1 6/0.000 + CDS 223156 - 223707 364 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Prom 223719 - 223778 5.1
143 59 Op 2 . + CDS 223859 - 224974 713 ## COG3712 Fe2+-dicitrate sensor, membrane component
+ Prom 224976 - 225035 7.8
144 60 Op 1 . + CDS 225128 - 228517 2512 ## ZPR_4655 TonB-dependent receptor Plug domain protein
145 60 Op 2 . + CDS 228529 - 230475 1469 ## ZPR_4656 RagB/SusD family protein
146 60 Op 3 . + CDS 230508 - 232148 1173 ## gi|260173079|ref|ZP_05759491.1| hypothetical protein BacD2_14503
+ Term 232168 - 232238 5.3
147 61 Op 1 . + CDS 232247 - 233809 1070 ## BVU_3139 hypothetical protein
148 61 Op 2 . + CDS 233821 - 235470 1132 ## BVU_3139 hypothetical protein
+ Term 235513 - 235547 -0.4
+ Prom 235734 - 235793 5.8
149 62 Op 1 1/0.000 + CDS 235813 - 238020 1672 ## COG1472 Beta-glucosidase-related glycosidases
150 62 Op 2 . + CDS 238032 - 240485 1966 ## COG1472 Beta-glucosidase-related glycosidases
151 62 Op 3 . + CDS 240515 - 241627 1061 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
+ Term 241707 - 241767 3.9
+ Prom 241745 - 241804 3.6
152 63 Op 1 6/0.000 + CDS 241868 - 242410 379 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Prom 242434 - 242493 7.5
153 63 Op 2 . + CDS 242556 - 243482 732 ## COG3712 Fe2+-dicitrate sensor, membrane component
+ Prom 243499 - 243558 5.6
154 64 Op 1 . + CDS 243629 - 245185 1008 ## BVU_3139 hypothetical protein
155 64 Op 2 . + CDS 245242 - 248631 3014 ## BT_2461 hypothetical protein
156 64 Op 3 .
+ CDS 248649 - 250370 1663 ## BT_2460 hypothetical protein 157 64 Op 4 . + CDS 250391 - 252034 1340 ## Coch_0930 hypothetical protein + Term 252048 - 252100 1.6 + Prom 252036 - 252095 2.4 158 65 Op 1 . + CDS 252123 - 253724 1325 ## BVU_3139 hypothetical protein 159 65 Op 2 . + CDS 253742 - 255625 1829 ## BT_2458 putative pyridine nucleotide-disulphide oxidoreductase 160 65 Op 3 . + CDS 255618 - 257684 1651 ## Phep_0375 hypothetical protein 161 65 Op 4 . + CDS 257710 - 259011 1013 ## COG3458 Acetyl esterase (deacetylase) 162 65 Op 5 . + CDS 259040 - 260833 1874 ## COG3250 Beta-galactosidase/beta-glucuronidase 163 65 Op 6 . + CDS 260842 - 262662 1447 ## Cphy_0623 hypothetical protein 164 65 Op 7 . + CDS 262720 - 263544 701 ## Cphy_0623 hypothetical protein 165 65 Op 8 1/0.000 + CDS 263612 - 265819 2312 ## COG1472 Beta-glucosidase-related glycosidases 166 65 Op 9 . + CDS 265857 - 268310 1985 ## COG1472 Beta-glucosidase-related glycosidases + Term 268428 - 268483 0.2 Predicted protein(s) >gi|225935349|gb|ACGA01000043.1| GENE 1 375 - 596 178 73 aa, chain + ## HITS:1 COG:no KEGG:BT_3064 NR:ns ## KEGG: BT_3064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 73 136 86.0 3e-31 MRRILSFAVLAALYFPGCTTSKNAVRCLSRWIKDCNPLTKELVPTGYMPYNHRYLTMKQY KIITKHLGDPYED >gi|225935349|gb|ACGA01000043.1| GENE 2 808 - 1002 106 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSMRSFFIKFFVWLIIVTFEGVLSNKYGTCDNMFLKLVMLVVVFELYFEITSLLRNCGA LTKT >gi|225935349|gb|ACGA01000043.1| GENE 3 1096 - 1320 218 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172935|ref|ZP_05759347.1| ## NR: gi|260172935|ref|ZP_05759347.1| Cro/CI family transcriptional regulator [Bacteroides sp. 
D2] # 1 74 1 74 74 107 100.0 2e-22 MTEIMLDKQKTLNEIGTRIRRLRESKNLSIQEFADKLEIEYNNVIRIEKGRTNFTIGTLV KIANTLEVNLKDIV >gi|225935349|gb|ACGA01000043.1| GENE 4 1444 - 1836 206 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172936|ref|ZP_05759348.1| ## NR: gi|260172936|ref|ZP_05759348.1| hypothetical protein BacD2_13788 [Bacteroides sp. D2] # 1 130 1 130 130 248 100.0 1e-64 MDTLKLHSHFENLLYVGRSVLTNTSSRIQRLFFKKEMCIYEYLFREEASKGIEIVVDNAV LVCVLENDICNKSILYLNDSTNITSYINYCNSNFEYDKLHDRWIMPDGYLTLFIPNDDFE KRFAFVQTLV >gi|225935349|gb|ACGA01000043.1| GENE 5 2066 - 3844 1654 592 aa, chain + ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 2 457 1 431 480 186 31.0 1e-46 MMKKEIKFSLVYRDMWQSSGKYQPRVDQLVRIAPLIVEMGCFTRVETNGGAFEQVNLLYG ENPNKAVRAFTAPFKEAGIQTHMLDRGLNALRMYPVPADVRKLMYKVKHAQGVDITRIFC GLNETRNIIPSIKYALEAGMIPQATLCITYSPVHTVEYYARIADQLIEAGAPEICLKDMA GIGRPGMLGQLVRTIKEKHPDILIQYHGHSGPGLSMASILEVCENGADIIDVAMEPMSWG KVHPDVISVQAMLKDLGFQVPDINMKAYMKARAMTQEFIDDFLGYFMDPTNKYMSSLLLK CGLPGGMMGSMMADLKGVHSGINMILKSKNEPELSLDDLLVMLFDEVEYVWPKLGYPPLV TPFSQYVKNVALMNLMQLVKGEERWTMIDNHTWDMILGKSGRLPGTLAPEIIELAKSKGY EFVDTDPQLNYPDALDDYRKEMDENGWEYGEDDEELFELAMHDRQYRDYKSGVAKKRFED ELQHAKDASMAKSGYSEEDIKKLKRAKADPVIAPDNGQVLWEVSVEGPSIAPFIGRKYQH DEVFCYLSTPWGEYEKILTGFTGRVVEICAQQGATVRKGDVIGYILRSDIFA >gi|225935349|gb|ACGA01000043.1| GENE 6 4242 - 4730 533 162 aa, chain + ## HITS:1 COG:no KEGG:BT_3062 NR:ns ## KEGG: BT_3062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 163 165 236 82.0 2e-61 MGLGYVVTKRVFGFDKDKNEKYVAKSVRSGKVSFAKMCRKVSELCGVHRKVVDLVVSGLV DKMAEDIDDGKSVQIGEFGIFTPTIRAKSTDDEKEVLSKSIVQRKILFYPGKIFKNTLED MSITRRAEIETDYTDGSSNGGNNGGNPDSGDDGKGEAPDPAA >gi|225935349|gb|ACGA01000043.1| GENE 7 4962 - 6239 580 425 aa, chain + ## HITS:1 COG:no KEGG:BT_3061 NR:ns ## KEGG: BT_3061 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 652 73.0 0 MKHGIFILAILMALLSLPESGYASVPFFTVVNDTIRRVVIYFDANETNVNPYYRDNNQVI IALDSLLSRNLDTKYITALNVVSFVSPDGDESYNKSLVTKRNNSIKEFLRRYELVNNVDK IHFCSKGEDWLEFRKLVASDTNLPDREEVLMLIDYHKDDIAKRKRLLRKLNRGAAYRYMV RNVFPELRRSVITIVGETAIIDREVFEPVSSVSGLFVSNQEEALPKNQPDKPVGESDDKQ TCEVDMSEAEEPVKSQTVLAVKNNLLYDLALAPNIEIEIPIGKRWSLNTEYKCPWWLNSK HDFCYQLLSGGMEGRYWLGNRQKRNRLTGHFIGLYAEGGIYDFQLRGDGYQGKYYGAAGV TYGYAKQLARHFSLEFSLGIGYLTTEYKKYTPYEGDIVWTNSGRYNFIGPTKAKVSLVWL ITTKR >gi|225935349|gb|ACGA01000043.1| GENE 8 6249 - 6986 359 245 aa, chain + ## HITS:1 COG:no KEGG:BT_3060 NR:ns ## KEGG: BT_3060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 220 1 204 304 289 66.0 7e-77 MRNARFMLLVLLSSLLMLNGCSRRDILDDYPVSGVEISLNWDGVTDKLPEGIRVIFYPKD GEGKKVDRYLSVRGGEMKVPPGRYSVVAYNYNTESIRIRGEESYETIEAYTGNCNGLGIA GTEKMVWSPDSLYVLNIDELKIDKSEEVLSLDWKLESVVKKYSFAVEVKGLEYVTTIVGC INGLSDCYHIGKGHGASSWQPIYFEVKKDGNKVVACFTAFKQTKEMSVPTRISESGSATS RGVVI >gi|225935349|gb|ACGA01000043.1| GENE 9 7022 - 7249 175 75 aa, chain + ## HITS:1 COG:no KEGG:BVU_1244 NR:ns ## KEGG: BVU_1244 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 74 248 320 321 62 52.0 5e-09 MQEAAIDVTEIIETLEDTGTGDDGKQDPPPEIELPPDDKIEVDKPELPPNPDGGGGMDGN VDGWGPEDNVELPVI >gi|225935349|gb|ACGA01000043.1| GENE 10 7304 - 8296 728 330 aa, chain + ## HITS:1 COG:no KEGG:BT_3059 NR:ns ## KEGG: BT_3059 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 325 1 322 324 141 36.0 3e-32 MKKILLAVTAALAITSCSQNEEFENAGQKAVINFESIVSNATRATEMKLQGLKDQGFHVY AYNTGDAVVGTGTLDKSIIENALVSWDSGTSKWTSATYYWPSKGNIQFFAYSSSKSLTLT ATDTDKYPTLVDYQIADAASGQEDLLTAKVTNKTKADLNVSFTFSHVLTQIQFAIKSKLA DNLTYTVSKIEISGANNKATYKYADNSWISLAGSAKYTYPLDDVTANNAVQGTTTGKNIG TESLMLLPQTLTAGKILVSYSVTDKNGDEVYATGATPKEVDLKDAVWGVGKSIRYTLSLT NDAATIGWDVTDVDTWTDEEQQKEPEAPTV 
>gi|225935349|gb|ACGA01000043.1| GENE 11 8642 - 9481 491 279 aa, chain + ## HITS:1 COG:no KEGG:BT_3058 NR:ns ## KEGG: BT_3058 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 279 280 464 79.0 1e-129 MPSICQGTWDCQICPKAVSNAITHVIYQRGFHKPAQRCEENFILFLIKGEMLVNSKEYAG TMLKEGEFILQAIDSTFEMLAMLECECIYYRFIQPELFCDSRFNHIMKDVSPPLICSPIK IVPELQYFLKGSITYLKGNKVCRDLLSLKRKELAFVLGYYYSDYDLSSLVYPLSKYINSF HNFVIQNYKKVKTVEELAQLGGYTLSTFRRIFNNVFHEPVYEWMLARRKEGILDDLNNSA YSISEICYKYGFESLPHFSNFCKKSFGASPRNLRGQNRS >gi|225935349|gb|ACGA01000043.1| GENE 12 9603 - 10358 857 251 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 5 243 1 238 249 235 46.0 7e-62 MDKNISFTLKVWRQAGPKAKGAFETYQMKDIPGDTSFLEMLDILNEQLISERKEPVVFDH DCREGICGMCSLYINGHPHGPATGATTCQIYMRRFQDGDTITVEPWRSAGFPVIKDLMVD RTAYDKIMQAGGYVSVRTGAPQDANAILIAKPIADEAMDAASCIGCGACVAACKNGSAML FVSAKVSQLNLLPQGKPEALRRAKAMLSKMDELGFGNCTNTRACEAECPKNISISNIARL NRDFIIAKLKD >gi|225935349|gb|ACGA01000043.1| GENE 13 10395 - 12374 2225 659 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 4 658 28 673 673 604 47.0 1e-172 MIKIDSKIPEGPVAEKWTNYKAHQKLVNPANKRRLDIIVVGTGLAGASAAASLGEMGFRV FNFCIQDSPRRAHSIAAQGGINAAKNYQNDGDSVYRLFYDTVKGGDYRAREANVYRLAEV SNAIIDQCVAQGVPFAREYGGTLDNRSFGGAQVSRTFYAKGQTGQQLLLGAYSALSRQVN VGTVKLYTRYEMQDVVIVDGRARGIIAKNLVTGELERFAAHAVVIATGGYGNAYFLSTNA MGCNCTAAISCYRKGAVFANPAYVQIHPTCIPVHGDKQSKLTLMSESLRNDGRIWVPKKK EDAVKLQKGEIKGSDIPEEDRDYYLERRYPAFGNLVPRDVASRAAKERCDAGFGVNNTGL AVFLDFSEAINRLGIDVVLQRYGNLFDMYEEITDVNPGELAKEINGVKYYNPMMIYPAIH 
YTMGGIWVDYELQTTIKGLFAIGECNFSDHGANRLGASALMQGLADGYFVLPYTIQNYLA DQITVPRFSTDLPEFAEAEKAVQAKIDKFMSIQGKESVDSIHKKLGHVMWEYVGMGRTAE GLKKGIAELKEIRKEFETNLFIPGSKEGMNVELDKAIRLYDFITMGELVAYDALNRNESC GGHFREEYQTEEGEAKRDDENFFYVACWEYQGDDEKAPVLYKEPLVYEAIKVQTRNYKS >gi|225935349|gb|ACGA01000043.1| GENE 14 12412 - 13116 719 234 aa, chain - ## HITS:1 COG:no KEGG:BT_3053 NR:ns ## KEGG: BT_3053 # Name: not_defined # Def: putative cytochrome b subunit # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Oxidative phosphorylation [PATH:bth00190]; Benzoate degradation via CoA ligation [PATH:bth00632]; Butanoate metabolism [PATH:bth00650]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 234 1 234 234 376 95.0 1e-103 MWLSNSSVGRKVVMSVTGIALVLFLTFHMAMNLVAIISADGYNMICEFLGANWYALVATA GLAALFVIHIIYAFWLTMQNRKARGSERYAVTDKPKTVEWASQNMLVLGLIVIVGLGLHL FNFWAKMQLPELMHNMGMHADTLTLAYAANGAYHIQQTFSCPVYVVLYLVWLFALWFHLT HGFWSSMQSLGWNNKVWIDRWKCISNIYSTIVVLGFALVVVVFFVKTLICGGAC >gi|225935349|gb|ACGA01000043.1| GENE 15 13237 - 14109 625 290 aa, chain - ## HITS:1 COG:no KEGG:BT_3052 NR:ns ## KEGG: BT_3052 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 290 26 316 316 490 76.0 1e-137 MAGDSFCIQYTNCTGCPKATENILVYRNLPKGEHIPKDKCTQNCMLFMIKGELLINSEEY PGNTLREKQFILQAIGSKVEILALTDVEYIVYWFNELPLLCEDRYREVIEHTETPLTYTP LIMSERLYHLLTSMPEFLNEESPCSKYIDLKCKELVFLITNFYPQPQLSSFFYPISTYTE SFHYFVMQNYGTVKNVEEFAHLGGYTTTTFRRLFKNLYGVPVYEWILEKKREGILEDLQH TNMRITEICNRYGFDSLSHFAHFCKDSFGDTPRALRKKAADGEKIGKIVD >gi|225935349|gb|ACGA01000043.1| GENE 16 14265 - 15707 1069 480 aa, chain + ## HITS:1 COG:MA2648 KEGG:ns NR:ns ## COG: MA2648 COG3119 # Protein_GI_number: 20091471 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 5 382 25 440 535 183 32.0 7e-46 MKNKKVLIGMAGMAVLSAEAIAQKPVNIVLINLDDAGNGDFSCRGAVGYQTPHIDWMAAN 
GMAMNNFYAVQPISGASRAGLMTGCYPNRIGFAYALNPGSPYGISAEEETVAELLRDKGY ATAIFGKWHLGDEKKFLPLQHGFDEYYGLPYSNDMWPRHPQYKFPDLPLIKGNDIIGYNT DQTRLTTDYTNYAVDFIHRNVDKPFFLYLAHSMPHVPLAVSDKFKGKSKQGLYGDVMMEI DWSVGEILRTIREEGIEDNTMVILTSDNGPWANYGNHAGSSGGLREAKANTFNGGTRVPC FFYWKGKIQPATVCNNLYSNIDILPTLIEIAGAAKPKRKIDGVSMLPTLTVDPTIAVRKY LCFYYHKNSLEAITDGSYKLVFPHKYVSYDGQIPGNDGLPGALGKGEVTECELYDLRRDM GERTNVVTLYPEVVKDLTREAEVWREELGDDLTGHPGKARREASKSDKVNKSAKPGKAAK >gi|225935349|gb|ACGA01000043.1| GENE 17 15969 - 18083 1372 704 aa, chain + ## HITS:1 COG:XF0847 KEGG:ns NR:ns ## COG: XF0847 COG3525 # Protein_GI_number: 15837449 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Xylella fastidiosa 9a5c # 26 532 161 662 841 340 38.0 6e-93 MAGRELSVGKMKKFVESNVIYFMLEPQVRKEGFRLKITEDKLYIEASDGAGAFYAVQLLR ELSLSASDQQGECNKEVNGKKWELCFPPVELEDYPSMSYRGAMLDVSRHFFSVDQVKRYI DLLAFHRLNHFHWHLTDDQGWRIEIKKYPNLTKVGAWRGTDNYGGYYTQEEIKEVVTYAS ERYITIIPEIDMPGHTQAALAAYPELGCRGTSYEVATEVGGVHKDVMCMGSDFTFPFVKD VLKEVAELFPGPYIHIGGDEVPKDRWKECNACQKAIREHGLKNTKLHTAEERLQRTFNEE IAVYLHGLGKRMIGWDEVLADDLNREVIVMSWRGLGRATAAIRKGHDVIVSADSHLYLNH YQTINSEQEPRATGGLVEMKKVFETPFFSPQLTETERTQVLGAEACLWSSFVDDDSILDY MLLPRLAAFADAVWCEGRRGTYNHFLQRLPSLLRCYEQLGYGYARHFFTISATYTSVPDD KCLVVSLESLPDTKTYYTLDGSQPTKGAFLYENPLRIDKSCILKAVSYLPSGLASDELSK EVVVNKATFKPIHLQNAPSERYQGEDGKVLVDGVRSINFHNTGLWVGYHAADMIATIDLE YPQEINAVEVSALTDLSAWIMGPQSISVYISSNGKRYKMVAHKTYQALTDAMGEKSSDLH RLSFKKSASARFVKVIAVPFKGLPKGHSGEGEPPFLFVDEIRVH >gi|225935349|gb|ACGA01000043.1| GENE 18 18230 - 19810 1177 526 aa, chain - ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 36 488 5 426 503 130 27.0 5e-30 MNKLLLTGILTAGTATQILNAQTAGQENRQKDEQRPNILFILSDDHTSQAWGIYGGVLAD YAYNNNIRRLANEGVVLDNCFCTNSISAPSRASILTGLYSHQNGLYTLADSLDTSLPTLA TVLQANGYHTGLVGKWHIKSQPQGFDYYSIFHDQGEYRDPTFIESTDPWPGNRNFGQRVL 
GFSTDLVTQKAIRWMKQQDGKQPFLMCCHFKATHEPYDYPIRMEHLYDGVTFPEPENLLD WGPETNGRSFIGQKLEELVRRWRTASKDPDKWWCRYPGLPFSTEGMQRTTARRAAYQKLI RDYMRCGATVDDNIGKLLKALDEMGIADNTIVVYVSDQGYFLGEHGFFDKRMFYEESARM PFVIRYPKKIPAGQRLKDLVLNVDFAPTLAEFSGVKMDNVQGRSFVDNLEGKTPSDWRKE IYYRYWTNHAIRPAHFAIRSDRYKLIFYYAKNLGMTDTENFEFTPAWDFYDLQNDPHENH NLYNDPKYASIIKQMKKDLLRLRKETGDTDDKYPEMQTVLEKFYYK >gi|225935349|gb|ACGA01000043.1| GENE 19 19803 - 21296 849 497 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 53 478 4 459 467 134 25.0 6e-31 MKPAIQYSFISCCLSLAFSPLTAECQSTAKEIPQQPNVLFILTDDLQASSIHALGNEDVY TPAIDSLIAEGVTFTNTYTNGALCGALSMPSRAMLMTGRGLYNIQSDGMKIPKAHTTFPQ QFRRHGYRTFATGKWHSDKAAFNRSFQEGDNIYFGGMHPYEQNGHCSPHLNHYDSTGVYG PKTKFTGEEFSSKMYADAAIRFLQKQKGDKQPFLAYVAFTSPHDPRNQLPNYGRKYSPDT LDVPRNFLPKHPFNNGEMRVRDELLLPAPRTEQQVQKELSDYYGMISEVDVQIGRIMEVL RATGQAENTIVVFASDNGLAVGRHGLLGKQNLYDHSVKVPLTIIAPSYKNRKGEKNQSLC YLHDIAPTLCELANIPLPESMNAQSLYPVLEDSGTTHRKELFLAYSNIQRAFVNDSYKYI IYHVNGKITEQLFDLQKDPLEMHNLLTEKREEANKLKKQLAFRMEEEGDFCNLNDPNWWQ DGHKLTFKEMQQLYIYE >gi|225935349|gb|ACGA01000043.1| GENE 20 21492 - 22622 783 376 aa, chain - ## HITS:1 COG:no KEGG:BVU_0616 NR:ns ## KEGG: BVU_0616 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 7 372 6 372 373 194 34.0 5e-48 MRKNKIYTLFAGAFTLIALMGCKQEYEPIENRIYINEAAFKNVKSLPINVSEKTVTDFTV RIGDIMGKDVQATLAVDESLLQEYNKKMKLDYAVLPADKYSFEKEVIIESGKTMAKPTVV TISPYEAAEGVKYALPIRVISDGSVQEEIQGASYILLLDKPWTQFTPYLGRGGGFKSENV TTLSMDYFTIEFWIWMADFGYNNQCVIDCSAFYIRLGNANNQITKDQMQINIFGKSAENN KGFFTKFNFQKKTWTHVAMVYDQSKCHFYANGEKVQDVEATGVPADITSITFFNGNTSNE HMMGQVRLWNRVLTQAEIQSNMGGPVTVSPSLIGYWKMDEGQGNVVYDSSGNKNDATIAT GSITEWRADQCFTKKN >gi|225935349|gb|ACGA01000043.1| GENE 21 22629 - 23561 730 310 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative 
secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 310 19 370 375 199 39.0 1e-49 MKKIFIILFAAFLATNFFSCDDWTDMDAKNFEPEPLGEEYYAALREYKKSDHALSFGWFG SWSNQGAASLYYSLKSIPDSVDIVSIWGPYGSLTDYQKEDLKYVQEVLGTRVVFTVFSHN MANLPGKFENIAANIPAAAQAIADTIHKYGYDGIDFDHECSGSDLFYDKDNMTALLKETR QRIGTDKIIMVDGFVHLITEEGWKHANYAVAQAYATGSPTRLQERFDRVKQYIGSNRFIV TENFESYAANGGVNFKDPERGTIPSLLGMAYWNPDEGRKGGIGSFHMEYEYSRNYKYLRQ AIQIMNPAKK >gi|225935349|gb|ACGA01000043.1| GENE 22 23591 - 25147 1254 518 aa, chain - ## HITS:1 COG:no KEGG:BF1327 NR:ns ## KEGG: BF1327 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 518 3 514 514 463 46.0 1e-129 MKKITQFICSVLLVLLVCNCTGKFEDYNTNQYQPPVLNSRLLFAQLITCMSSTEENPSQR NITFWAGPFGGMLTPTSSWSRSQHFYTYNVDDAWNKWSVDWYFKNFYPNYFSIERFSEAS GHYYALAKIMRVHIMQIIASMQGPLPYTKVESGQYSVGYDNEEAAWKAMIDDLNYSIDVL TSLMNTDPSFNELTTEDRIYQGDYKKWIKFANSLKLRIAIRISNAVPEYAQEIAEKAVAH PYGVMDSNDCDAYDNIGSTNFKNGLASVNSWGEVRANACIVSFMNGYGDPRREAYFTKTT VSPDYEYLGVRSGISGITTGMFSTYSTVKVETKTPMLIFNAAEVAFLRAEGALKGWTMGG TAEEFYKKGIQLSFNEHSISGADTYIANSINVPADYICPLNSAYNYTNPCKLPVAWNPAT TTSAKEENLERILTQKMIANFPIGNETWADFRRTGYPAVFPAYDNLSTQGVTKERQQRRL RFSEDEYASNGNNVTEACSFLSNGQDTDATDLWWAKKN >gi|225935349|gb|ACGA01000043.1| GENE 23 25161 - 28493 2449 1110 aa, chain - ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 27 1110 11 1101 1101 1105 55.0 0 MTNLRKYQISYQKCCHTSIKALQKSIYAAILLLMIPGFALAQTTKRDIQLNFTNAPMTKV FESIEKQSEYSFFYNNDINTSQRTSIQITSNNINQIVSTLLKNTTIDFRISDKRIVLFQK QAAEEQKKAYTVTGVVSDGSEPVIGASISTKDGKGTITSIDGDFRLENIHIGDVLTISYV GYQTRNIVFNGQPRLNISLTEEATALEQVVVTALGIKRSEKALSYNVQQLSSDDITTVKD ANFMNSLTGKVAGVTINSSASGAGSAARVVMRGVKSITKSNNALYVIDGIPMFNVVNEST DTGLLTDQPGSDAVADINPEDIESINMLTGPAAAALYGNAAASGVVLINTKKGTKDKTNI SVSSSTTFSNPTMLPKMQNKYGNKDGVFASWGDVVNSNYDPEKFFRTGVNTINSVSLSTG TSKNQTYVSVSATNSTGILPNNKYERYNVSGRNTATFLNDKLTLDFGANLIFQNDRNMTA 
QGRYFNPLTALYLFPRGENFEAIRMYERYNEGLGIYEQYWPYDTQSMELQNPYWTMNRIV RENKKKRFMTNASLKWNIIDGIDITGRLKYDKTDMRSTDKRHASTIEQFASPYGGYYDIH KSYSSFYGDIMLNISKTFNDNMWSFNANIGASINDQREEQIGHGGNLEGIYNFFAIHNID TSAKYRRIQSGYIQQSQGVFVNAEVGYKSMLYMTVTGRNDWESQLAYSEASSFFYPSVGL SGIISSMVKLPDWLSYLKVRGSYTEVGTSFERFITRPSYEFNIASGKWISTSIYPVRNLK PEKTRSWEVGVNSKWLNNRLSFDLTYYRSSTINQTFTADLSVSSGYSKAYIQSGNVRNEG MEIALGYSDKWGGFSWDSNLTFSWNQNKIIKLTEGSVNPMTGEPITKDNYEMGQLGNLDA RVKLYKGGSIGDVYATHRIKRDGNNNIYVSQEGKIEIESVEEFKLGSILPKSNMGWSNNF SYKGVNLGVTLTARFGGIALSGTQSTLDQYGVSQVTADYRDAGGIFTNYGYVDTEKYFDT VKGYAAYYTYSATNVRLSELNLSYTLPQKWFRDKLRMTVGLVGKNLWMIYCKAPFDPEAT ASTQSNYYQSYDYFMQPTTRNIGFSVKLNF >gi|225935349|gb|ACGA01000043.1| GENE 24 28656 - 29558 645 300 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 69 238 44 206 274 79 36.0 6e-15 MRIRKEEETDKLDELIDGIELHSPRPPKEHSAEKSFSRLMARIQPDNNSIDIFRQRANRY RIWMAAATVAMLIAMSGWLYNIVSDSEPAFIVASNNTGIVQKVTLPDGTIINLNTCSRLT YPESFSGKSREVFLDGEAYFDVAHDKRHPFIVRAGELKIRVLGTKFNVNASTLVPQITAT LIEGSIEAVTGKKNILMKPNQQLKYDTSSGRVSLTELTNASREIRWTQNVWVLSDTPLLD ICQRLEQQFNIKIIIMNDELIGKSFTGEFYTNESLESILKTMQISTPFEYEYKGKNIILK >gi|225935349|gb|ACGA01000043.1| GENE 25 29637 - 30206 457 189 aa, chain - ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 10 186 7 191 199 65 23.0 7e-11 MNNPQRISNEQEILKLVSEGNSDAFRTLYMHYYDRLFQFAMMFLRSEPASEDVVEDVFYN LWKDRHMLTSIPNFHPYIYQAIRNGCLNVLKSGYISKRDDIPETELQVSISTHTPLDELS YKELHKAVKEAITALPERCRIIFKMAKEDEMSHREIAEALDVKVCTVERQILIAKAKIKE AIQPFLEKF >gi|225935349|gb|ACGA01000043.1| GENE 26 30429 - 31391 868 320 aa, chain + ## HITS:1 COG:no KEGG:BT_3050 NR:ns ## KEGG: BT_3050 # Name: 
not_defined # Def: chitinase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 318 1 319 321 545 83.0 1e-154 MLLPRKFTFFFFLLLGVSIHAQQNFFVSSYVRGNFYNRGRVATESLRASDDLIFLNVHPN KDGSLSFENPRAFQGKGVTTWEGLIKSVRAKVKGTKTKIRLGASSGEWKAMVADEAACTA FAKNIKTVLEKNKLDGIDLDFEWAENEKEYKDYSLAIVKMREVLGDQYLFSVSLHPVCYK ISKEAIEAVDFISLQCYGPSPVRFPVEKYDSDIQMVLKYGIPKEKLVAGVPFYGVTKDNS KKTEAYFSFVQEGLITEPAQNEVIYKGEKYVFDGQDNIRTKTRYAMEQGLKGMMSWDLAT DLPLDDSKSLLKAMVEELRK >gi|225935349|gb|ACGA01000043.1| GENE 27 31509 - 35627 2654 1372 aa, chain - ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 842 1070 369 597 607 130 33.0 1e-29 MKKHIASTLFLSLLSILQLTAQSHSVKRLGIEQGLSNNYVVSITQDKQGFLWFATEEGLN KFDGTRFITYYKNDLPHNNQGITGNELNRVYTDSKRPIIWIATQRDGLNAYNYDEQTFTA YQHNPENPHSLITNDVTDISPSTQNDDGLWVSTYYRGIEYLNINNGQFTHYNKSTVPSLP SNQTWTVLDGGDNNLYIGHVGSGFSIFSLKDKSVKNFQNETGNPASLPGNDVFCIIKDAN GNIWLGTNNGLALYNAANENFITFRNNKNDKYATLCSRILSIRQLKDNRLWIASELNGIA ILNLKQGMFLSPEELSIEYIQEGDDSRSLSNASARCIFQDSFDNIWIGTWGGGINFISSK PPLFTTLSYSPIPNNENSLNNKVASSLCMDRQGRVWIGTDGGGINVFEGEKRIAIYRKES GDIPSNFILASLQDSKGNLWFGSYQGGISYYDSRNKRFRSISLMGQSNLDVRTIYEDAQH NIWVGYSGGIVVLNPLTMKIIKHYNTDNSELHSDFIRSIAQDEKGRFWIGTFGDGLGIYT PDLQKIKTFTQRDGFCSNTINQIIQDKQKRMWIGTGEGLICFLSTNELNYKTYQRKDGLI NTNICAITEDKKGNIWFSNNKGISCYVTNKDCFYNYGHSDDVPAGSFSSGCVTQSKNGQI YFGSINGVCCFNPDITMNEQPAPAAVVTEMKILGRLSNLENNDMIINLSKGSNVELSHAQ NSFGVRFNVQNYSLVNQVEYVYMLKGLENSWYTVNESNNVTFRNIPPGKYEFFIKARVHN QEWPEEATSLTIRINPPLWLTWWAKLIYILASISIIYLILHAYKKKLDLESLYTLEKKNH EQEQELNQERLRFYTNITHELRTPLTLILGPLEDMQKEVSLPAKQAQKLSVIHQSALRLL NLINQILEFRKTETQNKKLCVSKGNIAPLIYEIGLKYKELNQNTKIDFRIQIEKEEMFLF FDKEIITIVLDNLISNAIKYTEQGRVTLSLHQTMRNEVAYTEIKVSDTGYGISAEALPHI FDRYYQESGKHQASGTGIGLALVKNLVTLHEGEIRAESVQNEGSTFYISLLTDNIYPNAL HADSTEPVQEEMNQNTELEYSQEATLDTGKPILLIVEDNEEIQKYIVESFTDSFEVITAN 
NGEEGKQQALSRIPDIVVSDIMMPVMDGITLCKQLKDDVRTSHIPIILLTAKDSLQDKEE GYEVGADSYLTKPFSASLLRSRINNLLDSRKKLVAQFQAQSTPGNQIDLSEKRIVIAEAL SKLDNEFIEKITLLIEENLSSEKIDINYLSDKMCMSGSTLYRKMKALTGLSTNEYIRKVK MENAERLLLEGKFNISEIAYKVGMNSTGYFRQCFKEEFGVSPSDYLKQIKQS >gi|225935349|gb|ACGA01000043.1| GENE 28 35818 - 38949 3508 1043 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4652 NR:ns ## KEGG: ZPR_4652 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 4 1042 11 1051 1052 1340 61.0 0 MRYVRILCMCVVQLLLSAAAFAQTQVTGHVADVRGEDIIGANVTVKGTSNGTITDIDGNF TINVDDKNATLAISFIGYKTKEMKIGEKAHLQIVLEEDTETLDEVVVVGYGTQTKKSLTG AISDVKSDALTRSVSTTTAGALSGKIAGVSTRAVDARPGRGINLEIRNMGSPLFVIDGIP YGGMNSRDWLQASDVSGSDVFNALNIEDIESITVLKDASAAIYGLRASNGVVLVTTKKGK KNDKVSINVNGYYGWQNLTRFPELANAGQYVRGLAEAAQNRGEDPSKLYSPEELAKWEAG TEPGYKSYDYYDMVMRKNVPQYHINANVTGGSEKTNYYLSVAHTGQDAMMPDFKYERTNL QLNIDTKITDRFTIGAQISGKYEKSEDVGLPGGDGYYSAILAMFKMQPIESPYANDNPEY INNVHENGYNPAAFSRDIAGYKDNLTRNANINAYAQYDFSFGLTAKATFSYNYTNSRFDG YQYAYQIYTYDKEKDTYNGTPATGRWRNQIDRSVPARYMQLQLNYAKQIKNHNISAVLGY EASDYDWSKKTYGTEPSTDYLPLLQFDEINSFGDEWSYEARAGWIGRINYDFAHKYLIEL LARYDGSYLYAANNRWGFFPGVSVGWRISEEKFFEKLRPVVDDLKIRASIGQTGTEKDVK LFDYLSGYTWNNGNAVLDGELVTGLNQRGLPITNLSWTKNTTSNIGFDLTMFNNRLKITA DAFRKDITGVPAARYDVLLPSEVGYTLPNENLNKQAYIGAEGMVTWTDHIGDLNYTVSGN FTFSRYKSIETYKPRFNNSWDEYRNSAEGRWGGIYWGYQVIGQFQSEEEIRNYPVNLDGN NNRTLLPGDFIYKDVNGDGIINSMDERPIGYPEGWAPIMSFGGNIGLAWKGIDLNIDLSG GAMQGWRQNYELANAYHGNGNSPVYLLEDRWHRADLYDPESEWIPGRYPAIRKGEFSYNN KNSDFWLHNVRYLRIKNLEVGYSLPSQLLRPIRAQKVRIYANVSNLCSFDNVGKFGVDPE ITAAAAVVYPQQRTFLLGFNITY >gi|225935349|gb|ACGA01000043.1| GENE 29 38953 - 40815 1737 620 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4651 NR:ns ## KEGG: ZPR_4651 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 13 620 8 619 619 676 56.0 0 MNAIMKSKSIKIVLLGAVLTSMLSGCGDSWFERDPKNILTNELVWNDPNMIKSQLANLYN RIPQLHGDFNTGGMCETDDAMYCGTLDQNYRNELRYGNDYGRWWDYGLIRDINMSIENID 
KYGTDISADDKLQFKAEFRFIRAYVYFELVRRMGGVPLITTTLEYDFSGDPSYLRNPRAK EHEIYDFVYSECEEIKSQLGNKGSQSRANYYTALALESRAMLYAGSIAKYNALKTPNIVT SGGEVGIPADMADGYYRKSLAASQEIITKGGYELYEKESDKGVNFYKMFMDKTLNKEAIW VKDYKNPLKVHSFGYDNVVHHLREDNDNSSCIGPSLGLVEAFDYLDGAPGTLHYKNGNDY VVYDTPSDIFANKDARLYGTIVYPGSKFRNQDVDIQAGVAVWNDKTGSYDLLTDSKLGSA YDDNKTFVGQDGPQTNSPNVSNTGFYIRKFISEASGYTLRNYAENWWPWFRLGEIYLNAA EAAFELNDEPTALPYINRLRQRAGFPANSLTSLTIEKIQNERRVELAFEDHRYFDMKRWR IADITWNGNTDNDKAIVYGLYPYRIAKPKPGTSDQDKYIFVKTKSERFKVARTFQQSNYY SFIADDVINNNPTIVKNPFQ >gi|225935349|gb|ACGA01000043.1| GENE 30 40828 - 41502 505 224 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4650 NR:ns ## KEGG: ZPR_4650 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 15 223 1 210 210 191 45.0 1e-47 MKKIIPFFIMICCLLAVSSCSYDNFEEPKATLTGRAIYDGEAVGVRSGSSEFALFQDGYA LHGSIPVYVAQDGSYSVSLFNGDYKLVRMGNAPWERPSNDTIYITVKGNTVQDIPVTPYF SVRNVSFARNGNKVTARFTINKVVADANLENVGIYLGTGILTDEKQKEAELALGNTVSLN QENTAEIEIPSGLLNESYLYARVGVKSDKSSEYCYSQSIKVALK >gi|225935349|gb|ACGA01000043.1| GENE 31 41622 - 42578 661 318 aa, chain + ## HITS:1 COG:yagH KEGG:ns NR:ns ## COG: yagH COG3507 # Protein_GI_number: 16128256 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Escherichia coli K12 # 14 280 5 263 536 59 27.0 9e-09 MLAVTLPAFALSGNPVLPGFHADPEILYSNKTKKYYIYSTTDGQPGWGGWYFTVFSSVDL KNWKDEGTMLDLKSDQVPWADGNAWAPCMEEKLIGGKYKYFFYYSGNPKAGGGKQIGVAT SDSPTGPFVDLGRPIVTDSPTGNGQQIDVDVFTDPVSGKSYLYWGNGYMAGAELNEDMVS IKKETITVMTPEGGTLEDYAYREAAYVFYRNGLYYFMWSVDDTGSPNYHVAYGTSTSPLG PIKVAKNPVILIQNPEKEIYGPAHNAVLQVPGTDKWYIIYHRINKNYLKSDPGVHREVCI DRMEFNEDGSIKQVIPTP >gi|225935349|gb|ACGA01000043.1| GENE 32 42663 - 44255 1164 530 aa, chain + ## HITS:1 COG:no KEGG:BT_3043 NR:ns ## KEGG: BT_3043 # Name: not_defined # Def: putative xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 527 10 523 528 921 82.0 0 MNRIIGKKRVAILGFLFVCVAGGCSQSVPDVSYQIETDKPCQTMAYFSASDAWSMQFIGL 
WPEEKQNQVADWLFSTENDVNGQPKGIGLSLWRFNVGAGSAEQGEASQIASPWMRAECFL NADGTYDWNKQQGQRNFLKLAKERGVNKFLAFLNSPPVHYTQNGLATNTGRGGTANLKSD CYEKYVRFLADVVQGVEEHDGIKFNYICPFNEPDEHWNWVGPKQEGCPATNKEVAHTVRL LSKEFVSRNMDTQILVNESSDYRCMFRTHETDWQRGYQIQAFFCPDSVDTYLGDTPNVPR LMLGHSYWTTTPLSELRNIRCQLRDTLDKHQVGFWQTETCIMGNDEEIGGGNGFDRTMKT ALYVARIIHHDIVYARAESWQWWRAIGGDYKDGLIREYTTDDNFLDGTVEDSKLMWALGN YSRFIRPGAVRLSVSAFDQTNAFIPDGDTDQQGLMCSAYKNVDGTYVIVVINYANEEKEF SIHKEKVGNTQWQIYRTSDKEGENLLPVGKVKSGKNIQIPARSIITLQSN >gi|225935349|gb|ACGA01000043.1| GENE 33 44478 - 48545 2754 1355 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 818 1058 42 287 318 129 32.0 5e-29 MKRHIIFACIYEYVCTMVMKKFTFLLITLFSSTLILLADNPYRSYTVADGLSNSTVKAIY QDEMGYIWLGTKDGLNRLDGYEIKSFFYESDETVRQSNDIVSITGDRQGRMWIGTFNGIT LFDPFEEKYIDLATLYKGSELPRGVVVGLSIAADGAVWVVTKMGVYILRDGKCTCLEALR GLYINSMAPSADNSLLLNVANKGILRLDTRTSRFSYILKGPDYPVFLKIFQDEKERTWLA SSLDNLQLYNPATSTATKIEVNFPADMSVWKGQVHDIKVYNDSLLLLATDNGLAVMNEHT HVISRTFEECIPSGLLANRRLMSLYKDKQGSLWLGTFNEGALFYNARQYLFRYHPLTLNI NQPIRVTGKLIEAQGKLWIGHNRGICTLNLANGKVEEVRLPIGKVGASEDEVYYMFQNNE NEILFYVLNKGVYSLNLRDLSISKEMMDMFPPDAQIRAMAKDVQGNIWIAEDELSCYNPK TKKLSRSFSTNQDGNTRFMLTQDLLAYGSSMLVGGRTSGVWSFPYHPNNAAHYFKGNQLD FDELKNKNVSLLYLDSQHYLWVGTYNMGVYRCHLERGTIDHFGIEQGLIHNSVCGVLEDK ETGDFWISTVIGLSKLSMKDFRIVNYTKDTGFPLNEVSRNTLLQVEDGRIYVGGNNGMVE FAPREIIAGQESLLPVVHVSSVNSLDSDEGADRVQYDNTRSLEHVELSYKNAAVLIKYSP LDYIFPKGYKYAYRMEGLDLNWNYIERNEVIYSHLPAGEYTFYIKACNSDGIWGDATGID VVVHPPVWLTGWAKVIYVLLALMLIGGVLYYFYKRKSDKYKRRIEEIEKENIERNYRMKI ELFTNFSHELRTPLTLITGPAEDILQDETLPHKFLFPMKQIYKNSNRLLLLVNQLMDFRK LEYGAMTLKLSRVNIGTFLTSQIDSFSDLLHKKELTIGYDNDYYGDNLWMDTDLMEKVIF NLLSNAVKHSPKGTQIKVRSVEKAGSVVISVKDCGEGISEENLSKIFDPFFQVEQGSKSD LFGSGIGLNMVQYVVRLHKGKISVESTPGHGAEFLVELNLGKECFAGANVEFVENREDTY LEKVRRECIAPIEEEKPASVEDDHRYRVLVVEDDDDMRQYIVSLLSQQYIVYEASNGKEG 
LKEAVEQIPDLIVSDVMMPVMDGLELCKAIKEEMITAHIPVVLLTAKALNEHIEEGYSVM ADDYVLKPFAPKVLLAKLDSLIKNRNRLRRIFCEKLDAIEVPVAELSAQDSFMQQLMELI RERVHDPNLSVNDLHEELGMSRSQFFRKIKAVSDVSPNKLILNVRMKLAAEKLATGKYTV SEVAYDVGYSDPSYFSKVFKSTYNIAPANYLKQRM >gi|225935349|gb|ACGA01000043.1| GENE 34 48702 - 51902 2711 1066 aa, chain + ## HITS:1 COG:no KEGG:Slin_6567 NR:ns ## KEGG: Slin_6567 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 29 1066 139 1155 1155 568 35.0 1e-160 MNDKENQEKSSYEKLRFLATLFVLCLSTMAWAQMKVTGTVVDMMGEPIIGANVVESGNKT VGTITDLDGKFTLNVSKEATLVITYVGFTEKRVKVNGRSQLTIKLEEDSKTLDEVIVVGY GSVKKSNLTTSVAKISSDAIDGRPITSLSDALSGQLAGVQTQTSSGIPGEEMQILVRGAS SINGSSSPLIVVDGVITESMSDVNPSDVASIQVLKDAAATSIYGARGSAGVVLIETKQAS GSKPVITWESYIGTQNAVGLPEMMTAKEWLAYNIWYTNAKYLNKGGTNSMYVPNKKRSSG DQINEQWLVNPNSDVADWTFRNDIPTTNWIDQIMQNALTHNHQVSVSSKGKKYSIYLSAG YLNQEGIVKNTGFERFNFRLNASVDLNRYIKAGATFAPTISHQDKGESEGKDKQIMNALL MPPIIGLDENTREYGFNASYRNNVNPYERLMSVVDKREKKTFNTSLWAEAKIIKGLTFKT LFSYNSDMRIDEYFLPANVQPNNGSTTQGTTSTLSISRTGIQNTLTYNTTLNKKHALEIL LGQSIDERNEFRTSLGAMDYPLESVPTLNMGATPTEASSSRTIVRTSSLFGRVNYNYADK YLASASVRRDGSSRFGPGNRWAVFPSVSAGWKISGEEFMKDIKVINLLKLRASWGMSGND RIGTADYIPNYDVTNTVYGGTSQVGVYAKNIANDKLKWETTKAFDIGFDLSLFNNRVQLN VDYYINKTDDLLYSTKLPAATGFSTVKTNLASIENKGWEIDLTTVNVHTKAFKWSSSLNL ATNKNKVLDMGGNDNVITEAYDARFITKVGGPISQFYVYRTDGLLTNDDFELGPDGKYDK SRPRVPVLTNQIPGNVKYVDIDGNGEINSDDMVPYGSNDPDLTYGFTNRFSYKNLELSVF LRGQIGGKVLYLAGRSLDTGRGNYNGLKRWLHAYKEEYAGGNPIPTSLGVDMSWDGKTPL PYGLGNNSPLNKDGQMHMTDMAIYNASFLRIQNISLTYKLPKKWLRKSHIQAAKVYVTAE NLYTFTDYIGNPDVNSYSPDNPMVRGADYTTYPQSRKYIFGVNLTF >gi|225935349|gb|ACGA01000043.1| GENE 35 51915 - 53444 1464 509 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0773 NR:ns ## KEGG: Dfer_0773 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 17 509 16 483 483 200 32.0 1e-49 MKSFYKLIVSISCLFALSSCEGYLDVEQPSIYTDQNYYKTPQDFETAINGCYAQLQTIYN KNYMEAIVTRADEVRNSTPIGRFMDTPMESKWSKPWSAWWTLVFRCNQLLSRIDAVVFTD 
EDRKAYITGEAYALRGLAYLQFAWCWGGAPLITKEISREEVYKVARSSQEDTYKQAISDF EDAFNRLPDSWASSEAGRVTKYAAAGMLGRTYMYMHNYTKAAEYLGKVIEQEGTLYELAG KYEDCFSDEYNNGKERVWEVQYLGGTTGKALGLSQSFSGWFIPSTLNKEKGDYAKLNGIT FNGASSSIRASMSIAGDGVYETGDKRRDLTIVNNLYLDKSAPQADVYVVRKFLRATKNAP TAVDEWGNNIPILRYTDVKLMYAEALNELDYATYLSTDILPIINDVRKRAGLSAKLSSDF ADKQAVLDYLVKERFVEFSFEGIRWPDLIRWDLAEEAMATHFALVDEGYNETTEQPTYAM KSYNKLAPIPLSDILAYGNKDIMWQNDGY >gi|225935349|gb|ACGA01000043.1| GENE 36 53473 - 54522 821 349 aa, chain + ## HITS:1 COG:MA4278 KEGG:ns NR:ns ## COG: MA4278 COG1262 # Protein_GI_number: 20093067 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 106 341 48 267 270 123 35.0 6e-28 MMIKTLRYMMLGAVLVMAGTSCSDNDEYDNPSRPVEPGDSYFSSEPVEEGPSLFDAATIN TSKIPRVTSNKNTGLDEYYPLDPGDTWASVIRPERVKFFEKFLKEDMIFVNGGTFLMGAT AEQGEGVHIDELPVHKVTVSDFYICRFEVTQEMYSYVMGKWENFSWKDLATTRLPMDNRL FSEMQTFCNKLNEITGLHFTLPTEAQWEFAARGGRKRTSTVYAGSNNFDEVGYNLSNCYR LPEGQTGMQFWPEEVGKKLPNELGLYDMSGNVAEVCLDWYDEYSGEDQVDPVGPDALPSD VPQKRICRGGGWNTNTSACRVSARAAFPVDKRNKLIGFRLVHPKLENVQ >gi|225935349|gb|ACGA01000043.1| GENE 37 54562 - 55734 1142 390 aa, chain + ## HITS:1 COG:BS_yteR KEGG:ns NR:ns ## COG: BS_yteR COG4225 # Protein_GI_number: 16080064 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 85 388 53 370 373 92 26.0 1e-18 MKKYGIIGIVFVGVAVICAFTFRGHKEADYLLTDFPAEADPVTVGNKIADRFLEQWHSQY GSPLRVDEPRTQITYPDVCTWLGGLWFAQATKNRGLEERLEARFQPLFTTEAYLQPQANH VDNNVFGAVPLELYMQTKDEKYLAMGMKYADTQWDAPATQELTEEEKAWADKGYSWQTRL WMDDMFMITAVQAQAYRVTQDMKYITRAAREMVVYLDSLQLDNGLFYHAPSAPYCWGRAN GWMAVGMAELLRILPETNPDYAEIMAAYLKMMKTLKETQNEKGMWRQLVDDPELWEETSG SAMFTYAMIVGVKKGWLDAKEYGEVARKGWIALCSYIDKAGDVKAVCEGTMIKNSREHYI NRLALTGDLHGQAPVLWCAYALVTDFKTNK >gi|225935349|gb|ACGA01000043.1| GENE 38 55747 - 59856 3103 1369 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|260172971|ref|ZP_05759383.1| ## NR: gi|260172971|ref|ZP_05759383.1| hypothetical protein BacD2_13963 [Bacteroides sp. D2] # 1 1369 1 1369 1369 2834 100.0 0 MKTRIIFFFLIIVASCIQAQVRMVAGQPVLFEKQDVIDLATYNWPRTLISYVVVFDGGVK EEMLSLTDKSQGVPVPFQLSEKVVKNGFLQRARVSFFAALPQGGKYAYELVATGRKPVTV DNPLAVVSEDGQFSIGNKDMTVYIPASQSVKAGKAPAPVLSVQKGDRKIGNNRLYAHRKN VERIETLPVEVGELFVKYQMKYYLAGNATYTAWIKVVQGYPFVILEEEMNGLSKEDRVYL DMCWDNFAPVKRFGTQWDRVFDKASQWIGIDTPVHTSYSQEDPHWTGMGWIEDPSKEMIF RISPFGGNSVREQPPVMSFWEEGKNADELGVFVYDHQKWDDRQYGIWQPTPDLSIYFRYD DNKLYFKYPLVDGSRSTAIAFFSEEQGRKAVAGFNKRIDEIAKAGGAHKSEFLLYRYTQM LHQQYASLSLDRIKDWELEYANDKKQPENLFTKDHTGQTAESFYQNMISSAFAYYPMGLN FYPGIHSIEHRPVYSKLVEGYLFHARSLTEKQRKTVNALFILGGYVNMLEEMNAIRNSLA GTANMAADGWCVPMQTAYLFPEHPMAKEWGDFFEKCLEIYGVFYTRPEVKTFEGKGGRWV ESLGVYNWAYLRPTGHSNIAGELYDGKNRFASPYMAERGKWLVDMLTAPVYNYRNVEGAQ PGYPEGWKPGDALDEKSFTRQYPAHGAHGGGTTVDRPSSLFELGEWLFNYDPIVAENIFW AGTFGKELEHKPKDSDWSEVYKRIHTVDNRGTNPHLKSCKYVGHGIVLRAGVDTPEELSI HLEQVDKGPNYRWGNQAQGNSGGIYFYAQGQIFTGHENENAGDHITNNLDGVTNFGVMKN GEFRTIGMNELTAPLYDLGVAQFAELLPATGKDLYSWPEYQSRSILLVGTDYYLIFDETG TNWRAAHRFSWFTGKGFEYPQITFLSKKARDDHWTVAETATSRGFYRDAFGSLLTLVSHK KGEVVPLRGKLKNIPLLGTEEVADFIPAKEGAYPEGVIGIRAPQSEDLVFRAGKTLEYRT EQEDFEGKAGVIRRMKDGSLQLALIRGTEIAADGLALSMEAGDEAAIALTRAVDGTLSGT FKALKVTKVTLTGVTAKGTFYIDGVVQPVRVKAGQTVLTLPVGQHSLEYTAGKAMPLPAE ITDVEYEKNHFRIYWQNPNRLKSVRLEVSMDGGKRWETVTTTTHSPYQLSKDGYKDKIHI RAVAMNGKRAASSAREYPVYVTGEAPHYPEGIWMKLDDNRVEISWGKVLGVQKYRLYRRI KGEQEFRLIYEGKANSFVDKTAAGVCKAFAMPGSLDNRLQDRHGLKVYEYAVASVNGNGE GALSPVENTDPASWKNWYPAVTLKFKRQSAFWMPPYMPANMSPEKYYPD >gi|225935349|gb|ACGA01000043.1| GENE 39 59917 - 61185 1011 422 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4229 NR:ns ## KEGG: Fjoh_4229 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 10 421 16 428 429 229 35.0 1e-58 MRIIYFLWALVFVLSSCKEKPATSIVLKNPINEERVDESFRLSRTELRAVGEELLPVVKK ADGTYIPCQVDDMDQDGLWDELAFVYTLGGHETVELSLDWISATDYPVFERRTNIRYGKM 
TSPGRVEELSSDTHGKQNLPRSVNYPYQMDGPAWENDKVGFRHYFDGRNCRDLFGKRVSE MVLDTVGLRTDGYPDNTYQVLREWGCDILSVANSFGLGGIAIQLQDTLLRMGVEQKDTVD IIDSSHFEVIAEGPVRSVFRLDFTRWDVLGTKVDVHETVTIWAGKNGYEEEIRTSPLPEN AVLVTGLVANNNTKECVEKQYEGKWVSMITHDKQTVNKDYELGMALLIPKDQLVETFHAP DEGNGILKTWCAKLKPGNNGYRYQVYAGWELGVDDLSKREEFIRLVDTYVDYLNHPVSIS IN >gi|225935349|gb|ACGA01000043.1| GENE 40 61402 - 63171 1138 589 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172973|ref|ZP_05759385.1| ## NR: gi|260172973|ref|ZP_05759385.1| hypothetical protein BacD2_13973 [Bacteroides sp. D2] # 1 589 1 589 589 1203 100.0 0 MLRVSPGKGYSIKYLAPFYYARLWSNYETAEATAKLIEIYQYQLEHIDEIYHSDSDLEFF VHATMHGYMLTKTRMSEQLQKKIKDFMKSGKYATDKGTLNMRMMRQASGFLCAEEWSDFV DADGQTSSQLKSYLHGRILKTLKSFFTDNCPEADAFTYLGINLQYVRMLAEFSRDEEIRK TAATTYQHMVAQLLLPWNQGLYCANPSRCKGWANLCTGNLSVDVQIGQLAWLFYGGQNER KIRLDAGKDNFACFNFWMAYQRNVKPLPYFQSLNAGKQYPYHFEALRINDDHFCSRYTYQ SKSYGLSTQTIEAFPNKLKGFQYTYAFKETKNLHLVWQSDLPEASVFSVCHDNPERPQSY QTVSNKPGYGENPYHRVLGYERSAIGVYNVAEDYMDQPKFYQMYVPFTRKGIKTKMIREI NGMRWVLCHTGSMMFAFATPEDWDFGLQDNKYGIKDHHILTLKDVNRRRGSWVLETTEIT ERYKDAQGDMDAELEKFAGDIAAKVKLQLSADYETSDTPSISYTNIQGDVLELTFFSPEM SYNGQYKVNGNAVDLNTEYISKSDYMQQKAGSNSVQFHTETGTETLQLE >gi|225935349|gb|ACGA01000043.1| GENE 41 63265 - 65607 1373 780 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260172974|ref|ZP_05759386.1| ## NR: gi|260172974|ref|ZP_05759386.1| hypothetical protein BacD2_13978 [Bacteroides sp. 
D2] # 1 780 1 780 780 1637 100.0 0 MKVMKLKSCFLLIGLLLVCNVYAQELCRADFLPKASAAFDLLTQKYSEERIIKEIRAKNV RWVTNLMSASAVFYKATHEKRYLDMSEQVFGNVIREWKKNEKLMHGKDDFFALQNLALAY EILQDNDRLPMGADEVMIRFADLHFDPDFVIDNNQGQERALGFVRMCNLFPDAPGVSHWR EYIDKMWHFWYRNKDVDETATLYASIHLNDLINIAVESDKVALLKTPEIRRWFERYRDQQ APSGYMPEYGDDYFFAYNNWILVFEKMARLTGDASFRKAAWKLFTIGYPNLDIKYFKPGW NLRQACDWAALAEVALLPTFFEKSDSPVRTSLVTTRTNRRGKTDIPDQLLLRASSEAGTP FIMSDLYASGTHQHPNLRGTINYFEVDDNPLFHGVQRHATDVRHGNTVVLMKESGNGFPF DEKGSRLLTNVWFTDCVDFSQSTEISGDAAMRGMRKMTFRFQGEPGEEIYIKNVRLIGKV GNRLLHDCSTLENWSKNVTLVDLGKEGKAVKVVLPDKNVCFVNLDVAADFSLNDYRYIGC DWKHMAKSGAKKSVLDFMIRAYNKVSLPGEEYIHEKVGTLFNPNTVREAMAETREGDSYG RIVLDDQCVDGSVLQRNMVLTKEGVLVIQDHLLPGAGTEGYTAGSLWQLYSLDKSGKNWF NSTGENKKWKDRSGKDIETNQLLVYFEEQKERHFGAQQQEYTVKPVTTFAKQKVIPGSAV TFVTIIVPHTALWKAEDIVKAISAQTDATHQSNVWITLANKNNLKIEITKEGNWKVERNE >gi|225935349|gb|ACGA01000043.1| GENE 42 65712 - 66374 491 220 aa, chain - ## HITS:1 COG:no KEGG:BT_3041 NR:ns ## KEGG: BT_3041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 207 212 308 80.0 6e-83 MAIQFELYKTPRPKDEEDKETYHARVVNFQHIDTDYLAKEIQIATSLTEGDVKSVLESLS HFMGDRLREGQSVHLDGIGYFQIKLNSQEPITSPKLKANQIKLKANISFKADAKLKRSVS VVHVERSKLKPHSAVLSNDEIDKLLTNYFKSNPVLTRRDFQGLCGFTPTTAARQIKRLKE EEKKLKNINTYYNPIYVPMPGYYGKAEINTNQEYNIEKEN >gi|225935349|gb|ACGA01000043.1| GENE 43 67139 - 67483 332 114 aa, chain + ## HITS:1 COG:no KEGG:BT_3039 NR:ns ## KEGG: BT_3039 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 142 115 54.0 6e-25 MSDFTCSCSCLMKHDLERSVDKLSFMKENWPSFANIESVDQLSKAELQCSLCLLNIVIDG LSKDEFSCPNKELIHLVIMYVYIQERFDLCEIKELHTKLVMVPVKKRKKWLSKS >gi|225935349|gb|ACGA01000043.1| GENE 44 67668 - 70025 2179 785 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 30 
768 7 750 778 375 33.0 1e-103 MKSIKKMVLVSAFAGTCLTTHAQANPPAIPADPAIEANIRQWLQRMTLEQKIGQMCEITI DVVSDLETSREKGFCLSEAMLDTVIGKYKVGSLLNVPLGVAQKKEKWAEAIKQIQEKSMK EIGIPCIYGVDQIHGTTYTLDGTMFPQGINMGATFNRELTRKSAEISAYETKAGCIPWTF APVVDLGRDPRWARMWENYGEDCYVNAEMGVSAVKGFQGEDPNRIGAYHVAACMKHYMGY GVPVSGKDRTPSSISRSDMREKHFAPFLAAVRHGALSVMVNSGVDNGLPFHANRELLTEW LKEDLNWDGLIVTDWADINNLCTRDHIAATKKEAIKIAINAGIDMSMVPYEVSFCDYLKE LVEEGEVSMERIDDAVARVLRLKYRLGLFDNPYWDIKKYDKFGSKEFAAVALQAAEESEV LLKNDAHTLPIAKGKKILLTGPNANSMRCLNGGWSYSWQGHVADDYTQAYHTIYEALCEK YGKENIIYEPGVTYAPYKNDNWWEENKPEIEKPVAAAAQADIIIACIGENSYCETPGNLT DLTLSENQRNLVKALAATGKPIVLVLNQGRPRIINDIEPLAKAVVNIMLPSNYGGDALAN LLAGDANFSGKIPFTYPRLINALATYDYKPCENMGQMGGNYNYDSVMDIQWPFGFGLSYT NYKYNNLKVNKPTFNADDELIFTIDVTNTGKVAGKESVLLFSKDLVASSTPDNIRLRNFE KISLKPGETKTVTLKLKGSDLAFVGYDGKWRLEKGDFKIKCGDQWIDIVCDQTKVWNTPN KNTLH >gi|225935349|gb|ACGA01000043.1| GENE 45 70022 - 70201 77 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKFTIIRQKQGVLCFEQENRMYLIGASYVFNCISPFRVTANLSTFVVDFKSHSKNINS >gi|225935349|gb|ACGA01000043.1| GENE 46 70228 - 71817 1339 529 aa, chain + ## HITS:1 COG:BH3683 KEGG:ns NR:ns ## COG: BH3683 COG3507 # Protein_GI_number: 15616245 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 28 529 6 527 528 295 34.0 2e-79 MKNSCRLLLILIGLWMANNVSIAQETFRNPIITGMNPDPSICRVGDDFYLVTSTFEYFPG LPVYHSKDLVHWKLIGHALSRPENNPLMGCSASTGGQYAPTLRYHEGTFYVIGTNYGGKG SQGVFYVTAKNPAGPWSDPVWAGNWYVDPSIEFIDGKMYFLSPDNQGSFLLGVMDPETGK FVEPLRKVAAGLGGSSPEGPHFYKMGEFYYIMSAEGGTGYEHREVIQRSKSPWGPYEPSP VNPVLSNMNCPGHPFQAIGHADLVQLKDGSWWAVCLGIRPVDGKFQHLGRETFLAPVTWD ADGWPKVGKDGVVQETYPFPNLPSHVWAKQPVRDDFDAETLGLDWTFIRNPAHSFWSLTE KPGSLRLKGTAINFTTNDSPSFIGRRQAAFNLTASAKVNFIPKVENEEAGLVVRADDKNH YDLLITKRNGQRTAMLRKTLKDKVVDTIYKELPATGDVILSITATKTAYTFEVKAGHAVE VLGMASTRDVSNEVVGGFTGVFIGMYASGNGQANTNPADFDWFDFRCLD >gi|225935349|gb|ACGA01000043.1| GENE 47 71986 - 73551 1148 521 aa, chain + ## HITS:1 COG:CC2802 KEGG:ns NR:ns ## COG: CC2802 COG3507 # Protein_GI_number: 
16127034 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 17 500 29 543 548 296 37.0 8e-80 MRNALFLIFISLCSVCKSSAQGYSNPVIPGFHPDPSVCKAGDDYYLVNSSFQYFPGVPLF HSKDLVHWEQIGNCLTRPSQLDLTNANSGSGIFAPTIRYNDGVFYMITTNVSGKGNFLVH TTDPRSEWSEPVWLEQGGIDPSLYFEDGKCFMVSNPDGYINLCEIDPMTGKQLSSSKRIW NGTGGRYAEGPHIYKKDGWYYLLISEGGTELGHKVTIARSRYIDGPYQGNPANPILTHAN ESGQSSPIQGTGHADLVEGTDGSWWMVCLAYRIMPGTHHTLGRETYLAPVRWDKDAWPVV NANGTISLEMDVTTLPQQEMKGRPEQIDFKEGKLSPEWIHLQNPEAKNYTFTKDGKLRLI ATPVTLSDWKSPTFVALRQEHFDMEASAPVVLQKAGVNDEAGLSVFMEFHSHYDLFVRQD KDQKRSVGLRYKLGEITHYAKEVSLPTSGEVELVVKSDINYYYFGYKVNGIYHDLGKMNT RYLSTETAGGFTGVVLGLYAVSASKESKMYADFECFKYKGE >gi|225935349|gb|ACGA01000043.1| GENE 48 73739 - 75247 1360 502 aa, chain + ## HITS:1 COG:CAC0826 KEGG:ns NR:ns ## COG: CAC0826 COG2730 # Protein_GI_number: 15894113 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Clostridium acetobutylicum # 147 475 38 337 370 154 32.0 3e-37 MEKQSFSDGLFGPLGIKRVIFMLVLLTTSFISCSNSDEKGGSLEVAQEYRNLEFDARGSR QTIQINGPSEWHISTSESWCKSSHAIGEGKQYVNITVEANDTQKDRTATVTVSASGAPDI IINVKQSMYSVPPYDEYIDPDNTGMRDLTSMQLSALMKAGVNIGNTFEAVIVGEDGSLSG DETCWGNPTPNKALFEGIKAAGFDVVRMPVAYSHQFEDASTYKIKSAWMDKVETAVKAAL DAGLYVIINIHWEGGWLNHPVDANKEALDERLEAMWKQIALKFRDYDDRLLFAGTNEVNN DDADGAQPTEENYRVQNGFNQVFVNTVRATGGRNHYRHLIVQAYNTDVAKAVAHFTMPLD IVQNRIFLECHYYDPYDFTIMPNDEDFKSQWGAAFAGGDVSSTGQEADIEATLGSLNVFI NDNVPVIIGEYGPTLRDQLTGEALENHLKSRNDYIEYVVKTCVKNKLVPLYWDAGYTEKL FDRTTGQPHNAASIAAIMEGLN >gi|225935349|gb|ACGA01000043.1| GENE 49 75273 - 78446 2858 1057 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1573 NR:ns ## KEGG: Dfer_1573 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 40 1057 17 1041 1041 760 42.0 0 MNKHFLGKCTFVHWYCLALAILLWLPITILKAAPSQDLKVSGIVTSATDGEPLIGVSVQV KGTTTGTITDLNGKYTLNVSTGQTLVFSYIGFMEQQVVATKPVINVVLKEDTKTLDEVVV VGYGTMKRSDLTGSVVSVTGDELKKSVVTSLDQALQGRAAGVSVTQNSGAPGGGISVSIR 
GINSLNGNEPLYVIDGVAISGNTDGNSSVLSSINPSDIVSMEILKDASATAIYGSRASNG VVLITTNQGKAGKTKVSYEGYYGLQQLPKKLDVLNLREYAEYQNLRAQVLGFGDREEFKD PNLLGEGTNWQDEIFRNASMHNHQINISGGNEGTTYSLSGGYLSQDGIGIGSSFERFSAR VNMDNKITNWLSTGLRASIAKTQQNNTIDNGNIIRTAIQQLPEVPAKNPDGSWGMQSENM YGTYFTNPVAEALMRENYDKGLQLYVDFFADITLYKGLVFRAEYAGNYYYNNSYKYTPSY DYGLYTQESLGSRSASNGSNWTLKTYLTYTNTFGRHQITAMAGHEAQENNWESLSGTRSD YFLNSVHELDAGSSLTAKNSSSKGSSAIESYFGRVNYGFDDRYLLTLTVRGDGSSTFAPK NRWGIFPSAALAWKLKNEKFLKNVSWLDNLKLRLGWGLVGNQAADNYAYGVKMRTVATIW GSGYYAENYGNENLKWEETEAWNAGLDINLFGNRVELIIDGYYKNTDNLLMKASLPSYVN GVISSPWVNAGAMTNKGLEFTLNTVNINKKDFMWRSGLTISFNRNEITKLYTETAGLSGT INSETYTYSEVGQPIGQFYGYNVIGMFAKEDDFYKRDSYGNYILDKNGERAKVAIPKGKN ISESEIWVGDYIYEDIDDNGVIDEKDRTYLGNPEPKFSYGFNNSFSYKGFDLNVFINGVY GNKVVNMLRKEFTNPMNNSGMLKEAVNIARVELIDPSQPATLSNVYVSNAGSAQVQRITA ANANDNNRMSSRFVEDGSYLRIKNISLGYTFPKKWISKLNIDNLRVYMNIQNAFTFTKYK GYDPEVGAYNYDVLTRGIDNARYPSQRIYTFGLNLSF >gi|225935349|gb|ACGA01000043.1| GENE 50 78470 - 80098 1518 542 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6367 NR:ns ## KEGG: Cpin_6367 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 5 540 10 547 547 349 40.0 1e-94 MKRKLLAIATTALLLVACSDSFLDRAPEGSYVDATFYTSDEALEAATAPLYNRAWFDYNQ RSIVPIGSGRANDMYSPWNYPQFVTFQVTALDENLSGAWSGFYSVVTMANSVINAVETQT QGSVSEAAKTKAIAEARLMRACAYFYMLRIWGPVILIEDNQKLVDNPVRPLNREEDVFQF IINDLNYAVTHLPEQSDKGRATSWAAKGILAKVYLARSGWNNGGTRNEDDLELARQYAGD VCENSGLGLMTNYEDLFKYKNNNNQESLLAMQWVPLGEWYECNTLLSDLAFSTEVTGGVN CWSSYNGSIDMLQQYELADTLRRNATFFTKGSYYSYIAMKDGGYTYKGTASPIKKGVPGG PDDDNDGKVKQMNSPLNTYILRLADVYLTYAEACLGNNSVLSDGRGLYFFNLVRERAKVN KKSSITLDDIIRERRVEFGMEYSNWYDMVTWFRYLPDKMLNYFNNQWRGYRADAIIKDED GKLHFGKYDTDGTTFLEGPENYTAPEFTINIEAEDIFLPYPESDVIQNPLLNEPPVPYTF NE >gi|225935349|gb|ACGA01000043.1| GENE 51 80112 - 81578 980 488 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1571 NR:ns ## KEGG: Dfer_1571 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 27 227 25 209 389 89 30.0 3e-16 
MKSIYKYTDARLLLLFWLFLPLLALISCQNDDDSIPVIHYIRVTDPTRADSTFTDVSPGT MIVVVGEHLGGTQKIFINDQEVSFNRNYVTSTNIILTVPNELELTGQNPELKGEIRLETD HGIATYNMHVLSPAPYITRISATYPIKPGDQMTVIGGNFYEVQAVYLSTEQPAKDGTRPV DVQEITNYEVNNKYNQITLTAPANLLEEGYLVVECYTSSAVTEFKKNGPRPVVTAVSSTM PVVGSTVTITGQNFIKVSRVNINGEFDIPVEDITTSNTFDEISFVLPQAPTQSGHISVTA IGGTVESAEIFYPLENVILNYDGIGSHVWGDCSFVVADGSSAPYVSNGTCLGITGTVSAY NYWWKQSYSNAQWVGTSIIPGNTPISDLKLQFECFVKEVFTGPVFQIAMCENFDAALNGY VPVSSFTGETETGKWMQCSVSLSSVVADATYQEFLNRNSAHIGVYATNPGGSEATIEVYF DNFRIVKK >gi|225935349|gb|ACGA01000043.1| GENE 52 81644 - 83137 918 497 aa, chain + ## HITS:1 COG:no KEGG:DICTH_1791 NR:ns ## KEGG: DICTH_1791 # Name: not_defined # Def: endo-1,4-beta-xylanase D (EC:3.2.1.8) # Organism: D.thermophilum # Pathway: not_defined # 54 496 53 507 509 270 34.0 1e-70 MRKNIIYFVVFSMAIVASMACHAQELQMQDRIANGSSTPVIIMPEVSPKLLSGDANPLLD FIFTADPTAVEYEGRLYVYGTNDHQQYEAVGGNGKNSYEYIKSLVMMSTNDMVNWTYHGL IRTDSIAPWIKASWAPSVVSRKEGDGKTHFYLYFSNSGDGSAVLTATSPVGPWESPLNRS VIDTQSPGIGDCKAAFDPGAVIDEEGTGWLAVGGGCARIIRLGKDMISIDSPIVPINAPH HFEANELNFINGTYVYTYNIDWQNFDDWPLPTEKPTICCMSYMTSKTPLESDSWQYQHNY MKNPGEYGFEFGNNHTHLHKYGGKWYVFYHTMSLQRSFNTTGGFRNICVDEIEIDEGNVN IHMGKQTLKGVSQIKVLNPFMLQQAETTAATQGVKFLNGKDVGDMYAVTVPGMEGILSVR GVEFCKTPSRLELQAAGDGIIEVHRNTPDGEMMAAIQINTPNMKLLKTKIQTQFEGTTDL CFVLKGEDIVFDQWQFK >gi|225935349|gb|ACGA01000043.1| GENE 53 83191 - 84954 1198 587 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5683 NR:ns ## KEGG: Dfer_5683 # Name: not_defined # Def: cellulose 1,4-beta-cellobiosidase (EC:3.2.1.91) # Organism: D.fermentans # Pathway: not_defined # 1 580 1 579 582 543 47.0 1e-153 MKIIKYIALFGMLSGLAVACTPSTSVISNDVVRLNQLGYYPNQEKIAVIDSGKVEEFVIL DAVSGEQVFAGKSLYTAKSAWSDKTRTTLDFSAITTPGEYILKVNGASVAFPVKDSVLSP LADAALKSFYYQRTAIPIEEQYAGQWNRPAGHPDNHVLIHASAASPGRPSGTIVSSSKGW YDAGDYNKYIVNSGYSIGLMQSIYQLFPDYFSRQKINIPESDNHTPDLLDEMHYNLDWML TMQDPADGGVYHKLTTPFFEGFVKPVDCKQQRYIVQKSITAALDFAAVMAQASRLFASYE KDYPGFSKRALLAAEKAYAWAEKHPEDYYNQNLLNQKFQPEIATGEYGDTHADDEFFWAA 
TELYFSTGKEIYREEVIKKAPKVYTAPGWGNTFALGIFAWLQPDRKLNEADRRFAGSLKT ELLKYADKVIQGAEQTPFHAPYGNDAKDFFWGCLAEKCLNQGVSLMYAYILTDKETYLTN AYRNMDYILGRNATGFCYVTGLGTKSPEHPHHRLSASDDIKAPIPGFLVGGPNPGQQDKA FYPTASPDESYVDTEDSYASNEVAINWNAALVALSSSLDALAVDSVK >gi|225935349|gb|ACGA01000043.1| GENE 54 85019 - 89047 2990 1342 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 832 1062 63 293 318 149 34.0 3e-35 MIRKSLLLLALLFTSWIVQAQSYLFKHLEVSDGLSNNSVNTIYKDRDGFMWFGTTTGLNR YDGYTFKVYQHAEDEPGSLPDNYITDIVEMPDGRFWINTARGYVLFDKERDCFITDVTGF MKTLESWGVPEQVFVDREGNAWLSVAGEGCYRYKEGGKRLFFSYMEHSLPEYGVTQMAEC SDGILLIYNTGLLVCLDRSTLAVKWQSDEIKKYIPEGKKIELSLFVDRDNCIWAYSLMGI WAYDSGTKSWRTDLTGIWSSRPDVIIHAVAQDIEGRIWVGKDYDGIDVLEKETGKVTSLV AHDDNGRSLPHNTIYDLYADRDGIMWVGTYKKGVSYYSESIFKFNMYEWGDITCIEQADE NRLWLGTNDHGILLWNRSTGKAEPFWRDAEGQLPNPVVSMLKSKDGKLWVGTFNGGLYCM NGSQVRSYKEGAGNALASNNVWALVEDDKGRIWIASLGGGLQCLEPSSGTFETYTGNNSA LLENNVTSLCWGDDNTLFFGTASQGVGMMDMRTREIKKVQGQSGSTKMSNDAVNHVYKDS RGLIWIATREGLNVYDVRRHLFLDLSPVAEAKGSFIAAITEDQERNMWVSTSRKVIRVTV ASDGKGSYLFDSRAYNSEDGLQNCDFNQRSIKTLHNGIIAIGGLYGVNVFAPDHIRYNKM LPNVMFTGLSLFDEAVKVGQSYGGRVLIEKELNDVENVEFDYKQNIFSVSFASDNYNLPE KTQYMYKLEGFNNDWLTLPLGVHNVTFTNLAPGKYVLRVKAINSDGYVGMKEATLGIVVN PPFWMSWWAYLLYAIGLVVVLFVARYRMLKREREKFHLQQIENEVAKNEEINNMKFRFFT NVSHELRTPLTLIISPLEGMLKETTDELQSTRLQLMYRNAQRLLHLVNQLLDFRKGEMST HQLSLSEGDIISYVHSVCNSFLLMADKKHIQFSFFSGIDTFSMAFDADKVGKIVMNLLSN AFKFTPEGGRVTVMIEHVAGTPDMLEIKIADTGIGISDVDKEHIFERFYQADHKGVEETT GNGIGLSLVRDFVTLHEGEVKVFDNIGTGSVFVIQFPVKHVETQVQLPPETGISIGEEED KEIKEETERKDFPLLLIVDDNEDFRIFMRYSLELQYRVKLAVNGNEAWEMMQEELPDLVI SDVMMPQMDGNELCRLIKQDKRTAHIPVILLTARQNTEAKLEGLQTGADDYVTKPFNMTI LVLRIRKLIELSRYHRVTQGMIDPAPSEIVITSLDEKLIEKAIKYVEDNMSRTELSVEEL SRELGMSRVHLYKKLLQITGKTPIEFIRVIRLKRAAQLLRESQLHVSEVAFEVGFNNPKY FSRYFKDEFGVLPSVYQEKEGK >gi|225935349|gb|ACGA01000043.1| GENE 55 89197 - 92055 2710 
952 aa, chain + ## HITS:1 COG:CC0789 KEGG:ns NR:ns ## COG: CC0789 COG1501 # Protein_GI_number: 16125042 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Caulobacter vibrioides # 24 945 63 973 983 861 45.0 0 MNMKNIFCCLLPGLLFGACANKVYEEAGDSVIVKVQQEVTGGPRLVRLQVMGDKLIRVSA TADSKFADPQSLIVVPQEKQIPFAVMQNGDTITVSTEEVKASVLASTGEVWFVDKDGKLI LQENKGGGKKFTPIEVEGTKGYTICQVFESPEDEAFYGLGQHQADEFNYKGKNEELFQYN TKVSVPFVVSNKNYGILLDSYSLCRFGNPNDYSQLNRVFKLYDKTGREGALTGTYVPKKG ETLVRREDSIYFENLKTIQNLPEKLPLMGAKVTYEGEIEPAQTGEFKFILYYAGYVKVYL NNEPVVPERWRTAWNPNSYKFAVHLEAGKRVPLKIEWQPDGGQSYCGLRALTPVDPAEQG KQSWWSEMAKQLDYYFVVGEDMDEVISGYRTLTGKSPVMPKWAMGFWQSREKYNTQDEML GALKGFRDRKIPVDNIVLDWNHWPENAWGSHEFDKARFPDPKAMVDSIHAMHGRMMISVW PKFYVTTEHFKEFDKNGWMYQQSVRDSLKDWVGPGYHYGFYDAYDPDARKLFWKQMYEHY YPLGIDAWWMDASEPNVRDCTDLEYRKALCGPTALGSSTEFFNAYALMNAEAIYDGQRGV DNNKRVFLLTRSGFAGLQRYSTATWSGDIGTRWEDMKAQISAGLNFAMSGIPYWTMDIGG FCVENRYVAGQKQWNATKTENADYKEWRELNARWYQFGAFVPLYRAHGQYPFREIWEIAP EGHPAYQSVVYYTKLRYNMMPYIYSLAGMTWFNDYTIMRPLVMDFTADTQVNNIGDQYMF GPSFMVSPVYRYGDRSREIYFPQAEGWYDFYSGKFQPGGERKVIEAPYERIPLYVRAGAI VPFGDDIQYTDEKPADHIRLYIYQGADGEFTLYEDEGVNYNYEQGMYAMIPMKYDEATKT LVIGERQGEFPGMLKERTFTVVTVNKEKAQPFDLNAKGVTVKYNGSEQTLKL >gi|225935349|gb|ACGA01000043.1| GENE 56 92074 - 94599 1726 841 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 32 822 56 871 871 536 37.0 1e-152 MMIGKLKYLILGSCLMLLGACNSSSPISPRERSDFNADWRFHLGDGLQAAQPGFADNDWR VLNLPHDWAIEGDFSRENPSGTGGGALPGGVGWYRKTFSVDKADIGKIFRIEFDGVYMNS EVFINGASLGVRPYGYISFSYNLTPYLKWDEPNVLAVRVDNAEQPNSRWYSGCGIYRNVW LSKTGPIHVADWGTYVTTSSVDKGEAVLNLVTTIVNESDTNENITVCSSLQDAEGREVAG TRSAGKTETGREAVFAQQLTVKQPELWDIDTPYLYTLVTEVMRNEECMDRYTTPVGIRTF SFDARKGFTLNGRQTKINGVCMHHDLGCLGAAVNTRAIERQLQILKEMGCNGIRCSHNPP 
APELLDLCDRMGFIVMDEAFDMWRKKKTAHDYARYFNEWHERDLNDFILRDRNHPSVFMW SIGNEVLEQWSDAKADTLSLEEANLILNFGHSSEMLAKEGEESVNSLLTKKLVSFVKGLD STRPVTAGCNEPNSGNHLFRSGALDVIGYNYHNKDIPHVPANFPDKPFIITESNSALMTR GYYRMPSDRMFIWPERWDKPFADSTFACSSYENCHVPWGNTHEESLKLVRDNDFISGQYV WTGFDYIGEPTPYGWPARSSFFGIIDLAGFPKDVYYLYQSEWTDKQVLHLFPHWNWTPGQ EIDMWCYYNQADEVELFVNGKSQGVKCKDADNLHVVWRVKFEPGTVKVVARRKDEVIAEK EIRTAGKPAEIRLTPDRSVLAADGKDLCFVTVEVLDEEGNLCPNADNLVNFTVKGNGFIA GVDNGNPVSLERFKDKKRKAFYGKCLVVIQNDGKPGKTELTATSEGLRQAVVKVSAKDCK L >gi|225935349|gb|ACGA01000043.1| GENE 57 94615 - 96843 2015 742 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 48 732 38 705 793 379 34.0 1e-104 MKIFLLTICFLSVQTGMVAIAQDKKQTLVYLDDAQPIEVRVQDALNRMTVEEKTRLSYAQ GKFSSPGCPHLGIPELWVSDGPHGVRAEINWNDWGYAGWTNDSCTAFPALTCLAASWNPL LAEKYGYAIGEEARYREKDVLLGPGVNIYRTPLNGRNFEYMGEDPYLASELCVPYIQGVQ KNGVAACVKHYALNNQELWRGHIDVQLSDRALYEIYLPAFKAAVERGKTWSVMGAYNKVR GTHATHHKLLNNDILKGEWNFDGCVITDWGAAHDTYEAAMYGLDIEMGSYTNGLTSESEF GFDDYYLGKSYLKMVREGKIPMEVVNDKAARVLRLIFRTAMNRRKPFGALTSEEHYRTAY EVATEGIVLLKNGTGKKQPALLPIPQGKYRRILVVGDNATRNLMLGGGSSELKVQRVVSP LEGIKAKFGENVVYAQGYTSGRPMYGRADVVPQTTVDSLRNDAVEKAMNADLVIFVGGLN KNHFQDCEGGDRLSYGLPFGQNELIEALLKVNKNLVAVIVSGNAVEMPWVKEIPSIIQSW YLGSVGGEALADVLSGDVNPSGKLPFSYPVKLEDCPAHFFGEISYPGDSIRQEYKEDILV GYRWYDTKKIRPLFSFGYGMSYTAFEYSKPVVSAKTMNADGSIDLTVKIKNTGKIAGKEI VQLYIGDEECSVLRPVKELKDFRKVQLLPNEEKEVKFTIKPEALQFFDDKQHTWVAEPGK FKAYIAASSSDIRGTVTFEYTQ >gi|225935349|gb|ACGA01000043.1| GENE 58 96997 - 98577 1601 526 aa, chain - ## HITS:1 COG:XF2704 KEGG:ns NR:ns ## COG: XF2704 COG0793 # Protein_GI_number: 15839293 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Xylella fastidiosa 9a5c # 22 341 69 386 508 224 41.0 4e-58 MMKKLTIILSVCLWAVAAQAQNFGSEAMRKLQMAEFAISNFYVDKVDEDKLVEEAIIKML 
AQLDPHSTYSDAEEVKKMNEPLQGNFEGIGVQFQMIEDTLLVVQPVSNGPSEKVGILAGD RIIAVNDSAIAGVKMSTEDIMKRLRGPKGSKVNLTIVRRGVKDPLLFTVKRDKIPILSLD ASYMIQPKTGYIRINRFGATTAEEFKKAMKDLQKQGMKDMILDLQGNGGGYLNAAIDLAN EFLGQKELIVYTEGRTAKRSDFYAKGNGDFRNGRLIILVDEYTASASEIVSGAVQDWDRG IIVGRRSFGKGLVQRPIDLPDGSMIRLTIARYYTPSGRSIQKPYDSTVDYNKDLIERFNH GELMNADSIHFPDSLKVQTKKLGRTVYGGGGIMPDYFVPIDTTLYTDYHRNLVAKGAVIK FTMQFIEGHRKELANKYKKFESFDEKFVVDDDMLATLKEIGEKEGVKFNEEQYQKSLPLI KTQLKALIARDLWDMNEYFRVMNTTNESIQKALEILNSDEYQKKLK >gi|225935349|gb|ACGA01000043.1| GENE 59 98612 - 99070 263 152 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 145 4 148 164 105 35 1e-21 MRKAIFPGTFDPFTIGHYSVVERALTFMDEIVIGIGINENKNTYFPIEKREEMIRELYKD EPRIQVMSYDCLTIDFAQEVGARFIVRGIRTVKDFEYEETIADINRKLAGIETILLFTEP ELTCVSSTIVRELLTYNKDISLFIPKGMKMSE >gi|225935349|gb|ACGA01000043.1| GENE 60 99067 - 100944 1867 625 aa, chain - ## HITS:1 COG:CT661 KEGG:ns NR:ns ## COG: CT661 COG0187 # Protein_GI_number: 15605394 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Chlamydia trachomatis # 17 618 7 602 605 583 52.0 1e-166 MEENELIPVDNNNAVEYTDDNIRHLSDMEHVRTRPGMYIGRLGDGAHAEDGIYVLLKEVI DNSIDEFKMQAGKKIEITVEENLRVSVRDYGRGIPQGKLIEAVSMLNTGGKYDSKAFKKS VGLNGVGVKAVNALSSRFEVRSYRDGKVRIATFSKGNLLTDETQSTEEENGTYIFFEPDN TLFLNYCFKPEFIETMLRNYTYLNTGLAIIYNGHRILSRNGLVDLLNDNMTATGLYPIIH LKGEDIEIAFTHTGQYGEEYYSFVNGQHTTQGGTHQSAFKEHIARTIKEFFNKNMDYTDI RNGLVAAIAVNVEEPIFESQTKTKLGSTNMVPGGVTVNKYVGDFIKQEVDNFLHKNADVA EAIQQKIQESEKERKAIAGVTKLARERAKKANLHNRKLRDCRVHLNDPKGKGLEEDSCIF ITEGDSASGSITKSRDVNTQAVFSLRGKPLNSFGLTKKVVYENEEFNLLQAALNIEDGIE GLRYNKVIVATDADVDGMHIRLLLITFFLQFFPDLIKKGHVYILQTPLFRVRNKKKTNYC YSEEERINAINELGPNPEITRFKGLGEISPDEFKHFIGKDMRLEQVTLRKTDAVKELLEF YMGKNTMERQNFIIDNLVIEEDLAS >gi|225935349|gb|ACGA01000043.1| GENE 61 100954 - 102090 992 378 aa, chain - ## HITS:1 COG:no KEGG:BT_3032 NR:ns ## KEGG: BT_3032 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 378 1 378 378 736 96.0 0 MAITIKKVSTKRELKKFIRFNYRMYKGNPYSVPDLYDDMLNTFNKKKNAAFEFCEADYFL AYRDDKIVGRVAAIINNQANEKWESKNVRFGWIDFIDDPEVSSALIKAVEDWGKERGMTH IAGPLGFTDFDAEGMLIEGFDQLSTMATIYNYPYYPVHMEKLGFEKDADWVEYKIYIPDA IPDKHKRISELIQRKYNLKIKKYSSGRKIAKDYGQKIFELMNEAYSPLYGYSPLTQRQID QYVKMYLPILDLRMVTLITDANDELVCVGISMPSLAEALQKSNGRLLPLGWFYLLKALFM KRRAKMLDLLLVAVKPEYQNKGVNALLFSDLIPVYQKLGFIFAESNPELELNGKVQAQWD YFETQQHKRRRAFIKEIK >gi|225935349|gb|ACGA01000043.1| GENE 62 102269 - 104188 1052 639 aa, chain - ## HITS:1 COG:no KEGG:Slin_1080 NR:ns ## KEGG: Slin_1080 # Name: not_defined # Def: protein of unknown function DUF303 acetylesterase putative # Organism: S.linguale # Pathway: not_defined # 7 633 8 635 652 543 44.0 1e-153 MKRFLLFTYCLLFAASQMIAQLSMPSFFSDHMVLQREKPIRIWGTAHSGEKISVTLGDIK KSIRADKNGKWLVSLPPMQAGGPYTLTVKSPEQSLFFSDILIGEVWICSGQSNMEFRLRS ANHAVEEIAAANYLQIRSFNVIQEMRHTPKNNLKGKWEVCSPTSASDFSAVGYFFARELY QKLNIPIGFINSSWGGTDIETWISMEVMDHFPKYEKSLSRMRSPEFEEYIKRSDKVKTEF EQAILNEPGETEKWYSEDTSTENWKKHAVPSLWSNEELSAIDGVVWFTYQFSIPANCLNQ DAELSLGTIDDDDITWVNGHEVGRTVGYDLKRLYKIPAKILKEQNTITIKISDYRGGGGL YGTKDEVYLKINNRIFPLCDDWKYKVAVSNAQYDYVEYGPNSFPSLLFNAMINPLIGLGM KGVIWYQGENNAGRANEYIDLFPALIKDWRNRWDCEFPFYWVQLANFMAPARQPSESHWA NLRDAQSKTLALPYTGQAVIIDIGEEKDIHPRNKQDVGKRLALHALRNDYGYDSIVCTGP VFKSVKRIGNTLEVTFDSCAGELIARNKYGYLSGFAIAGTDGKYQWAQAKIENNKVIVWN TEIQQPVSVRYAWGDNPDDANLYNSVNLPASPFEGHISQ >gi|225935349|gb|ACGA01000043.1| GENE 63 104207 - 104986 500 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260172996|ref|ZP_05759408.1| ## NR: gi|260172996|ref|ZP_05759408.1| hypothetical protein BacD2_14088 [Bacteroides sp. 
D2] # 1 259 1 259 259 540 100.0 1e-152 MYKRIILFTFICLLALAGVAQNKDTYTILLSGASFAEPNNKWFEMGCRALHAIPINRAIS AESIAHTANKMLDGTLYTPEEFDDIDVFVLMQVHEKDVYNEANLKENYKDYATPFDASDY AVCYDYVIKRYISDCYNQKFNPKSKYYNTPYGKPASIVLCTHWHDSRPVFNTSVRKLAEK WGFPVVEFDRYIGFSKNRKHPVTGKQYSLIYTGDSQTTHGEVFGWHPPHGEHSFIQQRMA AIFADTLRKILLPKEYINE >gi|225935349|gb|ACGA01000043.1| GENE 64 104989 - 107655 1275 888 aa, chain - ## HITS:1 COG:no KEGG:BT_2524 NR:ns ## KEGG: BT_2524 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 877 4 877 881 914 51.0 0 MAHIPIRSYLYLLLGTALFSCTPQELKTPFELKCENIPVPIGVDTQTPRLSWKLPLLEED SINKVEIWLSTDSAQLSDCQSVYWNKSIIGAPIRASYDGQPLDSYTTYYWKIGYQTSSKQ KTTFSPISSFTTGCLSTNDWKGKWITDGHDITYRPAPYYRKSFQLNKTVEQALLTIASAG LHELFINGKRTGNHFLDPMYTHFDKRILSVTHDVTSSLSMGENVMGVQLGNGWYNHQSTA VWFFDKASWRNRPRFTAQLHLRYTDGTTKYLGTDSTWQKTDSPIIFNSIYTAEHYDAQKE LTGWDSPGFNATEWHHAQETESPTETIKSQVMHPIRETARYTATQCKKINDSCYVYHFPK NIAGVTELNVKGKKGTILRLKHGELLDKNGRVNMANIDYHYRPTDDSDPFQTDIVILSGK EDSFMPKFNYKGFQFVEISSSAPIELSGENLIAVEMHSDVPATGYWSSSSDLLNKIWEAT NSSYLANLFGYPTDCPQREKNGWTGDAHITIETGLYNFDGISVYEKWMNDFCDEQKDNGV LPCIIPTSIWGYDWANGVDWTSAVAIIPWEIYRFYGDTTLLRRMYEPIKKYVSYIESISP EYLTDWGLGDWVPVRSKSNITLTSSIYYYTDVHILAKAALLFGYPKDASYYNTLAQNIKN AINTRFLNPETGIYAEGTQTELAMPLYWGIVPEKDKKKVAARLHQLVEKDNYHLDVGLLG SKALLSALSDNGYAETAYKVASQDTYPSWGYWIKQGATTLHENWRTDVIVDNSYNHIMFG EIGAWLYKGLGGIQIDEKHPGFRHILLKPFFPADMNELTIRYNTPYGWLNINWIRQTNDC IRYTIDIPAGTSATFVPFTMQESQKSITLQAGKHSLELDFTQLLINQQ >gi|225935349|gb|ACGA01000043.1| GENE 65 107681 - 109078 541 465 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 1 463 5 521 522 213 28 8e-54 TMENIKLKEKIGYGLGDAASSMFWKLFTMYLLFFYTDVVGISSAVVGTMFLITRIWDTFL DPFVGILGDRTSSRWGKFRPYLLWVAIPFGICGILTFSSFGDNMTTKIIFAYATYTLMMM VYSLINVPYASLLGVMSANPQVRTEFSSYRMTFAFGGSILVLFLIEPLVDIFSKMKITES LPDIAFGWQMAAVVFAIMASGMFLLTFLWTKERVQPIKEEKGSLKEDLKDLGKNKPWWIL LCAGIMALVFNSLRDGSAVFYFKYYVDGSDTFSFSFMNSAITLITIYLVLGQAANILGIM 
FVPSLTKRIGKKKTYFMAMVCATILSVLFYFLPKDFIWGILCLQILISICAGIISPLLWS MYADISDYSEWKTGRRATGLIFSSSSMSQKFGWTIGGALTGWLLAYFGFKANVIQSDFAQ TGICMMMSIFPAIATMLSAIFISRYPLNEKRLYEISTELEERRKK >gi|225935349|gb|ACGA01000043.1| GENE 66 109086 - 110255 714 389 aa, chain - ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 47 358 24 320 326 112 29.0 1e-24 MKSNRLEELTQHYEALITRKNEICNNNNGIYRRYHYPVLTAEHAPLIWKYDFDEKQNPFM EERIGINAVMNTGAIKINHKYYLVARVEGADRKSFFAVAESDSPVDGFRFWDYPIEMPET DVPDTNMYDMRLTAHEDGWIYGIFCAERKDTNAPAGDLSSAVAVAGIARTKDLKTWQRLP DLKSPSQQRNVVLHPEFVNGKYALYTRPQDGFIDAGNGGGIGWALIDDICHAEIKEEKII NKRFYHTIKEVKNGEGPHPIKTPQGWLHLAHGVRGCAAGLRYVLYLYMTSLEDPTQIIAE PAGYFMAPIGEERIGDVSNVLFSNGWIEDDNGKVYIYYASSDTRLHVAESTVSQLVDYCL HTPADGFRSIESVKQIITMVNHNKQYLKQ >gi|225935349|gb|ACGA01000043.1| GENE 67 110283 - 111644 789 453 aa, chain - ## HITS:1 COG:no KEGG:Dtur_0402 NR:ns ## KEGG: Dtur_0402 # Name: not_defined # Def: hypothetical protein # Organism: D.turgidum # Pathway: not_defined # 33 453 1 441 442 319 40.0 1e-85 MLKRLSICYVLFILCTTCVSAQEDRWTGSATNLSKGNLRVNSSGRYLEYTDGTPFLYLGD TAWELISRLNDKETERYLENRREKGFTVIQTVILDELDGINTSSNGVPQLIDGNIDQPSP EYFARVDKVISLAAAKGLYIALLPTWGDKVDKQWGKGPEIFTPENAYRYGKWLGERYMNM PNLIWVIGGDRSGGGKNLSIWNALATGIKSIDQNHLMTYHPQGEHSSSYWFHDAPWLDFN MCQSGHAQQDFAIYQRILLPDLNRKPHKPCMDGEPRYENIPINFKKENGRFGEDDVRHTL YQSMFSGACGYTYGCNDIWQMFDTGRESKCDADTPWHQAMDKQGAWDLIHFRRLWEKFDF TQGKSQQSIFGNASLEKTNYPVAFGNKDYILVYLPQGGKRTIYLPPMSSPKQTLKWMNPR NGQTTFYQYTTTDTISISSPTEGKGNDWVLIIE >gi|225935349|gb|ACGA01000043.1| GENE 68 111649 - 112689 858 346 aa, chain - ## HITS:1 COG:RSp0162 KEGG:ns NR:ns ## COG: RSp0162 COG2730 # Protein_GI_number: 17548383 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Ralstonia solanacearum # 35 340 115 418 420 188 35.0 2e-47 MKKVFISAFLLLSLLTLNGCKSNQPPVKETGEPYGVNLACADFGSSFPGEYNKDYTYPTD 
QDLEYWHKKGLKLIRLPFKWERLQLDLKGPLNQHDLNKMKELVRAAEKRDMVVLLDLHNY CRRYMNNEHTLIGSNGLTIEDLASFWQAIAKEFSTFKNIYGYGLMNEPHDLAPGSNWFDM AQASINAIRQVDTNTLIMVGGNDWSSAERWIEQSDTLKFLKDPANNLAFEAHVYFDKDAS GTYKYSYEEEGCYPEKGIDRVKPFVEWIKQNNFHGFIGEYGIPDNDPRWNETLDLFLHYL QENGINGTYWAAGPWWDTYFMAITPKEGKDRPQMPIVEKYTHTLKK >gi|225935349|gb|ACGA01000043.1| GENE 69 112709 - 115270 1301 853 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 92 849 25 776 785 223 26.0 1e-57 MRTRFITIITLLLSTPMVIAQKSMDEIDRKSFAAKPSPIEVKGTQMTETGNIPPVGNIPA EFSLDGIWQLAEGGTEKERLHTSWRDQIPARVPGSIHTALVENKIIPDPYIGQNDSIAEK QSYKTWWMKREFELNSPLSHSILSFGGIANKCTIWLNGKLLGTHEGMFGGPDFSIGKYLK NKNTLIVKLEAIPQMFLGNWPPNANESWKYTVVFNCVYGWHYAQIPSLGIWRSVQLKEQA AVDIDSPFVATRSLDGQMRLTFDLHEQSSPLKGVLYAEVSPKNFKGIKQYYRFDINSRRK QETLSLDFQIKDPHLWWPNDRGEQALYDLNLRFIPQKGKTAHVKTSFGIRTIEMRPLTDG AKEDYYNWTFVINGKPMFIKGTGWCTMDALMDFSRNKYEHLLQIAKSQHIQMLRAWGGGM PETDDFYELCDRYGILVMQEWPTAWNSHNTQPYSLLKETVERNTKRLRNHPSLVMWGAGN ESDKPFGPAIDMMGRLSIELDGTRPFHRGEAWGGSQHNYNCWWDNAHLNHNLNMTAPFWG EFGIASLPHIETVHKYLDGEKEVWPPRRSGNFTHHTPIFGTMGEMGKLVQYSGYFMPKDS LASFILGSQLAQVVGVRHTLERARTLWPHTTGALYYKMNDNYPGVSWSCVDYYGIIKPIH YFVQKSFAPLAAVMLFDRSNLSSQEVSLPVYLLDDCQELDKKPYQVKVSIYNDQLDTVAN HTFNGTGDENVVKKLGEIYLNREQTKSTMLFFVLDIVKNNRSIYRNYYFTNYEVRPGSIL SMPQTTIKTERRGNAVILTNTGKYPAIGVHIEVPEKMDQLIVSENYIWLNPQESKKLKIN LESPVIVKGWNLQ >gi|225935349|gb|ACGA01000043.1| GENE 70 115290 - 116375 736 361 aa, chain - ## HITS:1 COG:RSp0162 KEGG:ns NR:ns ## COG: RSp0162 COG2730 # Protein_GI_number: 17548383 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Ralstonia solanacearum # 32 356 106 418 420 176 33.0 4e-44 MLKDLFSLVTIVALLFSSCSKSEEEENGDDPQPTKQTVYFGVNLSGAEFGNVYPGVDGTH YGYPTEKDLDYFKAKGLYLVRFPFRWERIQPTMNGELNATELAKMKKFVKAAEDRNMQIL LDMHNFGRYCVYCDGQSSQNNQYAIIGNARCTVDNFCDVWKKLAKEFKDYKNIWGYDIMN 
EPYEMLASTPWVNIAQACINAIRTVDTKTTIIVSGDEFSSARRWKECSDNLKTLTDPSNN LIFQAHIYFDSDSSGNYDKGYDEDGATIQTGVARLKPFVDWLKENNKRGFVGEYGVPDTD GRWLDILDSALKYLQENGVNGTYWSAGPRWGDYPLSVQPTNNYTQDRPQLNTLLKYKSTQ Q >gi|225935349|gb|ACGA01000043.1| GENE 71 116435 - 117607 970 390 aa, chain - ## HITS:1 COG:no KEGG:BF0761 NR:ns ## KEGG: BF0761 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 43 390 34 367 368 137 30.0 6e-31 MKKIKYFAIIAASIFALTSCTDIVEVDDLKAKENKPSTGAPTVDKVVLATDAEFPIDGAN FEQVVRIEGTNLGDITSLKFNDIEVDSKEIYSTYDMLLAPIPRALPKEVSNTIYITTKHG ELSIPFVVSIPDLTINGLKNEFTQPGDTTVITGDNFDLYGITIEEAIVNLGNLPVNVIDA TRTELTIEIPANATPKSTLTIKGANMDEAYKLTYMDPGVSQLFDFNNWPGAGAFTHSSQF PDAPKNFLCDGTLEGQPEPLVEGGKYIRFNNSVKAWGWMVMWAGYITVPTEVAADPSSYD LRFEICTGAKFPISTQARIILGDYGWYPSKGGLPVNTYGGWQTVRISADTEALLPNPIDP STNTAFKIIFSPESAQDFDLSMCNFRFVHK >gi|225935349|gb|ACGA01000043.1| GENE 72 117637 - 119400 1248 587 aa, chain - ## HITS:1 COG:no KEGG:BF0834 NR:ns ## KEGG: BF0834 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 586 1 555 555 384 41.0 1e-105 MKKKNIFIYLMASTLLLSGAVMTSCESMIEEKPFDFIVPEDVEDSDNGADMWVTGVYNTL HEAMFRYGSFPRPLDYDCDYISGAVWQFSQFGSGNFQGGDGQADVLWTGMYSLINRANIA VSEINKMQNVSEAFKKNALGECYFLKAWAYFYLVRAYGAIPIYSVSVNESGQYTNNPRIP IAQVYTETIIPLLKDAKDMIYKNTDNGFKPGRVCAATAAGLLAKVYATIGSASMATGEQI TVKTGAPFVMQNVNGTMTKVYTEPVPTTFSKDQVAGYESFSSGEYYRLAYEIAEDVIGGE YGTHQLENYDLIWSPSGKTCSEHLFSLQTKSGDELYGTLFSSHYCGRLNAAGNIDNSLTV GCRKHWYLLFEEKDYRVDKGVLHCWIRQNSDTSWGGGSYYPNFGKWQRMVEAKEPPFDNP KVTSGWRCDEGGSEQFFAFTTKYSQQIADQTQPRTDANYPFLRYADVVLIFAEAANELNG PTKESVDALNDVRTRSNATGKELANFTDKASLRSAILEERAMELALEGDRRWDLIRWGIY LQAMNALGGMDEANNVKQRSSKHLLFPIPTLEILTNQGINENNPGWD >gi|225935349|gb|ACGA01000043.1| GENE 73 119413 - 122664 2630 1083 aa, chain - ## HITS:1 COG:no KEGG:BF0833 NR:ns ## KEGG: BF0833 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 43 1083 24 1045 1045 1104 56.0 0 
MKKNLFSFPRSKVRMLKESKRVWLFLIMFWIVNTAAVAAGIEIKGTVTDSKGEPLPGVNI VELGVKKNNGTITDLSGKYTITVESQKSVLQYTFIGYKTTEVTVGNRKTINVSLKDDTQS LDEVVVIGYGTMRKKDLSGAVASIKSDDLMIGNPTNISQALQGKLAGVQVNQSDGAPGSG VSITIRGANSFSTNSQPLYIVDGIPFEVGDTPSSKANEGNNSTTNPLSLINPNDIESIDI LKDASATAIYGSRGANGVVLITTKRGRAGDAKVEFSANFGLSKIAKMVKMLDAYTYANYV NEGVINGAAYDNLPYSYLPYRGKWNYRRDENDKIVPNSGKYYASPEDYLNPGYREDEYGN KEWVEGTNWMDEILQDALTQEYNLSVSGGNEKSNYAFSGNYTDQSGIIKNSGYERFAVRA NIGSHIKPWLNTGLNINFTRSLTKFAKSNSYDYSIIRSAMLYLPTLYVGDKTEDDSYAWL SANPRTYVNTAKDELKSINVFTSAFAEISILDCLKFRQNLGISYSVNDRASYYNRETGEG KASNGRAGKSDNFWQNLTAESLITFDKTFNKLHHLNVVAGFTYEKSDWGGKTMNASNFPT DITQDFDMSQALNIETPASYRGQAVLVSLLGRANYTFKDRYIFTASFRRDGSSRFAPGNK FANFASGAVAWTLSEEEFIKNLNIFSNLKLRLSYGQTGNQAISSYQTMASLAPSNYPLDG TLSSGFAGQTYKGPLNDKLKWETTDQYNVGLDIGFWNNRISLSANYYYKKTKDLLQNVSI PNSTGYTTMWTNFGHVKNKGLELTGKIVALDKKDWTLEFDGNISFNKNEIGGLTADQYAN QLWYSAKEVFLQRNGLPIGTIFGYIEDGFYDNIAEVRADPIYAKASDDEARRMIGEIKYL DKNNDGKITSEDRAIIGDTNPDFIYGLNANLRWKNLSLGLFFQGTHGNDIFNGNLTNIGM SSIANITQDAYDSRWTPENAAGAKWPRVTTAMTRDMKLSDRYVEDGSYFRLKTINLSYNF GSVIKGISNLSVFGTVTNVFTITGYSWFDPDVNAFGSDASRRGVDIFSYPSSRTYSIGFK LTL >gi|225935349|gb|ACGA01000043.1| GENE 74 122792 - 123763 443 323 aa, chain - ## HITS:1 COG:BS_bglC KEGG:ns NR:ns ## COG: BS_bglC COG2730 # Protein_GI_number: 16078874 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Bacillus subtilis # 6 313 21 338 508 242 43.0 8e-64 MKKVFLFITLFSMISLFSYSKDPVKQWGQLQVKGNQLCNQAGEPIVLRGVSYGWHNLWPR FYNKQSVKWLKKDWKCTVLRAAMGTVIEDNYIENPEFALKCMNKVIKAAIKNDIYVIIDW HTYYPQQKEAKAFFSMMAQKYGKYPHIIYEIYNEPMEDSWESVKEYATDIISEIRKYDPD NIILVGNPHWDQDLHLVAESPLKGFNNIMYTLHFYAATHKQELRDRAEAAWEKGIPIFVS ECAGMECTGDGPLDIPEWTRWVEWLESKKISWVNWSISDKNETCSMILPRANKNGGWDES LIKPAGLQSRKFIRQYNSHIYKK >gi|225935349|gb|ACGA01000043.1| GENE 75 124074 - 126071 989 665 aa, chain + ## HITS:1 COG:no KEGG:MAP0339c NR:ns ## KEGG: MAP0339c # Name: not_defined # Def: hypothetical protein # Organism: M.avium_paratuberculosis # Pathway: not_defined # 114 660 143 689 
697 159 26.0 3e-37 MKKSIFLVCWAVFGTVSLSAISPLVTEVWSDKQPDSISYGISRMVNIASLPLLDRGVSVH YEGSIDKKGKNADWDWSLYQDQRGEWVIFDVEGPGCIYNLVQHRYMSSSDPLFRFYFDGE EAPRFSLRLSEFGEKAPFIKPLAESYIGPFDNGRGPIRVARSFVPMAFNKGCRVTTDVKL EGYDRTKGEGGWGHVVYHTYTDNGIKTFTGKENYDTLIQLWKKQGSSFLCDDQLAYHRKS EQKVDAGECITLLDEKGEGAIGSLKFYLPEINEQHLQDVWIHMFWDAHQQPDISCPLACL GGNSLGFHDTNYLLSGYNTDGWFYNYFPMPYWKHVKIMIENRSGVPVSLGFSEIAVSRSV YPTSNTGYFRNTPYYTRKYVAGTDSPIAAIQGRGKMVAAHITCHAERSHIISCEGDVRVY VDGKRTPQVESDGSESYVCYGWGFPTPPEVHPMGGYDGLPDNPWSMTRFCIGDSYPFYSE LKFGIESGEYNNQYLEHSGTIFYYGQDKSVLVKTDSLNLSSPHAIKQHSYKAMGNVRKTK LESFFEGKEDGVLCMGEVVRFKNCSSFRVNILSQNEGVRLRRLSDQNDARQAARVFVDGE EVIERLWYVADSNPYKRWLEDDFEIPARYTKGKKSLNIRIVPVSMSKEGRIAWNEAEYQV FCYII >gi|225935349|gb|ACGA01000043.1| GENE 76 126068 - 127051 924 327 aa, chain - ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 325 16 316 318 228 46.0 2e-59 MNILLLIGGLILILLGANGLTDGAASVARRFRIPPIVIGLTIVAFGTSAPELTVSVSSAL KGSTDIAIGNVVGSNIFNTLMIVGCTALFAPIVITRNTLRKEIPLCILSSIVLLICANDV FLDKASENILNRVDGLLLLCFFVIFMGYTFAIASKPATMEQQAEYPVIEEETEIKSLPWW KSILYIIGGLAALIYGGQLFVDGATGIARNLGVSESIIGLTLVAGGTSLPELATSIVAAL KKNPEIAIGNVIGSNLFNIFFVLGCSASITPLHLSGITNFDLFTLVGSGILLWLFGLFFA KRTITRIEGGVMILCYVAYTVVLIYNT >gi|225935349|gb|ACGA01000043.1| GENE 77 127178 - 129028 1714 616 aa, chain + ## HITS:1 COG:sll0590_2 KEGG:ns NR:ns ## COG: sll0590_2 COG0668 # Protein_GI_number: 16331818 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Synechocystis # 358 603 1 248 264 253 50.0 1e-66 MEINKMKRSLLVLFLSFLAIGAQAQLEQAVKKIFAGDTITNGHVPLKRDSDSIHLANMQK SLEEARLNEANMRMEMEQMKLQMASADSVKYAQQRQRIDSLRQFTKGVPVVADGDTLFYL FTKRGGYTPQQRAQMTGAAIEEIGKRFNLQPDSVAIDHSDIVSDLMYGSKVLLSLTDQDA LWEGISRDSLAKERQQNVITKLHEMKAEHGLWRMAKRVLYFVLVIVGQYFLFRLTNWLFR KLKVRILRLKDTKIKPVSIQGYELLDAQKQANLLVFLSSIGRYILMALQLLFTVPLIFII 
FPQTEGLAYRLLGYIWNPVRTIFVGIIDYIPKLFTIIVIWYAVKYLVRLVLYLAREVEAG RLKINGFYPDWAMPTFHIARFLLYAFMIAMIYPYLPGSDSGVFQGISVFVGLIVSLGSST VIGNIIAGLVITYMRPFKMGDRIKLNDTTGDIIEKTPLVTRIRTPKNEVVTVPNSFIMSS HTVNYSTSAREYGLIIHSEVSIGYDIPWRQVNQILIDAALNTPGVVDDPRPFVLETSLSD WYPVYQINAYIREADKMAQIYSDLHQNIQDKFNEAGIEIMSPHYMAVRDGNETTTPKEYQ KSNNPSNKAGEENKPD >gi|225935349|gb|ACGA01000043.1| GENE 78 129113 - 130051 983 312 aa, chain - ## HITS:1 COG:no KEGG:BT_3017 NR:ns ## KEGG: BT_3017 # Name: not_defined # Def: acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 312 1 310 310 602 91.0 1e-171 MNIKIKSLLILSLIFVCTFVQAQLTDYSIFDKKFNFYVANDLGRNGYYDQKLIAELMGTM GEEIGPEFVLATGDVHHFEGVRSVNDPLWMTNYELIYSHPELMIDWFPILGNHEYRGNTQ AVLDYTNISRRWIMPDRYYTRTFEEKGATIRIIWIDTTPLIEKYRKESDKYPDACKQDVN KQLSWLESVLANAKEDWIIVAGHHPIYAYTPKEESERLDMQKRVDSILRKHKVDMYICGH IHNFQHIRVPGSDIDYIVNSAGSLARKVEPIEGTKFCSPEPGFSVCSIDKKELNLRMIDK KGNILYTVTRKK >gi|225935349|gb|ACGA01000043.1| GENE 79 130216 - 133026 2673 936 aa, chain - ## HITS:1 COG:CC0995 KEGG:ns NR:ns ## COG: CC0995 COG1629 # Protein_GI_number: 16125247 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 125 936 53 903 903 177 23.0 1e-43 MKRFLKFVSFVLLVMSTSGNTFAEEKVNIVRQGTIRGRIVDTSKQTLPGASIYIEKLHTG VTSDVNGYYTFANLTPGTYTIKVSYVGYSPVEMKITIPAGKTLEKDVVLNEGLELQEVVV GGAFQGQRRAINSQKNKLGITNVVSADEVGKFPDSNIGDALKRINGINVQYDQGEARFGQ VRGTSADLSSVTINGNRLPSAEGDTRNVQLDLIPADMIQTIEVNKVVTSDMDGDAIGGSI NLVTKSTPYKRVISATAGTGYNWISEKAQLNLGFTYGDRFFNDKLGMMAAVSYQNAPVGS DDVEFEYDVNKKGEVVMVEAQKRQYYVTRERQSYSLAFDYEINPNHRLTLQGIYNRRHDW ENRYRVTYKDLDKTGLDDEGEMQQSAQIETKGGTPNNRNARLELQQTMDFSLGGEHQFGK LSMNWGASYARASEDRPNERYFNLKQNFLGFNIVDAGGRFPYVTTDVSLHNGEVDGERGK WKVKELTESNQEIYEKDLKFKVDFELPLVNGIYGNSLKFGAKYASKTKNRDVTCYDYADA YKDTYQTEYMNNLTSQIRDGFMPGNQYKATDFVSKEYLGSLDLKNMEGEQVLEESSGNYH AKENVTSAFFRFDQNLGKKLKMMLGLRMEATHIKYDGWNWMVDEDENETLQPTGNHKNNY TNWLPSVLLKYDVTDDFKVRASFTETLSRPKYSALIPCVNINRSDNELVMGNSDLTPTIS 
YNFDLSADYYFKSVGLVSAGIFYKKINDFIVDQVIGNYTYQNNEYKKFTQPKNAGDADLL GVELAYQRDFSFIAPALKCVGFYGTYTYTHTKVNNFNFEGRENEKDLSLPGSPEHTANAS LYFEKKGFNVRLSYNFASSFIDEMGEVAALDRYYDAVNYMDLNASYTFGKKFKTTFYADA TNLLNQPLRYYQGTKDRTMQSEHYGVKINAGVKINF >gi|225935349|gb|ACGA01000043.1| GENE 80 133195 - 133656 324 153 aa, chain - ## HITS:1 COG:no KEGG:BVU_1305 NR:ns ## KEGG: BVU_1305 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 149 2 165 165 99 40.0 3e-20 MKKYILILFSIISFWSCTEDESIDITVLPPATTTGANTFGCLMDGWIYVGGRYLNWGHSY VWTYDSFHYYPEEDKLSVHVCVKPDINIHFTILSPREGEEATLTDIRFRGEELEDGTAFI SHFDPELNIISATFGNGKRLTNGRFDIHYTTQQ >gi|225935349|gb|ACGA01000043.1| GENE 81 133676 - 134329 261 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237721873|ref|ZP_04552354.1| ## NR: gi|237721873|ref|ZP_04552354.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 217 1 217 217 434 100.0 1e-120 MNQTLIKAILLLIAFTLSCHTPLFSQRDFERHEFSFHAGYGVMFHHPPTLTLSTHSYQRT LAQGVSWDGQYNFRPLKRFVFGGIYSGFSSKGSHPEGKDHLWVHFIGTQIGMCNANTKHW QIRVTTGPGGVILRNNSEVFGKTRKVKAFTIGLLTNANLTYKLNPNLGVSLGVQYMYSEL LRMRTHYHGERVIVKLDGNDDTNLTRINFTTGLSYYF >gi|225935349|gb|ACGA01000043.1| GENE 82 134378 - 135568 863 396 aa, chain - ## HITS:1 COG:no KEGG:BT_3008 NR:ns ## KEGG: BT_3008 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 395 1 394 395 692 84.0 0 MMQISPETQLFIREHSSDDVRALALQAKKYPDIDMPTAITQIAGRKVAAEKIPSWWEIEK IWYPKHLSLEQCSSEITARYKARLLQGDSLTDLTGGFGIDCSFLAIGFKSATYVERQEEL CEIAAHNFPVLNLNHINIKNEDGVTYLQAMSPVDCLFLDPARRNEHGGKTVAISDCEPNV AELEELLLQKANRVMIKLSPMLDLSLALKELKQTQEVHILSVNNECKELLILLGQTSPTE ISIHCVNLSTKGTQEEQHFVFTREQEQYSECTYTDSLETYLYEPNASLLKAGAFRSIAAA YPVRKLHPNSHLYTSDTFIENFPGRIFRIVNQCSFNKKEVKENLTDLKKANVTVRNFPAT VAELRKRIHLAEGGDTYLFASTLNNGQKVLIRCEKV >gi|225935349|gb|ACGA01000043.1| GENE 83 135637 - 136410 433 257 aa, chain - ## HITS:1 COG:CC2273 KEGG:ns NR:ns ## COG: CC2273 COG2173 # Protein_GI_number: 16126512 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Caulobacter vibrioides # 52 243 16 193 212 95 32.0 7e-20 MIILKYSLAIFCLLLTGCSFFSSHKEKESTPLMEYEQTTEDIQPHEEYSTVSPPERSEMA LYMDSLGLVNIADLDSSLVVKLMYTQADNFTGEVLYDNLTEAYLHPDAAYALIEAQKALK KLHPSYSLIIYDAARPMSVQKKMWNVVKGTSKYKYVSNPNRGGGLHNYGLAVDISIQDSL GQPLPMGTKVDHLGMEAHITDEIGLVHNGKMSETERQNRLLLRKVMKEAGFRALSSEWWH FNFCSRDVAKQKYKLIP >gi|225935349|gb|ACGA01000043.1| GENE 84 136478 - 139327 2011 949 aa, chain - ## HITS:1 COG:no KEGG:BT_3006 NR:ns ## KEGG: BT_3006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 949 4 967 967 1428 71.0 0 MTIEERLNEIVEKGQGDAIIPFLQGLTQEERKTLVPCLNKLEEHYNKFVQLNENTYGTHG TPEQHRIINLTALVIYSLKEFRKHQWGIYTEQLNELIPWYIPSWLDSFFKEGESKEFGGF YGMNYETLMDWIEQGVLTLTPSPQTIAGYLVNYMNNTDFLQKRAITLKEHIWYLFQYDCG QNWTDNRTGGQPYFSFRYFVEHGQLDRMRVLKESLLAVNRNLNKNLSSWFAGMFTALSPN IEEQLTLQPEIFAVLSAPHSRPVNIILGLLKNLCTHPQFQIEEFLNQTSVLFASDVKAIH QNTLAVLHKLAKERKEHRDAICCAAAQGLMSREESTQSKIVKLIQTYGETASTTLKEVLS IYTETMLANTKKDLKAYLENNEPEDSASFTYEPILPIIREDNRIQEITSTEDLIFLASQV LDVNEIYHFDLLLGALVEWDRQQEAKQISQWTPILQRAYKLLMSGGSSRNGILDQLMAIF LLDYAKLLIKRFPEEAQELNNLHLKMVQKDELQKGKWGYRNLQKLTIREKTNKKIKFPVH KQLLCRTLDLLESKEKPLPLLSTPTHTPMFIAPATLIERLKQYQQANAEPDDMDMQTALS RVALESSSQELPLLLQSLKGEYRHLLTFLLGEKDVLPQAPFNHPSWWMMAGLMKSPETIY TEFKDFSYNKSPREFLTGNFKWRTYQYTDSYTDYNKKTVEWICSTLTFDIPESENSHVIN KDKYNERVSYYSYDPHPLLVEMYPQIERFDDIQNDLPRLAWLTPNIPEPLLVWCIRSAIY DPTLNEVREAGITQAAIEALHQLRHTWHEVSYLLEATCMLVADKTSRSYAAEIWIERVGQ GSIDSGRIGSILGSHQHTGWGPLKRLTDLIQQQMINVSPLHNRELEKLIVAMLTGLPEKP VKDLKKLLEIYAELLSINHSKTEDEHVLHLLDAWKGVANLKKAVANIQR >gi|225935349|gb|ACGA01000043.1| GENE 85 139341 - 142205 1722 954 aa, chain - ## HITS:1 COG:no KEGG:BT_3005 NR:ns ## KEGG: BT_3005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 951 1 946 948 1382 71.0 0 MEELKKELEKLSKAYVDTPENEEKILIPFVKRLLELPMKERRKLLPVIRDLQWIKSKFAG FSSETTCSAARAHFLSAVQFVCANKREMDMAYHVKFDMLCKLLPLYYPTWLTDFINDDKT 
WFNFDLNYEQLMQLMDMGYLKEIAPSRIAHVLPWITRIRNKEPKGNDTFNSELLLKRDIT LKEHIWTIFEYESSIGYQDDCAQEAYKKGVTARDESISAALYRFSQDGHLDRERLLKATL ATFHRGFKKDMAGWFAGFFETLQPTTGELLSLQEEMMQIFTSSYTKPVNVMLQQLKNIAS EEGFRYQEFIERATTLFFSSPKNSLLTIYALFEKIVAQHPEMKESCCIILCQLFLKKDES LQKKAANFISKYGDASSSNLQGTLQSYQPEMFQSVYTILSSFKPRPAEEAHEPDVSMGKT VRRICKEDNLISFPANKEDFLFQLSRLFDMEESWEIETTIAAIIAFHPQLDKEDLNRMEP VFQRAATIVANGWEPYEDLLATFLLEYQRLWAQKDTSNTGFLRNMFTRLEERLKGIDENR GAYDERSFKRLADWKPGYSNATCFTPIKQLWLNVIRKIKGGNAFPLLSTPTHTPAYVQAT ELIRRLAVYQKAGAKPCPWDFQLAIARCTMEDKEEAIATARQLLQDEYLHLSLFLLDENT LPEPPYNYPTAWVAARLVKAPETEFEAFKSFACNTLPHNYLTGDYEWKEVKPKEKSYETD RRLLQLEFYKWHTYAECNSHQLWQEHLIINSKYNMDDSRYMEPLLCCFPNRPEPLIAQII TCYMTFGTPQEDSKRTLACALRMLLSFHCPLKEMSLLLLSGSLLFVDKTVRSYAAELWIE GVSTGRINNHRIGEILARLIQMELAPLKRFTTQVYESMYKRSTFHNQQLEALLTEFIGGF PDKPVTGLKQLLELYLELLTINHSKVTDEQLLQRLQEWGTNSNLKKVTTSLNNL >gi|225935349|gb|ACGA01000043.1| GENE 86 142193 - 143548 943 451 aa, chain - ## HITS:1 COG:no KEGG:BT_3004 NR:ns ## KEGG: BT_3004 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 451 1 451 451 854 93.0 0 MTDTISYQYAAPSALQRSADQDELFLAKYSEIEKKEAPCFFWGKLTQPYMTARCLIALSN VVQSSFNLTPAQLSMLKDPIVTAGNDRLRFEGFSNCAGVYARVDVLPDGHDGEFLENGTT NVDFNPGMISALGGIVRQENVVMSVGPKEVGLYHKGEKVIERKVPLPVKWIKGLTTVQIY QSVAEQLYSFNRIQTLQLFQTLPKSNVKCDYYLVMHGQKPAFSPVKSMNAVCIGGLHRLR LLEPLLPFADELKVFAHPTMQSTIWQLYFGPVRFSLSLSRECWRGFSGEGAALESLLEDV PERWIEAMDKYSYANQQFNPTLFAIEEHIDLNKVDSLAARLAAMGLLGFDLDENSFFYRR LPFKTERILSLNPRMIAAEKLLEEEKVEIISNDEKRTEARVAGSGGVRHTVILDRESEKE RCTCTWFSSNQGERGACKHILAVKKLVQWKN >gi|225935349|gb|ACGA01000043.1| GENE 87 144143 - 144646 458 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 95 36.0 3e-20 MKSLSFRKDLIGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYTPDTNFKGWMY 
TIMRNIFINNYRKVVRDQTFVDQTENLYHLNLPQDSGFESTERAYDLKEMHRVVNALPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR >gi|225935349|gb|ACGA01000043.1| GENE 88 144864 - 145640 648 258 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 258 1 256 256 115 32.0 6e-26 MIKAIMLDVDGTLVSFETHKVLPSSVDALRKIHDGGIRIAIATGRAAGDLHEIADVPYDG IIALNGADCVLRDGTVIRKHLIPKDDFKKAMEIAKAFDFAVAIELDEGVFVNRLTPTVEQ IAKIVEHPIPAVVDIEDLFEKKECCQLCFYIDDEMEQKVMPFLPNLSLSRWHPLFADVNV AGISKATGLSVFADYYGIRMTEIMACGDGGNDIPMLKAAGIGVAMGNASEIVKASANFVT DTVENDGLCKALKHFGII >gi|225935349|gb|ACGA01000043.1| GENE 89 145992 - 146303 192 103 aa, chain - ## HITS:1 COG:no KEGG:BT_2979 NR:ns ## KEGG: BT_2979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 186 96.0 3e-46 MEFLNEYHLAGLFIGICTFLIIGLFHPVVVKAEYYWGTKCWWIFLVLGIAGVVASLSIDN VILSSLLGVFAFSSFWTIKEVFEQEERVQKGWFPKNPKRKYKF >gi|225935349|gb|ACGA01000043.1| GENE 90 146309 - 146773 392 154 aa, chain - ## HITS:1 COG:YPO0002 KEGG:ns NR:ns ## COG: YPO0002 COG1522 # Protein_GI_number: 16120355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 148 6 151 153 115 40.0 4e-26 MERIDNLDRQILEIISQNARIPFKDVAAECGVSRAAIHQRVQRLIDLGVIVGSGYHVNPK SLGYRTCTYVGIKLEKGSMYKSVVAELQKIPEIVECHFTTGPYTMLTKLYACDNEHLMDL LNNKMQEIPGVVATETLISLEQSIKKEIPIRVEK >gi|225935349|gb|ACGA01000043.1| GENE 91 146933 - 147802 1020 289 aa, chain - ## HITS:1 COG:STM4397 KEGG:ns NR:ns ## COG: STM4397 COG0545 # Protein_GI_number: 16767643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Salmonella typhimurium LT2 # 76 287 22 219 220 176 47.0 5e-44 MKKVSIFMAIAAAASLASCTAQAPKANLKSDIDSLSYSIGMAQTQGLKGYLTGRLDVDTT 
YMADFIKGLNEGANKTSKKDIAYMAGLQIGQQISNQMMKGINQELFGTDSTKTISKENFL AGFIAGTLEKGGVMTMEAAQEYTRTAMETIKAKALAEKYADYKAENEKFLAENKTKDGVK TTPSGLQYKVITEGKGEIPADTCKVKVNYKGTLIDGTEFDSSYKRNEPATFRANQVIKGW TEALTMMPVGSKWELYIPQDLAYGARESGNQIKPFSTLIFEVELVSIEK >gi|225935349|gb|ACGA01000043.1| GENE 92 147823 - 148407 687 194 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 5 194 67 259 259 188 53.0 5e-48 MDKFSYAIGLGIGQNLSSMGIGNLAVDDFAQAIKDVLEGNQTAISHNEAREIVNKYFEEL EAKMGAVAIEQGQAFLEENKKGPGVVVLPSGLQYEIIKEGTGKKPKATDQVRCHYEGTLI DGTLFDSSIQRGEPAVFGVNQVIPGWVEALQLMPEGSKWKLYIPSELGYGARGAGEMIPP HSTLIFEVELLEVL >gi|225935349|gb|ACGA01000043.1| GENE 93 148881 - 149579 686 232 aa, chain + ## HITS:1 COG:jhp1180 KEGG:ns NR:ns ## COG: jhp1180 COG0846 # Protein_GI_number: 15612245 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Helicobacter pylori J99 # 1 226 1 220 234 232 49.0 4e-61 MKNLVVLTGAGMSAESGISTFRDAGGLWDKYPVEQVATPEGYQRDPALVINFYNERRKQL LEVKPNRGHELLAELEKNFHVTVVTQNIDNLHERAGSSHIVHLHGELTKVCSSRDPHNPH YIKELKPEEYEVKMGDKAGDGTQLRPFIVWFGEAVPEIETAIQYVEKADIFVIIGTSLNV YPAAGLLHYVPRGAEVYLIDPKPVDTHTSRSIHVIQKGASEGVAELKQLLGV >gi|225935349|gb|ACGA01000043.1| GENE 94 149600 - 149824 267 74 aa, chain - ## HITS:1 COG:no KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 74 2 72 72 105 80.0 6e-22 MEKENKILIFRTSITQKRDIKRIGELLAEFPQIDKWNVDFEDWEKILRIECRDISALEIS EVLRNNHIFATELE >gi|225935349|gb|ACGA01000043.1| GENE 95 149838 - 150188 175 116 aa, chain - ## HITS:1 COG:STM3175_1 KEGG:ns NR:ns ## COG: STM3175_1 COG2207 # Protein_GI_number: 16766475 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella 
typhimurium LT2 # 5 113 3 111 134 74 34.0 5e-14 METKDINKQEYQLRINKVTDYIHNHIDQPLSLQKMAGIACFSPFHFHRVFTILTGETPTD YIKRIRIEKAALLLKRNKELSATEIARLCGFSSLSLLSRNFRLHFSMTIRKFRSLK >gi|225935349|gb|ACGA01000043.1| GENE 96 150259 - 151035 752 258 aa, chain - ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 25 257 19 253 253 192 41.0 4e-49 MSNDFKSIHEFDFTLICNYFKALKRQGPGSPEVTQKAVSFINELSDKARIADIGCGTGGQ TMALANYTKGQITGIDLFPDFIERFNKNAIEAHCEDRVKGIVGSMDALPFQEEELDLIWS EGAIYNIGFERGMNEWNKFLKKNGFIAVTEASWFTPERPSEIEDFWMANYPEIDTIPRKI MQMEKAGYIPTAHFILPENCWTEHFYAPQFPVQEAFLKEYAGNDAAADLITGQRYEESLY NKYKEYYGYVFYIGQKRD >gi|225935349|gb|ACGA01000043.1| GENE 97 151477 - 152334 817 285 aa, chain + ## HITS:1 COG:no KEGG:BT_2961 NR:ns ## KEGG: BT_2961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 285 285 491 76.0 1e-137 MKSKLLLTCTILFFCSSFLCGQNQSSKVANSVETNNGCIRHPWQGKRVGYLGDSITDPNC YGDKIKKYWDFLQEWLGITPYVYGISGRQWNDVPRQAEQLKKEHGGEVDAIVILMGTNDF NDGVPIGEWFTETEEQVMAARGQTQKLETRKKRTPIMDGSTYKGRINIGINRLKQLFPDK QIVLLTPLHRSLANFGETNVQPDENYQNSCGEYVDAYVQAVKEAGNVWGVPVIDFNAVTG LNPMVEEQLIYFYDAGYDRLHPSTKGQIRMARTLMYQLLALPATF >gi|225935349|gb|ACGA01000043.1| GENE 98 152427 - 154820 1885 797 aa, chain + ## HITS:1 COG:no KEGG:BT_2958 NR:ns ## KEGG: BT_2958 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 790 1 780 789 1341 79.0 0 MGYKLLFSFFIFLIVALPIRSKDFVLTSGQTVVIACSPSEELVVRTALEMLGRDIQTVLS STTQTNEKTGEIVIGTVGQNELISRTGVDVSALRGKKQAFLLSVSPEGKLIVAGSDKHGT AYGILEISRLLGVSPWEWWADVTPEKKKLFKLSSKFQSLQAPSVEYRGIFINDEDWGLMP WSSRTYEPSKVKGEIGPRTNERIFELLLRLRANTYWPAMHECTLPFFLTKGNREAAKKYG IFIGASHCEPMACSAAGEWKRRGEGAYDYVNNAPAVYKFWEDRVKEVADQEILYTLGMRG VHDGKMQGAKTVEEQKAVIDRVFADQRGLIEKYVDKDVTKVPQVFIPYKEVLDIYHAGLQ 
VPDDVTLMWCDDNYGYIRHFPTAEECARKGGNGVYYHVSYWGRPHDHLWLSTMSPYLIFQ QMKLAYDRGIQKMWILNVGDIKPAEYQIELFMDMAWNIEAVASEGVTSHLKHWLERELGA SCAKAVLPVMQEHYRLAHIRKPEFMGNTREEEKDPVYRVVKDLPWSEKEINGRLQAYDKL SETVERAASRIPSGRQSAYFELVKYPVQAATQMNRKLLYAQLARHGKADWEKSDLAYDSI VVLTKQYNSLEDEKWNRMMDFQPRKLPVFNRVERETATSPMMKERVAIYKWNGLDGKNIP NGKNTLNARKGTSVICEGLGYESKATGIDKGDALMFSFDNWKTDSVEVDIRLLPNHPVGG DQLRFSISLDDAAPEVISYETKGRSEEWKENVLRNQAIRTVRLPISGKKSHKLVIKALDE GVILDQVMLYMPSPTGE >gi|225935349|gb|ACGA01000043.1| GENE 99 154807 - 155556 504 249 aa, chain - ## HITS:1 COG:jhp0094 KEGG:ns NR:ns ## COG: jhp0094 COG0463 # Protein_GI_number: 15611164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori J99 # 9 187 3 182 260 140 40.0 2e-33 MHSVHPTPKFSIITVTYNAEKVLEDTIQSVISQTYHHIEYIIVDGASKDGTLSIINRYRS RIHTVVSEPDKGLYDAMNKGIALASGDYLCFLNAGDCFHEDDTLQQMVHTINGNELPDVL YGETAIVDKDRHFLRMRRLSAPETLTWKSFKQGMLVCHQAFFPRHTLVGPYNLQYRFSAD FDWCIRIMKKARTLHNTHLTIIDYLEEGMTTRNQKASLKERFRIMAKHYGLIGTVAHHAW FVIRAVTHQ >gi|225935349|gb|ACGA01000043.1| GENE 100 155549 - 156814 758 421 aa, chain - ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 1 416 1 412 417 237 33.0 4e-62 MRVLIINTSERIGGAAIAASRLMESLKNNGIKAKMLVRDKQTDQISVVRLKSNWLQVWKF MWERIVIWSANRFRRYHLFDVDIANTGTDITSLPEFRQADVIHLHWINQGMLSLNDIRKI LTSGKPVVWTMHDMWPCTGICHYARECNNYQQECHDCPYIYKGGGRKDLSYRTFRKKQKL YSYAPIHFVTCSHWLKEQAQTSALFEGKSVTNIPNAINTNLFKPMNKKEARAKFMLPEGK KLVLFGSLKITDKRKGVDYLIGACKLLAEKHPEWKDSLGVVVFGNQSQQLQEQLPFHVYP LPYIKNEHEVVNIYNAVDLFAIPSLEENLPNMIMEAMACGVPCVGFNVGGIPEMIDHLHN GYVAQYKSSEDFANGIHWILTEPEYNELSAQACRKVLGNYSESIVAKKYTDVYNKITGKY A >gi|225935349|gb|ACGA01000043.1| GENE 101 156825 - 157943 835 372 aa, chain - ## HITS:1 COG:no KEGG:BVU_0890 NR:ns ## KEGG: BVU_0890 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.vulgatus # Pathway: not_defined # 1 372 1 371 371 608 78.0 1e-172 MAEKKQAPLNKLLTIYFYHTRLTRESYEEWKEYKFPGHILYGLPLLENYGIRSVMHKCKY FSGRLKLMLYATKEILFCKEKYDVLYATSFRGIEPVIFLRALGLYRKPIVIWHHTAVVTN PKPWREQISRLFYKGIDQMFLFSRKLIQDSQKTRKAPSHKLKLIHWGPDLPFYDHLLAEM PDRKPEGFISTGKENRDVDTLLQAFAATNEKLDLYIAVSCGNINYKKIIDPYALPDSIRI HYTDGVIPYELGKLVARKSCIVICCLDFPYTVGLTTLVEAFALGIPVICSRNPNFEIDID KEGIGITVEYNDVQGWIDAIRYIADHPEEARRMGENARKLAEERFNLEIFSREIAESLLE ISNISSKNRTFA >gi|225935349|gb|ACGA01000043.1| GENE 102 157916 - 158932 753 338 aa, chain - ## HITS:1 COG:no KEGG:BVU_0889 NR:ns ## KEGG: BVU_0889 # Name: not_defined # Def: hemolysin hemolytic protein # Organism: B.vulgatus # Pathway: not_defined # 1 337 1 337 338 607 84.0 1e-172 MKSALVDVPVLILFFNRPQQLSQVFEQVKKARPSRLFLYQDGARNERDLPGIEACREIVS QIDWECEVERLYQEKNFGCDPSEYISQKWAFSKVDKCIVLEDDDVPAVSFFQFCKEMLDK YEHDTRISMIAGFNPEEITQDMPYDYFFTTTFSIWGWASWKRVVDQWDEFYSFLDDSFNM QQLEQLIKERKFRSDFIYMCQRHREQQKAFYETIFHASILFNSGLSIVPTRNMINNLGAT ADSTHFAGSIHTLPKGYRRIFTMKRYEVDFPLKHPRYVIENVAYKESVYRIMGWDHPWIK IGRSFEELFLNLKYGNFFLITKAVKNRINKWLKRNKHH >gi|225935349|gb|ACGA01000043.1| GENE 103 158935 - 159798 587 287 aa, chain - ## HITS:1 COG:SP1274 KEGG:ns NR:ns ## COG: SP1274 COG3475 # Protein_GI_number: 15901134 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # 
Organism: Streptococcus pneumoniae TIGR4 # 8 133 7 128 269 83 34.0 5e-16 MNKKYTAEELDLLHAELYDILGETIRVCQKHNIPYFVIGGTAIGALYDQAILPWDDDIDI GMTRENYNKFLKVAPGELGPSYFLSWIETDPHTPYYFAKVKKNDTLFVEEMFKNVPMHPG IFVDIFPFDKIPDNKLLRRIQSETLGFLKCCLMGKEIWMWKHFGTCEIENPTNRGAFSCF LNRVVDLLFSKKAIYRMLVSVQSCFNSRNTRYYNNVMATADHVTVESIRHLQPVKFGPLT VTAPDDLEGFLRYNYPTLHRFTEEEQEKVNNHYPAALSFSTTPKQEL >gi|225935349|gb|ACGA01000043.1| GENE 104 159878 - 160939 584 353 aa, chain - ## HITS:1 COG:no KEGG:BDI_2786 NR:ns ## KEGG: BDI_2786 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 307 1 313 353 291 45.0 3e-77 MNPNDLATRYRLLNSSFKKTMIYHIGIDAGFFTEYTYMLHAILYCLQHKIQFKLYSDDAN FGWEKGWEDCFAPFCEQVHEPFHHTYNTHRLPSWQALMKDKKLPKTKLLKWKLKVTCKNI IGKTIAFFTYGKPVLLNFQLTFNPNQHFHIPELGIDGDYLHTFQKLTEITWKLNDTTAQE CRQCAADLQLPPQYAGCQIRGGDKITETNLLPPEHYIRLIKEKTALRDVFVLTDDYRLFE QVQTLAPDIHWYTLCSPDEKGYVNSAFTQTTKELKQKQMARFLSSIQILMDASVFIGSIT TGPSLFLLKKFYPDINPADCLLKDFPQASVLPIPGRGQVATEFMQGNSTCKGT >gi|225935349|gb|ACGA01000043.1| GENE 105 160926 - 162383 962 485 aa, chain - ## HITS:1 COG:mll5270 KEGG:ns NR:ns ## COG: mll5270 COG2244 # Protein_GI_number: 13474395 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Mesorhizobium loti # 6 421 75 490 561 159 26.0 1e-38 MSENQSLKEKTAKGLFWGGFSNGIQQLLNLLFGIFLARLLTPSDYGMVGMLAIFSLIASS IQESGFTAALVNKKEVTHNDYNAVFWFNAAISLSLYLLLFLCAPLIADFYDTPELTPLAR YSFIGFFIASLGISHSAYLLRNLMVKQRALSSVIGLTVSGLTGVTLAYFGFSYWGIATQS IVYVTINTACYWYFTRWRPSLQFNFTPIKEMFGFSGKLLVTNVFNHINNNLFSVILGKFY SEKEVGYYNQANKWCGMGQLFISGMINGVAQPVLTKVSDDLERQKRVFRKMLRFTAFVSF PAMLGLGIVSEELIIITITDKWYSSILIMQILCISGAFAPIAYLYQQLIISKGKSRIYMW NTIALGIILLSSVLLAHSYGIFTMLTVYVSINILWLLTWHYFVWQEIGLKLRHALMDILP YAAIAATVMVITYYTTRSIENIYLRLASKIVLAAALYVAAMWGSRSVTFKESIQYFIKKK THEPK >gi|225935349|gb|ACGA01000043.1| GENE 106 162456 - 164495 1949 679 aa, chain - ## HITS:1 COG:PAB2364_1 KEGG:ns NR:ns ## COG: PAB2364_1 COG0143 # 
Protein_GI_number: 14521189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Pyrococcus abyssi # 7 546 3 553 562 558 49.0 1e-158 MEKKFKRTTVTSALPYANGPVHIGHLAGVYVPADIYVRYLRLKKEDVLFIGGSDEHGVPI TIRAKKEGITPQDVVDRYHSLIKKSFAEFGISFDVYSRTTSPTHHQLASDFFKTLYNKGE FIEKTSEQYYDEEAKTFLADRYITGECPHCHSEGAYGDQCEKCGTSLSPTDLINPKSAIS GSKPVMKETKHWYLPLDKHEAWLRKWILEDHKEWRPNVYGQCKSWLDMGLQPRAVSRDLD WGIPVPVEGAEGKVLYVWFDAPIGYISNTKELLPDSWETWWKDPETRLIHFIGKDNIVFH CIVFPAMLKAEGSYILPDNVPSNEFLNLEGDKISTSRNWAVWLHEYLADFPGKQDVLRYV LTANAPETKDNDFTWKDFQARNNNELVAVYGNFVNRAMVLTQKYFDGRVPAQGSLTDYDK ETLKEFADVKAEVEKLLDVFKFRDAQKEAMNLARIGNKYLADTEPWKLAKTDMERVGTIL NISLQLVANLAIAFDPFLPFSSEKLRKMLNMDTFEWSELGRDDLLPVGHQLNKPELLFEK IEDATIEAQVQKLLDTKKANEEANYKANPIRANVAFEDFEKLDIRVGTILECQKVPKADK LLQFKIDDGLETRTIVSGIAKHYQPEELVGKQVCFIANLAPRKLKGIVSEGMILSAENND GSLAVIMPGREVKPGSEVK >gi|225935349|gb|ACGA01000043.1| GENE 107 165393 - 167453 1810 686 aa, chain + ## HITS:1 COG:AF1211 KEGG:ns NR:ns ## COG: AF1211 COG1042 # Protein_GI_number: 11498810 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Archaeoglobus fulgidus # 5 682 3 679 685 321 33.0 3e-87 MITTQLLRPESIVVVGASNNVHKPGGAILKNLINGGYQGELRAVNPKEKEVQGVPAFADV NDLPDTDLAVLAVPASMCPDIVETLASKKQTRAFIILSAGFGEETHEGALLEERILETVN KYGASLIGPNCIGLMNTWHHSVFSQPIPNLNPKGVDLISSSGATAVFILESAVTKGLQFN SVWSVGNAKQIGVEDVLQFMDESFDPEKDSRLKLLYIESIQNPDRLLFHASSLIRKGCKI AAIKAGSSESGSRAASSHTGAIASSDSAVEALFRKAGIVRCFSREELTTVGCVFTLPELK GKNFAIITHAGGPGVMLTDALSKGGLNVPKLEGEIAEELKARLFPGAAVGNPIDILATGT PEHLRLCIDYCEEKLDNIDAMMAIFGTPGLVTMFEMYDVLHEKMQTCKKPIFPILPSINT AGAEVAAFLAKGHVNFADEVTLGTALSRIVNVPKPAVPEIELFGVDVPRIRRIIDSIPEN GYIAPNYVQALLHAAGIPLVDEFVSDNKEEIVAFARRCGFPVVAKVVGPVHKSDVGGVVL NIKSEQHLALEFDRMMQIPDARAIMVQPMLKGTELFIGAKYEEKFGHVVLCGLGGIFVEV LKDVSSGLAPLSYEEAYSMIHSLRAYKIIQGTRGQKGVNEDKFAEIIVRLSTLLRFATEI KEMDINPLLATQKAVVAVDARIRIEK >gi|225935349|gb|ACGA01000043.1| GENE 108 167547 - 171569 2889 1340 aa, chain + ## 
HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 31 762 28 733 740 133 22.0 3e-30 MFMKKVLILLICLIGYQIGCHAQMADEHYYFKNLSVQNGLSQNTVNAILQDRQGFMWFGT KDGLNRYDGLSFRKFKHDDRTRQSIGNNFITALYEDAKGDIWVGTDVGLYIYNPEKDSFR HFAELSEENTKIEHTVTAISGDNKGCVWVAVESQGLFCYDLEEGKLRNHTLKNFSFLTTN IQSFVFDNSGTLWIGCYGDGLFYSKDHLKTLQPYVSPVDNKEFYANDVVTCIVKGAYNCL YVSSLKGGVKELNLTSNKLHDLLFEDETGESVFCRELLVASDNELWIGAESGLYIYNLRT AKYVHLRSSINDPYSLSDNAIYSLCKDREGGIWIGSYFGGVNYYPRFYTYFEKYYPKGTD NGLHGKRVREFCQDNQGILWIGTEDGGLNRFNPKTKTFSFFTPSNAFTNVHGLCLIDNYL WVGTFSKGVKVVDTHTGAIVKTYQKTDSPRSLIDNSVFSICRTTTGDIYLGTLFGLLRYN KQSDDFDRIPELNGRFVYDIKEDSGGNLWLATYANGAYCYNVSEKKWKNYLHDENNPKSL PYDKVLSIFEDSHRQIWLTTQGGGFCRFQPDTDTFANYNLSAGLPNDVVYQIVEDKDGFL WLTTNNGLVCFQPNTGVMKVYTTSNGLLGDQFNYRSSFETEDGTIYLGSIDGFIAFNPKN FSENKFIPSVAITDFFLFGKEVYAGEPGSPLEKSITFSDQLVLQSNQNSFSFRVAALDFQ APKTSRIMYKLEGFDTDWLTVGESPIVTYSNLRYGDYTFRVKVANSDGVWSDDEVILEVH ILPPFYLSIWAYCVYALLIIGCSLYTVMYFKRRSNNKHRRQMEKFEQEKEREVYHAKIDF FTNVAHEIRTPLTLIKGPLENIILKKQVDAETREDLNVMKQNTERLLNLTNQLLDFRKTE SQGFRLNFAKCNVTEVLKETHVRFTSLAKQKGLEFTLQVPEKDFYAHVNREAFTKIISNL LNNGVKYAESYVHISLEVPEADDNNSFRIRTENDGVIIPNEMKEEIFKPFVRFNEKEDGK VTTGTGIGLALSRSLAELHQGTLAMGAGEENNTFCLTLPIVQDMTITLTPEPEVEIDRMN EIPVGETEKKDNRPTVLVVEDNPDMLAFVVRQLSKEYTVLTATNGAEALQVLDGNYVNLV VSDVVMPVMDGFELCKTIKSDLNYSHIPVILLTAKTNIQSKIEGMELGADAYLEKPFSVE YLQACASSLIQNREKLRKAFAQSPFVAANTMALTKADEDFIKKLNEVIQVNYSNPEFSMD DMADSLNMSRSNFYRKIKGVLDLSPNEYLRLERLKKAAQLLKEGENRVNEICYMVGFNSP SYFAKCFQKQFGVLPKDFVS >gi|225935349|gb|ACGA01000043.1| GENE 109 171626 - 172633 616 335 aa, chain + ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 42 272 44 284 313 128 37.0 2e-29 
MKKYVNLFVFVFMIGACLSQGRAQTKSAVSYKNPVVDIAMPDPTVIKAADGYFYVYATES TRNVPIMKSKDLVEWTYCNTAFTNKTRPRFEPKAGIWAPDINYINGKYVLYYAMSVWGGE QTCGIGVATANSPQGPFTDHGKMLRSNEIGVQNSIDPSLLQYEGRNYLVWGSFRGIYVIE LTADGLSIMPGAKKKRIAGTAFEAVYIHKHDDYYYMFASIGSCCQGVKSTYKVVVGRSKK VWGPYKDKTGKPMLKNGYSLVIGANDHFVGNGHGSQIIRDDAGQDWLLYHGFSRSTPENG RILLLDQIKWDQEGWPYVEGGSPSYETQKAPVFNN >gi|225935349|gb|ACGA01000043.1| GENE 110 172673 - 173968 996 431 aa, chain + ## HITS:1 COG:SA0210 KEGG:ns NR:ns ## COG: SA0210 COG0673 # Protein_GI_number: 15925921 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Staphylococcus aureus N315 # 45 249 5 204 359 100 33.0 7e-21 MKHTPISRRDFLKNLGIAGAGTLLAASPWLSAFSEVTNTSNEKCRLAIIGPGSRGRFLMG FLAKNPKVEIVALCDIYKPSIESALELVPNAKVYGDYREVLEDKSIDAILVATPLSSHCK IVLDAFDAGKHVFCEKSIGFTMEECYRMYQKHRSTGKIFFTGQQRLFDPRYIKAMEMIHA GTFGEINAIRTFWNRNGDWRRSVPSPNLERLINWRLYKEFSKGLMTELACHQLQIGSWAL RKIPEKVMGHGAITYWKDGRDVYDNVSCVYVFDDGVKMTFDSVISNKFYGLEEQIMGNLG TVEPEKGKYYFENVAPAPAFLQMVNDWENKVFDSLPFAGTSWAPETANENTGEFIIGERP KSDGTSLLLEAFVEAVITQKQPERIAEEGYYASMLCLLGHQALEEERMLYFPDEYKIDYL NHQSVKTPEAV >gi|225935349|gb|ACGA01000043.1| GENE 111 173965 - 174207 94 80 aa, chain + ## HITS:1 COG:no KEGG:BT_3471 NR:ns ## KEGG: BT_3471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 80 76 58.0 3e-13 MKDTIVTARRKKIELITLLVCFVVSNLIHLYAIIAYHAPFTEMITSIFYIIIFTFVLYAF WGILRLLFNGLRALFTRSRT >gi|225935349|gb|ACGA01000043.1| GENE 112 174234 - 175652 1351 472 aa, chain + ## HITS:1 COG:STM4425 KEGG:ns NR:ns ## COG: STM4425 COG0673 # Protein_GI_number: 16767671 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Salmonella typhimurium LT2 # 41 239 3 189 336 67 27.0 5e-11 MVTRRDFLKTMSMASAGLALGAGDLLHAQTISPKKGRGDKVKIAYIGIGNRGEQIIEDFA RTGMVEVVALCDVDMGAKHTQKIMAKYPKAKQFRDFRQMFDKAGNEFDAVAIATPDHSHF PISMLALASGKHVYVEKPLARTFYEAELLMQAALKRPNLVTQVGNQGHSEANYFQFKAWM 
DAGIIKDVTAITAHMNNPRRWHKWDTNIYKLPSGQQLPKDMDWDTWLGVTPYHEYNKDYH LGQWRCWYDFGMGALGDWGAHILDTAHEFLELGLPYEVTMQYANGHNDYFFPYSSTILFR FPQRKGMPPVDITWYDGLDNLPPIPAGYGVSGLDPNIPSTNQGDTPSAKLNPGKIIYTKD LIFKGGSHGSTLSIIPEEKAKEMAGKLPKVPKSPSNHFENFLLACNGIEKTRSPFEINGV LSQVFSLGVMAQRLNTQLFFDSRTKQITNNEFANAMLAGLPPRKGWDEFYKL >gi|225935349|gb|ACGA01000043.1| GENE 113 175718 - 177064 1450 448 aa, chain + ## HITS:1 COG:no KEGG:BT_3469 NR:ns ## KEGG: BT_3469 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 448 3 449 449 818 91.0 0 MKNIVLTLLLFCAASLGAQNWEPLFNGKNLKGWKKLNGKAEYKIVDGAIVGVSKMGTPNT FLATTKNYGDFILEFDFKIDDGLNSGVQLRSESKKDYKKGRVHGYQFEIDPSKRAWSGGI YDEARRNWLYPLTLNPSAKTAFKNNAWNKARIEAVGNSIRTWINGVPCANIWDDMTPVGF IALQVHAIGNAADEGKTVRWKDIRICTTDVERYQTPEAQAAPEVNLIANTISPSEAKEGW ALLWDGKTTDGWRGAKLSTFPAKGWKIENGILKVMKSGGAESANGGDIVTTRKYKNFILK VDFKITEGANSGVKYFVNPGLNKGEGSAIGCEFQILDDDKHPDAKLGVKGNRKLGSLYDL IPAPKDKPFNKKEFNTATIIVKGNHVEHWLNGVKLIEYDRNNDMWNALVAYSKYKNWPNF GNPEEGNILLQDHGDEVWFKNVKIKELK >gi|225935349|gb|ACGA01000043.1| GENE 114 177208 - 178023 456 271 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 46 251 29 275 319 76 25.0 6e-14 MQTTIQHLYKPSNDGNSLFYAWFLSMHTRVDIILCSRKPEGELLLVVNRIYDALCRLEKM ANFYDPASELSILNRTASVSPVVLSEELYSMIDLCLEYNGRTLGCFDITVHSENYNQNTI HSVHLSAGDRSVAFSQPGVTINLSGFLKGYALETIRGILNEALIENALINMGNSSILALG NHPVGSGWKINDILLHNECLTTSGNDSPSRRHIVSPQNGKLVEGVKQIAVVTANGSIGEI LSTALFAADSEQTKALMGIYAQCRVYIYPSC >gi|225935349|gb|ACGA01000043.1| GENE 115 178209 - 181163 2090 984 aa, chain + ## HITS:1 COG:no KEGG:BT_3514 NR:ns ## KEGG: BT_3514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 980 2 979 982 1134 56.0 0 MKNKILLLGLFCCSLISAKAQVLLDKGTGKNSFPIVSSLTNAVVCFDGKDATVVRKSASL 
FVDDVRRVTGQELKMDESKPGRVSARYAIIAGTIGESGWIDALVSRNKIDTAAIAGGWER YMIEVVNNPVPGIKKAIVVAGSDRRGTAYGLLSISKAIGVSPWYWWADAPIKQQKQVSVK VDKFISKTPSVKFRGIFINDEDWGLYRWSKRNFEKERGNFGPRTYAKVCELLLRLQANYL CPAMHDASMAFHRIPENRLVADSFAIVMGSSHCEPLLFNTASEWKRDKMGEWDYINNKEG VNKVLKSRVDECAPFENVYTLALRGLHDRAMNASNNMDDRKKMLQEALMAQRQMLVDAIG KRAEDIPQAFTPYKEVLDVYDQGLELPDDVTIIWPDDNYGYMKRLSSPKEQKRSGRSGVY YHSSYLGKPHDHLWMNTTSPTLMYEELRKAYDMTADRIWLLNAGDIKSCEFAVDYFLTMA FDIDSFNFERAANYRTEWLCGMLGDDYRTEYQDVINSFYKLAFARKPEFMGWGYQWATDK HGRERNTDTDFSLANYREVDTRLAEYQRIGNVAEKILKALPEDKKACYYQSLYYPVKGCE LLNRMILNGQRNRWYSIQQRATTAELEKMTKACYDSLEVITKGYNSLLGGKWDHVMTMKQ GFAAAYFELPALRKVNLAPAASLGILAEGEDILKGQKSFHSLPCFNTYFRQSYYVDVFNK GATPLKWKASVSDNWILLSQKAGETAMENRIEVSIDWAKVPTGEKVFGTLEIASDRGEKE NVYISVFNPSSPSLAEMDTLFVEHNGYVSIDAAGFHRKVENKAIQMRTIPNLGIENTAIQ LGDPTAAPQRTAGRSTPRLEYDFYTFEQGSVDVYTYVLPTFTLSKDRGYAGHEATNVETK YGVCIDEGPVMNPSTSSFEYAQIWYESVLKNCRINKTTLHIDKPGKHTVKIICGDAGTVL QKIVLDFGGMKRSYLGPQPTRQGK >gi|225935349|gb|ACGA01000043.1| GENE 116 181190 - 182986 882 598 aa, chain + ## HITS:1 COG:no KEGG:BT_2892 NR:ns ## KEGG: BT_2892 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 94 481 65 450 570 110 25.0 2e-22 MKNTFFMLFFSIAISFIACGGSGENDENLLAGNGNTEKPKEDESQEPVVCSITPIGELNQ GTPIINSHADVTKRSSLMMNYRMLVTMESSYPRYPRIKKMKNGDYILFYHNGSANNNIGR RCVYALSKDLKTWSNKGEIFNSYDIIDSKGNKNIHCYANCDGLVLSNGDILAVASCRANS GYRDLPEDAGIELKRSTDNGVTWSEPIKIYQGVNWEPFLLELPTGELHCYFTDSSRTGLE GHGTDTGTAMVVSTDGGKTWKPDFSSSPYYVLRMRWEKNGIVGYNHQMPSVVRLNDNKGL AAAVETNNSGYHISLCYSDKDKWEYLAADQEGPVDSNNCVFSGMGPYLGQLPSGETVLSY ESSSKYTLKIGDATARNFGSAYQPFSGGYWGSFCIIDSHTLVGTNIKAKEGPVQMAQFVL NHRIDAVKRKVTIDGNNKEWANTDHALFVGSKSQAQGTLRCSYDDDNIYFLLEVLDRNLL ASDYASLYVSPVSNNKLSKGACCIQVTMNGLKNCEIYDASWKEAQLDAQVKTYVCNETNE RLIDDYGYIAEIAIPRSKLTITSGQVLVNFSITKRNSLDAICDVASTSTARWIPVTGL >gi|225935349|gb|ACGA01000043.1| GENE 117 183035 - 184147 828 370 aa, chain + ## HITS:1 COG:BS_yteR KEGG:ns NR:ns ## COG: BS_yteR COG4225 # Protein_GI_number: 16080064 # Func_class: R 
General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 61 361 40 358 373 104 26.0 3e-22 MKRNYLIYALLMTVFLSSNVISAQNYFGDFPVKADPKTVGNKLSRRLMETKHQLYFDRGI HYAEVCTWYGALRFAELTNNKELIKLLRNRFELLFHLEKELLPPPIHVDQNMFGCLPLRF YNITKDKRYLDLGLPYADTQWELPANANEEERKWDKKGYSWQTRLWIDDMYMITIVQSEA YKATGDPKYINRAAKEMVLYLDELQHPNGLFYHAPDVPYYWGRGDGWMAVGMTELLYNLP EKDPNRARIMKGYLLMMDNLKKYVNKDGLWNQLITEPDFWTETSGSAMFTYAMIVGVNNG WLDKEEYAPVARRAWLGLCNYINDKNDLTEVCVGTGKKNSKQYYMDRPRIAGDYHGQAPI IWCAYALLNK >gi|225935349|gb|ACGA01000043.1| GENE 118 184214 - 185284 590 356 aa, chain + ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 356 1 357 358 498 67.0 1e-139 MKREFCIFIVLFLVFHIHAQLVYYDASNFPLLGRATESTGARYERFPDSLKNVSRAPLWN LSRNSAGMAIRFRSNSTTIAAKWVTLFNTHMNHMTDTGAKGLDLYCLQKNGDWRFVNSAR PKGKTNQATIIKNMHPEEREYMLYLPLYDGLVSLSIGVDSLATISQPLMNYPIRKNPVVF YGTSILQGGCASRPGMAHTNIISRRLNRECINLGFSGNAFLDLEVAKVIAEVEASVFVLD FVPNASVAKMKERMETFYRIIRGKHPDTPIIFIEDPIFPHTFYDERVAKEVRRKNDTLKE IFNHLKKRNEKNIILISSKNMLGEDGEATVDGIHFTDLGMMRYADLVCPVIKKAIR >gi|225935349|gb|ACGA01000043.1| GENE 119 185650 - 187551 1239 633 aa, chain + ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 330 575 44 289 313 103 30.0 1e-21 MNYNSIKKNIREIGLMFLSAGLLFFLSCGSDDTKDVSGEDGDVNSNQYLQVQPNGLNVIK VSSSDVAQVVYPFSVELKGGSASVALTAQLEAWNEKDLEAYNKEEETAYKLLPSSLYSIS APQITLEQGIASKKVELKFAPDKVFTEFKKNGAEYVIALRLTSSVAKVRKSQSDFLLHIS FDYPTVSLVMPSQEISVSKMSMPVSVDATFNCRADGEIKTNPWNFTCTLAVPSNAEELVA KYNEDYKTSYRLLPSANYDLGEGISFKAGENEATGGITVKREGMEAVKYLLPVQLREASH ESVALHNEICYFKIGMTYTNPVITFSSAADPTVIRTDEGFYLYATQTNSYWIPIYFSKDL VNWEFKRSAFRKATRPTEDVLPGGGAFWAPEIRYINGKYMLYFSWAKWGDGSISYTAVAT 
SDSPVGDFLNAKPLLITDDFGSNCIDQFYYEEDSKKYMFVGSFNGIYVTELTDDGLSVKR GADGKPVLKKQVCGRAFEGTNIYKKGKYYYLFASINNCCPNNGMDSKYKVVVGRSENLLG PYVDRKGKDMLDNSWELVLEGDGETFFGPGHNSIIIPDDAGTDWMIYHSYVKENGAVGGR LGMLDRIVWSADGWPTIKKCVPSKGDLLPVFNN >gi|225935349|gb|ACGA01000043.1| GENE 120 187608 - 189041 1006 477 aa, chain + ## HITS:1 COG:no KEGG:BT_3476 NR:ns ## KEGG: BT_3476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 457 457 368 42.0 1e-100 MKKKSDWRTRFIRLCAVVLPLIVLCFTACKDEDNEENLPFDPTKPVVITDFSPKSGGIGN NIILYGENFGNDPQKLKVIVGGKEANIISVKNNILYCVVPRMATEGDVEISVYDDNGEEV AFAEAEEKFTYVKQWLVSTLAGQRFENEKDAFQGEGAFDACGCIKGATWFSFDPKSNFDH LYLTCYGAGAIRLIDLEEKTVTMVPFLANTADRPAIINWTTDENRDMIISRDLSKDGNVN VLKTRISEFKTEVKLGESKQKGVTGAFVHPKTGKLYYTVYPDQEIYEYNFENKTTTKIAR HPRVKETLRSVVHPTGKYAYLLRQYNERGNGYISRMDYNSTTDQFSTPYIVAGSASGSGY RDGVGSKVQMNGPTQGVFVKNPEYAGEEDEYDFYFCDEYNHCIRILTPTGRVVTFAGRGN DSSDPGFANGSLRSEARFAHPWALAYDEKRKCFYVGERGMKHNGDEQAVIRKIAMEE >gi|225935349|gb|ACGA01000043.1| GENE 121 189065 - 192145 2743 1026 aa, chain + ## HITS:1 COG:no KEGG:BT_3475 NR:ns ## KEGG: BT_3475 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1026 1 1039 1039 1400 66.0 0 MRSKKLIACMLLLLFSITIWAQESAMITGVVTDENKEPLIGVNIAIQNMPGLGVITDING RYKIKASAYDRLVFSYIGYETVVVLVKEQKTINVKMKEAASTIIDEVVITGTGAQKKIAV TGAITNVDVDALKSVPSTSVVDGLAGVVPGVMAMQTSGRPGSVSEFWIRSISTFGANTAA LVLVDGFERDIDEVSVEDIESFTVLKDASATAIYGSKGANGVVLINTKRGKEGKINIDAK VEGFYSTFTKAPEFVDGYTYASMANEARLTRNQEALYSPSELELFRTQLDPDRFPDVDWM DMVLRDGAWSSRATLNMRGGGKTARYFVSGSYQDQQGMYKTDKSLKDYNTNAHFRKWTYR MNVDIDITKTTLLKVGVSGSLRKQNDTGSGTDNLWTVLMGYNSIMMPAEYSDGKIPGWAD KDDNMNPWVMTTQSGYNESWKNNIQTSLTLEQKLDFVTKGLRFVGRFGYDTYNSNWIKRY KSPAAYKADRYRQPDGTLNFTKIRDEKVMSQSSNSEGEKREFFEWELHYSRAFKTHHVGG VLKYTQASKIFTQNIGTDLKNGIPYRNQGIAGRFNYNWNYRYFIDFNFGYNGSENFHKDH RFGFFPAISGAWNLAEEPFMKKIAGKWLNMLKIRYSYGKTGNDALYEGNKRVRFPYLYSI ATDNNVYHFEDIGTGEKLWNGKYYSTVASPAISWEVATKQDLGIDFSFFNDKLTGEVDYF 
KERRDGIYMTRSYIPYEVGIGNASKANVGIVEAKGFDGHFAFKQKIGKVNFTLRGNITYS KNKVIEKDEENTIYEYKLDQGHRVNQARGLIALGLFKDYEEIRNSPKQTFGEVMPGDIKY KDVNGDGVIDSNDRVAIGATTRPNLVYGFGFSAKWKGLDFNAHFQGAGKSSFFVKGTTIF MFQGGDGWGNILKEMAESNRWILGENEDPNAEFPRLTYGDNKNNYQESNYWLRDGSYLRL KTLDIGYTFPKAWVNKIHLNQVRVFFIGTNLLTFSPFKYWDPELGSSDGKKYPLSKTLSL GVSVNL >gi|225935349|gb|ACGA01000043.1| GENE 122 192159 - 194237 1703 692 aa, chain + ## HITS:1 COG:no KEGG:BT_3474 NR:ns ## KEGG: BT_3474 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 692 1 681 681 748 54.0 0 MKKLIINLGFVLGVSTMVVGCTDYLDSDYLFEERVNIENVFQSKDYTNEWLAKGYAYMNH DYLQQVNSKKNTTFNFADDMYYIDLNYVDWKGGNYTEKGLGSGNSLYIWQAAYQAIRHVS IFIHNVDGNKELTEAEITDMKGQAHFLRAYFYWMLIRSFGPVPIVPDEGVDYTKEYDDIA YPRNTYDECADYIAEELVKAATMLDDTRDVQNIVRPTRGAALALRSRVLLYAASPLYNGQ APAEVIDALVDKGGRKLLSDTYDNRKWARAAAAAKDVMNLKQYQLYVAYKRESDDMAYPT TITPPDDNGTFHSQDWPYGWKNIDPFESYRSLFDGEVGAYNNDEIIFTRGVNQGGENIRV MVIHQLPRSQGGGYNCHGMTQKQCDAYYMKDGSDCPGMNSMYKGMDGYTDPSRYNEQPRV SGVVGTSELSKHPELGPKGIGVSKQYADREPRFYASVAYNGSVWHMLNASAEDKEETNVP IFYYRDAQDGYKAGTNYLSTGIGIKKFVHPKDLADSEKSYDVSRVVQKYAPDIRYAEILL NYAEALNEVEGSYEIPSWNGETMYTVTRSATALKEGIQPIRIRAGLPDYEVYDDVKKFRI KVKRERQIELFAEGHRFFDIRRWCDAPIEEALPIYGCNIYLTSENPDSFHTPIEVTSLPT MFTTKMWFWPIHHNIMKRNMWMIQNPGWKDPE >gi|225935349|gb|ACGA01000043.1| GENE 123 194262 - 195329 1099 355 aa, chain + ## HITS:1 COG:no KEGG:BT_3473 NR:ns ## KEGG: BT_3473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 1 333 333 327 50.0 6e-88 MNNICKLLIGMMAAVCCTSCNNEWEDEQFKQLASFKAEINNEGVTSTYVRYKPGGVVTYQ LPVLLSGSTMNTQARTIHVALDKDTLDVLNYERFGERRRELFYHLLDKQYYSIPETVEMS AGESEALLPIDFTLGGQNNAKPLDMSDKYILPLTIQSDPSYNYQVNDRLHYRKALLNIIP FNDYSGSYDGSLLKIFLENQKDNFMLATHKAYVYNEKTIMFYMGVRDVNYLDRKNYKLYV EFTDEPLNEGSELKKKLRLWTDNGGEDGNNFKVVLLGEGDDDRFKEAPYYTIVEENDPVK PYLKHIYITLSMGYSFEDYTLSPGNRMKYRVEGTLSMQRDLNTLIPDEDQQIEWN >gi|225935349|gb|ACGA01000043.1| GENE 124 195481 - 197358 1219 625 aa, chain + ## HITS:1 
COG:no KEGG:no NR:gi|260173057|ref|ZP_05759469.1| ## NR: gi|260173057|ref|ZP_05759469.1| hypothetical protein BacD2_14393 [Bacteroides sp. D2] # 1 625 1 625 625 1239 100.0 0 MKNYKILKNIYICLALLSTVLLVGCSKSDENPTPDDPQPSGKGKITFNLHTPGAFLTKAV GVGNENGGTNDKITFYQFSKEGKYERRYVLDYAAASVGANGDTRSYTVELASSTGGEKRF IIVESENESNFPNLGVTNTIDELLNSKTVAESGKLNPPFVMSNVKTDGKEYVTVANVESS DNQVDVKLKRRVARFDLLNDPAESGLVIDKVYIKNRYTQGFIGDVSGNAGNISSDVLEIP AADLANDGKSFYLYPTQLTSTLEQNEKTVVWATTKLANTNTEGPTLYLNLTADINVEANY LYQLNTKKIAGSTGFDISVEEWKDGTSIDWVTIEDGISVMDNKATIIAGTDIKGTYVKIA ADAAMPYTIKRVVTDYSSAALDVAYDGSLPAWLTISSSTKDAGSGLYRHEIIYTVTAHPA KTSQFAITHLNGAVSDENLLVIGFANPYPGTPLPCLSRGEKLYSPVHCRQSTYLVHNTVD KAYFCGIEGFTFNTPMKGKTGNEGAKNLCPEGWTALNNVEAKEYILWVGEHLKSQVMESV YTYCWSEADDEATDLRILAGYPYNKATPNVKDLACYGTWPSVAWLNINEKDLSIKDKTFY DYNNWDVKKDYGIPYRCIRDKAGWE >gi|225935349|gb|ACGA01000043.1| GENE 125 197410 - 199137 1065 575 aa, chain + ## HITS:1 COG:no KEGG:BDI_1871 NR:ns ## KEGG: BDI_1871 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 298 573 25 304 304 172 36.0 5e-41 MLKNTLFVILLMISSLFTACAEGYVSDVQKGDDTKEIRFSLNMEGGLTMSSTRASVSLDG MKWKIFCFDDQYNYLFDKTGSIGDVANEIKVSVTKGVVYRFLFLCTTADKFPELTSGKTY WDLEAYAPQLPLADPMAMLVSRGNEKDGTIRVAAASASVQVTLAPRASKIVLQKDPQTAS NITVNSVIFADAASSVPYTHIEPQHYSEYENLPVATRKTYQCVPQEDVCYMLPDMCAGTF GVNATLHITHPISGEQDVRVTVPVGLALNVGSGKTYYIEMSTDANGKVAATWATRVAPKT LKLATQNLWGKSTSVVLDYFNRIDVDVLCAQECSGLSESDIQAQGLYVHTHSNNGQGKCS IISRYPFSGITPNKYGAYIDLGEGIVVLVMNCHGAYFPYGPYQLNGIEYKDFPATDDVDY VVKVNKEARQGMVDKLLEDFNSSTTPFVCLSGDFNEPSWLDWTEGALSAGLAPYVVQWPT TRSLWEGGIKGDAYRTIHPDPVTHPGFTWTPRPSKKDTKDRLDLTLYTLSPNTEVKSCQV IGENTETSDIVLPNWGPFENVFDHRGLRTEFVFTK >gi|225935349|gb|ACGA01000043.1| GENE 126 199469 - 202180 2067 903 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 79 743 25 681 785 167 26.0 9e-41 
MHNIAKTFFPILLLLFAAVVTAQNEKNNVGNAQRIWLDSEIGHQGDYQWKMIKAGDATDP GEKISLSNYATTNWMPAIVPGTVLNSLVYNQKYPEPYYGINNKIESKLIPDISETGRDFY TYWFRTDFTVPQSFKGKTVWLQLDGINYRAEVWVNGNLLSTMNGMFIQDYIDVTDFVKVG EKNGLAIKVYPVDVPGSAKPKSWGAAGEFHNGGNGNIGLNTTMLMTVGWDFTFMDGIRDR NTGIWKNISLYATGRVALRHPFVKSELRKPDYDQARETVSVEIINPSTSNRVISCKVKGE IEGENITFEKTFRLMRGEEKTATFSPEEFPQLIINSPKLWWPVNKGPQNLYDLKLTVSVD GKECDSVKTRFGIREIVSDRKTPDKSRVFYVNGKRLFIRGTNWIPEAMLRTSDERTYAEL RYSRQSGVNLLRMWGGGIAESDYFFQLCDELGLLVWQEFWMTGDTRHPHDKALYMSNVES TVKRIRNHPSLAYYVASNESTEVTGTPELLNKLDGTRGFQMQSECEGVHDGSPYKQVNPM QHYENTASPRGSRVDGFNPEYGAPTLPTVEILREMMDEKDLWPINKEVWDYLDGNGFHLM STMYTDLVNNYGKSSSIDEFAQKGQLLGAINSKSIWEVWNYNKLDYGDRFCSGLLFWYHN CSMPQVASRMWDWSLEPTASLYHTANSLEPLHAQFDYLKNTVSVVNDYYREFKDYKVIAQ VYDINSKKVFEESAVVNLPSDGVVNDALTIRFPENISQVHFIKLILKDEKGKDVSSNFYW RSNDKYEGSKTLTGPAASGFEDLSKLKQAKVKLTYKVREDDNNYFVDITLRNTSGQIAFF NQLQFLNSKMSPIRPSFYTDNFFSLMPGEKKTVTIETAKGKLKDGVVLALKGWNINKQEY KLK >gi|225935349|gb|ACGA01000043.1| GENE 127 202284 - 203429 764 381 aa, chain + ## HITS:1 COG:PAB1622 KEGG:ns NR:ns ## COG: PAB1622 COG2152 # Protein_GI_number: 14521331 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus abyssi # 73 371 22 285 305 149 35.0 9e-36 MKLKFIIGVVGILSACGFTNAPSLDTTQEEQDNNWVIGPFHRPEGVNPVISPQPTEFYCP MRKQQVKWEESDTFNPAATTKDGKIVVLYRAEDNSAQGIGKRTSRVGYAESKDGIEMKRM GSPVLFPAEDNFKDQDWPGGCEDPRVAMTEDGLYVMLYTAWNRKKARLAVATSRDLKNWT KHGLAFDKAYNGRFNDLFCKSGSILTRLKGNQLVIDKVDGKYLMYWGEYAVYAATSDNLI DWYPVLDEKNELMKIIQPRKGHFDSLLTECGPPAVRTKHGIVLMYNGKNSGKTGDADYPA NAYCAGQLLLDGDDPYKVLDRLDKPFFAPEASFEKSGQYKDGTVFIEGLAYHKRKLYLYY GCADSRVAVAVCDDVKKLKIK >gi|225935349|gb|ACGA01000043.1| GENE 128 203444 - 206374 2103 976 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 469 845 302 698 785 100 25.0 2e-20 
MRIKLNTYIACIAIFLCSSIECIYSQKLKWNFGLSRKQSSEVIPHSKYSFQTAKLKEKVI EPGELSPLSDWEYQLSKGWEMIEGYKARAAEKSVVAQDLDTSEWYSATVPGTVLTTLVDQ GVYPDPYWGLNNLLIPDTLCRMDWWYRNSFSIPRNKKGEKVKLILNGINYKAEIWFNRQL LGTMTGAFERGIFDITPWVDYDKKNLLAIRILPPNNPGIPHEANKKEGIGPNGGALCLDG PTFISSEGWDWIPGIRDRNMGIWQDVRIKFGNELEIVDTHVITDLPLPDTTSVNFIVQTE IYNSSKTTRTANLHFNIGGVSAVYPVSLNANEKRMIKLTSNECKELQMKNPRLWWPNGYG EQYLYDASLSLISSGKDTLDVKKMRIGIRELEYELSAYEDNSPIVRLNYNPTAALQDGKP AFDTVKRKKTDNKVRYTNYDGEFVPYLLKPVSSQGIELIKDSLMKEYMVIKVNGQRIFCK GGNWGMDDGMKRVSRERLEPALKLHKNMNYNMIRNWTGESTEEVFYELCDEYGMLVMNDF WLSTDGFNLNPLDNCLFVRNVTETVRRFRNHPSIALWCARNEGFATNELEYMLAATLAKE DGSRHYTGNSRSLNSSGSGPWRYQFDAGWYYRSLAGGFRSEVGTPSLPTAETVREFMAEE DTWPISDVWYYHDWHNHRYGSKTFSELYKEGMDRKLGPSDNLDDFCRKAQLINYESHRAI FEAWNSKMWNDASGVLLWMSHPAWPSMVWQNYSSNGETAGAYYGTQKACRPLHIQMSLNS QHKVDIINTTLKEYRNLKVGVTVYDKVGKKIRSSQHKVGQVAANVLTSVTQLEDIKGLPD FYWVKLVLTTNSGKLLDDNVYWLNQKEWDGKVLANLPVSSVNVSVKNFTKKANHYEGRLI VKNPGQYLAAAIALTLRDAKTDAAVRPAYFDDGYFYLMPGESKEIRFQVDCKEGVDKLLL RVDGYNVESKDILLNK >gi|225935349|gb|ACGA01000043.1| GENE 129 206388 - 207491 1014 367 aa, chain + ## HITS:1 COG:BS_yteR KEGG:ns NR:ns ## COG: BS_yteR COG4225 # Protein_GI_number: 16080064 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Bacillus subtilis # 143 358 123 358 373 105 29.0 2e-22 MYKKVLILLFFIPFQLLFAQKSMLKDFPEGSTPEEIGKRLAYRFLTEKHALHIGKWIGYP ETFYWNGALRYAYLAKDKELINRLQDKFEPLFTEEKALQPIMNHVDVNMFGSLPLELYLV TKDKKYWYLGLPYADSQWKVPANAKPEEKEWAEKGYSWQTRLWIDDMYMITIVQTRAYKV TKDKKYIDRAAKEMVMYLDELQRPNGLFYHAPDVPYYWGRGNGWMAAGMTELLQCLPKNH QDRPRILEGYRTMMESLKKYQNPEGLWNQLIDQSDCWTETSGSAMFTFAFIKGVKNGWLN AEEYAPAARKAWIALVSYLNDKNQVREICVGTNKKNSKQYYYDRPRRTGDYHGQGPYLWC AAALMEK >gi|225935349|gb|ACGA01000043.1| GENE 130 207667 - 209700 1347 677 aa, chain + ## HITS:1 COG:no KEGG:BT_2899 NR:ns ## KEGG: BT_2899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 672 1 672 677 1148 80.0 0 MSLKYLLCGIAFVACATIVEAFDNVPESISASISLKVPGNGAKHYPLTFQKSQNNNGTYY LENSEKMPLTVSQNIVTGNDGLRISVSITALEDIYFNYNQQMATGFRHDDCLFYMPGFWY RRNLRSPKEAPSFHTSDSWIVSEDRLSAPLTGIFCEKKQRFMTVNRLDKFVNNTLATHRE GEVILSDKTSLGYTGFENKDGVATLSFGFPYREAPKSYIRKLTLAPAVTAYQHLKKGETI LLTWQITEGEAEDYSDFVRHTWEYCYDTYSPKPVDTPYSIEYMKQTLSQFFVSSFVDKYP LVYNSGIHLRTDACTSNGQAEVGFIGRVLLNAFNAWEYGWECNREDLKANSTKIFDSYLK NGFTEAGFFKESVNYDKNFEDPVHSIRRQSEGLYAIFHFLAYEKEKGRKHPEWEQRLKNM LNMFLQLQNADGSFPRKFRDDFSIVDKSGGSTPSATLPLVMGYKYFKDKRYLDSAKRTAD YLEKELISKADYFSSTLDANCEDKEASLYAATATYYLSLVTKGEEHKHYADLTKQAAYFA LSWYYLWDVPFAPGQMLGDIGLKTRGWGNVSVENNHIDVFVFEFADVLRWLSNEYKESRF SNFAEVISTSMRQLLPYEGHMCGIAKVGYYPEVVQHTSWDYGKNGKGYYNDIFAPGWTVA SLWELFTPGRAEIFMER >gi|225935349|gb|ACGA01000043.1| GENE 131 209720 - 211111 1012 463 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_3408 NR:ns ## KEGG: GYMC10_3408 # Name: not_defined # Def: fumarate reductase/succinate dehydrogenase flavoprotein domain protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 31 458 2 438 458 269 40.0 1e-70 MTSRRDFLKKAGLISAAFAVPALNVNGNTTQSKMRLKNTKIAVDDRWDVIVIGGGPAGCT AAISAAREGAKTLLIEAMGQLGGMGTAGMVPAWCPFSDGEKIIYRGLAEKIFEASRKGVP HERKQKLNWVNINPEYLMQVYDRMVAESGAKVLFFSRVAAVEMSADDTVDAIIVANKSGL VAFKAKVFIDATGDGDVATWAGASFKKGGEDGVLQSSTLCFSFANVDSYNYNLIGPSLHT SNKNSPIFDAIKSGKYPLIGRHFNSNLIGPDVVQFNAGHIDNVDSTDPWATTRAMATGRQ IAEQYLEALKELQPKTFGSAFVVKTASLLGVRDSRRIEGDYTFTFQDWLERKTFEDEIGR NCYYIDVHKPGHKETRYKKGESHGIPYRCLTPKGLKNLLTAGRCISTDEEAFGSLRVMPP CLVTGEAAGMAAVHAIKQTKNDVHKIDTVHLRKRLKEEGQYFL >gi|225935349|gb|ACGA01000043.1| GENE 132 211589 - 213142 1294 517 aa, chain - ## HITS:1 COG:YPO0256_1 KEGG:ns NR:ns ## COG: YPO0256_1 COG0642 # Protein_GI_number: 16120593 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Yersinia pestis # 233 515 350 612 693 83 27.0 1e-15 MRKSIIAILMTFCLSIMTYAQTPRDRATELKEQAQSSLNQKDYIKARYLFKKAYEAFATR GNYPQAIECGVQANALYVRENFYKEGFELCRDMDQLIWTGEQNQKKVFYDLRFLVNKERL 
RMYTALKNPAQAKTQLNKLEETANLAKNDSLTEVLLYTKANYYYTFNQNTQGDACFRKLL NQYKEKKDYAKVNDCYKNLIDIARRANNAPLMERTYESYIVWTDSVKALTAQDELNVLKR KYDESLLTIQEKDDTLSAKQYIIIGLCILVVILVAGLVILAAILLKFIAGNRKLKKSVKI ANEHNELKTKFIQNISSQMEPTLNTLTTSANELSQKAPQEASQMQGQVAALKKFSDDIQE LSSLENSLTELYELGEINVGTFCENVMDKVKEHIKIDVTPSVNAPKLQVKTNKEQLERIL LYLLKNAAFYTEQGRISLEFKKRGAHTHQFIVTDTGTGIPAEQQENLFKPFTEVKDLTTG DGLGLPICSLIATKMNGSLTLDTGYTKGARFILELHV >gi|225935349|gb|ACGA01000043.1| GENE 133 213240 - 214778 1015 512 aa, chain - ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 507 1 507 509 533 52.0 1e-151 MLIKVFGAAVQGIDATLITIEVNSSRGCMFYLVGLPDSAVKESHQRIISALLVNGYKMPT SNIVVNMAPADIRKEGSAYDLPLAIGLLGANETISSEKFSRYLLMGELSLDGSIQPIKGA LPIAIKAREDGFEGLIIPQQNAREAAVVNQLKVYGVSNIKEVIEFFNNERELEPTVVNTR EEFYQQQTNCDLDFADVKGQENVKRALEVAAAGGHNLIMVGAPGSGKSMMAKRLPSILPP LSLGESLETTKIHSVAGQLKRGSSLISQRPFRDPHHTISQVAMVGGGSFPQPGEISLAHN GVLFLDELPEFNRGVLEVLRQPLEDRQITISRIKSTISYPANLMLIASMNPCPCGYYNHP TKTCVCSPGQVQKYLNKISGPLLDRIDIQIEIVPVPFDKISDQRQGESSNLIRQRVIKAR QMQEKRYTEYTGIYCNAQMNSKLLAMYAQPDAKGLALLKNAMERLNLSARAYDRILKVAR TIADLEGAEQILPNHLAEAISYRNLDRENWAG >gi|225935349|gb|ACGA01000043.1| GENE 134 214792 - 215862 776 356 aa, chain - ## HITS:1 COG:no KEGG:BT_2845 NR:ns ## KEGG: BT_2845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 344 13 360 371 444 70.0 1e-123 MKKSSILILIIICLCSCGKSSKKVSITGEIKGLGTDTLYLYGMDESFDRIDTIFAKNDKF SYTASIDTITSAFLLIDNQTEYPIFLDKGNQIKIKGDINHPEYLDINGNIYNEEFTTFQK ELNSLPTPSEKDLEQKAEEFIMKHHSSFVSLYLLDKYFVQKDSPDFNKIKKLIEVMTGIL QDKLYIEQLNEAISLSEKTETGKYAPFFSLSNIKGEKITRSSEDFKKKNLLINFWALWGD SISNHQSNTELKEIYQKYKKNKHIAMLGISLDMDKQEWQDAIKRDTLNWEQVCDFGGLNS EVAKQYAIKQVPSNILLSADGKILAKNLKGEQLKKKIEEVVTAAEEKEKQDNKKKK >gi|225935349|gb|ACGA01000043.1| GENE 135 215976 - 217670 1723 564 aa, chain - ## 
HITS:1 COG:no KEGG:BT_2844 NR:ns ## KEGG: BT_2844 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 564 1 553 553 912 92.0 0 MTKKLYLPLLMAMVVALFSSCSKKMGELSADYFTVTPQVLEAVGGKVPATINGKFPEKYF NKKAVVEVTPVLKWNGGEAKGQPATFQGEKVEGNDQTISYKMGGSYTMKTSFDYVPEMAK SELYLEFKATIGKKVVTIPAVKIADGVISTSELVNNTLGNANPALGEDAFQRIIKEKHDA NIMFLIQQANIRSSELKTAKEFNKEVANVNEAANKKISNIEVSAYASPDGGVSLNTTLAE NREGNTTKMLSKDLKKAKIDAPIDAKYTAQDWEGFQELVSKSNIQDKELILRVIAMYQDP AQRESEIKNISAVYKELANTILPQLRRSRLTLNYEIIGKSDEEIAKLASSNPSELNVEEL LYAATLTSDPAKQEVIYTQATKQFPNDYRGYNNLGKLAYQAGNIDKAESYFKKAASVNAT PEVNMNLGLISLMKGDKAAAEAYFGKAAGTKELGESMGNLYIAQGQYERAVNSFGDSKTN SAALAQILAKDYNKAKNTLANVERPDAYTDYLMAVLGARTNNSSMVTSSLKSAVAKDSSL AKKAATDLEFAKFFTNADFMNIIK >gi|225935349|gb|ACGA01000043.1| GENE 136 217806 - 218765 601 319 aa, chain - ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 14 306 2 294 295 216 44.0 4e-56 MEINEKKHKKEQQELIIRKYQQYLKLEKSLSLNTLDAYLTDLDKLMSFLTLEGINVLDVC LSDLQRFAAGLHDIGIHPRSQARILSGIKSFFRFLILENYLEADPSELLEGPKIGFKLPE VLTVEEIDRIISAVDRSKAEGQRNRAILETLYSCGLRVSELITLKLSDLYFDEGFIKVEG KGSKQRLVPISPRAINEIKLYITDRNQIEVKKGHEDFVFVSQRRGKGLSRIMIFHMIKEL AQKAGITKNISPHTFRHSFATHLLEGGANLRAIQCMLGHESIATTEIYTHIDRNMLRSEI IEHHPRNIKYRKEKEELFH >gi|225935349|gb|ACGA01000043.1| GENE 137 218828 - 219247 255 139 aa, chain + ## HITS:1 COG:sll1112 KEGG:ns NR:ns ## COG: sll1112 COG0757 # Protein_GI_number: 16329990 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Synechocystis # 2 138 6 144 152 156 55.0 1e-38 MKIQIINGPNINLLGKREPSIYGSVTFEDYLADLRKRYVDVEIDYFQSNIEGEMIDCIQQ VGFDVDGIILNAGAYTHTSIALQDAIRSVTSPVIEVHISNVHSRESFRHVSMIACACKGV ICGFGLNSYRLALEALLDR >gi|225935349|gb|ACGA01000043.1| GENE 138 219284 - 220741 1690 485 aa, chain + ## HITS:1 COG:BB0348 KEGG:ns NR:ns ## 
COG: BB0348 COG0469 # Protein_GI_number: 15594693 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Borrelia burgdorferi # 1 471 1 473 477 389 47.0 1e-108 MLLKQTKIVASISDRRCDVDFIKQLFEAGMNVVRMNTAHASREGFEALIANVRAVSNRIA ILMDTKGPEVRTTANAEPIPYKIGEKVKIVGNPDLETTRECIAVSYPNFVSDLNIGGTIL IDDGDLELEVIDKTDGYLLCEVKNDATLGSRKSVNVPGVRINLPSLTEKDRNNILYAIEK DIDFIAHSFVRNRQDVLDIREILDAHNSDIKIIAKIENQEGVDNIDEILEVADGVMVARG DLGIEVPQERIPGIQRVLIRKCILAKKPVIVATQMLHTMINNPRPTRAEVTDIANAIYYR TDALMLSGETAYGKYPVEAVKTMTKVAAQAEKDKLEENDIRIPLDENSNDVTAFLAKQAV KATSKLKIRAIITDSYSGRTARNLAAFRGKYPVLAICYKEKTMRHLALSYGVEAIYMPEL ANGQQYYFAALRRLLKEGRLQPSDMVGYLSSGKAGTKTSFLEINVVEDALKHAEETVLPN NNRYL >gi|225935349|gb|ACGA01000043.1| GENE 139 220762 - 221418 587 218 aa, chain + ## HITS:1 COG:aq_1507 KEGG:ns NR:ns ## COG: aq_1507 COG4122 # Protein_GI_number: 15606661 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Aquifex aeolicus # 5 211 7 211 212 145 38.0 4e-35 MQETESIDEYILQHIDPESDYLKSLYRDTHVKLLRPRMASGHLQGRMLKMFVEMIQPRQI LEIGTYSGYSALCLAEGLPEGGLLHTFEINDEQEDFTRPWLEKSPFADKIRFYIGDALEL VPRLGVTFDMAFIDGDKRKYIEYYEMTLAYLSEGGYIIADNTLWDGHVLEQPRNTDAQTI GIKAFNDLVAQDVRVEKVILPLRDGLTIIRKKIVSSKS >gi|225935349|gb|ACGA01000043.1| GENE 140 221434 - 221766 306 110 aa, chain + ## HITS:1 COG:BS_rbfA KEGG:ns NR:ns ## COG: BS_rbfA COG0858 # Protein_GI_number: 16078728 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus subtilis # 3 109 2 107 117 63 37.0 6e-11 METTRQNKISRLLQKELSEIFLLQTKSMPGTLVSVSAVRISPDMSIARVYLSVFPSEKAE EMVKNINNNMKSIRYELGTRVRHQLRIIPELKFFVDDSLDYIEKIDSLLK >gi|225935349|gb|ACGA01000043.1| GENE 141 221937 - 223010 835 357 aa, chain + ## HITS:1 COG:HI1548 KEGG:ns NR:ns ## COG: HI1548 COG4591 # Protein_GI_number: 16273448 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # 
Organism: Haemophilus influenzae # 37 354 91 406 416 93 23.0 5e-19 MVASFFTAFDPQLKITVREGKVFDAQDKRIRAVCALPEVEVFTETLEENAMVQYKDRQAM VVLKGVEDNFEELTAIDSILYGAGEFVLHDSIVNYGVMGVELVATLGTGLEFVDPLQVYL PKRNAKVNMANPGASFNRDYLYSPGVVFVVNQQEYDGKYILTSLDFLRELLDYTTEVSAM ELKLKTNVNISSVQSKIENILGDDFVVQNRYQQQADVFRIMEIEKLISYLFLTFILMIAC FNVIGSLSMLILDKKDDVVTLRSLGASDKLISRIFLFEGRLISLFGAISGIVLGLILCFI QQKFGIISLGGGGGTFVVDAYPVSVHAWDIVLIFITVLAVGFLSVWYPVRYLSKRLL >gi|225935349|gb|ACGA01000043.1| GENE 142 223156 - 223707 364 183 aa, chain + ## HITS:1 COG:DR0180 KEGG:ns NR:ns ## COG: DR0180 COG1595 # Protein_GI_number: 15805216 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Deinococcus radiodurans # 9 176 52 222 229 70 23.0 1e-12 MKLENNEISRIVDGDEIAFNRFMEHYSSRLYHYTFALLGQKESAEEIVSDVFFEVWKNRK SLAEIGNMNAWIQTITYRKAISFLRKETGKYELSFDDIEDFTFEPVQSPAEEMISMEEMA KINDAIQQLPPKCKHVFFLAKIDGLPYKDIAEMLNISVKTINNHIAFALDEISKRLNIKS RKN >gi|225935349|gb|ACGA01000043.1| GENE 143 223859 - 224974 713 371 aa, chain + ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 170 369 70 267 274 79 26.0 2e-14 MSLKEYHRLADLFSKLNSQQKIEDDFTSMPDGERLRFFWENCTQEKIDPSYIVERTRQKM RKDAMKRRKKYFLVASASIAASFLICFSTIYFLSQNGTTNLDFQAIAEEMDSRSVQEVTL ITSKEQLNLDENAFIKYSKEGKVAVNSQVIKEKEEKIKEEQEYNQLLVPAGKRARVELSD GTRLIVNSQSKVIYPRRFEGDIRKIYAQGEVFLEVAHDKKHPFIVESDDFKLQVLGTKFN ISNYKGGATNIVLVEGAVEVTDKNEKKARLNPNDLLNIANGTIAYQKQVDVAEYISWVEG IMLLNGNDLSQIIQRLSIYYGIPIQCDPIIGKEKVYGKLDLKDDIDEVIECIQQTLPFTI EKSDTSIYLNK >gi|225935349|gb|ACGA01000043.1| GENE 144 225128 - 228517 2512 1129 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4655 NR:ns ## KEGG: ZPR_4655 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 128 1129 3 997 997 
1065 54.0 0 MVKKTIFLHLKKRLKLFCIINVFFMIPFSSFGTNSISWNSEEDLFSVQYENATVKDILDY IEKHSKYIFIYSANVQKNLNNKVSISVSNKKIDAVLKELFSEIGLNYKMSGRQITISVPD APKIQQTTQQKGIKVTGNVSDEKGEPLIGVTIILKNDSTVHALTDMNGNYSITVPERKSV LSFRYIGFVPKEEVVNNRKVVNVQMVEDVGQLDEVVVVAYGAQKKESVVGSITTIEPAKL KVSTTRSISNNLAGTVAGVLAVQRSGEPGYDNSSFWIRGISTFQDAGQNPLVLIDGIERD LNNIDPEEIESFSVLKDAAASAVYGVRGANGVILINTKRGQVGKPRVTVKAEFAATQPVK LPEYLGAADYMQVLDDILIDTGQQAKYTDRIAKTRAGYDPDLYPDVNWMDAIANDYASNQ RVTVNISGGTETLRYSFVAAAYNERGILKRDKSYDWDPTIKLQRYNVRSNVDLKLSPTTQ LRFNIGGYLQDRNSTTKDISQIFQKAFVAVPHAFPAQYSSGQIPTTEEPNVWAWATQSGY KRRSDSKIETLFSVEQDLKFFLPGLKVKGTFSFDRFSSGTVSRGKTPDYYVPATGRDDEG NLIIASKSNGTNFLDYSKSGDYGNKSVYMEATLSYDRTFADKHSVAAMLLFNRRNYDDGS KLPYRNQGLAGRASYTYAGKYVGEFNFGYNGSENFAKGKRYGFFPSGAIGWIVSEEAFMQ PLRKVISKLKLRASYGQVGNANLGGRRFAYLSTITDDYDTLNMYKWGLDSSYGLTGMAEG EFAVQDLTWEIVNKMNLGVELGFLNGMIDLQLDYFDERRKDIFMPRESVPMTAGFMKQPW KNFGKVTNQGVEVSLNVNKQFGKDLFVSLMGTFTYAHNEITEKDEPSAVVGTNRAETGHP VGQLTGYIAEGLFTEDDFEDVSTGKLKKGIPTQSFVSKLRPGDIRYRDVNGDGKVDVFDK SPIGGTKDPEIVYGFGLNMKYKNLDFGALFQGIGRSWNILGSSIIPGANRGVTGNMFTNA NDRWTVDNPSQNVFYPRLDDGINSNNNQPSTWWLRNMSFLRLKNIELGYSLPKSLWRNTT VISGIRLFVRGTNLLTFSKFDLWDPEVENTTGAAYPIMKSLSAGFEIKF >gi|225935349|gb|ACGA01000043.1| GENE 145 228529 - 230475 1469 648 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4656 NR:ns ## KEGG: ZPR_4656 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 1 643 1 615 615 508 45.0 1e-142 MKNSILKFIVLSIITTFAFSSCSDYLDKQPDDQLDLESVFENKKNMERWLAYIYRGLPEY YTYDGPDAIADELIPSVGWEAQGFKAIQYQKGNWTADNPGVITYWNTYPKYIRSAYLFIK HAHPLEAVPAEEVDFMKAECRFFIAYYNSMMAIAYGSVPIIREASESTSADDLMLKQEPF YNVIDWADQEMLEASKQLPATQTEDKKYGRVTSVGCLAMRARMLLFAASDLVNGNPALAN IKNIDGTPIFNSAHDPERWKRAVDACKLVIDEAEAAGYHLHYEYLDNGDIDPFLSYQNAV MKRWNEANRELLFVRTMDSGGWYDKNCIPRGLGINGVGAIGITQSLVDAFFMRNGQRPIV GYNSDGSPKINPDANYSEAGFSTQKETYSTKWQYGSSEGDRNKDENVVVDANTYKMYCDR EPRFYISVLHNEQWHIGGKRNTDFYMDGKDGGPSHDAPWSGYLVRKRVDPSANPKEGSGD YKNRHGALCRLAEVYLSYAEALNEYSIEKGTYTANQKEILRYVNLIRERAGIPEYSVSSE 
EGKITAPSDPVEMRELIRQERRVELNCESGLRFNDLRRWKLAENVLDGDFYGMNAYIKVS DADYRNKYYTRTVYQTRKFISYWWPIPQDDIDKNWNLVQTPDWTVGNQ >gi|225935349|gb|ACGA01000043.1| GENE 146 230508 - 232148 1173 546 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173079|ref|ZP_05759491.1| ## NR: gi|260173079|ref|ZP_05759491.1| hypothetical protein BacD2_14503 [Bacteroides sp. D2] # 1 546 5 550 550 1088 100.0 0 MKNGYIFKKNLLRFFFLTLASCFMVACNDDDDATNLLDDRTSLVLTPSDYEIELKEDTPD EVALTLNWTEAAPIGTDYYISYLYKMDLGNNSFGTNTMIREYIDDDFSKSYTHKELQNLL VSKWKQHPGDLVKLQARIIGSVEGPKFVKPEVSTVTIKVKMYSEKTFVADHLYMSGTAVN GEDIEILPMESQPKRYVSICDLKAGNLHFPIVWKDENKINAISPVAAEQQVTDGAMEAKI KGIDNAGYWVIPEDGQYRVVVDFEARTVTIGLASNFIEADKIYIAGTCVSEDVEMTRTIE DENQYAFHAELQEGTIYFPILFNGQKDMAIAPEESGDFTDGTAMNISTMSPEAAALAYHW NIKTAGVYRIVININTKKITIYSPETDPKPMVVTWTWNNNTVTTTIERVFIWGPYDGWAK DGTGDTGFTMAHSMAPSLANPYLFIYKGEELPRKNSIKDKDGNAHPGGLNFKVGPQSAGC YTFGSTADAIRGSYDGCLDIAESDYNQKQTVVGGQANNRYAFFSVPVGVNYIELNIKELT VFFDKR >gi|225935349|gb|ACGA01000043.1| GENE 147 232247 - 233809 1070 520 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 95 511 8 410 417 196 30.0 2e-48 MVVRKISYYRVIVSLCLFVFLLSPAFGDENSAISLDKELSENSIVTIQPDSMRILHNPLT GWVLYASMGVDAADFWAQYDHIYIPELGHNVSVTDYAHTLYIRASWTDFNPQEDVYGWKI DSNLRAYIEGAYQRNMRLAFRVVVDSRDKRTEFTPQFVKDAGAKGFMNKGKWSPYSDDPV FQKYYTKFVKALAADFNNPSRVEFIDGFGLGKWGEYHTMIYSTGDDTPKKAVFDWVTDIY SQAFDKVPVVINYHRWIGAGKDWVDDEHFAADSEEMLEKAIKKGYSLRHDAFGMTTYYGS WERRFAKKYRNICPIIMEGGWIVKSHSYWQDPRGYRKNSDEDVRRGEFDDSKEAHVNMMD FRFGDTESWFKDAFDLVRRFLREGGYRLYPSEISLPLEVSEKATPTIKHCWNNLGWGYCP TNIPQWNQKYKVAFALLDSETNEVRYTFVDMNTDLSKWIKGSPTEYVFKPNMSKVEKGTY VWAVGLVDRSNKELIGLDIAAKGDILNSGWLKLSKVSINK >gi|225935349|gb|ACGA01000043.1| GENE 148 233821 - 235470 1132 549 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 103 546 5 415 417 183 30.0 2e-44 
MKINFQNMCSLRKISCGLIFSMAISLVACGSDNDEGGDDNGDDGLVEDTSIPGNSIITVK QNRNIILHNPLSGWVLYAGIGDGLSSTFWQDYDNFPSSEGTVKVSDYANTLYLRGAWADF NPEEGKYAWNSDCDTPSAKRLKMLIEGAKQRNMKLAFTFVVDSRDKHYNFTPNFVKEAGA KGYETQTGSVKVWSPYPDDPIFQKYYEKFIRALAKDFNDPDKVQFVSGSGFGKWGEYHSV WYYQVRELGKPELPTREAVFDWVTDLYSQVFDKVPVFVNYHRWIGTSKEWDGNNYDKDTE RLIGKAVAKGYSLRHDAFGMKTYYSTWERNFIAKWKYLVPVVMEGGWVKTSHGNSIQGDG YANYAEVRQGEFDEAKIACVNMMDLRYNSDLHNGETYSWFNEAFQLVKQFCTEGSYRLFP DRISLPTTVSNGKQIEIAHRWNNFGWGYCPTNIPQWKNKYKVAFTLLDTKNDKPMYVFVD GEPEACDWVKGAAKSYTFTTQVEGVKAGKYMWAVGIVDTTKQNEIGIHLAVKNNVTSAGW LKLFEVTVR >gi|225935349|gb|ACGA01000043.1| GENE 149 235813 - 238020 1672 735 aa, chain + ## HITS:1 COG:YPO0616 KEGG:ns NR:ns ## COG: YPO0616 COG1472 # Protein_GI_number: 16120942 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 29 716 4 697 727 513 40.0 1e-145 MRKKVLIWGMCLLGVTHSLSSKDKNSIPLYKDAKAPIEKRIDDLISRMTLEEKVLQLNQY TLGRNNNVNNVGEEVKKVPSEIGSLIYFDINPELRNSMQKKAMEESRLGIPIIFGYDAIH GFRTIYPISLGQACSWNPGLVEQACAVSAQEARMSGVDWTFSPMIDVARDPRWGRVAEGY GEDPYTNGVFAAASVRGYQGDDMSAENRIAACLKHYIGYGASEAGRDYVYTEISAQTLWD TYLLPYEMGVKAGAATLMSSFNDISGVPGSANHYTMTAILKERWKHDGFIVSDWGAVEQL KNQGLAATKKDAAWYAFNAGLEMDMMSHAYDRHLKELVEEGKVTMAQVDESVRRVLRVKF RLGLFERPYTPVTNEKDRFFRPQSMAVAAQLAAESMVLLKNDNQILPLTNKKRIAVVGPM AKNGWDLLGSWCGHGKDTDVEMLYDGLTAEFGGEAELRYAMGCKPQGNDRSGFAGALDVV RWSDVVIVCLGEMLTWSGENASRSTIALPQIQEELVKELKEAGKPIILVLSNGRPLELNR MEPLCDAILEIWQPGINGARSMAGILSGRINPSGKLAITFPYSTGQIPIYYNRRKSGRWH QGFYKDITSDPFYSFGYGLSYTEFQYGVVTPSSTTVKRGEKLSVEVTVTNAGKRDGAETV HWFISDPYCSITRPVKELKHFEKQFIKVGETRTFRFDVDLERDLGFVDGNGKRFLEAGEY NIWVQDQKVKIELID >gi|225935349|gb|ACGA01000043.1| GENE 150 238032 - 240485 1966 817 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 28 811 2 817 832 388 32.0 1e-107 
MRKIITTIGLCLLMMNTLVVAQEITPEVEQRARKIVSRMTLQEKIEYISGYTSFSLRAIP RLGIPEIKLADGPQGIRNHAPKSTLYPSGILSASTWNRELLYKLGQGLGQDAKARGVNIL LGPGVNIYRAPLCGRNFEYFGEDPYLTGETAKQYILGVQSEGVIATIKHFVANNQEWSRH HASSDIDERTLHEIYFPAFRKAIQEANVGAVMNSYNLLNGVHATEHKWLNIDVLRNLWGF KGILMSDWTSVYSAVGAANAGLDLEMPKGRFMNADNLIPAIKTGTVTEETINLKVQHILQ TLIAYGMLDKEQKDSRIPLDNPFSRQTALELAREGVVLLKNEDNLLPLKGKTAVMGPNAN LIPTGGGSGFVTPFSTVTVAQGLKDLKKKNLVLLTDDVIYEDIVHEFYTDANRQVKGFKA EYFKNKTLSGQPEVVRTESSVDYDWGYGAPLDGFPTDGFSVRWTACYVPQTDGQLKLHIG GDDGYRLFVNDKHITGDWGNHSYSSREVELPVEAGKEYRFRIEFFDNISSAIIRFNAYSL NEAKLRQGLAKVDNVVFCTGFNSNTEGEGFDRPFALLRYQELFIKKIASMHPNVVVVLNA GGGVDFTNWHDATKAILMAWYPGQEGGQAIAEILTGKISPSGKLPISIEKKWEDNPVHGS YYENLKAEIKRVDYSEGVFVGYRGYDRSGKEPFYPFGYGLSYTTFAYSNMVAEKTGEHQV TVSFDIENTGKMDASEVAQVYVHDVQSSVPRPLKELKGYEKVFLKKGEKKRVSIVLNEDA FSFYDMNQHRFVVEKGDFEILAGPASSQLPLKTTVKL >gi|225935349|gb|ACGA01000043.1| GENE 151 240515 - 241627 1061 370 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 52 369 44 352 352 207 35.0 4e-53 MKKTLLLGVSLLCSIFIMATEIPFQKAEIKSIMRKVADWQIANPHPAPEHDDLNWPQGAL YVGMVDWAELAEKEDGDDTYYKWLTRIGRRNCWQPDKRFYHADDIAVSQSFLDLYRKYKD EAMIVPTLARTEWIINHPSKGSFKLVEGDLKTLERWTWCDALFMAPPVYAKLYMLTGDKK YIKFMNREYKATYDYLFDKEENLFYRDWRYFDKREVNGKKVFWGRGNGWVLGGLVEILKE LPKKDKNRKFYEELFVKLATRVAGLQHPDGFWHASLLDPASYPSPETSCTTFIVYSIAYG INEGLLDKETYLPVMIEGWKALVSAVGPDGKLGYVQQIGADPKKVTRDMTEVYGVGAFLM AGNEIYKMAR >gi|225935349|gb|ACGA01000043.1| GENE 152 241868 - 242410 379 180 aa, chain + ## HITS:1 COG:PA1363 KEGG:ns NR:ns ## COG: PA1363 COG1595 # Protein_GI_number: 15596560 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 21 180 84 242 246 68 29.0 6e-12 
MEDLERELLLISKGDELAFNSFMNRYSKRLYYHAFGILSNKEMAEEVVGDVFFEVWKLRK TLLEIASMLSWLNTIVYRKSISYLRKEAKGNQEISFEELQDFSFPDMQTPMDDMLSREEI RCLNEAINSLPPKCKHVFFLAKLEQMPYAEIAHMLQISLPTVNYHIGYAMNALRKRLVNG >gi|225935349|gb|ACGA01000043.1| GENE 153 242556 - 243482 732 308 aa, chain + ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 99 285 123 310 331 75 28.0 2e-13 MDKDINDYTYGDLQGDIKLKSFIEHSTPPEIDISGIKKKAFLKMHQQERKSRQRHVLMYI GSVAATLLLLIGIKYIVTENAMDGLFRMEVTELMADVYDEEVEVPVGEKMTLMLADGTRI VANSRTIVRYPKRFDGEYREVYVKGEAYFDVAHDAEHPFLVHSDNFRVKVLGTKFNVNNY DTSDSQVVLVQGAVELKTTNNDRVRMKPNELVSLQEGSFAEKRAVNTDEYTCWMQGMINL DGESVESVAQRLSHYYGVTILFDDGMSADKFYGKLMLGDRVSEALQSIETMTQTRMEIHG DTIYLVKK >gi|225935349|gb|ACGA01000043.1| GENE 154 243629 - 245185 1008 518 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 90 516 5 417 417 197 32.0 1e-48 MKRYTLLRTFMLFMPVLIHCGWVSAHAQSAMLKGVKALGQTVYFEPDTTSVLKNPLTGWV MYLGRAWDENFWQTHHYDAMPVNGGDSTVRVSDYAGTCYIRINWNMLESKEGDYVWNDPD SRIYKLLASVRERGMRLAFRINVDSRDQGQNTPLYVKEAGAKGFQDPNNPQIWSPYPDDA VFQQKYEKFLQAFAVAFDDPDKVDFIDAYGLGKWGEAHGVKYNNYSNKVEVFEWITDTYA KAFKRVPLVINYHRLVGDTISWAEPHPDSKRLLERAIAKGYTIRHDAFGMTGYYEQWEKD FARAYRFRLPIIMEGGWITGAHHRYWIDPSGKYRQGHSEDVRWGEYEESRNAHVNMMDLR IGDEVASWFNSSFDLVKRFEREGGYRLYPTEVSFVNKARSGERVSVQHNWRNLGWGYCPT NIPQWKGKYKVCIALMDANHNIVKKQLADEADLSTWVQGRDGHYTTTVNLDGLQKGNYTW LIGLVDTTKACQPGLKMAVDKNLLFEGWCKVGKLKINN >gi|225935349|gb|ACGA01000043.1| GENE 155 245242 - 248631 3014 1129 aa, chain + ## HITS:1 COG:no KEGG:BT_2461 NR:ns ## KEGG: BT_2461 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 1129 29 1134 1134 1100 50.0 0 
MNRKTITTTVVSLAKAFCLAGLLSVPAYSSASTVSFEKVNGMYILKAEDSTVKQVFDYIE KHSQYVFVYDQPVKDRLNNKVNIELKGKSIDAILTEVCRLADLKYTIQNRQVTITANPSK KATTQVKKAKRISGQVVDATDLPMIGATVRVKGSDAATITDLDGKFQIDVPEGSELLISY VGFVDKIVRIDSKNMYAIVMEEASRVLNEVVVIGYGVVKKKDLTGAVAAVKGEELVNKRT AMLSNALQGSLSGVMVTRNSSAPGAGAASIRVRGITTMGESSPLVIVDGVQSSLDYVNSN DVESISVLKDAAAASIYGSKAAAGVILVTTKRGNDTGKINLKYNAEFGWEIPTKQPSMVG VTRYLEMSNELKYNDNPTGGFFQEYTADKTKNWVKYNATDPNNYPITDWTDMILKSSAPR MTHTLSVSGGNKAVKSVATLSYDEVDGLYDGRGFQRYMFRSNNDFNINDKLSAQIDVNIR HAKSTATNYDPFDTMRKMPAIYPATWDDGRLASGKSGANPYGLLVAGGNSVAHSTQVAGK GSLTFKPIKGLSISGIVSPFINYQKKKAFKNSCGYTLPDDPETFGGYFDSGSTWTTNSLT ETRNDDYNVTSQAIANYMGTFGSHNLTVMAGFENYYMKSESLGAARDKYELTGYPYLNIG SEDFQTNSGTGSEYTSNSVFGRVIYSYADRYLFQANVRHDGSSRFAKGYRWGTFPSFSAG WVASEEKFMKNLKWDWLSFLKLRGSWGMLGNERIGSNYFPYIALMSFGNSLFYMADGSVV SDKTARPAVLAVEDITWETTTSTDLGLDVNFLNNRLQFHFDYYWKETKDMLLNIEIPYFM GYNNPSTNAGKMSTHGYDIEVAWNDQIGDFKYGVNVNFSDFLSKIDYMNDGEQISGGKIK RAGVLFNEFYGYVCEGIYQTQEEVNNSARTSTTVTVGDLKYRDISGPDGVPDGVISPEYD RVPLGNSLPRFQYGGSLNASYKGIDFSLAFQGIGKQNSYLSTSMVQPLRDNYGNVPAILE GKYWSPFNTTEENLAAKYPRLSNVSKSNNYATSNFWMFNGSYFRLKNITLGYTLPKSWTQ KAGINRARFYVSGSDLFCLSNFPSGWDPEMGTSDYPITTSLVMGIQVNF >gi|225935349|gb|ACGA01000043.1| GENE 156 248649 - 250370 1663 573 aa, chain + ## HITS:1 COG:no KEGG:BT_2460 NR:ns ## KEGG: BT_2460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 573 1 572 572 505 46.0 1e-141 MKTKILLAGMCLSLAACESMDLVPKSQGNTESWYTSETELRLASNDFYILGYWQEPLSSS EQWSDNTTYRQTNRNPGSGGTVLDGTLNGQQYEVYALWQQSFKLIARTNTMLENIHKAKG AVTEEVYNRYAGEAYFCRACKYAELIFFYGDVPYQEETITISEALQRGRRPKAEVIPLVY ADFDKAIAGLPESYGKDENIHPTKGAALAMKARFALYMDDYEIAAQAAKACMDLKLYSLE PDYAKLFKQSTKLNDEKVFVIPRSIENEVMLDSWIVKNGLPRNAGGYGSYNPSWDLLASY LCTDGLPIDESPLFDPRKPFKNRDPRCTMTIVEFNTEHCGFEYDPSPAAKTVMNYTTGKT QSNQDTRIVNQYSSYTGLLWKKGIDATWTVDQKVEQDYIIMRYADVLLIYAEAMIEQNLI DDSVLKAINMVRARAYGVNVTATDSYPAVTTTNQTELRRALRIERRMEFAMENQRLQDLM RWKLAGKALNGYNYIMLIDPTELLNNIVNKNLWFWGMTPQIDEDGLADFAALFNAGYCSQ GAKRIFPEREYLWPLPTHDVELCPNLLPNNPGY 
>gi|225935349|gb|ACGA01000043.1| GENE 157 250391 - 252034 1340 547 aa, chain + ## HITS:1 COG:no KEGG:Coch_0930 NR:ns ## KEGG: Coch_0930 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 492 1 432 479 65 22.0 7e-09 MKTMHLYRNLGMVAVGMMSMLATSCEEEIDNTASRNQSEIELTPSGEYIKLDESKPNETA FTLKWSTAHNFGEDYITTYKYEMQLIGSDVANIKEFDDDGEFLRSYTNKEMQDILVDYFG LTTSTIGEVLFTVTANFEGPRLMVPDIATATVKFKTYGPKQYKADNLYVGGTAVGDDNIK MTLKDETNRIYSYEGALVAGKINFPVDYADELNAIGPETADAPITTGEMAGVISDRAEAN SWLIPSAATYRITVNMSKQTVKIVEAGAVVEADQIFLAGSAVGGEQIEMAQALENDQIYA WRGALKAGNLYIPLTFEGEQAMAIVPEIADSHDIEDGQLSTFGQVLISKVDKQYWTIPAD GTYRIVLNKEEKSITIYSTATDLKPLTVSFNNTELGKNPWSQEVNVLHMYGGFNSFAYDS GIDEETGKSFKYCQVKYNLIQSVANPKVFVYKGDVLPRNEYKDNYGKTNAGWVNFSTLRY ANNGWFVGSTADAVRNSYNGYTTVTAGKIEGAVTGQGNNRYAYFLIPEGCNYVVVDLENN TVLFDIK >gi|225935349|gb|ACGA01000043.1| GENE 158 252123 - 253724 1325 533 aa, chain + ## HITS:1 COG:no KEGG:BVU_3139 NR:ns ## KEGG: BVU_3139 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 93 532 5 417 417 176 29.0 2e-42 MNKIIKNSIKTLSMCLLATTPFIGTACSDDDAPITDYSWNIEGNTTVNIMPERYQLQRNP MSGWVIYAGIGSGMMMDFWDLYDNFESSEGTVKVSDYGNTLYVRGLWSNFNPEKGKYVWD ESVQTEPAKRFRMLVEGAKERNLKLAFTFVCDSRDKHEDACPDYVKEAGAEGFTTTTGSV NVWTPYPDDPIFQREYEEFLTAFAAKYNDPDVTQFVSGFGLGKWGETHTLKYSTGDEAPR QAVFEWITDVMSRLFTKVPIMINYHRCLLSGKEFSDTDTDVAADMVKRAVAKGFCLRHDA FGMKQYYKDWERGISTTYHGIVPITMEGGWVESSHGGSIAGDGYKNFAEVRQGEYDEAKG GYVNMMDLRFDTNVNKGETHSWFNTAFHLVKEFIAEGGYRLYPDRLSIPTIARSGSSVSL THRWSNLGWGYCPTNLPQYGDKYKLAIALLDKNTEEPVRIYIEDKANIATWMKGAPKTYT TNIKLTDVAAGNYIWAVGLVDTTKENAIGILISARDEYQTSKGWVKVGDITIQ >gi|225935349|gb|ACGA01000043.1| GENE 159 253742 - 255625 1829 627 aa, chain + ## HITS:1 COG:no KEGG:BT_2458 NR:ns ## KEGG: BT_2458 # Name: not_defined # Def: putative pyridine nucleotide-disulphide oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 617 26 623 626 783 61.0 0 MMKKILLTCIAVACSLVAVAGELLIEAESFSQRGGWVLDQQFMDQMGSPYLMAHGMGIPV 
ADATAEINIPQAGTYYVYARTYNWTSPWTDAEGPGKFRLALGGKLLKTTLGHTGNSWQWQ SAGKVVLKAGTITLALKDLTGFNGRCDAIYLTTDMNAQPTAWGEAETAALRARLQQQQTV PTHQYDFVVVGGGVAGMCAATAAARLGCRVALVNDRPVLGGNNSSEIRVHLGGIIEKGPN EGLGRMIREFGHERSGNAQPGDYYEDQKKEDFIAAEKNITLYASQRAVAVKMQGDRIASV TIQHIETGEQTELTAPLFSDCTGDATIGYLAGADWTMGREGRDEYGESLAPEQPDSLVMG ASVQWYSKDMKKKTSFPRFEYGMRFGADNCEPVTMGEWKWETGMNRNQITEAERIRDYGL LVVYSNWSYLKNHYADRKKYANRSLDWVAYVSGKRESRRLLGDYVLAQDDIDKNVAHEDA SFTTTWSIDLHFPDSVNSVRFPDNEFKSATVHRWIHPYAVPYRCLYSRNVDNLFMAGRNM SCTHVALGTVRVMRTTGMMGEVVGMAAGLCHKHGVEPRDIYHHHLPELKQLMQAGLGKHN VPDNQRFNEPNKLLEAPAYIKPQTNYE >gi|225935349|gb|ACGA01000043.1| GENE 160 255618 - 257684 1651 688 aa, chain + ## HITS:1 COG:no KEGG:Phep_0375 NR:ns ## KEGG: Phep_0375 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 27 686 38 698 700 642 48.0 0 MSKKSIITIFLSTILGLPLWGGQHYYAFLKGDTLRMGNNYMERTMLWNNGTPVTISLTDK QCGKIIPAQGKEPDFSIVKGIPTAATFTVNEIPTNGIHASYLQATVACTIGSLSIERRYR IYADCPAIACDTYLKGQIELYQNKDDNRSNADRKNIEHTADMTTGVKTPTIDRLQLSGNH WRARTVEFFDYTDWNDNLVTERTWLPYRRNTYRGNLLFAHDVATRQGFFFLKEAPCSSTQ LHYNGSDFVADFSDFMVVGLGIASDDVKPDSWTRVYGCVTGVYTGDEQEALTALRLYQKQ LRHHTAAQDEMIMLNTWGDRSQDAKIDEAFCLAELDRAARLGVTLFQLDDGWQSGKSPNS KTAGGSFKDIWKNADYWTPNPTKFPHGLKRIVEKGKKLGIRIGLWFNPSIQNDFADWQKD AQVIIGLYKEYGICCFKIDGLQIPTKTAEQNLRRLFDTVLAQTNHEVIFNLDATAGRRGG YHYMNEYGNIFLENRYTDWGNYYPYRTLRNLWMLSRYVPAEKMQIEFLNKWRNADKYSTT DPFAPARYSFDYLFAITLAAQPLAWMEASNLPEEAYATATLLKEYQPLQLQFHQGIILPI GEEPSGRSWTGFQSIASDTQGYLIIYREDNDIAKRIIDTWLPEGKKVTFTPLMGNGKKFV TEVGVQGKVSFELNDKNSFTLYQYQVKP >gi|225935349|gb|ACGA01000043.1| GENE 161 257710 - 259011 1013 433 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 127 408 12 303 325 112 29.0 2e-24 MIKSLRSVFLALFVAVQFVVYGQPAERLIQVLVTPDHTNWLYQPGEKVKFKVAVLKCNIP 
QDHLEVRYEISEDMMKPHQTGKQPLKNEKLEINAGTMKKEGFLRCRAFVSYQGREYEGVA TVGFDPEKLQPTTPLPADFLEFWKSTKEAAEKWALEPIMTLLPEKCTDKVNVYHVSFANN DYASRMYGILCVPKAPGKYPAILKVPGAGIRAYNGEPERAGNGFIILEIGIHGIPVNLTG DVYHRLYNGALKNYHSFNTDDRDKYYYKRVYTGCIRAIDFIYTLPEFNGNLATFGGSQGG ALSIVIAGLDNRVKGLVSFYPALCDIAGYAHGRAGGWPHMFKDERNRTPEKIKTIQYFDV VNFARQVKVPGFYTFGYNDMVCPPTTTYSAYNMINAPKELFVAETTAHYAYAEQWSAAWN WVMNFLKNESRNE >gi|225935349|gb|ACGA01000043.1| GENE 162 259040 - 260833 1874 597 aa, chain + ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 82 557 32 529 563 201 30.0 3e-51 MKIGKKLNLCLLLLIAFAGSVAAQELITNVYGRDIHSLNGKWNAIIDLYDQGQRMKIYEN QQPKGNTDFYEYAFEGGLRLNVPGDWNSQSPELKYYEGTVWYARHFDAKRLADKRQFLYF GAVSYRCKVYLNGKEIAEHEGGFTPFQVEVTDLLKDGDNFLAIEVNNRRTKDAIPAMAFD WWNYGGITRDVLLVKTPRTFIEDYFIQLDKNAPDRIIARVRLSDKKAGEKVTVAIPELKI NAELTTDAEGKAETVLNAKKLQRWSPEEPKLYGVAISSSTDRVEEQIGFRNITVKGTDIY LNGKPTFMCCISFHEEIPQRMGRAFSEADAAMLLNEAKALGVNMIRLAHYPQNEYTVRLA EKMGFLLWQEIPIWQGIDFTDKDTRKKAQKMLSEMIKRDQNRCAVGYWGVANETQPSKER NDFLTSLLETGKQLDTTRLYVAAFDLVRFNSEKQRFVMEDSFTSQLDVVAINKYMGWYHP WPVEPKDAIWEVVTDKPLIISEFGGEALYGQSGDENVVSSWSEEYQARLYRDNIRMFDNI PNLRGVSPWILFDFRSPFRFHPTNQDGWNRKGLISDQGMRKKAWYLMRDYYMKKRNN >gi|225935349|gb|ACGA01000043.1| GENE 163 260842 - 262662 1447 606 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0623 NR:ns ## KEGG: Cphy_0623 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 142 374 1 231 236 162 39.0 5e-38 MIKHFRIKNNFLLSGLFFLVTLFSVGSSLVSCSDSEKIVTQTEEKIVYVDGNWMVVASKE VTVMTGESVTLNVSVGGEEELKPEYTCTSKDANIATVAVSEDNNSIIITGVADGSTAISI ECPTNTKPLVATIPVTVKQNPIRILAIGNSFSQDAVEQYLYELAEAAGYELIIGNMYIGG CDLDKHWANFQSDAAAYEYRKIVKGEKVGKTGYKLSQGLADENWDYISLQQASGKSGKYE TYTVLADLIAGIKERCPKAKLLWHQTWAYASSSTHESFPDYDSNQMTMYSSIVTAARQAM TNHTDLSLLIPSGTAIQNGRTSFLGDAFNRDGYHLEVTYGRYTAACTWFEMITGQNVVGN 
PYAPETIDPQVVKIAQNAAHYAVQKPDEVTDLVDFKQPEISDTDLKVPIYIDFGPTSLSA TPWNNITSHQESSTTSWIKDVENNYTNIGVRVLDGFTATHAGVGSEPASPVTVDGVEFPL TAWKDGLLVKGEKNQGDVGPGRIEISQLDVARKYNFTILAIRFNGSKDARISSYKLVGKM ESAVKEVKTGIKDAASFAAANFEEYIAKFENVEPDSEGKVIVEVKGLDTGSAAEGHINAL CISLAK >gi|225935349|gb|ACGA01000043.1| GENE 164 262720 - 263544 701 274 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0623 NR:ns ## KEGG: Cphy_0623 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 18 242 1 221 236 152 40.0 1e-35 MCILLLLAGGAYAQQKTVRILAIGNSFSQDAVEQYLHELAEAEGISTIIGNMFIGGCSLE RHVKNARENAPAYAYRKIGTDGKKREKGKMSLEMVLADEDWDYVSLQQASTFSGMYETYE ASLPELIEYVKARLPKKTKLMLHQTWAYASTSNHGGFKNYNRDQLTMYHAIVDAVKKAGK AYKIKMIIPTGTAIQNARTSFIGDHMNRDGHHLDVKVGRYTAACTWFERIFKRNVIGNPY APESLDEARKAVAQKAAHAAVKHPYKVTDLSERR >gi|225935349|gb|ACGA01000043.1| GENE 165 263612 - 265819 2312 735 aa, chain + ## HITS:1 COG:YPO0616 KEGG:ns NR:ns ## COG: YPO0616 COG1472 # Protein_GI_number: 16120942 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 29 728 4 710 727 529 41.0 1e-150 MRKKVLVYGLCLLGSICTLSAKDKKDVALYKDPKAPIEKRVNDLLSRMTLEEKVMQLNQY TLGRNNNVNNVGEEVKKVPAEIGSLIYFETNPALRNSMQKKAMEESRLGIPIIFGYDAIH GFRTVYPISLAQACSWNPDLVEQACAVSAQEARMSGVDWTFSPMIDVARDPRWGRVAEGY GEDPYTNGVFGAASVKGYQGDDLSAENRMAACLKHYVGYGASEAGRDYVYTEISKQTLWD TYLLPYEMGVKAGAATLMSSFNDISGVPGSANSYIMTEILKKRWGHDGFIVSDWGAIEQL KNQGLAATKKEAAWHAFTAGLEMDMMSHAYDRHLQELVEEGRVSVAQVDEAVRRVLLLKF RLGLFERPYTPATSEKERFFRPQSMDIAARLAAESMVLLKNENKTLPLTDKKKIAVIGPM AKNGWDLLGSWCGHGKDTDVAMLYNGLATEFAGKAELRYAAGCATKGDNKEGFAEALEAA RWSDVVVLCLGEMMTWSGENASRSSIALPQIQEELAAELKKAGKPIVLVLVNGRPLELNR LEPISDAILEIWQPGVNGALPMAGILSGRINPSGKLAMTFPYSTGQIPIYYNRRKSGRGH QGFYKDITSDPLYPFGHGLSYTEFKYGTVTPSVTKVKRGDRLSVEVTVTNVGARDGAETV HWFISDPYCSITRPVKELKHFEKQLIKAGETKTFRFDIDLERDFGFVNEDGKRFLEAGEY HILVQGQTVKIELID >gi|225935349|gb|ACGA01000043.1| GENE 166 265857 - 268310 1985 817 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns 
NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 29 817 3 823 832 388 32.0 1e-107 MKKKIGGIGLCMLAWSILTCAQTITPQAEQRAREIVSKMTLQEKIEYISGYTSFSLRAIP RLGIPEIKLADGPQGIRNHAPKSTLYPSGILSASTWNRSLLYQLGQGLGQDAKARGVNIL LGPGVNIYRAPMCGRNFEYFGEDPYLTGETAKQYILGVQSEGVIATIKHFAANNQEWSRH HASSDIDERTLHEIYFPAFRKAVQEANVSAVMNSYNLLNGVHATEHKWLNIDILRNLWGF KGILMSDWTSVYSAVGAANAGLDLEMPKGRFMNVDNLIPAIKNGTVTEETINLKVQHILQ TLIAYGMLDKEQKDSNIAQDNPFSRQAALELAREGVVLLKNEGNLLPLKGKTAVMGPNAD RIPTGGGSGFVTPFSTVSVSEGLEKLKKKNLVLLTDDVIYEDILHEFYADAARQTKGFKA EYFKNKTLSGQPEVIRTEASVDYDWQYGAPLEGFPEDGFSVRWTASYMSQKDGLLKLSIG GDDGYRLFVNDKHITGDWGNHSYSSREVELPVEAGKEYSFRIEFFDNISSAIIRFKASRL NEEKLRQGLAKVDNVVFCTGFNSNTEGEGFDRPFALLHYQELFIQKVASLHPNLVVVLNA GGGVDFTSWHEAAKAILMAWYPGQEGGQAIAEILTGKISPSGKLPISIEKKWEDNPVHDS YYENLKAEIKRVDYSEGVFVGYRGYDRSGKEPFYPFGYGLSYSTFAYSNLAVEKTGEHQV TVSFDIKNTGKMDASEIAQVYVHDVESSVPRPLKELKGYDKVFLKKGETQRLSIVLNEDA FSYYDMNQHRFVVEKGVFEILVGPASNQLPLKAAIEL Prediction of potential genes in microbial genomes Time: Fri May 13 09:41:00 2011 Seq name: gi|225935348|gb|ACGA01000044.1| Bacteroides sp. D2 cont1.44, whole genome shotgun sequence Length of sequence - 297924 bp Number of predicted genes - 203, with homology - 203 Number of transcription units - 67, operones - 39 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 774 - 1355 505 ## BF4460 hypothetical protein 2 1 Op 2 . + CDS 1352 - 4039 2198 ## BF4460 hypothetical protein 3 1 Op 3 . + CDS 4051 - 6051 1508 ## BF4257 hypothetical protein + Term 6087 - 6142 13.2 - Term 6079 - 6125 9.4 4 2 Op 1 23/0.000 - CDS 6155 - 7165 1071 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 5 2 Op 2 . 
- CDS 7169 - 9019 1722 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 9240 - 9299 76.8 + TRNA 9223 - 9297 55.3 # Arg CCT 0 0 - Term 9402 - 9455 15.1 6 3 Op 1 . - CDS 9469 - 12495 3062 ## COG0342 Preprotein translocase subunit SecD - Prom 12520 - 12579 1.7 - Term 12610 - 12648 2.1 7 3 Op 2 . - CDS 12661 - 14748 1975 ## COG0339 Zn-dependent oligopeptidases 8 3 Op 3 . - CDS 14777 - 15862 785 ## BT_2833 hypothetical protein 9 3 Op 4 . - CDS 15867 - 16772 761 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 10 3 Op 5 . - CDS 16753 - 17427 612 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 17617 - 17676 7.9 + Prom 17528 - 17587 9.7 11 4 Tu 1 . + CDS 17649 - 17921 338 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 17949 - 17980 -0.7 - Term 17930 - 17977 8.1 12 5 Op 1 . - CDS 17999 - 18763 722 ## gi|260173111|ref|ZP_05759523.1| hypothetical protein BacD2_14675 13 5 Op 2 . - CDS 18822 - 21650 1949 ## COG3250 Beta-galactosidase/beta-glucuronidase 14 5 Op 3 . - CDS 21686 - 24658 2356 ## COG3250 Beta-galactosidase/beta-glucuronidase 15 5 Op 4 . - CDS 24673 - 26589 1136 ## Cthe_0246 carbohydrate-binding family 6 protein 16 5 Op 5 . - CDS 26611 - 29136 2282 ## BT_2958 hypothetical protein 17 5 Op 6 . - CDS 29162 - 30400 1012 ## COG2755 Lysophospholipase L1 and related esterases 18 6 Tu 1 . - CDS 30519 - 32489 1426 ## Tter_1100 FG-GAP repeat protein - Prom 32560 - 32619 6.4 + Prom 32806 - 32865 3.0 19 7 Op 1 . + CDS 32890 - 33762 706 ## BF3848 hypothetical protein 20 7 Op 2 . + CDS 33774 - 34676 671 ## BF3847 hypothetical protein 21 7 Op 3 . + CDS 34731 - 36500 1318 ## BF3838 hypothetical protein 22 7 Op 4 . + CDS 36554 - 38164 1264 ## BVU_2520 hypothetical protein 23 7 Op 5 . + CDS 38263 - 39198 821 ## COG2755 Lysophospholipase L1 and related esterases + Term 39260 - 39300 8.1 - Term 39381 - 39442 8.2 24 8 Op 1 . 
- CDS 39495 - 41387 1668 ## BVU_0152 polysaccharide lyase family protein 11, rhamnogalacturonan lyase 25 8 Op 2 . - CDS 41407 - 42897 1126 ## COG5434 Endopolygalacturonase + Prom 42992 - 43051 5.6 26 9 Op 1 . + CDS 43224 - 45644 2056 ## BT_4163 hypothetical protein 27 9 Op 2 . + CDS 45681 - 49007 3041 ## BT_4164 hypothetical protein 28 9 Op 3 . + CDS 49019 - 50635 1454 ## Phep_2220 hypothetical protein 29 9 Op 4 . + CDS 50639 - 52459 1510 ## BT_4166 putative lipoprotein 30 9 Op 5 . + CDS 52504 - 54186 1298 ## gi|260173129|ref|ZP_05759541.1| hypothetical protein BacD2_14765 31 9 Op 6 . + CDS 54221 - 57397 2832 ## BT_4168 hypothetical protein 32 9 Op 7 . + CDS 57437 - 59143 1479 ## gi|260173131|ref|ZP_05759543.1| hypothetical protein BacD2_14775 33 9 Op 8 . + CDS 59158 - 60864 1050 ## COG0657 Esterase/lipase + Term 60968 - 61026 10.4 + Prom 60994 - 61053 5.9 34 10 Op 1 . + CDS 61079 - 65389 3142 ## COG0642 Signal transduction histidine kinase 35 10 Op 2 . + CDS 65438 - 67255 1964 ## COG0018 Arginyl-tRNA synthetase 36 10 Op 3 . + CDS 67317 - 68189 694 ## COG4292 Predicted membrane protein + Term 68435 - 68468 1.1 37 11 Tu 1 . - CDS 68400 - 69254 535 ## BF1504 hypothetical protein - Prom 69281 - 69340 3.1 - Term 69291 - 69332 6.1 38 12 Op 1 . - CDS 69350 - 69988 448 ## COG0207 Thymidylate synthase 39 12 Op 2 . - CDS 70067 - 70309 257 ## BF1503 hypothetical protein - Prom 70331 - 70390 2.7 - Term 70361 - 70403 0.2 40 13 Op 1 . - CDS 70489 - 70974 366 ## gi|260173140|ref|ZP_05759552.1| hypothetical protein BacD2_14820 41 13 Op 2 . - CDS 70971 - 72275 746 ## Dred_1227 RNA-directed DNA polymerase (reverse transcriptase) - Term 72708 - 72746 8.3 42 14 Op 1 . - CDS 72792 - 79529 4627 ## BT_4440 putative cell surface protein 43 14 Op 2 . - CDS 79556 - 83572 2310 ## BDI_0901 hypothetical protein 44 14 Op 3 . - CDS 83619 - 84083 230 ## gi|254884012|ref|ZP_05256722.1| predicted protein 45 14 Op 4 . 
- CDS 84085 - 88476 3478 ## BDI_0893 putative viral A-type inclusion protein 46 14 Op 5 . - CDS 88477 - 88677 80 ## gi|254884010|ref|ZP_05256720.1| predicted protein 47 14 Op 6 . - CDS 88716 - 89207 267 ## gi|254884009|ref|ZP_05256719.1| predicted protein 48 14 Op 7 . - CDS 89211 - 89786 571 ## gi|254884008|ref|ZP_05256718.1| predicted protein 49 14 Op 8 . - CDS 89783 - 90340 146 ## gi|254884007|ref|ZP_05256717.1| predicted protein 50 14 Op 9 . - CDS 90072 - 90551 308 ## gi|260173151|ref|ZP_05759563.1| hypothetical protein BacD2_14875 51 14 Op 10 . - CDS 90579 - 90965 301 ## gi|260173152|ref|ZP_05759564.1| hypothetical protein BacD2_14880 52 14 Op 11 . - CDS 90978 - 91442 339 ## AZC_3601 putative N-acetylmuramoyl-L-alanine amidase 53 14 Op 12 . - CDS 91442 - 92665 1191 ## gi|260173154|ref|ZP_05759566.1| hypothetical protein BacD2_14890 54 14 Op 13 . - CDS 92677 - 93714 937 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 93911 - 93970 2.8 + Prom 93796 - 93855 6.2 55 15 Op 1 . + CDS 93911 - 94357 384 ## gi|260642127|ref|ZP_05414625.2| hypothetical protein BACFIN_05935 56 15 Op 2 . + CDS 94359 - 95927 636 ## RCAP_rcc00985 hypothetical protein 57 15 Op 3 . + CDS 95946 - 96371 168 ## gi|254883999|ref|ZP_05256709.1| predicted protein 58 15 Op 4 . + CDS 96373 - 97761 691 ## COG4383 Mu-like prophage protein gp29 59 15 Op 5 . + CDS 97779 - 99194 910 ## gi|265750855|ref|ZP_06086918.1| conserved hypothetical protein 60 16 Tu 1 . - CDS 99177 - 99446 196 ## gi|254883996|ref|ZP_05256706.1| predicted protein + Prom 99400 - 99459 3.4 61 17 Op 1 . + CDS 99479 - 100111 395 ## gi|254883995|ref|ZP_05256705.1| predicted protein 62 17 Op 2 . + CDS 100116 - 100562 266 ## gi|255690957|ref|ZP_05414632.1| conserved hypothetical protein - Term 100439 - 100466 0.1 63 18 Tu 1 . - CDS 100572 - 100841 143 ## gi|254883993|ref|ZP_05256703.1| predicted protein - Prom 100876 - 100935 6.8 - Term 100880 - 100928 10.8 64 19 Op 1 . 
- CDS 100949 - 101164 116 ## gi|265750860|ref|ZP_06086923.1| conserved hypothetical protein 65 19 Op 2 . - CDS 101167 - 101817 753 ## gi|260173166|ref|ZP_05759578.1| hypothetical protein BacD2_14950 66 19 Op 3 . - CDS 101839 - 102282 418 ## gi|260173167|ref|ZP_05759579.1| hypothetical protein BacD2_14955 67 19 Op 4 . - CDS 102285 - 102425 106 ## gi|260173168|ref|ZP_05759580.1| hypothetical protein BacD2_14960 68 19 Op 5 . - CDS 102466 - 102675 119 ## gi|265750863|ref|ZP_06086926.1| conserved hypothetical protein 69 19 Op 6 . - CDS 102696 - 103208 295 ## gi|260173170|ref|ZP_05759582.1| hypothetical protein BacD2_14970 70 19 Op 7 . - CDS 103229 - 103534 302 ## gi|260173171|ref|ZP_05759583.1| hypothetical protein BacD2_14975 71 19 Op 8 . - CDS 103554 - 104459 475 ## gi|294778490|ref|ZP_06743913.1| hypothetical protein CUU_0512 72 19 Op 9 . - CDS 104463 - 104699 154 ## gi|260173173|ref|ZP_05759585.1| hypothetical protein BacD2_14985 73 19 Op 10 . - CDS 104696 - 105373 523 ## gi|260173174|ref|ZP_05759586.1| ATP-dependent serine protease 74 19 Op 11 . - CDS 105376 - 105732 309 ## gi|254883982|ref|ZP_05256692.1| predicted protein 75 19 Op 12 . - CDS 105771 - 106640 659 ## gi|260173176|ref|ZP_05759588.1| B transposition protein domain protein 76 19 Op 13 . - CDS 106676 - 108715 1487 ## gi|260173177|ref|ZP_05759589.1| hypothetical protein BacD2_15005 77 19 Op 14 . - CDS 108731 - 108952 187 ## gi|260173178|ref|ZP_05759590.1| hypothetical protein BacD2_15010 78 19 Op 15 . - CDS 108942 - 109142 311 ## gi|255690973|ref|ZP_05414648.1| transcriptional regulator, ArsR family - Prom 109201 - 109260 3.8 - TRNA 109166 - 109251 51.2 # Undet ??? 0 0 79 20 Tu 1 . - CDS 109262 - 109453 182 ## gi|260173180|ref|ZP_05759592.1| hypothetical protein BacD2_15020 + Prom 109316 - 109375 7.8 80 21 Tu 1 . + CDS 109589 - 110008 367 ## gi|260173181|ref|ZP_05759593.1| transcriptional regulator, XRE family protein - Term 110069 - 110118 -0.2 81 22 Op 1 . 
- CDS 110119 - 110343 104 ## gi|260173182|ref|ZP_05759594.1| hypothetical protein BacD2_15030 - Term 110349 - 110395 1.7 82 22 Op 2 . - CDS 110408 - 110620 182 ## BF2407 hypothetical protein - Prom 110640 - 110699 8.1 + Prom 110924 - 110983 1.6 83 23 Tu 1 . + CDS 111019 - 111339 134 ## Spro_1011 low temperature requirement A + Term 111404 - 111468 16.2 - Term 111391 - 111454 12.2 84 24 Op 1 . - CDS 111576 - 113600 1431 ## BT_2828 hypothetical protein - Term 113610 - 113658 6.1 85 24 Op 2 . - CDS 113673 - 116018 2321 ## COG0550 Topoisomerase IA - Prom 116083 - 116142 10.8 - Term 116217 - 116257 3.4 86 25 Op 1 . - CDS 116291 - 118012 1445 ## BT_2817 putative TonB-dependent receptor 87 25 Op 2 . - CDS 118034 - 121051 2827 ## COG0457 FOG: TPR repeat 88 25 Op 3 . - CDS 121122 - 121643 368 ## COG1051 ADP-ribose pyrophosphatase 89 26 Op 1 . - CDS 122008 - 123468 823 ## Coch_1468 hypothetical protein 90 26 Op 2 . - CDS 123502 - 125088 951 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 91 26 Op 3 . - CDS 125120 - 129250 1894 ## COG0642 Signal transduction histidine kinase 92 26 Op 4 . - CDS 129268 - 130371 670 ## Coch_1471 hypothetical protein 93 26 Op 5 . - CDS 130403 - 132196 1179 ## Coch_1472 RagB/SusD domain protein 94 26 Op 6 . - CDS 132215 - 135286 2234 ## Coch_1473 TonB-dependent receptor plug - Prom 135350 - 135409 4.3 + Prom 136127 - 136186 5.4 95 27 Op 1 11/0.000 + CDS 136216 - 136782 709 ## COG0450 Peroxiredoxin + Term 136808 - 136850 5.3 96 27 Op 2 . + CDS 136867 - 138417 395 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 138490 - 138527 3.3 - Term 138478 - 138515 7.1 97 28 Op 1 . - CDS 138585 - 141101 1155 ## Pjdr2_3116 carbohydrate binding family 6 98 28 Op 2 . - CDS 141154 - 141861 403 ## Coch_1347 hypothetical protein - Prom 141883 - 141942 4.0 99 29 Tu 1 . - CDS 142050 - 145997 1920 ## COG0642 Signal transduction histidine kinase - Prom 146017 - 146076 4.6 + Prom 146267 - 146326 4.7 100 30 Op 1 . 
+ CDS 146365 - 149325 2393 ## BF4062 putative TonB-linked outer membrane protein 101 30 Op 2 . + CDS 149340 - 150980 1628 ## BVU_3705 hypothetical protein 102 30 Op 3 . + CDS 151004 - 152677 1268 ## Coch_1345 hypothetical protein 103 30 Op 4 . + CDS 152696 - 153721 573 ## COG3858 Predicted glycosyl hydrolase + Term 153795 - 153857 19.1 + Prom 153794 - 153853 6.6 104 31 Tu 1 . + CDS 153892 - 155718 1405 ## COG3568 Metal-dependent hydrolase - Term 155719 - 155763 7.4 105 32 Op 1 . - CDS 155787 - 156764 772 ## gi|260173206|ref|ZP_05759618.1| hypothetical protein BacD2_15150 106 32 Op 2 . - CDS 156789 - 157796 992 ## BT_2809 putative integral membrane protein 107 32 Op 3 2/0.000 - CDS 157815 - 159335 731 ## COG0627 Predicted esterase 108 32 Op 4 . - CDS 159337 - 160932 778 ## COG0627 Predicted esterase 109 32 Op 5 . - CDS 160944 - 162491 1228 ## BDI_1656 putative large exoprotein involved in heme utilization or adhesion 110 32 Op 6 . - CDS 162498 - 164105 882 ## BDI_1656 putative large exoprotein involved in heme utilization or adhesion 111 32 Op 7 . - CDS 164118 - 166082 1370 ## sce3320 hypothetical protein 112 32 Op 8 . - CDS 166102 - 167631 915 ## BDI_2874 hypothetical protein 113 32 Op 9 . - CDS 167646 - 170516 1960 ## Slin_2764 TonB-dependent receptor plug - Term 170517 - 170551 0.1 114 32 Op 10 . - CDS 170600 - 171514 660 ## COG0524 Sugar kinases, ribokinase family - Prom 171628 - 171687 4.9 + Prom 171579 - 171638 7.0 115 33 Tu 1 . + CDS 171658 - 172674 623 ## COG1609 Transcriptional regulators + Term 172687 - 172742 3.2 - Term 172675 - 172729 12.1 116 34 Tu 1 . - CDS 172746 - 173225 356 ## BT_2538 hypothetical protein - Prom 173282 - 173341 3.4 + Prom 173607 - 173666 4.2 117 35 Op 1 . + CDS 173724 - 174131 339 ## gi|260173218|ref|ZP_05759630.1| hypothetical protein BacD2_15210 118 35 Op 2 . + CDS 174198 - 175148 577 ## mru_0017 hypothetical protein + Term 175367 - 175400 0.0 119 36 Tu 1 . 
- CDS 175225 - 175443 138 ## gi|237717280|ref|ZP_04547761.1| predicted protein - Prom 175599 - 175658 2.6 + Prom 176006 - 176065 3.6 120 37 Tu 1 . + CDS 176093 - 176377 153 ## gi|237717282|ref|ZP_04547763.1| predicted protein 121 38 Tu 1 . - CDS 176439 - 177038 448 ## BT_2225 hypothetical protein - Prom 177161 - 177220 5.6 + Prom 177307 - 177366 8.2 122 39 Op 1 . + CDS 177459 - 180878 2693 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 123 39 Op 2 . + CDS 180899 - 182632 1523 ## Phep_2529 RagB/SusD domain protein 124 39 Op 3 . + CDS 182652 - 183443 506 ## COG3568 Metal-dependent hydrolase 125 39 Op 4 . + CDS 183472 - 185628 1419 ## Amuc_0060 alpha-N-acetylglucosaminidase (EC:3.2.1.50) 126 39 Op 5 6/0.000 + CDS 185637 - 186779 821 ## COG1929 Glycerate kinase 127 39 Op 6 . + CDS 186786 - 188042 959 ## COG2610 H+/gluconate symporter and related permeases 128 39 Op 7 . + CDS 188057 - 189142 614 ## COG4299 Uncharacterized conserved protein 129 39 Op 8 . + CDS 189149 - 191332 1624 ## BT_0438 alpha-N-acetylglucosaminidase precursor + Prom 191362 - 191421 11.7 130 40 Op 1 6/0.000 + CDS 191442 - 192032 447 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 192047 - 192106 6.5 131 40 Op 2 . + CDS 192141 - 193097 661 ## COG3712 Fe2+-dicitrate sensor, membrane component 132 40 Op 3 . + CDS 193129 - 193902 443 ## RB8407 hypothetical protein 133 40 Op 4 . + CDS 193910 - 194689 230 ## gi|260173236|ref|ZP_05759648.1| hypothetical protein BacD2_15300 134 40 Op 5 . + CDS 194703 - 196232 1094 ## gi|260173237|ref|ZP_05759649.1| hypothetical protein BacD2_15305 + Term 196365 - 196412 8.2 + Prom 196368 - 196427 4.8 135 41 Tu 1 . + CDS 196554 - 197540 802 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 197617 - 197676 4.3 136 42 Op 1 . + CDS 197703 - 200918 2459 ## Dfer_0714 TonB-dependent receptor plug 137 42 Op 2 . + CDS 200939 - 202417 1443 ## Dfer_4709 RagB/SusD domain protein 138 42 Op 3 . 
+ CDS 202441 - 203517 773 ## ZPR_4337 glycoside hydrolase, family 5 139 42 Op 4 . + CDS 203554 - 205431 1295 ## BT_2892 hypothetical protein 140 42 Op 5 . + CDS 205458 - 207347 1393 ## gi|260173244|ref|ZP_05759656.1| hypothetical protein BacD2_15340 141 42 Op 6 . + CDS 207385 - 210639 2439 ## BVU_0750 TPR domain-containing protein + Prom 210655 - 210714 4.9 142 43 Op 1 . + CDS 210836 - 211387 215 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 143 43 Op 2 . + CDS 211416 - 212555 709 ## COG2017 Galactose mutarotase and related enzymes + Prom 212577 - 212636 3.1 144 44 Op 1 . + CDS 212672 - 213475 587 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family 145 44 Op 2 . + CDS 213507 - 214214 450 ## COG1040 Predicted amidophosphoribosyltransferases 146 45 Op 1 . - CDS 214183 - 215262 488 ## Amuc_0698 beta-glucanase precursor 147 45 Op 2 . - CDS 215268 - 216947 986 ## COG0591 Na+/proline symporter - Prom 217031 - 217090 4.5 + Prom 217034 - 217093 4.9 148 46 Tu 1 . + CDS 217159 - 218196 695 ## COG1609 Transcriptional regulators + Term 218226 - 218275 11.4 + Prom 218199 - 218258 9.7 149 47 Op 1 . + CDS 218492 - 221686 2474 ## Phep_3596 TonB-dependent receptor plug 150 47 Op 2 . + CDS 221710 - 223317 1348 ## Phep_3595 RagB/SusD domain protein 151 47 Op 3 . + CDS 223349 - 225055 1303 ## Phep_3594 hypothetical protein 152 47 Op 4 . + CDS 225074 - 227686 1816 ## Phep_3593 hypothetical protein 153 47 Op 5 . + CDS 227690 - 231118 2136 ## Phep_3592 hypothetical protein 154 47 Op 6 . + CDS 231121 - 233166 1278 ## COG2755 Lysophospholipase L1 and related esterases 155 47 Op 7 . + CDS 233183 - 235678 1903 ## COG1472 Beta-glucosidase-related glycosidases 156 47 Op 8 . + CDS 235722 - 238004 1373 ## Cpin_3542 hypothetical protein + Prom 238054 - 238113 4.8 157 48 Tu 1 . 
+ CDS 238136 - 239161 804 ## Phep_3597 regulatory protein GntR HTH + Term 239184 - 239235 7.6 - TRNA 239221 - 239297 79.9 # Asn GTT 0 0 - TRNA 239323 - 239399 79.9 # Asn GTT 0 0 + Prom 239393 - 239452 2.7 158 49 Op 1 . + CDS 239477 - 240466 866 ## COG0524 Sugar kinases, ribokinase family 159 49 Op 2 . + CDS 240512 - 242200 1838 ## COG0793 Periplasmic protease + Term 242225 - 242286 20.4 - Term 242215 - 242268 9.1 160 50 Tu 1 . - CDS 242342 - 243541 921 ## COG0642 Signal transduction histidine kinase - Prom 243610 - 243669 8.1 + Prom 243642 - 243701 6.6 161 51 Op 1 . + CDS 243725 - 245143 1657 ## COG0499 S-adenosylhomocysteine hydrolase + Prom 245148 - 245207 4.5 162 51 Op 2 . + CDS 245260 - 247773 2064 ## BT_2796 hypothetical protein + Term 247828 - 247869 10.7 - Term 247815 - 247856 10.7 163 52 Tu 1 . - CDS 247928 - 248524 466 ## gi|299144693|ref|ZP_07037761.1| conserved hypothetical protein + Prom 248828 - 248887 1.6 164 53 Op 1 4/0.000 + CDS 248916 - 250247 1211 ## COG1538 Outer membrane protein 165 53 Op 2 . + CDS 250270 - 251361 1059 ## COG1566 Multidrug resistance efflux pump 166 53 Op 3 . + CDS 251411 - 253048 1186 ## BT_2793 putative MFS transporter 167 53 Op 4 . + CDS 253087 - 253962 606 ## COG2207 AraC-type DNA-binding domain-containing proteins 168 54 Tu 1 . - CDS 253967 - 254620 703 ## COG0035 Uracil phosphoribosyltransferase + Prom 254594 - 254653 5.5 169 55 Tu 1 . + CDS 254901 - 256508 1750 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 256531 - 256570 9.1 + Prom 256557 - 256616 2.6 170 56 Tu 1 . + CDS 256658 - 257968 603 ## COG0249 Mismatch repair ATPase (MutS family) + Prom 257973 - 258032 6.4 171 57 Op 1 . + CDS 258086 - 260575 2140 ## BT_4682 hypothetical protein 172 57 Op 2 7/0.000 + CDS 260607 - 261164 617 ## COG2059 Chromate transport protein ChrA 173 57 Op 3 . + CDS 261161 - 261685 441 ## COG2059 Chromate transport protein ChrA + Term 261704 - 261754 19.1 - Term 261691 - 261742 20.1 174 58 Tu 1 . 
- CDS 261767 - 263566 1998 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 263712 - 263771 4.1 + Prom 263519 - 263578 4.1 175 59 Tu 1 . + CDS 263722 - 263991 446 ## PROTEIN SUPPORTED gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 + Term 264017 - 264055 6.5 + Prom 264092 - 264151 9.3 176 60 Op 1 . + CDS 264178 - 264753 570 ## COG1396 Predicted transcriptional regulators 177 60 Op 2 . + CDS 264789 - 266438 1501 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 266488 - 266548 7.1 - Term 266475 - 266535 7.1 178 61 Tu 1 . - CDS 266593 - 267675 782 ## COG0836 Mannose-1-phosphate guanylyltransferase - Prom 267864 - 267923 80.3 + TRNA 267840 - 267923 50.2 # Leu CAA 0 0 - Term 267822 - 267895 20.1 179 62 Op 1 9/0.000 - CDS 268080 - 268850 514 ## COG3279 Response regulator of the LytR/AlgR family 180 62 Op 2 . - CDS 268868 - 270610 1105 ## COG3275 Putative regulator of cell autolysis + Prom 270639 - 270698 2.8 181 63 Op 1 27/0.000 + CDS 270898 - 272013 1036 ## COG0845 Membrane-fusion protein 182 63 Op 2 9/0.000 + CDS 272036 - 275242 3305 ## COG0841 Cation/multidrug efflux pump 183 63 Op 3 . + CDS 275239 - 276609 369 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 + Prom 276622 - 276681 6.4 184 64 Tu 1 . + CDS 276769 - 277113 384 ## gi|260173289|ref|ZP_05759701.1| hypothetical protein BacD2_15565 + Term 277135 - 277188 -0.2 + Prom 277279 - 277338 6.1 185 65 Tu 1 . + CDS 277547 - 278050 301 ## gi|260173290|ref|ZP_05759702.1| hypothetical protein BacD2_15570 + Prom 278174 - 278233 9.5 186 66 Op 1 . + CDS 278400 - 279188 467 ## gi|260173291|ref|ZP_05759703.1| hypothetical protein BacD2_15575 187 66 Op 2 . + CDS 279185 - 280630 1166 ## gi|260173292|ref|ZP_05759704.1| hypothetical protein BacD2_15580 188 66 Op 3 . + CDS 280641 - 281333 376 ## gi|260173293|ref|ZP_05759705.1| hypothetical protein BacD2_15585 189 66 Op 4 . 
+ CDS 281352 - 281720 358 ## gi|260173294|ref|ZP_05759706.1| hypothetical protein BacD2_15590 190 66 Op 5 . + CDS 281726 - 283711 893 ## COG0464 ATPases of the AAA+ class 191 66 Op 6 . + CDS 283724 - 284605 653 ## gi|260173296|ref|ZP_05759708.1| hypothetical protein BacD2_15600 192 66 Op 7 . + CDS 284629 - 286338 934 ## gi|260173297|ref|ZP_05759709.1| chaperone protein DnaK 193 66 Op 8 . + CDS 286335 - 289031 1356 ## Shewana3_1825 ATPase-like protein 194 66 Op 9 . + CDS 289052 - 289693 310 ## gi|260173299|ref|ZP_05759711.1| hypothetical protein BacD2_15615 195 66 Op 10 . + CDS 289699 - 290637 776 ## gi|260173300|ref|ZP_05759712.1| hypothetical protein BacD2_15620 196 66 Op 11 . + CDS 290630 - 290980 316 ## gi|260173301|ref|ZP_05759713.1| hypothetical protein BacD2_15625 + Prom 291145 - 291204 7.7 197 67 Op 1 . + CDS 291325 - 291690 390 ## gi|260173302|ref|ZP_05759714.1| hypothetical protein BacD2_15630 198 67 Op 2 . + CDS 291742 - 293598 529 ## gi|260173303|ref|ZP_05759715.1| hypothetical protein BacD2_15635 199 67 Op 3 . + CDS 293607 - 294362 480 ## gi|260173304|ref|ZP_05759716.1| hypothetical protein BacD2_15640 200 67 Op 4 . + CDS 294386 - 294901 453 ## gi|260173305|ref|ZP_05759717.1| hypothetical protein BacD2_15645 201 67 Op 5 . + CDS 294909 - 295715 709 ## gi|260173306|ref|ZP_05759718.1| hypothetical protein BacD2_15650 202 67 Op 6 . + CDS 295748 - 297373 776 ## COG0464 ATPases of the AAA+ class 203 67 Op 7 . 
+ CDS 297385 - 297672 220 ## gi|260173308|ref|ZP_05759720.1| hypothetical protein BacD2_15660 Predicted protein(s) >gi|225935348|gb|ACGA01000044.1| GENE 1 774 - 1355 505 193 aa, chain + ## HITS:1 COG:no KEGG:BF4460 NR:ns ## KEGG: BF4460 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 188 1 189 1086 252 65.0 6e-66 MTRKTNRCPAFMEGRGTRCLKIAAVATLLLWCTQPMQAVANTNETQETADIQQQKVKTTG VVVDENGEPLIGVSVKVQGTATGTVTDLNGRFSIDSPKGAVLSLSFIGYKTITVKADGTP LNIVMKEDSEQLDEVVVVGYGTQKKVNVTGAVGMVDAKVLAARPVTNVAQALQGTVPGLN FTVGSEGGPWMVR >gi|225935348|gb|ACGA01000044.1| GENE 2 1352 - 4039 2198 895 aa, chain + ## HITS:1 COG:no KEGG:BF4460 NR:ns ## KEGG: BF4460 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 895 195 1086 1086 1398 75.0 0 MSFNIRGAGTIGDGSGSSPLVLIDGIEGNLNSLNPNDIETVSVLKDAASASIYGARAAFG VILIQTKKGKAGKARVSYNGNVRFSDAVSVPEMMDSYTFAQYFNRAAENAGRSGEAPFSS EQLQKIKDYQEGKITSVTTLNTNNNRWNNYGGANANTDWFKEFYNDWVPSQEHNLSISGG NEKIQFSLSGSFMDQNGLLRHGEDNLQRYTMNSTITAQITDWFRVNYSTKWTREDFDRPS YLTGLFFHNIARRWPTCPAYDPNGYPMNGVEITELEDGGRQTNQKDLNTQQLQFVFEPVK NWTINIEGALRTENKNEHWEVLPIFSHDGDGNPYSIPWGSYGAGSSVVNEYNYKENYYST NIYSDYFKQYESGHYFKVMAGFNAELYKTRSLSGQKNTLISNSVPTLNTATESPTTSGGY AHNGVAGFFGRINYNYKERYLIELNGRYDGSSRFINDKRWGFFPSLSVGWNIAREDFFRN IADKAHIDVLKLRGSWGQLGNTDTKDAWYPFYQTMPQGTDYNWLVNGKRPNYASLPGIVS SLKTWETIETWDIGLDWGLFNNRLTGSFDYFVRWTYDMIGPASELSSVLGATPPKINNSD MKSYGFELELGWRDMIGDFSYGAKFTLADDQQKITRYPNENNKLSEAYYPGMMKGEIWGY ETVGIAQSQEEMDAHLAKVDQSSLGSKWGAGDIMYKDVDGDGVISTGDNTTEKPGDRVIL GNNTPRFKYGITLDAAWKGVDFRIFLQGVAKRDYVLSGPYFWGANGVDEWQATGFKEHWD FWRPEGDPLGANTNAYYPRVLKSDSRNMAVQSRYIQNAAYCRIKNVQIGYTLPKSWTNKA GMSSVRVYVSGDNLLTFSHMSKIFDPEALESTYDANNGKLYPLQRTISVGLNVNF >gi|225935348|gb|ACGA01000044.1| GENE 3 4051 - 6051 1508 666 aa, chain + ## HITS:1 COG:no KEGG:BF4257 NR:ns ## KEGG: BF4257 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 666 3 675 675 852 60.0 0 
MKLKHIILQTILMAGATWSLTSCNDFLDMAPLDQVTPQEYFNTADHLAAYSISQYNNIFS THGGYGVGTVNNDQNTDNMVAGGYSSTYFEKGQWRVPNTGGGWDFTQIRYCNYFFENVLP KFEVGKIVGNREQILHYVGEMYFIRAWIYYSKLKSFGDFPIITEVLPDNQSVLTEKSVRQ PRNLVARFIINQLDSAAKYMVDDISGNKTRLTKNCALLIKSRVALYEATFEKYHKGTGRV PGDATWPGKRIHADFSTNMDNEINFFLDEAMKAAEQVADDITLTNNTGLTNPKNISGIVG WNPYFEMFAEEDMSSYSEVLFWKQYMNGGGVSITHGTPAYIFTGGNNGMLKSFVDCFVMK DGLPYYAAGDEYKGDKTIMDVKENRDLRLQMFVLGEKDLLPSSSTEEPALKEFKQPNIIS LEAQTSDNTGYRIRKCLTYNNKQIISGQSQSTTGCIIFRGVEAYLNYLEAYYLRNNKVDG KAKSYWDAVRNRAGITGNFQTTIDNTDMNKETDLAAHPGSYEVDATLYNIRRERRCEFIG EGMRWDDLVRWRAWDGVLTNKFIPEGYNFWEEAYKTYELPEDVTLKDDPDDATSNISSRA FKYIRPFSKVRINNQVYDGYSWAKANYLSPVAINEMRLASPDNSTDNSVIYQNPYWPEKV GGTALE >gi|225935348|gb|ACGA01000044.1| GENE 4 6155 - 7165 1071 336 aa, chain - ## HITS:1 COG:Rv2454c KEGG:ns NR:ns ## COG: Rv2454c COG1013 # Protein_GI_number: 15609591 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Mycobacterium tuberculosis H37Rv # 2 335 38 372 373 329 47.0 4e-90 MSDKVYTVQDYKSGQPRWCPGCGDHAFLNSLHKAMAELGVAPHNIAVISGIGCSSRLPYY VNTYGFHTIHGRAAAVATGAKVANPDLTIWQISGDGDGLAIGGNHFIHALRRNIDLNMIL LNNRIYGLTKGQYSPTSERGLVTKSSPYGTVEDPFHPAELAFGARGHFFARCIAVDGAAS VEVLKAAANHKGASVVEVLQNCVIFNDGTHASVATKEGRAKNAIYLEHGKPMLFGENKEF GLMQEGFGLKVVKLGENGITEKDILIHDAHCQDNTLQLKLALMEGPDFPIALGVIRDVDA PTYNDAVIGQIEEIKGKKKYHNFQELLMTNDTWEVK >gi|225935348|gb|ACGA01000044.1| GENE 5 7169 - 9019 1722 616 aa, chain - ## HITS:1 COG:MT2530_2 KEGG:ns NR:ns ## COG: MT2530_2 COG0674 # Protein_GI_number: 15841979 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 213 606 1 389 425 380 52.0 1e-105 MADEMMVKELEEVVVRFSGDSGDGMQLAGNIFSNVSATVGNDICTFPDYPADIRAPQGSL TGVSGFQVHIGAGQVYTPGDRCHVLVAMNPSALKTQIKFCKPQGLIITDSDSFEARDLEK 
AQFKTDNPFEELGVKQEVLEVPISSMCKESLKDSGLDNKSALRCKNMFALGLVCWLFNRN LAAAEKMLREKFAKKPEIAEANIKVLNDGFNYGANTHASVSTYKIESKAPKSKGLYTDIN GNKATAYGLIAAAEKAGLELYLGSYPITPATDILHELAKHKSLGVKTVQCEDEIAGCASA VGAAFAGALAVTTTSGPGVCLKSEAMNLAVIGELPLVIVNVQRGGPSTGLPTKSEQTDLL QALYGRNGESPMPVIAATSPTNCFDAAYMACKIALEHMTPVVLLTDAFVANGSAAWKLPN LNEYPAINPPYVTPDMAGNWTPYQRNEETGVRYWATPGTEGFMHRIGGLEKSNETGAIST EPENHNKMVHLRQAKVDKIADYIPELEVLGDEDADLLIVGWGGTYGHLRLAMDFMRDHGK KVAFAHFQYINPLPKNTADVLRKYKKIVVAEQNLGQFAGYLRMKIPGLNISQFNQVKGQP FVTRELIDAFTKLLEE >gi|225935348|gb|ACGA01000044.1| GENE 6 9469 - 12495 3062 1008 aa, chain - ## HITS:1 COG:AGc2877_1 KEGG:ns NR:ns ## COG: AGc2877_1 COG0342 # Protein_GI_number: 15888881 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 394 667 275 548 562 248 45.0 5e-65 MQNKGFVKVFAVLLTLVCVFYLSFSFVTRHYTNKAKEIANGDPKVEQDYLDSLSNEKVWL GNWTLKQCREMEISLGLDLKGGMNVILEVSVPDVIKALADNKPDEAFNNALAEAAKQAIN SQDDIITLFVREYHKAAPNAKLSELFATQQLKDKVNQKSSDAEVEKVLRTEVKAAVENSY NVLRTRIDRFGVVQPNIQSLEDKMGRIMVELPGIKEPERVRKLLQGSANLEFWETYTAKE VLPAMQSADAKLRAVLAHETTAVDTAAVDSTKEAQLAEATPAKKNISAADSLAAALKGDA TAAEDKSTANLAEIKKQYPLLAILQLNSSGQGPVVGYANYKDTADINKYLAMPEVKAELP KDLRLKWGVSPSEFDKKGQTFELYAIKSTERNGKAPLEGDVVTDAKDEFDQYSKPAVSMT MNSDGARRWAQLTKQNIGRSIAIVLDNYVYSAPNVNSEITGGRSQITGHFTPEQAKDLAN VLKSGKMPAPAHIVQEDIVGPSLGQESINAGMFSFIVALIMVMCFMCFLYGFIPGMVANV ALFMNFFFTMGILSSFQAALTMSGIAGIVLSLGIAVDANVLIYERTKEELRSGKGVKQAL SDGYSNAFSAIFDSNLTSIITGVILFNFGTGPIRGFATTLIIGILCSFFTAVFMTRVFYD HFMSKDKLLNLKFQSKFSKNLFVNTHFDFMGTNKKAFIITSAVILICIASFVIRGLSQSI DFTGGRNYKVQFEQPIEPEAVRELIANDFGEATVSVIAIGTDKRTVRVSTNYRINENGNT VDSEIEERLYNAVKPLLTQNISLQTFIDRDNHTGGSIISSQKVGPSIADDIKTGAIYSVV LALIAIGLYILLRFRNIAYSVGSVVALSCDTVIIIGAYSLLWGIVPFSLEIDQTFIGAIL TAIGYSINDKVVIFDRVREFFGLYPKRDKRQLFNDSLNTTLARTINTSLSTLIVLLCIFI LGGDSIRSFAFAMILGVIIGTLSSLFVASPIAYNMMKNKKVVASTTEE >gi|225935348|gb|ACGA01000044.1| GENE 7 12661 - 14748 1975 695 aa, chain - ## 
HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 25 695 35 716 716 595 46.0 1e-169 MMIKKTLTILAVSCMMYSCATKTESNPFFTEFQTEYGAPSFDKIKLEHYEPAFLKGIEEQ NQNIEAIIESPEIPTFDNTIVALDNSAPILDRVSTIFFNMTDAETTDSLTALSIKLAPVL SEHNDNISLNGKLFKRVNDVYQKKDSLNLTSEQERLLDKTYKRFIRSGANLSEKDQARLR EINKELSTLGITFSNNVLNENNAFQLFVDKEEDLAGLPEWFRQSAAEEAKAAGQPGKWLF TLHNASRLPFLQYSENRPLREQIYKAYINRGNNNDGNDNKENIRKIVSLRLEKANLLGFD SYANFVLDETMAKNANNVMSLLNNLWSYALPKAKAEAGELQKLMDKEGKGEKLEAWDWWY YTEKLRKEKYNLSEEDTKPYFKLENVRDGAFAVANKLYGITLSKLEGIPTYHPDVEVFEV KDADGSQLGIFYVDYFPRPGKSGGAWMSNYREQHGTTRPLVCNVCSFTKPVGDTPSLLTM DEVETLFHEFGHALHGLLTKCEYKGTSGTNVVRDFVELPSQINEHWATEPEVLKMYAKHY QTGEVIPDEIIEKILQQKTFNQGFMTTELLAAAILDMNLHTMTDVKNLDMLAFEKEAMDK LNLIPEIAPRYRVTYFNHIIGGYAAGYYSYLWANVLDNDAFEAFKEHGIFDKNTADLFRH NVLEKGDSEDPMVLYKNFRGAEPSLEPLLKNRGMK >gi|225935348|gb|ACGA01000044.1| GENE 8 14777 - 15862 785 361 aa, chain - ## HITS:1 COG:no KEGG:BT_2833 NR:ns ## KEGG: BT_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 589 80.0 1e-167 MEHIGKFVTYLILAVNALFVGTLILSAYSPYLNPKINPLASSLGLAFPIFLTINLIFIGF WAFVNYRYALLPAIGLLICIPQIRTYIPFNSTTKTIPEGSIKILSYNVMSFSNLEKKDGK NPILSYLANSNADIICLQEYNTATNKKYLTEQDVKKALKAYPYQSIHQQGKGDVQLACFS KFPILSIHPIEYKSNYNGSMRYVLNVNNDTLTLINNHLESNKLTKEDRGIYEDMIKDPNA KKVKTGLRQLIRKLTEASAIRASQADSVARVIAESKYPTTIVCGDFNDGSISYTHRILTQ ELDDAFTQSGKGLGISYNLNKFYFRIDNILISPNLKAYNCTVDRSIKASDHYPIWCYISK R >gi|225935348|gb|ACGA01000044.1| GENE 9 15867 - 16772 761 301 aa, chain - ## HITS:1 COG:MA3859 KEGG:ns NR:ns ## COG: MA3859 COG0705 # Protein_GI_number: 20092655 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Methanosarcina acetivorans str.C2A # 34 214 39 212 226 89 32.0 9e-18 
MGHIIADLKETFRRGNIFIQLIYINVGIFVIGTLINVFLRLFEVSTPDIFGIFALPASFI GFIHQPWSLFTYMFMHAGILHILFNMLWLYWFGSLFLYFFSAKHLRGLYVLGGICGGFLY MVAYNVFPLFSSQVAGATLVGASASVLAIVAATAYREPNYRVQLFLFGAIRLKYLALIVI GIDVLSITSSNAGGHIAHLGGALAGLWFAASLSKGTDLTSWINWILDGFISLFQKKTWKR KPKMKVHYGNSATGREKDYDYNAQKKAQSDEVDRILEKLKKSGYDSLTTEEKKSLFDASK R >gi|225935348|gb|ACGA01000044.1| GENE 10 16753 - 17427 612 224 aa, chain - ## HITS:1 COG:XF0649 KEGG:ns NR:ns ## COG: XF0649 COG0705 # Protein_GI_number: 15837251 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Xylella fastidiosa 9a5c # 1 221 9 211 224 147 43.0 2e-35 MPTVTKNLIIINVLVFFGTIVAQRYGLDLTNYLGLHFFLASDFNPAQLITYMFMHGGFSH IFFNMFAVFMFGPILEQTWGPKRFLFYYILCGIGAGLIQEGVQYIQYVTELSHYAQVNIG TGVIPMEEYLNMMTTVGASGAVYAILLAFGMLFPNNRLFIFPLPFPIKAKFFVIGYAAIE LWSGLANSAGDNVAHFAHLGGMLFGLILILYWRKKSNNNGTYYS >gi|225935348|gb|ACGA01000044.1| GENE 11 17649 - 17921 338 90 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 89 1 89 90 79 60.0 1e-15 MNKSDLISAMAAEAQMSKADAKKALDAFITSVTNAMKAGDKVSLVGFGTFSVSERAERTG INPSTKATITIPAKKVAKFKAGAELSAAVE >gi|225935348|gb|ACGA01000044.1| GENE 12 17999 - 18763 722 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173111|ref|ZP_05759523.1| ## NR: gi|260173111|ref|ZP_05759523.1| hypothetical protein BacD2_14675 [Bacteroides sp. 
D2] # 1 254 1 254 254 479 100.0 1e-134 MKKVLLTLALVATASLSNAQVLRNNFLNGCKEGEPIEKAAYTAKKAPLNKDVWSAVFSEK EPFIGESPVAGKELSYKGYNEEGLSVTFGGLPEEATFRPSIYGLESGRTYSTGTYYLSFL VNFSKFKAKGYMDFISTSANHATGTSRGFVFASNQGSKLKFGVGIQKQRGSATKTYDLNT THLIVLKIDFAKNQASLFIDSELKDQEPTPDAVATEEGVLKAGIKGIMLKNRNNYAGNIG NFRFTDSWAGIIGK >gi|225935348|gb|ACGA01000044.1| GENE 13 18822 - 21650 1949 942 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 41 621 63 660 1087 219 30.0 2e-56 MIKIRQLLSLAFLLCFSSIGAQNTETLYLSGTGLGKTVTWDFYCSGGMNSGKWSKIEVPS QWELQSFGEYTFGRFYLVKEAKPSDETGLYRYKFKVPADWKDKQVRIVFEGVMTDTEVKI NNKLAGEIHQGGFICFNYDITDKLKFGKSNELEVKVWKESANKSVNAAERRADWWLFGGI YRPVYLKAMPKTHIERIAVNATADGKLSTEVYMNNLPEGYSLATSLTPVGESRAIGKQTN TLGTGNVQTISTDWKGIRTWDCEHPNLYTLRLELLDSQKQIVHVHEERIGFRTIEFRLKD GIYVNGTKVIMKGINRHSFHPDGGRTTNKEISLQDALLIKEMNMNAVRSHYPPDKHFLDV CDSLGLFYLEEFPGWHGRYDEKVGEKLLKEMMAHDVNHPCIFLWSNGNEGGWNKKLDARF ADYDPQKRHVIHPWADFNQLDTHHYPAYQTGTGRLANGYNVFMPTEFLHGQYDKGLGAGL EDYWNNYTSNPLFAGGFLWTFIDEAVSRSDKGGILDSDGPNGPDGIVGPRREKEGSFYTV REVWAPIQFKNLFITPSFNGEFMVSNTYLFTNLKECSMKYRLYATPSPLKGGERSLLNEG TVSLPAIDPGETGKARMQLPENFFLGDVLELEAYDRNGKVICNWTWPVKYAKEYFETQHI QTTSTKPAEIRKGNESITLSANGISATFNSKGGALVEIKSNKQIVPLNNGPLPVGIKADF KKGEVRMDGNDALFVVRYTGAIDSIVWRMTADGLLGMDAILLNRGNGGGYKGEFFDKNIY YLGLSFSFPEKEVKGMRWMGRGPYRVWKNRIKGANYNIWEKEYNNTVTGESFEKLIYPEF KGYHGNLYWATMEADKVPFTVYSETDGLYFRVFTPEEPVHRRDGENSMHSFPEGDLSFLY DIPAIRSFKSIPEQGPKSQPSTIRIKSGDEGLRMKLWFDFRR >gi|225935348|gb|ACGA01000044.1| GENE 14 21686 - 24658 2356 990 aa, chain - ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 22 784 50 871 871 147 23.0 1e-34 
MKQKIILFIFIAFSLSVMTTVQASERKKYNFNSEWRLHVGDNEQASKTEFDDSNWKQVTL PHAFNEDEAFKLHIAQLTDTVVWYRKHFQVKDIKDKKVFIEFEGIRQGGSFYLNGEYLGI HENGVMAVGFDLTPFMKEGNNVLAIRIDNNWQYREKNTKSKFQWNDRNFNANYGGIPKNV FLHVTDEVYQTLPLYSNLKTTGVYIYATDFDIKGRKAKIHAESEIRNDGRETRKLTYQVT LKDMDGKTVKTFTSPGITLNPKETKVIQAAAEVDKLNFWSWGYGYLYTVTTALKDEKGEI FDEVITRTGFRKTHFGEGKVWLNDRVIQMKGYAQRTSNEWPAVGVSVPAWLSDYSNGLMV ESNGNLVRWMHVTPWKQDVESCDRVGLIQAMPAGDSEKDCGGRQWEQRTELMRDAIIYNR NNPSILFYECGNKAIRLEHMIEMKAIRDKFDPYGGRAIGSREMLDIREAEYGGEMLYINK SKHHPMWATEYCRDEGLRKYWDEHSYPFHKEGAGPLHKGKPATSYNHNQDMFAIEMVNRW YDYWRERPGTGLRVSSGGTKIIFSDSNTHCRGEENYRRSGVTDPMRIEKDAFFAHQVMWN GWVDTDKYQTYIIGHWNYPEKTVKPVYVVSNGEAVELFLNGKSLGKGKRESNFLFTFDKI TFRPGKLEAVSSDKNGMEVSRYHISTAGEATQLKLTAIQNPEGFHADGADMALLQIEVVD KEGRRCPLDNRTIKFSLTGEAEWRGGIAQGENNYILSKDLPVECGINRALIRSTTEAGKI IVTAQAEGLPATTLTLQTTPVKVTDGCSDYLPQYSLKGKLDKGATPLTPSYIDSKRDVAI IAAKAGSNQSEAIKSYDDNELSEWSNDGQLNTAWITYQLEKEATIDDICIKLNGWRSRSY PLEVYAGKTKIWSGSTEKSLGYVHLEIDKPVKSDKITIQLKGSTIDNDAFGEIVEVAGGA ANEMEKKAQSNKGKNGLRIIEVEFLETIKK >gi|225935348|gb|ACGA01000044.1| GENE 15 24673 - 26589 1136 638 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0246 NR:ns ## KEGG: Cthe_0246 # Name: not_defined # Def: carbohydrate-binding family 6 protein # Organism: C.thermocellum # Pathway: not_defined # 66 633 272 810 820 410 42.0 1e-113 MKYTPLFIIGFILCLSACTDNQLLPSSSDDGEKPVVKEERNYTPRMLWASINGRADATND KNKIHNKKILVSWRMLPTDDTEVAFDLYRKSGDRAEIKVNENAITVTNFQDVSADLSVDN TYRLCYHSSTVTIDTYTITAQQASAGLPYISIPLQGTEGIAPGLVYKTNDISIGDLDGDG QYEIVLKRLISSPDGDEDSETGDTNDGIRHSVLLEAYRLDGTFMWRMAMGPNVPTGNGSS FAVYDFNGDGKCEIALRTAEGTIFGDGQEIGDTDGDGKTDYRVAGKKYIHGGPEFLSVIE GATGKELARTNYIALGQSEDWGDNYYKRSSSYRIGMARCSNDATSVIIGRGCYAKIVVEA WNFSDNQLSRVWRFDTTDGIHSDYDGQGYHSMSVGDVDNDGLDEIVYGSCTIDHNGKGLN CCGLGHGDALHLGKFAPSRKGLQIWSCFESGTVGAALRDANTGDVIWKFDKSGDVGRCLV ADIDPDSPGCEMWWYKGNAHSSSGVDLGYVPSSCNMAVWFSGSLNRQLLDKGTVNSYKDG RVFTIYRYGVTTINGTKANPCFYGDFLGDWREEIIQPTSDNTELRIFSTWYPTEYQFPYL MSDHIYEMSALNQNIGYNQPTHTSYYIGSDLIENKEKK >gi|225935348|gb|ACGA01000044.1| GENE 16 26611 - 29136 
2282 841 aa, chain - ## HITS:1 COG:no KEGG:BT_2958 NR:ns ## KEGG: BT_2958 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 834 5 784 789 652 42.0 0 MLLRSFLLIIVFVCSQIVRCDDFVWYNGKQAISYAMPQSTAPVVKIAFDMFAGDMKQVTG FSPKQSSASKATIRIIQLDKADKKVQKELRELNIPVDNIAAKKDGFYISVAGGSANKQLL IVGSDGRGTAYGILELSRLAGVSPWVWWSDVTPEKRSQLTIDGNYKTFQSPSVEYRGIFL NDEDWSLQPWSWMTFEPSNIKGRIGPRTYKEIFKLLLRLRANAIWPGMHGITTPFYLVPG AKEVADSCGIAIGTSHCEPLMRSNVGEWSVKRRGDYNYITNRDSVQAYWIERLKEAGKYE NMYTIGMRGIHDGHMEGVSTMEEKVNALQQVINDQRKLLTKYTNPDLTKVPQVFVPYKEV LQIMENGLEVPDDVTLMWCDDNYGYMTRLSDEEQQKRSGGGGVYYHLSYWGRPHDYMWLC TTQPGLIYNEMKQAYDHNARRVWIVNVHDLKPSAYNLELFLDMAWDIHSVAPSTLNEHQE KWLCREFGEQAGKKLFPAMHEFYRLCGIRKPEHMGWTQVELDKRRYPRGRSQVIDTEFSL TEFGGELDRYLDSYETIKKTVTEAEALIAPERKDAFFSHIKYQVFASAAMSTKMLEAQRA RSYSTGQCDESLWGRDKAMFSACARSMKAYQEIKELTDYYNNEMANGKWKHSMCFYPRDL YVFYPPTLPIGLTDKEVDKYLAESPKKTSSRKIKTDKCISYNACNYTQASEGATPIQALG HSMNAVALPKGGSLTFEFDCPWEGEALLRTAVIPTQPNDKGDIRFSVSIDGGKPQVLSFR EKGRTETWKRNVLRGQAIKETKHSLKKGKHTLTITALDKHVVVDQWMIDFKPNRKFYVFP Q >gi|225935348|gb|ACGA01000044.1| GENE 17 29162 - 30400 1012 412 aa, chain - ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 164 379 8 223 232 172 40.0 1e-42 MKTTLLSLLLLASVSIEAQNLPQTFKLADAPRYSEKTGYGYDRVDIPKKGSNAPFYFSVR VPDGNYRVTVRLGSKNQAGNTTVRAESRRLFVENLPTKKGEFVEETFIVNKRSPRISAKE SVKVKDREKSKLDWDDRLTIEINGEAPACESIHVEAADPSIPTIFLCGNSTVVDQENEPW ASWGQMLPRFLNDQISVANHAESGLSANTFISGNRLKKIISQMKKGDYVFVEFGHNDQKQ KGAGKGAYYSFMTYLKTFIDEVQAKGGNPVLITPTRRRRFNKEGRTVNTHGEYPDAVRWI AAKENVPLIDLNNMTGTLYEALGESESKKAFVHYPMGSFSGQKKELADNTHFNPYGAYQI AKCVIEGMKKVTPELTKYLKTDISYNPSQPDKAETFLWCPAPFCEMEKPDGN >gi|225935348|gb|ACGA01000044.1| GENE 18 30519 - 32489 1426 656 aa, chain - ## HITS:1 COG:no KEGG:Tter_1100 NR:ns ## KEGG: Tter_1100 # Name: not_defined # Def: FG-GAP repeat protein 
# Organism: T.terrenum # Pathway: not_defined # 23 653 372 938 1065 374 38.0 1e-102 MWVSYDPTPRGTLNHTSGYSHALVSWRLLPTDPDNLAFDIYKSEDNGPEVKLNETPVTDA TCWADKSIHAQTTHRYRVTVAGSEQTLCEYTFTPQMAETFYRAIRLNSNLPDPSLTYAAN DAQVGDLDGDGVMEIVLKRQPYDGANQGGWREGTTLLEAYKLDGTFLWQIDMGINIRSGS HYTSFIVYDFDGDGKCEIAFRSSEGTKFANGEKITDASGNINDYRQKDPSGKGWYSGKSL HSTCGLIFDGPEYISIVNGQGIEVGRTNNIPRGGEGSNYERAKYWHHYWGDDYGNRMDRF FIGAAYLDGIPQNGVKKANPSLIVTRGIYRNWQVWALDFNGSSLLPRWKFNTNDKGYESY RDMGSHTFRVADLNGDGYDEILYGSAAIARNGKGLYCTGNGHGDALHVGKFIPDRPGLQV VACFENPNKYINSGFGYGCAIFDAATGKFITGHAGGSGSGSGEDGADEEDEEDEDKDPPG DVGRCLVADIIPDSPGYEYWSSEASGVYSCQTGQLLSTTLPSAKTGSKSYNNAIFWTGDL TRQMLDDVMVHSYTKSAPWDKSRVVTFTYYGSGVYSNNSTKANPCYYGDFLGDYREEVIY RSKDGEYIYIFSSNHPTGHRFVHLMNDHTYDMSQAMQNVGYNQPTHLGYYIGADSK >gi|225935348|gb|ACGA01000044.1| GENE 19 32890 - 33762 706 290 aa, chain + ## HITS:1 COG:no KEGG:BF3848 NR:ns ## KEGG: BF3848 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 264 8 277 304 110 32.0 4e-23 MKTRKLKIAYAFLLFMPAVTLLQTSCSDNESDDAAGQGSVEIAANWDDYSEEAELPGNYT LAVVGIGEQAMDTRTATFSSLLDAGTYELAAYNTPENMTVNNLTATVALHENGTLQQPGY LFSNTRIKSVVVEPGKTVKTSLKMKQRVRKLTLTLTPKGGDPAQLEATELEAVLSGIAST FNLATEELSEAKDIELVFKKQPSGSYSATICILGVMAQEKQELTFQLDLSLGRHYKFKGD LTDVLRYFGNDIAPLSLKARFELPGGIDGSGTILPWEPEFVDGPDDVELH >gi|225935348|gb|ACGA01000044.1| GENE 20 33774 - 34676 671 300 aa, chain + ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 293 8 301 309 63 24.0 6e-09 MRTRILLTAVSFVMLAACNNLQEAEEAAGQFPVAASVWGTVNADNVPVSRVANDVWETGD AIGITGKSGNVEYVNKRYDFSKGSSFIPAGEIIYYIDKKTVEFSAYHPFNETGGAFKVNV SNQNESKNFDYLYATAEGSEAAPLLDFKFRHQMCKVVLNFLPGSGFAEDADFSNGIIRFA SLHPTGTFNTITGSAEINAEEAGGLEFIAGTTIVPAGNTYSFILLPETVSGGVALSFIQE DGTVYEGASLTGKESADLALTPGKVYEYNVTVNRERLTISSSAIHGWDETPGDKDVNINI >gi|225935348|gb|ACGA01000044.1| GENE 21 34731 - 36500 1318 589 aa, chain + ## HITS:1 COG:no KEGG:BF3838 NR:ns ## KEGG: BF3838 # Name: 
not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 269 18 305 337 94 30.0 1e-17 MFAWLFVMAAGGVLTSCTSDEKEISGSPDNKYPMEFVGQVFNQSITRTTSEGTWTGTEEV GVQVGDGIKLYKADTYGALTTEDTPFYWESKTATVDVLAWFPCNGTEALPASFPVQQDQN QNDGFQNSDFLRAKGIFTFSGQQPALDFYHLPAKVKLNLKAGEGVEDPETVKNAEVTFVN MALASGEINMATGEVAQTTGEATIIPQKLDEESVTEGCLQTLQALLVPQQMRGKQFIKVV VDGVEAYYIPAEDDANLQAGYLYVYNIEITHRNEIEVTLAVSGPAWAEGKEQTVVSSTYY TADDMKPGDFFYRTADGTGWAVSDGGLRKVNHATGEKEWETPAQSPVDFTDGRVCIGIVF QTESKRISALEKAKGWTHGYVMALTDAAEKCTWGDKTIDEDTGEMTEGINYFPNLATNFD MYGDIDGYGKKVYIAGQKLAGVTTADGTLYDVFYHSENYGTDDSGQYAAPADGTTSGWYL PTIGQWWDILENLGDANGLIALRNSSGTTVELKNGVSQVAIDRMNELMSASGQTVTLFKA DVHYWSSSEKSSGVARRVLFDGSGKKYSLRIGDNNKDNPSTNVRCILAF >gi|225935348|gb|ACGA01000044.1| GENE 22 36554 - 38164 1264 536 aa, chain + ## HITS:1 COG:no KEGG:BVU_2520 NR:ns ## KEGG: BVU_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 506 1 537 568 98 25.0 6e-19 MKLYKLFILSASVFLLTALSGCSKEENAFTTLPEGGLPLTLKVTSDNYISTGTVETRIVE EGYTTQFTSGDEIGVLQLTPSSGNGGETVAKNYRFTLDDEGNWTGVPLSHKQGDEYIVYY PYTSVSSKEELEVYFTSFEPRLNQATYEDYTKSDLMKGEGLVSGTTLNVNLEHQMALIVI EVPLGKCATTKDGKVKYHPATTVLGAPAFSWEEGKVPYQTSDERILRYIIKPDADFSLKG SYPYYIGTRKFDIAATASDAAAKHYLFYTLDEGVEPPGSRDVQVGDYYMSDGTILPGDVA SVPEGCIGIVFQADTERMSVDELAQGWTHGCVMALTNAGKDSKWGKAATEGDGENSPFAE HASTYKGMYQEIGGYVKTKYMVDTYANEATFQEVNGAFYQASQYGETAATTSYKAPKGSS GWYLPSIGQWWDIFENLGEAKGLTALKDNETTSSNIAISLSGESVFENLNTCLKAASSSV DEFESSKNYWSSSEHNRNNKGLVYAHSVSLTATSLTTTGATKTSNSNRRVRCVLSF >gi|225935348|gb|ACGA01000044.1| GENE 23 38263 - 39198 821 311 aa, chain + ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 23 256 5 228 232 131 34.0 2e-30 MKMKFLFLVALVLFTSAITDNSPVTIYMIGDSTMANRNLAKSPRNKERGWGMMLGSFFPA DKVVVANHAASGRSTKSFIDEKRWTRVVNQIKPGDYVFIQFGHNDEKKDNPKLYTEPGGT 
FDDNLRKFVNETRAKGGIPILFNSIVRRKFCQDAAGNFTDSLSDTHGDYLLSPKQIAEEL NVVFIDMNKMTHDLVQQMGPEKSKELYMWAGKKDDTHLNIKGSRVFAGMAIDAVGKKIPE LGKYIRHFDYVVATDGSGDFFTLDEALKAIPARKKCTVLVRTGQYGLKPEIRNKLIKITE DEGVTYGSPAI >gi|225935348|gb|ACGA01000044.1| GENE 24 39495 - 41387 1668 630 aa, chain - ## HITS:1 COG:no KEGG:BVU_0152 NR:ns ## KEGG: BVU_0152 # Name: not_defined # Def: polysaccharide lyase family protein 11, rhamnogalacturonan lyase # Organism: B.vulgatus # Pathway: not_defined # 22 630 22 633 635 964 74.0 0 MKRYIYRLMCVFALSSTWTANAQPNYNYSKLQKEKLGRGVVAIRESPSTVVVSWRYLSSD PMETAFNVYRGGKKLTAQPVVTGTLFRDENSSLETAAYEVRPVLKGKETHHIDGKYTLPA NAPLGYLQIPLQKPADGVTPAGDTYTYSPNDASIGDVDGDGEYEIILKWEPSNAKDNSHD GYTGEVYFDCYRLNGEQLWRINLGKNIRAGAHYTQFMVYDLDGDGKAEVVMRTSDGTVDG KGKVLGNAKADYREPGVFDKKKNKLTRQGRILEGNEYLTVFSGLTGEALYTTDYIPARGN PKDWGDTRANRSDRFLACIAYLDGVRPSVVMCRGYYTRIVLAAFNWDGKKLNKHWVFDTN IPGNEKYAKQGNHNLRVGDIDGDGCDEIIYGSCTIDHNGKGLYSTGLGHGDAIQLTQIDP ARKGLQVWACHENKRDGSTYRDAATGEIILQLKSTRDIGRCMAADIDPTNPGVELWSPGT GGIINFKGELIAPKTKNFPVNMAVWWDGDLLREVLDKTRISKYDWEQKCFIPFVEFEGVA SNNSTKATPCLQGDILGDWREEVLFRSEDNNSLHLYVSTIPTEYRFHTFLEDPIYRISIA TQNVGYNQPTQPGFYFGTDLKGTFRGYQFK >gi|225935348|gb|ACGA01000044.1| GENE 25 41407 - 42897 1126 496 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 33 371 5 363 448 142 28.0 2e-33 MRINSILTSLLTGVLLTACNPGKQTHHTYDWGKLPQQVDLSWADSVGSRQQPDGRIISAN SFGAVADSTRLSTEAIQKAIDECSAAGGGTVILAPGYYLVGALFIKSGVNLQLDKGVTLL ASTDINNYPEFRSRIAGIEMIWPSAVLNVIKQKNVAISGEGMIDCRGKKFWDQYWSMRRE YEKKGLRWAVDYDCKRVRGILVERSTDVTLKDFTLMRTGFWACQILYSDYCSINGLTINN NIGGRGPSTDGVDIDSSTNILIENCMIDCNDDNICLKSGRDTDGLRVNRPTENVVIRNCT TRKGAGLITCGSETSGGIRNILGHDLTAQGTWSVLRLKSAMNRGGIIENIYITRVKADSV RNVLAADLNWNPQYSYSTLPEEYSHKEIPEHWKILLTPVAPEEKGYPKFRNVYLSHVKAT NVREFISASGWNDTLRLENFFLYAIEAQAQKAGRIRYSRNFNLAEIMLDTKDNTTIASEN NDQCNIHLKSTPSGNL >gi|225935348|gb|ACGA01000044.1| GENE 
26 43224 - 45644 2056 806 aa, chain + ## HITS:1 COG:no KEGG:BT_4163 NR:ns ## KEGG: BT_4163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 36 438 2 404 750 110 24.0 2e-22 MIHTLFGLCKKRFSRLETCGLVMLMLLLLAGCTDTEEQYERPSWLEPPIYDVLAGKGNFS MYLHAADKTLYSSILKGAGNYTVFAPNDDAFRKFLSEHNYTSVDEVPVDVLTQIVAYSMV FNRFETARLGDVLNSGIWETGTSIKKRSSYYKTLYKETIDGVEQWVVDSPADVTAVVTPY KFLPIFTDSYFTGNLLTATDYEKFFPDAAYSGLNAAAGSVSNKDMYAENGIIHEVDAVSL PLDNLDEMLRRDEHESFRKILETKVGSSYLFVSYLLGENTTEVYKKLYPDRNISAVYCKT YQNLPYLLNNEDYKGTESGTTEQHGYTLLVPSNESVQRFAEMLCERAEVGDLSELSLTAL TYFLKAHMVPRIVWPSHFASEQNSNEEYLNGEGKDGPDFDACVTKSSFASNGVLYDTKEI VQSKYFTTVYSEILLNKDCRNLANIAYEKFFINDWIPEMTKSKLTGDKEVDYIMVLPSDE LLKADGFSYDEVNNKFLNENLTASADAEDRMKRLLRSCVFKRTKGLTELDDFAGFPQLSY DGYGYAVNIYGDMIRFRDNKLQGLGNILDGTEVEVEEVDFDYINGHVFRITNGTMIEYSP RNSGGTAAFTVPKLYDRITMYAQENSDCRLFKQYMDRVYGGSSPSFIRESTNYTVLIPTD EAIQNAVGSDLLPALPSGSDAFESEDKSMVERFIQLCFLTGTVVPDDGMPYIEPGKNESL ALNTVYKLTDSELDLWSVTTAVVVSKQADKSLTFRFRDITKNDWLQVEGYAPVNVVRESG KSNYVGPLSVIHAVDGIIGFTTHPKE >gi|225935348|gb|ACGA01000044.1| GENE 27 45681 - 49007 3041 1108 aa, chain + ## HITS:1 COG:no KEGG:BT_4164 NR:ns ## KEGG: BT_4164 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 1108 7 1088 1088 637 36.0 0 MRRYVYQFILLTVLCCMPALSSAQQTKVVIRGVVTAAIDKLPLPGANVLLVNKDGRVVGN AVTDMDGNYSMRVETRAGDNLVATYIGMKKATIPVKGKMTINIVMQDETVMLEGATIVGN KRVNNGMMDVSERDLTTAMSRISMSDLEDGMTAPSVEDALQGRIAGMDIAASSGDPGAGM SIRIRGTTSMTGSSQPLIVVDGFPYDVSVDDDFDFATADEEEYSQLLNVAPDDIKEITVL KDAAATALYGAKAANGVLMITTKRGTVSKPRISYTFKGTVISRPHGIETLSGDEYTTLIQ EALMNSGKIYDPTADPEFAYDVNQPYYFYNYGANTNWYDEVTRTGFSQDHTASISGGGDK AQYRASIGYYNANSVVIGTGLDRINARLNVDYNISKQLKFSASMAYTRTDDRKNYISYFS TGQNATSMAFTRMPNMSVYEYNEIGLWTGNFFTPEESPQGKWNPSSSSGGVYNPVAMAKD GYFQTLSNNVVSNLSLIWRPLAWMRYQSDFSLNVSNNKKNAFLPQTATGRPWNEATVNRA DDRDGEAFTIYTMNKVIFTPDLGKKHSFQGLLALTTSDKRSTNFRLTGGNLASPYLKDPS IPSRVTGASYLTSSTTLQQERSVGMMGSIQYSLLDRYIVNTTVRYDGNSRFGKENRWGLF 
PSVSLRWRLSGEPFMKPWKKWLNELSLRASYGLNGSSPKTNNNYTHISLYDSYEYSYLGE TGVYPANLELSNLKWQTTTQVNFGMNFAAFKNRLNIDVEVYQKRSKDQYFQNLSLPSTTG FTKADVNSGSMENKGFELSVNTVPYRSKAWNISFNINLARNINSILDVPDQYPMQKGVLT TNGQYVRRFELGQPIGAIYGFRYKGVYLNDDQTIARDANGNKIYSYDASGKRTPVQMRFA YPTIDYQFKAGDAIYEDINHDGNIDYQDVVYLGNANPILTGGFGPNIRYKSVSVSAFFNF RYGNKVINKMGMSLQNMASYNNQSKAVLRRWRHPYEDEATAPTDLLPRAVYGSKNAYNYL GSDRFVEDGSFLRFKTLTVKYTFDKKQLKNTFLQSCQIWTTLSNLYVWTNYSGMDPEIAL TGGAFKMGEDNSRIPRSFTATLGMSVTF >gi|225935348|gb|ACGA01000044.1| GENE 28 49019 - 50635 1454 538 aa, chain + ## HITS:1 COG:no KEGG:Phep_2220 NR:ns ## KEGG: Phep_2220 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 21 538 19 509 510 207 30.0 8e-52 MKKITKYIMIGLLALQGMTTTSCSDWLNLMPNDGVPLDEFWKTKEDVRSVVNGAYLSMTA NELVQRLFLYGEWRADMITTGRRTNSSIVSVFNGEISIDNTYLDWASFYQTINICNTILK FAPTAQANDPTFSEAQLKEYEGQAIAIRSLMYFYLVRTFGDVPFIREAYVNGSQQMSVAK SSEKEILAGLVADLEMVETGNYLPGQYSSTDIAQNKGRMTMWAVKALLADIYLWQEEYEK CNRKCDEIIGSGQLLLVPVESRKIEVENTEEGTTDIYYAPDQSSYYSLFQQIYYQGNSME SIFELQFATDNLNPFYSLMSSGRGVIGVKTERVNSDIFPPLADEAYSGECFDIRSVISQS KSCLWKYIGVEPNGSERAQEEYTNNFIIYRLAEIYLMKAEALTQMAILDSDNQELLKEAY AAVKVIRDRSTAVATTDIGQTEGSYSGKAMEEFVLAERGRELIFEGKRWFDVLRQAKRNN YDGENLNYLMTLAEYSAPASKVNSLKAKYRNYHSHYLPIYSEEIDANPLLEQNEFYAN >gi|225935348|gb|ACGA01000044.1| GENE 29 50639 - 52459 1510 606 aa, chain + ## HITS:1 COG:no KEGG:BT_4166 NR:ns ## KEGG: BT_4166 # Name: not_defined # Def: putative lipoprotein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 463 1 484 581 102 24.0 6e-20 MGGIYMNMKNKIGLNLIVSLLMLCFLNACDDPLDGTVYQTTEQQMLDEYMADHLGEFLKI VDKSDYRGMLHAYGAYTCLVPTDNAVRKYLEENRIDIEQLSKEEANECVGYHVINDTITS SRFQDGKMPTANIRDYYLTTQTKSDENGTIYVEVDRNARLLTKDVVLGNGILHIVDAVLE KPELTLREQVAALPMERYSLFIDLFAEYENYLNSVMTTDTCYTVYVQSNETFNNEGIHNK NELITRLKLNTPGINDSEALIQNFLAYHIGTGRQYIVDLMSSSAVMTKVENQVIACIMDG QKIVLNRFNSIAGYEPGIELIRDGEYADRTCADGVLQEVDGMLEIKERAPYRVNFDVCTL PEIKASASYLKETKKFEKGQLTQDIKIEYGRGSSTAAMTYTVGQVLDRNKPTLNQQMQCV 
NGDYLTFRMRKSDMITAVEFTLPLLIEGTYKVWLCYRHTDSGKNGPTLKTTFKQEGQEDQ ILDKIVTVKRVSAISYVKGADGKEKTDERGVRMVDHAAMETNYGWKQYTYYPHVYMNSNL LGVIKVFSSGRHVLRFDLEKDASLDIMLDMIQFIPTTEDQVWPMLDLTGMEIGEYSPGKE EVWPYE >gi|225935348|gb|ACGA01000044.1| GENE 30 52504 - 54186 1298 560 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173129|ref|ZP_05759541.1| ## NR: gi|260173129|ref|ZP_05759541.1| hypothetical protein BacD2_14765 [Bacteroides sp. D2] # 1 560 1 560 560 1133 100.0 0 MKRKFNKIRWALVLVAATLVAACNNQWDNHVAVDMPTLKGSVLEAVKANGELSGFYTLLQ ETGYDKVLQGAYEYTILAPVDEALAGYVKGLAEGEWNEEAKLMMVRNHIAFGTFNLTAIS QPDSHLKMINGKNRIMSELTFEPEHSDVLCNNGMLHVVDKVMEPLMNIDEYLQYLHALYP EEYEQLDSLYAKTTKIMDKDRSIQKGVNEKGQPVYDTIWTTRNYFFEEMPVNDEDSTYTF VLLRQANFQSLKEKYAKYMNQSTEELTDSLVTDELIRDLVFKPGVTTALSGVEVDFSKAV SFELAGVREYQASNGTIRFLNGVDIKIKENKVKTVVVEAEDYLGTYAASKTYTRLRTWAS GGKDVMVSSRSYQTDPVTGAEYSFTFNTNNMNTDANFYLQYAVNLNSVIYDVYLGSYDDM ENHIRPNEEADKATTLAVCQKLLASMPGDPALRRPAGTTSSIDNNYWGKECGFVGYSIAG DETYKNTPVHLRKYNLGSYMIPTTPVEEADAYDFEVPRMGEVQVMVCNTAGCHEYSKASQ RGGMMFLDYIKFVPRIADGE >gi|225935348|gb|ACGA01000044.1| GENE 31 54221 - 57397 2832 1058 aa, chain + ## HITS:1 COG:no KEGG:BT_4168 NR:ns ## KEGG: BT_4168 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 42 1056 27 1048 1050 474 30.0 1e-131 MNNKMKYIFLLIACVFALSGNAQKFNKKVRTLKVKKRIEYKERASGTIVDATTGSPLNGV RVAVPGLSTAMTDESGKFSIRIPSYEVELLVSSPGYQQKRIPLRGEKEVTVRLYDETHKS LFEDVLTPLGDMAGSQTTTALTQLNGDNSLSPATSPEALLQGTVPGLNTLFRSGMENAGA NMFMGGFNSIYTNNQPLLIIDGMVVENLSAGISLIDGYLSTPMSTIDVKDIERITVLKDA ATLYGVKGSNGAIVIETKRAKDAETRITAQITTGMNLKPSSIPMLDAVQSKRYLMDVYQS RGYSAGDIQKLPFINDTKPEQHSWGYEGNADYYRYNKNTDWQDQLFVQGFKQNYSIGVMG GDDVALYALSLGYLQNEGLVKGTDFSRFSARINTDINFSPKFTVQTNMNFVYGKKNLMQE GNASPQNPIYASLVKSPFMAGFTYNEQDQLSPNYEEVDLFGMSNPTAIVDNMLQENISYG FFANIHLKYKIWKELTLSTRFGLRLNKEKERVFRPEEGIPYEDVATSEVTNQMQYRTERI FSLFDETRANYLFKLGVEHQLDATLGMRYFNNRSEDDWGKSYNSTSDRFRSLQYGLNDLR QMGGSIGTWNWLSFYGNVAYSLKNRYFVNATLSADASSRYGEDIGQFQIFPALSSAWVVS 
SENFMRQLSWVDLLKIRAGYSMSGNDDIGNYAARRYYSSQNLLGNYGLVRGNLVNKNLKP ERMARLNVGMDVAVLNERLSLSVDVYRSTIKDMIAYSPITSYSGFSTYIDNSGEMRNTGV DVAINARVLNLPSLKWDLGATVSHYKNKITELKGNSYLTDIADGTILTEVGRPMGVFYGY KTNGIYATTEDAKADGLQVRSGLADVPFGAGDVRFVNMNGDKYINEEDMTVIGDPNPDIY GGLNTRILWKNFTFSARFTYSLGNDVYNYTRRTLESMSGLENQTQAVLNRWRAEGQVTSM PKVTYGDPMGNARFSDRWIEDGSYLKFKNLTVAYDVPIKKGIITGIQVYAVAENICTWTK YKGYDPEFSASTNPLGYGIDAFMTPQARTFYVGLKLGL >gi|225935348|gb|ACGA01000044.1| GENE 32 57437 - 59143 1479 568 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173131|ref|ZP_05759543.1| ## NR: gi|260173131|ref|ZP_05759543.1| hypothetical protein BacD2_14775 [Bacteroides sp. D2] # 1 568 9 576 576 1158 100.0 0 MLCGVLISTFTACVDTELESVVEYKNHYNTIPDADNAILGLYGLFMNLAEQTVVLGELRA DLMDVTNNASIELQEISANTPSANNKYADLTNYYAVIQNCNDMLAGFDEMLANNRMTQDE YAERYSDVAAIRCWTYLQVGMYFGKLVYVTTPTVSVEDAQKLEHQAKIGLDELLPLLIEC MENLPTLDDYKNSALVQYKLDTYDLKRFFINKRILLGDLYLWNNNYREAAVQYRKFLATD EDKSATANSVRYRCGTYSTGGDTYFQVFYERYMEGDINSYKNTWLNMFSFQADKSALWSE LIWTISYDEKFEPYFPMIELFANQGKGKYQLKPSDYAVESLWASQTQTNGAPFDGRGEDS SFKWVDGQCVVQKYLYDYDPAKPYMKRGRWFLNRAALVLLRYAEAANRCGYTDLAYSILN EGFKNTYTWGDEYGDQNRQTGYGPGEPYPAPFYFDARSMDAPYHRAPWRDFNGIRGRVSL KAKPLEPQQDETDVQCMEKMLIEEAALECAFEGHRWGDLVRVARRMNKEKAGSGSEYLKK VIGKKYERSGLAMPDFSSEDKWYLNVSK >gi|225935348|gb|ACGA01000044.1| GENE 33 59158 - 60864 1050 568 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 314 545 45 303 328 134 33.0 4e-31 MKTVLKDLYVCKVMRHCFCLLILGILCPSTRAQEPAHVIIVAGQSNTDGRVPVADLPEYI KSMGIDSTGFAKGAYKYCKISQNRVDGKFVPFWPRRNRWGYDAVTYYLLEQLYQKEFYVI KWAVGGTSITPENTDSRGGYWSATPEWLAQNTPTAKKGKSLLLSFTQQISNSISKTLSHL PEGYHIDAFLWHQGESDSAYGPDYYENLKNVVSYVRDHLTRKTGEDYSELPFIFGSVAKS NKRYNAEVEAAMKRLASEDKNAYLIDMSKATLLKDRLHFDKTSAEYLGKQMYDTMIQASS VNVTSMQSPFDIDLWEKGLPNSNGKEPEGYDDKKHNYKPSVRVFLPASEKPAKAILICPG GGYDHLAMKDEGYDWALYFNKRSIAAVVLKYRMPNGNPEVPVSDAYEAMRYIKEHAKEWN 
ILPDSIGIMGSSAGGHLASTVATHAPAKLRPAFQILFYPVITMGGRNVHQGSKKHLLGEN PDAKLVKKYSNEWQVNRKTPRAFIALSGDDQSVKPVHSESYHKALKAQKIPSEIHIYANG KHGWGYNKRPFAQRAEMMKDLESWLRTL >gi|225935348|gb|ACGA01000044.1| GENE 34 61079 - 65389 3142 1436 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 893 1123 52 288 318 130 31.0 2e-29 MIKQRFLFAFILFIVSALDTYASIELRSHQMKTSDGLPSNSVRYMYQDSKGFLWLGTLNG LSRYDGNSFLTFQPGNDGKPSLADNRIFNITEDKHGFLWMGTTVRLYSCYDLQKACFVNY MEPEEQDRNYSKLFLASCGDVWLWHPDNGSRRVVHQENGTLTSTVFKTEAGNLPDNRVNF VREDESGRIWIGTKRGLALVINGEVKIVNRSLHFVSLLANGNTVYFLTENGDIYSYQEEA QELVKQAAISSVAGKTSPTDDFRIKDKWVILTTIGVYNYDLVTYTLTPDSRLNIKKGEVI RDNHGDYWIYNHTGRVHYMLAATGETKSFQLIPEGMLDHIDFERYHIVHDSRGIIWISTY GNGLFAYNTAEDKLEHFAAGITENSYINSDFLLFVMEDRAGGIWVSSEYSGVARISVLNE GTSRIYPENPELFDRSNAIRTIGEMPNGDIGIATRKGGLFTYDTHFNQKMNKTYYQSNIY AVEKDADGKLWMGTRGEGLKIGNDWYRHDKLDLTTLSNNNIFAICRDAKDRMWIGTFGGG LDLAEPTSDGKYKFQHFFQGRYGVQMVRVFQEDRNGMIWMGTSEGICIFHPDSLIADPEN YHWFSHTNGKFCSNEIKCIFQDSKGRIWTGTSGMGLNLCQPEDNYNVLKYEHYGVNEGLV NNVVQSILEDKEGKLWIATEYGISKFDPDIRSFDNYFFSSYTLSNVYSENCAYVGADGKL LFGTNYGLTVIDPKQIKSNESFSPVVFTDLYVNGIQVIPGGTDSPLTYSLAYSSEMELKY FQNSFLIDFSTFDYSDSGQAKYMYWLENYDKEWSAPSSLNFASYKYLNPGTYILHVKSCN EAGIWNDKETTLTIVIKPPFWKTGWAFLCYALLAMVALYFTYRVVRNFNRLRNRISVEKQ LTEYKLVFFTNISHEFRTPLTLIQGALERIQRIGDIPKELVHPLKTMDKSTQRMLRLINQ LLEFRKMQNNKLALSLEETDVISFLYEIFLSFGDIAEQKKMDFQFNPSVPSYKMFIDKGN LDKVTYNLLSNAFKYTPSNGHIVLSVTVDEAKQQLQIQVSDSGVGIPKEKQAELFKRFMQ SNFSGDSIGVGLHLTHELIQVHKGTIKYAENEGGGSIFTVSIPTDKSAYNEKDFLVPNNA LLKDANIHTGHLAEPTDFLGDEQEDLPEYEKVDNPLNKRKILIIEDDNDIRQFLKEEIGA YFEVEVAADGTSGFEKARSYDADLIICDVLMPGMTGFEVTKKLKSDFDTSHIPIILLTAL NSPEKYLEGIESGADAYIPKPFSIKLLLARVFQLIEQRDKLREKYLNEPGIMRPVVCSTD RDKEFADRLTVVLEKHLARPDLTIDEFASIMKMGRTAFYKKIRGVTGYAPNEYLRIIRMK KAAELLLSDENLTVAEVSYRVGIDDPFYFSRRFKMQFGVSPSIYQRGAKNNSASEE >gi|225935348|gb|ACGA01000044.1| GENE 35 
65438 - 67255 1964 605 aa, chain + ## HITS:1 COG:TP0831 KEGG:ns NR:ns ## COG: TP0831 COG0018 # Protein_GI_number: 15639817 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Treponema pallidum # 8 605 12 589 589 469 40.0 1e-132 MKIEDKLVASVLNGLKALYGQEVPEKMVQLQKTKKEFEGHLTLVVFPFLKMSKKGPEQTA QEIGEYLKANDPAVAAFNVIKGFLNLTIASATWIELLNEIQADEQYGLVQVTDASPLVMI EYSSPNTNKPLHLGHVRNNLLGNALANIVAANGNKVVKTNIVNDRGIHICKSMLAWKKYG NGETPETSGKKGDHLVGDYYVSFDKHYKAEVKELMAKFTAQGMSDDEAKAKAESESPLMQ EAREMLVKWEAGDPEVRGLWEMMNNWVYAGFDETYKKMGVSFDKIYYESNTYLEGKEKVM EGLEKGFFYKKEDGSVWADLTAEGLDHKLLLRGDGTSVYMTQDIGTAKLRFADYPINKMI YVVGNEQNYHFQVLSILLDKLGFEWGKSLVHFSYGMVELPEGKMKSREGTVVDADDLMEE MIATAKETSQELGKLDGLTQEEADDIARIVGLGALKYFILKVDARKNMTFNPKESIDFNG NTGPFIQYTYARIQSVLRKAAESGIVIPEQIPTGIELSEKEEGLIQMVADFAAVVKQAGE DYSPSIIANYTYDLVKEYNQFYHDYSILREENEAVKVFRIALSANVAKVVRLGMNLLGIE VPSRM >gi|225935348|gb|ACGA01000044.1| GENE 36 67317 - 68189 694 290 aa, chain + ## HITS:1 COG:RSc0230 KEGG:ns NR:ns ## COG: RSc0230 COG4292 # Protein_GI_number: 17544949 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 19 290 19 291 398 218 44.0 9e-57 MFYSPHPLLRKRETETATVSYSELLFDLIYVFSVTQLSHYLLHNLTWEGLLKETILWFAV WMLWQHTIWVTNWFNPDTRPIRILLFISMLVGLVMAAAIPYAFTYRGLIFAICYVLIQAG RTLYIIGVLGDHHLAANFKRIMGWFCISAVFWVTGAILQGEWQILLWIIAAICDYTAPMH GFALPRLGRSDSSKEWTIEGHHLVERCQLFVIIAFGETLLMTGASLSEVEEWTPLVIISA VISFIGSLAMWWVYFDVSSEAGSRKIQEVKNPGKLGLIYNAIHIVLVGAL >gi|225935348|gb|ACGA01000044.1| GENE 37 68400 - 69254 535 284 aa, chain - ## HITS:1 COG:no KEGG:BF1504 NR:ns ## KEGG: BF1504 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 273 7 275 282 311 57.0 2e-83 MRKMYLSAPLPFVGQKRMFAREFIKVLGQFPDSTVFVDLFGGSGLLSHITKCVRPDATVV YNDFDNYRCRLVNIPATNVLLSDLRRIAEGEPRNKRITGEVRDKMFARIEREEKEHGYVD YITVSASLLFAMKYVTSLEGMKKEAIYNRIRQTDYPEAKDYLEGLTITSEDYKEVFKRYK 
DVPGVVFLVDPPYLSTEVGTYKMFWRLADYLDVLTVLKGHSFVYFTSNKSSILELCDWMD RNPFVGSPFKECRKVEFSASVNYQAKYTDMMLYTKPDEVSGIAA >gi|225935348|gb|ACGA01000044.1| GENE 38 69350 - 69988 448 212 aa, chain - ## HITS:1 COG:YPO0783 KEGG:ns NR:ns ## COG: YPO0783 COG0207 # Protein_GI_number: 16121095 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Yersinia pestis # 1 193 1 213 264 62 27.0 6e-10 MNKYYRILDKILATGKTQTNKKGNIQYLLNEQLSLTPADLLDIFEGHNIARKKLRSELQL FMQGERNVEKYREAGINWWDYCGSILVNSYPTYFEKLPPLIAKINRERRNSKNYVLFLGE TGAESNQAPCLSLVQFQLDGGELVLSAYQRSSDANLGLPSDIYHLYLMARQIELPLKSIT LYLGNVHIYENNIPGTRALIAGDETVRFGLNV >gi|225935348|gb|ACGA01000044.1| GENE 39 70067 - 70309 257 80 aa, chain - ## HITS:1 COG:no KEGG:BF1503 NR:ns ## KEGG: BF1503 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 80 1 80 80 81 52.0 9e-15 MKVIEILNFNRELLKRLQAAGIRLEDARYIDLYADYTRLLDQGEKVSYAVAVLSEKYSVS ERKVYALVKRFQSDCKTLAV >gi|225935348|gb|ACGA01000044.1| GENE 40 70489 - 70974 366 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173140|ref|ZP_05759552.1| ## NR: gi|260173140|ref|ZP_05759552.1| hypothetical protein BacD2_14820 [Bacteroides sp. 
D2] # 1 161 1 161 161 325 100.0 4e-88 MKRVEGTSGIKLIECVSPARNRWRIRWDVQEREDGSASYMEEGFVGRPHMDTIKSVITDW CNEQIDREILSGFLYEGMPVWLSSENQFNYKAAYDLAVQTGGATLPVTFKFGTDEVPQYR EFVTLEELTDFYTKAMKHVQDTLSDGWRKKDAFDPEKYRVE >gi|225935348|gb|ACGA01000044.1| GENE 41 70971 - 72275 746 434 aa, chain - ## HITS:1 COG:no KEGG:Dred_1227 NR:ns ## KEGG: Dred_1227 # Name: not_defined # Def: RNA-directed DNA polymerase (reverse transcriptase) # Organism: D.reducens # Pathway: not_defined # 7 322 8 336 342 112 27.0 3e-23 MRRVGYIIEEIVEPSNMEASFRQVLRGSKRKRSRQGCYLLAHKPEVLEELVAQIASGTFR VKDYREREIIEGGKLRRIQVIPMKDRIAVHAIMAVVDRHLRKRFIRTTSASIKRRGMHDL LAYVRRDMAEDPDGTRYCYKFDITKFYESVKQDFVMYCVSRVFKDAKLVTMLESFIRLMP EGLSIGLRSSQGLGNLLLSVYLDHYLKDRYAVRHFYRYCDDGVVLGKTKAELWKIRDAVH GRMECAGLLVKGNERVFPPGEGIDFLGYVTFGADHVRLRKRIKQKFARKMHEVKSRRRRR ELIASFYGMAKHADCHTLFKKLTGKDMRSFKDLNVSYKPEDGKKRFPGVVVSIRELVNLP IVVKDFETGIKTEQGEDRCIVAIEMNGEPKKFFTNSEEMKNILLQVKDMPDGFPFETTIK TETFGKGRTKYIFT >gi|225935348|gb|ACGA01000044.1| GENE 42 72792 - 79529 4627 2245 aa, chain - ## HITS:1 COG:no KEGG:BT_4440 NR:ns ## KEGG: BT_4440 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 348 1132 267 1034 2183 84 22.0 4e-14 MALTESEKTELKNDILNAIKAESQSVDELVEVSSLDNIKSLPALRGSELVSAPLTLLRKP ADDAAATANASATKADNAAALANKAAGMASDAAGTANQAAETANSAAAAASAAAKQAEDA AAGVNDGLVGGMTAVPDEENDTVKLTLLGKTGTEIASVDIPGGTGGGGNTYNVTAEVPLE SGYYVLSSAIGAVDEKYRYKGRCITYEVSQGKWETKQFVGTSLSSWEQEASWEDFGGAGT MKSLTVNGEKKVPDSEGNVDLTIDKLEVDESLDADSTNPVQNKTVAAKFQEVEAGTVFGM SAEVSDDESSVRLALTNKSGAEIASVDIPAGSGGGGESSTTKIVLLAETDKKTVKEGGAV KLTYTYDHQVAGGDDKGSSTGQKATVTIQVKRGTTTTYSSSLKEVSKGTYTLDLTKYLLV GTSDIYVIAETTDPTTGKAQKKQAYVSVKSVTLSLSCGYNLAATIQNGGYGTYDSASIPY AVSGTGTKTVSLYVDGVQQNAHTVTRSGTTNGSFEVSMTGLSVGRHTAQLVAEMETDDLT LKSESIHIDLLKAGTGAPFIGLKLIHADGHVLGRDEHLEPVLEAGRYEKLTFDWVAYDPD RVPAEVEFWKNGVKSSTVSAPRSMMTYSNRFTEEGTQTLVLKAGPTGYTLRIDVGESGID ISEATYGLAVKLDAAGRSNGESNPGTWESNGVETTFEGFDWSSNGWTGEALKLTNGAKAV 
IGYRPFATDVKSTGLTIELTLRVSNPTDSDTAVVDCLDSGKGLYITPSETSFKTGEKVSY TNEDDELVEREIKLGTNYVEDRWIKVALMVGTRNESRLMELYVDGNRTGADIYDNAFSFR QDNPKYITIDSAGADVEVKSVRIYTRRLSDDEELENRMVDSADGEEMIALYEENDILGDT DTVDMDKLRAKGKGVLRIVRQNKLDDVYAENNKKTDFSADIFYYSPFGSEYDFVLRDCYI RIQGTSSTKYPSKNIRIYISKGGTNLSFTVGGKEQAEKKYPVRPGGIAMNLICLKSDYSD SSMSLNTGGAKLFNDVLKEMGLLTPPQRYQYETGGSDLNAVTVRTAIDGVPIDMFVAAAE DGENNYVGQYNFNNEKSKSGDLFGLSGVEGYDPACPLTLEMLNNTEAMCLFKTTSDAHLE EVFDAGAETNVPDDVKWAGLDESQRTAVKRLYAWIRSCVPDGATSADLSTFKSEKFRDEI SDYFDKAFLLTYYLWTDYFLAVDQRAKNMMLRTWDGLIWYITYYDGDTQMGKRNDCFLVY DYTTDRDTYDAEAGKYAFEGRDSWLWNLVLANLDADLKTQAQALRGVLTTSRVLDMLNVE QAGNWCDRAYNKSGELKYILPATQEMYGKVWPFIYALQGSNRAHREYFVRNRFALLDAKY GTSNFTSDNIDLYLARTAADTPDVLKITANEVYAFGYGTNNSPNIGNTGIIKKDAAASLS ITGAYTVNDPLRVYGASRMKVLDMSGAADHLKNAFDLGKCTVLRELNLQSSGNGSTGWWL NIGNCKQLRKLNLRNQAQAKTGGSTSTELDLSAQTKLEELEARGTQVQSVVLAKGSPVTL LHLPGTLTSLRLEYLGRLTTGGLTLESYSKVKTFIFDSCPGIDWETLLGRCTGVERIRVT GIDREDDGTWLDKFVGMGGVDSDGNTTDTCALVGTVQLTRYIDDDTYSALKAHFPELNIR QPEYTMIEFDDEVSDDANVSNLDNGTGYKYDNAYEVSGHISAILKQRHRVLAKVTKKATT RGVNMANVDTTVNNLDGEMTYYPLDDTDSNKYADGTAARLDGTEGDWMMYEPFFWSKGIN DYLNGKHYSCYSSNGSDNMPSVPDADVLTLDDIKGTSGGYLSGRKIMSGKDTLSNSYSTD STYSVCKVNVDGYKRVRFPSVPGTSLVGSIFTDDSGTVISSIVVPTLSNKFEAGMYLIAD VPEGATALHFSILNTAEFDKVVLSNSDRIEDMEPEWVPNDEHLCAVVGSSVVGSKLRACI TGGSTTASMTWADFHYYSVQRGMQQIDALMHSRIANLFYAKYGRRDSQEQCGAGSHTNNR TTGGTASRGMTDTIGYEEASSINPNVTNSLIENSVHQYAWYREKDDYGGATVTQVNNICC LGYEDIYGHKYDMMDGVDLPNDTGNSGKWRIWMPDGSTRLVKGSVSSGIWITAVAHGKYM DVIPVGSVSGSSSTNYCDIYYISTASSRVVYRGGYSANPNGGVSMSYASSDSSSTNTYIG SRLAFRGRLVKASSAVAFKAISEVA >gi|225935348|gb|ACGA01000044.1| GENE 43 79556 - 83572 2310 1338 aa, chain - ## HITS:1 COG:no KEGG:BDI_0901 NR:ns ## KEGG: BDI_0901 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 481 1 493 1223 150 27.0 5e-34 MLTIYDSNGNRRTDIEAGDSSTQVKEVQGDNVLTLSFTHYEYIALDVNDRVDFEGERYWL TERYIPKQKSGQEWVYDLKFYGIESLVRRFLVLETTDGNTEPVFTLTATPREHVAMIVKC INDGMNHTTDWKVGRVDGTDLIVIDYEGKYCNEALKEIAEAVGGQAEWWVEGQTVNVCRC 
EHGEEITLGYGKGLTGIERDTTGTDNFYTRLFPVGSTRNIDPSKYGHSRLMLPGGRQYVE IHTEEYGIYDRYEQDAFSGIYPRRIGAVSSVRSEDVKDDDGNPFTVYYFRDDSLNFDPND YELPDETKRVSFQDGDLSGLGQGEDHYFEVNFNSATREFEIITIWPYDDDTQLPGGKLIP KSGDRYILWNIRMPDEYYPLAEEEFLTAVEQFNTECWQDLAVYKAPTDHVWIEENGVSLS VGRRVRLESEEYFPETGYRSSRITKITRKVNQPGEMDIEISDALHSGAFERVNDSIGELK NYTKSKAEGAALPDIIRSWDKTLPTDNNLFSARRSQAEHISKKKNDRAKGKITFEAGASF GQEDNAGIDDKGNAELLTLVVREFLRSPKFVDGLFGEGWRLWMEDALSHLTIDKLTVRQV MVVLELLIEKVRSVGGQLCVSAANGKIKTAVLEDGYWRITFEQDNSFQAHDLMRCATFSG GNLKGYWVEVAGVEGDSILVSEDEFSGSLPEAGDECVLMGNTENPLRQNLILISATEDGQ PRVDVMDGVKAKNFTGCLRARLGNLDGISDDWFPADNQPHGNGLYSDNAYLRGTFLLVTG EDIKTKFEIVEGRITSAVTALRNDFATEKGYLNNPAFDDGLEKWNTENETVFFLAGNKWI WANNNVLTRKGDGASVTVDDGRTVVHIRNKYILQKRANLKSIPSMPVNGDGEKEAVPVYL SFFYRCATKGTLRVQFLDVDKTGFANFNSLEVEEELSATDGYVQYTCSGLWNGTGDFKLS FTGDIYLYMLVLSTDKVESLAHRYRTLFEQSERLVKITAAVFDRDENMLEETGLVVKPEG AGIYAQDADGKLALIGVSVDGTDADGNKISVVKLTGDRVKLEGLVTANDNFKILEDGSIE ANAGTFSGHIRTNFHLVESSDAVLTSCSGRGEAGYLIGRELSLKVDMAGSSNGADIILPN DVRYIGSRVTLYNGCHPPYTRTVGSIRYSSVRVDDGTLLRGSNVNLSSDSLLSYSDPYRI DWISGIIELVGTPELNGRILADLVSWRGASAGPPASPSSGWLYYDTKRNRNYLYWYGAWV EFPVYGGASDDLRITWKGELSSAPENPERNWLYVTSVNRFLLLYTGEDWEEPAIINSLNK CGWCILGFSALSYHYYND >gi|225935348|gb|ACGA01000044.1| GENE 44 83619 - 84083 230 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254884012|ref|ZP_05256722.1| ## NR: gi|254884012|ref|ZP_05256722.1| predicted protein [Bacteroides sp. 
4_3_47FAA] # 1 154 1 154 154 311 100.0 8e-84 MEKILGGLVLVNGTDIWSTYGVFLVEDKRGGMDNLTAILTPSKTKTDTAVNIREEDGEKY SSVLTPRNEARDVTLHFALFGKTQAGWLKKYFEFINFLKKGRDGWLEISFPQLALTLRVK YTDCSKFQPLTYLWKEGVHAGKFKVKFREPVPVI >gi|225935348|gb|ACGA01000044.1| GENE 45 84085 - 88476 3478 1463 aa, chain - ## HITS:1 COG:no KEGG:BDI_0893 NR:ns ## KEGG: BDI_0893 # Name: not_defined # Def: putative viral A-type inclusion protein # Organism: P.distasonis # Pathway: not_defined # 62 840 199 884 1388 244 28.0 1e-62 MKPVQIEFLMVDHLSARLDKAVGKIERMSQQASSANRQIRELDRSGSLLNNTVGKLAAAF TIKELVSNITKVRGEFQQLEVSFQTMLGSAEKADTLMQQLVHTAATTPFGLEDVAQGAKQ LLAYGFGAEKVNETLIRLGDIAAGLSIPLNDLVYLYGTTMSQGRLYTQDLNQFTGRGIPM IAELAKQFGVAESKVKELVEEGKVGFPEVQKVIESLTDEDGKFSGLMEAQSKTITGQISN IEDAVSMMFNEIGQQSEGVINTTLSGVSYMVEHYERFGRILLGLVGTYGVYRTAVMTVTA VKGWAVAAEALHYNWLLLVEKAQKMLNRTMLSNPYVLVATLLAGVAVALISMKTETERLQ ESEERYQQQKQKTIEAEEEHRRKIEELCSIAGDEAVSTDARREALNKLEQKYPDIFSKYD TEYEKLKNIKKIKEEIARLEAGESISNPANELKRVDDRIKELEGKTRLATEYYQDSYGRQ RARYVRKSARSRDEEAELQNLYGKRKSLNGQIRKDEVNAYFENLTGVSNETLAQQIKQRR TLLARMSVQEKEYGKITQGDENLTGTYSRDELKYQLNKLVSEQNRRNLPTDSSTDWVAAA KEKYQDALKAYNAFLQETSNSLSREEFEKKAKELKDAVDTAKKEYDKVKPGEDKDSEAER KKADKAEKEAQRRKQVSEKLGQELAGLQRKNDEAEIEMMTEGLEKKLRQIDNEYQARKDE IARQEAGWKRDNAKAGQSGSLSEDQQSEIDKARELNESGRQKKIAEAYREEFGVMQEYLQ AYGTFQQQKLAIATEYAEKIQKTTSGSEKLSLGVERDSKLAGIEVQELKARIDWGTVFGE FGGMFSDMIKPVLADARKYMLTDEFRNADHASQDALVSAVQQMERALGGSGKVSFKKLGA EVTAYQKALSDLKEAQAVYADTYTALIAAQKSYIEAQQSGTEQEKESARQALETAQANAD AASENINALQETADSARQSLSNTASGLKTSMENVRDGLQQIASGSISGAYNGLITLGKGA KEVDGKLGEAFGKVSETLEDVPVVGWIVSIIDLFKDGLSVVIGGLLDAVFNAVSGILDDV LSGDLFVTIGKSLLSGVGKIFDALTWGGFSSWTTSSNAKEVQETIDRLTERNETLQTAIE DLTEEIKASKGTKSVAAYRDAYRLQQETNSNYLDMAMSQAGYHGSHHSWNYYWGGFSQEQ IDRLSGQIGRSWNGDIWNLSPEEMKKLRSNVDMWTQIQDTGKGGYGGRLTEKLDDYIDQA GKLEELTDQLYEGLTGISFDSMYSSFVDNLMDMKYDAAAAAEDISEYFMRAMLSNKIGEL YSDKLKGWWEKFGKAMEDNDLTEAERKALQDEYMKYVEEAIALRDNLAAATGYDNSGGTS QSAKTGGFSAMTQDQGTKLDGMFTSGLQHWSSIDEKMESVIDKMNTAEGHLARIEENTGT SASHLGKIEEEIRKINRDGVKCK 
>gi|225935348|gb|ACGA01000044.1| GENE 46 88477 - 88677 80 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254884010|ref|ZP_05256720.1| ## NR: gi|254884010|ref|ZP_05256720.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 66 67 132 132 131 98.0 1e-29 MWQIASATGWSVDYILNKVNYQTLIMMLSDAPRYVRDKSGSSGPADERSAEDEADGIVGF FQSKLK >gi|225935348|gb|ACGA01000044.1| GENE 47 88716 - 89207 267 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254884009|ref|ZP_05256719.1| ## NR: gi|254884009|ref|ZP_05256719.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 163 1 163 163 308 100.0 1e-82 MDEAVIKQIQREGADALLDIGVSVPLKAFHIPFRKSPLELRVTMRRPYMSGQILFARTYL SMGITSEEMWGFSKEEEMQFLASHGKAVSRMVAYTLCRGPFSRRVLLRPVAWLIRNFMEQ RYLVGAIKRFVSLMGTDPFIPIIRSAERTNPMSLRLSQRKKGS >gi|225935348|gb|ACGA01000044.1| GENE 48 89211 - 89786 571 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254884008|ref|ZP_05256718.1| ## NR: gi|254884008|ref|ZP_05256718.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 191 15 205 205 346 99.0 4e-94 MNKNFMYGVGAVKYKDFVVGYIEKNSFDMGGQKPESAKIEAEQVPGTPVLIIPQSNGSIA PTFNVIQLNYENLHSLLGGTMHYKEEDSEKKTPIGWTAPTAAVLLTGPWEIALVSGQSIL IPNGTLLSNLGGKLTLTETAKIECTLEVAMPEDGSQPHGVFNTDSIPEEWKQYKLPPAES AAAASTNLAEE >gi|225935348|gb|ACGA01000044.1| GENE 49 89783 - 90340 146 185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254884007|ref|ZP_05256717.1| ## NR: gi|254884007|ref|ZP_05256717.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 62 185 1 124 124 233 98.0 3e-60 MKTGKLCRRPGVKWHASGKPLILPTVAGIMMIVLFFTGCASTRKSRTEVNRNSSLSSSAD NVSNVRRGLLMAGIPKSAVSLTIPPDSLRKLPSGSSYHSRNGQAGLTVKSDAAGNIIAEA SCDSLQRLVLCYEEELTRIRNETHEDSFTVETEFERRFSPVKIALAAFITGCVAGIVLTF KIKKQ >gi|225935348|gb|ACGA01000044.1| GENE 50 90072 - 90551 308 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173151|ref|ZP_05759563.1| ## NR: gi|260173151|ref|ZP_05759563.1| hypothetical protein BacD2_14875 [Bacteroides sp. 
D2] # 1 159 1 159 159 265 100.0 6e-70 MDFSAVMNLVLGGGLVATIIAIITLKSTVREARAKAEKATAEAETVRIDNTEHATRILIE NIVEPLKEELNENRKALQATRREMARLRKAIDTANSCRHHDDCPVLYGMREYSKEQDGGE PEQQPVVKRRQRVKREAGVVDGGDSEIGGEPDDPSGQPP >gi|225935348|gb|ACGA01000044.1| GENE 51 90579 - 90965 301 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173152|ref|ZP_05759564.1| ## NR: gi|260173152|ref|ZP_05759564.1| hypothetical protein BacD2_14880 [Bacteroides sp. D2] # 1 128 1 128 128 207 100.0 1e-52 MKRFLLFFVLILGFVSATFAQTGTVPEVDYSAMITTFAGFVGGVVLLTEGIKALFPKMQG LATQIVSWCVGIVAAMLLWWLDAGFVADATWYIALCYGFGASLVSNGVADTGFVQWLIGL FAGKDAGK >gi|225935348|gb|ACGA01000044.1| GENE 52 90978 - 91442 339 154 aa, chain - ## HITS:1 COG:no KEGG:AZC_3601 NR:ns ## KEGG: AZC_3601 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: A.caulinodans # Pathway: not_defined # 1 150 1 139 161 110 40.0 2e-23 MGKLKYLVIHCTATPEGREVSGAEIRAWHTNPVSKGGRGWKQVGYTDLFHLNGGVERLVN NNEDANVDPWEVTNGVAGYNSVSRHIVYAGGCAKDGKTPADTRTSWQKKALEKYVKDFHR RFPDVRIVGHNELAAKACPSFDVQRWLKQIGITQ >gi|225935348|gb|ACGA01000044.1| GENE 53 91442 - 92665 1191 407 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173154|ref|ZP_05759566.1| ## NR: gi|260173154|ref|ZP_05759566.1| hypothetical protein BacD2_14890 [Bacteroides sp. 
D2] # 1 407 1 407 407 783 100.0 0 MAIVVRNTNYNGEVLEKILVLATTGNDLVEKGLIMVIPGVEKKISLPRIKTGKMLQKRKE NPTLEDSKGNFNYSEKSLDPEDFMAFTTFNPRAFEHVWRKWQPKGNLVFAELPPEAQNTL LDELSKSVKFELGWHYLNGEFGSDDDHLFNGILTQAAKDPDVIVVPAPSDTSMIGKLKAV RKAIPKALRENPNLRILMSIDDFDKYDDELTEREYKNTSETDINKKRYKGITIETLNSWP DGLIVATLCSMSADGNLFAGVNLQDDEEVIQIDKWMNSSELYFFKLLMKADTEIAFGEEF VVLDTRETPVFKVVERSISADPAALSFKAAGESKEVKVTASGDYSVVSIPAGFTAVGTDG SLTVTAGVNSSGKAVSGTLVLGLDADPEKKVEIALSQAAVDEEEGGE >gi|225935348|gb|ACGA01000044.1| GENE 54 92677 - 93714 937 345 aa, chain - ## HITS:1 COG:ECs0829_1 KEGG:ns NR:ns ## COG: ECs0829_1 COG0740 # Protein_GI_number: 15830083 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 20 180 64 225 226 110 40.0 4e-24 MSRFFNMIPGTDACCILLYGDIGEYDDNVRSGDIARELLEAEALTGKVDVRINSNGGEVY SGIAIFNALKNSKADITIYVDGIAASMASVIALCGKPVQMSRYARLMLHSVQGGCYGNKD EMKDCIREIEALEDTLCEMYATRMGKDKEEIRAMYFDGKDHWLRADEALALGLIDGIYDA DPVPEDSTPEQVFQIFNNRLHKPQNENSMNLDELKRRPRFKNCATDDDFLREIGLLETEA GKVPALDAEVTRLKGELKVFQDKADADDAAARKKLLDDAEQDGRIDAATRPIYENLLAKD RENGEKALEKLSPKRSVMTDLRVNPTGESPWNKRMSEIKDKLNHK >gi|225935348|gb|ACGA01000044.1| GENE 55 93911 - 94357 384 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260642127|ref|ZP_05414625.2| ## NR: gi|260642127|ref|ZP_05414625.2| hypothetical protein BACFIN_05935 [Bacteroides finegoldii DSM 17565] # 1 148 37 184 184 267 100.0 2e-70 MAELTNEQKKAWAKTLYTRETLTQAEIAERVGVSRVTVNNWIGKGNWEQLKASITITREE QLKNLYRQLAELNNAIMGKPEGERFPNAAEADTISKLSNAIKKLETEVGLADIISVFSDL LKWVRTYDSTQAKEITPLLDAFVKSKLS >gi|225935348|gb|ACGA01000044.1| GENE 56 94359 - 95927 636 522 aa, chain + ## HITS:1 COG:no KEGG:RCAP_rcc00985 NR:ns ## KEGG: RCAP_rcc00985 # Name: not_defined # Def: hypothetical protein # Organism: R.capsulatus # Pathway: not_defined # 19 370 22 411 552 84 22.0 1e-14 
MAKKRLTPQDRIALDNWNELVASVREHSDINPTDTETEIRQRRERLEKNDEEWFKYYFAM YCTCESAAFHKKATGRLMRNNRWYEVRAWSRELAKSARSMMEISKLALTKKIRNVLLISN SADNAERLLLPFMANFEENQRIIQDYGQQKKPGAWETGEFTCMSGCSFRAIGAGQSPRGT RNKNFRPDFILVDDIDTDEECRNPERIKTKWKWLEEALIPTMSVSGNYRILFNGNIIAPD CCIKRAIEKATELKAKGIGHVDIINIRGKDGLSVWPEKNSEEDIDLFLSLVSAAAAQKEF FNNPVVDGGVFAEITYGKVPALSKFKFLVIYGDPAPGENKTKKSSTKTVCLLGKLAGRLY LIKTFLDRGLNAEFVEWYIKLLEFVGGKTTVYCYMENNKLQDPFFQQVFQPIVRRIRRER KISLYITGDEEKKTDKATRIEANLEPLNREGNLVLNEAEKDNPHMKRMAEQFKLFNLQLT YPADGPDCVEGGNRIIDRKARQSEKPVIVTRKSTRSQNKYRV >gi|225935348|gb|ACGA01000044.1| GENE 57 95946 - 96371 168 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|254883999|ref|ZP_05256709.1| ## NR: gi|254883999|ref|ZP_05256709.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 141 1 141 141 267 100.0 2e-70 MSKFIELSDYDASIHREILDALTREDDAVVEICEDRAVAEMRCYLSRRYDCDKIFTATGD KRNQLVLMMAIDIAVYHIFCIHNPRNLSPLRKERHERAVEWLKAVAAEEISVDGLPLLSE ETRAAKSNFLIKSNRKRVNHW >gi|225935348|gb|ACGA01000044.1| GENE 58 96373 - 97761 691 462 aa, chain + ## HITS:1 COG:ECs4970 KEGG:ns NR:ns ## COG: ECs4970 COG4383 # Protein_GI_number: 15834224 # Func_class: S Function unknown # Function: Mu-like prophage protein gp29 # Organism: Escherichia coli O157:H7 # 63 434 41 446 508 93 25.0 7e-19 MNKRKKGAGKITQSGNLPRPGQKGPATIILTQPRRFGIDIADYMLAVRAFENVDYSRRFR LYDLFSDILMDTHLTSVIEKRKNAALASSIEFRRNGKPDEKVNKQIRSPWFRKFIGDILD AKFWGFSLVQFYRKGEWVNYDLIPRKHVDPVRRLILRHQTDTTGTSWDEYPDLLFIGSPD DPGLLVKAAIWVIYKRNDVADWAQFAEVFGAPIREYTYPTDDDEARQRALDDADSTGSLS VFVHAEDTVLKLVEAANKTGSADLYDKLCERCNNEISKLFLGNTLTTEASDKGTQALGTV HKDVEEKVTLSDRQDILDVLNYDMADIFAMLGIDTTGGEFCYPEKKLIEPEKKMSILTQL RTNFNLPVGDDYLYEEFGIEKPANYDELKKRQEEKAAEIEAAKARETEKAEEDEPDPEEE PELEKHGKGTPKEKKNALKNAYNWLKRFFGKAPGRDGAALEW >gi|225935348|gb|ACGA01000044.1| GENE 59 97779 - 99194 910 471 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|265750855|ref|ZP_06086918.1| ## NR: gi|265750855|ref|ZP_06086918.1| conserved hypothetical protein [Bacteroides sp. 
3_1_33FAA] # 1 471 27 497 497 956 100.0 0 MEDKQVETLFSFDEEVLKKALKNIYSKDFHPMTDIEENLFEATWKTMNKATDKGFGTRKT DDPDYDFYREIRMNNAVFAAFKVHRAQNDMAALLLDKNGSLKPFEQWVKEAMPIADHQMI HWLRTEYDTAVIRAHQAADWRQFEREKDVLPNLKWMPSTSVTPGADHQIFWGTIRPIDDP FWNEHRPGDRWNCKCTLSSTDEAPTAVPDENGQNKAHDGLENNPGKDGKLFSDKHPYITE AHPGAKKAVDALTRRINEMIAEMPDNLTLEEKTDIARNNLKIEKALGVTKGKPMTYEQAN KGKENPKFGKEEGYRVNCQTCTVTHMLRRLGFDIEAKPNIRQSAYNEMAKQGITWEERFL NRDGTKPDYDYTYKWQVRKGYQVMNANRLKEYFREKFREDGIYEIYCAWKGGSAHVFCAE VTEGKTRFFDPQTGKDDASNYIQSMKAGRVGVIRIDNKLVNPKIMGLFITK >gi|225935348|gb|ACGA01000044.1| GENE 60 99177 - 99446 196 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254883996|ref|ZP_05256706.1| ## NR: gi|254883996|ref|ZP_05256706.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 89 11 99 99 167 98.0 2e-40 MQQGVFYVPFCVMEIPKQVSELANSSGYNSVVLSASSPEGSIYSVGCVDGDGFELPVGLP AFILFDGQSCRLVDGEEGLALSSRLFGDE >gi|225935348|gb|ACGA01000044.1| GENE 61 99479 - 100111 395 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|254883995|ref|ZP_05256705.1| ## NR: gi|254883995|ref|ZP_05256705.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 210 1 210 210 363 100.0 3e-99 MDIKEYSKLIKAKRKELDGLMKRKMPVIAGRMAKDHFQDNFRREGFVNGGLHPWPKAKRL SSGRTDAAGSYGTLLSGRNHLFSSVKYMPGEYRVRVANELVYAPVNNWGGEVHPTVTPQM RRFAWAKYYQASGKAKKAATGKRKGKKKGSAANNEPQENQEALKWKRLALTKKKKLRIKI PQRQFIGESRELSEKIDRKMENEIRNILNL >gi|225935348|gb|ACGA01000044.1| GENE 62 100116 - 100562 266 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255690957|ref|ZP_05414632.1| ## NR: gi|255690957|ref|ZP_05414632.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 1 148 1 148 157 286 99.0 4e-76 MEEIFIAIMERIAEKMPELSYIDEDYGQLEAGAEEDHYPVTFPCVLVGNAESDWNDLGYG VQKSESLITIRLAIDCYDDTHYTSGTYDKVRERQLKAKELYKALQEFQCTEETSPLVRVK SRDYSLPGNIKVYETVYSFTLHDESAMQ >gi|225935348|gb|ACGA01000044.1| GENE 63 100572 - 100841 143 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254883993|ref|ZP_05256703.1| ## NR: gi|254883993|ref|ZP_05256703.1| predicted protein [Bacteroides sp. 
4_3_47FAA] # 1 89 1 89 89 130 100.0 4e-29 MAKGRDKKLIELRDEALCRRYYYWTEVQRLRFDDALKVLSRQEFFISEERIMSIIRCKCR ELKDLEVKPVPKVKKPRLTAVQLSLFTGE >gi|225935348|gb|ACGA01000044.1| GENE 64 100949 - 101164 116 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|265750860|ref|ZP_06086923.1| ## NR: gi|265750860|ref|ZP_06086923.1| conserved hypothetical protein [Bacteroides sp. 3_1_33FAA] # 1 71 1 71 71 139 100.0 5e-32 MEKTKNIAPHVMACKRCEGKGRIFYLDQGGAPLSAKCPVCNGSGRVKVQSKVITRIEPFV PGEDDTELMTM >gi|225935348|gb|ACGA01000044.1| GENE 65 101167 - 101817 753 216 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173166|ref|ZP_05759578.1| ## NR: gi|260173166|ref|ZP_05759578.1| hypothetical protein BacD2_14950 [Bacteroides sp. D2] # 1 216 1 216 216 363 100.0 4e-99 MDLKEQLKSLSAQDRKELLKQLQQEEKESKRNRRDAYEGLRAQFMLEVKNRLLPVVDDVK AFRDWVEKEAAAFRAVMREYGQLRKDEQASFTIVDGDMKLEVRSNKVKSFDERADLAAER LVDYLKRYAMGRELGTDDPMYQLGMTMIERNRQGDLDYKSVSKLYELEDRFDSEYTEIMD LFRESNVVYKTAVNYYFHKRDENGVWHRIEPSFCRL >gi|225935348|gb|ACGA01000044.1| GENE 66 101839 - 102282 418 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173167|ref|ZP_05759579.1| ## NR: gi|260173167|ref|ZP_05759579.1| hypothetical protein BacD2_14955 [Bacteroides sp. D2] # 1 147 1 147 147 256 100.0 2e-67 MQIDINSRKQLNKPENYAAFYSLLNRLPTSDRDALKESIVSQYTEGRTTSLRDMTLKEYS AAVSAMQKLVPPTYQEQLRKILRQKRSAVLHQMQLLGIDTADWDWVNAFCRDSRIAGKEF RELDCEALDTLQVKLRAIRRKRENKQQ >gi|225935348|gb|ACGA01000044.1| GENE 67 102285 - 102425 106 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173168|ref|ZP_05759580.1| ## NR: gi|260173168|ref|ZP_05759580.1| hypothetical protein BacD2_14960 [Bacteroides sp. D2] # 1 46 1 46 46 74 100.0 2e-12 MGILEFFDQYKCTKNEKEHLLDYLCTIRVKRVIKEINDLKINKKTV >gi|225935348|gb|ACGA01000044.1| GENE 68 102466 - 102675 119 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|265750863|ref|ZP_06086926.1| ## NR: gi|265750863|ref|ZP_06086926.1| conserved hypothetical protein [Bacteroides sp. 
3_1_33FAA] # 1 69 1 69 69 110 100.0 3e-23 MSKKMVIVVTAVGVRKVVEKWLCENMTCELVVSRNARHECCVEVIYDSGNPSVLRTLLRS AVGEIIELC >gi|225935348|gb|ACGA01000044.1| GENE 69 102696 - 103208 295 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173170|ref|ZP_05759582.1| ## NR: gi|260173170|ref|ZP_05759582.1| hypothetical protein BacD2_14970 [Bacteroides sp. D2] # 1 170 1 170 170 310 100.0 2e-83 MSENNNKQKRKRVCPHCGRKLWMREFYPLKNGGRSSWCHECVLVYKREQYRKHRKVADGT FMHRTLGRLVEHKGYSTRIFWNGNMLSIMRRHYHNTLNRELAEMLGVSERSVTRKAREMG LEKDKGFVASLSREHLLLANARSKELGYPGGFTKGMKFRGNQYTGRIRVE >gi|225935348|gb|ACGA01000044.1| GENE 70 103229 - 103534 302 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173171|ref|ZP_05759583.1| ## NR: gi|260173171|ref|ZP_05759583.1| hypothetical protein BacD2_14975 [Bacteroides sp. D2] # 1 101 1 101 101 193 100.0 4e-48 MITEKQKEAVKELCQYVDNFCKENDLSAFMSVAASEDHPDGLEQIAGSIITGKTEHIVGS ISGVVKANKNVYMLLSVGLMQAYTRKADINTIPFGENLNMN >gi|225935348|gb|ACGA01000044.1| GENE 71 103554 - 104459 475 301 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294778490|ref|ZP_06743913.1| ## NR: gi|294778490|ref|ZP_06743913.1| hypothetical protein CUU_0512 [Bacteroides vulgatus PC510] # 1 301 1 301 301 573 93.0 1e-162 MRKEYYNYVVKLPVLLHELFRGKVADYHFSDMTVVMNHLVKSYIRMTDGGRVSTATRRIL LCMDRIPDMSFFFRRQEKSVLFFEMDPAVAGSLQRAIIAGGWGNRQRLVVRLVCAFCCGA GVTLNNLSMELASEEVFRRPEGYLIHTYVSNYQYVFLKETAAAQRMSVEGMLTAAAELLV GTDDEGSGYHIPESLGRIADRVFEVRGSTLKDFRRQCLVSIRTNTIGPDRIASFMEKHGI ASAREFLRRVVLFFLEARYLIYRKEVELDEDDLPEEEETDWEETMYSQYQKRDFAISTYN Y >gi|225935348|gb|ACGA01000044.1| GENE 72 104463 - 104699 154 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173173|ref|ZP_05759585.1| ## NR: gi|260173173|ref|ZP_05759585.1| hypothetical protein BacD2_14985 [Bacteroides sp. 
D2] # 1 78 1 78 78 152 100.0 6e-36 MSRIKKQLEICPPAYMCKGPNRENFVSTGHKCGYCKGNGWFWGTEEGSREDVHVSCPVCG GSGELDAIITVDWKPSSK >gi|225935348|gb|ACGA01000044.1| GENE 73 104696 - 105373 523 225 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173174|ref|ZP_05759586.1| ## NR: gi|260173174|ref|ZP_05759586.1| ATP-dependent serine protease [Bacteroides sp. D2] # 1 225 1 225 225 459 100.0 1e-128 MEEEKKDNKKAGMRRALNVRDILNKKYDVFPFEGKWKDAFDTPEVRGCWFVWGNSGNGKT SFVMQLCKELCKYDRVAFNSLEEGTSLTVQNNLRRFGMAEVSRHLAFIKEDIPTLKIRLR RHKSFNIVIIDSFQYTQMTYRDYIQLKEEFPDKLFVFISHARGKNPKGDAATSVMYDADL KIWVEGYVAFSKGRYQGATGEYTIWEKGAYDYWNVAGPKQKGGQA >gi|225935348|gb|ACGA01000044.1| GENE 74 105376 - 105732 309 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254883982|ref|ZP_05256692.1| ## NR: gi|254883982|ref|ZP_05256692.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 118 15 132 132 228 94.0 1e-58 MENKFEYLKIDGREQLPAPWSDYPVLREYETVTVYRNGRDYLDALVGQQDGWWVAGVHME VGGSGGGFNPGRKWGQFATRENALLWALGRMLCHEKLRGAARQAVLDRIDNIRQLTLF >gi|225935348|gb|ACGA01000044.1| GENE 75 105771 - 106640 659 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173176|ref|ZP_05759588.1| ## NR: gi|260173176|ref|ZP_05759588.1| B transposition protein domain protein [Bacteroides sp. D2] # 1 289 1 289 289 578 100.0 1e-163 MEITMKEKDAISESLRAYVAKYPSQTKAAGSLKGVSVGTVSNILNGRYENISDEMFRNVA SQVGGVSATGWQIVETGAYQEITAVLSDAQRWRNVTWVTGEAGCGKSTTARVYLQEHKEV FYILCSEDMKKGDFVREIARTVGIRTEGYNIREVWGLILDDIIQMDAPLLVFDEADKLTE PVFHYFISLYNKLEEKCGVVFLSTDYIAKRISNGLRYQKPGYKEFYSRIGRKFYELEPTD VNDVFAICSANGVTDRKDIDKVIKEASTCDFDLRRVRKSIHKVKRMTGE >gi|225935348|gb|ACGA01000044.1| GENE 76 106676 - 108715 1487 679 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173177|ref|ZP_05759589.1| ## NR: gi|260173177|ref|ZP_05759589.1| hypothetical protein BacD2_15005 [Bacteroides sp. 
D2] # 1 679 1 679 679 1338 100.0 0 MEYFDNILCVTYKELLDIMPKGTLNSQLSREKLDVVSRGGGENNPALYAYSSLPEKYKKR WVERHGEPEKQMRQEMIRNIVKKDEKAENFFEDYRYDKNGEMVALPEDVKKEYTWNASVL NALMEEFKRLSSSNNKLTGFRRNLWELLLVTSEEWRPVYGHSLPGSVGRLKALINKFRPD NYGVLVSGKYGNSNTLKIEEDGGRYLVALKRSRVPVYTDMEIFEEYNRVAPERGWKPLKS PRSLREWFNSPRVEPLWYDAVYGEMKAHQRYDRKHRTILPGRRDSLWYGDGTKLNLYYRD ENGNKCTTSVYEVVDAYSEVLLGYYISDNEDYIAQYHAFRMAIQTSRHKPYEIVCDNQGG HKKNAALGLFSKISRIHRPTAPYNGESKTIENIFYRFQSQVLKKRFGFTGQNITAKRDTS RPNLEFINANIDSLPTLEELKEQYAAAREQWNSMKHPATGISRIEMYNTSVNEATDAVSV SDMVEMFWYTTEKPSLFTANGIEITVQGKKYPYEVFSAPGEPDLEWRRRNTYKKFYVQYD PYDMSSVRLLYKDKGGAMRFECVASFPLMIHRAQQEQTEAEKRFIRAQQEAVINERINRQ VVAKDIEYEHGVAPEQNGLRTPDLKGLGKEAQRQIDRRTRKYSQPARPSIGRDMKVISNV TWDSFEKKEVSIRKVVGKL >gi|225935348|gb|ACGA01000044.1| GENE 77 108731 - 108952 187 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173178|ref|ZP_05759590.1| ## NR: gi|260173178|ref|ZP_05759590.1| hypothetical protein BacD2_15010 [Bacteroides sp. D2] # 1 73 1 73 73 122 100.0 1e-26 MKNDLMTLFSDQLHWFARLKRKQRFCVLYFCMSFGILLSIFFINPLLELLVVLNFGISVR LLKKHVPLNDLED >gi|225935348|gb|ACGA01000044.1| GENE 78 108942 - 109142 311 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255690973|ref|ZP_05414648.1| ## NR: gi|255690973|ref|ZP_05414648.1| transcriptional regulator, ArsR family [Bacteroides finegoldii DSM 17565] # 1 66 1 66 66 112 100.0 9e-24 MKERIVVEYGEVNKIAELMGCTNVMVSHALAFRKNSKLARSIRKLAIERGGSKVGGNPQN TSSHEK >gi|225935348|gb|ACGA01000044.1| GENE 79 109262 - 109453 182 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173180|ref|ZP_05759592.1| ## NR: gi|260173180|ref|ZP_05759592.1| hypothetical protein BacD2_15020 [Bacteroides sp. D2] # 1 63 1 63 63 120 100.0 4e-26 MEVKFKKGQSVRITKRNGEIIDGIVRDWDYNICTFVREYNIDYMKNGQVWTVICVPEDAI KEL >gi|225935348|gb|ACGA01000044.1| GENE 80 109589 - 110008 367 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173181|ref|ZP_05759593.1| ## NR: gi|260173181|ref|ZP_05759593.1| transcriptional regulator, XRE family protein [Bacteroides sp. 
D2] # 1 139 1 139 139 244 100.0 1e-63 MSVFSKNLRYLRESRGLKLDEFEFLGIKKGTMSNYELGNTEPKLSLLCEISKFFRISIDD FLLKDIEAEKITPVVTETAPPETANNNFRELLDVLREKDSTIREMAEEIGMLKQTITQLK QDKSGRVSDASDSTVANAI >gi|225935348|gb|ACGA01000044.1| GENE 81 110119 - 110343 104 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173182|ref|ZP_05759594.1| ## NR: gi|260173182|ref|ZP_05759594.1| hypothetical protein BacD2_15030 [Bacteroides sp. D2] # 1 74 1 74 74 112 100.0 9e-24 MMEGCIYPFERFERLKKWKGYSLAYSFGYSFKLLQKRNVLIAYSFGYSFLCLFCSNKTGK YLFFIWYSSVFIIL >gi|225935348|gb|ACGA01000044.1| GENE 82 110408 - 110620 182 70 aa, chain - ## HITS:1 COG:no KEGG:BF2407 NR:ns ## KEGG: BF2407 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 68 2 65 69 61 56.0 9e-09 MSKVIHVHLIFEKKNIYFGSISAIFETLTEKQVGITKSSLLHAGLVDDIAKYTKRAMIIQ SRLITCTRKG >gi|225935348|gb|ACGA01000044.1| GENE 83 111019 - 111339 134 106 aa, chain + ## HITS:1 COG:no KEGG:Spro_1011 NR:ns ## KEGG: Spro_1011 # Name: not_defined # Def: low temperature requirement A # Organism: S.proteamaculans # Pathway: not_defined # 1 92 288 379 393 75 41.0 4e-13 MIICAVGDELIVAHPEQEMRAEVVFVLIIGPIVYILANSIYKYVTCRMLPLSHIIAVIAL ALLLPWPYHISLLTMNILVTSVFIFVIVFDMLFPNKGFKIKWEPKI >gi|225935348|gb|ACGA01000044.1| GENE 84 111576 - 113600 1431 674 aa, chain - ## HITS:1 COG:no KEGG:BT_2828 NR:ns ## KEGG: BT_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 674 1 660 670 374 34.0 1e-102 MKQTILKYLMIGVLVISSISCMDKERDLSWERRHMPKEAYFDFNMTQAVALDVDYCFKSD NYRVLFDIYDQDPIEYSADGTVSQKGIEPIYRAVTDEEGKFSGEMNIPADISEVWLSSDY LATASPLKLTIDDSRRLSFNQDAYITTLRSQTASKTRGVTVNQHTYLKEWNVLPDADWDD SGRPTNLEAKINIPPADVLYNIKYVFRKVTVKDENGKSKVMNISQNYPEFFDGTIKMTSD IPIVNPTEVSLVFITSSAAWYNTVGYYTYPTNNPPQSASDVKQIIAFPNTSPIYKTLGAG ALVCGEEIKLKYWNEETQEYEDKFPAGVTIGWCLQGMGFKSKLTSETDKDKVGDIIKGMG ARYSTRNLNTNNTQRTVSLRDSKSGQIVAVGFEDNIDFDYADAIFYIHTSEKNAIDPALP ALPEDPEAIPEQYKISYSGTLAFEDLWPKLGDYDMNDVMVKYTSTMTRNALDNRIYEIED 
KFILQHCGGYLQNGFGYQFHKLSNSNVKSVKITGPDANGLSSSIYMEGKETEPGQSHPTI LLYDDMTKFKNITDESKKEYTVTITLDGASEKDVVPPYNPFIFISSNEGRGKELHLMNYP PTDKADLSILGTGKDIYRPEEGMYYVSADLMPFAINMPVSNLPVPEEGKRIDQSYPKFSG WVSSNGKQNKDWYK >gi|225935348|gb|ACGA01000044.1| GENE 85 113673 - 116018 2321 781 aa, chain - ## HITS:1 COG:SMc01364_1 KEGG:ns NR:ns ## COG: SMc01364_1 COG0550 # Protein_GI_number: 15965053 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Sinorhizobium meliloti # 1 585 5 585 585 438 41.0 1e-122 MQKNLVIVESPAKAKTIEKFLGKDFKVLSSYGHIRDLKKKEFSIDVEKNFTPSYEIPADK KALVNTLKTEAKDAETVWLASDEDREGEAIAWHLYEVLKLKPENTKRIVFHEITKSAILK AIEQPRDIDLNLVNAQQARRILDRIVGFELSPVLWRKVKPALSAGRVQSVAVRLIVERER EIHAFQTEAAYRITAVFLVPDTDGKLVEMKAELARRIKTKEEAKAFLNACQGASFAIDDI TTRPVKKTPPAPFTTSTLQQEAARKLGYTVAQTMMLAQRLYESGFITYMRTDSVNLSEFA TAGSKDAIIKMMGDRYVHPRHFETKTKGAQEAHEAIRPTYMENQSVEGTAQEKKLYDLIW KRTIASQMADAELEKTTATITISGSSDVFTAIGEVIKFDGFLRVYRESYDDDNEQEDESH LLPPLKKGQKLEHGPIIATERFTQRPPRYTEASLVRKLEELGIGRPSTYAPTISTIQQRE YVEKGNKDGEERQFNVMTLKDRQIKDENHTEITGAEKAKLFPTDTGTVVNDFLTEYFPDI LDFNFTASVEKEFDEIAEGEVKWTSIMKNFYDKFHPSVENTLAIKTEHKVGERILGEEPG TGKTVSVKIGRFGPVVQIGTVEDEEKPRFAQMKKGQSMETITLEEALELFKLPRTLGEYE GKSVSVGVGRFGPYVLHNKVYVSLPKTLDPMEITLEEAEQLILEKRQKEAERHIKTFEEE PELEILNGRYGPYITYKGSNYKIPKDIVPQDLSLASCLELIKLQDEKGPATTKKGRFAKK K >gi|225935348|gb|ACGA01000044.1| GENE 86 116291 - 118012 1445 573 aa, chain - ## HITS:1 COG:no KEGG:BT_2817 NR:ns ## KEGG: BT_2817 # Name: not_defined # Def: putative TonB-dependent receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 573 1 573 573 696 62.0 0 MKRSHYIIGGLALAISMPSCLQAQTTQPKDTTMNRTVIVEQEYNPDIMDASKVNVLPKVE EPTVSKKEVEYATTFFPATSVPAGLMRPYTGKEIQPGTTPGYVRAGYGNYGNLDVLANYL FRLSKKDKLNIRFQMDGMDGKLTLPFTDGEKWNAFYYRTRANVDYTHQFNKLDLNIAGNF GLSNFNFQPGSVNSKQKFTSGDFHAGIHFIDETAPLRFNAETNLLMYERQHNMFNESEAN TGIKETIIRTKGDVTGAIGDQQLITIALEMNNLLYSGYTKNVSTGDEYFKNYTTLLLNPY YELDNDDWKLHIGANVDLSFGFDKSFRISPDITAQYIFSDSYIVYAKATGGKQLNDFRRL 
ENICPYGELPEANTTATLGYVQRPYDTYEQINGSIGFKASPYPGLWFNVFGGYQNLKNDL SYLGFDPSNLHSGSYLSFAQDNTDNLYLGGEISYDYKDILGISAKYTYRNWDSKTEEYLL AVKPVSEMSFNVRIHPISALNINLGYDYINRKEVKEYAKMTAINDLHIGASYNVFKGVSV YAQVHNLLNKKYQYYLGYPTEGFNFLGGLSFRF >gi|225935348|gb|ACGA01000044.1| GENE 87 118034 - 121051 2827 1005 aa, chain - ## HITS:1 COG:MA1613 KEGG:ns NR:ns ## COG: MA1613 COG0457 # Protein_GI_number: 20090471 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 30 968 4 992 1885 65 20.0 7e-10 MKKEITRLICAAICCTPMIGFAQTSDKFTSTDNLYKEGKELFQEKNYAAALPALKAFVKQ KPAASLLQDAEYMLASSAYELKDKNRIEILRKYLDRYPDTPYANRIYALLASCYFYEGKY DEALALFNSADLGLLGNEERDDCTYQLATCYLKTDNLREAAIWFETLRANSPKYAKDCDY YLSYIRYTQKRYNEALKGFLPLQDDSKYKALVPYYIAEIYAQLQNYDKAQIVAQNYLSAY PNNEHAAEMYRILGDAYYHFGQYHQAVEAFNNYLNKDHSAPRRDALYMLGLSYYQTKVYS KAAETLGKVTTANDALTQNAYLHMGLSYLQLAEKSKARMAFEQAAASSANMQIKEQAAYN YALCLHETSFSAFGESVTAFEKFLNEFPTSPYAEKVSSYLVEVYMNTRSYDAALKSIDRI AKPSAQILEAKQKILFQLGTQAFANADFTQALKYLNQSIAIGQYNRQTKADAYYWCGESY YRLNRMTEATRDFNAYLQLTTQPNNEMYALANYNLGYIAFHRKDYTQASNYFQKYIQLEK GENTTALADAYNRIGDCYLHVRSFEEAKHYYSQAEQMNTPSGDYSFYQLALVSGLQKDYS GKITLLNRLVGKYPSSPYAVNAIYEKGRSYVLMDNNNQAITSFKELLSKYPESPVSRKAA AEIGLLYYQKGDYNQAIGAYKEVIEKYPGSEEARLAMRDLKSIYVDLNRIDEFAALANAM PGHIRFDANEQDSLTYAAAEKIYIKGRMEEAKTSLNKYLQTFPEGAFSLNAHYYLCLIGN EQKNYDMVLLHSGKLLEYPNNPFAEEALILRAEVQFNQQNMAEALASYKMLKEKATNVER RQLAETGILRCAFLLRDDIETIHAATDLLAEAKLSPELRNEALYYRAKAYTKQKADKKAL EDYRELAKDTRNSYGAEAKYQVAQSLYDAKEYAAAEKELLNYIEQSTPHAYWLARSFVLL SDVYHATGKDLDARQYLLSLQQNYQGNDDIQSMIESRLRKLKVEN >gi|225935348|gb|ACGA01000044.1| GENE 88 121122 - 121643 368 173 aa, chain - ## HITS:1 COG:MK1028 KEGG:ns NR:ns ## COG: MK1028 COG1051 # Protein_GI_number: 20094464 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Methanopyrus kandleri AV19 # 42 160 27 142 154 68 38.0 4e-12 MEHPLAQFLYCPECGSSHFEVNNEKSKKCADCGFAYYFNPSAATVALILNEKNELLVCRR 
AKEPAKGTLDLPGGFIDMNETGEEGVAREVLEETGLKVKKAVYQFTLPNIYIYSGFPVHT LDMFFLCTVEDMSHFSAMDDVADSFFLPLSEIHPEDFGLDSIRRGLKKFLSGR >gi|225935348|gb|ACGA01000044.1| GENE 89 122008 - 123468 823 486 aa, chain - ## HITS:1 COG:no KEGG:Coch_1468 NR:ns ## KEGG: Coch_1468 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 41 470 61 492 493 433 50.0 1e-120 MKRFLISLFFIPSLLLTFCSDKVNSKEEQEFPFLTPLEEMEKVTYRPSEEVICNPERGFF THQEYATDNDHAITPEFLNEYRNKGMSLIFTAYYMRNFKDKLISEEYLQRIRNNMLALRK GGAKCVLRFAYTSSENEKPWDAPWTLTKQHIQQLKPIFDEFSDVICVLEAGFVGVWGEWY YTDHYNYQPKKNEYGPRRKVLDALLKVMPKDRMISVRYPVAKLFTFNLNHTDTITQETAY NGSDLSRISFHNDCFLADTDDMGTFGENPDYRKFWEWETKYVAMGGETCQLSEYSNCENA VTDFAKYHWSYINIDYHPAVINQWEDEHCMTEIKKRLGYRFTLSDGYFTPKGKIGHPYEI ALKLQNTGWAAPFNPRDVEIIFVHKKKKDNKYKIKLKEDPRFWFPGEQITIRAQFGLPTS MPSGEYDIYLNLPDPKPTISTRWEYSIQLANRDVWNKQYGYNKIHNTILIDGSDGETFDG ETLQKF >gi|225935348|gb|ACGA01000044.1| GENE 90 123502 - 125088 951 528 aa, chain - ## HITS:1 COG:BH1905 KEGG:ns NR:ns ## COG: BH1905 COG1501 # Protein_GI_number: 15614468 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 58 509 191 646 773 124 24.0 4e-28 MKKLIFSILAFSLAATTLSQQVKEIEILPNEKWWGGATDLGSQMPFRENTMEIDLQTQNF NNQTTPLLISNKGRYIWCDGPFRFQLKNGKIRIESARGTIEHTTAGTTLREAYLAASKKH FPPSETLPPELFFSKPQYNTWIELIYNQNQEDILKYAQSIIDNGFPTGILMIDDSWQKNY ANFGFRPDKFPNPKAMVDKLHSMGFKVMLWVSPFVTPDSEEFRDLRAKGYLVKKKGSDQP AILNWWNGSSACYDLSNPAAYNHLREALQKIQKDYGIDGFKFDAGDPERYLAKDVDVFDQ QSYDTEQTYLWAKLGLEFPYNEFRACWKWGGQPLVQRLGDKTYSWNGVASLVPSMVSAGL LGYSYTCPDMIGGGEYSSFLGIDASSFDQTLIVRSCQIHSMMPMMQFSVAPWRILNKENL ETCIKYAKWHEQLGDYILSLAKEASITGEPIVRHMEYAFPNQGFEECKNQYMLGNKYLVA PIMSSDNTRTVKLPKGKWKDDMGKLYKGGKTYTIDVPLSRLPWFVEVK >gi|225935348|gb|ACGA01000044.1| GENE 91 125120 - 129250 1894 1376 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: 
Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 838 1124 2 266 294 132 29.0 6e-30 MRYYKQIILSLIILTCAFTTSSAQEELLFRHLTRNDGLLHDNVTCIVQDSLGYMWFGSHR GLNRYDGYSIDSYKYENGIINSVYYNRVYSIQIVGHHIWLATEAGLSCFDIRTKKFVSYR IEGQSDPDFYSKVRVLKRGINHQLWLISDNQIRLVKIQSSEKNGTQPVLSAQKIGDTYSY VADELNPKVATDSLGNVWISGKRYLSAYKPNANGELQFSGNINNNIGSGVRDICYDNGYL WLIYQEHLAKYKIENGDHYVLIKQVMFNTPGGVLSLCVDPDYVWIGANEGLFQVWKNNDS MPAIEHKHSPSNPYSVGSDINNIFLDRDNNLWVSAWTAGVSYANTRPRFFKTVRYSPFKT SDTNIGSEFISSIHYSKDGHVYMGSKFGGLSSFNIKTKEVIWDYCHLPQLFPSITSIQSD NRNIYAAVRDNILIINKKTREIIQSLRTVNGGYIFWLDFDKFNRLWVTTYAGLECFEEIN SQWRNTMTYTTQTATPNNLSTDLLHNIYSDTIKNELIITSAMGINRVIFDNEGKVIRIVK YLAKENDKNSLSSNYIWPIDKGSESTYWIGTMGNGLNKVTLIDRPNGIYDYSAESYGIEA GATSNDIESIEVDRFGRVWCGGFNLNYFDDEIKRFNLFDTNDGLQSYVFGTSSSTKDDDG NLYFGGAHGLNYFTPIAETPHTASYRVYFTRCHINGKVVDSDIEFSNSLKLKYPDNNFSV SFTSLSYRNQHHIRYRYKLEKYDNEWRYIEAGKEPTVSYQKIPFGLHTLLVEAGDWQAWS NEQYTLQIYSQPPFWMTWWAYTFYILVILGLFYIGFRYFIKWTQMKNTISMQKEQERQKE EMMQMKMRFFTDVSHEFRTPLTLISHAINEIAEDEDICTNKYVNIIQHNTGKLSNMVNEL LDFHRAEIKSAQLRTAYTSIPEYVNDIYEEFKGWAEASDIHVNLQIDDRNIRMWIDPEHF GKILSNILSNSIRYSHANSEINIQVSKGLVDDIIPLYKDSFWNTKEMIPGEQVIIRVSDT GIGMEKAVLPTIFKRFHQVQNNTGKQHSGSGIGLSLVKSLIELHHGGIIISSKPNVGTEV IIALPISDAYLTKEEKTEENTFVLKDYLSNYAVEYAPLEVEETTAIYREGKPTILLVDDN HEILMILREYFVKEYNIIMAIDGQEALDKCNRSLPDMIISDVMMPRMNGIELCATLKKNL QTCFIPIILLTAKSQIEDQIEGIEMGADAYIPKPFNPRLLKANVRNLLNKSHQMRNLPTT NNVRQEIQDKRQREMFDKLIELVNDNLTNQQFSVDHLCLELGMNRTKLYSFIKSTTGMSL GNYIRKIRLDKAAELLRTTDMSISEVGYAVGIESPSYFTRTFKEQFGSAPSEFIKH >gi|225935348|gb|ACGA01000044.1| GENE 92 129268 - 130371 670 367 aa, chain - ## HITS:1 COG:no KEGG:Coch_1471 NR:ns ## KEGG: Coch_1471 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 17 367 16 366 366 355 51.0 2e-96 MRTSYIKSMFLLTISLLIACSEEDAPNLRISVEADKTEVKVGEPVTFSIWHNAMALSIYT GDDGHDYKTSAYFLLQGLADTDLQNNNYRPIDPEIVPYNCDFANTQTGATTIKDNLSEVI NAGSGDNLIGSEAEVEYDESIQQNVLKITSVHPEWWYQALRLNTHAKLGTNKKLNLRMRF 
DKDILEDVSSGAPRPEITTFQVVIRLGGIGVGETDVIFRDETVWDIYWNPNIAYTDYSVD LSSVIDAWQGATGKTMATLSYIQILFTPSNNAGYLGDYYIASASYGDIDYIPFATGQSLN INDNSGIAKYQYTYTQPGTYQVVVIGTNTSMKNYSDSGYKDNVGNKINASEYNYNTQQST IDITVHP >gi|225935348|gb|ACGA01000044.1| GENE 93 130403 - 132196 1179 597 aa, chain - ## HITS:1 COG:no KEGG:Coch_1472 NR:ns ## KEGG: Coch_1472 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.ochracea # Pathway: not_defined # 7 597 8 601 601 850 68.0 0 MNKFNILIAILVCLNICSCSDFLEEDPKGRLTTDNFYNSESDARQAINGVYRRLSDSWVT GYNMKQIPNDLLKRASWDEASGLSNFTYGSENTYIAGMWQNHYAVIKDCNSVIDNVTANK DKINNWERYVAQAQGIRAFLYFDLVRWFGDVPLVLKDTKSLDGLEVTRTSQKDVFQQIIE DFEYCIDHTMDKGDTSQGYQYGRLTKDACRGFLAKVYLWLGSVAQRDGKEILGNAADNFG KSLEYSSTVIQGGRYKLVDYYPDVFNAKTRDKAPDEVLWCVQGLTGDDTGTWTGMMFGIR GSQNLGGSWDNISSSDYHRMIYEPSDSIRRLWNCPRMTIQEDGTLWGWDYKMYWDTRGDQ KLSEATENNNWLQWSIGKFRRFPLADPSSYNYTNFGMDEPLLRYADVLLMYAEAYNEVNH GPGDYRPSSGIDMSGISVQSAYDAVNLVRKRSRIANEGIMHQDVLPRKLITDYINKVDEC VPDWKPNSYGYIYDGVRTVWDYNRYGDDYTAFRTEILNERARELVAESTDRWCDLVRRGT LVKQMQVWRQYNPFISNTEREITTPGAPENIQSRNMLLPIPLSEIDTNKNLTQNPGY >gi|225935348|gb|ACGA01000044.1| GENE 94 132215 - 135286 2234 1023 aa, chain - ## HITS:1 COG:no KEGG:Coch_1473 NR:ns ## KEGG: Coch_1473 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.ochracea # Pathway: not_defined # 13 1023 9 1021 1021 1419 68.0 0 MKRKVLLSTLFLILHFTVALAQQVTISGQVLDEKSEPLIGATINIEGTTNAVITDLEGKF TIKVLPSEKLVISYLGYKPKTIAIGKNRRFDIILDPSVTEMDEVVVVGYGSQRKSDIATA VASVNIKDIAKSSSTQTLQALQGKISGVQIIPTDGSFSSGMTFRIRGVNSVTGGTQPLFV IDGVPMPTQQITNEDTETVNNPLLGLNPNDIESMEILKDAAAAAIYGAKGANGVVIITTK RGTASTKPKFTFSLTGGLDMNPHIPLEVLSPEEYAKKMLDYGTYDSPNLINFWQNVVDNK GWNDPSVHKWMDEITQVAKKYETNASISGGTKGTTYMLSLGYLNNTGIIKRSAFDRFTSR LNLNQEVNSKINIGINLSYSTSKDKNPVSDWSQSGVILNALQISPFLFYPGLADIMNYSN INIMSPLVAVDQVDINNRYSELNGNIYFNYKIQKDLTFSTSASYRQYSMDQNKLWGSDTW YGQSERGRMEISNREENSWVYEARLQYAKSIKKHSFSLMGAFEASKWSMKDVYNKATNFE DMAVGIWGIDKGLVTYAPKYMYDSSQMVSFISRGTYTFDNKYVLNASLRVDGSSKFGANN 
KYGYFPAVSVAWRASEEEFIKKYDFISNLRVRTSFGMTGNNQIPSYQSLSQLENNKVVMN GNMVEIGRYPSNVTNDDLKWESQKQYNVGFDFGVLDNRFSITADFYYKRIDDMLLQVNIP STSGYTKAWKNAGSMENKGMEFAINANWFQGDFSWSTDFNISFYKNKILSLDKGQYQQFY DRGINAKITSDVLLRVGMPVGIYYGYISDGTYNNDTEIINGYPGPNLGLGQLKVVDVNKD GVIDSNDRTPIADVNPKHTGGIGNTFSYKGFDLYAFFRWSYGNDVVNGNAYYLVGTTSIN NILKSVYKDVWSTKTPENNYPLYGRGTWGESVLRSDLVEDGSFLRLQTLSLGYNFPTKVT RKMGLSKMRVAVTGTNLWLWTRYSGFDPEANTGYGTVARLAPGLDMSPYPRPRSFSLSIE LGF >gi|225935348|gb|ACGA01000044.1| GENE 95 136216 - 136782 709 188 aa, chain + ## HITS:1 COG:STM0608 KEGG:ns NR:ns ## COG: STM0608 COG0450 # Protein_GI_number: 16763985 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Salmonella typhimurium LT2 # 4 188 3 187 187 247 59.0 9e-66 MEPILNSQLPEFSVQAFHNGAFKTVTNNDLKGKWAILFFYPADFTFVCPTELVDMAEKYD QFKAMGVEIYSVSTDSHFVHKAWHDASESIRKIQYPMLADPTGALSRALGVYIEEEGMAY RGTFVVNPEGKIKVVELNDNNIGRDASELLRKVEAAQFVASHDGEVCPAKWKKGESTLKP SIDLVGKI >gi|225935348|gb|ACGA01000044.1| GENE 96 136867 - 138417 395 516 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 212 505 2 297 306 156 33 8e-37 MLESALKEQLKGIFAGLEANFTFDISVSSSHENKTELLELLGDVADCSNHITCVVNEGDA LKFTLLKNGDRTGITFRGIPNGHEFTSLLLAILNLDGKGKNFPDEAVCNRVKALKGPIHL TTYVSLTCTNCPDVVQALNAMTTLNPAITHEMVDGALYQDEVDALKIQGVPSVFADGKLL HVGRGEFGELLAKLEEQYGIDETKANAEVKEYDVIVAGGGPAGVSAAIYSARKGLRVAIV AERIGGQVKETVGIENLISVPETTGNELADNLKTHLLRYPVDLLEHRKVEKVEVVGKQKQ ITTSVGEKFLAPALIIATGASWRKLNVPGEAEYIGRGVAFCPHCDGPFYKGKHVAVVGGG NSGIEAAIDLAGICSKVTVFEFMDELKADSVLQERLKSLPNVEVFVSSQTTEVIGNGDKL TGLRIKDRKTEEERLVELDGVFVQIGLSANSSVFRDIVETNRPGEIMIDAHCRTNATGIY AAGDVSTVPYKQIIISMGEGAKAALSAFDDRVRGII >gi|225935348|gb|ACGA01000044.1| GENE 97 138585 - 141101 1155 838 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_3116 NR:ns ## KEGG: Pjdr2_3116 # Name: not_defined # Def: carbohydrate binding family 6 # Organism: Paenibacillus # Pathway: not_defined # 40 818 62 793 1156 218 28.0 1e-54 
MRITFISLSLIFMSICLHAQDRADYKYLNMSSSSFRVDESYLSKHDIVYFSPTQLEAEGF PMGNGNIGGMIWNNDNGIELQINKNDLWSKPDKEENDLSVLKHAARLKIDFGIPVFSWIH LKNFEGRLSLQKGETTYKAATPYSTTTIRTWLAHGRNVWILECENLPNKEITGTEQKATV SLERLGSRSFSGWYAGRFPKDPKVGIGKTQSLLNGNDMIIEENGDGLHFAVACRIIKEPA TAVIINSHRMEQKTVKNKFTVLISVVTDRENKAPVEAARRLLDEAETDGIEHLKADKNNW YKNFWSNAFMKLGDDYLENIYYLRRYLMAAGSQGEFPVAFNGGLWRWNRDVMNWVTPHHW NTQQQYWGLCAQNDCQLMAPYLDTYFKMIPSGEMLAKEYGASDDALLITEAHCFTGEQTG KDREDMRNNFTPASQIASLFWDYYAFTGDKDFLKNKAYVFMKKAANFYLEKLQWDAEKKE FYFLSSLYESADISYVKNTISDRNCIEQLFRNCIKAATLLKTDKDKIAQWKHVLNHLWKI SYEPFESCGEVIAPAEEYYTELRYSPWMWASAGSSAFPTGLIGIDDKDSHMGKAITNLLK CRESANAHYPLPEIAARMGLGDEALKYLTNGVNIHQMYPQGLMHNVTGYPDNIYDLQSVH DLLGHIYTIRSHDFFQCGMEPISNYSTAMNEMMLQSNEDKIRVFPAIPTAWDTTRIAFTL LARGAFIVSSERDEQAKVTQIGIKSLKGNICRVQNPCPVNDIVVMSVNDKNKVKYKIEKN NVITFATITDREYIVKLNSDKEENVPTVYSGTPNSKVKYWKQRTLGKNSGWNDNPIQH >gi|225935348|gb|ACGA01000044.1| GENE 98 141154 - 141861 403 235 aa, chain - ## HITS:1 COG:no KEGG:Coch_1347 NR:ns ## KEGG: Coch_1347 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 36 231 282 498 508 151 40.0 2e-35 MRNCYLTLSLICICSICFAQQQTNDISTKNPPNREWNFNNLDGWEYGHQDDNPDNQCILE NGYLRIFTRANSVDRKKVRTIERIYTTGRYTWRTHIPQMGIGDQCSVGSWIYHDDQHELD FEVGYGKDTVRRELNAAPDEIIAYMTSQAYPFSSVPIVIKTGWHLFEIDLTLKDGNYYIT WLIDNEPKHELQLKFGKDIAFHISCSVENLKFIGDRPAQQENSGLFDFVRYTYHD >gi|225935348|gb|ACGA01000044.1| GENE 99 142050 - 145997 1920 1315 aa, chain - ## HITS:1 COG:MA1149_2 KEGG:ns NR:ns ## COG: MA1149_2 COG0642 # Protein_GI_number: 20090015 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 802 1029 21 258 279 124 30.0 2e-27 MSKAHILKRYSSLIWILFFVLSTKAEQQDYYFRQISLEQGLSQSRVQCIYRDHLGVIWIG TKWGLNSYDQSELKSYFHDREQPNSLPDNFIRFITEDRFGDLYVSTNKGIAIYNKAENQF QPLKYNGKPFTAWSYLQIDDNFLFGGEETLYQYNLADKSITTIFPDIDGDKLKCINRIFQ WSPNILITSSKKDGLWMYDLAKKKMYRCPFVKEREINTIFVDSQNRLWVSFYGKGMACYS 
KEGKRLFSLSTRNSELNNDIIFDFLEKDNQLWIATDGGGINILDLQTMKFSYLKHISDDE QSLPNNSIYRLYKDQMDNIWIGSIHGGLFAIKKVFIKTYKDVPLNNPNGVSERIVVSIFE DKDTLLWIGTDGGGINSFDQKTNTFHHYPTTYGEKVTSITDFSENELLLSCFNKGVYTFN KRTAQMQPFPIINDSISKREFSSGDLVNLYATKDNIYILGAKVYIYNKHTRQTSILYAPQ IDIQRQIAMQAIYSDDTHLYLMGTNNLFKLNFKTNELSSLVSMKEGDDFTSACRDDKGNF WIGSNFGLLFYNKQTGKTEKIHTNLFNSVSSLAYDKKGKVWIGAQNMFFAYIINEKRFVI LDESDGVPSNELIFTPIPALRTPNLYMGGTMGLVRINTDIIFESNSSPILKLLEVKLNGK STLKQVNNNCISIPWNHSSFNIKVIADEKNSFRKHLFRYVITGKDKMVIESYLQTLELGT LASGEYTISVSCDTPNGEWSQPTEILTIMVSPPWWKSTWFIILCIFFAFLVAGVVFFSLI RKKENRLKREMREHEKKIYEEKIRFLINISHELRTPLTLIYASLKRILNKEVKQDELPEY LQGAFKQANQMKDIINIVLDARKMEVGQEVLHISSHPLHKWIQEVAETFQTASKAKEIEI TYDFDDSIQYIAYDDTKCRVVLSNLIMNALKYSPNQTRIVIKTIRTNESIQVHVQDQGIG LDNVDIKKLFTRFYQGKHNEGGSGIGLSYAKMLIDLHGGRMGAFNNEDRGATFFYEIPAN LQEQEVSCPQHSYLNELLSSPEEEEKIESGSFSLQGYSLLIVEDKQDLREFLKSALKDKF KKIYQAENGLVALEVIKQQQPDIIVSDVMMPQMNGYQLCKEIKENLNISHIPVILLTARA DSESQMLGYKLGADAYLPKPFEMEMLLSVIQNQMRNREYIKSRYRGNQFILSPQEATFSN ADEQFMIKLNEMIDQNLSQPDLDVKFLTAQMAMSRTSLYNKIKELTGMGANDYINRRRID KAIILLTQSDMSITEISEQVGFTYQRYFSTLFKEMKGMTPSQFRAQHGSTQQPSE >gi|225935348|gb|ACGA01000044.1| GENE 100 146365 - 149325 2393 986 aa, chain + ## HITS:1 COG:no KEGG:BF4062 NR:ns ## KEGG: BF4062 # Name: not_defined # Def: putative TonB-linked outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 986 1 998 998 816 45.0 0 MDNLRKTLGCLLLFLFAAVTSTYAQVAKQYSGTVMDADSNEPIIGVNVTLKDAQTGTVTD ISGKFSISAPVGSTLSFSYIGYVTKEVKLGTNTSLKITMQEDQNQLSEVVVIGYGVVKKS DITGAVASVSSKQFKDQPVKRVEDILQGRTAGVAVTSVNGLPGGTVKVRVRGTTSLNTSN DPLYVIDGIMSGGLDVNPADIQSIEVLKDASATAIYGSRGANGVVLVTTKKGVEGKVQIY ADVAIGVSNILKKYDLLNAYEYATALKEYNGISFADDEMEAYKNGSKGIDWQNLMLQTGI SQDYKLGISGGTAKNKYLISANVLNMTAMTITTKYQRAQLRINLDNELTKWLTLSTKINA SRTHSHNGGIDIMNFLNYSPTMEMKDPVTGVYNMDPYNSVNGNPYGARVANYGDSYVYAL NTNMDLTFKIMKGLTLSVQGAANYSHVPSYSFTSSLAKPGQISGMENASRMNLFWQNTNN VTYNTSFGDHHLTVTAVFEASGAEGRNLKLTGSDLANEFVGYWNAKNAKTRDGENGYSAE AIVSGLGRIMYNYKGKYMLTGTFRADGSSKFQKKNKWGYFPSAAVAWDVAKENFMSKQNI 
VQQLKLRASFGVVGNQSIGAYTTLGMLAPTNYDGYGSDAIHTGYWTGNLATPDVTWESTY QYNIGLDASVLDGRLSFTAEWFRKDTKDLLLRKPAPQYNGGGSFWVNQGEVRNSGVEFTI TATPLTDKDIFGWETSLNASYLKNKIIDLAGSDFIVGENYTSIGGGPIQIKKVGYPIGSF YLYEWANFNDQGANLYKHQSNGSLTTNPGADDLVTKGQAEPNWTFGWNNTFTWKNWTLNL FINAALGQDRLNVSRYAMGSMTGVYRFISLSDAYYKSWDKVANKADAVYASHKNSDNRNY PDSDFWLEDASFVKLKNISLTYNIPKKITKVADIQLSVSAQNLFTLTKYTGMDPEVYSES DYGFNGVDMGSYPVPRTFTFGMKLNF >gi|225935348|gb|ACGA01000044.1| GENE 101 149340 - 150980 1628 546 aa, chain + ## HITS:1 COG:no KEGG:BVU_3705 NR:ns ## KEGG: BVU_3705 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 15 544 10 532 534 404 45.0 1e-111 MDMKTIYRIFKSVGLALIALWMVSCQDLLTEDPKGQLAVTNFFNSKGDLDLALNGMYSKV ASDMYANIWAGFESVMGDDISTHPAANKQGLREVDTYNVSDNNTWVTELWGARWRLVKAA NFIIDNAGRTPEVSQEEKDAAIGQAYYWRAYSYFYFVMAWGEVPMVVKDEINYNMPLATV PEIYELIVSDLKKAETMVPANYTKDPYARNGVNIAVSQGAVKATLAYVYMAMAGWPLNKG TEYYQLAAAKAKEVIDAAKKGTYYYKLLPDYKQVYSMEYNKNNPEVLLGVYYNLGIDALT NAPLADFLADYAYGGGGWGDTNGEIKFWYDFPEGSRKDASYFPKIILKNETKLRDWWEDP NPEAPRVVVAPCFMKKVETTTGEEFDYTNPKISMNQNGEKTHQIIRLAEVYCWYAEAVGR SKTGSITEAVNLLNEVRNRANGSVVADRDIYKTTMSYDDLAEAAYNEHGWEIAGYYWGNI ATRARDMFRMNRIKDHFEYRKLNPEIEVAPGVFRKEAVSVSGTWNDSKMYLPRPFVDSSI NPNLKN >gi|225935348|gb|ACGA01000044.1| GENE 102 151004 - 152677 1268 557 aa, chain + ## HITS:1 COG:no KEGG:Coch_1345 NR:ns ## KEGG: Coch_1345 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 138 552 67 499 499 195 34.0 3e-48 MKLRYLYLAIGVLCNASLVSCGDSFKEKTEIVACGISTNALTFGVSSTEVQTVDITSEAA WEVAVDQAGGNWLTVSPLEGTGNGTLTISADKNSGPKRSATLTIAAKGAELRTITIIQDG YKGTIYNYGDFTGLQKTGLVAGINPITIVDNDECEDGKALRIYTRPGKEYEGTNGDRFKV QTTTQFGSGRYEWRIYVPKFGMNDRASIGAFLYFDDGHELDFEICSGTAADRAAHSAGPD DMLCLVTSQANPFFSEFTPIKAEAWHTFVLDLKLENKKYLAEWSVDGKVLKRAQLDYGEE AYFRAISSVENLYGMGDHAATQENYALFDYLEYVPYDYSMKPIVEGQLPPEPEGTTVKWD FEEAGFVPVGWTNNGGTIADGNLNLSNGNNFVYGPEIGAGKYTWEIDVPLVGVGEKWLAG GNIAATNAEERSFSMFVFAGTENDRAACTIPPVPGQMLVRCYTESMGVYGVPIDPGKHTL 
TIDLRLNADGAYWATWIIDGEVAKTFTTWYTPAQFKFGFSIMTFADGGGWQGDKPTAKTY TAKYDYIEYKKYNYDEE >gi|225935348|gb|ACGA01000044.1| GENE 103 152696 - 153721 573 341 aa, chain + ## HITS:1 COG:BH2292 KEGG:ns NR:ns ## COG: BH2292 COG3858 # Protein_GI_number: 15614855 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Bacillus halodurans # 36 332 124 414 426 73 22.0 6e-13 MRYLSIIGTAICCVMLVMTSVSCKQEALASTTLEKKVFPWYVYIDGSSFKDIEPVKEIIS SISVFGNPPKSFIDECHQNHIEVYQAVGGNEETIDTPQKRKALVEKYVSDCNANGYDGID LDLEHLRPDIQDAYTEFLKLASKELHAVGKKLSHCVSFYPALYQDNETKMFHDPAVLNAT CDLVRVMCYDMYFAPGINKPELKHRDDCMGIGPTSNYPWTKEAMLFWMKHIPSDKLVMAL PAYANDYAVTGDIKGRQVYQSVPDSINGILPSSTWLCYEKVNMYLYDGTDGNRHMFYASD ARSTEALLELADELGISQIGFWHFNSVDPQMWDTTAKWKKK >gi|225935348|gb|ACGA01000044.1| GENE 104 153892 - 155718 1405 608 aa, chain + ## HITS:1 COG:SMb20092 KEGG:ns NR:ns ## COG: SMb20092 COG3568 # Protein_GI_number: 16263840 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Sinorhizobium meliloti # 16 244 5 245 252 92 31.0 3e-18 MKKIFLLISVILFIFPVQAQHTLRLMTYNIKNANGMDDICNFQRVANVINNASPDVVAIQ EVDSMTRRSGQKYVLGEIAERTQMHACFAPAIKFDGGKYGIGLLTKQVPLRLQAIPLPGR EEARILILAEFEDYIYCCTHLSLTEEDRMKSLEIVKSFTASYKKPLFLAGDMNAEPESDF IKELQKDFQILSNPKQSTYPAPDPKETIDYITALKSNANGFALISSQVLDEPMASDHRPI LVELRTAEKADKIFRTKPYLQNPIGNGMTVMWETTVPAYCWVEYGTDTTQLKRARTIVDG QVVCNNKLHKIRINDLIPGQKYYYRVCSQEILLYQAYKKVFGNTAQSDFSEFTLPETDTD SFTAIVFSDLHQHTKTFRALCKQIQNINYDFVVFNGDCVDDPVDHEQATTFISELTEGVR SDRIPTFFMRGNHEIRNAYSIGLRDHYDYVGDKTYGSFNWGDTRIVMLDCGEDKLDSHWV YYDLNDFTQLRNEQVDFLKEELSAKDFKKAKKRVLIHHIPLYGNDGKNLCADLWTKLLEK APFNVSLNAHTHTYAYHPKGELGNNYPVIIGGGYKMDSATVMILEKKKDELRIKVLNVKG EVLLDITV >gi|225935348|gb|ACGA01000044.1| GENE 105 155787 - 156764 772 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173206|ref|ZP_05759618.1| ## NR: gi|260173206|ref|ZP_05759618.1| hypothetical protein BacD2_15150 [Bacteroides sp. 
D2] # 10 325 1 316 316 565 100.0 1e-159 MKKYFYLLAMLAVVLSTIACSDDDNHPSTNVLDKPEVTVPDVKESSAVITWKAIGNATAY IYSLNNGSEQSTDQNTIQLTGLEPEKSYTFKVKAQKTGSIYFEDSEYAEITFTTTSEVTV YRIATFADDWDKWYYEYNDNGTVKRVYRLYEGELDREWLFAYEGNNITVTGKNAYTMTLN DQGYVATFVDGSNTYEYTYDENGYMTKAEKNGNIASNITIENGNITQWTRFSDGVEQFKV QTYSAVPNVGGAHCIYAEGSGPSRWLVETGLFGKASANCHTSSGWQHSSVASTYTFEYDE NSCIKEESKNYDGDIEKFYYTYFSE >gi|225935348|gb|ACGA01000044.1| GENE 106 156789 - 157796 992 335 aa, chain - ## HITS:1 COG:no KEGG:BT_2809 NR:ns ## KEGG: BT_2809 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 334 1 332 334 398 67.0 1e-109 MFTVNSYLLAVIFCIVTMICWGSWGNTQKLVSKNWRYELFYWDYVIGMVLFTILLGFTMG SHGDTGRPFLEDLGQASGDSIGWVILGGVIFNASNILLSASISLAGMSVAFPLGVGIALV LGVIVNYLGIPTGNPMLLFGGVALIVIAIICNGVASGKMQKGEESKKNNKKGIIIALIAG VLMSLFYRFVVKGMDVENFNSPAIGMMTPYSAIFVFSIGVLLSNFIFNTLVMKYPFVGEK VSYSEYFKGNARTHICGMLGGAIWGLGSAFSYIASGEAGPAVSYALGQGAPLIAAIWGVF IWKEFKGSTKSVNQLLASMFIFFLIGLGMIVVAGN >gi|225935348|gb|ACGA01000044.1| GENE 107 157815 - 159335 731 506 aa, chain - ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 1 298 1 269 269 119 28.0 2e-26 MRKILLYIILFLSSIDGVRALPIEKEGMSIYSPALKKEVSYSIILPEGYEHSDMEYPVLY MFHGIGGDYTSWLEYGNVARVMYQMIKKGEIQPFIIVIPDGYLSYYSDTYDGSFPYETFF TKELVPYIDNNYRTHKNANARSIAGFSMGGFGALSVSLRNRNLFGSVVALSPSIRTEKQY IEEGPQIGWDNQWGRIFGGVGKKGNERLTSYYKQHSPYHILSTLRNSDLKEFGIMLDIGD KEGTLCESNEELHRLLLEKKIPHEWEVRSGEHDFTCWNAALPKAFRFINGHFNENQQKNR ERNFPNETPFIKMGNTTIYYPEQAQGSIRKYPIIYVQGEINGQQQKALVSQFHQLVNNNK TWPAVLCFVKANVDLSKTISDVEKQCSGIRGSQRMRALITWENHIKESIETIRRENLFTG IVCVNTTGDESDAPNFIKAVNSHKRYPRCWIEVLPESKEYGLSSNIHILLEASNLEHEFR SRKYKETNAFTYWEDWIIYLNNRIHV >gi|225935348|gb|ACGA01000044.1| GENE 108 159337 - 160932 778 531 aa, chain - ## HITS:1 COG:AGc3637 KEGG:ns NR:ns ## COG: AGc3637 COG0627 # Protein_GI_number: 15889292 # 
Func_class: R General function prediction only # Function: Predicted esterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 46 309 25 310 322 96 27.0 1e-19 MNRKLLILLALLFSYGLSSCLSDDNPPSEDEQTETPEQFTKRYNPDQSFYSKIIGQEIKY SVLLPQEYLSESTAKYGVVFLLHGWGGNQNSWGPSGLNIQTIVDTQTNNGSVRPLIYIMP EGFNSYFCNRYDGKFNYMDMFINELVPLIDKRFRTIASKTERAVAGFSMGGFGALSIASQ HPETFSVSVGLSPSLNTDEQYISLSQDGWNLQWGNNFGGSGQAGTGRLTSYYKSQCPLHF FKDKSSSTFQAIRYYIDCGDDEERLYAGNGELHSLLRDKNIQHEYRVRNGAHTDSYWRES MKEALPFIERSFKGQSYPQETLQQFTEELHSTTKNIKVGNSNIELYIPDDYNSTLTYKVL YYSKGEGNTNLTTEQVAVALDSLMQIKRMIIAGFDVKEIIQNGISFSAITNAVEKTIHTE NNADFRLGLAYGSDADYLYHQSTGNTPAIHFFFAEDADIANPSDENQAQIYYLDITDEGT NYNSMFTLFNGLRKAEAPVQYRVRNGVDSSQSAQTGIYSMSYYIGEQLIKK >gi|225935348|gb|ACGA01000044.1| GENE 109 160944 - 162491 1228 515 aa, chain - ## HITS:1 COG:no KEGG:BDI_1656 NR:ns ## KEGG: BDI_1656 # Name: not_defined # Def: putative large exoprotein involved in heme utilization or adhesion # Organism: P.distasonis # Pathway: not_defined # 30 512 39 517 518 587 55.0 1e-166 MKKVLINMILLFAFSISGTAQIEYGKTVEISKDVLLDKIKGGWAGQTIGCTYGGPTEFKY RGAIIHEKIPIIWYDDYCKDIFAEDPGLYDDVYMDLTFLEVMQKEGINAPAESFAKAFAN ADYKLWHANQAARYNILHGIMPPASGHWKNNPHADDIDFQIEADFIGMICPGMVNTASDF SDRIGHIMNYGDGWYGGVYMGAMYALAYVNNDIYTIVTEALKTIPEQSKFHRCITDVIKY WKQYPDDWRKCWLEIENRHAFEIGCPEGVFNAFNIDATINAAYCVIGLLYGNGDFFKTMD IATRCGQDSDCNPATAAGILGVIQGYKAIPEYWKPALERCENIKFPYTDISLSSVYDINL KLLSDVLKANGGKVKGDKYYTTIQKPKTVAWEVSFEGLYPSERRVIKYDLGTEKTFEFNG KGVVLMGLIRQDIKDNNDNYIAILEAYIDGKKIETIEMPYDYIKRKYDIFYNYDLEEGPH KLVIKWINPNEKYAVQCKDMVVYSSTPAEPINPYQ >gi|225935348|gb|ACGA01000044.1| GENE 110 162498 - 164105 882 535 aa, chain - ## HITS:1 COG:no KEGG:BDI_1656 NR:ns ## KEGG: BDI_1656 # Name: not_defined # Def: putative large exoprotein involved in heme utilization or adhesion # Organism: P.distasonis # Pathway: not_defined # 27 524 39 510 518 508 47.0 1e-142 MKKIITICIIGLFALNIQAQPVKTLKLSDKELLDKIKGGWAGQTIGVVFGAPTEFKFTGT YIQDYQPIPWAEGYVKYWWEKKPGLFDDIYNDCTFVEAFNELGLDCSQEELAKRFAFADY 
HLAHANQAGRYNIRQGIMPPASGHWLNNPHADDLDFQIEADFIGLMAPAMLPEALDIASR VGHIMNSGDGFCGGAFVAALYSSAFYEQSPEAILRTAISVIPKESTFHQCIQDVINFHSL HPDNWKDCWYFLQEKWNCDVGCPKGVFLNFNIDAKLNSAFVALAMLYGQGDFTNSIDIAA RCGQDSDCNPSTVSGVLGVMYGYDNIPSFWLNPLKEVEEFTFEGTDMSLAKAYNMSFEQA KQLIVKTGGTVSSGEVEIPIRKADVLPLEQNFENTYPLYRERKDCFLTDTFEFDFNGNGF VIWGNICCTRSITPDYINRVSTRHIGSEVFGLAEPNDPYVAKVEIWIDGELDHVAALPMK NTDRKVEPAWKYLMKEGRHHVKMKWLNRKKDYIIRINDIMYYSEKKEHDRFYFNK >gi|225935348|gb|ACGA01000044.1| GENE 111 164118 - 166082 1370 654 aa, chain - ## HITS:1 COG:no KEGG:sce3320 NR:ns ## KEGG: sce3320 # Name: not_defined # Def: hypothetical protein # Organism: S.cellulosum # Pathway: not_defined # 22 390 116 497 524 142 29.0 3e-32 MKKPIIFLLVLITNILFISAQKISKAELKDKIAGAWIGQMVGNIYGLPFENKFVDEPAPE SRFPFGYTKNIDKLQKYNGAFSDDDTDVEYIYLLLMEKYGVEPTYANMREGWMYHIRDRV WLANRAALGLMHLGFTPPFTGDENLNPHWYQIDPQLINEIWAYTAPGMISYAAGKSDWAA RITSDSWAVSPTVLYGAMYADAFFCKDIRKLITRALKELPADDRYAIAVKEMIALYDKYP KDWVKARQIMAKKYYIDEPAMTKTIWNANLNGLCGILAMLYGEGDFQRTLDLSCAMGFDC DNQAATISGLLGVMYGAKSLPESLTKPIEGWEKPFNDRYINITRFDMPDASIEDMIERTY NKAIELVCAKGGKVKGDMVYVNPKAQFIPPLEFCVGPNPDLEIGQPTDYSFACRTNANFK WDLIKGKLPAGVTFQNGKLAGTPTEAGKFPITLQLSANNKNAITKDFELLVKTKNIASKA DTIYANIRKLNEEVLDSCWITFGKPMYATNVEVINDGVKDGVGSVFYSLAAKSNLPKIDY FGYGWKEEHTINMVVLDMGCLEEFGGWFTSLSIQYLGNDGHWYDTGKFKSTPTLPETDIV FFQPPFAQYVLEFPAVKTKGIRVLMDTKVQEHWHKYTKNVSSFISITELGVYEK >gi|225935348|gb|ACGA01000044.1| GENE 112 166102 - 167631 915 509 aa, chain - ## HITS:1 COG:no KEGG:BDI_2874 NR:ns ## KEGG: BDI_2874 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 509 7 519 522 262 34.0 2e-68 MKKTNIILSFVICILSYGCTDLGETVYTGVAMNDFFKNEKELVANAGRAYTKLQGYNSEQ SLWTLLLQASDECAVPACGGSWYSNGRYEEIQTNKIPPANKLLTRGWNWIFNGIAACNEI IYETELSPIQFEGKEKIIAEMKILRAFYYYQAISCWGNVPFTTDYTETGYPEQKSREYIF NYLEKEINDNIEFLDREPSDTNYGRATQAAAYCILAKMYLNAEAWFGTPMYDKAEKACKD IIDIGAYSIEDSYSTNFDIKNEDSKENIFVILYDRVYTSGDSNSFYLHTLTLEAASQATF NIPAAPWSGFLCQPDFFQTYDEQDIRRSQSWLYGPQVDLSGKDLGFEYTPVFPEEKYYNS 
NGGRGTYDGARCWKWHYQTDGSLKEYTVSMDNDFAIFRYADVVLMYVEALVRQNRTSEAI QLADFKKIRTRAGLDAYTTSELTIDELYAERGRELAWEGWRHEDMIRFGKYLKKYWAHPD QSSETFRNVFPIPTDILNANPKLSQNKDY >gi|225935348|gb|ACGA01000044.1| GENE 113 167646 - 170516 1960 956 aa, chain - ## HITS:1 COG:no KEGG:Slin_2764 NR:ns ## KEGG: Slin_2764 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 11 956 142 1103 1103 617 38.0 1e-175 MSNDVIAQKAIRGTVLDEANEPIIGASVLVKGTTNGSITGVDGSFNVKANPSDVLVISYV GYATVEQQVGNQTQLVIHLREDSKVLNDVVVIGYGSTTKKELTGSVTSMKKDDLNPGTFT NAMGMLQGKVAGLQIINPNGADPTAKYEVLLRGTNTLSAGQGPLIIIDGVAGADIRNINF QEVESIDVLKDGSAAAIYGTRGTNGVIIVTTKRARSGKTEVSYDGQLTVGVVSRRAKPLS ASEYKSVIKEYRPELESYIFDSDTDWFKEITQTPFSHKHNLSISGGSEKFSHHTSFNYEK SDGLQKHNSSEKLMARTNIRQSLLDKWVDLDYNLNIIHRKYSPSSTSAFMQAFTHNPTEP VYDDSDPDAGGYSRIKAMEYYNPVAIINERNMESKNDNYGANIRATLNILPIKGLKWENF VSYDKEQYETREYYTHYYPSLIGTNGQAYIENYQENDTQYESTLNYSNIFGKHSIQALLG YTYQYTYSTSASMTNSGFDFDDNQTHNIGTGTNLTEGKASMSSNKEDNTYIGFFGRFMYN YDDKYLLSASLRRDGSSRFGDNNKWGWFPAVSVGWRINKEKFLSNVKWIDDLKLRAGYGV TGNQDFSNYKSLMMMTTAGKFYYNGQWINTYQPASNANPDLKWEKKAEFNVGVDMTMFDN RLSFTFDYYKRTTSNLLYDYIVPTPPYVYNTLFTNVGKVTNEGVELTISGTPFNTRDFTW NTSLTVSHNKNKLVKFTNDEFTNGTYKVGWSTSAACYTQRLIEGQSLGTFYGPIWLGTDT DGKDVLLGQNADGSVPEEQWEKIGCAYPDATLSWSNTFRYKKFDLSFSLRASIGGEILNN YAMEYENLSSIGLRNISSNWLSQTNFTSTTYKYSSKYIEDASYLKLDNVTFGYTWDFTSK MIKRLRLSLTAQNVFCITGYSGVDPEVALSGLEPGMESLSYYPRTTEFTFGINIVF >gi|225935348|gb|ACGA01000044.1| GENE 114 170600 - 171514 660 304 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 4 302 2 298 306 262 50.0 6e-70 MDTKKKIVVIGSSNTDMVIKSDRLPKPGETILGGNFLMNHGGKGANQAVAAARLGGDVTF ICKIGNDIFGNETLEMFHKEGIDTTYVGITPQEPSGVALINVDKKGENCIVVASGANGTL SVDDIQRAESTIKQASIVIMQLETPIESVTYAAKMAKKDGITVILNPAPAPTQQLPDDLL ANVDILIPNVTEAEIISGMHITDDESAKEAIRYISSKGIKTVIITMGAKGALAYENNKFT 
HIPAFKVEAVDTTAAGDTFCGGLCVALSEGKNLKDAIIFASKASSISVTRMGAQVSIPLR KEIQ >gi|225935348|gb|ACGA01000044.1| GENE 115 171658 - 172674 623 338 aa, chain + ## HITS:1 COG:HI0506 KEGG:ns NR:ns ## COG: HI0506 COG1609 # Protein_GI_number: 16272450 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 4 334 3 331 332 187 32.0 4e-47 MKKTLIDVSKKTGYSISTISRVLNGKSEKYRISQSAKEVILQSVKELDYQPDIVAQSLRN NTTYTIGLLVPHIDNPFFANIASVVIREAQRYNYTVMLIDTLEDPVQENKAIDSLLSRKI DGIILVPTGENPSKLEEISAKTPIVLIDRYFEKHNLPYVATDNYVGAYQATKLLLESGHS KILCIQGPSISITTKERVRGYVDALREAGYQDNAMIRGNEFSIQNGYIETKLALNSTTKP TAIFALSSTILLGAVKALNEHKVRIPQDMSIISFDDNLYLDYLNPPITRIAQSLENIGII AVKMLMQKILEETELHSEILLKPNIIKRDSIKVLTDRK >gi|225935348|gb|ACGA01000044.1| GENE 116 172746 - 173225 356 159 aa, chain - ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 1 158 160 132 49.0 5e-30 MALKYVVKKTTFGFDKEKAEKYVARPFNAVTVDFKMLCDQVTKVGFVPRGTVKSVLDGLI DSLITYMEIGASVSLGEFGTFRPSFGCKSQDDEKGVTTDALKNRKIIFTPGSMFKGMIKS ISIQKLDSSKTNSSPTPDDGKGDDKGEGGGSGEAPDPAA >gi|225935348|gb|ACGA01000044.1| GENE 117 173724 - 174131 339 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173218|ref|ZP_05759630.1| ## NR: gi|260173218|ref|ZP_05759630.1| hypothetical protein BacD2_15210 [Bacteroides sp. 
D2] # 1 135 23 157 157 245 100.0 5e-64 MRKIIIVSAISMVIFSCCIGVSSNTDKDKKYLYIEATGTNSHGDYSAKIDTLEIMEKNDS LAYLKAFEELCVSQRASALVVEIMKEKMGDRFDEYEEVHDFRLLNEKLEEVDRTIVPDSV LAEIAQSIFSLKLEK >gi|225935348|gb|ACGA01000044.1| GENE 118 174198 - 175148 577 316 aa, chain + ## HITS:1 COG:no KEGG:mru_0017 NR:ns ## KEGG: mru_0017 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 2 133 46 180 312 108 40.0 2e-22 MKYDVFISYSSKDSGVAFDVCAMLEAAGLTCWIAPRNVYGGKSYAREIIEAITESQIVLF IFSSYSNCSSHVESEIDIAFNQEKVIIPFRIEEVKMSPELTYYINKKHHIDGIPEPALSF DILKESVLNNIPRLQKELDKERAYKLLREDLGDFDIEFLKSVLQNSRNNRADPRNVNCED NVAENEFNILQNAAGELCLLIRSRNGRPRKPCFICDNSNFTILFRNNSSAVFLEDINPVA VEALNKVNQMLVVELNNDEVVREYVVPIVLIEGLTSYLQDDAIYDKDSGIQGTLDRVLSH KQQTLLRKLYCNWREN >gi|225935348|gb|ACGA01000044.1| GENE 119 175225 - 175443 138 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717280|ref|ZP_04547761.1| ## NR: gi|237717280|ref|ZP_04547761.1| predicted protein [Bacteroides sp. D1] # 1 72 1 72 72 133 98.0 4e-30 MQNRLLRICKLLTIFIIELNYHLTVSYGEAFDKRRSVFGGEFGNLFVGKLFVALVAEEGK PTIHVATFGTFG >gi|225935348|gb|ACGA01000044.1| GENE 120 176093 - 176377 153 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717282|ref|ZP_04547763.1| ## NR: gi|237717282|ref|ZP_04547763.1| predicted protein [Bacteroides sp. 
D1] # 1 94 25 118 118 195 100.0 8e-49 MMGVSRAGYYKWKRRDPSTRDLNRETMVEFVEQMHSEHSTHGYRWVVAFIRNELRATVSD NFVYKCFRYLGIQSETRPCFAIGYDTPVNYHLIS >gi|225935348|gb|ACGA01000044.1| GENE 121 176439 - 177038 448 199 aa, chain - ## HITS:1 COG:no KEGG:BT_2225 NR:ns ## KEGG: BT_2225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 198 1 199 201 181 45.0 1e-44 MSVYYDLYASGNPQKKDEQQPLHARVIPSGTLDAKKFIELVSKSNAFSQATIEGCLQAVT DELQHWLKQGWIVEVGELGYFSLSLKCDHPVMEKKEIRSPSIHLNKVNLRINKKFRENME PLPLERMESPYRSNGNPDEDKCLSILMQHLDEQGCITCVDFTRLAGISRYKATILLNTYL EEGIIRKYGGGKTVVYLKK >gi|225935348|gb|ACGA01000044.1| GENE 122 177459 - 180878 2693 1139 aa, chain + ## HITS:1 COG:PA0931 KEGG:ns NR:ns ## COG: PA0931 COG4771 # Protein_GI_number: 15596128 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Pseudomonas aeruginosa # 196 327 41 185 742 63 35.0 2e-09 MEKDSPFSSTGRRCIVTLMCIFVTAFVYAQQQKISVSIKELPLKEAISQIAEKASMNVAY SKEFVDTSRKVSLEVKDTDVNKALTLLLKGTNIGFRFLDDSILFYNKEYQNKTEPVDSQG EKKELYVKGKVTDENTEPIIGATVSVKGSTTGTITDINGQYSIKVPYGSTLRYSYVGYRE ESVIAKATTVNVVMKENAVSLEDVVVVGYGVQKKVNVTGAVSMVKAEAIENRPITNVTTG LQGLLPGVSIVSSSGQPGAVPSINIRGTGTINSSTAPLILIDGVAGGDINLLNPSDIESV SVLKDAASSAIYGARAANGVILVTTKKGEKKESVVFQYNGYAGFQTPTALPELVNGREYM ELSNEAMSAAGFSKPYTQEAFDKYDSGLYPNEYSNTDWIDEIYKSRAFQTGHNVSARGGS EKTGFFMSYGFLDQDGLVVGDGYSSKRHNARISVNTEVYDRLKLTGQMSYVDFYKKDLGY SGTSGVFRLSQRMSPLLPVMWQIPDENGRMVDSENWSYGSVRNPLQVAYESGMEERKTRV LNGIFNADLKIIDGLNVGMQYSANIYTRQVDEFNPKMLSYYSDGSPLKANEDAKDYISQS HLDVMTQTLQFTLNFNKTIGRHELGALMGFSQEWENRSTLGATRDNVMVEDIHVISAGMI NFMNSGTKDEWALRSYFGRVNYAFDGKYLFEANLRADGTSRFAKGNRWGYFPSFSAGWNF SREKFMEFATSVLSSGKLRASWGELGNQNIPGNYYPYLSPIITEESYPIGASNTPVMGLW QNKIGNPDIKWETIRMLNFGVDLSFLNNRLNVDFDWYKKENIDALVRPDVPAIVGVSSSN VGYVNLGKIDVKGWELNLSWRDKIGSVNYNLGFNLSDARNKITDLGGTPESLTSTASGSY RRVGDPIGAFYGYLTDGLAQVYDFESVNTTTGKYQKPKFPLVASQNGIVQPGDIKYRDIS 
GPDGAPDGVIDDYDKVVFGEKEPHYTYAIKGGLEWKGIDFSFYLQGVGKVAGYLEDEARH AFINDYSIPKKEHLDRWTPMNPNASYPRLYQSQEHNRLFSDYWKEDASYLRLKNIQIGYR FPARMVAPLGINSLRVYASADNLFTKTDYFGAYDPEVRTTSGDVYPQVKTYVFGLSITF >gi|225935348|gb|ACGA01000044.1| GENE 123 180899 - 182632 1523 577 aa, chain + ## HITS:1 COG:no KEGG:Phep_2529 NR:ns ## KEGG: Phep_2529 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 4 576 24 579 579 296 37.0 1e-78 MKYSKYFLIIALFGIVTSCSDFLDRTNPNEPDNVTFWVNEDQLKNALPPCYEALQKDYLV NWSESTAETVMWGNITSGLSKVSGGKHSYTDGFPFTTYWTGAYSYIYRCNNFLDNYNKAQ VAQNKKDVYAAEVKTIRALMYFYLTVFWGDVPWVGEVIQPEDAYIERTPREKVIDQLVED LKWAAERMPEERYTGDKLGRLDRWGALAILARIALQNERWELAAKTSEYIIENSPYGLYE YEKLFHHEGDVENDPKNIEAIVYSLFVPEIRTQSLPNETCSPTDYIRLNPTKSLVDAYLC TDGKPAKTGLEYYKKTGVQTSSLYKSPEEHYVDYFQNRDPRMKMTLYAPGDKWPGGDDGD PDTDKANEIFNLPRFASLQDNNRVGANSRTGFYLKKYNDIDLAGSSVGGHGNLNVIRFAE ILLIYAEATFELQGKKLTQTQIDYSINRLRDRVNMHRMNLDELSAWGMDLETELRRERRI ELAGEGTRYADVMRWREGELRFGRAITGPSLKVCMNDLGANPYPDTGVDEFGDVIYEKST AEGGARYFDATKHYLWPVPNPERQKNPLLGQNPGWEK >gi|225935348|gb|ACGA01000044.1| GENE 124 182652 - 183443 506 263 aa, chain + ## HITS:1 COG:CC0523 KEGG:ns NR:ns ## COG: CC0523 COG3568 # Protein_GI_number: 16124778 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 28 253 5 245 259 66 26.0 6e-11 MYSMKKYLLLFLCLTLGVAYAQDTLRVRVMTYNLRFGELASLEELAHHIKSFKPDFVALQ EVDSKTDRKRTPHQKGKDFISELAYHTGMFGLYGKTIDYSTGYYGIGMLSKYPYISVQKI MLPHPVKEHERRAMLEGLFEMGNDTIVFTSTHLDVNSQETRAEQIKFITGHFKNYKYPVI LGGDFNARHYSEVIRGMDSWFAASNDDFGMPAWKPVIKIDYLFAYPQKGWRVISTQTVQS LLSDHLPIITELEYVKEASKKKY >gi|225935348|gb|ACGA01000044.1| GENE 125 183472 - 185628 1419 718 aa, chain + ## HITS:1 COG:no KEGG:Amuc_0060 NR:ns ## KEGG: Amuc_0060 # Name: not_defined # Def: alpha-N-acetylglucosaminidase (EC:3.2.1.50) # Organism: A.muciniphila # Pathway: not_defined # 31 715 33 721 848 651 45.0 0 MIHTIIKYLLISATLFFCSCHKPKTDIITPAKQLIERQIGERAQSIHFEYIEPSEGKDIF 
EVIASDGRLTLRGSSSVAICYAFHTYMKEACKSMKTWSGEHITSVMPWPDYELYEQMSPY ELRYFLNVCTFGYTTPYWDWERWEKEIDRMALYGVNMPLATVASEAIAERVWLRMGLNKE EIREFFTAPAHLPWHRMGNLNKWDGPLSDAWQQNQIALQHQILTRMRELGMQPIAPAFAG FVPEGFVQKHPDTQFRHMRWGGFDEEYNAYVLPPDSPFFEEIGKLFVEEWEKEFGENTYY LSDSFNEMELPIDKEDKEAKYKLLAEYGETIYKSITAGNPDAVWVTQGWTFGYQHSFWDK ESLKALLSNVPDDKMIIIDLGNDYPKWVWNTEQTWKVHDGFYGKKWIFSYVPNFGGKNTM TGDLDMYASSSVKALRAANKGNLIGFGSAPEGLENNEVVYELLADMGWSSDSIDLDDWMK IYCEARYGGYPDAMEEAWKLFRKTAYSSLYSYPRFTWQTVVPDQRRISKIDLSDDYLQAI RLYASCADELKGSELYRNDLIEFVSYYVAAKAENFYKQALKDDSENRVLAAQRNLQQTVD LLMDVDRLLASHPLYRLEEWVELARNSGTTLQEKDAYEANAKRLITSWGGIQEDYAARFW SGLIKDYYIPRIQLYFTKDRNKIREWEEQWITSPWSNSTTPFDDPVKAALNLIEKTNK >gi|225935348|gb|ACGA01000044.1| GENE 126 185637 - 186779 821 380 aa, chain + ## HITS:1 COG:CAC2834 KEGG:ns NR:ns ## COG: CAC2834 COG1929 # Protein_GI_number: 15896089 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Clostridium acetobutylicum # 4 375 6 376 380 275 43.0 2e-73 MKKIVLAFDSFKGSVGSFEIAKAAEKAIQEELPDCQIIRFPIADGGEGTTEALCSALHAQ TVSCRVHDPFMKPIDVSYGIVNNGAMAIIEMASACGLPLIDSSRRNPMKTTTYGVGEMVA DALKRGCREFIIGIGGSATNDAGIGMLKALGARFLDSGNRELEPVGENLIKVHQIDISQL NPALKESNLTIACDVSNPFFGKEGAAYVYAPQKGANSLQVIELDNGLRHYAQVIKEYTNM DISQLPGAGAAGGMGGGLLPFLNAELQSGIEVILKTLRFEEVVRRADLILTGEGKLDCQT GMGKALDGILRVGEKCQVPVIALGGAVEATEALNRMGFTAVLPIQPFPVTLEEAMQPEFT KENIERTVRQVVRIIKQFTK >gi|225935348|gb|ACGA01000044.1| GENE 127 186786 - 188042 959 418 aa, chain + ## HITS:1 COG:HI0092 KEGG:ns NR:ns ## COG: HI0092 COG2610 # Protein_GI_number: 16272066 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Haemophilus influenzae # 1 414 4 414 419 343 53.0 5e-94 MTAIGALIGLSLSILLIIRKLSPTYSLIIGAIVGGLLGGLSLNETVTVMTEGVKEVTPAV LRILTAGVLSGILIQTGATTVISNAIINKMGEKRVFMALALATMLLCTVGVFIDVAVITV APVALSIGKRLNLSPSVLLIAMIGGGKCGNIISPNPNTIIAAGNFNADLSAVMFANILPA 
VIGLFFTVFVIVRLMPQSVKNKKTMMQTEDKEEERNLPSLRTSLIAPVVTIVLLALRPAA GINIDPLIALPVGGLCGAICMKQWKNILPSIEYGLQKMSVVAILLIGTGTIAGIIKNSSL KDWILTGLDHAHMSDALIAPISGALMSAATASTTAGATLASSSFAETILAIGISAVWGAA MINSGATVLDHLPHGSFFHATGGVCELNFKERLKLIPYESLIGIVLAAGTTILYIISN >gi|225935348|gb|ACGA01000044.1| GENE 128 188057 - 189142 614 361 aa, chain + ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 6 361 2 375 375 189 33.0 1e-47 MNPNKRLLSLDVLRGITVAGMILVNNTGKCGYNFAAFAHAKWDGFSPADLVFPMFMFLMG ISTYISLCKYNFQCRPAIAKIIKRSLLLIFIGLVMEWFITAIDSGNYFDLSQLRLMGVMQ RLGICYGITALLAVTIPHKRFMPLAIILLVVYFIFQLFGNGFEKSADNIVGMIDSAILGS NHMYLQGRQFVDPEGILSTIPAVSQVMIGFVCGKIIIDIKDNERRMLNLFLIGTTLLFVG YLLSYACPLNKRLWSPSFVLLTCGIAALSLALLLYIIDVKQNKKWFSFFEAFGANPLVIY VFSCIAGGLLVHWHIHTTVFNNLLNPLFGNYFGSFMYGVFFLLFNGLLGYILLKRKIYIK L >gi|225935348|gb|ACGA01000044.1| GENE 129 189149 - 191332 1624 727 aa, chain + ## HITS:1 COG:no KEGG:BT_0438 NR:ns ## KEGG: BT_0438 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 691 1 699 730 671 45.0 0 MKRKSVYTCLVMLLMSLALQAKDKDVAVAEALLKRLLPSYIESFQFQKLKGEKDCFTIES VKDKIVIGGNNANSMAMGLNHYLKYYCLTTVSWYADIAVEIPEELPMVGEKVVSEARVDT RFFLNYCTYGYTMPWWQWKEWERFIDWMALNGINMPLAITGQEAVWYKVWSKMGMSDIEI RSYFTGPPYLPWHRMANIDRWNGPLPMEWLEHQVSLQKKILARERELNMKPVLPAFAGHV PADLKRIYPEADIQHLGKWAGFADAYRCNFLNPNDALFAKIQKLFLDEQKKLFGTDHIYG LDPFNEVDPPSFEPEYLRKIASDMYATLTAADPKAQWMQMTWMFYFDKDKWTSERMKALL TGVPQNKMILLDYHCENVELWKRTEHFHDQPYIWCYLGNFGGNTTLTGNVKESGARLENA LINGGGNLKGIGSTLEGLDVMQFPYEYILEKAWNLNADDNKWIECLADRHVGCVSQSVRD AWKRLFNDIYVQVPRTLGTLPGYRPALNKNSEKRTSNVYSNVELLEVWRKLNEAPSDRRD AFRLDLITVGRQVLGNYFFDVKVEFDRMVEAKDYQALKACGEKMKEILNDLDKLNAFHPY CSLDKWIDDARKMGDSPQLKDYYEKNARNLITTWGGSLNDYASRSWAGLISDYYAKRWEV YINTFIKAAEKGVEVDQKQLEDELKEIEEGWVNATDRKDVRKDIHSATDGLLSFSTFLFS KYQRLVK >gi|225935348|gb|ACGA01000044.1| GENE 130 191442 - 
192032 447 196 aa, chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 14 181 19 189 201 67 26.0 2e-11 MIKLSQKYSQDEAQALVKALKEGNQLAFSIVYKTYAAQTFSLAFKYLLNKELAEDAVQNL FLKLWLKKEEIDETKPINRYLFTMLKNDLLNTLRDSKKNIYLLEDCLSMVLELEDNSQNE NLKQEQMNIIQQALEQLSPQRRKVFEMKVSGKYSNQEIADKLNLSINTIKFQYSQSLKQI RATVGELSLLLLYCMM >gi|225935348|gb|ACGA01000044.1| GENE 131 192141 - 193097 661 318 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 119 308 131 315 331 72 28.0 1e-12 MKSKKKYTNQDFEVVDGIGKYMDDDIQRIIEGKQLELGDRELPPFDEYRIYKNIQEAVMR EEKRKNRRIRMPLFFKWTVACAIVLLVIGVGYNFYQSRSEANLVYREVCAVRGEKLLVLL PDGSRVWLNADSKLTYPEQFAKYNRNVTLEGEAYFEIAENKKSPFQVLAENVKIQVTGTC FNVKAYASDKVIKTTLDEGSINIGHVQSRRPMQQMLPGQTAVYEKRSNVIKIKTDRYHDD ASSWKSNRLIFRNASLKEVLTTLSRHFDIEITVKNEKIASFTYDFVCKGNDLNYVLEVMQ SITPVSFKKISEYTYTVE >gi|225935348|gb|ACGA01000044.1| GENE 132 193129 - 193902 443 257 aa, chain + ## HITS:1 COG:no KEGG:RB8407 NR:ns ## KEGG: RB8407 # Name: not_defined # Def: hypothetical protein # Organism: R.baltica # Pathway: not_defined # 26 255 44 272 281 140 34.0 4e-32 MKALLKPIVWVCLFFFAYQSTYAQALKIMSYNCRMSGEMTGYSVKEYAVFIRKYNPDVVM LQEIDYNTKRNKNQDFTTQLAAELGLFSVFGKAMDTGGGEYGVAILSKYPFVYINNKTFE GIDGAKEPRTLLYVDIQEPGTSDVIRIGTTHLDHSTDLIRSAMAEQINERIGTGDTPTLL GGDFNARTDSNVICEVMKNWQRICDDTFTYPADQPTIKIDYIFGLPQNKWKVKSFKVLSN PEVSDHRALFAEVEFVK >gi|225935348|gb|ACGA01000044.1| GENE 133 193910 - 194689 230 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173236|ref|ZP_05759648.1| ## NR: gi|260173236|ref|ZP_05759648.1| hypothetical protein BacD2_15300 [Bacteroides sp. 
D2] # 1 259 1 259 259 489 100.0 1e-137 MKSNYLYLLFICIISCLFVSCSDEDNKGEEIPEWNRFRTTNVLVYAHLSEQNLFSSCSYK EVASSIRNTVHSVALLDRTNAVYGQTAIINTGTETARESKKVPVFVPVSYSNGEKKIIGS TVLFPSTISEMTQYVVKNDCRYLETKTEAVNGIDMLFCSVSLNSEDLIAPAVDVFKKKVN EQTVLVGTVKRALLPNLESAITSNLTSDTYSFVEVENRNRDSEYCIFVLTSHKWAFRGVT ETSVSGDLHCFQLQIESLK >gi|225935348|gb|ACGA01000044.1| GENE 134 194703 - 196232 1094 509 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173237|ref|ZP_05759649.1| ## NR: gi|260173237|ref|ZP_05759649.1| hypothetical protein BacD2_15305 [Bacteroides sp. D2] # 1 509 1 509 509 997 100.0 0 MKQYIKNIMILILALPWGLLLTNCSDSEPDYSDDNAYPPPTVELTSPSEIDVVEYNSTVT VSARSFSAVGIHSIYATLLKMDENGEYEEINATERQRLKIDTLQTDMTLEFDLNVKVNTR EAAGILVTSTDVLTKTAQKVIPIKKITKLPSQIFTEPSDFPVLVPDEEVSLSVVIRSAVG IKSIKHTLCNKVLGDLKEYTTIPVSGNPLEMEFILKTVVDNKETNGIKIVVEDIEGLKEE KIINVEGLEGVDNNVALVFNDIEMAPEWEHSTEPDQPYIFSIEGIMIRGVQKHVLSLKEI KGYGSKANSVDFAFINIWRNPSFVAVKNRGFSYVSASRINGGPIGRAYDVNDWIKPAGVA TNKTLFTLIPDDKVTDLGIDVMMANAASDVKTFEALNMLESIAKRGADMLMQRVNASDGY PNDPCSLQIKDGSYIAFVTAAGKYGVIHVIEAANDMDALVAGGCKIATPTGVVGSQGPAY SGAGITGLTYDGVALLYGRTCKLKIVVQK >gi|225935348|gb|ACGA01000044.1| GENE 135 196554 - 197540 802 328 aa, chain + ## HITS:1 COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 14 274 23 258 311 78 28.0 2e-14 MEDIYIIISKYLLGTASPQEEMEIMEWRNADAGHEQEFQELCESWQIAHAGIHPVIPDKE RVWEKIMSNLNLVKPVKMYTQRLLYRAVGIAAMLALALGFSLSLLVSEKEEVGLVSFTAP VGQKAEVSLPDGTKVWLNSGSTLTYSTDYAKDCRSVKLNGQAFFDVVQDSKRQFDVSVGD VKVLVHGTAFDVNGYGDHSELEVVLLRGHVTVVSTLTDKLLADMKPNQKVIIPLHEMEKC KLEACDAEVESVWRLGKLKIENENLQEIVQKMERWYGIKIQLHDVPENKRYWMTIKTESL REMLEIINRVTPITYTINGEEVSITGRK >gi|225935348|gb|ACGA01000044.1| GENE 136 197703 - 200918 2459 1071 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0714 NR:ns ## KEGG: Dfer_0714 # Name: not_defined # Def: 
TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 1 1071 7 1124 1124 845 41.0 0 MLKLKQILCSILVLGSLLSSPSLWAQKETKVNIAARQISLKNLIQAVEKQTDYTFVFDNS IPLSRIVSLKGGSQYLSDVLKQAFDKSDIAYEIVGKQIVLQKVQTKTNRTISGVVKDEQG EAVIGASVLVKGTTNGTVTDFNGKFELQNVPESATIDVTFIGYAPQSKKVVAGVRSMNFV LEEDTETLDEVVVVGYGVQRKRDLSGSIASVKGDIITEYANTSVASALQGRVSGVQIQQT NGQPGAGIQVRVRGSNSIRGDNEPLWIINGFPGDINMINTADIESVEVMKDASATAIYGS RGANGVILITTKQAKEGKITVEYNASFGVQSLAKELELLDAWEYMNYLNEKAAINNQPTI YTDEEIKSTRHSTNWQRELFRNALVTDHSVNVSGGTEKVQGTLGASYFDQQGIVKESGYK RMSIRSSLNYHISKYVTVSSNLIFSRSNHNQMNSQGGSRGTSVIGSTLILPPTATPHYDD GTWNDFQTQPIAPVNPLAYVKEVDNKWYANRIMANASLTIKPIDGLSIQLSANVNNNQNR KDYYKSLQYPNSQGAASITFGETVGITSNNIITYNKSFLKKHHLSVMGGFTYEQSTSKTA GTGTAEGFLSDVTETYDMDAATVKGLPTSSYSDWRLFSFLGRVNYNYADRYLLTASLRAD GSSRYSKGNKWGYFPSAAAAWRLSQESFLRDVEWLSDLKFRLSYGVTGSTAISPYSTQNT LRTENVVFDKNTTVAYVPSDTYTGDLKWETTSQFNVGVDLSLFNNRLRMTADYYRKKTTD LLNNVEMPRSSGYTTALRNIGSIRNSGFELQLDGRIIDRAVKWDLGVNFSLNRSKVLVLS EDKDIFGGELDNTILKDQLNLMRVGEPMYVFYGYVEDGYDENGHIVYKNMDDDPAITAAD KTIIGDPNPDFLVNLTTAVSYKGFTLSAFFQSSIGNDIYSLSMAAQAYDYGYNGNTLREV YYNHWTPENPTAKYPNLDQTSYKMSDRFVYDGSFVRLKNLELAYDVPCARSKFIKRARAY VSAQNLFTITSYPFWDPDINANGGGSSMIQGVDSYCYPSARTYTIGCRLTF >gi|225935348|gb|ACGA01000044.1| GENE 137 200939 - 202417 1443 492 aa, chain + ## HITS:1 COG:no KEGG:Dfer_4709 NR:ns ## KEGG: Dfer_4709 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 492 1 479 479 283 37.0 1e-74 MKKLLIYIAAFAMILAMNTSCEDMGSLEEHPKKVDATTFMANAKEVESVINSIYFQLRRD PGFSRYLIVLEEGLADYCIGRGNYATAYDTGLTSGGVGFSKDSWAVLYRAIRFANNILDG IGNTPLSQQEYNNLTGETRFLRAFAYSWLARNWGAVPFFDEQNMNDFNKPRTPEADIWKF VIDEADYAASNLPEVAKSAGRPSRYAALALKTEACLYAGRYEEAAEAAGLIISSKRYSLV EVGKADDFLDLYGHTANATSEEIFYIKFNRDSGSTIAYMYLCKPNPFVNMGAVGIYTDYQ KNKFIQNWDQNDLRYQFGLYKQTQNGTLNALTKTGMICSKFRDSEWTGSSTTPNDNPVYR YADILLYYAEAVCRWKGAPTDDAMEKLNMVRRRAYGQKPTQASSSDYKLADYASKDAFLA LVLQERGYETIFEGKRYNDLKRCGKLAEAALAAGRISALSEVGDAAYWWPIPTDEFNYNM ALDQTKDQNPGY 
>gi|225935348|gb|ACGA01000044.1| GENE 138 202441 - 203517 773 358 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4337 NR:ns ## KEGG: ZPR_4337 # Name: not_defined # Def: glycoside hydrolase, family 5 # Organism: Z.profunda # Pathway: not_defined # 1 355 1 350 353 327 45.0 4e-88 MNRLNILIVTILFLLTTTGCAQQAERWSTEKANAWYASQKWPVGINYVTATAINQFEMWQ EETFDPKTMELELGRAGELGFNTVRIFLHDMVWEADPAGFKQRLDTFLGICQKHGMRAIV TFFTNGGRFESPKLGVQPASVQGVHNSQWIQSPGAPSVNDPSTYPRLERYVKDVMTTFKA DDRILLWCLYNEPENFKQKAHSMPLLREVFRWAREVNPSQPLSSPIWIYPGGHGTRSNLP IISFLGENCDVMTFHCYYGPEEMEKFIAFMKQFDRPVICQEYMGRPRSTFEEIMPILKRE KVGAISWGLTAGKCNFHLQWSSKAGDPEPEIWFHDIFRLDGTPYSQQEIDFIKSMTSN >gi|225935348|gb|ACGA01000044.1| GENE 139 203554 - 205431 1295 625 aa, chain + ## HITS:1 COG:no KEGG:BT_2892 NR:ns ## KEGG: BT_2892 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 96 486 65 448 570 93 26.0 3e-17 MKKLKTHRYLLVTVMGLLVACTSYTSTDEWPNPRPNSNPDPEPVTGEITSIQSLNKSENV AVNSHADEWGNSSLELDYRRLLTLESPALSAVNALYPRIKKLGDGTYLLLYQQGPQAWNV YYALSTNLITWQNASSPLFQSESAQQTSGASDTRCFSSCDAVVLANKDILAFASFRLNQG YRVDPQSNGIMMRRSSDNGRTWSTAQIIYQGTTWEPYALQLRSGEIQVYFTDSEPLTADS GTAMLRSMDNGRTWTVVGKVIRQKTGLAIDGSGKQIYSDQMPSARELNNSTKIAVATETR FRDEGDVYHISMAWSSDNWASAPLTGDEVGPSDRKLNFVQKAAAPYLAQFPSGETVLSYN ASSLFTMQVGNAEASQWGETYQPFSGKGYWGATEIIDPHTLVAAMPATFVNSENKDAARI QIGQFVLNHRINASGMTPVIDADNSDWSAVSDALFIGSVSTTQAVFRFAYDAENVYCLVE RLDKDLTTDDSMELIFQGGDATGTPLKISLIPDAVQYTIKCSHSSVTCKGAVHGTFGDTA ADKGYVVEMAIPRTLLRVVADRLMFNATLYDQNGSDTFTGLTATNYEKWLPIVLKAATDP EPLPGEGDTGTGPSWNNGDTEGTWK >gi|225935348|gb|ACGA01000044.1| GENE 140 205458 - 207347 1393 629 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173244|ref|ZP_05759656.1| ## NR: gi|260173244|ref|ZP_05759656.1| hypothetical protein BacD2_15340 [Bacteroides sp. 
D2] # 1 629 1 629 629 1208 100.0 0 MKKNKILILLVIGLSLGSCVQTDDEAFSPEPTVKGGEKSITAICAVTATTRTSLNASMNV VWTQNDEIRLFGTSTPNGAVYTTTANSVRTAVFDPVDASVDDAIRYAIYPASAASGSRLE GTTLAVDFSALAGQKWMGAFNADSEISSLPMAAASDGEAFSFKNLCGGVRIQLTDYQLMG ISIKSVAVRGNDGEQVSGIADVNAADGIVSLRKSSTPSAAATALVTCDTSVPLSVDNNPS AHTDFVLFLPAVNYTKGLTFVITDAAGRIYEQATPGAFTIEAGVVKPMELLPVTLYYGKA NCYRTASAGTLEIDVTPYYSLAGDYTYENRPRVNVNGELVDKAVSATVLWTQTNSSSSGD VLSAVPALEGTTLKVPVSGVKGNALVAIRDASGKNVWSFHIWVTEASDLTYVNEERGTFK MMDRNLGATSVTPKDQNAYGAWYQWGRKDPFPRPLDIVRSSATTVDNKELTANATTSAEV GTVSYTISNPDIRIFSANDWHNEWRNNGLWGNSDGLTKNVKTVYDPCPEGYCVPDQNCYQ GFTFTSKTECDNNYGHLFVIDGSQTSYFPTGGYLDKGANKIAYQEYRGYQWTSNPGTTGA YYFYYNNANLNFTGLDRASAASVRCVKIE >gi|225935348|gb|ACGA01000044.1| GENE 141 207385 - 210639 2439 1084 aa, chain + ## HITS:1 COG:no KEGG:BVU_0750 NR:ns ## KEGG: BVU_0750 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.vulgatus # Pathway: not_defined # 12 995 8 1019 1113 517 32.0 1e-145 MKYFSILFLILFLTTSAKAQSVKVSVSKVSIPTYTEPEREELPMFAENRVHQRSSGNPYP NKIVLKVNREQKVDKEYTLIKLENEYLELQILPEIGGKIYAAKDKTNGYDFFYKNHVIKP ALIGALGSWISGGLEFNWPFHHRASSFMPTDYEIEKLPGGGVIVWVSEHDPTDRMKGTVG IVLNPGESIFETRVKLSNITPLRHSFLWWENVAVPSNKNYEIFFPHDVSHVFFHYKRSVT TYPVATNAAGIFNGIRYDGAVDISKHKNTIQPTSYFSAASQYDFFGGYDTGRKCGVVHIG DHHVSPGKKMFTWAYNQLSQSWENALTDTDGAYCELMAGSYSDNQPDFTWLEPMETKTFS QYWFPIGEIGVPDFANTTGAIYVKDAIKVQLNKTRNVKITVKGDDRVLFSGNATIKAREE YSLPADVRMRLGYSIDVTANDGTVLMSYTVKKHDTFNIPHTTQDMPNIKKVESPHLLYLE GLHVDQYRDPATKGESYYKEALERDPNFAPALIALGEAKLRNAFYSEALEYLLRAEKVLT RFNTRLENGKLYYLLGHVYLALDEQEKSYDYFQKAAWSSAYVSSAMTYVAMLDIRKLEYD KAVQHLTTAITYHKDNAVANALMIYASYLQGDKKASERQYLSVEANDKLNHLARYFGVLT GKVSARDFMEKIRTDKNQVCLDLIETLLVANLQKETVSLIEMLQTHEPLIFSLSAIYADI KGGSPNDSATEGIAFPSRRIEMNSLSHWAKQGSRKAQLLLGCALYAKGHYEKAVALWEGL SGTDYRAARNLAVAYYSHMNRKNEVLPLLKQALSLKPNDEQLIFETVYVMGKLGVAPAER ISFLNNHKSAISRDDIMLEWARAYNMAGQEDKAIELLRGRNFVPAEGGEHAVAEQYMFAY FLKGRRLMKENKMQEATDCFKTAQTLPQNLGAGLWNIVRLVPFKYYEAICLKSLGQEDKA NENFDFITGIEVDYFSNMNLPELPFYQALCYRETGMPFKGDILINYKLQDWNEGMKTVDA 
GYFATTPFFISFCDRAVQQRSAYYSYLLALAYRYTGDTKLAQKYIEQAAVSDPYALNIFA ERQF >gi|225935348|gb|ACGA01000044.1| GENE 142 210836 - 211387 215 183 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 39 174 36 172 181 71 28.0 1e-12 MRLINTIPGTMDIQAFRNYYDTYYEQLCCFLNFYTHDGAVIEDVIQEVFLKLWENKDCIE ITYIKTYLFRAAKNRVLNYLRDEENRHQLLENWFNQQLEERKYKDCFDMDALTKVVNQAI EQLPEKCREIFSLSRKEGLSYRQIAERLGISVKTVETQISIALKRIREILSSSAFAFLWL FIR >gi|225935348|gb|ACGA01000044.1| GENE 143 211416 - 212555 709 379 aa, chain + ## HITS:1 COG:TM0282 KEGG:ns NR:ns ## COG: TM0282 COG2017 # Protein_GI_number: 15643051 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Thermotoga maritima # 24 378 6 355 356 288 44.0 9e-78 MNQFLACLSLAVLVTACAECGSSSGVKAAFYGVTQDGDTVIQYTLTNASGAQIKVIDYGC RITNIIVPDREGKMADVVLGYENLKDYEIGAERFFGALLGRYANRIAGGAFMIDSIRYQL SCNESPNGYPGHLHGGMKGFDRVMWKAMPINTPDTLGIVFTRRSLDGEEGYPGNLDCKVT YFWTSDNTWRIEYEAVTDLPTIVNMSQHCYFNLQGYDGGSVLNHIVQINADSVTVNTPWY VPASVEAIAGGPLDFRTPHSFAERANSPNEHMKLMGGYSANWILRDYNGDLRYAATITEP QSGRRIETYTTEPGLLIYTGIGLSEKIVAKGGPQQKYGGFILETIHHPDTPHHPEFPSCV LRPHEKYHSVTEYRFSVQK >gi|225935348|gb|ACGA01000044.1| GENE 144 212672 - 213475 587 267 aa, chain + ## HITS:1 COG:PA3818 KEGG:ns NR:ns ## COG: PA3818 COG0483 # Protein_GI_number: 15599013 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Pseudomonas aeruginosa # 34 264 31 263 271 157 38.0 2e-38 MLDLKQLTTDVCQIATEVGHFLKEERKNFRRERVEEKHAHDYVSYVDKESEVRVVKALAA LLPEAGFITEEGSATYQDEPYCWVIDPLDGTTNYIHDEAPYCVCIALRNRTELLLGVVYE VCRDECFYAWKGGKAFMNGEEIHVSSVEDIQDAFVITELPYNHLQYKQTALHLIDQLYGV VGGIRMNGSAAAAICYVAIGRFDAWMEAFLGKWDYSAAALIVQQAGGKVTNFYGEDNFIE GHHIIATNGTLHSFFQKLLAEVPPLNM 
>gi|225935348|gb|ACGA01000044.1| GENE 145 213507 - 214214 450 235 aa, chain + ## HITS:1 COG:AGl2633 KEGG:ns NR:ns ## COG: AGl2633 COG1040 # Protein_GI_number: 15891426 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 232 56 284 291 94 31.0 2e-19 MKHTLLVKDWLSSFLSLLFPRCCVVCGRPLAKGEECICTVCNIKLPRTNYHLRKDNPVER LFWGQIPLERATSFFFYEKGSDFRLILHRLKYGGQKEIGAIMGRYMAAELLSSHFFQGID VIIPIPLHKKKQQIRGYNQSEWIARGITAVTGIPIDTESILRKKNTETQTHKSILERRDN VEGIFELQRPEALVGKHILIVDDVLTTGSTTLACASCLVKVEGIRISILTLATVE >gi|225935348|gb|ACGA01000044.1| GENE 146 214183 - 215262 488 359 aa, chain - ## HITS:1 COG:no KEGG:Amuc_0698 NR:ns ## KEGG: Amuc_0698 # Name: not_defined # Def: beta-glucanase precursor # Organism: A.muciniphila # Pathway: not_defined # 11 342 12 367 369 290 44.0 5e-77 MRSYYFLITLMSICVISNTIQAQNKKKKSFKPGTVWVDNNGKTINAHGGGIIYVDGIYYW YGEHKLPNKSEKEKADGGVHCYSSTDLYHWEDKGIVLSVDYKNEKSDIADGCILERPKVI YNRATSRYMMYFKLYPKGQDYKYGYLGVAAAISPTGPFTYSHKFLGADSPYGSGDFCIYK DDDGKVYHFTVRKPDKAFVAGELNKEYTYPQGKYKVVTGITNETEAPAIFKHKGTYYLLG SGSSGWKPNAARIFTSKSITGDYTEKSNPCHGVNPYNGLGPEKTFGGQSSFIIPVQGKKN SYIAMFDIWKPEMPIEGLYIWLPIKIQTNSINIKWRDEWNLNIFSDEKKIIPPLPTSIY >gi|225935348|gb|ACGA01000044.1| GENE 147 215268 - 216947 986 559 aa, chain - ## HITS:1 COG:SP1328 KEGG:ns NR:ns ## COG: SP1328 COG0591 # Protein_GI_number: 15901182 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Streptococcus pneumoniae TIGR4 # 2 375 7 397 513 111 25.0 3e-24 MTTLDIITIVLFSLGVVLVGLAFSQKGKSMQSFFAANGTMPWYMSGLSLFMGFFSAGTFV VWGSIAYSMGWVSITIQWTMAIAGFIVGLLIAPKWHKTGALTAAEYIHKRLGKSTQKIYT YLFLFISVFLTASFLYPVAKILEVSTGLPLNTCILLLGGLCILYVSAGGLWAVISTDVLQ FVILTAAVIIVVPLSFDKINGVTALIEQVPDTFYNLLNDEYTFGFLVAFGIYNTIFLGGN WAYVQRYTCVKTQKDSQKVGYLFGALYIISPVLWMLPPMVYRVYDPSLSLLDAEGAYLMM CKEVLPNGLLGLMIGGMIFATTSALNSKLNIASGVITNDIFKNLRPKSSDRTLMYVARIS 
TILFGVFSILIAMLIPKMGGVVNVVISLAALTGVPLYLPIIWTLFSKRQTATSNITTTLL SLAINGLFKFITPLMGFSLNRAEEMILGVTFPILCLAIFEIYYKAKNKESEKYQSYLIWN KENNARKLKEIEENESTEKAGNNFSKRVIGYGIFASGAGIAILGGMADNGQKLIITTGVV FALLGLLMTRKSKPTISKD >gi|225935348|gb|ACGA01000044.1| GENE 148 217159 - 218196 695 345 aa, chain + ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 1 339 1 332 333 190 34.0 3e-48 MSKPRVTIKDLAKELKISTSTVSRALCDKWDVNPETRKAVLELAEKWNYKPNPISLSLKQ SQSMFIGVIVPEFVNSFFSEVIMGIQSVLNPEGYHVLIMQSNESHENELRNMKALEAQMV DGFLISVTQESENTSYFSDLIWNNFPVVFFNRICAELSAPHVVIDDYKWAFAAVEHLIQQ GCRRIVHLAGSEDLLIAQKRKNGYMDALRKYGLLVEESLIIDCGIMMEKGIMAAHQILEM EKMPDGIFAVNDPVAIGAMKTLQKNKIRIPEDMAVVGFSESKEAFIVEPNLTSVEQPTFE MGRNAAQLLLEQIKHNVNNEDKMLSKSIILDAKLNIRESSLRRTS >gi|225935348|gb|ACGA01000044.1| GENE 149 218492 - 221686 2474 1064 aa, chain + ## HITS:1 COG:no KEGG:Phep_3596 NR:ns ## KEGG: Phep_3596 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 50 1064 25 1059 1059 1011 51.0 0 MKNAVISKAKWYIRCCSLVYLSISFSSVVHSAEDVSKMNSVAIAQQAKTRTVVGSVIDFE TGEPIIGASVAVAGKSVGTISDLDGNFSIRVDGDNTKLEISFIGYEKQAVMVEDGKKLTI RLRAVSELLDEVVITAYGSGLRKDLTGAIAKANVQDMRKAPVFNFEESLAGRVAGVQVTS SDGQPGSDLQIVIRGNNSVTQDNSPLYVVDGFPLEMAVGNMLNPEEIESIEILKDASATA IYGARGANGVILVTTKKGKVSSPTVTYSGWVGVQQIIKKQEMLDPYEFVRYQLEADYNVY SKRYLAGERTLEDYRNIEGINWQDKVYRDALVHSHNIAVRGGTEKTRYSVSGSLVDQQGI MLNSGYKKYQGRFVLDQTITKKIKVGINANYTYSKKYGTIVSESQTSPTASLMYALWGYR PVAGVNDGDLLNDLYDETTDPTTDLRINPLMAAENEYNPLFTYNFIGNAYFEYKILKNLT LKITGGYNKIHQRKEVFFNSNSRGGHRHTNNKVNGWITNTERTSLLNENTLTYDVPLKKG HRLKVLGGFTVQDNSTFIDEIRAINVPNEALGIAGLDEGELTSATISKTANGLVSYLGRA DYNYKSKYLLTVSFRADGSSKFPKDNRWAYFPSASAAWGFGEEKFVKNVKWISSGKLRAG IGTTGNNRVTDYAALTALQITADSGYSTGNTPGKGVVPKTLGNPKLKWETTVQTNIGLDM SFWDNRISLTADYYYKKTKDLLLNATLAPSMGFLSAYRNVGSVSNSGLELTIDTKNIQTK EFSWTSSFNISFNRNKVLSLNDDEPSLASRVNWGNFNNAYPYIAIPGHPIAMFYGHIFDG 
VYQYTDFDKVGESYILKDGVPNNGNAREKIQPGDIKFKDINRDGVVNDYDLTIIGNPNPK HIGGFGNNFQYKNFDLNVFFQWSYGGDVLNANRIEFEGGDPNARTSLNMFASFANRWTPE NQTNELYRIGGQGPAVYSSRTIEDGSFLRLKTVSLGYRLPNAWLKKINIKSLRVYASAQN LITWTRYSGPDPEVSTRPTALTPSFDWSPYPRPRTLTLGVDISF >gi|225935348|gb|ACGA01000044.1| GENE 150 221710 - 223317 1348 535 aa, chain + ## HITS:1 COG:no KEGG:Phep_3595 NR:ns ## KEGG: Phep_3595 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 9 535 14 527 527 449 46.0 1e-124 MKHNILLWMTVCLLGTSCSNILDKEPDFVSPDYYYNTESELLQALNGVYNRLIDTNGRMY SKGLFSLFVLSDESFYTNNFNNTNIRAGVMDAADLDVGRFWEVLYEGVNRANLLLYSVEG KELDTDMMKAAKGEALFLRGYYYYLLTSFFGEVPLKLAPTMSANDNYLAKSPLTDIYKQI VKDMQEAEKLVLDIDALGYNERISKTGVQAILARVFLKMAGEPLKDETRYADALEYANKV IASTKHELNPDYKQIFINHSQDINESKECIWEIGMYGNKIGTVDLAGSVGVENGILCRDE SIGYSGGPMKASKRLYDSYGEGDLRKDWNVAPYYYNVVEETKVNEETQEVEVVQVTKKVM FSATQIYNRNPGKWRREYEIGQKARLFNSTNFPVVRYSDVLLMKAEAENEVNGPTDEAYD AINQVRRRAYGKPIHTVDATVDLPADLAKTDFLEEVKKERFRELCFEGMRKLDLLRWGEY VATMKAFGTEISTTAPSEFKYASRVGQNITDRNVLFPIPNTEITVNKLMTQNEGW >gi|225935348|gb|ACGA01000044.1| GENE 151 223349 - 225055 1303 568 aa, chain + ## HITS:1 COG:no KEGG:Phep_3594 NR:ns ## KEGG: Phep_3594 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 15 276 22 288 307 180 38.0 1e-43 MKNIIRYCGLLLVGLVSCMEDEAPSVELNVALDKQVYQVGEPVTFKLNGNPDNIVFYSGE VGHNYAYKERYHADGDLLVKFNSWVRYGDIYHNLKFLVSSDFSGIYDKENVEAATWIDLS DKFRFSVGDDQTPSGEVNLKEYVGAEEDAKLFVAFRYEDEQKARQNNWIIRSITLDCVSA EGVRSNLATMSTMGWKVVDFENPAVTWNVASTSQILIDGGANQPKNVDWVISQAFDVRKT TPDTGVALKNISTTMDEYKYVYTKPGIYEVVFETTSDWYNGNDHALTKLTVEVQGKVEEE IPASLSVLPDKVECKVGEPITFSLTGNNASNVVFYSGESGHNFDFRERFYADNDIVVNFS TWVRYDVDQLLKFKISTDFDGVYEKGHVEAATWVDLSDKFAFSTGADKTPSGEVSLKEAA GDDPNARIFVAFHHKDEEEAVEKRNDWIVRTFEMDLISPEGFRSNLAKMSTKDWWTAVDC LNPNRNWNVTLQQLVLIGGTNKPTNDDWVISKPVYIRKGTPDKGVSLNSVTSKDYTYTYN TPGVYKVVFDWYDGSNYSQVKLNIEVKE >gi|225935348|gb|ACGA01000044.1| GENE 152 225074 - 227686 1816 870 aa, chain + ## HITS:1 COG:no KEGG:Phep_3593 
NR:ns ## KEGG: Phep_3593 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 31 870 24 859 859 997 55.0 0 MPKNFINYFSVVVVLSLSIALSAQELSVVRLTNEQLELGWKKGVGGYTLETLKVKGTNGW EIMEIAKYQHNVLYASSKPETAPQALYDNTGKEILFPAPQYRYIIPSWQQNTNAVAMNKA GENIVFYPSAVKRVSDTEICFRYENETMRITEKWCMDSLHQNDIKVDFILHSKKTGYYSL ATPSLVSIDKYNFQWATVPGIFQGNAINTDFVQAYGYGQGIPDIPVVARERTASTLSSLI TDRKGVTIAVTAEPGTGRDPWPKDKKMHAEWQLGLSVMNREGEFSPTLYHPVLGEKNSLM NVGDSLSFSFRYTIQKADWYAVLKHTINDIYRFTDFLRLKQTKYSLTQRLYDMHAYLTND STSKWHNLVYKGVTIGAQDYLGGVYDSEKDAMKNSDYGAMWMLAKLTDDPRLTQKRLPNA LNFKLMQQHAEEDFLCGSSAGQYYLYKSKRFTEEWGPYTEPIATTYYMLMDMGNILLFEP QQKELKQHVKLAADRLLEWMKPNGQWEVAYENKTLKPTFTDITDLRPTFYGLLIAYEILK DKKYLQAAIQGADWYVENAVKKGHFLGVCGDTRFVPDFATAQSAQALLELYNVTKNEKYK EAAISTAKIYTASVYTHPIPTSVVKQVKGIERKDWEISQVGLSFEHGGVAGSANHRGPIL LASHAGMFVRMYRLTKDSLFLNMARAAAIGRDAFVDFKTGVASYYWDSMNNGAGPYPHHA WWQVGWITDYLLSEISLRSNGGITYPGGFITPKVGPHLTYGFTSGMVFGTKADLIMRPGL FKLDNPYIEYMAALNEKEKTVFLILLNNDDEKQTSLIEMDTKCLFSGKKIRVKNVASLNN QGHSTLVDGVENWNVTIDAYGLTVLKIKYK >gi|225935348|gb|ACGA01000044.1| GENE 153 227690 - 231118 2136 1142 aa, chain + ## HITS:1 COG:no KEGG:Phep_3592 NR:ns ## KEGG: Phep_3592 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 11 1140 11 1133 1136 1270 53.0 0 MSINHWIVKNLVCTLFLVVGGLNIYAQKDKYLIEAEAFQFKGKWSTDRSADCMGSAMLRL NGGGSLDEQFDALTVVNIMEEGDYNVWVRSADYDKLPGTRLFRLSVDEKPMKESGKHGKV GFYWENAGCVQLAKKQVLLRLHDTKRNFGRCDAILLVKDHSINPNEMDRKEVGKWRKNPV KIETKGSGFDNVSTPLLLSPDAEVVANIENPNIRVSFVKAGANNQAIACRTEIKVKGTWR RFLSTMEDHKVFLITAEQSPVDNDKFFPSWKNAVSKSSFTVGGKEYTVQSDEDYLNPYVA GVLSEAIPVKAVNKDNNSIEVQYITKNGSSITGYWTLPAKGNHVEVRLSCRPATDGMYSM GLSAFQPVPQPNMSNILLSPMFQYKRVSSHPVMMLSSMMQQPLAIAESATPLGGVMSSFI CGDDSTFPQEWGSVDFSPMGFSVKNEQNVVQPVAFAPVMGMKDSHVKAGQVIERKFVVGI LPAGWNEVIEYISDQVYKVKDYRKQKEVSLTEAMFNILDLMRNEEYGGWDAHLKGFYDIE GDPETAPTVVHAAPLAIIAASVLSEDEEFYLSRSLPTIEYTLSRSGYRWATDLVPSGYNK TLETLRLNPFNSQFNTSYYEGLHRLLGGLNPWLVSIALPGDTLRKTKGYSTQVLSWVQAI 
SAYNLTGDEKWKRFATSTADRYIDLQIYKNSSQPVGLMSFYNTNVYAPWWDLIDLYELTK EEKYLKAAQYGASNTIAGIRSFPTVTEDLQTIHPGNRFDGNTNLWWKGKEKYRLGFPRVA NDAQEKQVPEWLVSPVGLGFEQPSTYFLRTKGKQVRPVFMSNWAPHLLRLYQYTRKPIYE TYARNGVIGRFTNYPGYYATGYTDITLSPDFPYKGPDVSSIYYHHIPPHLAFTWDYLVSE AIERSNGNINFPFCKQEGFVWFTNRIYGNEKGKIFGDSKAKLWIKKGLIQIDNPTVNYLT AVSDKYFWVILSGESKQEESLTIRLDKVAELVGQGDIVQYTEKGKTAKVNRVGNAIKVSV PGKGFRALAFPLAETRKENRYPVLKNGIKVVDMGEPFGKVFLFRIRSPFGWDSVYGFAET APAKEKEMSVSVVCNGQSCTVSSYPFEWSFYKLGMDEKAIVKVTVSSKGQVDKTEEIILN SK >gi|225935348|gb|ACGA01000044.1| GENE 154 231121 - 233166 1278 681 aa, chain + ## HITS:1 COG:AGpA668 KEGG:ns NR:ns ## COG: AGpA668 COG2755 # Protein_GI_number: 16119683 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 509 674 34 200 217 87 28.0 1e-16 MIQIRQLGTIIAGMVLMTVFPCVYGQSRADLDKIAASQAGASPLVYTTADKEIPLIQLGN YYDEKECTVRKGLPNFYSKIKKEQEITVAFIGGSITQGDYCYRLQTTRYMENTFSNTCFK WINAGVSGTGTDLGAFRIREQVLQYKPDLIFIEFAVNGGYPDGMEGMIRKIIKENPHTDI CLIYTIYTNQATVYQKGDVPQVIKRLEDIATYYQLPSIHLGMEAAALEKAGKLLWKGTKE VAVGKILFSNDGVHPITDGGNLYASAIARGLEKIRKENSASQVHMLPEPLFGSEWEEAEM YIPSQIASFDNSWKEINTSVTPSLKKFSGWFDTVMTSSKEGSSFSFGFEGDMIGLFDIGG PEVGQVEVLIDGKFVRLKEISTKGFHLYEANDRIGNYTLNRFNSWCNNRYRGQYDVIKLK KGIHQVTIRVSSEKADKKKILGNKQWEDITAHPEKYDQSTIYLGRILLRGKPIPCERIKG VPKLPQQLKWEQKMKRYEKADSINPPAKDLILFVGSSTMENWKTLADDFPGKPVLNRGVS GTKTIDLINYKDRLISPYHPKQIFVYEGDNDIGYQWTPDEILEQIKRLFFILRKEKPEAE IIFISIKPSVRRLKDKERIEQTNALIKEFVEQQMNTAYADVYNPMFTPKGELFPEHYRED GLHLTAEGYAVWKEVISKYIK >gi|225935348|gb|ACGA01000044.1| GENE 155 233183 - 235678 1903 831 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 30 823 2 817 832 400 33.0 1e-111 MMKHLMIAGILSCFSMGGFAQIYKDKTAPVEDRTNDLLHRMTLEEKLDYIGGYKGFYIRG 
IERLGVPEIKLTDGPVGTHKDGKSTAYPAGVLTASTWNRELVYELGKQLGRDSKARGVHI LLGPGVNIVRSPLCGRNFEYFTEDPYLNSQVAIGYVKGLQDQKVVATLKHYAANNQEWDR NNVSSDIDERVLHEIYLPVYRAAIQEAGAGAVMDSYNPVNGVHATQNDYLNNQVLRKQWG FDGIVMSDWSATYDAVEAANGGLDLEMPRAKWMNKENLMPAIKAGKVKEATIDEKVKRIL RIMFRFGFFDNEQLDSSIPYNNPEAAKVALELAREGIVLLKNENDLLPLNPSKVKSVAVI GPNANSYISGGGSSYTFPFHSVSVLDGLKNVGQDLQISYAPGVPTLTETVVNAIFYTEAG SRIKGLKAEYFDNIRLKGAPARTVIDTIVNIGNGWHIAAENKGIPYDHCSMRWSGVVRPE KTANYRFIVRGFDGFRLKIGTEMLINEWRDQGITTRETVLALEAGKEYPVVLEYFANVHP VDISFGWREDRLLFDEAVEIARKSDVAIVNIGFNESSERESNDRPFELPEYQDSLVQCIT AANPNTVVLLNAGGNVDMSKWIDKVPSLLHLWYAGQEGGTAVAEILFGKVNPSGKLPISF EKKWEDNPAYPYYYDVDGDKRVEYKEGLFMGYRHYDVSETKPQFAFGYGMSYTTFKYSDL RIDREKGRDLVVKIRFTVKNTGKHDGAETAQVYIRPINPKVERPYKELKGFAKHFIRKNT VQEIEITLDKDAFSYYKTELKDFDYDVGSYEVLVGSASDDIRLRKVIQIEK >gi|225935348|gb|ACGA01000044.1| GENE 156 235722 - 238004 1373 760 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3542 NR:ns ## KEGG: Cpin_3542 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 759 1 758 758 982 58.0 0 MVLDKFNSVRMFNTDVLQADFVIVGGGISGVCAAITAARQGLKVVLIQDRPVLGGNGSSE IRLWMLGATSHMGNNNRWAREGGVIDEIMVENTFRNSEGNPIIFDSVMLDKVVSEPNITL LLNTCVYEVKKSAGHKIESVRGFCSQNSTMYEITAPLFCDASGDGVVGFLSGAPYRMGAE SREEFGEKFAPAEDYGELLGHSLYFYTKDTGKPVKYVAPSYAMDVTKTVPRFRSFNAKEH GCKLWWVEYGGDLDTVHDTEQIKWELWKVIYGAWDYIKNSGKYPEAETMTLEWVGCIPGK RESRRFEGDYMLIQQDVIEQRHHEDAVSYGGWSIDLHPAAGVFGEESACNQWHAKGVYQI PYRCLYSRGIENLFLAGRIISVSHVAFGSTRVMATSAHSAQAVAMAAAMCLKENISPREV YSLGKVSELQKKLSRMGQYIPDMIIRDEENLVTKATLTASSEYHFKGFPADGEMQVLDES VAQMIPLQKGDVLGKVQVDLCASEETALEVELRISSKAFNHTPDVVLETKILALHQGKQQ LELVFGTVMPEEQYVFLVFKKNPLVQLQYTQERITGILSVFNTVNEAVSNFGKQTPPEDI GIEEFEFWCPKRRPEGYNIAIRTEKDCYAFPVSNINNGIDRPVQSPNAWVAELKDPNPTL TIKWDAAISVGKLSLFFDTDYDQPMETVQMTHPEYKMPFCVEKYRIYDGTNQLIYEKNDN HQSINEIIFPEKLVTDEIRIVLEHPSGLIPAALFAVKCYE >gi|225935348|gb|ACGA01000044.1| GENE 157 238136 - 239161 804 341 aa, chain + ## HITS:1 COG:no KEGG:Phep_3597 NR:ns ## KEGG: Phep_3597 # Name: not_defined # Def: regulatory protein 
GntR HTH # Organism: P.heparinus # Pathway: not_defined # 11 341 11 341 341 421 70.0 1e-116 MQNENLDITEQKKPASRMKYLQLVDYINSLIEDGTLQIGDQIPSLNQLQTQLRMSKETLL KGLNYLLEMGVIEAIYRKGYFVKKVSINHSYHVFLLLDKMNVMREQIYRSIFKSLKDVGD IDIYFHHHNYKVFEKLIKENLGNYTHYIIATFLKEDVTDVLNLIPAEKRIIIDYNQQGLS GNYSCIYQDYGYDIYCSLLQLQDRLKKYEKLILIARPEAIHAQQVIDGFLHYCVKSELPY SIESNVEEKNFRKGNAYITFSRYDTDDVSLIKLAKKKGYELGKEIGLISYNDTAVKEILE GGITVISTDFEMMGKAAASVILNKNIIYKRNPTKVIIRNSL >gi|225935348|gb|ACGA01000044.1| GENE 158 239477 - 240466 866 329 aa, chain + ## HITS:1 COG:SMc02846 KEGG:ns NR:ns ## COG: SMc02846 COG0524 # Protein_GI_number: 15963924 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Sinorhizobium meliloti # 4 308 6 313 330 182 35.0 1e-45 MDKIIGLGNALVDVLATLKDDTLLDEMGLPKGSMQLIDDAKLQQINTKFSQMKTHLATGG SAGNAILGLACLGAGTGFIGKVGNDNYGEFFRENLQKNKIEDKLLTSDRLPSGVASTFIS PDGERTFGTYLGAAASLRAEELTLDMFKGYAYLFIEGYLVQDHEMILHAIELAKEAGLQI CLDMASYNIVANDLEFFTLLINKYVDIVFANEEEAKAFTGKEPEEALRVIAKKCSIAIVK VGANGSYIRKGTEEIKVSAISVQKVVDTTGAGDYFASGFLYGLTCGYSLDKCAKIGSILS GNVIQVIGTTIPQERWDEIKLNINRILAE >gi|225935348|gb|ACGA01000044.1| GENE 159 240512 - 242200 1838 562 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 39 409 36 398 408 234 38.0 3e-61 MKKLLNRQIAIVAVAVIATVAFFSFKSGDDRNFQIAKNLDIFNAIVKELDMFYVDTIDPN KTIREGIDNMLYTLDPYTEYFPEEDQSELEQMIKGSFGGMGSYIAYNTKLKRSMISEPFE GTPAAKAGLKAGDILMEIDGQDLAGKNNAEVSQMLRGQAGTSFKLKIERPNEKGGRTPME FTIVRESIQNPAIPYTAVLDNKVGYISLSTFSGNPSKEFKKAFLDLKKQGATSLVIDLRS NGGGLLDEAVEIANYFLPRGKVIVTTKGKIKQASNTYKTLREPLDLDIPIAVLVNSGTAS ASEILSGSLQDLDRAVIVGNRTFGKGLVQVPRSLPYGGTMKVTTSKYYIPSGRCVQAIDY KHRNEDGSVGTIPDSLTKVFHTAAGREVRDGGGVMPDIVIKQEKLPNILFYLVRDNLIFD YATQYCLKHPTIVAPEEFEVTDADYNDFKALVKKADFKYDQQSEKILKTLKEAAEFEGYM DDASEEFKALEKKLNHDLDRDLDYFSSDIKKMIATEIIKRYYYQRGNIIQQLKDDDGLKE 
AMKILNDPVKYKEMLSAPATKE >gi|225935348|gb|ACGA01000044.1| GENE 160 242342 - 243541 921 399 aa, chain - ## HITS:1 COG:VCA0709_1 KEGG:ns NR:ns ## COG: VCA0709_1 COG0642 # Protein_GI_number: 15601465 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 171 396 495 727 738 150 39.0 3e-36 MRNYENKTKEDLLEIIEQLEEKIDFLSSHSSDSSSPSKERFRDKYSTRILDALPDMLTVF DHDANIIELASSPATNHVEGISPNNITTTNVKDILPKEAYESVRKNMDKVILTGESSTAR HDLMLDGVLHHYENRIFPLDKEYLLCMCRDISQQWEAEQTNAQQQKELKAARIKAEESDR LKSAFLANMSHEIRTPLNAIVGFSKLITYATSAEEKNQYSEIIERNSEMLLNLFNDILDL ASLEADSLKFNIRPIKLIDICLQLEQQFCHKTQNGVKLILDDVDADMYTSGDWNRIIQII SNLLSNATKFTPKGEIHFGYREKEDFVEFYVKDSGIGIPAARIATIFRRFGKVNDFVQGT GLGLTLCRMLVEKMGGRIWLRSQEGQGSRFYFTLPLIRQ >gi|225935348|gb|ACGA01000044.1| GENE 161 243725 - 245143 1657 472 aa, chain + ## HITS:1 COG:XF1037 KEGG:ns NR:ns ## COG: XF1037 COG0499 # Protein_GI_number: 15837639 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Xylella fastidiosa 9a5c # 33 472 1 446 446 650 68.0 0 MSTELFSTLPYKVADITLADFGRKEIDLAEKEMPGLMALREKYGESKPLKGARIMGSLHM TIQTAVLIETLVALGAEVRWCSCNIYSTQDHAAAAIAAAGVPVFAWKGETLADYWWCTLQ ALSFDGGKGPNVIVDDGGDATMMIHVGYDAENNAAVLDKEVHAEDEIELNAILKKVLAED STRWHRVAEEVRGVSEETTTGVHRLYQMQEEGKLLFPAFNVNDSVTKSKFDNLYGCRESL ADGIKRATDVMIAGKVVVVCGYGDVGKGCSHSMRSYGARVLVTEVDPICALQAAMEGFEV VTMEEACLEGNIFVTTTGNIDIIRIDHMAKMKDQAIVCNIGHFDNEIQVDALKHYPGIKC VNIKPQVDRYYFPDGHSIILLADGRLVNLGCATGHPSFVMSNSFTNQTLAQIELFNKKYD INVYRLPKHLDEEVARLHLEKIGVKLTKLTPEQAAYIGVSVDGPYKADHYRY >gi|225935348|gb|ACGA01000044.1| GENE 162 245260 - 247773 2064 837 aa, chain + ## HITS:1 COG:no KEGG:BT_2796 NR:ns ## KEGG: BT_2796 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 827 1 826 843 1435 85.0 0 MKKFLPDLIAILAFIILSFAYFFPADIEGRILFQHDTAAGVGAGQESKEYLERTGERTRW TNSIFGGMPTYQMSPSYDSTTSLKGVEKVYRLFLPDYVVLTFIMMLGFYILLRAFGISAW 
LAGLGGVIWAFSSYFFILIPAGHIWKFVTLAYIPPTIAGVVLAYRKKYLLGGIVTALFIA LQIQSNHIQMSYYFMFVILFFVGAYFEDAYKKKELPHFFKASGVLALAAVVGVCINISNL YHTYEYSKETMRGKSELKQEGAAASQTSSGLDRDYITNWSYGIGETLTLLVPNVKGGGSG STMSQSEVAMAKANPMYSGIYSQLPQYFGEQPWTAGPVYVGAFVMFLFVLGCFIVKGPLK WALLGATIFSILLSWGKNFMGLTDFFIDYVPMYNKFRAVSSILVIAEFTIPLLAIFALKE ILSKPDMLKQEKNCRGVIAALVLTAGVAFILAVAPGAFFSSFITAQEMTALKQALPAEHL ASFVANLTEMREAIIASDAWRSFFIIVIGCLFLFLYQLRKLKASFTLAGIALLCLIDMWS VNKRYLNDEQFVPKSKRSEAFVKTQADEIILQDTTPNYRVLNFIGFPGNTFNENNTAYWH KSVGGYHAAKLRRYQEMIDHHIVPEMKETYQAVATAGGQMDSVDASKFRVLNMLNTKYFI FPAGEQGQAVPVMNPYAYGNAWFVDKVQYVNNANEEIDALNDILPTETAVVDVKFKEQLK GVTEGYKDSLSTIQLTSYEPNRLVYKASTPKDGVVVFSEIYYPGWQVTIDGQPVDIARAD YILRAINMPAGEHTIEMWFDPQSIHVTESIAYAALALLLIGVMLLAWTQRSKIAKKS >gi|225935348|gb|ACGA01000044.1| GENE 163 247928 - 248524 466 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|299144693|ref|ZP_07037761.1| ## NR: gi|299144693|ref|ZP_07037761.1| conserved hypothetical protein [Bacteroides sp. 3_1_23] # 1 198 1 198 198 286 97.0 5e-76 MRKNMKLIGLAVSLTCCAGFIDAQEANEPQERQQQEKHLPREVLNPEKVATQMTEQMNKL LQLTDKQYKKIYKLNLKEQKAFFKAMQNSDDYRPPMGEGPGMRGGRPPMGGGQPPMMGEG GFPGRMGGGPMMSRDTNSADSQKKAAETKEKKIKKILTKEQYEKWQAEQNSARKKASQRR MHKDNHPDKGPDFEQKSF >gi|225935348|gb|ACGA01000044.1| GENE 164 248916 - 250247 1211 443 aa, chain + ## HITS:1 COG:aq_1332 KEGG:ns NR:ns ## COG: aq_1332 COG1538 # Protein_GI_number: 15606535 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Aquifex aeolicus # 4 435 11 414 415 99 20.0 2e-20 MKKLLFIFSFFFLFVLNGKTQEVEILTLEDCLRIGIDNNLSLEGKRKEIQRSKYGVSENR SKLLPQINAIAGYSNNFDPPVSVTDGSSYGVPYNITQTLQHSANAGLEMQMPLFNQTLYT SMSIAKVMEEISRLSYGKAREDVILQISKMYYLGQVTAEQIMLIKANITRLEELRDITQA FFDNGMSMEVDLKRVNINLENLKVQYDNAQAMMKQQLNMLKYIMDYPAEKEIALTPVNTD SITTVALTGLSENIYELQLSQSQVQLAERQKKIITNGYIPSLSLTGSWRYAAYTDKGYHW FHSGPSNQWFRSYGVGLTLRIPIFDGLDKTYKIKKAMIDIENKRLAWEDARKNLQTQYLN 
AVNDLMNNQRNFKKQKDNYLLAEDVYAVTSDRYREGIASMTEVLQDEMQMSEAQNNYISA HYNYRVTNLMLLKLTGQIESLVK >gi|225935348|gb|ACGA01000044.1| GENE 165 250270 - 251361 1059 363 aa, chain + ## HITS:1 COG:BMEII0793 KEGG:ns NR:ns ## COG: BMEII0793 COG1566 # Protein_GI_number: 17989138 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Brucella melitensis # 56 360 14 316 325 160 33.0 4e-39 METMENNLPSATHQEKAKKMKKLRRWQIAISLLGVAIIVWGVIEVICLFLNYSQTETSND AQIEQYVSPINLRASGYIDKIYFTEHQEVHKGDTLLVLDDREYKIRVMEAEAALKDAQAG ATVINATLNTTQTTASVYDASIAEIEVRLAKLEKDRKRYENLVKRNAATPIQLEQIVTDY EATRKKLEATKRQKKAALSGVDEVSYRRMNTEAAIQRATAALEMARLNLSYTVVIAPCDG KLGRRSLEEGQFISAGQTITYILPDTQKWIVANYKETQIENLHIGQEVFVTVDAISDKEF KGKVTSISGATGSKYSLVPTDNSAGNFVKIQQRIPVRIDFTDLSKEDNERLAAGMMVVVK AKL >gi|225935348|gb|ACGA01000044.1| GENE 166 251411 - 253048 1186 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2793 NR:ns ## KEGG: BT_2793 # Name: not_defined # Def: putative MFS transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 545 1 545 545 894 86.0 0 MPSYPKNYPFYSWMPKPLGIIILLFFFLPILTVGGVYSVNSTEMMSGLGIISEHIQFANF VTSIGMAAFAPFLYQLVCVRREKMMCIVGFAFMYIFSYICAKTDSVFLLALCSLLTGFLR MVLMMVNLFTLIWYAGGMEATRNITPGLEPKDTAGWNKLDIERCVSQPAVYLFFMILGQS GTALTAWLSFEYEWKYVYCFMMGILLISILLLFITMPNYKFPGRFPINFRKFGNVTAFCI SLTCLTYVLVYGKVLDWYDDESIRWATAVSILFAGIFLYMDVTRRSPYVLLDAFKLRTIR MGALLYLLLMVINSSAMFVNVFAGVGMHLDNLQNASLGNWCMVGYAIGAVIAMVLGGKGL HFKYLFAMGFFFLSLSAVFMYFEVQTAGVYERLKYAVIIRATGMMILYALTAAYANQRMP FKYLSTWICIMLTVRMVVGPSIGGAIYTNVLQERQQHYITRYAQNVDLLNPDASTSFLGT VQGMKYQGKSETEARNMAAISTKGRIQVQATLSALKEMAGWTIYGGLICMIFVLVVPYPK RKLLT >gi|225935348|gb|ACGA01000044.1| GENE 167 253087 - 253962 606 291 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 122 288 123 291 307 71 26.0 1e-12 MDSGQDRLLQFERDLLSGKNICSSEGIFVNFPPSLKKPFQMKGLGLIICHQGNFQFSLNQ 
KKHFAGAGESLFIPEDGEFQVLQESEDMEVRILIYQIEPIRDIMGNLVVSMYMYSRLTPE EPSCVWSTGEEEEIVKYMSLLDNVLQSEENSFKLYEQKLLLLALTYRICSIYNRKLVNDG REVGGRKNEVFIHLIQLIEKYYMQERGVEFYADKLCLSPKYLSAVSKSICGYTVQELVFK AIIRKSISLLKNTQKDIQEISNAFGFPNASYFGTFFKKQVGVSPQQYRKNL >gi|225935348|gb|ACGA01000044.1| GENE 168 253967 - 254620 703 217 aa, chain - ## HITS:1 COG:MTH1114 KEGG:ns NR:ns ## COG: MTH1114 COG0035 # Protein_GI_number: 15679125 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Methanothermobacter thermautotrophicus # 1 214 8 211 215 137 38.0 2e-32 MKVIDFGQTNSILNQYISEIRNVEVQNDRLRFRRNIERIGEIMAYEMSKEFKYSVKNIQT PLGIAPVSTPDNNLVISTILRAGLPFHQGFLSYFDGAENAFVSAYRKYKDTLKFDIHIEY IASPRIDDKTLIITDPMLATGGSMELSYQAMLTKGHPAEIHVASIIASQKAIDHIKNVFP EDKTTIWCAAIDPELNEHSYIVPGLGDAGDLAYGEKE >gi|225935348|gb|ACGA01000044.1| GENE 169 254901 - 256508 1750 535 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 2 535 10 541 542 808 73.0 0 MANLDLSKYGITGVTEILHNPSYDVLFAEETKPSLEGFEKGQVTELGAVNVMTGIYTGRS PKDKFFVKNEASADSVWWTSEDYKNDNKPCTEEAWADLKAKAVKQLSGKRLFVVDTFCGA NEATRMKVRFIMEVAWQAHFVTNMFIRPTAEELANYGEPDFVCFNASKAKVDNYKELGLN SETATVFNLKTKEQVILNTWYGGEMKKGMFSIMNYMNPLRGIASMHCSANTDMEGTSSAI FFGLSGTGKTTLSTDPKRKLIGDDEHGWDNEGVFNYEGGCYAKVINLDKESEPDIYNAIK RDALLENVTVDANGKIDFTDKSVTENTRVSYPIYHIENIVKPVSKGPHAKQVIFLSADAF GVLPPVSILNPAQAQYYFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTKYAEEL VKKMEMTGAKAYLVNTGWNGSGKRISIKDTRGIIDAILDGSIDKAPTKVIPFFDFVVPTE LPGVDPKILDPRDTYECACQWEEKAKDLAGRFIKNFAKFTGNEAGKALVAAGPKL >gi|225935348|gb|ACGA01000044.1| GENE 170 256658 - 257968 603 436 aa, chain + ## HITS:1 COG:aq_308 KEGG:ns NR:ns ## COG: aq_308 COG0249 # Protein_GI_number: 15605835 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Aquifex aeolicus # 172 428 504 770 859 90 
28.0 6e-18 MVYLATDKQTYADLSITETANNEQFLFSLFSKTETKEGKALMLNWIMYPLSDLGEIRKRQ EAIVWDALPELLLNEEELDFIEYYLAYRDQIREAHILLSCATVIDRLVRYDSTRYVICRG VKLVVHLLHCLKEWATELPQDAPQLMKESAAMIDNILHGSELEEVLEQTSDEEKRLSNFV IDKFDYLFRCTRLLSLKELLSVIYLLDVCRTAHRVAKEKSFCCMPVMVPTMDFSVEGVVH PFVKDAQPNRWQMSRGNICIFTGSNMAGKSTTLKALTLAVWLAHCGLPVPVKSMICPLYE GIYTSINLPDSLRDGRSHFMAEVLRIKEVMQKAVTGKRCLVVLDEMFRGTNAKDAFEASV AVNELLKSFSHCHFLISTHILEYAKAFEKDSSCCFYYMEAEIIDDAFVCPHRLLSGISEA RVGYWVVKKILSDGLM >gi|225935348|gb|ACGA01000044.1| GENE 171 258086 - 260575 2140 829 aa, chain + ## HITS:1 COG:no KEGG:BT_4682 NR:ns ## KEGG: BT_4682 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 828 1 810 812 1437 82.0 0 MKNFYFLLISSLFLTSGLYAGETDYTQGLSIWFDTPNTLQGYAAWYGGRPDLWKDENKPV TAGSGHNLDASWESQSLPIGNGSLGANIMGSVEAERITFNEKTLWRGGPNTAKGADYYWN VNKQSAHLLDEIRKAFTEGDQKKAEMLTRQNFNSEVSYEADGENPFRFGSFTTMGEFYVE TGLNMIGMSDYKRILSLDSAMAVVQFKKDRVAYQRNFFISYPANVMVVRFSADQSGKQNL VFSYAPNPLSTGSMVSDGNKGLVYTASLDNNGMKYVVRIQAETKGGTLSNADGKLTVKDA DEVVFYITADTDYKINFDPDFKDPKTYIGVNPEETTKQWMNNAVAQGYTALFNQHYNDYA TLFNRVRLNLNPAVKGVNLPTSQRLKSYRKGQPDYYLEELYYQFGRYLLIASSRPGNMPA NLQGIWHNNVDGPWRVDYHNNINIQMNYWPACSTNLNECVLPLIDFIRTLVKPGEKTAQA YFGARGWTASISGNIFGFTTPLESQDMSWNFNPMAGPWLATHIWEYYDYTRDLKFLKETG YELIKSSADFAVDYLWHKPDGTYTAAPSTSPEHGPIDQGATFVHAVVREILMDAIEASKV LGVDKKERKQWEHVLANLVPYQIGRYGQLMEWSVDIDDPKDEHRHVNHLFGLHPGHTVSP VTTPELAKAAKVVLVHRGDGATGWSMGWKLNQWARLQDGNHAYTLFGNLLKNGTMDNLWD THPPFQIDGNFGGTAGITEMLLQSHMGFIQLLPALPDAWKDGSISGICAKGNFEVDVIWE NHQLKEAVVRSNAGGDCVIKYADQTISFKTVKGRSYQIGYDAAKGLIKN >gi|225935348|gb|ACGA01000044.1| GENE 172 260607 - 261164 617 185 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 6 179 7 180 186 124 41.0 7e-29 MVLEYLKLFVTFAKIGMFTIGGGYAMIPLIEREIVNKQWMNKEEFMEMFALTQSLPGVFA 
VNISIFVGYKLYKVGGSLVCALATILPSFVIMMLIAMFFAQFQDNEVMIRIFNGIRPAVV ALILFPCISAVRALKLKYLQLVAPAIATVLIWQFGLSPIYIVLAGIIGGLVYTLWLKEKI ANKQV >gi|225935348|gb|ACGA01000044.1| GENE 173 261161 - 261685 441 174 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 174 1 173 176 123 42.0 2e-28 MIYWQLLWVYLKIGMFGFGGGYAMLSLIQHEIVDLHHWLTPQQFTDVVAISQMTPGPIGI NSATYVGYAVTQSVWGAVLATVAVCLPSFILVLLISYFFAKCKDNKYIKAAMSGLLPMSV ALIASAALLMMNRENFIDYKSIGIFAAAFLVTWKWNLHPILLICLAGVVGLLLY >gi|225935348|gb|ACGA01000044.1| GENE 174 261767 - 263566 1998 599 aa, chain - ## HITS:1 COG:DR1198 KEGG:ns NR:ns ## COG: DR1198 COG1217 # Protein_GI_number: 15806217 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Deinococcus radiodurans # 5 594 4 593 593 659 55.0 0 MQNIRNIAIIAHVDHGKTTLVDKMLLAGNLFRSNQNSGELILDNNDLERERGITILSKNV SINYNGTKINIIDTPGHSDFGGEVERVLNMADGCILLVDAFEGPMPQTRFVLQKALEIGL KPIVVVNKVDKPNCRPEEVYEMVFDLMFSLNATEDQLDFPVIYGSAKNNWMSTDWKQQTD SITPLLDCIIENIPAPKQLEGTPQMLITSLDYSSYTGRIAVGRVHRGTLKEGMNITLAKR NGDMFKSKIKELHVFEGLGRVKTNEVSSGDICALVGIDGFEIGDTVCDFENPEALPPIAI DEPTMSMLFAINDSPFFGKDGKFVTSRHIHDRLMKELDKNLALRVRKSEEDGKWIVSGRG VLHLSVLIETMRREGYELQVGQPQVIFKEIDGVKCEPIEELTINVPEEYSSKIIDMVTRR KGEMVKMENTGERINLEFDMPSRGIIGLRTNVLTASAGEAIMAHRFKEYQPHKGEIERRT NGSMIAMESGTAFAYAIDKLQDRGKFFIFPQDDVYAGQVVGEHSHDNDLVINVTKSKKLT NMRASGSDDKVRLIPPIQFSLEEALEYIKEDEYVEVTPKAMRMRKVILDEIERKRANKN >gi|225935348|gb|ACGA01000044.1| GENE 175 263722 - 263991 446 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 [Bacteroides ovatus ATCC 8483] # 1 89 1 89 89 176 100 1e-42 MYLDAAKKQEIFGKYGKSNTDTGSAEAQVALFSYRISHLTEHMKLNRKDYSTERALTMLV GKRRRLLDYLKARDIERYRAIVKELGLRK >gi|225935348|gb|ACGA01000044.1| GENE 176 264178 - 264753 570 191 aa, chain + ## 
HITS:1 COG:MTH659 KEGG:ns NR:ns ## COG: MTH659 COG1396 # Protein_GI_number: 15678686 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 7 190 6 189 190 193 56.0 2e-49 MDTSKIVGEKIKALREDKSISIEELAQRSGLAIEQVERIENNIDIPSLAPLIKIARVLGV RLGTFLDDQDEVGPVVCRKKEAKDAISFSNNAIHSRKHMEYHSLSKSKADRHMEPFIIDV MPTEDTDFVLSSHEGEEFIMVMEGIMEISYGKNTYLLEEGDSIYYDSIVPHHVHAYEGQA AKILAVVYTPI >gi|225935348|gb|ACGA01000044.1| GENE 177 264789 - 266438 1501 549 aa, chain + ## HITS:1 COG:MTH657 KEGG:ns NR:ns ## COG: MTH657 COG0318 # Protein_GI_number: 15678684 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Methanothermobacter thermautotrophicus # 1 544 1 545 548 725 61.0 0 MQLFDRTLGQWLEHWAEETPDKEYIVYSDRNLRFTWSQLNRRVDDMAKGLIAIGVERGTH VGIWAANVPDWLTLLYACAKIGAVYVTVNTNYKQSELEYLCQNSDMHTLCIVNGEKDSDF VQMTYTMLPELKTCERGHLKSERFPYMRNVVYVGQEKHRGMYNTAEILLLGNNVEDDCLS ELKSKVDCHDVVNMQYTSGTTGFPKGVMLTHYNIANNGFLTGEHMKFTADDKLCCCVPLF HCFGVVLATMNCLTHGCTQVMVERFDPLIVLASIHKERCTALYGVPTMFIAELHHPMFDL FDMSCLRTGIMAGSLCPVELMKQVEEKMYMKVTSVYGLTETAPGMTASRIDDPFDVRCNT VGRDFEFTEVKVIDPETGEECPVGVQGEMCNRGYNTMKGYYKNPEATAEVLDENNFLHSG DLGIKDEDGNYRITGRIKDMIIRGGENIYPREIEEFLYKLDGVKDVQVAGIPSKKYGEAV GAFIILQEGVKMQEADVRDFCRNKISRYKIPKYIFFVNEFPMTGSGKIQKFRLKDLGLQL CKEQGIEII >gi|225935348|gb|ACGA01000044.1| GENE 178 266593 - 267675 782 360 aa, chain - ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 9 340 5 337 350 251 39.0 1e-66 MTNQDNYCVIMGGGIGSRFWPFSRKTLPKQFLDFFGTGRSLLQQTFDRFQKVIPTENIFI VTNAMYADLVKEQLPEVSENQILLEPARRNTAPCIAWASYHIRALNPNANIVVAPSDHLI LKEDEFLAAIEKGLDFVSRSEKLLTLGIKPNRPETGYGYIQIDEPAGGNFYKVKTFTEKP 
ELELAKVFVESGEFYWNSGLFMWNVNTIIKASEDLLPELASKLAPGKDIYATDKEKAFIE ENFPACPNVSIDFGIMEKADNVYVSLGDFGWSDLGTWGSLYDLSERDPEGNVTLKCHSLI YNSKDNMVVLPKGKLAVIDGLEGFLIAESDNVLLICRKDEEHAIRKYVNDAQMQLGDDFI >gi|225935348|gb|ACGA01000044.1| GENE 179 268080 - 268850 514 256 aa, chain - ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 230 1 221 246 99 29.0 4e-21 MKVLIIEDENHSAERLQRYIRTLHPDYEISGITKSIVQSIDFLQREQPDLIFSDIRLQDG LSFDIFRQIKIVSPIIFTTAYDQYAIQAFKFNSIDYLLKPIDSDELEAAIQKATERISRS SPATTAPKLEQLLEYLGGAVPTPHYRERFLVSRKDEYITIEVQNVCFIHSQQNITRLYLT DGISATIPYTLDQLEKEMNPASFFRANRQHMIQVRHIKKVSNWFNYKLKVEMNGYPQEEI LISREKAATFKKWLDK >gi|225935348|gb|ACGA01000044.1| GENE 180 268868 - 270610 1105 580 aa, chain - ## HITS:1 COG:VC0694 KEGG:ns NR:ns ## COG: VC0694 COG3275 # Protein_GI_number: 15640713 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Vibrio cholerae # 387 555 355 525 558 88 31.0 4e-17 MIRKRIYLTAAISLLVVSFSAYYLYRVNRSACSRLEYANGLIEVNPRKALADVEQINRTF LSERNYMMCDLVKAGALIGIQEYLVPDSLLNIVSPYFKYKADSLHLTEIYYYRGEIARHS NFLLEAVEYFTLCTQYNNDVYDLRELNFYLNNFKGQVYHTKHMMKEEKEAKLAALSLARE LNNPSLIAEAYSELTHYYAQTNDGDESIPLLKTASMKGYSGSLQARLLFLLSEKYADKHM PDSARIYALQIPHLYQDSVDYLLGKIHSDLQQIDSARFYLNRSSQSNNPFICLKAYRQLT DLNIRTGSLNGVSDCLERLTHYQAQIDSIAYNEDLAQIENVDKLRKTIRDSEVAEAEYYR YWVFYCWIIVIAFTVILILMSISIVLQKKKKNLQLKRQQSRLDALKMQIDPHFIFNNLSI LLDLVETGDATAPIYIKCLSKVYRHIVANVDKNLSSVADELSSLEAYIFLLKIRFEDAIQ VEVQVQDAIRERQIPPIVLQMLLENAIKHNQVSEEAPLFIRVYSDGDKLVVENNCKPLTP NSSTHHIGLRNIKERYQLTGAPAPEIYQDEHIYRVTLSTL >gi|225935348|gb|ACGA01000044.1| GENE 181 270898 - 272013 1036 371 aa, chain + ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: 
Campylobacter jejuni # 11 369 9 363 367 142 29.0 2e-33 MKMETNKMWQIFMLTTLLTLAGCKNGGGEYSQSVSEYATMQVAPVNKTLSTQYPAAIQGK QDIAIFPQVTGTITRLCVNEGQAVKRGQVLFIIDQVPYQAALATARANMEAADATLATAQ LTYDSKKELFAQNVISAYELKTAENNLLMAKAGRAQAKAQELSAANDLSYTEVRSPSDGV IGTLPYRVGTLVSAGMAQPLTTVSDNSDMYVYFSMTENQLLGLIRQYGSKEKALQSMPAI ELVLNDRTTYEEKGTIATISGVVDASTGTVSVRAAFPNKNGLLHSGSSGNVVVPSVYKDG IIIPRAATYEVQDKVFVYKVVDGKTHSTQISVERVDGGQDYIVTNGLAVGDEIVTEGVGL LQDGMPITKKK >gi|225935348|gb|ACGA01000044.1| GENE 182 272036 - 275242 3305 1068 aa, chain + ## HITS:1 COG:all3143 KEGG:ns NR:ns ## COG: all3143 COG0841 # Protein_GI_number: 17230635 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Nostoc sp. PCC 7120 # 1 1036 1 1036 1057 736 39.0 0 MNLRIFIERPVLSAVLSIVIVVVGIIGLFTLPVEQYPDIAPPTVQVYTAYDGASAETVQK SVIAPLEEAINGVEDMTYMTSSASNAGSAEITIYFKQGTNPDMAAVNVQNRVSKAAGQLP AEVTRVGVSVNKRQNGMLQIFTLHSPDSSLDENFLSNYININLKPAILRISGVGDVQVMG GVYSMRVWLKPDVMAQYKLIPSDVTAALASQNIEAATGSIGENSKEAKAYTMKYRGRLMT PEEFGEIVIRSTKNGEVLRLKEIANISLGQESYSYQGSFNGKPGVSCMVYQTAGSNATEI NREIDAFLADASKRLPKGAEITQLMSTNEFLFASIHEVLKTLIEAILLVILVVYVFLQDI RSTLIPLVGIFVSLIGTFAFMALVGFSINLITLFALVLVIGTVVDDAIIVVEAVQSKFDS GYRSPYMASVDAMKGLSGAIVTTSLVFMAVFIPVSFMSGTSGTFYTQFGLTMAVAVGIST INALTLSPALCALILRPYTNEDGTQKQNFAARFRRAFNTAFESLSERYKKGVLFFIRRKW LTGAFVAGTVVILALLMNSTKTGLVPDEDQGLVFVAVSTAPGNSLYATDGIMNRVEERIQ QIPQVKQVLKVTGWSMGGAGNSSGIFFVRLTPWDERPEEEDHVQAVIGQVYARTGDIKDA TVFAMAPGMIPGYDMGNALDIQMQDKAGGDLTEFFGITRQFIDSLNQRPELAMAFSSFEI NYPQWQVDVDAAKCLRAGIMPDEVLSTLGGYYGGSYVSNFNRFSRVYRVMVQADPNYRLD EASLNRHFVRLSNGEMAPLSQFVTLTKVYGAESLNRFNMYNSIALNAMPAEGYSSGEAIR AVQETATRVLPANYDFDFGGLAREESNQSNTTIVIFAICLLMVYLLLSALYESFFVPFAV LLSVPAGLMGSFLFAKLLGLENNIYLQTGLIMLIGLLAKTAILLTEYATERRRAGMSLTS AALIAAKDRLRPILMTALTMIFGMVPLMLASGVGANGNSSLGTGVVGGMLVGTLALLFIV PAMFIVFQAIQEKVRPIQFDATVADWQIKEEVEKADEERKQYLENKKK >gi|225935348|gb|ACGA01000044.1| GENE 183 275239 - 276609 369 456 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 
[Campylobacter concisus 13826] # 3 453 2 455 460 146 25 8e-34 MKRNIITLAITTLMLSSCGIYTKYQPATTVPEDLFGAAETVSDTIHNIAKLNWRQLFVDP HLQSLIEQGLRNNTDLQSAHWRVEEARASLSSARLAYLPSFAFSPQGTVSSFDKGKAGQS YSLPITSSWEIDIFGRLTNANRRAKALYIQSQEYEIAVKTQLIANLANTYYTLLMLDAQL TISEETEVKWKESVRVMQALKNAGQGNEAGLAQTEATYYSICTTVLDLKEQIRQAENSLC LMLGETPQFIRRGTLDGQVLPQDLSVGIPLQLLANRPDVRSSELALAQAFYTTNEARSAF YPSITLSGSAGWTNGVGEVVMNPGKLLLSAVGSLTQPLFNKGQNIARLKIAKAQQEEAKL SFTQTLLNAGAEVNNALKQNQTARDKSDLYQRQISSLQTAVTSTQLLMQHGNSTYLEVLT AQQTLLTAQLTQVSNRFQEIQGIINLYQALGGGREE >gi|225935348|gb|ACGA01000044.1| GENE 184 276769 - 277113 384 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173289|ref|ZP_05759701.1| ## NR: gi|260173289|ref|ZP_05759701.1| hypothetical protein BacD2_15565 [Bacteroides sp. D2] # 1 102 6 107 119 112 100.0 5e-24 MRKQILSLAVALLIGSTVCMAQNHQKGKADREKRIEKMVTDLGLNEKQAKDFKAAMEEMK PAKNKSDEKPSREEMQKKKKEVDAKIKSILTDEQYKKYQDMRKKDNAKKKKKAK >gi|225935348|gb|ACGA01000044.1| GENE 185 277547 - 278050 301 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173290|ref|ZP_05759702.1| ## NR: gi|260173290|ref|ZP_05759702.1| hypothetical protein BacD2_15570 [Bacteroides sp. D2] # 1 167 1 167 167 331 100.0 9e-90 MERDYWLASSLTGTAVGTIFEGRPWVEELMNCVNSEEVRNLFKQYLDGRFDFWNGSISQP GEATWEEKYGLLSLFRGGSHLMYVCVNRSDLADPSLEHRYPKLEKILERGQAYASLSWIS ENKIKVKECIAEGYNGDEDGYGVEPDTWMIGYLNKEGVLQGSFESSE >gi|225935348|gb|ACGA01000044.1| GENE 186 278400 - 279188 467 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173291|ref|ZP_05759703.1| ## NR: gi|260173291|ref|ZP_05759703.1| hypothetical protein BacD2_15575 [Bacteroides sp. 
D2] # 14 262 1 249 249 450 100.0 1e-125 MYSNKYLSINKVRMEGRVMIQNDCNWKGMSWDELKQRLTDDPEYKANVELARRVVNNYEV VVNYYLGPMCTKIVERINKIMGENSYTDYYLFLSYPIVDTDNGPKPEWHRVSLYDAKDCK LQTYTSTIACRYFYKLANKEKMRRNQEDELFEYKDYESLLLCDQVEEEGESITQIRMRKA FAQLSERDKLVLTYLVIAKMPAIEAYPMVEKMIHPIAKDGMTSDQVKLNWTVKQRQDAMS LMKGYALKHLLIKYNEQKKQEK >gi|225935348|gb|ACGA01000044.1| GENE 187 279185 - 280630 1166 481 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173292|ref|ZP_05759704.1| ## NR: gi|260173292|ref|ZP_05759704.1| hypothetical protein BacD2_15580 [Bacteroides sp. D2] # 1 481 1 481 481 853 100.0 0 MKRIDDIIFNKFVCNTLSQSDMVEVEKQLIESKEIPASLHASILNYEMNTEEASEMLGVN EEQSDFINQEKNMVDEDRKEESDDSIEVQDEIITFKNSTNMNINLTKEEALKVQELSTVY NEFENSELNIDENLVNFYLAQRPGTFAEDAHEVIAGIRKGVETFNANLTAALKDGDIDYI SQLKELGQDLTNEQKFELYINFLSALHVLNIQNFSAEKASQIEDFVTIKQGFAPTGEITD EMLDEVIGKIADALNNNTLCMTSIDKMRDLMEVLPDGSEAVKEELCGSENDMHTKLVNAL AIYIAYQNEDITSLKGQNVSPEAIAIAAAAGIEQAHVIEDTRTGKITVEKAIKVLKIIGG VALWTTLMAGVIYVAVNLAMFTMGSFIGLLGTSIFAMIIAGGIAAMVGIGFSNSLFDTVD SIVDGAGTIFDKIVILWRETVWPVIKDRAEAFISWIRSLLSNKTVEQTAEISPITLVTTQ A >gi|225935348|gb|ACGA01000044.1| GENE 188 280641 - 281333 376 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173293|ref|ZP_05759705.1| ## NR: gi|260173293|ref|ZP_05759705.1| hypothetical protein BacD2_15585 [Bacteroides sp. D2] # 1 230 1 230 230 413 100.0 1e-114 MGFISSAIEGVKNFISSIMNSISERVNVWNERRQQTRERSSYIRQIEDFESTYCPEVPDQ ELALRVRQYLKEKFPNGIEERLFQMSVDELPDFFMEIEKDAEEIMDVTTEEVVIIYPKTE EEIMKYGCGFYNIENNLLCINGAFLYSGNIDLIKEQIFTIFHELKHARQYAAVERKKDYG YSEELMQTWEQNMKNYITYRENDEAYRKQPLEMDTFGFEELLKEDYYTND >gi|225935348|gb|ACGA01000044.1| GENE 189 281352 - 281720 358 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173294|ref|ZP_05759706.1| ## NR: gi|260173294|ref|ZP_05759706.1| hypothetical protein BacD2_15590 [Bacteroides sp. 
D2] # 1 122 1 122 122 216 100.0 3e-55 MKIIKYISLFKYLIEDEGLNPMSAYRNIQRIRMLPPEFKQAVFDVLSSYVPDMEIEGVSF RELTENDGMKPIRAILMLDWLRREPAVAIRYMAAERYRSVMVGLPKQKEVEQITKEDIII ED >gi|225935348|gb|ACGA01000044.1| GENE 190 281726 - 283711 893 661 aa, chain + ## HITS:1 COG:all4296 KEGG:ns NR:ns ## COG: all4296 COG0464 # Protein_GI_number: 17231788 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Nostoc sp. PCC 7120 # 151 448 117 401 503 177 37.0 5e-44 MDFRKTQIGRMLEQMKLAMLAHIPVIYIPTDQMELIHEILYSDNTIDSLVPRVKYDSEKK AVVKLANKEYGVKDDEKRTFSSIKDNYLIFVNSIDSDVVRCPSILLTYTTKWDKVETGIR NFISDYMGVKRSKDNNPNPYHVANISRSLCIVVTPTEQVIPENIAPYVMTVRVPALLDEE IEAVISSEFNAESMDISVLRRSEVLYSQMIVSLRGFSVLRIRQLLRQMIASQSIDFNHVN ADAVLAAIRVSKKQMLENCNGLKWEETNATNAAGLDSISKWLEERIDIFSDPERAAQCHT DIPNGLLISGIPGSGKSLMAKTAAYKLGLPLISLDMGALLNSLMGESEHNMINALRMSEN MAPCVLWIDEIEKAFSGSSQNSSSSDGGVGRRMFGKFLTWMQEKTAACFVFATSNDITCL PPELFRSERFDRKYFTFMPKAEECAQIFASNIKAQNKSYREELEAMPVSKRAKMAQQLFA TNLEKDSFWLDIINQECTANMDACHLKLKERNDVNESEKEDDVYVWSSSSRPKNKLMTGA DISALIKEAKFLIRPLGNSDTIQTVIYGEYQMKNAVEQMMRSRTFKPYGETNLKDIVRCF LKLHENEFVPASGTCILDFERYDEDKCIYRHNPASDRWPNPYDQVLYYTIVGAINQYAKN L >gi|225935348|gb|ACGA01000044.1| GENE 191 283724 - 284605 653 293 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173296|ref|ZP_05759708.1| ## NR: gi|260173296|ref|ZP_05759708.1| hypothetical protein BacD2_15600 [Bacteroides sp. D2] # 1 293 1 293 293 567 100.0 1e-160 MKLNNILIVTLSDFARHFTFPAFWTNRQQFVRDLCPEKVYYLNHEQEQAYAEISKWIQEG EADADSDKTLLCNALEELLGYEIDRKAVEAFFHPTAIDSDIIIVNAPATFDLPKFKKTSN ASFTIKLRKLQIEGNMDQEAEIRIGGEFVAKLKSGEVAYVTEIEGKYIEVLPNHINNDRY DASLVSANGEFYSTLVIHDKKNASKFSWDGVISFALVDDGYLFVDKQDSLIVMSESTPRF LLKSEGKVVYVKSFNDKVLALYEDGNLKSTISMQKISNVISANFNSDGSIESK >gi|225935348|gb|ACGA01000044.1| GENE 192 284629 - 286338 934 569 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173297|ref|ZP_05759709.1| ## NR: gi|260173297|ref|ZP_05759709.1| chaperone protein DnaK [Bacteroides sp. 
D2] # 1 569 1 569 569 1172 100.0 0 MKLIVGIDFGTSTTVVRYKEEGTDVIKPIKDADGVSDIIPSAIFRVDGQNQTLYGCGALN AKTGGMQGELITNFKMGLLDVNPEERRIKEAYIEEFFTYIYKQFWNQTRGIRYDSMDVYV SVPAKWDTDAREVMKRTVKKAGFGENIICENEPTAAAYTMLHQHLNDFQKTKMLTVQKPM HVFMLDMGAGTTDIVIFRLKVDSEGRVATDNVLSYPTRDNRTLCGGREIDSILYNYILES LVKGTGKSKEELEQWGIYSVDKAKSWKDKYLSYRLKDNITIDTPNDFLPFLKMSGVSTQQ ALQMIRFNRQIFENETKEHWLELYKLIESAVAIYKEKYGKEVKGAEDIDLLFLTGGHSQW YCIPNLFNGEGINGYIGKDVNNGQGIVKALNFKKLREEPWRMFGDALPHECVATGLCLQD SNIKITTPTANNVWVRLTVNEQSSDYIQILRVGDLLPVQNKQSLEVTLSRNLVFGDCNFN ITIELYTGENIETANRKILKLRQDENSVLGALIVAIMVLPIFFPVNYKFKINLCVDALED GTLDVSGDCMMDNRENTRKEFSLKDMQEV >gi|225935348|gb|ACGA01000044.1| GENE 193 286335 - 289031 1356 898 aa, chain + ## HITS:1 COG:no KEGG:Shewana3_1825 NR:ns ## KEGG: Shewana3_1825 # Name: not_defined # Def: ATPase-like protein # Organism: Shewanella_ANA3 # Pathway: not_defined # 108 878 77 847 1065 229 25.0 4e-58 MSNFIKRLIGRSQFNVDGNLQKIKEGHKPLFVFPSDLANPMMDYMLERSYLEDISGRSDF NEIPMQLQTDISWLRIDRLPVSPLRIDDYDLLSRWQGVLSSLHAWGQKLIFLLQRHNGQT HLYVGVQGYNGEECANKCKCALTSSMPGIDLHYLGGKEDLKEIIGINNQISGSVCGGAVT GIPSFRANTQYGVLQTLDKLAFGFKDMRGMDANYSMIVIAEPLDDNVISEVIYRYQKLGS DIHSEVTQHVTESTTIQHGEGTSTGVHGGIGMGAGQGTMNVVSSLLKTAMYTATPVSLPI GMGAKALMSAIGLSGNIGFSRSISSNDSVSYGESVAKDYLNKFAQYTEQLTDKHCQRLRS GRDIGFWNAGVYVLADSVDNVNLITGILRSVYSGDYTHIEPIRTHLFHSPNALNTIKNFN LVPLINPAANEFATEEWHILGAPYQYVSTPVNTEELSLYTSLPRKDVPGIRFVKNVARFA NNPGKNLNADDLVKIGNIVDTGVQQNNPYTVSVNSLVRHALIVGSTGCGKTTTCKTLINA VLEKKKPVLIIEPAKDEWVRWAIKQNEHLPESERIHIFEPGLSSFEGTLLSHLMLNPFQP AAIAGAPIDMQTRCEKITALINATLPTGDILPVIMDEALYTYLKEKVEDFEEEEMEQLSS YPLLEGALGVAKKLLTNRGYEQRVTDSFVAALETRFKYLTRGKRGNILNQLKSTSYDTLF NKNCVINLSKIPNVKDKALIMSMILLAQYEYRMSAYSYNKEYRKHAQANELMHLTVIEEA HNVLSKPSAASEGTGNPQQVVADLFSNMLAEIRSLGEGFMIIDQVPTKLIPDVIKNTNYK ICHRMTSIDDCAVIAQALALRDDQRGIIPTLEQGNAIIAGDLDDAASWVKISRPVINL >gi|225935348|gb|ACGA01000044.1| GENE 194 289052 - 289693 310 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173299|ref|ZP_05759711.1| ## NR: gi|260173299|ref|ZP_05759711.1| hypothetical 
protein BacD2_15615 [Bacteroides sp. D2] # 1 213 1 213 213 411 100.0 1e-113 MEQNSRILLYALLGIVVVGGTIVSIIYRSKQKEKSEGKIKSGTQQNNLLKSISNQQADDL NRQGSNIIFSDEEIISQMMKMLEYFNGDMGALFSVVSDPVPQIASVIFDNLDAVINSRGS EMLKSWFSSFTGARKGWDAELYRSKAVKILSLLKQCGIQQSTELKLTWNENAAKHYRQLT KIEVGDVCEVLSPCWIYKNEVFEQGLVRPIING >gi|225935348|gb|ACGA01000044.1| GENE 195 289699 - 290637 776 312 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173300|ref|ZP_05759712.1| ## NR: gi|260173300|ref|ZP_05759712.1| hypothetical protein BacD2_15620 [Bacteroides sp. D2] # 1 312 1 312 312 589 100.0 1e-167 MDTNLINELGTSEIAQTPEMNNEVQEADVRFASLSPAVGYGIYRVSQMPEVREAVNDKVE SCKEAFAEVLENAKETVTEKAAEAKEAIAGFAEKLTDGIKDFFVGAYEDVKEIFVNPKPK EVPFAEFSSIHAKEGIQETREFGLDACSEAAMEIFNPGVIDAWGSMTERERKAIALEYAE RVAQAFELVNYEGVYIEKLEPGTLGYNNGDGTIHLTNDLLSSDTTPFLIMDTITHELRHQ YQNECIRGYHDVPDEVRNEWAVATAIYNYDQPSCYDPWGYIYNPLEIDSNYAGNTVVRNV SSQMFNNVLNNA >gi|225935348|gb|ACGA01000044.1| GENE 196 290630 - 290980 316 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173301|ref|ZP_05759713.1| ## NR: gi|260173301|ref|ZP_05759713.1| hypothetical protein BacD2_15625 [Bacteroides sp. D2] # 1 116 1 116 116 211 100.0 2e-53 MHNMSMINKNELKQRLIDEGYIEEYGLDRTVENLINLENLENKAAYEMLCTWLKTGKIQK FESIEGIDLKFLRDELHMKNPAIILAYGMLLYDPKHNAIALKREKDRRNCMKPAKL >gi|225935348|gb|ACGA01000044.1| GENE 197 291325 - 291690 390 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173302|ref|ZP_05759714.1| ## NR: gi|260173302|ref|ZP_05759714.1| hypothetical protein BacD2_15630 [Bacteroides sp. D2] # 1 121 1 121 121 214 100.0 2e-54 MAGFEYLKKFFQSENFRYEEEEGILSFKIQGVNYFAFTNDSPFLQIVIVCNISNKDKTKI LEICNELNSDKFVTKFILRGERVWCSYEFNPSEHTSSDEFMAIFSILDKTSDEFLEKMSK L >gi|225935348|gb|ACGA01000044.1| GENE 198 291742 - 293598 529 618 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173303|ref|ZP_05759715.1| ## NR: gi|260173303|ref|ZP_05759715.1| hypothetical protein BacD2_15635 [Bacteroides sp. 
D2] # 1 618 1 618 618 1277 100.0 0 MYTSAIVIDFGSSNSGAARIDTYKNGRLVYSTPQFCHSDGYYAKDATWFFIHPKLLERAE INYESLHDDDFRILSRVFMNTLEPNIIWGRDLISANCQKIKQENWVEFKYFKMMIYLDVS YPAKLKSYPISLIVKIFLRILKIECISVESAFKGRTISSSEIQWGVTIPSIWTEDNQRLM SDICQNVFGDHVRVLSEPEGPVVSERIHASNNAQLDHSPGKKSIVVDMGGGTTDICLLAD KEVISESDSHFQLLASCDGIGVGGNMIDKDFWIYFLRFISKGITNGIIYDKLSDDELKNI LLQPYIDDLDHAIEMENAWLAFKHKVVKSFTIPKDYRKWLINNGHSSVADRVTDILIGQT SFDAAELYEQAFAPTFNKISQCVESFLKTHNHLIKDDVENLSLIFAGGLSLNQKLRDLIM CKINAMFSINVSTNIANTPLRASGSIMDGASYLLLWRRFIQREAPFYIYDCAASNLMALQ EAYKEKGVLMKYGELNAISQKDVEERKASNDKVCGFPVAIKGADLLDYRCSFCAVREEQK SIELSFYGSDEIIVHPQDNILAWELGSTLLPNYRNKSYVCIVDFNETNTGNLHYYVTLEE SNELIKEGNISLTNKSKK >gi|225935348|gb|ACGA01000044.1| GENE 199 293607 - 294362 480 251 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173304|ref|ZP_05759716.1| ## NR: gi|260173304|ref|ZP_05759716.1| hypothetical protein BacD2_15640 [Bacteroides sp. D2] # 1 251 5 255 255 442 100.0 1e-123 MNVYLILFVVIFNAVFLVIILLYLINIFEKVLSDNPVVRINRQNHELFDRLSALLKEVAD IKKGYQESISERKEFSELIFSNVEQCQKGLDELTLLLKSHDVSASSSSAVDQIAYNDAVI AFNNINNELYELRQLPEIGMVLMEALVMDKNPTIDFSSLAQDKKELINNLKSKISLFNMN YRSQIVSFLSAKGRDWKDCVRFPLNQNFDGTWDEHLLGDDIMPDYRINRVVQLGFEFPDS NIIGRRKSKIL >gi|225935348|gb|ACGA01000044.1| GENE 200 294386 - 294901 453 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173305|ref|ZP_05759717.1| ## NR: gi|260173305|ref|ZP_05759717.1| hypothetical protein BacD2_15645 [Bacteroides sp. D2] # 1 171 1 171 171 263 100.0 3e-69 MDYVPQRVKDLFIEFFKKCSLFKGKDYKKEILNITVDEEERMDLEELFQATEDFYAEREA LQASGLPPHAYLEREYLKIWEQEHLGASEEDRRKAVEEMNAILSESIIIEMDALIKDGQD DLALERLKARIDATDEETREQIIDEHMTSIGAFECQSEDTLDEEDTHLENK >gi|225935348|gb|ACGA01000044.1| GENE 201 294909 - 295715 709 268 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173306|ref|ZP_05759718.1| ## NR: gi|260173306|ref|ZP_05759718.1| hypothetical protein BacD2_15650 [Bacteroides sp. 
D2] # 1 268 14 281 281 418 100.0 1e-115 MEAKDITRELIALEQGIQRLRKSNNWLKLGLNEGDFQSQEIIDYLQRPLGENVQLDVETE NKIKAAVYLSVKRSKWKPRKIAKQLAEAAAEALRDARLVTLYESNKIGAKQYKEECENNF VSKVVSTTKRIKKRYGRKLVKGVLATALGLVGGPAGLIAGGIMLVSEIIPQKTKEKIRKK VKEVALKAAETISQGVINLYRKGEKIASRIAEKVVKAAENVADGISIYAAPVVDCIKSVA HTIVEEVKEVGAKVKQGAKKVWKWLTGK >gi|225935348|gb|ACGA01000044.1| GENE 202 295748 - 297373 776 541 aa, chain + ## HITS:1 COG:all1872 KEGG:ns NR:ns ## COG: all1872 COG0464 # Protein_GI_number: 17229364 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Nostoc sp. PCC 7120 # 135 477 122 448 503 202 38.0 1e-51 MEITTAILSVPIVFVPHYHYSYLDNALVQLISTSQKTHCGLNISYDDLVEFDIARGSIDF NTKKKDECYKLPSFLTDLLDNENGLGNKKIYLFKGALKSLFSDEECILQLLQFATLYEKG VLERTKTLIFIDDIPISSFPSVLIPHTQVVNILLPSAEIIEELLLPIPLSKSIFLTDGRE AYLTSLIRTFQGLHQLQIINILRSVLVRTGGYLSKVALNMAETEKKNLVKKTDTLEIIET DITLKQIGGLEVLQKDILQKAKFFKNLSLATSHSVKLQMPKGILILGMPGCGKSMIAKAI ADEFAMPLLRLDVNKLMGKYVGESEENLRKALQTAESTHPCILWIDEVEKAFAGTQNSSG NDSLIIRLMGCFLTWMQERKTPIYVVATANDTMRPEFMRKGRFDEVYFVNFPTESECVDI LLKKLSRYNSPDSIFDFQTLTKGEYQKIALAMQGGVYGGFAGSEIEAVVSMVMENAFIKY LGMSSQHRVPIKVDDFLSVIASMKDAVMANQKGKLGQKTNVERILEIQECYHFKSASNKK D >gi|225935348|gb|ACGA01000044.1| GENE 203 297385 - 297672 220 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173308|ref|ZP_05759720.1| ## NR: gi|260173308|ref|ZP_05759720.1| hypothetical protein BacD2_15660 [Bacteroides sp. D2] # 1 86 1 86 95 163 100.0 2e-39 MEECISHNELLFVNNRFCFNIEDFINSIRSIKDDFEALNYWLVPLIYDGVLLKWLQQVGE EEGMALVKQIEISKDRSETAQATIKRLLLELDLLN Prediction of potential genes in microbial genomes Time: Fri May 13 09:58:54 2011 Seq name: gi|225935347|gb|ACGA01000045.1| Bacteroides sp. D2 cont1.45, whole genome shotgun sequence Length of sequence - 46638 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 19, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 5.2 1 1 Tu 1 . 
+ CDS 114 - 347 203 ## BT_1862 hypothetical protein + Term 373 - 424 10.0 + Prom 669 - 728 7.4 2 2 Op 1 6/0.000 + CDS 751 - 2247 1602 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 3 2 Op 2 30/0.000 + CDS 2329 - 3723 1692 ## COG0065 3-isopropylmalate dehydratase large subunit 4 2 Op 3 . + CDS 3765 - 4361 662 ## COG0066 3-isopropylmalate dehydratase small subunit 5 2 Op 4 11/0.000 + CDS 4343 - 5890 1731 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 6 2 Op 5 . + CDS 5951 - 7012 1422 ## COG0473 Isocitrate/isopropylmalate dehydrogenase + Term 7030 - 7080 15.2 - Term 7018 - 7068 7.6 7 3 Op 1 . - CDS 7094 - 7783 490 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 7818 - 7877 3.0 8 3 Op 2 . - CDS 7879 - 9714 1050 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 9741 - 9800 3.8 + Prom 9670 - 9729 8.4 9 4 Op 1 . + CDS 9846 - 10796 736 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 10 4 Op 2 . + CDS 10825 - 11643 971 ## COG0457 FOG: TPR repeat 11 5 Tu 1 . - CDS 11651 - 13459 1191 ## COG0514 Superfamily II DNA helicase - Prom 13612 - 13671 10.9 + Prom 13619 - 13678 6.1 12 6 Tu 1 . + CDS 13729 - 14193 471 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 14219 - 14268 5.3 + Prom 14201 - 14260 5.9 13 7 Op 1 . + CDS 14289 - 16337 2229 ## BT_1846 putative dipeptidyl-peptidase III 14 7 Op 2 . + CDS 16344 - 17006 639 ## BT_1845 hypothetical protein 15 7 Op 3 . + CDS 17089 - 17577 253 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 16 7 Op 4 . + CDS 17574 - 18845 1612 ## COG0104 Adenylosuccinate synthase + Term 18867 - 18912 1.2 + Prom 18874 - 18933 6.1 17 8 Op 1 . + CDS 18955 - 20253 1292 ## COG3669 Alpha-L-fucosidase + Prom 20289 - 20348 3.3 18 8 Op 2 . + CDS 20374 - 21054 768 ## COG2738 Predicted Zn-dependent protease + Term 21109 - 21158 8.4 + Prom 21101 - 21160 5.0 19 9 Tu 1 . 
+ CDS 21220 - 22584 1691 ## COG0124 Histidyl-tRNA synthetase + Term 22638 - 22689 14.2 - Term 22621 - 22681 11.1 20 10 Op 1 . - CDS 22708 - 23898 930 ## COG1160 Predicted GTPases 21 10 Op 2 . - CDS 23975 - 25396 1295 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 22 10 Op 3 . - CDS 25393 - 26445 673 ## COG0502 Biotin synthase and related enzymes 23 10 Op 4 . - CDS 26426 - 27895 1195 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Prom 28114 - 28173 6.4 + Prom 27955 - 28014 5.0 24 11 Op 1 . + CDS 28041 - 28214 198 ## gi|160886567|ref|ZP_02067570.1| hypothetical protein BACOVA_04578 25 11 Op 2 . + CDS 28283 - 28831 433 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 26 11 Op 3 . + CDS 28891 - 29619 617 ## gi|160886565|ref|ZP_02067568.1| hypothetical protein BACOVA_04576 + Term 29661 - 29700 7.5 - Term 29883 - 29935 5.0 27 12 Tu 1 . - CDS 30008 - 31036 870 ## BT_1831 hypothetical protein - Prom 31276 - 31335 5.8 + Prom 31082 - 31141 5.9 28 13 Op 1 41/0.000 + CDS 31299 - 31571 386 ## COG0234 Co-chaperonin GroES (HSP10) 29 13 Op 2 . + CDS 31615 - 33252 1672 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 + Term 33277 - 33326 9.5 + Prom 33362 - 33421 6.6 30 14 Tu 1 . + CDS 33443 - 34381 1003 ## BT_4479 integrase protein + Prom 34675 - 34734 4.3 31 15 Tu 1 . + CDS 34756 - 37380 2214 ## BT_2486 hypothetical protein + Term 37404 - 37453 5.5 + Prom 37748 - 37807 2.9 32 16 Tu 1 . + CDS 37830 - 38768 716 ## BT_4479 integrase protein + Prom 38965 - 39024 5.5 33 17 Tu 1 . + CDS 39145 - 42039 2596 ## BT_1826 hypothetical protein + Term 42182 - 42226 6.6 - Term 42224 - 42267 0.5 34 18 Op 1 . - CDS 42405 - 42902 372 ## COG4929 Uncharacterized membrane-anchored protein 35 18 Op 2 . - CDS 42889 - 43863 349 ## BT_1824 putative permease 36 18 Op 3 . 
- CDS 43847 - 44764 665 ## COG4984 Predicted membrane protein - Prom 44802 - 44861 8.6 - Term 44825 - 44882 -0.3 37 19 Tu 1 . - CDS 44920 - 46638 1740 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] Predicted protein(s) >gi|225935347|gb|ACGA01000045.1| GENE 1 114 - 347 203 77 aa, chain + ## HITS:1 COG:no KEGG:BT_1862 NR:ns ## KEGG: BT_1862 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 80 80 68 85.0 1e-10 MKTKLFLAAVAVTFSFAMMSCTGNKTTNAASEGEETTVETVEAVVETDSCCQAKDSCATA CDKKADCAEKKECCDKK >gi|225935347|gb|ACGA01000045.1| GENE 2 751 - 2247 1602 498 aa, chain + ## HITS:1 COG:VC2490 KEGG:ns NR:ns ## COG: VC2490 COG0119 # Protein_GI_number: 15642486 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Vibrio cholerae # 1 494 1 496 516 462 50.0 1e-130 MDNRLFIFDTTLRDGEQVPGCQLNTVEKIQVAKALEALGVDVIEAGFPISSPGDFNSVIE ISKAVTWPTICALTRAVQKDIDVAVDALKFAKHKRIHTGIGTSDSHIKYKFNSNREEIIE RAVAAVKYARRFVDDVEFYAEDAGRTDNEYLARVVEAVIKAGATVVNIPDTTGYCLPSEY GAKIKYLVDHVDGIDNAILSTHCHNDLGMATANTIAGVLNGARQVEVTINGIGERAGNTA LEEIAMIIKSHHEIDIQTNINTQKIYPTSRMVSSLMNMPVQPNKAIVGRNAFAHSSGIHQ DGVLKNVQTYEIIDPHDVGIDDNSIVLTARSGRAALKNRLSILGVNLDQEKLDKVYDEFL KLADKKKDINDDDILVLAGADRSQNHRIKLDYLQVTSGVGVRSVASLGLNIAGEKFEACA SGNGPVDAAIKALKKIVERHMTLKEFTIQAISKGSDDVGKVHMQVEYDNQIYYGFGANTD IIAASVEAYIDCINKFKS >gi|225935347|gb|ACGA01000045.1| GENE 3 2329 - 3723 1692 464 aa, chain + ## HITS:1 COG:NMA1450 KEGG:ns NR:ns ## COG: NMA1450 COG0065 # Protein_GI_number: 15794355 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Neisseria meningitidis Z2491 # 3 461 5 466 469 516 57.0 1e-146 MNTLFDKIWDAHVVTTVEDGPTQLYIDRLYCHEVTSPQAFAGLRERGIGVLRPEKVFCMP DHNTPTHDQDKEIEDPISKTQVDTLTQNAKDFGLTHYGMMHPKNGIIHVVGPERGLTLPG 
MTIVCGDSHTSTHGAMGAIAFGIGTSEVEMVLASQCILQSRPKTMRITVDGELGKGVTAK DVALYMMSKMTTSGATGYFVEYAGSAIRNLTMEGRLTLCNLSIEMGARGGMVAPDEVTFE YIKGRESAPQGEAWDKALAYWKTLKSDDDAVFDKEVRFEAADIEPMITYGTNPGMGMGIT QHIPTMEGMSEAAQVSFKKSMEYMGFQPGESLLGKKIDYVFLGACTNGRIEDFRAFASIV KGRKKAENVIAWLVPGSWMVDAQIRKEGIDKILTEAGFAIRQPGCSACLAMNDDKIPAGK YSVSTSNRNFEGRQGPGARTLLASPLVAAAAAVTGVITDPRELM >gi|225935347|gb|ACGA01000045.1| GENE 4 3765 - 4361 662 198 aa, chain + ## HITS:1 COG:NMB1034 KEGG:ns NR:ns ## COG: NMB1034 COG0066 # Protein_GI_number: 15676921 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Neisseria meningitidis MC58 # 6 197 4 202 213 169 46.0 3e-42 MAKTKFNIITSTCVPLPLENVDTDQIIPARFLKATTREEKFFGDNLFRDWRYNADGSLNK DFVLNNPTYSGQILVAGKNFGSGSSREHAAWAIAGYGFRVVVSSFFADIHKNNELNNFVL PVVVTEEFLQELFDSIEADPKMEVEVNLPEQTITNKATGKSEHFEINAYKKLCLMNGLDD IDFLLSNKDKIEEWEKKA >gi|225935347|gb|ACGA01000045.1| GENE 5 4343 - 5890 1731 515 aa, chain + ## HITS:1 COG:MK0391 KEGG:ns NR:ns ## COG: MK0391 COG0119 # Protein_GI_number: 20093829 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Methanopyrus kandleri AV19 # 7 505 4 491 499 245 34.0 1e-64 MGKEGVKIEIMDTTLRDGEQTSGVSFVPHEKLMIARLLLEDLKVDRVEVASARVSDGEFE AVKMICDWAARRSLLHKVEVLGFVDGHTSVDWIQRTGCRVINLLCKGSLKHCTQQLKKTP EEHLADIISVVHYADEQDITVNVYLEDWSNGIKDSPEYVFQLMDGLKETSVKRFMLPDTL GILNPLQVIEYMRKMKKRYPNTHLDFHAHNDYDLAVSNVLAAVLSGVKGLHTTINGLGER AGNAPLASVQAILKDHFNAVTNIDESRLNDVSRVVESYSGIVIPANKPIVGENVFTQVAG VHADGDNKNNLYCNDLLPERFGRKREYALGKTSGKANIRKNLEDLGLELDEDAMRKVTER IIELGDKKELVTQEDLPYIVSDVLKHGAVGEKVKLKSYFVNLAHGLKPMATLKIEINGKE YEESSGGDGQYDAFVRALRKIYKVTLGRKFPMLTNYAVTIPPGGRTDAFVQTVITWSYDE QVFRTRGLDADQTEAAIKATMKMLNLIEDEYEKSK >gi|225935347|gb|ACGA01000045.1| GENE 6 5951 - 7012 1422 353 aa, chain + ## HITS:1 COG:aq_244 KEGG:ns NR:ns ## COG: aq_244 COG0473 # Protein_GI_number: 15605790 # Func_class: C Energy production and conversion; E Amino acid transport 
and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Aquifex aeolicus # 3 352 4 358 364 349 49.0 5e-96 MDFKIAVLAGDGIGPEISVQGVDVMSAVCEKFGHKVSYEYAICGADAIDKVGDPFPEATY QVCKEADAVLFSAVGDPKFDNDPTAKVRPEQGLLAMRKKLGLFANIRPVQTFKCLIHKSP LRAELVENADFICIRELTGGMYFGEKYQDNDKAYDTNYYTRPEIERILKVAFEYAMKRRK HLTVVDKANVLASSRLWRQIAQEMAPNYPEVTTDYMFVDNAAMKMIQEPAFFDVMVTENT FGDILTDEGSVISGSMGLLPSASTGESTPVFEPIHGSWPQAKGLNIANPLAQILSVAMLF EYFDLKEEGALIRKAVDASLDENVRTPEIQVADGAKYGTKEVGQWIVDYIKKA >gi|225935347|gb|ACGA01000045.1| GENE 7 7094 - 7783 490 229 aa, chain - ## HITS:1 COG:MJ0374_2 KEGG:ns NR:ns ## COG: MJ0374_2 COG0671 # Protein_GI_number: 15668550 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Methanococcus jannaschii # 10 180 1 156 168 60 30.0 2e-09 MIHTGIVQYLSEIDTNIFLSFNGIHSPFWDYFMSSFTGKFIWIPMYATILYILLKNFHWK VVMCYVAAIALTITFADQMCSSIIRPVVARLRPANPENPIVDLVYIVNGYRGGSYGFPSC HAANSLGLAMFVIFLFRKRWLSIFILTWAILNCYTRIYLGVHYPGDLLVGGIIGGFGGWL FCTIAHKAAIYLEPSTRTKRKEIKQWSVTIYVGLLTILGIVLYSTIKSW >gi|225935347|gb|ACGA01000045.1| GENE 8 7879 - 9714 1050 611 aa, chain - ## HITS:1 COG:VCA0802 KEGG:ns NR:ns ## COG: VCA0802 COG1368 # Protein_GI_number: 15601557 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Vibrio cholerae # 190 605 196 632 657 189 30.0 1e-47 MKKRIIQFLTTYFLFVLLFVLQKPIFMVYYHDLYTNVSLGDYFRVMWHGLPLDLSLAGYL TAIPGLLLIASAWTNSSILRRIRQGYFGVIAFVMACIFIIDLGLYGFWGFRLDATPVFYF FSSPKDAMASVSFWFVLLGILSMLIYAAILYFIFYCVLIREKKPLKIPYRRQNVSLALLL LTAALFIPIRGGFSVSTMNLSKVYFSQDQRMNHAAINPAFSFMYSATHQNNFDKQYRFMD PKIADELFAEMVDKPVAATDSIPQLLNTQRPNIIFIILESFSTHLMETFGGQPNVAVNMD KFAKEGILFSNFYGSSFRTDRGLASIISGYPGQPSTSIMKYPEKTDKLPSIPRSLKNAGY NLEYYYGGDADFTNMRSYLVSSGIEKIISDKDFPLSERTGKWGAQDHVLFQRLMKDLKEE KQKEPFLKLVQTSSSHEPFEVPFHRLDDKILNSFAYADSCVGDFVKQYQETPLWKNTLFV LVPDHQGAYPYPIENPLDGQTIPLILIGGAIKQPLVVDTYASQIDIAATLLAQLGLPHDE 
FTFSKNIMNPASPHFAYFTRPNYFGMITADNQLVYNLDANTVQLDEGTAKGANLEKGKAF LQKLYDDLAKR >gi|225935347|gb|ACGA01000045.1| GENE 9 9846 - 10796 736 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 288 50 5e-77 MAKIARKLTDLVGNTPLMELSGYSGKYGLEQNIIAKLEAFNPAGSVKDRVAFSMIEDAEV RGLLKPGATIIEPTSGNTGVGLAMVATIKGYHLILTMPETMSLERRNLLKALGAQIVLTD GLGGMAASIAKAQELRDSIPGSVILQQFENPANAAVHERTTGEEIWRDTDGEVAVFVAGV GTGGTVCGVARALKKHNPDIYIVAVEPVSSPVLAGGEEAPHRIQGIGANFIPKLYDASVV DEVMGVPDDEAIRAGRELASTEGLLVGISSGAAVYAARQLSLRPEFKNKKIVALLPDTGE RYLSTELFAFDAYPLD >gi|225935347|gb|ACGA01000045.1| GENE 10 10825 - 11643 971 272 aa, chain + ## HITS:1 COG:all0889 KEGG:ns NR:ns ## COG: all0889 COG0457 # Protein_GI_number: 17228384 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 61 257 252 448 605 77 29.0 2e-14 MVRIIIALLFCFPAVAFAQTYQQLSERAIECIEKDSLPKAEELLLQALKLEPKNAKNALL FSNLGLVQRRLGEFDKALESYSFALNFAPLAVPILLDRAAIYMEMGKTDRAYTDYCQVLD EDKQNKEALLMRAYIYVLRRDYPAARIDYNRLLELDPQSYSGRLGLATLEQKEGKFRESL EILNKMITATPDDATLYIARADVEREMKHEDLALVDLEEAIRLDAASADAYLLRGNIYLA QKKKGLAKADFEKAISLGVPPADLHEQLKQCK >gi|225935347|gb|ACGA01000045.1| GENE 11 11651 - 13459 1191 602 aa, chain - ## HITS:1 COG:ECs4752 KEGG:ns NR:ns ## COG: ECs4752 COG0514 # Protein_GI_number: 15834006 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli O157:H7 # 2 602 16 606 611 533 44.0 1e-151 MRETLKTYFGYDSFRPLQEEIIHHVLNKQDTLVLMPTGGGKSICYQLPALLCEGTAVVVS PLISLMKDQVEALLANGIAAGALNSSNDETENANLRRACIEGRLKLLYISPEKLLAEKDY LLRDMNISLFAIDEAHCISQWGHDFRPEYTQMGVLHQQFPQIPIIALTATADKITREDIV RQLHLNHPRVFISSFDRPNISLTVKRGFQAKEKNKAILEFIHRHGGESGIIYCMSRSKTE TVAQMLQKQGIRCGVYHAGLSTQHRDETQNDFINDRIQVVCATIAFGMGIDKSNVRWVIH YNLPKSIESFYQEIGRAGRDGLPSSTVLFYSLGDLILLTKFASESNQQSINLEKLQRMQQ YAEADICRRRILLSYFGETTTEDCGNCDVCKNPPQRFDGTVIVQKALSAIARTEQQISTG 
LLIDILRGNYSAEVTGKGYQELKTFGAGRDIPPRDWQDYLLQMLQLGYFEIAYNENNHLK ITASGSDILFGRAKATLAVIRHEEVATQKGKKKKVVVAKVLPFGLEGGENEDLFEALRGL RKQLADQEALPAYIVLSDKVLHLLSISRPTTIEEFGEISGIGEYKKKKYGKDFVNLIRQF VE >gi|225935347|gb|ACGA01000045.1| GENE 12 13729 - 14193 471 154 aa, chain + ## HITS:1 COG:VCA0926 KEGG:ns NR:ns ## COG: VCA0926 COG2207 # Protein_GI_number: 15601680 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Vibrio cholerae # 56 152 272 365 365 62 35.0 2e-10 MSDLENKKTEETPKKRPYNLREKKEKKAAYRSLIRPELADELYDKILNIIVVQKKYKDPD YSAKDLAKELKTNTRYLSAVVNSRFGMNYSCLLNEYRVKDALHLLTDKRYADKNVEEISA MVGFANRQSFYAAFYKNVGETPNGYRKRHIENKK >gi|225935347|gb|ACGA01000045.1| GENE 13 14289 - 16337 2229 682 aa, chain + ## HITS:1 COG:no KEGG:BT_1846 NR:ns ## KEGG: BT_1846 # Name: not_defined # Def: putative dipeptidyl-peptidase III # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 682 1 675 675 1233 88.0 0 MRKHLILMTVAATLLTSCGGSKTTTAEAEKFDYTVEQFADLQILRYRVPGFEELTLKQKE LIYYLTEAALEGRDILFDQNGKYNLRIRRMLEAVYTNYQGDKTTPDFKNMEVYLKRVWFS NGIHHHYGTEKFVPNFSQEFLKQAVLGLDAKLLPLEKGQTADQLCAELFPVIFDPAVMPK RVNQADGEDLVLTSACNYYDGVTQKEAESFYSVLKDPKDETPVSYGLNSRLVKENGKLTE KVWKVGGLYTQAIEKIVYWLKKAEGVAENEAQKAVITKLIQFYETGDLKDFDEYSILWVK DLDSRIDFVNGFTESYGDPLGMKASWESLVNFKDLESTHRTEIISSNAQWFEDHSPVDKS FKKEKVKGVSAKVITAAILAGDLYPATAIGINLPNANWIRAHHGSKSVTIGNITDAYNKA AHGNGFNEEFVYSDAEIQLIDAYSDLTDELHTDLHECLGHGSGKLLPGVDPDALKAYGST IEEARADLFGLYYVADPKLVELGLLSSPDAYKAQYYTYLMNGLMTQLVRIEPGNSVEEAH MRNRQLIARWVFEKGAADKAVELVKKDGKTYVVINDYQKVRQLFGELLAEIQRIKSTGDF EGARTLVENYAVKVDPALHAEVLERYKKLNLAPYKGFVNPKYELVTDENGNVTDVTVSYD EGYVEQMLRYSTDYSPLPSINN >gi|225935347|gb|ACGA01000045.1| GENE 14 16344 - 17006 639 220 aa, chain + ## HITS:1 COG:no KEGG:BT_1845 NR:ns ## KEGG: BT_1845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 219 1 219 221 399 91.0 1e-110 MDIKEQLKDIKTQLRLSMNGAVSQSMREKGLVYKLNFGVELPRIKMIAESYEKNHDLAQA 
LWKEDIRECKILAGMLQPIETFYPEIADIWVENIRNIEIAELTCMNLFQYLPYAPAKSFH WIADEQEYIQTCGFLTAARLLMKKGDMTERASGELLDQAICAVHSDSYHVRNAALLVIRK YMQHSEEHAFQVCRLVEGMADSTLEGEQMLYNMVKEETEE >gi|225935347|gb|ACGA01000045.1| GENE 15 17089 - 17577 253 162 aa, chain + ## HITS:1 COG:Cj0400 KEGG:ns NR:ns ## COG: Cj0400 COG0735 # Protein_GI_number: 15791767 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Campylobacter jejuni # 1 154 1 157 157 75 31.0 3e-14 METQNVKDTVRQIFTEYLTANGHRKTPERYAILDTIYSIDGHFDIDMLYSRMMDQENFRV SRATLYNTIILLINARLVIKHQFGTSSQYEKSYNRETHHHQICTQCGRVTEFQNEELQHA IENTKLSRFQLSHYSLYIYGVCSKCDRANKRKKVNNNNKKEK >gi|225935347|gb|ACGA01000045.1| GENE 16 17574 - 18845 1612 423 aa, chain + ## HITS:1 COG:PM0938 KEGG:ns NR:ns ## COG: PM0938 COG0104 # Protein_GI_number: 15602803 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Pasteurella multocida # 5 417 6 424 432 382 48.0 1e-106 MKVDVLLGLQWGDEGKGKVVDVLTPKYDVVARFQGGPNAGHTLEFEGQKYVLRSIPSGIF QGNKVNIIGNGVVLDPALFKAEAEALEASGHPLKERLHISKKAHLILPTHRILDAAYEAA KGDAKVGTTGKGIGPTYTDKVSRNGVRVGDILHNFDEKYAAAKARHEQILNGLNYEYDLT ELEKAWLEGIEYLKQFHFVDSEHEVNNLLKDGKSVLCEGAQGTMLDIDFGSYPFVTSSNT VCAGACTGLGVAPNRIGEVYGIFKAYCTRVGAGPFPSELFDETGDKMCTLGHEFGSVTGR KRRCGWIDLVALKYSVMINGVTKLIMMKSDVLDTFETIKACVAYKVNGEEIDYFPYDITE GVEPVYAELPGWQTDMTKMQSEDEFPEEFNAYLTFLEEQLGVEIKIVSVGPDRAQTIERY TEE >gi|225935347|gb|ACGA01000045.1| GENE 17 18955 - 20253 1292 432 aa, chain + ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 24 350 8 358 449 134 29.0 4e-31 MKTRFITFLLLFVMNLGAFAQSPYQPAEENLKARQEFQDNKFGIFLHWGLYAMLATGEWT MTNNNLNYKEYAKLAGGFYPSKFNADKWVEAIKASGAKYICFTSRHHEGFSMFDTKYSDY NVVKATPFKRDIVKELAAACAKQGIKLHFYYSHLDWAREDYPWGRTGQGTGRSNSKGDWK SYYQFMNNQLTELLTNYGPIGAIWFDGWWDQPKSFNWELPEQYALIHKLQPGCLVGNNHH 
QTPFDGEDIQIFERDLPGENASGLSGQEVSRLPLETCETMNGMWGYKITDQNYKSTKTLI HYLVKAAGKNANLLMNIGPQPDGELPAVAVQRLAEMGEWMKQYGETIYGTRSGIVAPHDW GVTTQKGNKLYVHILDLKDAALFLPLTGKKVKKAVLFKDQSPVRFTKTKAGVLLEFAEVP KDIDYVVELTID >gi|225935347|gb|ACGA01000045.1| GENE 18 20374 - 21054 768 226 aa, chain + ## HITS:1 COG:BH1677 KEGG:ns NR:ns ## COG: BH1677 COG2738 # Protein_GI_number: 15614240 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Bacillus halodurans # 2 226 1 223 224 180 43.0 2e-45 MMSYWVLFIGIAVVSWLVQMNLQNKFKKYSKIPTGNGMTGRDVALKMLHDNGIYDVQVTH TPGRLTDHYNPTNKTVNLSEGVYESNSIMAAAVAAHECGHAVQHARMYAPLKMRSALVPV VNFASSIMTWVLLGGILLINSFPQLLLAGIILFAMTTLFSFITLPVEINASKRALVWLSS SGITNSYNHAQAEDALRSAAYTYVVAALGSLATLVYYIMIFMGRRD >gi|225935347|gb|ACGA01000045.1| GENE 19 21220 - 22584 1691 454 aa, chain + ## HITS:1 COG:HP1190 KEGG:ns NR:ns ## COG: HP1190 COG0124 # Protein_GI_number: 15645804 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Helicobacter pylori 26695 # 5 440 4 422 442 246 33.0 5e-65 MAAKPSIPKGTRDFSPVEMAKRNYIFNTIRDVYHLYGFQQIETPSMEMLSTLMGKYGDEG DKLLFKIQNSGDYFSGITDEELLSRNAVKLASKFCEKGLRYDLTVPFARYVVMHRDEITF PFKRYQIQPVWRADRPQKGRYREFYQCDADVVGSDSLLNEVELMQIVDTVFSRFNIRVCI KINNRKILSGIAEIIGESDKIVDITVAIDKLDKIGLDNVNAELKEKGISDEAIAKLQPII LLSGTNAEKLATLKNVLSASEVGLKGVEESEFILNTLETMGLKNEIELDLTLARGLNYYT GAIFEVKALDVQIGSITGGGRYDNLTGVFGMAGVSGVGISFGADRIFDVLNQLELYPKEA VNGTELLFINFGEKEAAFSMGILSKVRAAGIRAEIFPDAAKMKKQMSYANTKNIPFVAIV GENEMNEGKAMLKNMETGEQNLVSAEELIAVVKK >gi|225935347|gb|ACGA01000045.1| GENE 20 22708 - 23898 930 396 aa, chain - ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 391 4 391 411 326 45.0 6e-89 MSLTDTPNANRLHIALFGRRNSGKSSLINALTGQDTALVSDTPGTTTDLVSKAMEIQGIG PCLFIDTPGFDDEGELGELRISRTLKAIEKTDIALLLCGDTTFSHEKEMLALLKEKNIPV 
IPVLNKIDIRENSDSLATYIEEECKIRPLLISAKEKTGIEQIRQAILEKLPSDFGQQSIT GELVTENDLVLLVMPQDIQAPKGRLILPQVQTIRELLDKKCLVVTCTTDKFPATLQALAR PPKLIITDSQVFKTIYEQKPKESELTSFSVLFAGYKGDIHYYVESAATIERLTESSRVLI AEACTHAPLSEDIGRVKLPRLLRKRIGENLQIDMVAGTDFPQDLTPYSLVIHCGACMFNR KYVLSRIERAREQHIPMTNYGVAIAFLNGILDQIKY >gi|225935347|gb|ACGA01000045.1| GENE 21 23975 - 25396 1295 473 aa, chain - ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 2 473 1 472 472 721 75.0 0 MIYQKDSSKAEEFIHHEEILDTLEYAQNNKDNRVLIEQLIEKAALCKGLTHREAAILLEC NQPDLIERIFHLAKEIKQKFYGNRIVMFAPLYLSNYCVNGCLYCPYHAKNKTIARKKLTQ EEIRREVIALQDMGHKRLALEAGEHPSLNPIEYILESIQTIYSIKHKNGAIRRVNVNIAA TTVENYRRLKEAGIGTYILFQETYHKDNYEALHPTGPKSNYAYHTEAMDRAMEGGIDDVG IGVLFGLNTYRYDFIGLLMHAEHLEAKFGVGPHTISVPRICSADDINAGDFPNSISDEIF SKIVAVIRIAVPYTGMIISTRESQESRKKVLELGISQISGGSRTSVGGYAETELPDHNSA QFDVSDTRTLDEVVNWLLELGYIPSFCTACYREGRTGDRFMSLVKSGQIANCCGPNALMT LKEYLEDYASEDTRQKGLELILKETDRIPNPKIREIAIRNLKAIAAGQRDFRF >gi|225935347|gb|ACGA01000045.1| GENE 22 25393 - 26445 673 350 aa, chain - ## HITS:1 COG:TM1269 KEGG:ns NR:ns ## COG: TM1269 COG0502 # Protein_GI_number: 15644025 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Thermotoga maritima # 2 350 4 348 348 249 40.0 4e-66 MKQWIDKLRQERTLTPEEFRQLLTGCDAEILRYINKQAQEVALLHFGNKIYIRGLIEISN CCRNNCYYCGIRKGNPNIERYRLSRESILNCCKQGYELGFRTFVLQGGEDPALTNDQIEM TVARIRQEYPDCAITLSLGEKSREAYERFFRAGANRYLLRHETYNELHYRQLHPAEMSDK RRLQCLADLKEIGYQTGTGIMVGSPGQTVEHIIEDLLFIEKLRPEMIGIGPFLPHHDTPF AEYPSGTAEQTILLLSIFRLMHPSALIPATTALATLIPDGRERGILAGANVVMPNLSPRE ERRKYELYNDKASLGAESAEGLAALQKQLKTIGYEISTERGDFKYTTENI >gi|225935347|gb|ACGA01000045.1| GENE 23 26426 - 27895 1195 489 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # 
Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 173 411 95 340 450 150 36.0 6e-36 MAFTNNIMIVRHKLLADLVRLWKNDELVEKIDRLPIELSPRKSKPLGRCCVHKERAVWRY KTFPLMGLDMTDEHDEVTPLSEYARLALSRPEPDKENIMCVIDEACSSCVQINYEITNLC RGCVARSCYMNCPKDAIRFKKNGQAMIDHDTCVSCGICHKSCPYHAIVYIPVPCEESCPV KAISKDEHGIEHIDESKCIYCGKCMNACPFGAIFEISQTFDVLQRIRKGEKMVAIIAPSI LGQFKTSIGQVYGAFKEIGFTDVIEVAEGAMSTTSNEAHELLEKLEEGQKFMTTSCCPSY IELVEKHIPDMKPYVSTTGSPMYYAARIAKEKHPDAKVVFVGPCVAKRKEVRRDEAVDYI LTFEEIGSILDGLGIELEQVQEFSVLHTSVREAHGFAQAGGVMGAVKAYLKEEAEKINAI QVSDINKKNIALLRACAKTGKAAGQFIEVMACEGGCITGPSTHNDIVSGRRQLAQELLKR KESYETMDR >gi|225935347|gb|ACGA01000045.1| GENE 24 28041 - 28214 198 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160886567|ref|ZP_02067570.1| ## NR: gi|160886567|ref|ZP_02067570.1| hypothetical protein BACOVA_04578 [Bacteroides ovatus ATCC 8483] # 1 57 1 57 57 89 100.0 5e-17 MNSSIFEQRSRFAMIGALMVIISLMFLFYMGSSLVSSTKKYLEQIHEIEITCIDTDE >gi|225935347|gb|ACGA01000045.1| GENE 25 28283 - 28831 433 182 aa, chain + ## HITS:1 COG:CC3650 KEGG:ns NR:ns ## COG: CC3650 COG0494 # Protein_GI_number: 16127880 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Caulobacter vibrioides # 8 180 6 177 187 97 31.0 1e-20 MEEKNKAWKTVSSKYLFRRPWLTVRCEDMLLPNGNHIPEYYILEYPDWVNTIAITKDGQF VFVRQYRPGIERTCYELCAGVCEKEDASPLVSAQRELWEETGYGKGNWQEYMVISANPST HTNLTYCFLATDVELIDHQHLEATEDITVHLLTLEEVKSLLDKNEIMQALNAAPLWKYIA NL >gi|225935347|gb|ACGA01000045.1| GENE 26 28891 - 29619 617 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160886565|ref|ZP_02067568.1| ## NR: gi|160886565|ref|ZP_02067568.1| hypothetical protein BACOVA_04576 [Bacteroides ovatus ATCC 8483] # 1 242 1 242 242 376 99.0 1e-103 MKTVLWSMLCLFLSGWGSMQAVLAQDLKEMEKNLSAINEELSQKTKEYSWQLAAAYADYC 
EANNKYISWNDLPYLQQVVEYERPASLETYRLEHKASKEELDKFLNTYKEYKDLVKKQKE AVTKEEKDAVSTAFSAFWKKLRSEENAYKDLYYAERKAVCKYRSEALRYAIAYYKEKKQE IPTSYIKYTERSYLLQKGSALELLQKEISALESVQREIIQNITRAKYGLSETGENKREKI FD >gi|225935347|gb|ACGA01000045.1| GENE 27 30008 - 31036 870 342 aa, chain - ## HITS:1 COG:no KEGG:BT_1831 NR:ns ## KEGG: BT_1831 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 342 7 348 348 589 82.0 1e-167 MNGIASGLIIASCVFYSCTSRTGEISPVHEQQTDSLSQDTIVQPEVKPVNKKLTAEQIEI SKDLLYDQYTLEDTYPYKDTTRQFQWDKIKERLALLENIQLQPSTWAILQNYKNRNGEAP LVRSFKRNAYGRVADTLGIERYQSVPLYLLTDTLVPERYGQDGELTRFIEDGEKFIKAEP MFTGDEWMIPKKYVKVIGDTIVFNKAVFVDRHNQNIASLERSGKGQWVVRSMNPSTTGRH LPPYAQETPLGMFVLQEKKVKMVFLKDGSKETGGYAPYASRFTDGAYIHGVPVNAPRKTQ IEYSPSLGTTPRSHMCVRNATSHAKFIYDWAPVNGTIIFVLE >gi|225935347|gb|ACGA01000045.1| GENE 28 31299 - 31571 386 90 aa, chain + ## HITS:1 COG:RC0969 KEGG:ns NR:ns ## COG: RC0969 COG0234 # Protein_GI_number: 15892892 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Rickettsia conorii # 1 89 5 98 99 98 55.0 2e-21 MNIKPLADRVLILPAPAEEKTIGGIIIPDTAKEKPLKGEVVAIGHGTKDEEMVLKVGDTV LYGKYAGTELDVEGTKYLIMRQSDVLAVLG >gi|225935347|gb|ACGA01000045.1| GENE 29 31615 - 33252 1672 545 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 545 3 547 547 648 60 0.0 MAKEILFNIDARDQLKKGVDALANAVKVTLGPKGRNVIIEKKFGAPHITKDGVTVAKEIE LADAYQNTGAQLVKEVASKTGDDAGDGTTTATVLAQAIVAEGLKNVTAGASPMDIKRGID KAVAKVVESIKDQAETVGDNYDKIEQVATVSANNDPVIGKLIADAMRKVSKDGVITIEEA KGTDTTIGVVEGMQFDRGYLSAYFVTNTEKMECEMEKPYILIYDKKISNLKDFLPILEPA VQTGRPLLVIAEDVDSEALTTLVVNRLRSQLKICAVKAPGFGDRRKEMLEDIAILTGGVV ISEEKGLKLEQATIEMLGTADKVTVTKDYTTVVNGAGNKDSIKERCEQIKAQIVATKSDY DREKLQERLAKLSGGVAVLYVGAASEVEMKEKKDRVDDALRATRAAIEEGIIPGGGVAYI RAIDSLEGMKGDNADETTGIGIIKRAIEEPLREIVANAGKEGAVVVQKVREGKGDFGYNA RTDVYENLHAAGVVDPAKVARVALENAASIAGMFLTTECVIVEKKEDKPEMPMGAPGMGG 
MGGMM >gi|225935347|gb|ACGA01000045.1| GENE 30 33443 - 34381 1003 312 aa, chain + ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 310 1 303 305 510 82.0 1e-143 MLKVLTFMKQVANGLQVEGNFGTAHVYRSSLNAIIAYSGKVDFTFDEVSPEWLKGFEVYL RSRGCSWNTVSTYLRTFRAVYNRAVDLRKASYVPHLFRSVYTGTRADHKRALGDEDMKKV FAKLSRTSGVPLAVYQAQELFILMFSLRGMPFVDLAYLRKSDLRDNVITYRRRKTGRPLS VTLTPEAMILVKKYMNRDPSSPYLFPLLKSREGTKEAYREYQLALRSFNQQLMLLGELLG LSDKLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRSKKIDEANKQVL DFVKRSVVGVSA >gi|225935347|gb|ACGA01000045.1| GENE 31 34756 - 37380 2214 874 aa, chain + ## HITS:1 COG:no KEGG:BT_2486 NR:ns ## KEGG: BT_2486 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 142 1016 118 46.0 9e-25 MKRKFVKVMFFGALALSTVTYVGCKDYDDDIDNLQTQIDANKASIAELQNFVKEGKWVTN VEQITDGFKITFNDNKSYSITSGKDATPTTIKIDPVTKNWIVNDNDLGICAEGKKGADGK PGAAGSPGGKGEDGYAPQISENGFWMVWDAETKKPVETKIKAATDIYVAADASNPLVWIL NIFNKETKEWETVSMPKSARITSMSVLGIKGDGSVDVGSTEAETTLYYSIAGKDIVFNGN KTFKKKGDLLVARGGSKIHALINPVNLKAADIQAYEIGLTDSKGNTNFAVANIADNFSID ALTRAADPEKEPTANKGVYDLTLKFVDGLTKDELTALESAETAYALTTKDAWGNEIISQY GVKIKASSQNIPDVNFTAPEPMPYQTTYNLDELFGSELDKVVAYYYEVTDEEAKKADAKF DKEKNTILANKEGQVKVKIHCLLVDGSTQDPEVELTFTYVSKKAEIKDMTWVVDASNKTA TSEIVGPSVDEIKGQIKLSDPIVATIAYTDDKAMINGKVVQSYMDGSIQLKLVGLDKDGK PVSGTSEADIAKITKFVIQATFDEENVAAVSHTATVKFKNKDSQAGLGNDFLYETTFKIT VDQQNDKLFTFKRATAYFDGDNAKAYGTVPTTAVLAATADTKIGFDLYTLYKEGSISADK QNTITFTEEKPSRVVSGKKQFAPAWLDETLPQPTKNSKIKVFPYVSKPATDENWGGAYTG RYITVSYAPFGNSRLKAITDRFNLTILSEIFEGTFEYTKEVDKKIIGTEANPFIIEGNTV EISAKDFKRIDARGNSYEFSDNRIESVSVVLADDDATTYLAKNDGNLTDDPKKVVISKKE GAVILTPPTCKVNVNILDKWGRTKSVSIYVKVNK >gi|225935347|gb|ACGA01000045.1| GENE 32 37830 - 38768 716 312 aa, chain + ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 310 1 303 305 505 81.0 
1e-142 MLKVLAFMKQVATGLQMEGNFGTAHVYRSSLNAIIAYRGKNDFVFSEVTSEWLKGFEVYL RSRGCSWNTVSTYLRTFRAVYNRAVDLQKAPYVPHLFRSVYTGTRADHKRALGDDDMKKV FTKLSRASGVPLAVYQAQELFILMFSLRGMPFVDLAYLRKSDLRGNVITYRRRKTGRPLS VTLTPEAMALVKKYMNRDSSSPYLFPLLRSREGTKEAYREYQLALRSFNQQLMLLGELLG LGDKLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRNKKIDEANKQVL DFVKRSVTGVSA >gi|225935347|gb|ACGA01000045.1| GENE 33 39145 - 42039 2596 964 aa, chain + ## HITS:1 COG:no KEGG:BT_1826 NR:ns ## KEGG: BT_1826 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 961 1 1029 1038 203 27.0 3e-50 MKRKFVKVMFFGALALSTVTYVGCKDYDDDIKSVQEQIDQIKSNNPVSVGDMQTAINVAK SALESQLADLKTKLENKDSQIKDLGLKITDLEDKLSKTADKATVDQLTKDLATAKNDLEA LKKLQKSDIDGLTARIVKLEALKEELDGLKVNFATKEELKNYVESAKLSGLISDEIATAL GEDGEIAAAINDAIQTKVLADFGSMKEVAALAGEDATVADVIKELYAAINADKTGILAKL TALEDYQTALEDKAVENGFESVEAVIAEVKSLKTTLSGLYASAEFSEKVKAIVSAELTTV NSRIDTLEDDLAKLGVAIKGMIQSVVYIPTSIDRSVDFYTLYAKKTSTSSSYVVAAKSAD AKELQFRISPASAAMTLEDFNKNYEIKLNAEERNWTRAAEPFAVEVKNCEAGVLTVSLTT SSEKSHAISLNIVSTKDAEGKEELTPTNVNSDYIAVIQSSYYLKTAYYEVVTEKAGEIIY DAPQPVDYSGVGTLTVSYTTTASGSTAFTKTLEELNVENIFATTYSLTGTDANLFEVSTA GSVSLKTSGLIASLDKTADVMAKVTAPGFYLETSSNNPKKLGTVKVTRTIDELTHTYALE ERDWTNETTAAAEERKDLNVADIYNDPAVNIRPSAYESLSLVELIPTTGIRLENGANNAL TLIIPKNTAAGDYTATAKFEGDGYTLVVNVPVKIKPITLAKLARVSEMWSSDQKRTGFTP TKDSETAATAITSEFKLTTIFSNFDAVKTAVLAKGGTFVITTNITENSIAGVNYDENEAK FTFDKDLYTGKMTVGGKTVPAVVKFTIKASYNGKVEDTIEGIVEVKDISGTWVAPTATTL SLSDKSVEYNVSTGFAWNDLAGKTMWKDGAVVAGTGSNGFASSVTNPLNIYGLVAPTFAF REAAASTYLSLDASTGKVTFTAEGKSHHFYEAVTYTVEVKATSKWGTIKNYEGKNTITVT IPAE >gi|225935347|gb|ACGA01000045.1| GENE 34 42405 - 42902 372 165 aa, chain - ## HITS:1 COG:YPO2802 KEGG:ns NR:ns ## COG: YPO2802 COG4929 # Protein_GI_number: 16123000 # Func_class: S Function unknown # Function: Uncharacterized membrane-anchored protein # Organism: Yersinia pestis # 8 163 7 175 176 91 34.0 8e-19 MKKYSRILIIANLILLLGYFNWSVYKKEQTLKDGQLILLQLAPVDPRSLMQGDYMRLSYK 
EASSDLLDQQTAIRGYAILQIDSNQVGKIVRLQNALEPVNDNELVIKYKIVRHRIFLGAE SFFFEEGQDTLYQKAVYGGLKVDGKGQSLLVGLYDENFHYIQSDK >gi|225935347|gb|ACGA01000045.1| GENE 35 42889 - 43863 349 324 aa, chain - ## HITS:1 COG:no KEGG:BT_1824 NR:ns ## KEGG: BT_1824 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 324 1 318 318 326 61.0 7e-88 MTQKHSLTIQAVSIIGGILTAIFFLGFLALARILRSDISCLIVGSILILTTLTISRMVIR SFLDAMNITLYIAGCVLIGFGINASINILFTTLMGISILTFLLSRGFILPFLSVILFNIS FFGEAAHVFSSFYPLQIAVVPILALFLFANIFETKLFECIGTKNYFSKYKPFHFGLFVSG IVSLGGLSINYMISETNSWLVSCILSVCIWIGILIMVQRIMQVMKVDHPVNQIGIYILCI VICLPTVFAPYLSGSLLLILICFHYGYKAECAASLLLFIYAVSKYYYDLNLSLLTKSMTL FFIGIACITAWYFFTQKRTRHEKV >gi|225935347|gb|ACGA01000045.1| GENE 36 43847 - 44764 665 305 aa, chain - ## HITS:1 COG:YPO2801 KEGG:ns NR:ns ## COG: YPO2801 COG4984 # Protein_GI_number: 16122999 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 18 147 49 175 735 84 45.0 4e-16 MEKTDSSPLSRQALYADKKQWNQFLSIFLLAVGVGFTVAGIIFFFAYNWEELPKFAKLGI VEVLLVASVLLATFTHWNKLVKQILLTGATFLIGTLFAVFGQIYQTGADAYDLFLGWTLF TILWAVAIRFAPLWLTFIGLLCTTIWLYNIQIASANSWEMTLLANAVTWICALTTIITEW MSVKGHLDRNNRWFVSLLSLATIIHTSFLLMMAICEENAILSVPLISTLLLFSAGLWYGW KVKSLFYLAIIPFAALMILLTTFISQSGLRDVQIFFYGGVIVITGTTLLIYIILHLKKQW YDTEA >gi|225935347|gb|ACGA01000045.1| GENE 37 44920 - 46638 1740 572 aa, chain - ## HITS:1 COG:YPO1358 KEGG:ns NR:ns ## COG: YPO1358 COG0028 # Protein_GI_number: 16121638 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Yersinia pestis # 5 565 12 571 573 547 49.0 1e-155 ELIDTLVKSGVERIYAVTGDSLNEVNEAVRKNDQIKWIHVRHEETGAYAAAAEAQLTGRP GCCAGSSGPGHVHLINGLYDAHRSGAPVIAIASTIPTGEFGTEYFQETNTIKLFNDCSYY NEVATTPTQFPRMLQSAIQTAVTRKGVSVIGLPGDLAKASAVAVDSSVINYPAPPEVCPS 
EEDLAQLADMLNKHTRITLFCGIGCRGAHEEVIALSEKLNAPVVYTFKGKMEVQYENPYE VGMTGLLGMPSGYYSMHEAEVLLMLGTDFPYSAFLPDDIKIAQIDIKPERLGRRAKVDIG LCGDVKLSIQSLLRMLNPKTDDSFLLKQLKRYEGVKKDLAAYTEDKGDVNKIHPEYVMSE IDKLSSDDAVFTVDTGMTCVWGARYLQATGKRHMLGSFNHGSMANALPQAIGAALAYPDR QVVALCGDGGLSMTLGDLETVVQYKLPIKIIVFNNRSLGMVKLEMEVDGLPDWQTNMLNP DFAQVAEAMGMTGFNVSDPEEVLTTLLNAFELDGPVLVNIMTDPNALAMPPKIEFGQMVG FAQSMYKLLINGRSQEVIDTINSNFKHIREVF Prediction of potential genes in microbial genomes Time: Fri May 13 10:02:38 2011 Seq name: gi|225935346|gb|ACGA01000046.1| Bacteroides sp. D2 cont1.46, whole genome shotgun sequence Length of sequence - 533384 bp Number of predicted genes - 384, with homology - 379 Number of transcription units - 179, operones - 94 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 139 106 ## gi|160886554|ref|ZP_02067557.1| hypothetical protein BACOVA_04565 - Prom 168 - 227 5.6 + Prom 5 - 64 6.0 2 2 Tu 1 . + CDS 274 - 492 243 ## BT_1819 hypothetical protein + Term 522 - 579 10.2 + Prom 584 - 643 4.1 3 3 Op 1 . + CDS 664 - 1677 1009 ## COG2008 Threonine aldolase + Prom 1685 - 1744 1.8 4 3 Op 2 . + CDS 1773 - 2966 667 ## BT_1814 hypothetical protein 5 3 Op 3 . + CDS 2953 - 4656 1074 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase + Prom 4672 - 4731 3.7 6 4 Op 1 . + CDS 4751 - 5716 1180 ## COG2214 DnaJ-class molecular chaperone 7 4 Op 2 . + CDS 5731 - 6045 265 ## BT_1811 hypothetical protein + Term 6051 - 6096 12.5 - Term 6036 - 6087 10.2 8 5 Op 1 . - CDS 6133 - 6606 319 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 9 5 Op 2 . - CDS 6614 - 11674 4130 ## BT_1809 hypothetical protein 10 5 Op 3 . - CDS 11713 - 12132 297 ## BT_1808 hypothetical protein 11 5 Op 4 . 
- CDS 12140 - 13918 1043 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 13951 - 14010 8.7 - Term 13985 - 14021 3.2 12 6 Op 1 3/0.032 - CDS 14048 - 15754 1916 ## COG1960 Acyl-CoA dehydrogenases 13 6 Op 2 29/0.000 - CDS 15761 - 16780 970 ## COG2025 Electron transfer flavoprotein, alpha subunit 14 6 Op 3 . - CDS 16783 - 17655 1060 ## COG2086 Electron transfer flavoprotein, beta subunit - Prom 17747 - 17806 3.2 - Term 17855 - 17910 2.7 15 7 Op 1 . - CDS 17918 - 18448 356 ## PROTEIN SUPPORTED gi|229873878|ref|ZP_04493445.1| acetyltransferase, ribosomal protein N-acetylase 16 7 Op 2 . - CDS 18457 - 20331 1812 ## COG3669 Alpha-L-fucosidase 17 7 Op 3 . - CDS 20371 - 21051 496 ## BT_1803 hypothetical protein - Prom 21101 - 21160 4.4 - Term 21121 - 21159 6.9 18 8 Op 1 . - CDS 21188 - 24034 2529 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 19 8 Op 2 . - CDS 24046 - 25359 657 ## BT_1798 hypothetical protein 20 8 Op 3 . - CDS 25390 - 26592 909 ## BT_1798 hypothetical protein - Prom 26669 - 26728 6.4 - Term 27344 - 27383 -0.2 21 9 Tu 1 . - CDS 27398 - 27565 130 ## gi|237723355|ref|ZP_04553836.1| conserved hypothetical protein - Prom 27606 - 27665 5.3 + Prom 27560 - 27619 6.2 22 10 Op 1 . + CDS 27641 - 28126 438 ## BT_3497 hypothetical protein 23 10 Op 2 . + CDS 28145 - 30139 1972 ## BT_1796 hypothetical protein + Term 30186 - 30220 0.1 + Prom 30158 - 30217 2.7 24 11 Op 1 9/0.000 + CDS 30238 - 31452 821 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 25 11 Op 2 . + CDS 31449 - 32177 515 ## COG3279 Response regulator of the LytR/AlgR family + Term 32365 - 32403 -0.9 26 12 Tu 1 . - CDS 32182 - 33366 1320 ## COG3579 Aminopeptidase C - Prom 33428 - 33487 2.3 27 13 Op 1 . - CDS 33501 - 34706 902 ## BF3189 hypothetical protein 28 13 Op 2 . - CDS 34703 - 36028 805 ## BT_1786 hypothetical protein + Prom 36095 - 36154 4.9 29 14 Tu 1 . 
+ CDS 36221 - 37279 1029 ## BF3356 hypothetical protein + Term 37441 - 37482 4.1 30 15 Tu 1 . - CDS 37695 - 38270 469 ## BT_1784 hypothetical protein - Prom 38293 - 38352 2.7 + Prom 38230 - 38289 4.8 31 16 Tu 1 . + CDS 38448 - 39293 697 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily - Term 39032 - 39068 -0.9 32 17 Tu 1 . - CDS 39285 - 40307 778 ## BT_1767 hypothetical protein - Prom 40443 - 40502 5.3 + Prom 40413 - 40472 5.8 33 18 Op 1 . + CDS 40563 - 42674 1517 ## BT_1781 xylosidase/arabinosidase 34 18 Op 2 . + CDS 42704 - 45568 2830 ## COG1472 Beta-glucosidase-related glycosidases 35 18 Op 3 . + CDS 45599 - 47314 1219 ## BT_1779 sialic acid-specific 9-O-acetylesterase 36 18 Op 4 . + CDS 47349 - 49046 1264 ## BDI_3065 beta-glycosidase 37 18 Op 5 . + CDS 49095 - 49436 281 ## gi|260173385|ref|ZP_05759797.1| beta-galactosidase 38 18 Op 6 . + CDS 49501 - 52062 2341 ## COG1472 Beta-glucosidase-related glycosidases + Term 52077 - 52145 11.2 39 19 Tu 1 . + CDS 52151 - 54730 1634 ## BT_1777 hypothetical protein + Term 54794 - 54833 9.0 - Term 54769 - 54829 11.2 40 20 Tu 1 . - CDS 54885 - 58586 3497 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 58753 - 58812 7.0 + Prom 58672 - 58731 8.1 41 21 Tu 1 . + CDS 58893 - 59123 67 ## 42 22 Op 1 . - CDS 59090 - 61795 2677 ## BVU_1478 hypothetical protein 43 22 Op 2 . - CDS 61843 - 63672 1560 ## BT_1775 hypothetical protein 44 22 Op 3 . - CDS 63703 - 65064 1155 ## BT_1772 hypothetical protein 45 22 Op 4 . - CDS 65087 - 67081 2076 ## BT_1773 hypothetical protein 46 22 Op 5 . - CDS 67087 - 70365 3056 ## BT_1774 hypothetical protein - Prom 70391 - 70450 5.1 47 23 Tu 1 . - CDS 70815 - 73397 1348 ## BT_1770 hypothetical protein - Prom 73428 - 73487 7.6 - Term 73514 - 73574 8.6 48 24 Op 1 . - CDS 73596 - 75857 2023 ## COG3537 Putative alpha-1,2-mannosidase 49 24 Op 2 . - CDS 75893 - 79135 2776 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 79155 - 79214 5.1 + Prom 79417 - 79476 4.2 50 25 Op 1 . 
+ CDS 79516 - 82026 1599 ## BT_1781 xylosidase/arabinosidase + Prom 82166 - 82225 4.3 51 25 Op 2 . + CDS 82247 - 83020 276 ## COG0500 SAM-dependent methyltransferases + Term 83077 - 83129 10.4 + Prom 83082 - 83141 4.3 52 26 Tu 1 . + CDS 83161 - 85041 1291 ## COG1621 Beta-fructosidases (levanase/invertase) + Term 85072 - 85116 5.2 53 27 Op 1 . - CDS 85117 - 85992 572 ## COG2017 Galactose mutarotase and related enzymes 54 27 Op 2 . - CDS 86021 - 86893 725 ## BDI_1888 putative DNA repair ATPase - Prom 86973 - 87032 4.2 + Prom 87090 - 87149 8.5 55 28 Op 1 . + CDS 87173 - 90298 3302 ## BT_1763 hypothetical protein 56 28 Op 2 . + CDS 90326 - 92038 1759 ## BT_1762 hypothetical protein 57 28 Op 3 . + CDS 92065 - 93447 1174 ## BT_1761 hypothetical protein 58 28 Op 4 . + CDS 93465 - 95033 1531 ## BT_1760 glycosylhydrolase + Term 95073 - 95113 4.2 59 29 Tu 1 . + CDS 95134 - 96969 1507 ## COG1621 Beta-fructosidases (levanase/invertase) 60 30 Op 1 2/0.065 + CDS 97354 - 98523 1109 ## COG0738 Fucose permease 61 30 Op 2 . + CDS 98559 - 99446 970 ## COG0524 Sugar kinases, ribokinase family + Term 99466 - 99517 12.4 + Prom 99497 - 99556 2.4 62 31 Tu 1 . + CDS 99596 - 100714 1065 ## COG2849 Uncharacterized protein conserved in bacteria + Term 100746 - 100777 1.8 + Prom 100778 - 100837 5.6 63 32 Op 1 . + CDS 101040 - 106625 3597 ## COG2373 Large extracellular alpha-helical protein 64 32 Op 2 . + CDS 106618 - 109500 2597 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 109501 - 109549 0.2 65 32 Op 3 . + CDS 109579 - 109917 235 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase + Prom 110001 - 110060 4.5 66 33 Op 1 16/0.000 + CDS 110216 - 111442 1276 ## COG4175 ABC-type proline/glycine betaine transport system, ATPase component 67 33 Op 2 14/0.000 + CDS 111439 - 112263 771 ## COG4176 ABC-type proline/glycine betaine transport system, permease component 68 33 Op 3 . 
+ CDS 112281 - 113153 823 ## COG2113 ABC-type proline/glycine betaine transport systems, periplasmic components + Term 113207 - 113254 10.1 - Term 113188 - 113249 14.5 69 34 Tu 1 . - CDS 113267 - 114469 938 ## BF3314 hypothetical protein - Prom 114654 - 114713 6.1 - Term 114649 - 114703 7.8 70 35 Op 1 . - CDS 114718 - 118269 3556 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 71 35 Op 2 . - CDS 118302 - 119480 851 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 119610 - 119669 8.6 + Prom 119499 - 119558 5.9 72 36 Tu 1 . + CDS 119643 - 120842 614 ## BT_1745 hypothetical protein - Term 120931 - 120970 4.8 73 37 Tu 1 . - CDS 121075 - 122526 1172 ## COG1649 Uncharacterized protein conserved in bacteria - Prom 122594 - 122653 5.7 + Prom 122532 - 122591 2.9 74 38 Op 1 . + CDS 122635 - 124065 1106 ## COG1966 Carbon starvation protein, predicted membrane protein 75 38 Op 2 . + CDS 124062 - 124292 240 ## BT_1741 hypothetical protein 76 38 Op 3 . + CDS 124297 - 124782 665 ## BT_1740 hypothetical protein + Term 124810 - 124852 7.2 - Term 124798 - 124840 8.5 77 39 Tu 1 . - CDS 124901 - 127672 2748 ## COG0178 Excinuclease ATPase subunit - Prom 127738 - 127797 3.6 + Prom 127680 - 127739 4.9 78 40 Tu 1 . + CDS 127767 - 129275 1256 ## COG1649 Uncharacterized protein conserved in bacteria + Term 129297 - 129349 5.5 - Term 129285 - 129337 5.5 79 41 Op 1 7/0.000 - CDS 129383 - 129931 441 ## COG2059 Chromate transport protein ChrA - Prom 129955 - 130014 6.7 80 41 Op 2 . - CDS 130016 - 130558 380 ## COG2059 Chromate transport protein ChrA 81 41 Op 3 . - CDS 130555 - 134559 2970 ## COG5002 Signal transduction histidine kinase - Prom 134606 - 134665 5.3 - Term 134767 - 134808 8.0 82 42 Tu 1 . - CDS 134828 - 138532 3898 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain - Prom 138700 - 138759 5.6 + Prom 138490 - 138549 2.9 83 43 Op 1 . 
+ CDS 138692 - 139444 533 ## COG1280 Putative threonine efflux protein 84 43 Op 2 . + CDS 139455 - 139997 680 ## BT_1731 hypothetical protein 85 43 Op 3 . + CDS 139998 - 140858 684 ## COG1091 dTDP-4-dehydrorhamnose reductase 86 43 Op 4 . + CDS 140935 - 142509 1449 ## COG4108 Peptide chain release factor RF-3 87 43 Op 5 . + CDS 142588 - 143178 511 ## BT_1728 RNA polymerase ECF-type sigma factor 88 43 Op 6 . + CDS 143189 - 144028 751 ## BT_1727 putative transmembrane sensor + Term 144034 - 144071 6.2 + Prom 144758 - 144817 7.9 89 44 Op 1 . + CDS 144873 - 145133 348 ## BT_1698 hypothetical protein 90 44 Op 2 4/0.000 + CDS 145160 - 146995 1994 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 91 44 Op 3 . + CDS 146995 - 148230 1260 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit + Term 148256 - 148305 9.6 - Term 148247 - 148290 7.4 92 45 Op 1 9/0.000 - CDS 148338 - 149609 1307 ## COG1538 Outer membrane protein 93 45 Op 2 27/0.000 - CDS 149658 - 152690 2694 ## COG0841 Cation/multidrug efflux pump 94 45 Op 3 . - CDS 152751 - 153803 788 ## COG0845 Membrane-fusion protein - Term 154383 - 154422 6.3 95 46 Tu 1 . - CDS 154518 - 154925 322 ## gi|260173444|ref|ZP_05759856.1| hypothetical protein BacD2_16350 - Prom 155090 - 155149 3.6 - Term 155092 - 155156 17.3 96 47 Tu 1 . - CDS 155170 - 155424 437 ## PROTEIN SUPPORTED gi|29347102|ref|NP_810605.1| 50S ribosomal protein L31 type B - Prom 155549 - 155608 3.8 + Prom 155773 - 155832 4.2 97 48 Tu 1 . + CDS 155875 - 156879 1159 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 156905 - 156952 10.1 - Term 157005 - 157069 4.6 98 49 Op 1 . - CDS 157085 - 158245 1216 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 99 49 Op 2 . - CDS 158247 - 158678 572 ## COG1038 Pyruvate carboxylase 100 49 Op 3 . 
- CDS 158702 - 159622 928 ## BT_1687 hypothetical protein 101 49 Op 4 2/0.065 - CDS 159654 - 161207 1780 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 102 49 Op 5 . - CDS 161236 - 161640 456 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 161668 - 161727 6.3 103 50 Tu 1 . - CDS 161816 - 162814 507 ## BT_1684 hypothetical protein - Prom 162976 - 163035 5.6 + Prom 162975 - 163034 4.5 104 51 Op 1 . + CDS 163067 - 166183 2541 ## BVU_2924 hypothetical protein 105 51 Op 2 . + CDS 166197 - 167993 1365 ## Cpin_4993 RagB/SusD domain protein + Term 168038 - 168083 10.0 + Prom 168051 - 168110 7.3 106 52 Tu 1 . + CDS 168133 - 168675 478 ## COG0288 Carbonic anhydrase + Term 168705 - 168751 2.5 107 53 Tu 1 . - CDS 168741 - 168965 109 ## gi|260173456|ref|ZP_05759868.1| hypothetical protein BacD2_16410 - Prom 169003 - 169062 3.0 + Prom 168794 - 168853 6.2 108 54 Op 1 . + CDS 169088 - 169624 499 ## COG0778 Nitroreductase 109 54 Op 2 . + CDS 169627 - 169962 241 ## BT_1679 hypothetical protein 110 54 Op 3 . + CDS 169970 - 170764 789 ## BT_1678 hypothetical protein 111 54 Op 4 . + CDS 170745 - 171266 387 ## COG1778 Low specificity phosphatase (HAD superfamily) 112 54 Op 5 . + CDS 171338 - 171919 557 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation + Term 172010 - 172068 14.5 - Term 171998 - 172056 13.6 113 55 Op 1 . - CDS 172108 - 174312 1811 ## COG0457 FOG: TPR repeat 114 55 Op 2 . - CDS 174323 - 175507 947 ## COG2311 Predicted membrane protein - Prom 175562 - 175621 2.4 - Term 175565 - 175604 -0.8 115 56 Op 1 . - CDS 175663 - 176706 1036 ## BT_1674 hypothetical protein 116 56 Op 2 . - CDS 176717 - 177685 752 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 177711 - 177770 4.1 - Term 177716 - 177756 7.0 117 57 Op 1 . 
- CDS 177792 - 179051 1678 ## COG0126 3-phosphoglycerate kinase - Prom 179071 - 179130 5.7 118 57 Op 2 1/0.161 - CDS 179145 - 179822 628 ## COG0177 Predicted EndoIII-related endonuclease 119 57 Op 3 . - CDS 179819 - 181018 832 ## COG0477 Permeases of the major facilitator superfamily 120 58 Tu 1 . - CDS 181167 - 182186 1184 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 182232 - 182291 3.7 + Prom 182188 - 182247 5.1 121 59 Tu 1 . + CDS 182307 - 183323 730 ## BT_1668 hypothetical protein 122 60 Tu 1 . + CDS 184215 - 185081 556 ## BDI_3189 hypothetical protein + Prom 185125 - 185184 3.0 123 61 Tu 1 . + CDS 185210 - 185422 404 ## BT_1667 hypothetical protein - Term 185398 - 185428 1.2 124 62 Op 1 . - CDS 185496 - 186065 439 ## BF3225 hypothetical protein 125 62 Op 2 . - CDS 186087 - 188294 1733 ## FP0382 hypothetical protein 126 62 Op 3 . - CDS 188291 - 190480 1137 ## BT_3282 hypothetical protein 127 62 Op 4 . - CDS 190525 - 192090 1194 ## BT_1937 hypothetical protein - Prom 192114 - 192173 3.8 - Term 192118 - 192171 3.7 128 63 Op 1 . - CDS 192184 - 193371 988 ## Fjoh_0764 hypothetical protein 129 63 Op 2 . - CDS 193380 - 196334 2000 ## BT_1939 putative outer membrane receptor 130 63 Op 3 . - CDS 196382 - 197452 761 ## COG3712 Fe2+-dicitrate sensor, membrane component 131 63 Op 4 . - CDS 197532 - 198191 464 ## BVU_1982 RNA polymerase ECF-type sigma factor - Prom 198355 - 198414 4.7 - TRNA 198385 - 198461 67.0 # Ala GGC 0 0 + Prom 198478 - 198537 3.8 132 64 Op 1 . + CDS 198577 - 198879 247 ## BT_1665 hypothetical protein 133 64 Op 2 . + CDS 198933 - 199442 440 ## COG0817 Holliday junction resolvasome, endonuclease subunit 134 64 Op 3 . + CDS 199494 - 201500 1499 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases + Term 201558 - 201597 1.3 135 65 Tu 1 . - CDS 201581 - 205714 2683 ## COG0642 Signal transduction histidine kinase - Prom 205755 - 205814 7.0 136 66 Op 1 . 
+ CDS 206110 - 209283 3366 ## Fjoh_4951 TonB-dependent receptor, plug 137 66 Op 2 . + CDS 209303 - 210949 1476 ## Cpin_6367 RagB/SusD domain protein 138 66 Op 3 . + CDS 210964 - 212367 1262 ## Cpin_6366 hypothetical protein 139 66 Op 4 . + CDS 212392 - 214614 2102 ## COG3693 Beta-1,4-xylanase + Prom 214741 - 214800 6.6 140 67 Op 1 . + CDS 214947 - 216605 1439 ## BVU_0028 sialic acid-specific 9-O-acetylesterase 141 67 Op 2 . + CDS 216622 - 218037 1026 ## COG2211 Na+/melibiose symporter and related transporters 142 67 Op 3 . + CDS 218077 - 219207 898 ## COG3693 Beta-1,4-xylanase 143 67 Op 4 . + CDS 219234 - 220211 927 ## BVU_0040 beta-xylosidase/alpha-L-arabinofuranosidase + Prom 220290 - 220349 6.4 144 68 Tu 1 . + CDS 220377 - 222512 1804 ## PROTEIN SUPPORTED gi|90020672|ref|YP_526499.1| ribosomal protein S18 + Prom 222527 - 222586 5.7 145 69 Tu 1 . + CDS 222620 - 224074 202 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 146 70 Tu 1 . - CDS 224110 - 225930 1281 ## COG0642 Signal transduction histidine kinase - Prom 226013 - 226072 3.1 + Prom 225939 - 225998 4.2 147 71 Tu 1 . + CDS 226068 - 226214 123 ## gi|160886371|ref|ZP_02067374.1| hypothetical protein BACOVA_04381 + Prom 226224 - 226283 4.7 148 72 Op 1 . + CDS 226423 - 226926 299 ## Cpin_0044 RNA polymerase, sigma-24 subunit, ECF subfamily 149 72 Op 2 . + CDS 226993 - 228060 936 ## COG3712 Fe2+-dicitrate sensor, membrane component 150 72 Op 3 . + CDS 228080 - 231676 2774 ## Coch_0666 TonB-dependent receptor plug 151 72 Op 4 . + CDS 231697 - 233124 1464 ## Fjoh_3801 RagB/SusD domain-containing protein 152 72 Op 5 . + CDS 233144 - 235648 2078 ## Coch_0557 hypothetical protein + Prom 235746 - 235805 7.6 153 73 Op 1 . + CDS 235956 - 236702 784 ## COG0588 Phosphoglycerate mutase 1 154 73 Op 2 . + CDS 236730 - 237782 1093 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes + Term 237784 - 237828 -0.3 155 73 Op 3 . 
+ CDS 237850 - 238506 888 ## COG0176 Transaldolase + Term 238537 - 238598 4.5 - Term 238527 - 238584 4.5 156 74 Tu 1 . - CDS 238614 - 242561 2080 ## COG0642 Signal transduction histidine kinase - Prom 242584 - 242643 5.8 + Prom 242978 - 243037 4.3 157 75 Op 1 . + CDS 243070 - 244500 1235 ## COG1660 Predicted P-loop-containing kinase 158 75 Op 2 . + CDS 244537 - 245286 608 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) + Term 245294 - 245329 5.1 + Prom 245290 - 245349 4.8 159 76 Tu 1 . + CDS 245395 - 247377 1387 ## COG0642 Signal transduction histidine kinase + Term 247477 - 247517 1.0 + Prom 247842 - 247901 4.9 160 77 Op 1 . + CDS 248041 - 248190 60 ## 161 77 Op 2 . + CDS 248187 - 248609 319 ## BVU_2957 hypothetical protein + Term 248660 - 248707 6.4 + Prom 248682 - 248741 7.4 162 78 Op 1 . + CDS 248895 - 250535 1264 ## BT_1632 chitinase 163 78 Op 2 . + CDS 250567 - 253887 3359 ## BT_3174 hypothetical protein 164 78 Op 3 . + CDS 253906 - 255750 1520 ## BT_3175 hypothetical protein 165 78 Op 4 . + CDS 255796 - 257640 1181 ## BT_1629 hypothetical protein + Term 257687 - 257735 12.0 + Prom 257665 - 257724 3.6 166 79 Op 1 . + CDS 257843 - 259426 1264 ## COG3119 Arylsulfatase A and related enzymes 167 79 Op 2 1/0.161 + CDS 259445 - 261769 1531 ## COG3525 N-acetyl-beta-hexosaminidase + Prom 261810 - 261869 5.8 168 80 Tu 1 . + CDS 261937 - 265023 2840 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 265026 - 265085 3.8 169 81 Tu 1 . + CDS 265168 - 266985 1402 ## COG3669 Alpha-L-fucosidase + Prom 267012 - 267071 2.2 170 82 Tu 1 . + CDS 267127 - 268302 957 ## COG0668 Small-conductance mechanosensitive channel 171 83 Op 1 . - CDS 268532 - 269635 675 ## BT_1616 hypothetical protein - Term 269636 - 269683 1.6 172 83 Op 2 . 
- CDS 269712 - 271172 1314 ## COG2195 Di- and tripeptidases - Prom 271252 - 271311 2.0 + Prom 271139 - 271198 7.0 173 84 Op 1 . + CDS 271306 - 271539 240 ## BT_1614 hypothetical protein 174 84 Op 2 . + CDS 271576 - 272169 540 ## BT_1613 hypothetical protein + Term 272182 - 272222 3.8 - Term 272168 - 272210 4.2 175 85 Op 1 . - CDS 272270 - 272611 359 ## BT_1612 hypothetical protein 176 85 Op 2 . - CDS 272615 - 272911 320 ## BF3072 putative septum formation initiator-related protein - Prom 272942 - 273001 3.8 + Prom 272973 - 273032 3.8 177 86 Tu 1 . + CDS 273068 - 274933 1434 ## COG2812 DNA polymerase III, gamma/tau subunits - Term 274824 - 274851 -0.8 178 87 Op 1 . - CDS 274925 - 275302 620 ## PROTEIN SUPPORTED gi|160886338|ref|ZP_02067341.1| hypothetical protein BACOVA_04348 179 87 Op 2 . - CDS 275304 - 275537 252 ## COG1983 Putative stress-responsive transcriptional regulator - Prom 275627 - 275686 6.0 - Term 276032 - 276071 0.1 180 88 Tu 1 . - CDS 276186 - 276791 395 ## gi|260173530|ref|ZP_05759942.1| hypothetical protein BacD2_16780 - Prom 276848 - 276907 6.4 - Term 276847 - 276882 2.4 181 89 Op 1 . - CDS 276980 - 277366 220 ## gi|298479816|ref|ZP_06998016.1| conserved hypothetical protein 182 89 Op 2 . - CDS 277359 - 277793 205 ## gi|260173532|ref|ZP_05759944.1| hypothetical protein BacD2_16790 183 89 Op 3 . - CDS 277768 - 278172 213 ## gi|260173533|ref|ZP_05759945.1| hypothetical protein BacD2_16795 184 89 Op 4 . - CDS 278174 - 278539 175 ## gi|260173534|ref|ZP_05759946.1| hypothetical protein BacD2_16800 185 89 Op 5 . - CDS 278579 - 278854 236 ## gi|260173535|ref|ZP_05759947.1| hypothetical protein BacD2_16805 186 89 Op 6 . - CDS 278867 - 279502 307 ## gi|260173536|ref|ZP_05759948.1| hypothetical protein BacD2_16810 - Prom 279535 - 279594 5.4 187 90 Op 1 . - CDS 280444 - 280899 287 ## COG2003 DNA repair proteins 188 90 Op 2 . - CDS 280908 - 281993 450 ## BF3462 hypothetical protein 189 90 Op 3 . 
- CDS 281959 - 282846 457 ## Dred_0498 hypothetical protein - Prom 282867 - 282926 1.8 - Term 282879 - 282919 6.1 190 90 Op 4 . - CDS 282942 - 283211 193 ## gi|260173540|ref|ZP_05759952.1| hypothetical protein BacD2_16830 - Prom 283437 - 283496 8.6 - Term 283494 - 283534 6.0 191 91 Op 1 . - CDS 283572 - 285209 607 ## CHU_0049 hypothetical protein 192 91 Op 2 . - CDS 285235 - 287502 825 ## BVU_3124 hypothetical protein - Prom 287540 - 287599 6.1 - Term 287540 - 287594 6.3 193 92 Tu 1 . - CDS 287624 - 288898 519 ## BT_2980 transposase - Prom 288948 - 289007 2.4 194 93 Op 1 . - CDS 289356 - 289466 86 ## 195 93 Op 2 . - CDS 289552 - 289854 216 ## gi|260173544|ref|ZP_05759956.1| hypothetical protein BacD2_16850 196 93 Op 3 . - CDS 289864 - 290232 77 ## gi|260173545|ref|ZP_05759957.1| hypothetical protein BacD2_16855 197 93 Op 4 . - CDS 290233 - 290637 181 ## gi|260173546|ref|ZP_05759958.1| hypothetical protein BacD2_16860 198 93 Op 5 . - CDS 290721 - 291395 251 ## gi|260173547|ref|ZP_05759959.1| hypothetical protein BacD2_16865 - Prom 291430 - 291489 2.6 199 94 Op 1 . - CDS 292368 - 292823 204 ## COG2003 DNA repair proteins 200 94 Op 2 . - CDS 292832 - 293944 515 ## BF3462 hypothetical protein 201 94 Op 3 . - CDS 293949 - 294848 366 ## Alvin_3206 AAA ATPase - Prom 294870 - 294929 2.3 - Term 294876 - 294918 8.1 202 95 Tu 1 . - CDS 294942 - 295211 269 ## gi|260173551|ref|ZP_05759963.1| hypothetical protein BacD2_16885 - Prom 295436 - 295495 6.8 - Term 295483 - 295523 2.2 203 96 Op 1 . - CDS 295549 - 296217 320 ## gi|260173552|ref|ZP_05759964.1| hypothetical protein BacD2_16890 - Prom 296357 - 296416 2.8 - Term 296246 - 296283 2.0 204 96 Op 2 . - CDS 296460 - 298568 685 ## BVU_3124 hypothetical protein - Prom 298605 - 298664 8.2 - Term 298605 - 298660 11.1 205 97 Op 1 . - CDS 298670 - 300079 529 ## BT_2980 transposase - TRNA 300107 - 300182 69.6 # His GTG 0 0 206 97 Op 2 . 
- CDS 300248 - 301396 639 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 207 97 Op 3 . - CDS 301393 - 302274 956 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase - Prom 302302 - 302361 3.2 - Term 302412 - 302445 3.1 208 98 Tu 1 . - CDS 302481 - 303803 1774 ## COG0541 Signal recognition particle GTPase - Prom 303826 - 303885 3.4 209 99 Op 1 . - CDS 303908 - 305230 1182 ## COG0534 Na+-driven multidrug efflux pump 210 99 Op 2 10/0.000 - CDS 305270 - 307225 1389 ## COG0642 Signal transduction histidine kinase 211 99 Op 3 . - CDS 307222 - 309072 763 ## COG0642 Signal transduction histidine kinase - Prom 309092 - 309151 2.7 - Term 309105 - 309158 11.0 212 100 Tu 1 . - CDS 309163 - 310629 968 ## COG3119 Arylsulfatase A and related enzymes - Prom 310813 - 310872 3.3 - Term 310834 - 310893 13.8 213 101 Tu 1 . - CDS 310914 - 313211 2227 ## COG1158 Transcription termination factor - Prom 313439 - 313498 7.1 + Prom 313174 - 313233 6.5 214 102 Tu 1 . + CDS 313449 - 314285 582 ## BT_1594 putative ferredoxin + Term 314290 - 314339 1.0 215 103 Tu 1 . - CDS 314270 - 315550 750 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 315628 - 315687 6.5 + Prom 315581 - 315640 9.5 216 104 Op 1 . + CDS 315696 - 318182 1950 ## COG0370 Fe2+ transport system protein B 217 104 Op 2 . + CDS 318179 - 318394 116 ## BF1354 hypothetical protein + Term 318468 - 318537 26.3 + TRNA 318446 - 318520 52.4 # Cys GCA 0 0 + Prom 318449 - 318508 78.8 218 105 Op 1 . + CDS 318738 - 319391 551 ## BF1419 hypothetical protein + Term 319437 - 319478 7.1 + Prom 319469 - 319528 5.3 219 105 Op 2 . + CDS 319592 - 322615 2181 ## COG0642 Signal transduction histidine kinase 220 105 Op 3 . + CDS 322656 - 323006 114 ## BVU_3167 hypothetical protein 221 105 Op 4 . 
+ CDS 322960 - 323535 391 ## BVU_3167 hypothetical protein + Term 323555 - 323594 1.5 - TRNA 323595 - 323668 72.9 # Thr TGT 0 0 + Prom 323689 - 323748 7.4 222 106 Tu 1 . + CDS 323778 - 324365 512 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 324412 - 324467 11.2 - Term 324406 - 324450 4.6 223 107 Tu 1 . - CDS 324510 - 325241 566 ## COG0500 SAM-dependent methyltransferases - Prom 325333 - 325392 5.1 + Prom 325211 - 325270 5.3 224 108 Tu 1 . + CDS 325337 - 325522 205 ## + Term 325621 - 325655 -0.5 - Term 325673 - 325740 26.5 225 109 Op 1 . - CDS 325741 - 326118 194 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 226 109 Op 2 . - CDS 326133 - 327278 1023 ## BT_1579 hypothetical protein 227 109 Op 3 . - CDS 327292 - 328023 573 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 228 109 Op 4 . - CDS 328091 - 328738 271 ## COG0259 Pyridoxamine-phosphate oxidase 229 109 Op 5 . - CDS 328782 - 329486 647 ## COG1741 Pirin-related protein 230 109 Op 6 . - CDS 329550 - 330551 950 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 330735 - 330794 6.5 + Prom 330618 - 330677 3.8 231 110 Tu 1 . + CDS 330760 - 332205 1224 ## COG2067 Long-chain fatty acid transport protein + Term 332227 - 332280 14.2 - Term 332221 - 332261 -0.8 232 111 Tu 1 . - CDS 332289 - 333578 1154 ## BT_1573 hypothetical protein - Prom 333748 - 333807 5.1 233 112 Tu 1 . + CDS 333742 - 334248 456 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 334290 - 334347 14.6 + Prom 334384 - 334443 6.6 234 113 Op 1 . + CDS 334488 - 335168 704 ## COG1738 Uncharacterized conserved protein 235 113 Op 2 . + CDS 335177 - 335836 677 ## COG0603 Predicted PP-loop superfamily ATPase + Term 336042 - 336075 1.1 236 114 Op 1 . + CDS 336408 - 336872 432 ## COG0780 Enzyme related to GTP cyclohydrolase I 237 114 Op 2 . + CDS 336910 - 337545 509 ## BT_1563 hypothetical protein - Term 337519 - 337559 4.2 238 115 Tu 1 . 
- CDS 337606 - 338079 345 ## COG1576 Uncharacterized conserved protein - Prom 338161 - 338220 5.5 + Prom 338074 - 338133 5.5 239 116 Op 1 . + CDS 338207 - 338599 334 ## BT_1561 hypothetical protein 240 116 Op 2 . + CDS 338592 - 339440 851 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 + Prom 339522 - 339581 4.6 241 117 Op 1 . + CDS 339686 - 340237 193 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 242 117 Op 2 . + CDS 340221 - 341270 796 ## BT_1558 hypothetical protein 243 117 Op 3 . + CDS 341300 - 341806 455 ## BT_1557 hypothetical protein + Prom 341859 - 341918 2.1 244 118 Tu 1 . + CDS 341984 - 342709 686 ## BT_1556 hypothetical protein + Prom 342786 - 342845 4.4 245 119 Tu 1 . + CDS 342879 - 343817 834 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 246 120 Tu 1 . - CDS 343966 - 345072 1010 ## COG0686 Alanine dehydrogenase - Term 345104 - 345144 6.1 247 121 Tu 1 . - CDS 345203 - 345886 480 ## Slin_4128 serine/threonine protein kinase - Term 346235 - 346275 2.0 248 122 Op 1 . - CDS 346327 - 347664 860 ## BT_1553 hypothetical protein 249 122 Op 2 . - CDS 347675 - 351022 2196 ## BT_1552 hypothetical protein 250 122 Op 3 . - CDS 351055 - 353688 2393 ## BT_1551 hypothetical protein - Prom 353765 - 353824 7.1 + Prom 353661 - 353720 7.6 251 123 Op 1 . + CDS 353887 - 355539 1230 ## COG0739 Membrane proteins related to metalloendopeptidases 252 123 Op 2 . + CDS 355565 - 357202 1490 ## COG4690 Dipeptidase + Prom 357326 - 357385 9.9 253 124 Tu 1 . + CDS 357412 - 359157 2077 ## COG1109 Phosphomannomutase + Term 359213 - 359259 10.5 254 125 Op 1 . - CDS 359517 - 360692 903 ## COG2311 Predicted membrane protein 255 125 Op 2 . - CDS 360702 - 361751 361 ## COG1835 Predicted acyltransferases 256 125 Op 3 . - CDS 361831 - 362394 430 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 257 125 Op 4 . 
- CDS 362401 - 363189 458 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 363212 - 363271 2.1 258 125 Op 5 . - CDS 363273 - 364973 1105 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 365173 - 365232 3.4 259 126 Tu 1 . - CDS 365393 - 366766 449 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 366788 - 366847 2.5 260 127 Op 1 . - CDS 366867 - 367448 552 ## COG3059 Predicted membrane protein 261 127 Op 2 . - CDS 367537 - 368376 704 ## COG2207 AraC-type DNA-binding domain-containing proteins 262 127 Op 3 . - CDS 368379 - 368861 392 ## COG0295 Cytidine deaminase - Prom 368969 - 369028 5.3 + Prom 368777 - 368836 4.4 263 128 Op 1 . + CDS 368972 - 369877 927 ## COG1705 Muramidase (flagellum-specific) + Term 369917 - 369951 3.0 264 128 Op 2 . + CDS 370073 - 371449 1325 ## COG1252 NADH dehydrogenase, FAD-containing subunit + Term 371499 - 371542 8.3 + Prom 371514 - 371573 4.2 265 129 Op 1 . + CDS 371631 - 372878 1290 ## BT_1536 ABC transporter permease 266 129 Op 2 . + CDS 372896 - 373561 297 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 267 129 Op 3 . + CDS 373629 - 374255 562 ## BT_1534 hypothetical protein 268 129 Op 4 . + CDS 374312 - 375589 684 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 269 129 Op 5 . + CDS 375594 - 376859 1021 ## BVU_2350 ABC transporter permease 270 129 Op 6 . + CDS 376892 - 378181 1045 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 271 129 Op 7 . + CDS 378211 - 379464 835 ## BT_1532 ABC transporter permease + Prom 379469 - 379528 3.5 272 129 Op 8 . + CDS 379549 - 382158 1694 ## COG0642 Signal transduction histidine kinase + Term 382165 - 382209 5.3 + Prom 382175 - 382234 5.1 273 130 Op 1 . 
+ CDS 382288 - 383766 1430 ## BT_1530 putative outer membrane protein OprM precursor + Term 383792 - 383847 -0.9 + Prom 383778 - 383837 3.6 274 130 Op 2 13/0.000 + CDS 383860 - 385209 1457 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 275 130 Op 3 . + CDS 385236 - 386528 1095 ## COG0642 Signal transduction histidine kinase + Prom 386531 - 386590 3.8 276 130 Op 4 . + CDS 386626 - 389178 2310 ## BT_1527 hypothetical protein + Prom 389199 - 389258 3.1 277 131 Tu 1 . + CDS 389287 - 390576 1426 ## COG1260 Myo-inositol-1-phosphate synthase + Term 390602 - 390659 2.4 + Prom 390587 - 390646 4.2 278 132 Op 1 . + CDS 390739 - 391218 494 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 279 132 Op 2 . + CDS 391233 - 391700 377 ## BT_1524 hypothetical protein 280 132 Op 3 . + CDS 391774 - 392424 764 ## COG0558 Phosphatidylglycerophosphate synthase 281 132 Op 4 . + CDS 392477 - 393394 672 ## BT_1522 putative aureobasidin A resistance protein 282 132 Op 5 . + CDS 393442 - 394452 632 ## BT_1521 hypothetical protein 283 132 Op 6 . + CDS 394515 - 395687 1192 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family + Term 395704 - 395767 21.4 + Prom 395769 - 395828 8.9 284 133 Op 1 . + CDS 395908 - 399045 2770 ## BT_3271 hypothetical protein 285 133 Op 2 . + CDS 399054 - 400613 1340 ## BT_3272 putative outer membrane protein 286 133 Op 3 . + CDS 400626 - 401351 593 ## BT_3273 hypothetical protein 287 133 Op 4 . + CDS 401361 - 403001 1340 ## BT_3274 hypothetical protein 288 133 Op 5 . + CDS 403054 - 404025 952 ## gi|260173639|ref|ZP_05760051.1| hypothetical protein BacD2_17325 289 133 Op 6 . + CDS 404068 - 406635 2514 ## Cpin_5142 hypothetical protein + Term 406668 - 406728 10.1 + Prom 406637 - 406696 4.1 290 134 Tu 1 . + CDS 406911 - 408245 846 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 408418 - 408457 -1.0 291 135 Op 1 . 
- CDS 408393 - 408842 266 ## COG3023 Negative regulator of beta-lactamase expression 292 135 Op 2 . - CDS 408861 - 409079 310 ## BT_1518 hypothetical protein - Prom 409146 - 409205 5.6 293 136 Tu 1 . - CDS 409275 - 409781 539 ## BT_1517 hypothetical protein - Prom 409885 - 409944 8.4 294 137 Op 1 . - CDS 409969 - 411348 1169 ## COG0305 Replicative DNA helicase 295 137 Op 2 . - CDS 411353 - 411976 492 ## BT_1515 hypothetical protein 296 138 Tu 1 . - CDS 412438 - 412707 263 ## BT_1514 hypothetical protein - Prom 412847 - 412906 13.8 + Prom 412767 - 412826 13.7 297 139 Tu 1 . + CDS 412935 - 413852 693 ## BT_1503 integrase + Prom 414123 - 414182 4.6 298 140 Tu 1 . + CDS 414225 - 417206 2800 ## BT_1826 hypothetical protein 299 141 Tu 1 . + CDS 417657 - 419213 1267 ## BT_0374 hypothetical protein - Term 419208 - 419255 8.2 300 142 Tu 1 . - CDS 419257 - 420831 1444 ## COG1530 Ribonucleases G and E - Prom 420980 - 421039 2.4 - Term 420905 - 420962 12.5 301 143 Tu 1 . - CDS 421115 - 421390 209 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 421465 - 421524 7.5 + Prom 421411 - 421470 6.1 302 144 Op 1 . + CDS 421602 - 422636 712 ## COG1194 A/G-specific DNA glycosylase 303 144 Op 2 . + CDS 422671 - 423222 520 ## COG0629 Single-stranded DNA-binding protein + Term 423230 - 423285 6.1 304 145 Op 1 . + CDS 423300 - 424655 1307 ## COG1253 Hemolysins and related proteins containing CBS domains 305 145 Op 2 . + CDS 424633 - 425265 364 ## BT_1495 siderophore (surfactin) biosynthesis regulatory protein - Term 425330 - 425380 1.0 306 146 Op 1 . - CDS 425427 - 425651 246 ## BT_1494 hypothetical protein 307 146 Op 2 . - CDS 425660 - 425962 298 ## COG2388 Predicted acetyltransferase - Prom 425982 - 426041 3.5 308 146 Op 3 . - CDS 426043 - 427416 1449 ## COG3033 Tryptophanase - Prom 427438 - 427497 3.7 - Term 427553 - 427596 7.1 309 147 Op 1 . - CDS 427635 - 428954 885 ## BF1581 hypothetical protein 310 147 Op 2 . 
- CDS 428995 - 431007 1395 ## COG3391 Uncharacterized conserved protein 311 147 Op 3 . - CDS 431027 - 433072 1080 ## BF1583 putative vitamin B12 receptor - Prom 433196 - 433255 3.5 - Term 434083 - 434128 12.3 312 148 Op 1 . - CDS 434132 - 435076 671 ## BT_1485 hypothetical protein - Prom 435099 - 435158 2.8 313 148 Op 2 40/0.000 - CDS 435233 - 437014 1396 ## COG0642 Signal transduction histidine kinase 314 148 Op 3 . - CDS 437042 - 437731 732 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 437753 - 437812 6.9 + Prom 437730 - 437789 7.2 315 149 Tu 1 . + CDS 437973 - 439166 904 ## BT_1481 hypothetical protein + Prom 439228 - 439287 3.5 316 150 Op 1 . + CDS 439307 - 440107 811 ## BT_1480 hypothetical protein 317 150 Op 2 . + CDS 440109 - 441977 1946 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 441981 - 442010 -0.3 + Prom 442028 - 442087 4.8 318 151 Tu 1 . + CDS 442133 - 442972 599 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Term 442864 - 442912 -1.0 319 152 Op 1 . - CDS 442974 - 443189 345 ## gi|237723012|ref|ZP_04553493.1| conserved hypothetical protein 320 152 Op 2 . - CDS 443216 - 444451 967 ## COG2407 L-fucose isomerase and related proteins - Prom 444509 - 444568 5.4 321 153 Op 1 . + CDS 444632 - 445831 1044 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 445833 - 445892 5.2 322 153 Op 2 . + CDS 445914 - 447158 907 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component + Term 447208 - 447258 1.0 + Prom 447187 - 447246 4.0 323 154 Op 1 1/0.161 + CDS 447289 - 448755 2401 ## PROTEIN SUPPORTED gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein 324 154 Op 2 . + CDS 448758 - 449207 184 ## PROTEIN SUPPORTED gi|227881166|ref|ZP_03999028.1| (SSU ribosomal protein S18P)-alanine acetyltransferase - Term 449117 - 449156 8.6 325 155 Op 1 . 
- CDS 449238 - 450410 870 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 326 155 Op 2 . - CDS 450467 - 451492 877 ## COG0628 Predicted permease 327 155 Op 3 8/0.000 - CDS 451550 - 452803 800 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation 328 155 Op 4 . - CDS 452809 - 454173 746 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 454393 - 454452 6.2 + Prom 454375 - 454434 8.7 329 156 Op 1 . + CDS 454500 - 455801 1007 ## BT_1468 putative outer membrane efflux protein 330 156 Op 2 24/0.000 + CDS 455808 - 457055 1313 ## COG0845 Membrane-fusion protein 331 156 Op 3 36/0.000 + CDS 457068 - 457784 360 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 332 156 Op 4 . + CDS 457781 - 460201 2024 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 460215 - 460257 10.1 - Term 460690 - 460751 2.1 333 157 Tu 1 . - CDS 460828 - 461391 279 ## COG3760 Uncharacterized conserved protein - Prom 461605 - 461664 8.5 + Prom 461449 - 461508 4.1 334 158 Op 1 . + CDS 461688 - 462440 750 ## COG2186 Transcriptional regulators 335 158 Op 2 . + CDS 462488 - 463504 406 ## COG3055 Uncharacterized protein conserved in bacteria 336 158 Op 3 1/0.161 + CDS 463540 - 463893 276 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 337 158 Op 4 . + CDS 463962 - 464450 336 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Prom 464484 - 464543 1.5 338 159 Op 1 . + CDS 464567 - 466207 814 ## COG4409 Neuraminidase (sialidase) 339 159 Op 2 . + CDS 466214 - 467446 841 ## COG0477 Permeases of the major facilitator superfamily 340 159 Op 3 . + CDS 467503 - 470679 2102 ## Phep_3579 TonB-dependent receptor 341 159 Op 4 . + CDS 470702 - 472162 1210 ## Phep_3578 RagB/SusD domain protein 342 159 Op 5 . 
+ CDS 472181 - 473512 736 ## COG4409 Neuraminidase (sialidase) 343 159 Op 6 . + CDS 473533 - 475710 1181 ## gi|260173693|ref|ZP_05760105.1| hypothetical protein BacD2_17605 344 159 Op 7 . + CDS 475722 - 477284 885 ## BT_0448 hypothetical protein + Prom 477318 - 477377 5.4 345 160 Op 1 . + CDS 477398 - 477940 427 ## gi|260173695|ref|ZP_05760107.1| hypothetical protein BacD2_17615 346 160 Op 2 . + CDS 477961 - 478551 442 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 478574 - 478632 4.1 - Term 478560 - 478620 14.1 347 161 Tu 1 . - CDS 478687 - 479166 519 ## BF1973 hypothetical protein + Prom 479233 - 479292 10.3 348 162 Op 1 . + CDS 479445 - 481922 2040 ## BT_1460 hypothetical protein + Term 481953 - 481994 8.3 349 162 Op 2 . + CDS 482004 - 483080 376 ## COG3275 Putative regulator of cell autolysis 350 163 Op 1 9/0.000 + CDS 483191 - 484156 379 ## COG3275 Putative regulator of cell autolysis 351 163 Op 2 . + CDS 484153 - 484869 645 ## COG3279 Response regulator of the LytR/AlgR family 352 163 Op 3 . + CDS 484934 - 485482 468 ## Cpin_3046 RNA polymerase, sigma-24 subunit, ECF subfamily + Prom 485488 - 485547 2.3 353 164 Op 1 . + CDS 485567 - 486736 819 ## COG3712 Fe2+-dicitrate sensor, membrane component 354 164 Op 2 . + CDS 486747 - 490124 3291 ## BVU_2447 hypothetical protein 355 164 Op 3 . + CDS 490151 - 491608 1500 ## BVU_2446 hypothetical protein 356 164 Op 4 1/0.161 + CDS 491633 - 492799 1176 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 357 164 Op 5 . + CDS 492820 - 493968 930 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases + Term 494160 - 494207 0.2 358 165 Op 1 . - CDS 494077 - 495327 674 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 359 165 Op 2 . - CDS 495347 - 496564 931 ## COG0535 Predicted Fe-S oxidoreductases - Prom 496696 - 496755 5.2 360 166 Tu 1 . + CDS 496458 - 496652 57 ## - Term 496700 - 496755 15.1 361 167 Tu 1 . 
- CDS 496781 - 497278 563 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 497330 - 497389 4.0 + Prom 497608 - 497667 8.2 362 168 Op 1 . + CDS 497792 - 499189 1318 ## COG1690 Uncharacterized conserved protein 363 168 Op 2 . + CDS 499232 - 499939 698 ## BF2725 hypothetical protein 364 168 Op 3 . + CDS 499942 - 501048 640 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 501128 - 501183 1.2 - Term 501378 - 501426 9.1 365 169 Tu 1 . - CDS 501460 - 503109 1155 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 503249 - 503308 6.0 - Term 503199 - 503254 -0.6 366 170 Tu 1 . - CDS 503310 - 504524 866 ## COG4833 Predicted glycosyl hydrolase - Term 504526 - 504571 3.8 367 171 Op 1 . - CDS 504602 - 505408 508 ## Cpin_4502 hypothetical protein 368 171 Op 2 . - CDS 505449 - 507503 1529 ## Cpin_4503 RagB/SusD domain protein 369 171 Op 3 . - CDS 507540 - 510719 2228 ## Cpin_4504 TonB-dependent receptor plug 370 172 Tu 1 . - CDS 510842 - 512686 1245 ## Geob_3465 hypothetical protein - Prom 512753 - 512812 8.3 + Prom 513066 - 513125 5.5 371 173 Tu 1 . + CDS 513160 - 515700 1031 ## BT_2204 hypothetical protein + Term 515810 - 515850 4.3 - Term 515936 - 515973 0.5 372 174 Tu 1 . - CDS 516160 - 517659 1286 ## COG1620 L-lactate permease - Prom 517685 - 517744 1.6 - Term 517666 - 517720 12.0 373 175 Op 1 . - CDS 517752 - 519203 1421 ## BT_1452 hypothetical protein 374 175 Op 2 . - CDS 519228 - 519815 684 ## BT_1451 hypothetical protein - Prom 519962 - 520021 5.3 + Prom 519942 - 520001 6.4 375 176 Op 1 2/0.065 + CDS 520065 - 521609 1584 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 376 176 Op 2 . + CDS 521688 - 523199 1476 ## COG0439 Biotin carboxylase 377 176 Op 3 . + CDS 523229 - 523753 646 ## COG1038 Pyruvate carboxylase + Prom 523756 - 523815 1.6 378 177 Tu 1 . 
+ CDS 523842 - 526199 2237 ## COG0642 Signal transduction histidine kinase + Term 526323 - 526370 -0.8 - Term 526194 - 526256 5.2 379 178 Op 1 9/0.000 - CDS 526306 - 526959 528 ## COG0132 Dethiobiotin synthetase 380 178 Op 2 5/0.000 - CDS 526956 - 527798 450 ## COG0500 SAM-dependent methyltransferases 381 178 Op 3 5/0.000 - CDS 527863 - 528525 438 ## COG2830 Uncharacterized protein conserved in bacteria 382 178 Op 4 6/0.000 - CDS 528522 - 529676 730 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Term 529804 - 529859 5.3 383 178 Op 5 . - CDS 529898 - 532312 1548 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase - Prom 532340 - 532399 2.3 384 179 Tu 1 . - CDS 532405 - 533358 873 ## BT_1441 hypothetical protein Predicted protein(s) >gi|225935346|gb|ACGA01000046.1| GENE 1 1 - 139 106 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160886554|ref|ZP_02067557.1| ## NR: gi|160886554|ref|ZP_02067557.1| hypothetical protein BACOVA_04565 [Bacteroides ovatus ATCC 8483] # 1 46 1 46 613 98 100.0 1e-19 MCPDTKALLAYVKLELFFAAICYYIKINPKPITAMAKKIAEQLIDT >gi|225935346|gb|ACGA01000046.1| GENE 2 274 - 492 243 72 aa, chain + ## HITS:1 COG:no KEGG:BT_1819 NR:ns ## KEGG: BT_1819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 120 88.0 1e-26 MNRNEIGVNAGKVWQLLSNNEKWSYGLLKRKSGLKDKELGAALGWLSRESKIEFDQCDEE LYVYLCVNVYIG >gi|225935346|gb|ACGA01000046.1| GENE 3 664 - 1677 1009 337 aa, chain + ## HITS:1 COG:alr3296 KEGG:ns NR:ns ## COG: alr3296 COG2008 # Protein_GI_number: 17230788 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Nostoc sp. 
PCC 7120 # 1 336 5 341 345 248 38.0 8e-66 MRSFASDNNSGVHPLVMEALNRANIDHSLGYGDDKWTEEAVAKIKETFTPNCVPLFVFNG TGSNVVALQLMTRPYHSIFCAETAHIYVDECGSPVKMTGCQIRPIATPDGKLTPELMQPY LHGFGDQHHSQPRALYISQCTELGTIYTPEELKRLTDFAHLNGMYVHMDGARIANACAAL NLSLKELTVDCGVDILSFGGTKNGLMMGECVIVFNKDLQSEARFIRKQSAQLASKMRYLS CQFTAYLTDDLWLKNANHANAMATKLYAELKKLPEVTFTQRAESNQLFLTMPRPVIDRML ESYFFYFWNEEKNEIRLVTSFDTTEEDVDEFIRLLKR >gi|225935346|gb|ACGA01000046.1| GENE 4 1773 - 2966 667 397 aa, chain + ## HITS:1 COG:no KEGG:BT_1814 NR:ns ## KEGG: BT_1814 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 2 398 398 687 87.0 0 MNTRQNRLIIFALLLLTQTPLGAQLHSTKIELPEDSISATMEDPIPAKRSFFKKFLDYFN DANKEKKNKKFDFSVIGGPHYSSDTKFGLGLVAAGLYRTDRIDTLLPPSNVSLYGDVSTV GFYLLGVRGNHLFPKDKYRLNYNLYFYSFPSLYWGRGYDNGANSDNESDYKRFQAQVKVD FMFRLAKNFYIGPMAVFDYIDGRNFEKPELWEGMAARTTNTSLGLSLLYDSRDFLTNAYH GYYLRIDQRFSPAFLGNKYAFSSTELTTSYYQPVWKGGVLAGQFHTLLTYGDTPWGLMAT LGSSYSMRGYYEGRYRDKGVMDAQIELRQHVWKRNGVAVWVGAGTIFPRLSEFTPKHILP NYGFGYRWEFKKRVNVRLDLGFGKHQTGFIFNINEAF >gi|225935346|gb|ACGA01000046.1| GENE 5 2953 - 4656 1074 567 aa, chain + ## HITS:1 COG:RC0454 KEGG:ns NR:ns ## COG: RC0454 COG2194 # Protein_GI_number: 15892377 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Rickettsia conorii # 177 528 173 521 522 153 31.0 1e-36 MKLFKNIKNWLENQEHLFYLFLFILIVPNMVLCFTEPLPFMAKVANVLLPFGCYYLLMTL SRNCGKMLWILFLFLFFGAFQIVLLYLFGQSIIAVDMFLNLVTTNSSEALELLDNLTPAI IAVIILYVPALILGTISIIRKRKLTVEFIRRERKRAFLVFGISLLSLVGAYVQDSGYELK SDLYPLNVCYNVGLAFQRTALTQDYHRTSKDFTFHARPTHPEGKREVYVMVIGETSRALN WQLYGYERETNPLLSRQTGLIAFPKVLTESNTTHKSVPMLMSDATACNYDSIYHQKGIIT AFKEAGFRTAFFSNQRYNHSFIDFFGMEADTYDFIKEDSVSSTYNPSDDELLKLVEEELA KGATKQFIVLHTYGSHFNYRERYPSEDAFFTPDYPMEAERKYRDNLVNAYDNSIRYTDDF LSRLIRMLEKQQVDAAMLYTSDHGEDIFDDSRHLFLHASPVPSYYQLHVPFLIWMSDDYR ETYPERWNTAIENKDKNVSSSSSFFPTMLSLGGIETPYRDDSQAVTASHYVLKPRVYLND HNDPRPLDDLGMKKQDFQMLEKRNIKY 
>gi|225935346|gb|ACGA01000046.1| GENE 6 4751 - 5716 1180 321 aa, chain + ## HITS:1 COG:all1488 KEGG:ns NR:ns ## COG: all1488 COG2214 # Protein_GI_number: 17228981 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Nostoc sp. PCC 7120 # 3 319 8 300 315 174 35.0 2e-43 MAYIDYYKILGVDKSASQDDIKKAFRKLARKYHPDLNPNDPSAKDKFQEINEANEVLSDP EKRKKYDEYGEHWKHADEFEAQKRAQQQAGGFGGAGGFGGFGGAGQGFSDGNGTYWYSSD GEGFSGGNASGFSDFFESMFGHRGGRGQGSAGFRGQDFNAELHLSLRDAAQTHKQILTVN GKQVRITIPAGVADGQVIKLKGYGAEGVNGGPAGDLYITFVIAEDPVFKRLGDDLYIDVE VDLYSAVLGGEKVVDTLDGKVKLKIKPETQNGTKVRLKGKGFPVYKKEGQFGDLIVTYSV KIPTNLTDKQKELFRQLQSMN >gi|225935346|gb|ACGA01000046.1| GENE 7 5731 - 6045 265 104 aa, chain + ## HITS:1 COG:no KEGG:BT_1811 NR:ns ## KEGG: BT_1811 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 4 107 107 160 87.0 1e-38 MQTELIIVSEYCHKCHIEPSFIEMLEEGGLINVHTEGGEHYLLLSELPNVERYSRMYYDL SINMEGIDAIHHLLERMEDMRHEMRSLRKQLLLYREREIEDMDW >gi|225935346|gb|ACGA01000046.1| GENE 8 6133 - 6606 319 157 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 12 156 6 151 165 127 43 8e-28 MAENLIIHTGSKEEKYRELLPQLHALVSTEADLIANLANMAAALKQTFGFFWVGFYLVKE EELVLGPFQGPIACTRIRFGRGVCGTAWKEARTLIVPDVEQFPGHIACSSDSKSEIVVPI LKQGKVVGVLDIDSDTLDSFDTIDARYLEEICTYIVL >gi|225935346|gb|ACGA01000046.1| GENE 9 6614 - 11674 4130 1686 aa, chain - ## HITS:1 COG:no KEGG:BT_1809 NR:ns ## KEGG: BT_1809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1686 1 1676 1676 2545 75.0 0 MRLIYNDALNKRIAPYLERLTKKRTSLDKETMALLDVFMQYFNMDTRYGAYSDKLEPCII YIIQEEKIKSVANLFDGKLNKLLRYLLGDEYAHLFHTYLKLKARCPYTHGYSRRSQRSAN PLLHIGHVIDALTQFLKLRATGFTVQAILNGGNTPEEIEAYKDSMNCQNWMAAQIAEGNQ TVIEYLNNVLTSENNANRLNQGHLQAIAVSGYRPLLELEGKLLLAAKLQEGLRQAIVETM DEGCPESYLHLFSVICDNGLQRFASVKRGIAVSTGIGEQDSSERITNKYVELIHRFLNDR 
KQAHSALQSKDTVELYLALWSIGFYNTEEIQTLVPEIIKKGAKYQVQTLLYFLRCTQYSG MNHRISKNAFERWYKEPSVVAAILPLYLSGLYLSRYGGHKDAPSLHDYFDSKEEAVRHYE YLKQIYQSISAKEIYSPYVFPWESTELTRSEIVLKMAYITWMTNNSALKDDLCSYLPSLD TYMRAGYIGVVLAPPTSPLQEEYVLQSLGDRSQDVREEAYKALSEMTLSPEQNQKVEELL RFKYSEMRIHAINLLMKQPKEQLADSIRRLLTDKVAERRLAGLDMMKTIHNVEFLQDIYQ ELIPTVKEIQKPNPKEKVLIESLIGDGTEESVTQHYTKDNGFGLYDPSLEVSLPEITQDK GFNVKKAFEFICFGRAKLVFKKLSKYIETYKNEEFKNGYGEARLVGNSVLINWSNYGGLS GLGFPELWKAFYEEEIGSYDKLLMMSFMLASTGTAKDEDDSDEEDEEDIKADQKSSNTFE PLVNRMYAGITYRGLQKDLRKMPYYEQMSDIIEALAYEYKDEAVYQRLAVNMLLQLLPLL NTKNIFRQYTSKHAWLRDKLEYGEKQVVYPIHNNKFVNFWLEMPQKPMNDDLFIRYFTVR YQLYKLTNYMEHTPELEETDSYLHATDFARAWMLGIIPTEEVYREMMGRTSSPSQIKAIT MVLNDNVRFNKEKERYADIKNIDFSLFRSLAQKVVDRILEIELKRGDSETQVTSLAEELS YVYGAETFIRILQAFGKDTFIRDSYNWGSTKRGVLSSLLHACHPLPTDTSENLKKLAKQA EISDERLVEAAMFAPQWIELTEKAIGWKGLTSAAYYFHAHTNETCDDKKKAIIARYTPID VDDLREGAFDIDWFKDAFKTIGKQRFEVVYNAAKYISCSNSHTRARKFADATNGAVKAAD IKKEIIAKRNKDLLMSYGLIPLGRKPDKELLDRYQYLQKFLKESKEFGAQRQESEKKAVN IALQNLARNSGYGDVTRLTWSMETELIKELLPYLSPKEIDGVEVYVQINEEGKSEIKQIK DGKELNSMPAKLKKHPYIEELKAVHKKLKDQYTRSRIMLEQAMEDCTHFEENELRKLMQN PVIWPLLKHLVFICNGQTGFYTDGLLITVNAVCLPLKPKDELRIAHPTDLYTSGDWHAYQ KFLFDKSIRQPFKQVFRELYVPTPEEIEATQSRRYAGNQIQPQKTVAVLKGRRWVADYED GLQKIYYKENIIATIYAMADWFSPADIEAPTLEYVCFHSRKDYKLMKISEIPPVIFSEVM RDVDLAVSVAHAGSVDPETSHSTIEMRSVLVELTMPLFHFKNVTIKGSFAHIEGKLGKYN IHLGSGVIHQEGGAQIAVLPVHSQNRGRLFLPFVDEDPKTAEILTKIIFFAEDDKIKDPS ILNQIK >gi|225935346|gb|ACGA01000046.1| GENE 10 11713 - 12132 297 139 aa, chain - ## HITS:1 COG:no KEGG:BT_1808 NR:ns ## KEGG: BT_1808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 222 78.0 3e-57 MRVYIQIKQLGKRKCSIEKIPIDFPVPPVNVQGLIEAIVSWQVCEYNERLQQSEVLKYLT QEEVENKTASGKVGFAVNYNGKPAAEVEAITNALQSYEDGIFRIFMDDTETGDLSSPIQL KEESTLTFIRLAMLSGRLW >gi|225935346|gb|ACGA01000046.1| GENE 11 12140 - 13918 1043 592 aa, chain - ## HITS:1 COG:TM0584 KEGG:ns NR:ns ## COG: TM0584 COG0705 # Protein_GI_number: 15643350 # Func_class: R 
General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Thermotoga maritima # 163 388 5 230 235 126 33.0 1e-28 MNVDRDRNMALKPSYNDKLYLPGLSNTEILILALEASQKLEWNIEEVTPEGIRFEVPFSI RSHGEAITFTIEKGSDGEVSVRSQSSSVQFVDYGKNRKNIQKLRETMEEIKASLTPEELA QRAKDFEEEFNRPLTEEEKAYIEEEKKRNSFLSFFIPRKGFIATPILIDINILVFIVMIA SGVGIMSPSTLSLLKWGADFGPLTLTGDWWRTVTCNFIHIGAFHLLMNMYAFMYVGLLLE ELIGGRRMFVSYLLTGLCSAAFSLYMHGETISTGASGSIFGLYGIFLAFLLFHRIAKEQR KALLTSILIFVGYNLVYGMKAGIDNAAHIGGLLSGFVLGIIYVVGYKFEKPDAQRTVSII GELGIFCIFLFSFMILCKNVPPLYPEIRNEWESGIVEAYLNGELEEENENSNHSAVNATD NLSSRKVPAYTPVGNDDTWLSYYDAATKFSCQYPTNWIKIAGAKGIIDGAEPPLLMLVNG GNQLTITALTYDTQKEFEHMKELSLTLPRNAQGEPSEDYQLSNVNINGLPMTKITNPLHI GAPDEPGEDVKQIALHYFQESKKRSFAIVMLVYDEEAETDLNAITSSIQITQ >gi|225935346|gb|ACGA01000046.1| GENE 12 14048 - 15754 1916 568 aa, chain - ## HITS:1 COG:CC3393 KEGG:ns NR:ns ## COG: CC3393 COG1960 # Protein_GI_number: 16127623 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Caulobacter vibrioides # 45 445 47 459 603 197 33.0 5e-50 MANYYTDIPELKFHLNNPMMKRICELKERNYRDKDEFDYAPLDFEDAVDSYDKVLEITGE ITGEIIAANAEGVDEEGPHCANGRVEYASGTKQNLDAMVKAGLNGMTMPRRFGGLNFPIT PYTMCAEIVAAADAGFGNIWSLQDCIETLYEFGNADQHSRFIPRVCQGETMSMDLTEPDA GSDLQAVMLKATYSEKDGCWLLNGVKRFITNGDANLHLVLARSEEGTRDGRGLSMFIYDK NEGGVNVRRIENKLGIHGSPTCELVYKNAKAELCGDRKLGLIKYVMALMNGARLGIAAQS VGLSQAAYNEGLAYAKDRKQFGKAIIEFPAVYDMLAIMKGKLDAGRALLYQTARYVDIYK ALDDISRERKLTPEERQEQKKYAKLADSFTPLAKGMNSEYANQNAYDCIQIHGGSGFMME YACQRIYRDARITSIYEGTTQLQTVAAIRYVTNGSYIATIREFEAIPCSPEMEPLMSRLK KMADKFEASTNAVKEVQDQELLDFTARKLVEMAADIIMCHLLIQDASKSSELFSKSAHVY LNYAEAEVEKHTNFIENFDKEDLAFYKK >gi|225935346|gb|ACGA01000046.1| GENE 13 15761 - 16780 970 339 aa, chain - ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 335 9 332 
336 263 44.0 4e-70 MNNLFVYCEIEEGNVADVSLELLTKGRSLANQLNCQLEAVVAGSGLKDIEKQILPYGVDK LHVFDGEGLYPYTSLPHTAILVNLFKEEQPQICLMGATVIGRDLGPRVSSALTSGLTADC TSLEIGDHEDKKEGKTYKNLLYQIRPAFGGNIVATIVNPEHRPQMATVREGVMKKEILSP TYQGEVIRHDVKKYVADTDYVVKVIERHVEKAKNNLKGSPIIIAGGYGVGSKENFNLLFD LAKELHAEVGASRAAVDAGFADHDRQIGQTGVTVRPKLYIACGISGQIQHIAGMQESGII ISINNDPDAPINTIADYVINGSIEEVVPKMIKYYKQNSK >gi|225935346|gb|ACGA01000046.1| GENE 14 16783 - 17655 1060 290 aa, chain - ## HITS:1 COG:mll5862 KEGG:ns NR:ns ## COG: mll5862 COG2086 # Protein_GI_number: 13474882 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Mesorhizobium loti # 3 286 1 265 283 181 40.0 2e-45 MSLKIVVLAKQVPDTRNVGKDAMKADGTINRAALPAIFNPEDLNALEQALRLKDAHPGST VTILTMGPGRAADIIREGLFRGADNGYLLTDRAFAGADTLATSYALATAIRKIGDCDVII GGRQAIDGDTAQVGPQVAEKLGLTQITYAEEILEVGDGKIKVKRHIDGGVETVEGPLPIV ITVNGSAAPCRPRNAKLVQKYKHAKTVTEKQQGNLDYTDLYDKRDYLNLVEWSVADVNGD LAQCGLSGSPTKVKAIQNIVFQAKESKTISGSDRDVEDLIVELLANHTIG >gi|225935346|gb|ACGA01000046.1| GENE 15 17918 - 18448 356 176 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229873878|ref|ZP_04493445.1| acetyltransferase, ribosomal protein N-acetylase [Spirosoma linguale DSM 74] # 16 176 17 177 185 141 39 4e-32 MVQKTGIIVQKKNYTLRTWQTEDVASLAQYLNNKNIWDNCRDGLPYPYSQEDANVFLSMV QAKENIQDFCIEVNGEAVGSIGFVPATDVERFSTEVGYWIGEPFWNQGIVTDALKEAINY YFEHTDKVRVFAVVFEHNSPSMRVLEKVGFTKVGIMQKAIFKNDNFTNAHYYELIK >gi|225935346|gb|ACGA01000046.1| GENE 16 18457 - 20331 1812 624 aa, chain - ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 29 480 9 448 559 289 36.0 1e-77 MKKHFLIFATGLFLLSACNSVKAPEAILPIPEAKQVEWQKMETYAFVHFGLNTFNDREWG YGDSDPKTFNPTRLDCEQWVQTFVKSGMKGVILTAKHHDGFCLWPTQLTEYCIRNTPYKD GKGDIVRELSDACKKYGIKFAVYLSPWDRHQANYGSPEYVEYFYKQLNELLSNYGDVFEI WFDGANGGDGWYGGAKDSRTIDRKTYYNYPRAYKMIDELQPQAVIFSDGGPGCRWVGNEH 
GFAGATNWSFLRAGEVYPGYPKYRELQYGHADGNLWVAAECDVSIRPGWFYHPEEDDRVK TVDELTDLYYRSVGHNATLLLNFPVDRDGLIHPTDSANAVNFHQNVQKQLAHNLLAGLSP KASDERGRTFSAKAVTDGDYDTYWATNDDVTSATIEFDLPQAEKINRMMLQEYIPLGQRV KSFVVEYNQAGEWLPVKLNEETTTIGYKRLLRFETITTDKIRVRFTDSRACLCINNIEAY YAGETSDTYTAKAAELKSYPFTLVGVDVEEAQKGMDKNDQTTCFINGNTLLIDLGEERTI TSFHYLPDQSEYNKGLIAAYEISVGTDSNAVNQVVAKDEFSNIKSNPILQSVYFTPVKAR YIQLKATRMIHDGEPMGLAEIGIQ >gi|225935346|gb|ACGA01000046.1| GENE 17 20371 - 21051 496 226 aa, chain - ## HITS:1 COG:no KEGG:BT_1803 NR:ns ## KEGG: BT_1803 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 401 88.0 1e-111 MKKQILLIAILFCTAFAQAQEVFVTADFVSSYIWRGMDSGNASVQPSLGLNWKGLTVYAW GSTEFREKNNEIDLSLEYEYKNLTLYANNYFTQTEEEPFKYFNYSSHSTGHTFEVGAGYM LSEKFPLSVSWYTTFAGNDYRENGKRAWSSYCELSYPFSVKDVNMNIEAGFTPWESMYSD KFNVVNIGLSATKDIKITSNFSLPIFGKLIANPYEEQLYFVVGITL >gi|225935346|gb|ACGA01000046.1| GENE 18 21188 - 24034 2529 948 aa, chain - ## HITS:1 COG:PA1613 KEGG:ns NR:ns ## COG: PA1613 COG1629 # Protein_GI_number: 15596810 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 110 249 37 180 702 80 33.0 1e-14 MKKLYILTILLCLTGFGSIFAQTLKGHIYDAKTNEPLVGATVTYKLHGNQGVVSNINGEY EIKLPEGGVDLVFSYVGYDDVLMPIVINKREVVTKDVYMKESTKLLEEVVVSAGRFEQKL SDVTVSMDVVKSGDIARQAPTDISSTLRTLPGVDIVDKQPSMRGGSGWTYGVGARSQILV DGMSTLNPKTGEINWNTVPLENVEQVEVIKGASSVLYGSSALNGIINIRTARPGLTPKTR FSAYIGVYGDAENDEYQWSDKSFWKDDKYSVKPILRGSLLSGIRNPIYEGFDLSHSRRIG NFDVSGGINLFTDEGYRQQGYNKRFRMGGSLTYHQPDMGMKLLNYGFNVDFLSNQYGDFF IWRSPTEVYKPSPFTNMGREENNFHIDPFINYVNPENGTSHKIKGRFYYSADNIVRPTQG TSITDILGNMGTDAKTIQNIAGGDYSSLYPALVGIGSGLVNGNLEDAMNGVFTSLGNIFP NATTADYCDLISWVMDNGVPSDLGGLTNGQLPSDLIPWLSNVINPSRNTPKTQTDKNFDY YLDYQFNKKWEGGAQITTGVTLEHIRYDSAVMDEVYKSDNIAAFFQYDQRFWDRLSVSAG VRAEYYRVNNHHREAETKIFGTKVPFRPVFRAGLNYQLADYSFIRASFGQGYRNPSINEK YLRKDIGGVGVYPNLDIKPEKGFNAELGIKQGYKVGNFQGFVDVAGFYTQYKDMVEFQFG 
LFNNANYTMINSIGDAFQMLTDGKGFGIGAQFHNVSKAQIYGVEISTNGVYNFNKNTKLF YNLGYVYTEPRDADYQERNAVEGLYTDPLQMKEKSNTGKYLKYRPKHSFKATVDFQWKRI NVGANVAWKSKVLAVDYLMLDEREKQQKDLMDYVRGILFGYSKGETLATYWKKHNTDYAT VDVRFGVKATKEVAFQFMVNNLLNKEYSNRPMAVAAPRTFVLKMDVTF >gi|225935346|gb|ACGA01000046.1| GENE 19 24046 - 25359 657 437 aa, chain - ## HITS:1 COG:no KEGG:BT_1798 NR:ns ## KEGG: BT_1798 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 395 10 409 448 203 32.0 1e-50 MKKISILLISCLAVASAFLTSCDNDHYGPEPIDVTANYSNKLSNPNPNLILTYNGETMIG KSVDFSTVTGETAIINLYDILPGEKEVKIMSIPLSGDGQGYSFSGNSMGNETLSSFRYEG RVIKGQLTLNISNIKMGNAELWANTYKLPTVINGIKTIVVGDMWGEEYTWQDVDGQVLNA SCYFYADIEASESGATTQTWGSAIQNILSYILPQVLQEITLGADGNVTASYSNEPLTGVD MDIIFGFLENPLTQDMITPNIVNRNYIPSPKGFANWFQKDGKLILKLNLANIIASISSGN QYMDVNITNAIIEAISQMDAMKVKELLTTLNQSLKNETLGFLLNVNDTSFKAIFNWLTTG IPMQVISKDGHTFIYLDKEGFTPIAKLLPDLSPLIVSLLPEDMQSLGGIISIFLNGISDA FLSPEKIEFGLEIVPNK >gi|225935346|gb|ACGA01000046.1| GENE 20 25390 - 26592 909 400 aa, chain - ## HITS:1 COG:no KEGG:BT_1798 NR:ns ## KEGG: BT_1798 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 399 1 446 448 301 42.0 3e-80 MRNLRKLSYVACAVFFFTSCEETYNDKLFWPGEISQEYGSYIKPYTLDLTYSGEKLIGKT VSFKTEDSETGTLTLNNIIPGEKETPISRIQLYENEKKGYYTFSGTNITMGGATVKYEGI ITPKNMQLSLNVTMAYANSIANTYTFPAYSHTTDGESIIRNSGASYVNITTKAGGESLQP VILQIQQMATNILDVIFPYVLKDITFEKNGIVTASYTTSPVDMNEIMEVANSGKTDAEFK SLINKRTYESSPKGLAYWNQTGNKAFVVQLNIPAIVSLIAQNNGKQIDYQLIAGISEALL KSDPARLKLTLSAINMILNNEIITYILQLDDDTFATLLTWMKDGIPMGINSTKEHTYIYF NKETLMPFISIIGSLLETNLIEVAALFDKMEIGIDLTAKK >gi|225935346|gb|ACGA01000046.1| GENE 21 27398 - 27565 130 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237723355|ref|ZP_04553836.1| ## NR: gi|237723355|ref|ZP_04553836.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 55 12 66 66 92 100.0 8e-18 MNGGFLSVNRLYYSNKLFCSILKILLFNSGYESNQIENKMHNLVQEYAYHAPYYA >gi|225935346|gb|ACGA01000046.1| GENE 22 27641 - 28126 438 161 aa, chain + ## HITS:1 COG:no KEGG:BT_3497 NR:ns ## KEGG: BT_3497 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 112 4 107 251 135 65.0 4e-31 MNMNEACNVTTALSAFSSISLEEMSTIRLMNRTDTKYIVSLSALMDVLQRASNCYRVQEV QGERNIAYHTTYLDTPDYAMYLAHQNGRVIREKIRVRTYVSSGLTFLEVKKKIFSGFDAS LEGEFRTRDGLQTVECWSGSAGVSYKMFRWLKASAGYSFKF >gi|225935346|gb|ACGA01000046.1| GENE 23 28145 - 30139 1972 664 aa, chain + ## HITS:1 COG:no KEGG:BT_1796 NR:ns ## KEGG: BT_1796 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 645 20 649 666 783 80.0 0 MKRIVLFWIPLLLLLLVNCTTESFDFGDQEGILVGGSGGGGSSQPNPTIPEGSEDLLGFT IAFDESDKTAYGSMSETVTSDDDFIENSQFASVVTIIYNGTTAAVSNGVSGVEVSSNGAH VVVNSTVSGVEYVLSGTTTNGSFKVYSEKKFKLSLAGVSILNPVGAAINIQSSKRVFVVC ADETTNVLTDGSSYTATTDGEDMKACLFSEGQLIFSGGGSLTVTGNYKHAITSDDYVRFR SGCNITVASAKKDGIHTNESVIIGGGILNISSDGDAIQCEEGGITMTGGFAKLSTTDNKA HGLKSCLDVVISGGAIQAQVAGAASKGISCDGNLTISGGKLTAFTSQTALYEDNDLSSCA GIKCDGNILITGGEIAIQSTGGAGKGINCDGSITINDGTVKVITTGTQCVYGKLDSSAKG IKANGALTINGGTVLVKATGGEGSEGIESKSVLTVNEGTVAALCYDDCMNASNSIVLNGG NIYCYSSGNDGIDSNGTLTITGGVIVSSGTTSPEDGFDCDQNTFKITGGIVLGIGGGTST PTSSVCTQRTVIYGGSGSNGEILNIQSADGTSVLTYQIPRAYSQMTVLFSSPNLTSGGSY TISKGGTVSGGSEFFGLYSGATYSGGTQTATFTASSMVTQVGSTSGGGQPGGGGGGHGPG GWGW >gi|225935346|gb|ACGA01000046.1| GENE 24 30238 - 31452 821 404 aa, chain + ## HITS:1 COG:BH2727 KEGG:ns NR:ns ## COG: BH2727 COG2972 # Protein_GI_number: 15615290 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 204 403 374 586 597 103 30.0 7e-22 MKRLRKQHVLEQSIYGAIWIVIFLLPLIGGYFAMSGGLEREETRMVIRESWLSILPFFVL FLLNNYGLVPYFLFKKRYWQYTVSLIFVIGMICWLSPFPSANHFPKDFKRGELPLRMKEN 
RRDQIIKTREKAREEGDVHWHESNPEQRPRGMGDPNGPGRFPKPTPFPYPPFVLRYLIHF IIAFLMVGFNIAIKLFFKSFRDEEMLKELEHQRLQSELQYLKYQINPHFFMNTLNNIHAL VDIDTGKAKSTIVELSKLMRYVLYEASNKTILLSREVQFLKNYIALMSLRYTNKVSIQMD FPAEVPEVQIPPLLFVSFVENAFKHGVSYRSESFIHVLIQLDEGNRLSFRCSNSNNGSAD EQHHGIGLENIRKRLRLLFGNDYTLSITEEEHKFDVLLIIPLLE >gi|225935346|gb|ACGA01000046.1| GENE 25 31449 - 32177 515 242 aa, chain + ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 239 1 246 246 111 31.0 2e-24 MKCIAIDDEPLALTQLSDYISQIPFLSLVKSCQDAFEAMKILSEEEVDLIFVDINMPDLN GLDFIRSLVNRPLVIFTTAYSEYAVEGFKLDAVDYLLKPFEFQDLLKAADKARRQFEYRL LEQQGEIGNASQIKGDSLFVKSDYRVVRIDVKNIRYIEGMSEYVRIFVEGEDKPVITLAS LQKMEERLPTHFMRVHRSYIVNLRKITEVSRLRIIFDKNTYIPVGDNYKEKFTEYIGKLS LS >gi|225935346|gb|ACGA01000046.1| GENE 26 32182 - 33366 1320 394 aa, chain - ## HITS:1 COG:SP0281 KEGG:ns NR:ns ## COG: SP0281 COG3579 # Protein_GI_number: 15900215 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Streptococcus pneumoniae TIGR4 # 41 386 60 421 444 61 24.0 3e-09 MKKTILIAALGLFSLSVMAQDAKPEEGFVFTTVKENPITSIKNQNRSSTCWSFSTLGFVE SELLRLGKGEYDLSEMFVVHKTMQDRGVNYVRYHGDSSFSPGGSFYDVMYCIKNYGIVPQ EVMPGIMYGDTLPVHNELDAVASGYINAIAKGKLSKLTPVWKNGLSAIYDTYLGACPEKF TYKGKEYTPKTFAESLGLNYNDYVSLTSYTHHPFYSQFAIEIQDNWRNGLSYNLPIEELM AVMDNAVKKGYTFAWGSDVSEQGFSRDGIAVMPDAAKESELSGSDMARWTGLTAADKRRE LFTKPFPEKDITQEMRQVAFDNWETTDDHGMVIYGIAKDQNGKEYFMVKNSWGKSGKYDG IWYASKAFVAYKTMNILVYKDALPKEIAKKLGIK >gi|225935346|gb|ACGA01000046.1| GENE 27 33501 - 34706 902 401 aa, chain - ## HITS:1 COG:no KEGG:BF3189 NR:ns ## KEGG: BF3189 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 401 1 401 401 525 69.0 1e-147 MKKILFCLLAFYTPVSIVIAQNTTNSPTSMFGLGELSTGEGGQYAGLGGTGIALRGNNFL NNANPASLTELTEQRFQIDAGIMGAYQIYSQRGASNRSVTGNLNNLSIGCRIMPRWYGAI 
FMAPVSSVGYAITLDEEVAGTNGGTISSLFQGEGGLSKMGISTAYLFGKRFSVGTNLSYV TGTITQTETQGTATEETSSYKHTFYADFGVQYKWAIDRERSFVAGAVYGYSQDFKQDNNL YVSSSSGGDDIEKSLKRYRQCLPQFFGLGASYNTLRWMATIDYKYVDWSRMQSSQSNVSF ENQHRLSVGGRYTLGNVYRNPVSILLGTGINNSYIVIQKKKATEYYVSTGLNFGLRNNNV LSIGLKYKGQMKIPNGMQKENSLSLFLNITFSERTYRAKIQ >gi|225935346|gb|ACGA01000046.1| GENE 28 34703 - 36028 805 441 aa, chain - ## HITS:1 COG:no KEGG:BT_1786 NR:ns ## KEGG: BT_1786 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 441 5 445 445 509 58.0 1e-142 MKSIILLFSFFISLSFFACVDDSSSIGAKWVQSSFLNEQMDTCTVLLSTVLSDSIATSGD TVCQIGYRDDNLWGKITASFYAEYEVPSYSFDENIQYEFDSITIRLYSSGNYLGDTLKTQ RIHLHELTKNIELDDRGYLYNTTTAYYNETPLASFDFRPTPGSPSEELEIRLPDAWGEEW FNLMLNDGRWVQSQDFFHDYFKGIAFIPDANDACISGFQVNDSSMCITVYYHQITETLNE KTLVFNTSSTLTYNKIEQDRSNIPIANLQSGDGNEYSSGKSEHQVYLQGMTGMYVTIDFP HLNNLCEKGELVTIESATLQLYPVKGTYDGMYPLPKSLALYTANNENVTQSVITDLTGSS VQSGNLVVDEMSYEETYYSFDITSFLQTNLGTTGYDRQKLQLFLPDNLFYTTLQGVIFGD GEHTANKKNTKLIILYKTYQQ >gi|225935346|gb|ACGA01000046.1| GENE 29 36221 - 37279 1029 352 aa, chain + ## HITS:1 COG:no KEGG:BF3356 NR:ns ## KEGG: BF3356 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 352 16 350 350 378 64.0 1e-103 MNTKMNKLLCMMVVLLSLAACTDDTEYTAGVWYRRSDFDGVARTDAAGFTIGNVGYLCTG YRGSTKDRLKDCWAYDIEANSWTQCTSMPDAAPARNAATGFAVGTKGYIATGYDATKKDY LNDCWQYDPATNSWKEMASIPDGTDGTNIYGKRYYALSFGIGQYGYLGTGYNDNYQKDFW KFDPSVGDKGEWTAMSGFGGQKRMGGMAFVIDDIAYICGGENNGSDVTDFWCFNPATGTW KELRELYDKSDDDYDDDYTSIVRSYACAFVIDGKGYIAAGQTAGGSYRSNYWIYDPLTDL WDGEDLTDFEGSTRSKAVCFSTGKRGIIATGGASTYYYDDTWELKPYEYEEK >gi|225935346|gb|ACGA01000046.1| GENE 30 37695 - 38270 469 191 aa, chain - ## HITS:1 COG:no KEGG:BT_1784 NR:ns ## KEGG: BT_1784 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 186 1 170 173 296 92.0 4e-79 MFKEEKIKKSCPQASTMRIQFEIKENLPEIIEEILHSDKWQTSVKEELSGRTTVVIRDQA YGSEATIEIYATSIEIKTAWSKYSYRIFVANDIVWCEYNGAYRGLLEQVLLPTITPKENL 
LNSDVTESSLYGNEHKKLREYAEDNLKLKQFRRENFNEQKNGTAPFDHPKRVYDEFIKED YVVAPKEEEKK >gi|225935346|gb|ACGA01000046.1| GENE 31 38448 - 39293 697 281 aa, chain + ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 11 276 5 270 283 235 44.0 5e-62 MKNLQIDGSTGLVLEGGGMRGVFTCGVLDYFMDHDIRFPYAIGVSAGACNGLSYASRQRG RAKYSNIDLLEKYNYIGLKYLLKKRNILDFDLLFNEFPEHILPYDYETYFASPERFVMVT TNCITGEANYFEEKKDRRRVIDIVRASSSLPFVCPITYVDEIPMLDGGIVDSIPLQRAIA DGCTRNVVILTRNRGYRKDSKDIRIPSFVYRKYPKLREALSCRCAVYNEQLEMVERMEDE GKIVVIRPLKPVAVDRIEKDVQKLTEFYKEGYECAKALFSF >gi|225935346|gb|ACGA01000046.1| GENE 32 39285 - 40307 778 340 aa, chain - ## HITS:1 COG:no KEGG:BT_1767 NR:ns ## KEGG: BT_1767 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 340 602 89.0 1e-171 MPANRNALIRYKTIDNCLRNPYRRWTLEDLVDACSDALYEYEGIDKGISKRTVQMDIQMM RSEKLGYNAPIVVYENKYYKYEDPEYSITQTPLNEQDLKTMSEAVEVLRQFKGFSYFTEM SDIINRLEDHVASARMKTTPVIDFEKNESLKGLDYLDTIYHAIVNEHPIQLKYRSFKARS ANSFIFYPYLLKEYRNRWFVYGVRGNGRILQNLALDRIQSLEVLPQEHYIKNTFFDPNTF FDDLVGVTKNSGSVAEKVGFKVAAAEAPYIITKPIHRSQQLVERLPDGSVILEIEVVINH ELERVFFGYVDGIEILYPKTLVELMSRKLEKAAKQYTKSK >gi|225935346|gb|ACGA01000046.1| GENE 33 40563 - 42674 1517 703 aa, chain + ## HITS:1 COG:no KEGG:BT_1781 NR:ns ## KEGG: BT_1781 # Name: not_defined # Def: xylosidase/arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 703 1 679 679 1280 91.0 0 MVKRLNWLLVYLLFSIGIMAQSGGRKQYNSYKGLVMAGYQGWFNTPGDGSGRGWHHYNGR EGFRPGSCSVDLWPEVSEYKKLYKTDFTFADGKPASVFSSYDESTVNVHFRWMKEYGLDG VFMQRFIAEIRNESGLKHFNKVLNSAMKAANKYERAICVMYDLSGMQPGEEKLLLKDIAE IAQRHSLKNHAKNPSYLYHNGKPLVTVWGVGFNDNRRYGLKEAAHIIDGLKSQEFSVMLG VPTQWRELKGDTESDPRLHELIRKCDIVMPWFVGRYNETTYPKYQKLVEEDIQWAKKNQV DYVPLVFPGFSWGNMKGKDHNSFIPRNKGSFLWKQMMGAIRAGAEMIYVAMFDEIDEGTA 
IFKCAKEVPTGKSTFVPIEEGVESDHYLKLVGEAAKVLRKEKAIAFNTSLNPATPNPFIR HMYTADPSAHVWEDGRLYIYASHDIAPPRGCDLMDRYHVFSTDDMVTWTDHGEILSSDQV PWGRKEGGFMWAPDCAYKNGTYYFYFPHPSETDWNDSWKIGVATSNKPAEGFKVQGYVEG MDPMIDPCVFVDDDGQAYIYNGGGGTCKGGKLKDNMMELDGPMQLMKGLEDFHEAAWIHK YNGKYYLSYSDNHDENWNDGVKGDNRMRYAISDSPLGPWESKGIYMEPTDSYTNHGSIVK FKGQWYAFYHNSALSGHDWLRSICVDKLYYNPDGTIKLVRQTK >gi|225935346|gb|ACGA01000046.1| GENE 34 42704 - 45568 2830 954 aa, chain + ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 209 846 82 734 754 465 40.0 1e-130 MKNKMKWMLAIGLLSCSMAMAQQQSDILSVSASANADNAALAFDKNVKTMWTLPSQALKT EQWLMFTIQQPGDVCELDLQMQGVNKNELKEVLDIFVTYDPMNLGTPVNYRIEGNDKQMK VKFTPKYGAHVKLNFKPGKLDKPFSLKEISVLVAEKVLTDSKGKVTDRRYMDASLPVEER VESLLAVMTPEDKMELIREGWGIPGIPHLYVPPITKVEAVHGFSYGSGATIFPQALAMGA TWNRKLTEEVAMVIGDETVAANTKQAWSPVLDVAQDARWGRCEETFGEDPVLVSQIGGAW IKGYQSRGLFTTPKHFGGHGAPLGGRDSHDIGLSEREMREVHLVPFRHAIRNYDCQSLMM AYSDYMGIPVAKSTELLQQILRQEWGFNGFIVSDCGAIGNLTARKHYTAQDKIEAANQAL AAGIATNCGDTYNNKEVIQAAKDGRINMENLDNVCRTMLSTMFRNELFEKNPCKPLDWKK IYPGWNSDSHKEMARQAARESIVMLENKENLLPLTKNLRTIAVLGPGADDLQPGDYTPKL LPGQLKSVLTGIKEAVGKQTKVLYEQGCDFTNPDETNIPKAVKAASQSDVVVMVLGDCST SEATNDVRKTCGENNDWATLILPGKQQELLEAVCATGKPVILILQAGRPYDILKASEMCK AILVNWLPGQEGGPAMADVLFGDYNPGGRLPMTFPRHVGQLPLYYNFKTSGRRYEYVDME YYPLYRFGFGLSYTSFEYSDLKIQEKPNGNVTVQATVKNIGSRAGDEVAQLYVTDMYASV KTRVMELKDFDRIYLQPGESKTVSFELTPYDISLLNDHMDRVVEKGEFKICVGGMSPDYV AKNEIKHSVGYSDKKKGVTGMLNYTHEFGADFILSVSKVEENLTKNQKTVWVSVKNNGTL MDIGKVEMFVDGKKAGDAIHYELGAGEEKLIPFKLDKDNKQPVAFTTKYKMVAL >gi|225935346|gb|ACGA01000046.1| GENE 35 45599 - 47314 1219 571 aa, chain + ## HITS:1 COG:no KEGG:BT_1779 NR:ns ## KEGG: BT_1779 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 571 14 571 572 974 89.0 0 
MKQLLLVSWLLSLSAILGLPEVKADVRMPLIFGNHMVLQQDTRITIWGWADAGENIEVLF AGQKVRTTANADGTWQVKLKPVKTKKKGETLTITGKNKLVYSDVLVGDVWVASGQSNMEW GIKVRKEYADDIAHSEDSLLRLFFVPKNTSLQPLSEIEIPQGTASPERAARWVLCTPEML AKINGQGFSATAYYFARDMRAANGRPLGVIQSAWGGTRAEAWTSLSGLKQEPALAHYVAA YEKNVKDNPEILATYPQKQKEFDTAVREWDQTIGKEWNQAQKEWAVAVRAAQADGKPAPA KPEPRVPRPPNPRKPDGGNNGPANLFNAMISPLIPLSIKGVIWYQGEFNSGGSAKEYATL FSRMITDWREKWGIGDFPFVYVQLPNFEPVDQEPSVEGNGWRWVREGQLKALNLPNTAMA VTIDVGDPFDLHPVDKYDVGHRLALAARKLAYGEKIVGMGPLYKKMSVKGNKIILEFTNQ GKKLMIGTSPYIPEGEQVRPKPTKLTGFGIAGADRKFVWADAVIEGNKVIVSSHEVAEPV AVRYGFSNSPRCNLYNEERLPASPFRTDHWE >gi|225935346|gb|ACGA01000046.1| GENE 36 47349 - 49046 1264 565 aa, chain + ## HITS:1 COG:no KEGG:BDI_3065 NR:ns ## KEGG: BDI_3065 # Name: not_defined # Def: beta-glycosidase # Organism: P.distasonis # Pathway: not_defined # 34 564 213 754 764 412 40.0 1e-113 MKKSTHLLIASILAGCITSCSSGKRIAQHRIVTNPMNLNYRFQPKDESRREAADPVLEYF KGYYYLFASKSGGYWRSEDLAGWEYIPCTTIPTLEDYAPTILPIGDTLYFTTSSGKTQIF KNAHPEKDTWEAVDTKLTYRLHDPAFYSDEDGKVYMYWGCSDKDPIMGVEVDPKDGFRAL GEAKALIAHHGDKYGWEVPGKNNEEPRQGWNEGPCILKYEGRYYLQYAAPGTQYRIYGDG IYVGDTPLGPFEYVEDNPFSFKPGGFIGGAGHGHTFQDKYGNYWHVASMIISVRHMFERR LGLFPVAVSSRNGIYAHTVWSDYPFYIPNRKVDFDKTDLSMGWNLLSYRKPVQASSSLAG YEPGNANDEQIETWWAAQTGKKGEWLQVDLETPMLVNSIHVNFADHHFKVFAPHPPVVYQ FFIEGSADGEKWMNLFDERENGKDEPHRLFTLDKPVKVRYLRICNAKEMDGCFSLSGFRV FGKGEGAAPSKVTGFRAIRDSEDKRMFRFTWDAQKGATGYVLRWGSQKDKLTHAVTVFDN QYEARYFNRDSEYYFSITAFNENGN >gi|225935346|gb|ACGA01000046.1| GENE 37 49095 - 49436 281 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173385|ref|ZP_05759797.1| ## NR: gi|260173385|ref|ZP_05759797.1| beta-galactosidase [Bacteroides sp. 
D2] # 1 113 1 113 113 239 100.0 4e-62 MTLQVFKCCVCIWIGVFSETMLAQTQKVVFSRDFSPAEGLVTPQEKPYRDEVCLNGLWDL QCVAVPSSWKKGSGIVFWNSMLRVMGIATYGAYFNSHDAKNKKHDLLMDGPVD >gi|225935346|gb|ACGA01000046.1| GENE 38 49501 - 52062 2341 853 aa, chain + ## HITS:1 COG:BH1908 KEGG:ns NR:ns ## COG: BH1908 COG1472 # Protein_GI_number: 15614471 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 26 853 6 794 926 527 38.0 1e-149 MVIMNRYKKILLAGTLSVMSCSLIHAQELYKNENAPVHERVADLLSRLTVEEKISLLRAT SPGIPRLGIDKYYHGNEALHGVVRPGRFTVFPQAIGLAATWNPELQKRVATVISDEARAR WNELDQGREQKEQFSDVLTFWSPTVNMARDPRWGRTPETYGEDPFLSGVMGTAFVKGLQG DDPRYLKIVSTPKHFAANNEEHNRFVCNPQISEKQLREYYFPAFEMCVKEGKAASIMTAY NALNDVPCTLNAWLLKKVLRQDWGFQGYVVSDCGGPSLLVNAHKYVKTKEAAATLSIQAG LDLECGDDVYDEYLLNAYKQYMVSDADIDSAACHVLTARMKLGLFDGTERNPYTRISPSV IGSKEHQQIALDAARECIVLLKNKNNMLPLNVNKVKSIAVVGINAGKCEFGDYSGAPVVD PVSILQGIKDRVGDRVKVVYAPWKSAADGLELIQGENFPEGLKAEYFENTALEGTPKVRK EGWINFEPANQAPDPFLPKSPLSVRWTGKLRPTISGRYTFSFTSDDGCRLSIDNQLLIDA WSAHAVSTDSASIYLEAGKDYQLKAEYYDNRDYTIAKLQWKVPQVGKATRLDLYGEAGKA VRECETVVAVMGINKSIEREGQDRYDIQLPADQREFLQEIYKVNPNIIVVLVAGSSLAVN WMDEHIPAIVNAWYPGEQGGTAVADVLFGDYNPAGRLPLTYYKSLDELPAFDDYDITQGR TYKYFKGDVLYPFGYGLSYSSFKYSDLKVKDGANTVSVSFRLKNTGKRKGDEVAQVYVRI PETGGVVPIKELKGFRRIPLKSGESRVVEIELDKEQLRYWDAGLGQFIVPQGAFDIMIGA SSKDIRLQTVINL >gi|225935346|gb|ACGA01000046.1| GENE 39 52151 - 54730 1634 859 aa, chain + ## HITS:1 COG:no KEGG:BT_1777 NR:ns ## KEGG: BT_1777 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 137 859 285 1014 1019 1266 82.0 0 MKNKYLRCINALFLCCLMAETTMAARFQQPSLKATYNKPAKVWESEALPIGNGYMGAMIF GGVEVDVIQTNEHTLWSGGPGEDPSYNGGHLGTPETNKSYLHKTRVLLQQKMNDFTANHS AYIDADGKLITHNYEGDGNGTELRNLIDKLAGTKEHFGSFQTLSNIIVEVVNPATSEPAY SDYTRTLDIDNAIHRVTYKEGGITFKREYFMSYPDNIMVMRLTSDSKKGKISRMISLESL HTDKVIRASDNTITLTGYPTPTSGDKRVGDHWKNGLKYAQQLLVKHTGGKITVVDGKKLK IEEAKEIIVLMSAATNYVQCMDDSYHYFSGEEPLDKVKATLKKAANKKYTALLAAHEKDY 
HSLYDRMKLNLGNLTEMPVVTTDSLLKGMDARTNSESENQYLEMLYFQFGRYLLISSSRE GSLPANLQGVWGERLSNPWNSDYHTNINVQMNYWPTQPTNLSRCHLPMVEYVKSLVPRGK YTAQQYYCKPDGGNVRGWVTHHENNIWGNTAPAKKDTPHHFPAGAIWMCQDIWEYYQFNL DKDFLEAYYDVMLQAALFWVDNLWTDERDGTLVANPSHSPEHGEFSLGCSTSQAMIAEMF DMMIKASKVLGKDKEPEIAEIETAMNKLSGPKIGLGGQLMEWKDEVTKDVTGDGGHRHTN HLFWLHPGSQIVIGRSEEDDKYANAMKVTLNTRGDEGTGWSKAWKLNFWARLHDGNRSHA LLRSAMKLTVPQGRFGGVYTNLFDAHPPFQIDGNFGCTAGIAEMLMQSQGGYIELLPALP DAWKDGAFKGMKARGNFEVDVTWKEGQITSIEILSNAGAECMLKYPDAKSLKVSGARVRV LADDRIAFDTVKGKRYTIR >gi|225935346|gb|ACGA01000046.1| GENE 40 54885 - 58586 3497 1233 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 247 698 43 444 1087 99 22.0 6e-20 MNRISKVFLACACSVFCMNGQAQSNATWTNDFSNAGETLKVVGRGTCSIADNVFRSKGAY AVFGHPEWKNYSFSFKARAPKNAEQVQIWASFRNYNHFDRYVVGIKGGLQDDLYLMRTGY MGTDEFMGVRPLGFHPVPGEWYRVKVEVCGNRIRIFLNDEKQPHIDLVDKNADLVPSGEV ALGGGWIETEFDDLTVTPMADDALKNVKVAEYQKKPTPTEKENKRRQERASYTSAKLAAL TGSRTDITLDGNWLFMPDYQLDNKDKAVSVQTNDQDWHIMSVPNFWTPIRIWLHGETMPS PTGAQPKGVSDTYYQQETDRCENYTFDYRRVKYAWYRQWLELPANVEGKNLTLTFDAVSK IAEIYINGTLATSHLGMFGEIQVDGSRLLKPGKNLIAVKVTRKMDGSTAESANAIDFFYS SVRESEQEDAKVEVNKDALLKEIPHGFYGDEPAGIWQPVKLTITDPVKVEDVFIKPTLNG ATFDVTLKNHGSKKKQFDLYTDIIDKETGAVLYSGLSIRKLNLNADEERMETYTISDLKP RLWTPQHPNLYDFKFRLVADKGTELDCLTETSGFRTFEVKDGLFYLNGNKYWLRGGNHIP FALAPNDENLANTFMQLMKAGNIDVTRTHTTPWNKRWMTAADRNGIGVSFEGTWSWLMIH STPIPDQRLIEIWRNEFLGLLKKYRNHPSLLFWTVNNEMKFYDNDSNLERAKEKYRIISD VVKEMRRIDPTRPICFDSNYQAKGKDKKFGADFMSSIDDGDIDDMHGYYNWYDYSVFRFF NGEFQKQFKVADRPLISQEMSTGYPNNETGHPTRSYQLIHQNPYTLIGYESYDWADPASF LKVQAFITGELAETLRRSNDQASGIMHFALMTWFRQTYDYQNIEPYPTYYALKRALQPVL VSAELWGRNLYAGEKLPTRIYVVNDREDGTDLQPSLLRWEIQDESGKCLASGSEKIPAVK HYARYYAEPDIQLPANLPADKTKAKLVLKLTENGLPISANEYELLLTNKEWNIGQVDPNK KIVLLDKDNTKTVFDFLNINYQPISSIKELLNSKLKADLCVISGLTVCTDEEKELIRTYQ 
SKGGKLLFLNSKEAVKAIYPEYITGWIIPTEGDIVIMERNDAPVFNDIDVLELRYFNNNK REIPQACNATLKVHRHKNVTELAGQMKIHAYIDGGKPEDRIERIESMRGLTMLQIADGKG EAMISTMCTEKATTDPVAGKLVVNMINCLTTNK >gi|225935346|gb|ACGA01000046.1| GENE 41 58893 - 59123 67 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQWKSLLYTLKKISCGEFDVKGYMKYKRAMPEFEVCNRLAISRHGCLIASLFRVIFITLV CTEIKIYPSTILKNRF >gi|225935346|gb|ACGA01000046.1| GENE 42 59090 - 61795 2677 901 aa, chain - ## HITS:1 COG:no KEGG:BVU_1478 NR:ns ## KEGG: BVU_1478 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 333 1 346 952 99 27.0 7e-19 MNKRVLNSFLYGTLLFSLGTGVVSCKDYDSDIEGLNNRITTVESDMDRFKEKIEAALNAN LTVQSWYPSEDQSQYTIVLSNGDELYVQASNKATPFYQFKVEEGTWRYTKDKGAGWYKVL TFGTEQEIPGTDKDQLYYDKSNGYIYIQKTEGLVQTTITADKDTPILAENKENKTLSVYI YGENYILPVQGGGFSGISSILFQKQFVFEEDEFLEAASYTNTQGKVVTAHTATAKFKILP KDIDLSEADFQCADIHELKLTRAEAPQLLVNTTKKLDENGILSVELTPSNMTAPYYGAVL EITLDKTTTSSNYFVVKPTSYSANNGVFAYRETRTVYQNSESLKFVSTESLDLTRTIGWG FGENEEVKFVDELGFSELAVATTYELTENPNNTFEVTDKGVLTAKAANRSGKVRITYTVA EEEFSKEVLIYSQDEATAKNGIELRSTTVSLSDIESLYKGTKAFIVQNAQTTMENLGVTA SKAWKLGTQANGGNWEAIPMTTNLATITQDTELANGEVCLYYDAAQKTSYLLVGPQADGI EGANLFAMNDEGTDKAPFTLNEKEVGLYVGNVGTKYVVEAPTAKEELAIQMYVKAGIDHP DTDPNQNIRILGKKLAIPGVYDEELGYHFNELDLRTTLYNFKPADADFKITLNRDDQNDK VKEKWGTNFQWNADNYTLTLSPVWALYNFNFNDKKVSDMTEANNGVKLSWTFNAGKGETT SNGNEQWYIKDPVRQPGENTIYMGLDKQPNATGVISTDYEIKVSEITGDPEKKKLSDLEV GKVYKFGDYFKNHNFWVNGCSWDVYGGPKNLTGKNLPVLIKYDFATNKLVITDYAKKYFG KRRCELKFQPQGNETTAGLEIDSDEQTFKLKLVPTTTNFQIKIDYNTDFSNQNLFFRIVE G >gi|225935346|gb|ACGA01000046.1| GENE 43 61843 - 63672 1560 609 aa, chain - ## HITS:1 COG:no KEGG:BT_1775 NR:ns ## KEGG: BT_1775 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 609 3 570 570 280 33.0 1e-73 MKNKYWKYICLLGLGAFCWTCSDDNPNDGPDGPGSEGGQGTEQEIVRSYKIGTLAGFEGN LRQKMDKPSTDVSTMHNRSLFCRYVASAYPEAQDDQPYPGPITEADWWDNFVEEIAYSGQ DYVAMNCRGEANKDIDHGRPDKLHDLMAAIKRKGVEHEFKIAIFDDTPASWSAARNAHKG 
YGYDNKPFKNGSRVEGSHYPLLFLDPEKEDMKGADNDFRKEVYYYIWNANLKPSFENVPR DYWFEIDGRPVILFWNPNGFLQDSYLSELMKKDPAKYKSTTGLDETKSDAYNGKLSYILK CISDDFNSTFGVRPFLIIQREWTDRDFSLVNCPYLDGIHNWFAVPTMDMSEEVYNENLIY NTYISAYNFKGFSVGSGCPGFVQGDLARPNWQFIDADHGRYATKMMESFLAKKPDLVFLE GFTDLAENAAWWRSSDKIYYDYPNQRINLLRKYSNRPFPTRQKLEAEACDYCFNGSVSGN KIETLLMETEMNNAPEQGVVKYCNDTKYNGGWHANLTSGSLNTLRWKEIPFRTGTSTIRF RYSSNNAASVRCQIGDTMTQSVSLPATGGAWEETEVAVYSRDARGYADFNLIVTEGDISL NYAEIIAEK >gi|225935346|gb|ACGA01000046.1| GENE 44 63703 - 65064 1155 453 aa, chain - ## HITS:1 COG:no KEGG:BT_1772 NR:ns ## KEGG: BT_1772 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 451 24 424 426 218 34.0 3e-55 MKLMKYITGCSAALLLLAACSKEEGPSVDDYFLNYEIPEIPVTEDYQVGIFYYKGNDITT RPNGDANFTRWELLTLTDKELSTKNNYNNLTPQVMPESQKDGYLQPDAGSQSYRMIPMVQ QHVDDCIEAGANFIILPEVGADLGKGPGTEINQGDSLFVTMMLGRSGRGGKRLPMPHLGQ PGVDFVDLKTMKIVISVNWNNIVDLPPTLSSSNCIETAAGKVYDGVTYTRQQLLNNFFSK IGCYFSDDRYYKLGGTRPVVYIKNKTGEIYAQESKAMYDGIRKAVKEATGYDIYIMVESA DVWCNQMRYQYFYMEGCVDAVASRNMYDQSEMSRSYMYPQMIDQNWKYNRETALPAFGDG SMEFVPCVGPAWNKLVQDGRGSIGNTPIVKKDAATYRTMCNVAKMNAGRNRLILIDSYNK YNFDSFIEPTVEGYGNGYGRTYLDITRQQFKKN >gi|225935346|gb|ACGA01000046.1| GENE 45 65087 - 67081 2076 664 aa, chain - ## HITS:1 COG:no KEGG:BT_1773 NR:ns ## KEGG: BT_1773 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 637 13 638 668 328 35.0 5e-88 MTMKKYLYTLCTAALLGLATSCVDSFDATNQNPNKLYLDEIDIQKIFPGTIYKSLDVLSE MNYNYYAYMARYIVSWTSPQSSDNVGDRFYNFYKKVLGDVAIMEQKYDRDTNGNYWAILT TWKAYLYSVLTGTWGPIPLDSACKETYGNVYYYNSEAEAYMQILRWLDQAVETFDPNGTK MVKDPFYPSAGGDSDIEKWRKFANTLRLDVAIRMMNMKKSPEAISMAREQIEKALNENNR NYLFSSNSDNAAGRYGTDPNADVSLYYNRILKEFDLGTKLETEIGGLTYPAMNEYFFCYM RSYKDPRLSKYAQTSRINENGDYRAVVRDSLWSVAEQKYIQVSYRIPFLPRFELKQTPTG WIVGKDEHNNNLQSLYSTASVSIEGYTYALIHRDFIKQDATVKLLTWADVNFMLSEIQLR KDEWGLNVALPQSAEQYYYNGISASMNEYGVTSGVSEYIERNGIKWNTNGLGCRDYRAFY KADINGKGGYKNNLEQVWKQRYIADYFNGFAGWTLERRTRVMQYPPIFYNGTPNNYGTYG 
ANQFDPVPERLQYPTDERNWNMNYYREACRMLQSTSLVPRSDGNYNDNFFTPLGIGGPYN QEGLYERWKGGQLMYNNEMVAHWYGDTIEEFVENVLEDYPEYASLQGEERLAASIKWVRI EDNK >gi|225935346|gb|ACGA01000046.1| GENE 46 67087 - 70365 3056 1092 aa, chain - ## HITS:1 COG:no KEGG:BT_1774 NR:ns ## KEGG: BT_1774 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 1092 14 1101 1101 846 42.0 0 MKQRFSQHMRYYERSLILILMLFSASVMQAQNRTVKGTVSDAQGEPIIGANVVIVGGTKG VITDLDGKYSIQVPENGAVLKFSYIGFKTKSFNVVKGKNVLNVTLEEDAVMLEQTVVTAM DLRRDEKSLSTAFQKMDVESMTENRDAGFVNMLAGKVAGLQVISNGAAGSATVRIRGANS ISGNNQPLYVIDGVPIINDVTGGEIDYGNPANSINPDDIENIVVLKGANASALYGSDAAN GAILITTKKAGQRSGLGVTYSTNVQFTEFSQYPIYQNIYGGGHINRFENNKANSFNGDVK VPYDPNMPYGIQRMGGYDNSRSWGMPMLGFQVVGRNGELKSYVPTPANTTSMYQTAYSWT NSVSIERATEHVSTRIGFTNLRSDDVLEGLNNLTRNAFNVRSNVKLTKSLDVDLNGRYTH ENVKNRSYRNNSDRNPIYTLMDMPRDLSIQEMYPWKDENGKPTALQFKSPVWMLNELSNQ DKKEWLLADVTVNYKITKDLKLRLKAALDLNMKEGYEFRNMYTPGDADGFYKEFTEKSRN YTYEAMLSYNKTWKDFNISASVGANSQDFLFKKQNSEIGTLATSDFISLTNNGATVKSWP EYNAKKKQAVYGTASIGYKDFIYVDVTGRNDWSSALPSDNRSYFYSSYGVSFVLTELVKS IPKDWLSYAKIRGSYAKVGNDTGFDQLLNGFSYNTSYLGDMAWFESENKRKTNSLKPEST TSFETGLDLRFLKDRASLSFTYYNKNTKNQILTSTINGVSGYGEALFNAGEVKNWGYEVT LGVVPFRNKDWEWKVDINWAKNNSEVVSLANGMDYMTLTTVQNSVELRIVKGEPLVSLYC REPWKTNDEGQVLVGANGRPLSGEAKFLASVEPKWTGSIRTSLRWKDLTFSAMLDIRMGG HVWSETAFQSSRNAQSIMSLGGRTEHLFSDLILNEGDQTGYLGILDPKYVPNGKNNIYMD ASRPKGMNIPGAVYDSSVPGLAGQPCQAWIKPIDYWTNDSGKNGELYLYDASYVKLKEIS LGYNIPKNWLRKIGFIQSMRVSAVGRNVAILHQKTPKGIDPEATSSMGIQQGLERGFNLP TSSYGFDFKITF >gi|225935346|gb|ACGA01000046.1| GENE 47 70815 - 73397 1348 860 aa, chain - ## HITS:1 COG:no KEGG:BT_1770 NR:ns ## KEGG: BT_1770 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 860 1 855 855 1091 67.0 0 MRYQLQILLSVLLCMVGKVDASSLYDYGLSLKSHSVPGIERTTLYLDDNQPFSIKNDFII SFQMYVRANEHDFGTILHLHTNTNQFIRFSFVAGEERHFPALVLNEGIININSPIEREKW LDVSLHLRLKDNVIEIDYDNKKISAMAPLQGVKSVTALFGQMKEYLSDVAPIDLRNVTIT 
QDGKQIREWKLWKHNDTVCYDEIEGAVARAIHPVWLIDNHIEWKLVHQAKIPGKLDVAFN AREALFYLVRSQSIDVLDENGTLQKEIAIRGGYPAVEFPNHLLYDTLSNKLVSYYPKKGI TSRFSFDTERWSNEIRNTEEASNYNHARTFNPADSSFYFFGGYGFYQYRNDLYRMKYSTN QIEQVEYERPLYPRYSAAMAIVGDELYIFGGRGNKYGKQELSSHYYWGLCAINLKNKQSR IVWQKNQPQEEGTIMASTMYFEPSDSSFYAVSTNKGGVLWKISMKDSVYSEVSKPIYNES TYQDSDFSLYTSPSHGKLFLVLDKILSNHTHELAIYSINMPLVNEVDIRQSTAGESINNR WYLYAIGILLLLVLAGFVLYRFKYNGKNKKAPATKKGTEKTVATTGKVQSQSDVPESKTI PKKEWMQESETIFTETVNYYDRSRASISLLGCFNVRDKDGNDITSNFTPRLKHLLILLIL YTEKNAQGILASKTTEILWPEKEETAARNNRNVNLRKLRVLLESIGDMEVMIENNFLRIK WGTGVFCDYHTLITCTKQFEQEKSEELLNRILEILLYGPLLPNTILDWLDDFKDDYSSYS IDLLKNLLDIEISRNHQDMIIRLADIMFLHDPLNEEALAAKCSVLVTQGKKGIARNLYDR FCKEYHDSMGETYKVPFADL >gi|225935346|gb|ACGA01000046.1| GENE 48 73596 - 75857 2023 753 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 34 749 49 779 790 275 28.0 2e-73 MKILHFCAAITMAAMLSGCNGGQSQTANRAPVDYVNPYIGNISHLLVPTFPTIQLPNSML RVYPERADYTTELLNGLPLIVTNHRERSAFNLSPYQGKELRPIITYNYDNEHLTPYSYEV DLNDNSMKAEYALSHQSALYRITFEADKPAYIIVNSRNGSIHVGENFISGHQQLSANTNV YVYIEPQEKPVSTGILKDGVIEASKDNAEGINACAAWRFADGTTTVSLRYGISFISEEQA EKNMRNELKDYNIKNLAKAGRQIWNEALGRIKVEGGTEDDKTVLYSSFYRTFERPICMSE AGGRYFSAFDGEVHDDNGTPFYNDDWIWDTYRAAHPLRTLIDQKKEEDIIASFLLMAEQM GTMWMPTFPEVTGDSRRMNSNHAVATIADALAKGLNIDAAKAYEACRKGIEEKTLAPWSG AAAGWLDNFYRENGYIPALRPDEKETDPNVHPFEKRQPVAVTLGTSYDQWCLSRIAEILG KKDEAAHYLQCSYNYRNLFNKETGFFHPKDKEGNWITPFDYRYAGGMGAREYYGENNGWV YRWDVPHNVADLINLMGGKEQFIANLDRTFSEPLGRSKYEFYAQLPDHTGNVGQFSMANE PSLHVPYLYNYAGQPWKTQKRIRQMLKTWFRNDLMGMPGDEDGGGMTSFVVFSSLGFYPV TPGAPVYNIGSPLFTHAEITLSNGSVFEIEAPNVSEENKYIQSATLNGQKWEKPWFHHDD LKNGGKLVLTMGNKPNKTWGSGANAAPPSADDK >gi|225935346|gb|ACGA01000046.1| GENE 49 75893 - 79135 2776 1080 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and 
metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 34 492 25 444 1087 122 27.0 5e-27 MKKNLILLCSLLTGFTLQPIQTNAQTATNDQKLFYPYSFAPSEGLVNKTEKEHRQEICLN GYWDFQPVSLPKEYKQGKGVAPQLSLPKDGNGWSKTRIKIPSPWNINSFGYRDLEGPDHR NYPSYPKEWEQVKMAWLKKNITIPANWTGQQIKLYFEAVAGYSEIYINQEKVGENFDLFL PFSFDITDKVTPGETVEILVGVRSQSLFEDNSTIGRRIVPGGSMWGYHINGIWQDVYLLA LPKVHIEDVYIKPLVAKNTLEIEVTLQNKTDRKADIQLQGNVREWINCAGIDINSAPVPA WTLGTEALQIAPFKVTVAPNASQKVTLQVPVNENILEYWTPEQPNLYALLLSIKDKKQTI DTKYERFGWREWALQGTTQYLNGKPYALHGDSWHFMGIPQMTRRYAWAWFTAIKGMNGNA VRPHAQVYPRFYMDMADEMGICVLNETANWASDGGPKLDSDLFWEASKAHLKRFVLRDRN HASVFGWSVSNENKPVILHVYNRPELMPVQKKAWEEWRDIVHQYDPTRPWISADGEDDGD GILPVTVGHYGDINSMKRWTEIGKPWGIGEHSMAYYGTPEQVSKYNGERAYESQEGRMEG LANECYNLIANQRRMGASYSTVFNMAWYALKPLPLGKKDKSKAPSVSEDGVFFGEYKEGV PGVQPERVGPYCTTFNPGYDPTLPLYQEWPMYSALRAANAPGEPAWSPYATIDKEQYQAH KATNTAKNYKEVVFIGNPDSKVKQLMDAQGVIFASKTTIPSSLLYIVDGSEALNAATQKE IQKQLAKGADLWIWGITPETVEKYNEILPLPVALDPLKRSSFLPVQKAWMHGLNNSDFYF CELQKADASNYSLKGAFVEEGEVLLNACKTDWRKWNKRPEEIKTAGTIRSEYECTSATPV FVKYQHGSSTVYLNTLTEFANSEKGYNTLSIILKNAGINYQKPEININEVFFLRDEQMNF PVATKEKFVKKNNGWALEFYVFSPRPLDDLLIEPNMPKLSLMLKAKTRELSINDKPYASL SHDGRNEVIYKELPLLQGWNKLVITIGDGDRNDFSGFFQCDNKKDFLPTLKAAFTNPEAK >gi|225935346|gb|ACGA01000046.1| GENE 50 79516 - 82026 1599 836 aa, chain + ## HITS:1 COG:no KEGG:BT_1781 NR:ns ## KEGG: BT_1781 # Name: not_defined # Def: xylosidase/arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 267 549 397 676 679 310 55.0 2e-82 MGHNLLFALRYFLPLFLGGMIALNSCKEERASINDDPGQGDDGGDEYEYVDDRYDNGISC EEYLTNYRGIPFKKDSEGNSLSVISGVSSVIEAEDFDDGGQDISFSFKNSTAGNYKDYRT EKGVAISKSGEVINIGNVNNGDWLCYTLQVNQAGAYSIDTYCVSGGGKTSFYFEVDGKSA GQIVESPEDEWSVYSHSVKVTNVQLSEGRHVLRWCTTGSMNLDKFVFTRTGEYTGDKVDG SQFEYPRYGYYEHNPLFVDFSSQMYQNSFTGTLYTADPSAHVWADGRLYVYASHDMEPTQ GCDRMDRYHVFSTTDMKNWTDHGEIMNSATVKAHVGMGIDGFMWAPDCAYNKEEQLYYFY FPHKIDANTWRIFVATSKEPAAKFRVKGVIDGIPSTIDPCVFVDDDGQPYIYTSGAGKGC WGAKLRKDDWTKLDGVMSPLSGFTDFHEAPWVHKYKGNYYLSHSDNHAGSQGGNRMQYSM 
SNNLLGPFTPCGAYMYPHGEETAHGSIVEYKGRWYSFYHTANYSGKGALRSVCVDPIEYD QNNKLKMVQNWGSPRSGRAVEVRMTPSVVIQAENYNTGGEHYGYHKNPLEGGMQQETQNG ITYLKDMKSGEWVRYSINVKEAGRYGITCRMFQKRSGGKFRIAVNGVYKTGEIVLSGSAG VWNETLVYPIELETGEQYIDFRIKSGDIDIDWIKFGAGHSQVPGIIQAEDFDDGQYSFKN ASSGNFKSYRSDQGVAISASNNIVHISNTSGGDWIQYTFNVQEKVSTVKVRAAAEKDGKF ALSFDGGAQLNPVATTTGNWTTYNDFTVSNLQLSAGTHTMKIHILSPMNIDWFEFR >gi|225935346|gb|ACGA01000046.1| GENE 51 82247 - 83020 276 257 aa, chain + ## HITS:1 COG:BH3955 KEGG:ns NR:ns ## COG: BH3955 COG0500 # Protein_GI_number: 15616517 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus halodurans # 13 253 11 255 255 116 30.0 4e-26 MNSNYQVGDLIYDANIYDGMNTDLADLHFYKRWLPKSKDARILELCCGTGRLTLPIAKDG YNISGVDYTSSMLDQAKIKAFEAGLEIRFIEADIRTLNLQDKYDLVFIPFNSIHHLYKNE DLFRAFNVVKNHLKEGGLFLLDCFNPNIQYIVEGEKEPKEIAAYTTDDGREVLIKQTMRY ESKTQINRIEWHYFINGEFNSIQNLDMRMFFPQELDSYLEWSGFHIIHKYGGFEEEVFND NSGKQVFVCQYKSDCLY >gi|225935346|gb|ACGA01000046.1| GENE 52 83161 - 85041 1291 626 aa, chain + ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 132 624 32 509 677 385 42.0 1e-106 MKTTPWMKLCKGVALALTVSYGLTYCHTTKSKLTLEQQGDSLTVIHITNPTNYILLPIEE EAPESQVLLHTGEAADTDMDIRLAQNQVDYFVPFALPGGTRNVTVRVRNKSKDALCWKEI KLSDTFDTTNTEKFRPVYHHTPLYGWMNDANGLVYKDGEYHLYFQYNPYGSKWGNMHWGH SVSKDLMHWEHLEPAIARDTLGHIFSGSSIVDQENVAGYGAGSILAFYTSASDKNGQIQC LAFSKDNGRTFTKYEKNPILRSSDGLKDFRDPKVFRYEPEDKWVMIVSADKEMRFYDSKN LKDWNYMSSFGEGYGVQPCQFECPDMVELSVDGDTNRKKWALIVNVNPGCYFGGSATQYF TGDFDGTKFICDNQPNVTKWLDWGKDHYATVCFSNTGDRVIAVPWMSNWQYCNIIPTKQF RSANALPRELSLYTQNSEVYLSAAPVAEIKALRKESKEIPTFTVANDYHINSLLADNGGA YELALDITAGGAEIMGFSLFNDKGEKVDIYFNLPEKRLVMDRTKSGIVDFGKNSVPHEIE AHDRRKTTSINYMDDFALATWAPIRKEKKYALDIFVDKCSVEIFLNGGKIAMTNLIFPSE PYNRMCFYSKGGTFNVDSFNVYRLGL 
>gi|225935346|gb|ACGA01000046.1| GENE 53 85117 - 85992 572 291 aa, chain - ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 1 289 1 289 290 191 38.0 1e-48 MKTISNKQLTIQVSPHGAELCSIVANGKEYLWQADPAFWRRHSPVLFPIVGSVWENEYRN EGVPYTLTQHGFARDMEFTLVAEKEDEVRYRLVSNEETLQKYPFPFCLEIGYRIQGKQIE VMWEVTNTGDKEMYFQIGAHPAFYWPEFDANCLERGFFGFDPKDGLKYILISEKGCADPS TEYSLELTDGLLPLDIIHTFDKDALILENEQVRKVTLYNKEKQAYLSLHFNAPVVGLWSP PAKNAPFVCIEPWYGRCDRAHYTGEYKDKDWMQQLQPGEIFQGGYIIEIDE >gi|225935346|gb|ACGA01000046.1| GENE 54 86021 - 86893 725 290 aa, chain - ## HITS:1 COG:no KEGG:BDI_1888 NR:ns ## KEGG: BDI_1888 # Name: not_defined # Def: putative DNA repair ATPase # Organism: P.distasonis # Pathway: not_defined # 1 279 1 279 281 400 77.0 1e-110 MTKQQALKLFEERRVRTVWDDEQEKWYFSIVDVVAVLTDSSNPQTYWRVLKKRLLSEGNE TVTNCNGLKMQAADGKMRLTDVADTEQLLRLIQSIPSPKAEPFKLWMAKVASERLNQIQD PELSIDQALMDYKRMGYSDSWINQRLKSIEIRKDLTDEWKKHGLQEGVQFATLTDIIYQT WSDMTSKEYKQFKGLKKENLRDNMTNKELVLNMLAELSTKEISETSDPETFSDHIDIAQQ GGEVARNARLELEAKTGKRVISPLNAQSGILLKGKSEDKSKEGSENELKD >gi|225935346|gb|ACGA01000046.1| GENE 55 87173 - 90298 3302 1041 aa, chain + ## HITS:1 COG:no KEGG:BT_1763 NR:ns ## KEGG: BT_1763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 1 1041 1041 1978 96.0 0 MHRIMKNKKLLCSVCFLFTFMSVLWGQSITVKGNVTSKTDGQPVIGASVVEATATANGTI TDLDGNFTLSVPVNSTLKITYIGYKPVTVKSAAIINVLLEEDTQMVDEVVVTGYTTQRKA DLTGAVSVVKVDEIQKQGENNPVKALQGRVPGMNITADGNPSGSATVRIRGIGTLNNNDP LYIIDGVPTKAGMHELNGNDIESIQVLKDAASASIYGSRAANGVIVITTKQGKKGQIKIN FDASVSASMYQSKMDVLNTEQYGRAMWQAYVNDGENPNGNALGYNYNWGYDANGNPVLYG MSLSKYLDSKNTMPVADTDWFDEITRTGVIQQYNLSVSNGSEKGSSFFSLGYYKNLGVIK DTDFDRFSARMNSDYKLIDDILTIGQHFTLNRTSEVQAPGGIIETALDIPSAIPVYASDG SWGGPVGGWPDRRNPRAVLEYNKDNRYTYWRMFGDAYVNLSLFKGLNVRSTFGLDYANKQ 
ARYFTYPYQEGTQTNNGKSAVEAKQEHWTKWMWNAIATYQLEIGKHRGDVMAGMELNRED DSHFSGYKEDFSILTPDYMWPDAGSGTAQAYGAGEGYSLVSFFGKMNYSYADRYLLSLTI RRDGSSRFGKNHRYATFPSVSLGWRITQENFMKELTWLDDLKLRASWGQTGNQEISNLAR YTIYAPNYGTTDSFGGQSYGTAYDITGSNGGSTLPSGFKRNQIGNDNIKWETTTQTNVGI DFSLFKQSLYGSLEYYYKKTTDILTEMAGVGVLGEGGSRWINSGAMKNQGFEFNLGYRNK TAFGLTYDLNGNISTYRNEILELPETVAANGKFGGNGVKSVVGHTYNAQVGYIADGIFKS QEEVDNHATQEGAAVGRIRYRDIDHNGVIDEKDQEWIYDPTPSFSYGLNIYLEYKNFDLT MFWQGVQGVDIISDVKKKSDFWSAANVGFLNKGTRLLNAWSPTNPNSNIPALTRSDTNNE QRVSTYFVENGSFLKLRNIQLGYTVPAVISKKLRMERLRFYCSAQNLLTIKSKDFTGEDP ENPNFSYPIPVNITFGLNIGF >gi|225935346|gb|ACGA01000046.1| GENE 56 90326 - 92038 1759 570 aa, chain + ## HITS:1 COG:no KEGG:BT_1762 NR:ns ## KEGG: BT_1762 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1102 96.0 0 MKKILYIATIGFTLLTSSCDDFLDRQVPQGIVTGDQIASPEYVDNLVISAYAIWATGDDI NSSFSLWNYDVRSDDCYKGGSGTEDGGVFNALEISKGINTTDWNINDIWKRLYQCITRAN TALQSLDLMDEKTYPLKNQRIAEMRFLRGHAHFMLKQLFKKIVIVNDENMDPDAYNELSN TTYTNNEQWQKIADDFQFAYDNLPEIQMEKGRPTQAAAAAYLAKTYLYKAYRQDGVNNNL TGINEEDLKQVVKYTDPLIMAKAGYGLETDYSMNFLPQYENGSESVWAIQYSINDGTYNG NLNWGMGLTTPQILGCCDFHKPSQNLVNAFKTDSQGKPLFSTYDNENYEVTSDNVDPRLF HTVGMPGFPYKYNEGYMIQKNDDWSRSKGLYGYYVSLKENVDPDCDCLKKGSYWASSLNH IVIRYADVLLMRAEALIQLNDGRIADAISLINDVRSRAAGSTMLIFNYKEEYGVNFKVTP YELKAYAQDEAMKMLKWERRVEFGMESSRFFDLVRWGEAKDVINAYYVTEASRCSVYKNA GFTENKNEYLPVPFEQISASNGNYTQNFGW >gi|225935346|gb|ACGA01000046.1| GENE 57 92065 - 93447 1174 460 aa, chain + ## HITS:1 COG:no KEGG:BT_1761 NR:ns ## KEGG: BT_1761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 460 1 461 461 837 91.0 0 MKSIIKQLYTILLVTVACLTATGCSDDFKSNLRLDGDVWVNAIKLDAYAGTIDYQNKTIV VGVPYDYDVTRMAVTEMNLSEGATASIAVGETIDFSLPVSLTVKNGDVQMSYTITVKRDE AKILTFKLNDTYVGKVDQLSKTISVVVPLTVDITQLKGTFTVSDGATVTPVSGSIQDFTN PVTYTATYRSAVTPYVVTVTQGNVIPTAFVGTASSVSQLTSPEEKAAAQWMMDNISMSEY ISFKDIVDGKVDLGKYTAIWWHFHADNGDNPPLPDDAKAAVEKFKVYYQNGGNLLLTRYA 
TFYIKDLSIAKDERVPNNSWGGNEDSPDIVDGPWSFPITGNESHPLFQDLRWKDGDQTRV YTFDAGYAITNSTAQWHIGDWGGYEDLNAWRNLTGGINVACGDDGAVIIAEFEPRANSGR TICIGSGCYDWYGKGVDASADYYHYNVEQMTLNAINYLCK >gi|225935346|gb|ACGA01000046.1| GENE 58 93465 - 95033 1531 522 aa, chain + ## HITS:1 COG:no KEGG:BT_1760 NR:ns ## KEGG: BT_1760 # Name: not_defined # Def: glycosylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 522 2 523 523 1026 95.0 0 MKNMILHIALTTLLASMTACSDEMDPVLTQKDWDGTATYFQSTDEHGFSMYYKPQVGFIG DPMPFYDPVAKDFKVMYLQDYRPNPEATYHPIFGVATKDGATYESLGELIPCGARDEQDA AIGTGGTIYNPADKLYYTFYTGNKFKPSSDQNAQVVMVATSPDFKTWTKNRTFYLKGDTY GYDKNDFRDPFLFQTEDGVYHMLIATRKNGKGHIAEFTSADLKEWESAGTFMTMMWDRFY ECPDVFKMGDWWYLIYSEQASFMRKVQYFKGRTLEDLKATTANDAGIWPDSREGMLDSRA FYAGKTASDGTNRYIWGWCPTRAGNDNGNVGDVEPEWAGNLVAQRLIQHEDGTLTLGVPD AIDRKYTSAQEVKVMAKEGNVTESGKTYTLAEGASVIFNRLKVHNKISFTVKASSNTDRF GISFVRGTDSKSWYSIHVNADEGKANFEKDGDDAKYLFDNKFNIPADNEYRVTIYSDQSV CVTYINDQLSFTNRIYQMQKNPWSLCCYKGEITVSNIQVSTY >gi|225935346|gb|ACGA01000046.1| GENE 59 95134 - 96969 1507 611 aa, chain + ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 136 611 22 511 677 454 49.0 1e-127 MNLKLSSKQIQTFAVLLCMIVMNISLSARADNSPLLIKDLGEGHCLVRVNTNQKYLLLPV EDASPDVRIRMIVDNKEVDNFDVRLAIHKVDYFVPVDLSAYSGKQISFKFKMNSNDPIRV NLSPDNTACCKEMKLSDTFDTSNREKFRPTYHFSPLYGWMNDPNGMVYKDGEYHLFYQYN PYGSKWGNMNWGHAISKDLINWEHRPVAIAPDAFGTIFSGSAVVDSKNTAGFGAGAIIAI YTQNGDRQVQSIAYSTDNGRSFTKYENNPVLVSEARDFRDPKVFWYEGTQRWIMALAVGQ EMQFFSSPNLKDWTFESSFGKGQGAHGNVWECPDLFELPVEGTNEKKWVLLCSLGDGPFG DSATQYFVGTFNGKEFVNESPSKTKWMDWGKDHYATVTWSDAPDNRRIAIAWMSNWQYAN DVPTSQYRSPNSVPRDLSLFTVDGETYLQSAPSPELLKLRDISKKRSFKVNGTRTIKDMI AGNEGAYEIELTIENQHADVIGFRLYNDKGEEVDMQYDMKEKKFSMDRCKSGEVGFNENF PMLTWTTIESGKDELKLRLFVDKSSVEAFGDGGRFVMTNQVFPSEPYTHIDFYSKGGAYK VDSFVIYKLKK >gi|225935346|gb|ACGA01000046.1| GENE 60 97354 - 98523 1109 389 aa, chain + ## HITS:1 
COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 4 383 24 417 426 94 26.0 3e-19 MENSKNSSLSKLIPVMLCFFTMGFVDLVGIASNYVKADLGLSDSQANIFPSLVFFWFLIF SVPTGMLMGRIGQKKTVLLSLIVTFASLLLPVFGDSYMLMLISFSLLGIGNALMQTSLNP LLSNIVRGDRLASSLTFGQFVKAIASFLAPYIAMWGATQAIPTFDLGWRILFPIYMIVAV IAILLLNVTQIEEEKEDGKPSTFGQCIALLGKPFILLCFIGIMCHVGIDVGTNTTAPKIL MERIGMGLDDAAFATSLYFIFRTAGCFLGSFILRKMSPKSFFGISVVMMLIAMIGLFIFH EKAIIYACIALIGFGNSNVFSVIFSQALLYLPGKKNEVSGLMIMGLFGGTVFPLAMGVAS DTSMGQNGAIAVMTVGVLYLLYYTFRIRK >gi|225935346|gb|ACGA01000046.1| GENE 61 98559 - 99446 970 295 aa, chain + ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 9 294 35 322 326 191 38.0 2e-48 MNNIIVGMGEALWDVLPEGKKIGGAPANFAYHVSQFGFDSRVVSAVGRDELGEEILKVFN EKKLKMQIEQVDYPTGTVQVTLDDEGVPCYEIKEGVAWDNIPFTDELKRLALSTRAVCFG SLAQRNDVSRATINRFLDTMPDIDGQLKIFDINLRQDFYSKEVLRESFRRCNVLKINDEE LVTISRMFGYPGIDLQDKCWILLAKYNLKMLILTCGINGSYVFTPGVVSFQETPKVPVAD TVGAGDSFTAAFCASILNGKPVPEAHKLAVEVSAYVCTQSGAMPELPQVLKDRLM >gi|225935346|gb|ACGA01000046.1| GENE 62 99596 - 100714 1065 372 aa, chain + ## HITS:1 COG:FN2118 KEGG:ns NR:ns ## COG: FN2118 COG2849 # Protein_GI_number: 19705408 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 76 251 70 243 245 85 35.0 2e-16 MKHVSILLRSRSWNVLLLLCMTLPLFAQKEYKIDQVSVVNVGDGRLLYRDLTTENPLNDE HRIIDGYHSAYLLASFKDGFYDGSYKEYLDNILITEGTYKEGRKQGLFKIYSKFDGKLKE EKSYKEGKLDGTSKTYFTTGKVETEKGYRMGKEHGKHLSYESDGTLRMDHNYKDGRQVGK QYTFMKGTHELYETIYYNEDGLKEGKFSSMFTFGAPYVLGSYKNGQKEGAWMKFAETGDT LTIETYQNGKEDGLQVLFSPTKGTREKEYYMKNDRKDGLYREYNPANGELKYEATYQQGR LNGKERQLVVSNRFDYWEITTYVNGRPNGSYEARYVKNDKLRECGEYKNGHRVGRWKRYD IDGKLEKEWEEN 
>gi|225935346|gb|ACGA01000046.1| GENE 63 101040 - 106625 3597 1861 aa, chain + ## HITS:1 COG:PA4489 KEGG:ns NR:ns ## COG: PA4489 COG2373 # Protein_GI_number: 15599685 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Pseudomonas aeruginosa # 1201 1461 851 1110 1516 62 23.0 7e-09 MNTMLCQLKSLLLLGVLLFSNGAFGQPMIWDDTSPETIVFELTNKEALKLLKGKLRQKQW DKIQQTPFARFTDAWSNPPVKGHFLLADVERNEVHYRYAPVIPFHVFLFKEYGVLTLQVV DAGGTIRKDAKVRVGSKVVYYDEDSQTYTDDNWSQKEQHILTVEVDKFRAVFDLRKHLVP PWYKNDYGRQDAPEFYSYLITDKNKYRPGETVRFKSYALSEHKRPLKQELSLWMRVGSSW RDYKKIMSVVPYHPGGFAGEFLLADSLNLKLDQRYTVQLRDKRGRIVASTNFKYEDYELN GNKLLVKLASNVQYAPQSNRMDISATDANGLPLREVNVEVTVGRQQVLKSYAQILSLPDT LMSVQAELDASGKTSVDIPARIFGASDCFYTANVVLLTADNNRLEQQSKATFYYSCYDMQ CTTQADTICFSFFDLGVERPVAAELTYGEKKEVKKVRLPYREPFNQAVTDYRFKIPETGY ETTIASAALDSKLELNGGIEKDSFCVSLSNPLQLELSWYVYQGNRLLQKGSGKEMEHKSG EIDPSSVYYVEVFYFMGDKECMLKRSYTSPSERLVIESDLPERVYPGQKVKTKLNVTNIQ GRSVSNVDLTAFAVNTQLDYHVPDLPYYGSAPRPREQRASYSMKQKEYLHTGTLDYQHWN QLLHLDKLPYYQFAYPSGNLFRQTVDTPDGTTQFAPYVMLNGRAVNIYVIEQNDVPCYFS WTEQPKQYSFPVLHPSGKQKITLRMHDRAFIIDSLAFDQGKKTILSFDMNQLPQGVEVVW LRQRKGKYDEYKFTSEEKKRYERYLCRLPVMDGAVYTSLEHNGKLFPISLMELSRYKKKI LAGPVEPGPWKYMNGVLYRHEGGFSYEFEGNVVYKYKDEEMCPAYLNFSSVAKIPTLNDY HLSPERFRSLVAELRKGKSWHPTRIYFSLPDKTLNFRLPEEKDSTGVANLLFKDCTTGEL VYPDTLVQMNRIYSKFPAGTYDAILLYNNGKYLKQDALTIQSYTYLDVDMESLPLHERDS LSAGWLLLGRGIGRIGTNFPGHREMRIRQYVRNYGGKVSGYVTDATGEPLIGCSVVVKGT TEGTITDMDGYFEMDCDRGGDQLLFSYVGFKQQEMRATPGANLLVTLEEDSQALEEVVVV GYGMRSRSSLTGSVAGLMVGSSSSPAATAPLEKLEEQDQKAKEEADNEQLYNELMQLNGL RRNFSDVAFWQPRLFTDKTGTVQFETTFPDNVTKWETVVYGMNRRLQTGTFRRSVRSYKP LMAELKTPRFLVEGDQSTVVGTIRNYLDGQHIAGKTLFCVGTDTLKRQEVSFTDGFHETV PMQAAHTDSVTISYLFTRDDGYQDGEEYTIPILPQGTELAEGTLGILSDAKTVKVQSGED EEVMVSITDNQLDIYKESVNYLTGYKYLCNEQLASKLIGLLAYQQYMQSKGEKVKVDKAI RPIIHRLTNHQNKHQLWSWWGNSENTSFWMSAHILRALKMAQDAGYPVELNLNGLKVEYA HTRPYRGMKLEDIEILHALHEWKVEADYSSAVRLLEPFVRQLEQKEDSLANRNKYYRPLS YLKEKLLLWEIKQQVDSVNVGDSVRPYLKKDMLGGVYCDDGRRTYYWEGNQMINTLIAYR 
IIKNDPSLCKLKENMQLYILRTKERGWNTYQASSAVATVLTDLLAGSGGMSGTSVSVSGK DHQRITEFPYHARLTAGDSLLISKTGKEPLLYSVYSMKRVMNARQSDAFKVEATLEKDSL VAGVPVTLTVTLQVKQEGAQYVMLEVPVPAGCSYASKPVNFSRSEVYREYFKEKTVIFSE KLPVGTYQFTVPLLPRFTGKYTLNPVKVELMYFPVVNANNAGREIWITERNTKDIRKKEN D >gi|225935346|gb|ACGA01000046.1| GENE 64 106618 - 109500 2597 960 aa, chain + ## HITS:1 COG:SMb20671 KEGG:ns NR:ns ## COG: SMb20671 COG1879 # Protein_GI_number: 16265126 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 22 304 28 313 322 197 40.0 6e-50 MIEAMKYTKWITVLFCLLGLAACRQDAPRFRIGVAQCSDDSWRHKMNDEILREAMFYDGV SVEIRSAADDNRKQAEDVHYFIDKGVDLLIISANEAAPMTPIVEEAYQKGIPVILVDRKI LSDKYTAYIGADNYEIGRAVGNYIASSLKGKGNVVELTGLGGSTPAMERHQGFMAAISNY PDIKLIDKADAAWEREPAEVEMDSMLRRHPKIDAVYAHNDRIAPGAYQAAKKAGREKEMI FVGIDALPGKGNGLELVLDNVLDATFIYPTNGDKVMQLAMNILEKKSYPRETVMNTAVVD RTNAHVMQLQTTHISELDQKIETLNGRIGGYLSRVATQQVVMYGGLVILLLVAGLLLVVY KSLRSKNRLNKELSEQKKQVEQQRDKLEEQRDILEEQRDKLEEQRDQLIQLSHQLEEATH AKLVFFTNISHDFRTPLTLVADPVEHLLADSSLSEDQRRMLLLVQRNVNILLRLVNQILD FRKYENGKMEYTPISLDILSSFEGWNESFMAAARKKHIHFSFDYMPDTDYRTLADVEKLE RIYFNLLSNAFKFTPENGKVTVRLSSLTKDDHCWIRFTVANTGSMISAEHIRNIFDRFYK IDMHHAGSGIGLALVKAFVELHKGTITVESDEKQGTIFTVDLPVQTCETVVSENSPAFSV PATSVTSTDAATSAVAGAPVTPATSGYSGSSSLNDALTYEEEELEKSYDSSKPCVLIIDD NADIRLYVHGLLHTDYTVIEAADGSEGIRKAMKYVPDLIISDVMMPGIDGIECCRRLKSE LQTCHIPVILLTACSLDEQRIQGYDGGADSYISKPFSSQLLVARVRNLIDSHRRLKQFFG DGQTLAKEDVCDMDKDFVEKFKALIEAKMGDSNLNVEDLGKDMGLSRVQLYRKIKSLTNY SPNELLRIARLKKAASLLASSDMTVAEIGYEVGFSSPSYFTKCYREQFGESPTDLLKRKG >gi|225935346|gb|ACGA01000046.1| GENE 65 109579 - 109917 235 112 aa, chain + ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 13 106 4 97 98 121 57.0 3e-28 MKDYKVDKASLSASFCQEVYQVVREIPVGNVSTYGGIAALLGMPQCSRMVGRALKQIPDD 
LSAPCHRVVNASGRLVPGWAEQKQLLLEEGVSFKQNGCVDLKKHLWNYSTSE >gi|225935346|gb|ACGA01000046.1| GENE 66 110216 - 111442 1276 408 aa, chain + ## HITS:1 COG:lin1013 KEGG:ns NR:ns ## COG: lin1013 COG4175 # Protein_GI_number: 16800082 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, ATPase component # Organism: Listeria innocua # 1 394 1 392 397 384 52.0 1e-106 MSKIEIKDLYLIFGHEKQKALKMLKKDKSKAEILKDTGCTVGVKDANLSINEGEFFVIMG LSGSGKSTLLRCINRLIRPTAGQVLVNGVDISKISEKELLQVRRKELAMVFQNFGLLPHR SVLSNIAFGLELQGVKKEEREKKAMESMKLVGLKGYENQMVGELSGGMQQRVGLARALAN NPEVLLMDEAFSALDPLIRVQMQDELLALQSKMKKTIVFITHDLSEAIKLGDRIAIMKDG EVVQVGTSEEILTEPANDYVARFVENVDRSKIITASSLMIDKPLVARLKKEGPEVLIRKM RAKNITVLPVIDADDKLVGEVHLNDLLKLRSKQEKSIEAVVRKEVHSVLCDTVLEDILPL MTKSNSPVWVIDETHEFLGTIPLSSLIIEVTGKDKEEINEIIQNAIDL >gi|225935346|gb|ACGA01000046.1| GENE 67 111439 - 112263 771 274 aa, chain + ## HITS:1 COG:YPO2646 KEGG:ns NR:ns ## COG: YPO2646 COG4176 # Protein_GI_number: 16122855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, permease component # Organism: Yersinia pestis # 2 268 94 360 388 258 51.0 9e-69 MINIGQYIEIAINWMMVHFSTFFDAVNAGIGSFITGFQHILFGIPFYLTILALAAIAWVK AGRGISIFTVLGLLLIYGMGFWEATMQTLALVLSSTCLALIFGVPLGIWTANSPRADKIL RPILDLMQTMPAFVYLIPAVLFFGLGTVPGVFATIIFAMPPVVRLTGLGIRQVPKNVVEA SRSFGATRWQLLYKVQLPLALPTILTGVNQTIMMSLSMVVIAAMIAAGGLGEIVLKGITQ MKIGLGFEGGIAVVILAIILDRITQGMAGRKNKN >gi|225935346|gb|ACGA01000046.1| GENE 68 112281 - 113153 823 290 aa, chain + ## HITS:1 COG:MA2147 KEGG:ns NR:ns ## COG: MA2147 COG2113 # Protein_GI_number: 20090990 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, periplasmic components # Organism: Methanosarcina acetivorans str.C2A # 32 290 58 315 315 188 38.0 9e-48 MHVRIYKIVGIVLSAMLLLASCAHSDLEKKKIKIAYANWLEGIAMSHLAKVVLEEHGYEV ELQNADLAPIFVSMARKKSDVFLDAWLPITMKDYMDQYGDSIEFLGEVYEEARIGLVVPE 
YVTIQSISELADHKDRFSSEIVGIDAGAGIMKTTDKAITAYGLDGYALMTSSSSTMLASL KKAMDKGEWVVITGWTPHWMFDQFDLKFLDDPKKVYGDLEEIHAIAWKGFSEKDPFAAEF FGNIKLTTEELSSFMTAMKDARMDEEEIARKWRDEHRQLVDSWIPKSANK >gi|225935346|gb|ACGA01000046.1| GENE 69 113267 - 114469 938 400 aa, chain - ## HITS:1 COG:no KEGG:BF3314 NR:ns ## KEGG: BF3314 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 19 389 20 421 430 97 25.0 1e-18 MKLKTAIVAVIAVAAGLTSCEKNLYDPSQDKTEVKIENLKIGDDFDWLTTRNAACKVDAD QPVNVSIYQDEACKVILTHTSVIPGNNLQLPLSVIKNSQKLYLKVDGESQVYPVTLNEDG SFGCSLSTSRTTRAATRAGEDFEDHGTTIFYPNGSWGTVMFEDNYPQIGDYDFNDFVASY MVEIQTYPGTTFIKSIQFYMAIHAIGASYSYIPHLRIQSFENSTVKSMEVNKKGMSNLNV DRFADGKVNDDNTGKIVLAFNGAENKKGQEFLNTEKGSQYIPGDKFDITLTFEEGKADIA FIYADKFDIFLADKNRTKEIHMLGYGPAFASGKDYGNTNYYKQAGTNLVWGINVPSIIGH AYEKSNFLDAYPNFAKWVEGDMNANEWYKNQNSEYIFQMK >gi|225935346|gb|ACGA01000046.1| GENE 70 114718 - 118269 3556 1183 aa, chain - ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 5 405 3 403 410 590 72.0 1e-168 MTKQKKFITCDGNQAAAHISYMFSEVAAIYPITPSSTMAEYVDEWAAAGRKNIFGETVLV QEMQSEGGAAGAVHGSLQAGALTTTYTASQGLLLMIPNMYKIAGEFLPCVFHVSARTLAS HALCIFGDHQDVMSARQTGFAMLAEGSVQEVMDLAGVAHLATIKSRVPFMNFFDGFRTSH EIQKIEMLENDDLAPLIDQEALAEFRARALNPMNPVARGMAENPDHFFQHRESCNNYYEA VPAIVEEYMNEISKITGRKYGLFDYYGAEDAERVIIAMGSVTEAAREAIDHLVANGEKVG MVAVHLYRPFSAKHFLAAVPKTAKTIAVLDRTKEPGANGEPLYLDVKDCFYGAENAPVIV GGRYGLGSKDTTPAQIISVFENLAMPMPKNHFTIGIVDDVTFTSLPQREEIALGGEGMFE AKFYGLGADGTVGANKNSVKIIGDNTDKHCQAYFSYDSKKSGGFTCSHLRFGDTPIRSTY LVNTPNFVACHVQAYLHMYDVTRGLRKNGSFLLNTIWEGEELAKNLPNKVKKYFAQNNIS VYYINATQIAMEIGLGNRTNTILQSAFFRITGVIPVEQAVEQMKKFIVKSYGKKGEDVVN KNYAAVDRGGEYKTLAVDPAWANLPDDAKAENNDPAFINEVVRPINAQDGDLLPVSAFKG IEDGTWYQGTSKYEKRGVAAFVPEWNPENCIQCNKCAYVCPHASIRPFVLDAEEQKGAKF 
EQLKAVGKVFDGMTFRIQVDVLDCLGCGNCADICPGNPKKGGKALTMKHLESQLAEADNW TYCAENVKTKQHLVDIKSNVKNSQFATPLFEFSGACSGCGETPYVKLISQLFGDREMVAN ATGCSSIYSGSVPSTPYTTNENGHGPAWANSLFEDFCEFGLGMELANEKMRARIVKLFNQ ILEGDNAPAEAKEVLKAWIENMHDADKTKELAPQIEAIIEQGIAAGCPVSKELKGLTQYL VKRSQWIIGGDGASYDIGYGGLDHVIASGKDVNILVLDTEVYSNTGGQSSKATPVGAIAK FAAAGKRVRKKDLGLMATTYGYVYVAQIAMGADQAQTLKAIREAEAYPGPSLIIAYAPCI NHGLKAGMGKSQEEEEKAVKCGYWHLWRYNPALEEEGKNPFQLDSKEPNWEEFQGFLKGE VRYASVMKQYPTEAEELFKAAEENAKWRYNSYKRLARENWGAE >gi|225935346|gb|ACGA01000046.1| GENE 71 118302 - 119480 851 392 aa, chain - ## HITS:1 COG:MJ1637 KEGG:ns NR:ns ## COG: MJ1637 COG1373 # Protein_GI_number: 15669833 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanococcus jannaschii # 33 290 73 342 473 82 28.0 1e-15 MESFYRTHAYLVEHTNAPVRRDLMDEIDWSDRLIGIKGTRGVGKTTFLLQYAKEKFGNDR SCLFINMNNFYFSGHSIVDFANEFQKRGGKVLLIDQVFKHPEWSRELRMCYDRFPNLKIV FTGSSVMRLKEENLELRDIAKSYNLRGFSFREYLNLQTGMKFRAYSLEEILSTHEQIAKG VLSKVRPLDYFQDYLHHGFYPFFLEKRNFSENLLKTMNMMVEVDILLIKQIELKYLSKIK KLLYLLAVDGPKAPNVSQLASDIQTSRATVMNYIKYLADARLINLVYPEGEEFPKKPSKI MMHNSNLMYSIYPVKVEEQDVLDTFFANSLWKDHKVHKGDKNVSFMVDEVMPFKICQEGA KIKNNPNVTYALHKAEIGRGNQIPLWMFGFLY >gi|225935346|gb|ACGA01000046.1| GENE 72 119643 - 120842 614 399 aa, chain + ## HITS:1 COG:no KEGG:BT_1745 NR:ns ## KEGG: BT_1745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 394 1 395 398 517 62.0 1e-145 MTKKALYIFNPEHDLALASGETNYMAPASARQMASELALLPMWYAEEGSVVLAPSAYNLD YVKKIQELLGLSVDLMTEPELASERNLEIRPWGWDIAIRKRLLDLGVEESELPSIEQLDK LRFCSHRSCAVELLPQLQLGACFCGESFYLTFPEEWKSFVESHDTCLLKAPLSGSGKGLN WCKGVYTPSISGWCSRIGNQQGGVIGEPLYNKVEDFAMEFRSSGNGRFGFAGYSQFRTGG SGAYEGNLLISDAAIERNLSEYIPVEEIYKLRDRLEQELSLRFGTIYNGYLGVDMMICRF PESPVYRIHPCVEINLRMNMGVVARHIYDHYIYPTSTGAFQISYYPTEGTAWRAHKEMEE AYPLEIEQGRIKSGYLSLVPAHKKSSYRAWVFISKSVFL >gi|225935346|gb|ACGA01000046.1| GENE 73 121075 - 122526 1172 483 aa, chain - ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: 
BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 483 20 507 510 236 31.0 6e-62 MRRYIFILSLLFPFLSMVAQPKHEVRAAWVTAVYGLDWPRTRATTPQTIRKQKEELIDIL DKLKAANFNTVLFQTRTRGDVLYPSAIEPFNSILTGKTGGNPGYDPLAFAVEECHKRGME CHAWMVTIPLGNKKHVASLGSQSVTKRMKDICVPYKSEYFLNPGHPATKEYLMKLVREVV SGYDVDGVHFDYLRYPENAPLFPDKYDFRRYNKGRTLDQWRRDNISEIVRYIYKDVKAMK PWVKVSTCPVGKYRDTSRYPSRGWNAFFTVYQDPQGWIGEGIMDQIYPMMYFQGNNFYPF ALDWQEQSNGRQVVPGLGIYFLHPDEGKWTRDEIDRQMNFIRSQKMAGEGHYRVKYLMEN TQGIYDELAENFYAYPALQPPMPWLDNVPPTAPSELKVTDINNGYTELKWQAATDNDSRN NPMYVIYASNEFPVDTNRPENIVAQGVRETSYIYAPILPWNAKKHFAVTAIDRCGNESAA VQK >gi|225935346|gb|ACGA01000046.1| GENE 74 122635 - 124065 1106 476 aa, chain + ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 448 1 447 479 412 51.0 1e-115 MITFTLCLLALIVGYFTYGRLMERVFGPDDRKTPALTKADGVDYIPLPTWKIFMIQFLNI AGLGPIFGAIMGAKFGSSSYLWIVLGSIFAGAVHDYFAGMLSLRNGGESLPEIIGRYLGL TTKQVMRGFTVILMILVGSVFVAGPAGLLAKLTPESLDATFWIVVVFLYYILATLLPVDK IIGKIYPIFAVALLFMAVGILVMLYVNHPALPELWDGLQNTNPEASELPIFPIMFVSIAC GAISGFHATQSPLMARCMTSERHGRPVFYGAMITEGIVALIWAAAATYFFHENGMEESNA SVIVDSITKEWLGAVGGVLAILGVIAAPITSGDTAFRSARLIVADFLGMEQKSMRRRLYI CIPMFVLAIGLLLYSLRDANGFNMIWRYFAWANQTLAVFTLWAITVFLAVSKKTYIITLI PALFMTCVCSTYICIAPEGLGLSHAVSYGVGITCVVVAVIWFYIWMSKQKTRKLSE >gi|225935346|gb|ACGA01000046.1| GENE 75 124062 - 124292 240 76 aa, chain + ## HITS:1 COG:no KEGG:BT_1741 NR:ns ## KEGG: BT_1741 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 1 76 76 95 94.0 5e-19 MKKIRKSTGVAIAFLIYVSVTAAYLLPRNTEVSQTEKILTVAGSYVIVFLLWLVLRKKEQ MRERRKKDEQSIHLKK >gi|225935346|gb|ACGA01000046.1| GENE 76 124297 - 124782 665 161 aa, chain + ## HITS:1 COG:no KEGG:BT_1740 NR:ns ## KEGG: 
BT_1740 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 274 85.0 9e-73 MKKLALALCLLAVSFTAQAQFEKGTTIINPSLSGLDFSYSKNDKAKFGVGAQVGTFFAEG IALMVNAGADWSKPIDEYTLGTGVRFYFNKTGIYLGGGLDWNRFRWSGGKHQTDWGLGIE GGYAFFLSRTVTIEPAVYYKWRFNDGDMSRFGVKIGFGFYL >gi|225935346|gb|ACGA01000046.1| GENE 77 124901 - 127672 2748 923 aa, chain - ## HITS:1 COG:MTH443 KEGG:ns NR:ns ## COG: MTH443 COG0178 # Protein_GI_number: 15678471 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Methanothermobacter thermautotrophicus # 7 923 11 948 962 838 47.0 0 MSENNYISIKGARVNNLKNIDVDIPRNKLVVITGLSGSGKSSLAFDTLYAEGQRRYVESL SSYARQFLGRMSKPECDFIKGIPPAIAIEQKVNSRNPRSTVGTSTEIYEYLRLLYSRVGK TYSPISGQEVKKHSTEDIVNCMLSYPEGTRYTVLTPIRLREDRTLQQQLEIDLKQGFNRI EVNGEMKRIDEYTPVAGDEVYLLVDRMAVANSKDAISRLTDSAETAMYEGDGTCMLRFYL SDGTTKLHTFSTKFEADGIIFEEPNDQMFSFNSPIGACPVCEGFGKVIGIDEHLVVPDRS LSVYEGAIVCWRGEKMGEWKEELIHNADKFDFPIFTPYYELTDAQRRLLWEGNQYFHGIN DFFKMLEENQYKIQYRVMLARYRGKTLCPKCHGTRLKPEAGYVRVGGKNISELVDLPITE LKEFFDHLELDEHDSNVSRRILVEINSRIRFLIDVGLGYLTLNRLSNSLSGGESQRINLA TSLGSSLVGSLYILDEPSIGLHSRDTDRLLHVLRQLQQLGNTVVVVEHDEEIIRAADYII DIGPNAGRLGGEVVYQGDMKDLKKGSNSYTVRYLLGEDEIPVPEHRRPWNNYIELKGARE NNLKGVNVRIPLNVMTVVTGVSGSGKSTLVRDIFFRALKRELDECSDRPGEFSSIGGSLR DLRNVEFVDQNPIGKSSRSNPVTYIKAYDEIRKLWSEQPLAKQMGYTPGFFSFNSEGGRC EECKGEGTITVEMQFMADLVLECESCHGKRFKSDTLEVKFNDKSIHDVLEMTVNQAVEFF NEHGQKKIVKKLLPLQDVGLEYIKLGQSSSTLSGGENQRVKLAFYLSQEKADPTMFIFDE PTTGLHFHDIRKLLDAFDALIRRGHSIVIIEHNMDVIKCADYVIDLGPEGGDKGGNIVAV GTPEEVAACGASYTGQFLKEKLG >gi|225935346|gb|ACGA01000046.1| GENE 78 127767 - 129275 1256 502 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 4 501 6 510 510 371 39.0 1e-102 MKLKNYLLLLALLLVVGVRAQVPSGNKYPKREFRGAWIQAVNGQFRGIPTERLKQILVSQ 
LNSLQEAGINAIIFQVRPEADALYASQHEPWSRFLTGTQGQTPSPMWDPMQFMIEECQKR NMEFHAWINPYRVKTSLKNKLAPEHIYHQHPEWFVTYGDQLYFDPALPESREHICKIVTD IVSRYDVDAIHMDDYFYPYPVNGLDFPDDASFARYGGGFTNKADWRRSNVNVLIKKLHET IRGIKPWVKFGISPFGIYRNQKSDPLGSNTNGLQNYDDLYADVLLWAREGWIDYNIPQIY WEIGHKAADYETLVKWWATHSENRPLFIGQSVPKTVQFADPQNPSINQLPRKMALQRAYQ TIGGSCQWYAAAVVENQGRYRDALISEYHKYPALIPVFDFMDDKAPGKVRKMKKVWTEDG YILFWTAPKADTEMDKAVQYVVYRFDSKEKVNLDDPSHIVAITRNPFYKLPYETGKTKCR YVVTALDRLHNESKSVSKKLKL >gi|225935346|gb|ACGA01000046.1| GENE 79 129383 - 129931 441 182 aa, chain - ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 182 1 173 176 120 47.0 1e-27 MIYLQLFYTFFKIGLFGFGGGYAMLSMIQGEVVTRYGWVSSQEFTDIVAISQMTPGPIGI NAATYVGFTSTGSVWGSIIATFAVVLPSFILMLTISKFFLRYQKHPVVESIFNGLRPAVV GLLASAALVLMNVENFGSPTEDTYSFVISIIIFLIAFIGTRKYKANPILMIIGCGIAGLL LY >gi|225935346|gb|ACGA01000046.1| GENE 80 130016 - 130558 380 180 aa, chain - ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 2 179 4 181 186 157 50.0 9e-39 MNIYLEAFGIFFKIGAFTIGGGYAMVPLIENEIVTKRKWIAQEDFIDLLAISQSAPGILA VNISIFIGYKLRGIRGSIITALGTVLPSFIIILAIALFFHSFQDNPIVERIFKGIRPAVV ALIAAPTFTMGRSAKINRYNLWIPVVSALLIWLLGFSPIWIIIAAGVGGFLWGKLRKVKS >gi|225935346|gb|ACGA01000046.1| GENE 81 130555 - 134559 2970 1334 aa, chain - ## HITS:1 COG:BS_yycG KEGG:ns NR:ns ## COG: BS_yycG COG5002 # Protein_GI_number: 16081092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 797 1025 369 599 611 117 31.0 2e-25 MKKWLFLILLFPLTCVAQTYRYLGVEDGLSNRRVYCIQKDKTGYMWFLTHEGIDRYNGKE FKRYKLMDGDVEVNSLLNLNWLYIDQEGVLWEIGKKGKVFRYDQIHDCFSLVYKLPMESF 
SKQPDPVTYAWLDQNKHIWLCNKDTIFLYNTETQQVTHIKNDISEDITDIEQIDESHFFI GTEMGIHYAKLENNALELINCDKLESVKAQVIDLHFDKKIRKLFIGTFLRGIMIYDMNTK SVIRPEQNLKDISITRFKPLNDKELLIATDGGGVHKMNVDTYQIVPDIVADYNSNNGMNG NSINDIFVDDEERIWLANYPIGITIQNNRYTSYKWIKHSIGNKQSLINDQVNSIIEDRDG DLWFATNNGISFLNSKTGQWRSVLSAFEESQGNKSHIFLTICEVAPGTIWAGGYSSGAYQ IDKKTFNVSYFMPPLYTHTNKRPDKYIRDIRTDMQGYIWSGEFYNLKRINLKTQEVRFYD GLNSITAIVEKDEKSMWIGSATGLCLLDKESGKFERIKLPVESSYIYSLHQAKNGSLYIG TSGSGLLIYDINKKLFTHYHTENCALISNNIYTILSDADRDILMSTESGLTSFYPNEKKF YNWTKDMGLMTTHFNALSGVLRKNNKFILGSSDGAVEFDKDMKLPRGYSSKMIFSDFKLF YQTIYPGDENSPLKASINDTKVLKLKYNQNIFSLQVSSINYDYPSNILYSWRLEGFYDEW SKPGTENTIRYTNLAPGTYTLRVRAISNEDKRIMLEERSMDIVIAQPFWLTFWAVLVYTA FLCLIAIVLLRILILRKQRKVSDEKIHFFINTAHDIRTPLTLIKAPLEELREKEELSKEG ISNMNTALRNVNALLRLTTNLINFERADVYSSELYISEHELNTFMNEIFNAFQQYANIKH INFTYESNFRYMNVWFDKEKMESIFKNIISNALKYTPENGNVQVFVSESTDSWSVEVRDT GIGIPANEQKKLFKLHFRGSNAINSKVTGSGIGLMLVWKLVRLHKGKINLSSIENQGSVI KITFPKDSKRFRKAHLATPSKQRIEIENVPSSSPEIYENAQKKENINHRRILIVEDNDEL RNYLSQTLSEEYVVQVCSNGKEALTIIPEYKPELVISDIMMPEMRGDELCQAIKNNIETS HIPVILLTALNNEKDILSGLQIGADEYVVKPFNIGILKANVANLLANRALLRSKYANLDL NDEENDEDCINCSQDIDWKFIANVKKNVEDNIDNPALTVDVLCSLMGMSRTSFYNKLRAL TDQAPGDYIRLIRLKRAVQLLKEDTHSITEIAEMTGFSDAKYFREVFKKHFNVSPSQYGK EKKTASKEGGEKKE >gi|225935346|gb|ACGA01000046.1| GENE 82 134828 - 138532 3898 1234 aa, chain - ## HITS:1 COG:HI0752_1 KEGG:ns NR:ns ## COG: HI0752_1 COG0046 # Protein_GI_number: 16272693 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Haemophilus influenzae # 15 890 65 970 1011 507 38.0 1e-143 MILFFRTPSKSVIAVESNHQLTPDESNKLCWLFGEAVMESEENLKGCFVGPRREMITPWS TNAVEITQNMGLEGISRIEEYFPVKDENADYDPMLQRMYKGLDQNVFTTNRQPEPIIYIE DLEVYNEKEGLALSKEEMDYLKKVEKDLGRKLTDSEVFGFAQINSEHCRHKIFGGTFIID GVEQESSLFQMIKKTTQENPNKIISAYKDNVAFAEGPVVEQFAPADHSKPDFFQVKDIKS VISLKAETHNFPTTVEPFNGASTGTGGEIRDRMGGGKGSWPIAGTAVYMTSYPRTEEGRE WEEILPVRKWLYQTPEQILIKASNGASDFGNKFGQPLICGSVLTFEHTENKEVYGYDKVI 
MLAGGVGYGTQRDCLKGTPEAGNKVVVIGGDNYRIGLGGGSVSSVDTGRYSSGIELNAVQ RANAEMQKRANNVVRALCEEEVNPVVSIHDHGSAGHVNCLSELVEECGGVIDMSKLPIGD KTLSAKEIIANESQERMGLLIKEEAIEHVRKIAERERAPMYVVGETTGDHRFAFQQADGV RPFDLAVEQMFGSSPKTYMIDKTVERHYEMPKYELSKLHEYLTNVLQLEAVACKDWLTNK VDRSVTGKVARQQCQGELQLPLSDCGVVALDYRGEKGIATSIGHAPQAALADPAAGSILS VSEALTNLVWAPMAEGMDSISLSANWMWPCRSQEGEDARLYTAVKALSDFCCALQINVPT GKDSLSMTQKYPNGEKVISPGTVIVSAGGEVSDVKKVVSPVLVNNEKTTLYHIDFSFDEL KLGGSAFAQSLGKVGDEVPCVQDAEYFRDAFLAVQELINKGLILAGHDISAGGLITTLLE MCFSNVEGGMEISLDKMKEQDIVKILFAENPGIVIQISDKHKDEVKKILEDAGVGYMKLG KPTDERHILVSKDGATYQFGIDYMRDVWYSSSYLLDRKQSMNGCAKARFENYKMQPLEFA FMPEFKGKLSQYGITPDRRTPSGIRAAIIREKGTNGEREMAYSLYLAGFDVKDVTMTDLI SGRETLEDVNMIVYCGGFSNSDVLGSAKGWAGAFLFNPKAKEALDKYYAREDTLSLGVCN GCQLMMELDLINPELKKKGKMLHNNSHKFESRFLGLTIPTNRSVMFGSLSGSKLGIWVAH GEGKFSLPYDEDKYNVVAKYSYDEYPGNPNGSDYSIAALASADGRHLAIMPHLERSIFPW QNGCYPADRKNSDQVTPWIEAFVNARKWVEAKMK >gi|225935346|gb|ACGA01000046.1| GENE 83 138692 - 139444 533 250 aa, chain + ## HITS:1 COG:BS_ycgF KEGG:ns NR:ns ## COG: BS_ycgF COG1280 # Protein_GI_number: 16077378 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Bacillus subtilis # 42 201 2 161 209 60 29.0 3e-09 MYVLLIIAIYSDSCVEKTLIYYPHITKKCDICDMIQIETIFDILVKGFIIGVVVSAPLGP VGVLCIQRTLNKGRWYGFVTGLGASLSDIAYALLTGYGMSFVFDYINKNIFYLQLLGSIM LLLFGIYTFRSNPVQSIRPASSSKGSYFHNFITAFFVTLSNPLIIFLFIGLFARFAFVQP GVLVFEEITGYVAIAIGALTWWLGITYFVNKVRTKFNLRGIWILNRVVGSIVMLVSLAGL IFTLLGESLY >gi|225935346|gb|ACGA01000046.1| GENE 84 139455 - 139997 680 180 aa, chain + ## HITS:1 COG:no KEGG:BT_1731 NR:ns ## KEGG: BT_1731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 1 181 181 283 89.0 2e-75 MNQIAQQLKEKNIAEYLIYMWQEEDLIRANHCEPEEMEANIITRYPADQQPAMREWYTNL ITMMSEEGVREKGHLQINKNVIINLTELHNALASSPKFPFYSAAYFKALPFIVELRNKNG KKEPELETCFEALYGVLLLRLQKKPISEGTAKAVEAITSFLSMLANYYDKDRKGELKLDE >gi|225935346|gb|ACGA01000046.1| GENE 85 139998 - 140858 684 286 aa, chain + ## 
HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 284 1 280 280 244 48.0 9e-65 MRILVTGANGQLGNEMQVLAKENPQHTYYFTDVEELNICDKQAVWAYIAEKRIELVVNCA AYTAVDKAEDNSELAYQLNCEAPKQLASAAQFNGAAMIQVSTDYVFDGTAHTPYTEDCDP CPDSVYGTTKLEGEYDVMNYCEKAVVIRTAWLYSIFGNNFVKTMIRLGKERDSLGVVFDQ IGTPTYANDLARAIYTIINKGIVRGIYHFSNEGVCSWYDFTVAIHRLAGITSCKVKPLHT AEYPAKANRPAYSVLDKTKIKTTFGIEIPHWEESLQRCLEKLEIKN >gi|225935346|gb|ACGA01000046.1| GENE 86 140935 - 142509 1449 524 aa, chain + ## HITS:1 COG:XF0174 KEGG:ns NR:ns ## COG: XF0174 COG4108 # Protein_GI_number: 15836779 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Xylella fastidiosa 9a5c # 6 523 21 540 548 532 50.0 1e-151 MADNTEILRRRTFAIIAHPDAGKTSLTEKLLLFGGQIQVAGAVKSNKIKKTATSDWMEIE KQRGISVTTSVMEFDYRDYKINILDTPGHQDFAEDTYRTLTAVDSVIIVVDGAKGVETQT RKLMEVCRMRKTPVIIFVNKMDREGKDPFDLLDELEEELMIQVRPLSWPIEQGARFKGVY NIYEQKLDLYQPSKQMVTEKVAVDIHTEELDQQIGKPLADKLRGDLELIEGVYPEFDSES YLAGDCAPVFFGSALNNFGVQELLNCFVEIAPSPRPVQAEEREVNPDEPKFTGFIFKITA NIDPNHRSCVAFCKICSGKFVRNAPYVHVRHGKTMRFSSPTQFMAQRKTTIDEAYAGDII GLPDNGTFKIGDTLTEGEMLHFRGLPSFSPEMFKYIENADPMKQKQLAKGIDQLMDEGVA QLFVNQFNGRKIIGTVGQLQFEVIQYRLLNEYNASCRWEPVSLYKACWVESDDPAELEAF KKRKYQYMAKDREGRDVFLADSGYVLQMAQMDFKHIKFHFTSEF >gi|225935346|gb|ACGA01000046.1| GENE 87 142588 - 143178 511 196 aa, chain + ## HITS:1 COG:no KEGG:BT_1728 NR:ns ## KEGG: BT_1728 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 286 77.0 3e-76 MKTLSWDKIQQGDEEAFRQLYEQYADLLYGYGMKIAGDETLVTEAIQSLFVYIFEKRETC AAPQSIPAYLCVSLRHMIVNELKKENSGSLKSLDEVGTNEYQFDLEIDIETAIIRSELEK EQLEVLQKELNNLTKQQREVLYLKYYKKMSPEEIAQVMGLTSRTVYNTTHMAISSLRERM SKSFLLLVAANLWIFN >gi|225935346|gb|ACGA01000046.1| GENE 88 143189 - 144028 751 279 aa, chain + ## HITS:1 
COG:no KEGG:BT_1727 NR:ns ## KEGG: BT_1727 # Name: not_defined # Def: putative transmembrane sensor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 283 283 326 60.0 6e-88 MEKWTESLDKYLEEGKLPLVGEIFSRKKEIGDKLKLQREESHKIALSRRRKQGAGRSFSF SWGVAAAVVLLLGIGGYLLAEEKVVTDNTAMNYELPDGSTVQVMENSRLTYNHITWLWER KLQLLGKASFNVTKGKTFTVRTEAGDVTVLGTKFLIDQQGKKMTVNCEEGSVKVETAVGK HTLLAGESVHCDENKIVPVEKKAEESEFPEVLGYEDDPLINVVADIEHIFKVTVVGHEKC EGLTYNGTVLTKDLNATLEKVFGSCGISYQIRGKEIILQ >gi|225935346|gb|ACGA01000046.1| GENE 89 144873 - 145133 348 86 aa, chain + ## HITS:1 COG:no KEGG:BT_1698 NR:ns ## KEGG: BT_1698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 86 111 91.0 9e-24 MENIETAILLMVVGMATVFVILLIVIYLGKLLIALVNKYAPEEVVPVKREAPRGPAPVPG NILAAITAAVNVVTQGKGKITKVEKL >gi|225935346|gb|ACGA01000046.1| GENE 90 145160 - 146995 1994 611 aa, chain + ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 9 515 9 477 480 183 29.0 8e-46 MKREVKFSLVFRDMWQSAGKYVPRVDQLVKVAPAIIEMGCFARVETNGGGFEQVNLLFGE NPNTAVREWTKPFHEAGIQTHMLDRALNGLRMSPVPADVRKLFYKVKKAQGTDITRTFCG LNDVRNIAPSITYAKEAGMISQCSLCITHSPIHTVEYYTNMALELIKLGADEICIKDMAG IGRPVSLGKIVANIKAAHPEIPVQYHSHAGPGFNMASILEVCQAGCDYIDVGMEPLSWGT GHADLLSVQAMLKDAGYQVPEINMEAYMKVRGMIQEFMDDFLGLYISPKNRLMNSLLIAP GLPGGMMGSLMADLESNLESINKYKAKHNLPFMTQDQLLIKLFDEVAYVWPRVGYPPLVT PFSQYVKNLAMMNVMAMEKGKDRWGMIADDIWDMILGKAGRLPGKLAPEIIEKAEREGRK FFEGDPQDNYPDALDKYRKLMKENKWEAGQDDEELFEYAMHPAQYEAYKSGKAKEDFLED VAKRRAEKDKSPEEDVKPKTLTVQVDGQAYRVTVAYGDAELPATPAGASAAPAGEGKEVL SPLEGKFFLVKNAQETALQVGDTVKEGDVICYVEAMKTYNAIRAEFGGTVTAICANPGDT VSEDDVLMKIG >gi|225935346|gb|ACGA01000046.1| GENE 91 146995 - 148230 1260 411 aa, chain + ## HITS:1 COG:AF2084 KEGG:ns NR:ns ## COG: AF2084 COG1883 # Protein_GI_number: 11499666 # Func_class: C Energy production and conversion # 
Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Archaeoglobus fulgidus # 24 411 5 354 354 292 47.0 9e-79 MNEIFENLYDMTAFSNIIADPQFLIMYAIAFVLLYLGIKKQYEPLLLVPIAFGVLLANFP GGEMGVIQADENGMIMINGVMKNIWEMPLHDIAHELGLMNFIYYMLIKTGFLPPIIFMGV GALTDFGPMLRNLRLSIFGAAAQLGIFTVLLVAILMGFTPKEAASLGIIGGADGPTAIFT TIKLAPHLLGPIAIAAYSYMALVPVIIPLVVRLLCTKKELSINMKEQEKMYPSKTEIKNL RVLKIIFPIVVTTVVALFVPSAVPLIGMLMFGNLVKEIGTNTFRLFDAASNSIMNAATIF LGLSVGATMTTEAFLNWTTIGIVVGGFLAFALSITGGILFVKLVNLFSKKKINPLIGATG LSAVPMASRVANEIALKYDPKNHVLQYCMASNISGVIGSAVAAGVLISFLA >gi|225935346|gb|ACGA01000046.1| GENE 92 148338 - 149609 1307 423 aa, chain - ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 99 421 87 412 413 60 21.0 4e-09 MKTQLITLSLLFLGLTAGAQQPYLSREAYRDKVEAYSQILKQQKLKTMASTEARKIAHTG FLPKIDANADGTLNMSDLSAWNEPLGEYRNHTYQGVFIVSQPLYTGGALNAQHKIAKADE KLNQLNEELTIDQIHYQSDAVYWNASASQAMLQAADKYQSIVKQQYDIIQDRFSDGMISR TDLLMISTRLKEAELQYIKARQNYTLALQKLNILMGEEPNSPVDSLYTIDKASAPVQILS LENVLQRRADFESTEVNIMKSEAQRKAALSQFNPQLNMYFSGGWATATPNLGYDVSFNPV VGININIPIFRWGARFKTNRQQKAYIGIQKLQQSYVTDNINEELSAALTKLTETEYQVKT ATETMNLANENLDLVSFSYNEGKANMVDVLSAQLSWTQAHTNLINAYLSEKMAVAEYRKV ISE >gi|225935346|gb|ACGA01000046.1| GENE 93 149658 - 152690 2694 1010 aa, chain - ## HITS:1 COG:VC1757 KEGG:ns NR:ns ## COG: VC1757 COG0841 # Protein_GI_number: 15641761 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1009 1 1012 1016 595 33.0 1e-169 MSLARYSLDNTKIIYFFLAVLLIGGITSFGKLGKKEDAPFVIKSAVIMTRYPGAEPAEVE RLITEPISREIQSMSGVYKIKSESMYGLSKITFELQPSLSASSIPQKWDELRRKVLNIQP QLPSGASAPTVSDDFGDVFGIYYGLTADDGYTYEEMRNWAERIKTQVVTADGVMKVALFG TQTEVVNILISTHKLVGMGIDPKQLASLLQSQNQIINTGEIRAGEQQLRVTANGMYTTVD 
DIRNQVITTKAGQVKLGDIAVIEKGYMDPPSNIMHVNGKRAIGIGVSTDPQRDVVQTGKN VKAKLDELLPLMPVGLELQSLYLENEIANEANNGFIINLIESILIVIVIIMLVMGLRAGM LIGSSLIFSIGGTLLIMSFFGVGLNRTSLAGFIIAMGMLVDNAIVVTDNAQIAIARGVDR RKALIDGATGPQWGLLGATFIAICSFLPLYLAPSAVAEIVKPLFVVLAISLGLSWVLALT QTTVFGNFILKAKAKDGAKDPYDKPFYHKFASILRTLIRRKTLTLGSMVVLFVASLIIMG TMPQNFFPSLDKPYFRADVFYPDGYSINDVVKEMKSVEEHLAKQPEVKKVSITFGSTPLR YYLASTSVGPKPNFANVLIELTDSKYTKEYEEDFDAYMKANYPNAITRTSLFKLSPAVDA AIEIGFIGPDVDTLVALTNQALEIMHRNPDLINVRNSWGNKIPVWKPIYSPERAQPLGVS RQGMAQSIQIGTTGMTLGEYRQGDQVLPILLKDNTVDSFRINDLRTLPVFGTGNETTSLE QVVSEFNFQYRFSNVKDYNRQMVMMAQCDPRRGVNAIAAFNEVWPLVQKEIKVPEGYTMK YFGEQESQVESNEALAKNLPLTFFLMFVTLLFLFRTYRKPTVILLMLPLIFIGIVLGLVL LGKSFDFFSILGLLGLIGMNIKNAIVLVEQIDLEAKTGKKPLDAVVSATTSRIIPVAMAS GTTILGMLPLLFDAMFGGMAATIMGGLLVASALTLFVLPVAYCAIQRIKG >gi|225935346|gb|ACGA01000046.1| GENE 94 152751 - 153803 788 350 aa, chain - ## HITS:1 COG:VC1756 KEGG:ns NR:ns ## COG: VC1756 COG0845 # Protein_GI_number: 15641760 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 7 345 19 354 364 155 31.0 1e-37 MKKIGFVLTAAILLAGCGQKKEATMTTARPVKTTIVESRSIISKDFSGIVEAVEYVKLAF RVNGQIIQLPVIEGQKVKKGQLIAAIDPRDIALQYAATKSAYETASAQVERNKRLLSRQA ISVQEYEISLANYQKAKSEYELSANNMRDTKLTAPFDGSIEKRLVENYQRVNSGEGIVQL VNTHNLRIKFTIPDAYLYLLRAKDPRFLVEFDTFKGHVFQAKLEEYLDISTDGTGIPVSI TIDDPSFDRDLYAVKPGFTCGIRFTADVGPLVQNSWTIIPLSAVFGESEGNKMYVWVVED NKVHKRQVTVSTPTGEAQTLISEGLKPGEKIVIAGVHQLVEGESITTVDK >gi|225935346|gb|ACGA01000046.1| GENE 95 154518 - 154925 322 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173444|ref|ZP_05759856.1| ## NR: gi|260173444|ref|ZP_05759856.1| hypothetical protein BacD2_16350 [Bacteroides sp. 
D2] # 9 135 1 127 127 222 100.0 4e-57 MEITDLKQMTKEEVFNFIRQRLSFNDELLEQFRHVKKDDLAKEHRRFEMSGNESKTGWST VFNTAILNEFADLGIYNYTSYLFLDFHNGTPTIYLKYFLEDENLEYTFNGYTTTEIIFEV LELTIFSGKPERKRN >gi|225935346|gb|ACGA01000046.1| GENE 96 155170 - 155424 437 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29347102|ref|NP_810605.1| 50S ribosomal protein L31 type B [Bacteroides thetaiotaomicron VPI-5482] # 1 84 1 84 84 172 100 2e-41 MKKGLHPESYRPVVFKDMSNGDMFLSKSTVATKETIEFEGETYPLLKIEISNTSHPFYTG KSTLVDTAGRVDKFMSRYGDRKKK >gi|225935346|gb|ACGA01000046.1| GENE 97 155875 - 156879 1159 334 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 1 328 1 328 332 453 69.0 1e-127 MVNYKDLGLVNTKEMFAKAIKGGYAIPAFNFNNMEQMQAIIKAAVETKSPVILQVSKGAR QYANATLLRYMAQGAVEYAKELGCAHPEIVLHLDHGDTFETCKSCIDSGFSSVMIDGSHL PYEENVALTKKVVEYAHQFDVTVEGELGVLAGVEDEVSADHHTYTDPEEVIDFATRTGCD SLAISIGTSHGAYKFTPEQCHIDPVTGVMVPPPLAFEVLDAVMEKLPGFPIVLHGSSSVP QEEVETINKYGGALKAAIGIPEEWLRKAAKSAVCKINIDSDSRLAMTAAIRKTFAEKPAE FDPRKYLGPARDNMEKLYKHKILNVLGSDNKLAQ >gi|225935346|gb|ACGA01000046.1| GENE 98 157085 - 158245 1216 386 aa, chain - ## HITS:1 COG:TM0880 KEGG:ns NR:ns ## COG: TM0880 COG1883 # Protein_GI_number: 15643642 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Thermotoga maritima # 12 385 17 383 384 349 53.0 6e-96 MGDFINFLGNNLADFWTYTGFANATVGHIVMILVGLGFIYLAVAKEFEPMLLIPIGFGIL IGNIPFNMDAGLKVGIYEEGSVLNILYQGVTSGWYPPLIFLGIGAMTDFSALISNPKLML IGAAAQFGIFGAYMIALEMGFDPMQAGAIGIIGGADGPTAIFLSSKLAPNLMGAIAVSAY SYMALVPVIQPPIMRLLTTKSERVIRMKPPRAVSHTEKVIFPIIGLLLTCFLVPSGLPLL GMLFFGNLLKESGVTRRLANTASGPLIDTITILLGLTVGASTQASEFLTLDSIKIFALGA LSFIIATASGVIFVKIFNIFLKKGNKINPLIGNAGVSAVPDSARISQIIGLEYDSTNYLL MHAMGPNVAGVIGSAVAAGILLGFLM >gi|225935346|gb|ACGA01000046.1| GENE 99 158247 - 158678 572 
143 aa, chain - ## HITS:1 COG:SA0963 KEGG:ns NR:ns ## COG: SA0963 COG1038 # Protein_GI_number: 15926699 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Staphylococcus aureus N315 # 65 143 1068 1146 1150 68 44.0 3e-12 MKEYKYKINGNSYKVTIGDIEDNIAHVEVNGTHYKVEMEKQPKVAPKPVTVRPMPNAPTA PTQVVKPAAPSTGKSGVKSPLPGVILDIKVNVGDTVKKGQTIIILEAMKMENNINADKDG KITAINVNKGDSVLEGNDLVIIE >gi|225935346|gb|ACGA01000046.1| GENE 100 158702 - 159622 928 306 aa, chain - ## HITS:1 COG:no KEGG:BT_1687 NR:ns ## KEGG: BT_1687 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 306 1 306 306 546 90.0 1e-154 MNKTKIGIFLSLLLLIGLTSCGEKKSNNKLVLNEILIDNQSNFQDDYGLHSAWIEIFNKS FGSADLAACLLKVSNQPGDTVTYFIPKGDVLTLVKPRQHALFWADGEPNRGTFHTSCKLN PETANWIGLYDSGRNLLDQIVVPAGALGPDQSYARVSDGAVNWEVKGGSSDKYVTPSTNN QTLDSNAKMENFEEHDAVGIGMSISAMSVVFVGLILLYISFKVVGRVAVNLSKRNAMKSK GIDKQEAKELSQAPGEVYAAISMALHEMQDEVHDVEETVLTITRVKRSYSPWSSKIYTLR ETPPRK >gi|225935346|gb|ACGA01000046.1| GENE 101 159654 - 161207 1780 517 aa, chain - ## HITS:1 COG:RC0960 KEGG:ns NR:ns ## COG: RC0960 COG4799 # Protein_GI_number: 15892883 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Rickettsia conorii # 11 517 12 514 514 653 62.0 0 MSNQLEKIKELIERRAVARIGGGEKAIAKQHEKGKYTARERLAMLLDEGSFEEMDMFVEH RCTNFGMDKKHYPGDGVVTGCGTIEGRLVYVFAQDFTVSAGSLSETMSLKICKIMDQAMK MGAPCIGINDSGGARIQEGINALAGYAEIFQRNILASGVIPQISGIFGPCAGGAVYSPAL TDFTLMMEGTSYMFLTGPKVVKTVTGEDVSQENLGGASVHSTKSGVTHFTAKTEEEGFAL IRKLLSYIPQNNLEEAPYVDCVDPIDRLEDSLNDIIPDSPTKPYDMYEVIGAIVDNGEFL EIQKDYAKNIIIGFARFNGQSVGIVANQPKYLAGVLDSNASRKGARFVRFCDAFNIPIVS LVDVPGFLPGTGQEYNGVILHGAKLLYAYGEATVPKVTITLRKSYGGSHIVMSCKQLRGD MNYAWPTAEIAVMGGAGAVEVLYAREAKDQENPAQFLAEKEAEYTKLFANPYNAAKYGYI DDVIEPRNTRFRVIRALQQLQTKKLSNPAKKHGNIPL >gi|225935346|gb|ACGA01000046.1| GENE 102 161236 - 161640 456 134 aa, chain - ## HITS:1 COG:PH0272 KEGG:ns NR:ns ## 
COG: PH0272 COG0346 # Protein_GI_number: 14590197 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Pyrococcus horikoshii # 6 133 8 133 136 122 55.0 3e-28 MKISHIEHLGIAVKSIEEALPYYENVLGLKCYNIETVEDQKVRTAFLKVGETKIELLEPT CPESTIAKFIENKGAGVHHVAFAVEDGVANALAEAESKEIRLIDKAPRKGAEGLSIAFLH PKSTLGVLTELCEH >gi|225935346|gb|ACGA01000046.1| GENE 103 161816 - 162814 507 332 aa, chain - ## HITS:1 COG:no KEGG:BT_1684 NR:ns ## KEGG: BT_1684 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 332 1 332 332 612 87.0 1e-174 MPLKLTTYYQGKDIPELPGKNTFHSKELFQIYEATPGYTPLLIVATEDGRPVARLLAAIR KAKKWLPSSLVKHCVVYSEGEFLDESLSANKEKAEEVFGDMLEHLTQEASRSCVLIEFRN LNNSMFGYRVFRANDYFPVNWLRVRNSLHSMKKVEDRFSPSRIRQIKKGLKNGARVEEAH TVEEIRDFSRMLHKVYSSRIRRYFPANDFFRHMNNMLIKGKQAKIFVVKYKEKIIGGSVC IYSGDDAYLWFSGGMRKTYALQYPGVLAVWKALEDAHQRGFRHMEFMDVGLPFRRHGYRD FVLRFGGKQSSTRRWFRVSWSWLNKLLVKFYI >gi|225935346|gb|ACGA01000046.1| GENE 104 163067 - 166183 2541 1038 aa, chain + ## HITS:1 COG:no KEGG:BVU_2924 NR:ns ## KEGG: BVU_2924 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 10 1038 12 1034 1034 1080 55.0 0 MRDYAFVEVWKAQHRRISMILGITLLCSPIPYSVQAEDGVSNTMVQIVTQAKTVKGTVLD ENGEPLIGVSIVVKGTSTGTITDFDGKFSIDLPAGSKELLISYIGYKDQTITITGNVPLN VKMVPDTQALDEVVVIGYGTVKKRDLTGAVTSVKSGDITASPVNNPMEALQGKVAGLDIM KSSGKAGADVDIQLRGTRSIYGSNSPLFIIDGIQGSYNQVNPADIESIEVLKDASSTAIY GSAGANGVVIITTKQGKTGKATVNFDAYVGVSGFAKFPHGMTGDEYVNLKKEAWRTKNNE EYPEFMSTIFNKPGVLEAYEAGKWIDWVDEVMGKNGVQQNYNLSITGGTEKTKIFASLNY NNEEGLIENDNRKRYSMRLNLDQSIFSWAKVGFNTNLTYTDGNSRNQGVFVNALTFLPLG DAYDADGNINYEYCKDGGKVNPMADEAKDQYVNNTRSTYFTGNAYLELTPLKGLTFKSVL GTTLSNSRNGVFFGEKSIANITSGYQAPLAEIYNSASYAYRWENVITYNFTLARDHNFGV TGITSWSKNQDEANESHAQGQELAAQSFHNLGAGTEKVQVASSFSQTQSMSYAFRLNYSY KGKYLFTFSNRWDGVSHLAVGHKWDAFPAAAIGWRISDEAFMEKTQNWLSNLKLRAGYGV TGNSGRKDAAYSSTTGSYTYPKGVGFGNSAAGHIQYGGTYGNPDVGWEKSYNTNIGIDLG 
LFNGRVNFTFDWYNTDTKDLLYARTMPITSGIAAWGSPMKAWQNLGETNNKGIELSLNTQ NIVKKNFTWSSNVTFTANKEKIVSLPDGDVKDEKLFEGEAISVFYDYKYLGIWSTTEAEE AAKYNCKPGDIKLATDGSFNDNGIHTYSNKTDYFILGKKTPDWILGFQNNFTYRDFDLSF FIMARWGYMVNNDVITRYNPTTSWDNSPTGSDYWTPENQGAYLPRPGLHDAVSNYTGFTS LGYMDGSYIKVKNITLGYSLPKKWLSKICMEKLRVYATAYDPFIFAKEKLMRDLDPERNG ASNFPLTKRFVFGVNVTF >gi|225935346|gb|ACGA01000046.1| GENE 105 166197 - 167993 1365 598 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4993 NR:ns ## KEGG: Cpin_4993 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 18 598 16 602 602 478 46.0 1e-133 MKNRKIYNILLVTVWILGGVSCSLDEYNPGGSSLEALMTTQNGFKQAINNCYFGLQRSFY GKQIYIELAEAGTDLWTAKQNSNTNANYFKYAEGGSMNLNMAQSIWNSGYDGIGSCNLAL AQADKVIDWDSEEVKNEILAEAHFLRGLYYYNLVEIFGGVTLITAKNEGVDMAPGRTEPL EIYKNCIIPDLEFAATYLPISRPEEEGRALRKSAKAYLVKAYLTTKEYGCSDYLQAAKDV ADELIRDCEAGGADLDTYMYTTFDDVFADANNQNNKESMFSIACSSQYGSVNAWQNNLSW MEFYCNTYYFPAVEKTQESIKQWGGNPEGMFMPSKCLLDLYVQSDNKLDPRYAASFKTLW KANKAFTWSEDKIKEMDRNGITTNTTLGVGDDAIRIIRSEEDNYAALKANKLKAPYMIVD VEDLYDDAGKVKMNYTRQSDGKTVLNPFYYFYPSLSKFNTTNVVVSNWSKGRYGSNASII TMRMAEIYLLAAEADVYLTGGTNANRYINVVRERAGATKYNGAVDIQYILDERARELAGE STRWFDLKRTGKLTEDYLKQTNPDIGQYFLNSTHTVRPIPQSFTDIIENGVDYQNPNY >gi|225935346|gb|ACGA01000046.1| GENE 106 168133 - 168675 478 180 aa, chain + ## HITS:1 COG:BS_ytiB KEGG:ns NR:ns ## COG: BS_ytiB COG0288 # Protein_GI_number: 16080121 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Bacillus subtilis # 1 180 3 182 187 208 51.0 5e-54 MVEEILAYNKEFVENKGYESYITNKYPDKKIAILSCMDTRLTALLPAALGIKNGDVKMIK NAGGVISHPFGSVIRSLLVAIFELGVEEIMVIAHSDCGACHMHSETMLEKMTARGINPDY IDMMRFCGVDFHAWLDGFEDTEKSVRGTVDFIVRHPLIPSDVTVYGFIIDSTTGELTRIV >gi|225935346|gb|ACGA01000046.1| GENE 107 168741 - 168965 109 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173456|ref|ZP_05759868.1| ## NR: gi|260173456|ref|ZP_05759868.1| hypothetical protein BacD2_16410 [Bacteroides sp. 
D2] # 1 74 1 74 74 115 100.0 1e-24 MKKRITQDDYVKANRKASREAEIEMYGHPICHKRVHQSKKVYNRRKIKAADKKLPYFFVF KIASLSYRHSDTIQ >gi|225935346|gb|ACGA01000046.1| GENE 108 169088 - 169624 499 178 aa, chain + ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 6 175 3 172 174 145 39.0 5e-35 MENFSELIKNRRSMRKFTDEELTQDQVVALMKAALMSPSSKRSNSWQFVVVDDKEILKEL SHCKEQASSFIADAALAIVVMADPLASDVWIEDASIASIMIQLQAEDLGLGSCWVQVRER FTATGMPSDEFVHGILDIPLQLQILSVIAIGHKGMERKPFNEEHLQWEKIHINKFGGK >gi|225935346|gb|ACGA01000046.1| GENE 109 169627 - 169962 241 111 aa, chain + ## HITS:1 COG:no KEGG:BT_1679 NR:ns ## KEGG: BT_1679 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 122 85.0 4e-27 MGAAKANNSKDTYLATAVTFIVIGALFLIDKLIHFSSIGLPWVMNKDNMLLYASICFLIF KRDKSIGFVLLGLWLVMNIGLVMSLLGSLSGYLLPLTLLIIGIVLFWFAKR >gi|225935346|gb|ACGA01000046.1| GENE 110 169970 - 170764 789 264 aa, chain + ## HITS:1 COG:no KEGG:BT_1678 NR:ns ## KEGG: BT_1678 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 444 85.0 1e-123 MKRSIEDTPIVFIGAGNLATNLAKALYRKGFRIVQVYSRTMESARTLAEKVEAEYTTDLQ AVSKDAKLYIVSVKDDALVDLLPQITEGKQASLLVHTAGSIPMSVWEGHAERYGVFYPMQ TFSKQREVKFQEVPFFVEAKRPEDVELLKAVAATLSEKVYEASSEQRKSLHLAAVFICNF TNHMYALAADLLEKYNLPFDVMLPLIDETARKVHELAPRDAQTGPAVRYDENVMSNHLAM LVDSPALQEIYKLMSKSIHEHHQL >gi|225935346|gb|ACGA01000046.1| GENE 111 170745 - 171266 387 173 aa, chain + ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 8 165 1 158 168 117 36.0 2e-26 MSTINYDLSRIKALAFDVDGVLSSTTVPLHPSGEPMRTVNIKDGYAIQLAVKKGLHIAII TGGRTEAVRIRFEGLGVKDLYMGSAVKIHDYRAFRDKYGLTDDEILYMGDDVPDIEVMCE 
CGLPCCPKDAVPEVKSVAKYISYADGGRGCGRDVVEQVLKAHGLWMAEDAFGW >gi|225935346|gb|ACGA01000046.1| GENE 112 171338 - 171919 557 193 aa, chain + ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 10 191 5 183 189 137 43.0 2e-32 MLDNLKKYQIILASNSPRRKELMSGLGVDYVVRTLPDVDESYPDTLVGAEIPEYIAREKA DAYRTMMKPGELLITADTIVWLDGKVLGKPEGREGAVEMLRALSGKSHQVFTGVCLTTTE WQKSFTASSEVLFDVLSEDEILYYVDRYQPMDKAGAYGVQEWIGYIGVKSISGSFYNIMG LPIQKLYGELKKL >gi|225935346|gb|ACGA01000046.1| GENE 113 172108 - 174312 1811 734 aa, chain - ## HITS:1 COG:all3773_2 KEGG:ns NR:ns ## COG: all3773_2 COG0457 # Protein_GI_number: 17231265 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 512 719 133 335 395 66 28.0 2e-10 MNEKTINEQYAYIRALLEEKRLKEALMQLESLLWQCPDWDLRTRLEQLQTSYKYMLEYMK QGANDPERWNLYQKMVSDTWGIADQSRLLILDNASSRYYHEVRRTPKSPDLSNYGLKTIL HILESFNDDLAVSGLLSDEKMDEVLKRHEDTLKFMFIRTWTNSAWTPEDEEDAKAMLASE LLPGDDLCLFVSALTLSLMECFDLRKIMWLLNAYEHPNVNVSQRALVGAMIIFHIYRSRL TFYPELIKRVDLMEEIPSFREDVARIYRQMLLCQETEKIDKKMREEIIPEMLKNVSSMKN MRFGFEESDEENNDMNPDWEDAFEKSGLGDKLREMNELQLEGADVYMSTFAALKNYPFFR EVHNWFYPFSKQQSNVLKAMKQAGNQGGSLLDLILQSGFFSNSDKYSLFFTIHQLPQSQQ DMMLSQLNEQQVAELAEKSNVETMKKFNERPGTVSNQYLHDLYRFFKLSVRKSEFRDIFK EKLDLHHVPALDNILHWEDVLFPIADFYLSKERWDEAIEIYEELETIGGFEGESAEYYQK FGYALQKRKKYAEAIQAYLKADTLKPDNIWNNRHLAICYRLNRNYQVAITYYKKVEEAAP EDTNVTFHIGSCLAELGQYEEALNYFFKLDFIENNCIKAWRGIGWCSFISQKHEQAMKYY EKIIEQKPLAIDYMNAGHVAWVMGDIQKAAVFYGKAITASGNRERFLEMFHKDEEALLTQ GIREEDIPLMLDLL >gi|225935346|gb|ACGA01000046.1| GENE 114 174323 - 175507 947 394 aa, chain - ## HITS:1 COG:BS_yxaH KEGG:ns NR:ns ## COG: BS_yxaH COG2311 # Protein_GI_number: 16081049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 8 390 9 391 402 100 
25.0 4e-21 MESKIAETNARIDVADVLRGLAVMGIILLHSIEHFNFYSFPEEVPFEWMKFTDQAIWRGL FFTFSNKAYAVFALLFGFSFYIQDNNQQRRGKDFRLRFLWRLFILFIIGQFNAAFFTGEI LTMYAILGIILPIFCRMSDRTVAIFATLLILQPIDWAKLIYALCNPDYVAGKSLAGYYFS IAFDVQKHGNFLETVRMNMWEGQMANMTWALEHGRILQTPGLFLFGMLVGRRKYFLYSEQ NERLWLKALAISLLCFFPIYGLNNMLPEFIERNAVLVPLQLILSSFSSLSFMVLLVTGLL LTFYRVKDRSFFMRFTSYGKMSLTNYLGQSIFGSLLFYHWGFELGRYLGITYSFLFGILF VLLQMAFCSWWLRHHKHGPFEGLWKRLTWIGKNK >gi|225935346|gb|ACGA01000046.1| GENE 115 175663 - 176706 1036 347 aa, chain - ## HITS:1 COG:no KEGG:BT_1674 NR:ns ## KEGG: BT_1674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 628 87.0 1e-178 MRKTLLNLFTFVLCITGIQNALLAQEQTPLNQVVNTLKERISLSGYAQLGYTYDDAANPD NTFDIKRIIFMAHGKITKRWTCDFMYDFYNGGMLLEVYTDYQFLPGLTARIGEFKVPYTI ENELSPTTVELINCYSQSVCYLAGVSGSDKCYGMTSGRDIGMMLHGKVFRDFLQYKVAVM NGQGLNTKDKNSQKDVVGNLMVYPLKWLSVGGSFIRGTGHAIADSEYTGIKAGENYAKKR WSAGGVVTTPTFNLRTEYLGGKDRSVKSEGFYATGSVRFARNFDFIASYDYFNPNKSADF KQNNYIAGVQYWFYPRCRLQAQYTFCDKKGDGQKDSNLIQAQVQVRF >gi|225935346|gb|ACGA01000046.1| GENE 116 176717 - 177685 752 322 aa, chain - ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 34 305 23 297 300 133 31.0 5e-31 MKRPILFFWFILFFLYSCQRKKGESSFLPLPELQPLTLGMMPTLDGLPFHIAKTQGIYDS LGLDLTILSFNSANDRDAAFQTRKMDGMITDYPSAVALQAIHHTDLGFILKNDGYFCFIV SKESNINQLEQLKEKNIAVSRNTVIEYATDQLLSKAGIKHTEINMPEIGQLPLRLQMLQY NQIDASFLPDPAASIAMNSKHRSLVSTQELGIDFTATAFSRKALNEKREEIELLITGYNL GVDYIKMHPQKEWEQVLIEIGVPENLTGLVALPGYQKAKRPSAEAIDKAIHWLKENHRIP QTYSEKNLIDTTFIPTVSTIIK >gi|225935346|gb|ACGA01000046.1| GENE 117 177792 - 179051 1678 419 aa, chain - ## HITS:1 COG:all4131 KEGG:ns NR:ns ## COG: all4131 COG0126 # Protein_GI_number: 17231623 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate 
kinase # Organism: Nostoc sp. PCC 7120 # 8 419 13 399 400 387 52.0 1e-107 MQTIDKFNFAGKKAFVRVDFNVPLDENFNITDDTRMRAALPTLKKILADGGSIIIGSHLG RPKGVADKFSLKHIIKHLSELLGVEVQFANDCMGEEAAVKAAALQPGEVLLLENLRFYAE EEGKPRGLAEDATDEEKAAAKKAVKESQKEFTKKLASYADCYVNDAFGTAHRAHASTALI AKYFDVNNKMFGYLMEKEVKAVDKVLNDIKRPFTAIMGGSKVSSKIEIIENLLSKVDNLI IAGGMTYTFTKAMGGKIGISICEDDKLDLALDLMKKAKEKGVNLVLAVDAKIADAFSNDA NTKFCAVDEIPDGWEGLDIGPKTEEIFANVIKESKTILWNGPTGVFEFDNFTHGSRAVGE AIVEATKNGAFSLVGGGDSVACVNKFGLASGVSYVSTGGGALLEAIEGKVLPGIAAIQE >gi|225935346|gb|ACGA01000046.1| GENE 118 179145 - 179822 628 225 aa, chain - ## HITS:1 COG:RP746 KEGG:ns NR:ns ## COG: RP746 COG0177 # Protein_GI_number: 15604580 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Rickettsia prowazekii # 9 213 8 210 212 196 46.0 3e-50 MRKKERYEKVIAWFQDNIPVAETELHYNNPYELLIAVILSAQCTDKRVNIITPPLYKDFP TPEALAATTPEVIFEYIRSVSYPNNKAKHLVGMAKMLVNDFNSEVPDNLEDLIKLPGVGR KTANVIQSVVFKKAAMAVDTHVFRVSHRIGLVPDSCTTPFSVEKELVKNIPEKLIPIAHH WLILHGRYVCQARTPKCDTCGLQMMCKYFCNTYKVTKEEPKAKNK >gi|225935346|gb|ACGA01000046.1| GENE 119 179819 - 181018 832 399 aa, chain - ## HITS:1 COG:STM2280 KEGG:ns NR:ns ## COG: STM2280 COG0477 # Protein_GI_number: 16765607 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 3 389 2 381 396 134 25.0 3e-31 MAKDRLITSSYCFILAANFLLYFGFWLLIPVLPFYLSEFFQAGNSTIGIVLSCYTVAALC IRPFSGYLLDTFARKPLYLFAYFIFMMMFGGYLIAGSLTLFIIFRIIHGVSFGMVTVGGN TVVIDIMPSSRRGEGLGYYGLTNNTAMSIGPMFGLFLHDAGVSFATIFCYAFGSCILGFL CASLVKTPYKPPVKREPISLDRFILMKGLPAGLSLLLLSIPYGMTTNYVAMYARQIGLNT QTGFFFTFMAIGMAISRIFSGKLVDRGKITQVIAAGLYLVVFSFFLLSTCVYLIQWNDTA CTLLFFGIALLMGVGFGIMFPAFNTLFVNLAPNSQRGTATSTYLTSWDVGIGIGMLTGGY IAEISTFDKAYLFGACLTVVSALYFRLKVTPHYHKNKLR >gi|225935346|gb|ACGA01000046.1| GENE 120 181167 - 182186 
1184 339 aa, chain - ## HITS:1 COG:BS_pheS KEGG:ns NR:ns ## COG: BS_pheS COG0016 # Protein_GI_number: 16079916 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Bacillus subtilis # 1 338 1 344 344 317 45.0 2e-86 MIAKIEQLLKEVEALHASNAEELEALRIKYLSKKGAINDLMADFRNVAAEQKKEVGMRLN ELKTKAQDKINALKEMFESQDNDCDGLDLTRSAYPVELGTRHPLTIVKNEIIDIFARLGF SIAEGPEIEDDWHVFSALNFAEDHPARDMQDTFFIEAHPDVVLRTHTSSVQTRVMETSQP PIRIICPGRVYRNEAISYRAHCFFHQVEALYVDKNVSFTDLKQVLLLFAKEMFGADTKIR LRPSYFPFTEPSAEMDISCNICGGKGCPFCKHTGWVEILGCGMVDPNVLESNGIDSKIYS GYALGMGIERITNLKYQVKDLRMFSENDTRFLKEFEAAY >gi|225935346|gb|ACGA01000046.1| GENE 121 182307 - 183323 730 338 aa, chain + ## HITS:1 COG:no KEGG:BT_1668 NR:ns ## KEGG: BT_1668 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 350 517 78.0 1e-145 MKKIVFCLLLLTFSFRLAAQIDYLEPVKPFSTYTGELGEYYRSVFSLLNTGFQKQPYARF AAIPSFSPEYAMSVEKRNGRYMLVSNTLSRTYWQAEKGTVTVDTKSVAISASLYQSLGAI FRLVTEQVQDLDGSTAGLDGIVYFFSSTDAKGKEQMGRKWSPEKGTLMERLVLVCQSAYM LSRGENISEQTLAVEAAALLKALQQRTKEEPDAYKRPMYIGIYPVGPRSKTLSGRQVEEP AHFSAMTPEEYIASEMVYPSGLLEKNVSGYALCEFTIDKEGVILRPHILRSTHPEFAEEA LRIVKGMPKWSPALVGGKPADSNYTLYVPFRPQLYRNK >gi|225935346|gb|ACGA01000046.1| GENE 122 184215 - 185081 556 288 aa, chain + ## HITS:1 COG:no KEGG:BDI_3189 NR:ns ## KEGG: BDI_3189 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 286 3 295 301 80 25.0 7e-14 MTTKQWIGIEEAATKYQVSTRRIITWCKRQEIIYSEVGDYLMLDENSLTDCLERNIRFSL SEEEHKRRMDEKMKENEEEFFLLQSLKELTPLIRLIIKELAGMIRNDERRQLFLYTVLQG NIKDFSVRKRMKYRQAQKAFEGLVQEIKSQAGFLRTYKEENIRLRATVRAYEMKSRQNGF DNDMFMREAEETNPEIFIPEDIKAAKALLDTPITELKFDIRSQRIISEADIKTLRELLQI TSQYGFRKLRDMLRNFGLVSQKKVEKRLKELNVLDVAGNCNLYRYLDE >gi|225935346|gb|ACGA01000046.1| GENE 123 185210 - 185422 404 70 aa, chain + ## HITS:1 COG:no KEGG:BT_1667 NR:ns ## KEGG: BT_1667 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 70 1 70 70 80 88.0 2e-14 MDDMINRHDSIAEENIEPNGRPAKDQFEEWSGEVADRADDVFKNDKKDGPIKNREKRIKE MDEVIKKDLE >gi|225935346|gb|ACGA01000046.1| GENE 124 185496 - 186065 439 189 aa, chain - ## HITS:1 COG:no KEGG:BF3225 NR:ns ## KEGG: BF3225 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 183 12 186 191 91 28.0 1e-17 MLSPKTLKYLRLTHRYVSFICAGILFVYLLSGFLLNHRKEFTFMNQKKERTVDYIFQLPA NKDDFTEADARKIISDLSCNPDLYNRFSLSKNKLTIFGKEQLVIKLDATSQQAFIKEMHR PAFLTALNKLHRNPGSLWTITSDAFLLLMFILLITGLLIVPGKKGLWGIGGILTIIGILI PILIYWMIA >gi|225935346|gb|ACGA01000046.1| GENE 125 186087 - 188294 1733 735 aa, chain - ## HITS:1 COG:no KEGG:FP0382 NR:ns ## KEGG: FP0382 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 25 734 20 707 713 384 33.0 1e-105 MKLTTIISAFYLLCLLCIQPVHADGGFWLPTQIQGKVHQAMKKQGLRLSEKDIYDINQSC LSNAILSLSYDNSTFSPSASASFISGEGLVLTNFHCVARYLEHISDANHDYVKHGCWATQ RSEETYLPNLQVNQLITIKDVTTEITQDTGNLTDQALTQKINENGNKIVKSQAKGRGVEG KIYSLFGGRQYILAVFRSFKDVRIVAAPPISIGKFGGDTDNWQWPRYSADFAILRVYAND KNQPAAYNKQNRPYQPDAHLSISTKGVKPDDFVMVAGYPAQTRQYVPSFALEKIVFKDTQ AEADIAKIKLDFYTQRKETAADSLYSYYNIKAGSAANVYLKSIGEISGVKESGIIARKQQ EEQAMTEWILSDDTRTQRYGATLLDDMKANYQRLTQLNFTDQMFREVALYGANVIPFAGK FEKLVQMEQRKSKNVKTAMKGEIRKLKPLTVAFYRDFRPEDDKYIMKKMLAYYLDHVDSI FYSTALKKVEKHYPQHLSAYIDSLYDHSPLRNQESMLALLDSVPTQGTAALSQDGLYQLS LGFYLVYVEKINRLKQAYQNKNMKLYSLYLQAYAEKNKGTQIAFDANRTLRYSTGKVKAA VPSEGVIYTPFTTVDGMLSRHRLYAGNKDFNLPARFVKLIEHRQPSNYFSQKVTPTCCFL TDAHTTSGSSGSPVLNAKGEVVGINFDRIWQGLSSDYETAGNEKSRNIAVDIRYVLWTLE QYSHSQYILNELDIH >gi|225935346|gb|ACGA01000046.1| GENE 126 188291 - 190480 1137 729 aa, chain - ## HITS:1 COG:no KEGG:BT_3282 NR:ns ## KEGG: BT_3282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 720 67 788 849 429 35.0 1e-118 MMRIANNKLRLLLLLLFPTLLWGNSAPMFHIRQNTKGCFVEIPKRLINRDFLLAARVMTV SSPNNKVKLYAGQRLYDPVWIRLKYDKEQLYLLRPDSKNLCEDTTHLSYPAYARNAITPI 
AESWKIEQETDSSIVVNWSKFLSEPIEGVDPFGGKTSPGRSLPQLNKILQVDVHEKNLEV SVQYGFEGTTQPFLTTIRKSLLLLPEQPMQPRIHDARVGYDNIPKRKFYFDTPSIAAENY ITRFRIVPSPKDVRSYLQGKQVKPQQPIVFYIDDAFPPLWKKAIRKGILDWNKAFEEIGF KEVMQAKTYAEAGPEFDPNDIRFNCFRYVVSDFPNAMGKHWVDPRSGEILQADVLFHSNV IALLRKWYFLQTSAYHPAARTKTLPDDITAQLIRYAAAHEIGHCLGLEHNFKASYAYNTE DLRRPEFTERYGTTPSIMDYARFNYVAQPGDGVRYVLPPLLGVYDRYAIRIGYAYLSREN TRTVAGWIDEKQNDPMYHCGRMAPSTIPTDPTVQTSDLGNDPVASATYGIRNLQQILTQL PEWNKKRLTDNPFEEMPATYTDLQQAYFDHLERVIPFIGFSDEVSGKAVEFLWKELLGGY NFLRTDAVCKYAGNPTEAIIKAQKTIIEKMFGRIIAERISSNETPAGFTYAHYLEVSANY LFTDKTPDIFTRHLQESYLQTLQSLLTEARTSSFSVLFSPTVSEHLTRIREQLTTNPSTW NNYLKNKIQ >gi|225935346|gb|ACGA01000046.1| GENE 127 190525 - 192090 1194 521 aa, chain - ## HITS:1 COG:no KEGG:BT_1937 NR:ns ## KEGG: BT_1937 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 493 36 491 520 77 24.0 9e-13 MMMNHRYILFTGLMVTALCIHAQDIELQTKEKYEYADATQLWRRTLNPAGLSLDTLTNRG ISYFEFSHQQGTHYRVQDGDEQNKLLFTSERYQKIGKYLYGYGRFTFDMGRQFNRSWSDV LRSHHSNPYFSGSSIKGKYDFQNFDLSAALATLPIKNFTFGARLDYKVGDLSRLKDPRSR TNLADYQLTPAVTYTWNKHSLGLSGYYHRRKEKIPNITTVQTDPNLKYYTFTGMENADGT TGGYSGFEREFVNHEFGGELSYQYKNERIQTLTTLSYAKGNEDVWGDIKYSPGKYHTTTY NLLSMNRIKSGRTEHIIDISINYQTGKADEYRQEKIIEKDPVTGIESSYWNTLLTYDERY TVDLLNANLHYRLLWSNHHTGETTAYAGARAYFQSVEDQYNLPSSSLTVRQATICMEGGY SFLRKNNRSLWIEAEAGYHISLSSDLSLNDPSTEYAQSVLLPDMTYYGASYAHGKLQIQY QMPVTIKKHTNVWFVKATGAYLKTDKKTDSKMFGISLGLYH >gi|225935346|gb|ACGA01000046.1| GENE 128 192184 - 193371 988 395 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_0764 NR:ns ## KEGG: Fjoh_0764 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 22 394 18 435 436 121 29.0 4e-26 MNRTISFITKMMFAAVTVIMAACSSDDNATQALTVQVKVSMPEGFKSDILYTGHTVTMGK YTAITNEKGIATFEGVIPDLYDISTSCEITAEEYEEMTGNDPQNENYVISGSLLKYTVGS STTIELQTSISAKQSIVISKVYYAGTKDNNNKNYLAGKYIEFFNNSDQTVDIAGLYFGLV ESESTPAYTLGSTPEYIYLKQIYRFPSNGVTEVAPGASIIVANSAIDHTGNNEVDLSKAD FEAKDTQGKTTNNPATPAVELIYTTFPTISNMNLVQGGPCSVVLFSTDEDVTSWETVYVD 
GKDKGSKFVKTPVKYIMDGVECLKNKSTGVDKNSKRLYNYIDAGYQYTEATTGYTGEVVY RKTAKTENGRTILADTNNSSNDFAVSAEIKPREYK >gi|225935346|gb|ACGA01000046.1| GENE 129 193380 - 196334 2000 984 aa, chain - ## HITS:1 COG:no KEGG:BT_1939 NR:ns ## KEGG: BT_1939 # Name: not_defined # Def: putative outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 116 984 50 953 953 364 31.0 1e-98 MMQTVAIILFAANVSAQNQNVVISVRNIPVRTALTQIRQAANVHFVYEEKNINSQQTVTL NYPQGTSLSTLLNNLCKQIGLTYEINESVILLYPAQKQTTTHDIHILLLERGNKQPLPMA TCVLNPLGAYAATDMEGKAVLKNVPTGKYILNISYVGFETVQREINVEQNLDLTIRMAPT SLALKEVVVVAKQNAAGESTSSIIGRQAIDHLQAMSLDDVMQLIPGHLMKNTDLTSRSNV QLRTLVNNNTNAFGSSIIMDGVPMSNNGTLSQGGFSSTAFVGTDLRQISADDIESVEIIR GIPSAEYGDLTSGLVVVHSKIGQTPWQIKGKINPGTMNYSLGKGLRLNKDAGILNFNLDY AQAWGDPRQKTKSFDRYTFSLGYSKDFSRIWHTDTKVRYMMGKDWNGKDPDAIDDGTFQK NRNQLFSLSHNGKLSINKPFSRTVNYTLGVSYTQTEMEKSAIVANSSGLLPILTATETGY YNVPFEQSSYQASGGSLSRPGNVYAKVSNTFFVKSKKTYQNFKMGVEYHYDWNNARGYYN DDDRYPLQPNSNGRPRPFSDIPSIHQMAAYIEDNFRWDISENRFLKIQLGGRFTTLQPWA DEATFSFSPRVNASFSVNKWLNIRGGFGLNSKTPGLDYLYPDKKYIDRVAANYMPQDDKV GQILMYHTQVYNVQRTKGLKNATNRKWEAGFDIKLSDKHKLSLIGYHDKTANGFGSATEY FTYTANYYSAEKGLIITPGQATKIDWDNPERTDLVLSTTGKIGNTNVSINKGIEMDFDFG EIPALHTSFYLSGAYMESKTYSTDINSSNPTDLPTEYQVTNTIPFKIVYPSGLNYSTYRR FTNTLRIVTNIPALRMVASFSTQAIWYNYSKSNNPPMDPIGWIDTDLSYHEITADMLADE NYKIKGVSLKSQRKNPKDNVPSKAPITWLMAGRLTKELGNIGGISFYANNILFYEPFLKN STSNTLVQRNTGNFSFGVELFFNL >gi|225935346|gb|ACGA01000046.1| GENE 130 196382 - 197452 761 356 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 154 346 131 322 331 79 29.0 1e-14 MNQEDHNKTEKNLITLLNWYRLKRVNEFAAQCDEAQAWESIQQRIRHRKRIRMYTYWSST AAACLLIAAFSFYTLTEYDQSLFAHRGEQKATLFVNNGESYDLLSQDKSIFNANGMQVAQ NTEKELRYDTQTSIQKNAGKHILNVPRGGEYKLVLSDGTRVHANAESRLTYPVSFTGNKR 
EVYLTGEALFEVAKDTLHPFIVHTPYGIVEVVGTRFNVNTYEKNQTTVTLEEGAVKVYCE EKKGTEQKSLSPGEQAVIQSKTINTQTVQVQEYTSWANGIYEYTDTSLETIVQQLSRWYN VNMTFTDPQLRERKFAGVIFRDQPLQKAVDILSKVSNVHFRQKGNVIEITENRPQY >gi|225935346|gb|ACGA01000046.1| GENE 131 197532 - 198191 464 219 aa, chain - ## HITS:1 COG:no KEGG:BVU_1982 NR:ns ## KEGG: BVU_1982 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 35 214 15 192 194 123 43.0 5e-27 MTFYHKNINKKYYILRFLCYFALHNSIIKMNHLTDKELVSLFSTDKERAFNLFFQRYYIR LCMYAVQITDDFSESEDIVQSFFISFWEKKLYKTITDNLKGYAYLCIRNASLKFIEKREK IDSSDILLNEEEYLYICESLEENEREQKEKELEKALAALPEQEKKALYGVVIENKSYKAV ASELHISVNTLKTYLARAMKKLRKNEKLLLVSLFLINLS >gi|225935346|gb|ACGA01000046.1| GENE 132 198577 - 198879 247 100 aa, chain + ## HITS:1 COG:no KEGG:BT_1665 NR:ns ## KEGG: BT_1665 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 190 89.0 1e-47 MLIYNTTFQVDEDVHDNFMIWIKESYIPEVQKHGTLKAPRICRILSHREEGSAYSLQWEV ESSGLLHRWHLEQGVRLNDELAKIFKDKVIGFPTLMEVIE >gi|225935346|gb|ACGA01000046.1| GENE 133 198933 - 199442 440 169 aa, chain + ## HITS:1 COG:VC1847 KEGG:ns NR:ns ## COG: VC1847 COG0817 # Protein_GI_number: 15641849 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Vibrio cholerae # 2 141 15 150 173 96 40.0 2e-20 MGYGVLRVKGTKPEMIAMGIIDLRKFANHYLKLRHIHERVLSIIESYLPDELAIEAPFFG KNVQSMLKLGRAQGVAMAVALSRDIPITEYAPLKIKMAITGNGQASKEQVADMLQRMLRF PKEDMPTFMDATDGLAAAYCHFLQMGRPTMEKGYNSWKDFIAKNPDKVK >gi|225935346|gb|ACGA01000046.1| GENE 134 199494 - 201500 1499 668 aa, chain + ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 47 637 229 812 843 503 44.0 1e-142 MKMGSNYLAILGVTTVTTVMSCSPAKKEYASFELYPVRTGSLTEMEYTPLATRFYLWSPT 
AEEVRLMLYDAGEGGHAYETVKMEPTEDGTWTTSVDENLLGKFYAFNVKINDKWQGDTPG INAHAVGVNGKRAAIIDWKTTNPEGWESDRRPSLKSPADMIIYEMHHRDFSVDSTSGIKN KGKYLALTEHGTMNSDKLLTGIDHLIELGVTHVHLLPSSDYASIDETKLEENHYNWGYDP ANYNVPDGSYSTDPYQPATRVKEFKQMVQALHRAGIRVIMDMVYNHTFNTVESNFERTVP GYFYRQKEDGTLANGSGCGNETASERPMMRKFMIESVLYWIKEYHIDGFRFDLMGVHDIE TMNEIRKAVNKVDPNICIYGEGWAADTPQYPADSLAMKENVSHMPGIAVFSDELRDGLCG PVWKKEKGAFLTGVPGAEMSVKFGIVGAIEHPQVRCDSVNYSQKPWAEQPTQMISYVSCH DGLCLVDRLKASMPGATPEQLVRLDKLAQTVVFTSQGIPFIYAGEEVMRDKQGVDNSYKS PDAVNAIDWRRKTTNGDVFMYYKRLIDLRKSHPAFRMGDAERVRKHLEFLPVEGQNLIAF RLKDHANGDNWEDIIVAFNSRMTPARLEVPVGKYTVVCKDGVIDVRGLGTQIGPEVIIPG QSALIMYK >gi|225935346|gb|ACGA01000046.1| GENE 135 201581 - 205714 2683 1377 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 839 1120 8 277 294 154 37.0 1e-36 MSMKAQILFLFSIYCFLIGSTPVMKAQTGKFYSTDKELSNSLINAVYQDRKGFIWIATEN GLNKFDGTRFSIYRHNATDSTSLKNNYVRTLFEDSRGNFWIGCINGLQRYDRATDNFHEL FISRKDGRKNPHITSIIERRNGDLWIATSGQGAISLKKNSNPASFHIETELTDRIGSNYL NVIFEDSRQNLWIATEEKGLYRYSPESKELKSYKAPYHIAGDDVSAICEDAHGQIFVGTL TKGLFRLSSRQEGNFEPVLYQNRMNLNIRTLIIDTRGKLIIGTDGEGVKEYQPQQDIIVD SEINAGPFDFSKSKVHSLIEDKDHNLWLGIFQKGLILVPGISNKFDYYGYKSIHNNTIGS SCVMAIHTDEQATIWIGTDNDGLYAINDQGKQLRHYTHQAGNPQSVPGTILCLYEDSNQE LWLGSYFDGLARMNKQTGTCQDATSLLQGNLNAGKPKVSCIIEDKNKNLWVGTYGSGLYK INLPTQHVTYYESTRNENDDWSINRLPNDWISYLLEDKEGMIWIGTYRGLAVLNPQTDNF INYKKQNNLLPGYVVYSLLESSNGEIWAGTSEGLVCLNKDRLTPVLFTTADGLPSDIICG LAEDEKKNIWISTHQGISKLNPPEKKFINYYAGDGLQGNEFTRTAVFKDKRGKIFFGGTN GVTAFYPQDITEIKKEMNVLITGFHVANRPVKKGDKSGNNVITDTAVMDTEQFTLAYNEN TFSIDFSVLEFSNPDRISYQYKIKELGDEWISTQPGTNRVTYSSLKPGKYTFSVQARDHN NFSNIRTVTIAITPPWYQTWWAKVIWGCLGALLIYALTMYILSRIRHRQEVMRQKHMEQI NEAKLQFFINISHEIRTPMTLIISPLEKLLAEHSEKQPVYLMIYRNAQRILRLINQLMDI RKLDKGQMHLKFRETDIVGFINDLMQTFNYQAQKKNITFTFEKELEGADSLKVWIDLNNF DKVLMNVLSNAFKYTHEGGNIEVLLKTGHNDAYRGALKDYFEIDITDNGIGIDKNKIEQI 
FERFYQIDNDMTQSNFGTGIGLHLSRSLVELHHGIIKAENREDGQGTRFVIRLPLGSNHL KAEELENPEETGSEPTISQLPKDSIYETEEENKTNEYRKPKAKTRYRVLIVEDDEEIRRY IRSELDSDFRIYECTNGREGLETILKEKPDLVISDVMMPEMDGITLCRKIKQNININHIP IILLTAKSKAEDQIEGLEIGADAYIVKPFNTELLRTTISNLIANRERLRGKLVGEQQVEE KITKIEMKSNDEILMSKVMKTINDHLADPTLNVEMLAANVGMSRVHMHRKLKELTNQSAR DFIRSIRLKQAANLLREKNLSVSEVAYATGFSNLSHFSNTFRDFYGISPSEYKEQQM >gi|225935346|gb|ACGA01000046.1| GENE 136 206110 - 209283 3366 1057 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4951 NR:ns ## KEGG: Fjoh_4951 # Name: not_defined # Def: TonB-dependent receptor, plug # Organism: F.johnsoniae # Pathway: not_defined # 3 1057 12 1063 1063 711 39.0 0 MESTNHYFRPLGLIILFCLFPVWVLAQTVSVTGVVKDASGEPIIGASVVEAGTTNGIVTD LDGNFKLNVSAKGSLKISFIGYQTQTIPVAGKKQFDITLKEDAKVLDEVVVVGYGQMKRS DLTGSVVSVNDEAIKKSVVTSVDQVLQGRAAGVQVQANSGMPGGSSSIRIRGINSLNASN EPIFVIDGVIIDGSTGSGSDNALASINPSDIVSMDVLKDASATAIYGARAANGVIMITTK RGQKGEAQITYDGYIGWQEMPKKLDMLNLREYAEHKNVRSGKDYSGNDWGIVNKDNNFVR PDLLGEGTDWQDEMFSKAMMTSHNLSVTGGTDKSNYALGAGYLNQDGIAVGSGFRRLNLR GSFDAQVKSYLKMGINFAFSNSRQKLTVSDESLIRTALLQSPSVAVRNAEGTFDGPDTDE WVQTNPVGLAMIKDNRNEKMGIRANTYAEATIIDGLTFKTELSFDYGVTNTYKFDPSYTF GAIENTDRQGTFSKSYNKFWSWRNIVNYMKTFGVHNVNAMIGQEMQESKWEYLMGSRMGY LSNAATDLTLGDAATAKNNGNSGESSILSYFGRLFYSYDDKYLLTFTLRRDGSSKFYKDN RWGWFPSAALAWKVSNESFLKDNNIINNLKLRLGWGVVGNQNVPNNAYKAIYSSVATVWG TGLLAGNTPNPELKWESTYSSNLGLDINLFQNRIELIADIYYKKTNDLLLQVPLPAYVGT SGQGSTSAPWKNVGSLENKGLELTLNTVNIDKGGFQWRSNVVFSMNRNKVKSLDTATSLI DKSNQTGETITVLTRTSVGNPIGQFYGYKVIGRFEKATDFYYKDASGAIKPVALPEGMSI SKNSIWIGDYIFEDVNKDGVINEQDRTYIGNPEPDFTFGFGNSFSYKGFDLSINLTGSVG NEVVNWGRRELENPRGNNNILKSALDYAQLALIDPNGPDDYRNIQIVGGDPYACRMAIAK GTDDSNYRFSDRFVEDGSFLRIQSISFGYTFPRKWLAPIGIQNLKLYCNLQNVYTFTKYK GMDPEIGSANQDALLTGFDNYRYPSPRIYTFGLNLTF >gi|225935346|gb|ACGA01000046.1| GENE 137 209303 - 210949 1476 548 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6367 NR:ns ## KEGG: Cpin_6367 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 2 538 3 547 547 377 43.0 1e-103 
MKTSKYIYSFMLAVALAFAGCSDFLDSEDNSNIAGDNFYQTEDDFRAATAPLYNKVWFDF NDKFYYGLGDGRGFNLCATTSNYIYPFADLTETGLTGPLVSAWGSLYNIVQQTNKIINGI KGNSSNSEEVKNPYIAEARFMRGVAYSYLAMLWGNVIINEDTDELVANPIVNTSPVSDVY EFAMRDLEFAAKYLPEVSSAAGRVNKYSAFGMLSRVYLTYAGYSSNPNSATRNQDYLDLA KKAALKVIENRNFELMTDYADLFMIDKNNNSETLFALQWVSNGTYGECNTIQDYFACESA ITGNDRAWGGYVFAQPDVIWEYEKGDKRRQPTWMAYGDHYPEIQKANGGYTCEKKASGTV SGIYCKKYVCGSSKDNAKITSGSTPINTYMLRLAEVYLIYAEAILGNNPSTDDADAMEYF NKVRKRAGLEPQNSITYEDIRRERRLELCLEGQYWYDLVRRAYYKQQEVLNYITGQDRGT AIPVVWDADTQTLKRDEDSDETKRAIGDVDSSIFLLPYPESETVQNPLLKADPMPYAFKE DRITDLFN >gi|225935346|gb|ACGA01000046.1| GENE 138 210964 - 212367 1262 467 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6366 NR:ns ## KEGG: Cpin_6366 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 6 424 11 409 419 120 28.0 2e-25 MKSLIYKNSMKAFGLLVSLLICSCSFVSCDDDDDNGGSSVMNITGIYLEDAKSNVPDRLV DFARLGQLIRIEGEGFNGLKKVYINGYNCYFNPVFVSNKSFLVSVNSKVPTTEADENVRN TIRLVKDGGEYVYDFQIRAAAPSITKISNCMPNVGEPIIVYGSGLTEIAKVVFPGNVVVT EGIISDLDGEYFMVDMPAGVSEEGGSIFVEGSNGGAYSPAYFNYKKGLLLNFDGVGAQGA WGDSESMIQTTELESASIGEGNVSQGAYCRLPLERQLPVAAAKNRCAEVWTAGNGTDPDW LTLGVPAETPVAECAIQFEIYVPEPWSESGFLKICGQNGFNGGEWERDCYNYVPWLVDGK IVPFQTTGWQTVTVPFSEFYKSKASSGAWTTFADVAATRASASYANFGFYFENSDITLDK ITGASSDKETEFLSKATSVKIYIDNWRVVPLTKPEYSDFPDEEEDAE >gi|225935346|gb|ACGA01000046.1| GENE 139 212392 - 214614 2102 740 aa, chain + ## HITS:1 COG:BH2120 KEGG:ns NR:ns ## COG: BH2120 COG3693 # Protein_GI_number: 15614683 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-1,4-xylanase # Organism: Bacillus halodurans # 481 724 149 396 396 104 31.0 7e-22 MKYRNMLPFVALSVLILTSCDDNKMEWYKDPTHGAVTSSELPLQLAEKISRYKPLKEYLS DPNFKLGIGVGMDEYLGDETTTIIVNENFNDLTIGYAMKHGPMVDSKGNLKFDKVDQLFT KTTEAGISIYGHCLIWHTNQNASYLNSLIAPEVIPGPAGSNLLDLSGIEDGTFDGWNRKN GSDAMSIVEGEGLTTTSKALKFTVGSSVTAAHSVQLYTPDISAVSGHNYEISFFVRSDKA GKARLTFAGLGNNYPYKDWYATGGSWTEAFETTSQWQQVKITVNDFVSTTFQIAFEFGYL PDVTYYVDNIVVTDKDAEPTVVNLISNGDFESKAITPWGAWSSGKATISEEGEGYGSSYS 
MKLTSSVDGGAGNAYKAQAGYGFDTPLEAGKTYEFSAMIKASVSTTFQIQIQNSTSYAGE CYVDGNVSTTWTEFKKEFTVSKEDMNRFCINFGVSAGDYYIDNIVLSEKVVETRAVTRAS GPTIIEKTDEEKAEIIRDAMEDWISKMVTHCKPYVHAWDVVNEPMDDGKTSDIKTGKGKT DLASDEFYWQDYFLTPKDYAVEAFKLARQYGNPDDKLFINDYNLEYNLNKCDGLIKYVEY IESKGATVDGIGTQMHIAIDSNKDNIAQMFQKLGATGKLIKVSELDIKVNTSSPTTENLA QQAEMYQYVIDMYKKYIPVDKQYGITIWGVSDNEKEHVNWIPNDAPNLWDANYARKHAYK GVADGLAGKDVSGDFTGDLE >gi|225935346|gb|ACGA01000046.1| GENE 140 214947 - 216605 1439 552 aa, chain + ## HITS:1 COG:no KEGG:BVU_0028 NR:ns ## KEGG: BVU_0028 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.vulgatus # Pathway: not_defined # 7 539 3 528 645 573 50.0 1e-162 MNKYWFYKVGLVVVFLCFALLGGAKVKLPTLVSDGMVLQRGEPVNIWGTADPDETVDITF LKKKYKTVADAHGNWKVTLPVLKAGGPYTMTINDIELKDVLIGDVWVCSGQSNMELPVSR VTDRFRDEISTDSNYPMVRYIKTPLLYNFHAPQADIPGISWQAMTPENVMPFSALAYFFA KDVYQKTKVPVGIINSSVGGSPIEAWISEEGLKPFPFYLNEKRIYESDDLVESMKKEERK KSRAWNVALYQGDKGMHEATPWYAADYDDSNWKETDLFASGWATNGLNTINGSHWFRKDF QVSAQQAGEKATLRLGCIVDADSVYVNGTFVGSVSYQYPPRIYTIPAGLLKAGKNTITIR LFSYGGRPQFVKEKPYKILFGKGQPEKGESEINLEGSWKYRLGAPMPAAPGQTAFHYKPT GLYNAMIAPLLNYTVSGVIWYQGESNVSRRNEYKDLLTAMISDWRQRWNKSDMPFYIIEL ADFLSPTDKGGRTAWAEFRKAQAEVADTNKNVTLIKNSDLGEWNDIHPLDKKTLGQRVAA AILIEMNTKNRK >gi|225935346|gb|ACGA01000046.1| GENE 141 216622 - 218037 1026 471 aa, chain + ## HITS:1 COG:CAC3451 KEGG:ns NR:ns ## COG: CAC3451 COG2211 # Protein_GI_number: 15896692 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Clostridium acetobutylicum # 12 468 3 449 458 371 44.0 1e-102 MENTIKTNEAKGFYKLSWLQRIGFGSGDLAQNLIYQTVCMYLLIFYTNVYGLKPEVAAVM FLIVRIVDVLWDPLVGAFVDKHNPKLGKYRSYLIWGGIPLTGFAILCFWNGFSGSLFYAY FTYVGLSMCYTLINVPYGALNASLTRDTNEITVLTSVRMFLANLGGLAVAYGIPILVKVL SPDGKINTTASANAWFITMTIYAVIGLALLMFCFSQTKERVVMDQEETSKVKVSDLWVEF CRNKPLRILAFFFITAFAMMAIGNSAGSYYMIYNVRAPEMLPYFMALGSIPAFIFMPMVP AIKRAIGKKQMFYVFLSVAILGMALLYIISVVPVLKTQIWLVFVAQFIKSTGVIIATGYM 
WALVPEVISYGEYTHGKRISGIVNALTGIFYKAGMALGGVVPGLVMAFVGFDQTNEVSQS PFAEQGILWLVAVIPALLLLVAMFIISKYELEDNVIDNINEEIESRCKKGE >gi|225935346|gb|ACGA01000046.1| GENE 142 218077 - 219207 898 376 aa, chain + ## HITS:1 COG:TM0061_2 KEGG:ns NR:ns ## COG: TM0061_2 COG3693 # Protein_GI_number: 15642836 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-1,4-xylanase # Organism: Thermotoga maritima # 29 362 1 315 691 233 39.0 4e-61 MKLKRIILLLLTVMFSFSYGEVFAKDGNSLKKALKNKFLIGVSVNTHQSSGKDVAAVEIV KKNFNSIVAENCMKSSVIHPKENKYNFAQADEFVSFGESNQMAIIGHCLIWHSQLAPWFC VDKDGNNVSPEVLKKRMKDHITTIVKRYKGRIKGWDVVNEAIEDNGAYRKTKFYEILGEE YIPLAFQYAHEADPDAELYYNDYSMAQPGRREAVVKMVNDLKKRGIRIDAIGMQGHIGMD YPKISEFEKSMLAFAGTGVKIMITELDLTVIPSPNPNVGAEVSASFEYKKEMNPYPDGLP EEVSKAWTERMNDFFRLFLKHHNLITRVTLWGVADQNSWRNDWPMRGRTDYPLLFDRNYQ PKPVVDLIIKEAEKTK >gi|225935346|gb|ACGA01000046.1| GENE 143 219234 - 220211 927 325 aa, chain + ## HITS:1 COG:no KEGG:BVU_0040 NR:ns ## KEGG: BVU_0040 # Name: not_defined # Def: beta-xylosidase/alpha-L-arabinofuranosidase # Organism: B.vulgatus # Pathway: not_defined # 1 321 1 321 323 556 81.0 1e-157 MKTEKRYLVPGDYMADPAVHVFDGKLYIYPSHDWESGIAENDNGDHFNMKDYHVYSMDDV MNGEIKDHGVVLSTEDIPWAGRQLWDCDVVCKDGKYYMYFPLKDQNDIFRIGVAVSDKPY GPFIPEANPMKGSYSIDPAVWDDGDGNYYIYFGGLWGGQLQRYRNNKALESAILPEGEEE AIPSRVARLSEDMMEFAEEPRAVVILDEDGKPLTAGDTERRFFEASWMHKYNGKYYFSYS TGDTHLLCYATGDNPYGPFTYQGVILTPVVGWTTHHAIVEFKGKWYLFHHDCVPSEGKTW LRSLKVCELQYDADGRIITIEGKDE >gi|225935346|gb|ACGA01000046.1| GENE 144 220377 - 222512 1804 711 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020672|ref|YP_526499.1| ribosomal protein S18 [Saccharophagus degradans 2-40] # 74 690 140 739 754 699 55 0.0 MRNIFALTFTLFFCFSSIWAEDGSALWLRYASGAKAEITSKKQSPTLRIAVSELQNFWQG GIPVTLEVRNNKELRALGNEGYTIQTSKGGNQITIASSGEQGVLYGTYHLLRLQATGQLP ESALQSLNISERPDYRIRILNHWDNLDGTIERGYAGHSLWKWDELPSVVSPRYEAYARAN ASIGINATVINNVNASPKILSNDYLQKVKVLADVFRPYGLKIYLSINFSSPAALGGLSTS DPLNKEVINWWKQKAKEIYSLIPDFGGFLVKANSEGQPGPCDYGRTHAEGANMLADVLRS 
YHGIVMWRAFVYSPTDSDRAKQAYLEFEPLDDKFRDNVIVQIKNGPIDFQPREPFSPLFG AMKKTAVMPEFQITQEYLGFSNHLAFLAPMWKECLDSDTYLQGKGSTVARVTDGSLFFHP LTAISGVANIGDDTNWCGHPFAQANWYAFGRLAWKHSLSSEQIGEEWLKQTFLPVTGAQT DTPAGEVIQKEQIDAQLSLLNSQVLQKSREAVVDYMMPLGLHHIFAWGHHYGPEPWCDIP DARPDWLPPYYHRADNMGIGFDRSSTGSNATGQYHSPLCEQLDNVNTCPENLLLWFHHVP WNHQMKSGRTLWAELCYAYDRGVNEVRNFQKVWDRMEPYIDSERFRDVQHRLKIQARDAV WWRDACLLYFQQFSRQSIPYELERPIHELKDMMEYKLDITNFECPPYGFSK >gi|225935346|gb|ACGA01000046.1| GENE 145 222620 - 224074 202 484 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 282 469 21 198 305 82 32 3e-14 MNQNTFCMAGGVARNPLVRLAKPITATIGANEHIAIVGPNGGGKSLFVDTLIGKYPLREG TVQYDFSPSATQTLYDNVKYIAFRDTYGAADANYYYQQRWNAHDQDEAPDVREMLGEIKD EQLQRELFELFRIEPLLDKKIILLSSGELRKFQLTKTLLTAPRVLIMDNPFIGLDAPTRE LLFSLLERLTKMSSVQIILVLSMLDDIPSFITHVIPVDKMEVFPKMEREAYLEAFCSRDV VTSLDDLQQRIIDLPSDGNNYDSEEVVKLNKVSIRYGDRTILKELDWTVRRGEKWALSGE NGAGKSTLLSLVCADNPQSYACDISLFGRKRGTGESIWEIKKHIGYVSPEMHRAYLKNLP AIEIVASGLHDSIGLYKRPQPEQMAICEWWMDIFGIADLKDKPFLQLSSGEQRLSLLARA FVKDPELLILDEPLHGLDTYNRRRVKKVIEAFCRRKDKTMIMVTHYESELPNTITDRIFL KRNR >gi|225935346|gb|ACGA01000046.1| GENE 146 224110 - 225930 1281 606 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 321 602 20 307 328 174 36.0 4e-43 MNKANIGWWEADLKAETYKCSELISQLLGIGEDGLISFEDFNKRILKEDQGYATFPSFDK VQQTVEEIYLFDTPKGHVWIRSKACFQETDENGNTKVYGIAETQEGIHIASAHQALQNSE RILHNIYKNLPVGIELYDKDGQMVDLNKKDMEMFRISNKEDILGVNIFENPILPEEIKQK IKDNENADFTFRYDFSKINKYYQPNSTTGFIDLTTKVTTLYDHNHEPINYLLINVDKTED TIAYNKIQEFESFFDLVGDYAKVGYAHFDALSRDGYALRSWYRNVGEEEGTPLPEIIGIH SHFHPEDRAVMIDFLDKVIKGESSKLSRDVRIRRADGNYTWTRVNVLVRNYQPQDNIIEM LCINFDITELKETERMLIGAKEKAEEADRLKSAFLANMSHEIRTPLNAIVGFSSLLEEAE DAEEKHLYATIIEENNKLLLQLISDILDLSKIEAGTFDIIPEQVDAQQLCNELLQSMQVR 
ATEQVEILLAPELPELTFTSDKNRLYQVLLNFVTNALKFTSEGSIVIDYRINGNEVRFSV QDTGMGIEPEKQEAIFTRFVKLNNFIAGTGLGLPICQSIVTQLGGKIGVESEPGKGSCFW FTHPIN >gi|225935346|gb|ACGA01000046.1| GENE 147 226068 - 226214 123 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160886371|ref|ZP_02067374.1| ## NR: gi|160886371|ref|ZP_02067374.1| hypothetical protein BACOVA_04381 [Bacteroides ovatus ATCC 8483] # 1 48 1 48 48 67 100.0 2e-10 MIVSQGKYVLQRKWAYMSRVSINYLRVKKVIEAFCTRKDKAKIKSTSR >gi|225935346|gb|ACGA01000046.1| GENE 148 226423 - 226926 299 167 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0044 NR:ns ## KEGG: Cpin_0044 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: C.pinensis # Pathway: not_defined # 2 163 18 178 183 83 29.0 3e-15 MNNSQNAITVIYREHWQKLYIHAYNLLNDEESAKDVVNDVFCSVLESSERLTTEEDLLPL FFVMVRNRCIDQIRHQNVVHKNAERYLEELYSGWTVKEYREYEDKINRMQESIRQMAPQM RTVVEEFFLNEKKCAEISEKLRISDNTVRTHIARALKILRKQLTIFF >gi|225935346|gb|ACGA01000046.1| GENE 149 226993 - 228060 936 355 aa, chain + ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 145 354 67 275 280 80 31.0 6e-15 MIDHSDKKLDYALRAIQHPQLRETDEFLQWIGIPENKELFLELMACKEAVIRENLQRKRK YRKKMRILAIASSAAAVLILAFLIPSLLPSTSLPVEQPIRFFAANNNDEHVVLQMDGRSE QQLLTDSVMDVKDWKTVSADTVCCQTLTTPRGKDFLLVLADGTKVWLNAESRLRYPVAFN GKERRVELEGEACFEVAKDAEHPFIVCANGMNTMVLGTKFNVRSYSVEDRHVTLVNGKVQ VTNTANNKSVTLRPGQDLTYTETGEEKVSEVNIATYTAWTEGMFYFEDVPLEEIMGALGR WYNVNIDFERCELYHIRLNFWANKNTHLDEALELLNKLEKVQVDYQDGTITIKQI >gi|225935346|gb|ACGA01000046.1| GENE 150 228080 - 231676 2774 1198 aa, chain + ## HITS:1 COG:no KEGG:Coch_0666 NR:ns ## KEGG: Coch_0666 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.ochracea # Pathway: not_defined # 131 1198 24 1101 1101 774 38.0 0 MKKQQITLLEKRIKRKLVWMKTAFLLLVFTATNLQGWAQNAENRQISLEMKNEPLGSALK 
QFSNVSGYKVNFPSEDVAPYRVTVSIYQMAPFAALQKILEGKPFEYDVKQNFITVRKVKA TSTTASGKFRNVGGQVVDEGGNPLPGANIKVLNSPFGAIADAEGNFTCKVPVDVHTLEVS FVGMQSEKVSVKDRNNVRVIMHEDKQQLGDVVVTGYQRISRERSTAAFGFVDSEQLNRQM HSDLASSLEGQIAGLRMNINPNTGDMSPILRGVGTFSDDVGTNPLIVVDDMPTNLSLSEI NPYNVESITVLKDAAAASIYGALAANGVIVVTTKQAQKEGAHVSINADWFITTKPSFKSL NLASTSDIIDYQTAVFDANVAEKGSAANFLSSFQYNYYNPLFQLYLDQANGDITSSEVNA TLGQWRNNDYYKEYRDNAWRTAITQRYNVTVSQKAGNSNHFLSFNYEKDSQRVISGKSNK ISLYYKSNYAVTNWLNLNAGVDVRMGRSHTPNSSYTSYTLQQRYERILDADGNHYTSPYV NVGAYPSYNGSVVRKAEGVSPYRTFGFNVLDALEESITKSHDVSIRPFVSLQARFLKMFK YNFMYQYEWNKGKSELFESENTYAMRMLHNSMVDTNGKAQLPQGGRFSQTEVESKRYTVR NQIDFDKTWKDHAVTAIAGLEFRENKIPTPARQLLYGYDPQTLTSDFMNWQAYRDGVGTS ALSDRTITLSGLSATLHESRHRYASFYANAGYSYLSRYNLSGSIRWDQADLFGLDIRNQR HPLWSVGASWILSEESFMKEISWLDYLKLRMTYGINGNVDQASTTYFVVKKKTQSNPIKT TYLTYEDDDLPNPKLRWEKTATYNIGLDFRLFKNLISGTLEYYNRHSSDLLVRRYMDPTL GAGSRVVNNGEMRNRGVELSLSANIIRKKDWNFAVDFNFDHNKNKMLKVDHSESDNASLF IKSPLNYFMEGTSYNTLWAYRIDRMENGYPVAVDKDGNDLVKFNEDGTVASITSGSSLKG TDDLVNLGSLTPKFSGSVGLRFNYKNFDLNAFFVYAGGNKLRNSVLKMDDQLGTQTLKGI ANRWTADDSNAQVRMYLDIPAQVKTYATTFQDWWQFGDINVKDAGYVKLRSLSVGYNLPF MVCQYLRLSSLKVKVQVNNLFTWCKAGSDIDPESYGMNDGTRGIASPKTYSIGLSTSF >gi|225935346|gb|ACGA01000046.1| GENE 151 231697 - 233124 1464 475 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3801 NR:ns ## KEGG: Fjoh_3801 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 1 475 1 470 474 452 50.0 1e-125 MRKLLYILPVILILGFAACEDYVDITPTGKKTVDSTDTYYELIALPNRAYHPAAFALLSD NVWSKESNIIGNEFISWDGINMTFNEAANRKELSDNNLYENCYTYILRSNIVISLVDASL GDKDVKELAKAEAKIMRAWDHFILINTFAKAYNPETAATDGGVAIIDKYDLEATPTKSTV AQVYDFIIKDIEEALPYLQEEPVNVYHPSKAFGYALAARVYLFHRDWKKAKEAAEESLKL NNTLIDYIDLGAKGGPTKVTTYAKGGNPEVLNYAYMGGPTEVLAFCYGMLSPEMVQLFGQ NDERLNQFFKTSDNSIYYFDEGSGAALWNTSITYSKFQPMSVGMRTAEVYLILAEAKARL KDIPGAVQTLNQLREKRIKGAEAVLPEPATERAMVQAIIDERRKELISGFSRFWDLKRYN TEADYAKTITRTFPLVSTDVEKKTYTLKPDSRLYIIPFPLAAREKNPNLTMNTNE >gi|225935346|gb|ACGA01000046.1| GENE 152 233144 - 235648 2078 834 aa, chain + 
## HITS:1 COG:no KEGG:Coch_0557 NR:ns ## KEGG: Coch_0557 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 48 820 48 824 827 847 52.0 0 MRKILMMAVLAFLSLQGVGARASGISLLESGKKKLVAQEDTVKKDTVKKVTAYEKLLKDG GSECEGMFTVRHIKDNWYFEVPDTLLGRLLLAVTRFKAVPQGFKMLSGEEVNRSVVYWEQ HDDKTLFLREYVQSQFARPGDNIAEALKQSTVDPVIYKFDVIGRNPETQAQLIDVSKLFL GDNKLCGFTSSDRSILGIGTLAQDRTFMDTIKTYPINVETVTLRTYSISAGRLPAAQTGS VTVKLNTSIVMLPKEPMQPRFADDRVGFFQNSLTEFSDDQQTTDRGAIIQRYRLEPKDPE RYRRGQLSEPKNPIIYYIDPATPKKWIPYLKAGVEDWNTAFEAAGFKNAIIAKEWPNDPN MSLDDARYSVIRYLPSETENAYGPRIVDPRSGEIMESHICWYHNVMNLLKKWYMVQCGPL DKRARTMTFDDKLMGSLIQFVSSHEVGHSIGLRHNMAASSATPVEKLRDKAWVEANGHTV SIMDYARFNYVAQPEDRISEKGLFPRIGVYDKWAIQWGYRYRPEFKDPYKEKDVLRAEVT KKLRGDHRLWFVGDEGKGFDPRSQSEDLGDNNMKANEYGIKNLKRVMENILTWTAQPDGE YTDLTTIYNSVRAQHLKYTLQVQKNIGGRYTNNLPDLKVHDYLPRSLQKEAVEWLGRNLF VAPLWLYPDEVVSRTGVKPVDEIRDRQSSVVALLLAPGMLYNIYSTSLCSSESYALDDYL NDVFTAIWKPLNDPNELENNFRRQLHRTYLGFVEGMIKPSSKDGANANLAISRSDIQLFV EQHLDKIEESVKEQLAASKEGDLNYRHYAALLRNVQKIKEGYYGKDADTRSSDK >gi|225935346|gb|ACGA01000046.1| GENE 153 235956 - 236702 784 248 aa, chain + ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 1 248 3 250 250 290 58.0 2e-78 MKKIVLLRHGESAWNKENRFTGWTDVDLTEKGVAEAEKAGETLKEYGFNFDKAYTSYLKR AVKTLNCVLDKMNLDWIPVEKSWRLNEKHYGELQGLNKAETAEKYGEEQVLVWRRSYDIA PHPLSESDLRNPRFDYRYHEVPDAELPRTESLKDTIERIMPYWESDIFPSLKTAHTLLVV AHGNSLRGIIKHLKNISDEDIVKLNLPTAVPYVFEFDENLNVANDYFLGNPEEIKKLMEA VANQGKKK >gi|225935346|gb|ACGA01000046.1| GENE 154 236730 - 237782 1093 350 aa, chain + ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 477 64.0 1e-134 
MSKVVELLRDKACYYLDHTCETIDKSLIHVPSPDTIDKIWIDSDRSIRVLNSLQTLLGHG RLANTGYVSILPVDQDIEHTAGASFAPNPIYFDPENIVKLAIEGGCNGVASTFGILGSVA RKYAHKIPFIVKLNHNELLSYPNSFDQVLFGTVKEAWNMGAVAVGATIYFGSEQSRRQLV EIAEAFEYAHELGMATILWCYLRNNDFKKGAVDYHAAADLTGQADRLGVTIKADIVKQKL PTNNGGFKAIGFGKIDERMYTELASEHPIDLCRYQVANGYMGRVGLINSGGESHGSSDLR DAVITAVVNKRAGGMGLISGRKAFQKPMNEGVELLNTIQDVYLDSSITIA >gi|225935346|gb|ACGA01000046.1| GENE 155 237850 - 238506 888 218 aa, chain + ## HITS:1 COG:TM0295 KEGG:ns NR:ns ## COG: TM0295 COG0176 # Protein_GI_number: 15643064 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Thermotoga maritima # 1 215 1 211 218 244 51.0 7e-65 MKFFIDTANLEQIQEAYDLGVLDGVTTNPSLMAKEGIKGTQNQREHYIKICKIVNGDVSA EVIATDYEGMIREGEELAALNPHIVVKVPCIADGIKAIKHFTEKGIRTNCTLVFSVGQAL LAAKAGATYVSPFVGRLDDICEDGVGLVGDIVRMYRTYDYKTQVLAASIRNTKHIIECVE VGADVATCPLSAIKGLLNHPLTDSGLKKFLEDYKKVNG >gi|225935346|gb|ACGA01000046.1| GENE 156 238614 - 242561 2080 1315 aa, chain - ## HITS:1 COG:slr1393_3 KEGG:ns NR:ns ## COG: slr1393_3 COG0642 # Protein_GI_number: 16329802 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 793 1020 45 290 301 128 33.0 6e-29 MRRKTFLLLCIAFITLFIGVSPLQAGHYYYKQISLKDGLPSTVRCILTDKQGFVWIGTRS GLGRYDGHELKKYVHQSNNPHSIPHNLIYQLIEDVQNNIWILTDKGVARYQRQSDDFLIP TDETGNNIIAHSVCLTDDGVLFGSKNKIFFYNYQDATFRLLQTFDLEPNYNVTLLSLWDK RTLLCCSRWQGLLLLDLNTGKCRRPPFDCGKEIMNVLIDSQKRIWVAPYNGGLNCLTHDG KLLATYTTHNSSLSNNVVLSLAEREGEIWIGTDGGGINILDPETGHFSLLEHVPGSDNYS LPANSILCLYNDYNNNIWAGSIRNGLISIREVSMKTYTDVPPGNDRGLSNNTVLSLFQES QDQIWVGTDGGGINNFNPLTEKFTHYPSTWEDKVASICQFTPGKLLISIFSQGVFIFNPA TGEKQPFTIVDAETTALLCNRGKAVNLLRNTHHSILFLGEHIYIYDLNTRTFSTAIEQEE QDIIGALLPIGNYRNTTYLNDTKHIYELDNHTNRLKALYRCWKDTTINSVSRDEEGNFWI GNNYGLVHYNPATKIQTPIPTSLFSEITLIVCDQQGKVWIGADSMLFAWLIKEKKFVLFG ESDGVIQNEYLSKPRLLSMQGDVYMGGVKGLLHINCKIPLTTSELPQLQLSDVIINGESV NNELTNRPTGISAPWNSNITIRIMSKEKDIFRQKVYRYQIEGLNDQQIESYNPELAIRSL 
PPGSYKIMASCTAKDGSWIPSQEVLELTILPPWYRTWWFILSCTVLTAAAIIETFRRTLK RKEEKLKWAMKEHEQQVYEEKVRFLINISHELRTPLTLIHAPLSRILKSLSAEDTQYLPI KAIYRQSQRMKNLINMVLDVRKMEVGESKLQIQPYALNQWIEHVSQDFVSEGEAKNVRIR YQLDPQVNTVSFDKDKCEIILSNLLINALKHSPQDAEITITSELLSEKNSVRISIIDRGN GLKQVNTQKLFTRFYQGTGEQSGTGIGLSYSRILVELHGGSIGARDNQEAGATFFFELPL RQQSEEIVCQPKAYLNELMNDDSKEQLPEENTFDTSPYSILVVDDNPDLTDFLKKSLGEY FKRVVIASDGVEALQLTRNHAPDIIVSDVMMPRMNGYELCKNIKEDITISHIPIILLTAR DDKQSQLSGYKNGADAYLTKPFEIEMLMEIIRNRLKNRESIKKRYLNTGLVPAPEESTFS QADETFLLKLNKIIQEHLDSSNLDVTFICKEIGMSRASLYNKLKALTDMGANDYINKFRM EKAITLITSTDMSFTEIAEKVGFTTSRYFSTAFKQYTGETPTQYKEKRKQERKKE >gi|225935346|gb|ACGA01000046.1| GENE 157 243070 - 244500 1235 476 aa, chain + ## HITS:1 COG:YPO3586 KEGG:ns NR:ns ## COG: YPO3586 COG1660 # Protein_GI_number: 16123728 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Yersinia pestis # 337 464 156 277 284 89 38.0 2e-17 MITEELQKLYQSYTGVPAENITELPSSGSNRRYFRLTGIQPLIGVYGASIDENEAFLYMA GHFRKCGLPVPEVRIVSEDKTYYLQEDLGDALLFHAIEKGRATSVFSEEEKELLRKTIRL LPAIQFAGADGFDFSRCYPQPEFNQRSILWDLNYFKYCFLKATGMEFQEDKLEDDFQKMS DVLLRSSSATFMYRDFQSRNVMIKDGEPWFIDFQGGRKGPFYYDIASFLWQAKAKYPDSL RKELLQEYMEALRKYQPIDESYFYSQLRHFVLFRTLQVLGAYGFRGYFEKKPHFIQSVPY AIENLRELLKEEYPEYPYLCNVLRELTGLKQFTDDLKKRQLTVKVMSFAYKKGIPDDSTG NGGGYVFDCRAVNNPGKYERYKPFTGLDEPVITFLEEDGEILRFLDHVYALVDASVQRYM ERGFSNLSVCFGCTGGQHRSVYSAQHLAEHLNQKFGVKVELVHREQNIEHTFEATI >gi|225935346|gb|ACGA01000046.1| GENE 158 244537 - 245286 608 249 aa, chain + ## HITS:1 COG:CAC2981_1 KEGG:ns NR:ns ## COG: CAC2981_1 COG1208 # Protein_GI_number: 15896233 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Clostridium acetobutylicum # 1 203 1 189 382 106 32.0 3e-23 
MKAMIFAAGLGSRLKPLTDTMPKALVPIAGHPMLEHVILKLKAAGFTEIVINIHHFGEQI LDFLEANENFGLIIHISDERDLLLDTGGGIKKARSFFENSDEPFLIHNVDILSDVNLKEL YDYHLRSGAVATLLASQRKTSRYLLFDTDKRLCGWINKDTEQVKPEGFQYDSSLYQEYAF SGIHVLSPAIFQWMTSPSWDGKFSIMDFYLATCRQVNYGGYLTEKLHLIDIGKPETLAKA EGFLYQNAK >gi|225935346|gb|ACGA01000046.1| GENE 159 245395 - 247377 1387 660 aa, chain + ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 108 508 43 462 478 176 31.0 2e-43 MEQTLVNGIADNLLHNIFNDLSVGLELYDKDGLMIDVNYSRLRSMGIKDKKDILGYNLFN YTSFSDEIKEQIRKGETVRFMAKYDFDDLCRLFPTNLSGIKFFEITVSFVHNDKSEITNY MVITQDITERVLWQNKYDNLYEEVVRSKKELLESEQRMIHLIRQNELVLNNINSGLAYIA NDYIVQWENISLCSKSLSYEAYKKGEPCYLTAHNRTTPCENCVMQRARKSGQVESILFNL DNKHVIEVFATPIFNEQGDVDGVVIRVDDVTERQHMIGELEKARSRAEQSDKLKSAFLAN MSHEIRTPLNAIVGFSDLLMVTEDQEEKEEFIQIINANNELLLKLINDILDLSKIEAGSV ELKYENFDLAVYFNELAASMHRRVVNPQVRLVPVNPYETCTVRLDKNRLAQILTNFVTNA IKYTSKGTIEMGYEKIDENIRLYVRDTGIGIPEDKKDKVFHRFEKLDEFAQGTGLGLSIC KAIVEACRGEIGFESEFDKGSLFWAVLPCQFESVNSEPTSSRRNNEKDANKENILDSEET KKVPKRVLVVEDIQSNFFLVSSILKNKCQLLHAPNGLEAVEIVRTQPVDLVLMDMKMPVM DGRTATSEIRKFNAEIPIIALTAHAFDADRVAALKAGCDDYLVKPINGAKLMQTLKEYGC >gi|225935346|gb|ACGA01000046.1| GENE 160 248041 - 248190 60 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTATKKIKTREELRVEVVKKITANKRNAQSFLKKAGIVSKSGELTKIYR >gi|225935346|gb|ACGA01000046.1| GENE 161 248187 - 248609 319 140 aa, chain + ## HITS:1 COG:no KEGG:BVU_2957 NR:ns ## KEGG: BVU_2957 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 22 140 217 320 320 84 46.0 2e-15 MKETLYSRRSNLVVGFHGCDQSIKEQVFEHLARLAAVADLSEENRIAYDKALDRYRVNQI VEEDERRKNEEMRRKAAEEGMKEGLKEGIREGIKEGMEKGMEKGEQKKQIEIARKMREDG ISIDTIIKYTGLQSSDIENL >gi|225935346|gb|ACGA01000046.1| GENE 162 248895 - 250535 1264 546 aa, chain + ## HITS:1 COG:no KEGG:BT_1632 NR:ns ## KEGG: BT_1632 # Name: not_defined # Def: chitinase # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 545 1 552 554 744 67.0 0 MRNKFLFLCLFLFGTLTACKDTKWVDVPTGTPPEGAITGKPGEPEEPDDPTVKSFINCSY IRGDFFEINRISAASMGACTDLIYLAARPYANGEVAFELPLNDATMTNVSHTGTFQGRNG VVKFEGTSFMNGGDGLLHSADGAFNKFTFGTYIYISEWVDGAFLFKKMDGNSTIIAFQMG ATANSLKLSIGSSVATVTTPDLATGWHYIGLTYEQGTAKLYVDTNNTAISFVGALPNEVP NTRTDFLIGDKFKGYLDETFVSSLVVGTSGRNPITFDNWNNSKILAYWKYDDATKPGKDS HSWLGRLNNIRTALQGQQGDRRLRLGIAGGEWKKMMADATARTRFASEVKKIIEQYDFDG VDFDFEWPTNTNEFNNYSATVIQMRSTLGKYVYFTASLHPVSFKISPEAIDALDFISYQC YGPAVMRFPYEQFVKDGEMAITYGIPKNKLVMGVPFYGSTGSLIAAYYDFVNDGLATTIE DTYTYKGNTYTFNSVETIRKKARYVCEEGFAGIMSWDLATDIDVTNNKSLLKAIKEEFEY YANPTE >gi|225935346|gb|ACGA01000046.1| GENE 163 250567 - 253887 3359 1106 aa, chain + ## HITS:1 COG:no KEGG:BT_3174 NR:ns ## KEGG: BT_3174 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 1106 21 1100 1100 1762 81.0 0 MEKKKFMRVQKFLILFLILLLSFPLSALAQQQMIKGQVVDDKGETIIGATVMVKGSKDGT LTDIDGNFSVKGKVGNTLVISYVGFTPLEIKVTKAEGNRFTLKEDTKVLDEVVVVGMDKQ KRSTITAAVATVGSDAIVNRPVTDLTSALQGNVAGLNFSSDAVANGVGGETGAEIKFNIR GIGSINGGEPYVLVDGVEQSLQNVNPADIETISVLKDASAAAVYGARAAYGVVIVTTKSG KKEKTRISYSGTVGFSSPINMPQMMNSIEFANYLNERDDNDGVKHSIPDALIEKMQGFME NPYSEKFPGIGPNADGTGWAGSKDAVYANTDWYDYYFKKASIRHSHNLSVTGGSDKFNYY VGLGYIYQEGLLDQVDDNLSKYNVNSKFQIRANKWLKFNFNNNLSLNILKRPMANQTIFY GTIGSSFPNSPTHLPVKSEYNDNSEYRYLKQSHYVQNRISDAMSFATTITPLEGWDIVGE MKVRFDVENNDFKRGYPTAEKPDGTLDISKGTKQGYQYPGMNWKNSNWGSYTRGNTFNYY LSPNVSSSYTHAWGDHFFKAMAGFQMELQENSNGYTYKDGLLTSDIFSFVNANGQVLAGE DRTHWATMGMYAKLNWNYKEIYFLEFSGRYDGSSRFAPGNRWGLFPSFSAGYDISRTDYF KALNLPVSQLKVRVSYGRLGNQNGAGLYDYLGIQSLTSDSPNAWLLPGVQATPQKGTLAT TPKMISSYITWEKVDNANLGIDLMLLDNRLSITADIYQRTTRDMIGPAEAIPNLGGISTD DRAKVNNATLRNRGWELSVNWQDQLKCGFSYGIGFNVFNYKAVVTKYNNPEGLIYNNHTG LATNKGYYEGMDIGEIWGYEANDLFLSNREVDAYLRQVDLSFFKSGDKWQRGDLKYIDSD GDGKVDPGKGTLADHGDLKIIGNATPKYSFGINLNVGYKGFEISTLLQGVAKRDFPMAAS TYLFGGKEWFKEHLDYFSPENPNGYLPRLTTDAQTTNANTGYNTTRYMLNAAYMRMKNLT 
VSYTFKPQLLKNIGLSNLKVYFTCDNLFTISKLPNQFDPETLNQVNAWAGGSNAAAPGLT SAQNQNGNGKVYPMNRNFVFGLDFTF >gi|225935346|gb|ACGA01000046.1| GENE 164 253906 - 255750 1520 614 aa, chain + ## HITS:1 COG:no KEGG:BT_3175 NR:ns ## KEGG: BT_3175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 611 1 616 621 914 70.0 0 MKNKIIILTVFAGLLLSACSDYLDRKPYTQPNNEDFLKTRENVESYINNLYTSLPAPAQY GIGVRGEEVNSDNILSEKYDMRLNGENNQFSGSDDWNKGYKNLRMVNYFFHYYAVSEVDE TDEVLSLRGEAYFFRAYWHFYLLTRFGNIPIMDDFWDGNATLGGLQIPATKRADAARFIL NDLKAAIGLIPEARANLHSRSKYSGLRVNRETAMILAMRVALYEGSWEKYHKGTDFATED NSEEFFKEVMNWGDQYLFPVGLTLHMTATNPKATNIDDAFSELFNSNDLSNISEVTFWKK YSIADGVFNTVTQQLSGGVTDNVKPSGLSKSLVDNFLNVDGTPVDPTDEKYKDFNEVFKD RDGRLLAMVMHTGCKFKSNSLMNVRAYDETGTEEEQKEKNKDISSPRLNGDGIYKNVTGF HTRLGIDTTYVTGNCETAHVMFRYAEGLLCYAEAAAELGLYNDGVAEKTLKPLRQRAGVA YVTPAADPHFPFQGLPPAVQEVRRERRSELSLQGFRLDDLMRWRVAGTLKGVEGRGRGAY LGKDGVLYLSFSPTLRKEGLNHVLTDNEGWMDPLKEYLPEGYKFNEDRDYLLPIPPDEIQ MDHELNQNPGWPTK >gi|225935346|gb|ACGA01000046.1| GENE 165 255796 - 257640 1181 614 aa, chain + ## HITS:1 COG:no KEGG:BT_1629 NR:ns ## KEGG: BT_1629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 607 1 577 593 613 57.0 1e-174 MKNILYKIFPICLLTMGLTGCEEETKYRPLPEIVPLTMSINDNAFAMGEHLKVNINVEPD ADGKEVVANEDFDIYFTAKAGTEDASNVFEQFNGIVTFPKGEKQIQVDFPIKASGLTGST AINFTAFARGYKMEGSSQVIKVSDYYRIMASLENNADNVVMEGGKFVLVAKIEKPSSVPL NMKITPKEGEGDRYENLPSTLTIPAGRTSAKSDAVTIKQDFEMTGDLQLVLNLESDSPAN PMINSTLTITMTDLESMGDPNLFDITKVYEFPDRPFMSDKNKTAIESWFTGDKIAMNKNS AHPTSALKDKDWVLCNAIEFHYINNSFSGGNNTPNAFGHRISWAFSDINDAPSQKIQAVN NAKCTNITNEGILNMWVDKNVQGTGAMTATKDYGVAALQCSKFGGIFAPQHTRFFPGMRI EVKARLRGIRTGFVPTIGLKNQKNSLPNTKDEIDILKNTQGSVITQAVTVDNEIGSKSVA IPQANEWNIYWVELVDENIINLGINGATNLTVNRTQSPDRWPFDKAGTPATANPDPATAA GGSAGLYFFMRLAESSERAAGKAPEGWDNVLKSIANCETDDNTPRMEIDWIRIYTNKNYV QTDAEKVWANQLFY >gi|225935346|gb|ACGA01000046.1| GENE 166 257843 - 259426 1264 527 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 
COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 50 518 1 456 467 132 24.0 2e-30 MEKLQPNLFFPLAGVAAVASFASCSNKQKPVEQKPLNIVYIMTDDHTAQMMSCYDTRYME TPNLDRIAADGVRFTQSFVANSLSGPSRACMITGKHSCANKFYDNTTCVFDSSQQTFPKL LQKIGYQTALVGKWHLESLPSGFDYWQIVPGQGDYYNPDFITQNNDTIQKHGYITNLITD DAIDWIENKRNPEKPFCLLIHHKAIHRNWLADTCNLALYEDKTFPLPDNFFDNYEGRPAA AAQEMSIMKDMDMIYDLKMLRPDKNTRLKSLYEKYIGRMDEAQRAAWDKFYTPIIDDFYK QNLQGKELANWKFQRYMRDYMKTVKSLDDNVGRVLDYLKEKGLLDNTLVVYTSDQGFYMG EHGWFDKRFMYEESMRTPLIMRLPKGFDRRGDITEMVQNIDYAPTFLELAGAEIPSDIQG VSLVPLLKGEHPKDWRKALYYHFYEYPAEHMVKRHYGVRTDRYKLIHFYNDINWWELYDL QADPSEMHNLYGQPEYEPVVKELKEEMLKLQEQYNDPVRFSPERDKE >gi|225935346|gb|ACGA01000046.1| GENE 167 259445 - 261769 1531 774 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 21 613 26 593 757 368 36.0 1e-101 MKKYILIILILLSPLTAFLQADNLTSPLISIVPRPTQIVPGRGNFTFSAQTVFAVENQEQ AVIARNFIDLFTHAAGITPALNVGREEGQVRFVTDSSLKSEAYLLEITPQQILIKASDTK GFFYALQSVRQLLPAAIESEQPVRNVDWRVPAMTIQDEPRFGFRGLLLDPVRCFIPKKNV LRIIDCMAMLKINKLHFHLTDDNGWRIEIKKYPRLTEVGAWRVDHTDVPFHSRRNPKRGE PTPIGGFYTQEEIREIVAYAADRQIEVIPEIDVPAHSNSALAAYPQLACPVVKDFVGVLP GLGGRNSEIIYCAGNDSVFTFLQDVFDEILELFPSRYIHVGGDEARKTNWEKCPLCQKRM KKQRLANEEDLQGYFMKRISDYLRKKGREVIGWDELTNSSFLPEESIILGWQGMGTAALK AAEKGHRFIMTPARVLYLIRYQGPQWFEPVTYFGNNTLKDVFDYEPVQKDWKPEYESLLM GVQACMWTEFCNKPEDVDYLLFPRLAALAEVAWTPAGTKDWSGFLKRMDAYNAHIAEKGI VYARSMYNIQQTVTSVDGHLEVNLECIRPDVEIHYTLNGSNPAMSSHRYDGPIRVTKTQM VKAATFMNGKQMGETLELQLTWNKATAKPLLGNKKNEMLLVNGLRGSLKYTDFEWCNWNQ NDSISFTIDLQGREILNKFTIGCITNYGMGAHKPKMIRVEVSDDNRTYHTMGELNFSPKE IYLEGTFRNDYSIDMGGVSARYVRVTAEGAGICPDEHVRPGQEARVYFDEVIIE >gi|225935346|gb|ACGA01000046.1| GENE 168 261937 - 265023 2840 1028 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 
COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 33 1024 7 981 1087 662 39.0 0 MLMNISLKKRTLLVLLSGLTATFTLAQQQPLPEWQSQYAVGLNKLAPHTYVWPYANASDI EKPGGYEQSPFYMSLNGKWKFHWVKNPDNRPKDFYQPSYYAGGWADINVPGNWERQGYGT AIYVNETYEFDDKMFNFKKNPPLVPYAENEVGSYRRTFKVPADWKGRRVVLCCEGVISFY YVWVNGKLLGYNQGSKTAAEWDITDVLNEGENVVALEVYRWSSGAYLECQDMWRLSGIER DVYLYSTPKQYIADYKLNASLEKEKYKDGIFGLEVTVEGHSAISGATSIAYTLKDANGKA VLQDAIRIKSRGLSNFITFDEKNIPDVKAWSAEHPHLYTLILELKDEQGKVTELTGCEVG FRTSEIKDGRFCINGVPVLVKGTNRHEHSQLGRTVSKELMELDIKLMKQHNINMVRNSHY PTHPYWYQLCDRYGLYMIDEANIESHGMGYGPASLAKDSTWLTAHMDRTHRMYERSKNHP AIVIWSLGNEAGNGINFERTYDWLKSVEKTRPVQYERAELNYNTDIYCRMYRSVDDIKAY VKEKDIYRPFILCEYLHAMGNSCGGLKEYWDVFENNPMAQGGCVWDWVDQSFREIDKNGK WYWTYGGDYGPEGIPSFGNFCCNGLVGANREPHPHLLEVKKVYQNIKVTLANQKNLTIRV KNWYDFSNLNEYVLNWNVTADNGKILAEGTKTVDCAPHATVDVTLGAVKLPNTVREAYLN ISWTRREASSMIDKDWEVAYDQFVFAGNKNYTGYRPQKAGETTFTVDKQTGALTSLNLNG KELLATPLTLSLYRPATDNDNRDKNGARLWRDAGLDCLTQKVVSLKESKTSTTARVEILN AKTQRVGIADFVYSLDRNGALKVHTTFQPDTTIVKSMARLGLTFRVSNAYDQVSYLGRGD NETYIDRNQSGKIGVYQTTPERMFHYYVAPQSTGNRTDVRWVKLADTSGEGIFVESDRAF QFSIIPFSDVLLEKARHINELERDGLLTVHLDAEQAGVGTATCGPGVLPQYLVPLKKQSF EFTLYPVK >gi|225935346|gb|ACGA01000046.1| GENE 169 265168 - 266985 1402 605 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 50 490 10 461 559 317 39.0 4e-86 MNKLLTSLLLSVTLMSGAQAQKKENYYVKHVEFPQSATIEQKVDMAARLVPTPQQYAWQQ MELTAFLHFGINTFTGREWGDGKEDPALFNPSELDAEQWVRTLKEAGFKMVLLTAKHHDG FCLWPTATTKHSVASSPWKNGQGDVVKELRAACDKYDMKFGVYLSPWDRNAECYGDSPRY NDFFIRQLTELLTNYGEVHEVWFDGANGEGPNGKKQVYDWDAFYQTIQRLQPKAVMAIMG DDVRWVGNEKGVGRETEWNATVLTPGIYARSQENNKRLGVFSKAEDLGSRKILEKATELF WYPSEVDVSIRPGWFYHAEEDGKVKSLKHLSDIYFQSVGYNSVLLLNIPPDRRGLIHEAD 
IKRLKEFADYRQQTFADNRVKNGRKYWSTTSGGEAVYALKSKSEINLVMLQEDITKGQRV EAFTVEALTDNGWKEVGKGTTIGYKRMLRFPAVNANKLRVRIDECRLTAYVSQVAAYYAE PLQEETTKEDWNNLPRSGWKQVAASPLTIDLGKTVTLSSFTYAPSKAEAKPTMAFRYQFF VSMDGKSWKEVPASGEFSNIMHNPLPQTVAFSQKVQARFIKLEATTPDATVAKVNMNEIG VMVIP >gi|225935346|gb|ACGA01000046.1| GENE 170 267127 - 268302 957 391 aa, chain + ## HITS:1 COG:BMEI0944 KEGG:ns NR:ns ## COG: BMEI0944 COG0668 # Protein_GI_number: 17987227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Brucella melitensis # 29 388 12 391 408 243 34.0 3e-64 MLVVDLGKWMNKTLINWGIDPAVADRFDETIMALLMIVVAIVLNYLCQAILIGGMKQYTR RKPHQWNTLLMKRRVFHNLIHTIPAFLVYSLLPMAFIRGKELLLISQKACVIYIIFSLLL AINGILLMIMDIYDGRESMKDRPMKGFIQVLQVLLFFVGGIVIIAIIVNKSPATLFAGLG ASAAILMLVFKDSILGFVAGIQLSANDMVRPGDWVTLPSGDANGIVQEITLNTVKIQNFD NTISTIPPYTLVSSPFQNWRGMTQSGGRRVMKSITLDLTTLQFCTPEMLDRYRKEIPLMA DYQPEEGVVPTNSQVYRVYIERYLCSLPVVNQELDLIISQKEATMYGVPIQVYFFSRNKV WKEYERIQSDIFDHLLAMVPKFDLKVYQYSD >gi|225935346|gb|ACGA01000046.1| GENE 171 268532 - 269635 675 367 aa, chain - ## HITS:1 COG:no KEGG:BT_1616 NR:ns ## KEGG: BT_1616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 367 10 373 373 550 71.0 1e-155 MILSGVSFLAPSNIIAQEVHGRQASSQQERISFRVVSWNIENLFDTHHDSLKNDHEYLPD AIRHWNYSRYKKKLADVARVITAIGEWNPPALVGLCEVENDTVLRDLTRRSPLKELSYRY VMTNSPDLRGIDVALLYQRDLFKLLSSRSISIPPFKQHRPTRDLLHVSGLLLAGDTLDVF VCHFPSRSGGAKESEPYRLYVAHILRTEVDSIMNIRSHPQAIIMGDFNDYPTNQSILKIL KAEAPPVKTNDLASTTDSANASTVTPSSLKLYHLLARKAKSENFGSYKFRGEWGLLDHLI VSGTLLNQSNHFFTSEEKANVCLLPFLLKDDEKYGDKEPFRTYKGMKYQGGISDHLPIYT DFELIVY >gi|225935346|gb|ACGA01000046.1| GENE 172 269712 - 271172 1314 486 aa, chain - ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 51 533 534 465 47.0 1e-131 
MEKKDLKPAGVFKYFEEICQVPRPSKKEEKMIAYLKAFGAKHNLETKVDEAGNVLIKKPA TPGKENLQTVVLQSHIDMVCEKNNDVQHDFLTDPIETEIDGEWLKAKGTTLGADNGIGVA TELAILADDSIEHGPLECLFTVDEETGLTGAFALQEGFMSGDILLNLDSEDEGEIFIGCA GGIDSVAEFTYKEVEVPAGYFFFKVEVKGLKGGHSGGDIHLGRGNANKILNRFLSRMAGR HDLYLCEINGGNLRNAIPREAYAICAVPEDAKHDVRTELNIFTSEMEDELSVVEPDLKLV LESEAPRKIAIDQDTTTRLLKALYAAPHGVYAMSQDIPGLVETSTNLASVKMKPNHIIRI ETSQRSSILSARNDMANTVRAVFQLAGANVTFGEGYPGWKPNPHSAILEVAVESYKRLFG VDAKVKAIHAGLECGLFLDKYPTLDMISFGPTLTGVHSPDERMLIPTVEKFWKHLLDILA HVPAKK >gi|225935346|gb|ACGA01000046.1| GENE 173 271306 - 271539 240 77 aa, chain + ## HITS:1 COG:no KEGG:BT_1614 NR:ns ## KEGG: BT_1614 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 135 93.0 4e-31 MVETILITLLIVAISLVLLGVKVFFTKGGKFPNGHVSGNKALRQKGIGCAQSQDREAQKK PRFSINELEKALNDSMN >gi|225935346|gb|ACGA01000046.1| GENE 174 271576 - 272169 540 197 aa, chain + ## HITS:1 COG:no KEGG:BT_1613 NR:ns ## KEGG: BT_1613 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 197 1 193 193 286 88.0 4e-76 MNYLMNGLAALAFIVLFSQCAGKTDNQTTNAPAQANAELSGMKIAYVEIDTLLAKYNFCI DLNEAMVKKSENVRMTLNQKATSLNKEKQDFQKKVENNAFLSQDRAQQEYNRLVKLEQDL QELSNKLQNGLMEENNKNSLQFRDSINAFLKEYNKTHGYSLIFSNTGFDNLLYADSTFNI TKEIVDGLNARYSPVKK >gi|225935346|gb|ACGA01000046.1| GENE 175 272270 - 272611 359 113 aa, chain - ## HITS:1 COG:no KEGG:BT_1612 NR:ns ## KEGG: BT_1612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 113 181 90.0 5e-45 MKQLVPALFTAGAIMALTGAAVFITGWIYAPYIYTVGAGCIALAQVNTPVKGKSKTLKRL RVQQIFGALALILTGAFMFTTRGNEWIACLTVAAILELYTAFRIPQEEEKERS >gi|225935346|gb|ACGA01000046.1| GENE 176 272615 - 272911 320 98 aa, chain - ## HITS:1 COG:no KEGG:BF3072 NR:ns ## KEGG: BF3072 # Name: not_defined # Def: putative septum formation initiator-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 98 1 98 100 145 79.0 5e-34 
MGKLISIWSFICRRKYLITVVAFAVIIGFLDENSLVRRFGYEREISQLKEEIEKYRADYE ENTKRLNEISTNPDAIEQIAREKYLMKKPNEDIYVFED >gi|225935346|gb|ACGA01000046.1| GENE 177 273068 - 274933 1434 621 aa, chain + ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 3 365 2 363 563 285 40.0 3e-76 MENYIVSARKYRPSTFESVVGQRALTTTLKNAIATQKLAHAYLFCGPRGVGKTTCARIFA KTINCMTPTADGEACNQCESCVAFNEQRSYNIHELDAASNNSVDDIRQLVEQVRIPPQIG KYKVYIIDEVHMLSASAFNAFLKTLEEPPRHAIFILATTEKHKILPTILSRCQIYDFNRI SVEDTVNHLSYVASKEGITAEPEALNVIAMKADGGMRDALSIFDQVVSFTGGNITYKSVI DNLNVLDYEYYFRLTDCFLENKVSDALLLFNDILNKGFDGSHFITGLSSHFRDLLVGKDP VTLPLLEVGASIRQRYQEQAQKCPLPFLYRAMKLCNECDLNYRISKNKRLLVELTLIQVA QLTTEGDDVSGGRGPKKTIKPVFTQPAAAQQPQVASGTQVQQAPVHSSPSSVTTQAANGT TAQHPQASAAVQPGAPASPGAASSAPSQGAGVAQTAKEERKIPVMKMSSLGVSIKNPQRD QVSQNATTTYVPKVQQPEEDFMFNDRDLNYYWQEYAGQLPKEQDALAKRMQMLRPALLNN STTFEVVVDNEFAAKDFTALIPELQDYLRGRLKNSKVMMTVRVSEATETVRPVGRVEKFQ MMAQKNQALMQLKDEFGLELY >gi|225935346|gb|ACGA01000046.1| GENE 178 274925 - 275302 620 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160886338|ref|ZP_02067341.1| hypothetical protein BACOVA_04348 [Bacteroides ovatus ATCC 8483] # 1 125 1 125 125 243 97 1e-62 IKIKNIDHIVIPVSDIDKSLHFYTEVLGMEADTSNQRFAVKFGNQKINLHVGKAQFLPTA KHPAFGSADICLLTEGNIEEIKVEVESKGIEIEVGIVQRQGAQGAIRSIYFRDPDGNLIE VSTLI >gi|225935346|gb|ACGA01000046.1| GENE 179 275304 - 275537 252 77 aa, chain - ## HITS:1 COG:MA4346 KEGG:ns NR:ns ## COG: MA4346 COG1983 # Protein_GI_number: 20093134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 4 63 112 171 183 62 44.0 3e-10 MENEKKLTRSSNRMIAGVCAGIAEYFGWDATLLRIVYILATFFTAFAGVIIYIILWIVMP GRKPSDGYEDRMNQRLH >gi|225935346|gb|ACGA01000046.1| GENE 180 276186 - 276791 395 201 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|260173530|ref|ZP_05759942.1| ## NR: gi|260173530|ref|ZP_05759942.1| hypothetical protein BacD2_16780 [Bacteroides sp. D2] # 1 201 1 201 201 371 100.0 1e-101 MDREYSQVNNLLAEATNLIRREKEATGNVEPAEGYKRRQIEELKLFATQHNLWININSLP LSYLSKGGENEVFTGHEDVVFKLNNFEYAGEDLENFFIRIEAHNLFFSNVTYQMIGFAYN SQYEFCAVLVQPYVRAKREATEEEIAEHMQALGFEMVYEDEYHNAKYEVFDAVPNNVLYG IDDKLYFIDTQIRFRPIEITL >gi|225935346|gb|ACGA01000046.1| GENE 181 276980 - 277366 220 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298479816|ref|ZP_06998016.1| ## NR: gi|298479816|ref|ZP_06998016.1| conserved hypothetical protein [Bacteroides sp. D22] # 1 128 1 128 128 217 97.0 2e-55 MNKIRKTILLIILAIGLVVTTIILYNHQGSSDTQNYIAKIVPLETKYIVDLSSVEDKGYK LSYMDYKSPELTARVIIDNARDYIMLFMEKNNYDEGDFHPKVSLFNEYSPVLIITKDNKV YLKMKKKE >gi|225935346|gb|ACGA01000046.1| GENE 182 277359 - 277793 205 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173532|ref|ZP_05759944.1| ## NR: gi|260173532|ref|ZP_05759944.1| hypothetical protein BacD2_16790 [Bacteroides sp. D2] # 1 144 1 144 144 283 100.0 2e-75 MNKGSYFNNIMKYSLSHGLTSIGLPTIPAKVKDKYLCLILDTGSTCSLIDSTVVEYFKDI VEPVGDYYISGIEGTKHKVDIVTLPFNFEGQIYKPKFCVKPLLDTFKGIEDESGIQVQGL LGTDFLLENKWVINFNKLIVSNYE >gi|225935346|gb|ACGA01000046.1| GENE 183 277768 - 278172 213 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173533|ref|ZP_05759945.1| ## NR: gi|260173533|ref|ZP_05759945.1| hypothetical protein BacD2_16795 [Bacteroides sp. D2] # 1 134 1 134 134 264 100.0 1e-69 MGFFKKLVNEGKDYTKMANAVGNVKAILDDIEQSYTTIDKETFLIAAWICRVGIIDIIER NNWTMNHKLLIPINGHYINLTFHEVYLMTIGRLSIKAEEHGDNIKEMVLDVFEKGDWFNQ IDAIVPYEQRKLFQ >gi|225935346|gb|ACGA01000046.1| GENE 184 278174 - 278539 175 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173534|ref|ZP_05759946.1| ## NR: gi|260173534|ref|ZP_05759946.1| hypothetical protein BacD2_16800 [Bacteroides sp. 
D2] # 4 121 16 133 133 209 100.0 6e-53 MLIRFASDYNKQANTVIKQGGMRGKYKTLINHILRQDSSARIIQETNTFISVGLTGISGS TIFFIHQTFGTVTIQYKVDSKVFGKHQLEWKFNEFADQEKMIEKITNDISAYNNNMIQKF M >gi|225935346|gb|ACGA01000046.1| GENE 185 278579 - 278854 236 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173535|ref|ZP_05759947.1| ## NR: gi|260173535|ref|ZP_05759947.1| hypothetical protein BacD2_16805 [Bacteroides sp. D2] # 1 91 1 91 91 158 100.0 8e-38 MELMDLFRKQSREKALREKIRQGFEDSVMEVIREGAAESPMGGLIVKAAIASFYQGMKSS ELKNICLETGVNFQDILDEECQNALHKYLEE >gi|225935346|gb|ACGA01000046.1| GENE 186 278867 - 279502 307 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173536|ref|ZP_05759948.1| ## NR: gi|260173536|ref|ZP_05759948.1| hypothetical protein BacD2_16810 [Bacteroides sp. D2] # 1 211 1 211 211 391 100.0 1e-107 MKYYYVLLLTLVIVSCNSNRQKSSSSNKETPQLSTAQDSVPEGWYNETLTPYTGKDKAEK NMISQMNTYNQALLRGDINNASLYIYPDVIKYCKKYYPELSDRAIIQTLYKDMSEMYHTL KTTYDEKGIDYNIIVSNITNRVADKEYIIITFEVVGVLTQGEKCIHDNPETNIGVSHNKG KNWTFLALTDDVPNILRMKFDEEIINKIMNY >gi|225935346|gb|ACGA01000046.1| GENE 187 280444 - 280899 287 151 aa, chain - ## HITS:1 COG:PM1152 KEGG:ns NR:ns ## COG: PM1152 COG2003 # Protein_GI_number: 15603017 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Pasteurella multocida # 48 145 122 218 224 79 41.0 2e-15 MDNIMNVSEVQLSYKSNVKSSTRYKINSSLDAYELLIKYFPDDTIGYRESFKVVLLNQSN RVLGIVPISEGGISATYVDVRLILQAALLANATQVILAHNHPSGSMKPSTLDDALTEKVK KAAELMEIHIADHVILSPEKEYYSYNDEGKL >gi|225935346|gb|ACGA01000046.1| GENE 188 280908 - 281993 450 361 aa, chain - ## HITS:1 COG:no KEGG:BF3462 NR:ns ## KEGG: BF3462 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 60 361 41 340 340 412 63.0 1e-113 METMQLQPVRANLIYPNRNKVDNPYIQTVEPIEIINTGIASTNPVGVEPISTREGNKLPF IEANTKEVTMQYLKEECITPVFSKDNEVTISHSSFIETVWEAANKVFSNERIEEPAIRVS HVIKGRIPEAIHKPVNQLLETDKTIYFERMMFCFEIPTIYEDIAGNRLNLTIGGVRAYNH ENLYSKKGAEKFKVFIGFKNLVCCNMCVSTDGYRSELKVMSTAELFNAVVRLFQEYNIAQ 
HLYYMSAYKDSYIRESQFAQFLGRCRLYQFLPIDQKKKLPQMLMTDTQIGLVAKAYYNDD NFSTLLDSREISMWNVYNLLTGANKSSYIDNFLDRSLNATQLAEGLNKALYGENEYSWFI N >gi|225935346|gb|ACGA01000046.1| GENE 189 281959 - 282846 457 295 aa, chain - ## HITS:1 COG:no KEGG:Dred_0498 NR:ns ## KEGG: Dred_0498 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 7 231 8 234 389 198 43.0 2e-49 MLNLRVSSKKQAKIKLALQGCAGSGKTYSALLLAYGLCNDWTKIAIIDSENGSADLYAHL GAYNVLSLSDNFTPETYIQAIEICEGAGMEVIIIDSISQCWDNLLEYHAGLQGNSFTNWQ KVTPRINAFMQKVLQSGSHVICTMRCKQDYVLSEKNGKMIPEKVGLKAVMRDGIDYEFTI VFDINMKHQTIASKDRTNLFIGKPDFTITPATGQIILDWCNDGVNLEMIRSKINSSKTIE ELTAIYHQFPEWYQQLTSDFMQKKAALQVQKNQPTIDYTPNYIRYGNNAVAASQS >gi|225935346|gb|ACGA01000046.1| GENE 190 282942 - 283211 193 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173540|ref|ZP_05759952.1| ## NR: gi|260173540|ref|ZP_05759952.1| hypothetical protein BacD2_16830 [Bacteroides sp. D2] # 1 89 1 89 89 180 100.0 3e-44 MERLKFLETVTVNEFKAQKGVSKIEIKQNPHTGKCFFVYGCETGAVSDKFINGEVTNPVI SQVCSPDTGDMFYMLHQRGEGGAMTLATL >gi|225935346|gb|ACGA01000046.1| GENE 191 283572 - 285209 607 545 aa, chain - ## HITS:1 COG:no KEGG:CHU_0049 NR:ns ## KEGG: CHU_0049 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 265 518 285 508 530 65 26.0 7e-09 MSNLNWNEFEIVWEGFKQELLRVIASQPSGKRIGSALPINIKHSGYFALIPKNFDLDAHL EQYPLTDYKFVNDHGHIRTVTTTGTLGAYFDIERLIYIMGLISSIPSIKKDSIDEEGYVS INAKYLRHFFKDYLSYLDYLIKTGILITDNQYIKDEKSKGYKFAPAYDSVPLVKYVYQSV REQAVQIEAVPQEVFNESLGKFTSNHLLEYPYLSHWYEQKQLTINAYLARQYAYQMMTEK FNGGYESWDSNRDKWCRSRNTFCKKYPRSQYNAAIHNIESIACYDYKAKIDSNVHRLHSV ITNIQKDYRHFLLYDEQPLVGIDIANSQPYLLCLLFNPIFWEKDSNTTLNIGTLPTNIQS LFPQEHFVEIRDYVSSLTAEALQEYKQIASEGRVYDHIMSLINSRRNTTLDKKNVKTMML IVFFSKNRYYHQQGATLKRLFDNTYPEIYSLIALAKRDNHAALACLLQSIESEIILHRCC KRIWEEGNHQVPVFTIHDSIATTTEHVEWVKGIMQEELTNAIGIPPTLKEEQWNLSQVGH PKYLY >gi|225935346|gb|ACGA01000046.1| GENE 192 285235 - 287502 825 755 aa, chain - ## HITS:1 COG:no KEGG:BVU_3124 NR:ns ## KEGG: BVU_3124 # 
Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 127 435 10 315 576 110 31.0 2e-22 MSKEDELAKEIAGLDTSFMDELYNLQISDEKTNQSEVITLGTTPVSEEHSKEVKSSIVEN QIEETVPNTYGDLHDASVEHIEILSSGKVGETKTIPQDTVITSLDNLLFGRPPEVIRKEV DASFWNLADFIDKMPHGIVDKKIPGIGATTLEINSKRNSIIVFPTKALAYGKHSKHPNTL YVGSEIKGEKEKVTNQQIEEYLAKDGYKKLLVVADSLGRVLGIIGKNYKDYFLMIDEVDV LQTDNNFRPQLENVIDYYLMFPSKNRCMVTATMKEFSNPHLKTECRFPITWQYNTHRNID LLHTDNITQAVIEKVISHPTEKIFIAYNSILQIRNIIASLDEETRKECAILCSEASIKEA GEYFAPKLGDNDTLPARINFATCCYFTGIDIEDSYHLITVSDVRRSHSMLTLDRMTQIHG RCRKINGILNETIIYNTLGYVSIMESMESYTTTLLNKAKKVLKVIDSADDIMQGDYTLID LFAMVKEAIREKAQERIAGNELINLIRQNVDGEYVPAYLNIDYIIERTELYATYFMPETL KEVLSKQVKIISYKSLNYNVSPEQNNIEKTNKDAQNKLTDSNIQDAIKYIKTLSTTGQLN DNTLYSYTRHCRSKTKIFLERFIKLYRYVDLDSLLHQLWESRTSNSVVFKNLNNTVMYWA LDEEHPFKVAIRRSLTLNKSYSASEIQEILAPIVQYHLHKVLKPRKYVALLKAVYTTSRT SRNKYTIRGENPRGFKEHTGRIATKENNLLKLFIL >gi|225935346|gb|ACGA01000046.1| GENE 193 287624 - 288898 519 424 aa, chain - ## HITS:1 COG:no KEGG:BT_2980 NR:ns ## KEGG: BT_2980 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 419 1 420 422 461 56.0 1e-128 MLTFKVEIRKNEMKVGGTFNVKIRVTYNREVKRLATHIFVRTEDLTKDFKLKNPKYIKEA DKLVRYYEELCMGLPLEASNLTLSDVLDYIQKEKEKNTPIDFIQFCKDWLTTTEVKGKRN YQTALNTFIAFLGKDKLNTNQVTKLLMMEFMEYLHKKRAKQVAELQRKGKRIPSNRMVSL YMGSIRHLFNEAKKKYNDYDRNVIRIPNSPFENLEIPKQEATRKRALSVELIKKIWELPY IINTNGKERLCPFNLAKDCFILSFCLIGMNSADLYNCTEMEGGSITYYRTKTTDRRLDKA KMKVDVLPILLPLMKKYEDYTQKKVFCFYHLYSTFKNFNRAINLGLKQIGKILKIDDLEY YAARHSWATLAVNKVGIDKYTVHAALNHIDEAMRVTDIYIERDFKIENEANKKVVEYVFS NHNE >gi|225935346|gb|ACGA01000046.1| GENE 194 289356 - 289466 86 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPNPQNNKWSGIILLIGAIMLIIIMIILTVKASSNI >gi|225935346|gb|ACGA01000046.1| GENE 195 289552 - 289854 216 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173544|ref|ZP_05759956.1| ## NR: gi|260173544|ref|ZP_05759956.1| hypothetical protein BacD2_16850 [Bacteroides sp. 
D2] # 1 100 1 100 100 188 100.0 1e-46 MRYEPHGQVDLEDQWINPWFNLIGWLVIILCVWYNQSYSIKKMTYKQSENEWEKPVKIEM SWSKRCIVIFKCLAIDVKNNPRYTLGCCALLIYLLIRCFL >gi|225935346|gb|ACGA01000046.1| GENE 196 289864 - 290232 77 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173545|ref|ZP_05759957.1| ## NR: gi|260173545|ref|ZP_05759957.1| hypothetical protein BacD2_16855 [Bacteroides sp. D2] # 1 122 1 122 122 193 100.0 3e-48 MLSLALLLITFYGIHLFLKWFLSNKRGRWATIVLFILHRAMYLMIYLFNWFIIINAIPRK TELLIALVFTFPMAWYMGKVCLHDIDMPQRIQWYPFFGRYFILFKWKIIKWEYKDKDDYT IY >gi|225935346|gb|ACGA01000046.1| GENE 197 290233 - 290637 181 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173546|ref|ZP_05759958.1| ## NR: gi|260173546|ref|ZP_05759958.1| hypothetical protein BacD2_16860 [Bacteroides sp. D2] # 1 134 1 134 134 266 100.0 3e-70 MENRCSIIYGLEHTRLPLIPVEVKDKYLSFILDTGSTCSLIDSTVVEYFKDIVKPIGEYC ISGIEGTKHKVEMITLPFTFEGQVYKPKFCVKPLLNTFIGIEDESGIQIHGLLGTDFLLE NQWVIDFKEHIIHY >gi|225935346|gb|ACGA01000046.1| GENE 198 290721 - 291395 251 224 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173547|ref|ZP_05759959.1| ## NR: gi|260173547|ref|ZP_05759959.1| hypothetical protein BacD2_16865 [Bacteroides sp. 
D2] # 1 224 1 224 224 412 100.0 1e-114 MGILNLFRKRIKDPELCRLRDLLAIVYASGEMTTKERTTILEIAAKHNISSSKFHQMLEI DPDSVQDIYPTSEEDRYQYLYELIYLMTVNRKHSTRAIDYIRFIAAKMGYSPKDVYEMTE IIDSSPFTPSTKQKITPTKWTIKFERDFNQEEVAAVEQAVVVSSEYGNSIQFTLRSGGMT YIPLDHNSDLGTGEIIDITKAKLICLEKSGESDIYRVGYQESPW >gi|225935346|gb|ACGA01000046.1| GENE 199 292368 - 292823 204 151 aa, chain - ## HITS:1 COG:aq_1610 KEGG:ns NR:ns ## COG: aq_1610 COG2003 # Protein_GI_number: 15606726 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Aquifex aeolicus # 12 151 90 231 231 80 34.0 1e-15 MDSIMNVAEVQLSYRSCVKSSTRYKINSSQDSYELLIKYFPDDTIGYRESFKVVLLNQSN RVLGIVPISEGGISATYVDVRLILQAALLANATQVILAHNHPSGSMKPSTLDDTLTERVK KAAELMEIHIADHVILSPENEYYSYYDESKL >gi|225935346|gb|ACGA01000046.1| GENE 200 292832 - 293944 515 370 aa, chain - ## HITS:1 COG:no KEGG:BF3462 NR:ns ## KEGG: BF3462 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 50 370 25 340 340 412 60.0 1e-113 MFQVPSQPSLKVTLPLSTVNTATTTKNSMITESVNTSPVSTVAVPTEPVSTELLEVETVQ SRVGHKLPFIEANTKEVTMQYLKAECVVPVFSKDNEVTISHPNFIEAIWDAANQVFPNER IEEPAIRVSHVIKGRVPEAIHKPVNQLLESDRTQYFERMMFCFEIPTIYEDISGNRLNLT IGGVRAYNHENLYSRKGAEKFKVFIGFKNLVCCNMCVSTDGYRSELKVMSTTDLFNSVMR LFQEYNIAKHLYYMSAYKDSYMTEKQFAMFLGKSRLYQFLPTDRKKKLPQMLMTDTQIGL VAKAYYNDDNFSTLLDSREISMWNVYNLLTGANKSSYIDNFLDRSLNATQLAEGLNKALY GENEYSWFIN >gi|225935346|gb|ACGA01000046.1| GENE 201 293949 - 294848 366 299 aa, chain - ## HITS:1 COG:no KEGG:Alvin_3206 NR:ns ## KEGG: Alvin_3206 # Name: not_defined # Def: AAA ATPase # Organism: A.vinosum # Pathway: not_defined # 4 284 2 305 328 201 40.0 3e-50 MLNLRVSSKKQAKIKLALQGCAGSGKTYSALLLAYGLCSDWSKVAIIDSENGSADLYAHL GAYNVLNLSENFTPETYIQAIEVCESAGMEVIIIDSISQCWDTLLEYHASLQGNSFTNWQ KVTPRINAFMQKILQSSRHIICTMRCKQDYVLSEKNGKMIPEKVGLKAVMRDGIDYEFTI VFDINIKHQAIASKDRTNLFMNKPDFTITPTTGQIILDWCNDGVNLEMIRSKINSSKTVE ELTAIYHQYPEWYQQLTSDFMQKKTSLLEQKNQPIINYTPSFIQYGSNPHTTINPSTKN >gi|225935346|gb|ACGA01000046.1| GENE 202 294942 - 295211 269 89 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|260173551|ref|ZP_05759963.1| ## NR: gi|260173551|ref|ZP_05759963.1| hypothetical protein BacD2_16885 [Bacteroides sp. D2] # 1 89 1 89 89 174 100.0 2e-42 MERLKFLETMTVNEFKAQKGVNKIEVKQNPHTGKCFFVYGCETGAVSDKFVNGEVTVPVI SQVCSPDTGDMFYMLHQKGEGGAMTVATL >gi|225935346|gb|ACGA01000046.1| GENE 203 295549 - 296217 320 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173552|ref|ZP_05759964.1| ## NR: gi|260173552|ref|ZP_05759964.1| hypothetical protein BacD2_16890 [Bacteroides sp. D2] # 1 222 1 222 222 427 100.0 1e-118 MTTIKEYTIFPVKACIYFKQRDLYLLAGLYINAHYQKGVDYITTDTTYKQLSELTGVPVY YIKESFVPKLRASEYIQVRTVQVEPEVKRNTYYLPNPNINFRYIWKELFDDKTLTPDEKG FMIGLYCMCANNEFRIDLSDIEVYRHWNIAKNTYVKYRNLLINKKVIWSSYDVPMALTWT EHMDAKVILYPHLGHTTWIDKVTSHTPDDDEIKDYLDTIKDE >gi|225935346|gb|ACGA01000046.1| GENE 204 296460 - 298568 685 702 aa, chain - ## HITS:1 COG:no KEGG:BVU_3124 NR:ns ## KEGG: BVU_3124 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 74 388 10 320 576 104 30.0 1e-20 MENTKQHNQVIYNTFADVIDWDDLTEKDHQVVEDLKENGINITPTTIIKDLSGFPIGQAT YAGKGIFASKRYLLSDYLSKTPHGLIDKKVTGIGATTLEINSKRNSIIVLPTKALASAKC NKHLNTLYVGSEIKGERERTTDNMIQEYLSRDGYKKLLVVADSLRRVLKYISPEELKNYF LMVDEIDVLQSDNSFRPQLEDVIDYYFMFPPKNRCMVTATMKEFSNPLLKTECKFSVPWL CNTARTIRLLYTNNITQVVVDELHSHPTEKILIAYNSILQIQNIISTLDEKSRSECAILC SEASTKEAGDYFAPKLGDNDTLPARINFATCCYFTGIDIDDSYHLITVSDTRRSHSMLGI DRIIQIHGRCRKVTGVLSDTIVFNITKQHSGKSMDTYPKILLNKASKVLKVIEAANDITQ DDYTLDNLFSIIAEAIQEKAQEKSIGGELINLVRKNIQGEYVPAYLNIDYLTERLELHDE LYSHPETIKKALSKLGNIISYSTLSYNTTSQQNDIEQSNKAIQEQLIDKYIQNAIVETRG LAATKQLNDNTLIMYIRHSRSKVKIFYERFLRLYKYVDLESLLNQLWEIRADNNAAFKNL NNAVMYWALDEEHPFKVAVRSSFTLNRAYSANEIQELLTPIVQYHLHKVLKPRKYISLLN VIYQTMRPKNNYIIKEENPMQLKEHGIRISCEDNNMLRYFMF >gi|225935346|gb|ACGA01000046.1| GENE 205 298670 - 300079 529 469 aa, chain - ## HITS:1 COG:no KEGG:BT_2980 NR:ns ## KEGG: BT_2980 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 458 1 421 422 454 
56.0 1e-126 MRVFCFQEVGIQVLNLAQKDYFMRYLLVYCCSQENQYNMLTIKAEILKTKQKVDNTYNLK IRLTYNREVRRLATHIFVRTEDLTKGFKLKNPKYIKEADRLVRYYQELCTSLPLESSKLT ISDILDYIQKEKEKNTPIDFLQFCNEWLETTKVKGKVNYKSALNTFKTFLGKDKLNTSQV TKLLMMEFMDYLQKKRAKHVAELQKKGKRIPSNRMVSLYMGSIRHLFYEAKKKYNDYDRN VIRIPNSPFENLEIPKQEATRKRAVSPELIKKIWEIPYIINANGRERLCPFNLAKDCFIL SFCLIGMNSADLYNCTELEEDSITYYRTKTADRRLDKAKMRVDVLPILLPLIKKYEDYTH QKVFCFYHLYSTYKNFNRAINLGLKQIGKILKVDDLEYYAARHSWATIAVNKVGIDKYTV HAALNHIDEAMKVTDIYIERDFKIENEANKKVVEYVFGNQGELSSPEVE >gi|225935346|gb|ACGA01000046.1| GENE 206 300248 - 301396 639 382 aa, chain - ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 44 297 83 338 430 135 33.0 2e-31 MRHTLFLMILLLSLSCTSRSQAKRDSIIDTLSDSLSDSIFSTDTLRLLFVGDLMQHQGQI NAARTSTGYDYSTCFTYVKEEIKKADLSIANLEVTLGGKPYKGYPAFSAPDEFLTAIHDA GFNVLVTANNHSLDRGKSGLERTIQLIDSLKVPHAGTYINADEREKKYPLLLEKNGFRIA LLNYTYGTNGIPVTPPNIVNYIDTAIIAKDIEESKAMKPDAIIACMHWGIEYQSLPDKEQ KFLADWLIEKGVNHIIGSHPHVVQPIEVRTDSLTNDKHLVVYSLGNYISNMSAHRTDGGL MVRMELVKDSTVRLNNCDYSLVWTARPIQSGKKNHQLLPVNLPIDSIPLQARNSLKIFVN DARTLFSKHNRGIKEYTFFEKK >gi|225935346|gb|ACGA01000046.1| GENE 207 301393 - 302274 956 293 aa, chain - ## HITS:1 COG:lin1397 KEGG:ns NR:ns ## COG: lin1397 COG0190 # Protein_GI_number: 16800465 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Listeria innocua # 3 289 4 279 284 273 51.0 4e-73 MTLIDGKAISEQVKQEIAAEVAEIVARGGKRPHLAAILVGHDGGSETYVAAKVKACEVCG FKSSLIRYESDVTEEELLAKVRELNEDDDVDGFIVQLPLPKHISEQKVIETIDYRKDVDG FHPINVGRMSIGLPCYVSATPNGILELLKRYNIETSGKKCVVLGRSNIVGKPMAALMMQK AYPGDATVTVCHSRSRDLVKECQEADIIIAALGQPNFVKAEMVKEGAVVIDVGTTRVPDA TKKSGFKLTGDVKFDEVAPKCSFITPVPGGVGPMTIVSLMKNTLLAGKKAIYQ >gi|225935346|gb|ACGA01000046.1| GENE 208 302481 - 303803 
1774 440 aa, chain - ## HITS:1 COG:BS_ffh KEGG:ns NR:ns ## COG: BS_ffh COG0541 # Protein_GI_number: 16078661 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus subtilis # 2 433 3 437 446 443 55.0 1e-124 MFDNLSERLERSFKILKGEGKITEINVAETLKDVRKALLDADVNYKVAKNFTDTVKEKAL GQNVLTAVKPSQLMVKIVHDELTQLMGGETAEINIDARPAVILMSGLQGSGKTTFSGKLA RMLKTKKNRKPLLVACDVYRPAAIEQLRVLAEQIEVPMYCELDSKNPVEIAQHAIQEAKA KGYDLVIVDTAGRLAVDEQMMNEIAAIKEAINPNEILFVVDSMTGQDAVNTAKEFNERLD FNGVVLTKLDGDTRGGAALSIRSVVNKPIKFVGTGEKLEAIDQFHPARMADRILGMGDIV SLVERAQEQYDEEEAKRLQKKIAKNQFDFNDFLSQIAQIKKMGNLKDLASMIPGVGKAIK DIDIDDNAFKSIEAIIYSMTPAERSNPEILNGSRRTRIAKGSGTTIQEVNRLLKQFDQTR KMMKMVTSSKMGKMMPKMKR >gi|225935346|gb|ACGA01000046.1| GENE 209 303908 - 305230 1182 440 aa, chain - ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 5 393 6 397 463 120 25.0 4e-27 MYTNKQIWSVSYPILLSLLAQNVINVTDTAFLGHVSEVALGASAMGGLFYICVFTIAFGF STGSQIVIARRNGEGRYSDVGPVMIQGIMFLFVMALLLFGFTKAFGGNIMRLLVSSESIY EGTMEFLNWRIYGFFFSFINVMFRALYIGITRTKVLTINAIVMALTNVVLDYALIFGKFG LPEMGIKGAAIASVLAEASSILFFVIYTYATVDLKKYGMNRLRTFDPALLMRILSISCFT MLQYFLSMATWFVFFVAVERLGQRELAIANIVRSIYVVLLIPVNALATTTNSLVSNAIGA GGIQHVMPLINKIARFSFFIMLGLVAVSALFPQFLLSIYTSEAALITESVPSVYVICFAM LIASVANVVFNGISGTGNTQAALLLETITIIIYGSYIIFIGMWLKAPIEICFTIEIVYYS LLLITSYIYLKKAKWQNKKI >gi|225935346|gb|ACGA01000046.1| GENE 210 305270 - 307225 1389 651 aa, chain - ## HITS:1 COG:alr1285_1 KEGG:ns NR:ns ## COG: alr1285_1 COG0642 # Protein_GI_number: 17228780 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 421 598 210 389 483 92 28.0 2e-18 MKQVLLLILLLCCTLTGKASTEIVLESDSLIKVLETLPYDTTRLNVLNQIIRIEQNNQQC IQYSDALMKEALQLGNDKYAGLAAYYHILYYYNRNNQDSVAKWLTIMEPHARKSDLWNYF FDAKRFQIDLYTFNEQYELAINESQKMQQQASQMNSNRGSMAAYQCLSNAYIGSQRWDKG IEALEAAYQLLTPTENPVVRISVLSQLVSVVKEKKDNKKLLKYLQELENTLHNHISANPS LKAGFADVYLFNELFYSYYYLNTHQPQQAYEHLVKSKEYLDENTYFMYKVLYFDTFAKYY QAIGAYQQASDYIDTTLTMLKKDFTSDYAEQLLEKARIWKQAGQSGKAIPLYEQALAIKD STATVLSNNQMAQIQSKYNIEKTELDQKRENNRIQLTYLIFIFVILILLFIFMARLFMVR KALKYAENEIRKTTENVRKTNEIKNRFLSNMSYNIRTPLNNVVGFSQLIASEPNIDKEIR QEYSNIIHKSSEKLMRLVNDVLDLSRLEAQMMKFQLQDYDATTLCKEACSMAQMRNEETG IQIEFSSETDAQIHTDIMRLTQALISVLVYPQEHRQARIIRFTLSQNEDMLCFRITNSPL ADHAFVSQETIIRHDINLLLLKHFGGNYLVNDKTPEGPEIVFIYPIVSESK >gi|225935346|gb|ACGA01000046.1| GENE 211 307222 - 309072 763 616 aa, chain - ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 369 605 12 255 311 160 38.0 5e-39 MKKHRRHIIAIVVTALISVTLYAARTNSDRLSLLQPLLEYNHPFSPDAPVDSIIAWEKLL EPELQKEQQYPLLFQINLLTVQALITEGNISLAINQANQMYQKAREMDYPLGTALALHAI GNTYLSSSTPQVAIKSYKEALEIIQKLPHANQYIKTILSQFILTKLNYHQMTDIEDNIRE LESVINKDTNPQDDFHLVYCQAFYRIQMHHLPEALNYIRQTEQISRQHQYPYFHLMIKYL YSRYYTESKEYTQALTTLDELLSHTKAANSYRSLQVLKDRTHILTLMGNSKEACEAYEIF NTYKDSLDAMNYIRQINELHTLYQIDKNELDNLNRQKTILYWSWFTILFIVILIVFFILL VRRGNKKLRQSQQELEKVKKQEENSIRTKSLFLSNMSHEIRTPLNALSGFSSILTEESID NETRKQCSDIIQQNSELLLKLINDVIDLSSLEVGKMKFKYERCDAVAICRNVIDMVEKIK QTNANVRFSTSLHSLELTTDNARLQQLLINLLINATKFTPQGSITMELEKQTEDIALFSV TDTGCGISPENQNKIFNRFEKLNENAQGTGLGLSICQLIIEQLGGKIWIDSNYEEGARFL FTHPIYHEQQGKEEAK >gi|225935346|gb|ACGA01000046.1| GENE 212 309163 - 310629 968 488 aa, chain - ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 21 471 2 458 497 
123 25.0 7e-28 MKTIYPFMGLALCSVTAQAQEKPNFLIIQCDHLTQRVVGAYGQTQGCTLPIDEVASRGVI FSNAYVGCPLSQPSRAALWSGMMPHQTNVRSNSSEPINPRIPENVPTLGSLFSENGYEAV HFGKTHDMGSLRGFKHKEPVAKPFTDPEFPVNNDSFLDVGTCEDAVAYLSNPPEEPFICI ADFQNPHNICGFVGANEGVHTDRTISGTLPELPANFNVEDWSNIPKPVQYICCSHRRMTQ ASHWNEENYRHYIAAFQHYTKMVSKQVDSVLKALYSTPAGKNTIVIIMADHGDGMASHRM VTKHISFYDEMTNVPFIFAGPGIKQQKKPIDQVLTQPTLDLLPTLCDLAGIPVPAEKPGI SLAPILKGEKQKKTHPYVVSEWHSEYEYVVTPGRMVRGPRYKYTHYLEGNGEELYDMKKD PGERKNLAKDPKYSKVLAEHRAMLDDYIARTKDDYRSLKVDADPRCRNHTPGYPNHEGPG VREILKRK >gi|225935346|gb|ACGA01000046.1| GENE 213 310914 - 313211 2227 765 aa, chain - ## HITS:1 COG:BMEI0003 KEGG:ns NR:ns ## COG: BMEI0003 COG1158 # Protein_GI_number: 17986287 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Brucella melitensis # 372 765 27 421 421 467 62.0 1e-131 MYNIIQLNDKNLSELQVIAKELGIKKADSYKKEDLVYKILDEQAIVGATKKVAADKLKEE RKNEEQKKKRSRVAPTKKEDKVVSATKSEEANKTKETAPVKAAPQPSKKEESTNKEKETV VVEAKAENTATPKRKVGRPRKSSDAEEKKEVENAKPAAPKVVEPKPVVAEKATGATEKPA PVQQQTAEKKAKNKPAAEANKPVAETNKPATEPNKPVVEKKVIDKPQKKATPVIDEESNI LSSVDDDDFIPIEDLPSEKIELPTELFGKFEATKTESAQTAPEQPSHPQQQQQSQQQQSQ QQRPRIVRSRDNNNGNNNANNNGNNGNNANNNFQRNNNNQNQVQNPNQNQNQNQNQQRLP MPRAAQQNNAGENLPVQQQQERKVIEREKPYEFDDILNGVGVLEIMQDGYGFLRSSDYNY LSSPDDIYVSQSQIKLFGLKTGDVVEGVIRPPKEGEKYFPLVKVSKINGRDAAFVRDRVP FEHLTPLFPDEKFKLCKGGYSDSMSARVVDLFAPIGKGQRALIVAQPKTGKTILMKDIAN AIAANHPEVYMIMLLIDERPEEVTDMARSVNAEVIASTFDEPAERHVKIAGIVLEKAKRL VECGHDVVIFLDSITRLARAYNTVSPASGKVLSGGVDANALHKPKRFFGAARNIENGGSL TIIATALIDTGSKMDEVIFEEFKGTGNMELQLDRNLSNKRIFPAVNITASSTRRDDLLLD KTTLDRMWILRKYLADMNPIEAMDFVKDRLEKTRDNDEFLMSMNS >gi|225935346|gb|ACGA01000046.1| GENE 214 313449 - 314285 582 278 aa, chain + ## HITS:1 COG:no KEGG:BT_1594 NR:ns ## KEGG: BT_1594 # Name: not_defined # Def: putative ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 278 42 319 319 483 83.0 1e-135 MTVNEAHLIYFSPTHTSKQVAEAIVHGTGIKNIFPINVTQQIADEIVIPTSSLAIIVVPV YGGHVAPLAMERLQYIRGVDTSTVLVVVYGNRAYEKALMELDAFAIPHGFKVIAGATFIG 
EHSYSTDKYPIAVGRPDESDLAFAAEFGKKIMEKIQTADSMDTLYPVDVRAIKRPSQPFF PLFRFLRKVIKLRKSGTPLPRVPWVEDEDLCTHCGLCVVRCPAGAITKGDELHTDEAKCI KCCACVKACVRKARVYETPFAALLSDCFKKQKLPQTIL >gi|225935346|gb|ACGA01000046.1| GENE 215 314270 - 315550 750 426 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 6 425 5 457 461 209 31.0 6e-54 MIQQRVTKYIEKEHLFSPDDKILVALSGGADSVALLYILHTAGYHCEAAHCNFHLRGKES DRDELFVRQLCERMEIHLHTIDFNTTQYATEKHISIEMAARELRYQWFEKIRKECQADVV AVAHHQDDSIETILLNLIRGTGITGLLGIRPRNGTIVRPLLCINREEIIRYLQNIGQDYV TDSTNLEDEYTRNKIRLNLLPLMQEINPSVKNTLIDTSNYLNDVATIYNKCIEETKKKII TAEGIRISDLVKEPAPEAILFEVLHPLGFNSAQIKDITHSLHSQPGKQFCSKEWRVIKDR EFLLIETAESENETLPPFQIIKEEKEYTPDFLIPREKEIACFDADKLNGEIHYRKWQPGD TFIPFGMKGKKKISDYLTDRKFSISQKERQWVLCCGEHIAWLIGERTDNRFRIDETTKRV VIYKIV >gi|225935346|gb|ACGA01000046.1| GENE 216 315696 - 318182 1950 828 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 112 824 11 665 670 555 41.0 1e-158 MRLSELKTGEKGVIVKVLGHGGFRKRIVEMGFIKGKTVEVLLNAPLKDPIKYKVLGYEIS LRRQEAEMIEVISEEEAKKLAEKTVYHEGLPEDLSVKEEDMKRLALGKRRTINVALVGNP NSGKTSLFNLASGAHEHVGNYSGVTVDAKEGYFDFEGYHFRIVDLPGTYSLSAYTPEEIY VRRHIIDETPDVIINVVDSSNLERNLYLTTQLIDMNVRMVVALNIYDELEASGNTLDYHL LSKLFGVPMLPTVSKKNRGLDTLFHVVINLYEGVDFFDKQGNMNPEVLKDLTEWHDSLED RKNHEEEHLEDYVREHKKTGRVFRHIHINHGPDIEKAIEAVKSEVSKNEFIRHKYSTRFL SIKLLENDPDIERIVRTLPNADEIFHVRDKMSKRVQDTMNEDCESAITDAKYGFISGALK ETFTDNHLEQAQTTKVLDSIVTHRVWGFPIFFLFMYLMFEGTFVIGEYPMMGIEWLVEQI GDLLRNNMAEGPFKDLLIDGIIGGVGAVIVFLPNILILYFCISLMEDSGYMARAAFIMDK IMHKMGLHGKSFIPLIMGFGCNVPAIIASRTIENRKSRLITMLVNPLMSCSARLPIYLLL 
VGAFFPNNASLVLLSIYVIGIVLAVVMARLFSKFLVKGDDTPFVMELPPYRMPTAKSIFR HTWEKGAQYLKKMGGIIMIASIIIWFLGYYPNHDAYETVAEQQENSYIGQLGRGIEPVIK PLGFDWKLGIGLLSGVGAKELVVSTLGVLYADDPDADSVSLAERIPITPLVAFCYMLFVL IYFPCIAAIAAIKQESGSWKWALFAACYTTGLAWLVSFAVYQIGGLFV >gi|225935346|gb|ACGA01000046.1| GENE 217 318179 - 318394 116 71 aa, chain + ## HITS:1 COG:no KEGG:BF1354 NR:ns ## KEGG: BF1354 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 57 2 56 70 87 70.0 2e-16 MNNWQDWVVGVLVVLCIARVIYGIFRFFRRTRENQNPCDSCVSGCELKDMMDKKRGECDV KKKSTKKNCCG >gi|225935346|gb|ACGA01000046.1| GENE 218 318738 - 319391 551 217 aa, chain + ## HITS:1 COG:no KEGG:BF1419 NR:ns ## KEGG: BF1419 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 195 48 242 244 199 57.0 4e-50 MKKILFILFVAQFILAPYIIKGYGANLVEDNYEYSGVDQGRETVEKDILGNIIIRDDNGN RKTIEKDILGNIIIRDDKGNRKTIEKDILGNIIIRDDRGNRTTIEEDILGNFIVRDDKGN RKTIEEDILGNTIIRDDKGNRKTIEKDILGNTIIRDDKGNRKTITKDIFGNTIIEDDKGN RTTIKKDIFGNEIIEYGNGHGKIIKKDIFGNTVIEEY >gi|225935346|gb|ACGA01000046.1| GENE 219 319592 - 322615 2181 1007 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 590 853 17 305 328 169 37.0 4e-41 MEHKQLLTVFNSLPVGIGFFTADGTLVRCNESFCRIFGADSQTLLGNKLNINENSVVPDV VKEAVRRCVPVQMSFLYDFDKQRTDKRFFSTRTRTCYLKCNGNPLYDNDGKFVNYIFIFE DITETVKSEEVLRQSRRKTELAMKAANIMFWEFDAVPKLFYSDNEPLNGYDQSKPVSMAL YLETLHPDDRSQVTEVMERMSNGEDFSFSFDSRVMLPDSSTWQFCTISGAPYEFDSNKKV LKYVGTRKNNTEVQKKEQFFYNILNNLPLSVHIKDVENDFRYVFCNEESKLMFGTSEDKT TYDVLSEEEVERIQKTDLEVYNTGNPYFGMERIILKDGRSYDTIVRKSIIEDDGKRFLLN TRWDQSLQNELKRRAQLLTVTMEAMNAFTWFFEPAKNRVSFGEGFDKNGKKASEINSVEK YLTLVHPDDRQKFAETLQKAVELGSGVWDVEYRIDFKGDGMYQWWETRGLVETTTLNDAP YKYLFGMTQNIDSYKQTELTLLKNKEIQDALVRQNELVLNNTNSGLAYITKEYMVQWENI SLCSKSLSFEAYKKGELCYKSTYNRTSPCENCVMQRAFVSLQTEQMKFSLDSAHTVEVFA 
TPVVLEDGSVDGVVIRVDDVTEREKMIKELQEAKHQAEQSDKLKSAFLANMSHEIRTPLN AIVGFSELMAYAGEEEKADYIQIINSNNELLLKLINDILDLSKLEAGSVELKYEPFDLSE HFENMFTSMKQRLKNPDIVLTEINPYHCCQVTLDRNRVAQIITNYVTNAIKYTSKGSIKM GYACKDGGVYFYVKDTGIGIADDKKGKVFQRFEKLDEFAQGTGLGLSICKAIAEAMGGKV GFESVHNEGSLFWAFLPCEVDTLSMVEEKKEENISGGEFGANDEVMEKSAGRKTILIAED IQSNYQLVSTILKDHYDLLHAENGQKAVEIARSQHVDLLLMDMKMPVLDGLKATAEIRKF NASLPIVALTAHAFDSDRIAAIKVGCNEYLVKPLEKMKLMVALKKYL >gi|225935346|gb|ACGA01000046.1| GENE 220 322656 - 323006 114 116 aa, chain + ## HITS:1 COG:no KEGG:BVU_3167 NR:ns ## KEGG: BVU_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 71 1 71 295 89 64.0 4e-17 MNRIKGILYAAVSSSTFGLAPFFSITLLLAGFSAFEVLSYRWGVAAIVLTLFGWCSGCNF RLAKKDLLVVFFIKFVAGNHLIQFAHCLSEYSHGGSIDYSFYVSVGCIIGNDVFLS >gi|225935346|gb|ACGA01000046.1| GENE 221 322960 - 323535 391 191 aa, chain + ## HITS:1 COG:no KEGG:BVU_3167 NR:ns ## KEGG: BVU_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 191 102 293 295 188 59.0 7e-47 MYPLAVSLAMMFFFHEKKSLLVMLAVLMSLFGAALLSSGELEAKSCDTIVGLVAACISVF SYGGYIIGVRTTRAAQINSTVLTCYVMGMGAVLYLIGAMATFGLHLVTDGYIWLIILGLA LPATAISNITLVRAIKYAGPTLTSILGAMEPLTAVVIGVFVFKELFTLNSVIGILLILLA VGMVVFRKQKS >gi|225935346|gb|ACGA01000046.1| GENE 222 323778 - 324365 512 195 aa, chain + ## HITS:1 COG:DR0189 KEGG:ns NR:ns ## COG: DR0189 COG0526 # Protein_GI_number: 15805225 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Deinococcus radiodurans # 53 171 47 160 185 76 40.0 3e-14 MNVCSVALLAIGIWACSGQKKGTANVEVATDSVEVTADAKSVQADSTGYIVRVGEMAPDF TITLTDGKQVSLSSLRGKVVMLQFTASWCGVCRKEMPFIEKDIWLKHKNNADFALIGIDR DEPLDKVLAFAKSTGVTYPLGLDPGADIFAKYALRESGITRNVLIDKEGKIVKLTRLYNE EEFASLVQAINEMLK >gi|225935346|gb|ACGA01000046.1| GENE 223 324510 - 325241 566 243 aa, chain - ## HITS:1 COG:Ta0580 KEGG:ns NR:ns ## COG: Ta0580 COG0500 # 
Protein_GI_number: 16081683 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Thermoplasma acidophilum # 10 226 5 210 227 99 31.0 6e-21 MAITILSAEKDPMGAAISDYFNHHRADRLRVFSSQFEEDEIPVKELFRRIQSMPVLERTA LQMATGRILDVGAGSGCHALALQEMGKDVCAIDISPLSVEVMKQRGVNDPRLINLFDETF SETFDTILMLMNGSGIIGRLNNMPEFFQRMKRILRPGGCIFMDSSDLRYLFEEEDGSIVI DLAGDYYGEIDFQMQYKDVKGDTFDWLYVDFQTLSLYASECGFKAELIKEGKHYDYLVKL SIA >gi|225935346|gb|ACGA01000046.1| GENE 224 325337 - 325522 205 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIEEAIMGGIVFKGKKDKPKEEEKVKTKAKKATYIRGQHGSGAAKMKADIRKKRASRHK K >gi|225935346|gb|ACGA01000046.1| GENE 225 325741 - 326118 194 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 5 125 3 126 126 79 37 2e-13 MEIKSKFDHFNINVTNLERSIAFYEKALGLKEHHRKEASDGSFTLVYLTDNETGFLLELT WLKDHTAPYELGENESHLCFRVAGDYDAIRAYHKEMNCVCFENTAMGLYFINDPDDYWIE ILPQK >gi|225935346|gb|ACGA01000046.1| GENE 226 326133 - 327278 1023 381 aa, chain - ## HITS:1 COG:no KEGG:BT_1579 NR:ns ## KEGG: BT_1579 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 381 1 377 377 617 86.0 1e-175 MNKFTILFLTLFLALPMAMKADSAKEKKDDTRYLVGAVPEVDGKVVFSKEFQIPGMSQAQ IYDTMTKWMNERLKENKNIDSRIVFSDEAKGTIAGVGEEWIVFSSSALSLDRTLVNYQIT VTCKPGNCLVELEKIRFTYRETEKYKAEEWITDKYALNKAKTKLVRGLAKWRRKTVDFAD DMFMDVAVAFGAPDTRPKTEKKKKEEEQQTPSIVAAAGPIIIGGTDKKTDIKVTTAEPVQ TTVPAATLTPATPVGKASTDMPGYTEIDLKQIPGEVYALMGSGKLVISIGKDEFNMTNMT ANAGGALGYQSGKAVAYCTLSPDQPYEAIEKADSYTLKLYAPNQTTPSAVIECKKMPSQT TPQAGQPRTYVGEIVKLLMKK >gi|225935346|gb|ACGA01000046.1| GENE 227 327292 - 328023 573 243 aa, chain - ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: 
Fusobacterium nucleatum # 5 232 4 228 237 153 37.0 4e-37 MTLDYIYHSGFAIEMEGVTVIIDYYKDSSETEHNRGIVHDYLLQRPGKLYVLATHFHPDH FNREILTWKEQRPDIQYIFSKDILKSHRAKAEDAFYIKKGETYEDETIRIDAFGSTDVGS SFLLHLQDWSIFHAGDLNNWHWSEESTEEEIRKANGDFLAEVKYLKEKAPNIDLVLFPVD RRMGKDYMKGAKQFIEQIKTTIFVPMHFSEDYEGGNALCSFAENAGCRFISITHRGESFE ITK >gi|225935346|gb|ACGA01000046.1| GENE 228 328091 - 328738 271 215 aa, chain - ## HITS:1 COG:sll1440 KEGG:ns NR:ns ## COG: sll1440 COG0259 # Protein_GI_number: 16330895 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxamine-phosphate oxidase # Organism: Synechocystis # 1 215 17 230 230 204 48.0 9e-53 MAKLNIADIRQEYTKGGLRENELPGDPLSLFNRWLQEAIDAEVDEPTAVIVGTVSPEGRP STRTVLLKGLHDGKFVFYTNYESRKGRQLAQNPYISLSFVWHALERQIHIEGIATKVPPE ESDEYFRKRPYKSRVGARISPQSQPITSRMQLIRSFVREAARWIGKEVERPDNWGGYAVT PTRIEFWQGRPNRLHDRFLYTLQPDGEWKISRLAP >gi|225935346|gb|ACGA01000046.1| GENE 229 328782 - 329486 647 234 aa, chain - ## HITS:1 COG:sll1773 KEGG:ns NR:ns ## COG: sll1773 COG1741 # Protein_GI_number: 16330260 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Synechocystis # 5 211 4 209 232 175 40.0 8e-44 MKKVIHKADTRGHSQYDWLDSYHTFSFDEYFDSNRINFGALRVLNDDKVAPGEGFQTHPH KNMEIISIPLKGHLQHGDSKKNSRIITVGEIQTMSAGTGIFHSEVNASPVEPVEFLQIWI MPRERNTRPVYQDFSITELERPNELAVIVSPDGSTPASLLQDTWFSIGKVEAGKKLGYHM HQSHAGVYIFLIEGEIVVDGEVLKRRDGMGVYDTNSFELETLKDSHILLIEVPM >gi|225935346|gb|ACGA01000046.1| GENE 230 329550 - 330551 950 333 aa, chain - ## HITS:1 COG:alr0058 KEGG:ns NR:ns ## COG: alr0058 COG1052 # Protein_GI_number: 17227554 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Nostoc sp. 
PCC 7120 # 5 332 3 329 341 354 53.0 1e-97 MAYTIAFFGTKPYDESSFNDKNKEFGFEIRYYKGHLNKNNVLLTQGVDAVCIFVNDVADA EVIRVMAANGVKLLALRCAGFNNVDLDAAAAAGITVVRVPAYSPYAVAEYTVALMLSLNR KIPRASWRTKDGNFSLHGLMGFDMHGKTAGIIGTGKIAKILIHILKGFGMNILAYDLYPD HNFAREEQIVYTSLDELYHNSDIISLHCPLTEATKYLINDYSISKMKDGVMIINTGRGQL IHTNALIEGLKNKKIGSAGLDVYEEESEYFYEDQSDRIIDDDVLARLLSFNNVIVTSHQA FFTHEAMENIAATTLQNIKDFINHKPLLNEVKK >gi|225935346|gb|ACGA01000046.1| GENE 231 330760 - 332205 1224 481 aa, chain + ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 230 481 3 273 273 64 23.0 5e-10 MRKISLIGLAMLIVSIPTFAGDYLTNTNQNAAFLRMIARGASIDVDGVYSNPAGLAFLPK DGLQVALTIQSAYQTRDIAATSPLWTMDGQTTVRNYEGTASAPVIPSIHAVYKKGDWAFS GSFAIVGGGGKASFNTGLPMFDAAAISLANSASGGMLKPNMYNINSAMEGRQYIYGFQLG ASYKINEHFSVFAGARMNYFTGGYKGHLNISLKEGVAQQLGAAIVQQIMAANPGMSLEQA TLAAQAQSGPLLQKLDDTKIELDCDQTGWGLTPIIGVDAKFGKLNLAAKYEFKANMNIEN DTHTREFPDAAADFMAPYANGVNTPSDLPSMLSVAASYEFLPSLRASVEYHFFDDKNAGM ADGKQKTLKHGTHEYLAGVEWDINKLFTVSGGYQKTDYGLSDAFQSDTSFSCDSYSVGFG GRINFTQALSLDVAYFWTTYSDYTKENPRRGGLPESMASLVDKDVYSRTNKVFGVSVNYK F >gi|225935346|gb|ACGA01000046.1| GENE 232 332289 - 333578 1154 429 aa, chain - ## HITS:1 COG:no KEGG:BT_1573 NR:ns ## KEGG: BT_1573 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 429 1 429 429 809 91.0 0 MKTSFKMVAMLLGIGIFPLCAYAQQKVVIEDEEPNSIMFVSKNKAGDEIIRIMNDRSQMR FHDPNAPRFLLTDQKGKFALGIGGYVRATAEYDFNGIVNDVDFYPALIPQRGSGNFAKNQ FQMDITTSTLFLKLVGRTKHLGDFVVYTAGNFRGDGKTFELQNAYAQFLGFTIGYSYGSF MDLSALPPTIDFAGPNGSAFYRTTQLSYMCDKLKNWKFGVSMEMPSVDGTTNNDLSINTQ RMPDFATSVQYNWNSSSHVKLGAIVRSMTYSSNVHEKAYSATGFGLQASTTFNITKKLQA FGQFNYGKGIGSYLNDLSNLNVDIVPDPDNEGKMQVLPMLGWYAGLQYNLCPSIFISGTY SLSRLYSENGYPSENPESYRKGQYLVANAFWNVSSNLQVGVEYLRGWRTDFSSATRHANR LNMLVQYSF >gi|225935346|gb|ACGA01000046.1| GENE 233 333742 - 334248 456 168 aa, chain + ## HITS:1 
COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 6 164 8 161 183 88 35.0 7e-18 MEKVDFTQGILAIQPDLHRFAYKLTADRESANDLVQDCLLQALDNQEKFTYSKNLKGWMY TLMRNIFVNNYRRTVREMNLIDDSYSINQQHLIEDEDADRFEFTYDMKQLYRVIHSIPEE MKVPFQMFVAGFKYREIAEKLGLPMGTVKSRLFFIRKRLKEELKDFSS >gi|225935346|gb|ACGA01000046.1| GENE 234 334488 - 335168 704 226 aa, chain + ## HITS:1 COG:Cgl0234 KEGG:ns NR:ns ## COG: Cgl0234 COG1738 # Protein_GI_number: 19551484 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 9 213 43 249 250 128 33.0 9e-30 MKEKVSVPFMLLGILFNVCLIAANLLETKVIQVGSLTVTAGLLVFPISYIINDCIAEVWG FKKARLIIWSGFAMNFFVVALGLIAVAIPAAPFWEGEEHFNFVFGMAPRIVAASLMAFLV GSFLNAYVMSKMKVASQGRNFSARAIWSTVVGETADSLIFFPVAFGGVIAWKELLIMMGI QIVLKSMYEVIILPVTIRVVKAIKKIDGSDVYDTNISYNVLKVKDI >gi|225935346|gb|ACGA01000046.1| GENE 235 335177 - 335836 677 219 aa, chain + ## HITS:1 COG:CAC3627 KEGG:ns NR:ns ## COG: CAC3627 COG0603 # Protein_GI_number: 15896861 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Clostridium acetobutylicum # 1 213 5 217 222 313 64.0 2e-85 MNREAALVVFSGGQDSTTCLFWAKRNFKKVYALSFLYGQKHQKEVELAREIARKAEVEFD VMDVSFIGQLGHNSLTDTTMVMDQEKPADSVPNTFVPGRNLFFLSIAAVYARERGINHLV TGVSQTDFSGYPDCRDAFIKSLNVTLNLAMDEQFVIHTPLMWIDKAETWALADELGVLEL IRTETLTCYNGVQGDGCGHCPACTLRREGLEKYLKSKNQ >gi|225935346|gb|ACGA01000046.1| GENE 236 336408 - 336872 432 154 aa, chain + ## HITS:1 COG:NMA2170 KEGG:ns NR:ns ## COG: NMA2170 COG0780 # Protein_GI_number: 15795041 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Neisseria meningitidis Z2491 # 1 154 1 155 157 228 72.0 3e-60 MSAMTELKDQLSLLGRKTEYKQDYAPEVLEAFDNKHPENDYWVRFNCPEFTSLCPITGQP 
DFAEIRISYIPDIKMVESKSLKLYLFSFRSHGAFHEDCVNIIMKDLIRLMSPKYIEVTGI FTPRGGISIYPYANYGRPGTKFEQMAEHRLMNRE >gi|225935346|gb|ACGA01000046.1| GENE 237 336910 - 337545 509 211 aa, chain + ## HITS:1 COG:no KEGG:BT_1563 NR:ns ## KEGG: BT_1563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 212 334 81.0 1e-90 MTHHVPAELQNYVRQSIIPQYANFDKAHQIDHVEKVIEESLKLATHYEVDYSMVYVIAAY HDLGLYEGREFHHITSGKVLLADETLRRWFTDEQLLQMKEAIEDHRASNKQAPRTIYGMI VAEADRIIDPEVTLRRTVQYGLSHYPEMDKEEQYARFRKHLTEKYAEGGYLKLWIPQSDN AGRLAELRNLITNEDELRQVFNKLYIEEKNG >gi|225935346|gb|ACGA01000046.1| GENE 238 337606 - 338079 345 157 aa, chain - ## HITS:1 COG:SA0023 KEGG:ns NR:ns ## COG: SA0023 COG1576 # Protein_GI_number: 15925729 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 1 155 1 158 159 100 36.0 1e-21 MKTTLLVVGRTVEQHYITAINDYIQRTKRSITFDMEVIPELKNTKSLSMEVQKEKEGELI LKALQPGDVVVLLDEHGKEMRSLEFAEYMKRKMNTVNKRLVFIIGGPYGFSEKVYQAAHE KISMSKMTFSHQMIRLIFVEQIYRAMTILNGGPYHHE >gi|225935346|gb|ACGA01000046.1| GENE 239 338207 - 338599 334 130 aa, chain + ## HITS:1 COG:no KEGG:BT_1561 NR:ns ## KEGG: BT_1561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 199 89.0 4e-50 MKKRVLVGMATLLLSLSLLMAQEIPAGVITAFKRGSSQELSKYMGDKVNLVLQGRSTSVD KQKATAMMQEFFTENKVSGFNVNHQGKRDESSFVIGTLATTNGNFRVNCFLKKVQNQYLI HQIRIDKINE >gi|225935346|gb|ACGA01000046.1| GENE 240 338592 - 339440 851 282 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 3 281 9 284 286 332 60 2e-89 MNKEKELIDKLIDLAFAEDIGDGDHTTLSCIPATAMGKSKLLIKEAGVLAGIEVAKEIFN RFDPTMKVEVFINDGTEVKPGDVAMVVEGKVQSLLQTERLMLNVMQRMSGIATMTRKYAK VLEGTNTRVLDTRKTTPGMRILEKMAVKIGGGVNHRIGLFDMILLKDNHVDFAGGIDKAI TRAKEYCKEKGKDLKIEIEVRSFDELQQVLDLGGVDRIMFDNFTPEMTKKAVEMVAGKYE TESSGGITFDTLREYAECGVDFISVGALTHSVKGLDMSFKAC >gi|225935346|gb|ACGA01000046.1| GENE 241 
339686 - 340237 193 183 aa, chain + ## HITS:1 COG:SMb20592 KEGG:ns NR:ns ## COG: SMb20592 COG1595 # Protein_GI_number: 16265252 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 44 174 47 188 227 71 29.0 8e-13 MENEIELIKGCRAGKDSARKELYTLYSKQMLAVCFRYTGDMDAAHDVLHDGFIKIFTNFS FRGESSLCTWITRVMVTQSLDFLRREKRVSQLVVHEEQLPDIPDISDSGGGAGISEEQLM AFIAELPDGCRTVFNLYVFEEKSHKEIAKMLHIKEHSSTSQLHRAKYLLAKRIKEYRNHE ERK >gi|225935346|gb|ACGA01000046.1| GENE 242 340221 - 341270 796 349 aa, chain + ## HITS:1 COG:no KEGG:BT_1558 NR:ns ## KEGG: BT_1558 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 350 350 586 84.0 1e-166 MKKENDEITDLFRTRLADAGMSVRDGFWEELSQEIPVACQHRRRILLFRVAAAASVLLVL AASSATFLYFSPKEEMEEAFTKIAVTNGGQMDGDGIRVNQLPLPVEPVLPKPAPKSYGML SQYTEEEDSLSITFSMSFSFSSTTSTGNGNRYGNQGHNGYWQATNGNTGSSVASEEQSNV DMSQPKAVKKHRWAMKVQVGTALPADNGTYKMPVSAGVTVERKLNDFLGIETGLLYSNLR SAGQHLHYLGIPVKVNVTLVDTKKIDLYATVGGVADKCIAGAPDNSFKEEPIQLAVTAGI GINYKINDRLAVFAEPGVSHHFKTDSKLATVRTKRPTNFNLLCGLRMTY >gi|225935346|gb|ACGA01000046.1| GENE 243 341300 - 341806 455 168 aa, chain + ## HITS:1 COG:no KEGG:BT_1557 NR:ns ## KEGG: BT_1557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 2 169 169 312 89.0 2e-84 MLWMAVCLIVSFTGCTSEEMDYNNPDVALFVKQLKSGTYKMKNEKGVVEVPHFTEEDIPE LLKYAEDLTIIPSFPSVYNMNNGKIRLGECMLWVIESIRQGTPPSLGCKMVLANAENYEA IYFLTDEEVLDAAACYRSWWEERQYPKTRWTIDPCYDEPLCGSGYRWW >gi|225935346|gb|ACGA01000046.1| GENE 244 341984 - 342709 686 241 aa, chain + ## HITS:1 COG:no KEGG:BT_1556 NR:ns ## KEGG: BT_1556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 241 5 242 242 405 79.0 1e-112 MDGLKRVITTMALCLTAVFAYSQVWTAQDSLHLKKLLESDQELHLNMDAVKSIDFGSAVG TPRMSEEKSWMMPDESFPEALPKPKVMLTLMPYKANTRYNWDPIYQKKIKIDKNTWRGDP 
FYEIRHQRSYSNWARNPMAKGMRKSLDEIQASGVRFRQLGERANGMMVNTVVMDAPIPLF GGSGVYINGGTIGGLDLMAVFTKDFWNKTGRDNRARTLEVLRTYGDSTTVLINKPIEQIA R >gi|225935346|gb|ACGA01000046.1| GENE 245 342879 - 343817 834 312 aa, chain + ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 7 298 9 298 310 225 43.0 1e-58 MNRITSLLGIRYPIIQGGMVWCSGWRLASAVSNAGGLGLIGAGSMHPDTLREHIRKCNAA TKLPFGVNIPLMYPQIEEIMNIVVEEGVKIVFTSAGNPKTWTGWLKERGITVVHVVSSSR FAVKCEEAGVDAVVAEGFEAGGHNGREETTTFCLIPAVREATTLPLIAAGGVGTGEGVLA AMVLGAEGVQIGTRFALTEESSASSVFKDYCLSLREGDTKLLLKKLAPTRLVKNAFREAV EKAEDSGASAEDLRTLLGRGRAKKGIFEGDLEEGELEIGQVSAIISRRQSVAEVMDELVA AYQRAAKKDYLF >gi|225935346|gb|ACGA01000046.1| GENE 246 343966 - 345072 1010 368 aa, chain - ## HITS:1 COG:BS_ald KEGG:ns NR:ns ## COG: BS_ald COG0686 # Protein_GI_number: 16080244 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Bacillus subtilis # 1 365 1 364 378 383 56.0 1e-106 MIIGVPKEIKNNENRVGMTPSGVAEVVKQGHRVFIQHTAGINSGFPDEAYQAVGAHILPT IEDIYATAEMIVKVKEPIITEYNLIRKGQLLFTYFHFASDRELTLAMLSNKSICLAYETV EEADHTLPLLIPMSEVAGRMSIQEGARFLEKPQGGKGILLGGVPGVKPAKVLILGGGVVG SNAAQMAAGMGADVTITDINLARLRYLSETLPKNVKTLYASELRIRKELPDVDLVVGSVL IPGDKAPHLITKEMLSIMQPGTVLVDVAIDQGGCFETSHPTTHSAPTYIVDGIVHYAVAN IPGAVPYTSTLALTNATLPYVIALANKGWKKACKENPALALGLNIVEGKIVYKAVADVFG LKYDPISL >gi|225935346|gb|ACGA01000046.1| GENE 247 345203 - 345886 480 227 aa, chain - ## HITS:1 COG:no KEGG:Slin_4128 NR:ns ## KEGG: Slin_4128 # Name: not_defined # Def: serine/threonine protein kinase # Organism: S.linguale # Pathway: not_defined # 24 212 409 599 651 136 37.0 6e-31 MSSTITKFFASFLAYGVANKKKRFSAIGRFSEGLAPVKGKIQWGYINKEYDIVIPLMYER AFSFKEGLGMVVLNSQYGFIDHTGQIRIPFKYAAAHSFEQECARVCHDGLWGLIDRQGNY ILPPTYSQMEQFAEGLALVSLHNKVGFINKKGEVVIPLEYDNGCSFSEGLAAVCIESQSS KWGYINKDNEEVLPFKYDIAEPFYNNIARVGLYGKSMKINKQGSECL 
>gi|225935346|gb|ACGA01000046.1| GENE 248 346327 - 347664 860 445 aa, chain - ## HITS:1 COG:no KEGG:BT_1553 NR:ns ## KEGG: BT_1553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 445 1 442 442 565 65.0 1e-159 MTKKIYYIFTLIGSLLLSSCDSYLDIQPVGQVIPNTLAEYRALFTTAYNTALNDRGICEI RTDIATILQSDATSKNSLGDVEKWNDVNPNASTRQFGWAAYYTNIYYANAIIDKKDEISE GSQEDINQLVGEAYLMRAYMHFILVNLYGQPYTATGALETKAVPLKLNTDLEEIPSRNTV KEIYTSILSDIETARKLINKKEWEIQYSYRFSTLSVDAMESRVYLYMGEWSKSYEASERV LAGKSTLVNLNDEGGKLPNEFTSVEMITAYEVFPNSDYAGSLLLYPSFLQEYEEGKDLRP NKYYQANKNGNYTSIKSGESKFKCTFRTGELYLNSAEAAAHLNKLPEARTRLLKLIENRY TSEGYEQKKNEINAMSQEKLVTEILKERARELAFEGHRWFDLRRTTRPEIKKEIEDVTYT LVQDDSRYTLRIPQDAIDANPGLLN >gi|225935346|gb|ACGA01000046.1| GENE 249 347675 - 351022 2196 1115 aa, chain - ## HITS:1 COG:no KEGG:BT_1552 NR:ns ## KEGG: BT_1552 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1115 1 1114 1114 1776 80.0 0 MRKKYLYVLICFLVSTLATANAANRTITGVVISGEDNEPLIGASVYVNADDLKKAGVSQT SLGTITDMDGKFSISIPEKVTRLHCSYIGFEEQNIVLQAGKDTYRIVLQASSHTLGDVVV TGYQELERRKLTAAIAKVDVTDGMVGAAKSIDQALAGQVAGVAVTTTSGAPGAPARIRIR GTASLNGTQDPLWVLDGIPLEGTDIPEINKDNDNDIVNMSQSSIAGLSPNDIESITILKD AAATAIYGARAANGVIVVTTKRGKTGKPVINFNTKLTYTPNLNTSRLNLLNSEEKVDLEL QLLKEARFDILWGLTDPIPVFPEKGKVAAIMKQYNLIDIYKEQGWNGLTPEAQNAINKLK TINTDWNDILFRDAFTQEYNFSISGGSEKVTYYNSLGYVKENGNVPGVSMSRFNLTSKTS YQINKILKIGMSIFANRRKNNTFMTDTYGLINPIYYSRIANPYFAPFDEQGNYLYDYDVV RSNETDEKQGFNIFEERANTNKESVTTAINSIFDVQLRFNDQWKVYSQIGVQWDQLSQEE YAGINSYNIRNIRETNKYWKNGVQTYLIPEGGMLKTTNSTTSQLTWKIQGEYKNTFGDIH DIQIMAGSEIRKNWVDNQASTGYGYDPKKLTFQNLIFKDEAQANDWNLKTKSYKENAFAS FFANGSYTLMNRYTLGGSVRMDGSDLFGVDKKYRFLPIYSVSGLWRLSNESFIRQYKWID NLALRLSYGLQGNIDKGTSPFLVGKYDNVNILPGYSEENIIINSAPNSKLRWEKTASYNL GIDFSVLNQAINLSVDYYYRKGTDLIGSKALALENGFTNMSINWASMENKGVEINLQTRN ITTKNFSWYTTFNFAYNQNKVLKVLTDKSQVTPSLEGYPVGAIFALKTKGINPDTGQIYL ENKEGKAVTVEELFRMTSNEDGLGTYQIGPSTEEQRDFYSYVGTSDAPYTGGFLNTFNYR 
NWELNLNFSYNFGAHVKTTPSYNVSDLDPGRNMNRDILDRWTPENTTGKFPALATYNYNP ADYYLFSTRNDIYRSLDIWVKKLSFVRLQNIRLAYRVPSEWLHKLSIGGATVGLEARNLF VISSNYDNYMDPESMGNLYSTPVPKSITFNLSLNF >gi|225935346|gb|ACGA01000046.1| GENE 250 351055 - 353688 2393 877 aa, chain - ## HITS:1 COG:no KEGG:BT_1551 NR:ns ## KEGG: BT_1551 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 877 1 876 876 1675 92.0 0 MRMYVKVMAAVIACILLSGEALSAFATTPATENLSWFKKKKKKPEEKEEKSKSDYEKLVE DSKTTKGMFAVHQKKNDYYFEIPTSLLGRDLLVVNKLQRVPAELNDAGVNRGVNYENQMV CMEWDKATGKLMLRQQRPLPLAPQTDAIFRSVKDNFISPLIAAFKIEAINADSTALVIKV NDIYDGTETSINNVFTNINLGTSAIKNLSRILSIKSFSNNVVATSELTTRVTEGTTTVYV TVEVSSSILLLPEKPMMGRFDNQKVGYFTNPLLSFSDAQQRTDKTQYITRWRMEPKPEDR EAYLKGQLVEPAKPIVFYIDNSTPYQWRTYIKKGIEDWQIAFEKAGFKNAIIAKEITDSM HVDMDDVNYSVLTYAASEKKNAMGPSLLDPRSGEILEADIMWWHNVLSMVREWITVQTGT VCLEARNVQLPDALMGDAIRFVACHEVGHSLGLRHNMMGSWAFPTDSLRSEAFTSRMNST ASSIMDYARFNYIAQPGDGVKVLSPHIGPYDMFAIEYGYRWYGKKTPEEEKDVLFDFLSK HTDRLYKYSEAQDVRDAVDPRAQNEDLGDDPVRSSLLGIENLKRIVPQILQWTTTGEKGQ TYEEASRLYYAVINQWNNYLYHVLANIGGIYIENTIVGDGVKTYTFVEKEKQQASLKFLM DEVLTYPKWLFDTEVGQYTYLLRNTPIGKQENAPTQILKNAQAYILWDLLGNTRLMRMIE NESVNGKKAFTVVELMDGLHKNIFGITERGGIPNVMERSLQKNFLDALLTAAAEPEAVKI NKKIANEHFLLDHATPFCSCYAAEQRALRQEDRMGAPRVLNFYGSQLNRISDAISVKRGE LLRIKKLLQSRLGTSDTAARYHYEDMILRINTALGIK >gi|225935346|gb|ACGA01000046.1| GENE 251 353887 - 355539 1230 550 aa, chain + ## HITS:1 COG:TM1660 KEGG:ns NR:ns ## COG: TM1660 COG0739 # Protein_GI_number: 15644408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 25 272 19 256 323 73 27.0 8e-13 MRRYIIALMLACCIGGYGQEKKQVTFVPPFDFPLTLSGNFGEIRSNHFHGGLDFKTGGVI GKPVRALADGYISRIRVTNGSGYVLDVCYHNGYSTINRHLSGFVSPIAERVEKLQYEEES WEVEIVPEPGEYPVKGGQQIAWSGNTGYSFGPHLHLDVFETESGDYIDPMPFFQSKIKDT RAPKADGILFFPQPGKGVVDGKQENKTILPNSERPVEAWGVIGVGIKAYDYMDGVSNHYG VYSVVLAVDGNEIFRSTVDRFSQEENRMINSWTYGQYMKSFIDPGNTLRLLKASNDNRGL 
VTIDEERDYQFLYTLKDAFGNTSNYGFTVRGRKQPVEPLNHREKYYFTWNKTNYLQEPGL SLVIPKGMLYDDVPLNYQMKADSGAVAFTYQLNDKTVPLHAACELCIGLRRKPVADTTKY YVARITPKGGKYSVGGKYEDGYMKAAIKELGTYTVAVDTIPPEIIPVNKNQWGRNGKIVY RLKDQGAGVVSYRGTIDGKYALFGRPNIVKSYWECVLDPKHVKKGGKHTVEFTVTDGCGN ETIARESFIW >gi|225935346|gb|ACGA01000046.1| GENE 252 355565 - 357202 1490 545 aa, chain + ## HITS:1 COG:MA3377 KEGG:ns NR:ns ## COG: MA3377 COG4690 # Protein_GI_number: 20092191 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Methanosarcina acetivorans str.C2A # 22 500 2 538 574 188 27.0 2e-47 MKRRIILCAAIFMAAVANTFACTNLIVGKNASADGSTIVSYSADSYGLFGELYHYPAATY PKGTMLKVYEWDTGKYLGEIEQARQTYNVVGNMNEYQVTIGETTFGGRPELADSTGIIDY GSLIYIGLQRSRTAREAIKIMTDLVQQYGYYSEGESFTIADPNEIWIMEMIGKGAGIRGA VWVAVRVPDDCISAHANQSRIHQFDMNDKENCMYSPDVVSFAREKGYFNGVNKDFSFSLA YAPLDFGARRFCEARVWSYFNKFTDNGKEYLPYIEGKTDTPMPLFVKPKHKLSVQDVKDM MRDHYEGTPLDISNDFGAGPYKTPYRLSPLNFKVDDKEYFNERPISTQQSGFVFVAQMRA HKPDPIGGVLWFGVDDANMAVFTPVYCCATKVPVCYTRVDGADYITFSWNSAFWIFNWVS NMVYPRYNLMIGDVREAQKEMETTFNDAQEGIEEMAAKLLAKDKNAAVDFLTNYTNMTAQ STLDTWKQLGTFLIVKYNDGVVKRVKDGKFERNSIGQPAGVIRPGYPKEFLQEYVKQTGD RYLVK >gi|225935346|gb|ACGA01000046.1| GENE 253 357412 - 359157 2077 581 aa, chain + ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 12 553 5 549 575 454 44.0 1e-127 MENQELIKQVTEKAEKWLTPAYDAETQAEVKRMLENDDKTELIEAFYKDLEFGTGGLRGI MGVGSNRMNIYTVGAATQGLSNYLKKNFKDLPQISVVVGHDCRNNSRLFAETSANIFSAN GIKVYLFDDMRPTPEMSFAIRHLGCQSGIILTASHNPKEYNGYKAYWDDGAQVLAPHDKG IIDEVNAIASAADIKFQGNPDLIQIIGEDIDKIYLDMVKTVSIDPAAIARHKDMKIVYTP IHGTGMMLIPRALKMWGFENVFTVPEQMVKDGNFPTVVSPNPENAEALSMAVNLAKEIDA ELVMASDPDADRVGIACKDDKGEWVLINGNQTCMMYLYYILTQYKQLGKIKGNEFCVKTI VTTELIKKIADKNNIEMLDCYTGFKWIAREIRLCEGKKKYIGGGEESYGFLAEDFVRDKD AVSACCLIAEVAAWAKDNGKSLYQLLLDIYVEYGFSKEFTVNVVKPGKSGAEEIKAMMEN 
FRANPPKELGGSKVILSKDYKTLKQTDDKGIVTVIDMPEPSNVLQYFTEDGSKVSVRPSG TEPKIKFYMEVQGEMGCRNCYASAESAAMEKIEAVKKSLGI >gi|225935346|gb|ACGA01000046.1| GENE 254 359517 - 360692 903 391 aa, chain - ## HITS:1 COG:BS_yrkO KEGG:ns NR:ns ## COG: BS_yrkO COG2311 # Protein_GI_number: 16079697 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 9 386 17 385 405 88 24.0 2e-17 MELSTKTPRIEVVDALRGFAVMAILLVHNLEHFIFPVYPESSPEWLTILDAGVLNATFSL LAGKSYAIFALLFGFTFFIQSHNQQLKGKDFGYRFLWRMVLLAGFATLNAAFFPAGDVLL LFVVVSLVLFIVRKWNDKAILITAILFSLQPIEWFHYIMSLFNPAYTLPDLNVGAMYAEV ADYTKAGNFWDFLIGNITLGQKASLFWAIGAGRFLQTAGLFLFGLYIGRKELFVATESHL RFWTKALIITAISFAPLYSLKEQIMQSDSSLIQQTVGTAFDMWQKFAFTIVLVSSFMLLY QKDKFKKAVSNLRFYGKMSLTNYISQSILGAIIYFPFGLYLAPYCGYTLSLIIGIVLFLL QVKFCKWWLSKHKQGPLETIWHKWTWIGTKK >gi|225935346|gb|ACGA01000046.1| GENE 255 360702 - 361751 361 349 aa, chain - ## HITS:1 COG:Rv0517 KEGG:ns NR:ns ## COG: Rv0517 COG1835 # Protein_GI_number: 15607658 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Mycobacterium tuberculosis H37Rv # 2 349 36 407 436 89 23.0 1e-17 MINTLTSLRILFALMVFGAHCYVLDPSFDAHFFKEGFVGVSFFFILSGFIIAYNYQKKLL EKTTTKRTFWIARIARIYPLHLLTLLIAACIGGYVQYSDTMDWIKHFVASTFLLQPFFPS ANYFFSFNSPSWSLGCEQLFYFCFPLIIPFLNSKRNLCITLFVCLLVMLIGMHLTPEEQI KGYWYVNPITRLPDFFVGVLLYQLYQSIYNKKISYSIGTLLEVGVVVLFFIFYFCAADIP KVYRYSCYYWLPVSLVILVFALQRGYISRLLSNRILVIGGEISYSFYLIHLFIILTYTQM AALYQWQISWMISVPIIFCIIIALSLLSYYYFEKPANRWVKRILTKKQS >gi|225935346|gb|ACGA01000046.1| GENE 256 361831 - 362394 430 187 aa, chain - ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 31 186 38 194 199 70 30.0 2e-12 MENIIKGIRQYYPVSDNSLEELFSYMKKMELPKKHLLIQGGVSDRHVYFIEKGFCRSYCL 
RDGEEITIWFSREGDITFAMKDLYHNEPGYEYVELLEDCELYAIRIEDLNQIYEANIEIA NWGRVIHQECLLYMDIHHINRLYLPAKERYEQLLRDQPDVVHRAQLGYIASFLGMTPQHL SRLRSES >gi|225935346|gb|ACGA01000046.1| GENE 257 362401 - 363189 458 262 aa, chain - ## HITS:1 COG:MA1439 KEGG:ns NR:ns ## COG: MA1439 COG2816 # Protein_GI_number: 20090298 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Methanosarcina acetivorans str.C2A # 6 258 27 279 285 181 38.0 9e-46 MDQTVQSWWFIFYKDQLLLEKKGDGKYAVPCGESSPIIIKEKTTVHNITTLEGRNCKAFS LSSPIEESERWAMIGLRASYDYLPLSHYQTAGKAHEILHWDRNSRFCSACGTPMEQKESI MKRCPKCGREVYPSISTAILVLVRKKDSLLLVHARNFKGTFNSLVAGFLETGETLEECVA REVKEETGLDVKNITYFGNQPWPYPSGLMVGFIADYAGGEIKLQDEELSSGDFYTRDYLP ELPRKLSLARKMIDWWIEHPNE >gi|225935346|gb|ACGA01000046.1| GENE 258 363273 - 364973 1105 566 aa, chain - ## HITS:1 COG:CC2587 KEGG:ns NR:ns ## COG: CC2587 COG0488 # Protein_GI_number: 16126825 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Caulobacter vibrioides # 3 565 4 525 535 263 30.0 8e-70 MSISIQQISYIHPDKEVLFSDLNFAISKGQKLGLVGNNGCGKSTLLQIIAGQLSPSSGVI VRPDDLYYIPQHFGQYDSLTIAQALRIERKQQALHAILSGDASNENFVVLDDDWNIEERS IAALDLWGLGQFTLSYPMNLLSGGEKTRVFLAGMDIHHPSVVLMDEPTNHLDSSGRQRLY DWVEKCHSTLLVVSHDRTLLNLLPEICELEKHQINYYGGNYEFYKEQKTLMQEALQQRIE EKEKALRIARKVARETVERRDKQNVRGEKNNIKKGVPRIVLNALQGKSEKSTSKLNSTHQ EKAEKLTSERNQLRSSLSPTDTLKTDFNSSSLHTGKILVTAKEINFGYHPNSINSHIQMN NEANLADTGNHSSPDSNDIQDNSDFKQQLWQTPISFQLKSGDRLRIEGANGSGKTTLLKL ITGQLQPQEGNLTRMEFTYVYLNQEYSIIDDRNSILEQAYAFNNRNLPEHEIKIILNRYL FPASEWDKSCRKLSGGEKMRLAFCCLMISNNTPDMFILDEPTNNLDIQSIEIITATIKNY TGTVIAISHDDYFIQEIGIEQRILLS >gi|225935346|gb|ACGA01000046.1| GENE 259 365393 - 366766 449 457 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 449 1 442 458 177 27 7e-43 
MKQYDAIIIGFGKAGKTLAAELSNRGWQVAIVERSNMMYGGTCPNVACIPTKTLVHEAEV SALLYHDDFPKQANMYKQAISRKNRLTSFLRNDNYERLNKRPNVTIYTGTGSFVSSNTIK VALSEGDIELQGKEIFINTGSTPIIPAIDGIQQSQHVYTSSTLLDLNVLPHHLIIIGGGY IGLEFASMYAGFGSKVTILEGGNKFMPREDRDIANSVKEVMDKKGIEIHLNARAQSIHDT NDGVTLTYSDVSHGTPYYMDGDAILIATGRKPMIEGLNLSAAGVGVDAHGAIIVNDQLRT TVPHVWAMGDVKGGAQFTYLSLDDFRIIRDQLFGDKKRDIGDRDPVQYAVFIDPPLAHIG ISEEEALKRGYSFKVSRLPASSVVRARTLRQTDGMLKAIINNHNGKIMGCTLFCADASEI INIVAMAMKTGQPSTFLRDFIFTHPSMSERLNQLFDI >gi|225935346|gb|ACGA01000046.1| GENE 260 366867 - 367448 552 193 aa, chain - ## HITS:1 COG:STM0566 KEGG:ns NR:ns ## COG: STM0566 COG3059 # Protein_GI_number: 16763943 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 3 187 2 182 186 174 51.0 6e-44 MKEKFIALLTFTSSLKSFGIQFIRVAILIVFVWIGGLKYFHYEADGIVPFVANSPFMSFF YAKGAPEYKEHKNAEGAFVPENRAWHEANRTYTFSYGLGALIMSIGILVFLGIFFPKVGL AGDALAIIMTLGTLSFLVTTPEVWVPDLGSGEFGFPLLSGAGRLVIKDIVILASAVVLLS DSSQRVLKTLKKD >gi|225935346|gb|ACGA01000046.1| GENE 261 367537 - 368376 704 279 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 117 272 131 285 288 83 31.0 4e-16 MLQQYHTKLKGTLALTDSYLTEKALQKEKGLYKFIWVRSGSITVEIDHQEMMLTKDEVIS LTHLQHLEFKSIDGEYLTLLFNSNFYCIYGNDHEVSCSGFLFNGSSHLIRFTLNEKERKE LDTITEALENEFTVSDSLQEEMLRILLKRFIIQCTRIARHRMNITREKESGFEIVRQYYN LVDEHYRTKKQVQDYADMLHKSPKTLSNIFSTCKLPSPLRVIHERVEAEAKRLLLYSNKS AKEIADILGFEDQASFSRFFKNMTGQSAVQFRNTQEGKN >gi|225935346|gb|ACGA01000046.1| GENE 262 368379 - 368861 392 160 aa, chain - ## HITS:1 COG:SP0844 KEGG:ns NR:ns ## COG: SP0844 COG0295 # Protein_GI_number: 15900731 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Streptococcus pneumoniae TIGR4 # 21 153 2 125 129 92 39.0 4e-19 MRDLTITAIIKVYQYDELNEADRALMQTAMEATARSYSPYSHFSVGAAALLGNGAVVTGT 
NQENAAYPSGLCAERTTLFYANSQYPDQPIVTLAIAARTEKDFIDQPIPPCGACRQVILE TEKRYKQPIRILLYGKECIYEVKSIGDLLPLSFDASAMED >gi|225935346|gb|ACGA01000046.1| GENE 263 368972 - 369877 927 301 aa, chain + ## HITS:1 COG:lin1064_1 KEGG:ns NR:ns ## COG: lin1064_1 COG1705 # Protein_GI_number: 16800133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 33 171 54 201 201 87 41.0 2e-17 MENKLYRLIFLTIVFFFAVGVQAQKRNARYIEYINKYSDLAVEQMKLHKIPASITLAQGL LESGAGYSQLARKSNNHFGIKCGGSWRGRSVRHDDDARNECFRAYKHPRDSYEDHSDFLR RGARYAFLFKLDITDYKGWARGLKKAGYATDPSYANRLITIIEDYDLYKYDRKGVYSERK LKKNPWLMNPHQVYIANDIAYVVARNGDTFKDLGDEFDISWKKLVKYNDLQRDYTLVEGD IIYLKSKKKKASKPYTVYIVKDGDSMHGISQKYGIRLKNLYKMNRKDGEYVPEIGDRLRL R >gi|225935346|gb|ACGA01000046.1| GENE 264 370073 - 371449 1325 458 aa, chain + ## HITS:1 COG:all2964 KEGG:ns NR:ns ## COG: all2964 COG1252 # Protein_GI_number: 17230456 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Nostoc sp. 
PCC 7120 # 11 423 5 424 442 268 35.0 2e-71 MSLNIAKSNKKRVVIVGGGFGGLKLANKLKKSGFQVVLIDKNNYHQFPPLIYQVASAGME PTSISFPFRKIFQRRKDFYFRMAEVRAIFPEKNMIQTSIGKAEYDYLVLAAGTTTNFFGN KHIEEEAMPMKNVSEAMGLRNALLANLERALTCSTKQEQQELMNIVIVGGGATGIEVAGI LSEMKKFVLPKDYPDMPSSLMHIYLIEAGPRLLAGMSEDSSAHAEQFLREMGVNILLNKR VVDYRDHKVVLEDGTEIATRTFIWVSGVTGVTIGNMDPSLIGRGGRIKVDSFNRVEGMSN VFAIGDQCIQVADEDYPNGHPQLAQVAIQQGELLAKNLIRMEKGREMKPFHYRNLGSMAT VGRNRAVAEFSKVKMQGWFAWVMWLVVHLRSILGVRNKVIVLLNWVWNYFTYDQSMRMIV YARKAKEIRDREKVEETTHWGKELIQEEKNPSTTSSSK >gi|225935346|gb|ACGA01000046.1| GENE 265 371631 - 372878 1290 415 aa, chain + ## HITS:1 COG:no KEGG:BT_1536 NR:ns ## KEGG: BT_1536 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 415 1 415 415 658 91.0 0 MDREIPKEVRQKERNKKIIRYSSIGVASIIAIGVLISVLRAGVEAKDLVFSTVDKGVIEV SVSASGKVVPAFEEIINSPINSRIIEVYKKGGDSVDVGTPILKLDLQSTEIEYKKLLDEE QMRQYKLDQLRVNNQTKLSDMAMQIKVSAMKLSRMKVELRNEHYLDSLGAGTTDKVRQAE LSYNVAQLEYEQLQQQYKNEKEVAAAELKVQELDFNIFRKSLSEKKRTLDDAQIRSPRKA ILTYINNQIGAQISEGGQVAIISDLSHFKVEGEIADTYGDRVAAGGKAIVKIGSDKLEGT VSSVTPLSKNGVISFTVQLKEDNHRRLRSGLKTDVYVMNAVKEDVMRIANGSFYVGRGEY ELFVCNSDNELVKRKIQLGDSNFEYVEVLSGLQPGDKVVVSDMSAYKNKNKLKIK >gi|225935346|gb|ACGA01000046.1| GENE 266 372896 - 373561 297 221 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 214 1 211 245 119 34 3e-25 MITLNSLSKIYRTDEIETVALENVNLTVERGEFLSIMGPSGCGKSTLLNIMGLLDAPTMG MVEINGIRTEGMKDKELAVFRNKTLGFVFQSFHLINSLNVMDNVELPLLYRRMAGSERKR LAQEVLEKVGLSHRMNHFPTQLSGGQCQRVAIARAIIGNPEIILADEPTGNLDSKMGAEV MELLHRLNKEDGRTIVMVTHNEEQAKQTSRTIRFFDGRQVQ >gi|225935346|gb|ACGA01000046.1| GENE 267 373629 - 374255 562 208 aa, chain + ## HITS:1 COG:no KEGG:BT_1534 NR:ns ## KEGG: BT_1534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 288 73.0 1e-76 MTKNIFIVAIAVLLSVAFCSLSAQNVSKDYNVDEFSAINLQSVGNIIFTQSAGCSCRLEG 
PSEFVEKTRVTVKNGTLVISYKDRNARNIKNLTFYITAPDLSKVKIDGVGNFDAKEELKL KNIAFELDGVGNCNVKSLYCDELKLDVDGVGNMKLNVDCSTIKAKVDGVGNITVSGKADA AFFKRDGVGKINSKNLKCKDVTKKGWNF >gi|225935346|gb|ACGA01000046.1| GENE 268 374312 - 375589 684 425 aa, chain + ## HITS:1 COG:YPO1365_2 KEGG:ns NR:ns ## COG: YPO1365_2 COG0577 # Protein_GI_number: 16121645 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Yersinia pestis # 101 422 98 392 395 83 24.0 9e-16 MWKQYIKQALYQLKENRLISIVCMVGTALAICMIMVIVLTLQIRIKDCVPEVNRSRSLYV KAMTSRHKEQHNNAASSQMSVTTARECFKTLTIPEAVTITSVNNKMRASLPGGGRMSVDE METDEAFWQVFCFEFLCGKPYTKADFESGLSKAVVSASVARRLFGTTDVVGRTIQLNKAD YTITGVVKDVSKLATASYAQVWIPYTSTDLASFSWRENLMGPMRAVILARSSDDFPAIRA EVEKYRQVYNSKLKDMELIYRGQPDTQFAYLYRHWGTDLDMKHIVRRFILIIVILLLVPA INLSSMTLSRMRKRMTEIGVRKAFGATANELLRQVFWENLILTLLAGVLGLILSYSATFL LNSFLFDNSENAGLAGETSLSTDMLFSPLTFLVAFCFCLLLNLLSAGIPAWRVSRMNIVD AINQR >gi|225935346|gb|ACGA01000046.1| GENE 269 375594 - 376859 1021 421 aa, chain + ## HITS:1 COG:no KEGG:BVU_2350 NR:ns ## KEGG: BVU_2350 # Name: not_defined # Def: ABC transporter permease # Organism: B.vulgatus # Pathway: not_defined # 2 421 3 423 423 494 56.0 1e-138 MKQLLKQIYNERRSNAFLWIELLLVFVVLWYIVDLVYVTLHIYYQPMGFNIENTYVLRMN RLTDKSTDFNPELTVKDDMTALREIAGRLSRHPEVESVCISQNSIPYNEGCSGASFRFPD NDTVWISTMDRWTTPEYYKVFRFRNIDGSGHESLVKALEKNTIIVPVDVADYYPDATFHG KDLLGKEVRMSDTNFRIAALTEPVRYDHYNIAGKPFTGTYIGTYLSDEQMETVENVAYLE LSLRVREGVDDGFVERLMNDADRIYQVGNVYILDVTPLSKVRTASEIDNDNELRTQFCIL FFLLLNIFLGVIGTFWFRTQQRRGEVALRMAMGANRKNIFYRLITEGLLLLSMSALPAVL IAFNIGYTELVDISQMAFTVPRFLIAILLTYLLMAIMIILGVLYPALQSMKVQPAEALRD E >gi|225935346|gb|ACGA01000046.1| GENE 270 376892 - 378181 1045 429 aa, chain + ## HITS:1 COG:YPO1365_2 KEGG:ns NR:ns ## COG: YPO1365_2 COG0577 # Protein_GI_number: 16121645 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Yersinia pestis # 118 426 113 392 395 72 24.0 1e-12 
MIKLYFKQAFHLLGENKLLSSISIIGTALAIAMIMVIVITLRATIAPFAPETHRDRMLIF RFAGLQSKSNVNWQSNGPIGYNTAKACFKAMTIPEVVSITNIWQETMLAAKPAGEMESCS VLQTDDAFWKIFEFEFLSGKPYDNADFDAGAAKAVISEDMARRLFGTSEVVGKTFLLNHS AYIVCGVVRPVSKLAKYAYAQVWIPLSSTSAFTATWGDDNIMGMTAVYILAKSKDDFPAI RQEADRLRAIFMAGHPNFDLLYRGQPDTYFVAAQRYSANNPPAVKEAVRQYILTLLVLLI VPAVNLSGLTLSRMRKRISEIGVRKAFGAPRRELMMQVLSENMLYSLFGGILGLVLSYVA AFLLGGMLFSVDFVSNGVEDLRTMCVDLLFDPTVFLLAFLACFLLNLLSAAIPAWRVTRT NIVDAINER >gi|225935346|gb|ACGA01000046.1| GENE 271 378211 - 379464 835 417 aa, chain + ## HITS:1 COG:no KEGG:BT_1532 NR:ns ## KEGG: BT_1532 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 417 1 417 417 635 75.0 1e-180 MQLLKQIWNERRSNGWLWSELLIVFVVLWYVVDWTYVTARTYYEPVGFDITDTYYLELSL KNDKSNSYLSKEQKSTSLGQDIIELTNRLRRLPEVEAVSISNNARPYIGSNSGSMLRIDT LVSNPLRRSVTPDFFQVFRYQSADGRGYQPLVQALRNGNVVVGENFLPKDYKGDRTLLGK EMVDVDDSTKVYKIGGVSKKVRYNDFWPNYSDRYVAIELPEKIMVELDDELYPSSVEVCL RVKPGTSRDFAEHLMKLSANQLSVGNLFILKVHDYEDLRNDFQQGSYNQVQVRFWMMGFL LLNILLGIVGTFWFRTQHRRAESALRIAVGASRMQLWQRLNKEGLLLLTLAALPAAVICY NIGHLELTEGYMEWGVVRFLITFVITYFLMSLMILVGIWFPTRQVIRIHPAEALREE >gi|225935346|gb|ACGA01000046.1| GENE 272 379549 - 382158 1694 869 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 612 865 52 311 328 167 40.0 6e-41 MNRRYYCLLIVLFFAVLIQAAAAERTYNILFIQSYTSQTPWHSDLNQGLVKGFKESGLKV NITTEYLDADFWAFNSEKVIMRRFCQRARDRQTDLIITASDEAFYTLFACGDSLPLQIPV VFFGIKYPDMELIATHPNVCGFTANPDFDVILRQAQKIFPQRKEVVCVIDNSFLSNKGLE DFEEEWKIFQKDNPDYRMKVYNTQNHTTSHIIAAICYPRNSYERLVVAPKWSPFLSFVGK NSKAPVFSSQNVGLTNGVFCAYDSDSYASALSAAQRAALVLKGTSPQEIGVTEITQGFIY DYKQLDYFHIDPDKVSSSGTIVNEPYWEKYKYLFILLYPSILALLIASIVWLMRANRRES KRRIQAQTRLLVQNKLVEQRNEFDNVFHSIRDGVITYDTDLHIHFTNRSLLQMLHLPYES GGRFYEGMMAGSIFKIYYNGQDILHSMLKQVASKGESVKIPQGAFMKEVHSDKYFPVSGE 
IVPIRSKDTITGMALSARNISNEEMQKRFFDMAVDESSIYPWQFDMETNCFIFPQGFLKR LGYDESVTTISRDEMDRTIHPDDLKEISPLFNRALTGEDSNTRLNFRQRNVNGEYEWWEY RSSVITGLTQDSLYNILGVCQSIQRYKTAEQEMREARDKALQADKLKSAFLANMSHEIRT PLNAIVGFSDLLSDTSGFTSEEIAQFIGTINKNCGLLLALINDILDLSRIESGTMEFMFA EHNLPLLLKTVHDSQRLNMPPGVELVLRMPESDKKYLTTDNVRLQQVVNNLINNAAKFTS SGFITFGYEDDEVPGYTRIFVEDTGVGISEEGIRHIFERFYKVDNFTQGAGLGLSICQTI IERLNGTISVTSEVGKGTRFTVRLPNYCE >gi|225935346|gb|ACGA01000046.1| GENE 273 382288 - 383766 1430 492 aa, chain + ## HITS:1 COG:no KEGG:BT_1530 NR:ns ## KEGG: BT_1530 # Name: not_defined # Def: putative outer membrane protein OprM precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 492 1 492 492 823 89.0 0 MKKKSILFALAAVCLPLALSAQKEREITLNEAIAMARAQSVDAAVALNELKTAYWEYRTF RADLLPEVNLTGTLPNYNKSYSSYQNSDGSYGFVRNNTLGLTGDLSIDQNIWLTGGKLSL TSSLDYIKQLGAGGDRHFMSVPVTLQLTQPIFGVNNIKWNRRIEPVRYAEAKAAFITATE EVTMRAITYYFNLLLARENLGTAKQNLTNADHLYEVALAKRKMGQISENELLQLKLSALN AKAALTEAESDLNAKMFQLRAFLGVGEDEILSPVLPEAVDRPKMEYNLVLNKALERNSFA QNIRRRQLEADYEVATARGNLRSVDLFASVGYTGENRNFPAVYRNLQDNQIVQVGVKIPI LDWGKRRGKVRVAKSNRDVVLSKIRQEQINFNQDIFLLVEHFNNQAQQLDIAKEADAIAQ QRYKTSIETFLIGKINTLDLNDAQNSKDEARQKHISELYYYWYYYYQIRSLTLWDFRSNS ELEADFDEIIRQ >gi|225935346|gb|ACGA01000046.1| GENE 274 383860 - 385209 1457 449 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 2 447 7 456 461 340 42.0 4e-93 MILIIDDDSAVRSSLSFMLKRAGYEAQTVPGPREAMEVVRSVAPDLILMDMNFTLSTTGE EGLTLLKQVKIFRPEVPVILMTAWGSIQLAVQGMQAGAFDFITKPWNNAALLQRIETALE LSGTSQETTQEQSESFDRSHIIGRSQGLMDVLNTIARIAKTNASVLITGESGTGKELIAE AIHINSQRAKYPFVKVNLGGISQSLFESEMFGHKKGAFTDASADRIGRFEMANKGTIFLD EIGDLDPSCQVKLLRVLQDQTFEVLGDSRPRKTDIRVVSATNADLRKMVGERTFREDLFY RINLITVKLPALRERREDIPLLARHFADRQAVTNGLPRTEFAADALQFLSRLPYPGNIRE LKNLVERTILVSGKPLLDASDFDAQYIRHDDARVAEGAALAGMTLDEIERQTILQALERC 
KGNLSQVATALGISRAALYRRLEKYNITV >gi|225935346|gb|ACGA01000046.1| GENE 275 385236 - 386528 1095 430 aa, chain + ## HITS:1 COG:BH1920 KEGG:ns NR:ns ## COG: BH1920 COG0642 # Protein_GI_number: 15614483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 104 430 188 537 548 103 27.0 6e-22 MRIKGLFYILVILLLALGSILLYLSSQMNALFFYIGEGLVLFILLYLTFFYRKIVKPLNT IGSGMELLREQDFSSRLSPVGQYEADRVVNIFNRMMEQLKNERLRLREQNNFLDLLIKAS PMGVIITSLDEDLAELNPMALKMLGVRLEDVQGKKMKEIDSPLAVELSSLPKGETVTVRL NDSNIYRCTHSSFIDRGFQHPFYLVETLTDEVMKAEKKAYEKVIRMIAHEVNNTTAGITS TLDTVEQALSSEEGMDDICDVMRVCTDRCFSMSRFITRFADVVKIPEPTLSSVNLNDLVF TCKRFMEGMCNDRHITLRMEIDESLKDVMLDAALFEQVLVNIIKNAAESIEADGEIIVRT ISPATVEVIDNGQGISKETEAKLFSPFFSTKPNGQGIGLIFIREVLMRHGCTFSLRTYAD GLTRFRIIFP >gi|225935346|gb|ACGA01000046.1| GENE 276 386626 - 389178 2310 850 aa, chain + ## HITS:1 COG:no KEGG:BT_1527 NR:ns ## KEGG: BT_1527 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 850 1 850 850 1589 92.0 0 MRLMKKGTLTLFLLLAISIPLSAQYVVQGVVTDSLTKEPLPYASVRLKDTTEGTTTGSDG RFYFKTNRSEAVLVISVIGYNDYIRSIRPARNASYKVTLSPTEYALGEVVVKPKREHYRK KDNPAVEFVRRMIESRDNHSPYEKDFWQRERYEKTTFALNNFDEEKQKKWLYRKFDFLTE YVDTSAVTGKPILTVSARELLATDYYRKSPRSEKQWVKGRKQAGVDEFLSKQGMQAAINE VFKDVDIYENNISLFTNKFVSPLSRIGTGFYKYYLMDTLQIAGEPCVDLAFTPFNSESFG FNGHLYVVLDSTYFVKRAVFNFPKKINLNFVDYMLLEQEFKRAEDGTRLLDHESITVEFK LTEGQDGIFARRVADYSNYTFTPTAEADKAFTKPERIIEETEALSRPETFWAENRPQAAI SQQENSVDRLMTQLRSYPVYYWTEKVLSILFTGYIPTSKEAPLFYIGPMNATISGNTLEG PRIRAGGMTTAWLNPHLFGKGYVAYGFKDERVKGLAELEYSFKKKKEYANEFPIHSLKLR YELDVNQYGQNYLYTSKDNVFLALKREKDDRIGYFRQAEMTYINEFYSGFSFQLTARTRK DESSYLIPFLKKEGDAYTPVKDFSISAAELKLRYAPNEKFFQTQWNRFPVSLDAPVFTLS HTLAGKGVLGSDYTYNHTEAGIQKRFWFSAFGYTDIILKAGKVWDKVPFPLLIMPNANLS YTIQPESYSLMNAMEFMNDEYFSWDVTYFLNGWLFNRVPLLKKLKWREIVSCRGLYGHLS DKNNPALSDGLFAFPIENTQTMGKTPYVEAGVGIENIFKVLRLDYVWRLTYRDSPGIDKS GLRISLHMTF >gi|225935346|gb|ACGA01000046.1| GENE 277 389287 - 390576 1426 429 aa, 
chain + ## HITS:1 COG:YJL153c KEGG:ns NR:ns ## COG: YJL153c COG1260 # Protein_GI_number: 6322308 # Func_class: I Lipid transport and metabolism # Function: Myo-inositol-1-phosphate synthase # Organism: Saccharomyces cerevisiae # 11 421 87 541 555 184 29.0 3e-46 MKQEIKPATGRLGVLVVGVGGAVATTMIVGTLASRKGLAKPIGSITQLATMRMENNEEKL IKDVVPLTNLEDIVFGGWDIFPDNAYEAAMYAEVLKEKDLNGVKEELEAIKPMPAAFDHN WAKRLNGTHIKKAATRWEMVEQLRQDIRDFKAANNCERVVVLWAASTEIYIPLSDEHMSL AALEKAMKENNTEVISPSMCYAYAAIAEDAPFVMGAPNLCVDTPAMWEFSKQKNVPISGK DFKSGQTLMKTVLAPMFKTRMLGVNGWFSTNILGNRDGEVLDDPDNFKTKEVSKLSVIDT IFEPEKYPDLYGDVYHKVRINYYPPRKDNKEAWDNIDIFGWMGYPMEIKVNFLCRDSILA APIALDLVLFSDLAMRAGMCGIQTWLSFFCKSPMHDFEHQPEHDLFTQWRMVKQTLRNMI GEKEPDYLA >gi|225935346|gb|ACGA01000046.1| GENE 278 390739 - 391218 494 159 aa, chain + ## HITS:1 COG:STM0420 KEGG:ns NR:ns ## COG: STM0420 COG1267 # Protein_GI_number: 16763800 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Salmonella typhimurium LT2 # 10 159 23 169 171 81 39.0 6e-16 MKRPPFLPVFIGTGFGSGFSPFAPGTAGALLASIIWIAFYFLLPFPVVLWLTAALVIVFT FAGIWAANKLETYWGEDPSRVVVDEMVGVWIPLLAVPNDDRWFWYVIAAFALFRIFDIAK PLGIRRMESLKGGVGVMMDDVLAGVYSFILLVGARWVIG >gi|225935346|gb|ACGA01000046.1| GENE 279 391233 - 391700 377 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1524 NR:ns ## KEGG: BT_1524 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 154 154 252 90.0 2e-66 MISQKGGIFMFLRAQLSAQMATIADFLVTILLVRLFEVYYVYATLAGAIYGGIVNCVINY KWTFKSKGKKTNVAAKFILVWICSVWLNTWGTYALTESLAKIPWVRDTLSLYFGDFFIIP KVVVAIIVALFWNYNMQRVFVYRNIDIRSLFGKRN >gi|225935346|gb|ACGA01000046.1| GENE 280 391774 - 392424 764 216 aa, chain + ## HITS:1 COG:MT2687 KEGG:ns NR:ns ## COG: MT2687 COG0558 # Protein_GI_number: 15842152 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 13 170 14 163 217 88 32.0 8e-18 
MNYRDYLQQLIYKIINPLIRGMIKIGITPNFITTTGFILNVVAAGMFVYAGIYGGQNDLA IIGWAGGVILFAGLFDMMDGRVARLGNMSSKFGALYDSVLDRYSELMTFFGICYYLSMKD YFFYALIAFIALIGSLMVSYVRARAEGLGIECKVGFMQRPERVVLTSLGALFCGVFKDIT AFEPILIMIVPLAFVAVFANITAFARVRHCYKAMKE >gi|225935346|gb|ACGA01000046.1| GENE 281 392477 - 393394 672 305 aa, chain + ## HITS:1 COG:no KEGG:BT_1522 NR:ns ## KEGG: BT_1522 # Name: not_defined # Def: putative aureobasidin A resistance protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 305 305 512 91.0 1e-144 MPSKREALTVTVIMALFLLLTGIFIGLRSEHLLMAVLYLVLFFAGLPTRKLAVALLPFAI FGISYDWMRICPNYEVNPIDVAGLYNLEKSLFGVMDNGLLVTPCEYFAAHNWPIADVFAG IFYLCWVPVPILFGLCLYFKKERKTYLRFALVFLFVNLIGFAGYYIHPAAPPWYAINYGF EPILNTSGNVAGLGRFDTFFGVTIFDSIYGRNANVFAAVPSLHAAYMVVALVYAIIGKCR WYVITLFSIIMVGIWGTAIYSCHHYIIDVLLGISCALIGWLVFEYVLMKIPAFKRFFERY YAYIK >gi|225935346|gb|ACGA01000046.1| GENE 282 393442 - 394452 632 336 aa, chain + ## HITS:1 COG:no KEGG:BT_1521 NR:ns ## KEGG: BT_1521 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 328 1 309 309 494 88.0 1e-138 MKNKYRNIFLAFGIIAVLIMIFTFDMDYQELWMNLKRAGIYLPLVLLLWLFVYLINTVSW YIIIRSSGKTGFSFARLYKFTVTGFALNYVTPVGLMGGEPYRIMELKPYIGIERATSSVI LYVMMHIFSHFCFWLSSVLLYVCLYPVGWAMGIILGAITLFCLLIVILFIKGYRHGMAVA FVRMGGRIPFLKKKVLHFASAHKEKLENIDKQIALLHQQKKQTFYSALLLEYTARIVSCL EIWLILNVLTTNVSFADCCLIAAFSSLLANLLFFLPMQLGGREGGFALAVGGLSLSGAYG VYAALITRVREMVWIVIGLALMKVGNFKQSKEKYSA >gi|225935346|gb|ACGA01000046.1| GENE 283 394515 - 395687 1192 390 aa, chain + ## HITS:1 COG:BH3344 KEGG:ns NR:ns ## COG: BH3344 COG1979 # Protein_GI_number: 15615906 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Bacillus halodurans # 1 390 1 387 387 305 40.0 9e-83 MNNFIFYSPTEFVFGRDTEAQTGALVQKYGARKVMIVYGGGSVIRSGLLARVENSLQEAG IPYCMLGGVQPNPIDTKVYEGIDLCRKENVDMMLAVGGGSVIDTAKAIAAGVPYNGDFWD FYIGKAIVTKALKVAVVLTIPAAGSEGSGNTVITKVDGLQKLSLRAPGVLRPVFAVMNPE 
LTYTLPPFQTGCGIADMMAHIMERYFTNTKDVEIGDRLCEGTLLAIIKEATTVMKEPENY GARANLMWSGTIAHNGTCGVGCEEDWASHFLEHEISAIYNVTHGAGLSVIFPAWMTWMTE HNVDKIAQYAIRVWGVAESDDKKAVALEGISRLKSFFTSIGLPVTFKELGIENPDIDRLA DSLHRNKGELVGNYVKLTKQDSKEIYRLAL >gi|225935346|gb|ACGA01000046.1| GENE 284 395908 - 399045 2770 1045 aa, chain + ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 48 1045 112 1116 1116 846 44.0 0 MQKLNSGALNRILLFVYILSLSTNAIAQNKNNSKETYLLPPHGNYVYGRVIEKLSKEPMV GVTIRLDGHSTGVITDINGCYVLTLPEKGGLVIYSYIGFETRKIKVTSRQKVDVQMVEAT ESIQEVIVTGYNSIQKESFTGNTTKIEKEDLLKVNPNNLISAIQTFDPSFRIQENLAAGS DPNSLPQFVLRGQTGIGETTLGQTSTSSISREVLSGNSNLPIFILDGFEVDVEKIYDLDM NSIHSINILKDAAATAMYGSRAANGVIVIERRAPEAGKFRVQYSGVLSAELPDLSSYNLM NAREKLETERLAGLYDSNTPEIDPYTNGYYQRLNNVLTGVDTYWLSQGLRTALNHKHSVF IDGGENDVRWGVELGFRGTEGVMKHSSRKNANAAFYVDYRIGGLQIKNKVTYTYNKSTDV PFNSFSDYSHLLPYMRLYDENGDYVRRLEKFDGASGTQVNPLYEINFYNSFDHSGYDEVT DDLSLNWRITDGLRLRGQFSVLMRNSTGDLYKDPASASYSASTGNINGEKTESTQKRTVI DGSLSLMYNNTFKGHNLNICLSSNMRQTQSTASETRYRGFPGGDLVSSNYAAEVYGKPSS SDNTTRLVGALLTSNYTYNNIYLADLTGRIDGSSEFGSDKRWSMFWSTGAGINIHNYDFM KSNELFSMLKFRASYGLTGKTNFSLYSAKDMYQLQTDSWYPTGYGVFLYQMGNPNLKWER KYTLDYGVEIGLWHDKIYLKASAYDERTIDLITDYTIPSSTGFTSYKENMGKVKNTGVEL ELRARLYSDRNWLFQLYGSFARNKNTIVEISQAMRDYNKRVEELFSGYNPESSSDSKYAK TYLKYYEGASLTSIYGMKSLGISPTNGKEIYLRRNGDVTDVWSADEWTIIGDTAPKGQGS FGYTLSYKQLSMFASFLYTFGGDAYNNTLVSYVENADIKNDNVDKRVLLDRWQKPGDITT MKDIRDRNVTTGASSRFVQKNNTLQWSSLTMSYNFRPEQLKKLHLSGLRLSFTMNDLFYW STIRQERGLDYPYSRSFNLTTNIIF >gi|225935346|gb|ACGA01000046.1| GENE 285 399054 - 400613 1340 519 aa, chain + ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 4 476 488 241 33.0 6e-62 MKKLISHILIALTGMLAVSCNAWLDVTPENAIADDDLFSTGFGYRNALNGIYTNLASDEL YGKQLSWGFLSAISQQYNQKAGTISPMYADASELIYNTVDTEPVVTAIWEKGYKVIANLN 
KLIENIRPTDISLFEYGEEEKNLIYAEALSLRAMMHFDLLRLFAPATATNPSGAYLPYRD KYEAAVVEKCTVTDFIEKVLKDLLEAEDILRKFDTEYHPEAMYASQMYEPTPEWNARYRF NSGSYIDDMGAFFWYRGIRFNYLALLGLKARVCIYAGPAYYKNAETAAKELYNTYYQQKR WIGFTEGENITCNLNSRYTKVSHDILFGLYKKQLATDYEQAVWGSSSSSSTTRLPLANIP SLFASDNTGVYTDYRLTYLIGTTNETQSKYYTLKYNPSVESVVEAMENPMIPVIRFSEIC HILAEISSYNGKITEGINYLETVRKARGAERTLSLTVSTREQLDAEILLDIRKEMIGEGG TFYTYKRMNLSTVPDSDEEGEINMTGSYVLPLPTSETTN >gi|225935346|gb|ACGA01000046.1| GENE 286 400626 - 401351 593 241 aa, chain + ## HITS:1 COG:no KEGG:BT_3273 NR:ns ## KEGG: BT_3273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 184 4 186 236 63 28.0 5e-09 MKRILTYTTFCLLLVFASCSEEQLEVYHGDNYVYFTYMNDKSPQKITFNFATDAPLLREG TVKVKMTLLGYLLEENATCDISAVGEKSTARSGIDYAPLTSGIFHKGLAEDTYEVTVYRN EALLNTEYTLTLSLDAVENCLVGPAEYQYVTIQVTDRISCPVWWSQSSAANLGVYSDMKY RVFIIFMDGEILESLDKYTGIEFVNLIADFKAWWKDQWQQGNYQYYDTDGVTPLYETILD N >gi|225935346|gb|ACGA01000046.1| GENE 287 401361 - 403001 1340 546 aa, chain + ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 469 3 454 534 69 23.0 3e-10 MKRKTILFISILQLLVITGCYDDLGNYGYTNSTKVSFVNSASSDYTFMVGDLFEMDAPVN FSTDIGNVDELFTVQWYLNRELIYTGYHLKYQFEKGGTYELILKVINKETNETYISNKYT LTGKNSFDWGWMILSDKGDGKSALSFINPAFRVTHNVESTIEGGLGTDPQGIYYYYVLGS ISGSYVSGLPKVLINQGSGSVTLDGNSLQKDMWLADEFENRKEPDDLKIMDLAFKEEYYV ICSEQGEVYIRAVGTDNKAIPYYGKYGAMPYEFEGGSRITCFAPFHNVTYWCADEERCIL YDEQNARFIGIAHYSQWGAVYTPAIVYFKTYDQDLEVPSGVLRVNDMGAGTRCLAIGAYE KIDVASNGGLTFWANYVSLIDVQGTGNYNLHEFAVKGMDNNSHLITGTDQYGFSGGSLLT PQSIIKMSSNFEKNPYFYFTDGDKNLYVYSMQMRNHVLAYTADSRITGISGSPIVCEFYG YGGNSTDPNFRLALSQENGNITVIDVNKSQMVRLFEGFSPDLKLQTFTGFGNVKGMVWCT NYEGEY >gi|225935346|gb|ACGA01000046.1| GENE 288 403054 - 404025 952 323 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173639|ref|ZP_05760051.1| ## NR: gi|260173639|ref|ZP_05760051.1| hypothetical protein BacD2_17325 [Bacteroides sp. 
D2] # 1 323 1 323 323 562 100.0 1e-158 MKNRLFYFATCVALALSSVSCSSDDDGEEDIDIPVVGKIAVSGAYNVYSHGSQGWIEADK IGIYVLSDGKPQENLPYAPSEVAKATMMEYEGKTYITYDKEDYVTDEVTLNPSSELSAGF KSGDHTIYAYTPYSAASQDYKTVALPNISVQEYYASEFMPNRKYSFAYASATTSSYSAAT VTLGEFKSLFSQLTLPALECPDALAGKTCTKIVVTCDEHPLSYADGATVNLATGEISGTP LNSVTYNIPDGLVVNAGFPAFGLPASLETAYMMVAVPFEEGMNYTYKFTLTIGGQEYTTS GKPKTGFWSTDNNLNMKDIAGIE >gi|225935346|gb|ACGA01000046.1| GENE 289 404068 - 406635 2514 855 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5142 NR:ns ## KEGG: Cpin_5142 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 43 848 37 825 837 831 52.0 0 MKKLVSILMILAMIIPLAGAQSTDIFKKKKKKKSKTEAVDKAKADSIAKSKKDALQPYAK VITGKAKTMDGFFKVHYVDGKYFFEIADSLFGRDILIVNRVVKAPVDAQKRKVGYPGDYI SDEVIRFEKGRGDKLFVREISYLEHSADTLGMYQAVLNSNVQPIVATFPLKTVRKEGETT NYVIDMTDYIRKDNEMFSFTSRVKDNIGASSMVDDASYIDTLKAFPQNIEIRTVRTFQRK KGGGSGFEKLLAAFFATSTTPLTYELNSSMLLLPKEPMKPRLHDDRVGYFAVSYKDFDEN PQGVKYKANITRWRLEPKDEDREKYLRGELVEPKKPIIIYIDPATPKKWVPYLIQGVNDW QAAFEKAGFKNAIFGKEAPTDDPTWSLEDARHSAIVYKPSDIPNASGPHVHDPRSGEILE THINWYHNVMSLLYNWYIVQAGAIDPGARKPMFDDELMGELVRFVSSHEVGHTLGLRHNF GSSNTVPVEKLRDKAWVEANGHTPSIMDYARFNYVAQPEDNVSRSGIFPRIGMYDKWAIE WGYRWMPEYETAEAEIPHLNKWIIEKLREDKRYTFGTELDRNDPRNQSEDLGDDAMLASS YGIKNLKRVMPEIMNWTYEPNEGYMKAVRLYQNVVGQFDLYMGHVATNVAGIYHNPISVE QTDMKAVEYVPKDIQKKAVDFLNKELFTTPTWLMDDKLSERTGINTFNSIYRVQSSTLKQ LLSSRTLDKMTVNELVNGAKAYTANDLFRDLKKSIWSDMQGGKKPDASQRSLQKTYVNAL IGMLDKPKNSSGSLGSLGGYSLVAFDFPSEAPTIARGQLTDLRRDLTNAANASSGIYRSH YLNLKALIDAAFDVK >gi|225935346|gb|ACGA01000046.1| GENE 290 406911 - 408245 846 444 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 3 441 24 457 470 171 29.0 2e-42 MFKRDILFHLEKWKDDTHRKPLILRGARQVGKTTVVNEFGKQFDNYLYLNLEKREAASLF ELNVSLKDLMPLFFAHCGKIRNEGTILLFIDEIQNSAKAVALLRYFYEELPDIYVIAAGS 
LLENLIDVRVSFPVGRVQYMALRPCSFREFLGAVGEEPLLSVLDKPEITLAFHDRLMLLF NIYTLIGGMPEVVQLYAERRDILSLESTYETLLQGYRDDVEKYALGKQLPEVIRFILKEG WHKAGQIITLGGFAGSSYNAREVGEAFRLLEKAMLLELVYPTTATEVPATPEIKRMPKLV WLDTGLVNYAAQVQKEVLGAKDIMDAWRGMIAEQIVAQELLTLTDKVSQKRCFWVRNKSG SNAEVDYVWGQDSMVYPIEVKSGHNAHLRSLHSFMNHSMQTVAVRIWSQPYAVDEVKTAD GKEFKLINLPFYLVGKLDSILRRF >gi|225935346|gb|ACGA01000046.1| GENE 291 408393 - 408842 266 149 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 141 2 97 116 95 45.0 2e-20 MRTINLIVIHCSATREDKSFTEYDLDICHRRRGFNGTGYHFYIRKNGDIKSTRPIEKVGA HCRSFNKESIGICYEGGLDCMGQPKDTRTCWQKHSLRVLILTLLKEYPDCRICGHRDLSP DLNGNGEIEPEEWIKACPCFNAEKDWDKV >gi|225935346|gb|ACGA01000046.1| GENE 292 408861 - 409079 310 72 aa, chain - ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 25 96 101 99 84.0 3e-20 MAQEVKEFSELVKDQYTFLMQQLEKVLKDYFDLSSKVKEMHTEIFSLRGQLAQAATLQCI HKECSQRSMAEA >gi|225935346|gb|ACGA01000046.1| GENE 293 409275 - 409781 539 168 aa, chain - ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 176 237 86.0 1e-61 MNVLVERYQRRKYVNQPDSQMLYYVRQKSGTVRVMDINKLADAIEANSSLTAGDVKHAIE AFVEQLRLSLTQGDKVKVDGLGTFHITLSSEGAEKEKDCTVRSIRKVNVRFVADKALQLV NTSHATTRGENNVDFILAAKGDGEGDDDGNSGSGGNSGGSGEAPDPAA >gi|225935346|gb|ACGA01000046.1| GENE 294 409969 - 411348 1169 459 aa, chain - ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 2 443 4 437 450 272 38.0 9e-73 MNTENRVSPQAPEIEEAIIGACLIEQGAIPLVADKLRPEMFYVLRHQVIYAAILAMYHAG 
IKIDILTVKEELSHRGKLEEAGGPFGITQLSSKVATSAHLEYHAQIVHEKYLRREMILGF NKLLTCSLDETMDIDDSLVDAHNLLDRLEGEFGHNNHMRDMDELMTATMVEAEGRIANNK NGVTGLPTGLADLDRMTSGLQKGELVVVAARPGIGKTAFALHMARSAAMAGYAVAVYSLE MQGERLADRWLTAVSEISARHWRSGTVSQQELVEARTTAADLKRLPIHVDDSTSVNMEHI RSSARLLQSQHACDAIIIDYLQLCDMTTGQNNRNREQEVAQATRKAKLLAKELNVPVVLL SQLNRESENRPAGRPELAHLRESGAIEQDADVVMLLYRPALARVTTDRESGYPTEGLGVV IIAKQRNGETGNVYFRHNPEMTKITEYVPPLEYMLKHAK >gi|225935346|gb|ACGA01000046.1| GENE 295 411353 - 411976 492 207 aa, chain - ## HITS:1 COG:no KEGG:BT_1515 NR:ns ## KEGG: BT_1515 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 210 212 240 60.0 2e-62 MKNFNFQAMLKHGFLIIPKALLQQQIEDRHMQEGEIEALLKILMKVNYSDTLYNDRQNKN CLCKRGESLFSYRDWSHIFHWSVGKAFRFIHELATLGIIEIISHPNNSSLHIRVVEYDKW MGVPDSDKQKKKAVNEKFHLFWNEFHSITQLPKENIAKAQREWKKLGDKEQQLAIDRIEE YYFHQTNINFLLHAASYLSNKAFLNEY >gi|225935346|gb|ACGA01000046.1| GENE 296 412438 - 412707 263 89 aa, chain - ## HITS:1 COG:no KEGG:BT_1514 NR:ns ## KEGG: BT_1514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 88 12 89 99 72 47.0 3e-12 MNNNEPAAEHYSINGFKYFSELAKEYFPDLANASSASKKMRKRIKADKTLNEQLAAAYYT CQTIDISPEMQLILYRHWGPPHIDLPTNV >gi|225935346|gb|ACGA01000046.1| GENE 297 412935 - 413852 693 305 aa, chain + ## HITS:1 COG:no KEGG:BT_1503 NR:ns ## KEGG: BT_1503 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 32 331 336 456 77.0 1e-127 MKQVAVDLQQSGNLGTAHVYRSSLNAILTFQGSGCLSFPEITPEWLKHFEGSLRARGCSW NTVSTYLRTLRAVYNRAVDLRKAPYVPHLFRSVYTGTRADRRRALDAEDMKKVFARLLQS DAVTPAMRGAQELFILMFLLRGLPFVDLAYLRKSDLRGNVISYRRRKTGRPLSVTLTSEA MFLLRKYMSREEQSPYLFPILHSDEGSPMAYREYQLALRNFNYQLELVGKLLGLKDRLSS YTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRSKKIDEANKQVLDFVKRSV MGASA >gi|225935346|gb|ACGA01000046.1| GENE 298 414225 - 417206 2800 993 aa, chain + ## HITS:1 COG:no KEGG:BT_1826 NR:ns ## KEGG: BT_1826 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 988 1 1033 1038 424 35.0 1e-117 MKKKFVRVMLFGALTLAVSTTVTSCKDYDDDIKNLQEQIDKVTSTNPVSTEDMKAAISSA IQTLQTQLQTAIDGKADSKAVQDLLKTVEALQTALENKADASTIKTLGDQITELSKQVNS IEGTLNETKKDLEAKVADLTEKLAGAASSEDLKKLANELAEAKNELKAVKDMADNNAAAI VEIQANILELQKLDGRITALETFNQNAASKDDLTAYVAHSELAGLVDNEVLELLKDNGSI AKYVNEVIESQVLAETSAINLAIKGVDDKLATLSTSFETYKSEQATAYQTVTSNITTLTT FKTTIEAALAGGGYENFAAVLTEISTIKASYGYCATKADFDNKVEAYLATYKSGVDGEFT ALKTRITALENQIQSVVYVPEYEDGKIIFMSYFYDNKLVAETKPIQMKFRISPATAAANF AENYAPSFDGQEIKTRTAEIYNIEKTEVDEATGIVTFTISTSTDKSFAVSLNLIAKDQSK NLTNISSNYFPVISDYRAITDVKVESPNKEIDYILYDKPLSVVDYATGAVLQITGKNRAG DDVADETMASSVNAEKFVVTYKVEGDDAASYTIENGVLKLKNYTADSNGKVAQPKATVTI TGTDFEKVTAFADVTAKAASTDPEVSPTISAVKFDGTKEQVADVTVSYGTTGGSTDIGIS QEVYEKLPAENFAFKAAEGVYLRFKDKTTTNELEIVVPKGTVKGTYVPEVKVKVSDVQNF TLKPSITVEVTGADYTLTYDDKIITGGSSLALTANLLPADKPTSMNFMMEIPSLFANYAT IVDNAKKVGATVQFSLKNAITGVSLADNVLKVDNTYSNPAPAKAIVVVAKVIAKNAAGED IELVTSKETTFTLADLSGEWTGPTDATVKIGSDKLDATAQLATGAIWKASNKEKMWEAGK EVKAGDKWGAAPLGIFGFVAPTFELDDATQAKYVTLDKNTGALTLTNTGKQLKAEVSIVV NIVAESRWGTINNYAAGKTVTVKIDPTELSDAI >gi|225935346|gb|ACGA01000046.1| GENE 299 417657 - 419213 1267 518 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 816 76.0 0 MSHKIYPIGIQNFEKIRNDGYFYIDKTALMYQMVKTGSYYFLSRPRRFGKSLLVSTLEAY FQGKKELFEGLAVEKLEKDWIKYPILHLDLNIEKYDTSESLDNILDKSLTAWEKLYGAEP SERSFSLRFAGIIERACKLAGQRVVILVDEYDKPMLQAIGNEELQKQFRNTLKPFYGALK TMDGCIKFAFLTGVTKFGKVSVFSDLNNLDDISMRKDYVEICGVSDQELHDTLDAELHEF ADVRGVTYDKLCAELKECYDGYHFTHNSIGMYNPFSLLNAFKYREFGSYWFETGTPTYLV KLLQKHHYDLERMTHEETDAQVLNSIDSESTNPIPVIYQSGYLTIKGYDEEFGMYRLGFP NREVEEGFVRFLLPYYANVNKVESPFEIQKFVREVRSGDYSSFFRRLQSFFADTTYEVIR DQELHYENVLFIVFKLVGFYTKVEYHTSEGRIDLVLQTDKFIYIMEFKLNGTAEEALQQI NDKHYALPFEMDGRKLFKIGVNFSAETRNIEKWIVEEK >gi|225935346|gb|ACGA01000046.1| GENE 300 419257 - 420831 1444 524 aa, chain - ## HITS:1 COG:CT808 KEGG:ns NR:ns ## COG: 
CT808 COG1530 # Protein_GI_number: 15605542 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Chlamydia trachomatis # 1 517 1 512 512 277 32.0 3e-74 MTNELVVDVQPKEVSIALLEDKSLVELQSEGRNISFSVGNMYLGRIKKLMPGLNACFVDV GYEKDAFLHYLDLGPQFNSLEKFVKQTLSDKKKLTSISKATLLPDLDKDGTVSNTLKVGQ EVVVQIVKEPISTKGPRLTSEISFAGRYLVLIPFNDKVSVSQKIKSSEERARLKQLLMSI KPKNFGVIVRTVAEGKRVAELDGELKVLVKHWEDAMVKVQKATKYPTLIYEETSRAVGLL RDLFNPSFENIHVNDEAVYNEIRDYVSLIAPDRANIVKLYKGQLPIYDNFGITKQIKSSF GKTVSYKSGAYLIIEHTEALHVVDVNSGNRTKNANGQEGNALEVNLGAADELARQLRLRD MGGIIVVDFIDMNEAENRQKLYERMCANMQKDRARHNILPLSKFGLMQITRQRVRPAMDV NTTETCPTCFGKGTIKSSILFTDTLESKIDYLVNKLKVKKFSLHVHPYVAAYINQGLVSL KRKWQMKYGFGIKIIPSQKLAFLQYVFYDTHGEEIDMKEEIEIK >gi|225935346|gb|ACGA01000046.1| GENE 301 421115 - 421390 209 91 aa, chain - ## HITS:1 COG:lin2048 KEGG:ns NR:ns ## COG: lin2048 COG0776 # Protein_GI_number: 16801114 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Listeria innocua # 3 90 4 91 91 58 38.0 3e-09 MTKADIVNEITKKTGIDKQTVLTTVEAFMDAVKDSLSNDENVYLRGFGSFVVKKRAQKTA RNISKNTTIIIPEHNIPAFKPAKTFTISVKK >gi|225935346|gb|ACGA01000046.1| GENE 302 421602 - 422636 712 344 aa, chain + ## HITS:1 COG:L0296 KEGG:ns NR:ns ## COG: L0296 COG1194 # Protein_GI_number: 15672823 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # Organism: Lactococcus lactis # 1 262 8 274 387 236 41.0 6e-62 MNEFTKTIVEWYEENKRELPWRESADPYLIWISEIILQQTRVAQGYDYFLRFIKRFPDVQ TLAAADEDEVMKYWQGLGYYSRARNLHAAAKSMNGVFPKTYPEVLALKGVGGYTAAAICS FAYGMPYAVVDGNVYRVLSRYFGIDTPIDSTEGKKLFAALADEMLDKKHPAVYNQGIMDF GAIQCTPQSSNCLFCPLAGGCSALSKGLVTKLPVKQHKTKTTNRYFNYIYVRAGAYTFIN KRTGNDIWKNLFELPLIETPTALSEEEFLALPEFRAFFASGEVPVVRSVCREVKHVLSHR VIYANLYEVTLSENLTSFGNFRKIKVEELEQYAISKLVQNLINT >gi|225935346|gb|ACGA01000046.1| GENE 303 422671 - 423222 520 183 aa, chain + ## HITS:1 COG:VC0397 KEGG:ns NR:ns ## COG: VC0397 COG0629 # Protein_GI_number: 
15640424 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Vibrio cholerae # 27 183 6 177 177 105 41.0 5e-23 MHPIVILFNFGSRFNFKNHKLDTDMSVNRVILIGNVGQDPRVKYFDTGSAVATFPLATTD RGYTLANGTQIPERTEWHNIVASNRLAEIVDKYVHKGDKLYLEGKIRTRSYSDQSGAMRY ITEIYVDNMEMLSPKGANLGAGASASGQPAMGQQQQPVAGQPQQTQQSQAQPVQDNPADD LPF >gi|225935346|gb|ACGA01000046.1| GENE 304 423300 - 424655 1307 451 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 40 445 17 425 426 207 34.0 2e-53 MDSDGYLSQLADIFNGITVNTPSISAIIAIALAGVLLLASGFASASEIAFFSLTPSDRND IDEHNHPSDEKISALLGDSERLLATILITNNFVNVTIIMLCNFFFMNVFVFHSPLAEFLI LTVILTFLLLLFGEIMPKIYSAQKTLAFCRFSAPGIYFLEKVFRPIATVLVRSTTFLNKH FVKKSHNISVDELSHALELTDKAELSEENNILEGIIRFGGETVKEVMTSRLDMVDLDIRT SFKEVMQCIIENAYSRIPIYSGSRDNIKGVLYIKDLLPHVNKGDNFRWQSLIRPAYFVPE TKMIDDLLRDFQANKIHIAIVVDEFGGTSGLVTMEDIIEEIVGEIHDEYDDEERTYVVLN DHTWIFEAKTQLTDFYKIAKVDEDEFEKVVGDADTLAGMLLEIKGEFPALHEKVTYHNYE FEVLEMDSRRILKVKFTILPKEVEDGVVSAV >gi|225935346|gb|ACGA01000046.1| GENE 305 424633 - 425265 364 210 aa, chain + ## HITS:1 COG:no KEGG:BT_1495 NR:ns ## KEGG: BT_1495 # Name: not_defined # Def: siderophore (surfactin) biosynthesis regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 209 1 198 201 314 76.0 1e-84 MALFLQYKTNGIQWAVWKMEESLDALLLLLPGARRAFCEQELNRFVSERRKMEWLSVRVL LYSMLQEDKEIGYSPEGKPYLTDHSSFISISHTKGYVAVMLASSVPVGIDIEQYAQRVHK VSDRYVRPDEQVESYQGDITWGLLLHWSAKEAVFKRMENADADLRKLRLTHFIPQGEGMF QVQELATEQQELYSVGYRICPDFVLTWTLN >gi|225935346|gb|ACGA01000046.1| GENE 306 425427 - 425651 246 74 aa, chain - ## HITS:1 COG:no KEGG:BT_1494 NR:ns ## KEGG: BT_1494 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 139 93.0 4e-32 
MGKKIEYTNGELTIVWQPELCQHAGICVKMLPNVYHPKERPWVQIENAITEELIAQISKC PSGALSYRLNKKEK >gi|225935346|gb|ACGA01000046.1| GENE 307 425660 - 425962 298 100 aa, chain - ## HITS:1 COG:DR1844 KEGG:ns NR:ns ## COG: DR1844 COG2388 # Protein_GI_number: 15806844 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Deinococcus radiodurans # 10 96 7 93 93 73 42.0 6e-14 MAEDYKLIDNEEKHRYEFQIDGKIAEIDYIKSNNGEIYLVHTEVPASLGGKGVGSQLAEK ALADIERQGLRLVPLCPFVAGYIHKHPEWKRIVMRGIHIK >gi|225935346|gb|ACGA01000046.1| GENE 308 426043 - 427416 1449 457 aa, chain - ## HITS:1 COG:PM0811 KEGG:ns NR:ns ## COG: PM0811 COG3033 # Protein_GI_number: 15602676 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Pasteurella multocida # 6 455 6 455 458 468 49.0 1e-132 MELPFAESWKIKMVEPIRKSTREEREQWIKEAHYNVFQLKSEQVYIDLITDSGTGAMSDR QWAEMMLGDESYAGATSFFKLKEMITKLTGFEYIIPTHQGRAAENVLFSYLVHEGDIVPG NSHFDTTKGHIEGRHAIALDCTIDAAKQTQLEIPFKGNVDPDKLQKALTEYAERIPFIIV TITNNTAGGQPVSMQNLYEVRAIADKYGKPVLFDSARFAENAYFIKMREEGYRDKTIKEI TREMFSLADGMTMSAKKDGIVNMGGFIATRRADWYEGAKGFCVQYEGYLTYGGMNGRDMN ALAIGLDENTEFDNLETRIKQVEYLAKKLDEYGIPYQRPAGGHAIFIDAPKVLTHVPQEE FPAQTLTVELYLEAGIRGCEIGYILADRDPVTHENRFNGLDLLRLAIPRRVYTDNHMNVI AAALKNVYERRESITHGVRIAWEAPLMRHFTVQLERL >gi|225935346|gb|ACGA01000046.1| GENE 309 427635 - 428954 885 439 aa, chain - ## HITS:1 COG:no KEGG:BF1581 NR:ns ## KEGG: BF1581 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 439 1 455 455 395 48.0 1e-108 MKKMKFLISQLYLLALFALPFVSTSCSDDDDNSTKIEITSLGVEDGTTIVTGQIIQLEAQ LSNPQGEVHYSWSTAGKEVSTQSTYTFQSDVTGTHTITLTVTANNEAQEKSINIIVVKPP FYVINEGQGKGSVNRYKQEQWQYNIVEGLGVTSTVGIINNGYMYIVSKKSPFLVKMNLEN NQVVNKIEAGLDQNAQGQNFCIVNNETGILTTSNGAFKVNLKQLTLGEKLSGLDAVSSDN EDVYKTDKYIFISSKNTIKVYNTNDLSFKQDIAYQIKTGFAQTKDGTLWAANGNKLIKIN VETLDHEEVELPNGLSVFYNQWAYTPTGLCASITENALYLVQMVEEGYSVYGKNIYKYNV ETKNATLFFSAPAADKSVYGAGVQVDPRNGDVYIIYTEDGWSTHYLNTNIYIANGVTGTQ KEVIDYTGTYWFPSTIAFQ 
>gi|225935346|gb|ACGA01000046.1| GENE 310 428995 - 431007 1395 670 aa, chain - ## HITS:1 COG:MA1904_1 KEGG:ns NR:ns ## COG: MA1904_1 COG3391 # Protein_GI_number: 20090753 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 98 335 109 326 361 63 30.0 9e-10 MNIRTKILYICQTLFCAFLPLFVACDDLEDKPTSTTIDGNITETGTAEIYILSEGLFNLN NSSLAKYSFKSNKLVKNYFKDLNKRGLGDTANDIALYGSKLYIVVNVSSTIEVIDFQTGI SIKQIPMFTDNGSSRQPRHIAFYENKAYVCSFDGTVARIDTTSLQIESFTKAGRNPENIC VKNKKLYVSNSGGLDYSEGLGVDNTVSVIDIASFTEIKKIEVGPNPGCILPGPDEAVYVA TYGSNIADGDFNFVKINSQTDEVERIYNEKVMNFAIDNNNIAYLYNYNYNTEASSIKVLN LRTGETIRENFITDGTKISTPYSINVNPYSGNVYITEAYSYTITGDVLCFNTNGQLLFRL NRIGLNPNNVIFSQKASTGDSDGEESDPNAPSAFANKVLDYNPAPSQYMNTVTTAYKENY TAEEVRKYAEEQLKDTDLCLISLGAYGGYITVGFDHTVPNVPGEYDLKIYGNAYYDMFGT LTGALGGSSEPGIVLVSKDTNGNGLADDEWYELAGSEYNSPATTKNYTITYYRPSSPKED VKWTDNKGNEGYVYRNDYHTTNSYYPAWIKEDQITFHGSRLKDNTVNEPRENMPEHWVGY CYAWGYADNHPNGEEQCKFKIDWAVDKNGNPVVLDGIDFVRIYTAVNQSSGWMGEISTEL QAVEDLHFKK >gi|225935346|gb|ACGA01000046.1| GENE 311 431027 - 433072 1080 681 aa, chain - ## HITS:1 COG:no KEGG:BF1583 NR:ns ## KEGG: BF1583 # Name: btuB # Def: putative vitamin B12 receptor # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 12 681 15 678 678 1019 74.0 0 MKERTFKKMIFAVFCCQFLSIPLFAQQQKVDTTHTYSIPEVTVSDIYQTREVRSTAPLQV FSKDALKNLHALQVSDAVKHFAGVTVKDYGGIGGLKTVSIRSLGAQHTAVGYDGIALTDC QTGQIDIGRFSLDNVDQLSLSNGQSDNIFQPARFFAAAGILNIQTLTPRFEKEKTTNISA SFKTGSWGLVNPSILLEQQLNSKWTVSANGEWMSSDGHYPYTLHYGDAKEDLTSREKRKN TDVQTFRAEAGLYGIFSDKEQYRLKAYYFQSSRGLPKATTLYNDHSLQQLWDKNTFVQSQ YKKEFSRQWVFQSSAKWNWSYQRYLDPDTPNSKGKTENSYYQQEYYLSASVLYRMLDNLS FSLSTDGSINTMNANLNEFAQPTRYSWLTALAGKYMNDWLTISASALATIINEDVKEGGS AGNHRKLTPYISASFKPFYHEEFRIRFFYKDIFRLPSFNDLYYQEVGNTKLRPENARQYN VGLTYSKNVCPFLPYLSATIDAYYNKVTDKIIAYPTKNLAVWSMRNLGEVEIKGIDATGS LSLQPWERIRINLSGNYTYQQALDITNSDPTTEAGRTYKHQIAYTPRVSGSGQAGIETPW INFSYSFLFSGKRYVQNENIAENSMEGYSDHSISISRSFHILKTHTSVSVEVLNIANKNY EIVKNFPMPGRSVRATLNIKY 
>gi|225935346|gb|ACGA01000046.1| GENE 312 434132 - 435076 671 314 aa, chain - ## HITS:1 COG:no KEGG:BT_1485 NR:ns ## KEGG: BT_1485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 308 17 307 311 490 82.0 1e-137 MKTNIHILPAICILCFCACKSGNASSLNKNDVIQDTIKTFTLPAIPPMMTAPEQRADFLV KHYWDNVNFADTNYIHHPEVTEQAWADYCDILNHVPLETAQEAMRKTIEQTNVDKKVFTY ITDLADKYLYDPNSPMRNEEFYIPVLDAMLASPLLEEIEKVRPKARRELAQKNRIGTKAL NFSYTLASGAQGSLYQQQADYILLFINNPGCHACTETIDALKNAPIINRLLGQKKLTVLS IYPDEELDEWRKHLNEFPKEWINGYDKKFAIKEQQLYDLKAIPTLYLLNKEKIVLLKDAT AQAIEEYLTNIELK >gi|225935346|gb|ACGA01000046.1| GENE 313 435233 - 437014 1396 593 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 84 589 71 563 566 159 28.0 2e-38 MNLPVTQKHFLSFSRKLFLSVISLFLVFAICFIAYQYQREREYKIELLNTKLQDYNSRLY EQLENQPLDSEIIDGYINNHILEDLRVTLIDAQGNVVYDSYPSHNNQMENHLNRPEVQKA IKHGNGYDVRRTSETTGVPYFYSATHYKDYIVRSALPYNVSLINNLQADPHYLWFTVIVT LLLMIIFYKFTNKLGTSISQLREFAMRADRNEPIEMAMQSAFPHNELGEISQHIIQIYKR LHETKEALYIEREKLITHLQISHEGLGIFTKDKKEILVNNLFTQYSNLISDSNLETTEEV FAISELQEIIHFINKNQQERSRGKGEKRMSVTINKNGRTFIVECIIFQDASFEISINDVT QEEEQVRLKRQLTQNIAHELKTPVSSIQGYLETIVSNENIPREKINVFLERCYAQSNRLS RLLRDISVLTRMDEAANMIDMERVDISVLVGNIINEVSLELDEKHITVVNSLKKSIQIKG NYSLLYSIFRNLMDNAIAYAGSNIQININCFREDENFYYFSFADTGIGVSPEHLNRLFER FYRVDKGRSRKVGGTGLGLAIVKNAVIIHGGSISAKNNQGGGLEFVFTLAKEK >gi|225935346|gb|ACGA01000046.1| GENE 314 437042 - 437731 732 229 aa, chain - ## HITS:1 COG:aq_319 KEGG:ns NR:ns ## COG: aq_319 COG0745 # Protein_GI_number: 15605840 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Aquifex aeolicus # 5 227 7 227 228 164 42.0 9e-41 MNDYRILVVDDEEDLCEILKFNLENEGYEVDTANSAEEAMKMDISSYHLILLDVMMGEIS 
GFKMANILKKDKKTAKVPIIFITAKDTENDTVTGFNLGADDYISKPFSLREVIARVKAVL RRTVTTETERVPERLTYQSLVIDITKKKVSIDDEEVPLTKKEFEILLLLVQNKGRVFSRE DILARIWSDEVYVLDRTIDVNITRLRKKIGVYGKCIVTRLGYGYCFEAE >gi|225935346|gb|ACGA01000046.1| GENE 315 437973 - 439166 904 397 aa, chain + ## HITS:1 COG:no KEGG:BT_1481 NR:ns ## KEGG: BT_1481 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 397 397 723 92.0 0 MIKIDLKRHRKEIIGSVVVLLILLGGMSVFKYTSFNSGFEIVDDLGGNIFPSAILSVATT DVQVIVPSDSNYLGNPKSCIAVRVKSKTAYSRVRIEVAETPFFSRSVSEFVLNKPRTEYT IYPDIIWNYEALKNEVQAEPVSVAITVEMNGKDLGQRVRTFSVRSINECLLGYVSNGTKF HDTSIFFAAYVNEENPMIDQLLREALNTRIVNRFLGYQSKAKGAVDKQVYALWNILQKRK FRYSSVSNTSLSSNVVFSQRVRTFDDALESSQINCVDGSVLFASLLRAINIDPILVRTPG HMFVGYYTDNSHTDKNFLETTMIGDVDLDDFFPDEQLDSTMVGKSQNEMSLLTFEKSKQY ANKKYKENEEGIHSGKLNYMFLEISKDVRRKIQPIGK >gi|225935346|gb|ACGA01000046.1| GENE 316 439307 - 440107 811 266 aa, chain + ## HITS:1 COG:no KEGG:BT_1480 NR:ns ## KEGG: BT_1480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 1 266 268 469 87.0 1e-131 MQGKKFISPGAWFSMNYPSDWNEFEDGEGSFLFYNPDVWTGNFRISAFKGKAGYGKDAIR QELKENDSASLVKVGTWECAYSKEMFQEEGTYYTSHLWITGVDDIAFECSFTVPKGGVVK EAEDVIATLEVRKEGQKYPAELIPVRLSEIYLINEGYEWVVSTVKQELKKDFQGIEEDLE KLQQVIDSGKIGLKKKEEWLAIGITVCAILANEVDGMEWKTLIDGNREAPVLQYKDRTID PMKLVWSKVKAGEPCNVIEEYKKCLD >gi|225935346|gb|ACGA01000046.1| GENE 317 440109 - 441977 1946 622 aa, chain + ## HITS:1 COG:sll0912 KEGG:ns NR:ns ## COG: sll0912 COG0488 # Protein_GI_number: 16331003 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Synechocystis # 6 619 4 634 636 487 43.0 1e-137 MAVPYLQIDNLTKSFGDLVLFENVSLGIAEGQRVGLIAKNGSGKTTLLNIIAGKEGYDSG NIVFRRDLRVDYLEQDPQYPEELTVLEACFHHGNSTVELIKEYERCMETEGHPGLENLLA RMDQEKAWEYEQKAKQILSQLKIRNFDQKVKQLSGGQLKRVALANALITEPDLLILDEPT NHLDLDMTEWLEDYLRRTNLSLLMVTHDRYFLDRVCSEIIEIDNQQIYQYKGNYSYYLEK 
RQERIESKSVEIERANNLYRTELDWMRRMPQARGHKARYREDAFYELEKVAKQRFNTDNV KLEVKASYIGSKIFEADHLFKSFGDLKILDDFSYIFSRYEKMGIVGNNGTGKSTFIKILM GQVKPDSGTVDIGETVRFGYYSQDGLQFDEQMKVIDVIQDIAEVIELGNGKKLTASQFLQ HFLFTPETQHSYVYKLSGGERRRLYLCTILMRNPNFLVLDEPTNDLDIITLNVLEEYLQN FKGCVIVVSHDRYFMDKVVDHLMVFNGQGDIRDFPGNYSDYRNWKEAKEQKEKEAEKPQE EKTARVRLNDKRKMSFKEKREFEQLEKEIAELEAEKTQIEELLCSGTLSVDELTEKSKRL PEVNDMIDEKTMRWLELSEIEG >gi|225935346|gb|ACGA01000046.1| GENE 318 442133 - 442972 599 279 aa, chain + ## HITS:1 COG:SPCC1672.01 KEGG:ns NR:ns ## COG: SPCC1672.01 COG1387 # Protein_GI_number: 19075372 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Schizosaccharomyces pombe # 8 265 6 271 306 77 24.0 4e-14 MTNLTNYHSHCLYCDGRANMDDFIRFAISEGFTSYGISSHAPLPFSTAWTMEWDRMEDYL SEFSRLKKKYAGKIELAIGLEIDYLNEENNPSLPCFQKLPLDYRIGSVHMLYSPEGKIVD IDTPADLFRQLVDRHFDGDLDSVVHLYYKNLLRMVELGGFDIVGHADKMHYNASCYRPGL LDEAWYDTLVRDYFAVIAARGYMVEINTKSYHELGTFYPNERYFPFLKELGIRVQVNSDA HYPERINNARFEGLAALKKAGFTSVVEWHGGKWEDIPIG >gi|225935346|gb|ACGA01000046.1| GENE 319 442974 - 443189 345 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237723012|ref|ZP_04553493.1| ## NR: gi|237723012|ref|ZP_04553493.1| conserved hypothetical protein [Bacteroides sp. 
2_2_4] # 1 71 16 86 86 95 100.0 7e-19 MIGEITCAINRVEEQIEQLFDEKEEFIMAYEDALPRTMYLKKLTEIDSRIDELKKTLISL NEEKQEILDME >gi|225935346|gb|ACGA01000046.1| GENE 320 443216 - 444451 967 411 aa, chain - ## HITS:1 COG:APE1887 KEGG:ns NR:ns ## COG: APE1887 COG2407 # Protein_GI_number: 14601699 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Aeropyrum pernix # 53 398 72 417 433 141 28.0 2e-33 MTIHLISFASILHKQVSLRSSHEAILSEIEKYYTVKFVDHQDMDKLSSDDFKIIFVATGG VERLVIQHFENLPRPAILLADGMQNSLAAALEISTWLRGRGMKSEILHGELPAIILRIHT LYNNFQAQRSLFGKRIGVIGSPSSWLVASNVDYLLAKRRWGIEYVDIPLERIYEQFKHIT DDQVGASCAAVASQALACREGTPEDLIKAMRLYRAIKKVCQEENLEALTLSCFKLIEQID TTGCVALSLLNDDGIIAGCEGDLQSVFTLLAVKALTGKDGFMANPSMINSRTNELILAHC TVGLKQTERYIIRNHFETEKGIAIQGLLPTGDVTIIKCGGECLDEYYLSTGTLTENTNYI NMCRTQVRIRMNTPAEYFLKNPLGNHHIMLHGNYEDTLNEFFQANACKRTE >gi|225935346|gb|ACGA01000046.1| GENE 321 444632 - 445831 1044 399 aa, chain + ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 5 394 4 393 395 360 46.0 2e-99 MPTISIRGNEMPASPIRKLAPLADAAKQRGVHVFHLNIGQPDLPTPQAAIDAIRNIDRKV LEYSPSAGYRSYREKLVGYYAKFNINLTADDIIITSGGSEAVLFSFLSCLNPGDEIIVPE PAYANYMAFAISAGAKIRTIATTIEEGFSLPKVEKFEELINERTKAILICNPNNPTGYLY TRREMNQIRDLVKKYDLFLFSDEVYREFIYTGSPYISACHLEGIENNVVLIDSVSKRYSE CGIRIGALITKNKEIRDAVMKFCQARLSPPLIGQIAAEASLDAPEEYSRETYDEYVERRK CLIDGLNRIPGVYSPIPMGAFYTVAKLPVDDSDKFCAWCLSDFEYEGQTVFMAPASGFYT TPGSGINEVRIAYVLKKEDLTRALFVLQKALEVYPGRTE >gi|225935346|gb|ACGA01000046.1| GENE 322 445914 - 447158 907 414 aa, chain + ## HITS:1 COG:RSc1117 KEGG:ns NR:ns ## COG: RSc1117 COG4591 # Protein_GI_number: 17545836 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Ralstonia solanacearum # 12 414 11 416 416 118 
24.0 3e-26 MSLSLFIARRIYRESDGGKQVSRPAVLIAMAGIAIGLAVMIIAVAVVIGFKSEVRNKVIG FGSHIQITNLDAVSSYETHPIVVGDSMMTALADYPEISHVQRFSTKPGMIKTDDAFQGMV LKGVGPEFDPHFIKEYLVEGEIPVFSDSVSTNQVLISKALATKMKLKLGDKIYTYYIQDD IRARRLTIAGIYQTNFSEYDNLFLLTDLNLVNRLNGWQPEQVTGVELQVKDYDRLEDITY EIATDIDNRQDELGGVYYVRNIEQLNPQIFAWLDLLDLNVWVILILMIGVAGFTMISGLL IIIIERTNMIGILKALGANNFTIRRTFLWFAVFLIGKGMFWGNAIGLAFCILQSQFGFFK LDPATYYVDTVPVSFNVLLFILINLGTLCASVLMLIGPSFLITKINPASSMRYE >gi|225935346|gb|ACGA01000046.1| GENE 323 447289 - 448755 2401 488 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein [Bacteroides thetaiotaomicron VPI-5482] # 1 488 1 488 488 929 92 0.0 MNNVLILLDNLDDWKPYYETSSVLTVSDYLKNKPVEKDRKLVINLSNDYSYNSEGYYCSL LAQTRGQKVIPDVDIINKMEAGTGVRMDRSLQALCYQWIQKNDIKDDIWYLNIYFGKCRE KGLERIARFIFENYPCPLLRVALNTHPRNQIEGIHFLPLNQLNDAEQDFFANTLDNFNKK IWRAPKSAKASRYSLAVLVDPQEKFPPSNKGALHKLTEVAKKMNIHVEMITEDDAIRLLE FDALFIRTTTSLNHYTFHLSQLAAQNGMAVIDDPLSIIRCTNKVYLKELFEKEKISAPKS TLIFQSNHHSFEQISELVGAPFILKIPDGSYSIGMKKVSNEEELQASLKILFEKSAILLA QAFTPTEFDWRVGLLNGVPLYACKYYMAKGHWQIYCHYDSGRSRCGLVDTIPIYQVPRVV LDTAVKAANLIGKGLYGVDLKMVDDRAYVIEINDNPSIDHGLEDAIIGDEMYYRLLNHFE QVLETKHY >gi|225935346|gb|ACGA01000046.1| GENE 324 448758 - 449207 184 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227881166|ref|ZP_03999028.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Halogeometricum borinquense DSM 11551] # 1 145 29 172 181 75 34 3e-12 MEPIIVRKAQQADIPAILEIEWECFREDSFSKEQFAYLISRSKGTFYVMMEADRVIAYVS LLFHGGTRYLRIYSIAVHPDYRGRGLGQVLMDQTIQTAGECKAAKITLEVKVTNTSAIGL YMKNGFIPAGIKPCYYHDGSDAIYMQRLI >gi|225935346|gb|ACGA01000046.1| GENE 325 449238 - 450410 870 390 aa, chain - ## HITS:1 COG:Ta1048 KEGG:ns NR:ns ## COG: Ta1048 COG0463 # Protein_GI_number: 16082079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma acidophilum # 57 296 7 212 256 75 25.0 2e-13 
METFTFNTVELILLSAAGILFIIQLIYYFGLYNRIHVHNKAVGKEEAHFIRELPPLSVIL CARNEAENLRKILPAILEQDYPQFEVIVINDASTDDTEDILGVMEEKYPHLYHSFTPESA RYISHKKLALTLGIKASKHDWLVFTETNCMPASNQWLKLMARNFTPQTQIVLGYSGYDRT KGWLHKRTAFDTLFQSLRYLGFALAGKPYIGIGRNLAYRKELFFQQKGFSKYLNLQRGED DLFINELATSSNTRVETDFNATTRIQPVYRYKDWKEEKVSYMATARFYHGIQRYLLGFET FSRLLFYIACIAGIVFGILNFHWLVAGIAFLIWLLRFTVQAVIINRTAKEMGGGRKYYFS LPVFDLIQPVQSLKFKLCRFFRGKGDFMRR >gi|225935346|gb|ACGA01000046.1| GENE 326 450467 - 451492 877 341 aa, chain - ## HITS:1 COG:VC0624 KEGG:ns NR:ns ## COG: VC0624 COG0628 # Protein_GI_number: 15640644 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Vibrio cholerae # 165 331 186 351 361 93 34.0 4e-19 MSTKEQYWKYSLIVIILFMGVIIFRQITPFLGGLLGALTIYILVRGQMNHLVEKRKLKRS ISALLITAETIFVFLIPLGLTVWMVANKLQDINLDPQTYIAPIQQVAEFIKEKTGYDVLG KDTLSFIMSILPRIGQIIMESISSLAINLFVMIFVLYFMLIGGKKMEAYVNDILPFNETN TQEVIHEINMIVRSNAIGIPLLAIIQGGVAMIGYLLFGAPNILMLGFLTCFATIIPMVGT ALVWFPVAAYLAISGDWFNAIGLFGYGAIVVSQSDNLIRFILQKKMADTHPLITIFGVVI GLPIFGFMGVIFGPLLLSLFFLFVDMFKKEYLDLRNNLPSR >gi|225935346|gb|ACGA01000046.1| GENE 327 451550 - 452803 800 417 aa, chain - ## HITS:1 COG:BMEI0867 KEGG:ns NR:ns ## COG: BMEI0867 COG5000 # Protein_GI_number: 17987150 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Brucella melitensis # 102 401 374 708 751 102 22.0 1e-21 MLFSRHIYWTTLGHILFILATAGTGLWLIISQQGVVIGILLVICSLFQIGRLVNKLNSFN QKLRLFFDAIEDKDNMLYFPENNVSREQEMLNRSLNRINALLIRTQAEYSKQEHFYRSLL EEVPSGVLAWDSSGKIMMANSAALTLLGCQQLAQYDQLKPILQEKEKKERLSLSQNQMKL QNETITILSIKDISNELNDKESESWNKLSHVLTHEIMNTIAPIISLSQTLSAYPDNSEKT LRGLHIIQAQSERLLEFTESFRRLSYLPQPERKRFSLTALLQNLQELLSTDFQENQIHFT LTCQPEFIDMDGDENQLSQVLLNLLRNSIQALDGRTDGAIEIYASRDEHISIDITDNGPG IPDELQEKIFIPFFTTKSEGTGIGLSLCRQIIRNHNGHLSILESRPGKTIFHIDISL >gi|225935346|gb|ACGA01000046.1| GENE 328 452809 - 454173 746 454 aa, chain - ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 
COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 6 451 8 441 441 315 38.0 2e-85 METGTILIVDDNKSVLASLELLLENVFSTVRTAANPNQITTILSTTPIDIVILDMNFSAG INNGNEGLYWLKHIHEIRPSLPVIMLTAYGDVELAVKALKNGATDFLLKPWDNHILIQKI KEAYQNNRPTHHKNSPKSTQKASGDENLNKSEMLIGHSPAMLQLIKVVTKVAKTDANILI TGENGTGKEMLAREIHRLSSRNTRQLLSVDMGAISESLFESELFGHERGAFTDAYESRPG KFEAANGGSLFMDEIGNLPLALQAKLLTVLQNRNITRIGSNKVIPVDIRLISATNKDIPE MIKEGLFREDLFYRINTIHLEIPPLRERGDDLLLFIDAFLHRFASKYQRPEMRMHEQTIE KLRSYHWPGNIRELQHTIEKAVILCESNVIRPKDILVKQTWKPQTVQAVPNLEEVERQAI ETAILQNNGNLTAAAEQLGISRQTLYNKLKRFKP >gi|225935346|gb|ACGA01000046.1| GENE 329 454500 - 455801 1007 433 aa, chain + ## HITS:1 COG:no KEGG:BT_1468 NR:ns ## KEGG: BT_1468 # Name: not_defined # Def: putative outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 433 1 413 413 660 84.0 0 MKRFLLLLLLSVPVFLLAQSAMSLDDCIRLAYKQNPAVRNGVIGIKETKADYIASVGAFL PHIVVNAETGKRFGRSLDPDTNGYTSESFEEGTVGLDMTLSLFEGFSRINQVRFRKMNKE RSEWDLKEKQNELAYQVTDAYYKLILERKLLDLALEQSRLSERYLKQTEVFVELGLKSAS DLQEVKARREGDIYRYQSRENSSRIALLHLKQLMNIQPGDTLAILDTITASQLPPYSVST VETLYAQSIEILPSIRMIDLKQKAAHKEYAIAGGAFSPSVFARFTVGSNYYNTAFSARQL RDNIGKYVGVGISFPLLSGLQRLTNQRKLKLNMYRLKNEEELEKQQLYTDIEQTLLSLHT GYSEHQQALSQLDAETLVLKESERKWEEGLISVFQLMEARNRFIAAKAELVRVRLQIEMM MKLEKYYRQGTFL >gi|225935346|gb|ACGA01000046.1| GENE 330 455808 - 457055 1313 415 aa, chain + ## HITS:1 COG:YPO1498 KEGG:ns NR:ns ## COG: YPO1498 COG0845 # Protein_GI_number: 16121771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 1 399 1 404 420 119 25.0 9e-27 MDTLIERKPGINRKRLYWVGGVMLGLAVIAYFIFRDTASSMAVEKDRLTIATVEQAEFSD YIRVIGQVMPSRIIYMDAIEGGRVEERLKEEGAMVKAGDVILRLSNPLLNIGIMQSEADL AYQENELRNTRISMEQERLQLKQERIGLNKELIGKQRRYEQYKRLVNEQLIAREDYRQAE 
EEYIAAKEQLAVIDERIRQDHIFRESQIGSLDENIRNMKRSLALVRERLENLKVKAPIDG QVGNLNAQIGQSISAGEHIGQIITSDLKVQAQIDEHYVERVLPGLPADFTRDGGTYKLEV TKPYPEVKDGQFRTDLNFISERPENIRAGQTYHINLQLGDPAQAILVPRGGFFQITGGRW MYVVDESGTFATRRPVKIGRQNPLYYEVTDGLSPGEKVIISGYELFGDNEKLILK >gi|225935346|gb|ACGA01000046.1| GENE 331 457068 - 457784 360 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 212 4 215 223 143 40 1e-32 MIKTEKLSMLFTTEEVQTKALNDVTLQVEQGEFVAIMGPSGCGKSTLLNILGTLDSPTSG SYFFEGKQVDKMTENQLTALRKGNLGFIFQSFNLIDELTVYENVELPLVYLGMKTLQRKE RVNKVLEKVNLLHRANHYPQQLSGGQQQRVAIARAVVTECKLLLADEPTGNLDSVNGIEV MELLSELNRQGTTIIIVTHSQRDAKYAHRVIQLLDGQIVAENINRPLEKKTSSKNETV >gi|225935346|gb|ACGA01000046.1| GENE 332 457781 - 460201 2024 806 aa, chain + ## HITS:1 COG:NMB0549_2 KEGG:ns NR:ns ## COG: NMB0549_2 COG0577 # Protein_GI_number: 15676455 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Neisseria meningitidis MC58 # 129 363 136 348 395 68 25.0 5e-11 MIGNYWNSAYRNLMKRKKFSFINIFGLAIGMASALLMLTYVTFEFSFDKMHTKYAHIYRV QSTFHEGEVLTDYWATSSFGYASAMKENLAGIEDYTRIATHLQPEQIVKYGELTLRENQI AYADPGFFRLFDFELLKGDKKTCLSMPRQVVITERIARKYFKDEDPIGKILIFTGTYDKV SCEVTGVMKEMPSNSHIHYNFLISYASLPQYMQEYWYKHEAYTYVLLDSPERKAEIEKEF PVMAEKYKTEEALKNKTWGVSLIPLADIHLTPQIGYETETKGNRSAMIALIFAAVAILAI AWINYINLTVARSMERAKEVGVRRVVGAFRQQLIYQFLFEALVMNLIAFILAVGLIELVL PHFNQLVGRTVTFSVWFMDYWWILLVLVFIAGIFISGYYPALALLNRKPITLLKGKFLHS KSGDRTRQVLVVVQYTASMILLCVTLIVFAQLNFMRNQSLGVKTSQTLVVKFPGHTEGQN IKLEAMKKAIARLPLVHRVTFSGAVPGEEVATFLSNRRTNDALKQNRLYEMLACDPDYAD AYGLQIVAGRSFSEEYGDDVDKLVINETAVRNLGFASNDEAIGELVTVECTDAPMQIIGV VKDYHQQALSKNYTPIMLIHKDKIDWLPQRYISVVMASGNPCELVSQVQEIWNQYFADSS FDYFFLDQFFDHQYRQDEVFGAMIGSFTGLAIFISCLGLWVLVMFSCSTRTKEMGIRKVL GASRWNLFYQLVKGFFQLILIAVVIALPVAWFSMNAWLSHYAFRTDLKAWFFIVPVLLML FISFVTVAFQTMKIIMSKPARSLRYE >gi|225935346|gb|ACGA01000046.1| GENE 333 460828 - 461391 279 187 aa, chain - 
## HITS:1 COG:mll4433 KEGG:ns NR:ns ## COG: mll4433 COG3760 # Protein_GI_number: 13473736 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 28 173 5 151 165 82 27.0 3e-16 MSIVSFIFATIKKSDYMSPDSEYTETHENKIYATLDKLGIAYQSLHHPAIMTIEEGAEIA QKLGCTSCKSLFLTNKQQEYFMLLLPANKKLKTKELAGQIGSSHLSFASEKAMENLLCTF PGAVSILGLIYDKENKVQLLIDKEILESTYIGCHPCVNTCSLKIRLEDILKILLPKISHQ NFNIVEL >gi|225935346|gb|ACGA01000046.1| GENE 334 461688 - 462440 750 250 aa, chain + ## HITS:1 COG:BMEII0858 KEGG:ns NR:ns ## COG: BMEII0858 COG2186 # Protein_GI_number: 17989203 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 14 241 18 241 242 71 28.0 1e-12 MKIEINQTTLIDQVEDSLLTYFKKNDLRCGDSIPNENNLAAELGVARSVVREALSRLKMM GLIHARPRKGMVLTEPSILGGMKRVIDPRVLSEETILDLLDFRIALEIGISSDIFRKITP KDIEELSEIVKMGIVFENNEYALISESAFHTKLYKITGNKIISEFQEIIHPILVYVKEKF KDYLKPINIEMSKSGRIATHADLLDFIRKGDEKGYRDAIERHFEVYKIFKVNRSLELMAE KNEEERIDSI >gi|225935346|gb|ACGA01000046.1| GENE 335 462488 - 463504 406 338 aa, chain + ## HITS:1 COG:FN1470 KEGG:ns NR:ns ## COG: FN1470 COG3055 # Protein_GI_number: 19704802 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 11 336 20 367 372 110 25.0 5e-24 MNTACSRKGENKIVMQWENSLLLPGCTGMLKNVGLAGAYSGIVEDKLLVLGGANFPDKYP WEGGVKTWWSTLYSYDLHMDKWTVYDDFLNSPLAYGVSICLPEGLLCIGGCDRMQCSDKV FLIKKEGVSFVVDSVSYSALPVPLANATGAIGDNCIYIAGGQETMTNEQSTNHFYMLDLL HKEKGWQKMPGWEGPSLAYAVSVVQGGRFYLFSGRSYAPNEVMVEYTEGYVYEPGSRKWS KIAGNFPVMAGTAIPYEKDKIILLGGVEEILPTSPEHPGFSRKLRVISTETNSLVDSLDC PYPIPVTTNAVYMGNDVYVVSGEIQPGIRTPLILKGTF >gi|225935346|gb|ACGA01000046.1| GENE 336 463540 - 463893 276 117 aa, chain + ## HITS:1 COG:YPO3024 KEGG:ns NR:ns ## COG: YPO3024 COG0329 # Protein_GI_number: 16123201 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: 
Yersinia pestis # 5 113 2 110 297 98 40.0 2e-21 MNNYEKLEGMVAATFTPLDENGDVNLSVIDKYADWIASTPIKGIFVCGTTGEFSSLTIDE RKLILEKWLVSARKRFKVIAHVGSNCQRSAMELARHAAQVGADAIASIAPSFLNREQ >gi|225935346|gb|ACGA01000046.1| GENE 337 463962 - 464450 336 162 aa, chain + ## HITS:1 COG:VC1776 KEGG:ns NR:ns ## COG: VC1776 COG0329 # Protein_GI_number: 15641779 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Vibrio cholerae # 4 143 141 279 298 111 40.0 7e-25 MPSITGVNLPVDKFLVEGKKKIPNLVGTKFTHNNLMEMGVCIELEQHRFEVLHGYDEILI SGLAMGAVAGVGSTYNYIPNVYQAIFDSMKMNDLETARHNQIKSIRTVEVIIKYGGGVRG GKAIMKLIGIDCGSCRLPIKPFSVDEYEKLRGDLDAIKFFEF >gi|225935346|gb|ACGA01000046.1| GENE 338 464567 - 466207 814 546 aa, chain + ## HITS:1 COG:Cgl1519 KEGG:ns NR:ns ## COG: Cgl1519 COG4409 # Protein_GI_number: 19552769 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Corynebacterium glutamicum # 182 497 81 373 399 95 30.0 2e-19 MKVSKKIFIIFSMILLLVSPVTSLGKDIKWVLERPVIPVLVKKPANPVMKITLIRTDNQP YVIRQIDLDLLGSTNVADIVSVAIYGAKKNGLIDTSRLLCNALPATQKMSFTNEVQVNQD SLSFWVAVTLKDTVSLDHRVQVNCNRVKTTKGNLKILDKTSKPLRVGVAVRQKGQDGCIS SRIPGLATSNKGTLLAIFDARYDYPRDLQGNIDIALHRSTDKGVTWQPIQTVLDMGEWGG LPQKYNGVSDACILVDKNTGDIYVAGLWMHGLLDKDGKWIEGLNENSTNWTHQWKGRGSQ PGTGLKETSQFMIAKSTDDGLSWGFPDNITSKTKRPEWWLFAPAPGQGITLTDGTLVFPT QGRDENGLPFSNITYSKDHGKTWVTNNPAYQDVTECSVVQLGDGALMLNMRDNRNRGNKD VNGRRICTTIDLGESWKEHPTSRRALVEPTCMASLHRHEYMEEGKKKSMLLFVNPNNYGT RDNLTLKVSFDDGMTWPEEHWILFDQYRSAGYSCITSIDENSIGILYESSQSDLAFIKIN LIEILK >gi|225935346|gb|ACGA01000046.1| GENE 339 466214 - 467446 841 410 aa, chain + ## HITS:1 COG:CC2486 KEGG:ns NR:ns ## COG: CC2486 COG0477 # Protein_GI_number: 16126725 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator 
superfamily # Organism: Caulobacter vibrioides # 6 379 37 433 519 91 24.0 3e-18 MNNKTSNIYPWVVVGLLWGVALLNYMDRQMLSTMRIPMMEDVRELESAANFGRLMAVFLW VYGLMSPLSGIVGDRMSRKWLIVGSLCVWSGVTYLMGYATTFNQLYWLRGIMGISEALYL PAALSLIADFHKDKTRSLAVGIHMTGLYVGQAIGGFGATFAAIYSWHTTFHWFGIIGVGY GVILAFLLRDKERGSVSENQKMKKIPVLKSLGMLFSNVFFWIILFYFCVPGTPGWAAKNW LPTLFSDSLSIDISVAGPMSTISIALSSLFGVLAGGYISDRWVLKNVRGRVYTGAMGLGL IIPSLLFIGYGHSIFALVMGAMLFGIGFGMFDANNMPILCQFVSARYRATAYGIMNMCGV FAGATITSLLGESMDAGHLGRDFALLAILVFVMLVILVTCLRPKTIDMKD >gi|225935346|gb|ACGA01000046.1| GENE 340 467503 - 470679 2102 1058 aa, chain + ## HITS:1 COG:no KEGG:Phep_3579 NR:ns ## KEGG: Phep_3579 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.heparinus # Pathway: not_defined # 29 1058 22 1059 1059 570 35.0 1e-160 MVGILLMFLYLWYPLVQAAPGQSGTSPKKITVKGVVLDDAKEPLIGVSVLVKGIEGGTIT NVDGKFTIEAPANGVLIFSYIGMDTQEVDIKGRQDIIVTMRSGVVALSDVVVIGYGEASR RTITSSISKVKGDVLSDMSISSPVEGLKGRISGVRVVQTNNTPGGGFSIKVRGGSSITQS NEPLVLVDGVERSMNDITPEDISSIDVLKDAASTAIYGARGSNGVILISTKKGKFNSAPR ITFEASVAYQEPETLRDFLNAEEYINVLRPAIAISPSPQWNQSSGYSVSSANTGSSIYST RYYNEGDVLPNGWKTMPDPLDPSKTLMFCDTDWQGLMFGSALWQNYHLNIDGGSNIIRYN ASVGYTDDGGVGLATGYKRFNLKSNAEAKISDNLTANINVNFQRTSSEAYANQRDAISRG LSATPTQIVYMEDGTPAQGYNATSQTPIFYNYYNDDSDITKYLSLAGNLKWQILPQWSVN LSGSYYDTTNKQSTFMRSNYYSQAREATSTWTETNRLKTELYTRYNFSIQQKHNVDVMVG YAYQKRDYEKLYAYGTGGSSDKITTIDASAETLGSSTMSKDVEVGFFGRFNYNFKEKYLL TLTGRYDASSKFVKDNRWGFFPGMSAGWIISEEEFMKDMHWLDYLKARFSYGTTGNNGIG VNDALGKYTATYKYNGNAAVRGTILPNQNLTWETTTQMDLGLELGVLNNRVYLSFDFYNK KTKKLLYDMSLPNTTGYASIKTNLGSVRFWGYELELTTQNIDSKDFKWESKFVFSQNKNK VLELPNNGMAKNRTGTSDYPIYSNGNGTFFGGLAEGEPLYRFYGYKAIGIYQTDEEAAKA EYDQMARGFNYKDGTTVAGRKFAGDYIWADRNKDGVITKNQDLFCLGTTEPTVTGSIGNT FTYKSWSLSVYLDYALGHSIYDESFSRYFYATFSTNYALAKDVEKCWKQPGDVTRYAKFW ANDSGTGQGNFNRVSNVFTYKGDYLCIRDIALSYRFPKKWMQKISIENLQLTLSGSNLYY FTAVKGISPEIGTASTYNSSYSNYPPVRRLSVAAKVTF >gi|225935346|gb|ACGA01000046.1| GENE 341 470702 - 472162 1210 486 aa, chain + ## HITS:1 COG:no KEGG:Phep_3578 NR:ns ## KEGG: Phep_3578 # Name: 
not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 17 485 1 471 472 248 32.0 3e-64 MKTNTIKFLLFLLPLGLFSCIGDLDVIQKSTINSENMWENEGDMKAAMYGSFYSFRNAYK TNLSYWGDFRSGLIGPGLGSFNGTALVSNKITSSETKGTSWELLYKSINDCNMILKYVDE VTFQDENLRNQIKANAYFLRAYMYFTIVRVWGDAPLMLEGIESDEGDMMPSRVDATLLYE QVKNDIEAALGVMPASVTDRTTGSQAAINMLATDYYLWMYKCRNVGKDALTRASEAVDNV LGNKTYSLLPDYSKIFDIKSKNSAEIIFTMHFERDEAEGGYPASYLIPDSKYTDDKKYRD ANNVKTGSQDQWYSLSSTIQTLINEISEDARTTTTFTVFTIPETGNSYSWINKFTGEWTD NTRYFTSDIPIYRYAEALLFKAEIENELNGTPLFYLNQIAQRAYGKENYYPSTLTKEEIN EAIFNERLKEFAAEGKSWWDYIRMGYVFTKIPSLLGRQNETNILLWPISSDCFENNPNIR QTIGYN >gi|225935346|gb|ACGA01000046.1| GENE 342 472181 - 473512 736 443 aa, chain + ## HITS:1 COG:Cgl1519 KEGG:ns NR:ns ## COG: Cgl1519 COG4409 # Protein_GI_number: 19552769 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Corynebacterium glutamicum # 74 407 80 396 399 96 26.0 8e-20 MKHRLLYSLFIFSILCTACGDDLKFGPNSGIVIPEEPTEDDPKEDEDPEINYPDNREIVE SIVYEAGKDNHIYFRIPALTVTKKGTILAFCEARNTKADFYEGNEDKFPVVPVGSTKDLG DIDLAVKRSTDGGKTWDNMITVVDDYDNTCGNPAPVVVESTGRIYLFWCWGRYPNQLESK FVSSTSDGHTRRVFYCYSDDDGQTWSTQYDLTSRLKKSDWSWYATGPCHGIQKQLAPQKG RIIISANHRDNGNKDNYSHIIYSDDNGSTWQLGGRTEIGGNESSLTELADGSLLINMRRV VKDNNGNDLPSEYRASAVSKDGGATWEKSVVNQDLIDPGCQGSIVNYPINKNRSNTLLLS NTHHSSSRSNLCISKSIDGGISWTTPLVIWSGRAAYSDIITLQDGSVCVIYENGSGKFGK ANPNEQISFYRVPPSLLGDKLNL >gi|225935346|gb|ACGA01000046.1| GENE 343 473533 - 475710 1181 725 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173693|ref|ZP_05760105.1| ## NR: gi|260173693|ref|ZP_05760105.1| hypothetical protein BacD2_17605 [Bacteroides sp. 
D2] # 1 725 1 725 725 1447 100.0 0 MKIIKINITCLVALLSVLFIGCEVDYKDIDTLGKADSGASEVSINLDLFVAGSWTPDQVA ASAGELKVEKYDLFIFKVSSDGNYLEYKEMNITPDASEMVPYNQSIKIGNRNIVLPTGGT KRVVVIANANDKVSYPVLCSLAESTKSDFSDVTVYDKFINDFSIKADKQQTSPFVMMGNA LIANADMENVGITLAPQYAKINIKNKAAGSETNGLFISSVQLKNVPESAYPFINDYSKKT SRFVNYEVNMLDNGALEEMIPDKLYLLYTPGTSSASADYRVTVLIKGKKNGVDFEKEFPC SNPMYPGYLFNMILSLSGNEVEAEFVPNWSDGNFSISGVHLKNNKMTFPFTADKFWGYEI FWATNMAGAVSVEKQGNESWYSITVEENLVRVCCLEDNTTGRERTASFTIGLGKKKETVE IVQQLMPSTITFNGMEWLDRNLGAILPLTEENITHSDTYGYYYQWGRNVPFPTFGTVGIV VSDPGITIQQAHGMKEFIVSSSSSTDWYTAVSVADKTTTWEDRTGSLGNPCPEGYHVPSY REYQTILPYKNSAGIGNFSNVDFKINSAEILDDTGTLYDALYVTSAKDEATIYAIKKYKT DGAYYLRLRRIKTDSGIYLRIDRLSGNAQSDFVGESADAKLSSASAIFNSATTGMETIYF PAAGRRNNKTGELVNQGINLYSWAATCWDVSSSGIYFDPLDGNTRIYCIAQARSFAQTVR CIKNY >gi|225935346|gb|ACGA01000046.1| GENE 344 475722 - 477284 885 520 aa, chain + ## HITS:1 COG:no KEGG:BT_0448 NR:ns ## KEGG: BT_0448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 505 1 470 477 193 29.0 1e-47 MKTFTLLFIIYMLSAYGSYAVAFDNIVGKVTCGQKPVSNVLVSDGVEVVKTNAEGYFHFN SKKEDGYVFISVPGNYEVDHIGIIPNHFARLKKGYSEVDTVNFHLTKRENKDCEIFMFTD IHLTNDRVDNDLPQFGKTFYPDIVSQIKARANKPVYAICLGDMTTDVKWHINHYALPEYL ETMKDFPIPVYHAMGNHDNERKVIENISDWDFYGESVYKDVIGPNYYSFNIGAWHVMILD NIITGGPVKKNGKLNYQFTYRIDDQQMEWIKKDLSFMPKNTPIMVGMHVPPLKYTGMKNG VLETDYDFENAKEFLDCFAEFNQVQIFAGHTHRSSNYVYGDNITLHNLPSASAVSWKING ETSRLISEDGSPAGYWIIKVKGDNLQWQFKAAERDLEKSQFFVYDLNCVPEQFGGSKDSN EILINIYNWDPDWKIEVTENGRSLDVSHCWTRDPLYRLIRSDEDALPGRPTAFLANYNSH MFKVKAANAKSPISIRITDGFGNVYKTMMKRPKKFSWDME >gi|225935346|gb|ACGA01000046.1| GENE 345 477398 - 477940 427 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173695|ref|ZP_05760107.1| ## NR: gi|260173695|ref|ZP_05760107.1| hypothetical protein BacD2_17615 [Bacteroides sp. 
D2] # 1 180 21 200 200 360 100.0 3e-98 MKNLFVITLRILMLLVFIASCGLGYVIYEDTLTAWWIPVGMALLIALATIPFYKKWIWLT TMDDKVINCLCHLACIGAISYVLFLGGNYWFADPASTHEEIVMVQKKYVETHKKTRRVGR HRYVSDGIHKEYYLQVAFENGAVEELHVSLSTYNKTKAGAPKILTLQKGFFGLPVITKGL >gi|225935346|gb|ACGA01000046.1| GENE 346 477961 - 478551 442 196 aa, chain + ## HITS:1 COG:L120883 KEGG:ns NR:ns ## COG: L120883 COG0110 # Protein_GI_number: 15673269 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Lactococcus lactis # 1 191 1 191 203 187 47.0 1e-47 MENRELYERMLSGTMYNDLSAELVQRREQVVFLTNDYNSTYGKPKEVREALLRKMLKGIG ENVHFEPNFRCEFGFNITIGNNFFANFDCIMLDGNLITIGDNVLLGPRVGLYTANHALDA RERVMGGCYARPIVIEDNVWIGAGVHIMGGVTIGRNSVIGAGSVVTKDVPENVIAAGVPC KVIREITDKDKTDFLG >gi|225935346|gb|ACGA01000046.1| GENE 347 478687 - 479166 519 159 aa, chain - ## HITS:1 COG:no KEGG:BF1973 NR:ns ## KEGG: BF1973 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 148 1 150 150 206 66.0 2e-52 MKKFLFVLVAFITIFMTSAEARRIKINVPANIPKVATLPDSSYYKTDDGSHLDLGYIEKD GSRVLVLFSESKPDTYYDIPNDFIEALQKDLNVEDLTSLIPEPSFWDKWGGSILIYGFGA LIVIGIISYLKDFIFGILRLGKKKKEEEDEEDNDSKANS >gi|225935346|gb|ACGA01000046.1| GENE 348 479445 - 481922 2040 825 aa, chain + ## HITS:1 COG:no KEGG:BT_1460 NR:ns ## KEGG: BT_1460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1514 90.0 0 MKMNKTRLLEKISIQSGISADECSAVLKIFERILSEELTRKIHRYGGWILLLVALLVSAV AFSQTPQRKGRPVQTIRGMVIDGDSKHPIPYATVRLSEKEGTGTITDSLGRFSIPQVPVG RHTVEAAFMGYEPGIFREILVTSAKEVYLEIPLKESVNELNEVVIRARTNKEEAMNKMAT TGARMLSVEEASRYAGGFDDPARLVSSFAGVAPSVSSNGISIHGNAPHLLQWRLEDVEIP NPNHFADIATLGGGILSSLSSQVLGNSDFFTGAFPAEYGNAVSGVFDMNLRNGNNQKNEN TLQVGIMGIDVASEGPLSKKHKASYIFNYRYSTTGLLNLEGGKMDYQDLNFKLNFPTQKA GTFSVWGTSLIDKFGSDFEKNTDKWEYMSDRSESKDKQYMAAGGISHRYFFNNDASLKTT IAGTYSQLDGGATMFNHSLESTPYMDLNSKHTNLILTSTFNRKFSNRFTNKTGFTYNAMF YQMNLAIAPYEAESLETVSQGKGNTSLISAYNSSSVGLTERWTLNAGIYGQFLTLNNKWS 
VEPRIGLKWQATPKATFALAYGMYSRMEKMDVYFVKTKSTGDKSVNKELDFTKAQHVMLS FGYKISDRMNLKIEPYIQFLHDVPVMADSSYSVLNRSDFYVEDALVNKGRGRNIGIDITL ERFLEKGLYYMISGSLFDSRYRGGDGVWYNTKFNRNYVINGLIGKEWMLGRNRQNILSIN LKLTLQGGDRYSPIDREATMNHPDKEVQYDETKAFSKQYSPMLIGNYTVSYRINKKKVSH EFAVKGLNFTGAKEYYGHEYNVKTGKIDVSDGSTTLTNVSYKLEF >gi|225935346|gb|ACGA01000046.1| GENE 349 482004 - 483080 376 358 aa, chain + ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 177 337 345 516 541 60 30.0 5e-09 MAETITTDNKSTLLYRFLVSPELRWARYLVLIMVLATISFNQVFIIFLDYRDILGGWIYT FTFLYLLTYIGVIYLNLFWLFPKFLLKRRYLTYISLLSVAMMLALAIQMATEYVSYSCWP EFYERASYFSIPIVMDYISSFMLSTLCMIGGTMTVLLKEWMIDHQRVSQMEKVHVLSEVE QLKEQVSPELLFKTLHHSGELTLSEPEKASKMLMKLSQLLRYQLYDCSRTKVLLSSEINF LNNYLTLEQNSQTPFNYELLADGEVNRTLVPPLLFIPFVKYIVKSINEQRISIPVSLKIH LKVEENTIIFTCLCLQVNLLEDKGLERIRQRLNLLYGNRYRLFLTTESIWLELKGGEV >gi|225935346|gb|ACGA01000046.1| GENE 350 483191 - 484156 379 321 aa, chain + ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 68 308 276 525 541 88 29.0 2e-17 MTINVFWYEPLQSVSFWRRFGGFLAYFVSINTVIYMNLYILVPCFLLKNRLGHYVLAAVL TNLVVIIFLSVTQGLLFEVILPGKDPGRFATFINTFSGILTIGFVTAGSAAISLFTHWLR YNLRIDELESTTLQSELTFLKNQINPHFLFNMLNNANVLIKRNPEEASKVLFKLEDLLRY QINDSSRERVSLASDIRFLNDYLNLEKIRRDNFQFTLRQEGEVDSIWIQPLLFIPFVENA VKHSFDSEHSSYVHVFFKVDAHRLDFRCENSKPAVAVQQGKVGGIGLANIQRRLGLLYPE HYKLEQREDENLYSVILSITL >gi|225935346|gb|ACGA01000046.1| GENE 351 484153 - 484869 645 238 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 1 220 2 224 240 105 30.0 
6e-23 MNCIIVDDEPLAREAMKLLIEESDNLQLIGSFNSAATASDFMEQQGVDLVFLDIQMPGIT GIEFARTISKKTLVIFTTAYTEYALDSYEVDAIDYLIKPVEAERFQKAVDKALSYHSLLL KEEKEAIETIVAAEYFFVKAERRYFKVNFSDILFIEGLKDYVILQLNDQRIITRMSLKAI FDLLPKSIFLRVNKSYIVNTDHIESFDNNDIFIKSYEIAIGNSYRDDFFEGFVMKQRL >gi|225935346|gb|ACGA01000046.1| GENE 352 484934 - 485482 468 182 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3046 NR:ns ## KEGG: Cpin_3046 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: C.pinensis # Pathway: not_defined # 14 168 20 177 200 95 35.0 1e-18 MIGTTDYISIKLTDETQFRSIFDKYYISLCMFANQYVENDALAADIVQECFVKLWQLRDD FMYVHQIKSFLYTSVRNKSLNELEHTKVMNEYAQKVQEMSKDSFFQDKVIAEESYRILVD AIEKLPPQMKSIMQLALEGKSNPEIAETLNISGETVHSQKKIAYRKLRVYLKDYYYLVFY FL >gi|225935346|gb|ACGA01000046.1| GENE 353 485567 - 486736 819 389 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 193 352 132 291 331 80 34.0 6e-15 MINQHFYIARLIARYLSDEIGEEEQAELTRWRNESPENERLFREICKEENIKQNMQKRQT FHTEDGWEGVQRKIQRHRFRHRILNICKYAAIFIFPVVVATVAIYKSSNEPQPLSQVAEQ IVPGGKKAVLILDNGEAIDLKSTSGVELKEKDGTVIQVDSTALNYQQAPARTSEKLAYNK VNVPRGGEYQLTLSDGSKVQLNSMSSIRFPVQFAQDCRLVELEGEAYFEVSKTGQPFIVQ TKGMKIEVLGTTFNISAYANEEYQTTLVSGSVKVQTENGSNRILKPSEQACITPGSNQIN VRNVDTAFYTSWIHGKINFKDQRLDDIMKTLARWYDMDVVYENEATKELRFGCYVNRYNE ITPLVKLLEQTGRVTVTVEGKTIKIFTNH >gi|225935346|gb|ACGA01000046.1| GENE 354 486747 - 490124 3291 1125 aa, chain + ## HITS:1 COG:no KEGG:BVU_2447 NR:ns ## KEGG: BVU_2447 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 1125 1 1117 1117 1296 57.0 0 MEKLRNVARFGKYGKRLCMMFLFLCITTSGLMRASIFAQTTVTAKFRNVTLNEVLWEIQK QTDFTFIYSTNDAKKVRVENLDVKNELISEVLNKCLRNSGLTYTVHDGVIAIRKAEPVRT ETVAREKYTITGKVIEDSGEPIIGANVIVKGTTNGMMTDMDGNFHLEVTDKKVTLMVSYI 
GYTSQEVVATPGKPMSIILKVDNNLLDEVIVTGYGTFKKSAYAGSASIVKTDAVKDVPNV SFQQMLEGAAPGVSVNTGSGIPGSSTSIRIRGMGSFNASNSPLYVVDGVPVLSGNIGASG SDSGLDVMSTLNTSDIESITVIKDAAAASLYGSRAANGVVIITTKQGKAGKPVFKLKSDW GFSNFAMPFRELMGGQERRNLIHEGLRNYALTYSGKTDDEVGLYQGMTENEAWAYADSNI DQYAPIPWCGFVNWDDYLFRNGSHQNYEFSASGGQEKIKYYTSIGYMKQDGVTINSGLER ISARLNVDYQMAKWMNIGAKIQFSKVNQDTYSEGTSYTSPIYGTRNGATPSDPIWNEDGT WNRALIKLDDRNPMLSNSYNFKREYATRSFNTVYASFDIWKGIKFASTFSYDFVMNKSRA WKDPRTSDGDDDNGRFSKDYNDITNMTWSNILTYQTKIKKKHNLDLLAGYEINSKESDGL GTTISNFARWDKPEVNNGVVYQSMGGSNSTTRIVSYITRANYDYDNKYYLGASWRTDGSS RLARENRWGNFWSLSGAWRVSSESFMKPLQNWLSDLKLRVSYGVNGTLPSSYYGYMGLSS LTSNYNDNPGITQSQLENKELTWETNYNFNSGIDLGFFDNRLNVTFEYYVRTTKNLLYSR PLSLATGFSSYLANIGKLQNKGYELEIRSTNIETKDFRWSTSLNLGHNSNKILKLDGDLK QVTSGTSIHKVGLPYSTYYMIEFAGIDPADGEPMFYKNTTDENGNLNKETTKDPRSAEKV VLQCADPTITGGLGNSISYKWFDLNFNVNFSFGAWNYDGAAGKLEHGGDGTLNIPIYYRK RWQKEGDETSIERFVVGRSISMTDYATTRRLYSGDYVRLKNLTFGFTLPKTWTRKVGIDN IRLYASGNNLLTWAAYDYIDPESGSSPSWDTPPMRTYTFGLEVKF >gi|225935346|gb|ACGA01000046.1| GENE 355 490151 - 491608 1500 485 aa, chain + ## HITS:1 COG:no KEGG:BVU_2446 NR:ns ## KEGG: BVU_2446 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 482 1 480 482 394 44.0 1e-108 MKKIKYIALGVFATLLSSCGNDWLDLQSSTAIETDGSLTELRDFEFVLNGAYSSMQSSSY YGADMFCYGDLRGDDMKSYKSSSTNVSFYTFKYNKTNGPSGFWGMYYGIGKNLNILFRDI EKIKLVPDREITTPKLEKLTEQEYYNDLKGEALAIRALLLFDMTRIYGYPYLKDNGASLA VPIIDKVVEDKNIKPSRNTTAQCYKAITDDLTDAVKLLRPVKKEGKINKWGAMTLLSRVY LYMGNDKEAYDVAAEAIKGAEKQGYKLWTNDEYAKIWATPFNSELLFEIVNLTTDSPGKS SIGYLSTKYNLIATEKYWKDYMKDNSDDVRSQMVSTASSSKPYCLKYPVQGSKSYEDANI PVFRLSELYLNAAEAAIKKNDIPNTRKYLKPIYARTDKDLDAVADEDINLDLVLEQRRIE FWGEGQRFFDLLRNNKKVIREDYLSEVPNEAVEFDWSYYKIVLPVPNHEMEYNENMVQNP EYELH >gi|225935346|gb|ACGA01000046.1| GENE 356 491633 - 492799 1176 388 aa, chain + ## HITS:1 COG:SPCC1840.07c KEGG:ns NR:ns ## COG: SPCC1840.07c COG0639 # Protein_GI_number: 19076006 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related 
serine/threonine protein phosphatases # Organism: Schizosaccharomyces pombe # 80 336 27 278 332 94 30.0 3e-19 MKKIYSIYFLLGLFITAFYSCGEDLTPVGPDIAEITHFRNDGPYLFYENGRLKILEVTKD NALNIREESGLPAGLKLDVYSDDNQLLFQVPINKIENFERPAWEDRTEYAKTFAVSDLHG RFDLFAAILKTGEVINDKYEWIYGSNHLVIDGDIFDRGADVLPILWLIYKLEFEAKAVGG RVTTILGDHEEMIMRDNLKYTYAKYNTLSQRAMNMTYGKMWGLTNVMGNWLRSKNTIQIV GENLYVHAGLSKAFMEREETIPEINELVSKSIYLSKEERKKQYPDIADFLYSDSYNGPLW YRGMVKTGSEYSPIKEADVDKLLAQYDVKRIIIGHTENSRVKYTYNKKVYDICVNHPKAF EKETRAVVIEGDDIKAINDEGELVTIKK >gi|225935346|gb|ACGA01000046.1| GENE 357 492820 - 493968 930 382 aa, chain + ## HITS:1 COG:SPCC1840.07c KEGG:ns NR:ns ## COG: SPCC1840.07c COG0639 # Protein_GI_number: 19076006 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Schizosaccharomyces pombe # 108 348 56 296 332 80 27.0 4e-15 MSTRFLRLFLSTLVLVLISSGIQAGTYHSGDKKKEKKEKLSGDGPYILYQADGSTRVINV NKKGRITDKTYATLPKDFSFRVTDHEGRYPFDVKLHPLKRPEWQYTRPEKVFVMSDPHGR LDCVISLLQGNGVINDNYQWNFGSNHLVIIGDIFDRGKDVLQIFWLFYKLEDEAVKAGGH VSFLLGNHEALVLSNDLRYTKDKYKLLAEKLGVEYPSLFGTNTELGRWLATRNTMQIIGT DLYVHAGLGKLFYDKDLNIPTVNEEMSRALFMSKKERKALSPLTDFLYGNDGPIWYRGLV REDPKYKPLVQDSLQMMLDRYMVKHILVGHTIFKDISTFYNGKVIAVNVDNKENRKKKRG RAILIDNGVYYVVGDDGVQRKL >gi|225935346|gb|ACGA01000046.1| GENE 358 494077 - 495327 674 416 aa, chain - ## HITS:1 COG:YLR189c_2 KEGG:ns NR:ns ## COG: YLR189c_2 COG1819 # Protein_GI_number: 6323218 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Saccharomyces cerevisiae # 2 399 5 412 462 110 25.0 5e-24 MKILLVTRGSQGDIYPYLTIASALIKRGHQVTLNLPQIFEKEAKAYQLDYVLQDFDDIQG MVSKAGEKSEGAKPYLKWMRDVIDVQFKQLIPLLKEHDILIATNSEFAAASVADYCQKPF IRTAFAPFIPGRHIPPPIFPYPKPHPIFTPRFIWKLLNIGNNYMTQKTINKNREQLGLKP LKNCGYYSTERAFNYMLYSRYLGNTDPDWKYKWDIGGYCFNDALHYDTDAYQQLLDFIQQ EQRPVIFFTLGSCSAKESDAFCNRLIHICRQLNFRLIIGSGWSGTGKSLANDKDIFLLTH 
TIPHSLIFPHCDAVMHHGGSGTTHSVARAGKPQVVMPLIIDQPYWAYRVLQLGIGPKCIK INKISDRELKEKVNDLVTNPMYKTKAMELAKQIRNEKSVDNFCDFIESVVMKNKNI >gi|225935346|gb|ACGA01000046.1| GENE 359 495347 - 496564 931 405 aa, chain - ## HITS:1 COG:SSO2309 KEGG:ns NR:ns ## COG: SSO2309 COG0535 # Protein_GI_number: 15899070 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Sulfolobus solfataricus # 30 361 36 322 344 70 25.0 7e-12 MKTKRTRSSFRSMNSERRLESIFLFVTGKCNAKCAMCFYANDMAKKEKDLTFEEIRKISE TAGEINKLWVSGGEPTLREDLPEILEMFYRNNQIKDVNMPTNGLKPDRVIEWVKRFRINC PDCNINVSISLDGFGDTHDTQRGVPGNFYKAADTIRKVSEHFKDDGKVLLNVATVITKYN IDQINDFMTWMYGRFHLSTHTIEAARGVTREDGVKALDESTLRQIQDEAAPIYRAYAKRM VSSTSGLRKPITKFFYIGIIRALYDIRASNIDQPTPWGMDCTAGETTLVIDYDGRFRSCE LREPLGNVKDYDCDVQQIMQSEAMKQEIEAIGHGYKANCWCTHGCFITSSLTFNPRKMIK KVYKGYREVNKLDKPIDFGEDNMSKMEEHYHLDREKLRLLGIIRN >gi|225935346|gb|ACGA01000046.1| GENE 360 496458 - 496652 57 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAHLALHFPVTNKKMDSNLLSEFIERKELLVRLVFILLEYCFNYWLIHIYMRQKYINLQF EVQR >gi|225935346|gb|ACGA01000046.1| GENE 361 496781 - 497278 563 165 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 43 156 3 115 117 126 44.0 2e-29 MKLMKGLLSAFAVILATTACAGNSGENKKSNESTKEDNKMEVVALNKADFLKKVYNYEAS PNDWKFEGSRPAIVDFYATWCGPCKVMHPILEELSKEYSGKVDIYQIDVDKEQDLAAAFG IRSIPTLLMIPMKEEPRIMQGAMPKDQLKKAIDEFLLKQNNEAKQ >gi|225935346|gb|ACGA01000046.1| GENE 362 497792 - 499189 1318 465 aa, chain + ## HITS:1 COG:DR0430 KEGG:ns NR:ns ## COG: DR0430 COG1690 # Protein_GI_number: 15805457 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 42 463 40 464 470 382 50.0 1e-106 MGIRLKDLSKLGYRDNVARSLVVDIVSKHCKYDTKEQIEMTLSDILEHPESYKNNEIWNK 
LAERLSPTIIAKEFIAYDLLDEPLMYKTYGGKFIETLAKQQMNLAMRLPVTVAGALMPDA HAGYGLPIGGVLATDNVVIPYAVGVDIGCRMSLTVFDASADFLKRYTYQMKEALKDFTHF GMDGGLGFEQEHEVLDREEFRLTPLLRDLHGKAVRQLGSSGGGNHFVEFGEIALQENNVL NLPEGNYLALLSHSGSRGLGAAIAKHYSLLAREVCRLPREAQHFAWLDLNTEAGQEYWMS MDLAGDYARACHERIHLNLAKALGLKPLANVNNHHNFAWREEIAPGRMAIVHRKGATPAQ KGQAGLIPGSMATAGYLVCGKGMEAALNSASHGAGRAMSRQKAKDSFTQSALKKLLSQAG VTLIGGSVEEMPLAYKDINRVMYTQETLVEVQGKFMPRIVRMNKE >gi|225935346|gb|ACGA01000046.1| GENE 363 499232 - 499939 698 235 aa, chain + ## HITS:1 COG:no KEGG:BF2725 NR:ns ## KEGG: BF2725 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 233 1 233 237 385 77.0 1e-106 MISEKYDRTYHYPFSPGTTSDDRINHTYWEDIQRIRTLVHTEKLDGENNCLSQWGVFARS HAAPTTSPWTRQLRERWELIKNDLGDIEIFGENLYAVHSIEYQRLETHFYVFAVRCMDQW LSWEEVKFYAALFDFPTVPELKIESVSGLTPELLKQEIIRMSQEPAIFGSCEPWTKEVCT REGVVSRNVGEYLVSEFAHNVFKYVRKGHVKTDEHWTRNWKRAPLVWEFNNEKEE >gi|225935346|gb|ACGA01000046.1| GENE 364 499942 - 501048 640 368 aa, chain + ## HITS:1 COG:CAC0753_1 KEGG:ns NR:ns ## COG: CAC0753_1 COG0617 # Protein_GI_number: 15894040 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Clostridium acetobutylicum # 18 208 21 210 228 119 32.0 1e-26 MNWKLIEDKSWCSLEQLFEWVREMNTVQQDIRYHAEGSVAEHTRMVLEALQQSSAYHSLS TLEKEIIWTSALLHDVEKRSTSVDEGEGRVSAKGHARKGEYTVRTILYRDCPAPFHIREQ IASLVRYHGLPVWLMEKPDSVKKLCEASLRVDTLLLKMLADADIRGRICEDMNELLEALE LFEIFCREQDCWKKPRGFATDYARFHYFHTEDSYIDYVPHEQFKCEVTMLSGLPGMGKDY YIQSAGIDVPVVSLDVIRRKHKLSPTDKSANGWVVQTAKEEARTYLRKGQDFVWNATNIT RQMRAQLIDLFVDYGAKVKIVYLEQPYHIWRQQNKSREYALPESVLDKMLDKLEVPQLAE AHEVVYQV >gi|225935346|gb|ACGA01000046.1| GENE 365 501460 - 503109 1155 549 aa, chain - ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 107 548 25 513 677 305 37.0 
1e-82 MKLLIQKAILANILLFIGYSLSAGEIKIKIDKRYLNIPVSHKEDRHKMTFEVKGQPALSV VIRLAPDEADYWVFKDVSNLKGKTLTISYEGNEKGLSNIYQDDNIKGEDNLYKENNRPQF HFTTKRGWINDPNGLVYYDGEYHLFYQHNPFEREWENMHWGHAVSKDLVHWEELTDALHP DHLGGMFSGSAVIDYGNTAGYNKGDTPAMIVAYTAAGPDKQVQCIAYSLDKGRTFTKYTN NPVIDSKHIWNSHDTRDPKIFWYTSGNHWVMVLNERDGHSIYTSPDLKNWKYESHVTGFW ECPELFELAVDGNPNNKKWVMYGATGTYMLGSFDGKTFIPESGKYFYTKGSLYAGQTYTN IPNSDGRRIQIAWGRISHPSMPFNGMMLLPTELTLHTTKEGIRLFSNPIKETKQLFTPLK KWASLTSDKANDHLKEFRNAETLRIKTTFKLSHATSAGIDLFGQRILDYDMNANTINSVF YSPEDRNSMELTADIYIDKTSIEVFIDGGIYSYSMGRKAANNNQDGIRFWGNNIAVKNLE IFSVKSIWN >gi|225935346|gb|ACGA01000046.1| GENE 366 503310 - 504524 866 404 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 82 356 46 297 341 68 26.0 2e-11 MYKKQVMNKNKIIIVFSWILGAMLFGCSDSDANENNNSGDKGFTYSDVITAYDSFNEYLF QDSRQVYRRDAGSGISEIAVGWTQAMMFDMTINAYKLTGDKKYMDLMERHFEGCSNEFTF DWYDYSHWDLYDDMMWWVGSLARAYLLTKDDKYLKISEDGFYRVWNGKPQSEGGHPLDKG SFDPNSGGMYWDWKFGRTGKMACINYPTIIAAMELYKATNNSEYLEKAKTVYKWASENLF NPVTGAVADSKHDGNADAAWTMLVYNQATCMGSAAMLYLVTKDQTYLNHAKAAMDYIIAK KSTANKVLRPEGNPSQEELTDEKGIYNAILAQYIPILINDCGQTQYTEYIERSINLGWKN RDKERNLTNKFLERAPLSTSPLSSYTASGIPALMLTFPNVKDRK >gi|225935346|gb|ACGA01000046.1| GENE 367 504602 - 505408 508 268 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4502 NR:ns ## KEGG: Cpin_4502 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 268 1 235 235 74 29.0 3e-12 MKKITFCTLLCALFTFGSCDMFSVDNYDEPAETLQGEVVDVATGEKVLTDQGSEGIRVRL TELSWGDNVQHNPDFYCMPDGTFQHTKLFKGTYNIRIDGPFIPLVREDNYGVPLADETIE KEISGITKVRFEVEPFLKVVLIGEPTISNGKITAKVKVTRGVSKEIFKEKIEPLGEYKDS FLNMTDIQLFVSYSSTVGYRARDDRWSNKIEYTGADFEPLFGTEVTIKSNGTIPSGRTVF IRAAARINYETESVRRWNYSEPMEVLVP >gi|225935346|gb|ACGA01000046.1| GENE 368 505449 - 507503 1529 684 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4503 NR:ns ## KEGG: Cpin_4503 # Name: not_defined # Def: 
RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 6 674 9 596 607 293 35.0 1e-77 MKKIFILSLFSLLLFSGCNDLDIAPKNLITDKDLIGSESGMDIYMARMYSNMPFEDFKYM ARWGFNFSSWLGAMGIEGTGEAVGRDDICKTFTGEDTQYWGKSFPLLRDANFLIENLPKY ADNFAGNVYNHYMGEAYFVRAFVFYTMAKRYGGVPLVTRVIQYPADESTLEVPRSSEEET WNQVLSDFDTAISLLSNKPLKDGYSSKYIALAFKSEAMLYAGSVAKYNEKVQGRLTGLGR KTGVRVIGFDESRWEDVSKKYFTEAYKAAREVMINGGYDLYKKKWAANDKEAQYQNMVDM FSDLTSPENIYVRKYIYPTITHGYDAYSAPLIFRAPLSSETCPTLDFVELFDGFEHYADG TMKVTDGASNTDGNYLLFDTPMDFFKNAEPRLRAYVIFPGDMFKGREIEIRAGIYTGATP IKPFFNDYSFANADTHYQDLNAYTGKPKTLYLSPKMENQEVVKYEGKDMTAGGENGPFYD QGESAITGFYGRKWLNPDPSFAAGEGKSAQPFILMRYAEVLLNAAEAAIELSLAGVTSPD GKNLLQLATKAVNDIRERAGATLLTSNLTATEGGRDIVRKERRKELAFEHKTKWDLRRWR VQDYNNRSGFWGETRDKDKFSSNSRYRFRGLYPFFSTQAGKYFFDARFQWTSNRTADYST IDYYFGIPGGEVTKSPLIDQQPNR >gi|225935346|gb|ACGA01000046.1| GENE 369 507540 - 510719 2228 1059 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4504 NR:ns ## KEGG: Cpin_4504 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 1 1059 12 1074 1074 718 39.0 0 MMKTQGVKKMQYLYSCILMVFLLTLSSPSWAQKRTITGIVTDETNEPVIGANVVIKNTTI GTITGIDGQYRIEAPDNATLVFSFIGYNSIEEKVNGRTQINVSMKSNDITLSDVVVVGYG VQKKVSLTGAIAGVRSTDLLKTKNENPQNMLTGKISGVRVWQKSAEPGSYSNNFDIRGYD APLVIIDGVPRDVQDFQRLNANDIDDISVLKDASAAIYGVRSANGVILVTTKKGSKEGKT KVSYNGSFTIQQPSDMPKLSDPYGTMILYNEKAMNKVDGGNIIYGEDQFEAFRNGSRRAT DWNSLIFSDHSPQTQHDISISGGNERTQYYIGMGYFYQEGFFKSGDLNYEKYNIRSNIST RILKGLTFELNLSAFLDERNSPYYSSVDIIRNYWSQGVLYPAYADPENTMLNYKGLELEN NTVAFMTSDVSGYKKNKQKNIQSSASLNYDFGTITPVLKGLSAKALFSYDYRLDNNESYR KEYYQYAYDDLTDTYTQKLYNNSSPSNLLRKMYDKQQTLAQLILNYNRTFGEHSISGLIG WETQKRQGDNFYAQRNLAFSVPYLFVGEDTAQQGGMYSGNSDLYEEANSALIGRINYAYG SRYLLEAQFRYDGSSKFAKGHKWGFFPSISAGWRISEEPFFKSISALSFVNQLKVRASYG VLGDDLSNNWNYEWAQGYNYPATSGNAEKGYYNQYAPGFVFGDKFVYAASPKPLPNEGIS WYESRTFDIGVDFEGWDGLFGFVIDYFDRRRKNMFARSSGDLPTVVGAEAPLENVNSDRQ FGIDLELTHRNKIGDFSYKVKAIGTITRRKHMTAVDKGPWSSSYDRWRNDNLNNRYQGVQ FGYNSAGRYTDWNDIWTYPIYKDRDILPGDYKYVDWNGDGEINGQDEHPFAFDQTPWLNY 
SLNLDFAYKNFDLNILFQGSALGSMEYKEPLYNIWGENGGGALEQFLDRWHPTDPLADPY DPKTKWVSGHYGYTGHYPKGNSEFNRVSTAYLRLKSIEIGYTLPRIKALSTMNLRIFANA YNLFTITGVKFVDPEHPDDDLGRMYPLNKTYTVGMSLSF >gi|225935346|gb|ACGA01000046.1| GENE 370 510842 - 512686 1245 614 aa, chain - ## HITS:1 COG:no KEGG:Geob_3465 NR:ns ## KEGG: Geob_3465 # Name: not_defined # Def: hypothetical protein # Organism: Geobacter_FRC-32 # Pathway: not_defined # 49 192 292 438 2901 65 32.0 8e-09 MKNKKLYFELRMKLMKNLRTFTFFMLAGAIAFTTSSCSDDDDSTKSDYGTVTGMVTDENN TPINGVAVTISDVEGSTATGADGKYSFSNVPMDKRTITFTKKDYLTTSATVLAGKFDANK VASISIQMLSAAAKVMGTVTDAKNNNAPLAGVTVKLGSLTTTTDNDGTYLFESLIIDDYS ITFSKTGYVAVSKDIKKAQFIDKVATLDIRLGGIEVLPGLTADDLKEADRWLYNEYRGGR NGDSYPHWDWSTDYMCTLSFWGNMEEQNEGTTLRIRNTGDEQKNPANLDAFDSYTYGSKK ITEDNKILSLRIRTHNADDKTPVYFGVQVVDLSAAEPVAVKVGETQTCGSQSYTDYDFDL SDYVGKEVIIAIGIYRTATGDYWKQLVLRAIRFAPQKVEGWNWLPGATITGLEEWKLTAE MVRSTMVQTAKSFTGISPIGGNRDNYYAAYRAWRDVNHIAYCWSFMPTVKDPEVFPSEGY LIKTTGNGGANTNKPEAYFYAKFAIAEGNNKLTLKTRTFGSQWTFFKLTAIKEDGTFVNL TPTSNTAQEAAAAEDGCWKFKHGLGGADTPNAYAKFVYDLSQFNGNNVVIAIGVFNGEKS SDENKLVFHSVEFN >gi|225935346|gb|ACGA01000046.1| GENE 371 513160 - 515700 1031 846 aa, chain + ## HITS:1 COG:no KEGG:BT_2204 NR:ns ## KEGG: BT_2204 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 845 20 851 857 570 38.0 1e-161 MKLSSCHIIAIISIFFLSSTLYGQGLEFKGNDYPIDERTSYNVFNDTPIQFSDKFNISFE MSLTRPSRLGYIVRIKNENNKIYNLSYYNDGIFSVFKLNEEGKNSLIAAKFETKDLVASR WFQVSIKFDLQKDSLCLAIKQQNFFVHNLELPSKWTPDIYFGKSDYMIDVPVFSIRQLVI FDDKQQYNFPLDESEGEEVHSIEGKVFGQVSNPKWLINESYNWTQKYKFTSSSVAGYNFD DLTDNIYIFNKDTLITYNLYSGDVICNSLANKCPIDIFLGTNFWNSGANKLYVYEVHVDD AGKPTVATLDLRAKEWTVVSNENLPMQLHHHSVAYDRENERHFIFGGFGDIYYSKELYVY NYNKNRLDSVVLKGDRIEPRYFSSMGYRKDDNSLYIYGGMGNESGEQIVGRQYFYDLHKV DLNNNTVSKLWEIPWNRENIVPVREMVIQDDSYFYTLCYPEHCSNTYLKLYRFAFKDGAF QILGDSIPIRSEKIKTKANLYYSDKLNKLFAVVQEFDDDDISSSVGVYSLAFPPISHASL SAYKPHSRNSEFTFQILIALLILLVIVIISALIFFIRRRSHEKQGANDKKTVINPVNVKC STSLEQNSVKANSVYLFGEFMVRDRQNKDITYMFSTKLKQVFLSILQYSPKGGISSQRLS 
ELFWPGKSEDKVKNSRGVAINHVRGILKEIDGIELVYDKGLFRIEYTDEFYCDYLACVKL LMINNTGGNATELIGIVSRGKFLRSIDMPEFDSFKGNLEQKLEPVLLIEIENCFKKEAYK IVVALCESLFYIDPINDEALCYAIQSLTKMNMVNEAKVQYLKFSIEYMNTMNTEYPYSFT DIQKKV >gi|225935346|gb|ACGA01000046.1| GENE 372 516160 - 517659 1286 499 aa, chain - ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 3 497 6 499 500 292 40.0 1e-78 MTLILAVIPVLLLIILMAFFKMSGDKSSIISLIVTMLIALFGFAFSVDNLFYSFLYGALK AVSPILIIILMAIFSYNVLLKTEKMEIIKQQFASISTDKSIQVLLLTWGFGGLLEAMAGF GTAVAIPAAILISLGFKPIFSATVSLIANSVATAFGAIGTPVLVLAKETNLDVLHLSTNV VLQLSVLMFLIPLVLLFLTDSKLKSLPKNIFLALLVGGVSLASQYVAAKYMGAESPAIIG SILSIIVIVIYGKLTASKEEKARKSHLKTKDILNAWSIYLLILFLIILTSPLFPGLRHTL ENNWITRISLPINASTVNYTISWLTHAGVLLFIGTFIGGLIQGAKVKDLFIVLWNTVKQL KKTFITVICLVGLSTIMDSSGMIAVIATALATATGSLYPLFAPVIGCLGTFITGSDTSSN ILFGKLQANVAGQIHVSPDWLSAANTVGATGGKIISPQSIAIATSAGNQQGKEGEILKAA IPYALVYVAIAGIVVYIFS >gi|225935346|gb|ACGA01000046.1| GENE 373 517752 - 519203 1421 483 aa, chain - ## HITS:1 COG:no KEGG:BT_1452 NR:ns ## KEGG: BT_1452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 482 1 482 484 849 91.0 0 MKRKLIYLFIAFAAAILPAHAQKFLNDALTLSNVSLWQQGNSLYVGMTFDMKNLTIGSAR SLSLIPLLTDGQHNVPLQEIIVNGKRREKAYIRSLAITKQEPTAIIVPYNKRETFNYTQV IPYKPWMANASLQLVENLCGCGNYQEMNAQELITNDVSTEAKRLSAMSPIIAYIQPTVEV VKNRSEQYEAHLDFPVNKSVILTDFMNNHAELVNIHAMFDKIQNDRNLTVKGISIKGFAS PEGPLTFNEQLSKKRAEALKDYLVKNEKVSSKLYKVTFGGENWDGLVKALQSSSMKEKET FLNIIKNTTDDAKRKQEIMRVGGGAPYRSMLKEIYPGLRKVNCKIDYTVVNFDVEQGRII IRENPKYLSLNEMYQVANSYPKGSKDFVNVFDIAVRMYPTDAVANLNAGAVALSQKDLDA AVKFMEKADHNTAEFINNTGVYNFLNGDINRAMAAFEQAAKLGNEAALANLKQLQQILSV KMK >gi|225935346|gb|ACGA01000046.1| GENE 374 519228 - 519815 684 195 aa, chain - ## HITS:1 COG:no KEGG:BT_1451 NR:ns ## KEGG: BT_1451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined 
# 1 195 1 195 195 370 93.0 1e-101 MRTIKILLAAGFLLVFGTSIHAQVQRNETYLPKFAIKTNALYWATSTPNLGIEVGLAKKL TLDISGNYNPWKFGDDRQIKHWLVQPELRYWLCERFNGSFFGLHGHYGEMNVSNLNIFGM GHDRYDGNLYGTGISYGYQWIISKRWSMEATIGVGYARLEYDKYARGDGGEKLGHNTRNY FGPTKIGLSFIYVIK >gi|225935346|gb|ACGA01000046.1| GENE 375 520065 - 521609 1584 514 aa, chain + ## HITS:1 COG:BMEI0801 KEGG:ns NR:ns ## COG: BMEI0801 COG4799 # Protein_GI_number: 17987084 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Brucella melitensis # 1 514 1 510 510 672 64.0 0 MKELISNLEELNRKAEKGGGDARIEKQHSVGKLTARERIDLLLEKGSFIELDKLVTHRCT DFGMEKQKFAGDGVVTGYGMIGKRLVYVFAQDFTVFGGALSETHAKKICKVMDMAMQMGA PIIGLNDSGGARIQEGVRSLAGYAEIFLRNSMASGVIPQISAIMGPCAGGAVYSPALTDF ILMVKNSGYMFITGPDVVRSVTQEEVTKEELGGVGVHMTKSGVAHLSAENDIECINYIRE LISYLPGNNMEEPPFVATSDSPTRLTPELADLIPSNPNQPYNIKEMIEAVADDNSFFELQ AEYAANIVTGYIRLNGKTVGVVANQPLVLAGTLDINASIKAARFVRFCDAFNIPLLTLVD VPGFLPGIDQEYGGIIRNGAKLLYAYCEATVPKVTVITRKAYGGAYDVMSSKHIRGDVNL AFPTAEIAVMGPDGAVNILFRKEIDKAEKPEEKRKELQDDYRGKFANPYRAAELGYVDEV IDPAVTRLRLIRSFEMLANKRQSNPPKKHSNLPL >gi|225935346|gb|ACGA01000046.1| GENE 376 521688 - 523199 1476 503 aa, chain + ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 478 1 493 493 494 50.0 1e-139 MIKKILVANRGEIAMRIFRTCRVMNISTVAVYTHVDRGALHVRYAEEAYCISESPEDTSY LKPELILSIAKKTGAAIHPGYGFLSENADFARRCEEEGVIFIGPSADIISKMGIKTEARK IMREAGLPIVPGTETPVQGIDEVKKVANEVGYPIMLKALAGGGGKGMRLVRTEEEVETAL RLSQSEAGTSFGNDAVYIEKYIENPHHIEVQIMGDKYGNVVHLYERECSIQRRNQKVIEE SPSPFVKEETRKKMLKVAVEACKKIGYYSAGTLEFMMDKDQNFYFLEMNTRLQVEHPVTE ECTGVDLVRDMITVAAGNPLPYKQDDIEFSGAAIECRIYAEDPENNFMPSPGVITVREAP EGRNLRLDSAAYAGFEVSLHYDPMIAKLCCWGRTRASAISNMARALREYKILGIKTTIPF HQRVLKNAAFLKGEYDTTFIDTRFDKEDLKRRQNTDPTVAVIAAAVRHYEREKEAASRAT TLPVVGESLWKYYGKLQMTANNY >gi|225935346|gb|ACGA01000046.1| 
GENE 377 523229 - 523753 646 174 aa, chain + ## HITS:1 COG:YGL062w KEGG:ns NR:ns ## COG: YGL062w COG1038 # Protein_GI_number: 6321376 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Saccharomyces cerevisiae # 80 174 1066 1169 1178 72 43.0 4e-13 MGTTLATYYAKLQDMPDSEYKVEILEDGPIKKIAVNGKIYEVDYNMGGDSIHSIIIDHHS HGVQISPSSNNSYTIMNKGELYQIELQGEMEKIHNARTAAESVGRQVVQAPMPGVILKTY VKKGDLVKRGDPLCVLVAMKMENEIRSVTDGVVKEVFVEDGMKVGLNDRIMVIE >gi|225935346|gb|ACGA01000046.1| GENE 378 523842 - 526199 2237 785 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 344 645 5 316 328 195 40.0 3e-49 MSIEQSKEALLEQLEALKKENEQLRKELSTLRDEKYNDRPVSFKEKYAVRILDSLPDMLT VFNQSEVGIEVVSNEETNHVGISNKDFRGMRMQDMVPPEAYQNIHSNMRHAITTGTVSTA HHELDFNGERHYYENRIFPLDEEYVLIMCRDITERVATQRQLEVFKSVLDKVSDSILAVA EDGTLVYANKQFIEEYGVTQELGTQKIYDLRVSMTNKEAWEQKLQVIRDHEGSLAYRAAY IPYGHTKERVHQVSTFLIRENNQELTWFFTQDITDVIKKRDELRELNLLLDGILNNIPVY LFVKDPEDDLRYLYWNKAFADHSGIPASKAIGHTDYEIFPVHGDAEKFRKDDLELLQTHK RIDMQETYLSANGEARIVQTLKALVPMEGRKPLLIGISWDITNLQNIEQELIKARIKAEQ SDRLKSAFLANMSHEIRTPLNAIVGFSQLLPSAETAEEKKLYSGIINQNSDILLQLINDI LDLSKIEAGTLEYIKRPMNLGEVCRTIYTVHKERVKEGVTLVFDNEEEDLLMEGDQNRIM QVITNFLTNASKFTYEGEIRLGFGRMDKDIRVYVKDTGIGIEPEKVDHIFERFVKLNSFA QGTGLGLSICRMIIEKIGGEIGVTSELGKGSTFYFTIPYEETGEHGKFFKESKVVSKGNT VNRVQQIKKILVAEDVESNFILLKNLIGREYTLLWAKDGVEAIEMYKQYQPDLILMDVKM PRMDGLEATHIIRSYSKEIPIIALTAYAFEADKELALEMGCNDFVTKPISERTLRKALDK YSTTV >gi|225935346|gb|ACGA01000046.1| GENE 379 526306 - 526959 528 217 aa, chain - ## HITS:1 COG:NMA0943 KEGG:ns NR:ns ## COG: NMA0943 COG0132 # Protein_GI_number: 15793901 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Neisseria meningitidis Z2491 # 3 208 2 207 215 236 54.0 2e-62 MKQNVYFVSGIDTDAGKSYATGFLAREWNKNGHRTITQKFIQTGNVGHSEDIDLHRQIMG 
IPFTKEDQEGLTMPEIFSYPASPHLASQLDNRPIDFDKIKRATEELSERYDSVLLEGAGG LMVPLTTELLTIDYIVQEKYPLIFVTSGKLGSINHTLLSLEAIQKRGIVLDTVLYNLYPT VEDKTIQNDTMEFIRTWLKKYFPETKFILVPEIKKNL >gi|225935346|gb|ACGA01000046.1| GENE 380 526956 - 527798 450 280 aa, chain - ## HITS:1 COG:PM1903 KEGG:ns NR:ns ## COG: PM1903 COG0500 # Protein_GI_number: 15603768 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pasteurella multocida # 1 276 4 249 251 174 37.0 1e-43 MNKRLIAERFSKAITTYPKEANVQRQIAGKMIRLLTEHIPSPCSKVIEFGCGTGIYSRML LQALRPEELLLNDLCPDMKYCCEDLLMKKQVSFLPGDAETVSFPTESTLITSCSALQWFE SPENFFERCNTLLNNQGYFAFSTFGKENMKEIRELTGNGLPYRSREELEVALSPHFDILY SEEELIPLSFEDPIKVLYHLKQTGVNGLSTQSSPTGKQENDLCSSDNNSKNNFKNNLPQQ QWTRRDLQLFCERYTQEFTQGASVSLTYHPIYIIAKKKKV >gi|225935346|gb|ACGA01000046.1| GENE 381 527863 - 528525 438 220 aa, chain - ## HITS:1 COG:NMA2012 KEGG:ns NR:ns ## COG: NMA2012 COG2830 # Protein_GI_number: 15794892 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 206 1 201 215 143 39.0 2e-34 MKQHFIIKNNQKHLLLFFAGWGMDETPFLTIHPTDKDWMICYDYRSLAFDTDLLETYSQI TLIAWSMGVWAASQIMKQYPHLPVSQSIAINGTLYPIHETKGIAHSIFDGTLQGLNEQTL QKFQRRMCGSIADYKTFQTISPQRPMEELKEELAAIQQQYLSLPPSDFKWQKAIIGKGDR IFLPDNQYLAWENQVDSLEQVEAAHYQQELFNTILMQPEN >gi|225935346|gb|ACGA01000046.1| GENE 382 528522 - 529676 730 384 aa, chain - ## HITS:1 COG:PM1901 KEGG:ns NR:ns ## COG: PM1901 COG0156 # Protein_GI_number: 15603766 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Pasteurella multocida # 3 384 2 384 387 427 54.0 1e-119 MTLEHINQELQTLKEKKNYRSLPPLIHEGRDVLLNGQRMLNLSSNDYLGLSNDISLRKEF LKTLTPETFLPTSSSSRLLTGNFSDYQKLEQQLATMFGTESALIFNSGYHANTGILPAIC NTHTLILADKLVHASLIDGIKLSSAKCIRYRHNDISQLQRLIAENHNAYEQLIIVTESIF SMDGDEADLPALIQLKKSYSNVLLYVDEAHAFGVRGKKGLGCAEEQDCINDIDFLVGTFG 
KAIASAGAYIVCRQLIREYLINKMRTFIFTTALPPINIQWTSWVLERLSALQHKRAHLLQ ISEKLKVALTAKGYNCPSVSHIVPMIIGASEDTILKAEELQRKGFYALPVRPPTVPDGTS RIRFSLTADITEHEIDQLIKLINE >gi|225935346|gb|ACGA01000046.1| GENE 383 529898 - 532312 1548 804 aa, chain - ## HITS:1 COG:NMB0732 KEGG:ns NR:ns ## COG: NMB0732 COG0161 # Protein_GI_number: 15676630 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Neisseria meningitidis MC58 # 387 803 14 430 433 592 64.0 1e-168 MKQQRHIQTTRALLSRFRYWGRKNYAAFASMGREFQIGHLHTNVVDVALRKQNAAQTIPY HTFMTLQEIKDQVLAGIDISPDQAAWLANMADSEALYAAAHEITVARASHEFDMCSIINA KSGRCPENCKWCAQSSHYRTKAEIYDLLPAEECLRQAQYNEAQDVNRFSLVTSGRKPSPK QITQLCDTVRYMRRHSSIQLCASLGLLNEEELRSLHEAGITRYHCNLETAPSYFSKLCST HTQEQKLATLDAARRVGMDICCGGIIGMGETMEQRIEFAFTLAELNVQSIPINLLSPIPG TPLENEQPLSEEEILKTIVIFRFINPTAFLRFAGGRSQLSSEAMRKALYIGINSAIVGDL LTTLGSKVSEDKKMIQEEGYHFAGSQFDREHIWHPYTSTTDPLPVYKVKRADGATITLED GRTLIDGMSSWWCAVHGYNHPVLNQAAKEQLDKMSHVMFGGLTHDPAIELGKLLLPLVPS SMQKIFYADSGSVAVEVALKMAVQYWYAAGKPEKNNFVAIRSGYHGDTWNAMSVCDPVTG MHSLFGSALPVRYFVPSPTSRFDGEWNPKDILPLQEMIEKHSKELAALILEPVVQGAGGM WFYHPQYLREAEKLCRKHDILLIFDEIATGFGRTGKLFAWEHAGVEPDIMCIGKALTGGY MTLSAVLTSNRIADTISNHTPGAFMHGPTFMGNPLACAVACASVRLLLESGWQENVKRIE TQLKEELAPAREFPEVADVRILGAIGVIEMKRPVNMAYMQRRFVEERIWVRPFGKLVYLM PPFIITSEQLSELTSGLLKVIQKR >gi|225935346|gb|ACGA01000046.1| GENE 384 532405 - 533358 873 317 aa, chain - ## HITS:1 COG:no KEGG:BT_1441 NR:ns ## KEGG: BT_1441 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 316 52 367 370 540 87.0 1e-152 MNFSDFDTLYISPNRYNYALMVTHFSNFEYYSITSELPQPQKLSFSPNPHNKIGLYFGWR WIFLGWSVDVNDIFKKGTRKNRGTEFDLSLYSSKLGVDIFYRRTGNDYKIHKIRGFSDDI PSNYSEDFSGIKVDIKGLNLYYIFNNRKFSYPAAFSQSTNQRRNAGTFIAGFSISKHHLD FDYTALPGFIQEAMNPAMKVKNIKYTNANISFGYAYNWVFARNCLACLSLTPAIAYKASD VDAETNEAKAWYGKFNLDFLVRAGIVYNTGKYYVGTSFVGKNYNYHRNNFSVDNGFGTLQ IYAGFNFNTRKEYRKKK Prediction of potential genes in microbial genomes Time: Fri May 13 10:19:55 
2011 Seq name: gi|225935345|gb|ACGA01000047.1| Bacteroides sp. D2 cont1.47, whole genome shotgun sequence Length of sequence - 35827 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 18, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 131 - 190 4.7 1 1 Op 1 . + CDS 276 - 3404 3045 ## BT_1440 hypothetical protein 2 1 Op 2 . + CDS 3419 - 4900 1356 ## BT_1439 hypothetical protein + Term 4940 - 4984 9.2 3 2 Tu 1 . - CDS 5003 - 6205 862 ## COG0477 Permeases of the major facilitator superfamily - Prom 6401 - 6460 7.5 + Prom 6288 - 6347 6.0 4 3 Op 1 . + CDS 6413 - 6658 379 ## BF4188 hypothetical protein 5 3 Op 2 . + CDS 6703 - 6990 351 ## BT_1435 hypothetical protein + Term 6995 - 7051 9.2 + Prom 7191 - 7250 7.5 6 4 Op 1 4/0.000 + CDS 7381 - 8454 910 ## COG1609 Transcriptional regulators 7 4 Op 2 3/0.000 + CDS 8476 - 9288 1138 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 8 4 Op 3 . + CDS 9352 - 10521 1308 ## COG1312 D-mannonate dehydratase + Term 10554 - 10604 11.1 - Term 10542 - 10591 3.3 9 5 Tu 1 . - CDS 10596 - 11192 404 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Prom 11339 - 11398 7.1 + Prom 11407 - 11466 3.7 10 6 Op 1 1/0.000 + CDS 11711 - 12535 461 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 12537 - 12596 1.8 11 6 Op 2 . + CDS 12617 - 13033 468 ## COG3871 Uncharacterized stress protein (general stress protein 26) + Term 13061 - 13110 9.3 12 7 Tu 1 . - CDS 13101 - 13376 329 ## BT_1428 hypothetical protein - Prom 13396 - 13455 8.3 - Term 13395 - 13432 2.2 13 8 Tu 1 . - CDS 13538 - 13804 122 ## gi|160882889|ref|ZP_02063892.1| hypothetical protein BACOVA_00851 + Prom 13619 - 13678 7.0 14 9 Tu 1 . + CDS 13770 - 14609 272 ## BT_1427 tetracycline resistance element mobilization regulatory protein RteC + Term 14782 - 14811 0.2 + Prom 14664 - 14723 5.5 15 10 Op 1 . 
+ CDS 14826 - 15320 151 ## COG4332 Uncharacterized protein conserved in bacteria + Prom 15327 - 15386 5.3 16 10 Op 2 . + CDS 15406 - 16107 643 ## BT_1425 hypothetical protein + Term 16165 - 16202 4.5 + Prom 16194 - 16253 6.2 17 11 Tu 1 . + CDS 16325 - 17413 595 ## COG1162 Predicted GTPases + Term 17476 - 17518 -0.7 + Prom 17721 - 17780 4.2 18 12 Tu 1 . + CDS 17945 - 20191 2010 ## Fjoh_4747 hypothetical protein + Term 20194 - 20237 6.9 + Prom 20304 - 20363 5.0 19 13 Tu 1 . + CDS 20463 - 21083 576 ## BT_1424 hypothetical protein + Term 21262 - 21303 1.1 + Prom 21547 - 21606 5.7 20 14 Op 1 . + CDS 21631 - 23790 1663 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 21 14 Op 2 . + CDS 23811 - 24278 400 ## BT_1419 hypothetical protein + Term 24372 - 24411 6.3 + Prom 24333 - 24392 4.5 22 15 Op 1 1/0.000 + CDS 24441 - 25040 392 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit 23 15 Op 2 . + CDS 25080 - 26561 1445 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 24 15 Op 3 . + CDS 26574 - 27809 961 ## BT_1416 hypothetical protein 25 15 Op 4 . + CDS 27806 - 28597 754 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 26 15 Op 5 . + CDS 28627 - 29907 1155 ## BT_1414 hypothetical protein + Term 29964 - 30023 10.0 - Term 29959 - 30003 10.1 27 16 Tu 1 . - CDS 30059 - 30748 424 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 30779 - 30838 10.0 + Prom 30815 - 30874 9.9 28 17 Tu 1 . + CDS 30997 - 31536 628 ## COG0655 Multimeric flavodoxin WrbA + Term 31611 - 31659 15.1 + Prom 31605 - 31664 8.4 29 18 Op 1 . + CDS 31685 - 32506 513 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II 30 18 Op 2 . 
+ CDS 32524 - 35775 3028 ## COG0793 Periplasmic protease Predicted protein(s) >gi|225935345|gb|ACGA01000047.1| GENE 1 276 - 3404 3045 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_1440 NR:ns ## KEGG: BT_1440 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1042 1 1042 1042 1873 93.0 0 MRIYLRLLVVSLLLFVGNIVYAEAQQEKRVTGTVTSEGEPLPGVSVQLKGASSGTITDID GKYSIEVLATGTLVFRFVGMRTVEQPVNNRSVINVTLESESKELEEVMVVAYATAKKYSF TGAASTMKAGEIEKLQTSSISRILEGTVSGVQASAASGQPGTDAEIRIRGIGSINASSAP LYVVDGVPFDGSVNSINPDDISSMTVLKDAASAALYGSRGANGVIIITTKQGDQNTKATV KVKASLGGSNRAVRDYDRVSTDQYFELYWEALRNQYAKSADYTPATAATQASKDLVTKLM GGGPNPYGPQYAQPVGTDGKLVAGARSLWNSDWSDAMEQQALRTELNLSVSGGGKANQYF FSAGYLNDKGIALESGYQRFNLRSNITSEMTSWLKGGVNLSFAHSMQNYPVSSDSKTSNV ITAGRTMPGFYPIYEMNTDGSYKLDESGDRIYDFGSYRPSGSMANWNLPATLPLDKSERM KDEVSGRTFLEATIIEGLKFKTSFNFDLINYNTLDYTNPKLGPAKENGGGVSRMNTRTFS WTWNNIATYDKTIGEHHFNVLAGVEAYSYRYDELTASRSKMAQPDMPELVVGSQLTGGSG YRIDYALVGYLTQALYDYQNKYFFSASYRRDGSSRFAPETRWGNFWSLGTSWRIDREEFM ASTSDWLSALTLKMSYGAQGNDNLGTYYASKGLYTIVSNLGENALVSDRMATPNLKWETN LNFNVGIDFSLFNNRFSGSFDFFTRRSKDLLYSRPIAPSLGYGSIDENVGALKNTGIEMV LNGTIINQNGWVWKLGMNLTHYKNKVTDLPLKDMPQSGVNKLQVGRSVYDFYMIEWAGVD PENGDPLWYMDEEDENHNPTGKRVTTNDYGSADYYYVNKSSLPKVYGGFNTSLSWKGFDL SAIFAYSIGGYIYNRDITMILHNGSLEGRDWSTEILKRWTPDNRNTDVPALSTTTNNWNS ASTRFLQNNSYMRLKNLTLSYNLPKQWISKLSLSSVQVYVQGDNLFTIHRNQGLDPEQGI TGITYYRYPAMRTISGGINVSF >gi|225935345|gb|ACGA01000047.1| GENE 2 3419 - 4900 1356 493 aa, chain + ## HITS:1 COG:no KEGG:BT_1439 NR:ns ## KEGG: BT_1439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 493 493 868 85.0 0 MRKIKIKSILAAVAGLFLATSCSSSFLDTDPTDAVSSEKVPVPENAAALVNGAWYNLFDY SSTYANIGYRALQCLDDMMASDIVSRPKYGFNSSYQFNDIALPSNNRTEFAWYLIYKTID NCNTAISIQGDSEELRQAQGQALALRAFCYLHLVQHYQFTYLKDKDAPCVPIYTEPSNSS TVPKGKSTVAQVYQRIFDDLNLAQDYLKNYVRSGDNQKFKPNVAVVDGLLARAYLLTGQW EEAAKAAEAARTGYTLMTTTAEYEGFNNISNKEWIWGFPQIPSQSDASYNFYYLDATYVG 
AYSSFMADPHLKDTFIEGDIRLPLFQWMREGYLGYKKFHMRADDTADLVLMRASEMYLIE AEAKVRDGVALNQAVIPLNTLRNARGVGDYDVTGKSQEDVLNEILMERRRELWGEGFGIT DVLRTQGSVVRTALSDEMQKTEVDCWQEGGSFEKRNPLGHWFLNFPNGKPFTVNSTYYLY AIPQKEINANPNI >gi|225935345|gb|ACGA01000047.1| GENE 3 5003 - 6205 862 400 aa, chain - ## HITS:1 COG:BMEI1292 KEGG:ns NR:ns ## COG: BMEI1292 COG0477 # Protein_GI_number: 17987575 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 14 399 19 401 403 380 53.0 1e-105 MVQNQEQPVGKITFSILIALSLSHCLNDLLQSVLSASYPLFKDDLGLSFAQIGLITLVYQ LSASVFQPITGIFFDKHPVAWSLPIGMSFTLIGLINLAFSDNLYWILASVFLIGIGSSVL HPEASRITFLASGGKRGLAQSLFQVGGNFGGSLGPLLVALLVAPYGRQHLIVFAFVALAA IGVMYPICKWYKSYLNRMKAQTVSVRKPVHLPLPMDKTALSIAILLILIFSKYIYMASLT SYYTFYLIHKFNVSVQDSQLYLFIFLVATAIGTLIGGPVGDRIGRKYVIWASILGAAPFS LLMPHANLLWTIILSFCVGLMLSSAFPAILLYAQELLPTKLGLISGLFFGFAFGVAGVAS AVLGNLADKTSIEYVYNICAYMPLLGLVTFFLPNLKKQKI >gi|225935345|gb|ACGA01000047.1| GENE 4 6413 - 6658 379 81 aa, chain + ## HITS:1 COG:no KEGG:BF4188 NR:ns ## KEGG: BF4188 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 81 1 81 81 67 80.0 1e-10 MGFIWYIIIGIVAGFLAGKIMRGGGFGLVINLLLGILGGVLGGWVFALFGLAASGLIGSL ITSTVGAILVLWIASLFSKSK >gi|225935345|gb|ACGA01000047.1| GENE 5 6703 - 6990 351 95 aa, chain + ## HITS:1 COG:no KEGG:BT_1435 NR:ns ## KEGG: BT_1435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 127 86.0 2e-28 MECRHGNFWIGLGIGSILGAVAYRLSRTAKAKQLESEIYNAIHRIGRDAEIAAAHAERKA VDLGLKAVEAGAEIADKVADEADKVAGKAKDKWGK >gi|225935345|gb|ACGA01000047.1| GENE 6 7381 - 8454 910 357 aa, chain + ## HITS:1 COG:SP1999 KEGG:ns NR:ns ## COG: SP1999 COG1609 # Protein_GI_number: 15901822 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: 
Streptococcus pneumoniae TIGR4 # 8 186 7 176 336 78 32.0 2e-14 MDKEFSNIRIVDIAKMAGVSVGTVDRVIHNRGRVSEENRKKVQAILEMVHYQPNLMARSL AASKKQYHLLAITPSFVQGEYWEAISEGIDKAASEMESYNITITKLFFDQYNNKTFDDII RNLLNEKVDGVLIATLFTDSVIRLSQELDRNEIPYVYVDSNIGGQHQLAYFGTESYDAGV IAARLLMDRLSSSSDILMARIIHSGKNDSNQGKNRREGFCHYLTETGFNGNLHEVELKIN DSVYNFMKLDEIFEANPNINGAVIFNSTCYILGNYLKARGMQSVKLVGYDLIGRNTQLLS EGVITALIAQRPERQGYDGIKSLCNHLLFKQGSEKVNLMPIDILFKENLKYYLNNKL >gi|225935345|gb|ACGA01000047.1| GENE 7 8476 - 9288 1138 270 aa, chain + ## HITS:1 COG:BH1067 KEGG:ns NR:ns ## COG: BH1067 COG1028 # Protein_GI_number: 15613630 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 6 269 7 279 281 224 43.0 1e-58 MNELFNVKGKVVVITGGAGILGKGIAAYLAKEGAKVVVLDRSEEAGKALVDSIKAEGNEA MFLYTDVMDKEVLEGNKVEIMKAYGRIDVLLNAAGGNMAGATIAPDKTFFDLQIDAFKKV VDLNLFGTVLPTMVFAEIMVEQKKGSIVNFCSESALRPLTRVVGYGAAKAAIANFTKYMA GELALKFGNGLRVNAIAPGFFLTDQNRALLTNPDGSLTDRSKTILAHTPFNRFGEPEDLY GTIHYLISDASNFVTGTVAVIDGGFDAFSI >gi|225935345|gb|ACGA01000047.1| GENE 8 9352 - 10521 1308 389 aa, chain + ## HITS:1 COG:STM3135 KEGG:ns NR:ns ## COG: STM3135 COG1312 # Protein_GI_number: 16766435 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Salmonella typhimurium LT2 # 5 389 2 392 394 535 62.0 1e-152 MYLCEQTWRWYGPNDPVSLWDIKQAGATGIVNALHHIPNGEVWTVEEIMKRKQMIEEVGL TWSVVESVPVHEHIKTQTGDFMKYIENYKESIRNLAKCGVMVVTYNFMPVLDWTRTDLAY TMPDGSKALRFEKAAFVAFDLFILKRPNAEKDYTPEEIAKAKARFEQMSEDDKKLLVRNM IAGLPGSEESFTVEQFQQALDRYDDIDAEKLRSNLIFFLKEIAPVADEVGVKLVIHPDDP PYTILGLPRILSTEEDFKKLIEAVPNESNGLCLCTGSFGVRADNDLAGMMERFGDRVNFV HLRSTQRDEEGNFYEANHLEGNVDMYNVMKSLILLQQRRKCSIAMRPDHGHQMIDDLKKK TNPGYSCLGRLRGLAELRGLEMGIAKSIL >gi|225935345|gb|ACGA01000047.1| GENE 9 10596 - 11192 404 198 aa, chain - ## HITS:1 COG:CAC0738 
KEGG:ns NR:ns ## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 2 168 1 165 306 127 40.0 2e-29 MINFAAIDFETANGKRTSVCSVGVVIVREGKITNKIYRLIRPRPNYYTQWTTAVHGLTYD DTMEADEFPEVWAEIKPLIDGLPLVAHNSPFDEGCLRAVHELYDMTYPDYKFYCTCRTSR KVFGKDLPNHQLHTVAERCGYHLENHHHALADAEACAQIALLIIPEPKKERKTKKADKDI HVGDLFASLIPQPVKKSK >gi|225935345|gb|ACGA01000047.1| GENE 10 11711 - 12535 461 274 aa, chain + ## HITS:1 COG:mlr1196 KEGG:ns NR:ns ## COG: mlr1196 COG2207 # Protein_GI_number: 13471273 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 88 256 107 273 276 79 28.0 7e-15 MYTEYQPSHLLVPYIDSYWEFKGNPDYGMRIHILPDGCTDFIFTLGEVANAVEEGSLIMQ PYRSYFVGPMTKYSELVTYTESIHMFGVRFLPCGLSCFTNLPLHEFVNSRVSTNEMKAVF DDTFIEKLGEQKHVTDRIRVVEEYLLAYLARHYQPADSHVAMAVNMINHSKGKRSVRSLM DDVCLCQRHFERKFKHYTGFTPKEYSRIVKFKNAVELLRTTTSANLLTTAVDAGYYDLAH FSKEIKSMSGNTPASFLSLTVPEETTLTYIEPQR >gi|225935345|gb|ACGA01000047.1| GENE 11 12617 - 13033 468 138 aa, chain + ## HITS:1 COG:DR1146 KEGG:ns NR:ns ## COG: DR1146 COG3871 # Protein_GI_number: 15806166 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Deinococcus radiodurans # 13 130 42 161 193 62 30.0 2e-10 MTTKTMKEKATELLQRCEVVTLASVNKEGYPRPVPMSKILAEGISTIWMSTGADSLKTID FLSNPKAGLCFQDKGDSVALTGKVEVVTDEKMKQELWQDWFIDHFPGGPTDPGYVLLKFE SNHATYWIEGTFIHKKLD >gi|225935345|gb|ACGA01000047.1| GENE 12 13101 - 13376 329 91 aa, chain - ## HITS:1 COG:no KEGG:BT_1428 NR:ns ## KEGG: BT_1428 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 91 151 92.0 7e-36 MIRLNVFVRVNESNREEAIKAAKELTACSLKEEGCIAYDTFESSTRPDVFMICETWQNAE VLAAHEKSPHFAQYVGIIQKLAEMKLEKFEF >gi|225935345|gb|ACGA01000047.1| GENE 13 13538 - 13804 122 88 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|160882889|ref|ZP_02063892.1| ## NR: gi|160882889|ref|ZP_02063892.1| hypothetical protein BACOVA_00851 [Bacteroides ovatus ATCC 8483] # 44 88 1 45 45 73 97.0 4e-12 MFSLLFLTKFSIHSQRTHHSHLISINRTKSLNTLIASVLKNDAVNNKNDKMKNLFKYRTT QTKEISLWFMRLILSYNKFKNCAQNHCF >gi|225935345|gb|ACGA01000047.1| GENE 14 13770 - 14609 272 279 aa, chain + ## HITS:1 COG:no KEGG:BT_1427 NR:ns ## KEGG: BT_1427 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 279 279 430 82.0 1e-119 MENFVRNSKLNIEEEIKRIELQTVDPLDRIRQIIGVVQVSLTSLKTTVAGYQFRSAEEEI LFFKVSKPQISGLLMFYVRLYQIEKNRIDKSLSSQCRYLKIELENLQKSFLNNDFYEYYR AGRTELDNRYFIRENYDILSDIHCHLLDRDISFTTLHDSSVAEILANNRLIEYVSEEIEQ LTEKLHLKFTSIVDSKLLQWTDSKVALVEFIYALYAGKCFSNGNTSLKDIAFCCETLFNI EIGDFYRIFLEIRNRKKSRTQFLDKLKEQIIKMMDELDR >gi|225935345|gb|ACGA01000047.1| GENE 15 14826 - 15320 151 164 aa, chain + ## HITS:1 COG:CAC0055 KEGG:ns NR:ns ## COG: CAC0055 COG4332 # Protein_GI_number: 15893352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 155 37 183 196 88 34.0 6e-18 MNAQKKNIDVWLIYRCVKCDNTCNMTLLSRTKPDLIDKKLFHSFSMNDREVAWQYAFSAG VASRNNLQLDYDSVEYEVINTVSLEDILNMSSEIISIQVKCDFDLSLKLSSLIKRCLPLS STRLKLLFEKGYISLLSGKTSSKCKVKNGDTILMDRKSLIDFLG >gi|225935345|gb|ACGA01000047.1| GENE 16 15406 - 16107 643 233 aa, chain + ## HITS:1 COG:no KEGG:BT_1425 NR:ns ## KEGG: BT_1425 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 232 232 327 75.0 2e-88 MKKVCLLAFVLFTWSIVVNAQESDGRYVEVTGSSEIEVVPDEIHFLVQIKEYWQEEYTGK SKKEEDFHTKVPLAMIEKDLRKSLRKIGITDDAIRTQEIGDYWRQRGKEFLIGKQLDIRL TDFEQINSIIRSVNTWGIESMRIGELKHKDLPMYRKQGKIEALKAAREKASYLVEAMGQQ LGEVIRIIEPADNNISRYLPFEAQSNVSMGAAATEQYRVIKLRYEMMARFAIK >gi|225935345|gb|ACGA01000047.1| GENE 17 16325 - 17413 595 362 aa, chain + ## HITS:1 COG:alr8077 KEGG:ns NR:ns ## COG: alr8077 
COG1162 # Protein_GI_number: 17227451 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Nostoc sp. PCC 7120 # 14 344 2 333 353 206 34.0 7e-53 MNNENQTIHKFDFNLNWYGWNDKLSQLKQESLYSTLPHGRISIVHRTCYEVVSENGLFQC ELTGNMMYGKSDAELPCTGDWVLFQPFDENKGIIVYMLPRERTLYRKKSGTVADKQAIAS YVDKAFIVQSLDDNFNVRRVERFMVQVLEENIKPVLVLNKSDLDFDRQSVEEQINHISNQ IPVFFTSIHQPQTILRLRESISEGETVVFVGSSGVGKSSLVNALCEKSVLLTSDISLSTG KGRHTSTRREMVLMNDSGVLIDTPGVREFGLVIDNPDSLAEVLEISDYAESCRFKDCKHI NEPGCAVLEAVNSGVLDYKVYASYLKLRREAWHFSASEHEKRKKEKSFTKLVEEVKNRKA NR >gi|225935345|gb|ACGA01000047.1| GENE 18 17945 - 20191 2010 748 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4747 NR:ns ## KEGG: Fjoh_4747 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 3 746 4 745 746 928 57.0 0 MNKDLLKVSIRQNAIYLPLIEEEKKQEELTSTTIALVAQLRKVGYSLSEELLHAINQLYP TQQMMILQVMKEALGVTLNWSPLVKGWDVPTGETRLDHLVTWIANLFNSQKGVKLPCGHV IPDNTFPMERYNGCPFCGTPFQTATMEYFGQGSKLKVLELWQDKELNAFFCDLLESRTAL DTTQADSLKIMLGELPLPAVGIKMKETLMLVIDTLVEQDRAQEAQIYFSTPNDILRYLWY KKTGFLQIIEPKTIIRKTGRNNTHICGVLDKSRSAAQAKREELKLKYTRRECKMVALWLN NLTMAPEKACEIMHPKREMWVRMIRALRLAEYARKPEFGNLKELMDIFYREAYTVWQGEV ERNRLKADAEQTFALLKQRPGMFARSLFANMLWFGAEETLAAFKEVVHLLPARLVVTLGM YAESYFEPGRKRMVKPLGGNALLIEPHYLVGLYMEDQLKAMVKDVQDLCKEVVAARFASA TVESENKSMYIDPMLFHIPLAIGDRSETIQDTSCALQGTRFPVKGDKVRLFMQWGKGLPA QHLDMDLSCHITLPSTTEVCSFFNLQAIGAKHSGDIRSIPNKKGTAEYIELDLNELNRVG AEYVAFTCNAYSNGTISPNLVVGWMNSAYPMKISERTGVAYDPSCVQHQVRISQSLQKGL VFGVLKVKEREIVWLEIPFGGQTILSLDTQTIEKYLDKLEAKTTVGELLAVKAQAQGLKL VDIPEADEIYTREWALNTAAVTKLLLGD >gi|225935345|gb|ACGA01000047.1| GENE 19 20463 - 21083 576 206 aa, chain + ## HITS:1 COG:no KEGG:BT_1424 NR:ns ## KEGG: BT_1424 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 205 205 299 75.0 4e-80 MSAKYDLYETPDVQNTGEQQPLHARIVPSGTYSKKEFLERVSHSSQTFNYNVIDAVVGIV IDELAEALSEGYVVELGELGHFSISLKCTHKVMTKREIRAESICFDNVHLRTSKGFKRKI 
KREIELERVDKSKHSSQKVEFSMEQRQQLLQEFLKKNGGITRLEYSSLTGLSRLKAIDDL NIFIEKGILRKRGAGRTVFYVWQHEE >gi|225935345|gb|ACGA01000047.1| GENE 20 21631 - 23790 1663 719 aa, chain + ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 31 714 39 687 687 116 22.0 2e-25 MRIGIISGVIGLFVTLSVHAQKSDSIKSMLLPDVVVTETYQQRQAKKSALTVDVADQDFL RKHFTGNFMQAMENIPGVQAMDIGSGFSKPMVRGMGFNRIAVLENGIKQEGQQWGADHGL ELDAFNIGAVNVLKGPSSLLYGSDAMGGVIDVVPSAVPADNRVFGDVTLLGKSVNGTIGG SLMLGIRKNAWYSHIRYSEQHFGDYRIPTDSIVYLTQRIPIYGRKLKNTAGIERNIGLFT QYQRRAYKANYAVSNVYQKTGFFPGAHGIPDASRVEDDGDSRNIELPFSKVNHLKVTTHQ QYAWEKLILSGDLGFQNNHREEWSAFHTHYGSQPAPEKDPDKELAFNLNTFSASVKARFI GSSSWEHILGWDGQHQRNDISGYSFLLPEYRRSTTGMLWLTTYRPNNVFSVSGGVRYDYG YMNISSHEDTYLADYLRKQGYDPEQIDFYKWNSHSVNKHYGDYSLSLGLVWTPSDKHLVK VNIGRSFRLPGANELAANGVHHGTFRHEQGDANLKSEQGWQLDASYHLKYRGISFSVSPF VSWFSNYIFLRPTGEWSVLPHAGQIYRYTGAEALFAGTEATVDVDFLRNFNYRISAEYVY TYNCDEHIPLSFSPPPVMRNTLTWQKNWYMLYAEWQSIARQNRVDRNEDRTAGANLFHLG GSLNIPIGGNNEIEITLTARNIFDTRYYNHLSFYRKVEIPEPGRNFQILIKVPFKKLLK >gi|225935345|gb|ACGA01000047.1| GENE 21 23811 - 24278 400 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1419 NR:ns ## KEGG: BT_1419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 9 164 164 210 78.0 1e-53 MISLLATVTFMFSSCDNDDSSDTTKPLIELHEPEEGQALEIGNEHGVHFEMDLSDDVMLK SYKIEIHSNFDHHSHGGNSRAAQETVDFSFNRSYDVSGQKTAHIHHHDIVIPANATAGDY HLMVYCTDAAGNESYIARNIKLSNEVEDEDHHHDE >gi|225935345|gb|ACGA01000047.1| GENE 22 24441 - 25040 392 199 aa, chain + ## HITS:1 COG:Cj1358c KEGG:ns NR:ns ## COG: Cj1358c COG3005 # Protein_GI_number: 15792681 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Campylobacter jejuni # 31 165 30 167 171 83 33.0 3e-16 
MMKFPIINRLFPSYKWKVAAVIIGGVIVGGGALFMYMLRAHTYLGDDPAACVNCHIMTPY YATWFHSSHARNATCNDCHVPHENAVKKWTFKGMDGMKHVAAFLTKSEPQVIQAHEASSE VIMNNCIRCHTQLNTEFVKTGKIDYMMSQVGEGKACWDCHRDVPHGGKNSLSGTPGAIVP LPESPVPEWLRKMVNQKDK >gi|225935345|gb|ACGA01000047.1| GENE 23 25080 - 26561 1445 493 aa, chain + ## HITS:1 COG:PM0023 KEGG:ns NR:ns ## COG: PM0023 COG3303 # Protein_GI_number: 15601888 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Pasteurella multocida # 52 490 68 506 510 447 46.0 1e-125 MEKKLKSWQGWLLFGGSMVVVFVLGLCVSALMERRAEVASIFNNRKNVIKGIEARNELFK DDFPREYQTWTETAKTDFESEFNGNIAVDALEKRPEMVILWAGYAFSKDYSTPRGHMHAI EDITASLRTGSPMSPTEGPQPSTCWTCKSPDVPRMMEALGVDSFYNNKWGAMGAEIVNPI GCSDCHDPETMNLHISRPALIEAFQRQGKDITKATPQEMRSLVCAQCHVEYYFKGDGKYL TFPWDKGFTVEDMEAYYDEAGFYDYIHKLSRTPILKAQHPDYEICQMGIHGQRGVSCADC HMPYKSEGGVKFSDHHIQSPLAMIDRTCQVCHHESEETLRNNVYERQRKANEIRNRLEQE LAKAHIEAKFAWDKGATETQMKDVLALIRQAQWRWDFGVASHGGSFHAPQEIQRILSHGL DRAMQARLAVSKVLAKNGYTGDVPMPDISTKAKAQEYIGLDMDAERAAKEKFLKTTVPAW LEKAKENGRLAQK >gi|225935345|gb|ACGA01000047.1| GENE 24 26574 - 27809 961 411 aa, chain + ## HITS:1 COG:no KEGG:BT_1416 NR:ns ## KEGG: BT_1416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 409 411 726 87.0 0 MWSKPWSYKEGLVIGAGLLVIGLLLQMTVGAIHWDLFACPVNVIVLVVYIIALVAMHLLR KRVYLFGWLSHYSAAVSSLLWVVGMTVVMGLIRQAPSGHAPADLLGFSQMISSWPFVLLY FWMVTALGLTILRTGFSLKISRISFLLNHIGLFIALITATLGNADMQRLKMTTRMGSAEW RATDDKGQLIELPLAIELKDFTIDEYPPKLMLIDNETGRTLPEKSPVHVLLEEGVTNGSL QDWQLTIEQSIPMAASVATEDTLKFTEFHSMGATYAVYLKAVNQKNQTTKEGWVSCGSFL FPYKAIRLDSLTSIVMPEREPQRFVSEVKIYTQEGTITGGTIEVNRPMEIEGWKIYQLSY DETKGRWSDISVFELVRDPWLPVVYTGIIMMMAGAICLFVSAQKRKEEDKA >gi|225935345|gb|ACGA01000047.1| GENE 25 27806 - 28597 754 263 aa, chain + ## HITS:1 COG:all0936 KEGG:ns NR:ns ## COG: all0936 COG0755 # Protein_GI_number: 17228431 # Func_class: O Posttranslational modification, protein turnover, chaperones # 
Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Nostoc sp. PCC 7120 # 166 260 253 347 351 84 42.0 1e-16 MSWEYFILFAIAALVCWALGAFAAWKGTKTGWAYGFTFLGLAIFFSFIIGMWISLERPPM RTMGETRLWYSFFLPLAGLITYARWKYKWILSFSCILSLVFICINIFKPEIHNKTLMPAL QSPWFAPHVIVYMFAYAMLGAATVMAVYLLWFKKKEIERKEMDLCDNLTYVGLAFMTLGM LTGAIWAKEAWGHYWAWDPKETWAAATWFAYLVYIHFRLGKPLKARPALIILLVSFVLLQ MCWYGINYLPAAQGVSVHTYNLN >gi|225935345|gb|ACGA01000047.1| GENE 26 28627 - 29907 1155 426 aa, chain + ## HITS:1 COG:no KEGG:BT_1414 NR:ns ## KEGG: BT_1414 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 428 428 702 82.0 0 MKTLYWSFLLMLLPSMAYTQNTEKENEFTMSMQIRPRAEYRNGALTPRDEGVAPTSFINN RARLSMDYKRSDLELKMSAQHVGVWGQDPQIEKNGRFMLNEAWAKLNFGEGFFAQLGRQS LIYDDERILGGLDWNVAGRYHDALKLGYANKNNEVHLILAFNQNNDNRTSGGTYYDSSTG QPYKNMQTVWYHYKADNVPFGASLLFMNLGLETGDKATDDSHTRYLQTMGTYLTYKNSNW NLDGAFYYQMGKNKAAEKVSALMGSIQAAYTFNQTWGAVASFDYLSGDKGNGGKYKAFDP LYGTHHKFYGAMDYFYASTFANGYAPGLMDARIGGRFRLSDKVDMELNYHYFSTAVKVQD LKKSLGSEVDYQINWSIMKDVKLSAGYSFMRGTKTMDAVKTGNHKSWQDWGWVSLNINPK ILFVKW >gi|225935345|gb|ACGA01000047.1| GENE 27 30059 - 30748 424 229 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 31 229 26 225 229 80 26.0 2e-15 MKKILLTDSHKEKLFQIPLFRDLPLNIKLSLLEKLEFVVYAAEKKEIVVTQGTPCNKLYV LLEGKLRTDIIDGLGNEVMIEYIIAPRTFATPHLFSSNNTLPATFTALENSVVLMATKES TFKVISQDPQVLHNFLCIAGNCNICTVSRLKPLSRKTVRERFIVYLYEHKKKDSLVVDIM HTQSQLAEYLNVSRPALSKEINKMIKEGLVTMEGKRIVILDQITLEKYL >gi|225935345|gb|ACGA01000047.1| GENE 28 30997 - 31536 628 179 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric 
flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 179 251 65.0 7e-67 MAKKVLIISSSPRKGGNSDLLCDEFMKGALEAGNEVEKIFLKDKTVHPCTGCSVCSMYGK PCPQKDDAAEFVEKMIAADVIVMATPVYFYTMCGQMKIMIDRCCARYTEITNKEFYFIIA AAENDKAMMERTIDGFRGFLDCLEGPQEKGTVYGIGAWKVGEIKDTPYMQEAYKMGKMV >gi|225935345|gb|ACGA01000047.1| GENE 29 31685 - 32506 513 273 aa, chain + ## HITS:1 COG:MTH1101 KEGG:ns NR:ns ## COG: MTH1101 COG1237 # Protein_GI_number: 15679112 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Methanothermobacter thermautotrophicus # 4 270 2 260 260 139 37.0 5e-33 MSYKITTLVENCVYGRKLQAEHGLSLYIEFQGNRILFDTGASDLFIRNARLLHIDLQKVD YLILSHGHSDHTGGLRYFLELNTQATVVCKREIFFPKFKDERENGMKHTQNLDLSRFRFI TEQTELLPGVFLFPSIDIINEEDTHFERFWVQKEDGCKIPDTFQDELAMVLVEPEGVSVL SACSHRGITNILRTVRAAFPESPCKLLLGGFHIHNAEKQKYQIIADYLQEYLPRQIGVCH CTGVDKYAFFFKDFGDKAFYNYTGKLIQTDFSE >gi|225935345|gb|ACGA01000047.1| GENE 30 32524 - 35775 3028 1083 aa, chain + ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 712 1066 23 379 394 285 42.0 3e-76 MKKLITSLALVLSALSSYAITPLWMRDARISPDGTEIVFCYKGDIYKVPAQGGTAVQLTT QASYEANPVWSPDGKQIAFASDRNGNFDLFIMPADGGVARRLTYHSASEIPSAFTPDGKY VLFSASIQDPANSALFPTGAMTELYKVSVDGGRTEQVLATPAELVCFDKAGKNFLYQDRK GFEDEWRKHHTSSITRDIWLYNTQTGKHTNLTNRGGEDRNPVYAPDGNTVYFLSERNGGS FNVYTFPLNTPQEVKAVTTFKTHPVRFLSVSDKGTLCYTYDGELYTQKPGARPEKVKVEL VRDDDEQVAALKFSQGATSASVSPDGKQVAFIVRGDVFVTSTDYATTKQITNTPAKEASV SFAPDNRTLVYASERTGNWQLYTAKISRKEDPNFPNATLIEEEVLLPSKTVERAYPQYSP DGKELAFIEDRNRLMVLDLKTKKVRQVTDGSTWYNTGGGFDYEWSPDGKWFTLEFIGNRH DPYSDIGIVSAQGGAITNLTNSGYISGSPRWVLDGNAILFQTERYGMRAHASWGSQQDVM LVFLNQDAYDRYRLSKEDFELLKELEKEQKKAKEKDDNKKKDGDKEKADEEKVDQKDIVV ELNGIEDRIVRLTPNSSDLGSAILSKDGEDLYYFSAFEDGYDLWKMNLREKETKRLHKLN TGWASLMLDKKGDIFLLGSRNMQKMDAKSDALKSISYQAEMKMDLAAEREAMFDHVYKQH 
QKRFYNVNMHGVNWDAMTNAYRKFLPHIDNNYDFAELLSEWLGELNVSHTGGRYSPKGKG DVTSNLGLLFDWDYQGKGMQIAEIIEKGPFDHSRTKVKAGCIIEKINGEEITPDHDITCL LNNKAGKKTLISIYNPQNKERWEEVVMPVTSGQLNGLLYKRWVKQRAADVEKWSKGRLGY VHIQSMGDDSFRTVYSDILGKYNNCDGIVIDTRFNGGGRLHEDIEILFSGQKYFTQVVRG REACDMPSRRWNKPSIMLQCEANYSNAHGTPWVYKHQNIGKLVGMPVPGTMTSVSWETLQ DPSLVFGIPIVGYRLPDGSYLENTQLEPDVKVSNDPEMVVKGEDTQLKVAVDELLKEIDK QKK Prediction of potential genes in microbial genomes Time: Fri May 13 10:21:09 2011 Seq name: gi|225935344|gb|ACGA01000048.1| Bacteroides sp. D2 cont1.48, whole genome shotgun sequence Length of sequence - 50578 bp Number of predicted genes - 50, with homology - 49 Number of transcription units - 26, operones - 12 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 791 - 850 4.8 2 2 Tu 1 . + CDS 870 - 1775 497 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 1776 - 1824 9.3 - Term 1760 - 1815 14.1 3 3 Tu 1 . - CDS 1843 - 3600 1808 ## COG1154 Deoxyxylulose-5-phosphate synthase - Prom 3692 - 3751 6.4 + Prom 3825 - 3884 6.6 4 4 Op 1 . + CDS 3951 - 4298 256 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 5 4 Op 2 . + CDS 4342 - 4551 75 ## + Term 4724 - 4783 5.9 6 5 Op 1 . - CDS 4511 - 4855 157 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase 7 5 Op 2 . - CDS 4859 - 5710 738 ## COG2329 Uncharacterized enzyme involved in biosynthesis of extracellular polysaccharides - Prom 5800 - 5859 3.2 + Prom 6346 - 6405 2.7 8 6 Tu 1 . + CDS 6461 - 6841 213 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 7036 - 7071 0.2 9 7 Tu 1 . - CDS 6901 - 7824 360 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 7866 - 7925 6.9 + Prom 7825 - 7884 6.1 10 8 Op 1 . + CDS 8047 - 9234 885 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 11 8 Op 2 . + CDS 9247 - 9540 238 ## BT_1114 hypothetical protein 12 8 Op 3 . 
+ CDS 9559 - 10278 487 ## BT_1114 hypothetical protein + Term 10330 - 10369 -0.7 - Term 10304 - 10368 12.4 13 9 Op 1 . - CDS 10369 - 10854 157 ## BVU_3106 radical enzyme activating protein 14 9 Op 2 . - CDS 10838 - 12931 1825 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase + Prom 12879 - 12938 3.5 15 10 Tu 1 . + CDS 12983 - 13291 87 ## gi|260173776|ref|ZP_05760188.1| hypothetical protein BacD2_18038 16 11 Tu 1 . + CDS 13713 - 14849 997 ## BT_1391 hypothetical protein + Term 14873 - 14920 9.2 + Prom 14917 - 14976 3.7 17 12 Op 1 . + CDS 15052 - 16125 1259 ## COG3831 Uncharacterized conserved protein 18 12 Op 2 . + CDS 16129 - 17139 556 ## BF1878 putative periplasmic protein 19 12 Op 3 . + CDS 17126 - 17980 627 ## BF1941 hypothetical protein 20 12 Op 4 . + CDS 17977 - 19254 708 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 21 12 Op 5 . + CDS 19266 - 20831 599 ## BF1943 hypothetical protein + Term 20972 - 21030 15.1 + Prom 20909 - 20968 10.7 22 13 Op 1 . + CDS 21068 - 22432 834 ## COG0534 Na+-driven multidrug efflux pump 23 13 Op 2 . + CDS 22473 - 23021 375 ## BT_1386 hypothetical protein + Prom 23038 - 23097 1.6 24 14 Op 1 1/0.100 + CDS 23146 - 24024 219 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 24028 - 24087 4.3 25 14 Op 2 . + CDS 24121 - 24540 400 ## COG3871 Uncharacterized stress protein (general stress protein 26) + Term 24574 - 24619 6.2 - Term 24562 - 24605 9.6 26 15 Op 1 1/0.100 - CDS 24623 - 25243 434 ## COG1309 Transcriptional regulator 27 15 Op 2 . - CDS 25247 - 29458 2375 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) - Prom 29534 - 29593 6.2 - Term 29854 - 29905 11.3 28 16 Tu 1 . - CDS 30098 - 30997 559 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 31035 - 31094 4.1 + Prom 30956 - 31015 6.9 29 17 Op 1 . + CDS 31234 - 31800 477 ## COG0716 Flavodoxins 30 17 Op 2 . 
+ CDS 31826 - 32824 900 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 32845 - 32883 1.1 31 18 Tu 1 . - CDS 32881 - 33414 266 ## COG1896 Predicted hydrolases of HD superfamily - Prom 33484 - 33543 9.4 + Prom 33748 - 33807 10.9 32 19 Op 1 . + CDS 33856 - 34419 421 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 33 19 Op 2 2/0.100 + CDS 34494 - 34931 266 ## COG1359 Uncharacterized conserved protein 34 19 Op 3 . + CDS 34943 - 35770 722 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 35 19 Op 4 . + CDS 35810 - 35959 156 ## gi|260173796|ref|ZP_05760208.1| hypothetical protein BacD2_18138 36 19 Op 5 . + CDS 35998 - 36561 359 ## COG0716 Flavodoxins + Prom 36579 - 36638 2.4 37 20 Op 1 2/0.100 + CDS 36667 - 37245 460 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 38 20 Op 2 . + CDS 37310 - 37732 326 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 37828 - 37880 -0.0 - TRNA 37948 - 38024 86.1 # Asp GTC 0 0 - TRNA 38080 - 38153 85.5 # Asp GTC 0 0 + Prom 38495 - 38554 5.9 39 21 Op 1 . + CDS 38599 - 39453 751 ## COG0788 Formyltetrahydrofolate hydrolase 40 21 Op 2 25/0.000 + CDS 39453 - 40043 506 ## COG0118 Glutamine amidotransferase 41 21 Op 3 23/0.000 + CDS 40062 - 40781 634 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase + Prom 40806 - 40865 6.0 42 21 Op 4 24/0.000 + CDS 40885 - 41637 741 ## COG0107 Imidazoleglycerol-phosphate synthase 43 21 Op 5 . + CDS 41702 - 42313 656 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 44 21 Op 6 . + CDS 42355 - 43083 354 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 45 21 Op 7 1/0.100 + CDS 43097 - 44416 1196 ## COG0527 Aspartokinases + Term 44432 - 44476 6.0 46 22 Tu 1 . 
+ CDS 44495 - 45655 1248 ## COG0019 Diaminopimelate decarboxylase + Prom 45669 - 45728 3.8 47 23 Tu 1 . + CDS 45797 - 46276 676 ## COG1528 Ferritin-like protein + Term 46315 - 46359 11.1 + Prom 46333 - 46392 4.5 48 24 Tu 1 . + CDS 46432 - 48696 1628 ## COG0642 Signal transduction histidine kinase 49 25 Tu 1 . - CDS 48728 - 49135 492 ## BT_1372 hypothetical protein - Prom 49166 - 49225 6.6 50 26 Tu 1 . - CDS 49338 - 50531 1406 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes Predicted protein(s) >gi|225935344|gb|ACGA01000048.1| GENE 1 44 - 715 297 223 aa, chain - ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 207 1 203 227 156 44.0 2e-38 MEMKDVFKRYLVFVIGLYFLAAGIVLIIRSALGTTPISSINYVLSLNSPLSLGTCTFIIN MVLILGQFWLIRKNRTRQDIIEILLQMPFSFIFSAFIDFNMMLTSELHPANYGMSIALLL TGCMVQSIGVVLELKPRVAMMSAEAFVKYASRHYNKEFGKFKVYFDITLVTLAVILSLLL TQGIQGVREGSLIAACITGYIVSFLNQKIMTRKTLHKLLPVWK >gi|225935344|gb|ACGA01000048.1| GENE 2 870 - 1775 497 301 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 108 293 95 287 288 86 31.0 5e-17 MVKSRIFDNTQLLKEHPELPHYKEEVVCFRSEYKDRSYPANECYFNRELYMIFVLEGRSE ILLNGEFIAIEPNMLLIHGANYLTEHLYSSRDIQFITLALSESMRTDDSYLTQITAILLA TMRQNKQYTIQLTEYEAQIIRKELEVLMRLLNIEHHFLFRRIQAACNALFLDIADFLSRK TIIKKEVSRKDHVLQEFHALVTRNFREEHFVSFYADKLAISEQYLARIVRAGTGKTINSI INELLVMEARTLLSSTKSTVGEIASKLGFSDAAAFCKFFKRNAGQTPLNYRKGLLLHIEM K >gi|225935344|gb|ACGA01000048.1| GENE 3 1843 - 3600 1808 585 aa, chain - ## HITS:1 COG:CAP0106 KEGG:ns NR:ns ## COG: CAP0106 COG1154 # Protein_GI_number: 15004809 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium 
acetobutylicum # 1 584 1 585 586 832 69.0 0 MYLENIYSPADVKKLSEKELNELSGEIRAALLQKLSEHGGHFGPNFGMVEATIALHYVFN SPKDKIVFDVSHQSYVHKMLTGRKDAFLHPEKYDNVSGYTEPQESEHDFFIIGHTSTSVS LASGLAKGRDLTGGNENIIAVIGDGSLSGGEAFEGLDYVAELGTNMIIIVNDNQMSIAEN HGGLYKNLKELRDSNGQCECNFFKAMGLDYMYVNDGNNVQALIEAFSKVKDIQHPIVVHI NTLKGKGYARAEQDKETYHWRTPFNVETGEAKVSYEEEDYSEVTAQYLLKKMKEDPRVVT ITSGTPTVLGFTSDRRQEAGKQFVDVGIAEEHAVALASGIAANGGKPVYGVYSTFIQRSY DQLSQDLCINNNPAVLLVFWGTLSGMNDVTHLCFFDIPLISNIPNMVYLAPTCKEEYIAM LEWSIHQNEHPVAIRVPATDVISSGEPVDSDYSILNRYKVTHRGSKVAIVALGSFYGLGQ SVASLLKEKANVDATLINPRYITDVDNELMDELKADHELVITLEDGVLDGGFGEKIARYY GATDMKVLNYGAKKEFVDRFDLQEFLRANHLTDEQIVEDITALIG >gi|225935344|gb|ACGA01000048.1| GENE 4 3951 - 4298 256 115 aa, chain + ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 1 112 178 285 286 100 44.0 5e-22 MKDYTDWTGIKIKAWSLFVEGQHDFFRNEVLTALAGKYNRTVAQAELRWLTQRGIIVIPK SVHRNRIEENFNSLDFRLLDEDMELLATCDIGKPVADDFDNPDFVYDLCTRKYDF >gi|225935344|gb|ACGA01000048.1| GENE 5 4342 - 4551 75 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCVNRKIWLPIISLKLFLIYSFVVLLAFVWIACSEHPNNAEYRSGFEDELRFFSQVFTHL WLQQSHELS >gi|225935344|gb|ACGA01000048.1| GENE 6 4511 - 4855 157 114 aa, chain - ## HITS:1 COG:MYPU_0950_1 KEGG:ns NR:ns ## COG: MYPU_0950_1 COG0122 # Protein_GI_number: 15828566 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Mycoplasma pulmonis # 4 92 2 82 217 86 46.0 1e-17 MEQFFLYGEREIAYLKSKDKRLREVIDKVGIVKRRVIPDLFTALVHSIVGQQISTKAHET IWRKMSDALGEVTPEKVLNLPPEALQAMIKYDSGDAPRIQHFMKVHDFAATIDV >gi|225935344|gb|ACGA01000048.1| GENE 7 4859 - 5710 738 283 aa, chain - ## HITS:1 COG:AGc2463 KEGG:ns NR:ns ## COG: AGc2463 COG2329 # Protein_GI_number: 15888662 # Func_class: R General function prediction only # Function: Uncharacterized enzyme 
involved in biosynthesis of extracellular polysaccharides # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 169 278 2 111 121 100 41.0 3e-21 MLLAMLIAETNNLFSQSMSKRLEETPQSVIANVHKAARLLKEKGSEALAVLTDPKSEFND RDAYLFIIDVDKSLVVSNPRFPERTGGNIREHLDWSGKHYGVELCEIAMCGGGWIEFVWP KPGTEEGTRKVSYIYPIPGMRYTICAGIYNNTMTLDELNTLTGHGKKKVAVIFEVMPTAK GKANYFKMGAALKEELLRMPGFISVERFASVNNEGKFLSLSFWESEEAAAGWRNQVNHRQ NQKMGHDKLFDNYRISVGKIVREYTDQNRSGAPDDSNEYLGIK >gi|225935344|gb|ACGA01000048.1| GENE 8 6461 - 6841 213 126 aa, chain + ## HITS:1 COG:RSc0206 KEGG:ns NR:ns ## COG: RSc0206 COG1073 # Protein_GI_number: 17544925 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Ralstonia solanacearum # 8 108 204 336 342 135 53.0 2e-32 MAEAAEQRYTDFLGGETKYTSDAVHQLTITSSPIEREFYEFYPFEDIKTISPRSMFFITG ENAHSREFSEDAYQLAAEPKELYIVPGAEHVDLYDRVSLIPFNKLEFFIYSYQYCMSATA VDRRIG >gi|225935344|gb|ACGA01000048.1| GENE 9 6901 - 7824 360 307 aa, chain - ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 191 295 194 291 307 68 33.0 2e-11 MSGIMKIDTVQQYNDYFGVETYHPLVSVIEGKKAKPLRFCRKLYNVYAILLKDTNCGNLK YGQSIYDYQQGAMLFLAPGQSMGSEDDGLLHQPEGWALVFHPELLRGTHLANIMKEYTYF TYNANEALHLSEQERRTVIECMEKIRTELQFPIDKHSKSLIIDNVKLLLDYCIRFYDRQF ITRENADKDILTRFENLLDDYFSSNVPAEQGMPTVQYCADKLCLSANYFSDLCRKETGVS ALKHIQQKVLDVAKEQVFDTTKSISEISYELGFPYPQHFSRWFKKMTGCTPNGYRQNPIQ LNISPTK >gi|225935344|gb|ACGA01000048.1| GENE 10 8047 - 9234 885 395 aa, chain + ## HITS:1 COG:TM1006 KEGG:ns NR:ns ## COG: TM1006 COG0667 # Protein_GI_number: 15643766 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 77 385 15 323 333 315 49.0 1e-85 MKKEQDSISNMEMSRRGFLKRTVLAGAAICIAPTFEKVTAAEKAINGKSVVPAKIPAILA 
TVRETRTLGDGNAAFTVSAMGFGCMGLNHHRSQSPDEKACIRLVREAIERGVTLFDTAES YGYHKNEILIGKALNGYTSRVFVSSKFGHKFVNGVQVKTEEDSSPANIRRVCENSLRNLG VETLGLFYQHRSDPNTPVEVVAETIKELIKEGKILHWGMCEVNADTIRRAHIVCPLTAIQ SEYHFMHRTVEESVLPVCEELGIGFVPYSPLNRGFLGGMINEYTMFDPTNDNRQTLPRFQ PNAIRQNMRIVEILNAFGRTRGITPAQVALAWLMNKKPYIVPIPGTTKLSHLEENLRASG ILFTPEEMKELENAIAAIPVVGSRYDALQESKVQK >gi|225935344|gb|ACGA01000048.1| GENE 11 9247 - 9540 238 97 aa, chain + ## HITS:1 COG:no KEGG:BT_1114 NR:ns ## KEGG: BT_1114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 97 5 96 343 162 80.0 3e-39 MRWTIILCMAMMALLWGCQARDENILLIEEQGSFAVGGSVMTDSLGRNYHGDHAYVFYQK PVNARKYPLVFAHGIGQFSKTWETTPDGREGFQNIFL >gi|225935344|gb|ACGA01000048.1| GENE 12 9559 - 10278 487 239 aa, chain + ## HITS:1 COG:no KEGG:BT_1114 NR:ns ## KEGG: BT_1114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 104 342 343 412 79.0 1e-114 MVDQPRRGNAGRSTETVTLSPVFDEEEWFNRFRVGIYPDYFEGVQFSHDKEALNQYFRQM TPTIGSIDLDVFSDAYAALFDQIGTAVLVTHSQGGGVGWLTLPKTKNIKAIVAYEPGTNV PFPKGEMPEEGKVMTLSGKTEGVEVPMSVFMKFTKIPIIVYFGDNLPETNERPELYEWTR RLHLMRKWAKMLNDFGGDVTVIHLPEVGLYGNTHFPFSDLNNIEVANHLSKWLHEKGLD >gi|225935344|gb|ACGA01000048.1| GENE 13 10369 - 10854 157 161 aa, chain - ## HITS:1 COG:no KEGG:BVU_3106 NR:ns ## KEGG: BVU_3106 # Name: not_defined # Def: radical enzyme activating protein # Organism: B.vulgatus # Pathway: not_defined # 1 152 1 152 155 243 75.0 2e-63 MLRYADYDIVFQEIPDEVTLAINLSNCPNHCKGCHSAYLMEDVGEPLTEESLSTLLGKYG KAITCVCFMGGDASPAEVEQLAAFLHKQTITPVKVGWYSGKSKLPEHFDVSHFQYIKLGP YIESLGGLKSETTNQRLYHIENGIMEDITYRFLPHHSKKIS >gi|225935344|gb|ACGA01000048.1| GENE 14 10838 - 12931 1825 697 aa, chain - ## HITS:1 COG:TM0385 KEGG:ns NR:ns ## COG: TM0385 COG1328 # Protein_GI_number: 15643151 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Thermotoga maritima # 80 686 23 635 651 244 28.0 6e-64 
MISSEIFIIKRDGKKEAFSLDKIKNAISKAFLSVGSFATQDVITNVLSRVSISDGTNVEE IQNQVEVALMAEHYYSVAKAYMLYRQKHLEDREVRDKLKFLMDYCDASNPASGSKYDANA NVENKNIATLIGELPKSNFIRLNRRLLTDRLRDMYGKEASDRYLELLNHHFIYKNDETNL ANYCASITMYPWLIAGTTAVGGNSTAPTNLKSFCGGFINMVFIVSSMLSGACATPEFLMY MNYFIGLEYGQDYYKHPDKLADLSLKQRSIDKIITDCFEQIVYSINQPTGARNFQAVFWN VAYYDKYYFNSLFEHFVFPDGSKPDWDSLSWLQKRFMKWFNKERTRTVLTFPVETMALLT KDGDVLDKEYGDFTAEMYAEGHSFFTYMSDNADSLSSCCRLRNEIQDNGFSYTLGAGGVS TGSKSVLTINLNRCIQHAVKSGILYPFFLEEVVDLVHKVQLAYNENLKLLQAKGMLPLFD AGYINMSRQYLTIGVNGLVEAAQFMGIDINDNPKYEAFVQEILGMIEKYNKKYRTKEVLF NCEMIPAENVGVKHAKWDKEAGFQTYRECYNSYFYIVEDKSLNIVDKFRLHGHRYIEHLT GGSALHMNLEEHLSKEQYRQLLRVAATEGCNYFTFNIPNTVCNKCHHIDKRYLHECPECH SENVDYLTRVIGYMKRISNFSQARQKEADLRYYASVR >gi|225935344|gb|ACGA01000048.1| GENE 15 12983 - 13291 87 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173776|ref|ZP_05760188.1| ## NR: gi|260173776|ref|ZP_05760188.1| hypothetical protein BacD2_18038 [Bacteroides sp. D2] # 1 102 1 102 102 203 100.0 3e-51 MHDNFWRLVIKWTYTIRSIAFHPKAWNNGNFIGRSSDLFLLWRLPDPFHESVANECHNIS LFYERDKTYSYGDSSGFSLDSLLIPSGDNAISETDVGGKVRN >gi|225935344|gb|ACGA01000048.1| GENE 16 13713 - 14849 997 378 aa, chain + ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 378 1 375 375 619 85.0 1e-176 MNMRKSIILLAFILGGFTVANAQSVVEGTKLTDNWSVGVNAGGVTPLTHSAFFKGMRPTF GVGVSKQLTPIFGLGFQGMGYINTTSSKTAFDASDVSVLGKVNLMNLFASYTGEPRLFEV EAVAGMGWLHYYVNGDGDQNSWSTRLGLNFNFNLGESKAWTLGIKPAIVYDMQGTYPETK SRFNANNAGFELTAGLTYHFKTSNGTHHFAKVRVYNQAEIDGLNSSINALRADVNNKDGE ISNANQRINGLQEELEACRTKVVPVETVVKTARVPESIITFRQGKSSVDASQLPNVERVA SYLKKYADSKVVIKGYASPEGSVEVNARIAAARAEAVKTILVNKYKISASRITAEGQGVG DMFTEPDWNRVSICTIED >gi|225935344|gb|ACGA01000048.1| GENE 17 15052 - 16125 1259 357 aa, chain + ## HITS:1 COG:ZmolR.A_1 KEGG:ns NR:ns ## COG: ZmolR.A_1 COG3831 # Protein_GI_number: 15802594 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia 
coli O157:H7 EDL933 # 3 67 2 66 94 80 56.0 5e-15 MKRVFVFQDFKSQKFWSIDVVGTDVTVNYGKLGTDGQTQVKNYATTEEAEKAAGKLIAEK TKKGYVETTEETAREMKVEAKKYTLSYDEYENNVNLLDKILKDKHLSEYKQITIGCWDYE GGDCSALLQGMIENKEKFAQIEGLFWGDIEQEEQEISWIEQADISPLLDAMPKLKDLKIK GTNNLRLGKTSRPELRSLEIISGGLPTEVVEDILGSDFPNLEKLILYVGVEDYGFEADIE IFRPLFSKERFPKLTYLGIVNSEEQDKIVEMFLESDILPQLETMDVSAGTLKDEGAQLLL DNMDKIAHLKFINMRYNYLSKDMKKQLQNLPMKIDIAETEEVDEYDGELWYYPMITE >gi|225935344|gb|ACGA01000048.1| GENE 18 16129 - 17139 556 336 aa, chain + ## HITS:1 COG:no KEGG:BF1878 NR:ns ## KEGG: BF1878 # Name: not_defined # Def: putative periplasmic protein # Organism: B.fragilis # Pathway: not_defined # 1 336 1 336 336 371 55.0 1e-101 MQIIVVSNSLSKRIEYFIEAGKHLQTEVRFMTYGELFNCLPQLRQAVIKLEPCVSDETDF QKYALLNQAYKETLQRLGEMRLSDDVSFLNTPHALLRALDKKETKQVLMDRGLKVTPMLP SPHSFDELRELLAGCGRGCFLKPRYGSGAGGIMAVRYQPNRNKWVVYTTLQQVDGVIHNT KRIHRLSTEKEMIPLAEAVMQTEAILEEWIPKEQLQGENYDLRVVCRESEIDYIVVRCSK GSITNLHLNNKAHWWNELSLPEVVRQQVYFQCQEAVQSLDLQYAGVDVLMERGTDIPYII EVNGQGDHVYQDMFAHNSIYIQQIKNIKKRYNHANR >gi|225935344|gb|ACGA01000048.1| GENE 19 17126 - 17980 627 284 aa, chain + ## HITS:1 COG:no KEGG:BF1941 NR:ns ## KEGG: BF1941 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 283 1 283 283 507 82.0 1e-142 MQIDELPAGEQIQNPNLDMNQVVGTHDILMLCFDTLRYDVSKEEEAAGRTPVLNSHGGEW EKRHAPGNFTYPSHFAIFAGFLPSPAEPHSLRSRKWLFFPVQAGTGRIPPAGSYPFTEAT FVQSLANVGYETICIGGVNFFSKRNELGRVFPGYFTKSYWLPTFGCTAPDSTEKQIDFAL KKLENYPEDKRIFMYINFSAIHYPNCHYVEGKMKDDKESHAAALQYVDSQLPRLFQAFQK RGNTLVIALSDHGTCYGEDGYEYHCISHETVYTVPYKHFILTKK >gi|225935344|gb|ACGA01000048.1| GENE 20 17977 - 19254 708 425 aa, chain + ## HITS:1 COG:STM4012 KEGG:ns NR:ns ## COG: STM4012 COG0635 # Protein_GI_number: 16767277 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Salmonella typhimurium LT2 # 8 416 7 405 413 280 40.0 5e-75 MNQPLPRYVDYMYSYPHKTAYRSFPSPISLVPYLKQVEGQKASLYFHIPFCSHKCGYCNL 
FSLQTNRADYIATYLETLHKQAQQLSPLTIGLTFDSFAIGGGTPLLLTVPQLEYLLDTAA LFGVHPSHTFTSVETSPEYADPARLDLLKQAGVARVSIGVQSFLDEELTALKRRPRQDMI NQALEAIRKRQFPFFNIDLIYGIKGQTVASFLYSLEQALLFQPNELFIYPLYVRPGTAIT ERESDDVCFQMYCAACDLLKDRGFLQTSMRRFIHHPSTDAEISCGDEVMLSCGSGGRSYL GNLHYATRYTVCQRCIAGEIDDYMGTTDFTVARNGFILSQEEQRQRFIIKNLMYYMGLDK AEYKRRFGESPDNVPLFRQLAERQWIEDTDNGRVCLTSEGMAYSDYIGQLFITPEIRELM ETYSY >gi|225935344|gb|ACGA01000048.1| GENE 21 19266 - 20831 599 521 aa, chain + ## HITS:1 COG:no KEGG:BF1943 NR:ns ## KEGG: BF1943 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 9 505 3 498 500 578 53.0 1e-163 MENEGSLLSFRRIYYRGKLNSCNYTCSYCPFGKKSHLADTTQDEQAWNRFIAAVEQWKGE PLQLFIIPYGEALIHRYYRKGMMHLTALPQVAGISCQTNLSFPAKHWLDEIRVTPTMISK IRLWASFHPEMTSVEKFAHQIHILHHAGIQVCAGAVGNPSAKAVLNDLRNTLLPDIYLFI NAMQGLRAPLSQEDIQFFSQLDNLFEYDLKNAPAQWEVCAGGRNNCFIDWKGDMYACPRS RVKIGNFYQGDGAVVPLSCERKVCDCYIAFSNLNNHPLHRIMGEGAFWRIPDKPLITTVF FDVDGTLTDSQGKVPESYANALRYMAQSVSLYLATSLSMEQAKKKLGKALFDLFRGGVFA DGGLLSYSGQIRCLPVKVYPEVYGESAKVTIHSYEGVVYKYSILVRDKEQREAILTRLKE NPCQIFHKAPLITVIHPEASKKEGVIQLCKALGLASEHTLVVGNSLKDWPMMSVVSHSCA VMNAEPLLKERARYTLNPDRLAAFFRFSNGDPIDQTSSDNS >gi|225935344|gb|ACGA01000048.1| GENE 22 21068 - 22432 834 454 aa, chain + ## HITS:1 COG:BH0886 KEGG:ns NR:ns ## COG: BH0886 COG0534 # Protein_GI_number: 15613449 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 2 427 4 429 454 160 27.0 6e-39 MKELDLTQGSVPKVLLQFAVPFLIANVLQALYGGADLFVVGQYDDSASVAAVAIGSQVMQ TITGIILGITTGTTVLIAIATGAKDNRKVAFTIGSSVWLFSITGVVLTLVMVLFHGRIAE LMHTPVEAMADTKSYILVCSLGILFIVGYNVVCGILRGLGDSKTPLYFVGLACVINIVLD FILVGYFHWGATGAAIATVTAQGVSFGIALWFLYRHGFHFDFSRKDIRLNRNLSKKILVL GAPIALQDALINVSFLIITVIVNQMGVIASASLGVVEKIIVFAMLPPMAISSAVATMTAQ NYGAGLIKRMNKCLASGIGIALVFGVSVCVYSQFLPETLTAFFTKDAAVVAMAADYLRGY SIDCIVVSFVFCINSYFSGQGNSLFPMIHSLIATFLFRIPLSYWFSQIDSSSLFIMGFAP PISTVVSLLICIWYLRYTQRKLYLRGTMMPAMSN >gi|225935344|gb|ACGA01000048.1| GENE 23 22473 - 23021 375 182 
aa, chain + ## HITS:1 COG:no KEGG:BT_1386 NR:ns ## KEGG: BT_1386 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 275 80.0 5e-73 MEAYLKKLCFEYQVDEREVKELLARMEMVYLDKGETIASATMPEQSLYIIVSGILHTYTT HEGEERTNRFFSAGDAVLCYNSSQYSIKTLTKCAAYYISEEEIEELCASSISFANLVRQL MEYQFYFKEEEDMSARKLTVRERYLSLLAEIPDILYRVPLKHINHYLGVDVTSLGYLAGS SK >gi|225935344|gb|ACGA01000048.1| GENE 24 23146 - 24024 219 292 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 133 289 133 287 288 70 30.0 4e-12 MQNSIPRYTFYKHKYGSELLVDVVELKYVKKFLAKSTVHTLNYYDITFITEGKGAFTVDN QTYEAIPCDVLFSKPGEIRNWDTHHIINGYALIFEEEFLSSLFKDSLFVRHLSFFQIESF SSRLHLSDELYTRILETLHDIKMEIDSYQQGDVHVLRALLYEVLMLLDRAYLKMTSMEEG RSREVSNNHVSKFMNLVATHAKEQHSVQYYADKLCITPNYLNEMITSTMGFSAKQYIQGK VMEEAKRLLVYTDFPIADIAFELCFSTVSYFIRSFRQHTGETPLLYRKAHKP >gi|225935344|gb|ACGA01000048.1| GENE 25 24121 - 24540 400 139 aa, chain + ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 11 138 13 139 145 113 42.0 1e-25 MRDAEKTVGNMIDKLKTAFIGSIDGEGFPTIKAMLQPRKRKGIKTIYLTTNTSSMRVAQY RENNHACIYFCDTRFFRGVMLRGMMEVLTDSASKEMIWQEGDTMYYPEGVTDPDYCVLKF TAVSGRFYSNFKSEDFVVG >gi|225935344|gb|ACGA01000048.1| GENE 26 24623 - 25243 434 206 aa, chain - ## HITS:1 COG:PA1504 KEGG:ns NR:ns ## COG: PA1504 COG1309 # Protein_GI_number: 15596701 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 13 155 12 153 216 59 29.0 4e-09 MATTEHSDLEQQIIKTAQQLFIEKGFVETSMSDIAATVGINRPTLHYYFRTKDKMFKAVF GSIVMSLMPKIQDIVKQDIPFMERLSMILDEYIELFTSNPCLPRFICGEIQRDVNHLLAT AKELQFGDALSQVKDSILTAMEDGKLKKVPIHIVFLTVYSLLSYPFLAKNLIVSLFLEDE 
AAFADFLQEWKQNIINQLQNLLCYEE >gi|225935344|gb|ACGA01000048.1| GENE 27 25247 - 29458 2375 1403 aa, chain - ## HITS:1 COG:L104115_1 KEGG:ns NR:ns ## COG: L104115_1 COG1924 # Protein_GI_number: 15674220 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Lactococcus lactis # 4 637 5 634 634 552 45.0 1e-156 MNAYKVGLDIGSTTAKMVVLDKDGTTVLFSGYKRHQAKIQECLLGFLYQIKEQLGDIPLS FHITGSVGMGISEKCSLPFVQEVVAATNYVHREHPHTATMIDIGGEDAKVVFFQDGQATD LRMNGNCAGGTGAFIDQMAILLGVSIDELNELALRSNQVYSIASRCGVFCKTDIQNLIAK NISKENIAASIFHAVAVQTTVTLARGCNIVAPVLLCGGPLTFIPALRKAFANYLNLSQDT DFILPEKGNLIPAWGAALAENNEESIHLSNLIQVIENKLSVTSPFQSDLSPIFDSEEEYR SWKKEKDQYKIQYASLRPGLQEATIGIDSGSTTTKIVVLDNNHRILYSYYHDNNGNPIKT VENGLQKLYEECRRRGTTLRIKGGCSTGYGEDLIKAAFHMDAGIIETIAHYAAAKHISKE VSFILDIGGQDMKAIFVNDGVINRIEINEACSSGCGSFISTFAQSLDYSVEDFAKAACFS QAPCDLGTRCTVFMNSKVKQVLREGASVADIAAGLSYSVVKNCLYKVLQLKDTEILGDHI VVQGGTMRNDSIVRSLEKLTGKQTYRSNCPELMGALGCALYAKQIENARVTKLNEMLQWA QYTSKQLQCRGCENQCAIMRYTFNSENHYFSGNRCEKVFSNKGSHADKGINTYDKKLELL FDRSADIPQPLFTIGIPRILNMYEEYPFWHTLFTACGIQVQLSEPSTFSKYETAAGMVMS DNICFPAKLVHSHIRNLTQQNVNRIFMPFVVFEKKDKQQQNSYNCPIVSGYSEVIKSVQE ENIPIDAPTITFKDETLLYKQCNEYLKSLGIRDEVCKNAFSRALQEQYAFEEKIAAYNQE VLNEGREKHKLIILLAGRPYHSDPLIQHKVSDMIAAMGVYVITDDIVRQQEISLEKTHYL SQWAFTNHILKATKWAAMQEGDIQYMQMTSFGCGPDAFLIDEVRNLLKRYGKNLTLLKID DVNNIGSIKLRVRSLVESLNFSLKHSQAKDPEPFVSTAPFTKKDKKKKILAPFFTPFISP LIPSIMKVAGYEMETLPLSDTASCDWGLKYSNNEVCYPATLIVGDIVKAFKSGRYDPANT CVAITQTGGQCRASNYISLIKKALIENGYTNTPVVSLAFGSGIENEQSGFKVNWLKVLPI ILASVLYSDCIAKFYYAAVVREKERGQAARLRDLYLDTAQPIIQKNKPEDLLSYLYLAAR DFNKICEQRSCHKVGIVGEIFLKFNPFAQKDVTSWLINQKIEVIPPLISDFFMQGFVNLK VRQNQHLQRKLTPDFVIDWLYKKVQKQINKVNEIGKNFNYFTPFESIFEKAEKAKQVISL GTQFGEGWLISGEILSFASQGVNHVISLQPFGCIANHIVEKGIEKRIKSLCPQMNILSLD FDSSVSDVNITNRLLLFIDNINN >gi|225935344|gb|ACGA01000048.1| GENE 28 30098 - 30997 559 299 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # 
Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 209 299 268 363 368 65 37.0 1e-10 MENIIKLETIQEYNRLLGAETLHPLVSVTDFSALQSLKHYRKNFGFYCVFYKELGCGVLQ YGRNKYDYEDSTLVFISPGQIAGVNDGGETINPKGLVLMFHPDLLYGTPLARRMKDYTFF SYESNEALHMSDRERRIILNCFHEIREELENVIDKHTKQIVTSNIETLLNHCVRFYERQF VTREVVNHDLLTEFELILQQYFNSDKPSEIGLPSVQYCAEQMHLSPNYFGDLVKKETGKS AQEYIQLTIMERVKELLVESNKTISEIAYELGFKYPHHLSRVFKKVIGTTPNEYRAQAG >gi|225935344|gb|ACGA01000048.1| GENE 29 31234 - 31800 477 188 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 27 187 1 178 179 134 41.0 9e-32 MRKLVLIMLMCLVALSSYSCTQSDEPVTQVSGKKILVAYFSWGGTTKRLAEEIASLTGGD LFSIETVTPYPTDYTPCTEVAKEELEKGIRPPLKDVVKNMDDYDIVFVGCPVWWHTAPMA IWSFLESKEYNLEGKIIVPFCTYAATYRDETLAKIVELTPDSEHLKGFGSTGSTSGAKGW LREIGIIK >gi|225935344|gb|ACGA01000048.1| GENE 30 31826 - 32824 900 332 aa, chain + ## HITS:1 COG:YPO2806 KEGG:ns NR:ns ## COG: YPO2806 COG0667 # Protein_GI_number: 16123004 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Yersinia pestis # 1 332 1 329 329 410 60.0 1e-114 MKYRELGNSGLQVSAIGLGCMGMSHGYGPASDRKEMISLIRQAYEQGVTLFDTAEIYGTV DNPHDNEELVGEALAPIRNKVVIATKFGIYLSEQGKQYQSSRPEQIRKSIEGSLMRLRTD RIDLYYQHRVDTEVPIEEVAGTIGELIQEGKILHWGLSEAGVQTIRRAHLVQPLTAVQSE YSMFWRNPETDLLPVLEELGIGLVPFSPLGKGFLTGAINENTKFGQGDFRTIVPRFTPEN IAANLQLVDFIKEVAVSKGVTPAQLALSWLKEQKPWIVSIPGSRSLKHLTENIAAADVEY TQEEMEHINDGLSRIILSGERYPAELQSRVGR >gi|225935344|gb|ACGA01000048.1| GENE 31 32881 - 33414 266 177 aa, chain - ## HITS:1 COG:yfdR KEGG:ns NR:ns ## COG: yfdR COG1896 # Protein_GI_number: 16130293 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: 
Escherichia coli K12 # 1 121 10 125 187 77 36.0 8e-15 MSYITTFTGKHFDPIHPVPEKIDMKDIAHALSLICRANGHTRFFYSVAQHSITCCKEAKT RRLSNRIQLGCLLHDSSEAYMSDVTRPIKAKLTEYLKFEDHLQNMIWNHFISEPLSDTEK KAIFEIDDEMLSYEFIHLLPESISDDYKFILSRPTLEFINPTIVEDEFINLASGIHL >gi|225935344|gb|ACGA01000048.1| GENE 32 33856 - 34419 421 187 aa, chain + ## HITS:1 COG:SP0798 KEGG:ns NR:ns ## COG: SP0798 COG0745 # Protein_GI_number: 15900691 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 164 2 164 224 119 42.0 3e-27 MKILIIEDERSLSDSIVAYLSSEKYLCEQAFTYEDAKMKVNMYEYNCILLDLMLPGGNGL DILRDIRKQRNPVGVIIVSAKDSLDDKVKGLEIGADDYLAKPFHLPELSMRIYAIIRRKE FSANNILESNGIRIDLLNKLAMVNDTQVELTKSEYDLLLFFIGNIIHFIRIMYGSLLQNR GFGYKNP >gi|225935344|gb|ACGA01000048.1| GENE 33 34494 - 34931 266 145 aa, chain + ## HITS:1 COG:SMa0558 KEGG:ns NR:ns ## COG: SMa0558 COG1359 # Protein_GI_number: 16262744 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 16 125 38 154 170 68 35.0 5e-12 MIKTILSVLSLFVMLSCSMTGKKDIKQQPGMCAKLPMAADGIVRLSKIEVYPEYLEEYMK YATEVGEVSLRTEPGVLTMYAVSEKENPGRITILETYASQEAYKSHIASEHFQKYKQGTL HMVKTLVLSDQTPLNPANCINNFIQ >gi|225935344|gb|ACGA01000048.1| GENE 34 34943 - 35770 722 275 aa, chain + ## HITS:1 COG:Cgl1022 KEGG:ns NR:ns ## COG: Cgl1022 COG0599 # Protein_GI_number: 19552272 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Corynebacterium glutamicum # 26 127 6 107 107 139 62.0 6e-33 MNKLLLFSVFSILTLNVMAQEKIVQTAGRDQLGTFAPKFAELNDDVLFGEVWSRTDKLGL RDRSLVTITSLISMGITDNSLVYHLQSAKKNGITRTEIAEIITHIGFYAGWPKAWAAFNL AKGVWSEDIVGEDAKTAFQREMIFPIGEPNTAYAQYFIGNSYLAPVSREQVNISNVTFEP RCRNNWHIHRATKGGGQMLIGVAGRGWYQEEGKPAVEILPGTVIHIPANVKHWHGAAADS WFAHLAFEVSGENTYNEWLEPVTDDEYNKLKQTKR >gi|225935344|gb|ACGA01000048.1| GENE 35 35810 - 35959 156 
49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173796|ref|ZP_05760208.1| ## NR: gi|260173796|ref|ZP_05760208.1| hypothetical protein BacD2_18138 [Bacteroides sp. D2] # 1 49 13 61 61 85 100.0 8e-16 MTTEEFKEYVKTRKALNTEEIHRLMDDMSNYQSNGRMKTPVGLLHRDNV >gi|225935344|gb|ACGA01000048.1| GENE 36 35998 - 36561 359 187 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 31 174 5 153 179 100 37.0 2e-21 MKQIIFIFMSLLCFSCSSKAQNQTADTNVQSDKKILVAYFSCTGTTEKAADAIAKTVGGK LYQITPATAYTSADLDWNNKSSRSSIEMTTENSRPELGGEVLNLKDYDVIFLGYPIWWNL CPRPVNTFLEKYDFSGKTVIPFATSGGSSITNSVKQLKKLYPKIVWGESRLLNGSVKQAG EWAKGVI >gi|225935344|gb|ACGA01000048.1| GENE 37 36667 - 37245 460 192 aa, chain + ## HITS:1 COG:alr4010 KEGG:ns NR:ns ## COG: alr4010 COG0664 # Protein_GI_number: 17231502 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 20 189 22 191 196 72 31.0 6e-13 MLLEKLLKEFNFSTNIFNWEEFKLFDQYFERILVSKGTFLIKEGETERYSYFVFDGILRC WLLNHNGEEQTFWFCKEGTFSMSNISFTLQTRSAFNVQTIVDSVIYRIDKKQVNELYTAI PKVKTVFEDLNAILLNKLLKRNIDLIKYSPEQYYLQMMEEYGITLNYIPLKDIASYLGIT PQALSRIRKRIF >gi|225935344|gb|ACGA01000048.1| GENE 38 37310 - 37732 326 140 aa, chain + ## HITS:1 COG:CC2265 KEGG:ns NR:ns ## COG: CC2265 COG0454 # Protein_GI_number: 16126504 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Caulobacter vibrioides # 2 140 4 143 145 117 41.0 5e-27 MIRKIKVTDYPRLMEIWESAVLSTHDFLKEEDFLYYKERLPVYFQYVNLFGFEQEGILIG FMGIAEGNLEMLFIDNKYRGAGIGKKLITYAIDNLQVTKVDVNEQNVQAVGFYEYMGFNI YKRSNLDGEGKEYPILHMQL >gi|225935344|gb|ACGA01000048.1| GENE 39 38599 - 39453 751 284 aa, chain + ## HITS:1 COG:alr1623 KEGG:ns NR:ns ## COG: alr1623 COG0788 # Protein_GI_number: 17229115 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Nostoc sp. 
PCC 7120 # 3 283 5 283 284 323 55.0 2e-88 MTTAKLLLHCPDKPGILAEVTDFITVNKGNIIYLDQYVDHVENIFFMRIEWELKDFLVPQ EKIEDYFRTLYGQKYEMDFRLYFSDVKPRMAIFVSKLSHCLFDMLARYTAGEWNVEIPLI ISNHPDLQHVAERFGIPFYLFPITKETKEEQERKEMELLAKHKITFIVLARYMQVISEQM INAYPNKIINIHHSFLPAFVGAKPYHAAFQRGVKIIGATSHYVTTELDAGPIIEQDVVRI THKDAIEDLVNKGKDLEKIVLSRAVQKHIERKVLAYKNKTVIFS >gi|225935344|gb|ACGA01000048.1| GENE 40 39453 - 40043 506 196 aa, chain + ## HITS:1 COG:YPO1545 KEGG:ns NR:ns ## COG: YPO1545 COG0118 # Protein_GI_number: 16121818 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Yersinia pestis # 1 196 1 196 196 171 43.0 6e-43 MKVAVVKYNAGNIRSVDYALKRLGVEAVITADKEELQSADKVIFPGVGEAETTMNHLKAT GLDELIKNLRQPVFGICLGMQLMCRYSEEGEVDCLNIFDVDVKRFVPQKHEDKVPHMGWN TIGKTNSKLFEGFTEEEFVYFVHSFYVPTCDFTAATTDYIHPFSAALHKDNFYATQFHPE KSGKTGEKILTNFLNL >gi|225935344|gb|ACGA01000048.1| GENE 41 40062 - 40781 634 239 aa, chain + ## HITS:1 COG:PM1203 KEGG:ns NR:ns ## COG: PM1203 COG0106 # Protein_GI_number: 15603068 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Pasteurella multocida # 3 234 5 241 249 187 39.0 2e-47 MIEIIPAIDIIDGKCVRLSQGDYDSKKVYNENPVEVAKEFEANGVRRLHVVDLDGAASHH VVNYRVLEQIAARTSLVIDFGGGVKSDEDLKIAFESGAQMVTGGSIAVKDPELFCHWLEI YGSEKIILGADVKEHKIAVNGWKDESACELFPFLENYIDKGIRKVICTDISCDGMLSGPS IDLYKEMLAKFPDLYLMASGGVSKMDDIVALDEAGVPGVIFGKALYEGHITLQDLRIFL >gi|225935344|gb|ACGA01000048.1| GENE 42 40885 - 41637 741 250 aa, chain + ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 250 1 250 253 297 58.0 1e-80 MLAKRIVPCLDIKDGQTVKGTNFVNLRQAGDPVELGRAYSEQGADELVFLDITASHEGRK TFTELVKRIAANINIPFTVGGGINELSDVDRLLNAGADKISINSSAIRNPRLVDEIAKNF GSQVCVLAVDAKQTENGWKCYLNGGRIETDKDLFEWTKEAQERGAGEILFTSMNHDGVKA 
GYANEALAALADQLSIPIIASGGAGCKEHFRDVFLQGKADAALAASVFHFGEIKIPELKS YLCGEGITIR >gi|225935344|gb|ACGA01000048.1| GENE 43 41702 - 42313 656 203 aa, chain + ## HITS:1 COG:hisI_1 KEGG:ns NR:ns ## COG: hisI_1 COG0139 # Protein_GI_number: 16129967 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Escherichia coli K12 # 2 100 9 107 112 146 65.0 2e-35 MELDFDKMNGLVPAIIQDNETRKVLMLGFMNKEAYDKTVETGKVTFFSRTKNRLWTKGEE SGNFLHVVSIKADCDNDTLLIQVDPVGPVCHTGTDTCWGEKNEEPVMFLKALQDFIDKRH EEMPEGSYTTSLFESGINKIAQKVGEEAVETVIEATNGTDERLIYEGADLIYHMIVLLAS KGYRIEDLARELQERHSSTWKKH >gi|225935344|gb|ACGA01000048.1| GENE 44 42355 - 43083 354 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 6 202 4 201 223 140 38 1e-32 MDDGALIQYKNVEIHQQELCVLSDVNLELHKGEFVYLIGKVGSGKTSLLKTLYGELDVID GEAEVLGYNMRSIKRKHIPQLRRKLGIVFQDFQLLTDRTVYNNLEFVLRATGWKNKQEIQ ERIEEVLQLVGMSNKGYKLPNELSGGEQQRIVIARAVLNSPAIILADEPTGNLDVETGKA IVELLHNICESGSSVVMTTHNLQLLKDYPGRVYRCADHQIVDVTDEYMPRQRTIEIDLNI DN >gi|225935344|gb|ACGA01000048.1| GENE 45 43097 - 44416 1196 439 aa, chain + ## HITS:1 COG:VC0391 KEGG:ns NR:ns ## COG: VC0391 COG0527 # Protein_GI_number: 15640418 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Vibrio cholerae # 3 436 34 476 479 254 36.0 2e-67 MKVLKFGGTSVGSAQRMKEVAKLITDGEQKIVVLSAMSGTTNTLVEISDYLYKKNPEGAN EIINKLEAKYKQHIDELFATQEYKQKGLEVVKSHFDYIRSYTKDLFTLFEEKVVLAQGEL ISTAMVNFYLQECGVKSVLLPALEFMRTDKNAEPDPVYIKDKLRAQLDLYPDTEIYITQG FICRNAYGEIDNLQRGGSDYTASLIGAAVNASEIQIWTDIDGMHNNDPRIVDKTAPVRQL HFEEAAELAYFGAKILHPTCIQPAKYANIPVRLLNTMDPDAPGTLISNDTEKGKIKAVAA KDNITAIKIKSSRMLLAHGFLRKVFEIFESYQTSIDMICTSEVGVSVTIDNTKHLNEILD DLKKYGTVTVDKEMCIICVVGDLEWENVGFEAKALDAMRDIPVRMISFGGSNYNISFLIR ECDKKVALQSLSDMLFNGK >gi|225935344|gb|ACGA01000048.1| GENE 46 44495 - 45655 1248 386 aa, chain + ## HITS:1 COG:mlr3508 KEGG:ns NR:ns ## COG: mlr3508 COG0019 # 
Protein_GI_number: 13473029 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Mesorhizobium loti # 15 376 27 388 422 280 43.0 3e-75 MKGIFPIDKFRTLQTPFYYYDTKVLRDTLSAINHEVAKYPNYSVHYAVKANANPKVLTII RESGMGADCVSGGEIRAAIRAGFPANKVVFAGVGKADWEINLGLEYGIFCFNVESIPELE VINELAAAQNKVANVAFRINPDVGAHTHANITTGLAENKFGISMQDMDKVIDVAQEMKNV KFIGLHFHIGSQILDMGDFVALCNRVNELQDKLEARRILVEHINVGGGLGIDYGHPNRQA IPNFKDYFATYAGQLKLRPYQTLHFELGRAVVGQCGSLISKVLYVKQGTRKKFAILDAGM TDLIRPALYQAFHKMENITSEEPLEAYDVVGPICESSDVFGKAIDLNKVKRGDLIALRSA GAYGEIMASRYNCRELPKGYTSDELV >gi|225935344|gb|ACGA01000048.1| GENE 47 45797 - 46276 676 159 aa, chain + ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 159 1 161 171 148 44.0 3e-36 MISEKLQNAINEQITAEMWSANLYLAMSFYFEKEGFSGFAHWMKKQSQEEMGHAYAMADY IIKRGGTAKVDKIDVVPNGWGTPLEVFEHVYKHECHVSQLVDKLVDVAAAEKDKATQDFL WGFVREQVEEEATAQGIVDKIKKAGDTGIFFVDSQLGQR >gi|225935344|gb|ACGA01000048.1| GENE 48 46432 - 48696 1628 754 aa, chain + ## HITS:1 COG:PA0928_1 KEGG:ns NR:ns ## COG: PA0928_1 COG0642 # Protein_GI_number: 15596125 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 508 748 259 507 509 150 38.0 1e-35 MNGLNRGIIRLLLLSLLLLLGVSSCSQKEERRILVVHSYEETYAAYPEFNRMIAEQFEKE KIDADIRTVYLDCESYWEEPELERMRFLVDSVSKDWRPEVILVNEDQATYSLMKCGIQLA KEVPVVFGGVNYPNWGLLKHHPNVTGFHDKIAFNENISVAKELFGEHVRLFTMLDTTYID RQIRRDAKEQFKGHKVTGFIDNPELSPEEQIRLVQEDGYTRFMAIPLRNARNHSDATFMW VLNRSYRDQCYIQLKRDYTTINIGSICGSPSLTAINEAFGFGEKLLGGYITSLPIQVEEE VKTAVRILQGASPSDIPIVESRKEYVVDWNTMTQIGLSKESIPAKYRIINIPFRDEYPFL WGALVASFVLFLILLFASLWWLYLREQMRKKQALIALADEKETLSLAIEGGMTYAWRLDK GCFVFEDAFWASQGLNPRQLSFKEFMSFIHPDHWEGVKFNWRNLKSAHKKIVQELCNFDG KGYQWWEFRYTTKQLSGGEYKTAGLLLNIQDIKDREEELEAARLLAEKAELKQSFLANMS 
HEIRTPLNSIVGFANILALEDGLSSVEREEYIGTINKNSELLLKLINDILELSRIESGYM SFSFKKCRVRELIDDIYMTHQVLIAPRLEFLKEVDNIPLEINVDRERLIQVLTNFLNNAS KFTETGYIKLGYIYLPDEGRVRIYVEDTGRGIPREEQRMIFSRFYKQNEFSQGAGLGLSI CQVIIEKLGGRIELKSEVGKGSRFTVVLPCRVVS >gi|225935344|gb|ACGA01000048.1| GENE 49 48728 - 49135 492 135 aa, chain - ## HITS:1 COG:no KEGG:BT_1372 NR:ns ## KEGG: BT_1372 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 23 157 157 191 94.0 5e-48 MEELTLTTPALLFSAVSLILLAYTNRFLSYAQLVRQLRDRYVENPSDITEAQIENLRKRL NLTRTMQGLGIASLFLCVVSMFLIYIGLQLFSAYVFGLALILLIASLGVSFREIQISTRS LEIYLGTMEKGKNKK >gi|225935344|gb|ACGA01000048.1| GENE 50 49338 - 50531 1406 397 aa, chain - ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 7 394 12 402 403 479 60.0 1e-135 MYGKMQEYLRQTLAEIKEAGLYKEERLIESAQQAAITVKGKEVLNFCANNYLGLSNHPRL IKASQEMMNNRGYGMSSVRFICGTQDIHKELEAAISDYFQTEDTILYAACFDANGGVFEP LFSEEDAIISDSLNHASIIDGVRLCKAKRYRYANADMKDLERCLQEAQAQRFRIVVTDGV FSMDGNVAPMDQICDLAEKYDALVMVDESHSAGVVGATGHGVSELYKTHGRVDIYTGTLG KAFGGALGGFTTGRKEIIDLLRQRSRPYLFSNSLAPGIIGASLEVFKMLKESNALHDKLV ENVNYFRDKMTAAGFDIKPTQSAICAVMLYDAKLSQIYAARMQEEGIYVTGFYYPVVPKD QARIRVQISAGHEKAHLDKCIAAFIKVGKELNVLKAE Prediction of potential genes in microbial genomes Time: Fri May 13 10:22:01 2011 Seq name: gi|225935343|gb|ACGA01000049.1| Bacteroides sp. D2 cont1.49, whole genome shotgun sequence Length of sequence - 5557 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 89 - 1042 881 ## COG0451 Nucleoside-diphosphate-sugar epimerases 2 1 Op 2 . + CDS 1042 - 1890 625 ## BT_1369 hypothetical protein 3 1 Op 3 . 
+ CDS 1938 - 2930 812 ## COG0812 UDP-N-acetylmuramate dehydrogenase + Prom 3000 - 3059 4.7 4 2 Op 1 . + CDS 3089 - 3847 414 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I + Term 3855 - 3889 -0.8 + Prom 3851 - 3910 3.1 5 2 Op 2 . + CDS 3931 - 5505 841 ## BT_1366 hypothetical protein Predicted protein(s) >gi|225935343|gb|ACGA01000049.1| GENE 1 89 - 1042 881 317 aa, chain + ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 315 1 314 321 340 52.0 2e-93 MEHILIIGATGQIGSELTMELRKRYGNANVVAGYIPGAEPKGELKESGPSAIADVTDGEA IASVVKEYHIDTIYNLAALLSVVAESKPKLAWKIGIDGLWNVLEVAREQGCAVFTPSSIG SFGASTPHTKTPQDTIQRPRTMYGVTKVTTELLSDYYFNKYGVDTRAVRFPGIISNVTPP GGGTTDYAVDIYYSAVKGEKFVCPIKQGTLMDMMYMPDALNAAITLMEADPARLIHRNAF NIASMSFDPETIYQAIKKHVPQFEMIYDIDPLKQRIADSWPDSLDDTCAREEWGWKPAYN LESMTVDMLEKLREKLK >gi|225935343|gb|ACGA01000049.1| GENE 2 1042 - 1890 625 282 aa, chain + ## HITS:1 COG:no KEGG:BT_1369 NR:ns ## KEGG: BT_1369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 282 1 282 282 493 86.0 1e-138 MKKLIAGVILLGILSSCGNKKTNIDPFASITKEVDSVRQIADSSHHDKSPEDPQPIRADE SFDDFIYNFASDDVLQRQRVKFPLPYYNGDKKSNIEERNWKHDDLFTKQHYYTLLFDREE DMDLVGDTSLTSVQVEWMFVKTRMVKRYYFERIKGAWILEAINLRPIEQSDNENFVEFFG HFATDSLFQSQRVREPLAFVTTDPDDDFSILETTLDLNQWFAFKPVLPVDRLSNINYGQR NDDDSPTKILALKGIGNGFSNILYFRRKAGEWELYKFEDTSI >gi|225935343|gb|ACGA01000049.1| GENE 3 1938 - 2930 812 330 aa, chain + ## HITS:1 COG:PM1589 KEGG:ns NR:ns ## COG: PM1589 COG0812 # Protein_GI_number: 15603454 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Pasteurella multocida # 1 330 1 329 341 267 43.0 3e-71 MYSLLPYNTFGIDVSAARFLEYSSVEELKKLIVQGAIVTPFLHIGGGSNLLFTKDYDGLI 
LHSRIEGIEVTEEDEHSVSVRVGAGVVWDDFVAYCVEHGWYGTENLSLIPGEVGASAVQN IGAYGVEVKDLITAVETVNIQAEERVYSVEECGYTYRNSIFKRPENKSAFVTYVRFRLSK KEHYTLDYGTIRQELEKYPALTLSVVRKVIIDIRESKLPDPKVMGNAGSFFMNPIVPKEK LEALQQEYPRIPYYELADGRVKIPAGWMIDQCGWKGKALGPAAVHDKQALVLVNLGGAKG SDILALSDAVRASVREKFGIDIHPEVNLIN >gi|225935343|gb|ACGA01000049.1| GENE 4 3089 - 3847 414 252 aa, chain + ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia burgdorferi # 6 252 6 253 253 200 39.0 2e-51 MKVRIIGSGTSTGVPQIGCTCPVCTSTDPKDNRLRASAIVETDDARILIDCGPDFRTQVL HLPFEKIDGVLITHEHYDHVGGLDDLRPFCRFGSVPIYAENYVAQGLRLRMPYCFVDHRY PGVPDIPLQEIAVGQAFSVNHTEVLPLRVMHGRLPILGYRIGQLGYITDMLTMPEESYEQ LAGIDVLVVNALRIATHPTHQNLEEALAVARCIQAKKTYFIHMSHDMGLHAKVEKSLPEN IHLAFDGLDIYL >gi|225935343|gb|ACGA01000049.1| GENE 5 3931 - 5505 841 524 aa, chain + ## HITS:1 COG:no KEGG:BT_1366 NR:ns ## KEGG: BT_1366 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 524 1 525 525 822 77.0 0 MKLRTIVKIAITSSVVLLCSGFALYSFFRLSAAEGQKDFNLYELVPSTTSAVFVTDDVLE FVAEVDDLTCSKNQQYLYVSKLFSYLKQSLYALSEDTPHGLSRQMNQMLISFHEPDSERN QVLYCRLGNGDKELVNRFVRKYISSLYPPKTFVYKGEEIIIYPMADGDFLACYLTSDFMT LSFQKKLIENVIDAYKSDKSLADDPTFTGVRAPKKSAAAATIYTRMKGMMGWTEFDMKMK DDFIYFSGITHDADTCFAFINQLRQQQSVKGFPGEALPSTAFYFSRQGITDWASLLSYGN AQEQNGAVRTSEVQNRDKEFSRYLMENAGQDLVVCLFQREDTLQDAAAVLSLSVADVTEA ERMLRALVNTAPADEGRKTPRITFCYTVNKAYLVYRLPQTTLFKQLTSFVEPTLDVYAAF YGGRLLLAPDEDSLSHYIRQLDKGEVLNGAMAYQTGMDHLSDSYHFMLMADFDHVFRQSE NHVRFVPDFFLRNADFFRNFTLFVQFACTDGVVYPNIVLKYKSE Prediction of potential genes in microbial genomes Time: Fri May 13 10:22:13 2011 Seq name: gi|225935342|gb|ACGA01000050.1| Bacteroides sp. 
D2 cont1.50, whole genome shotgun sequence Length of sequence - 9590 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 38 - 412 357 ## BT_1365 hypothetical protein - Prom 432 - 491 4.5 + Prom 395 - 454 4.6 2 2 Tu 1 . + CDS 566 - 1690 1145 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) + Term 1711 - 1774 6.2 + Prom 1713 - 1772 2.5 3 3 Op 1 . + CDS 1792 - 2571 675 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 4 3 Op 2 . + CDS 2571 - 3779 1094 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 5 3 Op 3 . + CDS 3813 - 5474 1756 ## COG0497 ATPase involved in DNA repair 6 3 Op 4 . + CDS 5516 - 6259 390 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 7 3 Op 5 . + CDS 6261 - 7967 1691 ## COG0457 FOG: TPR repeat + Term 7998 - 8045 10.9 + Prom 8151 - 8210 9.5 8 4 Tu 1 . + CDS 8328 - 9272 502 ## BT_1726 integrase + Term 9315 - 9347 0.1 Predicted protein(s) >gi|225935342|gb|ACGA01000050.1| GENE 1 38 - 412 357 124 aa, chain - ## HITS:1 COG:no KEGG:BT_1365 NR:ns ## KEGG: BT_1365 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 124 124 189 82.0 2e-47 MEFTGKIIAILQPKGGVSKATGNEWKAQEFVIESHDQYPRKMCFEVFGADKIDQFNIQMG EELTVSFDIDSNQWQDRWFNRIRAWKVERVSAGAPMATGGSVPPAAPSAMPEFTPGDAKD DLPF >gi|225935342|gb|ACGA01000050.1| GENE 2 566 - 1690 1145 374 aa, chain + ## HITS:1 COG:BMEI1942 KEGG:ns NR:ns ## COG: BMEI1942 COG0592 # Protein_GI_number: 17988225 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Brucella melitensis # 1 370 26 395 397 160 30.0 5e-39 MKFIVSSTALSSHLQAISRVINSKNALPILDCFLFELEDGTLSVTVSDSETTMVTTVEVN ESDTNGRFAVVAKTLLDALKEIPEQPLTFEINPDNYEITVQYQNGKYSLMGQNADEFPQS ATLGDNAVRVEMEASVLLGGINRSVFATADDELRPVMNGIYFDITTEDITMVASDGHKLV 
RCKTLAAKGNERAAFILPKKPATLLKNLLPKEQGTVTIEFDERNAVFMLESYRMVCRLIE GRYPNYNSVIPQNNPHKVTVDRQQLVGALRRVSIFSSQASSLIKLRMQENQIVISAQDID FSTSAEETQVCQYAGAAMSIGFKSTFLIDILNNISADEVIIELADPSRAGVIIPVEQEEN EDLLMLLMPMMLND >gi|225935342|gb|ACGA01000050.1| GENE 3 1792 - 2571 675 259 aa, chain + ## HITS:1 COG:CT261 KEGG:ns NR:ns ## COG: CT261 COG0847 # Protein_GI_number: 15604982 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Chlamydia trachomatis # 9 239 4 214 232 79 30.0 5e-15 MKLNLKNPIVFFDLETTGTNINTDRIVEICYLKVYPNGNEEAKTLRINPEMHIPEASSAI HGIYDADVVDCPTFKEVAKNIARDIEGCDLAGFNSNRFDIPVLAEEFLRAGVDIDMTKRK FVDVQVIFHKMEQRTLSAAYKFYCDKNLEDAHTAEADTRATYEVLKAQLDRYSDLQNDIA FLADYSSFSKNVDFAGRMVYDDNGVEVFNFGKYKGMSVAEVLKKDPGYYSWILNSDFTLN TKAALTKIRLREMSNLITK >gi|225935342|gb|ACGA01000050.1| GENE 4 2571 - 3779 1094 402 aa, chain + ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 401 1 400 404 334 45.0 2e-91 MLKGKKIILGITGSIAAYKACYIIRGLIKQGAEVQVVITPAGKEFITPITLSALTSKPVI SEFFAQRDGTWNSHVDLGLWADAVLIAPATASTIGKMANGIADNMLITTYLSAKAPVFVA PAMDLDMFAHPATQKNLDILRSYGNHIIEPGTGELASHLVGKGRMEEPENIIRVLDEFFA SSDELSGKKVMITAGPTYEKIDPVRFIGNYSSGKMGFALAEECARRGAQVTLITGPVQLK TKHSRIRRVDVESAEEMYAAAQIYFPDADAGILCAAVADYRPETVADKKIKREKEEELTL HLQATQDIAANLGAMKGKNQLLVGFALETNNEQQNAEGKLERKNFDFIVLNSLNDAGAGF RHDTNKISIIDRKGRTDYPLKPKTEVAQDIIDRLVATLSPGF >gi|225935342|gb|ACGA01000050.1| GENE 5 3813 - 5474 1756 553 aa, chain + ## HITS:1 COG:PA4763 KEGG:ns NR:ns ## COG: PA4763 COG0497 # Protein_GI_number: 15599957 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 550 1 552 558 340 40.0 6e-93 MLRSLYIQNYALIEKLDISFETGFSVITGETGAGKSIILGAIGLLLGQRADVKAIRHGAS 
KCIIEARFDISAYGMRPFFEENELEYDEECILRREVQSSGKSRAFINDTPASLAQVKELG EQLIDVHSQHQNLLLNKEGFQLNVLDILAHNDAALEKYHLCYDEWKQTDRELAELVSLAE KSRSDEDYIRFQLEQLEEARLVEGEQEELEQEAETLSHAEEIKAGLYRVEQSFASDEGGL LSYLKDSLNTLNNLQRVYQPAKELTERMESAYIELKDISHEVSLQSDSVEFNPVRLDEVN ERLNLIYSLQQKHRAQTLDELITLTDEYRSKLSDITSYDERIAELTARKEEQYKQVKQQA EVLTKARTKAAREVEKQLAARLIPLGMPNVRFQIEMGLKKEPGLQGEDTVNFLFSANKNG TLQNISSVASGGEIARVMLSIKAMIAGAVKLPTIVFDEIDTGVSGEIADRMADMMQEMGD RNRQVISITHLPQIAARGRAHYKVYKKDSDTETNSHIRRLTDEERVEEIAHMLSGATLTE AALSNAKSLLARN >gi|225935342|gb|ACGA01000050.1| GENE 6 5516 - 6259 390 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 6 244 7 246 255 154 35 2e-37 MLDKSEMIFGVRAVIEALQAGKEIDKILVKKDIQSDLSKELFAALKGTLIPVQRVPVERI NRITRKNHQGVIAFISSVTYQKAEDLVPFLFEQGKNPFFVMLDGVTDVRNFGAIARTCEC AAVDAVIIPVRGSASVNADAVKTSAGALHTLPVCREQNLRSTLQYLKDSGFRIVAATEKG DYDYTKADYTGPLCIIMGAEDTGVSYENLALCDEWVKIPMLGTIESLNVSVAAGILIYEA VKQRNND >gi|225935342|gb|ACGA01000050.1| GENE 7 6261 - 7967 1691 568 aa, chain + ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 231 558 59 331 628 71 24.0 6e-12 MKKILILPLLLFFLVQGSMAQTPKWVEKAKRAVFSIVTYDKNDKMLNTGNGFFVSEDGLA LSDYTLFKGAERAVVITSEGKQMPVSLILGANDMYDVIKFRVAITEKKVPALIVAKTAPV AGAEAWMLPYSTQKSIACVNGKVKDVSKVAGEYHYYTLSMHMKDKMVSCPVMNAEGQVFG IAQKSSGIDTVTTCYAAGAAFAMSQKISALSLGDAALKSIGIRKGLPETEDQALVYLFMA SSSLSGEDYEKLLDDFIRQFPANADGYLRRANYYASKGKDDQTWYDKAVADFNQALKVAQ KKDDVYYNIGKLMYAYQLSKPEKTYKDWTYDTALKNVRQAIAIDPLPIYTQMEGDILFAQ QDYAGALAAYEKVNASNIASPATFFSAAKTKELLKGDPKEVVALMDSCIARCPQPITADF APYLLERAQMNMNADQARNAMLDYDAYHTAVKGEVNDVFYYYREQAALKARQFQRALDDI AKAIEMNPTDLTYQAEHAVINLRVGRYEEAIQILNNILKTDPKYGEAYRLLGLCQIQLKK TDEACGNFKKAKELGDPNADELITKYCK >gi|225935342|gb|ACGA01000050.1| GENE 8 8328 - 9272 502 314 aa, chain + ## HITS:1 COG:no KEGG:BT_1726 NR:ns ## KEGG: BT_1726 # Name: 
not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 308 1 307 316 405 67.0 1e-111 MEKNRFTICANNYIDCLRQEGRYSTAHVYKHAIRSFSQFCGTQSITFSKINRETLKRYSN YLMASRLKPNTISTYMRMLRSIYNRGVDMHQAPYVHGLFRDVFTGVDTRQKKAIPIGELH MLLNKDPQSEKLRRTQAIANLLFQFCGMPFSDLAHLEKSNLERGLLKYNRTKTGTPMSIE VLESAQNAIGGLYNKSDARSSGYPDYLFWILSGAYKRNEEGAYREYQSALRRFNNELKSL SRKLRLHSPVTSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELEERTKV NKLNYSYVCNFKML Prediction of potential genes in microbial genomes Time: Fri May 13 10:22:42 2011 Seq name: gi|225935341|gb|ACGA01000051.1| Bacteroides sp. D2 cont1.51, whole genome shotgun sequence Length of sequence - 87207 bp Number of predicted genes - 75, with homology - 73 Number of transcription units - 31, operones - 18 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 770 735 ## COG1596 Periplasmic protein involved in polysaccharide export 2 1 Op 2 . + CDS 785 - 1909 674 ## BT_1355 hypothetical protein 3 1 Op 3 2/0.000 + CDS 1926 - 3332 574 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Prom 3356 - 3415 5.2 4 1 Op 4 7/0.000 + CDS 3554 - 4762 804 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 5 1 Op 5 9/0.000 + CDS 4771 - 5919 650 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 6 1 Op 6 . + CDS 5974 - 6555 259 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 7 1 Op 7 2/0.000 + CDS 6573 - 7631 511 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 8 1 Op 8 . + CDS 7639 - 8415 510 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 9 1 Op 9 . + CDS 8422 - 8820 271 ## BF1535 hypothetical protein 10 1 Op 10 . 
+ CDS 8817 - 9890 661 ## COG0451 Nucleoside-diphosphate-sugar epimerases 11 1 Op 11 . + CDS 9893 - 10813 654 ## BF1537 putative dTDP-glucose 4,6-dehydratase 12 1 Op 12 . + CDS 10820 - 12343 469 ## PRU_0439 hypothetical protein 13 1 Op 13 . + CDS 12346 - 13305 381 ## PRU_0438 hypothetical protein + Term 13376 - 13417 -0.9 + Prom 13324 - 13383 3.7 14 2 Op 1 . + CDS 13451 - 14773 680 ## gi|260173839|ref|ZP_05760251.1| glycosyl transferase, group 1 15 2 Op 2 . + CDS 14807 - 15760 507 ## BF1544 hypothetical protein + Prom 15766 - 15825 7.0 16 3 Tu 1 . + CDS 15846 - 16592 454 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 16616 - 16655 -0.6 + Prom 16599 - 16658 6.3 17 4 Tu 1 . + CDS 16707 - 17552 309 ## Geob_1464 glycosyltransferase-like protein + Term 17752 - 17792 1.1 18 5 Tu 1 . - CDS 17529 - 17876 70 ## - Prom 18045 - 18104 7.1 + Prom 18242 - 18301 5.4 19 6 Tu 1 . + CDS 18340 - 18702 325 ## gi|260173843|ref|ZP_05760255.1| hypothetical protein BacD2_18377 + Prom 18717 - 18776 4.9 20 7 Op 1 8/0.000 + CDS 18862 - 19572 228 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 19656 - 19715 11.8 21 7 Op 2 . + CDS 19769 - 20578 241 ## COG1216 Predicted glycosyltransferases + Term 20587 - 20638 1.6 + Prom 20870 - 20929 8.9 22 8 Tu 1 . + CDS 21057 - 21236 81 ## gi|160882621|ref|ZP_02063624.1| hypothetical protein BACOVA_00574 + Term 21256 - 21307 3.4 - Term 21244 - 21295 10.5 23 9 Op 1 . - CDS 21301 - 21744 134 ## COG3023 Negative regulator of beta-lactamase expression - Prom 21868 - 21927 3.9 24 9 Op 2 . - CDS 21930 - 22418 673 ## BT_0403 hypothetical protein - Prom 22554 - 22613 2.6 - Term 22563 - 22602 2.3 25 10 Tu 1 . - CDS 22675 - 24999 1177 ## BT_0404 hypothetical protein - Prom 25118 - 25177 9.0 + Prom 24954 - 25013 7.1 26 11 Tu 1 . + CDS 25152 - 25511 264 ## BT_0405 hypothetical protein + Prom 25522 - 25581 15.0 27 12 Op 1 . + CDS 25650 - 25727 98 ## 28 12 Op 2 . 
+ CDS 25766 - 26155 341 ## COG3152 Predicted membrane protein + Term 26176 - 26225 12.4 + Prom 26200 - 26259 8.2 29 13 Op 1 . + CDS 26283 - 28541 1393 ## gi|260173851|ref|ZP_05760263.1| hypothetical protein BacD2_18417 30 13 Op 2 . + CDS 28546 - 29118 502 ## gi|260173852|ref|ZP_05760264.1| hypothetical protein BacD2_18422 31 13 Op 3 . + CDS 29123 - 30190 1034 ## PRU_0332 RyR domain-containing protein 32 13 Op 4 . + CDS 30213 - 30506 308 ## BT_2247 putative ryanodine receptor - Term 30241 - 30274 4.7 33 14 Op 1 . - CDS 30481 - 33762 2689 ## COG4995 Uncharacterized protein conserved in bacteria 34 14 Op 2 . - CDS 33783 - 34451 400 ## gi|260173856|ref|ZP_05760268.1| hypothetical protein BacD2_18442 35 14 Op 3 . - CDS 34473 - 36089 924 ## COG1262 Uncharacterized conserved protein - Prom 36119 - 36178 7.3 + Prom 36094 - 36153 6.6 36 15 Tu 1 . + CDS 36185 - 36559 406 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 36582 - 36629 13.5 - Term 36363 - 36410 1.5 37 16 Tu 1 . - CDS 36527 - 38017 963 ## COG0285 Folylpolyglutamate synthase - Prom 38241 - 38300 5.1 + Prom 37975 - 38034 6.2 38 17 Tu 1 . + CDS 38149 - 39474 1465 ## COG1875 Predicted ATPase related to phosphate starvation-inducible protein PhoH - Term 39513 - 39564 1.7 39 18 Op 1 . - CDS 39594 - 40571 1098 ## COG0167 Dihydroorotate dehydrogenase 40 18 Op 2 . - CDS 40595 - 41263 480 ## COG0325 Predicted enzyme with a TIM-barrel fold 41 18 Op 3 . - CDS 41270 - 41749 661 ## BT_1331 hypothetical protein 42 18 Op 4 . - CDS 41833 - 42396 618 ## COG3247 Uncharacterized conserved protein - Prom 42507 - 42566 5.0 + Prom 42356 - 42415 7.1 43 19 Tu 1 . + CDS 42534 - 43034 613 ## COG2077 Peroxiredoxin + Prom 43080 - 43139 3.3 44 20 Tu 1 . + CDS 43233 - 44117 918 ## BT_1328 hypothetical protein 45 21 Op 1 . - CDS 44288 - 44932 312 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 46 21 Op 2 . - CDS 44920 - 46413 1643 ## COG0457 FOG: TPR repeat 47 22 Op 1 . 
- CDS 46516 - 48255 1821 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 48 22 Op 2 . - CDS 48304 - 49116 811 ## COG0226 ABC-type phosphate transport system, periplasmic component - Prom 49275 - 49334 5.4 + Prom 49113 - 49172 6.3 49 23 Op 1 38/0.000 + CDS 49294 - 50490 1166 ## COG0573 ABC-type phosphate transport system, permease component 50 23 Op 2 41/0.000 + CDS 50492 - 51367 759 ## COG0581 ABC-type phosphate transport system, permease component 51 23 Op 3 32/0.000 + CDS 51376 - 52134 217 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 52 23 Op 4 . + CDS 52190 - 52879 874 ## COG0704 Phosphate uptake regulator + Term 52937 - 52977 7.1 + Prom 52978 - 53037 7.3 53 24 Op 1 . + CDS 53255 - 53857 702 ## COG0307 Riboflavin synthase alpha chain 54 24 Op 2 . + CDS 53872 - 54435 601 ## COG0778 Nitroreductase + Term 54486 - 54546 17.1 - Term 54495 - 54555 0.7 55 25 Op 1 . - CDS 54649 - 57933 2664 ## COG0793 Periplasmic protease 56 25 Op 2 . - CDS 57985 - 59304 1101 ## COG1295 Predicted membrane protein - Prom 59401 - 59460 4.8 + Prom 59816 - 59875 4.1 57 26 Op 1 3/0.000 + CDS 59901 - 60881 583 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 58 26 Op 2 . + CDS 60886 - 62037 1058 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily + Prom 62145 - 62204 7.1 59 27 Op 1 1/0.000 + CDS 62257 - 63792 1890 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 63839 - 63891 5.7 + Prom 63877 - 63936 7.5 60 27 Op 2 . + CDS 63977 - 64837 777 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 64874 - 64935 10.6 + Prom 64892 - 64951 6.7 61 28 Op 1 . + CDS 64972 - 65595 360 ## BT_1310 hypothetical protein 62 28 Op 2 . + CDS 65673 - 66074 330 ## BT_1309 hypothetical protein + Term 66094 - 66147 7.6 + Prom 66079 - 66138 7.6 63 29 Op 1 . + CDS 66165 - 67355 822 ## BT_1308 clostripain-related protein 64 29 Op 2 . 
+ CDS 67413 - 67958 350 ## BF0992 RNA polymerase ECF-type sigma factor 65 29 Op 3 . + CDS 68021 - 69193 855 ## COG3712 Fe2+-dicitrate sensor, membrane component 66 30 Op 1 . + CDS 69342 - 72767 2331 ## Phep_3877 TonB-dependent receptor plug 67 30 Op 2 . + CDS 72773 - 74530 1198 ## Phep_3876 RagB/SusD domain protein 68 30 Op 3 . + CDS 74560 - 77388 1835 ## Phep_3875 TonB-dependent receptor plug 69 30 Op 4 . + CDS 77415 - 80078 2093 ## Phep_3874 RagB/SusD domain protein 70 30 Op 5 1/0.000 + CDS 80098 - 82134 1282 ## COG1472 Beta-glucosidase-related glycosidases 71 30 Op 6 . + CDS 82141 - 83247 714 ## COG1472 Beta-glucosidase-related glycosidases 72 30 Op 7 . + CDS 83264 - 83944 278 ## BT_2457 putative purple acid phosphatase 73 30 Op 8 . + CDS 83815 - 84444 266 ## BT_2457 putative purple acid phosphatase 74 30 Op 9 . + CDS 84496 - 86562 1526 ## gi|260173892|ref|ZP_05760304.1| outer membrane protein SusF + Term 86682 - 86738 14.0 75 31 Tu 1 . + CDS 87056 - 87206 85 ## gi|299149093|ref|ZP_07042155.1| hypothetical protein HMPREF9010_03382 Predicted protein(s) >gi|225935341|gb|ACGA01000051.1| GENE 1 3 - 770 735 255 aa, chain + ## HITS:1 COG:Cj1444c KEGG:ns NR:ns ## COG: Cj1444c COG1596 # Protein_GI_number: 15792762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Campylobacter jejuni # 128 202 431 505 552 70 41.0 4e-12 LKEGFIVDGTPGFVLQPYDEVYVRRSPGYQAQQNVVVEGEILFGGSYAMTSREERLSDLI NKAGGATNYAYLRGAKLTRVANASEKKRMGDVIRLMSRQLGEAMMDSLGVRVEDHFTVGI DLEKALANPGSTADIVLREGDVISIPKNNNTVTINGAVMVPNTVSYIKGEDMDYYLNQAG GYSENAKKSKKFIVYMNGQVTKVKGSGKKQIEPGCEIIVPSKAKKKTNMGNILGYATTFS TLGMMVASIANLIKK >gi|225935341|gb|ACGA01000051.1| GENE 2 785 - 1909 674 374 aa, chain + ## HITS:1 COG:no KEGG:BT_1355 NR:ns ## KEGG: BT_1355 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 363 1 365 379 464 66.0 1e-129 MTDQRIDNKLSFESNKSKDEIIEIDLKEIVRKIISFRRTIYKAVGIGGIIGLIIAFSVPK 
QYTVRVTLSPEMGSSKGAGLSGLAASFLGNGVTASDGTDALNASLSADIVSSTPFLLELS TMKIPTLKDEPMTLNTYLDEESSPWWNYVIGFPGMIIGGVKSLFIEENKSFAFNKLNRGT IELSNKEFQKIEALKNMITASVDKKTSITSVTVTLQNPKVAAIVADSVVKKLQEYIIGYR TFKAKEDCAYLEKLFKERQQEYYAAQKTYANYIDSHDDIILQSVRAEQERLQNDMTLAYQ VYSQVANQLQVARAKVQEEKPVFAIVEPAVVPLKSSGTSKKVYVLVFMFLSVCIVLLGKL FGEFFLNKFKEIRA >gi|225935341|gb|ACGA01000051.1| GENE 3 1926 - 3332 574 468 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 67 457 70 454 464 231 36.0 2e-60 MASTNQINKVVELIAVLGDLIILNFSLFFLFLFWEKTFRYSPFSCTVSLMMTSLSLCYLA CSASRGKAWDSREIRADQLVLRVLKNIVTFSVFWACIMAFSGISIISPLFFVIYFFILFV VLSIYRIIIRHSLISYCTKGKHKRYAVFIGGGNNMQMLYEEMESSLASIYEVVGYFDIGP NDALSSQCSYLGNPDSFIDFMSEKKAIKHVFCSLSMDQGRYNVSVMNYCENHLLYFHGVP NVCKGFPRRIWHSMVGNMPILNLRYEPLSKMENRLLKRIFDIVLSGIFLVTIFPLVYLIV GIVIKMTSPGPVLFKQMRTGLNGVDFKCYKFRSMRVNEEADSKQATVNDPRKTRFGDFLR RSNIDELPQLINVFKGDMSIVGPRPHMLVHTETYAQLIDKYMVRHFIKPGVTGWAQTHGF RGETKELSQMEGRVKADIWYMEHWTIFLDLYIIYKTIVNVALGEKNAY >gi|225935341|gb|ACGA01000051.1| GENE 4 3554 - 4762 804 402 aa, chain + ## HITS:1 COG:FN1696 KEGG:ns NR:ns ## COG: FN1696 COG1086 # Protein_GI_number: 19705017 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Fusobacterium nucleatum # 23 266 265 507 607 110 31.0 6e-24 MFDLKKFIADHVVCRSTSIFAVDIEANKRVLQTEIEDKSVCVIGGAGSIGSSFIRAMLPF KPSKLIVIDLNENGLVELTRDLRSTYGMYIPTEYRTYTLNFDDPIFERIFCHEKGFDIVA NFSAHKHVRSEKDKYSVQALIENNDIKAKRFLDLLTIYPPKHFFCVSTDKAANPVNIMGA SKRIMEDMIMAYTSKFKVTTARFANVAFSNGSLPDGWLQRVTKKQPLAAPTDVKRYFVSP EESGQICMLACILGKNGEIFFPKLGEKQMLTFSSICDEYIKAIGYDKKDFVTDEEAKQFA SKMEFNDKEYPVVFFKSDTTGEKTYEEFYVPGEKINMDRFASLGVIEEVEKRPMSEIETF FDEMKTIFADPDFTKEEVVKAIKRFIPNFEHEEKGKNLDQKM >gi|225935341|gb|ACGA01000051.1| GENE 5 4771 - 5919 
650 382 aa, chain + ## HITS:1 COG:Cj1320 KEGG:ns NR:ns ## COG: Cj1320 COG0399 # Protein_GI_number: 15792643 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Campylobacter jejuni # 6 382 4 379 384 367 46.0 1e-101 MVNLAKVVDFIHELYGTKDFVPLSVPKFVGNEKKYLEDCIDTTFVSSVGKFVDRFEEEMA EYAGAKKAVVCVSGTNALHISLMLCGVERDDEVLTQALTFVATCNALSYIGAYPVFIDVD KSTMGLSPDSMKEWLRKNSVIRNGECYNKNTGRRVKACVPMHTFGHPVRIEELVQVCAEY HLELVEDAAESIGSKYKGQHTGTFGKVGAISFNGNKTITTGGGGMILFQDEELGKLAKHI TTQAKVPHRWKFVHDQIGYNYRMPNINAAIGCAQLECLDEFIKSKRKIAKEYADFFKKVD GIDFFEEPEGCFSNYWLNVVILKDKSAQLEFLEYTNDNGVMTRPIWELMTRLPMFKNCEN DGLRNTIWFADRVVNIPSSVRL >gi|225935341|gb|ACGA01000051.1| GENE 6 5974 - 6555 259 193 aa, chain + ## HITS:1 COG:Cj1123c KEGG:ns NR:ns ## COG: Cj1123c COG0110 # Protein_GI_number: 15792448 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Campylobacter jejuni # 46 185 50 188 195 78 32.0 6e-15 MIEAAESAGYTIKGILDISSRVGEKVLDYTIIGTDDDIFLYTEDCDFIVTLGFIKDASLR IHLHDKIEKAGGHLATVIASTAHISRYAKLGEGTVVLHQACINAGARVGKGCIINTFANI EHDTVIGDYCHISTGVMVNGDCKVGKNSFIGSHSVLINGIAICDKVIISADSFIRKNILK AGVYINNSTMSIL >gi|225935341|gb|ACGA01000051.1| GENE 7 6573 - 7631 511 352 aa, chain + ## HITS:1 COG:Cj1329_2 KEGG:ns NR:ns ## COG: Cj1329_2 COG1208 # Protein_GI_number: 15792652 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Campylobacter jejuni # 130 348 6 222 228 137 36.0 3e-32 MKKYSKKIEQRVIDINKSIIDSLLLMERISTKLLLVFKENSFKGVVSIGDIQRAVIKNIP MNSSIEVVMRNKFTYASISDEKIDIKNKMIEQRIECMPVLDENGELVDVILWEELFQTRA FKSVNINLPVVIMAGGKGTRLKPLTNIYPKPLIPIGEKTIVESIMDHFVSYGCHKFYFSV 
NYKADVIKNYFDFLANPDYELIYFQEDKPMGTAGSLRLLKDQISSTFFVSNCDIMIDEDY ASILDYHRQNKNELTVVAALKTYSIPYGTIVTGENGLLQTIEEKPNLTFKINTGLYILEP SLLNDIPDDFFHITHLIKKLKNEDRRVGVYPVSQKDWMDMGDWKEYLKMINQ >gi|225935341|gb|ACGA01000051.1| GENE 8 7639 - 8415 510 258 aa, chain + ## HITS:1 COG:STM2092 KEGG:ns NR:ns ## COG: STM2092 COG1208 # Protein_GI_number: 16765422 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Salmonella typhimurium LT2 # 1 258 1 256 257 297 55.0 2e-80 MKVVLLAGGFGTRISEESQFKPKPMIEIGGRPILWHIMKEYAYYGFTEFIICAGYRQQYI KEWFANYFLHNSDVTFDYTEGKKIMTIHESHCEPWKVTVVDTGLSTMTGGRIKRIQKYIG NESFFMTYGDAVCDVDIHKLLEFHKGHGKIASLTAVVLRQEKGVLDIGGDNAVKSFREKE ISDGASINAGYMVLNPEIFDYIENDSTVFERDPLVKLAEKGELMSYRHKGFWQCMDNKRE MEMLEKLLDLERAPWKKW >gi|225935341|gb|ACGA01000051.1| GENE 9 8422 - 8820 271 132 aa, chain + ## HITS:1 COG:no KEGG:BF1535 NR:ns ## KEGG: BF1535 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 132 6 133 133 194 70.0 1e-48 MLKDRIKIIPRRLIKDERGWFFKAITGTEENIPSHTGEVYLTMGTKGQAKGGHYHPRAVE WFTVIEGKAKLKLEDIETHEKMEIMMSLEEAILVFVPNNVAHIFENVSNDNFIVLAYTDL LYNPSDTITYQL >gi|225935341|gb|ACGA01000051.1| GENE 10 8817 - 9890 661 357 aa, chain + ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 1 356 1 354 359 347 45.0 3e-95 MIDIFNSFYKDKRILVTGHTGFKGSWLSIWLHELGAEIIGIAKEPLTNRDNYVLSKIGNK ISADIRADIRDGEKMKQIFVKYQPDIVFHLAAQPLVRLSYDIPVETYETNVMGTINIMEA IRATDSVKVGVMVTTDKCYENKEQCDGYKEDDPFGGYDPYSSSKGACEIAISSWRRSFFN PAQYDKHGKSIASVRAGNVIGGGDWALDRIIPDCIKALEEGTPIDIRSPKAVRPWEHVLE 
PLSGYMLLAQKMWNAPTEYCEGWNFGPETDSVATVWEVATEIIKNYGRGELRDSSNLGAL HEASLLMLDINKVKTRLGWTPRMDMNQCIKLVVDWYKNYMHEDVYELCVKEINTFLK >gi|225935341|gb|ACGA01000051.1| GENE 11 9893 - 10813 654 306 aa, chain + ## HITS:1 COG:no KEGG:BF1537 NR:ns ## KEGG: BF1537 # Name: not_defined # Def: putative dTDP-glucose 4,6-dehydratase # Organism: B.fragilis # Pathway: not_defined # 1 302 1 293 296 207 36.0 3e-52 MKIAIIGSNGMLSVALTKYFMDLKNEVDVYGLGTPIGYSCTTFTHVDLVKVSLDVIPLID SDVIIYAAGAGVQAALNTDAELMYQLNVQVPIDITLNLKKQSYNGVYISFGSYMEIGVNN EDGKAFTEEEIICSLLPVTNDYALSKRLYSRYMASFWGDFTYWHFILPNMFSYDDFKPGT RLIPYVLQYLQAYKQGLNPEEPRFSAGTQTREFILMEDVFNIISKSINSGLPSGVYNMGG GKFQSIRELIETLFDFFDVPCKDSYFGQEARRDGNIRCLRLNAMKLKNALNMLPSTTLVE ALQQKQ >gi|225935341|gb|ACGA01000051.1| GENE 12 10820 - 12343 469 507 aa, chain + ## HITS:1 COG:no KEGG:PRU_0439 NR:ns ## KEGG: PRU_0439 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 502 7 507 507 242 33.0 3e-62 MTPENKRITVNTIALYVKLFVSIAVGLFSSRIVLNALGVTNYGLYNVVGSIVVMINSAGI ALQSVSYRYLAVELGKKNNQDVAKVFSLSINIFILMAIFLLFIGGPSGVYYINHYMNLEN AVPADAYFVFYFSLLTAIISLIGIPYSSLIMVKEKFLFTSGIEIVRTFIQFSLILWMSYS DYNHLRLYCVIIFISNLIIFISNILYNLHFNKDDIRYKWVSDKKLYREVAGYSIWIILGA IACVAQVQVASNIINVYFGLALNAAFGIANQVYSYLMLFVRNLSQAAVPQIMKEAGTNAQ KSISIVYSISKYSFLLFSLIIVPIILNLQGILKIWLVEVPEYVYEFTILMIVNGMLWCLA NGFDAAIQASGKIRKNQIIYSTLTLIALPISAILFHFGYPPYLLMVMNVFMYVIILICQC FIMREISAFDFMSYWMFTIYPSLRIAMIVLPYGILHYLLLEENLNVLFATLISFAYTCFT VYSFGLSAEEKALILSVINNKFLNKHL >gi|225935341|gb|ACGA01000051.1| GENE 13 12346 - 13305 381 319 aa, chain + ## HITS:1 COG:no KEGG:PRU_0438 NR:ns ## KEGG: PRU_0438 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 279 1 279 322 288 54.0 3e-76 MVNTAVLFETFARPKYARQVFEQIKKAQPKKLYFYSNKARIGYPEEMKNNEEIRSWVKEI DWDCELYTFFRDEYVDVYVSTFGAIDWVFRNEEEAIVLEDDCVPSQAFFMYCEHFLNKYK NDKRICVISGDNYVEGLDYDGADHVITSSFFMFGWASWRDRWINADFNIDVKHIIENEHI FEKYFIGDTRKVKFWKSYYKNIYDFLNKTHCWDYMFSLNCIRQKTYVVAPIEHLVQNVGI 
VGTHAPGKENATNKSIDLEKKNYQVTNKIINVIPNKVYDELSFNILFNYPFSLKSRMKRL VRPVVRPIKGILKYFSCYK >gi|225935341|gb|ACGA01000051.1| GENE 14 13451 - 14773 680 440 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173839|ref|ZP_05760251.1| ## NR: gi|260173839|ref|ZP_05760251.1| glycosyl transferase, group 1 [Bacteroides sp. D2] # 1 440 1 440 440 905 100.0 0 MQRKKLKVALICHVSNITIREHISLDNWKLKNIIKRLLGLPISNYRDYSIWNTLLFQQFE GVNDIELHAVIPHPGLKGKRTDFVVNNIHYHCFRQTGGGRIWAKIMSVGHIKKDIFKRFR HNVSMIKKIVNEIHPDVINMIGAECPFYTLVGLEFDINKVPFILTLQTALSDPDFLKNYP MESRIYEENVQVEQALFKHCHYIAHDSYWYREVAAGYNPTAQFLRYTFFMPKYDIDTNVK KEFDFVYYAANISKAGADAVEAFGLAYKDNPNLTLNIIGDYSVSTYVQLMERVKELGIEK NVTFSGYLPTHVDALKQVIKSRYGLVPIKIDLISGTIIECIILGLPVVTFKTKGTPRIND NGEAVLLSDIGDYQDMANNMLRLVNDETLGRSMVKKAREFYTNIWELNQATHRLTDIYHS VYENFHNGSEIPAKLVESVY >gi|225935341|gb|ACGA01000051.1| GENE 15 14807 - 15760 507 317 aa, chain + ## HITS:1 COG:no KEGG:BF1544 NR:ns ## KEGG: BF1544 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 317 19 320 320 259 45.0 7e-68 MGKKMYDIIIPVATKDCDFLPRVVSYIRKYLTDAEFIYIITNEQNFSRFKKLLKSDNLVR LLNENNIINELSYDQIKQYLANKKVYEGQGWYFQQFLKMGFAQTEYAKEYYLSWDADTLP LSHIQYFDELGKPRFTMKREFHPAYFHTLKRLLGLEKIASFSFIAEHMMFNSTIMKELLN VINFNKDVEGGDWVEKCINACEFDLEHREKGPYFSEFETYGTFVWSKYPNFYSVQTLNTF RAAGYIRGRYINDYILSRLSMDLDMASFEIYDQPVFPYNIPFLWYKQKIRIIKLRDIFLA GSPHEILKILKNKIKKK >gi|225935341|gb|ACGA01000051.1| GENE 16 15846 - 16592 454 248 aa, chain + ## HITS:1 COG:HP0102 KEGG:ns NR:ns ## COG: HP0102 COG0463 # Protein_GI_number: 15644732 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori 26695 # 1 205 2 217 259 123 34.0 3e-28 MKVSVVTVTYNASATLEETMQSVFYQSYPYIEYIIIDGASADGTVNIISKYADKLSYWLS EQDKGIYDAMNKALEVATGDFLIFMGADDLFYTNDVIKNVVSQITDLDAVYYGSVLFKGR GTKHWGQFNKIKWAVTNVSHQAIFYPKVVYTTHSYDIQYRIYADYAYNLNLLAEKVCFIY VDEIITLYDMTGISANAHDEIFQRDFRGLVFSSVGRLAYYIGKCIRNLYFIKESLKSTYN KKRLENKE 
>gi|225935341|gb|ACGA01000051.1| GENE 17 16707 - 17552 309 281 aa, chain + ## HITS:1 COG:no KEGG:Geob_1464 NR:ns ## KEGG: Geob_1464 # Name: not_defined # Def: glycosyltransferase-like protein # Organism: Geobacter_FRC-32 # Pathway: not_defined # 3 258 8 271 288 135 34.0 1e-30 MKILYVIVLYKCKLEDSKSYQSLLQDRNEPIFVYDNSSIPQEVHGKNVIYVHDPQNRGLG VAYNTAAQYARYAHFDWLLLLDQDTTFPLGAIDVYIKSIMNYPAIKMFVPIHQISDGRYI SPTRYICKGSHPSSIIKSGILDFKEAAPINSGIMINVDAFYKVGCYDEEVMLDFSDIRFM EKFRKIYPKFYAIGEIVCLQDFSIEEQNLSKLMMRYKIFLYCALACKREHLLDYFSYFFV TIKRTVHLLWQTHSFSFLSVYVTNYLLGKSEIGNLHGNKKK >gi|225935341|gb|ACGA01000051.1| GENE 18 17529 - 17876 70 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEGNQIILLILIYLSHNNKLNKKNVVKYCCKKFTSWLNPYKRPQRKDNNEIHSIMHKIFK YLIFIFLPYKSNRHKRRKRNKFNFIPVKIFAASNMATVTNKFNTKFNILFFLITM >gi|225935341|gb|ACGA01000051.1| GENE 19 18340 - 18702 325 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173843|ref|ZP_05760255.1| ## NR: gi|260173843|ref|ZP_05760255.1| hypothetical protein BacD2_18377 [Bacteroides sp. 
D2] # 1 120 270 389 389 229 100.0 6e-59 METYYQLLSKSPQVYITGMGPGGMVYSKGFGSEVPLMEWTYLELLRFYGIGCLVILFVCF YPIYFLWKKKAVYPLGIPISIGGAIYMFASATNPYLINSTGMIVLIYMYTYMEHGRYDKN >gi|225935341|gb|ACGA01000051.1| GENE 20 18862 - 19572 228 236 aa, chain + ## HITS:1 COG:DRA0037 KEGG:ns NR:ns ## COG: DRA0037 COG0463 # Protein_GI_number: 15807707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Deinococcus radiodurans # 5 224 10 234 328 85 29.0 9e-17 MSISVCIATYNGGAFIKDQIYSILLQLSQNDEIIISDDGSRDSTLNILFSFNDSRIRIYK NDGKHGVVSNFENAIKHATGDYIFLCDQDDVWMPGKVKRVMEAFDNYDFVVHNAEMVDSN LISQGIDFFTLRKTRYGYWQNLWKMRYLGCTMAFKSDTLKFILPFPKNILWHDMWAAAIL HLKFRGILIDESLMWYRRHGNNASSSGEKSGWSWSFRIRYRWIIFYNSIFRIIKAK >gi|225935341|gb|ACGA01000051.1| GENE 21 19769 - 20578 241 269 aa, chain + ## HITS:1 COG:RSc0688 KEGG:ns NR:ns ## COG: RSc0688 COG1216 # Protein_GI_number: 17545407 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Ralstonia solanacearum # 2 267 4 269 275 201 36.0 9e-52 MITVSIVTYKTDVKELTKCLQSLKSPLVLKIYIVDNSNEKYIQDFCLNYANVEYISSENV GYGTGHNQVLRKVLSSSSIYHLVLNSDVYFDPKVLEQLVEYMNENLDVAQVQPNIIYPDG KMQYTCRLLPTPVNLIFRRFLPKKMISRMDERYLLKFNDHTQAIDIPYHQGSFMFFRTDC FNKVGLFDERFFMYLEDIDITRRMHKYYRTMFWPEVTIVHAHRAASYKNKKMLKIHIQNA IKYFNKWGWIFDCERKLWNRNLLKELHYK >gi|225935341|gb|ACGA01000051.1| GENE 22 21057 - 21236 81 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882621|ref|ZP_02063624.1| ## NR: gi|160882621|ref|ZP_02063624.1| hypothetical protein BACOVA_00574 [Bacteroides ovatus ATCC 8483] # 1 59 19 77 77 119 100.0 7e-26 MYFPDLTIRASVNKLRRWMRRCKPLMNEILSTDFHPKTKAFSVREVRLITYYLGKPGEL >gi|225935341|gb|ACGA01000051.1| GENE 23 21301 - 21744 134 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 55 139 2 97 116 66 
38.0 2e-11 MENSEESYLPREIKLLVIHCSATRCNVSFTVEQLRQCHLQRGFKDIGYHFYITRNGELHH CRPVSEPGAHVRGFNRHSIGICYEGGLDEEGRPADTRTQAQRFALLDLLTILKHQYPDAQ ILGHYQLSASIHKACPCFDSRKEYMDI >gi|225935341|gb|ACGA01000051.1| GENE 24 21930 - 22418 673 162 aa, chain - ## HITS:1 COG:no KEGG:BT_0403 NR:ns ## KEGG: BT_0403 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 163 165 245 79.0 4e-64 MAQNYTLMARKNLLKPNETPKFYAVARSGRKVTVKEVCKRITERSSYSKGELEGCIGEFL LEIVNVLDEGNIVQMGDLGNFRMSIKTGTPTDTAKEFKASCINKGKVLFYPGSDLRKLCK TLDYTLYKSDSSTDLDKDPLPDDGGDDNQGGSGSGEAPDPAA >gi|225935341|gb|ACGA01000051.1| GENE 25 22675 - 24999 1177 774 aa, chain - ## HITS:1 COG:no KEGG:BT_0404 NR:ns ## KEGG: BT_0404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 772 1 775 775 1095 69.0 0 MKVLEEIKVSVYENVYSKKPKVMSFLEVIIMCIHPVYATIINSIRRYHAEGDHAAAQKLK NQLPCFTPAGTFDGAHAIKHFLLPSHIVGLDYDHVANRLEVIQRCAADPHTVAAIESPTD GVKVFAYVEGIGGRHREGQQLVSRYYNQLLGLESDPACKDESRLCYFSYSPNGYIASLYQ AFVLEPPVKEAKFLSENEVLPPFPLQNNPPENVSEEEISQFISSYIFFHPLTAGQRHSNV FKLACEACRRHYPQKSILRELTAFFEHTDFRPEELTSVLSSGYKQVNEHAFTSTPANEPS FQKDIRTKRPYGTTENSDADDEAYWLGEEFRKGTPLFPRSLYNNLPDLLNDCIIEDGSER EQDVALLSDLTALSAALPQTFGIYNHKKYSTHIFSVILSPAASGKSIAQTGRYLLEEIHS EILATSESMMKNYQTVHSNWQSECQKQKKKGDACSEEPQRPPFKMLFIPATTSYTRMQMQ MQDNGPQGSIIFDTEAQTLSTANHLDCGNFDDMLRKAFEHENIESSYKANGLIPIYIRHP KLALLLTGTPGQIDGLLSSYENGLPSRTLIYTFREAPHWKEMGDDCISLEDSFKPIAHRV SELYNFCLAHPVLFHFNRLQWNRLNEIFSRMLSEVALEGNDDLQAVVKRYAFLVMRISMI QTRIRQFEATDLSPEIYCTDADFERSLQIVLCCYEHSRLLHSSMPSPSVRPLKNPDTIRN FVQELPDSFTTDEAIQIGAKYDFNHRKVTRLLKSLNGVKINKISHGSYTKMNEQ >gi|225935341|gb|ACGA01000051.1| GENE 26 25152 - 25511 264 119 aa, chain + ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 119 126 47.0 3e-28 MAKEKQEPYEFLSNLVLTLMSADRIFSNSFFISEFAVSPKTLGEIRRGEDMCIYQYVRVI 
RCMTKYLHLIIQMDMLLKKLRIVLFSHCDLVVATVPHRSCGTCQPTEWVAVMHWDGVKL >gi|225935341|gb|ACGA01000051.1| GENE 27 25650 - 25727 98 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFKAPFSFDGRIRRIEYFLSGIIGV >gi|225935341|gb|ACGA01000051.1| GENE 28 25766 - 26155 341 129 aa, chain + ## HITS:1 COG:PA0659_2 KEGG:ns NR:ns ## COG: PA0659_2 COG3152 # Protein_GI_number: 15595856 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 11 80 31 100 104 57 45.0 7e-09 MLGAASGSAGGSVFGLLIGLAAMIASIWFSLAQGVKRLHDLNKSGWLILLCCVPIVGWIF ALYMLFADGTVGPNPYGADPKNRMPYQAQPASVNVTVNVSREEVKVDKPVEDAPAPAEVP AEENKEKAE >gi|225935341|gb|ACGA01000051.1| GENE 29 26283 - 28541 1393 752 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173851|ref|ZP_05760263.1| ## NR: gi|260173851|ref|ZP_05760263.1| hypothetical protein BacD2_18417 [Bacteroides sp. D2] # 1 752 1 752 752 1479 100.0 0 MYMLSTVISQTGECNAQHATDPCFLNQWGYSTLLLIIIGIGLVYLLFKKRKFLVDNLKWI AGTVFIAGFLIYWYAFNEGGSDSNSIALAFRSALSSMEMFASHSDLLEVPEDLHHDPFYM TIFSVIHFLAVIVSAVFIIKLLGFRFISWVRLCAANLSRKKKCRLFIFWGVNDNAILLAN SIRKKAAEPKGKNEEGWENCKFIFVRLLSANESSSHGRFTFSHFFNSSHDGTEKFIEKIE ELDGILVNSKFGITGRVIDKVKSEFDLYKYLGLRRLGNLIKKYPQATFYFLSPNEELNLE AVSVFKEIAKCKEDRVHNQVQIYCHARKNNQNQKLEICDGLKHQIHIIDSSNLAVLQLKK NVRNHPVNFVDVDTSKACVKKPFTSMIIGFGETGRDAFRFLYEFGALIDVNGNRNPQKIY VVDEHIDELKGDFLMKAPALKERKNELEWCEEMSIHSERFWEKFSEIIHDLNYVVIAIGN DNEGMALAIDLYEYAYRYRKDCFNDFRIYLRVNGSCNTIQLKQIKEYFNIYGNTRDVIIT FGAQEEIFSYDVVSTDVLEVLAKEFYYAYQKIMIDAMPETNEKEIEEKKKAKESLKQTAE EEWNARREALRDEHCLDAQIKLTYQEEQDRANVWHIDTKKFLAGAMGEDGKDNKERLKEM VELTQRDAHTLNYSKVCDVVSSTLFDNLSKCEHLRWNACMELQGFVTCDGDKDFQQKKHK CIVDNDILRSKYPETIPYDQCVVELSFRLKKN >gi|225935341|gb|ACGA01000051.1| GENE 30 28546 - 29118 502 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173852|ref|ZP_05760264.1| ## NR: gi|260173852|ref|ZP_05760264.1| hypothetical protein BacD2_18422 [Bacteroides sp. 
D2] # 1 190 1 190 190 320 100.0 3e-86 MTDQEIVDGLINRDEKITDWFFNIKYRPLFINVIKLIFDYQVDYDECISELYYHLMKNDA AVLRHFEGRSTIGTWIKIVAIRFFREQKKREQMIEDESKEPLYEQNHEEEIDDSESKIAA KIDLERLFDLMSNKRYVMVIRELVLKEVEPEFLALSMGITVANLYNIKKRALAALAHLAM NDKKKYENKR >gi|225935341|gb|ACGA01000051.1| GENE 31 29123 - 30190 1034 355 aa, chain + ## HITS:1 COG:no KEGG:PRU_0332 NR:ns ## KEGG: PRU_0332 # Name: not_defined # Def: RyR domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 2 306 22 352 354 209 40.0 1e-52 MITEELLAAFEEGKTNAEETALVLEYLATDESLQEEFILSQQLDAMMGADDEETDFLPMA QMAAKSEGNLCDFQCEQFILKRRKIEYNSDELSEEARNNSWLRERGTPLHSVGRLLEQRG LIVMRSYGSSIDSVIRALKAGHDAIVVVNSCRLPGNSEEEIAYHAAVVLDVNEEEVTLYD PATGEESTAYPKDHFIAAWNDAKAYLARVKVPDLDYNPRPIDLEDVELSTDLIELREAIA ENAHEVWADQRQEEGWTYGPQRDDEKKETPDMVPYSMLPYSEKEYDRRMAFDTIKLMKKL GYSIIKRGDTALHNELMRKLKNEGDAKVCECGASIFMDQIYCSHCGKKIDWKLFR >gi|225935341|gb|ACGA01000051.1| GENE 32 30213 - 30506 308 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2247 NR:ns ## KEGG: BT_2247 # Name: not_defined # Def: putative ryanodine receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 96 7 100 100 112 58.0 6e-24 MKEYIPNPVDTGHIQLPKELEYLVEEMAKNVHEVWSKTRIEQGWTYGKKRDDVLKQHPCL VPYEELPEEEKVYDRNSSVETLRLIMKLGFKISKDEE >gi|225935341|gb|ACGA01000051.1| GENE 33 30481 - 33762 2689 1093 aa, chain - ## HITS:1 COG:slr1968_2 KEGG:ns NR:ns ## COG: slr1968_2 COG4995 # Protein_GI_number: 16330786 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 701 1079 3 393 426 85 25.0 6e-16 MRQFILSLLACLCLTAYSQTASQADLLVEEAQKLESKQDYPTAITKLKQADELYVKTGKT QSAERATCLHILGRCYLNLERPEGLTYTQMAADMRKTILGETNIKYISSLNNTGLYYLTV AKDYPKAAEIHSKTWELCSRIQPRPEQAFMFHINLARCYIALGEMDKASAIVEEEVAISK KMYGEKSLSVARQLQQIGSLYYLSGRKDVGVTYYEQAFNIFPDDSKEYEQLLDWISSIYL ELNNQPKVLEYMKLAEAHNKKELEKPCDEPVCLTERAQYFASIGDNDKARAHFLEALKKC DVNTHKEIVFKVRHNYAQFLSGIQDHASAGEYYELAADILRSNPAEQTKFALESYLGGLN YSIAAKYEASNRMLNQALSVYAQMLPDRLDKYVDASIALCRNYGFTKEYGKALEILAKAE 
KQLPTGDKSDKMGEILRSRGSILYRQKQYAEAAANYQQAADIYKVLPGSDVKYQDALSSL NRCHTMMGNETAARQTEQDAERQRMAVLNRLLKENLEQLDAYRLQWGEDGLMYVSALGTI ADIYYTQGETDKALAYMEPFLSGETTALRNLFRLSKADERLAFWKDIRSSLDSIPLSAAN ITATGTPEQKQRFARLGYDALLFSKGIMLNSSIELESLIRASGDKSLLDQYNKATLMAEQ ILSMQSELPNAANQTEARKNIIRQKEEYEQLQLDLMRKSTDFGDYTRYLSVKWQDVQKHL HGNSIAIEFALIDDELLAPDKHLAAFVLRPGDVSPTAIKLMSQELLSKEMQSPTAFTATE NRAHFWKALDEYISKADTIYFSPDGILHQLPVEYLPYGAGNLPLAFQKAVYRLSSTKEIA LDRASQNYSSAALFGGLDYEMASTKVRNIVSTDHNSGKFRNGQNGYHELPYTLNEVNNVN SLLKEKKIKTNLFVGENGTEKQFRALDGKAVSLIHIATHGDYKELKQREADDAMKHCFLI MSGANATEENNDGLLRADEISTLNLRGCRMVVLSACNTGQGTLGADGLFGLQRGFKNAGV QTILMTLSAVDDQASMLLMTKFYENIISGLSERQALIKAQQYLRENGYADSKYWAPFILL DAGRSPIPHPSRS >gi|225935341|gb|ACGA01000051.1| GENE 34 33783 - 34451 400 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173856|ref|ZP_05760268.1| ## NR: gi|260173856|ref|ZP_05760268.1| hypothetical protein BacD2_18442 [Bacteroides sp. D2] # 1 222 1 222 222 432 100.0 1e-120 MSRKTIFRVFILLFFSIGGQMSANAQTTTPFETVFDACKKACEALDGGFASGEQLLAVSK TLRDAAPIPLTVKQTQGDVLSLKGHLVFDYEFIQACVDNETIYEIADKYAAEARMRGDKD SANKVRLDTKMVAAGQTCTFEIPSCSGTSQIGCVAEVNRSFSWKIKTIGYQSKSEREYKC NDSVRKGLPFRKEKVSSDERYKIIVSITNKSKRDGSFALILF >gi|225935341|gb|ACGA01000051.1| GENE 35 34473 - 36089 924 538 aa, chain - ## HITS:1 COG:MA4278 KEGG:ns NR:ns ## COG: MA4278 COG1262 # Protein_GI_number: 20093067 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 321 533 47 267 270 120 33.0 5e-27 MSKHNYDIFISYRKRCSGDKPEMLQLMLEESGFRKRVSFDKDNLNGRFDVELIRRIDECK DFIMFMVPETFTTIRPLNEEAVETGEKATWDMEEVAFYERMASLTYEEFETEIKQISRTG EIDFVRIELGRALHRRSRSPKQINIIPIAPQESESYDFATLQLPPDISGLKDFQAVFYSN SRVARFKDIKGDLLKQMLSKPSYVSAKWLVMTFIALLLIVAGSKTYTSIQRTAEQKLEFK DCRTYDDYSSFIKKNPDSSLKSTCDSILHEFNALRNDGRASVNNTGNRDIKDREKEWVDV KWNPTITLPQLRSLVDMMNNMLLIPAKNKEFIMGKTMGKGYDSPQHTVVLSSDYYMCKYE VTRSLWYAIMNDSIVTEEGMLPMTHITWNDAEAFTKKLNKLTGLPFSLPTEAQWEYAAAG GESYPYAGSDNIRDVAYYASNANERLHPVGEKRENGFDLYDMSGNAAEWCTDWMSRYENT 
RVTDPQGPAENPGHHKKIVRGGSYLANERDMDIRHRSVQTYDTSEPHIGFRVVLNPIQ >gi|225935341|gb|ACGA01000051.1| GENE 36 36185 - 36559 406 124 aa, chain + ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 1 124 12 136 137 133 56.0 6e-32 MKKVICSEKAPGAIGPYSQAIEANGMVFVSGQLPIDAATGQMAEGIEGQARQSLENIKHI LEEAGLTMGNIAKTTVFLQDMSLFAGMNGVYATYFDGAFPARSAVAVKALPKDALVEIEC IAVR >gi|225935341|gb|ACGA01000051.1| GENE 37 36527 - 38017 963 496 aa, chain - ## HITS:1 COG:TVN0757 KEGG:ns NR:ns ## COG: TVN0757 COG0285 # Protein_GI_number: 13541588 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Thermoplasma volcanium # 26 494 17 419 428 200 30.0 6e-51 MDYQNTLKYLYESAPMFQQIGGKAYKPGLETTHQLDEHFGHPHQQFKTIHIAGTNGKGSC SHTIAAVLQCAGYRVGLFTSPHLIDFRERIRINGEMIPEEYVVNFVEEHRSFFEPLHPSF FELTTAMAFRYFADQKVDVAVIEVGMGGRLDCTNIIHPDLCIITNIGLDHTQYLGDTLTK IAKEKAGIIKEGVPVVIGRAQGAVKRVFTMKAKEKNAPIEYARENARYWDMEIVPYSKLQ EIRPMMDNTIQSMHEMIEAMDEQSEEEANQMRQALLMLDLSDSLHTLDQILDKRKDAIKI HNEMFPFGLFTELSGAYQFENMSTILKALATLTRLNYNIRSQDYRAGLANVCQLTGLMGR WQKVHSYPDIICDTGHNVDGIEYIHVQLNAIHKTFGQEIHFVFGMVNDKDIRGVLRALPK YATYYFTKASVKRALPENELLALAEEAGLKGTTYPTVVEAVQAAKKNCPPKDLIFVGGSS FIVADLLANRDTLNLY >gi|225935341|gb|ACGA01000051.1| GENE 38 38149 - 39474 1465 441 aa, chain + ## HITS:1 COG:BH2629 KEGG:ns NR:ns ## COG: BH2629 COG1875 # Protein_GI_number: 15615192 # Func_class: T Signal transduction mechanisms # Function: Predicted ATPase related to phosphate starvation-inducible protein PhoH # Organism: Bacillus halodurans # 4 441 2 442 442 286 41.0 7e-77 MGTKKNFVLDTNVILHDYNCLKNFQENDIYLPLVVLEELDKFKKGNEQINFNAREFVREL DVLTSDELFSDGVKLGEGLGRLFVVTSNVPAAKVWESFPIKKPDHLILAATEYLTDKYPK MKSILVTKDVNLRMKARSIGLLCEDYITDKVVNVDVFEKSNEIFENVDPALIDRIYSSKE GIDLSEFDFKDLIHPNECFVLKSDRNSVLARYNPFTHSIIRVMKGKNYGIEPRNAEQSFA 
FEILNDPNIKLVALTGKAGTGKTLLALAAALGKLTDYKQILLARPVVALSNKDIGFLPGD AQEKVAPYMQPLFDNLNVIKRQFATNSTEVKRIEDMQKSEQLVIEALAFIRGRSLSEMYC IIDEAQNLTPNEIKTIITRAGEGTKMVFTGDIQQIDQPYLDSQSNGLVYMIDRMKDQNIF AHVNLLKGERSELSELASNLL >gi|225935341|gb|ACGA01000051.1| GENE 39 39594 - 40571 1098 325 aa, chain - ## HITS:1 COG:alr1912 KEGG:ns NR:ns ## COG: alr1912 COG0167 # Protein_GI_number: 17229404 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Nostoc sp. PCC 7120 # 3 319 2 323 343 209 35.0 7e-54 MTDLKTTFAGLSLRNPIIISSSGLTNSAGKNKKLAEDGAGAIVLKSLFEEQIMLEADQLK DPAFYPEASDYLEEYIREHKLSEYLTLIKESKKVCPIPIIASINCYTDSEWIDFAKMIEE AGADALEINILALQSEVQYTYGSFEQRHIDILRHIKKTIKIPVIMKLGDNLTNPVALIDQ LYANGAAAVVLFNRFYQPDINIEKMEHISGEIFSNASDLAIPLRWIGIASAVVDKIDYAA SGGVANAESVVKAILAGASAVEVCSAVYLNTNAFIGEANRFLSAWMERKGFKNIAQFKGK LNIKDVQGVNTFERTQFLKYFGKKE >gi|225935341|gb|ACGA01000051.1| GENE 40 40595 - 41263 480 222 aa, chain - ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 222 1 218 221 162 41.0 4e-40 MSIADNLKQVLAELPQGVRLVAVSKFHPNEAIEEAYQAGQRIFGESKVQEMTAKYESLPK DIEWHFIGHLQTNKIKYMIPYVAMIHGIDSYKLLAEVNKQAVKAGRTVNCLLQIHVAQEE TKFGFSPEECKEMLNAGEWKELTHVRICGLMGMASNTDCIEQINREFGLLNRLFNEIKTT WFIHSDTFCELSMGMSHDYHEAIAAGSTLVRVGSKIFGERIY >gi|225935341|gb|ACGA01000051.1| GENE 41 41270 - 41749 661 159 aa, chain - ## HITS:1 COG:no KEGG:BT_1331 NR:ns ## KEGG: BT_1331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 293 98.0 1e-78 MAMHTWFECKIRYEKVMENGMQKKVTEPYLVDALSFTEAEARIIEEMTPFISGEFTVSDI KRANYSELFPSDEESADRWFKCKLIFITLDEKSGAEKKTSTQVLVQAADLRDAVKKLDEG MKGTMADYQIGMVSETPLMDVYPYSAEPNDKPEFDPSKA >gi|225935341|gb|ACGA01000051.1| GENE 42 41833 - 42396 618 187 aa, chain - ## HITS:1 COG:RSp0426 KEGG:ns NR:ns ## COG: RSp0426 COG3247 # 
Protein_GI_number: 17548647 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 13 170 7 163 186 63 28.0 2e-10 METVFNEIQHSVKNWWTSLLLGIVYIIVALWLMFSPVSTYVALSIIFSVSMLISGILEII FALSNRKGVPSWGWYIVGGLIDLVLGIYLIAYPMVSMEVIPLIIAFWLMFRGFSSTGYSI DLKRYGTRDWGWYMAFGILAILCSLLILWQPAIGALYAVYMISFAFLIIGLFRVMLSFEL KNLHKRK >gi|225935341|gb|ACGA01000051.1| GENE 43 42534 - 43034 613 166 aa, chain + ## HITS:1 COG:Cgl1062 KEGG:ns NR:ns ## COG: Cgl1062 COG2077 # Protein_GI_number: 19552312 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Corynebacterium glutamicum # 1 165 4 167 168 186 58.0 1e-47 MATTNFKGQPVKLIGEFIQVGKVAPDFELVKTDLSSFSLKDLNGKNVILNIFPSLDTSVC ATSVRKFNKMAAGLKDTVVLAISKDLPFAHGRFCTTEGIENVIPLSDFRFSDFDESYGVR MADGPLAGLLARAVVVIGKDGKIAYTELVPEITQEPDYDKALAAVK >gi|225935341|gb|ACGA01000051.1| GENE 44 43233 - 44117 918 294 aa, chain + ## HITS:1 COG:no KEGG:BT_1328 NR:ns ## KEGG: BT_1328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 294 294 517 87.0 1e-145 MNKMKTLFIALLCMGVGTLSAQTADSTQISPWTKEGFAGLKLTQVSLTNWAAGGDNSVAF DLQGTYQINYKKGKHLWNNRIELAYGLNKTGDDGTRKANDKIYLNTNYGYAIAKSWYASA FATFQTQFSPGYDYSVNKDIAISEFMAPAYLTTGLGFTYDPGKIFTVVLSPAAWRGTFVL NDRLSDEGAYGVDPGKHLLSSFGANLKGEAKYEFLKNMTVYSRLDLYSDYLHKPLNIDVN WEVQVNMIINKWFSTTLTTNLMYDDDVKIVQKDGTKGARVQFKEILGVGVQFNF >gi|225935341|gb|ACGA01000051.1| GENE 45 44288 - 44932 312 214 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 1 214 1 198 199 124 35 1e-27 MESVAELLKWVLENLNYWVVTIFMAIESSFIPFPSEAVVPPAAWKAMADDSMNIFLVVLF ATIGADIGALVNYYLARWLGRPIIYKFANSRLGHMCLIDEEKIHHAEEYFRKHGAASTFF GRLIPAVRQLISIPAGLAGMKIGPFLLYTTLGAAIWNSILALLGYLIYRFTDLKTTNDVY VMATEYSHEIGYVIIAVVVIVIVFLAYKGLKKRK >gi|225935341|gb|ACGA01000051.1| GENE 46 44920 - 46413 1643 497 aa, chain - ## HITS:1 COG:MA1362 KEGG:ns NR:ns ## COG: 
MA1362 COG0457 # Protein_GI_number: 20090223 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 176 414 165 395 400 73 27.0 6e-13 MGRKNSPSAKKELVTLITNYEEAKAENRQLYLDADQLADIADWYASERKFEEAQEVITYG LKIHPGNTDLLIEQAYLYLDTQKLQKAKKVADSITEEFDSEVKLLKAELLLNGGKLEEAQ WLLSTIADADELETIIDVVFLYLDMGYPDAAKEWLDRGKSRYAEDEEYMALTADYLASTH QVESAITYYNKLIDKSPFNPSYWMGLVKCYFVQEQIDKAIEACDFALAADDQYGEAYAYK AHCFFYLNNSDDAIENYQKAIELKAIPPELGYMFMGISYGNKEEWQKADDYYDKVIERFE EDGDKQSILLIDTYTSKAFALSHLERYEEAHELCEKAKEINPNEGLIYLTEGKLYLAEEL EDEAAISFEKAIEINPNIEMWYMIASAYSESDYLIEAKEYFEKVYQINPKYEDVTEKLSV LCLMHGEIDNFFKYNKECEHPLEEDMILDLLNSPEHREEDERTLKEVWERMKKENKKKTK GKNKSIYKFYPQKTWNQ >gi|225935341|gb|ACGA01000051.1| GENE 47 46516 - 48255 1821 579 aa, chain - ## HITS:1 COG:RSc0791 KEGG:ns NR:ns ## COG: RSc0791 COG0008 # Protein_GI_number: 17545510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Ralstonia solanacearum # 16 576 16 580 580 582 50.0 1e-165 MTDIKNEEAGEKKSLNFIEQAVEKDLKEGKNGGKVQTRFPPEPNGYLHIGHAKAICLDFG IAEKHGGVCNLRFDDTNPTKEDVEYVEAIKEDIQWLGYQWGNEYYASDYFQQLWDFAIRL IQEGKAYIDEQSSELIAQQKGTPTQAGVESPYRNRPIEESLELFKKMNSGEIEEGAMVLR AKIDMANPNMHFRDPIIYRVVKHPHHRTGTTWKAYPMYDFAHGQSDFFEGVTHSLCTLEF VVHRPLYDLFIDWLKEGKDLNDNRPRQTEFNKLNLSYTLMSKRNLLTLVKEGLVNGWDDP RMPTICGFRRRGYSPESIHKFIDKIGYTTYDALNDIALLESSVRDDLNSRATRISAVINP VKLIITNYPEGQVEELEAINNPEDPEAGSHLIEFSRELWMEREDFMEDAPKKYFRMTPGQ EVRLKNAYIVKCIGCKKDENGVITEVYCEYDANTRSGMPDANRKVKGTLHWVSCNHCLQA EVRLYDRLWKVENPRDELAAIREAKKCEALEAMKEIINPDSLKVLPNCYIEKFAATLPPL SYLQFQRIGYFNIDKESTPDKLIFNRTVGLKDTWGKINK >gi|225935341|gb|ACGA01000051.1| GENE 48 48304 - 49116 811 270 aa, chain - ## HITS:1 COG:MA0887 KEGG:ns NR:ns ## COG: MA0887 COG0226 # Protein_GI_number: 20089771 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 23 269 
70 315 317 191 46.0 2e-48 MKVRRNLLIALSLLSLGANAQRIKGSDTVLPVAQQTAERFMNQHPDARVTVTGGGTGVGI SALMDNTTDIAMASRPIKFSEKMKIKAAGEEVNEVIVAYDALAVVVHPSNPVKQLTRQQL EDIFRGKITNWKQVGGDNRKIVVYSRETSSGTYEFFKESVLKNKNYMASSLSMPATGAII QSVSQTKGAIGYVGLAYVSPRVKTLSVSYDGSHYATPTVENATNKTYPIVRPLYYYYNVK NKEQVSPLIQFILSSDGQDIIKKSGYIPVK >gi|225935341|gb|ACGA01000051.1| GENE 49 49294 - 50490 1166 398 aa, chain + ## HITS:1 COG:MA0888 KEGG:ns NR:ns ## COG: MA0888 COG0573 # Protein_GI_number: 20089772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 149 392 45 290 296 264 55.0 3e-70 MKKVFERIIEGMLTCSGFVTSITILLIVLFLFTEAFGLFKSKVIEEGYVLALNKSNKVSV LSPAQIKNVFDEEITNWKELGGEDLPIRVFRLEDITQYYTEEELGPAYEYVGDKITELVE KTPGIVAFVPQKFIVHPDAVHFIEDNTISVKDVFAGAEWFPTATPAAQFGFLPLITGTLW VSLFAILFALPFGLSVSIYMSEVANPKVRNWLKPIIELLSGIPSVVYGFFGLIVIVPLIQ KLFNLPVGESGLAGSIVLAIMALPTIITVTEDAMRNCPRAMREASLALGASQWQTIYKVV IPYSISGITSGVVLGIGRAIGETMAVLMVTGNAAVIPTTILEPLRTIPATIAAELGEAPA GGPHYQALFLLGVVLFFITLIINFSVEYISSKGLKRSK >gi|225935341|gb|ACGA01000051.1| GENE 50 50492 - 51367 759 291 aa, chain + ## HITS:1 COG:MA0889 KEGG:ns NR:ns ## COG: MA0889 COG0581 # Protein_GI_number: 20089773 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 13 289 30 306 307 270 51.0 3e-72 MEIRSNNKAKHRSQKFAFGIFRLLSLCIVLILFAILGFIIYKGIGAISWDFITSAPTDGM TGGGIWPAIVGTFYLMVGSALFAFPIGVMSGIYMNEYAPKGRLVRFIRVMTNNLSGIPSI VFGLFGMALFVNYMGFGDSILAGSLTLGLLCVPLVIRTTEEALKAIPDSMREGSRALGAT KLQTIWHVILPMGMPNIITGLILALGRVSGETAPILFTCAAYFLPQLPTGILDQCMALPY HLYVISTSGTDMEAQLPLAYGTALVLIMIILLVNLLANALRKYFEKRVKTN >gi|225935341|gb|ACGA01000051.1| GENE 51 51376 - 52134 217 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 241 7 237 318 88 26 1e-16 
MDTVKIDARDVNFWYGDFHALKGISMQIEEKSVVAFIGPSGCGKSTFLRLFNRMNDLIPA TRLEGEIRIDGHNIYAKGVEVDELRKNVGMVFQRPNPFPKSIFENVAYGLRVNGIKDNAF IRQRVEETLKGAALWDEVKDKLKESAYALSGGQQQRLCIARAMAVSPSVLLMDEPASALD PISTAKVEELIHELKKDYTIVIVTHNMQQAARVSDKTAFFYMGEMVEFDQTKKIFTNPEK EATQNYITGRFG >gi|225935341|gb|ACGA01000051.1| GENE 52 52190 - 52879 874 229 aa, chain + ## HITS:1 COG:RSc1533 KEGG:ns NR:ns ## COG: RSc1533 COG0704 # Protein_GI_number: 17546252 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Ralstonia solanacearum # 6 221 11 225 235 101 33.0 1e-21 MVKFIESELILLKKEIDEMWTLVYNQLDRAGEAVLTLDKELAQQVMVRERRVNAFELKID SDVEDIIALYNPVAIDLRFVLAMLKINTNLERLGDFAEGIARFVLRCKEPVLDAELLSRL RLAEMQAEVLSMLELAKRALNEESNDLAAGVFAKDNLLDEINADATGILSDYIIEHPEAV HTCVDLVSVFRKLERSGDHITNIAEEIVFFIDAKVLKHRGKTDENYPEK >gi|225935341|gb|ACGA01000051.1| GENE 53 53255 - 53857 702 200 aa, chain + ## HITS:1 COG:L0164 KEGG:ns NR:ns ## COG: L0164 COG0307 # Protein_GI_number: 15672976 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Lactococcus lactis # 1 196 1 192 216 150 40.0 1e-36 MFSGIVEEYATLVALVKDQENIHFTFKCSFVNELKIDQSISHNGVCLTVVTLTDDTYTVT AMKETLERSNLGLLKVGDKVNVERSMMMNGRLDGHIVQGHVDQTATCIDIKDAEGSWYFT FRYAFDKEMAKRGYITVDKGSVTVNGVSLTVCNPTDDTFQVAIIPYTYEHTNFHTFEIGS VVNIEFDIIGKYISRMIQYK >gi|225935341|gb|ACGA01000051.1| GENE 54 53872 - 54435 601 187 aa, chain + ## HITS:1 COG:CAC2311 KEGG:ns NR:ns ## COG: CAC2311 COG0778 # Protein_GI_number: 15895578 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 11 152 2 139 187 76 32.0 2e-14 MTGGLEMAYTDFLQLVQARQSDRSYDKERPVEPEKLERVLEAARLAPSACNAQPWRFVVI TDKELAQKAGKAAAGLGMNKFAKDAPVHILVVEESANITSLLGGKVKGKHFPLIDIGIAA AHISLAAEAEGLGSCILGWFDEKELKQLAGIPASKRLLLDIVIGYPAKEKRKKIRKPKEK VISYNRY >gi|225935341|gb|ACGA01000051.1| GENE 55 54649 - 57933 2664 1094 aa, chain - ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # 
Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 722 1078 21 379 394 290 42.0 1e-77 MKKLLLSAFVICLAGTSIAQESPLWMRHCALSPDGTTIAFTYKGDIYTVPSTGGRATQIT TNPAFDTEPVWSPDSKQIAFASDRMGSLDVFVVSSEGGAPQRLTTHSGSEKPVAYKDNKH ILFTANIAPSAEDAGFPSGQFQQVYEVAVTGGRPAMFSSMPMECISINKEGVMLYQDKKG YEDYWRKHQVSPIARDIWTYTPGKKPVYQKQTTFGGEDREPVWSPDGKSFYYLSEEKGSF NIFQRTPGTTTSQQITFHTKHPVRFLSISTTGTLCYGYDGEIYTLTPGKQPQKVNISILA DKNDKDIIRQIKSNGATDIAVSPKGKEVAFIMRGDVYVTSIDYKTTKQITNTPDQERNIS FAPDGRTLVYSSERDGLWQLYTSTIVRKEEKQFTYATELKEERLTNSKVASFQPQFSPDG KEVAFLENRTAIRVINLKSKAVRTVMDAKYQYSYADGDQWFQWSPDSQWILSDFIGIGGW NNKDVVLLKADGKGEMVNLTESGYSDSNAKWVLGGKAMIWNSDRAGYRSHGSWGSEDDTY IMFFDVDAYNRFLMSKEDIALLEEAEKAEKAEKEKAKKEKAEKKEDTKDSKKKTNQNENA KKDSTEVKPLTFDLENRFDRIVRLTVNSSRLGDAVMSPKGDILYYLAAFEGDYDLWEHKL KENTTKILLKGVGGGSLIPDKEGKNIFMCTGGRLKKIEIAGSKITPIEFEAFFDYRPYDE RAYIFDHVCQQVNDKFYIADLHGVDWKGYKKAYERFLPHISNNYDFTEMLSELLGELNGS HTGARFAAGGSAMPTATLGVFYDESYDGAGLKIKEIMKQSPFTQKKTEVKAGCIIEKVDG TAIEAGADYFPLFEGKVGRKVVLTVYDPSTKKRFEEAVKAISYGTQSDLLYKRWVERCAK KVEELSGGRIAYVHIKGMDSPSFRKIYSELLGRYRNKEAVVVDTRHNGGGWLHDDVVTLL SGKEYQRFVPRGQYIGSDPFNKWLKASCMLTCEDNYSNAHGTPYVYKTLGIGKLVGAPVA GTMTAVWWERQIDPSIVFGIPQVGCMDMQGNYLENQTLQPDILVYNEPAASLKGEDAQLK AAVDYLLKDLSKKK >gi|225935341|gb|ACGA01000051.1| GENE 56 57985 - 59304 1101 439 aa, chain - ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 55 341 19 300 396 163 34.0 6e-40 MKKKITDIWKFLTYDIWRITEDEVTRTKFSLYNIIKTVYLCINRFTKDRMANKASALTYS TLLAIVPILAILFAVARGFGFDNLMEHQFRNGFGGNTETTEAILSFVNSYLSQTKGGIFI GVGLVMLLWTVINLVSNIEITFNRIWEVKKARSMYRKITDYFSMFLLMPILIVVSGGLSL FMSTILKQMDDFVLLAPVMKFMIRLIPFVLTWLMFTGLYIFMPNTKVKFKHALIAGILAG SAYQAFQFLYINSQLWVSKYNAIYGSFAALPLFLLWLQISWTICLFGAELTYAGQNIRSF SFDQDTRNISRRYRDFISILIMSLIAKRFEKNEPPYTAAEISEEHQIPIRLTNQVLYQLQ 
EIELIHEVVTDEKSEEIGYQPSMDINQLNVAVLLDRLDTYGSENFKIDKDEEFNDEWKVL TESREEYYKKASKVLLKDL >gi|225935341|gb|ACGA01000051.1| GENE 57 59901 - 60881 583 326 aa, chain + ## HITS:1 COG:BH3007 KEGG:ns NR:ns ## COG: BH3007 COG0791 # Protein_GI_number: 15615569 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus halodurans # 53 276 84 306 336 96 32.0 8e-20 MKKNILLFYCFLAMMAASLKAQEIRPMPADSAYGVVHISVCNLREEGKFTSGMSTQALLG MPVKVLQYNGWYEIQTPDDYIGWVHRMVITPMSKERYDEWNRAEKIVVTSHYGFAYEKPD ESSQPVSDVVAGNRLKWEGSKGHFYQVSYPDGRKAYLSKSISQPEAEWRASLKQDVESII ETAYSMMGIPYLWAGTSSKGVDCSGLVRTVLFMHDIIIPRDASQQAYVGEHIDIAPDFSN VKRGDLVFFGRKATAERKEGISHVGIYLGNKQFIHALGDVHVSSMNPADQNYDEFNTKRL LFAVRFLPYINKEKGMNTTNKNPFYQ >gi|225935341|gb|ACGA01000051.1| GENE 58 60886 - 62037 1058 383 aa, chain + ## HITS:1 COG:all3532 KEGG:ns NR:ns ## COG: all3532 COG4948 # Protein_GI_number: 17231024 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Nostoc sp. 
PCC 7120 # 46 380 1 343 350 207 35.0 4e-53 MQNRRDFLKTAALAAFSSGLVARQALAGESLLSTIHINKLGLGGKMKMTFFPYELKLRHV FTVATYSRITTPDVQVEIEYEGVTGYGEASMPPYLGETVESVMNFLGKVNLEQFSDPFQL DDILSYVDSLSPKDTAAKAAVDIALHDLVGKLLGAPWYKIWGLNKEKTPSTTFTIGIDAP DVVRAKTKECADQFNILKVKLGRDNDKEMIETIRSVTDLPIAIDANQGWADRQYALDMIH WLKEKGVVMIEQPMPKEKLEDIAWITQQSPLPIFADESLQRLGDVAALKGAFTGINIKLM KCTGMREAWKMVTLAHALGMRVMVGCMTETSCAISAASQFSPLVDFADLDGNLLISNDRF KGVEVVKGKITLNDLPGIGVMKI >gi|225935341|gb|ACGA01000051.1| GENE 59 62257 - 63792 1890 511 aa, chain + ## HITS:1 COG:YPO3566 KEGG:ns NR:ns ## COG: YPO3566 COG0265 # Protein_GI_number: 16123710 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Yersinia pestis # 78 481 56 436 457 243 38.0 9e-64 MKQTTKNILGVGAIILLSSGVAGLTTYKLLQSNESAKETSFNEMFKQNPNVKLAAFDAVN AQPVDLTQAAENSLHAVVHIRSTQEAKTRTVQQAPDIFDFFFGDGRGQQRQVQSQPRVGF GSGVIISKDGYIVTNNHVIEGADEISVKLNDNREFKGRVIGTDPSTDLALVKIEGDDFPT IPVGDSEALKVGEWVLAVGNPFNLNSTVTAGIVSAKARSLGVYNGGIESFIQTDAAINQG NSGGALVNAKGELVGINSVLSSPTGAYAGYGFAIPTSIMTKVIADLKQYGTVQRALLGIR GGSIGSSLMDDRQPIDKSGKTLADKAKELGVVEGVWVSEIVENGSAAGADIKVDDVIIGV DNKKVSNMADLQEALAKHRPGDKVKVKLMRDKKEKTVEVTLKNEQGTTKIVKDAGMEILG AAFKELPDDLKKQLNLGYGLQVTGVSSGKMSDAGVRKGFIILKANDQPMRKVSDLEEVMK AAVKSPNQVLFLTGVFPSGKRGYFAVDLTQE >gi|225935341|gb|ACGA01000051.1| GENE 60 63977 - 64837 777 286 aa, chain + ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 21 285 108 373 374 221 45.0 1e-57 MRQLKITKSITNRESASLDKYLQEIGREDLITVEEEVELAQRIRKGDRVALEKLTRANLR FVVSVAKQYQNQGLSLPDLINEGNLGLIKAAEKFDETRGFKFISYAVWWIRQSILQALAE QSRIVRLPLNQVGSLNKISKAFSKFEQENERRPSPEELADELEIPVDKISDTLKVSGRHI SVDAPFVEGEDNSLLDVLVNDDSPMADRSLVNESLAREIDRALSTLTDREKEIIQMFFGI GQQEMTLEEIGDKFGLTRERVRQIKEKAIRRLRQSNRSKLLKSYLG 
>gi|225935341|gb|ACGA01000051.1| GENE 61 64972 - 65595 360 207 aa, chain + ## HITS:1 COG:no KEGG:BT_1310 NR:ns ## KEGG: BT_1310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 218 218 327 72.0 2e-88 MKKSVLISLVLIALGAGLVAYARCAVSSSDSVPAKVEQRLREKAQAGKAYCDKNGYNTNY CFLVDFSIHSGKRRFFVWDFKGDSVKYASLCAHGYGKNSTVSKPVFSNVEGSYCSSLGKY KVGIRSYSKWGINIHYKLHGLEATNDNAFKRYIVLHSYTPLPETEVYPLHLPLGISQGCP VISDEVMRKVDGLLKAEKKPVLLWVYD >gi|225935341|gb|ACGA01000051.1| GENE 62 65673 - 66074 330 133 aa, chain + ## HITS:1 COG:no KEGG:BT_1309 NR:ns ## KEGG: BT_1309 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 133 207 86.0 7e-53 MKYVKILFAIALVFTMCSAFSLKKDHSKPVYAFGISASFTDTVVYFTDIQILDSAKVSKE GFLSHRELYSYQLKNYLEDNQLQQNSTCMIYFSENKKKLEKEATKILNKYKKNNRMTVSR IDSDKFHFTKPEE >gi|225935341|gb|ACGA01000051.1| GENE 63 66165 - 67355 822 396 aa, chain + ## HITS:1 COG:no KEGG:BT_1308 NR:ns ## KEGG: BT_1308 # Name: not_defined # Def: clostripain-related protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 2 399 399 397 52.0 1e-109 MKKIKIFSLFVCLAMLVIACHNDDDERGVQMRTVLVYIAGDNSLRSFATEDLAEMTEGMQ SVDDNSYNLLVYIDTGSSPKLIRLKKDKKKNVVQEVIATYEGRNSVDVSKMKEVINTAFS EYPAQSYGLVLWSHGEGWLAKSQNKTRWWGQDGGSNYMDISELKDVLRNAPHLSFLLFDA CFMQSVEVVNELKEHADYIIGSPTEIPAPGAPYQKVVPAMFANNASATDIAKAYFEFYAD ENLYTGKLPYNWGLGDPWTAGVSVSVVNTSMLEQLAKSSSEIIPKYIKGRQAIATSGILC YDCRSSKYYYDFDGLIRSLGSETSEYEAWKAAYDAAVVYWKTTPNNYSSYGGSFSMNGSA GLSTYIFRQSYEEEINPFYRQSIQWYSAAGWDETGW >gi|225935341|gb|ACGA01000051.1| GENE 64 67413 - 67958 350 181 aa, chain + ## HITS:1 COG:no KEGG:BF0992 NR:ns ## KEGG: BF0992 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.fragilis # Pathway: not_defined # 11 177 20 188 193 107 39.0 2e-22 MKNKQSLYKWLYETYENDLFSYGIAFGISKELLEDAIHDVFLHLYEREHKLWESQNMKFY LLNCLKNRIRTIKKKEMNTEFLEEGSDNYSFLIEVNGFELIDEEKERAAMQKQLKKMLDS 
LTDRQREAVYLRYTQGLSYEEIGKLMGIQPKAAQKLVYRAIEQMRKIQPQIIYFFLFGYF L >gi|225935341|gb|ACGA01000051.1| GENE 65 68021 - 69193 855 390 aa, chain + ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 182 344 65 227 280 68 31.0 2e-11 MHSDRIPESTKQWLQNEDFILWCLAPTQESEEWWVNYQKEHPEDQASLQEARKIILSARL NPVQRLPEESERLWARIETSMERRDKRHRFILFTRYAAACVLILGIVSFWLIHSFQTSGT ADDLLVNVKADTLYQKVMLIRDDAEAIQIENNAVITYDSNITIQSDGKETEIIKEGAVQP GKQNTLIVPYGSHSSIVLADGSKVWINSGSKLRFPSSFGPGERRIKVEGEIYIEVAKDTS RPFYVETPQLTVNVLGTKFNVSAYADDALQSVVLVEGKVNVKANSNEDFVLLPNQRFKLS DGISGIDEVNAYDYISWKDGVLQFRGETMKDIIQRLSRYYNVSITCTPEVAQKCTMGKLI LFDDIEQVMKTFSLLYDVQCTIEAGSIKIE >gi|225935341|gb|ACGA01000051.1| GENE 66 69342 - 72767 2331 1141 aa, chain + ## HITS:1 COG:no KEGG:Phep_3877 NR:ns ## KEGG: Phep_3877 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 136 1141 27 1036 1036 807 43.0 0 MENSYYNHANRAYFLCKSYLFVKFFVIFLFIANVTLANGNGTSELYDIKKGTLLEYFKDI EKETGYVFIYSKDIRPSLNQVVSVDLSRKKTITEKLSALFEKTNLVYEINGKQIVVKRKT AVKKNLSSQPQEPRRITGTITDTKGEPIIGANIKEVGTFNGIISDINGRFALSVNPNAVL EISYIGYVTTTVPVKDNKVLSIEIKEAEQSLEEVVVVGYGKQKKESVTASIASISRKELV QTQQSNISNMLVGRMPGLIAFQRSGAPGEDASSLLIRGVSTFTDNTAPLIMIDGIERTNF DGIDPNEVESLNILKDASATAIYGVKGANGVVLITTRKGERGRPRLSYSGNFALQQSTQL PSYLNSADYATLYNEALENDSRVSGSTYQPKFTDKDIELYRNGFDPILHPNTNWIDDFLR KFSTRTQHNINLSGGTERVKYFISGSYFDQTGIYKHTKIDSDHDVNPRNTRYNFRTNFDF QITRDFSATVQMAAQIGRVITPGSGNSGIWQAISFANPLSSPGLVDNKIVRIQDGLGSVN PWQTLLSNGYQKDNRNNINTTLRLNYDLSSLLTKGLSVHGSIAYDSYYYSRKKYSKTFPY YLARRDAEDPDYIYLIPQSEESIWSVSTGWDKNRKVYMEFGIDYNRTFGLHKTTALILYN QSKYYSPSLQYYVPNAYQGLVGRLTYEYASRYLAEFNMGYNGTENFAKGKRFGFFPAVSL GWVISEEKFFPKNNIVKYLKVRGSYGEVGNDQIGGDRFLYLPSSYGAASDSGVNKYNFGL ASNPYTSLMIVENKIGNPDLTWEKAKKMNVGVDINLFNNCLTASFDIFKEKRNNILANRS 
TSPMIIGANLPAYNFGEMENRGWEMDLNFRHHIQDFHYWARFNYSFARNEIIYMDEVQKR YDYQMTTGRRKNQFFGLIFDGYYNSWEEINALDRPKSSWSGNQLQPGDVKYVDVNQDGVI DDYDMVPIGYTPVPEIIYGFSFGAQYKGFDFSILFQGADNVSIKYFGRSMWPFAKGEESA KSLIKERWTQERYEAGEKITFPRLSLNPNGETDHNYRPSTLWIRDASYLRLKNLEVGYTF TGGFVKRLNLNSVRLYFNGSNLFTWTDVVDLDPEAPSRSGNVEINTYPLQKVYNIGLNIN F >gi|225935341|gb|ACGA01000051.1| GENE 67 72773 - 74530 1198 585 aa, chain + ## HITS:1 COG:no KEGG:Phep_3876 NR:ns ## KEGG: Phep_3876 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 585 1 587 587 328 36.0 5e-88 MKRITYICVILLSSVLMFSCSDYLEAPPSVDLDEDGVFADRTLTEQYITGIYAEGMPLGF SMGSSGIDRKLCATSTLAGACDEAEQGANWGKGNASWNVDNHNNSSIDWDEDPRHNTRWQ TLRKCNIVLERIDEVPDDPGDVDFKTRSRGEAYFMRALVFWEGVYRYGGLPIIHRRINPS EDGKLPRNTFSDCVDSILVDCDRAASILPDYYTNSILVGRANRIAALALKSRVLLYAASP LFNTDDPYLPLSGNNDLIGYGNYSKERWNEAAKAAKAAITAVESSGYYDLYDEGTPETNY EHVWTAPDNKEIILANKKYRNFTTSSHPITSNIPAWAGSSWSDGGLFTTFNFVRFYEKKD GNQQTWNMDGGDDLLEKYDELDPRFAQTIAAHGANWNTEIGILNFLPGGAHNVANDKTKH LVRKWVPRVLRATAPRNSTNMDWIVFRVAELYLNYAEALNEYYETPPKEAFDAVLKVRER SGMPGFPSTLNKEQFREKLRRERAVELAYEDHRFWDIRRWLIADDEGVMKGAMYGLQLSA VTGAPGKVHYKPYVFENRLWSDRSYLHPIKQTEIDKGYMLQNPGW >gi|225935341|gb|ACGA01000051.1| GENE 68 74560 - 77388 1835 942 aa, chain + ## HITS:1 COG:no KEGG:Phep_3875 NR:ns ## KEGG: Phep_3875 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 39 942 41 937 937 540 37.0 1e-151 MTIINSIYKGKRYAILLTTILTFYAPVLFAQHIADSALVEVAYGQQPEWRRTSAISSIRG EDLLKTTSASLGNSLQGQLPGLTLLQQSGEPGYDFSIANLYLRGRTSYASGQKMLVYVDG FEAPIESLSTAEIESVTLLKDASALALYGMRGANGVLLVTTKKGCVSAPQVSIRLQMGIQ QPLGVPDPINAYDYATLYNQVRVNDGLPRLYTPEELNAYQNNTDPYFYPNVNWKKAILNN TAPLSMADLSFKGGGDVVQYYVMMNLLQNIGFYKDTDKKRRENSNAYYASFNFRSNLNIQ ISKHLSTGLNFSGSVGNRSIPGGSSSANRLLGAIWRTAPNAFPVYNPDGSYGGNAAFTNP VGNIVSRGLYKENSRTFQLIFMPKYDLEKLTKGLSVSAGIAYNNYMADSSIKNQNYARYS LSKGAGGEVLYTPYGADSPLESDEGFRTDWSRLNFKAQLDYDRVFKQHQLTASIFFLSDL YQKYGSRDDIKYLNYAGRVTYSYNQKYIGEFAASYTGCDDYAPGKRYGFFPAFSAGWILS 
KENFMKNINWVEYLKLRASIGLVGNNQNENGRYLFDQTYSSNGSYFLGTGSSSTGGFRAD MIANPDISWEKERVFNIGLDAVLLKGLSVEFDYFHKERYDILSQPYSTIPGFVGASYGDI LPYMNVGKVKNQGFEGTIRYESSLKNNFNYYVETSAWYAKNEIVDMAEEVKLYDYQYRKG RSVNTPFVLVADGLYQESDFDSNGSLKSHLPVPQFGEVNPGDIKYIDKNGDGVVDSNDSY PVGYSNIPEWNYALKIGFEWKGIDVEALFHGVANRDIYLSGPLAYSFTDNGSASKLALDS WTPENPNASYPRLSTRTFENNYRTSTYWKRNGNYFRMKNIRIGYTFPQSIARAIKLSRLY VYANASNLFTVSHLDGLADPESSSLITYPLTRSYNLGIKIDF >gi|225935341|gb|ACGA01000051.1| GENE 69 77415 - 80078 2093 887 aa, chain + ## HITS:1 COG:no KEGG:Phep_3874 NR:ns ## KEGG: Phep_3874 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 10 517 7 523 568 384 43.0 1e-105 MKYPIGLSIILNALAAISILSGCSDYLDREYDSFIDNEMTFTSYERTSKFLVNAYRYLPD GFNRIGSEAMLDAATDDAEHANASCNIQHFNTGAWNSRSNPDDLWNKYYAGIRIANEFIE NVDRVNLDKYRLDPDNQNEYQNRLNDLKTWKYEARFLRAFFHFELVKRFGPVPVITSTLS VNADYSETPRPSMDDCISFISSECDKVAEVLDLTPGRGIDSDLGRATKGAALALKSRVLL YAASPLYLDWQNLSESDLPSDMEKWKAAAQAAKDVIDLGIYSLYGSYATLFKNNFQNSEF ILMRRYGNSSDFEKYNFPVSYGGVGGINPSLNLVDSYEMKDGSYFSWENEENAVRPQFYR DDRLNATILLNDSVWKSTAVENWDGGKDGLGVTNATKTGFYLKKYLNEDVNIQTGGGSQG HIWPLFRLAEIYLNYAEALNEYDPENADIAEYVNRVRSRAGQPNLPSGLTQDEMRERIRR ERRVELAFEEHRSWDVRRWKIAQETLGGDLLGLEITRKNQARRAVTRNSVIPANEVPEGW HYYDGDEFNDLVINNSCWGQYGSDTPVGNSQYGQPTGNIQTYRKKQITIEKGSGGLSFAR IAATKDDNPPAPTLSTASTREGWWSGALSSRDTDKYGYQGKYYPLHSRIEIRAKIPYIYG IWMGPWCRHYAGASVAELDIEEFFVKEFENTASPRRLSQALHLHDNKTGNLGINVNGYGR HTVLDFDPGADFHTYGVQVDPDPVSPDKHAIISYLLDGKVTNTFKTIDYDDRYNTFITKA IAEGREKRTWDIAITGQIGGKNENGIGYPEDRNANLRNVSMDVDWVRVFTRDETEPEIPE KPEYPVEKFDYSRAVVEKRVFDSKMYWYPIPESEILQLKNWKQNPGW >gi|225935341|gb|ACGA01000051.1| GENE 70 80098 - 82134 1282 678 aa, chain + ## HITS:1 COG:CC1105 KEGG:ns NR:ns ## COG: CC1105 COG1472 # Protein_GI_number: 16125357 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Caulobacter vibrioides # 332 677 10 364 743 332 49.0 1e-90 MLKIKILIIIFCCIGVVKAQTVIPPSSETPPGWIYYDGDEFNGDVIDSRYWGIYGSQKVG 
RPTYNQENKAMLQTYRPEQVFIETLPTGEKICRIRSFKSKDAPSPVHPSVKSKTGWWSGA LSSRDSDTEKYYPLFCRIEIKAKVPYLYGLWNALWLRHYKGAGVAEIDILEFFTKAFGEN PYPAKANQTLHLFNSETQKLGINLPKGQIRYTEIGDDKPGDNFHVYAVQIDPDPVDNNHA IITFLIDNKVNYQIHTRTQLGDAYTDFITKARKENRLDRVWDIAITGQVGAFDKLDVGYP AEELQQFDFDIDWIRVYVRDPHTRIVGNSKLPHTPEADSFVKDLLSRMTVEEKIGQLSQY VGRTLLTGPESEYLSDSLIARGLVGSVLNISGAKTLRDLQEKNMRHSRIKIPILFGMDVI HGYKTIFPTPLAESCSWDLAAIERAAKIAAIESSAAGLHWTFAPMVDIARDARWGRVVEG AGEDTYLGSEIAKARVNGFQWNLWENNSVLACAKHWVAYGLPQAGRDYAPVDMSERTLFD TYLPPFKACIDAGVLTFMSAFNDINGIPASAHPFLLKDLLRGQWNFNGFVVSDWEAVKQL VAQGVAEDDKDATRLAFNSGIDMDMTDGLYNKYMKELIEAGKISMEDVDNSVSRILHIKY ALGLFADPYKFCNEEYES >gi|225935341|gb|ACGA01000051.1| GENE 71 82141 - 83247 714 368 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 8 367 385 764 764 335 46.0 6e-92 MKKEFLDAALDMAHKSAVLLKNDNHTLPLAKNVRSIAVVGPLADNQTELLGSWRARGEDR HVTTVLQGIKNKIGGNKTKVGYARGCDFDGEDKSGFKEAVKLASKSDMVIAVVGEKALMS GESRSRAQLDLPGVQEELIKELVATGKPVVVVLMNGRPLSIEWVDKNVSAILETWFLGTS AGTAIADILFGDYNPSGRLTISFPRVEGQVPVYYNYKKSGRPGDMPHSSTTRHIDVPNAP LYPFGYGLSYTTFSYSAPQSTQKEYTRQETISVSVTVTNTGDRDGEETVQLYVNDKVASV VRPVKELKAFKKIFLKAGESKTVQFDISPLALGFYDAAMNYVVEPGEFEIMTGCNSNDLQ TISVKLIN >gi|225935341|gb|ACGA01000051.1| GENE 72 83264 - 83944 278 226 aa, chain + ## HITS:1 COG:no KEGG:BT_2457 NR:ns ## KEGG: BT_2457 # Name: not_defined # Def: putative purple acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 194 19 187 389 151 40.0 1e-35 MRNWIIVFFVIFAARLSAHDGDSIKITHGPYLCDMSTDGVTVVWTTNKPALSWVEVAPAG EDHFYGKERPRHYDTESGRKRANDTIHRVRIKHLEPGREYRYRIFSREVVSWPSSDWVTY GLIAASNVYKQEPFRFRTFDDRKKEISFLVLNDIHGRSDYMKSLCREVDFKSLDFVLLNG DMSSWVEGRSRYVKIILMLVSNCLRPKCLLFSIGETMKHVEFIPMR >gi|225935341|gb|ACGA01000051.1| GENE 73 83815 - 84444 266 209 aa, chain + ## HITS:1 COG:no KEGG:BT_2457 NR:ns ## KEGG: BT_2457 # Name: not_defined # 
Def: putative purple acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 198 182 380 389 183 43.0 4e-45 MGRGQEQICKDYIDACVELFASEVPIVFNRGNHETRGVYSDALIKYFPTSTGTFYYRFNI GKVCFLVLDSGEDKPDSDLEYAGIADYDNYREEETLWLRSVVEENDFKQSSLRIAFLHIP PTIGNWHGNYHLQQTLLPVLNTAGIDLMLSGHTHKYYFRESEPDKANFPILVNDNNSYLL CKIKDGKMVIDVVGANGKDKKQHQFDVKF >gi|225935341|gb|ACGA01000051.1| GENE 74 84496 - 86562 1526 688 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173892|ref|ZP_05760304.1| ## NR: gi|260173892|ref|ZP_05760304.1| outer membrane protein SusF [Bacteroides sp. D2] # 1 688 1 688 688 1363 100.0 0 MKMKTIIKLLFVFSLPFIFCRCADDGIHYEREINVPEGIYLTGSASQFSVEAINGKMNVI KEGTLVALNTWLTSGGDFYISLVGPDGQPVYYGQGALLQNDNSEVSTYSLALGTGGFKVM EDGLYQIIVNPILKQVNIVPFNFRLTSKFELTEDGENELFLQKPSYDHINHVATWVSTDE LEVILPAEFSFNYTDNTSFDVKETDTEKYTFSSMYTGTGGSIKMNVLTEEYAELTNQSQV QLNLKNKGEYKVTLQYEVLTGKFFAKMTGNEIIEPEPEGYPEKLYMIGDEFGNWNWNSTN VVEMAPVGQLGNGAFWTIKYFNAGQGIKWASEKSDAESFASLGTNVNYVVGSNGRATVET SGLYLVYVDMNRNLIAFEKPAVYGIGECFDGQEVSFDLSGQNFSAVTTTQGNLQMYATSD YNNRDWNTMEFNIYNGQIYYRGVGAALEPVPVASNIPIELDFSQDKGKIAVTFASPSDVP STAKAIYMVGDEFGNMNWGSDGVISLDKVWNSADRWIHINYFNAGTKLRFSTSKIFGDGE FTGLTNNVGFEISDEGLVVIPQSGTYIIFVDLGSKTISIQKPVIYGYGTAAGGNNEKILP FTESSDGKTFSVTLPNGGRFRIHPYIPAFDNLNPSFGAWKREYAVNPETLEIYLRKEGMD EPNKDYVWAANTIITLDFRAAKGTIVVP >gi|225935341|gb|ACGA01000051.1| GENE 75 87056 - 87206 85 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299149093|ref|ZP_07042155.1| ## NR: gi|299149093|ref|ZP_07042155.1| hypothetical protein HMPREF9010_03382 [Bacteroides sp. 3_1_23] # 1 50 1 50 56 92 98.0 9e-18 MYTILEHFRSALCNLKKGGIDLSLNKWVELVSLRTTDFSLTGIFLLFPFS Prediction of potential genes in microbial genomes Time: Fri May 13 10:26:39 2011 Seq name: gi|225935340|gb|ACGA01000052.1| Bacteroides sp. 
D2 cont1.52, whole genome shotgun sequence
Length of sequence - 71916 bp
Number of predicted genes - 60, with homology - 60
Number of transcription units - 36, operones - 12 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 1003 - 1062 5.5
1 1 Tu 1 . + CDS 1084 - 2190 1007 ## Slin_3582 hypothetical protein
+ Term 2211 - 2246 1.0
- Term 2305 - 2341 3.2
2 2 Tu 1 . - CDS 2454 - 2678 172 ## BT_2368 hypothetical protein
- Prom 2699 - 2758 7.7
+ Prom 3268 - 3327 5.8
3 3 Tu 1 . + CDS 3347 - 3847 461 ## COG3449 DNA gyrase inhibitor
- Term 3669 - 3711 1.1
4 4 Op 1 . - CDS 3911 - 4282 295 ## COG3324 Predicted enzyme related to lactoylglutathione lyase
- Term 4299 - 4334 2.4
5 4 Op 2 . - CDS 4377 - 5180 367 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 5233 - 5292 7.0
- Term 5266 - 5323 9.8
6 5 Tu 1 . - CDS 5332 - 6201 798 ## BT_1908 hypothetical protein
- Prom 6303 - 6362 8.2
+ Prom 6183 - 6242 6.1
7 6 Tu 1 . + CDS 6324 - 6650 436 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain
+ Prom 7220 - 7279 3.3
8 7 Op 1 11/0.000 + CDS 7427 - 8206 243 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
+ Prom 8256 - 8315 4.2
9 7 Op 2 . + CDS 8344 - 8949 618 ## COG1309 Transcriptional regulator
10 7 Op 3 . + CDS 8963 - 9679 575 ## Sde_1498 hypothetical protein
11 7 Op 4 . + CDS 9724 - 10254 245 ## gi|260173904|ref|ZP_05760316.1| hypothetical protein BacD2_18702
+ Prom 10295 - 10354 6.7
12 8 Tu 1 . + CDS 10423 - 11598 927 ## COG1488 Nicotinic acid phosphoribosyltransferase
+ Term 11615 - 11669 7.6
- Term 11598 - 11661 13.0
13 9 Op 1 . - CDS 11745 - 12068 336 ## COG0526 Thiol-disulfide isomerase and thioredoxins
- Prom 12100 - 12159 7.6
14 9 Op 2 . - CDS 12164 - 12646 452 ## COG3467 Predicted flavin-nucleotide-binding protein
- Prom 12726 - 12785 4.8
+ Prom 12846 - 12905 6.5
15 10 Op 1 . + CDS 13118 - 14632 1485 ## COG0439 Biotin carboxylase
16 10 Op 2 . + CDS 14655 - 15164 531 ## COG1038 Pyruvate carboxylase
17 10 Op 3 . + CDS 15184 - 16719 1603 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta)
+ Term 16757 - 16816 5.1
- Term 17110 - 17153 3.6
18 11 Tu 1 . - CDS 17329 - 18453 768 ## BF3546 putative N-acetylmuramoyl-L-alanine amidase
- Prom 18512 - 18571 6.4
+ Prom 18436 - 18495 5.6
19 12 Tu 1 . + CDS 18583 - 19863 1318 ## COG2873 O-acetylhomoserine sulfhydrylase
+ Term 19892 - 19941 12.9
+ Prom 19903 - 19962 2.7
20 13 Tu 1 . + CDS 20131 - 20802 502 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold
- Term 20762 - 20816 0.4
21 14 Tu 1 . - CDS 20818 - 21711 572 ## gi|260173914|ref|ZP_05760326.1| hypothetical protein BacD2_18752
- Prom 21792 - 21851 6.5
+ Prom 21676 - 21735 7.2
22 15 Tu 1 . + CDS 21833 - 25759 2093 ## COG0642 Signal transduction histidine kinase
- Term 25818 - 25863 0.6
23 16 Tu 1 . - CDS 25971 - 26501 300 ## BT_1925 hypothetical protein
- Prom 26534 - 26593 2.0
- Term 26517 - 26562 6.1
24 17 Op 1 . - CDS 26598 - 27464 729 ## BT_1926 hypothetical protein
25 17 Op 2 . - CDS 27491 - 30061 2560 ## BT_1927 hypothetical protein
- Prom 30085 - 30144 7.0
26 18 Tu 1 . - CDS 30465 - 31403 373 ## BT_1793 integrase protein
- Prom 31423 - 31482 1.6
27 19 Tu 1 . - CDS 31520 - 32146 397 ## BT_4601 hypothetical protein
- Prom 32267 - 32326 4.2
+ Prom 32271 - 32330 5.0
28 20 Op 1 . + CDS 32503 - 34956 1461 ## COG3250 Beta-galactosidase/beta-glucuronidase
+ Prom 34962 - 35021 2.3
29 20 Op 2 . + CDS 35046 - 36521 1115 ## COG5520 O-Glycosyl hydrolase
- Term 36522 - 36582 12.1
30 21 Tu 1 . - CDS 36648 - 37154 353 ## YPTB3281 hypothetical protein
- Prom 37213 - 37272 4.2
- Term 37267 - 37330 6.1
31 22 Tu 1 . - CDS 37388 - 39637 1813 ## BT_2020 putative phosphate/sulphate permeases
- Prom 39711 - 39770 5.1
32 23 Tu 1 . - CDS 40119 - 42206 1434 ## COG0855 Polyphosphate kinase
- Prom 42238 - 42297 3.9
+ Prom 42166 - 42225 8.5
33 24 Op 1 . + CDS 42262 - 42756 420 ## COG0622 Predicted phosphoesterase
+ Prom 42763 - 42822 3.1
34 24 Op 2 16/0.000 + CDS 42856 - 43725 1005 ## COG1209 dTDP-glucose pyrophosphorylase
35 24 Op 3 . + CDS 43747 - 44886 1123 ## COG1088 dTDP-D-glucose 4,6-dehydratase
+ Term 45008 - 45063 13.2
36 25 Tu 1 . - CDS 44970 - 45863 782 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase
- Prom 45962 - 46021 5.2
+ Prom 45814 - 45873 3.9
37 26 Tu 1 . + CDS 45952 - 47013 885 ## COG1408 Predicted phosphohydrolases
+ Prom 47105 - 47164 9.3
38 27 Tu 1 . + CDS 47213 - 48502 946 ## COG1373 Predicted ATPase (AAA+ superfamily)
39 28 Op 1 . - CDS 48521 - 49090 518 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase
40 28 Op 2 . - CDS 49164 - 49448 340 ## COG2350 Uncharacterized protein conserved in bacteria
41 28 Op 3 8/0.000 - CDS 49498 - 50064 574 ## COG0194 Guanylate kinase
42 28 Op 4 . - CDS 50108 - 50989 967 ## COG1561 Uncharacterized stress-induced protein
- Prom 51015 - 51074 7.7
+ Prom 50877 - 50936 7.2
43 29 Op 1 . + CDS 51079 - 51768 563 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone
44 29 Op 2 . + CDS 51805 - 52416 599 ## BT_2006 hypothetical protein
45 29 Op 3 . + CDS 52446 - 53750 1410 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase
46 29 Op 4 . + CDS 53747 - 54289 503 ## BT_2004 16S rRNA-processing protein RimM
47 29 Op 5 . + CDS 54386 - 55246 698 ## COG0739 Membrane proteins related to metalloendopeptidases
48 29 Op 6 17/0.000 + CDS 55261 - 56430 1041 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase
+ Prom 56459 - 56518 3.2
49 29 Op 7 . + CDS 56560 - 57915 1350 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1
+ Term 57961 - 58015 5.2
- Term 57947 - 58002 6.6
50 30 Tu 1 . - CDS 58026 - 58907 873 ## BF3490 hypothetical protein
- Prom 58969 - 59028 5.3
+ Prom 58908 - 58967 6.0
51 31 Op 1 . + CDS 59095 - 60432 1153 ## BF1006 putative transmembrane protein
52 31 Op 2 2/0.000 + CDS 60470 - 61255 799 ## COG0637 Predicted phosphatase/phosphohexomutase
53 31 Op 3 . + CDS 61330 - 62418 1041 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase
+ Term 62481 - 62532 11.2
- Term 62319 - 62355 0.2
54 32 Tu 1 . - CDS 62520 - 63923 792 ## COG0477 Permeases of the major facilitator superfamily
- Prom 64064 - 64123 4.4
55 33 Op 1 12/0.000 - CDS 64128 - 64586 303 ## COG0602 Organic radical activating enzymes
56 33 Op 2 . - CDS 64664 - 67057 2326 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase
- Prom 67091 - 67150 3.5
- Term 67705 - 67760 2.2
57 34 Tu 1 . - CDS 67763 - 68422 477 ## gi|260173952|ref|ZP_05760364.1| hypothetical protein BacD2_18942
- Prom 68453 - 68512 3.7
- Term 68530 - 68572 8.7
58 35 Tu 1 . - CDS 68622 - 70745 1960 ## COG5492 Bacterial surface proteins containing Ig-like domains
- Prom 70827 - 70886 8.2
- Term 71054 - 71085 1.5
59 36 Op 1 . - CDS 71095 - 71367 184 ## BT_4479 integrase protein
60 36 Op 2 .
- CDS 71371 - 71907 442 ## BT_4479 integrase protein Predicted protein(s) >gi|225935340|gb|ACGA01000052.1| GENE 1 1084 - 2190 1007 368 aa, chain + ## HITS:1 COG:no KEGG:Slin_3582 NR:ns ## KEGG: Slin_3582 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 167 368 158 360 360 158 44.0 4e-37 MSKEYTLADFTDIFDYSTGWFSDSSICSLFYQIFKRFPSMKMVKYKVNQDFLKEIKELYQ QDDAFDIFEHVYCNHFENEKEEEEEEEDNTSTKELYDCIVICKKNLMIGYFDNCVKIVYS NIDKEEINQINQICENHKKENEKLNNLFIVTYSHNYFSLKQSQVNEPAIQIDRHYNDDFV PVAAEIENFLLEDNKSGLIILHGKQGTGKTTYIRHLINLGKKRMIYMSGDLVDKLSDPSF ITFIRQQKNSIFIVEDCEELLSSRNGGNRMNAGLVNILNISDGLLSDELCIKFICTFNAP LKDIDEALLRKGRLAARYEFKDLTTDKVNQLIKEESLDIPEQTHPMTLAEIYNYEGMDFS QGRKRVGF >gi|225935340|gb|ACGA01000052.1| GENE 2 2454 - 2678 172 74 aa, chain - ## HITS:1 COG:no KEGG:BT_2368 NR:ns ## KEGG: BT_2368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 105 72.0 4e-22 MDKTIVGNNAGKVWYALKEIGEISIPELARRLNLSVESTALAAGWLARENKICIQRKNGL IALSDESAFPFSFG >gi|225935340|gb|ACGA01000052.1| GENE 3 3347 - 3847 461 166 aa, chain + ## HITS:1 COG:lin1814_2 KEGG:ns NR:ns ## COG: lin1814_2 COG3449 # Protein_GI_number: 16800881 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Listeria innocua # 24 166 11 156 159 77 32.0 1e-14 MKPAIIRPDLELKKEIRNVSERNVIYIRLTGDYKLNDYGGTWGRLFQFIKEQKLPMGDFS PLCIYHDDPKVTPAEKLRTDVCMVMPVKVAPKSDVGFKVIPAGRYAIFLYKGPYDNLQAV YDTIYGKYLPEMECTIRDEASAERYLNNPCDTPPEELLTEIYIPVE >gi|225935340|gb|ACGA01000052.1| GENE 4 3911 - 4282 295 123 aa, chain - ## HITS:1 COG:STM0409 KEGG:ns NR:ns ## COG: STM0409 COG3324 # Protein_GI_number: 16763789 # Func_class: R General function prediction only # Function: Predicted enzyme related to lactoylglutathione lyase # Organism: Salmonella typhimurium LT2 # 1 120 1 118 121 82 38.0 2e-16 MKKLIAFFEIPATDFRRAVDFYETVLGVQLPTFECETEKMACFTEEGETVGAISYASNFD FLPSTHGVLIHFNCEDIEQTLEKVLLKGGKVVIPKTKIEADDKGWFAVFTDSEGNRIGVY AEK 
>gi|225935340|gb|ACGA01000052.1| GENE 5 4377 - 5180 367 267 aa, chain - ## HITS:1 COG:CC2573 KEGG:ns NR:ns ## COG: CC2573 COG2207 # Protein_GI_number: 16126811 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Caulobacter vibrioides # 61 259 67 263 270 88 27.0 1e-17 MQSFQLIKPCFALAPYIRHYWILQDDSIAPVSERTLPIGCMQLVFHKGKQLLLLGESELQ PQSFISGQSVGFSDVMSTGRIEMITVVFQPYAVKALFHIPSHLFRGQTVDIDAMEDVELS DLVKQVTDTSDNAVCIRLIEQFFLRRLYTLPEYNLKRMSAVFHEINLRPQINISHLSETA CLSSKQFGRIFADYVGTTPKEFIRIIRMQRALSMLQQDATIPFVQVAYECGFSDQSHMIK EFKLFSGYTPAEYLSVCAPYSDYFSEL >gi|225935340|gb|ACGA01000052.1| GENE 6 5332 - 6201 798 289 aa, chain - ## HITS:1 COG:no KEGG:BT_1908 NR:ns ## KEGG: BT_1908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 289 289 441 80.0 1e-122 MKKNILLFGALIGAFLLVSCSGGNKKQAASSVTPEELDNASKVINYYHTSLIVLRHVANA KDVNAVLGYMEQTGKVPEVSPIAPPEVSARDTAELMDPGDYFNIQVRQNLKESYRGLFSA RAQFYDNFNKFLSYKQAKETAKAGKLLDENYRLSVEMSEYKQVIFDILSPLTEQAEKELL ADEPLKDQIMAMRKMSGTVQSIMNLYSRKHVLEGARIDVKMAELKKELEAAKKLPAVTGY DEEQKNYYSFLSSVESFMKDMQKARDKGAYSDADYNAMSEAYEYGLSVI >gi|225935340|gb|ACGA01000052.1| GENE 7 6324 - 6650 436 108 aa, chain + ## HITS:1 COG:MTH1452 KEGG:ns NR:ns ## COG: MTH1452 COG1917 # Protein_GI_number: 15679449 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanothermobacter thermautotrophicus # 10 107 1 98 99 108 50.0 3e-24 MEQSFQKGVVLHLASLIEYTEGGVISKQLIKSPAGNITLFSFDKGEGLSEHRAPFDALVQ VLEGVANITVNGTLFTVKAGESIVFPANAPHALTAVERFKMLLTMIKE >gi|225935340|gb|ACGA01000052.1| GENE 8 7427 - 8206 243 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 247 4 238 242 98 30 1e-19 MNRFENKVVVITGAAGGIGEATTRRIVSEGGKVVIADHSEKRAEQLANELTHTGADVRHV YFSATELQSCKELIDFSMNEYQRIDVLINNVGGTDPKRDLSIEKLDINYFDEAFHLNLCC 
TMYLSQQVIPIMTANGGGNIVNVASISGLTADANGTLYGASKAGVINLTKYIATQMGKKN IRCNAVAPGLVLTPAALDNLNEDVRNIFLGQCATPYLGEPEDVAATIAFLASNDARYITG QTIVVDGGLTVHNPTVALS >gi|225935340|gb|ACGA01000052.1| GENE 9 8344 - 8949 618 201 aa, chain + ## HITS:1 COG:CC2662 KEGG:ns NR:ns ## COG: CC2662 COG1309 # Protein_GI_number: 16126897 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Caulobacter vibrioides # 5 192 14 207 213 81 29.0 1e-15 MEKEITKNRQATEMTLIKAVNDIIEESGFEGLGINAVAAKAKVSKMLIYRYFNSLDGLIA AYIQQNDYWINFDEELPDQAHLAAFIKQIFKRQIMRLRENYTLKRLYRWELTTDNKFVKE LRNKREEKGIWLVEAVSKLSKHPKKEIAAMASIITAAISYLALLEENCPVYNGLKIQQES GWEELEEGINLLVDLWLQKQL >gi|225935340|gb|ACGA01000052.1| GENE 10 8963 - 9679 575 238 aa, chain + ## HITS:1 COG:no KEGG:Sde_1498 NR:ns ## KEGG: Sde_1498 # Name: not_defined # Def: hypothetical protein # Organism: S.degradans # Pathway: not_defined # 44 233 145 335 340 138 41.0 2e-31 MTIMGIWFLSLLINSGCNRDLVEYDSNDLKISIEKGEEWLHNFPLFLGINIKNPPQIAIW TEDMEGNYLSTVYVTHKIATQSWQASGGNRRKEALPHWCYQRGIQYEDGLYLPSKKEPLT DGISGATPKGSFNVKMSPTGKQKKFIVKIEINHSTDFNDAYPKSAKEGDANYSGGKEGSG QPALVYAAEVDLTSGKKEFTAKLIGHSSPDGSNGDLTRDTSSLTTALHIIKSITVYVQ >gi|225935340|gb|ACGA01000052.1| GENE 11 9724 - 10254 245 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260173904|ref|ZP_05760316.1| ## NR: gi|260173904|ref|ZP_05760316.1| hypothetical protein BacD2_18702 [Bacteroides sp. 
D2] # 1 176 1 176 176 345 100.0 8e-94 MKKFIFIFALLSFITVCADGQTDSRRLSYTTYIGTGFSMNQPSYTPFNWQIVSHYHISQR FAIGAGSGLSVYEKLLIPLYASAQFYITKPRRLTPYLECHIGGSFATDREANGGFYLSPS IGAQFKVNRKIKLNLMAGYELQKLERTKKQEDPYFHTTFKEELSHHSITLKIGLTY >gi|225935340|gb|ACGA01000052.1| GENE 12 10423 - 11598 927 391 aa, chain + ## HITS:1 COG:MA2533 KEGG:ns NR:ns ## COG: MA2533 COG1488 # Protein_GI_number: 20091361 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 383 1 395 404 282 40.0 1e-75 MIVKTLLDTDLYKFTTSYAYIKLFPYAMGTFSFNDRNETQYTEDFLKALKTEIMNLSQLR FTEEELEYMTKNCRFLPRVYWEWLSSFRFDPNKIDIHLDEACHLHIEVTDLLYKVTLYEV PLLAIVSEIKNRFFGNVADMNEILCKLSEKIELSNQHQLRFSEFGTRRRFSIDVQETVIK KLNETAHYCTGTSNCNFAMKYGMKMMGTHPHEWFMFHGAQFGYKHANYMALENWVNVYDG DLGIALSDTYTSGIFLSNLSRKQAKLFDGVRCDSGNEFEFIDKLVARYRELGIDATTKTI VFSNALDFTKALDIQEYCQNKIRCSFGIGTNLTNDTGFEPSNIVMKLTQCKMNVNQEWRE CIKLSDDEGKHTGSLEEVQACLYELRLNRQQ >gi|225935340|gb|ACGA01000052.1| GENE 13 11745 - 12068 336 107 aa, chain - ## HITS:1 COG:slr0233 KEGG:ns NR:ns ## COG: slr0233 COG0526 # Protein_GI_number: 16331440 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Synechocystis # 22 91 21 90 105 68 40.0 2e-12 MKEKKIAREERNQEKLANGDWVMAEFYATWCPHCQRMKPVVEEFKKLMEGTLEVVLVDID QEPALTDFYTVESTPTFILFRKGQQLWRQSGELPLERLERAVKGFKS >gi|225935340|gb|ACGA01000052.1| GENE 14 12164 - 12646 452 160 aa, chain - ## HITS:1 COG:CAC2475 KEGG:ns NR:ns ## COG: CAC2475 COG3467 # Protein_GI_number: 15895740 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Clostridium acetobutylicum # 8 159 5 154 154 99 38.0 2e-21 MKYLNEPVRRQDRLLEEEKALRLLQTAEYGILSMQALGGGGYGIPVNYVWDGARSIYIHC APEGEKLRCISACERVSFCVVGATHLVPDKFTTGYESIVLTGTARTGLSEAERMKALELL LDKLSPEDKVVGMKYAEKSFHRTEIIRMDIDNWSGKCKRV >gi|225935340|gb|ACGA01000052.1| 
GENE 15 13118 - 14632 1485 504 aa, chain + ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 441 1 440 493 509 57.0 1e-144 MIKKILVANRGEIAVRVMRSCREMEITSIAIFSEADRTAKHVLYADEAYCVGPAASKESY LNIEKIIEVAKAAHADAIHPGYGFLSENATFARRCQEEGIIFIGPNPETMEAMGDKIAAR IKMIEAGVPVVPGTQDNLKSVEEAIELCNKIGYPVMLKASMGGGGKGMRLIHSAEEVEEA YTTAKSESLSSFGDDTVYLEKFVEEPHHIEFQILGDKHGNVIHLCERECSVQRRNQKIVE ETPSVFVTPALRKDMGEKAVAAAKAVNYIGAGTIEFLVDKHRNYYFLEMNTRLQVEHPIT EEVIGVDLVKEQIKVADGQVLQLKQENIQQRGHAIECRICAEDTEMNFMPSPGIIKQITE PNSIGVRIDSYVYEGYEIPIYYDPMIGKLIVWATNREYAIERMRRVLHEYKLTGVKNNIS YLRAIMDTPDFVEGHYDTGFITKNGEYLQQRIMRTSEHSENIALIAAYMDYLMNLEENNS GMATDNRPISKWKEFGLHKGVLRI >gi|225935340|gb|ACGA01000052.1| GENE 16 14655 - 15164 531 169 aa, chain + ## HITS:1 COG:AGc4940 KEGG:ns NR:ns ## COG: AGc4940 COG1038 # Protein_GI_number: 15889978 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 88 164 1095 1171 1174 62 38.0 5e-10 MEIHIGNRVAEVELVSKEDNKVVLTIDGKPFEADVVMAENGTCNILMDGRSSNAQLIRRD NGKSYKVNTHYSSFNVEIVDSQAKYLRMRKKGEEEQNDCIISPMPGKVVKIPVVAGQEMK AGDTAIVVEAMKMQSNYKVTSDCRIKEILVQEGDSITGEQTLITLEPIA >gi|225935340|gb|ACGA01000052.1| GENE 17 15184 - 16719 1603 511 aa, chain + ## HITS:1 COG:VNG1529G KEGG:ns NR:ns ## COG: VNG1529G COG4799 # Protein_GI_number: 15790513 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Halobacterium sp. 
NRC-1 # 11 511 10 516 516 623 58.0 1e-178 MEEINKAYATFEERDRIASLGGGVDKIEKQHESGKMTARERIEMLLDKGTFVELDKLMVH RCTNYGMDKNKIPGDGIVSGYGKVDGRQVFVYAYDFTVYGGSLSASNAKKIVKVQQLALK NGAPIIALNDSGGARIQEGIESLSGYADIFYQNTMASGVIPQISAILGPCAGGACYSPAL TDFIFMVKEKSHMFVTGPDVVKTVIHEEVSKEELGGAMTHSSKSGVTHFMCNTEEELLMS IRELLSFLPQNNMDEAKKQPCTDETNREDASLDTIVPVDPNVPYDMKDIIERVIDNGYFF EVMPNFAKNIIIGFARMAGRSVGIVANQPAYLAGVLDIDASDKASRFIRFCDCFNIPLIT FEDVPGFLPGYTQENNGIIRHGAKIVYAFAEATVPKLTVITRKAYGGAYIVMNSKQTGAD VNFAYPSAEIAVMGADGAINILFRKADEATKAKELEAYKEKFATPYQAAELGYIDEIIYP RQTRKRLIQALEMTENKMQTNPPKKHGNMPL >gi|225935340|gb|ACGA01000052.1| GENE 18 17329 - 18453 768 374 aa, chain - ## HITS:1 COG:no KEGG:BF3546 NR:ns ## KEGG: BF3546 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: B.fragilis # Pathway: not_defined # 20 371 20 346 346 533 76.0 1e-150 MKNKLYILLFLAFLFSGTTLWAQQKATPKAGEGISSFLLRHNRSPKKYYDDFIELNKQKL GKNNVLKVGVTYVIPPVKKSTATPAKSTSTKETSSKDAPTKNADAKNTASESSGTKQQSP KAKSTKIGTTLNEPLFGKQLADVKVTSNRLAGACFYVVSGHGGPDPGAIGKVGKYELHED EYAYDIALRLARNLMQEGAEVRIIIQDAKDGIRDDSYLSNSKRETCMGDPIPLNQVQRLQ QRCDKINALYRKDRKNYSYCRAIFIHIDSRSKGKQTDVFFYYSNKKGESKRLANNMKDTF ESKYDKHQPNRGFSGTVSGRNLYVLSHTTPASVFVELGNIQNTFDQRRLVMNSNRQALAK WLMEGFLKDYKEKK >gi|225935340|gb|ACGA01000052.1| GENE 19 18583 - 19863 1318 426 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 9 426 5 420 422 522 58.0 1e-148 MAKQFKPETLCVQAGWTPKKGEPRVLPIYQSTTFKYDTSEQMARLFDLEDSGYFYTRLQN PTNDAVAAKIAALEGGVAAMLTSSGQAANFYAIFNICQAGDHFVCSSAIYGGTFNLFGVT MKKLGIDVTFVNPDASEEEISAAFKPNTKALFGETISNPSLEVLDIEKFARIAHSHGVPL IVDNTFPTPINCRPFEWGADIVVHSTTKYMDGHATSVGGCIVDSGNFDWDAHAEKFPGLC TPDESYHGLTYTKAFGKGAYITKATAQLMRDLGSIQSPQNSFLLNLGLETLHLRMPQHCR NAQKVAEYLSKNEKVAWVNYCGLPDNKYYSLAQKYMPNGSCGVISFGLKGGRDVSIKFMD SLEFIAIVTHVADARSCVLHPASHTHRQLTDEQLMEAGVRPDLIRLSVGIENADDIIADI EQALNA 
>gi|225935340|gb|ACGA01000052.1| GENE 20 20131 - 20802 502 223 aa, chain + ## HITS:1 COG:MA0289 KEGG:ns NR:ns ## COG: MA0289 COG2220 # Protein_GI_number: 20089187 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Methanosarcina acetivorans str.C2A # 15 219 13 220 225 111 36.0 1e-24 MPDFEMDSFTTKSGKSLKITFFKHASLLLEYAGQKIFVDPVSDYADYTQQPKADFILITH EHGDHFDTKAIAAIETSRTRIIANPNCRKMLNRGQEMKNGDVLQLADDIKLEAVPAYNTT PGRDKFHPKGRDNGYILTLGGTRIYIAGDTEDIPELTQVKDIDIAFLPVNQPYTMTPEQA IRATQIIKPRILYPYHYGDTDINKVKEGLKNEKTTEVRIRALQ >gi|225935340|gb|ACGA01000052.1| GENE 21 20818 - 21711 572 297 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173914|ref|ZP_05760326.1| ## NR: gi|260173914|ref|ZP_05760326.1| hypothetical protein BacD2_18752 [Bacteroides sp. D2] # 1 297 11 307 307 564 100.0 1e-159 MKTKTLLGFFLLSLMAFGISACQDDAEIILFSGSKLIDETGTCTNTISSTVLYLNGWEAE NIGIANGKGGYSAQSSDETIVTATVNDNRLWLSSHGKKGKVTVTVSDKDGNYVTLPVTVS YGVLTFTCLDQPEFQVSRPSEDMLLLLDAEDDLQEKVNVAMAPYAFINRGDICVLRPDDV YRLSEEGEGGKFTYKTKDEQVLVEGTYRIEWASLAGKKKKAFVFTYTGENDVEKQHTFFN FSPYLGTQSLTRERGPITTAWLEDVSDSPYLEGFLPEDRTVVYWVDTMISSLSAEDL >gi|225935340|gb|ACGA01000052.1| GENE 22 21833 - 25759 2093 1308 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 795 1028 1 229 294 138 38.0 8e-32 MTLKYYYSLILSLMTTLSVSSQQVVVTELPTQRLLPVAHIHRILQDSEGYMWYATEGGGL CRDNGYQINVFRSDLNTPHLLSSNDITCITEDNGHHIWFGTKRGLYRLDKANYQINEITD GELKKQRVDAVRAIHDGTIWTSAEGMIYHFSSEGKCLGSYPSEWKGTRRNVTDFYEDGKH QLWVLQGAGGLLQYHPSTDSFSPYQWTCDASPIQMIEDVQNECYWVATWGKGVVQLIPSS NPQSEEVICKLQPATLEGYEQTPHKTQILGLLKDSRQGVLWVSAMDNLYAYRIVDNALHP VDTESFLPKGKKILDRIIEDRAGNVWVPGYSPHTFILSFDANKIKRYPASAMSVITGYPV MADRTIQEEDYYWIWQGRTGLSIYHPTSEQISFASDFSEETGKYNIIKCIEKCHNQSGIW AASDNACLLHLQHEGMKMKLTKEIQLPDARQIRVLSEDNQGNLWIGTENAIYQYSLSKGE LKKFQGGTDMINDLAVAPDGTIFCITEALEFQYFSPEGERHTIRKGENYSSVLIAPDGKV WVATLEGNVYSYHPQTKVITREENACNTNGDAIKGMEIDNLGHLWILADPYVKEYNPTNH SFRILYNSDRFIQVDYFLSIRKMEDGAICLGGIGAFCLITPSAELDQSPNDIKPVISSIK IDGKTQITGINTRQIELNPDNINVEISFSTLEHLHAGQIGYAYRLKGWDVSWKSLPPGVN TAYFTKLPKGNYTLEIKATDIHGCWAQPMSCLQINRLPAWYETWWAYALYMFSFILIATG VIKIYFNRLHRKQQEQLEEQLTQMKFRFFTNVSHDLRTPLTLIVTPLSSLLSEIQDGKLK QQLSSIYRNAEELLQLVNQLLDFRKLEVSEEKLNLTNADISEFVTTTCEAFESYANNKQI RFSVIPLKQTLYMYLDRDKVHRILYNLLNNAFKFTPPEGRIDVFFKMENRGGIQYVCIIV QNTGQEIAADELPHIFDRYYQIGANTHRQATAGSGIGLHIVKEYVAMHQGFIEVESNKEE GTEFCVYLPTNLVDKRLLPREENIPDESGATETNSSDARKTILIVEDNEEFRQFMYQQLS QEYQVWEASNGEEAEAIANEKEVDIIVSDVMMPGMDGFELCLRLKENMKTSHIFIILLTA RTGDENELAGYQSGADCYLTKPFNMDILKNRIQHLLALQQKRKQIFLSGIEVNAEDLTSS KVDERFLEKAIELVEKNLDNSDYSVEAFSDDMCMSRMNLYRKLQIITGQKPTEFIRSIRL KKAAGLLTHTELTVIEISEKVGFATPSYFSKCFKEMFGVLPTQYHSQD >gi|225935340|gb|ACGA01000052.1| GENE 23 25971 - 26501 300 176 aa, chain - ## HITS:1 COG:no KEGG:BT_1925 NR:ns ## KEGG: BT_1925 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 46 176 1 131 131 191 70.0 7e-48 MNRYVKLLSFLPGICLIAACAGHQIKGYEINGSAPLPEFEGKMVYMKDASSGQPIDSAEI IHGKFAFADTVTSVSPVVKVLSIRASKSGLEYRLPVVIENGSIQADISDVVCTGGTMLNE RMQDFLMAVDEYSTACENKQTEQIKSGFADLLKKYIEINDDNAIGEYIRTAYRSSL >gi|225935340|gb|ACGA01000052.1| GENE 24 26598 - 27464 729 288 aa, chain - ## HITS:1 COG:no KEGG:BT_1926 NR:ns ## KEGG: BT_1926 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 16 288 16 288 288 528 94.0 1e-149 MIKKLCTLLLLTCMTAAPSMAQRYSSSDDASFAPKKGQWQVSVLLGSGKFFNENTSYLLP KFSNDGGVVGLPNGGTDNSGDLNRYLNIGSLNNNSLVNIAGIEGKYFVSDNWDVNFQFSM NVSLTPKKDYVEGDNSVPDMIIPAQSYINAQMTNNWYVSVGSNYYFKTRNERIHPYLGGA LGFQMARIETTEPYTGDTYKDSDDSEELPSQVYVSGSKAGQMYGFKVAAVAGIEYSIAKG FVFGFEMHPLAYRYDLIQICPKGFDKYNASHHNIKIFEMPVVKLGFRF >gi|225935340|gb|ACGA01000052.1| GENE 25 27491 - 30061 2560 856 aa, chain - ## HITS:1 COG:no KEGG:BT_1927 NR:ns ## KEGG: BT_1927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 191 1 179 928 140 58.0 3e-31 MRKWTYLVATLLMAGTTATFTGCIDTDEPEGIAELRGAKSEFIKAQAAVELVEAELRKAQ VAEQELVNAGLALQNKSAEIDLQLHELDIQLKQLLIEKEEAATAQAKAEAEAAIAKAEAD KTKWENEKALIVEQYKEKMLLAETATAKAQEAYKQAMEQIEASKLLLTDEEQARLNGVQA QVAYAKQAMDKAMYGYSTTVIKRILSSEVANSTTTTNPDESTETNSSTTKYYVYLITEDP TQSADGYKAGSLKKLQEQLANYSDIVADNNLEAVLDNALKNAEFALEMTQKYADNLKAIL DNEYTTVADWEAEVKKLEEEIAAAKVKEQQYNIEKGKLEVANPKLISDLQATSNKLEIAK NNQNSNKNKAKTAAAYSKKVEAEIKKGLNDVIPSGTTVAGYNSSTGTFAYGKDILITEAQ NQIDGWIKLIDKATKGVDLENIEWAKPQLATALAAQKEAEENYAKDYKAWEDAMAAYDES LKIDLEKSEAAANAAIKKYNGLKSEDRKKEANIDAVATALVAYYKDALAKEAQVTKATAT KGSDTKKISAWLIADADNFKKVVGVDLGFITWDAGDIATANEISKENIGKLIDVEQDSSV KTPLEKWKSASSAVFSDNFNSGNNPRRLPVSKDEVIAAAGDNHVDALKNGNYGSLGKMIW ATAETASLQAIIDQVETYKALKEAFTAQKAVEQTTQDKANAALADAVDKAKKANDEAQAA YDKVFKEVNDNIATVQKEYGNNEIIKGKIETEITVYLNNNFTNGDLGSLEAIKNLVKEEY MIALANVTAAKADVAKAKRNIEKLAAGTYTDTDYITESLVSIQEKIKVKQAEYDAAKADY DTASAQLKALLAIFLK >gi|225935340|gb|ACGA01000052.1| GENE 26 30465 - 31403 373 312 aa, chain - ## HITS:1 COG:no KEGG:BT_1793 NR:ns ## KEGG: BT_1793 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 312 315 514 80.0 1e-144 MGNLISFMKEVAEGLRKSGNYGTAHVYRSSMNAIIAFNGSRNLPFKKVNQEFLKSFETYL REKDCSWNTVSTYMRTLRAVYNRAVDRRMASYIPHHFRYVYTGTRADRKRALEKEDMERL MKELPKQIHQGRGELQRTRAYFFLMFMLRGMPFVDLAYLKKQDIVGNVLTYRRRKTGRLL 
TVTLLPETMKLMKKYMNTDSASPYLFPILTGGENTEATYREYQIALRNFNYQLLLLKQVL ALTSDLSSYTARHTWATMAYYCEIHPGIISEAMGHSSIIVTETYLKPFKNKKIDEANVTI LSSLKRNYICGK >gi|225935340|gb|ACGA01000052.1| GENE 27 31520 - 32146 397 208 aa, chain - ## HITS:1 COG:no KEGG:BT_4601 NR:ns ## KEGG: BT_4601 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 212 285 65.0 6e-76 MAMYIMEEMPDIHGTGEQVLYPRFAMMEQVSTEDLIRQIASSSGFNVGDVEGVITQIGIE MAHQMAEGKSVKLDGIGTFSPSLALCKDKEREKAGEGETHRNARSIVVGNVNFRVDRKMM RRINGRCLLERAPWKSQRSSQKYTPEQRLALAVRYLEEHPFLTVYEYRKLTGLLRTAATN ELRQWAYTPDSGIGIDGRGTHRVYIKKT >gi|225935340|gb|ACGA01000052.1| GENE 28 32503 - 34956 1461 817 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 30 801 54 870 871 360 31.0 6e-99 MMNNYFESSVRLLALLALIPCFIAESSASERKVSMNENWKFYRGCIANAEQTSFKDTQWR VLDLPHDWSIDPVPVQREGITIGPFSRMSVGGADTGQTVGGEGWYRKEFTIQPEDADKII SLYFEGAYNQAEVWINGQKACFNPYGYIPFKIDIQPYCNAPGIPNIIAVKVVNEGLNSRW YAGSGIYRHVWLMKTDKVHLDEWDTFIDASKVEGKKATVDLHSILHNSGKEKVTAHLKIQ IFSPQGEEVYSTTQPVNISGEINIPISLTFDIKKPELWSVDTPSLYTAHLSVKSKKLSDE ITVPFGIRTVEFSAEKGFLLNGKPLKLKGGCLHHDNGLLGAVAINRAEERKVELMKANGF NAVRCSHNLPSEHFLQACDRLGLLVIDEVFDQWQQAKRPQDYHQFFDEWSEHDIATMVRR DRNHPSIIMWSIGNEIAERADEPTGEIIARKLIATIHKYDTSRATTAAVNSFWDRRDFSW EKDSERAFRNLDISGYNYQWKEYEKDHARFPQRIMYGSESVPKEAAQNWNLIDKNSYLIG DFVWTALDYLGEAGLAHTLELAPGEHSPQFMGWPWYNAWCGDIDFCGDKKPQSYYRDILW KRRDISLAVQPPVAKGKREDINYWGWKNEFLSWNWKGYEGETMAVHVYSRSPKVRLYLND RLLGEKEVNPDTYTASFEVPYEPGQLKAINLKGKKETTSTELCTTGIPAMIHLTADRMKI KADKNDLSYIKVEILDKDGKLIPDCSIPLEIKSTGKGSVIAAGNGSADDMRSFRSLKPKT FRGKAIAIVQPNTEKGTITLTVSAEGLPEAAIVIETY >gi|225935340|gb|ACGA01000052.1| GENE 29 35046 - 36521 1115 491 aa, chain + ## HITS:1 COG:CC1757 KEGG:ns NR:ns ## COG: CC1757 COG5520 # Protein_GI_number: 16126001 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: O-Glycosyl hydrolase # Organism: Caulobacter vibrioides # 27 489 28 468 469 145 23.0 3e-34 MIKKTTLLSSFAIVLSMGTMSAQSYEWVSSTENNTWQQSKVKLQTNTGRTPLLKVNGSEN GTVFKAWGTCFNELGWDALNMLPRKQQEIVLQQLFAPGGDLKFTMGRFSMNANDYARDWY SCDEVSGDFQLKHFNINRDKTTLIPFIKAAQQYNPDMTFWMSPWSPPSWMKINHYYSVRS DRNQNQMSPLSDVALYEDSKEKNTQVFPQQLAVNDYFIQDPRYLQTYANYFCKFIDAYKE QGIPISMIMFQNESWSYTNYPGCAWTAEGIIRFNAEYLAPTLKRQHPEVKLYLGTINTNR YEVIDQILSDPRMPETIEGVGFQWEGGQILPKLRAKYPQYKYVQTESECGWGSFDWKAAE HTFGLMNHYLGNGCEEYTFWNAILYDGGFSGWGWKQNALIHVDSKTGTATYTPEYYAVKH YSHYVTPGSQVLAYKDRGDKMPVMIVMTPQKKQVVIAGNFNEETKELTVKLGTRYLNVTL QPHSLNTFIEK >gi|225935340|gb|ACGA01000052.1| GENE 30 36648 - 37154 353 168 aa, chain - ## HITS:1 COG:no KEGG:YPTB3281 NR:ns ## KEGG: YPTB3281 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis # Pathway: not_defined # 1 166 14 162 163 63 25.0 2e-09 MKYQLEITTLLVPVNVHQLFEKCEWPELNCFDREMVEGYFSDLVNGIQTDEALDDWTLTV VLYIGTYLGANHISIWKHGITDTVTKEKVLTIGIPLPCSKTIRWGVKKKGRFTGKIPDES FRRNNRLLPVNFVKYDNMETYIEDNIRIALLNLFEVGFTLKGYKVKKR >gi|225935340|gb|ACGA01000052.1| GENE 31 37388 - 39637 1813 749 aa, chain - ## HITS:1 COG:no KEGG:BT_2020 NR:ns ## KEGG: BT_2020 # Name: not_defined # Def: putative phosphate/sulphate permeases # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 749 1 750 750 1356 92.0 0 METIYLCIIIFLFVLAVFDLIVGVSNDAVNFLNSAVGAKAASFKTILFIAGIGIFIGASL SNGMMDIARHGIYQPEHFYFAEIMCILLAVMLTDVVLLDVFNSMGMPTSTTVSLVFELLG GTFALSLIKVHNSDTLGLGDLINTDKALSVIMAIFVSVAIAFFFGMLVQWLARVIFTFNY TKKMKYSIALFGGVAATAIIYFMLIKGLKDSSFMTSENKHWIQDNTLMLITVFFVFFTVL MQILHWMKINVFKVVVLMGTFALALAFAGNDLVNFIGVPLAGFSSFLDYTANGEGNPNGF LMTSLLGPAKTPWYFLIGAGAVMVYALCTSKKAHAVIKTSVDLSRQDEGEEAFGSTPIAR TVVRISMTLANGISRIMPDGSKKWLDSRFRKDEAIIADGAAFDLVRASVNLVLAGLLIAL GTSLKLPLSTTYVTFMVAMGTSLADRAWGRDSAVYRITGVLSVIGGWFITAGAAFTICFF VALVLHYGGNISIIALIGIAVFILIRSQVMYKKRKAKEKGNETLKQLMQTTDSTEALQLM RKHTREELSKVLEYAETNFELTVTSFLHENLRGLRRAMGSTKFEKQLIKQMKRSGTVAMC 
RLDNNTVLEKGLYYYQGNDFASELVYSISRLCEPCLEHIDNNFNPLDAIQKGEFSDIAED ITYLIQQCRKKMENNEYNDLEEEIRRANDLNGQLSLLKRKELQRIQSQSGSIRVSMVYLT MVQEAQNVVTYTINLMKVSRKFQIETEMP >gi|225935340|gb|ACGA01000052.1| GENE 32 40119 - 42206 1434 695 aa, chain - ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 7 690 7 681 688 422 37.0 1e-117 MESKYNYFKRDISWLSFNYRVLLEALDERLPLYERINFISIYSSNLEEFYKIRVADHKAV ASGATESDEETVQSARELVEEINKEVTRQLDDRVRIYEEKLLPALRKNHIIFYQDRHVEP FHQQFIKDFFREEIFPYLQPVPVSKDKIVSFLRDNRLYLAIRVYPKKEGNEGTTSLTESR QPLYFVMKQPYAKVPRFIELPSREKNHYLMFTEDIIKANLNLIFPGYDVDSSYCIKISRD ADILIDDTASSADLVAQLKKKVKKRKIGDVCRFVYDRGMPHDFLDFLVDAFHIQRDELVP GDKHLNLEDLRHLPNPNKSLHSLEKPKPMKLTILDEKESIFNYVAKKDLLLYYPYHSFEH FIHFLYEAVHNPETREIMVTQYRVAENSAVINTLIAAAQNGKKVTVFVELKARFDEENNL ATAEMMQAAGIKIIYSIPGLKVHAKVALIRRRGLNGEKIPSYAYISTGNFNEKTATLYAD CGLFTCRPKIVADLYNLFRTLQGKEDPKFTTLLVARFNLIPELNRLIDREIALADEGKQG RIILKMNALQDPAMIDRLYEASEHGVQIDLIVRGICCLIPGQSYSRNIRVTRIVDSFLEH ARIWYFGNDGKPKVFMGSPDWMRRNLYRRIEAITPILAPDLRDSLIEMLNIQLADNQKAC WVDDNLRNIFKKRAPGTPAVRAQYTFYDWLNKTNN >gi|225935340|gb|ACGA01000052.1| GENE 33 42262 - 42756 420 164 aa, chain + ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 3 137 9 134 157 58 33.0 6e-09 MTRIGLLSDTHAYWDEKYLEYFEPCDEIWHAGDIGSLEVAERLAAFRPFRAVYGNIDGQE IRKLYPQINRFTVDGAEVLIKHIGGYPGKYDPSIIGSLMTRPPKLFISGHSHILKVKYDK TLDMLHINPGAAGMSGFHKVRTLVRFVINQGAFQDLEVIELADK >gi|225935340|gb|ACGA01000052.1| GENE 34 42856 - 43725 1005 289 aa, chain + ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus # 1 287 1 287 
292 402 64.0 1e-112 MKGIILAGGSATRLYPLSKAISKQIMPVYDKPMIYYPLSTLMLAGIREVLIISTPRDLPM FRDLLGTGEELGMSFSYKIQEQPNGLAQAFVLGADFLNGEPGCLILGDNMFYGQGFSAML RRAANIEKGACIFGYYVKDPRAYGVVEFDEQGKVISLEEKPEVPKSNYAVPGLYFYDASV TEKAAALRPSARGEYEITDLNRLYLEEGTLKVELFGRGFAWLDTGNCDSLLEASNFVATI QNRQGFYVSCIEEIAWRQGWIPTEQLLLLGQQLEKTEYGKYLIELAKQS >gi|225935340|gb|ACGA01000052.1| GENE 35 43747 - 44886 1123 379 aa, chain + ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 377 1 397 399 496 60.0 1e-140 MKTYLVTGAAGFIGANYIKYILAKHNDIKVVILDALTYAGNLGTIAKDIDNERCVFIKGD ICSRDVVDGLFAEYRFDYVVNFAAESHVDRSIENPQLFLITNILGTQNLLDCARRAWVMG KDEQGYPTWRKGVRYHQVSTDEVYGSLGAEGYFTEATPLCPHSPYSASKTSADMVVMAYH DTYKMPVTITRCSNNYGPYHFPEKLIPLIIKNILEGKHLPVYGDGSNVRDWLYVEDHCKA IDLVVREGQDGEVYNVGGHNEKTNLEIVKLTISTIHRLMAENPEYRQVLKKKVKDENGDI SIDWINEDLITFVKDRLGHDQRYAIDPTKITNALGWYPETKFEVGIVKTIEWYLANQAWV EEVTSGDYQGYYEKMYGKN >gi|225935340|gb|ACGA01000052.1| GENE 36 44970 - 45863 782 297 aa, chain - ## HITS:1 COG:VNG1075G KEGG:ns NR:ns ## COG: VNG1075G COG1575 # Protein_GI_number: 15790173 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Halobacterium sp. 
NRC-1 # 10 296 11 311 311 178 37.0 1e-44 MEEVKRNSLQAWILAARPKTLTGAITPVMIGTALAAMDGRFHWLPALICCLFASLMQIAA NFINDLFDFLKGTDREDRLGPERACAQGWISPQAMKTGIVITVALACLIGCTLLFFAGWE LIIVGVFCVLFAFLYTTGPYPLSYNGWGDVLVIVFFGFVPVGGTYYVQALTWTPDVTIAS LICGLLIDTLLVVNNYRDREADARSGKRTVIVRFGEKFGRYFYLMLGITASLLCLCFLRE GHFYAAFLPQLYLIPHFLTWKRMVKIYSGKKLNSILGETSRNMLLMGILIAIGMLIN >gi|225935340|gb|ACGA01000052.1| GENE 37 45952 - 47013 885 353 aa, chain + ## HITS:1 COG:mll3894 KEGG:ns NR:ns ## COG: mll3894 COG1408 # Protein_GI_number: 13473337 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Mesorhizobium loti # 111 348 50 311 312 107 30.0 3e-23 MKKIKTIYPVILLTLLLLVSCKSKKNMVATLPRPVLNSDSIYPDTANAIAGIFSPDHSQL KELNVSKNKKQNTKKKTSTDTHESSDLVLRRTKITSSSVDVSSVYTGVDRVVKYDFTHRD VPEAFEGFRIAFISDLHYKSLLKEKGLNDLVRLLIAQKADVLLMGGDYQEGCEYVEPLFS ALARVKTPMGTYGVMGNNDYERCHDDIVNTMKHYGMRPLEHEVDTLRKDGQQIIIAGVRN PFDLGRNGVSPTLALSPKDFVILLVHTPDYIEDVSVANTDLALAGHTHGGQVRVFGVAPA LNSHYGNRFITGLAYNSAKIPLIITNGIGTSKLPIRVGAPAEIIVITLHRLTE >gi|225935340|gb|ACGA01000052.1| GENE 38 47213 - 48502 946 429 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 427 23 452 470 390 46.0 1e-108 MRRDAMQELYDWKEKSTRKPLIIRGARQVGKTWLMKEFAATAYKQFAYINFEDNEVMKEI FQKDFDIERILMAIQLVTGKVVNTDTLILFDELQEAPRGLTAMKYFQEKAPQYHVIAAGS LLGIAMHQNDSFPVGKVDFIDLYPLSFSEFLEAIGQEPFASLLAKQDWNLISTFRSKLTD FLKQYYFVGGMPEVVNAFIEHKDYAEVRQLQQNILDSYDRDFSKHAPIAEVPRIRMVWRS IPAQLAKENRKFIYGVIKEGARAKDFELAMEWLIDAGLIYKVNRVKKGGIPLSAYEDFSA FKLFMLDTGLMGAMSGLPPQALLEGNVLFTDYKGAITEQYVLQQLKSVKGLNIYYWSSDT SKGELDFLLQKEIYIIPVEVKAEENLQSKSLRSFVEKNPGLHGIRFSMSDYRQQEWLTNY PLYSVGYIF >gi|225935340|gb|ACGA01000052.1| GENE 39 48521 - 49090 518 189 aa, chain - ## HITS:1 COG:BS_yqeJ KEGG:ns NR:ns ## COG: BS_yqeJ COG1057 # Protein_GI_number: 16079618 # Func_class: H Coenzyme transport and 
metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Bacillus subtilis # 5 188 3 183 189 114 36.0 7e-26 MTKLKTGIFSGSFNPIHIGHLALANYLCEYEGLDEIWFMVSPQNPLKAQEKLWNDELRLE LVKLSISDYPRFQASDFEFHLPRPSYSVYTLEKLRETFPDREFYFIIGSDNWERFGYWYQ SERIIKENQILIYPRPGFPVKEEELPETVRLVHSPVFEISSTFIREALDAGKDVRYFVHP KVWETIKER >gi|225935340|gb|ACGA01000052.1| GENE 40 49164 - 49448 340 94 aa, chain - ## HITS:1 COG:CAP0038 KEGG:ns NR:ns ## COG: CAP0038 COG2350 # Protein_GI_number: 15004742 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 94 1 96 96 94 47.0 3e-20 MFVLILTYKAPIEKVLELLDAHCRFLDKYYNAGNFLASGPQIPRTGGVILCRATDRVEVE EIIREDPFNEIADYQIIEFEPNKSIEGFKDLLNV >gi|225935340|gb|ACGA01000052.1| GENE 41 49498 - 50064 574 188 aa, chain - ## HITS:1 COG:RSc2155 KEGG:ns NR:ns ## COG: RSc2155 COG0194 # Protein_GI_number: 17546874 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Ralstonia solanacearum # 3 183 20 198 221 146 42.0 2e-35 MTGKLIIFSAPSGSGKSTIINYLLTQNLNLAFSISATSRPPRGTEQHGVEYFFLTPEEFR QRIENNEFLEYEEVYKDRYYGTLKAQVEKQLEAGQNVVFDVDVVGGCNIKKFYGDRALSV FIQPPSVEELRCRLEGRGTDAPEVIESRIAKAEYELGFAPQFDCVIVNDDLEAAKAEALK VIKEFLNS >gi|225935340|gb|ACGA01000052.1| GENE 42 50108 - 50989 967 293 aa, chain - ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 292 1 291 292 128 32.0 1e-29 MIQSMTGYGKATAELPDKKINVEIKSLNSKAMDLSTRIAPAYREKEIEIRNEISKVLERG KADFSLWIEKKEGADAAPINKDVLKSYYSQLESISRELEIPCPAPEDWLQLLLRMPDVMT KTEIQELTEEEWSMVHATILEAISHLVDFRKQEGAALEKKFREKIANINSLLEQITPYEK ERVEKVKERITDALEKTLSVDYDKNRLEQELIYYIEKLDVNEEKQRLSNHLKYFISTMES GSGQGKKLGFIAQEMGREINTLGSKSNHAEMQKIVVQMKDELEQIKEQVLNVM >gi|225935340|gb|ACGA01000052.1| GENE 43 51079 - 51768 563 229 aa, chain + ## HITS:1 COG:XF1533 KEGG:ns NR:ns ## COG: XF1533 
COG1214 # Protein_GI_number: 15838134 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Xylella fastidiosa 9a5c # 4 134 3 127 229 80 37.0 3e-15 MSCILNIETSTSVCSVAASQDGQTIFVKEDLKGPSHAVSLGVFVDEALSFIDSHAIPLDA VAVSCGPGSYTGLRIGVSMAKGVCYGRNIPLIGIPTLEVLSVPVLLYHDLPENALLCPMI DARRMEVYAAVYDRRLQVKRTVAADIVDENSYLEFLNEQPVYFFGNGADKCREQITHPNA HFIDNIHPLAKMMFPLAEKAVADEDYKDVAYFEPFYLKEFVASMPKKLL >gi|225935340|gb|ACGA01000052.1| GENE 44 51805 - 52416 599 203 aa, chain + ## HITS:1 COG:no KEGG:BT_2006 NR:ns ## KEGG: BT_2006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 204 344 91.0 1e-93 MQYNTQQKRMPLPEYGRSIQNMVDYALTIQDRAERQRCANTIINIMGNMFPHLRDVPDFK HKLWDHLAIMSGFELDIDYPYEIIRKDNLVTRPDHIPYSTARMRYRHYGHTLEVLIKKAI EFPEGNEKRNLIALICNHMKKDYLAWNKDTVDDKKIAEDLYELSNGELQMTDDIVRLMAE RLNQNYRPKTNYTNNRQNNKRRY >gi|225935340|gb|ACGA01000052.1| GENE 45 52446 - 53750 1410 434 aa, chain + ## HITS:1 COG:BB0472 KEGG:ns NR:ns ## COG: BB0472 COG0766 # Protein_GI_number: 15594817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Borrelia burgdorferi # 1 434 16 439 442 384 46.0 1e-106 MASFVIEGGHRLSGEIHPQGAKNEVLQIICATLLTAEEVTVNNIPDILDVNNLIQLLREM GVTVAKKGIDSYSFKAENVDLAYLESDEFLKKCSSLRGSVMLIGPMVARFGKALISKPGG DKIGRRRLDTHFVGIQNLGADFRYDEERGIYEITADRLQGSYMLLDEASVTGTANIVMAA VLAKGTTTIYNAACEPYVQQLCRLLNRMGAQISGIASNLLTIEGVEELHGTQHTVLPDMI EVGSFIGMAAMTKSEITIKNVSYENLGIIPESFRRLGIKLEQRGDDIYVPAQETYQIESF IDGSIMTIADATWPGLTPDLLSVMLVVATQAKGSVLIHQKMFESRLFFVDKLIDMGAQII LCDPHRAVVIGHNHGFKLRGARLTSPDIRAGIALLIAAMSAEGTSTISNIEQIDRGYQNI EGRLNAIGARITRI >gi|225935340|gb|ACGA01000052.1| GENE 46 53747 - 54289 503 180 aa, chain + ## HITS:1 COG:no KEGG:BT_2004 NR:ns ## KEGG: BT_2004 # Name: rimM # Def: 16S rRNA-processing protein RimM # Organism: B.thetaiotaomicron # Pathway: not_defined # 
1 179 1 179 179 293 88.0 1e-78 MIKKEEVYKIGLFNKPHGIHGELQFTFTDDIFDRVDCDYLICLLDGIFVPFFIEEYRFRS DSTALVKLEGVDSAERARMFTNIEVYFPVKHAEEAEDGELSWNFFIGFQMEDIHHGLLGE VIDVDTTTVNTLFVVEGAEEEELLVPAQEEFIVGIDQKQKLITVELPEGLLNLEELEDDR >gi|225935340|gb|ACGA01000052.1| GENE 47 54386 - 55246 698 286 aa, chain + ## HITS:1 COG:NMB1483 KEGG:ns NR:ns ## COG: NMB1483 COG0739 # Protein_GI_number: 15677336 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Neisseria meningitidis MC58 # 163 286 295 415 415 90 41.0 3e-18 MPKRRRSKAFWKNFKFKYKLTIVNENTLEEIVGLRVSKLNGLSVLLCVLAVLFLIASCII TFTPLRNYLPGYMNSEVRTQIVDNALRVDSLQQVLNKQNLYIMNIQDIFSGKVSIDSVQT LDSLTTAREDTLMERTKREEEFRRQYEENEKYNLTSITSQPDVTGLILYRPTRGMVSDHF NAEKKHYGTDIAANPNESVLATMDGTVILSTYTAETGYLIGVQHNQDLISIYKHCGSLLK KEGERVKGGEAIALVGNSGTLSTGPHLHFELWYKGHPVNPEKYIVF >gi|225935340|gb|ACGA01000052.1| GENE 48 55261 - 56430 1041 389 aa, chain + ## HITS:1 COG:alr4351 KEGG:ns NR:ns ## COG: alr4351 COG0743 # Protein_GI_number: 17231843 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Nostoc sp. 
PCC 7120 # 10 383 3 380 399 353 47.0 3e-97 MNIETNKKKKQIAILGSTGSIGTQALQVIEEHPDLYEAYALTANNRVELLIAQARKFQPE VVVIANEEKYSELKEALSDLPIKVYAGTDAICQIVEAGPIDMVLTAMVGYAGLKPTINAI RAKKAIALANKETLVVAGELINQLAQQYRTPILPVDSEHSAVFQCLAGEVGNPIEKVILT ASGGPFRTCTLEQLKSVTKTQALKHPNWEMGAKITIDSASMMNKGFEVIEAKWLFGVQPS QIEVVVHPQSVIHSMVQFEDGAVKAQLGMPDMRLPIQYAFSYPDRISSSFDRLDFSQCTN LTFEQPDTKRFRNLALAYEAMYRGGNMPCIVNAANEVVVASFLKDGISFLGMSDVIEKTM ERVAFIANPTYDDYVATDAEARKIAASLI >gi|225935340|gb|ACGA01000052.1| GENE 49 56560 - 57915 1350 451 aa, chain + ## HITS:1 COG:aq_1964 KEGG:ns NR:ns ## COG: aq_1964 COG0750 # Protein_GI_number: 15606963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Aquifex aeolicus # 9 450 4 428 429 139 28.0 1e-32 METFLIRALQLIMSLSLLVIVHEGGHFLFARLFKVRVEKFCLFFDPWFTLFKFKPKKSDT EYAVGWLPLGGYVKIAGMIDESMDTEQMKQPEQPWEFRSKPAWQRLLIMVGGVLFNFLLA LFIYSMILFAWGDQYIKVQEAPLGMEFNETAKAVGFQDGDILLSADGVPFERYDGDMLSQ IADAREVSVIRNGAKASVYIPEDLMQRLLADSVRFASYRFPYVIDSVMVNSPAAQAGIQA GDSIIALNGTPISFSDFKEAMAERKKNEATLLKDSIDPRLITLTYVRNGATDTLSMRVDS AYLMGVTACLVTDRLLPMVKKEYAFFESFPAGVSLGVKTLKGYVGNMKYLFSKEGAKQLG GFGTIGSIFPATWDWHQFWYMTAFLSIILAFMNILPIPALDGGHVLFLFYEMIARRKPSD KFMEYAQMTGMILLFGLLIWANFNDILRFFF >gi|225935340|gb|ACGA01000052.1| GENE 50 58026 - 58907 873 293 aa, chain - ## HITS:1 COG:no KEGG:BF3490 NR:ns ## KEGG: BF3490 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 293 11 302 302 438 81.0 1e-121 MDELTDIYKRIEYLRNNGVKMKEIADRVDMAPSVLSALYSSVLPAYIDLLKTRTPDEALD EALALVNNVSKKRLLNNVGSVRLLLQEMEPDVQSEAENGNSFIKLLGKEAKESVQEVYNY SGMYLSYSLSSSTDSLKIEPYMICASENNEYVKVGMINAYKSVHWGSGIISNHQNSYLMF NERDLPQFALVTIYLQLPHYEFPNMLKGLYLCLDYNHNPIARRIVLVKQSDSTDVNQFLE MEGCLVPRAELTPELEVYYNYTCQEGDYIKTCTVPSPKLDETDLEREKKMLKI >gi|225935340|gb|ACGA01000052.1| GENE 51 59095 - 60432 1153 445 aa, chain + ## HITS:1 COG:no KEGG:BF1006 NR:ns ## KEGG: BF1006 # Name: not_defined # Def: putative transmembrane protein # 
Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 435 1 436 446 546 65.0 1e-154 MNFKEKLSPSAQKKLSDLLFILYAGGAALLSYSLVYALRKPFTAATFDGMELFGMDYKIA TSIIQIFGYMVSKFIGIKLISELKREGRLKFILVSILVAELSLVLFGCLPRPFNVLALFF NGLSLGCMWGVIFSFLEGRRVTDLLASLFGLSIAVSSGTAKSIGLFVVDTLHVSEFWMPA LIGAVALPLLAGLGYILDHLPKPTAEDKALRVERVTLNKQQRWNLFRSFAPILTLLFFAN LFLTVLQDVKEDFLVKIIDVNAAGLSPWVFAKVDGVVTLIILAIFATLAVMKSHIKVLSV LLTLVIAGAITLSTVAFNYHTLQLSPLVWLFIQSLCLYFSYLSFQTIFFDRFIACFRIKG NVGFFIAMVDSIGYTGTVVVLVVKECFNPDLNWLEFYNTMAGTVGIVCTFAFTLAMIYLT QKYRKGKQGEIRGNESCPVPSLASY >gi|225935340|gb|ACGA01000052.1| GENE 52 60470 - 61255 799 261 aa, chain + ## HITS:1 COG:STM0432 KEGG:ns NR:ns ## COG: STM0432 COG0637 # Protein_GI_number: 16763812 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Salmonella typhimurium LT2 # 1 259 2 262 270 177 37.0 2e-44 MKKIECIIMDWAGTAVDYGCFAPVAAFIEAFAEKGLVIDVVQTRKPMGLPKIQHIRELLS MSDVNEQFTACYQRAWTEEDVVELNRLFEKHLFASLENYTDPIPGVIPTLEKLRAAGIKI GSTTGYTREMMDVVLPAAQAKGYHVDYCATPNLLPAGRPAPYMIFENLTKLAVPSLDAVI KVGDTIADILEGVNAKVYSVGVILGSNEMALTEAETKSMPASELEARIADVKERMLAAGA SYVIRTIEELPALIETINAGN >gi|225935340|gb|ACGA01000052.1| GENE 53 61330 - 62418 1041 362 aa, chain + ## HITS:1 COG:VCA0604 KEGG:ns NR:ns ## COG: VCA0604 COG0075 # Protein_GI_number: 15601362 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Vibrio cholerae # 4 362 5 363 367 477 60.0 1e-134 MKPYLLLTPGPLTTSETVKEAMMTDWCTWDEDYNLGIVQELRKDLVNIATKKPEEYTSIL LQGSGTYCVEAVLGATITPKDKLLICSNGAYGDRMGNIAEYYHLDYDMLAFEETEQVSVE YVDDYLSNNSDITHVAFVHCETTTGILNPIKELSHIVKMHGKKLIVDCMSSFGGIPLDVN ELGIDFLISSSNKCIQGVPGFGFIIARRSELKYCKGVARSLSLDIYDQWETMEKGHGKWR FTSPTHVVRAFKQALTELIEEGGVEARYQRYCENHRILVEGMRALGFKTLLEDDIQSPII TSFLYPHEGFDFKTFYYALKNKGFVIYPGKISKADTFRIGNIGDVHPEDFRRLVEVIRET KY >gi|225935340|gb|ACGA01000052.1| GENE 54 62520 - 63923 792 467 aa, chain - ## HITS:1 COG:YPO1712 KEGG:ns NR:ns ## COG: 
YPO1712 COG0477 # Protein_GI_number: 16121972 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 14 462 6 454 455 475 58.0 1e-134 MVQSINPTQKIEETDGLPMPKRIWAVVSVGFALCMSVLDINIVNIVLPTLSHDFGTSPAV TTWIINGYQLAIVISLLSFSALGEIIGYRKVFLSGIGLFCITSLICALSDSFWTLTVARI FQGFSASAITSVNTAQLCYIYPKNQIGRGMGINAMVVAISAAAGPSVASGILSIASWHWL FAINVPLGLTALFLGIKHLPRQEERSKRKFDYISAIANAVTFGLLIYTLDGFAHHEEMDF LFIQLVVLAVVGTYYVRRQLTQTTPLLPLDLLRIPIFRLSILTSICSFIAQMSAMVSLPF FLQNTLGHSEVMTGLLLTPWPLATLVTAPLAGYLVERIHPGILGSIGMVLFAIGLFSLSS LTAESSDISIILRLMLCGAGFGLFQTPNNSTIISSAPTQRSGGASGMLGMARLLGQTTGT TLVALLFSFVTHSRSTAVCLMVGSGFAVVAAIVSSLRLSQPSTLKRK >gi|225935340|gb|ACGA01000052.1| GENE 55 64128 - 64586 303 152 aa, chain - ## HITS:1 COG:PM0941 KEGG:ns NR:ns ## COG: PM0941 COG0602 # Protein_GI_number: 15602806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Pasteurella multocida # 1 151 1 155 158 123 41.0 1e-28 MNLLGTYPETIVDGEGIRYSIYLAGCSHHCLGCHNPESWNPGAGEELTEEKIQSIIREIK ANPLLDGVTFSGGDPFFHPEEFLLLLKRVKEETGMNVWCYTGYSYEEIKAQPRLNAALDY IDVLVDGRFEQALFSPYLEFRGSSNQRILKLK >gi|225935340|gb|ACGA01000052.1| GENE 56 64664 - 67057 2326 797 aa, chain - ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 8 797 5 690 699 410 34.0 1e-114 MNYAEICIIKRDGKREDFSISKIKNAISKAFSATGFQDEQQLVADITMNVISQFTTPTIT VEEIQDLVEKSLMKVRPEVAKKYIIYREWRNTERDKKTQMKQVMDGIVAIDKNDVNLSNA NMSSHTPAGQMMTFASEVTKDYTYKYLLPKRFAEAHQLGDIHIHDLDYYPTKTTTCIQYD MDDLFERGFRTKNGSIRTPQSIQSYATLATIIFQTNQNEQHGGQAIPAFDFFMAKGVAKS FRKHLASFINFYVAMENGNQADEKSIRTLIKEYLPSIKATEAERETLRIALIALQIIIDK 
EHLARIVEKAYQQTRKDTHQAMEGFIHNLNTMHSRGGNQVVFSSINYGTDTSAEGRMVIE ELLKATIEGLGTRGEVPVFPIQIFKVKDGVSYSEKDFEKAMKAGNIEEAMTDRYEAPNFD LLLKACQTTAKALFPNFMFLDAPFNQNEKWRADDPKRYVYELATMGCRTRVFENVAGEKS SLGRGNLSFTTLNMPRLAIEARIKAENLIEDERNKDAIEQKAKEIFIESVHTMSVLVADQ LYERYQYQRTALARQFPFMMGNNVWKGGGELNPNEQVGDALRSGTLGIGFIGGHNAMVAL YGQGHGHNQKAWDTLYEAVMEMNKVVNEYKEKYNLNYSVLATPAEGLSGRFTKMDRRKYG KIPGVTDRDYYVNSFHVDVKELISIVEKIKREAPFHAITRGGHITYVELDGEAQKNVRAI AKIVKVMHDEGIGYGSINHPVDTCHNCGYKGVIFDKCPVCQSESILRMRRITGYLTGDLS SWNSAKRAEEQDRVKHL >gi|225935340|gb|ACGA01000052.1| GENE 57 67763 - 68422 477 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260173952|ref|ZP_05760364.1| ## NR: gi|260173952|ref|ZP_05760364.1| hypothetical protein BacD2_18942 [Bacteroides sp. D2] # 1 219 1 219 219 451 100.0 1e-125 MRTKDWKAINNAKVLVVGEDSNLQWSETVPEYVMFADYYFRGFPEDHGERSRNVEARNLF NYIIKLTGNQMTPEDFYITNLCNDNLEPAPKGKRVLIPEDKALKGIEHIEWVLSQNPTIE YVFAMSLQTNYWLQKGGFYEGSAEFLSAAEPRRTGTENYQPFYQPVDGKAFIQVCGNCYD AKNFAVKVIPILPAKDFPLSEQNMERYGEACKKVSSYFK >gi|225935340|gb|ACGA01000052.1| GENE 58 68622 - 70745 1960 707 aa, chain - ## HITS:1 COG:CAC1389 KEGG:ns NR:ns ## COG: CAC1389 COG5492 # Protein_GI_number: 15894668 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 371 534 168 334 773 110 43.0 8e-24 MKKFRHYALLLVTAMAVFGCSKDYDDTELKQDISDLQSRVEKLETWCTTVNGQISALQGL VTALEAKDYVTGVSPVTNGYTITFSKSDAITIYNGKDGAKGADGVTPVIGVDKFEGEYYW TVKTGTAAATWILDADGKKIRTTGDKGADGSAGVAGTSPVLSVAADTDGKVYWKVNGEWL LNSGKKVQATGDKGDKGETGSAGAAGADGAQGDAVFAADGVTVDKTKGTVTFTLAGEDGA TFTLPMASEMKIFKEFTDCMVRPAMKTLTLDLNLKQDEYTAIKAELTSSKGMTTAIVKAT RAAAGSPWGVTLKEPTFNADKSIKENAVVTFDFPTDVAEDEFALLKVTVIDTKGQEHAAT RIIVYSTKVAVESVTLNQVTIDVNVGDEANLIATITPADATTQTLKWESSDDEIVTVDAA GKVTGVKAGSATITVTSTADATKTATCTVTVKSVAVTGVTLEGDKAIAIGETVKLTATIA PANATNKKIIWNSSNPEIASIGTDGTIEGKAAGETTVTVTTEDGSKTATCKVEVTSIPLK AYNVGDYYPDADDSSTAVGVVFKITNGGVHGKVISLDETTAQWSTVNHYFTINDYNDGAA 
NTKIIMEQAEADTKYPAAAWCVAKGVGWYLPAKGELTGFGNIWYANPTTYNTKFTNAGGT AINLDTFKNYYWTSNTDSGSDSAISFNGGGSSTSGKPSYKVRACYAY >gi|225935340|gb|ACGA01000052.1| GENE 59 71095 - 71367 184 90 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 216 305 305 145 74.0 4e-34 MALRGFNHQLALLGELLGIKGKLSSYAARHTWATTAYYCEIHPGVISEAMGHSSITVTET YLKPFHSKKIDDANRRIIDFVERSIGKAIA >gi|225935340|gb|ACGA01000052.1| GENE 60 71371 - 71907 442 178 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 178 38 213 305 260 74.0 1e-68 MPFRQLTPEWLKRFENSLRERGCSWNTVSTYLRTLRAVYNRAVSQRKATYVPHLFRSVYT GTRADRQRALDNEDMKKVFTRLATQSSATTAMRTAQDWFILMFLLRGLPFVDLAYLRKND LKENVITYRRRKTGRTLSVTLTTEAMFLLQRYMNRDVASPYLFPILRSGEGTEEAYRE Prediction of potential genes in microbial genomes Time: Fri May 13 10:28:35 2011 Seq name: gi|225935339|gb|ACGA01000053.1| Bacteroides sp. D2 cont1.53, whole genome shotgun sequence Length of sequence - 38730 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 9, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 528 - 587 6.1 1 1 Tu 1 . + CDS 703 - 1572 627 ## BT_2030 hypothetical protein + Term 1635 - 1679 4.5 - Term 1619 - 1671 14.6 2 2 Op 1 . - CDS 1675 - 2283 365 ## BT_1964 hypothetical protein 3 2 Op 2 9/0.000 - CDS 2294 - 3658 386 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 4 2 Op 3 27/0.000 - CDS 3701 - 6898 3099 ## COG0841 Cation/multidrug efflux pump 5 2 Op 4 . - CDS 6917 - 8119 1286 ## COG0845 Membrane-fusion protein - Prom 8203 - 8262 3.7 + Prom 8087 - 8146 6.0 6 3 Tu 1 . + CDS 8278 - 9126 723 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 9161 - 9224 18.2 + Prom 9170 - 9229 4.2 7 4 Op 1 . 
+ CDS 9374 - 12631 2758 ## ZPR_0253 TonB-dependent receptor Plug domain protein 8 4 Op 2 . + CDS 12642 - 14120 1321 ## ZPR_0252 hypothetical protein 9 4 Op 3 . + CDS 14184 - 14747 469 ## gi|160883414|ref|ZP_02064417.1| hypothetical protein BACOVA_01383 10 4 Op 4 . + CDS 14751 - 16187 1225 ## BDI_2528 hypothetical protein - Term 16049 - 16098 -0.9 11 5 Op 1 . - CDS 16334 - 17449 694 ## COG4299 Uncharacterized conserved protein 12 5 Op 2 . - CDS 17471 - 19099 1188 ## COG3525 N-acetyl-beta-hexosaminidase 13 5 Op 3 . - CDS 19120 - 21630 1803 ## COG3525 N-acetyl-beta-hexosaminidase 14 5 Op 4 . - CDS 21669 - 23048 1107 ## BT_3313 hypothetical protein 15 5 Op 5 . - CDS 23085 - 24560 1363 ## Dfer_5600 RagB/SusD domain protein 16 5 Op 6 . - CDS 24572 - 27814 2735 ## Coch_0443 TonB-dependent receptor plug - Prom 27848 - 27907 4.7 - Term 27872 - 27912 -0.7 17 6 Op 1 6/0.000 - CDS 28041 - 29036 655 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 29066 - 29125 2.8 18 6 Op 2 . - CDS 29134 - 29700 422 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 29835 - 29894 8.8 19 7 Op 1 . + CDS 29970 - 32261 2604 ## COG0281 Malic enzyme 20 7 Op 2 . + CDS 32258 - 32434 118 ## BF3606 hypothetical protein 21 7 Op 3 . + CDS 32478 - 32642 69 ## gi|160882152|ref|ZP_02063155.1| hypothetical protein BACOVA_00095 22 7 Op 4 . + CDS 32683 - 34017 1640 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 34055 - 34095 8.6 + Prom 34065 - 34124 12.6 23 8 Tu 1 . + CDS 34273 - 35733 1256 ## COG0753 Catalase + Term 35760 - 35824 -0.9 + Prom 36410 - 36469 3.7 24 9 Op 1 . + CDS 36689 - 37831 896 ## BT_3745 hypothetical protein 25 9 Op 2 . 
+ CDS 37856 - 38650 654 ## BT_3744 hypothetical protein + Term 38684 - 38730 5.5 Predicted protein(s) >gi|225935339|gb|ACGA01000053.1| GENE 1 703 - 1572 627 289 aa, chain + ## HITS:1 COG:no KEGG:BT_2030 NR:ns ## KEGG: BT_2030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 303 304 449 79.0 1e-125 MSKFINPFTDYGFKLIFGREVSKDLLIEFLNDLLEGERVITDLQFLNNEQLPLYPEGRGI IYDVYCTTDTGEKIIVEMQNRMQSNFKERSIYYLSRAIVNQGRVGNEWKFEIKAVYGVFL MNFIIDKNIKLRTDVILSDRETGELFSDKFREIFIALPLFNKNEEECETNFERWIYILNN METLKRMPFKARKAVFEKLEDIADVASMSPEDRERYDNSVKVYRDYLVTMDAAEQKGIKE GLEKGIEKGMKEGTQRAQLKIARNMKAKGIDNESIAECTDLPLSIIEGL >gi|225935339|gb|ACGA01000053.1| GENE 2 1675 - 2283 365 202 aa, chain - ## HITS:1 COG:no KEGG:BT_1964 NR:ns ## KEGG: BT_1964 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 202 1 202 202 347 80.0 2e-94 MNPETLRCRILDYTKREMYQHGADGLTMDDIARGMKMSKRTLYKLFPSKTSLFRVCLSDF ANGIRSRIQQKQIRMDSSCMEALFITVDGYLTLLHSLGKKLLMDIATDREYRAAFEREEA FWLQQLIDVLTHCKICGYLLPGVDPDRFAPDLQEVIYQSSLQGTPYLVQRMLNYTLLRGL FKMDGIRYIDEHLKPDNLNVCV >gi|225935339|gb|ACGA01000053.1| GENE 3 2294 - 3658 386 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 2 452 4 455 460 153 25 2e-36 MRKIIILSAATLILSSCGIYNKYKPVSEVPEGLYGSESVAATDTANFGNLSWREVFTDPY LQNLVDSALLRNTDMQTAHLRVKEAEATLLTSKLSYLPSLFLAPEGAASSFDRGKATQTY SLPVTASWELDIFGKVTTAKRRAKAAYEQSKEYEQAVKTQLVASVANTYYTLLMLDSQYE IAVATEAAWKESVNATRAMKKAGMVNEAGLAQTEATYFNICTTVLDLKEQINQAENSLAL LLAETPHQIQRGKLGNQQLPENFSVGVPLQMLANRPDVRSAEFSLAQAFYTTNAARAAFY PSITLSGSAGWTNSAGSMIVNPGKFLASAVASLTQPLFNKGANIAQLKIAKAQQEEARLS FEQTLLNAGVEVNEALVQYQTAREKADYYNKQVASLQTAAKSTSLLMKHGNTTYLEVLTA QQTLLNAQLSQVANRFTEIQGVITLYQALGGGRM >gi|225935339|gb|ACGA01000053.1| GENE 4 3701 - 6898 3099 1065 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense 
mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 4 1025 3 1022 1051 763 41.0 0 MNLRTFIERPVLSAVISITIVVVGIIGLFSLPVEQYPDIAPPTIMVSTTYYGASAETLQK SVIAPLEEAINGVEDMTYMTSSATNSGSVSITVYFKQGTDPDMAAVNVQNRVSRATGQLP AEVTQVGVTTSKRQTSILQMFSLYSPDDSYDENFLSNYISINLKPQILRISGVGDLMIMG GEYSMRVWMKPDVMAQYKLIPSDITGVLAEQNIESATGSFGENSDETYQYTMKYTGRLIT PEEFGDIVIRSTDNGEVLKLKDVADIQLGQDSYAYHGGMDGHPGVSCMVFQTAGSNATEV NQNIDKLLEEASKDLPKGVELTQMMSSNDFLFASIHEVVKTLIEAIILVILVVYVFLQDF RSTLIPLVGIVVSLVGTFAFMAIAGFSINLLTLFALVLVIGTVVDDAIIVVEAVQARFDV GYRSSYMASIDAMKGISNAVITSSLVFMAVFIPVSFMGGTSGTFYTQFGLTMAVAVGISA INALTLSPALCALLLKPYINEDGTQKNNFAARFRKAFNSAFDMMVDKYKTIVLFFIKRRW LTWSLLACSVVLLVLLMNNTKTSLVPDEDQGVIFVNVSTAAGSSLTTTDKVMERIEKRLI EIPQLKHVQKVAGYGLLAGQGSSFGMLILKLKPWDERPGDEDNVQSVIGQVYARTADIKD ASVFAISPGMIPGYGMGNALELHMQDKMGGDMNEFFTTTQQYLGALNQRPEISMAYSTFD VRYPQWTVEVDAAKCKRAGITPDAVLSTLSGYYGGQYVSNFNRFSKVYRVMIQADPVFRL DETSLDNAFVRMSNGEMAPLSQFVTLTRSYGAESLSRFNMYNSIAVNAMPADGYSTGDAI KAVQETAEQSLPKGYGYDYGGITREENQQSGTTIIIFGICFLMIYLILSALYESFIIPFA VLLSVPCGLMGSFLFAWMFGLENNIYLQTGLIMLIGLLAKTAILLTEYAAERRKAGMGLI ASAVSAAKARLRPILMTALTMIFGLFPLMMSSGVGANGNRSLGTGVVGGMTIGTLALLFI VPTLFIAFQWLQERLRPVQSVPTHDWQIEEEIKVSEEEKSKAGKE >gi|225935339|gb|ACGA01000053.1| GENE 5 6917 - 8119 1286 400 aa, chain - ## HITS:1 COG:XF2093 KEGG:ns NR:ns ## COG: XF2093 COG0845 # Protein_GI_number: 15838684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 18 399 27 405 408 147 31.0 4e-35 MSKKVCKVKQGLLLLCCMVAATGCKQAPPAQMETEYEVMTVSPADRMISSAYSATIRGRQ DIDIYPQVSGTLTKVCVTEGQRVKNGQTLFIIDQVPYEAALQTAVANVESAKASLATAQL TYDSKEELYKENVISSFDLSTAKNSLLAAKAQLAQAKAQEVSARNNLSYTVVKSPADGVV GTLPYRVGALVSSALAQPLTTVSDNSDMYVYFSMTENQLLGLIRQYGSKEEALKNMPAID LQLNDKSAYSERGQIESISGVIDRSTGTVSLRAVFPNKEGLLHSGGAGNVIVPVQKTAAL TIPQAATFEIQDKRYVYKVVDGKAQSSQVQVTRVNGGREFIVDEGLAPGDVIVAEGVGLL REGTPVKAKAAQTPVAGVTNANQSSSTETTTKSPATTTEN >gi|225935339|gb|ACGA01000053.1| GENE 6 8278 - 9126 723 282 aa, chain 
+ ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 167 277 176 285 288 70 36.0 3e-12 MESKNSSLAILHEPFVAGMDENLSPIMQRLWKLEGGAIYFCRSGWAHVTIDLKDYEIVEN TQIVLLPGTIIRINGNSSDFTASFFGFPKEMFREACLRFEPIFFRFIKEKPCYTLKDEST GAINGLMRATTAIYNDRENRFRNQIAKNHLQSFMLDIYDKCYRYFDRQEIEGGSRQDEIF KNFIALVHENCISQREVTFYASKLCISTKYLTGICKSVTGEAAKKIIDDFAILEIKVLLQ STGLTIQEIADRLGFPDQSYLGRYFKRHEGMSPKEYQSKYSI >gi|225935339|gb|ACGA01000053.1| GENE 7 9374 - 12631 2758 1085 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0253 NR:ns ## KEGG: ZPR_0253 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 20 1085 114 1147 1147 921 46.0 0 MKRNKWLLSLSIGLLCCGSAYAQTTVKGKITSEGGEPLIGATVAVKNSTDGTVTDIDGNY SLKTKKTLTQKDLLVFSYVGYKTLQKNYAGNTMNVKLAEESQQLNDVVVTALGIKREEKG LGYATETVKGDMITSALPTNWSTALNGKVAGLSVISSGGPLNSSRISLRGDVSLNQNGND ALVVVDGVPMSSPMTNPGVAYGAGSNSELSVDYGNGFSDINPDDIESIQVLKGASATALY GTRAANGVIMVTTKSGAGAPKGIGVSYSGNFSIDDVMRWPDYQYEFGQGLPSNIGPAGSI YEGQPYYSYGKAEDGSYASTSGTSSAYGARFDANRLFYQYDPVTQGRASVATPWVAYKNN RKDLFQTGYTLTNSVALTGKSDRGSVRASITHTKNEWILPNTGFQRITAMVSAQQQISRA LRINFKSSYTYRKLNNTPALGYNSNSISYFLIFQNPNVNLDWLRPMWRTGQENVKQLQPY SSFIGNPYVILYESENPSEKHSNVSSISANLRINSKFDFMIRSGIQLSADQREQHRPISD VVFGNGFFKKQNVFDYELNSDALFTYHDSFANGLRVNASAGGNMMQQSYDMLAASVVGLI TPGVYKLANGVSNPNVQTVIKKKALNSLYFTANFSYKDKLFLDVTGRNDWSSTLPKSNRS FFYPSVSVSAVMNEWFTLPEQISLLKVRGSLAQVGNDTDPYKTSPYYGTSDFPGSAIVSS TLYNQDFKPEISTNYETGFDFRMFHNRIGLDFTFYYNRTKNQILDAPMDPTTGYSKATIN SGCVRNRGYEIQLDVTPVVSRDFRWNATFTWSKNENRILSLAEGADENQLISSIGSVSII GRVGGTTGDLWGYKLVRDPEGNVVINDNGLPERSGEIEYVGTAYPKWKAGLYNEFSYKNF TLSVLLDGQVGGKMYSHSHHKMTEQGKLQHTLNGRLPGTEFYMSKDDPRIAAAGLSPQDG MYMIAPGTVKNDDGTYSPNTKVVTVESYYKEYYRMANVETNTFDTSFLKLREMRLEYKLP NSTLKKTPFSKASVALYARNLFCITDYPLFDPESAALNGSSMVTGVETGSLPTARTFGLN INVSF >gi|225935339|gb|ACGA01000053.1| GENE 8 12642 - 14120 1321 492 aa, chain + 
## HITS:1 COG:no KEGG:ZPR_0252 NR:ns ## KEGG: ZPR_0252 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 4 492 3 483 483 370 41.0 1e-101 MKTNKLFKNIVWGMTLCGALCTTSCTSFDELNTDPTRMDEVNPGTLLNPILYETSVYNWK RYNSYTYDLMQCAVSTSSTNGVGWWYMTDSEGDGTWTTYYKWINNAKEMMRLTGKLPEAS KQPNYDAISLTLQCWLYQILTDAFGDIPMSEACSADEGILAPKFDTQQQVYQQIIDNLKT ANELFDKANGLIYNQSGEMLYKTSKGDATGILKWKKFCNSLRLRALLRVIDVPEFNAKQE LRTMLTDPATYPVFESNDDAALLAISGTYPEEAPLTRPQDFTSYVQISEFFVNLLKGWND PRLQVYATQVTLPDQTKDYVGLPSGYQTLPSITASGLNQEMAKAPMKLAMMPYAEVEFIK AELLKKGVIDGGSSAAKEAYQKGVQAAIEQWGQKMPDHYFENPEAAYDDTLERIMNQKFV ALFFCDYQQWFEYNRTGFPVLPVGPGIANANNKMPKRFKYPSALQRTNLKNYQAAKANMG GDDFHIRLMWQQ >gi|225935339|gb|ACGA01000053.1| GENE 9 14184 - 14747 469 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160883414|ref|ZP_02064417.1| ## NR: gi|160883414|ref|ZP_02064417.1| hypothetical protein BACOVA_01383 [Bacteroides ovatus ATCC 8483] # 1 187 1 187 187 353 100.0 4e-96 MNMKRIISMAVVLVITMAATAQKGELAMNENISGKNHISKNEKEATPVGKKPKAVFIFDV YGEIPTTKPVFEAEHYLGSDITGKWNTFIQNYTHEYDVTIGFTDSSVEILKPSIYKAVNK VNKYYKKALKKEEVSREVATFNMGHILDCANVMCFDDNSKSFEEALKSADEPEEIIALFN QVTLRKL >gi|225935339|gb|ACGA01000053.1| GENE 10 14751 - 16187 1225 478 aa, chain + ## HITS:1 COG:no KEGG:BDI_2528 NR:ns ## KEGG: BDI_2528 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 7 478 2 471 472 540 54.0 1e-152 MKKINFRKTLFAAVLLFQLNAIAAFGQTVVKGSVKDNTGKPISGVVVTDGAHFNTTDAEG NYVLNTDPTRYPMVYISTPAAYELPSKEGVADGFYQYLDAGKSENQCDFVLTKRQKPVDE FVYIVLSDPQVRNEKQLDRFRTETVPDLKQTADSLKNFEIVGMGLGDLVWDAMNLYAPYR QAVSNLGMTMFQLMGNHDFNLLYKSITQTDHPADGYGEQNYYQSFGPANYSFNIGKVHVI AMKDIDYDGNKKYTERFTPEDLDWLRKDLSYVPKGNIVFLNVHAPVANNTVAAGGNARNA NALFQLLRPYQVHIFSGHTHFYENQSPAPTIYEHNIGAACGAWWAGHVNRCGAPNGYLVV QVKGDDVKWRYKATGCSPDYQFRLYQPGEFESQKDYIVANIWDWDWTYTVNWYEDGVLKG AMQAFDDEDQDYINMVKGKKTGYRTRHLFRAQPSKDAKSVKVVVKNRFGEIFTEEIKL >gi|225935339|gb|ACGA01000053.1| GENE 11 16334 - 17449 694 371 aa, chain - ## HITS:1 COG:all1887 KEGG:ns 
NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 5 371 2 375 375 221 37.0 2e-57 MKSERLLSLDILRGITIVGMILVNNPGTWESIYAPLRHAEWNGLTPTDLVFPFFMFIMGV SMSFALSRFDHHFSRGFIIKLVRRTVILFLLGLFLSWFSLVCTGVEQPFSHIRILGVLQR LALAYFFGSLLIVGVRRPANLAWISGIILAGYSILLALGHGFELSEQNIIAVTDRTLFGE AHLYREWLPDGGRIFFDPEGLLSTLPCIAQVIIGYFCGNILREKTEIHHRLLQISILGIA LLFAGWLLSYGCPLNKKVWSPTFVLVTCGFASLLLVFLTWLIDIRKKQKWGYPFHVFGTN PLFIYIVAGVLATLLEVITVGGSSLQEKIYTSIWSVLPDAHLASLIYALLFIGFNYLIVW VLYKKQIFIKI >gi|225935339|gb|ACGA01000053.1| GENE 12 17471 - 19099 1188 542 aa, chain - ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 34 539 130 635 637 253 30.0 9e-67 MNYDKIKRTGILFLLGIGAITSLSCNDNDNGNYPERVPTRLSVMPLPERVDYKESVVSLP QNVTVSQNIPASTSQLLKSTLEEKLSLSVSDASNDRAFIRVKQESDLAKEAYRLTVTKEG ACVYYSTETGLLWGIQTLRQALEQANFFTSGSAKYLPMVDIKDAPKYDWRGFHIDVVRHM FTVDYLKKVIDCLSFYKINKLHLHLTDDQGWRIEVKKYPLLTQEGSWRDFDEYDKRCVEL SQQDYNYEIDPRFVRNGSQYGGHYTQEEMKGLVSYALERGIDIVPEIDMPGHFSAAIKVY PELSCTGEAGWGEEFSYPICPSRPENYQFVQSIIDEMVEIFPSEYFHIGADEVEKDNWEQ CEVCQRLMQQEGYQKVDELQNRFVKIMTNYVKGKGKKVMGWDDAFLEKEPQDLIYTYWRD WLPDQPGKITQKGYPIIFMEWSRFYLSATPSDEGLSSLYNFEFEPQFPGIVKQNVLGFQA CVWTEMIPNERKFGQHVFPSLQAFSELAWGSSRNWIDFTNRLKWHVKWLNANGFYFTKPG FL >gi|225935339|gb|ACGA01000053.1| GENE 13 19120 - 21630 1803 836 aa, chain - ## HITS:1 COG:VC2217 KEGG:ns NR:ns ## COG: VC2217 COG3525 # Protein_GI_number: 15642215 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 63 811 81 870 883 317 29.0 6e-86 MRNMLYIIWGSLCLLLVISGCKSTDGAKAPVSLTWKMGAVEVQPGYYENSFVLKNISDVP LGKDWIIYYSQLPREILQEESAPVKVEVVNANFFRMYPAENFQPLAPGDSLTVKFCCTNG LKKMSHAPEGTYWVSQSGSKQGIPLPVGLTIQPLKGMETEDWYPAPDKIYASNLALETTA 
QLQQTDIFPSVKEAVPATGKEGFAIENKVKLTFHPDFANEAELLKEKLATIHGLEVVSEA PVTVHLDYLPERETAVNGEYYRIDTGNGLINISASTSHGIFNGTQTLLSLLKGQEKPFRL EALSIRDYPDLPYRGQMLDIARNFTTVEHLKKLVDVISSYKLNVLHFHFSDDEGWRLEIP GLEELTSVGARRGHTTDELECLYPGYDGNYDPSAATSGNGYYTREEFIDLLRYAAQRHVR VIPEIESPGHARAAIVSMKARYHKYVNTAPEKANEYLLSDAQDTSRYVSAQSYTDNVMNV ALPSTYRFMEKVIRELIAMYEEAEVPLTTIHLGGDEVPEGAWMGSPVCRTFMDENGMTSA HELSEYYITKMADYLQQHHLQFSGWQEVALGHPETTDRHLNQLAAGVYCWNTVPEWEADE IPYQIANKGYPVILCNVNNFYLDLAYDAHPDERGLSWAGYVDESKGFSMLPYSIYRSSRT DMAGNPVDPDIAGKGKTTLTASGKEHIQGVQAQLFAETIRDFEWVEYYTFPKILGLVERG WNAFPAWSTLTGEKERQAFNKELGLFYSKVSEKEMPHWASRSINFRLPHPGLCIKEGQLH ASTPIRGGEIRYTTDGTEPTLRSELWKAPVACDASVVKAKLFYLNKESVTSTLKVD >gi|225935339|gb|ACGA01000053.1| GENE 14 21669 - 23048 1107 459 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 457 44 438 667 75 24.0 5e-12 MKHNNLFLWLLGLTVCWLASCSEEDGRVTYPYSCPEISELQFSTTDQTPAADSLYFSVKI HDPQTPLSTLEVKLMTGETLVSSQSIRTKGTDVSIVEKGIYIPFEAGLEENQEAKVILTA INVEGSEVTQTFDFHITRPQIPEVLYLHYNGEVVEMNRSEDNPYLYLTEVSDSEEGYPME LSGKISTTVSLDDAKLIWGAAEASNTAVLIEASGPSFAFDYRDWFVEQITFNVMTFELGV VGYQKNLKIKETELIASEGFFRAQISFTQGEEFEMSGFEDVEHAYNRDFFSYNPNNGKFT FLRKSGTWEIYYSSKYNYIWVARMGDTAPTCFWLVGHGFTCAPVWNEAYNSGGWNLEDIS QLGYIVPIGEQKYQTTVYLSNTHEWESFEIEIYSDLQWNKDKGMLLQEGSLSGDTDGIEL SGSNGITSGDGFVPGYYRLTFDASQGVGKETLHIERLSD >gi|225935339|gb|ACGA01000053.1| GENE 15 23085 - 24560 1363 491 aa, chain - ## HITS:1 COG:no KEGG:Dfer_5600 NR:ns ## KEGG: Dfer_5600 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 6 491 3 488 488 313 39.0 1e-83 MKNIIKLAYVVLCTLVVSCSSFLDEKPQSDFMQEGTGTEDQESKYGSLADAQAELQGAYE SFKADIFQSENYTIGDVQSDNCYIGGDGVAEQEFDLLKLTSTNYKVELVWSQYYSMAGTA TSVIENTKMMDPTSTVAEERNRVIAEAKFLRAWAYFDIVRLWGDAPMVLDLIPTITAENL DKWYPVMYPERSATDKIYDQILDDLNEENTIRYLVSKNKGIFQATKGAAYALRAKVLATR GEKSTRDYTKVVEACDKVIAEGYTLVSNFDELWQPDKKFSSESIFEVYYTSDAPNWAYWV 
LLKEDDGSVTWRRYCTPTHDLVAKFDKEKDTRYASSILWKSVPYDTYWPADSYPLSYKIR EKTSNIILMRLADILLLKAEALVELDRTPEAIRIVNSIRERAGFAPSSLDENMGQARGRL AVENERQLELYMEGQRWFDLVRNNRLLEVMQKHKDKDGRLLFAGLQAFRQLWPIPQGEKD KNTNLTQNEGY >gi|225935339|gb|ACGA01000053.1| GENE 16 24572 - 27814 2735 1080 aa, chain - ## HITS:1 COG:no KEGG:Coch_0443 NR:ns ## KEGG: Coch_0443 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.ochracea # Pathway: not_defined # 117 1080 29 989 989 906 49.0 0 MRKQLHIIFRLVMFSCILLFVSQEASAQITFSVKNQTIRQTIRVIEKKADYSFFYTDKLP GLDQKISLTVSNESIDAVLEKVFKGSSISYKIESGKQVVLTLKSEQNAPQKGDKKTITGV VSDRRGEPLIGVTVRVEGENIGTATDIDGRFSLEAPVGAKVSFSYIGYVSQSHTVNKQNV YNVSLSEDSQVLEEVVVVGYGFMKRKDITTAVSVVSTADIEERPIMTAAQAIQGKAAGIQ VVQPSGMPGSGVTIRVRGATSVQASNEPLYVVDGLPSDDISNISPNDIESMQILKDASSA AIYGARAANGVVLITTKRGKIGAPQVKMSAYVGFSKLGKKIDALNTEQYKDLMKDLKAVS DVAPNIPESETRYVDWTDLFFGTGVNQNYQLSVANGTEKLQYFVSGGYSDEQGIVEKAHF NRYNFRANLDSEQTKWLKMALNFAYSHTGGQWVNESRSSLRAGSILSVVNTPPFMQKWNP YDPNEYDEQAYGARILNPLAANAADSNTNTDHINGSLGFTVDIFKGLKFKTTFGIELTNE HWDYYLDPISTSDGRGTKGRVEESFSRNFEWLFENLLTYDCSFNKHNLSILGGATQQRAQ YNGSWMAGFDLAESYPDIHSISAANQLDKDACGSSASAWTLASFLGRVAYNYDSRYLLTV NFRADGSSRFAPGHRWGTFPSVSAGWRISGEKFMQPLQDIVTDLKLRAGWGMNGNQGGFG NYAYMASMSVSKLPVSEGNLYPGLAIKPGSAANKELTWEKTSQWNLGLDLTMFDSRLSFS VDAYYKKTTDLLLTVSLPENVVPSSVTRNDGEMVNKGMEFTLSSQNFKGNFQWNTDFNIS FNRNKLTKLGLNKVYYYAEMYESKEKAVILKEGLPLGTFFGYISEGVDPETGDIIYRDLN GNGFIDPEDRTTLGNAQPKFIYGMTNTFSYAGFDLSVFLQGSQGNKIFNASRIDMEGMTD FRNQSVRVLDRWKRPGMITNVPRVGNTENNHNSSRFVEDGSYLRLKTVTLSYNFPKKWLN KIHLSRLQAYVTGQNLFTLTKYKGYDPEVNAFGGDSVAQGVDYGTYPQSRAVIFGLNIEF >gi|225935339|gb|ACGA01000053.1| GENE 17 28041 - 29036 655 331 aa, chain - ## HITS:1 COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 122 295 111 278 311 70 30.0 4e-12 
MEQHDYIVLLEKFFKKEASSEERRILIEWIRKPGIRNEFNLLCEQMWKEAAVEIDKTVEE EMWNHLQRNLDEPKILSPKKRRWQLIVYKVAATILLPVCLGLATYFGVEHMNKVSQDPFM VAVDYGQKANLTLPDGTKVWLNSATHLSYDAEYNKSDRKIYLDGEAYFEVAKNKEKRFIV CCNDLEIEALGTTFDVKGYCDDHSVTTLLAEGSVKVSNKTDVTLLKPGEKVEYHKNKQTF TKSAISDMREIDFWRNNMLIFNSSSLAEIATTLERMYGVKVVFDSEKLKNVPFSGTIRNS SLHNVFYIISLTYPLTYELEGDTVRIGSSIN >gi|225935339|gb|ACGA01000053.1| GENE 18 29134 - 29700 422 188 aa, chain - ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 2 174 4 184 187 70 26.0 1e-12 MITTQQLQLLKEGNKNAFEALYRGYNARIYNFVLSMVSNAGVAKDITQDIFLHIWEKRLN IDLEGNFDGYLFKISQNMVYHYVRRELLLQNYVDRLANESSDESVEIDEELDYLFLEEYI LKLLEELPPARREVFMLYWKSGLNYREIADQLNISEKTVATQVHRSLDFLRDKLGIIAFS VSLFLHDI >gi|225935339|gb|ACGA01000053.1| GENE 19 29970 - 32261 2604 763 aa, chain + ## HITS:1 COG:STM2472_1 KEGG:ns NR:ns ## COG: STM2472_1 COG0281 # Protein_GI_number: 16765792 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Salmonella typhimurium LT2 # 1 429 1 429 434 516 62.0 1e-146 MAKITKEAALLYHSQGKPGKIEVVPTKPYSTQTDLSLAYSPGVAEPCLEIEKNPQDAYKY TAKGNLVAVISNGTAVLGLGDIGALSGKPVMEGKGLLFKIYAGIDVFDIEVDEKDPEKFI AAVKAIAPTFGGINLEDIKAPECFEIERRLKEELDIPVMHDDQHGTAIISSAGLVNALQV AGKKIEDVKIVVNGAGASAVSCTKLYVSLGARLENIVMLDSKGVISKVRTDLNEQKRFFA TDRTDIHTLEEAIKGADVFLGLSKGNVLSQDMVRSMAPMPIVFALANPTPEISYEDAMAA RPDVLMATGRSDYPNQINNVIGFPYIFRGALDTHAKAINEEMKIAAVHAIANLAKQPVPD VVNAAYHVNNLSFGAEYFIPKPVDPRLITEVSCAVAKAAMESGVARTEIKDWDAYCVHLR ELMGYESKLTRQLYDTARRSPQRVVFAEGIHPNMLKAAVEAKAEGICHPILLGNDEAIGK LAEEMDLSLEGIEIVNLRHPDESDRRERYSRILAEKRAREGFTYEEANDKMFERNYFGMM MVETGDADAFITGLYTRYSNTIKVAKEVIGIQPGFKHFGTMHILNSKKGTYFLADTLINR HPDTETLIDIAKLSDKTVRFFNHTPVISMLSYSNFGADTAGSPVKVHEAVAHMQEEYPEL AIDGEMQVNFAMNRELRDTKYPFTRLKGKDVNTLIFPNLSSANAGYKLLQAMDPDTEFIG PIQMGLNKPIHFTDFESSVRDIVNITAVAVIDAIVDKKKRGGK >gi|225935339|gb|ACGA01000053.1| GENE 
20 32258 - 32434 118 58 aa, chain + ## HITS:1 COG:no KEGG:BF3606 NR:ns ## KEGG: BF3606 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 58 1 58 58 75 82.0 4e-13 MKRPLSKSQIVCISLLWLALCYLVLTKAERIDGMTVVMLGISAALVFIPVYKSIKKNK >gi|225935339|gb|ACGA01000053.1| GENE 21 32478 - 32642 69 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160882152|ref|ZP_02063155.1| ## NR: gi|160882152|ref|ZP_02063155.1| hypothetical protein BACOVA_00095 [Bacteroides ovatus ATCC 8483] # 1 54 1 54 54 75 98.0 7e-13 MFYRIFLYKMFASSNFIFTFATENKYPMYPYADVQICYIAQKHIVKRYISKLSN >gi|225935339|gb|ACGA01000053.1| GENE 22 32683 - 34017 1640 444 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 7 444 9 445 445 582 62.0 1e-166 MNAAKVLDDLKRRFPNEPEYHQAVEEVLSTIEEEYNKHPEFDKVNLIERLCIPDRVYQFR VTWMDDKGNIQTNMGYRVQHNNAIGPYKGGIRFHASVNLSILKFLAFEQTFKNSLTTLPM GGGKGGSDFSPRGKSNAEVMRFVQAFMLELWRHIGPETDVPAGDIGVGGREVGFMFGMYK KLAHEFTGTFTGKGREFGGSLIRPEATGYGNIYFLMEMLKRKGTDLKGKVCLVSGSGNVA QYTIEKVIELGGKVVTCSDSDGYIYDPDGIDREKLDYIMELKNLYRGRIREYAEKYGCKY VEGAKPWGEKCDIALPSATQNELNGDHARQLVANGCIAVSEGANMPSTPEAIKVFQDAKI LYAPGKAANAGGVSVSGLEMTQNSIKLSWSAEEVDEKLKSIMKNIHEACVQYGTEADGYV NYVKGANVAGFMKVAKAMMAQGIV >gi|225935339|gb|ACGA01000053.1| GENE 23 34273 - 35733 1256 486 aa, chain + ## HITS:1 COG:NMA0050 KEGG:ns NR:ns ## COG: NMA0050 COG0753 # Protein_GI_number: 15793081 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Neisseria meningitidis Z2491 # 6 483 11 488 504 713 70.0 0 MEKKKLTAANGRPIADNQNSQTAGQRGPVMLQDPWLIEKLAHFDREVIPERRMHAKGSGA YGTFTVTHDITKYTRAAIFSKVGKQTECFVRFSTVAGERGAADAERDIRGFAMKFYTEEG NWDLVGNNTPVFFLRDPLKFPDLNHAIKRDPKTNMRSPNSNWDFWTLLPEALHQVTITMS PRGIPYSYRHMHGFGSHTYSFINAENQRIWVKFHLRTLQGIKNLTDQEAEAIVAKDRESH 
QRDLFESIEKGDYPKWLFQIQLMTEEEADNYRINPFDLTKVWPHKDFPLQDVGILELNRN PENYFAEVEQAAFNPMNIVDGIGLSPDKMLQGRLFSYGDAQRYRLGVNAEQIPVNKPRCP FHAYHRDGAMRVDGNYGATKGYEPNSYGEWQDSPDMKEPPLKVTGEVYNYNEREYDDDYY SQPGDLFRLMPSEEQQLLFENTARAMGDSELFIKQRHTRNCYKADPAYGTGVAKALGIDL QEALKE >gi|225935339|gb|ACGA01000053.1| GENE 24 36689 - 37831 896 380 aa, chain + ## HITS:1 COG:no KEGG:BT_3745 NR:ns ## KEGG: BT_3745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 369 1 372 374 350 50.0 4e-95 MKLKKLFAGTILCLAVASCIQDEAQNVEAAIDGCSGNHIQQYLIDRNDFTVQLYVSKAAD PSKININFDLPTGASVKPVRQLTGDEANVYNFEDENPREFKVTSEDGAFSATYTIKLWQT EMPLAYDFETLSSDAPYHKFYEDKSSENIIRRLELASGNPGFELTKMAKTPEDYPTVQVN GGVNGGKCVKLTTKDTGSFGSMVKMYIAAGNLFVGSFEVGQALNNAMKATHFGFPFFYYP LKLEGWYKYKAGTNFSSKGEIVEGKKDKCDIYGVLYETDDNVQFLDGSTSLTSPNIVALA RNLDALPETDNWQQFSFNFEPKNGKSIDPDKLEKGIYKLGIVFSSSVDGAKFEGAVGSTL HIDQVEIVCTSNPSEYPVNQ >gi|225935339|gb|ACGA01000053.1| GENE 25 37856 - 38650 654 264 aa, chain + ## HITS:1 COG:no KEGG:BT_3744 NR:ns ## KEGG: BT_3744 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 264 5 261 261 418 80.0 1e-115 MKKTIYNKVVGIVFLTMLFSHVAYAQNERNKALIYSYLHGWEYSIKAGLSIGGTSPLPLP KEIRSIDSYAPNIAIAIEGNATKWFGNDKKWGMTAGIRLENKTMTTEATVKNYGMKIINT NGGELQGLWTGGVKTKVKNSYLTIPLLANYKISDRWKISLGPYFSYMTEGSFSGHVYEGH LRTPDETGQRVDFNGESIATYDFSDNLRKFQWGAQLGGEWKAFKHLNIYADLTWGLNDIF KKDFDTITFAMYPIYLNIGFGYAF Prediction of potential genes in microbial genomes Time: Fri May 13 10:30:12 2011 Seq name: gi|225935338|gb|ACGA01000054.1| Bacteroides sp. D2 cont1.54, whole genome shotgun sequence Length of sequence - 87901 bp Number of predicted genes - 66, with homology - 66 Number of transcription units - 30, operones - 13 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 339 - 385 7.2 1 1 Tu 1 . - CDS 409 - 2433 1521 ## BT_3742 hypothetical protein - Prom 2612 - 2671 6.0 - Term 2680 - 2719 1.3 2 2 Tu 1 . 
- CDS 2746 - 3279 483 ## BT_3742 hypothetical protein - Prom 3453 - 3512 9.5 3 3 Op 1 . - CDS 3546 - 5138 1247 ## BDI_2795 putative lipoprotein 4 3 Op 2 . - CDS 5141 - 6394 883 ## BF3629 hypothetical protein 5 3 Op 3 . - CDS 6399 - 7244 684 ## BF3629 hypothetical protein 6 3 Op 4 . - CDS 7257 - 8000 713 ## BF3629 hypothetical protein 7 3 Op 5 . - CDS 8008 - 10179 1411 ## BDI_2795 putative lipoprotein 8 3 Op 6 . - CDS 10176 - 11456 967 ## BF3631 hypothetical protein 9 3 Op 7 . - CDS 11456 - 13978 2000 ## BDI_2794 hypothetical protein - Prom 14138 - 14197 7.5 - Term 14190 - 14246 11.1 10 4 Tu 1 . - CDS 14271 - 17246 2833 ## BT_1972 phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 17383 - 17442 4.4 + Prom 17317 - 17376 5.7 11 5 Tu 1 . + CDS 17451 - 19535 1966 ## COG1505 Serine proteases of the peptidase family S9A + Term 19555 - 19613 9.4 + Prom 19566 - 19625 6.7 12 6 Tu 1 . + CDS 19736 - 21073 1520 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 21220 - 21268 11.4 + Prom 21338 - 21397 5.4 13 7 Op 1 . + CDS 21417 - 22580 1223 ## COG0006 Xaa-Pro aminopeptidase + Term 22596 - 22635 -0.4 14 7 Op 2 . + CDS 22663 - 24225 1358 ## COG3119 Arylsulfatase A and related enzymes 15 7 Op 3 . + CDS 24275 - 24661 236 ## COG3119 Arylsulfatase A and related enzymes 16 7 Op 4 . + CDS 24676 - 25800 913 ## COG3119 Arylsulfatase A and related enzymes + Term 26020 - 26086 6.0 17 8 Tu 1 . - CDS 26245 - 27678 1434 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase - Prom 27729 - 27788 5.0 + Prom 27626 - 27685 4.8 18 9 Tu 1 . + CDS 27774 - 28610 956 ## BF3443 putative lipoprotein + Term 28815 - 28845 1.0 - Term 28801 - 28831 1.0 19 10 Op 1 . - CDS 29011 - 30003 858 ## BT_1977 hypothetical protein 20 10 Op 2 . - CDS 30005 - 30607 604 ## COG0632 Holliday junction resolvasome, DNA-binding subunit - Prom 30653 - 30712 9.9 + Prom 30555 - 30614 5.8 21 11 Tu 1 . 
+ CDS 30795 - 31694 1106 ## BT_1979 meso-diaminopimelate D-dehydrogenase + Term 31724 - 31764 9.2 + Prom 31702 - 31761 10.2 22 12 Op 1 . + CDS 31790 - 33829 1349 ## Fjoh_0510 hypothetical protein 23 12 Op 2 . + CDS 33859 - 35871 1332 ## ZPR_3520 transglutaminase-like superfamily protein 24 12 Op 3 . + CDS 35890 - 37809 1281 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 25 12 Op 4 . + CDS 37890 - 38540 511 ## COG1272 Predicted membrane protein, hemolysin III homolog + Term 38708 - 38763 4.6 + Prom 38689 - 38748 4.4 26 13 Tu 1 . + CDS 38872 - 39573 645 ## COG0120 Ribose 5-phosphate isomerase + Term 39678 - 39717 3.9 - Term 39854 - 39894 -0.8 27 14 Tu 1 . - CDS 40038 - 40259 230 ## BVU_1841 hypothetical protein - Prom 40449 - 40508 6.0 + Prom 40482 - 40541 6.7 28 15 Op 1 . + CDS 40635 - 43571 2723 ## BDI_0500 hypothetical protein 29 15 Op 2 . + CDS 43583 - 45196 1580 ## BDI_0503 hypothetical protein + Term 45277 - 45312 1.3 - Term 45265 - 45300 1.3 30 16 Tu 1 . - CDS 45440 - 45838 390 ## COG0784 FOG: CheY-like receiver - Prom 45894 - 45953 5.3 + Prom 45801 - 45860 7.7 31 17 Tu 1 . + CDS 45958 - 47595 1541 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 47626 - 47666 -0.7 + Prom 47597 - 47656 5.0 32 18 Op 1 . + CDS 47802 - 48104 95 ## BT_2037 hypothetical protein 33 18 Op 2 11/0.000 + CDS 48193 - 49428 1135 ## COG0845 Membrane-fusion protein 34 18 Op 3 . + CDS 49506 - 52604 2959 ## COG3696 Putative silver efflux pump + Prom 52682 - 52741 2.9 35 19 Tu 1 . + CDS 52762 - 53946 978 ## BT_2040 hypothetical protein 36 20 Tu 1 . - CDS 54033 - 54218 136 ## gi|160882715|ref|ZP_02063718.1| hypothetical protein BACOVA_00673 + Prom 53977 - 54036 7.5 37 21 Tu 1 . + CDS 54204 - 55034 488 ## BT_2041 hypothetical protein + Term 55186 - 55230 7.2 - Term 54860 - 54894 -0.9 38 22 Tu 1 . - CDS 55031 - 56494 1069 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Prom 56580 - 56639 6.1 + Prom 56476 - 56535 2.9 39 23 Op 1 . 
+ CDS 56604 - 56726 110 ## gi|237720132|ref|ZP_04550613.1| conserved hypothetical protein 40 23 Op 2 . + CDS 56710 - 57384 565 ## BT_2043 hypothetical protein 41 23 Op 3 . + CDS 57381 - 57932 326 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 42 23 Op 4 . + CDS 57913 - 58242 414 ## BF3732 hypothetical protein + Prom 58271 - 58330 1.9 43 24 Op 1 . + CDS 58363 - 59634 913 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 44 24 Op 2 16/0.000 + CDS 59674 - 60468 811 ## COG0207 Thymidylate synthase 45 24 Op 3 . + CDS 60473 - 60967 264 ## COG0262 Dihydrofolate reductase + Term 61198 - 61228 1.6 - Term 60882 - 60937 3.3 46 25 Tu 1 . - CDS 60971 - 61450 376 ## COG1522 Transcriptional regulators - Prom 61696 - 61755 4.7 + Prom 61472 - 61531 6.0 47 26 Op 1 . + CDS 61603 - 62880 821 ## BT_2050 hypothetical protein 48 26 Op 2 . + CDS 62924 - 63463 393 ## BT_2051 hypothetical protein + Term 63469 - 63512 10.3 - Term 63460 - 63497 7.3 49 27 Op 1 . - CDS 63528 - 65117 1337 ## BT_3274 hypothetical protein 50 27 Op 2 . - CDS 65138 - 65794 689 ## BT_3273 hypothetical protein 51 27 Op 3 . - CDS 65814 - 67274 1324 ## BT_3272 putative outer membrane protein 52 27 Op 4 . - CDS 67310 - 70462 2819 ## BT_3271 hypothetical protein 53 27 Op 5 . - CDS 70511 - 73123 2421 ## BT_3275 hypothetical protein 54 27 Op 6 . - CDS 73143 - 75713 1847 ## BT_3275 hypothetical protein 55 27 Op 7 . - CDS 75727 - 78423 2292 ## BT_3275 hypothetical protein - Prom 78482 - 78541 9.6 - Term 78591 - 78637 11.2 56 28 Op 1 . - CDS 78661 - 79131 517 ## BT_2052 hypothetical protein 57 28 Op 2 . - CDS 79135 - 79728 554 ## BT_2053 hypothetical protein 58 28 Op 3 . - CDS 79764 - 80249 418 ## BT_2054 hypothetical protein 59 28 Op 4 . - CDS 80256 - 81056 863 ## COG0811 Biopolymer transport proteins - Prom 81267 - 81326 4.0 - TRNA 81128 - 81215 60.5 # Ser GGA 0 0 60 29 Op 1 .
- CDS 81336 - 82112 563 ## COG0084 Mg-dependent DNase 61 29 Op 2 . - CDS 82106 - 82822 508 ## BT_2057 hypothetical protein 62 29 Op 3 . - CDS 82838 - 83812 876 ## COG0142 Geranylgeranyl pyrophosphate synthase - Term 83828 - 83862 5.1 63 29 Op 4 . - CDS 83882 - 84565 665 ## BT_2059 TonB - Prom 84585 - 84644 6.9 + Prom 84679 - 84738 5.5 64 30 Op 1 2/0.000 + CDS 84782 - 85468 200 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 65 30 Op 2 . + CDS 85477 - 86346 375 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase + Prom 86459 - 86518 2.7 66 30 Op 3 . + CDS 86546 - 87526 1138 ## COG0205 6-phosphofructokinase + Term 87567 - 87630 11.1 Predicted protein(s) >gi|225935338|gb|ACGA01000054.1| GENE 1 409 - 2433 1521 674 aa, chain - ## HITS:1 COG:no KEGG:BT_3742 NR:ns ## KEGG: BT_3742 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 287 672 128 502 505 162 35.0 3e-38 MRKNLFYYLFTVLCTVALFTSCSDDDKNGNDGGEGLGLNSYVVGTYNSQLKVFLEGVDLT ESAPISQRIFVKSEDENKVTVSLRNFAIDIAGAPLTVGDIIVGGVVLEGDASQVVLQETK TTIQHADLGTLDITVSGNVVAEKVNMTINVVQHMPEDPSTTMNIEVKVTGTRISTEADDA DYSTVFAGWYPRTENGFTCDYTGEGFELTEPSKGITMVAKGYNKISLSSFSVSFPVKPGF TASAYKYQGLKADEVAIEKQADGSFTIGEYKGTTTAVANRREATEYTISGSISSDKEMTL KINIKSETYNVNYTFVSGVLKTEKELLSVTINEDVLAYKPSIVTSSMGGNWIILYAKPDA QAEQLKIVPKLVVSDGAKVFYNGELYDGGAIDCSKTQRINIVSEKDYSEGKTTGEEYIIS CGILNFATDLEQWELKNNTDDEHMKYYEPVGGWTTSNPGVEYLKSMSLFTKYDKTKPYAV HEAAAGYSGKAAKLTTLNSSGSALAMVPYVTSGSLFNGVFITEISNTLKSTRFGQPCDRE PKTFSGVYKYTAGEKYYVCPDPKKANVANLDESKTDAPAMNAVLYEVSDYMADYLDGTNL LTSEKIVAIASVKDAGEKAEWTTFNVSFEWKDGKSWNASNKYKLAIVCSSSKDGDKFSGA PDSVLYIDDLKVSF >gi|225935338|gb|ACGA01000054.1| GENE 2 2746 - 3279 483 177 aa, chain - ## HITS:1 COG:no KEGG:BT_3742 NR:ns ## KEGG: BT_3742 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 4 160 505 63 34.0 4e-09 
MKRISFYLFAMLIGTMSFLMSCSDDDDDDTTVASEKVAGTYNGNLVVKLMGQEVANETKS LSLVKNGDDAVNVVIKDFILSVSIGEASIPVSLGDLKVEKCALTQKDGKYTFSGATELKG IKVPITETSELPVDCTVDIADATVDGKNLSLPIVVGVSMSGGEKFMDVNVQYTGVKQ >gi|225935338|gb|ACGA01000054.1| GENE 3 3546 - 5138 1247 530 aa, chain - ## HITS:1 COG:no KEGG:BDI_2795 NR:ns ## KEGG: BDI_2795 # Name: not_defined # Def: putative lipoprotein # Organism: P.distasonis # Pathway: not_defined # 45 527 18 518 518 296 37.0 1e-78 MKSKGRNRLLSLIICICIAFAFSSCVKNDLPYPVVRLSITGFEVNGQIGSAVINNEERTV TVDLLDSVDLKKVIVTRFDYSEEANTDLKVNTTIDLSSPKEVVLSLYQDYRWKIIANQTV ERVFSVKNQVGGAVIDEKARQAIVYVNKNTMLNKITVKDLKLGPISSTVSPDFTTLKDFT QEQKVNVTFKGKTEEWSLYAFITDKVVFTNSADGWTNVAWLYGEGQEDVVNGFEIREASS EEWTRVDQDIVVQNGVNFYVCVPHLKADTEYVCRALVDGTEDRGEEITFRTTAAIPLTGG TFDNWNQSGKVWNPWGSNETPFWDTGNQGATTLGDSNSVPTEDIWSGKTTGMAAKLESKF VGVGSAGKLAAGNLFVGKYVATDGTNGILNFGQSFSAYPTHLKGYYKYQPGVIDNSSIEY NYLKGRTDTCSIYIAIGDWDSPIEIRTKPSNRKLFDKNDKHVIAYAEFTSGTSVAVYKEF DLKLEYRATNRKPTYIVLVCSASKYGDFFTGASGSVLYVDEFSLEYDYDN >gi|225935338|gb|ACGA01000054.1| GENE 4 5141 - 6394 883 417 aa, chain - ## HITS:1 COG:no KEGG:BF3629 NR:ns ## KEGG: BF3629 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 195 415 21 239 241 161 37.0 4e-38 MRHRKKYILLFLLLVVCAGLVFFYGIKSEGQEDKQEVLSSVEDTHQVPEAVVDSACQIIG EVEENVSPAENSVEKVVKKPEERTVEKSEERSVGKFVDKKQADKRQKVASKREAVAVPPV SSTDIKLASDVVNTPLKLEGTICDLVLSERQQDSKITAALPVHAVETVKREFDWKSDTLE TKRGIDLLRIKRVGRFDRGIVNYRFMPKGKWMFGTDFSFWDYNSEDSKLLFTYLDNFDFE ARTINVAPFIGYFFKDNQVVGVKLGYKHTDGHLGNVSFKIDDDVNFTLKELKLKEELYNC TFFHSSYIGLDPGKRFGLFNETNLRLGFGHSEFTRGSGESLKNTKTNIFEAQLGINPGVA VFIMQNVSVECSVGVIGLKYRKESQVNNLGEKGSRTNGGANFKINLFDINLGLTISM >gi|225935338|gb|ACGA01000054.1| GENE 5 6399 - 7244 684 281 aa, chain - ## HITS:1 COG:no KEGG:BF3629 NR:ns ## KEGG: BF3629 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 59 279 21 239 241 139 36.0 1e-31 MYKRLLVTVLVSVFVFSLWGKTETVCQSLEVEDSVPVFQKDTLPVKGSWDFLRIKRIGRY 
DRGIMNYRFIPKGKWISGFTMSYWDYNSADNKLLFAYFDNFDCDGRNLNFSVYGGYAVRD NMVLGLKFGYRNMRGALNNITLKIDDDVDFSLKDLKLEQNLYNVSFFHRSYVGLDAGKRF GLFNETALTFNFGKSKFDRGVEENLVNTQTDIFEVQLGINPGLAVFIMQNVSIECSLGIA GIRYRQEKQKNNLGETGKRMTGGSNFRVNPFNISLGLTICL >gi|225935338|gb|ACGA01000054.1| GENE 6 7257 - 8000 713 247 aa, chain - ## HITS:1 COG:no KEGG:BF3629 NR:ns ## KEGG: BF3629 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 247 4 241 241 207 44.0 3e-52 MNYFIKTSFLIILGIIVGTTQTCAQQEFKRNIEMRTFIPKGQWIVGSSVSYSQYSNDNYQ FLVIENLSGEGYAFKVSPVLCYAFKDNVAAGGRFMYSRTYTKLNNISVNIDEDTNFDISN LYELKNSYSGIAVLRNYINLGTSKRFALFNEVQLELGGSVSKVVNGKGDDLTGVYQTSLD ASVGLSPGLVAFINNYTAVEVSIGVLGLNVSRTHQVTDRIYESNRSSMSANFKINLFSIG LGIAFYL >gi|225935338|gb|ACGA01000054.1| GENE 7 8008 - 10179 1411 723 aa, chain - ## HITS:1 COG:no KEGG:BDI_2795 NR:ns ## KEGG: BDI_2795 # Name: not_defined # Def: putative lipoprotein # Organism: P.distasonis # Pathway: not_defined # 409 723 218 518 518 187 37.0 1e-45 MKKLIHILVIAFIVGLVCVSCRESDLTNNGEGALSLKIGVQRELPTVSAGKTRATLTDEE LLQKCKVYIRNAEGLVRKFATLDEMPEQILLVPDNYKAEVTAGDSVAASFTDSYFKGTKE FTISKGTAISESVVCGIRNTVVSLVLSDELQEAFPTYKITVSNRLGTLEYTQDNISQLGY FMLLDDENQLRWNFVGQLNGSETEITKSGVINLVRKATRYDLTFSNSSSEVGGGMISVNV VETALTTDEKVDILARPVIKIIENDQIYELDNAVYRAQFDKETPIVVRMAASVPFTELKV TSADFARLGITHFSSFDLMELTEVQIRELDNLGFSIQFNADMTKVQFAFKENLREIMTNS IDDGNNTYRFDIEVRDRNGKDRAKTLTVIATDALVATEDIATQNDIWSTKATLYGSLVAS VTEMNFRYRALGSNDWMYSNQAVLNNKIFTSQITGLEPGITYEYQAMAGTVAAKETKQFT TEAPAQLPNAGFEDWHGSSPLYIYKSGGEMFWDSGNTGSATLNKNVTTYDATIKHSGNYS AKLQSQYVALFGIGAFAAGNVFVGKFLKIDGTDGILRFGRPFTSRPSKLVLYYRYVAGTV DYSTLSELPKNSKDMGSIYIALGDWPHQTFEGETDIPVLVKTKPADRQLFDSNSSNVIGY GECILKESTEGEGLIKLEIPINYRSNRKPTDIILVGSASRYGDYFTGSTGSTLWLDDLEL IYE >gi|225935338|gb|ACGA01000054.1| GENE 8 10176 - 11456 967 426 aa, chain - ## HITS:1 COG:no KEGG:BF3631 NR:ns ## KEGG: BF3631 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 421 1 440 442 78 24.0 6e-13 
MSNMKAIKKLVFICLASILGLCSACMNDNPDFGRDDEGTGNTNQGEGQVRLAGTSVSVSV DVLSRADGKAINTDDYIIRIYSTDNNLLVKEYTRYRDMPEILTLNVGNYKIEALSHDVKP AEWEKPFYKGTQTFTIKKDEVTSVDVIKCFLQNIMVTVDLNDDLKQKVGGDQSITIKVGT GELTFNNNDIGSGRAGFFEAAEVSNIIIVHFQGTVDGEWTEEQTSFSNVKGGEHRNIVFN LEIPSVGDMGLSVKLDAVCKKVDLLAWANPGAEDILPEDPSVNPGTSAPTIVGQGFNIMD EIEVEKDETKTIIVNITADNGIQNLKVKIDSETLTPEILKSVGLSSEFDLAHPANADLEK ALKSLGFPVGSEVIGVKYLVFDITQFTPLLGLYGLASHNFIITVVDQANNTRTATLALKS IAKKTE >gi|225935338|gb|ACGA01000054.1| GENE 9 11456 - 13978 2000 840 aa, chain - ## HITS:1 COG:no KEGG:BDI_2794 NR:ns ## KEGG: BDI_2794 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 490 840 236 614 615 136 33.0 4e-30 MKGILKFILFLEIVLLCSCVNDEKRSGAFGGMSLHLSANSSVTDVTTRSSEEVLPSIQDF SISVLQGDKVQASWDKLSDYDEDTTFPVGSYTLKAFYGDIEKEGFDSPYYEGTTDFNIRG GETTPVETTCKLANTKISIEYTDDFKQYFKTYSSTVQAELGSEVSFSSGETRAAYVKPGR ISVKLTFTKVNGGLSPTTIEVATIEEALAQHHYHLQMNVDAGKAMLSIVFDRVTEVRPIT LDISDKALNIKPPYFTLTGFEKTSNDGNQWDGNPIESNQLSALLTSLGGFTKCILRTTSP NLPDWPEEGFNLAALTPEDQALLDKSGVKLIGFGINQDQMGIINFTGLIPNLNITDNNDT HLFYLSATSTYGKQSEDYVLNITTPKNFMLLPTEPVKMKSKEVTIPVKLKEGNPQNIKLY YRYYGVMTLINNTVITPIEGKEGYYNIKASGIDMGVVAKDFQAEYNGLKSAIVSVAVIIP SYSVILEPFDVWSYTAEMTIIPEYAEDMSNVMSAIEAYISLNGSTWTKVDAKDLKLDVTT GKAVISGLTPGTTYYFRTTCDDGTTYSIPVMQATESVIQLPDFTQGWGKTIFSGTINKGG KYTHGAFGTYKFDTTDLTVFDLDGNGAWVTVNQKTVPTSGNLNSWYMVPSSIKQNNGVLM RNVAWTMTSADPPGKSALFGNSLTDLTPPTHGHRSAGKLFLGSYSYNHSTNVEIYNEGIS FTSRPIKLKGTYTYVANNDQGGIVTVIVENREGGQTLKLAEGSEVLTATSSQKDFTVDLT YNTMFEKKATHLRVMFASSSNASNTQSVEDSKIKTTDNKGEAVSTGSELYIDKNIILEYK >gi|225935338|gb|ACGA01000054.1| GENE 10 14271 - 17246 2833 991 aa, chain - ## HITS:1 COG:no KEGG:BT_1972 NR:ns ## KEGG: BT_1972 # Name: not_defined # Def: phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 990 1 990 990 1952 96.0 0 MLSKYKLNQLYFKDTQFANLMTRRIFNVLLIANPYDAFMLEDDGRIDEKIFNEYTSLSLR YPPRFSQVSTEEEALSQLENMSFDLVICMPSTGDNDSFDIGRHIKEKYEHIPIVILTPFS 
HGITKRIINEDLSAFEYVFCWLGNTDLLVSIIKLIEDKMNLEHDVQEVGVQLILLVEDGI RFYSSILPNLYKFVLKQSQEFSTEALNAHQRTLRMRGRPKIVLARTYQEAMEIYHKYQNN ILGVITDVRFPKVERGEKDGLAGIKLCAEIRKNDPFVPLIIQSSESENSSYAAKYGASFI DKNSKKMDVDLRRIVSDNFGFGDFIFRNPDTGEEIARVRNLKELQNILFAVPAESFLYHI SRNHVSRWFYSRAMFPVAEFLKPITWNSLQDVDAHRKIIFEAIVKYRKMKNQGVVAVFKR DRFDRYSNFARIGDGSLGGKGRGLAFIDNMVKRHPEFDEFENARIAIPKTVVLCTDVFDE FMDTNNLYQIALSDADDDTILKYFLKAKLPDRLVEDFFTFFDVVKSPIAIRSSSLLEDSH YQPFAGIYNTYMIPYLDDRYEMLRMLSDAIKGVYASVYFRDSKAYMQATSNVIDQEKMAV ILQEVVGNQYGDRYYPSMSGVARSLNYYPLGNEKAEEGTVNLALGLGKYIVDGGMTLRFS PYHPNQVLQTSEMEIALKETQTRFYALDLKNAGHDFSIDDGFNLLKLHVKEAESDGALRY IASTYDPYDQIIRDGLYPGGRKVITFANILQHDVFPLARILQLVLKYGEQEMRRPVEIEF AATLSREHDKSGTFYLLQIRPIVDSKEMLDEDLNEIPDEDVILRSYNSLGHGIMNEVYDV VYVKTDNYSASNNQTIAWEIEKINQQFLNEGKNYVLVGPGRWGSSDTWLGIPVKWPHISA ARVIVEAGLTNYRVDPSQGTHFFQNLTSFGVGYFTINAFMNDGVYNQDFLNAQPAVQETN YLRHVRFEKPVVVKMDGKKKLGVVLMPGVDK >gi|225935338|gb|ACGA01000054.1| GENE 11 17451 - 19535 1966 694 aa, chain + ## HITS:1 COG:all2533 KEGG:ns NR:ns ## COG: all2533 COG1505 # Protein_GI_number: 17230025 # Func_class: E Amino acid transport and metabolism # Function: Serine proteases of the peptidase family S9A # Organism: Nostoc sp. 
PCC 7120 # 9 686 5 685 689 699 51.0 0 MVMSCAPQQKKLVYPETAKVDTVDVYFGTQVPDPYRWLENDTSAATTAWVEAQNKVTNEY LSQIPFRENLLKRLTTLADYEKISAPIKKHGKYYFSKNDGLQNQSVFYVQDSLDGEPRVF LDPNKLSDDGTVALTGLYFSNDGKYTAYSISRSGSDWSEIFVMDTESGKLLEDHIEWAKF TGAAWQGDGFYYSAYDAPAKGKEFSNVNEKHKIYYHKIGEPQSKDKLIYQNPAYPKRFYT ASTSEDERILFLTESGAGRGNNLFIRDLKNPNSPFIQLTTDLDYQYYPIEVIGDQIYIYT NYGAPKNRIMVADINRPKLEDWKELVPESEAVLSNAEVIGGKLFLTYDKDASNHAYVYGL DGKQIQEIQLPSLGSVGFSGNKDDKECFFGFTSFTIPGATYKYDMDQNTYELYRAPKVQF NSDDFVTEQVFFASKDGVKVPMFLTYKKDLKKDGKNPVFLYGYGGFGISLNPGFSAMRIP FLENGGIYAQVNLRGGSEYGEDWHVAGTKMQKQNVFDDFISAAEYLINEKYTNKDKIAIV GGSNGGLLVGACMTQRPDLFRVAIPQVGVMDMLRYHKFTIGWNWASDYGTSEDSKEMFEY LKGYSPLHNLKPGTKYPATLVTTADHDDRVVPAHSFKFAATLQADNDGTNPTLIRIDSKA GHGAGKPMAKVLEEQADIYGFIMYNLDMKPDFKK >gi|225935338|gb|ACGA01000054.1| GENE 12 19736 - 21073 1520 445 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 2 445 4 445 445 542 59.0 1e-154 MNIQKIMSSLEAKHPGESEYLQAVKEVLLSIEDIYNQHPEFEKAKIIERLVEPDRIFTFR VTWVDDRGEVQTNLGYRVQFNNAIGPYKGGIRFHASVNLSILKFLGFEQTFKNALTTLPM GGGKGGSDFSPRGKSDAEIMRFCQAFMLELWRHLGPDMDVPAGDIGVGGREVGYMFGMYK KLTREFTGTFTGKGLEFGGSLIRPEATGFGGLYFVNQMLQAKGIDIKGKTVAISGFGNVA WGAATKATELGAKVITISGPDGYIYDPDGISGEKIDYMLELRSSGNDIVAPYVEQYPNAT FVEGKRPWEVKADIALPCATQNELNGEDAQNLIKNDVLCVGEISNMGCTPEAIDLFIEHE TMYAPGKAVNAGGVATSGLEMSQNAMHLSWSAAEVDEKLHAIMHGIHAQCVKYGTEPDGY INYVKGANIAGFMKVAHAMMGQGVV >gi|225935338|gb|ACGA01000054.1| GENE 13 21417 - 22580 1223 387 aa, chain + ## HITS:1 COG:MA4232 KEGG:ns NR:ns ## COG: MA4232 COG0006 # Protein_GI_number: 20093022 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Methanosarcina acetivorans str.C2A # 23 384 20 385 388 175 32.0 1e-43 MLQPELKFRRDKIRSLMVSQGIDAALITCNANLIYTYGCVVSGYLYLPLHSPALLFFKRP NNITGEHSFSIRKPEQIVDLLKEQGLPMPTKLMLEGDELPYTEYCRLASLFPETEVVNGT 
PLIRQARSVKTPVEIEMFRRSGIAHAKAYEQIPGVYRPGMTDIEFSIEIERLMRLQGCLG IFRVFGRSMEIFMGSVLTGDNAGYPSPYDFALGGQGLDPALPGGANKTPLKEGQSVMVDL GGNFNGYMNDMSRVFSIGKLPEEAYTAHQVCLDIQEKIASIARPGIACEVLYDTAVEVVK TAGFADKFMGTGQQAKFIGHGIGLEINEAPVLAPRIKQQLEPGMVFALEPKIVIPGVGPV GIENSWVVTNEGIEKLTNCNEEIIELS >gi|225935338|gb|ACGA01000054.1| GENE 14 22663 - 24225 1358 520 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 483 1 465 497 162 27.0 2e-39 MKQPFILTLVSSVACAATAANKAPENTTPTKSKTPNVVFIYADDLGYGDLECYGAKNVQT PNVNQLAKSGILFTNAHATAATSTPSRYSMLTGEYAWRKEGTDVAAGNAGMIIRPEQYTM ADMFKSVGYTTAAVGKWHLGLGDQTATQDWNAPLPCALGDLGFDYHYIMAATADRVPCVY IENGKVANYDPSAPIEVSYQKNFEGEPTGKSNPELLYNLKPSHGHDMSIVNGISRIGFMK GGGKALWKDENIADSLTTHAIQFIEENQNKPFFLYFATNDVHVPRFPHDRFRGKNPMGLR GDAIVQFDYCVGEILNTLEKLGLRENTLIILSSDNGPVVDDGYDDKAEELLNGHSPAGPL RGNKYSAFEGGTRIPAIVSWPAGVKKGMTSDLLVSQVDWLASLASLTGATMPKNTAPDSY NYLGSWLGKEKTDRPWVIEQASNHTLSVRTKDWKYIEPNDGPKMIQWGPKIETGNSKEPQ LYKMKEVKEVTNYAKNMPEKVAELQSILQSVREGKDLIGK >gi|225935338|gb|ACGA01000054.1| GENE 15 24275 - 24661 236 128 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 8 88 9 90 497 79 47.0 1e-15 MNYQSIPLIGSALLLSTLSANGQKKNTQPNILFILCDDMGYGDLGCYGQPFIRTPHIDAM AGEGMRFTQAYAGSPVSAPSRASFMTGQHSGHCEVRGNKEYWKNAPIVEYGQNKEYSIVG QHPYDPDT >gi|225935338|gb|ACGA01000054.1| GENE 16 24676 - 25800 913 374 aa, chain + ## HITS:1 COG:MA2648 KEGG:ns NR:ns ## COG: MA2648 COG3119 # Protein_GI_number: 20091471 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 343 135 478 535 116 27.0 8e-26 
MKDNGYTTGMFGKWAGGYEGSCSTPDKRGIDEYYGYICQFQAHLYYPNFLNRYSKALGDT GVVRVVMDENIQYPMYGPDYLKRPQYSADMIHQKALEWLDQQDGKQPFFGVLTYTLPHAE LVQPEDSILNEYKAKFNPDKEFKGSEGSRYNAITHTHAQFAGMITRLDYYVGEVLKKLKE KGLDENTLVIFSSDNGPHEEGGADPTFFGRDGKLRGLKRQCHEGGIRIPFIARWPGHIPA GEVNDHICAFYDLMPTFCEVIGIKDYEKKYRNKEKEVDYFDGISFAPTLLGKNKQKKHDF LYWEFDETDQIAVRMDDWKMVVKKGTPFLYNLKTDIHEDHDITLQHPDIVEKMKAIIFEQ HTPNPHFSVTLPKR >gi|225935338|gb|ACGA01000054.1| GENE 17 26245 - 27678 1434 477 aa, chain - ## HITS:1 COG:MT4026 KEGG:ns NR:ns ## COG: MT4026 COG0617 # Protein_GI_number: 15843539 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium tuberculosis CDC1551 # 29 438 36 445 480 230 35.0 5e-60 MIELTQEELKQHFSEPIFGQIAETADTLGMECYVVGGYVRDIFLQRPSKDIDVVVVGSGI AMAEALGKQLGRGAHVSVFKNFGTAQVKYRGTEVEFVGARKESYQRDSRKPIVEDGTLED DQNRRDFTINALAVCLNKARFGELVDPFGGLEDMKEKTIRTPLDPDITFSDDPLRMMRCI RFATQLGFYIDDDTFESLCRNKERIEIISRERIADELNKIILSPIPSKGFIDLERSGLLS LIFPEFAALQGVETRNGRSHKDNFYHTLEVLDNISKKTDNLWLRWAALLHDIAKPATKRW EPKAGWTFHNHNFIGEKMIPHIFRKMKLPMNEKMKYVQKLVSLHMRPIVIADDVVTDSAV RRLLFEAGDDIDDLMMLCEADITSKNMERKQRFLNNFQLVRQKLKDLEEKDRVRNFQPPV SGEEIMEIFGLEPCREVGVLKSAIKDAILDGVIPNEYEAAHAFMLERAKKMGMTPIR >gi|225935338|gb|ACGA01000054.1| GENE 18 27774 - 28610 956 278 aa, chain + ## HITS:1 COG:no KEGG:BF3443 NR:ns ## KEGG: BF3443 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 278 1 279 282 349 66.0 6e-95 MKKLIILFCGTLALAACGNGLEKKANEKLMVAKAAYERGDYEEAKLQIDSIKILYPKAFE ARKAGQALMLDVETKAQQKTLAYLDSAFQAKTEEFNAIKDKFKLEKDAEYQQVGNYLWPT QTIEKNMHRSYLRFQVNEQGIMSMTSIYCGAGNIHHTKVKVIAPDGSFAETPSSKDSYET TDMNEKIEKADYKLGEDGNVIEFLNLNKDKNIRVEYIGDRTYKTTMSPTDRQAAAGVYEL AQILSAMEQIKKEQEEANLKIGFINKKKERKAQEEITD >gi|225935338|gb|ACGA01000054.1| GENE 19 29011 - 30003 858 330 aa, chain - ## HITS:1 COG:no KEGG:BT_1977 NR:ns ## KEGG: BT_1977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 330 1 329 329 494 77.0 1e-138 MKQQKIYMKAWLDAHGRAKAVDTDEWYLDFANQLLPLVADSFIYGGREWEEDQKRVALTC ALYLEDCVADGGNWRQFIHWHRKSYGRYLPFYALTEEYLPDEINREDIVFLLWAINSPVG DDFDGVENPMDADLLEFADTLYNRLDAAFELAPISDYLATDWLMETELMQKKRMPLPVAL PGEKMPTNVERFLEASKGEPLLYFDSYEALKYFFVQSLKWEDEEDSLLPDLKEFGNFVVF ANPKGLLIGPDVAEYFADKRNPLYNAELAEEEAYELFCEEGLCPFDLLKYGMEHDLLPEA QFPFENGKELLQENWDFVARWFLGEYYEGE >gi|225935338|gb|ACGA01000054.1| GENE 20 30005 - 30607 604 200 aa, chain - ## HITS:1 COG:PA0966 KEGG:ns NR:ns ## COG: PA0966 COG0632 # Protein_GI_number: 15596163 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Pseudomonas aeruginosa # 1 198 1 198 201 112 37.0 4e-25 MIEYIRGELAELSPATAVIDCNGVGYAANISLNTYSAIQGKKSCKLYIYEAIREDAYVLY GFAEKQEREIFLLLISVSGIGGNTARMILSALSPAELVNVISTENANMLKTVKGIGLKTA QRVIVDLKDKIKTMGMSAAGGASAGLLLQPANAEVQEEAVSALTMLGFAAAPSQKVVLAI LKEEPDAPVEKVIKLALKRL >gi|225935338|gb|ACGA01000054.1| GENE 21 30795 - 31694 1106 299 aa, chain + ## HITS:1 COG:no KEGG:BT_1979 NR:ns ## KEGG: BT_1979 # Name: not_defined # Def: meso-diaminopimelate D-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: Lysine biosynthesis [PATH:bth00300] # 1 299 1 299 299 585 98.0 1e-166 MKKVRAAIVGYGNIGHYVLEALQATPDFEIAGVVRRAGAENKPEELANYAVVKDIKELGE VDVAILCTPTRSVEKYAKEYLAMGINTVDSFDIHTGIVDLRRTLNATAKEHKAVSIISAG WDPGSDSIVRTMLEAIAPKGITYTNFGPGMSMGHTVAVKAIDGVKAALSMTIPTGTGIHR RMVYIELKDGYKFEEVAAAIKADPYFVNDETHVKLVPSVDALLDMGHGVNLTRKGVSGKT QNQLFEFNMRINNPALTAQVLVCVARASMKQQPGCYTMVEVPVIDLLPGDREEWIGHLV >gi|225935338|gb|ACGA01000054.1| GENE 22 31790 - 33829 1349 679 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_0510 NR:ns ## KEGG: Fjoh_0510 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 32 679 24 673 673 218 29.0 7e-55 MRKLFILCAFLLQFLSPVHSQQATTATIEPNLKYGKPSKEELSLTSYAPDTTATAIYLFH QGQSDFVYHDGFQLTTEHWVRIKILKPQGVSYADVSVPYYSPTDRDEGQERASDIEGCSY NMENGKCIKTSMKRESISFERINNLYKMLKFSLPAVKEGTVIEYHYKLYSDYFSHIDNWM 
MQEELPMLYNQYKITIPHVFIYNIELRGKDYIQVKQRDSSIHATEREGGSVGGVSKDFTV SAQETTFTSRNLPAIRQDEPYCWCPEDYKVQVSFDLQGTQFTPNEYKPYSQKWEDVDNQL LKPENTQFGKYLSFTNPFRPETKQAYNSEMSFEEKIICAFQILKKKMAWNGRYQLYSKEL EKVIPKGNGSNADLNFILISILKDFGLEAYPVVLSRRSSGMLPYNFPSLQKLNTFIVAIY DINKQKYVFLDGSMDVPALNILPLELSVNKARILSPKVKEEKKWVNVMALADNKSFMKIE ARMEGNQVKGHRSTILYGQEAVEYQANEKHKQDSIVSTPESNASQKNKLTVTNLKVKKQE NDWALIEEEFDFVLQTDQTDSHLYINPMLFPQLKSNPFIQTERVLPIEFPYPYKFTMLTT LTLPEGYEVEETPQPQVIRTQGDGLQCKYMIQRQRNTISLNYIFYLKEPIFLTEQYKQLQ EPWTKVIEKNNALIVLKKL >gi|225935338|gb|ACGA01000054.1| GENE 23 33859 - 35871 1332 670 aa, chain + ## HITS:1 COG:no KEGG:ZPR_3520 NR:ns ## KEGG: ZPR_3520 # Name: not_defined # Def: transglutaminase-like superfamily protein # Organism: Z.profunda # Pathway: not_defined # 10 654 10 637 638 263 28.0 1e-68 MMKTLLTFVFLLLGWQSSAQQTDTNTITPSLKYGKPSEEELNMTAYAPDTAATAVVLYSK NTARYDLINNEFRLVYTYETKIKILKSEGTSYADINIPFYSNANSGIMKENVGQIDASAY NMENGKIVRTKMKRDLIFNERLNKTYEQVKFSIPAVREGTVFEYKYQINSDFYYSINHWE AQRDIPVLLAQYDITIPEYFEFNLDMRGSHTLNPKDQSESISFHLQYQNGQIEKIDCTGR HLSFTGKQLPALRPDSYVWCADDYRSGVNFELRGISFPGALYKSFTHTWTEIDKMLMEDE DFGSPLKMRNPYRDEMATLALDKLSDRQDKIAAIYTFLKSKISWNGQYALYGSEVKKAVK NGTGSNADINFALMSMLRDAQIPCYPVVMSRKNLGILPLSHPSIQKLNTFIVGIADTDST FVFLDGSVTNGFMNILPPVLMVNRARLINGIGQDNWIDLSKLGKNQIRSSVKAQIHPDGK ITGNRQSGYVGQYASGFRSRYHAAKDSTEFINKLETEENIKITKFTTEGVNIFSPKVTES FEFEKQATVNDHLIYVNPLIFMHVSKCPFIQVERRLPLEMPYTEQLMLVVNLTLPEGYAV DELPQSMNLQTEDRQGFCRYNIQQKNNTITVTYSFAFNKLLHLIDEYKGVKAFWEMIAEK NNEILVLKKI >gi|225935338|gb|ACGA01000054.1| GENE 24 35890 - 37809 1281 639 aa, chain + ## HITS:1 COG:XF1451 KEGG:ns NR:ns ## COG: XF1451 COG1305 # Protein_GI_number: 15838052 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Xylella fastidiosa 9a5c # 258 402 239 381 636 75 32.0 3e-13 MIRSSHPILKQRICRAFGLSCAILYLSLHIQPACAQDILKDANSIIVEARTEVLCKSMTQ SIEKESLTISVLNRKGLEAARFFCGCDMFRSLQKFSGEILNATGQSVRKIKKSELQKSEY 
SSSLTTDDYFYYYECNYPSFPFTVKYEWEMKCNNGLIGYLPFVPQAEFNQGVEKATYRIE LPAGQGCRYRELNTQGKGIEVKESTGADGQQVIEATASKLSPIIKEPFGPDFAKLFPRVY FAPSAFKYDKSEGDMSSWQKYGEWQYQLLDGRDLLTEPFRAKLHELTASCTTDRDKVKAI YDYLAKTTRYVSIQLGIGGLQPIAAADVCRTGFGDCKGLSNYTRAMLKELGIASTYTVIS TTNERLLPDFSSANQMNHVILQVPLPQDTLWLECTNPSLPFGYVHQDIAGHDALLIEPTG GQMYRLPTYPDSLHTQHIVANITLSPTAEARIEVNEISRIFQYENEAGIVYLEPNKQKDR IRSSINLSQADIQNLQISECKEANPSITFDYTATSNQYGHKTGNRLFIPTNVFRKEFSVP PVTKRTYPIYINYGYTDADSIRIQLPEGYVIEGLPKPLDVKSKFGSFHSGIQVKDKEIYI THRLFMCKGVYSPDEYAAFIDFRKQVAGQYGGKIILKKE >gi|225935338|gb|ACGA01000054.1| GENE 25 37890 - 38540 511 216 aa, chain + ## HITS:1 COG:PA4833 KEGG:ns NR:ns ## COG: PA4833 COG1272 # Protein_GI_number: 15600026 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Pseudomonas aeruginosa # 11 213 5 201 205 116 41.0 2e-26 MKNKRYNNIEEWANTLSHGAGILLGIVAGYFLLEKASENMEPQWAVACVSVYLAGMLSSY ISSTWYHGSRPGKLKELLRKFDHGAIYLHIAGTYTPFTLLVLRHAGGWGWGIFAFVWLSA IAGFILSFKKLKEHSNLETACYVGMGACILVAMKPLLDHLGELGATTAFWWLIGGGVSYI VGAVFYSLRKPYMHATFHLFCLGGSIGHIIAIWLIL >gi|225935338|gb|ACGA01000054.1| GENE 26 38872 - 39573 645 233 aa, chain + ## HITS:1 COG:MTH608 KEGG:ns NR:ns ## COG: MTH608 COG0120 # Protein_GI_number: 15678636 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Methanothermobacter thermautotrophicus # 36 230 21 215 226 157 45.0 2e-38 MDWKNHLIKHLQWSDSIINREAKEHVAREIAATAKDGDVIGAGSGSTVYLTLFELSRRIR EEHLHIEVIPASQEISMTCIQLGIPQTTLWNQRPDWTFDGADEVDPQQNLIKGRGGAMFK EKLLIRSSRKTFIIIDPSKRVNQLGSKFPIPVEVFPDSLTYVEHELQRLGASEIVLRPAR GKDGPIFTENGNFILDTRFNDIDSSLEERLKAITGVIESGLFIGYDIKVIVAE >gi|225935338|gb|ACGA01000054.1| GENE 27 40038 - 40259 230 73 aa, chain - ## HITS:1 COG:no KEGG:BVU_1841 NR:ns ## KEGG: BVU_1841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 72 1 72 1002 113 77.0 2e-24 MKRKLMFLMTFLFIGIGLVTAQTSRVTGLVTSEEDGQPVVGASILVNGTTLGTITDIDGK 
FTITNVPSSAKTY >gi|225935338|gb|ACGA01000054.1| GENE 28 40635 - 43571 2723 978 aa, chain + ## HITS:1 COG:no KEGG:BDI_0500 NR:ns ## KEGG: BDI_0500 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 978 78 1057 1057 1196 62.0 0 MISQDVSIKPGVIKVVLKSDAKALDEVVVTAMGISREKKALGYAVQDVKSDQLTQAANSN LAGALQGKVSGLDVKPSSGMPGASSQITIRGARSFSGDNTPLYVIDGMPVTSTPDVSTDI QNNGSVSGADFANRAVDIDPNDIESINILKGQAASALYGIRASNGVIIITTKSGKGLEKG KPQVSFSSNVSFDVVGRLPEFQKTYAQGSGGVFSATSGTSWGPKISELPNDATYGGNTDN EYTQKFGKQQGKYYVPQRAAAGLDPWATPQAYDNAKDFFDTGITWSNSLNVAQMLDKSSY SISLGNTHQDGIISSTGMDRYNVKVSADTKLTNNWSSGFTANYITTSIDKAVTSGNGLLR TVYAAPPSYDLAGIPSHVDGNPYTQNSFRGSFDNAYWAMENNKFTEDTNRFFGNVYASYQ TDFGTTNHKLNAKYMVGVDAYTTHYVDSYGYGSNTGGGRGQIENYGWTNATYNSLLTINY DWHINEDWGLNAVVGNEIIQSNRKKYYEYGTNYNFPGWNHINNATTQQTEEETWKNRTVG FFGNVSASYRNMLYLTLTGRQDYVSNMPRNNRSFFYPSISAGFILTELDALKNNIVNHAK LRVSYAEVGQAGDFLENYYSTPTYGGGFYTLTPIMYPMKGTTAYTPYYTIFDPKLKPQNT RSYEVGADVNFLDNLITFSYTYSRQNVKDQIFEVPLASSTGASKLLTNGGKIHTNTHEFT LGFNPIRTKNINWDFAFNWTKIDNYVDELAPGVENISLGGYVTPQVRASAGEKFPVIYGV GFKRDANGNRLVDENGLPIAGEAQVIGKVSPDFLMGFNTTLRLWKCTISAVLDWKQGGQM YSRTTGLADYYGVSKRTENRDGTIIFDGYKTDGTKNDIAITGANAQQVYYSRLNDIDESS VYDNSFIKLREVAVNYKILQKRSIELSVNAFARNILIWAQLPDLDPEASQGNNNMAGAFE DYSMPQTASFGFGFNIKF >gi|225935338|gb|ACGA01000054.1| GENE 29 43583 - 45196 1580 537 aa, chain + ## HITS:1 COG:no KEGG:BDI_0503 NR:ns ## KEGG: BDI_0503 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 537 5 522 522 463 47.0 1e-128 MKKYKSIGKLLAVSFLTATIASACTEDAMDKINENPNNPLDAPAKFLITDLGVSTAFSTV GGDFSLYSSVYIEHETGISNQLYRAEVRSGEPTTATTYNNAWINVYSNIKNAKIVIKKCE EDPSEKGNVVTEAIAKILLAYNGAVAADVFGNTPYSQTGILNPDGTPMYMQPKMDTQESI YQEVMQNLDDAITLLNNGTAKDTGLSGAVGSKDLIYGSNTSTQASMWLKTAYALKARYTM RLLNKSANKTTDLQNILTYVSKSFANASEECKLAVYDADSQLNPLWSFSYSRNSLAASAS LIEKFVERNDPRAPQAFLEPDPTGYITYGYGATQATDIAGIKAAPNGTPQELQNNYGMSM ISWAMSAPTLLISYHEVKFLEAEALCRLGGRLGEAKTALKDAVTAGFENLGNSIIDAADT 
WIYDGDSDLGADVAEAYFTDEVERLFDANPLQETMIQKYLAFFGASGESLEAYNDYRRLK GAGENFIVLKNPQNSSKFPLRFGYGADDVLANPEVKAAFGDGQYVYSEAVWWAGGNK >gi|225935338|gb|ACGA01000054.1| GENE 30 45440 - 45838 390 132 aa, chain - ## HITS:1 COG:VC1445_2 KEGG:ns NR:ns ## COG: VC1445_2 COG0784 # Protein_GI_number: 15641456 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 13 127 1 115 117 78 38.0 4e-15 MESEQMNEFRPLILVAEDDDSNFKLIKAIIGKKCDIEWAKNGQEMVELFQQHQQRAKAML MDIKMPVMNGLEATKIIRESNTEIPIIMQTAYAFSSDKENAMNAGATEVLVKPITLGILR TTLSKYLPDLQW >gi|225935338|gb|ACGA01000054.1| GENE 31 45958 - 47595 1541 545 aa, chain + ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. PCC 7120 # 1 533 1 531 564 398 40.0 1e-110 MISVDGLAVEFGGTALFSDISFVINEKDRIALMGKNGAGKSTLLKILAGARQPTRGKVSA PKDCVVAYLPQHLMTEDGRTVFGETAQAFSHLHEMEAQIEKLNKELETRTDYESDSYMAL IEEVSSLSEKFYSIDATNYEEDVEKALLGLGFTRGDFQRQTSDFSGGWRMRIELAKLLLQ KPDVLLLDEPTNHLDIESIQWLEDFLINNGKAVIVISHDRKFVDNITTRTIEVTMGRIYD YKVNYSQYLQLRKDRREQQQKAYDEQQKFIAETKEFIERFKGTYSKTLQVQSRVKMLEKL ELLEVDEEDTSALRLKFPPSPRSGSYPVIMEGVGKTYGDKVVFRNANLTIERGDKVAFVG KNGEGKSTLVKCIMKEIEHDGTLTLGHNVQIGYFAQNQASLMDENLTVFQTIDDVAKGDI RNKIRDLLGAFMFGGPEESMKKVKVLSGGERTRLAMIKLLLEPVNLLILDEPTNHLDMKT KDILKQALLDFDGTLIVVSHDRDFLDGLVTKVYEFGNQKVTEHLCGIYEFLEKKKMDSLQ ELEKK >gi|225935338|gb|ACGA01000054.1| GENE 32 47802 - 48104 95 100 aa, chain + ## HITS:1 COG:no KEGG:BT_2037 NR:ns ## KEGG: BT_2037 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 37 136 136 163 81.0 1e-39 MKQDLPVEQQCPTHHHHPGNDSCCSSECMTRFYSPTPSIHTDSGPDYVFIATLFTDVIIE HLLRPQEKRIKNYCVYRDSLHGTDTHRTTSLRAPPYSVFA >gi|225935338|gb|ACGA01000054.1| GENE 33 48193 - 49428 1135 411 aa, chain + ## HITS:1 COG:aq_468 KEGG:ns NR:ns ## COG: aq_468 COG0845 # 
Protein_GI_number: 15605952 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Aquifex aeolicus # 86 409 38 359 359 73 25.0 7e-13 MKKLIFVGILGLFVLGSCNNSTVTHTHDEHDHAAEGHNHEAEGPDHSHESECSGEHNHEA ADEHNEAAEAHSDEIILPKAKAEAAGVKVSVIEPAPFQQVIKTSGQVLAAQGDESVAVAT VAGVVSFRGKVTEGMSVGSGTPLVTISSKNIADGDPVQRARIAYEVSKKEYERMKELVKN KIVSDKDFAQAEQSYENARLSYEALSKNHSAIGQSITAPIAGYVKSILVKEGDYVTIGQP LVSVTQNRRLFLRAEVSEKYYPYLRTISSANFQTPYNNQVYELKALNGKLLSFGKAAGDN SFYVPVTFEFDNKGEVIPGSFVEVFLLSSAMENVISLPRTALTEEQGIFFIYLQLDEEGY KKQEVTIGADNGKSVQILTGVKAGDRVVTEGAYQVRLASASNAIPAHSHEH >gi|225935338|gb|ACGA01000054.1| GENE 34 49506 - 52604 2959 1032 aa, chain + ## HITS:1 COG:all7618 KEGG:ns NR:ns ## COG: all7618 COG3696 # Protein_GI_number: 17158754 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Nostoc sp. PCC 7120 # 1 1020 1 1019 1058 764 42.0 0 MLNKIIHYSLHNRLVVVCAAILLLIAGTYTAMHTEVDVFPDLNAPTVVIMTEANGMAAEE VEQLVTFPVETAVNGATGVRRVRSSSTNGFSVVWVEFDWGTDIYLARQIVSEKLAVVSES LPVNVGKPTLGPQSSILGEMLIVGLTADSTSMLDLRTIADWTIRPRLLSTGGVAQVAVLG GDIKEYQIQLDPERMRHYGISMGEVMAVTQDMNLNANGGVLYEFGNEYIVRGVLSTSKTE QLGKAVVKTVNNFPVTLEDIANVTIGPKAPKLGTASERGKSAVLMTVTKQPATSTLELTD KLEASLKDLQKNLPPDVKVSTDIFRQSRFIESSIGNVKKSLFEGGIFVVIVLFLFLANVR TTLISLVTLPLSLLVSILTLHYMGLTINTMSLGGMAIAIGSLVDDAIVDVENVYKRLREN RLKVEAERLSTLEVVFNASKEVRMPILNSTLIIVVSFVPLFFLSGMEGRMLVPLGVAFIV ALFASTIVALTLTPVLCSYLLGSNKTNKKLKEAPVARWMKGIYEKALTWVLAHKRVTLGS TIGLFVVALGVFFTLGRSFLPSFNEGSFTINISSLPGISLEESNKMGHRAEELLMTIPEI QTVARKTGRAELDEHALGVNVSEIEAPFELKDRSRSELVADVREKLGTITGANIEIGQPI SHRIDAMLSGTKANIAIKLFGDDLNKMFSLGNQIKGAISDIPGVADLNVEQQIERPQLKI QPKREMLAKFGITLPEFSEYVNVALAGKVISQVYEQGKSFDLIVKVKDDARDEIEKIRNL MVDTNDGRKVPLSYVAEVVSAMGPNTINRENVKRKIVISANVADRDLRSVVNDIQKRIDT SVQLPEGYHIEYGGQFESEQAASRTLALTSFISIVVIFLLLYNEFRSVKESGVILLNLPL ALIGGVFALVITTGEVSIPAIIGFISLFGIATRNGMLLISHYNHLQKEEGLNVYDSVIQG SLDRLNPILMTALSSALALIPLALAGDLPGNEIQSPMAKVILGGLLTSTFLNGFIVPIVY LMMHRKEAQKSL 
>gi|225935338|gb|ACGA01000054.1| GENE 35 52762 - 53946 978 394 aa, chain + ## HITS:1 COG:no KEGG:BT_2040 NR:ns ## KEGG: BT_2040 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 394 1 397 397 624 90.0 1e-177 MKQIMTISAALLFLTIGEVQAQNGIEQVLKNIETNNKELQANEQLITSQKLEAKMDNNLP DPTLSYAHLWGAKDKSETIGELVVSQSFDFPSLYATRNKLNRLKAGTLDSQSDVFRQENL LQAKELCLDIIMLRQQKHILEERLRNAEELAKMYAKRLQTGDANALETNKINLELLNVKT EASLNETALRNKQQELNTLNGNIPVVFEENQYPAIPFPSDYQMLKSEVMATDRTLMALGN ESLVARKQIAVNKSQWLPKLELGYRRNTETGVPFNGVVVGFSFPLFENRNKVKIAKAQAL NIDLQKDNATLQVESELAQLYREAKTLHASMEEYSKTFQSQQDLALLKQALTGGQISMIE YFVEVSVIYQSHQNYLQLENQYQKAMARIYKSKL >gi|225935338|gb|ACGA01000054.1| GENE 36 54033 - 54218 136 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160882715|ref|ZP_02063718.1| ## NR: gi|160882715|ref|ZP_02063718.1| hypothetical protein BACOVA_00673 [Bacteroides ovatus ATCC 8483] # 10 61 1 52 52 96 98.0 5e-19 MFRSHNSDFMFVANVMYCFEISSRNLKKKRGGIFRHELHGLTLLLYSYDNTKDREFCVIR A >gi|225935338|gb|ACGA01000054.1| GENE 37 54204 - 55034 488 276 aa, chain + ## HITS:1 COG:no KEGG:BT_2041 NR:ns ## KEGG: BT_2041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 276 8 274 274 385 70.0 1e-106 MRPKHFYLFMLCCILTLLSGCKDDDDNPLRFYNSEYEVPMGGRRYLGLESGNGDYSLAVK DTRIASAGTETGWSGVPAGRMIYVTGILTGTTYLTVTDNTTQETCTLPIKVVDNYENIKL LRSYLSNLPNGDANLLPGISDIFLINNHARDAYFFKQGEQTAFSSGLELITQGSYKLEKE EGDDEKVTLTLTFSEDMATPVSNHKFIIWGNSYLFHRLDKSLNLNWNTPPIGETRTSPAP PPSYTLEEIAEGTEIGKGRQLSFSLSQQEMPAGILP >gi|225935338|gb|ACGA01000054.1| GENE 38 55031 - 56494 1069 487 aa, chain - ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 1 291 1 276 280 162 37.0 1e-39 MELPASFIDYTRALLGDEEYDKLAVALQQEPPVSIRLNQLKINHSLSDKVPWSSEGFYLE 
ERLTFTFDPLFHAGCYYVQEASSMFVEQVLRQYITGPVKMLDLCAAPGGKSTHARSVLPE GSLLVANEVIRNRSQILAENLTKWGHPDVVVTNNDPADFSALLSFFDVILTDVPCSGEGM FRKDPVAVEEWSPENVEICWQRQRRIIADIWDALKPGGILIYSTCTFNTKEDEENARWIQ QEYGGEPLTVQVQENWNITGDLLSDKGDHSKSSIPVYHFFPHKTKGEGFFLVAFRKPETE EEIPVSSFAKEKVFKKKDKKGTAASSPVSKEHLNMAKSWLNDENSDKYLLLAEGTNIRAF SQYYTNELTMMKQSLKIVSAGIEIGEVKGKDLIPNHALAMCTSLLCREAFAMEEISYEQA ITYLRKEAITLPATAPRGYVLLTYRHIPLGFVKNIGNRANNLYPQEWRIRSGYLPENIRI LSEDVTD >gi|225935338|gb|ACGA01000054.1| GENE 39 56604 - 56726 110 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237720132|ref|ZP_04550613.1| ## NR: gi|237720132|ref|ZP_04550613.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 34 1 34 259 67 100.0 3e-10 MKNFLLALMVVLTTCTLAVAQTTVVRDSIGKVKVNCDQRQ >gi|225935338|gb|ACGA01000054.1| GENE 40 56710 - 57384 565 224 aa, chain + ## HITS:1 COG:no KEGG:BT_2043 NR:ns ## KEGG: BT_2043 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 181 44 226 250 229 72.0 5e-59 MTKDNKANTNNTAVTVIGVDTADADSADVNANSSNVSTTHGKASFTFDSDDSDFPFHNVS AGGGILVAIIAIIAVFGFPVFILFVIFFFRYKNRKARYRLAEQALAAGQPLPAEFIRENK TVDSRSQGIKNTFTGIGLFIFLWAITGEFGIGAIGLLVTFMGIGQWIIGSKQQTQDANAT RIYTGNKDEKKNPNNVKTDSFEIIPSESEEKDNGVNEEKNDENK >gi|225935338|gb|ACGA01000054.1| GENE 41 57381 - 57932 326 183 aa, chain + ## HITS:1 COG:alr3280 KEGG:ns NR:ns ## COG: alr3280 COG1595 # Protein_GI_number: 17230772 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. 
PCC 7120 # 5 174 29 213 218 86 34.0 3e-17 MSQLNDISLVAQVVVFKNTRAFDQLVKEYQAPIRRFFLNLTCGDSELSDDLAQDTFIKAY TNIASFQNLSSFSTWLYRIAYNVFYDYIRSRKETADLDTREIDAINSTEQENIGQKMDVY QSLKMLKEVERTCITLFYMEDISIDKIAGIVGVPSGTVKSHLSRGKEKLATYLKQNGYDR NRQ >gi|225935338|gb|ACGA01000054.1| GENE 42 57913 - 58242 414 109 aa, chain + ## HITS:1 COG:no KEGG:BF3732 NR:ns ## KEGG: BF3732 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 109 114 55.0 2e-24 MTEIDNDKLLRDFFAENKQEIADNGFSRRVMHHLPDRSNRLARIWSAFVMTVAAALFVWL GGLEAAWGTIREVFIGMINHGTSSLDPKSIIIAAVVLLFMATRKVASMA >gi|225935338|gb|ACGA01000054.1| GENE 43 58363 - 59634 913 423 aa, chain + ## HITS:1 COG:BH2858 KEGG:ns NR:ns ## COG: BH2858 COG1502 # Protein_GI_number: 15615421 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Bacillus halodurans # 51 423 135 503 503 243 35.0 5e-64 MKLRIFILFLFLPILSVQAGIIDSLMIHPRDSIGLTSDSLVLRYLQESGIPISDNNKVKL LKSGREKFIDLFEAIREAKHHVHLEYFNFRNDSIANALFALLAEKVKEGVKVRAMFDAFG NWSNNKPLKKKHLKKIREQGIEIVKFDPFTFPYINHAAHRDHRKIAVIDGKVAYTGGMNI ADYYINGLPKIGTWRDMHTRIEGDAVNDLQEIFLTIWNKETKQNVGGAAYFPQHEEQTDS TNIVVAIVDRTPKKNSRMLSHAYAMSIYSAQKDVHIVNPYFVPTSSIKKALNRTIDRGVN VTIMVSSASDIPFTPDAALYKLHKLMKRGATVYMYNGGFHHSKIMMVDDLFCTVGTANLN SRSLRYDYETNAFIFDTKITGELNTMFRNDIEHCTQLTPEFWKKRSPWKKFVGWFANLFT PFL >gi|225935338|gb|ACGA01000054.1| GENE 44 59674 - 60468 811 264 aa, chain + ## HITS:1 COG:BH3451 KEGG:ns NR:ns ## COG: BH3451 COG0207 # Protein_GI_number: 15616013 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus halodurans # 1 264 1 264 264 436 74.0 1e-122 MKQYLDLLNRVLTEGTEKSDRTGTGTISVFGHQMRFNLDDGFPCLTTKKLHLKSIIYELL WFLQGDTNVKYLQEHGVRIWNEWADENGDLGHVYGYQWRSWPDYNGGFIDQISEVVETIK HNPDSRRIIVSAWNVADLNNMNLPPCHAFFQFYVADGRLSLQLYQRSADIFLGVPFNIAS YALLLQMMAQVTGLKAGDFVHTFGDAHIYLNHLEQVKLQLSREPRPLPQMKINPDVKSIF DFKFEDFELVNYDPHPHIAGAVAV
>gi|225935338|gb|ACGA01000054.1| GENE 45 60473 - 60967 264 164 aa, chain + ## HITS:1 COG:RSc0946 KEGG:ns NR:ns ## COG: RSc0946 COG0262 # Protein_GI_number: 17545665 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Ralstonia solanacearum # 1 163 1 161 167 125 44.0 4e-29 MSKISIIAAVDRRMAIGFENKLLFWLPNDLKRFKALTTGNTILMGRKTFESLPKGALPNR RNIVLSSNPATECPGAEVFPSLEAALQSCKEEEHIYIIGGASIYQQALSFADELCLTEID DMAPEADAYFPEVSPEMWQEKSREAHPADEKHLCSYAFVDYVRK >gi|225935338|gb|ACGA01000054.1| GENE 46 60971 - 61450 376 159 aa, chain - ## HITS:1 COG:HI0563 KEGG:ns NR:ns ## COG: HI0563 COG1522 # Protein_GI_number: 16272506 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 4 156 2 149 150 117 39.0 8e-27 MGHHQLDALDEQILKLIAGNARIPFLEVARACNVSGAAIHQRIQKLTNLGILKGSEYVID PEKIGYETCAYIGIYLKDPESFDSVTKALEAIPEVVECHFTTGKYDMFIKIYAKNNHHLL SIIHDKLQPLGLARTETLISFHEAIKRQMPIMVDIEDED >gi|225935338|gb|ACGA01000054.1| GENE 47 61603 - 62880 821 425 aa, chain + ## HITS:1 COG:no KEGG:BT_2050 NR:ns ## KEGG: BT_2050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 807 88.0 0 MKHICCIILCFCTSIGSYAQNFADYFQNKTLRVDYIFTGDATQQAIYLDELSQLPTWAGR QHHLSELPLEGNGQIIVKDLASKQCIYKTSFSSLFQEWLSTDEAKETAKGFENTFLLPYP KQPVEVEVTLYSPRKKIMATYKHIVRPDDILVHKRGGSHVTPHRYILQSGNEKDCIDVAI LAEGYTEKEMDIFYQDAQRTCESLFSYEPFRSMKGKFNIVAVASPSTDSGVSVPRENQWK QTAVHSHFDTFYSDRYLTTSRVKSIHNALAGIPYEHIIILANTDVYGGGGIYNSYTLTTA HHPMFKPVVVHEFGHSFGGLADEYFYDDDVMTDTYPLDVEPWEQNISTRVNFASKWKDML PSGTPIPTPIAEKKKYPVGVYEGGGYSAKGIYRPAYDCRMKTNEYPEFCPVCQRAIRRMI EFYVP >gi|225935338|gb|ACGA01000054.1| GENE 48 62924 - 63463 393 179 aa, chain + ## HITS:1 COG:no KEGG:BT_2051 NR:ns ## KEGG: BT_2051 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 177 1 177 178 306 85.0 2e-82 MIRLQPISTSDLQHYKFMEELLIDSFPPEEYRQLEQLREYTDRTGNFHNNIIFDDDLPVG 
FITYWDFDSFYYVEHFATNPALRNGGYGKRTLEYLCNYLKRPIVLEVERPVEEMAKRRIS FYQRQGFTLWEKDYSQPPYKPGDDFLPMYLMVHGELDCEKDFETIKNKIHTEVYGVKNN >gi|225935338|gb|ACGA01000054.1| GENE 49 63528 - 65117 1337 529 aa, chain - ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 486 2 476 534 177 28.0 1e-42 MRKNIYVLLLLSLIMVLGSCYSDKGNYDYEDVNEITVDLGGAKYTYVVGNVAKLEPTLTF ATREIAESELEYNWTLNGEFISDKRVLEFTVEKIASQVECQLRVTNPRTGLTYIGRTAMD FTQKYNLYGWLVLSKDEQASYLNFMTATGTDAQIYEEHLKVYQEQNRETLPKETVGLLEH FRSGGSGSNPSSVWIVNPKADKCVDLEGAAFTKDLVLPDAFLEPSFTSNLTVKQIAELKW LTVVVDQNGKAYTRKKLSEKAFHTGKFLSTPLTFENKEVKVDRFLIIPEMRALNMAFVEG EKGQHQRILALMDYDKQSAGKVLQFTVKQEDYDKNGFYDAPKLDDLLDYEVVYLGYARPK TTGLTDGTYVMTLILKRGSEYLYQEYSVNKASTSNQVTAIPSVNKPINFGSQLEGAVIYT APYINNRTYLLVGKGHDLYYVDRTTIDKDALILLKTFDEDVTALNAETSQGNQLGVGLAN GKFFVLSLKTANLGEILGDPKKEELERYYQMSGKVFDIRYRFRYSNGWT >gi|225935338|gb|ACGA01000054.1| GENE 50 65138 - 65794 689 218 aa, chain - ## HITS:1 COG:no KEGG:BT_3273 NR:ns ## KEGG: BT_3273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 218 236 100 32.0 5e-20 MKKIYLFLTAMVLVLTACEKDQIGRYDLGSYVYFTQKETASQSFSFSYYPGLTTHSLEFE VNLMGDLLTEDKTFELYIDTEKTTATPDMYELNLHPVFHQGTATDQITVTLKNPNDILKD KEVTLVFGIKENENFQPGFVGQRSITINFNDIASKPLWWDSTVESYLGAYDAYKLDEFIK CTGVNDLTGVDETLIRKYALDFKEYIEENGLDIDIPVY >gi|225935338|gb|ACGA01000054.1| GENE 51 65814 - 67274 1324 486 aa, chain - ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 485 4 488 488 314 38.0 5e-84 MKIKNIILVLCTSLTFCSCSDWLDVSPNNQVDGEDLFNSGSGYRIALNGIYKQMSSQNLW GEELTWGMADVLGQQYTKSNLGSTDSKYWQACQYKYTDKTLEPVIQSIWSTAYNAIANCN ELIKNIEAADPAIFQGKTLEQDLIHGEALALRAMLHFEVLRLFAPSVAADDGKKYVPYYA TFPSVSEPYLTVKEVLAKIEKDLEEARDLVQTYDNQEGYKLLMTKAYRFEGGDLVTDMFY ASRGFRMSYIAITALQARVFSYAGESKKAYDAASEVINYTDDNGEKMFTFTANASFNTNP 
KMKDDLIFALSNSKEVELFKAWDNRDEDGGMVSIDYDDYTEILDENANDRRWNETGGMME IHSDYDWYCVISKKYMDDASYTDLDKRIIPVIRLSEMHYIRGEYLAETDPVAGTAELETV AQARGCTAGIFSYISTLAEYKIAVLKDARKEFLGEGQLYYFYKRYNILPNSKTNFIFPLP DNEMVY >gi|225935338|gb|ACGA01000054.1| GENE 52 67310 - 70462 2819 1050 aa, chain - ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 44 1050 108 1116 1116 1106 56.0 0 MKKVFLFSLLYLLSVVTVVAQQKDNKQQVQKTKVENSEVPSGYRAVFGKVVDQEGLQLIG VTIRLKGTDFGTTTDVNGEFKLFYPVRKHPVIIVSYIGMVTEEISLGDDASKDKARVIKL KEDAVMTDEVVITGYSNINKKSFTGNSVQIKKEDLLKVSKTNVFSALQAFDPSFRIQENS QWGSDPNAIPELYIRGRSGIGIKELDSETVSKSNLQNNPNLPLFIMDGFEVSATKLYDLD PNRIENITILKDAAATAMYGSRAANGIVVITTVPPKPGKLQIDYSMTGTLQMPDLSDYNL MNASEKLETERLAGFYIGKDAGEQYTLDREYYGKLQNVKRGVDTDWISQPVHSVFNHKHS LSLTGGTENLRFSVDLSYNKSDGVMRGSYRDRTGGGLALDYRIGRLQVRNYVSYSSTRSK ESPYGSFSDYTTKLPYDTFRDDDGDYVTKTTEWHNLGMDNLANPLYEATLKSYDRSNADE FIDNLSANWYINDHWQIKGQFALTKSYSESRRFLDPLSAKNTDALSSTNKISGELTTSSG NSLSWDMNAFLAYNRTIKEHNINLSVGINATSSSGTSTSARYMGFPSGDFDSPNYAQKIY EKPDWSDTKSRLFGALATLNYSYQNIYLLDASVRSDGSSEFGSDNKTALFWSFGLGLALH NYEFIKKLSFIDEFKIRGTYGSSGKVNFEPFAAKTVYEINGDEWYETGMGANMIAMGNVN LGWETTYQTDFGFELGLLKRLFYISFAYYNKRTVDLVNDVTIPSSTGFTSYKDNVGEMQN RGYEFNVRSNIIRRRDMQLSLFANLGHNENKLVKISNSLKAYNDLVDQKYAELGNYDADA AKPFRKYEEGVSTTAISAVKSHGIDPATGKEIFEKRDGTLTTKWDSADMISCGDTEPDIQ GTFGFNFSWKKFSLYTTFMYEYGGQRYNSTLVSKVENADIYNSNVDKRVLSDRWKEPGDN ARFKALYNGKNSIEITKSSTRFIEDYNLLSLNSITLGYTFGAEQIKKLGLSMLRLEIGAN DLARFCTVKQERGLSYPYSRSVNFTVNASF >gi|225935338|gb|ACGA01000054.1| GENE 53 70511 - 73123 2421 870 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 42 865 30 860 860 591 40.0 1e-167 MRRFLIICLAAVFTLPAADANTIFWKKKKDKTEKKEKKLSKYEKLFKDKKVETSKGLLTL HFVDKKKLLVEFPVSLMGREFLLGSKVVETSDMGNGPAGTMSYTPRHIKFSVVDSTLYMK EVSRTGQNMIFSSSANQNMQEGLKKNSLSTIVDAFTIAAFNKDSSAVVVDMTSYFVSHTE 
DMNPFSSGKRTMEYGSGRQAVKFRDDLSYLTGVKAFGDNVSILSKLTYLMNLSVGGQLAS VDEPVSMTVNRTLLLLPEKPQMRPRLADPRIGIGTVEMENMGTEVDGSRMEHRMKRWNLE VSDVDKYKRGELTEPKKPIVFYMDPNFPVSWRAAVKAGVNDWNKAFEAIGFKDAIQVKDF PKDDPDFDPDNLKYSTIRYVPTGVVTTMKDASFADPRTGEIMNASLYLYHDLLKWNNIQR FVQTSQVDPDARHLRLPDDLMSETLRCAVRREVGFALGLIENMAGSFAIPTDSLRSASYT QKYGITASVMDEVGFNYVAQPGDKGVMLSPKLGVYDYYAIKVAYKPILEAQTAQEENKVV RQWISEKSGDPMYRFGTKQYLLTTFDPSALSFDLGNDAIKASTYGINNLRYIVAHMHEWM DAEDKNYDYRNMVFPFIVQQLRGYLFNVYRNIGSFYLNEHFVGDANATLTAVPKDLQRAS LDFLFTQLKDLEWIDDEDVIKKLSFSGSQARKILKNYVVKEKNVITDLYQTKRVSLTYYR DPSSYSPKEYIDDLYELVWRPTMEGRSLTEPERIMQAEFLSNVRNSVDINGQKRWFGQYY LSDDNTDKVGALDEEEDGDAESAIEEQGYGSRAVMYNIATDNQNHLWYGTYQKLAELVKK QQMTGDEQTRAHYKYLLFLMTRSWKDTPKI >gi|225935338|gb|ACGA01000054.1| GENE 54 73143 - 75713 1847 856 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 49 851 30 856 860 521 36.0 1e-146 MKLFIVLLTGAFMLATGTSDVFAKKKEKVQTEAKSDTVQTASKKRNTQYPRLFKNKRVVT KKGLITIHKIDDKLYFEFPKSLLGRDFLMGSSISATSDNTSGLVGQTMTTPLHIRFAIQE DQVYMQNVTPVSRMDVYSNQSDISKAVAKSNITPDMESFKIAAYNMDSTAVVFEVTKFFL ADNKRLPLFDQNSSSLEDEKYGQLELKAVLKKNLSSIRNFYVFDDNLEINLDMSFYQSLL ANKKEVRGGNVRVKAVYSMLLLPEETMAWRLGDPRLGYTFTKKQKISTEKDGTAFSYLQH KWNLLPADEEAYRRGDLVAPRKPVVFYIDNDFPEAWKKAIKEGVNWWNEAFKMAGFKEAV QTADFPANDTTFYAGNLKYSCIRYVPVNLSKTQSKVWVDPRSGEIINGTIFVGHDIAKTI ANQRFIQTAQADKRMRGLQLSEEILLESIRLYVAREVGHCLGLADNASASAAFSVESLRS AAFTEQNGIASSVMDELPYNYIAQPQDDKVQFVQNKLGAYDLYAIQMGYMLIPDTKTPEE IREKIMRWVTMKSGDPVYRYGKAQYQEAEYDPSALGGDLSNDAVKAGKYGISNLKYILGN IEQWVEQQDSDYRFRIHIYPEIVNQYNQYLMYAFQNVGGFYLSEHIVGDRNQALSVVSKI QQREAARFVMNELCNMTWLDDSSILRNMPVQTPVYQKMMKDFASVLVSTKRVSLAYYKDK NSYSPEEYLNDVYNGIWASTLKGKSLNRAEMLLQYEVVAAILRKLQLPNHMETTGREKKK KVETLLQSTQPEVNTVGGFGPLGYITNLSNDNQLHLYHGLLIRVQKLMESRLNSGSYETR MHYGQLLEKIEDIRNW >gi|225935338|gb|ACGA01000054.1| GENE 55 75727 - 78423 2292 898 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 892 30 856 860 667 43.0 0 MIVNKLVAKKAFLLLVLACAMLVTVEPIYAKKKKNTKTTKKESPYEKIFKGKKDEVKKGV ITLHKIEGKILFEFPLTLQNREMLLGSTVSEISDNGNALVGQKIKKPLHIKFALRDSVME MREVSNFARRPIFSTSKDESIKQAMKKGIGEPVMEGFKVMAYNADSTAVVFDMTDFLVSD NKRMAIFDPYGKKTMFGACVRRATFKKELSYVDQIKSFDDNISVTSSLSYLQDLLYMGIV AIAYQEPVTVKVNRSFVLLPETPEMMPRMADPRIGYFTSNKEEIADDYDGVKMNTYANKW NVYPKDVEAYKRGELVEPTQPILFYVDDAFPEEWKKGIHEGVLVWNKAFERIGFKNVMQV KDFPKDDPEFDPANIKYNCINYAPIGIANAMGPSWIDPRNSQIINASVFVYHDVIQLVND MRFVQTAQVDPRVRTPKLPQDVLDESLRYIISHEVGHCLGLMHNMGSSFAYATESYRDPV FMKEHGTTPSIMDYARFNYIAQPEDKDVCLTPPVLGTYDYYAIKWGYTVFPEAKTTEEEV PYLESIIKSKIGDLEYRYGKQQLGYGVFDPTSLSEDISNDAMKAGAYGIKNLKYILGNFN TWLNDKDPDFTYRDHLYDALVSQYVRYLNNAWANVGGFFINEHYVGDPYNTSEVIPHDMQ KRAVQFVLNELKNIDWIDNPDVVKNLTFDGSMSRSILKSMSKKILNTKRVSLAAYRDSSA YSPQEYLDDIYTILWESTIKNVEPTESERILQTEVLMGIIRNADPLTKKDEIDEVLFADL KQAAQTGALQLPANTDLPLFQLFEVYKHLTEQGDADCCVDAGHYYQTANAMKKINEYAGF DYIYYVLNTSNNNQNHLYFDLLTRIKNIVEKRKNSGSYATRSHYEYLLSIINDFEKAK >gi|225935338|gb|ACGA01000054.1| GENE 56 78661 - 79131 517 156 aa, chain - ## HITS:1 COG:no KEGG:BT_2052 NR:ns ## KEGG: BT_2052 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 156 1 145 145 249 97.0 1e-65 MGKFNKTGKREMPALNTSSLPDLIFTLLFFFMIVTTMREVTLKVQFTLPQGTELEKLEKK SLVTFIYVGEPTQEYRAKMGTESRIQLNDSYAEVGEVQDFIFQERASMNEGDAAKMTVSL KVDQKTKMGIITDVKNALRKSYALKINYSATKRGEK >gi|225935338|gb|ACGA01000054.1| GENE 57 79135 - 79728 554 197 aa, chain - ## HITS:1 COG:no KEGG:BT_2053 NR:ns ## KEGG: BT_2053 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 1 197 197 344 92.0 1e-93 MARGKRKVPDINSSSTADIAFLLLIFFLITTSMDTDRGLARLLPPPPEDQDQQNTDKIKE RNVLQVYLNKDDALMCGNDYIGVDELRERAKEFIANVANAEHMPEKTQKNVEFFGTYLVN DKHVISLQNDRGSSYQAYISVQNELVAAYNELRDELAQEKWQKTYAELNDDQQKAIREIY PQRISEAEPKKYGDKKK >gi|225935338|gb|ACGA01000054.1| GENE 58 79764 - 80249 418 161 aa, chain - ## HITS:1 COG:no 
KEGG:BT_2054 NR:ns ## KEGG: BT_2054 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 196 80.0 2e-49 MSKLTYRISYYVLYAMFAIILIVMGVFFLGGDATGDALIPGVDPEMWQPAQTDALLYLMY VLFGIAIAATVIAAVFQFGAALKDNPASAIKSLLGLVLLVVVLIVAWAMGDGTPMQIQGY DGTDNVPFWLKVTDMFLYAIYILLFVTVVAIIVSGIKKKLS >gi|225935338|gb|ACGA01000054.1| GENE 59 80256 - 81056 863 266 aa, chain - ## HITS:1 COG:VC1547_2 KEGG:ns NR:ns ## COG: VC1547_2 COG0811 # Protein_GI_number: 15641555 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Vibrio cholerae # 113 254 46 184 205 72 32.0 6e-13 MKKLFAIVAVIGAFTFGSIQLAQAQDAPAAEQTEQQAAPAAEATTAAAPAAEEGGIHKEI KVKFIEGTASFMSLVAIALVIGLAFCIERIIYLSLAEINTKKFMASIEAALEKGDVEAAK DIARNTRGPVASIYYQGLMRIDQGIDVVEKSVVSYGGVQAGYLEKGCSWITLFIAMAPSL GFLGTVIGMVQAFDKIQQVGDISPTVVAGGMKVALITTIFGLIVALILQVFYNYVLSKIE ALTSEMEDSSISLLDMVIKYNLKYKK >gi|225935338|gb|ACGA01000054.1| GENE 60 81336 - 82112 563 258 aa, chain - ## HITS:1 COG:VC0103 KEGG:ns NR:ns ## COG: VC0103 COG0084 # Protein_GI_number: 15640135 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Vibrio cholerae # 2 258 1 255 255 222 42.0 6e-58 MLIDTHSHLFVEEFTEDLPLVMERARKAGVSYIFMPNIDSTTIDAMLSVCRDYPGFCYPM IGLHPTSVNESYEQELAIVHKYLSTSREFVAIGEIGLDLYWDKTFLKEQILVFEKQIEWA LEYGLPIVIHSREAFEYIYKVMEPYKNTPLTGIFHSFTGTSEEAAKLLEFEGFMLGINGV VTFKKSTLPEALTTVPLERIVLETDSPYLAPVPNRGKRNESANVRDTLMKVAGIYQRDLE YVAQVTSVNALKVFGIRK >gi|225935338|gb|ACGA01000054.1| GENE 61 82106 - 82822 508 238 aa, chain - ## HITS:1 COG:no KEGG:BT_2057 NR:ns ## KEGG: BT_2057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 238 1 238 238 404 86.0 1e-111 MPYRRLPNTDQARVRALKAAVEKGEMYNVRDLAITLKTLFEARNFLHRFEAAQIYYTQCY DNQSRASRKHQMNVKTARLYISHFIQVLNLAVLRDEIKVTHKELYGLPASNTVPDLLSEA SLVEWGKKIIEGEQLRTTQGGIPIYNPTIARVKVHYDIFLDSYERQKNYQALTNRSLDEL 
ASMRDRADELILDIWNQVEAKYQDVTPNDTRLEKCRDYGLIYYYRSSEKIKEEKEISC >gi|225935338|gb|ACGA01000054.1| GENE 62 82838 - 83812 876 324 aa, chain - ## HITS:1 COG:MA0606 KEGG:ns NR:ns ## COG: MA0606 COG0142 # Protein_GI_number: 20089495 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanosarcina acetivorans str.C2A # 11 245 17 253 324 191 41.0 2e-48 MFTASQLLDKINNHISEIQFTRTPKGLYEPIEYILSLGGKRIRPVLMLMGYNLYREDVAS IYDPATAIEVYHNHTLLHDDLMDRSDVRRGKPTVHKVWNDNTAVLSGDAMLILAFRYMTG CPPEHLKEVMDLFSLTTLEICEGQQLDMEFESRCDVTEDEYIEMIRLKTAVLLAGSLKIG AILAGATAEDAENLYNFGMHIGVAFQLQDDLLDVYGDPEVFGKKIGGDILCNKKTYMLIK ALNRADEKQHAELNRWLNAEAFQPTEKIEAVTEIYNQLNIRNICESKMREYYTFAMESLA AVAVAEDRKKELKNLVKLLMYREM >gi|225935338|gb|ACGA01000054.1| GENE 63 83882 - 84565 665 227 aa, chain - ## HITS:1 COG:no KEGG:BT_2059 NR:ns ## KEGG: BT_2059 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 358 95.0 6e-98 MEVKKSPKADLEGKKSTWLLIGYVVVLAFMFVAFEWTQRDVKIDTSQAVADVVFEEEIIP ITETPEQQAPPPPEAPKVAELLEIVDDKAQIEETTTIINEDNQARVEVKYVPVQVVEEEP EEQTIFEVVENMPDFPGGQAALMQYLAKNIKYPTIAQENGTQGRVIVQFVVNRDGSIVDA KVVRSVDPYLDKEALRVINTMPKWKPGMQRGKPVRVKFTVPVMFRLQ >gi|225935338|gb|ACGA01000054.1| GENE 64 84782 - 85468 200 228 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 1 214 32 274 863 81 28 1e-14 MKKITIAIDGFSSCGKSTMAKDLAREVGYIYIDSGAMYRAVTLYSIENGIFNGDVIDTEK LKKEIKNIRISFQLNKETGRPDTYLNGVNVENKIRSMEVSSKVSPISTLDFVREEMVAQQ QAMGKEKGIVMDGRDIGTTVFPDAELKIFVTATPEIRAQRRYDELKAKGQEASFDEILEN VKQRDYIDQNREVSPLRKAEDALLLDNTHLSIEEQKKWLFGQFNKVSK >gi|225935338|gb|ACGA01000054.1| GENE 65 85477 - 86346 375 289 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 278 1 274 642 149 32 6e-35 MIKVEIDEGSGFCFGVVTAIHKAEEELAKGETLYCLGDIVHNSREVERLKTMGLITINRE EFKQLKNAKVLLRAHGEPPETYMIARENNIEIIDATCPVVLRLQKRIRQGYLADSDKEKQ IVIYGKSGHAEVLGLVGQTDGKAIVIEKAEEAKKLDLNKSIRLFSQTTKSLDEFQEIVEY FKQHISPEATFEYYDTICRQVANRMPKLREFAATHDLIFFVSGKKSSNGKMLFEECLKVN TNSHLIDNEKEIDPSLLQNVNSIGVCGATSTPKWLMEKIYNHIRTLIKE >gi|225935338|gb|ACGA01000054.1| GENE 66 86546 - 87526 1138 326 aa, chain + ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Bacillus halodurans # 4 326 1 319 319 312 51.0 6e-85 MGTVKCIGILTSGGDAPGMNAAIRAVTRAAIYNGLQVKGIYRGYKGLVTGEIKEFKSQNV SNIIQLGGTILKTARCKEFTTPEGRQLAYDNMKKEGIDALVIIGGDGSLTGARIFAQEFD VPCIGLPGTIDNDLYGTDTTIGYDTALNTILDAVDKIRDTATSHERLFFVEVMGRDAGFL ALNGAIASGAEAAIIPEFSTEVDQLEEFIKNGFRKSKNSSIVLVAESELTGGAMHYAERV KNEYPQYDVRVTILGHLQRGGSPTAHDRILASRLGAAAIDAIMEDQRNVMIGIEHDEIVY VPFSKAIKNDKPIKRDLVTVLKELSI Prediction of potential genes in microbial genomes Time: Fri May 13 10:33:57 2011 Seq name: gi|225935337|gb|ACGA01000055.1| Bacteroides sp. D2 cont1.55, whole genome shotgun sequence Length of sequence - 71390 bp Number of predicted genes - 52, with homology - 52 Number of transcription units - 24, operones - 14 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 285 - 314 -0.2 1 1 Tu 1 . - CDS 332 - 1696 537 ## COG3182 Uncharacterized iron-regulated membrane protein - Term 1704 - 1752 3.0 2 2 Op 1 . 
- CDS 1774 - 3180 1092 ## BT_2064 hypothetical protein 3 2 Op 2 . - CDS 3244 - 5706 1661 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 5726 - 5785 13.8 + Prom 5685 - 5744 10.7 4 3 Tu 1 . + CDS 5825 - 6748 492 ## COG0583 Transcriptional regulator 5 4 Op 1 3/0.222 - CDS 6728 - 7579 533 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 4 Op 2 . - CDS 7576 - 8799 1079 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 7 4 Op 3 . - CDS 8855 - 9616 516 ## BT_2068 3-oxo-5-alpha-steroid 4-dehydrogenase - Prom 9636 - 9695 4.2 - Term 9709 - 9768 13.0 8 5 Op 1 4/0.111 - CDS 9800 - 11143 1560 ## COG0372 Citrate synthase 9 5 Op 2 1/0.222 - CDS 11170 - 12345 1219 ## COG0538 Isocitrate dehydrogenases 10 5 Op 3 . - CDS 12364 - 14607 2173 ## COG1048 Aconitase A - Prom 14631 - 14690 5.7 + Prom 14661 - 14720 4.3 11 6 Tu 1 . + CDS 14833 - 16611 1281 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Term 16643 - 16682 -0.4 12 7 Tu 1 . - CDS 16760 - 18724 1536 ## BF3757 hypothetical protein - Prom 18773 - 18832 6.3 - Term 18850 - 18888 5.1 13 8 Op 1 . - CDS 18917 - 19960 1286 ## COG0059 Ketol-acid reductoisomerase 14 8 Op 2 . - CDS 20028 - 20771 612 ## COG3884 Acyl-ACP thioesterase 15 8 Op 3 32/0.000 - CDS 20771 - 21334 510 ## COG0440 Acetolactate synthase, small (regulatory) subunit 16 8 Op 4 6/0.111 - CDS 21391 - 23085 1771 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 17 8 Op 5 . - CDS 23115 - 24914 1931 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 25131 - 25190 6.7 + Prom 25181 - 25240 4.2 18 9 Tu 1 . + CDS 25396 - 25974 563 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 + Term 25995 - 26057 5.2 - Term 25983 - 26044 1.1 19 10 Tu 1 . 
- CDS 26102 - 27388 1250 ## COG3681 Uncharacterized conserved protein - Prom 27433 - 27492 5.1 + Prom 27410 - 27469 7.7 20 11 Op 1 . + CDS 27522 - 28607 833 ## BT_2081 hypothetical protein 21 11 Op 2 . + CDS 28640 - 29425 694 ## BT_2082 hypothetical protein + Term 29445 - 29477 0.7 22 12 Tu 1 . - CDS 29422 - 29988 646 ## BT_2083 hypothetical protein - Prom 30123 - 30182 4.9 + Prom 29961 - 30020 5.9 23 13 Op 1 . + CDS 30150 - 31226 1215 ## COG0082 Chorismate synthase 24 13 Op 2 . + CDS 31226 - 32590 1268 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Term 32626 - 32693 21.4 - Term 32614 - 32681 23.0 25 14 Tu 1 . - CDS 32701 - 33759 997 ## COG3049 Penicillin V acylase and related amidases - Prom 33830 - 33889 6.0 + Prom 33746 - 33805 5.6 26 15 Op 1 . + CDS 34010 - 34453 275 ## Ctha_2340 protein of unknown function DUF323 27 15 Op 2 . + CDS 34411 - 34875 158 ## COG1262 Uncharacterized conserved protein - Term 34784 - 34832 6.1 28 16 Op 1 . - CDS 34957 - 36135 800 ## BT_2246 hypothetical protein 29 16 Op 2 . - CDS 36075 - 36995 537 ## BT_2246 hypothetical protein - Prom 37239 - 37298 4.9 - Term 37672 - 37703 1.1 30 17 Tu 1 . - CDS 37782 - 39956 2070 ## COG0550 Topoisomerase IA - Prom 39978 - 40037 4.1 - Term 40169 - 40219 14.1 31 18 Op 1 7/0.000 - CDS 40250 - 42397 2164 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit 32 18 Op 2 . - CDS 42399 - 44300 2179 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit - Prom 44544 - 44603 7.0 + Prom 44279 - 44338 6.1 33 19 Tu 1 . + CDS 44570 - 46234 1442 ## COG2985 Predicted permease + Term 46257 - 46306 2.0 - Term 46245 - 46294 10.4 34 20 Op 1 . - CDS 46325 - 46957 631 ## BT_2093 hypothetical protein 35 20 Op 2 . 
- CDS 47010 - 49766 2296 ## BF2098 hypothetical protein - Prom 49858 - 49917 4.5 + Prom 49813 - 49872 6.1 36 21 Op 1 2/0.222 + CDS 49904 - 50413 341 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 37 21 Op 2 11/0.000 + CDS 50416 - 51453 759 ## COG2038 NaMN:DMB phosphoribosyltransferase 38 21 Op 3 6/0.111 + CDS 51455 - 52201 401 ## COG0368 Cobalamin-5-phosphate synthase 39 21 Op 4 . + CDS 52210 - 52743 453 ## COG0406 Fructose-2,6-bisphosphatase 40 22 Op 1 9/0.000 - CDS 52723 - 53670 716 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 41 22 Op 2 2/0.222 - CDS 53685 - 54707 762 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 42 22 Op 3 1/0.222 - CDS 54700 - 56163 1195 ## COG1492 Cobyric acid synthase 43 22 Op 4 . - CDS 56195 - 56785 332 ## COG2096 Uncharacterized conserved protein 44 22 Op 5 . - CDS 56785 - 58107 939 ## COG1797 Cobyrinic acid a,c-diamide synthase 45 22 Op 6 . - CDS 58104 - 59288 703 ## COG2242 Precorrin-6B methylase 2 46 22 Op 7 . - CDS 59316 - 60722 1228 ## COG1010 Precorrin-3B methylase - Prom 60805 - 60864 8.5 - Term 60824 - 60886 8.1 47 23 Op 1 . - CDS 60940 - 61971 963 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase 48 23 Op 2 . - CDS 61999 - 64080 1893 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 49 23 Op 3 . - CDS 64086 - 64781 499 ## COG2243 Precorrin-2 methylase 50 23 Op 4 . - CDS 64807 - 66990 2215 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Prom 67188 - 67247 4.8 51 24 Op 1 . + CDS 67374 - 69203 1161 ## COG2875 Precorrin-4 methylase 52 24 Op 2 . 
+ CDS 69200 - 71011 940 ## COG1903 Cobalamin biosynthesis protein CbiD + Term 71205 - 71238 3.1 Predicted protein(s) >gi|225935337|gb|ACGA01000055.1| GENE 1 332 - 1696 537 454 aa, chain - ## HITS:1 COG:PA4513_1 KEGG:ns NR:ns ## COG: PA4513_1 COG3182 # Protein_GI_number: 15599709 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated membrane protein # Organism: Pseudomonas aeruginosa # 1 442 1 364 395 119 24.0 1e-26 MKKIFRQIHLWLSVPFGLIITLICFSGAMLVFENEVNELSRPALYYVETVKEEPIPIDKL LEKVAATLPDSVSVTGVSISSDPGRTYQVNLSKPRRASLYVDQYTGEVKGKSERSSFFTF MFRMHRWLLDSMKPGNDGIFWGKMIVGVSTLLLVFVLISGIVIWWPRTRKSLKNSLKITV TKGWKRFWYDLHVAGGMYALIFLLAMALTGLTWSFPWYRTAFYKVFGVEMQEHTSQGHEQ KGNLQKGDVRLAHQEKKREGNDTQNGERTGRPKGRKDNGEHGERPHNGKEQFGNHKKDRE HAGNQEAGEYPRRRFDSKEYPENDHSNRYSVASPFAYWQEVYDKLRYQNPEYKQISVSSG TASVSLNRFGNQRASDRYSFNTDTGEFTETTLYQHQDKSGKIRGWIYSVHVGNWGGMFTR ILAFIAALIGAALPLTGYYLWIRKFIRQLIGTKK >gi|225935337|gb|ACGA01000055.1| GENE 2 1774 - 3180 1092 468 aa, chain - ## HITS:1 COG:no KEGG:BT_2064 NR:ns ## KEGG: BT_2064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 468 1 462 464 226 34.0 1e-57 MVRKNFLWMLAVSMVMGLTVLSCSDDEPGNQRDDGEGTGSDSPKYGVNFTLIDSDQQKYK TSTLVTEDLVGNGEYFTVSGFESIGDKVYTALCPLGFSDYGIESGAAEGYEDLIAEDEET GEKSISATVHPNQVWIGIYDGINNFDKKPTIITDNRISYATSRYRSQFYPTICAADDGYL YVFSNSVAKSQSNEKNKTTLDAGVLRVNTTTNEFDKSYYFNIEDAANATFGKKLSFFQVW HITGTKFLLRMYATEGKYDSTSDAKMMAIFDSQSGTLTKVTDFPAAEELADMGRFVYVEN GKAYIPVVFQKSASGSSTVQQPAIYIIDATAAKATKGATVQADGGITAICKMNNGTLEKY VIAAASSEASYLVPATEAEINDPDAVLTIKVEGGSTETDAATHWLFPNQKYAYGLGYRQG DAGVSYSFELKSDGTLGKRSAEFTMPRFTAFGTRDQYLILGAAAETSL >gi|225935337|gb|ACGA01000055.1| GENE 3 3244 - 5706 1661 820 aa, chain - ## HITS:1 COG:FN1971 KEGG:ns NR:ns ## COG: FN1971 COG1629 # Protein_GI_number: 19705267 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 113 289 23 206 657 72 30.0 4e-12 
MNRVKKIVCCKIKRITGLLLLMLIFSLSAYSQHGKGVMISGRIISTEKEIVDFATVHLKG TGFGSATNREGLYHIKAPGGEYTLVVSAMGYKTIEKKVVLRRGERIKMNFTITPDVKELS EVVISTSGVNRVNKSAFNAVAIDATKLHNSTQDLASALTKVPGVKLRESGGVGSDMNLSL DGMSGKNIKLFIDGVPQDGVGGSFSLNNIPINFAERIEVYRGVVPVGFGTDALGGVINIV TGNKRRTFVDASYSYGSFNTHKSYVNVGHTADNGFTFEINAFQNYSDNNYWIDTPVEHYD EVVDEDGWVSGWDSEKIERVKRFHDTYHNEAVVGKVGLVNKPFADRLMLTFTYSQNDKEI QNGVVQEIVYGQKRRKGHSFMPSVEYRKRNLLLKGLDVNLTANYNRNITQNIDTATYEYN WLGQMRYKKGKLGEQSYQDYKSSTDNWNGTFTANYHLGESHAFVLNHVLMGYERKPTSSA NITDATGAAQYSKIEKKSRKNITGLSYRYSHKDLWNVSVFGKYYNQYSSGPRNTNNDDTS SNNRVSYEESAGSTGAFGYGIAATYLFLDGFQAKFSYEKALRLPTADELFGDDDMELGDA GLKPEHSHNVNASLSYSHEFGKHSIYVEGGFVYRNTKDFIRRLTGTFLAGSSQYAAYMNH GRVRTMGWNAEVRYNYSKWFSIGGNFTDLNTRDYERYGLNGSESTTYKVRMPNIPYLFAN VDASFYAHHLFGKGNLLTVTYDNYYVHEFPLHWENHGASNKKGVPSQFSHNLSLVYSLKN SRYNFSLECKNFTDEKLYDNFSLQKAGRAFYGKIRYYFNR >gi|225935337|gb|ACGA01000055.1| GENE 4 5825 - 6748 492 307 aa, chain + ## HITS:1 COG:BMEI0513 KEGG:ns NR:ns ## COG: BMEI0513 COG0583 # Protein_GI_number: 17986796 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Brucella melitensis # 1 243 1 246 301 77 25.0 4e-14 MRADLEWFRTFKAIFDTGTMSDAAKELNISQPGVSLHLSSLENYIGHPLFERNTRKMIPN ERARMLYRQICYSLTKLEEVENSVRKRSGKERMTLCLGVYSGLFSQLIAPHIADLDFNLI VQFGDNDKLSELLESGSVDIIITSTETPIHNISYQVLGTSEFIVAAGKETDISQFQQLDI ENKGQVRKWLQSQLWYSNEKNIWARFWKLNFKKESDFAPNYVMPDKNTILRCLEKGVGLA LMPHSVCRDSIERGDIICLWKGYVEMKNTLYIGHRKNSILSDEIQQIKEIMTTEFEKCHN ESCQPLR >gi|225935337|gb|ACGA01000055.1| GENE 5 6728 - 7579 533 283 aa, chain - ## HITS:1 COG:VNG0479G KEGG:ns NR:ns ## COG: VNG0479G COG1028 # Protein_GI_number: 15789712 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Halobacterium sp. 
NRC-1 # 8 203 21 218 316 121 35.0 2e-27 MSEMKWAIITGADGGMGTEITRAVAKAGYRIIMACYNPKKAELVRERLSKETENPYLEVM AIDLSSMQSVVSFANRILERNLPIALLMNNAGTMETGFHTTSDGFERTVSVNYMGPYLLT RKLVPLMVRGARIVNMVSCTYAIGKLDFPDFFHRGKTGCFWRIPVYSNTKLALLLFTFEL SRQLSEKGITVNAADPGIVSTNIITMHKWFDPLTDILFRPFIRKPKKGASTAVGLLLDEK EAGVTGQLYVNNHRKNLSDKYTNHEQKEQLWEVTERSLAQWLT >gi|225935337|gb|ACGA01000055.1| GENE 6 7576 - 8799 1079 407 aa, chain - ## HITS:1 COG:MT3467 KEGG:ns NR:ns ## COG: MT3467 COG1902 # Protein_GI_number: 15842955 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Mycobacterium tuberculosis CDC1551 # 5 379 11 385 396 290 41.0 4e-78 MESKLFSPVTFGPLTLRNRTIRSAAFESMCPDNTPTQMLLDYHRSVAAGGVGMTTVAYAA VTQSGLSFDRQLWLRPSIIPRLHELTKAVHDEGAAVGIQIGHCGNMSHRNICGVTPISAS SGFNLYSPTFVRGMEKEELPEMAQAYGNAVNLAREAGFDAVEVHAGHGYLISQFLSPYTN HRKDEYGGSLENRMRFMDLVMEEVMKAAGNDMAVFVKMNMRDGFKGGMEIDESIQVAKRL LEHGVHGLVLSGGFVSRAPMYVMRGAMPIRSMSYYMNCWWLKYGVRMFGKWMIPSVPFKE AYFLEDALKFRAALPNAPLIYVGGLVSRQKIDEVLNSGFDAVQMARALLNEPGFVNRMKK EEQARCNCGHSNYCIGRMYTIEMACHQHLKEQIPLSLQKEIDKLEKK >gi|225935337|gb|ACGA01000055.1| GENE 7 8855 - 9616 516 253 aa, chain - ## HITS:1 COG:no KEGG:BT_2068 NR:ns ## KEGG: BT_2068 # Name: not_defined # Def: 3-oxo-5-alpha-steroid 4-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 253 1 253 253 411 88.0 1e-113 MNIDAFNLFLGVMSLIALVVFIALYFVKAGYGIFRTASWGVAISNKLAWILMEAPVFLVM SVMWIYSERRFEPVIFTFFLFFQIHYFQRAFIFPLLLTGKSKMPLAIMSMGVLFNLLNGY MQGKWIFYLAPETMYQSGWFTSPWFIIGTLLFFTGMLTNWHSDYIIRHLRKPGDTRHYLP QKGMYRYVTSANYFGEIVEWAGWAILTCSLSGLVFLWWTIANLVPRANAIWCRYCEEFGD AVGERKRVFPFLY >gi|225935337|gb|ACGA01000055.1| GENE 8 9800 - 11143 1560 447 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 12 447 8 441 441 377 45.0 1e-104 
MKKEYLIYKLSEEMKEATRIDNELFPKFDVKRGLRNEDGTGVLVGLTKIGNVVGYERIPG GGLKPIPGKLFYRGYDVEDISHAIIKEKRFGFEEVAYLLLSGRLPDKEELLSFRDLINDN MPLEQKTKMNIIELEGNNIMNILSRSVLEMYRFDANADDTSRDNLMRQSIELISKFPTII AYAYNMLRHATFGRSLHIRHPQEKLSIAENFLYMLKRNYTELDARTLDLLLILQAEHGGG NNSTFTVRVTSSTGTDTYSAIAAGIGSLKGPLHGGANIQVADMFQHLKENIKDWTSVDEI DTYFTRMLNKEVYNKSGLIYGIGHAVYTISDPRALLLKELARDLAREKGRESEFAFLELL EERAIATFGRVKNNGKTVSSNIDFYSGFVYEMIGLPQEIFTPLFAMARIVGWCAHRNEEL TFEGKRIIRPAYKNVLDDLAYIPIKKR >gi|225935337|gb|ACGA01000055.1| GENE 9 11170 - 12345 1219 391 aa, chain - ## HITS:1 COG:SA1517 KEGG:ns NR:ns ## COG: SA1517 COG0538 # Protein_GI_number: 15927272 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Staphylococcus aureus N315 # 3 390 7 422 422 505 58.0 1e-143 MQTDGTLLVPDVPTVPYITGDGVGAEVTPAMQAVVDAAIRKAYGGKRRIEWKEVLAGERA FHATGSWLPDETMETFQEYLIGIKGPLTTPVGGGIRSLNVALRQTLDLYVCLRPVRWYQG VQSPVKSPEKVNMCVFRENTEDIYAGIEWEAGTPEAEKFYKFLKDEMGVTKVRFPETSSF GVKPVSREGTERLVRAACQYALDHHLPSVTLVHKGNIMKFTEGGFKKWGYELAQREFADA LADGRLVIKDCIADAFLQNTLLIPEEYSVIATLNLNGDYVSDQLAAMVGGIGIAPGANIN YQTGHAIFEATHGTAPNIAGKDVVNPCSIILSAVMMLEYFDWKEAAALIEKALEQSFLDA RATHDLARFMPNGTSLSTSAFTREIVERIEK >gi|225935337|gb|ACGA01000055.1| GENE 10 12364 - 14607 2173 747 aa, chain - ## HITS:1 COG:SPAC24C9.06c KEGG:ns NR:ns ## COG: SPAC24C9.06c COG1048 # Protein_GI_number: 19114943 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Schizosaccharomyces pombe # 12 745 41 769 778 913 58.0 0 MVYDVTMLEAFYAAYKGKVEHVRAILKRPLTLAEKILYAHLYDVADLKDYKRGEDYVNFR PDRVAMQDATAQMALLQFMNAGKDQVAVPSTVHCDHLIQAYKGAKEDIATARLTNEEVYD FLRDVSSRYGIGFWKPGAGIIHQVVLENYAFPGGMMVGTDSHTPNAGGLGMVAIGVGGAD AVDVMTGMEWELKMPKIIGVRLTGKLSGWTSPKDVILKLAGILTVKGGTNAIIEYFGPGT ESLSATGKATICNMGAEVGATTSLFPFDGRMAAYLRATGRDCVVDWAEAVDADLRADDVV TDEPSNYYDRVIEIDLSELEPYINGPFTPDAATPISEFAEKVLLNGYPRKMEVGLIGSCT NSSYQDLSRAASLAKQVTEKNLSVAAPLIVNPGSEQIRATAERDGMIEAFEQLGATIMAN ACGPCIGQWKRETDDPTRKNSIVTSFNRNFAKRADGNPNTYAFVASPELTMALTIAGDLC 
FNPLKDRLVNHDGEKVKLSEPVGDELPLKGFTSGNEGYIAPHGAKTEIKVKPDSQRLQLL TPFPAWDGQDLLNMPLLIKAQGKCTTDHISMAGPWLRFRGHLENISDNMLMGAVNAFNGE TNSVWNRSTNTYGTVSGTAKMYKSEGIPSIVVAEENYGEGSSREHAAMEPRFLNVRVILA KSFARIHETNLKKQGMLALTFVDKADYDKIQEHDLLSVVGLVHFAPGRNLTIVLHHEDGT KESFEVQHTYNEQQIAWFRAGSALNAR >gi|225935337|gb|ACGA01000055.1| GENE 11 14833 - 16611 1281 592 aa, chain + ## HITS:1 COG:MK0070 KEGG:ns NR:ns ## COG: MK0070 COG1112 # Protein_GI_number: 20093510 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanopyrus kandleri AV19 # 100 590 135 669 698 265 34.0 2e-70 MGVARKVKRGLCWYPVSLGRSYYNSLNQLVIDITRTENKEIEHSFEFGRPVCFFHQSFEG KVKYMNFIATVSFADEERMVVVLPGAGALAELQTDGILGVQLYFDETSYRAMFEALEDTI RAKDNRLAELRDILLGTQKPGFRELYPVRFPWLNSTQETAVNKVLCTRDVSIVHGPPGTG KTTTLVEAIYETLHREPQVLVCAQSNTAVDWICEKLVDRGVPVLRIGNPTRVNDKMLSST YERRFESHPAYPELWGIRKSIREMGSRMRRGSYSEREGMRNRMSHLRDRATELEIQINAD LFDSARVIASTLVSSNHRLLNGRRFPTLFIDEAAQALEAACWIAIRKADRVILAGDHCQL PPTIKCIEAARGGLDHTLMEKVVQQKPSAVSLLKVQYRMHEAIMQFPSDWFYHGELEAAP EVRYRGILDFDTPMNWIDTSEMDFHEDFVGESFGRINKQEANLLLQELEAYIERIGKERI LDERIDFGLISPYKAQVQYLRGKIKGSSFLRPFRSLITVNTVDGFQGQERDVIFISLVRA NEDGQIGFLNDLRRMNVAITRARMKLVILGDASTLTKHPFYKRLMLFIKKED >gi|225935337|gb|ACGA01000055.1| GENE 12 16760 - 18724 1536 654 aa, chain - ## HITS:1 COG:no KEGG:BF3757 NR:ns ## KEGG: BF3757 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 21 653 20 635 639 345 37.0 3e-93 MKKLNLSQMTVITKFVCAVAVISVAFTSCLQKDVYEPNSDKGESEEVTLDNYFNFATTKD VQLNIDYGKECPKAYFEIYAENPLLYVAEGGQIIKKTGISHIATGFTDAQGLYIKTASFP AAVSEVYIYSPDFGVPTLYKTQVAGSNVSAQISFDNTLDITALDSSTRSVQTRSSVQFIT NVIPNVLGTWNTNTGKPDYLDTSKKIDVDATLKNYITSYFPEGVNNEGTNLVSDDADILI KEDANVVINYFGGDTGAQSVFAYYCYPEGASVSDIKEASKHACVIFPNAHGNSLDYYSGV AVNLKYINKAGSFPQEDPERFPAKTKIGFLIWNNGWINVGSNGNMFYSTKSLNSDGISHT AIFAAKNKAGDRFNVITMEDWKGGENDYNDVAFVISSNPITAIEVPDVPNPGDRQGTEKY SGILGFEDNWPEQGDYDLNDVVMKYQSNIDYNIDNKVLNIIDKFTLVWTGANYKNSFAYE 
VPFDLSKASKITINGSETSSYSGNVITLFKDAKAELGVSNINADDMMNHNIQEKTYTVSI EFNNPTLDKSVVVAPYNPFIEVLNSTTEVHLTNHKPTAGATNHFPAGADISRGDVDGTYF ICKDGFPFAIHVDARLDASILNLNLKAEKQRIDKTYPKFAEWAKTRDPQIKWWK >gi|225935337|gb|ACGA01000055.1| GENE 13 18917 - 19960 1286 347 aa, chain - ## HITS:1 COG:YLR355c KEGG:ns NR:ns ## COG: YLR355c COG0059 # Protein_GI_number: 6323387 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Saccharomyces cerevisiae # 1 346 48 394 395 423 60.0 1e-118 MAQLNFGGTTETVVIRDEFPLEKAREVLKNETIAVIGYGVQGPGQALNLRDNGFNVIVGQ REGKTYDKAVADGWVPGETLFGIEEACEKGTIVMCLLSDAAVMSVWPTIKPYLTAGKALY FSHGFAITWNDRTGVVPPTDIDVIMVAPKGSGTSLRTMFLEGRGLNSSYAIYQDATGKAM DRTIALGIGIGSGYLFETTFQREATSDLTGERGSLMGAIQGLLLAQYEVLRENGHTPSEA FNETVEELTQSLMPLFAKNGMDWMYANCSTTAQRGALDWMGPFHDAIKPVVEKLYHSVKT GNEAQISIDSNSQPDYREKLNEELRQLRESEMWQTAVTVRKLRPENN >gi|225935337|gb|ACGA01000055.1| GENE 14 20028 - 20771 612 247 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 17 215 15 211 248 94 27.0 2e-19 MSEENKIGTYQFVAEPFHVDFNGRLTMGVLGNHLLNCAGFHASDRGFGIATLNEDNYTWV LSRLAIELDEMPYQYENFSVQTWVENVYRLFTDRNFAIIDKDGKKIGYARSVWAMINLNT RKPADLLTLHGGSIVDYVCDEPCPIEKPSRIKVTSDQSMATLTAKYSDIDINGHVNSIRY IEHILDLFPIELYKTKRIRRFEMAYVAESYFGDELSFFCDEVNANEFHVEVKKNGSEVVC RSKVIFE >gi|225935337|gb|ACGA01000055.1| GENE 15 20771 - 21334 510 187 aa, chain - ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 14 161 16 162 168 72 31.0 4e-13 MSDKTLYTIIVHSENIAGLLNQVTAVFTRRQINIESLNVSASSIKGVHKYTITALTDKDT IEKVVKQIEKKIDVIQAHYFTEDEIYFHEIALYKVSTPAFQETPEASKLIRRYNARVVEV 
NPVFSIVEKNGMSEDITSLYGELKALNCVLQFVRSGRVAITTSCFERVNEFLDGREAMYN QSKNQQE >gi|225935337|gb|ACGA01000055.1| GENE 16 21391 - 23085 1771 564 aa, chain - ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 5 561 9 559 564 499 47.0 1e-141 MKDLITGAEAMMRSLEHQGVTTIFGYPGGSIMPVFDALYDHQNILNHILVRHEQGAAHAA QGYARVSGEVGVCLVTSGPGATNTVTGIADAMIDSTPIVVIAGQVGTGFLGTDAFQEVDL VGITQPIAKWSYQIRRAEDVAWAVARAFYIARSGRPGPVVLDFAKNAQVEKTKYEPTKVD FIRSYVPVPDTDEESVQAAAELINNAERPLVLVGQGVELGNAQNELREFIEKADMPAGCT LLGLSALPTEHPLNKGMLGMHGNLGPNINTNKCDVLIAVGMRFDDRVTGNLATYAKQAKV IHFDIDPAEVNKNVKVDVAVLGDCKETLASVTKLLKKKTHTEWIDSFKEYEKVEEEKVIR PELHPATNSLSMGEVVRAVSNATHHEAVLVTDVGQNQMMSARYFKYTKERSIITSGGLGT MGFGLPAAIGATFGAPERTICVFMGDGGLQMNIQELGTIMEQKAPVKIICLNNNYLGNVR QWQAMFFNRRYSFTPMLNPDYMKVASAYDIPSKRAFTREELKEAIAEMLATDGPFLLEAC VVEEGNVLPMTPPGGSVNQMLLEC >gi|225935337|gb|ACGA01000055.1| GENE 17 23115 - 24914 1931 599 aa, chain - ## HITS:1 COG:NMB1150 KEGG:ns NR:ns ## COG: NMB1150 COG0129 # Protein_GI_number: 15677026 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Neisseria meningitidis MC58 # 4 597 3 612 619 797 65.0 0 MKKQLRSSFSTQGRRMAGARALWAANGMKKNQMGKPIIAIVNSFTQFVPGHVHLHEIGQL VKAEIEKLGCFAAEFNTIAIDDGIAMGHDGMLYSLPSRDIIADSVEYMVNAHKADAMVCI SNCDKITPGMLMAAMRLNIPAVFVSGGPMEAGEWNGQHLDLIDAMIKSADESVSDQEVAN IEQNACPTCGCCSGMFTANSMNCLNEAIGLALPGNGTIVATHENRTQLFKDAAELIVKNA KLYYEEGDESVLPRSIATRQAFLNAMTLDIAMGGSTNTVLHLLAIAHEAEVDFKMDDIDM LSRKAPCLCKVAPNTQKYHIQDVNRAGGIIAIMDELAKGGLIDTSVRRVDGMSLAEAINE YSITSPNVSEKAIKKYSSAAGNRFNLVLGSQGMYYKELDKDRANGCIRDLEHAYSKDGGL 
AVLKGNIAQDGCVVKTAGVDESIWKFTGPAKVFDSQEAACEGILGGRVVSGDVVVITHEG PKGGPGMQEMLYPTSYIKSRHLGKECALITDGRFSGGTSGLSIGHVSPEAAAGGNIGKIV DGDIIEIDIPARKINVRLTDEELAARPMTPVTRDRYVPKSLKAYASMVSSADKGAVRLI >gi|225935337|gb|ACGA01000055.1| GENE 18 25396 - 25974 563 192 aa, chain + ## HITS:1 COG:FN1875 KEGG:ns NR:ns ## COG: FN1875 COG1047 # Protein_GI_number: 19705180 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Fusobacterium nucleatum # 1 156 1 149 164 85 37.0 7e-17 METVENKYITVAYKLYTMEDGEKELFEEAKAEHPFQFISGLGTTLEDFENQITALSKGDK FDFTIPADKAYGQYDEQHVIDLPKNIFEIDGKFDSERIKEGNIVPLMTGDGQRVNASVVE IKPDIVVVDLNHPLAGADLIFEGEILESRPATNEEIQELVKMMSGEGGCSCGCDSCGDGC GDDCGCEGGHCH >gi|225935337|gb|ACGA01000055.1| GENE 19 26102 - 27388 1250 428 aa, chain - ## HITS:1 COG:STM3238 KEGG:ns NR:ns ## COG: STM3238 COG3681 # Protein_GI_number: 16766537 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 7 427 11 435 436 325 44.0 9e-89 MTESERRQIIELIKKEVIPAIGCTEPIAVALCVAKAAETLGVKPEKIDVLLSANILKNAM GVGIPGTGMVGLPIAVALGVLIGKSDYQLEVLRDCTPEAVEQGKLFIAEKRICISLKEDI TEKLYIEVICKAGDKIAKAIIAGGHTTFIYIAKDEQTLLDKQHTVSEEEEDASLELNLRK VYDFALTAPLDEIRFILDTARLNKAAAEQAFKGNYGHSLGKMLRGTYEHKVMGDSVFSHI LSYTSAACDARMAGAMIPVMSNSGSGNQGISATLPVVVFAEENGKSEEELIRALMLSHLT VIYIKQSLGRLSALCGCVVAATGSSCGITWLMGGNYNQVAFAVQNMIANLTGMICDGAKP SCALKVTTGVSTAVLSAMMAMEDRCVTSVEGIIDEDVDQSIRNLTRIGSQAMNETDKMVL DIMTHKGC >gi|225935337|gb|ACGA01000055.1| GENE 20 27522 - 28607 833 361 aa, chain + ## HITS:1 COG:no KEGG:BT_2081 NR:ns ## KEGG: BT_2081 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 617 85.0 1e-175 MKAKHVILYLLLVIVSSSCIGEEALNAEADIISCTLPGVAMTTSPIINNNSITIFVGPGT DISELTPEFTLTPGATINPLSGTERNFNTPQEYTVTAADGVWKKTYIISVIDTELATNYN FEDTLGGKKYYIFIEREGGKVVMEWASGNAGYAMTGVAKTADDYPTFQITDGKAGKCLSL 
VTRSTGFFGQIAGMPIAAGNLFIGSFDVSNAMSNPLKATKFGLPFRHVPTYLAGYYKYKA GDQFTEGGKPVNEKRDICDIYAIMYETSESVPTLDGTNAFTSPNLISTARINNAKETNEW TYFKLPFTTLPGKFIDKEKLMDGKYNIAIVFTSSLEGDHFNGAIGSTLLIDEAELIYHSE D >gi|225935337|gb|ACGA01000055.1| GENE 21 28640 - 29425 694 261 aa, chain + ## HITS:1 COG:no KEGG:BT_2082 NR:ns ## KEGG: BT_2082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 504 93.0 1e-141 MKIYLYIFSLLTCIGIALPGYAQVDRNETLIRSALHGLEYEIKAGFSIGGTAPLPLPVEI RSIDGYNPTLAISIGGEVTKWIAVQNKLGIIVGLRLENKAMTTEATVKNYNMEILGQGGE RISGVWTGGVKTKVHTAGLTIPLMATYKLTNRWNIKAGPYFSYLLSREFSGHVYEGYLRE DNPTGPKVEFTDGKIATYDFSDDLRHFQWGLQVGAGWRAFKHLNVYADLTWGLNDIFKND FNTVTFAMYPIYLNIGFGYAF >gi|225935337|gb|ACGA01000055.1| GENE 22 29422 - 29988 646 188 aa, chain - ## HITS:1 COG:no KEGG:BT_2083 NR:ns ## KEGG: BT_2083 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 183 187 258 73.0 9e-68 MKKNLLYLLALVCSLTLFAACSSDDDDSDNKNNENPPEEEAVITAPDVIGTYWGNLDISM IPDGSDQEIVIGDGIEKFITLSQVSNTEVKIELKDFELFINQQILKFGDIVIDKCEVKKG EGVSTFTGQQDLTFEGDAASLGTCPVTVTGTVEGGNADMTINVKVPALQQTVKVTYSGVK QVAEPSGT >gi|225935337|gb|ACGA01000055.1| GENE 23 30150 - 31226 1215 358 aa, chain + ## HITS:1 COG:all0797 KEGG:ns NR:ns ## COG: all0797 COG0082 # Protein_GI_number: 17228292 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Nostoc sp. 
PCC 7120 # 1 351 1 353 362 364 52.0 1e-100 MFNSFGNIFRLTSFGESHGKGIGGVIDGFPSGITIDEEFVQQELNRRRPGQSILTTPRKE ADKVEFLSGIFEGKSTGCPIGFIVWNENQHPNDYNNLKNVYRPSHADYTYTVKYGIRDHR GGGRSSARETISRVVAGALAKLALRQLGVNITAYTSQVGPIKLEGTYSDYNFDLIETNDV RCPDPEKAKEMADLIYKVKGEGDTIGGTLTCVIKGCPIGLGQPVFGKLHAALGNAMLSIN AAKAFEYGEGFKGLKMKGSEQNDVFYNNNGRIETHTNHSGGIQGGLSNGQDIYFRVVFKP IATLLMEQETVNIDGVDTTLKARGRHDACVLPRAVPIVEAMAAMTILDYYLLDKTTQL >gi|225935337|gb|ACGA01000055.1| GENE 24 31226 - 32590 1268 454 aa, chain + ## HITS:1 COG:BH3875 KEGG:ns NR:ns ## COG: BH3875 COG0624 # Protein_GI_number: 15616437 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 1 452 1 453 458 406 44.0 1e-113 MNEIQKYIAAHEPKMMDDLFSLIRIPSISALPEHHDDMLACAERWAQLLLEAGVDEALVI PSQGNPVVFAQKIVDPDAKTVLIYAHYDVMPAEPLELWKSEPFEPEIRDGHIWARGADDD KGQSFIQVKAFEYLVKNGLLKNNVKFIFEGEEEIGSPSLEAFCEEHKELLKADVILVSDT SMLGAELPSLTTGLRGLAYWEIEVTGPNRDLHSGHFGGAVANPINVLCQIISKVTDADGR ITVPGFYDDVEEVPQAEREMIAHIPFDEKKYKEAIGVKELFGEKGYSTLERNSCRPSFDV CGIWGGYTGEGSKTVLPSKAYAKVSCRLVPHQDHHKISQMFADYILSIAPDTVQVKVTPM HGGQGYVCPISLPAYQAAEKGFEIAFGKKPLAVRRGGSIPIISTFEQVLGIKTVLMGFGL ESNAIHSPNENCSLDIFRKGIEAVIEFHQEYARR >gi|225935337|gb|ACGA01000055.1| GENE 25 32701 - 33759 997 352 aa, chain - ## HITS:1 COG:AGl573 KEGG:ns NR:ns ## COG: AGl573 COG3049 # Protein_GI_number: 15890402 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 14 346 18 353 355 378 54.0 1e-104 MKKKLAGVALVLAAVSLMSIQPVEACTRAVYIGPDQMVITGRTMDWKEDIMTNIYVFPRG IQRMGHNKEKTVNWTSKYGSVIATGYDIGTCDGMNEKGLVASLLFLPESIYSLPGDTRPA MGISIWTQYVLDNFATVREAVDELKKETFRIDAPRMPNGGPESTLHLAITDETGNTAVLE YLDGKLSIHEGKEYRVMTNSPRYEQQLAINDYWKEIGGLQMLPGTNRASDRFVRASFYIH AIPQIADAKIAVPSVLSVMRNVSVPFGINTPEKPYISSTRWRSVSDQKNKVYYFESTLTP NLFWLDLKKIDFSPKAGIKKLSLTKGEIYAGDAVKDLKDSQSFTFLFETPVM 
>gi|225935337|gb|ACGA01000055.1| GENE 26 34010 - 34453 275 147 aa, chain + ## HITS:1 COG:no KEGG:Ctha_2340 NR:ns ## KEGG: Ctha_2340 # Name: not_defined # Def: protein of unknown function DUF323 # Organism: C.thalassium # Pathway: not_defined # 39 145 397 510 646 64 31.0 9e-10 MIKEYTKQTIIICFLLQGIALSSRAQEQQVWQRYHDKCRREASIIDTLPSKLEKALHWSP TINKEQKEVIRYILWNMVYVEGGTANLGNNNNYPVDVASFFINRYEVSQDEWYVIMGENP SNQHRRNYPVDQVNWFNAQRFTKKFLN >gi|225935337|gb|ACGA01000055.1| GENE 27 34411 - 34875 158 154 aa, chain + ## HITS:1 COG:BH0900 KEGG:ns NR:ns ## COG: BH0900 COG1262 # Protein_GI_number: 15613463 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 19 92 146 234 286 84 46.0 5e-17 MVQRSTIYQKISQLSGLPFRLPFEAEWEYAARGGLKTKNFIYAGSNNAEQVAWFREKYYN TYVSKETGTKKPNELGLYDMSGNVYEWCMDWYDRKVPFTGNIAPIISEKDTHKVLRGGSY YTFEKYCKVTSRYGVNPQRWDIDYGLRLVVSLPN >gi|225935337|gb|ACGA01000055.1| GENE 28 34957 - 36135 800 392 aa, chain - ## HITS:1 COG:no KEGG:BT_2246 NR:ns ## KEGG: BT_2246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 357 306 657 682 113 25.0 1e-23 MARQLWSLYAADSERYYDPLDYRPISITQQPDGNWTATSQDYVHLVIVGFNRMGRSLLLE ALRICHYANYDDRLPTDERIRTHITLVDREMESQKDYFKAQFPYIESQIDDIEVEYCHDD ICSTAMRTRLQQWAQNKHCMLTVAICVHDPDLSLSLGLNLPHEVYQHQCRVLIRQDFNND LSSIVDDEQGRYRYVKVFGMVDRGMKKNILQDKLALYVNYLYDCCYTDESLKQKEVLKKM YESYGNHSADFILMNHQAQFLWNKLSEPLRWANRYQLDAYSVFCRTLGYGIKRSDRSPAR ISGSMFNENLPSQVLYLLVRMEKYRWNAERTVAGWRRAEVKDKVFLQHPLIMPFNELLQK YPEEVEKDADVILNLPYVLALGGYELYKLADQ >gi|225935337|gb|ACGA01000055.1| GENE 29 36075 - 36995 537 306 aa, chain - ## HITS:1 COG:no KEGG:BT_2246 NR:ns ## KEGG: BT_2246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 234 5 240 682 108 30.0 2e-22 MNRFINYYKKIFSQEYMDRTISGGIKSQLTLLLVTIATVLAIFFIIVMFFSIQLYGHEEW GERLWVVYNNFVDPGNQMNETAWSSRLLLGIVSFSGSILLGGVLISTISNIIERRVDVVN 
TGRMTYRNITQHYVLIGFNELTINMIRELYNECPSARILLMSGIEAATVRHRIQSALPVE IERQVLVYFGNIESIEELQRLNIASASEVYVLGDEERCGRDAKNIAIVHLVSALRGKCSD GKVMPVYVQFDSIPSYSNIQKMNLPPEVFCIEGKPNIFSVPLTSTRIWPGNYGVSMLPIV KDIMIR >gi|225935337|gb|ACGA01000055.1| GENE 30 37782 - 39956 2070 724 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 720 4 650 709 449 39.0 1e-126 MIVCIAEKPSVARDIAEVLGAHTRKEGYIEGNGYQVTWTFGHLCTLKEPHEYTPNWKSWN LGSLPMIPPRFGIKLIENPTYEKQFHIIEGLMQNADEIINCGDAGQEGELIQRWVMQKAG ARCPVKRLWISSLTEEAIREGFAKLKDQTDFQSLYEAGLSRAMGDWLLGMNATRLYTIKY GQNKQVLSIGRVQTPTLALIVNRQLEIANFEPKQYWELKTNYRDTTFSALIRKSDEEIAA EEEKNGGKKKIDNPGIDPIANREEGEALVERIKDLPFVVTSVGKKDGREFAPRLFDLTSL QVECNKKFAYSADETLKLIQSLYEKKVATYPRVDTTFLSDDIYPKCPAILKGLRDYEVLT APLAGTTLPKSKKVFDNSKVTDHHAIIPTGVYAQNLTDMERRVYDLIARRFIAVFYPDCK ISTTTVMGEVDKIEFRVTGKQILEPGWRVVFAKDVKDPTEEKEEEDENVLPTFVKGESGP HIPDLNEKWTQPPRPYTEATLLRAMETAGKLVDNDELRDALKENGIGRPSTRAAIIETLF KRNYIRKERKNLIATPTGVELVQLIHEELLKSAELTGIWEKKLREIEKKTYDARQFLEEL KQMVSEIVMSVLSDNTNRRITIQDAVAAKAEEKEKKETKKRERKPSAPKEKKPKTEKAAS GVSSNSAPSPAPVTTPASTPSATGDVDTFVGQPCPLCGKGIIIKGKTAYGCSEWRNGCTF RKNF >gi|225935337|gb|ACGA01000055.1| GENE 31 40250 - 42397 2164 715 aa, chain - ## HITS:1 COG:BH2955_1 KEGG:ns NR:ns ## COG: BH2955_1 COG1884 # Protein_GI_number: 15615517 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 24 587 19 582 582 846 72.0 0 MRKDFKNIDIYAAFQPTNGAEWQKANGISADWKTPEHIEVKPVYTKEDLEGMEHLGYAAG LPPYLRGPYSVMYTLRPWTIRQYAGFSTAEESNAFYRRNLAAGQKGLSVAFDLATHRGYD PDHERVVGDVGKAGVSICSLENMKVLFDGIPLNKMSVSMTMNGAVLPIMAFYINAGLEQG AKLEEMAGTIQNDILKEFMVRNTYIYPPAFSMKIISDIFEYTSQKMPKFNSISISGYHMQ EAGATADIELAYTLADGLEYLRAGTAAGIDIDAFAPRLSFFWAIGTNHFMEIAKMRAARM LWAKIVKQFNPKNPKSLALRTHSQTSGWSLTEQDPFNNVGRTCIEAMAAALGHTQSLHTN 
ALDEAIALPTDFSARIARNTQIYIQEETYICKNVDPWGGSYYVESLTNALAHKAWEHIQE IEKLGGMAKAIETGIPKMRIEEAAARTQARIDSGQQTIVGVNKYRLEKEAPIDILEIDNT AVRLEQIENLKRLKEGRNQAEVDKALAAITECVKTGKGNLLELAVEAARVRATLGEISYA CEQIVGRYKAIIRTISGVYSSESKNDSDFKRACELAEKFAKKEGRQPRIMVAKMGQDGHD RGAKVVATGYADCGFDVDMGPLFQTPAEAAREAVENDVHVVGVSSLAAGHKTLVPQIIEE LKKLGREDIVVIAGGVIPAQDYDFLYKAGVAAIFGPGTPVAKAACQILEILMDEE >gi|225935337|gb|ACGA01000055.1| GENE 32 42399 - 44300 2179 633 aa, chain - ## HITS:1 COG:BH2956_1 KEGG:ns NR:ns ## COG: BH2956_1 COG1884 # Protein_GI_number: 15615518 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 8 470 9 468 525 227 31.0 6e-59 MADKKEKLFSDFSPASTEQWMEKVTADLKGADFEKKLVWKTNEGFKVKPFYRMEDLEDLK TTDALPGEFPYLRGTKKDNNAWLVRQEIRVECPKEANAKALDILNKGVDSLSFHVKAKEL NAEYIETLLSDIQAECVELNFSTCQGHVVELANLLVAYFQKKDYDVKKLKGSINYDFFNK MLTRGKEKGDMVQTAKSLIEAIQPLPFYRVLNVNALSLNNAGAYISQELGYALAWGNEYM GQLTDAGIPAAIVAKKIKFNFGISSNYFLEIAKFRAARLLWANIVASYNPECVRDCENKG PNGECRCAAKMAVHAETSTFNLTLFDAHVNLLRTQTEAMSAALGGVDSMTVTPFDKTYET PDEFSERLARNQQLLLKEESHFDKVIDPAAGSYYIENLTVAIAKQAWELFLAVEEAGGFY AALKAGTVQAAVNESNKARHKAVAQRREVLLGTNQFPNFNEKAGNKQPVEGKCCCGGDSH TCEKEVDTLVFDRAASQFEALRLETEASGKRPKAFMLTIGNLAMRQARAQYSCNFLACAG YEVVDNLGFETVEAGVEAAMAAKADIVVICSSDDEYAEYAIPAFKALNGRAMFIVAGAPA CMDDLKAAGIENFIHVRVNVLDTLKEFNAKLLK >gi|225935337|gb|ACGA01000055.1| GENE 33 44570 - 46234 1442 554 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 28 549 18 545 553 391 42.0 1e-108 MDWLQSLLWDPSSVAHIVLLYAFVVATGVYLGKIKIFGVSLGVTFVLFAGILMGHFGFTA DTHILHFIREFGLILFVFCIGLQVGPSFFSSFKKGGMTLNLLAVGIVVLNIAVALGLYYL WNGRVELPMMVGILYGAVTNTPGLGAANEALNQLHYTGPQIALGYACAYPLGVVGIIGSI IAIRYIFRVNMAKEEESLKIQSGDSHHKPHMMSLEVRNESISGKTLIEIKNFLGRKFVCS RIRHDGHVSIPDHETVFNIGDQLFIVCSEEDAPAIVVFIGKEVELDWEKQDLPMVSRRIL 
VTKPEINGKTLGSMHFRSMYGVNVTRINRSGMDLFADPNLILQVGDRVMVVGQQDAVERV AGVLGNQLKRLDTPNIVTIFVGIFLGILLGSLPIAFPGMPTPLKLGLAGGPLVVAILIGR FGHKLHLVTYTTMSANLMLREIGIVLFLASVGIDAGANFVQTVVEGDGLLYVGSGFLITV IPLLIIGTIARLYYKVNYFTLMGLIAGSNTDPPALAYANQTTSGDAPAVGYSTVYPLSMF LRILTGQMILLAMM >gi|225935337|gb|ACGA01000055.1| GENE 34 46325 - 46957 631 210 aa, chain - ## HITS:1 COG:no KEGG:BT_2093 NR:ns ## KEGG: BT_2093 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 209 1 209 211 397 94.0 1e-109 MSGNYVINIGRQLGSGGKEIGEKLAARLGINFYDKELINLASEESGLCKEFFEKADEKAS QGIIGGLFGMRFPFISEGAMPCNNCLSNDALFKVQSDVIRHLAAERSCVFVGRCADYILR EHPRCANVFISASKEDRIARLCGMHHIDAEAAEEMIEKADKRRSEYYNYYSYKTWGAAAT YHLCIDSSSLGIEETVRFIEEFVVKKLQLV >gi|225935337|gb|ACGA01000055.1| GENE 35 47010 - 49766 2296 918 aa, chain - ## HITS:1 COG:no KEGG:BF2098 NR:ns ## KEGG: BF2098 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 914 5 918 922 707 41.0 0 MNACKLVGLLLLLLWGHTALYAQQVRTVRGRVQTVEAGSNEKRALPSASIVILAKSDSAF IKGITSDKNGRFILNYQPQKKKKYLLKVSFMGMQSVYRALEDSVSVNIGTVVLKDDDIQI SEVTITGKLQEVVMEGDTTVINAAAFKTREGAYLEDLVKRVPGLVYNKKDNSLTYNGQPI SEINVNGEAFFSGDKKTALENLPADLISKLKVYDKKSKEEEFTGISSGEKKYVLDLQTKD ELNKTWLTNATVGYGNNQKKDLEGQVNYFSKDGDNLSFIAQSTNRYQNSTYKDNINNSVG MNMTHKFGKKFSLTGSVNYNLNRNGNLSSIYQEQYLPAGNQYSASTNESNSKGRSISSNF IGSWEVDKRTRIHFTGNLGISPNRNENSSQNASFDAPPGISHESLFDDFESIPRDIKVNR SENWSRSENQSNRYSWMMGIMRRLNEKGTTLALNIQNSDSWGDNESFSLSKTTYFRLEDK LGNDSVLFRNQYLKSPQKNNSWRVGITFAQPIGKKMHFRVAYNWDTNYERDNRDTYELSS LTKSEVFGELPPDYEAGYVDSLSNRSHSRTNGHNLDVGLNYSDDTWMFNASLGMTPQKRA IERKMGKLYADTTMHTIDFQPMIWVNWKKKEARITFNYSGRTRQPSLSDLMPLTDNSNPL YITRGNPDLKQMFSHSVRVGFQHSKKGISANIGGQMEQNSVTQVVLYDAKTGGRETSPIN INGNWNVYGSANWWKRLGHFSLRLDMNGNHSNRVSMINEDKSLEPKKSTTRDTGLGCDAN VSYQPTWGGIDFSTSWNYQYSLNSINDNDTYTRNYSFRLEGYVDLPFGLQLRTDGAYSFR NGTNIRKGEDDQMLWNASASWRFLKKKEAELSAYWADILGKRKSFGRTTSSDGFYEFRNQ EIKGYFIVTFKYNFRLMM >gi|225935337|gb|ACGA01000055.1| GENE 36 49904 - 50413 341 169 aa, chain + ## 
HITS:1 COG:BMEI0693 KEGG:ns NR:ns ## COG: BMEI0693 COG2087 # Protein_GI_number: 17986976 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Brucella melitensis # 5 169 8 172 173 140 41.0 7e-34 MKRIILITGGQRSGKSSFAEKKAHELTTSPVYLATARIWDEEFRKRVERHQKNRGPEWTN IEEEKELSRHNVTGRVVVIDCVTLWCTNFFMEFDSNVASSLEAVKKEFDKFTQQDATFIF VTNEIGWGGVSENKLQRKFTDLQGWTNQYIASHADDVYLMASGIPVKIK >gi|225935337|gb|ACGA01000055.1| GENE 37 50416 - 51453 759 345 aa, chain + ## HITS:1 COG:RSc2397 KEGG:ns NR:ns ## COG: RSc2397 COG2038 # Protein_GI_number: 17547116 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Ralstonia solanacearum # 21 343 20 344 354 291 47.0 2e-78 MKTFHIQKPDEAIKEALIDKINNLTKPKGSLGRLEEIALQIGLIQQSLSPRLTHPQNIIF AADHGIVEEKVSPSPKEVTWQQISNFLHGGAGINFLCRQHGFTLKIVDAGVDYDLPYEKG IINMKVGRGTRNFLHEAAMTPEEMELCLEHGVQCVDTSHQEGCNIISFGEMGIGNTSSSS LWMTCFTGIPLDQCVGAGSGLNHQGINHKYEVLKRSLEQYPGEHSAEEILCRFGGYEMVM AVGAMLKAAELGMVILIDGFIMTNCILAASRLYPEVMSYAIFGHQGDESGHKLLLDYLGA RPLLNLGLRLGEGSGSVCAYPIVDSAVKMLNEMDSFAHASITKYF >gi|225935337|gb|ACGA01000055.1| GENE 38 51455 - 52201 401 248 aa, chain + ## HITS:1 COG:VC1238 KEGG:ns NR:ns ## COG: VC1238 COG0368 # Protein_GI_number: 15641251 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Vibrio cholerae # 5 239 13 249 261 112 35.0 8e-25 MNQILAALIFFTRLPFWKIKEVPQECFKHVVSYWSFSGWLTGGMMALIFWGASTILPHGV AVILALTSRLLITGALHEDGLADFFDGFGGGTNRESTLRIMKDSHIGSYGVLGLIIYYLF AYNLLVALPLAVTPFLLLSGDTWSKFISSNIINYLPYARKEEDSKSGTVYTRMSFGEIVM SAMGGILPLLLLPPSYWPVCIFPILVFFFICRMMKRRLNGYTGDCCGALFLLSEMSFWLG AVMSIYIK >gi|225935337|gb|ACGA01000055.1| GENE 39 52210 - 52743 453 177 aa, chain + ## HITS:1 COG:RSc2395 KEGG:ns NR:ns ## COG: RSc2395 COG0406 # Protein_GI_number: 17547114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Ralstonia 
solanacearum # 1 145 1 149 192 71 32.0 1e-12 MNIYLIRHTSVDVPKGLCYGQSDVPLRPTFEIEAAVTKAKIESIHFDMAYTSPLSRCTRL AQYCGFGDAIRDPRILELDFGDWEMQYFEKIKDPNLQCWYDDYLNVKATNGESFADQYKR VAAFLDEVKQKEAENIVVFAHGGVLICAQIYAKLIQPEEAFQAVPAYGGVFLYQERP >gi|225935337|gb|ACGA01000055.1| GENE 40 52723 - 53670 716 315 aa, chain - ## HITS:1 COG:STM2034 KEGG:ns NR:ns ## COG: STM2034 COG1270 # Protein_GI_number: 16765364 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Salmonella typhimurium LT2 # 18 289 9 290 319 184 39.0 2e-46 MMSYFIAIGVVLRVLLPAWLLDRLFGDPVSLSHPVIWFGKMIAFGEKRLNKGSYKLWKGG TMSVLLILFVYSVTIGIEYALSFLGTPAVIGFDIICIFFCLAGTTLIKEVRMVFEAVDRS LEEGRIQVARIVGRDTSELSAQEVRTAALETLAENLSDGVIAPLFWYLLLGVPGMMAYKM VNTLDSMIGYHSDRYLLFGRVAARVDDVANYIPARLTAFLMVLASGRLKLLSFVKRYGRE HASPNSGYPEAALAGILNCRFGGPHHYFGQLFTKPYIGVNERLLDTSDMRMGVAVNRRAE EMMIVLTVCLRALLI >gi|225935337|gb|ACGA01000055.1| GENE 41 53685 - 54707 762 340 aa, chain - ## HITS:1 COG:BH1589 KEGG:ns NR:ns ## COG: BH1589 COG0079 # Protein_GI_number: 15614152 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Bacillus halodurans # 33 336 45 358 370 159 31.0 9e-39 MIEGHGDDLYKYGKKIVSNFSSNVYNRIDHSGLYQRLNERLSTICSYPEPMPYSLESEIA RRYSLTPRQVCVTNGATEAIYLIAQVFQGRISAVLGPTFSEYADACRVHRHKVKPFYSLD ALPEDAELVWICNPNNPTGEVRNKEDLKALVDSHPDKLFIFDQSYEYFTLKSLLGIKEAA SFPNVILLHSMTKQYAIPGLRVGYFTASEGLTDDVRCRRMPWSVNSLAIEAAKYLLEEGD GISADIPQLLAERERLTNLLLATGMLEIWPTDTHYMLIKLRMGKAAALKDFLAMNHGILI RDASNFEGLDERFFRIATQTPEENDKLVKAISEWMEQIMS >gi|225935337|gb|ACGA01000055.1| GENE 42 54700 - 56163 1195 487 aa, chain - ## HITS:1 COG:STM2019 KEGG:ns NR:ns ## COG: STM2019 COG1492 # Protein_GI_number: 16765349 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Salmonella typhimurium LT2 # 1 486 6 502 506 410 46.0 1e-114 MLAGTGSDVGKSILAAAFCRILKQDGYHPAPFKAQNMALNSYATPEGLEIGRAQAVQAEA 
AGVPCHTDMNPLLLKPSSEHTSQVVLLGKPIGDRNAYEYFRKEGREELRKVVNESYDRLA TRYNPIVMEGAGSISEINLRDVDLVNMPMAKHAGADVFLVADIDRGGVFASVYGSILLQT PEERSLIKGIIINKFRGDIRLFESGIKMIEDLCKVPVLGVVPYYKGIYIEEEDSVMLESK NREAMAGKINIAVILLRHLSNFTDFNVLERDERVHLYYTNNVNEIAKADIVLLPGSKNTL DDLYELRCNGVVQAILKAHREGATIMGICGGYQMMGLEVRDPDGVEGSFKLLPGLGLLPV ITTMQGDKVTRQVNFTFDESDTVCKGYEIHMGRSVPAEGFSPSPLNKLEDGREDGYRNGR KCMGTYIHGILDNQSFIDFLLKPYADKLEQQTFDYATFKEEQYDKLAEHVRKHINMPLLY KILERND >gi|225935337|gb|ACGA01000055.1| GENE 43 56195 - 56785 332 196 aa, chain - ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 181 2 177 188 126 41.0 3e-29 MRRIYTRTGDNGTTGIFGGERVEKDDIRIEANGTIDELNTVIGIVRSLLPEQDEWHEWLY EIQMELMSCMSHVATPSVKRGENPNTLRTDLTGLLEKRMDTMNAAMEDNGYFILPGGTQI SAQLHHARVIARRAERRLCTLHRQDPVPQSLLEFINRLSDLFFVMARYEMYRQHWTEEKW KAFGYKRKASTPDQSK >gi|225935337|gb|ACGA01000055.1| GENE 44 56785 - 58107 939 440 aa, chain - ## HITS:1 COG:SSO0404 KEGG:ns NR:ns ## COG: SSO0404 COG1797 # Protein_GI_number: 15897338 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Sulfolobus solfataricus # 8 435 5 430 434 274 35.0 3e-73 MKQTSHFLIGAASSGGGKTTFTTGLLRLFRDRGLNVQPYKCGPDYIDTKYHEMAAGNASV NLDLWLSSQKHVKETYAKYAAAMDVCVTEGVMGLFDGFDGMEGSSAAIAALLNIPVVLVI NAKSTSYTVAPILYGFKYFNPSVRLAGVVFNQVASERHYSFLQKACEDVGVECFGFLPRL KELEVPSRHLGLTLDASYQFDKFASVVAQAIEEHVAVDRILEVCSVPLVDYNAHISIPST EISKDILSEKKWKIAVAQDAAFNFMYRENLARLKELGTVTFFSPMSDEHLPDCDLVYLPG GYPEFFLKELESNIAIKQQLKMYVESGGRLLAECGGMMYLCDSIRGMDGKQYAMVGLLHQ RATMENMKLRLGYRTLRIGEQVWKGHEFHYSSIEKELPSIAEVTDAKGNPVSTPLYRYKN LIAGYTHLYWGEMNLMKLWE >gi|225935337|gb|ACGA01000055.1| GENE 45 58104 - 59288 703 394 aa, chain - ## HITS:1 COG:sll0099_2 KEGG:ns NR:ns ## COG: sll0099_2 COG2242 # Protein_GI_number: 16331843 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: 
Synechocystis # 219 349 2 131 195 95 37.0 2e-19 MQEFHIIGITDSRELSFSKEVQVLIARSHVFSGGKRHHEIMAAMLPESSVWIDITVPLDA VFECYADYQEIVVFASGDPLFFGFANTVMRKLPEAVVKVYPSFNSLQLLAHKALLPYHDM RTISLTGRPWKALDEALISGETLIGSLTDREKTPASIAARMQAYGYANYRMIVGEQLGNE EETVAEYSVEEAMERHFRMPNCVILKRITVRERPFGIPEEQFELLDGRANMITKMPVRLL SLSLLDLRNRSVMWDVGFCTGSVSIEAKLQFPHLDIVAFEKREAGRQLMETNSRRFGCPG ITTVIGDFLDMPLEKFPMPDAVFIGGHGGKLGEMIKKIAPLLPVGGTLVFNSVSEDSRNQ FEKAIDDNGLKLEQVIRLTVDEHNPIDTLKAIKR >gi|225935337|gb|ACGA01000055.1| GENE 46 59316 - 60722 1228 468 aa, chain - ## HITS:1 COG:lin1162 KEGG:ns NR:ns ## COG: lin1162 COG1010 # Protein_GI_number: 16800231 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Listeria innocua # 6 244 2 239 241 231 49.0 3e-60 MKENKIIVAGIGPGSPEDITPAVVSALKESDVVIGYKYYFQFITQWLAEGTECVDTGMKR EKARAELAFEYAEQGKTVCVISSGDAGIYGMAPLIYEMKRERGSEVEILVLPGISAFQKA ASLLGAPIGHDFCIISLSDLMTSWEIIEKRITAAATADFVTAIYNPKSEGRYWQLHRLKE IFQQSRSPETPVGYVRQAGREEETVVVTTLQDFDPEEVDMFTVVLIGNSQTYQFQNRMVT PRGYYREQEKGEVGKGQEIMMNSFRKIEEEMQNKEIDLDRKWALFHAIHTTADFEMEKLV YTDKEVVKNLYEQIASGRIKTIITDVTMVAAGIRKGALQRLGVEVKCFLSDERAVEQSKK RGTTRSQEAIRLAVEEHPDALFVFGNAPTALIELCDLIRKGKAHPSGVIAAPVGFVHVKE SKYMIKTFTEVPKIIVQGRRGGSNLAATLVNAILCFDDARTLRPGRDV >gi|225935337|gb|ACGA01000055.1| GENE 47 60940 - 61971 963 343 aa, chain - ## HITS:1 COG:lin1165 KEGG:ns NR:ns ## COG: lin1165 COG4822 # Protein_GI_number: 16800234 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Listeria innocua # 55 326 4 251 261 100 28.0 5e-21 MKQFRFMMMTALVAVMSLSAASCSDDDDDVAKSNHDKKMEAVTAEVQANKKHDTALLLVT FGSTWDAPQTTFNNMKAQFASKFPNMDIYFSFTSEICMTRCAAKGWNYYAPSFYLEAIGL AGYKTVCVQSLHVIPGEEFLRVQNVVKDFHNDGEHPEFANVKVYLGGPLLETEEDVTNVA KILHETYKSKVESNNIVTFMGHGNPENYNYGNGNSRYTMMEAELQKLNSNYFVATVDMED NYIDDMISRMQEANKKSGNVICHPLMSIAGDHANNDMKGGVGATAEEGSWRYELSKAGYT CPLANCDIKGLGDYSKIVSVWVSHIEKKIADNDAMYDPDAEEE >gi|225935337|gb|ACGA01000055.1| GENE 48 61999 - 64080 1893 693 aa, 
chain - ## HITS:1 COG:Cj0755 KEGG:ns NR:ns ## COG: Cj0755 COG4771 # Protein_GI_number: 15792094 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Campylobacter jejuni # 1 657 1 657 696 114 23.0 9e-25 MRQMWLAGLVLLSLSAEVSAQSNVAMQDSIPEIVVTGTGTEHYLKDAPVQTEVISRKMLD SYAGATLEDILSGLCASFDFSAGDMGANMQLGGLGNGYILILVDGKKMHGDVGGQNNLGL IDPARIERIEIVKGAASALYGSDAIAGVVNIILKKHRENILIENTSRGGSYGEFRQSNTV QFKVGKFTSSTNFQLKHSDGWQNTTYEDPNRYEYPITNSINKTVNRYTDWQVAQRFDYQA TKDLSLYADGSFYRKRIYRPCGVPDYKTYDFLYRNSSVATGGKLKLKNSNSIMLDVNYDS HAYYYMYTRETWDKEYDDSGKEISFPYFPGDKGLQSDQSRLLLQLKGIFNLPYFNRLSVG TDTEINWLDAPRRLDEKDQVSDYTTSFYAQDEWTPIERLNITAGGRLTVNQNFGVRITPK VSALYKLGAFNLRATYSEGFKTPTLKELHYRYIRQMSIISLNLGNTELDPQTSRYVSGGL EYNGTRFSINVTGYCNWVDNMITLVTIPLSQAPGDLVVTYDPARVRQYQNMDDARTYGVD VNAQWTPVQSLTLTGGYSYLDTEANQYDEEDQVMKHVIIDGMAHHRATVSAIWTHAWRRS NYRLGIGVYGRIQSKRYYQDDGNGKAYNLWRLNTRHQFKLGKRWNAEVNAGIDNIFNYYE TTYHSLNYGTTTAGRTFYGSLMIQFGQKKTKNT >gi|225935337|gb|ACGA01000055.1| GENE 49 64086 - 64781 499 231 aa, chain - ## HITS:1 COG:slr1879 KEGG:ns NR:ns ## COG: slr1879 COG2243 # Protein_GI_number: 16330281 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Synechocystis # 10 176 9 179 242 73 32.0 3e-13 MEDLKPVYIVSLGPGEAGLISLIGGAALCRANAIFCPKTNARSRSSVLLKELKIKKESIR HYTLPMSKKREMAYEAYDKVFEEIVSLYSEGLCVAIVAEGDAGFYSSTSYLVGKLKKVGI PIKQIAGVPAFIAAAASIGLSIVEQEERLTVCPGKVDGELFSRIASGSEVAVVMKLSQCE EEVKHFISLEERLVCHYFENVGADGEFYTCDKKDILARKFPYFSLMIIKGR >gi|225935337|gb|ACGA01000055.1| GENE 50 64807 - 66990 2215 727 aa, chain - ## HITS:1 COG:Cj0755 KEGG:ns NR:ns ## COG: Cj0755 COG4771 # Protein_GI_number: 15792094 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Campylobacter jejuni # 103 634 25 599 696 131 25.0 5e-30 MLKKLILTTMLVGVCCVVSAQVTIKGRVLNSETGEPVVGANIRVDHSLQGGTTNAKGEFI 
IKGLPDGKQTLQVSHLNYEPQRYSTSKSVNDVVIRMTESHNNLNQVVVTGTGTHRRMVDS PIPVNVLTAKDIKEANVTNLEEALTKLTSNFSFSTSGMGTEMVLNGLNSDYILVLVNGRK LIGDDALMRINMANVKRIEILNGSASALYGSDAIGGVINIITDDSKNKIDASSTTKVSDH GRFSEAVNLDLNVGKLSSYTSYQRQQSDGWKLHPMTETVNKKGEVKLDPTDKQAFTGYHS NTVNQSFSYSLNHKTTLYAQGSYYNFLNDRPVTEYKYNMYHENYTYGFGMKYIVNKKAYI DADFYSDNYKSAYDYIQESGDFKIGDKETRKKQYFYRGNVESIIKVNSKNKLSAGLEYLD EKLESESDNISNKTLYTMALYAQDEWTIAESLQAVVGLRYIYNETFEDHFTPTASIMYRE GGFRGRLSFATGFRTPTLSEIYATDLAKTTNRYTIGNLELKPEESKNFSLNLEYTHSRFS VSATAFVNNVTDMINYRTLSDEEAAQYGDYDEVRQRDNIDKVRIKGVNVNANAYLGLGFN LGAGYTLLDARNLVTDKPIDKSVKYAGNVNAQWSKNWGLYGLNINMIGRGQGKRYSETYD YDSSGFMLWDLNTGHTFNLKYFILNAGLGVENLFDWVDDRPWNTKKPYSAITPGRAFYAS LTIRFRK >gi|225935337|gb|ACGA01000055.1| GENE 51 67374 - 69203 1161 609 aa, chain + ## HITS:1 COG:MJ1578 KEGG:ns NR:ns ## COG: MJ1578 COG2875 # Protein_GI_number: 15669774 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Methanococcus jannaschii # 360 608 3 253 259 242 49.0 2e-63 MEKSVQTAIILISESSLPIARNISRELAESAIYLKGEAEGCETITSYSGFMKEHFSDFKA IVFIGALGICVRSIASCIRNKYADPAVINVDSTGRFVISVLSGHIGGGNELTRIVAGITG GEAIVTTQSDNQQLWALDTFTSRYGWKTSATPASMNQAIFQFVNKHKTALLLSVKDKGTD ELEQSCPEHTDIYYRLEEIPLAEYQLLITVGPWEYDAPIPTLQFYPPVLHIGVGCKKECS PQGVCIYMKDELLRHHLSPLAVKSISTIELKKDEPLIAELHTQFSNSELHIYKAEELADI SVPNPSEKVKEVTGVDGVAESSAIRASDYGRLLMEKQKGILSEGNNFTFAVALSADSDRN NGHIEIVGAGPGDPELISVRGKRMLEKADLILYAGSLVPRELTYYAKPGATVRSSADMTL EEQFTLMKSFYDKGLFVVRLHTGDPCIYGAIAEQMAFFDQYGMRYHITPGISSFQAAAAA LKSQFTIPEEVQSIILTRGEGRTPMPEKEKLHLLARSQSTMCIYLSAAIVEQVQEELLQA YSPETPVAACYKLTWKEEKIYRGKLKDLAQIVRDNHLTLTTLLVVGNAIDHREGLSRLYA DEFKHLFRP >gi|225935337|gb|ACGA01000055.1| GENE 52 69200 - 71011 940 603 aa, chain + ## HITS:1 COG:PA2908 KEGG:ns NR:ns ## COG: PA2908 COG1903 # Protein_GI_number: 15598104 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Pseudomonas aeruginosa # 252 515 12 278 366 223 46.0 8e-58 
MILIFGGTTEGRTAAETLDEAGQPFYYSTRGTLQEVTCRNMTRLTGNMDSEKIISFCREH DIRLIIDAAHPFAQQLHQNIFLAAKELTIKVVRLERIYPELSEEIIWCDDFEGAVARMKQ DGIEKLLALTGTQTITKLQGFWKEHDCIFRVLERSESVDIAIKAGFPKEKLIFYQPEEDE SLLFKQLSPQAIITKESGSSGGFIEKIEAALKNGIKIYVVRRPPLPEGFITVNGKYGLRR AVELNVPGFYALRSGFTTGTCATAAAKAALLALTEECFEEKTEVTLPDGERVVLPIYKVE KTDENTATASVIKDAGDDPDVTHGHQIVATVSFSDSPGIHFLQGKGVGKVTLPGLGLAVG EPAINSTPRKMIVNELSLLYGGGLDVTISVPEGETLALKTFNPKLGIIGGISIIGTSGIV KPFSSEAFVDAIRKEVEVAYAVNAERLVINSGAKSEKFVRSLYPNLPPQAFVHYGNFIGE TLKIAAETGFQKVSMGIMIGKAVKLAEGHLDTHSKKVVMNKDFLKEIAIQCSCSDETLAL IEQMTLARELWGIPNKKDKEILFNKLVQLCHIHCTPLFPKGELAILLINETGEISYQFHS TLF Prediction of potential genes in microbial genomes Time: Fri May 13 10:35:16 2011 Seq name: gi|225935336|gb|ACGA01000056.1| Bacteroides sp. D2 cont1.56, whole genome shotgun sequence Length of sequence - 85830 bp Number of predicted genes - 71, with homology - 69 Number of transcription units - 39, operones - 16 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 35/0.000 - CDS 56 - 1120 760 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components 2 1 Op 2 33/0.000 - CDS 1113 - 2150 787 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 3 1 Op 3 . - CDS 2158 - 3342 1105 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component + Prom 3585 - 3644 3.2 4 2 Op 1 33/0.000 + CDS 3757 - 4890 1042 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 5 2 Op 2 35/0.000 + CDS 4887 - 5903 784 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 6 2 Op 3 . + CDS 5900 - 6655 227 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 7 3 Tu 1 . - CDS 6757 - 6930 141 ## gi|260174100|ref|ZP_05760512.1| hypothetical protein BacD2_19719 - Prom 7141 - 7200 5.1 + Prom 6997 - 7056 4.5 8 4 Tu 1 . + CDS 7160 - 8845 1751 ## COG2985 Predicted permease + Prom 9273 - 9332 6.3 9 5 Op 1 . 
+ CDS 9513 - 11396 1238 ## COG4206 Outer membrane cobalamin receptor protein 10 5 Op 2 . + CDS 11393 - 12550 732 ## COG3391 Uncharacterized conserved protein 11 5 Op 3 . + CDS 12585 - 12893 140 ## BT_4606 hypothetical protein 12 5 Op 4 . + CDS 12874 - 14205 901 ## BF3765 hypothetical protein 13 5 Op 5 . + CDS 14256 - 15917 928 ## BT_1956 putative cell surface protein + Prom 15919 - 15978 4.3 14 6 Op 1 1/0.091 + CDS 16046 - 17401 993 ## COG0534 Na+-driven multidrug efflux pump + Prom 17408 - 17467 1.5 15 6 Op 2 . + CDS 17490 - 17993 247 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Term 18000 - 18048 5.8 16 7 Op 1 . - CDS 18122 - 20860 2112 ## COG3537 Putative alpha-1,2-mannosidase 17 7 Op 2 . - CDS 20878 - 22059 608 ## COG0738 Fucose permease 18 7 Op 3 . - CDS 22065 - 23168 872 ## Bcav_2704 hypothetical protein 19 7 Op 4 . - CDS 23185 - 24678 1264 ## Phep_0755 RagB/SusD domain protein 20 7 Op 5 . - CDS 24692 - 27793 2291 ## Phep_0754 TonB-dependent receptor plug 21 7 Op 6 . - CDS 27818 - 28690 646 ## gi|260174114|ref|ZP_05760526.1| hypothetical protein BacD2_19789 22 7 Op 7 3/0.000 - CDS 28712 - 29812 641 ## COG1940 Transcriptional regulator/sugar kinase 23 7 Op 8 . - CDS 29842 - 31572 1687 ## COG1482 Phosphomannose isomerase - Prom 31608 - 31667 7.8 + Prom 31616 - 31675 9.6 24 8 Tu 1 . + CDS 31714 - 32442 509 ## COG2188 Transcriptional regulators + Term 32474 - 32529 6.2 - Term 32462 - 32517 6.2 25 9 Tu 1 . - CDS 32562 - 33380 361 ## BT_2114 hypothetical protein - Prom 33421 - 33480 8.6 - Term 33708 - 33752 9.1 26 10 Tu 1 . - CDS 33778 - 34203 529 ## BT_2500 hypothetical protein - Prom 34436 - 34495 4.6 27 11 Tu 1 . + CDS 34542 - 35594 553 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF + Prom 35619 - 35678 3.9 28 12 Tu 1 . + CDS 35745 - 36179 309 ## BF3357 hypothetical protein + Term 36257 - 36308 5.4 - Term 36266 - 36318 14.8 29 13 Tu 1 . 
- CDS 36351 - 37703 1221 ## COG0534 Na+-driven multidrug efflux pump - Prom 37808 - 37867 5.2 + Prom 37725 - 37784 2.5 30 14 Op 1 . + CDS 37809 - 39536 2273 ## COG1190 Lysyl-tRNA synthetase (class II) 31 14 Op 2 . + CDS 39590 - 40558 780 ## COG0240 Glycerol-3-phosphate dehydrogenase 32 14 Op 3 . + CDS 40577 - 41914 1573 ## COG0166 Glucose-6-phosphate isomerase + Term 41951 - 42009 15.1 - Term 41935 - 42000 10.1 33 15 Tu 1 . - CDS 42014 - 42520 505 ## BT_2125 hypothetical protein - Prom 42540 - 42599 4.6 + Prom 42601 - 42660 5.0 34 16 Op 1 . + CDS 42711 - 43520 630 ## BT_2126 hypothetical protein 35 16 Op 2 . + CDS 43557 - 44240 591 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 44259 - 44318 4.2 36 17 Tu 1 . + CDS 44363 - 47215 2026 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase + Term 47268 - 47310 1.1 - Term 47255 - 47298 2.1 37 18 Tu 1 . - CDS 47329 - 47439 128 ## - Prom 47460 - 47519 7.8 38 19 Tu 1 . + CDS 47839 - 48267 308 ## gi|260174132|ref|ZP_05760544.1| hypothetical protein BacD2_19879 + Term 48423 - 48464 -0.2 - TRNA 48457 - 48532 52.0 # Arg CCG 0 0 + Prom 48553 - 48612 9.3 39 20 Op 1 . + CDS 48754 - 49791 897 ## COG2502 Asparagine synthetase A 40 20 Op 2 . + CDS 49798 - 50460 576 ## COG0692 Uracil DNA glycosylase + Term 50487 - 50547 8.4 - Term 50469 - 50539 10.1 41 21 Op 1 . - CDS 50605 - 53322 2336 ## BT_2133 hypothetical protein - Prom 53346 - 53405 6.6 42 21 Op 2 . - CDS 53412 - 53948 374 ## COG1418 Predicted HD superfamily hydrolase - Prom 53989 - 54048 8.1 + Prom 53934 - 53993 6.3 43 22 Tu 1 . + CDS 54019 - 55185 581 ## COG3876 Uncharacterized protein conserved in bacteria + Prom 55221 - 55280 6.1 44 23 Tu 1 . + CDS 55316 - 56653 525 ## COG1512 Beta-propeller domains of methanol dehydrogenase type + Prom 56714 - 56773 3.8 45 24 Tu 1 . + CDS 56797 - 57162 195 ## COG2315 Uncharacterized protein conserved in bacteria + Term 57245 - 57293 1.4 46 25 Op 1 . 
- CDS 57385 - 57855 297 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 47 25 Op 2 . - CDS 57874 - 58767 692 ## BT_2140 putative sodium-dependent transporter - Prom 58925 - 58984 7.6 + Prom 58775 - 58834 1.6 48 26 Op 1 . + CDS 58909 - 60168 964 ## COG0860 N-acetylmuramoyl-L-alanine amidase 49 26 Op 2 . + CDS 60178 - 61071 977 ## BT_2142 hypothetical protein 50 27 Tu 1 . + CDS 61423 - 62835 921 ## COG0593 ATPase involved in DNA replication initiation + Term 62954 - 62991 -0.8 - Term 62713 - 62749 -0.6 51 28 Tu 1 . - CDS 62940 - 63683 658 ## COG0778 Nitroreductase - Prom 63803 - 63862 7.6 + Prom 63692 - 63751 7.1 52 29 Tu 1 . + CDS 63968 - 66511 2137 ## COG0209 Ribonucleotide reductase, alpha subunit + Term 66550 - 66589 5.3 + Prom 66529 - 66588 13.1 53 30 Tu 1 . + CDS 66643 - 69324 1689 ## COG1640 4-alpha-glucanotransferase + Prom 69504 - 69563 6.2 54 31 Op 1 . + CDS 69627 - 70679 615 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases 55 31 Op 2 . + CDS 70663 - 71034 230 ## COG1539 Dihydroneopterin aldolase - Term 71054 - 71118 16.8 56 32 Op 1 . - CDS 71165 - 71689 364 ## COG1803 Methylglyoxal synthase 57 32 Op 2 . - CDS 71693 - 72721 474 ## COG1216 Predicted glycosyltransferases 58 32 Op 3 . - CDS 72724 - 73647 693 ## COG1560 Lauroyl/myristoyl acyltransferase 59 32 Op 4 . - CDS 73650 - 74969 276 ## PROTEIN SUPPORTED gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase - Prom 75164 - 75223 4.7 + Prom 74877 - 74936 5.7 60 33 Tu 1 . + CDS 74993 - 76753 1235 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) + Prom 76820 - 76879 3.1 61 34 Tu 1 . + CDS 76913 - 77134 270 ## + Term 77159 - 77205 0.2 - Term 77151 - 77189 4.3 62 35 Tu 1 . - CDS 77223 - 78131 677 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 78181 - 78240 5.7 63 36 Op 1 . 
- CDS 78277 - 79143 866 ## BT_2157 hypothetical protein 64 36 Op 2 9/0.000 - CDS 79182 - 80669 1623 ## COG0673 Predicted dehydrogenases and related proteins 65 36 Op 3 . - CDS 80682 - 81794 768 ## COG0673 Predicted dehydrogenases and related proteins - Prom 81971 - 82030 6.3 - Term 81872 - 81920 1.6 66 37 Tu 1 . - CDS 82033 - 83514 827 ## BT_2160 putative regulatory protein - Prom 83689 - 83748 3.9 - Term 83706 - 83760 8.5 67 38 Op 1 27/0.000 - CDS 83779 - 84222 712 ## PROTEIN SUPPORTED gi|160885187|ref|ZP_02066190.1| hypothetical protein BACOVA_03185 68 38 Op 2 11/0.000 - CDS 84237 - 84509 461 ## PROTEIN SUPPORTED gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 69 38 Op 3 . - CDS 84512 - 84856 576 ## PROTEIN SUPPORTED gi|160885185|ref|ZP_02066188.1| hypothetical protein BACOVA_03183 - Prom 84881 - 84940 10.1 + Prom 84857 - 84916 9.6 70 39 Op 1 . + CDS 85017 - 85463 467 ## COG1846 Transcriptional regulators + Prom 85465 - 85524 6.3 71 39 Op 2 . + CDS 85604 - 85780 294 ## gi|260174165|ref|ZP_05760577.1| hypothetical protein BacD2_20044 Predicted protein(s) >gi|225935336|gb|ACGA01000056.1| GENE 1 56 - 1120 760 354 aa, chain - ## HITS:1 COG:alr4033 KEGG:ns NR:ns ## COG: alr4033 COG1120 # Protein_GI_number: 17231525 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Nostoc sp. 
PCC 7120 # 18 299 2 284 333 250 45.0 4e-66 MNDNQNREKLNMQEQETITIRNLATGYKGKHKTHIVARDINASIYAGELTCLLGANGAGK STLLHTLSGFLPKLSGEIRIMNKEVEKYSDADMSKVISVVLTEKCDLRNMTVEEMVALGR SPYTGFWGVLKKGDKEIVSRAIAGVGIEVLTGRMMHTLSDGEKQKVMIAKALAQETPVIF LDEPTAFLDFPSKVEMMQLLHRLSRDTGKTIFMSTHDLELALQIADKVWMMDKENGVTIG TPEDLSLNGTLSNFFSRKGIMFDQNTGLFKINNEYSVKMRLEGHGQKYAMVRKALLRNGI LAGREIESDVYIETGNLQTEGFLLHLPENEVRKAEDIEVLLKLVSGYLVNSRYS >gi|225935336|gb|ACGA01000056.1| GENE 2 1113 - 2150 787 345 aa, chain - ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. PCC 7120 # 7 338 24 354 362 245 45.0 9e-65 MKTPVWLLMLLLVGLIFFLFFLNLVLGSVSIPFRAVWDICLWGDHENHIWTNIIWKSRMP QAVTALVAGAGLSVSGLQMQTVFRNPLAGPSVLGISSGASMGVAFVVLLSGSLGGVALSS LGLIGELALTVAAIAGALSIMALIAFASQKVKGNVTLLIIGVMIGYVANAVIGVLKFFSV EEDIRAYVIWGLGSFSRISGDQLFPFLCTMAVLLPLSFLLIKTLNLLLLGDSYARNLGLN IKRARLLVISCSGALVAIVTAYCGPIVFLGLAVPHLCRALFRTSDHRILMPSSLLVGAAM ALLCNLIARMPGFEGALPVNSVTALVGAPVVVSVLFRKRKGEVNE >gi|225935336|gb|ACGA01000056.1| GENE 3 2158 - 3342 1105 394 aa, chain - ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 54 390 83 421 426 192 32.0 1e-48 MSRYNSMRLKGKYWPVVCLLLFLVSCGQKSKNNTQAVGVAVSASDSSRIILPEYAEGFEM QYVEGGCLVTIQDPQKEKGKGEIYQYAFVRDKEAFAKNKVSADYVVQPVPIKNIICMTSL QLSNFIKLDELDHVVGITSTRHLFNEEVKRRLKEKKIRKIGIEGNFDNEVIISMNPDLIL ISPFKRGGYDALKEVGIPLIPHLGYKENSPLGQAEWIKFIGLLIGEEEKANAIFDSIKTE YNTLKQLTAKIEKKPVVFSGELRGGNWYAVGGRNFLAQLFRDAGADYFLKDDPNTGGVTL DFETVYSRAESADYWRILNSHNGTFTYDALKAQDSRYADFRAFKDKGVIYCNLKEKPYYE NVPTHPEVLLKDLIKVFHPELLPEYTPVFYELLK >gi|225935336|gb|ACGA01000056.1| GENE 4 3757 - 4890 1042 377 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 55 375 97 422 426 213 34.0 6e-55 MKSPISILTWLCLLFLASCISNKKTSLEAFKQDVYTPEYATGFKILGADNTASTLIQVSN PWQGAKDVKMSYFISRNGEQAPAGFTGPTIPAGAKRIVCMASSYIAMFDALGQVDKIVGV SGIDYVSNPYILAHKDSIKDMGPEMNYELLLGLKPDVVLLYGIGDAQTAVTDKLKELYIP YMYVGEYLEESPLGKAEWLVALAELTDSREKGIEIFREIPKRYQVLKALTESVEQRPTVM LNTPWNDSWVMPSTQSYMAQLVTDAGADYIYKENTSNSSTPIGLETAYKLIQKADYWINV GMASTLDELKTVNPKFVDAKAVREKTVYNNNLRTTPTGGNDYWESAVVRPDVVLRDLIHI FHPELVSDSLYYYRHLE >gi|225935336|gb|ACGA01000056.1| GENE 5 4887 - 5903 784 338 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 10 332 23 353 362 221 41.0 1e-57 MKQPDKRTPFLFITLGIVAILLLLIDMATGDTYIPITKVWAVLTGGECDEMTRNILLSIR FIRVVVAALIGIALSVSGLQMQTVFQNPLADPYLLGVSSGAGLGVALFILGAPLLGWAEF PILQSLGIVGSGWIGTAIILLGVAIISRKVKNILGVLIMGVMIGYVAGAIIQILQYLSSA EQLKMFTLWSMGSLSHITVTQLGIMIPMLCIGLLISVACIKSLNLLLLGENYARTMGMNI KRSRTFIFVSTALLTGTVTAFCGPVGFIGLAVPHVTRLLFNNADHRILVPGTMLTGLISM LLCDIIAKKFLLPVNCITALLGVPVILWVIAKNLRRFK >gi|225935336|gb|ACGA01000056.1| GENE 6 5900 - 6655 227 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 223 1 224 245 92 26 9e-18 MIELKELTLGYGQRTLLETVNARITGGQLVALLGRNGTGKSTLLRAMMGLEKPQSGEITL QGKNIASLKPEKLARNISFVTTDKVRIANLCCKDVVALGRAPYTNWIGQLQPEDQKRVDD AMQLVGMSGYAEKTMDKMSDGECQRIMIARALAQDTPVILLDEPTAFLDLPNRYELCLLL KKLAQEEKKCVLFSTHDLDIALSLCDSIMLIDNPQMYTLPTPEMVASGHIERLFRNESVT FDVQEMRVRIK >gi|225935336|gb|ACGA01000056.1| GENE 7 6757 - 6930 141 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174100|ref|ZP_05760512.1| ## NR: gi|260174100|ref|ZP_05760512.1| hypothetical protein BacD2_19719 [Bacteroides sp. 
D2] # 1 57 1 57 57 103 100.0 3e-21 MAGLLTDSLPDAFPTLTIDTWSVVKVNIKKLNGVSQQRDCPGFTPDSLLIQIVDMAI >gi|225935336|gb|ACGA01000056.1| GENE 8 7160 - 8845 1751 561 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 33 556 18 545 553 393 41.0 1e-109 MEVLRNLFEGYPNLWGGGVAHSVLILSLTIAFGIILAKIKVRGVSLGVTWILFVGIVFGH FNMNLDEHLLHFLKEFGLILFVYSIGLQVGPGFFSAFKKGGFTLNMLAMLVVVLGVTITI ILHFTTGTPITTMVGILSGAVTNTPGLGAAQQANSDLNGIDAPEIALGYAVSYPLGVVGI ILSLLALKYLLRINTAQEEKDAEVGLGHLQELTVRPISIEVRNESVNGIKIKELRPLLNR KFVISRIKHREGGETELVNSETTLHVGDIILVISTPIDVEAITVFFGKEIKMEWEQLNKE LISRRILITKPELNGKTLSQLKIRNNFGANITRVNRSGVDLVASPQLQLQMGDRVTIVGS ELAVAHAEKVLGNSMKRLNHPNLIPIFIGIALGCILGSLPFAFPGIPQPVKLGLAGGPLI VSILISRFGPKYKLITYTTMSANLMLREIGISIFLACVGLGAGEGFVDTVIHGGGYVWVG YGVIITILPLLITGLIGRYYCKLNYFTLIGVLAGSTTNPPALAYSNDLTSCDAPAVGYAT VYPLTMFLRVLTAQILILSLG >gi|225935336|gb|ACGA01000056.1| GENE 9 9513 - 11396 1238 627 aa, chain + ## HITS:1 COG:BMEI0657 KEGG:ns NR:ns ## COG: BMEI0657 COG4206 # Protein_GI_number: 17986940 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Brucella melitensis # 327 623 334 595 599 62 23.0 2e-09 MQKKMSPVQILSGKELEKLNVYSVADALRYFSGVQIKDYGGIGGLKTVNIRSMGSHHVGV FYDGIELGNAQNGVVDLGRFSLDNMEVISLYNGQKSAIFQPAKDYSSASAIYMQTRKPIF KGEKKNNLNIGVKGGSFSTINPSLLWEHRFNERISSSISTEYMYTSGRYKFTYAKKDGYD TTAVRQNGDVRMLRLENAFFGKVPKGEWKAKAYLYNSERGYPGAAVREEPGKFRHQDRQW DTNLFVQGSFQNYFKPWYSLLANGKYAYDYLHYLSDPRLDVTTMYVDNHYRQQEIYASAA HLFTIYPWWSMSLSNDFQWNTLRADLIDFVYPTRNTILTSAATSFDFNRLMLQASLLYTH VDDNTRTKGANAGTKNKYTPSVIATWQPLTKLPLNVRAFYKKVFRMPTLNDLYYTFIGNK DLKPEYTTQYDVGITFSHTWNNHWLKSLDLQIDGYYNEVDDKIIAMPTSNQFRWTMINLG HVEIRGLDAAIRGEWGFGKVEFSTLFNYTYQKAQDFTDPTSEWYGGQIPYIPWHGGSIIL NGSYQTWSCNYSFIYTGERYEAVANIPENYAQPWYTHDFSLSKTFQWGKTGIRVTAEINN IFNQQYEVVQCYPMPGTSFKIKLNVML >gi|225935336|gb|ACGA01000056.1| GENE 10 
11393 - 12550 732 385 aa, chain + ## HITS:1 COG:MA0512 KEGG:ns NR:ns ## COG: MA0512 COG3391 # Protein_GI_number: 20089401 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 97 346 107 345 461 62 29.0 1e-09 MRISVLSIIICFVLLSTSCREDERIIPSTPNQVTPGVSGTSVKGFFLLNEGNMGSNKATL DYFDYETGIYHKNIYAERNPGVVQELGDVGNDIQIYGNKLYAIINCSHFVEVMNVETAKH ITQISIPNCRYIVFKDKYAYVSSYAGPVKIDPNARLGYVARVDTTTLTVKDTCVVGYQPE EMVIVGNKLYVANSGGYRVPNYDRTVSVIDLNTFKVIKTIDVGINLHRMELDQYGNIYVS SRGDYYDTYSETFVIDSNTDKVIEELPLLPNTEMAICRDSLYVYSTEWSYYTNSWTVSYA IYNTKTKKTDTRNFITDGTEKSIKIPYGIAINPETGEFFVTDAKDYVTPGTLYCFSRDGK KKWSVITGDIPAHIAFTDKKLKEIQ >gi|225935336|gb|ACGA01000056.1| GENE 11 12585 - 12893 140 102 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 91 12 96 394 82 45.0 6e-15 MKKILFLLLTVVTVCSCYNDDDLWDKVNDLDGRVETLETTVKKMNSEITTLQSLVDALNQ GKVITNTEQTSDGYTLTFSDGSTVSIKNRKEWYRCSCDWSES >gi|225935336|gb|ACGA01000056.1| GENE 12 12874 - 14205 901 443 aa, chain + ## HITS:1 COG:no KEGG:BF3765 NR:ns ## KEGG: BF3765 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 147 212 365 715 87 33.0 1e-15 MIGVKADEDGIYYWTITTNGTTEWLPDATNKLKVTGTTPVMGVDSDGYWTVDTGEGAKRI NGTDGKPVKAEGQDGDSFFKSVTVNDNHVLVTLADGTSFTLPKGETYTFDIANRDYTLFS SNETRTYLLTTANIDDATLIAIPQGWTAVLNDKDLRITAPAVSGDDVKDGELKILVTPSK AFGKVIKMQLRAVREAHFLTFEDVDYKGDANMVGERNWSSLIDSQQYGGSLLYPKNNQLY NWSDANNTFLASELPNGWGDYQYWGGGHAISNYLDMDLTHGDFQHQLAVYYQDAKTGFGG HNGSKNFCVHYGYADNSGYANGPLPYIYFGDGVARVVDHMYVTMTTYLANCVANGNGLTA PAGKDDWVKLVAIGYDEDGKEVATRPEFYLVGAEGNILEWTKWDLSALGKVVKIDFNVTG SNDNGYGFSQPAYFAYDDVAVRF >gi|225935336|gb|ACGA01000056.1| GENE 13 14256 - 15917 928 553 aa, chain + ## HITS:1 COG:no KEGG:BT_1956 NR:ns ## KEGG: BT_1956 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 549 32 591 
593 371 42.0 1e-101 MKEEEDSPLLSAPVITLKEESPIYKTKIGREITIAPVYENTDSETTFNWTIDGKSICTAS TLTFSADEAGRYFILISATNAGGTTEKEIRIDVFQLDVPTISIADADKGFKVLQGSELTI TPIVDSFLETSYSWQINGEDISSELTCILPTTELGEYLVRFATHNEDGDDEVTFKVEVCS PDQVDFSWTFLQTEYNMSVGRTIRIRPVDVNNALDAVYTWTVDNEQVQSGENDTYLFTSE QQGEHTVKVEMRNSYILATQTLKINVCSPAGTYQRKISATSSASMNKVYEYLPAPGQFIN ENHTTTTMAEACTYAEDRINQTAYVSLGGFGGYIIVGFDHSIVNDGDYNIAITGNAFDGS SEPGIVWVMQDENGDGLPNDTWYELRGSEYGKAETWQDYAVTYHKPTGIQLPTPWTDNHG QSGSIDYLGAFHRQDYYYPAWVKEKQYTLRGTRLKERNYDQSGNGTYWVNDSYEWGYADN FSAIDRLTDDDNYNAGPSDNHFKISHAVTFDGKDANLKYVDFIKVQVGVNSKSGWLGEIS TEVFGIKDFNMLK >gi|225935336|gb|ACGA01000056.1| GENE 14 16046 - 17401 993 451 aa, chain + ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 11 448 12 441 443 202 30.0 1e-51 MKDSIDFGSMNISTLFRKLLIPTVLGMVFSALFVITDGIFVGKGIGSDALAAVNITAPLF MIAAGIGLMFGVGASVVASIHLSQGKRKVASINITQAIAFSALLILILSALCLYFIEPLA KLLGSSDRLLPLAVEYMAWYIPFLVFYEVLNIGMFCIRLDGSPTYAMMCNAVAAILNIIL DYIFIFEFGWGMMGAAFATSLGTVVGGLMTLIYLLRFSRTLHLCRIKLSIKSLLLTLRNI GYMIKLGSSAFISEASIACMMFLGNYVFIHHLGEDGVAAFSIACYFFPIIFMVYNAIAQS AQPIISYNFGAGQPERVRKTLHLAIRTALICGISFFILTVLCCQDIVSLFIDCSYAAFDI AVNGLPYFGVGFIFFAVNMIGIGYYQSVERGQRATIVTLLRGVVFMLIGFFALPPVLGTP GIWLAVPLAELLTTFYIFGIYLKDRFIVCRR >gi|225935336|gb|ACGA01000056.1| GENE 15 17490 - 17993 247 167 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 6 162 38 195 199 66 29.0 2e-11 MEEVHFKKRDLIVREGTKNCNLYFIKTGIWRAYYHKDGVDTTIWFASDGEAAFSVWGYVD NAYSLINIEVMCDSVAYCISRTALNQLFASSIGLANLGLRLMDHQFLQQENWLISSGSPR AKERYLNLIKETPELLQYVPLKHIASYLWITPQSLSRIRAKIASSSL >gi|225935336|gb|ACGA01000056.1| GENE 16 18122 - 20860 2112 912 aa, chain - 
## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 168 901 43 770 790 219 27.0 3e-56 MNQNIIKGLLVAAIWCSNILVLQAAQDNIAPRAHVTVSNVLNENYAADNLVDGKIMYDDK GEWACKGSVTSWGVMYTPWAQLEWDEEVCVDRVVLYDRVSLAEHLAGGTLHFSDGSQLSV TAIPNDGSPKSISFPKKKVKWIRFEATDGNGKNLGLSEIEVFAAHGNQTDYVEWVDPYIE TTRGRWFFCTPGGRPFGMVAAHAFTRNKNQGGGGYNYNFPDILGFSQINEWMISGPNIMP VVGEINPTEGMAGWKSPFKHESEIIQPGYHRLYLDRYKTWVEYTATERATFYRLNYTENV NAKLLVDVGSVLGNCSMDKATLLRVSNTRVVGEFFTTERFWGGPDSIKISFVLDCNRPIK SIDGWNEKGVLPDVEAIAGNAAGMVLNFGQLNAEELLFKMAFSYTSVDNAIANMDAELNH WDFDKVCKETRSVWNEALGRIAVEGGTDAQRIKFYTDLWHVLLGRHKINDVNGYYPDYAG NKYVNKRTSEPMKVRRLPLTADGKPKFNMYGFDGLWLTHWNLNVLWGLAWPEVMDDLSAC LVQYADNGKLLPRGACSGGYSFIMTGCPATSLLVSTYMKGIMKKADPLHTFDVIKRNHMP GGMMSYESADDLKFYISHGYCPDNASKTLEWAFQDWGASRMAARLGKHSDARMFEKRSRA WTPLFNAEQGLVFPKKRNGEWLHQDALSGNGWVEANSWQATWSLSHELPKLVKMMGGADK FCEKLNFAFEQARDLDFVYAYSGGYVSYANQPGCSNAHIFAYGGKPWLTQYWVRQVKERA YGGITPDKGYGGHDEDEGQMGGVSALMALGLFSVTGTESDTPYYDITSPIFDKITIKLNG DYYEGSTFTITTHNNSAENCYIQRAQLNDMEWNYAQFNHADFTKGGKLELWLGNEPNKSW GKLKYLAPDSDL >gi|225935336|gb|ACGA01000056.1| GENE 17 20878 - 22059 608 393 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 3 379 25 418 426 92 24.0 1e-18 MVKNNSYLKAIPVLAAFFVMGFCDIVGISSDYMQKSFNWSPTMTGFVPSMVFIWFLFLGI PVGNRMSKYGRKNTVLASMAVTVVGMFLPLLVYSSVTCIIAYVLLGIGNAILQISLNPLL NNVISSPRLLTSSLTAGQVIKAVSSLVGPEIVLFATLHFGDDKWYYCFPILGAITLFFAL WLTFTPIRREQVEESKVSVSDSFNLLKNRTILILFFGIFFIVGVDVATNYVSSKLMSIRY DWTAEQVKFAPQVYFFSRTVGALLGAFLLTRIAGARYFKVNILACIMMLVLLIGVQNPAV SLLCIGGIGFFASSVFSIIYSMAFQECPTKMNQISGLMLTAVAGGGVVTPVIGFAIDNAG ITAGVIVILLCVLYLTYCAFAVRERKCGEQTTL >gi|225935336|gb|ACGA01000056.1| GENE 18 22065 - 23168 872 367 aa, chain - ## HITS:1 COG:no 
KEGG:Bcav_2704 NR:ns ## KEGG: Bcav_2704 # Name: not_defined # Def: hypothetical protein # Organism: B.cavernae # Pathway: not_defined # 44 363 73 387 390 145 27.0 3e-33 MNKFIYLLACTLFVSCQNNNKSSVTQMEVELDSMWCTRLMPGLGGGVTGGDGAISIDLKD GRNLFMWGDSFFGDVVDDKRSKDSKFVMGNTFTIINEKGELETLYSGDLKNPSAYIPAEQ DGDSPRWYWPGDGFVKDGILHLFMSKFRKVGEGSFGFEYMCCDYFRLDVKTMKIIDEMNI PAANQNGVHYGHAVMPYQNAIYIYGTKSDSTGAKAVVHVAKAELVDNKLANFVYWDGASW QADATKTAKLEGLQKNISEQFNVFSLNEKIVLVSQNRSGNAKEIYSYIADSPEGAFSHEK LLYTVDEPNFEQDSMMTYNTMVHPQYMKNDKILMCYNVNTYDLKKVFEKASLYQPRFFWV PVDHILK >gi|225935336|gb|ACGA01000056.1| GENE 19 23185 - 24678 1264 497 aa, chain - ## HITS:1 COG:no KEGG:Phep_0755 NR:ns ## KEGG: Phep_0755 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 18 492 23 491 494 280 38.0 1e-73 MKKYILSILLSGAFMFTACSDFLDEVPKGNITSEGYYKTAQHAISATNAIYDYLIIGYAP NGLWDKNYGGTFYNDYWVLQDMFADNSETNQTSIDYQSVENMQIDQYNQPVELLWRDFYQ TIKCCNVVIDKVPSIDMDVTLRNQLVAEAKFFRAMMYFDLIRMFGDVPLREHNVESAEED ATSRTSKETIYELIFSDLKTAETDLKYTERFGGGRPYPASASALLARVYLTYAAEHHSQE HYQLAVDKAKSVIPNFPMLENYGDLFKVANRFNTEIIWGVNFSATLSEGWKGAQFLVRLL PALDGVSNAQGWENATEDLWNSFDQQNDQRLDVTLKRSFTYSDGSIKDFDKPYVFKYWDS QAEPKGNNTEAIFPAIRTAEMYLIIAEALNEINQGANEEAVIAINAVRNRAGLTTAVPTD YKGFKKAVLNEYRHEFVMEGHRWFDLTRMCTPQEFVSIIKAAKPESTPKEYHVKFPIPQR EIDLSQNMITQNEGYGQ >gi|225935336|gb|ACGA01000056.1| GENE 20 24692 - 27793 2291 1033 aa, chain - ## HITS:1 COG:no KEGG:Phep_0754 NR:ns ## KEGG: Phep_0754 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 10 1033 19 1038 1038 784 43.0 0 MNNLFRSITLLLLMFILGSVSAQNKSLNVKGTVRDAMGDPIIGATILQQGSTYGTISNKD GEFSLSVPDKSVLEISFVGYQTKIVKVTAQKREFHIVLEDDMQVLADVVVVGYGVQKKSD LTSSISTIKSEQLSSTSITSLDQGIQGRAAGVAVLNTSGQPGAGTSIRIRGTSSINGNNE PLYVIDGVPVISDANTFSTGTLKNPALNPLTNINPNDIESIEILKDASATSIYGARGANG VVLVTTKQGKNGKPKVSIGAKYTLQQVTKKMDMLNAVQLAELGNEATDNAGEERNPVFAG LNNLSKLNTDWQDEIFRTAPMQNYDISVSGGNDKTTYFVSGNLLLQDGIIIGSDFGKGSF 
RINLGQQINKWLKTGVSVNLSYSRSNGVVTNSEGGFASSITSWALEMNPALPVRNESGDY IYENNLTTTNNVGNPVQDAYEAKNRNTSFRTLANAYLEWVPIKNLTFKTSIGVDYFYIKD QAFAPGEIKRAESNGGYASIGNRDGYNWVWENTVNYTNTFEKHTISALAGMTAQAFVSEN SAVSTADFTDGFLGFNSIQSGALRQAASSGISEWQMLSYLARVNYNYGGRYLMSLTGRID GSSKFGKGNKYGFFPSISGAWRISEENFMKRAQAISNMKLRVSYGIVGNEGIPPYSSQGL MYNTEAYFGNSEVATGVVPYTLSNQDLKWETTSQVNLGFDLGLFNNRLTLTADYYRKNTR DLLLAMPVSFNTGYDTAVKNVGSLRNEGFEFALGAVPFAGKFGWDMNFTLGYNKNEITDL AGSQENLSGASILGVTYWTKINEGKPIGTIYGYKTDGIAQLSEDLSKIPYFSGKTLKHGD RKYVNKNGDNVINEDDLYELGNANPDFTFGFNNTFTYNLRDHSSIGLTVYLQGAVGNEIV NFNKFSLESFDGHKNNSVAALERWTPENPTNKYPRATTKSSGTILSDHYVEDGSYLRIKD ITLSYTFPKSILQKFYCEGLTIFAGLKNIYTFTNYSGYDPEVSRFSNDNLSMGADYGSYP MSKSYEFGLRMNF >gi|225935336|gb|ACGA01000056.1| GENE 21 27818 - 28690 646 290 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174114|ref|ZP_05760526.1| ## NR: gi|260174114|ref|ZP_05760526.1| hypothetical protein BacD2_19789 [Bacteroides sp. D2] # 1 290 1 290 290 559 100.0 1e-158 MKTLKKIIFISIVLGMISSLIISCKSDDKQQTGLLTLSILEGKIKIGDHVSVELSLSDPD LISKIVVKKSIEGKEVSSYLKELNVSELNFPYTFTEEIIAGDENGILVYSFYGMDENNKV VDAGDLVLTVELAQIPLLLKYDWVLASQTIKGEDTATPDLKDDIHRFNSDLTWQVDWGYI FSSAALETLNSYCAWQVTMDGATVKTLSTIHYNVFSPAEALITHYNVLQLSDRKMILESY QDLSGLGDGYSSNERVLEVYTPVSKTEDFTPYRGQNPDNYIVASCNPGSY >gi|225935336|gb|ACGA01000056.1| GENE 22 28712 - 29812 641 366 aa, chain - ## HITS:1 COG:slr0329 KEGG:ns NR:ns ## COG: slr0329 COG1940 # Protein_GI_number: 16331233 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Synechocystis # 7 286 30 288 327 90 29.0 3e-18 MYEYDKRVVLTLDAGGTNFVFSAIQGNNEMISPIGLPAVSDNLDECLEVLVKGFDRVIAA IPVPPVAISFAFPGPADYENGIIGDLPNFPSFRGGVALGPFLKHKYGIPVFIENDGNLFA YGEALSGALPMINRELSLAGCSREYKNLIGITLGTGFGAGVVINKVLLTGDNGCGGDIWL MRNKKYPDMLAEESVSIRAVRRVYSDLSGQSSASLSPKDIYDIAEGIKEGNRHAAIASFN ELGMMAGAAIASVLNVVDGLVVIGGGVAGASKYILPALITELRAQLTTFSGNKFPCVSMQ VYNGEDESERARFLSLSDSQVVIPGTHLTTTYHQLKQTLVLCSREGASRSIMRGAYAYAL QKIDLT >gi|225935336|gb|ACGA01000056.1| GENE 
23 29842 - 31572 1687 576 aa, chain - ## HITS:1 COG:SA1945 KEGG:ns NR:ns ## COG: SA1945 COG1482 # Protein_GI_number: 15927717 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Staphylococcus aureus N315 # 227 459 2 217 312 82 26.0 2e-15 MRKANYDKFPSTKLTGMLVQGWDAIISVLKKQMDARKVLAVDLYTGVYEEEVLDAFSKEF SGKVMNVRDLMKPEQEIQTLTERFMTEDVLFGYVTNLKLEDYFDAGKLAAAQKQISEAED TIVIIGTGAAMVAPQDAMIVYADMARWEIQQRFRRHEVKALGIDNRKDAVSLQYKRGYFN DWRVCDRYKERLFDRVEFWIDTHVAGTPKMIDKDTFFKGVEATVNTPFRVVPFFDPAPWG GQWMKEVCDLDRERENFGWCFDCVPEENSLYFEVNGVRFELPSVDLVLLKSKELLGEPVE ARFGKDFPIRFDFLDTIGGGNLSLQVHPTTQFIRDSFGMYYTQDESYYMVDAEEDAVVYL GVKTGVDKEAMIDDLRKAQKGELVFDAEKYVNKIPTKKHDHFLIPGGTIHCSGANSMVLE ISSTPNLFTFKLWDWQRLGLDGKPRPINVERGKCVINWNRDTEYVNEHLRNQFKEVASGD GWIEERTGLHPNEFIETRRHRFSSPVLHHTNDSVNVLNLLEGEEAVVESPTHAFEPFVVH YAETFIIPASVGEYTIKPYGKCRDKECVTIKAYVRF >gi|225935336|gb|ACGA01000056.1| GENE 24 31714 - 32442 509 242 aa, chain + ## HITS:1 COG:CAP0006 KEGG:ns NR:ns ## COG: CAP0006 COG2188 # Protein_GI_number: 15004711 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 239 1 235 237 99 27.0 6e-21 MKIDHSSDKPLHIQAEEILRRLIESEEYKNGKLFPNEVELSEQLHISRNTLRQAINKLVF EGLLVRKKGYGTKVVKKGIVGGVKNWLSFSQEMKMLGIEIRNFELHISLKRPTEEIGTFF NLGSNPDARCVVMERVRGKKEYPFVYFISYFNPNIPLTGEEDYTRPLYEMLETQYNIVVK TSKEEISARLAGEFIAEKLEIKSSDPILIRKRFVYDVNGVPIEYNIGYYRADSFTYTIEA ER >gi|225935336|gb|ACGA01000056.1| GENE 25 32562 - 33380 361 272 aa, chain - ## HITS:1 COG:no KEGG:BT_2114 NR:ns ## KEGG: BT_2114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 265 1 256 258 370 74.0 1e-101 MANWLTTKQMSERHDIQEAILKNWANLGYITSSRINDQLFLDDESLDAYLEAHKRLGLEA DYLSKIVEEKKLERDFIISRYDDLLYVLRTQKTCKPLYQIIIRELAALILHPIARNVFYS ISTGESVEKVAGRHRITYEKTLQIYNSILKGLKLKKDILATYRKRAINARFLSLADNNKN INIEQEEWILQLPVCKVADTRLANVLYNQDVRTVKDLLEIVSGRGWKSMLRIEGVGKISY YHLLSKLQMIGVVDESLDRILAGHSVGRFEKR 
>gi|225935336|gb|ACGA01000056.1| GENE 26 33778 - 34203 529 141 aa, chain - ## HITS:1 COG:no KEGG:BT_2500 NR:ns ## KEGG: BT_2500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 130 172 114 47.0 8e-25 MKTNLFSKAVVMVAVVMASVMNFSASAMNPTEFVTNNEMTGEVITAKNIYRNEDGHLYRH LRYTYTYDNENRVISKEAAKWDSVKEVWTPYFKMNVEYSANEVTVDYARWNDRSKTYNSN VEKAVYALNDNNATLLLASTK >gi|225935336|gb|ACGA01000056.1| GENE 27 34542 - 35594 553 350 aa, chain + ## HITS:1 COG:alr3273 KEGG:ns NR:ns ## COG: alr3273 COG1619 # Protein_GI_number: 17230765 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Nostoc sp. PCC 7120 # 19 344 34 362 368 160 31.0 3e-39 MKIIKNQFTIILLLFCGILVSCKTTQTASNRSLMLEERCKQKNTSNSNFQFPPFLQAGDK VAIVSPSGKIDTQFVEGAKQRLESWGLKVITGKHVCDSSGLYAGTIKHRLKDLQRAMDHL EVKAIFCSRGGYGAVHLIDKLDFTAFRKHPKWLIGYSDVTTLHNLFQKNGYASLHSPMAY HLTMESEDDPCIMYLKDILSGNPPTYTCEKHELNKQGHAQGVLRGGNMSVICGLRGTPYD IAAEGTVLFIEDVNEEPQAIERMMYNLKLGGVLEKLSGLIIGQFTKYEEDGSLGKDLYAA LADLVKEYNYPVCFNFPVGHVTHNLPLINGARVEFTVGEKEVELKFNLNR >gi|225935336|gb|ACGA01000056.1| GENE 28 35745 - 36179 309 144 aa, chain + ## HITS:1 COG:no KEGG:BF3357 NR:ns ## KEGG: BF3357 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 144 1 144 144 111 38.0 9e-24 MDEKINYDDIPKAFLYCSYKQCPRHNKCLRYQAMLCIPAQVPYYSTVNPCHIAGNEKNCH YFKPYQTSHFASGIDHLLDNIPHSLAITIRKELYSLMGRNMYYRIRNQERLLHPDEQEQI AAIFLKHGIKPKPEFDKYIDKYDW >gi|225935336|gb|ACGA01000056.1| GENE 29 36351 - 37703 1221 450 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 7 434 4 432 448 333 40.0 4e-91 MTGQKTPTALGTEKIGKLLMQYAIPAIIAMTASSLYNMVDSIFIGHGVGAMAISGLALTF PLMNLAAAFGSLVGVGAATLISVKLGQKDYDTAQRVLGNVFVLNLIIGISFTVIVLPFLD 
PILYFFGGSDETVKYAREFMQVILLGNVVTHLYLGLNAVLRASGHPQKAMMATITTVIIN VLLAPLFIFVFDWGIRGAATATVCAQLIALVWQLHLFCRKDELIRLKKGIFRLKRKIVLD SLAIGMSPFLMNMASCFIVILINQGLKEYGGDLAIGAFGIVNRIVFVFIMIVLGLNQGMQ PIAGYNFGAKLYPRVTKVLKATICCATVVTTIGFLIGMFIPEIVSSIFTSDEELISIASK GFRIVVFFYPIVGFQMVASNFFQSIGMASKAIFLSLTRQMLFLVPCLLILPHYYGQMGVW ASMPVADLAASLISGGMLWWQFRQFKKAVA >gi|225935336|gb|ACGA01000056.1| GENE 30 37809 - 39536 2273 575 aa, chain + ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 13 501 33 510 515 456 48.0 1e-128 MNILELSEQEIIRRNSLNELRAMGIDPYPAAEYVTNAFSTDIKAEFKDDEEPRQVSVAGR IMSRRVMGKASFVELQDSKGRIQVYITRDDICPGEDKELYNSVFKRLLDLGDFIGIEGFV FRTQMGEISIHAKKLTVLAKSIKPLPIVKYKDGVAYDSFEDPELRYRQRYVDLVVNDGIK ETFLKRATVVKTLRNALDEAGYTEVETPILQSIAGGASARPFITHHNSLDMDLYLRIATE LYLKRLIVGGFEGVYEIGKNFRNEGMDKTHNPEFTCMELYVQYKDYNWMMNFTEKLLERI CIAVNGSTETVVDGKTINFKAPYRRLPILDAIKEKTGYDLNGKNEEEIRQVCKELKMEEI DETMGKGKLIDEIFGEFCEGTYIQPTFITDYPVEMSPLTKMHRSKPGLTERFELMVNGKE LANAYSELNDPLDQEERFKEQMRLADKGDDEAMIIDQDFLRALQYGMPPTSGIGIGIDRL VMLMTGQTTIQEVLFFPQMRPEKVVKKDAAAKYMELGIAEDWVPVIQKAGYNTVADMKDV NPQKLHQDICGINKKYKLELTNPSVNDVTEWIQKI >gi|225935336|gb|ACGA01000056.1| GENE 31 39590 - 40558 780 322 aa, chain + ## HITS:1 COG:TM0378 KEGG:ns NR:ns ## COG: TM0378 COG0240 # Protein_GI_number: 15644628 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Thermotoga maritima # 1 317 8 311 323 151 30.0 2e-36 MGGGSWATAIAKMCLAQEDSINWYMRRDDRIADFKRLGHNPAYLTGVKFDTRRITFSSNI NDVVKESDTLIFVTPSPYLKAHLKKLKTKIKDKFIITAIKGIVPDDNVIVSEYFTKEYGV PPENIAVLAGPCHAEEVALERLSYLTIACPDKDKARIFARRLGSSFIKTSVSDDVAGIEY SSVLKNVYAIAAGICSGLKYGDNFQAVLISNAIQEMNRFLNTVHPLNRNVDESVYLGDLL VTGYSNFSRNRTFGTMIGKGYSVKSAQIEMEMIAEGYYGTKCIKEINKHHHVNMPILDAV YNILYERISPMIEIKLLTDSFR >gi|225935336|gb|ACGA01000056.1| GENE 32 40577 - 41914 1573 445 aa, chain + 
## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 2 445 5 449 450 484 54.0 1e-136 MISLNIEKTFGFISKEKVFAYEAEVKAAQEMLEKGTGKGNDFLGWLHLPSSITKEHLADL NATAKVLRDNCEVVIVAGIGGSYLGARAVIEALSNSFTWLQEKKTAPVMIYAGHNISEDY LYELTEYLKDKKFGVINISKSGTTTETALAFRLLKKQCEDQRGKETAKKVIVAVTDAKKG AARVTADKEGYKTFIIPDNVGGRFSVLTPVGLLPIAVAGFDIDKLVAGAADMEKVCGSDV AFAENPAAIYAATRNELYRNGKKIEILVNFCPKLHYVSEWWKQLYGESEGKDNKGIFPAS VDFSTDLHSMGQWIQEGERSIFETVISVEKVNHKLEVPSDEANLDGLNFLAGKRVDEVNK MAELGTQLAHVDGGVPNMRIVLPELSEYNIGGLLYFFEKACGISGYLLGVNPFNQPGVEA YKKNMFALLDKPGYEEESKAIRAKL >gi|225935336|gb|ACGA01000056.1| GENE 33 42014 - 42520 505 168 aa, chain - ## HITS:1 COG:no KEGG:BT_2125 NR:ns ## KEGG: BT_2125 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 22 187 187 207 69.0 1e-52 MKKRVLLLLPLFLGACSNSGESPVIKIEKLYAKIESDGENTSSGEKDYANQREELPALKV GDEVKAFLLLDGNGAELKTFKLQNDDEVDTKLFYEQTEVSTEGNLTDVEKGQLRFKDGVS KARIMVIATIKQVDKNGDVKLEFYLSSKAECEGAQEEIGLKTKAEDDK >gi|225935336|gb|ACGA01000056.1| GENE 34 42711 - 43520 630 269 aa, chain + ## HITS:1 COG:no KEGG:BT_2126 NR:ns ## KEGG: BT_2126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 266 4 265 268 295 58.0 2e-78 MAGNLLKKSSGLMLLCSVSLLMIVMTGCQEAKLKTVIGIANKQCPLDMGEVGKITSIIYD GDNVVYTLNMNEEITNIKILKDNPESMKSSIKMMFQNPAADVKEMLKLMAKCNSGLHMIF VGNKSGEQAVCELTAEELKEVINTNADPAQSEQTKLEAQLKMANLQFPMQASEEVVVEKI ESIGESVVYICSVDEELCPISQIEENAAEVKESIVSTLASQTDPATQIFIKTCVENNKNI TYRYIGKDSGKQYDVIIPRSDLKKMIIEK >gi|225935336|gb|ACGA01000056.1| GENE 35 43557 - 44240 591 227 aa, chain + ## HITS:1 COG:MA0451 KEGG:ns NR:ns ## COG: MA0451 COG0637 # Protein_GI_number: 20089342 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Methanosarcina acetivorans str.C2A # 5 217 2 206 
218 120 37.0 2e-27 MKKKLKAVLFDMDGVLFDSMPYHSEAWHKVMKSHGLTLSREEAYMHEGRTGASTINIVFQ RELGREATQEEIESIYQEKSVLFNSYPEAKPMPGAWELLQKVKKDGLIPMVVTGSGQLSL LERLEHHYPGMFHKELMVTAFDVKYGKPNPEPYLMALKKGGIKADEAVVVENAPLGVEAG HNAGIFTIAVNTGPLNGQVLLDAGADLLFPSMQALNDTWDMMTENDI >gi|225935336|gb|ACGA01000056.1| GENE 36 44363 - 47215 2026 950 aa, chain + ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 585 887 2 275 364 90 26.0 2e-17 MQKLLSLPPNLIHCFHELEEVNHTDWFCTSDPIGSKLGSGGGTTWLLQACHQAFAPQESF SNWIGHEKRILLHAGGQSRRLPSYGPSGKILTPIPIFSWERGQKLGQNLLSLQLPLYERI MNQAPAGLNTLIASGDVYIRSEKPLQDIPNADVVCYGLWVNPSLATHHGVFVSDRKKPEV LDFMLQKPSLEELEGLSKTHLFLMDIGIWILSDRAIEVLMKRSLKGGTKDITYYDLYSDY GLALGEHPKTEDEEINQLSVAILPLPGGEFYHYGTSHELISSTLAIQDKVRDQRRIMHRK VKPNPAIFIQNSITQVSLSADNANLWIENSHVGKEWKLGSRQIITGVPENQWSINLPDGV CIDIIPIGENEFVARPYGLDDVFKGALDKITTTYLNVPFTRWMEDRGITWEDIKGRTDDL QSASVFPKVDSVEDLGILVRWMTSEPQLEEGKKGWLKAEKVSADEISASANLKRLYEQRN AFRKENWKGLAANYEKSVFYQLDLLDAANEFVRFNLDMPDILKEDAAPMLRIHNRMLRAR IMKLREDKDCAKEEQAAFQLLRDGLLGVMSERKSHPVLNVYSDQIVWGRSPVRIDVAGGW TDTPPYSLYSGGSVVNLAIELNGQPPLQVYVKPCKEYHITLRSIDMGAMEVIRNYEELQD YKKVGSPFSIPKAALTLAGFAPAFSTESYPSLAKQLEDFGSGIEITLLAAIPAGSGLGTS SILASTVLGAINDFCGLAWDKNDICSYTLVLEQLLTTGGGWQDQYGGVFSGIKLLQSEAG FEQNPLVRWLPDQLFVHPDYRDCHLLYYTGITRTAKSILAEIVSSMFLNSGPHLSLLAEM KAHAMDMSEAILRSNFESFGRLVGKTWIQNQALDCGTNPPAVAAIIEKIKDYTLGYKLPG AGGGGYLYMVAKDPQAAGQIRRILTEQAPNPRARFVEMTLSDKGLQVSRS >gi|225935336|gb|ACGA01000056.1| GENE 37 47329 - 47439 128 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEARKKKWGSVALIIGAIGFIIIMIYFTVISSLNM >gi|225935336|gb|ACGA01000056.1| GENE 38 47839 - 48267 308 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174132|ref|ZP_05760544.1| ## NR: gi|260174132|ref|ZP_05760544.1| hypothetical protein BacD2_19879 [Bacteroides sp. 
D2] # 1 142 1 142 142 265 100.0 8e-70 MKLLKLLFLSVILLSACNDDSTDTPETGEPKQIIVEMASNDFTDVSGTYTLETNKEYCTR EDGKYKILLVKSIYINKEGKEKDILTYGILDVANEKYLYFTEKIDVAQRLERDWVFYNQI TNPDFPDDPNRRLVCNSYTKKD >gi|225935336|gb|ACGA01000056.1| GENE 39 48754 - 49791 897 345 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 10 345 3 327 327 344 50.0 1e-94 MSYLIKPKNYKPLLDLKQTELGIKQIKEFFQLNLSSELRLRRVTAPLFVLKGMGINDDLN GIERPVSFPIKDLGDAQAEVVHSLAKWKRLTLADYHIEPGYGIYTDMNAIRSDEELGNLH SLYVDQWDWERVITNEDRTVNFLKEIVNRIYAAMIRTEYMVYEMYPQIKPCLPQKLHFIH SEELRQLYPNLEPKCREHAICQKYGAVFIIGIGCQLGDGKKHDGRAPDYDDYTTKGLNDL PGLNGDLLLWDDVLQRSIELSSMGIRVDKEALQRQLKEEKEEKRLELYFHKRLMNDTLPL SIGGGIGQSRLCMFYLRKAHIGEIQASIWPEDMRKECEELEIHLI >gi|225935336|gb|ACGA01000056.1| GENE 40 49798 - 50460 576 220 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 220 8 226 231 273 59.0 2e-73 MNVQIEESWKAHLEPEFEKDYFRTLTDFVKSEYSQYQIFPPGKLIFNAFNLCPFDKVKVV IIGQDPYHGPGQAHGLCFSVNDGVPFPPSLVNIFKEIKADIGTDAPATGNLTRWAEQGVL LLNATLTVRAHQAGSHQNRGWETFTDAAIRALAEEKENLVFILWGSYAQKKGAFIDRNKH LVLTSAHPSPLSAYNGFFGNKHFSRTNDYLKAHGETEIAW >gi|225935336|gb|ACGA01000056.1| GENE 41 50605 - 53322 2336 905 aa, chain - ## HITS:1 COG:no KEGG:BT_2133 NR:ns ## KEGG: BT_2133 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 905 1 902 902 1543 91.0 0 MAPLKAKILISFILLIVLLMLPDEATSQRRRRGMIASNTAQTDSLQGNDSLKADTVTVRI DSIAPVKKKQPLDAPVVYESNDSTVFTLGGAATLYGSGKVNYQNIELAAEVISMNLDSST VHAYGIKDTTGVEKGKPVFKEGDTSYDTETIRYNFKTKKAGITDIVTQQGEGYVTGSKAK KGANDEIFMEQGRYTTCDHHDHPHFYMQLTRAKVRPKKNVVTGPAYLVVEDVPLPLAVPF FFFPFSSSYSSGFIMPTYMDDSSRGFGLAEGGYYFAISDIMDLKMTGDIFTKGSWRLSGL 
TNYNKRYKYSGTLQADYQVTKTGDKGMPDYTVAKDFKVVWNHRQDAKASPNSTFSASVNF STSSYERSNINNLYNSQLLTQNTKTSSISYSRSFPDIGLTLSGTTNIAQTMRDSSIAVTL PDLNITLSRLFPFKRKKAAGAERWYEKISISYTGRLTNSIRTKDDRLFKAGLSEWENAMN HNIPISATFTLFKYLQVSPSVNYTERWYTRKVNQHYNEAEHELEALPGDTLNGFYRVSNY SASLSLSTKLYGMYKPLFAKKKEIQIRHVFTPQVSLSGAPGFSKYWEEYTDYNGNTQYYS PFTGQPYGVPSREGSGTVSFSISNNLEMKYYDAKKDTLKKVSLIDELGASMSYNMAAKER PWSDLSMNLRLKLTKNYTFNMNASFATYAYTFDKSGNVVTSNRTEWSYGRFGRFQGYGSS FNYTFNNDTWKKWFGPKEEDEKGKDKNKSEDSDDGESDGTEGDGTTPKKVEKAQADPDGY QVFKMPWSLSLSYSFNIREDRTKPINRYSMRYPYTYTHNINANGNIKISNNWSLSFNSGY DFQAKEITQTSCTISRDLHCFNLSASLSPFGRWKYYNVTIRANASILQDLKYEQRSQTQS NIQWY >gi|225935336|gb|ACGA01000056.1| GENE 42 53412 - 53948 374 178 aa, chain - ## HITS:1 COG:MJ0778 KEGG:ns NR:ns ## COG: MJ0778 COG1418 # Protein_GI_number: 15668959 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Methanococcus jannaschii # 21 161 18 149 169 72 35.0 5e-13 MNPYEIIDKYYPENTQQRQILVIHSLAVSGKAMKMLDAHPELRLNRSFVKEAALLHDIGI FQTDAPTIQCFGPHPYIAHGYLGAEILRAEGFPQHALVCERHTGAGLSLEDIIAQQLPVP HREMLPITLEEQLICFADKFFSKTHLDEEKTVEKARKSIAKYGEEGLSRFDRWCSLFL >gi|225935336|gb|ACGA01000056.1| GENE 43 54019 - 55185 581 388 aa, chain + ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 32 387 45 414 414 249 40.0 6e-66 MQTRVFFILIFLFIMLFPMRVNSQIVTGAEQMDQYLPLLKGKRVGMVVNHTSVVGAKQTH LLDTLLKRNVQVVKVFAPEHGFRGNADAGETVKNGKDSRTGTPIVSLYGDNKKPSAAQLK DIDVILFDIQDVGARFYTYISTMYYVMEACAENKKEMIVLDRPNPCDYVDGPILKPGYKS FVGMLPIPVLHGCTIGELAQMINGEEWIANKKNPCSLKVIPMVGWKHGEPYSLPIKPSPN LPNDQSIRLYASLCPFEATRISVGRGTTFPFQVLGAPNKKYGSFTFTPRSLPGFDKNPMH KGITCYGEDLRNVTDVNGFTLRYFLDFYRLSGESAAFFSRARWFDLLMGTDSVRKAILKG KSEEAIRNSWQKELQDYKEIRKKHLLYE >gi|225935336|gb|ACGA01000056.1| GENE 44 55316 - 56653 525 445 aa, chain + ## HITS:1 COG:BMEI0229 KEGG:ns NR:ns ## COG: BMEI0229 COG1512 # 
Protein_GI_number: 17986513 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Brucella melitensis # 65 212 34 180 253 87 32.0 4e-17 MKDKTQKSFRGIVRILLLVFCYTFGSHAFALALEHEQTVDIYKVTDVPNPRNESSSNWVS NPNQILDESYVWEINNMLSQLEDSLSIEVTVVALPSIGEDIPAEFAHKLFEHWGIGKKAD DNGLLILLVLDQRKVTFATGYGLEGVLPDALCFRIQQNEMVPWFRKNDFDRGMTEGVRTV TLVLYGSDYEPVSQGTSDNYWKSAGNTLWNFLANQPPMLWVFLILVNVITYLMKVNKARP KDSSALAAIKVLTLYNPLGCLVLFFPVWPALIAASLWYKFYQKQRVILQSKTCDSCKAVA LQLLPNELATPLLSASEQTEHKLGSAIHRIYQCTSCGWLLRYKSITISEYRMCNQCHTIA SKRISPWKTIKEPTYSDAGLEVADSLCLMCGDKKQTTQKIPRKTPPNSDSSSNSHSSSSG RSSSRSSSGSFGGGHSGGGGASSSF >gi|225935336|gb|ACGA01000056.1| GENE 45 56797 - 57162 195 121 aa, chain + ## HITS:1 COG:lin1469 KEGG:ns NR:ns ## COG: lin1469 COG2315 # Protein_GI_number: 16800537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 3 121 7 121 127 62 30.0 2e-10 MNIEEFREYCLSFKGVHEKMPFPNVPDQYSRDVLCFYVADKWFCFVNIEIFDFCCIKCNP DESGELQTEYAGIKPGWHMNKKYWISVYFNQDVSDDKIKELVSKSYDIVVKSLTKKERVS V >gi|225935336|gb|ACGA01000056.1| GENE 46 57385 - 57855 297 156 aa, chain - ## HITS:1 COG:alr3535 KEGG:ns NR:ns ## COG: alr3535 COG0454 # Protein_GI_number: 17231027 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Nostoc sp. 
PCC 7120 # 2 156 1 154 156 75 31.0 3e-14 MITIRIALNTDIEEIQSLYRNTVLVINRRDYSQAEVEDWASCGDDPSKIEGMIKTHYFIV AVNRQSEIVGFSSITPQGYLHSMFIHKDFQGKGVATLLLNEIERYAVAAGITRITSEVSI TARPFFEKRGYIVEVEQKRRANQLSLTNYWMAKSLV >gi|225935336|gb|ACGA01000056.1| GENE 47 57874 - 58767 692 297 aa, chain - ## HITS:1 COG:no KEGG:BT_2140 NR:ns ## KEGG: BT_2140 # Name: not_defined # Def: putative sodium-dependent transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 503 92.0 1e-141 MLKFLKNWTLPIAMLVGAVGYPLFISLSFLTPYLIFTMLLLTFCKVSPRDLKPKPLHMWL LLIQIGGALAAYLLLYRFDKIVAEGVMVCIICPTATAAAVITTKLGGSAASLTTYTLIAN IGAAIAVPILFPLVEVHPDVTFWEAFLVILGKVFPLLICPFLVAWLLSKCLPKVHQKLLG YHELAFYLWAVSLAIVTAQTLYSLLNDPADGFTEIMIAVGALIACCLQFFLGKTIGSIYN DRISGGQALGQKNTILAIWMAHTYLNPLSSVAPGSYVLWQNIINSWQLWKMRKKEVK >gi|225935336|gb|ACGA01000056.1| GENE 48 58909 - 60168 964 419 aa, chain + ## HITS:1 COG:aq_1681 KEGG:ns NR:ns ## COG: aq_1681 COG0860 # Protein_GI_number: 15606778 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Aquifex aeolicus # 30 255 130 353 359 131 37.0 2e-30 MKLNRPYILYIFICLWLLFLPSCTNHLWGKDFVVVIDAGHGGHDPGAIGKISKEKNINLN VALKVGNLIKRNCDDVKVIYTRSKDVFIPLDRRAEIANNAKADLFISIHTNALANNRTAK GASTWTLGLAKSDANLEVAKRENSVILYESDYKTRYAGFNPNSAESYIIFEFMQDKYMEQ SVHLASLMQKQFRQTCKRADRGVHQAGFLVLKASAMPSILIELGFISTPEEERYLNSEEG AGSMAKGIYRAFLNYKREHELRLTGVSKTIIPTEQEEDNAPEIAQKDTEIINTAPQQQEL LAEAKTKPAATAKTAPKRPIVAESATNDSEITFKIQILTASKPLAKNDKRLKGLKDVDYY KEGGIYKYTYGASTDYNKVLRTRRTITAEFKDAFIIAFRNGEKMNINEAIAEFKKRKNK >gi|225935336|gb|ACGA01000056.1| GENE 49 60178 - 61071 977 297 aa, chain + ## HITS:1 COG:no KEGG:BT_2142 NR:ns ## KEGG: BT_2142 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 523 89.0 1e-147 MKYITKEVRIGIAGIVALCVLIYGINWLKGIHMFQPSSYFYAKFENVNGLTKSSPVFADG VRVGIVRDIYYDYVKPGNVIVEVELDTELRIPKGSTAELVSELMGGVRMNILLANNPREK YAVGDTIPGTLNNGMMESAAQLIPKVEEMLPKLDSILISLNNILGDKSIPATLHSIEKTT 
ANLAVVSSQVKGLMSNDIPQLTSKLNTIGDNFVVISGNLKEIDYAATFKKIDETLANVKI LTEKLNSKDNTLGLLFNDPTLYNNLNATTENAASLLEDLKEHPKRYVHFSLFGKKDK >gi|225935336|gb|ACGA01000056.1| GENE 50 61423 - 62835 921 470 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 9 462 8 439 446 294 35.0 3e-79 MIESNHVVLWNRCLDVIKDNVPETTYNTWFAPIVPLKYEDKTLILQIPSQFFYEILEERF VDLIRKTLYKVIGEGTKLMYNVMVDKTSIPNQTVNLEASNRSTAVTPKSIIGGNKAPSFL QAPAVQDLDPHLNPNYNFENFIEGYSNKLSRSVAEAVAQKPGGTAFNPLFLYGASGVGKT HLANAIGTKIKEIYPEKRVLYVSAHLFQVQYTDSVRNNTTNDFINFYQTIDVLIIDDIQE FAGVTKTQNNFFHIFNHLHQNGKQLILTSDRAPVLLQGIEERLLTRFKWGMVAELEKPTV ELRKNILRNKIHRDGLQFPAEVIDYIAENVNESVRDLEGIVIAIMARSTIFNKEIDLDLA QHIVHGVVHNETKAVTIDDILKVVCKHFDLEPSAIHTKSRKREVVQARQIAMYLAKNHTD FSTSKIGKFIGNKDHATVLHACKTVKGQLEVDKSFNAEVQEIESLLKKRN >gi|225935336|gb|ACGA01000056.1| GENE 51 62940 - 63683 658 247 aa, chain - ## HITS:1 COG:BH1048 KEGG:ns NR:ns ## COG: BH1048 COG0778 # Protein_GI_number: 15613611 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus halodurans # 1 182 5 186 244 108 32.0 6e-24 MFETVKNRRTIRKYLPKDINPILLNDLLETSFRASTMGGMQLYSVIVTRDAEMKEKLSPA HFNQPMVKNAPVVLTFCADFRRFSKWCEQRKAVPGYDNLMSFMNASMDTLLVAQTFCTLA EEVGLGICYLGTTTYNPQMIIDTLQLPELVFPLTTITVGYPDGIPAQVDRLPLEAAVHDE KYHDYTQEEIDKLYAYKESLPENKQFIEENKKETLAQVFTDVRYTKKDNEFMSENLLKVL RQQGFLK >gi|225935336|gb|ACGA01000056.1| GENE 52 63968 - 66511 2137 847 aa, chain + ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 25 847 7 752 752 282 29.0 2e-75 MEKQIYSYDEAYEESLRYFQGDELAARVWVNKYAVKDSFGNIYEKSPEDMHWRIANEVAR VESKYPNALTAKELYDLLDHFKYIVPQGSPMTGIGNDYQVASLSNCFVIGVDGAADSYGA 
IIKIDEEQVQLMKRRGGVGHDLSHIRPKGSPVKNSALTSTGLVPFMERYSNSTREVAQDG RRGALMLSVSIKHPDSEAFIDAKMTEGKVTGANVSVKLDDAFMQAAVDEKPYVQQYPIDS AQPTFTKEIDASTLWKKIVHNAWKSAEPGVLFWDTIIRESVPDCYADLGYRTVSTNPCGE IPLCPYDSCRLLAINLYSYVVNPFKPDAYFDFDLFQKHVALAQRIMDDIIDLELEKIERI MTKIDEDPENEEVKHAERALWEKIYKKSGQGRRTGVGITAEGDMLAALGLRYGTEEATEF SEKVHKTVALGAYRSSVEMAKERGAFEIYNSEREQNNPFIQRLAAADPKLYEDMKKYGRR NIACLTIAPTGTTSLMTQTTSGIEPVFLPVYKRRRKVNPNDTNVHVDFVDETGDAFEEYI VFHHKFVTWMEANGYDPARRYTQEEIDELVAKSPYYKATSNDVDWLMKVKMQGRIQKWVD HSISVTINLPNDVDEDLVNRLYVEAWKSGCKGCTVYRDGSRSGVLISTKSDKDKKEGLPP CKPPTVVEVRPRILEADVVRFQNNKEKWVAFVGLLDGHPYEIFTGLQDDDEGILLPKSVT CGRIIKNVDEDGTKRYDFQFENKRGYKTTIEGLSEKFNKEYWNYAKLISGVLRYRMPIEQ VIKLVGSLQLNSESINTWKNGVERALKKYIQDGTEAKGKKCPNCGNETLVYQEGCLICTT CGASRCG >gi|225935336|gb|ACGA01000056.1| GENE 53 66643 - 69324 1689 893 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 399 889 3 487 489 444 46.0 1e-124 MILSFNIEYRTNWGEEVRISGLFPESIPLHTTDGIYWTAELELEVPQEGMTINYSYQIEQ NGIVIRKEWDSFSRSIFLSGSSRKIYRINDCWKNIPEQLYLYSSAFTEALLAHPEKENIP QRYKKGLVIKAYAPRINKDYCLAICGNQKSLGHWDPEKAVLMSDTNFPEWQIELDASKLK YPLEYKFILYNKQEKKADCWEKNPNRYLADPELKTNETLVISDRYAYFDIPAWKGAGIAI PVFSLKSEKSFGVGDFGDLKRMVDWAVNTRQKVIQILPVNDTTMTHAWTDSYPYNSISIY AFHPMYADIRQMGTLKDKEAASKFSKKQKELNSLPAIDYEAVNQTKWEFFNLLFRQEGEK VLASKGFKDFFETNKEWLQPYAVFSYLRDAYKTPNFRKWPRHSVYQAEDIEKMCQPGTAD YPHISLYYYIQYHLHLQLLSATEYARQHGVVLKGDIPIGISRNSVEAWTEPHYFNLNGQA GAPPDDFSINGQNWGFPTYNWDVMEKDGYRWWMKRFQKMAEYFDAYRIDHILGFFRIWEI PMHAVHGLLGQFDPSLPMSREEIESYGLTFRDEYLLPFIHESFLGQLFGPHTHLVKQDFL QLIDDSGLYRMKPGFETQREVEQFFAGRNDEDSIWIREGLYSLISNVLFVADKKEEGKYH PRIGVQRDFVFRSLNEEEKNAFNRLYDQYYYHRHNEFWYQQAMKKLPQLTQSTRMLVCGE DLGMIPACVSSVMNELRILSLEIQRMPKNPMHEFGHLNEYPYRSVCTISTHDMSTLRGWW EEDYQQTQRYYNATLGHYGVAPTTATPELCEEIVRNHLNSNSILCILSFQDWLSIDGKWR NPNVAEERINVPSNPRNYWRYRMHLTLEQLMKAKTLNDKISELIKYTGRDPNK 
>gi|225935336|gb|ACGA01000056.1| GENE 54 69627 - 70679 615 350 aa, chain + ## HITS:1 COG:CAC3042 KEGG:ns NR:ns ## COG: CAC3042 COG3594 # Protein_GI_number: 15896293 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Clostridium acetobutylicum # 7 280 2 268 337 69 25.0 9e-12 MEGNQSRRIDFVDLTKGVCIILVVMAHVGGAFEQLDTNSMLSCFRMPLYFFISGVFFKSY EGLFGFILRKINKLIIPFLFFYLSAFLMKYIVWKIAPRVFQLPVSWNELLVVFHGHDLIK FNPPIWFLLALFNCNILFYLIHFLREKHLLVMFAVTILIGCAGFYLGKLQIELPLYIDVS MTALPFYVAGFWIRRYNFFLYPSHRFDKLIPVFVVLALVVMYFTATTLGMRTNNYTGNIF QVYIAAFAGIFMIMLLCKKVKKIKVVSYLGRYSIITLSIHGPILHFLGPLVSRYIHNSWA QASALLLITLSICLLLTPVFLKVIPQMVAQKDLLKVKQDHTKKTVYEDKQ >gi|225935336|gb|ACGA01000056.1| GENE 55 70663 - 71034 230 123 aa, chain + ## HITS:1 COG:SPy1099 KEGG:ns NR:ns ## COG: SPy1099 COG1539 # Protein_GI_number: 15675082 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Streptococcus pyogenes M1 GAS # 8 122 4 119 119 89 38.0 2e-18 MKINSSYILLKEIRCYAYHGVAPQENLIGNEYLIDLKLKVDISKAARTDEVTDTVNYAEV HQVIENEMAVPSKLLEHVSGRIIQKLFDQFPCIEEIELRLSKRNPPMGADIESAGIELHC SRK >gi|225935336|gb|ACGA01000056.1| GENE 56 71165 - 71689 364 174 aa, chain - ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 6 165 15 166 166 133 48.0 2e-31 MKSKVRRGIGLVAHDAMKKDLIEWVLWNSELLMGNKFYCTGTTGTLILEALKEKHPDEEW DFTILKSGPLGGDQQMGSRIVDGQIDYLFFFTDPMTLQPHDTDVKALTRLAGVENIVFCC NRSTADHIISSPLFMDPDYERIHPDYSSYTKRFQDKPVVTEAVESVNRRKKKRK >gi|225935336|gb|ACGA01000056.1| GENE 57 71693 - 72721 474 342 aa, chain - ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. 
PCC 7120 # 3 247 9 253 295 128 29.0 1e-29 MKVSVVILNWNGCDMLRTFLPSVVRYSEGEGVEVCVADNGSTDTSVSLLQQEFPSVRTIV LDQNYGFADGYNWALQQVDAEYVVLLNSDVEVTEHWLEPMIAYLDAHPEVAACQPKIRSQ RQKEYFEYAGAAGGFIDKYGYPFCRGRIMGVVEKDEGQYDTVIPVFWATGAALFIRHTDY VNVGGLDGRFFAHMEEIDLCWRLRSRNREIVCVPQSIVYHVGGATLKKENPHKTFLNFRN NLVMLYKNLPQEELNKVMRIRTCLDYLAAFNFLLQGHWDNASAVMRARKEYKRLCPSFSL SREENMRKKTLNPIPERTKSSILWQFYARGCKRFSQLSDLKG >gi|225935336|gb|ACGA01000056.1| GENE 58 72724 - 73647 693 307 aa, chain - ## HITS:1 COG:NMA1630 KEGG:ns NR:ns ## COG: NMA1630 COG1560 # Protein_GI_number: 15794524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Neisseria meningitidis Z2491 # 4 292 2 283 289 87 27.0 3e-17 MRSKLIYWLVYSGMWLFSALPFRVLYMLSDLNCLLMYRIGKYRRRVVRGNLLRSFPEKTD AERLQIERKFYRYLSDYMLEDLKLLHMSAEELCARMTYKNTEQYLELTEKYGGIIVMIPH YANYEWLIGMGSIMKPEDVPVQVYKPLRDKYLDELFKRIRSRFGGYNIPKHSTAREIIKL KRDGKKMVVGLITDQWPSGYDKYWTTFLGQETAFLDGAERIAKMMNFPVFYCELSKKRRG YCEAEFKLMTETPKETREGEITEMFAHRLEQTIRREPAYWLWSHKRWKLTKEEADQLEQE ELNKKKE >gi|225935336|gb|ACGA01000056.1| GENE 59 73650 - 74969 276 439 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229245919|ref|ZP_04369978.1| SSU ribosomal protein S12P methylthiotransferase [Catenulispora acidiphila DSM 44928] # 153 388 224 461 529 110 28 2e-23 MIDTTVFQNKTAVYYTLGCKLNFSETSTIGKILREVGVRTARKGEKADICVVNTCSVTEM ADKKCRQAIHRLVKQHPGAFVIVTGCYAQLKPGDVAKIDGVDVVLGAEQKGELLQYLGDL QKHEKGEAITTTTKDIRSFSPSCSRGDRTRFFLKVQDGCDYFCSYCTIPFARGRSRNGTI ASMVEQARQAAAEGGKEIVLTGVNIGDFGKTTGESFFDLVKALDQVEGIERYRISSIEPN LLTDAIIEFVSHSRSFMPHFHIPLQSGCDEVLQLMRRRYDTALFASKVRKIKEVMPDAFI GVDVIVGTRGETPEYFEQAYQFIDGLDVTQLHVFSYSERPGTQALKIEYVVSPEEKHQRS QRLLTLSDQKTQAFYARHIGQVMPVLMEKSKTGVPMHGFTENYIRVEVENDDSLDNRVVN VRLGDFNEERTALKSTILI >gi|225935336|gb|ACGA01000056.1| GENE 60 74993 - 76753 1235 586 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA 
synthetases (AMP-forming) # Organism: Aquifex aeolicus # 57 583 15 503 600 235 31.0 2e-61 MQSYTLLLVKRFWIRKFIPIFAQFTKKRAMIKENFIKLYENSFRENWDLPCYTDYGEDTQ YTYGEVAEKIARLHLLFKHCSLRRGDKISVIGKNNAHWCIAYMATITYGAIIVPILQDFT PNDVHHIVNHSESVFLFTSDSIWDNLEEEKLAGLRGVFSLTDFRCLYQRDGETIQKFLKN TDKEMHSLYPKGFTREDVQYTTLSNDKVMLLNYTSGTTGFSKGVMLTGNNLAGNVTFGIR TELLKKGDKVLSFLPLAHAYGCAFDFLTATAVGTHVTLLGKTPSPKIIMKAFEEVKPNLI ITVPLVIEKIYKNIIQPLINKKGMKWALNIPLLDTQIYNQIRKRLIDALGGRFKEIIIGG AAMDKEVEEFFYKIKFPFTIGYGMTECGPLISYAPWDEFVLGSSGKILDIMEARIYKENP EAETGEIQVRGENVMVGYYKNQEATQEVFTQDGWLRTGDLGSMDSNGNIFIRGRLKTMIL SSSGQNIFPEELETKLNNLPFILESLVIERNKKLVALVYADYEALDSLGLNNPNNLKTIM DENLKNLNSNVAAYEKISRIQLYPTEFEKTPKRSIKRYLYNSIAVD >gi|225935336|gb|ACGA01000056.1| GENE 61 76913 - 77134 270 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLVLMAVAIVAVSFASCGNKAADAAKATADSIRIADSIAAVEAAALEAEQAAAAAADS LNADSTATETVAE >gi|225935336|gb|ACGA01000056.1| GENE 62 77223 - 78131 677 302 aa, chain - ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 29 290 2 239 246 115 30.0 7e-26 MKTKIYLCALAALFMIPQAMTAQTKKKAKKEVAIQLYSVRDILNKVDNKNGKCDPTYTAL LKKLANMGYTGVEAANYNNGKFYDRTPQQFKKDVESAGLKVLSSHCTRQLSKEELASGDL SESLQWWDQCIADHKAAGMKYIVAPWMDVPKTLKDLNTYCTYFNEIGKRCKQQGLSFGYH NHAHEFQKVEDKVMYDYMLEHTNPEYVFFQMDLYWVVRGQNSPVDYFNKYPGRFKIFHVK DHREIGQSGMVGFDAIFKNAKTAGVNYLVAEIEGYSMPVEESVEVSLDYLLDAPFVKSSY AK >gi|225935336|gb|ACGA01000056.1| GENE 63 78277 - 79143 866 288 aa, chain - ## HITS:1 COG:no KEGG:BT_2157 NR:ns ## KEGG: BT_2157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 288 1 290 290 531 90.0 1e-149 MKKVFYPLACCCLAAGVFASCGGQKKANAQEEPSKVALSYSKSLKAPETDSLNLPVDENG YITIFDGETFNGWRGYGKDRVPTKWTIEDGCIKFNGSGGGEAQDGDGGDLIFAHKFKNFE LELEWKVAKGSNSGILYLAQEVTSKDKDGNDVLEPIYISAPEYQILDNANHPDAKLGKDN 
NRQSASLYDMIPAVPQNSKPFGEWNKAKIMVYKGTVVHGQNDENVLEYHLWTKQWTDMLQ ASKFSEDKWPLAFELLNNCGGENHEGFIGLQDHGDDVWFRNIRVKVLD >gi|225935336|gb|ACGA01000056.1| GENE 64 79182 - 80669 1623 495 aa, chain - ## HITS:1 COG:lin2266 KEGG:ns NR:ns ## COG: lin2266 COG0673 # Protein_GI_number: 16801330 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 41 195 3 164 358 69 30.0 2e-11 MSNISRRKFLKTGAAALAGITIAPSTILGMSHGHISPTDKLNLAAVGIGGMGHANINHVK GTENIVALCDVDWKYAKGVFDEFPKAKKYWDYRKMYEEMGKSIDGVIIATADHTHAIITA DAMTMGKHVYCQKPLTHSVYESRLLTNLAASTGVVTQMGNQGSSDEGTDLVCEWIWNGEI GDITKVECATDRPIWPQGLNVPEKVDKIPSTLNWDLFTGPAKMNPYNAIYHPWNWRGWWD YGTGALGDMACHILHQPFRALKLQYPTKVEGSSTLLLNACAPQAQHVKMIFPARENMPKV AMPEVEVHWYDGGMMPERPKGFPEGKQLMQSGGGLTIFHGTKDTLICGCYGQNPWLLSGR KPNAPKVCRRVPNAMNGGHEMDWVRACKENKSNRIMTKSDFSEAGPMNEMVAMGVLAIRL QALNKTLEWDGANMCFTNIGDNETIRTVIKDGFKIHDGHPTFDKTWTDPINAKQFAAELV KHTYRDGWRLPDMPR >gi|225935336|gb|ACGA01000056.1| GENE 65 80682 - 81794 768 370 aa, chain - ## HITS:1 COG:lin2932 KEGG:ns NR:ns ## COG: lin2932 COG0673 # Protein_GI_number: 16801991 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 10 192 4 184 333 76 26.0 8e-14 MNTQSSVDTRIKVGIIGFGRMGRFYWEAMTKSGRWNIAYICDTDPESRQLAKKLSPESLI VEDNQKVFEDESVQVVGLFTLADSRMEQIEKAIRYGKHIISEKPIADTMENEWKVVEMTE NANLISAVNLYLRNSWYHNLMKEYIEQGEIGELAIIRICHMTPGLAPGEGHEYEGPAFHD CGMHYVDITRWYAGCDYRTWNAQGVNMWNYKDPWWVQCHGTFQNGVVFDITQGFVYGQLS KDQTHNSYVDIIGTKGVVRMTHDFNTAVVDLHGVNQTIRVEKPFGGKNIDVLCDLFADSV ETGKRSSRLPSMRDSAIASEYAWTFLKDTRKHDLPAIGNLRTLEQIRERRRNMKNGYGLL HGNLPKIINP >gi|225935336|gb|ACGA01000056.1| GENE 66 82033 - 83514 827 493 aa, chain - ## HITS:1 COG:no KEGG:BT_2160 NR:ns ## KEGG: BT_2160 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 487 75 561 561 686 77.0 0 MLKSPGLTLEGEYRINLRLYNEYKKFHIDSAIHYVDRNIEISRQLNRPYFTNQSSLHLSL 
LYSMCGRFREAEIILKSIKTSELPRDLLINYYQTYSSFWGHYSISVANNLYGKQQSAYQD SLFALIDHTSWDYRMSQASYYIWRDTLKSKEIFKELLDIEEVGTPNYAMITHSYSRLCHH QKKYDEEKKYLILSAIADTRNATRENASLQSLALIQYEEKNLADAFKFTQSAIDDVISSG IHFRAIEIYKFNSIINTAYQAEQAKSRSHLTTFLISTSVILFLLILLVVFIYIQMKKTLK IKQALAQSNEELLRLNNKLNNMNSQLNDTNNQLYEINGIKEYYIAEFFDVCFSYIHKMEK YQNMLYKIAINKYYDELIKKLKSSALIDEELSALYARFDKVFLGLYPTFVSDFNALLKDE EKIILKPDALLNRELRIYALLRLGITDSGKIANFLRCSTSTVYNYRTKMRNKAAVDRDEF ENEIMKISSTQET >gi|225935336|gb|ACGA01000056.1| GENE 67 83779 - 84222 712 147 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885187|ref|ZP_02066190.1| hypothetical protein BACOVA_03185 [Bacteroides ovatus ATCC 8483] # 1 147 1 147 147 278 100 5e-74 MEIILKEDVVNLGYKNDIVNVKSGYGRNYLIPTGKAVIASPSAKKMLAEELKQRAHKLEK IKKDAEAMAAKLEGVSLTIATKVSSTGTIFGSVGNIQIAEALSKLGHEVDRKIIVVKDAV KEVGSYKAIVKLHKEVSVEIPFEVVAE >gi|225935336|gb|ACGA01000056.1| GENE 68 84237 - 84509 461 90 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 [Bacteroides ovatus ATCC 8483] # 1 90 1 90 90 182 100 7e-45 MAQQVQSEIRYLTPPSVDVKKKKYCRFKKSGIRYIDYKDPEFLKKFLNEQGKILPRRITG TSLKFQRRIAQAVKRARHLALLPYVTDMMK >gi|225935336|gb|ACGA01000056.1| GENE 69 84512 - 84856 576 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885185|ref|ZP_02066188.1| hypothetical protein BACOVA_03183 [Bacteroides ovatus ATCC 8483] # 1 114 1 114 114 226 99 3e-58 MNQYETVFILTPVLSDVQMKEAVEKFKGVLQAEGAEIINEENWGLKKLAYPIQKKSTGFY QLIEFNADPTVIDKLELNFRRDERVIRFLTFKMDKYAAEYAAKRRSVKSNKKED >gi|225935336|gb|ACGA01000056.1| GENE 70 85017 - 85463 467 148 aa, chain + ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 25 148 17 141 160 64 31.0 7e-11 MIEQFNFDIRLIFAILNGKVSAAINRKLYRNFRQNGLEISPEQWTVLIFLWEKDGVTQQE LCNATFKDKPSMTRLIDNMERQHLVVRISDKKDRRTNLIHLTKDGKELEEKARVIAGQTL KEALHGITLDELSIGQEVLKKVFYNTKD >gi|225935336|gb|ACGA01000056.1| GENE 71 85604 - 85780 
294 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174165|ref|ZP_05760577.1| ## NR: gi|260174165|ref|ZP_05760577.1| hypothetical protein BacD2_20044 [Bacteroides sp. D2] # 1 58 1 58 58 62 100.0 8e-09 MKGLLKNLGLILILAGVVILLGCSFTGNVNNNAILGTSVVLVVLGLISYIVINKKIAD Prediction of potential genes in microbial genomes Time: Fri May 13 10:38:33 2011 Seq name: gi|225935335|gb|ACGA01000057.1| Bacteroides sp. D2 cont1.57, whole genome shotgun sequence Length of sequence - 301321 bp Number of predicted genes - 236, with homology - 231 Number of transcription units - 103, operones - 64 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 - CDS 41 - 742 877 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 . - CDS 744 - 2300 1048 ## COG0642 Signal transduction histidine kinase - Prom 2402 - 2461 6.4 - Term 2491 - 2543 10.1 3 2 Tu 1 . - CDS 2634 - 4790 2186 ## COG0480 Translation elongation factors (GTPases) - Prom 4946 - 5005 6.2 + Prom 5122 - 5181 4.4 4 3 Op 1 . + CDS 5222 - 6352 707 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 5 3 Op 2 . + CDS 6372 - 6926 510 ## BT_2169 RNA polymerase ECF-type sigma factor 6 3 Op 3 . + CDS 7014 - 7448 353 ## BT_2170 hypothetical protein 7 3 Op 4 . + CDS 7476 - 8354 588 ## COG3712 Fe2+-dicitrate sensor, membrane component 8 3 Op 5 . + CDS 8290 - 11058 2024 ## BT_2172 hypothetical protein 9 3 Op 6 . + CDS 11066 - 12064 685 ## BT_2173 hypothetical protein 10 3 Op 7 . + CDS 12055 - 12831 338 ## gi|299147860|ref|ZP_07040923.1| hypothetical protein HMPREF9010_03572 11 3 Op 8 . + CDS 12834 - 13121 228 ## BT_2176 hypothetical protein 12 3 Op 9 . + CDS 13118 - 13726 394 ## COG2431 Predicted membrane protein + Prom 13861 - 13920 6.2 13 4 Tu 1 . + CDS 14019 - 14375 491 ## BT_2178 hypothetical protein + Term 14394 - 14451 10.8 - Term 14390 - 14427 5.4 14 5 Tu 1 . 
- CDS 14441 - 15505 1198 ## BT_2179 putative DNA mismatch repair protein - Prom 15618 - 15677 7.4 + Prom 15611 - 15670 5.3 15 6 Tu 1 . + CDS 15693 - 16385 626 ## COG1011 Predicted hydrolase (HAD superfamily) 16 7 Tu 1 . - CDS 16403 - 17176 452 ## BT_2181 transcriptional regulator - Prom 17378 - 17437 3.9 + Prom 17143 - 17202 4.9 17 8 Tu 1 . + CDS 17264 - 17929 648 ## COG3506 Uncharacterized conserved protein + Term 17988 - 18029 8.6 - Term 17976 - 18016 8.4 18 9 Op 1 . - CDS 18047 - 19099 705 ## BT_2183 hypothetical protein 19 9 Op 2 . - CDS 19096 - 19644 349 ## BDI_3895 hypothetical protein 20 9 Op 3 . - CDS 19655 - 19807 74 ## + Prom 19654 - 19713 6.2 21 10 Tu 1 . + CDS 19828 - 21060 1201 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase + Prom 21107 - 21166 4.1 22 11 Op 1 . + CDS 21205 - 21642 290 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 23 11 Op 2 . + CDS 21717 - 22433 474 ## COG0300 Short-chain dehydrogenases of various substrate specificities + Prom 22518 - 22577 3.8 24 12 Tu 1 . + CDS 22618 - 23367 373 ## BT_1587 hypothetical protein - Term 23359 - 23408 1.5 25 13 Op 1 . - CDS 23423 - 24421 821 ## COG2234 Predicted aminopeptidases 26 13 Op 2 . - CDS 24471 - 25565 514 ## PG0350 internalin-related protein - Prom 25609 - 25668 5.5 + Prom 25540 - 25599 5.0 27 14 Op 1 . + CDS 25663 - 26130 416 ## BT_2205 hypothetical protein 28 14 Op 2 . + CDS 26130 - 26942 515 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 29 14 Op 3 . + CDS 26955 - 27368 534 ## COG0802 Predicted ATPase or kinase 30 14 Op 4 . + CDS 27365 - 27592 234 ## BT_2208 hypothetical protein 31 15 Op 1 23/0.000 - CDS 27647 - 28957 993 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 32 15 Op 2 . - CDS 28966 - 29940 1097 ## COG0714 MoxR-like ATPases 33 15 Op 3 . - CDS 29952 - 30590 416 ## BT_2213 hypothetical protein 34 15 Op 4 . 
- CDS 30608 - 31222 272 ## BT_2213 hypothetical protein 35 15 Op 5 . - CDS 31219 - 31836 514 ## BT_2214 hypothetical protein 36 15 Op 6 . - CDS 31853 - 32791 871 ## BT_2215 hypothetical protein 37 15 Op 7 . - CDS 32772 - 33734 693 ## COG1300 Uncharacterized membrane protein - Prom 33773 - 33832 4.8 + Prom 33746 - 33805 3.3 38 16 Tu 1 . + CDS 33836 - 34561 518 ## COG1714 Predicted membrane protein/domain + Term 34594 - 34640 4.0 39 17 Op 1 . - CDS 34648 - 34851 92 ## 40 17 Op 2 . - CDS 34778 - 35113 552 ## BT_2228 hypothetical protein 41 17 Op 3 . - CDS 35155 - 35469 172 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein - Prom 35497 - 35556 7.3 42 18 Tu 1 . - CDS 35604 - 36632 727 ## COG1609 Transcriptional regulators - Prom 36799 - 36858 6.0 + Prom 36626 - 36685 4.5 43 19 Op 1 . + CDS 36832 - 38271 937 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 44 19 Op 2 . + CDS 38299 - 39624 430 ## COG0738 Fucose permease + Prom 39668 - 39727 5.5 45 20 Op 1 6/0.000 + CDS 39854 - 40444 263 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 40446 - 40505 3.9 46 20 Op 2 . + CDS 40534 - 41529 486 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 41623 - 41660 1.1 + Prom 41539 - 41598 3.7 47 21 Op 1 . + CDS 41689 - 45036 1867 ## BDI_1677 hypothetical protein 48 21 Op 2 . + CDS 45049 - 46542 983 ## Phep_2140 RagB/SusD domain protein 49 21 Op 3 . + CDS 46563 - 48077 750 ## COG3119 Arylsulfatase A and related enzymes 50 21 Op 4 . + CDS 48090 - 49055 563 ## Psta_0944 hypothetical protein 51 21 Op 5 . + CDS 49135 - 50022 391 ## Arad_8333 hypothetical protein 52 21 Op 6 . + CDS 50045 - 52366 967 ## COG3525 N-acetyl-beta-hexosaminidase 53 21 Op 7 . + CDS 52387 - 53595 736 ## BDI_1319 glycoside hydrolase family protein 54 21 Op 8 . + CDS 53646 - 54203 428 ## BDI_1319 glycoside hydrolase family protein 55 21 Op 9 . 
+ CDS 54277 - 55665 977 ## gi|260174214|ref|ZP_05760626.1| hypothetical protein BacD2_20311 + Term 55720 - 55773 1.1 + Prom 55668 - 55727 4.0 56 22 Tu 1 . + CDS 55815 - 58031 1046 ## BDI_1317 glycoside hydrolase family protein + Term 58032 - 58103 15.2 - Term 58031 - 58078 2.7 57 23 Tu 1 . - CDS 58194 - 62000 3812 ## COG0587 DNA polymerase III, alpha subunit - Prom 62118 - 62177 6.9 + Prom 61980 - 62039 3.3 58 24 Op 1 14/0.000 + CDS 62162 - 62848 543 ## COG0688 Phosphatidylserine decarboxylase 59 24 Op 2 . + CDS 62859 - 63566 466 ## COG1183 Phosphatidylserine synthase 60 25 Tu 1 . + CDS 63634 - 63924 264 ## BT_2233 hypothetical protein + Term 64038 - 64069 -0.8 - Term 63863 - 63899 0.1 61 26 Op 1 . - CDS 63940 - 64377 463 ## COG0590 Cytosine/adenosine deaminases 62 26 Op 2 . - CDS 64418 - 64588 164 ## BF0707 hypothetical protein - Prom 64682 - 64741 4.8 63 27 Tu 1 . + CDS 64719 - 65084 357 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase + Prom 65100 - 65159 7.3 64 28 Op 1 . + CDS 65199 - 65555 437 ## COG2315 Uncharacterized protein conserved in bacteria 65 28 Op 2 . + CDS 65539 - 66297 344 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 66 28 Op 3 . + CDS 66364 - 67656 460 ## BF0631 hypothetical protein + Term 67671 - 67725 12.2 - Term 67522 - 67570 0.3 67 29 Tu 1 . - CDS 67691 - 68374 626 ## BT_2240 TPR domain-containing protein - Prom 68426 - 68485 9.9 + Prom 68362 - 68421 4.9 68 30 Op 1 . + CDS 68488 - 69816 854 ## COG0534 Na+-driven multidrug efflux pump 69 30 Op 2 . + CDS 69871 - 70581 873 ## COG0528 Uridylate kinase 70 31 Op 1 . - CDS 70687 - 72240 1246 ## Acid_0712 hypothetical protein 71 31 Op 2 . - CDS 72251 - 73663 1258 ## Acid_0712 hypothetical protein 72 31 Op 3 . - CDS 73667 - 76330 1959 ## COG3250 Beta-galactosidase/beta-glucuronidase 73 31 Op 4 . - CDS 76342 - 78450 1630 ## gi|260174232|ref|ZP_05760644.1| hypothetical protein BacD2_20401 74 31 Op 5 . 
- CDS 78482 - 80095 1416 ## Fjoh_2078 RagB/SusD domain-containing protein 75 31 Op 6 . - CDS 80110 - 83169 2953 ## Fjoh_2077 TonB-dependent receptor - Prom 83220 - 83279 5.2 76 32 Tu 1 . - CDS 83301 - 87332 2867 ## COG0642 Signal transduction histidine kinase + Prom 87541 - 87600 6.5 77 33 Op 1 . + CDS 87713 - 88624 641 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 78 33 Op 2 . + CDS 88676 - 89236 829 ## COG0233 Ribosome recycling factor + Prom 89239 - 89298 3.0 79 34 Tu 1 . + CDS 89336 - 90268 918 ## COG1162 Predicted GTPases + Term 90295 - 90335 -0.9 + Prom 90289 - 90348 5.3 80 35 Op 1 27/0.000 + CDS 90368 - 91453 1073 ## COG0845 Membrane-fusion protein 81 35 Op 2 9/0.000 + CDS 91469 - 94501 2700 ## COG0841 Cation/multidrug efflux pump 82 35 Op 3 . + CDS 94498 - 95838 1215 ## COG1538 Outer membrane protein + Term 95865 - 95911 11.3 + Prom 95856 - 95915 6.3 83 36 Tu 1 . + CDS 95990 - 97084 816 ## BT_2254 putative pectate lyase + Term 97162 - 97204 6.1 - Term 97145 - 97195 7.5 84 37 Tu 1 . - CDS 97252 - 98889 495 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase - Prom 99081 - 99140 4.5 + Prom 98791 - 98850 4.5 85 38 Tu 1 . + CDS 98912 - 99184 81 ## gi|299147790|ref|ZP_07040853.1| hypothetical protein HMPREF9010_03501 - Term 99023 - 99084 4.3 86 39 Op 1 . - CDS 99156 - 101147 1507 ## BT_2257 hypothetical protein 87 39 Op 2 . - CDS 101240 - 101416 263 ## BT_2258 GTP-binding protein 88 39 Op 3 . - CDS 101431 - 102501 535 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 - Prom 102526 - 102585 6.9 - Term 102536 - 102582 10.7 89 40 Op 1 . - CDS 102612 - 103502 870 ## gi|260174245|ref|ZP_05760657.1| hypothetical protein BacD2_20476 90 40 Op 2 . - CDS 103528 - 105132 1369 ## BVU_1855 hypothetical protein 91 40 Op 3 . - CDS 105158 - 108115 2772 ## BVU_1854 hypothetical protein - Prom 108209 - 108268 2.3 92 41 Tu 1 . 
- CDS 109022 - 110194 703 ## BT_2267 integrase protein - Prom 110271 - 110330 6.2 + Prom 110507 - 110566 12.1 93 42 Op 1 . + CDS 110644 - 113679 2265 ## BVU_1841 hypothetical protein 94 42 Op 2 . + CDS 113700 - 115112 1049 ## PRU_1887 putative lipoprotein + Term 115161 - 115218 9.5 + Prom 115495 - 115554 2.1 95 43 Op 1 . + CDS 115583 - 118555 2360 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 96 43 Op 2 . + CDS 118572 - 120125 1085 ## BT_2033 hypothetical protein + Term 120149 - 120202 10.5 97 44 Op 1 6/0.000 + CDS 120490 - 121077 302 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 121097 - 121127 -0.3 98 44 Op 2 . + CDS 121152 - 122159 552 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 122231 - 122270 5.8 + Prom 122323 - 122382 6.1 99 45 Op 1 . + CDS 122470 - 125769 2400 ## BF0340 hypothetical protein 100 45 Op 2 . + CDS 125775 - 127616 1632 ## BF0341 putative outer membrane protein 101 45 Op 3 . + CDS 127636 - 128310 541 ## BF0290 hypothetical protein 102 45 Op 4 . + CDS 128384 - 129961 1360 ## COG3119 Arylsulfatase A and related enzymes + Term 130018 - 130068 11.1 - Term 130004 - 130055 11.3 103 46 Op 1 . - CDS 130096 - 130746 680 ## COG2095 Multiple antibiotic transporter 104 46 Op 2 . - CDS 130808 - 131500 581 ## COG1011 Predicted hydrolase (HAD superfamily) - Prom 131536 - 131595 4.7 - Term 131520 - 131574 10.3 105 47 Op 1 . - CDS 131597 - 132454 1004 ## BT_2272 hypothetical protein 106 47 Op 2 . - CDS 132488 - 133162 545 ## COG0313 Predicted methyltransferases 107 47 Op 3 . - CDS 133175 - 133918 621 ## BT_2274 hypothetical protein - Prom 133977 - 134036 5.4 + Prom 133924 - 133983 6.0 108 48 Op 1 . + CDS 134059 - 134658 586 ## COG1435 Thymidine kinase 109 48 Op 2 . + CDS 134680 - 135813 941 ## COG0628 Predicted permease + Term 136003 - 136050 9.5 + TRNA 135908 - 135980 70.0 # Lys TTT 0 0 110 49 Tu 1 . 
+ CDS 136223 - 137443 834 ## COG4974 Site-specific recombinase XerD + Term 137464 - 137521 17.0 111 50 Tu 1 . - CDS 137866 - 138015 86 ## BF0655 hypothetical protein - Prom 138121 - 138180 4.7 - Term 138284 - 138325 3.1 112 51 Tu 1 . - CDS 138349 - 138663 171 ## BF0653 hypothetical protein - Prom 138799 - 138858 4.5 - Term 138681 - 138732 3.8 113 52 Tu 1 . - CDS 138860 - 139702 164 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 139722 - 139781 7.3 + Prom 139681 - 139740 6.2 114 53 Op 1 . + CDS 139812 - 140321 228 ## COG0778 Nitroreductase 115 53 Op 2 3/0.053 + CDS 140335 - 140940 252 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 116 53 Op 3 . + CDS 140937 - 141587 309 ## COG0546 Predicted phosphatases + Prom 141595 - 141654 6.5 117 54 Op 1 . + CDS 141702 - 141935 213 ## BF0648 hypothetical protein + Prom 141974 - 142033 2.5 118 54 Op 2 . + CDS 142090 - 142323 91 ## BF0647 hypothetical protein + Term 142345 - 142387 6.4 + Prom 142371 - 142430 2.0 119 55 Tu 1 . + CDS 142453 - 142785 213 ## BF0646 hypothetical protein + Term 142799 - 142848 1.2 + Prom 142974 - 143033 3.4 120 56 Op 1 . + CDS 143069 - 143653 449 ## BF0644 clindamycin resistance transfer factor BtgA 121 56 Op 2 . + CDS 143658 - 144578 427 ## BDI_1256 clindamycin resistance transfer factor BtgB + Prom 144915 - 144974 4.3 122 57 Op 1 . + CDS 145023 - 145274 126 ## gi|260174279|ref|ZP_05760691.1| hypothetical protein BacD2_20646 123 57 Op 2 . + CDS 145261 - 147795 782 ## COG0539 Ribosomal protein S1 + Prom 148158 - 148217 4.5 124 58 Tu 1 . + CDS 148244 - 149113 464 ## COG4413 Urea transporter + Prom 149530 - 149589 9.1 125 59 Op 1 . + CDS 149623 - 150480 312 ## Teth39_1000 SpoIID/LytB domain-containing protein 126 59 Op 2 . + CDS 150490 - 150837 362 ## gi|260174283|ref|ZP_05760695.1| hypothetical protein BacD2_20666 127 59 Op 3 . + CDS 150837 - 152033 294 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 128 59 Op 4 . 
+ CDS 151975 - 152259 113 ## BDI_2144 hypothetical protein 129 59 Op 5 . + CDS 152318 - 153103 167 ## PFLU3250 hypothetical protein - Term 153404 - 153441 -0.7 130 60 Op 1 . - CDS 153482 - 153889 386 ## BT_2360 transcriptional regulator 131 60 Op 2 . - CDS 153935 - 154324 419 ## BT_2361 hypothetical protein 132 61 Op 1 . + CDS 154658 - 157744 2632 ## BT_2362 hypothetical protein 133 61 Op 2 . + CDS 157760 - 159598 1373 ## BT_2363 hypothetical protein 134 61 Op 3 . + CDS 159626 - 159853 81 ## + Prom 159895 - 159954 6.0 135 62 Op 1 . + CDS 160042 - 164877 2819 ## COG0457 FOG: TPR repeat 136 62 Op 2 . + CDS 164916 - 165659 417 ## gi|260174292|ref|ZP_05760704.1| hypothetical protein BacD2_20711 + Term 165682 - 165726 1.1 137 63 Op 1 1/0.158 - CDS 165775 - 166626 742 ## COG0350 Methylated DNA-protein cysteine methyltransferase 138 63 Op 2 . - CDS 166623 - 167168 397 ## COG0350 Methylated DNA-protein cysteine methyltransferase - Prom 167225 - 167284 4.9 139 64 Tu 1 . - CDS 167394 - 169487 1612 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 169564 - 169623 6.6 + Prom 169625 - 169684 3.8 140 65 Tu 1 . + CDS 169835 - 170464 504 ## BT_4231 hypothetical protein 141 66 Op 1 . - CDS 170966 - 172303 881 ## BT_3124 putative sialic acid-specific acetylesterase 142 66 Op 2 . - CDS 172402 - 174459 1332 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 143 67 Op 1 . - CDS 174582 - 176435 1507 ## BT_4726 glycerophosphoryl diester phosphodiesterase 144 67 Op 2 . - CDS 176467 - 177384 843 ## COG3568 Metal-dependent hydrolase 145 67 Op 3 . - CDS 177402 - 179198 1698 ## BDI_3111 hypothetical protein 146 67 Op 4 . - CDS 179216 - 182728 2901 ## BDI_3110 hypothetical protein - Prom 182758 - 182817 5.9 + Prom 182673 - 182732 4.2 147 68 Tu 1 . 
+ CDS 182759 - 182911 81 ## 148 69 Op 1 6/0.000 - CDS 182959 - 183975 780 ## COG3712 Fe2+-dicitrate sensor, membrane component 149 69 Op 2 2/0.105 - CDS 184011 - 184727 381 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 184748 - 184807 3.1 150 69 Op 3 . - CDS 184809 - 186950 1639 ## COG0642 Signal transduction histidine kinase - Prom 187025 - 187084 3.8 151 70 Tu 1 . - CDS 187280 - 187600 303 ## COG2076 Membrane transporters of cations and cationic drugs - Prom 187759 - 187818 3.7 + Prom 187459 - 187518 5.4 152 71 Op 1 . + CDS 187673 - 188323 443 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 153 71 Op 2 . + CDS 188407 - 188772 291 ## COG2315 Uncharacterized protein conserved in bacteria - Term 188802 - 188853 11.4 154 72 Op 1 1/0.158 - CDS 188866 - 190119 1174 ## COG1538 Outer membrane protein 155 72 Op 2 11/0.000 - CDS 190094 - 193201 2669 ## COG3696 Putative silver efflux pump 156 72 Op 3 . - CDS 193208 - 194320 914 ## COG0845 Membrane-fusion protein - Prom 194406 - 194465 8.1 + Prom 194630 - 194689 4.1 157 73 Op 1 1/0.158 + CDS 194749 - 195957 832 ## COG0477 Permeases of the major facilitator superfamily + Prom 195959 - 196018 8.6 158 73 Op 2 . + CDS 196042 - 198171 1264 ## COG0475 Kef-type K+ transport systems, membrane components + Term 198222 - 198252 -0.3 159 74 Op 1 . - CDS 198193 - 199827 848 ## COG0642 Signal transduction histidine kinase 160 74 Op 2 . - CDS 199779 - 202235 942 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 202308 - 202367 3.5 + Prom 202262 - 202321 5.7 161 75 Op 1 . + CDS 202387 - 204801 1551 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 162 75 Op 2 . + CDS 204837 - 205682 395 ## BT_2157 hypothetical protein + Prom 205760 - 205819 7.6 163 76 Op 1 . + CDS 205839 - 206333 264 ## BT_1142 hypothetical protein + Prom 206344 - 206403 4.0 164 76 Op 2 . 
+ CDS 206423 - 207268 570 ## BT_2385 hypothetical protein - Term 207206 - 207263 -0.4 165 77 Op 1 9/0.000 - CDS 207312 - 208673 446 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 166 77 Op 2 27/0.000 - CDS 208686 - 211829 2714 ## COG0841 Cation/multidrug efflux pump 167 77 Op 3 . - CDS 211839 - 212942 1172 ## COG0845 Membrane-fusion protein - Prom 213020 - 213079 3.7 + Prom 213091 - 213150 3.3 168 78 Op 1 9/0.000 + CDS 213173 - 214180 458 ## COG3275 Putative regulator of cell autolysis 169 78 Op 2 . + CDS 214177 - 214884 535 ## COG3279 Response regulator of the LytR/AlgR family + Term 214901 - 214939 0.5 170 79 Tu 1 . - CDS 214916 - 215527 467 ## BT_2369 hypothetical protein - Prom 215574 - 215633 6.0 + Prom 215513 - 215572 4.8 171 80 Op 1 . + CDS 215733 - 216149 275 ## BT_2376 hypothetical protein 172 80 Op 2 . + CDS 216146 - 217276 545 ## COG2207 AraC-type DNA-binding domain-containing proteins 173 80 Op 3 . + CDS 217308 - 217778 356 ## BT_1189 hypothetical protein + Term 217852 - 217891 3.2 + Prom 217812 - 217871 2.5 174 81 Tu 1 . + CDS 218069 - 219481 885 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Term 219302 - 219338 -1.0 175 82 Tu 1 . - CDS 219473 - 220669 830 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 220745 - 220804 5.8 + Prom 220755 - 220814 4.1 176 83 Tu 1 . + CDS 220919 - 221137 275 ## BT_2974 hypothetical protein - Term 221052 - 221082 1.0 177 84 Op 1 . - CDS 221158 - 221619 272 ## Desal_1295 hypothetical protein 178 84 Op 2 7/0.000 - CDS 221622 - 222578 740 ## COG1846 Transcriptional regulators - Prom 222631 - 222690 6.4 179 84 Op 3 . - CDS 222727 - 223938 802 ## COG0534 Na+-driven multidrug efflux pump - Prom 224119 - 224178 4.8 180 85 Op 1 . 
- CDS 224187 - 224921 303 ## BT_2379 hypothetical protein - Prom 224941 - 225000 6.0 181 85 Op 2 . - CDS 225002 - 225532 218 ## PROTEIN SUPPORTED gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase 182 85 Op 3 . - CDS 225546 - 226172 335 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 183 85 Op 4 . - CDS 226244 - 226714 429 ## COG1522 Transcriptional regulators - Prom 226883 - 226942 7.8 + Prom 226767 - 226826 4.1 184 86 Op 1 . + CDS 226893 - 228179 1147 ## COG2873 O-acetylhomoserine sulfhydrylase 185 86 Op 2 . + CDS 228248 - 228487 331 ## BT_2388 hypothetical protein + Term 228531 - 228576 7.2 - Term 228507 - 228572 20.8 186 87 Tu 1 . - CDS 228590 - 229498 872 ## COG0668 Small-conductance mechanosensitive channel - Prom 229568 - 229627 4.5 + Prom 229455 - 229514 7.4 187 88 Op 1 . + CDS 229726 - 231975 1950 ## BT_2390 hypothetical protein 188 88 Op 2 2/0.105 + CDS 231988 - 232596 500 ## COG3201 Nicotinamide mononucleotide transporter 189 88 Op 3 . + CDS 232593 - 233216 585 ## COG1564 Thiamine pyrophosphokinase + Term 233282 - 233324 2.2 - Term 233212 - 233262 3.2 190 89 Op 1 . - CDS 233313 - 234497 734 ## gi|260174345|ref|ZP_05760757.1| hypothetical protein BacD2_20986 191 89 Op 2 . - CDS 234512 - 235675 234 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 192 89 Op 3 . - CDS 235659 - 235883 84 ## - Prom 235952 - 236011 6.5 - Term 236568 - 236600 1.7 193 90 Op 1 . - CDS 236671 - 237972 1346 ## COG0498 Threonine synthase 194 90 Op 2 . - CDS 237986 - 239194 1121 ## COG3635 Predicted phosphoglycerate mutase, AP superfamily 195 90 Op 3 . - CDS 239294 - 241729 2460 ## COG0527 Aspartokinases - Prom 241760 - 241819 3.7 + Prom 241919 - 241978 5.6 196 91 Tu 1 . + CDS 242131 - 243168 914 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 243180 - 243242 2.4 + Prom 243192 - 243251 3.4 197 92 Op 1 . 
+ CDS 243272 - 244471 822 ## COG1373 Predicted ATPase (AAA+ superfamily) 198 92 Op 2 . + CDS 244483 - 245850 1304 ## COG1066 Predicted ATP-dependent serine protease + Prom 246357 - 246416 5.0 199 93 Op 1 . + CDS 246476 - 247558 373 ## Ping_1180 hypothetical protein + Term 247570 - 247617 10.6 200 93 Op 2 . + CDS 247637 - 249289 1276 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 201 93 Op 3 . + CDS 249306 - 249908 543 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Prom 249912 - 249971 2.1 202 93 Op 4 . + CDS 249999 - 252446 2191 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 252569 - 252623 17.1 - Term 252818 - 252863 1.0 203 94 Op 1 . - CDS 252969 - 253484 423 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 204 94 Op 2 . - CDS 253524 - 254669 550 ## BT_2411 hypothetical protein 205 94 Op 3 . - CDS 254666 - 256495 1531 ## COG0826 Collagenase and related proteases 206 94 Op 4 . - CDS 256492 - 257409 862 ## COG1897 Homoserine trans-succinylase - Prom 257453 - 257512 5.4 + Prom 257336 - 257395 3.7 207 95 Tu 1 . + CDS 257590 - 257760 202 ## BT_2414 ferredoxin + Term 257783 - 257820 6.2 - Term 257864 - 257911 10.2 208 96 Op 1 . - CDS 257984 - 259177 1353 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 259198 - 259257 3.8 209 96 Op 2 . - CDS 259260 - 260477 1278 ## COG0807 GTP cyclohydrolase II 210 96 Op 3 . - CDS 260483 - 262333 1310 ## COG0795 Predicted permeases - Prom 262372 - 262431 5.5 - Term 262377 - 262422 1.6 211 96 Op 4 . - CDS 262458 - 262844 430 ## BT_2418 hypothetical protein - Prom 263085 - 263144 77.0 + TRNA 263068 - 263141 84.7 # Met CAT 0 0 - Term 263254 - 263298 8.5 212 97 Tu 1 . 
- CDS 263322 - 264626 1400 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 264861 - 264920 8.4 + Prom 264914 - 264973 3.5 213 98 Op 1 6/0.000 + CDS 265001 - 265582 309 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 214 98 Op 2 . + CDS 265665 - 266855 536 ## COG3712 Fe2+-dicitrate sensor, membrane component 215 98 Op 3 . + CDS 266880 - 270212 1626 ## Cpin_1097 TonB-dependent receptor plug 216 98 Op 4 . + CDS 270223 - 271734 899 ## Cpin_1098 hypothetical protein 217 98 Op 5 . + CDS 271752 - 272603 462 ## gi|260174371|ref|ZP_05760783.1| hypothetical protein BacD2_21116 218 98 Op 6 . + CDS 272620 - 274299 879 ## gi|260174372|ref|ZP_05760784.1| hypothetical protein BacD2_21121 219 98 Op 7 . + CDS 274333 - 274473 71 ## gi|294645365|ref|ZP_06723076.1| conserved hypothetical protein + Prom 274501 - 274560 2.5 220 99 Op 1 . + CDS 274586 - 276919 1562 ## BT_3275 hypothetical protein 221 99 Op 2 . + CDS 276934 - 279477 1744 ## BT_3275 hypothetical protein + Prom 279480 - 279539 6.6 222 99 Op 3 . + CDS 279559 - 279915 289 ## gi|295084588|emb|CBK66111.1| hypothetical protein + Prom 279937 - 279996 2.8 223 100 Op 1 . + CDS 280018 - 282801 1053 ## CHU_2270 hypothetical protein 224 100 Op 2 24/0.000 + CDS 282884 - 283819 713 ## COG1131 ABC-type multidrug transport system, ATPase component 225 100 Op 3 2/0.105 + CDS 283821 - 286142 1355 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 226 100 Op 4 . + CDS 286155 - 288482 1358 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 227 100 Op 5 . + CDS 288418 - 288546 62 ## gi|260174380|ref|ZP_05760792.1| hypothetical protein BacD2_21161 - Term 288477 - 288538 17.1 228 101 Op 1 . - CDS 288559 - 289998 1486 ## COG0642 Signal transduction histidine kinase 229 101 Op 2 . 
- CDS 290018 - 291148 893 ## COG2205 Osmosensitive K+ channel histidine kinase - Prom 291168 - 291227 1.6 - Term 291162 - 291212 7.2 230 102 Op 1 . - CDS 291233 - 291997 796 ## BT_2422 hypothetical protein 231 102 Op 2 . - CDS 291981 - 292115 60 ## gi|237717983|ref|ZP_04548464.1| conserved hypothetical protein 232 102 Op 3 18/0.000 - CDS 292116 - 292688 664 ## COG2156 K+-transporting ATPase, c chain 233 102 Op 4 20/0.000 - CDS 292705 - 294630 2189 ## COG2216 High-affinity K+ transport system, ATPase chain B - Prom 294655 - 294714 1.6 234 102 Op 5 . - CDS 294757 - 296463 1697 ## COG2060 K+-transporting ATPase, A chain - Prom 296665 - 296724 4.6 235 103 Op 1 . + CDS 297024 - 299297 1646 ## COG0380 Trehalose-6-phosphate synthase 236 103 Op 2 . + CDS 299310 - 301100 1226 ## COG3387 Glucoamylase and related glycosyl hydrolases + Term 301139 - 301177 4.8 - 5S_RRNA 301176 - 301275 98.0 # CP000140 [D:147281..147431] # 5S ribosomal RNA # Parabacteroides distasonis ATCC 8503 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides. 
Predicted protein(s) >gi|225935335|gb|ACGA01000057.1| GENE 1 41 - 742 877 233 aa, chain - ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 6 226 3 221 225 141 38.0 1e-33 MDEKLRILLCEDDENLGMLLREYLQAKGYSAELYPDGEAGYKAFLKNKYDLCVFDVMMPK KDGFTLAQDVRAANAEIPIIFLTAKTLKEDILEGFKIGADDYITKPFSMEELTFRIEAIL RRVRGKKNKESNIYKIGKFTFDTQKQILSSEGKQTKLTTKESELLGLLCAHANEILQRDF ALKTIWIDDNYFNARSMDVYITKLRKHLKEDDSIEIINIHGKGYKLITPEVES >gi|225935335|gb|ACGA01000057.1| GENE 2 744 - 2300 1048 518 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 283 513 341 564 566 125 36.0 2e-28 MKKSTIWILGIVMGLSFLSLLYLQVSYIEEMMKTRKEQFDSAVRNSLDQVSKDVEYAETR RWLIEDISEAERKALIANNASIQQDNLIQQTQRFTVKSKDGKVYSDFELKVMTTKPSELP KAMISPYRGTKTIPETSRSLVEAIKNRYMYQRALLDEVAWQMIYRGSDKSIGDRVRFKEL DDYLKSSLYNNSIDLPYHFTVIDKDGREVYRCADYEAKGSEDAYQQALFKNDPPAKMSIL KVHFPGKKDYIFDSISFMIPSLIFTLVLLVTFIFTIYIVFRQKKLTEMKNDFINNMTHEF KTPISTISLAAQMLKDPAVGKSPQMFQHISGVINDETKRLRFQVEKVLQMSMFERQKATL KMKEIDANELISGVVNTFALKVERYNGKITSNLEATDPVIFADEMHITNVIFNLMDNAVK YKKPEEDLELKVRTWNESGKLMISIQDNGIGIKKENLKKIFEKFYRVHTGNLHDVKGFGL GLAYVRKIILDHKGTIRAESDLNVGTKFIIALPLLKNN >gi|225935335|gb|ACGA01000057.1| GENE 3 2634 - 4790 2186 718 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 710 3 685 690 437 35.0 1e-122 MKVYQTNEIKNIALLGSSGSGKTTLVEAMLFESGVIKRRGSVAAKNTVSDYFPVEQEYGY SVFSTVLHVEWNNKKLNIIDCPGSDDFVGSTVTALNVTDTAIILLNGQYGVEVGTQNHFR 
YTEKLNKPVIFLVNQLDNEKCDYDNILEQLKEAYGSKVVPIQYPIATGPGFNALIDVLLM KKYSWKPEGGAPTIEDIPAEEMDKAMEMHKALVEAAAENDENLMEKFFEQDSLTEDEMRE GIRKGLIARGMFPVFCVCGGKDMGVRRLMEFLGNVVPFVSEMPKVQNTEGKEVAPDTNGP ESLYFFKTSVEPHIGEVSYFKVMSGKVHEGDDLLNADRGSKERIAQIYVVAGGNRVKVEE LQAGDIGAAVKLKDVKTGNTLNGKDCDYKFNFIKYPNSKYTRAIKPVNEADVEKMMSILN RMREEDPTWVIEQSKELKQTLVHGQGEFHLRTLKWRLENNEKLPVKYEEPKIPYRETITK AARADYRHKKQSGGAGQFGEVHLIVEPYKEGMPVPDTYKFNGQEFKITVRGTEEIPLEWG GKLVFINSIVGGSIDARFLPAIMKGIMSRMEQGPLTGSYARDVRVIVYDGKMHPVDSNEI SFMLAGRNAFSEAFKNAGPKILEPIYDVEVFVPSDRMGDVMGDLQGRRAMIMGMSSEKGF EKLVAKVPLKEMSSYSTALSSLTGGRASFIMKFSSYELVPADVQDKLMKDFEAKQVEE >gi|225935335|gb|ACGA01000057.1| GENE 4 5222 - 6352 707 376 aa, chain + ## HITS:1 COG:SPy1040 KEGG:ns NR:ns ## COG: SPy1040 COG0635 # Protein_GI_number: 15675037 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Streptococcus pyogenes M1 GAS # 5 368 9 369 376 242 34.0 1e-63 MAGIYLHIPFCKTRCIYCDFYSTTRSELKTRYVRALCRELAMRKDYLKGEDIETVYFGGG TPSQLEKEDFEQIFDTIRTHYGLSHCQEITLEANPDDLTSEYLKMLSSLPFNRISMGIQT FDDPTLKLLKRRHNAHTAIEAVHRCREAGFQNISIDLIYGLPGETKERWENDLRQAVSLN VEHISAYHLIYEEDTPIYNMLKQHQISEVDEDSSLDFFTLLIEHLQKAGFEHYEISNFCR PGKYSRHNSSYWKGIPYLGCGPSAHSFDGMTREWNVSSIDTYIKGIEENSRAFEIEYLDQ TTRYNEFIITTIRTVWGTPIEKLKQMFGNEMWEYCQRMAAPYLKNGKLEEHDGALRLTRE GIFISDSIMSDLLWVD >gi|225935335|gb|ACGA01000057.1| GENE 5 6372 - 6926 510 184 aa, chain + ## HITS:1 COG:no KEGG:BT_2169 NR:ns ## KEGG: BT_2169 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 182 281 91.0 7e-75 MTESEVRKLLRQMKELDSQTAFRDFYNMTYDRLFRIAYYYVKQEEWSQEIVLDVFLKLWK QRSNLLDVRNIEDYCFILVKNASLNYLEKESKRTYIHPDSLPEPQEQSYSPEESLISEEL FALYVKALDRLPERCREVFIRIREEKQSYAQVAEELGISMNTVDAQLQKAITRLKEMISR AEID >gi|225935335|gb|ACGA01000057.1| GENE 6 7014 - 7448 353 144 aa, chain + ## HITS:1 COG:no KEGG:BT_2170 NR:ns ## KEGG: BT_2170 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 143 144 195 70.0 4e-49 MKKLNDANIGILVLLSTCLCLFSCNNDNDNLPKDYAGFEHSKETVECEGNKSECELEIKI VAMEKAKEDRTVVLAAPPPVTGQAAVIQLTEKKVIIKAGKKSATTIIKIYPKQMVLNKQN VTLSCTPQWKEGGISKLTILLKRK >gi|225935335|gb|ACGA01000057.1| GENE 7 7476 - 8354 588 292 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 7 253 52 290 331 85 30.0 1e-16 MKHTDTELEEILNQLIASTRSPRGRFSAAASYPELEKRLKSHTRHLTLIRTFSAAAAVAL LCLSVWTAYLYMQPAAIQTISTLAETRTVRLPDGSSVMLNHYSSLSYPEKFQSDKREVEL NGEAYFEVSKDPKHPFIVQTETIDVQVLGTHFNVDAYHDNLDVKTTLLSGSVAVSNKSKS VRMVLKPNEIAIYNKVEEKLTRKVLENAEDEISWRQGEFIFDDLPLQEIARELSNSFGAT IQIADTTLQNYRITARFRDGEDLATILSVLHNAGYFNYSQNNKQIIITAKPD >gi|225935335|gb|ACGA01000057.1| GENE 8 8290 - 11058 2024 922 aa, chain + ## HITS:1 COG:no KEGG:BT_2172 NR:ns ## KEGG: BT_2172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 922 1 897 897 1476 83.0 0 MPVISTTHKTINRLLSLLNQTKMNIIFQGRFPVRTIVLIGVALLITTQIYAQNAEARLSL TLRNATLKEFVKRIENSTGYSFIYGEEIIIKHKINLQVKNKPLREILDLVFKNEQISYQF TGRHILLQKKKESKTVGRKFTISGYVTDGTSSETLIGTNIIESHQNQGTTTNPYGFYSIT LPEGETELRFSYLGYATEAHHFTLSQDTLLNIRMQGNTQLQEVVIVSDKTETGTVATQMG SIEIPMTQIKNTPSILGEADVMKAIQLMPGVQAGVDGSAGLYIRGGSPDQNLILLDGTPV YNVDHMFGFFSVFTPEAVKKVTLFKSSFPARFGGRLSSVIDVRTNDGDMQKYHGTLSIGL LTSKINLEGPIVKGKTSFNISARRSYVDLIAKPFMPNDEEYGYYFYDINAKINHKFSDRS RIYLSVYNGKDHFAANYDGDTDSKDGSTMNWGNTIVSARWNYIFNNRLFSNTTVSYNNYL FDVNSYNNNKYANSMGASIINRYSADYRSGINDWSYQIDFDYNPSPAHHIKFGTGYIYHR FRPEVMTSKISEKTGDKVDRDTTYHSIANSRIYGHELSAYLEDNIKVNDRLRLNLGLHFS LFQVQKQSYSSLQPRVSARYQLGKDVTLKASYTQMSQYVHLLSSMPIAMPTDLWVPVTKK IKPMRSHQYSLGGYYTGIEGWEFSVEGYYKDMYNVLEYKEGVSFFGSSAGWENKVEMGKG RSAGIEFMAQKTLGRTTGWLSYTLSKSDRQFAKGGINNGERFPYKYDRRHNINLTINHKF 
SERIDIGASWVFYTGGTSTIPEEKTAVIRPSDGTNNGFGGGYGYGDYFDSGITSPTIGEA SYVEHRNNYRLPASHRLNVGVNFNKKTKHGMRTWNISLYNAYNAMNPTFVYRSTSKNDPY KPIIKKYTILPLIPSFTYTYKF >gi|225935335|gb|ACGA01000057.1| GENE 9 11066 - 12064 685 332 aa, chain + ## HITS:1 COG:no KEGG:BT_2173 NR:ns ## KEGG: BT_2173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 333 337 443 67.0 1e-123 MRTRIYYPFILLIALLTTVSCENELPFSVKDNPPKLVMNALINADSLTNVLYLNFTGRGY ATHAEKATVEVRVNGQLSESLRPLPPQAEGDMQCRFNISGKFSPGDVVRIDALTDDGQYH AWAEVTVPQRPHEITDIDTVTVPLTQYYYTQNYLRYKINIKDRPNENNFYRLIMDKQMTV KDYNNEIGEYVTQTTHRYHFISREDVVLTDGQPTNSDDEDNGMFDTVKNIYGVFDDSRFK NTSYTMTVYNQTNVEGLSKYGTNVKMDIIVRLLSITETEYYYLKALNLADSDAYDETINE PIKYPGNVHGGVGIVGISTETSKIIHIEKPWI >gi|225935335|gb|ACGA01000057.1| GENE 10 12055 - 12831 338 258 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299147860|ref|ZP_07040923.1| ## NR: gi|299147860|ref|ZP_07040923.1| hypothetical protein HMPREF9010_03572 [Bacteroides sp. 3_1_23] # 27 258 1 232 232 432 100.0 1e-119 MDMKTHHLLFILIALPLFCSCRSNRSMLREIQALKSSLYYELTSPIYQEKADQTVYLDFI DYSNMDYYTSVKRKKSAYIPLLLYNYEGELFHLRLGESSLTQLYREFLTEALLTECNSST CCHLIDNQKGKMIPDSAYRLEVKIRKNETCARIKLNQSSIPWFEGEMLEVVNNKIRPAAS SLAISIRLTQKEDCLLDKTYSTEYQQTTKAQRFEDSPSANAACLDDMTECLSMATKEIVE EISRDIHLILSLQPKSRH >gi|225935335|gb|ACGA01000057.1| GENE 11 12834 - 13121 228 95 aa, chain + ## HITS:1 COG:no KEGG:BT_2176 NR:ns ## KEGG: BT_2176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 94 1 94 95 110 85.0 1e-23 MFIIIGIMLTGMLLGYLLRSKRLSWIHRIITLLIWILLFLLGIDVGGNESIIKGLHTLGL EAIIITVAAVAGSTLCAWGLWYLLYKWNGGKETKA >gi|225935335|gb|ACGA01000057.1| GENE 12 13118 - 13726 394 202 aa, chain + ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 200 2 195 198 114 39.0 2e-25 
MKGSLIIVSFFIIGTLCGVYHLIPYDFTDSKLSYYALCGLMFCVGISIGNDPNTLKSFRS LNPRLVFLPIMTIIGTLAGCAVAGAFMSQRGPLDCMAVGAGFGYYSLSSIFITEYKGPEL GTIALLSNIMREIIALLCAPLLVKYFGKLAPISVGGATTMDTTLPIITRYSGKEFVIISI FHGFVVDFSVPFLVTFLCSISF >gi|225935335|gb|ACGA01000057.1| GENE 13 14019 - 14375 491 118 aa, chain + ## HITS:1 COG:no KEGG:BT_2178 NR:ns ## KEGG: BT_2178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 186 94.0 2e-46 MKRLGLTLVAALCLAATTFAAGNQPTTAKWEGNINVSKLGKYLNLNSVQSEEVANICDYF SEQMSRATTAKKDKEAKLRNAVYGNLKLMRKTLSAEQYAKYAALMNITLQNKGIELNK >gi|225935335|gb|ACGA01000057.1| GENE 14 14441 - 15505 1198 354 aa, chain - ## HITS:1 COG:no KEGG:BT_2179 NR:ns ## KEGG: BT_2179 # Name: not_defined # Def: putative DNA mismatch repair protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 1 354 354 628 88.0 1e-178 MKIGDKVRFLSEVGGGIVTGFQGKDFVLVEDADGFDIPMPIRECVVIETDDYNLKRKPTS SAPKQEGPAKPVKPEMPVIQRQPEVKGGDTLNVFLAYVPEDAKAMMTTPFETYLVNDSNY YLYYTYLSAEGKAWKNRSHGLVEPNTKLLLEEFTKDVLNDMERVAVQLIAFKDGKPAAIK PAVSVEIRIDTVKFYKLHTFSDSDFFEEPALIYDIVKDDMPTKQVYVSAEELQEALLQKK SVDKPRSQPIVKPNHAHGGKSEIVEIDLHIDSLLDDTQGMGNAEILNYQLDKFREVMEVY KNKREQKIVFIHGKGDGVLRKAIVDELKRKYSNCRYQDASFQEYGFGATMVTIK >gi|225935335|gb|ACGA01000057.1| GENE 15 15693 - 16385 626 230 aa, chain + ## HITS:1 COG:mlr6523 KEGG:ns NR:ns ## COG: mlr6523 COG1011 # Protein_GI_number: 13475450 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Mesorhizobium loti # 5 209 7 209 238 192 45.0 5e-49 MKELIKVIAFDADDTLWSNEPFFQEIEKQYTDLLKPYGTSEDISAALFQTEMNNLKYLGY GAKAFTISMVETALHVSGQKISGTDIQHIIELGKSLLKMPIELLPGVKETLKVLKEKGKY KLVVATKGDLLDQENKLERSGLASYFDHIEVMSDKTEKEYQRMLNILQIAPSEFVMIGNS LKSDIQPVLSLGGYGIHIPFEVMWKHEVVDTFTHDHLIQVKRVDELLTLF >gi|225935335|gb|ACGA01000057.1| GENE 16 16403 - 17176 452 257 aa, chain - ## HITS:1 COG:no KEGG:BT_2181 NR:ns ## KEGG: BT_2181 # Name: not_defined # Def: transcriptional regulator # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 461 88.0 1e-128 MDVLQKEIDEVYATQPITDETLDNGVVEQHRRFIHSLTEINGGCAVISDLSNRKSYIAVH PWAHFLGLTPEEAALSVIDSMDEDCIYRRIHPEDLVEKRLLEYKFFQKTFAMPFDERLKY RGRCRIRMMNEKGVYQYIDNLVQIMENTPSGSAWLIFCLYSLSADQRTEQGIYPTITHME RGEVETLFLSEEHRNILSEREKEILRCIRKGLSSKEIATTLYISVNTVNRHRQNILEKLS VGNSIEACRAAELMKLL >gi|225935335|gb|ACGA01000057.1| GENE 17 17264 - 17929 648 221 aa, chain + ## HITS:1 COG:all7165 KEGG:ns NR:ns ## COG: all7165 COG3506 # Protein_GI_number: 17233181 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 28 204 1 176 183 158 42.0 8e-39 MKYLKIKIMSIAFMAIATSSMGQSLNKMNWLNEPQQWEIKDGKTLVMDVPAKTDFWRISH YGFTVDDGPFYYATYGGEFEAKVKITGNYVTTFDQMGLMLRIDHENWIKAGVEYVNGKQN VSAVVTHRTSDWSVVQLPDAPRSIWIKAIRRLDAVEIFFSRDDKEYIMMRTCWLQDNCPV MVGVMGACPDGKGFTATFEEFKVTPLADQRRLEWAKKQMNK >gi|225935335|gb|ACGA01000057.1| GENE 18 18047 - 19099 705 350 aa, chain - ## HITS:1 COG:no KEGG:BT_2183 NR:ns ## KEGG: BT_2183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 348 1 348 351 553 79.0 1e-156 MKISNIRLLNQQLLSPLFSQPKELVSWMGAIQAQNYSMVKWAVGMRLKSATIQTVEKALR EGEILRTHVMRPTWHLVAAEDIRWMLKLSAERVIAANESYAKGHDLDISEELYAKSYRLL EKILSGNKSLTRQEIAEHFSRSGIVADNHRMTRFMARAEQVGIVCSGEDKGSKCTYALLE ERVPPMSELTKDESLARLARSYFRSHAPAVLQDFVWWSGLPVTEARQAIYLIDSELTAEE WNGQTWYIHEDCRTRGKVSGSLHLLPSYDEYLLGYKDRTDVLPKEHYAKAFTNNGLFYPI VLHEGQVIGNWDKSVKKGGSFIEHSWFRLDDCVDEVALNREKDRYIRFWK >gi|225935335|gb|ACGA01000057.1| GENE 19 19096 - 19644 349 182 aa, chain - ## HITS:1 COG:no KEGG:BDI_3895 NR:ns ## KEGG: BDI_3895 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 28 177 608 760 766 164 50.0 1e-39 MKIKSLYALLFCIFTTGCLDGQKQSDHAILIAYTSTEFPELGSSVCYLNERGDTVIPFGK YHYGGSDTIRHIGFVVEPHTPGWTTINNKGEKLFYTFSFDNGPDYVEEGLFRIINDEMLM GFADTLGNVVIQPQFAFVFPFKDGKAEVTYTGAKKAMDDYGEHWTWQSDHWFYIDKKGNK LK >gi|225935335|gb|ACGA01000057.1| GENE 20 19655 - 19807 74 
50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEYKSRDLFGDFQLVLALGKGVEVISISSGLFFLLLTMVCLERNFIFAIY >gi|225935335|gb|ACGA01000057.1| GENE 21 19828 - 21060 1201 410 aa, chain + ## HITS:1 COG:PM0839 KEGG:ns NR:ns ## COG: PM0839 COG0128 # Protein_GI_number: 15602704 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Pasteurella multocida # 6 397 10 430 440 235 36.0 1e-61 MMLYKLIPPSTVTTAIQLPASKSISNRALIINALGKGMYAPENLSDCDDTQVMIKALTEG KGTIDIMAAGTAMRFLTAYLSVTPGERTITGTARMQQRPIQILVNALRELGAEIEYTHNE GYPPLCIKGAELKGNEITLKGNVSSQYISALLMIGPVLKDGLTLHLSGEIISRPYINLTL QLMQDFGAKAAWTSPSSISVAPQLYQSIPFKVESDWSAASYWYQIAALSPKAEIELLGLF RNSYQGDSRGAEVFSRLGITTEFTSQGVKLKKTGKAPERLEEDFVDIPDLAQTFVVTCAL LNIPFRFTGLQSLKIKETDRIAALRTELKKLGYVIEEENDSILMWNGKRCEPEEIPVIDT YEDHRMAMAFAPAVICHPNLLIADPQVVTKSYPGYWEDLKQAGFQVINEG >gi|225935335|gb|ACGA01000057.1| GENE 22 21205 - 21642 290 145 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 18 145 64 190 193 85 37.0 3e-17 MGFVSSGSFRYCCTNSAGENSIVGYTFDHSFVGNYPAFQLQDKSNVDIQALCDCSVYVIN YQQMTEFYDTNDAHQKLGRRIAETLLWEVYDRMISMYSLTPEERYLEIINRCPDLLKLIT LKELASYLLIRPETLSRIRRKVVQK >gi|225935335|gb|ACGA01000057.1| GENE 23 21717 - 22433 474 238 aa, chain + ## HITS:1 COG:CAP0051 KEGG:ns NR:ns ## COG: CAP0051 COG0300 # Protein_GI_number: 15004755 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 1 233 1 236 240 167 41.0 2e-41 MKKIIIIGATSGIGRGLAEIYSQEDFLIGISGRRENLLKEVCARDEDKLFYQVCDITDTQ STISSLETLIQKMGGMDILIICAGTGELNPELSYQLEEPTLLTNVIGFTNIVDWGFRYFE RQKSGHLVTISSVGGTRGSGIAPAYNASKAYQINYMEGLRQKATKSHYPIYTTDIRPGFV DTAMAKGEGLFWVTPVDKAVRQIKKAISKKKKVAFISKRWKYVAILFRLLPSAIYCRM >gi|225935335|gb|ACGA01000057.1| GENE 24 22618 - 23367 373 249 aa, chain + ## HITS:1 COG:no KEGG:BT_1587 NR:ns ## KEGG: BT_1587 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 250 250 442 82.0 1e-123 MNTDFINLTAENLSNEHLCCIIRSKKSHPGIEAKRQWLSDRLKEGHVFRKLNAKATVFIE YAPLEKAWVPIMGGNYYYLYCLWVLGSPRGNGYGSSLMEYCIADAKEKGKSGICMLGAKK QKNWLSDQSFAKKFGFEVVDTTNNGYELLALSFDGTTPKFTPNAKRLKIESEELTVYYDI QCPYIYQYIEMIKQYCETNNVPVSFIQVDTLEKAKQLPCVFNNFALFYKGVFETVNLPNI DYLKRILKK >gi|225935335|gb|ACGA01000057.1| GENE 25 23423 - 24421 821 332 aa, chain - ## HITS:1 COG:CC2502 KEGG:ns NR:ns ## COG: CC2502 COG2234 # Protein_GI_number: 16126741 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Caulobacter vibrioides # 2 302 5 272 309 113 31.0 4e-25 MRRNYLLLAFLFVGNIAFAQSPIERGLNTINRSSAEAAINFLAGDELQGREAGFHGSRVT SEYIASLLQWMGIPPLTDSYFQPFDAYRKERQKKGRLEVHPDSIAKLKQEVHQKLSMRNV LGMIPGKNTKEYVIVGAHFDHLGIDPALDGDQIYNGADDNASGVSAVLQIARAFLASGQQ PERNVIFAFWDGEEKGLLGSKYFVQTCPFLSQIKGYLNFDMIGRNNKPQQPKHVVYFYTA AHPAFGDWLKEDIKKHGLQLEPDYRAWDHPIGGSDNGSFAKVNIPIIWYHTDGHPDYHQP SDHADRLNWDKIVEITKASFLNMWKMANEKSF >gi|225935335|gb|ACGA01000057.1| 
GENE 26 24471 - 25565 514 364 aa, chain - ## HITS:1 COG:no KEGG:PG0350 NR:ns ## KEGG: PG0350 # Name: not_defined # Def: internalin-related protein # Organism: P.gingivalis # Pathway: not_defined # 125 329 160 361 484 80 31.0 7e-14 MKQKKLLFSLIGVCLLLGFSSCSSEDKSVFGDDFEIPELTDANTIQFTVDASGEWKVIEM NAGGGRIAIEWGDGRLQKIEHPDNASIQYRYKPSRSFTVRVWAEELTAFSVSGVLMPVSN MHLGDFPRMKRIELNSIKESSFLDLNTSCPNLEYMNIGNWEGLEELHFDECLNLKTVQVY TNPKLTSIKFGHCEFLTSLYCIDNGITSLSLKKLPALHTVELTGTPQLSALEVDDDNMIS VLHIEGCVFRKLDFLNKLMSLTELYCSSNKLTDLDTSGNKKLWYLDCYNNQLKSLSIPMN NELRKLDCRYNKLDKDQLNNIFNTLPDITGYPAYYDKCVIMYKNNPGAADCNETILQNKH WRVN >gi|225935335|gb|ACGA01000057.1| GENE 27 25663 - 26130 416 155 aa, chain + ## HITS:1 COG:no KEGG:BT_2205 NR:ns ## KEGG: BT_2205 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 148 1 140 141 237 89.0 1e-61 MIRNCYIGMWILIISLVLLGVIALIAGIIRNKRLQKKIEKGELDRMPEVKEVDVECCGQH EVCERDSLLAAVSKKIEYYDDEELDQFIGRPGNAYTEEETDMFRDVLYTTLDIEVAGWVR SLQLRGIELPDDLKDEVFLIIGERRNGEIKKTDDR >gi|225935335|gb|ACGA01000057.1| GENE 28 26130 - 26942 515 270 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 2 261 3 262 274 213 46.0 4e-55 MDLLQYTFFQHALLGSLLASIACGIIGTYIVTRRLVFISGGITHASFGGIGLGLFAGISP ILSAAVFSVLSAFGVEWLSRRKDMREDSAIAVFWTLGMALGIMFSFLSPGFAPDLSAYLF GNILTINQIDLWMLGILALILTGFFYLFIRPIVYIAFDREFARSQKIPVEIFEYVLMMFI ALTIVACLRMVGIVLAISLLTIPQMTANLFTYSFKKIIWLSIGIGFLGCLGGLFISYHWK VPSGASIIFFSILIYAVCKIGKSCCRKKSS >gi|225935335|gb|ACGA01000057.1| GENE 29 26955 - 27368 534 137 aa, chain + ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 27 135 30 134 158 90 41.0 9e-19 
MEIKIQSLESIHEAAREFIAAMGDNTVFALYGKMGAGKTTFVKALCEELGVADVISSPTF AIVNEYRSDETGELIYHFDFYRIKKLSEVYDMGYEDYFYSGALCFIEWPELVEELLPGDA VKVTIEELEDGSRVIKL >gi|225935335|gb|ACGA01000057.1| GENE 30 27365 - 27592 234 75 aa, chain + ## HITS:1 COG:no KEGG:BT_2208 NR:ns ## KEGG: BT_2208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 74 110 91.0 2e-23 MTGQYIVQGIFALAGTVSLLASLLNWDWFFTTRNAQTIVRNVGRNRARLFYGILGIIIIG MAIFFFIETRKAIGL >gi|225935335|gb|ACGA01000057.1| GENE 31 27647 - 28957 993 436 aa, chain - ## HITS:1 COG:PA4323 KEGG:ns NR:ns ## COG: PA4323 COG1721 # Protein_GI_number: 15599519 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Pseudomonas aeruginosa # 59 436 67 443 443 169 29.0 8e-42 MFLTRRFYIALVVVILLLGSGYVFAPFFVIGQWTLFVLLLVVLADAYSLYRIRGIRAFRQ CADRFSNGDENEVSIRVESNYSHPVSLEIIDEIPFIFQKRDVDFHVKLGANEGKTVNYRL RPTHRGVYSFGHIRVFVTGKIGFISRRYTCAEPLDIKVYPSYLMLHQYELLAISDNLTEL GIKRIRRVGHHTEFEQIKEYVKGDDYRTINWKASARRHGLMVNVYQDERSQQIYNVIDKG RVMQQAFRGMTLLDYAINASLVLSYVAMRKEDKAGLVTFDEHFDSFVPASKQPGYMQTLL ENLYSQQTTFGETDFSALCVHLNKHVSKRSLLVLYTNFSSIGGMNRQLSYLKQLNRQHRL LVVFFEDVDLKEYIAQPAKDTESYYRHVIAEKFAYEKRLIVSTLKQHGIYSLLTTPDNLS IDVINKYLEMKSRQLL >gi|225935335|gb|ACGA01000057.1| GENE 32 28966 - 29940 1097 324 aa, chain - ## HITS:1 COG:BH0604 KEGG:ns NR:ns ## COG: BH0604 COG0714 # Protein_GI_number: 15613167 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus halodurans # 16 323 6 310 318 276 45.0 3e-74 MEENTEQRVDLTLFSEKIQELKDRIASVIVGQEQTVDLVLTAILANGHVLIEGVPGVAKT LLARLTARLIDADFSRIQFTPDLMPSDVLGTTVFNMNTNGFDFHQGPIFADIVLVDEINR APAKTQAALFEVMEERQISIDGTTHRMGDLYTILATQNPVEQEGTYKLPEAQLDRFLMKI TMDYPSLEEEVNILERHHTNAALVKLDDITPAITKEEVLSLRAFMNQVFVDRTLLQYIAL IVQQTRTSKAVYLGASPRASVAMLQSSKAYALLQGRDFVTPEDIKFVAPYVLQHRLILTA EAEMEGYSPVKVTQRLIDKVEVPK >gi|225935335|gb|ACGA01000057.1| GENE 33 
29952 - 30590 416 212 aa, chain - ## HITS:1 COG:no KEGG:BT_2213 NR:ns ## KEGG: BT_2213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 212 215 424 424 342 76.0 6e-93 MPVAMSRQWGKGEVILVSTPLIFTNYGILDGKNATYIFRLLSQMGKLPIVRTEGYMKETA QVQQSPFRYLLAHQPLRWALYLTMITIILFMIFTAKRRQRAIPVIQEPANKSLEFTELIG TLYFQKKDHADLVRKKFSYLAEELRREIQVDIEEVEDDERSFNRIARKTGMDIQEIAKLI REVRPVIYGGRTIDADQMKVYIDKMNEIINHI >gi|225935335|gb|ACGA01000057.1| GENE 34 30608 - 31222 272 204 aa, chain - ## HITS:1 COG:no KEGG:BT_2213 NR:ns ## KEGG: BT_2213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 191 1 192 424 251 63.0 1e-65 MKGSRWFIIFIVAFLFIMFAIEYHLPKNFVWKPTFGHYDEQPFGCAIFDSLLASSLPQGY TFSKKSLYQLEQEDTTQRRGILVISDNLRLSDVDVNALLKMAERGDKIMLVSTLFGRYME DTLSFRSYYSYFSPMALKKYATSFLLKDSLCWVGDSAVYPRQTFYFYPQLCSSYFWGDSL PERVLDKKSLSRMNLNTKPRRTHW >gi|225935335|gb|ACGA01000057.1| GENE 35 31219 - 31836 514 205 aa, chain - ## HITS:1 COG:no KEGG:BT_2214 NR:ns ## KEGG: BT_2214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 205 2 204 204 307 81.0 2e-82 MNLTSPADTLVCDTAQIALWQSDPAYNYNRELITPEMNVFEWISRQFGELMRKIFGSRFA EEYSGLILICIAILILLLIVWFVYRKRPELFMVSRKNALPYTIEEDTIYGVDFPGGITEA LSRQDYREAVRLLYLQTLKQLSDAERIDWQLYKTPTQYINEVRMPAFRQLTNHFLRVRYG NFEATEELFRTMQSLQGEIEKGGVS >gi|225935335|gb|ACGA01000057.1| GENE 36 31853 - 32791 871 312 aa, chain - ## HITS:1 COG:no KEGG:BT_2215 NR:ns ## KEGG: BT_2215 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 310 310 303 57.0 6e-81 MESQKPKIAMYVKRPFGEKLNASFDFIKENWKQLFKYSTYLILPICLIQAANFSGLMGSM TDLSAMQTSGGIGENPLAALGPSFALNYAGVIFFSCLGGLLLTSLIYAMVRLYNEREERL NGIVFSDIKPLLLRNVKRLFLMGIACGFLFFFAVILVVLLAVLTPFTLILTIPLLFAFMI PLVLMSPIYLFEDISLSEAFAKTFRLGFATWGGIFLILFVMGLIASVLQTIVSIPWYVIY IVKMIFTMSDGGATSSSVGLNFAQYLFSILMLYGSYLSAIFSIVGLVYQYGHASEVVDSI TVESDIDNFDKL >gi|225935335|gb|ACGA01000057.1| 
GENE 37 32772 - 33734 693 320 aa, chain - ## HITS:1 COG:alr1808 KEGG:ns NR:ns ## COG: alr1808 COG1300 # Protein_GI_number: 17229300 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Nostoc sp. PCC 7120 # 31 315 39 317 318 130 31.0 5e-30 MKEVTFIRRNIEKWKETEKVVERAASLSPDQLADAYTDLTADLAFAQTHFPTSRITIYLN NLASALHNEIYRNKREKWTRIITFWTQEVPRTMHDARRELLTSFLIFVASALIGVLSAAN DPDFVRLILGNGYVDMTLDNIANGEPMAVYNGSSEVPMFLGITLNNVMVSFNCFAMGLLT SFGTGYMLLSNGIMVGAFQTFFYQQDLLWESSLAIWLHGTLEIWAIIVAGAAGLALGNGW LFPGTYSRLESFRRGAKRGLKIVIGTVPVFIMAGFIEGFITRHTELPDMLRLGVILTSLA FIIFYYIYLPNRKKHGITET >gi|225935335|gb|ACGA01000057.1| GENE 38 33836 - 34561 518 241 aa, chain + ## HITS:1 COG:BH0734 KEGG:ns NR:ns ## COG: BH0734 COG1714 # Protein_GI_number: 15613297 # Func_class: S Function unknown # Function: Predicted membrane protein/domain # Organism: Bacillus halodurans # 3 149 8 164 266 72 30.0 9e-13 MAESTIITGQFVRISQTPASIGERLMALIIDYFLIGLYILSTATLLSELSLPSGFSLFFF LCIVYLPILGYSFLCEMFNHGQSFGKKLINIRVVKVDGSTPSIGSYLLRWILFPIDGPIT SGLGLLVILLNKNNQRLGDLAAGTMVIKEKNYRKIHVSLDEFDYLTQNYHPVYPQSADLS LEQVNVITRTLESSEKDRARRVTALAKKVQELLSVTPRDGNQEKFLQTVLRDYQYYALEE I >gi|225935335|gb|ACGA01000057.1| GENE 39 34648 - 34851 92 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILKKSSLWFRERWNTETLWDKWTKTTSGLIVDKIKKRGVLCCEYAPFLCWLFLKHSYAY EKKFSLR >gi|225935335|gb|ACGA01000057.1| GENE 40 34778 - 35113 552 111 aa, chain - ## HITS:1 COG:no KEGG:BT_2228 NR:ns ## KEGG: BT_2228 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 152 94.0 5e-36 MGLEDDFLLADADDEKTIEFIKNYLPQELKEKFSDDELYYFLDLIDEYYSESGILDAQPD EDGYVNIDLEEVVAYIVKEAKKDEVGEYDPEEVLFVVQGEMEYGNSLGQVD >gi|225935335|gb|ACGA01000057.1| GENE 41 35155 - 35469 172 104 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 3 103 18 117 120 70 32 6e-11 
MALEITDSNYKEVLAEGKPVVVDFWAPWCGPCKMVAPIIEELAAEFEGQVIIGKCDVDEN GDMAAEYGIRNIPTVLFFKNGEIVDKQVGAVAKSVFAEKVKKLL >gi|225935335|gb|ACGA01000057.1| GENE 42 35604 - 36632 727 342 aa, chain - ## HITS:1 COG:CAC0360 KEGG:ns NR:ns ## COG: CAC0360 COG1609 # Protein_GI_number: 15893651 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 332 2 325 328 182 33.0 7e-46 MKHITIKDIARHLSLSVSTVSRALINDKNIRQETKERVLETAKMLGYKPNPVATNLKYGH TNTVGIIVPEMLTPFASQVISGIQSVLYANGIKVIIAESDEDPNKERENLQTMERFMVDG IIICLCSYKENLDQYIRLQQAEMPMVFYDRIPYGMNVSQVIVDDYIKAFFLVEHLIREGY RHIVHLQGPDDVYNSVERARGYKDALVKFNIPFDKDRMLKAGLTFKDGANAADMLIEKGV SFDAIFAFTDTLAIGAMNRLRDLGKKIPEEIAIASFSGTVLSTIVYPQLTTVEPPLQQMG KVVAELIIEKIKEPLSPNRSIVLDAEIKLRASSKKEKDSKGK >gi|225935335|gb|ACGA01000057.1| GENE 43 36832 - 38271 937 479 aa, chain + ## HITS:1 COG:PA2342 KEGG:ns NR:ns ## COG: PA2342 COG0246 # Protein_GI_number: 15597538 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Pseudomonas aeruginosa # 1 479 11 489 491 393 40.0 1e-109 MKTQVLSYNYNREHIKPGILHIGVGNFHRAHEEFYTNLLLEDPTQQDWGICGAMLLPGDE RLYRILEKQKKEYTLTICGRDGKDQTYQIGSLIELIWGIENPAAIINKIADKNIHIITLT ITEGGYNIAKATNEFMLDNENIKYDLAHPQSPKTTFGFVAEGLRKRKAVGNGPITILSCD NLQHNGNTARKAFLSFFQAQDPELAEWATVNITFPNSMVDRITPSTQPEDIIRLNTQNGT QDGAPVYCEDFIQWVVEDKFIAGRPAWEKVGVEFTQDVTTYENMKLSLLNASHTLLSYPA FLSGYRKVDVAISDERIKKFVRGFMDIDITPYVPAPGNMDLDLYKQTLIERFGNHTVSDQ VARLCFDGASKFPVYIMPNLIQMIRDRANLTRQAYLFAAYRHYLKYKTDDNGTKFEIAEP WLTATDDILIASNSPIDFLSLSPFQSTELKAADEFVELYLQMVNAIKEKGTMSTLESIL >gi|225935335|gb|ACGA01000057.1| GENE 44 38299 - 39624 430 441 aa, chain + ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 2 430 16 410 412 170 30.0 6e-42 MNTQKNNSLGPFIVLTFIYFIVGFLTTVNGQFQGPLKIAFLSHTDELRNTLTTFISFFFF 
LGYLLNSSLGGKWINVHGYKKTLLRALSIMVIGLLMYSLSSWLVVHYGDARILIFKDQVP YGYFIFLLGSYLMGTSAALLQVVINPYIAAYELPNTQPVQRINIVCAINSFGTTIAPFFV TGIIFAGVTLESVTADQLMFPFLMITLCIIITTLITSRLNLPDIQGTRVDNGNKPKHSIW SYQHLSLGVITLFFYVGAEVSIGVNINLHAMELIENGHRFFCFGKSHIVVWGLDLGIPAL LATLYWGGLMVGRLIFSFFNNVSPRILLTVTSIIATILILVAILTNNLWILVSVGLCHSV MWGCIFTLAIKGLQQYTSKASGILMMGVFGGAIFPVLQGILADTWGSWQWTWIIALICEL VMLYYAISGSHIKNLPKVQQS >gi|225935335|gb|ACGA01000057.1| GENE 45 39854 - 40444 263 196 aa, chain + ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 10 184 6 186 187 58 24.0 5e-09 MHCETIEKQRKILLSVRNGSEKAYQELYEQWVSRLYGFVFQYLKSKDATDDVVQETFLRI WSNRANLNPDVSFKSYLFTIAYHFLLKEMRRQLNNPLMEDYVEYLNRSSTEIAEAESLMC YDQFVNALEKGKQHLSPRQRIIFEMNKEYGMSISEISEKLSITNQVVRNQLYMALKILRV ELRQYYPLLLLFLKDI >gi|225935335|gb|ACGA01000057.1| GENE 46 40534 - 41529 486 331 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 132 276 132 272 331 63 30.0 5e-10 MHIVNKAKDLLKKFQSGNLSLSDFKELVSTVNDSSDQELEDFFFEEWNKFDTYPSLSQEK IDSLYCHLHKKMKISPFYKITRHWGQIAASILLLFASGLTILYYIQHQELQTLAEQDVIV RSGDSGTSQVSLPDGTLVRLNANSSLTYQQNFGQNNRKVKLSGEGYFEVKKNTEKKFIVN TGYIDVTVLGTKFNLYAYEDKDIIEMALVEGHVNVSTSKPPYQTICVKPNEKVTYNKYDN KLNIEKTTTKIETAWLNKELVFREEKLENVFQCLSRKFRVKFSIDSSISVDDVYTGAFDD EKIEDILEVLKIHYGFNYTVKDGKINIRMNK >gi|225935335|gb|ACGA01000057.1| GENE 47 41689 - 45036 1867 1115 aa, chain + ## HITS:1 COG:no KEGG:BDI_1677 NR:ns ## KEGG: BDI_1677 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 17 1115 8 1124 1124 929 46.0 0 MKNKKQLISLVQKDNKNTIKLYSLLLFFCFLTFTSANAQTGKVNINLKNASVKELFNAIE 
SQTPYHFSYRSVEVENKKEVTISVKNAKLKDLLIQELPKHHLSHIVQGNKIIVTPATDNQ SSDKSNKVTGKVVDTNGEPIVGATIKEQGTTNGTITDMDGNFSFMVSPNTMIEISYIGYQ DQRIKTVFGKSMFITLKEDTELLDEVVVVGYGIQKKVNLTGAVSVVDAKTIQNRPITNAT QALQGAQGVYVNSAAGAQPGNDAATIRIRGMGTFSSAGNDPLFLIDGVPGTLTDVNPSDI ASISVLKDAASASIYGSRAANGVVLVTTKKAEAGKFSVSYNNSFGLQQITYLPDAVWDPI LYMKGFDKAYENEGRAPLYADIIEEYKEGMKVDPYTYPATDWFDLYFRDAFMQEHNIRIS GGNDHIQTGLSVGYLDQEGVVNFTDAKKVSINFNSTIKYGQFKAGINVSANYRNYNEPHY GISDYMQLAMRALPVMTPYLADGSYGRSWVVTPGQNTFNNLLSCKDGENNYKQTRIVGSA FAEYIFPYDIKYNITLGVRKVDLTRRYFQPTTYTYNPKTLEPQKMIANTTAMNAANDDIN PSISQTVNWNRSFNGKHNIAALLGMNYEEFNAWSFSAKGQNGYLDNELTEVGLATTFLKP ASSSSKVRLLSYFGRINYDYADRYLFEANLRYDGSSRFAKGHRWGLFPSFSAGWRIDQEA FMANTQNWLSNLKLRVSWGQLGNQSIGLFQYTPIMSSGVNYIFGNTTATGYAITQAVDPE ISWETTTITNLALDFGIFNNSLSGSIEFFKKRTKDILRSVNQPSQVGNLTGAMRNIGTVD NTGLEANLAYRNNIGAFNYHVFGNVTYVKNEVVHIGGDDMINGKRITREGYPIDAYYLYI CDGIFQSEDQVKHHAYQSANTHAGDLIFRDVSGPEGVPDGQITEDDRVVTGSSVPDFTYS FGLNLDYKGIGLNVFFQGVSGISTYPTHNLVYPYANGAGVTYEWLKRAWTPENTGGGFPR LLTTNSQHDNYTKSSTFWLRDASYLRLKNIQLSYDFPKKWIAPLKIAALKVFVNAENLLT FSDFDIFDPERSLTSDYIWSYPSVKSFTGGINVTF >gi|225935335|gb|ACGA01000057.1| GENE 48 45049 - 46542 983 497 aa, chain + ## HITS:1 COG:no KEGG:Phep_2140 NR:ns ## KEGG: Phep_2140 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 18 494 21 527 529 344 40.0 4e-93 MKKILTGFVFCLSITSCSLDTIPTSQYVEDNFWKTPEQIEAGLVACYNTMYNVYMYGSNL IFSETGTPNAYNYNGTYGWRVIGDGSVNSINSDIVNGKWGACYQGIGRCNTFIAGAPSFV ISEEDKKEMVGEAKFLRALFYFDLTNMYGDAPLILDKPDVNTQSFLPRDPKAKIMAQVIQ DLKDAFSVLPATSSQTGRANKWAAKALLARVYLYNEMWEEAEKAAEEVMKSQKYSLFPDY RNLFSRDHENNQEVIFDVQYLYPTFVHSGDGLDVILRQFNTIAPTLDLVKSYDMKDGSSY TDGKDLYTDRDPRFYATIVYPGATYMGQIVTNDKFINTGYTFKKYSRYDTAAAAIDDKND INIILMRYADILMIYAEARNERLDKPDQPIYDAINEVRNRPSVKMPLFDKTKDNFTKEQM RERIRHERRIEFAGEGWYYNDIRRWKIAKKLMNGSTVQKYDGSIIERRIFNDKNYLFPLP QQEVNDNPNLLPNNPGW >gi|225935335|gb|ACGA01000057.1| GENE 49 46563 - 48077 750 504 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # 
Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 17 492 18 473 497 175 30.0 2e-43 MINKYPLLYMFFLPVMCPSGTKAQNPDQRPNIIYILADDIGYGDLGCYGQQKIETPNLDQ LAAKGMRFTQHYSGSAVSSPSRCSLMTGLHTGHAYIRGNDELPERGDIGNYLAVLADSTL EGQRPMPEGTITVASLLKQAGYTTGCVGTWCLGYPGSSSTPRKMGFDFFYGYHCQRQAHD YYPPFLWRNEHREYLPNRLLSPNQKFDATADPNKKESYEFLVSKSYAPELMLHEVLSFVK RHKERPFFLYWPTPIAHVPLQAPQRWIDYYVKKFGDESPYLGDKGYFPCRYPKATYAAMV SYMDEQVGCLVELLKECGIYENTLILFSSDDGPTHNGGVNAPWFDSAGPFKSEKGWGKGS LREGGIRVPMIVHWHDQITAGSVSDHICAFWDVLPTLGEISGYSYKKTDGISFFPTLKGN RQEVHEYLYWELPEGKGSKAIRMGKWKGYLSNIKNGNRCVELYDLETDPQEQYNLSSIYP NVVKEIEKKMKKAHTESPVTNYKL >gi|225935335|gb|ACGA01000057.1| GENE 50 48090 - 49055 563 321 aa, chain + ## HITS:1 COG:no KEGG:Psta_0944 NR:ns ## KEGG: Psta_0944 # Name: not_defined # Def: hypothetical protein # Organism: P.staleyi # Pathway: not_defined # 36 315 16 296 304 116 30.0 1e-24 MKKWNLLLVLALLFVACKDEYDVNLDELYGTTSGSSSKVDKEIITCGDSKVMIINIGKAS MTTVPYEWEWDSKTANDLSVTLKGRLLGMDECKVVNEGNDQLVLLTSSGGSALILSRKTK KCLFTAVDAPGAHSIELLPNNRVVVALSGDAGGFQVYDRNASNKVIFKEDQKGGHGVIWM ENQQLLYALGYDKLLAYELKDWNTSTPSFVKKQEWNLPLGGTEQIPDGHDLIRISDHELG FTTAYNVYVLDTNTGTISEFEPMKGRKFIKSFNYDKESGYLVYTYADIEWWTHNIHIQNP NKTITISNVNLYKVRTVPERE >gi|225935335|gb|ACGA01000057.1| GENE 51 49135 - 50022 391 295 aa, chain + ## HITS:1 COG:no KEGG:Arad_8333 NR:ns ## KEGG: Arad_8333 # Name: not_defined # Def: hypothetical protein # Organism: A.radiobacter # Pathway: not_defined # 24 290 26 293 296 139 30.0 1e-31 MKKIYAIFILFFVLCACATGQKKNEIVVCGDDKVWIVNVDNSEGKNLDIVWKWNAQESNI PDDFKKYFRGMDECKTAKNGDWLLTSSSAGGAAIIERSSEKCLFYARVPAAHSIELLPDN RVVVALSHNEEGDCIQLFDINRPNQVLFQDSLFWGHGVIWMKNRQLLYALGFNELKAYSL KNWKSEQPTLQMEKVWKLPTDDGHDLIRISENELGFTTSSGTYVIDLDTEKIVSFQPLEG KKFIKSFNYNKENGILSYTQAEIEWWTHNIYLENPKKTLTIDSINLYKVRPVIYE >gi|225935335|gb|ACGA01000057.1| GENE 52 50045 - 52366 967 773 aa, chain + ## HITS:1 COG:XF0847 KEGG:ns NR:ns ## COG: XF0847 
COG3525 # Protein_GI_number: 15837449 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Xylella fastidiosa 9a5c # 35 621 87 670 841 350 35.0 7e-96 MHIINKILILGSWLVVLLLVHTSCTNSVEHSEKTLNLLPEPAQLTVGKEAFILQDDMIIS INAPSLKSAADYLVTILQRATGYKFVVTEEERGHIQLCIDEKLPHKEGNYTLKVTSNQIH ISSANYAGVIAAISTIRQMFPVEIEASTVVLDTVWSIPTVSIIDEPRFAWRGILLDVARH FFSKEEVKELLDVMALYKMNKFHWHLTDDQGWRIEIKKYPLLTEKGAWRTFNDQDRICMS RSKREQNPSLAIPSDKLRIIEGDTLYGGFYTQEDIREIIRYAAVRGIDVIPEIDMPGHMQ TAVSLYANVSCFPQKEAPMNISSPVCPGKESALEFCKNVYDEIFRLFPSEYVHLGADEVS KKNWEKCSDCQKRMKVNNLKTEEELQSWFIHQMEQYFNENGKRLIGWDEILQGGVSPTAT VMWWQSYEKEVVKKSIARGNSVILCPNYDFYLDYSEIGQSTRLICESVSLLDSLNESQSK RILGVQGNIWGEFIPSRERMHYMAFPRLLAIAETGWSQAANYNWKSFQRRMTGQFNRLDA LRIDYRMADLEGFCNINTFIGETKVNVISPDPDAVIHYTDDGTEPTEKSAVYTTPISIRD SVSFAFRAYRPNGKSSKVFRASYKKQNGYIPAVDIKAPKKKGLMLTWYEGDIPSCQVIEN YHLKEKRTIDDVYIPSEVGNSKVGLIFTGYFYVSVDGIYSFSLSSDEGSTLKVDEEMIID NGGRHLRNEVSSQRALAKGWHPLEVRYFDFNGGCLSLKMHDASGKQIKPIYNH >gi|225935335|gb|ACGA01000057.1| GENE 53 52387 - 53595 736 402 aa, chain + ## HITS:1 COG:no KEGG:BDI_1319 NR:ns ## KEGG: BDI_1319 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 1 393 1 371 583 476 58.0 1e-133 MRTQLIIILSLLNVCLVPAQERDTRVRTYITPTRIVWQQDSDHITNCNFLLNQGDGQAYW GAYYDYESIDTGLQGKKEKTVKAATYCQLSSTDGARPAILLDFGKELHGGVQLVTGAWPS HKPVKIRLRYGESVSEAMSDIDGKGGATNDHAIRDQELILPWLGVYETGNSGFRFVRIDL LDTDAILELKEVRAISIMRDIPYRGSFQCNDEKLNTIWRTGAYTVHLNMQNHLWDGIKRD RLVWIGDSYPEVMTVNSVFGYNEVVPKSLDLMRDITPLPHWMNAGFSSYSIWWLLCHYEW YRYHGNKAYLEQSRDYITALLRQLMTQIAPDGQECLDGTRFLDWPSNSNKDAIAAGLQAL MVWGMRVGVEFAELFEDRQLSNDCKAAEKKLVKAASKVYKSF >gi|225935335|gb|ACGA01000057.1| GENE 54 53646 - 54203 428 185 aa, chain + ## HITS:1 COG:no KEGG:BDI_1319 NR:ns ## KEGG: BDI_1319 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 1 183 398 580 583 301 71.0 8e-81 
MTITGLIDAPKADKEFLSINGAQGFSTFYGYYMLEAMATAGNYQGALDVIREYWGAMIDL GATSFWEDFDINWIPNATPIDELVPEGKKDIHGDCGAYCYVGFRHSLCHGWASGPTSWLS RHILGVEVIEPGCRQVRITPHLGDLQWVKGTFPTPYGEIQIYHEKQANGNVISEVKAPKG VSVIK >gi|225935335|gb|ACGA01000057.1| GENE 55 54277 - 55665 977 462 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174214|ref|ZP_05760626.1| ## NR: gi|260174214|ref|ZP_05760626.1| hypothetical protein BacD2_20311 [Bacteroides sp. D2] # 7 462 1 456 456 916 100.0 0 MKKSLLMNAWMMLALFICMGIFSCSDNSENGVKEPLVLRLVDKNGNPVGGINGGDYNLTG NVENLTLRVMSNLDWKITIPEGNEWFKVDIAEGVGFKEVSFTATENEGLEERKAQVILSS TQNSSVQDFILNCIQPGGPRLIVTPENASVGKEGKDVVVAISTNLPSLKCIIPENVKWIT QKSLTNKQLVLTVAASDIAKIRTATVEVTSESSAYPTTKSIEIRQAGAVDMAELLDVKFN ADGTAEDLSAMKMPIKTFAGPTLTMIENETFGYIAKFDPETVNQNKITSGFYTVDFTQNI SFQNAIADGFSMELYVTAKDYGEKSAPIPMGCHGGGGVAFIWHDDNNSWSFEPYIGNYSA CNMPPLGEVPEETWYHVVGVWDGTGNADAIKLYVDGKVEAQRAPASGNSFQFPGNKWFVI GGDAGAANEADRAYKGQIAITRIYDRALTAEEVKALYNTLKE >gi|225935335|gb|ACGA01000057.1| GENE 56 55815 - 58031 1046 738 aa, chain + ## HITS:1 COG:no KEGG:BDI_1317 NR:ns ## KEGG: BDI_1317 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 27 731 32 737 738 983 66.0 0 MKKIIVSLIVGILTAAHIQAQNSDNTRAQWISGELCNSATNTWLIYRKTVHIDNIPRSLI ANIAADSKYWLWINGQMVVFEGGLKRTPSPYDTYYDPVEIAPFLQIGENTIAVLVWHFGK SGFSHNNSGLAAFFFEATSPEINILSDDSWESDVYTAYQNTMTPPNPNWRLAESNILFDA RKEKLGWNLPGYRGVIPKAIVVSEVGVSPFGKLIKRPIPLWKDYGLRPYISIRYSANKDT VYCRLPYNCQITPYLKVEAPAGKKINIRTDNYCGGSEYNVRAEYITRNGLQEYESYGWMN GHEVQYIIPKGVKIVSLKFRETGYNTEISGNFYCNDPFLNELWKRSARTLYITMRDNYMD CPDRERAQWWGDEVNELGEAFYALSPSSYQLALKGVYELINWQKPNGIFFSPIPGNRREL PLQTLATIGWYGFYTLAFYSGDNSFIPDIYDRIHRYLHEVWKINEKGLVIERKGDWNWGD WGENIDMGVLTNCWYYLALKAEREFARQLGKSSDVEEISSLMRGIEKEFDSEFWTGTCYR SPLYTHATDDRAQAMAVVSGLASSDKYPQLLEIFKNEYHASPYMEKYVLEALFIMNEPSF ALYRMKKRYSKMLSYKDYTTLFEGWGIGQEGFGGGTINHAWSGGPLTLLSQKVCGITPIT PGFKVIRVAPQMGDLTNASATIETIAGKVQVVLKRKKNQIHMSLCIPDGITIEVPYTKDK IKILNSGRHNLTISNSNN >gi|225935335|gb|ACGA01000057.1| GENE 57 58194 - 62000 3812 
1268 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 2 1230 9 1133 1167 703 35.0 0 MQDFVHLHVHTQYSLLDGQASVARLVDKAMKNGMKGIAVTDHGNMFGIKEFTNYVNKKNS GPKGEVKDLKKRIAGIEAGTIECEDKEAEIAACKAKIVEAENKLFKPIIGCEMYVARRTM DLKEGKPDQSGYHLIVLAKNETGYHNLIKLVSHAWTRGYYMRPRTDRSELEKYHEGLIIC SACLGGEVPKRITAGQFAEAEEAIQWYKNLFGDDYYLELQRHKATVPRANHECYPLQVNV NKHLIEYAKKFNVKLICTNDVHFVDEENAEAHDRLICLSTGKDLDDPTRMLYTKQEWMKT REEMNELFADVPEALSNTLEILDKVEYYSIDHAPIMPTFAIPEDFGTEEGYRAKFTEKDL FDEFTQDEHGNVVLSEEDAKAKIKRLGGYDKLYRIKLEGDYLAKLAFDGAKRIYGEPLTE EVKERMNFELYIMKTMGFPGYFLIVQDFINAARKELGVSVGPGRGSAAGSAVAYCLGITK IDPIQYDLLFERFLNPDRISLPDIDVDFDDDGRGEVLRWVTNKYGQEKVAHIITYGTMAT KMAIKDVARVQKLPLSESDRLCKLVPDKIPDKKLNLPNAIAYVPELQAAEASSDPLLRDT IKYAKMLEGNVRGTGVHACGTIICRDDITDWVPVSTADDKETGEKMLVTQYEGSVIEDTG LIKMDFLGLKTLSIIKEAVENIRLSRNIEVDVDAIDISDPATYKLYSDGRTIGTFQFESA GMQKYLRELQPSTFEDLIAMNALYRPGPMDYIPDFIDRKHGRKPIEYDIPVMEKYLKDTY GITVYQEQVMLLSRLLADFTRGESDALRKAMGKKLRDKLDHMKPKFVEGGRKNGHDPKVL EKIWTDWEKFASYAFNKSHATCYSWVAYQTAYLKANYPSEYMAAVMSRSLSNITDITKLM DECKAMGIQTLGPDVNESNLKFTVNHDGDIRFGLGAVKGVGEAAVQSIMEERRQNGPFLG IFDFVQRVNLNACNKKNMECLALAGGFDSFPELKREQYFAVNSKGEVFLETLMRYGNRYQ ADKAAAVNSLFGGENVIDVATPEIPQGVERWSDLDRLNRERDLVGIYLSAHPLDEFSIVL EHVCNTRMADLEDKAALVGREITMGGIVTSVRRGVSKNGNPYGIAKIEDYSGSTEIPFWG NDWVTYQGYLNEGTFLYIKARCQAKQWRQDELEVKITSMELLPDVKEELVQKITIVIPLS VLNSALVTELATLTKEHPGNTELYFKVTDDADVTHMSVDLISRPVRLSVGRDLITYLKER PELGFHIN >gi|225935335|gb|ACGA01000057.1| GENE 58 62162 - 62848 543 228 aa, chain + ## HITS:1 COG:NMA1160 KEGG:ns NR:ns ## COG: NMA1160 COG0688 # Protein_GI_number: 15794106 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Neisseria meningitidis Z2491 # 12 227 10 213 265 151 41.0 8e-37 MGRLKKLKKIRIHREGTHILWASFLLLLLINAALYWGIDCKIPFYVVAVASIAVYLLMVN 
FFRCPIRLFGQDTEKIVVAPADGKIVVIEEVDENEYFHDRRLMISIFMSIVNVHANWYPV DGTIKKVAHHNGNFMKAWLPKASTENERSTVVIETPEGVEVLTRQIAGAVARRIVTYAEV GEECYIDEHMGFIKFGSRVDVYLPIGTEVCVSMGQLTTGNQTVIAKLK >gi|225935335|gb|ACGA01000057.1| GENE 59 62859 - 63566 466 235 aa, chain + ## HITS:1 COG:SMc00552 KEGG:ns NR:ns ## COG: SMc00552 COG1183 # Protein_GI_number: 15964875 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Sinorhizobium meliloti # 9 189 42 221 289 89 31.0 6e-18 MTNVIKNSIPNTVTCLNLFSGCIACVMAFEAKYELALLFIALSSIFDFFDGLLARMLNAH SIIGKDLDSLADDVSFGVAPSLIVFSLFKEMYYPASMEFIAPYLPYLAFLISVFSALRLA KFNNDTRQTSSFVGLPVPANALFWGSLVAGAHDFLISDNCHPVYLLILVCLFSGLLVSEI PMFSLKFKNLSWNDNKISFIFLIICIPLLIFLGISSFAAIIVWYILLSLFTRKSK >gi|225935335|gb|ACGA01000057.1| GENE 60 63634 - 63924 264 96 aa, chain + ## HITS:1 COG:no KEGG:BT_2233 NR:ns ## KEGG: BT_2233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 96 16 99 99 96 62.0 2e-19 MHILLFILIFIIAIFVFGLSIIGFILRTIFGLGRGSSSSRPKQTESGRTSQQDYGQRDRR SNDDEEEIYSENVPEKRHKKIFTQDDGEYVDFEEIK >gi|225935335|gb|ACGA01000057.1| GENE 61 63940 - 64377 463 145 aa, chain - ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 1 144 1 149 156 134 48.0 8e-32 MLDDIYFMKQALIEAGKAAERGEVPVGAVVVCKERIIARAHNLTETLNDVTAHAEMQAIT AAANVLGGKYLNECTLYVTVEPCVMCAGAIAWAQTGKLVFGAEDEKRGYQKYAGSALHPK TVVVKGVMADECATLMKEFFAAKRK >gi|225935335|gb|ACGA01000057.1| GENE 62 64418 - 64588 164 56 aa, chain - ## HITS:1 COG:no KEGG:BF0707 NR:ns ## KEGG: BF0707 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 56 22 77 77 91 92.0 1e-17 MSILLSDEEQLIVDRYLEKYKITNKSRWLRETILMFIHKNMEEDYPTLFGEHDMRR >gi|225935335|gb|ACGA01000057.1| GENE 63 64719 - 65084 357 121 aa, chain + 
## HITS:1 COG:CAC1763 KEGG:ns NR:ns ## COG: CAC1763 COG0792 # Protein_GI_number: 15895040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Clostridium acetobutylicum # 8 116 9 120 122 59 36.0 2e-09 MAKHNDLGKAGENAAVAYLEQKGYLIRDRNWRKGHFELDIVAAKDNELIVVEVKTRSNTL FAEPEDAVDLPKIRRTVRAADTYIRLFQIDSPVRFDIITVVGNDGHFKVEHIEEAFYPPL Y >gi|225935335|gb|ACGA01000057.1| GENE 64 65199 - 65555 437 118 aa, chain + ## HITS:1 COG:DR2400 KEGG:ns NR:ns ## COG: DR2400 COG2315 # Protein_GI_number: 15807390 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 3 113 8 125 132 77 40.0 7e-15 MNVETIREYCLSKRGVTESFPFDDVSLVIKVMNKMFALIDLEEANHIALKCDPEKAIELR EHYSGIEGAYHFNKKYWNSVRFDSDIDDKLMKELIDHSYNEVIKKFTKKLRAEYDALP >gi|225935335|gb|ACGA01000057.1| GENE 65 65539 - 66297 344 252 aa, chain + ## HITS:1 COG:lin2018_2 KEGG:ns NR:ns ## COG: lin2018_2 COG0340 # Protein_GI_number: 16801084 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Listeria innocua # 36 236 35 236 253 86 29.0 4e-17 MMPFPDIFPVPLIHINETNSTNNYLQSLCSKQKMEELTVVVADFQTSGRGQRGNSWESDP GKNLLFSTVIFPEFLEARRQFLISQVISLAIKEELDTYTTDISIKWPNDIYWKEKKICGM LIENDLMGRNISQSIAGIGVNINQEIFHSSAPNPVSLVQITGKEHDLFEILKNIMLRIQS YYSLLKKGDTTSIACQYEKSLFRREGIHRYKDANGEFLARIVCVEPEGRLILEDEMLMKR DYMFKEVEYLLK >gi|225935335|gb|ACGA01000057.1| GENE 66 66364 - 67656 460 430 aa, chain + ## HITS:1 COG:no KEGG:BF0631 NR:ns ## KEGG: BF0631 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 430 1 436 436 463 56.0 1e-129 MKTKLIFLLSLLWILIGCDDSDSATLNISESKFDNISASGESLTIDITCSSSWTVTSNKQ WCIPNTQKGENDGKLILSINANLESNSRTATVTIISHKVNKTVQIIQNGSINTAEEYHYK IPVIFHVLYKEDRNSLQKVNSSRLSHILDKVNSLYKSKNNSVDMNLTFTLATTDKNGETL PNPGVEYIQWPESYPIDCEAFMEDNSGEYVKYLWDPNSYINIMVYNFATEPNSNSVTLGI 
SHIPFSTKGKHYLEGLGETDYSHLTLANLQFPLCVSINSLYINEESTSTKYNTADVTVTL AHELGHYLGLHHVFAETNNGTCEDTDYCKDTKSYNKQEYDSNCDYIYENERAKYTFENLV KRTGCDGIEFISYNIMDYAISHSNQFTQNQRERIRHVLSYSPLIPGPKKGDIDTRALNEG PLDLPIRTIK >gi|225935335|gb|ACGA01000057.1| GENE 67 67691 - 68374 626 227 aa, chain - ## HITS:1 COG:no KEGG:BT_2240 NR:ns ## KEGG: BT_2240 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 227 15 227 227 330 85.0 2e-89 MRTLTIFLISLFSLPLALNAQSVDEMLQKVSAAIEAGQNGQAVSYFRQTIPLNIDRTEMY YWTNVDKNSEISSKLATELALAYKKKRNYDKAYLFYKELLQKDPNNVDCLETCAEMQVCR GQEKDALRMYEKILQLDADNLAANIFLGNYYYLMAEQEKKKLETDYKKLPSPTKMQYARY RDGLSKLFTTRYEKARNSLQKVVLRFPSTEAQKTLDKILRIEKEVNR >gi|225935335|gb|ACGA01000057.1| GENE 68 68488 - 69816 854 442 aa, chain + ## HITS:1 COG:VC0090 KEGG:ns NR:ns ## COG: VC0090 COG0534 # Protein_GI_number: 15640122 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 5 424 12 431 454 282 39.0 1e-75 MIDKKQSSENRRILRIAIPSIISNITVPLLGLIDVTIVGHLGSPAYIGAIAVGGMLFNII YWIFGFLRMGTSGMTSQAYGQHDLNEITRLLLRSVGVGLFIALCLLILQYPILKLAFTLI QTTPEVEQLATTYFYICIWGAPATLGLYGFAGWFIGMQNSRFPMYIAITQNIVNIVASLS FVYLLDMKVAGVATGTLIAQYAGFFMAILLYMRYYSALRKRIVWKEIIQKQAMYRFFQVN RDIFFRTLCLVVVTMFFTSAGAAQGEIVLAVNTLLMQLFTLFSYIMDGFAYAGEALAGRY IGAKNQTALRNTVHHLFYWGLGLSLIFTILYAIGGKEFLGLLTNDTSVINASDTYFYWAL IIPLAGFSAFLWDGVFIGATATRQMLYSMLVASASFFGVYYAFHPLLGNHALWLAFLIYL SLRGVVQTFLGRQIIKKVIASQ >gi|225935335|gb|ACGA01000057.1| GENE 69 69871 - 70581 873 236 aa, chain + ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 4 234 6 236 239 263 58.0 2e-70 MAKYKRILLKLSGESLMGEKQYGIDEKRLAEYAQQIKEIHEQGAQIGIVIGGGNIFRGLS GANKGFDRVKGDQMGMLATVINSLALSSALVATGVKARVLTAVRMEPIGEFYSKWKAIEC MENGEVVIMSAGTGNPFFTTDTGSSLRGIEIEADVMLKGTRVDGIYTADPEKDPTATKFD 
DITYDEVLKRGLKVMDLTATCMCKENNLPIVVFDMDTVGNLKKVISGEEIGTLVHN >gi|225935335|gb|ACGA01000057.1| GENE 70 70687 - 72240 1246 517 aa, chain - ## HITS:1 COG:no KEGG:Acid_0712 NR:ns ## KEGG: Acid_0712 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 82 516 33 459 462 231 33.0 5e-59 MKKIFILGALLFITSIPMVSCTDDDDKDPNFMPPDIVMGGGDVESEYPEDLPVPGASVVY APSLNANMYRPISVKYSSAYPPISSWTTGNTRIIAYMDGYKPAIKTLKAYQESVNKYGSS TTLPKQAATGRFYTKKIDGRWWLVDPEGCLHLERSATSLRKGTSSRNKAAWNSRFGTDEK WLSTTQRELSEIGFHGTGAFCTGTYSLIQIHNASNPSSPLTLAPSFAFLSQFKSEKSYNY PGGSDDNAAGLVFYNGWAEWCDSYLAGSAFADYLRDPNVLGFFSDNEINFSSNSSRILDR FLAINSSNDPAYVAAKGFMDSKGVQSVTDALNNEFAGIVAEKYYKAVKEAVMKVDDKLLY LGTRLHGTPKYMEGVMRAAGKYCDVISINYYSRWSPELTTAIADWASWADKPFLVSEFYT KGVEDSDLNNQSGAGYSVPTQNERAYAYQHFTLGLLEAKNCIGWHWFKYQDDDGTDNSSK PANKGLYDNSYQLFPYLSFFARELNFNAYDLIQYFDK >gi|225935335|gb|ACGA01000057.1| GENE 71 72251 - 73663 1258 470 aa, chain - ## HITS:1 COG:no KEGG:Acid_0712 NR:ns ## KEGG: Acid_0712 # Name: not_defined # Def: hypothetical protein # Organism: S.usitatus # Pathway: not_defined # 13 463 8 459 462 362 41.0 2e-98 MKNSIVILFLLLLSQLGYGQGRTFKVTARPWVKGQKDLPWKEYDTRTIAQLDGFKATDKV HVNEYGSDWDAPKHRATGFFRVERIGNRWWMIDPDGYRHLQKVVVGVRLGTSERNKQAML DKFGTEEKWIERTAQMIHSLGFSGTGSWSNEEAIASYNASHKEVLTRSIILNLMSGYGKK RGGTYQLPGNTGYPNQCIFVFDPEFETYCDEMAQKLVANKMDKNIIGYFSDNELPFGPKN LEGYLTLQNPGDPGRVYAESWLKKQGITLQQITDEHREEFAGVVAERYYKVVSEAIRKYD PNHLYLGSRLHGKPKFIRQIVEAAGRYCDVVAINYYGAWTPSEKTMKHWGEWAQKPFIIT EFYTKGMDSGLANTTGAGFTVQTQQERGYAYQHFVLGLLESGNCVGWHWFRYQDNDPTAK GADPSNLDSNKGLIDNEYNLYKPLADAMKELNINAYRLADWFDQQSNNNQ >gi|225935335|gb|ACGA01000057.1| GENE 72 73667 - 76330 1959 887 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 56 596 40 554 570 188 27.0 4e-47 MYKVYPKIILACLLLVMTGTVYSQRVTQTINDGWKFSLFEGDASTADFDISGWTDVSIPH 
TWNAKDADDEIPGYFRGKGWYRKVVAVEELIPEQRVYLSFEGANQETNVFVNGTFVGNHK GGYSAFTFDVTDYVHAGRNLIAVSVDNSHNPDIAPLSADFTFFGGIYRDVYLVYTSPVQL STTHYASSGVYLKASKITDLQADISVKTFLSNVLKSNQSLILETEILDADGNRVALSQKK VNVKAEEKNVAFESLMAITQPKRWDVDSPYMYKVYSRLKNKKEEVLDCVVNPLGIREYRF DAEKGFFLNGKYRKLIGTSRHQDYKGMGNALRDEMHIRDVQLSKDMGSNFLRVAHYPQDP VVMQMCDKLGLLTSVEIPVVNAITQSKAFMDNCVEQVTEMVCQNYNYPSVIIWAYMNEVL LRPPFNPENRTERADYMTFLHQIASAIEAQIRSLDSERYTMLPCHSTSQIYQEAGIAELP MLLGFNLYNGWYGGSLSGFEEKLEELHREFPHKPLLITEYGADVDTRIHSFSPMRFDFSC EFGSIYHEHYLPEILKRDYIVGAMVWNLNDFYSEARRNAMPHVNNKGLVSMDRERKDGYY LYQAYLKEAPVLHIASKSWKNRAGASRDGKSCTQPLKVYTNADKVEVFLNGKSLGVYPVS DKVVSVDIPFINGENVVDAVIEKEGREYRDQYVCNFQCVNVKNGFTEVNVLLGAQRYFED RTAELCWIPEQAYEKGSWGYIGGEVAPNKTRYGSLPASDTDILGTDQDPIFQTQRVGIEA FKADVPDGVYAVYLYWTELTSENKREALVYNLGNDVVKEEYANRVFSVDINGVSVAGQMN IAEEYGSERAVIKKYIVPVSQGKGLVVRFGAVESVPILNAIRIVKEY >gi|225935335|gb|ACGA01000057.1| GENE 73 76342 - 78450 1630 702 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174232|ref|ZP_05760644.1| ## NR: gi|260174232|ref|ZP_05760644.1| hypothetical protein BacD2_20401 [Bacteroides sp. 
D2] # 1 702 3 704 704 1196 100.0 0 MKKIIYFLLILNLGLLSCVDDASVRVMPEFNCKDTKVNLAKAAGSSVTSLLYTNVGQVVA QYEAEWLSVDVSSKSVVYTALTQNDGEDARTAVVKLTCGSYTVEVTVTQDSKEPDLSLKI GQSVDEGIGMIFWVDPSDNMVGKAVSVKRQGGNPFEASVMPHSAFSTVNGYANSALFTSP SANDAVAYCQSLGDGWYLPARDELWELFDAYNGVGHADPDFVSAVPDKLTEVEKAARAAF DKMLTDLQGDVMNEAAGSGNGESYWSSTENAAGNQAYWVRFGKSGADAGNKTATNRFVRC IRTIGDYTYPEEPATLTVNPNPVTLEGANEAEANVTLTSNKTVFSVALANDSWLSYTISG TTVTFKAKSKNTTGDVRTTVATVTAGTGTAAKSVEVTVNQNVAAEGGASLELSTNAVTIT PDAVTKSEGITMISDETEFTVNITDESWVKAYVDITSKTLYFWTLSPNLNSSNRVTTATV IAGSGANAPKQEVTITQRGLLSSEFAVGQVIADNGSLKGGIVFWVDGTNRGKAKIMSLDR ENLAWSTASSPASTGLTLSNDNGLANTTALAALPNAAEMPALKYCMGKGSGWYWPTRTDL EQMFETYNGTAVADATENNPDAITDFEKANRTAWDQVITGAGGTIMNTAAASATGDSYWA SRETSSGTSAFYVRFGKPLAWDKSNAKKDGKRFVRAVRSISK >gi|225935335|gb|ACGA01000057.1| GENE 74 78482 - 80095 1416 537 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2078 NR:ns ## KEGG: Fjoh_2078 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 1 527 1 521 530 189 30.0 4e-46 MKKLIYTAFVICGMLTASCSDLLNLESKTDVTNNYLFTTPEGLNTAVTGLYSLARELPGG ADNNESNLYIVTMCDFNTDIAILRAGVSTSIGRLNTSFTPSTGDVNKFWKHHYGIIGKAN EIIVAAEALGLDDSDVLHAWSEAKFFRGRSYFELWKRFDRLYLNITPTTVDNLKREYKPA SHEELMTLITTDLDDAMKGLDWSLPQNNGNVLYGRVTKATAKHVRAQVAMWESDWDTAIE ECEDIFKQEGIYSMEKKAENVFNGADLKSPEVLWSFQYSQNLGGGGSGTPVAGHRVSIQT TTRINKISGCINTADQGGYGWGRIYPNTYLLSLYDQAKDTRYNELFVHRFKYNDPTSPKY GELIPLTQSSSYCETLHFMSKKYFDQWTMADNPDRTTGFKDLIVYRLAETYLMAAEAYMR RDGGMSTDALRCYNKTWERAGNDKFAGPLTQDILLDEYARELNFEGVRWPLLKRLGLLGE RVKAHYGETKAENPYLDKDYAECRTSFVVGKHECWPIPQEQIDLMGKENFPQNENWY >gi|225935335|gb|ACGA01000057.1| GENE 75 80110 - 83169 2953 1019 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2077 NR:ns ## KEGG: Fjoh_2077 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 16 1019 27 1008 1008 661 39.0 0 MKNIVIKWAVFLTAFLISLEVSAQNVRVSGVVTDALGPIPGANIMEEGTTNGTVTDVNGK YSISVSAKSTLVFSCIGYKEQKIRVGTKTVLNVDMVEESKMLDELVVVGYGVQRKSDVAT 
SVASVKADEMKTFPAGNVADMLRGRAAGVNVTSSSGRPGSTPSITIRGSRSISADNAPLY IIDGSPSSATEFSTLSADDIESVEILKDAASQAIYGARASDGVVLVTTKRGKAGKVEVSY NGYLGIQSLWRNFDFYSPEEYMQLRREAKAHDKGIVDAREISIAEALEDEVMQRVWASGK FIDWEKEMFRNAIYHNHDVSVRGGTEKIKVSAGANYFDQQGMVVTGSGYQKFSLRLNLDF EISKWISFGINSSYAMTKQDREDGNFNDFITSSPLAEIYDADGKYTKYINSEGNYNPLYR AQHYGREVSRDNYRLNFFMDVKPFKGFNYRLNTSVYNQTSEDGSYKDSQYPGGGGTAVLD ESRTQNWLVENIVTYKVPIRNKKHQLTLTGVQSVDHNGSKSIGYSVENLPVDKDWNFISQ GEFTGKPRRQFNENNLVSFMARAQYSLLDRYLLNVAVRRDGSSRFGKENKWGTFPSAAFA WRVNQENFLKDVSWIDNLKLRVSYGIVGNQNGIGNYTTLGLADNKGYEFGDTFQMGYLPG KELSNPNLKWEQSATANLGVDFSFFNGRLNGTVEYYNTHTKDLLVERSLNASLGYTTMLD NLGKTKSSGIDLSLNGDVIRTKEFAWSLGTNFSMYKNEIVRIDDTLDENGKPASQVAQGW IIGEPINVYYDYLVDGIFQYDDFDITRDGTGNLVYTLKNTYDSNNDGIADSPIDYGGAIE PGMVKVRDNNGDGKITADDRVPIRKDPKFTVSLSSTWNWKGFDLFMDWYGVSGRKIKNSY LYDYNSGGSLRGKLNGVKVDYWTPFNPSNEFPRPSYSADPAYLSAIAIQDASYIRLRTLQ LGYTFPARLLKNTPIHKLRLYATATNLLTFTEFKSYSPELTPGSYPESRQYVFGVNVSF >gi|225935335|gb|ACGA01000057.1| GENE 76 83301 - 87332 2867 1343 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 807 1052 53 287 318 144 36.0 2e-33 MTKANFLVIYCLFISGLCCMANVTKDINMRFKDVRQGLSHQTVNCFYQDEFGFLWIGTQD GLNRFDGRKFEVFKPDNANPYSININNIRQVCGNKEGLLFIRSLQSVTLYDMRLNRFKVL REGEVAGICYAHDALWIATGKEIYRYRDLNQVPELFFSFPSDGEDILMNSLMVRQNQTVV VGTSSKGIYCIDQAAHITRHIDVGAVNSITEDRDGSIWIATRNRGLTRLEANGELTHYRY NKVADNTINHDNVRHVTQANDSLLYIGTYAGLQTLNLSTGEFTDYEYDLNVEAADIRSII SMHYDTSGTLWLGTFYQGIQYYNVANDAYHFYRSSTAVGGHLNSYIISSIAEDRSGRIWF ASEGSGLNYYDKHTKRFFPLKHLYAQELSFKIVKSLYYEKEPDYLWVASLYQGINRINLS TGHIESITENIDTPEGKTVDRAYNLVKMIEFAGHDSLLIAAKGGLLVLDKKNLRLHHFEH PSLASRHLSQVWDMTFDKEGDLWLTTSFDLIRVNLKAGTSRSYPFSQIAQSTAQHHINHI LCDRKGRIWLGSTGSGIYLFDKKKDTFVGYGAKQGLENGFITGLVESPLDGSIYVATNGG FSKFNLATTTFENYNRQSDFPLNNVNDGGLYITSDHDIYVCGLAGIVSIAQEKLNKQSVD YNVFVKRVLVDNTEIQPLDSLGLIKETVLYEHQLVLPPRYSSVTFEIASNTLNNISNIGL 
EYKLEGFDNEYMKAGDNTMVTYTNLHPGRYTFHVRGDQLRIHDQEAPSASFELIVEAPVY QRAWFILLMILAGILIAGYIIRMFWIRKTLQHSLLAEKREKEYIENVNQSKLRFFTNVSH EFRTPLTLISSQLEMLLMHKDMIPEVYNKILEIYKNSRRMNNLVDEVIDIRKQDQGYLKL KISKENIVAVIEEICCSFYSYAQLNKIDLRFSSTLKEADLYIDKTQIEKVFYNLLSNAFK YTKPGDWISVELVAEGENDIVISVNNLGVGIEKSKIKHVFERFWQDDSATTTQTVKGSGI GLAMAKGIVELHQGTIGVESEPNGITSFIVTLHRDANVNVEASSSADERVAGHYVIEPQE ISEMVKPDKTVKILIVEDNPEMRKVLAQIFEQIYDVYTAADGQEGLEQASSLQPQMIVSD IMMPVMSGLEMCEKLKSNLQTSHIPVVLLTARNREEHTLEGLQTGADDYISKPFNIKILV ARCNNIIQTRKLLQQRFARNDEPKVEDLPFNPIDKKMLMDATAIVESYIDNADFDVATFA REMCMSRTLLFTKLKALTGQTPNDFILSLRLKKATEKLINDPNALIADIAFDYGFSNPSY FIRCFKNAYDITPAAYRRKYTNA >gi|225935335|gb|ACGA01000057.1| GENE 77 87713 - 88624 641 303 aa, chain + ## HITS:1 COG:PAB0040 KEGG:ns NR:ns ## COG: PAB0040 COG0697 # Protein_GI_number: 14520295 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pyrococcus abyssi # 7 283 23 291 295 69 26.0 7e-12 MTNKTKGFIYGAIAAASYGMNPLFALPLYAAGMNVDTVLFYRYFFATIVLGILMKMQHQS FALHKADVLPLVIMGLLFSFSSLLLFMSYNYMDAGIASTILFVYPVMVAVIMGIFFKEKI SAITVFSILLALSGIALLYQGDGNKPLSTLGIIFVLLSSLSYAIYIVGVNRSTLKNLPTT KLTFYAILFGLSVYIVRLNFCTELQIIPSAWLWADVLTLAILPTAVSLICTALAIHYIGS TSTAILGALEPVTALFFGVLLFHEKLTPRLMVGILMIITAVTLIIIGKSLIKKMGMLLQM NKK >gi|225935335|gb|ACGA01000057.1| GENE 78 88676 - 89236 829 186 aa, chain + ## HITS:1 COG:RSc1407 KEGG:ns NR:ns ## COG: RSc1407 COG0233 # Protein_GI_number: 17546126 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Ralstonia solanacearum # 10 186 9 186 186 158 50.0 4e-39 MVDVKTCLDNAQEKMDMAIMYLEEALAHIRAGKASARLLDGIRVDSYGSMVPISNVAAIT TPDARSIVIKPWDKSMFRVIEKAIIDSDLGIMPENNGEMIRIGIPPLTEERRKQLAKQCK GEGETAKVSVRNARRDGIDALKKAVKDGLAEDEQKNAEAKLQKIHDKYIKQIDDMLAEKD KEIMTV >gi|225935335|gb|ACGA01000057.1| GENE 79 89336 - 90268 918 310 aa, chain + ## HITS:1 COG:TM1717 
KEGG:ns NR:ns ## COG: TM1717 COG1162 # Protein_GI_number: 15644464 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Thermotoga maritima # 2 298 6 286 295 213 39.0 3e-55 MKGLVIKNTGSWYQVKTDDGQSIECKIKGNFRLKGIRSTNPVAVGDRVQIILNQEGTAFI SEIEDRKNYIVRRSSNLSKQSHILAANLDQCMLVVTINYPETSTIFIDRFLASAEAYRVP VKLVFNKVDAYDEDELRYLDALINLYTQIGYPCFKVSAKNGNGVEEIKKALESKITLFSG HSGVGKSTLINSILPGIETRTGEISSYHNKGMHTTTFSEMFPVEGNGYIIDTPGIKGFGT FDMEEEEIGHYFPEIFKISADCKYGNCTHRHEPGCAVRKAVEEHLISESRYTSYLNMLED KEEGKYRAAY >gi|225935335|gb|ACGA01000057.1| GENE 80 90368 - 91453 1073 361 aa, chain + ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 71 361 62 357 368 123 28.0 7e-28 MNKKTKWGIIILVGAGIIGGGIYSQLPKKNDELAAADKVMSGNQKRGKQVLNVNAKVIKP QSLTDEFTTTGVLLPDEEVDLSFETSGKIVEINFEEGTAVKKGQLLAKVNDRQLQAQLQR LISQLKLAEDRVFRQDALLKRDAVSKEAYEQVKTDLATLNADIEIIKTNIELTELRAPFD GIIGLRQVSVGTYASPTTVVAKLTKIAPLKVEFSVPERYAKQIKKGTNLNFSVEGTLDAF GAQVYAVESAIDPNLHQFTARALYPNVHHTLLPGRYASVLLKKEEIPNAIAIPTEAIVPE MGKDKVYLYKSGKAEPVDIITGIRTASEVQAIRGLHIGDTIITSGTLQLRTGLAVTLDNI D >gi|225935335|gb|ACGA01000057.1| GENE 81 91469 - 94501 2700 1010 aa, chain + ## HITS:1 COG:VC0914 KEGG:ns NR:ns ## COG: VC0914 COG0841 # Protein_GI_number: 15640930 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1001 1 1009 1036 649 37.0 0 MNISELSIRRPVLATVLTIIILLFGFIGYNYLGVREYPSVDNPIISVSCSYPGANADVIE NQITEPLEQNINGIPGIRSLSSVSQQGQSRITVEFELSVDLETAANDVRDKVSRAQRYLP RDCDPPTVSKADADAQPILMVALQSDKRSLLELSEIADLTVKEQLQTISDVSSVSIWGEK RYSMRLWLDPVKMAGYGITPIDVKNAVDNENVELPSGSIEGNTTELTIRTLGLMHTADEF NDLIIKEENNRIVRFSDIGRAELGPADIKSYMKMNGVPMVGVVVIPQPGANHIEIADAVY QRMEQMKKDLPEDVHYNYGFDNTKFIRASINEVKSTVYEAFVLVIIIIFLFLRDWRVTLV PCIVIPVSLIGAFFVMYLAGFSINVLSMLAIVLSVGLVVDDAIVMTENIYIRIEKGMAPK 
EAGIEGAKEIFFAVISTTITLVAVFFPIVFMDGMTGRLFREFSIVISGSVIISSFAALTF TPMLATKLLIKREKQSWFYAKTEPFFEGMNRLYSRSLAAFLGKRWIALPFTFITICLIGI LWNAVPAEMAPLEDRSQISINTRGAEGVTYEYIRDYTEDINQLVDSILPDAESVTARVSS GSGNVRITLHDMKDRNYTQMEVAEKISKAVQKKTMARSFVQQQSSFGGRRGSMPVQYVLQ ATNLEKLEEVLPKFMAKVYENPVFQMADVDLKFSKPEARIQINRDKSSIMGVSTKNIAQT LQYGLSGQRMGYFYMNGKQYEILGEINRQQRNKPADLKAIYVRSNSGDMIQLDNLIELES GIAPPKLYRYNRFVSATISAGLADGKTIGQGLDEMDKIAKETLDDTFRTALSGDSKEYRE SSSSLMFAFILAILLIYLILAAQFESFKDPLIIMLTVPLAIAGALVFMYFGDITMNIFSQ IGIIMLIGLVAKNGILIVEFANQKQEAGEDKMSAIKDAALQRLRPILMTSASTVLGLIPL AFATGEGCNQRIAMGTAVVGGMVVSTLLTMYIVPAIYSYISTNRIKKLQE >gi|225935335|gb|ACGA01000057.1| GENE 82 94498 - 95838 1215 446 aa, chain + ## HITS:1 COG:VC1565 KEGG:ns NR:ns ## COG: VC1565 COG1538 # Protein_GI_number: 15641573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 17 426 8 384 419 61 21.0 4e-09 MKQKGICMKRIAYITIACAFLSVSFAKAQVLTLKECLEEGLQNNYSLRIIHNEEQISKNN ATLGNAGYLPTLDFSAGYTGNLDNIETKARATGEITKNNGVYDQTVNVGLNLNWTIFDGF NISTTYKQLKELERQGETNTRIAIEDFIAALTSEYYNFIQQKIRLKNFHYAMSLSKERLR IAEASHLVGKFSGLDYQQAKVDFNADSAQYIKQQELLHSSRIQLNELMANNNVNQNIIIK DSTIDVHSDLQFDDLWNSTLATNASLLKADQNTVLAQLDYKKINSRNYPYLKLNTGYGYT FNKYDINANSRRGELGFNAGITVGFNIFDGNRRREKRNASLAFKNRRLERQDMELALRSD LSNLWQAYRNNLQLLNLERQNLITAKDNHDIAMDRYIQGDLSGFEVREAQKSLLDAEERI LSAEYNTKLCEISLLQISGKITKYLE >gi|225935335|gb|ACGA01000057.1| GENE 83 95990 - 97084 816 364 aa, chain + ## HITS:1 COG:no KEGG:BT_2254 NR:ns ## KEGG: BT_2254 # Name: not_defined # Def: putative pectate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 363 364 626 84.0 1e-178 MRKLSLFLLLLFILTNASAQKATDYRKQQNYKEWVHIAPKFDDDFFKTEEAQRIGDNVLL YQQTTGGWPKNIYMPAELTEQEYKAALKAKEDTNQSTIDNNATTTEIEYLARLYLATQKE KYKEGVLNGIQYLLKSQYENGGWPQFYPRPKGYYVQITYNDNAMVRVMNQLRSIYEKKAP YTFLPDNICEQARNAFNKGIECILKTQVRQNGELTVWCAQHDRVTLEPCKARAYELPSLS 
GQESDNIVLLLMSLPNPSAEVVKSIEGAIKWFQKSKIKGIQKEYFTNSEGKKDYRMVPCE DCPTLWARFYELETNRPFFCDRDGIKKYDISEIGHERRNGYSWYNKDGSKVLKRYEKWKK EQNK >gi|225935335|gb|ACGA01000057.1| GENE 84 97252 - 98889 495 545 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 76 531 38 482 508 195 32 2e-48 MATPPFKYQPMFEKGKDTTEYYLLTKDYVSVSEFEGNPILKIEKEGLTAMANAAFRDVSF MLRRSHNEQVAKILSDPEASENDKYVALTFLRNAEVASKGVLPFCQDTGTAIIHGEKGQQ VWTGYSDEEALSLGVYKTYTEENLRYSQNAPLNMYDEVNTKCNLPAQIDIEATEGMEYEF LCVTKGGGSANKTYLYQETKAILNPGTLVPFLVEKMKTLGTAACPPYHIAFVIGGTSAEK NLLTVKLASTHFYDNLPTTGNEYGRAFRDIELEKEVLAEAHKIGLGAQFGGKYLAHDVRI IRLPRHGASCPVGLGVSCSADRNIKCKINKDGIWIEKLDSNPGELIPVELRQAGEGDVVK IDLNRPMSEILKELTKYPVATRLSLNGTIIVGRDIAHAKLKERLDRGEDLPQYIKDHPIY YAGPAKTPEGMACGSMGPTTAGRMDSYVELFQSHGGSMVMLAKGNRSQQVTDACKKYGGF YLGSIGGPAAILAQNNIKSIECVEYPELGMEAIWKIEVENFPAFILVDDKGNDFFKQLKP WNCTK >gi|225935335|gb|ACGA01000057.1| GENE 85 98912 - 99184 81 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299147790|ref|ZP_07040853.1| ## NR: gi|299147790|ref|ZP_07040853.1| hypothetical protein HMPREF9010_03501 [Bacteroides sp. 
3_1_23] # 1 90 1 90 90 155 100.0 7e-37 MLVSVAKVVIKNEEWRMMNEELRMNNGEKQEQQKTIINYKQRAVNKLCHSLTSAVHRSLF YLQFAIYHILFIIHSSPSATPIMIRLLQSV >gi|225935335|gb|ACGA01000057.1| GENE 86 99156 - 101147 1507 663 aa, chain - ## HITS:1 COG:no KEGG:BT_2257 NR:ns ## KEGG: BT_2257 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 662 1 662 663 1311 95.0 0 MKDYRRLTEDEVLQLKSQSCLADDWGNVSVAEGFNCEYVHHTRFSGEVKLGVFEAEFTLP GGIKKHSGLRHVTLHNVSVGDNCCIENIQNYIANYEIGSDTFIENVDIILVDRLSTFGNG VEVAVLNETGGREVLMNDKLSAHQAYILALYRHRPELINRMKSIADYYSNKHASAVGSIG NHVMILNTGSIKNVRIGDYCHICGTCRLSNGSVNSNVTAPVHIGHGVICDDFIISSGSKV DDGTMLTRCFVGQSCKLGHNYSASDSLFFSNCQGENGEACAIFAGPFTVTHHKSTLLIAG MFSFMNAGSGSNQSNHMYKLGPIHQGTMERGAKTTSDSYILWPARVGAFSLVMGRHVNHA DTSNLPFSYLIEQRNTTYLVPGVNLRSVGTIRDAQKWPKRDKRKDPNRLDYINYNLLSPY TIQKMFKGRSILKELKRVSGETSEIYSYQSAKIKNSSLNNGIRFYEIAIHKFLGNSIIKR LEGINFQTNEEIRQRLKPDTEIGLGEWVDVSGLIAPKSEIDRLLDGIENGTVNRLKSINA SFAEMHENYYTYEWTWAYHKIQEFYGLDPETITAQDIIGIVKAWQQAVVGLDRMVYEDAK KEFSLSSMTGFGADGSHDEMKLDFEQVRGDFESNTFVTAVLKHIEDKTALGNELIKRIEA IES >gi|225935335|gb|ACGA01000057.1| GENE 87 101240 - 101416 263 58 aa, chain - ## HITS:1 COG:no KEGG:BT_2258 NR:ns ## KEGG: BT_2258 # Name: not_defined # Def: GTP-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 364 419 419 109 98.0 3e-23 MKTWMAKMEDNCLFISARERINIDELKNVVYQRVKELHVQKYPYNDFLYQTYEEEEEE >gi|225935335|gb|ACGA01000057.1| GENE 88 101431 - 102501 535 356 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 41 337 45 328 425 210 44 5e-53 MKEFVISEAKVETAVLVGLITQTQDERKTNEYLDELAFLAETAGAEVVKRFTQKLPTANS VTYVGKGKLEEIRQYIHDEEEAEREVGMVIFDDELSAKQIRNIEAELKVKILDRTSLILD IFAMRAQTANAKTQVELAQYKYMLPRLQRLWTHLERQGGGSGAGGGKGSVGLRGPGETQL EMDRRIILNRMSLLKERLADIDKQKATQRKNRGRMIRVALVGYTNVGKSTMMNLLSKSEV FAENKLFATLDTTVRKVIIDNLPFLLSDTVGFIRKLPTDLVESFKSTLDEVREADLLVHV VDISHPGFEEQIEVVNKTLAEIGGSGKPMILVFNKIDAYTYVEKAPDDLTPEQRKT >gi|225935335|gb|ACGA01000057.1| GENE 89 102612 - 103502 870 296 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174245|ref|ZP_05760657.1| ## NR: gi|260174245|ref|ZP_05760657.1| hypothetical protein BacD2_20476 [Bacteroides sp. D2] # 1 296 1 296 296 551 100.0 1e-155 MKLNKILLYLLIVPALSFVSCSDYEDTEVTSPQADENALGANFTAATTSVVVHPDENKYQ LTLNRLNTKEAVSVPVTVKSCSEIDGVKFTEVPTAFLFAKGESKATVELKLNDKCKFQEV YKLTLSLGEGKDHPYAAGTSSTVVSVSKDYDWVEIDHPVVVEAKWYDGGILAPLEFASDY EGEDGNQLFRIKALYSAAGTASTATGHLQFLLDENYDVVSMLSVGDAYNPEKINTGVVDK TTKAPYYMNVKSAEKTSEGAYVFTYDVFYYENNVAKNKVEGVTATLDYDIAGAMEE >gi|225935335|gb|ACGA01000057.1| GENE 90 103528 - 105132 1369 534 aa, chain - ## HITS:1 COG:no KEGG:BVU_1855 NR:ns ## KEGG: BVU_1855 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 529 3 527 528 347 39.0 6e-94 MKTKYIFASLLLSSLAFTGCEDMDTLPEGNIVTSNQKDEIGKIDPTKVEARVNAIFAQFS LMTPNKGALQLERHNDFGYPSVMLFTDANGNDVVTDDNGYNWMGDELSYTDRSQTSLSCQ IIWNDMYAIIYAANNVAKEIDPATSDATEQYFLSQGLGARAFSYWVLAQLFQFNYKGNES KPCVPLITDKNMEAAADNGCPRSTVQDVYTQITTDLTAAVDLLKKAESAGEERKDKRYIS LAVAYGLQARVYLSMHEYAQAATAAGNAIEAAANEGISPAGMTDVNKPAFWTVSEKNWMW GIIVNETDEIVSSGIVNWPSHMGSLNFGYANFSGGLQISKSLYNQIPNTDVRKGWWLGGD LKSKNLSASQQAMVTGYGYKAYTQVKFAPYNNVLETSINANDIPLMRVEEMYLIKAEAEA MSGQDGKKTLEDFIKKYRNESYVVSNSDIQEEVFLQRRIELWGEGLNWFDVMRLNKGVDR RGAGFPNDNMIFNIAPTDPILLWPIPQAEIQANQALSAADQNPTSTTPKPVEDK >gi|225935335|gb|ACGA01000057.1| GENE 91 105158 - 108115 2772 985 aa, chain - ## HITS:1 COG:no KEGG:BVU_1854 NR:ns ## KEGG: BVU_1854 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined 
# 1 985 23 1005 1005 957 52.0 0 MQTQEVAIKPKLKIVLKSDTEVLDEVIVTAYGTSTKGTFTGSASVMKADKIEKRQVSNVS NALAGAVAGVQILSDNGQPGESAKVRIRGVGSINAGMEPLYVVDGVPYDGDLSSINSADI ETMTVLKDAASTALYGARGANGIIMITTKRGTSGKARINFDAKWGANSRAIKTYDVMTSP KNYIETAYQSIYNSQISLGYSPEDANIRANKILPSEASGGLGYQVYTTAPGELLVGSNGK LNPNATLGYSDGQYYYTPDNWADETFQNNLRQEYNLSASGGSDKGTYYFAFGYLDDQGVI SGSGFKRLNGRFKGDYKLYSWLKIGANVSYVNTESRYPGDQDANATASSGNAFYIANNMA PIYPLYVRGADKQILLNNGRKVYDYGDGQSTNFSRSFMSIANPSGDLIYNKREYLSDVIN ANWFAEITPITGLTISARYGLNIDNTRQNEMGNAYMGQSASYGGTAYQAAIRTYGFDQQY VANYQFALKDVHHFDVTAGYDGYSYEYTLLEGSGQNLYNPESFYLGNVIDKFTIGGKKDL YSTKGFFGRVNYSYNDTYFGNVSYRRDASSRFAPENRWGNFWSASVAWMLTRESFMEDIT WVNMLKFKASFGQQGNDDILYPGVLLEKNYYPWLDQYKMTGANGTFADGTLLYKGNPDIT WETSTSYNIGADFGLFNNKLNGSIEYFGRKSKDMLYNRPTAGSLGYTAVPMNVGSMTNSG VEIDLNYQIMANKKFNWSVNLNATFVKNKINKLHPDLNGKLIDSDRIYEEGESMYRMYLV DYAGVDEKTGEALYWAKDDNGNAIKTSEYSTAENYKVATDDMMPTVYGGLGTTFEAYGFD ASIQLSYQLGGKIYDTGYRRLMHGGASSSAGWNWHKDIYNAWTPENPTSNIPRLNASDKY TNSASTRWLTSSDYLSINNITVGYTLPQQLVKKVMPDKVRVYFTADNVALISARKGLDPR QSYISATTALYTPIRTISGGISLTF >gi|225935335|gb|ACGA01000057.1| GENE 92 109022 - 110194 703 390 aa, chain - ## HITS:1 COG:no KEGG:BT_2267 NR:ns ## KEGG: BT_2267 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 389 1 389 396 614 82.0 1e-174 MEKISYNLVFNRKKRLNKRGMALVQVEAYLNRKKMYFSTKIYLKPEQWDAKRKMVKNHPN ANVLNRMLYENIAAIEQTELGLWQQGKSIWLDLLKNSIDKPLSNGRSFLTFFKEEIANSS LKESTRQNHLSTLELLQEFKKEVLFTDLTFEFVSSFDNYLQSKGYHLNTIAKHMKHLKRY INVAINKEYMDIQKYAFRKYKIKSIEGSHTHLAPEELHKFENLQLTGRYTRLQKTKDAFL FCCYAGLRYSDFTNLTSANIVEFHQETWIIYKSVKTGMEVRLPLYLLFEGKGIQILQRYK DDLNSFFKLKDNSNINKELNILAGLAKIDKRVSFHTARHTNATLLLYNGANITTVQKLLG HKSVKTTQVYANIMDITVVRDLEKTASSKK >gi|225935335|gb|ACGA01000057.1| GENE 93 110644 - 113679 2265 1011 aa, chain + ## HITS:1 COG:no KEGG:BVU_1841 NR:ns ## KEGG: BVU_1841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 1011 1 1002 1002 840 46.0 0 
MKRKLMLLLACLFVGIGLVTAQTQKVTGVVISEEDGQPVVGASVLVKGTTLGTITDVDGN FNLSNVPSSAKTLQISYIGMQTQEVAIKLNLRVVLKSDAQLIDEVVVTGYGVTKKAAFTG SAQTVDNKDLMKKTDANFMKSLEGSVAGLQVNSLTGQPGAYASTTIRGVGSMNSGTEPLY VIDGIAIYTDKMGAYNKAGTGDMAASPMANINPNDIESITVLKDATATAIYGARAANGVI VVTTKKGSLGKARFNVNTKVGTSFVGKIDHNYRTTNLDKYKEIWTEGLTNAGYDGEDGMT SAEWLNAYVMDWYNVDLTQNIKSVDWLKEILQNGFSQEYDISAQGGNENLRYYISGGYFQ NTGIVIGTGMKRYSGRISLDGKSGRIGYGLSVNGALSDINNTMTESQYTNPIVAVYDIRP FEQVRNEDGTYNTNVNYNPVAINDKEKGDKRNQKQITLVVNPYFTYNIIDGLTWKTNAGL SLIDLDEFFYSSIYNPQYSGSGMLGQRNQERATTLTITNTVNFNRTFKDMHHLNVLLGQE AQKISYRTIYAAASGYPSDAVFELDNASKPTGAGSSTKASTLSSFFMNAEYNLNDKYYLS GSFRYDGSSRFGVNNKWAPFWSIGAKYRISNESFMENTKNWLTDLTIRGSYGTVGNQDIG YYAAMGLYNYGYSYNGKPGAIPYQIANPDLKWETVAKADIGIHAVFFERLTLEVDYYNQR TKDMIFDVPLSYTTGFGSILKNVGEMENKGIEFLINANVLRNKDFSWDVRFTGTANKNKI IKLATEKPIENTLTIRKVGEAYNTFYMPEYAGVDPETGEAMWYKGQEGDEKTKNVNEAGQ RIVGSADPKFYGGFGMNFKYKGFDFSFDTSFTLGNKVYNSGFAFDMQVGHYFLGPVSNYV YDNRWQKPGDITDVPKFVAGDNSGAETNSSRFLMNGSYLRMKSMVLGYTLPKNLMNKVAI DNLRVYISADNLFTITAKDFIGFDPQTRSTGMQSWAYPVPRNIMFGLNLSF >gi|225935335|gb|ACGA01000057.1| GENE 94 113700 - 115112 1049 470 aa, chain + ## HITS:1 COG:no KEGG:PRU_1887 NR:ns ## KEGG: PRU_1887 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 10 470 5 491 491 256 35.0 1e-66 MTMKKILFALLTGWMAISFTACDDFLTETPSTSVPTEEALVTPQDFQNALNGVYYTLGSY RFLGRDVLAIGDAPTDITSHSVATSHFYDIFRYQILDTNSYIEEIWAYGYWAIDRCARII KEGASFKEVTKGDLSEVEGCIAQAYAIKALCSFYLTNMFGLPYSEANKTTLGIVNVTEPV PAYSQVSRVSVEQNYSLILDDITKAKEYYAKEGVENVGKQYMNTAAVAALEARVKLYMKN YEGAITAAQEAITLSGGTVVSTKEDYKNMYTTLAVSTEDIFFIAKAEDDYLSANALNTLW NKYGLSINSATVAEYSPNDIRLALLGGTWTGGKMRGITTNNQISNLPVFRLPEMYLIQAE AYAKQSTPNYGKAKEKLLEVAAKRNPDLDDNEIAEDQTVIDAIDKERKLELVQEGHRFFD ARRLGKIISVADGSYTNFNIAKFVYPIPSFEVNSGFGVIQTEGWSNNLPR >gi|225935335|gb|ACGA01000057.1| GENE 95 115583 - 118555 2360 990 aa, chain + ## HITS:1 COG:RSp0416 KEGG:ns NR:ns ## COG: RSp0416 COG1629 # Protein_GI_number: 17548637 # Func_class: P Inorganic ion transport and metabolism # 
Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Ralstonia solanacearum # 85 218 168 284 812 62 32.0 6e-09 MQISYIGMQTQEVVIKPNLRVVLKADAQKLDEVVVTAMGISREKKALGYAVQDVKSDALT RAANTDLAGALQGKVSGIDITPSSGMPGASSKITIRGSRSFTGDNTPLYVIDGMPIASTA DVSTSLTDGAYGTDYANRAVDIDPNDIESINILKGQAASALYGMRASNGVIVITTKSGKG ADKGKPTITFSTNLSFDKISTLPELQQEYAQGSGGTFDPSSPFAWGPKISELANDPTYGG NTDNSYTSQYGKQSGKYYVPQLAAAGMNPWATPQAYNNMKDFFETGVSWSNNVNVAQRFD KGNYSFSLGNTTSNGIVPSTGMDRYNVKMSAEAQLHPNWTTGFNGNFVTSKISKQSTANT SVVATIYNAPVSYNMAGIPSHIEGDPYTQNTYRDSWIDDAYWAVENNQFSERSQRFFGNA FVKYTTKFGTDNHKLDIKYQIGDDAYTTNYSEIYGYGSTWAPTGEDSEYHYTVNELNSLL TAAYTWNINEEWTLDALIGNEFVDKKTKYEYAYSMNFNFPGWNHLNNASVFSNESLYNKK RTVGNFANLSVAWKNMLYLSGSIRNDIVSSMPRDNRSFTYPSVSLGFIFTELAPLKNNIL TFGKIRASYAEVGMAGDYTQSYYYTPSYGGGFYMGNPIVYPINGAMAYIPYYKVYDPNLK PQNTKSYELGADLTFLNGLVTLNYTYSRQNVKDQIFEVPLAGSTGASSMIMNGGKIHTNT HELTLGISPVDTKNFKLDFAFNFSKIDNYVDELAPGVESIMLGGFVTPQVRAGIGDKFPV IYGKSYMRNDEGKIVVDQNGLPMQGEDAVIGTVSPDFRLGFNTNIELYKFRISAVFDWKQ GGQMYSGTAGEMNYYGVSKLSGDMRKNEFIVENAVKETGKDADGNSIYAPNDIKVTDAQA YFTRRRSIDESYIYDNSYIKLRELSVSYPVFSKKWLNVNVNVFARNILVWSEMKGFDPEA SQGNDNMGGAFERFSLPGTASYGFGFNVKF >gi|225935335|gb|ACGA01000057.1| GENE 96 118572 - 120125 1085 517 aa, chain + ## HITS:1 COG:no KEGG:BT_2033 NR:ns ## KEGG: BT_2033 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 517 1 520 520 705 70.0 0 MKTYNIIKGIFAATVLSVVFSGCSEDAMDRINKDHGHTQSVAARFILTDVITSTAFSNAG GDFNTYLSAYIEYEVGVDNQLYYAETRESEPTSSSTFNNTWNGLYSTLKSARIIINQCSE GGIDHGNYVTKGMAEVLAAYNCALIADMFGDAPCSQAALIDENGSPVYLTPKMDKQEEIY TQAMQYLDDAIADLQKEDLIDPEAQDLLYQGDAEKWLKLAYGLKARYTMHLLKRSANKDT DMEKVLEYVNNSFESAEEQAAFDVYDANNINPLYGFFVARAGLGASESMRSKLAEYNDPR LSRAFITKQGTSGKEQAPGTADNAVYAPNGTPEQGTSKYGTSLFMYSISAPTLLLSYHEL KFLEAEALCRLERDAKSALKDAVVAGLLNAENSFTISRKELGKTLINAASSITEAEAESY FDNTIEAKYTAEPLKTTMTQKYFALWGASGEATESYNDVRRMKGLGENFIELKNPNSFPL RCPYGNSDTTTNAEVKAAYGNGQYVYSENVWWAGGSR >gi|225935335|gb|ACGA01000057.1| GENE 97 120490 - 121077 302 195 aa, 
chain + ## HITS:1 COG:CC2751 KEGG:ns NR:ns ## COG: CC2751 COG1595 # Protein_GI_number: 16126983 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 9 180 9 180 186 63 25.0 3e-10 MDHFLEKIDTHTLSLLQQGDEKAFDTIFWKYNPRVFHFIHSLLYDKILAEDLTQNVFLKI WERHQDIKPEEGFEAYLFTIARNMVYKETEKRLLSERFLDSIKQTDADKHFEIDVDTDSL QEYIDELVEQLPSSRKKIYLMSKKQHLTNKEIAAQLSLSEKTVETQLYRALHFLRSKLAN EIVALALFFLSSNFK >gi|225935335|gb|ACGA01000057.1| GENE 98 121152 - 122159 552 335 aa, chain + ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 101 331 42 265 274 89 29.0 6e-18 MEKNTNILLLECFMRGETSPDEEQILSDWLHSQEAKEQLSNYYQETWKGSNNKLIAEVQG RMFERVKSQMHNAVQELNRRKQKRTMRIRRLFQYAAVILLIITAGIGGHLYTVSQTEPTV TDKSYLVQTGKGQRANITLPDGTVVWLNSYTQLHYNANYGATQRVVSLIGEAYFEVAKDK EKPFIVETAGMNVEALGTTFNVKAYREDSQIIATLFSGSVRVSSDRDNVILSPDENATFE RRSGKLAIHKPDNSSYAKMWRNNDLVFNGETLEEIAVLLNRMYNVQIAFKSERIKQYCFS GVICNNSLDNVIELISLTSPITYETRGDTIILGNR >gi|225935335|gb|ACGA01000057.1| GENE 99 122470 - 125769 2400 1099 aa, chain + ## HITS:1 COG:no KEGG:BF0340 NR:ns ## KEGG: BF0340 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 57 1099 29 1067 1067 1231 58.0 0 MGQITLTMKDKPLKKVIKQIEKVSEYRFFYNEELANLNKPVSLEAQNSSIEKVLKDLSSQ APFSYLIKPNFQIVLSDDLPSQQTKLQSITGRVVDTADNPIIGANVLVKGTTNGVITDID GNFSLEKVPVQSQLQVSYIGYETKELAVVSGKTNVKIIMAENTKLLNEVVVTGFGLAQKK ATLTGAIASVGADDISRSSAVNTSGALVGKVAGLNYRSADSRPGNATTLQIRNMGTPLFV IDGVQSDEGQFNNIDFNDIESISILKDASASIYGVRAANGVIVVTTKKGKKNTKSTVTLN TYYGWQHVSRLPEPADAVTYVENYIQSETIQGKTSRLYSKEDLEKWRQGTEKGYVPFNWY DYIWVTSPQYYVNANVSGGSDKINYYFSVGHMNQESAIRNYGGMKRYNVQMNVEAQVTDK FKIGMNMNGRIKQLKNPGVPGTDDYSNPTAATYHNLPTVRPFANDNPNYPASVGSDNGLN 
FGLLNFEKSGTFEDTWRMAQLNLNAEYEIIKGLKAKALFGYYFASELLNNQEYTYRVYRY DEATDEYVDVGGRASAWRERTNAYVEELTSNIQLAYEKAFGAHSINAVIGFEAIKRDTPK SWLHAIPASNSLHLIYYDTIDKYEDTGNNTEARLGWLGRFNYNYANKYLIDFSARYDGSW KFPPNHRWGFFPSASLGWRISEENFWKESKIASIFSDLKIRGSYGLVGDDNVDGYSAFDY MTGYDYKQGGGVIDGSYIIGTTPRNLPVTTLSWAKAKILDIGLDVAFLNNRLSGSVDFFR RIKTGIPASRDDVLLPEEVGFERPKENLNSNVHTGYDLLARWSDKVNDFHYSISANFTYS RFYNWEQYNPKFSNSWDEYRNSTWHRFGNVNWAYEADGQFQSWEEIASWPIDNDQKGNTT LRPGDIKYVDQNGDGIINDMDKRPIGYKEDSTPVLNFGFNLAFAWKGFDLALDFSGAGMS TWNPSLIQKIPFNNNGNNPAYYMEDTWRLSDIFDANSEMIPGKYPTLLIGNNANNHSNYW DSSFWKYNVRYLKLRNLEFGYTFPKEWLQKCRINSLRLYLAGQNLFTLTNVPIDPEVSAG NGTAYPSMRIINLGLTLKF >gi|225935335|gb|ACGA01000057.1| GENE 100 125775 - 127616 1632 613 aa, chain + ## HITS:1 COG:no KEGG:BF0341 NR:ns ## KEGG: BF0341 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 14 613 14 617 617 522 48.0 1e-146 MKYIKGYLLLGLTLILGSCSDFLDRDPDQILTNDQIFSDAKMINSVLAGFYGDTENWGQS FATPASFTKVDDGCVTDGARDNMQEYSDNQWRVYPYKYIRNLNQFLAGLRATTVLESTEK LRYEGEIRFLRAWAYFSMCRGLGGVPIVNDNVYEYEAGMDVSGLAVPRSKESEVYDYIIN ECATIADYLPTKPTTNAARATRWAALMLKARAAVYAGSLANYNNKMVAPIRTEGNEVGIP AEMATPYYETALEAAKEVITQASAYYNLDITVRNDLGQNFYNAVCVKENNHEVIWAKDYA YPGATHGFTQKNIPASLAEDQERCYSGAVLNLVEAFEYKDNRDGTIKVKDESGNYIMYSR PEDAFANKDARLWGTVLYPGAKFRGSEVVLQAGQKVPDGNGNWKEIIGKADSYDDNNRLI TSINGPMESNEIRINKTGFFFRKFLDETIGASTNSKMSTMWYPRFRISEAYMIACEAAYE LNKTGEDNPLNYINPVRTRAGISELESITFEDIVREYRVEFALEDHRYWDLKRWRLAHEI WDGNSENPDAQLYELFPYQVNDPGSANDKMWVFDKKKAYMVPYPRYFQMKNYYNFFDNSW LNKNPLLVKNPYQ >gi|225935335|gb|ACGA01000057.1| GENE 101 127636 - 128310 541 224 aa, chain + ## HITS:1 COG:no KEGG:BF0290 NR:ns ## KEGG: BF0290 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 224 1 235 235 167 43.0 4e-40 MKKIRTIFSLLSLSLLAISCGLDNYDEPESFMEGKITYEGKQLGLKGTNGGIQLQLYQDG YANHDPITVYATQDGTFSAVLFDGPYKLVTKDKNGPWVNNRDTIYVEVKGKTQCEVKVTP YFTISDENITLDNNIVSGTCNIQQIVQDAKISQAMLLVSKTTFVDENTNIARQNLSNINP 
GVTNISLDITSNQNVQSAKALFARIGVKANGADQAVYSEIFRLK >gi|225935335|gb|ACGA01000057.1| GENE 102 128384 - 129961 1360 525 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 24 516 2 522 536 288 34.0 2e-77 MTNKNMLLMGTASLLSLYTAAQNAPKPNIVLIMVDDMGYSDLGCYGGEIPTPNLDNLAKK GVRFTHFCNTGRSCPSRASLLTGLYPQQAGIGMMSEDPHSAEDHGVHGYMGYLNRNSVTI AEVLKEAGYHTYMSGKWHVGMHGQEKWPLQRGFDHFYGILSGACSYLQPHGDRNLTLDNQ QLPPPEQPYYTTDAFTDYAIKFVDERPKDDNPFFLYLAYNAPHWPLHAKQEDIDKFVGKY RKGWQKVREARLKRMIKMGLVDKKWDLAQWEGRQWSELTEDEQIELDRRMSVYAAQVHCM DYNVGRLIKYLEEKGELNNTLLIFLSDNGACAEPHREKGFGTIDMINNPDEWVNPSYGLP WAQVSNTPYRKYKVRAYEGGTSTPLIISWPDKYAKYNGEIRHNRAFLPDIMATFIDAAET TYPKTYHGGNEIIPLEGASMLPILTNTDTELHEYIFGEHFDNCVVWWKNWKAVKDQNSKV WELFNLEKDRTERVDLSRENPKILKQLVDEWNKWANTHYVYPKGK >gi|225935335|gb|ACGA01000057.1| GENE 103 130096 - 130746 680 216 aa, chain - ## HITS:1 COG:BS_yvbG KEGG:ns NR:ns ## COG: BS_yvbG COG2095 # Protein_GI_number: 16080438 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Bacillus subtilis # 5 203 2 203 211 127 38.0 1e-29 MDSTLLPFALLCFTSFFTLTNPLGTMPVFLTMTHGMTDKERQSIVRRATIVSFITIMVFV FAGQFLFKFFGISTNGFRIAGGVIIFKIGFDMLQARYTPMKLKDEEIKTYADDISITPLG IPMLCGPGAIANAIVLMQDAHSYEMKGILIGTIALIYLLTFFILRASTKLVNVLGETGNN VMMRLMGLILMVIAVECFVSGLKPILVDIVREGMIP >gi|225935335|gb|ACGA01000057.1| GENE 104 130808 - 131500 581 230 aa, chain - ## HITS:1 COG:YPO2295 KEGG:ns NR:ns ## COG: YPO2295 COG1011 # Protein_GI_number: 16122519 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Yersinia pestis # 1 230 1 222 224 116 31.0 4e-26 MKYKNLFFDLDDTIWAFSQNARDTFEEVYQKYSFDRYFDSFDHYYTLYQQRNTELWIEYG EGKITKDELNRQRFFYPLQAVGVEDEALAEQFSKDFFAIIPTKSTLMPHAKEVLEYLAPK 
YNLYILSNGFRELQSCKMRSAGVDGYFKKVILSEDLGVLKPWSAIFNFALSATQSELRES LMIGDSWEADITGAHGVGMHQAFYNVTERTTFPFLPTYHIHSLKELMDLL >gi|225935335|gb|ACGA01000057.1| GENE 105 131597 - 132454 1004 285 aa, chain - ## HITS:1 COG:no KEGG:BT_2272 NR:ns ## KEGG: BT_2272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 284 284 423 95.0 1e-117 MKKLAVLFVCVAMLASCDSFKGGSKDLKAENDSLLMELNQRNAELDDMMGTFNEVQEGFR KINAAESRVDLQRGTITENSASAKQQIASDIEFISKQMEENKAQIAKLQAQLKNSNYNSA QMKKAVAALTAELNAKQQRIEELQTELASKNIRIQELDAAVSDLSAAKETLAAENEAKAK TVAEQEKSLNAAWFVFGTKSELKAQKILQSGDVLKSADFNKDYFTQIDIRTTKEIKLYSK RAELLTTHPSGSYELVKDDKGQLTLKITNPTEFWSVSRYLVIQVK >gi|225935335|gb|ACGA01000057.1| GENE 106 132488 - 133162 545 224 aa, chain - ## HITS:1 COG:all4680 KEGG:ns NR:ns ## COG: all4680 COG0313 # Protein_GI_number: 17232172 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Nostoc sp. PCC 7120 # 2 221 8 228 285 229 49.0 4e-60 MGKLYVVPTPVGNLEDMTFRAIKVLKEVDLILAEDTRTSGILLKHFEIKNAMQSHHKFNE HKTVESVVNRIKAGETVALISDAGTPGISDPGFLVVRECVRNGIEVQCLPGATAFVPALV ASGLPNEKFCFEGFLPQKKGRQTRLKTLAEEHRTMVFYESPHRLLKTLTQFAEYFGAERQ ATVSREISKLHEETVRGSLAELIEHFTATEPRGEIVIVLAGIDD >gi|225935335|gb|ACGA01000057.1| GENE 107 133175 - 133918 621 247 aa, chain - ## HITS:1 COG:no KEGG:BT_2274 NR:ns ## KEGG: BT_2274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 247 3 241 241 163 59.0 6e-39 MKQKLLTDIELDVHELKYLMDSFTKEPTPTLSELLKRSITRMQGRLDELQQEVDAVQVIS PSVVEETVEEEDEVAGSEEDSPVIIQSLESVVVKEDEEGEAIVAEEKPVVIAAAAETAAA AEEEEEEEEEEEEEEEEGKKEESAIVEEPVVETVVKEEEPKSAVLGESLKLSAGLRHAIS LNDSFRFSRELFGGNTDLMNRVIEQISVMSSYKTAVAFLSSKVELNEEKEAVNNFLELLK KYFNQSA >gi|225935335|gb|ACGA01000057.1| GENE 108 134059 - 134658 586 199 aa, chain + ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: 
Bacillus halodurans # 9 188 1 187 204 181 47.0 6e-46 MVLFSEDHIQETRRRGRIEVICGSMFSGKTEELIRRMKRAKFAKQRVEIFKPAIDTRYSE EDVVSHDSHSIASTPIDSSASILLFTSEIDVVGIDEAQFFDSGLIDVCNQLANNGIRVII AGLDMDFKGNPFGPMPQLCAIADEVSKVHAICVKCGQLASFSHRTVKNEKQVLLGETAEY EPLCRECYLRARGEDEQKV >gi|225935335|gb|ACGA01000057.1| GENE 109 134680 - 135813 941 377 aa, chain + ## HITS:1 COG:RSc2624 KEGG:ns NR:ns ## COG: RSc2624 COG0628 # Protein_GI_number: 17547343 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Ralstonia solanacearum # 37 345 36 338 356 101 25.0 2e-21 MERKKITFDSFIRGSIGCVLVVGILMLVERLSGVLLPFFIAWLIAYMVYPLVKFFQYKLR LKSRIISIFCSLFLITVIGVTLFYLLVPPMISEIGRMNDLLVTYLTNGAGNNVPKNLSEF IHENIDLQALNRVLSEENILAAIKDTVPRVWTLLAESLNILFSILASFIILLYVIFILLD YEAIAEGWLHLLPNKYRTFASNLVHDVQDGMNRYFRGQALVAFCVGILFSIGFLIIDFPM AIALGLFIGALNMVPYLQIIGFLPTILLAILKAADTGQNFWIIIACALAVFAIVQIIQDT FLVPKIMGKITGLNPAIILLSLSIWGSLMGMLGMIIALPLTTLMLSYYQRFIINKEKIKY DEVETTDNQETSHNEEK >gi|225935335|gb|ACGA01000057.1| GENE 110 136223 - 137443 834 406 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 124 385 13 281 297 67 26.0 5e-11 MARKSFSVLFFIKKGKLLKNGEAPVCMRITVNGCMVDISIKRSCHVNLWNQAKENSKGKD RMSVELNHYLEITRSHVHQIYRELETSGKVITVDLVRKLFYGVDEDNKTLLQVFREHNEQ SRKLIGKDFVSKTVQRYETTTRYLEEFIKKEYQLSDIALNNLEANFISKFDAFLKIEKGC AQNSAITRLKNLKKIIRIALENDWIKKDPFAYYRFKLEETDPEFLTMDEIKIILAKEFTI KRVEQVRDVFVFCIFTGLAFSDVKDLSPEHLVKDNKGELWIRKNRQKTKIMCNIPVLPVA ASILEKYKNVAECTGKLLPVLSNQRMNSYLKEIADVCGIHKNLSTHTARHSYATSICLAN GVSMENVAKMLGHADTSITKHYARVLDQNIFKDMQKVNSCLSELAI >gi|225935335|gb|ACGA01000057.1| GENE 111 137866 - 138015 86 49 aa, chain - ## HITS:1 COG:no KEGG:BF0655 NR:ns ## KEGG: BF0655 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 42 1 42 97 79 95.0 6e-14 MELINKNTPQVKEFISSLDSMLNGIESIVKHYKPHLNGERFLIMKFPRN 
>gi|225935335|gb|ACGA01000057.1| GENE 112 138349 - 138663 171 104 aa, chain - ## HITS:1 COG:no KEGG:BF0653 NR:ns ## KEGG: BF0653 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 104 1 104 104 170 91.0 2e-41 MEVVTIEKRTFSYVCERFIEFAKRIESLCSTHTQKVENWLDSQEVCLLLGFSKRILQYYR SSGRLAYSQIGNKIYYKSSDIERIIADSETQNQSPKQIMPYEKN >gi|225935335|gb|ACGA01000057.1| GENE 113 138860 - 139702 164 280 aa, chain - ## HITS:1 COG:AGl645 KEGG:ns NR:ns ## COG: AGl645 COG2207 # Protein_GI_number: 15890441 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 274 50 301 327 68 20.0 1e-11 MKKENLHQPFEISFNEHNESLLTEHEHTFFELVYILTGTGIQWINNNMFPYHDGHLFLIT PNDSHSFEIHTTTKFINIKFNDIYIHSAIFGSENIQRLEFILQHANHQPGCILRNKVDKL LVKPMIEAIIREYVNRNLYSKEIITQLINTIIIVVARNIAMFLPQQVDECSDDKSLDILQ YIQANIYQTGKIRAKEISQHFGISESYLGRYIKKHTNETMQQYILSYKLKLVESRLLHSQ MRINEIAEELGFTDESHLSKFFRKNKGCSPSSFRKSNKAI >gi|225935335|gb|ACGA01000057.1| GENE 114 139812 - 140321 228 169 aa, chain + ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 166 1 167 172 125 38.0 3e-29 MNFLELSKQRYSARNYSSDMIEQEKLDYILECARFAPSAVNYQSWHFFVVKSNKPKLLIQ QSYPREWFTEAPLYIVVCADNSISWVRKSDNKNHADIDAAIATEHICLAAAEQGLGSCWV CNFDPDMLKDNLHLSPNMYPVAIISLGYVKQPPEKSSKRKDITEIVSIL >gi|225935335|gb|ACGA01000057.1| GENE 115 140335 - 140940 252 201 aa, chain + ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. 
PCC 7120 # 2 195 9 191 192 178 45.0 6e-45 MKTEMEKCLAGEWYDCHNQVFLDLKTRTHKLLMKYNSLPYEDKEEKHSLLKKMLGSIGMK VSVASPFICDYGCNIHIGDNVTVNTGCTFVDCNKITIGSNVLIAPNVQLYTATHPIDLDE RLAPVETEDGIKRVRRTYALPITIEDGCWIGGGVIILPGITIGYGSVIGSGSVVTKDIPA NSLAVGNPCKVIRQINQTKKL >gi|225935335|gb|ACGA01000057.1| GENE 116 140937 - 141587 309 216 aa, chain + ## HITS:1 COG:MT2292 KEGG:ns NR:ns ## COG: MT2292 COG0546 # Protein_GI_number: 15841725 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Mycobacterium tuberculosis CDC1551 # 3 180 80 260 291 68 24.0 1e-11 MIKLVAFDLDGTIADTIPMCIEAFKQAVSPYASKILSKDDIVKTFGLNEEGMIKEVVSED WEKALVDFYVIYKEMHTVYIQPFVGIRELITELKENSIIVSLITGKGQRSCNITLHHFEM EALFDSISIGSPDRNTKAESMAELMSKYNIQPAEMVYVGDEFSDITACSKVGIQCLSAAW IASLSKMKKLEKENQGYVFSSIKSLHSFLKKEAHLV >gi|225935335|gb|ACGA01000057.1| GENE 117 141702 - 141935 213 77 aa, chain + ## HITS:1 COG:no KEGG:BF0648 NR:ns ## KEGG: BF0648 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 77 17 93 93 131 96.0 9e-30 MLMNVLLNMRMFRLFSETSQEVTNSEMQEAYGEFVEQIRTISNGNDYSTTYRILAATRIE IALLETIPLYGQGEKCA >gi|225935335|gb|ACGA01000057.1| GENE 118 142090 - 142323 91 77 aa, chain + ## HITS:1 COG:no KEGG:BF0647 NR:ns ## KEGG: BF0647 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 77 67 143 143 140 98.0 2e-32 MELISGLFLSQRVVTHAGTKSPLTEIGRAFEHLFNIKLGDVHKKHESVIKRKPSKVTEFL DTLRKAIAEESKKKGYL >gi|225935335|gb|ACGA01000057.1| GENE 119 142453 - 142785 213 110 aa, chain + ## HITS:1 COG:no KEGG:BF0646 NR:ns ## KEGG: BF0646 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 110 1 110 110 182 98.0 2e-45 MYIDNENFEKWMEKLSKKLTEIGQDLKSLINTDTVLDDNEKILDNQDLAFLLKVSFRTLQ RYRVNNQLPYFTIGRKIYYRAVDIRAFVRERADSQTYKRFEKDNQLTNQP >gi|225935335|gb|ACGA01000057.1| GENE 120 143069 - 143653 449 194 aa, chain + ## HITS:1 COG:no KEGG:BF0644 NR:ns ## KEGG: BF0644 # Name: not_defined # Def: clindamycin 
resistance transfer factor BtgA # Organism: B.fragilis # Pathway: not_defined # 1 194 1 194 194 318 94.0 8e-86 MPNNSRKTIFTTISIDKETAVLVEKICKRYSLKKSEVAKLAFGYIDKAHINPSEAPESVK SELAKINKRQDDIIRFIRHYEEEQLNPMIRATNSIALRFDAIGKTLETLILSQLEASKER QTAVLKKLSEQFGNHADVINNQSKQINALYQIHQRDYKKLLHLIQLYSELSACGVMDSKR KENLKAEIINLVNT >gi|225935335|gb|ACGA01000057.1| GENE 121 143658 - 144578 427 306 aa, chain + ## HITS:1 COG:no KEGG:BDI_1256 NR:ns ## KEGG: BDI_1256 # Name: not_defined # Def: clindamycin resistance transfer factor BtgB # Organism: P.distasonis # Pathway: not_defined # 1 294 1 294 306 429 82.0 1e-119 MHIDFAPPSKGTYNNAGSSWQLANYLEHEDLERMEKGIYTEGFFNLTDDNIYKSMVIKDI DSNIGQLLKTDAKFYAIHVSPSEKELRAMGNTEQEKVEAMKRYIREVFIPEYAKNFNKEL SASDIKFFGKIHFDRNRSDNRLNMHCHLIVSRKDQANKQKLSPLTNHKNTKRGIIKGGFD RINLFQQAEQGFDKLFGYDRQQAESFDYHNTMKNSSIFEQLAIQEQQFTGEKKSEVQQSG EKVNKISCNLDSKEDNKHSYIQQNNSSDDSLLSIFSLGDGNNYNATSAEELQAQKRKRKK KKGIRR >gi|225935335|gb|ACGA01000057.1| GENE 122 145023 - 145274 126 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174279|ref|ZP_05760691.1| ## NR: gi|260174279|ref|ZP_05760691.1| hypothetical protein BacD2_20646 [Bacteroides sp. 
D2] # 1 83 95 177 177 159 100.0 6e-38 MKQKRIQTINAILDQDLEALLKQTDKYEDLVNGRILCENCKKVIAIENIGIVLPDIQNDT MKLRFYCDDIDCIQKYYLENGRG >gi|225935335|gb|ACGA01000057.1| GENE 123 145261 - 147795 782 844 aa, chain + ## HITS:1 COG:PA3162 KEGG:ns NR:ns ## COG: PA3162 COG0539 # Protein_GI_number: 15598358 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosomal protein S1 # Organism: Pseudomonas aeruginosa # 355 649 101 399 559 77 23.0 1e-13 MEEAEIYNDIRTILTSLGLPIIAVIAIVIAFFFPDRIKIWVGVFQYHIGRICVGVRKSSI KNRLEGACTHSLKKLGKELPDLEIPELSIKWVKEDNLQAKLKEGKAIVKLRFSDDQTRNI INAATVYVKEAFLIHSKPYMSENFVKAMDFSITKKILLGVHNNKRNVISEFIKEYALDLN SIQDKCSQIEVIDDAGLFTRILIRELDFFGNKLIGRNPSDDYKNESDKFLSFLHEITTRD ADDLTPLQFVENILKVGVLLVAKKDTYYQHGLHPYLRRIKLGLARGIKTFYLLAREDKVE ILESVAKELLLTGNFILINNPRSFSDSYNREVICYCLRIDTESSLTSTYREISLALDSKD KMQGVVTKLREDGLKVDVNGVEGFVQIRNLSASKISDIRKYFKEKMLIELIPIEIGSTGI VEFTLINTCSDPNNLINANFEIGRTISAKVKYCDDDFVKFDVGHEKIEGISFRKDLTYSR FSLLHKQYELGTEHDFIVKNNDFENNTIYLRLKQLKDPWDSLTIRKFSQVNFLVCRKVNY AFIGELQEGIEGILAYKELSWFSSEIEVIKNSIKLNNNVDCVVKDIDKEKRIIYLTLKDR QNNPYFEYLDTNRNKVVEFFAIEETIYGILGSIESKYQVFIPKNEQSWNGDKYNCKIGKK NKVVIKELSNRGDSLIGTFRPLIPHPLALFSNKFDVGQILKPLKVLNTYNWGATFIISIG HKKFEALLFKGDISNLCFIKSCIGIFDNIDKIPLAIKEIDLDKNRVILSLKEVLKNNLQR SRECKYANEYESIIIGSNNQGYVVLLKGLWIEAILESTQTYETGKILTLRPARLSDEIII LTDE >gi|225935335|gb|ACGA01000057.1| GENE 124 148244 - 149113 464 289 aa, chain + ## HITS:1 COG:YPO2672 KEGG:ns NR:ns ## COG: YPO2672 COG4413 # Protein_GI_number: 16122877 # Func_class: E Amino acid transport and metabolism # Function: Urea transporter # Organism: Yersinia pestis # 13 289 29 329 330 115 34.0 1e-25 MTFIHNLFLIPGRGIGQVMFQNNALSGLLMLIGIFLNSWQMGILAICGNIISTLTAYFSG YKHDDIKNGLYGFNGTLVGIAVGVFLQLSVGSLTMLTIASALSTWIAYFFSRQRLLPGFT APFILAVWGMLGVCSWLISDLLPASDTVIDTTQNINYFQAFCLGIGQVMFQGNTVLAGLL FLIGILINSRKATLYTILGALLPIPLAILLEVDATNLNAGLMGYNGVLCAIALGGTTWKS GVWAGCSVLLSTALQILGMGLGIITLTAPFVISVWIIIMIQKAMRTQNK >gi|225935335|gb|ACGA01000057.1| GENE 125 149623 - 150480 312 285 aa, chain + 
## HITS:1 COG:no KEGG:Teth39_1000 NR:ns ## KEGG: Teth39_1000 # Name: not_defined # Def: SpoIID/LytB domain-containing protein # Organism: T.pseudethanolicus # Pathway: not_defined # 101 275 99 272 762 71 34.0 3e-11 MKTKFKDLLKGIFLTSVASLISSNEANAISYDLSRISDDNNIEQGKKNDLSQKYILKIHN DNLFLIAGHRSHRSHSSHRSHSSHRSSSYGSYSGSSTTTSSSTSTYNTNTTTSTKTTYSL GERNIISGVSGKDVAELANKLVKIRYIIKSDIEKNYSGDVKYSIRLENAIKKFQKDHNQV VDGIVGKKTIEAINKQIEQINRETVNTDVQKYNLGDRILKKQMQGYDVTQLKNILIDKGY LSGKLLKGTSLFDEETEKAVIKFQKSIGIDADGIVETQTVYFLKK >gi|225935335|gb|ACGA01000057.1| GENE 126 150490 - 150837 362 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174283|ref|ZP_05760695.1| ## NR: gi|260174283|ref|ZP_05760695.1| hypothetical protein BacD2_20666 [Bacteroides sp. D2] # 1 115 1 115 115 186 100.0 5e-46 MIKAELRSENTFCFTINASIFNERVLTKALYWYTESFIIYWNENKDNLFEITLELKPSAN KIYTFEYVTHKFNQDLIDYKNRDLITNETKDIRNILYVKAFANNDDFEDFNLASE >gi|225935335|gb|ACGA01000057.1| GENE 127 150837 - 152033 294 398 aa, chain + ## HITS:1 COG:MTH114 KEGG:ns NR:ns ## COG: MTH114 COG0641 # Protein_GI_number: 15678142 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Methanothermobacter thermautotrophicus # 87 274 2 181 262 71 28.0 2e-12 MNHYYLLPFRFERIKEQELLVNELGDFIFVPTGTTERIIKRQLSNQEDLYKDLVANFFIS ESPIPELIDNIATRLRTKKAFLDSFTSLHIFVLTLRCNQNCIYCQASSKESCEAIYDMKE EHLFKAIDLMFQSPSHSITMEFQGGEPSLPFQLLQKAVKRTVALNQEFKKQITYVLCTNS INLTDEILSFCKEYNILISTSLDGPAFIHNHNRGKADSYNRVIEGINKARNYLGTDRISA LMTTSELSINHPKEIIDNYLSNGFNNIFLRPLNPYGLALNNTNWETYFDNFIKFYKSALN YIIDINIQGRFFVEEFTSILLRKILTPFTTGFVDLQSPSGIINSVIVYNYDGYVYASDES RMLAEYNDNTFKLGHITDKYESLFLWKKGSTNSPNLGN >gi|225935335|gb|ACGA01000057.1| GENE 128 151975 - 152259 113 94 aa, chain + ## HITS:1 COG:no KEGG:BDI_2144 NR:ns ## KEGG: BDI_2144 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 88 391 474 474 87 50.0 1e-16 MKVFFYGRKAQQIALIWGTEFIAGCADCAFQSYCGADPVRNYSAQNDMYGFRPTSSLCKK 
HKAIITYIFYLIQNEYDRVMPIFKQWINRNENEF >gi|225935335|gb|ACGA01000057.1| GENE 129 152318 - 153103 167 261 aa, chain + ## HITS:1 COG:no KEGG:PFLU3250 NR:ns ## KEGG: PFLU3250 # Name: not_defined # Def: hypothetical protein # Organism: P.fluorescens_SBW25 # Pathway: not_defined # 1 259 135 396 411 236 41.0 8e-61 MCAQPPLNRDDIDSFWNKSIKLIDNAPNGLTNIGITGGEPTLLQEKLIHLVEYIKLRYPE SLIHILTNGRAFSDIRYTMKFKEISNLLFGIPLHSDFSIDHDTITQVKGSYTETMKGLYN LASIGADIELRIVINKMNYQRLPQLSEFIWKNLPFVAYISFMGLEDTGYSIKNHNKVWID PIDYQKELEKAVVNLAEWKLDVSIFNIPICLLKPSLYKFAKKSISDWKVCYIDACSKCSK KHECCGLFSTSKIQSPSIKPI >gi|225935335|gb|ACGA01000057.1| GENE 130 153482 - 153889 386 135 aa, chain - ## HITS:1 COG:no KEGG:BT_2360 NR:ns ## KEGG: BT_2360 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 136 210 91.0 1e-53 METEFYTNNSHLGRKIERIRRLRGMTQTDLGELLGVTKQAISKMEQSEKIDDDKLKQVAD ALGVTEDGLKKFTEETVLYSTNNFYENCHVSASNIGPITHVENLNHFSMEQAVKLFEELL KIEREKYSKGKEDSK >gi|225935335|gb|ACGA01000057.1| GENE 131 153935 - 154324 419 129 aa, chain - ## HITS:1 COG:no KEGG:BT_2361 NR:ns ## KEGG: BT_2361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 129 190 95.0 2e-47 MDIEVKDKANRRHVGHNLQKIRVYFGMKQEALAADLGVNQQVISKIEKQEEIEEGLLNQI ASVLGISAEVIKDFDVEKAIYNINNIRDNTFEQGSTSIAQQFNPLDKIVELYERLLQSER EKVELLKNK >gi|225935335|gb|ACGA01000057.1| GENE 132 154658 - 157744 2632 1028 aa, chain + ## HITS:1 COG:no KEGG:BT_2362 NR:ns ## KEGG: BT_2362 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1028 1 1028 1028 1793 88.0 0 MKSFSDKTNIGENNPLKTVYRMNFYLLFLLIGSLSILGSTRAYCQQNRTISGIVKNSQGE TIIGANIVEKGTSNGTITNVDGLFTLRVAPNAVLKVSYMGYTEQEVNTRNKNKLEIILTE DARLIDEVVVVGYGSVKKRDLTGAVTSVKSAEVLAAPTNNVMEALQGKIPGMDITKTSGQ VGGDVTILLRGSRSIYGSNEPLFIIDGIPGSYSQVSPSDIESVDVLKDASSTAIYGSAGA NGVVIITTKRGKEGKATVNFDAYYGFSGSPNYRHGMTRDEWVTYQREAYKYKNGDYPTDM 
SALLGKQDFIDAYNNGKWIDWVDEVSGNTATTQKYSLSVSSGTEKTKLFASTSYNREEGL LNNENLNRYSLRLNLDQQIFSWAKVGFTSNLVYRDLNSGVKNTFTKSLSSIPLGDAYTEE GEINHEYITGQYSPMSDFIENQYVDNTRSTYLNMSGYVELTPVKDLTFTSRVNGTLNHSR HGQYWGEKCNANRPSYAGSPHASITNNNAWNYTWENILAYNTTIAKDHNLGGSLITSWNK NQSESSMAAASGQMVDQWSFWRLTSGTSQHVESDFAQTQKMSFAFRLNYSYKGKYLFNFS TRWDGVSQFSTGHKWDAFPAGALAWRISDETFMEKTRSWLDNLKLRVSYGITGNSGGTTA YSTTTQAYVYTASGISINGKIVPFTQYSGTYGSSDLGWEKSYNWNIGLDFGILNGRIDGS VEWFKTTTKGLLFKRTLPITSGLTGWGSPLAIWQNIAQTSNQGVEATITSHNIRHKDFTW NTTLSVTWNKEQIDDLPDGDLIAENLFVGEPIKAIYGYKYDGIWGTDTPQETLDAYGVKP GFIKIETLDQKGDGGVHKYSTEDRQILGHSNPDWIIGFSNSFTYKNFDLSIFAMARYGQT INSDLLGYYTAEQSVTKNQLAGVDYWTEDNQGAYFPRPGTADEQKTVYPSLRVRDGSFIK IKNITLGYTLPVNISRKVLMEKCRIYATAYNPFIFVKDKQLKDTDPETNGSDAFPTYRQF VFGVNLTF >gi|225935335|gb|ACGA01000057.1| GENE 133 157760 - 159598 1373 612 aa, chain + ## HITS:1 COG:no KEGG:BT_2363 NR:ns ## KEGG: BT_2363 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 612 1 610 610 1046 87.0 0 MKSIYIKFIFCSVLLSALSLHSCTLDEYNPGAFTKEALATSIDGYETLINQCYFAMERFY YGAADWMSLTEGDTDLWTYKANESTSYTQWFWFFAGTSPNTTYTNNWWNGTYDGVGACNE AIALGDKPPYTTEEERNAKIAEARFLRAVYYFNAVEQFGAVTMLTEPVVTETLTYSPTRT DPMTIYQEVILPDLRFASEWLPTGTHATTTTPTKKAALGFLAKACLQTYEYGSTEYLQEA LDTAKKLITDCETGGGKYNTYMYPSYSEVFKESNNWENKEALWKHRWYAGADGHGSSNGN YKLNRNDEYFLCNINKFGAREDNQETRLTWEGSITGIFMPTQHLLSLYVQKDGTLDPRFH ESFTTEWNANKNYTWDESAVHMYDKEETVIGKALNKGDLAIKFIMPQDMDYATEKQKEHI SDYLLIDYHHVYSNDNNNVNMNYAYTNVTGNYKDDGTNENQFRYYYPSLNKHNSSNYYVA NASKQRNGNLNATFIMRMAEVYLIAAEADIYLNGGANAAGYINKVRERAGANPLTGSITV RDILDERGRELCGEYCRFYDLKRTGMFKNSEYLENTHPDLARFFHPNYALRPISTTFTAT ITNGSEYQNPGY >gi|225935335|gb|ACGA01000057.1| GENE 134 159626 - 159853 81 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESIRAPVHFDYRFALRMNIRYLAKGHRQAITPKLYLSVFFDVFLKIEVQNLTFLQDCYN KSKFFPMFVLKGNKI >gi|225935335|gb|ACGA01000057.1| GENE 135 160042 - 164877 2819 1611 aa, chain + ## HITS:1 COG:MA0188 KEGG:ns NR:ns ## COG: MA0188 COG0457 # Protein_GI_number: 20089086 # Func_class: R 
General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 702 1091 521 904 914 95 21.0 1e-18 MQSERDMLVNKVFPRLRQIAYERNVTLTEVDLRWGITEEEAKSSKVVEICLDEIRNSHPF FIGLLGERYGWCPSKETLIEHQAMPDRYEWLAADLDRGMSITEIEIQYGVLRSLEPVYAS FYIREADEKTMETDPRQAQLKETVRNNKRYNTYDYCSPEQLGEQVESEFRALLDHLFPKD KVENPAVHERQKLQAFLDDKSDNYIPNKAYISYLDKFLNDTTNQHLVITGESGMGKSALL AYWLKNIIEDDRWNVVAHFSANSSQSLDTTDIAKHITTQIDSLYGLEQMEENDRQIEHNA TDTDNIDYQKLALRAQLIAGQKPLLIVLDGANQLSDRNHRTKLLNWLPDFPDNVKIIFST IEEDKTMQVFKKRKYPVITVYPLLLDQRKELIVDFFDRYRKRLSEQQLTMILKGSDITDN TMVLMSLLEEIRCFGNFDSLTSFISQMTNLPDINSFFDRLLQRKEQTYNTPLYPSLTSDL LSLIALSKDGLSETELIAINNIPSLYWSQFYCANTAHLMIRDGRVVFAHDMIRQAIEQKY LNSERKVQLRQNIIDYFNREENNNFRKMEELPYQLYHAEKWDELHECISTLGYMSRQFST NNIHEFILYWRTLKQADASKYKISQAYIEQIMGSMAEKAEMAVMLQWAYWQMFRNDVIRL TNICISYLNDLDAAYELVEFLIRMCDQMADGDTSEERCSFHNLAGMICIRKKEYIQALKE YREGLSLAVHAYGEDDLKITSYLCNIADVYHSIGESQDPVDKDALEKAKLILEKVLKMRK SILPDMHDDIAVAYDNLAGIYRLLGEEELAKEYRLKSQDIYVSLKGENDIDVAIAYHNYA LECRMEQDFDQAREYAEKALTIYQHILGDENTHTQEEYHSLAEIEMYSGNYPKATEYMRH AIDIYQKIGDRDMTLAELLNSQASIYLRANQPDLSLNTALECISLLETLKQTDSKLAATM YDNLGKVYLVLNDEKLSEHYYLKGAETWHNMDNLEKEAASYSYLAAAYNTFNRFEDAAVA LEKSLALLEECDKSQDEAAAYANNNYGAICFRLGHIDKAIEHIEKAWEIRSALFGPDDQM ARQYKDTVDQLKQANENTEDSANDKPNEKQTANANAEIEEFATYIGIDDPALTDTFRQGR DAFQTGYFERAVDMFNRALHHCNQKGVEKMSVARSLIYRYLAYSMEMKKEEDQAEEYYRM AADIAIQNDNDAVAERVFKDMAEFNWNRGNFEQAEMSYWHEIRCLLSLYGFYNKKTIACL NNIAQAITKQDTVYWQVILRCFEFVYVLSSNKEEYKQSHQSASNGIGTSIEHLKQEDESF DDQNYRIDLESAVIVLMDYTISLGMYNTTHQLYISFLEEIAANSGITVDGYYQISLQRIT MDAVLDNPHRVITAGLNLLQKAESQPPSDEIKSQIEITIANAYCDTTQYHKALPLYETNL TYEPNLIVPLAKCYNALSHPQEALQMLENAIEQDPSISTLPEFDKVLGKILLDTGHIKEA LDAFNRYEADKTDTGISPMTIHKALALYLSGSEMDATQLVHSAMKLLKDESNDLLYRISI GIELIDYMFQTGNIEQAEKLIDQLHGLLEELPDSMQEPYNARIVAIRQAQN >gi|225935335|gb|ACGA01000057.1| GENE 136 164916 - 165659 417 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174292|ref|ZP_05760704.1| ## NR: gi|260174292|ref|ZP_05760704.1| hypothetical protein BacD2_20711 
[Bacteroides sp. D2] # 1 247 1 247 247 432 100.0 1e-120 MNIQEILNMLDEDIARAEEQQPDNKEENIDFLLSELEASRELDEDRLAEKYLAMHTMYQF LTRPGNLIKILDCKDQLTLGECLPMLTYLPWVKDTEFERLCLEASFCSLYNHFQYGNELE KGFAAIATIKLLLHHSEQLLPCFEEFQQIAYENEHLHKQMDEVDFQYEPQILLYQVQASM VPFVRRVYEMYPMLEEERINEFCNRIEATLADLELSQERMMFKTRLIYNFMIEADIRCYE GINEEEE >gi|225935335|gb|ACGA01000057.1| GENE 137 165775 - 166626 742 283 aa, chain - ## HITS:1 COG:SMc02841_2 KEGG:ns NR:ns ## COG: SMc02841_2 COG0350 # Protein_GI_number: 15963919 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Sinorhizobium meliloti # 115 277 3 168 169 154 43.0 2e-37 MNGQKTLDYARIAQAIGYIRENFKRQPGLDEIAREVALSTAHFQRMFTEWAGVSPKKFLQ YISIEYAKKILDETHASLFDTAQETGLSGTGRLYDLFVNIEGMTPGEYKNGGENLSINYS FAKSPFGELFIASTNKGICCLEFADDHDAAFHSLLKKFPNATFTPAVDEMQQNALSIFTQ DWRKLKDIKLHLKGTDFQLKVWETLLKIPVAGLTTYGDIAAGINNPKACRAVGTAVGENP VAFLIPCHRVIRSSGELGNYHWGEIRKTAIIGWEAAKNGYLNK >gi|225935335|gb|ACGA01000057.1| GENE 138 166623 - 167168 397 181 aa, chain - ## HITS:1 COG:VCA1017 KEGG:ns NR:ns ## COG: VCA1017 COG0350 # Protein_GI_number: 15601770 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Vibrio cholerae # 18 174 7 154 157 138 47.0 6e-33 MEDKRKEIVGNVIRIQRYHSPCGDLMLGSVEDQLCLCDWAVESHRDIVDRRLRKVLKACY EESTSEVIQKAIKQLDEYFNGERTVFEVPLLFVGTDFQKSVWYKLLDIPYGSTVSYGELA KQLDMPKAVRAVAAANGANAISIFAPCHRVIGSNHTLVGYAGGLPAKKRLLDLEINGKPL L >gi|225935335|gb|ACGA01000057.1| GENE 139 167394 - 169487 1612 697 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 397 647 357 608 836 71 26.0 5e-12 MKITLIRQDNGSGKETLSICEAGTLFDKMKTETKAGHITALREIIPLLEGTYAQYEHIDK LPYIYSAVEYTRTKEGERKMKQYNGLVQLEVSRLAGGSEAEFVKRQAALLPQTFAAFCGS SGRSVKIWVRFALPDDGGLPSEEAEAELFHVHAYRLAVKCYQPMLPFDIDLKEPVLTQKC RMTLDEAPYYNPDAVPFCLEQPLTMPGEETFRQRKQEEKNPLLRLQPGYESAQTFTKIYE AALNRALQEMENWKRGDDLQSLLVRLAEHCFKAGIPEEEAIRQTMIHYYREEEEQVIRSI LHNLYQECKGFGKKSSISKEQETAFLLEEFMKRRYEFRYNTVLDDLEYRQRDSVHFCFKP VDKRVRNSIAINALKEGISAWDRDVDRFLNSECVPLYNPVEEYLYEAGRWDGKDRIRALA GLVPCDNPHWQELFYRWFLSMVAHWRGIDRQHGNSTSPLLVGSQGYRKSTFCRIILPPEL RFGYTDSIDFKSKQEAERYLGRFFLINIDEFDQINVSQQGFLKHLLQKPVANLRKPYGNT IREMRRYASFIGTSNQKDLLTDPSGSRRFICIEVTAPIQTNVTINYKQLYAQAMEAIYKG ERYWLNDEDENILKQTNREFEQASPLEQLFHCYLNPVEEEMEGEWLTAMQILSYLQTKTR DKLAISKVGQFGRVLQKLNIPCRKSRKGTLYHLVKVE >gi|225935335|gb|ACGA01000057.1| GENE 140 169835 - 170464 504 209 aa, chain + ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 205 212 247 58.0 2e-64 MAILFDWYEDPKPSDKQQGENENTLHPRIKYNGSTGTDVLRRRIQERCSLTETDVSAVLD ALSHIMGQELAEGRQVHLDGIGYFHPCLTTTEPVTIGTKRKATKVKLKAIQFRADQTLKN EFGVLKVKSLKGGLDFTQLTNEEIDLQLTKYFRTHPFMRRYDFQYLCGMARSTAMRHIRR LRDEGKLKNMGGAMQPIYVPESGYYGNNE >gi|225935335|gb|ACGA01000057.1| GENE 141 170966 - 172303 881 445 aa, chain - ## HITS:1 COG:no KEGG:BT_3124 NR:ns ## KEGG: BT_3124 # Name: not_defined # Def: putative sialic acid-specific acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 441 32 479 479 494 52.0 1e-138 MVLQQQTDVQLWGMASPKKMVAICPSWDKKLYTVRADDAGMWRASLKTPVAGGPYSISLS DGEDLMLHDVLIGEVWLCAGQSNMEMPLMGKANQPIADAVDMIVRAKPSRPIRICTIGQE GARMPQKNCRAQWLKNTPEVVSETSATAYFFADYLQQVLDVPVGIIVSSWGGTSIQAWMS REVLAPFKAFDLGFLDDTTFIERPKYQPCMLYNAMIAPVEQYTIKGFLWYQGESNRKDPD LYRRLQPAFVKMLREAWGQGELSFYYVQIAPFAYEGAELVGSALLREAQLLNLKEIPNSN MVVTMDIGDCNCIHPAHKRKVGERLALLALSGSYGLKGFVPDAPVYQSMEVVNGKAYLTF DCGSEGLAPLGATISGIEVAGSDRVFYPATARIEKYTGRLEVDCEKVPVPVAVRYCFRNY EKGTLYSRYGIPVASFRTDTWEVKE 
>gi|225935335|gb|ACGA01000057.1| GENE 142 172402 - 174459 1332 685 aa, chain - ## HITS:1 COG:PH0746 KEGG:ns NR:ns ## COG: PH0746 COG1554 # Protein_GI_number: 14590619 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Pyrococcus horikoshii # 42 673 19 687 737 211 26.0 3e-54 MRKRNFLIFLLLGGLWPVIGLSAQDAWKITAEQIDPQSYYGITVANGMLGLVSSPEPLKI SRVVLGGVYDIYGKGRVNNFLHGINMLDTELQINGSTVKASQISGYKQTLDMRRGVFCGE FDYKSLARVEYQYTSLRHLPYSCLLRVQIIPKENIEVSVANIMTVHESLRNPQEYYNRIF NGKTAIDLCTSVAKSPVREFEIGACSSFVFDASFPHPEVCHRSARGAGVHTQEFTVRLQA GQSYTFSIIGTTLSSVTHADVRNEVERLTAFAAVEGVERLWSKHEAAWDKLWESDIVIEG DLQSQQDIHSMLYHTYAFVREGSGLSCSPMGLSGFGYNGHVFWDADTWIFPALLLLHPEL AESMMEYRYQRLDAAKHNAFMHGYKGAMYPWESSDKGNEDNTVTNIYGPFENHITGDVAM ATWQYYSVVQDLEWLRQKGFPIISAAADYWVSRSEPNEAGEYEIRNVIGADEWNQNPQGG KNVNNNAYTNGVAKSTLEAACKAAKLLNMKSDPMWATVAGKLRFRQLANGVTAEHDTYDG AITKQADVCLLAFPLKLVTDKEQIRKDLEYYLQTVPRKKTPAMSKSIYSILFTRLGDRER AWHYFRDSYCPNLNPPFRVIAEFDGGTNPYFLTGAGGVLQSVLMGFGGLDITDKGIVAGK GTLPDAWKSLTLKGIGKEKKDYIIK >gi|225935335|gb|ACGA01000057.1| GENE 143 174582 - 176435 1507 617 aa, chain - ## HITS:1 COG:no KEGG:BT_4726 NR:ns ## KEGG: BT_4726 # Name: not_defined # Def: glycerophosphoryl diester phosphodiesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 116 614 30 551 843 82 25.0 4e-14 MKRIIYLLIAAVSLLVGCNPAEEAWVQAGFTTDKESYEIGDLITLTNTSTAENAQIAVCK WEFMGKVSYELAAPEPFSVEEGGDYLFRLTVTSDHGSMKSVFEKTIKIIDDGIRPTADFS WLPEQIVAGEEVVFTDQSKAAEGCEIIKREWTFGAILSTEENPTVTFGSHGKINVSLTVT DNKKRKDTKTVTVDVAKSAGSLGVLWAHSYDEQGVVIGTSPATSADGEYVYVSSSNYNLV CFTKLGERVWGFDCGQNNEPNRGSDYQHPTPTVDSDGVVYVAVGDNTSKADAAKSASLYA VKGGADGGTQKWFTSIGAKSSMRPFGAPVVTDRYVMILTNSAPSTDGMHFRIYDKASGVL QYAEKTSSGSYGGCVGLKDGRVLVNTGQSGGRSYGTHIFFPKGNSWSKSTNEQNYAPNDQ PNGSQIAVGADGKAYILCLNLGARISTNETKALVYCYDTKKTTEGNVSAPEWYTAVKGEN KQTGYGLVVDGNGVVYVATDKYITAISSTGSVLWATEVAGTAYGVPAIDNEGYIYYNDTE KGSLVKLEPATGKAIASLQLGTGLQSSPTISADGIIYVTGMLEGKPTLFAVEGAATGYAA NAWSQMGGNPGKSGYMY 
>gi|225935335|gb|ACGA01000057.1| GENE 144 176467 - 177384 843 305 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 52 303 4 256 257 102 28.0 1e-21 MMKFNQILPVIAIYVVALGGCSNTTPDYSEFYKDPETVKEEDPSLDDNQIKVMSFNIRYY NTNDTGDKSWDARTAAFIPMIEEQKPTVIGMQEARPPQLKWLEENWKDYGYIGKGRRANN ANNDEFVPIFYRKKAVELVKWDCFWLSETPDVPGSQGWDNTVPRVATWAIFKHIASGKKF FFINTHIDVGSTIAPGKSMEVIVNKMSELNPEGLPMLLTADFNKQIDSDIFNGVKTFMTN VRLSAPVTDSKYSYTGFGTANPYIVDHIFCTGFKALKFETIDKVYQNITYISDHYPIVGK LEYVK >gi|225935335|gb|ACGA01000057.1| GENE 145 177402 - 179198 1698 598 aa, chain - ## HITS:1 COG:no KEGG:BDI_3111 NR:ns ## KEGG: BDI_3111 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 8 597 3 609 609 469 44.0 1e-130 MKRLSNIRNILLLCATVLLGSCDLTETMQVEADKAMVFGSESGLRLYAYSFYRALPTLST GYQQDEMCDIAAVRQTNVFIQQNAYNSETATSWTWGTLRNINYFIDGCHSKECTVDAATR DNYLGIARWFRAWFYYDKLTQYGEVPWFANEIQSYQYDVMYKERDSRDIIIRNMIEDLDF AYEHIQATSSVNSSTLTKWAAAALKSRVCLFEAAYRRYHKLTGLEITPEELYRHAASAAK LVMDNSGLSLNTATGTKGAYRDLFYLEAPITSEVILAVCANSASGIYGTQNYWYNSLSYG KGWSLVRPFVNTYLKLDGTPFTSETGYETKNFVEEMKSRDLRLAQTIRGLDFKRDGKAVV ADMTVCLTGYHVIKYSLDDTKYDNNEKNNNSIPLLRYAEVLLNYAEAQAELGGLTDAEWN TTIGALRRRAGITGGTEKLPATIDSYLQQMFYPNITDPVLLEIRRERAIELVAEGMRFDD LRRWKCGDLIEELPWTGMHISALNVDIDLNGDGTPDCYFTDNGTQSSNKDCKTVNVKNET GLYATVNAAGGYDLKYNPGTGNRVWYDDDRQYLYPVPAQVIRDYESAGYKLSQNPNWN >gi|225935335|gb|ACGA01000057.1| GENE 146 179216 - 182728 2901 1170 aa, chain - ## HITS:1 COG:no KEGG:BDI_3110 NR:ns ## KEGG: BDI_3110 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 1170 1 1187 1187 1134 50.0 0 MKNKKKVFFMVLLLLCFHLSLFAQHISLQLRNVTVKQAIETIQKQNGYSFAFAASDLNTK KIVTVLAKDLPIEKVVEQVLAGQDVSYTIQGKDIVIKKKGSQPVQQSSKKYTINGIVKDM SGEPVIGANIKEKNGTTGTISDMDGKFQLAVSEGSIVQVSYLGFLTQEVSVKNKNYLTVI 
LKDDTQTLDEVVVVGFSTQKKINLTGAVENVSSDVFENRPVANVTQMLQGAVPNLNISLA DGKPNQSASYNIRGVTSIGAGGSALVLIDGVEGDPAMLNPNDIESVSVLKDAASAAIYGS RAPYGVVLITTKDPSKQKDKFTINYTGNFSYETPYAIPDVVDDGYVWAYLFREAEFNYRG IEPTSINKSQPFSKEWLETFRQRKLAGNTLQAAVGPDGKYTYYGNEDYYDALYKDHTFAQ SHNVSVSGSNGKISYYTSARLYDYDGLFNYNSDIYRTMNMRTKISAQVFDWLKISNNIDY THDKYTQPLGYVEQNEGLVWKAINMEGHPSEPIFNPDGTLTHSGARSVGGLVTGDNWIKR TTKTLKNTTILNITMLDNKLRFTGDFSFRTRDFIEDKKTTAVPYSDYEGVIKYLGVPETD DKMSEKVQQTTYISTNVYAEYENTFAGKHYLKALAGYNYEQQDYKAISSERNGLLMPDVE NINFAMGDNMSIKSDGNRWRYAGAFFRLNYSYDNRYLFEVNGRYDGSSKFPDNSQWGFFP SASAAWRFSEEKFWKVNPDIISNGKLRVSYGALGNSNVKPYSYLEKFALKTYSTTGSSSE GRYLDGKAQLRYTLNPAQIPDNIGWETSRTIDAGIDLGFWNNKITLSADYYVRKTANMYT VGPTLPDTFGASSPKGNYADMSTYGYEISLGYNDSFRLAGKPFNFGIKATLADYHSVIDK YNNPKRDLDDYYVGQRIGEIWGFVCNGLFQSQEEIDAAFDGKGYKNNLMQTSVNYITYPG DMRFEDLNNTGTIDNGANTVDSPGDRKIIGNSEPRYIYTLSFSADWNNFFLSAMFDGVGK QDWYPSGESSFWGQYNRPYNQVPVWHLNNYWTKDRPDAYLPRYSGYYNPLYKGTANTRYL QDVSYFRLKNLQFGYNLPKKWIAKAGFSKVSVYFSGENLWSWSPLYRHTKDYDVAVITKG SDNDLTSGDKGDGFNYPTMRNLSLGISITY >gi|225935335|gb|ACGA01000057.1| GENE 147 182759 - 182911 81 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRSLGIYLYLPKLSSIKVTVAFKIERQEVRGWKRNLTFLCFYQERLIMFF >gi|225935335|gb|ACGA01000057.1| GENE 148 182959 - 183975 780 338 aa, chain - ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 134 320 74 257 280 73 28.0 5e-13 MNEQSLHTDTVNDWIIGYLTNSLTQEEMQLLQDWLNISEENRKYFSDMQEVWIAASDEAD ELNFNKEQAYQLFLEHTGASVKRPEKRKAFHIRPWMYAAAMVIIVFVCATIAFQSGKRVL QNQLTRITVEAPYGSKTKLYLPDGTLVWLNAGSKMSYAQDFGINERALDLSGEAYFEVTK NKEIPFKVHTEELDVKVLGTKFNFRNYKDDLEAKVCLLEGKVALNTLQKETVLHPNQQAL LDKKTGKLSVSATKAIHSVEWTNDRLYFDEVLLSDIIKELERSYDVKITVADDTLNTIRF YGNFRKREQSIQEIMNVLSSTDKMTYTINGKNIVITLP >gi|225935335|gb|ACGA01000057.1| GENE 149 184011 - 184727 381 238 aa, chain - ## HITS:1 COG:all2193 KEGG:ns 
NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 52 223 25 192 201 59 23.0 6e-09 MRYFCERVISFFFHFYFFPLINLHLSIYYPNIAEQKKQIMIPIDEKYIIKKLKAGDNEAY KYIYDYHYVALCKLSYYFLKDKVQAESIVNDVIFHLWEIRDKLELVPPLRNYLIVAVRNK CLNYLALKQQEMEVHFSTIERAGIQLQNIISDTQQPLGNLLKEELESKIKESVNKLPPVC RQVFIMSRFQEMTYEEISQKLGISINTVKYHIKSALVVLKKELGTFLCIIFAFYPTLL >gi|225935335|gb|ACGA01000057.1| GENE 150 184809 - 186950 1639 713 aa, chain - ## HITS:1 COG:MA2553_2 KEGG:ns NR:ns ## COG: MA2553_2 COG0642 # Protein_GI_number: 20091380 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 318 571 168 428 428 179 42.0 1e-44 MKTTDSEPQGVGMELEQYRTRLEQMVEEKSKDLIAIQENLEATNRRQALFIKVLQILQLE PDIPTAMNMALAEIGRYTGVDRLATWENHLDGVTYGCTNEWCNEGIEPAIDYLRSMTIEA GKPWFDMLEENHIICTSDIYSLDPFITQMLEVQGVKAIAVFPLSQLGVHFGFLSFNFCWK KQWDKKDVELMSQISQIVSTATKRWQVEISLQQSQRTMQKVLDNINANIFVSDYDSLEVL FANKSFREEAGSVPERATCWKMLNAGLGGACTHCPKPQLLDANRKLTGVHFWEDYNSVTE RWYTIQSMAINWLDGRWAIMELATDITTRKQVELELIEAKEKAEESDRLKSAFLANMSHE IRTPLNAIVGFSSLLAETDEIELRQAYMSLVQENNELLLNLISDILDISKIEAGTIELTT GWVDVPQLCREVIATFSHKKHDGAVELRFDESSPQIVIDADKNRIMQVLSNFMTNALKFT TKGSITLSYTLNDDGQVRFCVTDTGKGIPAEQCHDIFNRFVKLDSFVQGAGLGLSICQSL VERMGGKIGVESCVGKGSCFWFTHPFTSATQSASTLVTENDMLATLQKKTVSRNYKPLIL VAEDIDSNYLLIEALLKKDYRLVRAHNGSEAIELFNAETPDLILMDMKMPGISGIDTTTL LRKTGTQVPIVALTAFAYADDKSLAFDAGCNDFLTKPVSPPELRRVVSKWTVK >gi|225935335|gb|ACGA01000057.1| GENE 151 187280 - 187600 303 106 aa, chain - ## HITS:1 COG:CC3443 KEGG:ns NR:ns ## COG: CC3443 COG2076 # Protein_GI_number: 16127673 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Caulobacter vibrioides # 1 105 1 102 106 70 49.0 5e-13 MNWIILIIAGLFEVGFTFCLGKIKGATGTDFYLWGTGFVISVTLSMYLLAKATQTLPIGT 
AYPIWTGIGAVGTVLVGILFFHEPATLGRLFFMTTLILSIIGLKLI >gi|225935335|gb|ACGA01000057.1| GENE 152 187673 - 188323 443 216 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. PCC 7120 # 26 214 2 189 193 83 31.0 3e-16 MTAQEYEQFLFPNLTFCIFAENGSCYMDIHEIINKIYPIPETSMDKLTKHLSKVTYPKGH HILEAGKTETNIFFIEKGIVRAYIPVDGKEVTFWIGKEGSAIVSLKSYVDNQQGYESMEL MEESELYLLKRKDLQELFKEDIHIANWGRKFAESEFMQTEERLISLLFTNASERYMKLIQ NNPELLQRMPLECLASYLGITPVSLSRIRAKLKRML >gi|225935335|gb|ACGA01000057.1| GENE 153 188407 - 188772 291 121 aa, chain + ## HITS:1 COG:STM4251 KEGG:ns NR:ns ## COG: STM4251 COG2315 # Protein_GI_number: 16767501 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 116 1 114 118 59 30.0 2e-09 MNIEEYRDNCLSVKGATECMPFGENILVFKVMDKMFTFATLRPKNGNFWADMKCAPDKAE ELIEQYGDIFWGPFSDKKHWITVYLEGDVPDKLIKELISHSVEEVIKKLSKKKQEEYRTM F >gi|225935335|gb|ACGA01000057.1| GENE 154 188866 - 190119 1174 417 aa, chain - ## HITS:1 COG:PA2522 KEGG:ns NR:ns ## COG: PA2522 COG1538 # Protein_GI_number: 15597718 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 36 411 40 420 428 94 23.0 5e-19 MMRRNIHFNILLLLLLVEAISVHAVAQIPDRTLYQSISLDEYLKRVGEGNLNYLAEKLNV SIADAEMIAQKVFPDPEVGFEAGHETFSLGLSYSLELGNKRGARIKLARSQMELEKLILE QSFQNLRAEAASLFLEAILQRELLEVQQSSYQYMLQLSQSDSLRYVAGEITENDARQSKL EAVTLLNAVYGQEAAYQSALVVLNRQMGMAADTLYIPLGNWDEFNREFYLSELIKVGLGN RIDLFTAQKNTEITSRAYKLTRAERRPDIGLSVSYEKDWNRFFPSARSATVGVSIPLAFS NINKGTVKAAKFRISQSEIQEKDIEQQIQAEIMQAWFNYEAEKKKVVQFKSGMLGDAQKV MDGMVYKYKRGETNILDVLVAQRTYNEVQQEYLETMKGYAVSLVELEKACGIWDIRF >gi|225935335|gb|ACGA01000057.1| GENE 155 190094 - 193201 2669 
1035 aa, chain - ## HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 1 1022 1 1019 1038 1201 57.0 0 MKKDLMLSIIQKRWIMLCLFVMMCVFGYYSWTQLSIEAYPDIADVTSQVVTQVPGLAAEE VEQQITIPLERAINGLPGMHVMRSKSTFGLSMITIVFKDGTEDYWSRQRVQERLNEVELP YGAAPGLDPLTSPVGEVYRYIIESDQHSLRELTDLQNWVVIPRIKEVSGVADVTNFGGIT TQFQVEIDPYKLEQYHLSLSQVIEAIENNNASVGGSILNRGDLGYVVRGIGLIENLDDLG HIVVTTTSGVPVFLNDIGSLKYGNLERKGVLGYTDRTRDYSESLEGIVLLLKHENPSKVL DGIHAAVDDLNHEILPEGVRIHTFLDRTSLVDTTLDTVSHTLLMGMALVVIVLILFLGNW RGALLVSITIPASLLIAFILMHLTDIPANLLSLGAIDFGIIVDGAIVMLETILKKREDYP EQYIEEKSIAQRAKEVGRPILFSTIVIITAYLPLFAFERVERKLFTPMAFTMSYAMIGAL LVALLLIPGLAYAIYRKPRKVYKNKWLEKLKEKYTGAIEKLIQKPVKTILFSCLILIAGI VLSGVVGKDFLPELDEGSIWLQVNLPPGISVEKSREMSDTLRSRTMKYPEVTYIMVQAGR NDDGTDPFTPSHFEVSIGIKPYDEWPKGKTKQDLIHELEEEYKTLPGFRVGFSQPMIDGV MDKIAGAHSELVVKVYGEDFRETRRIAEEITRALGTVKGAVDLDIDQEPPLPQLQIIMNR DAIARYGLNVSDVADLIEVAIGGKAIAQLYQGDRQYDITCKYKEEMRDTPEKIAGLMLTS STGAKIPLSQVAEVKLSVGESTITREMNRRHLTVKLNLRGRDLASFLKEAQNVIDAKVNY DKSKYTVKWGGAFENQNRAYTRLSVILPLTLMCMFILLYATFGKFRQAGLVLSIVPLALF GGMLALNVRGMTLNVSSAVGFIAMFGLSIQNGIIMVSQINNLRKGGMELRKSVIEGARQR FRPILMTSVTTVLGLFPASLATGIGSDVQRPLATVIVYGLMFSALISMFVLPVFYYLIEN NVLNKKKNDEKEYTL >gi|225935335|gb|ACGA01000057.1| GENE 156 193208 - 194320 914 370 aa, chain - ## HITS:1 COG:RSp1041 KEGG:ns NR:ns ## COG: RSp1041 COG0845 # Protein_GI_number: 17549262 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 70 369 82 380 382 153 33.0 6e-37 MMMKKKNLQILLICGVLLVTACKSKSKEEQQSGYRMEGDTVQITDRFLSEKIKVTEARLE PYSKEVITSGVVRPIPTRYACIASPFAGRVTKSYIQIGQQVNRGTPLFEISSPDFTTAQK EYFQALSSRELAKKDLKRKEDLIKNGVSSQKELEEAQNALLIADKEFENASAALEVYQVE HPENMILGQPLVVRSPISGEIIENNIVTGQYLKDDTEPVAIVANLSEVWIAAQVKEKDIR FINAGSSLDIEVSALPGRVIKGNVYHVDEAVDEDTRSIKVLSVCDNSKKHLKLGMYTTVH 
FLSAPIEQIQISESALLQGEKDSYVYVQVTPDIFVRTPVKVEATKDGFAVINDGLHPGDK VISEGGYYLK >gi|225935335|gb|ACGA01000057.1| GENE 157 194749 - 195957 832 402 aa, chain + ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 4 402 45 441 450 183 33.0 5e-46 MNNWKKKFIIIWSGQLFSILSSSIAQFAIVLWISLETGSAEVLSFATIAALLPQVVLGPF AGVFVDRWSRKWTMILADSFVALCSGIIALLFYLDVIEIWQIYLLLMLRSVGSAFHAPAM KSSIPLLAPEKELTRIASINQTIQALCNICGPVLGAALIVSTNMSVVMLLDVAGAAIACT TLMFVFIPNPEKTETETTNNVLRDMKEGFQAICSNCGLKWVMVTEVLITFFIMPVVALIP LMTLKNFSGTAYQVSLIELLFGSGALVGGILLGIWNPRIRKVVMINISYIIIGVSIFVSG ILPPSAFIIYAIFAIVQGISMPFYSGPFTALLQTQIEVSFLGRVFSLFDSISLLPSILGL LATGFIADAIGIANVFVICGIAIVITGTLAFFIPSIMNLEKK >gi|225935335|gb|ACGA01000057.1| GENE 158 196042 - 198171 1264 709 aa, chain + ## HITS:1 COG:BH2844_1 KEGG:ns NR:ns ## COG: BH2844_1 COG0475 # Protein_GI_number: 15615407 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 11 400 5 388 388 305 45.0 2e-82 MDWIGLDLTFPITDPTWIFLLVLLIILFAPILLNKLRIPHIIGMILAGLAIGEHGFNILA RDSSFQLFGKVGLYYIMFLAGLEMNMGDFKETRNKALVLGLLAFIIPIGIGFVANVSYLK YGVITSVLLASMYASHTLVAYPIVTRFGISRHRSVSIAVGGTAVTDTLTLLVLAVIGGLF KGETGGLFWIWLVVKVIFLGALIIYFFPRIGRWFFHRYNDNVMQFIFVLAMVFLGAGLME LVGMEGILGAFLAGLVLNRLIPHVSPLMDHLEFVGNALFIPYFLIGVGMLINLRVIFGHG DALKVAAVMITMALTGKWIACWLTQKIYKMSVLERNLMYGLSNAQAAATLAAVLVGYNII LPTGERLLNDDVLNGTVLLILVTCVVSSLITERAARKVAMDDSLSENESSKETEKILVSI ANPDTIEDMVNLSLIIRDSKLKDNLLALNVINDNNNSDNLRLRSKYYLEKAETTATEANV PLKKVTRYDLNIASGIIHSVKENEITSIITGLHRKKNITDSYFGILAEHLLNGLNCEIII SKFLIPVHTIKRIVIAVPPKAEYESGFPHWMEHFCRMGSTLGCRVHFFANEKTTIRLQAW IKKRHKQTLTDFSLLEDWNDLLVLTGQVSYDHLLVIISARPGTLSYDSSFEKLPRQLSKY 
FSNNSLIVLYPNQSGEPLDTSFFSKLYTDTETRHYEKVGKWFYRWFKKD >gi|225935335|gb|ACGA01000057.1| GENE 159 198193 - 199827 848 544 aa, chain - ## HITS:1 COG:CC3219_1 KEGG:ns NR:ns ## COG: CC3219_1 COG0642 # Protein_GI_number: 16127449 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 28 267 277 520 527 106 30.0 1e-22 MDGSSEKKKREQLQKIEQKHKEEIYESKLQFFTNISHEFCTPLTLIYGPCNRLMEQKGLT ESAKRYTSVIRQNAERLNSLIQDLIEFNRIESGYKKPVIVPVDITAIANKLVESFTDMAE SHKIFLEKDISPFLKWNSDKDFIVTILSNLLSNAFKYAGNEKIVRIKIGMDQDNLWIIVS NTGKGIAEKDISTLFDRYRILQHFEKNDHFWSRNGLGLAISHNMVHLLGGSIKVESVPDD WTHFRVQLPSLESCVMAADEGEPISLPDYRLDYHSALSIPPNNMDELKPTILIVDDEIEI QWLIVDIFKDDYNVLTATNSVEALKILKEFYPDIIISDVVMPGIDGLAFSREVKSDEATA HIPFIFLSAKRDVGIQTEGLDAGAEIYITKPFDVDYLKSSVRRLLERKESLKEYFSSPLS SYTLKDGKLSHKEHRKFVNEILRIINRNISNKELTAHTIAEKMNIGIRSFYRKLEEIEGV TLTDLICDCRLAKATDLLLKTKLTIDEVVFQSGFTNRSTFYRAFSKKYNCTPTEYRREHA QVVI >gi|225935335|gb|ACGA01000057.1| GENE 160 199779 - 202235 942 818 aa, chain - ## HITS:1 COG:VC1353_1 KEGG:ns NR:ns ## COG: VC1353_1 COG3292 # Protein_GI_number: 15641365 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Vibrio cholerae # 79 471 164 529 675 77 23.0 1e-13 MISLCGIDNREKTIACLLLSLFLLQVAVSAKAFNLRKINNSENLSSGIISSIHQDEKGLI WIGTNRGLDMYDGKRVMKYGPEHNENFFTGSNIHKIMQVNDSLLWLQTYHGLHKINLESS AIESFDMFNRISFLNKDCYGNIYLIQGSYCIYYKLKDQECFEQVFVPGLRANDIVAFFTD NAGKLWIFQKNGVALCFSIQISRQGTIKLEPSAGYKHSAGIRYGISDDSSVFYFVDDTYT LYEFDTSAQKLLLVCDLTKHIAGKDEISSLIKFHHDYFIGFKTKGLSLLRKNDRGYVLEE LNVPGGISCIYKDKYQDLVWIGTLGHGIYLYSNDMYSIQSFHLSDFIPDLHQSVSALWVD KKNTLWLGTRGEGILQIDNFQVDKKVEDHELKLLTAGNSQLRDNVILSLDESRQGNLWIG SEKGLTYFDSERNCVFPVPLSRSGKEIEFISDIYEKDSLLWVSTMGMGIIKAHIEWWNGT PLLTVIKQFSVKDGDRFANCFQDIYPEGDSVLWCMNRGEGVYKLHVKTSKLKNIRLEGYA INETNVMQKDYHGNYLLGTNFGLVKYNPTGYKVLNEAGDLFANSVYGILFDSYSDYWLST NRGLISYNTDTESIRSYDHHDGLSILEFNEGASFRDAKNGILYFGGTNGFVTVERNYFDE 
GQHYMPLIYFKAITIREQQFPVEKFLSQEKNNTFLELSHEQNFFTLSFVAIDHLNGNSHS YYYKLDGEGKGWTYNGNSNVISFADLHPGNYRLYVKYYNKALRKESYICKMDIKVLPPWY ASHWAYGAYFLFALAGILFIVRIWMAHQKKRRGSSYRR >gi|225935335|gb|ACGA01000057.1| GENE 161 202387 - 204801 1551 804 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 117 782 54 719 737 173 23.0 1e-42 MKSTFTLIFFMFCFHSFMATASQIKGNIREKGTGNLIEFADVILLKEGIITVSHVLTESN GHFLLPDIGAGKYSLMVRLLGYDVYTQDSIIIRSSSAPALDLGIIELQPLEIGITEVVIE GTKRQVVYKLDKKIIDASNALMSSGGTAVDILENTPAIRVDADGEVTFRGSSGFKVYIDG KPSFYSGSQALQQIPAGQIENIEIITTPSARYNTDGEVGMINIITKKNKQQGINGMINVY GSTALSNGVDFLLSQQHNKLQWHLAGHWNQPIRESDFEQRKTTIVDHITTTSTSIGPRKG KNYHYGLRGGIDYSFSPRTSVNAELQIVYDKFTREGDLDYTETHTDANGTSTPHNNNSRD NFDLHSTVNIGKAGFSHKFNDNGHTLTGSFMAGFEQDPLEYFQSDLFDKQGQRQQGHRAW EDEIRRTIQCDLVYNYPYSKTGFLESGYQYHSYLEDGDYSMQFWNPAKEEFYWREDIYNT FYFQRRIHSVYLMGSDSYKAFDIQAGLRGEHTHQVLRSSIEGANRTVNRFEVFPSIHVGY TLPHEQKLTFAYLYRTNRPELFYMEPYITYRDYYTAEIGNPDIRPEYIHSYELNYKKNIG EQVLQSSLFYRSRKDKIERLRVPYEAGVTLDSMANVGNDHALGMELSGQFIVNKWWNLNV NGNVYYYKVVNNINSGGKEETSTNYDITVNNLLRICKNTRIQLDGNFVGPSVTTQGRSDS FWFVNLAIRQQWFGSNLQSTLSFRDIFNSARYKNDITTPNLESVTHIRPKFPVITLSISY TFNHFKRSSQNSRDNRDLFEGVNH >gi|225935335|gb|ACGA01000057.1| GENE 162 204837 - 205682 395 281 aa, chain + ## HITS:1 COG:no KEGG:BT_2157 NR:ns ## KEGG: BT_2157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 287 290 315 52.0 9e-85 MKQLYKLSLLVMALFAGISCHNRSSSRNQPEASNKIGPEFEDIKKALINLSDFPADQEGF IYLFDGSSLKGWRGYGKDHLPEKWSVENGVLHLNSSAAGEGGDLIFASLFKDFELELEWK ISKGGNSGIFYLAHEITTRDMNGRNRLEPIYISALEYQILDDEHNPDAKLGKVNSRISGS LYDIIPAIPQIAKPFGKWNKTKIRVRQGHVIHELNGTKVMECQLWTPEWKERLQNSKFSE TKWPLAFELLNNREEKAKEGYIGLQDHGHDVWFRNIRIKRN >gi|225935335|gb|ACGA01000057.1| GENE 163 205839 - 206333 264 164 aa, chain + ## HITS:1 COG:no KEGG:BT_1142 
NR:ns ## KEGG: BT_1142 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 161 12 158 161 149 54.0 4e-35 MKKNPTFLLLLLTILFSSCYTSAISYNNLNRIERGMFPKDVIAILGEPIYRSFDDKSEIL EFRSSEYPPARVVRIRFVDNKVVEMESYLDRYDNCRDENKTSKEKKEDKSSEKETSSKVR VSTDGKHFVQMGSILVTPEGKHEIIVSNSGGLIITASGEHIHTF >gi|225935335|gb|ACGA01000057.1| GENE 164 206423 - 207268 570 281 aa, chain + ## HITS:1 COG:no KEGG:BT_2385 NR:ns ## KEGG: BT_2385 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 277 1 279 289 391 70.0 1e-107 MATYTHFAKQPDVLKHLILCEVLKNETPQVYVETNSACAIYPMTQTPEQQYGIYHFLEKA DEVPSLKDSVYYQLESGEMAKGNYLGSPALAMNVLGKQAKRFVFFDLEKSALENVGTYAK QIGLSASVEIHHADSSRGAIELLPSLPTSSFIHIDPYEIDKKGDSGVTYLDVLIQATLLG MKCLLWYGFMTGDDKASLDNYITSSLEKAGIKDYTGVELTMNSIRKDSVLCNPGILGSGI LATNLSQTSTDIILDYSNQLVDIYKDAKYKDYDGSLYIQHL >gi|225935335|gb|ACGA01000057.1| GENE 165 207312 - 208673 446 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 1 453 1 456 460 176 27 1e-42 MKKIIICLFLSAGLSGCHVYRTYERPDMTVTDSLYRQPVVAADTASIASLSWKELFTDPQ LQQLIETGIANNTDLNIARLKVKEAEALLLSSKLAYLPSVSLTPQGTLSSVDGSKTSKTY DLAASADWELDIFGKLTNAKRGAKAALEQSEAYRQAVQTQLVATIANSYYTLLMLDKQLD ITRRTAETWDENLRVMKALKRAGQTTEMAVAQTEASKLSVDASVLSIEQQISEMENSLSA LLGMAPQLITRSTLDNQRFPESLSTGVPLQLLHRRPDIRQKEAELAGAFYATNQARSAFY PSITLSGSAGWTNSAGGSITNPGQWLLSAVGSLVQPLFNRGQNIANLKIAKAQQEEALLT FRQSLLDAGTEVNDALVQWQKAQGRLALSKQQVTSLQSAVRSSELLMRHSSQNYLEVLTA RQTLLQAELSVASNRFDEIQGVINLYHALGGGY >gi|225935335|gb|ACGA01000057.1| GENE 166 208686 - 211829 2714 1047 aa, chain - ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 6 1034 7 1030 1044 751 38.0 0 MKIRTFIDRPILAGVISVVFLIIGLIGLSQLPVEQFPEIAPPTVSVSATYTGANAETVQK 
SVVVPLEEALNGVENMMYMTSSSTNNGSARITIYFRQGTDPDMATVNVQNRIATAQGLLP AEVTRSGITVRKRQTSNIKALALYSPDNTFDESFLNNYLKINIEPRLSRIAGVGEVNVMG ADYSLRIWLDPAKMAKYGLVPSDITTVLDEQNLEAPTGTLGAESKNTFQYVLKYRGRYEE EKDYGNLVIRSQAGGEVLRLKDVARIELGASSYTYIGEVNGHPGSNCMIAQTSGSNANEI IKEIDKVTAEIAQGLPKGMELVDLMSSKEFLDASIKNVIKTLIEAILLVVLVVYVFLQSL RSTFIPAISIIVSLVGTFAFLYAAGFSLNMLTLFALVLVIGTVVDDAIVVVEAVQAKFDE GYKSAYRATIDAMGGITSALVTTTFVFMAVFIPVCFMGGTTGTFYTQFGLTMAVAVAISL INALTLSPALCALIMTPHMEAAKGEKLSFSSRFHMAFDSAFHRLVLKYKSGVFFMLKRKW LAGALLLVACAGLFLLMKTTKTGLVPQEDMGTIFVDVRTSPGNSLEETKIVMDEIDKRIH DIPQIRMFSKVTGNGMISGQGASNGMFIVRLKPWDERTEKEDGINAVINEIYRRTDDISS AQIMAFAQPMIPGYGVSSGFEIYIQDQKGGSVEDLLKYTRQMIDALNARPEIGRASTSFD TKYPQYLVEVDAALCKRNGVSPSDVLSTLSGYIGGNYASNMNRFSKLYRVMVQASPEFRL DTEALNNMFVRNSDGEMSPVSQYLTLTRVYGAESLTRFNLFSAISVNGAPADGYSSGQAI QVVREVAEQVLPAGYGFEFGGMSREESSTGNTTTLVFIICVVFIYLILCALYESLFIPIA VILSVPFGLAGSFLFAKMFGLENNIYLQTGLIMLIGLLSKTAILLTEYASERRRQGMTII QAAVSAAQVRLRPILMTSLTMIFGMLPMMFASGVGANGNISIGVGTVGGMLIGTVALLFI VPVLFMVFQYLQEKLMPARKPAELENL >gi|225935335|gb|ACGA01000057.1| GENE 167 211839 - 212942 1172 367 aa, chain - ## HITS:1 COG:XF2093 KEGG:ns NR:ns ## COG: XF2093 COG0845 # Protein_GI_number: 15838684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 7 366 22 385 408 145 30.0 1e-34 MKQLIIFALVCLPLLCACGGSQQKEADATYKTLTVTQSNQILKSDYTATLRGRQYVEIRP QVSGIITGICINEGDPVHKGQTLFIIDQVPYKAALETAVANVKSAEAKLATAKLTADSKA ELYKEQIVSEFDLQTARNEQAAAEAALAQAKAQEVNARNDLSYTEVKSPVNGVASMIPYR VGALVSSSITEPLVTVSDDSEVYAYFSMTENQILDFVQQYGSLKKAIENMDEVELTMSNG KTYSHTGKVDAISGTVDEGTGAVGLRAVFANPDQFLRNGGSGKVVVPTIKEGCIIIPQAA TYELQNRIFVYKVVDGKAKSTPVEVFRLNNGTEYIVETGLAPGDVIIAEGAGLVREGTVI KSDPTKE >gi|225935335|gb|ACGA01000057.1| GENE 168 213173 - 214180 458 335 aa, chain + ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium 
meliloti # 114 312 153 356 383 84 26.0 2e-16 MKRTNIPVYIDLLFCLVIMPPIIMLVPVDKWIVHHTVFMLTLIVYLYGLYFIYRKVKLPM LFMQRKFGHILLLMLSLIFITWLLTHFPYPPEPNNIDPMMHKARQHMRAQTVWFFFLVVT GFSLSIELIFELFRQILSKQEIEAEKNKAELALYKAQINPHFLFNTLNALYGLVLTKSDK TESAFIKFSNILKYMYAQTTSDLISISNEVEYIRQYVDLQSLRLNKHTKVVFETEMDDEQ IQIPPMILITFVENSFKYGVSSDTDCTIFIRIIVKEGQLIFETENMIMKENHQNPHAIGI DNCRKRLELLYPNRFILLTKEENGCFKTYLNIQLR >gi|225935335|gb|ACGA01000057.1| GENE 169 214177 - 214884 535 235 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 4 197 2 202 240 89 31.0 5e-18 MKKLTCIAIDDEPIALLIISQFCERKGGLELTTFSEPLTGLKEIARCKPDLVFLDIEMNS ISGLDIAHALPPESCLIFTTAHAQYALDGFDLDAVDFLHKPFAYERFEKAVEKAIVFIEA RQNKLPENIIIKQEYNNISIPISDILYIEAMENYTKIFRINGNYILSRTSLKSIQEMLPE KAFLRTHRSYIIPINKVERFSKREINLTGCRIVIPIGRQYAENVYATLTARNMTL >gi|225935335|gb|ACGA01000057.1| GENE 170 214916 - 215527 467 203 aa, chain - ## HITS:1 COG:no KEGG:BT_2369 NR:ns ## KEGG: BT_2369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 200 4 203 209 348 84.0 9e-95 MLLAVGTLCLMAGCSSKVSSSKGIVSDATMNTVTIVTDKNDTLSFSTMDANKEEVNGLLL NDTLEVFYTGKYTPGMPATKLVQYPQSPLVGGDRDEHGCIGSAGYVWCEVQKDCIRLFEK GIRTEAVDDSDTSAFIVFSPDSTQLELFFSNNQPNEILERRSLPSGGYAWNVEDDDTKNV RLIDGLWTISQRDKLIYTQKAGS >gi|225935335|gb|ACGA01000057.1| GENE 171 215733 - 216149 275 138 aa, chain + ## HITS:1 COG:no KEGG:BT_2376 NR:ns ## KEGG: BT_2376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 38 175 178 198 72.0 5e-50 MKKTFQIPTPETLEALTGKELYDLWTSLRQLIEQKYNMEQMWNHGGKKWTYEYKYRRGGK TLCALYAKEQTIGFMVILGKDERTKFESMREMFSNAAQKIYDETTTFHDGKWLMFELKDT SLFNDIERLLSIKRKPNR >gi|225935335|gb|ACGA01000057.1| GENE 172 216146 - 217276 545 376 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 
# Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 109 206 7 104 130 65 35.0 1e-10 MIESEINKRYCQSCGMPLRFDVEKYLGTNSDGSRSDEYCYYCLKDGKYIVDIPMSEMINI WIKYTDKYNEYADTAYSPEELRHILNERLPHLKRWKQKLETCNIHHQKIQDIIVYINNHL FDTLDTDMLSTISGLSKYHFRRVFQTVAGENIGSYIQRLRLEHIAHLLVSTEFTLNQISE QTNYQTKFSLAKAFKKHFGVSSSQYREKYKPMYDEQHAVITPEIRSILPMKVFCIEVGEK YKDELRYKLIWNRLTNYAKQHNEEKLNYKFVSLSMDDPSITPMNKCRFYLGVTIDDTEND SQPGVMEVPGGRYAVFRHIGDYSLLHKLYRTIYEEWFPESKYRPQSTFSFEMYMNRPYST LRTELMTDIYIPVTKK >gi|225935335|gb|ACGA01000057.1| GENE 173 217308 - 217778 356 156 aa, chain + ## HITS:1 COG:no KEGG:BT_1189 NR:ns ## KEGG: BT_1189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 234 76.0 5e-61 MGKISEIMLLKQPEQSALAVEVHTDMNGMPEAIGKNFVKIDSLFKAQGEVTTDIPYVEYP DFESLTEHNIKMIIGLKSSRELQGEGAIRSITIPERKIVSCLHKGTYNELAVLYNEMMEW IKNNGYKPSGTSIEYYYSNPNVPEEEQVTRIEMPLL >gi|225935335|gb|ACGA01000057.1| GENE 174 218069 - 219481 885 470 aa, chain + ## HITS:1 COG:PA4132 KEGG:ns NR:ns ## COG: PA4132 COG1167 # Protein_GI_number: 15599327 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pseudomonas aeruginosa # 1 467 1 467 471 333 35.0 4e-91 MKREILYQKIAGTIAWQIKTGIWKAGEKLPSLRTISNEYGVSLNTAIQVYYELEKDGFII SRPKSGYIVNYKPLNLSAPATTQPATQSLGKEEADLIMEVYHSIEDSSITRFSLGIPEDV LLPIAKLNKELIKAMYSLPGNGTRYEDPQGNIRLRNYIARFAYSWNGNLTEDDIVTTTGV TNSISLALSVIARKGDTIAVESPVYFGILQLANSMGLRVLELPTNPVTGIEPDALRKVLP QINLCLLISNFSNPLGSCMPDEHKKEVVQMLAEYNIPLIEDDLYGDVFFGHSRPKPCKAF DKKGLVLWCGSVSKTLAPGYRVGWIAPGKFKDAVIRQKHIHLISTPTLNQEAIANFMENG RYENHLRRLRHELHSNSLHLAQSITDYFPEDTKIITPQGGFMLWVELNKKIDTTELYYKA MQHKISIAPGRMFTLHDQYRNCMRLSFGQQWSPFIEERLQVLGNIIKESF >gi|225935335|gb|ACGA01000057.1| GENE 175 219473 - 220669 830 398 aa, 
chain - ## HITS:1 COG:PAB2227 KEGG:ns NR:ns ## COG: PAB2227 COG1167 # Protein_GI_number: 14520410 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pyrococcus abyssi # 5 392 16 406 410 374 46.0 1e-103 MKLNFAKRMSYIKASEIREILKVTEQEDVISFAGGLPAPELFPIDEINEINQIVLKEAGT KALQYTTTEGYAPLREWIANRMNRRLGTAFDKDNILITHGSQQGLDLSGKVFLDEGDVVL CESPTYLAAISAFKAYGCSFIEIPTDEEGMMMDVLEDVLSNTPHIKLIYAIPTFQNPTGK TWSLERRRKLAELSAKYSVAVIEDNPYGELRFEGEPLPSIKSFDEAGNILCTGSFSKIFC PGFRIGWIAGDKDIIRKYVLVKQGTDLQCNTIAQMTIAEYLKRYDIDKHIAKIVEVYRKR RNVAIESIERYFPDTIKFTRPEGGLFTWIELPVGISAREVLVRSLEKKIAFVPGGSFYPN GNKENTLRINYSNMPEDKIEKGLKTLGEVVKEYVRQSE >gi|225935335|gb|ACGA01000057.1| GENE 176 220919 - 221137 275 72 aa, chain + ## HITS:1 COG:no KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 85 63.0 7e-16 MEEQILIFRTSVKNVRDIKHIAALFTLCPQIYKWNVDMEDWEKVLRIECQGITPTEISKA LRAINIYAQELE >gi|225935335|gb|ACGA01000057.1| GENE 177 221158 - 221619 272 153 aa, chain - ## HITS:1 COG:no KEGG:Desal_1295 NR:ns ## KEGG: Desal_1295 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 13 153 12 152 153 101 39.0 8e-21 MKKVDGEFIKGRIAPCGLHCGKCFAFANGDISYHSGELKKVLGNFDVYAQRFVEMLDEPV FAKYPDFKEFLNHLYMATCQGCRKEKCKLFKTCNVRVCSEEKQVDYCFQCSHFPCENTGF DEHLYKRFVAINRRMQEIGIEKYYEEVKDLPRY >gi|225935335|gb|ACGA01000057.1| GENE 178 221622 - 222578 740 318 aa, chain - ## HITS:1 COG:CC3149_1 KEGG:ns NR:ns ## COG: CC3149_1 COG1846 # Protein_GI_number: 16127379 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 1 205 1 202 203 68 24.0 2e-11 MDFFNQTGKLAIGSRLRMLTDKITVDAAAIYKMFGVDLKPKWFPVFFVLSRGEAKTITSI AKEIGHSHPSVSNIIKEMVAKGLVKETKDKSDGRRNMVMLSAKGKRMSDSFSEQCIDVTA 
AIEQITQQTRNDLWKAIEEWENLLLDKSLLERVEEAKKERESKDIKIVPYEPCYQSDFRT LNEEWITAYFRMEEADYKALDHPQEYILDKGGAILVALYKDEPVGVCALCKMDHPLYDYE LAKLAVSPKVQGKGIGVLLCEAVVNKAKELGGKRIFIESNTRLKSAIHIYHKLGFKELPE YHTTYERGDIQMELMIEE >gi|225935335|gb|ACGA01000057.1| GENE 179 222727 - 223938 802 403 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 2 379 49 428 448 238 37.0 1e-62 MALAGVGVTTSLVILISSFSAIVGGGGAPLAAIALGQGDRSRAGKILGNGFILLILFTLF TSLIAYTFMEPILLFTGASENTLEYAVNYLSIYLLGTIFVEISTGLNSFINAQGRPAIAM YSVLIGALLNIVLDPIFIFWLDMGVKGAALATVLSQACSAVWVLTFLFSRHASLPLEKRY MALNREIILSIFALGVSPFIMASTESLVGFVLNSSLKDFGDIYVSALTILQSAMQFASVP LTGFAQGFVPIVSYNYGHGDKQRVKDCFRVALITMFSFNLLLMLFMILFPSTVASAFTSD ERLIETVRWTMPVFLGGMTIFGLQRACQNMFVALGQAKISIFIALLRKAILLTPLALILP HFMGVAGVYAAEAISDATAAICCTLLFFWQFPKILGKIQGSTL >gi|225935335|gb|ACGA01000057.1| GENE 180 224187 - 224921 303 244 aa, chain - ## HITS:1 COG:no KEGG:BT_2379 NR:ns ## KEGG: BT_2379 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 244 15 244 244 419 84.0 1e-116 MKKLIIVFVLLLSALSCFSQIEFLTCLFDASRNRVIPLAVYQPHKVNSKTKVIIFSHGYD GNKNNKSNQTYAYLTRFLSQKGFYVISIQHELADDPLLAMEGNFMETRMPNWERGVANIL FTIQEFKKLKPQLNWNDFILIGHSNGGDMTMLFATRYPQLINKAISMDHRRMIMPRTRNP RLYTLRGCDYDADSGVLPTEKEQEQFRMKVVKLDGITHSNMGENGTEEQHNLINQYIYGF LTRR >gi|225935335|gb|ACGA01000057.1| GENE 181 225002 - 225532 218 176 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229878290|ref|ZP_04497790.1| acetyltransferase, ribosomal protein N-acetylase [Slackia heliotrinireducens DSM 20476] # 9 156 11 158 181 88 30 3e-16 MKFIIEPGTSADIDELEKLYDELNDYLAVTTNYPGWIKGVYPVREDAVAGVNDETLYVAR YDGRIVGSVILNHQPEKAYENVKWKIELDYSHIFVIHTFVVHPSFLKIGVGHALMDYSLE LAQSSGIKSVRLDVYEKNLPAISLYEKYGFEYIDTVDLGLGHYGLEWFKLYEKIIL >gi|225935335|gb|ACGA01000057.1| GENE 182 225546 - 226172 335 208 aa, chain - ## HITS:1 COG:CAC0777 KEGG:ns 
NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 3 206 7 210 210 284 67.0 7e-77 MNSKTYPRTNDFQTVYLNTVIKNPAIMVGDYTIYNDFVNDPVLFEKNNVLYHYPINKDRL IIGKFCSIACGAKFLFNSANHTLGSLSNYTFPIFFEEWNLDKGDVTSAWDNKGDIVIGND VWIGYEAVIMAGVHIGDGAVIASRAVVTKDVPPYTIVGGTPAKEIRKRFDENTIVQLQKL QWWDWSIEKISECIPYITGGKIEELIKR >gi|225935335|gb|ACGA01000057.1| GENE 183 226244 - 226714 429 156 aa, chain - ## HITS:1 COG:VCA1068 KEGG:ns NR:ns ## COG: VCA1068 COG1522 # Protein_GI_number: 15601819 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 7 155 3 151 160 114 40.0 1e-25 MDTFDKLDRVDLQILRTLQENARLTTKELAARVSLSSTPVFERLKRLESGGYIKKYIAVL DAEKLNQGFVVFCSVKLRRLNRDIAAEFTRIIQDIPEVTECYNISGSYDYLLKIHAPNMK YYQEFILNVLGTIDSLGSLESTFVMAEVKHQYGIHI >gi|225935335|gb|ACGA01000057.1| GENE 184 226893 - 228179 1147 428 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 1 428 1 426 426 528 62.0 1e-150 MATKNLHFETLQVHVGQEQADPATDARAVPIYQTTSYVFHNSAHAAARFGLQDPGNIYGR LTNSTQGVFEQRVAALEGGVAGLAVASGAAAITYAFENITRAGDHIVAAKTIYGGSYNLL AHTLPSYGITTTFVDPSDLSNFEKAIQENTKAVFIETLGNPNSNIIDIEAVAEIAHRHKI PLIIDNTFGTPYLIRPIEHGADIVVHSATKFIGGHGSSLGGVIVDSGKFDWVASGKFPQL TEPDPCYHGVRFVDAAGPAAYAIRIRAILLRDTGATISPFNAFILLQGLETLSLRVERHV ENALKVVNFLNNHPKVKKVNHPSLSNHPDHALYQRYFPNGAGSIFTFEVKGGQEEAHRFI DSLEIFSLLANVADVKSLVIHPASTTHSQLNAQELAEQEIYPGTVRLSIGTEHIDDLIAD LDQALAKI >gi|225935335|gb|ACGA01000057.1| GENE 185 228248 - 228487 331 79 aa, chain + ## HITS:1 COG:no KEGG:BT_2388 NR:ns ## KEGG: BT_2388 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 129 94.0 3e-29 MEKYLIHSNELHLIDAEKIHQAVEKMVESLDLAAGSTTNFDLYQVVENYFKDLEKRRKIN 
HVLGIKEDRYELAEDFGIK >gi|225935335|gb|ACGA01000057.1| GENE 186 228590 - 229498 872 302 aa, chain - ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 44 299 28 282 287 213 40.0 4e-55 MLLLLQATQAADSVQVAADKLMEEAIANADGLDKLSLITQQLIDFGIRAGERILIAVIVF IVGRFLISLLNKFIRRLLDKRKVDVSIKTFVRSLVNILLTILLIISVVGALGVETTSFAA LLASAGVAVGMALSGNLQNFAGGLVILLFKPYKVGDWIETQQGSAGTVKEIQIFHTILTT SDNKLIYIPNGSLSSGVVTNYSHQETRRVEWIIGIDYGEDYNKVQQIVRDILAEDKRILN EPAPFIALHALDASSVNVVVRVWVNSGDYWGVYFDTNKAIYETFNEKGINFPFPQLTVHQ GN >gi|225935335|gb|ACGA01000057.1| GENE 187 229726 - 231975 1950 749 aa, chain + ## HITS:1 COG:no KEGG:BT_2390 NR:ns ## KEGG: BT_2390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 749 8 743 743 1262 88.0 0 MKRIVWMAVALLGAGITVHAQTSAKDSMRVVNLQEVQVVSTRATAKTPVAFTNIGKAELK KVNFGQDIPYLLSMTPSTLTTSDAGAGIGYTTLRVRGTDGTRINITVNGIPMNDAESHNL FWVNMPDFSSSVKDMQVQRGAGTSTNGAGAFGASVNMQTEGASMKPYAEFNGSYGSFNTH KETVKVGTGLLNNHWTFDARLSNIGTDGYIDRASVDLNSYYLQGGYFAENTSVKLIAFAG KEKTYHAWGYATKEQMEEFGRRYNPCGEMYTDANGKKHYYDDQTDNYLQKNYQLLFNHTF STAWNLNVALHYTKGDGYYEEYKDGRYLIEYGLKPFTIDGTEVAKSDLVRQKKMDNKFGG GVFSLNYTANRLSASLGGGLNQYRGNNFGRVPWVKNYVGSLSPDHEYYRNKSKKTDGNIY LKANYDLTRGLSAYADLQYRHINYTIDGNNDKYDWSKNALRPLAVDKKFDFFNPKVGLNW NITSNHRVYASFSVAQKEPTRNNYTDGDPDSYPKAEKLLDYEAGYTFANQWLTAGANFYY MDYTDQLVLTGALNDIGEALTENVQDSYRMGIEIMLGIKPCKWFQWDINATWSKNRIQDF VESLPGYHYNNDGSSTSLPTIQIKHKDTHIAFSPDFLLNNRFSFNYKGFEAALQSQFVSK QYMTNAEVEELTLDKYFVSNLNLAYSFRPKKVLKEVTVGFTVYNLFNEKYENNGWASSDY TDTLENRGNYAGYAAQAGTNVMGHVSFRF >gi|225935335|gb|ACGA01000057.1| GENE 188 231988 - 232596 500 202 aa, chain + ## HITS:1 COG:PA1958 KEGG:ns NR:ns ## COG: PA1958 COG3201 # Protein_GI_number: 15597154 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pseudomonas aeruginosa # 6 
197 4 181 191 92 28.0 4e-19 MELNFLEIFGTIVGLIYLWLEYRASIYLWIAGIVMPAIYIFVYYKAGLYADFGINIYYLI AAIYGWFFWMWGHRKKKGQQSANISTNNNPKDLPIVHTPWKCYLPLFLVFIVAFIGIAWI LIEYTDSNVPWLDSFTTALSIVGMWMLARKYVEQWFAWILVDIVCCGLYIYKDLYFTSAL YGLYSIIAIFGYFKWKKLMNIQ >gi|225935335|gb|ACGA01000057.1| GENE 189 232593 - 233216 585 207 aa, chain + ## HITS:1 COG:HP1291 KEGG:ns NR:ns ## COG: HP1291 COG1564 # Protein_GI_number: 15645904 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Helicobacter pylori 26695 # 9 205 2 199 204 123 36.0 2e-28 MINEHYTPEAVILANGEYPTHVLPLKMLEEAKFVICCDGAANEYISCGHTPDVIIGDGDS LSPENKTRFSDIIHHVTDQETNDQTKAVRFLQEKGYRKIAIVGATGKREDHTLGNISLLL DYMKNGMKVRTITDYGVFIPASDTQTFASHPGQQISIINFGAKGLKGEGLIYPLSDFTNW WQGTLNEAITDEFTIHCTGEYLIFLAY >gi|225935335|gb|ACGA01000057.1| GENE 190 233313 - 234497 734 394 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174345|ref|ZP_05760757.1| ## NR: gi|260174345|ref|ZP_05760757.1| hypothetical protein BacD2_20986 [Bacteroides sp. D2] # 17 394 1 378 378 701 100.0 0 MKVTYVIARIKMFLAIMLCLFAISSCISDGDETVVLETGNVQARKMLFGEWKLSSKTVVD EDGIEQSKEDISEEPNLEFTEDGNCKLTYPNGNSVNQKWDLSGDFYSIFISDIRYEIYTL GKNILVLVINYEDYYLKFVYYKLSSPEQNDGEESEIGGTDDNNPYKPYSARFKVTKIVVD DKDTYTFEYDGRGRIMRYKTPSKTYTFTYDDTKAYLWLNGKIVNTGFVGSNGYLTKMWNG TSESGGVSTFVYITDMDYYMNYLNRVEYKGTGAIQYWKPAYSDGNMTFLYNDGSHNFTYT SSDNNFSVDLNGFISAMYQWEWFMHDAEVIWGLFDFYGVRSSDMVYQETTSQWNNTFSYY YGERGDDFTLRITQVRKGLNTNFTKIYKVYATNN >gi|225935335|gb|ACGA01000057.1| GENE 191 234512 - 235675 234 387 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 217 383 181 343 347 94 35 4e-18 MAQSGDKSERVRLGVKGGMHLSNMHYSNLDQYSSDWISNGVGGIFAEFDLGRNRKFSVRP EILFLSRGTKIGDSNISYKLKTKYTDFRLPFIFNFTNPSKVGPYVYVAPVLGIGRGGEVN YTEFHDGEAYEWEPLDINDANFSKVNFSVAAGLGIRIPIKVAAGKKIHLALEANYQYGIT DTYGSKEKDGEAIAVNRAIYDITGTRKHQDIEITAAVSVPLSIFKKMPKKEKPVPVVEKP IVVEEEEVPVIIEEKKPCYTLEEILDLVVAQKTIAGKTICAVDVINFEFGKSAINKNSYA 
YLDKIAMLMQRTNLSVEIKGHTDNVGKEDFNMELSRKRAKAVYDYLIKKGVGSSKLSYTY YGMTKPIASNDTEEGRLINRRVEFEIK >gi|225935335|gb|ACGA01000057.1| GENE 192 235659 - 235883 84 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGSRAIGVGLLSILCYWVFSGPSLTEWHAVPRFRFVNSALRDCGQNKHIFYESKEFIIF FPVLFLFFSDGSIG >gi|225935335|gb|ACGA01000057.1| GENE 193 236671 - 237972 1346 433 aa, chain - ## HITS:1 COG:PM0115 KEGG:ns NR:ns ## COG: PM0115 COG0498 # Protein_GI_number: 15601980 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Pasteurella multocida # 1 432 1 424 424 370 44.0 1e-102 MKYYSTNKQAPVASLQEAVVKGLAADKGLFMPMSIKPLPQDFYDTIDSLSFQEIAYRVAD AFFGEDIPADTLKQIVYDTLSFDVPLVKVADNIYSLELFHGPTLAFKDVGGRFMARLLGY FIKKEGQKNVNVLVATSGDTGSAVANGFLGVDGIHVYVLYPKGKVSEIQEKQFTTLGQNI TALEVDGTFDDCQALVKAAFMDKELNEHLSLTSANSINVARFLPQAFYYFYAYAQLKRAG KADNVVICVPSGNFGNITAGLFGKKMGLPVKRFIAANNRNDIFYQYLQTGKYNPRPSIAT IANAMDVGDPSNFARVLDLYNGSHAAISVEISGTTYTDEQIRETVKETWKEHHYLLDPHG ACGYRALVEGLKEGEAGVFLETAHPAKFLETVESIIGESVEIPAKLQEFMKGEKKSLQMT KDFADFKSYLLSL >gi|225935335|gb|ACGA01000057.1| GENE 194 237986 - 239194 1121 402 aa, chain - ## HITS:1 COG:MA0132 KEGG:ns NR:ns ## COG: MA0132 COG3635 # Protein_GI_number: 20089031 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted phosphoglycerate mutase, AP superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 402 1 396 397 380 49.0 1e-105 MKHIIILGDGMADWPVKSLGDKTLLQYAKTPYMDKLARMGRNGRLITVAEGFHPGSEVAN MSVLGYNLPKVYEGRGPLEAASIGVDLKPGEMAMRCNLICVEGEILKNHSSGHISTEEAD VLIQYLQEKLGNDRVRFHTGVQYRHLLVIKGGNKELDCTPPHDVPLKPFRPLMVKPLVPE AQETADLINDLILKSQELLKDHPLNLKRMAEGKDPANSIWPWSPGYRPQMPTFSETFPQV KKGAVISAVDLINGIGYYADLRRIAVEGATGLYNTNYENKVAAALEALKTDDFVYLHIEA SDEAGHEGDIDLKLLTIENLDKRAVGPIYEAVKDWDEPVAIAVLPDHPTPCELRTHTSDP IPFLIWYPGIEPDEVQTYDEIAACNGSYGLLKEDEFIKEFMK >gi|225935335|gb|ACGA01000057.1| GENE 195 239294 - 241729 2460 811 aa, chain - ## HITS:1 COG:MJ0571 KEGG:ns NR:ns ## COG: MJ0571 COG0527 # Protein_GI_number: 15668751 # Func_class: E Amino acid 
transport and metabolism # Function: Aspartokinases # Organism: Methanococcus jannaschii # 3 454 4 467 473 273 38.0 8e-73 MKVMKFGGTSVGSVNSILSVKRIVESANEPVIVVVSALGGITDKLINTSKMAAVGDSAYE GEFREIVYRHVEMIKEVIPAGEKQVSLQRQIGELLNELKDIFQGIYLIKDLSPKTSDTIV SYGERLSSIIVTALIEGAKWFDSRTFIKTERKHSKHTLDTELTNKLVKEAFQSIPKVALV PGFISSDKTTGDVTNLGRGGSDYTAAIIAAALDAASLEIWTDVDGFMTADPRVISTAYTI TELSYVEATELCNFGAKVVYPPTIYPVCHKNIPIIIKNTFNPDGVGTVIKQEVSNPQSKA IKGISSINDTSLITVQGLGMVGVIGVNYRIFKALAKNGISVFLVSQASSENSTSIGVRNA DADLACEVLNEEFAKEIEMGEISPILAERNLATVAIVGENMKHTPGIAGKLFGTLGRNGI NVIACAQGASETNISFVVDSKSLRKSLNVIHDSFFLSEYQVLNLFICGVGTVGGSLVEQI RCQQQKLMMENGLKLHVVGIIDAAKAMFSREGFDLANFREELLEKGKDSSLQTIRDEIIG MNIFNSVFVDCTASADIASLYKDFLQHNISVVAANKIAASSAYENYRELKTIARQRGVKY LFETNVGAGLPIINTINDLIHSGDKILKIEAVLSGTLNYIFNKISADIPFSRTIKMAQEE RYSEPDPRIDLSGKDVIRKLVILAREAGYRLEQEDVEKNLFVPNDFFEGSLEDFWKRVPS LDADFEARRQVLEKENKHWRFVAKLENGKASVGLQEVGANHPFYGLEGSNNIILLTTERY KEYPMMIQGYGAGAGVTAAGVFADIMSIANV >gi|225935335|gb|ACGA01000057.1| GENE 196 242131 - 243168 914 345 aa, chain + ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 7 344 5 335 338 271 45.0 1e-72 MSPLNTSVLLIYTGGTIGMIENAATGALENFNFEQLQKYIPELQKFNFPIDTYQFDPPMD SSDMEPDMWRKLVRIIHDNYNRYHGFVILHGTDTMAYTASALSFMLEGLDKPVILTGSQL PIGVLRTDGKENLMTSIEIAVAQNKEGRALVPEVCIFFENHLMRGNRTTKMNAENFNAFR SFNYPVLAEAGIHIKYNNVQIHVNGEERELKPHYLLDTNVVVLKLFPGIQENVIAAILGI DGLKAVVLETYGSGNAPRKEWFIRRLCQASERGIVIVNVTQCSAGMVEMERYETGYQLLQ AGVVSGYDSTTESAVTKLMFLLGHGYTADEVRDRMNRSMAGEITL >gi|225935335|gb|ACGA01000057.1| GENE 197 243272 - 244471 822 399 aa, chain + ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima 
# 31 399 37 387 387 112 26.0 1e-24 MDTLIRRYKRLLTATSTTYIRSLMNTINWDNRLIAIRGARGVGKTTLMLQYLKLHYANDS QSALYTSLDSLYFTQHSLSELAEQFYLKGGKCLFLDEVHKYPSWSKEIKNIYDEFPELKI VFTGSSLLQLLNAEADLSRRCISYNMQGLSYREYLNLYHQIDIRPYTLEEILDNSDGICN EVNSQCRPLAYFEDYLKHGYYPFYLEGNAEYYTRIENIANLILEIELPQQCGVDISNVRK LKSLLGILSSEVPFMVDITKLSAMAELSRTTILAYLQYLDRAKLIHLLYSDNDSIKKLQK PDKIYMENTNLLYALTFKDVNKGTLREVFMVNQLAYQHRVEYCTRSADYTIDSKYTIEVG GKSKDGKQIANTKQAFIAADDIEYSAGNKIPLWAFGFLY >gi|225935335|gb|ACGA01000057.1| GENE 198 244483 - 245850 1304 455 aa, chain + ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 455 1 457 458 483 51.0 1e-136 MAKEKTVYVCSNCGQDSPKWVGKCPSCGEWNTYVEEIVRKEPTNRRPVSGIETQKPKPLA LSDIEADDEPRINMHDDELNRVLGGGLVQGSLVLIGGEPGIGKSTLVMQTVLHMPEKKIL YVSGEESARQLKLRADRLSDTSSDCLIVCETSLEQIYVHIKNTNPDLVIIDSIQTISTES IESSPGSIAQVRECSASILRFAKETHTPVLLIGHINKEGSIAGPKVLEHIVDTVLQFEGD QHYMYRILRSIKNRFGSTAELGIYEMRQDGLRQVNNPSELLLSQDHEGMSGVAIASAIEG IRPFLIETQALVSSAVYGNPQRSATGFDLRRMNMLLAVLEKRVGFKLAQKDVFLNIAGGL KVNDPAIDLPVISAILSSNMDAAIEPEVCMAGEIGLSGEIRPVNRIEQRIGEAEKLGFKR FLLPKYNLQGIDTQKLKIELVPVRKVEEAFRALFG >gi|225935335|gb|ACGA01000057.1| GENE 199 246476 - 247558 373 360 aa, chain + ## HITS:1 COG:no KEGG:Ping_1180 NR:ns ## KEGG: Ping_1180 # Name: not_defined # Def: hypothetical protein # Organism: P.ingrahamii # Pathway: not_defined # 1 360 1 360 362 138 35.0 4e-31 MTDEELQSKSIDFLRFPLIVGVVLIHAHFSNVIMNGVNVNILHEYSCPIYDTTSYFFSEL IGRIAVPLFFFISGFLFFYRSKEFSLSVYRHKLKNRGRSILLPYLFWNLMIISYSILIQT IPGLSSSTNQLMDMYSLTDWLDAFWSINGGCPICYQFWFLRDLIVMILFSPLIYILVKYF RWFSVFVLGVLWYFNWWFSLPGFSIAAFFFFSAGAYFSVQNCNFVALLKPCLFPSAMFYS LLALVCIYFREQPWINFIHSLSILTGIVLAISLSAHFIERRMWTANTFLVGGAFFIYAFH GIPLGVILKYTLKYLPVSNDMIFLSIYFLSAGFIILTSLGIYSLLKKWLPRFLSVVTGGR >gi|225935335|gb|ACGA01000057.1| GENE 200 247637 - 249289 1276 550 aa, chain + ## HITS:1 COG:L195271 
KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 548 1 528 535 352 39.0 2e-96 MIQEYQLRILPEIAANEQRLKEYLSKEKGLNLRDITATRILKRSIDARQRTIFVNLKVRV YIREMPKDDEYEHTIYNKVEGKPQVIVVGAGPGGLFAALRLIELGLRPVVIERGKDVRER KKDLAQISREHTVDPESNYSFGEGGAGAYSDGKLYTRSKKRGNVDKILNVFCQHGASTSI LVDAHPHIGTDKLPRVIENMRNTIIECGGEVHFQTRMDALIIENDEVKGIETNTRKTFLG PVILATGHSARDVYRWLAANHVAIEAKGIAVGVRLEHPAILIDQIQYHNKNGRGKYLPAA EYSFVTQAEGRGVYSFCMCPGGFIVPAASGPEQVVVNGMSPANRGSRWSNSGMVVEIQPE DLGNEELKMRNKELAAQQDEQLMALNPNLKSSQLSEINSQLLSVLHFQEELERQCWLQGG RRQTAPAQRMLDFTRKKLSYDLPESSYSPGLISSPLHFWMPSFISKRLSLGFQQFGRSSH GFLTNEAVMIGVETRTSSPVRIIRDKDTLQHVTVRGLFPCGEGAGYAGGIVSAGVDGERC AEAAANYFNH >gi|225935335|gb|ACGA01000057.1| GENE 201 249306 - 249908 543 200 aa, chain + ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 129 192 143 206 213 68 57.0 9e-12 MNYQPEIAIVEANTLTCLGLKGILEEMIPMATIRTFHHFSELMDDTPDMYAHYFISAQIY VEHNAFFLPRKRKTIVLASDSPQFQLSGVPVLNIHESEEELVKNILKLHQHAHHDGYPVK DMPPMPPMQPHQEILSAREIEVLVLITKGLINKEIADKLNISLTTVITHRKNIMEKLGIK SVSGLTIYAVMNGYIEADRI >gi|225935335|gb|ACGA01000057.1| GENE 202 249999 - 252446 2191 815 aa, chain + ## HITS:1 COG:YPO1011 KEGG:ns NR:ns ## COG: YPO1011 COG1629 # Protein_GI_number: 16121312 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 3 815 6 690 690 153 23.0 1e-36 MKRKNFMIALTLLTTGTAWAEDFPKDSLKIVDIEEVVVIATPKENRKLRELPVAATVLSQ DNMRANQVNSVKNLTGIVPNLFIPDYGSKLTTSIYIRGIGSRINTPSVGLYVDNIPYIDK SAFDFNYCDIERIDVLRGPQGTLYGRNTMGGLIKVHTKSPFTYQGTDMRMGAATYNNYNV SLTHYHRISDRFAFSTGGFYEHTGGFFENSARNNEKIDKSNAGGGRFRGIYMPTSNLKID 
MTLSYEYNDQGGYPYYYTGITQNGIAEAKSKGKELTEDRADYIGKISYNERSSYRRGLMN AGVNLEYQAQNFILSAVTGYQNLNDRMFLDQDFTEKAIYTLEQKQKSNTISEEIVLKSKA NKRWQWATGIFGLYQTLNTKGPVTFWKDGVKNVIEGNANNIFAGMGPTAPELHLTVNNPT ILVGGNFDTPIWNAAIFHQSTLNDLFVEGLSLTVGLRLDYEKMNMKYNSISDPTDFNFSM KMAAMPPRPPMALEAKNLIANAGYDGKISDDYVQLLPKFALQYEWKKGNNVYATVSKGYR SGGYNVQMFSDLISGNLQNSMIDAIKESTEFAAAATMIDKYAKKANVPEVKEATRYKPEY SWNYEVGTHLTLWEGKLWADLSAFYMDMHDQQISQFAASGLGRTTLNAGKSHSYGAEASL RASLTSELSLNASYGYTYATFTDYVEYEKDTEGQLSVKADYNGKYVPFVPKHTLNIGGEY AITCSPRSIFDRVVFQANYNAAGRIYWTEQNDVSQSFYGTLNWRTNLEIGDAMISFWARN FLNKDYAAFYFETMNKGFIQKGRPMQLGVDLRCRF >gi|225935335|gb|ACGA01000057.1| GENE 203 252969 - 253484 423 171 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 159 1 164 167 97 31.0 2e-20 MEIRSTEIKDLPLVMEIYDYARAFMRTTGNTTQWIDGYPSEALIRQEIEEGHSFVCTGEQ GEIFGTFCFILGEDPTYLQIYEGAWLNDEPYGVIHRLATNGKQKGVSGACLDWCFERWPN VRVDTHRDNKVMQHILTKYGFQRCGIIYVKNGTERIAYQMKRLPDGTEKLA >gi|225935335|gb|ACGA01000057.1| GENE 204 253524 - 254669 550 381 aa, chain - ## HITS:1 COG:no KEGG:BT_2411 NR:ns ## KEGG: BT_2411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 66 381 29 344 344 560 87.0 1e-158 MRIINHGIIRHRGRRGDNVSTHPSGTARLNGLWKRDGKFFFMFSVSSMSSVSYSNSVFSA FSVVKSVMSLLLCLFLLSSCYYKTPALDSEELSKKTKDSLTYLYERHYTWNTNLEVVDDS ISLECLPIKDTFIQLNRGDRVVVAEFAVHPADSVDSIWVKLAHTQDEQGWIREKELKKSF VPTDSISQAIHLFSDTHASYFVVIFALFVGAYLFRAFRRKQLQMVYFNDIDSVYPLFLCL LMAFSATIYESMQVFVPETWEHFYFNPTLSPFKVPFILSVFLLSIWLFLIVALAVLDDLF RQLTPAAAIFYLLGLMSCCIFCYFFFILMTHIYIGYLFLAFFILVFAKKVYRNIGYKYRC GHCGEKLKEKGLCPHCGAINE >gi|225935335|gb|ACGA01000057.1| GENE 205 254666 - 256495 1531 609 aa, chain - ## HITS:1 COG:ZydcP KEGG:ns NR:ns ## COG: ZydcP COG0826 # Protein_GI_number: 15801708 # Func_class: O Posttranslational 
modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 EDL933 # 2 605 17 632 667 530 46.0 1e-150 MIKQRKIELLAPAKNLECGIEAINHGADAVYIGAPKFGARAAAVNSLEDIEALVQHAHLY NARIYVTVNTILKEEELKETEEMIHALYRIGVDALIVQDMGITKLNLPPIPLHASTQMDN RTPEKVKFLWEAGFRQVVLARELSLREIKKIHESCPEVPLEVFVHGALCVSYSGQCYVSQ ACFGRSANRGECAQFCRLPFSLVDADGKVIVKDKHLLSLKDMNQSDELEQLLDAGASSFK IEGRLKDVSYVKNVTAAYRQKLDAIFARRPEYVRASSGTCSFEFKPQLDKSFSRGFTHYF LNGRDKEIFSFDTPKSLGEEMGTMKEARGNYLTVAGLKSFNNGDGVCYIDEQGRLQGFRI NRVDSNKLYPQEMPRIKPRTTLYRNFDQEFERILSRKSAERKIAVKILLADNNFGFSLTL TDEDDNSVTITLPREKELARTPQTDNLKTQLSKLGNTPFEAKEIEISFTGNWFLPASVLA DFRRQAIDQLIIARRINYRQELAVWKPTSHAFPQTTLTYLGNVMNTRAASFYQEHGVQQV AAAYEKEAVEDAVLMFCKHCLRYSMGWCPIHQRVRSPYKEPYYLVSNDGKRFRLEFDCKN CQMKVKAAQ >gi|225935335|gb|ACGA01000057.1| GENE 206 256492 - 257409 862 305 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 387 58.0 1e-107 MPLNLPDKLPAIELLKEENIFVIDTSRATQQDIRPLRIVILNLMPLKITTETDLVRLLSN TPLQVEISFMKIKSHTSKNTPIEHMKTFYTDFDQMRHEKYDGMIITGAPVEQMDFEEVTY WDEITEIFDWARTHVTSTLYICWAAQAGLYHHYGIPKYPLEEKMFGIFEHRVLEPFHSIF RGFDDCFYVPHSRHTEVRREDILKVPELTLLSESKEAGVYMAMARGGREFFVTGHSEYSP LTLDTEYRRDLDKGLPIEIPRNYYIDDDPEKGPLVRWRAHANLLFSNWLNYFVYQETPYN INDIK >gi|225935335|gb|ACGA01000057.1| GENE 207 257590 - 257760 202 56 aa, chain + ## HITS:1 COG:no KEGG:BT_2414 NR:ns ## KEGG: BT_2414 # Name: not_defined # Def: ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 21 76 76 88 100.0 6e-17 MAYVISDDCIACGTCIDECPVEAISEGDIYSINPDVCTDCGTCADVCPSEAIHPAE >gi|225935335|gb|ACGA01000057.1| GENE 208 257984 - 259177 1353 397 aa, chain - ## HITS:1 COG:BMEI0516 KEGG:ns NR:ns ## COG: BMEI0516 COG0436 # Protein_GI_number: 17986799 # Func_class: E Amino acid transport and metabolism # Function: 
Aspartate/tyrosine/aromatic aminotransferase # Organism: Brucella melitensis # 1 397 22 421 421 379 49.0 1e-105 MNQLSDRLNSLSPSATLAMSQKSNELKAQGIDVINLSVGEPDFNTPDHIKEAAKKAVDDN FSRYSPVPGYPALRNAIVEKLKKENGLEYTAAQISCANGAKQSVCNTILVLVNPGDEVIV PAPYWVSYPEMVKMAEGTPVIVSAGIEQDFKITPEQLEAAITPKTKALILCSPSNPTGSV YTKEELAGLSAVLAKHPQVIVIADEIYEHINYIGAHQSIAQFPEMKERTVIVNGVSKAYA MTGWRIGFIAGPEWIVKACNKLQGQYTSGPCSVSQKAAEAAYTGTQEPVKEMQKAFERRR DLIVKLAKEVPGFEVNVPQGAFYLFPKCDAFFGKSNGERKIADSDDLAMYLLEEAHVACV GGASFGAPECIRMSYATSDENIVEAIRRIKEALAKLK >gi|225935335|gb|ACGA01000057.1| GENE 209 259260 - 260477 1278 405 aa, chain - ## HITS:1 COG:BH1556_2 KEGG:ns NR:ns ## COG: BH1556_2 COG0807 # Protein_GI_number: 15614119 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Bacillus halodurans # 210 403 1 194 197 236 59.0 8e-62 MKETIKMDRIEDAIADFKEGKFVIVVDDEDRENEGDLIIAAEKITPEKVNFMLKHARGVL CAPVTVSRCKELDLPHQVSDNTSVLGTPFTVTIDKLEGCTTGVSASDRAATIQALADPTS TPATFGRPGHINPLYAQEKGVLRRAGHTEATIDMARLAGLYPAGALMEIMSEDGTMARLP ELRQMADEHGLKLISIHDLIVYRLKQESIVEKGVEVNMPTEHGMFRLIPFRQKSNGLEHM AIFKGTWSEDEPILVRVHSSCATGDILGSQRCDCGEQLHKAMEMIEKEGKGVVVYLNQEG RGIGLMEKMKAYKLQEDGMDTVDANICLGHLADERDYGVGAQILRELGVHKMRLLTNNPV KRVGLEAYGLEIVENVPVETVPNPYNERYLRTKKERMGHTLHFNK >gi|225935335|gb|ACGA01000057.1| GENE 210 260483 - 262333 1310 616 aa, chain - ## HITS:1 COG:alr4069 KEGG:ns NR:ns ## COG: alr4069 COG0795 # Protein_GI_number: 17231561 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Nostoc sp. 
PCC 7120 # 19 153 30 164 371 63 27.0 9e-10 MLFIGTFFICLFIFMMQFLWRYVDELVGKGLEMSVMAQFFFYSALTLVPVSLPLAVLLAS LITFGNFGERYELLAMKAAGISLLKIMRPLAFFVCGLVGVSFYFQNVVGPIAQAKLGTLI LSMKQKSPELDIPEGVFYSEIKDYNLKVAKKNRKTGMLYDVLIYNMKDGFENAHIIYADS GRLEMTADKQHLWLHLYSGDLFENLKAQSLKSQNVPYRRESFREKHTIIEFNSDFNMVDG DIMGKQSSAKDMAQLQSSIDSMTVVGDSIGRQYYREVAEGNFRASYGLTKEDTVKIEKAD IHEYNVDSLYEVASLTQKQKVLSSAVSRAENVANDLGFKKFTMENNDYSIRKHKTEWHKK ITISLSCLLFFFIGAPLGGIIRKGGLGMPVIVSVLVFIIYYIIDNTGYKMARDGKWIVWM GMWTSSAVLAPLGVFLTYKSNKDSVVLNADAYINWFKKIMGIRSVRHIFKKEVIIHDPDY PRLTGDLEQLTAECRAYAAKKRLEKAPNYFKLWMSSEDDNEVMAINEKLEALVEEMSNTK SATLIGALNNYPVISVSAHVRPFHIYWLNLVAGVIFPIGLFFYFRIWAFRVRLAKDMERI IKNNEQIQFIIQKINK >gi|225935335|gb|ACGA01000057.1| GENE 211 262458 - 262844 430 128 aa, chain - ## HITS:1 COG:no KEGG:BT_2418 NR:ns ## KEGG: BT_2418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 128 128 228 96.0 6e-59 MKKEKIHLEYLLNATSKNILWGAISTPTGLEDWFADKVISDDKIVEFHWGKTEQRKAEII AIRSFSFIRFRWQDDENERDYFEIKMTYNELTSDYVLEITDFAEPDEVADMKELWESQVA KLRRTCGF >gi|225935335|gb|ACGA01000057.1| GENE 212 263322 - 264626 1400 434 aa, chain - ## HITS:1 COG:BH0607_2 KEGG:ns NR:ns ## COG: BH0607_2 COG0519 # Protein_GI_number: 15613170 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Bacillus halodurans # 121 434 1 315 315 410 61.0 1e-114 MKQDMIVILDLGSHENTVLARAIRALGVYSEIYPHDITVEELKALPNVKGIIINGGPNNV IDGVAIDINPAIYTLGIPVMAAGHDKATCEVKLAEFTDDIEAIKAAVKSFVFDTCKAEAN WNMKNFVNDQIELIKRQVGDKKVLLALSGGVDSSVVAALLLKAIGNNLVCVHVNHGLMRK GESEDVVEVFNQLKANLVYVDVTDRFLNKLAGVEDPEQKRKIIGGEFIRVFEEEARKLNG IDFLGQGTIYPDIVESGTKTAKMVKSHHNVGGLPEDLKFQLVEPLRQLFKDEVRACGLEL GLPYEMVYRQPFPGPGLGVRCLGAITRDRLEAVRESDAILREEFQLAGLDKKVWQYFTVV PDFKSVGVRDNARSFDWPVIIRAVNTVDAMTATIEPIDWPILMKITDRILKEVKNVNRVC YDMSPKPNATIEWE >gi|225935335|gb|ACGA01000057.1| GENE 213 265001 - 265582 309 193 aa, chain + ## HITS:1 COG:PA3899 KEGG:ns NR:ns ## COG: PA3899 COG1595 # 
Protein_GI_number: 15599094 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 23 178 15 165 169 62 32.0 6e-10 MNEEKSSFIKGINEQHPTAYHQLYNEYYKALVLYAINFLSSQQAAEDIVQDLFATMWEKK MRFLSLPSFRTYLYNSIRNASLNYLKHQNVESLYLERLASTYREITEEEDTNEEEIYRLL FRAIDNLPTRCREIFLLHMDGKKNEEIATALGISIETVKTQKKRAIQSIKEQMGTCYFLL PLCDILYSSKFFS >gi|225935335|gb|ACGA01000057.1| GENE 214 265665 - 266855 536 396 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 200 367 157 325 354 77 32.0 3e-14 MKIKVDAEKLIAKYLLGDLTDPEKEQLDNWIKNSPQHEEFFNRLCTNMSFRKRYEAYTQI NSHQAWKHFKKKYCQASITSIFLKYAAIFILPIIIVAGGWYFYTSSEIQVSDNLTLGDAI QPGIPKATLILAGNNKQSLTPTYPTPVKVNHSTTAIAQNGALIYPTTPNTNIDSILKQRS EVIENNTLTTEQGNEFRVTFEDGTTVHLNYNTELRYPVKFSQTKRMVYLKGEAYFKVAQD TRPFYVITDHGTIRQYGTEFNVNTFSPERTEVALVKGSISIIPTKSSQEQFIKPGQLAHI EQKNNNISIHNVDLTPYIAWNEGRLIFENRTLENIVEILEHWYNVDISFGTSELKQLRFT GNMDRYATISPILKAIARTTNLQIKIKGREILITDN >gi|225935335|gb|ACGA01000057.1| GENE 215 266880 - 270212 1626 1110 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1097 NR:ns ## KEGG: Cpin_1097 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 108 1110 136 1148 1148 667 38.0 0 MRKTKSRERVRILSLVIVLLLLPTMMFAIPKSGKKVTLNLESVTVKEFFDALRQQTGLSF VYNTEQTKSLKPITIHVKDETVDSVLRTVLNGTGLTYSMERDIVTISKAEQQGDKRSATG IVSDEEGYPLPGVNVVISDLQRFAITDNNGKYSIEIPTNTTCTITFSYIGMSTQQVMINS GRNDVRKNITLKSDTKLDEVIVTGIYTRKAESFTGAATTISSKDLMRVGNQNVFQSLKNL DPTLYIADNFSMGSDPNTTPTMSMRGTSSFPTTETSSLKSNYQNQPNQPLFILDGFETTA ETVMDMDMNRIESITILKDASAKALYGSKAANGVIVIETKRLAGNEQRITYNGSLSFEMP DLTSYDLCNALEKLEAERLDGVYRNANLDTQIELNQLYNTRKQMALQGLDTYWLAKPLHT GIGHKHNLNIEVGDSQSLRAVLDFTYNQITGVMKGSDRRNISGNANISYRRNNIIFRNIF 
SIISNNSNDSPYGSFSDYSKMNPYWQATDENGKVLRWAEYNDDLKVANPLYDATIGTSFT SSYLEFTNNFYTEWLFHPDWKATIRLGVSQKRNNSDDFYPAQHSMFATYTSQDEIIRRGQ YIMENGKSSSLSGDFNINYNKMIKKHTIFANAGFFLSEDQYSAYQHTAEGFSNSSNADIT FAKQYLAGTTPKGSSSINREASFLLAASYDYDNRYLADATIRESASSLYGSDNRWANSWS FGLGWNLHNEVFMKDISWIKQLKLRASVGLTGNQNFNTSAAIATYQYYSGITYGGTTNPM TGAYLNNLPNSKLKWEQKKDYNVGADIRVAGLTLKFDYYSADTKNMLTDVSIPTSTGFSS IKDNLGLVRNSGIELNANYTIWQGPEGFVNLYGTFVYNKNKIIRLSDSMRAYNEKMQKMA EEANQSAPVLMYQDGMSMNTIWAVPSAGIDPQTGQEVYIKKDGSYTYVYSSNDMVPAGDS SPKYRGTGGFTAEYKGIGISATLSYLAGCQFYNSTLVDRVENADIDYNVDRRLLEGRWTT YGQQTQYKKFDSATTTRATTRFVQDRRELSISAISAYYEFPRSIYQKLYMQRLRLAFNMN DITTFSSIKVERGLNYPFARTMSFSLTATF >gi|225935335|gb|ACGA01000057.1| GENE 216 270223 - 271734 899 503 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 500 1 486 488 187 30.0 9e-46 MKKYIYIILIYMTSLIASSCSDWFDVTAPSEIRKDDHFSSVTGFQQSLIGCYIGMTDDAL YGTNLSWFATEIMAHQFNPYVNSTNIGLAYWLQSFNYTNTYTTPTVEEIWEKAYNVIVNV NDELANIEEKKEILDDLNYHIIKGELLAIRAYIHFDLLRLYGYGNWSQRDTELDEKRTIP YATEVSKDPAPQYSGAETIKLLLNDLNEAAALLKDYDPITKTKAASFYQEYNEEGFFNER TLRMNYYAVKALQARVYLWRGKNEDIVNALSAANEIITALENNIAINEMYTYCNFLTPET VNKSCTSMSRENIFGLNVSDVASRIVNYIKPYYLDSENTPMYLLTTDAMSLYENSATDIR LTTLMEPNTNAQNTGYTPLKVYQSDLANDYKNKISMIRIPEIYYIAAECYVKQNNPNLPL ALNCLNTVREKRGLYTPLEDLDAEQILVEIQKEYHKEFLSEGVMFYYYKRTGTKTIPNYT EDMEDAQYVLPYPEFEIQSGRVQ >gi|225935335|gb|ACGA01000057.1| GENE 217 271752 - 272603 462 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174371|ref|ZP_05760783.1| ## NR: gi|260174371|ref|ZP_05760783.1| hypothetical protein BacD2_21116 [Bacteroides sp. 
D2] # 1 283 1 283 283 495 100.0 1e-138 MMKKINYIIGMIVIGIFLSCENEAIPYYNSENDAVRFSNKNSDGYGGDGIVYQSYSFVSN PLDEYVIYDIPVILIGNTSDKDRTVNYTIDTEKSTATEGSYELMEGIIPANHNEGYIRIK LYNVTGDKTYELRISIQSSDELEKGPSLYLNAALSWSNSIPTPPNTYTRITYNMLIQSPL SFSSSSISYYSPNALKTIIAAFNWNDWDNPDAHPEYPAAYKRYFTQANYKYLPHYILLNV EQSYSSFATQLGKYIEEYNSTHDTPLLHDAGTLLGQPIEARKY >gi|225935335|gb|ACGA01000057.1| GENE 218 272620 - 274299 879 559 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174372|ref|ZP_05760784.1| ## NR: gi|260174372|ref|ZP_05760784.1| hypothetical protein BacD2_21121 [Bacteroides sp. D2] # 1 559 1 559 559 985 100.0 0 MKKYIFISPLLLCLLSLYSCYDDSSTLATNNIGNITFTEKQSELYVGSMEELTLIPDIQI AEGTNTDALTYEWALTETPVTSSSSYNFEYEIISTDPQLNYIVERPVSTSPYTLLLTITD TVHDNLQYTKYWKIYVQSTFLDGLLISDTQNGTTSDLTLINNQAFTVNYNKDEQIFRKIL TSLNGQPFNGLMQTLVYEVMGYGSSIQTNQVWTILGDATLARFNCLDYTQNGQFEDQSLI IDKPNGLQVLSAFQSHSNFYINTSNNLYTLASSTVNRFSGPAGALSSYKVNNNVIAYSPN TGHVSNSLSGADQQHLTFYDKERASFITCNGSGQFMQVKSFDANNNFDPNKLPNQTAISA VVFEDMSQIVFLMKDDTNGTYSIYTFSRYIGEEGHYDGDNWIVTSPSQPASARNKYTIPS EGTALLDKAISIFFSNRNLLLYVTTTDGIYTINYGAGSTATVSTTAKYTPQSGEIITKAK MYQQGLYNYNCNLIVGDNPTVPQTEWNNKAIIVTTQSSEYEGKVHIIPITQVASGTLDPS KAKTYDGFGKILDVTTTGY >gi|225935335|gb|ACGA01000057.1| GENE 219 274333 - 274473 71 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294645365|ref|ZP_06723076.1| ## NR: gi|294645365|ref|ZP_06723076.1| conserved hypothetical protein [Bacteroides ovatus SD CC 2a] # 1 33 1 33 862 67 96.0 3e-10 MRKKILLLAAMVVASSTTIDAQRKFPFFWKKKEGKNGTNNSCKKGK >gi|225935335|gb|ACGA01000057.1| GENE 220 274586 - 276919 1562 777 aa, chain + ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 777 68 860 860 654 42.0 0 MLIGSTVTSISDNGNAVVGSKPTDLLHVMFTRNKTHVQLRQVNTDYITGSTQIDEALRKS TLGAIISNQKIQAYNNDSTAIVFDMTGVFLSDNKKMSPFDRNSIYGMYNRTENYQSDCSY ISQIKAFKDNVSIKSCLSYTFSVSNSQGASLIKDRPFTAEMTRSIMLLKEKPYRPRMADY RIGVFFTGREQLGEGAKTTAPVYYANRWDIQPSDTAAYLRGEKVKPTKQIVFYIDNTFPE 
KWKPYLREGVTQWNELFEQIGFKDVVAAKDFPTDDPEFDPDNIKYSCVRYAPSSIENAMG PSWVDPRSGEILNASVYLYHNVIKLISNWLFVQTAQADKDVRTVNIPDEMVGDALRYVLS HEIGHCLGFMHNMGASSTFPVDSLRSPEFTQKYGTTPSIMDYARFNYVAQPGDKERGVKL TPPRFGEYDKYLIKWTYTPVFNVNSAEEEAIITGKWISDAIKENPVYRYGKQQVYGVVDP RSQTEDIGDNSMKATRYGIKNLKYIMNNLESWISEGDDTYEYREDLFIGIVEQLAMYVTH VAGNVGGYFVNEVKEGDTMPRFAQIPKAQQKEALNYLFEIYNDLDWLDNKNLLTKFPVSG SPKQTIQNFMLRYILPVPFQVSQYEGLEKDSFTAAEAFNMIYNFVWKPTISGRTLTESQM NLQKQYIYMMMQTAGFTIKGAGKALTGEKPLDINHRQFGYTCCQGHAIKEDVIHNPVAGF EWRPLNRFSMTAKVTQADVYAYIAKAKQLMKQKAASASGKTKAHYELLLKMLDINLK >gi|225935335|gb|ACGA01000057.1| GENE 221 276934 - 279477 1744 847 aa, chain + ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 49 739 30 738 860 309 30.0 3e-82 MKILFLWLMAILLIIGSTNEANARSKKKKGKAKTEQQTDSVQKKKKSVYDKLFKDKKKHT VNKGTITVHQYEDKIYLELPVELMGRDFLVNSAITTASDISLAGTKAAQSRYLIIDKTDS LILFRDPKYNVRLNEQDDNQEAAFALSRSNAIYKAFPIEGYTSDSTAVVFNATSYFSCSN KDVLNLSGRSYGGMLTIVSASPQSKTSFVDSVDAFDNCISITQNCTAKLSISIMGFISKE QPELTMSVQTTLALLSKEKMNTREANPRVGTGYISYTDYRNEKRFKKGYYVTRRNITTQQ PVVFYIDTLIQDSWVKAIQKSADEWNIIFEDLGIGKPIIIKPYEKDSTFRANNPMINTIA FLNNNNSEVTAYNVTDLRTGEILSTKIGVPRDLAVSVRRNGVYQMAEIDPRFRTYYIADE VICENLTARMLKAFGLSLGLTTNLAGSAAYSPEELRSPEFTQKYGITASVMDNVLYNYLA QPGDKEKGVVLIVDKPGVCDAFTLKYLYAATSENESDMLKKWAMEHDGDPRYFYGKRSPA YATDPRCQNYDLGNDPIASLDAQIAHVKYVVKNSPAWFHDNNIPNDYRELFPDFVIIELI NKTLSPVSSYIGGIYINEANEKSNVPSYQPVSADMQKKVLQKIFSTFYDLSWLDSNKDFL RLGGVNPDMSTWIYNNGYPMMSLMFRLMRMGLSVEKSTRPYTQEAYLNDIEKQLFKETLN GKPLSAPMIAQLSVYISSLKGMCPTLKAIDKAVSTRVTSIALNEQTNHKLQSLGLLTTFA SISATEKQSGMEPMTSVNFYSGTDIEAICYDKLKSTRRYLIQARSLASNDIERGKCDYLI AMIDRVI >gi|225935335|gb|ACGA01000057.1| GENE 222 279559 - 279915 289 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084588|emb|CBK66111.1| ## NR: gi|295084588|emb|CBK66111.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 118 10 127 148 210 98.0 2e-53 MKQIILFIVFIFTIMMCSFAQEAPSSPIIKGRDNKSTEFFKNKKLQSPISPMVLFSQKIT 
GVVYTEGGKEVLIGASVIEVGSTNGIITDIDGEYSIGISKADKKILQASYLGYETKQE >gi|225935335|gb|ACGA01000057.1| GENE 223 280018 - 282801 1053 927 aa, chain + ## HITS:1 COG:no KEGG:CHU_2270 NR:ns ## KEGG: CHU_2270 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 37 907 29 929 929 139 22.0 6e-31 MKKFISIFLLLYITLPINAQDKTNELSGFIIHKQENGSNGSLDGANIFLIYAKDTLKTTS INGVFRFKPIKTGKAKLIITAIGCRKVEKELNIIAGKNSNLYIEILDESIQLEEVTIKGR IPIVTQNGDTLIFNPKAVNVQEGDVAMNIVEQLPGTETDDHSVKIMGKQVTKTYIDGRLI FGSDPMAALKNLSATDVLKIKAYDEYENTKTKKLMYRGDLTRVLNIETKSKLISSWKAHL LASIGSNMDSKDDRGKFRKGLGLTTNFFSEKFLLTSNVFHNNINRKSNNIKNVLSISDPG STYNETTYADLATERSWDTENGTYGTFRAFYTFGYNRDNTNTRSEQHYFPSNNYQQRLYE EQNSNSDRKQNHYSEFSFSNSNDKWGEFRWNQLITYNNNKDWQSLYIYNEENNLQTSKSL MHYDNLNKNFHVKEDVSYVNSITDRIGYTLESSVDINKGKNHSLRIDTLESSTTQTYLTI PSKLRNTEWNGNAQLIYILNPENDTQISFDYSVKYENGWKHQFAWNMLSPTNPQTDEANT YSYKINNLTQKQEVSFHFFPFQKTSCDISAGLKESTLKREEKEMNNYRKTFISPTVAISF VHTDITKSWSAAYRLHNFIPNIVQLRPQIDNSNPYMLRSGNPNLKQSYLHSFLFNCNRML GKHNHTIGIIISTSIRQHSPVAKTTYYNAETYLPELQYTAPAHSSLISFENVEGYWDIKG KLIWQAPIRSIKSKYTLSTGFNYEHNPYYIGENKTTTRTYAPLIEHFLLCSLTKRLKVTI SANTHYVHSINTENYTGKTFYQTAGAMLDISQICKYFYLSSHYNFIFSRDYGINKEINRN HTLNLNVGCKILNRKGDISIAAYDLLNSHRTFNSQMYSNYIQNTWTNYYGRFFTINFAYR FGKVKSNYEGTTNDGSIREYRPMSGKI >gi|225935335|gb|ACGA01000057.1| GENE 224 282884 - 283819 713 311 aa, chain + ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 5 303 1 309 342 226 39.0 5e-59 MSTPIVEVKHLSHRYSVDWAIRDINFSIEKTGILGLLGSNGAGKSTTMNIICGVLNQTEG DVFINGIDLRKNPVEAKKHIGFLPQKPPLHPDLTVDEYLIHCATLRRIEKSKVREAVEIA KERCAIAHFSKRLIKNLSGGYQQRVGIAQAIVHNPPFVVLDEPTNGLDPNQIVEIRNLIK EIAEDHSVLLSTHILSEVQATCNDIRMIEHGKVVFSGSMKDFDNYVVPSSFTVTFALPPS IEELAKIEHVLNIEELYPGTFRIRFDDDENITERVVALSIQNGWRLKEITMERCSLDIIF AQLSGKLKNNI >gi|225935335|gb|ACGA01000057.1| GENE 225 283821 - 286142 1355 773 aa, chain + ## 
HITS:1 COG:BB0753 KEGG:ns NR:ns ## COG: BB0753 COG1277 # Protein_GI_number: 15595098 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Borrelia burgdorferi # 3 250 4 238 244 86 27.0 2e-16 MTDLKIIMRIARTDLAILFYSPIAWFILIVFSFLTTSSFTLLMENIVTDYDLSGGKEVSL SGICFLGSYGFLSSVVSNIYIYIPLLTMGLISRETASGSIKLAYSSPVTSGQIVLGKYLA AIGFGCCLMLVPIASAIYGSLVIPSFDWAPVLVALLGLYLLICAYCAIGLFMSSLTTYQV VAAVGTLIILAILNFVGSIGQEYDFIRELTYWLSINGRTIDMLNGVIRSEDVIYFVVVVT LFLTFTTFKLASDRRTISRFRQAISYLGFFVAAMAIGYFTSRPGMIKVWDTTRTKLNSLT ENSQHVLAKLTGPVTITNYVNLLDNKSYRYLPIMKKANETIFEPYCLAKPDLQVKYVYYY DFAPNGVANNPKFQGKTVDEMRDYMTMIYNLNPHLFKSPEEIRQIIDLREEQNTFVRIME TQDGKRTFIRDFEDMDATPSEAEITAAIKKMISTPPTVAFIKGDGEREVSKSGDRDYSNF SIEKYSRAALINQGFDVCEIDISHGDTISSLINIVVLAEMRTPLTEKGENQLEAYLARGG NLFILTDTGRQEVMNPFLSKFGIKMEEYQLAQSSVDFSPNLILAKATRESEKLTFGFKED FLQYDLRVSMPGCVALTSSDNDYGFQYTPILETNAKGVWIEKEQTDLQELPVECNASAGE KEQTYITAYALSRQLKDKEQRIIISGDADCISNTELTLSREGYRSGNFNLIIESFRWLSG GEFPIDVRRPHCTDNKLSIGVKDIGTMKTIFIIIIPAILLLIGVGIWFFRRRN >gi|225935335|gb|ACGA01000057.1| GENE 226 286155 - 288482 1358 775 aa, chain + ## HITS:1 COG:BB0753 KEGG:ns NR:ns ## COG: BB0753 COG1277 # Protein_GI_number: 15595098 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Borrelia burgdorferi # 1 245 3 234 244 96 28.0 2e-19 MNLRLILRIARTELAVLFYSPVAWLLLIAFTCQVGFDFMNILTEIVKIKALGNTITFSVT AGFVLGLKGIYEVIQETIYLYIPLLTMNLMSREYSSGSIKLLYSSPVNSIQIITGKFVSM VVFALIFVIILALPTIVMFISVPHVDITLILAGLLSMFLLILTYCSIGLFMTTLTSYQVV AAVATLSALAFLNYVGGIGQESIFFREITYWLSIKGRASEMVGGLICSDDVIYFLAVILL FLWLSVIKLNNEKTHRSLLSKTMRYALAVCTIIVIGFVSSRPAMMSFYDATRSKQRTLSE ESQKVMKQLSGPMTITTYVNIFDKEFDVASPKEQKEDMARFKMYTRFKPEIKMEYVYYYS TPKDSALYRQYPNKNIREIAYEVAKKKNFNPQKLKSAEELKEKIDLAKENYRFVRVVERG SGEQARLRLFDDMEYHPSETEISAALKKMLVTPVKVGAITGHQERSTTKKGDQDYSLFAT 
HGRFRYSMINQGFDLVELNLKDMNDIPSNINILLIAEMRSSMSSKEQEIIDRFLERGGNI MIMGDVGRQEVMNPLLRKVGLKLLPGIIAQPSDVNPGELVLAKATQIAADSIGGFYKRMV DRQTHSAVTMPSAVALEVVDTTKFHPIVLLQSNAQQTWIEYQTKDFVNDSLSLDSLQGEK LGAYPTAIALTRKIKGKDKKQRIIVLGDADCFSNAELQKSSRPGIYSFNFNMIPGSFRWL CYNEFPVSSSRAPYLDKDISLTPMVLSTIKIIYCYGIPFIIGLCGIWICWRRRKR >gi|225935335|gb|ACGA01000057.1| GENE 227 288418 - 288546 62 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174380|ref|ZP_05760792.1| ## NR: gi|260174380|ref|ZP_05760792.1| hypothetical protein BacD2_21161 [Bacteroides sp. D2] # 10 42 1 33 33 63 96.0 4e-09 MAFHLLLVCVAFGYAGEEENDKKKELDFRQADFVPVGSLILS >gi|225935335|gb|ACGA01000057.1| GENE 228 288559 - 289998 1486 479 aa, chain - ## HITS:1 COG:SA1322 KEGG:ns NR:ns ## COG: SA1322 COG0642 # Protein_GI_number: 15927072 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 66 477 195 584 588 163 28.0 7e-40 MNIKTKLILGIGMLAGMIILLVTLSVVNLQILTATEPDSPAATPGLERALLWISVTGGIC ILTGLILLYWLPRTISKPIKELKEGILEIANHNYEKRLQMNNNEEFRDVADSFNRMAERL TEYRASTLSDILSAKKFIEAIVNSIDDPVIGLNMEREILFINEEALNVLNLKRENVIRQS AEELALKNDLLRRLIRELVTPSEQKEPLKIYADDKESYFKASYVPIINTEAEKGEPRKLG DVILLKNITEFKELDSAKTTFISTISHELKTPIAAIMMSLQLLEDKRVGALNDEQEQLSK SIKENSERLLSITGELLNMTQVEAGKLQFMPKITKPIELIEYAIKANQVQADKFNIQIEV EYPEEKIGKLFVDSEKIAWVLTNLLSNAIRYSKENGRVIIGAKQDENWIELYVQDFGKGI DPRYHKSIFDRYFRVPGTKVQGSGLGLSISKDFVEAHGGTLTVESELGKGSRFVMRLKA >gi|225935335|gb|ACGA01000057.1| GENE 229 290018 - 291148 893 376 aa, chain - ## HITS:1 COG:AGl2094 KEGG:ns NR:ns ## COG: AGl2094 COG2205 # Protein_GI_number: 15891164 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 344 17 349 900 251 38.0 2e-66 MDDREHSVQYFLDLIKKSHRGKFKVYIGMIAGVGKSYRMLQEAHELLENGVNVQIGYIET HGRAGTEALLQGLPVIPRRKIFYKGKELEEMDLDTIIRVHPEIVIVDELAHTNVEGSLNE KRWQDVITLLDEGINVISAINIQHIESVNEEVQEITGIEVKERVPDSVLQEADEVVNIDL 
TAEELIARLKAGKIYRPEKIQTALDNFFRTENILQLRELALKEVALRVEKKVENEVVMGV AVGLRHEKFMACISSHEKTPRRIIRKAAKLATRYNTTFIALYVQTPKESMDRIDLASQRY LLNHFKLVAELGGEVVQVQSKDILGSIVKVCKEKQISTVCMGTPNLRLPYAICSILGYRK FLNNLSQANVDLIILA >gi|225935335|gb|ACGA01000057.1| GENE 230 291233 - 291997 796 254 aa, chain - ## HITS:1 COG:no KEGG:BT_2422 NR:ns ## KEGG: BT_2422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 254 1 248 248 348 68.0 9e-95 MKSFFSKKSILGTMAVAAVCMLGTSNAQAQEFTIQGDLVSSYVWRGIYQGGAASFQPTLG FSVGNFSLTAWGSTSLSESNKEIDLTAAYKFGEAGPTLSVATLWWDGQADVANGELTNNY FHFKSGDTGHHFEAGLAYTLPIEKFPLSIAWYTMFAGADRKTTDEGEEKQAYSSYVELNY PFSVKGVDLNATCGVVPYKTPQYNVNGFAVTNLALKATKAINFNDKFSLPIFVQAIWNPR LEDAHLVFGVTLRP >gi|225935335|gb|ACGA01000057.1| GENE 231 291981 - 292115 60 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717983|ref|ZP_04548464.1| ## NR: gi|237717983|ref|ZP_04548464.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 44 1 44 44 87 100.0 3e-16 MKIEQRSCPDRLGYAAVQINRVGWGAAFLQEHKIELDKKYEKFF >gi|225935335|gb|ACGA01000057.1| GENE 232 292116 - 292688 664 190 aa, chain - ## HITS:1 COG:AGl2092 KEGG:ns NR:ns ## COG: AGl2092 COG2156 # Protein_GI_number: 15891163 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 188 8 186 188 160 46.0 1e-39 MKTLLKSLKITLAFCVFFSVFYILILWVFAQVAGPNKGNAEVATLDGKVVGAANVGQMFT KDIYFWGRPSSAGDGYDATSSAGSNKGPTNQEYLDEVKARIDTFLVHHPYLNRADVPAEM VTASGSGLDPNITPQCAYVQVKRVAQARGLTEKQVRAIVDKSIEKPFLGLLGTEKVNVLK LNIALEESNK >gi|225935335|gb|ACGA01000057.1| GENE 233 292705 - 294630 2189 641 aa, chain - ## HITS:1 COG:DRB0083 KEGG:ns NR:ns ## COG: DRB0083 COG2216 # Protein_GI_number: 10957402 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Deinococcus radiodurans # 1 638 39 671 675 748 64.0 0 MFTVEVATVVMLLVTLYSIVNSSQGSFAYNIAVFIILFVTLLFANFAEAIAEARGKAQAD 
SLRKTREETPAKKVEGNKIVTVSSSQLKKGDVFVCEAGDVIPSDGEIIEGLASIDESAIT GESAPVIREAGGDKSSVTGGTKVLSDHIKVMVTTQPGESFLDKMIALVEGASRQKTPNEI ALTILLAGFTLVFVIVCVTLKPFADYSNTVITIASLISLFVCLIPTTIGGLLSAIGIAGM DRALRANVITKSGKAVETAGDIDTLLLDKTGTITIGNRKATHFHTAPGVNLHDFVETCLL SSLSDETPEGKSIVELGRESGVRMRNLNTTGARMIKFTAETKCSGVDLADGTQIRKGAFD AIRKMVEGEGNEFPKEVEEIISSISSNGGTPLVVCVNKKVTGVIELQDIIKPGIQERFER LRKMGVKTVMVTGDNPLTAKYIAEKAGVDDFIAEAKPEDKMEYIKKEQQAGKLVAMMGDG TNDAPALAQANVGVAMNSGTQAAKEAGNMVDLDNDPTKLIEIVEIGKQLLMTRGTLTTFS IANDVAKYFAIVPALFMIAIPELAALNIMHLHSPESAILSAVIFNAIIIPILIPLALRGV QYKPIGASALLRRNLLIYGVGGVIAPFVGIKLIDLVVGLFF >gi|225935335|gb|ACGA01000057.1| GENE 234 294757 - 296463 1697 568 aa, chain - ## HITS:1 COG:pli0052 KEGG:ns NR:ns ## COG: pli0052 COG2060 # Protein_GI_number: 18450334 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 9 568 1 571 573 451 45.0 1e-126 MNTEILGVVVQIALMVILAYPLGKYIAKVYRGEKTWSDFMAPIERVIYKVCGIDPNEEMN WKQFLKALLILNAFWFFWGMVLLVSQGWLPLNPDGNGPQTPDQAFNTCISFMVNCNLQHY SGESGLTYFTQLFVIMLFQFITAATGMAAMAGIMKSIAAKTTKTIGNFWQFLVISCTRIL LPLSLIVGFILILQGTPMGFDGKMKVTTMEGQEQMVSQGPTAAIVPIKQLGTNGGGYFGV NSSHPLENPTYLTNMAECWSILIIPMAMVFALGFYTRRKKLAYSIFGVMLFAFLVGVCIN VSQEMGGNPRIDELGIAQDNGAMEGKEVRLGAGATALWSIVTTVTSNGSVNGMHDSTMPL SGMMEMLNMQINTWFGGVGVGWMNYYTFIIITVFISGLMVGRTPEFLGKKVEAREMKIAT IVALLHPFVILVFTALSSYIYVHHPDFVESEGGWLNNLGFHGLSEQLYEYTSCAANNGSG FEGLGDNTYFWNYTCGIVLILSRFIPIIGQVAIAGLLAQKKFIPESAGTLKTDTLTFGVM TFVVIFIIAALSFFPVHALSTIAEHLSL >gi|225935335|gb|ACGA01000057.1| GENE 235 297024 - 299297 1646 757 aa, chain + ## HITS:1 COG:PAE1272_1 KEGG:ns NR:ns ## COG: PAE1272_1 COG0380 # Protein_GI_number: 18312516 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose-6-phosphate synthase # Organism: Pyrobaculum aerophilum # 8 465 1 467 483 399 40.0 1e-110 MLCKEIDMKLYIIANRLPVKVAGTNGKFAFSRSEGGLATGLDSLQTSYEKHWIGWPGICA NEEKDQQEINEKLQEMNFHPVFLSEKQIENYYEGYSNSTIWPLCHYFYAYTLYKNCFWQS 
YQQVNRLFCEEICRLVRPGDKVWIQDYQLMLLPGMLRKIYPELCIGYFHHIPFPSYELFR ILPERAEILKGLLGADFIAFHTHDYMRHFISAVERVLHLDFKLDEVQINNRVTRVEALPM GINYESYHKASENKEVHQAIERTRKLFGEHKLILSVDRLDYSKGILHRLRGFATFLEHHA EYHGKVTLAMVIVPSRDHVGSYAELKTKIDEEIGSINGRYSTMNWTPVCYFYHGFSLEEL TAMYYVADIALVTPLRDGMNLVAKEYVATKCDNPGVLILSEMAGAAVELTDAIQINPNDT EQIENAICQALEMPEEEQKQRLQRMQSILSVQTVNKWAADFVNELNATCMKNDMLRKKRI VAATIAQIKLKYNQAKQRLILLDYDGTLTALKPRPEDAQPTPELISILQQLASDPANHIV INSGRDHFTLEKWLGSLPVSMAAEHGAFYKENGVWHKNIKKKEWGAGILSILQMFVDRTP RSHLEVKETALAWHYRESDAWLGTLRAQQLVNTLISLCTRQKLQILQGNKVIEIKSPDCN KGSEVGRLLANRRYDFVIAMGDDTTDEDMFQALPKSAIAIKIGSVSEAANYHLSAQSDVL PFLRSLLGKQKTATKEGSIKNRLTFAFDFLKDLLKTQ >gi|225935335|gb|ACGA01000057.1| GENE 236 299310 - 301100 1226 596 aa, chain + ## HITS:1 COG:RSc1458 KEGG:ns NR:ns ## COG: RSc1458 COG3387 # Protein_GI_number: 17546177 # Func_class: G Carbohydrate transport and metabolism # Function: Glucoamylase and related glycosyl hydrolases # Organism: Ralstonia solanacearum # 5 582 7 604 619 352 33.0 1e-96 MNNLNYGVIGNCRTAALISQNGNIEWLCFPDFDSPSIFASMLDREKGGAFGFEVSGDYRI TQNYVPHTNILSTQFSSDEGEFVVLDYMPCYRSQDETGHYLPAELYRYIHWIKGKPRFKI NYHPAPNYAQGQVILNITPQYIESYCSFDNKDRQYLYASLSLQDIKEKKEIILEKDEFLL LSYNEKVIPIDIEREKIEYCRTLVYWLNWTDRSKKYTLYNDVIERSLLVLKLMSYHNGAV LAALTTSLPEAVGEVRNWDYRFCWLRDASMSIETMFQIGHTGAAKRFMKFIQSTFVSKHD YQIMYGIRGEHQLTEVILDHLSGYKNSKPVRIGNDAYHQKQNDSFGYLMDLIYQYYRLMP GTLDEIEDMWEMVKTILAKVVENWRKPDKGIWEIRGEGQHFVSSKVMCWVALDRGAKIAQ MLNKYNYSERWQLEAEKIKKDVMKYGWNKELQSFTQTYNNQAMDSSLLLMEPYGFIEADD IRYHKTVEAVKKALFHKGLMYRYNSKDDFGLPTSAFTICTFWLIRALFVTGHKEEARCLF DEVLKYSNHLGLFSEDIDFETKEQLGNFPQAYSHLALVNTAILFAEEEKRLSFIRP Prediction of potential genes in microbial genomes Time: Fri May 13 10:48:44 2011 Seq name: gi|225935334|gb|ACGA01000058.1| Bacteroides sp. D2 cont1.58, whole genome shotgun sequence Length of sequence - 43391 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 18, operones - 7 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 193 177 ## COG1472 Beta-glucosidase-related glycosidases
2 2 Tu 1 . - CDS 196 - 843 428 ## COG3153 Predicted acetyltransferase
    - Prom 939 - 998 4.9
3 3 Op 1 11/0.000 - CDS 1001 - 1924 885 ## COG0248 Exopolyphosphatase
4 3 Op 2 . - CDS 1921 - 4050 1749 ## COG0855 Polyphosphate kinase
    - Prom 4145 - 4204 6.8
    + Prom 3962 - 4021 8.9
5 4 Tu 1 . + CDS 4164 - 4742 596 ## BT_0542 hypothetical protein
    - Term 4737 - 4778 4.1
6 5 Op 1 . - CDS 4822 - 6912 1953 ## COG3968 Uncharacterized protein related to glutamine synthetase
7 5 Op 2 . - CDS 6869 - 7012 139 ## BT_0543 glutamine synthetase
8 5 Op 3 24/0.000 - CDS 7059 - 8447 1263 ## COG0004 Ammonia permease
9 5 Op 4 . - CDS 8496 - 8852 452 ## COG0347 Nitrogen regulatory protein PII
10 6 Op 1 . - CDS 8997 - 9722 754 ## BT_0546 hypothetical protein
11 6 Op 2 . - CDS 9795 - 11027 1380 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
12 6 Op 3 . - CDS 11088 - 11897 515 ## COG0253 Diaminopimelate epimerase
    - Prom 12051 - 12110 12.3
13 7 Tu 1 . + CDS 12312 - 12608 359 ## BT_0549 putative thioredoxin
    + Term 12671 - 12727 1.2
    - Term 12766 - 12824 -0.6
14 8 Tu 1 . - CDS 12855 - 13601 818 ## COG0584 Glycerophosphoryl diester phosphodiesterase
    - Prom 13667 - 13726 10.6
    - Term 13736 - 13793 -0.1
15 9 Op 1 . - CDS 13795 - 15462 1496 ## COG0367 Asparagine synthase (glutamine-hydrolyzing)
16 9 Op 2 21/0.000 - CDS 15502 - 16845 1239 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
    - Prom 16900 - 16959 2.7
17 9 Op 3 . - CDS 16972 - 21522 3871 ## COG0069 Glutamate synthase domain 2
    - Prom 21570 - 21629 2.1
    + Prom 21661 - 21720 8.1
18 10 Op 1 . + CDS 21885 - 23729 1702 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains
19 10 Op 2 . + CDS 23758 - 25641 1765 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
20 10 Op 3 24/0.000 + CDS 25669 - 26814 1011 ## COG0505 Carbamoylphosphate synthase small subunit
21 10 Op 4 . + CDS 26817 - 30044 3568 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
    + Term 30101 - 30139 7.4
22 11 Tu 1 . + CDS 30169 - 30489 201 ## BVU_0164 arylsulfatase precursor
    + Prom 30584 - 30643 3.8
23 12 Tu 1 . + CDS 30663 - 31709 809 ## COG0836 Mannose-1-phosphate guanylyltransferase
    + Term 31826 - 31859 -1.0
24 13 Tu 1 . - CDS 31713 - 32609 435 ## COG2207 AraC-type DNA-binding domain-containing proteins
    - Prom 32746 - 32805 5.9
    + Prom 32613 - 32672 4.8
25 14 Op 1 . + CDS 32771 - 34030 1051 ## BT_0560 outer membrane efflux protein
26 14 Op 2 10/0.000 + CDS 34023 - 34952 906 ## COG0845 Membrane-fusion protein
27 14 Op 3 45/0.000 + CDS 34977 - 36446 353 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
28 14 Op 4 22/0.000 + CDS 36449 - 37552 800 ## COG0842 ABC-type multidrug transport system, permease component
29 14 Op 5 . + CDS 37630 - 38745 648 ## COG0842 ABC-type multidrug transport system, permease component
    + Term 38757 - 38809 12.4
    - Term 38745 - 38797 12.4
30 15 Tu 1 . - CDS 38843 - 39271 385 ## COG0071 Molecular chaperone (small heat shock protein)
    - Prom 39466 - 39525 9.8
31 16 Op 1 . - CDS 39543 - 40019 278 ## COG0013 Alanyl-tRNA synthetase
32 16 Op 2 . - CDS 40088 - 40501 237 ## BDI_2962 hypothetical protein
    - Prom 40542 - 40601 2.6
33 17 Tu 1 . - CDS 40610 - 41245 473 ## COG2949 Uncharacterized membrane protein
    - Prom 41266 - 41325 10.3
    - Term 41633 - 41673 7.4
34 18 Tu 1 .
- CDS 41717 - 43390 1318 ## COG5492 Bacterial surface proteins containing Ig-like domains Predicted protein(s) >gi|225935334|gb|ACGA01000058.1| GENE 1 2 - 193 177 63 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 1 62 702 763 764 67 48.0 4e-12 KELKGFERIFLKSGESRDINFVITENDLKFYNSGLEYIYEPGEFDVMVGSNSRDVQTKRF KVE >gi|225935334|gb|ACGA01000058.1| GENE 2 196 - 843 428 215 aa, chain - ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 1 193 15 207 217 169 40.0 5e-42 MDIKLRIEQSSDYHETENITREAFWNHYSPGCNEHYLLHIMRNHPKFVPELDIVAEHNNK IVGNVVCLKSFIMADDGNQYEVLSLGPISVLPEYQQQGIGGRMIALTKKLASEMGFRAIL LCGDPDYYLRQGFIPAETLGIRTEDNMYATALHVYELYDNALANAKGRYIEDEIYQIDRS AANEFDKQFPPKEIEVGTPSQKRFEQLVVMRRKAI >gi|225935334|gb|ACGA01000058.1| GENE 3 1001 - 1924 885 307 aa, chain - ## HITS:1 COG:MA0083 KEGG:ns NR:ns ## COG: MA0083 COG0248 # Protein_GI_number: 20088982 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Methanosarcina acetivorans str.C2A # 10 291 13 310 543 100 27.0 2e-21 MISMKKVNYAAIDIGSNAVRLLIKCVNEENAPELMSKVQLIRIPLRLGEDAFTVGVISAE KEKKLIRLMKAYKQLMKIYDVVDYRACATSAMRDARNGKDIVQQIVKKTGIRVEIIDGQE EAHIVYDNHIEQLFASGQNYLYADVGGGSTEINLISNGELKNSRSYNIGTVRMLSGMVKE EEKEALRTDLIGLATEYAPISIIGSGGNINKLFRLADKKDKKASLLPIESLQDIYETLKA LSTEQRIKQYKLKPDRADVIVPAAEIFLEIATHVKATGIIVPTIGLSDGIIDSLYTQNMN RSDSPQS >gi|225935334|gb|ACGA01000058.1| GENE 4 1921 - 4050 1749 709 aa, chain - ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: 
Escherichia coli O157:H7 # 32 707 7 683 688 505 40.0 1e-142 MTITSHIDIFAEILNSSFFTMKQSEIKKKYPYVERDISWMYFNQRILLEAARSEVPLLER LTFLGIYSNNLDEFFRVRVATLNRIIEYADKNIQAEQETAACTLKQIGKLHNRYYKQFEE TFASIMEELKKENIYVIKDTEMTDEQKAFVTSFYRNKLNGSTNPLFLNGTRPLDDQTDED IYLAIRLLRKDETGKIKEKDYAVIELPTEDFGRFIQLPDSDGKTYLMFLDDVIRYCLPMI FVGMKYTDYEAYTFKFTKDAEMEIDSDLRTGVLQKISKGVKSRKKGEPIRFVYDEQIPKD LLKKLAGRLNVDKNDTRVAGGRYHNFKDLMKFPVCGHHELKYPVWEPIFKPELNGMESLL TLIRRKDRSLHYPYHSFDTFIRVLREAAISKEVKSIKMTLYRLAKESKVIKALICAARNG KKVTVVIELLARFDEASNINWSKKMQDAGIHVIFGVEGLKIHSKLLHIGTRHGDIACIST GNFHEGNARMYTDYTIMTAHRPIVREVNAVFDFIEKPYTPVDFKELLVSPNDMRKRFIAL INKEIKNKEQGKEAYILAKVNHITDRALVEKLYEASTAGVQVELVVRGNCSLVTGVPGIS ENIHINGIIDRYLEHSRIFIFANGGEEKYYIGSADWMPRNLDNRIEVAAPVYDKEIQADL KRIVCYGFRDTAKGRIVDGTGENREKEREENSVPFRSQEELYNEYKNTL >gi|225935334|gb|ACGA01000058.1| GENE 5 4164 - 4742 596 192 aa, chain + ## HITS:1 COG:no KEGG:BT_0542 NR:ns ## KEGG: BT_0542 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 307 90.0 1e-82 MKTMNYSLIRILFALVIGLVLVIWPNAAASYIVITVGVAFLIPGVISLFGYFGRKRPEGE AAPRFPIEGIGSLLFGLWLIVMPEFFADVLMFLLGFILIMGGVQQIASLSMARRWMPVPG AFYLIPSLILIAGIIALFNPTGARNTAFIIIGISSLVYSLSELINWFKFARRRPKAPVMH DDDDIEDAKIIE >gi|225935334|gb|ACGA01000058.1| GENE 6 4822 - 6912 1953 696 aa, chain - ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 7 696 41 724 724 591 44.0 1e-168 MYSTRKKCSNTFPSKVYNALIDAIDNGAPLDRSIADEVAAGMKKWAIEMGVTHYTHWFAP LTEGTAEKHDAFVEHDGKGGMMEEFTGKLLVQQEPDASSFPNGGIRNTFEARGYSAWDPS SPAFIVDDTLCIPTVFIAYTGEALDYKAPLLKALRAVDKAAVDVCHYFNPEVKKVVAYLG WEQEYFLVDEGLYAARPDLLLTGRTLMGHDSAKNQQLEDHYFGAIPTRVAAFMKDLEIEA LKLGIPVKTRHNEVAPNQFELAPIFEECNLANDHNLLIMSLMRKVSRRHGFRVLLHEKPF KGVNGSGKHNNWSLGTDTGILLMGPGKTPEDNLRFVTFVVNTLMAVYHHNGLLKASISSA 
TNAHRLGANEAPPAIISSFLGKQLSQVLDHIENSTKDDLISLSGKQGMKLDIPQIPELLI DNTDRNRTSPFAFTGNRFEFRAVGSEANCASAMIALNSAVANQLVKFKKDVDALIEKGEP KVSAILEIIRGYIKECKAIHFDGNGYSDEWKKEAARRGLDCETSVPVIFDNYLKPETIAM FEATGVMTKKELEARNEVKWETYTKKIQIEARVLGDLAMNHIIPVATQYQTDLINNVYKM QSLFPAEKAAKLSAKNLELIEEIADRTAFIKEHVDAMIEARKVANKIESEREKAIAYHDT IVPALEEIRYHIDKLELIVDNQMWTLPKYRELLFVR >gi|225935334|gb|ACGA01000058.1| GENE 7 6869 - 7012 139 47 aa, chain - ## HITS:1 COG:no KEGG:BT_0543 NR:ns ## KEGG: BT_0543 # Name: not_defined # Def: glutamine synthetase # Organism: B.thetaiotaomicron # Pathway: Alanine, aspartate and glutamate metabolism [PATH:bth00250]; Arginine and proline metabolism [PATH:bth00330]; Nitrogen metabolism [PATH:bth00910]; Metabolic pathways [PATH:bth01100]; Two-component system [PATH:bth02020] # 1 46 1 46 729 94 100.0 1e-18 MSKLRFRVVETAFKKKAVEVATPAERPSEYFGKYVFNKEKMFKYLPQ >gi|225935334|gb|ACGA01000058.1| GENE 8 7059 - 8447 1263 462 aa, chain - ## HITS:1 COG:PA5287 KEGG:ns NR:ns ## COG: PA5287 COG0004 # Protein_GI_number: 15600480 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Pseudomonas aeruginosa # 40 461 20 441 442 355 51.0 8e-98 MSRHIIYAISKIGIMAAIIMACLFAPSAYAQDTVKLSEEITIVEETPVLNSGNTAWLIVA TILVLLMSIPGIALFYGGLVRQKNILSIIMQTLLIVGVVSILWVAFGYSWAFGTSFMESG NPLGAIIGGFDKAFLHGITINTLTTGDIPELTFVMFQCMFAIITPALILGAFAERIKFSG YMVFIILWVILAYFPMAHWVWGGGFLQQMGAIDFAGGTVVHINAGISALVMAIMLGKRED YRIGHPITPHNITFVFMGTSFLWLGWFGFNAGSGLAADGLAANAFMVTHIATAVAAVTWM LIDWFCNKKPTTVGACTGAVAGLVAITPAAGTVDLLGAFFIGLITPVICFFMVAVVKPKF KYDDALDAFGVHGIGGIVGSILTGVFATQYITGEGGVEGALYGDWHQLWVQIVATVVSIL FSAVITFVLYKMVDSLIGIRVDRRVEEEGLDIYEHGESAYNS >gi|225935334|gb|ACGA01000058.1| GENE 9 8496 - 8852 452 118 aa, chain - ## HITS:1 COG:aq_109 KEGG:ns NR:ns ## COG: aq_109 COG0347 # Protein_GI_number: 15605696 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Aquifex aeolicus # 1 111 1 112 112 95 48.0 1e-20 
MKKIEAIIRKTKFEDVKTALLEADIEWFSYYDVRGIGKAREARIYRGVMYDTSSIERILI SIIVRDKNAEKTVQAILKSAHTGEIGDGRIFVIPIEDAIRIRTGERGDIALYNAEQER >gi|225935334|gb|ACGA01000058.1| GENE 10 8997 - 9722 754 241 aa, chain - ## HITS:1 COG:no KEGG:BT_0546 NR:ns ## KEGG: BT_0546 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 241 407 89.0 1e-112 MKTTINKKWGVIIVALLLTGFAPSAMLAQDKVEASVGADLVSGYIWRGQDLGGVSIQPSL GIAYKGFSLGAWGSVGFESTDTKEFDLTLGYSTGGFSVSVTDYWFNTQVATNKYFKYGAH STAHVFEAQVGYDFGPLAVNWYTNFAGADGVKENGKRAYSSYLALSAPFKLGGLDWAVDL GMVPWETTFYNGYTSGFCISDVSLGASKDIKITDSFSVPAFAKVSVNPRTEGAYFAFGLS F >gi|225935334|gb|ACGA01000058.1| GENE 11 9795 - 11027 1380 410 aa, chain - ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 406 1 405 410 543 60.0 1e-154 MALVNEHFLKLPGSYLFSDIAKKVNTFKITHPKQDIIRLGIGDVTQPLPKACIEAMHKAV EELASKDTFRGYGPEQGYDFLIEAIIKNDFIPRGIHFSASEIFVSDGAKSDTGNIGDILR HDNSVGVTDPIYPVYIDSNVMCGRAGVLEEETGKWSNVTYMPCTSENNFIPEIPDKRIDI VYLCYPNNPTGTTLTKPELKKWVDYALANDTLILFDAAYEAYIQDENVPHSIYEIKGAKK CAIEFRSFSKTAGFTGVRCGYTVVPKELTAATLEGDRIPLNRLWNRRQCTKFNGTSYITQ RAAEAVYSAEGKAQIKETIGYYMTNAKIMKEGLEATGLKVYGGVNAPYLWVKTPNGLSSW RFFEQMLYEANVVGTPGVGFGPSGEGYIRLTAFGERNDCIEAMRRIKNWL >gi|225935334|gb|ACGA01000058.1| GENE 12 11088 - 11897 515 269 aa, chain - ## HITS:1 COG:BH3412 KEGG:ns NR:ns ## COG: BH3412 COG0253 # Protein_GI_number: 15615974 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Bacillus halodurans # 5 263 1 278 286 221 45.0 1e-57 MTTKIKFTKMHGAGNDYIYVDTTQYPIADPAKKAIEWSKFHTGIGSDGLILIGVSDKADF SMRIFNADGSEAMMCGNGSRCVGKYVYEYGLTDKTEITLDTRSGIKILKLHVEGKTVSAV TVDMGSPLETEEIQWGGKYPFRSTKVSMGNPHLVTFVEDITRINLPKIGPELENHPLFPD RTNVEFAQIVSKDTIRMRVWERGSGITQACGTGACATAVAAFINGLTGRKSDVIMDGGTV TIEWDEISGHILMTGPATKVFDGEIEDKY 
>gi|225935334|gb|ACGA01000058.1| GENE 13 12312 - 12608 359 98 aa, chain + ## HITS:1 COG:no KEGG:BT_0549 NR:ns ## KEGG: BT_0549 # Name: not_defined # Def: putative thioredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 98 162 79.0 3e-39 MNIEKQLETAVKTNHLVLVVFYADWSPHYEWIGPVLRTYEKRTVELIRVNAEENRTIADA HNVETVPAFLLLHKGHELWRQVGELTVGELKEVLDDFQ >gi|225935334|gb|ACGA01000058.1| GENE 14 12855 - 13601 818 248 aa, chain - ## HITS:1 COG:SA0220_2 KEGG:ns NR:ns ## COG: SA0220_2 COG0584 # Protein_GI_number: 15925931 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Staphylococcus aureus N315 # 21 248 1 227 242 77 27.0 2e-14 MASALLMVACCMQAQTKVIAHRGFWKTPGSSQNSISSLLKADSIGCYGSEFDVWIAKDNK LVVNHDPVYKMRPMEYSKGDALTGLKLSNGENLPSLEQYLEAGKNCNTRLILELKAHSNK KRETKAVQGILAMVKKMGLENRMEYITFSLHAMKEFIRLAPAGTPVFYLNGELSPKELKE LGAAGLDYHMGVIKKHPEWIKEAHELGLKVNVWTVDEVEDMKSLIEQKVDFITTNEPVIL QEELKKHQ >gi|225935334|gb|ACGA01000058.1| GENE 15 13795 - 15462 1496 555 aa, chain - ## HITS:1 COG:VC0991 KEGG:ns NR:ns ## COG: VC0991 COG0367 # Protein_GI_number: 15641006 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Vibrio cholerae # 1 554 1 554 554 779 67.0 0 MCGIAGILNIKVQTKELRDKALKMAQKIRHRGPDWSGIYVGGSAILAHERLSIVDPQSGG QPLYSPDRKQVLAVNGEIYNHRDIRAQYTGKYNFQTGSDCEVILALYKEKGIHFLEDISG IFAFVLYDEEKDEFLIARDPIGVIPLYIGKDKEGRIYFGSELKALEGFCDEYEVFLPGHY FHSKEGKMKRWYSRDWTAYEAVKDNDAQTGDVKTALEDAVHRQLMSDVPYGVLLSGGLDS SVISAIAKKYAAKRIETDGASDAWWPQLHSFAIGLKGAPDLIKAREVAEYIGTVHHEINY TVQEGLDALRDVIYFIETYDVTTVRASTPMYLLARVIKSMGIKMVLSGEGADEVFGGYLY FHKAPTPQAFHEETVRKLSKLHMYDCLRANKSLSAWGVEGRVPFLDKEFLDVAMNLNPKA KMCPGKVIEKRIVREAFADMLPESVAWRQKEQFSDGVGYSWIDTLREITATAVSDEQMEH AAERFPIHTPQNKEEYYYRSIFEEHFPSESAARSVPSVPSVACSTAEALTWDIAFRNLNE PSGRAVRGIHEEAYT >gi|225935334|gb|ACGA01000058.1| GENE 16 15502 - 16845 1239 447 aa, chain - ## HITS:1 COG:VC2374 KEGG:ns NR:ns ## COG: VC2374 COG0493 # 
Protein_GI_number: 15642371 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Vibrio cholerae # 1 444 1 472 489 429 48.0 1e-120 MGDPKAFLNIPRQEAGYRPVNERIADYSQVEQTLNTNSRKLQASRCMDCGVPFCHWACPI GNKQPEWQDALYRGKWKEAYEVLSSTCDFPEFTGRICPALCEKSCVLKLSCDQPVTIREN EAAIVEAAFREGYIQVKTPERNGKKVAVIGAGPAGLVVANQLNLKGYSVTLFDKDEAPGG LLRFGIPNFKLDKNVIDRRMKILAAEGIRFEMGVEIDVNHLPEGFDAYCICTGTPAARDL SIPGRELKGIHFALEMLAQQNRILAGQTFPKDKLVNAKGKKVLVIGGGDTGSDCIGTSVR QGAVSVTQIEIMPKPPVGYNPATPWPQWPVVFKTTSSHEEGCTRRWCLASNQFLGKNGKV TGVEVEEVEWIPATDGGRSTMKPTGKKEVIEADMVLLAMGFLKPEQPKFAENVFLAGDAD TGASLVVRAMAGGRKAAAEIDTYLTKV >gi|225935334|gb|ACGA01000058.1| GENE 17 16972 - 21522 3871 1516 aa, chain - ## HITS:1 COG:CAC1673_2 KEGG:ns NR:ns ## COG: CAC1673_2 COG0069 # Protein_GI_number: 15894950 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Clostridium acetobutylicum # 402 1201 1 804 804 892 55.0 0 MKKQELFNNVTEGFPYQRQPQQVGLYDAAYEHDACGVGMLVNIHGEKSHDIVESALKVLE NMRHRGAEGADNKTGDGAGIMLQIPHEFILLQGIPVPEKGRYGTGLLFLPKNEKDQAAIL SIIIEEIEKEGLTLMHLRNVPTCPEILGESALANEPDIKQVFITGFTETETADRKLYLIR KRIENKVRLSSIPAKDDFYVVSLSTKSIIYKGMLSSLQLRNYYPDLTNSYFTSGLALVHS RFSTNTFPTWGLAQPFRLLAHNGEINTIRGNRGWMEARESVLSTPTLGDIKEIRPIIQPG MSDSASLDNVLEFLVMSGLSLPHAMAMLVPESFNEKNPISEDLKSFYEYHSILMEPWDGP AALLFSDGRFAGGMLDRNGLRPARYLITKNDMMVVASEVGVMDFEPGDIKEKGRLQPGKI LLVDTEKGEIYYDGELKKQLAEAKPYRTWLSTNRIELDELKSGRKVPHHVANYDRMLRTF GYSKEDIERLIMPMASTGAEPIHSMGNDTPLAVLSDKPQLLYNYFRQQFAQVTNPPIDPL REELVMSLTEYIGAVGMNILTPSESHCKMVRLNHPILSNTQLDILCNIRYKGFKTVKLPM LFEVAKGKAGLQEALTHLCKMAEESVTEGVNYIVLTDREVDITHAAIPSLLAVSAVHHHL ISVGKRVQTALIVESGEIREVMHAALLLGFGASALNPYMAFAVLDRLVKDKDIQLDYTTA EKNYIKSICKGLFKIMSKMGISTIRSYRGAKIFEAVGLSEELSKAYFGGLGSPIGGIRLE EIARDAIAFHKEGFESIKNEELLKNNGLYAFRKDGEKHAWNPETISTLQLATRLGSYKKF KEYTHLVDEKEKPIFLRDFLGFRRNPISIDQVEPVENILHRFVTGAMSFGSISKEAHEAM 
AIAMNKIHGRSNTGEGGEDAARFQPLPDGSSLRSAIKQVASGRFGVTAEYLVNADEIQIK IAQGAKPGEGGQLPGFKVNDVIAKTRHSIPGISLISPPPHHDIYSIEDLAQLIFDLKNVN PQAKISVKLVAESGVGTIAAGVAKAKADLIVISGAEGGTGASPASSIRYAGISPELGLSE TQQTLVLNGLRGQVVLQADGQLKTGRDIILMALMGAEEYGFATSALIVLGCVMMRKCHQN TCPVGVATQNEELRKRFHGRSEYLVNFFTFLAQEVREHLAEMGFTRMDDIIGRTDLIERK SDANDPNPKHALIDFTKLLARVDNSAAIRHVIDQDHGISAVKDVTIIDAAQEAIEHEKEI SLEYTIANTDRAIGAMLSGVIAKKYGAKGLPEHTLNVKFKGSAGQSFGAFLVPGVNFKLE GEANDYLGKGLSGGRISVLPPIRSNFEAEKNTIAGNTLLYGATSGEVYINGRVGERFAVR NSGAVAVVEGVGDHCCEYMTGGRVVVLGQTGRNFAAGMSGGVAYVWNKDGNFDYFCNMEM VELSLIEEASYRKELHELIRQHYLYTGSKLARILLDNWNHYVDQFIQVVPIEYKKVLQEE QMRKLQQKIADMQRDY >gi|225935334|gb|ACGA01000058.1| GENE 18 21885 - 23729 1702 614 aa, chain + ## HITS:1 COG:DR0302 KEGG:ns NR:ns ## COG: DR0302 COG0449 # Protein_GI_number: 15805332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Deinococcus radiodurans # 1 614 37 642 642 552 48.0 1e-157 MCGIVGYIGKRKAYPILIKGLKRLEYRGYDSAGVAIISDDQQLNVYKTKGKVSDLENFVT QKDISGTIGIAHTRWATHGEPCSVNAHPHYSSSEKLALIHNGIIENYAVLKEKLQAKGYI FKSSTDTEVLVQLIEYMKVTNQVSLLTAVQLALGEVIGAYAIAILDKEHPDEIIAARKSS PLVVGIGENEFFLASDATPIVEYTDKVVYLEDGEIAVLNLGKELKVVNLNNVEMIPEIKK VELNLGQLEKGGYSHFMLKEIFEQPDCIRDCMRGRINVEADNVVLSAVIDHREKLLNAKR FIIVACGTSWHAGLIGKHLIESFCRIPVEVEYASEFRYRDPVIDGQDVVIAISQSGETAD TLAAVELAKSRGAFIYGICNAIGSSIPRATHTGSYIHVGPEIGVASTKAFTGQVTVLAML ALTLAKAKGTIDEQHYLSIVQELNHIPEKMKEVLELNDTLAELSKTFTYAHNFIYLGRGY SYPVALEGALKLKEISYIHAEGYPAAEMKHGPIALIDAEMPVVVIATQNGLYEKVLSNIQ EIKARKGKVIAFVTKGDTVISKIADCSIELPETIECLDPLITTVPLQLLAYHIAVCKGMD VDQPRNLAKSVTVE >gi|225935334|gb|ACGA01000058.1| GENE 19 23758 - 25641 1765 627 aa, chain + ## HITS:1 COG:YPO2772 KEGG:ns NR:ns ## COG: YPO2772 COG0034 # Protein_GI_number: 16122976 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Yersinia pestis # 39 535 21 453 505 128 25.0 4e-29 
MEQLKHECGVAMIRLLKPLEYYEKKYGTWMYGLNKLYLLMEKQHNRGQEGAGLACVKLEA NPGEEYMFRERALGSGAITEIFENIQNNFKDLTPEQLHDAEYAKRTLPFAGEIYMGHLRY STTGKSGISYVHPFLRRNNWRAKNLALCGNFNMTNVDEIFARITAIGQHPRKYADTYIML EQVGHRLDREVERVFNLAEAEGLTGMGITHYIEEYIDLANVLRTSSREWDGGYVICGLTG SGESFAIRDPWGIRPAFWYQDDEIAVLASERPVIQTALNVPFEEIKELQPGQALLISKEG KIRTSQINKPRENHACSFERIYFSRGSDVDIYKERKRLGEKLVPRILKAINNDIDHTVFS FIPNTAEVAFYGMLQGLDDYLNEEKVQQIAALGHHPDMEELEVILSRRIRSEKVAIKDIK LRTFIAEGNSRNDLAAHVYDITYGSLVPGVDNLVIIDDSIVRGTTLKQSIIGILDRLGPK KIVIVSSSPQVRYPDYYGIDMAKMSEFVAFRAAVELLKERDMKDVIAAAYRKSKDQVGLP KEQMINYVKDIYAPFTDEEISGKMVELLTPKGTKAKVEIVYQPLEGLHEACPRHKGDWYF SGNYPTPGGVKMVNRAFIDYIEQMYQF >gi|225935334|gb|ACGA01000058.1| GENE 20 25669 - 26814 1011 381 aa, chain + ## HITS:1 COG:YJL130c_1 KEGG:ns NR:ns ## COG: YJL130c_1 COG0505 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Saccharomyces cerevisiae # 2 379 20 405 433 367 45.0 1e-101 MRNVTLILDDGSRFSGKSFGYEKPVAGEVVFNTAMTGYPESLTDPSYAGQLMTLTYPLVG NYGVPPFTIEPNGLATFMESEKIHAEAIIVSDYSYEYSHWNAVESLGDWLKREQIPGITG IDTRELTKVLREHGVMMGKIVFDDEPDNVPEAVYAGVNYVDRVSCKEIICYLPAGTSQTF SVNSSYAQLNSQFSTFNSQFKKVILVDCGVKTNIIRCLLKRNVEVIRVPWDYDYSGLEFD GLFISNGPGDPDTCDAAVQNIRKAMVNEKLPIFGICMGNQLLSKAGGAKIYKLKYGHRSH NQPVRMVGTERCFITSQNHGYAVDNNTLSAEWEPLFINMNDGSNEGIKHKKNPWFSAQFH PEAASGPTDTEFLFDEFVKLL >gi|225935334|gb|ACGA01000058.1| GENE 21 26817 - 30044 3568 1075 aa, chain + ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 6 1057 6 1051 1070 1221 58.0 0 MKENIKKVLLLGSGALKIGEAGEFDYSGSQALKALKEEGIETILINPNIATVQTSEGVAD QIYFLPVTPYFVEKVIQKEKPEGIMLAFGGQTALNCGVALYKEGILEKYNVKVLGTPVQA IMDTEDRELFVQKLNEINVKTIKSEAVENAEDARRAAKELGYPVIVRAAYALGGLGSGFC 
DNEQQLDVLVEKAFSFSPQVLVEKSLRGWKEVEYEVVRDRFDNCITVCNMENFDPLGIHT GESIVIAPSQTLTNKEYHKLRELAIRIIRHIGIVGECNVQYAFDPESEDYRVIEVNARLS RSSALASKATGYPLAFVAAKLGLGYGLFDLKNSVTKTTSAFFEPALDYVVCKIPRWDLGK FHGVDKELGSSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELVISDIDKALRE PTDKRIFVISKAFRAGYTIDQVHELTKIDKWFLQKLMNIMKTSEELHSWGNNHKQIADLP NELLRKAKVQGFSDFQIARAIGYEGDMEDGILYIRKHRKEAGILPVVKQIDTLAAEYPAQ TNYLYLTYSGVANDVHYLGDHKSIVVLGSGAYRIGSSVEFDWCGVQALNTIRKEGWRSVM INYNPETVSTDYDMCDRLYFDELTFERVMDILELENPHGVIVSTGGQIPNNLALRLDAQK INILGTSAKSIDNAEDREKFSAMLDRIGVDQPRWRELTSMDDIQEFVEEVGFPVLVRPSY VLSGAAMNVCSNQEELERFLKLAANVSKKHPVVVSQFIEHAKEVEMDAVAQNGEIVAYAI SEHIEFAGVHSGDATIQFPPQKLYVETVRRIKRISREIAKALNISGPFNIQYLAKDNDIK VIECNLRASRSFPFVSKVLKINFIELATKVMLGLPVEKPEKNLFELDYVGIKASQFSFNR LQKADPVLGVDMASTGEVGCIGMDTSCAVLKAMLSVGYRIPQKNILLSTGTMKQKADMMD AARMLVNKGYKLFATGGTHKTLAENGIESTHVYWPSEEGHPQALEMLHRKEIDMVVNIPK NLTAGELSNGYKIRRAAIDLNIPLITNARLASAFINAFCTMSVDDIAIKSWAEYK >gi|225935334|gb|ACGA01000058.1| GENE 22 30169 - 30489 201 106 aa, chain + ## HITS:1 COG:no KEGG:BVU_0164 NR:ns ## KEGG: BVU_0164 # Name: not_defined # Def: arylsulfatase precursor # Organism: B.vulgatus # Pathway: not_defined # 3 102 365 464 464 90 47.0 1e-17 MVKGIDGISYLPLLTGNGEVSRSHEYIYYEFYEQGGKQSILKDGWKLVRLNMSKPEKLKE ELYYLPDDIGEQHDLLKENPQKVEELRILAMKARTESLHFSWKPEK >gi|225935334|gb|ACGA01000058.1| GENE 23 30663 - 31709 809 348 aa, chain + ## HITS:1 COG:CAC3058 KEGG:ns NR:ns ## COG: CAC3058 COG0836 # Protein_GI_number: 15896309 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 6 342 5 340 350 284 42.0 2e-76 MDNHVVIMAGGIGSRFWPMSTPECPKQFIDVMGCGRSLIQLTADRFDGVCPRENMWVVTS EKYIDIVREQLPEIPESNILAEPCARNTAPCIAFACWKIKKKHPNANIVVTPSDALVIDT GEFRRVMEKALRFTDDGSAIVTIGIRPTRPETGYGYIAAADQLQTDKEIYTVDAFKEKPD KETAEKYLAEGNFFWNAGIFVWNVRTITSVMRVYAPGIAQIFDRIFPDFYTEKENETIKK LFPTCESISIDYAVMEKAQEIYVLPASFGWSDLGTWGALRGLLPQDKSGNATVGTDIRLY ESKNCIVHASEEKRVVIQGLDGYIIAEKDNTLLICKLEEEQRIKEFSK 
>gi|225935334|gb|ACGA01000058.1| GENE 24 31713 - 32609 435 298 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 195 295 185 285 288 75 32.0 1e-13 MEKKIPLNIGLEILENPDIVRFNEFRFISPHFGFVNSFAKIESNLITIDQPYRIKEGRIA VIKQGYARVLINLIEETFQPDKLLLIVPNSILQIIEVSPDFDMQMVGMDAEFLSIAGKDN FFTHLSQLQKNLILSLTSHEYGLFKSYFALIWNILQEPTFRREAIQYLLISLLCNIEYLV KNNNQKEQSCQTHQEEIFQRFISLVNAYSKRERNVSFYADKLCLTPRYLNTIIRQTSQQT VMDWINQSIILEAKVLLKHSNLLVYQISDELNFPNPSFFSKFFKRMTGMTPHEYQFSK >gi|225935334|gb|ACGA01000058.1| GENE 25 32771 - 34030 1051 419 aa, chain + ## HITS:1 COG:no KEGG:BT_0560 NR:ns ## KEGG: BT_0560 # Name: not_defined # Def: outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 418 6 408 409 665 87.0 0 MKRMVFSFSFLLFVSGINAQITLEECQRKTQENYPLVHQYGLVEKTKEYNLENAAKGYLP QFALSAKASYQSEVTEIPVKLPGVDLKGVPKDQYQVMLELQQKIWDGGGIRMQKKQTTAE AEIEKEKLNVDMYALNSRVNDLYFGILLLDEQLKQNALLQDELERNYRQITAYVENGIAN QADLDAVKVEQLNTKQKRVELVSSRMAYLKMLSLLVGEKLSQETVLEKPVPQDDISAVGE IRRPELSLFNAQGVGLQVQEKALNVRHLPQFGLFVQGAYGNPGLNMLKNEFSPYYIAGVR LSWNFGSLYTLKNDRKVIENKRRQLDNNRDVFLFNTRLEMTQQDQAIQSLEKQMQDDDEI IRLRTNIRKSAEAKVANGTLTVTEMLRELTNESLARQSKALHEIQRLMGIYQLKYTTND >gi|225935334|gb|ACGA01000058.1| GENE 26 34023 - 34952 906 309 aa, chain + ## HITS:1 COG:PA5232 KEGG:ns NR:ns ## COG: PA5232 COG0845 # Protein_GI_number: 15600425 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 37 288 41 314 357 86 25.0 6e-17 MINITSLDMERMTKIGMVGVALIMLAACGKGIPGYDATGTFEATEVIVSAEAAGKLLRLE VEEGTRLKAGEEIGLVDTVQLYLKKLQLEASMKSVESQRPDLAKQIAATKQQIATAEREK KRVENLLAAGAANQKQLDDWDAQVKLLERQLVAQESSLQNSTNSLIEQGNSVAIQVAQME DQLAKCHVQSPIEGTVLAKYAEAGELAAIGKPLFKVAEVDRMYLRAYITSEQLSQVKLGD EVTVYADYGNSEQKAYPGVVTWISDRSEFTPKTILTKNERANLVYAVKIAVKNDGALKIG MYGGVTLKN >gi|225935334|gb|ACGA01000058.1| GENE 
27 34977 - 36446 353 489 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 248 471 12 237 318 140 34 1e-32 MEKGEPIVVVKEVGKNYGTVEALKEVSFVVERGEIFGLIGPDGAGKSSLFRILTTLLLAD KGTAMVAGLDVVTDYKQIRTKVGYMPGRFSLYQDLSVEENLELFATVFHTTVQENYDLIK DIYQQIEPFRKRRAGALSGGMKQKLALSCALIHKPDILFLDEPTTGVDPVSRKEFWQMLR NLREQGITIVVSTPIMDEARQCDRIAFINHGQIHGIDTPERILHQFASILCPPSLEREEV KHEVAPVIEVEQLTKCFGDFTAVDHISFLVNRGEIFGFLGANGAGKTTAMRMLCGLSKPT SGIGKVAGYDIYREAEQVKKHIGYMSQKFSLYEDLKVWENIRLFAGIYGMKEQEIERKTE ELLDRLGLTAERDTLVKSLPVGWKQKLAFSVSIFHEPRIVFLDEPTGGVDPATRRQFWEL IYQAADQGITVFVTTHYMDEAEYCNRISIMVDGQIKALDTPARLKQQFGAETMDDVFQKL ARGAVRKAD >gi|225935334|gb|ACGA01000058.1| GENE 28 36449 - 37552 800 367 aa, chain + ## HITS:1 COG:CAC3268 KEGG:ns NR:ns ## COG: CAC3268 COG0842 # Protein_GI_number: 15896513 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 2 365 4 374 378 220 33.0 3e-57 MRQFIAFIKKEFYHIFRDRRTMLILLGMPVVQIILFGFAISTEVKNVRLAVLDPSNDVVT RKIIDRLDASEYFTVTARFHSPQEMEAAFLKNEVDMAIVFSERFADDLYTGDARVQLIVD ATDPNMSTSQVNYATGIVSMVGQEMIPPNMSAARLTPDIKLLYNPQMKSAYNFVPGVMGL ILMLICAMMTSISIVREKETGTMEVLLVSPVKPLFIILAKAVPYFVLSFVNLITILLLSV FVLDVPVVGSLFWLITVSLLFIFVSLALGLLISSVTRTQVAAMLVSGLMLMMPTMLLSGM IFPIESMPVILQWISDILPARWYIQAVRKLMIEGVPVALVYKEIGILLLMATVLITISIK KFKYRLE >gi|225935334|gb|ACGA01000058.1| GENE 29 37630 - 38745 648 371 aa, chain + ## HITS:1 COG:SMb21204 KEGG:ns NR:ns ## COG: SMb21204 COG0842 # Protein_GI_number: 16264618 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 2 369 6 367 370 169 30.0 7e-42 MIKFLIEKEFKQLLRNSFLPKLILVFPCMIMLLMPWAVNLEIKNIQLNIVDNDHSAISQR LVNKIAASTYFRLVEVPASYEEGLRNIEIGTADIVMEIPRHLERDWMNGEDTHILIAANA VNGTKGGLGSSYLSSIINDYAAELRSEYPATATVSGAFPSIGIDTQGLFNPNLNYKLYMI PALMVMLLTLICGFLPALNIVSEKEVGTIEQINVTPVPKFIFILAKLLPYWLIGFVVLTV 
CFILAWLIYGIVPVGHFLLIYFFAVLFVLVMSGFGLVISNYSATMQQSMFVMWFCLLVVI LMSGLFTPISSMPEWAQLITRLNPLRYFMEVMRMVYLKGSGFFDLLPQFGVLLFFAVVFN SWAVISYRKNN >gi|225935334|gb|ACGA01000058.1| GENE 30 38843 - 39271 385 142 aa, chain - ## HITS:1 COG:TM0374 KEGG:ns NR:ns ## COG: TM0374 COG0071 # Protein_GI_number: 15643142 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Thermotoga maritima # 3 142 12 146 147 69 37.0 2e-12 MMPVRRTQSWLPSIFNDFFDNDWMVKANATAPAINVLETEKEYKVELAAPGMTKDDFNVR IDEDNNLVISMEKKTENKEEKKDGRYLRREFSYSKFQQTMILPDNVDKEKIAASVEHGVL NIELPKLSEEEVKKPNRQIEIK >gi|225935334|gb|ACGA01000058.1| GENE 31 39543 - 40019 278 158 aa, chain - ## HITS:1 COG:BS_alaS KEGG:ns NR:ns ## COG: BS_alaS COG0013 # Protein_GI_number: 16079794 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Bacillus subtilis # 6 138 553 686 878 67 28.0 9e-12 MEQQIQLNDHNKQEYPPMHTAEHLLNATMVKTFGCPRSRNAHIERKKSKCDYILSSCPTA EQIQSIEDRVNEVISQNLPVTVEFMTHEQAKDIVDLSKLPADASETLRIVRIGDYDACAC IGLHVSNTSEVDIFKIISYDYDEERQTLRIRFKLIEKK >gi|225935334|gb|ACGA01000058.1| GENE 32 40088 - 40501 237 137 aa, chain - ## HITS:1 COG:no KEGG:BDI_2962 NR:ns ## KEGG: BDI_2962 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 131 1 131 268 223 84.0 1e-57 MAKITVQKTDVTILKINDTDYISLTDIAKYKTTDANAVIANWLRNRMTIEYLGLWEILYN PHFKPLEFEGFKKEAGLNAFTLSPQKWIETTCAIGIISKSGRYGGTFAHKDIAFKFASWI SVEFELYIVKERIDSSN >gi|225935334|gb|ACGA01000058.1| GENE 33 40610 - 41245 473 211 aa, chain - ## HITS:1 COG:ygjQ KEGG:ns NR:ns ## COG: ygjQ COG2949 # Protein_GI_number: 16130981 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli K12 # 3 210 16 218 230 204 47.0 9e-53 MKRKLLYTTLIIAILCVISIIVCNRTIKKHTVSKLYTEVNTIPSNNVGLLLGTSPKLKNG NNNLYFDYRILATVELYKAGKIKYILISGDNRKEDYNEPEEMKKALMQKGIPEKSIYLDY 
AGFRTLDSVVRAKEVFGQNHLTIISQRFHNERAIYLAEKNGIDAIGYNAPNVNAYAGLKT NIRELLARVKMFIDLGIDKQPHFLGEKIIIE >gi|225935334|gb|ACGA01000058.1| GENE 34 41717 - 43390 1318 557 aa, chain - ## HITS:1 COG:CAP0003_1 KEGG:ns NR:ns ## COG: CAP0003_1 COG5492 # Protein_GI_number: 15004708 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 126 221 146 246 501 63 40.0 1e-09 TVTFTLAGEGGATFTLPMASEMKIFDKFDEFKVNSEKHELTLALNVKDGDYAAIKAELTS SKGMEVAIVKATTRAASTPWGVELTSPTFKEDKTIDVNAKVTFTLPANIEKGEFALLKVT VVGKDGTEHSATRVIVYTTDISVENVALADKSVPEGSTVKLIPTFTPANATNQNVTWTSS ELGVATIAADGTVTGVAQGGTTIITVTTEDGGFTATCTVTVTEKLKGYGWYLAGKESGNF EISTLEELKQFGLLVNGDRSALTFNGLSAAVDFSGKTITLKDNINLNNEEWTPIGTSRQP FKGTFDGNGKTISGLKVSNTIAYAGFLGYISAATVKNIKVSDGIVTGTYAGGVVGSSNSS TIEKCSSSASISGQIAGGVVGWAPLSVIKACHATGNVTATKSTNDIVTAGGVVGFISISG SIQACYYTTGTVTGGEGASGNNCCTGGVIGSSKRDIKVYNCYSTGTVVAGTGPQAYNVYT GAVIGYADDGNNNGSLLYVTSKGANRGIGEKSSDDASKVKSITSESDLSANISFLNTGLP GEVGYQFKADGTLEKVQ Prediction of potential genes in microbial genomes Time: Fri May 13 10:49:08 2011 Seq name: gi|225935333|gb|ACGA01000059.1| Bacteroides sp. D2 cont1.59, whole genome shotgun sequence Length of sequence - 1517 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 363 410 ## BT_4606 hypothetical protein - Prom 445 - 504 8.2 - Term 672 - 703 1.5 2 2 Tu 1 . 
- CDS 713 - 1516 568 ## BT_4479 integrase protein Predicted protein(s) >gi|225935333|gb|ACGA01000059.1| GENE 1 3 - 363 410 120 aa, chain - ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 117 394 81 35.0 1e-14 MKKFRHYALLLVTAMAVFGCSKDYDDTAIKNDISDLQSRVEKLETWCTTVNGQISALQGL VTALEAKDYVTGVSPVTNGYTINFSKGESITIYNGKDGAKGADGVTPIIGVDKFEGEYYW >gi|225935333|gb|ACGA01000059.1| GENE 2 713 - 1516 568 267 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 267 40 305 305 400 74.0 1e-110 RQLTPEWLKRFENSLRERGCSWNTVSTYLRTLRAVYNRAVTLRKATYVPHLFRSVYTGTR TDRQRALDNEDMKKVFTRLATQSSATTAMRTAQDWFILMFLLRGLPFVDLAYLRKNDLKE NVITYRRRKTGRTLSVTLTTEAMFLLQRYMNRDVASPYLFPILRSGEGTEEAYREYQLAL RGFNHQLALLGELLGIKGRLSSYAARHTWATTAYYCEIHPGVISEAMGHSSITVTETYLK PFHSKKIDDANRRIIDFVERSIGKAIA Prediction of potential genes in microbial genomes Time: Fri May 13 10:49:40 2011 Seq name: gi|225935332|gb|ACGA01000060.1| Bacteroides sp. D2 cont1.60, whole genome shotgun sequence Length of sequence - 94719 bp Number of predicted genes - 87, with homology - 85 Number of transcription units - 45, operones - 21 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 396 - 455 4.7 1 1 Tu 1 . + CDS 486 - 1190 622 ## COG2755 Lysophospholipase L1 and related esterases 2 2 Tu 1 . - CDS 1305 - 1814 367 ## COG4502 Uncharacterized protein conserved in bacteria - Prom 2034 - 2093 6.7 + Prom 1936 - 1995 6.3 3 3 Tu 1 . + CDS 2041 - 2790 623 ## COG1349 Transcriptional regulators of sugar metabolism + Term 2828 - 2859 1.8 - Term 2806 - 2856 2.0 4 4 Tu 1 . - CDS 3041 - 5071 2118 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 5120 - 5179 5.0 + Prom 5057 - 5116 6.3 5 5 Op 1 3/0.000 + CDS 5207 - 6505 1166 ## COG1541 Coenzyme F390 synthetase 6 5 Op 2 . 
+ CDS 6525 - 6950 452 ## COG4747 ACT domain-containing protein + Term 6977 - 7039 10.1 + Prom 7187 - 7246 6.2 7 6 Op 1 . + CDS 7280 - 8083 767 ## COG4105 DNA uptake lipoprotein 8 6 Op 2 . + CDS 8098 - 8433 470 ## BT_0574 hypothetical protein + Term 8455 - 8523 11.1 + Prom 8463 - 8522 5.3 9 7 Tu 1 . + CDS 8542 - 8985 530 ## BT_0575 hypothetical protein + Term 9010 - 9060 12.9 - Term 9002 - 9044 1.0 10 8 Op 1 . - CDS 9118 - 9831 496 ## BT_0576 hypothetical protein 11 8 Op 2 . - CDS 9860 - 11626 1598 ## COG1388 FOG: LysM repeat - Prom 11748 - 11807 5.8 + Prom 11548 - 11607 4.5 12 9 Tu 1 . + CDS 11785 - 14610 2587 ## COG0178 Excinuclease ATPase subunit + Term 14624 - 14672 13.2 + Prom 14629 - 14688 2.1 13 10 Op 1 . + CDS 14715 - 15194 422 ## COG2606 Uncharacterized conserved protein + Prom 15213 - 15272 2.2 14 10 Op 2 . + CDS 15293 - 16807 1353 ## COG3104 Dipeptide/tripeptide permease + Term 16831 - 16888 14.1 - Term 16818 - 16875 14.1 15 11 Tu 1 . - CDS 16884 - 17381 169 ## PROTEIN SUPPORTED gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase - Prom 17430 - 17489 8.9 + Prom 17422 - 17481 7.5 16 12 Op 1 . + CDS 17546 - 17872 365 ## COG1695 Predicted transcriptional regulators 17 12 Op 2 . + CDS 17894 - 18991 974 ## BT_0583 hypothetical protein + Term 19035 - 19089 8.7 + Prom 19095 - 19154 5.2 18 13 Op 1 . + CDS 19218 - 20144 594 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 19 13 Op 2 . + CDS 20169 - 20675 600 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 20725 - 20778 10.2 - Term 20718 - 20761 3.9 20 14 Op 1 . - CDS 20810 - 22912 2110 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 21 14 Op 2 . - CDS 22961 - 24292 892 ## COG0534 Na+-driven multidrug efflux pump - Term 24303 - 24349 1.1 22 15 Op 1 . - CDS 24376 - 26229 1799 ## COG0706 Preprotein translocase subunit YidC 23 15 Op 2 . 
- CDS 26288 - 27913 1850 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 27969 - 28028 10.1 + Prom 27879 - 27938 4.8 24 16 Tu 1 . + CDS 28102 - 29610 1065 ## BT_0591 hypothetical protein + Term 29631 - 29667 9.6 + Prom 29752 - 29811 6.9 25 17 Tu 1 . + CDS 29870 - 30439 314 ## BT_1358 putative transcriptional regulator + Prom 30452 - 30511 5.4 26 18 Op 1 1/0.000 + CDS 30592 - 32526 1265 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 27 18 Op 2 2/0.000 + CDS 32543 - 33376 791 ## COG1596 Periplasmic protein involved in polysaccharide export 28 18 Op 3 . + CDS 33386 - 35800 1938 ## COG0489 ATPases involved in chromosome partitioning + Term 35921 - 35959 1.2 + Prom 35828 - 35887 5.3 29 19 Op 1 . + CDS 36029 - 36403 208 ## BVU_4025 hypothetical protein 30 19 Op 2 . + CDS 36397 - 36732 132 ## BVU_4024 hypothetical protein + Term 36864 - 36893 -0.4 + Prom 36736 - 36795 3.6 31 20 Tu 1 . + CDS 36945 - 38585 1096 ## COG3436 Transposase and inactivated derivatives + Term 38635 - 38660 -0.5 32 21 Op 1 . + CDS 38687 - 40108 1047 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 33 21 Op 2 8/0.000 + CDS 40143 - 41459 1237 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 34 21 Op 3 5/0.000 + CDS 41464 - 42543 819 ## COG0451 Nucleoside-diphosphate-sugar epimerases 35 21 Op 4 1/0.000 + CDS 42561 - 43535 839 ## COG0451 Nucleoside-diphosphate-sugar epimerases 36 21 Op 5 . + CDS 43570 - 44838 1118 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 37 21 Op 6 . + CDS 44925 - 46448 626 ## BVU_2391 putative transmembrane protein + Prom 46510 - 46569 6.4 38 22 Op 1 1/0.000 + CDS 46598 - 47596 605 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 39 22 Op 2 . + CDS 47593 - 48378 244 ## COG3274 Uncharacterized protein conserved in bacteria 40 22 Op 3 . + CDS 48350 - 48625 62 ## 41 22 Op 4 . 
+ CDS 48653 - 49591 481 ## FIC_00340 hypothetical protein 42 22 Op 5 11/0.000 + CDS 49597 - 50418 680 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 43 22 Op 6 26/0.000 + CDS 50423 - 51448 321 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 44 22 Op 7 . + CDS 51449 - 52585 753 ## COG0438 Glycosyltransferase 45 22 Op 8 . + CDS 52591 - 53724 664 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 46 22 Op 9 . + CDS 53740 - 54795 604 ## FIC_00343 putative capsular polysaccharide biosynthesis protein YveQ 47 22 Op 10 1/0.000 + CDS 54810 - 55922 815 ## COG0438 Glycosyltransferase + Term 55991 - 56030 -1.0 + Prom 56017 - 56076 3.3 48 22 Op 11 . + CDS 56098 - 57087 454 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 49 22 Op 12 . + CDS 57093 - 57530 362 ## BT_0397 hypothetical protein + Term 57558 - 57588 -0.3 50 23 Tu 1 . - CDS 57743 - 58171 325 ## COG3023 Negative regulator of beta-lactamase expression - Prom 58277 - 58336 3.3 - Term 58387 - 58424 -1.0 51 24 Tu 1 . - CDS 58471 - 58959 651 ## BT_1705 hypothetical protein - Prom 59141 - 59200 4.1 + Prom 58920 - 58979 6.3 52 25 Tu 1 . + CDS 59179 - 59397 178 ## BT_1704 hypothetical protein + Term 59405 - 59442 3.1 - Term 59415 - 59471 1.5 53 26 Op 1 . - CDS 59530 - 61401 815 ## BT_1703 hypothetical protein 54 26 Op 2 . - CDS 61429 - 62058 371 ## BT_1637 hypothetical protein - Prom 62186 - 62245 7.6 + Prom 62033 - 62092 10.1 55 27 Tu 1 . + CDS 62229 - 62369 150 ## gi|160884254|ref|ZP_02065257.1| hypothetical protein BACOVA_02232 + Term 62522 - 62573 5.7 + Prom 62495 - 62554 8.1 56 28 Op 1 . 
+ CDS 62742 - 63167 197 ## BT_0616 hypothetical protein 57 28 Op 2 10/0.000 + CDS 63175 - 64140 871 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 58 28 Op 3 12/0.000 + CDS 64154 - 65491 1269 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 59 28 Op 4 12/0.000 + CDS 65497 - 66489 1099 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 60 28 Op 5 13/0.000 + CDS 66516 - 67232 955 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 61 28 Op 6 3/0.000 + CDS 67247 - 67834 707 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE + Prom 67848 - 67907 5.0 62 28 Op 7 . + CDS 67931 - 68503 716 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA + Term 68528 - 68593 15.1 + Prom 68518 - 68577 6.5 63 29 Tu 1 . + CDS 68728 - 69762 958 ## COG1087 UDP-glucose 4-epimerase + Term 69794 - 69835 7.6 - Term 69782 - 69823 5.1 64 30 Tu 1 . - CDS 69880 - 70002 57 ## - Prom 70048 - 70107 3.5 + Prom 70002 - 70061 7.6 65 31 Op 1 . + CDS 70087 - 70386 240 ## COG4680 Uncharacterized protein conserved in bacteria 66 31 Op 2 . + CDS 70397 - 70759 346 ## BDI_0882 hypothetical protein 67 32 Tu 1 . - CDS 70792 - 71622 514 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase - Prom 71670 - 71729 5.3 + Prom 71773 - 71832 5.2 68 33 Tu 1 . + CDS 71861 - 73420 1429 ## COG0305 Replicative DNA helicase + Prom 73465 - 73524 3.4 69 34 Op 1 . + CDS 73598 - 76060 2945 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 70 34 Op 2 . + CDS 76091 - 76831 878 ## COG0217 Uncharacterized conserved protein 71 34 Op 3 . + CDS 76841 - 77086 245 ## BT_0628 hypothetical protein + Term 77126 - 77167 7.1 + Prom 77093 - 77152 6.5 72 35 Op 1 . + CDS 77284 - 78537 1150 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family 73 35 Op 2 . + CDS 78567 - 79328 766 ## COG0708 Exonuclease III + Term 79417 - 79467 3.1 74 36 Op 1 . - CDS 79323 - 79730 343 ## COG0432 Uncharacterized conserved protein 75 36 Op 2 . 
- CDS 79742 - 80008 249 ## gi|260174499|ref|ZP_05760911.1| hypothetical protein BacD2_21762 - Prom 80033 - 80092 5.1 + Prom 80020 - 80079 8.4 76 37 Op 1 . + CDS 80103 - 80561 466 ## BT_0631 hypothetical protein + Prom 80605 - 80664 4.1 77 37 Op 2 . + CDS 80715 - 80915 329 ## BF2557 hypothetical protein + Term 80934 - 81007 8.1 + Prom 80922 - 80981 8.2 78 38 Tu 1 . + CDS 81044 - 82825 1897 ## COG0481 Membrane GTPase LepA + Term 82831 - 82896 15.3 + Prom 82877 - 82936 2.1 79 39 Tu 1 . + CDS 82965 - 84143 989 ## BT_0633 putative Na+/H+ exchange protein + Term 84152 - 84186 1.0 80 40 Op 1 . - CDS 84299 - 85534 1113 ## COG1322 Uncharacterized protein conserved in bacteria 81 40 Op 2 . - CDS 85534 - 86388 703 ## COG0024 Methionine aminopeptidase - Prom 86417 - 86476 2.1 82 41 Tu 1 . - CDS 86489 - 87112 160 ## BT_0639 hypothetical protein 83 42 Tu 1 . - CDS 87224 - 88864 637 ## COG1032 Fe-S oxidoreductase - Prom 88928 - 88987 5.8 + Prom 88884 - 88943 5.1 84 43 Tu 1 . + CDS 88976 - 89374 405 ## BT_0641 hypothetical protein + Term 89388 - 89451 20.6 - Term 89383 - 89430 13.8 85 44 Op 1 . - CDS 89449 - 90369 203 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 86 44 Op 2 . - CDS 90406 - 91776 1216 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 92000 - 92059 5.4 + Prom 91828 - 91887 6.5 87 45 Tu 1 . 
+ CDS 92024 - 94718 2974 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase Predicted protein(s) >gi|225935332|gb|ACGA01000060.1| GENE 1 486 - 1190 622 234 aa, chain + ## HITS:1 COG:CAC3448 KEGG:ns NR:ns ## COG: CAC3448 COG2755 # Protein_GI_number: 15896689 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Clostridium acetobutylicum # 58 222 15 178 190 62 34.0 5e-10 MDKRRWGQWILLAVVSLSFSIGESYAQKNDFANLRRYSKENAALPQATKKDKRVIFMGNS ITEGWVRTHPDFFKSNGYIGRGISGQTSYQFLLRFREDVINLSPALVVINAGTNDVAENT QAYNEDYTFGNIVSMAELAKANKIKVILTSVLPAAAFKWRMEIKDVPQKIKSLNDRIEAY AKANKIPFVNYYKAMVVDENQALNPQYTKDGVHPTSEGYDIMEPLIKNAIEKAL >gi|225935332|gb|ACGA01000060.1| GENE 2 1305 - 1814 367 169 aa, chain - ## HITS:1 COG:SA0680 KEGG:ns NR:ns ## COG: SA0680 COG4502 # Protein_GI_number: 15926402 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 2 165 3 170 180 125 38.0 4e-29 MKQQVLVDMDGVLADVYTQFIDWEYRESGIQLAIESLYGKTENEAFPSCDKHVRSTGFFR TAPVIPGSIEGLKYLNEKYKVLIVSSATEYPQSLIEKQEWLNEYFPFITWEQMIFCGRKD SIKGDIMIDDHPKNLSVFDGERYIFTQPHNMFITDYMRVNSWKDIIDTL >gi|225935332|gb|ACGA01000060.1| GENE 3 2041 - 2790 623 249 aa, chain + ## HITS:1 COG:AGc4939 KEGG:ns NR:ns ## COG: AGc4939 COG1349 # Protein_GI_number: 15889977 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 239 8 242 266 138 33.0 1e-32 MLKEERHQYILNRLNENYRVYITALSQELGVSDDTLRRDLIDLDEQGLLTKVHGGAIAKS GISHLFVDRLNLGIVEKQEIASKVVPLFQPGDIVLIDGGTSNLEVARQIPTNMELTIYTN SFPIVNVLMNHPKLELIFLGGKIFPSSQVTVGISVYQALQTVRPDWLVLGVCSVHPQQGL TAPDREEAMVKRLMVERAKKKIVLADSHKLNKAEDYIIASLGDIDYLVTEDSKVEAIMEH WPKHSYVIL >gi|225935332|gb|ACGA01000060.1| GENE 4 3041 - 5071 2118 676 aa, chain - ## HITS:1 COG:BS_uvrB KEGG:ns NR:ns ## COG: BS_uvrB COG0556 # Protein_GI_number: 16080570 # 
Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus subtilis # 3 668 5 658 661 728 56.0 0 MNYELTSAYKPTGDQPEAIAQLTEGVLQGVPAQTLLGVTGSGKTFTIANVIANINKPTLI LSHNKTLAAQLYSEFKGFFPNNAVEYYVSYYDYYQPEAYLPNSDTYIEKDLAINDEIDKL RLSATSALLSGRKDVVVVSSVSCIYGMGNPSDFYKNVIEIERGRMMDRNVFLRRLVDSLY VRNDIDLNRGNFRVKGDTVDIYLAYTDNLLRVTFWGDEIDGIEEVDPITGVTIAPFDAYK IYPANLFMTTKEATLRAIHEIEDDLTKQVAFFESIGKEYEAKRLYERVTYDMEMIRELGH CSGIENYSRYFDGRAAGTRPYCLLDFFPDDFLIVIDESHVSVPQIRAMYGGDRARKINLV EYGFRLPAAMDNRPLKFEEFQEMAKQVIYVSATPADYELIQSEGIVVEQVIRPTGLLDPI IEVRPSLNQIDDLMEEIQLRIEKEERVLVTTLTKRMAEELAEYLLNNNVRCTYIHSDVDT LERVKIMDDLRQGVYDVLIGVNLLREGLDLPEVSLVAILDADKEGFLRSHRSLTQTAGRA ARNVNGKVIMYADKMTDSMKLTIDETNRRREKQLAYNEANGITPQQIKKARNLSVFGNAK EADELLKERHAYVEPSTPNIAADPIVQYMSKAQMEKSIERTRKLMQEAAKKLEFIEAAQY RDELLKLEDLMKEKWG >gi|225935332|gb|ACGA01000060.1| GENE 5 5207 - 6505 1166 432 aa, chain + ## HITS:1 COG:MTH1855 KEGG:ns NR:ns ## COG: MTH1855 COG1541 # Protein_GI_number: 15679843 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 431 1 431 433 506 55.0 1e-143 MIWNESIECMDRESLRKIQSIRLKKIVDYVYHNTPFYRKKMQEMGITPDDINSIDDIVKL PFTTKHDLRDNYPFGLCAVPMSQIVRIHASSGTTGKPTVVGYTRKDLSTWAECISRAFTA YGAGRSDIFQVSYGYGLFTGGLGAHAGAENIGASVIPMSSGNTEKQITLMHDFGSTVLCC TPSYALYLADAIKDSGLPREEFRLKVGAFGAEPWTENMRHEIEEKLGIKAYDIYGLSEIA GPGVGYECECQHGTHLNEDHYFPEIIDPNTLQPVESGQTGELVFTHLTKEGMPLLRYRTR DLTALHHDKCSCGRTLVRMDRILGRSDDMLIIRGVNVFPTQIESVILEMEEFEPHYLLIV GRENNTDTMELQVEVRPEFYSDEINKMLALKKKLGGRLQSVLGLGVNVKLVEPRSIERSV GKAKRVIDNRKI >gi|225935332|gb|ACGA01000060.1| GENE 6 6525 - 6950 452 141 aa, chain + ## HITS:1 COG:MTH1854 KEGG:ns NR:ns ## COG: MTH1854 COG4747 # Protein_GI_number: 15679842 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Methanothermobacter thermautotrophicus # 1 141 1 143 143 109 40.0 1e-24 
MVAKQLSIFLENKSGRLTEVTEVLAKENINLSALCIAENADFGILRGIVSDPDRAYKALK DNHFAVNVTDVVGISCPNIPGALAKVLGYLSEEGVFIEYMYSFANNHIANVVIRPSNLDK CIEVLKEKKVDLLAASDLYKL >gi|225935332|gb|ACGA01000060.1| GENE 7 7280 - 8083 767 267 aa, chain + ## HITS:1 COG:BMEI0587 KEGG:ns NR:ns ## COG: BMEI0587 COG4105 # Protein_GI_number: 17986870 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Brucella melitensis # 8 258 40 282 309 58 22.0 9e-09 MKKNIIITLLAAATLTSCGEYNKLLKSTDYEYKYEAAKNYFAKGQYNRSATLLNELITIL KGTDKAEESLYMLGMSYYNQKDYQTAAQTFITYFNTYPRGTFTELARFHAGKALFLDTPE PRLDQSSTYQAIQQLQMFMEYFPNSTKKQEAQDMIFALQDKLVLKELYSARLYYNLGNYL GNNYESCVITAQNALKDYPYTDYREELSILVLRARHEMAIYSVEDKKMDRYRETIDEYYA FKNEFPESKYLKEAEKIFNESQKVIKD >gi|225935332|gb|ACGA01000060.1| GENE 8 8098 - 8433 470 111 aa, chain + ## HITS:1 COG:no KEGG:BT_0574 NR:ns ## KEGG: BT_0574 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 186 100.0 3e-46 MDYKKTNAPATTVTRDMMELCADTGNVYETVAIIGKRANQISVEIKNDLSKKLAEFASYN DNLEEVFENREQIEISRYYEKLPKPDLIATQEYIEGKIYYRNPAKEKEKLQ >gi|225935332|gb|ACGA01000060.1| GENE 9 8542 - 8985 530 147 aa, chain + ## HITS:1 COG:no KEGG:BT_0575 NR:ns ## KEGG: BT_0575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 222 85.0 3e-57 MIQRIQTVYLLIVVGLLITAMCLPMGYFIDAMGEHPFKALGLEVNGAIQSTWGLFGILMV STIVAVATIFLYKNRMLQIRMTIFNSLLLVGYYIAALAFYFALKNDENMFRIGWALCLPL VSIILNILAIRAIGRDEVMVKAADRLR >gi|225935332|gb|ACGA01000060.1| GENE 10 9118 - 9831 496 237 aa, chain - ## HITS:1 COG:no KEGG:BT_0576 NR:ns ## KEGG: BT_0576 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 237 8 240 240 368 78.0 1e-100 MIATLLLTGFALPAVAQIGEARNNLSVGINGGVNLNSASFTPTIKQNSLMGITGGLTARY ISEKYFAMICGAQVELNVSQRGWDQLFETISLDANGYEVTSKDLTKTYTRKMTYIDIPFL AHLAFGRDRGLQFFVHAGPQISFLISESETIKGIDMNSLSDTQKALYGVKIQNKFDYGIA 
GGGGVELRTKKAGSFIVEGRYYFALSDFYSTTKKDYFARAAHGTITIKLTYLFDLKK >gi|225935332|gb|ACGA01000060.1| GENE 11 9860 - 11626 1598 588 aa, chain - ## HITS:1 COG:BB0625_1 KEGG:ns NR:ns ## COG: BB0625_1 COG1388 # Protein_GI_number: 15594970 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Borrelia burgdorferi # 32 183 165 308 350 61 26.0 5e-09 MKPINRIFLFLLFISVSYAISYAQENQSYFLHTIEKGQSLYSISKMYNVTTSDIIRLNPG CDEKIYAGQTIKIPTGKESQKGETFHTIQAGETLYKLTTMYNVSAKDICEANPGLSAENF RIGQVILIPQKKEEQTATATQTPTEQTTIQGPVVPKCKDMHKVKRKETIFSVSREYGISE QELIAANPELKKGMKKGQFLCIPYPAATTVQPIQKEDPYAIPPSNSELFRKSKETPQKLS TIKAALLLPFQEDKRMVEYYEGFLMAVDSLKRTGISLDLYVYDCGKDVSTLNTILAKNEM KSMNIIFGPMHQNQIKPLSDFAEKNDIRLVIPFSQKGEEVFNNPAVYQINTPQSYLYSEV YEHFTRQFPNANVIFIEPASVDKEKAEFISGLKQELKSKGISMKTVGESATKETLKAALR SDKENIFIPTSGNNVLLIKILPQLTLLVRENPAENIHLFGYPEWQTYTRDHLENFFELDV YFYSSFYTNTLFPAAVQFTNAYHKWYSKDLASKYPNYAMLGFDTGFFFLKGLSLYGSELE NNLPKMNLTPIQTGFKFERVNNWGGFINKKVFFIRFTKNFELVKLDFE >gi|225935332|gb|ACGA01000060.1| GENE 12 11785 - 14610 2587 941 aa, chain + ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 7 936 6 935 957 1025 56.0 0 MQETEYINVYGARVHNLKDIDAEIPRNSLTVITGLSGSGKSSLAFDTIFAEGQRRYIETF SAYARNFLGNLERPDVDKITGLSPVISIEQKTTNKNPRSTVGTTTEIYDYLRLLYARAGV AYSYLSGEEMVKYTEEQILDLILKDYKGKKIYLLAPLVRSRKGHYRELFEQIRKKGYLYV RVDGEVREITHGMKLDRYKNHDVEVVIDKLVVAEKDDRRLKQSVATAMRQGDGLMMILDA QSESIRHYSKRLMCPVTGLSYREPAPHNFSFNSPQGACPKCKGLGVVNQIDVDKVIPDRE LSIYEGAIAPLGKYKNAIIFWQIGALLEKYDATLKTPIKELPDDAIDEVLYGSDERIKIK SSLIGTSSDYFVTYEGVVKYIQMLQEKDASATAQKWAEQFAKTTVCPECKGARLNKEALH FRIHDKNINELANMDINELYDWLMKVDEFLSDKQKKISVEILKEIRTRLKFLLDVGLDYL ALNRSSVSLSGGESQRIRLATQIGSQLVNVLYILDEPSIGLHQRDNLRLINSLKELRDMG NSVIVVEHDKDMMLAADYVIDMGPKAGRLGGEVVFAGTPQEMLNTDTMTSQYLNGKMKIE IPAKRRKGNGKSIWLKGAKGNNLKNVDVEFPLGKLICVTGVSGSGKSTLINETLQPILSQ 
KFYRSLQDPLEYDSIEGLENIDKVVDVDQSPLGRTPRSNPATYTGVFSDIRNLFVGLPEA KIRGYKPGRFSFNVAGGRCEACTGNGYKTIEMNFLPDVYVPCEVCHGKRYNRETLEVRFK GKSIADVLDMTINRAVEFFENVPQILNKIKVLQDVGLGYIKLGQSSTTLSGGESQRVKLA TELSKRDTGKTLYILDEPTTGLHFEDIRVLMGVLNKLVDKGNTVIVIEHNLDVIKMADYI IDMGPEGGKGGGELLSYGTPEEVAKSPKGYTPKFLREELGL >gi|225935332|gb|ACGA01000060.1| GENE 13 14715 - 15194 422 159 aa, chain + ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 159 4 158 158 172 57.0 3e-43 MKINKTNAARLLDKAKIAYELIPYEVDENDLSAVHVAASLGENIEQVFKTLVLHGDKSGY FVCVIPGEHEVDLKLAAKASGNKKCELIPVKELLPLTGYIRGGCSPIGMKKHFPTYIHET SQQFPYIYVSAGVRGLQIKIAPEDLIRESRAEICRLFEE >gi|225935332|gb|ACGA01000060.1| GENE 14 15293 - 16807 1353 504 aa, chain + ## HITS:1 COG:lin0564 KEGG:ns NR:ns ## COG: lin0564 COG3104 # Protein_GI_number: 16799639 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Listeria innocua # 2 458 9 443 492 153 29.0 1e-36 MFNKHPKGLIAAALANLGERFGFYTMMAILVLFLQAKFGMNGKEAGFIYSTFYFSIYILA LVGGIIADKTRNYKGTIFTGIVLMAVGYLLLAIPSKTPVDNKMLYLVITCASLFVIAFGN GLFKGNLQALVGQMYDNPQYANMRDSGFSLFYMFINIGAIFAPFAAVGVRNWWLSTFGYN YDADLPALCHGQLAGTLSPEATETYHTLVTKASTAPVQDYTAFASDYLNVFTTGFHYAFG VAIIAMAISLTIYWLNKKNFPDPSKKTVASSASSTVVEMSAQEVKQRMYALFAVFGVVIF FWFSFHQNGLTLTYFAKEYTDLNLFGMPISAELFQSLNPFFVVFLTPVIMAIFASQRRRG KEPSTPKKIAIGMGIAALAFIVMAVGSYFANLPLHEDIVAVGTSPVKVTPFLLMLTYLIL TVAELYISPLGISFVSKVAPPKYQGIMQGGWLGATALGNQLLVIGAILYESIPIWMTWTV FVVACTISMFTMIFMLKWLERVAK >gi|225935332|gb|ACGA01000060.1| GENE 15 16884 - 17381 169 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 13 161 15 160 169 69 31 5e-11 MFTIRKATVADCELIHKMAKEVFPATYKEILSPEQLDYMMDWMYAPSNVRKQMEEEGHVY SIAYKENEPCGYVSIQQQEKDVFHLQKIYVLPRFQGTHCGSFLFKEAIRCIKEIHPGPCL 
MELNVNRNNKALQFYEHMGMRKLQEGDFPIGNGYYMNDYIMGLDI >gi|225935332|gb|ACGA01000060.1| GENE 16 17546 - 17872 365 108 aa, chain + ## HITS:1 COG:BH0406 KEGG:ns NR:ns ## COG: BH0406 COG1695 # Protein_GI_number: 15612969 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 5 108 67 170 174 77 36.0 5e-15 MKVDNVKSQMRKGMLEYCIMLLLHKEPAYASDIIQKLKEAQLIVVEGTLYPLLTRLKNDD LLSYEWVESTQGPPRKYYKLTEQGEVFLGELEISWKELNDTVNHIANR >gi|225935332|gb|ACGA01000060.1| GENE 17 17894 - 18991 974 365 aa, chain + ## HITS:1 COG:no KEGG:BT_0583 NR:ns ## KEGG: BT_0583 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 363 364 585 83.0 1e-165 MKKTLTINLGGIVYHIDDDAYRLLDNYLSNLKHYFRKQEGAEEIVNDIEMRIAELFAEKV TEGKQVITVSDVEEIIARVGKPEDFGIADEDMDSQKRTEQTSSANQGSTQTSAQRRWFRD PDNKLLGGVAAGLAAYFGWDITLVRILMIILVFVPYCPMIILYIIGWLVIPEARTAAEKL SMRGEAVTIENIGKTVTDGFERVADGVNNYMNSGKPRTFLQKIGDVFVSIAAVLFKIFLV ALVILCCPVLFVLAVVLVALVFAAIAVAVSGGALLYEMLPAIDWMPVASVSPMMTLLGTI AGVALIGIPLGAFLYTILRQLFHWSPMGTGLKWSLLILWILGAVIMIINLSALGWQLPLY GLHCS >gi|225935332|gb|ACGA01000060.1| GENE 18 19218 - 20144 594 308 aa, chain + ## HITS:1 COG:YPO1228 KEGG:ns NR:ns ## COG: YPO1228 COG2220 # Protein_GI_number: 16121515 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Yersinia pestis # 55 280 84 314 342 168 36.0 1e-41 MVRGRFFNRQHRFRPGMGSVLKWRLSSNPQRKEKKTVKWDPKVCYLRSLDAVVGDSLIWL GHNSFFLQLAGKRIMFDPVFGSIPFVKRQSEFPANPDIFTGIDYLLVSHDHFDHLDKQSI ARLLKNNPQMKLFCGLGTGELIQGWFPEMKVIEAGWYQQMEDEGLKITFLPAQHWSKRSV RDGGQRLWGAFMLQGNGISLYYSGDTGYSSHFREIPDLFGAPDYALLGIGAYKPRWFMRP NHISPHESLTAAEEMHAGLTIPMHYGTFDLSDEPLHDPPKVFATEAKKRKISVEIPYLGE IVKLRKQK >gi|225935332|gb|ACGA01000060.1| GENE 19 20169 - 20675 600 168 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved 
protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 2 128 4 131 180 75 35.0 3e-14 MKKLEVKDLKENFFETIGKEWMLVTAGTKEKFNTMTASWGGIGWLWNKPVAFVFIRPERY TYEFVEKNDYLTLSFLGEENKKIHAVCGSKSGRNIDKVKETGLKPVFTEKGNVLFEQARL SLECKKLYADTIKPECFLDKESIEKWYGGAHGGFHKMYIVEIENIYSE >gi|225935332|gb|ACGA01000060.1| GENE 20 20810 - 22912 2110 700 aa, chain - ## HITS:1 COG:XF2260 KEGG:ns NR:ns ## COG: XF2260 COG1506 # Protein_GI_number: 15838851 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Xylella fastidiosa 9a5c # 51 694 52 707 709 334 31.0 4e-91 MRQANLIMMSAAMLLAACGGTKDAGKTDQVLIEKSDIKIEGKRMTPEALWAMGRIGGFAV SPDGKKIAYTVAYYSVPENKSNREVFVMNADGSENQQITHTPYQENEVTWIKGGTKLAFL SNDNGSSQLYEMNPDGSGRKQLTNYDGDIEGYSISPDGKKLLFISQVKTKESTADKYPDL PKATGIIVTDLMYKHWDEWVTTAPHPFVADFDGNGISNIVDILNDEPYESPMKPWGGIEQ LAWNMTSDKVAYTCRKKTGLEYAISTNSDIYVYDLNTKKTENITEENKGYDTNPQYSPDG KYIAWQSMERDGYEADLNRLFIMNLETGEKRFVSKTFESNVDAFVWGADAKVIYFTGVWH GESQIYALDLANDSVKAITSGMYDYEGVALFGDKLIAKRHSMSMGDEIYSVALDGSTTQL TQENKQIYDQLEMGKVEGRWMKTTDGKQMLTWVIYPPQFDPNKKYPTLLFCEGGPQSPVS QFWSYRWNFQIMAANDYIIVAPNRRGLPGFGVEWNEQISGDYGGQCMKDYFTAIDEMAKE PYVDKDRLGCVGASFGGFSVYWLAGHHDKRFKAFIAHDGIFNMEMQYLETEEKWFANWDM GGAYWEKQNPVAQRTFANSPHLFVEKWDTPILCIHGEKDYRILANQAMAAFDAAVMRGVP AELLIYPDENHWVLKPQNGVLWQRTFFEWLDKWVKKAPSK >gi|225935332|gb|ACGA01000060.1| GENE 21 22961 - 24292 892 443 aa, chain - ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 8 443 20 454 464 127 24.0 7e-29 MKTKYSYKQIWTISYPILISLIMEQMIGMTDTAFLGRVGEIELGASAIAGVYYLAIFMMA FGFSIGAQILIARRNGEGNYKEIGPIFYQGIYFLLAMAVILFTFSIVFSPYILKNIISSP HIYDAAESYIHWRVYGFFFSFIMVMFRAFFVGTTQTKTLTLNSIVMVLSNVVFNYILIFG KFGFPQLGIAGAAIGSSLAEMVSVIFFIIYTWKRIDCRKYALNILPKFQGKTLKRILNVS VWTMIQNFVSLSTWFMFFLFVEHLGERSLAIANIIRNVSGIPFMIAMAFASTCGSLVSNL 
IGAGEQDCVRGTIRQHIRIGYIFVLPILIFFCLFPDLILRIYTDMPDLRAASVPSLWVLC SAYLVLVPANVYFQSVSGTGNTRTALAMELCVLAIYVTYSAYFIMYLRMDVAFAWTTECV YGIFTLMFCYWYMKKGNWQKKKI >gi|225935332|gb|ACGA01000060.1| GENE 22 24376 - 26229 1799 617 aa, chain - ## HITS:1 COG:BB0442 KEGG:ns NR:ns ## COG: BB0442 COG0706 # Protein_GI_number: 15594787 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Borrelia burgdorferi # 86 561 72 533 544 145 27.0 2e-34 MDKNTITGLVLIGILLVGFSFLSRPSEEQIAAQKKYYDSIAVVQQQEEALKAKTEAALAN SQKEVASAADSSALFFNALHGTDSKISIQNNVAEITFTTKGGRVYSAMLKDYMAQDKKTP VVLFDGDDASMNFNFYNKAGAIQTKDYFFEAVNKTDSSVTMRLAADSASYIDFIYTLKPD SYLMNFEIKATGMEDKLASTKYVDIDWSQRARQLEKGFTYENRLSELTYKVTGDNVDNLS AAKDDSQNLPGRIDWVAFKNQFFSSVFIAEQDFDKVSVKSKMEQQGSGYIKDYSAEMNTF FDPTGKEPTEMYFYFGPNHFKTLKALDKERDEKWELHRLVYLGWPLIRWINQFITINVFD WLSGWGLSMGIVLLILTIMVKVVVYPATWKTYMSSAKMRVLKPKIDEINKKYPKQEDAMK KQQEVMGLYSQYGVSPMGGCLPMLLQFPILMALFMFVPSAIELRQQSFLWADDLSTYDAF ITFPFHIPFLGNHLSLFCLLMTVTNILNTKYTMSMQDTGAQPQMAAMKWMMYLMPIMFLF VLNDYPSGLNYYYFVSTLISVGTMILLRRTTDETKLLAILEAKKKDPKQMKKTGFAARLE AMQKQQEQLQQQRQNKK >gi|225935332|gb|ACGA01000060.1| GENE 23 26288 - 27913 1850 541 aa, chain - ## HITS:1 COG:BS_ctrA KEGG:ns NR:ns ## COG: BS_ctrA COG0504 # Protein_GI_number: 16080768 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus subtilis # 4 533 2 530 535 596 53.0 1e-170 MGETKYIFVTGGVASSLGKGIISSSIGKLLQARGYNVTIQKFDPYINIDPGTLNPYEHGE CYVTVDGHEADLDLGHYERFLGIQTTKANNITTGRIYKSVIDKERRGDYLGKTIQVIPHI TDEIKRNVKLLGNKYKFDFVITEIGGTVGDIESLPYLESIRQLKWELGKNALCVHLTYVP YLAAAGELKTKPTQHSVKELQSVGIQPDVLVLRAEHPLSDGLRKKVAQFCNVDDKAVVQS IDAETIYEVPILMQAQGLDSTILEKMGLPVGETPGLGPWRKFLERRHAAETKEPINIALV GKYDLQDAYKSIREALSQAGTYNDRKVEVHFVNSEKLTDENVAEALKGMAGVMIGPGFGQ RGIDGKFVAIKYTRTHDIPTFGICLGMQCIAIEFARNVLGYTDANSREMDEKTPHNVIDI MEEQKAITNMGGTMRLGAYECVLQKGSKAYQAYGQEHIQERHRHRYEFNNDYKDRYEAAG 
MKCVGINPESDLVEIVEIPTLKWFVGTQFHPEYGSTVLNPHPLFVAFVKAAIENEKATVN G >gi|225935332|gb|ACGA01000060.1| GENE 24 28102 - 29610 1065 502 aa, chain + ## HITS:1 COG:no KEGG:BT_0591 NR:ns ## KEGG: BT_0591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 50 502 26 483 483 756 83.0 0 MMRITTLCSAFILSILSLSLFAQEPVESASLPTDSLQPAPEVIKPAKTVKSSTVVAKARV VKADTLSAELQKYLMLKLNMSGPTPKLDTVSYLYNKYVAQLDYLNDLSVPPRYIPSDPDY FRLFTPLAYYYAPMAQYSKLEWKPMQWDTTPQLTAELLPYDTLAFTKTQRAEKIVNSALM DLYLERPNLVVTTEDRIMSRDVFRHDVKPKISPKATVIHLFQPENMDDNVGKANMKISRP NWWVTGGNGSLQISQNHLSDNWYKGGESNFSGLVTFQIFANYNDNEKVLFENQLEAKVGM TSTPSDEYHKYLFNTDQFRLYNKLGLRAFKNWYYTISSEFKTQFFNGYKANSEELVSSFM SPADLAVSVGMDYKLSKKKFNLSVFMAPLTYNLRYIGNSEVDETKFGLDKGKCSKNDFGS QLQSTLNWTIISAVTLESRLNYLTNYHWARVEWENTFNFVLNRYLSTKLYIHTRFDDSSK PTEGDSFFQLKELLSFGINYKW >gi|225935332|gb|ACGA01000060.1| GENE 25 29870 - 30439 314 189 aa, chain + ## HITS:1 COG:no KEGG:BT_1358 NR:ns ## KEGG: BT_1358 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 1 188 190 302 79.0 4e-81 MIIKKNDDRNLLPTDVIGSSVARSKRWLVAIVRICHEKKTGERLTKMGIENFLPIQQEVH QWSDRRKIVDRVLLPMMIFVHVDLQEQKEVLTLSSISRYLVLRGESTPAIIPDQQMSRFK FMLDYSDETISLSTSPLAPGEKIRVIKGPLAGLEGELVHVNGKSKVAVRLTMLGCACVDI PVGCVEVLA >gi|225935332|gb|ACGA01000060.1| GENE 26 30592 - 32526 1265 644 aa, chain + ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 33 577 9 545 608 337 37.0 6e-92 MGDFFKAVFGAYQELRLKRKYVNYWIVLSLDTLLSTISTLISILFISGFIVHINKTPFLI LLLCSGVFSLTLFYFFNVHKNIIRHASIKSIGKITLALLLKEVGLLIVSLLGGDWIEYRY LWACCFIDLLISIVVLVGFRVTLLLVYDLVVIQIGRVKKTNVLIYGTDDKSVSLQQRMYT SKHYQVVGYINPNQRLKSYRISDLPVYCFKNEENFTHFMSKRTIGGIIFPSYEAVQRESE TLVRYGQKIGIKNLVSPPIDIVGSGLKKTAIREVKIEDLLGREEIKINMQEIIDNFSGKV 
VMVTGAAGSIGSELCRQIATFGVKHLVLFDNAETPMHEIRLELEKSFPELHFTPFIGDVR SCERLRMAFEKFHPQVVFHAAAYKHVPLMEENPCEAVLVNVVGSRQVADFCVEYNVEKMI MISTDKAVNPTNVMGASKRLAEIYVQSLGLAIERGEVSGTTKFVTTRFGNVLGSNGSVIP YFRKQIATGGPVTVTDPKITRFFMTIPEACRLVMEAATMGKGNEIFVFEMGQAVKIVDLA TRMIELAGYRPGEDIEIKFTGLRPGEKLYEEVLSDKENTIPTENKKIRIAKVRRYEYTDI LPTYAEFETLSRAVEIMDTVKLMKEVVPEFKSKNSPKFEVLDKE >gi|225935332|gb|ACGA01000060.1| GENE 27 32543 - 33376 791 277 aa, chain + ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 51 244 77 253 387 60 29.0 5e-09 MNKSFYMYMNVQGMRRNVFTCLITIFLLASCQSYKKVPYLQDVEVMEQTAQQENLYDAKI MPKDLLTIVVSCTSPELAVPFNLTVASSASVATGNTQLTTQSVLQPYLVDNEGNINFPVL GELNVGGLTKKEAEQLIVDKLKPYIKETPIVTVRMVNYKISVLGEVARPGTFTISNEKVN LLEALAMAGDMTVWGVRDNVKLIREGVNGKQEIVTLDLNKAETILSPYYWLQQNDIVYVT PNKAKARNSDIGNSTSLWFSATSILVSLASLLVTIFK >gi|225935332|gb|ACGA01000060.1| GENE 28 33386 - 35800 1938 804 aa, chain + ## HITS:1 COG:VC0937_2 KEGG:ns NR:ns ## COG: VC0937_2 COG0489 # Protein_GI_number: 15640953 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Vibrio cholerae # 494 793 1 299 302 119 28.0 3e-26 MIEEKREKRGEQSEEQVNILEILFRYLIHWPWFVVSVVICIACACGYLRLATPVYNISAT VLIKDDKKGSGASMSSELERMGLDGFVSSSNNVDNEIEVLRSKSLAREVVDHLGLFVTYK DEDEFPSRELYRTSPVVVSLTPQEADKLPHSMKVNMTLQPAGAMDVQIIMGEKEYRKQFE KLPAVFPTDEGTVAFFVNNDTLSSVRPESVTQERHITAFINSPSSVAKGYANSLSISPTS KMTSVVVISLKNSNARRGKDFINKLLEMYNINANNDKNEVAQKTAEFINERIGIISKELG STEQDLENFKRSAGITDLSSEAQIALTGNAEYEKKRVENQTQINLVMDLQRYMKGNEYEV LPSNIGLQDAASAGAIDRYNEMLVERKRLLRTSTENNPTIINLDTSIRAMRSNVQATLDA TLKGLQITKEDLAREASRYSRRINDAPTQERQFVSIARQQEIKAGLYLMLLQKREENAIT LAATANNAKIIDEALADNNPVSPKKMIVYFAALLLGVGLPVGIIYLIGLTKFKIEGRADV EKLTSLPVIGDIPLADEKSGSIAVFENQNNLMSETFRNVRTNLQFMLENDKNVILVTSTI 
SGEGKSFVSSNLAISLSLLGKKVVIVGLDIRKPGLNKVFNLSRKEHGITQFLTNPTVNLM DLVQPSDINKNLFILPGGAVPPNPTELLARDGLEKAIETLKKNFDYVILDTAPVGMVTDT LLIGRIADLSVYVCRADYTRKAEFTLINELAESDKLTNLCIAINGLDLQKKKYGYYYGYG KYGKYYGYGKRYGYGYGYEEHKVN >gi|225935332|gb|ACGA01000060.1| GENE 29 36029 - 36403 208 124 aa, chain + ## HITS:1 COG:no KEGG:BVU_4025 NR:ns ## KEGG: BVU_4025 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 124 1 120 120 111 46.0 7e-24 MYSYNDFERLFLRYKLEGIPAGVSIEKFCMSNKVPYNLFAKWYTDTRKKIIPVQVLGAPS TEAEMKESPSPIPERKPEADTLLSSHSELRILVDIRMSNGVHISQKNLSYAGLKSLVEKQ EGLC >gi|225935332|gb|ACGA01000060.1| GENE 30 36397 - 36732 132 111 aa, chain + ## HITS:1 COG:no KEGG:BVU_4024 NR:ns ## KEGG: BVU_4024 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 111 1 111 111 140 59.0 2e-32 MLNITGLNHFFFVRDFHDMRCKYDKVLSIIHQQLNREPEDGDVFIVMSKDLRLVRLFSYD RRSYSMFEKRFRPGYKFMQVTYNGGESVYSIHWQDVILLLESPVIKKLIIK >gi|225935332|gb|ACGA01000060.1| GENE 31 36945 - 38585 1096 546 aa, chain + ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 215 545 129 460 463 95 28.0 2e-19 MASRERDAERIQRDEARIDELLSKVDELLSLQKAAMAAEKERDAYKQMVGNLLSKITVLE ERLKVRNKNLYDGKSQKGIRKKKREVEEDHTRDKDDFDGTPQSLGSCLPQSGEAHAQEED FESKSKEARLYRQGLSYRTMSADNTVCHNSDIRQLPAGSKIIKRFRKYAYEQVSKLVEHD YEVIRYKTSDGKIHEGYFPFSGQPEIIDVVPGTHASGSFLAYLAFNKYVLDTPLYREMYR LSGESMRLSRMSLTNWLEKGSTHFCGLIAYLKNTCLEKDSIVNCDETWCRVKRKGCYKKK YIWCLVNKLSKVVIYCYEDGSRGRDALKHILGDSQVKALQSDGYNVYLYLDDHMVDIEHL CCMAHARAKFKYALDQGNDKDAAFILELIGELYKLEREYEEGQLSPEQIKLCRNNLKTKE VIIKLRSKLDVLLSDGHPPRGELMEKAINYMNTFWTQLFAYLNDGSYSIDNSIAERFIRP LAGERKNSLFFGSDRMAHVSAVYHTIISTCKMQGVSVLDYFKRFFSEIVKGRRDYEHLLP LTIGLE >gi|225935332|gb|ACGA01000060.1| GENE 32 38687 - 40108 1047 473 aa, chain + ## HITS:1 COG:FN1696 KEGG:ns NR:ns ## COG: FN1696 COG1086 # 
Protein_GI_number: 19705017 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Fusobacterium nucleatum # 13 463 165 605 607 329 42.0 6e-90 MALKVRLLNSSHYKVVGFCIYGTGNSIRRVADLPVYSFKDEECFNKLIHKKCIGGILFAR YENTREEEGRLLQYCKRNGLKTLIAPSISEADENGSFHQWVRPIKIEDLLGRSEIHINME EVMTEFCGKVVLVTGAAGSIGSELCRQLAQMNIKKLIMFDSAESPLHNVRLEFEKNYPGL DFVPVIGDVRVKERLRMVFDIYQPQIIFHAAAYKHVPLMEENPCEAVLVNVVSSRQVADM AVEYGAEKMIMVSTDKAVNPTNVMGCSKRLAEIYVQSLGCAIREGKVKGHTKFITTRFGN VLGSNGSVIPRFKEQIENGGPVTVTHPDIIRFFMTIPEACRLVMEAATMGEGNEIFVFEM GKAVKIVDLATRMIELAGYRPGEDIEIEFTGLRPGEKLYEEVLSDKENTIPTENKKIMIA KVRHYEYADILDIYTEFENLSRTVKIMDTVKLMKKVVPEFKSKNSPRFEVLDK >gi|225935332|gb|ACGA01000060.1| GENE 33 40143 - 41459 1237 438 aa, chain + ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 6 438 1 388 388 469 56.0 1e-132 MDTKDLKIAVAGTGYVGLSIATLLSQHHQVTAVDVIPEKVDMLNRKQSPIQDEYIEKYLS EKALNLTATLDGAKAYSDADFVVIAAPTNYDPVKNYFDTHHIEDVIDLVLSVNPDAVMVI KSTIPVGYCRGLYLKYARKGVKKFNLLFSPEFLRESMALYDNLYPSRIIVGYPKFIESEQ FKEENEAIKAVADIPALEKAARTFAALLQEGAIKEDIPTLFMGLKEAEAVKLFANTYLAL RVSYFNELDTYAEMKGLDSQSIIQGVGLDPRIGTHYNNPSFGYGGYCLPKDTKQLLANYQ DVPQNMMSAIVESNRTRKDFIADQVLSKAGYYTASSQWDVHKEQNIVIGVYRLTMKSNSD NFRQSAIQGIMKRIKAKGAEVIIYEPTLQDDESFFGSKVVNNLTKFKELSRAIIANRYDA CLDDVKDKVYTRDIFQRD >gi|225935332|gb|ACGA01000060.1| GENE 34 41464 - 42543 819 359 aa, chain + ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 13 350 3 333 343 362 53.0 1e-100 MVTYNVSLENKVVLVTGAAGFIGANLVKRLLAEFDSIKVIGIDSITEYYDVRLKYERLEE 
LSVYGDRFVFIKDNIAKKEIVESAFTNYRPQVVVNLAAQAGVRYSITNPDAYIESNLIGF YNILEACRHYGVEHLVYASSSSVYGSNKKVPYSTDDKVDNPVSLYAATKKSNELMAHAYS KLYNIPSTGLRFFTVYGPCGRPDMAYFSFTNKLLKGETIQVFNYGNCKRDFTYVDDIVEG VVRIMQHAPEKKNGDDGLPIPPYKVYNIGNNSPENLLDFVTILQDELIRAGVLPNDYDFE SHKKLVSMQPGDVPVTYADTTPLEQDFGFKPSTSLRVGLRKFAEWYAKYYGNNEKDIES >gi|225935332|gb|ACGA01000060.1| GENE 35 42561 - 43535 839 324 aa, chain + ## HITS:1 COG:VNG0065G KEGG:ns NR:ns ## COG: VNG0065G COG0451 # Protein_GI_number: 15789399 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Halobacterium sp. NRC-1 # 15 321 5 309 309 272 44.0 8e-73 MAYKHLKFPEDSLFLVTGAAGFIGSNLCEAILEMGYRVRALDDLSTGKQKNIDMFFDNPR YEFIKGDIKELDTCMTACENVDYILNQAAWGSVPRSIEMPLFYSLNNIQGTLNMLEAARQ NGVKKFVYASSSSVYGDEPNLPKKEGVEGNLLSPYAVTKCCDEEWAKQYTRHYGLDTYGM RYFNVFGRRQDPDGAYAAVIPKFLKLLINGQKCRINGDGKQSRDFTYIENVIEANLKACL APSSAAGQAFNIAYGGREYLIDIYYGLTKALGVDIEPEFGPDRVGDIKHSHADISKAKEL LGYEPNWSFERGIEAAIQWYKENL >gi|225935332|gb|ACGA01000060.1| GENE 36 43570 - 44838 1118 422 aa, chain + ## HITS:1 COG:PM1003 KEGG:ns NR:ns ## COG: PM1003 COG0677 # Protein_GI_number: 15602868 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Pasteurella multocida # 1 422 3 424 424 512 57.0 1e-145 MKETKICVIGLGYVGLPLARLFSTRYKTVGFDMNAKRCEALMAGHDATLEVSDELLQDAI NNHGFVCTADIEQIRDCNFYVVAVPTPVDENNNPDLTPLYSASTTVGKVISKGDVVVYES TVYPGVTEDECIPVVEKVSGMKFNEDFFAGYSPERINPGDKQHTVEHIKKVTSGSTSEIG RYVDEIYASVIAAGTHLAPTMKVAEAAKVIENSQRDINIAFVNELSHIFTKMGIDTHDVL DAAGTKWNFLPFRPGLVGGHCIGVDPYYLAQCAQRHGYNPEIILAGRRMNDGMGEYVATE TVKHMLKKGIQVLNSGIIILGFTFKENCPDVRNTKVIDIYRVLKEYNVNVTVYDPWANPV ITKREYGIEVVNELPSEKFDAAVVAVAHKVFEGFDVPALLKDKHVIFDVKCTLDRSIIDA RL >gi|225935332|gb|ACGA01000060.1| GENE 37 44925 - 46448 626 507 aa, chain + ## HITS:1 COG:no KEGG:BVU_2391 NR:ns ## KEGG: BVU_2391 # Name: not_defined # Def: putative transmembrane protein # Organism: 
B.vulgatus # Pathway: not_defined # 1 507 1 511 512 343 38.0 9e-93 MLDTQSNNKRIAKNTAFLYFRMALTMFIGLFSVRIVLNALGEEDYGIYNVIGGVVTMLAF LNNSLSSATQRFLSFELGKGNKKELHSIFSNSMTLYIGICVILLVLAEILGLWFVNNKLV IPNERIVAANWIYQFSIISFLCTILSSPFNAVIIAREKMGVYAYVSIAESLLKLALIYSI LVLGGDKLIIYGFFMMLVSVLVFVFYMFYCLMKFEETKYRYVFDKNKIREIGSFAGWGIW GALSNIFKGQGINILLNIFFGPMVNASRGIAYQVEGAVNTLVQNFYTAMRPQLIKSYAAN ETEEMFKLLNISTRLGYYMMFLVSLIFLYKTPLIFSIWLEKIPQYSITFTRLALISQLFI VLANPLMTAIHATGKVAKYQFLSGCIFMLVLPFSYFILKWDYNSFWPFIVLIVSSLLYWI LTIERCYKLINLSLRNYFVMVFRLFLVSLILLLGAQLLYRWGTDGWLDFIVLCFYTLVVG IMAILFIDCSKSERIMIKELILSKFHK >gi|225935332|gb|ACGA01000060.1| GENE 38 46598 - 47596 605 332 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 1 248 1 253 344 107 28.0 2e-23 MRFPLITIALSIYNVEPYLRQSLETIVNQTYQNLEILCIDDCSNDKTYDILQEYAQKCSR IRVVKQPKNQGLSVSRNRAIEMAQGEYILMLDGDDLFALDMVEKAYKKAAETDADLVVWD YCTFYKDEELPRLLKKPSVLIGFDSNDKIALLKCPAFMWVKLIRTQILRDQGVHFPEGLT KQDIPVWWHLVTTLNKIVVLPERLSYYRQNQFNTTSRKDKSVYSLAYVMDITGEYLKENN LYETYKNEYLRSRLGLLQGMYDYIKSELQPNAIQMIRERLDADAIAYVKSPVCALSKRTI LFYKGYMMGELFAKLQYDGLMLVRTIYRKIKV >gi|225935332|gb|ACGA01000060.1| GENE 39 47593 - 48378 244 261 aa, chain + ## HITS:1 COG:L79351 KEGG:ns NR:ns ## COG: L79351 COG3274 # Protein_GI_number: 15672661 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 4 205 2 201 343 102 32.0 9e-22 MMKQREIGPDIIRTLAIVCVVCGHFFTVNTPYNQVLFEGGGMLLQGCLKAFFCNLGVPLF LMLTGYFNCKKEFSSKYYSNIKRVLIPYAIISVLTWAVLSTNHSIKELILATLGYKIIGY AWYVEMFIGLYLCVPFLNMVVEKVFNSDDKKLIYGLFVVLIFMTSLPPLVDRGEFRIVPN YWQMCFPILLYFTGAYIRSFQPAIRHKVLIALVICVIYLQYPIVNYLKIGIIGGGESAEP IWTVLCVAKLCSDDTSIYQFI >gi|225935332|gb|ACGA01000060.1| GENE 40 48350 - 48625 62 91 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MTLLFISLYKVKIKTETIRKIVTRVSLISYEMFLFSYLCDQLIYPQVMKWFYVDQNSFTV WFIPITLTVFIASYTMALIYRKISECLMQKK >gi|225935332|gb|ACGA01000060.1| GENE 41 48653 - 49591 481 312 aa, chain + ## HITS:1 COG:no KEGG:FIC_00340 NR:ns ## KEGG: FIC_00340 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 1 307 11 318 321 315 48.0 2e-84 MKNPIMRFLTDLKNENPYLKYLWLLSLEHREKKELANFSDEECINNLYCSYVHKYPNYDN PVGFGEKMQWMKLHYRNELMPIVGDKYTVRKYLEELGHPELLNELIAVYDSIEQFEVDKL PKQFVLKATHSSGWNLVVKDKDKINWRIWKKHMKFWLTHGIAWNGREWHYGEMTPRIVCE KYLEDKSGGLMDYKFFCFDGEPKFLQVNVDRGLSTATQNYYDLDWKLIPFGKSQPHNPNI EVECPSQFQHMIQLARELSRPFPYVRVDFYEANDKIYFGEFTFFPCSGMPDFIPIEYDKI VGDMLKLPKANH >gi|225935332|gb|ACGA01000060.1| GENE 42 49597 - 50418 680 273 aa, chain + ## HITS:1 COG:BS_yveO KEGG:ns NR:ns ## COG: BS_yveO COG0463 # Protein_GI_number: 16080486 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 9 273 6 271 278 223 41.0 3e-58 MAYVKKKRKISIIMGIYNCADTLREALDSILAQTYTDWEMIMCDDCSTDSTMEVAQEYVD RYDNFLLINNKCNSRLAATLNHCLEYAESEYIARMDGDDISIPERFAVQAAFLDAHPEYA LVSCSMINFDKDGDWGIQTKPEKPCIRDFIYTSPFCHAPVMMRRKILNEVGNYTVKKELR RGQDFYLWHKFYHMGYKGYNLQTPYYKMRDDKNAAKRRTFKDDLYGTKIHLEVMSNLNIP LYKRIRAFKGILIGLLPTSLRAWAHKKKLQNKK >gi|225935332|gb|ACGA01000060.1| GENE 43 50423 - 51448 321 341 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 340 6 335 344 134 27.0 3e-31 MQEKISIIVPCYNTEVFLKQCLDSLIGQTYQNLEIICVNDGSKDNTLQILCEYAEKDDRI IVVDQENTGLSGVRNKALSLSTGKYVTFVDSDDWLKLDYIETVLSEIKDEDVIVCGYIKA SEKQEEHVRLFSKVFRYDETNIDRLRQRVLGPIGEQTAHPEIIDSLSSVCTKFIKKTIVD ENNLCFRHKNEVGPAEDMLFSVCYYQHVNSALCLPIEGYYYRRNNASITKFYNSQLLEQW DSLYSLLWKQVEDKPMLHEAYRNRKALSVIGLSLNQVNVSKNFFRQRKKVKEILNHRLYK DVFLKFDTSYMPMHWQVFFFLIKNRCYILVALMLNLINRLR 
>gi|225935332|gb|ACGA01000060.1| GENE 44 51449 - 52585 753 378 aa, chain + ## HITS:1 COG:BS_yveP KEGG:ns NR:ns ## COG: BS_yveP COG0438 # Protein_GI_number: 16080485 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Bacillus subtilis # 2 376 7 374 384 155 29.0 9e-38 MKVLQVTGTMNRGGAEVMLMDILRNKPDDVHFDFLINNDPRNLKEEGAFDQEILSYGCQI RHIGTQLRIGPIRYIREFKKIVRELQPDVVHIHLNAKCGLIALAAHLAGVEKIIAHCHAD IKFRGSLPSRVFSEFEMWIQKWMIGWFATDYWGCSIEANKRLYRGGHFRNSVVINNAISC AAYSAVSEEEYKALRNSYNLPDDAVVLGNVGRIVPHKGIAKMVDVLSELKNRGVNAHFIV VGRNDACEYVNEMMAHAEECKVADKIHFLGERSDVPIVMHTFDVFVGPALREGFGLVAVE AQAAAVPCVLYQGFPQTVDMHVGLITFHKDYNPKAWADSIQAVLKSCKVEKGLILNAIRV LGFDANENALKISEMYKR >gi|225935332|gb|ACGA01000060.1| GENE 45 52591 - 53724 664 377 aa, chain + ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 1 258 59 320 380 102 31.0 2e-21 MKITIFGDICPTKDTQAAFDRGDRGSIFGDTFREIESSDIVIGNLECAVTDQPKPIQKAG PVLYTGIQSIQTLKDFDVLSIANNHIRDCGDEGVMTALETCKKLGIRTLGAGKSMQEARK PLVIEKCGIKIGLMSFAEQEFNIASDIRPGACYLDLYDDFERICEFRKTVDYLIILYHGG IEYFPYASPELSRKCRKMVDCGADLISCQHSHCIGTIEQYNGSTIVYGQGNSVFGYRDGD NSWNRGLLLQVEFQKVGSSFSSLFTYKGMVATPNGLHWMSEDASKDLSNELRTREQLSQD RLAVQKEWDKFCANLGKIHLPLLLGWPRILIAINRRTGNSLIKMLYGRLAHNNTHNLIRC EAHREVIENLLSKKDFS >gi|225935332|gb|ACGA01000060.1| GENE 46 53740 - 54795 604 351 aa, chain + ## HITS:1 COG:no KEGG:FIC_00343 NR:ns ## KEGG: FIC_00343 # Name: not_defined # Def: putative capsular polysaccharide biosynthesis protein YveQ # Organism: F.bacterium # Pathway: not_defined # 32 351 35 355 355 137 33.0 6e-31 MTEYYVFLAIAMIFAAAYSKNPQFKITLYATLAIMVLFAGLRSAGVGTDSGSYARSFTDI GDMEGDWKEQLTDEPGFYYLKLVLSMVSNQYWVLFTGIAILTYSCVVIAIKRETEKIVIP LFVFITLGLYTFVFNAARQGIAVAIYMLSFKFLFEDHKTGFLKYCLFVFLAAMFHKTVVI ALPLYFLFRQKYSPKMLVIIAVLGIGMGTALPSFLAFAGTHEARYELYSTQAGGGEMLTV 
FYMLITAFFIIRRKKIDLEYLPKYDVFLSMMLFGTLIFMVVQVNNMYVELTRFAAYFQVA SVFLWAYIYQSKTKPFAVFSVVIIVGHLLYFYIFCTRMAGLVPYSFNSTIF >gi|225935332|gb|ACGA01000060.1| GENE 47 54810 - 55922 815 370 aa, chain + ## HITS:1 COG:BS_yveN KEGG:ns NR:ns ## COG: BS_yveN COG0438 # Protein_GI_number: 16080487 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Bacillus subtilis # 2 345 4 345 381 268 42.0 2e-71 MKVLLVANVAKEHVNKFHVPTIKYLKSEGCQVDVACSVDADVPAADHIYSMSWKRSPFTP KTFKGISELKKLIVENDYDIIYCHTPVGGLVARIASRTARKKGTKVVYCAHGLHFFDGAP LVNWLVFYPMEKWMAKMTDMFITINPEDYERVKKYFNKNMLVKMIHGIGVNFDRLNIDNL DGIRKKYRMEMRIPQDAEVLIYVAEILKNKNQQMLIHALKELHDKDRKMYLLLPGPDHSK SEFYKLAEDLGLKGYVKFLGWRSDIGQLMAASDMCVASSIREGFGINLVEAQYCHLPVVA VTNRGHRAIIKDGENGFLVPMNDSKAMANRVLEVMDNKELYDRLANVNVDEYKCENIAKT IYGYLQEVVK >gi|225935332|gb|ACGA01000060.1| GENE 48 56098 - 57087 454 329 aa, chain + ## HITS:1 COG:RC1279 KEGG:ns NR:ns ## COG: RC1279 COG0472 # Protein_GI_number: 15893202 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Rickettsia conorii # 1 255 3 273 327 82 31.0 8e-16 MTTYIIIIILLFIAELVYFHIADKCNIIDKPNERSSHSTITLRGGGIIFLIGIWVWSTFF GFQYPWFLAGLTLVAGISFVDDIHSLPDSVRLIAQFTAAAMAFYQLGILHWSMWWIILLA LIVYVGATNVINFMDGINGITAGYSLAVLIPLALVNINGVFVEQSLIISTILASLVFCIF NFRPKGKAKCFAGDVGSIGIAFIILFLLGNVMIRTTDITWLIFLLVYGVDGCLTIVHRIM LHENLGEAHRKHAYQIMANELKVGHVKVSLLYTVMQLVISLGFIYLCPDTVFAHWLYLVG VSAVLAIAYILFMRKYYHLHEEYLISLKQ >gi|225935332|gb|ACGA01000060.1| GENE 49 57093 - 57530 362 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0397 NR:ns ## KEGG: BT_0397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 138 148 188 67.0 6e-47 MEFDKEFLDKLFEQAMENPRLRQNFDLRTSSADTSQRMLNALLPETKVPIHRHEDTTETI ICLCGKLDEVIYEEVVSYEKDTDDFQKGMSVQDVVRKVEYREVQRIHLNPTEAQYGCQIP KGAWHTVEVIEPSVIFEAKDGAYAR >gi|225935332|gb|ACGA01000060.1| GENE 50 57743 - 58171 
325 142 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 129 2 97 116 84 43.0 7e-17 MRTITLIIIHCSATPEGKSLSAEACRQDHILHRGFRDIGYHFYITRDGEIHRGRALEKIG AHCRNHNAHSVGVCYEGGLDANGKPKDTRTLEQKGALLALLRELKRQFPKALVVGHRDLN PMKGCPCFDAVKEYSQIISYQK >gi|225935332|gb|ACGA01000060.1| GENE 51 58471 - 58959 651 162 aa, chain - ## HITS:1 COG:no KEGG:BT_1705 NR:ns ## KEGG: BT_1705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 164 166 254 88.0 6e-67 MIRYKIYQNQQKKGLNAGKWFARAVSDETFDLAKLAEHMSKHNSPYSGGVIKGVLSDMVD CIKELLLDGKCVKIDDLAIFGVGIRSKAADTLEEFSLEKNISGMRLKARATGNLSTNNLK LDSQLKQQAEYQKPTTAGGGSDSGDNPDPKPDGGGEAPDPAA >gi|225935332|gb|ACGA01000060.1| GENE 52 59179 - 59397 178 72 aa, chain + ## HITS:1 COG:no KEGG:BT_1704 NR:ns ## KEGG: BT_1704 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 114 80.0 1e-24 MSEFKIRAYGRMELAQLYSPQLTDIAAYRKMKKWISLCPGLLQRLYDLGYESKRRSFTPL EVRVIVDALGEP >gi|225935332|gb|ACGA01000060.1| GENE 53 59530 - 61401 815 623 aa, chain - ## HITS:1 COG:no KEGG:BT_1703 NR:ns ## KEGG: BT_1703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 607 1 612 612 783 63.0 0 MTDIESLRRLTEAVETARADIAPTYLEYVQLSFAIATDCGEAGRDFFHRLCRVSPKYQRE HAERVFSNALHTQRGEVHLGTAFHLAEATGVAISYVETVGTVRTTPQTLTQARAYNKVAN EETDEETITEEKEELLSGSDPQHPLPTLPEANWPKLLQRIIKVAKSEIQCDIMLMGAFTA LGACMSRHVRCLYGGKYYYPSLQCFIVAPSASGKGILSYLRLLVQPIHDEIRAKVAEQMK IYKKEKAEYDAMGKERSKMEVPQMPPNKMFLISGNNTGTGILQNIMDSEGTGLIFESEAD TLSTAIGSEHGHWSDTLRKAFDHDFLSYNRRGNQEYRETKKVCLSLLMSGTPLQVKAFIP TTEDGGSSRQLYYYMYGIDKWVSQFVDNETDWEEVFNALGLEWKKQLDVIKKNGIHTLQL TNEQKSEIDAVFSELFDRSSIANDREMYSFVARMAINLCRIMSEVAVLRALEHPNPYSLQ ISSSSPFTPDKSIPSDNRKDGIVTRWNVSISAEDFHAVLNLAEPLYCHAMHILSFLPNTE 
VSHRPNADRDFLFQQLGKEFTRTELLEKATALGINKSTAITWLKRLTRQGKLISVDGKGS YARACVCMSETGKVSPLSPPYLL >gi|225935332|gb|ACGA01000060.1| GENE 54 61429 - 62058 371 209 aa, chain - ## HITS:1 COG:no KEGG:BT_1637 NR:ns ## KEGG: BT_1637 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 205 1 206 208 275 66.0 8e-73 MSNNYRMSYFMPPIAPIKDKQGRLMTPPTLIPFCEVSIEQVFQMITCNENLKTLTAQVRN ATDIRAAKASLLPYVTPCGTFTRRSCKDFVSPSHLVIVDVDGLHSYQEAVEMRRMLYDDP LLQPVLTFISPSGLGVKAFVPCHYSSTTNDTKNITENMSWAMRYVETAYNTVTAVSSETK SKVDFSGKDLVRSCFLSYDPEALFRTTNE >gi|225935332|gb|ACGA01000060.1| GENE 55 62229 - 62369 150 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884254|ref|ZP_02065257.1| ## NR: gi|160884254|ref|ZP_02065257.1| hypothetical protein BACOVA_02232 [Bacteroides ovatus ATCC 8483] # 1 46 1 46 46 79 100.0 8e-14 MKKKTVCCSDLGAYINELLKRAKLKNEYVCETLGMGHDVLNGIKKG >gi|225935332|gb|ACGA01000060.1| GENE 56 62742 - 63167 197 141 aa, chain + ## HITS:1 COG:no KEGG:BT_0616 NR:ns ## KEGG: BT_0616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 141 237 89.0 9e-62 MTNNTIKHLGIVENIQGSHLSVRIVQTSACAACSAKGHCSSADSKDKIIDIIDTAASSYQ VGEKVMVVGETSMGMMAVVLAFVLPFVLLIFSLFLLMALIENELYAALLSLAVLIPYYFV LWLNKTRLKQQFSFTIKPINN >gi|225935332|gb|ACGA01000060.1| GENE 57 63175 - 64140 871 321 aa, chain + ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 265 1 261 264 176 41.0 6e-44 MNLILIAVISLGAIALVLAAILYVASKKFAVYEDPRIAQVGEVLPQANCGGCGYPGCSGF ADACVKAGSLDGKFCPVGGQPVMAQIADILGLAAGEAEPMVAVVRCNGTCMNRPRINQYD GAKSCAIAASLYGGETGCSYGCLGCGDCVAACQFDAIHMNPETGLPEVDEAKCTACGACV KACPKAIIEIRPQGKKSRRVYISCVNKDKGAVARKACTVSCIGCGKCVKTCPFEAITLEN NLAYIDPHKCKSCRKCVEVCPQNTIIELNFPPRKPKEEAPAAPKPAAVSKEAVETPAPAA KVETPAEKGYRSSESNRIITE 
>gi|225935332|gb|ACGA01000060.1| GENE 58 64154 - 65491 1269 445 aa, chain + ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 8 442 22 451 451 373 48.0 1e-103 MLKTFSIGGVHPHENKLSAHQPIITAEVPAKAVILLGQHIGAPAKPVVAKGDVVKVGTKI AEPAGFVSAAIHSSVSGKVAKIDTIVDASGYAKPAIFIDVEGDEWEETIDRSATLVKECE LSAEEIVKKIADAGIVGLGGACFPTQVKLCPPPSFKAECVIINAVECEPYLTADHQLMLE HAEEIMVGVSILMKAVKVNKAFIGIENNKPDAIELMTKVASSYAGIEVVPLKVKYPQGGE KQLIDAITKRQVASGALPISTGAVVQNVGTAFAVYEAVQKNKPLFERVITVTGKSVAKPS NFLARIGTPMKQLIDACGGLPEDTGKVIGGGPMMGKALVNIEVPTAKGSSGILIMNRKEA KRGEAQTCIRCAKCVSACPMGLEPYLLGALSENGDFETMEKERIMDCIECGSCQFTCPAN RPLLDYCRLGKGKVGAMIRARQAKK >gi|225935332|gb|ACGA01000060.1| GENE 59 65497 - 66489 1099 330 aa, chain + ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 4 328 2 318 318 262 48.0 6e-70 MENKLIVSLSPHVHGGDSVQKNMYGVLIALIPAFLVSLYFFGLGALIVTATSVAACLFFE WAIGKYLMKKPTTTICDGSAIITGVLLAFNLPSNLPIWIIILGALFAIGVGKMSFGGLGC NPFNPALAGRVFLLLSFPVQMTSWPVVGQLTAYTDATTGATPLALMKQAIYGDTAALSQI PDALTLLIGQNGGCLGEVSALALLIGLVYMLWKKIITWHIPVSILATVFVFAGIMHLADP EKYVSPVLQLLSGGLMLGAVFMATDYVTSPMSKKGMLIYGVCIGLLTVVIRLFGAYPEGM SFAILIMNAFTPLINTYCKPKRFGEVAKKK >gi|225935332|gb|ACGA01000060.1| GENE 60 66516 - 67232 955 238 aa, chain + ## HITS:1 COG:PA3493 KEGG:ns NR:ns ## COG: PA3493 COG4659 # Protein_GI_number: 15598689 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Pseudomonas aeruginosa # 3 196 14 199 214 70 28.0 2e-12 MLLVLTGVTAISVALLAYVNELTKGPIAEANAKTLNEALKKVLPEFTNNPVAESDTIFSE KDGKKNVDFIVYPAKNGEELVGTAVEAKSMGFGGELKVLVGFNAEGKIYNYSLLAHTETP 
GLGSKADKWFGAYDPAKGEKAVSHEESTKSILGMNPGEAPLTVSKDGGAVDAITASTITS RAFLNAVNAAYQAYKAEGGEVNGVTGASQKAKGADADAADAATGATIKVELTDSVSAK >gi|225935332|gb|ACGA01000060.1| GENE 61 67247 - 67834 707 195 aa, chain + ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 1 189 1 190 205 192 60.0 3e-49 MNNFKVMMNGIIKENPTFVLLLGMCPTLGTTSSAINGMGMGLATMFVLICSNVVISLIKN LIPDMVRIPSFIVVIASFVTLLQMVMQAYVPGLYATLGLFIPLIVVNCIVLGRAEAFAAK NNAVASMFDGIGMGLGFTIALTLLGAVREFLGTGKIFDLTIMPEEYGMLVFVLAPGAFIA LGYLIAIINKLRTEN >gi|225935332|gb|ACGA01000060.1| GENE 62 67931 - 68503 716 190 aa, chain + ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 18 189 21 192 194 174 60.0 8e-44 MEYILIFISAIFVNNIVLSQFLGICPFLGVSKKVETAMGMSAAVAFVLTIATIVTFLIQK FVLDVFGLGYLQTITFILVIAGLVQMVEIILKKVSPALYQALGVFLPLITTNCCILGVAI LVIQKDYDLLTGVVYAFSTAIGFGLALVLFAGLREQMSLVKIPKGMQGTPIALITAGLLA MAFMGFSGVV >gi|225935332|gb|ACGA01000060.1| GENE 63 68728 - 69762 958 344 aa, chain + ## HITS:1 COG:SP1828 KEGG:ns NR:ns ## COG: SP1828 COG1087 # Protein_GI_number: 15901657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 5 337 3 330 336 345 52.0 1e-94 MKERILVTGGTGYIGSHTVVELQNSGYEVIIIDNLSNSSADVVDNIEKVSGIRPVFEKLD CLDFAGLDAVFAKYKGIKAIIHFAASKAVGESVQKPLLYYRNNLVSLINLLELMPKHGVE GIVFSSSCTVYGQPDELPVTEKAPIKKAESPYGNTKQINEEIIRDTVASGAPINAIMLRY FNPIGAHPTALLGELPNGVPQNLIPYLTQTAIGIREKLSVFGDDYDTPDGSCIRDFINVV DLAKAHVIAIRRILEKTQKEKVEVFNIGTGRGVSVLELINGFEKATGVKLNYQIVGRRAG DIEKVWANPDFANKELGWKAVETLEDTLRSAWNWQLKLRERGIQ >gi|225935332|gb|ACGA01000060.1| GENE 64 69880 - 70002 57 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MLGVSKKRYHDSRRSTVTTQTQNKTRQNYYAIFPVYAGEA >gi|225935332|gb|ACGA01000060.1| GENE 65 70087 - 70386 240 99 aa, chain + ## HITS:1 COG:asl4856 KEGG:ns NR:ns ## COG: asl4856 COG4680 # Protein_GI_number: 17232348 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 20 99 6 85 85 84 43.0 3e-17 MRVVSHKKLKDFYETKGCEDSRVALERWYDIAEKAEWRNLSDIKADFLSTDYVGNQHYVF NIRGNNYRLVVVVKFTVGYIFVRWVGTHKEYDKIDCSTI >gi|225935332|gb|ACGA01000060.1| GENE 66 70397 - 70759 346 120 aa, chain + ## HITS:1 COG:no KEGG:BDI_0882 NR:ns ## KEGG: BDI_0882 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 120 1 120 120 201 90.0 5e-51 MTKITKEQYEFALARVEELLPLVDDSTPANDKSMVELSVMSDIVIAYEKEHFPIEKPTVA ELIELSLEEKGMTQKQLACEIGVSPSRVNDYISGRSEPTLKIARLLCRVLNIHPAAMLGF >gi|225935332|gb|ACGA01000060.1| GENE 67 70792 - 71622 514 276 aa, chain - ## HITS:1 COG:NMA1092 KEGG:ns NR:ns ## COG: NMA1092 COG1947 # Protein_GI_number: 15794040 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Neisseria meningitidis Z2491 # 3 249 9 245 281 140 38.0 3e-33 MIAFPNIKINLGLSITEKRPDGYHNLETVFYPVALEDALEIRTSPNADRKFTLHQHGMEI AGNPEDNLVVKAYLLMDKEFHLPPIEIHLYKHIPSGAGLGGGSSDAAFMLKLLNDHYQLG VSEEQLEVYAATLGADCAFFIKNRPIFAEGIGNIFSPVELSLNGYQIMIIKPNVFVSTRE AFSNIHPHRPKYPVREAIQRPVQEWKDTLINDFEASVFPQHPVIGEIKEELYHQGAIYAS MSGSGSSVFGLFAPGFVLPEIDWGTDVFCFKGTLNK >gi|225935332|gb|ACGA01000060.1| GENE 68 71861 - 73420 1429 519 aa, chain + ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 24 466 8 442 450 374 45.0 1e-103 MAEQKRNTRNSKSTKIQPVNDYGRIQPQAPELEEAVLGALMIEKDAYSLVSEILRPESFY ERRHQLIYSAIVDLAVNQKPVDILTVKEQLSKRGDLEEVGGPFYITQLSSKVASSAHIEY HARIIAQKSLARELITFTSNIQSKAFDETLDVDDLMQEAEGKLFEISQQNMKKDYTQINP 
VIDEAYKLIQKAAARTDGLSGLESGFTKLDKMTSGWQNSDLIIIAARPAMGKTAFVLSMA KNIAVDYRNPVALFSLEMSNVQLVNRLIANVCEIPSEKIKSGQLANYEWQQLDYKLKNLM DAPLYVDDTPSLSVFELRTKARRLVREHGVRIIIIDYLQLMNASGMAFGSRQEEVSTISR SLKGLAKELSIPIIALSQLNRGVESREGIDGKRPQLSDLRESGAIEQDADMVCFIHRPEY YKIFQDDRGNDLRGMAEIVIAKHRNGAVGEVLLRFKGEFTRFSNPEDDMVIPMPGEPAGA MLGSKMNTGNAGSIPPPAPDFAPQTENPFGAPGDGPLPF >gi|225935332|gb|ACGA01000060.1| GENE 69 73598 - 76060 2945 820 aa, chain + ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 154 820 3 652 653 374 32.0 1e-103 MNISYNWLKEYVNFDLTPDETAAALTSIGLETGGVEEVQTIKGGLEGLVIGEVLTCTEHP NSDHLHITTVNLGDGEPVQIVCGAPNVAAGQKVVVATLGTKLYDGDECFTIKKSKIRGVE STGMICAEDEIGIGTDHAGIIVLPENAVPGTLAKDYYNIKSDYVLEVDITPNRADACSHY GVARDLYAYLIQNGKQATLQRPSVDGFKVENHDLNIEVKVENSEACPHYAGVTVKGVTVK ESPEWLQNKLRLIGVRPINNVVDITNYIVHAFGQPLHCFDAGKIKGNEVIVKTMPEGTPF VTLDEVERKLNERDLMICNKEEAMCIAGVFGGLDSGSTEATTDVFIESAYFHPTWVRKTA RRHGLNTDASFRFERGIDPNSVIYCLKLAALMVKELAGGTISSEIKDVFTTPAQDFIVDL AYEKVHSLVGKVIPVETIKSIVTSLEMKITNETAEGLTLAVPPYRVDVQRDCDVIEDILR IYGYNNVEIPTTLNSSLTTKGEHDKSNKLQNLIAEQLVGCGFNEILNNSLTRAAYYDGME TYPSNHLVMLLNPLSADLNAMRQTLLFGGLESIAHNANRKNADLKFFEFGNCYHFDADKK NEEKVLAPYSEDYHLGLWVTGKKVSNSWAHADENSSVYELKAYVENILKRLGLDLHNLVV GNLTDDIFAAALSINTKGGKRLASFGIVTKKLLKAFDIDNEVYFADLNWKELMKAIRSVK ISYKEISKFPAVKRDLALLLDKNVQFAEIEKIAYEREKKLLKEVELFDVYEGKNLEAGKK SYAVSFLLQDESQTLNDKMIDKIMSKLVKNLEDKLGAKLR >gi|225935332|gb|ACGA01000060.1| GENE 70 76091 - 76831 878 246 aa, chain + ## HITS:1 COG:Cj1172c KEGG:ns NR:ns ## COG: Cj1172c COG0217 # Protein_GI_number: 15792496 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 1 238 1 234 235 177 45.0 1e-44 MGRAFEYRKAAKLKRWGHMAKTFTRLGKQIAIAVKAGGPEPENNPTLRSVIATCKRENMP KDNIERAIKNAMGKDQSDYKSMTYEGYGPHGIAVFVDTLTDNTTRTVADVRSVFNKFGGN 
LGTMGSLAFLFDHKCVFTFKKKDGLDMEELILDLIDYDVEDEYEEDDEEGTITIYGNPKS YAAIQKHLEECGFEDVGGDFTYIPNDMKEVTPEQRETLDKMIERLEEFDDVQTVYTNMQP EEGEEE >gi|225935332|gb|ACGA01000060.1| GENE 71 76841 - 77086 245 81 aa, chain + ## HITS:1 COG:no KEGG:BT_0628 NR:ns ## KEGG: BT_0628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 81 149 90.0 5e-35 MEYVYRTQGTCSTNIELNVEDGVVKEVAFWGGCNGNLQGISRLVRGMKVADVIAKLEGVH CGGRPTSCPDQLCRALHEMGY >gi|225935332|gb|ACGA01000060.1| GENE 72 77284 - 78537 1150 417 aa, chain + ## HITS:1 COG:CAC0628 KEGG:ns NR:ns ## COG: CAC0628 COG1914 # Protein_GI_number: 15893916 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Clostridium acetobutylicum # 13 417 11 415 417 459 63.0 1e-129 MKNIFQDLKRKDHKRYLGGLDVFKYIGPGLLVTVGFIDPGNWASNFAAGSEFGYSLLWVV TLSTIMLIVLQHNVAHLGIVTGLCLSEAATKYTPKWVSRPILGTAVLASISTSLAEILGG AIALEMLFDIPIIWGAVLTTLFVSIMLFTNSYKKIERSIIAFVSVIGLSFIYELFLVEID WPAATVGWVTPSFPKGSMLIIMSVLGAVVMPHNLFLHSEVIQSHEYNKKDDASVKKVLKY ELFDTLFSMIVGWAINSAMILLAAATFFKSGIQVEELQQAKSLLEPLLGSNAAIVFALAL LMAGISSTITSGMAAGSIFAGIFGESYHIKDSHSQVGVLLSLGIALLLIFFIGDPFKGLI ISQMILSIQLPFTVFLQVGLTSSRKVMGDYVNSRWSTFVLYSIAIIVSVLNIMLLFS >gi|225935332|gb|ACGA01000060.1| GENE 73 78567 - 79328 766 253 aa, chain + ## HITS:1 COG:MA2077 KEGG:ns NR:ns ## COG: MA2077 COG0708 # Protein_GI_number: 20090923 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Methanosarcina acetivorans str.C2A # 3 252 7 256 260 247 45.0 2e-65 MKIITYNVNGLRAAVSKGLPEWLAQENPDILCLQETKLQPDQYPGEVFEALGYKSYLYSA QKKGYSGVAILTKQEPDHVEYGMGMEAYDNEGRFIRADFGDLSVVSVYHPSGTSGDERQA FKMVWLEDFQKYVMELQKSRPNLILCGDYNICHEPIDIHDPVRNATNSGFLPEEREWMTR FLSAGYVDSFRTLCPEKQEYTWWSYRFNSRAKNKGWRIDYCMVSEPVRPLLKRAYILNEA VHSDHCPMALEIL >gi|225935332|gb|ACGA01000060.1| GENE 74 79323 - 79730 343 135 aa, chain - ## HITS:1 COG:DR2598 KEGG:ns NR:ns ## COG: DR2598 COG0432 # Protein_GI_number: 15807580 # 
Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 1 134 7 142 151 159 54.0 1e-39 MATTFDIQLPHYSRGFHLITRDIISQLPALPESGLLVIFIKHTSAGLTINENADPDVRHD FQTFFNKLVPDGAPYFIHTLEGPDDMSAHIKASLIGSSVTIPIKNHRLNLGTWQGVYLCE FRDGGDTRKLSITIL >gi|225935332|gb|ACGA01000060.1| GENE 75 79742 - 80008 249 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174499|ref|ZP_05760911.1| ## NR: gi|260174499|ref|ZP_05760911.1| hypothetical protein BacD2_21762 [Bacteroides sp. D2] # 1 88 1 88 88 140 100.0 2e-32 MNKTITYKDSTALAIQAMMSAARKDEYADRKKGNSFPQNRKKRDVTLKDVREWNKRRNYK EDAGIAVNAMMASAINDPYVDLNPPSKF >gi|225935332|gb|ACGA01000060.1| GENE 76 80103 - 80561 466 152 aa, chain + ## HITS:1 COG:no KEGG:BT_0631 NR:ns ## KEGG: BT_0631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 152 1 152 152 268 92.0 5e-71 MEDRIQKAVELFKSGYNCSQSVVAAFADMYGFTQEQALRMGASFGGGIGRMRETCGAACG MFLVAGLETGATEATDREGKAANYAVVQELAAEFKKRNGSLICGELLGLKKKDSVSTIPE ERTAQYYSKRPCAKMVEEAARIWSEYLEKHPK >gi|225935332|gb|ACGA01000060.1| GENE 77 80715 - 80915 329 66 aa, chain + ## HITS:1 COG:no KEGG:BF2557 NR:ns ## KEGG: BF2557 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 65 1 65 66 92 83.0 5e-18 MLKEKAGVIAGTIWNALNETEGMTAKQLKKATKLVDKDLFLGLGWLLREDKVSVEEVEGE LFIKLI >gi|225935332|gb|ACGA01000060.1| GENE 78 81044 - 82825 1897 593 aa, chain + ## HITS:1 COG:BS_lepA KEGG:ns NR:ns ## COG: BS_lepA COG0481 # Protein_GI_number: 16079605 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus subtilis # 4 593 14 606 612 724 57.0 0 MKNIRNFCIIAHIDHGKSTLADRLLEFTNTIQVTSGQMLDNMDLEKERGITIKSHAIQME YTYQGEKYVLNLIDTPGHVDFSYEVSRSIAACEGALLIVDASQGVQAQTISNLYMAIEHD LEIIPVINKCDMASANPEEVEDEIVELLGCKREEVIRASGKTGMGVEEILAAVIERIPHP EGNEEAPLQALIFDSVFNSFRGIIAYFKIENGTIRKGDKVKFFNTGKEYDADEIGVLKMD MVPRNELRTGDVGYIISGIKTSKEVKVGDTITHIARPCDKAIAGFEEVKPMVFAGVYPIE 
AEDFEDLRASLEKLQLNDASLTFQPESSLALGFGFRCGFLGLLHMEIVQERLDREFEMNV ITTVPNVSYHIYDKQGNMKEVHNPGGMPDPTMIDHIEEPYIKASIITTTDYIGSIMTLCL GKRGELIKQEYISGNRVEIYYNMPLGEIVIDFYDKLKSISKGYASFDYHPNGFRTSKLVK LDILLNGEPVDALSTLTHIDNAYDMGRRMCEKLKELIPRQQFDIAIQAAIGAKIISRETI KAVRKDVTAKCYGGDVSRKRKLLEKQKKGKKRMKQIGNVEVPQKAFLAVLKLD >gi|225935332|gb|ACGA01000060.1| GENE 79 82965 - 84143 989 392 aa, chain + ## HITS:1 COG:no KEGG:BT_0633 NR:ns ## KEGG: BT_0633 # Name: not_defined # Def: putative Na+/H+ exchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 392 1 381 381 648 93.0 0 MRKVLSFSGFLMLGLVVSQFLPMMAGDGYGAVKSISNVLLYVCLSFIMINVGREFVLDKT RWKSYAQDYFIAMATAALPWFMIAIYYVFILLPPEFWNSWEAWKENLLLSRFAAPTSAGI LFTMLAAIGLKSSWIYKKIQVLAIFDDLDTILLMIPLQIMMIGLRWQLIIVVVIVFLLLS IGWQRLNKYDWRQDWKAILFYSVIIFLATQILYLGSKELYGDEGSIHIEVLLPAFVLGMI MKHKEHDTPVERKVSTGISFFFMFLVGMSMPHFIGVNFAETHAGSYSVTGSQEMMSWGMI MFHVVIVSLLSNIGKLCPMFFYRDRKLSERLALSIGMFTRGEVGAGIIFIALGYNLGGPA LVISVLTLVLNLILTGIFVLWVKNLVLRSYAD >gi|225935332|gb|ACGA01000060.1| GENE 80 84299 - 85534 1113 411 aa, chain - ## HITS:1 COG:RSc1035 KEGG:ns NR:ns ## COG: RSc1035 COG1322 # Protein_GI_number: 17545754 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 21 403 74 445 457 300 48.0 3e-81 MELILLIVIAVLVIVLLVLSLTKGNNQTQAEQLQIALRQQMQENREELNRSIRELRMEMT QTLNQNMQQLQDVLHKNMMTTGELQRQKFDMMARQQEALLKSTEKRLDDMRLMVEEKLQK TLNERIGQSFEIVRSQLENVQKGLGEMKSLAQDVGGLKKVLSNVKMRGTFGEVQLGALLE QMMSPEQYDANVKTKKSGTEFVEFAIKLPGKDDANSTVYLPIDAKFPKDVYEQYYDAFEA GDAALMESSGKQLENTIKKMAKDIHDKYVDPPFTTDFAIMFLPFESIYAEVIRRTSLIET LQKEYKIVVTGPTTLGAILNSLQMGFRTLAIQKRTGEVWTVLGAVKTEFSKFGGLLEKVQ KNLQSAGDQLEEVMGKRTRAIERKLRQVEQLPHEESQKILPIDVEDDLTDY >gi|225935332|gb|ACGA01000060.1| GENE 81 85534 - 86388 703 284 aa, chain - ## HITS:1 COG:PA3657 KEGG:ns NR:ns ## COG: PA3657 COG0024 # Protein_GI_number: 15598853 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: 
Pseudomonas aeruginosa # 39 283 5 248 261 267 52.0 2e-71 MKNFIKGFRFTPSNYPAEVEAKIQKYRKQGYKLPPRKVLRTPEQLEGIRESAKINTALLD YISENIREGMSTEEIDVLVYDFTTSHGAIPAPLNYEGFPKSVCTSINDVVCHGIPNKNEI LKSGDIINVDVSTIYKGYFSDASRMFMIGDVNPDMQKLVQVTKECMEIGIAAAQPWKQLG DVGAAIQEHAEKNGFSVVRDLCGHGVGMQFHEEPDVEHFGRRGTGMMIVPGMTFTIEPMI NMGTYEVFVDDADGWTVCTDDGLPSAQWENMILITETGNEILTY >gi|225935332|gb|ACGA01000060.1| GENE 82 86489 - 87112 160 207 aa, chain - ## HITS:1 COG:no KEGG:BT_0639 NR:ns ## KEGG: BT_0639 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 208 208 245 64.0 6e-64 MDIRKYQFNIFCLLLFPLAATGQEWNKQDSLRLQQMLESDQEIKINRKFIEKVEQNMYSR KPFVDFDPTLPTLKSSMIFSKPSIRTYQTFLKPGSTFLPTYSWLRINKNLVLHSKSNFAE NSSCFHIQSQIEYNLSPKWSLNIYGVQNLDTRKHRGLPSEVEPTQLGSNVVLKINKNWKI KTGVQYQYNVLRKRWEWIPQISVSYEW >gi|225935332|gb|ACGA01000060.1| GENE 83 87224 - 88864 637 546 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 460 3 463 548 169 28.0 2e-41 MKLLWLDLNSSYAHSSLALPALHAQIANNTDIEWCTVSATINENTGSVVNQIYRHQPDII AATNWLFNHEQLLHIVSRAKALLPHCCVILGGPEFLGDNEAFLYKNKFVSGVFRGEGEEV FPLWLKVWNQPRKEWKSITGLCYLDESEAYHDNGLARVMNFSELVPPEESRFFNWSKPFV QLETTRGCFNTCAFCVSGGEKPVRTLSLEATRKRLDVIHQHRIKNVRVLDRTFNYNNKRA KELLNLFREYPDICFHLEIHPALLSDELKRELATLPKGLLHLEAGIQSLRENVLEQSRRI GKLSDALAGLKYLCSLENMETHADLIAGLPLYHLSEIFDDVRTLAEYGAGEIQLESLKLL PGTEMKRRADELGIQYSPLPPYEVLQTREITVDELQTAHYLSRLLDGFYNTPTWRSITRI LILENPHFIHELLDHLVQTDVIDTPLSLEKRGLILYDFCKNHYPDYLTQVSIAWIEAGMS LKKAPAEKVRTKRQLPPESWEIEYGAYRENLRLCFLPTDEEGHGYWFGFESEIQKIQPVF KAKKLS >gi|225935332|gb|ACGA01000060.1| GENE 84 88976 - 89374 405 132 aa, chain + ## HITS:1 COG:no KEGG:BT_0641 NR:ns ## KEGG: BT_0641 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 133 119 56.0 5e-26 
MKMIKRYAGILMMLTLLVGFTSCEDDEDIYDDLMGRTWVGDLWFGSDYNPIESRIRLDNN GLGIDYQVYDYDGRPAGDLPFRWWVDYGTLYLDYGRDFALREIRGVRVRGRYLQGDLYLD GEYIDYIELQMQ >gi|225935332|gb|ACGA01000060.1| GENE 85 89449 - 90369 203 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 74 294 66 279 285 82 30 6e-15 MGKRPRRTPAEKARAQYTNYAVKEPMELMEFLAAKMPDASRTKLKSLLSKRVVFVDNVIT TQFNFPLEAGMKVQISKQKGKKEFNNRLLKIVYEDAYIIVVEKMQGLLSVNTERQKERTA YTILNEYVQRSGRQFRVFIVHRLDRDTSGLMMFAKDEKTQRTLRDNWHDIVTDRRYVAVV EGSMEKDYDTVVSWLTDKTLYVSSSEYDDGGSKSITHYKTIKRANGYSLLELDLETGRKN QIRVHMQDLGHPIIGDGRYGREDAPNPIGRLALHAFKLCFYHPVTGDLMEFETPYPAEFK KLFLKK >gi|225935332|gb|ACGA01000060.1| GENE 86 90406 - 91776 1216 456 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 2 456 20 457 458 285 35.0 1e-76 MDVAAEGKAIAKVNDLVIFVPYVVPGDVVDLQIKRKKNKYAEAEAVKFHELSPNRAVPFC QHYGVCGGCKWQVLPYSEQIRYKQKQVEDNLRRIGKIELSEISPILGSDKTEFYRNKLEF TFSNKRWLTNEEVRQDVKYDQMNAVGFHIPGAFDKVLAIEKCWLQDDISNRIRNAVRDYA YEHDYSFINLRTQEGMLRNMIVRTSSTGELMVIVICKITEDHEMELFKQLLQFVADSFPE ITSLLYIINNKCNDTINDLDVHVFKGKDHIFEEMEGLRFKVGPKSFYQTNSEQAYNLYKV AREFAGLTGNELVYDLYTGTGTIANFVSRQARQVIGIEYVPEAIEDAKINAEINDIKNAL FYAGDMKDILTQDFINQHGRPDVIITDPPRAGMHQDVVDVILFAEPKRIVYVSCNPATQA RDLQLLDEKYKVKAIQPVDMFPHTHHVENVVLLELR >gi|225935332|gb|ACGA01000060.1| GENE 87 92024 - 94718 2974 898 aa, chain + ## HITS:1 COG:mlr7532 KEGG:ns NR:ns ## COG: mlr7532 COG0574 # Protein_GI_number: 13476256 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Mesorhizobium loti # 4 897 3 877 892 1014 58.0 0 MDKKRVYTFGNGQAEGKADMKNLLGGKGANLAEMNLIGIPVPPGFTITTDVCTEYNTLGR 
DKVVELLKDEVVKAITHVETLMKSKFGDVENPLLVSVRSGARASMPGMMDTILNLGLNDE VVEGIIRKTGNARFAWDSYRRFVQMYADVVLGMKPTNKEDIDPFEAIIEEVKKEKGVELD NELKVEDLQELVKKFKAAVKEQTGKDFPTGAYEQLWGAICAVFDSWMNERAILYRKMESI PDEWGTAVNVQAMVFGNMGETSATGVCFSRDAGTGEDLFNGEYLINAQGEDVVAGIRTPQ QITKIGSQRWAVLAGVTEDVRAAKFPSMEEAMPEIYKELDALQTKLENHYKDMQDMEFTV QEGKLWFLQTRNGKRTGAAMVKIAMDLLRQGMIDEKTALMRVEPNKLDELLHPVFDKDAL KKAKVLTRGLPASPGAATGQVVFFADDAAEWHAAGKRVVMVRIETSPEDLAGMAVAEGIL TARGGMTSHAAVVARGMGKCCVSGAGALNIDYKNRTVEIDGVVLKEGDYISLNGSTGVVY NGKVETQAAELSGDFAELMTLADKYTRLQVRTNADTPHDAEVARNFGAVGIGLCRTEHMF FEGEKIKAMREMILAENAEGRRKALAKILPYQQEDFKGIFKAMAGCPVTVRLLDPPLHEF VPHDLKGQQEMADTMGVSLQYIQQRVESLCEHNPMLGHRGCRLGNTYPEITQMQTRAILG AALELKKEGVETHPEIMVPLTGILYEFKEQENVIRAEAKKLFEEVGDSIDFKVGTMIEIP RAALTADRIASSAEFFSFGTNDLTQMTFGYSRDDIASFLPVYLEKKILKVDPFQVLDQNG VGQLVRMATEKGRAIRPDLKCGICGEHGGEPSSVKFCHRVGLNYVSCSPFRVPIARLA Prediction of potential genes in microbial genomes Time: Fri May 13 10:51:42 2011 Seq name: gi|225935331|gb|ACGA01000061.1| Bacteroides sp. D2 cont1.61, whole genome shotgun sequence Length of sequence - 78544 bp Number of predicted genes - 68, with homology - 67 Number of transcription units - 24, operones - 15 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 3.7 1 1 Tu 1 . + CDS 214 - 1710 827 ## BVU_3359 transposase + Term 1771 - 1823 16.6 2 2 Op 1 . + CDS 2786 - 3172 166 ## BVU_3364 hypothetical protein 3 2 Op 2 . + CDS 3203 - 4768 892 ## BVU_3365 putative mobilization protein + Term 4782 - 4843 15.1 + Prom 4893 - 4952 4.0 4 3 Op 1 . + CDS 4976 - 5848 610 ## BVU_3366 putative ParA-related protein 5 3 Op 2 . + CDS 5869 - 6273 358 ## gi|260174518|ref|ZP_05760930.1| hypothetical protein BacD2_21857 6 3 Op 3 . + CDS 6275 - 6862 263 ## BVU_3367 hypothetical protein 7 3 Op 4 . + CDS 6871 - 7320 294 ## BVU_3368 hypothetical protein 8 3 Op 5 . + CDS 7406 - 7762 285 ## BVU_3369 conjugate transposon protein TraE 9 3 Op 6 . + CDS 7783 - 8073 188 ## BVU_3370 hypothetical protein 10 3 Op 7 . 
+ CDS 8085 - 10919 2144 ## BVU_3371 hypothetical protein 11 3 Op 8 . + CDS 10913 - 11653 585 ## BVU_3374 hypothetical protein 12 3 Op 9 . + CDS 11672 - 12364 540 ## gi|260174525|ref|ZP_05760937.1| hypothetical protein BacD2_21892 13 3 Op 10 . + CDS 12354 - 13040 539 ## gi|260174526|ref|ZP_05760938.1| hypothetical protein BacD2_21897 14 3 Op 11 . + CDS 13066 - 14193 853 ## BVU_3377 hypothetical protein 15 3 Op 12 . + CDS 14212 - 14808 413 ## BVU_3378 conjugate transposon protein TraK + Term 14988 - 15024 1.0 + Prom 15209 - 15268 2.1 16 4 Op 1 . + CDS 15319 - 16506 945 ## BVU_3380 conjugate transposon protein TraM 17 4 Op 2 . + CDS 16543 - 17415 738 ## BVU_3381 hypothetical protein 18 4 Op 3 . + CDS 17419 - 17892 351 ## BVU_3382 hypothetical protein 19 4 Op 4 . + CDS 17922 - 20231 1567 ## COG3505 Type IV secretory pathway, VirD4 components + Term 20353 - 20392 3.3 + Prom 20249 - 20308 2.0 20 5 Tu 1 . + CDS 20435 - 20914 258 ## BVU_3384 hypothetical protein + Term 21154 - 21196 2.0 21 6 Tu 1 . - CDS 20915 - 21490 288 ## gi|260174535|ref|ZP_05760947.1| hypothetical protein BacD2_21942 - Prom 21683 - 21742 3.8 22 7 Op 1 . + CDS 21600 - 22391 550 ## BVU_3388 hypothetical protein 23 7 Op 2 . + CDS 22404 - 23675 1063 ## COG0249 Mismatch repair ATPase (MutS family) 24 7 Op 3 . + CDS 23705 - 23935 271 ## gi|260174538|ref|ZP_05760950.1| hypothetical protein BacD2_21957 25 7 Op 4 . + CDS 23919 - 25247 1048 ## COG4227 Antirestriction protein 26 7 Op 5 . + CDS 25275 - 26459 943 ## BVU_3391 hypothetical protein 27 7 Op 6 . + CDS 26495 - 27214 502 ## BVU_3392 hypothetical protein 28 7 Op 7 . + CDS 27247 - 27792 352 ## BVU_3393 putative DNA topoisomerase I 29 7 Op 8 . + CDS 27789 - 28352 354 ## BVU_3394 hypothetical protein 30 7 Op 9 . + CDS 28379 - 29026 378 ## BVU_3395 hypothetical protein - Term 28905 - 28942 2.0 31 8 Op 1 . - CDS 29108 - 29635 374 ## gi|260174545|ref|ZP_05760957.1| hypothetical protein BacD2_21992 32 8 Op 2 . 
- CDS 29641 - 31104 743 ## Psta_4450 peptidyl-arginine deiminase - Prom 31125 - 31184 2.5 33 9 Op 1 . - CDS 31192 - 32505 609 ## Coch_1910 McrBC 5-methylcytosine restriction system component-like protein 34 9 Op 2 . - CDS 32502 - 34622 1293 ## COG1401 GTPase subunit of restriction endonuclease - Prom 34642 - 34701 4.7 35 10 Op 1 . - CDS 34731 - 35369 138 ## gi|260174549|ref|ZP_05760961.1| hypothetical protein BacD2_22012 36 10 Op 2 . - CDS 35377 - 37107 945 ## ACL_0737 hypothetical protein 37 10 Op 3 . - CDS 37118 - 37870 610 ## COG1704 Uncharacterized conserved protein 38 10 Op 4 . - CDS 37911 - 38942 908 ## RB2501_07125 hypothetical protein - Term 38967 - 39013 10.4 39 11 Op 1 . - CDS 39021 - 45323 4073 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster 40 11 Op 2 . - CDS 45334 - 47892 1233 ## gi|260174554|ref|ZP_05760966.1| hypothetical protein BacD2_22037 41 11 Op 3 . - CDS 47911 - 48618 445 ## COG3183 Predicted restriction endonuclease 42 11 Op 4 . - CDS 48634 - 49914 948 ## gi|260174556|ref|ZP_05760968.1| hypothetical protein BacD2_22047 43 11 Op 5 . - CDS 49931 - 53071 2735 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Term 53266 - 53317 1.8 44 12 Op 1 . - CDS 53349 - 53798 229 ## gi|260174559|ref|ZP_05760971.1| hypothetical protein BacD2_22062 45 12 Op 2 . - CDS 53814 - 57638 3325 ## COG0827 Adenine-specific DNA methylase 46 12 Op 3 . - CDS 57652 - 58656 804 ## PG1696 type II DNA modification methyltransferase, putative - Prom 58679 - 58738 3.2 - Term 58671 - 58708 1.3 47 12 Op 4 . - CDS 58742 - 58948 245 ## BT_3153 hypothetical protein - Prom 58972 - 59031 4.0 + Prom 58934 - 58993 2.4 48 13 Op 1 . + CDS 59119 - 59688 313 ## gi|260174563|ref|ZP_05760975.1| hypothetical protein BacD2_22082 49 13 Op 2 . + CDS 59685 - 60323 546 ## SRU_p0019 putative lipoprotein 50 13 Op 3 . 
+ CDS 60336 - 60755 247 ## gi|260174565|ref|ZP_05760977.1| hypothetical protein BacD2_22092 + Term 60775 - 60831 12.1 + Prom 60830 - 60889 1.9 51 14 Tu 1 . + CDS 60947 - 62356 707 ## COG1106 Predicted ATPases 52 15 Op 1 . + CDS 62459 - 63454 302 ## COG3621 Patatin 53 15 Op 2 . + CDS 63454 - 64431 436 ## Ccur_13910 hypothetical protein + Term 64563 - 64621 0.3 + Prom 64681 - 64740 4.1 54 16 Tu 1 . + CDS 64811 - 66100 222 ## Ccur_13900 molybdopterin/thiamine biosynthesis dinucleotide-utilizing protein 55 17 Tu 1 . - CDS 66551 - 66847 63 ## + Prom 67485 - 67544 2.4 56 18 Tu 1 . + CDS 67660 - 68019 321 ## BVU_3418 hypothetical protein + Term 68045 - 68093 6.8 - Term 68352 - 68394 0.8 57 19 Tu 1 . - CDS 68459 - 68695 147 ## BVU_3420 hypothetical protein + Prom 69245 - 69304 3.5 58 20 Op 1 . + CDS 69392 - 69958 548 ## BF2868 putative ribose phosphate pyrophosphokinase 59 20 Op 2 . + CDS 69961 - 70449 213 ## COG4474 Uncharacterized protein conserved in bacteria + Term 70501 - 70557 11.6 - Term 70490 - 70544 14.1 60 21 Op 1 . - CDS 70570 - 71193 376 ## COG0631 Serine/threonine protein phosphatase 61 21 Op 2 4/0.000 - CDS 71183 - 72055 694 ## COG4849 Uncharacterized protein conserved in bacteria 62 21 Op 3 . - CDS 72045 - 73019 734 ## COG4861 Uncharacterized protein conserved in bacteria - Prom 73070 - 73129 2.2 - Term 73117 - 73150 3.1 63 22 Op 1 . - CDS 73171 - 73509 376 ## gi|260174581|ref|ZP_05760993.1| hypothetical protein BacD2_22172 64 22 Op 2 . - CDS 73543 - 76815 1831 ## COG0358 DNA primase (bacterial type) 65 22 Op 3 . - CDS 76812 - 77015 128 ## gi|254881888|ref|ZP_05254598.1| predicted protein - Prom 77058 - 77117 4.2 - Term 77086 - 77128 -0.2 66 23 Op 1 . - CDS 77135 - 77428 140 ## BVU_3432 hypothetical protein 67 23 Op 2 . - CDS 77450 - 77641 148 ## gi|260174584|ref|ZP_05760996.1| hypothetical protein BacD2_22187 - Prom 77784 - 77843 5.5 + Prom 77703 - 77762 4.5 68 24 Tu 1 . 
+ CDS 77788 - 78246 349 ## BVU_3433 hypothetical protein Predicted protein(s) >gi|225935331|gb|ACGA01000061.1| GENE 1 214 - 1710 827 498 aa, chain + ## HITS:1 COG:no KEGG:BVU_3359 NR:ns ## KEGG: BVU_3359 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 3 498 4 499 499 714 68.0 0 MATFKTMVRYKRTDGFYQVYIRVVHRTKSGYIKTDKFVTEKQLSKSGEIKDPVVNEYCSR EILHYMDMINRHDVSQYTITELINFLLKSEEEVCFSDYATQFISRMINEGHERNAKNYRL AVNHLERYLGTNKIMFPILTSAVLSKWIDSLAQTNRAKEMYPTCIRQIFKKAIKEFNDEE RGRIRIKFNPWLKIAIPKSDRGAQKAISAEACREFFNRPLPVSKMISPLPELGRDLALLS LCLGGINSVDLYELKKKDYKNGIIGYKRAKTRNSRSDDAYMEMRVEPFILSTFEKYLSKD SKDEYLFNFHQRYSNSDSFNANINIGIRKICEDMGMAKEDKYCYYTFRHTWATIAQNDCD ANLYEVAFGMNHSHGFNVTRGYVKIDFTPAWELNAKVIDFIFFSDQKSKQGKARDWDEPK DVLFRITKKKMIYGRAYFKGKVVAEMTDIGFSNVDEVIARLVNQLPKDIPTGCNVQFRLT NCDTQREAVYERSKGKGF >gi|225935331|gb|ACGA01000061.1| GENE 2 2786 - 3172 166 128 aa, chain + ## HITS:1 COG:no KEGG:BVU_3364 NR:ns ## KEGG: BVU_3364 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 128 1 128 131 152 61.0 3e-36 MNQRTKYFQLRLTPAEAEYIREKAAYYSSVSHYIRSAIAEYSHVNAKQRLELVNNLGAFY RKYQNELSWIGSNLNQSVRRANELSVAGLLAPSYVQEVLIPQIQETQRTLNEIKRGLDAV TQKAIKSK >gi|225935331|gb|ACGA01000061.1| GENE 3 3203 - 4768 892 521 aa, chain + ## HITS:1 COG:no KEGG:BVU_3365 NR:ns ## KEGG: BVU_3365 # Name: not_defined # Def: putative mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 520 1 530 531 690 65.0 0 MIATILPGSSNFHAVGYNERKVAKGVARLLEIQNFGALGTFGKPTPDELVQYLQDYTSRN DRIRKAQFHVAFSCKGREMSETEVLDFAHRYLSEMGYGEDGQPLLVYSHYDTDNTHLHIV TSRVAPDGKKIAHDHERRRSQEIIDRILGNDRKKSTKNDIDVAKEYTFSSFAQYKAIMQS MGYEAYQKEGTVFIKRGGKVQQMMPLSELEALYKSGCRERARCRQLRSILKKYRDVSANK EGLQKELKSKFGIDIIFFGKKDAPFGYMIVDHANKTVIHGAKVLSMEELLDFATPEQRFD RIENYIDQLLTLNPKITQGEIYQKLRKQRAYIKKGVVYFDGQSRPLKDFMAEAIDHNNRI EWVEKFKPTTEAERDLLCCIFKVNRPDLVSLSSEWPKNYEPSVNRLREIFEDETATASVR SQLRAEGFIIRQEEDAVYAINFHEHVIINLTTEGFNLERLKWKPQTQNQPVPQTKNQAGS KLPHLPKFKDGGGGSQSEKREWEVGQKGNYDTVDDERSLKR 
>gi|225935331|gb|ACGA01000061.1| GENE 4 4976 - 5848 610 290 aa, chain + ## HITS:1 COG:no KEGG:BVU_3366 NR:ns ## KEGG: BVU_3366 # Name: not_defined # Def: putative ParA-related protein # Organism: B.vulgatus # Pathway: not_defined # 1 277 1 277 282 439 76.0 1e-122 MIQTPIIVTFANQKGGVGKTTLCVTFANYLVTKGVRVVVIDCDFQHSIMKCRTADINKYG EQQMPYEVWAYEANDKEMMTSLMEKLHNDPEIDVVLMDSPGSLKAEGLVPMFVNTDIIVV PFHYDLVTVPSTASFLMFVDRLRKAVGDRMKAQLFIIPNLNDGRVGKRSELVIWDNTRDT FSNYGYVTPKIPKRADMERFSTMAALDMQATIVTPVFDKIYLSMFNTLNPIRETALTGIQ LTENVITKPEKKKKESPIETEESDDDTEDVSKETSEETGNELTHDNSQNS >gi|225935331|gb|ACGA01000061.1| GENE 5 5869 - 6273 358 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174518|ref|ZP_05760930.1| ## NR: gi|260174518|ref|ZP_05760930.1| hypothetical protein BacD2_21857 [Bacteroides sp. D2] # 1 134 1 134 134 233 100.0 3e-60 MTNDIPQIDDLTKGINDSDFFLGKPKEDVEEPAQQMKTEPTLKEGEAEAEVTDKCWNVFM GFLESSDEQKDKSDRLVCKLDRDLADSLDDCNIHNRCRSDLVNAIVRSFFENYLQQLAQY RREKKSLFTNLKDE >gi|225935331|gb|ACGA01000061.1| GENE 6 6275 - 6862 263 195 aa, chain + ## HITS:1 COG:no KEGG:BVU_3367 NR:ns ## KEGG: BVU_3367 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 183 1 183 196 213 58.0 3e-54 MRKTKWTDKMLNALSELYPVETTSYTAACLQMSETAVKNQARKLGLSKIAKSKWMIRAEH VQNHFHQCSFSEMAKQLGITKMSVSRIAAKLGLTRTKAESYQVASRIRLDMIRRERRRVI FGLEPITQLKVVSNRAKVRLRSRLKSKGYIVSEEHNVLFYTRDVERKNHLETRGMKLGLR FLPFQCESSLLPTSI >gi|225935331|gb|ACGA01000061.1| GENE 7 6871 - 7320 294 149 aa, chain + ## HITS:1 COG:no KEGG:BVU_3368 NR:ns ## KEGG: BVU_3368 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 61 1 61 229 75 67.0 4e-13 MQYSQIVFLLFIVMMCYYAALIVMDIRKAKAARAAEHDLHSEEDIDITDEAQSFQPVQVS RDEPPKEPKGNEVGDAGGDSGEDDGDAELEQAPKQQPEPETPTEDRIIYREAIMTDGIVV EKIIHEVNQLAETGTCDLGAVIFSCEHAR >gi|225935331|gb|ACGA01000061.1| GENE 8 7406 - 7762 285 118 aa, chain + ## HITS:1 COG:no KEGG:BVU_3369 NR:ns ## KEGG: BVU_3369 # Name: not_defined # Def: conjugate transposon protein TraE 
# Organism: B.vulgatus # Pathway: not_defined # 2 118 16 132 132 164 79.0 1e-39 MQQIKTGFQRAWNRLLSTRLAQFCMTLLVALMLSVGNLMAASKGAAGFTKATQEVSSYQT PVSNLMKAIAAVIVLVGAFNVYFKMQNGDQDVKKTIMLTIGGCIAFIALSEALPLFFK >gi|225935331|gb|ACGA01000061.1| GENE 9 7783 - 8073 188 96 aa, chain + ## HITS:1 COG:no KEGG:BVU_3370 NR:ns ## KEGG: BVU_3370 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 96 1 96 96 140 82.0 2e-32 MAMNPEGYPVFKGLQKPLEFMGIRGRFLTLAAAAIGVSFVGFIVFSIAQGKLAGFIAMLV MAIAGLITIYVKQRGGLHNKKRAKGIFIYKSIHRQT >gi|225935331|gb|ACGA01000061.1| GENE 10 8085 - 10919 2144 944 aa, chain + ## HITS:1 COG:no KEGG:BVU_3371 NR:ns ## KEGG: BVU_3371 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 765 1 765 766 1364 88.0 0 MARHKKRIFDGLYAQLEETNGNVVLFSAKGEPSVIFEITNPVQQLCTDAEQYQLFQDVLS NVVQTLGEGYALQKQDVFCKQAYHHEVPKDAEFLTRSYFKYFEGREFTEIRTYLIITQEA VRSQFVQYDPKTWLDFHSKVSKVDDILTEKRIKHRKLTKKEVNEYCHRFMAFQFRHGPFS MTNFKASDEYLKIGDRVVRSYPLVDIDEINLPTMVKPYTQMSINGYGIATDVLSFLASVP HADCVVFNQVVQIPNQRKLLRKLQTKAKRHGSMPDPSNKIAKADIEEVLNRLAIDSTQLV YTNFNILVSCLPEQVTPVTSYLETKLYECGILPSRTAYNQLELFTDSFPGNAYAFNPDYD LFLTLSDAALCFFFKEHLKASENTPLTTYYTDRQGLPVCIDITGKEGKVKMTDNANFFCI GPSGSGKSFHMNTVVRQLLEQKTDVVMVDTGDSYEGICGYYKGTYISYSKEKPISMNPFK VTREEYNLNFGEKKNFLKSLIFLIYKGNDFPSKIEDMLINQTIVEYYETYFHPFEAFSDK EREGLRQKLLVAAKMEDDYETFAHQMEDIDKQINEEPKSEKAEEKSLLLPSEVRRLKLIR QCRSLTALLHDEAATASEKERALRIIESHKKELYNDTLLIKIDKQINHIEEQKRRLKVSE LSFNSYYEFALERIPQITTLEKIQFNIRDFAAILKQFYRGGELEVTLNSDLDVNLFDEQF IVFEIDKIKDDPVLFPIVVLIIMDVFLQKMRIKKGRKALIIEEAWKAIASPTMAEYIKYL YKTVRKFHGIAGVVTQELNDVIDSPIVKEAIINNSDVKILLDQTKFKDRYEDIAAILGLT PIQRQQIFTINALNNRENRSYFKEVWICRGQNSDVYGVEEAPECYWAYTTERSEKEALKI YLRHYGTMQEAITHIEIDRKKDGNHFYLEFARKVNQQQKVMSLW >gi|225935331|gb|ACGA01000061.1| GENE 11 10913 - 11653 585 246 aa, chain + ## HITS:1 COG:no KEGG:BVU_3374 NR:ns ## KEGG: BVU_3374 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 245 1 245 
245 375 74.0 1e-103 MVKLRFLLLVTLLAVTSTHLHAAWRIVIDRRSLAAVSANMASQKLIEDQHNARLDSIAEK QRKVELYTVSMATIKELYKMSMENIKGFGTESIYYKEIGSCAFDILQNVPELVKTVSKAK FTNKLYCLTELSGLVMETQQLVGNFVNIVNNAKVSNPLKGQGTAEKKNDGYNLLDRYERL TLANRIYTDLMEIRYKVDSMIMMAQYATLNDLFFAIDPEGWANIMTMKSAVNGLIQDWNG LVASNY >gi|225935331|gb|ACGA01000061.1| GENE 12 11672 - 12364 540 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174525|ref|ZP_05760937.1| ## NR: gi|260174525|ref|ZP_05760937.1| hypothetical protein BacD2_21892 [Bacteroides sp. D2] # 1 230 1 230 230 453 100.0 1e-126 MIRRIAFYMVMVFIVPLKLSAGGIPIPNDLPTIEALINLHKAIKKDEDKALQRVTTSFGE QSMITKGANKFNEVRTTLDTKLSNANSYLVLAGAVSSTANSLYQLIRDYKNFTGSTFKHV SQKPFVAWYYTEANLAIAREVKHCYKLYASVAASGINLMKASMDEKLDLVMTLKASIDRA RYIIDNANLYCYLITDCGWKPDYIWEILNSDVTDEIANRVINQWNKGYAS >gi|225935331|gb|ACGA01000061.1| GENE 13 12354 - 13040 539 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174526|ref|ZP_05760938.1| ## NR: gi|260174526|ref|ZP_05760938.1| hypothetical protein BacD2_21897 [Bacteroides sp. D2] # 1 228 1 228 228 399 100.0 1e-110 MQVKTIIIALLTIISISFPLKGSAQAMVHDEEKEKQWKSMENGPWDFAPDWFYYFFHKKY SGAETYWKWAGFKSGWRVRFKEHKSNVKRIMPVRVTAEETQRQKMKKAEEERAYIEELYK EELLREADRSVDLMYATYKGEFNRMQDCITEGLLYCLTKSKGKLQYQVDELSRQNEVLCA DIAYIHKTGIGYGLENAKRQQAYEEAKEKMSKLVDRTAHLCAVASTHY >gi|225935331|gb|ACGA01000061.1| GENE 14 13066 - 14193 853 375 aa, chain + ## HITS:1 COG:no KEGG:BVU_3377 NR:ns ## KEGG: BVU_3377 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 329 1 329 379 571 93.0 1e-161 MNILTVILTIGLPAIDESLEKLLVAMETFPNVAVLGDAVSMARALGLCLALCVGSYECWM MMLGRRGMDVMKLLRIIGISICISSSSWICSALQVPGKGLESATHAMAKAKNREVAAFEL KVAQKQSEYLDRLRAVQDSISTAQQVAAIGQDAAWWDKLIYNVENLGNTINNYAQRAAVA AETKMSEWINDVVRFVGELIFQMSYYGILVAQRIFMAIMVIFCPIMFALSLAPPWNSAWS QWMSKYLSLSLWGFVTYMCLYYIDFILLYNLQQDMVAYNHLLHGSVNSWSQIGALGLQGI GSNCMYAMGMLVGAYIIRFVPEVASWLIPGGVSSGTGSAAGATVMGIATTAGAMAGSAAG SVVGSTGSMVKSAFK >gi|225935331|gb|ACGA01000061.1| GENE 15 14212 - 14808 413 198 aa, chain + ## HITS:1 COG:no 
KEGG:BVU_3378 NR:ns ## KEGG: BVU_3378 # Name: not_defined # Def: conjugate transposon protein TraK # Organism: B.vulgatus # Pathway: not_defined # 1 198 1 198 198 390 91.0 1e-107 MLIESLAQKTKLAMMTVLATIGGCVVICGFTVWCCISLVTQERKQIYILDGEIPFLAQRA QLEANFTMEAKAHIQLFHQYFFNLPPDNDYIKWTLGKAMYMADGTALKQKQAMDENGFYS DIISSSAVCTIICDSIQFDEQERRFTYYGTQLIKRRTTDLKRSMVTTGFIETVPRTRNNP HGLMITNWRTLENKDLDY >gi|225935331|gb|ACGA01000061.1| GENE 16 15319 - 16506 945 395 aa, chain + ## HITS:1 COG:no KEGG:BVU_3380 NR:ns ## KEGG: BVU_3380 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.vulgatus # Pathway: not_defined # 3 395 4 412 412 553 80.0 1e-156 MMKINFKQPKYIFPLVIFVPLCALIYFVMQTFGGSSDEQQAVATDRINMELPKANAEEAG DKMYEMSRRFGDEDAFTAVGGIGEDKKEEEELEHGYSEEELNRLDAAEAERLRQQKELEE LERSLAESKKHINSYAYGGDQNSNPSSNSTDDFARDLEDIQRRSYERQKAIESGLGFGQA DADEREKKLRADSIAKVRQEEKERNRPKLVMKSKDTNAEKFHTVSGDEEAVEAKLIRAMI DQTTKAREGTRLRFKLLDDVTVSGTKLKKGTYLYGTVTGFGQQRVRATITSILIGDKFIN VKLSVFDNDGMEGFYVPESAFRDFVKDAGSSTVQQNISFDSENGETGISGETIALQALQN MYNSASSAISSNIRKNKAKIKYNTIVYLINSEEAR >gi|225935331|gb|ACGA01000061.1| GENE 17 16543 - 17415 738 290 aa, chain + ## HITS:1 COG:no KEGG:BVU_3381 NR:ns ## KEGG: BVU_3381 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 288 17 288 290 405 79.0 1e-112 MNTKILLVVFAFLCTLSAVANEKIYVNREVTTHIVMPENIKMVDISTPKVIGNQCADNIV RIKPFQDNDSTHTCTYAENELLATLTLIGERRIAQYDIVYTHSQQMAASIYRVAYSETQS YINPEVTMPMAEMARYAWAIYGSERKYNQVVSKAHGMKAVVNNIYSIGDYFFIDYSLQNK TKIAYDIEELRVKLADKKETKATNSQTIELTPVFSLNHVKKFKKNYRNVLVLPKLTFPEE KVLRLEISENQISGRVIILTIEYEDILNADGFDANLMKLTPYYPYYYISD >gi|225935331|gb|ACGA01000061.1| GENE 18 17419 - 17892 351 157 aa, chain + ## HITS:1 COG:no KEGG:BVU_3382 NR:ns ## KEGG: BVU_3382 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 157 1 157 157 259 78.0 2e-68 MKKIILLMLCAFTAMGTFAQDKKLTVNAGFLFPSTLNAMVGYEYPLSYGNAVELYGEAGT HWQSPTCHRFWEDYYWDGGLVYKHRLVRYKNGMLRFRFGPQFGAVQRKFFMGLEGGFEYS 
YVFQNGCEFTLIQKNNVNFLHGDTFRNGLLIGFKIPF >gi|225935331|gb|ACGA01000061.1| GENE 19 17922 - 20231 1567 769 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 471 676 280 484 589 67 26.0 1e-10 MAFEETREQQQMYNYFRSCIYIFLIIEIVMNLPVTADNRITQFIIDLLGRFKVFNSVSGC KITELICICVVCIGTKAKKALKFNLKTMVVYPVLAGLTLVGLCFVFHNMSMGISWMGFPA NRILYAVCSVVGTMLVHQGLDGIAKYYNYKVGEDRFNFENESFQQSEVLVNNDYSVNIPM IYYWKQKMHKGWINIINPFRATIVLGTPGSGKSFGIIDPFIRQHAAKGFTLMVYDFKFPT LAKTLFYQYCKNRKAGRLPVNCGFRIVNFTDIEYSNRINPIQRKYIPDLAAASETAATLL ASLNKGGGEKKGGSEAFFTNSAENFLAAIIYFFVNFHPVGFRNGKKLKRFVSVNGQKLEI VIRNWDDFNAIDKDGYVILDFVDEHGNDVSTDEDRMFVDLNGYRYKDRNGQEVKIDRCWY EDENGQEVEPDTITGEYSDMPHVLSFLGRPYDQVFNILMQDDKISSLMAPFKSAYENKAN DQLEGMVGTLRVNAARLVSPEAYWVFTGDDFDLKISDRNNPSYLVIANDPEKEQVIGSLN ALVLNRLITRVNSKGNIPVSIIVDELPTLYFHKIDRLIGTARSNKVAVTLGFQELPQLEA DYGKVGMQKIITTCGNIFMGAARNKETLEWAQNDVFGKAKQTSRSISINDQKVSTTISEK MDYLVPAAKIADMATGWLAGQTARDFTATDDSMLDRFDIEQSEEFKTTKYFCKTHFDMKQ IKDEENHYVSLPKIYEFKNDREKEVLLNRNFKRVNQEVENIVKELLGIG >gi|225935331|gb|ACGA01000061.1| GENE 20 20435 - 20914 258 159 aa, chain + ## HITS:1 COG:no KEGG:BVU_3384 NR:ns ## KEGG: BVU_3384 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 159 72 229 231 149 55.0 5e-35 MHYLHTLLGFAVVSEVPWLLLNGNDGTHNVLFTLALGVTTLIFLDKLIKSNRTLSISVVL VMAYLAYYLGVDYDWRGMLMMAIFFILKSKGASHSPFSRVLQLVFSFPLMMHYGMVGAML ACLVLFMYDESRGFIQGRVAKYGFYGIYPVHFLLIWLIR >gi|225935331|gb|ACGA01000061.1| GENE 21 20915 - 21490 288 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174535|ref|ZP_05760947.1| ## NR: gi|260174535|ref|ZP_05760947.1| hypothetical protein BacD2_21942 [Bacteroides sp. 
D2] # 1 191 12 202 202 385 100.0 1e-106 MLNAQMVSAQFYTITKEAAIEAVSSKNVAVHKGEKDSLALEMGKKVANDTPADTLVKVQK CDQRVSLNQVKAKVSKKGQLRGLSELTIPNLYAEIKRHDIQHPKIVLAQAILETGWFTSP LCRNRHNLFGLTNPRTGDYYEFNHWTESVRAYYTKVQYRYKGGNYLVWLRDIGYAQDPNY IRSVIKVLKML >gi|225935331|gb|ACGA01000061.1| GENE 22 21600 - 22391 550 263 aa, chain + ## HITS:1 COG:no KEGG:BVU_3388 NR:ns ## KEGG: BVU_3388 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 263 25 285 285 372 84.0 1e-102 MDMEMHNRNQSEMLPYEKLAVLGIDREKADSLPIEVKEKLVAGEVTPLMQVSISAQNGDV ITIPLKLQLTTDQNGAPALMAYPVRAALEVERNQALRLTEQEAERLTKGEVIQKAVDVNG EKTQQYLQLDPETKSVIHRRVTDIKLEQQLKDMEKVNDIELGMQQKQQVREGKPVELNVG GEKVAVGIDLKEPQGFKVIQGDMKEWDRQQKLRYDDLHPEYLGLVMTDKNRWEYQQVVDK QSQERALRLSSTQKESKSQSLKL >gi|225935331|gb|ACGA01000061.1| GENE 23 22404 - 23675 1063 423 aa, chain + ## HITS:1 COG:RP298 KEGG:ns NR:ns ## COG: RP298 COG0249 # Protein_GI_number: 15604167 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Rickettsia prowazekii # 15 107 23 114 891 89 48.0 9e-18 MPKQSPPEKDKDVGILQQFDNLKKKHPDAVLLFRHGDFYETFKQDAVKAATVLAIAIADR IFPMEKEPIKVASFPYHALDAYLPKLIRAGMRVAICDALEAPKQIKKESTLSNNETMAKK KKEQGTQEEPTQAVKSAAEEKPAKKSKAEATVETKADVKSEQQFEAKQERKPREPQMITA NGEKVTHGHAYQNTTNPNEWYFTAKIDGQQLKPQKMDAADLAAYQKKEMTVPQLMERYYP SKLQPKVSEEAFKFPKQIAGPDGAINIDKFNVYKEKDEQRPDFGKYKFYVEVGDTKMSAV ASRQDLNAYFDRTIAPTQLVEKNFGERLHLKSAYEKYQLPEGADANGIRVAKDHNDNKWK VSMDLGDKGQTSKKEISFDDGYSLFKSKTATREQIAAKYLHTEITGLLAAPTAKMEKSAS MKM >gi|225935331|gb|ACGA01000061.1| GENE 24 23705 - 23935 271 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174538|ref|ZP_05760950.1| ## NR: gi|260174538|ref|ZP_05760950.1| hypothetical protein BacD2_21957 [Bacteroides sp. 
D2] # 1 76 1 76 76 140 100.0 4e-32 MSAKMNHEELVELLQDGKIGYLRFVMDGENSQDYLDWCQKHGTDPTDESAEFYVGQTDIA EMDSQVMNDEAYGIWH >gi|225935331|gb|ACGA01000061.1| GENE 25 23919 - 25247 1048 442 aa, chain + ## HITS:1 COG:mlr6154 KEGG:ns NR:ns ## COG: mlr6154 COG4227 # Protein_GI_number: 13475143 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Mesorhizobium loti # 23 333 31 304 320 123 30.0 7e-28 MAYGTNTSSTGNAGQVAIDRFAEMMIARMEQMKSADWKQGWIGGASGYAGLPQNVGGRNY SGSNSFFLQLQTAAMGYQLPVYLTFKQAQNLKAHVLKGEKAFPVVYWDMLVRDQHGQRIS AEEYRAMSKDEKQGLEAIPFIKAFPVYNVAQTNLAEVQPERMQKLLDKFKMPELRDTQGM YEHAALDRMIQTQGWLCPIQADKRENGAYYSPSKDIVVLPMKAQFNIGNSPDETYRGGME YYSTMLHEMTHSTMTPERLNREIGGKFGDPLYAKEELVAELTAAMISHSMGFDSKITDNS AAYLDSWIGVLKKEPKFIVSVMADVNKASELILDHVDKQRLQLNKQPYLAKNDPLAPLDG VELPFKNAAIVKTRNGDYGIRASYDGVELGLKKVSKETARTYFQLTDMKDKNAFLVMTAQ KTYGPELATMRQTQKASAGLRI >gi|225935331|gb|ACGA01000061.1| GENE 26 25275 - 26459 943 394 aa, chain + ## HITS:1 COG:no KEGG:BVU_3391 NR:ns ## KEGG: BVU_3391 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 394 1 394 394 555 69.0 1e-156 MTDYKINFRELKSKVGIDDVAYDLGYKLNKKAGVGRYIELVLSDGRTKSDTIIISNPNNK AAQTFFRRDGSKGDVVTLIRKNLNAFNVSGKDEWHRIAKVLARFANAPEPDYHEDREYVR SAKVDAVFDSARYEVKPIDTNKLPPLFAQRGLSAETVQALSPFISLIRDKKNEKFEGFNI GFPYTQGDDTTVKGYEIRGYGGYKSKAAGSNSNSAAWVADLSRGNHDAVKQVFFCESAFD AMAFYQLNRLQLSTDIALVSLGGTFSDEQITNVMQRFPQAKAYDCFDNDLAGRIYGLRML ALLEEIPMKINKTNEGILVEAKGKAFSLSQDRSLTAQLSEHLHIRYRMGQRLPPKAFKDW NDCLLNKPVEVIMTPRKEEREQKLAERRNAGLKI >gi|225935331|gb|ACGA01000061.1| GENE 27 26495 - 27214 502 239 aa, chain + ## HITS:1 COG:no KEGG:BVU_3392 NR:ns ## KEGG: BVU_3392 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 239 11 247 247 370 78.0 1e-101 MALVVTLSFAFTGTPVLAQQTDPALTAEVVAQTVVLNDIHQKRKKTQERIIAAEAAVSIA LDRVHRVEDKMLEYLSNAQGAMQNLYQIKRAGELVTKEIPQNISLLTKSVGGNLKGTAIA AIVSDELTDAATQMAALYPFMKQLVTSGSYNVTNSDGNKEKHKVNLLNSSERYYVANEVV 
TRLEAINTDLFILAWQVRTLSWNDLWFSLDPEGWANVMSGKNIIGGIIADWNQGYVLKW >gi|225935331|gb|ACGA01000061.1| GENE 28 27247 - 27792 352 181 aa, chain + ## HITS:1 COG:no KEGG:BVU_3393 NR:ns ## KEGG: BVU_3393 # Name: not_defined # Def: putative DNA topoisomerase I # Organism: B.vulgatus # Pathway: not_defined # 4 181 2 179 179 310 83.0 2e-83 METEKKTIGRCPLCGGNVVKTCKGYRCENNLGEQPTCTLNINSIIGNRKMGDEEVTELLE KRSILLDGFATKEGKAFPAVLELANDGAINMLYVIGKCPHCGGDVRVSSRAFNCSNYSNQ SAPCTFSIWRNIGGHQLTLAEAKEICEKEITNSELEMYREDGAIYRKRLGLSPDKLQIVK I >gi|225935331|gb|ACGA01000061.1| GENE 29 27789 - 28352 354 187 aa, chain + ## HITS:1 COG:no KEGG:BVU_3394 NR:ns ## KEGG: BVU_3394 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 187 1 186 186 196 51.0 4e-49 MRPGMKIVIILVTAVLFWGGMLVACVSCSEKPNDLAVKVAEKALMATVDHPESVKILGFS KADSVFGKNYVTMEERMALSVMMMKINEQVMKTTNNFEDFNPDNREMNELVERQMEAMTT LRALVAFGDIQPEVHGQKKPQAPFSGWKVKVEVEAPNAEGKLCRSEHWFILDKSAQFVVK SFEIPLL >gi|225935331|gb|ACGA01000061.1| GENE 30 28379 - 29026 378 215 aa, chain + ## HITS:1 COG:no KEGG:BVU_3395 NR:ns ## KEGG: BVU_3395 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 10 212 22 225 225 283 65.0 4e-75 MLVVSLWVTTSCECNHDIVTPFTSSLQVGNVVCADGTVISATDFSNSGKEPVAIIYHVNQ NSEINCLGYAVYIHDMAPQTFADSLGVEQETSASLTDEDGNANTYALYHCEDVSSSMAIK VFDMWSYGQSAFVPSIRQLSYLFPVRYAVNERIKAVGGEPISLEPGDCWLWSSTEVEGQT ENKAWLFSMQSGTYQETPKNQAHRFRPIISIYHVE >gi|225935331|gb|ACGA01000061.1| GENE 31 29108 - 29635 374 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174545|ref|ZP_05760957.1| ## NR: gi|260174545|ref|ZP_05760957.1| hypothetical protein BacD2_21992 [Bacteroides sp. 
D2] # 1 175 1 175 175 356 100.0 3e-97 MLDNKKPRIINVTRKPSKCPDCGSQVVDIIYGTGDMTEIEFVLEYRKDAIMGGNNIPRRP PIWSCSCGCKRFRKVNPDGSDAAVKVKMLKNMRKAPATKINWTSDLASRALEDNRHEIMH HYEMEITTELDEHETLSITAVSGSDAEDQATELVAKGFVGLRGRKCVAIEVFDAE >gi|225935331|gb|ACGA01000061.1| GENE 32 29641 - 31104 743 487 aa, chain - ## HITS:1 COG:no KEGG:Psta_4450 NR:ns ## KEGG: Psta_4450 # Name: not_defined # Def: peptidyl-arginine deiminase # Organism: P.staleyi # Pathway: not_defined # 213 479 2 273 279 155 31.0 4e-36 MIKLEQRLRGFSLSESSHQNIISGSYEAPTEFAAIAQTTLAGHFCVKGKEGNVLVRPTCV EFYYHEEAEHGIKDYIVYHRNMKDNPKPAFDFGTLHNHVSGIDIAFEKGDSPDNAIRASM LIREFEIDGRNDDRSTMLYEALYQQSSVFEGISVQWVDGNVPVEVTADVRKNVALFDTNG EKKKTSDYPELLATEDKKYVQDLRKWQFKRKQIVDSDTNKVYISSWLKDECPDFYGRFIS LLQNNGIVFQVMQSTNDIWARDYMPIQIYDDHFVQYCYNPDYLQKSEEDKESITDVDSVC NELGIQTYKTDLVIDGGNVVKAGKYIIMTEKVYVENSHLKPAEVRAQLRSIFHRDVIMLP WDIKEHYGHADGIIKAIDDNTVLLTNYDDFDFHYAKRFEEILSKYFTVKKLSYHVEYPNK NNWAYINFLRIGDTIFIPGLGAEEDEQALQQIKSYYPECKVLQIEASEVVEKGGALNCIT WNIKEKL >gi|225935331|gb|ACGA01000061.1| GENE 33 31192 - 32505 609 437 aa, chain - ## HITS:1 COG:no KEGG:Coch_1910 NR:ns ## KEGG: Coch_1910 # Name: not_defined # Def: McrBC 5-methylcytosine restriction system component-like protein # Organism: C.ochracea # Pathway: not_defined # 47 435 44 434 437 270 39.0 1e-70 MTIIQLQEQILENTPLLRQDADFILPYLEDGEEVVYTVKRGREEKTPCLLLKRQSEQIAA SGSYFVGIDWLRENELAVQVSPKLNNGFEVDYVRMLNDALLEPENFEHLTDLITIRFDKP SIKVSQQQDILSIFLITEYLSILHRIVRKGLKRSYYTVEENLAKKVKGKIIIGKNIRKNL SRGNITDNVCRYQVYDIDSPENQILKMALRFCSKQLEIYKNAVNIEPLKKKVRLIAPYFD SVSDEISVKSIKSYKGNPVFKEYNQAIKFAQLLLRRYSYDITVVGKHEIQTPPFWIDMSK LFELYVFHHLRQVFTVKGEVRYHVNAYYQELDYLLNPAEWPEPYIIDAKYKPRYKSSKGI SIDDAREVGGYARLSSIYTKLGLDENTAPPIKCLIIYPDQEQEERFTFTRYKEPQFEKVC GYVRFYKVGIKLPVINE >gi|225935331|gb|ACGA01000061.1| GENE 34 32502 - 34622 1293 706 aa, chain - ## HITS:1 COG:DRB0143 KEGG:ns NR:ns ## COG: DRB0143 COG1401 # Protein_GI_number: 10957435 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: 
Deinococcus radiodurans # 249 595 474 837 969 221 36.0 4e-57 MNYTWIPYYQEFAEKLLPFRNDRKSLLKLIYDRREELLAGYLHDEDGYDDLCSDIDPFTT FGLFNRGIKSQNRIHTTEVFKTLLGIKAEVPKDFRGIPVLNNQKSHFFGFRSKRKANDIE SLWALFEKVVKNEGFEDEYNAVVSQFIINVNVTMALFWIRPNDFLAFDSTNRDYLKKQYG IDLPKRVPDYKTYMSLLKGIKEKIASKEIKEKAFYELSANANEGEHTTDRDTNHTWYDDV VRILSRRKNIVMYGAPGTGKTYDVPEIAVRLCDPDFDCSSREDLMKRYNQLKHDKRIAFT TFHQSMDYEDWIEGLKPIADNNQVTYEIESGIFKQLCEEAARPIVKDKHIGIADDAVVWK VSLCGTGDNPVRKECMDNGHIRIGWDDYGPVISDETDWSIYNGEGKRILDAFLNSMKIGD IVMSCYTNTTIDAIGVVVGDYEFIDSYHDYKRVRKVNWVLKGINENIVEMNDGKTMTLST VYRLNSITLDNVKALLDKYQKPTTMEVNTKPYVMVIDELNRGNVSKIFGELITLLEADKR KGSKNAESALLPYSKKSFMIPSNVFIIATMNTADRSLGTLDYAIRRRFAFIPWKPYSLEG EVEGFDEELFKEVSALFISNYEDYEASEWDPNFTLEPAETLSEEYKPEDVWIGQSYFIME ENGEYITSDRILYDIIPLLQEYVRDGVLTEEAQDTIDNLYQKAIEQ >gi|225935331|gb|ACGA01000061.1| GENE 35 34731 - 35369 138 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174549|ref|ZP_05760961.1| ## NR: gi|260174549|ref|ZP_05760961.1| hypothetical protein BacD2_22012 [Bacteroides sp. 
D2] # 1 212 1 212 212 419 100.0 1e-116 MSNYYDEDNYRDYQDYHEQGLEGEYKGDPPPGYESWAEWHYIHGDELPTDADYGIYADDY IPVESSTDEYVYDPNSDTYYCPRTHKRLTQRQLDAEAERETEKINALAARIEAQNRRRTE SRKPKWDKNGMMKGNEAHQIRNSKESILKLAAEQGVHPDIFFRTDEDCSKWVQDYNTWLY KHPRFYTIYKYTCLSIGCLCIVIFFLVMLFGC >gi|225935331|gb|ACGA01000061.1| GENE 36 35377 - 37107 945 576 aa, chain - ## HITS:1 COG:no KEGG:ACL_0737 NR:ns ## KEGG: ACL_0737 # Name: not_defined # Def: hypothetical protein # Organism: A.laidlawii # Pathway: not_defined # 5 565 12 581 611 346 35.0 1e-93 MLKDIYDPLTEYINVFRDRFKVVSEETFAELAAEANVNIEANHETCRQLYATEDVLSSVK TCIGWWTALSVLLWLGVAAGVVAALFFHQELESQVLVCIAIAALVATIFLFVSVHPRLKK LKAERDDLASKAEQLKNEAWEQMAPLNRLYDWDVLTRMMTKTVPRLEFDPYFTTQRLADL KMTYNWDDSFNAERSVVYSHSGLINGNPFVICRTRKMEMGSKTYYGSKTITWTRRERGAD GKYHTVRHSEVLTASVTAPYPEYFEKTRLIYGNTAAPDLIFYRKQSGLAGKEDSLSFKWK KHSLRRKARNLTNADFAMMTNEEFEVAFNTSNRNNNQQFALLFTPLAQESMLKLLEDREA GYGDDFDFDKNKMINTITPNHMQELNLDMNPDQYRHFDYDKAKADFYTINARYFRAIYFS LAPLLCVPIYQQIRPLSDIYGRDMQRHSSFWEHEALANFWGQDHFKHPSCVTDCILKTEQ TQGRDDSSTITVHAYGYRSVQRLTYISKWGGDGRSHRVPVYWDEYLPVTGQGNIYMKEDN DFQDVSITQRQRLNHINNVLGQTNLDLYRKHIASKI >gi|225935331|gb|ACGA01000061.1| GENE 37 37118 - 37870 610 250 aa, chain - ## HITS:1 COG:MYPU_7580 KEGG:ns NR:ns ## COG: MYPU_7580 COG1704 # Protein_GI_number: 15829229 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycoplasma pulmonis # 1 243 1 219 224 202 45.0 4e-52 MANLLDEVTRPVNDNGRDIHVIDRQLPVKIGIGSTIFEIALWVTIPVLVLLYVLIMGATL ENALLVGVVGCLAGILPGVIFIFMKISARNYFQQLEQRIQAEASNIDNFLEQRVQILQNV VGLVERAIDLDKDVMKAVAALRGGGSINAENRNERSQQINSVCAHLFPQVEAYPELKAHN AIADAMQQNSYLQREITAARTVYNSRVTQWNTDIFAWPTKMIVAARQGYTTRIPFTASVE TREMARSKFF >gi|225935331|gb|ACGA01000061.1| GENE 38 37911 - 38942 908 343 aa, chain - ## HITS:1 COG:no KEGG:RB2501_07125 NR:ns ## KEGG: RB2501_07125 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 1 341 1 335 336 248 39.0 2e-64 MPINKNAMIRYQALDRCLQNRQKRFYMQDLIDACNEALADVNGSRGTKGAEGVGRRQVYD 
DLNFMEDPMGFGVTIEKVKDGRKTYYRYAYDSKTIKDQPIKQEEIDMIHDALMLLKRFDG VPQFEWLNNVETCLYTTSKLGKNTESVVSFQHNPYLKGMKWYQPLFDLIVNHRVIELEYQ PFGKEDRIVKVSPYYLKQYNNRWFLIARREGYTYLSNYAIDRIVNITELADTFVPLDDDF SFEEYFGDVVGVSVTNSPVEDVVLHVTDKALGYIVTKPLHESQSTRPEPLEDGRWKITLK VQENYELMSLLRSFGDGIEVVQPLSLRDKMKEMLRRMTEIYAD >gi|225935331|gb|ACGA01000061.1| GENE 39 39021 - 45323 4073 2100 aa, chain - ## HITS:1 COG:ECs5260 KEGG:ns NR:ns ## COG: ECs5260 COG1205 # Protein_GI_number: 15834514 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Escherichia coli O157:H7 # 92 1752 93 1745 2104 333 24.0 3e-90 MKFAKLYKDNREAVERALRSMWCGESGNDSQRKYAERLRDVIKDIFAPDDAVPLVQCMNS YESVHSVPAKTAESLVGDLWEKSLPKGVYYSPFEHQYQCWHALLEEKDANGHPMSICVTT GTGSGKTECFMMPLVKDLIDEQKNRLEISSSQIQALFLYPLNALMEDQKERLEKLLAGTN LTYTVYNGDLPEYEPKSTDHSKAADNLRKRIRQIRGWDEKTKTYKFEHMVYTRDAVRKNP PNILLTNPTMLEYILLRGTDAKLTNPELKSLRWIAIDETHTYTGAGAAELAMLLRRVLLA FGVKAEDIRFATSSATFGNGSDPKKEEQQLREFIAGITGVSQQQVKAIGGNRVGENEIPQ GEDEDRWRKIFQRDYVSLNELYPQSDQTIEEKLALLDEMCAKVPMGSNGLPLMKAKVHYF YRVPNNGLFVRLTEHEDGAFKIYTKNTSEDANEEQPLLELSRCKHCGEYVAIAKINTSPG EENGTYQPLERDDSDMFDLEEDEEDAVQKYGIIGLAKDDIATGDNNTSMMVEGKKLVPAS AGSGAWHLVVNTQCSCPYCNSKLAHKKNNEEEVTADATEEMESAYLQKFRTSADFISRQM APSVLNQLEKGSSKDPAKIMLHDGQQFISFADSRQLAAKATLKQNLEQERLWVYSTIFHA LSKQKAESGTVQQEIDKVSAELVALALKGDTDQLAEASKKIQALKAQMKKYFTWQEIADL LMKSKYCTVFCRQFVKRSGDSDELGQDGNIPKGVLEKYVHSLMVMYLSTRPASAAAPETM GLFHTHYPQLQDIALPEAVEAFNQVLDIESNRISQEDWHHLLQIFMDYTVRSNQSLFLKL SNTNPIDIFACERFATEKPRRRPIKKPVVEVGKPSHSRIVRYLCGLIKREKPELKTGDVY RQYFNQLAAVVDALWEDVNAPKNHLLEESVHWNSSNQQFERDSNEATRFNLANLCFKLYE DVYLCDTNSDSSTRHTVCLRPIENYFKGFAPYLIGNEVVELSNDLHEVWEVYPYYEGTSE EVTIEKVEAWAETHRSLLWNHHIWGKEGVFGNRLTEIHQVPNLFIQAEHTAQVDKEVSRS LQSEFKDHAINILACSTTMEMGVDLGNLEVVMLSSVPPQPSNYKQRAGRSGRNNKVCSVC ITLCGSDAIGLRTLFNPLENIINRPVQVPMVDLMSPQVVQRHVNSYLVRAFGVFKDGDEG GKLTQKVLNYYTNFIFIREDKKTVVVDPATNSTQEPTAKLGDETGTMYARFNLMCSQALD 
ETVRDELNQLLKGTLFEGHPDFVISKARENNERCYAELNNRLEDYCIAFNNSNAKKYPKF RNLLKMQYMEVLNERLLNYWATSRFTPNANMPVNVLTLDLSDSGQIDFSTPATSSNPSYG LRDAIAQYAPGNNIVVDGVAYVVRGIQFTNMYQGVRAFKQIFRNSDKCVIDDPSLDGKIR WDVNDKESVELVQPVGFVPDMNDGKTRIMETNSFTRVSAQLLDTTDWDNNVTEPHLFSVR SNRETGNAKILYYNEGLGYGYCFCSRCGRMTLETGVADGPSVLDKLPDDMNTLKPKTEGR PRYHHAITGKELHSACSGSNNKDCIRRNVIIGDLVQTDYAEIRIRHKGAKRWMNSRSEDN LLFTLGIVFTQSLLDILGKERGAIDFAVMPNGHLCLFDTNPGGAGYANQMASVPLMKDVI TASKQILLHAKERNSKDMLLDKFTLRFMKYVDIDAALDWIEEEEVSRGETPEEIAFVSKD ATETDIVNLERAFAASSGESILFVNDSYAHWNYNDSEHGWRTHYLHYFNTHQGMTTVCVV RTNHDAMPEPILDMVRSIKSGWAKDVVMMDSPYGDKSVYPIAYIDGNLYFTNNAENATLD DAWGNKTLYCARVENILLHAKSIDCSYKDSTKVFILSGHDTQLVKTKELGEVIQNHSGNI INGFVDYCKSNGGKIHISYQDEHLKSVMGMVLTLQTIGHLIKQIGKEFELEFLVERFEDN CYKASIMANLKNSSERDMRLGDLCEGWLADMEHNVGVHGELVPIESRKERSLTHWRVLSF ECGNKRLHIYPDGGFANGWNLQHGKEVHNKFYTLDNTDTNDNIALERAQDIKFDVSVEDI >gi|225935331|gb|ACGA01000061.1| GENE 40 45334 - 47892 1233 852 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174554|ref|ZP_05760966.1| ## NR: gi|260174554|ref|ZP_05760966.1| hypothetical protein BacD2_22037 [Bacteroides sp. 
D2] # 1 852 1 852 852 1789 100.0 0 MGKIDNIEALLEFRQKRLASFDVNEPYVWRLHLTHEEFAQLDSCIYEVVTHHSGSLRGMM TEDLSLVMIVYLAEWYKRVYCGNKQETSFVAKIGSGELKQLWELSGISRDKYVYKTDNGI SLWKYSIYVLGGLAIQHELARGDHNRFLKMLCRIYYGEECNLEELNDAGRAIAFQQSIRQ GHSLHVFLNDILRGNYEDSDADMTRLLTAIRCANDEVLRSKFRIEWLVHYDPAATTMQRK LRVWLKPEETGGVLYQYLRYERLQSWGLVHPEQIKWLHFGLRWYDKTKVVQDIDRKKPLI SYCNTNNGYGFLSWGIDKYAVSNSIPTEEFTHLEIIAFDDNHQECFIQREEVFEWLQLWR IDAWKDDWSSMKSAQHQTAVIYSARCLANQEADDCKAFHGKGTAMSDVWSWNYIQSSIIL TDSRGQEHVLYNHMGYDQIFAHLYKDTICYQDGGLVRVMVHDEEEGDIEELFPLIFGKHD VRICHFDTKESDEEPMTDEVADEVEFKGENGRYVLWTEDDEPPMGLITIRVQIKGKYCTE RMAHLQGPIVRDLQHQTICYLNEEGQLSTYQDTISLDKQPLAPTVELEIGDVTLNVYRPT DVKEVYIDGQVLTYANKEDGLTIPYILKHRLAIADFGMHGYYHYDCSYLSSIYPLMGDNN NSALGCWRDGKLWSAQELDEKAPDWLSVCVGTKQETDAQSFDKQDLQFYLCNLYKNEPPK AVNYNEVHCIGKGEILFQDLRAPTESLTNIFPKIGKPDPWAKKKKIENIELNCFLIAAQY NLYFEIFSPLRDIGYADKKRQLHEWNEKVLRPLLCFRNGELTQNDKDCIARFAEEFRWDA IPCEYIQVDNNI >gi|225935331|gb|ACGA01000061.1| GENE 41 47911 - 48618 445 235 aa, chain - ## HITS:1 COG:DR1726 KEGG:ns NR:ns ## COG: DR1726 COG3183 # Protein_GI_number: 15806729 # Func_class: V Defense mechanisms # Function: Predicted restriction endonuclease # Organism: Deinococcus radiodurans # 126 232 131 236 241 101 40.0 1e-21 MTQYESIHQKFLLYLEKNVRTFSVEEREALVHFAETQLVDYIRTYFIPHFGKLYAKTDAA LYQDLRNKVNTKPEAKAENEACEGRYSMALKYYAAFLQSAIFKGKEKVKLMAGEAKACQD PLLPVEEAALTEGKLHQVSLTKHERNQALRQLCLKHYGYTCQVCGMNFEAVYGKLGKNYI EVHHINPIAETDGEHVLDPKTGLIPLCSNCHSMIHRGKNGRVLTPEELKAILHQQ >gi|225935331|gb|ACGA01000061.1| GENE 42 48634 - 49914 948 426 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174556|ref|ZP_05760968.1| ## NR: gi|260174556|ref|ZP_05760968.1| hypothetical protein BacD2_22047 [Bacteroides sp. 
D2] # 1 426 1 426 426 870 100.0 0 MITVGLDFGTHQTKVCVEEKNGVELSYSFFKFADTEGKLQYTLPSIIMIDGQNRLKYGYV PKGKKKKLIRYFKQATFTSINNGQTQNEAIYYSIWYIAFILFDLEEMYGQDFAIQMGVPT DGARLTHQKQLAAVILASAYTLVEDVFENDKEAFLACTLNQLKEKTEVAIYSKELKDNYG MLVFPEAYACLMPLISSSKIANGMSLMVDIGGGTTDISFFTIKKDPKTKQYRPVVYDFNS INMGLNFLTSPDEMDPDRLDSNVQDVSEILKNKRSDFSNNINQVCVALIGKLQREFKKQC NLRMERLMDALKSRPIIYTGGGSTFTILRRGYGGFKDVIHISEKEWRTKSIQELSTIKAL GLCPILSTAYGLSISVADDNISCEPFQDIFENIRGAEEEGGHVVNKYEYGKAYGGFSYGD DYDAWK >gi|225935331|gb|ACGA01000061.1| GENE 43 49931 - 53071 2735 1046 aa, chain - ## HITS:1 COG:SMc02152 KEGG:ns NR:ns ## COG: SMc02152 COG0553 # Protein_GI_number: 15964256 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Sinorhizobium meliloti # 119 636 123 628 1170 134 25.0 9e-31 MAQFNIGDKVLLVDTNAHGVVIKVMPARRGRQLYTVNFPSGNQDVLEADLKADFDESDPF ERCKSGIYGSYSEYSKRNTAFKIKNSNNSTISSLKASKTLFRAYQFKPLLKFLNSPNRRL LVADEVGLGKTIEAGHIMLELKARRELKNVLIVCPKSLQEKWKAELYEKFGLTFKIVDNA KDLIADLQAKTGTVRAIVNYEKVRMRKRATKDEQRGGEHRYTNLIDYLSENPQRFSLVLC DEAHKMRNRETQTYKGSEIIMSCADAALFLTATPIMISEENLYNLLHLLDNSRYFNPQIF MNRLKENRPFVEAITLLNHNVPLPTIMEGLLDADVQLTFSADGNEIYSRSRHISELYADD PIFQEIKELMEGEDTPKVRARLQYLLSTMSVMNTVFSRTRKREVTTDMSQAERKPHARKV VLNPEEQEEFNAVIENYTEDNSYTDYDGEEVLTQGGALGLVQRKRQVASSVWGYLNKETD LDRGIDAFESCPDAKVEELQRIIEAVFKSGTKKLVVFALFRRTLKYLHIRLKKAGYNALI IHGQVENRAEILTQFKKDKNTHILLSSEVGSEGLDMQFCNSMVNYDLPWNPMVVEQRIGR IDRFGQKAKVVNIYNLVVAGSIQEEIYMRLLERIGIFRGTIGDMEAILDAQVGNTAMTIQ DVYNKMEKEFFTKELTREEKEQKIAEVARAIENEKENLQHLQEGLSNTLTNDAYFRDEIN RILYNNAYVTDEELHNFITSAIRQRLTTCNMEEVEKGVMEFAVPISQPTILKNFLTQYCE NTDESNISVNQFKRRIDEQQKFLLTFNQEKAYEDGTLNYLNIYHPMIQACLNYFIENDDE TRTSFSYALQADELLHEGDKYFMGLYQLTTSRMVQNVKKTTAELQPVLFNLQTGEVESNE AVVDRIYRRSQVEGNERNASNDDAQAEVIEDMRYDFADYISGEKKRRIDEANRQVESDRK RNEQQTKEYYTSRIANHESNIRTWEGLLEFGLDEKEHRNRQGAIRLAKANIAQLQRERDE RLSLINEDPQLSIEDKLLSLNLITII >gi|225935331|gb|ACGA01000061.1| GENE 44 53349 - 53798 229 149 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|260174559|ref|ZP_05760971.1| ## NR: gi|260174559|ref|ZP_05760971.1| hypothetical protein BacD2_22062 [Bacteroides sp. D2] # 1 149 1 149 149 287 100.0 2e-76 MNYEDIPYGYAHCYATAEQCPQCNHCLRHHAAVMNEALATPHERVNCVTTAYSHKVASGQ PCAHYRSDEPIRYAQGMKGIFDMVPKKLYPQVRNQVINCFSSERVFYYAQNGKQLISPAQ QQRIARVFESIGLAAPTYTHYIMRPNWDS >gi|225935331|gb|ACGA01000061.1| GENE 45 53814 - 57638 3325 1274 aa, chain - ## HITS:1 COG:MPN111 KEGG:ns NR:ns ## COG: MPN111 COG0827 # Protein_GI_number: 13507850 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Mycoplasma pneumoniae # 944 1072 248 379 422 94 36.0 2e-18 MAVYTTIQPVQAKQSDSLFGSDQIVLRDEQEDAIRAALKYFKKNAKEKKFLWNAKMRFGK TVCALELAKRMGELPDDKAVHRTLIVTHRPVVNDSWSTDFKKIFGQQNPDYHYATKFDDS AEGDFYALERSVKQEGMHYVFFASMQYLRRARMVGGDNDEQLKIDILTNAWDLVVIDEAH EGTRTNLGQRVIDTLTKPGTNVLHLSGTPFNLYEDFTDEQIYTWDYIKEQTAKRDWPKFH PHEPAENNPYRELPEMEIRTYNLGKLVKENFGDDATFQFKEFFRTKQGQGVPREEKGKFV HEDAVRQFLDLMCSEDEHSHYPFSNDDYRTFFNHTLWVVPGVKEAKALERLLKEHEIFGE DGPFEIINVAGNNDDDETRSDALDKVKKKIGDVPDQTYTITISCGRLTTGVTVRPWTAVF YLKGSENTSAATYMQTIFRVQSPYQYVDADGQLQMKTKCYVFDFAPDRSLKMVAETAKFA TLTQKEKSYARAATTRDKDIENMDTFISFCPIIAMEGGKMEKYNANQLFEQLEHVYVDRV VRNGFNDNSLYDVRELMSLAPDELEDFNLMGEEIGKTTNMEKPKKADKKFDLNKNGLSPN QQAAADRAHKKQKNKEPLTPEEEAALEAEKKRKAEERKERDNRITILRGISLRIPLMMYG AKIDDEEEGITLNNFTRIVDDASWAEFMPRGVTKPMFNKFKKCYNTTVFVAAGKRYRQLA READRMHTDDRIQRIAEIFSYFHNPDKETVLTPWRVVNMHMSDCLGGYTFFNDRFDGPNE ICVTSGENGKLFDWIPTVAPRYVNRGEVTKEVFGDSKAKILEINSKTGLYPLYVAYSLYR QRTQDFLAANLIEDVDNYSVEEEQVIWDDILANNIYVICNTPMAQRITHRTLLGFRPITQ KDGNERVNIKAEKLIERATTEKESLIADLKSVGYWRGTRSKEDMKFSAVVGNPPYQIMDG GAGVSAKPVYNSFVEVAKDLQPLNISMIMPAKWYTDGKGLEAFRSAMLNDKRLSKLVDFT DSRDCFENVDIAGGICYFLWNKDYNGQCNFVSKHQGETKSSMRDLAATDDFIRHIEAVSI IDKVKNYSNVGYYSERVSTRKPFGLSTNDTALDFGDITLIFNGGKGPYKSSLISKGKDMI PLWKVTISRLTAEHAGQADKEGRKRVLSSLNYLKPNEICTETYLVVDTFKTEKETLALMS YLKTRFVRFLIAQLASTQQMTKEKFAFVPLQDFTPQSDIDWTQSVADIDKQLYKKYGLTT EEQQFIESMIKPME >gi|225935331|gb|ACGA01000061.1| GENE 46 57652 - 
58656 804 334 aa, chain - ## HITS:1 COG:no KEGG:PG1696 NR:ns ## KEGG: PG1696 # Name: not_defined # Def: type II DNA modification methyltransferase, putative # Organism: P.gingivalis # Pathway: not_defined # 1 333 12 343 345 372 54.0 1e-101 MANEVDISENDLLRTHPVVLEKLLLDHTTHKNIFWATDSYASLGKGFSFSDEITIAHITG ENGRMIQPRAVKSAEVQTQRSKDMAEVFTPSWICNAQNNLVDEAWFGRHNVFNQENQEEH TWIPTEGVIEFPEDVKGRSWKDYVRDIRLEMTCGEGPYLVSRYDTVTGLPIMPLSQRIGL LDRKLRVVSENTTTSRDWLKWGKEALMSIYGFEWQGDNLLLAREAVFYTFCDFYEAKFDK QVPEQSLEGMAYIISWNLWQMDGLRMVIPGSCDNVFAPSHDLFDPTPQKQECPGCKNGTR TEHIGIKCRIRDWRTTSKDEDKQKPYFISLLNNK >gi|225935331|gb|ACGA01000061.1| GENE 47 58742 - 58948 245 68 aa, chain - ## HITS:1 COG:no KEGG:BT_3153 NR:ns ## KEGG: BT_3153 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 66 47 110 114 73 59.0 1e-12 MSKKINRLKVVLAEKDLTNKWLAEQLGRNVTTVSKWCTNDNQPNLETLLQIARLLKVEVQ DLLVKDVD >gi|225935331|gb|ACGA01000061.1| GENE 48 59119 - 59688 313 189 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174563|ref|ZP_05760975.1| ## NR: gi|260174563|ref|ZP_05760975.1| hypothetical protein BacD2_22082 [Bacteroides sp. 
D2] # 1 189 1 189 189 383 100.0 1e-105 MTKFRNIRRATAVKRFDRLSKAAFYANECIQAMMCQLGTKEMLTEWVETVHAFQDELSRI KALYYSQVESEDLLCETVERIPQIRNNMMPNIFQMSEKLYQVGYYQTEGFRLGHATPNSK ADIINNVYELQKYAAGCLKSISHISYPTIWKGLGEIVGRLNESFCDVCTKTAQIGRSFAT DNSLKTQLQ >gi|225935331|gb|ACGA01000061.1| GENE 49 59685 - 60323 546 212 aa, chain + ## HITS:1 COG:no KEGG:SRU_p0019 NR:ns ## KEGG: SRU_p0019 # Name: not_defined # Def: putative lipoprotein # Organism: S.ruber # Pathway: not_defined # 11 205 10 201 223 80 26.0 3e-14 MKKNYLMAFALFASMLMLSSCGTSYMASYSVGLSSVESPADAKKQFGETKVVTFKDGEVD KYRYEDDFIEIVWYVGLKQFNFELKNKSTHTMKINWDDISYVDINGKTGRVMHAGVKYNE RNNSQPASSIPKGASLSDILLPTDNVNFISGQYGGWRESNLIPSYYSTPEAMANGAESYV GKKMTILMPIMIENVQNDYTFVFNIDKWLNKK >gi|225935331|gb|ACGA01000061.1| GENE 50 60336 - 60755 247 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174565|ref|ZP_05760977.1| ## NR: gi|260174565|ref|ZP_05760977.1| hypothetical protein BacD2_22092 [Bacteroides sp. D2] # 1 139 1 139 139 201 100.0 8e-51 MKKLLSVLSMLLMAVMTISLSSCSSDDDEAFDNAVILGTWTTTEDVIEGGEYVPDMEMQV KWTFKADNTASERMYIKMNNIITKDVTVYFTYVYKGSTITMQSTTNSSTPPFTYEISVTG NKMRFGNQENGYFDLTKMK >gi|225935331|gb|ACGA01000061.1| GENE 51 60947 - 62356 707 469 aa, chain + ## HITS:1 COG:alr3406 KEGG:ns NR:ns ## COG: alr3406 COG1106 # Protein_GI_number: 17230898 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Nostoc sp. 
PCC 7120 # 1 447 1 449 463 217 34.0 5e-56 MIIRIRIKNFLSFHDEIELNMVPGQGRLKSEHKSAPVGGYSTLKTTAVCGANAGGKSNLV KAVAFIKYMVLHGTRPDGLIDYHKFLLSKESKEEDSKIEIDFQANKKNYNYGIEFNNKEI VGEWLYEFTKRSENKIFHRDINKFEMDGLLKRNKSNEARQYLQFFAQSTPKNQLFLHEVF TRNVEDNVNDISDLVAVIDWFLNKLKIIFPGTPYKNGVMLQAANDDELHRFYAVLLRYFD TGIDSVKFNDVDIDKLGIPQSLLRDIKADLLKSKDKDAYGTLSFDDELYLVSAEGGQIKT KKLITIHKMIDGLKNDVAMFSLSNESDGTKRLFDYIPLILDLLKGSKVFLVDEIERSLHP SLVYQIFQLFLESCGQVDSQLIVTSHETTLMTQNLLRKDEIWFMEKNKLGESKLVSLEEY KVRFDKELRGSYLEGAFGGVPKFNEDEFRTLLNKSINKEASLCRENEKN >gi|225935331|gb|ACGA01000061.1| GENE 52 62459 - 63454 302 331 aa, chain + ## HITS:1 COG:VC0178 KEGG:ns NR:ns ## COG: VC0178 COG3621 # Protein_GI_number: 15640208 # Func_class: R General function prediction only # Function: Patatin # Organism: Vibrio cholerae # 8 223 15 241 355 110 34.0 4e-24 MEERKPFKILSIDGGGIKGLFSAAILEKFEEVFNTQIHEQFDLICGTSTGGIIALGASAG KRMTDIVSFYENDGPKIFDERNKQLFKWPYNFYLNARRVLWGTKYSGKALEAALIREFGS LTLAESKTLLCIPAFNITTGDRRIFKKDYNSFTEDSCRKYVDIAMATSAAPTYLPVRNIG SGQFADGGLWANNPILTGLVEFLYSFKDDSRFDGVQILSISSMEKSSGECKKRCNRSFWS WKDTLFDDYSVGQSKSALFLLDKLKDSLTFPLEYVRIANPSLSEKQAHVIDMDNASHKAL EALKDIGYNVAMKEKMKSKIQEIFQTGKTIK >gi|225935331|gb|ACGA01000061.1| GENE 53 63454 - 64431 436 325 aa, chain + ## HITS:1 COG:no KEGG:Ccur_13910 NR:ns ## KEGG: Ccur_13910 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 1 297 1 287 315 128 34.0 3e-28 MANNHDQFIAFDKAITASSSRRDTLKTNRDALRKKIRKHFKDNWPDKPQPSFYWQGSYAM FTLLSPITDESGLGAYDLDDGIYFIGKEDERETIEWYHQEIFKAVKTHTKQGAKDNKPCV TVYYADGHHIDLPAYFWAEKEDHPQLAHKSSDWTDSDPRELSDWFKGRSEHPQLRRVVRY LKAWADYVDDSTSKKMPTGCALMMLAVEHYKANDRDDIVLRDILVSMHESLSKEDGFHCY RPTFPKNEDLFDGYSATRKKDFLAELKSFKEDAERAVNSKNPHDACLKWQNHFGDRFCCS TAKDEDEDAERQSTSGILTNNNRFA >gi|225935331|gb|ACGA01000061.1| GENE 54 64811 - 66100 222 429 aa, chain + ## HITS:1 COG:no KEGG:Ccur_13900 NR:ns ## KEGG: Ccur_13900 # Name: not_defined # Def: molybdopterin/thiamine biosynthesis dinucleotide-utilizing protein # Organism: C.curtum # Pathway: 
not_defined # 170 378 293 498 550 94 28.0 8e-18 MDFTQEVESYWTLSYNNESTVDTRYIIYGTQPEKIEVLNILQHYNGNKLIVNDNDLQFVR KHFHSVVVGKILFLPRFTISRRPPYVITWQGLKDCVDQVDLDEIERFIRTTGCLNILFPL ANNSLFGGVYFERINTRINGFRDTYFTPVKALNKIYKTKKIPRIMAKAYNKNRIEGRTSG GISTPYKFSIVGCGSIGSNLCSFLRSYNDASFVLVDNDLFTIDNIGRHLLGFDSINQFKT VAIKNVLQSTRPEMDVQAISRSSYDCSVSSFVESSAIFFCTGNLMAELELLGRFTKHNVL TPKFIVWFEPYAIAGHMIYLNKLDNDRTLQSLFTNELYKHNLIRPEEYKNPDAFTKRDAG CNGSYTLYSGNDVMLFLSAIFPLICDLINSDSPSTCYRWIGNTQIAIEKRIGIFQTEMQK YQIETFPIQ >gi|225935331|gb|ACGA01000061.1| GENE 55 66551 - 66847 63 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKRSLVGSLFAHSLGFILLTITTYQLKLAWCRRHGVSQLVVIFQLPHRFRAPHAIGLVG SGMTPSIGLSSFADQPSKNKLSSPLLQVQPATVKLLNN >gi|225935331|gb|ACGA01000061.1| GENE 56 67660 - 68019 321 119 aa, chain + ## HITS:1 COG:no KEGG:BVU_3418 NR:ns ## KEGG: BVU_3418 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 118 1 118 120 184 73.0 1e-45 MCRLMTAELAEALKGYPLYSQDNKGKEAICRAIFVLGSIRWYILEGQVEGNDTILFGIVI GMPEDEYGYVSLNELSDLKVDLTKHGLGIMQVRQQPNLPPTPLKLLQDKRLQSFLARLK >gi|225935331|gb|ACGA01000061.1| GENE 57 68459 - 68695 147 78 aa, chain - ## HITS:1 COG:no KEGG:BVU_3420 NR:ns ## KEGG: BVU_3420 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 78 47 125 130 83 51.0 3e-15 MMNRCRNFMEIVPERLYSVSEAARLLGVHRCTIYTYISLPERPLPFMRLPQSDRLSFQGT ELIAFKSAGLPKKGRKRK >gi|225935331|gb|ACGA01000061.1| GENE 58 69392 - 69958 548 188 aa, chain + ## HITS:1 COG:no KEGG:BF2868 NR:ns ## KEGG: BF2868 # Name: not_defined # Def: putative ribose phosphate pyrophosphokinase # Organism: B.fragilis # Pathway: not_defined # 1 188 1 188 188 283 69.0 3e-75 MAPQVAENIRKQLAKPQTWFCKYFPKRIRNVGEKEVSDRTRVYKFKDGVAHEEVAQLTAE SMKQQYGAACSTIVFTPVPASTPQKNELRYKAFCERVCELTGAINGYEHVRVCGERLAIH ENRKAEKEIRKVNIIEFDEEWFNGKTIVAFDDIITRGISYATFAIQLESFGGNVLGGIFL AKTHYKVK >gi|225935331|gb|ACGA01000061.1| GENE 59 69961 - 70449 213 162 aa, chain + ## HITS:1 COG:BS_yoqJ KEGG:ns NR:ns ## COG: BS_yoqJ 
COG4474 # Protein_GI_number: 16079120 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 23 142 4 136 171 63 27.0 2e-10 MDKGLYLKAQSVAFTGHRTIPVPLQGEVRRRLKAAVSLAYANGNRRFLCGMAIGFDSLAA EAVLSLKEDLPNIRLVAVVPFSGQACRWSAIERKRYHQLLSQADEVVVLSVNYYPGCLLR RNDYLLEHTHQVIAYFNGQPKGGTYYTCKKAKLCGMKVENLF >gi|225935331|gb|ACGA01000061.1| GENE 60 70570 - 71193 376 207 aa, chain - ## HITS:1 COG:CAC0035 KEGG:ns NR:ns ## COG: CAC0035 COG0631 # Protein_GI_number: 15893333 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Clostridium acetobutylicum # 3 199 5 225 270 60 25.0 2e-09 MRIDSFSAEGKLYPNEDSLCVCQVSPNQVVAVLADGMGGLSFGKEAADLIVHAVSEYICE NLGRNSIQELIGKALEHADRIIAQRSKEMHSKMGAAVALVFVDSNTVHFTWLGNVRIYFS EQAEAKLLTTDHSLDAGYGKQLLTRCIKGKGVRPDWPYQVLSVKAGSKLILCTDGLYKQL DLERLFTAALPILEEFEDDASMIRVEI >gi|225935331|gb|ACGA01000061.1| GENE 61 71183 - 72055 694 290 aa, chain - ## HITS:1 COG:RSp0267 KEGG:ns NR:ns ## COG: RSp0267 COG4849 # Protein_GI_number: 17548488 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 19 190 18 191 287 99 34.0 9e-21 MEFKLTREQLQNDLLYDALCALSKVMNDLQLEVFIVGALARDIAMEILKMPPSPRRTADL DVAITLKDWSQFELLKEHLLKNHFIKGEPKQRFYYQGEDGNNDYEIDIVPFGELEADEKV AWPPEGNPEMSVKCFKDVMNIADTVVIDDAIAVKMAPLSGQFLIKFDTWLDRHLLTDKDA SDMLYIMDNFYLAYVSFKQPVPDDVEEESEDFDLLTGGAKWIACEMREFLSKEHLQFYID QLQEQIALDENSPLLRAMSSKYPASNGHMMLKKTLIGMTNVLRKGIKDED >gi|225935331|gb|ACGA01000061.1| GENE 62 72045 - 73019 734 324 aa, chain - ## HITS:1 COG:RSp0266 KEGG:ns NR:ns ## COG: RSp0266 COG4861 # Protein_GI_number: 17548487 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 63 321 95 364 370 120 31.0 4e-27 MGTDYQELLHDAIHRLKVATGNERITTDALPQTVSINGTRFHCIVKKTISNANILSTIDT LKAESALKKVPVLLITNHLYEKLANQLADNHISWVDKAGNCDIRHENLTMKVVGQKGSAE 
TKVTATGKINEASMKLILFFLQHPDTINLSYREIQEKVGYSLGTITKAFDLLKANNYLAQ TEKGRKIAMREELIEWWQQQYNEFLKPKLLVNRMAFRSPEARKHWKEIELPLGVYWGGDC GANLLDGYLIPGEFELYSDVVSTMLLKTGAVMPDPQGEIKIYKKFWIGESDERLAPALVI YADLMGTGDSRCREAALRIKENGI >gi|225935331|gb|ACGA01000061.1| GENE 63 73171 - 73509 376 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174581|ref|ZP_05760993.1| ## NR: gi|260174581|ref|ZP_05760993.1| hypothetical protein BacD2_22172 [Bacteroides sp. D2] # 1 112 1 112 112 198 100.0 8e-50 MSRKKLTPAQIRERFTAQGAAIAAMVEFAKDQWSGYKPTKKQTEEVNHYIELLHGDSLKY ARTPEDVYFWREVVTHNVASDCNDYLEEEEWQKAQDALSKITKDQKYYVPEQ >gi|225935331|gb|ACGA01000061.1| GENE 64 73543 - 76815 1831 1090 aa, chain - ## HITS:1 COG:FN1319 KEGG:ns NR:ns ## COG: FN1319 COG0358 # Protein_GI_number: 19704654 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Fusobacterium nucleatum # 7 515 8 488 603 197 29.0 1e-49 MISETTINKVRDLDIKEVLEPYVTLTKKGVALIGLCPFHSENTASFKVNPQKNLYHCFGC GRGGDAITFIMEKENLSFMEAVLFIAKKHNIEIEYVDEKQTEDQKAEAKHRESLLIVLDH VQRFFQENMRINTNDECRAARDYTYGRWPEEFCATSAIGYAPKDSQAFLDFCRQKALNEE LLFELGLLKRSEDGRAYAMFRERIMIPIRNRWGRIIAYTARYIGKNSQAPKYINSATSIV YSKGETLFGIDRASRQRGANYFIIVEGAPDVLKMQSIGFDNTVASLGTAWTEEQFEQLKK FTSSVCFVPDSDVALDKFYGPGFEAVMANGASAIKKGFHVTVRELPFAQAPMTDEELQNL YADSEVPEDAPRMKPVKNDADSFIHCREDYTSLTEKHFIVWLAQKRFFVAGSLMEERKCV AEIADLLRYVKDQLVFDQCIEQLAKLHGKVKLWRDAVSQARGEARKRNEKQTGMNDQQRE AELLRQFGLFVRDHCYYAIGDEDEDPSRISNFIMEPLFHIEDESNGTRIFRLRNMYNVCR VIELKESELCSLSNFQQKAGSLGNYVWLAKIDKLNRVKEYLYSKTDTAERIRKLGWNDIE GFFAFGNGLWFDGSFKAVDELGIVRGINDKAFYIPATSKIYIHNQEIFQFERLMVHENRN GVKLYDFVARLTEVFGENARVAFSYLLSTLFRDIIFRRTRHFPILNLFGEKGTGKTTLAT SLQSFFLHGIDPPNLGVTSVPAMNDRVSQAINTLVVLDEYKNDLDLRKIAYLKGLWGGGG QTKKNTNTDGMAAQTIVTTGVALCGQDKPTQDMALYTRVIFLAFTKTSFNQLEKRHYEDL VSLCNLGLTHLTIEVLSHRELFEKNFPEIYSITKRELATKLDQEVIHDRIFGNWLIPLAT FRTVETVIEVPFSFTELFDTTIRGMRNQNELAQESSEVADFWSMLQGFQTSGKCIDRAHY RIRYMKSFRPLNVKEDIEFQEARPILYLNTAAVASLFNSRNMNATANRSNWSTIISYLKS 
HASYLGLKQDRFTILLPSGMPDFSIDIVNGQQVKHVKVNRPKALCFDYLQLKEMFGLDLE TEVVVDDQDE >gi|225935331|gb|ACGA01000061.1| GENE 65 76812 - 77015 128 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254881888|ref|ZP_05254598.1| ## NR: gi|254881888|ref|ZP_05254598.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 67 1 67 67 114 83.0 2e-24 MKELKPTCDPNGVYSVKRTCAELGISHKTLRKYKKSGYIKPLNPDNVCRPKYSGQSIIDC WTLLTTL >gi|225935331|gb|ACGA01000061.1| GENE 66 77135 - 77428 140 97 aa, chain - ## HITS:1 COG:no KEGG:BVU_3432 NR:ns ## KEGG: BVU_3432 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 97 1 97 105 113 61.0 2e-24 MQFLQFLDHLIPYDTFLNDLAARVVKLLKSDKDDPEFISQRKAYEMFGRRNVERWRRQGK VQCYKRPGKVEYRTADLRLLQRTVQDYFDDSPSKDKK >gi|225935331|gb|ACGA01000061.1| GENE 67 77450 - 77641 148 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174584|ref|ZP_05760996.1| ## NR: gi|260174584|ref|ZP_05760996.1| hypothetical protein BacD2_22187 [Bacteroides sp. D2] # 1 63 1 63 63 101 100.0 2e-20 MEEKVEKIRPALMALEVGQEINFPITRLKSVRTQASELGVMYSRQFKTRTDKVNHLIIVR RVS >gi|225935331|gb|ACGA01000061.1| GENE 68 77788 - 78246 349 152 aa, chain + ## HITS:1 COG:no KEGG:BVU_3433 NR:ns ## KEGG: BVU_3433 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 150 1 142 149 111 43.0 1e-23 MNIPVNDVHIGKAIDEQREKLKLSKSEFGRRIGVPQQHVNRILERDTMETKRLVKVCQEL NFNFFSLFCPLHTQQISAYLSAISLEGDANNLIGDAALLAQLEKVKGELKEVQSDLEHEK RENQLLKEQIAGLVANMKDKDAIIELLKERRE Prediction of potential genes in microbial genomes Time: Fri May 13 10:56:46 2011 Seq name: gi|225935330|gb|ACGA01000062.1| Bacteroides sp. D2 cont1.62, whole genome shotgun sequence Length of sequence - 122104 bp Number of predicted genes - 108, with homology - 104 Number of transcription units - 52, operones - 25 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 127 - 186 7.7 1 1 Tu 1 . 
+ CDS 258 - 773 484 ## BT_0645 hypothetical protein + Term 800 - 848 8.3 + Prom 795 - 854 7.0 2 2 Tu 1 . + CDS 969 - 1541 669 ## BT_0646 hypothetical protein + Term 1559 - 1613 11.0 3 3 Op 1 . - CDS 1678 - 2286 575 ## BT_0647 thiamine phosphate pyrophosphorylase 4 3 Op 2 . - CDS 2349 - 3041 644 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 5 3 Op 3 . - CDS 3046 - 4170 1024 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 6 3 Op 4 . - CDS 4261 - 5958 1656 ## COG0422 Thiamine biosynthesis protein ThiC 7 3 Op 5 3/0.000 - CDS 5972 - 6745 884 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 8 3 Op 6 . - CDS 6790 - 7344 568 ## COG0352 Thiamine monophosphate synthase 9 3 Op 7 . - CDS 7412 - 7612 226 ## BT_0653 ThiS protein, involved in thiamine biosynthesis + Prom 7366 - 7425 4.1 10 4 Tu 1 . + CDS 7457 - 7783 92 ## 11 5 Op 1 . - CDS 7822 - 9264 1329 ## COG0642 Signal transduction histidine kinase 12 5 Op 2 . - CDS 9309 - 9812 281 ## BT_0654 two-component sensor histidine kinase - Prom 9837 - 9896 5.9 - Term 9890 - 9928 2.2 13 6 Op 1 . - CDS 9963 - 10580 434 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent 14 6 Op 2 . - CDS 10656 - 12395 979 ## BT_0656 hypothetical protein - Prom 12545 - 12604 7.7 + Prom 12368 - 12427 8.5 15 7 Op 1 . + CDS 12542 - 14920 2101 ## COG0210 Superfamily I DNA and RNA helicases 16 7 Op 2 . + CDS 14946 - 15635 662 ## BT_0658 hypothetical protein + Term 15668 - 15730 11.1 + Prom 15815 - 15874 6.7 17 8 Tu 1 . + CDS 15901 - 17457 1066 ## BVU_1868 hypothetical protein + Term 17633 - 17673 6.4 - Term 17643 - 17683 6.7 18 9 Tu 1 . - CDS 17695 - 18252 378 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 18298 - 18357 8.6 + Prom 18296 - 18355 6.3 19 10 Tu 1 . 
+ CDS 18428 - 19567 830 ## COG0019 Diaminopimelate decarboxylase + Term 19577 - 19619 -0.0 + TRNA 19686 - 19758 84.5 # Gly GCC 0 0 + TRNA 19793 - 19877 51.5 # Leu CAG 0 0 + TRNA 19898 - 19981 54.0 # Leu GAG 0 0 + TRNA 20001 - 20073 84.5 # Gly GCC 0 0 + TRNA 20106 - 20190 51.5 # Leu CAG 0 0 + TRNA 20225 - 20300 92.8 # Gly GCC 0 0 + Prom 20227 - 20286 80.0 20 11 Op 1 1/0.143 + CDS 20491 - 21657 1327 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 21 11 Op 2 . + CDS 21679 - 22854 1229 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 22 11 Op 3 . + CDS 22953 - 23246 103 ## 23 12 Tu 1 . - CDS 23096 - 23458 424 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 23518 - 23577 4.6 + Prom 23530 - 23589 6.0 24 13 Op 1 13/0.000 + CDS 23695 - 24960 1279 ## COG1538 Outer membrane protein 25 13 Op 2 11/0.000 + CDS 24983 - 26077 1059 ## COG0845 Membrane-fusion protein 26 13 Op 3 . + CDS 26092 - 29217 3174 ## COG3696 Putative silver efflux pump + Term 29299 - 29332 3.1 27 14 Op 1 40/0.000 - CDS 29305 - 30636 961 ## COG0642 Signal transduction histidine kinase 28 14 Op 2 . - CDS 30660 - 31334 669 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 31421 - 31480 4.9 + Prom 31285 - 31344 3.6 29 15 Tu 1 . + CDS 31453 - 33462 1814 ## BT_0683 alpha-glucosidase + Term 33567 - 33631 9.9 - Term 33565 - 33604 1.2 30 16 Tu 1 . - CDS 33703 - 35367 1346 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 35422 - 35481 7.8 - Term 35529 - 35572 3.1 31 17 Tu 1 . - CDS 35649 - 37115 1202 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain - Prom 37183 - 37242 5.9 + Prom 37185 - 37244 7.0 32 18 Tu 1 . + CDS 37282 - 38466 831 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 38552 - 38623 9.7 - Term 38564 - 38608 7.6 33 19 Tu 1 . 
- CDS 38681 - 40312 1728 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 40333 - 40392 7.5 - Term 40392 - 40435 4.0 34 20 Op 1 . - CDS 40466 - 41134 659 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 35 20 Op 2 8/0.000 - CDS 41145 - 42458 709 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation 36 20 Op 3 . - CDS 42455 - 43795 808 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 43816 - 43875 1.6 37 21 Op 1 . - CDS 43893 - 44696 544 ## BT_0691 hypothetical protein 38 21 Op 2 . - CDS 44677 - 45483 663 ## BT_0692 calcineurin superfamily phosphohydrolase - Prom 45654 - 45713 6.2 + Prom 45640 - 45699 3.8 39 22 Op 1 . + CDS 45749 - 48058 1620 ## BT_0693 hypothetical protein 40 22 Op 2 . + CDS 48082 - 48756 323 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 41 22 Op 3 . + CDS 48775 - 51093 1649 ## BT_0695 ABC transporter, permease protein + Term 51110 - 51158 4.7 - Term 51031 - 51059 -0.9 42 23 Tu 1 . - CDS 51133 - 52497 1181 ## COG0534 Na+-driven multidrug efflux pump - Prom 52521 - 52580 5.5 + Prom 52485 - 52544 4.8 43 24 Tu 1 . + CDS 52648 - 53295 680 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 53457 - 53516 7.7 44 25 Op 1 . + CDS 53541 - 54362 922 ## COG0413 Ketopantoate hydroxymethyltransferase 45 25 Op 2 . + CDS 54372 - 55529 860 ## COG0477 Permeases of the major facilitator superfamily + Term 55532 - 55572 8.4 - Term 55423 - 55466 5.1 46 26 Tu 1 . - CDS 55544 - 57763 2268 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases - Prom 57863 - 57922 3.4 + Prom 57739 - 57798 6.5 47 27 Tu 1 . + CDS 57870 - 59387 1701 ## BT_0701 hypothetical protein + Term 59416 - 59473 1.3 + Prom 59416 - 59475 4.7 48 28 Op 1 . + CDS 59499 - 60122 449 ## COG4845 Chloramphenicol O-acetyltransferase 49 28 Op 2 . 
+ CDS 60156 - 61550 961 ## BT_0703 hypothetical protein 50 28 Op 3 . + CDS 61575 - 62138 434 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 51 28 Op 4 . + CDS 62236 - 63387 864 ## COG1835 Predicted acyltransferases + Term 63413 - 63475 10.4 - Term 63401 - 63463 6.6 52 29 Op 1 . - CDS 63540 - 64010 243 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 53 29 Op 2 . - CDS 64032 - 65198 1367 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) - Prom 65223 - 65282 5.3 + Prom 65724 - 65783 3.5 54 30 Op 1 42/0.000 + CDS 65875 - 67395 1860 ## COG0055 F0F1-type ATP synthase, beta subunit 55 30 Op 2 . + CDS 67405 - 67653 244 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 56 30 Op 3 . + CDS 67737 - 68171 328 ## BT_0713 hypothetical protein 57 30 Op 4 . + CDS 68155 - 69297 1124 ## COG0356 F0F1-type ATP synthase, subunit a + Prom 69314 - 69373 2.1 58 31 Op 1 . + CDS 69406 - 69663 422 ## BT_0715 ATP synthase C subunit 59 31 Op 2 38/0.000 + CDS 69720 - 70223 688 ## COG0711 F0F1-type ATP synthase, subunit b 60 31 Op 3 41/0.000 + CDS 70229 - 70789 453 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 61 31 Op 4 42/0.000 + CDS 70789 - 72372 2036 ## COG0056 F0F1-type ATP synthase, alpha subunit 62 31 Op 5 . + CDS 72434 - 73330 813 ## COG0224 F0F1-type ATP synthase, gamma subunit + Term 73518 - 73554 -0.7 + Prom 73697 - 73756 6.3 63 32 Tu 1 . + CDS 73915 - 76512 1930 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Term 76436 - 76475 5.8 64 33 Op 1 . - CDS 76541 - 77161 508 ## COG2431 Predicted membrane protein 65 33 Op 2 . - CDS 77158 - 77436 307 ## BT_0723 hypothetical protein - Prom 77457 - 77516 5.8 - Term 77623 - 77663 2.6 66 34 Tu 1 . 
- CDS 77839 - 78750 553 ## COG1091 dTDP-4-dehydrorhamnose reductase - Prom 78773 - 78832 8.4 + Prom 78726 - 78785 7.0 67 35 Tu 1 . + CDS 78889 - 80058 1207 ## COG1690 Uncharacterized conserved protein + Term 80105 - 80136 1.8 - Term 80093 - 80122 1.4 68 36 Op 1 . - CDS 80330 - 80902 351 ## BT_0727 hypothetical protein 69 36 Op 2 . - CDS 80935 - 81612 414 ## BT_0727 hypothetical protein - Prom 81758 - 81817 6.8 - Term 81767 - 81817 2.0 70 37 Op 1 . - CDS 81998 - 82897 556 ## COG2207 AraC-type DNA-binding domain-containing proteins 71 37 Op 2 . - CDS 82928 - 83830 656 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 84008 - 84067 7.7 + Prom 83812 - 83871 6.2 72 38 Tu 1 . + CDS 84034 - 85041 803 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 84950 - 84986 1.1 73 39 Op 1 . - CDS 85181 - 85729 411 ## BT_0731 hypothetical protein 74 39 Op 2 . - CDS 85811 - 86584 500 ## BF2197 hypothetical protein 75 39 Op 3 40/0.000 - CDS 86608 - 87303 771 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 76 39 Op 4 . - CDS 87316 - 88830 898 ## COG0642 Signal transduction histidine kinase - Prom 88896 - 88955 4.5 + Prom 88812 - 88871 3.9 77 40 Tu 1 . + CDS 88963 - 91071 1532 ## BT_0734 hypothetical protein + Term 91237 - 91280 1.0 - Term 91037 - 91083 5.6 78 41 Tu 1 . - CDS 91092 - 93098 1317 ## COG0642 Signal transduction histidine kinase - Prom 93148 - 93207 2.4 - Term 93177 - 93249 19.9 79 42 Op 1 . - CDS 93276 - 94919 1648 ## BT_0735 aspartate aminotransferase (EC:2.6.1.1) 80 42 Op 2 . - CDS 94942 - 96636 1867 ## COG2985 Predicted permease - Prom 96772 - 96831 7.4 + Prom 96715 - 96774 9.3 81 43 Tu 1 . + CDS 96804 - 98471 1787 ## COG2759 Formyltetrahydrofolate synthetase + Term 98515 - 98576 4.2 - Term 98734 - 98775 -1.0 82 44 Tu 1 . - CDS 98810 - 100090 1718 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 100124 - 100183 5.8 83 45 Op 1 . 
- CDS 100225 - 100962 717 ## BT_0739 hypothetical protein 84 45 Op 2 . - CDS 100979 - 101548 528 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 85 45 Op 3 19/0.000 - CDS 101565 - 102026 610 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 86 45 Op 4 . - CDS 102023 - 102964 977 ## COG0540 Aspartate carbamoyltransferase, catalytic chain - Prom 102996 - 103055 2.7 - Term 102999 - 103049 6.2 87 46 Op 1 . - CDS 103077 - 103856 656 ## BT_0592 hypothetical protein 88 46 Op 2 . - CDS 103880 - 104224 304 ## BT_0593 hypothetical protein - Prom 104339 - 104398 4.3 - Term 104274 - 104314 2.0 89 47 Tu 1 . - CDS 104449 - 104859 303 ## gi|260174670|ref|ZP_05761082.1| hypothetical protein BacD2_22627 - Prom 104960 - 105019 8.8 + Prom 104853 - 104912 4.2 90 48 Tu 1 . + CDS 104989 - 105942 512 ## BT_0595 integrase + Term 106123 - 106172 7.3 + Prom 106083 - 106142 4.0 91 49 Op 1 . + CDS 106297 - 106863 455 ## BT_0596 putative transcriptional regulator 92 49 Op 2 13/0.000 + CDS 106936 - 107823 719 ## COG1209 dTDP-glucose pyrophosphorylase 93 49 Op 3 9/0.000 + CDS 107873 - 108442 471 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 94 49 Op 4 11/0.000 + CDS 108448 - 109314 659 ## COG1091 dTDP-4-dehydrorhamnose reductase 95 49 Op 5 . + CDS 109322 - 110395 882 ## COG1088 dTDP-D-glucose 4,6-dehydratase 96 49 Op 6 . + CDS 110474 - 112024 567 ## BT_0467 hypothetical protein 97 49 Op 7 . + CDS 112008 - 112976 352 ## COG1035 Coenzyme F420-reducing hydrogenase, beta subunit 98 49 Op 8 . + CDS 112984 - 114120 594 ## COG2327 Uncharacterized conserved protein 99 49 Op 9 . + CDS 114149 - 115474 458 ## gi|260174680|ref|ZP_05761092.1| hypothetical protein BacD2_22677 100 50 Tu 1 . - CDS 115865 - 116026 69 ## gi|260174681|ref|ZP_05761093.1| hypothetical protein BacD2_22682 + Prom 116192 - 116251 5.3 101 51 Op 1 . + CDS 116271 - 116393 65 ## 102 51 Op 2 . 
+ CDS 116439 - 116762 237 ## gi|260174684|ref|ZP_05761096.1| hypothetical protein BacD2_22697 103 51 Op 3 . + CDS 116797 - 116940 65 ## 104 51 Op 4 . + CDS 117020 - 118138 689 ## COG3754 Lipopolysaccharide biosynthesis protein 105 52 Op 1 . + CDS 118668 - 119492 553 ## Mfla_2015 glycosyl transferase, group 1 106 52 Op 2 . + CDS 119497 - 120708 319 ## DPPB80 related to F420H2-dehydrogenase, beta subunit 107 52 Op 3 . + CDS 120705 - 121742 327 ## Amet_0211 hypothetical protein 108 52 Op 4 . + CDS 121753 - 122104 84 ## gi|260174689|ref|ZP_05761101.1| hypothetical protein BacD2_22722 Predicted protein(s) >gi|225935330|gb|ACGA01000062.1| GENE 1 258 - 773 484 171 aa, chain + ## HITS:1 COG:no KEGG:BT_0645 NR:ns ## KEGG: BT_0645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 295 87.0 5e-79 MKKGLLLVVVMLATIAVKAQDIYVGGSLNVWRNSTGNTTSFKVAPEVGYNFNETWALGAE LDYSHNYNGGLTKNSVIVAPYIRWSYCETGAVRLFLDGTAALGFVKVKDGDTTKAGQVGL RPGIAVKLNDHFSFIAKYGFLGYRRNINTLGDSFGLELTSEDLSIGFHYAF >gi|225935330|gb|ACGA01000062.1| GENE 2 969 - 1541 669 190 aa, chain + ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 322 88.0 5e-87 MKKSVLFVLFALISVAGFSQITGWNAKVGMNFSNYTGDLDLNAKVGFKLGGGFEYAFTDT WSLQPSLFLTSKGAKKDGNSINAMYLELPVMAAARFNVADNTNLVVNAGPYFACGIAGKT KFDLGNDTERKVDTFGDDALKRFDAGLGVGVALEFGRIIAGLDGQFGLVDVEKVGNPKNM NFSIIVGYKF >gi|225935330|gb|ACGA01000062.1| GENE 3 1678 - 2286 575 202 aa, chain - ## HITS:1 COG:no KEGG:BT_0647 NR:ns ## KEGG: BT_0647 # Name: not_defined # Def: thiamine phosphate pyrophosphorylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 202 1 202 202 393 96.0 1e-108 MKLIVVTTPTFFVEEDKIITALFEEGLDVLHLRKPETPAMYSERLLTLIPDKYHRRIVTH EHFYLKEEFNLMGIHLNARNPKEPHDYYGHISCSCHSVEEVKNRKHFYDYVFMSPIYDSI SKVNYYSTYTAEELREAQRAKIIDSKVMALGGINEDNLLEIKDFGFGGAVVLGDLWNKFD ACQDQNYLAVIEHFKKLKKLSD 
>gi|225935330|gb|ACGA01000062.1| GENE 4 2349 - 3041 644 230 aa, chain - ## HITS:1 COG:all2906_1 KEGG:ns NR:ns ## COG: all2906_1 COG0476 # Protein_GI_number: 17230398 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Nostoc sp. PCC 7120 # 2 215 18 231 262 216 51.0 3e-56 MRYDRQIILPEIGEEGQKKLQEAKVLIVGVGGLGSPIALYLAGAGVGCLGLVDDDLVSIT NLQRQVLYSEKELGKPKAICAAERLSALNSEIRIHPYSTRLTKENAYHIIQEYDIVVDGC DNFATRYLINDICIEQKKPYVYGAICGFEGQVSVFNYGNQKKNYRDLYPDEEEMQRMPPP PKGVMGVTPAIVGSVEATEVLKIICDFGDVLAGELWTIDLRTWQSNKFSL >gi|225935330|gb|ACGA01000062.1| GENE 5 3046 - 4170 1024 374 aa, chain - ## HITS:1 COG:VC0066 KEGG:ns NR:ns ## COG: VC0066 COG1060 # Protein_GI_number: 15640098 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Vibrio cholerae # 2 370 3 368 370 373 49.0 1e-103 MFSDELEKISWEETTKAIYSKTDADVRRALGKKEHLDVNDFMALISPAATPYLEVMARLS QKYTMERFGKTISMFVPLYLTNSCTNSCVYCGFHISNPMKRTILTEEEIVNEYKAIKRLA PFENLLLVTGENPAAAGVPYIARALDLAKPYFSNLQIEVMPLKAEEYQELTNHGLNGVIC FQETYHKANYKTYHPRGMKSKFEWRVDGFDRMGQAGVHKIGMGVLIGLEEWRTDVTMMAY HLRYLQKHYWKTKYSVNFPRMRPSENGGFQPNVVMNDRELAQLTFAMRIFDHDVDISYST RESAEIRNHMATLGVTTMSAESKTEPGGYYSYPQTLEQFHVSDERKAVEVERDLKKLGRE PVWKDWDQSFDFKR >gi|225935330|gb|ACGA01000062.1| GENE 6 4261 - 5958 1656 565 aa, chain - ## HITS:1 COG:PA4973 KEGG:ns NR:ns ## COG: PA4973 COG0422 # Protein_GI_number: 15600166 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Pseudomonas aeruginosa # 7 564 23 596 627 781 63.0 0 MEQKIKFPRSQKVYLPGKLYPNIRVAMRKVEQVPSVSFEGEEKIATPNPEIYVYDTSGPF SDADMSIDLKKGLPRMREEWIVGRGDVEQLPEITSEYGQMRRDDKSLDHLRFEHIALPYR AKKGEAITQMAYAKRGIITPEMEYVAIRENMNCEELGIKTHITPEFVRQEIAEGRAVLPA NINHPEAEPMIIGRNFLVKINTNIGNSATTSSIDEEVEKALWSCKWGGDTLMDLSTGENI 
HETREWIIRNCPVPVGTVPIYQALEKVNGIVEDLTWEIYRDTLIEQCEQGVDYFTIHAGI RRHNVHLADKRLCGIVSRGGSIMSKWCLVHDQESFLYDHFDDICDILAQYDVAVSLGDGL RPGSIYDANDEAQFAELDTMGELVLRAWDKNVQAFIEGPGHVPMHKIKENMERQIEKCHD APFYTLGPLVTDIAPGYDHITSAIGAAQIGWLGTAMLCYVTPKEHLALPDKEDVRVGVIT YKIAAHAADLAKGHPGAQVRDNALSKARYEFRWKDQFDLSLDPERAQTYFRAGHHIDGEY CTMCGPNFCAMRLSRDLKKSAKTNK >gi|225935330|gb|ACGA01000062.1| GENE 7 5972 - 6745 884 257 aa, chain - ## HITS:1 COG:YPO3742 KEGG:ns NR:ns ## COG: YPO3742 COG2022 # Protein_GI_number: 16123879 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Yersinia pestis # 2 256 62 324 333 317 62.0 2e-86 MEKLVIAGREFNSRLFLGTGKFNSNEVMEQAILASGTEMVTVAMKRIDMDNKEDDMLKHI IHPNIQLLPNTSGVRNAEEAVFAAQLAREAFGTNWLKLEIHPDPRYLLPDSIETLKATEE LVKLGFIVLPYCQADPVLCKRLEEAGAATVMPLGAPIGTNKGLQTKEFLQIIIEQAGIPV VVDAGIGAPSHAAEAMELGASAVLVNTAIAVAGNPVEMAKAFKAATEAGRQAYEAGLGLQ AVDFVAEASSPLTAFLD >gi|225935330|gb|ACGA01000062.1| GENE 8 6790 - 7344 568 184 aa, chain - ## HITS:1 COG:PAB1645 KEGG:ns NR:ns ## COG: PAB1645 COG0352 # Protein_GI_number: 14521295 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Pyrococcus abyssi # 2 181 27 202 207 119 38.0 3e-27 MALEGGCKWIQLRMKEAPLEEVEAVALQLKPLCKEHEAILVLDDHVELARKLEVDGVHLG KKDMPIDQARQILGEAFIIGGTANTFEDVVQHYRAGADYLGIGPFRFTTTKKNLSPVLGL EGYSSILSQMKEANIEIPVVAIGGITYEDIPAILHTGVNGIALSGTILGADNPIEETRRI LNHA >gi|225935330|gb|ACGA01000062.1| GENE 9 7412 - 7612 226 66 aa, chain - ## HITS:1 COG:no KEGG:BT_0653 NR:ns ## KEGG: BT_0653 # Name: not_defined # Def: ThiS protein, involved in thiamine biosynthesis # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 107 87.0 2e-22 MKVQVNNKEVEITPDSTLTQLTAQLELPVQGIAIAVNNKMIPRTEWEGFILHENDNLVII KAACGG >gi|225935330|gb|ACGA01000062.1| GENE 10 7457 - 7783 92 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQYETLPFGTGNHFVVYGDSNALNRKFELCCQLCKGRVGSNFHLFVVHLYFHCPIRINKL KVNTKSVLEWKGDKKKHNPYVSTNCLRFLGYNLSPSWGTPLSLRGQSK 
>gi|225935330|gb|ACGA01000062.1| GENE 11 7822 - 9264 1329 480 aa, chain - ## HITS:1 COG:VC2453_1 KEGG:ns NR:ns ## COG: VC2453_1 COG0642 # Protein_GI_number: 15642449 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 199 469 238 513 516 166 34.0 1e-40 MIQDFYPETEHLVFVSDNTYNGLAELAWFKKNLQHFPQLSITYVDGRIHTLDMAASQLRN LPRNTVMLLGIWRIDNRGITYMNNSVYAFSKANPLLPMFSMTSTAIGYWAIGGYVPQYEG VGQSMGEYAYRFLDQKETDISSINILPNRYKFDAKKLKEWGFENKKLPVNSMVINQPIPF FVAYKTEVQFILIIFLVLVGSLMISLYYYYRTKILKNHLERTTKQLREDKKKLEESEVEL RDAKERAEEANQLKSAFVSNMSHEIRTPLNAIVGFSSLIIGSVEQNNELKEYADIVQTNS NLLLQLISDVLDISRLESGKLQFNYEWCELVNHCQNMITLTNRNKTMDVDIKLQMPKEPY MLYTDPLRLQQIIINLLNNALKFTPAGGSITLDYEVDEKKQCMLFSVTDTGTGIPEDKQE LVFQRFEKLNEFVQGTGLGLAICKLTIQYMGGDIWIDKNYKNGARFIFSHPIKKQESTEK >gi|225935330|gb|ACGA01000062.1| GENE 12 9309 - 9812 281 167 aa, chain - ## HITS:1 COG:no KEGG:BT_0654 NR:ns ## KEGG: BT_0654 # Name: not_defined # Def: two-component sensor histidine kinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 166 6 163 660 229 75.0 2e-59 MVLYKRKKIRFIVSIAVLLFLPFFSNVGNAATTEESEEEILFITSYNSDTKYTYDNISTF IETYTQLGGRYSTMVENMNATDLMQAHQWKKTLTDILDKHPKAKLVILLGGEAWSSFLHL EDEKYKQLPVFCAMASRNGIRIPEDSTDIRSYNPESINLMERMKEYT >gi|225935330|gb|ACGA01000062.1| GENE 13 9963 - 10580 434 205 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 13 203 1 195 201 171 43 1e-41 MNTILMSLIMMTMTYEMPKLPYANNALEPVISQQTIDFHYGKHLQTYVNNLNSLVPGTEY EGKTVEEIVAAAPDGAIFNNAGQVLNHNLYFLQFAPKPSKKEPAGKLGEAIKRDFGSFEN FKKEFNAAAVGLFGSGWAWLSVDKDGKLKITKEGNGSNPIRAGLKPLLGFDVWEHSYYLD YQNRRADHVNALWDIIDWDVVEKRM >gi|225935330|gb|ACGA01000062.1| GENE 14 10656 - 12395 979 579 aa, chain - ## HITS:1 COG:no KEGG:BT_0656 NR:ns ## KEGG: BT_0656 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 580 580 849 73.0 0 
MKRSIFLSIILSLFLVACIPQAMAQKQSRLEKLLRYLNDNDADKWQKNRDKIDDETQTYY AEELALLDVLNGLWNEQSEQAATNYFGCYERATKAYFPNICEEEKIQLSNVQNKAELAVI SILDASKDQIPFSKTLMDSIQSSGYPGDSAILQKVRDIREMALLEGMLKTPTLNIYQTYI TEYPNGKFISQINTAENKRLYQIVKSNPTSANFKAFFDNANMQKFFTDKDTRPFLPEVRA LYDDFLFQGIDSLREKGNATAIRQIIDEYKQSPYLTSTARTHLDELEYLSEKADFELLKA AIVNSESLSMLQDFLCTHKYKEFRDQANALRTPFILQTIISTPTSVKYYNGGRLIKSAEN DSTGNTSTTYSYDDKGQLISTLSLTVKNGQPSNEIQTNRLYDPQGHCIFEVQTNPKTKTD LYRRTRRIGTDGSIESDSLKYTDGRVIISSYNKQGLLTETKEYNKNGELQAYTANKYDDK GRLISSQHQNLLFANSSDQIISQKDAYEYDKYGYLTQIVYQRILGNNQKTSGCLTCLYDK YGNQIDSNSYYEYDNTGQWICRTDREHPKEVERIQYIYK >gi|225935330|gb|ACGA01000062.1| GENE 15 12542 - 14920 2101 792 aa, chain + ## HITS:1 COG:SPy1267 KEGG:ns NR:ns ## COG: SPy1267 COG0210 # Protein_GI_number: 15675225 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Streptococcus pyogenes M1 GAS # 6 786 5 767 772 509 39.0 1e-144 MNTNYIDELNESQCAAVTYNDGPSLVIAGAGSGKTRVLTYKIAYLLEQENGYNPWNILAL TFTNKAAREMKERIARQVGMERARYLWMGTFHSIFSRILRAEATFIGFTSQFTIYDTADS KSLLRSIIKEMGLDEKTYKPGVVQARISNAKNHLVTPTGYAANKEAYEGDMAAKMPAIRD IYTRYWDRCRQAGAMDFDDLLVYTYILFRDFPEVLARYRDQFRYVLVDEYQDTNYAQHSI VLQLTKENQRVCVVGDDAQSIYSFRGADIDNILYFTKIYPNTKVFKLEQNYRSTQTIVCA ANSLIEKNERQIRKAVFSEKEKGEPIGVFQAYSDVEEGDIVANKIAELRREYRYGYADFA ILYRTNAQSRIFEEALRKRSIPYKIYGGLSFYQRKEIKDVIAYFRLVVNPNDEEAFKRII NYPARGIGDTTVGKIISAATDHGVSLWAALCEPLSYGLDINKGTHAKLQGFRELIEGFIV DQADKNAYEIGTNIIRQSGIINDVCQDTSPENLSRKENIEELVNGMNDFCALRQEEGNPN VSLTDFLSEIALLTDQDSDKADDGEKITLMTVHSAKGLEFKNVFVVGLEENLFPSGMVGD SPRALEEERRLFYVAITRAEEHCYLSFAKTRFRYGRMEFGSPSRFLRDIDVHYLQLPHEA GVSRAVDEGAGRFRREIEGGFTRSASPSRAPFGSTSSEQRERPKAQIIAPSVPRNLKKVS TVSPSNGVQATSSTSASIAGVQAGQMIEHERFGLGEVIKVEGTGDNAKATIHFKNAGDKQ LLLRFARFKVVE >gi|225935330|gb|ACGA01000062.1| GENE 16 14946 - 15635 662 229 aa, chain + ## HITS:1 COG:no KEGG:BT_0658 NR:ns ## KEGG: BT_0658 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 229 1 231 231 317 81.0 
3e-85 MKKQLTVILLSALIVSGCASGRMGNPGAIMAGASIGGSLGSSIGGLIGDNNHGWRGGYRG SAIGNIVGTIAGAAIGNALTAPRQEQIEEDAYIPEVREVRVQKYKKQPVQQPISQLKLRK IRFIDDNRSHVIDAGENSKIIFEIMNEGRNPVYNVVPVVETVGKVKHLGISPSVMVEEIL PGEGIRYTASIHAGEKLKDGEVTFRVAVADENGVICDSQEFTLPTQRGN >gi|225935330|gb|ACGA01000062.1| GENE 17 15901 - 17457 1066 518 aa, chain + ## HITS:1 COG:no KEGG:BVU_1868 NR:ns ## KEGG: BVU_1868 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 518 1 518 526 763 72.0 0 MKETFRRYPIGIQNFEDLRNNDCVYIDKTALIYHLTHTNKIYFLSRPRRFGKSLLVSTLE AYFSGKKELFNGLAMEQLEQEWTVYPVLHIDFSRTKYTTIEDLQEQLNLYLSEWERTYGK NEEETSYAARLTGLLQRIYQQTGKQAVVLIDEYDAPLLDSNSDSALQGQLRTEMRKFFSP LKAQGQYLRFLFLTGISKFSQMSIFSELNNLQNISMSDDYSAICGITEQELLSQMQPDIE RIAQANNETYEEACRHLKRQYDGYHFSKNCEDIYNPFSLFNAFSQRDYKSFWFSTGTPTF LIEMLQRMDFSLNLLEKMEVKDEDFDKATEVITDPIPVLYQSGYLTIKGYEPLFRTYTLG YPNEEVKIGFIETLIPSYLNQPTRESNFYVVSFVRDLMKGDIEQCLLRTRSFFSSIPYDL ENNQEKHYQTIFYLLFRLMGQYVDVEVKNAIGRADVVIKMSDAIYVFEFKVDGTPEEALA QIDSKQYAIPYEIDHCKVVKVGVNFDSPTRTLGQWIIK >gi|225935330|gb|ACGA01000062.1| GENE 18 17695 - 18252 378 185 aa, chain - ## HITS:1 COG:CC0205 KEGG:ns NR:ns ## COG: CC0205 COG2249 # Protein_GI_number: 16124460 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Caulobacter vibrioides # 8 176 8 179 185 125 39.0 5e-29 MNKDLRKVVILLAHPNMKGSQANKALIDAVGDIEGVAVFNLYELSPDIAFNVDEWSKIIS DASAVIYQFPFYWMSAPSLLKKWQDEVFTFLSKTPAVAGKPLTVVTTTGSEFDAYRSGGR NGFTTDELLRPYQASAIHSGMVWQTPIVVYGMGTADAGKNIAEGANTYRQRVEMLVNSSN AGNNW >gi|225935330|gb|ACGA01000062.1| GENE 19 18428 - 19567 830 379 aa, chain + ## HITS:1 COG:sll0873 KEGG:ns NR:ns ## COG: sll0873 COG0019 # Protein_GI_number: 16330194 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Synechocystis # 4 378 14 386 387 422 49.0 1e-118 MINFNQFPSPCYIMEEELLRKNLCLIKSVADRAGVEIILAFKSFAMWRSFPIFREYIDHS 
TASSVYEARLALEEFGSKAHTYSPAYTEQDFPEIMRCSSHITFNSMQQFERFYPMVVAEG SGISCGIRVNPEYSEVETELYNPCAPGTRFGITTDLLPEALPQGIEGFHCHCHCESSSYE LERTLEHLEAKFAHWFPQIKWLNLGGGHLMTRKDYDTEHLIRLLQGLKARYPHLRIILEP GSAFTWQTGVLTSEVVDIVESRGIKTAILNVSFTCHMPDCLEMPYQPAVRGAEMGNEGKY IYRLGGNSCLSGDYMGLWSFDHELQIGERIVFEDMIHYTMVKTNMFNGIHHPAIAIWTKE GKAEIYKQFSYEDYRDRMS >gi|225935330|gb|ACGA01000062.1| GENE 20 20491 - 21657 1327 388 aa, chain + ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria innocua # 43 379 45 378 380 204 34.0 3e-52 MLTQIINARILTPQGWLKDGSVLIRDNKILEVTNCDLAIIGAKLIDAKGMYIVPGGVEIH VHGGGGRDFMEGTEEAFRTAIKAHMQHGTTSIFPTLSSSTIPMIRAAAETTEKMMAEPNS PVLGLHLEGHYFNMAMAGGQIPENIKDPDPEEYIPLLEETHCIKRWDAAPELPGAMQFGK YITAKGVLASVGHTQAEFEDIQTAYEAGYTHATHFYNAMPGFHKRKEYKYEGTVESIYLI DDMTVEVVADGIHVPPTILRLVYKIKGVERTCLITDALACAASDSQVAFDPRVIIEDGVC KLADHSALAGSIATMDRLIRTMVQKAEIPLEDAVRMASETPARIMGVLDRKGTLERGKDA DIIALDRDLNVRAVWAMGELVEGTNKLF >gi|225935330|gb|ACGA01000062.1| GENE 21 21679 - 22854 1229 391 aa, chain + ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria innocua # 44 379 46 378 380 186 30.0 7e-47 MLTQIINGRILTPQGWLKDGSVLICDGKILEVTNSDLAVIGATVIDARGMTIVPGFVSMH AHGGGGHDYTEATEEAFRTATTAHLKHGATGIFPTLSSTSFERIYQAVDVCENLMKEKDS PVLGLHIEGPYLNPKMAGTQYDGFLKTPDENEYIPLLARTSCIRRWDISPELPGAHDFAK YTRSKGIMTAVTHTEAEYDEIKAAYAVGFSHAAHFYNAMPGFHKRREYKYEGTVESVYLT DGMTVEVIADGIHLPATILKLVYKLKGVENTCLVTDALAYAAYEGNEPIDPRYIIEDGVC KMADHSALAGSLATMDVLVRTMVKKANIPLEDAVRMASETPARLIGVSDRKGALAKGKDA DIVILDKELNVRCVWSMGKIVPGTDILLHKE >gi|225935330|gb|ACGA01000062.1| GENE 22 22953 - 23246 103 97 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYKIISAIIRVNLRFDTLPIIKANVLYDIKSPAVYHHQKYPGLLMYFYSESKLVSISSF 
NWFLFIVPNFIVGLPFSGQSSRLGMVRILNALSNSGS >gi|225935330|gb|ACGA01000062.1| GENE 23 23096 - 23458 424 120 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 4 114 3 112 117 105 40.0 2e-23 MKIIDLTKDSFVEKVADYQSYPDNWNFKGNKPCLVDFHAPWCVYCKALSPILDQLAKEYE GKLDIYKVDVDQEPELESAFKIRTIPNLLLCPLNGKPTMKLGTMNKNQLKELIETSLLSE >gi|225935330|gb|ACGA01000062.1| GENE 24 23695 - 24960 1279 421 aa, chain + ## HITS:1 COG:PA2522 KEGG:ns NR:ns ## COG: PA2522 COG1538 # Protein_GI_number: 15597718 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 26 384 40 389 428 71 23.0 2e-12 MNRVFLISFFLLLTGGICAQQATTGTLTLKEAEQRFLERNLSLIAERYNIDMAQAQVLQA KLFENPVISLEQNVYNRLNGKYFDFGKEGEMVVGIEQVIRLAGQRNKQVKLEKINKEIAE YQFEEVMRTLRQELNEKFVQVYFLSKSISIYEKEVNSLQELLAGMKLQQEKGNISLMEMS RLESMLFSLKKEKNERENELLTLRGELNVLLNLPGDTMVELSLDEEVLKQLDLSQLSFAD LKAMVNERPDLKIARSTVSASRANLKLQKSMAFPEFSVNGSYDRAGNFINNYFAIGVSLS VPIFNRNQGNVKAARFSIQQAGAEQENAANRADMELYTAYASLDKAVQLYQSTNMELERN FEKLIAGVNENFKKRNISLLEFIDYYDSYKETCIQLHEIKKDVFLAMENLNTTIGQTILN Y >gi|225935330|gb|ACGA01000062.1| GENE 25 24983 - 26077 1059 364 aa, chain + ## HITS:1 COG:PA2521 KEGG:ns NR:ns ## COG: PA2521 COG0845 # Protein_GI_number: 15597717 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 43 287 158 403 484 117 27.0 2e-26 MNWNKFLPCILLTTVLGACSGKGEQTNVEPAALCLTDSLLRIVSVDTVHVQEVIDELTLN GRVTFNENQVAHVYPMFGGTVTELKAEIGDFVRKGDVLAVIRSGEVADYEKQLKEAEQQL LLARRNMDATQDMYASGMASDKDVLQAKQELISAEAEERRIKDVFSIYNFSGNAYYQLKS PVSGFIVEKQISRDMQLRPDQSEELFTISGLSDVWVMADVYESDISKVSEDASVRISTLA 
YPDKMFAGTIDKVYHLLNSESKTMNVRIKLKNEEYLLKPGMFTNVSVKCKADGTSMPRID AHALVFEGGKNYVVVVEPDQRLQVKEVDVYKQLSKECYIRSGLSEGDRVLNNNVLLVYNA LNAD >gi|225935330|gb|ACGA01000062.1| GENE 26 26092 - 29217 3174 1041 aa, chain + ## HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 5 1034 2 1025 1038 732 37.0 0 MHKFIDNIVAFSLKNKFFIFFCTAIAVIAGAVSFKHTPIDAFPDVTNTKVTIITQWPGRS AEEVEKFITIPVEIAMNPVQKKTDIRSTTLFGLSVINVMFEDRVDDFTARQQVYNLLNDA DLPDGVTPEVQPLYGPTGEIFRYTLRSDKRSVRELKTIQDWVIERNLRSVSGVADIVSFG GEVKTFEVSVNPHQLINYGITSLELYDAIAKSNINVGGDVITKSSQAYVVRGIGLINDLD ELRNIVVKNINGTPILVKNLADVHESCLPRLGQVGRMDENDVVQGIVVMRKGENPGEVIA NLKDKIEDLNQNVLPKDVKIVAFYDREDLVNLAVKTVTHNLIEGILLVTFIVLIFMADWR TTVIVAVVIPLALLFAFICLRVMGMSANLLSMGAIDFGIIIDGAVVMVEGVFVALDKKAR EVGMPAFNVMSKMGLIRNTAKDKAKAVFFSKLIIITALIPIFSFQKVEGKMFSPLAYTLG FALLGALIFTLTLVPVMSSMLLKKNVREKNNRFVHFINAKCSALFDLFYAHRKLTIGMAT VIAGVGLWLFSFLGTEFLPQLNEGSIYIRATLPQSISLDESVTLANKMRKKLLTFPEVRQ VLSQTGRPNDGTDATGFYNIEFHVDIYPEKEWESKLTKLELIDKMQEDLSIYPGIDFNFS QPITDNVEEAASGVKGSIAVKVFGKDLYESEKYAVQIDKILSTVQGIEDLGVIRNIGQPE LRIELNERQLARYGVAKEDVQSIIEMAIGGKSASLLYEDERKFNIMVRYSEQFRQNEEEI GKILVPAMDGTMVPIKELADITTITGPLLIFRDNHARFCAVKFSVRGRDMGTAVAEAQKK VNASVHLPAGYSLKWTGDFENQQRATKRLAQVVPISIAIIFIILFILFSNARDAGLVLLN VPFAAVGGIVALLITQFNFSISAGIGFIALFGICIQNGVIMISDIKANLKLGSPLEKATK EGVRSRIRPVIMTAAMAAIGLMPAAMSHGIGSESQRPLAIVIIGGLIGATFFALFVFPLI VEVVYERMLYDKNGKLLQRRI >gi|225935330|gb|ACGA01000062.1| GENE 27 29305 - 30636 961 443 aa, chain - ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 157 431 167 450 466 155 32.0 2e-37 MKIGSKIALFYTLISVLTTVIIIAVFYIFSTQYINKLYASYLREKAYLTAQKHWEKDEVD EQSYQIIQRKYDELLPEAHEILLNMDSLSEVRDTLNKYLTQHQQALLLAGQDSIPFSFKY 
KDQLGAALYYPDNEGNFIVLVMSRNAYGTEIKEHLLLLSIFLILASSVLIFFIGKIYSGR ILVPLQHILKELKRIRANSLNRRLKTTGNNDELEDMIKTLNSMLDRLDSAFKAEKSFVSH ASHELNNPITAIQGECEISLLKERSTGEYIESLQRISSESKRLSSLIRHLLFLSRQEEEL LKNNIEEIILADILKELTASDERIHLHLEETDPQMTVKANPYLLKIALKNIIDNACKYSD KEVNATLYREQQQVILDIEDRGIGIPQEEIEHIFQSFYRGSNTRDYAGQGIGLSLTLKII SAYHAKLDISSEIEKGTKVRVIF >gi|225935330|gb|ACGA01000062.1| GENE 28 30660 - 31334 669 224 aa, chain - ## HITS:1 COG:alr1194 KEGG:ns NR:ns ## COG: alr1194 COG0745 # Protein_GI_number: 17228689 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 4 222 5 222 228 195 46.0 6e-50 MAKILLVEDEINIASFIERGLKEFGHSVTVCHDGDTGWKILQDEPFDLLILDIIMPKING LELCRLYRQMFGYQIPVIMLTALGTTEDIVKGLDAGADDYLVKPFSFQELEARIKALLRR NKEVPSNLLTCDNLVLDCNTRKAKRGDIDIDLTVKEYRLLEYFMTHQGVALSRITLLKDV WDKNFDTNTNIVDVYVNYLRVKIDRDFDKKLIHTVVGLGYIMNT >gi|225935330|gb|ACGA01000062.1| GENE 29 31453 - 33462 1814 669 aa, chain + ## HITS:1 COG:no KEGG:BT_0683 NR:ns ## KEGG: BT_0683 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 667 1 667 671 1266 88.0 0 MKINYVIGAFLCVLGCYGCSSPKTEVKSPDGHIKMALTVDDNGKPLYNILVDDSLLIENS TLGFTEKNGINLGGGFQIKNTTFDSKDETWTQLWGENKTNRNHYNEMAVNLVNKDQVELT LRFRVFDDGVGFRYEYNVPAADSLLITDELTTFRFCQDGTSWSIPASAETYELLYAQRPI SEVETANTPFTFKTADGVYGSIHEAALYDFSEMTLKQAGNYTLKAELAPWPDGVKVRKGN HFTTSWRTIQIVPDAVSLINSAMILNLNEPSKIETTDWIRPMKYVGVWWGMHLGVETWKM DERHGATTVNAKKYIDFAATNQIEGVLFEGWNEGWESWGGMQNFDFTKPYADFDIDEVVR YAKEKGVEVIGHHETGGNIPNYERQMDHAMQWYTDHGIHVLKTGYAGAFPNGYLHHSQYG VNHYQKVVETAARHKMTLDAHEPIKDTGIRRTWPNMMTREGVRGMEWNAWSEGNPPSHHV MLPFTRMLSGPLDYTPGTFDILFLNTKDSPRRQKWNDQDKGNSRVNTTLAKQLANWVILY SPLQMASDMIENYEGHPAFQFFRDFDPDCDESKALAGEPGEFVAIVRKAKGNYFLGAATN EKPRTLEIKLDFLEPGKQYKAVIYADGENADWKSNPTDYRITEQTVTSENTLNIRMAAGG GQAISFMAL >gi|225935330|gb|ACGA01000062.1| GENE 30 33703 - 35367 1346 554 aa, chain - ## HITS:1 
COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 16 548 4 499 600 230 29.0 7e-60 MEQEHQFIDYIEQSIIKNWDKDALTDYKGITLQYKDVARKIAKFHIVLESAGIQPGDKIA VCGRNSAHWAVTFLATITYGAVIVPILHEFKADNIHNIVNHSEAKLLFVGDQAWENLNED AMPLLEGIASLTDFSALVSRNEKLTYAFEHRNAIYGQQYPKNFRPEHICYRKDRPEELAI INYTSGTTGYSKGVMLPYRSLWSNVAYCFEMLPVKPGDHIVSMLPMGHVFGMVYDFLYGF SAGAHIYFLTRMPSPKIISQSFSEIKPKVISCVPLIVEKIIKKDILPKVDSKIGKLLLKV PIVNDKIKSLARQAAMEIFGGNFDEIIIGGAPFNAEVEAFLKKIGFPYTIAYGMTECGPI ICSSRWETLKLASCGKATSRMEVRIDSPDPKTHAGEIVCRGMNMMLGYYKNPEATAQIID ANGWLHTGDLGTLDEEGYVTVRGRSKNLLLTSSGQNIYPEEIESKLNNMPYVSESLIVLQ HEKLVALIYPDFDDAFAHGLQQTDIQKVMEQNRIELNQQLPNYSQISKIKIHFEEFEKTA KKSIKRFMYQEAKG >gi|225935330|gb|ACGA01000062.1| GENE 31 35649 - 37115 1202 488 aa, chain - ## HITS:1 COG:BH4038 KEGG:ns NR:ns ## COG: BH4038 COG3263 # Protein_GI_number: 15616600 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Bacillus halodurans # 1 483 1 480 490 298 38.0 2e-80 MIFTAENTLLIGSILLFVSIVVGKTGYRFGVPTLLLFLVVGMLFGSDGLGLQFHDAKDAQ FIGMVALSIILFSGGMDTKFREIKPILGPGIVLSTVGVLLTALFTGLFIWWISGMSWSNI YLPITTSLLLASTMSSTDSASVFAILRSQKMNLKHNLRPMLELESGSNDPMAYMLTIVLI QFIQSSGMGVGAIVGSFVIQFIVGAAAGYVLGKLAIRMLNKLNIDNQALYPILLLAFVFF TFSITDLLKGNGYLAVYIAGIMVGNNKIMHRKDIYTFMDGLTWLSQIIMFLCLGLLVNPH EMLEVAAVALLIGVFMIIIGRPLSVFLCLLPFRKITMKSRIFVSWVGLRGAVPIIFATYP VVAGVEGSNLIFNIVFFITIVSLVVQGTTISFVARILNLSKPLEKTGNDFGVELPEEIDS DLSDMTITKSMLEEADTLKDMNLPKGTLVMIVKRGDEFLIPNGTLKLHEGDKLLLISEKS KEEDTDSE >gi|225935330|gb|ACGA01000062.1| GENE 32 37282 - 38466 831 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 1 393 5 421 447 324 41 1e-87 MDSNNLTPLRKGVVGVQFLFVAFGATVLVPLLVGLDPSTALFTAGIGTLIFHAVTRGKVP 
VFLGSSFAFIAPIIKATELYGLPGTLSGMVGVALVYFVMSALVKWQGVRVIERLFPPVVI GPVIILIGLSLAGTGVNMAKENWVLALLSLVTAVVVSMKAKGLLKLIPIFCGIVVGYLAA WLFYDLDLSGVRDAAWIGLPQFVFPKFSWEPVLFMIPVAIAPVIEHIGDIYVVNTVTGKD FVKDPGLHRTLLGDGLACFCAGLLGGPPVTTYSEVTGAMSLTKITNPQVIRIAAISAILF SVIGKISALLRSIPSAVLGGIMLLLFGTIACAGIGNLVNNCIDLSRTRNIIIVSLTLTVG IGGAAFNWGDFSLSGIGLAALVGVVLNLILPRED >gi|225935330|gb|ACGA01000062.1| GENE 33 38681 - 40312 1728 543 aa, chain - ## HITS:1 COG:CAC2750 KEGG:ns NR:ns ## COG: CAC2750 COG1151 # Protein_GI_number: 15896007 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 541 1 527 530 738 65.0 0 MSMFCYQCQETAMGTGCTLKGVCGKTSEVANLQDLLLFVVRGIAVYNEHLRKDGHPSEQA DKFIYDALFITITNANFDKVAITEKIKEGLKLKKELGNKIKIENAPDECLWDGNEDEFEE KSKTVGVLRTPNEDIRSLKELVHYGLKGMAAYVEHAHNLGYESPEIFAFMQHALSELTRN DITVEELVQLTLETGKYGVSAMAQLDKANTSSYGNPEISQVSLGVRNNPGILISGHDLKD LEELLEQTEGTGVDVYTHSEMLPAHYYPQLKKYKHLAGNYGNAWWKQKEEFESFNGPILF TSNCIVPPRANASYKDRIYITGACGLDGAHYIPERKDGKPKDFSALIAHAKQCQPPVAIE NGTIIGGFAHAQVTALADKVVDAVKSGAIRKFFVMAGCDGRMKSREYYTEFAQKLPGDTV ILTAGCAKYRYNKLALGDINGIPRVLDAGQCNDSYSLAVIALKLKEIFGLDDVNQLPIVY NIAWYEQKAVIVLLALLALGVKHIHLGPTLPAFLSPNVRNVLIEQFGIGGISTVDEDIVK FLS >gi|225935330|gb|ACGA01000062.1| GENE 34 40466 - 41134 659 222 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 10 222 14 226 229 119 30.0 4e-27 MIPALVNNPLFRGITPERLFADLEEISFHTRSYKKGEILAQQGAVCNRLVILTKGSVRGE MIDYSGRLIKVEDIAAPRAIAPLFLFGEENRYPVEVTANEPTEVIELPKSSVLSLFRKNE QFLENYMNLSANYARTLSDKLFFMSFKTIRQKLASYLLRLYKQQQQTHITLDRSQQELSD YFGVSRPSLARELAHMQEDGLLIADRKHITILQKEQLVRLIQ >gi|225935330|gb|ACGA01000062.1| GENE 35 41145 - 42458 709 437 aa, chain - ## HITS:1 COG:CC1742 KEGG:ns NR:ns ## COG: CC1742 COG5000 # 
Protein_GI_number: 16125986 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Caulobacter vibrioides # 82 436 326 698 716 120 26.0 8e-27 MSSQIQQLGIRYWFRVLLTVLFCISTVWFGIHQSYGWLGISLCLLVLSIIWQIRLYQLHT KQVLFMINALENNDNTFHFPEENGTPESQKINRALNRVGHILYNVKAETAQQEKYYELIL DFVSTGLLVLNDNGAVYQKNKEALRLLGLNIFTHIHQLSKVDTTLMEKIENCRPGDKLQV IFHNERGTVNLSIRVSEINVRKEHLRILALNDINTELDEKEIDSWIRLTRVLTHEIMNSV TPITSLCDTLLSMSEGKDEEISHGLQTISTTGKGLLSFVESYRQFTRIPAPEPSLFYVKA FIERMIELARHQNPCDTICFHTEISPADLILYADENLIAQVVINLLKNAIQAIGSQPDGR IELRAYCNDMEEIWIEIKNNGPEIPSEIAEHIFIPFFTTKENGSGIGLSISRQIMRLSGG SLTLLREKETTFILKFK >gi|225935330|gb|ACGA01000062.1| GENE 36 42455 - 43795 808 446 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 7 443 8 441 441 308 40.0 2e-83 MSKSGTIIIVDDNKGVLTAIQILLKSYFSKVVALSSPVTLTSVLREEMPEVVLLDMNFTS GINTGNEGLFWLHEIKKVRPDLPVVLFTAYADIELAIRGIKEGATDFIVKPWDNQKLVET LQTAATSTHNDRKTASKEKPVHSPMYWGESKVMQQLRALIKKVAVTDANLLITGENGTGK EMLAREIHALSNRKYKEMIAVDMGTITESLFESELFGHIKGSFTDAHTDRTGKFEAAEKS TLFLDEIGNLPYHLQAKLLTAIQRRSIVKVGSNTPIPIDIRLICATNRNLQEMVAKGEFR EDLLYRINTIHVEIPPLRERKEDIIPLAERFMVHFCKQYDKSLMKFTPEAKDKLTAHPWY GNIRELEHVIEKAVIINDSPLIPAELFQLSVPRIAIQEQSISTLEEMEVQMIRKALDACA GNLSAVAAQLGITRQTLYNKMKKFGL >gi|225935330|gb|ACGA01000062.1| GENE 37 43893 - 44696 544 267 aa, chain - ## HITS:1 COG:no KEGG:BT_0691 NR:ns ## KEGG: BT_0691 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 267 1 267 267 466 83.0 1e-130 MSWLNTNYILRAGVLSVLLLLPGLMSATCKRDTIINNYKDDSLRFIIKDDSIITFRNGES SFTIIKADSVLPATPKHTKHTRYDNRVHRFRRNWERIIPTHSKIQYAGNMGLLSFGTGWD YGKRNQWETDVLLGFIPKYSSKKAKVTMTLKQNYMPWSINIGKGFSTEPLACGLYVNTVF 
GNQFWVNEPERYPKGYYGFSSKVRFHVFMGQRLTYDIDPQRRFMAKSVTFFYEISTCDLY VISAVNNSYLRPRDYLSLSFGLKFQWL >gi|225935330|gb|ACGA01000062.1| GENE 38 44677 - 45483 663 268 aa, chain - ## HITS:1 COG:no KEGG:BT_0692 NR:ns ## KEGG: BT_0692 # Name: not_defined # Def: calcineurin superfamily phosphohydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 268 1 267 267 434 79.0 1e-120 MMKRKNLYALFSCLFLSGCGMIDYHPYDVRISGETEVNAHNMERIEANCQGKTTIRFVTM GDSQRWYDETEDFVKAINQRNDIDFVIHGGDMSDFGLTKEFLWQRDIMNGLHVPYVALIG NHDCLGTGAETYKAVFGPTNFSFIAGDVKFVCLNTNALEYDYSEPVPDFTFMENEITNRR DEFEKTVICMHARPYTDVFNDNVAKVFQHYVKQYAGIQFCTAAHTHHHQDDMIFDDGIHY VTSDCMDYRTYLVFTITPEKYEYELVKY >gi|225935330|gb|ACGA01000062.1| GENE 39 45749 - 48058 1620 769 aa, chain + ## HITS:1 COG:no KEGG:BT_0693 NR:ns ## KEGG: BT_0693 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 769 25 793 793 1310 80.0 0 MKQIYYVIRTLLRGHGSNVIKILSLALGLTMSILLFARVAFEQSFDKCFKDYDNLYQIFS VFSANGQTFEPQEMNCGPVAGAIQENFPKEVEAATSYCVWMSAPLYYGSVRFEANKVTAD SLFFQTMGIDVLSGVPEKELMQKDVVFLSDRLARKMFDQENPIGKVISYNKEIELTVKGT YADIPENATVRPDAVISLPTVWSRGWGNYSWRGGDSWIAFIRFRPGADKSVVNARLNDLI KKYRPAEDQKVVGYTAFVKPIRDVYREEPDVKRMRNIMSILGIIILFIATLNYVLISISS LSYRAKAIGVHKCSGAGSGKILGMFLLETAIIILFALLLMGLILLNFRDFIEDTTAVELG ALFSLDRLWVPLLTVAILFLIGGALPGRIFARIPVSQVFRRYTEGKKGWKRPLLFVQFAG VAFICGLMYVVMLQYYYVLNKDLGYNPKRVVVANTGFGNKENQDYALTFFRDLPYVESVS SADSHPVYSYSGTMIQDESGQSLFSSRFCEMMEDYPKMMGMVMKEGRMPRNENEVAINET YGEWMHWGTELLNRTVYNSGYVCKVVGVIKDFRIGNFTNPQAPFILMSTKNFGNCVHVRL KEPFAENLQKLNKVSADAFPDKTVDFRSMEQMIKESYNSIRVFSNATILAAVTMFFVMMM GLIGYTNDEVRRRSKEIAIRKVNGAEASGILEMLVKDVLYVAVVAVVIGVAASWYVNDMW MDMFAEHVPVSWAAYVLIAIVNLLVIVACVLWKSWRIANENPVNSLKSE >gi|225935330|gb|ACGA01000062.1| GENE 40 48082 - 48756 323 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 218 245 129 34 9e-29 MIEINNISKVFRTSEVETVALNHVNLEVKEGEFVAIMGPSGCGKSTLLNILGLLDNPTEG 
SYRLMGEEVGGLKEKERTHVRKGKLGFVFQSFNLIDELNVYENVELPLTYLGFKASERRR MVEDILKRMNISHRAKHFPQQLSGGQQQRVAIARAVVTNPKLILADEPTGNLDSKNGAEV MNLLTELNKEGTTIIMVTHSQHDASFAHRTVHLFDGSVVASVIA >gi|225935330|gb|ACGA01000062.1| GENE 41 48775 - 51093 1649 772 aa, chain + ## HITS:1 COG:no KEGG:BT_0695 NR:ns ## KEGG: BT_0695 # Name: not_defined # Def: ABC transporter, permease protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 772 1 771 771 1122 72.0 0 MRQIYYSIRTLLHERTLNIVRVLSLSLGLTVGILLFSQIAFEMSYEKCYPEAENLIMARI AAQNMTTGEVQGDDGMNYDYTVCDPVAFTLAQDMPEEIESATCVLPSQFYHIYKEDKLLS KANYMMVDTCFFETMGIPVLKGNAKDLIISNSIFVSEHFARETFGSADPIGKTLSADKQL ELTIRGVYKDIPENSMLSRDFVITIHRDGRYASGNGWNGNDIFYAFARLRHASDIDKVNS KIQQVMAKYSPLQWDDWKLEYSVVPLSKRHLDSADTQKRLMIFGFLGFSVFFVAIMNYML ISIATISRRAKSVGVHKCSGASSGNIFGMFLAETGVLVLLSVLFSFLLIINMGELIEDLL GTSLASLFAWDIIWIPLLTVVVLFLLAGGIPGKLFSRIPVTQVFRHYTDGKRGWKRSLLF VQFTGTSFVLGLLLVTLLQYSHLMNGDMGIKIPGLVEAETWMSGETVEHVKDELLRQPMV DGVSASTHSVLGEYWTRGLIGNDGKRIATLNFNMCDYDYPNVMGIRITEGTTIKKKGDLL VNKELVRLMKWTDGAVGKSVSGVEGTIVGVFDDIRNRSFYSSQSPIVLIADKESANHTIN VRLKEPYDENLKRLNECVAKTFPNVALNFNSVDSMIREGYKSVYRFRNAVWISSCFILLI VITGLIGYVNGETQRRSKEIAIRKVNGAEASGILKLLIYDILNISVISVLIGTAASYFTG QIWLEQFSERIDMDLLLFIGIALLVLLVIVACVVVKAWHVANENPVKSIKAE >gi|225935330|gb|ACGA01000062.1| GENE 42 51133 - 52497 1181 454 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 4 430 2 429 448 338 43.0 1e-92 MSKNDPHILGKESIGKLLLQYSIPAIIGMTITSIYNIIDSIFIGHGVGPMAIAGLAISFP LMNLVVAFCTLVSAGGSTLASIRLGQKDMKGATEILSHTLMLCITNSFFFGILSFIFLDD ILVFFGASNETLPYARSFMQVILLGTPITYTMIGLNNVMRATGYPKKAMLTSMVTVVANI ILAPIFIFHFEWGMRGAATATVISQLIGMVWVVSHFVKKDSTVHFEGNIWKMKRRIVESI FAIGMSPFLMNVCACAIVIIINNSLQNYGGDMAIGAYGIINRLLTLYVMIVLGLTMGMQP IVGYNFGAQKIDRVKQTLRLGIISGVVITSSGFIICEFFPHAVSALFTDSDELIDLAVDG IRLAVLMFPFVGAQIVIGNFFQSIGKAKISIFLSLTRQLLYLLPCLLLFPNWWGLEGIWI 
SMPVSDALAFITAVVSLMIYIKKVSKQHPETVAE >gi|225935330|gb|ACGA01000062.1| GENE 43 52648 - 53295 680 215 aa, chain + ## HITS:1 COG:all1058_2 KEGG:ns NR:ns ## COG: all1058_2 COG0637 # Protein_GI_number: 17228553 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. PCC 7120 # 10 212 10 215 223 81 29.0 1e-15 MNTTKTIAALFDFDGVIMDTETQYTVFWDEQGRKYLNEEDFGRRIKGQTLLQIYEKYFSD QPEAQLEISAELYVYEQKMSYEYIPGVEAFIADLRRNGAKIAVVTSSNEEKMANVYNAHP EFKGMVDRILTGEMFARSKPAPDCFLLGMEIFEATPENTYVFEDSFHGLQAGMTSGATVI GLATTNTREAITGKAHYIIDDFTGMTYDKLLTLHR >gi|225935330|gb|ACGA01000062.1| GENE 44 53541 - 54362 922 273 aa, chain + ## HITS:1 COG:Cgl0115 KEGG:ns NR:ns ## COG: Cgl0115 COG0413 # Protein_GI_number: 19551365 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 8 273 5 269 269 250 50.0 2e-66 MAGYISDDSRKVTTHRLVEMKQRGEKISMLTAYDYTMAQIVDGAGMDVILVGDSASNVMA GNVTTLPITLDQMIYHAKSVVRGVKRAMVVVDMPFGSYQGNEMEGLASAIRIMKESHADA LKLEGGEEIIDTVRRIVCAGIPVMGHLGLMPQSINKYGTYTVRAKDDAEAEKLIRDAHLL EEAGCFAIVLEKIPATLAERVASELTIPIIGIGAGGHVDGQVLVIQDMLGMNNGFRPRFL RRYADLYTVMTDAISRYVSDVKNCYFPNEKEQY >gi|225935330|gb|ACGA01000062.1| GENE 45 54372 - 55529 860 385 aa, chain + ## HITS:1 COG:STM2280 KEGG:ns NR:ns ## COG: STM2280 COG0477 # Protein_GI_number: 16765607 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 186 1 186 396 70 26.0 7e-12 MAVKLWTVHFMRICVANLLLFISLYVLFPVLSVEMADRLGVPAAQTGVIFLFFTLGMFLI GPFHAYLVDAYKRKYVCMFAAALMVVATIGYAFVTNLTELILLGTVQGLAFGIGTTAGIT LAIDITNSTLRSAGNVSFSWMARMGMIAGIILGVWLYQSQSFQTLLTVSVITGAVGILML SGVYVPFRAPIVTKLYSFDRFLLLRGWVPAINLILITFIPGLLVPMVHPFLNDFVLGNTG IPVPFFVGTALGYIVSLFFARLFFLKEKTLRLVIIGLGLEIVAMSLLNTDISIGISSVLL 
GLGLGFILPEFLVMFVKLSHHCQRGTANTTHLLATEIGISLGIATACYMELDTDKMLHTG QIVASIALLFFALITYPYYIKKKVR >gi|225935330|gb|ACGA01000062.1| GENE 46 55544 - 57763 2268 739 aa, chain - ## HITS:1 COG:lin1558 KEGG:ns NR:ns ## COG: lin1558 COG0317 # Protein_GI_number: 16800626 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 59 738 51 735 738 400 33.0 1e-111 MDDFFTSEEKKELFSLYRHLLQSAGDSISWKDCLKLKRHLIKAAQCNGLQRNNFGMNPVI RDLQTAVIVAEEIGMKGSCLIGIMLHEIVKGHVLSIDEVNAEYGDDVASIIKGLVKTNEL YAKSPAIESENFRNLLLSFAEDMRVILIMIADRVNVMRQIKDTGNEEDRLKVANEAAYLY APLAHKLGLYKLKSELEDLSLKYTQKETYYFIKDKLNETKASRDKYIAAFIEPIQRKVSE AGLKFDIKGRTKSIHSIWNKIQKQKTSFEGIYDLFAIRIILDSEPDPAKEKQECWQVYSI VTDMYQPNPKRLRDWLSIPKSNGYESLHITVMGPEGKWVEVQIRTRRMDEIAERGLAAHW RYKGIKGETGLDEWLTSVREALENADNDSLKVMDQFKMDLYEDEVFVFTPKGDLFKLAKG ATVLDFAFHIHSKLGCKCIGAKVNGKNVQLKQKLNSGDQVEIMTSNTQTPKQDWLNIVTT SKARTKIRQALKEMVARQHAFAKETLERKFKNRKLEYDEATMMRLIKRLGFKNVTEFYQR IADGGLDVNEILDKYVEQQKRDSDTHDEIVYRSAEGYNLQAAQQEETTSKEDVLVIDQNL KGLEFKLAKCCNPIYGDDVFGFVTVSGGIKIHRADCPNANQMRERFGYRIVKARWAGKSV GTQYPITLRVVGHDDIGIVTNITSIISKENGITLRSIGIDSNDGLFSGTLTVMVGDTGRL EALIKKLRTIKGVKQVSRN >gi|225935330|gb|ACGA01000062.1| GENE 47 57870 - 59387 1701 505 aa, chain + ## HITS:1 COG:no KEGG:BT_0701 NR:ns ## KEGG: BT_0701 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 505 1 505 505 947 91.0 0 MITTQDKELLAKKGITEEQIVEQLACFQTGFPFLKLDAAASIEKGILAPDAEEQKAYLAA WDAYTNTDKTVVKFVPASGAASRMFKNLFEFLSADYDRPTTKFEQTFFDGIKNFAFYDDL NVACQRIDGKDIPGLIEDGNYKAVVSALLETAGLNYGALPKGLLKFHKYPEGSRTPMEEH LAEGALYAAGKSGKVNVHFTVSTEHRELFKKLVAEKVDEFSKRYGVDYYITFSEQKPSTD TIAADMDNQPFRDNGKLLFRPGGHGALIENLNDLDADIIFIKNIDNVVPDRLKADTVTYK KLIAGVLVTLQEQVFEYLTLLDSGKYTHDQMMEMLQFLQKKLFCKNPETKDLEDSVLAIY LKNKFNRPMRVCGMVKNVGEPGGGPFLAYNSDGTISLQILESSQIDMDDPEKKEMFEKGT HFNPVDLVCAVRDYKGHKFDLVKYVDKATGFISYKSKNGKDLKALELPGLWNGAMSDWNT VFVEVPLSTFNPVKTVNDLLREQHQ 
>gi|225935330|gb|ACGA01000062.1| GENE 48 59499 - 60122 449 207 aa, chain + ## HITS:1 COG:CAC0235 KEGG:ns NR:ns ## COG: CAC0235 COG4845 # Protein_GI_number: 15893527 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 6 201 2 196 212 127 31.0 1e-29 MNQIEKIIDIATWNRREHYEHFSAFDDPFFGVTVQVDCTRAYQEAKAKGISFSLLVLHRI TTAAAAVEEFRYRIEGDKVVCYDSLLPEATVGRDDHTFSFAAFEYDPDELVFIRRAKAEM ERLQATTGLNKGGTFHPNAIHYSAVPWLSFTDMKHPTNMRSGDSVPKISTGKYFRDGERL MMPVSVTCHHGLMDGYHVAQFIEKLHL >gi|225935330|gb|ACGA01000062.1| GENE 49 60156 - 61550 961 464 aa, chain + ## HITS:1 COG:no KEGG:BT_0703 NR:ns ## KEGG: BT_0703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 88 464 33 407 407 662 83.0 0 MRVKIPRWEIAVICSALLFPGTSLCAQEVGQEEKEVKEMRQDVEQLQQDVRQLREEVRRL QEEIHGFRHNSFPQCGADTVAPYVPHHFIHRLGIEARPQYVFPTNPFLQGENERWKPIQS SFAAHLKYSFKFRPNTCADRIYGGAYQGFGLAFTTFGDKKQLGDPMTFYVFQGARIARFH PRLSLNYEWNFGISAGWKPYDNDYNSYNGAVGSRVNAYLNAGIYLNWSLSRYFDFIIGGD FTHFSNGNTKFPNAGVNTTGAKIGLVYNFNREEADLTKSLVHPYVPRFPRHVSYDLVLFG SWRRKGVYVGEKQIASPGSYPVAGFNFAPMYNLNYKLRFGVSLDGVYDGSANVYTEDALV EYDAGSGSSRRKFLVPGIQNQLAFGLSGRAEYVMPFFTIGVGLGTNVLGRGDLRGLYQVF ALKINVTRSSFLHIGYNLQDFQTPNYLMLGLGFRFNNKYPKVRH >gi|225935330|gb|ACGA01000062.1| GENE 50 61575 - 62138 434 187 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 14 186 21 194 199 67 30.0 1e-11 MENIIRKIRQIYPVSDEALQALQANMELRYYPKDTRIVQAGVTDRLVYFIEEGIARSVFH HNGEDTTTWFSQEGDVTFGMDSLYYQQPSVESVETLSDCKIYVIHIDKLNALYETYIDIA NWGRILHQNVNKELSHMFVERLQLSPKERYEQFNRRYPGLINRVKLKYVAAFLGISIYTL SRVRAKK >gi|225935330|gb|ACGA01000062.1| GENE 51 62236 - 63387 864 383 aa, chain + ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # 
Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 16 375 12 332 337 115 31.0 1e-25 MSNISASAFSDTKAHYDLLDGLRGVAALMVIWYHVFEGFAFASNSAIETLNHGYLAVDFF FILSGFVIGYAYDDRWGKSLTMKDFFKRRLIRLHPMVIMGAVLGAITFCIQGSVQWDGTH VAISMIMLSLLCTIFFIPAMPGVGYEVRGNGEMFPLNGPCWSLFFEYIGNILYALFIRRL SNKALTVFVVLLGAALAAFAVFNVSTYGNIGVGWTLDGVNFLGGSLRMLFPFSLGMLMSR NFKPMKVKGAFWICTIALIGLFSVPYLEGLEPLCMNGVYEAFCVIVAFPIILWIGASGTT TDKQSTKICKFLGDISYPVYVIHYPLMYLFYAWLIENQLYTLGETWHVALCVFILSIILA YLCLKLYDEPIRKYLAKRFLSKK >gi|225935330|gb|ACGA01000062.1| GENE 52 63540 - 64010 243 156 aa, chain - ## HITS:1 COG:alr3535 KEGG:ns NR:ns ## COG: alr3535 COG0454 # Protein_GI_number: 17231027 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Nostoc sp. PCC 7120 # 3 155 2 153 156 90 34.0 9e-19 MNKIRQVNLSDIPALQELYQHTVLTVNRKDYTAEEVADWASCGDDTSHWNELFEEQHYVV AKNEEGVIVGFGSVNDDGYMHTLFVHKDFQHQGVATSLYKHLEAYARERGAKRLTSEVSI TAKPFFEKQGFQVDEEQKRKANQMCLTNYKMSKQLC >gi|225935330|gb|ACGA01000062.1| GENE 53 64032 - 65198 1367 388 aa, chain - ## HITS:1 COG:alr1299 KEGG:ns NR:ns ## COG: alr1299 COG0027 # Protein_GI_number: 17228794 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Nostoc sp. 
PCC 7120 # 2 387 9 388 391 428 58.0 1e-119 MKKILLLGSGELGKEFVISAQRKGQHIIACDSYAGAPAMQVADEFEVFDMLNGEELERVV KKHRPDIIVPEIEAIRTERLYDFEKEGIQVVPSARAVNFTMNRKAIRDLAAQELGLKTAK YFYAKTLEELKEAADKIGFPCVVKPLMSSSGKGQSLVKSADELEHAWEYGCSGSRGDIRE LIIEEFIKFDSEITLLTVTQKNGPTLFCPPIGHVQKGGDYRESFQPAHIDPAHLKEAEEM AEKVTRALTGAGLWGVEFFLSHDNGVYFSELSPRPHDTGMVTLAGTQNLNEFELHLRAVL GLPIPGIKQERIGASAVILSPIASQERPQYRGLEEVTKEEDTYLRIFGKPFTRVNRRMGV VLCYAPLDSDLDALRDKAKRIAEKVEVH >gi|225935330|gb|ACGA01000062.1| GENE 54 65875 - 67395 1860 506 aa, chain + ## HITS:1 COG:SPAC222.12c KEGG:ns NR:ns ## COG: SPAC222.12c COG0055 # Protein_GI_number: 19114063 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Schizosaccharomyces pombe # 6 504 55 524 525 607 64.0 1e-173 MSQIIGHISQVIGPVVDVYFEGTDAELVLPSIHDALEIKRPNGKILVVEVQQHIGENTVR TVAMDSTDGLQRGMKVYPTGGPITMPIGEQIKGRLMNVVGDSIDGMKGLDRKGAYSIHRE PPKFEDLTTVQEVLFTGIKVIDLLEPYAKGGKIGLFGGAGVGKTVLIQELINNIAKKHNG FSVFAGVGERTREGNDLLREMIESGVIRYGEAFKEGMEKGHWDLSKVDYNELEKSQVSLI FGQMNEPPGARASVALSGLTVAESFRDAGKEGEKRDILFFIDNIFRFTQAGSEVSALLGR MPSAVGYQPTLATEMGAMQERITSTRKGSITSVQAVYVPADDLTDPAPATTFSHLDATTV LDRKITELGIYPAVDPLASTSRILDPHIVGQEHYDIAQQVKQILQRNKELQDIISILGME ELSEEDKMVVNRARRVQRFLSQPFAVAEQFTGVPGVMVGIEDTIKGFKMILEGEVDYLPE QAFLNVGTIEEAIEKGKKLLEQAAKK >gi|225935330|gb|ACGA01000062.1| GENE 55 67405 - 67653 244 82 aa, chain + ## HITS:1 COG:HI0478 KEGG:ns NR:ns ## COG: HI0478 COG0355 # Protein_GI_number: 16272425 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Haemophilus influenzae # 2 78 1 77 142 68 38.0 3e-12 MMKELHLSIVSPEKSIFDGNVKIVTLPGMIGSFSILPGHAPIVSSLKAGTLSYTTMDGEE HTMDIQGGFVEMSDGTVSACVS >gi|225935330|gb|ACGA01000062.1| GENE 56 67737 - 68171 328 144 aa, chain + ## HITS:1 COG:no KEGG:BT_0713 NR:ns ## KEGG: BT_0713 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 172 84.0 
4e-42 MVNISKTKRNFIWWHTLFAVISGALGAVILHFALPGHYFGGYPFIPIYFYFFGLFSIYAF DACRRHAPQRMLLLYLATKMIKMILSLILVLIYCLAVREEARAFLLTFILFYLIYLVFET WFFFSFELNQKRKKKNKKKNETVA >gi|225935330|gb|ACGA01000062.1| GENE 57 68155 - 69297 1124 380 aa, chain + ## HITS:1 COG:BMEI1546 KEGG:ns NR:ns ## COG: BMEI1546 COG0356 # Protein_GI_number: 17987829 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Brucella melitensis # 195 366 101 269 277 90 34.0 6e-18 MKQLHNIVAPLILFCLMAAVSLPVLAQEQEGEAVSQQTQDEITPKEEQENTVDVKEIVFG HIGDSYEWHITTWGNTHITIPLPIIVYSSTTGWHTFLSSRLEENGGTYEGLSIAPEGSKY EGKLVEYNAAGEQVRPWDISITKVTFALLFNSVLLLIIVLSVSHWYRKRPQGAKAPGGFI GFMEMFIMMVNDDIIKSCVGPNYRKFAPYLLTAFFFIFINNMMGLIPFFPGGANVTGNIA ITMVLAVCTFLAVNIFGSKHYWKDIFWPDVPWWLKVPIPMMPFIEFFGIFTKPFALMIRL FANMLAGHMAMLVLTCLIFISASMGPALNGTLTVASVLFNIFMNALELLVAFIQAYVFTM LSAVFIGLAQEGAKVKTEED >gi|225935330|gb|ACGA01000062.1| GENE 58 69406 - 69663 422 85 aa, chain + ## HITS:1 COG:no KEGG:BT_0715 NR:ns ## KEGG: BT_0715 # Name: not_defined # Def: ATP synthase C subunit # Organism: B.thetaiotaomicron # Pathway: Oxidative phosphorylation [PATH:bth00190]; Metabolic pathways [PATH:bth01100] # 1 85 1 85 85 74 100.0 1e-12 MLLSVLLQATAAAVGVSKLGAAIGAGLAVIGAGLGIGKIGGSAMEAIARQPEASGDIRMN MIIAAALIEGVALLAVVVCLLVFFL >gi|225935330|gb|ACGA01000062.1| GENE 59 69720 - 70223 688 167 aa, chain + ## HITS:1 COG:VC2768 KEGG:ns NR:ns ## COG: VC2768 COG0711 # Protein_GI_number: 15642761 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Vibrio cholerae # 16 157 12 152 156 61 29.0 8e-10 MSLLLPDSGLLFWMFLSFGIVFVILAKYGFPVIIKMVEGRKTYIDQSLEVAREANAQLSK LKEEGDALVAAANKEQGRILREAMEERDKIVHEARKQAEIAAQKELDAVKQQIQIEKDEA IRDIRRQVAVLSVDIAEKVLRKSLEDKEAQMGMIDRMLDEVLTPNKN >gi|225935330|gb|ACGA01000062.1| GENE 60 70229 - 70789 453 186 aa, chain + ## HITS:1 COG:sll1325 KEGG:ns NR:ns ## COG: sll1325 COG0712 # Protein_GI_number: 16329328 # Func_class: C Energy production and conversion # 
Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Synechocystis # 10 174 14 177 185 64 27.0 1e-10 MEVGILSMRYAKAIIEYAQEKGLEDRLYQEFLTLSHSFCEQPGLREALDNPVISFKEKLA LVCTAADGDGKSTREFVRFITLVLRNRREGYLQFISLMYLDLYRKLKHIGVGKLITAVPV NKETENRIRSAAEHILHAQMELETVIDPSIEGGFIFDVNDYRLDASVATQLKRVKQQFID KNRRIV >gi|225935330|gb|ACGA01000062.1| GENE 61 70789 - 72372 2036 527 aa, chain + ## HITS:1 COG:TM1612 KEGG:ns NR:ns ## COG: TM1612 COG0056 # Protein_GI_number: 15644360 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Thermotoga maritima # 5 517 3 496 503 575 58.0 1e-164 MSENIKVSEVSDILRKQLEGINANVQLDEIGTVLQVSDGVVRIYGLRNAEANELLEFDNG IKAIVMNLEEDNVGAVLLGPTDKIKEGFVVKRTKRIASIRVGEGMLGRVIDPLGEPLDGK GLIGGELYDMPLERKAPGVIYRQPVNQPLQTGLKAVDAMIPIGRGQRELIIGDRQTGKTA IAIDTIINQRSNFLAGDPVYCIYVAIGQKGSTVASIVNTLREYGALDYTIVVAATAGDPA ALQYYAPFAGAAIGEYFRDTGRHALVVYDDLSKQAVAYREVSLILRRPSGREAYPGDIFY LHSRLLERAAKIISQEEVAREMNDLPESLKGIVKGGGSLTALPIIETQAGDVSAYIPTNV ISITDGQIFLETDLFNQGTRPAINVGISVSRVGGNAQIKAMKKVAGTLKIDQAQYRELEA FSKFSSDMDPITALTIDKGRKNGQLLIQPQYSPMPVEQQIAILYCGTHGLLHDVPLDKVQ DFERSFIESLQLNHQEDVLDVLKTGVIDDNVIKAIEETAAMVTKQFI >gi|225935330|gb|ACGA01000062.1| GENE 62 72434 - 73330 813 298 aa, chain + ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 1 296 1 281 285 173 34.0 4e-43 MASLKEVKTRINSVKSTRKITSAMKMVASAKLHKAQGAIENMLPYQKKLNKILTNFLSAD LPIESPYVQEREVKRVAIVVFSSNTSLCGAFNANVIKMMMQTIGEFRTLGQDNILIFPVG KKVDEAAKRMGFKPQEVSPTLSDKPTYQEAAELAHRLMDMYVAGEVDRVEIIYHHFKSMG VQILLRETYLPINLTNVVSEEDRKKEEEAQEHEIANDYIIEPNAEELIASLIPTVLSQKI FTAAVDSNASEHAARTLAMQVATDNANELIQDLTKQYNKTRQQAITNELLDIVGGSMK >gi|225935330|gb|ACGA01000062.1| GENE 63 73915 - 76512 1930 865 aa, chain + ## HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # 
Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 21 406 328 758 805 153 29.0 1e-36 MENNPELQLAWQFIENTGTHLFLTGKAGTGKTTFLRRLKEHTPKRMVVLAPTGIAAINAG GVTIHSFFQLSFAPFVPETTFNSSQTHYRYSKEKRNIIRSMDLLVIDEISMVRADLLDAV DATLRRYRDREKPFGGVQLLMIGDLQQLAPVVKDNEWELLRKHYETPYFFASHALKETAY MTIELKKVYRQSDTFFLSLLNKIRENKADDEVLNELNRRYQPGFQPQKEEGYIRLTTHNN QAQRVNDRELASLPGKAYHFSAEIEGDFPEYSYPADKLLTIKEGAQIMFLKNDPSSEKRY YNGMIGEVVAVNETGIVVRGKGDRSEFQLLPEEWGNYKYVLNEETKEITEVIEGTFRQYP IRLAWAITIHKSQGLTFERAIIDARNSFAHGQTYVALSRCKTLEGMVLESPLRREAIISD ATVDNFTKAVEQNKPGSQQLNDMQKAYFFDLLSDLFNFYSIDQAYKRLLRLMDEDLYKLF PKQLAEYKALAPHVKEKIVEVSQRFRNQYTRLIHESEDYAANEELQERIRSGAGYFHKEL EPVRALYDKTNMPLDNKELRKLLAERMQALDDALWIKESLLEAVSTRGRFAITDYLKLKA KVMLSLEDDSTSSGSSKALKEKKERKERKERTRSAEKVKVEVPTDILHPELYRALSEWRT AKTREMNVPAYVIMQQKALMGIVNLLPDSPRALEAIPYFGAKGVERYGLEILGIIRKYMA ENQLERPEITDMLISSGNSDDTVRQRKTKLQKEAEKQEKEAEKEKEKEAKKDTKLVSYEM FRQGMNIDEIAKARDLVSGTIAGHLEHYVRSGKIKVEQVVKAENIAKIRKYLDEHEYMGI FAIKVALGDAVSYADIKFVLAVSGH >gi|225935330|gb|ACGA01000062.1| GENE 64 76541 - 77161 508 206 aa, chain - ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 206 2 195 198 94 34.0 1e-19 MKGSLIVIVFFCVGCIMGAFNKFEFDTHTVSMYILYALMLQVGISIGSNKNLKAIVSHLH PKMLLIPLGTIIGTLLFSALASLLLRQWSVFDCMAVGSGFAYYSLSSILITQFKEPSIGL QLATELGTIALLTNIFREMMALLGTPIIKKYFGKLAPISAAGVNSMDVLLPSITRYSGKE MIPIAILHGILIDISVPVFVSFFCNL >gi|225935330|gb|ACGA01000062.1| GENE 65 77158 - 77436 307 92 aa, chain - ## HITS:1 COG:no KEGG:BT_0723 NR:ns ## KEGG: BT_0723 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 92 92 117 94.0 2e-25 MFSIISTMFLGIGIGYVLRNWSILQKTEKTISLTIFLLLFILGVSIGSNSLIVNNLGKFG WQAIVLAVSGVLGSLIAARLVLQLFFRKGGEQ 
>gi|225935330|gb|ACGA01000062.1| GENE 66 77839 - 78750 553 303 aa, chain - ## HITS:1 COG:alr3336 KEGG:ns NR:ns ## COG: alr3336 COG1091 # Protein_GI_number: 17230828 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Nostoc sp. PCC 7120 # 1 297 1 284 294 135 30.0 7e-32 MKKILIIGANGFTGRQILNDLSACKQYKVTGCSLHPDILPNNAEDYRFIETDIRNEADVK HLFEEVQPDVVINCSALSVPDYCETHHEEAYLTNVTAVGQLADLCEEYKSRFIHLSTDFV FDGKINEAAGLLYTEEDLPAPVNYYGYTKWKGEERVAETCSSYAIVRVEIVYGKALPGQH GNIVQLVMNRLKAGQEIRVVSDQWRTPTYVGDVSDGVQRLIEHTTNGIFHICGDECITIA EIAYQVADCMKLDRSLIHPATTEEMNESTPRPRFSGMSIDKARTMLGYRPQKLKEILANW NNL >gi|225935330|gb|ACGA01000062.1| GENE 67 78889 - 80058 1207 389 aa, chain + ## HITS:1 COG:STM3519 KEGG:ns NR:ns ## COG: STM3519 COG1690 # Protein_GI_number: 16766807 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 11 389 13 405 405 266 40.0 5e-71 MEKVIMGTRLPAKLWLDEVEDSCMSQIMDLTSLPFAFKHIAIMPDAHAGKGMPIGGVLAT NGVIVPNAVGVDIGCGMCAIKTNLKAEDFSYTDLTSIMSKIRAVVPLGFDHQNKKQDQEL LPQGFDFEEMPVLKNQYEACLKQIGTLGGGNHFIEIQKDTETSDVWVMIHSGSRNIGLKV ANHYNKIAQYWNEKWYSAMVPGLAYLPMETQMAKDYFREMNYCVAFAFANRQLMMNRICE SIQAVKPETDFEPMINIAHNYAAWENHFDQDVIVHRKGATRAYEGEIGIIPGSMGTKSYI VEGLGNPESFKSCSHGAGRLMGRKDACRRLSLDEEKERMDQQGIIHGLRSQNELDEAPGA YKDIAQVIANERDLVKPLVELAPMAVIKG >gi|225935330|gb|ACGA01000062.1| GENE 68 80330 - 80902 351 190 aa, chain - ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 209 393 393 267 72.0 1e-70 MSTVEVAYDLKNVTSHLIASTSEIMAYGMPYDKIGQYLIGNIDYEKVCDGFYSFYSNYVT PCGTIGVTDCSELDNLAAIMKEINQRYTFNEELTGELQRLDGYTPTIFFDYGDYVSKLCS DPDLLEQFNEQLKRTVPFKRNTKQYFTAISSSYYGERIDINTFSGITISDPSTNPGASKK NETAWYIATH >gi|225935330|gb|ACGA01000062.1| GENE 69 80935 - 81612 414 225 aa, chain - ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 174 393 246 62.0 6e-64 METMQPYYRLMKSILFVLFLSFGMISCEKEDPVATSPTTRQTDDSTTDPTDPDPTDPVVR SNNEQTVFMYLPWSNNLTSFFYQNIADLKSIIGNNILKNERVIVFICTSATKATLSELVY ENGKGVQKTLKNYDYPDPTYTTAEGITSILNDVQTYAPAKRYAMIIGCHGMGWIPVSRAV SRSSLQISKKHWEYEHAPMTRFLEEHKRNIKLILLRLPEVFRTRA >gi|225935330|gb|ACGA01000062.1| GENE 70 81998 - 82897 556 299 aa, chain - ## HITS:1 COG:BS_ytdP KEGG:ns NR:ns ## COG: BS_ytdP COG2207 # Protein_GI_number: 16080067 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 189 296 666 766 772 73 36.0 4e-13 MGDKVLKIGTVHQCNCCLGSKTLHPLVSVIDLSKADLSAHTGIKFDFYTILLSECKCEAY MYGHQYYDFSDGTLLFLSPGESINMKENSKNFPSKGWILAFHPDLICGTPLGLNIHNYTF FSYLPEEALHISLREKQIILEFMDKINQELERCIDRHSKKIVSKYIELLLDYCIRFYERQ FITRNEANKTIIKQFDKIINNHFETKQVSTVDILSHKYCANRLHLSPEYFNDLLKYETGK SFKEYIEFKRFEIAKDWLVNTDKTINQITQELGFQNPQYFSRLFKKITGCSPNDFRIPN >gi|225935330|gb|ACGA01000062.1| GENE 71 82928 - 83830 656 300 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 216 299 275 361 368 60 36.0 5e-09 MDEITKIETVDQYDQLFGLETLHPLVNVIDFSKATKSVEYIRMNIGFYCLFLKDAKCGDL TYGRKNYDYQEGTVVCMAPGQVSGIDNRNRTAPRTKSIGVLFHPDLIRGTSLGQNIKNYT FFSYEVNEALHLSDQEREIVTDCIHKIRLELEHTIDKHSKQLIVRNIELLLDYCMRFYER QFITRNQANKDIVVKFEQLLDEYFQNQVAMTEGLPSVKYFADKACLSPNYFGDLIKKETG KTAQEYIQCRIIELAKERILEGVQTVSQVAYELGFQYPQHFSRLFKKHVGCTPNEYKQKN >gi|225935330|gb|ACGA01000062.1| GENE 72 84034 - 85041 803 335 aa, chain + ## HITS:1 COG:alr4831 KEGG:ns NR:ns ## COG: alr4831 COG0451 # Protein_GI_number: 17232323 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. 
PCC 7120 # 1 271 1 251 311 101 27.0 3e-21 MKALFIGGTGTISTDVVALAQQRGWEITLLNRGSKKMPEGIHSIIADINDEEAVAKAIAL EHYDVVAQFIGYTAEDVKRDIRLFQNKTRQYIFISSASAYQKPLTDYRITESTPLVNPYW QYSKNKIEAEEVLMSAYRTSGFPVTIVRPSHTYNGTKPPVAVHGDKGNWQILKRILDGKP VIIPGDGSSLWTLTHSKDFAKGYVGLMANPHAIGNAFHITTDESMTWNQIYQTIADALDK PLNALHVASDFLAKHSDHYDFRGELLGDKAATVVFDNSKIKRLVPDFICHISMADGLRQA VHYMLSHPETQTPDPEFDSWCDRIADAISAADKAF >gi|225935330|gb|ACGA01000062.1| GENE 73 85181 - 85729 411 182 aa, chain - ## HITS:1 COG:no KEGG:BT_0731 NR:ns ## KEGG: BT_0731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 182 14 178 178 275 82.0 5e-73 MKKLILFLSLILSVGFASAQSEIPSDSIRRAPSTNIKEFGGFLLDMGLMNVATPELPQFN LDMPNMTKDYNQLFRLNTDITYSQGFTDSFSSSSYSGFSGFGYGYGWGLSSSPQFMQMGS FKLKNGMRINTYGDYNKDGWRVPNRSAMPWERNNFRGAFELKSANGNFGIRIEVQQGRNT PY >gi|225935330|gb|ACGA01000062.1| GENE 74 85811 - 86584 500 257 aa, chain - ## HITS:1 COG:no KEGG:BF2197 NR:ns ## KEGG: BF2197 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 257 2 262 263 174 38.0 3e-42 MNKRQFTGFMFLLCTCTAFAQQVEVKKDSIGRAVLLPPEQKQPLNMDKEGFILPDMNITP NDRKPTIQTDSMTLHISPPEFMDTPWPTPRLGTSFDPFSRDYNRSDIFGINANSYLSTYS LHNTYPTMGTHIQVGAIYTYAPNERWELSGGLFSAKYTMPSFQHGARNDFGFSGSAAYRI NNFLRIRAFGEYSVNGERNASQGYLTPIYPQSGYGMILEIKFNDYIELHGGMERSYNPMK RKWETAPVVYPVIKLKR >gi|225935330|gb|ACGA01000062.1| GENE 75 86608 - 87303 771 231 aa, chain - ## HITS:1 COG:MT1062 KEGG:ns NR:ns ## COG: MT1062 COG0745 # Protein_GI_number: 15840463 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mycobacterium tuberculosis CDC1551 # 6 228 51 272 276 151 35.0 8e-37 MSKNKITVLLVEDEQTLAMIIKDTLEENDFIIHTANDGEEGLSLFFELHPDVLVADVMMP KMDGFEMVRRIRQTDKQTPVLFLTARSAINDVVEGFELGANDYLKKPFGIQELIIRIKAL MGKAFLFTENKVANHFEIGSYLFDPVAQTLLHAGTRQELSHRESEILKRLCENRNQVVNT 
QDVLLELWGDDSFFNSRSLHVFITKLRHKLSQDEQIRIVNVRGIGYKLIVN >gi|225935330|gb|ACGA01000062.1| GENE 76 87316 - 88830 898 504 aa, chain - ## HITS:1 COG:BH1945 KEGG:ns NR:ns ## COG: BH1945 COG0642 # Protein_GI_number: 15614508 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 280 504 240 459 462 125 32.0 2e-28 MKLPLKHIVILVICSLIGIFVYQAYWLTGLYRSMKMEMESNIRDAMRTSDFNEVVLRVND LQKDNVEHGSLTVSAGYDADGNSLVTQQTISYTDSSRQDTIHSRTEITTDTVKTVGNAED STPVASSENGLDVLLKKQDSLKELLMSIQQGIHSGVDTYIDVNLQRYDSLLTHVMKEHNL SIPHHTLLIYTGISADSSIVYTDTLGIAGDSSYIPSPKAIRYDYEFNMHHSQRYQLVFEP INSLVLKQMTGILVTSFVILLILGFSFWFLIRTLLKQKTLDEMKSDFTNNITHELKTPIA VAYAANDALLNFNQAEEKSKRDQYLRISQEQLQRLSGLVEQILSMGMERRKTFRLHPEEI NLKELITPLMEQHQLKADKPVHIELDIQPETLVIVADRTHFSNIISNLIDNAVKYSKEEA ELSISCRQTGGTVTVSVTDRGIGIPLDKQKHIFDKFYRVPTGNLHNIKGYGLGLFYVKSI VEKHGGTITVKSEPDKGSTFTITL >gi|225935330|gb|ACGA01000062.1| GENE 77 88963 - 91071 1532 702 aa, chain + ## HITS:1 COG:no KEGG:BT_0734 NR:ns ## KEGG: BT_0734 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 702 1 702 702 1152 82.0 0 MKRISLLFMSLLCVWISATAQVGQQSQEPVDSIKNIGYMDSLFQELPEVMITGERPIVKA EQGKLVYDVPRLVGNLPVDNAYDVVKNLPGIVNMNNSLMLGGQEVTVVINGKVTTLSAEQ LDALLKSIPASRIEKAEVMYSAPARYQVRGPMINLILAPGTGQASSLQGELYTAFNQHHY ESLAERGSLLYSGRKFSADLLYSYSYDRDRRITEKEALHTLADGSVHQMDMDEIMTSRTN SHQIRLGMDYSLAEDHLLSLVYTTAFTDSKHYSIATGAQNSVTDSQGDSQLHNAKLDYQT PFGLKAGVAFTSYHSPGSQLLYSTMGTETMNFLSRDKQQINQWRFYAGQEHTLGKGWGLN YGITYTTALDHSYQRYYDPETDNLLPDNNMNSRRREQTLNVYAGLSKSFGEKLSADVSLA AEQYHTNIWNEWSLYPVANLTYMPAAGHVLQFSLSSDKEYPEYWSVQNAISYMGAYSEIQ GNPYLKPATNYETSLNYILKGKYVFSAFYSYTKNKQMQTLYQSPERLVEIYKFFNFDFSS QAGLVMTVPFKVKKWLDSRFTAIGFRYRQKDSDFWDIPFDRKLYTFVLTMDNTFTLSTKP DLKFTLTGFYQNRAIQGIFDLPRSGNLDAALRYTFAKGKAQLTLKCDDLFNTSTISTQVR YGLQNVKNHYMRTTRTFGISFNYKFGGYKEKKREEVDTSRFK >gi|225935330|gb|ACGA01000062.1| GENE 78 91092 - 93098 1317 668 aa, chain - ## HITS:1 COG:PA0928_1 KEGG:ns NR:ns ## COG: PA0928_1 
COG0642 # Protein_GI_number: 15596125 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 436 666 269 509 509 108 28.0 2e-23 MRHLLLIIFIIIPSLCFTINASQHPSDSLIYLLKSTNSSPQKIQICRNLADIYLDAPEAK MYLLQMYHESLKINDKENALNALSDIITEELNSLNKDSLSKYINCIKKIASLEECESLLP LDHMRIFEAQCYSDQKDEAIKKELDFMDSKESTSNIYKETATTYTVGTSFYINQKHKEAS SYLEKSLKLAESLSVNVKSKYQKRIIWKLGMAYSKSGRRQEAILLIKELINMVEQEYKRD YQKQRPFYNIDSYRIQYYSFMISNLDLLTPKEENDYWKRILLIGKKLTNDFDKYNYYLCA NNYYTNSRTQKDLSKAIAANDTLIKIAKILAPQNLPGLLNTVSLLYEKQKDYPNALKYYK NSHYIQDSLSSNDARQQLNELQVKYDVNTLNNEKAELEIKNKKTQIISLSTLLIIVIGVC SYLYFSLKREKRMKAELKVLNRKAQESEKMKQAFINSICHEIRTPLNSIVGFSDLIMNED IDEEMRREFPAEIQKSTLLLTGLVNSMLEVANLDVSEDKLPCGPTDIRNICIQEMEKISP KPRIKYQLDITEDTLFIPTNAQYLTMVIEHLLNNANKFTEEGFIALGYKVNLSKNQICIS VTDSGCGIPKEKQEEVFNRFSKLDTFVPGNGLGLYLCRLIVKRLDGEIKIDPEYTDGTRV VVNLPMNE >gi|225935330|gb|ACGA01000062.1| GENE 79 93276 - 94919 1648 547 aa, chain - ## HITS:1 COG:no KEGG:BT_0735 NR:ns ## KEGG: BT_0735 # Name: not_defined # Def: aspartate aminotransferase (EC:2.6.1.1) # Organism: B.thetaiotaomicron # Pathway: Alanine, aspartate and glutamate metabolism [PATH:bth00250]; Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 540 16 555 557 1034 91.0 0 MEKKTNNPAITKSYAKKMETISPFELKNKLIDMADESIKKIAHTMLNAGRGNPNWIATEP REAFFLLGQFGLCECRHAFSLEEGIAGIPQKAGIAARFEAFLKENEKAPGANLLKEGYNY MLMEHAADPDTLIHEWAESVIGDQYPVPDRILHFTELIVQDYLAQEVCDRRPPKGTFDLF ATEGGTAAMCYLFDSLQENFLLNQGDAIALMVPVFTPYIEIPELRRYQFDVTEISADQMT PDGLHTWQYKDEDIDKLKDPRIKALFITNPSNPPSYALSKETTERIINIVKNDNPNLMII TDDVYGTFIPHFRSLMAELPHNTLCVYSFSKYFGATGWRTAVIALHEDNIYDKMIARLSE EQKSILNKRYSSLSLQPEKMKFINRMVADSRQIALNHTAGLSLPQQIQMSLFAIFSLLDK EDSYKKKMQEIIHRRLHALWDNTGFTLVEDPLRAGYYSEIDMLVWAKKFYGDDFVTYLQK TYNPLDVVFRLANETSLVLLNGGGFAGPKWSVRVSLANLNEADYVKIGQSIKCVLEEYAK AWKASIH >gi|225935330|gb|ACGA01000062.1| GENE 80 94942 - 96636 1867 564 aa, chain - ## HITS:1 COG:STM0870 KEGG:ns NR:ns ## COG: STM0870 COG2985 # 
Protein_GI_number: 16764232 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 14 563 15 556 561 249 29.0 1e-65 MEWIINQLRVHPELAIFLTLFAGFWLGRLKIGKFSLGTVTSVLLVGVLVGQLNITVDGPM KAVFFLLFLFAVGYKVGPQFFRGLKKDGLPQVGFAVLMCIVSLVAPWILAKIMGYHVGEA AGLLAGSQTISAVIGVASDTINQLGISDAQKATFINAIPVAYAVTYIFGTAGSAWILASL GPKMLGGLDKVKADCKELEAQMGTSEADEPGFSPALRPVVFRAYKITNEWFGKGKKVSEL ETYLCKNDKRLFVERIRQRGVVKDVNPNLILHKNDEVVLSGRREFVIGEEDWIGPEVIDA QLLDFPAETLPVMVTHRTFAGETVSKIRAQKFMHGVSIRNIKRAGINVPVLPKTIVDSGD ILELTGLKHEVESAAKQMGYIDRPTNQTDMIFVGLGILLGGLFGALAIHLGGVPISLSTS GGALIAGLLFGWLRSKHPTFGGIPEPSLWVLNNVGLNMFIAVVGIAAGPSFIAGFKEVGV SLFIVGALATAIPLLAGLLMARYLFKFHPALSLGCTAGARTTTAALGAIQDAVESDTPAL GYTVTYAVGNTLLIIWGVVIVLLM >gi|225935330|gb|ACGA01000062.1| GENE 81 96804 - 98471 1787 555 aa, chain + ## HITS:1 COG:SP1229 KEGG:ns NR:ns ## COG: SP1229 COG2759 # Protein_GI_number: 15901091 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Streptococcus pneumoniae TIGR4 # 1 554 1 555 556 561 53.0 1e-159 MKSDIEIARSIELKKIKQVAEGIGIPREEVENYGRYIAKIPEQLIDEEKVKKSNLILVTA ITATKAGIGKTTVSIGLALGLNKIGKNAIVALREPSLGPCFGMKGGAAGGGYAQVLPMDK INLHFTGDFHAITSAHNMISALLDNYLYQNQAKGFGLKEILWRRVLDVNDRSLRSIVVGL GPKSNGITQESGFDITPASEIMAILCLSKDVSDLRRRIENILLGFTYDDQPFTVKDLGVA GAITVLLKDAIHPNLVQTTEGTAAFVHGGPFANIAHGCNSILATKLAMSFGDYVITEAGF GADLGAEKFYNIKCRKSGLQPRLTVIVATAQGLKMHGGVSLDRIKEPNMEGLKEGLRNLD KHVRNLRSFGQTIIVAFNKFASDTDEEMELLREHCEQLGVGFAINNAFSEGGEGAVDMAR LVVDTIENNPSESLRYTYKEEDSIQQKIEKVATNIYGASVITYSSIARNRIKLIEKMGIT HYPVCIAKTQYSFSADPKIYGAVNNFEFHIKDIVINNGAEMIVAIAGEILRMPGLPKEPQ ALHIDIVDGEIEGLS >gi|225935330|gb|ACGA01000062.1| GENE 82 98810 - 100090 1718 426 aa, chain - ## HITS:1 COG:aq_479 KEGG:ns NR:ns ## COG: aq_479 COG0112 # Protein_GI_number: 15605959 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Aquifex aeolicus # 1 424 5 410 428 481 58.0 1e-135 
MKRDDIIFDIIEKEHQRQLKGIELIASENFVSDQVMQAMGSCLTNKYAEGYPGKRYYGGC EVVDQSEQIAIDRLKEIFGAEWANVQPHSGAQANAAVFLAVLNPGDKFMGLNLAHGGHLS HGSLVNTSGIIYTPCEYNLNQETGRVDYDQMEEVALREKPKMIIGGGSAYSREWDYKRMR EIADKVGAILMIDMAHPAGLIAAGVLENPVKYAHIVTSTTHKTLRGPRGGVIMMGKDFPN PWGKTTPKGEIKMMSQLLDSAVFPGIQGGPLEHVIAAKAVAFGEILQPEFKEYAKQVQKN AAVLAQALIDRGFTIVSGGTDNHSMLVDLRSKYPELTGKVAEKALVSADITVNKNMVPFD SRSAFQTSGIRLGTPAITTRGAKEDLMLEIAEMIETVLSNVENEEVIAQVRARVNETMKK YPLFAY >gi|225935330|gb|ACGA01000062.1| GENE 83 100225 - 100962 717 245 aa, chain - ## HITS:1 COG:no KEGG:BT_0739 NR:ns ## KEGG: BT_0739 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 245 1 245 245 431 88.0 1e-120 MNMLRKFILPGLLAIACTAVGQNYNAEKVDRPNTKKINIGIKAGFNSSMFMVSELKIKDV TIDEVQNNYKIGYFGALFMRINMKKHFIQPEVSYNVSKCEITFDKLGSQHPAIEPDYASV QSVLHSVDFPILYGYNVVKKGPYGMSIFAGPKLRYLWGKHNEITFKNFDQKGIHEKLYPF NVCAVIGVGVNISRIFFDFRYEQGIGNISKSIIYDNINSDGSTGVSNIIFRRRDSALSFS LGFIL >gi|225935330|gb|ACGA01000062.1| GENE 84 100979 - 101548 528 189 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 19 184 20 185 197 161 49.0 7e-40 MKQDWKPGTMIYPLPAVLVSCGKEESEYNMFTVAWTGTICTNPPMCYISVRPERHSYDII KKNMEFVINLTTKDMAFATDWCGVRSGRDYHKFDEMKLTPGQCTVVSAPLIEESPLCIEC RVKEIISLGSHDMFIADVVNVRADDRNLNPETGKLELAEANPLVYVHGGYYNLGEKIGKF GWSVEKKKS >gi|225935330|gb|ACGA01000062.1| GENE 85 101565 - 102026 610 153 aa, chain - ## HITS:1 COG:PAB1499 KEGG:ns NR:ns ## COG: PAB1499 COG1781 # Protein_GI_number: 14521525 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Pyrococcus abyssi # 8 150 4 148 152 138 46.0 3e-33 MSENKQALQVAALKNGTVIDHIPSEKLFTVVQLLGVEQMTSNITIGFNLDSKKLGKKGII KIADKFFCDEEINRISVVAPHVKLNIIRDYEVVKKKEVKMPDELRGIVKCANPKCITNNE 
PMATIFHVIDKDNCIVKCHYCEKEQKREEITIL >gi|225935330|gb|ACGA01000062.1| GENE 86 102023 - 102964 977 313 aa, chain - ## HITS:1 COG:VC2510 KEGG:ns NR:ns ## COG: VC2510 COG0540 # Protein_GI_number: 15642506 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Vibrio cholerae # 4 304 29 330 330 312 53.0 5e-85 MENRSLVTIAEHSKEKILYMLEMAKQFEMNPNRRLLQGKVVATLFFEPSTRTRLSFETAA NRLGARVIGFSDPKATSSSKGETLKDTIMMVSNYADIIVMRHYLEGAARYASEVAPVPIV NAGDGANQHPSQTMLDLYSIYKTQGTLENLNIFLVGDLKYGRTVHSLLMAMRHFNPTFHF IAPEELKMPEEYKLYCKTHQIKYVEHTDFSEEIIADADILYMTRVQRERFTDLMEYERVK NVYILRNKMLENTRPNLRILHPLPRVNEIAYDVDDNPKAYYFQQAQNGLYAREAILCDVL GITLDDVKNDILL >gi|225935330|gb|ACGA01000062.1| GENE 87 103077 - 103856 656 259 aa, chain - ## HITS:1 COG:no KEGG:BT_0592 NR:ns ## KEGG: BT_0592 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 254 8 259 261 245 51.0 1e-63 MNMINKGIPYFPTPANFFDEEVMELLEAKFGVLASYIVMRLLCKIYKEGYYISWGKEQNL IFVRKVGGGIKEDMMEKIVDLLLEKGFFHKETYEKYGILTSEQIQRVWFEATTRRKIDFS QLPYLLETKKKKGMQKEELNAENADIFETQEEVSQENADISRQTKLKETKLNHEEEETNV VSIEIPGYAYNLATHNIAGLMESLDNHKVTDPKEKQTILRLSDYGRKGTQVWKLLSNTAW SKIGAPGKYIIAALASGRK >gi|225935330|gb|ACGA01000062.1| GENE 88 103880 - 104224 304 114 aa, chain - ## HITS:1 COG:no KEGG:BT_0593 NR:ns ## KEGG: BT_0593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 114 23 121 121 96 46.0 2e-19 MRQKIRKNKLEACKLVWKKRITAEKGISDKCAERIVQECIRLIERMLYGNAMIAFHRQDG TFCLEKGTLVGYEKFFHRKFNITAKQESIVYWSEEQKGWRRFMIGNLMEWKAIV >gi|225935330|gb|ACGA01000062.1| GENE 89 104449 - 104859 303 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174670|ref|ZP_05761082.1| ## NR: gi|260174670|ref|ZP_05761082.1| hypothetical protein BacD2_22627 [Bacteroides sp. 
D2] # 1 136 1 136 136 175 100.0 9e-43 MKSKNAKKGSKNDKVKRSVQTMKKAPRRHSREEIISEEELDNRIAIDGDIRLYFTMYLKI FIDGHFRHPKKRKLINLAQYIYDQKVLYIHKHGGYKLMELSSIHAELGALRKPVEEEYMK EKKEKREQAEKVKSKY >gi|225935330|gb|ACGA01000062.1| GENE 90 104989 - 105942 512 317 aa, chain + ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 317 1 317 318 496 79.0 1e-139 MQMMNKNGFSRCGENYINRLRKEGRYSTAHVYKNALYSFSKFCGTLNMSFRQVTKERLRR YGQYLYECGLKPNTISTYMRMLRSIYNRGVEAGSAPYVPRLFHDVYTGVDVRQKKALPAG ELHKLLYEDPKSERLRRTQTIAALMFQFCGMSFADLAHLEKSALEQSVLRYNRIKTKTPM SVEVLDTARGMINQLRNNQEPIPNCPNYLFDILCGNKKRKDERAYSEYQSALRRFNNSLK DLARALRLNSPVSSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELRERT EVNKGNLSYIRNYRLGR >gi|225935330|gb|ACGA01000062.1| GENE 91 106297 - 106863 455 188 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 187 192 330 84.0 1e-89 MILTKETLSAGPSIGTGEGVAHSKRWYVALVRMHHEKKVAERLDKIGIENFVPVQQEVHQ WSDRRKVVESVLLPMMVFVHADPKERKEVLSFSTVSRYMVMRGESSPTIIPDEQMARFRF MLDYSEEAICMNSAPLARGEKVRVVKGPLTGLVGELVTVDGRSKIAVRLNMLGCACADMP VGYVEPFK >gi|225935330|gb|ACGA01000062.1| GENE 92 106936 - 107823 719 295 aa, chain + ## HITS:1 COG:NMB0062 KEGG:ns NR:ns ## COG: NMB0062 COG1209 # Protein_GI_number: 15675999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Neisseria meningitidis MC58 # 1 291 1 288 288 414 65.0 1e-115 MKGIVLAGGSGTRLYPITKGVSKQLLPVFDKPMIYYPISVLMLAGIREILIISTPYDLPG FKRLLGDGSDYGVRFEYAEQPSPDGLAQAFIIGEDFIGNDSVCLVLGDNIFYGQSFTRML QEAVRTVEEEQKATVFGYWVADPERYGVADFDKNGNVLSIEEKPENPKSNYAVVGLYFYP NKVVDVAKHIQPSPRGELEITTVNQEFLNDHQLKVQLLGRGFAWLDTGTHDSLSEASTFI EVIEKRQGLKVACLEGIALRHGWITADKMRELAKPMLKNQYGQYLLKVINELGLE >gi|225935330|gb|ACGA01000062.1| GENE 93 107873 - 108442 471 189 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 
# Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 178 1 179 183 200 54.0 1e-51 MEIIKTAIEGVVIIEPRLFKDDRGYFFESFSQREFTEKVRKVDFVQDNESKSSYGVLRGL HFQKPPYAQSKLVRVIKGSVLDVAVDIRKGSPTFGEHVSVELTEENHRQFFISRGFAHGF VVLTEEVIFQYKCDNFYAPQCEGALAWDDPALKIDWKVPADKVILSVKDQHHERLEEAGW LFDYNENLY >gi|225935330|gb|ACGA01000062.1| GENE 94 108448 - 109314 659 288 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 283 1 279 280 230 44.0 3e-60 MNILVTGANGQLGNEMRRVSSTSSNQYIFTDVAELDITSRDSIRKMVNDNQIHVIVNCAA YTNVDKAEDDFATADLLNNKAVENLAVVAKETDATLIHVSTDYVFQGDRNVPCREDWETN PLGVYGKTKLAGEHSIQEIGCHYLIFRTAWLYSPYGKNFVKTMRQLTSDKDTLKVVFDQV GTPTYAGDLASVIYQVIEENQLYKEGIYHFSNEGVCSWYDFAKEICDLSGNVCDIQPCHS DEFPSKVKRPHFSVLDKTKVKSAFGITVPYWKDSLQKCINELKQQLDY >gi|225935330|gb|ACGA01000062.1| GENE 95 109322 - 110395 882 357 aa, chain + ## HITS:1 COG:ECs4721 KEGG:ns NR:ns ## COG: ECs4721 COG1088 # Protein_GI_number: 15833975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Escherichia coli O157:H7 # 5 347 2 347 355 391 55.0 1e-108 MNFARNILITGGAGFIGSHVVRLFVNKYPEYHIINLDKLTYAGNLANLKDVEDQPNYTFV KADICDFEKMLEIFKQYHIDGVIHLAAESHVDRSIKDPFTFAQTNVMGTLSLLQAAKLTW EILPECYEDKRFYHISTDEVYGALEFDGTFFTEETKYQPHSPYSASKAGSDHFVRAFHDT YGMPTIVTNCSNNYGPYQFPEKLIPLFINNIRQGKPLPVYGKGENVRDWLYVVDHARAID LIFHNGNTAETYNIGGFNEWKNIDLIKVIIKTVDRLLGNSEGTSDHLITYVTDRKGHDLR YAIDSNKLKNELGWEPSLQFEEGIEKTVRWYLDNQNWMDNVTTGDYQKDNDRDDKSL >gi|225935330|gb|ACGA01000062.1| GENE 96 110474 - 112024 567 516 aa, chain + ## HITS:1 COG:no KEGG:BT_0467 NR:ns ## KEGG: BT_0467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 505 
2 482 489 442 52.0 1e-122 MTENITQNNSKRIAKNTLLLYIRTFITMAISLYTSRVILEVLGINDFGIYNVVGGFVGMF ALISATLTSSTQRFITYELGKKDENHSQEIFSTAVLIHIIIALVIIVLAESVGVYFLNNQ MNIPISRLTAANWVFQFSLITFVVNIISIPYNSTIIAHERMAAFAYITLIESILKLLIVY LLILSTFDKLIIYGALMLLVSLAIRMVYGVYCSKNFIECKFKLVKERFYYKQILGFSGWN IIGSSSVVLTNYGINILLNIFFGVAVNAARGITTQVDNALNQFVSNFVMALNPQITKSYA SGNREYMMTLVMTGSRYSFYLLLIMVIPILFETEYILACWLKNTPKYTVIFVQLSLIYML CQSLSNTLFTAMLATGNIRNYQIIVGGLALMAFPLSYGLFKMGFEPAYCYYATIFISILC LVVRLIMLHRIIGLSIRIFFKDVIVRVILVSMFSLISPYIIISFMTQGLERFIILSLVSL VATCIFIFFSGISKDERTFFITYIKNKFNKRYERKF >gi|225935330|gb|ACGA01000062.1| GENE 97 112008 - 112976 352 322 aa, chain + ## HITS:1 COG:MTH341 KEGG:ns NR:ns ## COG: MTH341 COG1035 # Protein_GI_number: 15678369 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, beta subunit # Organism: Methanothermobacter thermautotrophicus # 14 235 86 307 406 95 29.0 1e-19 MRENSNINKALAYYIGYSTDDLIRYKASSGGIGTSIIKYLLSTSEYDTSMTFVYDKEKCG YIPKLIYDFNEINICGSVYQDIDIFSFLKENISSIKNGIIVTCMPCQVQGIRSVLDRNNI NNFIISFCCSGQTTLEGTWLCYRYMGIDKSQVINMQYRGNGWPSGIQIELFDGKKIYKNN YTNPWRLMHQSKLFRPKRCLMCKEDISYKADVSLADPWLGKYKISDKIGHTMFLINTEKG LAFIEEMKNKSLLRLIDSSVEDYIEAQGHTIMAKDKASLEKKFNNILSKMGNNILYKKIM TLSPFMLRFHMLIIRIVYKIVK >gi|225935330|gb|ACGA01000062.1| GENE 98 112984 - 114120 594 378 aa, chain + ## HITS:1 COG:MTH340 KEGG:ns NR:ns ## COG: MTH340 COG2327 # Protein_GI_number: 15678368 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 113 376 112 364 400 79 27.0 9e-15 MKPYIIISGLNFRDNNRGTAALSYGSLSFLNERELLSKETKILNFRFVKNLFKSKNRGTN IQKIDIAGVKWECLTVNVFFVEALLFKYLHVCLPFTNLGRMIKQVEYVAAINGGDGFSDI YNTTSFLNRLPDINLAMKFNIPVIILPQTLGPFRESKNKVIADRILCYASQIFVRDDKYA SDLEAMGLKYEQTRDLSYYMKPEPFNIEIKANAIGINISGLAYSNKFRTLSGQFSTYPYL INKLIVYFQQKNIAIYLIPHSYNYQKVEVANDDLEAARDVYSKLHDKSNIILIDQDLISP QIKFIISQMKFFIGTRMHANFAAMYTGVPLFGLAYSYKFQGAFEANGIYDSTAMINDISE KEADAIVERIVTKYKSLD >gi|225935330|gb|ACGA01000062.1| GENE 99 
114149 - 115474 458 441 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174680|ref|ZP_05761092.1| ## NR: gi|260174680|ref|ZP_05761092.1| hypothetical protein BacD2_22677 [Bacteroides sp. D2] # 1 441 1 441 441 733 100.0 0 MAFIQSWLTIIILVAFFVKLRIGLVLYLVYFFLVPYFNINFLGISLSWNFANILLLIAFV IDYYKHHGKVRLDLRPFYPFIFFYAMMLIMMPFQEFVPTDIALNSWRANMMTNLVLPIVM WNVSRYDPKIIRYSRNCMVIVIIVIVIYGIFLLTLNGLNPYVYFMAQINNAELREAQFGE QMARLMIKISSVFTHPMIFGLFLGLAMVYLYSLKDKIKPLFIYLLMFFIVVCIFLCGIRT PIAAMFLTVFFYLLMLHRIKPMIYVAVIGFIGYVIIENIPELSAMVDSIFIKDSRKTNVE GSSIDMRMEQLNGCLREIQNCLIFGKGYEWCGYYMSIHDLHPVLLAFESLIFVVLCNSGI VGLCVWFITFVWLFRGVYRMNKNVNVTLFVITLAVYYIAYSTITGEYGYMKYFIIFYTLL LIDSKIFIGREHVSSQKLASR >gi|225935330|gb|ACGA01000062.1| GENE 100 115865 - 116026 69 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174681|ref|ZP_05761093.1| ## NR: gi|260174681|ref|ZP_05761093.1| hypothetical protein BacD2_22682 [Bacteroides sp. D2] # 1 53 1 53 53 89 100.0 5e-17 MDYSTFSRHDYKHIPIIGLAVKPLAHDGQHAAVIATHIRKITNIIEMVKTDYT >gi|225935330|gb|ACGA01000062.1| GENE 101 116271 - 116393 65 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNELQLVIELKRELLSRENYSRAPINYSEGMAEDQKDRYI >gi|225935330|gb|ACGA01000062.1| GENE 102 116439 - 116762 237 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174684|ref|ZP_05761096.1| ## NR: gi|260174684|ref|ZP_05761096.1| hypothetical protein BacD2_22697 [Bacteroides sp. 
D2] # 1 107 1 107 107 182 100.0 6e-45 MRLTFEGFIAKQKALEEQWALFQFWYEEKTLVLSEERKLRKSAERKLKSLQEKFDYANQE RSGKKPKWGYFPTVGHPEVVTKFEGTKSHSRVSSGYWLMLMFLLVMS >gi|225935330|gb|ACGA01000062.1| GENE 103 116797 - 116940 65 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYVSCKITWALTKLHWLQASVKFKTKFINKTRFRAHLYALSKGDAFN >gi|225935330|gb|ACGA01000062.1| GENE 104 117020 - 118138 689 372 aa, chain + ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 6 371 221 568 818 243 37.0 5e-64 MGNKVEIIAFYLPQFHPIPDNDEWWGKGFTEWTNVGKAKALFKGHNQPRVPADLGYYDLR LPIVREQQVELAKEAGVTAFCYWHYWFGNGKRLMADIFNEVLNTGKPDFPFCLGWANHSW YAKNWNSDGTSTNKLLIEQMYPGIDDEKMHFTFLLKAFKDPRYVKIDNKPFLLIFDPISI PEEYIQNFKCWTERAGFDGLYLVANITSSSISKKELLAKGFDVVTYQRLGGMVNPQLHKL GKVGRGLFKIYQYAKGFILKRPPRMIDYSKYYHSLITEDDQSVDVIPSIVPQWDHTPRSG WNGSLWVNSTPYFFYKHVLEALDAIKNKPQNQQILLLKSWNEWGEGNYMEPDLKNGKGYI EALKKALNKGLE >gi|225935330|gb|ACGA01000062.1| GENE 105 118668 - 119492 553 274 aa, chain + ## HITS:1 COG:no KEGG:Mfla_2015 NR:ns ## KEGG: Mfla_2015 # Name: not_defined # Def: glycosyl transferase, group 1 # Organism: M.flagellatus # Pathway: not_defined # 35 273 139 379 382 90 29.0 6e-17 MHPYGTMMYRLMLPKRNVVIACHNVSTPRGANSEKWAKKLTNMWIRSFKNIQVFSRSQEV ILKEKVHGKNILMAHLALKDYGESTIEVNKKEERTFRFLFFGNIVAYKRVDLLIEAANIL HDRGIENFKVRIAGSCNCNWEQNYASFIKCPDVFELMIKRIPNEDVANMFADSHVFVMPY QDIAQSGAITVAFRYNVVTLVSDIPQFKEFVEDGVTGLAFQSGNAGDLADKMQWCIENRD VLLKELSVNQKNFVDKELSLTSIVAKYKDFFSKL >gi|225935330|gb|ACGA01000062.1| GENE 106 119497 - 120708 319 403 aa, chain + ## HITS:1 COG:no KEGG:DPPB80 NR:ns ## KEGG: DPPB80 # Name: not_defined # Def: related to F420H2-dehydrogenase, beta subunit # Organism: D.psychrophila # Pathway: not_defined # 7 334 59 383 443 206 36.0 1e-51 MSFVPRLATDKQCTGCFACIDVCNKNAINIVEHYDGHRYVEIDKSKCVGCGMCEQICPIV SNFEYQKSEYSDFYAAWAKNRTHRKTSASGGAFYAMAFSVIEQGGIVFGAKIESPCYVHH 
QAIETKEELSCLQGSKYTHSNTSGNYKLALKYLKEGRMVLFSGTGCQIAGLLSFLKDRTY KGNLVTVDLICGGIPSRILIDKFISQAPYKVKRVLSFRTKETGWKPRGFRYNLKVEDENG KIYDYANKNNLITTGFSLELTNRYSCYDCPFVGKNRKSDFTIGDYWGCIKYDEEHFDGIS LIIAHTDKARQFLSSLRDSLFYDKADMNLAISHNHRLVTGHSIQQYFLERKLLAFLSSQL SDSTFSKIYANTFSNKSPWILLKVIRKVYGIVVNIIDRIGHKQ >gi|225935330|gb|ACGA01000062.1| GENE 107 120705 - 121742 327 345 aa, chain + ## HITS:1 COG:no KEGG:Amet_0211 NR:ns ## KEGG: Amet_0211 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 3 344 9 366 369 161 30.0 3e-38 MKIGLLTYHKSYNCGAVLQTYATCRLLKELGHEVELIDLRQPEPIKLRQLIFIPRFIKFY RFCKKFYPSLTRHYKTVEELKKAKLDYDCLLVGSDQTWNPLISREQCLAYFLDFGGEQVR RVSFASSFGVNMWPESHKELLPTIDKLLHRFSGISVREVTGQNILKQQFSLDSSLVLDPT MLHSSYVEITSSITPNHRIVCYLLNRNDCQLDMARYLSYKMGIPARMIFNGYPLRGFEYC YPPAIKGWIEQIGGAEVVVTDSFHGLVFSLLYHRQFAVIPIANGLSSRLLDLLDLVGLEN HVFTSQEELEKNMEVLQRPIDYERVDSILAIYRERSIDFLRNVLK >gi|225935330|gb|ACGA01000062.1| GENE 108 121753 - 122104 84 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174689|ref|ZP_05761101.1| ## NR: gi|260174689|ref|ZP_05761101.1| hypothetical protein BacD2_22722 [Bacteroides sp. D2] # 1 117 1 117 117 235 100.0 8e-61 MIHILHGFSNVDSEFKPRNLRQEVHYIESPRTKKRIVGWTIGCFRALCCSRKGETVFCWY DFQAVLLYWMCLLTFQRRNIGCLNILLKKKDTIQNRIVSKMYRKALMSKYFHASVTS Prediction of potential genes in microbial genomes Time: Fri May 13 11:00:27 2011 Seq name: gi|225935329|gb|ACGA01000063.1| Bacteroides sp. D2 cont1.63, whole genome shotgun sequence Length of sequence - 64726 bp Number of predicted genes - 63, with homology - 62 Number of transcription units - 24, operones - 14 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 122 - 181 6.4 1 1 Op 1 . 
+ CDS 246 - 572 140 ## gi|260174691|ref|ZP_05761103.1| hypothetical protein BacD2_22744 2 1 Op 2 8/0.000 + CDS 554 - 1120 79 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 3 1 Op 3 1/0.000 + CDS 1159 - 2337 640 ## COG0438 Glycosyltransferase 4 1 Op 4 1/0.000 + CDS 2384 - 3454 855 ## COG0535 Predicted Fe-S oxidoreductases 5 1 Op 5 12/0.000 + CDS 3459 - 4541 823 ## COG0438 Glycosyltransferase 6 1 Op 6 2/0.000 + CDS 4561 - 5967 770 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 7 1 Op 7 2/0.000 + CDS 6054 - 6863 705 ## COG1596 Periplasmic protein involved in polysaccharide export 8 1 Op 8 . + CDS 6873 - 9314 2074 ## COG0489 ATPases involved in chromosome partitioning + Term 9484 - 9522 2.0 + Prom 9485 - 9544 7.2 9 2 Tu 1 . + CDS 9609 - 11174 1172 ## BT_1642 hypothetical protein + Term 11216 - 11256 4.1 + Prom 11256 - 11315 5.5 10 3 Op 1 . + CDS 11337 - 11861 619 ## BT_0615 hypothetical protein 11 3 Op 2 . + CDS 11930 - 12022 131 ## + Term 12062 - 12095 6.1 + Prom 12124 - 12183 6.9 12 4 Op 1 . + CDS 12276 - 14600 2079 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Term 14601 - 14638 -0.7 13 4 Op 2 . + CDS 14650 - 15036 187 ## BT_0744 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase 14 4 Op 3 . + CDS 15037 - 15789 724 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 15 4 Op 4 . + CDS 15794 - 17077 1009 ## COG0612 Predicted Zn-dependent peptidases + Term 17082 - 17123 -0.7 + Prom 17084 - 17143 4.7 16 4 Op 5 . + CDS 17163 - 18332 1048 ## COG4642 Uncharacterized protein conserved in bacteria + Term 18353 - 18388 3.4 - Term 18387 - 18428 7.4 17 5 Tu 1 . - CDS 18477 - 19415 666 ## COG0462 Phosphoribosylpyrophosphate synthetase - Prom 19482 - 19541 7.2 + Prom 19394 - 19453 5.9 18 6 Op 1 . + CDS 19569 - 23888 3374 ## COG0642 Signal transduction histidine kinase 19 6 Op 2 . + CDS 23923 - 24651 903 ## BF2233 two-component system response regulator 20 6 Op 3 . 
+ CDS 24710 - 26146 1083 ## COG2978 Putative p-aminobenzoyl-glutamate transporter + Prom 26165 - 26224 4.9 21 7 Op 1 . + CDS 26246 - 26833 514 ## BT_0752 putative RNA polymerase ECF-type sigma factor 22 7 Op 2 . + CDS 26911 - 27939 888 ## COG3712 Fe2+-dicitrate sensor, membrane component 23 7 Op 3 . + CDS 28037 - 31456 3268 ## BT_0754 hypothetical protein 24 7 Op 4 . + CDS 31485 - 33299 1747 ## BT_0755 hypothetical protein + Term 33316 - 33370 4.5 + Prom 33329 - 33388 5.7 25 8 Op 1 . + CDS 33496 - 34812 830 ## COG3119 Arylsulfatase A and related enzymes 26 8 Op 2 . + CDS 34814 - 36862 1558 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 36889 - 36933 7.5 + Prom 36932 - 36991 3.5 27 9 Tu 1 . + CDS 37138 - 37305 155 ## gi|260174717|ref|ZP_05761129.1| hypothetical protein BacD2_22874 + Prom 37309 - 37368 5.4 28 10 Tu 1 . + CDS 37567 - 38460 704 ## COG1360 Flagellar motor protein + Term 38489 - 38530 9.2 - Term 38477 - 38518 9.2 29 11 Tu 1 . - CDS 38660 - 39244 202 ## gi|160884512|ref|ZP_02065515.1| hypothetical protein BACOVA_02496 - Prom 39468 - 39527 7.4 - Term 39609 - 39648 6.1 30 12 Tu 1 . - CDS 39752 - 40054 250 ## gi|260174720|ref|ZP_05761132.1| hypothetical protein BacD2_22889 - Prom 40138 - 40197 8.6 + Prom 40024 - 40083 9.9 31 13 Op 1 . + CDS 40167 - 40394 185 ## gi|260174721|ref|ZP_05761133.1| hypothetical protein BacD2_22894 32 13 Op 2 . + CDS 40419 - 40967 468 ## HSM_0868 putative antirepressor protein 33 14 Op 1 . + CDS 41109 - 41387 151 ## gi|260174723|ref|ZP_05761135.1| hypothetical protein BacD2_22904 34 14 Op 2 . + CDS 41406 - 41663 149 ## gi|260174724|ref|ZP_05761136.1| hypothetical protein BacD2_22909 35 14 Op 3 . + CDS 41681 - 41947 251 ## gi|260174725|ref|ZP_05761137.1| hypothetical protein BacD2_22914 + Term 42024 - 42069 4.1 36 15 Op 1 . + CDS 42146 - 42328 180 ## gi|260174727|ref|ZP_05761139.1| hypothetical protein BacD2_22924 37 15 Op 2 . + CDS 42351 - 45254 1760 ## PRU_0854 hypothetical protein 38 15 Op 3 . 
+ CDS 45281 - 45565 229 ## gi|260174729|ref|ZP_05761141.1| hypothetical protein BacD2_22934 + Term 45597 - 45640 9.1 + Prom 45991 - 46050 6.2 39 16 Tu 1 . + CDS 46160 - 46648 356 ## BF2331 hypothetical protein 40 17 Op 1 . + CDS 46769 - 47170 272 ## BF2332 hypothetical protein 41 17 Op 2 . + CDS 47232 - 48098 586 ## COG0616 Periplasmic serine proteases (ClpP class) 42 17 Op 3 . + CDS 48123 - 50324 2106 ## BF2334 hypothetical protein + Term 50333 - 50367 6.2 43 17 Op 4 . + CDS 50384 - 51190 613 ## BF2421 hypothetical protein + Term 51212 - 51262 8.3 + Prom 51194 - 51253 2.6 44 18 Tu 1 . + CDS 51306 - 52253 1026 ## BF2422 hypothetical protein 45 19 Op 1 . + CDS 52410 - 53084 347 ## COG0863 DNA modification methylase 46 19 Op 2 . + CDS 53053 - 53403 322 ## BF2423 hypothetical protein 47 19 Op 3 . + CDS 53384 - 54064 567 ## BF2338 hypothetical protein + Prom 54116 - 54175 2.5 48 20 Tu 1 . + CDS 54273 - 55745 1016 ## BF2339 hypothetical protein + Term 55753 - 55795 6.3 49 21 Tu 1 . + CDS 56191 - 56709 318 ## gi|260174741|ref|ZP_05761153.1| hypothetical protein BacD2_22994 + Term 56721 - 56754 5.2 50 22 Op 1 . + CDS 57072 - 57323 59 ## BF2297 hypothetical protein 51 22 Op 2 . + CDS 57320 - 57670 228 ## gi|260174743|ref|ZP_05761155.1| hypothetical protein BacD2_23004 52 22 Op 3 . + CDS 57667 - 58023 256 ## gi|260174744|ref|ZP_05761156.1| hypothetical protein BacD2_23009 53 22 Op 4 . + CDS 58020 - 58394 125 ## gi|260174745|ref|ZP_05761157.1| hypothetical protein BacD2_23014 54 22 Op 5 . + CDS 58408 - 58641 223 ## gi|260174746|ref|ZP_05761158.1| hypothetical protein BacD2_23019 + Term 58656 - 58694 -0.4 + Prom 58682 - 58741 2.1 55 23 Op 1 . + CDS 58798 - 59025 174 ## gi|260174747|ref|ZP_05761159.1| hypothetical protein BacD2_23024 56 23 Op 2 . + CDS 59045 - 59413 291 ## gi|260174748|ref|ZP_05761160.1| hypothetical protein BacD2_23029 57 23 Op 3 . + CDS 59427 - 59948 181 ## gi|260174749|ref|ZP_05761161.1| hypothetical protein BacD2_23034 58 24 Op 1 . 
+ CDS 60219 - 61583 1167 ## BF2343 hypothetical protein 59 24 Op 2 . + CDS 61576 - 62277 369 ## gi|260174752|ref|ZP_05761164.1| hypothetical protein BacD2_23049 60 24 Op 3 . + CDS 62261 - 63211 694 ## BF2432 hypothetical protein 61 24 Op 4 . + CDS 63221 - 64030 527 ## BF2433 hypothetical protein 62 24 Op 5 . + CDS 64027 - 64461 414 ## COG3772 Phage-related lysozyme (muraminidase) 63 24 Op 6 . + CDS 64458 - 64725 150 ## BF2348 hypothetical protein Predicted protein(s) >gi|225935329|gb|ACGA01000063.1| GENE 1 246 - 572 140 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174691|ref|ZP_05761103.1| ## NR: gi|260174691|ref|ZP_05761103.1| hypothetical protein BacD2_22744 [Bacteroides sp. D2] # 1 108 1 108 108 217 100.0 2e-55 MANSAIVAMPLTTEAPAGLLVLFQAAGSKRFVITTSTATTRAYITSDRGCALERDVKQWK GAIRYYLLNEKERNDKALKLHKFLKDVCGRDNFDSGIQKIVDLCESNY >gi|225935329|gb|ACGA01000063.1| GENE 2 554 - 1120 79 188 aa, chain + ## HITS:1 COG:MJ1064 KEGG:ns NR:ns ## COG: MJ1064 COG0110 # Protein_GI_number: 15669253 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanococcus jannaschii # 32 176 70 212 214 89 37.0 5e-18 MRIKLLVKYLIGWLKCRFNHIECGDCCYLGINLNIINKGNFKLGDNVKFRPSLDVYISRN ALLSIGDHTEIGNHFTISAHNIVIIGKGVLTSPHVFIADHNHEYRNPDVYIYKQGCRAGK DDRVLIDDGTWLGTNVVVVGNVHIGKQCVIGANSVVTKDIPDYCVAAGSPAKVLKRYNFE SKEWERFR >gi|225935329|gb|ACGA01000063.1| GENE 3 1159 - 2337 640 392 aa, chain + ## HITS:1 COG:SMb21503 KEGG:ns NR:ns ## COG: SMb21503 COG0438 # Protein_GI_number: 16265081 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 359 5 367 447 209 33.0 5e-54 MKILLANKFYYRRGGDCIYMLNLELLLREYGHEVAIFAMDYPDNLETPWRKYFPSEVKFK PGLGMLEAFLRPLGTNEVKRNFCALLDEFKPDVVHLNNIHTQLSPIIAELAHERGIKVVW TLHDLKLLCPRYDCLRNGKIACEECFMDKRKVLEYKCMKNSIVASFLAYKEAVSWNRERL EKCTDIFICPSLFIANKMIQGGFDKTKMCTLCNFIDIEKTRKQEYNKLDYYCFIGRLSNE KGVVTLVKAARQLPYKLKIIGEGPLMEELKILCKGTNIELVGYKQWPEIKELLGCARFSV 
ISSECNENNPLSVIEAQCLGTPVLGACIGGIPELVEEGKSGMLFKSKDIVDLKKKIEMMF TYTFDYESLAKKSQERYSADTYYKQLMKIYTK >gi|225935329|gb|ACGA01000063.1| GENE 4 2384 - 3454 855 356 aa, chain + ## HITS:1 COG:CAC2796 KEGG:ns NR:ns ## COG: CAC2796 COG0535 # Protein_GI_number: 15896051 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 17 194 44 224 394 75 31.0 1e-13 MTTNNIPCPTDASIILTYRCPMRCKMCNVWQYPTEKSKEIQPEELRTLPKLKFINLTGGE PFIREDLDKIVEECYKHTDRIVISTSGWFEDRVVALAKKFPNIGIRISIEGLSCKNDELR GHAGGFDKGLRTLLTLKGMGLKDIGFGCTVSNNNSKDMLSLYQLSKSLGMEFATAAFHNS YYFHKDDNVITNKDEVCKNFEQLIEWQLKENHPKSWFRAFFNMGLINYIEGGRRMLPCEA GSANFFIEPYGDVYPCNGLEEKYWMKKMGNIRETLNFMTIWESEQAQQVRDMVRKCPKNC WMVGTASPVMHKYIKHPLKWAIANKLRSMQGKSACLDKCWYNVGQDPCQGDLREKF >gi|225935329|gb|ACGA01000063.1| GENE 5 3459 - 4541 823 360 aa, chain + ## HITS:1 COG:SMb21502 KEGG:ns NR:ns ## COG: SMb21502 COG0438 # Protein_GI_number: 16265080 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 359 23 387 389 262 39.0 7e-70 MKIVVTGTRGIPNILGGVETHCEELFPRMAQKDIDITIVRRKSYVHDSLTEYQGVKLVDI TTPKKKSLEAVIHTFKAILKARSLHADIVHIHAIGPALLTPLARLLGMKVVFTHHGPDYD RDKWGLAAKTMLKLGERMGVMFANEVIVISEVINDILVRKYGRRDCRLIYNGVPAPDKIA DTDYLESLGVESHKYVFAMGRFVPEKNFHHLIHAFSALKRLDYKLILAGDSDFEDDYSIN LKKLAKENGVILTGFIKGKKLHELLTHACCFVLPSSHEGLPIALLEAMSYDLPVIVSNIP ANLEVGLAYDCYFQTDDEKELKDKLQKALSKDFVPVSYSMDEYNWDTIAEQVLAVYRGVV >gi|225935329|gb|ACGA01000063.1| GENE 6 4561 - 5967 770 468 aa, chain + ## HITS:1 COG:VC0934 KEGG:ns NR:ns ## COG: VC0934 COG2148 # Protein_GI_number: 15640950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Vibrio cholerae # 111 468 112 465 465 244 38.0 3e-64 MQEIQRFNKVLKSLVLLGDLILLNLLLWGFNFFLGTRFWCIHCGSIFQGMTLITLCYLLC NMHSGVILHRPLVRPEQIMIRVLRNMVPFVFLSVCTLLVFHFHFSHSRYFGLFYVALIIV 
IISYRLISRHLLEIYRKKGGNVRKVVLVGSHENMQELYHAMTDDPTSGYRILGYFEDFPS DRYPQEIPYLGQPNEVTVFLEKHTGEIDQLYCSLPSVRSAEIVPIINYCENHLVRFFSVP NVRNYLKRRMHFELLGNVPVLSIRCEPLESLENRIIKRAFDIVCSGVFLITVFPFVYIFF GIAIKLSSPGPVFFKQKRSGKDGREFWCYKFRSMKVNTQCDTLQATENDPRKTRIGEIMR KTSVDELPQFINVLKGDMSIVGPRPHMLKHTEEYSNLINKFMVRHFVKPGITGWAQVTGY RGETKELWQMEGRVQRDIWYIEHWTFLLDLYIMYKTVYNAIRGEKEAY >gi|225935329|gb|ACGA01000063.1| GENE 7 6054 - 6863 705 269 aa, chain + ## HITS:1 COG:RSp1020 KEGG:ns NR:ns ## COG: RSp1020 COG1596 # Protein_GI_number: 17549241 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Ralstonia solanacearum # 50 231 86 262 381 60 31.0 2e-09 MMRRLNKRNLWLLLLPLLLAACQSYKKVPYFQNVEVVNEVEQQEKLYDAKIMPKDLLTIV VSCTNPELAIPFNLTVASNAGIAVSTSSYVTTQPVLQPYLVDNEGNINFPVLGELKLGGL TKREAEQLIIDKLKPYMKETPIVTVRMVNYKISVIGEVTRPGTFTISNEKVNLLEALAMA GDMTVYGLRDNVKLIREDANGKQQIVTLDLNKAETILSPYYWLQQNDIVYVTPNKAKARN SDVGNSTSLWFSATSILVSIVSLLVNILK >gi|225935329|gb|ACGA01000063.1| GENE 8 6873 - 9314 2074 813 aa, chain + ## HITS:1 COG:alr2856_2 KEGG:ns NR:ns ## COG: alr2856_2 COG0489 # Protein_GI_number: 17230348 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. 
PCC 7120 # 545 790 3 253 275 120 29.0 9e-27 MKETDFNEAQESKEENIDVKELLFKYLIHWPWFVGTVVVCLIAAWVYLYMSTPVYNISAT VLIKDDKKGGSAGMLSGLESLGLDGMVSSSQNIDNEIEVLRSKTIVKEVVEDLGLYISYT DEDEFPSRNMYKTSPVQVSLTPQEADLLEEPMIVEMALQPQGSMDVNVKIGDDEYQKHFE KLPAVFPTERGTLAFFLTPDSVLSSKRTLEETTDSEKTTRNITATINKPLAVAKAYCKNM TIEPTSKTTSVAVISLKNSNVQRGKDFINKLLEMYNINTNNDKNEVAQKTAEFINERISI ISKELGSTEKDLESFKRGAGITDLTSDAQIALTGSAEYEKKRVENQTQINLLQDLQKYMQ NEGYEVLPSNIGLQDLNLAAAINRYNDVLVERKRLLRTSTENNPTIINLDTSISAMKENV QVSLDRVLRGLFITKADLDREANRYSRRISEAPGQEREFVSIARQQEIKAGLYLMLLQKR EENAITLAATANNAKIIDDAIADDAPVSPKRKMIYLIALVLGVGIPVGVIYLLELTKFKI EGRSDVEKLTSVPIVGDIPLTDEKQGAIAVFENQNNLMSETFRNIRTNLQFMLENDKKVI LVTSTVSGEGKSFISANLAISLSLLGKKVVIVGLDIRKPGLNKVFNIPRKEVGITQYLAN PEKSLMDLVQLSDVSKNLYILPGGTVPPNPTELLARDGLDKAIETLKKNFDYVILDTAPV GMVTDTLLIGRVADLSVYVCRADYTHKNEYTLINELAEKDKLPSLCTVINGLDLKKRKYG YYYGYGKYGKYYGYGKRYGYGYGYGEQSHAKED >gi|225935329|gb|ACGA01000063.1| GENE 9 9609 - 11174 1172 521 aa, chain + ## HITS:1 COG:no KEGG:BT_1642 NR:ns ## KEGG: BT_1642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 518 1 518 521 832 79.0 0 MEDLQRLYPIGIQTFSKIREGNYLYIDKTEYVYRMTHSASSYMFLSRPRRFGKSLLTSTL HSYFSGRKELFHGLAMEKLEKEWTEYPVLHFDMSTAKHSDSEQLLQELNLKLYGYEQIYG RLEEEVNPNQRLMGLIKRAYEQTGKKVVVLIDEYDAPLLDVVHERENLDVLRNIMRNFYS PLKACDPYLRYVFLTGITKFSQLSIFSELNNIKNISMDEPYAAICGISEDEIRLQMKDDL GGLAKKLEITPEEALMKLKENYDGYHFTSPSPDIYNPFSLLNAFADGKFGSYWFGSGTPT YLVKMLDKFGVKPSEIGRRQLKSSVFDAPTETMTDAVPLLYQSGYITIKDYNKMLDLYTL DIPNKEVRLGLMESLLPYYVNNKTPEATTMVAYLFYDIQNGDMDAALHRLQEFLSTIPYC DNTRFEGHYQQVFYIIFSLLGYYVDVEVRTPRGRVDIVLRTKTTLYVMELKLDKSAGEAM EQIDLKNYPERFALCGLPVVKVAVCFDSERCTIGDWEIIGC >gi|225935329|gb|ACGA01000063.1| GENE 10 11337 - 11861 619 174 aa, chain + ## HITS:1 COG:no KEGG:BT_0615 NR:ns ## KEGG: BT_0615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 172 255 89.0 5e-67 MSVIYKVITRPTDPRVPNSPKRYYPHLITLGQSVNLKYIAQKMQDRSSLSIGDIKSVIQN 
FVEKMKEQLLEGKSVNIEGLGVFMLTARSKGAELAKDINAKSVDSVRIFFQANKELRVTK TATRADEKLDLISLDEYLKKISVTVSPEDPEKPDEGEGGGDEGGGSGEAPDPAA >gi|225935329|gb|ACGA01000063.1| GENE 11 11930 - 12022 131 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKTWSIILKVIIAVAGAIAGVVGVQAATL >gi|225935329|gb|ACGA01000063.1| GENE 12 12276 - 14600 2079 774 aa, chain + ## HITS:1 COG:aq_624 KEGG:ns NR:ns ## COG: aq_624 COG5009 # Protein_GI_number: 15606057 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Aquifex aeolicus # 1 748 1 679 726 277 30.0 5e-74 MIRKIIKALWIFLAVIVLAIVIVFVSISKGWIGYMPPVEELENPSYKFATEIFSEDEKVL GTWSYSKENRVYTAYKDLSPSIINALIATEDVRFVEHSGIDAKALFRAFVKRGLMFQKNA GGGSTLSQQLAKQLFTENVARNTLQRLFQKPIEWVIAVKLERYYTKEEILSMYLNKFDFL NNAVGIKTAAYTYFGCEPKDLKIEEAATLVGMCKNPSLYNPVRFNERSRGRRNVVLEQMR KAGYITDAECDSLQALPLKLTYNRVDHKEGLATYFREYLRGVMTAPKPVRSDYRGWQMQK FYEDSIAWETNPLYGWCAKNKKKDGTNYNIYTDGLKIYTTINSRMQQYAEDAVKEHLGDY LQPVFFKEKEGSKNAPYARSLPEKRVEELLTKAMKQTERYRLMKEAGASEQQIRKAFDTP EEMTVFSWKGDKDTIMTPMDSIRYYKSFLRTGFMSMDPANGHVKAYVGGPNYVYFQYDMA MVGRRQVGSTIKPYLYTLAMENGFSPCDQARHVEQTLIDENGTPWTPRNANDKRYGEMVT LKWGLANSDNWISAYLMGKLNPYDLVRLIHSFGVRNKAIDPVVSLCLGPCEISVGEMVSA YTAFANKGIRVAPLFVTRIEDSDGNVISTFAPQMEEVISASSTYKMLVMLRAVINEGTGG RVRRYGITADMGGKTGTTNDNSDAWFMGFTPSLVSGCWVGGDERDIHFGRMTYGQGAAAA LPIWAMYMKKVYDDPTLGYDQQERFKLPEGFDPCAGSETPDGEVEERGLDDLFN >gi|225935329|gb|ACGA01000063.1| GENE 13 14650 - 15036 187 128 aa, chain + ## HITS:1 COG:no KEGG:BT_0744 NR:ns ## KEGG: BT_0744 # Name: not_defined # Def: 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 48 165 173 145 61.0 3e-34 MHKYIVCIGSNYNRKENLTFARQKLTESFSSICFAPELETKPLFFKNPALFSNQVVLFYS DKDEETVRKMLKDIEQRSGRRSEDKKEEKVCLDIDMLLYDNKILKPEDWQRGYIQQSLSA FHSSLFIK >gi|225935329|gb|ACGA01000063.1| GENE 14 15037 - 15789 724 250 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 247 1 239 245 192 45.0 4e-49 MKFLGIIPARYASTRFPAKPLAMLGGKTVIQRVYEQVAGVLDDAYVATDDERIEAAVKAF GGKVVMTSVHHKSGTDRCYEACTKIGGDFDVVVNIQGDEPFIQPSQLDAVKACFEDVTTQ IATLVKPFTADEPFAVLENVNSPKVVVNKNWNALYFSRSIIPYQRNAEKQDWLKGHTYYK HIGLYAYRTEVLKEITMLPQSSLELAESLEQLRWLENGYKIKVGISEVETIGIDTPQDLE RAEEFLKNRI >gi|225935329|gb|ACGA01000063.1| GENE 15 15794 - 17077 1009 427 aa, chain + ## HITS:1 COG:sll2009 KEGG:ns NR:ns ## COG: sll2009 COG0612 # Protein_GI_number: 16330306 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 22 420 9 410 435 118 27.0 2e-26 MDRTIQPEIQTLKNFRILPPVRMTLPNGIPLTVINAGEQEVVRIDVLFAGGRWQQSQKLQ ALFTNRMLREGTKKYTAATIAEKLDYYGSWLELSSSSEYAYITVYSLNKYLAKTLEVVES MIKEPLFPEKELNTILDTNIQQYQVNTSKVDFLAHRSLLQSLYGEQHPCGKIVVEEDYHA ITPEVLREFYERYYHSGNCSIFLSGKVTEDIISRVTDTFGTSFGQHQQQVSRLSFPFTAV PGKRIFTEREDAMQSAVKMGYTTITRNHPDYLKLRVLMTLFGGYFGSRLMSNIREEKGYT YGISAGIMFYPDSGLLAISTETDNEYVEPLIQEVYHEIDRLHQEPVSAEELTIVRNYMLG EMCRSYESPFSLSDAWIFIATSGLDDDYFSRSLLAVNEVTPMEIQDLAQRYLCKETLKEV IAGKKLS >gi|225935329|gb|ACGA01000063.1| GENE 16 17163 - 18332 1048 389 aa, chain + ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 46 360 27 341 349 164 33.0 3e-40 MRKYLYTTLLLALLAQEGVMAQENEGKKGGFFGKIKDTFSTEIKIGNYTFKDGSVYTGEM KGRKPNGKGKTVFKNGDVFEGEYVKGKREGYGIYMFPDGEKYEGQWFQDQQHGKGIYYFM NNNRYDGMWYQDYQHGEGTMYYHNGDLYVGKWVNDKREGEGTYTWANGAKYTGHWKNDKK NGKGIMNWDDGCKYEGDWKDDVRHGKGVFEYTNGDKYDGDWADDIQHGRGTYYFHTGDRY EGSYLLGERTGEGVYYHANGDKYVGNFKNGMQDGKGTFTWANGAVYEGSWKNNKRDGRGV YKWSNGDVYDGDWKDNRPNGQGTLKTVAGMQYKGGFVDGLEDGQGVQIDKDGNRFDGFFK QGKKNGPFVETDKDGKVIKKGTYKFGRLQ >gi|225935329|gb|ACGA01000063.1| GENE 17 18477 - 19415 666 312 aa, chain - ## HITS:1 
COG:Cj0918c KEGG:ns NR:ns ## COG: Cj0918c COG0462 # Protein_GI_number: 15792247 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Campylobacter jejuni # 7 311 4 309 309 312 50.0 5e-85 MSEKAPFMVFSGTNSRYLAEKICASLDCPLGNMNITHFADGEFAVSYEESIRGAHVFLVQ STFPNSDNLMELLLMIDAAKRASAKSVVAVVPYFGWARQDRKDKPRVSIGAKLVADLLSV AGIDRLITMDLHADQIQGFFNIPVDHLYASAVFLPYIQSLQLENLVIATPDVGGSKRAST FSKYLGVPLVLCNKSREKANEVASMQIIGDVEGKNVVLIDDIVDTAGTITKAANIMLDAG AKSVRAIASHCVMSDPASFRVQESALTEMVFTDSIPYAKKCPKVKQLSIADMFAETIKRV MNNESISSQYII >gi|225935329|gb|ACGA01000063.1| GENE 18 19569 - 23888 3374 1439 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 633 816 88 254 311 67 31.0 2e-10 MSSRFPLYIIGIVLFASFFSCTDMVPTKEVRLIDSLNGKAYAYRYRSLDSSYKYANEAYR QVNFYKSGKAEASNNLGFCAFMAMDFDQAEALHKEVYKLTKNELELLIADIGLMKICQRT AMNKEFYDYRNSALKRMKRIREESDLFADRHEALRLDYAFTEFFIVSSIYYYYLQQRQEA ITSLNRIPEDEALTDTNQLLYYHYIKGSASLVEATKPEDRKMREFDQLYITWRTAVQTNH PYFEGNGLQGLANLMVSPNNFELFRTRRGYALDQFGFPVDSLLPLRMAQRALEKFREYND LYQIAGAYVSIGKYMNEHGRYTEALDTLAKALDCVNQHHMLYYHHAADTLDKLHVFVEGD TTYTGVPWIMQEDVRTVPEWISRIREQLSVSYAGLGMKYASDYNRNIYLDILNYTRQDKE LESRYLSLEADSRQMTLVLSLVIVGLVLVVILWWFFNKRSKIRNQVDVERLQRILTLCRD ITSSIPMNVPLIQQGIDQLFGKGRLQLEIPEEGKAALVPLHRLNRDEKALVHVLEPYIVW AADNEQMVEALSDERMQLEKQRYVYEQHIAGNKRQNLIKKACLAIVNGINPYIDRILNEV HKLTERGYIDNAKIKKEKYQYIDELVTTINEYNDILALWIKMKQGTLSLNIETFSLNELF ELLGKGRRAFEMKNQKLEIEPTTVMVKADRALTLFMINTLAENARKYTPEGGTIKVYART TEDAYVEISVEDNGRGISEEDVAHIIGEKVYDSRVIGMKNAADPEVLKENKGSGFGLMNC KGIIEKYKKTNDLFRGCVFDVESELGKGSRFYFRLPSGVRKAMGVLLLCLLLPLGMVSCL HDPIPPMLQDGDSIVVVTDSAYEDLLDVASDYANAAYFANVDENYELALQYIDSAMLFLN EHYEKYARPDRPHRYMKLVGEGTPAEISWWNELFDSDYHVILDIRNEASVAFLALKQLDA YSYNNSAFTDLYKLQGEDQTLEAYCRQLERSNTNKTVGIILCFVLLIISLVGYYFLYMRK 
RLQNRLNLEQVLEINQKVFAASLVRPQEQENAEALQREESTLKEIPQRIVDEAFGAVNEL LTIDRMGIAVYNETTHRLEYASRPGQEMPEMVEQCFSSGKYLSEQHLQAIPLMVEAGGEH QCVGVLYLERREGTEQETDRLLFELVARYVAIVVFNAVVKLATKYRDIESAHEETRRASW EDSMLHVQNMVLDNCLSTIKHETIYYPNKIKQIVGRLNAQNLSETEEREAVETMTELIEY YKGIFTILSSCASRQLEEVTFRRTVIPVQELLDAAGKYFKKLMKNRPERIELEIEPMEAK VIGDVNQLRFLLENLIDEALTVREDGVIRLQARKDNEYIRFLFTDTRREKSVEELNQLFY PNLARMTSGEKGELRGTEYLVCKQIIRDHDEFAGRRGCRINAEPAEGGGFTVYFTIPRR >gi|225935329|gb|ACGA01000063.1| GENE 19 23923 - 24651 903 242 aa, chain + ## HITS:1 COG:no KEGG:BF2233 NR:ns ## KEGG: BF2233 # Name: not_defined # Def: two-component system response regulator # Organism: B.fragilis # Pathway: not_defined # 1 242 1 242 242 439 95.0 1e-122 MEEQKFKVIIVEDVKLELKGTEEIFRHEIPNAEVIGTAMTESEFWPLMEAQLPDLVLLDL GLGGSTTIGVDICKNIFKRYKGVRVLIFTGEILNEKLWVDVLNAGADGIILKTGELLTKT DVQAVMDGKKLVFNYPILEKIVDRFKKSVANDAKRQEAVISYDIDEYDERFLRHLALGYT KEMIANLKGMPFGVKSLEKRQNDLIGRLFPNGERVGVNATRLAVRALELRIIDLDNLEPD EE >gi|225935329|gb|ACGA01000063.1| GENE 20 24710 - 26146 1083 478 aa, chain + ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 1 478 23 503 512 298 37.0 2e-80 MPHPATMFLLLTMAVVFLSWICDIYGLKVTLPQTGEDIRVQSLLSPEGIRWWLRNAIKNF TGFAPLGMVIIAMFGLGVAQHSGFIDACIRMGVGNRQEKRKIVLWVIVLGLLSNAIGDGG YIILLPIAAMLFQWVGLHPIAGIVTAYVSVACGYSANIVLSTMDPLLAHTTQEAALAQTG YQGNTEPLCNYFFMSASTVAITAIVYWITQKWLLPTLGKYEGSMKVVAYHPLSRKERRAI MISIVVAAIYVALILWLTFSSYGILRGVNGGLMHSPFIAGILFLLSLGAGITGMAYGFSS GRYRTDNDVIEGLTQPMKLLGVYFVIAFFAAQMFACFEYSHLDKCLAIMGADLLSSFEPA PLSALVLFILFTALINLIMVSATSKWAFMSFIFIPMFAQMGIAPDIAQCAFRIGDSSTNA ITPFLFYMPLVLTYMRQYDKQITYGSLLKYTWRYSLGILVVWTLLFIVWYLLKIPMGL >gi|225935329|gb|ACGA01000063.1| GENE 21 26246 - 26833 514 195 aa, chain + ## HITS:1 COG:no KEGG:BT_0752 NR:ns ## KEGG: BT_0752 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 347 92.0 1e-94 MNLPEISENLIEQLNHGNTKAFDKIYHTYYLYLCAIAVYYVHDKRVAGEIVNDVFVAFWQ NRHHITYPALPYLRRAIQNASISYLRSSYFNEKQMTEQMEEIWAFLENHILSSDNPLQAL EHSEMNTIIMKKVEELPTKCRAIFKASLYEGKTYSEIAEEQNINVTTVRVQMKIALTKLR ESLGTPYMIAILMFL >gi|225935329|gb|ACGA01000063.1| GENE 22 26911 - 27939 888 342 aa, chain + ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 132 309 70 242 274 90 33.0 5e-18 MNDSYHIEELEIRISNYLSGNCTEEEKEALLAYLASNEDAAKIFREMSAVWAVSSVPAFA KVEDANLSQIKEKIAAAEMKPVRSRKIVPVWLKVAAAIILLLGCNYFWYTYTENLTEVYT NADSPYEIKVPAGSRTNIVLPDGTEVSLNAGSVLRYCRGFGIRERDVTLDGEGYFKVAKN EKIPFFVNTNGVQVKVVGTVFNVRAYDDDNYVMVSLLEGRVNLATLSGSVMKLFPSEQAF YDKNTGRMEKMKSNASSACDWLDGGLTFEDAPFADIAHRLERKFQVKISLESERLKAERF SGCFNSNQSINDILGEINVEKQYTWKVSGDTIFITDKKKEVK >gi|225935329|gb|ACGA01000063.1| GENE 23 28037 - 31456 3268 1139 aa, chain + ## HITS:1 COG:no KEGG:BT_0754 NR:ns ## KEGG: BT_0754 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1139 1 1138 1138 2041 91.0 0 MRLIVKRILLMLLLLSVGNVLTGGFQLSAQEQSKRFSVKVDNITLKEAIEVIKKQGNYSF LIRNNDIDLNRKVSVNINNGTINDVMGQLLAGTDVSYEVNGTRVIMFHAVTPQKEQEKAF VLKGRITDPTGEGVIGANVKVVDSTEGTITDMDGNFSLMVTPNARLSISYIGYATQEVVV KDQRPLNVTLKEDTRLIDEVVVVGYGVQKKANLTGAVSSVKMDDVLGDRPVVSVSDALKG AMPGLQITGNSGRPGEEMSFNIRGVNSLDKNGKPLVLVDNVEMDINMLDPNDIESVTVLK DAASSAIYGARAAFGVILITTKKGSDATKLSINYSNNFSFSRPANMPHKATPLQTVQAYK DMGTINYQSGQNVDTWLGLLKEYNANPSAYPDGYAMVDGLRYSLAETDLFDDMMETGFQQ THNISVGGGTKDISYRFSFGMVNENGVLASDKDTYKRYNVSSYLRSDVFSWITPELDIKY TNSNSSLPETSGGYGIWGAAVAFPSYFPTGTMNIDGEELPINTPRNLIDLAYPTTIQKNN LRIFGKVTISPLKNVKLIGEYTFNHLSNEKTKFEKKFYYAHGGNFVKEVSTANSKYEYSN GITDYNALNFYANYNNSWGKHDVTVMGGFNQESSDYRYAEMSRMNMINEDLPSISQSTGD YFAKDKFERYTVRGLFYRINYSFAGKYLLETNGRYDGSSKFPKNSRFGFFPSVSAGWRIS 
EEAFMKPLNSVLSNLKLRGSWGNIGNQSISPYAYIPGMDAEQAYWTVSGIKVTTLKPAAL VSNSFTWEKVTTIDVGFDLGLLDNRLNMVFDWYRRDTKGMLAPGSELPGVLGASAPLQNT ADLRSKGWEISVDWNDQIGKVKYSIGFNLYDAKTKITKYNNETGLFGKDKNNNETYRVGM ELGEIWGYVTDRLYTVDDFDADGKLKPGIAKVEGYNPNPGDILYKDLDGNNIINSGTSTT KDPGDRKIIGNNTRRYQYGIHGSASWNGFSLSFLLQGVGKRDLWVMNDLFYPHYDAWTTV YDSQLNYWTPENTNSYFPRIYEKAAGNTDANTRVQTRYLQDGSYLSIRNITLSYNFPSKW MSKIGVNNLAVFFSGENLYTFDHLPKGLDPERSVTDDLGQRGFTYPYMRQYSFGINLSF >gi|225935329|gb|ACGA01000063.1| GENE 24 31485 - 33299 1747 604 aa, chain + ## HITS:1 COG:no KEGG:BT_0755 NR:ns ## KEGG: BT_0755 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 604 1 601 601 1118 89.0 0 MNKYILCVSVSLCLLMSSCLNDEFLEVYPKGQQTEASVFTTYDNFKTYTWGLYNVFFGYT YDTGQTDEIFRGDFESDNMIKGLDGYEGQWAYRKAKATDESKDWDYDYIRRVNLMLDNID HSEMSETEREHWRSVGYFFRSYKYFQMLSKFGDIPWVEHALTEESPELYGKRDSRDLVAS NILSNLKYAETHIGSNIEADGKNTINMYVVKALISRFALFEGTWRKYHGLSGADTYLEEC ARASEEVIKQYPNVHPKYDELFNSETQDGVTGILLYKAYETGQLMHGLTRMVRTGESYIE ATKDAVDSYLCTDGRPVSTTTSHYGGDKNMYGQFRDRDYRLYLTICPPYMVKKENGPSTA DWKYTDNAQDREFIDLMATISGETYHRLPSSNFKGFTVQGQPHFKNMNWGQGWNASQMGF WVWKYYNTHTVATNANGVNTTDAPLFRIGEVMVNYAEAMCELNKLDQAAADKSINKLRAR ANVAKMVVNDINDAFDPKRDPSVPALLWEVRRERRVELMGEGFRLDDLRRWKKGDYVNKQ PLGAYVTGASAKNLKVTGGAGADEGYVYFFDTPLGWQEHYYLYPLPLKQLALNTNLEQNP VWTK >gi|225935329|gb|ACGA01000063.1| GENE 25 33496 - 34812 830 438 aa, chain + ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 12 353 28 361 503 136 32.0 7e-32 MRPELGCYGVDVVKTPNMDRLAASGVLFQNAYCNIPVSGASRASLLTGVYPHYPDRFISF SAYASKDCPEAIPISGWFTRNGYHTISDGKVFHHISDHADSWSEPPYRNHPDGYDVYWAE YNKWELWMNSESGKTINPKTMRGPFCESADVPDTAYDDGKLANRAIRDLKRMKEAGKPFF LACGFWKPHLPFNAPKKYWDLYKREEIPLATNRFRPEGLPEQVRNSSEIYAYARVTDTSD IDFQREVKHGYYACLSYVDAQIGKVLDALDELGLSDNTIVVLLGDHGWNLGEHDFIGKHN 
LMDTSTHVPLIVRVPGMKKGKTKSMVEFVDLYPTLCELCKLPVPAEQLSGQSFAGVFKNL KAKTKGEVYIQWEGGDNAVDRRYSYAEWMKGDVKKASMLFDHQIDKKENKNRVDEKKYKN KVESLSSFIRIKKSSLKK >gi|225935329|gb|ACGA01000063.1| GENE 26 34814 - 36862 1558 682 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 28 583 6 552 570 164 27.0 5e-40 MRTVFLQSSRLVSVVLFMLCGMSMFAQRQDILLNNDWNFRFSHQVQKGTEVRVDLPHTWN AQDALSGKIDYKRGIGNYEKNLFIRSEWKGKRLFIRFEGVNNIADVFVNRRHIGEHRGGY GAFIFEITGKVEYGKENSILVRVNNGEQLDIMPLVGDFNFYGGIYRDVHLLIIDETCISP LNYASPGVRLIQDSVSHKYAKVRAVVDLSNGGSSNREVELNVRLLDGQRVVKEGTKKVNL SGNAAMQQEFTFEIDQPHLWNGRQDPFLYQAEVILSRNGQMVDRVIQPLGLRFYRIDPDK GFFLNGKHLPLQGVCRHQDRSEVGNALRPQHHEEDAALMLEMGVNAVRLAHYPQATYFYD LMDKNGIIVWAEIPFVGPGGYNDKGFVDLPAFRANGKEQLKELIRQHNNHPSICVWGVFN ELTELGDNPVEYIKELNVLAHQEDPTRPTTSASNQMGDLNFITDAIAWNRYDGWYGGTPA DLGKWLDRMHKDHPEICIAISEYGAGASIYHQQDSLVKTVPTSWWHPENWQTYYHIENWK TISSRPYVWGSFVWNMFDFGAAHRTEGDRPGINDKGLVTFDRKVRKDAFYFYKANWNREE PMLYLTGKRNTVRTQRLQTITAFTNLSGAELFVNGKSYGKAIPDSYAILEWKNVELEPGE NEIKVVSTNKKLPLSDSFHCRL >gi|225935329|gb|ACGA01000063.1| GENE 27 37138 - 37305 155 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174717|ref|ZP_05761129.1| ## NR: gi|260174717|ref|ZP_05761129.1| hypothetical protein BacD2_22874 [Bacteroides sp. 
D2] # 1 55 1 55 55 95 100.0 9e-19 MVLSIDNDGKRKRIPQYTMAAQSVSPEYSAINSGNISPCFREANPNTIIAKVIAK >gi|225935329|gb|ACGA01000063.1| GENE 28 37567 - 38460 704 297 aa, chain + ## HITS:1 COG:PA1461 KEGG:ns NR:ns ## COG: PA1461 COG1360 # Protein_GI_number: 15596658 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Pseudomonas aeruginosa # 152 272 127 245 296 92 42.0 1e-18 MKNLFRPLLVLTGCIVLASCVSQKQFKGLKTDYTKLQTEYQESQLQLRENQARSTSLEER LAEAQRNNSELRKSLSDMQSILNKSLTQSSQGNMNMGKLIDEINASNRFIQHLVKEKNRS DSLNLVMVNKLTRSLTREEMRDLDIKVLKGVVYISLADNMLYKSGSYEISNKAEAVLSKI AKIIQDYKDYDVLVEGNTDTDPISRTNIRNNWDLSALRASSVVQALQNIYGINPKRLTAA GRGEFNPVAGNDTAEGKARNRRTEIIITPKLDQFMDLIDQAPDNSSSGIVGDENTKE >gi|225935329|gb|ACGA01000063.1| GENE 29 38660 - 39244 202 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160884512|ref|ZP_02065515.1| ## NR: gi|160884512|ref|ZP_02065515.1| hypothetical protein BACOVA_02496 [Bacteroides ovatus ATCC 8483] # 2 194 90 282 282 356 100.0 5e-97 MKKLLFLIMTALIVVGCSKSDEEKSDIQEIWLNGYKALTPESDYENISTTFLLFKADNNE EFDVKKQTFSGNILDYQKIQDETWNLLSKSKIKKKDGSIVDATYSIFASSIKETYTSAKV KIGKYFMVAIYSDSKTGYHWLYSNKYACQYVDIPYRYNPWDYSVVFPCDLHNYGKIEWVS WNQTPYPYEFEFSK >gi|225935329|gb|ACGA01000063.1| GENE 30 39752 - 40054 250 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174720|ref|ZP_05761132.1| ## NR: gi|260174720|ref|ZP_05761132.1| hypothetical protein BacD2_22889 [Bacteroides sp. D2] # 1 100 1 100 100 156 100.0 4e-37 MANLLLIRDLCEKNKIKIRELASRIGKDESSIQSMIRTGSTNTKTLEAIAEVFNVSPGIF FDNPLESNTENSLNKEAEIAYLKKILEEKERLIQVLLSKK >gi|225935329|gb|ACGA01000063.1| GENE 31 40167 - 40394 185 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174721|ref|ZP_05761133.1| ## NR: gi|260174721|ref|ZP_05761133.1| hypothetical protein BacD2_22894 [Bacteroides sp. 
D2] # 1 75 1 75 75 142 100.0 6e-33 MAKKTLKRNEDGEKLRMYLIGLPLKDSSKMVTKLAEACKVPLHTVHNWRAGLCRIPELAK DKIEEVTGVKIFHVD >gi|225935329|gb|ACGA01000063.1| GENE 32 40419 - 40967 468 182 aa, chain + ## HITS:1 COG:no KEGG:HSM_0868 NR:ns ## KEGG: HSM_0868 # Name: not_defined # Def: putative antirepressor protein # Organism: H.somnus_2336 # Pathway: not_defined # 21 176 26 179 279 82 28.0 5e-15 MNTKIIARVNNVDILSTGDEQFVAIRPICEVLGIDPEGQRQRIERDEILGPAACLIKAIG KDGKSYEMYAIPYCYVFGWLFSIDISRINENVKASVLEYKLACYKALFTHFTESQTFLKQ KQTVIEQKVLKCQECQCRFKEAQKLMNEAKTELNQVMKITIDDWRANNCRLDLPFVSGEM DD >gi|225935329|gb|ACGA01000063.1| GENE 33 41109 - 41387 151 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174723|ref|ZP_05761135.1| ## NR: gi|260174723|ref|ZP_05761135.1| hypothetical protein BacD2_22904 [Bacteroides sp. D2] # 1 92 1 92 92 159 100.0 7e-38 MSKIYIVTKRESGIYEEDGVWFSILAAFDTRNKAEEYLKKYVKTAPKEAYYTFYRIESVP LFLSSHKVKINRPKHTTYPIGELVKLKISGEK >gi|225935329|gb|ACGA01000063.1| GENE 34 41406 - 41663 149 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174724|ref|ZP_05761136.1| ## NR: gi|260174724|ref|ZP_05761136.1| hypothetical protein BacD2_22909 [Bacteroides sp. D2] # 1 85 1 85 85 161 100.0 1e-38 MSKKISIKVTEAQPLPCPYCNGFYGYQYSDLFRMSYTSVHNSDGTYSGGEYSDGVSLNKS KTAYCVNCGTKLPFTLIREGEEQVE >gi|225935329|gb|ACGA01000063.1| GENE 35 41681 - 41947 251 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174725|ref|ZP_05761137.1| ## NR: gi|260174725|ref|ZP_05761137.1| hypothetical protein BacD2_22914 [Bacteroides sp. D2] # 1 88 1 88 88 177 100.0 1e-43 MKSTITTPDELTTLRIEGSSGTYKIFSSFRPMESPAFVDAVDRKYNLAEIKNLSGGKGYF LVHLNREQQETIQEDLNAILCDSVPCLL >gi|225935329|gb|ACGA01000063.1| GENE 36 42146 - 42328 180 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174727|ref|ZP_05761139.1| ## NR: gi|260174727|ref|ZP_05761139.1| hypothetical protein BacD2_22924 [Bacteroides sp. 
D2] # 1 60 1 60 60 91 100.0 2e-17 MKDKNLKYIAHAIIVVAFMGLIAFVIYYTGKTAFLWLLLFVFLYQPWGDLKTKQEENNEK >gi|225935329|gb|ACGA01000063.1| GENE 37 42351 - 45254 1760 967 aa, chain + ## HITS:1 COG:no KEGG:PRU_0854 NR:ns ## KEGG: PRU_0854 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 959 7 922 922 552 36.0 1e-155 MIKAEDIYKVTNNGLDIILHYYPQARDCVGTNRHFKRRPSEDDASACIKLFGKEGSQQVY KVTDFGDTGTAQSPVDICMYEEGLRFNEAILKLASMYNVTDELNRNVNKPDIRKVPASQD QKDGTKIFELADHLAPDQLRILGPRVTQENAEALHWYSAKYIGYVKNREVTYKYATATYP IFMRECLVKPAEGDTPEVKFYKIYEPLNPDKQWRFSYTPEGVKPKDYINGLSELKALYRE FNSREEAAFKKNPANAEKPYKEQKLQEAFICSGERDALCVKSLGFSPIWFNSETYKLSEQ DYKEIMKYVEVLYNIPDIDTTGRVKGTELALRFIDIHTIWLPAWLTTYRDQRGKPRKDFR DFMELRSKNEDFRNLMTLAMPAKFWYSKFNEKSRQWDHNIDADCLHYFLRLNGFYSLHDE NSSSTKYIRITGNIVKLIKAKDIRKFIRGWAQDSFLSRDIRNLILNSPKLSDTALDNLQE IELDFTNYTHNTQMFFFPGCSMEVSGIGIKEHPANGSTLSHYVWEENVLKHKVRLMEDMF TISRKKDIEGNDVFDIRINAVPSNFFGYVINSSRVYWRKELEYNFDNKSVGEAESYREKH KFDIEGEGLMAEEVAEQKKNLINKIFTIGYMLHRYKSPSRAWAPQAMDNKIGEDGECNGR SGKSFMFKALSYFMKTVKLSGRNPKLMDNPHVFDQVNQHTDFILVDDCDRYLNTGLFYDI ITSDMTVNPKNNQSFTIPFEESAKLGFTTNYVPIDFDPSTEARLLYLVFSDYYHQRTEDN DYRETRSIRDDFGKDLFSKTYSENEWNADINFFLQCCRFYLSLCEESIKLLPPMENIIRR KYKADMGNNFEDWANSYFSPDSEHLDSFIVREKAFADYKSFSGVNKITMQRFTKALKGFV ALCPYIEELNPKDLCNSQGRIVRKDNDGKAADMIYLRSCGTAETAAGGGTEPADPTLMFV PDERPDE >gi|225935329|gb|ACGA01000063.1| GENE 38 45281 - 45565 229 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174729|ref|ZP_05761141.1| ## NR: gi|260174729|ref|ZP_05761141.1| hypothetical protein BacD2_22934 [Bacteroides sp. 
D2] # 1 94 1 94 94 186 100.0 3e-46 MSEFTALWNSGERFRKFAEQVYRYLERMKPGTVLMLERYSGEQLEWIIKTACVFILEGDN SLEYEFNEDYTAVVHRYVPPDVKKWILSRCKHRV >gi|225935329|gb|ACGA01000063.1| GENE 39 46160 - 46648 356 162 aa, chain + ## HITS:1 COG:no KEGG:BF2331 NR:ns ## KEGG: BF2331 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 160 1 163 163 170 52.0 1e-41 MITTRIQIESYLAEYVRGKYYDETVGTVRFPSSSDIYVTIYDLMEKRPVNCSADRGNLEF MLPDRREANFAGGKSPEQFNYISVRGTAILEKRLRALMWAELHELMDENKHLHGIEFKET VFTFLKKYNISSIQEDGLLKNYQRWRDSFRRKKKRAYNRKKV >gi|225935329|gb|ACGA01000063.1| GENE 40 46769 - 47170 272 133 aa, chain + ## HITS:1 COG:no KEGG:BF2332 NR:ns ## KEGG: BF2332 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 132 14 139 143 65 34.0 7e-10 MSRKLISSAHSLQLVPVYNIIHFGVVRSKVVIRSIGKPDILTIVPGTLKPGDSKNEDVYT KKHTFKLADVSRNKTLYLENLKATPFVALYIDETGNTRVSGSPDYPLTFSFEIGGGLYNC TLSGTGPGVDAFL >gi|225935329|gb|ACGA01000063.1| GENE 41 47232 - 48098 586 288 aa, chain + ## HITS:1 COG:sll1703 KEGG:ns NR:ns ## COG: sll1703 COG0616 # Protein_GI_number: 16330327 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Synechocystis # 60 277 310 533 610 75 27.0 8e-14 MDKIQEIFTAPWAIADNDYYRLLSLLVPCVAAGNLDAIEKRLNNNKITAYATTPYLANRW ELDDDTLPADSVAVIILEGTLYSWETYRLEQQLRDISGNPKICGAVLWINGPGGMVAHVD LAAKMIAESSKPIATYVAGTMGSAHFWLGTAAGRTFIASPMCEVGSVGIMLTYQSFKEYF RKQGIDYREIYPDSADLKNYETRAIEKDNNEEPIKQRLVVMHRIFCDAISRNLGIAYDPE LPLFRGQIFTGDVAVANGYIDQFGTLEDAVKWVLAQATVRKVNEMYNI >gi|225935329|gb|ACGA01000063.1| GENE 42 48123 - 50324 2106 733 aa, chain + ## HITS:1 COG:no KEGG:BF2334 NR:ns ## KEGG: BF2334 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 733 5 738 738 976 66.0 0 MKFKSFSAHILALLGLSEWSKIEDKNSITAEEVAKLKNYGFTEKFLTDFKASLENDFQDE 
AEDGNQGEETEEPRTTAFLRGLLGDTAARLTQAQEQLEALQTQQRDENRNNTALIAKKDA EITKLSGIIAQLSAAAEDDPGKGKQHNAQADGKGAFNLQDEKQLGGLQGEMFSLDRPYNL RAKAALMEAAGFEMIALPKASSIDYSRLKEDLGAFYRIPWQQRLQSFLMELPSIESIFPL ESGYQDLATLVNIWLGEFSQAGNEESDFDKVTKGSYEFDDETLRMFNVMFAHRFKNLKAL EKTWIGTLNKEGSNPIKWSFIEYILAETAKKLHNEREQRRINGIRKDPNLNEPGKALAAA DGLYEFLNKKVNGHTDINNGKFVYQIKPFELGELTEANIGEKVYKGTSMIPAVLRDSGNL ALYMPSHFIVLYHKYNELHYGQNQDYKANIMYVKEYPAVKIIPVPNADNHHRIFWTFEGN IKTYEDKPGEMTAFNLEQEDWSLKVWSNWRESIWAIAVGFKYTKKEDMDYTRQMIFCNEY DRPASYFVDADKDKNPSAKLHTSIVTVANTAEFTITDIEDAPVGTVISLKCGSVDKGVKI EKSGNFELISEAWQPGKGDVIKLMKRADGKFIEIGRENASSDALQFAPDETEPSLLDGEV FVTGENTKATAITNFTDAEAGVVYTIYGNGSEYASTIAAGGNFVLTEAMTLSEGKFIKLA KAADGKFYEVARG >gi|225935329|gb|ACGA01000063.1| GENE 43 50384 - 51190 613 268 aa, chain + ## HITS:1 COG:no KEGG:BF2421 NR:ns ## KEGG: BF2421 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 267 4 271 272 332 61.0 1e-89 MTYVKASVRRPAGNPGNGIQPKDQLVIYDVDDILSFPQRNDAGVVIEDDIVMKAGRYAIG IYLTPGTAEISSNSDGETDAEGYTPSVKFNHPGNEQEIREFKTNWLSKKCIVVLRYCSGK PADLIGTPCNPCKLSVSYTGSNESNTNELTFTQISKGDDIAIYRGTDTLEEPVAVVEAGA TDIDYQTDGQYQLSAGAAKIAGVTGGSHGSVITLMGCSGVAPTVETGGNFLLKGGKTFTA SEGSQLTLRAFNDGSEAMKWIEQSRYEA >gi|225935329|gb|ACGA01000063.1| GENE 44 51306 - 52253 1026 315 aa, chain + ## HITS:1 COG:no KEGG:BF2422 NR:ns ## KEGG: BF2422 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 314 1 310 311 254 47.0 3e-66 MKQEIITYLAGPRNFIQGVELYEKYGINRMLKKSFRRQGETETMKAILLEELRKLAGLSE REFKTIRRNSKQPAAVKMEPVREETPKTPVKYSDDLLLELAESFGVSVEELVSSDFRDKV LSMDENADRVEELEEELEEAEKRYKAAPETVTKMIRFREKFTFLNSPDCPDILKILVSDM FTAYGKYKEAFARLEATPDDVSSLSTAQEAQAVVENFIANRDMWDELEYYRENGKILGKC EKVKSLSVRKGVENLSDIDIQKALNNARANLSKNKAKLEQAGDDEKKKTSALALIQKWET TQKAIEEEIEARKKK >gi|225935329|gb|ACGA01000063.1| GENE 45 52410 - 53084 347 224 aa, chain + ## HITS:1 COG:mlr7520 KEGG:ns NR:ns ## COG: mlr7520 COG0863 # Protein_GI_number: 13476248 # Func_class: L Replication, 
recombination and repair # Function: DNA modification methylase # Organism: Mesorhizobium loti # 4 213 21 265 377 104 31.0 1e-22 MITNKIYNEDCLEALKRVPDNSVDCIITDPPYFLGMTHNGQKGSFKDLSICKPFYRDLFL EFNRVKKPGACVYFFTDWRGYAFYYPLFDLYIGASNMIVWNKQSGPGNHYAFIHELILFH CGKGVSIGATNIIDNIRSFASGAKLVEGEKVHPTQKPVALIRKLIEDSTKPGDLILDTFG GSGTTAVAAIESGRNFVLMEQDEIYYFTAQKRIKDAYERFNGGG >gi|225935329|gb|ACGA01000063.1| GENE 46 53053 - 53403 322 116 aa, chain + ## HITS:1 COG:no KEGG:BF2423 NR:ns ## KEGG: BF2423 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 10 115 10 112 113 87 47.0 2e-16 MRMNDLTVVDSIYLDAQQKEDVRRLSSLGYSPKDIAVSLGLSLEDARLFVRDAETVGTSV NFLIREGILVARAAPEIKLHEAAEGGNVEAIKQLEAVRKRHTFERLIEQMDDDEFN >gi|225935329|gb|ACGA01000063.1| GENE 47 53384 - 54064 567 226 aa, chain + ## HITS:1 COG:no KEGG:BF2338 NR:ns ## KEGG: BF2338 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 220 1 220 225 293 67.0 2e-78 MTTNLIKPSRIDFDKVDINQIQRILSTGTLEALAPDEREYYSLMEMVRGLRARMRINGKL VTKAGIIRLLKSEPYGLSDWMARQVYADSLNFFYTQDNVRPQAFANLYAEKAENWANTVF LMGNVKEAKNLLKLAAELRGCYKDQQAEIPEELLSQKSTVIYTTSRKDLGVPEIDRKELE EFIDAIPEIPVIVRDNIKEDARIKAFDLKKRMLYDIKEFGEDNEGE >gi|225935329|gb|ACGA01000063.1| GENE 48 54273 - 55745 1016 490 aa, chain + ## HITS:1 COG:no KEGG:BF2339 NR:ns ## KEGG: BF2339 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 471 83 549 568 684 67.0 0 MPAVQKGWQLMGLIEGVHYVKDTRPPESWRRKCSVIVDDYKHVYSFWNGCVIFMGSLDNP SLLAGKSVIHLFYDEAKYDKEMKVNRAMPILRGDAITYGHSHLFLGITITTDMPDIDENE YDWFFRYVKQMDPERIIKIVQAASMRNDLVISLLKEERKNKPSPLKLKRLKRDIEYYDRA LLKLRKGQTFFLNASSFANVEILTIEYLRRLYNGTLELHEFKKSVVGMRPGLRRDLRFYV LFGEGHKYYNGTASGEAAYSSRELRYLHHDKTIEGGMDFGNMLSLVIGQADGAYYRVHKN FFEIPPGWFREIADQFLTFFQNHEYKELDLYYDRAGNNFEKQKEDYAGKIKDAIEKDGSG NRTGWIVNLKSRKQAVIRQDAEYDFMQEIMGGTNKNLPILLVDAVNCKEMVSSVEKAKAE IKYRGNSKVVFKVKKSEKLAPKKLPMLSTNFSDAFKYLLMRPGWIALVRGKRTLQADSFV DQWIENRHKR 
>gi|225935329|gb|ACGA01000063.1| GENE 49 56191 - 56709 318 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174741|ref|ZP_05761153.1| ## NR: gi|260174741|ref|ZP_05761153.1| hypothetical protein BacD2_22994 [Bacteroides sp. D2] # 1 172 1 172 172 290 100.0 3e-77 MKKLLLITVLAILVVAATAQETRKTFCEIVGTGKVLSSKVKIQIDFGQKTSYFGKYKTFM VDESGKKIEFNSMVDAMNYLAKFRWKFEQAYVVTNESTNQNVYHWLLSKDIVSDDEIREG IITQKDFEDMEKAAMEDKENKNEEVEKKVPLFMRNMKKESDEEDEAQKRYEP >gi|225935329|gb|ACGA01000063.1| GENE 50 57072 - 57323 59 83 aa, chain + ## HITS:1 COG:no KEGG:BF2297 NR:ns ## KEGG: BF2297 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 83 64 143 146 62 44.0 3e-09 MKKELQSGTSYVPSFRTGSTDVNTIQHRYFQEPKHECTVCSTSGAYYLSAIACFCLTFIY PPAVIGAVICVCRAKKARKGGRK >gi|225935329|gb|ACGA01000063.1| GENE 51 57320 - 57670 228 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174743|ref|ZP_05761155.1| ## NR: gi|260174743|ref|ZP_05761155.1| hypothetical protein BacD2_23004 [Bacteroides sp. D2] # 1 116 1 116 116 224 100.0 2e-57 MISYFIELNEYKPQNRKCAEMAEFANQFGNTLCPDKISFDAFKTELEAKVKELNEKYPKT MPLKISSGSGFIHIDQDTKTHNNGCDKPVAYFFIYRVKRIYRFSERPQIEKKGGAE >gi|225935329|gb|ACGA01000063.1| GENE 52 57667 - 58023 256 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174744|ref|ZP_05761156.1| ## NR: gi|260174744|ref|ZP_05761156.1| hypothetical protein BacD2_23009 [Bacteroides sp. D2] # 1 118 1 118 118 210 100.0 2e-53 MIYTEYQQVLLTQLQNNDKRIEEIKKEKEEIQEMFLQESKFKPGDLIQIDYKISNATFKV RGWIFRITFWRNRPYYHLNLPKKDGSRGLRVKSVCDGVLESITSISHIKLEDLKGGAR >gi|225935329|gb|ACGA01000063.1| GENE 53 58020 - 58394 125 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174745|ref|ZP_05761157.1| ## NR: gi|260174745|ref|ZP_05761157.1| hypothetical protein BacD2_23014 [Bacteroides sp. 
D2] # 1 124 1 124 124 242 100.0 6e-63 MNTNNPDILFFVKREYGTPSIELRAYKVEKVNNEFAFLELERLRLVVFSGNFQSVSLHHE YGKNNCMYNSANNIPDLMKDMKRWQLSPIDRRNYERFRKVALGIYRQAGIIDFTTLETTP IKNV >gi|225935329|gb|ACGA01000063.1| GENE 54 58408 - 58641 223 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174746|ref|ZP_05761158.1| ## NR: gi|260174746|ref|ZP_05761158.1| hypothetical protein BacD2_23019 [Bacteroides sp. D2] # 1 77 1 77 77 134 100.0 2e-30 MKDIEVNGAHITDESAGILKQWQVKTEPVSACYIRVIEETIDDLTDEGGEPLSAEKIVER IRTLRMMKKDIEKLSNP >gi|225935329|gb|ACGA01000063.1| GENE 55 58798 - 59025 174 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174747|ref|ZP_05761159.1| ## NR: gi|260174747|ref|ZP_05761159.1| hypothetical protein BacD2_23024 [Bacteroides sp. D2] # 1 75 1 75 75 142 100.0 5e-33 MKIGTDKWKHFGINYAICALLGNYGVPFALGASLGKEYGDKMSPGNKWDWKDILADLAGI VAGYLTHVCTVWTIM >gi|225935329|gb|ACGA01000063.1| GENE 56 59045 - 59413 291 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174748|ref|ZP_05761160.1| ## NR: gi|260174748|ref|ZP_05761160.1| hypothetical protein BacD2_23029 [Bacteroides sp. D2] # 13 122 13 122 122 191 100.0 1e-47 MTETIITAIITALCTGGLTWLFTLRYTRKQAEADAMKSVQEVYQELIEDMKNDRKELKNA YQEQKKRFDEVDNKYKEVLQKCNEMEKAIKQNARVMDTMKPFLCGVKNCPNRKSITFDTN NN >gi|225935329|gb|ACGA01000063.1| GENE 57 59427 - 59948 181 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174749|ref|ZP_05761161.1| ## NR: gi|260174749|ref|ZP_05761161.1| hypothetical protein BacD2_23034 [Bacteroides sp. 
D2] # 1 173 1 173 173 320 100.0 3e-86 MRHGIVHLLIFICFAACFYGCRSSRSVTRKTVTEATGEEKQTTTDGVIELARRDSSNEEH VLDVYREDSTHIRIDYDSLGRIKEIDFSNRKTEKRTGKNQSSSFQDHKETTSQAETVVTR KSDVKQQSQEKEKVTNGCSLWTFLKFMFFFLSFCLVHDNWGGIKSFIRRLWKK >gi|225935329|gb|ACGA01000063.1| GENE 58 60219 - 61583 1167 454 aa, chain + ## HITS:1 COG:no KEGG:BF2343 NR:ns ## KEGG: BF2343 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 453 1 449 450 677 69.0 0 MEIRRSGNFGIIDTGTDKGLISFSIGGRGKGWEPSSIQLNRRGAFFSRKISVNGTFIVPM GDNNDMPGDVMRLLDKFYAGEGIMGKIAGLQWGEGPRLYEDAIDEDNNRFYRRWKLDPEI TADLESWDYTTVLHRSLVDLTHMQGFFIKFVRNRAPRVGNPGRLVRLEHIPYQKARLVYP PDGEDEPQEVLVGDFPYPDPAYTYRYPIFDPAHPFKYPVSVKYYNIYSFCKDFMSTPRFL GALDWLELAGGLAAILIAYNENASAISLHIESPQSYWDRAEARIKQVCERTGEKYTAQML EDFKDEAMEKFASNITGRQNAGKYMHTTKFWNPEANNFEGWTVEPLDKKIKDYVDAQIKI SNKADAAATSGFGLDPVLSNLIIENKLSSGSEKLYSLKVYNASETAIPDMILCKPLQQYI NANFPGTATKVGLYRTIVEAEQNVSPSNRMKENA >gi|225935329|gb|ACGA01000063.1| GENE 59 61576 - 62277 369 233 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174752|ref|ZP_05761164.1| ## NR: gi|260174752|ref|ZP_05761164.1| hypothetical protein BacD2_23049 [Bacteroides sp. 
D2] # 1 233 1 233 233 479 100.0 1e-134 MRSLFFTPKPEDVPEEPVSDRQPEENRADNTPDKHIKARRTKNVHFDRRIKSELHLEECL PWHFEKGASYHCISHGDVDSLTYLRVIVKQQPVEYVLISTWCMAITDVKEVEKWLERKDI GHADFYVGEIFQGSYADVYLYLKKVAERFGSRVCIFRNHAKVMAGFGNAFDFVIESSANV NTNPRTEQTCITIDTGLARFYKEFYDEINNFTKDFDNWKPYTLKRDRANDEVI >gi|225935329|gb|ACGA01000063.1| GENE 60 62261 - 63211 694 316 aa, chain + ## HITS:1 COG:no KEGG:BF2432 NR:ns ## KEGG: BF2432 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 316 5 317 317 338 56.0 2e-91 MTKLFNKGGDGAGEIVRVLGLIDNDLDFTKWEPILPLGIRDLQAIIGTEPIDAVDKYYRE DHADGTESDGMAETLRLMQQAVAMFTWLKVIPTLDAQHGTAGRGKHLGENETGMTALQEF KDEENIRNLAYEAVDALVELMDREKFDFWMNGIKKKAINRLLIQNKETFDEYYNIGSHRL FLVLIPMIREVQDGQIIPVITRNRYNKLIEGDTVLTEKLLEYVRRPLALLTIKKAVERLP VEVLPSGIVQVQQSTTVRDKLRAEKEARQSVANSLEQDAAAYLDVLQDIIRELDAQSETV DYYIPGVTVQSKGITF >gi|225935329|gb|ACGA01000063.1| GENE 61 63221 - 64030 527 269 aa, chain + ## HITS:1 COG:no KEGG:BF2433 NR:ns ## KEGG: BF2433 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 268 1 266 266 241 45.0 2e-62 MEKFTYNSKTAEVPSCLDEVSSEQYRQFLILSVLMNRGTISPGQFRVKWLSFLLGMKADY TMYRREIIRELDGQLEKLDGFFSYTTGKEGERIVTPILKTGRNLMQDFGGWHGVGDMLNG LTFGNFCDCLDLLQQSKQAAEKDEPAINEIFQDITLKLYRYKDPEKMPAVPSLLAIHAVN FFSAVWEMVLSGPVYIGGEAIDFRILFQKLASEDRKADDKTGWTGIVFEVAASGVFGNKK EVDDTPFWDVLLYLYKCKFEYLHQKRNKK >gi|225935329|gb|ACGA01000063.1| GENE 62 64027 - 64461 414 144 aa, chain + ## HITS:1 COG:RSc3192 KEGG:ns NR:ns ## COG: RSc3192 COG3772 # Protein_GI_number: 17547911 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Ralstonia solanacearum # 9 135 19 146 153 87 35.0 8e-18 MRTTTGTKNKIKKFEGLRLKAYVCAAGVCTIGYGHTAGVKPGDVITEPQADAFFESDIRA VENQVNALPLHLGQYQFDAVVSFCFNVGIGKFKNSTLYKKIRADAYDSSIPAEFKKWIYG GGKILPGLVTRREWEAKRYQGLTI >gi|225935329|gb|ACGA01000063.1| GENE 63 64458 - 64725 150 89 aa, chain + ## HITS:1 COG:no KEGG:BF2348 NR:ns ## KEGG: BF2348 
# Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 88 1 88 147 107 53.0 1e-22 MIDIKVYREYWEGVQKRVPEIKKVLPVTIDEEMSKTIQGLSKEECPVLFILIPSGTGASL SADNVRENNLCVIFLMSKYDPQRKGAYET Prediction of potential genes in microbial genomes Time: Fri May 13 11:04:33 2011 Seq name: gi|225935328|gb|ACGA01000064.1| Bacteroides sp. D2 cont1.64, whole genome shotgun sequence Length of sequence - 118563 bp Number of predicted genes - 130, with homology - 127 Number of transcription units - 59, operones - 28 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 122 - 298 176 ## gi|260174758|ref|ZP_05761170.1| hypothetical protein BacD2_23079 2 1 Op 2 . + CDS 303 - 713 365 ## BF2349 hypothetical protein 3 1 Op 3 . + CDS 747 - 1052 267 ## RF_0889 hypothetical protein 4 1 Op 4 . + CDS 1059 - 1292 273 ## BDI_2430 hypothetical protein 5 1 Op 5 . + CDS 1304 - 1531 142 ## gi|237721585|ref|ZP_04552066.1| conserved hypothetical protein 6 1 Op 6 . + CDS 1528 - 1764 82 ## gi|260174763|ref|ZP_05761175.1| hypothetical protein BacD2_23104 + Term 1789 - 1823 5.3 + Prom 1786 - 1845 5.0 7 2 Op 1 . + CDS 1875 - 6176 3915 ## COG5283 Phage-related tail protein 8 2 Op 2 . + CDS 6186 - 8339 932 ## BF2351 hypothetical protein 9 2 Op 3 . + CDS 8275 - 8595 90 ## gi|237721589|ref|ZP_04552070.1| predicted protein 10 2 Op 4 . + CDS 8592 - 9749 596 ## BF2291 hypothetical protein 11 2 Op 5 . + CDS 9759 - 16547 4681 ## BT_4440 putative cell surface protein 12 3 Tu 1 . + CDS 16742 - 16927 69 ## gi|260174770|ref|ZP_05761182.1| hypothetical protein BacD2_23139 + Prom 16948 - 17007 3.5 13 4 Op 1 . + CDS 17257 - 18276 640 ## Dtox_4258 RNA-directed DNA polymerase (reverse transcriptase) 14 4 Op 2 . + CDS 18278 - 18772 463 ## gi|260174772|ref|ZP_05761184.1| hypothetical protein BacD2_23149 + Term 18779 - 18824 6.1 15 5 Tu 1 . 
- CDS 18821 - 19966 531 ## COG4974 Site-specific recombinase XerD - Prom 20047 - 20106 4.2 - Term 20078 - 20115 4.0 16 6 Op 1 . - CDS 20310 - 20669 377 ## gi|260174774|ref|ZP_05761186.1| hypothetical protein BacD2_23159 17 6 Op 2 . - CDS 20689 - 21174 247 ## gi|260174775|ref|ZP_05761187.1| hypothetical protein BacD2_23164 18 6 Op 3 . - CDS 21220 - 21867 254 ## Coch_0655 putative phage repressor - Prom 21887 - 21946 5.7 + Prom 21836 - 21895 8.2 19 7 Tu 1 . + CDS 21939 - 22106 197 ## gi|260174777|ref|ZP_05761189.1| hypothetical protein BacD2_23174 + Term 22115 - 22165 6.1 20 8 Tu 1 . - CDS 22095 - 22424 286 ## gi|260174778|ref|ZP_05761190.1| hypothetical protein BacD2_23179 - Prom 22448 - 22507 7.5 21 9 Tu 1 . + CDS 22781 - 23404 244 ## - Term 23213 - 23260 1.2 22 10 Tu 1 . - CDS 23285 - 23692 349 ## gi|260174781|ref|ZP_05761193.1| hypothetical protein BacD2_23194 + Prom 23679 - 23738 4.9 23 11 Op 1 . + CDS 23833 - 24051 179 ## gi|260174782|ref|ZP_05761194.1| hypothetical protein BacD2_23199 24 11 Op 2 . + CDS 24048 - 24272 213 ## BF2474 hypothetical protein 25 11 Op 3 . + CDS 24276 - 24623 198 ## BF2473 hypothetical protein + Prom 24643 - 24702 1.8 26 12 Tu 1 . + CDS 24769 - 25011 251 ## BF2472 putative recombination protein + Prom 25091 - 25150 3.2 27 13 Tu 1 . + CDS 25201 - 25428 175 ## BF2407 hypothetical protein + Prom 25432 - 25491 5.2 28 14 Tu 1 . + CDS 25552 - 25761 136 ## gi|260174787|ref|ZP_05761199.1| hypothetical protein BacD2_23224 + Prom 25765 - 25824 3.5 29 15 Op 1 . + CDS 25914 - 26450 243 ## gi|260174788|ref|ZP_05761200.1| hypothetical protein BacD2_23229 30 15 Op 2 . + CDS 26447 - 27817 713 ## COG0305 Replicative DNA helicase 31 15 Op 3 . + CDS 27828 - 28112 324 ## BF2465 hypothetical protein 32 15 Op 4 . + CDS 28125 - 28475 182 ## BF2471 hypothetical protein 33 15 Op 5 . + CDS 28496 - 29410 568 ## COG1032 Fe-S oxidoreductase 34 15 Op 6 . + CDS 29460 - 29618 158 ## + Prom 29623 - 29682 3.0 35 16 Op 1 . 
+ CDS 29710 - 30123 165 ## gi|260174794|ref|ZP_05761206.1| hypothetical protein BacD2_23259 36 16 Op 2 . + CDS 30141 - 30626 228 ## ABAYE2755 putative methyltransferase 37 16 Op 3 . + CDS 30638 - 30871 294 ## gi|260174796|ref|ZP_05761208.1| hypothetical protein BacD2_23269 38 16 Op 4 . + CDS 30908 - 31246 249 ## gi|260174797|ref|ZP_05761209.1| hypothetical protein BacD2_23274 39 16 Op 5 . + CDS 31249 - 31743 400 ## gi|260174798|ref|ZP_05761210.1| hypothetical protein BacD2_23279 40 16 Op 6 . + CDS 31736 - 31936 223 ## gi|260174799|ref|ZP_05761211.1| hypothetical protein BacD2_23284 41 16 Op 7 . + CDS 31938 - 32399 343 ## Coch_0885 putative prophage Lp2 protein 26 42 16 Op 8 . + CDS 32396 - 32689 193 ## gi|260174801|ref|ZP_05761213.1| hypothetical protein BacD2_23294 43 16 Op 9 . + CDS 32701 - 32931 251 ## gi|260174802|ref|ZP_05761214.1| hypothetical protein BacD2_23299 + Prom 32949 - 33008 2.3 44 17 Tu 1 . + CDS 33050 - 33745 298 ## PSPA7_2372 hypothetical protein 45 18 Tu 1 . - CDS 33880 - 34113 74 ## + Prom 33786 - 33845 5.3 46 19 Op 1 . + CDS 34016 - 34195 94 ## gi|260174805|ref|ZP_05761217.1| hypothetical protein BacD2_23314 47 19 Op 2 . + CDS 34206 - 34553 309 ## gi|260174806|ref|ZP_05761218.1| hypothetical protein BacD2_23319 + Term 34625 - 34656 -0.1 + Prom 34598 - 34657 1.5 48 20 Op 1 . + CDS 34694 - 34888 130 ## gi|260174807|ref|ZP_05761219.1| hypothetical protein BacD2_23324 49 20 Op 2 . + CDS 34900 - 35184 201 ## gi|260174808|ref|ZP_05761220.1| hypothetical protein BacD2_23329 50 20 Op 3 . + CDS 35141 - 35455 252 ## gi|301164546|emb|CBW24105.1| putative endonuclease + Term 35667 - 35711 -0.2 - Term 35427 - 35463 3.2 51 21 Op 1 . - CDS 35516 - 35914 296 ## BVU_2807 hypothetical protein 52 21 Op 2 . - CDS 35955 - 36140 226 ## gi|260174812|ref|ZP_05761224.1| hypothetical protein BacD2_23349 - Prom 36278 - 36337 8.7 + Prom 36230 - 36289 5.1 53 22 Op 1 . 
+ CDS 36483 - 36884 418 ## gi|260174813|ref|ZP_05761225.1| hypothetical protein BacD2_23354
54 22 Op 2 2/0.000 + CDS 36884 - 38503 1055 ## COG4626 Phage terminase-like protein, large subunit
55 22 Op 3 3/0.000 + CDS 38541 - 39791 750 ## COG4695 Phage-related protein
56 22 Op 4 . + CDS 39805 - 40398 373 ## COG3740 Phage head maturation protease
57 22 Op 5 . + CDS 40403 - 41650 1050 ## NMC0858 putative phage-related protein
58 22 Op 6 . + CDS 41652 - 41963 136 ## gi|260174818|ref|ZP_05761230.1| hypothetical protein BacD2_23379
59 22 Op 7 . + CDS 41960 - 42286 234 ## gi|260174819|ref|ZP_05761231.1| hypothetical protein BacD2_23384
60 22 Op 8 . + CDS 42283 - 42720 155 ## gi|260174820|ref|ZP_05761232.1| hypothetical protein BacD2_23389
61 22 Op 9 . + CDS 42720 - 43088 284 ## gi|260174821|ref|ZP_05761233.1| hypothetical protein BacD2_23394
62 22 Op 10 . + CDS 43122 - 43604 273 ## BF2452 hypothetical protein
+ Term 43625 - 43663 6.0
63 23 Op 1 . + CDS 43668 - 44417 317 ## gi|260174823|ref|ZP_05761235.1| hypothetical protein BacD2_23404
64 23 Op 2 . + CDS 44431 - 47601 2302 ## COG3941 Mu-like prophage protein
65 23 Op 3 . + CDS 47598 - 49733 871 ## BF2448 hypothetical protein
66 23 Op 4 . + CDS 49730 - 51001 673 ## BF2447 hypothetical protein
67 23 Op 5 . + CDS 50998 - 56559 3047 ## COG3291 FOG: PKD repeat
68 23 Op 6 . + CDS 56569 - 58890 1178 ## Cpin_3975 OmpA/MotB domain protein
69 23 Op 7 . + CDS 58890 - 59276 348 ## BVU_2812 hypothetical protein
70 23 Op 8 . + CDS 59273 - 59767 126 ## COG3023 Negative regulator of beta-lactamase expression
+ Term 59784 - 59828 5.1
+ Prom 59770 - 59829 2.1
71 24 Tu 1 . + CDS 59911 - 60267 240 ## gi|260174831|ref|ZP_05761243.1| hypothetical protein BacD2_23444
+ Term 60302 - 60336 -1.0
72 25 Op 1 . + CDS 60705 - 61619 326 ## GFO_0259 hypothetical protein
73 25 Op 2 . + CDS 61624 - 62040 303 ## gi|260174833|ref|ZP_05761245.1| hypothetical protein BacD2_23454
74 25 Op 3 . + CDS 62031 - 62534 254 ## gi|260174834|ref|ZP_05761246.1| hypothetical protein BacD2_23459
+ Term 62655 - 62699 2.2
- Term 62646 - 62680 3.6
75 26 Tu 1 . - CDS 62698 - 63936 476 ## BVU_0168 tyrosine type site-specific recombinase
- Prom 64018 - 64077 5.6
- TRNA 64079 - 64152 75.7 # Arg TCT 0 0
+ Prom 64208 - 64267 10.1
76 27 Tu 1 . + CDS 64440 - 65174 828 ## BT_0766 hypothetical protein
+ Term 65200 - 65241 5.3
+ Prom 65210 - 65269 10.0
77 28 Op 1 9/0.000 + CDS 65303 - 66310 574 ## COG0147 Anthranilate/para-aminobenzoate synthases component I
78 28 Op 2 . + CDS 66294 - 66890 366 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase
+ Term 66947 - 66995 1.3
+ Prom 67073 - 67132 6.5
79 29 Op 1 . + CDS 67177 - 67842 600 ## COG5587 Uncharacterized conserved protein
80 29 Op 2 . + CDS 67880 - 68326 578 ## COG2731 Beta-galactosidase, beta subunit
81 29 Op 3 . + CDS 68345 - 70357 2112 ## COG0296 1,4-alpha-glucan branching enzyme
+ Term 70378 - 70424 10.1
- Term 70359 - 70418 19.4
82 30 Op 1 . - CDS 70447 - 70935 448 ## BT_0772 hypothetical protein
- Prom 70973 - 71032 2.5
- Term 71000 - 71043 8.2
83 30 Op 2 . - CDS 71074 - 72771 1481 ## COG0366 Glycosidases
84 30 Op 3 . - CDS 72825 - 73631 496 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily
- Prom 73679 - 73738 4.3
- Term 73679 - 73721 -0.4
85 31 Tu 1 . - CDS 73798 - 75126 445 ## COG1145 Ferredoxin
- Prom 75271 - 75330 3.4
- Term 75277 - 75321 4.1
86 32 Op 1 . - CDS 75350 - 76285 827 ## COG2006 Uncharacterized conserved protein
87 32 Op 2 . - CDS 76300 - 76809 518 ## BT_0777 hypothetical protein
- Prom 76893 - 76952 6.5
88 33 Tu 1 . - CDS 77022 - 77450 264 ## XBJ1_0403 hypothetical protein
- Prom 77563 - 77622 5.3
89 34 Op 1 . - CDS 77625 - 78335 592 ## BT_0781 hypothetical protein
90 34 Op 2 . - CDS 78409 - 79296 668 ## COG2326 Uncharacterized conserved protein
- Prom 79477 - 79536 6.4
+ Prom 79309 - 79368 5.9
91 35 Tu 1 . + CDS 79499 - 79663 251 ## gi|160884529|ref|ZP_02065532.1| hypothetical protein BACOVA_02514
92 36 Tu 1 . - CDS 79877 - 80452 343 ## gi|237718610|ref|ZP_04549091.1| transcriptional regulator
- Prom 80505 - 80564 5.5
+ Prom 80445 - 80504 6.2
93 37 Op 1 . + CDS 80659 - 81462 429 ## BT_0971 hypothetical protein
+ Prom 81489 - 81548 4.6
94 37 Op 2 . + CDS 81591 - 83093 1324 ## COG0174 Glutamine synthetase
+ Term 83110 - 83162 11.2
+ Prom 83109 - 83168 10.7
95 38 Op 1 . + CDS 83292 - 84179 686 ## Dfer_0860 hypothetical protein
96 38 Op 2 . + CDS 84203 - 85237 533 ## BT_1459 two-component system sensor
97 38 Op 3 9/0.000 + CDS 85234 - 86295 507 ## COG3275 Putative regulator of cell autolysis
98 38 Op 4 . + CDS 86292 - 87014 567 ## COG3279 Response regulator of the LytR/AlgR family
+ Prom 87071 - 87130 6.0
99 39 Tu 1 . + CDS 87291 - 87536 339 ## COG0724 RNA-binding proteins (RRM domain)
+ Term 87560 - 87607 10.3
- Term 87546 - 87596 13.3
100 40 Tu 1 . - CDS 87599 - 87772 274 ## gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524
- Prom 87847 - 87906 4.7
+ Prom 87785 - 87844 4.4
101 41 Tu 1 . + CDS 87909 - 88688 691 ## BT_0786 putative integral membrane protein
- Term 88669 - 88712 -0.9
102 42 Op 1 39/0.000 - CDS 88799 - 89659 849 ## COG0074 Succinyl-CoA synthetase, alpha subunit
103 42 Op 2 . - CDS 89693 - 90823 935 ## COG0045 Succinyl-CoA synthetase, beta subunit
- Prom 90902 - 90961 6.5
- Term 90944 - 90988 11.6
104 43 Op 1 . - CDS 91023 - 91910 1092 ## COG0331 (acyl-carrier-protein) S-malonyltransferase
105 43 Op 2 . - CDS 91961 - 92797 500 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase
- Prom 93017 - 93076 8.9
+ Prom 92981 - 93040 6.6
106 44 Tu 1 . + CDS 93131 - 93868 736 ## COG1051 ADP-ribose pyrophosphatase
+ Prom 93901 - 93960 2.7
107 45 Op 1 11/0.000 + CDS 93980 - 95503 1552 ## COG1070 Sugar (pentulose and hexulose) kinases
108 45 Op 2 . + CDS 95545 - 96861 1694 ## COG2115 Xylose isomerase
+ Prom 96895 - 96954 6.9
109 46 Tu 1 . + CDS 97000 - 98454 1243 ## COG0477 Permeases of the major facilitator superfamily
+ Term 98517 - 98565 8.8
+ Prom 98542 - 98601 8.0
110 47 Tu 1 . + CDS 98763 - 99629 455 ## CCC13826_0816 acid membrane antigen A
+ Prom 99633 - 99692 6.0
111 48 Op 1 . + CDS 99816 - 100286 358 ## Alvin_2303 5 nucleotidase deoxy cytosolic type C
+ Prom 100289 - 100348 4.0
112 48 Op 2 . + CDS 100372 - 101109 359 ## TDE0221 hypothetical protein
+ Term 101117 - 101170 12.3
- Term 101108 - 101155 12.2
113 49 Tu 1 . - CDS 101156 - 102049 369 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 102102 - 102161 5.6
+ Prom 102066 - 102125 3.3
114 50 Tu 1 . + CDS 102196 - 102390 105 ## gi|260174876|ref|ZP_05761288.1| hypothetical protein BacD2_23669
+ Term 102503 - 102534 1.1
+ Prom 102629 - 102688 9.9
115 51 Op 1 . + CDS 102769 - 103059 158 ## gi|237713274|ref|ZP_04543755.1| predicted protein
+ Prom 103062 - 103121 6.9
116 51 Op 2 . + CDS 103192 - 104730 1235 ## COG0388 Predicted amidohydrolase
+ Term 104736 - 104776 5.0
- Term 104724 - 104764 5.0
117 52 Tu 1 . - CDS 104829 - 105878 752 ## BT_0805 hypothetical protein
- Prom 106047 - 106106 6.6
+ Prom 106010 - 106069 10.6
118 53 Op 1 . + CDS 106089 - 109577 3828 ## COG0060 Isoleucyl-tRNA synthetase
119 53 Op 2 . + CDS 109613 - 109993 594 ## BT_0807 DnaK suppressor protein, putative
120 53 Op 3 . + CDS 110091 - 110747 491 ## BF2277 lipoprotein signal peptidase
121 53 Op 4 . + CDS 110689 - 111813 843 ## BT_0809 hypothetical protein
122 53 Op 5 . + CDS 111819 - 112205 67 ## BT_0810 hypothetical protein
- Term 111990 - 112042 15.3
123 54 Tu 1 . - CDS 112156 - 112920 677 ## COG0566 rRNA methylases
- Prom 113004 - 113063 4.8
+ Prom 112966 - 113025 7.1
124 55 Tu 1 . + CDS 113063 - 115372 1643 ## BT_0814 putative outer membrane protein
+ Term 115428 - 115479 1.2
- Term 115279 - 115309 -0.4
125 56 Op 1 .
- CDS 115326 - 115778 209 ## gi|299145876|ref|ZP_07038944.1| hypothetical protein HMPREF9010_01328 - Prom 115811 - 115870 2.0 - Term 115808 - 115852 7.4 126 56 Op 2 . - CDS 115872 - 116423 597 ## BT_0815 hypothetical protein - Prom 116443 - 116502 8.9 127 57 Op 1 . - CDS 116517 - 116966 361 ## COG2259 Predicted membrane protein - Prom 116996 - 117055 5.2 128 57 Op 2 . - CDS 117057 - 117491 388 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance - Prom 117557 - 117616 6.1 + Prom 117502 - 117561 5.6 129 58 Tu 1 . + CDS 117636 - 117884 201 ## BT_0819 hypothetical protein + Term 117927 - 117981 12.2 - Term 117913 - 117968 12.4 130 59 Tu 1 . - CDS 117997 - 118419 408 ## BT_0820 hypothetical protein - Prom 118487 - 118546 6.4 Predicted protein(s) >gi|225935328|gb|ACGA01000064.1| GENE 1 122 - 298 176 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174758|ref|ZP_05761170.1| ## NR: gi|260174758|ref|ZP_05761170.1| hypothetical protein BacD2_23079 [Bacteroides sp. D2] # 1 58 1 58 58 107 100.0 3e-22 MDAFAWFWLTVIVGIITIGVNDALCTYWKYKYASNKKNETVKDESGKRHIISGFSKNE >gi|225935328|gb|ACGA01000064.1| GENE 2 303 - 713 365 136 aa, chain + ## HITS:1 COG:no KEGG:BF2349 NR:ns ## KEGG: BF2349 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 127 7 137 154 118 51.0 5e-26 MAENFKTDFFTDRIGRGIQDIFQAQLDIATKRIYQKGSERKKVQGTGEIIQGRSGALMAA LQNPNYSVVPDGEGVIARSNLPLYTRFLDMKKHGNYQIYNRQIYGILYHDTLGKIKYEYQ DYVRERVKEMFASSLK >gi|225935328|gb|ACGA01000064.1| GENE 3 747 - 1052 267 101 aa, chain + ## HITS:1 COG:no KEGG:RF_0889 NR:ns ## KEGG: RF_0889 # Name: not_defined # Def: hypothetical protein # Organism: R.felis # Pathway: not_defined # 17 101 1 85 85 95 48.0 5e-19 MILLPIFVSVTNKTKFMPEICRFFGIIIFLYWKDHNPPHIHFTYGDYECSISVLDRIVDG QAPAKVIAKVNEWINLHEAEILSLWEKAQKGEKIDKIEPLK >gi|225935328|gb|ACGA01000064.1| GENE 4 1059 - 1292 273 77 aa, chain + ## HITS:1 COG:no KEGG:BDI_2430 NR:ns ## KEGG: BDI_2430 # Name: not_defined # Def: hypothetical protein # 
Organism: P.distasonis # Pathway: not_defined # 1 77 7 83 83 72 44.0 3e-12 MLRVIDVDYIRNYELLVTFSDGSKKIVNLEPYLTGEVFGELLDKEKFVQYGLTRATIEWA NGADLAPEFLYEIGIAA >gi|225935328|gb|ACGA01000064.1| GENE 5 1304 - 1531 142 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721585|ref|ZP_04552066.1| ## NR: gi|237721585|ref|ZP_04552066.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 75 1 75 75 137 100.0 2e-31 MNDCLAIQDKKEETFLYRIFISHPELNASAVARRMGISQSLMSQYISGIKKPSQEREALI VNTIKDIGKELTMIV >gi|225935328|gb|ACGA01000064.1| GENE 6 1528 - 1764 82 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174763|ref|ZP_05761175.1| ## NR: gi|260174763|ref|ZP_05761175.1| hypothetical protein BacD2_23104 [Bacteroides sp. D2] # 1 78 1 78 78 134 100.0 1e-30 MTYEDILFLIGFFLVIAFFVGCKHKPATLSGWLAFTFLSFTVTPFISVPLTWYVCRMIDR ATIKDKGYFDPSDFTFKR >gi|225935328|gb|ACGA01000064.1| GENE 7 1875 - 6176 3915 1433 aa, chain + ## HITS:1 COG:ECs2641 KEGG:ns NR:ns ## COG: ECs2641 COG5283 # Protein_GI_number: 15831895 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 # 21 479 38 513 696 98 24.0 1e-19 MAKLKPDYIEWVLTLNASDAQKEIHNLSEKNKELRDSNKEIKKAMTDLIATGKAGGKQWK KLNEQLKENNKTLGENNKKIAECEKRLDKTTMSANQLARKANALRKELRDTVKSLQPEKY AALEKELKEVEKAYGQATKKAEGFGSSLLSLNKIKTVLAGVFVTIGAMITGQIVGGLRDA ISTIIEFEKKNSTLAAILGTTKKSIKDLTDEARRLGATTSYTAAQVTELQIELAKLGFFK EDIKAMTPSVLKFAKAVDTDLASAATLAGATLRIFNLDAEDTERALSTMAIGTTSSALSF EYLNSAMSTVGPVANSFGFTIEETTALLGALANSGFDASSAATATRNILLNLADSSGKLA LALGGPVNNLDDLVKGLKKLNSEGIDLNKALELTDKRSVAAFNTFLNGTDTVLALCDAVT GAEDAFNAMSEEMGDNVQGALNRLSSTIEGVVLRFYESKGILRDLIDLVTLMVEGVGGMI DMFNKWGVVTYTVTAYLVSYYGGLKIATMWHARFKTATLASVVAEKAHAVQLYISRAATL AYAAAQALLHLNIKRCTAALRLMRIELLKNPYTALLALLVAAGVAIYQLAKKTEQASAAM KAHQEVVKKVNEEYASQEAKIKTLVAAINDENLSNYTRKQRLAELKELIPDYNAELNEEG RLINNNKEAIDQYLVSLEKQIKLKAYQEELEELYKKKRNLESQESEQSDAYWDTRQQNTL SGYNRNSLTAKISRLFGTEKEANQLKALQTTQKDLAGIESAIAQINNDILKTEATATSLT GTNKENINTETSLIKKLEAEKKKVQEQWAEDSEANIAKKNKEIERIDTEIKRLNELGKVK 
KKAEAGEYKNTETDATLKPLEIEHEKRMLLIKQNREKENKTEAQYILEGTAENLRYYRER IDALQKLEAKTPANKKKLLDEIHKLETEAQTAIFTETGKQEDARIKLVQEKRDERLKIET AYYNVQKDTMEKAVLNQSITQEAADAYMLEVEAEHAAELLEINRTYQNDIAALEITGKQK RIETATEAADAVRESEMKLLRDRAAIAQKVREITSVPVGITGMQEAHRKQVQDVETTYNA IIEIARQAGISTVGLEKQKQQEISQLEFEYQNSLYQIQSQIGVSWAQEYQNELALLKNLH DQELIDEKTYQRKKLQMQMNNAKKYFDYYSGLSSSMVEAIQQAEIDQVEAKYDVLIQEAE NNGEDTAALEEEKENKKLEIQKKYADVNFAIKCSQIIADTAVSIMKAYADLGPIAGTVAA AMLAATGVAQLASAKAERDRIKNMSLKNTTGSKTATAERVVSGSSGGGYSEGGYTGPGGR YEVAGVVHKGEYVVPQPEMNNPKVIDAVSTIEAIRRQRTNANPLPQNPGEYAEGGYVTSY AGDSSYREFLEAAKELRASCEAIKLIKAYIVYQDLEKAKETIDNARDTFTRGK >gi|225935328|gb|ACGA01000064.1| GENE 8 6186 - 8339 932 717 aa, chain + ## HITS:1 COG:no KEGG:BF2351 NR:ns ## KEGG: BF2351 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 606 3 611 700 410 38.0 1e-113 MLKIKTNKGYLDLGGDFTVQIDEKSPVMNDRGSQTVPVTVPCTGNNAKITGFAHRLDMGI KPMNEDQACTILDGAYKRTGKINIVSAGKKEGITLNIGFDNSEAYSAWKAKKLNAITLPV KEYSSVNSLCAHLQQVLGGYQTDYAVFQIMTGNDSKDNQFYPKYLNYITPVSEGSKVYRL RYQARTETFLVNGTPTAVTLPEGYGVTAFLYVWRVLELVFSEFGYTITENPFKTNKELSN LVILNNAADCCVKGKLSYADLMPDCTVEDFLNALHVRFGLVYNVSSDTKTATLRLIRDIV DDVPDIDLSRSLTDEPLITYETARQMKLSARTSFTGAAPSVERLEDYLKDQKVARLTKVD VSKRVIHLNYEETTGRWFKWDEDNNRLTYSSSSFFSWDRKTDNIEDNELTSDDECVPMDF APNDILSPQYLADYVHRYTYLKTSSNNNDEDSEKVETPLSFVFAFTSSQNSKYPFGSVLP YTSDAEEVILRDGSKHTMSLFFQYDNGLFFNFWKKYDAILRHSFNKIEANVLLPVHRLTG MDILTPVILRGQYLLFDGLSYSLPANKIVPVDLTLRTLRLIGPYDLDKEQETPVFGSRLF TWEFISSNIETAKENERNRILQQARDEWNKRPTAVNEMKSITYSLDGYTTRNDDKYLVEN YPQEAGITLQRNYKCKATAIISIYYEPGSFTPGTYRDVTYESEFEYTDTFVSIVYSG >gi|225935328|gb|ACGA01000064.1| GENE 9 8275 - 8595 90 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237721589|ref|ZP_04552070.1| ## NR: gi|237721589|ref|ZP_04552070.1| predicted protein [Bacteroides sp. 
2_2_4] # 14 100 1 87 92 130 71.0 3e-29 MLRMNPNLNIQILLFLLSIPVNPVLYSSFDKSNFCTMEKQNNIVLAPCTTQVTELYNFWK ENHTGRLTDFYKFMVNPSAARDRFISSLEMQHELTGSFIVTKIAIQ >gi|225935328|gb|ACGA01000064.1| GENE 10 8592 - 9749 596 385 aa, chain + ## HITS:1 COG:no KEGG:BF2291 NR:ns ## KEGG: BF2291 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 7 385 2 376 376 244 36.0 3e-63 MSASDEALKVNIYPTGNAFTRNPIFLSVSSSSMATYSIRMNNEEVFKGNGIGEFRVNIAE IVETGITDARILSDNTEHLLAVSGLSAEVTIHVVNEGEEEDNLSFTAWKGGISKKEFRRL RNMGTDIFSLKFLNESCNFFFTTRSNDWRITMRETELYPLCFIYPGHELKITELLTGQSL AVPGTTGSLYALNLEAVRLKFFTDYGVLANLFDVYSGDTFALRIGIEQSPTVREHYRLRF LNSYGTYEVFSLEGEASVTPGMDEDEDAVFRRYDEITDDYYSDRIRTEIQEAVTIKTGFK RPQEIRFLLDLLSSDNVYLSGYGQEEIKVIPSAEEFSYRVRPDAPQNVTLKLTFAEKESN WTGEITESGYRKPGVHSKEFSKQFN >gi|225935328|gb|ACGA01000064.1| GENE 11 9759 - 16547 4681 2262 aa, chain + ## HITS:1 COG:no KEGG:BT_4440 NR:ns ## KEGG: BT_4440 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 423 1182 261 1026 2183 111 21.0 4e-22 MATQEYIDDLIIVIETAEDAGSVTNQMVAAVLDFLNVNLKKVSQGEEVLEEEAARIAADA ALQKAIDAVSLRIDRLVGNNASQAIDNFNEILNFLNGLKDSDSLAALLADINARIGSEDG SQSEDGSLWGKLKSLSQDISSCSEDISTLQVDRDKIKQELQQTAGLQVSTFTNVNNFLNA GTVYSDLSGVFTALKTADKINNVRKNGVILSFLTADGWVTKQFRGNPDADFENVEKWEDF GSGGSAGGNTYNVTGNMPLTEGFYTLASAIAAVPDKWRDRGRVITFETSLGKWETWQFTG TDPAVWDQEASWEEFGGKGTVKSVTVNGEKQTPDAAGNVNVNVDILEVDETLSLDSTNPV ENKVVTARFNEVDASTLFNVNAEVSEDETSVRLSFQNKSGAEITAVDIPAGSGGGSGETV ATKIVLNAAVDNTIIKEGGNARLTYTYDHQYTTGDEKGESTGQKADITVTIRRGTTTMYS QTVSDVSKGSYELDLSSYLLVGNTDIYVVATTTDPTTGKKQTRQAFTSVKVVSLSLTSSY NLAGAIAAGGYTLADTINIPYAVNGSGTKVVTLYLNGQQQNAHTITRSGTTNGSFSLSPS SLVTGRNTVQMVAEMEASADLVLKSESIYIDILKSGGSAPFIGTMISFPDGRIFTEDHLV PRLEAGQYEQVKFDFVAYDPAATPAKVDVYRDGVKTQSVSVARTTQTYTNRFTQQGEINM KFKTGATEYPFYIDVTESGIDLQETTAGLVLKLSAAGRSNSESDPGAWDYGDIHTTFAGF DWNSNGWTGDALKLTGGAKIEIGYQPFSTDATTTGATYEMEILCSSVTDRQGVILDCMAG DIGFQMTTEQALMRVSGGTEVSTKFASDMNLKIAFIVGSKAGKRLLELYVNGIRCGAVQY 
GSTEGLLQAEPVNIRLFSDTADVEIRNFRIYNRALTDDEELNNYMVDRTTSDEMVLLFEK NDVTGDNGTDIDIDKLRAQGKAVMRIVGDVNLVNATNNKKFEVPVDIYFYSPYGKEYDFV ARNIGLRIQGTSSTTYPRKNYRLYFLRLEKYGTTLEVNGVDVPSLEYSFKPGARPISIFC LKADFSDSSGTHNTGAVRIVNDVWKKCRWLTPPQAAYKGEYDVRIGVDGFPMDLFYDNDG TGANTYLGKYNFNNEKSESAIIYGFEGIEGFNDEAALNGQRNKCICLEFLNNSEALCLFG TTDMSSFDDALEFRFKADTTWADAHEDDKAAVTRLWNWIDSCKDDPAKFLAEYNQYFGND SPFAWYLITDYFMAVDNRAKNMMLATWDSLIWYFLPYDMDTLFGVRNDSVLKYEYTITHE SFDDSIGSYAFAGHDSVLWELVRSCPDKLREVAETLRSNMSLEYVLQVFNEEQMGNWCER IYNKDSEYKYILPLTEGVTTSSGTSYYNYLYALQGSRYAHRTYTIQNRFALLDSQYVAGT YRRDSFAAYFGYKFGSDNRKIRITASERYYYGYGYTSGTPHQSAVLAETAGAVVELTMDT DLIVNDPQYFYGASRIRGLDLTDVSHAIVGTLNLNNCTALRELNVSCEAGQMTLNALLVG NCRNLRKLDISGLKSSSFTGMDLSSNTKLETFLAGDTSLTGVTFAGGAPLAVCVLPGTLQ TLELRYLSKLTNAGLQLEGTANITRLVIDNCSLIDWNTLLQQCSATSYLRITGIDMDGNG NLLRRLMTMGGVDEDGGNVQTCRLVGTYRLTQSMSDEEYAATCAHFPELNIIQPQFVGIK IDQTVGDGEKITNLDNSTGYDYNTEFTPSSHILEVLSKRRCILAKKTAEGEMTCYPLHDE NRNKYGDSDSVENATDAVLTGSEGEVYVYEPHYWYKGVTDVLNQCLYGFISSNEDAPAAA GYTSVRFTREKLDVTEGIGIRKNTDYTTIEEAKNEYESGSFALVDVRNYKQVRFPGFAST LYGAVFVDDAGKIVSRISVSNANGFINGMYLFCAVPVGATKLAFTFLNSAAFDFVLLTTS ESVEAIEPDWVEHTECLGGAYEAYLIDDVLRSVSGVSSVGTISQSQAIKYAQNRGKGFQL FDWEMHKDVGNLHFFKYGNTDSQVVCGYGTSNYQKVTGLTNALGMRDTVSYYKEKGGSNP QAEGAYRDGVNYQSVNVLGYENFQGNKAEWLQYVTVNKTAADGRWFITMPDGTERIVQGI TVYNADIYPTHMVWGRYMDLIAAKEGGSTSSHWFDRFYVGTGLSRVVYRSNNYAYALGGV SSAAAVLDSSSTSASIGVRLAFRGIIRWAGSVAAFKAINQAD >gi|225935328|gb|ACGA01000064.1| GENE 12 16742 - 16927 69 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174770|ref|ZP_05761182.1| ## NR: gi|260174770|ref|ZP_05761182.1| hypothetical protein BacD2_23139 [Bacteroides sp. 
D2] # 1 61 1 61 61 89 100.0 6e-17 MRTSVFGLQTIRIKKKRISLKNWRTTVGTCSRRGAKRNEPRQQQPFTVGKGKNKAQGNGV W >gi|225935328|gb|ACGA01000064.1| GENE 13 17257 - 18276 640 339 aa, chain + ## HITS:1 COG:no KEGG:Dtox_4258 NR:ns ## KEGG: Dtox_4258 # Name: not_defined # Def: RNA-directed DNA polymerase (reverse transcriptase) # Organism: D.acetoxidans # Pathway: not_defined # 8 231 97 349 369 103 30.0 1e-20 MNVVDRHLKVRFIRTTSASIKNRGTHDLLQYIVKDIKDDPEGTLFGYQFDITKFYESVDQ DVLLDAVKRMFKDKILIGILEECIRMMPKGVSIGLRSSQGLCNLLLSIYLDHRLKDQEAV AYYYRYCDDGLVLSGSKKYLWKVRDIIHEQTRKARLEIKSNDTVFPITEGIDFLGYVTRP DHVRLRKRNKQKFARKMHKIKSKKRRQELTASFYGLTKHADCKNLFYKLTGKKMKKLKDL GYKYKPKDGRKRFTGARIKSPELMNKDVIVLDYEKDVPTKNGNRTVIKLELDGKERKYFT SLEETLFICESAARDGELPFEAHCEGEVSEKGLIIIHFT >gi|225935328|gb|ACGA01000064.1| GENE 14 18278 - 18772 463 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174772|ref|ZP_05761184.1| ## NR: gi|260174772|ref|ZP_05761184.1| hypothetical protein BacD2_23149 [Bacteroides sp. D2] # 1 164 1 164 164 312 100.0 5e-84 MIRIYADSKAEPVRCTNRRRGIWRITWDYQETETPEGVQRSYMEETFDHLPALAEIKAVI NEWYNRKITDTIESGYVWNGLKVWLSMENQMNYKTAYDLALQTGGENLPVTFKLGEEDNP TFYEFASMQQLQEFYTGAVKHIQETQKEGWALKKAIDWSVYTLE >gi|225935328|gb|ACGA01000064.1| GENE 15 18821 - 19966 531 381 aa, chain - ## HITS:1 COG:CAC2066 KEGG:ns NR:ns ## COG: CAC2066 COG4974 # Protein_GI_number: 15895336 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Clostridium acetobutylicum # 111 375 8 285 292 66 27.0 8e-11 MTAIRVVRQGKKGKTDRLPLYVEFYINREKIRIAVRLSVTSKEWDEQNEVIKGRDKESKD KNLIISNIRSRVSDIFVRARLKNETLTKESFFRQYNNPSDFGTFFDFARVYLKQISKTIS FGTWKHHVSIIKKLETFAPGLVFSEITHEFLLSFFAYLRKIGNMDSTAWRNMATIKIYVG AAIRGGYMDQDPFAAIKIRRPKSEVIYLTEEELLRLTALYRSGRLEECTQNVLRFFLFLC FTSLHIGDAKALQINQFIGNELHYTRGKTKIPVTVPLSDPARYIYEYYRAGRTKGNLFMN LPTDQDINRVLKTIAGKVGITKDISSKTGRHTFATLYYKKTHDIVTLSHLLGHSSITMTM VYAHALEDEREAGIHAFDDML >gi|225935328|gb|ACGA01000064.1| GENE 16 20310 - 20669 377 119 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|260174774|ref|ZP_05761186.1| ## NR: gi|260174774|ref|ZP_05761186.1| hypothetical protein BacD2_23159 [Bacteroides sp. D2] # 1 119 1 119 119 167 100.0 2e-40 METSITPIIIAVAIIQLIMIIVFFTMASNISAIRKSIILKDKDFQAKFYALILSGEKDKA KLLLFEQISKEESFIYAILYQNDKKIDTMQEKFSKEYDSELQLLGIEKLDMSMLKISKE >gi|225935328|gb|ACGA01000064.1| GENE 17 20689 - 21174 247 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174775|ref|ZP_05761187.1| ## NR: gi|260174775|ref|ZP_05761187.1| hypothetical protein BacD2_23164 [Bacteroides sp. D2] # 1 161 12 172 172 316 100.0 3e-85 MIIAVIILKGDWQYEKEALLKAQKHISQQNELTESQKYYFGDKCIGLSARKGYNKFSIAG AYYRDLPITMVGKFNGYAIAQTDNEHDQYAIAIYNDTGVHLGFLPGGNKKQHSYIINEGE QKRVHAYGYLAWNGSGMYGEVCVETDKNLVAKRNKPYNVEL >gi|225935328|gb|ACGA01000064.1| GENE 18 21220 - 21867 254 215 aa, chain - ## HITS:1 COG:no KEGG:Coch_0655 NR:ns ## KEGG: Coch_0655 # Name: not_defined # Def: putative phage repressor # Organism: C.ochracea # Pathway: not_defined # 6 200 7 228 277 71 28.0 3e-11 MTTKERFIEYLKIKGIGQTSFEESSGLSRGAISQKSGFSANSIEKIAIACPDLNLDWLIT GNGEILKSSHTNIIAKAPMEYGKEQTRPRVPLTAAAGSLSGDSIGVTLEQCEQMPLIHQI PSYDFTMFIKGDSMSPRFESGDEIACRHIDQSRFIQWGKVHVLDTTQGFVIKRVYEDGDK IRCVSYNPEYSDFSVPKEDILSMSLVVGVVSIMEM >gi|225935328|gb|ACGA01000064.1| GENE 19 21939 - 22106 197 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174777|ref|ZP_05761189.1| ## NR: gi|260174777|ref|ZP_05761189.1| hypothetical protein BacD2_23174 [Bacteroides sp. D2] # 1 55 1 55 55 104 100.0 2e-21 MSDEEIRLEVAKLAIQCFIGCNGNRNFNDFSANAAILYDYIKFGKLPKDAVKSKE >gi|225935328|gb|ACGA01000064.1| GENE 20 22095 - 22424 286 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174778|ref|ZP_05761190.1| ## NR: gi|260174778|ref|ZP_05761190.1| hypothetical protein BacD2_23179 [Bacteroides sp. 
D2] # 1 109 1 109 109 181 100.0 1e-44 MNSCERKAEELILYAHKKGGTIELEEVLKEFDKSYISDDVMLVLDIYGKTGSAVLDSAYG WTWFQLSEKGMLFARRGCFTGEARRDQLVKIGAIAAIIAAIAGIIQLFL >gi|225935328|gb|ACGA01000064.1| GENE 21 22781 - 23404 244 207 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTIEAYQCDYCKKYSKSKSVMKKHESKCFHNPKAKACDTCGNCYQILCKVRNEISPFNY DRNVYSYKSVCKAGINLSELKDGSLTIRLCDNCEYWIEKEEEKSTQSIRRIEISIYHETE SRDDGKAKGLLSVSRSITSDGKEVYKDKMSPKMYIASKDPIRNLECFLLGLEESKDGRNI GSNSRKRGSFFAQIWRKLFCKKIQFIT >gi|225935328|gb|ACGA01000064.1| GENE 22 23285 - 23692 349 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174781|ref|ZP_05761193.1| ## NR: gi|260174781|ref|ZP_05761193.1| hypothetical protein BacD2_23194 [Bacteroides sp. D2] # 1 135 1 135 135 228 100.0 9e-59 MITLITPKLKDDLLADLLSVESMNVQNDIHSCAKEFDTSSDIVEAIYDQFEEMGLLQQTK CLGGTIIFRLRAKAHDFYSHGGFAAQEEILKANIQKLSDELDFLAKQLSPNLCEKAASLS TIAANIAAVLGLFKS >gi|225935328|gb|ACGA01000064.1| GENE 23 23833 - 24051 179 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174782|ref|ZP_05761194.1| ## NR: gi|260174782|ref|ZP_05761194.1| hypothetical protein BacD2_23199 [Bacteroides sp. 
D2] # 1 72 1 72 72 127 100.0 2e-28 MDRLQEIMMAAENVTFSKNQSSALVGGRRRLERLAAEKKIAFIKTTDKKNGRWECKGSDV LRYAIPQNYTRA >gi|225935328|gb|ACGA01000064.1| GENE 24 24048 - 24272 213 74 aa, chain + ## HITS:1 COG:no KEGG:BF2474 NR:ns ## KEGG: BF2474 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 74 5 78 78 96 62.0 2e-19 MNRVIKHLLTIVVAVVTIAGCIYAGRTEYNDDVLSGMSAEKYQYIHDSLGCRASQEDVVK EYITNQKYYDSKKY >gi|225935328|gb|ACGA01000064.1| GENE 25 24276 - 24623 198 115 aa, chain + ## HITS:1 COG:no KEGG:BF2473 NR:ns ## KEGG: BF2473 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 104 1 107 116 102 50.0 4e-21 MSKQSFNIPNGSNYVSVEATDNKLIISFSKENSNMFFCQESEHIEETPLIGHLSIFWDFS PSDAIISKVADIDYSDCTYKAKNGVWYRHAVRFRSEEQYSKILQSNVAKVEAKKE >gi|225935328|gb|ACGA01000064.1| GENE 26 24769 - 25011 251 80 aa, chain + ## HITS:1 COG:no KEGG:BF2472 NR:ns ## KEGG: BF2472 # Name: not_defined # Def: putative recombination protein # Organism: B.fragilis # Pathway: not_defined # 1 79 64 142 145 120 78.0 1e-26 MATRFNEKNCNAQCRSCNRFDEGNLQGYRRGLISKYGESVVLMLESMKNQMNKISDFEYK AMIDYYRKEIKRLIKEKNID >gi|225935328|gb|ACGA01000064.1| GENE 27 25201 - 25428 175 75 aa, chain + ## HITS:1 COG:no KEGG:BF2407 NR:ns ## KEGG: BF2407 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 65 1 65 69 77 64.0 1e-13 MKVIYVYLIFRKKGYAFGSLSAVFDYLDEEEVGIKKTTLLHRSDAGTITTQRAIISKLTL LRSKKCHGKEDRDKN >gi|225935328|gb|ACGA01000064.1| GENE 28 25552 - 25761 136 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174787|ref|ZP_05761199.1| ## NR: gi|260174787|ref|ZP_05761199.1| hypothetical protein BacD2_23224 [Bacteroides sp. 
D2] # 1 69 3 71 71 140 100.0 2e-32 MDKNDYKWLKVEGEPLILYGNKSVQAKVSACGCNGCSYNKRQGAKKCTLSQFCMGHLRPD RIPVVFVVY >gi|225935328|gb|ACGA01000064.1| GENE 29 25914 - 26450 243 178 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174788|ref|ZP_05761200.1| ## NR: gi|260174788|ref|ZP_05761200.1| hypothetical protein BacD2_23229 [Bacteroides sp. D2] # 1 178 51 228 228 265 100.0 7e-70 MLAMVIPQINANNRKFENGLKGGRPPKGNQDESNSKPNNNQTITKSEPKNNQNQTTKEPS QNQMETKVEPNVYDNDNVNDNKKESANADKKEDEPYGSTLSPKFLSFQAWIKDNAPYCAN PKNMNQITEKELDRLMETYTEKQIADVISQIENRKDKRKQYSSLYRTVLNWIKTEKQQ >gi|225935328|gb|ACGA01000064.1| GENE 30 26447 - 27817 713 456 aa, chain + ## HITS:1 COG:CC1665 KEGG:ns NR:ns ## COG: CC1665 COG0305 # Protein_GI_number: 16125911 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Caulobacter vibrioides # 7 439 23 491 507 165 28.0 2e-40 MKRNERIAFYDLDEERLVLGTLLNDPDKLVEVSSILDEACFYHDFHKQIFKIIKELDKKG KKFDFMVVKRELRGRMNEEKDTPLFFEIVGQRVYTNLYEHAAILKDASARRQIEDNLKNA LSLIPCVTEDPFEIIEKVRNALDGIYSDQTSGVITLREALEGAYKIIQNNINDPDASSGT PTGFRELDRKGILKPGYLTVIAADSSQGKTSLATSFCVNAAIAGAKQAFYTMEMSPVELS QRILSKTSGVNGVRIASAQLTAEEFSMVDKGLAPIYNLPLYFDGKSTSTIDSIISSIRMM KKRYDIDGAIVDYLQILSANSNSGSGDEMQLGIFARRLKNIAKDLGIWILALSQLSRDKV DVAPSISRIRGSGQITEAADVVLLLYRPEHYGRDKKYPEPFSNVSTENTAMIDIAKGRSI GTWKFICGFNAETTQFYDLLDIPHINPSASSEEKPF >gi|225935328|gb|ACGA01000064.1| GENE 31 27828 - 28112 324 94 aa, chain + ## HITS:1 COG:no KEGG:BF2465 NR:ns ## KEGG: BF2465 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 89 1 89 93 138 70.0 5e-32 MKCETEGKILVELPATSGISKNGKEWEKREFVIETNERYHSKMRFAMYSFDGPIENPPRI GDTVKVAFTVEAREYKNNWYNEVKAHRIELVTSK >gi|225935328|gb|ACGA01000064.1| GENE 32 28125 - 28475 182 116 aa, chain + ## HITS:1 COG:no KEGG:BF2471 NR:ns ## KEGG: BF2471 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 110 3 112 116 151 79.0 6e-36 MRPKQDLINIAINDGSMDRMNMLLSATHLLNCEANNLIEEASDVMIAKGLLLGNLKKLHN 
DFVKCADRYFNEFASLVTTDKCKMDMFDDLQGFDESFRKWAKVPIEWYPRILDENK >gi|225935328|gb|ACGA01000064.1| GENE 33 28496 - 29410 568 304 aa, chain + ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 100 200 70 173 425 74 42.0 2e-13 MNIGLIDVDGHNFPNFALMRASAYYKTKGDRVEWATPFNIYDKVMASKVFTFTPDFNYLT LQADVIEKGGTGYNIASRLPEAVENSSLMDYSIYPQYPFSIQFFSRGCIRKCPFCLVREK EGYIRSVHPVDLNPKGEWIEVLDNNFFANPEWKNAVSYLLKTGQSIKLHGVDVCIMDEEQ AYYLNKLKMKQNIHIAWDLPQIDLTDRLKEMIKYVKPYKITCYVLVGFNSTIEQDLFRLN TLKSLGITPFVQPYRDFTNKRKPKQYELDLARWANKMWLFKSFDFVDFSPRKGFRCDYYL KQLA >gi|225935328|gb|ACGA01000064.1| GENE 34 29460 - 29618 158 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYGILDLIDESLSELSNREQRNLLKELSSEIDSRLQDVGEEKDDKDEFEEE >gi|225935328|gb|ACGA01000064.1| GENE 35 29710 - 30123 165 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174794|ref|ZP_05761206.1| ## NR: gi|260174794|ref|ZP_05761206.1| hypothetical protein BacD2_23259 [Bacteroides sp. D2] # 1 137 13 149 149 264 100.0 2e-69 MVSTIEDDGIYDGRGTFDLYQCEKCSREKITTYADKGVTPFIIRCDCGGNMQHTKTFKSV PDYIRVFKWKRPTLEQTIKLSDGQIEHILNGGLVLDTDIYTPSSESVKKSLEKIPKFIPP LTRQQRRKMERESKKKK >gi|225935328|gb|ACGA01000064.1| GENE 36 30141 - 30626 228 161 aa, chain + ## HITS:1 COG:no KEGG:ABAYE2755 NR:ns ## KEGG: ABAYE2755 # Name: not_defined # Def: putative methyltransferase # Organism: A.baumannii_AYE # Pathway: not_defined # 7 156 11 160 166 203 61.0 2e-51 MSETKIILDACCGSRMFWFDKENPLTLFADIRDEEHTLCDGRNLKVHPDIVSDFTDMPFL DESFKLVVFDPPHLLKVGENSWLAKKYGKLPEDWSRVIKKGVDECFRVLEDYGVLIFKWN EDQITVREVLDAIERNPLFVHTTGRHGKTMWMCFMKLPINE >gi|225935328|gb|ACGA01000064.1| GENE 37 30638 - 30871 294 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174796|ref|ZP_05761208.1| ## NR: gi|260174796|ref|ZP_05761208.1| hypothetical protein BacD2_23269 [Bacteroides sp. 
D2] # 1 77 1 77 77 135 100.0 6e-31 MSIKISKEAYEKLIKEDFDFLNEHCPESLELDHIKAIIRSSIDWNYPKKAKSMCLKDKTK VCNLCHECDVDVLNPSY >gi|225935328|gb|ACGA01000064.1| GENE 38 30908 - 31246 249 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174797|ref|ZP_05761209.1| ## NR: gi|260174797|ref|ZP_05761209.1| hypothetical protein BacD2_23274 [Bacteroides sp. D2] # 1 112 12 123 123 209 100.0 4e-53 MPTVLRETYPTAKKEHICEFCACKIQPGQKYVRQTNVYDRTLYDFVTHQECNELAHELMM YDDYDDSGLDGESFRSQLYSYIYDKHIEDIYTNWKFNRYEIVKKVLKELKNE >gi|225935328|gb|ACGA01000064.1| GENE 39 31249 - 31743 400 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174798|ref|ZP_05761210.1| ## NR: gi|260174798|ref|ZP_05761210.1| hypothetical protein BacD2_23279 [Bacteroides sp. D2] # 1 164 1 164 164 333 100.0 2e-90 MGFTTPAFIRRNTPELRKKLEELGYVKNSPIWTDNCSIIWAYQYPEKRFDAPNYVIADSF DIPFDKHSRLCGKFIDCGTNEELFLAIAALRDDTNENQWFIADSPLSVSYDDAVGNDHYF TEPKGSMFFWDENWNHATIISGSYHKATVEELIEHFKEKEVNHE >gi|225935328|gb|ACGA01000064.1| GENE 40 31736 - 31936 223 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174799|ref|ZP_05761211.1| ## NR: gi|260174799|ref|ZP_05761211.1| hypothetical protein BacD2_23284 [Bacteroides sp. D2] # 1 66 1 66 66 128 100.0 1e-28 MNSVQTQTFSIRGNDDAMAYIDFCDGDLCVSVVVDGKQADFHFEPVTLKMFAYAYKLHCE ELKKEE >gi|225935328|gb|ACGA01000064.1| GENE 41 31938 - 32399 343 153 aa, chain + ## HITS:1 COG:no KEGG:Coch_0885 NR:ns ## KEGG: Coch_0885 # Name: not_defined # Def: putative prophage Lp2 protein 26 # Organism: C.ochracea # Pathway: not_defined # 3 152 2 132 132 68 36.0 7e-11 MNRTIKFRGQILNSVKDDKGNVVTEWKSWIYGDLLHRSNGVLHILTLNEKKACYENNTID PETLGQFTGLLDKDGKEIYEGDIISVNGKYPKLVKYIDDYACFCLANIEDLDEKMDTGYW HQVSPGWWNSSKREIRVIGNIYDHPDLIKEEDK >gi|225935328|gb|ACGA01000064.1| GENE 42 32396 - 32689 193 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174801|ref|ZP_05761213.1| ## NR: gi|260174801|ref|ZP_05761213.1| hypothetical protein BacD2_23294 [Bacteroides sp. 
D2] # 1 97 1 97 97 198 100.0 1e-49 MKEEIITLETVKLLNAILPYRDFYQPSQSLVQKWLRETKNLHISIIRNACGYGYDICKAD NGTYIADGMYKGPNDGGQWDTYEEALEAGIQEAIGLI >gi|225935328|gb|ACGA01000064.1| GENE 43 32701 - 32931 251 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174802|ref|ZP_05761214.1| ## NR: gi|260174802|ref|ZP_05761214.1| hypothetical protein BacD2_23299 [Bacteroides sp. D2] # 1 76 1 76 76 142 100.0 5e-33 MKPFDLEKAKAGASVCTRDGSKARIVCFDASNKRFPLVALIKDFNNSDEYPVLYTKEGRF FDGEKDNPEDLCMKDG >gi|225935328|gb|ACGA01000064.1| GENE 44 33050 - 33745 298 231 aa, chain + ## HITS:1 COG:no KEGG:PSPA7_2372 NR:ns ## KEGG: PSPA7_2372 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 7 227 20 218 223 68 29.0 1e-10 MGGIIMKKIMFNDKFGLTQSVLDGRKTMTRRICKYDRPDESWDIVFPVFESKDYDSEGNL ISPLYGAFGWKNKEGDFTGWNNPLYKVGEVVAIAQSYKDSGYDPDSLDRHPKDLSIRGLM KDSSGWNNKMFVKSYACKHHIKITNVKIERLQDISDEDCLKEGIVKQEVISDESPFLYAY DAFLDGDNKYFASRWFRSPKEAFAVLIDKVSGKGTWESNPFVFAYEFVLFD >gi|225935328|gb|ACGA01000064.1| GENE 45 33880 - 34113 74 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVSPTLIMPKRQTIIPATRCKRIILNTITSLISMLFYFNLRIMYSMQSILYTSLIGMAL PPSFSFLRISKQVLSQI >gi|225935328|gb|ACGA01000064.1| GENE 46 34016 - 34195 94 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174805|ref|ZP_05761217.1| ## NR: gi|260174805|ref|ZP_05761217.1| hypothetical protein BacD2_23314 [Bacteroides sp. D2] # 1 59 1 59 59 63 100.0 5e-09 MSEVIVFKIILLHLVAGMIVCLLGMMRVGDTIIQKCLRGIIGMVIISMITSLGYVLINV >gi|225935328|gb|ACGA01000064.1| GENE 47 34206 - 34553 309 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174806|ref|ZP_05761218.1| ## NR: gi|260174806|ref|ZP_05761218.1| hypothetical protein BacD2_23319 [Bacteroides sp. 
D2] # 1 115 1 115 115 226 100.0 5e-58 MKLRIEMKLKDIVSQLANRMNQPHVIEVYLRQVYAKGFVDGTKQSPWISVKDRVPIPIIE AIDGNEYGSIDVLGAIQNSYGDYDYYICRYWGYNKEYRWENGITPDYWMPIPKFN >gi|225935328|gb|ACGA01000064.1| GENE 48 34694 - 34888 130 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174807|ref|ZP_05761219.1| ## NR: gi|260174807|ref|ZP_05761219.1| hypothetical protein BacD2_23324 [Bacteroides sp. D2] # 1 64 1 64 64 96 100.0 5e-19 MEYKKRISIRLDERSAMLLNELSKITRTSTSIIIRGMVNRSIEELIDKSGNWKIPNEKDK EGKG >gi|225935328|gb|ACGA01000064.1| GENE 49 34900 - 35184 201 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174808|ref|ZP_05761220.1| ## NR: gi|260174808|ref|ZP_05761220.1| hypothetical protein BacD2_23329 [Bacteroides sp. D2] # 1 94 1 94 94 171 100.0 2e-41 MAMIASNYKQLKQLCADHSHGLYCSKDNEDIFQDTVLFVSLDEKASSLSTDKELIDHFCY RFRMIEYQAINDNKLLKEIPYADYLQAPKTTEEE >gi|225935328|gb|ACGA01000064.1| GENE 50 35141 - 35455 252 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301164546|emb|CBW24105.1| ## NR: gi|301164546|emb|CBW24105.1| putative endonuclease [Bacteroides fragilis 638R] # 18 100 2 84 84 120 68.0 4e-26 MPTIYKPQKQQKRNDNYYDAERRKVYNSDRWRRLRAWKFACNPLCEMCLKENKTTPAEDI HHISSFMSTDDPVQRNQLAYDYDNLMSLCKKCHQVVHNKKGSRT >gi|225935328|gb|ACGA01000064.1| GENE 51 35516 - 35914 296 132 aa, chain - ## HITS:1 COG:no KEGG:BVU_2807 NR:ns ## KEGG: BVU_2807 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 132 17 139 140 118 48.0 7e-26 MRTLEEDLLKMDSLHGDELDAHLYEMKALYTKPEEREAIRKHLDKTLDTIANNVESISSR LNIREQMNDIIDLIPVSYIAKNYFGKSRAWLYQRINGYKVRGHVYTLNEKELEIFNRALK DIGNKIGSLSVS >gi|225935328|gb|ACGA01000064.1| GENE 52 35955 - 36140 226 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174812|ref|ZP_05761224.1| ## NR: gi|260174812|ref|ZP_05761224.1| hypothetical protein BacD2_23349 [Bacteroides sp. 
D2] # 1 61 1 61 61 65 100.0 7e-10 MKELNELERIEFEIEKEKQNLREWKRKVLILEIGKEDDEARTDAILERISELLERKEKLK K >gi|225935328|gb|ACGA01000064.1| GENE 53 36483 - 36884 418 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174813|ref|ZP_05761225.1| ## NR: gi|260174813|ref|ZP_05761225.1| hypothetical protein BacD2_23354 [Bacteroides sp. D2] # 1 133 1 133 133 227 100.0 2e-58 MEKKKKISFKLPETIKHKEARKIISDLVKQLNDRGMLEIADIPQLHRMATAYDAYLECVE VLARDGMTMENLKGEWVKRPEANLLKENWSQYLELAKEYGLTAKSKGQIKAMNAGDNEES PLETYLKGKKETR >gi|225935328|gb|ACGA01000064.1| GENE 54 36884 - 38503 1055 539 aa, chain + ## HITS:1 COG:ECs1598 KEGG:ns NR:ns ## COG: ECs1598 COG4626 # Protein_GI_number: 15830852 # Func_class: R General function prediction only # Function: Phage terminase-like protein, large subunit # Organism: Escherichia coli O157:H7 # 6 529 4 528 553 295 33.0 2e-79 MQTKTYYKYAQDVIGGKVVSGKFIQLAAERFFSMMEDDRYEFKEVKADEVIEFFSILRHF TGRHAGKQFILQPWQQFVIAAIYGFYIKETDERLVKYVYIEIARKNGKTAFAAGLSLYHL IADGEMDAEVDLAANSKEQAKIAFKFCSQFAKGIDPKGKDLVSYRDKVKFEKMLSLLQVF AADDSKLDGFNASMYLIDEYHAAKNTGLKDVLQSSQGMRDNPMAVIITTAGFDKLGPCYQ YREMCTEVLSGLKENDALFAAIFSPDEGDDWKDPETWQKSNPNLGVTVKPQYLQTQVQSA VNAPSEEVGIKTKNFNIWCDSETVWIPDHYILQASANLDFEQFRDMDCYAGIDLSSTSDL TCVNFMFPTSDKYYFKTLYYLPEAALQEKRFKDLYGEWRRQGHITITPGNVTDYDYILND IMSIREIVFIQKIAYDSWNSTQFVINAEEKGLPMEPYGQNLGNFNRPTKEMERLILSGKA VIDNNVINRHCFRNVVMARDRNGNTKPSKQFEEKKIDGVIAKLEALGIYLVSPRYGEFY >gi|225935328|gb|ACGA01000064.1| GENE 55 38541 - 39791 750 416 aa, chain + ## HITS:1 COG:RSc1682 KEGG:ns NR:ns ## COG: RSc1682 COG4695 # Protein_GI_number: 17546401 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Ralstonia solanacearum # 38 395 43 397 407 158 28.0 2e-38 MKFLGFEIRKASKQETSQVTAWSFNGSRPMFTSRSKPMLLSTVYRCVDLISDSVAVLPLK TYHLDADGFKAEAKSHPAYYMLNMEPNEDMTRYVFFKTLMASVLLTGNGYAYIERDSKMN AAQLIYLPSSQVTITWVSDRSGIMRKRYQVVGFRELVDPRDMIHVLNFSYDGIIGVSTLE HARQSLGIATSTEEHAEGFFKSGASVAGILTVEAARLDKDKKDQIYQTWEERTNPVTGHP NGIAVLEGNMKYQPISISPKDSQFIESRQFNVVDLCRFFSVSPVKAFDLSKSSYSTVEAT 
QLQYLTDTALAVITKIELEINRKVFLPSERGRFISEFDTSAILRTDKSAQAAFWKDLATV GAATPNEVRRENNMPRIENGDKAFVQVNVQTLDNAVKEKVDEPQNNPDLSDKNLVS >gi|225935328|gb|ACGA01000064.1| GENE 56 39805 - 40398 373 197 aa, chain + ## HITS:1 COG:L112195 KEGG:ns NR:ns ## COG: L112195 COG3740 # Protein_GI_number: 15672503 # Func_class: R General function prediction only # Function: Phage head maturation protease # Organism: Lactococcus lactis # 1 179 11 182 196 82 35.0 6e-16 MDEKREIRNTSFQVQLTGDTEEKRTVEGYALLFNTPSDGLPFEEVIERGALDGVIEKSDV FALLNHNQTRGILARSKEGTGSLFLSVDDKGLKYRFEAPKTALGEELLENLRRGEIDQSS FCFDVEKDTWEKKNDGTWKRTVHKIGNLYDASPVYNAAYSKTSVCLRGKEQVEEELRKKE QTVPEEYYQSIEKSLNL >gi|225935328|gb|ACGA01000064.1| GENE 57 40403 - 41650 1050 415 aa, chain + ## HITS:1 COG:no KEGG:NMC0858 NR:ns ## KEGG: NMC0858 # Name: not_defined # Def: putative phage-related protein # Organism: N.meningitidis_FAM18 # Pathway: not_defined # 124 409 326 618 627 130 32.0 1e-28 MAKEKSITELKDERNQLIARSKEIINGAKTEKRQFKPEEAEELGTNQQRRAEIDLEIEEH EVMNRQQGKRHQPIANERFSLRRAIANMVDGAQQNESDASVIDAATTLHNTSGAQMADKR SIVVPVNLESRAAFTAATEAATGVIIDEEQQEMLLPLQSALVLARAGARFMTGLQGNIYW PQFSGANVFWEGENEEAKDGAGKFSKGDVFKPLRLTAYVDISKQLLVQENASVEAYIRQA IAVAIAQKIEQTAFSKTKGVENTPDGMFHTLSNTVKGDINWAQIVAMETSADTQNALFGN LAYILHPSLIGKAKTKVKDASGAGGFIFAGNGDGQLNGYKALRTNNLPKELGEGNDEFGA VFGNWSDYFLGQWGGIELLVDPYTQALKGTVRLITNSYWNMGFIRKESFCVASMK >gi|225935328|gb|ACGA01000064.1| GENE 58 41652 - 41963 136 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174818|ref|ZP_05761230.1| ## NR: gi|260174818|ref|ZP_05761230.1| hypothetical protein BacD2_23379 [Bacteroides sp. D2] # 1 103 1 103 103 190 100.0 3e-47 MTYVTLDMAKRHLNIEPSYTDEDSYIEALIKVVEEKTAKELCVSVEELATIDGGKNIPTP LVQAMLLCLGAYYANRENTAYATLKEIPHGAKYLVDLYRDYSK >gi|225935328|gb|ACGA01000064.1| GENE 59 41960 - 42286 234 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174819|ref|ZP_05761231.1| ## NR: gi|260174819|ref|ZP_05761231.1| hypothetical protein BacD2_23384 [Bacteroides sp. 
D2] # 1 108 1 108 108 215 100.0 9e-55 MRAGLLRETLVFKSPIETQSPTGAVKKEYKEVFQCRACRKKMSLIADRDGVSAMEQFIGH TLVFQVRNYPAIKENLHVVYNGNEYNLKMVNPQMNDNSLLLTLEKIDT >gi|225935328|gb|ACGA01000064.1| GENE 60 42283 - 42720 155 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174820|ref|ZP_05761232.1| ## NR: gi|260174820|ref|ZP_05761232.1| hypothetical protein BacD2_23389 [Bacteroides sp. D2] # 1 145 1 145 145 250 100.0 2e-65 MIEVKQIDRENIQYLVDNLEDFEKDKAIRSGLRSAVNVFRVKGRANLRSRLLHRGKQTNH LMNSFTNRVKRNKLGALAGFDRPGGNHSHLVDSGTKVRTTKSGANRGIMPANRFWSDAKV SEESKAMSALYQGVQKAVQRINNRS >gi|225935328|gb|ACGA01000064.1| GENE 61 42720 - 43088 284 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174821|ref|ZP_05761233.1| ## NR: gi|260174821|ref|ZP_05761233.1| hypothetical protein BacD2_23394 [Bacteroides sp. D2] # 1 122 1 122 122 229 100.0 3e-59 MNKLAITTEIRNILLDSGDITSLIGKKIFPVVAPMKTEGDFIIYQRDGYKQEYTKMGVAR QIPTVFVNAVSDDYDRSLELASLIYEALEGDFSNPDMTIHLEDSTEDYSEGKYFQVLQFS VE >gi|225935328|gb|ACGA01000064.1| GENE 62 43122 - 43604 273 160 aa, chain + ## HITS:1 COG:no KEGG:BF2452 NR:ns ## KEGG: BF2452 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 150 4 144 160 65 34.0 8e-10 MAAVKHDSSKDLVRGQLFLFLGENPVAFASSCSLEVSVEEIDISNKMCGDWAASLPGKKS FTISSESLLTRLQGATSYDELLKHVDTGETFQFVVGESTITDKTNVGGSFAIDTTKPNYK GEIMLTSLSLKSDNGQIATCSASFKGVGALQKVEAVPAGE >gi|225935328|gb|ACGA01000064.1| GENE 63 43668 - 44417 317 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174823|ref|ZP_05761235.1| ## NR: gi|260174823|ref|ZP_05761235.1| hypothetical protein BacD2_23404 [Bacteroides sp. 
D2] # 1 249 1 249 249 488 100.0 1e-136 MSLFITIIIIVFTLVTVVAIIDSKCCRPGPPTVVNSPAPRKRFRFDPKVKLNIQSIIRWE QLRGKSFSLMDYADKDDVDALLYTTTVCNNEGKMYTFEVFRHTLSNEKITREMVMALERE TAVLAQFQKKQKEDDIANHDVTPGMIGELVATLIMAGLDAHYALEEMELCDLPIYIDAYE RRKKEQMESDRLWTYYKILPHIDASKLKNGARDLITFPWEEMEDMKEAERALAEDADIFE LFMNNGINT >gi|225935328|gb|ACGA01000064.1| GENE 64 44431 - 47601 2302 1056 aa, chain + ## HITS:1 COG:YPMT1.11c_2 KEGG:ns NR:ns ## COG: YPMT1.11c_2 COG3941 # Protein_GI_number: 16082793 # Func_class: R General function prediction only # Function: Mu-like prophage protein # Organism: Yersinia pestis # 57 244 18 206 450 68 28.0 5e-11 MAGKLSFSIAINLLTENFKRGSNQVKAAFRSMQMQLLTFAAALGAGGLGLSNFVSRLIEV ARETNRVTTALKNVSGTMSRYADNQRYLLDLAKKYGLEINALTANYAKFTAAASISGMSM MNQRKVFESVSRACTAFGMSADDSNGVMLALSQMMSKGKISSEELRLQMGERLPVALQAM AKAAGVSVGGLDKLMKQGKLMSADVLPKFAEALNEMIPNVDTDNLETSVNRLKNAFTDFT NNTGIQDLYKQIVDGTTKAVQYVQKNLKTLLTWIVSAVSGYIGGKVFGYITAEFAKMQRA ALVAAKKMAKEAGQAFDEAGFRANRFYNSAVAGAAKLNIALKSAVKSLAPMMIIAGITQI IQVISAWRDRQKEVNEKYQEYQKGAQKAGAGNYAEVIKLTQLRNIINDTNNSYKTRVGAL NQLNTLLGTAFSIDSKTLKVNGDINKAIQQRVALMRNTALADYYTNQYTQSLQERKELTK QKHEIVDKYNKRHPQSPIDVNSTSYDYDSFVKTSKDILPFKELELITKDLNNTNKILVDA DKNSKAYAALVAKEEIKNTPITDPDDSKKKKTPLQKEQESFNKQFEELKAELEIGKITQA EYNKNLGELNIKMYAQAKGTGNKKVLESEYLKARKQAAEKAIKDQDKNTALVEFEKVQKD YNTKVEEARAQQAKGLMSQKDLNSNIASLSIEAAKSAAGIKGIGDEADVFISAIQLKAKL LSKRTKEQKYDATFDYKKTKPEILSDKLNIAKKWRDDAKEEAKVLGAEMTEELNKAMTKV TSLDEALKLAQVKEDIKNFNKELNESLYSGVKDIASSSDRIVSAFSNLRDVMNDVDASGW ERIMAIWNAMTNVVDTFLSIIKMIESLTEITNKLTQAKEQEAKMTGVATASKVTEAAVDT TVTGVKVANSEIKKTADTSEAATAVTTATKEVAANTAKGVSAAGSSAASLPFPANIIAIG AAIAAAVALFASIPKFATGGVITGGPTSGDKILARVNAGEMILNRGQQSRLFEAINSGKL GGSGNMSSTVTTRVRAKDLILTINNELKSQGKKPIS >gi|225935328|gb|ACGA01000064.1| GENE 65 47598 - 49733 871 711 aa, chain + ## HITS:1 COG:no KEGG:BF2448 NR:ns ## KEGG: BF2448 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 710 1 689 693 546 43.0 1e-153 
MSYALIYTVPFATLDNVPCVVEIEKEGYTGKSKELTPAGDSPFTVDIEDEEFLYTPTRFS SATIRVVGSDYLQNLFSTGYQMYRVTLKVDGLVTWCGFIKPELYTQDYTLKTFNLDLECI SAMSTLEFIDYKQIGESRTFVSFWDLIKKCITSASAQYNAIYFPHVYAKDTESYAEGTNV LENMTVSEQNFFDEDDKAMTLKEVLEEICKFLNWTCVDWKGELYFIDIDHIGEFYKYDPI TFKKNGTVSPTLLNIQNVGFAGSDHALDILPGYNKVTVKCSNYPIEEIKITEDFDKLKLL SNIGEVSTNLGNGNTRHTQREVLYPNILTMHQFTYKNGVLSPVTDLSIYKNKSNAAELLG AIPLRYASYESGLKTPTTQSYNYECAIQVRQRCGTKYDPINDITPNSVFNDSTVVISAKN EALFFGKGGALSLNMSIKVLQKDKYDSPFGGGIVPSEGGITYLKDMVKVGIRIGDKYVSK DDYGRFTWSDTPATMSISLDQSNVENADGKMGTGFVPLYKTYGVLGKYSDADGVVINIPT NIYGTLELSIYAPTLTEREGQVPYGYLIKDLKLKYCSPIDINDNENSDRTYENVVNENYI NELDEIEFKISSYNNDGACHSKVIWDDDYLTDNLYLAIEGTTVRPEEQLIRRIIKRYSAP RIKLTQVIKHTSDLTPLSRLSDNYMVNKRFINVGGTIDYKMNRFECIMIEV >gi|225935328|gb|ACGA01000064.1| GENE 66 49730 - 51001 673 423 aa, chain + ## HITS:1 COG:no KEGG:BF2447 NR:ns ## KEGG: BF2447 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 143 4 137 1324 109 49.0 2e-22 MSDQILIKSKAIPSNPRSKNYPAGATVVRSGNGGGSTVITGSGGTNIDIIKVDDMRSLTD KNVLSSLRVLAEILSRIIKSDEVTELSDSNVLSSLRINKELDTINERFKEAIDSLKDLYL SKVKNDTATGLITFSKGLISEELIEANNGLVVRKTEAVEPMLMSLLSEEFEDGIVEENED VFIEEMHIATSGAVTLGELDNVTDEADKISDTDDLLVRLAGASGWTINTTLFSQVSQLMS KVFPFTMTLSGGGTYEKGSSLTINLSWTYDRDIESQSINNESLLIGIRAKQYVNVATDTT YTLKAIQGGQAYTKSVSAQFKVKKYYGVSANGTLTNDEILALSSTWAGRTQGSTVFDCTG GKYPYYILPTSMVTGIQFWIGGLRNTDWKEETREVTNAFGYKESYTIYRLNSIQTGVLNI EVK >gi|225935328|gb|ACGA01000064.1| GENE 67 50998 - 56559 3047 1853 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 1587 1811 570 817 1734 79 27.0 5e-14 MSEELKGTNVYSPIVPGTSRDVYPTHYSIYGKGGHKEVSTIDARNAITADRLTEGCVVYV KETDKEYQYKNGEWVDYQTNFDDTVLRELIDKKVDKVDGKNLSTNDFTDADKEVIAIHSE EIDSLQDSVNDIYQRLDSTTGVQYYIRVQNNGDKSFTSQKGEPCVLNFTFISQERYSYND PYENTGERGKCEIFIKNSVSADYTLIKTLMVNSITATKVDIAEFLANGANSIMVKITGEV 
TGQATPAYTYNVTMTSLSVKADTFQWWTLYSGAISIPLYISGNVNKTLKVTLEGENYAKA YEQVLGNVIYTDTALNFSIDHPEQTGVYKLSVYLENSDGTIKTKTVSFNIMCASEGEQVK LMCVNSLSDKASNWANNKLFEYAVYDGDATATSGTFSIKMDDLTVYTSEESTIPTNTKNS FSYAMEIETVDDTDFEISVAVSDNGEPLTDTMIFPVSNSSGFSATAGSVFYMNPRTRTNS QSNYQKIINEIDSSQIAAEWEGMNWNNDGWTVDSDGNRVLRMMAGSLLDIGYKPFEIESA RNGKTIELDYKIYNVTDYSEPIITLSVPDGQGFTGLNIYANNIWPCSQSLKNEELQSIPT DDGVRVRIAMTISPNMYGNAGFNLCSIYINGKKNRTFLYESNDYWAQNGDIIIGSDYADV DVYGIRIYETGLGSNAVHKNYINWLPGTDEKVEESENNNLYDAMATQLDFDAIRAKMNVF VFDNIFPSYYDTAKRTGTLEILFVNRPERNVTITNVEMSGQGTSSKKYWEWNEKCKVDKT KSVITYADGSTTTKKFIMFDNVPACASVTFKKNWASSMQDHKAGSVNSYTDLYKQLGLTN EAMALDPKVRVSVYQEPFMAFRKELNDEGEIVYTCMGEFTGGPDKGDKYCFGYDTDLFPG LISIEGADNSPLPALFRVPWNTGRITYSEDEESWQYNGENSIGFDGGLPENIKYWIPAYN LAYSCSSKICPFDGTLGELNADASGYKENGVEYWIAKPGDTNLYNLYYYEAAEKQFIPSD IGEGQINLIHQLVNKGYGLSSADLVGKTNDELNTLFINARIAKFRAEAKTYFDISDAIFH HNFTEFVAATDNRAKNTYPYCFGEGCKWKWRQDDLDTIMPITNQGQLRKGYYVEVHDSYD TGAPVWNGETSVFWNLLELAFPDELAAGMRSMMSAMEVLGGLKSGTHAEKVYAWYQKYYL NVKEYFPAVTVNEDSKRYENAKLMMNAGRYTNDTDPLTQELGDLYSAETAWMKKRIQYMS SKYSFGEYSANGTDSINVRAAGNAIAYDIIPAIDMYPTIANGTSIVKGSRTKAGQVCRMI IDLGGTGDQQNIIQGASWLMSIGKWHDKNVNGNLIIKGRMLRELELGSRTERIVIAITGL TISDCVSLQSILLSNIATLAGSLDLSVCTHLRRVWADGTSLTQIRLPQGGCLELVQYPST NRYLTLQNFPLLKQEGVLIDDCAGKITDFFVSDCPKLNPVDLLIKIMDAQQEQGEAHALK RVRAVFGEYTYNENGAEMLDNLGKLADGTYVGLNSSGVAGDDPRPVLDGTLHINTNCYED TAIALRSYFNRLVLNITGEFYLRFKDDFVGSYMVGNYGDGTGIIKAQLANITDSQFGTPF SNNTQIMYFDEFQYMTSISMIKPGLLKYCSNLKSFILPPYVTEIAYEAFRGTAIEEMTIP KYVEYIQQSIFSGCPNLKKVEFESGRMKPLKLRWSIFEKCGSLESFEFPDEIISEGMDNF FDYCSSLKTVILPVNGLTLLGKLMFRGCNVLTGLEVPASVTSFGNDLFYDCKSMIYLKIS SSVPPACSSANLGYGFNAACKIYVPDDSVETYKGNSWWSQYSSRIYPLSSYNK >gi|225935328|gb|ACGA01000064.1| GENE 68 56569 - 58890 1178 773 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3975 NR:ns ## KEGG: Cpin_3975 # Name: not_defined # Def: OmpA/MotB domain protein # Organism: C.pinensis # Pathway: not_defined # 319 471 191 330 1026 90 34.0 2e-16 MAILSNGKFYGFLCSVKETGQKLANGVKEYVEDFMSGFAGHGWKLWEYIKGKWMLEIDAI 
RVRGQFTIFELLVSKIRAIIGAQTITQGCGKIKTVGISEDGAAYLITLEDTDMSFMEHDF IRCQEFTGNQRLYHVEIESIVDGVIHIPVSEFESEVNEDGITFVTNPPMPGDDIVQFGNS SYEEQYAGRHSAIYMHADESGQPAIDVLDGIYTKDWSNCLKVRMGGDIPGTAGLKGFYCV NGMLKAVDEDGTILYQFNPDSSGFIAKGNIRWDKDGNGDIFNRAIYWDKGGFHFGSGVKL TWDNLDSETKENLKGEPGRDGNNGADGINGEDGTSLVYKGEFTSHPSNPQNGWYYRNISD KKTYVYQDNAWYVMTVDGSDGKDGLDGINGEDGKDGLDIVWKGDSSIPPANPQKNWVYRD TDNGRVYIYNGTAWALMVADGNDGIDGTDGKDGMRVYITYHDSEAEPAKPTGRGTSDGWH TDATNAVVWISQKVSESADSGEWGDPIRVKGEPGKDGQDANLLPWIEKWNGYATELGEEY IVTPKMFSGIKSSDGNLTGIVQGKECLTDKTTGEKRTGIFAVVDGEIVFELDPINKKYKF KGRVEVEEGSLSLANGKIVLREDGTGVLADNSIYWDEYGYLYQRSRPRVIWRSAVHEMDE LGIHGRNPYNIDLRKGTYIETLTAYVGEPRYINLPNPSDVPGAILDINCRLGTRSFDVIG FKCQSASFMQTKSGSISAHSFVCLSPGSQGTITAVSTNDDSYWDISESFIDKE >gi|225935328|gb|ACGA01000064.1| GENE 69 58890 - 59276 348 128 aa, chain + ## HITS:1 COG:no KEGG:BVU_2812 NR:ns ## KEGG: BVU_2812 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 128 1 128 128 194 84.0 6e-49 MELNDWLTILGALGGLEAIKWMVNFYVNRKTDARKEDASADSMEDENERKQVDWLEDRIA QRDTKIDAIYVELRNEQNDKLIWIHKCHELELQLKDAEHNRCDRPDNDCSRRIPPRRVTI TKDKEEKK >gi|225935328|gb|ACGA01000064.1| GENE 70 59273 - 59767 126 164 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 151 1 100 116 91 43.0 6e-19 MKTIDSIIIHCSATRAGQDLRAKDIDRMHKQRGFSQIGYNFVIDLDGMVENGRPLSIDGA HCNTKGFSTTSYNKHSIGICYIGGLDPNGKPADTRTPAQRAALRELVAKLCKEYPIIEVL GHRDTSPDLDGSGEVEPAEYIKACPCFDVRSEFSNFLRNTVIRP >gi|225935328|gb|ACGA01000064.1| GENE 71 59911 - 60267 240 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174831|ref|ZP_05761243.1| ## NR: gi|260174831|ref|ZP_05761243.1| hypothetical protein BacD2_23444 [Bacteroides sp. 
D2] # 1 118 23 140 140 199 100.0 4e-50 MNVNKQTKVTTDKLSDLKIENKTVYLSAPDSTGKQHPIKESTTTASKQEQERTEVYETLS ITLQQFSNRLDTISNKVNVLLNQKETVVELSWWDLHKDKVYIGIIGLLIAGWLAYRLK >gi|225935328|gb|ACGA01000064.1| GENE 72 60705 - 61619 326 304 aa, chain + ## HITS:1 COG:no KEGG:GFO_0259 NR:ns ## KEGG: GFO_0259 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 5 298 57 352 360 70 25.0 6e-11 MGLVIDAPENCSFVESTNDCACFFNKLRLSESLWIVNRRKFLYVSLSNVKNIDFHAIVTF LSIIDEMKSRGINTRGELPKDLNCRRFLVVFRLLSKMYDENGKKFQNTSKSELMYFEKGE GKLTVSQMEMISNCVEHTCKYLTGGLSSNTSLKSMLKEICGNSIEWSNSHNKKWQLGIMY GNNEVVFTAVDLGKGILDTIYRKHSQLIKEFFFNDDRLTILKNAFNKKYGSNTQEENRGK GLPGIKNCCDDGYIQELLVITNNVRLDFVNPNKSLIFATEKNLFKGTMYRWVLRKECFFN SIDN >gi|225935328|gb|ACGA01000064.1| GENE 73 61624 - 62040 303 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174833|ref|ZP_05761245.1| ## NR: gi|260174833|ref|ZP_05761245.1| hypothetical protein BacD2_23454 [Bacteroides sp. D2] # 1 138 6 143 143 261 100.0 1e-68 MAVLNIAKDFSKYTGLRHCDISERSGEEFYHSLLNQKFKEAFEKKERLTLELDGADGFAP SFLDEAIGNLVFDFSLKIVEKYLNIVSVYEPHWLEMIKSQTYLQWEDRRLKDKKPKTTEE HEAWYRIVNDSLELKKWS >gi|225935328|gb|ACGA01000064.1| GENE 74 62031 - 62534 254 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174834|ref|ZP_05761246.1| ## NR: gi|260174834|ref|ZP_05761246.1| hypothetical protein BacD2_23459 [Bacteroides sp. 
D2] # 1 167 1 167 167 278 100.0 8e-74 MELISVVNSLVITVSDWIQIGGIAITAGLSIWIVNTIQAKVDSKRFIKEFFINEILEIRN EYRVLIGQLKNGELKPRMVKYKTKELNIRVNDLMSILKEQYNINFNYLLSYQLELLSIVM DSREFITNFTSNSTFSLSEQTLGDLSIFENENDGKFSKLIMEVNKFE >gi|225935328|gb|ACGA01000064.1| GENE 75 62698 - 63936 476 412 aa, chain - ## HITS:1 COG:no KEGG:BVU_0168 NR:ns ## KEGG: BVU_0168 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 392 1 391 393 188 32.0 3e-46 MATLSLTILSAKPTKAGKFPILIRVSVKNDKEYIKTEYQLDDACQWYNGKVVARNDATMM NKQLIYELKKYKERLQYIDNYDCYTAKQLKIILTQQDKIAPDIRTFNDFMRQRIKEKKEE GKTSHAKMLEDTLKIFESAEGEVPIIIMNHITVEHFDRWMKLHGHTDGGRQIRLSHIKAR INEAIKLGIIRCDKHPFAYTKIPIPEPRSLDISVDAIRKIISADVSKSKQLSLAKDMFLL SFYLGGINYADLIQVDLSEDVISYIRLKSGEHKTKNRTTSIAVCPEAKSIINKYIGKNGK LKFDYKYTAGNLQRYINKCLKLLATELNIKEGLTFYSARKTFAQFASEIGIPYPIIEYCL GHSIKTSITINSYVRVKQHQADAAIQRVIEYINNPEVFKPYIEMRAQMQMMM >gi|225935328|gb|ACGA01000064.1| GENE 76 64440 - 65174 828 244 aa, chain + ## HITS:1 COG:no KEGG:BT_0766 NR:ns ## KEGG: BT_0766 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 24 267 267 376 87.0 1e-103 MKKLIFLFAFCIVVTNVFAQTDPEQLKKEGNDAFNAKNYPVAYAKFSEYLKQTNNQDSAT AYYCGIAADAVKKYPEAVTFFDIAIQKKFNIGNAYARKALALDAQKKTAEYVATLEEGLK VDPKNTTMVKNYSLHYLKAGLAAQKAGKAEEAEDCFKKVIPLDHKQYKTNALYSLGVLCY NDGANILKKAAPLANSDADKYAAEKEKADGRFKEALDYLEEAAKISPENENVKKMLPQVK AVMK >gi|225935328|gb|ACGA01000064.1| GENE 77 65303 - 66310 574 335 aa, chain + ## HITS:1 COG:PM1464 KEGG:ns NR:ns ## COG: PM1464 COG0147 # Protein_GI_number: 15603329 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Pasteurella multocida # 10 334 7 323 324 295 46.0 7e-80 MHLYNKEQAIKRMNQLGQLHRPFIFIINYLQDVSYIEEVAAVDSTEVLYNLNGFTNQIIS AEDNIATYSAKTMPSLHWQPFAESFSSYQRSFNIVRQNILAGNSFLTNLTCRTPVETNLT LNDIYFHSKAIYKLWIKDRFTVFSPEIFVRIHQGKISSYPMKGTIDASIPFAAQLLMNDP 
KETAEHATIVDLIRNDLSMVANRVSVSRYRYMDRLQTNRGAIFQTSSEIQGILPENYQEH LGDIIFRLLPAGSITGAPKKKTMQIIQEAETYDRGFYTGVMGYSDGIDLDSAVMIRFVEQ EGEKMYFKSGGGITCQSDVESEYNEMKQKVYVPIY >gi|225935328|gb|ACGA01000064.1| GENE 78 66294 - 66890 366 198 aa, chain + ## HITS:1 COG:HI1169 KEGG:ns NR:ns ## COG: HI1169 COG0115 # Protein_GI_number: 16273093 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 181 1 185 188 124 36.0 8e-29 MYPFIETIRIEDGQIYNLDYHTERFNETRAAFWKDSTPLDLREFISPPTLNGIHKCRIVY GKEVEEVTYAPYQMRQVSSLHLVVSDTIDYTYKSAHREELNALYAQRGMADDILIVRNGY LTDTSIANVALYDGDTWFTPAHPLLRGTKRSEFLDRQLIVEKDIPQICLKDYSHIMLFNA MIDWKRIILPINEKHLIL >gi|225935328|gb|ACGA01000064.1| GENE 79 67177 - 67842 600 221 aa, chain + ## HITS:1 COG:XF2023 KEGG:ns NR:ns ## COG: XF2023 COG5587 # Protein_GI_number: 15838617 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 7 218 19 234 237 103 28.0 2e-22 MNIPVIFQFLKDLSANNNREWFNEHKAEYETARAEFDNFLATVIARISLFDETIRGIQPK DCTYRIYRDTRFSADKTPYKIHFGGYINAKGKKSDHCGYYVHLQPDGSMLAGGSLCLPTN ILKAVRQSIYDNIEEFVAIVEDPEFKKYFPVIGEDFLKTAPKGFPKDFKYIDYLKCKEYV CSYNVPDDFFTRPDMLEQIDKVFRQFKRFADFINYTIDDFE >gi|225935328|gb|ACGA01000064.1| GENE 80 67880 - 68326 578 148 aa, chain + ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 147 1 150 152 97 35.0 6e-21 MVVDTLENLEKYASLNPLFAQAIEFLKSHDLQAMEVGKTELKGKDLLVNIAQTKPKTKEE AKLETHNEFIDIQIPLSGTEIMGYTAAKDCVPANAPYNAEKDITFFEGLAETYVAVKPGM FAIFFPQDGHAPGITPDGVKKVIVKVKA >gi|225935328|gb|ACGA01000064.1| GENE 81 68345 - 70357 2112 670 aa, chain + ## HITS:1 COG:YEL011w KEGG:ns NR:ns ## COG: YEL011w COG0296 # Protein_GI_number: 6320826 # Func_class: G Carbohydrate transport and 
metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Saccharomyces cerevisiae # 8 670 12 704 704 567 46.0 1e-161 MEKTLNLVKNDPWLEPFAGAITGRHQHVLDKEAELTNKGKQTLSDFASGYLYFGLHRTDK GWTFREWAPNATHIYMVGTFNNWEEKAAYKLKKQKNGIWEINLPADAIHHGDLYKLNVYW EGGQGERIPAWATRVVQDEQTKIFSAQVWAPENPYKFKKKTFKPDTNPLLIYECHIGMAQ QEEKVGTYNEFREKILPRIAEEGYNCIQIMAIQEHPYYGSFGYHVSSFFAASSRFGTPDE LKALIDAAHEMGIAVIMDIVHSHAVKNEVEGLGNFAGDPNQYFYPGARREHPAWDSLCFD YGKNEVIHFLLSNCKYWLEEYKFDGFRFDGVTSMLYYSHGLGEAFCNYGDYFNGHQDDNA ICYLTLANEVIHQVNPKAITIAEEVSGMPGLAAKVEDGGYGFDYRMAMNIPDYWIKTIKE KIDEDWKPSSMFWEVTNRRQDEKTISYAESHDQALVGDKTIIFRLIDADMYWHMQKGDEN YVVNRGIALHKMIRLLTSSTINGGYLNFMGNEFGHPEWIDFPREGNGWSCKYARRQWDLV DNKNLTYHYMGDFDKEMLKVLKSVKDFQATPVQEIWHNDGDQVLAYERKDLIFVFNFNPK QSFTDYGFLVTPGAYEVILNTDDVAFGGNGLADDSVVHFTITDPLYKKEKKEWLKLYIPA RTAVVLRKKK >gi|225935328|gb|ACGA01000064.1| GENE 82 70447 - 70935 448 162 aa, chain - ## HITS:1 COG:no KEGG:BT_0772 NR:ns ## KEGG: BT_0772 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 158 158 259 81.0 2e-68 MKKFMICAIALFLLLSTSVFGDTPPGNVQSTFKKMYPKANGVAWSQDDGYYCANFAMNGF TKNVWFNVRGQWVMTQTDLVSLDRLTPTVYNAFVSGPYANWVVDNVTMVEFPKWQAIIVI KVGQDNVDIKYQLFYTPQGVLLKTRNVSDMYDILGPSTFLEN >gi|225935328|gb|ACGA01000064.1| GENE 83 71074 - 72771 1481 565 aa, chain - ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 6 360 5 262 422 95 24.0 3e-19 MNDENKIIIYQVFTRLFGNNNNHCVYNGDIATNGCGKMADFTLKALSEIKKLGTTHIWYT GIIEHATQTDYRRYNIRPDHPAVVKGKAGSPYAIKDYYDVDPDLANDVQGRMKEFENLVH RTHRTGLKVIIDFVPNHVARQYHSDAQPDGTTELGANDDSSQSFSPYNNFYYIPQAELHA QFDMKDGAAEPYREFPAKATGNNRFDATPNITDWYETVKLNYGVDYQNGGTCHFSPTPDT WIKMLDILLFWASKDIDGFRCDMAEMVPVEFWEWAIPQVKEAYPEILFIAEVYNPGEYRN YLFRGKFDYLYDKVGLYDTLRNVACGYESASAITHCWQSLNGIEKKMLNFLENHDEQRIA SDFFAGNPRKGIPALIVSACMNTNPMMIYFGQEFGELGMDSEGFSGRDGRTTIFDYWSVD 
SIRRWRNGGKFDGKMLTEEHKHLYSIYQKVLTICNEEQAISKGVFFDLMYANINGWRFNE HKQYTFMRKYKNEILFFIINFDSQLVDVAINVPSHAFDFLQIPQMESYQATDLMTGAKEE ISLLPYKPTDVSVGGYNGKILKITF >gi|225935328|gb|ACGA01000064.1| GENE 84 72825 - 73631 496 268 aa, chain - ## HITS:1 COG:aq_1386 KEGG:ns NR:ns ## COG: aq_1386 COG1752 # Protein_GI_number: 15606577 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Aquifex aeolicus # 11 261 9 258 259 164 36.0 1e-40 MEVFTNNWNSRKYQIGYALSGGFIKGFAHLGVMQALLEHDIKPEIISGVSAGALAGVFYA DGNEPHKVLDYFSGHKFQDLTKLVIPKKGLFDLCEFIDFLRTNLKAKNLEELQLPLIITA TDLDHGRMVHFHRGNIAERVAASCCMPVMFSPVNIEGTNYVDGGLMMNLPVSTLRRVCDK VVAVNVSPIMAQDYKMNIVSIAMRSFHFMFRANTFPEREKCDLLIEPYNLYGYSNTELEK AEEIFGQGYNTANEVLNQLLEEKGKIWK >gi|225935328|gb|ACGA01000064.1| GENE 85 73798 - 75126 445 442 aa, chain - ## HITS:1 COG:ECs3097 KEGG:ns NR:ns ## COG: ECs3097 COG1145 # Protein_GI_number: 15832351 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli O157:H7 # 257 418 17 164 164 80 29.0 9e-15 MGILQDVIARISKATGKKKKRYSYSPAKNKLRWGVLGLTVVGFLCGFTFIVGLLDPYSAY GRVVVHIFKPIYMLGNNLLESLFARFDNYTFYQVDTSILSISSLLIAIITLATIFVMAWK HGRTWCNTICPVGTILGLLSRFSLFKVRIDTAKCNGCGLCATKCKAACINSKEHAIDYSR CVDCFNCLGACKQKALVYAPSLKKQSDVETPAPSSPDLDSSKRRFLVAGLVTAGATPKLI SQAKESVASLEGKKAYKKENPITPPGSISREHFQQQCTSCHLCVSKCPSHILKPAFMEYG LAGMMQPTVSFEKGFCNFDCTVCGDVCPNGAILPINVEQKHLTQMGYVVFIEENCIVYTD GTSCGACSEHCPTQAVAMVPYKDGLTIPHVNKEICVGCGGCEYVCPARPFRAIYIEGNPV QKEAKPFKESEEHKVEIDDFGF >gi|225935328|gb|ACGA01000064.1| GENE 86 75350 - 76285 827 311 aa, chain - ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 49 273 15 245 295 83 27.0 6e-16 MDRRDFLKTVAITGAALTIQRSEAMEVLTQTINKANGSNPDLVAVMGGEPEEMFRRAISE LGGMKQFVKPGQKVVVKPNIGWDKVPELAGNTNPKLIEEIVRQCFAAGAKEVVVFDHTCD 
DWQKCYKNSGIEAAAKKAGAKVMPAHLESYYKPIDLPKGKKMKKAKVHEAILDCDVWINV PILKNHGGANLTISMKNHMGIVWDRGFFHQNDLQQCIADICTLQKKAVLNVVDAYRIMKT NGPRGRSASDVVLAKGLFISPDIVAVDTAAAKFFNQVREMPLDTVGHLANGEALKIGTMN IDKLNVKRIKM >gi|225935328|gb|ACGA01000064.1| GENE 87 76300 - 76809 518 169 aa, chain - ## HITS:1 COG:no KEGG:BT_0777 NR:ns ## KEGG: BT_0777 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 3 171 171 236 87.0 3e-61 METSVKLYSLKHSNVKTYLFALLFVAGNIALPQLCHLVPYGGPTLLPIYFFTLIAAYKYG FLVGLLTAILSPVINHLLFAMPSEAVLPILLIKSSLLAGASALAARTIKSVSLLAILGVV LTYQVIGVAFEWAIVGSFYEAVQDFRIGIPGMLLQWFGGYALLKVIAKL >gi|225935328|gb|ACGA01000064.1| GENE 88 77022 - 77450 264 142 aa, chain - ## HITS:1 COG:no KEGG:XBJ1_0403 NR:ns ## KEGG: XBJ1_0403 # Name: not_defined # Def: hypothetical protein # Organism: X.bovienii # Pathway: not_defined # 4 140 6 141 147 99 43.0 4e-20 MIKKIDYISLFARLALAIGFLSAVADRLGLWTPLLGSENVVWGNMESFTTYTGVLLPWIP KFLIPLLAWTATIAETILGILLLIGFQKRWVALLSGILLLTFAFSMSFSLNVKAPFDYSV FAAAACAFLLYKESKVKQEEYS >gi|225935328|gb|ACGA01000064.1| GENE 89 77625 - 78335 592 236 aa, chain - ## HITS:1 COG:no KEGG:BT_0781 NR:ns ## KEGG: BT_0781 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 240 380 75.0 1e-104 MDIQTFIQNFKEAFGENAELPLVFWYSDEQKGETEKINGCFFKGMKTVRDGGIISLNAEN IGCGGGKFYTGFTEMPEHVPTFVSLKEKYKQTPEQVIDYIEQMGVSRTENKYLHFARIDK VESLEEIEGVVFLANPDTLSGLTTWAFYDNNAADCVVSTFGSGCSAVVTQATIENRKGGK RTFLGFFDPSVRPHFEADKLSYTIPMSRFKEMYETMRQSCLFDTHAWGKIKERMNG >gi|225935328|gb|ACGA01000064.1| GENE 90 78409 - 79296 668 295 aa, chain - ## HITS:1 COG:all2088 KEGG:ns NR:ns ## COG: all2088 COG2326 # Protein_GI_number: 17229580 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 9 287 6 282 289 320 55.0 2e-87 MKKDILKDLLAKPGKKQLVSDFDSSFTGDLSKQDAKEQLAKDIEKLSELQSMLYAQDRYS ILIIFQAMDAAGKDGTIKHVMSGINPQGCQVYSFKQPSAEELDHDYLWRINRSLPERGRI GIFNRSHYEDVLIAKVHPEIILSNKLPGVETINDIDPDFWKRRYRQINDFERYLTENGTI VLKFFLNVSKAEQKNRFMERLDDASKNWKFSSADIKERQFWEDYMNAYADVLTETSTELA PWYIIPADNKWFMRYAVGRIICDRMQQLDLHYPKLSKEGLEKLEECKKSVSNINF >gi|225935328|gb|ACGA01000064.1| GENE 91 79499 - 79663 251 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884529|ref|ZP_02065532.1| ## NR: gi|160884529|ref|ZP_02065532.1| hypothetical protein BACOVA_02514 [Bacteroides ovatus ATCC 8483] # 1 54 1 54 54 100 100.0 3e-20 MKYLNKEINGITLSRIGLGAMRMADVQQGVDTIHAALDSGITYLTFMVTEKVNW >gi|225935328|gb|ACGA01000064.1| GENE 92 79877 - 80452 343 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237718610|ref|ZP_04549091.1| ## NR: gi|237718610|ref|ZP_04549091.1| transcriptional regulator [Bacteroides sp. 2_2_4] # 1 188 1 188 306 363 98.0 4e-99 MEQVVDLALPEYFERISAEATFQEQLSVIDIDNSRRSPVQVKNYPIRLKGYSLIFVLSGE ITIGINYLSHTLKKNMVMQVYPDDIIEHTAYSADFKGYLIIHSAELKKEIMAMTSGIRLQ QAGQLKKLHTRQELSEEESLRLRKQVELIKSYIPDKEHVYHSYVIKNQIINLFFDLDNCR WHRYGDGERHQ >gi|225935328|gb|ACGA01000064.1| GENE 93 80659 - 81462 429 267 aa, chain + ## HITS:1 COG:no KEGG:BT_0971 NR:ns ## KEGG: BT_0971 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 1 265 265 417 76.0 1e-115 MKYLLICAGLLLIVFHSWGQERLADRIAPPSGYVRETCSDNSFTTYLRNLPLLPKGSKVL LYNGKEKANQAAAFAVVDMEIGNRDLQQCADAVIRLRAEYLWKHKRYADIKFNFTSGFTA EYKKWAEGNRIKVNDNQVQWYASGKGVDYSYKTFRNYLDMVFMYAGTASLSRELPAVLYT SLQPGDVFIKGGSPGHAVIVMDVAIHPNTGKKVFLLAQSYMPAQQIHILVNPTSRNLSPW YELTETDAGKLYTPEWIFEKKDLKRFK >gi|225935328|gb|ACGA01000064.1| GENE 94 81591 - 83093 1324 500 aa, chain + ## HITS:1 COG:MA3382 KEGG:ns NR:ns ## COG: MA3382 COG0174 # Protein_GI_number: 20092196 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 498 5 504 506 566 54.0 1e-161 
MNYDLSMNANQLVAFLQKTTSEFTKADIISFIQQKDIRMVNFMYPAGDGRLKTLNFVINN QAYLEAILTCGERVDGSSLFSFIEAGSSDLYVIPRFCTAFVDPFAEIPTLSMLCSFFNKD GEPLESAPEHTLYKASKTFTEVTGMEFQAMGELEYYVIAPDTGMFQATDQRGYHESGPYA KFNEFRTQCMSYIAQTGGQIKYGHSEVGNFTLDGYIYEQNEIEFLPVPVSQAADQLMIAK WVIRNLGFRYGYNVTFAPKITAGKAGSGLHVHMRIMKDGKNQMLKDGVLSETARKAIAGM MVLAPSITAFGNTNPTSYFRLVPHQEAPTNVCWGDRNRSVLVRVPLGWAAKTDMCTLANP LEAESHFDTSQKQTVEMRSPDGSADLYQLLAGLAVACRHGFEIENALDIAEKTYVNVNIH QKENEDKLKSLAQLPDSCEASADCLQQQRAIFEQYNVFSPAMIDGIIRRLRGYEDKTLRA DLEGKPMEMLDLVHKYFHCG >gi|225935328|gb|ACGA01000064.1| GENE 95 83292 - 84179 686 295 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0860 NR:ns ## KEGG: Dfer_0860 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 12 293 5 295 299 233 40.0 9e-60 MRIILLSAFALFLLLPENIQAQVTVKTGYISSSRYKDENGQYPGKGDMCFVEGNVNIPVS QKMNERNQPTLWMISAGGSYTAMNNKNLQSYIDQIINMQLSVTNIRPISKKWLLLTTVGA GVYTSSDVKLKNVLGQGGVIFIRQFKSNLSLGAGLAVNNTFGYPMVFPAIYFDWSTEGRY QIKVSMLNAMEVSAGMKMNKYLNLRIVAEMNGSLALLEREGKDTMFSQQFIIVGLQPEVS MGNSFSITATAGVSCSRIAYYTTRTLKAFFEDMSKDTDPYFKPAMYVSVGMKYKF >gi|225935328|gb|ACGA01000064.1| GENE 96 84203 - 85237 533 344 aa, chain + ## HITS:1 COG:no KEGG:BT_1459 NR:ns ## KEGG: BT_1459 # Name: not_defined # Def: two-component system sensor # Organism: B.thetaiotaomicron # Pathway: not_defined # 53 326 39 310 326 132 29.0 2e-29 MSVKILLYTLLKQIGIFLVVSFLVFRGTFVESDSLGLNLSTGVSLLLSVILWGIINLHTY WLIPGYLFQRRYKKYIIYLSCLVGFMLTAMVGAICFLGQYYVIPEQMQRLNSDLPFFLLI NALALILYFLAFSFTIFLHRWVAYQQRLNELENISIQTDLNHLKDQLQPEFFSRILNKVR TLLGEDGEKASLLIFKLSRLLRYQLYESERQQVLLGDDIDFMTDYMKLEKLCNPEFNFEV NILNEVRYIQVPPLLFMPVIEYVIQTTVCGENGKGENIRIDFQMEENQLRFICTYCLRKK ESLQVAPAFDKLRQRLDLLYPGGYQLKSGYNDEALCTIGLMLEL >gi|225935328|gb|ACGA01000064.1| GENE 97 85234 - 86295 507 353 aa, chain + ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium meliloti # 152 
338 167 363 383 91 29.0 3e-18 MKDTWNVYDWGKNLVNDRFRLISNLLLLFIILFLGFQNVYGDFTREGFLYAYPVSILVFA FPIYLNIYWLVPRFLYKAKRKLWCYWISFLGVNLVSVLLGFIFLSPLYQRYGIRGFCIQD NHAVSFDSIAYGVLVLLLSAGGCTSFELFRRWVVSDKKILELEKATKQAELQQLKKQINP HFLFNMLNNANILVKDAPDEASQILEKLDNLLRYQLNDSTRREVFLTADIQFLTSFLELE KVRRDHFEYTIFQEGNMENICIPPLLFIPFVENAVKHNLDSDNLSYVHLYFSVHNKQLTF RCENSKPRVPVKREGGIGLANVKRRLDLLYESRYTLQIEDKETTYNVNLHLNL >gi|225935328|gb|ACGA01000064.1| GENE 98 86292 - 87014 567 240 aa, chain + ## HITS:1 COG:ECs3261 KEGG:ns NR:ns ## COG: ECs3261 COG3279 # Protein_GI_number: 15832515 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 202 1 206 244 100 28.0 3e-21 MNCIIVDDEPVARKGMKSLVEQIPQLTLVGSFNNAIAASAYINEYTVDLIFLDIQMPGIT GIEFAQSIPVNTLIVFTTAYAEYALDSYEVSAIDYLVKPIEMNRFLKAVNKAIAYHKLLL SETAERVEEIQEEYIFIKSERRYIKINFSDILFIEGLKDYSILQLEGQRVITKMNLKNIH EQLPAKQFLRINKSYIVNTSRIGSFDTNDIFIKSYEIGIGGGNYKKTFFEEFVAKNLQKD >gi|225935328|gb|ACGA01000064.1| GENE 99 87291 - 87536 339 81 aa, chain + ## HITS:1 COG:sll0517 KEGG:ns NR:ns ## COG: sll0517 COG0724 # Protein_GI_number: 16332012 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Synechocystis # 1 80 1 80 101 85 53.0 2e-17 MNLYIGNLSYNVKESDLRHVMEEYGTVASVKLITDRETRRSKGFAFIEMPDDAEAAKAIE QLNGAEYVGRSMVVKEALPKN >gi|225935328|gb|ACGA01000064.1| GENE 100 87599 - 87772 274 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160884539|ref|ZP_02065542.1| ## NR: gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524 [Bacteroides ovatus ATCC 8483] # 1 57 68 124 124 99 100.0 9e-20 MSRRRQLEHEVSVAQERIKKAAKDTPKDILKLWEQELVDLELELNNMVDDEEDNNED >gi|225935328|gb|ACGA01000064.1| GENE 101 87909 - 88688 691 259 aa, chain + ## HITS:1 COG:no KEGG:BT_0786 NR:ns ## KEGG: BT_0786 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 259 1 240 240 380 80.0 1e-104 
MLNLIAVFLVCSGIGIAVHIAIDLTHRPQSMKIMNAVWILSALWGSYLALWAYNKFGRSS PMKAEDDGMKMDMSGMKDMSDMKGMDMSGMKDNMEMDMAMQGEMSMQGMQRPYWQSVALS ALHCGAGCTLADIIGEWFTNYVPVTVAGSQLFGNWVLDFILALMIGVYFQFYAIREMEKI SVGNALARAFKADFFSLLSWQVGMYGWMAIVYFVLFINEPLPKDTWIFWFMMQLAMLFGF FCAYPMNALLIKLGIKKGM >gi|225935328|gb|ACGA01000064.1| GENE 102 88799 - 89659 849 286 aa, chain - ## HITS:1 COG:BS_sucD KEGG:ns NR:ns ## COG: BS_sucD COG0074 # Protein_GI_number: 16078673 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Bacillus subtilis # 1 278 1 278 300 337 60.0 2e-92 MSILIDKSTRLIVQGITGRDGLFHAKKMKEYGTHVVGGTSPGKGGTDVDGIPVFNTMYDA VEQTKANTSIIFVPARFAADAIMEAADAGIRLIVCIAEGIPTLDVIKAHQFVEQKGAMLI GPNCPGLISPGKSMVGILPGQVFLEGNVGVISRSGTLTYEIVYHLTANGMGQSTAIGIGG DPVVGLHFRQLLEMFQNDPETEAIVLIGEIGGNAEEQAAEYIRNNVTKPVVAFIAGQSAP PGKQMGHAGAIISGSSGSAKEKIESLEAAGIRVAQKPSDIPKLLKR >gi|225935328|gb|ACGA01000064.1| GENE 103 89693 - 90823 935 376 aa, chain - ## HITS:1 COG:RC0599 KEGG:ns NR:ns ## COG: RC0599 COG0045 # Protein_GI_number: 15892522 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Rickettsia conorii # 1 356 1 363 386 333 46.0 4e-91 MKIHEYQAKEIFSKYGIPVERHTLCRTAAGVIAAYRRMGTDRVVIKAQVLTGGRGKAGGV KLVNNTEDAYQEAKNILGMSIKGLPVNQVLVSEAVDIAAEYYVSYTIDRNTRSVVLMMSA SGGMDIEEVARQTPEKIIRYSINPFIGLPDYLARRFAFSLFPQMEQAGKMAAILQELYKI FMENDASLVEVNPLALTKKGTLMAIDAKIVFDDNALYRHPEVHALFDPTEEERIEADAKD KGFSYVHMDGNIGCMVNGAGLAMATMDMIKLYGGQPANFLDIGGSSNPVKVIEAMKLLLQ DEKVKVVLINIFGGITRCDDVAMGLLQAFEQINSNIPVIVRLTGTNEHIGRELLRNYSRF QIATTMKEAALMALKA >gi|225935328|gb|ACGA01000064.1| GENE 104 91023 - 91910 1092 295 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 3 286 5 289 308 245 46.0 7e-65 MKAFVFPGQGAQFVGMGKDLYENSALAKELFEKANDILGYRITDIMFDGTDEDLRQTKVT 
QPAVFLHSVISALCMGDDFKPEMTAGHSLGEFSALVAAGALSFEDGLKLVYARAMAMQKA CEATPSTMAAIIALPDEKVEEICAAVNAEGEVCVPANYNCPGQIVISGSVPGIEKACELM KAAGAKRALPLKVGGAFHSPLMDPAKVELEAAIKATEIHTPKCPVYQNVDALPHTDPAEI KKNLVAQLTASVRWTQSVKNMVADGATDFTECGPGAVLQGLIKKIDGTVNAHGIA >gi|225935328|gb|ACGA01000064.1| GENE 105 91961 - 92797 500 278 aa, chain - ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 7 262 7 258 265 236 53.0 3e-62 MERHPVILSIAGSDCSGGAGIQADIKTISALGGYAASAITAITIQNTLGVRAVQSIAPDI VRGQIEAVMDDLQPVAIKIGMVNDIQIVRVISDCLQKYSPAYVVYDPVMVSTSGKKLMTD EAIEEIKKELLPLVTLITPNIDEAKVLTGKSIHNIQDMQAAAKMLTDDYHTSILLKGGHL EGDNMCDLLHTSESIYHIYEEKKIESHNLHGTGCTLSSAIATYLAKGYPMRESIQHAKTY ITQAIIAGKELNVGHGNGPLWHFPDSVAQMCTFCAVVS >gi|225935328|gb|ACGA01000064.1| GENE 106 93131 - 93868 736 245 aa, chain + ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. 
PCC 7120 # 11 244 17 241 248 89 29.0 5e-18 MYMQNLQKNTPLANNHISVDCVVIGFDGEQLKVLLVKRAGEDNGEVYHDMKLPGSLIYMD EALDEAAQRVLYELTGLKNVNLMQFKAFGSKNRTSNPKDVRWLERAMQSKVERIVTIAYL SMVKIDRTLDKNLDDHQACWVALKDVKTLAFDHNLIIKEAMTYIRQFVEFNPSMLFELLP RKFTAAQLRTLFELVYDKAVDVRNFHKKIAMMEYVVPLEEKQQGVAHRAARYYKFDKKIY NKVRR >gi|225935328|gb|ACGA01000064.1| GENE 107 93980 - 95503 1552 507 aa, chain + ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 2 483 3 487 500 207 31.0 3e-53 MFLLGYDIGSSSVKASLVNAETGKCVSSAFFPKTEANIIAVNPGWAEQDPESWWENLKLS TQAIMTESGVSAAEIKAIGISYQMHGLVCVDKNQHVLRPAIIWCDSRAVPYGQKAFETIG EERCLSHLLNSPGNFTASKLAWIKENEPAIYEQIDKIMLPGDYIAMKLSGEICTTVSGLS EGMFWDFKNNRVADFLMDYYGFDSSLIADIKPTFAEQGRVNAIAAKELGLKEGTPITYRA GDQPNNALSLNVFNPGEIASTAGTSGVVYGVNGEVNYDPQSRVNTFAHVNHTIDQTRLGV LLCINGTGILNSWVKRNIAPEGISYNEMNVLASKAPIGSAGISILPFGNGAERMLNNKEI GCSIRGLDFNAHGKHHIIRAAQEGIVFSFKYGIDIMEQMGIPVKMIHAGHANMFLSSIFR DTLAGVTGATIELYDTDGSVGAAKGAGIGAGIYKDNNEAFATLDKLDVIEPNVAKRQEYA DAYAKWKYRLEKSMTGNIPAPVLSSDK >gi|225935328|gb|ACGA01000064.1| GENE 108 95545 - 96861 1694 438 aa, chain + ## HITS:1 COG:SMc03163 KEGG:ns NR:ns ## COG: SMc03163 COG2115 # Protein_GI_number: 15966647 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Sinorhizobium meliloti # 6 437 5 435 436 462 53.0 1e-130 MATKEFFPGIEKIKFEGKDSKNPMAFRYYDAEKVINGKKMKDWLRFAMAWWHTLCAEGGD QFGGGTKQFPWNSNADAIQAAKDKMDAGFEFMQKMGIEYYCFHDVDLVSEGASIEEYEAN LKAIVAYAKQKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKNAIDATI ELGGQNYVFWGGREGYMSLLNTDQKREKEHLAKMLTIARDYARARGFKGTFLIEPKPMEP TKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN RGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGNGGTNFDAKTRRNSTDLEDIFIAHIA GMDAMARALESAAALLNESPYKKMLADRYASFDGGKGKEFEDGKLTLEDVVAYAKANGEP KQTSGKQELYEAILNMYC >gi|225935328|gb|ACGA01000064.1| GENE 109 97000 - 98454 1243 484 aa, chain + ## 
HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 12 481 9 479 491 422 48.0 1e-118 MINTTNEGSKLYLYSITSVAILGGLLFGYDTAVISGAEKGLEAFFLSASDFQYNKVMHGI TSSSALIGCVLGGAISGIFASRLGRRNSLRLAAVLFFLSALGSYYPEVLFFEYGKPNMDL LIAFNLYRVLGGIGVGLASAVCPMYIAEIAPSNIRGTLVSCNQFAIIFGMLVVYFVNFLI MGDHQNPIILKDAAGVLSVSAESDMWTVYEGWRYMFGSEAFPAAFFGLLLFFVPKTPRYL VLIQQDEKAYSILEKINGKTKAQEILNDIKATAHEKTEKIFTYGVAVIVIGILLSVFQQA IGINAVLYYAPRIFENAGAEGGGMMQTVIMGIVNIVFTLVAIFTVDRFGRKPLLIIGSIG MAVGAFAVAMCDSMAIKGVLPVLSVIVYAAFFMMSWGPICWVLISEIFPNTIRGKAVAIA VAFQWIFNYIVSSTFPALYDFSPMFAYSLYGIICVAAAIFVWRWVPETKGKTLEDMSKLW KKNK >gi|225935328|gb|ACGA01000064.1| GENE 110 98763 - 99629 455 288 aa, chain + ## HITS:1 COG:no KEGG:CCC13826_0816 NR:ns ## KEGG: CCC13826_0816 # Name: amaA # Def: acid membrane antigen A # Organism: C.concisus # Pathway: not_defined # 10 283 5 258 264 121 29.0 2e-26 MVTEQYKDAIYFSSLFAWHYPNVYKEISDILVYHHIEHGTLLHTKDYWCRDYMPIQWGFK SYIQFRYEPDYLKDKPQYKTNIIPVLKAMSRDMDITQSPLIVDGGNVVVCEANSKWIGSC MRGRKPIVIMTEKVFQENSQIDQSEVLAILKENFYGADIVFLPWDKYDICGHTDGIIHNI GDGKILVNLKVYPPEIEREMRRRLSDDFAVIDLKLSKYDENSWAYINMLQTRDVIIIPGL GLPTDGEALSQIKELHPSYDGRIYQINIAPIIKKWGGALNCLSWTVTK >gi|225935328|gb|ACGA01000064.1| GENE 111 99816 - 100286 358 156 aa, chain + ## HITS:1 COG:no KEGG:Alvin_2303 NR:ns ## KEGG: Alvin_2303 # Name: not_defined # Def: 5 nucleotidase deoxy cytosolic type C # Organism: A.vinosum # Pathway: not_defined # 9 152 2 145 149 197 63.0 7e-50 MTSALSTFKPILYIDMDNVLVDFQSGINKLSEYEKKEYEGRYDEVPNIFAKMYPYKGAID AFHRLVRFYDVYILSTAPWNNPSAWSDKLVWVKKWLGTYSYKRLILSHHKNLNKGDFLID DRLKNGAENFSGELILFGSEQYPNWDSVVDYLISSK >gi|225935328|gb|ACGA01000064.1| GENE 112 100372 - 101109 359 245 aa, chain + ## HITS:1 COG:no KEGG:TDE0221 NR:ns ## 
KEGG: TDE0221 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 152 8 162 216 85 40.0 1e-15 MDTLKKEIAILIQCSDFKEIEKQLQTINKLIVTNYMFELSNGLRIYPIEVEAYFKHPKFN DGFVHGNELQKNNYGKFYVHRTGMTKNSKIKGGTRGGIDLCLSDSTDIYYGILIRSAQFD DGTIKFGPNNVLKFIVEDKNLDYNTLEEEFVLKEAVEDCRDRENKSIILHSTRVGLGRNQ SDDFRDSQLRTIAGPLLSSYAYKEKENVFKHYIINENISKEEAEKISIDILGYCPKSLIK SVYQA >gi|225935328|gb|ACGA01000064.1| GENE 113 101156 - 102049 369 297 aa, chain - ## HITS:1 COG:PA0780 KEGG:ns NR:ns ## COG: PA0780 COG2207 # Protein_GI_number: 15595977 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 37 122 164 249 250 69 34.0 8e-12 MNSNKFSYQGQIETILHYLNSMIQKNWTGKTADEFNRLMSIPELAKIACMSARNLQLMFK AYTSETIHQYIIRTRMEYAQQLLKDNKKSIAEIYEYIGFANQSALNNTFQKKYNLTPREL QKKLLETSHTYPSCISPYRIVESETIPVLFLSYIGNYDTCSTVAFETYTWDCLYKYAKEN SLLPDKEDYWGIAYDDTDITSLEKCRFYACIAIQKGVGSNPPLTNPIKHMDLPQGTYAVY IHQGDYALLDAFYEIILKQLPQSYCLGETPILEHYLNSPTDTDVKELLTEVWIPIIK >gi|225935328|gb|ACGA01000064.1| GENE 114 102196 - 102390 105 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174876|ref|ZP_05761288.1| ## NR: gi|260174876|ref|ZP_05761288.1| hypothetical protein BacD2_23669 [Bacteroides sp. D2] # 1 64 1 64 64 105 100.0 1e-21 MIITMISHIRFWLKKYFFGKSNCFIFEIFQIREKWIEEVESNKSSYFPQLKQCFSLLKAP LSFS >gi|225935328|gb|ACGA01000064.1| GENE 115 102769 - 103059 158 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237713274|ref|ZP_04543755.1| ## NR: gi|237713274|ref|ZP_04543755.1| predicted protein [Bacteroides sp. 
D1] # 1 96 10 105 105 177 100.0 3e-43 MIRIIASIRLYKDGRRTPFYSGYRPLFDFIEETKTSGQITLLDREAFYPGDEGVVEIAFL IRRALGDNFSEGTKFTFGEGRKHVGEGEVKEILELE >gi|225935328|gb|ACGA01000064.1| GENE 116 103192 - 104730 1235 512 aa, chain + ## HITS:1 COG:BH1089_2 KEGG:ns NR:ns ## COG: BH1089_2 COG0388 # Protein_GI_number: 15613652 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Bacillus halodurans # 202 512 3 313 313 326 49.0 5e-89 MEQHPIKINKVQIRNLQIEDYVQLSQSFTRVYSDGSDVFWTREQIQKLIKIFPEGQIVTV VDDKIVGCALSIIVDYDKVKNDHTYAQVTGKETFNTHSSKGNILYGIEVFIHPGYRGLRL ARRMYEYRKELCETLNLKAIMFGGRIPNYHKYADKMRPKEYIERVRQREIYDPVLTFQLS NDFHVRKVMTNYLPNDEESKHYACLLQWDNIYYQPPTQEYISPKTTVRVGLVQWQMRSYK TLDDLFEQVEFFVDAVSDYKSDFVLFPEYFNAPLMSKYNDKGESQAIRGLAKYTDEIRER FMNLAISYNINIITGSMPYVKEDGLLYNVGFLCRRDGTYEMYEKLHVTPDEIKSWGLNGG KLLNTFDTDCAKIGVLICYDVEFPELSRLMADQGMQILFVPFLTDTQNAYSRVRVCAQAR AIENECFVVIAGSVGNLPRVHNMDIQYAQSGVFTPCDFAFPTDGKRAEATPNTEMILVSD VDLDLLNELHTYGSVRNLKDRRNDLYEVRYKK >gi|225935328|gb|ACGA01000064.1| GENE 117 104829 - 105878 752 349 aa, chain - ## HITS:1 COG:no KEGG:BT_0805 NR:ns ## KEGG: BT_0805 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 349 349 568 90.0 1e-160 MNYGISILFRAIPLAMAIFCFGYGAFIYGYGDDGSRVVAGPVVFSLGMICIALFCTAATI IRQIIHTYNKSAKYVLPVIGYLAAIITIIGGICIFSNATSTSAFVAGHVITGVGFITTCV ATAATSSTRFSLIPRNSKATSNEVPEGAFSLNQRRALVIVAIIVSLIAWIWAFVLLGNSH SHPAYFVAGHVMVGLACICTSLIALVATIARQIRNDYSEKERNKWPKLVLLMGSISFVWG LFVILADSGSANGTTGYIMLGLGLVCYSISSKVILLAKIWRQEFKLANRIPMIPVFTALA CLFLAAFVFELATTHADYFIPARVLVGLGAICFTLFSIVSILESGTSSK >gi|225935328|gb|ACGA01000064.1| GENE 118 106089 - 109577 3828 1162 aa, chain + ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 8 1162 2 1034 1035 835 40.0 0 
MGKRFTEYSQFDLSQVNKDVLKKWDENQVFAKSMTERDGCPSFVFFEGPPSANGMPGIHH VMARTIKDIFCRYKTMKGYQVKRKAGWDTHGLPVELSVEKALGITKEDIGKKISVADYNA ACRKDVMKYTKEWEDLTHRMGYWVDMKHPYITYDNRYIETLWWLLKQLHKKGLLYKGYTI QPYSPAAGTGLSSHELNQPGCYRDVKDTTAVAQFKMKNPKPEMTEWGTPYFIAWTTTPWT LPSNTALCVGPKIDYVAVQSYNAYTGEPITVVLAKALLNVHFNAKAADLKLEDYKAGDKL VPFKVIAEYKGTDLVGMEYEQLIPWVKPVEVSEDGNWKPSDKAFRVIPGDYVTTEDGTGI VHIAPTFGADDANVARAAGIPSLFMINKKGETRPMVDLTGKFYLLNELDENFVKECVDVD KYKEYQGAWVKNAYDPQFMVDGKYDEKAAQAAESLDIALAMMMKADNKAFKIEKHVHNYP HCWRTDKPVLYYPLDSWFIRSTACKERMMELNKTINWKPESTGTGRFGKWLENLNDWNLS RSRYWGTPLPIWRTEDNTDEICIESVEELYNEIEKSVAAGFMKSNPYKDKGFVPGQYSEE NYDKIDLHRPYVDDVILVSKDGKPMKRESDLIDVWFDSGAMPYAQIHYPFENKELLDNRQ VYPADFIAEGVDQTRGWFFTLHAIATMVFDSVSYKAVISNGLVLDKNGNKMSKRLNNAVD PFTTIEKYGSDPLRWYMITNSSPWDNLKFDVEGVEEVRRKFFGTLYNTYSFFALYANVDG FEYKEADVPMAERPEIDRWILSVLNTLIKEVDTCYNEYEPTKAGRLISDFVNDNLSNWYV RLNRKRFWGGEFTQDKLSAYQTLYTCLETVAKLMAPVSPFYADRLYTDLITATGRDNVVS VHLAEFPKYQEEMIDKELEARMQMAQDVTSMVLALRRKVNIKVRQPLQCIMVPVVDEEQK AHIEAVKNLIMNEVNVKEVRFVDGAAGVLVKKVKCDFKKLGPKFGKQMKAVAAAVAEMSQ EAIGELEKNGKYTLNLDGAEAVIEASDVEIFSEDIPGWLVANEGKLTVALEVTITEELRR EGIARELVNRIQNIRKSSGFEITDKIKITISKNTQTDDAVNEYNTYICNQVLGTSLELVD EVKDGTMLEFDDFSLFVNVIKD >gi|225935328|gb|ACGA01000064.1| GENE 119 109613 - 109993 594 126 aa, chain + ## HITS:1 COG:no KEGG:BT_0807 NR:ns ## KEGG: BT_0807 # Name: not_defined # Def: DnaK suppressor protein, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 126 209 98.0 2e-53 MAEKTRYSDAELEEFRAIIMEKLELAQRDYEQLKKSLMGLDGNDTDDTSPTYKVLEEGAN TLSKEETTRLAQRQLKFIQGLQAALVRIENKTYGICRETGKLIPAERLRAVPHATLSIEA KNSGKK >gi|225935328|gb|ACGA01000064.1| GENE 120 110091 - 110747 491 218 aa, chain + ## HITS:1 COG:no KEGG:BF2277 NR:ns ## KEGG: BF2277 # Name: not_defined # Def: lipoprotein signal peptidase # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060] # 1 204 1 204 210 362 92.0 7e-99 MKKLFTKGRIALLVIFSVLIIDQIIKVWIKTHMYWHESIRVTDWFYIYFTENNGMAFGME IFGKLFLTTFRIVAVALIGWYLYKIIKKGFKTGYIVCVALILTGALGNIIDSVFYGVIFN 
ESTHSQIASFMPEGGGYSTWFYGKVVDMFYFPIIDTNWPTWMPFVGGEHFIFFSPIFNFA DAAISCGIIALLLFYSKYLNESYHSLDKDKKEATDHEK >gi|225935328|gb|ACGA01000064.1| GENE 121 110689 - 111813 843 374 aa, chain + ## HITS:1 COG:no KEGG:BT_0809 NR:ns ## KEGG: BT_0809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 358 1 343 360 499 73.0 1e-140 MSLIIRWIKIKRKLRIMKNKFRFHLCLICMFVFAVAGCKVKRPSDVISESKMENLLYDYH VAKSMGDNLPYSENYKKALYIDAVFKKYGTTQAAFDSSMVWYTRNTEILSKIYDKVKKRL KDEQELVGDLIAKRDKKPKMTKQGDSIDVWPWQRMIRLTGEMMNNQYVFTLPTDSNYKDR DTLVWEVRYRFLEPMLADSLRGVTMAMQVIYEKDTINHWKTVTEPGVQQIRLFADTLGPM KEIKGFIYYPADSLEKGGALLADRFMMTRYHCTDTVSFAVRDSLNKIKALKADSLKKISA KENADSLHKMIDKEKDDMQRLTPEEMNRRRTGTHREKKPEQIEVEQHIQKERMEQRKERQ MNQRRQQQQRRQNR >gi|225935328|gb|ACGA01000064.1| GENE 122 111819 - 112205 67 128 aa, chain + ## HITS:1 COG:no KEGG:BT_0810 NR:ns ## KEGG: BT_0810 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 125 125 177 72.0 1e-43 MKRFASHYLLIPAVGFMKQQVVEITGEGIVRDVFPLTEEIESVEWMPGVIALLSENQMEK IKNTGMIFKNIPVFPSNLPNVPNQSSQCLMEYLHEAERKGEALFPYLFYPFDFTSMQPVD GTRHRLLR >gi|225935328|gb|ACGA01000064.1| GENE 123 112156 - 112920 677 254 aa, chain - ## HITS:1 COG:VC0803 KEGG:ns NR:ns ## COG: VC0803 COG0566 # Protein_GI_number: 15640821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Vibrio cholerae # 4 250 15 255 257 181 41.0 1e-45 MASLSKNKIKYIHSLELKKIRKEEGVFLAEGPKLVGDLLGHFPCRFLAATSSWLQEHPDI DASELVEVSQEDLSRASLLKTPQQVLAIFEQPSYTVNPEVVHQSLCLALDDVQDPGNLGT IIRLADWFGIEHIICSQNTVDVYNPKTIQATMGGIARVKVHYTSLPDFIRSLGDTPIFGT FLDGKNMYEQPLSANGLIVMGNEGNGIGKEVATLINRKLYIPNYPEGQETSESLNVAIAT AVICAEFRRQAAWK >gi|225935328|gb|ACGA01000064.1| GENE 124 113063 - 115372 1643 769 aa, chain + ## HITS:1 COG:no KEGG:BT_0814 NR:ns ## KEGG: BT_0814 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 769 1 772 772 1332 87.0 0 
MRRNFLYLLASAAVLSLASCTTTKFVPDGSYLLDEVKIRTDQKNIRPSSLRMYVRQNPNA KWFSLIKTQLYVYNLSGRDSTKWGNKFLRRIGDAPVIYSEDEAKRSEEEITKAVHNMGYM AATVKRSTKIKKKKIKLYYDVTAGKPYVVQSIKYDIYDPKIASLLKQDSARSLLKEGMYL DVNVLDADRQRITNKLLRNGYYKFNKDYIGYTADTVRNTYNVDLTQHLQMYKAHASDSAR AHQQYWINKINFITDYDVLQSSALSSVDINDSVHYKGYPIYYKDKLYLRPKVLTDNLRFA SGDLFNERDVQQTYSSFGRLSALKYTNIRFIETQVGDSTMLDCYVMLTKSKHKSVSFEVE GTNSAGDLGAAASVSFQNRNLFRGSETFMIKFRGAYEVISGLQAGYSNNNYTEYGVETSI NFPNFLFPFISSDFKRKIRATTEFGLQYNYQLRPEFLRTMASANWSYKWTQRQKIQHRID LINIAFLYLPRISERFKEDYINKGQNHIFQYNYQDRLIINMGYSYNYNSVGGSIINNTIA SNSYSIRFNFESAGNVMYALSKAANIRKNSNGEYAILGIPYAQYLKGEFDFAKNIRIDYR NSFAFHAGVGIAVPYGNAKTIPFEKQYFSGGANSVRGWSVRDLGPGSFAGNGNLLDQSGD IKLDASIEYRSKLFWKFQGAVFIDAGNIWTIRSYANQPGGVFKFDKFYKQIAVAYGLGLR LDLDFFILRFDGGMKAVNPAYETRKEHFPIIHPKFSRDFAFHFAVGYPF >gi|225935328|gb|ACGA01000064.1| GENE 125 115326 - 115778 209 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|299145876|ref|ZP_07038944.1| ## NR: gi|299145876|ref|ZP_07038944.1| hypothetical protein HMPREF9010_01328 [Bacteroides sp. 
3_1_23] # 1 150 1 150 150 288 100.0 6e-77 MNKLHTLIYSVLIFTCLSCQQQTPQTQIEQTAIDFCEAFYNFNYPVAKEWSTPSSQSYLS FLASNVGQTHLEQLKTRGAAKVSVISSEIDANLEEASVVCQIKNAFVIHPIGGKMEYVSS LQDTLELVRVNNKWLIRKDIPQQNGKQSHD >gi|225935328|gb|ACGA01000064.1| GENE 126 115872 - 116423 597 183 aa, chain - ## HITS:1 COG:no KEGG:BT_0815 NR:ns ## KEGG: BT_0815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 317 96.0 9e-86 MKRGKLTLVAVVLSGSLLFSSCVGSFGLFNRLSSWNQSVGNKFVNELVFLAFNIVPVYGV AYLADALVINSIEFWSGSNPMANVGDVKKVKGENGNYMVKTLENGYSITKEGETASMDLI YNKEANTWNVVANGESAELVKMNNDGTADLFLPNGEKMNVTLDAQGMLAARQATMSNFMF AAR >gi|225935328|gb|ACGA01000064.1| GENE 127 116517 - 116966 361 149 aa, chain - ## HITS:1 COG:RSc0240 KEGG:ns NR:ns ## COG: RSc0240 COG2259 # Protein_GI_number: 17544959 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 8 131 19 141 145 61 34.0 5e-10 MIYSFLFPTKSNTTKVSLLLLAVRIIFGILLMNHGIQKWSNFQEMSAVFPDPLGIGSPLS LGLAIFGELVCSMAFIIGFLYRLAMIPMIFTMIVAFFVVHANDVFAVKELAFIYLVVFIL MYIAGPGKFSIDHIIGNELSRRKSRVYKN >gi|225935328|gb|ACGA01000064.1| GENE 128 117057 - 117491 388 144 aa, chain - ## HITS:1 COG:VC1962 KEGG:ns NR:ns ## COG: VC1962 COG3015 # Protein_GI_number: 15641964 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Vibrio cholerae # 2 144 4 163 163 91 36.0 5e-19 MKKIFILVCSCALLAACNNSAKTNKSAGADSTSIEVTDIHNAENSLDYEGTYKGVFPAAD CPGIETTLTLNPDKTFSLHSVYIDRDSSFDEKGTYTLKDNLLTLKEEGGELSYYKVGENH LRKLTMDKQEITGELADHYVLNKE >gi|225935328|gb|ACGA01000064.1| GENE 129 117636 - 117884 201 82 aa, chain + ## HITS:1 COG:no KEGG:BT_0819 NR:ns ## KEGG: BT_0819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 82 124 82.0 1e-27 MAQHTYDNEAVQELLNWAKKMIETKNYPTERYQVNKCTTIIDGKSYLESLIAMISRNWEN PTFHPTIEQLWEFREKWENKEA 
>gi|225935328|gb|ACGA01000064.1| GENE 130 117997 - 118419 408 140 aa, chain - ## HITS:1 COG:no KEGG:BT_0820 NR:ns ## KEGG: BT_0820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 140 1 139 139 241 85.0 5e-63 MIIAVDFDGTIVEHRYPRIGEEIPFAVDTLKLLQQEKHRLILWSVREGALLDEAVEWCKA RGLEFYAVNKDYPEEQKDHQGFSRKLKADMFIDDRNLGGLPDWGVIYEMIREKKTFADIY SQNGEEEKTSSKKKKRWLPF Prediction of potential genes in microbial genomes Time: Fri May 13 11:12:02 2011 Seq name: gi|225935327|gb|ACGA01000065.1| Bacteroides sp. D2 cont1.65, whole genome shotgun sequence Length of sequence - 6554 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 41 - 1348 1010 ## BT_0821 putative permease + Term 1372 - 1414 6.1 + Prom 1351 - 1410 1.6 2 1 Op 2 . + CDS 1431 - 2234 654 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I + Term 2451 - 2518 30.2 + TRNA 2324 - 2399 84.1 # Lys CTT 0 0 + TRNA 2429 - 2504 81.9 # Lys CTT 0 0 - Term 2505 - 2537 1.4 3 2 Tu 1 . - CDS 2656 - 4062 1391 ## COG1904 Glucuronate isomerase - Prom 4082 - 4141 7.1 + Prom 4172 - 4231 6.6 4 3 Op 1 1/0.000 + CDS 4261 - 5328 1095 ## COG1609 Transcriptional regulators 5 3 Op 2 . 
+ CDS 5355 - 6552 1225 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases Predicted protein(s) >gi|225935327|gb|ACGA01000065.1| GENE 1 41 - 1348 1010 435 aa, chain + ## HITS:1 COG:no KEGG:BT_0821 NR:ns ## KEGG: BT_0821 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 435 28 462 462 730 94.0 0 MAYYAVFIVLTIYLSSILGFNDFEASMISGLFSGGLYLLPIFSGAYADKIGFRKSMIIAF SLLSIGYLGLGVLPTLLEAAGLVSYGATTQFNGLPDSYTRWIIVPVLFVLMVGGSFIKSI ISASVAKETTEATRARGYSIFYMMVNVGAFTGKTIIDPLRNVIGEQAYIYINYFSGAMTI VALLAVILLYKSTHTAGEGKSLHEIGQGFMRIITNWRLLILILIVTGFWMVQQQLYATMP KYVIRMAGETAKPGWIANVNPFVVVCCVSFITRLMAKRSAITSMNVGMFLIPISALLMAC GNLLGNDLITGMSNITLMMIAGIVVQALAECFISPRYLEYFSLQAPKGEEGMYLGFSHLH SFLSSIFGFGLAGILLTKYCPDPALFETREAWEAASGNAHYIWYYFAAIGLIAAVALLLF AKITESIDKKKEASR >gi|225935327|gb|ACGA01000065.1| GENE 2 1431 - 2234 654 267 aa, chain + ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 3 266 1 261 261 175 38.0 1e-43 MKVKFISLASGSSGNCYYLGTETYGILIDAGIGIRTIKKSLKDYNILMDSIRAVFITHDH ADHIKAVGNLGEKMNIPVYTTARIHAGINRSYCMTEKLSSSVRYLEKQEPMTLEDFHIES FEVPHDGTDNVGYCIEIDGKVFSFLTDLGEITPTAAHYISKAHYLILEANYDEEMLKMGP YPQYLKERIASKTGHMSNSDTAEFLAENITEHLRYIWLCHLSKDNNHPELAYKTVEWKLK NKGVIVGKDVQLLALKRNTPSELYVFE >gi|225935327|gb|ACGA01000065.1| GENE 3 2656 - 4062 1391 468 aa, chain - ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 466 1 465 470 547 56.0 1e-155 MKNFMDENFLLQTETAQKLYHEHAAKMPIIDYHCHLIPQMVADDYKFKSLTEIWLGGDHY KWRAMRTNGVDERFCTGKDTTDWEKFEKWAETVPYTFRNPLYHWTHLELKTAFGINKILN PQTAREIYDECNEKLSQPEYSARGMMRRYHVEAVCTTDDPIDSLEYHIKTRESGFEIKML PTWRPDKAMAVEVPADFRSYVEKLAEVSGVTISNFDDMIAALRKRHDFFAEQGCRLSDHG 
IEEFYAEDYTDAEIKAIFNKVYGGAELTKDEILKFKSAMLVIFGEMDWEKGWTQQFHYGA IRNNNTKMFKLLGADTGFDSIGEFTTAKAMAKFLDRLNTNGKLTKTILYNLNPCANEVIA TMLGNFQDGSIPGKIQFGSGWWFLDQKDGMEKQMNALSVLGLLSRFVGMLTDSRSFLSYP RHEYFRRTLCNLVGRDVENGEIPASEMDRVNQMIEDISYNNAKNFFKF >gi|225935327|gb|ACGA01000065.1| GENE 4 4261 - 5328 1095 355 aa, chain + ## HITS:1 COG:lin2880 KEGG:ns NR:ns ## COG: lin2880 COG1609 # Protein_GI_number: 16801940 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Listeria innocua # 8 193 4 194 318 73 28.0 4e-13 MEDQNYTIKDIARMAGVSAGTVDRVLHNRGDVSPKSKAKVQKVLDEIHYQPNVFAIGLAA KKKYSFLCLIPYYIEHDYWHSVVGGIERARQELRPFNVSIDYLCYHHGDEKSYQEACLSI KEKNVDAVLISPNFREETLALTAYLQENKIAYAFVDFNMEEAKALTYIGQDSYKSGYIAA KILMRNYSAGEGQELVLFLSNNKDNPAEIQMQRRLDGFMSYIAEEYNNLVIHEVVLNKSD QESNQQTLDEFFQAHPKALLGVVFNSRVYQLGEYLRHAGRSMKGLIGYDLLKANVDLLKS GDVHYLIGQRPGLQGYCGVKALCDHVVFKKSVEPVKYMPIDILIKENIDFYFEFV >gi|225935327|gb|ACGA01000065.1| GENE 5 5355 - 6552 1225 399 aa, chain + ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 15 399 15 399 482 460 58.0 1e-129 MKALNKETAPKTQRPERIIQFGEGNFLRAFVDWIIYNMNQKADFNSSVVVVQPIDKGMVD MLNAQDDLYHVNLQGLDKGEAVNSLTMIDVISRALNPYTQNDEFMKLAEQPEMRFVISNT TEAGIAFDPACKLEDAPASSYPGKLTQLLYHRFKTFNGDKTKGLIIFPCELIFLNGHKLK ETIYQYIELWNLGNEFKTWFEEACGVYATLVDRIVPGFPRKDIAAIKEKLQYDDNLVVQA EVFHLWVIEAPQEVAKEFPADKAGLNVLFVPSEAPYHERKVTLLNGPHTVLSPVAYLSGV NIVRDACQHEVIGKYIHKVMFDELMETLNLPKDELEKFANDVLERFNNPFVDHAVTSIML NSFPKYETRDLPGLKTYLERKGELPKGLVLGLAAIITYY Prediction of potential genes in microbial genomes Time: Fri May 13 11:12:09 2011 Seq name: gi|225935326|gb|ACGA01000066.1| Bacteroides sp. 
D2 cont1.66, whole genome shotgun sequence Length of sequence - 2842 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 334 290 ## BT_0827 hypothetical protein - Prom 354 - 413 5.4 2 2 Op 1 . + CDS 425 - 973 549 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 3 2 Op 2 . + CDS 1000 - 2313 1239 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 4 2 Op 3 . + CDS 2353 - 2842 377 ## BT_0830 hypothetical protein Predicted protein(s) >gi|225935326|gb|ACGA01000066.1| GENE 1 1 - 334 290 111 aa, chain - ## HITS:1 COG:no KEGG:BT_0827 NR:ns ## KEGG: BT_0827 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 365 196 83.0 2e-49 MIELAQHIETLLLENDCVIVPGFGGFVAHYSPATRVKEENIFLPPTRTIGFNPQLKLNDG VLVQSYMSAYDTSFADANRIVEKEVNEFIGLLHEEGKAHLDNIGEIHYNIY >gi|225935326|gb|ACGA01000066.1| GENE 2 425 - 973 549 182 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 169 1 170 183 202 60.0 3e-52 MNYIQTEIDGVWLIEPRVFSDERGYFMEAYKKEEFEANIGPVNFIQDNESKSSFGVLRGL HYQKGEHSQAKLVRVLKGEVLDVAVDLRKSSPTFGKHVCVLLSEENKRQFFIPRGFAHGF AVLSEEAVFTYKVDNKYAPQAEASILYNDETLGIDWPLAESQMLLSAKDREGTAFKDAVY FE >gi|225935326|gb|ACGA01000066.1| GENE 3 1000 - 2313 1239 437 aa, chain + ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 508 56.0 1e-143 MKIAIVGTGYVGLVSGTCFAEIGVNVTCVDTNKEKIESLQKGNIPIYENGLEEMVLRNMK AKRLKFTTSLESCLDEVEVIFSAVGTPPDEDGSADLKYVLEVARTIGRNMKQYKLVVTKS 
TVPVGTASKVRAVIQEELDKRGVTVDFDVASNPEFLKEGNAISDFMSPDRVVVGVESARA EKLMSKLYKPFLLNNFRVIFMDIPSAEMTKYAANSMLATRISFMNDIANLCEIVGADVNM VRSGIGSDTRIGRKFLYPGIGYGGSCFPKDVKALIKTAEQNGYTMRVLTAVEEVNENQKS VLFEKLMKQFNGDLQGKTVALWGLAFKPETDDMREAPALVLIDKLLKAGCKVRAYDPAAV QECKRRIGDTIYYACDMYDAVLDADVLMLVTEWKEFRLPSWAVIKKTMSQQIVLDGRNIY DKKEMEELGFIYHCIGK >gi|225935326|gb|ACGA01000066.1| GENE 4 2353 - 2842 377 163 aa, chain + ## HITS:1 COG:no KEGG:BT_0830 NR:ns ## KEGG: BT_0830 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 163 3 164 265 261 79.0 7e-69 MKNVAVSLMAIFFAACSGNKSPHSLQQEKEDLSAKGLLQGIWLDDETESPLMRVEGDTIY YADVQSTPIAFKIIRDTLYTYGNDTTYYKIDKQAEHIFWFHSITDDVIKLHKSEDANDSI YFVRQELVIPAYTEVTKRDSVVTYNGTRYRAYVYINPSKMRVI Prediction of potential genes in microbial genomes Time: Fri May 13 11:12:31 2011 Seq name: gi|225935325|gb|ACGA01000067.1| Bacteroides sp. D2 cont1.67, whole genome shotgun sequence Length of sequence - 66150 bp Number of predicted genes - 83, with homology - 80 Number of transcription units - 28, operones - 19 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 233 124 ## BT_0830 hypothetical protein + Term 296 - 340 1.8 - Term 225 - 264 3.1 2 2 Op 1 . - CDS 284 - 1552 1056 ## COG0513 Superfamily II DNA and RNA helicases 3 2 Op 2 . - CDS 1599 - 2828 1390 ## COG0560 Phosphoserine phosphatase - Prom 2879 - 2938 6.4 + Prom 2851 - 2910 4.9 4 3 Tu 1 . + CDS 3095 - 3598 379 ## BT_0833 hypothetical protein + Term 3697 - 3746 8.5 - Term 3682 - 3737 15.3 5 4 Op 1 1/0.000 - CDS 3831 - 4928 925 ## COG0795 Predicted permeases 6 4 Op 2 . - CDS 4932 - 6062 951 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 7 4 Op 3 . - CDS 6059 - 8524 2391 ## COG0466 ATP-dependent Lon protease, bacterial type - Prom 8632 - 8691 7.5 + Prom 8503 - 8562 6.2 8 5 Tu 1 . + CDS 8740 - 9450 489 ## COG4123 Predicted O-methyltransferase + Term 9478 - 9538 11.5 - Term 9464 - 9526 8.1 9 6 Op 1 . 
- CDS 9732 - 10238 290 ## BT_0845 hypothetical protein 10 6 Op 2 . - CDS 10255 - 10473 100 ## gi|160884587|ref|ZP_02065590.1| hypothetical protein BACOVA_02574 - Prom 10685 - 10744 8.7 - Term 10710 - 10744 2.0 11 7 Tu 1 . - CDS 10906 - 11043 78 ## - Prom 11063 - 11122 6.8 + Prom 10986 - 11045 4.8 12 8 Tu 1 . + CDS 11072 - 11596 265 ## gi|260174913|ref|ZP_05761325.1| hypothetical protein BacD2_23860 + Term 11623 - 11655 3.2 13 9 Op 1 . - CDS 11477 - 11851 171 ## 14 9 Op 2 . - CDS 11900 - 12202 268 ## BDI_3710 hypothetical protein 15 9 Op 3 . - CDS 12205 - 12642 380 ## PRU_2155 nucleotidyltransferase substrate-binding family protein 16 9 Op 4 . - CDS 12695 - 13153 376 ## gi|260174916|ref|ZP_05761328.1| hypothetical protein BacD2_23875 17 9 Op 5 . - CDS 13180 - 13851 161 ## Coch_0868 putative phage repressor - Prom 13946 - 14005 8.2 + Prom 13822 - 13881 6.3 18 10 Op 1 . + CDS 14024 - 14278 160 ## gi|260174918|ref|ZP_05761330.1| hypothetical protein BacD2_23885 19 10 Op 2 . + CDS 14288 - 14536 297 ## gi|260174919|ref|ZP_05761331.1| hypothetical protein BacD2_23890 20 10 Op 3 . + CDS 14604 - 14858 283 ## gi|260174920|ref|ZP_05761332.1| hypothetical protein BacD2_23895 21 10 Op 4 . + CDS 14842 - 15207 291 ## gi|260174921|ref|ZP_05761333.1| hypothetical protein BacD2_23900 + Prom 15211 - 15270 3.6 22 11 Op 1 . + CDS 15311 - 15613 185 ## BT_0851 hypothetical protein 23 11 Op 2 . + CDS 15610 - 15816 207 ## gi|260174923|ref|ZP_05761335.1| hypothetical protein BacD2_23910 24 11 Op 3 . + CDS 15841 - 16815 714 ## gi|260174924|ref|ZP_05761336.1| hypothetical protein BacD2_23915 25 11 Op 4 . + CDS 16844 - 18043 920 ## gi|260174925|ref|ZP_05761337.1| hypothetical protein BacD2_23920 26 11 Op 5 . + CDS 18046 - 18705 483 ## gi|260174926|ref|ZP_05761338.1| hypothetical protein BacD2_23925 27 11 Op 6 . + CDS 18728 - 19141 284 ## Coch_0878 hypothetical protein 28 11 Op 7 . + CDS 19147 - 19332 202 ## gi|260174928|ref|ZP_05761340.1| hypothetical protein BacD2_23935 29 11 Op 8 . 
+ CDS 19313 - 19912 523 ## gi|260174929|ref|ZP_05761341.1| hypothetical protein BacD2_23940 30 11 Op 9 . + CDS 19961 - 21634 592 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Prom 21707 - 21766 2.4 31 12 Op 1 . + CDS 21791 - 22438 584 ## HAPS_0636 hypothetical protein 32 12 Op 2 . + CDS 22481 - 23236 417 ## COG0863 DNA modification methylase + Term 23309 - 23343 0.2 + Prom 23260 - 23319 5.7 33 13 Op 1 . + CDS 23465 - 23671 108 ## gi|260174934|ref|ZP_05761346.1| hypothetical protein BacD2_23965 34 13 Op 2 . + CDS 23702 - 24115 326 ## BDI_0866 putative prophage Lp2 protein 26 + Prom 24245 - 24304 6.2 35 14 Op 1 . + CDS 24341 - 24697 286 ## Cpin_3857 hypothetical protein 36 14 Op 2 . + CDS 24630 - 25163 208 ## Coch_0893 VRR-NUC domain protein + Prom 25188 - 25247 4.6 37 15 Tu 1 . + CDS 25268 - 25555 232 ## gi|260174938|ref|ZP_05761350.1| hypothetical protein BacD2_23985 + Term 25558 - 25594 4.0 38 16 Op 1 . - CDS 25566 - 26870 549 ## BF1081 hypothetical protein - Prom 26894 - 26953 2.1 - Term 26890 - 26925 1.5 39 16 Op 2 . - CDS 27022 - 27933 510 ## BT_0082 hypothetical protein - Prom 27953 - 28012 8.7 - Term 27980 - 28020 6.3 40 17 Op 1 . - CDS 28047 - 28517 293 ## gi|260174941|ref|ZP_05761353.1| hypothetical protein BacD2_24000 41 17 Op 2 . - CDS 28547 - 29041 176 ## COG3023 Negative regulator of beta-lactamase expression 42 17 Op 3 . - CDS 29062 - 29412 365 ## BT_4442 hypothetical protein 43 17 Op 4 . - CDS 29423 - 29689 183 ## gi|260174944|ref|ZP_05761356.1| hypothetical protein BacD2_24015 44 17 Op 5 . - CDS 29686 - 30009 312 ## gi|260174945|ref|ZP_05761357.1| hypothetical protein BacD2_24020 45 17 Op 6 . - CDS 30021 - 34019 2577 ## BDI_0901 hypothetical protein + Prom 33997 - 34056 7.2 46 18 Op 1 . + CDS 34233 - 34490 68 ## gi|260174947|ref|ZP_05761359.1| hypothetical protein BacD2_24030 47 18 Op 2 . + CDS 34515 - 34925 168 ## gi|260174948|ref|ZP_05761360.1| hypothetical protein BacD2_24035 48 18 Op 3 . 
+ CDS 34940 - 35350 93 ## gi|260174949|ref|ZP_05761361.1| hypothetical protein BacD2_24040 49 18 Op 4 . + CDS 35374 - 35778 152 ## gi|260174950|ref|ZP_05761362.1| hypothetical protein BacD2_24045 50 18 Op 5 . + CDS 35795 - 36325 367 ## gi|260174951|ref|ZP_05761363.1| hypothetical protein BacD2_24050 - Term 36181 - 36236 12.2 51 19 Op 1 . - CDS 36295 - 36825 484 ## gi|260174952|ref|ZP_05761364.1| hypothetical protein BacD2_24055 52 19 Op 2 . - CDS 36842 - 39523 1716 ## NT01CX_2162 phage pre-neck appendage-like protein 53 19 Op 3 . - CDS 39513 - 40445 676 ## BDI_0895 hypothetical protein 54 19 Op 4 . - CDS 40475 - 45109 3459 ## GFO_2426 phage tail tape measure protein - Prom 45233 - 45292 4.3 + Prom 45294 - 45353 8.1 55 20 Tu 1 . + CDS 45373 - 45567 262 ## gi|260174957|ref|ZP_05761369.1| hypothetical protein BacD2_24080 - Term 45552 - 45598 9.1 56 21 Op 1 . - CDS 45667 - 46020 92 ## gi|260174958|ref|ZP_05761370.1| hypothetical protein BacD2_24085 57 21 Op 2 . - CDS 45909 - 46232 72 ## gi|260174959|ref|ZP_05761371.1| hypothetical protein BacD2_24090 - Prom 46444 - 46503 3.1 - Term 46361 - 46403 0.8 58 22 Op 1 . - CDS 46615 - 47214 626 ## gi|260174960|ref|ZP_05761372.1| hypothetical protein BacD2_24095 59 22 Op 2 . - CDS 47217 - 47597 346 ## gi|260174961|ref|ZP_05761373.1| hypothetical protein BacD2_24100 60 22 Op 3 . - CDS 47600 - 48055 459 ## GFO_2433 hypothetical protein 61 22 Op 4 . - CDS 48052 - 48363 185 ## gi|260174963|ref|ZP_05761375.1| hypothetical protein BacD2_24110 62 22 Op 5 . - CDS 48357 - 48689 241 ## gi|260174964|ref|ZP_05761376.1| hypothetical protein BacD2_24115 63 22 Op 6 . - CDS 48695 - 49816 1007 ## SEQ_1749 phage capsid protein 64 22 Op 7 . - CDS 49838 - 50422 684 ## gi|260174966|ref|ZP_05761378.1| hypothetical protein BacD2_24125 65 22 Op 8 . - CDS 50442 - 51146 628 ## gi|260174967|ref|ZP_05761379.1| hypothetical protein BacD2_24130 - Prom 51189 - 51248 6.4 + Prom 51113 - 51172 5.1 66 23 Tu 1 . 
+ CDS 51230 - 51493 230 ## gi|260174968|ref|ZP_05761380.1| hypothetical protein BacD2_24135 67 24 Op 1 . - CDS 51488 - 53896 1067 ## GFO_2441 hypothetical protein 68 24 Op 2 . - CDS 53900 - 55348 954 ## GFO_2449 phage portal protein - Prom 55470 - 55529 4.8 69 25 Tu 1 . + CDS 55645 - 55761 90 ## 70 26 Op 1 . - CDS 55777 - 57222 750 ## COG5410 Uncharacterized protein conserved in bacteria 71 26 Op 2 . - CDS 57203 - 57697 248 ## BP3385 hypothetical protein 72 26 Op 3 1/0.000 - CDS 57701 - 58294 390 ## COG0302 GTP cyclohydrolase I 73 26 Op 4 22/0.000 - CDS 58249 - 58794 266 ## COG0602 Organic radical activating enzymes 74 26 Op 5 . - CDS 58781 - 59113 205 ## COG0720 6-pyruvoyl-tetrahydropterin synthase 75 26 Op 6 . - CDS 59118 - 60056 251 ## gi|260174977|ref|ZP_05761389.1| hypothetical protein BacD2_24180 76 26 Op 7 . - CDS 60004 - 60171 87 ## gi|260174978|ref|ZP_05761390.1| hypothetical protein BacD2_24185 - Prom 60194 - 60253 2.8 77 27 Op 1 . - CDS 60326 - 60838 296 ## gi|260174979|ref|ZP_05761391.1| DNA methylase N-4/N-6 domain-containing protein 78 27 Op 2 . - CDS 60855 - 61418 310 ## gi|260174980|ref|ZP_05761392.1| hypothetical protein BacD2_24195 - Prom 61442 - 61501 4.0 79 28 Op 1 . - CDS 61507 - 62346 319 ## BF2329 hypothetical protein 80 28 Op 2 . - CDS 62204 - 63097 324 ## BVU_2853 putative Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit 81 28 Op 3 . - CDS 63111 - 64394 642 ## BF2468 hypothetical protein 82 28 Op 4 . - CDS 64391 - 64792 271 ## BF2469 hypothetical protein 83 28 Op 5 . 
- CDS 64797 - 65912 573 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) - Prom 65954 - 66013 4.2 Predicted protein(s) >gi|225935325|gb|ACGA01000067.1| GENE 1 3 - 233 124 76 aa, chain + ## HITS:1 COG:no KEGG:BT_0830 NR:ns ## KEGG: BT_0830 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 190 265 265 118 77.0 6e-26 EGKKSLFASDVTKQMFDKVLPVDFLEQSILSDTKFMKVDRNGFHYQAVLAIPETSIYSIV NMEVSFKGDLTITSSK >gi|225935325|gb|ACGA01000067.1| GENE 2 284 - 1552 1056 422 aa, chain - ## HITS:1 COG:CC0835 KEGG:ns NR:ns ## COG: CC0835 COG0513 # Protein_GI_number: 16125088 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Caulobacter vibrioides # 2 367 3 368 476 292 43.0 7e-79 MKFSELQLNANVLEALDAMRFDECTPIQEQAIPIILEGKDLIAVAQTGTGKTAAFLLPVL NKLSEGKHPEDAINCVIMSPTRELAQQIDQQMEGFSYFMPVSSVAVYGGNDGILFEQQKK GLTLGADVVIATPGRLIAHLSLGYVDLSKVSYFILDEADRMLDMGFYEDIMQIAKYLPKE RQTIMFSATMPAKIQQLANTILNNPSEIKLAVSKPAEKIIQAAYVCYENQKLGIIRSLFM DEVPERVIVFASSKIKVKEVAKALKSMKLNVGEMHSDLEQAQRETVMHEFKAGRINILVA TDIVARGIDIDDIRLVINFDVPHDSEDYVHRIGRTARANNDGVALTFISEKEQSNFKSIE NFLEKEIYKIPIPEELGEAPEYKPRSFSKGKGGNYKRKDFRGKRNNNNGGKRNNRPSPPR QN >gi|225935325|gb|ACGA01000067.1| GENE 3 1599 - 2828 1390 409 aa, chain - ## HITS:1 COG:PA4960_2 KEGG:ns NR:ns ## COG: PA4960_2 COG0560 # Protein_GI_number: 15600153 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 193 404 1 212 217 259 65.0 7e-69 MQPSNTELILIRVTGEDRPGLTASVTEILAKYDATILDIGQADIHNTLSLGILFKSEERH SGFIMKELLFKASSLGVTIRFEPITTEQYENWVGMQGKNRYILTVLGRKLSARQISAATS ILAEQGMNIDAIKRLTGRIPLDECQADTRTRACIEFSVRGTPKDRIAMQEKLMKLASELE MDFSFQQDNMYRRMRRLICFDMDSTLIETEVIDELAIRAGVGDEVKAITESAMRGEIDFT ESFTRRVALLKGLDESVMQEIAESLPITEGVERLMYVLKKYGYKIAILSGGFTYFGQYLQ KKYGIDYVYANELEIIDGKLTGRYLGDVVDGKRKAELLRLIAQVEKVDIAQTIAVGDGAN 
DLPMLGVAGLGIAFHAKPKVVANAKQSINTIGLDGVLYFLGFKDSYLNM >gi|225935325|gb|ACGA01000067.1| GENE 4 3095 - 3598 379 167 aa, chain + ## HITS:1 COG:no KEGG:BT_0833 NR:ns ## KEGG: BT_0833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 167 4 168 168 314 95.0 7e-85 MKHVKWIFVVLLICSLTAFVEKDKPTGGLSEGDVAPDFKIESTSNGQPAFKLGNLKGKYV LLSFWASYDAQSRMQNVSLSNALRSSHNVEMVSISFDEYQSIFKETVRKDQIVTPTCFVE TEGEDSGLFKKYRLNRGFTNYLLDGNGVIIAKNISAADLSAYVKEVG >gi|225935325|gb|ACGA01000067.1| GENE 5 3831 - 4928 925 365 aa, chain - ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 7 363 2 361 363 100 27.0 5e-21 MKSNRFIKRLDWYIIKKFLGTYVFAIALIISIAVVFDFNEKMDKLMEHEAPWDKIIFEYY MNFIPYFSNLFSPLFVFIAVIFFTSKLAENSEIIAMFSTGMSFKRMMRPYMISAAIIALV TFTLSSYVIPKGSVTRLNFEDRYIKPKKQNSVSNVQLEVDSGVIAYIDNYNNAMKTGNRF SLDKFVDKKLVSHLTARRITYDTTTVHKWTIHDYMVRELDGLKEKITKGDRIDSIINMEP SDFLIMKNQQEMLTSPELSDYIAKQKRRGFANIKEFEIEYHKRIAMSFASFILTIIGVSL SSRKTKGGMGLHLGIGLGLSFSYILFQTITSTFAVNGNVPPAVAVWIPNILYAGIAFFLY QKAPK >gi|225935325|gb|ACGA01000067.1| GENE 6 4932 - 6062 951 376 aa, chain - ## HITS:1 COG:aq_1308 KEGG:ns NR:ns ## COG: aq_1308 COG0343 # Protein_GI_number: 15606515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Aquifex aeolicus # 3 376 2 378 378 372 51.0 1e-103 MTFELQYTDTKSNARAGLITTDHGQIQTPIFMPVGTLGTVKGVHLTELKEDIQAQIILGN TYHLYLRPGLDVIEKAGGLHRFNGFDRPMLTDSGGFQVFSLAGIRKLREEGAEFRSHIDG SKHIFTPEKVMDIERTIGADIMMAFDECPPGDSDYEYAKKSLGLTHRWLDRCIQRFNETE PKYGYNQSLFPIVQGCVYPDLRKQSAEFVASKGADGNAIGGLAVGEPVDKMYEMIEIVNE ILPKDKPRYLMGVGTPVNILEGIERGVDMFDCVMPTRNGRNGMLFTKDGIINMRNKKWET DFSPIEADGASCVDTLYSKAYLRHLFHAQELLAMQIASIHNLAFYLWLAGETRKHIIAGD FSTWKPMMVKRVSTRL >gi|225935325|gb|ACGA01000067.1| GENE 7 6059 - 8524 2391 821 aa, chain - ## HITS:1 COG:PM1978 KEGG:ns NR:ns 
## COG: PM1978 COG0466 # Protein_GI_number: 15603843 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Pasteurella multocida # 39 804 11 772 804 666 46.0 0 MKERYLLEDGGDNSFSLITDYDGNDEQAFDVNVKSGEILPVLPLRNMVLFPGVFLPITVG RKSSLKLVRDADKKHKDIAVVCQRSAHTEDPKLEDLHNIGTVGRIVRILEMPDQTTTVIL QGMKRLSLTSIIETHPYLKGEIELLEEDVPGKDDKEFQALVETCKDLTMRYIKSSDVMHQ DSSFAIKNINNSMFLVNFICSNLPFKKDEKMDLLSINSLRERTYHLLEILNREVQLAEIK ASIQMRAREDIDQQQREYFLQQQIKTIQDELGGGGQEQEIEEMRQKAERMRWNLEVRDTF MKELAKLERTHPQSPDYSVQLNYLQTMLNLPWGTYTTDNLNLKNAEKTLNKDHYGLEKVK ERILEHLAVLKLKGDMKSPIICLYGPPGVGKTSLGKSIASALKRKYVRMSLGGVHDEAEI RGHRKTYIGAMPGRIIKSLIKAGASNPVFILDEIDKVSADRQGDPSSALLEVLDPEQNTS FHDNFLDVDYDLSKVLFIATANNLNTIPGPLLDRMELIEVSGYITEEKVEIARKHLLPKE LEANGLKKTDIKLPKDTLEAIIESYTRESGVRELEKKIGKILRKSARQYATDGYFAKTEI KPTDLYDFLGAPEYTRDKYQGNDYAGVVTGLAWTAVGGEILFVETSLSRGKGGRLTLTGN LGDVMKESAMLALEYIKAHASVLSLNEEIFDNWNIHIHVPEGAIPKDGPSAGITMATSLA SALTQRKVKANIAMTGEITLRGKVLPVGGIKEKILAAKRAGIKEIIMSAENKKNIDEIQD IYLKGLTFHYVNDIKEVFAIALTNEKVADPIDLSVKKPSQE >gi|225935325|gb|ACGA01000067.1| GENE 8 8740 - 9450 489 236 aa, chain + ## HITS:1 COG:YPO2709 KEGG:ns NR:ns ## COG: YPO2709 COG4123 # Protein_GI_number: 16122913 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Yersinia pestis # 6 235 20 251 252 211 45.0 7e-55 MSNPYFQFKQFTVWHDQCAMKVGTDGVLLGAWASVEGARRILDIGTGTGLVALMLAQRSL PDAKIVALEIDEAAAGQARENVARSPWRERVEVVQADFKKYRSSDKFDVIVSNPPYFVDS LECPDRQRAAARHNDSLTYDDLLEGVSGLLTENGFFTVVIPADVAERVKKIASIKKLYAV RQLNVITKPGGIPKRVLITFSFSNQECIVEELLTELARHQYSEEYMTLTRDYYLNM >gi|225935325|gb|ACGA01000067.1| GENE 9 9732 - 10238 290 168 aa, chain - ## HITS:1 COG:no KEGG:BT_0845 NR:ns ## KEGG: BT_0845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 168 168 271 88.0 5e-72 MKKFFYLSAILAIVLVSCNSEKEYIAKLSNTASMIEKEADLSEAIALHYCDTWRKVIYDH 
EYNGEYCTDFNEALAKHQEFIITTDTYKRLKQKKDSIEAIMPQLNDYPSSCKDAYNELVS IYADADELFRFADEPRGSLSTYSTKTTDLYQKIEKSLKEFKIKHIQNK >gi|225935325|gb|ACGA01000067.1| GENE 10 10255 - 10473 100 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160884587|ref|ZP_02065590.1| ## NR: gi|160884587|ref|ZP_02065590.1| hypothetical protein BACOVA_02574 [Bacteroides ovatus ATCC 8483] # 1 72 27 98 98 116 100.0 5e-25 MKKTLLMLVVIFISLYSHSQSPFLNFRIPEESNKRIIGYSSSNKINVFLNIKYQSCNTLI QYYKQDSNKLIV >gi|225935325|gb|ACGA01000067.1| GENE 11 10906 - 11043 78 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEKDKLIASLRQQLRKVLRENSAQKQEIALLNYELERAKIRLSK >gi|225935325|gb|ACGA01000067.1| GENE 12 11072 - 11596 265 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174913|ref|ZP_05761325.1| ## NR: gi|260174913|ref|ZP_05761325.1| hypothetical protein BacD2_23860 [Bacteroides sp. D2] # 1 174 1 174 174 313 100.0 3e-84 MKHFKELHIYFEIDQENYAITRTIKDEKGKIIFQRKTNYLKANPYTDDPVCILEDILQNG TSSQTITSLNELSDTEIRLRTLELVQRTYSCHNPFELIARSIALARYAKTGKYNRSSGNV TEQIGLEYMERIYDQLTEKRYEPKQTANNTNDKPNVPKTNISLFKKLFSRFRNK >gi|225935325|gb|ACGA01000067.1| GENE 13 11477 - 11851 171 124 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MREIHNFNDKCLLTLLRIANIQNKERKRLEKMYCDRFQGYQEVKDFLYANELLKYNGELV NGEWIVDKESPVEITGLGLNAIRYGLFVSETRKQFLEKRYIRLRDIGLIISIIGGLLGFI SFFC >gi|225935325|gb|ACGA01000067.1| GENE 14 11900 - 12202 268 100 aa, chain - ## HITS:1 COG:no KEGG:BDI_3710 NR:ns ## KEGG: BDI_3710 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 99 1 99 99 127 65.0 1e-28 MYGLSDTVITDICSVFRRFPNIDKVLIFGSRAKGTYSEGSDIDLAAVGENITFNQLMDIN IQIEDLGLLYKVDVVDYNKNIGTPIGEHIDRVGLLFYEKE >gi|225935325|gb|ACGA01000067.1| GENE 15 12205 - 12642 380 145 aa, chain - ## HITS:1 COG:no KEGG:PRU_2155 NR:ns ## KEGG: PRU_2155 # Name: not_defined # Def: nucleotidyltransferase substrate-binding family protein # Organism: P.ruminicola # Pathway: not_defined # 2 133 4 135 146 164 60.0 6e-40 MEQDIRWLQRYDSFHRANKRILDITESDKTPDSLSELEMEGLIQRFEYTFELGWKVLQDL 
LKYKGYEFVQGPNGTLQKAFEDNMITDHDGWRRMAKARVTTSHTYNEGDAIEIVRKIYEE YSLLLKQLDSRLNEEKLRMEMGTLF >gi|225935325|gb|ACGA01000067.1| GENE 16 12695 - 13153 376 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174916|ref|ZP_05761328.1| ## NR: gi|260174916|ref|ZP_05761328.1| hypothetical protein BacD2_23875 [Bacteroides sp. D2] # 1 152 1 152 152 265 100.0 5e-70 MEAFGFIGVVYLLAGIIQLIILFVLIVKFLQLVADVKQLKNLYTERSRELSSSIDKLSSA IKEQSNSKDNDKPDVAKDENIVAEQKKEPNKPYNEASAEEVPTVDENSDDFKQHLRKWKI LKNKGYTDQAIREYMEYTKRNMNSAVDFINSI >gi|225935325|gb|ACGA01000067.1| GENE 17 13180 - 13851 161 223 aa, chain - ## HITS:1 COG:no KEGG:Coch_0868 NR:ns ## KEGG: Coch_0868 # Name: not_defined # Def: putative phage repressor # Organism: C.ochracea # Pathway: not_defined # 34 223 50 243 244 94 33.0 2e-18 MKETVRDRLLQFINELGISTRMFEQNCGLSNGFVRNTGDSIRRNNLEKISTIYPDLNTTW LLTGDGNKLNSSAKSITSISSEMPAPSKQSSKGIPYFDVDVTMGYDELPNDQTNIPNYYL HIPAFQNCDCAVPAYGRSMIPDINDGSIIAIKEVSLDSVLPGEAYLIITDEYRTVKYIRN CKDNPNKWRLVPKNLEEFDEMIIDKTKVLRVFLVKGVITNKIL >gi|225935325|gb|ACGA01000067.1| GENE 18 14024 - 14278 160 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174918|ref|ZP_05761330.1| ## NR: gi|260174918|ref|ZP_05761330.1| hypothetical protein BacD2_23885 [Bacteroides sp. D2] # 1 84 1 84 84 161 100.0 1e-38 MVKTEKIKLVVYKEHTLGYILPELPDSVQILHSSPLKGAIGTTNLQNNFQINNPNEIRLA SESDFDAFGISFDGYKNSPDYIYK >gi|225935325|gb|ACGA01000067.1| GENE 19 14288 - 14536 297 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174919|ref|ZP_05761331.1| ## NR: gi|260174919|ref|ZP_05761331.1| hypothetical protein BacD2_23890 [Bacteroides sp. D2] # 1 82 1 82 82 123 100.0 3e-27 MKTILEVSLQEASKAKDAIRYSLLRTELNQTSTNVWELPTYDMNDGYECDGDEELKDEIR ELFSSFGIAEEEYSFTDKETEE >gi|225935325|gb|ACGA01000067.1| GENE 20 14604 - 14858 283 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174920|ref|ZP_05761332.1| ## NR: gi|260174920|ref|ZP_05761332.1| hypothetical protein BacD2_23895 [Bacteroides sp. 
D2] # 1 84 3 86 86 132 100.0 7e-30 MENQLETIKANLPYGYEKQIAKEVGCSQGTVHNILNNKPASARSTYKAKVLNVAVRMANE ALEATKGVSKAAAELETLHHGTAS >gi|225935325|gb|ACGA01000067.1| GENE 21 14842 - 15207 291 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174921|ref|ZP_05761333.1| ## NR: gi|260174921|ref|ZP_05761333.1| hypothetical protein BacD2_23900 [Bacteroides sp. D2] # 1 121 1 121 121 205 100.0 6e-52 MELQADAKLTKRENQIAGLAFCGKAKKEIADLLNIAYGTVNVILDRAYKKTGTSKLNELG SWWANRAFALNIDFQQLQKTIVALSFLGIIAFQIAFDCNNDLNRSRRARIRRNKIEEVYE L >gi|225935325|gb|ACGA01000067.1| GENE 22 15311 - 15613 185 100 aa, chain + ## HITS:1 COG:no KEGG:BT_0851 NR:ns ## KEGG: BT_0851 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 25 124 124 187 96.0 1e-46 MENCFEMMVARCIKIGTVQTLTMLGLLPEVVTISQAEEIYGKRLIKEWREKAWIKFYPAN NKERGKYYVKRSELETASAMMDLHNKVPDNIIKQLMQIAV >gi|225935325|gb|ACGA01000067.1| GENE 23 15610 - 15816 207 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174923|ref|ZP_05761335.1| ## NR: gi|260174923|ref|ZP_05761335.1| hypothetical protein BacD2_23910 [Bacteroides sp. D2] # 1 68 1 68 68 108 100.0 9e-23 MRYIPKSSEVLQALQDSIGKQIAEREEQKKNYVPTPVEIKPDKNISIEPTAEDILLMEEY RRGVYQGD >gi|225935325|gb|ACGA01000067.1| GENE 24 15841 - 16815 714 324 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174924|ref|ZP_05761336.1| ## NR: gi|260174924|ref|ZP_05761336.1| hypothetical protein BacD2_23915 [Bacteroides sp. 
D2] # 1 324 1 324 324 646 100.0 0 MSKIIEVKVEELNALPATKIVESENVQAKFVQMYNAIWGTDKGEQMYHKEVFNFQKLLRD NPDLADSTKMSLYGCFLDIAVNGLTLDQTGHPLCYILSRSSKTGHKNAQGYDIYEKRAYV SVTGYGELTMRMRAGQIKYADNPVVVYEGDHFKASLVNGIKNIEYEAQCPRTSTKVIAAF IRIVRNDNSVDYQWLMEGDIERLKHYSEKANSKWNDQTKRRELGKANALYTSNNGSIDPG FLENKMIKHAFDAYPKVRTGKFTIMDSDQEEEEIIDYGLVDEDKVNEPVQAVDNPNIPFG EEKQLEAPEPVQVPVSDDDEDGGF >gi|225935325|gb|ACGA01000067.1| GENE 25 16844 - 18043 920 399 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174925|ref|ZP_05761337.1| ## NR: gi|260174925|ref|ZP_05761337.1| hypothetical protein BacD2_23920 [Bacteroides sp. D2] # 1 399 1 399 399 614 100.0 1e-174 MATELIKIDEAKNILSSFPDIMGKNTNSVKKCNEAGQALLDTIEGEGMNETIDQATADYL KKVSVTLKNMDERRKPITQIFDRIRSFFTSQEKQIDPKDPSTIPGKLVIKRNEYAKFKYE EEQKRKREAEQRARIETEKANYRQIIGDSLLSYFNQYLSSKVSELQGIFSNLTYENFDRE VIGITVFQTDYPKSHFDKFSADSATYYISQETKQEIRREVLEGKYEQYAQQYKAKIVSVK QDLTDRVPSKRKELAELEQLRLANAEEAAKAEELRKQREKEAAAKRMEELKKEEEAAKQE AALKAQQSSIGSLFMEAAASIAPPPTNAKVKEKIVVLHQQGYLEIFQMWWINEGQTLPVE ELEKIFKKMITYCEKQANGKDQKHIESKFIRYEADVKAK >gi|225935325|gb|ACGA01000067.1| GENE 26 18046 - 18705 483 219 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174926|ref|ZP_05761338.1| ## NR: gi|260174926|ref|ZP_05761338.1| hypothetical protein BacD2_23925 [Bacteroides sp. 
D2] # 1 219 1 219 219 407 100.0 1e-112 MSNPDSYYSRPEVSNSDLTELKNYLYPRAQYGDKEKAFKFGTLVDALITENERVRYDKLM VDDYVYTKDEFELGLEMRKALRKEAEKDQFLAVVLAQSDTQKFMVNKQQEFFYGNFVYHL DTRCKWDWWLSSFNFGGDLKTTFAESQTQFDEAIDFFDWDRSRAWYMDIAGSKQDFIYAI SKKNCRIFKHFITDRKHPSYIRGKEKYEDLAFKWWQLMV >gi|225935325|gb|ACGA01000067.1| GENE 27 18728 - 19141 284 137 aa, chain + ## HITS:1 COG:no KEGG:Coch_0878 NR:ns ## KEGG: Coch_0878 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 9 130 2 123 127 84 38.0 8e-16 MNLLITSKEQILAELTNIDSFLNITMSEDVAEAVQRGNDLAVYVARSGKLLADSKYWLNE AMKSEVMQTLVDTAKSAKATATAINALVNSLCREERYLVDWCERCNRTATHQLSWCVTVI SKAKAEMQMSGMFNNKK >gi|225935325|gb|ACGA01000067.1| GENE 28 19147 - 19332 202 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174928|ref|ZP_05761340.1| ## NR: gi|260174928|ref|ZP_05761340.1| hypothetical protein BacD2_23935 [Bacteroides sp. D2] # 1 61 1 61 61 82 100.0 1e-14 MKNLRRVTIGISVIGLFTALSFSQREDATTREITTAAVMGVVSTFSIITLSTKEDYGTSK K >gi|225935325|gb|ACGA01000067.1| GENE 29 19313 - 19912 523 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174929|ref|ZP_05761341.1| ## NR: gi|260174929|ref|ZP_05761341.1| hypothetical protein BacD2_23940 [Bacteroides sp. 
D2] # 1 199 1 199 199 374 100.0 1e-102 MEQAKNEIKKAIIKKDRLNVVYNERFSEANYTNVISKNCDQIIHSDLRETFNRLKLHLVV LCEQPEAANINKDSFTSPGYSEILENYIITGYANDSVDSVSGITIMGAKLLQSGKVVDLK IFVPLLDADYPYYEELSIDAAACDAEVESYLFEEKWGVRQERLDFDTDEPEEAVIIEDKP KKRGRKKQIEAPAPLDATA >gi|225935325|gb|ACGA01000067.1| GENE 30 19961 - 21634 592 557 aa, chain + ## HITS:1 COG:ECU06g0820 KEGG:ns NR:ns ## COG: ECU06g0820 COG0553 # Protein_GI_number: 19074332 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Encephalitozoon_cuniculi # 97 529 126 525 556 154 27.0 4e-37 MNIELKGDNFELSFKYKPSIVDRIRQIPGRRFDGTRKVWIIPTRSRVDLERMIYQIQQFE NINWLSGNEKREEEAVYDIPELPELVIPHNLKIQPYPYQLKGIARGLELKRFMNCDEPGL GKTLQSIATINIAGAFPCLVICPSSLKINWMREWEKFTDKKAMILTDKVRDTWTFFFQTG MHQVFIVNYESLKKYFVQRIKKSEGWTLRDVEFRNSINLFKSVIIDESHRCKSASTQQAK FCKGICTGKEWIIELTGTPVVNRPKDLIPQLAILNRMEDFGGYKPFVNRYCSGQREASNL KELNFNLWKYCVFRREKSLVLTDLPDKIRQVNTCEITNRKEYVDAERDLIMYLQKYKDAD DEKIEKALRGEVMVRINILRQISARGKVRDVIEFVKDFRENGKKIILFCSLHEVVDQLKR YFPTAVSVTGRDSQDVKQRAVDAFQNNPKTDIIICSIKAAGVGLTLTASSNVAFVEFPWT YADCCQCEDRAHRIGQKDSVTCYYFLGRRTIDEKVYRIIQEKKNIANAVTGSTEDIEENI VDMVARIFDTDYDDEGF >gi|225935325|gb|ACGA01000067.1| GENE 31 21791 - 22438 584 215 aa, chain + ## HITS:1 COG:no KEGG:HAPS_0636 NR:ns ## KEGG: HAPS_0636 # Name: not_defined # Def: hypothetical protein # Organism: H.parasuis # Pathway: not_defined # 1 213 1 213 213 191 50.0 2e-47 MNTYSKYVPNVFLAKCSEKHEKGEVIEVTTKYGKENECIVFNLIYERDGFYYYSIVRADG FNVQEWAKQRAERRHEWATSAVQKSSEYYNKSNKDKDFLSLGEPIKVGHHSEKRHRKAID DAWNNMGKSVQFDEKAAEHERIAKYWEQRANTINLSMPESIDFYEHKLEVAQKYHEAVKS GKCPRSHSYALTYAKKEVNELQKKYELAKTLWGDV >gi|225935325|gb|ACGA01000067.1| GENE 32 22481 - 23236 417 251 aa, chain + ## HITS:1 COG:PSLT059 KEGG:ns NR:ns ## COG: PSLT059 COG0863 # Protein_GI_number: 17233429 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Salmonella typhimurium LT2 # 40 249 26 212 226 71 30.0 1e-12 
MKEIELYNDHFQNYKVYGIPKAQLIIADVPYNLGNNAYASNPSWYVDGDNKNGESDKAGK EFFDTDKDFRPAEFMHFCSQMLVKEPKDKAKAPCMIIFCEFEDQFRYIELGKRYGLNNYI NLVFRKDFSAQVLKANMKVVGNCEYGLLFYREKLPKFNNDGRMIFNCFDWVRDNDTPKIH PTQKPVPLLRRLIEIFTDKGDVVIDPCAGSGSTLLAAAQLGRRAYGFEIKKKFFADANKF VLSQVQQALFQ >gi|225935325|gb|ACGA01000067.1| GENE 33 23465 - 23671 108 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174934|ref|ZP_05761346.1| ## NR: gi|260174934|ref|ZP_05761346.1| hypothetical protein BacD2_23965 [Bacteroides sp. D2] # 8 68 1 61 61 88 98.0 2e-16 MCDVCKRLYDSYSEGCHYEIEGKNFCGACEDESEATYCDNCMSDMWKSEGRDEDTGLYLC KKCKENKK >gi|225935325|gb|ACGA01000067.1| GENE 34 23702 - 24115 326 137 aa, chain + ## HITS:1 COG:no KEGG:BDI_0866 NR:ns ## KEGG: BDI_0866 # Name: not_defined # Def: putative prophage Lp2 protein 26 # Organism: P.distasonis # Pathway: not_defined # 1 135 1 144 149 68 37.0 5e-11 MRTIKFRGKNLYNNEWIFGDLIQYESGEMAIFSKKLSQYGCEATEMFNRSKVETTTVGQF TGLFDKNGKEIYEGDILHTITFGFEPEEYTAIILYDNCRFQLSNGRNLFYFGQSDLTKMD DTIVIGNIYDNPELIIP >gi|225935325|gb|ACGA01000067.1| GENE 35 24341 - 24697 286 118 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3857 NR:ns ## KEGG: Cpin_3857 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 100 24 123 127 131 60.0 8e-30 MDSSIVYGKSYGMIYICRTCNAYVGVHKGTDQALGRLANKQLRVLKHEAHEYFDKIWRFK LMKRTEAYTWLSSVLELPEEYTHIGMFSEKTCRQVIYVSKQLLQKYGIESKTPYCEND >gi|225935325|gb|ACGA01000067.1| GENE 36 24630 - 25163 208 177 aa, chain + ## HITS:1 COG:no KEGG:Coch_0893 NR:ns ## KEGG: Coch_0893 # Name: not_defined # Def: VRR-NUC domain protein # Organism: C.ochracea # Pathway: not_defined # 65 164 6 100 120 76 44.0 3e-13 MLVNSYCKNMESNLRHLIAKMTKEKCILCGKETVSVIKTGTDFMCYNCYADQRNPTRSKE VHNNEEARIQTEFFKLIPLYFPNIPDKLIFAVPNGGSRHIREAANLKRQGVKPGVSDVIV LIPKKGFASLCIEFKTKVGKQSEYQKEFQKQAESCRNKYVVVRSASQAIEELRKYLS >gi|225935325|gb|ACGA01000067.1| GENE 37 25268 - 25555 232 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174938|ref|ZP_05761350.1| ## NR: gi|260174938|ref|ZP_05761350.1| hypothetical 
protein BacD2_23985 [Bacteroides sp. D2] # 1 95 1 95 95 191 100.0 9e-48 MTFEEAVSLVDRIKEQVIGVPVKGRLIESLFIGPTNWNEMHVFMNICLQKGEDEAIDEFI GKSFSVYGRSVTYINPDLPRWDVTVLDDWEKTIYN >gi|225935325|gb|ACGA01000067.1| GENE 38 25566 - 26870 549 434 aa, chain - ## HITS:1 COG:no KEGG:BF1081 NR:ns ## KEGG: BF1081 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 425 1 410 415 216 37.0 1e-54 MNRIIIIGNGFDLAHNLKTGYKDFINDYWDTVEEGIYDKYWRLLDQQYGGGKHPLNDYED QFIKIGKEYDKTGVNKVCSSYKEDSPLWKLHTLIDEHNNDPSSNVTVTLTFTNHFFERIS HQCSLVNWVDIENEYYKALKELLQEENYQKQNESIHTLNKEFDSVKRLLEKYLTRISENT ELKKHQSIQDAFSSYVEFEEVATCKQTAFINSFFSNMDIRFDFDIDRHGDLSYNECLTKD EELRYYIDKKLNNDNFKKENLIPNTLILNFNYTKTAEKLYIKNGNDKIINIHGELNNENN PIIFGYGDELDDDYERIERLQNNDFLENIKSIRYHKTRNYRKLLEFVALGPYQVFIMGHS CGNSDRTLLNTLFEHDNCLSIKVFYRQYEDGTDNYIDMIKNISRNFNNKPNMRDIVVNRE SCSPLVPVKKEVAE >gi|225935325|gb|ACGA01000067.1| GENE 39 27022 - 27933 510 303 aa, chain - ## HITS:1 COG:no KEGG:BT_0082 NR:ns ## KEGG: BT_0082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 295 14 295 306 213 40.0 8e-54 MDTKSFFEKSKKQLNILNKKGWLANISSYNNEYICPLCLNKFTAEQMDELSQEDAPQDKL GGKRIALTCKKCNNTCGSSMDCYLINRIENYENSIFIPGTKRDVKVKVADKTFNGQLEVC SDGRMIMTNSFKQNNPTLLSEYMKQLAEDMALSIENKNKKVDDTRLSVALLKNAYIILFA KFGYTFLMDELYDTIREQIEKPDSEVVPKLWKITTERMIPDGVYLMSDCDGFLVSYTIKK NIEYYVLVAIPFPNVSFDEIVAYLTTIGPNKPMTLKKITNRDYWQDESAIELLRKEIFLE KGV >gi|225935325|gb|ACGA01000067.1| GENE 40 28047 - 28517 293 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174941|ref|ZP_05761353.1| ## NR: gi|260174941|ref|ZP_05761353.1| hypothetical protein BacD2_24000 [Bacteroides sp. 
D2] # 1 156 12 167 167 239 100.0 3e-62 MLGIWFTSCKTSRNMETQKQVDYSSELSRIQSIIESLRLDVNKQIKITTDKLSDLKIENK TVYLSAPDSTGKQHPIKESTTTASKQDQERTEVDETLSITLQQFSNRLDTISNKVNVLLN QKEIVVELSWWDLHKDKIYVIIIFIAIGVLVYIYLK >gi|225935325|gb|ACGA01000067.1| GENE 41 28547 - 29041 176 164 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 151 1 100 116 95 46.0 4e-20 MKTIDAIIIHCSATRAGQDLRAKDIDRIHRARGFNQIGYNFIVDLDGIVENGRPLSIDGA HCNTKGFSKSSYNKHSVGICYIGGLDASGKPADTRTPAQRTALRELVAKLCKEYPIIEVL GHRDTSPDLDGSGEVEPAEYIKACPCFDVRSEFSNFLRNTVIRP >gi|225935325|gb|ACGA01000067.1| GENE 42 29062 - 29412 365 116 aa, chain - ## HITS:1 COG:no KEGG:BT_4442 NR:ns ## KEGG: BT_4442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 116 152 66.0 3e-36 MGVNEWIAVLGAIGGTSTITWLVTFWVNRKTNARKEDATADSMENENERKQVDWLEKRLA ERDAKIDAIYVELRQEQAEKLQLIHDKHELELKLKEAEIKKCDVRGCSKRQPPSDY >gi|225935325|gb|ACGA01000067.1| GENE 43 29423 - 29689 183 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174944|ref|ZP_05761356.1| ## NR: gi|260174944|ref|ZP_05761356.1| hypothetical protein BacD2_24015 [Bacteroides sp. D2] # 1 88 1 88 88 141 100.0 1e-32 MNITSTNSTATTKVTDAIRIKYRMSTRGTEAVKDITAEIVKDETTVGFFNISRNGVTGFS LHEAHGLTFGEVKQVFQTAIDDCSEVLK >gi|225935325|gb|ACGA01000067.1| GENE 44 29686 - 30009 312 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174945|ref|ZP_05761357.1| ## NR: gi|260174945|ref|ZP_05761357.1| hypothetical protein BacD2_24020 [Bacteroides sp. 
D2] # 1 107 3 109 109 193 100.0 3e-48 MKLNLNKPLIDFRGKEAIKIVNGKEQKQFLRDMVSEALYAAGMNPQSGMDMAKKLRAYNM LQQIINNRGILEITTEDATLLKEICADVFTAGAFGQINELIEGGGKE >gi|225935325|gb|ACGA01000067.1| GENE 45 30021 - 34019 2577 1332 aa, chain - ## HITS:1 COG:no KEGG:BDI_0901 NR:ns ## KEGG: BDI_0901 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 7 903 8 879 1223 884 52.0 0 MITLHNGDKEIDIEVKDESYSYEAIMGEYTLTLYFSHPGYIEIPVGSWCDFYGKRYSLKR DSNFKKNGERNFEYTLILETGEADAMLWKVRHTVDRSIKFSYTAKPHEHLRLLVENLNRR STGWKVGDCIEGTEKVINYNHTYILDAFNQLAELYETEWQIIEETVEGKQIKTIHLRKVE YNKENPLKLSYGKGHGFKVGVGRESGEIPPEIILVETTERNIDYSTYGSKYLLLPKNKTI RFDGIKFENEEGFDSTKARIYKTDADGTCVMRADKELTTAKEDSLDCTAIYPSRVGTVSA VIEVNKKNNFFDFVDKDIPEELNFEDCLIAGESMTVIFQTGMLTGKEFEVKYIHEAKDKK EARRFEIVPQEIDGITMPEPEVWRPKVGDTYAVFGMQLPKAYICNDSTQTGASWEAFKEA AKYLYEHEDKAFIFTGTLDGIWAKKRWLEIGGKIVLGGYVDFYDTQFHPEGSLIRMIGIK RYINNPYSPEIELSNEPVSTSVSSDLNKIETNKVEVDIKHKDALQFTKRRFRDAKETMSM LEDALLNFSGSVNPITVSTMQLLVGDESLQFRFVNSKTNPVQVSHNITYNASTRILNAPA GILQHLTLGISSLSSSHKADEYKYWDMAEYNSPAFIDPEKKYYLYAKVGKENQAGTFLLS ETAIKMEQIAGYYHLLTGVLNSEYEGSRSFVQLYGFTEILPGRVTTERILSPDGDTYFDL VKSEIGGNIQIKAGSSGLENLSEWEAAHKEIENAAKAAEQANNAVDGLHDYVNGAFADGI ITEAEAKAIEKYINTVNNTKQAIEATYNKLYTNVYLSGPAKIGLLNAKVTLMGSIENLIN AINTVITDGQTTVVEKRDVDNKFTLFNSALATFNTAVEEANKAIQDKLKEYSDEALKQAI QALEDAADAAKAAQEAADSVDGLHDYVDGAFADGIIDGAEAKAIEKYLNTVKNTKSAVEA TYSKLYVNTYLEGSAKTALLNAKVSLFGAIDNLIAAINTVIADGQTTIEEKKNVDDKFTL FNSALASFNTAVEEANKAIHDKLKSYSDECTADLKVLNTQISAQVTRVDSLTQRIDTAGW ITTADGNKIYASKELENGNTLISYINQAAGETTIHSSKINLEGAVTITALHSDLQTMINS KIDRDGLGKLAFEDAVEYAKLGTTIVVGGYLNTDLIKVRRIDADSGFIGGFTIENGRLVW TRSGYFGGTSRSLKLGSGTAKEGVVNVTFNAETDGRFGIASIGSNFGGACIYASRNLNAS DRSYPLASTTYAGFFDGGVYVKGSLSSELCLADNFGCITSRNSDGSINYYQGIDFDFGSN MKFRKGLLVSIA >gi|225935325|gb|ACGA01000067.1| GENE 46 34233 - 34490 68 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174947|ref|ZP_05761359.1| ## NR: gi|260174947|ref|ZP_05761359.1| hypothetical protein BacD2_24030 [Bacteroides sp. 
D2] # 1 85 40 124 124 151 100.0 1e-35 MIQWGNLASSYRNRSIYLPTSFYDNKYIAIVGVYDSTVDSISVVSCKIKTQYTSYFVIVP AATWSAENYLYATEGINWFAIGRWK >gi|225935325|gb|ACGA01000067.1| GENE 47 34515 - 34925 168 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174948|ref|ZP_05761360.1| ## NR: gi|260174948|ref|ZP_05761360.1| hypothetical protein BacD2_24035 [Bacteroides sp. D2] # 1 136 1 136 136 258 100.0 1e-67 MVHFSRKLVAIIVVIFAQSSLGTNAALKDFSNVSTKSFSQNGYYKLPDGLMIQWGVGGKI GDVKTIYLPASFYDTAYNVVACSGFELISEVVVSAVMVYNKNKSNFTVVQRHASNDRNGV YTTGYPFNWFAIGRWK >gi|225935325|gb|ACGA01000067.1| GENE 48 34940 - 35350 93 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174949|ref|ZP_05761361.1| ## NR: gi|260174949|ref|ZP_05761361.1| hypothetical protein BacD2_24040 [Bacteroides sp. D2] # 1 136 1 136 136 256 100.0 3e-67 MKLNINCFSRKLGAILVIIFLQSSLGTKSAQINSQNLGQNGYRKYEDGLLIQWGRLANSS PGSATIYFPVSFYDASYRLVTTMETISNEHSLYTALPYNKAASYVAVMRKYLLADNSITV GSSIRSFDWIAIGRWK >gi|225935325|gb|ACGA01000067.1| GENE 49 35374 - 35778 152 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174950|ref|ZP_05761362.1| ## NR: gi|260174950|ref|ZP_05761362.1| hypothetical protein BacD2_24045 [Bacteroides sp. D2] # 1 134 12 145 145 238 100.0 7e-62 MDRFSRKLVLLLLLVIWQSSLGTNTALKDFSNVSTKNFSQNGYYKLPDGLMIQWGKKTGG SSYQGTITLPLSFHDDYYSLACSAMKGNLIDLSSWVINYLAKTKSSFNYICTYAAEGNGT SNAEFHWIAIGRWK >gi|225935325|gb|ACGA01000067.1| GENE 50 35795 - 36325 367 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174951|ref|ZP_05761363.1| ## NR: gi|260174951|ref|ZP_05761363.1| hypothetical protein BacD2_24050 [Bacteroides sp. D2] # 1 176 1 176 176 336 100.0 4e-91 MKYWKQGFYDEPVNGSIEITEEYYSQLLAGQSAGLLIVESKKGYPILVVHEATIEEIRAQ KLDELRLFDSSEAVNQFSINGVFGWLNKNTRVGLMNSISIERETGRSETSIWLGDTQFIL SIEKAINMLQQIELYALACYNVTQRHINSINQLYTKEEIEAYNFKTGYPGKLSFTG >gi|225935325|gb|ACGA01000067.1| GENE 51 36295 - 36825 484 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174952|ref|ZP_05761364.1| ## NR: gi|260174952|ref|ZP_05761364.1| hypothetical protein BacD2_24055 [Bacteroides sp. 
D2] # 1 176 1 176 176 339 100.0 5e-92 MKYWKNGFYDEPVDGSVEITDEHYNQLLDGQSNGLLIVESKNGYPILVEYEYDIEEVRKM KISEIQVFDKSTNVNSFDLLGKSMWLDKSTRVGLFNSISIEKNAGKSDTVLWYDAIKYII PIFDALAMLNALELYALNCYNVTQSHISAVKALQTIEEIENYDYTVGYPVKLSFPG >gi|225935325|gb|ACGA01000067.1| GENE 52 36842 - 39523 1716 893 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_2162 NR:ns ## KEGG: NT01CX_2162 # Name: not_defined # Def: phage pre-neck appendage-like protein # Organism: C.novyi # Pathway: not_defined # 65 267 133 335 614 117 39.0 3e-24 MPIKKKKISELTLADSMVGLYTIGVKMVNGVQTSVKVSLEFIKKAYDDVVAATKKANDAA KAADDSRTQIEANEDTRQRNEATRIDAERNRSNEEQARSAAESVRIINENTRKAEEAARA TAEGQRVSAELSRVETENKRVSDEQARKNNEDARKTAETGRSSAESERVKEEDKRKAAET TRSTAETDRVTAEDERKEAESTRETNETARVTAEDNRVTVESERVSAETDRKSAETARVS EENKRKSAETERKSAETARVSEENKRKQNEDTRKAAEDTRSSNETRRVSAETERVEAESQ RKSEYSGIIQEMTSATGDATAQLELVKKATNDANAAKNASVEQTALAKKATNDANAAIIS INAAKEETQQATEEANAAKVASEAQTALAKKATDDANTAKNASVAQTALAKAATDNANAA AQAANNAVSGVDAKVKAAVDALVAGAPEALDTLIELANALNNDPNFAATMATELGKKINV SDIVNNLTSGGTGKVLSAEQGKVLKAALDTHNHAGIYEPVFSKNTAFNKNFGTTSGTVCQ GNDSRLSDARTPKAHTHKVSEISDFPTSMPASDVPAWAKAASKPSYTASEVGASPSNHTH TGVYQPVGSYAASSHKHEATDITPDSTHRFVTDAEKSTWNGKAAGNHNHDSAYQPKGSYA ASSHKHTATDITDDSTHRFVTDSEKSTWNSKAAGNHNHDSAYQPKGNYASSSHNHTASDI KEDETHRFMTDEERTKLSGIASGANNYSHPASHPASMIEESTTRKFMTDAEKTLLSSLGT TYALADLSNAMSVNLSLNGYAKFNNGLLVQWGRVGGSSTVSYSVTMPTSFYNTEYKIFAT VYKPSSDSAVYSSSPLAINKTVSRFYLNRNYASGGTTGLSQELWDWFAIGRWK >gi|225935325|gb|ACGA01000067.1| GENE 53 39513 - 40445 676 310 aa, chain - ## HITS:1 COG:no KEGG:BDI_0895 NR:ns ## KEGG: BDI_0895 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 310 13 310 310 202 40.0 9e-51 MYTKHGISIIKGSYNNLVAFPTLKDPEKNDWPEEDGQEFDLSVVALDTSEISIEFGFRDD LGFGGLIALLSDMGYHNFRFPILGKTYRLRLLSQNSYTIYPRLEIAKIVFANDFPHEANY EYQDPVNSIAMPKGYEIDNKDLSDYGVIVLPGSNAEILKTPAVKKNLLQNFKRQDGAIYD GEVVKFQTKEVSLKCLMRTGTIEAFWRNRNALLYDLTKLSAKVDDEGYEYSDAERIFYCD EWSESYPCYYKSCQTNNFLLNNGVWWEFTLKLVFTSFRIGETDFLLASEAGEFIITEDGI FYIDLKNYAN 
>gi|225935325|gb|ACGA01000067.1| GENE 54 40475 - 45109 3459 1544 aa, chain - ## HITS:1 COG:no KEGG:GFO_2426 NR:ns ## KEGG: GFO_2426 # Name: not_defined # Def: phage tail tape measure protein # Organism: G.forsetii # Pathway: not_defined # 7 357 6 356 1325 275 48.0 1e-71 MGIQNKDGALYFATGIDNSGLYSGRQEAMGIIKAMASEITAFDVFGGIGISAGIAFARAA KGAYDFEKQFQQSMKEVATLSNGIKGSLTDYMNQVMEITRTIPVEANEAAKALYQIVSAG HDGANGMKVLEASAKAAVGGVTDTATAADAITTVLNAYKLDASKAQEVSDQLFTTVRLGK TDFGQLGKSIAQAAPIAASFGIDIKEVLGAVASITKQGVPTSEAMTKIRAAILGTANQLG DAAFKGRTFQEALQLIYDKAGGSSTKMKELLGTDEALQAALMLTGEKAKEAASDLDEVNN SAGAAEAAFKEMASSAENQMKLLGNNITATLRPLGKEILKEVSSIATSFNKAFENGDAQE SLRILGTLIVTLTSAFIGYKGSIIAINAVKKIHISLLEMEKYEMTLYQKAVEAGTVSENV LTAAQVKQMATRKAMITTIKMHTASLLKNTAAMLANPYILAAAAAATLGYAIYKYATRAT AAEKAQYRFNKTVKESQQRAEDYKNKIQGLVSIARSDAAATTERYKALLQLQRLMKSVFK DMDLKKLKDMDDLELTNKIAEAEARRNVVIAKTNLVMAKQRLQKIDNQIILTNNRNGYTG ALKEDREAVAKEVDLLQGEVDRAVKVHSISIKAHVQSQKEAATQNKAFWTKQKDDATKAL DSIASAQKKLMDAGNFKGIDATVVTAYKENIKKLKEAEKELKVYDSFSKQDDKAQKLREE QEKYKLLLEKQKFDQERMKEDSANELEQIEINKLKESSEKVLRQRALNHRLELQAIKRET EDKKRKVIEDARAAFEINPQNKKKVFNADVFINSESTKKLFASFDNIAKEATIATNTKFD RGDDLSDLLNQYQDYTDQRLAIERKFNEDIATLQEQRKQATKNGDTNQVEQIDRSIAQAT KNKGMELMKLDYDKLKESPEYVRAFENLKETSSETLNSLLTQLENAKSTAAQVLSPDQLR EYTSTIQSIMDELDSRNPFQSLSDKKKELAEAEEELANAQIELENARIKAEAVKGGAKFE NGIKSSKYNPETGKIESTKAYLSEAQALEQVKKKTENYNAAKDKVVKKYNQVKKAEKEVR AQISELADTIDELGKTIGGPAGEIISLIGNIGAFTMTAMSGVESAANTSANAISTVEKAS VILAIIGAAMQVAMKIFDLFGKDDTTEKYEKAKETYESYISILDRVIEKQLELAETLTGD NANAAYEKALKMVKLQSENARVLGQQYLNSGASGKSHSKGYSEVEDMSWEGWKQAADTLG MSIDDFKKKMGGRMTGLFDLTDDQLAKLQENAGIFWSQLDSDTQKFADQIANGVAKVAEV LEQQIADTTLIDYSSLRSDFQDLLTDMDADSADFADNFEEYMRNAILNSMLKEDYMDRLI EWREKLYNAMDDGMTEDEYNALKAEGQQIANEMKAKRDALSEIYGFGKDDDEEREASKKG FASMSQDSADKLDGSFAVMISHTYSINEGVKHLQSNSDKIAEKLAYLSNLDKYMGEIMKY NDIVITYLSDISSHTARLEAIEKAIESIKLGIDTLNTKGIILKR >gi|225935325|gb|ACGA01000067.1| GENE 55 45373 - 45567 262 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174957|ref|ZP_05761369.1| ## NR: 
gi|260174957|ref|ZP_05761369.1| hypothetical protein BacD2_24080 [Bacteroides sp. D2] # 1 64 1 64 64 70 100.0 5e-11 MTKENENVSIANKSSYTEEEIKAAYEKGKSEGRIEGMLSYQKRLIKNLQQDNASLNQMLQ EMKK >gi|225935325|gb|ACGA01000067.1| GENE 56 45667 - 46020 92 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174958|ref|ZP_05761370.1| ## NR: gi|260174958|ref|ZP_05761370.1| hypothetical protein BacD2_24085 [Bacteroides sp. D2] # 46 117 1 72 72 129 98.0 7e-29 MYLYHDASEREPIARDGDNSRSEKSTKLEGYNQSRFLVIVKEALGLTFNETLDSSYGLIE ILLQEYSFVMRERNKTTDEDGEVEGRDYEWIELPSFDNPNEKIRMKRYYDINGKVKR >gi|225935325|gb|ACGA01000067.1| GENE 57 45909 - 46232 72 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174959|ref|ZP_05761371.1| ## NR: gi|260174959|ref|ZP_05761371.1| hypothetical protein BacD2_24090 [Bacteroides sp. D2] # 1 107 72 178 178 218 100.0 1e-55 MIVKKGELPEDFPKIMDKYGELLLDIVCLGIHNKPSDPPKWFKQALADNSTWEDIRILFN AIIYRIGYHPFCTSITMLRNVSPLRETEIIAARKNLQSWKDTTKVDS >gi|225935325|gb|ACGA01000067.1| GENE 58 46615 - 47214 626 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174960|ref|ZP_05761372.1| ## NR: gi|260174960|ref|ZP_05761372.1| hypothetical protein BacD2_24095 [Bacteroides sp. D2] # 1 199 1 199 199 353 100.0 2e-96 MTGEVRPIAMGVGKIKFGTVGDGVPGADLKDFPLPTKGSVVFNFADPKEVKVETEGSDEP LYVEFVKDTTDYIELSIPTPSNEVLKELAGGEIDTADGRNIWKKPINVPSISKTFQCETL PKNGKKVVYTVVNGKITSKISQAPGSEQAELLLIRVYMQAAITASGEKKAAFMREVISVS EGGEAPANPANPEGEEVVE >gi|225935325|gb|ACGA01000067.1| GENE 59 47217 - 47597 346 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174961|ref|ZP_05761373.1| ## NR: gi|260174961|ref|ZP_05761373.1| hypothetical protein BacD2_24100 [Bacteroides sp. 
D2] # 1 126 1 126 126 234 100.0 2e-60 MDEFDAVDIIYNAVVAAGTDVMIYKDKSEAGLTNEHIVINHLQLNELDFINKVPINVNIF VPLNENGMPQRQRMKEIKRKVRKSLDSINSNDGICKEVTVLWSVPMPDLKEKFACTNIRL EILIDQ >gi|225935325|gb|ACGA01000067.1| GENE 60 47600 - 48055 459 151 aa, chain - ## HITS:1 COG:no KEGG:GFO_2433 NR:ns ## KEGG: GFO_2433 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 4 132 3 126 142 87 37.0 1e-16 MKNGMTPLFDQHSLERWFDHFQSKAENKILVLLQAGGEKFIDIARRNGSYKDQTGNLRSS IGYIIAKNGEVVAENFTESEKGTDKTTGKYKGRRLAEEVSLSHSGGYVLVGVAGMEYAAA VEAKGYEVVSGANTQCEKYLRDTLKSVFSKI >gi|225935325|gb|ACGA01000067.1| GENE 61 48052 - 48363 185 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174963|ref|ZP_05761375.1| ## NR: gi|260174963|ref|ZP_05761375.1| hypothetical protein BacD2_24110 [Bacteroides sp. D2] # 1 103 1 103 103 200 100.0 2e-50 MVKRYPHTAIVTIDVNGKTVNGEWVPGKPIEISVPGRYDPVSDGTVVYKRNSAGDEAQVH GYFYTKIQPPADSKFLRLKVDSKGVDVPIICWESYQSHSIINV >gi|225935325|gb|ACGA01000067.1| GENE 62 48357 - 48689 241 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174964|ref|ZP_05761376.1| ## NR: gi|260174964|ref|ZP_05761376.1| hypothetical protein BacD2_24115 [Bacteroides sp. 
D2] # 1 110 1 110 110 221 100.0 8e-57 MATIRETILEYPSIEDMESFLDKVVFVKRGINPEAECTTENMKQVGLCVADTYAMLVNSQ DFSENKLSITHPRSFYVQTAKQLYIENGEPEKAGKLGKRIIIKGKAGNRW >gi|225935325|gb|ACGA01000067.1| GENE 63 48695 - 49816 1007 373 aa, chain - ## HITS:1 COG:no KEGG:SEQ_1749 NR:ns ## KEGG: SEQ_1749 # Name: not_defined # Def: phage capsid protein # Organism: S.equi_equi # Pathway: not_defined # 32 226 33 221 349 65 24.0 2e-09 MERSLIKQVNRKNMGARLNSRKVKPVFFPNFFGVKQKNSLKWETLTGEKGAPVIADVISF DSSAPQKKREVIGKMSGDIPKTAVKRGMNESDWNEYQQLSRDCEGDADLKSILDLAFKDQ DFVYNAVRGRFEWWCMQLMSKGGFILNSSNNNGIVTEEFVGCGMPNENKKVAAVDWSKST TADGLQDIEDTVVAASAEGVTIKYVVMRKDRFALLKKQKAVIEKVKGWINQKEKLTISKK VINEYLAAQENTEGVQIVLVSPSVRIENAAHQRTTVNPWESANICFLEDLQCGDIQHGPI AAEHSVEYKKKATTLKKDFVFISKWSELEPFKEWTKAEANAIPVINDPDAMYIMKTDGQA WEEGEDTEKTDEE >gi|225935325|gb|ACGA01000067.1| GENE 64 49838 - 50422 684 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174966|ref|ZP_05761378.1| ## NR: gi|260174966|ref|ZP_05761378.1| hypothetical protein BacD2_24125 [Bacteroides sp. D2] # 1 194 1 194 194 327 100.0 2e-88 MFRKKQKEFQYAPGIEKIIEDIQGGGTIARAELKGIIDELPPLVMVGKDANGLYHTVKTG RITAVADADAVTIQIAKNHVFKVGEAVTVGGALTGAADVISAIDKTNPGYDTITLAGPIG AAEIGNVLVLVTAKADAKAAKFKYVPEVITMNKVDVTVANQQSGLLVRGTVNEAVMPYPI DEAMKELLHFIRFV >gi|225935325|gb|ACGA01000067.1| GENE 65 50442 - 51146 628 234 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174967|ref|ZP_05761379.1| ## NR: gi|260174967|ref|ZP_05761379.1| hypothetical protein BacD2_24130 [Bacteroides sp. D2] # 1 234 1 234 234 301 100.0 3e-80 MTIIDAIKKSLKAAGVNEKYAVKVQKLFKIEKEDDIDTYIALFKDNILPDLENTSAIEKA KKDAIAEYEKNNGLKDGKPIKSAKKTKKTVKSEDDDEEDDDEDFEDLPASVVKLLKAQQK QISELTASVSSVVSTVTTSTKQASARTLFADAKLPEKWFNRIDVNSETSVEDQIKELQEE YAEIRQSVIDDEVAGGGYKPNSYKPKERSEKEWLELMEDEENSNNGTASLGLDE >gi|225935325|gb|ACGA01000067.1| GENE 66 51230 - 51493 230 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260174968|ref|ZP_05761380.1| ## NR: gi|260174968|ref|ZP_05761380.1| hypothetical protein BacD2_24135 [Bacteroides sp. 
D2] # 1 87 1 87 87 158 100.0 8e-38 MDNKKKEYRKKAKELALQNGFDQVSYYGEWNGYLAYTVSRKEDAGCCIGYPRFILVKDSV ATLAPYTQSEDIMGMTSMPKDHVDTLL >gi|225935325|gb|ACGA01000067.1| GENE 67 51488 - 53896 1067 802 aa, chain - ## HITS:1 COG:no KEGG:GFO_2441 NR:ns ## KEGG: GFO_2441 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 36 333 27 326 341 194 34.0 1e-47 MPGLSFYDKQHIQKIAAQQAVIANIFNQFILSVSPYLRKWSDAGKNNVWISNQGIESAVD RELLNLESMLYANISAFQKDGWERAEKKNDDFISQFIKGMSISSATKDGMFAHSLSAFET LKNDIDANGFKLSDRVWNITQQTKSQLEFYLDSGVVAGRNANGISSDIRQILQNPQKRFR RIRNEKGELVLSQPMKNYHPGQGVYRSAYKNALRTSATTTNIAYRSADYERWSKQDFILG IEVHRSANNRGPCKICDAMVGKYPKTFKFTGFHPFCICFATPITMEPEDFADFLLNDTVP QEQVITDIPQAAKEFVSENKDGLQSAFWYKDNFTQDGNVNERLKIVVKQPEKSKQSVIEP ISPFKESQSVEEAQNYATKLGVKYADFSDLSLDIANALNEALQTLPQEDMPLFVGSAKNL QNLSGYAIKKKGYHGITMENIIRIKENEIWKVENGTLVGINAKSINAITKAKAKIETEYM ARTGHHWRFNTDGKSTTFHEIGHVFDQKHLVSSDSKWQTIAKKWAEESQCDILKTPSEAF AEAWAARNTGKELPSYITEYFKELDRQYFNKKSSIPKRIKTDAEKNDIQKRWDDRFVRNF NQAKIEQKIGVKKGKEMTFEEANELRGNINYGKASEYSVNCQSCVVANELRRRGYNVTAL PNLQKTGNIPYELSMRTNWAWIDPKTMVIPKKQTAGGIYDITRSGALKSKSIKELTKELV ELVKEPGRYHIDFAWKGKNSGHIITLEKLHNGKIIIYDPQTGKMKNWRELSKEISLRYGV NVLRVDNLLVNTDIIDGIVKKL >gi|225935325|gb|ACGA01000067.1| GENE 68 53900 - 55348 954 482 aa, chain - ## HITS:1 COG:no KEGG:GFO_2449 NR:ns ## KEGG: GFO_2449 # Name: not_defined # Def: phage portal protein # Organism: G.forsetii # Pathway: not_defined # 23 469 22 436 445 128 27.0 5e-28 MPDIKDILKNEDFGSIVGDLCVDTRENRNPREYMEEYNGDRTRRKESVGYREPKKIAVYS DTEVEIDPETGAEKPKRLEDKTVDVAKVVTNLPKKIVRTSVAFLFGGEMTITAEDSNDGF DEFKKVYKRKLKMQSVLKEFARKVLSETKAAIIFYPVTKDDGKSQLKVKILSTPKDSNVE CEFYPHFDEDDDMDGFIYKYNAEVNGRTCECVKVYTKDVIYSGIMDGVWQVKKIKNRFGK IPVVYAEVDCPDWEDVANLIDKKEMRLSRLSDTNDYFSEPILKTYGLANLPSKETVGKEL NFTMEVDADTGNTYHGDADYLAWQQSCESVTLELNQLDDAIHSGSSSPDLSMSKLMGLGN LSGTSRRFMLIDAEIKASEQMEIFGPAVQRTVAIVQAGMANITHTKYASQLNDNYIEVEF GSILPQDLAEELKNLETASQFNSKETIIKNSPYTDDVETELNRKKQDEKETAQNNSFIGA TL >gi|225935325|gb|ACGA01000067.1| GENE 69 55645 - 
55761 90 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKVFKRVAYAIKWFLYQEQSRDNQILWFIWDVITTFFM >gi|225935325|gb|ACGA01000067.1| GENE 70 55777 - 57222 750 481 aa, chain - ## HITS:1 COG:lin1732_1 KEGG:ns NR:ns ## COG: lin1732_1 COG5410 # Protein_GI_number: 16800800 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 74 296 68 296 310 79 29.0 2e-14 MAKKKSKRQILIRKAKAATILRKRIAKKDFWAFCLYYDPKFFSKRLFLKKVAEAFMRVYE SYSASIIYRLAVSMPPRAGKSYISSLFIAWMYGHFPEESVMRNCCSDTLYNKLSYDTRDV VKSRRFKEIFPDIHLKGDKQNVKSWSVEGARQVSYFGGGVGGTVIGFGASMLAMTDDLYK SLEDALSDNNNEKVWSWKQGTHDSRIEGSCCMIDIGTRWSSNDVLGRMEEAGKYNEIIRI AALDENDETFCADVHTTEYYKELRSETDESIWMAEYMQEPFEAKGLLFPKSSLMRFKLAD IAGKKPDGTLGACDTADKGDDDFCAPFAKVFGPKYFITDILFTKDPVEVTEPRLAQMVID TECDQLRIESNNGGRIFAINVRKLVTAKKKSCMIQARATTQHKETRIIMKAGWIKKHCAF LDETEYSKGSDYGRFMKAFTSYKREGDNAHDDAPDGMTILAEFAEAIGLNLKKQTRKVGR G >gi|225935325|gb|ACGA01000067.1| GENE 71 57203 - 57697 248 164 aa, chain - ## HITS:1 COG:no KEGG:BP3385 NR:ns ## KEGG: BP3385 # Name: not_defined # Def: hypothetical protein # Organism: B.pertussis # Pathway: not_defined # 14 135 2 122 149 137 54.0 2e-31 MTEKKNPAEKKKRGRKSEYRIEYADQALKLCLLGATDKELAEFFSISEQTLNKWKKDYPE FLESLKKGKNIADANVASRLYNRAIGYSCKATKFATSEGRITDSKEYIEHYPPDTTAAIF WLKNRQPEKWRDKKEVDANVNLGDELEGLSDEQLQAIIDGKEEE >gi|225935325|gb|ACGA01000067.1| GENE 72 57701 - 58294 390 197 aa, chain - ## HITS:1 COG:mlr0922 KEGG:ns NR:ns ## COG: mlr0922 COG0302 # Protein_GI_number: 13471052 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Mesorhizobium loti # 1 170 1 166 197 134 41.0 8e-32 METKSTDTKDIECAVRTILSYIGDNPEREGLKGTPERIVRMWKELFRGYDPEQAPKITTF PNGKDGLSFNSIVADSGNYYSMCEHHMMPFFGKYWFAYIPNSEGSILGISKIGRVIDYCA ARLQVQERLAQDVVTMIVDALGKEHPPLAVGIIMEGEHLCKTMRGVKKQGKMRSSFYFDN GRLPELKAELSQFASFG >gi|225935325|gb|ACGA01000067.1| GENE 73 58249 - 58794 266 181 aa, chain - ## HITS:1 COG:RSc1449 KEGG:ns NR:ns ## COG: RSc1449 COG0602 # Protein_GI_number: 
17546168 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Ralstonia solanacearum # 4 181 5 212 212 138 38.0 6e-33 MKKINEIFYSIQGEGYRTGTPAVFVRFSGCNLKCPFCDTQHSSGREMSDEEIIKEVCFYP TRFVVLTGGEPGLQVDQEFINKLHQAGKFVQIETNGTVPLPIGIDWITCSPKEGSKVFVV NPHEIKVVYTGQDLSTYEAMTAAVYYLQPCSCQNTEEVINYVKEHPKWKLSLQTQKILNV Q >gi|225935325|gb|ACGA01000067.1| GENE 74 58781 - 59113 205 110 aa, chain - ## HITS:1 COG:SMb20939 KEGG:ns NR:ns ## COG: SMb20939 COG0720 # Protein_GI_number: 16264810 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Sinorhizobium meliloti # 1 107 2 117 119 65 37.0 3e-11 MYIVKKRIEISASHSLKLSYESKCQNLHGHNWIIIVWCRAKTLNQDGMVVDFTHIKQKIQ DKLDHKNLNEILPFNTTAENMAKWICDQIPGCFKVMVQESENNIAWYEKD >gi|225935325|gb|ACGA01000067.1| GENE 75 59118 - 60056 251 312 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174977|ref|ZP_05761389.1| ## NR: gi|260174977|ref|ZP_05761389.1| hypothetical protein BacD2_24180 [Bacteroides sp. D2] # 8 312 1 305 305 621 100.0 1e-176 MKLKYWRMISVEPLFKDNLKVHFAGCENLDKLTALHAVGVKYFLFTCYPFVKQMINGKLS NKNRNNIIPSLVSSLGEHAIMDSGLFTLMFGADKGKHDEEFLYTWMLKLVEFVKETGFTG TCVEVDCQKILSPEIAWKFRKEMKRLLPDNRIINVFHLEDGKKGLDRLIDFSDYIAISVP ELRIHKNRTYKTDVAYLTRYIKKKKPQIDIHLLGCTESKMLKENSFCTSADSTTWSAIVR WPKLPFVINGKKLTKHIKNLDESKLLEFYAENIDQLVKKYRFNPRSKLTLAKICLAGELC LHEYDYLLSNQR >gi|225935325|gb|ACGA01000067.1| GENE 76 60004 - 60171 87 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174978|ref|ZP_05761390.1| ## NR: gi|260174978|ref|ZP_05761390.1| hypothetical protein BacD2_24185 [Bacteroides sp. D2] # 1 55 55 109 109 117 100.0 2e-25 MQKKLESTLQNEFGSPCEFGSYSCEDIAQWLLNRFSSMNEVEVLEDDFGGAAIQR >gi|225935325|gb|ACGA01000067.1| GENE 77 60326 - 60838 296 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174979|ref|ZP_05761391.1| ## NR: gi|260174979|ref|ZP_05761391.1| DNA methylase N-4/N-6 domain-containing protein [Bacteroides sp. 
D2] # 1 170 1 170 170 312 100.0 5e-84 MEEKVEIKIDPRNYRIHGDENKRLIHKSLVECGAGRSVLADRDNVLIAGNGVYEEAQKLG LKVRIVESDGTELIVIKRKDLSTEDEKRKLLALADNHTSDTSRFNFSAIVDDFNLDMLGE WNLNIPNFNIDEDKLDDFFSGSTQPASKKEKVLICPFCGRNVYEKEDDDE >gi|225935325|gb|ACGA01000067.1| GENE 78 60855 - 61418 310 187 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260174980|ref|ZP_05761392.1| ## NR: gi|260174980|ref|ZP_05761392.1| hypothetical protein BacD2_24195 [Bacteroides sp. D2] # 1 187 1 187 187 370 100.0 1e-101 MASKAVNNYITKRYERWLDYSLYHCGLAGIPDEATDVLNEVICSLLQKKNRLLDKLLETK KNGYTELDFFVLKMIKLNASSPTSQYRSRYKPLPVDDNVDYSRLDIEDIPDESEDRNTEI LNKLHLVRDTFESLDLGPVAARVFEFHFFQDGNFSDWEGPETLKQLYEIYNGVQELIRKK IAGETIF >gi|225935325|gb|ACGA01000067.1| GENE 79 61507 - 62346 319 279 aa, chain - ## HITS:1 COG:no KEGG:BF2329 NR:ns ## KEGG: BF2329 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 275 1 275 282 316 55.0 6e-85 MNGELIKTELQVVLSPSQIVQKTAIRKELTPIQQALQAGSTASQLVAEWSGTIAQLNCNV SLSDVANAENIPTLADVNRSFSNSTSVEIITEHLKSVLRYAGVELTNAQLAETALSILSS YWYLNLAELCIFFSQLKNGSRGQFVWGSKINNQAIMVALVEFCKDRRREIEHRENELIRK KAETGYARNENLIKDIVTGVQNTRREREKAKQDFKTFCELFPYLPDKYEPMVLWKAWGGN KEALRKIYGESIPPPDVAEMDIGMYLCNYNITKGKELEK >gi|225935325|gb|ACGA01000067.1| GENE 80 62204 - 63097 324 297 aa, chain - ## HITS:1 COG:no KEGG:BVU_2853 NR:ns ## KEGG: BVU_2853 # Name: not_defined # Def: putative Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit # Organism: B.vulgatus # Pathway: not_defined # 1 265 1 289 319 175 37.0 2e-42 MARPLKQGLDYFPLDTDFLSDRKVRKIINACGPNSVTILICLLCNIYKDKGYYIVWDKEM PFDIADIVGVSEGAVSEVVKKALQVELFDNTLYRKFHILSSRGIQNRFKSCTSKRKDVEI IPDFWINDVNNSINDVNNSINVGDNEQSKVNKRKSSPPHIRVGELFPADSFFDKSLDDCY AELKSNQSWAETVTMNTRSSGYNDLTLEAFYEYLKQFFMEQQNKGETAKSPKDAMSHFAS WLKIELKNKKDERRTNKNRTAGSAKSVTDCPEDSNQKGTNADTAGLTSWIDSLSIGR >gi|225935325|gb|ACGA01000067.1| GENE 81 63111 - 64394 642 427 aa, chain - ## HITS:1 COG:no KEGG:BF2468 NR:ns ## KEGG: BF2468 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # 
Pathway: not_defined # 1 427 1 427 427 475 56.0 1e-132 MRPRTKLQLRVANLSSQLPNIESLMIDWAKNECLKHIGYATKSRIICMECGQRFAPELVK RKRAVCPHCDTSLKIEQSRKRINKQTMFIGKAEICEEFQVIRSFELIAYYRAETKPRYYI REILQHWIKDDGNREVIARANNTGFSGWYGDLEIRNKVVGSYYYNYSNDIYCERYHPASV FRPKYIRMGIDCKLRGMSFLTATNTIPHSPKAETLLKARRYELIDYYEGHRYKIDMYWPS IKICLRNKYRIKDVSMWFDYLKLLDHYHKDLHNAHYVCPKNLKKAHDLYVARKKRDDEKE RKAKEMQQLLKLKKDAENYIKEKSKFFDLKMSDGKIVVVPLKSIEEFQQEGEIMHHCVFT NKYYKEKDSLILSARIGKKHVETVEVNLKTLSIVQSRGVCNQNTEYHERIIGLVTKNMNL IRQKLTA >gi|225935325|gb|ACGA01000067.1| GENE 82 64391 - 64792 271 133 aa, chain - ## HITS:1 COG:no KEGG:BF2469 NR:ns ## KEGG: BF2469 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 130 5 139 139 91 46.0 1e-17 MGKEYQSPKQVIQSYLEERAKSDPLFVTSYTKPNKNIDECYDYIIGEAKKRGGSVVCMSD DEVFGLAVHYYDEDDIKVSKQPATKAVVSNRPEKKKELIPSIEKTKPEQIANNKRKGKKK EIPTGQFLLFEDL >gi|225935325|gb|ACGA01000067.1| GENE 83 64797 - 65912 573 371 aa, chain - ## HITS:1 COG:CC0156 KEGG:ns NR:ns ## COG: CC0156 COG0592 # Protein_GI_number: 16124411 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Caulobacter vibrioides # 1 369 1 372 372 129 26.0 1e-29 MEITVSKTALLDKLKSIGRIIQPKNSIPAYDNFLFVVDEFGIILVTAGEEGGRISTNIDG KTDFTDRSFMANAKTLLDGLKEIPEQPLTIHLYEKELVVKYANGKFSIPVEKGDQYPTMS TDNTATPLLVSGNDLLYGIRQVLFCSANDELRPVLNGVYFDIDLDNISFVATDGTRLAMI ENPSAYTRKERAAFILPSKFAKVLSNIVPEDCMEVEISVNQTNILFEFDSYRLICRMIEG RFPNYRAVIPQKQPNRAVLKRTDIVSALKRVSVFCDENSSLVILKFCPDSLKITAHNLDF CKSAEETVALRTGCDIEIGFKSSFLIEMINNIPSEDIAITMSDPSKASILTRCDEEVRSL TYLLMPLSINY Prediction of potential genes in microbial genomes Time: Fri May 13 11:19:08 2011 Seq name: gi|225935324|gb|ACGA01000068.1| Bacteroides sp. D2 cont1.68, whole genome shotgun sequence Length of sequence - 15092 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1181 322 ## BDI_0911 transposase
- Prom 1332 - 1391 4.4
- TRNA 1430 - 1507 82.4 # Pro TGG 0 0
- TRNA 1540 - 1617 82.4 # Pro TGG 0 0
2 2 Tu 1 . - CDS 1735 - 2484 561 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase
- Prom 2662 - 2721 9.7
+ Prom 3191 - 3250 7.4
3 3 Tu 1 . + CDS 3402 - 3596 183 ## BT_0854 hypothetical protein
+ Term 3665 - 3713 4.2
- Term 3562 - 3620 19.0
4 4 Op 1 8/0.000 - CDS 3660 - 4682 775 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation
- Prom 4809 - 4868 3.1
5 4 Op 2 . - CDS 4928 - 6304 1263 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
- Prom 6427 - 6486 8.5
+ Prom 6262 - 6321 5.5
6 5 Op 1 . + CDS 6496 - 7968 1393 ## BT_0857 outer membrane protein TolC, putative
7 5 Op 2 24/0.000 + CDS 8044 - 9294 1160 ## COG0845 Membrane-fusion protein
+ Prom 9341 - 9400 2.8
8 5 Op 3 . + CDS 9433 - 10107 352 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
9 5 Op 4 . + CDS 10135 - 12462 1440 ## BT_0862 putative ABC-transporter permease protein
10 5 Op 5 .
+ CDS 12479 - 14887 1356 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 14907 - 14949 -0.8 Predicted protein(s) >gi|225935324|gb|ACGA01000068.1| GENE 1 2 - 1181 322 393 aa, chain - ## HITS:1 COG:no KEGG:BDI_0911 NR:ns ## KEGG: BDI_0911 # Name: not_defined # Def: transposase # Organism: P.distasonis # Pathway: not_defined # 1 393 1 412 435 161 29.0 5e-38 MATFKAIVFQTGRHIKQDGTSNIKIRIYHNRESQYIATSYYIRPENMDDSGRILPNVPNS EMIEYEINAYIQKIRREYLKLGQERTQFMSCKDLKEEIEKSLVPDAEFIDFVEFTQNIVI QTEKRKTAEWYRSSIDTLCWYMKRKKIDIKLITSFMLNKMIKDLYHSGPAGTPLEPGTVS HYLRGIRALYNKAKLYYNNEDFDIIRIPGDPFKKVEIPEYRRKRKNIDTNTLLKIRDFQS DKKCTNMARDVFMMMFYMMGININDLYSMSCERRGRLEYTRSKTKTRNNHEQIPLSIKIE PELRILLDKYTEGYFLSYFHTNYCNLNNFMRAVNNGLKDICMNLELDFKITTNWARHSWA SLARNKAGVPKADIDFCLGHVNNDYKMADIYID >gi|225935324|gb|ACGA01000068.1| GENE 2 1735 - 2484 561 249 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 245 1 244 245 220 44 4e-57 MKENKYNNSHFFSQYSQMSRSVEGLKGAGEWHVLQKMLPDFAGKRVLDLGCGFGWHCAYA IEHGATHATGIDISEKMLEEARKRNPSPLIEYQCMAIEDFDFQPDSYDIVISSLTFHYLE SFTDICRKINNSLTPGGTFVFSVEHPIFTAYGNQDWHYDQDGKPIHWPVDRYFTEGRRTA VFLGEEVIKYHKTLTTYINGLLQTGFEICELIEPQPDESLLDIIPGMKDELRRPMMLLIS ANKKSETAK >gi|225935324|gb|ACGA01000068.1| GENE 3 3402 - 3596 183 64 aa, chain + ## HITS:1 COG:no KEGG:BT_0854 NR:ns ## KEGG: BT_0854 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 64 1 64 64 79 92.0 3e-14 MKTRKQKETGEKEVHLLEQKGYERMVNEIVPVQQAEAYRKPTNKAVKEAVKELNPDTNSL GSRG >gi|225935324|gb|ACGA01000068.1| GENE 4 3660 - 4682 775 340 aa, chain - ## HITS:1 COG:CC1742 KEGG:ns NR:ns ## COG: CC1742 COG5000 # Protein_GI_number: 16125986 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Caulobacter vibrioides # 20 339 358 699 716 97 24.0 4e-20 
MDEWAQELSNALKDFRGKLLAEEVKHQYYENLLNKVDTAVLVTDASGHIEWMNQAAVAHL GQISQLPDVLQEASISNDMSMIRIEQNGIVLEMAISCTTFATQGKKQRLISLKNIHSVLE RNEMEAWQKLIRVLTHEIMNSITPIISLSETLSERGIPESLGEKEYSAMLQAMQTIHRRS KGLLGFVENYRRLTRIPTPIRTKVSVAELFMDLKKLFPEEYIHFEMPASDLYLYADRAQI EQVLINLLKNARETCERKIDKEIQIKFFSKDNPTLTISDNGEGILTDVLDKIFVPFFTTK TSGSGIGLSLCKQIMTLHDGSINVKSEIGKGSCFILTFPK >gi|225935324|gb|ACGA01000068.1| GENE 5 4928 - 6304 1263 458 aa, chain - ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 10 455 8 441 441 299 39.0 7e-81 MEEVNKLGKILIVDDNEDVLFALNLLLEPYTEKIKVTTTPDRIEHFMTTFHPDLILLDMN FSRDAISGQEGFESLKQILQMDPQAIVIFMTAYADTDKAVRAIKAGATDFIPKPWEKDKL LATLTSGMRLRQSQQEVSILKEQVEVLSGQNASENDIIGESSAMQEVFATINKLSNTDAN ILILGENGTGKDVIARLLYRCSPRYGKPFVTIDLGSIPEQLFESELFGFEKGAFTDAKKS KAGRMEVATNGTLFLDEIGNLSLPMQSKLLTAIEKRQISRLGSTQAVPIDVRLICATNAD IQQMVEDGNFRQDLLYRINTIEIHIPPLRERGNDIILLAEHFLDRYARKYKKEMHGLTRE AKNKLLKYAWPGNVRELQHTIERAVILGDGSMLKPENFLFHATPKQKKDEEMVLNLEQLE RQAIEKALRISDGNISRAAEYLGITRFALYRKLEKLGL >gi|225935324|gb|ACGA01000068.1| GENE 6 6496 - 7968 1393 490 aa, chain + ## HITS:1 COG:no KEGG:BT_0857 NR:ns ## KEGG: BT_0857 # Name: not_defined # Def: outer membrane protein TolC, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 490 1 490 490 808 90.0 0 MNLKLIFIYLGLLPTELLIAQDQVTELSLKQVIEIARMESPDAQTARHSFRSAYWNYKYY RANYLPSLSLTSDPNLNRAINKITMGDGSVKFVEQNLLNTDLTLNLSQNLSWTGGSFFLE TSAQRMDLFSEHKYSWQTSPIMIGYRQSLFGYNSLKWDRRIEPVRYQEAKKSYVETLELV SANAINKFFALATAQSDYDIASFNYANADTLYRYAQGRYNIGTITENEMLQLELNRLTEE TNRMNALIEMDNCMQELRSYLGIHEDRELRVLVSPQIPDFSVNLNEALALAYENSPDIQT MERWKLESESAVAKARANAGLKADIYLRFGVTQTADKLPDAYRNLLDQQYVSIGISLPIL DWGRGKGQVRVARSNRDLVYTQVEQSRTDFELNVRKLVKQFNLQTQRVRIAARTDETAQR RSEVARKLYLLGKSTILDLNASISEKDSARRNYISALYNYWSLYYTLRSMTLYDFERNTI LTEDYHLLIE 
>gi|225935324|gb|ACGA01000068.1| GENE 7 8044 - 9294 1160 416 aa, chain + ## HITS:1 COG:YPO1498 KEGG:ns NR:ns ## COG: YPO1498 COG0845 # Protein_GI_number: 16121771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 56 406 55 410 420 83 23.0 6e-16 MDIKLEKKPWYIRYRYYLIGGLLFAAFLIYVITLSLGPRKLRIDAENIQIAEVKESNFME YVDVEGLIQPILTIKINTREAGSVERIVGEEGSLLQQGDTILILSNPDLLRNIEDQRDEW EKRMITYQEQEIEMEQKSLNLKQQALTNNYELERLKKSIALDREEFQMGVKSKAQLLVAE DEYSYKQKNAALQQESLRHDSTVTMIRKELIRNDRERERKKYERTRERLNSLVVTAPLKG QLSFVKVTPGQQVSSGESIAEIKVLDQYKIHTSLSEYYIDRITTGLPATVNYQGNKYPLK ISKVVPEVKDRVFDVDLVFTGDMPDNVRVGKSFRVQIELGQPEQALIIPRGNFYQSTGGQ WIYKVNASKTKAVRVPLNIGRQNPQQYEIVEGLQPGDWVITTGYDTFGDAEELILN >gi|225935324|gb|ACGA01000068.1| GENE 8 9433 - 10107 352 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 199 4 199 223 140 38 7e-33 MIKTINLQKIFKTEEVETWALNNVNIEVKQGEFVAIMGPSGCGKSTLLNILGLLDNPTGG EYYLNGTEVSRYTESQRTSLRKGVIGFVFQSFNLIDELNVYENIELPLLYMGISASERKK RVETAMERMSITHRSKHFPQQLSGGQQQRVAIARAVVSNPKLILADEPTGNLDSKNGKEV MGLLNELNKEGTTIVMVTHSQHDAGYADRIINLFDGQVVTEVSM >gi|225935324|gb|ACGA01000068.1| GENE 9 10135 - 12462 1440 775 aa, chain + ## HITS:1 COG:no KEGG:BT_0862 NR:ns ## KEGG: BT_0862 # Name: not_defined # Def: putative ABC-transporter permease protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 775 1 777 777 1167 74.0 0 MINHYLKVAFRNLVKYKTQSLVSIIGLAVGFTCFALSALWIRYEMTYDNFHEGAERIYLA GNKFDLQGDGFSYYSSSLLAEYLVKNCPEIEKSCRVYLGPGEKPVNYQETEYQLTQMEVD SNFISMFHITAIDGDNYLRLDKNQIAITDRAAQRIFGTESPIGKQLFFPQKNNDEKTIVA VVKSWEGHSLYSFDILLPYYDWEPDWGRQQCNTLFRVYPNSDIKALEQRLSKTEVEQNSH KRSISIPIAPLTTLRSTHPTDNVNVKLNHVRLFACIGALVIICGLCNYLTMHVTRIRMRK RELALRKVNGASNGSLLSLLLSELILLLAISSGVGLVLIELILPVFKRLSQIGESTSFFY TEVITYILALLLLTTGIAAILVHYVNRKTLLDNIKHKSTIHFSGWFYKGSILFQLFISIG FVFCTLFMLKQLNFLLNSKELGLERHNVGVISSYGFGNVPFDAILDQMPDVTERLQGFYT 
PIPKMMFSTCTLDEWEGKTDKEQRVSVESETINQAYVDFFQVDVLEGEMLNEKDGKETVV INEAAVKAFGWDKPIGKTLRMGVNEVCTVKGVIRDIYYNAPVHPVTPAVFFLPESKERGD FIFKFKEGTWNAVSQKLREEAYKGNPDRELYLVNIEEVYDAYMKSENTLCQLLSIVSFIC IAIAVFGIFSLVTLSCEQRRKEIAIRKVNGANVRVILNLFFKEYLILLAVASILAFPLGY LLMRRWLEEYVKQTPIEGWLYAVIFIGMGLVIFLSIIWRVWKAARQNPAEVIKSE >gi|225935324|gb|ACGA01000068.1| GENE 10 12479 - 14887 1356 802 aa, chain + ## HITS:1 COG:TM0351 KEGG:ns NR:ns ## COG: TM0351 COG0577 # Protein_GI_number: 15643119 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Thermotoga maritima # 7 395 7 402 404 67 23.0 1e-10 MKKQLSIIRTLFRFKTYTIINIVGLAVSVAATLIIVRYIHQELTVDHFCKDLDRLYLLTA QRSDGGIAIMDNTDRNHDPNFIDPMKNPEIEAYSYCVSFEDDYILFDNHRYQANVLVADS TFMQLMNYPVLSGITSHRKPDEVIITRKYARHLFGDESPLGKQLVFSTGNALTIAGIVDE PDTKSSLQFDLLAPVNQGKYTDWSRMGFCVARLAKGTDIAKYNEKISKPQSLICFGNRPI QFRLLSLKDFYFDKAVSSVSAAFQRGNKDHITVLSVVACMLLLVGIFNFINIYTVIILKR AREFGVKKVYGASGFQIFSQIYAENVCMVAVALLIIWMIVEVTAGMFASVYDIPVKSDVS FDLLLSFILLLGLPLVTSLFPFLRYNYSSPITSLRSVSVGGHSIVSRALFLFIQYVITFS LIVIALFFVRQLYTMLHTDLGYHTKDIISCRFLSFETMNKRYSSDEEWQRDYDEIQHKEQ VIRQKMDGCPYFAAWQYGDPPIQLEPQTTVECNGEKHKMAITFASNGYMRMFGLKLKEGR MWNDKDQFAQYKMIINETARKLFRIEKIDEASLQTESRLWWSQGIDLGKNPPFQVVGVIE DFRTGHLSKGDAPLAILYEENGNPTDPLLATIVGGKRKEAIDFLKALHNEVLGEGEFEYS FVEDQVEKLYDDDKRTTRIYVTFAGLAICVSCLGLFGLSLYDIRQRYREIALRKVNGATG GQIALLLVRKYLYILGAAFVVAIPLVYYIINDYTKDFAVKAPIGVGIFVAGFILTSLISL GTLLWQVRKAVRINPGVIMKNE Prediction of potential genes in microbial genomes Time: Fri May 13 11:19:32 2011 Seq name: gi|225935323|gb|ACGA01000069.1| Bacteroides sp. D2 cont1.69, whole genome shotgun sequence Length of sequence - 12322 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 9, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1615 1297 ## COG5492 Bacterial surface proteins containing Ig-like domains + Term 1643 - 1686 9.3 - Term 1631 - 1674 11.0 2 2 Op 1 . 
- CDS 1787 - 2362 414 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12)
- Term 2378 - 2437 19.2
3 2 Op 2 . - CDS 2452 - 3636 1108 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes
- Prom 3688 - 3747 7.0
+ Prom 3757 - 3816 7.0
4 3 Tu 1 . + CDS 3857 - 4882 923 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase
+ Prom 4896 - 4955 1.6
5 4 Op 1 . + CDS 4991 - 6754 1959 ## COG0173 Aspartyl-tRNA synthetase
6 4 Op 2 . + CDS 6796 - 7140 307 ## BT_0874 hypothetical protein
+ Term 7165 - 7222 10.2
- Term 7151 - 7210 9.0
7 5 Op 1 5/0.000 - CDS 7263 - 8147 950 ## COG0388 Predicted amidohydrolase
8 5 Op 2 . - CDS 8215 - 9336 959 ## COG2957 Peptidylarginine deiminase and related enzymes
- Prom 9358 - 9417 5.0
9 6 Tu 1 . - CDS 9420 - 9962 502 ## COG4739 Uncharacterized protein containing a ferredoxin domain
- Prom 9983 - 10042 3.7
+ Prom 9754 - 9813 4.4
10 7 Tu 1 . + CDS 9997 - 10182 178 ## gi|260175006|ref|ZP_05761418.1| hypothetical protein BacD2_24331
+ Term 10373 - 10422 7.0
+ Prom 10277 - 10336 8.8
11 8 Tu 1 . + CDS 10465 - 10923 361 ## BT_3062 hypothetical protein
+ Term 10943 - 10987 10.3
+ Prom 11471 - 11530 8.3
12 9 Op 1 . + CDS 11584 - 12069 336 ## gi|260175008|ref|ZP_05761420.1| hypothetical protein BacD2_24341
13 9 Op 2 .
+ CDS 12100 - 12312 107 ## gi|260175009|ref|ZP_05761421.1| hypothetical protein BacD2_24346 Predicted protein(s) >gi|225935323|gb|ACGA01000069.1| GENE 1 2 - 1615 1297 537 aa, chain + ## HITS:1 COG:YPMT1.20c KEGG:ns NR:ns ## COG: YPMT1.20c COG5492 # Protein_GI_number: 16082802 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Yersinia pestis # 118 224 110 218 220 66 43.0 2e-10 TVTFTLAGEGGATFTLPMASEMKIFDKFDEFKVNSEKHELTLALNVKEGDYAAIKAELTS SKGMEVAIVKATTRATSTPWGVELTSPTFKEDKTIDVNAKVTFTLPAGIEEGEFALLKVI VVGKDGTEHSATRVIVYTTEVSVESVALTDKSVAKDGTVKLIPTFTPATATNKNVTWKSS KSEVATIAADGTVTGVTEGETTITVTTEDGGFTATCKVKVTAGPTFENEGEAGIDGSSWD KAYTIKNKEQLVLLATRVNGNQPSTWNSKYYKLLDNIDLGTMEWTPIGALSSANRQFFGN FDGAGYSISGTLTIKDTEQYAGIFGYGNGTTIQNLSFKGSISEYKNSNGRFGSIIGQTQQ GKIINCHNTAALTGTATTGGIVGYSTRGTIIIACSNSGVITGSEVIGGITGDGEADITGC FNTANIIGVNNPSAGGIAGLCGKAIACWSAAGTISTNNDKGGIVGWIDPNNSDYCYWKTV VGISSAVGRGKSENCAAFSGNAPSTAQIKAMNDAWQAADATREYQFNATAGMIEKKP >gi|225935323|gb|ACGA01000069.1| GENE 2 1787 - 2362 414 191 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 8 188 3 182 185 164 45 4e-40 MQDIVNGRCGWCGTDELYVKYHDQEWGKLVTDDKMLFEFLVLESAQAGLSWITILKKREG YRKAFCDFDPERVAQMNDEDIERLMQFDGIVKNRLKIKATITNARQFLVVQKEFGSFYDY TLSFFPDRKPIINAFQSLSEIPVSSPESDAMSKDMKKRGFKFFGATICYAHLQAAGFIND HLADCICRKNK >gi|225935323|gb|ACGA01000069.1| GENE 3 2452 - 3636 1108 394 aa, chain - ## HITS:1 COG:BS_kbl KEGG:ns NR:ns ## COG: BS_kbl COG0156 # Protein_GI_number: 16078763 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Bacillus subtilis # 27 394 25 392 392 295 38.0 8e-80 MGLLQEKLAKYDLPQKFMAQGVYPYFREIEGKQGTEVEMGGHEVLMFGSNAYTGLTGDER VIEAGIKAMHKYGSGCAGSRFLNGTLDLHVQLEKELAAFVGKDEALCFSTGFTVNSGVIP ALTDRNDYIICDDRDHASIVDGRRLSFSQQLKYKHNDMADLEKQLQKCNPDSVKLIIVDG 
VFSMEGDLANLPEIVRLKHKYNATIMVDEAHGLGVFGKQGRGVCDHFGLTHEVDLIMGTF SKSLASIGGFIAADSSIINWLRHNARTYIFSASNTPAATASALEALHIIQDEPERLEALW EATNYALKRFREAGFEIGATESPIIPLYVRDTEKTFMVTKLAFDEGVFINPVIPPACAPQ DTLVRVALMATHTKDQIDRAVEKLVKAFKALDIL >gi|225935323|gb|ACGA01000069.1| GENE 4 3857 - 4882 923 341 aa, chain + ## HITS:1 COG:lin0768 KEGG:ns NR:ns ## COG: lin0768 COG1597 # Protein_GI_number: 16799842 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Listeria innocua # 5 299 1 299 309 115 29.0 8e-26 MNERMKKIKFVVNPISGTQSKELILNLLDEKIDKARYSWEVVNTERAGHAVEIAAKAAEE KTDIVVAIGGDGTINEIARSLVHTDTALGIIPCGSGNGLARHLHIPMEPKRALEVLNEGC MDVIDYGKINGTDFFCTCGVGFDAFVSLKFAHAGKRGVLTYLEKTLQESLKYEPETYELE TENGVSKYKAFLIACGNASQYGNNAYIAPQATLTDGLLDVTILEPFTVLDVPSLAFQLFN KTIDQNSRIKTFRCKQLCIRRTTPGVVHFDGDPMETDANVNIELIQRGLRVVVPQASEKD AANVLQRAQEYMNGIKLMNEAIVDNITDRNKKILKKLTKKV >gi|225935323|gb|ACGA01000069.1| GENE 5 4991 - 6754 1959 587 aa, chain + ## HITS:1 COG:BH1252 KEGG:ns NR:ns ## COG: BH1252 COG0173 # Protein_GI_number: 15613815 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Bacillus halodurans # 3 581 4 585 595 600 51.0 1e-171 MFRTHTCGELRISDVNKQVTLSGWVQRSRKMGGMTFIDLRDRYGITQLVFNEEINAELCE RANKLGREFVIQVTGTVNERFSKNANIPTGDIEIIVSELNVLNTAMTPPFTIEDNTDGGD DIRMKYRYLDLRRNAVRSNLELRHRMTIEVRKYLDSLGFIEVETPVLIGSTPEGARDFVV PSRMNPGQFYALPQSPQTLKQLLMVSGFDRYFQIAKCFRDEDLRADRQPEFTQIDCEMSF VEQEDIITTFEGMAKHLFKTLRGVELAEPFQRMSWADAMKYYGSDKPDLRFGMKFVELMD IMKGHGFSVFDNAAYVGGICAEGAATYTRKQLDALTDFVKKPQIGAKGMVYARVEADGTV KSSVDKFYTQEVLQQMKEAFGAKPGDLILILSGDDVMKTRKQLCELRLEMGAQLGLRDKN KFVCLWVIDFPMFEWSEEEGRLMAMHHPFTHPKEEDIPMLDTDPAAVRADAYDMVVNGVE VGGGSIRIHDAKLQAKMFEILGFTPEKAQAQFGFLMNAFKYGAPPHGGLAYGLDRWVSLF AGLDSIRDCIAFPKNNSGRDVMLDAPSAIDQTQLDELNLIVDLKEGE >gi|225935323|gb|ACGA01000069.1| GENE 6 6796 - 7140 307 114 aa, chain + ## HITS:1 COG:no KEGG:BT_0874 NR:ns ## 
KEGG: BT_0874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 114 114 180 88.0 1e-44 MNALIMALVVWLMMKEMSFDGDYMVANITAYLIAQIHNFIWCKYWIFPVENKKNSIWKQI LLFCSAFAVAYTAQFLFLVLLVEGLDVNEYLAQFLGLFIYGGANFLANKKITFQ >gi|225935323|gb|ACGA01000069.1| GENE 7 7263 - 8147 950 294 aa, chain - ## HITS:1 COG:XF2443 KEGG:ns NR:ns ## COG: XF2443 COG0388 # Protein_GI_number: 15839034 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Xylella fastidiosa 9a5c # 4 294 6 295 295 414 65.0 1e-116 MRKIKVGIIQQSNTADIKANLMNLAKSIEACAAHGAQLIVLQELHNSLYFCQTENTNLFD LAEPIPGPSTGFYSELAAANKVVLVTSLFEKRAPGLYHNTAVVFDRDGSIAGKYRKMHIP DDPAYYEKFYFTPGDIGFEPIQTSLGKLGVLVCWDQWYPEAARLMALKGAELLIYPTAIG WESSDTDDEKARQLNAWIISQRAHAVANGLPVISVNRVGHEPDPSGQTNGILFWGNSFVA GPQGEFLAQAGNDHPENIVVEIDMERSENVRRWWPFLRDRRIDEYDGLTKRFLD >gi|225935323|gb|ACGA01000069.1| GENE 8 8215 - 9336 959 373 aa, chain - ## HITS:1 COG:HP0049 KEGG:ns NR:ns ## COG: HP0049 COG2957 # Protein_GI_number: 15644680 # Func_class: E Amino acid transport and metabolism # Function: Peptidylarginine deiminase and related enzymes # Organism: Helicobacter pylori 26695 # 38 364 6 328 330 305 48.0 1e-82 MGIMVGLPSPSGSEKDLQLNFGKNMTVQVEMRAPHLPAEWHMQSGIQLTWPHAGTDWAYM LAEVQECFINIAREIAKRELLLIVTPEPEEVKKQIAATVNMNNVRFLECATNDTWARDHG AITMIDTGTPSLLDFTFNGWGLKFASELDNQITKHAVEAGALKGQYIDHLDFVLEGGSIE SDGMGTLLTTSECLLSPQRNGRLNQVEIEEYLKSTFHLQKVLWLDHGYLAGDDTDSHIDT LARFCSTDTIAYVKCENKEDEHYEALLAMEEQLKTFRTLAGEPYRLLALPMADKIEEDGE RLPATYANFLIMNDVILYPTYNQPANDKKAGEVLQQAFPSHQIIGIDCRALIKQHGSLHC VTMQYPLGVIKES >gi|225935323|gb|ACGA01000069.1| GENE 9 9420 - 9962 502 180 aa, chain - ## HITS:1 COG:AF2201 KEGG:ns NR:ns ## COG: AF2201 COG4739 # Protein_GI_number: 11499783 # Func_class: S Function unknown # Function: Uncharacterized protein containing a ferredoxin domain # Organism: Archaeoglobus fulgidus # 1 176 1 184 184 136 41.0 2e-32 MILNERDARHEHILQVARQMMTAARTAPKGKGIDIIEVALITDEEIKQLSDTMIAMVEEH 
GMKFFLRDADNILSAECVVLIGTREQTQGLNCGHCGFATCAERTDGVPCALNSIDVGIAI GSACATAADLRVDTRVMFSAGLAAQRLNWLKDCKMVMAIPVSASSKNPFFDRKPKQENNA >gi|225935323|gb|ACGA01000069.1| GENE 10 9997 - 10182 178 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175006|ref|ZP_05761418.1| ## NR: gi|260175006|ref|ZP_05761418.1| hypothetical protein BacD2_24331 [Bacteroides sp. D2] # 1 61 1 61 61 77 100.0 3e-13 MKIGKITLDDEINRSIGDVISDGLSGVVIDELGTVIISGEFGTVVSRESVSAIIRIKGLV V >gi|225935323|gb|ACGA01000069.1| GENE 11 10465 - 10923 361 152 aa, chain + ## HITS:1 COG:no KEGG:BT_3062 NR:ns ## KEGG: BT_3062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 11 163 165 110 41.0 1e-23 MLNFDKENKQEKYVAANVLIGSISYEKLCNEVNQRTGIHRAMVDVILKGAEDTMITFLEE GFSVRFGEFGSFRPAINAASKDKEEDVNAATIIRRKIVFTPGTKFKTMLGNASIELFSDR EVSASAGDEDETGGEKPDGGGGSGGEAPDPAA >gi|225935323|gb|ACGA01000069.1| GENE 12 11584 - 12069 336 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175008|ref|ZP_05761420.1| ## NR: gi|260175008|ref|ZP_05761420.1| hypothetical protein BacD2_24341 [Bacteroides sp. D2] # 1 161 1 161 161 302 100.0 5e-81 MKLSFLLLISFCFYLNGNAQTYKVEIEDITNKIDIRTSSYQPRHRIEGSSLTSHEIGKVL LKNNEMIKQLYTFLEEDINQSNSNPETLSKNGKYKAFIEGGWTDKDKKVIWFRMYAKSKI KEVTWDEGIIVNSWIFSNMMDKVMGNEPKYQNEKQRYFNGF >gi|225935323|gb|ACGA01000069.1| GENE 13 12100 - 12312 107 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175009|ref|ZP_05761421.1| ## NR: gi|260175009|ref|ZP_05761421.1| hypothetical protein BacD2_24346 [Bacteroides sp. D2] # 1 70 1 70 70 124 100.0 2e-27 MVICNYIKQSLVLVFLIVSLVAVCQNNGYKVINKGHGKIAIFYNGKTTNFIQEEANRLKR DSRFHKNSRL Prediction of potential genes in microbial genomes Time: Fri May 13 11:20:01 2011 Seq name: gi|225935322|gb|ACGA01000070.1| Bacteroides sp. 
D2 cont1.70, whole genome shotgun sequence
Length of sequence - 24785 bp
Number of predicted genes - 28, with homology - 26
Number of transcription units - 15, operones - 7 average op.length - 2.9
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 16 - 180 183 ## gi|260175010|ref|ZP_05761422.1| hypothetical protein BacD2_24351
+ Prom 184 - 243 2.3
2 2 Tu 1 . + CDS 263 - 664 272 ## gi|260175011|ref|ZP_05761423.1| hypothetical protein BacD2_24356
+ Term 726 - 765 -0.6
+ Prom 709 - 768 4.7
3 3 Op 1 . + CDS 871 - 1068 186 ## gi|260175012|ref|ZP_05761424.1| hypothetical protein BacD2_24361
4 3 Op 2 . + CDS 1081 - 1278 224 ## gi|260175013|ref|ZP_05761425.1| hypothetical protein BacD2_24366
5 3 Op 3 . + CDS 1275 - 1463 239 ## gi|260175014|ref|ZP_05761426.1| hypothetical protein BacD2_24371
+ Prom 1553 - 1612 7.3
6 4 Tu 1 . + CDS 1689 - 1838 131 ## gi|260175016|ref|ZP_05761428.1| hypothetical protein BacD2_24381
+ Term 1888 - 1929 9.7
+ Prom 1895 - 1954 2.1
7 5 Op 1 . + CDS 1979 - 3535 1380 ## BT_1828 hypothetical protein
8 5 Op 2 . + CDS 3572 - 4192 208 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
9 5 Op 3 . + CDS 4240 - 5034 395 ## COG0390 ABC-type uncharacterized transport system, permease component
10 5 Op 4 . + CDS 5031 - 5327 216 ## BT_0880 hypothetical protein
11 6 Tu 1 . - CDS 5347 - 5760 335 ## BT_0881 hypothetical protein
- Prom 5787 - 5846 5.8
- Term 5872 - 5927 16.1
12 7 Tu 1 . - CDS 5993 - 6496 505 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 6716 - 6775 8.9
+ Prom 6483 - 6542 6.9
13 8 Op 1 3/0.000 + CDS 6775 - 7869 804 ## COG3323 Uncharacterized protein conserved in bacteria
14 8 Op 2 . + CDS 7875 - 8705 997 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding
+ Term 8715 - 8752 -0.6
+ Prom 8788 - 8847 4.3
15 9 Op 1 13/0.000 + CDS 8867 - 10264 512 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12
16 9 Op 2 27/0.000 + CDS 10298 - 11431 978 ## COG0845 Membrane-fusion protein
17 9 Op 3 . + CDS 11486 - 14641 2682 ## COG0841 Cation/multidrug efflux pump
+ Term 14718 - 14771 13.1
- Term 14711 - 14755 10.0
18 10 Tu 1 . - CDS 14810 - 15259 337 ## BT_0887 hypothetical protein
- Prom 15313 - 15372 6.2
+ Prom 15214 - 15273 5.3
19 11 Op 1 . + CDS 15323 - 16099 692 ## COG0775 Nucleoside phosphorylase
20 11 Op 2 . + CDS 16123 - 17142 646 ## COG1466 DNA polymerase III, delta subunit
+ Term 17171 - 17226 6.0
+ Prom 17874 - 17933 7.0
21 12 Op 1 . + CDS 18149 - 18289 61 ##
22 12 Op 2 . + CDS 18353 - 18823 347 ## BT_0890 hypothetical protein
23 12 Op 3 13/0.000 + CDS 18900 - 19676 582 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
24 12 Op 4 . + CDS 19664 - 20575 748 ## COG0167 Dihydroorotate dehydrogenase
+ Term 20729 - 20795 22.8
- Term 20722 - 20778 15.5
25 13 Tu 1 . - CDS 20792 - 21469 742 ## COG0336 tRNA-(guanine-N1)-methyltransferase
- Prom 21553 - 21612 7.5
+ Prom 21405 - 21464 7.4
26 14 Op 1 . + CDS 21545 - 23545 1711 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II)
27 14 Op 2 . + CDS 23605 - 24498 762 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
+ Term 24545 - 24596 16.4
28 15 Tu 1 . - CDS 24427 - 24573 81 ##
- Prom 24599 - 24658 7.9
Predicted protein(s)
>gi|225935322|gb|ACGA01000070.1| GENE 1 16 - 180 183 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175010|ref|ZP_05761422.1|
## NR: gi|260175010|ref|ZP_05761422.1| hypothetical protein BacD2_24351 [Bacteroides sp.
D2] # 1 54 6 59 59 90 100.0 3e-17 MIGLYKGVELQYNWFLQRKVGKVIDIYKGDVNSEARKIVIEEGDKYFNKVFSNK >gi|225935322|gb|ACGA01000070.1| GENE 2 263 - 664 272 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175011|ref|ZP_05761423.1| ## NR: gi|260175011|ref|ZP_05761423.1| hypothetical protein BacD2_24356 [Bacteroides sp. D2] # 1 133 1 133 133 262 100.0 6e-69 MQIENIWNNIVDWFNDRSERGRLIRSFNAAARESFIQGIAPTMLKASLSKGNREYKHQFS SWMNSGFRIQALSGRPLTKDEMTFIGQTILSDTMLIRRLIALGWDTLEVHDDSGIYGCRW KMIDYASMGGLLT >gi|225935322|gb|ACGA01000070.1| GENE 3 871 - 1068 186 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175012|ref|ZP_05761424.1| ## NR: gi|260175012|ref|ZP_05761424.1| hypothetical protein BacD2_24361 [Bacteroides sp. D2] # 1 65 71 135 135 124 100.0 2e-27 MDSAFGARNLYDKLKGKCHPDNFSTNLILFDRATEIFALIVKNKYNYRELILLKERAEKE LNINI >gi|225935322|gb|ACGA01000070.1| GENE 4 1081 - 1278 224 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175013|ref|ZP_05761425.1| ## NR: gi|260175013|ref|ZP_05761425.1| hypothetical protein BacD2_24366 [Bacteroides sp. D2] # 1 65 1 65 65 100 100.0 3e-20 MEGILLLVVAVALAYGVGCMGRNRKIGFGWAFFFALINVILGLIIVLCSKKKDVSFVDMN KEDKQ >gi|225935322|gb|ACGA01000070.1| GENE 5 1275 - 1463 239 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175014|ref|ZP_05761426.1| ## NR: gi|260175014|ref|ZP_05761426.1| hypothetical protein BacD2_24371 [Bacteroides sp. D2] # 1 62 1 62 62 111 100.0 1e-23 MKILGWILLILGILSFIGAISQGHNVVGPAFFGGLGAYLLSRINKKKQEEQDKDKWVNGN NS >gi|225935322|gb|ACGA01000070.1| GENE 6 1689 - 1838 131 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175016|ref|ZP_05761428.1| ## NR: gi|260175016|ref|ZP_05761428.1| hypothetical protein BacD2_24381 [Bacteroides sp. 
D2] # 1 49 1 49 49 62 100.0 9e-09 MKVIFSFILFLVAIKCFRAAYIEMRDGTSWWVMWGIFAVLAVLGAIALF >gi|225935322|gb|ACGA01000070.1| GENE 7 1979 - 3535 1380 518 aa, chain + ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 520 809 77.0 0 MSNKIYPIGIQNFEKIRKGGYCYIDKTAWIYQMVKTGSYYFLSRPRRFGKSLLLSTLEAY FQGKKKLFEGLAIEKLEKDWFKYPILHIDLNTEKYDTPERLENKLNRTLVEWEKIYGAES AETSLAMRFEGIIQRACEKEGQSVVILVDEYDKPMLQAIGNDELQKSFRDTLKAFYGALK SKDGCIKFGMLTGVTKFGKVSVFSDLNNLEDISMRQQYIEICGISDRELHENFETELHEF ADAQGLTYDEICTEMRERYDGYHFTHDSIGMYNPFSVLNTLKYNVFGNYWFETGTPTYLV ELLKKHHYDLHRMAHEETSADVLNSIDSTSDNPIPVIYQSGYLTIKGYDREFETYRLGFP NREVEEGFVKYLMPFYANINAVESSFEIQKFVREVRSGDYDSFFRRLQSFFADTPYELVR DLELHYQNVLFIVFKLVGFYVKAEYHTSQGRIDLVLQTDKFIYVMEFKLEGTAEEALQQI NEKHYAKPFESDGRTLFKIGVNFSAETRNIEKWVAELQ >gi|225935322|gb|ACGA01000070.1| GENE 8 3572 - 4192 208 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 197 1 202 245 84 30 5e-16 MLHINNACIAFGTEVLFSGFSMKLERGETACIVGQSGCGKTSLLNAVMGFVPLKEGSIQV GETLLDISTIDSVRRQIAWIPQELALPFEWVKEMVALPFGLKVNRSVPFSEERLFTCFDE LGLEHELYTKRVNEVSGGQRQRIMLAVAAMLNKPLIIIDEPTSALDAGSTGKVLSFFRRQ AERGTAVLAVSHDKDFASGCHYLIEL >gi|225935322|gb|ACGA01000070.1| GENE 9 4240 - 5034 395 264 aa, chain + ## HITS:1 COG:RSc0748 KEGG:ns NR:ns ## COG: RSc0748 COG0390 # Protein_GI_number: 17545467 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Ralstonia solanacearum # 31 256 35 260 272 95 31.0 1e-19 MGTIDISYYNLFIGLLLLAIPFFYLWKFKTGLLKPAVIGTLRMIIQLFFIGMYLKYLFLW NNPWINFLWVIIMVFVAGQTALVRTQLKRSILLIPITVGFLCSVVLVGIYFIGIVLQLDN IFSAQYFIPIFGILMGNMLSSNVIALNTYYSGLKREQQLYRYLLGNGATRQEAQAPFIRQ AIIKSFSPLIANIAVMGLVALPGTMIGQILGGSSPNVAIKYQMMIMVITFTASMLSLMIT ISLASRRSFDAYGKLLEVTKEPRK >gi|225935322|gb|ACGA01000070.1| GENE 10 5031 - 5327 216 98 aa, chain + ## HITS:1 
COG:no KEGG:BT_0880 NR:ns ## KEGG: BT_0880 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 1 88 92 134 75.0 8e-31 MTVIDQIFHKVAEIAIPHFFITVEFSASGTEMPEHIEAFLQEKYEAILRGASGRKFIYKE GEWRLIFTFFPTDRVVDERYALKNKVQMINKVQMKSKS >gi|225935322|gb|ACGA01000070.1| GENE 11 5347 - 5760 335 137 aa, chain - ## HITS:1 COG:no KEGG:BT_0881 NR:ns ## KEGG: BT_0881 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 138 164 61.0 7e-40 MKTNLTLLLLLVMMVSCQPKKSDSIEMSVAEAAKIDNIEDCAADTVKATAIFWIDKVETK HCKEYGFRTIKARVLIREDGKVDLLSFVKKQSPDVEKYIRHHLSKFQVTEKMFEGGYVQP GEQFVQLRCLWGMLKGK >gi|225935322|gb|ACGA01000070.1| GENE 12 5993 - 6496 505 167 aa, chain - ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 98 36.0 6e-21 MKSLSFRKDLVGVQEELLRFAYKLTTDREEANDLLQETSLKALDNEDKYTPDTNFKGWMY TIMRNIFINNYRKVVRDQTFVDKTDNLYHLNLPQDAGFESTERTYDLKEMHRVVNALPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR >gi|225935322|gb|ACGA01000070.1| GENE 13 6775 - 7869 804 364 aa, chain + ## HITS:1 COG:BH1380_2 KEGG:ns NR:ns ## COG: BH1380_2 COG3323 # Protein_GI_number: 15613943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 132 234 7 109 113 115 53.0 1e-25 MKIKEIVSALERFAPLPLQDGFDNAGLQIGLTEAEATGALLCLDVTEAVLDEAIALGYNL VISHHPLIFKGYKSITGKDYVERCILKAIKNDIVIYSAHTNLDNAQGGVNYKIAEKIGLK NLKVLEPKENCLIKLVTFVPDAQADSVREVLFAAGCGNIGNYDSCSYNLKGEGTFRAKEG THPFCGTIGELHHENEVRIETILPIYKKAEVIKALLSVHPYEEPAFDLYPLQNDWLQAGS GIVGELDESETELEFLKRIKKIFEVGCVRHNKLTGREIQKVALCGGAGAFLLPQAIRTGA DVFITGEIKYHDYFGHEGDILMAEIGHYESEQYTKEIFYSIIRDLFPNFALQLSKINTNP IKYL >gi|225935322|gb|ACGA01000070.1| GENE 14 7875 - 8705 997 276 aa, chain + ## HITS:1 COG:TP0494 KEGG:ns NR:ns ## COG: 
TP0494 COG1579 # Protein_GI_number: 15639485 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Treponema pallidum # 18 244 7 232 273 62 22.0 9e-10 MAREAKKDPNELTVEQKLKTLFQLQTMLSKIDEIKTLRGELPLEVQDLEDEIAGLSTRID KIKAEVDELKSAIAGKRVEIETAKASVEKYKSQQDNVRNNREYDFLTKEIEFQTLEIELC EKRIKEYSADKEEKEAEVTKNDQILNERLKDLEQKKSELDEIISETKQEEEKLRDKAKDL ETKIEPRLLQSFKRIRKNSRNGLGIVYVQRDACGGCFNKIPPQRQLDIRSRKKVIVCEYC GRIMIDPELAGVQIEHKVEEAPVATTKRAIRRKTAE >gi|225935322|gb|ACGA01000070.1| GENE 15 8867 - 10264 512 465 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 49 464 41 457 460 201 29 3e-51 MNIRVWGTSLLMFLSTVTAVGQTANRYLKAPLPQDWEESGEVFQQILPVDDHWWKSFQDT KLDSLIALAVDRNYSVAMAINRIAAARANLWIERSNFFPSIGLNAGWTRQETSGNTSSLP QTTDHYYDASLSMSWELDIFGSIRKRVKAQKENFAASKEEYTGVMVSLAAEVASAYINLR ELQQELEVVKKNTASQEEVLKITEVRYNTGLVAKLDVAQAKSVLYSTKASIPQLEAGINQ YITTLAVLLGMYPQDIRPVLESSGTLPDYMEPIGVGMPVDLLLRRPDVRSAELSVNAQAA LLGASKADWLPKVFLKGSFGYAARDLKDLTKSKSMTYEIAPSLNWTIFNGGQLVNATRLA KAQLDEAINQFNQTVLTAVQETDNAMNSYRNSIKQIVALREVRNQGIETLKLSLELYKQG LSPFQNVLDAQRSLLSYENQLVQAQGSSLLQLIALYKALGGGWRE >gi|225935322|gb|ACGA01000070.1| GENE 16 10298 - 11431 978 377 aa, chain + ## HITS:1 COG:mll7356 KEGG:ns NR:ns ## COG: mll7356 COG0845 # Protein_GI_number: 13476125 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 29 369 49 379 384 159 31.0 6e-39 MKKLMYIFLALPILTGCKEKKSTGAMGGMPTPEISVTKPIVEDITLTKDYPGYLTTEKTV NLVARVNGTLQSTSYVAGGRVKQGQLLFVIEPTLYKDQVEQAEAELKTTQAQLEYARNNY SRMKEAVKSDAVSQIQVLQAESSVKEGVAAVSNAEAALSTARTNLGYCYVRAPFDGTISK ATVDIGSYVGGSLQPVTLATIYKDNQMYAYFNVADNQWLAMTMDTQQLPTDLPKKIMVQL GKGGTESYPATLDYLSPNVDVNTGTLMVRANFDNPKGILKSGLYVSITLPYGEAKNAMLV KEGSIGTDQLGKYLYIVNDSNIVHYRHIEIGQLVDGTLRQVLGGLSPQEQYVTEALMKVR DGMKVKPLPDSLPKRGK >gi|225935322|gb|ACGA01000070.1| GENE 17 11486 - 14641 2682 1051 aa, chain + ## HITS:1 
COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 3 1029 2 1027 1051 865 44.0 0 MFSKFFINRPIFATVLALLIVVAGLVTLNILPVAQFPEITPPTVQVSAVYPGANAETVAQ TVGIPIEQQVNGVDGMLYMSSNSSSSGAYSLTITFAVGTDIDMATVQVQNRVSIAQSSLP EPVVVQGVTVQKQSSNIVMFLTMTSQDSVYNSLYLTNYAKLNLVDQLTRVPGVGAVNVMG AGDYSMRIWLDPEAMRIRNISPQQVYQSIQSQNVEVSAGYIGQPIGQDNNNAFQYTLNVQ GRLKSPEQFGNIIIRREQDGAMLRLKDIARIDLGSASYSVVSRLNGKPTAAIAIYQQPGS NSLDVSKGVKAKMEELAESFPSGVAYNVTLDTTDVIHASIDEVMVTFFETTLLVILVIFL FLQNWRAVIIPCITIPVSLIGTFAVMAAFGFSINTLTLFGLILAVAIVVDDAIVVVENAS RLLETGQYSPREAVTKAMGEITGPIVGVVLVLLAVFIPTMMISGISGQLYKQFALTIAAS TVLSGFNSLTLTPALCALFLEKSKPSNFFIYKGFNKAYDKTQGVYDKIVKWLLQRPGMAL ASYGALTVIALLLFMHWPSTFIPDEDDGYFIAVVQLPPAASLERTQAVGEKINGILDSYP EVKNYIGISGFSIMGGGEQSNSATYFVVLKNWSERKGKEHTAAAIVNRFNGEAYMTIQAA EVFAMVPPAIPGLGASGGLQLQLEDRRNLGPTEMQQAINALLASYHSKPTLASVSSQYQA NVPQYFLNIDRDKVQFMGIALNDVFSTLGYYMGAAYVNDFVEFGRIYQVKIEARDQAQRV IDDVLKLSVPNSAGEMVPFSSFTKVEEQLGQDQINRYNMYSTASLTCNVAPGSSTGQAIQ EVEALFKEQLGDEFGYEWTSVAYQETQAGNTTTIVLIMALIVAFLVLAAQYESWTSPVAA VIGLPVALLGAMIGCLIMGTPVSIYTQIGIILLVALSAKNGILIVEFARDFRAEGNSIRE AAFEAGHVRLRPILMTSFAFVLGVMPLLFATGAGAQSRIALGAAVVFGMAMNTLLATIYI PNFYEFMQKLQERFSKKKGNEDGGKDAAMQK >gi|225935322|gb|ACGA01000070.1| GENE 18 14810 - 15259 337 149 aa, chain - ## HITS:1 COG:no KEGG:BT_0887 NR:ns ## KEGG: BT_0887 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 286 95.0 1e-76 MLSLNLPVFDTKINVRNGKNVIFDVIRKRYVALTPEEWVRQHFVHFLIAHKGYPTALLAN EVMVKLNGTTKRCDTVLYRRDLSARMIVEYKAPHIEITQAVFDQITRYNMVLKVDYLIVS NGMQHYCCRMDYENQTYTFLQDIPDYHSL >gi|225935322|gb|ACGA01000070.1| GENE 19 15323 - 16099 692 258 aa, chain + ## HITS:1 COG:CPn0894 KEGG:ns NR:ns ## COG: CPn0894 COG0775 # Protein_GI_number: 15618803 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Chlamydophila pneumoniae 
CWL029 # 3 255 6 263 293 205 39.0 6e-53 MKTKEEIVANWLPRYTKRNLEDFGEYILLTNFNKYVEIFANQFNVPILGRDANMISASAE GITMINFGMGSPNAAIIMDLLGAIQPKACLFLGKCGGIDKKNQLGDLILPIAAIRGEGTS NDYFPPEVPALPAFMLQRAVSSSIRDKGRDYWTGTVYTTNRRIWEHDDVFKEYLKKTRAM AVDMETATLFSCGFANHIPTGALLLVSDQPMTPDGVKTDRSDNLVTRNYVEEHVEIGIAS LRMIIDEKKTVKHLKFDW >gi|225935322|gb|ACGA01000070.1| GENE 20 16123 - 17142 646 339 aa, chain + ## HITS:1 COG:BS_yqeN KEGG:ns NR:ns ## COG: BS_yqeN COG1466 # Protein_GI_number: 16079610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus subtilis # 10 317 4 319 347 80 24.0 3e-15 MAKQELTCDDILKELRAKQYRPIYYLMGEESYYIDLIADYITDNVLSETEKEFNLTVVYG ADVDVATIINAAKRYPMMSEHQVVVVKEAQAVRNMEELSYYLQKPLLSTILVICHKHGTL DRRKKLAAEVDKVGVLFESKKIKDAQLPGFIASYMKRKGVDMEPKATVMLADFVGSDLSR LTGELEKLIITLPAGQKRVTPEQIEKNIGISKDYNNFELRSALVEKDILKANKIIKYFEE NPKTNPIQMTLSLLFSFYSNLMLAYYAPDKSEQGIANMLGLRTPWQARDYMAAMRKYSGV KTMQIVGEIRYADAKSKGVQNSSMTDGDILRELVFKILH >gi|225935322|gb|ACGA01000070.1| GENE 21 18149 - 18289 61 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIITRFILYASVIFDLYNVIYFCNINIFIFVICFCESNYLPLYNQ >gi|225935322|gb|ACGA01000070.1| GENE 22 18353 - 18823 347 156 aa, chain + ## HITS:1 COG:no KEGG:BT_0890 NR:ns ## KEGG: BT_0890 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 242 89.0 4e-63 MEIKDRIRMIMEREKVPPRVFAETIGVQQSTLSHILNDRNKPSLEVVMKVHQKYDYVNLE WLLYGKGEMMVSEEGTSFSSSNHDYLPSLFDENPVNPSKEPTLPENRKETPLRNAENAPK EIVKQEIRYIEKPARKITEIRIFFDDNTYETFRPEK >gi|225935322|gb|ACGA01000070.1| GENE 23 18900 - 19676 582 258 aa, chain + ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 8 255 7 258 259 167 36.0 1e-41 
MKKFILDLTVTENIRLNANYVLLKLTSQSLLPEMLPGQFAELRVDGSPTTFLRRPISINF VDKQRNEVWFLIQLVGDGTRRLAEVNPGETINVVLPLGNAYTMPLEASDKLLLVGGGVGT APMLYLGEQLAKKGHKPTFLLGARSDKDLLQLEEFAKYGEVYTTTEDGSHGEKGYVTQHS ILNKVRFEQIYTCGPKPMMVAVAKYAKSNQIECEVSLENTMACGIGACLCCVENTTEGHL CVCKEGPVFNINKLLWQI >gi|225935322|gb|ACGA01000070.1| GENE 24 19664 - 20575 748 303 aa, chain + ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 4 299 2 297 311 300 49.0 3e-81 MADLSVNIGELQMKNPVMTASGTFGYGEEFSDFIDIARIGGIIVKGTTLHKREGNPYPRM AETPSGMLNAVGLQNKGVDYFVEHIYPRIKDIQTNMIVNVSGSAIEDYVKTAEIINELDK IPAIELNISCPNVKQGGMAFGVSAKGASEVVKAVRSAYKKTLIVKLSPNVTDITEIARAA EESGADSVSLINTLLGMAIDAERKRPILSTITGGMSGAAVKPIALRMVWQVAKVINIPVI GLGGIMDWKDAVEFMLAGATAIQIGTANFIDPAVTIKVEDGINNYLERHGCKSVKEIIGA LEV >gi|225935322|gb|ACGA01000070.1| GENE 25 20792 - 21469 742 225 aa, chain - ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 223 1 224 246 247 50.0 1e-65 MRIDIITVLPEMIEGFFNCSIMKRAQDKGLAEIHIHNLRDYTEDKYRRVDDYPFGGFAGM VMKIEPIERCINALKAERDYDEVIFTTPDGEQFNQPMANTLSLAQNLIILCGHFKGIDYR IREHLITKEISIGDYVLTGGELAAAVMADAIVRIIPGVISDEQSALSDSFQDNLLAAPVY TRPADYKGWKVPDILLSGHEAKIKEWELQQSLERTRKLRPDLLGE >gi|225935322|gb|ACGA01000070.1| GENE 26 21545 - 23545 1711 666 aa, chain + ## HITS:1 COG:BH0649 KEGG:ns NR:ns ## COG: BH0649 COG0272 # Protein_GI_number: 15613212 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Bacillus halodurans # 2 666 5 668 669 556 45.0 1e-158 MDIKEKIEELRAELHRHNYNYYVLNAPEISDKEFDDKMRELQDLEQAHPEYKDENSPTMR VGSDLNKNFTQVAHKYPMLSLANTYSEAEVTDFYDRVRKALNEDFEICCEMKYDGTSISL TYENGKLVRAVTRGDGEKGDDVTDNVKTIRSIPLVLHGDNYPASFEIRGEILMPWEVFEE 
LNREKEAREEPLFANPRNAASGTLKLQNSSIVASRKLDAYLYYLLGDNLPCDGHYENLQE AAKWGFKISDLTRKCQTLEEVFEFINYWDVERKNLPVATDGIVLKVNSLRQQKNLGFTAK SPRWAIAYKFQAERALTRLNKVTYQVGRTGAVTPVANLDPVQLSGTVVKRASLHNADIIE GLDLHIGDMVYVEKGGEIIPKITGVDKDARSFMLGEKVRFIVNCPECGSKLVRYEGEAAH YCPNETACPPQIKGKIEHFISRKAMNIDGLGPETVDMFYRLRLIKNTADLYKLTADDIKG LDRMGEKSAENIITGIAQSKTVPFERVIFALGIRFVGETVAKKIAKSFENIDDLQQADLE KLVSIDEIGEKIAQSILAYFANESNRELVAKLKEAGLQLYRTEEDLSGYTDKLAGQSIVI SGVFVHHSRDEYKELIEKNGGKNVGSISAKTSFILAGDNMGPAKLEKAKKLGITILSEDE FLKLIS >gi|225935322|gb|ACGA01000070.1| GENE 27 23605 - 24498 762 297 aa, chain + ## HITS:1 COG:BH1742 KEGG:ns NR:ns ## COG: BH1742 COG0329 # Protein_GI_number: 15614305 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Bacillus halodurans # 12 277 9 273 295 256 47.0 3e-68 MIQTKLKGMGVALITPFKEDESVDYDALMRMVDYLLQNNADFLCVLGTTAETPTLTEEEK KTIKKMVIDRVNGRIPILLGVGGNNTRAIVETLKNDDFTGVDAILSVVPYYNKPSQEGIY QHYKAIAEATELPIVLYNVPGRTGVNMTAETTLRIARDFKNVVAIKEASGNITQMDDIIK NKPENFNVISGDDGITFPLITLGAVGVISVIGNAFPREFSRMTRLALQGDFANALTIHHR FTELFNLLFVDGNPAGVKSMLNAMGMIENKLRLPLVPTRITTFEAIRKVLNELNIKC >gi|225935322|gb|ACGA01000070.1| GENE 28 24427 - 24573 81 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIEFNHWDIYLKHLDVYYEFLAEEASAFNIQLIQNLAYCLKCGNASRN Prediction of potential genes in microbial genomes Time: Fri May 13 11:20:54 2011 Seq name: gi|225935321|gb|ACGA01000071.1| Bacteroides sp. D2 cont1.71, whole genome shotgun sequence Length of sequence - 6767 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 80 - 132 5.7 1 1 Tu 1 . - CDS 204 - 551 243 ## gi|160883009|ref|ZP_02064012.1| hypothetical protein BACOVA_00972 - Prom 582 - 641 3.8 + Prom 1857 - 1916 6.1 2 2 Tu 1 . + CDS 1936 - 3144 956 ## COG5026 Hexokinase + Term 3199 - 3249 13.0 - Term 3243 - 3290 -0.9 3 3 Tu 1 . 
- CDS 3299 - 5293 1729 ## BT_1871 putative alpha-glucosidase - Prom 5346 - 5405 8.7 4 4 Op 1 . - CDS 5409 - 6506 776 ## Cpin_2851 hypothetical protein 5 4 Op 2 . - CDS 6575 - 6766 168 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|225935321|gb|ACGA01000071.1| GENE 1 204 - 551 243 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160883009|ref|ZP_02064012.1| ## NR: gi|160883009|ref|ZP_02064012.1| hypothetical protein BACOVA_00972 [Bacteroides ovatus ATCC 8483] # 1 115 1 115 115 219 100.0 6e-56 MYGDFNRIVVQLTQHPVMYKPLSDLTYTECELAYALIRELIDLSIEGDYTLLDYIQMARL EYYLGELSCKISCSREETALHYAGALHLLEKGGFDLGIKKWVELVSLRIENSKKE >gi|225935321|gb|ACGA01000071.1| GENE 2 1936 - 3144 956 402 aa, chain + ## HITS:1 COG:TP0505 KEGG:ns NR:ns ## COG: TP0505 COG5026 # Protein_GI_number: 15639496 # Func_class: G Carbohydrate transport and metabolism # Function: Hexokinase # Organism: Treponema pallidum # 6 287 20 306 444 127 31.0 5e-29 MEKNIFQLDNEQLKEIARSFKAKVEEGLNTENAEIQCIPTFITPKASSINGKSLVLDLGG TNYRVAIVDFSKVPPTIHPNNGWKKDMSVMKSPGYTREELFKELADMITGIKREKEMPIG YCFSYPTESVPSGDAKLLRWTKGVDIKEMIGEVVGKPLLDYLNERNKIKFTNIKVLNDTV ASLFAGLTDSSYDAYIGLIVGTGTNMATFIPADKIKKLSPSHKVDGLIPVNLESGNFHPP FLTAVDNTVDVISDNPGRQRFEKAVSGMYLGDILKATFPLEEFEEKFDAQKLTSIMNYPD IYKEVYVQVAQWIYGRSAQLVAASLTGLIMLLKSYNKEIRRICLVAEGSLFWSKNRKDKN YNMIVMEKLRELFSLFGLEDVEIDIKSMNNANLIGTGIAALS >gi|225935321|gb|ACGA01000071.1| GENE 3 3299 - 5293 1729 664 aa, chain - ## HITS:1 COG:no KEGG:BT_1871 NR:ns ## KEGG: BT_1871 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 664 1 662 662 1218 85.0 0 MKKLKILLFLCVVACTLTVQAQKQFTLNSPDGKLQTTITVGDKLTYDIHCDGRQILASSP ISMTLDNGEVWGEKAKLAGTSGKKVDQMVPSPFYRANELRDYYNELTLRFKKDWNVEFRA YNDGIAYRFVSRSKKPFNVVDETVDYRFPSDMVASVPYVKAGKDGDYESQYFNSFENTYT TDRLSKLNKKRLMFLPLVVDAGEGVKICITESGLENYPGLYLSAAEGENRLTGKFAPYPK KTVQGGHNQLQMLVKEREAYIAKVDKPRSFPWRMSIVTTSDKDLAASNLSYLLAAPSRLT DISWIKPGKVAWDWWNNWNLDGVDFITGVNNPTYKAYIDFASANGIEYVILDEGWAVNLQ 
ADLMQVVKDINLKELVDYAASKNVGIILWAGYYAFDRDMENVCRYFADMGVKGFKVDFMD RDDQYMTAFNYRAAEMCAKYKLILDLHGTHKPAGLNRTYPNVLNFEGVNGLEQMKWSPAS VDQVKYDVLLPFTRQVSGPMDYTQGAMRNASKGNYYPCNSEPMSQGTRCRQLALYVVFES PFNMLCDAPSNYMRELESTEFIANIPTVWDESIVLDGKMGEYIVTARRAGNVWYVGGITD WTARDIEVDCSFLGDKTYDATLFKDGVNAHRVGRDYKCESFRIKNDSKLKIHLAPGGGFA LKIK >gi|225935321|gb|ACGA01000071.1| GENE 4 5409 - 6506 776 365 aa, chain - ## HITS:1 COG:no KEGG:Cpin_2851 NR:ns ## KEGG: Cpin_2851 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 29 361 20 350 352 373 53.0 1e-102 MKNKWCVSVAFLFFLVGICINACTQECNSSNNDRWSEEKIQKWYDDQDWLIGCNYIPATA INQIEMWSADTFNPKQIDKELSWAHDLGFNTLRVFLSSVVWKNDAKGMKKRMDEFLNICK KYSIRPMFVFFDDCWNAESAYGKQPEPKPGVHNSGWVQDPSCSLRKDTLTLYPFLQEYVK DIVRTYAHDDRILMWDLYNEPGNSKHEETSLSLLTNVFRWVRDCKPSQPITAGVWDYNSP RKNVLNAFMLNHSDIISYHNYDNETQHAECIKFLKMLNRPLICTEYMARRNDSRFCNVLP LMKKEKTGAINWGFVAGKTNTIFAWDEVIPSGEEPELWFHDIYRSNGVPFQQEEVDCIKS LTGKR >gi|225935321|gb|ACGA01000071.1| GENE 5 6575 - 6766 168 63 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 1 60 702 761 764 68 50.0 3e-12 KELKGFKRIFLKSGESRDINFVITENDLKFYNSGLEYIYEPGEFDVMVGSNSRDVQTKRF RAE Prediction of potential genes in microbial genomes Time: Fri May 13 11:21:56 2011 Seq name: gi|225935320|gb|ACGA01000072.1| Bacteroides sp. D2 cont1.72, whole genome shotgun sequence Length of sequence - 174201 bp Number of predicted genes - 129, with homology - 128 Number of transcription units - 52, operones - 31 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 267 213 ## BT_1872 periplasmic beta-glucosidase precursor 2 1 Op 2 . - CDS 284 - 2395 1784 ## COG1874 Beta-galactosidase - Prom 2451 - 2510 3.8 - Term 2454 - 2494 2.1 3 2 Op 1 . - CDS 2531 - 4579 1777 ## BT_3313 hypothetical protein 4 2 Op 2 . 
- CDS 4606 - 5613 876 ## Fjoh_2023 hypothetical protein 5 2 Op 3 . - CDS 5636 - 7291 1393 ## Dfer_0810 RagB/SusD domain protein 6 2 Op 4 . - CDS 7302 - 10667 2917 ## Dfer_0811 TonB-dependent receptor - Prom 10699 - 10758 4.6 7 3 Op 1 6/0.000 - CDS 10935 - 11948 848 ## COG3712 Fe2+-dicitrate sensor, membrane component 8 3 Op 2 . - CDS 12020 - 12610 403 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 12816 - 12875 9.4 - Term 12840 - 12888 7.8 9 4 Op 1 . - CDS 12900 - 13766 429 ## COG0524 Sugar kinases, ribokinase family 10 4 Op 2 . - CDS 13771 - 14799 594 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 11 4 Op 3 . - CDS 14796 - 15881 660 ## Phep_1387 hypothetical protein 12 4 Op 4 . - CDS 15903 - 17225 605 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 13 4 Op 5 . - CDS 17212 - 19449 1284 ## PRU_0396 histidine acid phosphatase family protein 14 4 Op 6 . - CDS 19490 - 20764 509 ## Ccel_0950 HI0933 family protein 15 4 Op 7 . - CDS 20764 - 22149 791 ## Dfer_0342 hypothetical protein 16 4 Op 8 . - CDS 22137 - 22466 283 ## gi|260175058|ref|ZP_05761470.1| hypothetical protein BacD2_24591 17 4 Op 9 . - CDS 22482 - 23435 759 ## SG0242 hypothetical protein 18 4 Op 10 . - CDS 23457 - 25214 1489 ## BT_2460 hypothetical protein 19 4 Op 11 . - CDS 25221 - 28307 2166 ## BT_2461 hypothetical protein 20 4 Op 12 . - CDS 28313 - 29677 899 ## gi|260175062|ref|ZP_05761474.1| hypothetical protein BacD2_24611 21 4 Op 13 . - CDS 29709 - 31499 1264 ## COG0591 Na+/proline symporter 22 4 Op 14 . - CDS 31505 - 32824 934 ## Ccel_0951 hypothetical protein - Prom 32894 - 32953 4.1 23 5 Tu 1 . + CDS 33222 - 34250 666 ## COG1609 Transcriptional regulators + Term 34253 - 34292 -0.8 - Term 34235 - 34286 9.0 24 6 Op 1 . - CDS 34300 - 36825 1656 ## COG3525 N-acetyl-beta-hexosaminidase 25 6 Op 2 . - CDS 36833 - 37906 795 ## Phep_0506 hypothetical protein 26 6 Op 3 . 
- CDS 37942 - 39420 1021 ## Coch_0957 hypothetical protein 27 6 Op 4 . - CDS 39468 - 40841 1130 ## gi|260175070|ref|ZP_05761482.1| hypothetical protein BacD2_24651 28 6 Op 5 . - CDS 40883 - 42526 1624 ## Dfer_0810 RagB/SusD domain protein 29 6 Op 6 . - CDS 42559 - 45975 2463 ## Dfer_0811 TonB-dependent receptor - Prom 46172 - 46231 6.5 + Prom 46177 - 46236 6.0 30 7 Op 1 6/0.000 + CDS 46304 - 46891 447 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 31 7 Op 2 . + CDS 46946 - 47989 723 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 48053 - 48105 -0.0 - Term 48083 - 48136 18.0 32 8 Op 1 . - CDS 48147 - 50345 2020 ## COG3537 Putative alpha-1,2-mannosidase 33 8 Op 2 . - CDS 50378 - 51652 920 ## PRU_2549 hypothetical protein - Prom 51822 - 51881 6.6 + Prom 51687 - 51746 5.8 34 9 Op 1 . + CDS 51883 - 53661 1866 ## COG0616 Periplasmic serine proteases (ClpP class) 35 9 Op 2 . + CDS 53672 - 54802 542 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 36 9 Op 3 . + CDS 54759 - 55568 957 ## COG0005 Purine nucleoside phosphorylase 37 9 Op 4 . + CDS 55585 - 56622 991 ## COG0611 Thiamine monophosphate kinase + Term 56662 - 56708 -0.8 + TRNA 56784 - 56859 82.7 # Phe GAA 0 0 - Term 56865 - 56905 9.2 38 10 Op 1 . - CDS 56940 - 57146 258 ## gi|298482322|ref|ZP_07000509.1| hypothetical protein HMPREF0106_02786 39 10 Op 2 . - CDS 57199 - 57684 497 ## BT_1884 cold shock protein, putative DNA-binding protein 40 10 Op 3 . - CDS 57738 - 58862 899 ## COG0513 Superfamily II DNA and RNA helicases - Prom 58889 - 58948 4.6 41 10 Op 4 . - CDS 58952 - 60025 876 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Prom 60131 - 60190 4.3 - Term 60346 - 60388 10.7 42 11 Tu 1 . - CDS 60397 - 60696 262 ## COG0724 RNA-binding proteins (RRM domain) 43 12 Op 1 . + CDS 61012 - 61674 380 ## COG0692 Uracil DNA glycosylase 44 12 Op 2 . + CDS 61722 - 62495 414 ## BT_1888 LuxR family transcriptional regulator - Term 62468 - 62503 -1.0 45 13 Tu 1 . 
- CDS 62534 - 63118 517 ## BVU_0638 hypothetical protein - Prom 63314 - 63373 6.0 + Prom 63161 - 63220 3.7 46 14 Op 1 . + CDS 63257 - 63673 299 ## BT_1890 hypothetical protein 47 14 Op 2 . + CDS 63600 - 65321 877 ## COG3973 Superfamily I DNA and RNA helicases + Term 65495 - 65535 -0.9 - Term 65353 - 65392 7.5 48 15 Tu 1 . - CDS 65423 - 66979 1651 ## BF2880 hypothetical protein - Term 66997 - 67036 1.0 49 16 Op 1 . - CDS 67128 - 68048 451 ## COG3023 Negative regulator of beta-lactamase expression 50 16 Op 2 . - CDS 68079 - 68429 417 ## BT_4442 hypothetical protein - Term 68446 - 68480 4.0 51 16 Op 3 . - CDS 68515 - 69033 566 ## BF2226 hypothetical protein - Prom 69060 - 69119 5.0 + Prom 69406 - 69465 2.9 52 17 Op 1 . + CDS 69546 - 70118 330 ## BF3645 hypothetical protein 53 17 Op 2 . + CDS 70115 - 70984 486 ## BF3644 hypothetical protein - Term 71006 - 71051 12.1 54 18 Op 1 . - CDS 71284 - 71943 550 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) 55 18 Op 2 . - CDS 71983 - 72324 119 ## BT_2526 hypothetical protein - Prom 72366 - 72425 8.4 - Term 72437 - 72470 0.6 56 19 Tu 1 . - CDS 72547 - 73716 1239 ## BT_0418 outer membrane porin F precursor - Prom 73890 - 73949 2.2 57 20 Op 1 . - CDS 73967 - 74233 327 ## COG0776 Bacterial nucleoid DNA-binding protein 58 20 Op 2 . - CDS 74291 - 76939 1510 ## BT_2473 hypothetical protein 59 20 Op 3 . - CDS 77012 - 79675 1260 ## gi|260175101|ref|ZP_05761513.1| hypothetical protein BacD2_24816 60 20 Op 4 . - CDS 79693 - 81807 1414 ## FIC_00184 hypothetical protein 61 20 Op 5 . - CDS 81861 - 82823 602 ## BT_1062 hypothetical protein - Prom 82849 - 82908 7.1 - Term 83095 - 83154 7.2 62 21 Op 1 . - CDS 83189 - 84802 1617 ## BT_1063 hypothetical protein 63 21 Op 2 . - CDS 84863 - 86191 1123 ## BVU_0907 hypothetical protein 64 21 Op 3 . - CDS 86199 - 86768 477 ## BDI_3526 hypothetical protein - Prom 86824 - 86883 5.9 + Prom 86933 - 86992 3.5 65 22 Tu 1 . 
+ CDS 87037 - 87951 465 ## BT_2889 AraC family transcription regulator - Term 88016 - 88057 -0.9 66 23 Tu 1 . - CDS 88139 - 89359 772 ## BF3332 putative integrase - Prom 89411 - 89470 6.1 67 24 Tu 1 . + CDS 89923 - 90405 199 ## BF2331 hypothetical protein + Prom 90424 - 90483 3.6 68 25 Op 1 . + CDS 90553 - 90810 238 ## COG4728 Uncharacterized protein conserved in bacteria 69 25 Op 2 . + CDS 90770 - 91048 247 ## gi|260175111|ref|ZP_05761523.1| hypothetical protein BacD2_24866 70 25 Op 3 . + CDS 91056 - 91739 377 ## Psyc_0245 hypothetical protein - Term 91545 - 91586 0.8 71 26 Op 1 . - CDS 91752 - 93914 742 ## COG1700 Uncharacterized conserved protein 72 26 Op 2 . - CDS 93854 - 94024 112 ## 73 26 Op 3 . - CDS 94062 - 95948 942 ## COG1401 GTPase subunit of restriction endonuclease - Prom 95981 - 96040 7.3 + Prom 96027 - 96086 12.6 74 27 Tu 1 . + CDS 96213 - 97868 1086 ## BT_1894 TPR repeat-containing protein + Term 97958 - 98000 4.7 + Prom 97880 - 97939 6.8 75 28 Op 1 . + CDS 98031 - 98435 401 ## BT_1895 hypothetical protein 76 28 Op 2 . + CDS 98447 - 99733 716 ## COG3291 FOG: PKD repeat + Term 99741 - 99784 4.6 77 29 Op 1 . - CDS 99869 - 100546 457 ## BT_1899 hypothetical protein 78 29 Op 2 . - CDS 100619 - 101911 786 ## COG0534 Na+-driven multidrug efflux pump - Prom 101941 - 102000 4.1 + Prom 102191 - 102250 3.9 79 30 Op 1 . + CDS 102351 - 103664 758 ## COG0527 Aspartokinases 80 30 Op 2 . + CDS 103718 - 104533 639 ## COG0345 Pyrroline-5-carboxylate reductase 81 31 Tu 1 . - CDS 104624 - 105964 1272 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 106090 - 106149 4.0 + Prom 105965 - 106024 4.2 82 32 Op 1 . + CDS 106094 - 108598 1862 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 83 32 Op 2 . + CDS 108630 - 108983 343 ## BT_2435 MarR family transcriptional regulator + Term 109015 - 109073 5.3 + Prom 109043 - 109102 3.3 84 33 Op 1 . 
+ CDS 109207 - 109842 652 ## BT_2437 hypothetical protein 85 33 Op 2 . + CDS 109858 - 110484 629 ## BT_2438 hypothetical protein + Term 110527 - 110576 6.1 - Term 110511 - 110566 10.0 86 34 Op 1 . - CDS 110572 - 113583 2390 ## COG1472 Beta-glucosidase-related glycosidases 87 34 Op 2 1/0.100 - CDS 113580 - 114428 268 ## PROTEIN SUPPORTED gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 88 34 Op 3 . - CDS 114445 - 115230 771 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 115367 - 115426 8.3 + Prom 115297 - 115356 4.6 89 35 Tu 1 . + CDS 115386 - 115736 579 ## PROTEIN SUPPORTED gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 + Term 115761 - 115809 14.4 - Term 115749 - 115797 10.6 90 36 Op 1 . - CDS 115842 - 115985 90 ## gi|160883610|ref|ZP_02064613.1| hypothetical protein BACOVA_01582 91 36 Op 2 . - CDS 116010 - 116990 497 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 117014 - 117073 6.7 + Prom 116956 - 117015 3.2 92 37 Op 1 36/0.000 + CDS 117170 - 117886 358 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 93 37 Op 2 10/0.000 + CDS 117917 - 119176 1141 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 94 37 Op 3 13/0.000 + CDS 119181 - 120422 325 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 95 37 Op 4 13/0.000 + CDS 120503 - 121603 1187 ## COG0845 Membrane-fusion protein 96 37 Op 5 . + CDS 121613 - 122950 1051 ## COG1538 Outer membrane protein 97 37 Op 6 . + CDS 123035 - 124063 820 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase + Term 124112 - 124159 10.8 - Term 124103 - 124144 7.0 98 38 Tu 1 . - CDS 124202 - 124717 561 ## BT_2500 hypothetical protein - Prom 124889 - 124948 4.1 - Term 124917 - 124952 1.3 99 39 Tu 1 . - CDS 124984 - 126390 876 ## Phep_3369 kelch repeat protein - Prom 126436 - 126495 5.1 100 40 Op 1 . 
- CDS 126518 - 127408 619 ## Mboo_1713 hypothetical protein 101 40 Op 2 . - CDS 127414 - 128643 1236 ## GFO_3192 phosphate-selective porin O and P 102 40 Op 3 . - CDS 128676 - 130199 1306 ## COG2866 Predicted carboxypeptidase 103 40 Op 4 . - CDS 130207 - 131583 1212 ## GFO_3192 phosphate-selective porin O and P 104 40 Op 5 . - CDS 131596 - 132744 1116 ## CCC13826_0552 isoaspartyl dipeptidase (EC:3.4.19.5) 105 40 Op 6 . - CDS 132760 - 134139 1074 ## COG3069 C4-dicarboxylate transporter 106 40 Op 7 . - CDS 134157 - 135572 1188 ## gi|260175146|ref|ZP_05761558.1| hypothetical protein BacD2_25041 - Prom 135766 - 135825 7.5 + Prom 135636 - 135695 6.6 107 41 Tu 1 . + CDS 135793 - 137742 1225 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) - Term 137620 - 137669 13.2 108 42 Op 1 . - CDS 137773 - 139155 188 ## BT_2502 hypothetical protein 109 42 Op 2 . - CDS 139170 - 139853 485 ## BT_2503 hypothetical protein 110 42 Op 3 . - CDS 139880 - 140551 584 ## BT_2504 hypothetical protein - Prom 140579 - 140638 4.5 - Term 140589 - 140625 6.4 111 43 Tu 1 . - CDS 140667 - 141506 222 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains - Prom 141535 - 141594 4.1 + Prom 141469 - 141528 6.2 112 44 Tu 1 . + CDS 141719 - 143791 1205 ## COG3119 Arylsulfatase A and related enzymes + Prom 143853 - 143912 7.1 113 45 Tu 1 . + CDS 143932 - 145374 813 ## COG3119 Arylsulfatase A and related enzymes + Prom 145486 - 145545 7.1 114 46 Tu 1 . + CDS 145580 - 146962 916 ## COG3119 Arylsulfatase A and related enzymes + Term 147019 - 147064 2.8 + Prom 147007 - 147066 5.8 115 47 Op 1 . + CDS 147091 - 150252 2224 ## BT_0364 hypothetical protein 116 47 Op 2 . + CDS 150272 - 152098 1289 ## PRU_2735 hypothetical protein 117 47 Op 3 . + CDS 152124 - 154952 2558 ## BT_0364 hypothetical protein 118 47 Op 4 . + CDS 154993 - 156780 1267 ## PRU_2737 putative lipoprotein 119 47 Op 5 . 
+ CDS 156795 - 157742 892 ## gi|260175159|ref|ZP_05761571.1| hypothetical protein BacD2_25106 120 47 Op 6 . + CDS 157800 - 159290 1143 ## COG3119 Arylsulfatase A and related enzymes 121 48 Tu 1 . - CDS 159299 - 163321 2386 ## COG0642 Signal transduction histidine kinase - Prom 163532 - 163591 8.4 + Prom 163355 - 163414 4.8 122 49 Op 1 . + CDS 163551 - 164636 949 ## RB2377 arylsulfatase 123 49 Op 2 . + CDS 164673 - 166337 1271 ## COG3119 Arylsulfatase A and related enzymes 124 49 Op 3 . + CDS 166343 - 168550 1836 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 168589 - 168640 14.2 + Prom 168653 - 168712 4.7 125 50 Tu 1 . + CDS 168765 - 169766 1063 ## COG0039 Malate/lactate dehydrogenases + Term 169787 - 169831 10.0 + Prom 169861 - 169920 2.0 126 51 Tu 1 . + CDS 169941 - 170120 178 ## gi|237722857|ref|ZP_04553338.1| predicted protein + Term 170313 - 170364 8.1 - Term 169938 - 169974 3.1 127 52 Op 1 . - CDS 170056 - 170478 254 ## BT_2511 putative transcription regulator 128 52 Op 2 . - CDS 170486 - 172432 1878 ## COG2217 Cation transport ATPase - Term 172456 - 172495 4.4 129 52 Op 3 . 
- CDS 172499 - 173134 732 ## BT_0227 hypothetical protein - Prom 173322 - 173381 4.9 Predicted protein(s) >gi|225935320|gb|ACGA01000072.1| GENE 1 3 - 267 213 88 aa, chain - ## HITS:1 COG:no KEGG:BT_1872 NR:ns ## KEGG: BT_1872 # Name: not_defined # Def: periplasmic beta-glucosidase precursor # Organism: B.thetaiotaomicron # Pathway: Cyanoamino acid metabolism [PATH:bth00460]; Starch and sucrose metabolism [PATH:bth00500]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 88 1 88 759 151 79.0 5e-36 MKIKNLISCALCGLTLFACSPSGGGKDAEMDCFITDLMERMTLREKLGQLNLPSGGDLVT GSVMNGELSDMIRKQEIGGFFNVKGIQK >gi|225935320|gb|ACGA01000072.1| GENE 2 284 - 2395 1784 703 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 34 696 2 646 649 327 30.0 4e-89 MKNFLFVLTLLFSLSSFSQSPVDKSFPPEELFALGSYYYPEQWESSQWERDLKKMSEMGI KFTHFAEFAWAMMEPEEGEYHFEWLDRAVSLAEKYGLKVIMCTPTPTPPVWLSKKYPDIL IQRDNGVTIQHGRRQHASWSSDRYRHYVENIVSRLAIHYGNNPTVIGWQIDNEPGHYGVV DYSENAQAKFRIWLQKKYGTIDKLNDTWGASFWSETYQNFEQVRLPNQQEVPDKANPHAM LDLNRFMADELAGFVNMQADILRRHIHKDQWITTNLIPIFNPVDPVRIDHPDFLTYTRYL VTGHNQGIGSQGFRMGIPEDLGFSNDQFRNRVGKAFGVMELQPGQVNWGMYNPQPLPGAI RMWVYHVFAGGGKFVCNYRFRQPLKGSEQYHYGMIMTDGVTLSPGGEEYVRITQEMKKLR AAYDKKNRMPGQLASRRIGLLFDMSNYWEMEFQRQTDQWRTMAHVHKYYNLLKSFAAPVD VISEKEDFSGYPFLIAPAYQLLDNKLVERWTEYVKKGGHLILTCRTGQKDRDAKLWEAPL AAPIHQLAGINSLYYDHLPHNLYGKIDFEGKEFAWNNWGDVLTPAAGTDVWAVYTDQFYK GAASVIHHKLGKGTVTYIGTDTDDGKLEKEVVRRVYTEAGVSTEDLPYGVVKEWRDGFYI ALNYTSDIQEIVIPDEAEVLIGSARLEPAGVVVWKEKSDNKHK >gi|225935320|gb|ACGA01000072.1| GENE 3 2531 - 4579 1777 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 182 601 175 577 667 74 23.0 2e-11 MKYYIYTVLLLLLTASCSDDVQKWDQWPEWKLASPLSVGGQVLDEEIYSNFQGKKLHLEK 
GQEVEFSGVDEIESILSPDYFEYVSENKARFKGETADYSVLYDPANELLYIEKAGATYPD GLWLCGANWGHPQARLVTTSGWSMDGPNNVLYCYKSADNVFQLTLYLANNFSFKFFKHRG WGEGDNEITTLPEDNITLTTPFLVAGKTGGDFIPGPLFQPGVYLITLDLNNNTCAFEAKD ENIQEQSFLVNGQEMGILEEASSFLGIALELHKGDEVTFSNFGDVRKMLQPDFFENITKD KATFIGVDGNYKLYYDPINKLTYLENRSVNYPDGLWVCGSSFGHPQAGRVTVGAWTFNLP SDAFQCVKVADNIFETTLYLVKDFQFKFYKQRPWGGELASTIVNTQPVNLLGKGWYYSDP ATGGTGGGHFTGDFVAGPDFTPGVYRVRIDLNKNICMFIDKVDEGQLGPESYKINGTELT QSTNPSYIAVELNLTKGQVVDFEGFSYLDYMLQPEYFTNENGQYKFNAPDGKYRISYNKD RELIYVEKTTDTEFPETVWITGAAFGHPRISGLLPDDIGNWGWDNPKDFICCVKTGDRVY ETNLLLNNDFMFRFYKRKGWNNEITSFDVTIVSEGDLIARGGYWDGDQWKETENFGPGNN FRAGIYHVKLDMNTNTCTFTKR >gi|225935320|gb|ACGA01000072.1| GENE 4 4606 - 5613 876 335 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_2023 NR:ns ## KEGG: Fjoh_2023 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 38 335 42 342 342 212 40.0 2e-53 MKRMSYYIIGLLLSLMSWSCSDDAETGREEIPVETDGGYLFAHMTNANYGKLYYAVSRDG INWETLNKGRIINSAYIGHPDICQGHDGAFYMIAVNPLALWRSENLVTWTSTQLNEMIFN RSNAQGFYTTYYWGAPKMFYDKDSELYIISWHACNDPDKDDWDSMRTLYVLTKDFETYTE PQKLFNFTGTDENMAIIDAIIRKVNGVYYAIMKDERDPAVAPETGKTVRIATSSNLTGPY TNPGAPVTPNDMMREAPIFIERPNHSGWFIYAESYAAKPYGYHLFQSTSMDGPWKERTFS GPNVKDGTDRPGARHGCIVKVNETVYQALLKAYKK >gi|225935320|gb|ACGA01000072.1| GENE 5 5636 - 7291 1393 551 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0810 NR:ns ## KEGG: Dfer_0810 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 11 520 2 486 499 282 36.0 3e-74 MKSNKLCRMKKKFFLPMLVSLQLLFGSCENNFDPYIYGALLQGEYPSTEQEYVSYMMVCY LPYTTVWTYDMGSGGLQHGIHIQSGGAVRMFDSASDICAPATTVGADWERLSKGDYSNCF YYWRGNVDDGGNLNHFPKTAQVTRMTEVIGTLEKAPLTALTEEKKNNLLGEARLCRGLMM YFLLHVYGPVPVILDPEKVIDPEALSNTVRPSLDEMTQWITDDLEFAAKNAPEVAPDLGR YTRDYARFCLMKHCLNEGEHMEGYYQRAIDMYNELNTGKYDLFKTGSTPYVDLFKNANKF NKEMIMAVSCSPTADGNPKHGNANPFLMWALPSDVAKGDPFPMGGGWLQAFSMDKKYYDA FESNDGRLKTIVTSYKDKNGVVINKDNLGVRWNGYIINKFPQETMTTFQGTDIPLARWAD VLLMYAEAEVRKTGTVPSVAAINAVNQVRKRAGLSDLPSVTTNTKDAFLNAILIERGHEF 
FYEGNRKIDLIRFNKYAQEMYKAKGIMPTHQYMPIPNYAVEQAISYGKELKQTWERPGWA EDKSKAQQSIN >gi|225935320|gb|ACGA01000072.1| GENE 6 7302 - 10667 2917 1121 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0811 NR:ns ## KEGG: Dfer_0811 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 112 1121 14 1016 1016 765 43.0 0 MNYRKKAILMVVALFCLNIAMLAQAVSLKMNNVSVKEAMTQLKNKSGYSFVYKVGDLDTR KIVSVKAKQLNEAIDQILYGQEVVYEVKGKNIIVQKGQARQNTSKDTKKRKVTGTVNDAN GEPIIGATIKEKGTTNGTASDLDAKFSLEVSPGAVLEVSYIGYQTQEVKVGDRVSLSVTL AENQQILDEVVVVGYGTTSRKNLTTSIATVKTEKISKAATSNISGMLLGRAAGLQATVAS PQPGGGINISIRGGGTPIYVVDGVVMPSGSFEVGTGSTSLPSSVNRAGLAGLNPNDIESI EILKDASAAIYGIGAADGVILVTTKKGKEGKPTIVYEGSYSVQKHYPYLDVLSGPELMNM VNVFSKENYLYDKGQYPYGNAAYDDKWTPIFTPTQIANAPTTDWLDKVLKTGAVTNHNLT ISGGSEKFKYYLGVNYYKEDATVYNSDMERYSLRTNITSQLTNFLKLTTIVNLNQNNYTN STVGGDVGNLRDQGAGALFGAIFYPSYLPVYDAEGQYNVFSRTPNPVSMQDINDKSEQNG YYMNFSLDVDIIKNMLSAKVLYGLNKENTSRDSYIPSDIYYALQRKSRGNLGYGKRQQST LEGTLTFQHKFGELLDMNLMAGMGRYLDSGDGSDISYENANDHIQGSSVGMADGPFYPTS YKYKNEKRSQFVRGSFDLFGRYVVSASLRRDGTDKFFPSKKYALFPSVSLAWKMNEESFI KNISWINMLKLRASYGETGSDNLGTTLYGIVTTTREDVQFNNNSVTYIPYILSGANYEDV TWQKTVMKNIGLDFSIFRDRIWGSVDVFRNDVTHLLGTAPTELLGMHGTRPINGGHYKRT GVDVSLNSLNLQTHDFKWTSQITMSHYNAVWIERMPNYDYQKYQKRKNEPMNAFYYYKTT GIIDVDKSNMPESQRSLGPAACMPGYPIVEDKNGDGIIDVNDSYMDNMLPKFYFGFGNTF TWKNFDLDIFMYGQLGVKKWNDAYSYSADAGNLSRGVDAHNVGIYSYNIWNTQTNTNGHF PGIAISKSVALPENLGFDYTRENASYIRVRNITLGYNLGPKELSVFKGYIRGIRVFVDFQ NPLTFTKYKGYDPEINTSSSNLTGGQYPQMRVYSIGAKLTF >gi|225935320|gb|ACGA01000072.1| GENE 7 10935 - 11948 848 337 aa, chain - ## HITS:1 COG:CC1130 KEGG:ns NR:ns ## COG: CC1130 COG3712 # Protein_GI_number: 16125382 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Caulobacter vibrioides # 127 305 103 276 307 63 28.0 6e-10 MEEEKKHIDELIATYLTEGLDKNALAELKAWIAASPENEDYFIQRREVWFSAVSREAASK YNKDKAFDTFKNRIGSRKQVEKAPRQEFRLSALWRYAAIIAVILAVGCFSYWQGGVNVKD 
TFADISVEAPLGSRTKLYLPDGTLVWLNAGSRMTYSQGFGVDNRKVELEGEGYFEVQRNE KLPFFVKTKDLQLQVLGTKFNFRDYPEDHEVVVSLLEGKVELSNLLKKEKEAFLAPDERA ILNKANGLMTVETVTASNASQWTDGYLFFDEELLPDIVKELERSYNVNIHIANDSLNKFR FYGNFVRREQSIQEVLDALASTEKIQYKIEERNITIY >gi|225935320|gb|ACGA01000072.1| GENE 8 12020 - 12610 403 196 aa, chain - ## HITS:1 COG:SMc04203 KEGG:ns NR:ns ## COG: SMc04203 COG1595 # Protein_GI_number: 15965784 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 23 183 1 155 159 59 27.0 3e-09 MEHTETLIVEQLKTGNEDAYQYIYDRHYALLCHVANGYVKDQFLAETIVGDTIFHLWEIR ETLEISVSIRSYLLRAVRNRCINYLNSEWEKREIAFSSLMPDEITDDKMTISDSHPLGTL LERELEEEIYKAIDKLPNECRRVFDKSRFEGKSYEEISQELGISVNTVKYHIKNALASLQ TNLSKYLITLLLFFFG >gi|225935320|gb|ACGA01000072.1| GENE 9 12900 - 13766 429 288 aa, chain - ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 1 280 29 311 326 158 34.0 9e-39 MKIIGIGEVVWDCFPEGKRLGGAPINFCFFAKELGADSYPVTAIGEDELGDETIVVLKKT GLDLRYVSRNCLPTGKVLVSLNEAGVPQYNIVENVAWDAIECSRATMELVSDADVVCWGS LAQRSEKSRTAILRFIDAVPNTSLKVFDINIRQHFYSTELIVESLQKANVLKLNEDELPL LISLLSLSTDFVEAIAELISRFSLKYVIFTQGAMCSGIYDVSGEVSSIDTPKVEVADTVG AGDSFTATFVVNILRGESVAESHRRAVNVSAYTCTLRGAINPLPDSQK >gi|225935320|gb|ACGA01000072.1| GENE 10 13771 - 14799 594 342 aa, chain - ## HITS:1 COG:BS_yjmD KEGG:ns NR:ns ## COG: BS_yjmD COG1063 # Protein_GI_number: 16078298 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus subtilis # 16 340 16 336 339 162 29.0 8e-40 MRQAVLVEPKHIEFREVEAPKASDLKEHQVLLNIKRIGICGSEIHSYHGCHPATFYPVVQ GHEYSAVVVATGPAVTICKAGDIVTARPQLVCGKCKPCKRGEYHICEELKVQAFQANGAA QDFFVVDDDRIAVLPKDMSLDYGAMIEPVAVGAHATMRGGDLKGKNVVVSGAGTIGNLVA 
QFAKARGAKQVLITDVSDFRLEMARKCGIEDTLNVAKTPFKEGIKSVFGDEGFQAAFEVA GVESSICSLMECIEKGSTIVVVAVFDKDPSLSMFYLGEHELKLNGTMMYRHEDYLAAVDM VSSGAIHLEPLISNRFLFEQYDEAYKFIDENRSTSMKIIIKL >gi|225935320|gb|ACGA01000072.1| GENE 11 14796 - 15881 660 361 aa, chain - ## HITS:1 COG:no KEGG:Phep_1387 NR:ns ## KEGG: Phep_1387 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 26 358 29 357 358 178 32.0 2e-43 MKKFFIPVVLLLLCVVSVSAQKQIEASTLNLIGKPFELTPNPYHRVDTLVYKGFNRTENR QVRCPAGMAVLFKTNTRNIHITTKWGYVYSSNSTMPISYKGYDLYIKNADGQWQYAASGS SKAYKSERIETFALIENMDGTMHECMMYMPMYSEVISCKIGIDDDAVIEPLKSDFRHKIV VYGSSFTQGVSTDRSGMSYPMQFMRNTGFQIVSLATSGRCLMQPYMTDVLVEVRADAFIF DTFSNPDAELIRERLIPFINRLIAAHPGTPLIFQRTIYRERRTFDTVLDAKERAKAATVE ELFAKILVDPKYKDVYLITPNASDAHETSVDGTHPNSYGYTLWAKSIEAPVIEILSKYGI K >gi|225935320|gb|ACGA01000072.1| GENE 12 15903 - 17225 605 440 aa, chain - ## HITS:1 COG:PA2342 KEGG:ns NR:ns ## COG: PA2342 COG0246 # Protein_GI_number: 15597538 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Pseudomonas aeruginosa # 5 427 19 473 491 314 37.0 3e-85 MKTFNYNRAEIKAGIVHFGVGNFHRAHLEAYTNLLLEDPSQKCWGVFGAMIMPTDGILFN ALKEENGIYQLTTCSPSGERHNMLIGSLVELAWGEIDSEPIIAKIASEEIKIITLTITEG GYKADLDKPRTVFWYVAEGLKRRMEQDLPITILSCDNMQQNGNAAKCAFMSYFEAKYPEV AAWAEKKVTFPNSMVDRITPATKSGKITNVCCEDFIQCVVEDNFIAGRPAWEKVGVTFTD DVTPYEIMKLSLLNASHTLLSYPAYMEGFRKVDTVLADERYRTMIKLFMNRDVTPYVPVP EGVDLEAYKDLLIKRFSNKAISDQVSRLCGDGIAKFAVYVVPILKQMLRDGKDISIEAFL VAVYCKYLIGAHTESGENITIFEPHITPADKMLISGGSPVEFLKISPFVSLELDKYPVFM EKYELFYGMSVAEGLKVLLQ >gi|225935320|gb|ACGA01000072.1| GENE 13 17212 - 19449 1284 745 aa, chain - ## HITS:1 COG:no KEGG:PRU_0396 NR:ns ## KEGG: PRU_0396 # Name: not_defined # Def: histidine acid phosphatase family protein # Organism: P.ruminicola # Pathway: not_defined # 347 744 50 432 436 176 29.0 4e-42 MKKIYLSVVCLLVSIPLIAQLYVEPEKEVECSVFLAKEGRGRAQQGLEIWDDYIFSCEDG GHVNIYDFKSADSKPLAGFELASSHPDNHVNNVCFGVETKRGASFPLLYITNGKVGSELE 
WLCFVESITRRGKRFSSEIVQTIELDGSKWAEKGYVSIFGAPSWLVDRERGFIWIFSARK RTVAKVTKNAWENQYVATKFRIPSLSEGAKVRLDQNDILDQVVFPYDVWFTQAGCMHDGK IYYCFGVGKQDDNRPSCIRVYDTDTRTITARYNVQEQVIYEPEDIVIKDGAMYVNTNTNA KKTSDLPCIFKLSLPKEKRIGENPLDEIRQDPERAGGVYYVTDLSHTVTPTPKGYKPFYI NGYFRHGARQIDDTVTYPAIYGVLEKAHDTNNLTDFGKALYERLEPFKMNVFYKEGDLTQ IGYRQTREIGRRMVQNYPEVFENHPYLKTNATNVLRVAATMQSVNSGILSLKPELEWAEI DNSRSFLTTLNPYGNVCPDRSTLDKYILGKENSWYKKYRSYIDEKLDVDVFFTRLFIDIT QIESEYDKYDLVHRFWLMASLMQCLDRQVPIWDIFTEKEILAWAEIENYKYFAQKGPEPV SHGRSWGLASRTLRHLLDESAEDIARKRHGINLNFGHDGVLMAILTNLQVGTWAREASNS KEALQSWKYWDIPMGANLQMIFYQSEDNSDILVKFMLNEKDLQLPLEAVEASYYKWNEVY KFYIEHCDKVERSLAETLKLSYEDF >gi|225935320|gb|ACGA01000072.1| GENE 14 19490 - 20764 509 424 aa, chain - ## HITS:1 COG:no KEGG:Ccel_0950 NR:ns ## KEGG: Ccel_0950 # Name: not_defined # Def: HI0933 family protein # Organism: C.cellulolyticum # Pathway: not_defined # 5 424 17 435 435 409 50.0 1e-112 MNTKYYDVLVAGAGPAGICAAVAAARQGAHVALIERYGVIGGNLTAGYVGPILGSVGKNT MRDEVCSILGVKDNDWIGEHGKAHDFEEAKLTLAEFVAKEKNIDVFLQCCVSDVIRDGKV IKGIECASNEGTLCFEAAVTIDCTGDAVVSFLAGAKIEKGRDDGLMQPVTLEYTIDGVDE SKGLICIGDVDDVQLNGERFLDWCKKKADDGKLPPILAAVRLHPSVRPGCRQVNTTQVNR VDITSVGSIFTADLELRQQIRLLTQFLRENLPGYENCRVIGSGTTTGVRESRRVMGDYVI DADEMAEGCRFADVVVHKALFIVDIHNPDGAGQAEPSIQYCKPYDLPYRCFLPLGLEGLL VAGRCISGTHRAHASYRVMSICMAMGEAVGIAAAMSASQNCTPRALDVGELQRRLESLGV ELFD >gi|225935320|gb|ACGA01000072.1| GENE 15 20764 - 22149 791 461 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0342 NR:ns ## KEGG: Dfer_0342 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 41 461 23 419 419 261 37.0 4e-68 MQRLIFIFSLFILAIVGCSSNESQSFLIRPDGGEEGGDDSATEIEIGKTIPAWQEGVMDI HFINTTTGESVFVIFPDGTQMLIDAASSSVATNSNGNTTNTGIRSRWDPTLTSTRGSQII ADYIRKCMAWTGNSTIDYAVLTHFHNDHFGGYNTSLPKSSNSDTYSLIGYAEIFDNFKIG TLLDRGYPDYNYPFDMATMADNASSCNNYINAVKWNVANKKFEAAIFKAGANNQIVQKYN SAKYPTAKVQNVAVNGEIWTGSGTATKKTFPELSEISYKNSKDITASDNCPPENITSCVM KVSYGNFDFFAGGDLQYNGRSSYAWKDAELPCAKAVGQVELMKADHHGVTNTNQADALKA 
LNPQTIVVNSWVDCHPRTDILNSMETTLPACDIFITNFWQGDRPSGVDDRVTAEEAARVK GYDGHIVVRVTDGGNKYRVMTTTDSDGAMTVKAVSGPYTSR >gi|225935320|gb|ACGA01000072.1| GENE 16 22137 - 22466 283 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175058|ref|ZP_05761470.1| ## NR: gi|260175058|ref|ZP_05761470.1| hypothetical protein BacD2_24591 [Bacteroides sp. D2] # 1 109 1 109 109 219 100.0 3e-56 MKMFNDSHLEVKKFFKGTFFTHPYEAGWADEAIFFVMVEKIEGDPVFEGRVQLSQDGVHW ADDGSIPTIFKGLGQHIIKVNSNFGNYIRLAVSIEGGEMFLNLHIACKG >gi|225935320|gb|ACGA01000072.1| GENE 17 22482 - 23435 759 317 aa, chain - ## HITS:1 COG:no KEGG:SG0242 NR:ns ## KEGG: SG0242 # Name: not_defined # Def: hypothetical protein # Organism: S.glossinidius # Pathway: not_defined # 10 275 25 271 286 197 41.0 5e-49 MSIGEILLHSRYQPLKRVLSYDVFNTGFNGWMTLMPNFTEYPNFDVPKTLVNKDQWPPVM LSSATFRYPGTHGAMSGTYSLKLSTRPIAAPYTEIPAEGCLGHAIKRLSFSRPGCKYLQI ECWFTYTAEQDVVDGGDRPQPGLHESSIRDFGMGFDVQEGGKRYHVGIRYLNAVDGKLMQ KWQYEHSDEDVTDRDWAYGLDGDWCKKGVDPWWFGRRYPNGDHDGFKDLNDGHQKLIYNE TDCKLNWQYMRLKLDTQLREYVEFQCQDRIWDMRGIAADTVDGYNRIDNLINPLFWVGTD TNRRVFFYIDSVVVSQE >gi|225935320|gb|ACGA01000072.1| GENE 18 23457 - 25214 1489 585 aa, chain - ## HITS:1 COG:no KEGG:BT_2460 NR:ns ## KEGG: BT_2460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 585 1 572 572 403 39.0 1e-110 MKKLHNMKKIHVLIYMIVLLSVTSCVNLDLNPPSAASSENWFSSPEEVRISLNDFYRSTF FVIEEGWTLDRNTDDWAQRTNIYTIAAGSLNASSTSNPNIKTVWSYTYKNISRANRILEA LDNLEGKYSTTELNTLRAEARFFRAFAYSRLITLWGDVPFYLTSITPEEAFEMGRTDKAV VLKQVYEDYDYAAENLPVANNNSGATRVDKGTAYAFKARTALYQHDYGTTAKAAQDCMDL GVYDLAPDYGELFRDKTRGSKEVIFSVVHSSDLELDENGQPTTQSIGSFIARSAGGTHNA QPSWELLAVYEMTNGKTIDEPGSGFDPRDPFANRDPRCLETFAAPGSRIYGIEWNPAPNA LEVMDYTQNRMVTNKDSKGGSDASNCAYNGCCLRKGAQESWRTTLYNDNPVILMRYADVL LMYAEAKIELGDIDATVLACINDVRARAYSTTRTQTNDYPSITTTDQTVLRKVLRRERRV EFAWENLRYFDLLRWHQFENAFGHNMYGFTRTANRAKEYFAAGNWFWPQTPAFDEDGFPS FEAMADGTYIVQHGERKFDEKIYLWPLPSDDVLIMNGKLVQNPGY >gi|225935320|gb|ACGA01000072.1| GENE 19 25221 - 28307 2166 1028 aa, chain - ## 
HITS:1 COG:no KEGG:BT_2461 NR:ns ## KEGG: BT_2461 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 1028 132 1134 1134 804 42.0 0 MKKSILSLLLACFLQINLAFAQNVMKGVVSDANGPLVGVVIHDKNNGKGTTSDMSGKYAL SGAESGHTIEFRYLGYMTEHVVWDGKSVVNIKLKEDAVQLEETVVIGYGSVKKKDLTGAV GVINNSLIERQSTSQLSQSLQGLIPGLTVTRSSSMPGASATVQVRGVTTMSDSSPLILVD GMMVSSLDNIASEDVQQITVLKDAASASIYGARAAAGVILITTKEATEGQLSIGYNGEVS LSSPTEFPKFLTDPYHYMTMYNEWSWNDAGNPAGGEFANYSQDYIDNYATNNRYDPIQYP IYDWKDAILSSTAMRHKHNLTMTYGNKVIKSHTSATYENADAVYKGSNHERISIRSRNNL KISDKLSGSIDFSVRYATKNDPTSGSPIRAAYMYPSIYLGLYPDGRVGPGKDGSLSNTLA ALLEGGEKKTVSNTMTGKFSLSYKPIKDLTITANLTPTIGNVSIKEMKKAIPVYDAYETD VMLGYVSGYTSNSLSEERRNIKSLEKQFIATYDKTFSKVHNFNAMVGYEDYSYTYETMSG STNDMSLSSFPYLDLANKNALAVAGNSYQNAYRSFFGRIMYNYDSRYYLQLNAREDGSSR FHKDHRWGFFPSASVGWVISNEKFMQNITPVSYLKFRASIGTLGNERIGNYPYQTYISFN NAIMYDSAGSTPQSSMSAAQQDYAYENIHWEKTQSWDIGVDAAFFDNRLDFSADYYYKKT TDMLLSVAIPSFTGYSAPDRNVGKMHTRGWEVKLGWSDRIGDFSYAVSFNISDYKSIIDN LNGKQQFNSDGTIITEGAEYNSWYGYKTAGLFQTAEEVSESALLSASTKPGDVKYVDISG PDGTPDGIINETYDRVVLGSSLPHYLYGGSISLGWKGISFSLLFNGVGKQLSRLTESMIR PMQGQWLPAPSVLLNDNGDRNYWSVYNTAEQNASVRYPRLSHQGGEYNNYKMSDFWLRSS AYMRIKNINIGYTVPKKIVSKVGIKGLRVYVNIDDPYCFDSYLSGWDPEAGASTYITRTY TFGVDIKF >gi|225935320|gb|ACGA01000072.1| GENE 20 28313 - 29677 899 454 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175062|ref|ZP_05761474.1| ## NR: gi|260175062|ref|ZP_05761474.1| hypothetical protein BacD2_24611 [Bacteroides sp. 
D2] # 5 454 1 450 450 889 100.0 0 MNLKMLLFASVSLVIFASCHDSDTPYMTLATKTLVVEAEGGLFTVDLESNLFYRVNNDCQ AEGSDSHWAVIDSHETQGEITKFTIKISENSSATSRTGAIRFIGDDVTPLKLAITQKGIV PKGISPATESVDAATTESSFKVFGDKEWRAVCIDADVTVSPTNGVGECDIKLTFPENKTF VKRTIEVTVTILDDKDYTYTLVQDAFSGILADWDLKSLTSNTSGTFADGEAQSVFPGANG KYIAPSAGSGKIEYWACDRTGYIAQTAICDRAVGGNGDPYVSGAIPGDYWYICGDAKGAI IPAGTKIHFYFVTKLGTMCSSYWMIEFKDGEEWKPALPISTLLESATETISGASIDYSAT ITYNFAGMLLDTSNNGAYIPAEGVFTTTKDMDEIVLRFGQAGRLGLSGARYSGKYIDCTH ASGQTRFSAQHPSNPETGAAIKEYNQHVLLEIVE >gi|225935320|gb|ACGA01000072.1| GENE 21 29709 - 31499 1264 596 aa, chain - ## HITS:1 COG:VCA0667 KEGG:ns NR:ns ## COG: VCA0667 COG0591 # Protein_GI_number: 15601425 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Vibrio cholerae # 6 463 7 436 513 80 21.0 1e-14 MESLDYFVILLYFLGLIAVSVVMSRKIKNSEDMFIAGRNSSWWLSGVSSYMTIFSASTFV VWGGVAYKSGIVAVIVALCLGVASFIVGKWISGKWRQLRIKSPGEFLTIRFGHRTVSFYT ISGIVARAVHTAVSLYAVAVVMCALIKVDGGSIFASTGMLGDSPEGYLSIWWALLILGAI ALGYTIAGGFLAVLMTDVIQFGVLLAVVVFMIPLSFNAVGGVSAFIDKANEIPGFFSGTS PTYTWVWLLLWIFLNVAMIGGDWPFVQRYISVPTTRDAKKSTYLIGILYLVTPLIWYLPT MIYRVMEPGLALDLDATTMTFNGEHAYVNMSKLVLMKGMVGMMLAAMLSATLSNVSGILN VYANVYTYDIWGHKEKNRQADEKKRIKVGRLFTFVFGLVIIALSMFIPFAGGAEKVVVTL LTMVMCPLYIPSIWGLFSKRLTGNQLITAMVLTWFVGIMARIIIPASVISPSIIESVSGC VLPVLILVMMEIWSVRKKYEDNGYQAICEYTDPDADREPTSKEKKAVRNYSHLAVNCFCI TLGVVVLLLVGLLIVGDPKTLAVKNIVIGSIVLIVATILAYVIYRIMYAKQVKISN >gi|225935320|gb|ACGA01000072.1| GENE 22 31505 - 32824 934 439 aa, chain - ## HITS:1 COG:no KEGG:Ccel_0951 NR:ns ## KEGG: Ccel_0951 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 8 439 4 431 432 330 41.0 7e-89 MDKKVLSMEITKRGPIVGNYDVVVCGGGPAGFIAAIAAARCGAKTAVVEQYGFLGGMATM GLVTPLSVFTYNNEKVIGGIPWEFIERLEKMGGCIIEKPLGNVAFDPELYKLLCQRMMLE AGVDMYMHSYLSGCQIKDGKISSVLFENKNGTEAISADMYIDCTGDGDLAAMAGVPMQAD ECKPLQPLSTYFILGGVDTDSPIIIDAMHHNKQGQNCHCIAVREKLLEMKDVLGIPEFGG 
PWFCTTLRPGEVTVNMTRTAGNAVDNRNFTAAECRLREDVFKIARILKENFEEFKNSYVT TVAVHAGIRETRRIKGIHTITGEEYVNAYKYPDSISRGAHPIDIHVAAGAAQNVTFLKKA AYVPYRALVADNFPNLLVAGRCISADKTSFASLRVQASCMGVGQAAGVAAAQCIRAGVTV QKADVDKLIEELKKLGAII >gi|225935320|gb|ACGA01000072.1| GENE 23 33222 - 34250 666 342 aa, chain + ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 5 334 4 328 335 194 35.0 3e-49 MKYITIKDIARHLSISVSTVSRALGDDKNIRKETKEKVMEAVKELGYRPNPVATNLKYGH TNAVGVIVPEMVTPFASRVINGIQEVLHAKNIKVIIAESGGDPEKEKENLQLMERFMVDG IIICLCSYKRNKEEYFRLQEAGMPMVFYDRIPYGLEVPQVIVDDYMKSFFLVESLIRSGR KQIVHIQGPDDIYNSIERVRGYKDALAKFGIPFDKDNMLIRTGITFEEGKKAADILMERN IPFNAVFAFTETLAIGAMNRLKELGKKIPDEIAVASFSGTELSNIVYPKLTTVEPPLYQM GKSAAELILEKIKDPASPNHSIVLNAEIKMRASTPPIEVYKQ >gi|225935320|gb|ACGA01000072.1| GENE 24 34300 - 36825 1656 841 aa, chain - ## HITS:1 COG:VC2217 KEGG:ns NR:ns ## COG: VC2217 COG3525 # Protein_GI_number: 15642215 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 52 827 71 865 883 326 29.0 1e-88 MITMRNIWIMMLGICLFGCGAGKQPPSSQLSLTWKLEKDSVEARYFKNTFCLTNNGNKSL ADDWVIYFNQTPIYYQQPINAPLEIECIGSTYYKMYPTEHYQALAPGETLSFTILSEGNV INVSSVPEGAYIVATDEKGKMLQPQNIPIEIGLFTPDAQWIRSKNSFPYADGNYLYKQND DFSKPVECDILSLFPTPKKVEKTGGVSTFSPKVGLKFDGVFKEEALLLKKQLTSQLGCTV SEKDEETIIELKKTELPATCQCPDEYYEIVIKNNLLTLKANDAHGIFNACQTLVALLDNM KLASSPLPNLHITDYPDMGHRGIMLDVARNFTKKTDLLKLIDILSFYKMNVLHLHLSDDE AWRVEIPGLEELTEIASRRGHTTDELTCLYPAYAWGWNEVDTTSLANGYYSRSDFMDILK YAKERHIRIIPEIDIPGHSRAAIKAMNARYRKYIDTDQSKAEEYLLIDFADTSQYLSAQN FTDNVINVAMPSTYRFLEKVIDEIGRMYQDAGVELPAFHVGGDEVPEGIWEGSSICRTFM KEHGLAKIRDLKDYFLEQILEMLNKRNIQAVGWQDIVMKPDNTVNEHFRNSKVLNYCWNT IPEQGGDEVPYKLANAGYPIILCNVGNFYLDMAYCYHVEEPGLRWGGYVDEYVTFDMLPF DIYKSLRRNLKGESVDVKTASNGKQPLTKEGYKNIKGLSGQIWAETIRSFEQIEYYLFPK VFGLAERAWNAQPSWALSPDSKVYADAKRKYNAGIVTYELPRLAKRGINFRVSPPGIIVK 
DGLLFVNTTNPNAVIRYTTDGSEPTENSVKWQTPIACDAPQIKAKAFYLGKESVTTVLIN E >gi|225935320|gb|ACGA01000072.1| GENE 25 36833 - 37906 795 357 aa, chain - ## HITS:1 COG:no KEGG:Phep_0506 NR:ns ## KEGG: Phep_0506 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 3 356 5 360 362 483 64.0 1e-135 MKKILLSFFIGIMLIACSQKTVSVREVWTKEQANEWYQQWGWLRGCDFIPSTAINQLEMW QADTFDPVTIDRELGWAEDIGMNCMRVYLHHLVWEMDKEGFKKRINEYLTIAAKHHISTI FVFLDDCWNPTYQAGKQPEPQPGVHNSGWARDPGDLYYKGDTAVILPVLENYVKDILTTF KDDKRIVLWDLYNEPGGSGGYRYGERSLPLLQKIFTWGRTVNPSQPLSAGVWDMSLTNLN KFQLENSDVITYHTYEGLDSHQQLIDTLKQYGRPMICTEYMARTQNSTFQDIMPMLKKEN IGAINWGLVAGKTNTIFAWDTPLPDVAEPSLWFHDIFRSDGTPYSTEEVECIRSLTK >gi|225935320|gb|ACGA01000072.1| GENE 26 37942 - 39420 1021 492 aa, chain - ## HITS:1 COG:no KEGG:Coch_0957 NR:ns ## KEGG: Coch_0957 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 133 492 37 411 411 207 34.0 9e-52 MYISMKTYSKKFLVAFFFSMLPMVLMSCSMEPEKASFSISGNLGKLQISNEGFTEVYSIK SNTEWKFVNETDQSWATISPVKGSGNGKVTITVSANAGTGRTAVFRVVPNGVKTQEIEII QGNSYIPEADGVFPIIAWTGVEADKSLEKFPVMKASGINIYLGWYNDLETTLKVLDAAQK TGVKMITSCQDLLSVGTAEEVVKAMMNHPALYAYHLKDEPEVNDLPGLGELVKKIKTVDS HHPCYINLYPNWAWGKELYSENVKSFIEQVPVPFISFDNYPIVSINGAPSIVRPDWYRNL EEISAAAKESNKPFWAFALALSHKLDETHFYKIPTLPELRLQVFSDLAYGAQAIQYFTYR GLQHDEPTEVYDLVKTVNQEVQQLAGIFLGAQVISVSHTGSEIPEGTTALGSLPAPIKSL TTSDTGAVVSVLEKGSNQYLVVVNKDFRNLMNLAIDVDGSVSRVLKDGSTVSPDGSTIAV EPGDMVIFTWRK >gi|225935320|gb|ACGA01000072.1| GENE 27 39468 - 40841 1130 457 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175070|ref|ZP_05761482.1| ## NR: gi|260175070|ref|ZP_05761482.1| hypothetical protein BacD2_24651 [Bacteroides sp. 
D2] # 1 457 1 457 457 910 100.0 0 MKKIYVLFISSLLLFSCAKDDISKPSEWPEWPTPSKPKIENAVLRGVNGETVVAAGDKVK FTAQISDEYNDLVSFQLLVTMDGAEILNLSKGLSGRSAVIEEEATLPFVAGFQNGRPVVT IKAVNDLGGNESTLTLDEAASVKVTRPETPSKLYIVDDLGNVYEMDKANQSESDYGFRTN AADLAGIGDHFKVAEKVVNNKPDYSGLVWGYADDKISIVTDDAAASIPTPKVGSYTLENI TFDMLSFLADKTLTYSIDIDKSRFFDVGGGYIQLDVELVESAKVNFIGFGNDVTNMLRPE FFKDVSGAKAKFDGPSVLYNLKYNTSNGFMYMERPRDVFYPEVMYIVGTGVGFPREPYVA TLAWDFSYPHQWFFFKKVGASAFEAVVYIDQTMGFKFYRGYGWAQEEDTKNLYTVEPANL VTRSAPGDLIPGPDFKAGLYTIHIDKASEMIRLIPYN >gi|225935320|gb|ACGA01000072.1| GENE 28 40883 - 42526 1624 547 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0810 NR:ns ## KEGG: Dfer_0810 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 20 518 19 486 499 287 37.0 1e-75 MSKNKLFKVFIMIAFIMCSCEDNFDPKMYGTLNVSNYPVTEAEYESFMMTCYMPFTTTWT YWIGAGTSGNQHGWYIPAGGVLKFFDYPTDEMAVWNNGWGGGYYFLSKADFSQCVYYTSG TLSDERPNHFPKVSEISYFTNVIGTLEKASTEVVSEERKREFIAEARLCRGLMMYYLLHV YGPVPLIVDPNDLINPSKLENLVRPTLQQMTEWIMADFDYAYQYIADTQPEQGRYNKDYA RVCIMRHCLNEGYYMSGYYQKAIDMYSELKGRYSLFKKGDNPYIEQFKNANNFNSEVIMA VSCDETADGTNKSGNFNPLMMLATPDNAARVDDQGNPTPFYLQGQGWGQTFNVSPKFWDT YDPSDKRRDVILTKYYTTAGTWIDRNTTTWDGFIINKFPVETATAFQGTDIPLARWADVL LMYAEAEVRKNNAAPSADAVAAVNEVRKRAGLDDLPSSATASAEAFLDALLTERGHEMLY EGMRKIDLIRFNQYAQRTAKIKGVAPTHQYVPIPNYAVQQASESYGKVLVQTFEREGWKA DLAAARN >gi|225935320|gb|ACGA01000072.1| GENE 29 42559 - 45975 2463 1138 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0811 NR:ns ## KEGG: Dfer_0811 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 132 1138 12 1016 1016 766 42.0 0 MKLISIRKMKKRINVPANVKMGRICAIWLFCIALQAVSIPILAQSAKITLRLKNVTVEEV LTSIENQTEYRFLYNKDIVDVSRIVSISVKNELMTLVLDKLFKGEGVSYTIEKRQIVLNK VSSQQQDNKPTVKVTGKVVDESGETLPGVTVMIEGVTQGTITGIDGNYQLQVPEGSTLKF TSIGYTTYTQKITRPMTLNVTMKEDSKQLDEVVVVGYGTTTRKNLTTSIATVKTEKISRA ATSNMSQMLLGRAAGLEATLTSPQPGGAVDLSIRGAGTPIFIVDGVMMPSTSLEVGNGNQ VMPNSINRSGLAGLNPADIESIEVLKDASASIYGIGAANGVVLITTKKGTETRPQITYEG 
NYSIVKNYPYLEPLSGEEYMNVANIFNKENYLFTNGMYPYGDKPFDNKWVPQFSPQQIAA AQTTDWLDCVLKDGSINNHNITITGGSKLLKYYLSGNYYKQEGTVENSAMERYALRTNIS SQLLPFLKLTAIVNVNENEYTNSTSGGGGGGNGYDAIQSALTYPSYLPIRDAAGNPTLYS NYPNPAEMVKVSDRTKTSGYYLNFAADVDIIKDMLSFRLMYGINKENANRNLYIPSDIYF MDMYKSRGHLGYVERRNQTMEGTLTFKKQFGDLLRVDAVVGMGRYTNDSNGLELDYEQIN DHIAADKVEAAEGAFYPTSFRAADERRSQFARASVDLLDRYVVAATIRRDGTDKFFPGKK YAYFPSVSLAWKLSNEPFMKNISWIDQLKIRGSYGQTGNDNLGSSLYGTFSLAAQYIKFS NNSVTYVPYLLSGPDYPNVTWEKTTMKNIGVDFSVLKDRIWGSFDMFRNDVTNLLGYDSS SPLSMTSSVPMNYGHYVRYGWDATINSLNYEIPRVFKWTSQLTLSHHNAVWKERMPNYYY EEYRIRKNEPVNAYYYYETEGVINIDKSNMPESQKSLPADAQQPGYPIIKDANGDNKITI DDVKMRNTLPKIHIGFGNTFVYKDFDLDIFMYGQFGRTRYNYAYRWALVGDVYYTSPKNS NKYVYTIWNSQTNQNGNRRGIASTKAVALPGNVGFEEDYQNASFVRVRNITLGYNLSGKK LGRIGDYVSSIRVFIDCQNPLTFTKFVGVDPEIKTGGDGSKAEYPMTRTYSFGAKICF >gi|225935320|gb|ACGA01000072.1| GENE 30 46304 - 46891 447 195 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 1 185 1 193 206 63 25.0 2e-10 MSNIQQQTSDDRFLVIALKQDDKQAFTRLFHAYYKDLVLFGGTYIPEKSTCEDIVQNIFL KLWNDRKTLEIENSLKSYLLKAVRNYCLDELRHRRIIDEHVAYELKSGSIDIDTTENYIL YSDLCRQLKNALEQLPPQEREVFEMSRLENIKYQEIANRLNISVRTVEVRISKALKQLRI LLKDFYLLLFFFLFH >gi|225935320|gb|ACGA01000072.1| GENE 31 46946 - 47989 723 347 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 138 320 132 312 331 83 30.0 6e-16 MINKDSDIEIIISKYLSKEATPEEIKVLEDWISATPENYKSFLSQKNVWEVSHPAFNPEE IDVDSAHRKVMEQILHQNQPTSVRPKLSFLYYWQQIAAILLLPLLIVSAYLYLKPASQIA ETYQELFTPYGTWSVVNLPDGSKVWLNAGSSLKYPTQFNDKQRVVSMQGEAYFEVESDKE HPFIVKTKELTVEATGTAFNVNAYTSDNVTAVTLVKGKVAVTLDKKKTISLSPGEKIDYN 
LATSLYNVNKTNTYKWCSWKDGVLIFRDDPLEYVFKRLGQTYNVEFILKDAELGKYSYKA TFEGESLNEILRLLEMSAPIQCKAVSNRSNNSEKFEKQRIEVSKTMQ >gi|225935320|gb|ACGA01000072.1| GENE 32 48147 - 50345 2020 732 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 24 731 37 761 790 491 40.0 1e-138 MRRKTPFMAATLAAGLFFYSSCTPAEKTDTKDYTQYVNTFIGAADNGHTFPGACYPFGMI QTSPVTGAVGWRYCSEYVYEDSLIWGFTQTHLNGTGCMDLGDILVMPVTGTRVRAWDAYR SHFPKDKEAATPGYYTAELSDPQVKAELTASIHAALHRYTYHKADSASILIDLQHGPAWR EEQYHSQVNSCEVNWEDAQTLTGHVNNTVWVNQDYFFVMKFNRPVVDSLYLPMGETEKGK RIIASFDMQPGDELMMKVALSTTSIDGAKKNLEAEIPAWDFEGVRATAHNEWNNYLSRIE IEGTDDEKTNFYTCFYHALIQPNQISDVDGMYRNAADSIVKAGTGTFYSTFSLWDTYRAA HPFYTLMIPERVDGFVNSLIEQGEVQGFLPIWGLWGKENFCMIGNHGVSVIAEAYRKGFR GFDAERAFNIIKKTQTVSHPLKSDWEVYTKYGYFPTDLIKAESVSSTLESVYDDYAAADM ARRMGKEEDAAYFAKRADYYKNMFDPQTKFMRPRKADGTWKAPFNPSALGHSESIGGDYT EGNAWQYTWHVQHDVPGLIQLFGGEEPFLNKLDSLFTVKLEGESLSDVTGLIGQYAHGNE PSHHVTYLYALAGRPERTQELVREIFDTQYKNKPDGLCGNDDCGQMSAWYMLSAMGFYPV DPVSAEYVFGAPQLPKMTLHLADGKTFTIIAENLSKEHKYVDSITLNGEPYTKKTISHED IVKGGTLVYKMK >gi|225935320|gb|ACGA01000072.1| GENE 33 50378 - 51652 920 424 aa, chain - ## HITS:1 COG:no KEGG:PRU_2549 NR:ns ## KEGG: PRU_2549 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 21 422 9 412 416 372 46.0 1e-101 MKKLTSILLLLLFVWTVQAQTVREEIRANRYLSANNYYAYPTPTAKQTAAPKGYEPFYIS TYMRHGSRYLIDPNDYRFPKQTLLKADSLNKLTTTGKNTLDIVLRMADMAEGRLGELTSV GARQHRGIAKRMYTNFPQIFADGTEVDARSTVVIRCILSMTSECLQLQAMNPNLCIKNDA SYHDMYYMNPPAKDLSKIASSDKVTKVQKDFEATHVRPQRLMKTLFTDEAYVKANVDEAR LMRRLFDLACNMQSHDTDMQLYSLFTDEECYDLWSCNNLYWYLTHACSPVTDGLMPYREA DLLRNILDRADAALKEGKNQATLRFGHESCLLPLVALLELNHYGESYPDVEKLSEQWRAY DIFPMASNVQFVFFHKKGSNDVLVKVLLNEKETKLPVKTDVAPYYHWKDVESYYREKLDK YQHI >gi|225935320|gb|ACGA01000072.1| GENE 34 51883 - 53661 1866 592 aa, chain + ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: 
all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. PCC 7120 # 38 592 42 609 609 339 36.0 1e-92 MKDFLKFTLATVTGIILSSIVLFIISMVTLFGIMSASDTETIVKKNSVMMLDLNGTLVER TQEDPLGILSQLLGDGSNTYGLDDILSSIQKAKENENIKGIYLQANSLGTSYASLQEIRN ALLDFKESGKFVIAYADSYTQGLYYLSSAADKVLLNPKGMIEWRGIASAPLFYKDLLQKI GVEMQVFKVGTYKSAVEPFIATEMSPANREQVTTFITSIWGQVTEGVSTSRNISVDSLNV YADRMLMFYPAEESVKCGLADTLIYRNDVRNYLKKLVEINEDDNLPILGLGDMMNVRKNV PKDKSGNIVAVYYASGEITDYPSSATSEDGIVGSKVIRDLRKLKDNDDVKAVVLRVNSPG GSAFASEQIWHAVKELKTKKPVIVSMGDYAASGGYYISCVADTIVAEPTTLTGSIGIFGM IPNVKGLTDKIGLSYDVVKTNKFADFGNIMRPFNEDEKSLLQMMITEGYDTFVTRCAEGR HMTKEAIEKIAEGRVWTGETAKELGLVDELGGIDKALDIAVAKAGIEGYTVVSYPEKQDF LSSLLDTKPTNYVESQLLKSKLGEYYQQFGLLKNLQEQSMIQARIPFELNIK >gi|225935320|gb|ACGA01000072.1| GENE 35 53672 - 54802 542 376 aa, chain + ## HITS:1 COG:aq_1656 KEGG:ns NR:ns ## COG: aq_1656 COG1663 # Protein_GI_number: 15606758 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Aquifex aeolicus # 12 325 6 285 315 124 31.0 2e-28 MDEHFIKIHKWLYPASWLYGAVVMMRNKLFDWEVLQSKSFDIPIISVGNLAVGGTGKTPH TEYLIKFLCNQYKVAVLSRGYKRHTKGYVLATPESTARSIGDEPYQMHQKFPSVTVAVDE KRCHGIEKLLALQKPSIDVILLDDAFQHRYVKPGLSILLTDYHRLFCDDTLLPAGRLREP ISGKNRAQIVIVTKCPQDIKPIDYNIITKRLNLYPYQQLFFSSFRYGNLQPVFPTMSPDT NAISTNHEVALSSLTNTDILLMTGIASPAPILERLEGCTKQIDLLSFDDHHNFTHRDIQL IKERFHKLKGEHRLIITTEKDATRLINHPALDKELKPFIYALPIEIEILQNQQDKFNQHI IDYVRENTRNRSLSER >gi|225935320|gb|ACGA01000072.1| GENE 36 54759 - 55568 957 269 aa, chain + ## HITS:1 COG:BH1532 KEGG:ns NR:ns ## COG: BH1532 COG0005 # Protein_GI_number: 15614095 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus halodurans # 3 266 6 270 275 292 55.0 4e-79 
MLEKIQETAAYLKGKMHTSPETAIILGTGLGSLANEITEKYEIKYSDIPNFPVSTVEGHS GKLIFGKLGNKDIMAMQGRFHYYEGYSMKEVTFPVRVMRELGIKTLFVSNASGGTNADFE IGDLMIITDHINYFPEHPLRGKNIPYGPRFPDMSEAYCKELISKADEIAKEKGIKVQHGV YIGTQGPTFETPAEYKLFHILGADAVGMSTVPEVIVANHCGIKVFGISVITDLGVEGKIV EVTHEEVQKAADAAQPKMTTIMRELINRA >gi|225935320|gb|ACGA01000072.1| GENE 37 55585 - 56622 991 345 aa, chain + ## HITS:1 COG:MTH1396 KEGG:ns NR:ns ## COG: MTH1396 COG0611 # Protein_GI_number: 15679395 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Methanothermobacter thermautotrophicus # 1 340 1 324 327 157 34.0 3e-38 MRTEIASLGEFGLIDRLTEGIKLENESSKYGVGDDAAVLSYPSEKQVLVTTDLLMEGVHF DLTYVPLKHLGYKSAVVNFSDIYAMNGTPRQITVSLGLSKRFSVEDMDELYSGIRLACQQ YNVDIVGGDTTSSLTGLAISITCIGDADKDKVVYRNGAKDTDLICVSGDLGAAYMGLQLL EREKVVLQGDKDVQPDFSGKEYLLERQLKPEARKDIIEKLAAANIVPTSMMDISDGLSSE LMHICKQSNAGCRVYEEHIPIDYQTAVMAEEFNMNLTTCALNGGEDYELLFTVSIADHEK ISQMEGVRLIGHITKPELGCALITRDGQEFELKAQGWNPLKEDKQ >gi|225935320|gb|ACGA01000072.1| GENE 38 56940 - 57146 258 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298482322|ref|ZP_07000509.1| ## NR: gi|298482322|ref|ZP_07000509.1| hypothetical protein HMPREF0106_02786 [Bacteroides sp. 
D22] # 1 67 1 67 93 124 95.0 2e-27 MKKGNIMRDVISSSQFTENVIKEYKKGLGDMVLVAITPRTTIELPAHLTQAEIDARVEIY KKLHSSKV >gi|225935320|gb|ACGA01000072.1| GENE 39 57199 - 57684 497 161 aa, chain - ## HITS:1 COG:no KEGG:BT_1884 NR:ns ## KEGG: BT_1884 # Name: not_defined # Def: cold shock protein, putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 148 152 202 85.0 4e-51 MAKSMTVGKRENEKKRLAKREEKLKKKESKKLSGSSNSFEDMIAYVDENGMITSTPPTED IKKEEINPDEIIIATPKKEEEEPTILRGRVEFFNESRGFGFIKDLSGVEKYFFHVNNVIG NIAENNIVTFDLERGVKGMNAINISLENKSTSEDLAQSKPV >gi|225935320|gb|ACGA01000072.1| GENE 40 57738 - 58862 899 374 aa, chain - ## HITS:1 COG:ECs0875 KEGG:ns NR:ns ## COG: ECs0875 COG0513 # Protein_GI_number: 15830129 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 369 1 370 455 398 55.0 1e-111 MTFKELNITEPILKAIEEKGYAVPTPIQGEAIPAALAKRDILGCAQTGTGKTASFAIPII QHLQMNKEAAKCRGIKALILTSTRELALQISECINDYAKYTQVRHGVIFGGVNQRPQVDM LHKGIDILVATPGRLLDLMNQGHIHLDKIQYFVLDEADRMLDMGFIHDIKRILPKLPKEK QTLFFSATMPDTIISLTNSLLKNPVRISITPKSSTVDAIEQMVYFVEKKEKSLLLVSILQ KSEDQSVLVFSRTKHNADKIVKILGKAGIGSQAIHGNKSQAARQLALGNFKSGKTRVMVA TDIAARGIDINELPLVINYDLPDVPETYVHRIGRTGRAGNTGTALTFCSQEERKLVNDIQ KLTGKKLNKASYTI >gi|225935320|gb|ACGA01000072.1| GENE 41 58952 - 60025 876 357 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 6 345 8 347 355 288 44.0 9e-78 MKSFFIGNIEIKTPVIQGGMGVGISLSGLASAVANEGGVGVISCAGLGLLYPKGKGTYPE KCINGLKEEIRKARAKTKGIIGVNVMVALSNYADMVRTAIEEKIDVVFSGAGLPLDLPSY LTPESTTKLVPIVSSSRAAKIICDKWQKNYNYLPDAIVVEGPKAGGHLGFKKEQLQDQHY ALEALIPEVVMIASSYKEQKQIPVIAAGGISTGEDIAHFMELGAAGVQMGSIFVTTQECD 
ASETFKEVYIHSKSEDVLIIESPVGMPGRAIDGEFIHNVNSGLERPKSCSFHCIKTCDYT KSPYCIIKALYNAAKGNMKKGYAFAGSNAFLAEKISSVKEVMSTLEREFFLATHKLA >gi|225935320|gb|ACGA01000072.1| GENE 42 60397 - 60696 262 99 aa, chain - ## HITS:1 COG:all2777 KEGG:ns NR:ns ## COG: all2777 COG0724 # Protein_GI_number: 17230269 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. PCC 7120 # 1 99 1 99 99 75 42.0 2e-14 MNIYISGLSYGTNDADLTNLFAEFGEVSSAKVIFDRETGRSRGFAFVEMPNDTEGQKAID ELNGVEYDQKVISVSVARPRTERPSNGGGRGGYNNSRRY >gi|225935320|gb|ACGA01000072.1| GENE 43 61012 - 61674 380 220 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 220 8 226 231 236 51.0 4e-62 MDVRIEPSWKQQLQEEFDKPYFEKLVTFVKDEYKRAHVLPLGHQIFHIFNSCPFEKVKVV ILGQDPYPNPGQYYGVCFSVPDGVAIPGSLSNIFKEIHQDLGKPIPTSGNLDRWVAQGVF PLNSVLTVRAHETGSHRNMGWEIFTDAVIRKLSEKRENLIFMLWGSYAKEKRSLIDTNKH LVLTTVHPSPRSAEYGFFGCKHFSKANAFLHSKGIEEIDW >gi|225935320|gb|ACGA01000072.1| GENE 44 61722 - 62495 414 257 aa, chain + ## HITS:1 COG:no KEGG:BT_1888 NR:ns ## KEGG: BT_1888 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 257 5 258 258 407 81.0 1e-112 MNKSNNITREEMWAKQCLSATDIDYAVWERDKSILHQLSKICHNCTFVVDVYKCNYTYAS SNFVDLLGYDSHKIETLEKQGDYLESRIHPDDRAQLAALQVTLSQFIYSLPLEQRNDYSN IYSFRILNARQQYIRVTSRHQVLEQDRNGKAWLVIGNMDISPDQKHSETVDCTVLNLKNG EFFSPSSLLMPSSLNLTNREIEILRLIQKGLLSKEIADKLCISIHTVNIHRQNLLRKLGV QNSIEAIRLGQETGLLS >gi|225935320|gb|ACGA01000072.1| GENE 45 62534 - 63118 517 194 aa, chain - ## HITS:1 COG:no KEGG:BVU_0638 NR:ns ## KEGG: BVU_0638 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 194 1 194 194 299 73.0 4e-80 MNILELAKRNQQKAWEIIEDTRIVRIWEGIGAKVNLVGSLRTGLLVKHRDIDFHIYTSPL 
DLSASFRAMAELAENTSVKKIEYTNLLHTAEACIEWHAWYQDMEGELWQMDMIHIQEGSR YDGYFERAAERISAVLTDEMRLAILKLKYETPDTEKIMGVEYYQAVIQDGVRSYPEFEEW RRLHPAVGVVEWMP >gi|225935320|gb|ACGA01000072.1| GENE 46 63257 - 63673 299 138 aa, chain + ## HITS:1 COG:no KEGG:BT_1890 NR:ns ## KEGG: BT_1890 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Nucleotide excision repair [PATH:bth03420]; Mismatch repair [PATH:bth03430] # 1 121 1 121 686 187 77.0 1e-46 MIFNQTEKQEKEYLQQVITTIHDTINTTSTSVKEYIDTLQEYKDYLWSNKDIDPHEIRSM RESILNHFALGENVIDKRRRLSKILDIPYFGRIDFQEKKDKSQVSPIYIGIHTFYDLNQK KKSDLRLASSYFKHVLRS >gi|225935320|gb|ACGA01000072.1| GENE 47 63600 - 65321 877 573 aa, chain + ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 9 564 133 755 774 306 32.0 5e-83 MTSIKKRNLIYDWRAPISSMFYDHELGEAFYSSPAGKINGAISLKRQYRIRKGKMEYMIE SSVTVHDDILQKELCSNADNKMKNIVTTIQREQNQIIRNENASVLIIQGVAGSGKTSIAL HRIAYLLYAQKGQISSKDILIISPNKVFADYISNVLPELGEETVPETSMEQILSDVIDNK YKHQNFFDQVTELLENPTPDFIERIQYKASFEFISLLDKFILYMENNYFKATDVKLTKYI TIPAEFINEQFKRFHRYPIRQRFETMTDYILEMMQLQYNLTVSTPEKNQLKKEIKKMFAG NNDLQIYKDFFEWIGKPEMFKTRKNRILEYADLAPLAYLHIALNGNNAQSYVKHLLIDEM QDYSPIQYKVIQKLYPCRKTILGDASQSVNPYGSSTADMIQKAFATGEIMKLCKSYRSTF EITSFAQKIQPNNELEPIMRHGEHPKILPFKNTEEEIQGIADLVNSFRNSHYTSLGIICK TESQAKELVQKLQIYANDISLLSNQSSAYVKGIIITSAHMAKGLEFDEVIIPQTDDKNYH SNIDKSMLYVAATRAMHKLTLTYSVSSPSCRFL >gi|225935320|gb|ACGA01000072.1| GENE 48 65423 - 66979 1651 518 aa, chain - ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 5 517 9 522 525 660 63.0 0 MDGILRKLPIGIQTFAEIREENYLYVDKTAFVWRIANTGKPYFLSRPRRFGKSLLLSTFE AYFEGKKELFEGLTIADMEKEWKTYPVFHLDLNAEKYDSPQALVEILSRQLTQWELKYGK GEDEETLSGRFAGVIRRACEQSGRGVVVLVDEYDKPLLQALGDDALLDDYRKTLKAFYGV 
LKSSDRYLRFVFLTGVTKFAQVSVFSDLNQLNDISMKPQYATICGITRQELEDTFIPELN RLAETNELTYEETLNKMTALYDGYHFCEFAEGVFNPFSVLNVFDGYKFSNYWFQTGTPTF LVELLKKSEYDLRTLIDGVEANASSFMEYRVDANNPIPLIYQSGYLTIKGYDKRFGNYLL KFPNDEVRYGFMDFLVPYYTSVVDDERGFYLGKFVRELESGDVDAFLTRLQAFFADFPYE LNEKTERHYQVVFYLVFKLMGQFTGAEVRSARGRADAVVKTPKYIYVFEFKLNGTAEQAL QQIDEKGYLIPYQVDGRELVKVGVEFSAEKRNIDRWLY >gi|225935320|gb|ACGA01000072.1| GENE 49 67128 - 68048 451 306 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 45 148 1 104 116 105 46.0 1e-22 MRNIDSIIVHCSATKAGQDFTATDIDRWHRERGFNGIGYHYVIRLDGKLEKGRDVALPGA HCKGWNEQSIAICYIGGLDGNGRPADTRTNAQKRVLYQIIMDLQREYNILQVLGHRDTSP DLNGDGVIEPYEYVKACPCFDMKEFLRSGRELLFVLVVALVVPVLLSGCRSKKEVINRES EVRVDSSLNSSSGKSLVKNKVASEKDLEVVEEHIEQVLFVFPVDTLRLKAGMVVKTVVDR KNVNEKQLLQEVNSQSVSLSDMSAKTEIHKTGVEKTTNRTTGSGYWYGIILLVLVLVCWI CKVKRK >gi|225935320|gb|ACGA01000072.1| GENE 50 68079 - 68429 417 116 aa, chain - ## HITS:1 COG:no KEGG:BT_4442 NR:ns ## KEGG: BT_4442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 116 120 54.0 1e-26 MQITTILAFITAMGGLEAVKWLVRYITCRKTDARKEEASVNSMEEENRRKKVDWLEERLT QRDEKIDGLYIELRKEQEEKIDWIHKCHEVELIQKESEVKKCEIRGCVKRMPPSDY >gi|225935320|gb|ACGA01000072.1| GENE 51 68515 - 69033 566 172 aa, chain - ## HITS:1 COG:no KEGG:BF2226 NR:ns ## KEGG: BF2226 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 150 1 152 155 80 35.0 2e-14 MAIRFRRVSRLCDPTNKEAGKKVYPVISYQYDTSATLDEFAKEIASASGVSEGETISVLK DFRTLLRKTLLGGRSVNIAGLGYFYLSAQSKGTEKAEDFTIANISGLRICFRANSDIRLF TGTTTRSDGLKFKDLDHINDSGSVGDGSDDGGETPDPDGGSGGNGEAPDPAA >gi|225935320|gb|ACGA01000072.1| GENE 52 69546 - 70118 330 190 aa, chain + ## HITS:1 COG:no KEGG:BF3645 NR:ns ## KEGG: BF3645 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: 
not_defined # 1 186 3 185 189 184 46.0 2e-45 MESKRYNNSFTSSSMVKEEGRTAYYKLLDSVRDGDAVRIKRGVYATAEQLADTMIDVEAI IPGGVLCLFSAWNVHGLTTSLPQAYHVAVKRGRKITLPTFPKIELHHITDTMFDIGVEEL VVSGYHIHIYNKERCVCDAVKYRNKIGMDVCSEVVNNYISQPDRNLSLLMDYADKLRVKR ILEQYIAIKL >gi|225935320|gb|ACGA01000072.1| GENE 53 70115 - 70984 486 289 aa, chain + ## HITS:1 COG:no KEGG:BF3644 NR:ns ## KEGG: BF3644 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 284 3 286 289 281 48.0 2e-74 MSEIKNPGKSIRAKLLNVAKQKDVFYQTILTRYFQERLLYRISQTHYRENFYLKGGALMY AYEHFAARPTLDIDFLGTHISNDGERIAEAFREICAVECEEDGVRFAADKLAAQNITEFK DYHGIRLSIPATMDTIAQVLTMDIGFGDVVTPHPVSLDYPLLLEELPEANILAYSTETVI AEKMHAIIDLADQSSRMKDYYDIYHLLTSFKYDNTILQDAINRTFENRHTLYNADTMFFR KDFPNHTQMQLRWSTFLKKATIKNSLSFSEVAHWLQEALMPYWEAYGRM >gi|225935320|gb|ACGA01000072.1| GENE 54 71284 - 71943 550 219 aa, chain - ## HITS:1 COG:PA4034 KEGG:ns NR:ns ## COG: PA4034 COG0580 # Protein_GI_number: 15599229 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Pseudomonas aeruginosa # 4 216 3 224 229 147 47.0 1e-35 MKTKFIAEMVGTMVLVLMGCGAAVLNGGATSVAAVLTIAFAFGLSVVAMAYTIGPVSGCH INPAITIGVWLNGGLSVMEAGVYIVAQVTGGILGSALLWLITGTMGMEGTGANGFEEPYL LAAFVAEAVFTFIFVLTVLGTTDRDNSSPHFAGLAIGLTLVLVHIVCIPVTGTSVNPARS IGPALFAGVEAISQLWLFIVAPIVGAVVAVPVWKTIKGN >gi|225935320|gb|ACGA01000072.1| GENE 55 71983 - 72324 119 113 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 112 21 132 132 133 69.0 2e-30 MIFGCCFTRFKYLEGWEEMELQIHARQYTECFFLALLPALVLSVWLSWWWMLLAFMSYHL LYWLERWFGHHSSFDWEALEHCSDALYLRKRKSRAWMKWYGKKSLPPSEWEDD >gi|225935320|gb|ACGA01000072.1| GENE 56 72547 - 73716 1239 389 aa, chain - ## HITS:1 COG:no KEGG:BT_0418 NR:ns ## KEGG: BT_0418 # Name: not_defined # Def: outer membrane porin F precursor # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 10 389 1 371 372 243 38.0 6e-63 MKKKILTVALVLSSVCAMAQQKENRDTDGSILRGPYETNSFWSNWYIGVGGGINIYEGEL DNKTSVGNRIAPALDVSLGKWITPSYGVRLQYSGLKAKGLTDASGMYAKGAHRGYYKEEF NVSNLHADFMWNWSNGFLGFNEKRVWNVIPFVDFGWARSWGNSTHDNEIAANIGILNTFR LGKRLDLTLEGRQMLVKECFDGTVGGSKGEGMSSVTLGLSVKLGKTRFNRVEKVAPADYS SYNERINALRAKNHDLSTEAQRLAAELEAARNRKPEVVNESKVTASPVALFFHIGKATLD KKELTNLEFYVNNAIKADKNKTFTLIGSADSATGSKELNQRLSEQRMEYVYNLLVEKYKI SPERLVKKAEGDTNNRFSEPELNRVVIVE >gi|225935320|gb|ACGA01000072.1| GENE 57 73967 - 74233 327 88 aa, chain - ## HITS:1 COG:XF1190 KEGG:ns NR:ns ## COG: XF1190 COG0776 # Protein_GI_number: 15837792 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Xylella fastidiosa 9a5c # 1 86 1 86 94 80 46.0 6e-16 MNKTELIKEVALKGKLSFREAGYALDVIIDTITETMQKGENITLVGFGSFVIQERKERNG HNPSTGGSMVIPAKRQVKFRPGSKMRLG >gi|225935320|gb|ACGA01000072.1| GENE 58 74291 - 76939 1510 882 aa, chain - ## HITS:1 COG:no KEGG:BT_2473 NR:ns ## KEGG: BT_2473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 519 831 302 660 706 83 26.0 4e-14 MNNKFIYTLYCAVWLVCCSLIVQGCTDEENVGATPDEGTLLRLSIADLEEVQVRGLAGES SAEYRTQCYDIFVYRGTDWEATPYYHSHREKYEEYVPGMYGDGSSSPVIDLGIAPQDGDR IYVFINCQWVDYVDGNTPENVFYGSDKLALSNARVVQFPGCKAFSGYIDWKNGQDNLIRM ERLTAKVRVKITTLYTNKTYGGLMICNAPNYTLASNKYNGLDYEETGELIGVPFKCGPEI SSGWYDEQPGVDIPEYRCSEHTALSGGRSLDSRVFDPQRMAVILVLNEEQDGNLKRSYYR LDFIDDQTLGYYRDVIRGHRYTFLITKIYSEGYATLEEAIYMPASNLEYTVTVNNDWTES FEYNGQWQLNIDREVAELLPNIMDPVPVAKVELQNNNKGTADFSKLTTRQVSLVSPDLEV LDEDVLGAPVPIQLWCYNPTTGQTVKAPGNVADASMLTGTSWVFCCTTDADFVFGGVLAS CWLKIVMGNIVKYIRIDCLSLSDRFTEEFTELDKSDGGTANCYIVSPLGGTYSFNATVMG NGVKGITVANTFQSAKGSAYLTPGNVADVTIAPQSAKLLWQDKKGLISQVAYHDGRVEFI VSDSGQAGNAVIAVYDKPDPNAADATLLWSWHIWCAPRPVDIPVKPEAGAELVAGEKYIF MDRNVGAEDSKEVGMQYQYGRKDPFIDYETNGHEYIYDLLGKDVTALLKTVPNPPEQYEK IRFPLTYYAGIYSMDRPSTGKSNGLWGTDRPAYKDVISDDVKTIYDPCPPGYKVPDMRPY KVVRDKFFLTGKFDFKVSDSYVSFVSFQTSPRTLYFPNGGRYYVGERGGSILYAWTTGVV 
TPNESHIVFTCTYNDSKKEFDVCNMNAGIVDESFTQVRCISE >gi|225935320|gb|ACGA01000072.1| GENE 59 77012 - 79675 1260 887 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175101|ref|ZP_05761513.1| ## NR: gi|260175101|ref|ZP_05761513.1| hypothetical protein BacD2_24816 [Bacteroides sp. D2] # 1 887 1 887 887 1826 100.0 0 MKKYLAFILLFFGIGCICSCTAIDELGDTSCVPEGMIRVEVPLSIASSVQTTEGVTREIE DRWEPIEDLEVQMWRRDMYSNYYTVSLDPSNLTPVTGENGQKQYILKFDIPEIILQSDDA FLDKTFLVGVRANLNYFSPYGSINPTPPASSPVYMSGEGEWRWDGETFVQDAPIRLYRSI CKLRTLVRTVDKSVLGDWEVYPELTEVQLLDAPPYARKFHPTIALVSNWNWDYVDPTVYV DYPSMVLRDFPSDQIVDIYLGEHTVDKELFRVYEQYFFENYLEDVSKYKAGVNITSLKVK IHLFNKETVENKIIERVIPINDFGYDWGNVSGHDCRLMRNTAYLLDIQVLSPEDIEVNLH LQDWWAEETVDGDIPGVSFMVDKTEVPVIPYGVTDPAPAFEVVPDDPAFLDISLVDKDYK SVDQSNGFVLWYKLPKDGSLTKAPSDNRIRINVTNMGLRMFFTTVKNGNPGGFFRVTCGR STRYIPIITPKYVRTDEMFNKSMTSNCYIAEKEGGYRFAGTVMGNGADGLPPAVQASMLP YNANGYLMGATPSPVIAPKSAKLLWQDVDGLITQVGFDTNTTSRGDVVFYVSGTKGPGNA VISVYDKADPNDPAAKVLWSWHIWCAPRPEEMSTSGSAAADDDGKRIYTFMDRNLGATTA DGSSNKAEGLLYRYGRSVPMIHFASKWGSYKRIYNILGEDITSLIVWGEGPGGNYTGRNW DNILFPLNYYTAARSNLESGMWGEGAVSEVGTFNDKTIYDPCPPGYKVPTYRAFEALNEL YYSQNKFRQESSGIYFTGLKGGTLCFPWVDYFWGEPGTAGQWYDTRGKSLTYYSATIDGV SDNTYPGSIYTFYYFWQSDPGSNSMPRVIIDSTTLLPVRCVREREKE >gi|225935320|gb|ACGA01000072.1| GENE 60 79693 - 81807 1414 704 aa, chain - ## HITS:1 COG:no KEGG:FIC_00184 NR:ns ## KEGG: FIC_00184 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 370 599 45 323 1036 95 29.0 9e-18 MRYKEKYNWLLLLSVAICCFACTAEREPTDGYTLAGTRAEVPMSVHLGVQGAATDPEQIN TSEAAIRRLRLYVFDGNVLDKMYYWDNINTVETYTTPTFMVKAGTGKSVYVIVNEPEDAD TRAQLEAVTHPVHLPEIKYSLSAMGLEGNVLSNDGTYRYNYLPMYGERGGVDALPSSTAT SPQRLEISVDRAVARVDVYVRKDYKDYANGVTDDFVLEFIRTSDHLDTGFISSHSLLEAN EESQGYGDKYYGESIQETEKRMYSFYVPELDCRDKKVKITLEFVINEYTGEGGIRKKVTV ELGGGDDTSSPLEKIERNHVYKLHCSLTKLARWENTLEIDDWEYNHLQAVEVTQDVRVTN CYVVRPGETIDIPLKNIYAAWAWARELGGGIPKGRHIDVTYVWQDVPFQTVADKAIMGGG KDRTTDYIRVKGINEGNAVIALTVDDVIYWSWHIWVTDYDPNKLDGWKSVNGRIWMDRNL 
GAMSGEYTPDGAARGLIYQWGRKDPFPGTKGFGDNTPKPVYVYDAGGSPEKYIEYISKQT DGINLMYSLQHPDQIIYALDESEESDWYTTNTQEYANNQLWSADKKTIFDPCPAGWRVPD NEAFDRDGDGVTDLIYNYDNSVWGMFSTNKRIYFPYTPRYRSGAFFQENQDYNIMNLWGC ASRTHPGEIAGCFESVSLVPGGNISLSHSRHKADALPVRCIKEK >gi|225935320|gb|ACGA01000072.1| GENE 61 81861 - 82823 602 320 aa, chain - ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 319 4 316 317 304 46.0 2e-81 MKIKDIARRLGCGICLALLLTHFSCDSIHETLPECRLYVDFRYDYNMEFADVFAGRVDRI DVFIFDKDGKFILQKNEQGERLAAGNYRMPLDLPLGEYRISAWGGMSDDFEMPRLVVGQS TPEDLTVKMNREATLIYDKELHPLWYGEPITVNFTGRGEQTETVRLVKDTNKFRFILQNL GPGLALDPDDCLFEIHADNGHCNWDNSLLADDVISYRPYYLADVDDIGLVVEMNTMRLLE NKKVYFMLTRKSNGERLFKVDLIPYLLLTKMEGHNIPAQEYLDRQSEYALVFFYNSELIS FLAAKIVINGWTIWLKNEDL >gi|225935320|gb|ACGA01000072.1| GENE 62 83189 - 84802 1617 537 aa, chain - ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 533 1 562 566 138 27.0 6e-31 MKLDKIFTALFVGLAMAACSNDDEMTTGVQDQSPVTGGTAYMSVSLNMPKSTGGTRTSGL DHGSTAEQKVGTVLLALFDASDICVGTKTLSAGEYQLTVGGTPGTVGSAIQVPATTKTVL AVINPNTDFQNRCVVSAGWAAINAALATSVKDVIGTAEDNFMMINAGDATTHKALIDVSG YIKKVGDGYVDATNAKVAAVADPAVIYVERVVAKVSLEEASGGVMVMAPGATCTFGDWAL NVTNKDMFPFSEIVTPIGSMSANYRKDNNYAAGEFNKSKFNYLTLDSNGALPADFTTMGV DKYCLENTMDATEQKMAQTTSAVVSAVYTPAGYTAGKSWFRLLGTSYEALTDLQAVYTTA KTAVTDGSASDLQKQQKTLCDEFYVRIAAMATAQSKTWNASDFASLDIDEMDAIVNGGEA SKPAVVPDPIDPVNNPDIVQDLGVEYFKEGTCYYSILIRHDDAVTGTMELGKYGVVRNNW YTLTIKSVKQPGTPWIPDPTNPIDPTDPGEDNDEAEAFLAVGITVNSWTTWSQGVDL >gi|225935320|gb|ACGA01000072.1| GENE 63 84863 - 86191 1123 442 aa, chain - ## HITS:1 COG:no KEGG:BVU_0907 NR:ns ## KEGG: BVU_0907 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 37 437 39 429 486 215 33.0 3e-54 MRKNYILYMVMVAFATVSARAQEISIANTTARKVSDDKVEVSFRMDCSGLQLSTRRQLTV 
TPIIVNYADSDSLSGMPRSLALPSVRIAGKNRYRMNERRKKLYGSMTGETPYMFRFSKKR RPMIEYSETVPTEKWMSGACVEVRREWQGCAGCGEVLADVFLADVPLFKEEKDTVERPHL ELMVAEAVEKHRSFTRSAYLNFKAGQSVLLADYMNNPSELARIYSSIDSIRNDEIYRIDE ISIVGYSSPEGSYVSNARLSEQRARALERNLKQAYKLDDRTLRCSSVAENWEGLAAWLEE YRPSYMQRVLDIIEQTPNPDARDAKIRAIDGGKIYNALLSEVYPGLRQVKYTVNYVIIPF TVEQGREIIRTRPDKMNHNEMYRVAESYGKGSEEYYRIIRMIAERFPDDRIAINNAAIVA WEMGDYGAMRMYLKRLEDLKAE >gi|225935320|gb|ACGA01000072.1| GENE 64 86199 - 86768 477 189 aa, chain - ## HITS:1 COG:no KEGG:BDI_3526 NR:ns ## KEGG: BDI_3526 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 189 1 185 185 269 70.0 3e-71 MKKYLCLIGVLGILLCNTGNSYAQKVAVKTNLLYDATTTMNLGLEIGLGKKWSLDLSGNY NPWKFDDETRLRHWGVQPELRYWLCEKFNGHFLGLHGHYAKYNVGGMSFLSDNMERHRYQ GHLWGGGISYGYQWLLGSRWSMEAVLGVGYARLDHSKYPCATCGSLQKSEKKNYFGPTKA AINIIYIIK >gi|225935320|gb|ACGA01000072.1| GENE 65 87037 - 87951 465 304 aa, chain + ## HITS:1 COG:no KEGG:BT_2889 NR:ns ## KEGG: BT_2889 # Name: not_defined # Def: AraC family transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 290 10 290 294 347 57.0 3e-94 METKLMKSELLYIEEHLSCQNYMTTIETGFKYLEFDKNTEFEEDTTSKNYLLFFLKGDFT ITCNQFHNRSFHAGEMILIPRSSRLKGIAETGSNLLSMFFDMPEGNCDKLILQSLSELCN NIEYDFSPIRIHYPLTPFLEVLTHCIKNGMSCAHLHTLMQREFFFLLRGFYEKREIATLL HPIIGKEMDFKDFVMRNHTKVDNIEQLISLSNLGRSRFFSKFNEVFGMTAKQWMLKQKNQ RILEKMTEPGVCIKDAIEELGFDSQSNFNRHCKLYFGCTAKQLMERCQTENSPIYEDNHA TCTK >gi|225935320|gb|ACGA01000072.1| GENE 66 88139 - 89359 772 406 aa, chain - ## HITS:1 COG:no KEGG:BF3332 NR:ns ## KEGG: BF3332 # Name: not_defined # Def: putative integrase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 404 1 399 403 395 53.0 1e-108 MITSVKLMLNKSRILNNGSYPLVFQVIHNRRKKLLYTGYRMKEEVFDESEGKIMNGVGST FTATEVVKMNRELRKMRNRIDTRIRQLERTREEFTVEDILAQNAFGTGKPQFYLLRYINA QIERKQELKKVGMAAAYKSTRSSLARFIGRPDVRMSEVDLAFVRRYEDFLYSNGASGNTV SYYLRNLRSLYNQAVTDGYHPRGEYPFAKAQTRPAKTVKRALSRTDMQNLADLKLENEPE LEFTRNLYLFSFYAQGMAFVDIVLLKKTDICNGVLTYSRHKSKQLIRIVVTPQMQGVIDK 
YNTENEYLFPIISGEYASGYKQYRLALGRINRHLKKIAVVADIKVPLTTYTARHTWATLA RDYGAPISVISAGLGHTSEEMTRVYLKDFDVSMLNQVNSMVTNLSK >gi|225935320|gb|ACGA01000072.1| GENE 67 89923 - 90405 199 160 aa, chain + ## HITS:1 COG:no KEGG:BF2331 NR:ns ## KEGG: BF2331 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 160 1 153 163 98 33.0 9e-20 MVTVTFAIKPYLARYMYVRYGQSLELLSHNNPPSRLPIPIHLSHLTPIYHFLHQLSVPHP QGVSWKETGNITFVLPSPRQGKNPEIYNYFGQNSIFIIEKEIEVEMKAELYSFLLENKFK NGVMYIKSMHEFVVKYDMAESVEEESLMRGFQRWRKNMKK >gi|225935320|gb|ACGA01000072.1| GENE 68 90553 - 90810 238 85 aa, chain + ## HITS:1 COG:PA3012 KEGG:ns NR:ns ## COG: PA3012 COG4728 # Protein_GI_number: 15598208 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 9 64 47 102 124 62 50.0 2e-10 MEKERYFIHFKGGLYKMLGIAQHSETLEEMVVYQALYGKHEIWVRPKTMFFDKIVRNGIK IDRFKEITEKEIYAYYPERKEISEE >gi|225935320|gb|ACGA01000072.1| GENE 69 90770 - 91048 247 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175111|ref|ZP_05761523.1| ## NR: gi|260175111|ref|ZP_05761523.1| hypothetical protein BacD2_24866 [Bacteroides sp. 
D2] # 1 92 1 92 92 160 100.0 3e-38 MPITQKEKKYLKNKHKGGKNNSKGAIYESYYATYQIVSFMNQYITQLSNVHLTTQLHDAF VDDLFIEEPNAHKTYHQLKDVKDLTWETCKLK >gi|225935320|gb|ACGA01000072.1| GENE 70 91056 - 91739 377 227 aa, chain + ## HITS:1 COG:no KEGG:Psyc_0245 NR:ns ## KEGG: Psyc_0245 # Name: not_defined # Def: hypothetical protein # Organism: P.arcticum # Pathway: not_defined # 2 227 6 230 233 134 37.0 2e-30 MRTYTNITYEKNRDIIQLMHILSNPSTSIEVYRNTFYKLGKALGALLDEKKHSRYGNTML ACASEDADWLTQGVLSSISQKNVALAVFWNQRITLDETTKLEYSPIVKSYIEPIEECQTL IVVKSIISSSCVVKTQLTRLVEKMNPLNIYILSPVMYKDAQKNLQKEFPESISNKFEFLT FAIDTSKDAQGSVLPGVGGMIYPKLGLGDIYEKNRYIPNLVKERMAL >gi|225935320|gb|ACGA01000072.1| GENE 71 91752 - 93914 742 720 aa, chain - ## HITS:1 COG:MTH502 KEGG:ns NR:ns ## COG: MTH502 COG1700 # Protein_GI_number: 15678530 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 59 531 133 565 567 166 27.0 2e-40 MYDFELDKNSYYLQEDRDILTNSKSRQAISRGCISTGIYVGTYSIDVLQKEDNKKVGEVA FEIRSAKVDYRDDYRRMLEDITEFCTELLMQQSSPVIQRYEVDITKDPRSDYQRFAFVRS LVDSDSFADAMYQIQLSPVKRWIDSEEERQLCNVKRWGQHTLRQLACATNRVSLSNQHPL KDRFDTLPRCVPVAFRKETADIPENRFIKYVLRSFLTFCSSIQHHSKAGQRLKQEASLVC EKLLQYLSYPLFRNISEINVLPLNSPILQRKEGYREIFQKWLMFDMAARLTWQGGEDVYG AGKKDVARLYEYWLFFKLLDVLSQKFHIPSKSKEELIDCSNELNLTLKQGKMIMLSGIYE TDSRKLNICFSYNRTFSYTDNYQLSGSWTRNFRPDYTLTIWPGDITSEEAEREELITHIH FDAKYRIEQLFLRDNTNKNNDEVSDDLSEMKREEERGTYKRADLLKMHAYKDAIRRTGGA YILYPGTENQVIHGFHEIIPGLGAFAISPKDYDKSIQAFLTFLDDVVDNFLNRTSLREKL AYHTYSVLSKNNNRSLREPLPEPYGANRDLLPDETYVLIGYYKNEQQLDWILKRKLYNTR TGSANGSLHLSKEVISARYLLLHGERELLTGRLYKLVPKGPRIFSREKLEKIGYKEPSGD YYLVYNIEGIPEPEFEGMKWDIRLLEGYKKGNLSASAFSVSLADLMKVKVFNEYEYTNET >gi|225935320|gb|ACGA01000072.1| GENE 72 93854 - 94024 112 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVIVQPYMFLLGERILCLKVRTPKNMENLLFNYKKVVCMILSLIRIATIYKKIEIF >gi|225935320|gb|ACGA01000072.1| GENE 73 94062 - 95948 942 628 aa, chain - ## HITS:1 COG:MTH501 KEGG:ns NR:ns ## COG: MTH501 COG1401 # Protein_GI_number: 
15678529 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Methanothermobacter thermautotrophicus # 266 580 210 518 546 208 39.0 3e-53 MKELPRNIKRDDILQGIRKVEEEGIPPQAHSSTYDLLYNGKRFPPKLVLAYANIFANGEE LDRNEFEGGKGTTCFKILKDNGFKIVLKNESNSISETLVKFLEQAKGNSLKTKDYIRYFD DLQVKVSFGMASLARVPWISFLAQGQKTSNGIYPVYLFYKKYNILVLAYGISEGTVPVVT WNLDTNVETINSYFKKRNIIPDRYGDSYVYAVYDTTKPLNYTLIEEDLKRIIAQYKNIIT SDKNSLIALSGQSKKYDEKDFEPESFMNDLNSTGLVFSSEFVSRFILSLLTKPFVILSGL AGSGKTQLAMMFAKWICEDIDKQVCMIPVGADWTNREFLIGYPNALEKGAYVKPDNGALD LIIEANKNPRKPYFLILDEMNLSYVERYFSDFLSVMESEEGIPLMKNCIDKEITPVTKLP RNLFIIGTINVDETTYMFSPKVLDRANVLEFRIMEKDMKDFLQKLQPIHKDIINNKGVEQ ATSFVKMAVDSWGLQISEEHKASIITSLECFFYELKKIHAEFGYRTACEIFRFMALAEKV NTKLEGDAIVDAAIIQKLLPKLHGSRKKVEPVLKALWNLCLIKGSELTLENTIEFVHNDQ FKYPLSAEKIWRMYCVALDNGFTSFAEA >gi|225935320|gb|ACGA01000072.1| GENE 74 96213 - 97868 1086 551 aa, chain + ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 551 1 551 551 893 88.0 0 MKCRIILELLAILFFTGCYNREQISRLDEAEALLQNKPDSALTILQQLKSEGSQAEQARY ALLYSEALDKNHIKATNDSLIRRAWEYYKHHPKDLRRQCKTLYYWGRVKLRAGDKPGALR LFLEIEEKLKDTNEPYYAGLLYSQIGEVYYDQMNYSRAYHYFREARNNFRQTDNTREETE ATLDMAAATFNSKDMEKAMRLYSAALDLADEHKYNKLAKASLTNLASLYVVSGKKQIPHD LLQRIELSARQDTLYGYHTLVDVNLLKNRIDSARFYLALAETHSTDIRDMADLQYTAYRI EAQARNFEKATEKIHHYIYLTDSLTRSNMQFSAGMVERDYFKERSKFAEYRMKNRTTWEI TIASVIFLIIGVAYYIIRQRLRLQRERTDRYLLLAEEANAQYKTLTEHMEGQRNAESHLK GLIASRFDIIDKLGKTYYERENTASQQAAMFHEVKQIITDFAENNAMLQELELIVNTCHD NVMEKLRNEFSSMKEADIRLLCYIFVGFSPQVISLFMKDTVANVYARKSRLKSRIKSAET ANKELFLALFG >gi|225935320|gb|ACGA01000072.1| GENE 75 98031 - 98435 401 134 aa, chain + ## HITS:1 COG:no KEGG:BT_1895 NR:ns ## KEGG: BT_1895 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 17 150 150 227 84.0 9e-59 MKKIVILLSAMLFMVACEKGNEEMKEAKVVVSVYDENGKIATGVPVKMYNEKDYKVFEKD 
NLTLPTAIIQTNESGMATFVLPREEWFATQSQRFLTFVVQEGGGPNNYQIWSSGKTVEAG KAVKVEIRLTQFPN >gi|225935320|gb|ACGA01000072.1| GENE 76 98447 - 99733 716 428 aa, chain + ## HITS:1 COG:MA4285_2 KEGG:ns NR:ns ## COG: MA4285_2 COG3291 # Protein_GI_number: 20093074 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 144 397 659 914 1325 94 28.0 6e-19 MKQTRLYIITALCLQMFLGTILLSCSTENDEYRKDSPSAANPVEPIEASLEDFFIEQLPS KTIYALGENIDLTGLNVTGKYDDGKQRPVKVTPEQISGFSSSTPVEKQEVTITLEGKQKS FSVQVSPVRIENGVLTEILKGYNEVILPNSVRSIPKAAFSNSQTAKVVLNEGLKSIGDMA FFNSAIQEIVFPSSLEQMEENIFYYCRNLKKADLSQTKLTKLPASTFVYAGVEEVLLPAT LKEIGAQAFLKTSQLKTIEIPENVKTIGLEAFRESSVTTVKLPNGVTTMAQRAFYYCPEL AEVTTYGSTFNDDPEAMIHPYCLEGCPKLARFEIPESIRILGQGLLGGNRKVTQLTIPAN VTQINFSAFNNTGIKEVKVEGTTPPQVFEKVWYGFPDDITVIRVPAESVEKYKNANGWRD FTNKITTF >gi|225935320|gb|ACGA01000072.1| GENE 77 99869 - 100546 457 225 aa, chain - ## HITS:1 COG:no KEGG:BT_1899 NR:ns ## KEGG: BT_1899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 225 11 235 235 421 88.0 1e-116 MCAEMYHRADFIQSDPVQFPHRYTLKQDIEVSGLLTAIMSFGNRKQILKKAEELHGLMGD SPYQYVLSRQWEGDFPPGATGSFYRMLSHADFYGYFQRLYIAYTQFESLEEALMLYPGIP MEKLCAFLEVSAKSPQKKLNMFLRWMIRRDSEVDLGIWESFDRRDLIVPLDTHVCRVAHY FKLTDTETFSLKNARQITTALAEVFPDDPCLGDFALFGLGVNGEI >gi|225935320|gb|ACGA01000072.1| GENE 78 100619 - 101911 786 430 aa, chain - ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 1 311 1 315 463 80 25.0 6e-15 MNYTYKQIWLINFPVMMSILMEQLINITDAIFLGHVGEVELGASALAGIYYLAVYMLGFG FSIGLQVMIARRNGEQHYKETGRTFFQGVYFLSGMAVILCLLLHLASPLILCRLITSDEI YQAVIHYLDWRSFGLLFSFPFLAFRSFLVGITTTKALSVAAIMAICINIPFNYLLIFKLN LGISGAAMASSLAEFGAFIVLLLYMWVRVDKVKYGLKVVYDGKLMMKLLRLSVWSMLHSF ISVAPWFLFFVAIEHLGRTELAISNITRSVSTVFFVIVNSFASTTGSLVSNLIGAGQGKE 
LFPVCHKVLRLGYAVGGPLIVMALWGNQWVIGFYTNNDNLVRLAFYPFIVMLLNYAFALP GYVYINAVTGTGKTKLAFIFQLITILVYLIYLYLLSECFHASLTVYMTAEYLFVILLGIQ SIIYLKRKSD >gi|225935320|gb|ACGA01000072.1| GENE 79 102351 - 103664 758 437 aa, chain + ## HITS:1 COG:MK0109 KEGG:ns NR:ns ## COG: MK0109 COG0527 # Protein_GI_number: 20093549 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Methanopyrus kandleri AV19 # 119 430 130 461 467 72 25.0 2e-12 MKVCKFEKIKDEDEMKQVINCIQKEHPYVAVVPILAQLQEWLQAISISWFHEEDEVSHTT VNAIEAYCCTLANHLITDSHLNQEIKNRILECIKKIHILVEDKADLLIDKMIKAEIYGLS SDLFTYCLRQQGLRAQTLDTGKLIQINLERKPDIPYIQESIQQYIDENRNVDIFIAPLSI CRNVYGEIDFMSEQRNDYYATVLATLFKADEILLSTPINHIYANRNCLREQHSLTYIEAE QLINSGVHLLYADCITLAARSNIVIRLTDTHDLSTERLYISSHDTGNSVKAILSQDSATF VRFTSLNVLPGYLFMGKILEVINKYQINVISMASSNVSISMMLTASRDTLRIIQRELHKY AEMVMDENMSVIHIIGSLHWERTQVESHIMDTIKDIPVSLISYGGSDHCFTLSVHTTDKN RLIGSLSRQFFESQCAA >gi|225935320|gb|ACGA01000072.1| GENE 80 103718 - 104533 639 271 aa, chain + ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 2 256 3 260 266 140 31.0 4e-33 MKIAIIGAGHIGSAIVTCLAQGHLYNEKDIIVSNPNIDKLERLQEHFPAIHITTDNQQAI SEAEVIVLAINPWKVDEVLSPLRFSRTQILVSLVSGVCISHLAHLTEAEMPIFRAVPNMA ITERSSLTLIASRGTDKEYQQLIKQIFEEGGKCLFLQEKQLDTTSALTSSGIAFALKYMQ AVMQAGIELGIPGKDAMQMAAYSMEGATELILNHNTHPLLEIEKVTTPGGTTIKGLNELE HKGFTSSIIQAIKSSATTLIDKEEEDAKIFR >gi|225935320|gb|ACGA01000072.1| GENE 81 104624 - 105964 1272 446 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 446 4 456 461 327 39.0 2e-89 MNKILIIDDEVQIRSLLARMLELEGYEVCQAGDCKAALKQLEIQSPDVALCDVFLPDGNG VDLVLNIKKIAPNVEVILLTAHGNIPDGVQAIKNGAFDYITKGDDNNKIIPLISRAVEKA 
KMNVRLEKLEKKVGQMYSFDSILGESKVLKEAVSLAQKVSVTDVPVLLTGETGTGKEVFA QAIHYSSKRSKQNFVAVNCSSFSKELLESEMFGHKAGSFTGALKDKKGLFEEANNGTIFL DEIGEMAFELQAKLLRILETGEYIKIGDTKPTRVNVRIIAATNRNLQEEIKAGHFREDLF YRLSVFQVHLPSLRERTGDIRILATAFAKSFSEKLSYAINEMTPAFLEALEQQPWKGNIR ELRNVIERSLIVCEGGRLDICDLPLEIQNTHYECSDDSIGSFELAAMERRHIARVLEYTK GNKTEAARLLKIGLTTLYRKIEEYKI >gi|225935320|gb|ACGA01000072.1| GENE 82 106094 - 108598 1862 834 aa, chain + ## HITS:1 COG:YPO3001_1 KEGG:ns NR:ns ## COG: YPO3001_1 COG0446 # Protein_GI_number: 16123182 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Yersinia pestis # 18 456 20 458 459 421 49.0 1e-117 MKIIIIGGVAGGATTAARIRRVDETAEIVLLEKGKYISYANCGLPYYIGGVIEERDKLFV QTPEAFSTRFRVDVRTENEAIFIDRKRKTVTIRQSSEDTYEESYDKLVISTGASPVRPPL PGIDLSGIFTLRNVTDTDRIKEYIKSHAPRKAVVVGAGFIGLEMAENLHAQGAKVSIVEM GNQVMAPIDFSMASLVHQHLMDKGVNLYLEQAVASFSRDGRGLKVTFKNGQSISADIVIL SIGVRPETNLARAAELTIGPAGGIAVNDYLQTSDESIYAIGDAIEFRHPITGKPWLNYLA GPANRQGRIVADNVLGAKIPYEGSIGTSIAKVFDMTVASTGLPGKRLRQEEIDYMSSTIH PASHAGYYPDAMPMSIKITFDKKTGRLYGGQIVGYDGVDKRIDELALVIKHEGTIYDLMK VEQAYAPPFSSAKDPVALAGYVAEDIITGKNNPVYWRELRDIEMENKFLLDVRTPDEFSL GSLPGAVNIPLDELRDRLAELPKDKMIYTFCAVGLRGYLAYRILTQHGFDKVRNLSGGLK TYRAATTPIIIREENGNEIDESPIQKETASQTSQPNVSTAPVTAAADASATPAKTVRVDA CGLQCPGPILKMKKTMDTLVSGERVEITSTDPGFPRDAAAWCSSTGNQLISKDTSGGKSI VVIEKGEPKSCNIVTSCEGKGKTFIMFSDDLDKALATFVLANGAAATGQKVTIFFTFWGL NVIKKLHKPETEKDIFGKMFGMMLPSSSKKLKLSKMSMGGIGGKMMRYIMNKKGIDSLES LRQQALENGVEFIACQMSMDVMGVKQEELLDEVTIGGVATYMERADNANVNLFI >gi|225935320|gb|ACGA01000072.1| GENE 83 108630 - 108983 343 117 aa, chain + ## HITS:1 COG:no KEGG:BT_2435 NR:ns ## KEGG: BT_2435 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 118 122 150 72.0 1e-35 MKTICAMRDVFKAMGNFETAFEKMYQITLNEAMILCALKEASEKVTATNLSKQTELSPSH TSKMLRILEEKGLIVRSLGSEDRRQMYFHLTHTGKQRVNELELDKVEIPDLLKPLFK >gi|225935320|gb|ACGA01000072.1| GENE 84 109207 - 109842 652 211 aa, 
chain + ## HITS:1 COG:no KEGG:BT_2437 NR:ns ## KEGG: BT_2437 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 210 210 340 89.0 1e-92 MKKLIPILLAVFALASCEKDPDMGKLDDNYLVYTNYDKKADFKVPTFYLAPQILVISDSK EPEYLEGEGAEQILAAYTDNMVARGYEAAADQESADLGIQVSYIASTYYFTSYTQPEWWW GYPGYWGPGYWGGNWGGGWYYPYAVTYSYSTNSFLTEMVNLKAEQGDSKKLPVVWTSYLT GFETGSKAINRTLAVEAVNQSFTQSPYLTNK >gi|225935320|gb|ACGA01000072.1| GENE 85 109858 - 110484 629 208 aa, chain + ## HITS:1 COG:no KEGG:BT_2438 NR:ns ## KEGG: BT_2438 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 352 92.0 6e-96 MKTRKNIYFKVVALAAIAIAFAMPAKAQLSDNGYANIDWQFNAPLSNHFADKASGWGMNF EGGYFVTPNIGLGLFLNYHSNHEYVGRETFQMGAGEVTTDQQHTIFQLPFGAAARYQWNR GGSFQPYVSAKLGAEYAKIRSNFSMLEARENSWGFYASPEVGINVFPWVYGPGLHFALYY SYGTNKADVLHYSVDGLSNFGFRLGVSF >gi|225935320|gb|ACGA01000072.1| GENE 86 110572 - 113583 2390 1003 aa, chain - ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 13 419 96 528 686 258 35.0 5e-68 MKIIPSIVTFLFLIAAGTVSSPAQNATTVEPLLVYKALQDKDCRHWVDSVMDKLSFKEKV GQLFIYTIAPVNTKRNLELLREVIDTYKVGGLLFSGGKMQNQVELTNRAQRQAKVPLMIT FDGEWGLAMRLRGMPVFPRNMVLGCIRDNRLLYEYGREVARQCRQIGVQVNFAPVADVNI NPENPVINTRSFGEDPIQVADKVIAYASGLESGGVLSVCKHFPGHGDTDVDSHKALPVLP FTRERLDSVELYPFKEAIRAGLGGMMVGHLQVPVIEPIGGLPSSLSRNVVYDLLTDELAF KGLIFTDALAMKGVAGNGNVSLQALKAGNDMVLSPRNLKEEIPAVLEAIEKGELTREDIE SKCRKVLTYKYVLGLKKKSYVQLSGLEQRINSPQTRDLVRRLNLAAITVLNNKNHILPLH TDKEQAIALLEVGDPGETDALAKQLSRYTSLARFSLRANQTEEENQRLRDSLSTYKRIIV AVSEQRLAPYQPFFAKFVPESPAIYLFFTPGKMMLQIQRAVAHASAVVLGHSYSSDVQRQ VADVLFAKASADGQLSASLGELFPTGAGVTITPKTPLHFVPEEYGLSSAHLKRIDSIALD GIRQGAYPGCQVVVLKNGHIMFDKAFGTYTGKGSPRVESTNIYDLASLSKTTGTLLAIMK LYDKGRFNLTDKISDHLPFLQRTDKKDITIQEILYHQSGLPSWIPFYQEAIDKDSYDGRL 
FSARKDVHHPVQIGTTTWANPKFKFKSEYISPVKTGDYTVQICDSLWLNRSFRKVIEEKI AEAPLKQKRYVYSDVGFILLGMLVEQLAGMPMEAYLQREFYEPMGLEHTGYLPLRRFAKS EIVPSNKDRFLRKETLLGFVHDEASAFFGGLAGNAGLFSTARDVARVYQMLLNGGEIDGQ RYLSKETCQLFTTETSKISRRGLGFDKPDADDSKKGNCAPAAPAEVYGHTGFTGTCAWVD PVNELVYVFLSNRIYPDVTNRKLNQLHIRERIQGAIYDAMKKK >gi|225935320|gb|ACGA01000072.1| GENE 87 113580 - 114428 268 282 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 [Haemophilus influenzae PittAA] # 1 243 1 252 603 107 29 3e-22 ILNMKRFQILFLSCLVLSFTFSIFAQDKDTKEVIILQTSDVHSRIEPVNQKGDEYYNKGG FLRRAAFLEQFRKEHKNVLLFDCGDISQGTPYYNMFRGEVEVKLMNEIGYDAMTIGNHEF DFDVDNMERIFKMANFPVVCANYNLDATVLKDIVKPYVVLEKYGLRIGVFGLGTQPEGMI QANKCEGVVYEDPIRVSNEIAALLKDEEGCDLVVCLSHLGIQMDEHLVAGTRNIDVILGG HSHTFMKGPKTYLNMDGKEVPVMHTGKNGVRVGRLDLTLKHK >gi|225935320|gb|ACGA01000072.1| GENE 88 114445 - 115230 771 261 aa, chain - ## HITS:1 COG:STM4104 KEGG:ns NR:ns ## COG: STM4104 COG0737 # Protein_GI_number: 16767370 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Salmonella typhimurium LT2 # 27 241 283 498 518 87 28.0 3e-17 MKQTYAKYLMVAALTGALLFTSCRTARETTAQYEVTKVEGSMITIDSVWDTHPNAKAAEI LKPYKEKVDAMMYEVIGTSAMKMDKGGPESLLSNLVAGVLQQAAVQVLGKPADMGLVNMG GLRNILPEGDITVGDVFEILPFENSLCVLTMKGTDLRRLFEAIASLHGEGVSGIRLEITK DGKLLNAFVGGKPLKDDQLYTVATIDYLADGNGRMDAFLQAEKRVCPEDATLRGLFLDYV RKQTAEGKAITSKLDGRITIK >gi|225935320|gb|ACGA01000072.1| GENE 89 115386 - 115736 579 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 [Bacteroides sp. 
D1] # 1 116 1 116 117 227 100 2e-58 MDLIKIAEEAFATGKQHPSFKAGDTVTVAYRIIEGNKERVQLYRGVVIKIAGHGDKKRFT VRKMSGTIGVERIFPIESPAIDSIEVNKVGKVRRAKLYYLRALTGKKARIKEKRAN >gi|225935320|gb|ACGA01000072.1| GENE 90 115842 - 115985 90 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160883610|ref|ZP_02064613.1| ## NR: gi|160883610|ref|ZP_02064613.1| hypothetical protein BACOVA_01582 [Bacteroides ovatus ATCC 8483] # 1 43 1 43 43 66 88.0 6e-10 MKPREFMTFGALFYLVLSDVYGLFRQRVILTGVRKYSYTFEELFSLQ >gi|225935320|gb|ACGA01000072.1| GENE 91 116010 - 116990 497 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 10 317 5 316 319 196 38 8e-49 MNSNMEKPYVVGIDIGGTNTVFGIVDARGTIIASSSIKTSGYPTVEEYADEVCKSLLPLI IANGGVDKIRGIGIGAPNGNYYTGTIEFAPNLPWKGILPLAAMFEERLGIPTALTNDANA AAIGEMTYGAARGMKDFIMITLGTGVGSGIVINGQMVYGHDGFAGELGHVIARRDGRLCG CGRKGCLETYCSATGVARTAREFLAARTDASLLRNIPAENITSKDVYDAAVQGDKLAQEI FNFTGNILGEALADAIAFSSPEAIVLFGGLAKSGDYIMKPIQKSIDDNILNIYKGKTKLL VSELKDSDAAVLGASALAWELKDLKE >gi|225935320|gb|ACGA01000072.1| GENE 92 117170 - 117886 358 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 230 1 223 245 142 37 1e-32 MIQLTNINKTYNNGAPLHVLKGINLNIEQGEFVSIMGASGSGKSTLLNILGILDNYDTGD YYLNNVLIKNLSETKAAEYRNRMIGFIFQSFNLISFKDAVENVALPLFYQGISRKKRNAL ALEYLDRLGLKDWAHHMPNEMSGGQKQRVAIARALITQPQIILADEPTGALDSKTSVEVM QILKDLHKLGMTIVVVTHESGVANQTDKIIHIKDGIIERIEDNIDHDASPFGKDGIMK >gi|225935320|gb|ACGA01000072.1| GENE 93 117917 - 119176 1141 419 aa, chain + ## HITS:1 COG:SMc04351_2 KEGG:ns NR:ns ## COG: SMc04351_2 COG0577 # Protein_GI_number: 15965824 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Sinorhizobium meliloti # 12 419 10 393 393 117 27.0 6e-26 MIDIWQEIYSTIKRNKLRTFLTGFAVAWGIFMLIVLLGAGNGLIHAFEQSASERAMNSIK IFPGWTSKSYDGLKEGRRVQLDNKDMDATSHYFPNHVIKAGATVWQGGVNLSFGQEYVSL 
NLSGVYPNHTEVEVVKLFEGRFINEIDIKERRKVIVLHKKTAEILFNKTHTEPIGQFVNA GNVVYQVVGLYNDKGDSGDSDAYIPFTTLQTIYNKGDKLNNLVMTTKNLETVEANEAFEA HYRKVLGANHRFDPTDHSAIWIWNRFTNYLQQQKGSGMLRIAIWVIGIFTLLSGIVGVSN IMLITVKERTREFGIRKALGAKPLSILWLIIVESVTITTIFGYIGMVAGIGVTEWMNSAF GNQTMDTGMWTETVFLNPTVDIRIAIQATLTLIIAGTLAGLFPARKAVSIRPIEALRAD >gi|225935320|gb|ACGA01000072.1| GENE 94 119181 - 120422 325 413 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 21 413 22 413 413 129 25 7e-29 MRIDMDTCEEILITITRNKTRSLLTAFGVFWGIFMLVALIGGGQGLEDMMKKNFEGFATN SGFLVSQRTGEAYKGFRKGRWWNLESTDIDRLRSQVKEVEIITPSVARWGSKAVYEDKKY DCSVKGLYPDYLHIESQEMAYGRFINEVDIKEARKVCVIGKRIYESLFKPGEDPCGKYVR VDGIYYQVIGMSSSEGDMNIQGRASEAVTLPFTTMQQTYNLGGRIDVICFTAKHGVKVSE IQPKMEQVIKTAHYISPDDKQAVMCLNAEAMFSMVDNLFTGINILVWMVGLGTLLAGAIG VSNIMMVTVKERTTEIGIRRAIGARPKDILQQILSESMVLTTIAGMCGISFAVMVLQLVE MGANADGGDTRFQVTFGLAIGTCALLIALGMLAGLAPAYRAMAIKPIEAIRDE >gi|225935320|gb|ACGA01000072.1| GENE 95 120503 - 121603 1187 366 aa, chain + ## HITS:1 COG:VC1563 KEGG:ns NR:ns ## COG: VC1563 COG0845 # Protein_GI_number: 15641571 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 45 316 52 335 338 125 26.0 1e-28 MKKYLKITLLVVIAIILVGTFVFLYQKSKPKVITYETVRPEMTDLQKTTVATGKVEPRDE ILIKPQISGIIDEVYKEAGQAVRKGEVIAKVKVIPELGQLNSAESRVRLAEINATQAETD FARVKKLYDGQLISREEYEKSEVALKQAREERQTAKDNLEIVKEGITKNSASFSSTMIRS TIDGLILDIPVKAGNSVIMSNTFNDGTTIATVANMNDMIFRGNIDETEVGRIHEQMPIKL TIGALQNLTFNAILEYISPKGVETNGANQFEIKAAITIPDSVQIRSGYSANAEIVLQKAN QVLAVPESTVEFSGDSTFVYIMTDSVPEQKFQRTQVTAGMSDGIKIEIKKGVTIQDKIRG AEKKDK >gi|225935320|gb|ACGA01000072.1| GENE 96 121613 - 122950 1051 445 aa, chain + ## HITS:1 COG:mll1107 KEGG:ns NR:ns ## COG: mll1107 COG1538 # Protein_GI_number: 13471200 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Mesorhizobium 
loti # 7 428 8 417 466 88 21.0 3e-17 MKIIRKTISALLLAGVGITSIQAQNNSSQAWTLRQCIDYAIEHNIEIRQSANNVEAGKVS VNTTKWARLPNLSGSASQNWSWGRTASPIDNSYSDINSANTSFSLGTNIPIFTGLQLSNQ YSLAKLDLKAAIEDLNKAREDIAINVTSAYLQVLFNQELSKIAHNQVDLSKDQLKRIQGL QGVGKASSSEVAEAQARVAQDEMTAVQSDNTYKLSLLDLTQLLELPTPEGFVLESPKEEL EFESLTPPDDIYTQALTYKPSIKAAEYRLQGSLNSIRIAQSAFYPQLSFSAGLGSNYYTV SGRSESSFGSQMKNNLNKYIGFNLSVPIFNRFATRNRVRTARLQQIDLSLKLDNSKKILY KEIQQAWYNALAAESKYNSSEVAVKANEESFRLMSEKFNNGKATFVEYNEAKLNLTKALS DKLQAKYDYLFRTKILDFYKGQVIE >gi|225935320|gb|ACGA01000072.1| GENE 97 123035 - 124063 820 342 aa, chain + ## HITS:1 COG:SP0660_2 KEGG:ns NR:ns ## COG: SP0660_2 COG0229 # Protein_GI_number: 15900561 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Streptococcus pneumoniae TIGR4 # 198 341 1 144 145 201 64.0 2e-51 MKYFIVLVIVMISNTFLGVAKPMKNQAEIYFAGGCFWGTEHFLKQIRGVENTQVGYANSN VANPSYEQVCSGKTNAAETVKVVYDPKTVDLNLLLDLYFKTIDPTSLNQQGNDRGTQYRT GIYYVDKEDLPVIKQAIQLLSAQYKTPIVLEVKPLTNFYPAETYHQDYLDKNPGGYCHIN PALFEMARKANAPKAKAYQKADDATLRKELSAEQYAVTQKNATEPAFHNKYWNEHRPGIY VDITTGEPLFVSTDKFDSGCGWPSFSQPIQKDLIAEKKDTSHGMIRTEVRSKTGDAHLGH VFTDGPKEKGGLRYCINSASLRFIPKDKMKEEGYGEYLPLVK >gi|225935320|gb|ACGA01000072.1| GENE 98 124202 - 124717 561 171 aa, chain - ## HITS:1 COG:no KEGG:BT_2500 NR:ns ## KEGG: BT_2500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 172 172 270 86.0 1e-71 MKTTVLFRMIVLVAMVFAGIANTTVKAQGNNFITNEERVDDLVVSKVIYRLDGSLYRHMK YDFTYDDQKRMSSKEAFKWDASTEKWIPYFKIDYTYSSNEITLVYARWNDSHRAYDASVE KSVYELNDANMPVAYMNYKWNDKWIEESAASWAMNVSTPATNEATLLTASR >gi|225935320|gb|ACGA01000072.1| GENE 99 124984 - 126390 876 468 aa, chain - ## HITS:1 COG:no KEGG:Phep_3369 NR:ns ## KEGG: Phep_3369 # Name: not_defined # Def: kelch repeat protein # Organism: P.heparinus # Pathway: not_defined # 233 467 600 837 842 194 40.0 7e-48 
MLLIFCLSFTVSVRAVPPAEEIKRVKADQILDTKQFNYLDKGSIITLQIVLPDQDKVVFE SLMRIESTDKELYNLILSRSPQAGTELLWVCNSEVEKVRIGSPKDSIRTDTMSVGFSVRF PADSVYISANEHSYAFSSHGIRSDRGYRFHILGDKRSHLKLVEGHAELLSIGKEEQHLSA WYWILLIIVVDMIVFLGMHWRRRRQKRLKASNRIEILTSFGDEGERVKLPRYNAVYLFGG LQVFDRNGEEVSQRFSPLLRELFVLLILRSAENGISSEELTTMLWFDKDDVSAKNNRAVN LSKLRVLVESVGGCTISKISGNWRVQFDKTAFVDYYECLPERIRPDSLTTERIKIIAALT AKGGLLNECDYLWLDTFKSRIADSLIESLLKYATLLDEHEAYDTKLLISDIIFRFDSVNE AALRLKCATYVSMRKQYMAKTTYEHFCREYKSLYDEPFEHSFNSIISQ >gi|225935320|gb|ACGA01000072.1| GENE 100 126518 - 127408 619 296 aa, chain - ## HITS:1 COG:no KEGG:Mboo_1713 NR:ns ## KEGG: Mboo_1713 # Name: not_defined # Def: hypothetical protein # Organism: M.boonei # Pathway: not_defined # 49 296 68 315 318 165 37.0 2e-39 MRVANLLILIIAVLMTSCFKDDESESLGETSGFLPGDGIVTYSGYAPLADRPVRIHFHIP VNGDMKKMPVLFVFPGLERNADDYLNAWRAEASNREVIVFVFEFPTETYSTAQYIEGGMF QGNTLLDRSEWTFSLIESVFDTVRKDTGSSRNKYDMWGHSAGAQFVHRYLTFMPDTRVDR AVSANAGWYTLPDVDVAYPYGLKNTDVAMTSRVASLFARKLIVHLGTADTDRNGLNTSAG AEAQGANRYQRGKYYFSEAKRISTKDGCSLNWDKYEVAGVAHEYAKMAAAGAKILY >gi|225935320|gb|ACGA01000072.1| GENE 101 127414 - 128643 1236 409 aa, chain - ## HITS:1 COG:no KEGG:GFO_3192 NR:ns ## KEGG: GFO_3192 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: G.forsetii # Pathway: not_defined # 1 386 1 382 401 103 27.0 1e-20 MKKIIIGVFSLILATGMYAQTIGNEAFPQLKDGLLDLNLRTPGQKLRFGGYLQGSGYYTD IKNAESEYGFDIEHAYLSLEGSFLNEKLGFFLQADFADSYPLLDAWVSYAPWKQLKISAG QKQTFTNTRQMMMLDQGLAFGEHSLMDRTFSRTGRELGLFVESRLSLGKLGFDLGAAVTS GDGRNSFGSSSTDPDLGGLKYGGRVTVYPLGFFTSGNEMVFNDFIREKSPKLAIGAAYSY NVGASNMVGEGHNDFQMYDKEGAADYPDYRKLSADLLFKWNGFSLLAEYVNATANGLNGL YVEKDEGAKLKPHQIANYLALGNGFNVQAGYLFHKLWALDVAYSKVKPEWKETEQSVLRE ADNITFGASKYFNNNTFKLQLAGSYTSYNQLVGAGSKEFQVKLNLHILF >gi|225935320|gb|ACGA01000072.1| GENE 102 128676 - 130199 1306 507 aa, chain - ## HITS:1 COG:BH3831 KEGG:ns NR:ns ## COG: BH3831 COG2866 # Protein_GI_number: 15616393 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: 
Bacillus halodurans # 44 228 44 233 351 74 28.0 4e-13 MKRILLILCLLLSGLPFYAQQTGKVTAKFFPDPQVDMDTPAFAKKHGFTTYRELMTFLHD LATAHPEWVKLQIVGRTQRGREIPMIKVSKGGSDKLRVLYTGCVHGNEPAGTEGLLYFMK QLTRDPQLSALLDKMDFYILPSVNIDGSEQGERLTANGIDLNRDQTLLSTPEARTLQRVA LTVKPHLFIDFHEYKPLRVSYEEVTDGLLVTNPNDFMFLWSSNPNVSPALRTVVDDFYVP EASRMADAEGLTHHTYFTTKSNRGEIIFNIGGSSPRSSTNIMALRGAISMLMEVRGVGLG RTSYKRRVYTVYKLAESFARTTFEHEDQIRKAVDESAHYNGDIAVTFRSKPASDYPLNFI DMLACKEITVPVEARIAPESEVVLTRQRPVAYYLDANQNRAVEILQHYGVELERLESPET VELECYTVTKAVESHDLVAGILPLNVATNTSNRTITLPAGSYRIPMSQPLATLVTVLLEP ESANGFVNYRVIDAAVNNTLGVYRKMK >gi|225935320|gb|ACGA01000072.1| GENE 103 130207 - 131583 1212 458 aa, chain - ## HITS:1 COG:no KEGG:GFO_3192 NR:ns ## KEGG: GFO_3192 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: G.forsetii # Pathway: not_defined # 79 432 61 386 401 119 28.0 3e-25 MDMKKYLLFLIGLFLFLPLMAQTDEGSDDAEIVEASDESLSDIDNKVVLHRYKMGDGLRL TTQGGNKLVISGMVQTSVESRRFEDVDQMYNRFRVRRARVRFDGSVYHDKLRFRLGLDLV KGSETDDDSGSLLMDAYAAYRPWGSKLVVSFGQRSTPTDNYELQMSSHTLQFGERSKITS AFSTIRELGVFAESSFRVGSKGLLRPAIAITDGSGPISEGKRYGGLKYGSRLDYLPFGAF RSMGGSREGDMAYELTPKLSVGVAYSYADGTSDRRGGRSNGDILYMNDRDEIDLPDYAKL VADFAFKYRGFSMLGEYAKTWGYVPSSITKRVRNDGTTATTFDVNGEQNVEAYIKNRMML GQGFNIQAGYMLRSLWSFDLRYTYLKPDEYSYMNNNLYFNRHNFYDFSVSKYLTRNYAAK IQLTVGLARSNGENRTPDSTYTYNGNEWIGNLLFQFKF >gi|225935320|gb|ACGA01000072.1| GENE 104 131596 - 132744 1116 382 aa, chain - ## HITS:1 COG:no KEGG:CCC13826_0552 NR:ns ## KEGG: CCC13826_0552 # Name: iadA # Def: isoaspartyl dipeptidase (EC:3.4.19.5) # Organism: C.concisus # Pathway: not_defined # 5 381 3 379 379 300 42.0 6e-80 MMFKLIQNIHLYAPEDKGMNDLLICGEKIACIAPHIDIRGIEVEVIDGAGMNAAPGCIDQ HVHIIGGGGQTGYFSLAPEVPLSRLLACGTTTVVGLLGTDGFVKQLPALYAKTKALCMDG ISAYMLTGYYGLPTPTLTDCIANDLIFIDKVIGCKIAISDDRSSYPTKSELMRIIQQVRL GGFTSAKGGVMHVHLGALPGGIDLLLEIAREYPSLISYISPTHMGRTHDLFLQGIEFARM GGMIDISTGGTKYCEPYESVLEGLNAGVSIDRMTFSSDGNAGVRRKDPVTGEDSYTVAPL HRNLEQVIRLIVDGGIAPSDAFRLITTNPARTMKLKGKGELWEGWDADITLFDDKWKLQG VYARGAEMMHEGIVIRKGNFEM 
>gi|225935320|gb|ACGA01000072.1| GENE 105 132760 - 134139 1074 459 aa, chain - ## HITS:1 COG:PM0933 KEGG:ns NR:ns ## COG: PM0933 COG3069 # Protein_GI_number: 15602798 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Pasteurella multocida # 5 459 2 460 462 270 41.0 7e-72 MMIYIGAFIALQVIVLVAYWLMKKNNPQGVLMVAGILMLALSMLLGMHSLSLTETTGTPV FDLFRIIKETFSSNMLRVGLMIMTIGGYVAYMKKIKASDALVYVSMQPLAIFKKFPYIAA SIVIPIGQMLFICTPSATGLGLLLVASIFPILVGLGVSRLTAVSVISACTIFDMGPGSAN TARAAELVGQNNMLYFVEHQLPLAIPLTILLMIVYYLTNRYFDRKDRQSGRTQPEMVDTK DFKVDIPLIYAILPVLPLFLLIIFSKYVHLFDPPIELDTTTAMFVSLAVSMLFELIRLRS FKAVFVTMKSFWEGMGKVFTNVVTLIVAAEIFSKGLISLGFIDALVEGCTSLGFSGTLVS IVIILILFLAAILMGSGNASFFSFGPLVPSIATKVGMAPVDMILPMQLASSMGRAASPIA GVIIAISEIAGVSATDLAKRNIIPLTITLIAMIVIHFFI >gi|225935320|gb|ACGA01000072.1| GENE 106 134157 - 135572 1188 471 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175146|ref|ZP_05761558.1| ## NR: gi|260175146|ref|ZP_05761558.1| hypothetical protein BacD2_25041 [Bacteroides sp. D2] # 1 471 1 471 471 822 100.0 0 MMRMNKPMTTAMFCMALLLGGLSSCTTNDVEDIQGDGEQDLYEIKDRALGEYLVYNCSRT DENKLPYEMAVAENGKFYLNTQKAATVENLYLVKNDAQISKLESAGLATAAVKIVDMDGL QFFTALKTLKITSNSVERMDLTALTQLETVEMNNNCVATLDLSQNTKLVRFRYGGNTTTD TSTKLSTISFANNNVIEHIYLKNQNLQENGFTLPSNYSALKELDLSNNPAAPFAIPEDLM NQLTTAVGVVADSEGGGDEDGELFTIPDQAFGEYLYYLSTTAGKLPQGLVVKEGNEYQLD KTIAATVTSVNVSKTTAIISELVTAGLETAETLVSSADGLQFFTGLVEFTATSNKFTEAL PITGLSNLEVLQVNTAGVGSLDLSGSPKLRILNCNGSTKSGYGTLSSINLSYTSNLETLN LKNNKLEAINVTNLVKLTELDLSGNPGANFRIPAGIFNNLTTTKNKGVEAE >gi|225935320|gb|ACGA01000072.1| GENE 107 135793 - 137742 1225 649 aa, chain + ## HITS:1 COG:alr5324 KEGG:ns NR:ns ## COG: alr5324 COG0744 # Protein_GI_number: 17232816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Nostoc sp. 
PCC 7120 # 433 582 89 236 643 130 42.0 1e-29 METSSKKRIIILSATGILLTSLICLFAFRNVILCHIVDKRATRAEQTYGLQIHYQELRMK GCNEVALQGLSIVPNQRDTLLTLQSVNVKLSFWELLKGEIEVRNVLMNGLAINFIKRDSI ANYDFLFFKRQQEAEPQPVIESDYANRIDRILNLIYGFLPENGQLSQINITERKDSNFVA VNIPSFIIENNHFQSTIQIKEDTLPQQLWEATGELNRRDYTLKASVFAPEKRKISLPYIT RRFGAEVTFDTLSYNMTKDKHASNQLLLKGKARVNGLDVFHKALSPEVIHLDRGQLCYEM NISGHSLELDSTTIVDFNKLQFHPYLRAEKEKGNWHFTAAVNKSWFPADDLFSSLPKGLF SNLEGIKTSGELAYHFLLDIDFAQLDSLKLESELKEKDFRITSYGATSLSKMSGEFIYTA YENGIPVRTFPIGPSCKHFTPLDSISPILRMSVMQSEDGAFFYHRGFLPDALREALIYDL QVKRFARGGSTITMQLVKNVFLNRNKNFARKLEEALIVWLIENEKLTSKERMYEVYLNIV EWGPLVYGIQEASAYYFNKRPSQLNTEESIFLASIIPKPKHFRSSFAEGGQLKENMEGYY KLIAKRLAQKGVISEIEADSIRPDIQVTGAARNSLAGENPESSSPAAEE >gi|225935320|gb|ACGA01000072.1| GENE 108 137773 - 139155 188 460 aa, chain - ## HITS:1 COG:no KEGG:BT_2502 NR:ns ## KEGG: BT_2502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 1 343 399 354 68.0 4e-96 MKRILFLLFAVGLTANMTVMAGMSTSKVRKETRFLTDKMAYELSLSTQQYNDAYEINYDF IYSVRNIMDYVARGYEWALDDYYEALDIRNDDLRWVLSDAQYRRFLGAEYFYRPIYVTGG KWSFRVYVNYPNRSLFYFGVPYHYRTYCGAHYRPHFHHVSYYRGRYTNFNHYSAPHRVRD QRVFHSYRRSDFGSVRFRPNTSTRPHNAPTRPGNSSRPGSSTTRPGTSRPGTSTRPDAST RPGTTTRPGTSTRPRPESGGDRPSTSVRPSTPSGSGRPSNNDKNGNTNRVESDRNSGRRP ETGSGSSSSNRRPTSGSSTGSSSSNRESGKTSSSSRESSSRRNDSGSYNRNNSSERNNSS SGSSRGSSNRSSSNRSSSNRSSSSRSDSGRSSGSTTSRSSSTTSNRSSGSSSSTRSGGSS TRSNVSSGQTERRSSSGSTRSNSTQSSSRSSERNSSSSRR >gi|225935320|gb|ACGA01000072.1| GENE 109 139170 - 139853 485 227 aa, chain - ## HITS:1 COG:no KEGG:BT_2503 NR:ns ## KEGG: BT_2503 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 229 338 162 39.0 1e-38 MKKIHIMLCLALGILVSGCFTSYVSTKQMMGIYQGMTQSQVESVLGKPDFRRFDGDMEEW EFHRDNGTPVLTSEPVTIIVQFVNREVVSMDTFKGYGRPAPMHSVVVPPAVNTTVEVFPN HEQVEEARLMTDSEFDEFINKLKITVMNEDQKKLVDRMLRTYDVTSNQCVKIVKEISYTP DQVEMMKKLYPYVRDKRNFNKVIDILFSNAYKDEMRKFIEEYHQNNK >gi|225935320|gb|ACGA01000072.1| GENE 110 139880 - 140551 584 
223 aa, chain - ## HITS:1 COG:no KEGG:BT_2504 NR:ns ## KEGG: BT_2504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 1 226 226 318 69.0 7e-86 MRKIIISFCILFAALSLQAQSVNGIRIDGGNTPILVYLGGNQICLPTTTCFIANLNPGHY TVEVFATRFTRAGERVWKGEKLYKDLVYFDGRGVTEIWVDGRDNMRPERPGRPEQGEHRP GYGYNRVMNDQLFQTFYKEMKNEPFKDDRMKLLNAALAGSDFTSAQCLQLTKLYTFDDDR MEIMKIMYPRIVDKEAFFTVINTLTFSSSKEKMKDFIIGYGKR >gi|225935320|gb|ACGA01000072.1| GENE 111 140667 - 141506 222 279 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 5 273 2 275 285 90 26 6e-17 MSIELGKFNQLEVVKEVDFGVYLDGGEEGEILLPTRYVPEDCKIGDFLNVFLYLDMDERL IATTLTPLVQVGQFACLEVSWVNQYGAFLNWGLMKDLFVPFSEQKMKMQVGRKYVVHAHL DEESYRIVASAKVERYLSKEKPEYTLEAEVNILIWQKTDLGFKAIIDNKYSGLLYENEIF CPLETGMEMKAFVKQVREDGKVDLILQKPGFEKIDDFSKTLLDYIKAHGGRIYLNDKSPA EDIYDTFGVSKKTFKKGVGDLYKKRLITLHEDGIALADS >gi|225935320|gb|ACGA01000072.1| GENE 112 141719 - 143791 1205 690 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 32 524 4 523 536 301 33.0 3e-81 MEMNRKNIYLAVLPALPWILTNCCQAQTTDNRPNIILILADDMGYSDIGCYGGEIQTPNL DKLASAGLRYRQFYNGARSCPTRASLLTGLYAHQAGMGWMTTADMQRPPYQGCLNNECVT IAEVLKSAGYSTYMSGKWHVSSERKNSGGVQEHWPNQRGFDEFYGIVGGAANYFNMKYNR NNQQFHSPKDGTFYFTHAISDSAAIFIKQHSFEEKPLFLYLAYTAPHWPLQALQKDIDKY VDVYKAGWDQLRETRFQRQKKMGLFSPDVQMSPRDEAVPAWDSLSKEEQKEFAMRMAIYA AQIDAMDQGVGLVVEELRKKGQLDNTIIMFLSDNGACAEFISSGKRKAVDGKEDTFESYR RNWANVSSTPYKEYKHHTNEGGIATPLIVHYPKGISQKLDNSFVNEYGHITDIMATCVDF GKASYPTTYKGHTIVPMQGVSLRPNFIGKSTKRGKTFWEHEANIAVREGKWKIVTKVIEG QSFDENAIHLYDMNADPTELNDLATTYPQKKNELYAAWKEWANKVEALPLDTRSYGERRR DYKRRCINGQFDDNFGDWKYGVTNTAEIYFSIDSVHTISGDKTARIDIKKSGDKPANGFL 
KWMFNAKAGEKVSIRFKVQAEKKTPIIFRLEREQNLATKLIDAPLSLTTDVRTFEFKDIP IKENGNYQLVFYVGKSDGVIWIDDVELDIK >gi|225935320|gb|ACGA01000072.1| GENE 113 143932 - 145374 813 480 aa, chain + ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 24 412 9 391 465 102 25.0 1e-21 MKNYLVSLALAAGCTASMYAQQTEKPNLVLFIADDCSFYDLGCYGSPDSKTPNIDRFATQ GIRLTKTYQAVPMSSPTRHNLYTGVWPVRSGAFPNHTRADEGTLSVVQQLHPQGYKVALV GKSHVAPDSVFPFDLYAPTLQGGDIDFDAIRKFISECKDKHEPYCLLVASRQPHTPWNKG DASQFDADKLTVAPMYVNIPETRRMLTHYLAEINYMDNEFGTLLSILDQQKETDKSVVVY LSEQGNSLPFAKWTCYDAGVHSACIVRWPGVIKPHSESDALIEYVDIVPTFIDIAGGKPI APVDGKSFKNVLTGKEKTHKQYTFSLQTTRGIYKGSPFYGIRSVSDGRYRYIVNLTPEAK FQNTEVFSPLFKQWEAKGETDKHAREMTHKYQYRPAIELYDVEKDPYCMKNLADDPNYKT KINELDKQLKGWMKYCGDKGQETEMDALNHLASNLQKDTTTKEKRNNQSTKGKKKRNKKH >gi|225935320|gb|ACGA01000072.1| GENE 114 145580 - 146962 916 460 aa, chain + ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 26 438 10 421 465 114 25.0 5e-25 MEKSLISLCLLSGLSIGNCFAQVPAKPNILLIVADDCSYYDIGCFGAVNNKTPNIDALAR QGIKFNCAYNSVSMSTPTRHCLYTGMFPMHHGGYANHSSVNADVKSLPSYLGELGYRVGL SGKWHIKPLANFPFEDVPGFPKGCTSTNTDYTTDGIKEFMGRNNSQPFCLVLASINPHAP WTGGDASVFDRKKLQLPPQFVDTEVTREYYARYLAEIGLLDQQVGDAIQILKDKDLLKNT LVIFISEQGTQFAGAKWTNWNAGVKSAMIASWKGVTQPGTETSAIIQYEDILPTFIDVAG GKAPDVIDGKSFVGVLQGKAKTHHKYAYHVHNNVPEGPAYPIRSISDGKYRLIWNLTPEA TYLEKHVEKNEWYLSWKAADTEQAHKILNRWQHRPEFELYDITKDPYEFNNLADKAAYKQ KKAELIQELKKWMEQQKDTGADKDVPRTPNKKKIQTSSAS >gi|225935320|gb|ACGA01000072.1| GENE 115 147091 - 150252 2224 1053 aa, chain + ## HITS:1 COG:no KEGG:BT_0364 NR:ns ## KEGG: BT_0364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 1053 4 
1027 1027 534 35.0 1e-150 MSNIKFKRCLWFVIIFLLSQSGYAQQTITVRAKVVDSSDTRPLIGATVSVRGTSQGVITD LDGFFQLNVQPHHKLVISYIGYRKLIIDAKEASKAGTIALSSDSEELDEVIVVGYGVQKK ASTVASISTTKGEDLLKGGNLTSVSEALQGKLNGVVAINSTGMPGENAATIFIRGKASWN GTSPLILVDGIERNMNDVDFNEIESISVLKDASATAVYGVRGANGVILLTTKRGSKQKPK VTFSANMGFKQMTTKLDWADYITSMKMYNEALANDNNWDKQVPTSTIRAWEQAYATGNYG PYNDVFPEVDWFDEITKDFGFSQNYNINVEGGMEKMDYFISLGYQYDGGNYNIDKQADFD PRYYYRRYNWRANFDFKLTSTTTFSANVAGNMGYQNKPSGSANFAKTFQAPTNIFPIKYS DGYWGDVSESGYNLIANMNTKGQNMFKKFQGWYDFILKQDLSFITKGLSAKARVSYNQYM TTNSKLIVGLVQGTNGSNAEKNSSVRYYREYDYANPIYNTDGTVSYPLKQNNRFPSEYIQ DETFPVNTTYDALNDVGRRLYYEFALEYNRQFGAHKVSALALMNRQIIESKGTDNVMQFP SYTEDWVGRITYNWKERYLGEVNMAYTGSEKFAPGHRFGFFPSFSVGWRISEEPFIKNKF GKVLTNMKIRYSYGQVGSDAGAPRFNYIQIYNQASNVQFGDSQNVGFGPTFKEGTMADPE STWEVATKQNLGIEFNLWEKLSMSIDLFDERRTGILMAPRTTLAWYGVGLPSLNMGETKN HGFELDLGWHDKVGKNWRYSINYSLSMNENRIVFRDDPADLERHLKEAGKPIDWQARFLA VGNYGTIDDVFNYAQTAIEQASPLKIIPGDLVYIDYNGDGIININDKAAVSQMNYPQTTM SLNLGLEYKGWGLNAMLYAPLGVYKLQFDQYLWDFPEGNIKAQPNTLDRWTPEKINSNQI MRPATHLDRKHNAVQSTYSYANHSYLRLKQLEISYKFPKKLLKPLSITACQFYVKGNNLL TFSGVDNRIDPETGGADSYPIVRTYTVGTRITF >gi|225935320|gb|ACGA01000072.1| GENE 116 150272 - 152098 1289 608 aa, chain + ## HITS:1 COG:no KEGG:PRU_2735 NR:ns ## KEGG: PRU_2735 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 608 2 624 624 212 29.0 5e-53 MKRIKILFIAIALLTIGSSCEDFLDKSPDMGLSDKDVYKSYETIRGFLDICYTELEQIFG SDTPANGSPFVGCFSDEMATIYNKSEVLTLHSGNWLRAKQDETFELGVKGATCIHSAYKA IRIANLVLANYTRVPGMTEEQTREILGQAHFYRAWFYFQLLKRYGGMPIFDHPFTENGEE KIARVTYHESHQWMMEDIEDAITMLPDTWDDSNTGRPTKIAAMAFKSMAQLYDASPLMQN DLNSTEVREYDRERAKLAAKSAMNAIDYIDTHTEQYRLMSQSEYKDIFYFTFPPSHQPEH IWYNRVPMGGITTVKGLWLYADLAGGTMVAASTYNAPTQNMVDLFEKKGPDGKYYPIKDS RAGYDDQKPYDNRDPRFKNNILVPGEEWGKNASSKPLYITTYEGGKATEDMKTLKNINSR QQTGYLCKKFLWPEANRYTALWEKYRVITVYIRVAQVYLDFAEASFEATGSATQKVEGCS LSAEDALNIIRARAGITALTEDRVSDPDLFREALRRERAVELMFENHRWWDIRRWMIAHE LFKETYPIKGIKATPVDPKVAVNKMKFTYTVVDVTPETRNFAMRNYWYPFPKTDIASSNG TLQQNPGW 
>gi|225935320|gb|ACGA01000072.1| GENE 117 152124 - 154952 2558 942 aa, chain + ## HITS:1 COG:no KEGG:BT_0364 NR:ns ## KEGG: BT_0364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 942 108 1027 1027 366 30.0 2e-99 MKRHIIYKWTLLAVGICTLYPDIATAQTSKREILAADTATIERLDGGITLHKDFVGAVSS IPVEELRSFPDLNVINALQGKAPGLVVRWGDGGLGFNSSELFIRGQHSNGNNQAVVIIDG IERSLDDLLVDEIESIEVLKDAVAKVLYGPAAANGVIKVTTKKGKVGKRKIRTTVEGGIM HVTRTPNYLDSYNYATLYNEARANDGLAPIYSPAQLEGYKNSTGVNDLLYPNVDYMDEFT RKQSFYRKATVEFIGGSKNMKYAMVAGYTGANGLEKLGERQNLNRLNLRGNLEIRITDYM NIRAGVGARLEIKDWGSKDGTGIYNAISTNRPNEYPFTIAPDAIGLPPTEDGVPYFGTSD RFSDNLYADVVYGGNSSERYMSSQADFGLDLDFNKYVKGLTAEGYLTFDNYNITRQSLSN VYPTYDIRPYMDETGSEQILFTQKRKQNLPKDHSISNNKVSRTSGWRANISYNRTFGVHD LGANLAYRYYKMERQNQTQDIIDASYSLRLNYGYARRYLVETALVYMGSNKFMDDHKYFL SPTFGVGWVLSNEQFMKKIEKINFLKLKASFGVLGYAGNTGYKLFHTAWKENGTVGFGEQ NKSNAYVTSLVRYGNPDLKWEKSAELNIGIEGRFLENRLRGEINYFHERRWDIIGTNATK YADIIGDYLYAENLAEVKNQGVEAYISWFDRPTKNFSYEIGLNMTFSKNKLIKADELENI ESYRKKIGQPTDHIFGLQAEGLFGKDVPLNGHTPQSFGPYQNGDIAYADLNGDGVIDSRD ETIVGNNFPRISWGMDIDLNYKGWGLYIQGVAETGVNKLLSNSYYWNTGLGKYSDMVLDR YHETLNPQGSYPRLTSTDGKNSFRNSSYWIKNSAFFRLKNVELSYTFKFKKKDIAQNLKL FARGANLLVISGFKNMDPERPDAGITNYPVLSTYTGGLAVTF >gi|225935320|gb|ACGA01000072.1| GENE 118 154993 - 156780 1267 595 aa, chain + ## HITS:1 COG:no KEGG:PRU_2737 NR:ns ## KEGG: PRU_2737 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 2 592 3 589 589 287 32.0 9e-76 MKIGRILILSSVLTPLFISCEDVLETKTVNEWDENKVWNLSDLAQGVLMQAYSAIPNTPN CFDINFLDVATDNAVTNSYASNIYKAAIGGITGADNPIGNWDVCYKQMQNIHEFMEKGLR DDLSYDRVNPETDAAIKKRLEGEALFLRAWWGFALLQRYGGRTDDGEALGYPITTRFITI EEGKHPENFKRNTYQECVRQIMEDCDQAMKKLPDTYTGDDAIVGIENTGRATSLAAAVLK SRVALYSASPAFQSSKIVTLNHGADYSIADQEAYQKQWEYAALISNAVLLLDGFGSYTPL TASNLADAPTTTPADFVFRNYYNTNGMETNHFPPFYRGAAHTVPSQNLVDAFPAIDGYPI GVSKQYDPDFPYANRDKRLDLNIYYQGRQFGDNSTFIDVVAGGKDSKEFHHQASRTGYYL AKFMSKKKDMLTPTQMLSAIHYNPLLRKSEVFLNFAEASNEAWGPKVKGPDCQYSAYDVL 
KMIRHLSGGLPENDAYLEEMSASKEQFRKLVQNERRLELAFENQRYFDMRRWLLPLDESV KGVIVTRDANGTLSYATETVEERKFNDVRFYYLPLPHAELLKNPNLKNNIGWDNN >gi|225935320|gb|ACGA01000072.1| GENE 119 156795 - 157742 892 315 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175159|ref|ZP_05761571.1| ## NR: gi|260175159|ref|ZP_05761571.1| hypothetical protein BacD2_25106 [Bacteroides sp. D2] # 1 315 3 317 317 625 100.0 1e-178 MKMKKLQIALLFLLTGLTSCYEDYIEDYETTSAGFAISNPLRTVISDRNMSIYVGVSLGG KREVDKTDWAKFTIDESLLNGTGFTLLPSNYYTLGDPETFRVRKSTLPVADVEIKFTDAF YADELTHEVHYALPFRVTESSMDQIREGGETSVVAIKYASSFSGTYYLTGNVVELDEAGN PIESTRQSYGDKQDIIKSPTCVLTTLSKSELIRPGIGDPTTDKKDNLRLTFENNGNFGGN YKVQISTEKGFRPVEAITGNYIHKSEEYTFNGNGDVPCPEIALKYIYEMDGKRYQVDEKL VLRRDPIDDLRIETW >gi|225935320|gb|ACGA01000072.1| GENE 120 157800 - 159290 1143 496 aa, chain + ## HITS:1 COG:CC1172 KEGG:ns NR:ns ## COG: CC1172 COG3119 # Protein_GI_number: 16125424 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Caulobacter vibrioides # 17 449 12 487 521 210 32.0 6e-54 MKISARKTSTLSTALLAILPIAKASAQHTTPSHPDKPNIVIILADDQGYGGVNCYPHIKK IVTPNIDKLAASGVQCMQGYTSGHLSSPTRAGLMTGKYQQSFGFYGLSTPHVGGIPQDQK LLSEYLVENGYNTACIGKWHLGDYIRSHPNNRGFQTFFGFINGLHDYYDPLVGGSWDGVY NGLAFTLDNMEPVTEMEYSTYEYTKRAVDFIQKNADHPFFLYLPYNAIHSPLQAPEELIG ELAINPQEIGKDDIARAMTFALDQGVGKVVETLEQLGLRDNTIIFYLSDNGAVEYSDKWE FRGRKGSYYEGGIRVPFIVSYPAKLAKGTIYNKPVMSIDIAPTVMELAGLSHADMHGVNL LPYLSGKDRTEPHDVLYWSTEKKSNNQVFKNEFAIRQGKWKLVSDPHFEKDYDLYDIEAD PQEKHGLKDQYPEKYKELFGMYLNWINQMPEELANGENARLKGMELMRKYQRNLKKSGKK VVPLSFGPHEKKKGKQ >gi|225935320|gb|ACGA01000072.1| GENE 121 159299 - 163321 2386 1340 aa, chain - ## HITS:1 COG:alr1285_1 KEGG:ns NR:ns ## COG: alr1285_1 COG0642 # Protein_GI_number: 17228780 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 811 1043 226 457 483 125 31.0 6e-28 MKSINVRRFLLTGLFLYTAMNLCATIHYNNLRFKQFETLDKLPHKTINAITQDNSGFLWL GTRNGLCRYDGYNIVTYQYHENDSNSLCHNFVNTLYNDSLQNAIWIATEEGICKYSSSTG RFVRYRIEGNTKPNVVFLKTFDKQLLAGCSNGVYAYDEKKDAFVPFLLTDTEASLVSGLA EDHNGFLWISTPKGARCYNMHKKVFMKLPESVSRISSKLYVDKFNRLWYNMNQQVNVYDI NNQQNYSIKESKQFKIFKSIALDGMDNIWLGSEYGIFVYGKDLRLIHHYQQTEGDLSNLN DNPIYSLFMDANHNMWVGTYFGGVNCYIDATDSFHVFSYGNSRNHLSGKAVRQITGGPDE DIYIATEDGGLNRIDKSGQITRSEALHQRMDIGATNVHSLLLNHEGDLWIGLYEKGARCY NFRTQHTKVFLPMDLYRRHSGFCMIEDAEHDIWYGGLNGFVIFRKEGKGYEVMKFKKPFL KFIFCMVNAPDGTVWAGTRRDGVFRLDKRKQTVERIATFPNKELFITDLYVDSQNQIWVG TDDNGMLLLNSKGEYITSYTEEQIGSNSIKGIVEDNAGNIWFGTVNGLCCIRAKGGIMRY TTEDGLPTNQFNYSSAYKDARGVLYFGTINGLVSFHPERIYKYPSRFNIALTAISLNGEL ITPSTSDSPLEKTISECTSITLTHKQARSVQIEYSGMNYRYNNNTMYAMKLEGVDKKWQN IGEQHQVRFSNLPAGHYRLRIKAGEDGIHWYEKGEKIINIRVLPPFWLSGWAYGFYAVCL LMLGYIGYRYTKNRLRLVMNLKTEHNLRVNMEKLNQQKINFFTYISHDLKTPLTLILSPL QRLLSQKEITNHDKKNIEIIYRNANQMKYLIDELLSFSKIEMNQEKINVRKGNIMLFLKE LSNIFEIVARDREIDFIVKLDDTDEEVWFSPSKLERIMYNLLSNAFKYTAPGDYVQLTAS LQEEDQQTFVHISVKDSGRGIPDGVKDRIFEPYYQVSPKDHREGFGLGLSLTKSLIQLHR GRIEIESKVGEGSNFIVILNVSEQAYEAAERRQDGITLAEIQKYNMRLKETVEILPDKLT QEKHPLQEERQSILIVEDNNEMNNYLKEIFMENYQVIQTYNGKEACEVMQKVYPSLIISD VMMPVMDGLELTRRVKQDINTSHIPVILLTAKTDEGEQTQGYLCGADAYIPKPFNAKNLE LLVRNMQQNRMSGIAHFKQTEELNITQITNNPRDERFMKDLVELIMANISKEDYGITEIT SALCVSRSLLHTKLKMLTGCSASQFIRSIRMREAKKHLLDGMNVAEAACAVGMTDPNYFT KCFKKEFDVTPTEFTKEHLA >gi|225935320|gb|ACGA01000072.1| GENE 122 163551 - 164636 949 361 aa, chain + ## HITS:1 COG:no KEGG:RB2377 NR:ns ## KEGG: RB2377 # Name: atsA # Def: arylsulfatase # Organism: R.baltica # Pathway: not_defined # 21 360 1258 1610 1610 423 58.0 1e-117 MKNLFLPLFFLFLSVTTKAQLSAGDILAGVKSHDKALHIKGGWIRDPYIVLGPDGKYYLT GTTPVAGDPREQDDQYNTGLGKTSIVGYQVQVWCSDDLAHWKSLGAPYSLAEDSPTFKGR EEMATKNPLWAPELHWTGDCWALVYCPQKHSGLALSSGKDIKGPWKHPNPEAFLKKHDPS LFKDDDGTWYLLFSNTLIVPLKPDFAGLAGEPVRIDPSNRRIGHEGATMRKIGKKYVHFG TAWSTDQGRKGSYNLYYCTSDKITGPYSERRFVGRFLGHGTPFQDKSGKWWCTAFFNGNR 
PPLDPDGIEKRDLREDAQTINKLGTTIVPLDVKILPDGDVYIRAIDPHYATPGPDEVQSF Q >gi|225935320|gb|ACGA01000072.1| GENE 123 164673 - 166337 1271 554 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 21 531 3 527 536 297 34.0 3e-80 MNKLALIAGGLALGSTVTAQEKPNIILILADDLGFSDLGCYGGEIETPVLDHLAAQGIRM TQMYNSARSCPSRANLLTGLYPHQTGLGHMTGSHAKEAKGYAGFRSNSDNVTIAEVLKEA GYFTAMAGKWHLGKVNPVQRGFQEYYGLLGGFNSFWNPKVYTRLPKDRPLHEYAEGEFYA TNVITDYAVDFIDQAHNEDKPLFLYLAYNAPHFPLHAPKAMIDKYMKTYLQGWDKIRDQR WKRIVGMNLLQGKPELSPRGVVPGSFFMDEAHPLPAWDSLTKEQQTDLARRMSIYAAMVD IMDTNIGRVVDKLDKNGELDNTFIIFMSDNGACAEWHEFGFDGHSGLAYHTHVGEELDGM GLPGTYHHYGTGWANVCCTPLSLYKHYAYEGGISTPCIAYWGNKVKNKGKIDHQPCQFSD IMPTCVELAETKYPRTYQGREILPTAGTSILPILQGEKLKERYIYAEHEGNRMVRKGDWK LVSAYFKEDKWELYNITKDRTEQNDLSSQYPERVKEMENAYFEWADKSDVMYFPKMWNTY NQGRKKDLKEYKTK >gi|225935320|gb|ACGA01000072.1| GENE 124 166343 - 168550 1836 735 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 45 391 2 333 570 87 23.0 1e-16 MKKKHVLTIMLGALLQTSVFAAWSPKGDKIRTVWAEKVTPENVWRSYPRPQLQRSQWMNL NGLWKFAVTDQATTKKQVKYEGEILVPFAIESSLSGVQRTFLPTDKLWYQHNFTIDEDWK GKNIILHFQAVDYECQVWVNNKLAGTHKGGNNPFEFDITKFINKSGDQMVELSVVDPTDQ ESISRGKQQLNQKGIWYTPVSGIWQTVWLEAVNPQYIRQILPEADIHKKTVKLHLDVAGA KGQENIKVEVLDKRKVIKTIEQKGTTDMEIEIPDAVLWTPASPQLYHLNVELSVNGKIVD NVKSYFTLREAGIKKDACGYNRICLNGEPIFQFGTLDQGWWPDGLLTPPSEEAMLWDMVQ LKEMGFNTIRKHIKVEPEQYYYYADSLGLMLWQDMVSGFSTERKQAEHVAAGAANDWNAP AEHSAQWQKEMFEMIDRLRFYSCITTWVVFNEGWGQHNTAEIVNKVMSYDKSRIIDGVTG WADRGVGHMYDVHNYPVSSMILPEYNGDRISVLGEFGGYGWAIKEHLWNPEMRNWGYKNI DGAMALMDNYGRVIHDLKTLVAQGLSAAIYTQTTDVEGEVNGLLTYDRKVTKMPTSLLHI LHNDLYNVAPAKAITLVPDGQGGAKNQRKVSVNGSEMKQVMLPYSVQPKATVVSETEFVL 
NQPLKNLSLWLRAAGTGKVWLNGVEVFLQDIRMTKQYNQYNLSDYSSLLKKGKNTIKVEI QGTNKMDFDYGLRAF >gi|225935320|gb|ACGA01000072.1| GENE 125 168765 - 169766 1063 333 aa, chain + ## HITS:1 COG:TVN1097 KEGG:ns NR:ns ## COG: TVN1097 COG0039 # Protein_GI_number: 13541928 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Thermoplasma volcanium # 4 270 1 267 325 105 29.0 1e-22 MEFLTNEKLTIVGAAGMIGSNMAQTAMMMKLTPNICLYDPFAPALEGVAEELYHCGFEGV NLTFTSDIKEALTGASYIVSSGGAARKAGMTREDLLKGNAEIAAQFGKDVHQYCPNVKHI VVIFNPADITGLITLLYSGLKPSQVSTLAALDSTRLQNELVKFFHIPASDILNCRTYGGH GEQMAVFASTTKIKGEPLTDFIGTTRLPLADWEALKVRVIQGGKHIIDLRGRSSFQSPAY LSIEMIAAAMGGQPFRWPAGTYVSNGKFDHIMMAMETSITKDGVTYKEVAGTPAEQEELE KSYEHLCKLRDEVIEMGIIPAIKDWHALNPNID >gi|225935320|gb|ACGA01000072.1| GENE 126 169941 - 170120 178 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237722857|ref|ZP_04553338.1| ## NR: gi|237722857|ref|ZP_04553338.1| predicted protein [Bacteroides sp. 2_2_4] # 1 59 1 59 59 69 100.0 6e-11 MGKVPKLKLQVLKVRNIVIILSKLVNGIKLRIEVYKISVIFEKYIRNKLLLVHSLYYAK >gi|225935320|gb|ACGA01000072.1| GENE 127 170056 - 170478 254 140 aa, chain - ## HITS:1 COG:no KEGG:BT_2511 NR:ns ## KEGG: BT_2511 # Name: not_defined # Def: putative transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 137 1 137 139 229 83.0 3e-59 MEEDIYLDKLDRRGIKPTAIRLLVIKAMMQAERAVSLLDLETLLDTVDKSTISRTIALFL SHHLIHSIDDGSGSLKYAVCDNSCNCVVQDLHSHFYCEKCHRTFCLEGTHIPVIDLPKGF TLHSINYVLKGVCSECTSQK >gi|225935320|gb|ACGA01000072.1| GENE 128 170486 - 172432 1878 648 aa, chain - ## HITS:1 COG:PAB0626 KEGG:ns NR:ns ## COG: PAB0626 COG2217 # Protein_GI_number: 14521140 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pyrococcus abyssi # 33 643 88 687 689 535 47.0 1e-152 MGHCSCGAHSCAAEKKVDAKVSVFHEYGKVIFSLLLLAGGIIMNALDLAFFREGSVALIW YIVAYLPVGIPVMKEAWESIREKDYFSEFTLMIIATLGAFYIGEYPEGVAVMLFYTVGEL FQDKAVDKAKRNIGALLDVRPEKALVLREGNLVIESPKKIKVGEVIEIKAGERVPLDGTM 
QNEVAAFNTAALTGESVPRNIRKGEEVLAGMIVTDKVIRLEVTRPFDKSALARILELVQN ASERKAPAELFIRKFARIYTPIVIALAVLIVLCPFVYSLINPPFVFAFNDWLYRALVFLV ISCPCALVVSIPLGYFGGIGAASRLGILFKGGNYLDAITKINTVVFDKTGTLTKGTFEVQ ACKSAGEVSEEELVKLVASVESDSTHPIAKAVVNYAKEQNIERVAVTGTKEFAGYGLEAT IDGTTVLVGNCRLLSKFDISYPEELLEITDTIVVCAVGNRYAGYLLLADALKEDAKVAID NLKALNIENIQILSGDKQSIVTNFAEKLGISKAYGDLLPEGKVNHLEELRQDEANWIAFV GDGMNDAPVLALSHVGIAMGGLGSDAAIETADVVIQTDQPSKVAEAIKVGKLTRRIIWQN VSLAFGVKLLVLILGAGGIATLWEAVFADVGVALLAIMNAVRIQKMIK >gi|225935320|gb|ACGA01000072.1| GENE 129 172499 - 173134 732 211 aa, chain - ## HITS:1 COG:no KEGG:BT_0227 NR:ns ## KEGG: BT_0227 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 209 209 91 32.0 2e-17 MKKALLLFALVAISAVSINAQDNLKWGVMAGMNVSKYTITGFDSRIGFHAGVKAELGLSQ DANGSGAYMDFAALLTLKGAKIDGGSLASIKFNPYYLEVPVRVGYKYAVNDNFSLFGSVG PYIAVGLFGKAKAKVDGDYFDFDEIGGNSMSEDIFGDDGLKRFDFGLGLKAGVEFSKKYQ VAISYDFGLVEVAKDLGMKNRNLMISLGYMF Prediction of potential genes in microbial genomes Time: Fri May 13 11:29:40 2011 Seq name: gi|225935319|gb|ACGA01000073.1| Bacteroides sp. D2 cont1.73, whole genome shotgun sequence Length of sequence - 93103 bp Number of predicted genes - 71, with homology - 69 Number of transcription units - 36, operones - 13 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 194 - 415 68 ## gi|260175169|ref|ZP_05761581.1| hypothetical protein BacD2_25158 + Prom 531 - 590 5.5 2 2 Tu 1 . + CDS 768 - 1340 320 ## BVU_4115 RNA polymerase ECF-type sigma factor + Prom 1372 - 1431 4.3 3 3 Tu 1 . + CDS 1462 - 2544 603 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 2614 - 2673 3.6 4 4 Op 1 . + CDS 2694 - 6083 2009 ## BT_4247 hypothetical protein 5 4 Op 2 . + CDS 6109 - 8016 1082 ## BT_4246 hypothetical protein 6 4 Op 3 . + CDS 8035 - 9597 995 ## BT_4040 putative galactose oxidase precursor 7 4 Op 4 . + CDS 9599 - 10996 969 ## Fjoh_4436 glucan 1,6-alpha-isomaltosidase (EC:3.2.1.94) 8 4 Op 5 . 
+ CDS 11002 - 11415 376 ## gi|260175176|ref|ZP_05761588.1| putative large multi-functional protein
9 4 Op 6 . + CDS 11420 - 12142 439 ## Shew_2294 hypothetical protein
10 4 Op 7 . + CDS 12139 - 13467 705 ## Phep_1705 hypothetical protein
11 5 Op 1 . - CDS 13560 - 15401 1544 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis
12 5 Op 2 . - CDS 15401 - 15910 642 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase
- Prom 15934 - 15993 2.5
- Term 15917 - 15953 0.0
13 6 Op 1 . - CDS 16002 - 16382 539 ## COG0509 Glycine cleavage system H protein (lipoate-binding)
14 6 Op 2 . - CDS 16412 - 17083 525 ## BF4362 hypothetical protein
15 6 Op 3 . - CDS 17140 - 18627 1654 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog
- Prom 18673 - 18732 3.5
+ Prom 18578 - 18637 2.1
16 7 Tu 1 . + CDS 18766 - 20139 1436 ## COG0006 Xaa-Pro aminopeptidase
+ Term 20180 - 20233 11.9
+ Prom 20203 - 20262 7.5
17 8 Tu 1 . + CDS 20354 - 21664 950 ## COG3458 Acetyl esterase (deacetylase)
+ Term 21684 - 21724 -0.4
18 9 Tu 1 . - CDS 21669 - 22070 72 ## BT_2526 hypothetical protein
- Prom 22147 - 22206 3.1
19 10 Tu 1 . - CDS 22288 - 23952 682 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13
- Prom 24030 - 24089 6.4
+ Prom 23900 - 23959 5.2
20 11 Tu 1 . + CDS 24044 - 25741 1394 ## COG1283 Na+/phosphate symporter
+ Term 25758 - 25818 10.9
- Term 25744 - 25810 16.9
21 12 Tu 1 . - CDS 25831 - 26451 457 ## gi|260175187|ref|ZP_05761599.1| hypothetical protein BacD2_25258
- Prom 26572 - 26631 9.1
- Term 26588 - 26646 13.6
22 13 Tu 1 . - CDS 26669 - 26833 233 ## COG1773 Rubredoxin
- Prom 26954 - 27013 5.4
+ Prom 26859 - 26918 2.9
23 14 Op 1 1/0.000 + CDS 26969 - 28231 827 ## COG0642 Signal transduction histidine kinase
+ Prom 28233 - 28292 9.3
24 14 Op 2 . + CDS 28322 - 31027 2469 ## COG0474 Cation transport ATPase
25 14 Op 3 . + CDS 31017 - 31640 532 ## COG1011 Predicted hydrolase (HAD superfamily)
26 14 Op 4 . + CDS 31651 - 32598 427 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2
27 14 Op 5 . + CDS 32605 - 32694 67 ##
28 14 Op 6 . + CDS 32701 - 33480 597 ## COG1266 Predicted metal-dependent membrane protease
+ Term 33485 - 33514 1.4
29 15 Tu 1 . - CDS 33494 - 33919 477 ## COG2166 SufE protein probably involved in Fe-S center assembly
- Prom 33943 - 34002 4.0
30 16 Op 1 . - CDS 34029 - 35024 1087 ## COG2234 Predicted aminopeptidases
31 16 Op 2 . - CDS 35095 - 36003 873 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF
32 16 Op 3 . - CDS 36012 - 36815 695 ## COG2273 Beta-glucanase/Beta-glucan synthetase
- Prom 36896 - 36955 7.6
+ Prom 36894 - 36953 10.2
33 17 Op 1 3/0.000 + CDS 36995 - 37954 606 ## COG0280 Phosphotransacetylase
34 17 Op 2 . + CDS 37980 - 39041 1016 ## COG3426 Butyrate kinase
35 18 Tu 1 . - CDS 39090 - 41633 1385 ## BF0730 hypothetical protein
- Prom 41865 - 41924 5.1
+ Prom 41670 - 41729 5.9
36 19 Op 1 . + CDS 41921 - 45136 2580 ## BT_3519 hypothetical protein
37 19 Op 2 . + CDS 45149 - 47092 1565 ## BT_3520 hypothetical protein
38 19 Op 3 . + CDS 47119 - 48276 843 ## BT_3860 hypothetical protein
39 19 Op 4 . + CDS 48340 - 50529 1384 ## BT_3525 hypothetical protein
40 19 Op 5 . + CDS 50556 - 52871 1850 ## COG3537 Putative alpha-1,2-mannosidase
41 19 Op 6 . + CDS 52908 - 54845 1616 ## BT_2200 hypothetical protein
42 19 Op 7 . + CDS 54872 - 57520 1740 ## BT_2524 alpha-rhamnosidase
+ Term 57526 - 57578 11.0
+ Prom 57856 - 57915 4.7
43 20 Tu 1 . + CDS 58002 - 60284 1483 ## COG3537 Putative alpha-1,2-mannosidase
+ Term 60332 - 60380 9.1
- Term 60219 - 60254 -1.0
44 21 Op 1 . - CDS 60500 - 62257 1202 ## BT_2553 hypothetical protein
45 21 Op 2 . - CDS 62288 - 62737 487 ## BT_2554 hypothetical protein
- Prom 62906 - 62965 8.7
+ Prom 62709 - 62768 7.9
46 22 Tu 1 . + CDS 62957 - 63217 292 ## gi|260175211|ref|ZP_05761623.1| hypothetical protein BacD2_25378
+ Term 63299 - 63353 15.3
- Term 63685 - 63734 3.1
47 23 Tu 1 . - CDS 63755 - 66736 1817 ## BT_4692 hypothetical protein
- Prom 66882 - 66941 6.0
- Term 67080 - 67124 11.1
48 24 Tu 1 . - CDS 67148 - 69274 184 ## PROTEIN SUPPORTED gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P
- Prom 69430 - 69489 4.2
+ Prom 69243 - 69302 10.1
49 25 Tu 1 . + CDS 69471 - 70622 1158 ## BT_2564 hypothetical protein
+ Term 70721 - 70776 13.2
+ Prom 70654 - 70713 5.5
50 26 Op 1 . + CDS 70804 - 71268 656 ## COG0782 Transcription elongation factor
+ Term 71275 - 71313 5.1
51 26 Op 2 . + CDS 71320 - 71712 465 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases
+ Term 71732 - 71772 8.3
- Term 71724 - 71756 3.2
52 27 Op 1 8/0.000 - CDS 71765 - 72664 619 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
53 27 Op 2 . - CDS 72668 - 73924 949 ## COG0477 Permeases of the major facilitator superfamily
- Prom 74049 - 74108 5.9
+ Prom 74312 - 74371 1.8
54 28 Op 1 . + CDS 74414 - 75856 1583 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins
55 28 Op 2 . + CDS 75866 - 76831 1144 ## COG2066 Glutaminase
+ Term 76859 - 76906 8.2
56 29 Tu 1 . - CDS 76757 - 76945 123 ##
- Prom 76965 - 77024 2.0
57 30 Tu 1 . - CDS 77359 - 78141 777 ## BT_2572 putative potassium channel subunit
- Prom 78358 - 78417 3.5
+ Prom 78107 - 78166 9.7
58 31 Tu 1 . + CDS 78358 - 80085 1496 ## COG0531 Amino acid transporters
+ Term 80210 - 80242 -0.4
- Term 80140 - 80178 -1.0
59 32 Tu 1 . - CDS 80317 - 81585 857 ## COG0642 Signal transduction histidine kinase
+ TRNA 81785 - 81857 82.1 # Phe GAA 0 0
+ TRNA 81871 - 81946 75.3 # Pro CGG 0 0
+ Prom 81871 - 81930 78.1
60 33 Tu 1 . + CDS 82052 - 83689 1444 ## BT_2662 alpha-galactosidase precursor
61 34 Op 1 . - CDS 83835 - 84641 689 ## BF3938 hypothetical protein
62 34 Op 2 . - CDS 84659 - 85027 318 ## COG1725 Predicted transcriptional regulators
63 34 Op 3 . - CDS 85039 - 85860 507 ## BF4127 hypothetical protein
64 34 Op 4 . - CDS 85867 - 86712 767 ## COG1131 ABC-type multidrug transport system, ATPase component
- Prom 86751 - 86810 5.5
- Term 86815 - 86856 4.1
65 35 Op 1 . - CDS 86885 - 88351 1582 ## BT_2663 TPR repeat-containing protein
66 35 Op 2 . - CDS 88373 - 89314 852 ## COG0226 ABC-type phosphate transport system, periplasmic component
67 35 Op 3 . - CDS 89321 - 90136 852 ## BT_2665 TonB
68 35 Op 4 . - CDS 90163 - 90816 693 ## BT_2666 hypothetical protein
69 35 Op 5 . - CDS 90832 - 91437 468 ## BT_2667 hypothetical protein
70 35 Op 6 . - CDS 91471 - 92268 999 ## COG0811 Biopolymer transport proteins
- Prom 92288 - 92347 7.7
71 36 Tu 1 . - CDS 92423 - 92860 173 ## BT_2669 hypothetical protein
- Prom 92951 - 93010 6.1

Predicted protein(s)
>gi|225935319|gb|ACGA01000073.1| GENE 1 194 - 415 68 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175169|ref|ZP_05761581.1| ## NR: gi|260175169|ref|ZP_05761581.1| hypothetical protein BacD2_25158 [Bacteroides sp.
D2] # 1 73 1 73 73 120 100.0 3e-26 MKGYHFGKLLEKCGIISEYSHKVIVYQVIKGIIRDKNKADSKSRYILLSAYYQHLLSGDI ILIFKYLNTILTV >gi|225935319|gb|ACGA01000073.1| GENE 2 768 - 1340 320 190 aa, chain + ## HITS:1 COG:no KEGG:BVU_4115 NR:ns ## KEGG: BVU_4115 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 3 173 8 179 211 125 40.0 8e-28 MKDITLWNIMLKGDTKPLEVLYKRHYELLLNFGLKYVSDEEFVKDCIQDLFVKLCSSTRL SPTDYVRSYLLTSLKNLIFDKLSSLKSTEDINALPFDLTIEDTSLEVLFKDNDEDIQMIK NLQEAYKQLSENQRMAIYLRYIKGLSYREVAAVLEINPQSAMNLVSRALTSLRSKMTLKE YLFFLPFFLS >gi|225935319|gb|ACGA01000073.1| GENE 3 1462 - 2544 603 360 aa, chain + ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 155 357 68 267 274 76 27.0 9e-14 MEGKKANIHQVIRKVILHVRKQERNVSTTELDESWRLIEKQIHATKKKKLHRRLLYAATY SSAAAILLCMFWLGKQFISYYHADDSALADFALKMQAAPLQEDILLLIPGEKNIEVSDTD ANIIYSQNGLVVVNSDTVNQIQPQKKKPEFNQLITPKGKRTQLTLSDGTHLWVNSGTRVI YPTHFERDQREIYVEGEVFLDVRRNEKAPFIVRTKDFQVQVLGTSFNISAYSSEKTSSVV LVEGSVNIKSHNRQQVKLAPGELVNIHSDQLDTPQNVDVEPYICWIKNILMYTDDSLDKV FRKLNLYYGKEFVLDTEVERMQVSGKLDLKDKLEDVLHTISYSAPIEYKEVGDKIYVRKK >gi|225935319|gb|ACGA01000073.1| GENE 4 2694 - 6083 2009 1129 aa, chain + ## HITS:1 COG:no KEGG:BT_4247 NR:ns ## KEGG: BT_4247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 1129 25 1117 1117 1147 52.0 0 MKNTSSIFLTRVRMARILFLFSVCFLSLHSYAVNIGAYAQQKNLTIEMRNKTVKEVLDYI EKNSEFIFFYYSKAIDTKRIVSLSVKDQPITVILDQLFKNTDVRYEIKDRQISLKREELP TQQRPQDSKQQKRKLVGTVTDASSSDPLVGVSISVKDSPGTGTITDLDGKFSIEVSNNTE LTFSYIGYKQQVLSVGDLGVLKVKMQSDNEVLDEVVVVGAGTQKKVSVTGAISTVKGVEL RAPSSSLTNNLTGKLAGVVSMTTSGEPGSVSDFYIRGIGTFGGRATPLILMDDIEISSGD LNNIPTESIASFSILKDASATAIYGARGANGVLLIKTKDGMENTRAQINITFENSFLRPA 
NTLKFVDGPTFMNMYNEALVTRTPSATPKYSDDVIEYTRSGINPYVYPNVDWYDLLFKNS TMNQRANINVQGGSSKVTYYMSLQANHDSGMLNSPKRYSFDTNINRWEYIFQSNISYKMT PTTTVELRMNDQIGYAKGPSFSTSDLFYLTYNTNPVAFPASFPAEEGDKHIRFGGAMLSG NSMYNNPYAYMINNFRGSNHNTINTSVRINQNFDFVTKGLSASILVNWKNYSNSSYTKSL APYYYRVQDGSWNVDSPDLYALEQLTSGSEYISQSDIERYQDNTFYMDGRINYSRQFGRH AVSGMLMYMMREYRSSVLPNRNQGLSARITYDYDQKYLAEINCGYNGTERLAEGERFELF PAISLGWVISNEDFWTPLRSIVSFLKLRGSYGLVGSDDTGKSAGAAHFLYINDIKLDDLS YGTGPDGSVVAKGPSVSYYAVRNAHWERVKKFDIGIDMNLFNQLNIVFDYFHDKRDRILL KRGSFPRLLGYANAVPWSNIGKVDNRGIELAVNWRKKITNDLNMDLRFNLTYNKNKYIYK DEPDYPYVWQSETGKPLSNTIGYIAEGLFKDQADIEMSPSQDELGSTVMPGDIKYRDVNG DGKINSKDQVMISQYGNVPRIQYGIGLDVTYKSFDFGMFFNGSAQRTIMVSGIMPFRGDA STGDRNVMQFIANDYWSEANPNPDAKFPRLGITDSQVANNSVNSTYWMHNGRFIRFKTLE LGYSFPYCRVYINGDNLAVWSPFKEWDPELSWNAYPFQRTINIGVQLKF >gi|225935319|gb|ACGA01000073.1| GENE 5 6109 - 8016 1082 635 aa, chain + ## HITS:1 COG:no KEGG:BT_4246 NR:ns ## KEGG: BT_4246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 635 1 642 642 573 48.0 1e-162 MKTIKIISRFILSVTAACCMASCNYLDVVPDETADLPDAMKDRASTLGFLHSCYTYFEWF YTVENSFINSTDEYARPLLWADDGQKAAWNQLSPSSQTAPWNSCYDGIGRCHLFMDQLDK SNPTGVTESDKRRWKAEALFCRALYHFMLLESYGPIPLVTEHYNQNLDKKDFPGRSHFDV CVDSIAEWFDRAAFELPPTVQNSELGRATSTACKALKARLLVYAASPLWNGSFPYSNWKN TNFETPGYGKELVSRTYDENKWKRALTACNEALDWALTQGNRKLFDLATSTSIRINQGVP LPNIEGVDDQFKESVMLMRYLNTTRETDGNKEVIFAIGSGDNYGIYSQIPHGVTTYNGSP VGGWGGIAVFLNSVERFYTKNGKLPQYDHTFPDKSDWFESAGLSNPDIIKLNVNREPRFY AWVCYDGGEYSSFIADGSPLILEARNSQKHGYNPDRYNRDNNATGYFSKKRIQPNFQWRS QDNGNNYAHTPLAVIRLAELYLNLAECHAALNSKQDVLDNLNIVRNRAGIPDLELSDINS DMSLMDWVRSERYAELWFEGHRYFDARRWTIAPEVFKAGVREGLNAIAKKDPSFREFNQR VKVDQPFMWTNKMYLLPINDSEIYSNPQLVQSPEY >gi|225935319|gb|ACGA01000073.1| GENE 6 8035 - 9597 995 520 aa, chain + ## HITS:1 COG:no KEGG:BT_4040 NR:ns ## KEGG: BT_4040 # Name: not_defined # Def: putative galactose oxidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 62 422 56 389 447 74 24.0 1e-11 
MKTILLKLILILGVCQIVYSCSNEYSNDKLFPEEYHKILYFKNEGKHIFSWQASENMVEE PLLVIKAGSDPSLTATVNVRALTQQEIDENYSQPENQSYKTITPDAYSFKDGVHELTFVT NETSKYLYLNFDVAKIYQLIKSAPESKLVLALQLESTEEGIVNVEMNRVLYVFDINGQQV EWGKKDIQIESATYSSMSVKLKAEIADNISNFSCGLDFSQNAQLVSAYNDFHSTNYEALP AGAYHVNDFSFANGNDDATTTLTVASSTLQKDKQYLLPLKFAAPSSPQIEVSDEIYYLTV VVPADPQVIPDNREWKILLCNSDQKMENPSSTDGDNIGAGAIIDGIFDNHWHSSYWGKDV NGFNNKDDYHYGFTDYHCFDGMRTPITIVIDMKNSIEVSDIGLTQRRDNANIKKIDFYVS DDPQFLFKTIADGGTSADYSAVALNNWTLLFSKENTPRQNETIWFGDSETITPQKGRFLK IVISSAYNAPYGLNGVDYIVVGMAELQVKQLSTMIGKIIQ >gi|225935319|gb|ACGA01000073.1| GENE 7 9599 - 10996 969 465 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4436 NR:ns ## KEGG: Fjoh_4436 # Name: not_defined # Def: glucan 1,6-alpha-isomaltosidase (EC:3.2.1.94) # Organism: F.johnsoniae # Pathway: not_defined # 34 443 31 446 1172 313 41.0 7e-84 MKHRFKFLTLLLVLLTIPAGIFARESKFTKRGSGPLFWNMYQYNFWYGTSMPEDVWKKNI DWLAEEFLPYGYDMAATDGWLFNLNSIDQYGYLTKYDSSWTYGLKYWGDYLKSKGMRFGF YFNPLWVPKAAYVQNNNIKGTSIPITDVVGTTKFEDHLYWIDTDKPGAEQWVKGCIRTMI DQGIELLKIDFLGYYERDYGTNRYIKALKWMAEEAGDEMLLSYSVPNCRNDARNEIIYAD MIRISTDCDGGGWWFISDKERGQVNESGQGDKYRSAFDGLIGWADIIGVKGQTIMDPDFV QLNTLASDAEREFHISMLLVSGSPIGITDQYNTIGDCAKFYKNTEMLELNKVGFVGKPLS TSIWDKQNSSRWIGQLPDGDWIVGLFNREATALEYGIDFEKELGIKGGKVKNVRDLWLHQ DLGAMSGRYSVRLEPHSCKVLRIKTNSKRYKAAVASLRNGAVINR >gi|225935319|gb|ACGA01000073.1| GENE 8 11002 - 11415 376 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175176|ref|ZP_05761588.1| ## NR: gi|260175176|ref|ZP_05761588.1| putative large multi-functional protein [Bacteroides sp. 
D2] # 1 137 3 139 139 265 100.0 7e-70 MKRYPQYLWFVVLALLLVCTGCSNEGETEDDNNQNVKVEGFYVKGLENVNARVEFDVEVD AAGTYTLHVMGYNYSKGTATCSLYVNGQKQSQLGFQEQSNWSEKMVQVSLAAGINQIAFQ RDAGDNGQFYLDYIEID >gi|225935319|gb|ACGA01000073.1| GENE 9 11420 - 12142 439 240 aa, chain + ## HITS:1 COG:no KEGG:Shew_2294 NR:ns ## KEGG: Shew_2294 # Name: not_defined # Def: hypothetical protein # Organism: S.loihica # Pathway: not_defined # 18 238 60 265 735 128 34.0 1e-28 MFTKRMKIVLLLFVLFCLADYANATPRWRILDSGAIEWQIDSRLPHYDHIEMSGLKVSAV LRYGVNGDGQFVMERSMIWPMLRTIPNNTHASLMQRFAIDYASLLLVNGMALNNEKVKSI CIDGRLTVVSLFAIGYQNTGTARSREPVPVVELTRSFFPSTDKPMLCERYTVKNISSKPI SVIVPFQKAVYKTDPAKGVDGSYTIVTSIQNKEDCLYTIAPKESLTFDASVQAYKKERMN >gi|225935319|gb|ACGA01000073.1| GENE 10 12139 - 13467 705 442 aa, chain + ## HITS:1 COG:no KEGG:Phep_1705 NR:ns ## KEGG: Phep_1705 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 2 435 267 699 706 560 60.0 1e-158 MIVDLADEEHKRMAFVDKIWNNLSFVSPDPVVNTAFAFAKVRAAESIFATEGGFMHSPGG ESYYAAVWANDQAEYANPFFPFLGYDTGNASALNSFRLFARFMNPDYKPIPSSIIAEGKD TWNGAGDRGDAAMIAYGAARYALAKANKKEAQELWPLIEWCLEYCKRKINASGVVTSDAD ELEGRFPAGNANLCTSSLYYDALVSANYLAKELGKQSSVSKEYARQAKVLRNNINNYFGS HIEGFDSYRYYEGNDILRSWIAIPLTVGIFEKKTGTVEALFSPRLWSKDGLLTQAGTDTF WDRSTLYALRGVYSAGARERAMEHMAYYSQQRLLGEHVPYPIEAWPEGNQRHLSAESALY CRIVTEGIFGIRPTGFNSFSLCPQLPDKWNSMRLNNVSAFAFSPFDIIVERQGNKIKTCV LRNGKPVKTYHTENGKDINITF >gi|225935319|gb|ACGA01000073.1| GENE 11 13560 - 15401 1544 613 aa, chain - ## HITS:1 COG:CPn0373 KEGG:ns NR:ns ## COG: CPn0373 COG0821 # Protein_GI_number: 15618288 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Chlamydophila pneumoniae CWL029 # 5 605 9 598 613 392 40.0 1e-108 MDLFNYFRRETTEVNIGAVPLGGPNSIRVQSMTNTSTQDTQACVEQAKRIVDAGGEYVRL TTQGIKEAENLMNINIGLRSQGYMVPLVADVHFNPKVADVAAQYAEKVRINPGNYVDAAR TFKKLEYTDEEYAQEIQKIHDRFVPFLNICKENHTAIRIGVNHGSLSDRIMSRYGDTPAG 
MVESCMEFLRICVEENFTDVVISIKASNTVVMVKTVRLLVDVMEKEGMAFPLHLGVTEAG DGEDGRIKSALGIGALLSDGLGDTIRVSLSEAPEAEIPVARKLVDYVLLRQDHPYIPGLE APEFNYLSPERRKTKAVRNIGGEHVPVVIADRMDGKTEVNPQFTPDYIYAGRTLPDQRED GVEYILDADVWQGEDGSWPAFNHAQLPLMRECNAELKFLFMPYMAQTDEVIACLKHHPEV VIVSQSNHPNRLGEHRALVHQLMTEGLQNPVVFFQHYSEDDAENLQIKSAADMGALIFDG LCDGIFLFNQGNLSHAVVDATAFGILQAGRTRTSKTEYISCPGCGRTLYDLEKTIARIKA ATSHLKGLKIGIMGCIVNGPGEMADADYGYVGAGRGKISLYKGKVCVEKNIPEEEAVERL LEFIRTDREENQQ >gi|225935319|gb|ACGA01000073.1| GENE 12 15401 - 15910 642 169 aa, chain - ## HITS:1 COG:PAB1077 KEGG:ns NR:ns ## COG: PAB1077 COG0041 # Protein_GI_number: 14521838 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Pyrococcus abyssi # 3 162 7 166 174 167 55.0 1e-41 MTPIVSIIMGSTSDLPVMEKAAQLLNDMHVPFEMNALSAHRTPEAVEEFAKNARNRGIKV IIAAAGMAAALPGVIAANTTLPVIGVPVKGSVLDGVDALYSIIQMPPGIPVATVAINGAM NAAILAIQMLALSDEKLAEAFAAYKEGLKKKIVKANEELKEVKFEYKTN >gi|225935319|gb|ACGA01000073.1| GENE 13 16002 - 16382 539 126 aa, chain - ## HITS:1 COG:MT1874 KEGG:ns NR:ns ## COG: MT1874 COG0509 # Protein_GI_number: 15841296 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Mycobacterium tuberculosis CDC1551 # 2 124 3 132 134 132 53.0 2e-31 MNFPQNLKYTNEHEWIRVEGDIAYVGITDYAQEQLGDIVFVDIPTVGETLEAGETFGTIE VVKTISDLFLPIAGEVLEQNEALEENPELVNKDPYGEGWLIKVKPADVKAVEDLLDAEAY KAVVNG >gi|225935319|gb|ACGA01000073.1| GENE 14 16412 - 17083 525 223 aa, chain - ## HITS:1 COG:no KEGG:BF4362 NR:ns ## KEGG: BF4362 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 223 5 226 226 307 80.0 2e-82 MDREREHIIADKTMLQVARVTSMVFTPFSIPFLSFLVLFLFSYLRIMPIQYKLIVLGIVY CFTILTPTITIFLFRKINGFARQELSERKKRYVPILLTIISYVFCLLMMRKLNIPWYMTG IILASLVVSIICIAVNLKWKLSEHMAGMGGVIGGLISFSALFGYNPVVWLCLFILIAGIL GSARIVLGHHTLGEVLSGFAVGLICALLVLHPVSNILFRVFLF >gi|225935319|gb|ACGA01000073.1| GENE 15 17140 - 18627 1654 495 aa, chain - 
## HITS:1 COG:RSc0408 KEGG:ns NR:ns ## COG: RSc0408 COG1508 # Protein_GI_number: 17545127 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Ralstonia solanacearum # 19 495 15 497 499 219 32.0 1e-56 MAQGSRQIQSQAQQQVQTLSPQQILVVKLLELPAVELEDRIHAELLENPALEEGKEESAT DNEYSDTENAEDGMDSDANDYDSLSDYLTEDDIPDYKLQENNRSKDEQAEDIPFSDSTSF YEILKEQLRERNLTEHQCDLVEYLIGSLDDDGLLRKSLESICDELAIYAGIESTEEELEE ALNILQDFDPAGIGARSLQECLLIQIRRKKEEEKTPSPILNLEERIIRECYEEFTRKHWE KIIKKLDIDEETFQETITEITKLNPRPGASLGEAIGRNLQQIVPDFMVETYDDGTINVSL NNRNVPELRMSRDFTEMVEEHTKNRANQSKESKEAMMFLKQKMDAAQGFIDAVRQRQNTL MTTMQAIIDLQRPFFLEGDESLLKPMILKDVAERTGLDISTISRVSNSKYVQTNYGIYPL KYFFSDGYTTEDGEEMSVREIRKILKECIEGEDKKKPLTDDELADILKEKGYPIARRTVA KYRQQLNIPVARLRK >gi|225935319|gb|ACGA01000073.1| GENE 16 18766 - 20139 1436 457 aa, chain + ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 454 1 454 462 405 46.0 1e-113 MFAKETYMQRRALLKKNLGSGVLLFLGNDECGLNYEDNTFRYRQDSTFLYYFGLSCAGLS AIIDIDEDKEIIFGDELSIDAIVWMGSQPTLHEKCERVGVKNLMPSADIVSYLHQCVQKG KAVHYLPPYRPEHKLKLMDWLGIPPAHQEGSVPFIRAVIAQRNYKSAEEIVEIEKACDVT ADMHITAMKVLRPGMYEYEVVAEMNRVAESNNCQLSFATIATINGQTLHNHYHGNKVKPG DLFLIDAGAEVESGYAGDMSSTIPADKKFTIRQREVYEIQNAMHLESVKALRPGIPYMDV YELSARVMVDGMKALGLMKGNTEDAVREGAHALFYPHGLGHMMGLDVHDMENLGEIWVGY NGQPKSTQFGRKSQRLAIPLEPGFVHTVEPGIYFIPELIDMWKAEKKFTDFINYDVVETY KNFGGIRNEEDYLITETGARRLGKKIPLTPEEVEALR >gi|225935319|gb|ACGA01000073.1| GENE 17 20354 - 21664 950 436 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 128 417 12 303 325 113 29.0 7e-25 MKRLLLSVWLLSLTLLSAVAENYPYKSDVLWVTVPNHADWLYKTGEKATVEVQFYKYGIP 
RDNVTVTYEIGGDMMPTADTKGSVTLKNGKAVIPLGTMKEPGFRDCRLKATVDGKTYSHH VKVGFSPEKLLPYTTMPSDFNEFWEKAKAEQKEFPLTYTKEHVEKYSTDKIDCYLVKLQL NKRGQCVYGYLFYPKKEGKFPVVLCPPGAGIKTIKEPLRHKYYAEQGCIRFEIEIHGLHP EMSDEAFKEISNAFNGRENGYLTNGLDSRDNYYMKRVYLACVRAIDFLTSLPEWDGKNVI AQGGSQGGALALITAGLDQRVTACVANHPALSDMAGYKAGRAGGYPHFFRNSVDMDTPEK IKTMAYYDVVNFAKLIKADTYMTWGFNDDVCPPTTSYIVYNVLNCPKEALITPINEHWTS NDTEYGHLLWIKKHLK >gi|225935319|gb|ACGA01000073.1| GENE 18 21669 - 22070 72 133 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 132 132 166 77.0 3e-40 MIFYNSLLAKWFLGKGKKHYFMLGWFFFTRYKYLEVWEDMELRIHARQYWECFSLTLIPA LILSLLFSWWWMVLPFVTYHILYWFEKIICYHSIFNWEAMKHCGDTLYLRKRKAYAWRKW YGKKDLPASRWND >gi|225935319|gb|ACGA01000073.1| GENE 19 22288 - 23952 682 554 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 36 554 31 546 546 267 30 2e-70 MLQICCKNNNNSKEFPIGSSLLDIYYGFNLNFPYQVVSAKVNNRSEGLNFRVYNNKDVEF LDVRDQSGMRTYVRSLCFVLFKAVTELFPEGKLFVEHPVSKGYFCNLRIGRPIELEDVTR IKQRMQEIIAENIPYHRIECHTAEAVRVFSERGMNDKVRLLETSGSLYTYYYTLGDTIDY YYGNLLPSTGYLKLFDIVKYYDGLLLRIPSRENPNMLEEVVKQEKMLDVFKEYLNWSYIM GLNNAGDFNLACEEGHATDLINVAEALQEKKIAQIADTIFHRGENGNRVKLVLIAGPSSS GKTTFSKRLSIQLMTNGLKPFPISLDNYFVDREETPLDENGNYDYESLYALDLELFNQQL QALLRGEEVELPRFNFSLGKKEYKGDKLKIEDNTILILEGIHALNPELTPHIPAERKFKI YVSALTTISLDDHNWIPTTDNRLLRRIIRDFNYRGYSARETISRWPSVRAGEDKWIFPYQ ENADVMFNSALLFEFAVLRLHAEPILMGVPRNCPEYCEAYRLLKFIKYFVPVQDKEIPPT SLLREFLGGSSFKY >gi|225935319|gb|ACGA01000073.1| GENE 20 24044 - 25741 1394 565 aa, chain + ## HITS:1 COG:TP0771 KEGG:ns NR:ns ## COG: TP0771 COG1283 # Protein_GI_number: 15639758 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Treponema pallidum # 8 552 47 585 593 263 33.0 5e-70 MEYSFYDFLKLIGSLGLFLYGMKIMSEGLQKVAGDRLRSILTAMTTNRVTGVLTGVLITA 
LIQSSSATTVMVVSFVNAGLLTLAESISVIMGANIGTTVTAWIISIFGFKVDMAAFALPL LAIALPLIFSGKSNRKSIGEFIFGFSFLFMGLSYLKANAPDLNANPEMLAFVQNYTDMGF FSIILFLLIGTILTMIVQASAATMAITLIMCANGWISLELGAALVLGENIGTTITANLAA LTANTQAKRAALAHFVFNVFGVIWVLIVFHPFMDLVNWVVDTFFQSSNPEVAISYKLSAF HSIFNICNVCILIWGVKLIERTVCALIHPKEEDEEPRLRFITGGMLSTAELSILQARKEI HLFAERTHRMFGMVQDLLHTEKDDDFNKLFSRIEKYENISDNMELEIANYLNQVSEGRLS SESKLQIRAMLREVTEIESIGDSCYNLARTINRKRQTNQDFTEKQYEHIHFMMKLTNDAL AQMIVVVEKPEHQSIDINKSFNIENEINNYRNQLKNQNILDVNNKEYDYQMGVYYMDIIA ECEKLGDYVVNVVEASSDVKEKRAS >gi|225935319|gb|ACGA01000073.1| GENE 21 25831 - 26451 457 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175187|ref|ZP_05761599.1| ## NR: gi|260175187|ref|ZP_05761599.1| hypothetical protein BacD2_25258 [Bacteroides sp. D2] # 1 206 61 266 266 414 100.0 1e-114 MAGCSDDENEEQPPAPDVPDYSTLVVKDIQNIPADFTFDRVEVKVTGVDWQVIETLSFPY ENGQIVMTLPTSFPSEKLQTVDRRNGMGGYWTGTSDDADALVATLGDFFVFNGDKRVGRI AISNWSGKGSSAGKATLVSYQYADRPFTLTGSDKSYYYSNCSFNKGWNIFANINPASEGG TAKVLRTTTVPESTLFWRLAESYVYN >gi|225935319|gb|ACGA01000073.1| GENE 22 26669 - 26833 233 54 aa, chain - ## HITS:1 COG:CAC2778 KEGG:ns NR:ns ## COG: CAC2778 COG1773 # Protein_GI_number: 15896033 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Clostridium acetobutylicum # 1 53 1 53 54 78 73.0 4e-15 MKKYICTVCEYIYDPEQGDPESGIEPGTAFEDIPDDWTCPLCGVGKEDFEPYEG >gi|225935319|gb|ACGA01000073.1| GENE 23 26969 - 28231 827 420 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 165 418 55 315 328 176 37.0 9e-44 MLIALLSTLIILCAVLLISFLWERRKCLRMQQRLFRESRKLERTSHIAGAILKNVHAFIL LIDNDFKVLKTNYYQKTGTRKGTEEKRVGDLLQCRNALAAEGGCGTHSFCGSCPIRTAIR QAFEQRRNFTDLEATLSVVTSDNTTVECDAVISGSYFLLNEEENMVITVHDVTRRKQAEK ELQLAKEKAEKADISNSAFLANMSHEIRTPLNAITGFAEVLGSATTEEEKAQYQEIIKMN ADLLMQLVNDILDMSKIEAGTLEFVQTTVDVNQLLSDLQQLFQMRVNDAGGKIQIIAEPS 
CSSCMIQTDRNRVAQVLSNFAGNAIKFTHEGSIRLGYEARNTELYFYVKDTGAGIPAEKL PDVFERFVKLNKDKKGAGLGLSISQTIVAKLGGQIGADSVEGEGSTFWFTIPYRSCGKPR >gi|225935319|gb|ACGA01000073.1| GENE 24 28322 - 31027 2469 901 aa, chain + ## HITS:1 COG:SPAPB2B4.04c KEGG:ns NR:ns ## COG: SPAPB2B4.04c COG0474 # Protein_GI_number: 19114802 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Schizosaccharomyces pombe # 26 889 213 1139 1292 457 34.0 1e-128 MSTNKNDYYHLGLTDDEVLQSREKNGINLLTPPKRPSLWKLYLEKFEDPVVRVLLVAAAF SLIISIIENEYAETIGIIAAILLATGIGFFFEYDASKKFDLLNAVNEETLVKVVRNGRVQ EIPRKDIVVGDIVILETGEEIPADGELLEAISLQVNESNLTGEPVINKTTVEADFDEEAT YASNLVMRGTTVVDGHGSMRVLHVGDATEIGKVARQSTEDNLEPTPLNIQLTKLANLIGK IGFTVAGLAFLIFFVKDVVLYFDFSSLNGWHEWLPVFERTLKYFMMAVTLIVVAVPEGLP MSVTLSLALNMRRMLSTNNLVRKMHACETMGAITVICTDKTGTLTQNLMQVHEPNFYGIK NGSDLSDDDISALIAEGISANSTAFLEESTNGEKPKGVGNPTEVALLLWLNKQGRDYLQL REQAHVLDQLTFSTERKFMATLVESPLIGKKILYIKGAPEIVLGKCKEVVLDGRQVDAVE YRSTVEAQLLSYQNMAMRTLGFAFKIVGENEPNDCTELVSANDLNFLGVVAISDPIRPDV PAAVAKCQSAGIGIKIVTGDTPGTATEIARQIGLWNPETDTERNRITGVAFAELSDEEAL DRVMDLKIMSRARPTDKQRLVQLLQQKGAVVAVTGDGTNDAPALNHAQVGLSMGTGTSVA KEASDITLLDDSFNSIGTAVMWGRSLYKNIQRFIVFQLTINFVALLIVLLGSVIGTELPL TVTQMLWVNLIMDTFAALALASIPPSETVMLEKPRRSTDFIISKAMRANIIGVGSIFLIV LLGMIYYFDHSTQGMNVHNLTIFFTFFVMLQFWNLFNARVFGTTDSAFKGLSKSYGMELI VLAILAGQFLIVQFGGAVFRTEPLDWQTWLLIIGVSSTVLWVGELVRLVQRIIHKKDRNE K >gi|225935319|gb|ACGA01000073.1| GENE 25 31017 - 31640 532 207 aa, chain + ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 6 196 3 193 207 117 34.0 1e-26 MKSKGIKNLLIDLGGVLINLDRQRCIEHFQKLGLKNVDELLDIQNQDGLLMQQEKGLITS AEFRDGVRKMIGKAISDKQIDAAWNSFLVDIPTQKLDLLLKLREKYVVYLLSNTNQIHWE WTCAHLFPYRTFKVEDYFEKTYLSFEMKMAKPEPEIFKAIIEDAGIEPKETLFIDDSEMN CKTAQNLGISTYTAKAGEDWSHLFNLK >gi|225935319|gb|ACGA01000073.1| GENE 26 31651 - 32598 427 315 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 12 311 16 310 317 169 35 6e-41 MQIIRDTSAITPEPCVATIGFFDGVHAGHRYLIQQVKEIAAAKGLRSALVTFPVHPRKVM NAEYRPELLTTPEEKISLLADIGVDYCLMLDFTPEISRLTAREFMTQLLKERYQVKYLVI GYDHRFGHNRSEGFEDYVRYGKEIGIEVIRAKAYTSNIKIGNEPNIPVSSSLIRKLLHEG EVSLAAHCLKYEYFLDGIVVGGYQVGRKIGFPTANLRVDDPDKLIPADGVYAVWVTFDGE TYMGMLNIGVRPTIDNGPNRTIEVNILHFHSDIYDKFIRLTFVKRTRPELKFSSIDELIT QLHKDAEETETILKK >gi|225935319|gb|ACGA01000073.1| GENE 27 32605 - 32694 67 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKSFMSISLVVIAISMLAMFIYSFFTTK >gi|225935319|gb|ACGA01000073.1| GENE 28 32701 - 33480 597 259 aa, chain + ## HITS:1 COG:RSc3402 KEGG:ns NR:ns ## COG: RSc3402 COG1266 # Protein_GI_number: 17548119 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Ralstonia solanacearum # 119 221 141 243 285 65 40.0 8e-11 MKTAIKLVLIDLLIAQIIAPILIMIPCTIYLFVTTGNLDKVTLTQMIMIPAQLAGQIMMG IYLWKAGYISKKKATWLPVSAPYLVCSAIAILTCGFAVSSLMSLLDWIPNIMEQSFNILQ SGWGGILAIAIVGPVLEELLFRGAITHALLQQYNPTKAILISALLFGVFHINPAQILPAF LIGILLAWTYYKTGSLIPCILMHVLNNSLSVYLSIKYPEAENMDDLINGTPYLIVLFGSI LLFIGIILTMNHLTSQKQQ >gi|225935319|gb|ACGA01000073.1| GENE 29 33494 - 33919 477 141 aa, chain - ## HITS:1 COG:XF0994 KEGG:ns NR:ns ## COG: XF0994 COG2166 # Protein_GI_number: 15837596 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Xylella fastidiosa 9a5c # 14 137 23 146 146 107 43.0 6e-24 MSINELQDEVIAEFSDFDDWMDRYQLLIDLGNEQEPLEEKYKTEQNLIEGCQSRVWLQAD DVDGKIVFKAESDALIVKGIIALLIKVLSGHTPDEILNADLYFIDKIGLKEHLSPTRSNG LLSMVKQIRMYALAFKAKEGK >gi|225935319|gb|ACGA01000073.1| GENE 30 34029 - 35024 1087 331 aa, chain - ## HITS:1 COG:MK0503 KEGG:ns NR:ns ## COG: MK0503 COG2234 # Protein_GI_number: 20093941 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Methanopyrus kandleri AV19 # 1 323 1 
283 337 62 24.0 2e-09 MKKKTMIVLMSAFLLLSAFSCGGGNKANSTSEQDEKTVVNVPQFDADSAYLYVKNQVDFG PRVPNTKEHVACGNYLASQLEAFGAQVTNQYADLIAYDGTLLKARNIIGSYKPESKKRIA LFAHWDTRPWADNDPGEKNHKTPILGANDGASGVGALLEIARLVNQQQPELGIDIILLDA EDYGAPQFYTGKHKEEFWCLGSQYWARNPHVQGYNARFGILLDMVGGEGSVFMKEGYSEE FAPDINKKVWKAAKKAGYGKTFMDGNGGFVTDDHLFINRLARIKTIDIIPYNQEGDFTPT WHTVNDNMEHIDKNTLKAVGQTVLEVIYNEK >gi|225935319|gb|ACGA01000073.1| GENE 31 35095 - 36003 873 302 aa, chain - ## HITS:1 COG:CAC0293 KEGG:ns NR:ns ## COG: CAC0293 COG1619 # Protein_GI_number: 15893585 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Clostridium acetobutylicum # 9 298 6 299 306 156 33.0 5e-38 MDIQFPPFLQKGDKVVIVSPSSKIDQQFLKGAKKRMESWGLKVAIGKHAGSSSGRYAGTI KQRLKDLQDAMDDPKVKAILCSRGGYGAVHLIDKIDFTAFREHPKWLLGFSDITALHNLF QKNGYASLHSLMARHLTVEPEDDLCANYLKDILLGNIPSYMCEKHKLNKQGTAQGVLHGG NMAVAYGLRGTPYDIPAEGTILFIEDVSERPHAIERMMYNLKLGGVLEKLSGLIIGQFTE YEEDCSLGKELYAALADLVKEYDYPVCFNFPVGHVTHNLPLINGAKVELVVGKKNVELKF IC >gi|225935319|gb|ACGA01000073.1| GENE 32 36012 - 36815 695 267 aa, chain - ## HITS:1 COG:CC0380 KEGG:ns NR:ns ## COG: CC0380 COG2273 # Protein_GI_number: 16124635 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Caulobacter vibrioides # 10 265 5 299 301 108 29.0 8e-24 MKQFKNLLFLSFILLCWTLVSCKSSKTSSGWSLIWEENFNQKKSFDSQVWSKIPRGKSDW NNYMTDFDSCYAMRKGKLVLRGIVNHTQRNDTAPYLTGGVYTKGKKAFTNGRIEICAKLN AAGGAWPAIWMLPEGAAWPSGGEIDIMERLNNDSIAYQTVHSHYTYTLGIKDHPVSHSTG VIHPDRYNIFAVEMYPDSLSFYVNDVHTFTYPRIQTKEEGQFPFNQPFYLLIDMQLGGSW VGAVSPEDLPVEMEVDWVRFYQRKSIK >gi|225935319|gb|ACGA01000073.1| GENE 33 36995 - 37954 606 319 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 16 304 13 300 301 172 34.0 8e-43 
MEPILNFAQLTAHLKKLNHRKRIAVVCANDPNTEYAISRALEEGIAEFLMIGDSTILKKY PTLKQYPEYVKTIHIENPDEAAREAVRIVREGGADILMKGIINTDNLLHAILDKEKGLLP KGKILTHLAVMEIPTYHKLLFFSDAAVIPRPTLQQRIEMIWYAICTCRHFGIEQPRVALI HCTEKVSAKFPHSLDYVNIVELAEAGEFGNVIIDGPLDVRTACEQASGDIKGIVSPINGQ ADVLIFPNIESGNAFYKSVSLFAQAEMAGLLQGPICPVVLPSRSDSGLSKYYSIAMACLQ VSGDCKCRKQVSQVTGSSF >gi|225935319|gb|ACGA01000073.1| GENE 34 37980 - 39041 1016 353 aa, chain + ## HITS:1 COG:CAC3075 KEGG:ns NR:ns ## COG: CAC3075 COG3426 # Protein_GI_number: 15896326 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 2 353 3 355 355 360 49.0 2e-99 MKILVINPGSTSTKIAVYENETPLFVSNIKHSVEELSAFPEVIDQFEFRKNLVLQELENN KIPFSFDAIIGRGGLVKPIPGGVYEVNEAMKRDTVHAMRTHACNLGGLIASELASTLPNC PAFIADPGVVDELEDIARITGSPLMPKITIWHALNQKAIARRFAKEQGTQYENLDLIICH LGGGISVAVHHHGRAIDANNALDGEGPFSPERAGTLPAGQLIDLCFSGQLTKDELKKRIS GRAGLTAHLGTTDVPAIIQSIEEGDKKAELILDAMIYNVAKAIGASATVLCGKIDAILLT GGIAYSDHVISRLKKRISFLAPIYVYPGENEMESLAFNAIGALKGELPIQIYK >gi|225935319|gb|ACGA01000073.1| GENE 35 39090 - 41633 1385 847 aa, chain - ## HITS:1 COG:no KEGG:BF0730 NR:ns ## KEGG: BF0730 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 840 4 871 873 520 33.0 1e-146 MWNRQIYLTVIVCIGVLFLSNHLCAQGLKFQRDENSIENRTSYNVFAYKSHTFKDRLDIA FDYAAPETSMRGNVLRIKNEKQNLIYTLSFNSEEQDVIFKLNEEGKQNLVTLHLNKEEFN RRRWQQVQILFNLVKNDVTLKVNKERSTCPDLNFAEQSWKPVMYFGRSEYIIDIPSFGIR NLSVKDDKTTFLFPLNENKGNEVHDLEGSRIGEVVCPVWMINDAYFWRHKVTFESNEVAG AMFNVHNQEIYYFNDKQITTYNTTNGTSNTKEYAAPCPMKFRLGTSFVDTGQERLYAYEV YDPPFSGYASAWYDWHNNQWTPLSYDVLPTQLHHHSAFYDETNHRYIIFGGFGNHRFSNR FYAMDVVTGKWSEIKYTGDHIDPRYFTSMGYEPSSNSLYIFGGMGNESGDQAIERRYYYD LYKLDLNTNHLKKMWEIKWNQANMVPVRNMILPGDSTFYTLFYPEHYSQSYLCLYRFSIS DGSYQVYGDSIPMRSDKIATNANLYYNANRNELYAVTLEFDNQDQASVSNIYSLSFPPIT KEQMEVDDTSSLFFWIMLVGVIVIAALLAVVSIYIFIKRKNRSENQIDNEELELPEGEGE IVDNLHNRPNSIYLFGNFSVRDRHNREINYLFSPKLKQAFILIFEYSLSEGITSQELSEQ LWPDKTENKVKNIRGVTLNHLRKTLAELDGISLVYEKNVFRLILQEPCYCDYLQCMEILT 
GNPVEEQYERLCEIVSQGAFLKSLDIPLFDHFKALVEAQLEPTLSRQAEVWYEEEKYTKA IAMANASFNLDPLNEKSLRIIIGAYCKQGLYKEAKTRYKLFAKQYQMLQGENYSVTLQSL SGGVVLN >gi|225935319|gb|ACGA01000073.1| GENE 36 41921 - 45136 2580 1071 aa, chain + ## HITS:1 COG:no KEGG:BT_3519 NR:ns ## KEGG: BT_3519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 48 1071 130 1151 1151 1126 54.0 0 MRNALKPCSTKSSLMVLLLMFLSIPSSASEKTEKKEGKSDSEIVWQNQTKTKRITGRVTN KEGEPLPGVNVMEKGTQNGTISNLDGLYTLNVSDKNPQVAFTYIGFNPVTINIGSASVYD IIMEETINEMDELVVVGYSQQRKISSIGSQSSLKVSDIKMPSASLTSALVGRLSGVIAVQ RTGEPGKDASDIWIRGYSTPNNSNPLMIVDGVERPFNDIDPGDIESITVLKDASATAVYG VRGANGVVIVKTKPGIIGKPVVNVDYYESFTQFTKQPKFADGITYMNTANEALTTDGLSP MYLPSYIKNTESGMDPLLYPNVDWQKVVFKDWGHQRRVNANIRGGSQMAQFYASISYFNE KGMVRENDYENYNTGINYDRYNFLTNLSMKVTSSTTVEIGAQGHLGNGNYPGVNTESIFS STIEVNPVLYPVMFRANGVEYVPGLHTQGGNRNPYAEATKTGYRKTIDNKIMANVRLTQD LSMLTEGLKLSVMYAYDVSNRRESSYTKRENTYYFADRTGEPYDAAGNPILKTTWDQGSN ALAFSGTFGGYRKDNLEASLNYDRVFGKHRVGAMFVYTQQSKTINDAGDFIGSLPYRLQG IAGRATYSWHDRYFAEFNIGYNGGENFPKSKRFGTFPAFGVGWVLSNEKFWEPIQNVFSF MKVRYTNGRVGSSDVDGAARRFMYIDQYDWAADYGDTFGSNVGVDGVRIKNPATPLTWEI AHKQDLGIDLKFFRDELSVTVDFYKENRKQILLNRTNSLPGFAGFQETPYGNVGKTRTKG YDLSLEYFKQFNKDLALTVRGNFTYTDVVWVDDDTPDKKFPWRNREGHSLKALEGFTANG LYTQQDIDAIKNWLALPEAQRQNTAQPFPTPYKVGLNQVRAGDIKYKDLNGDGTINDDDI SWIGNGDIPKINYGFGFNLDYKAFSIGALFQGTAKADRLISGFVHAFNNSSAGNVFSNID DRWTENNPNQNAFYPRLSYGNDAPGNQNNYVNSTWWLKDMSFFRLKSLQLSYRIPETLCH KLNIKNASVYMMGTNLFTISGWKLWDPELGTNDGNKYPNTTAYTIGLNFSF >gi|225935319|gb|ACGA01000073.1| GENE 37 45149 - 47092 1565 647 aa, chain + ## HITS:1 COG:no KEGG:BT_3520 NR:ns ## KEGG: BT_3520 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 646 14 642 642 508 43.0 1e-142 MKKIYINWSTLILLLLTITSCNDYFDQVPEKDMLSLEKIFETREGALKYLASVYTYLPDE FRQRNSANGVNTQGTSGAWVAGCDEAEFAWDFCETQQINNASYTPESGFVYDYWYKYYKG IRSATLFIENLYRCPNLSPGDYDQWTAEAKALRAIYYYYLFRLYGPVPILEEVISENASA ESLQISRNSVEEVVNYIITQLQEAKREGLIDNIKLSKTLSSKYNGLGHIDRAIAQTFILQ 
TRMLAASDLFNGSNPYYAGLTNKDGKKLFPAYSEEQKKQLWADAANDAKEFIDRYVGNGY DLCRIKTNGVLDPYLSYREAVRGYISEMLNVSSGSAAEMIFFRERVKASDIHYERTPKHF GLPSSVSASSSMAATQEMVDSYFMANGLKPINGYESDNKTPVINTVSGYEDNGFSSSDYL DPVTKRIFAPKGALKAWVGREPRFYADITFDGQKWLNESDGVVYTSLQYSGNSGRGVGNS NDYSKTGYIVRKSAPLAEWDVSDRICILIRLAQIYLDYAEALNESDPGNPDILVYLNLIR ERAGIPQYGNGNGQIPVPADMRQAIRDERRVELSFETQRYFDVRRWCIAEETESKAIHGM NINKDGNDFFVRTKVEDRIFEKKHYLFPLPQKDLNINRNLVQNIGWD >gi|225935319|gb|ACGA01000073.1| GENE 38 47119 - 48276 843 385 aa, chain + ## HITS:1 COG:no KEGG:BT_3860 NR:ns ## KEGG: BT_3860 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 385 1 376 376 291 40.0 4e-77 MEKRYNTNWYMTLMFMLSMVVSLSFSACSDDDKNSSEQDMINTYVTKFQEQISEMAALRD DSPYGLQQGMYPEESKDILNSAIEKLEQVIENLSNGSYVEPAKSKFEQLLAETKRAISNF KESIRTTDYVDPNEDYELYVNGHQGGYIDFGVSPNFSRFGSEGQQTFTIELWMKVKSQEG FGSVISTFVEKGDDNDPNHYRKGWCVNLFDQKSLRMSYASDNWNLMEPGFEFNTLNEWVH FAAVMNEKGFNGETNGSGSPIICKIYLNGELKTTRAKSDMAGNYTSNTLENTAMVAFGQM DPVSGIKGDRKVEGYIKHFHIWKSAKSESDIKKLAQYQLLVTGDESDLVCGWPFNRLPDD FNNVEDLTGRYFASLKGNFQWVNLK >gi|225935319|gb|ACGA01000073.1| GENE 39 48340 - 50529 1384 729 aa, chain + ## HITS:1 COG:no KEGG:BT_3525 NR:ns ## KEGG: BT_3525 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 729 27 721 721 888 58.0 0 MVLKSSFILFCIFSFGGISCIAQHSKEQTMFQTSQPWRPTLDNRGDVAIIYGINGNPTDK TRAVPFAQRVHSWRDRGYTTHFMTGIAWGEYQDYFTGKWDGVPHMDEGQVRQEGDTIWHG HQVPYIVPTENYLAYIKEAHVKKAIDAGIDAIYMEEPEFWARGGYSEAFKREWKKYYGFD WQPQHESPNNTYLSNKLKYQLYYHALKEVFTYAKEYGRSKGMNVRCYVPTHSLLNYSQWM IVSPEASLASLDCVDGYIAQVWTGTSREPNFYSGIQKERVFETAFVEYGSMESMTAPTGR KMTFLTDPIEDQPRDWSDYKKNYEATFTAQLLYPRISNYEIMPWPERIYEGWYRASANSD ERIRIPRHYSTQMQIMINALNHMPVSDNQVSGNKGISALMSNSLMFQRFPEHAGYSDPYL SNFLGQVLPFLKRGVPVSTVHIENVEYPETWKELKILLMSYSNMKPMSPDAHKFISQWVH QGGILVYSGRDNDPFQSVQEWWNTGNQSYLNPSAHLFEMMEMTAFPQTGEYKYGKGTVFI LRNDPKEYVLSAGGDEELVKTVSTLYQKKTKESLCFKNSFVLERGMYKIIAVLDESVTNE 
PYIAKGKYIDLFDPQLPVLEIKEVSPGTQALLIDMDRVDKKKKSQVLASSCRIYDEKSTA HTYSFTGKAPVNTSGMMRILLPARPKSCTVTNNDDRLQSTVEWKWDASSKTCLLSCENHP DGIAVSIEW >gi|225935319|gb|ACGA01000073.1| GENE 40 50556 - 52871 1850 771 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 23 767 44 770 790 339 31.0 1e-92 MRYIISVIVASFLLVSCSNERLANPSQYVNPFIGASTSTGLAGTYHGMGKTFPGATTPYG MVQVSPQTITGGDNSSGYSYEHKTIEGFAFTQMSGVGWYGDLGNFLTMPTTGKLQLIAGR EDGSIQGYRSLYDKSSEEASAGYYCATLTDYNIKAEATAASHSGILRFTYPANSQSRIQI DLARRVGGTSTTQYIQVKDKHAIRGWMKCTPEGGGWGNGDGKADYTVYFYAVFSKPLNHY GFWNAEIPDDWNRKKDDVVSDSYLQQVSQAPIIKDVNEIEGKHIGFFTEFATEANEEVTM KVGISFVDMVGAENNYKQEIAAKDFDEIRQEAVENWNNELSRMEAKGGSEEEKTVFYTAL YHTMIDPRIVTDVDGRYVGGDGKVYEKEGFTKRSIFSGWDVFRSQFPLQTLINPKMVNDE INSLISLAEQSGYEYYERWEFLNSYSGCMIGNPALSVLTDAYLKGIRGYDVNKAYKYARN TANKFGNGERGYIVLNVSETLEYAYFDWCLSQLATALGHKEDASHYMLRGQNYRNTFDPE VGWFRPRHADGVYDLWPENARTIESYGCVESNLYQQGWFVPHDIPGMIELMGGKEKTLND LSSFFEKAPESLLWNQYYNHANEPVHLVPFLFNHLDAPWLTQKWTRHICDKAYRNAVEGI VGNEDVGQMSAWYVLAASGIHPSCPGSTRYEITSPVFNEIRIHLDPDYFKGKKFVIRTLN NSKTNRYIQKARLNNREYNKCYLDFKDIVAGGILELTMGAAPNPNWGIEKE >gi|225935319|gb|ACGA01000073.1| GENE 41 52908 - 54845 1616 645 aa, chain + ## HITS:1 COG:no KEGG:BT_2200 NR:ns ## KEGG: BT_2200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 642 6 631 634 608 47.0 1e-172 MNLKMNILAVICFFGITITGCNDSDVKYGYNNDYEIEDNGNGSSEGTISYETLFDPIHDA GQYTFPEMKRKPARYWSCIDALVGSNRNAPDLDSKGVQYHLLCQSIAGLVNRAVDEGKSD IGIWLRDEEHRDSYAVALNGVQAMDVTEQGLQTGTELAINTYPDEDGFKRNVRHLFEGYV LTDVRNNPESSIVASVAAHVHNSIIVDVRDEQTFKNAGYEMKYDARNKTTRNAWEEFKDK CNNEGLVLMPVNTGELRDFAIQHGYFVINLNSSYAVPDGANKELLKEVLAWLAPGAPVYG WEQVIPEVDFVGPASLTGHIWIPNDWSYNLAFTSLDYKNRQSANLAKVVNPKNIDYSLHK NFYSFYLSDGDNVQWMMNHFVDEYYMDPNAESTYMGFGIPVINIAMQSPDQFKEIMNMQK 
SNYTLIEALGGGYMYIDTFGKDAPEGRSESLKKAAAKVAASMRQHRIKVLGLHAKDYTNE EACKEAYQAFIDANDQLEGILVISLTGYADGGGKTYWLTNKDGLHIPVITAHYALWNHEG NNMRNEGSPAYLARLMKESAQTNGQTFSVISLHAWSNFNDMGQNPSETSEMNPQNTGNLK GASAAKLVERHLGDDFKVISVQELIWRIRMQYYPEETQQYINTLK >gi|225935319|gb|ACGA01000073.1| GENE 42 54872 - 57520 1740 882 aa, chain + ## HITS:1 COG:no KEGG:BT_2524 NR:ns ## KEGG: BT_2524 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 876 23 877 881 1180 64.0 0 MKRIVVILLGSLYGMVGIGAQEAMTPTALQCEHLVNPLGIDATHPRFSWKLQSSRQGVTQ HSYRIIVGTDSLAVTQGKGNSWDSGIVESDCTLSHYQGNKLLPFTRYFWKVQVEEQPDTL QTSSKVQAFETGMMKVDNWQGAWITDRQSKEHHPSPYFRKVFHSGKPVKVARAYIAVGGL YELFLNGEKVGNHRLDPMLTRYDRRILYVTYDVTSHIRKGKNAIGVILGNGWYNHQALGV WGFADAPWRNRPTFCLDLRLTYDDNTTEVVSTGLDWRTSGGSIIRNNIYTGENQDADREQ KGWNTANFDDSSWTDVSFRSAPSQNIVAQQLYPIRNVETIPAQYMNKLNDSLYVFDLGRN IAGVTQLKIEGKAGTIVRLIHAERLSPNGRVDLGRISNFHRPTDEEDLFQTDTYILSGKG KDTFMPKFNYKGFRYVEVSCSEPTELKKECLTGYFMHNDVPEVGNVTTSDPIINKIWKAT NVSYLSNLFGYPTDCPQREKNGWTGDGHLGIEAGLYNYDALTIYEKWLADHRDEQQPNGV LPDIIPTSGWGYGTENGLDWTSTIALIPWNVYLFYGDSKLLEDCYENIKRYVDYVDRNSP QYLSDWGRGDWVPVKTLSSKELTSSVYYYVDTNILAHAAKLFGKQDDYEKYTALAENIKE AINKKYLNRDTGIYAGGSQTELSVPLMWGVVPEDMKAKVAANLANKVQKDGCHVDVGVLG CKALLNALSENGYADLAFQVAAQKDYPSWGWWISNGATALVENWDYQGAGEFSDNHMMFG EIGAWFYKALGGIKPDSEHPGFRHILLAPHFVKGLNYANISYQSPSGLIVSNWKRKGKKI IYEVTIPANCTATFTMPANIKDSRTVALEPGTHVFELQTTAK >gi|225935319|gb|ACGA01000073.1| GENE 43 58002 - 60284 1483 760 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 31 754 11 716 717 433 34.0 1e-121 MKKCWLLFLVLSCTLLAQSKDWTQYVNPLMGTQSSFELSTGNTYPAIARPWGMNFWTPQT GKMGDGWQYVYTANKIRGFKQTHQPSPWINDYGQFSIMPMVGKPEFDEEKRASWFGHKGE EATPYYYKVYLAEYDIVTEMSPTERAVLFRFTFPENAHSYIAVDAFDKGSFIQIIPEENK IIGYSTRNSGGVPENFKNYFIIQFDKPFTYKATVNNGSIQENATEQTTDHAGAIIGFKTQ 
RGEQVHARIASSFISFEQAARNMKELGNDNLEQIKQKGQKAWNQVLGKIEVEGGNLDQYR TFYSCLYRSLLFPRKFYELDEAGKPVHYSPYNGQVLPGYMFTDTGFWDTFRCLFPLLNLM YPSMNKEMQEGLINTYKESGFFPEWASPGHRDCMIGNNSASVLADAYLKGIKVEDVKTLY EGLIHGTKSVHPTVSSTGRRGYEYYNKLGYVPYDVKINENTARTLEYAYNDWCIYQLAKE LKRPEKEIQLFAKRAMNYKNVFDKDSKLMRGKNEDGTFQSPFSPLKWGDAFTEGNSWHYS WSVFHDPQGLIDLMGGKDMFITMLDSVFSVPPVFDESYYGQVIHEIREMTIMNMGNYAHG NQPIQHMIYLYNYAGQPWKAQYWLRQVMDKMYTPGPDGYCGDEDNGQTSAWYVFSALGFY PVCPGTGEYVVGAPLFKKATLHLENGNSLIIQATDNNKKNMYIQTMNLNGTEYKKNYLHN EDLLKGGNINFQMGSQPNLNRGTEEESFPYSFSKVLKKVR >gi|225935319|gb|ACGA01000073.1| GENE 44 60500 - 62257 1202 585 aa, chain - ## HITS:1 COG:no KEGG:BT_2553 NR:ns ## KEGG: BT_2553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 582 1 582 583 1013 86.0 0 MMKQIPYGITDFGRIRKQNYYYVDKTMFIEKIEMQPPYLFLIRPRRFGKSLTLAMLETYY DVSYAEQFDELFGQLYIGQHPTKLHNQFLIMRFNFSEVSSNVNEVEQSFKLHCCNKLKDF VYKYERLLGKEIWEILDEGTQKDPGAFLSAISTYASRKGNLRIYLLIDEYDNFTNTILST YGTEFYRKATHGEGFIRGFFNVIKSATTGTGAALERLFITGVSPVTMDDVTSGFNIGTNI TTDPWFNDLVGFSEKELREMLTYYKEEGVLEESIDEIVAMMKPNYDNYCFSEDTLEQCMF NSDMTLYFLKSFVLHHKKPKEIVDPNIRTDFNKLTYLIKLDHGLGENFSVIKEIAEQGEI TTDIVTHFSALEMTDPSNFKSLLFYFGLLSIKGVDMVGRPILHVPNLVVREQLFNFLIQG YIKHGIFKIDMNKMSALFENMAFRGDWKPLFDFIADAVREQSRIREYIEGEAHIKGFLLA YLGMYRYYQLYPEYELNKGFADFLFRPSPSVPVLPPFTYLLEVKYAKAGASEKEIRVLAD EARGQLLRYSEDELVAEARAKGGLKLITIVWRSWELALLEEVTLS >gi|225935319|gb|ACGA01000073.1| GENE 45 62288 - 62737 487 149 aa, chain - ## HITS:1 COG:no KEGG:BT_2554 NR:ns ## KEGG: BT_2554 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 147 147 189 79.0 2e-47 METEELTIGRVHHGRNIRRTRIEKDMNQEGLSELVHLSQPAVSKYEKMRVIDDEMLQRFA RALNVPFDYLKTLEEDAQTVVFENITNNGNAATNANIGFVDEVGEDNRVNNFNPIDKITE LYERLLKEKDEKYAALERRLQHIEESLQK >gi|225935319|gb|ACGA01000073.1| GENE 46 62957 - 63217 292 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175211|ref|ZP_05761623.1| ## NR: gi|260175211|ref|ZP_05761623.1| hypothetical protein 
BacD2_25378 [Bacteroides sp. D2] # 1 86 1 86 86 145 100.0 1e-33 MKRILTNKEYRFIKELKGLMDKYNVVISTDAHGKIEIVVNEDNEPFDFEAENTIYLGACF SNHELNELLERNLAHIKRIEEEYKTN >gi|225935319|gb|ACGA01000073.1| GENE 47 63755 - 66736 1817 993 aa, chain - ## HITS:1 COG:no KEGG:BT_4692 NR:ns ## KEGG: BT_4692 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 987 7 1011 1011 565 39.0 1e-159 MMKSGNRIYRNWLYALLLIVFSSCIDETFTENSVEGTFLTVRGITPLGATHEGTPEDRVV RTLRILAFDRTTGNKISNVYYTASTGDIIRHPIDPGMYDFVFLANEPPHIPVKTKLEAIS NYSDLDHIAYPASFFSSEQVIPMMQEIKKVTVLSDGQGATLENGTTVSMLLLALERLGVR VDVELKAVDNLNGAFKGVIFSNIPNLVPLTAGYDGPAIERNVVRKFTLTDNGNYFADGTP TVEWGWDKKVNRIILPASEPASVNDEDEAVVFTVDMLDNYSPSCELKINSSPVNYSLPKN TKLDLTGYIQDPLQVNIVASKWEDVDEDWDIAGIKVLNVSTLEASITDFNGVRISFSSNM PIVKVLSELYVGDTGTAKTETGKIFNDLVLRYNDTKDDGTTITYATSRFLYTYDRTSQTG SGYMDILLDERNVKDAQETYRIILSVEDEDGGKLQREIKVKTKQYGKRFDFNSYGSGYIG AFFRDDETGERIITGQQGVISGAGERPEDLGAPAPWKATVEAGDFVVLSSSPSFDPHVGT DDPGNPEHYQVLPDAYKGEKGTYVEGRGRIYFRIGVKEKNSGTTPRYAKIKIERYGGRWG TGDNQVWYNTGYMYIRQGEEPDYIMRPGTADPISKGPLKDASRNYARKISPFNLTSPAYF SGSNALYTPVDHKQAKFVKYPTQAGAFFQWGLPKKADQSYFRLAYHPTALTMDYWIRDIL FFDESHKLFLPVWGDAPVPPETVYDYGYSEVFESCPEGYHRPSDGYIDRISYNGPYPNNI DQDNDPVIVNAYKEVIKTQIVDYSDEIAFSELRQSLFINPLSGDVGVNENIGGYDGSVEA GGIKVDRYQNFWIDRNDQVEAQEHIKFFIGFYADGFFDRRPIKMGPGEGNYPYGVSSENA QVAYFGILVFNESNNASVFFPSAGRRRNLNSTLEYAGQTGYYHTSSIAASSAEDPHAVWS MTLGKWPNPGLMYQLPTFGQSIRCVKDETAGTR >gi|225935319|gb|ACGA01000073.1| GENE 48 67148 - 69274 184 708 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P [Veillonella parvula DSM 2008] # 628 705 464 539 632 75 51 9e-13 MINPIVKTIELPDGRTITLETGKLAKQADGSVMLRMGNTMLLATVCAAKDAVPGTDFMPL QVEYKEKFAAFGRFPGGFTKREGRASDYEILTCRLVDRALRPLFPDNYHAEVYVNIILFS ADGVDMPDALAGLAASAALAVSDIPFNGPISEVRVARIDGQFVINPTFEQLEKADMDLMV AATYENIMMVEGEMHEVSEAELLEAMKVAHEAIKVHCKAQMELTEEVGKTVKREYNHEVN 
DEDLRKAVREACYEKAYAVAASGNNNKHERFAAFEAIREEFKAQFSEEELDEKGALIDRY YHDVEKEAMRRSILDEGKRLDGRKTTEIRPIWCEVGPLPGPHGSAIFTRGETQSLTSVTL GTKLDEKIIDDVLEHGKERFLLHYNFPPFSTGEAKAQRGVGRREIGHGHLAWRALKGQIP ADYPYVVRVVSDILESNGSSSMATVCAGTLALMDAGVKIKKPVSGIAMGLIKNPGEEKYA VLSDILGDEDHLGDMDFKVTGTKDGITATQMDIKVDGLSYEILERALNQAKEGRMHILGK ITETIAEPRADLKEHAPRIETMTIPKEFIGAVIGPGGKIIQGMQEETGAVITIEEIDGMG RIEVSGTNKKCIDNAMRMIKAIVAVPEVGEVYKGKVRSIMPYGAFIEFLPGKDGLLHISE IDWKRLETVEEAGIKEGDEIEVKLIDVDQKTGKFKLSRKVLMPRPEKK >gi|225935319|gb|ACGA01000073.1| GENE 49 69471 - 70622 1158 383 aa, chain + ## HITS:1 COG:no KEGG:BT_2564 NR:ns ## KEGG: BT_2564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 383 1 383 383 685 89.0 0 MKKVTFAALAALTITACSSGPKFQVNGDVSGADGKMLYLEASGLEGIVPLDSVKLKEEGT FSFKQPRPESPEFYRLRIDDKIINFSVDSIETIQIKAPYVDFSTTYTVEGSENSNKIKEL TLKQIRLQKEVDNLLASLRSNGIGHDVFEDSLATLLNNYKEDVKVNYIFAAPNTAAAYFA LFQKLNNYLIFDPLNNKDDVKCFAAVATSLNNAFPHAVRSKNLYNIVIKGMKNTRQPQTK ALEIPQDKIVETGIIDIALRDVKGNVRKLTDLKGKVVLLDFSVFQSPAGAPHNLMLRELY NTYAKEGLEIYQVSLDADEHYWKTAADNLPWVCVRDGNGVYSTNVAVYNVRQVPSIFLIN RNNELKLRGEDIKDLEAAVKSLL >gi|225935319|gb|ACGA01000073.1| GENE 50 70804 - 71268 656 154 aa, chain + ## HITS:1 COG:RC1332 KEGG:ns NR:ns ## COG: RC1332 COG0782 # Protein_GI_number: 15893255 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Rickettsia conorii # 4 152 51 199 206 114 46.0 7e-26 MAYMSEEGYKKLMAELKELETVERPKISAAIAEARDKGDLSENAEYDAAKEAQGMLEMRI NKLKATIADAKIIDESKLKTDSVQILNKVELKNVKNGMKMIYTIVSESEANLKEGKISVN TPIAQGLLGKKVGDVAEITVPQGKIALEVVNISI >gi|225935319|gb|ACGA01000073.1| GENE 51 71320 - 71712 465 130 aa, chain + ## HITS:1 COG:MA0811 KEGG:ns NR:ns ## COG: MA0811 COG0537 # Protein_GI_number: 20089695 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Methanosarcina 
acetivorans str.C2A # 4 101 17 119 150 88 42.0 2e-18 MATIFSRIIAGEIPCYKVAENDKFFAFLDINPLVKGHTLVVPKQEVDYIFDLSDEDLAAM HVFAKQVACAIKKAFPCQKVGEAVIGLEVPHAHIHLIPIQKESDMLFSNPKLKLSDEEFK SIAQAINSSL >gi|225935319|gb|ACGA01000073.1| GENE 52 71765 - 72664 619 299 aa, chain - ## HITS:1 COG:BS_yyaM KEGG:ns NR:ns ## COG: BS_yyaM COG0697 # Protein_GI_number: 16081133 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 17 269 15 267 305 74 25.0 2e-13 MKNKKLEANLSMAVSKVFSGLNMNALKYLLPLWMSPLTGATLRCTFAAAAFWVIGWFMPP EKSSARDKWLLFLLGALGLYGFMFLYLAGLSKTTPVSSSIFTSLQPIWVFLIMIFFYKEK AGAKKIIGISIGLIGALVCILTQQSDDLASDAFTGNMLCLLSSVVYAVYLILSQRILTAI GAITMLRYTFSGAAVSAIIVTFITGFDAPVFSMPFHWTPFLILMFVLIFPTTISYMLLPV GLKYLKTTVVAIYGYLILIVATITSLALGQDRFSWTQTFAIIFICIGVYLVEVAESKEK >gi|225935319|gb|ACGA01000073.1| GENE 53 72668 - 73924 949 418 aa, chain - ## HITS:1 COG:AGc4286 KEGG:ns NR:ns ## COG: AGc4286 COG0477 # Protein_GI_number: 15889635 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 408 15 387 400 74 22.0 4e-13 MKIQTGRGTIPLITLIAIWSISALTSLPGLAVSPILGDLTKIFPEATDLDIQMLTSLPSL LIIPFILLGGKLTEKVDYVRVLKVGLWLFAASGVLYLISNRMWQLIVVSALLGIGSGLII PLSTGLVSRYFVGTYRVKQFGLSSAITNFTLVIATAVTGYLAEVSWHLPFLVYLLPLISI LLVGHLKEDRTGEVAFTPSPSSSADTTKDNEASEQSTSIDIGGSKYGIHIKHLIELMLFY GVITYIVVVVIFNLPFLMEKHHFSSGNSGLMISLFFLAITAPGFCLDKIVALLKGRTKAY SLLSMALGLLLIWIAPIEWLIIPGCILVGLGYGIIQPMLYDKTTQTALPQKTTLALAFVM MMNYLAILLYPFIVDFFQWIFHTQSQEFPFIFNLLITIVTLFWAYRRRNTFLFNDQLK >gi|225935319|gb|ACGA01000073.1| GENE 54 74414 - 75856 1583 480 aa, chain + ## HITS:1 COG:sll1641 KEGG:ns NR:ns ## COG: sll1641 COG0076 # Protein_GI_number: 
16329656 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Synechocystis # 29 443 35 448 467 436 48.0 1e-122 MEDLNFRKGDAKTDVFGSDRMLQPSPVEKIPDGPTTPEVAYQMVKDETFAQTQPRLNLAT FVTTYMDEYATKLMNEAININYIDETEYPRIAVMNGKCINIVANLWNSPEKDTWKTGALA IGSSEACMLGGVAAWLRWRKKRQAQGKPFDKPNFVISTGFQVVWEKFAQLWQIEMREVPL TLEKTTLDPEEALKMCDENTICIVPIQGVTWTGLNDDVEALDKALDAYNAKTGYDIPIHV DAASGGFILPFLYPEKKWDFRLKWVLSISVSGHKFGLVYPGLGWVCWKGKEYLPEEMSFS VNYLGANITQVGLNFSRPAAQILGQYYQFIRLGFQGYKEVQYNSLQIAKYIHSEIAKMVP FVNYSEDVVNPLFIWYLKPEYAKSAKWTLYDLQDKLSQHGWMVPAYTLPSKLEDYVVMRV VVRQGFSRDMADMLLGDINNAIAELEKLEYPTPTRMAQEKNLPVEVKMFNHGGRRKTVKK >gi|225935319|gb|ACGA01000073.1| GENE 55 75866 - 76831 1144 321 aa, chain + ## HITS:1 COG:ECs0538 KEGG:ns NR:ns ## COG: ECs0538 COG2066 # Protein_GI_number: 15829792 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli O157:H7 # 9 312 6 308 310 270 45.0 4e-72 MDKKVTLAQLKEVVQEAYDQVKTNTGGKNADYIPYLANVNKDLFGISVCLLNGQTIHVGD TDYRFGIESVSKVHTAILALRQYGAKEILDKIGADATGLPFNSIIAILLENDHPSTPLVN AGAISACSMVQPIGDSAKKWDAIVGNVTDLCGSAPQLIDELYKSESDTNFNNRSIAWLLK NYNRIYDDPDMSLDLYTRQCSLGVTALQLSIAAGTIANGGVNPVTKKEVFDAVLAPKITA MIAAVGFYEHTGDWMYTSGIPAKTGVGGGVMGVLPGQFGIAAFAPPLDGSGNSVKAQLAI QYIMNKLELNVFSNNHITVVD >gi|225935319|gb|ACGA01000073.1| GENE 56 76757 - 76945 123 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPFVLKHWREIKRVRGIEKGYYQEIVTFTFLIKEIERILVNNGDMVVAKYIQFQLVHDVL NS >gi|225935319|gb|ACGA01000073.1| GENE 57 77359 - 78141 777 260 aa, chain - ## HITS:1 COG:no KEGG:BT_2572 NR:ns ## KEGG: BT_2572 # Name: not_defined # Def: putative potassium channel subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 243 1 242 243 396 90.0 1e-109 MKSALSDFISGKKGIYGILHIIILVMSLFLVISISVDTFKGIPFYTQSSYMKVQLCICLW FLFDFVLEFFLAKHKGRYLRTHFIFLLVAIPYQNIIAYYGWTFSDEITYLLRFIPLLRGG YALAIVVGWLTYNRASSLFVSYLTMLLATVYFSSLAFFVLEHRVNPLVNGYGDALWWAFM 
DVTTVGSNIIAQTVTGRVLSVLLAALGMMMFPIFTVYITNLIQQSNKRRKQYYEEEELEK KASEKKELAEKAAVQKGSVS >gi|225935319|gb|ACGA01000073.1| GENE 58 78358 - 80085 1496 575 aa, chain + ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 9 486 23 501 510 507 58.0 1e-143 MANIKNAVKLGVFTLAIMNVTAVVSLRGLPAEAVYGMSSAFYYLFAAIVFLIPTSLVAAE LAAMFQDKQGGVFRWVGEAYGKKLGFLAIWVQWIESTIWYPTVLTFGAVSIAFIGMNDVH DMSLANNKYYTLVVVLIIYWLATFISLKGMSWVGKVAKIGGMVGTIIPAALLIILGIIYL ASGGHSNMDFNSSFFPDFTNFDNVVLAASIFLFYAGMEMGGIHVKDVENPSKNYPKAVFI GALITVLIFVLGTFALGVIIPAKDINLTQSLLVGFDNYFRYIHASWLSPIIAVALAFGVL AGVLTWVAGPSKGIFAVGKAGYMPPFFQKTNKLGVQKNILFVQGIAVTVLSLLFVVMPSV QSFYQILSQLTVILYLIMYLLMFSGAIALRYKMKKLNRPFRIGKSGNGLMWFVGGLGFCG SLLAFILSFIPPSQISTGSNTVWFSVLIIGAIIVVIAPFIIYASKKPSWVDPNSNFEPFH WEVQAQPATVNVSASNANAARPTATSAHTGGATGASTAKPGATVSNAAAPDAASSGATAS GATPSSSSSATSGGNSASGKASPGTGDKDKDAPKS >gi|225935319|gb|ACGA01000073.1| GENE 59 80317 - 81585 857 422 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 151 409 44 312 328 162 32.0 1e-39 MSSLAIIVIIVLLAAFIYMWAKCQSLQKEVHSKENKENELKKLALVLQNINAYFLLIDKD FVVCDTNYYSLNRLPIQVGGVTKRVGDLLHCRNAIAAGECGQHEQCKLCCIRASIGKAFY KKASFKNLEASMKLLSEDETKVTPCDVSVSGTYLNIHGEDYMVLTVYDVTELKNAQRLLS IEREHSISADKLKSAFIANMSHEVRTPLNAIVGFSGLMVSASSEEERKMYADVIAENNER LLRLVNDIFDLSQIESGTVDFVYTEFDANDLLRELEGIFKTKLNNSSVELVCEAHIQPIM MYSERERIIQVLSNLLHNAMKFTESGEIRLGCSLKGTEEVCFVVSDTGIGIPKEEQKKIF SRFIKLDREMQGTGLGLTLSQTIIQNLGGNLELDSEINRGSTFSFVLPRVIKPELIKPQA IK >gi|225935319|gb|ACGA01000073.1| GENE 60 82052 - 83689 1444 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2662 NR:ns ## KEGG: BT_2662 # Name: not_defined # Def: alpha-galactosidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 
545 1 507 507 955 87.0 0 MKNRIYLFTVAVASFMCISCTKTQTTLSENEKTVNPPIMGWSSWNAFRVDISEDIIKHQA DLMVKKGLKDAGYHYVNVDDGYFGKRDDNGIMHTHEQRFPNGMKPIADYVHGLGMKAGLY TDAGNSTCGSMWDNDAAGVGAGIYGHEPQDAQLYFGDWGFDFIKIDYCGGDALGLNEKER YTSIRNSIDKVNKDVSINICRWAFPGTWAKDAATSWRISGDINAHWGSLRYVVGKNLYLS AYAGNGHYNDMDMMVIGFRNDSKVGGQGLTPTEEEAHFGLWCIMSSPLLIGCNLENMPDS SLELLTNKELIALNQDPLGLQAYVAQHENEGYVLVKDIEQKRGNVRAVALYNPSDTICSF SVPFSSLEFGGNVKVRDLVKHSDLGNFSGTFEQTLPAHSAMFLRIEGETRLEPTLYEAEW AYLPLFNDLGKNPKGIIYAHDKEASGKMKVGFLGGQPENYAEWREVYSENGGRYDMTIHY LYGKGRQIELDVNGIITKIDSLGEDNGHNQITVPVELKAGYNTIRMGNSYNWAPDIDCFT LKKAL >gi|225935319|gb|ACGA01000073.1| GENE 61 83835 - 84641 689 268 aa, chain - ## HITS:1 COG:no KEGG:BF3938 NR:ns ## KEGG: BF3938 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 268 1 265 265 159 32.0 1e-37 MKRTTYIFIGLLVSGLIVIVATIIFIFMSGKPYQEDGVFLGNEQVKMDLNGVHVIKVFTS QDGVSEARRVFIGGEMKIGSTAALSGKELVSYPKGDYLNVTQKNDTLFMKLDLNVNNIPE KYRDRDYIFVSGFDVSLAVDSLTGIVTGEEGLKLNLIGIETDSLFVRGNRYNISLDSCRL GSCDIQGNGLDFHAKDSKIENFYLNLDGVWRWTFENTEVETEYLSGSNQHSNDLQKGECK RVVWTPLTEDARLQVNVREKAEITITPE >gi|225935319|gb|ACGA01000073.1| GENE 62 84659 - 85027 318 122 aa, chain - ## HITS:1 COG:BH3492 KEGG:ns NR:ns ## COG: BH3492 COG1725 # Protein_GI_number: 15616054 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 120 5 118 129 67 31.0 6e-12 MNFKESKAIYLQIADRICDEILLGQYPEEERIPSVREYAAIVEVNANTVMRSFDYLQVQN IIYNKRGIGYFVASGAKELIHSLRKDTFLKEELDYFFRQLYTLDIPIKEIETMYHEFIKK QK >gi|225935319|gb|ACGA01000073.1| GENE 63 85039 - 85860 507 273 aa, chain - ## HITS:1 COG:no KEGG:BF4127 NR:ns ## KEGG: BF4127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 273 1 271 271 278 57.0 1e-73 MIKDTFFSLPRFINLCRKEMVENWKSNVLRMVLMYGVMAVVMVWNGYFEYRGTYKYQEDP AWIFLLVTFIWALWGFGCLSASFTMEKMKTKTSRTSMLMIPATPFEKFFSRWFVFTVVYL VVFLICYKLADYTRFTIYSLAYPEKDFIAPVDLSHLVGDEKYYTLCRRGLEFGALISGYF 
FVQSLFVLGSSIWPKNSFLKTFASGTIIGMVYFAVGALVSKVLLESGRYYSGGVLESKET TLWIVIVAGIFFALVNWVLAYFRFKESEIINRM >gi|225935319|gb|ACGA01000073.1| GENE 64 85867 - 86712 767 281 aa, chain - ## HITS:1 COG:BB0573 KEGG:ns NR:ns ## COG: BB0573 COG1131 # Protein_GI_number: 15594918 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Borrelia burgdorferi # 2 215 5 214 270 169 43.0 4e-42 MITVENLSFTYRKSKRAVLRDFSLSFESGRVYGLLGKNGAGKSTLLYLMSGLLTPKNGKV MFHDTDVRRRLPVTLQDMFLVPEEFELPSVSLVSYIELNSPFYPRFSKEEMIKYLHYFEM DIDIDLGSLSMGQKKKVFMSFALATNTSLLLMDEPTNGLDIPGKSQFRKFIASGMSDDKT IVISTHQVRDIDKVLDHVLIMDDSRVLLDESTSNICDKLFFVESDDRELAKSALFAIPTI QGNYLILPNEKQDESELNLELLFNATLATPEEIARLFHTQK >gi|225935319|gb|ACGA01000073.1| GENE 65 86885 - 88351 1582 488 aa, chain - ## HITS:1 COG:no KEGG:BT_2663 NR:ns ## KEGG: BT_2663 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 488 1 483 483 720 89.0 0 MKRVQMFLAGAFIAVGSLYAQSSDAEWQAGVAKLKETIQTNPAQAAEEAEHLIKGKNKKN VELLVAIGDAYLNADKIPEAQEYAALARKANGKSALASVLEGNIAVKQKNAGLASQKYEE AIYFDPKCTEAYLKYADIYKSANASLAIEKLNQLKDLEPSNTAVDKKLAEIYYLKNDFNK AVEAYANFAMGPTATEEDLVKYAFALFLNHDFEKSLEVANMGLQKNARHAAFNRLAMYNY TDLKRFDEALKAADVFFKECDKADYSYLDYMYYGHLLESLKKYDDAVVQYEKAVKMDPTK TDLFKNISSAYEQKNDYKKAISAYQKYYASLDKEKQTPDLQFQFGRLYYGAGTQPDSLAI TVEERKQALMSADSTFHAIAEAAPDSYLGNFWRARANSALDPETTQGLAKPFYEEVAALL ESKNDPHYNSALVECYSYLGYYYLLAIENPALKAEAKANKDKSIEYWSKILAIDPANATA KRALDGIK >gi|225935319|gb|ACGA01000073.1| GENE 66 88373 - 89314 852 313 aa, chain - ## HITS:1 COG:TM1264 KEGG:ns NR:ns ## COG: TM1264 COG0226 # Protein_GI_number: 15644020 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Thermotoga maritima # 37 301 23 271 274 87 25.0 3e-17 MKRQFRLIGLVALVVLSACNSKSKGPTDTYSSGVVSIAADESFEPIIQEEIEVFESLYPL AGIVPRYTTEVEAINLLLKDSVRLAIATRTLTKEEMNSFHSRKFFPREIKLATDGLALIV 
NRANPDSLLSVRDFRRILTGEVKNWKEVNPNSRLKGIQVVFDNKNSSTVRFAMDSICGGK ELAEGNVSALKTNQQVIDYVANNPDAMGVVGVNWLGNRSDTTNLSFREEIRVMSVSAEDV ATPANSYKPYQAYLFYGNYPLARSIYALLNDPRSGLPWGFASFMTSDKGQRIILKSGLVP ATQPVRIVHVKDE >gi|225935319|gb|ACGA01000073.1| GENE 67 89321 - 90136 852 271 aa, chain - ## HITS:1 COG:no KEGG:BT_2665 NR:ns ## KEGG: BT_2665 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 271 1 270 270 443 92.0 1e-123 MAKLDLASSEWCQLIFEGKNQAYGAYRMRANSTKRHNVAMLIVVVIAAVGFSIPTLLKLA TPEQKEVMTEVTTLSKLEEPEIKQEEMKRVEPVAPPPPALKSSIKFTAPVIKKDEEVHED NEIKSQEDLNATKVSISIADVKGNDEANGKDIADLKQVVTQAAPEPEKVFDMVEQMPTFP GGQQELMAYLGKNIKYPTIAQENGTQGRVIIQFVVERDGSISDVHVARGVDPYLDKEAVR VVKSMPKWLPGKQNGKAVRVKFTVPVMFRLQ >gi|225935319|gb|ACGA01000073.1| GENE 68 90163 - 90816 693 217 aa, chain - ## HITS:1 COG:no KEGG:BT_2666 NR:ns ## KEGG: BT_2666 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 217 342 95.0 6e-93 MSAEVQESGGKRGKSKQKKMTVRVDFTPMVDMNMLLITFFMLCTTLSKPQTMEISMPSND KDITENQKSMVKASQAITLLLGADNKLYYYEGEPNYKDYTSLKETSYGADGLRAVLLQKN AVAVNKVRELKQQKLDLKITDDEFKKQVSEIKSGKDTPTVIIKATDDASYMNLIDALDEM QICNIGKYVITDIAEADEFLIKNYDAKGELSQNLADN >gi|225935319|gb|ACGA01000073.1| GENE 69 90832 - 91437 468 201 aa, chain - ## HITS:1 COG:no KEGG:BT_2667 NR:ns ## KEGG: BT_2667 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 201 1 201 201 347 94.0 2e-94 MGRAQIKKKSTFIDMTAMSDVTVLLLTFFMLTSTFVKKEPVQVTTPASVSEIKIPETNVL QILVDPEGKIFMSLDKQQDMQAVLESMGEEYGIKFTPEQEKRFILSSTFGVPIRSMQKYL DLPEDQRDKILKNEGIPCDSVDNQFKSWVRNARTANADLRIAIKADATTPYSVIKNVMNS LQDLRENRYNLITSLKAESEN >gi|225935319|gb|ACGA01000073.1| GENE 70 91471 - 92268 999 265 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 52 
261 1 200 202 87 28.0 3e-17 METTKKTQVVGIKNAGIVIICCLVIAVCIFQFLLGNPSNFMNNDPNNHPLNMLGTIYKGG IIVPIIQTLLLTVLALSIERYFALRSAFGKGSLSKFVANIKDALAAGDMKKAQEICDKQR GSVANVVTSTLRKYEEMEKNTALPKEQKLLAIQKELEEATALEMPMMQQNLPIIATITTL GTLMGLLGTVIGMIRSFAALSAGGGADSMALSQGISEALINTAFGILTGALAVISYNYYT NKIDKLTYGLDEVGFSIVQTFAATH >gi|225935319|gb|ACGA01000073.1| GENE 71 92423 - 92860 173 145 aa, chain - ## HITS:1 COG:no KEGG:BT_2669 NR:ns ## KEGG: BT_2669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 147 147 161 70.0 8e-39 MKLSPKFILAFLLLMLSFSVSGNAMNFSTCIFSDSPCQLSELSSDSNQRSSVDGQSVSYD SVRSLSVSDVELGLKLVTESNSNNYRLRRIIEINDSLKDVMQKFLVLRENSLVLDQSKSF YSDKDPHYSITCSDYYVFALRRILI Prediction of potential genes in microbial genomes Time: Fri May 13 11:33:20 2011 Seq name: gi|225935318|gb|ACGA01000074.1| Bacteroides sp. D2 cont1.74, whole genome shotgun sequence Length of sequence - 48248 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 14, operones - 10 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 55 - 154 98.0 # CP000140 [D:147281..147431] # 5S ribosomal RNA # Parabacteroides distasonis ATCC 8503 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides. + Prom 470 - 529 7.3 1 1 Op 1 . + CDS 591 - 1181 660 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E 2 1 Op 2 . + CDS 1193 - 2035 857 ## BT_1300 hypothetical protein 3 1 Op 3 16/0.000 + CDS 2053 - 3813 1843 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 4 1 Op 4 16/0.000 + CDS 3843 - 5162 1648 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B + Term 5181 - 5221 1.2 + Prom 5164 - 5223 3.0 5 1 Op 5 4/0.000 + CDS 5279 - 5884 594 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D 6 1 Op 6 16/0.000 + CDS 5881 - 7698 1805 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 7 1 Op 7 . 
+ CDS 7773 - 8237 676 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K + Term 8274 - 8323 12.1 + Prom 8327 - 8386 6.9 8 2 Op 1 . + CDS 8440 - 10101 1446 ## COG0438 Glycosyltransferase 9 2 Op 2 . + CDS 10139 - 12703 2886 ## COG0058 Glucan phosphorylase + Term 12720 - 12748 -1.0 - Term 12828 - 12868 4.1 10 3 Tu 1 . - CDS 12914 - 14017 1266 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 14091 - 14150 3.5 + Prom 13978 - 14037 10.3 11 4 Op 1 30/0.000 + CDS 14162 - 15547 1273 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 12 4 Op 2 36/0.000 + CDS 15559 - 16359 621 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 13 4 Op 3 25/0.000 + CDS 16353 - 17147 723 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 14 4 Op 4 . + CDS 17158 - 18510 1319 ## COG0687 Spermidine/putrescine-binding periplasmic protein 15 5 Op 1 . - CDS 18632 - 19414 517 ## COG4912 Predicted DNA alkylation repair enzyme 16 5 Op 2 . - CDS 19449 - 20108 627 ## BT_1287 hypothetical protein - Prom 20209 - 20268 4.9 17 6 Tu 1 . - CDS 20283 - 21968 777 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 22053 - 22112 3.6 - Term 22533 - 22580 11.9 18 7 Op 1 . - CDS 22666 - 24225 1332 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) 19 7 Op 2 . - CDS 24247 - 25482 1253 ## BT_1283 hypothetical protein 20 7 Op 3 . - CDS 25497 - 26531 904 ## BT_1282 hypothetical protein 21 7 Op 4 . - CDS 26556 - 28151 1487 ## BT_1281 hypothetical protein 22 7 Op 5 . - CDS 28156 - 31494 3222 ## BT_1280 hypothetical protein - Prom 31514 - 31573 4.2 23 8 Op 1 . - CDS 31639 - 32625 704 ## COG3712 Fe2+-dicitrate sensor, membrane component 24 8 Op 2 . 
- CDS 32682 - 33233 482 ## BT_1278 RNA polymerase ECF-type sigma factor - Prom 33341 - 33400 5.9 - Term 33368 - 33418 10.7 25 9 Op 1 . - CDS 33476 - 34786 1145 ## COG0738 Fucose permease 26 9 Op 2 . - CDS 34806 - 35198 379 ## BT_1276 hypothetical protein 27 9 Op 3 3/0.000 - CDS 35205 - 36635 1099 ## COG1070 Sugar (pentulose and hexulose) kinases 28 9 Op 4 . - CDS 36653 - 37291 438 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 29 10 Op 1 . - CDS 37422 - 39197 1844 ## COG2407 L-fucose isomerase and related proteins 30 10 Op 2 . - CDS 39265 - 40260 689 ## COG1609 Transcriptional regulators - Prom 40320 - 40379 9.9 - Term 40360 - 40395 6.5 31 11 Op 1 . - CDS 40438 - 40992 909 ## PROTEIN SUPPORTED gi|160885844|ref|ZP_02066847.1| hypothetical protein BACOVA_03848 - Prom 41013 - 41072 4.6 32 11 Op 2 . - CDS 41099 - 42400 998 ## COG1757 Na+/H+ antiporter - Prom 42540 - 42599 6.9 33 12 Tu 1 . - CDS 42621 - 44252 1025 ## COG0642 Signal transduction histidine kinase - Prom 44418 - 44477 4.5 34 13 Tu 1 . - CDS 44750 - 45277 570 ## BT_1263 putative protease I - Prom 45333 - 45392 5.5 + Prom 45260 - 45319 6.1 35 14 Op 1 . + CDS 45444 - 45791 381 ## COG1733 Predicted transcriptional regulators + Prom 45794 - 45853 7.1 36 14 Op 2 . 
+ CDS 46060 - 48123 1358 ## COG2273 Beta-glucanase/Beta-glucan synthetase + Term 48180 - 48237 12.3 Predicted protein(s) >gi|225935318|gb|ACGA01000074.1| GENE 1 591 - 1181 660 196 aa, chain + ## HITS:1 COG:TP0424 KEGG:ns NR:ns ## COG: TP0424 COG1390 # Protein_GI_number: 15639415 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Treponema pallidum # 19 195 4 171 232 59 25.0 5e-09 MENKIQELTDKIYREGVEKGNEEAQRLIANAQEEAKKIIEDARKEAESIVNSSRKSADEL AENTKSELKLFAGQAVNALKSEVATMVTDKLITASVKDFAQDKDYLNAFIVALASKWSVD EPIVISTADAESLKKYFAAHAKALLDKGVTIQQVNGIKTLFTVSPADGSYKVNFGEEEFM NYFKAFLRPQLVEMLF >gi|225935318|gb|ACGA01000074.1| GENE 2 1193 - 2035 857 280 aa, chain + ## HITS:1 COG:no KEGG:BT_1300 NR:ns ## KEGG: BT_1300 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 280 280 497 93.0 1e-139 MSSKYYYLVAGLPELSLEDSKLSYTVADFKTEIYDGLSASDQKLIDLFYLKFDNANVLKL LKDKEAEIDKRGNYSAEELSEYISILREGGEISPKEFPVYLSTFITDYLNTPAESTVLHE DHLAALYYEYAMNCGNKFVSAWFEFNLNINNILVAFTSRKFKWDIASNVVGNTEVCEALR TSSARDFGLSGEVDVFESLVKISEITELVEREKKLDALRWNWMEDAIFFDYFTIERIFAF LLKLEMIERWISLDKERGNQLFRSIIESLKNEVQIPAEFR >gi|225935318|gb|ACGA01000074.1| GENE 3 2053 - 3813 1843 586 aa, chain + ## HITS:1 COG:TP0426 KEGG:ns NR:ns ## COG: TP0426 COG1155 # Protein_GI_number: 15639417 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Treponema pallidum # 3 583 4 574 589 523 45.0 1e-148 MATKGTVSGVIANMVTLVVDGPVAQNEICYISTGGDKLMAEVIKVVGSHVYVQVFESTRG LKVGAEAEFTGHMLEVTLGPGMLSKNYDGLQNDLDKMDGVFLKRGQYTYPLDKERVWHFV PLANVGDKVQASAWLGQVDENFQPLKIMAPFTMKGTATVKTIMPEGDYKIEDTIAILTDE EGNDIPVTMIQRWPVKRAMTNYKEKPRPFKLLETGVRVIDTLNPIVEGGTGFIPGPFGTG KTVLQHAISKQAEADIVIIAACGERANEVVEIFTEFPELVDPHTGRKLMERTIIIANTSN MPVAAREASVYTAMTLAEYYRSMGLKVLLMADSTSRWAQALREMSNRMEELPGPDAFPMD ISAIISNFYGRAGYVKLSNDETGSITFIGTVSPAGGNLKEPVTENTKKVARCFYALEQDR 
ADKKRYPAVNPIDSYSKYIEYPEFEEYIKGHINDEWIGKVNELKTRLQRGKEIAEQINIL GDDGVPVEYHVIFWKSELIDFVILQQDAFDEIDAVTPMERQEDILNMIIDICHTEFDFDN FNEVMDYFKKMINVCKQMNYSKFKSEQYEGFQKQLKELIAERSVNS >gi|225935318|gb|ACGA01000074.1| GENE 4 3843 - 5162 1648 439 aa, chain + ## HITS:1 COG:TP0427 KEGG:ns NR:ns ## COG: TP0427 COG1156 # Protein_GI_number: 15639418 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Treponema pallidum # 8 437 3 428 430 397 47.0 1e-110 MATKAFQKIYTKITQITKATCSLKATGVGYDELATVNGKLAQVVKIAGDDVTLQVFEGTE GIPTNAEVVFLGKSPTLKVSEQLAGRFFNAFGDPIDGGPDIEGQEVEIGGPSVNPVRRKQ PSELIATGIAGIDLNNTLVSGQKIPFFADPDQPFNQVMANVALRAETDKIILGGMGMTND DYLYFKNVFSNAGALDRIVSFMNTTENPPVERLLIPDMALTAAEYFAVNNNEKVLVLLTD MTSYADALAIVSNRMDQIPSKDSMPGSLYSDLAKIYEKAVQFPSGGSITIIAVTTLSGGD ITHAVPDNTGYITEGQLFLRRDSDIGKVIVDPFRSLSRLKQLVTGKKTRKDHPQVMNAAV RLYADAANAKTKMENGFDLTNYDERTLAFAKDYSNQLLAIDVNLDTTEMLDVAWGLFGKY FRPEEVNIKKDLVDQYWQK >gi|225935318|gb|ACGA01000074.1| GENE 5 5279 - 5884 594 201 aa, chain + ## HITS:1 COG:TP0428 KEGG:ns NR:ns ## COG: TP0428 COG1394 # Protein_GI_number: 15639419 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Treponema pallidum # 1 179 1 180 206 88 31.0 9e-18 MAIKFQYNKTSLQQLEKQLKVRVRTLPIIKNKESALRMEVKRCKTEAADLEDRLEQQIQA YEAMFALWNEFDASLIKVNDVHLGVKKIAGVRVPLLENVEFEIRPYSMFNAPKWYADGIH LLEELAHTAIEREFMLAKLNLLEHARKKTTQKVNLFEKVQIPGYQDALRKIKRFMEDEEN LSKSSQKIMKSHQEKRKEVEA >gi|225935318|gb|ACGA01000074.1| GENE 6 5881 - 7698 1805 605 aa, chain + ## HITS:1 COG:BB0091 KEGG:ns NR:ns ## COG: BB0091 COG1269 # Protein_GI_number: 15594437 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Borrelia burgdorferi # 1 604 1 604 608 184 26.0 3e-46 MITKMKKLTFLVYHKEYEEFLNSLRELGVVHIVEKQQGAADNTELQENIRLSNRLAATLK LLQNQKHEKNAVIATEGGTAARGMQVLDEVDALQTEHGKLSQQLQSYAKEKEALEAWGNF EPDNVQKLKNAGYVIGFYSCSEGNYKEEWETEYNAMIVNRISSKVFFVTLTKGGQEVDLD 
VEQAKLPAYSLAHLETLYNTTEQAVEENEKKLVTLSETEIPSLKAALKELQSQIEFSKVV LSSEQTAGDKLMLIEGWAPAFSQVEIEAYLNDAHVYYEITDPMPGDNVPIRLNNKGFFAW FEPICKLYMLPKYNELDLTPFFAPFFMVFFGLCLGDSGYGVFLFLGATAYRLMAKKVTPS MKSIISLIQVLAASTFFCGLLTGTFFGANIYDLNWPIVQRLKHAVLMDNNDMFQLSLILG AIQILFGMVLKAVNQTIQFGFKYAVATIGWIILLVSMAVSALLPEVMPMGSTVHLVILGV SAAMIFLYNSPGKNVFLNIGLGLWDSYNMVTGLLGDVLSYVRLFALGLSGGILAGVFNSL AVGMSPDNVIAGPIVMVLIFVIGHAINIFMNVLGAMVHPMRLTFVEFFKNSGYEGGGKEY KPFRN >gi|225935318|gb|ACGA01000074.1| GENE 7 7773 - 8237 676 154 aa, chain + ## HITS:1 COG:SPy0149 KEGG:ns NR:ns ## COG: SPy0149 COG0636 # Protein_GI_number: 15674359 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Streptococcus pyogenes M1 GAS # 5 148 14 154 159 75 38.0 3e-14 MEMNLFIAYIGIAIMVGLSGIGSAYGVTIAGNAAIGALKKNDSAFGNFLVLTALPGTQGL YGFAGYFMFQTIFGILTPEITPIQASAVLGAGIALGLVALFSAIRQGQVCANGIAAIGQG HNVFSNTLILAVFPELYAIVALAATFLIGSALVA >gi|225935318|gb|ACGA01000074.1| GENE 8 8440 - 10101 1446 553 aa, chain + ## HITS:1 COG:YLR258w KEGG:ns NR:ns ## COG: YLR258w COG0438 # Protein_GI_number: 6323287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Saccharomyces cerevisiae # 11 547 10 617 705 214 28.0 5e-55 MVKDLLTPDYIFESSWEVCNKVGGIYTVLSTRANTLQEKFRDRIFFIGPDVWQGKENPLF IESDNLCAAWKEHALEKDELSVRVGRWNIPGEPIVILVDFQPFFEKKDDIYTEMWNRYQV DSLHAYGDYDEASMFSYAAGRVVESFYRYNLTETDKVVYQAHEWMTGMGALYVQEAVPEV ATIFTTHATSIGRSIAGNHKPLYDYLFAYNGDQMAQELNMQSKHSIEKQTAHYVDCFTTV SEITNNECKELLDKPADVVLMNGFEDDFVPKGSTFTGKRKRARTLMLSVANKLLGTNLGD DTLIVGTSGRYEFKNKGIDVFLESLNRLNRDKKLHKDVLAFINVPGWVGEPREDLQARLK SKDKFDTPLEVPFITHWLHNMTHDQVLDMLKYLGMGNRPEDKVKVIFVPCYLDGRDGIMN KEYYDILLGQDLSVYASYYEPWGYTPLESVAFHVPTITTDLAGFGLWVNSLKNQHGINDG VEVLHRSDYNYSEVADGIKDTITLFADKTDKEVKEIRKRAAEVAEQALWKHFIQYYYEAY DIALRKARIRQLS >gi|225935318|gb|ACGA01000074.1| GENE 9 10139 - 12703 2886 854 aa, chain + ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: PH1512 COG0058 # 
Protein_GI_number: 14591294 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 21 746 17 734 837 615 44.0 1e-176 MKIKVSNVNTPNWKEVTVKSRIPEELEKLSEIARNIWWAWNFEATELFRDLDPELWKECG QNPVLLLERMSYEKLEALAKDKVILRRMNEVYTKFRDYMDVKPDEQRPSIAYFSMEYGLS SVLKIYSGGLGVLAGDYLKEASDSNVDLCAVGFLYRYGYFTQTLSMDGQQIANYEAQNFG QLPIERVMDANGQPLIVDVPYLDYFVHANVWRVNVGRISLYLLDTDNEMNSEFDRPITHQ LYGGDWENRLKQEILLGIGGILTLKALGIKKDVYHCNEGHAALINVQRICDYVATGLTFD QAIELVRASSLYTVHTPVPAGHDYFDEGLFGKYMGGYPSRMGISWDDLMDLGRNNPGDKG ERFCMSVFACNTSQEVNGVSWLHGKVSQEMFSSIWKGYFPEESHVGYVTNGVHFPTWSAT EWKELYFKYFNENFWFDQSNPKIWEAIYNVPDEEIWKTRMTMKNKLVDYIRKSFRDTWLK NQGDPSRIVSLMDKINPNALLIGFGRRFATYKRAHLLFTDLDRLSKIVNNPDYPVQFLFT GKAHPHDGAGQGLIKRIIEISRRPEFLGKIIFLENYDMQLARRLVSGVDIWLNTPTRPLE ASGTSGEKALMNGVVNFSVLDGWWLEGYREGAGWALTEKRTYQNQEHQDQLDAATIYSIL ETEILPLYYARNKKGYSEGWIKVVKNSIAQIAPHYTMKRQLDDYYSKFYCKLAKRFQTLA ANDNAKAKEIAAWKEDVVAKWDSIEIVSCDKVEELKNGDIESGKEYTITYVIDEKGLNDA VGLELVTTYTTADGKQHVYSVEPFSVVKKEGDLYTFQVKHSLSNAGSFKVSYRMFPKNPE LPHRQDFCYVRWFI >gi|225935318|gb|ACGA01000074.1| GENE 10 12914 - 14017 1266 367 aa, chain - ## HITS:1 COG:SP1000 KEGG:ns NR:ns ## COG: SP1000 COG0526 # Protein_GI_number: 15900873 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pneumoniae TIGR4 # 225 348 41 164 185 75 37.0 2e-13 MKKFTYLVIATAALSMVACTGGNKAGYTITGTVEGASDGDTVYLQEANGRNLTKLDTAVI TKGTFTFEGTQDSVVSRYVTCEVNGEPLMIDFFLENGKINVALTKDNDAVTGTPNNDAYQ EIRAQINDISKKMNAIYEAMGNTSLSDEQKEAKQKEGAQLEEQYDKAIKEGVQKNITNPV GVFLFKQTFYNNSTDENEALLQQIPANFQNDETIVRIKEMTDKQKKTAVGTQFVDFEMQT PEGKTVKLSDYVGKGKVVLVDFWASWCGPCRREMPNLVETYAKYKGKNFEIVGVSLDQDG AAWKEAIKKLDMTWPQMSDLKFWQSEGAQLYAVNSIPHTVLIDGSGKIIARGLHGEELQA KIAEAVK >gi|225935318|gb|ACGA01000074.1| GENE 11 14162 - 15547 1273 461 aa, chain + ## HITS:1 COG:BB0642 KEGG:ns NR:ns ## COG: BB0642 COG3842 # Protein_GI_number: 
15594987 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Borrelia burgdorferi # 4 348 2 347 347 399 56.0 1e-111 MQEDKSIIEVSHVSKYFGDKTALDDVTLNVKKGEFVTILGPSGCGKTTLLRLIAGFQTAS EGEIRISGMEITQTPPHKRPVNTVFQKYALFPHLNVYDNIAFGLKLKKTPKQTIEKKVKA ALKMVGMTDYEYRDVDSLSGGQQQRVAIARAIVNEPEVLLLDEPLAALDLKMRKDMQMEL KEMHKSLGITFVYVTHDQEEALTLSDTIVVMSEGKIQQIGTPIDIYNEPINAFVADFIGE SNILNGTMIHDKLVRFCGTEFECVDEGFGENVPVDVVIRPEDLYIFPVSEMAQLVGVVET SIFKGVHYEMTVLCGGYEFLVQDYHHFEVGAEVGLLVKPFDIHIMKKERVCNTFEGKLLD ATHVEFLGCNFECVPVEGIAFDTNVKVEVDFEKVILQDNEEDGTLTGEVKFILYKGDHYH LTVLSDWDENVFVDTNDVWDDGDRVGITIPPDAIRIVKITD >gi|225935318|gb|ACGA01000074.1| GENE 12 15559 - 16359 621 266 aa, chain + ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 18 266 12 277 277 167 38.0 3e-41 MNKRFIVFLSSRKSWTLPYIIFSAIFVIIPLFLIVVYAFTDDSGHLTLANFQKFFEHPEA INTFVYSIGIAIITTLVCILLGYPAAWILSNSKLNRSKTMVVLFILPMWVNILVRTLATV ALFDFFSVPLGEGALIFGMVYNFIPFMIYPIYNTLQKMDHSYIEAAQDLGANPVQVFLKA VLPLSMPGVMSGIMMVFMPTISTFAIAELLTMNNIKLFGTTIQENINNSMWNYGAALSLI MLLLIAATSLFSTDDKENSNEGGGLW >gi|225935318|gb|ACGA01000074.1| GENE 13 16353 - 17147 723 264 aa, chain + ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 1 257 1 252 260 180 42.0 2e-45 MVKKIFAQTYLWILLLLLYSPIVIIVIYSFTEAKVLGNWTGFSTKLYSSLFTTGAHHSLM NALINTITIALLAATASTLLGSIAAIGIFNLKARSRKAIGFVNSIPILNGDIITGISLFL LFVSLGITQGYTTVVLAHITFCTPYVVLSVLPRLKQMNPNIYEAALDLGATPMQALWKVI IPEIRPGMISGFMLALTLSIDDFAVTVFTIGNQGLETLSTYIYADARKGGLTPELRPLSA IIFVVVLALLVVINYRAGKTKKKD >gi|225935318|gb|ACGA01000074.1| GENE 
14 17158 - 18510 1319 450 aa, chain + ## HITS:1 COG:lin0800 KEGG:ns NR:ns ## COG: lin0800 COG0687 # Protein_GI_number: 16799874 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Listeria innocua # 11 353 13 325 357 184 32.0 3e-46 MCYYVNNMKRIIPAILLLLALTGCYNSGEPRERVLKIYNWADYIGEGVLEDFQAYYKEQT GENIRIVYQTFDINEIMLTKIEKGHEDFDVVCPSEYIIERMLKKHLLLPIDTNFAHSPNY MNNVAPFIRKQINKLSQPGEKASRYAVCYMWGTAGILYNRAYVPDSVALSWDCLWNKKYA GKILMKDSYRDAYGTAVIYAHAKELKEGTVTVEELMNDYSPRAMELAEKYLKALKPNIAG WEADFGKEMMTKNKAWLNMTWSGDAIWAIEEANAVGVDLDYEVPEEGSNIWYDGWVIPKY ARNPEAASYFINFMCRPDIALRNMDFCGYVSSIATPEILEEKIDTTLHYYSDLSYFFGPG ADSVQIDKIQYPDRKVVERCAMIRDFGDKTKEVLDIWSRIKGDNLGVGITILIFVVIALM SGWMIYKKWQRYNRQKQQRRRSRRKKVRRN >gi|225935318|gb|ACGA01000074.1| GENE 15 18632 - 19414 517 260 aa, chain - ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 41 260 25 251 251 140 36.0 2e-33 MMGITSKKSMPAHSAHTLFLYLCEITNDMKMTTTENIRKELQTLADSKYQEFHSSLLPGA NNILGVRIPQLRTMAKEIIKKEDWRTFVESTDTIYYEETMLQGMIIGLAKMELEEQMKYV TMFIPRIDNWAVCDIFCSELKTSVKKGKENVWQFIQPYLKSSKEFEIRFGIVMLFHYVDD AHIDSLLKYADSFNHDAYYARMAMAWMISLCFIKFPQKTMEYLKHSTLDNWTYNKALQKT IESFRVDKDTKDILREMKRR >gi|225935318|gb|ACGA01000074.1| GENE 16 19449 - 20108 627 219 aa, chain - ## HITS:1 COG:no KEGG:BT_1287 NR:ns ## KEGG: BT_1287 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 219 1 212 212 199 54.0 6e-50 MKIITEIKKACCFNRFAFLVMSLGLIATFTACSSDNDSDIPVYSLKDVEGNYSGNMLTET APSVNPQNYSFKEEQPQGATVTAEVKDNQIMIKKLPVDDLIKSIVGEEIGEIIIETLGDI NYNIPYTAAFNDDSKGSILLQLKPEPLEIKYTIPTQVQTEGEEAPQITVKVTIEAEEKGQ YTYKDKKLTFVIKATQVEVEGEPLENFSVTTFSFDMVKK >gi|225935318|gb|ACGA01000074.1| GENE 17 20283 - 21968 777 561 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # 
Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 458 560 425 526 526 70 35.0 1e-11 MITSLFVSIIDANIFHQSHFFLLLTAGGISLYWLLSHFLTLHKEIKILKEEHQYFMDSFQ SIRNPITLVHTPLRAACDDSCPEDIKKMLSLVIRNIDCLDEHLTKLMNLRHLLIYSKQMD IAEYELGNFINNRVHSLKNFATDKRIKLEIKTEFNYASVWFDQSKISPIIDKFIKNAIEH SKPEDKRIIFSISSNSEHWELKTFDANKGKLLTCYRRQKYHLIKPKHKSEFKYAFAKSVL CKKLMKLCDGNILINHSTHTVSLRFPTTNSERKVSGYNIIHSAKKSEERKTDTSLGKIAH KKSSIKPTVVLADSNEEFKSYLEECLSKDFDVKSFGNGSEALECIKEKYPDLVICDLMLH GMYGYELSSRLKTSGETSVIPIILYGSRIDIGQRNKRESSLADIFLYVPFHIEDLKIEMN VLIKNNRSLRRSFLQTVFGKQFLEKEEEKVLDDSNYTFINQVKEFILKNIDKENLTIDEI ASELYMSRTAFFNKWKALTGEAPKYFIYRIRMEKARELLESGKYSVQVIPEMIGLKNLKN FRHKYKEYFGITPSKSITKKL >gi|225935318|gb|ACGA01000074.1| GENE 18 22666 - 24225 1332 519 aa, chain - ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 429 5 416 508 122 27.0 3e-26 MDMKRNINFKNVLCMAAITLVTGMLGGCEDNSIATPVGKLPNETALDNNFGMLKCDMTTS DLEITMTNTKTPDFAFFYQTVKPVAQDTRITMKIDLDLVDKYNEEHGTEYRKMSTISIEG ISNDGAMTIKAGDTMSDTIMVRLNPLGTLHDYSLLPITVELPEESGIRPSRDSNEIYYKI YYKRSTYPWLSPMKADMKKYKMVGFIDGEKVNASIAKEFIMVLYDNETFGPGHYELWETS WYAYLYDIINVNAATLGYANGKVTLNQSANFKKQLSNMSSLSSYGIKRCVTLKAEGTGLG FCNLTDEQITAFVAEVKKLANYRVDGINLMDEGAEYGINGHPAANATSYPKLIKALRESL GKTKLITVTITGDPAGSLASEAAGIRAGEYIDYAWTWVNTKIMNPWEDSSIEKPIAGLDK SQYGGFSTDEYTFDYENDESYPETLREKLYDKGLGKLYIMRNIPFWSDALEATPYHANIN AGAAAFYRVEGGHKYEPDINYLHNGKYKDMKGDKNYYEM >gi|225935318|gb|ACGA01000074.1| GENE 19 24247 - 25482 1253 411 aa, chain - ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 438 440 344 42.0 5e-93 
MKRYKIIISLAAIICGLMIACDNTDYSNKAPFDNMVYIKEASNSNSERVTFRNTLNEQKR AFSAKLTYPAGKDIVVKLKADPGLINEFNARHGTDYGMLDSKYYSLSNSELIIKAEKSES LIDTIYFNNLLDLEIDKTYLVPLTISETSGGVHLLNGSKTLFYLVRRSSAITVAANLKDN YLEVPTFLDEEKNACLKELGQITMEALVYVEDFTYSGPSNTAGASDISTIMGVEQHFLMR IGDTSFPRQQLQMQGPDGVKFPAADRAKSLNAMTWYHIALVYNAKEHFIAYYVNGQLQSQ DISYGKGATVDICGTPDCEFQIGRSYEDELRQLNGNIAEIRIWNTCRTKEEIWTNMYKVE DPENEESLLAYWKFNEGEGNIVKDHSKHGFDAVSAEPLVWPTGIEIPQINK >gi|225935318|gb|ACGA01000074.1| GENE 20 25497 - 26531 904 344 aa, chain - ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 343 9 326 327 137 30.0 6e-31 MKQNSFISVILTVACSLMFVACSDWTEPTPVDYNPIFPDEQDHSLYEKYEQSLKEYKNRK HTLMLIRFANNPGTTLGEQDFLRSLPDSVDFVVLTNADNLSVYDKEDISKLKRTGTKFLY YKNLSEFYDQCEETENLEEFKTKMNAALKNGESDEFDGYYVQFGRAGSMMHVDFFNEVTT AIREKLAAIGGPKANNGKLVFFDGIVTFTVDYSGRYEEPFIDLVDYFVDGQIVLLESSLE IKLNMSQPVGFYGKMPKEKVITTATPPTDDGKTGIISREDLDVIQTSEGMAYIINHFITG LGNNGPLAGVSIYNANEDYLATDGVTNYRRIRDAIQKLNPSPIN >gi|225935318|gb|ACGA01000074.1| GENE 21 26556 - 28151 1487 531 aa, chain - ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 529 16 530 531 610 58.0 1e-173 MNTQMKEYRKATLRLAYLLTLMLWTGSCIDMDINRNPYETTQDELMRENHIIGSSLKSME ALVVPTQEHLYQFVEAMCGGAYGRYFGETRVGWTEKYSTYNPKSDWLKASFSDPISEMYP SYRDIINRTNDPVALAFAKILRVAIMHRSTDMFGPIPYTKVLGDKTEGDGLSAPYDSQEE VYVAMFKELEEADEALKENLGLSAEGFKKLDNLYYGDVRKWYKYLHSLQLRMAMRIVYVK PELAREIAEKAVAAGVIENNEDNAQLHVEENRSALCFNDWKDYRIAAEIVSYMQGYNDPR LEKYFTQGKYQDDTDYYGLRIGILPSKVTDDELIQTYSNRLMTANDTYMWMTAAEVTFLR AEGALRGWAMSGDAQQLYEKAITLSFEQWGASGASGYSQNKALVPGAYKDPRGTYSAQSP SSITIAWNEDKENTRFEENLERIITQKWIAMFPLGIEAWCEHRRTGYPKFLPIMDNKGVG ITNLTLGIRRLSYPAEEYQLNAENMLGALRKLNGEDNGATRLWWDCNPNVK >gi|225935318|gb|ACGA01000074.1| GENE 22 28156 - 31494 3222 1112 aa, chain - ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1112 1 1110 1110 1695 75.0 0 MLQIYKFIVDKKTKGRSFMLFFCLMLATTVSLAQNGAKITIKKNSISVIEALKEVEKQSG MSVGYNDSQLKNKPAITLNIEAASLENALAQILKGTGFTYQLKEKYIMIIPEQKKESNPT KKITGKVVDENNDPLIGVNIKVEGTAAGSITDIDGNFLIEASTGNTLTFTYIGYTSGSIK VTNQNNYRIQLTADTQQLSEVVVTALGIKREQKALSYNVQQIGADKMPEIKDANFINSLS GKAAGVVINASSSGVGGASKVVMRGTKSIEQSSNALYVIDGIPMYNFGGGGGTEFDSKGA TEAIADINPDDIESISFLTGAAAAAMYGSNAANGAIVVTTKRGQIGKLSATFSSSMEFMN PFVLPKFQNRYGTGTRGDAEGSPILSWGPKLNPANQTGYEPTDFFDTGAAYSNSITLSTG TDKNQTFFSAAAVNSEGMIPNNRYNRYNFTFRNTTHFLNDKMKLDVGASYIIQNDRNMTN QGVYSNPLVSAYLFPRGDDFSLVKAFERYDEARKINVAYWPQGEGDLRMQNPYWIAYRNV RTNEKKRYMLNAGLTYDILDWLNVSARIRVDNTDNTYQQKLYASTISTLTEGSTQGHYTI QKVNETQTYADFMVNINKRIDDFTVVVNAGTSLSDNSSDLLGYGGPIRDTGVPNVFNVFD LDNAKKRATQEGWEEMTQSIFASAEIGWKSMLYLTLTGRNDWASQLKGSNPTSFFYPSVG LSAVISEMVTLPKAIDYLKVRGSFSSVGTPYPRFLVQPTYSYDPTKQDWNAKTHYPIGKL KPERTDSWEIGVDGTFFKDFKVSGSFYYANTYNQTFDPKITVSSGYSTLYVQTGYVRNLG VEGLLSYGHTWRDFGWNSNFTFSWNKNKIVELVKDYVHPETGEIVNKDRLELKGLGYTKF ILKEGGTLGDLYTNADFIRDDKGYIQIDKNGDVAKTDNLPDIKLGSVFPKANLAWNNSFS YKGIYAGFQLSARLGGIVYSATQAALDQYGVSEASAAARDRGGVLVNGRSWVNAQQYYEI VATSSGLPQYYTYSATNLRLQEAHIGYTIPCKWLGNICDINVSVVGRNLWMIYCKAPFDP EAIATTSNFYQGIDYFMMPSTRNFGFNVKINF >gi|225935318|gb|ACGA01000074.1| GENE 23 31639 - 32625 704 328 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 88 274 98 272 323 90 33.0 3e-18 MRNFFQKILILFTGNNYPEATQNEFYQWLVDEDHASQKEDALRELWNKAQRQKNVKGMQK SYERLKKQEGIPTVPKERRIRPIHIWQSAAAILFLLLASSVYLSTVGAKAETDLLQQYIP IAEMRSLTLPDGTKVQLNSKSTLLYPQEFTGDSRSVFLLGEANFKVKPDKKRPFIVKSND LQITALGTEFNVSAYPESQEIATTLISGSIRVDYNNLTTSVILHPNEQFAYNRNNKKYDL SNPDMDDVTAWQRGELVLQKMTLTDIINVLERKYPYTFVYSIKNLKNDRFSFRFKDNAPL EEVMEIIVNVVGQMDYRIVEDKCYLTRI 
>gi|225935318|gb|ACGA01000074.1| GENE 24 32682 - 33233 482 183 aa, chain - ## HITS:1 COG:no KEGG:BT_1278 NR:ns ## KEGG: BT_1278 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 172 1 172 183 201 65.0 8e-51 MSQNQDPIHLARQFEIIFTTHYSAVKYFAINLLKSEEDAEDIIQDVFAKLWTQPEVWLEN KDIDQYIYAMTKNTTFNFIKRKRLEQFYQEQLSQKYLIEDLLKSEDTLDPIYYKEALLII RLILDRLPERRRKVFEMSRINNMSNMEIAQALNISVRTVEHQIYLTLLEIKKTIFIAFFL CFF >gi|225935318|gb|ACGA01000074.1| GENE 25 33476 - 34786 1145 436 aa, chain - ## HITS:1 COG:HI0610 KEGG:ns NR:ns ## COG: HI0610 COG0738 # Protein_GI_number: 16272552 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Haemophilus influenzae # 16 433 10 425 428 394 51.0 1e-109 MKHTKQSIISKDGISYLIPFILITSCFALWGFANDITNPMVKAFSKIFRMSATDGALVQV AFYGGYFAMAFPAAIFIRKYSYKAGVLLGLGMYAFGAFLFFPAKMTGEYYPFLIAYFILT CGLSFLETSCNPYILSMGTEETATRRLNLAQSFNPMGSLLGMYVAMQFIQAKLHPMCTED RALLNDSEFQAIKESDLSVLIAPYLIIGLIIVAMLLLIRFVKMPKNGDQNHRIDFFPTLK RIFTQTRYREGVIAQFFYVGAQIMCWTFIIQYGTRLFMSQGMDEKSAEVLSQQYNIIAMV IFCISRFICTFILRYLNAGKLLMILAIFGGIFTLGVIFLQNIFGMYCLVAVSACMSLMFP TIYGIALKGMGDDAKFGAAGLIMAILGGSVLPPLQASIIDLEQIAWLPAVNVSFILPFIC FLVIIGYGYRTVKRNW >gi|225935318|gb|ACGA01000074.1| GENE 26 34806 - 35198 379 130 aa, chain - ## HITS:1 COG:no KEGG:BT_1276 NR:ns ## KEGG: BT_1276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 231 90.0 7e-60 METSKTGYQVQSYDVPVKRYCQTLDLRDSPELIAEYRKRHSQAEAWPEILAGIREVGILE MEIYILGTRLFMIVETPLDFDWDKAMERLNTLPRQQEWEEYMAIFQQAAPGMSSAEKWKP MERMFHLYNT >gi|225935318|gb|ACGA01000074.1| GENE 27 35205 - 36635 1099 476 aa, chain - ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 10 472 5 461 467 347 39.0 4e-95 
MKDTIKHTYLAADFGGGSGRIIAGFLLNGRLELEEIYRFSNRQVKLGNHIYWDFPALFED MKTGLKLAAQKGYAVKGIGIDTWGVDFGLIDKHGNLLGNPVCYRDARTEGMPAEVFKCMD EHRHYAETGIQVMPINTLFQLYSMKQNQDAQLEVARQLLFMPDLFSYFLTGVANNEYCIA STSELLNAKSRNWSFDTIHSLGLPEHLFGKIILPGTIRGTLKEDIARETGLGAVDVIAVG SHDTASAVAAVPAAESPIAFLSSGTWSLLGVEVDEPILTEEARKAQFTNEGGVDGKIRFL QNITGLWILQRLMSEWKACGEEQNYDIIIPQAAEAEIDTIIPVDDAIFMNPENMENTLIH YCRNHALQIPQNKAETVRCVLQSLAFRYRLAVEQLNRCLPAPIRQLNIIGGGSQNKLLNQ LTADELGIPVYAGPVEATAMGNILTQAMAKGEIANLRELREIVTRSVTPQVYYPKK >gi|225935318|gb|ACGA01000074.1| GENE 28 36653 - 37291 438 212 aa, chain - ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 27 203 27 206 210 101 34.0 1e-21 MITNEHIEQFIAQAHRYGDAKLMLCSSGNLSWRIGEEALISGTGSWVPTLGKEKVSICHI ASGIPANGVKPSMESTFHLGILRERPDVNVVLHFQSEYATAVSCMKNKPANFNVTAEIPC HVGSEIPVIPYYRPGSPELAKAVVEAMQKHNSVLLTNHGQVVCGKDFDQVYERATFFEMA CRIIVQSGGDYSVLTPAEIEDLEIYVLGKKTN >gi|225935318|gb|ACGA01000074.1| GENE 29 37422 - 39197 1844 591 aa, chain - ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 591 1 588 588 837 67.0 0 MKKYPKIGIRPTIDGRQGGVRESLEEKTMNLAKAVAELISSNLKNGDGSPVECVIADSTI GRVAESAACAEKFEREGVGSTITVTSCWCYGAETMDMNPHYPKAVWGFNGTERPGAVYLA AVLAGHAQKGLPAFGIYGRDVQDLDDNTIPEDVSEKILRFARAAQAVATMRGKSYLSMGS VSMGIAGSIVNPDFFQEYLGMRNESIDLTEIIRRMDEGIYDHEEYAKAMAWTEKHCKTNE GEDFKNRPEKRKTREQKDADWEFIVKMTIIMRDLMTGNPKLKEMGFKEEALGHNAIAAGF QGQRQWTDFYPNGDYPEALLNTSFDWNGIREAFVVATENDACNGVAMLFGHLLTNRAQIF SDVRTYWSPEAVKRVTGKELSGAAANGIIHLINSGATTLDGSGQSLDAKDNPVMKEPWNL TDADVENCLQATTWYPADRDYFRGGGFSSNFLSKGGMPVTMMRLNLIKGLGPVLQIAEGW TVEIDPEIHQKLNMRTDPTWPTTWFVPRLCDKPAFKDVYSVMNNWGANHGAISYGHIGQD 
LITLASMLRIPVCMHNVEEDQMFRPAAWNAFGMDKEGSDYRACATYGPIYK >gi|225935318|gb|ACGA01000074.1| GENE 30 39265 - 40260 689 331 aa, chain - ## HITS:1 COG:L0146 KEGG:ns NR:ns ## COG: L0146 COG1609 # Protein_GI_number: 15673482 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 19 313 8 318 345 60 25.0 6e-09 MKRTDKKATFGQQSSKVTQLADALSQAISRKEFLEGDSLPSINQLSAQYGVSRDTVFKAF LDLRERGLIDSTPGKGYYVTSQVTNVLLLLDQYTPFKEALYNSFVRHLPINYKVDLLFHQ YNERLFNTILRESIGKYNKYIVMNFDNEKFSTVLNKINPTKLLLLDFGKFEKEKYSYICQ DFDESFYQALHLLRERLKNYHQLVFLFPKSLKHPQSSKEYFTRFCQEQGFLCEVQEDIEN LTIRKGVAYIAIKQQDVVKVVKQGRLEGLKCGKDFGLLAYNDIPSYEVIDEGITSLSIDW EMMGNEAANFVLNNVPVQKFLPTEVRLRKSL >gi|225935318|gb|ACGA01000074.1| GENE 31 40438 - 40992 909 184 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160885844|ref|ZP_02066847.1| hypothetical protein BACOVA_03848 [Bacteroides ovatus ATCC 8483] # 1 184 1 184 184 354 99 5e-97 MATRIRLQRHGRKSYAFYSIVIADSRAPRDGKFIEKIGTYNPNTDPATVDLNFDAALAWV LKGAQPSDTVRNILSREGVYMKKHLLGGVAKGAFGEAEAEAKFEAWKNNKQSGLATLKAK QDEAKKAEAKARLEAEKKINEVKAKALAEKKAAEAAEKAAAEAPAEEAAAAPAEEAPATE AAAE >gi|225935318|gb|ACGA01000074.1| GENE 32 41099 - 42400 998 433 aa, chain - ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 4 430 21 448 459 335 45.0 1e-91 MTEEKIHKDRPGNWWALSPLLVFLCLYLVTSILVNDFYKVPITVAFLVSSCYAIAITRGL KLDQRIYQFSVGAANKNILLMVWIFILAGAFAHSAKQMGAIDATVNLTLQILPDNLLLAG IFIAACFISLSIGTSVGTIVALTPVAVGLAEKTEIALPFMVAVVVGGSFFGDNLSFISDT TIASTKTQECVMRDKFRVNSMIVVPAAIIVLGIYIFQGLSITAPTQTQTIEWIKVIPYII VLGTAIAGMNVMLVLIIGILTSGIIGIATGSFGIFDWFGAMGTGITGMGELIIITLLAGG MLETIRYNGGIDFIIRKLTRHVNGKRGAELSIAALVSIANLCTANNTIAIITTGPIAKDI AVKFHLDRRKTASILDTFSCLIQGIIPYGAQMLIAAGLASISPISIIGNLYYPFTMGACA LLAILFRYPKRYS >gi|225935318|gb|ACGA01000074.1| GENE 33 42621 - 44252 1025 543 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # 
Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 277 526 50 307 328 179 38.0 1e-44 MIKERENYNAFLLHFLERLNSCDSIEKKITESLTDICEYYGFKRGFIYQTDGFRYFYLKE TVGNEDNILRQRFEISEMSSQHVAHANGKNKPFYACRTQKASWDDVDIIDFYRVNSLLIR QIQDSEGKIIGFIGFADKEHVISFTDEELQVIHLILGSLSKEIAVREYKEREVRASKTLS SIMNNMGVDIYVNSFDSHDMLYANESMAAPYGGIEHFEGKKCWQALYKDKTGECEFCPKK HLIDENGQPTKVYSWDYQRPFDKCWFRVFSAAFAWIDGQIAHVITSVDINHQKTIEEELR VAKEKAENLDRLKSAFLANMSHEIRTPLNAIVGFASLLVESDNKKERQDYVDIMQENTEL LLQLISDILDLSKIEAGTLDFTMDHLGIKSFCEDIIRNYDIKEDKPVPVLLAPDLPEYYI YTDKKRLMQVITNFINNALKFTNEGQIVLEYHYKAESNEIEFAVTDTGMGIAPDKVNQVF DRFVKLNSFSKGTGLGLSICRSIIEHLGGTIGVESEMGNGSRFWFTHPLRLNQREKTKEI ISK >gi|225935318|gb|ACGA01000074.1| GENE 34 44750 - 45277 570 175 aa, chain - ## HITS:1 COG:no KEGG:BT_1263 NR:ns ## KEGG: BT_1263 # Name: not_defined # Def: putative protease I # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 322 92.0 4e-87 MAKKVAVLAVNPVNGCGLFQYLEAFFENGISYKVFAVSNTKEIKTNSGISLTVDDVIANL KGHEDEFEALVFSCGDAVPVFQQNANKPFNVDLMQVIKTFGDKGKMIIGHCAGAMMFDFT GITKGKKVAVHPLAKPAIQNGTATDEKSEIDGNFFTAQDENTIWTMMPQVIEALK >gi|225935318|gb|ACGA01000074.1| GENE 35 45444 - 45791 381 115 aa, chain + ## HITS:1 COG:AGc3635 KEGG:ns NR:ns ## COG: AGc3635 COG1733 # Protein_GI_number: 15889290 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 112 36 137 147 109 51.0 1e-24 MKNFHPTGNCPIRDVLSRLGDKWSMLVLITLNANGTMRFSDIHKTIEDVSQRMLTVTLRT LESDGLVERKVYAEVPPRVEYCLTDTGGTLIPHIEGLVGWALENMETILKHRNNN >gi|225935318|gb|ACGA01000074.1| GENE 36 46060 - 48123 1358 687 aa, chain + ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 188 438 216 457 642 96 30.0 2e-19 
MKNRFFIVALIVLFFCGCSDNDDTVVNNSIDELGNLENGLVFLPRYEDGKASIEYRAEDG EFYITANFKTLSARLAERIADEWNQATGDKNVTADYTEVQTRTTSVSSLEVTKVVSSNET EGLITIEMKLPLEALLKPFFFSVSAKAGNTLLMSAYVPVYKDDLDKQLAYVPDPRPIPAI KDGLYYTWGDEFNIDGSVNEELWTFEEGFKRGNEPQNYVKGIDNAIVVDGRLLITAKKER RKNPNYDPTSGDYRMNWEYGEYTSASMNGNQKRFFLFGRTEVRAKIDPSSGSFPAIWTCG NNKEWPLNGEIDIMEFYIHNGEQSLTSNFAAGKKQPWEAIWNSRWTPLSYYELKDPDWIK KYHVYRMDWDEKEIKLYVDDELRNSIKVEEFLNEDGSIVFHNPQYMWLNLALKDQGRGID LSETKNIQFEVDYFRLYQKVIDHQKPSKVTNFTAKSKDGLVNLSWKPSIDTGGAGLLRYD LYRGGIGSGYFIGSTTTTSFTEVEVAPGAEMTYYIQALDGVGNYSDVNEVTVRPEGGGGL HEGNLVPDGTMESLPIELDQSLPGFDGSWGNASWGSLRVLTGGYWGKYIQITGGNGELQF PVTWTPNTTYVLRAKIKTTGSGFEFGLGKVKVGDSSLALSIPNTRGEWIQFEQEFMAETA STGNCSLHNWSSNTVVQMDNWELYEKE Prediction of potential genes in microbial genomes Time: Fri May 13 11:34:42 2011 Seq name: gi|225935317|gb|ACGA01000075.1| Bacteroides sp. D2 cont1.75, whole genome shotgun sequence Length of sequence - 126722 bp Number of predicted genes - 90, with homology - 86 Number of transcription units - 37, operones - 20 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 6.5 1 1 Tu 1 . + CDS 81 - 4028 2156 ## COG0642 Signal transduction histidine kinase + Term 4073 - 4128 -0.2 + Prom 4130 - 4189 6.8 2 2 Op 1 . + CDS 4358 - 7480 2808 ## BT_2818 hypothetical protein 3 2 Op 2 . + CDS 7492 - 9336 1357 ## BT_2819 hypothetical protein 4 2 Op 3 . + CDS 9369 - 12134 2203 ## BT_2820 hypothetical protein 5 2 Op 4 . + CDS 12159 - 13829 1448 ## PRU_2229 putative lipoprotein 6 2 Op 5 . + CDS 13858 - 15042 726 ## COG2273 Beta-glucanase/Beta-glucan synthetase + Term 15068 - 15110 3.5 + Prom 15045 - 15104 9.6 7 3 Op 1 . + CDS 15143 - 17740 1621 ## COG3250 Beta-galactosidase/beta-glucuronidase 8 3 Op 2 . + CDS 17779 - 20076 1105 ## BCAH187_A3211 alpha-fucosidase + Term 20199 - 20237 0.6 9 4 Tu 1 . - CDS 20030 - 20311 120 ## - Prom 20331 - 20390 3.9 + Prom 20229 - 20288 6.0 10 5 Op 1 . 
+ CDS 20341 - 20784 418 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 11 5 Op 2 . + CDS 20791 - 21474 473 ## Psta_4074 RNA polymerase, sigma-24 subunit, ECF subfamily 12 5 Op 3 . + CDS 21536 - 22057 517 ## Slin_0877 hypothetical protein + Term 22062 - 22108 4.1 + Prom 22134 - 22193 7.1 13 6 Tu 1 . + CDS 22299 - 22592 166 ## + Term 22795 - 22829 -0.8 - Term 22280 - 22321 10.6 14 7 Op 1 . - CDS 22472 - 23824 1163 ## COG4452 Inner membrane protein involved in colicin E2 resistance 15 7 Op 2 . - CDS 23851 - 24147 245 ## COG1846 Transcriptional regulators 16 7 Op 3 . - CDS 24148 - 24747 456 ## BT_1252 hypothetical protein - Prom 24983 - 25042 5.3 + Prom 24791 - 24850 6.6 17 8 Op 1 . + CDS 25000 - 25440 508 ## BT_1251 MarR family transcriptional regulator 18 8 Op 2 4/0.000 + CDS 25421 - 26740 883 ## COG1538 Outer membrane protein 19 8 Op 3 . + CDS 26755 - 27825 1082 ## COG1566 Multidrug resistance efflux pump 20 8 Op 4 . + CDS 27831 - 29426 1040 ## BT_1248 putative transport protein + Term 29437 - 29491 8.5 - Term 29264 - 29303 1.0 21 9 Op 1 . - CDS 29491 - 30981 1502 ## COG2721 Altronate dehydratase 22 9 Op 2 . - CDS 31012 - 32076 977 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 32221 - 32280 7.8 + Prom 32113 - 32172 6.7 23 10 Op 1 8/0.000 + CDS 32330 - 33355 1420 ## COG0524 Sugar kinases, ribokinase family 24 10 Op 2 . + CDS 33417 - 34085 934 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase + Term 34119 - 34150 0.1 - Term 34101 - 34144 -0.6 25 11 Tu 1 . - CDS 34264 - 34554 68 ## - Prom 34574 - 34633 1.6 26 12 Op 1 4/0.000 - CDS 34671 - 35009 336 ## COG4744 Uncharacterized conserved protein 27 12 Op 2 . - CDS 34996 - 35604 616 ## COG0811 Biopolymer transport proteins 28 12 Op 3 . - CDS 35614 - 36303 396 ## Coch_1188 hypothetical protein 29 12 Op 4 1/0.286 - CDS 36334 - 40680 4399 ## COG1429 Cobalamin biosynthesis protein CobN and related Mg-chelatases 30 12 Op 5 . 
- CDS 40719 - 42866 1995 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 31 12 Op 6 . - CDS 42863 - 43546 524 ## BT_0497 hypothetical protein 32 12 Op 7 . - CDS 43562 - 44458 586 ## BT_0498 hypothetical protein - Prom 44557 - 44616 7.4 + Prom 44453 - 44512 9.4 33 13 Op 1 . + CDS 44655 - 45536 831 ## BF2104 hypothetical protein 34 13 Op 2 . + CDS 45555 - 45935 443 ## COG5496 Predicted thioesterase + Prom 45937 - 45996 1.6 35 13 Op 3 . + CDS 46016 - 46918 705 ## COG1230 Co/Zn/Cd efflux system component + Term 46952 - 47010 10.1 + Prom 46930 - 46989 4.1 36 14 Op 1 . + CDS 47019 - 50084 2216 ## BF3444 hypothetical protein 37 14 Op 2 . + CDS 50100 - 51542 1036 ## BF3445 hypothetical protein 38 14 Op 3 . + CDS 51571 - 52767 1134 ## BF3446 putative lipoprotein 39 14 Op 4 . + CDS 52754 - 55618 1930 ## COG0612 Predicted Zn-dependent peptidases 40 14 Op 5 . + CDS 55599 - 56498 755 ## COG3016 Uncharacterized iron-regulated protein + Term 56535 - 56574 6.5 + Prom 56523 - 56582 2.2 41 15 Op 1 . + CDS 56607 - 58205 1547 ## COG2985 Predicted permease 42 15 Op 2 . + CDS 58252 - 58632 130 ## BT_0501 hypothetical protein + Term 58635 - 58682 2.6 43 15 Op 3 . + CDS 58717 - 61038 1959 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 61069 - 61111 6.5 + Prom 61051 - 61110 3.7 44 16 Tu 1 . + CDS 61187 - 61504 94 ## BT_0503 hypothetical protein + Prom 61529 - 61588 5.0 45 17 Tu 1 . + CDS 61618 - 63957 2160 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 64011 - 64051 6.8 - Term 64043 - 64083 2.2 46 18 Tu 1 . - CDS 64293 - 65207 558 ## COG0053 Predicted Co/Zn/Cd cation transporters - Prom 65327 - 65386 6.1 + Prom 65286 - 65345 6.5 47 19 Tu 1 . + CDS 65392 - 67698 1982 ## COG3525 N-acetyl-beta-hexosaminidase + Term 67890 - 67933 9.6 + Prom 67919 - 67978 4.1 48 20 Op 1 35/0.000 + CDS 68010 - 69782 1458 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 49 20 Op 2 . 
+ CDS 69775 - 71511 228 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 50 20 Op 3 . + CDS 71529 - 72137 619 ## COG1309 Transcriptional regulator + Term 72325 - 72370 -0.9 51 21 Op 1 . - CDS 72456 - 74555 970 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 52 21 Op 2 . - CDS 74576 - 75730 1062 ## COG0251 Putative translation initiation inhibitor, yjgF family 53 21 Op 3 . - CDS 75752 - 76465 447 ## BT_2476 hypothetical protein 54 21 Op 4 . - CDS 76489 - 77703 866 ## BT_2477 hypothetical protein 55 21 Op 5 . - CDS 77731 - 79161 1224 ## COG3488 Predicted thiol oxidoreductase - Prom 79188 - 79247 3.6 - Term 79189 - 79249 6.1 56 22 Op 1 . - CDS 79272 - 80423 892 ## BT_2479 iron-regulated protein A precursor 57 22 Op 2 . - CDS 80453 - 81793 912 ## BT_2480 hypothetical protein - Prom 81889 - 81948 12.9 - Term 81933 - 81994 14.5 58 23 Op 1 . - CDS 82184 - 82546 336 ## BT_0516 hypothetical protein 59 23 Op 2 . - CDS 82580 - 83086 548 ## COG0716 Flavodoxins - Prom 83148 - 83207 3.1 - Term 83134 - 83193 16.1 60 24 Tu 1 . - CDS 83218 - 84393 1118 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Prom 84601 - 84660 8.0 61 25 Op 1 . + CDS 84602 - 85306 618 ## BT_0519 hypothetical protein 62 25 Op 2 . + CDS 85296 - 85907 503 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Term 86268 - 86322 10.7 63 26 Op 1 . - CDS 86451 - 87098 429 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 64 26 Op 2 . - CDS 87050 - 87997 306 ## BT_0522 hypothetical protein 65 26 Op 3 . - CDS 88009 - 89166 790 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 66 26 Op 4 . - CDS 89184 - 89549 303 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 89638 - 89697 5.6 - Term 89723 - 89763 7.4 67 27 Tu 1 . 
- CDS 89784 - 92168 1838 ## BT_0525 outer membrane protein, function unknown - Prom 92203 - 92262 3.9 - Term 92330 - 92388 12.2 68 28 Tu 1 . - CDS 92410 - 93450 794 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D - Prom 93515 - 93574 7.4 + Prom 93401 - 93460 5.5 69 29 Tu 1 . + CDS 93659 - 94015 241 ## gi|260175339|ref|ZP_05761751.1| hypothetical protein BacD2_26024 + Term 94193 - 94226 0.5 - Term 93912 - 93957 9.1 70 30 Op 1 . - CDS 94200 - 94970 334 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc 71 30 Op 2 9/0.000 - CDS 94989 - 95612 573 ## COG0135 Phosphoribosylanthranilate isomerase 72 30 Op 3 21/0.000 - CDS 95639 - 96421 697 ## COG0134 Indole-3-glycerol phosphate synthase 73 30 Op 4 13/0.000 - CDS 96470 - 97465 989 ## COG0547 Anthranilate phosphoribosyltransferase 74 30 Op 5 35/0.000 - CDS 97470 - 98036 403 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 75 30 Op 6 . - CDS 98082 - 99488 1319 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 76 30 Op 7 . - CDS 99537 - 100724 1265 ## COG0133 Tryptophan synthase beta chain - Prom 100946 - 101005 5.2 77 31 Op 1 . + CDS 101085 - 101225 152 ## 78 31 Op 2 . + CDS 101305 - 102450 1157 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family + Term 102482 - 102527 2.3 - Term 102463 - 102520 2.4 79 32 Tu 1 . - CDS 102527 - 104152 1105 ## COG0642 Signal transduction histidine kinase - Prom 104202 - 104261 6.6 + Prom 104409 - 104468 6.1 80 33 Tu 1 . + CDS 104490 - 106349 1562 ## COG0471 Di- and tricarboxylate transporters + Term 106436 - 106474 7.2 + Prom 106495 - 106554 8.4 81 34 Tu 1 . + CDS 106710 - 109049 1948 ## BT_0525 outer membrane protein, function unknown + Term 109105 - 109154 5.1 + Prom 109051 - 109110 1.6 82 35 Tu 1 . + CDS 109178 - 109642 68 ## BF2973 hypothetical protein + Prom 110231 - 110290 6.3 83 36 Tu 1 . 
+ CDS 110358 - 114323 2194 ## COG0642 Signal transduction histidine kinase + Term 114363 - 114407 3.1 + Prom 114364 - 114423 2.4 84 37 Op 1 . + CDS 114501 - 116318 1500 ## COG3250 Beta-galactosidase/beta-glucuronidase 85 37 Op 2 . + CDS 116356 - 119568 2508 ## PRU_2517 TonB dependent receptor 86 37 Op 3 . + CDS 119580 - 121202 1072 ## PRU_2518 putative lipoprotein 87 37 Op 4 . + CDS 121215 - 123038 1170 ## COG2730 Endoglucanase 88 37 Op 5 . + CDS 123047 - 124894 1112 ## BT_3597 sialic acid-specific 9-O-acetylesterase 89 37 Op 6 . + CDS 124948 - 126417 810 ## COG2730 Endoglucanase 90 37 Op 7 . + CDS 126450 - 126720 232 ## BT_1872 periplasmic beta-glucosidase precursor Predicted protein(s) >gi|225935317|gb|ACGA01000075.1| GENE 1 81 - 4028 2156 1315 aa, chain + ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 814 1033 48 268 269 144 32.0 1e-33 MKIPINDILKMRRSLLFFICLIFPFILFAKHFKVFDISTGLPNNTVKCIIQDEQGFIWLG TFNGLCRFDGVDFAVFTHQQNDSLSIINNHIEALLPVENGMWVGTEGGLNFYSYTENRFY PCIQRASTGETQLMTRPIKNIVMSKGKVFVLNVSRELFVLRDNQTFEKCDYEANMACLAI ALYKDGMLLAHTTKGICLIEPEKGRIVGRLNIPQKMSSDNVIYYSKNKNRIYIGYGIGYN SEVFSVNDSLEIRKLDEPVPSDVKSVIDYKDETVFGTDGNGLATFSTKEENFYTPENSNI SSDAIHSLFVDRNENLWIGTYRGGVNLFSVHYNWFKTLTMANKQITHNVVTAIYPGKDHL YIGLDGGGLNVYDQSTGRVTAYTTRNSSIAGNNILSISGDEQYIWLGIYGRGLCRFSLAG HSFKNFVLPSIDGSMNMNRIWEIKDDGRGYIWIIGEGVYLFNKETETFITISELNGVYAS GVVFDDNVAWLSCTNSGLYKMDRVTGKILKHYFKDSKETPIDSNIIRYIFIDSNHCVWFS TEYSGLYKLDEATETVTSCGKSTGLTSQSTVSMQEDQYGNIWLGTDNGLFRYSSATGTFV RFGKEDNLSLVQFNYNACSRKEDILYFGTTGGLVWFNPSEIEYSQRANSVYFTGLELLNN DKELINLYGDHPQEIRLPYDQNFFTIHFSTPELVSPDKINFSCFMENFEKNWQEISYHRQ VSYTNVPPGEYLFYVKSSDSNGKWNEKASCLKIIITPPWWQTGWAFCLWGIIILGILVLV LWFYQHELSIKHMVRLKEIEKNTAKSISEAKLIFFTNITHELRTPIFLITAPLEELMSSA KGPVQVPKSYLSAMYRNAMRLNKLISRIIDFRKLESGKLKLETQRLNVVAFCKDLTIDYE 
ALCQQKNILFYFQPSKTVIQLDFDAEKLESILSNLISNAFKYTSEEGKIIFSIDDAEDAV LFTIEDNGIGIPEEYHDVIFDSFFQVDPSRASAMGDGIGLSFVKHLVELHGGTVKVESQP ERGAKFMFSIPKKDVEEVVEQLEQKSVIIDFNETDSRQNELVSPQLPTAIHSLLIIDDER ETVEILERFLIKDFKILKASNGVDGLTIAQEALPDIIICDMMMPKMDGMEFLSLVKGDKK LSHIPVIMFTAKTSEDDKMAAFDSGADAYLTKPVSLKYLRKRIDNLLSQSESVAVVNHIS NTEKNYSKEEQRFLLKCREIIDNHLTDADFDVLTFAEKLGMSHSTLYRKVKSVTGMSVIE FINEYRIFKAVQYFNEGETNISTVGVKCGFNDLKNFRAAFKRKVGVSPKQYVMRL >gi|225935317|gb|ACGA01000075.1| GENE 2 4358 - 7480 2808 1040 aa, chain + ## HITS:1 COG:no KEGG:BT_2818 NR:ns ## KEGG: BT_2818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 1040 16 1057 1057 787 44.0 0 MKYLDLGLSDMHKAFLQMRLVCLLMIILPGTLWAQNTKISGQVIDDTGEPVIGATVRLKG AMTGTVTDLTGGFVLQVSDMKGILEFSFVGMETLEVPIKGKTMLKVTMKSSSQLLDEVQV VAYGTQKKITLTGSVSSVNTKDLLKVPTASIGNMLAGAMSGVSSVQSTGQPGGDDPEIFV RGMSTLNSANAKPLYLVDGVERDFFQMDPNEIESVTILKDASSTAVFGVRGANGVIIVTT KRGKEGKAKINASFSYGIQAPTRLPEFTNSYEYADFINEAYRHDDKQEPFGSDVVEAFRT HSNPLLYPDTDWMKVLFESTSPQTQANLNISGGTDRVRYFVSMGMLDQKGFFKNYDSRYD GNFNYNRYNYRANLDVDFTKTTLLTVNLGGRVEKRNYPQGGDNLDDLFRHIYWALPFGGP GIIDGKWITGNKQYIPVLGENVSYADGMYNTYGRGYSTKTTNVLNLDLALSQKLDFLTKG LNFKIKFAYNSSYNQTKTRKTSAPYYKPWYRKDITWAEQPAGSDPNEIVYIQNGEEGVIS YGESLGKARDWYAEASFDWKRDFGMHHFSALALYNQSKSYYPNSGWPGIPRGYVGLVGRV TYDYDNKYLIEGNVGYNGSENFAPGHRYGFFPAVSGGWVLTQEDFLKDNSILNFLKVRAS YGIVGNDRYYSSGDYMIRFLYLPNSYQLGNGYQFGTGKSWTNGAYESSFGNSGVSWEKSA KQNYGIDFTLFNQKLSGSVDYFLEKRTNILAQSNTDPVIHAMALPVLNLGEVSNKGVEVN LKWNHKINNFRYWANFNISYAKNKIVYQDEVPSPYEYTLKTGHPVGQPFGLKVRGLYYEG MENVADHNYILKPGDVVYEDLNGDEVIDDNDKTAIGYPNYPLLNGGFTIGFEYKGFDFSM MWSGATKTSRVLRDEFRTPAGGSNYRALLKYMYDDRWTEETAATAKLPRASIDGLNNNYR DSELWIKDASYLRLKNIELGYNFQLPFMKHIGMETLRVFITGYNLLTLDKLKISDPESLN GNVPKYPVMRVVNFGVNVGF >gi|225935317|gb|ACGA01000075.1| GENE 3 7492 - 9336 1357 614 aa, chain + ## HITS:1 COG:no KEGG:BT_2819 NR:ns ## KEGG: BT_2819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 614 1 585 585 301 35.0 
6e-80 MKRILNSLLLASFLLSTSCVDEFKVGDAFLDKAPSVDKDKEYVFSNPDYTRDFLWNAYRT LYYGRVLDWSAKGNRVGMGAIDALTDCFTCGLTWCTVYNTYYTGQFTSNSAANDCVYSYS DGEQQWTGIRKAWLFIENVGQTPGITDEEKERLKAEAKMIIACHYSDMFKNFGGVPLLDH ALDPNEELKIPRATLRQTLDFIVQLCDEAYKVLPWALDSSEQENWAGRFTGAAALGLKIR TLLFAASPLYNDNQPFYPEQAGAAAIAEKQVWFGGKDDQLWRDVVDACSLFMSRNSQEGE PWQLVTNASNPRQAYQDAYYKRANGELLIDTRQRATVSWNWDDNCSYLHFYGQYGALVPT LEYVNLFPYADGTPFDASFWEQQRKDKYIEEDPFKNRDPRLYENCLVNNAEFGRDHPAEL WIGGREGMSSETVSDGSKFTGFRIYKYILDLSSQNGKQDQWPYLRLPEIFLSYAEALNEL DRATERDALGNDAYEYINKVRRRVGLSGLQTGLSKEQLRKEILDERAREFGAEDVRYYDM IRWKMAEDFRKPLHGLRIRRGDDKKSYSYEKFQIKKRYIQDGDDGTVYFTPKWYLTPFPF VEINKGYLNQNPGW >gi|225935317|gb|ACGA01000075.1| GENE 4 9369 - 12134 2203 921 aa, chain + ## HITS:1 COG:no KEGG:BT_2820 NR:ns ## KEGG: BT_2820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 921 6 926 926 526 35.0 1e-147 MRNIYKLSCLLLLSSLNAYAQEDSAFVKKEERQLNLGYGVTVKASESSAAISEITGEELM KSAATNPLNALFGKGLGLTVLQAGGTIWENTPTTMIRGVNTTGSNSILVIVDGFERDLGS ITKEEIESVQILKDAAATALYGLRGANGVMVVTTKTGTRKKIDISVNYEHHFNMLHGLPK MADGYTYALAMNEALINDGLGARYDETQLNGIKNGDPRYGNVDWMDEILKKSGSIDNVFV TLSGGGKYVRYFSSVDYLQSKGFMKPDNIEGDYSSQHKYSKLNVRTNVDVDLTSSTQLFF KMNGMLSEHNRPGIGAGDLMGLIYTTPSAAFPVKTPEGEWGGSDMWTRNPVAEVAAKGYA RSHSRQLFADATIRQDLGMLLKGLSVDVRIGYDNFCEYWDNYPMSYAYAVANVAHPSSAD DYTLRGQNKTDNFNSSVGGQKSRLNLWGRVNYNNMWGKHGVNLTLAASREQTVWSGQNNT RNQINLVGHVHYVYDGKYIADVAFSRNGSNALPPSHRFGNFPAVSTAWVISNEDFLKDVS FIDLLKVRASWGLTGSDVAPESLLWNRKYGWANGYILGGEFGSFSGLGEGHLPTMDMTYE KVSKANIGVDLSFAKALDVTVEFFKEKRSDMFITPGNISSVMGAANAYKNLGKVENKGLE AGINYHKQIRNLNFHVGGNFSFNRSKVIEMGEEYQPYEWMRTTGQTVGQIMGYEAIGFFK DEQDIKNSPVQMFGEVKPGDIKFRDLNGDKIIDERDKKAIGYSSVAPEIYYSLNLGVEYK GFGIDALIQGVGNYSDIASTAGLFRPLSNNTSLTQYYYENRWTPQNQQAKYPRLSQGANA NNFTNNTVWLTDKSYLKLRHAELYYKFSKKLLATTPLHSVKLYVRASDFTLFDHVDICDP EAMGNALPIPSSVQLGFSIGF >gi|225935317|gb|ACGA01000075.1| GENE 5 12159 - 13829 1448 556 aa, chain + ## HITS:1 COG:no KEGG:PRU_2229 NR:ns ## KEGG: PRU_2229 # Name: not_defined # Def: putative 
lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 14 555 16 559 559 350 39.0 7e-95 MKKHIWIAGALMVLFSSCNDFLDYNENSDYDQNSVFNITWSNEDFVTGIYAYLPGGTSDV DGALRSAGCDEAEYVWPSSTIRRFYDGSWSAMNTVDDKWESYYKAIRLCNDYLQNGTGKT WEQFQYDKNYEKIMRKYKNLEWEVKALRAYFYFELVKRYNKIPLIKELLTEEEANRQEPV YYQTIINFIVAECDEVAAEGKLPALYDGNYDNETGRLTKQFALALKARALLYAASPLNNS EAETNPEKWVKAAKAAKVLIDNAGSWKLSLVSFGELLGADNFKKAEMILAKRIGDNNDFE RTNFPVGVEGGKGGNCPTQNLVDAFDLQDGKSPVEENPYANRDPRLGYTVLYQGGNTAYG KKVDLSIGGANGLPQQGATRTGYYLSKFVNTSISLTANSTTTARHSIPLFRYAEILLIYA EAMNEAYGPEGMAPGEGLNMTALEAINKVRGRSGLGIALLKAEDVSSKILFRRALRKERM AELAFEDHRFWDIRRWKIGPESVTIKRVTVTKDGNSNLFDYTSTSDRIWEDKMYFYPIPQ SEMYKNRNLIQNPGWE >gi|225935317|gb|ACGA01000075.1| GENE 6 13858 - 15042 726 394 aa, chain + ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 42 301 208 457 642 112 29.0 1e-24 MKIYKLVIALGMIGGAFNSTFSAYGDNPPVNPYQPDPNPIRQIAGYEYAWGDEFNVDGVV NPDLWQAEEGFIRGNELQYYRRENAVVKDGRLLITGKKERVKNKFYDASSGDYRKNTEYS EYTSASILGKDKRHFLFGRIEVRARIHPADGAFPAVWTCGYNKNWPSNGEIDMMEYYLAD LGNGKEPVLTTNFCVSSDNPHDAWAQNWKSVFTPVTYYQKKDKDWMNKYHVYRMDWDEES IALYVDDELRNWINIDEFRNGDGSIAFHNPQFMMLNLAIKNHGAGLVDDYAFEVDYFRVY QKKPDLVKPSMVEGLKVLKQTATTCTLQWQPSTDNQGIYRYDIYKEGGRFVGSTTDCEFV IDGVSSDDKVKYYVRALDAAGNYSDRCTPLSVFR >gi|225935317|gb|ACGA01000075.1| GENE 7 15143 - 17740 1621 865 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 31 586 57 661 871 201 27.0 4e-51 MKITKKILFFFSLFLSSLLYAQNFLSPRTIINLNREWKYTRGDHQGAERQVFNDEGWEQV GLPHSFSIPYFMSKDFYVGYGWYRKSLKLTANELSRKIFLEFDGVFQEAEVFVNGKLAGK HVGGYTGFSIDISSLVKAGDNILAIRVNNYWKPNVAPRAGEHVFSGGIYRNVRLVLKSPV HIDWYGTFVTTPDLAKNQGKSSSVNIKTEVCNSEEKEGTYRLLTQVLAPDGQIVTSVETK 
ETIAGRSSRQISQTTEALDKPQLWNLSTPVLYKIASKLYKGEQLIDEEETTFGFRWFEWT ADRGFFLNGKHLYFRGVNVHQDQAGWGDAVTEAAMYRDVSVMKEAGFDLIRGSHYPHAPA FSRACDELGMLFWAEAPFWGIGGYKPDGYWNSSAYPVTKSDEAGFEESALQQLREMIRIH RNHPSIIVWSMCNEAFFSAPETMPGVRRLLKRMVDLTHQLDPTRPAAVGGAQRPLGEERI DKIGDIAGYNGDGATQTDFQNPGIPNVVSEYGSTTADRPGEYGPGWGDLKKDDGWKGRSW RSGQAIWCGFDHGSIAGSALGKMGIVDYFRIPKRSWYWYRNEYTQVAPPVWPVEGIPARL KLEATKTENVLTDGTDDVLLTVSVLDEAGKLLNNSPSVYLKLISGPGEFPTGSSILFEKD SDIRMIDGQAAIAFRSYYAGKSVIEATSPGLQSVRIEINFAGKYAYESGVTPTVKERPYI RFAPENHETVVQTFGRNNPTFASSLRGKQSAGFAADGNMDTFWEATGEDYSPWWMLDTEK GLTLRTISVHFPKAAIYHYMIEVSDDNKEWKTVLDRRNGRVVEQRTDITFSVQEAPVTGR FIRISFVDKSPAAIAEVEVSGVVRE >gi|225935317|gb|ACGA01000075.1| GENE 8 17779 - 20076 1105 765 aa, chain + ## HITS:1 COG:no KEGG:BCAH187_A3211 NR:ns ## KEGG: BCAH187_A3211 # Name: not_defined # Def: alpha-fucosidase # Organism: B.cereus_AH187 # Pathway: not_defined # 32 759 84 840 1193 603 42.0 1e-171 MKQYKIFMLKGVFSLLLTFVSAFWLNAEQPLMKLWYTRPAQNWMTSALPIGNGELGGLFF GGIACERLQFNEKTLWTGSETKRGAYQSFGNLYIDFAEHNGEAVDYCRELCLDNAIGSVS YEMNGVKYRREYFASYPDRVIVMRITTPGMKGRLNLSVRLEDSHFGQLSVNKNILGIQGQ LDLLSYDAQVKVLNEKGQLSVVDNRLTVCDADAVTILLVAGTNFNISATDYLGTSSEDLH KELYTRLSNASRKNYAALKNIHLKDYQSLFSRVKLDLQADMPEYPTDELVRNHKESRYLD MLYFQYGRYLMLGSSRGMNLPNNLQGIWNADNTPPWECDIHSNINIQMNYWPAEITNLPE CHLPFLQYIAVEAVGKPNGSWRRIAQGEGLRGWTIKTQNNIFGYSDWNINRPANAWYCTH LWQHYAYNNDLEYLRNIAFPVMQSTCKYWFDRLKENKDGKLVAPDEWSPEQGPWEDGVAY AQQLVWQLFNETLHAVEALKKVDIQIDNVFVSELADKFRKLDNGVSVGSWGQIKEWKEDK GKLDFQGNDHRHLSQLIALYPGNQISYHRDTLLADAAKVTLQSRGDMGTGWSRAWKIACW ARLFDGDHAYRLLKSALSLSTLTVISMDNSKGGVYENLFDSHPPFQIDGNFGATAGIAEM LLQSNQGFIHLLPALPLAWSDGSVAGLRTEGDFTFTMKWNAGWLTQCSVLSGSGLKCRIY SPKGLIARVENSSGKRIPISLIKGGLIEFPTKKGEIYNISVSKIY >gi|225935317|gb|ACGA01000075.1| GENE 9 20030 - 20311 120 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYLYHYIYSINIRACCKDINFHLIKHLSKLAYTNVYITKISTFHKLLIPFRGKSCLLKRG DITLKYILQDTNKADKQPFNIFLKLKYYKFPLS >gi|225935317|gb|ACGA01000075.1| GENE 10 20341 - 20784 418 147 aa, chain + ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## 
COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 147 1 147 147 178 59.0 3e-45 MEIKLITSDKKEFLDLLLLADEQESMIDRYLERGDMFVLYDDGLKALCVVTREGEGIYEI KNIATVPFFQRQGYGKRLIEFLFEYYQGKCSEMLVGTGDVPSSRSFYEHCGFAISHRIKN FFTDNYDHPMYEDDMQLVDMIYLKKTL >gi|225935317|gb|ACGA01000075.1| GENE 11 20791 - 21474 473 227 aa, chain + ## HITS:1 COG:no KEGG:Psta_4074 NR:ns ## KEGG: Psta_4074 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: P.staleyi # Pathway: not_defined # 9 181 32 211 293 77 29.0 3e-13 MMEIDINYLVKEYGNMISTIAHRMIQNKEIAREAAQEVWYELCKSFSGFKGDSEISTWIY TVARRTIGRYAACEKQVKMSEIEYFRSLPEIGYSGGEEAKREWIKERCDWCITALNHCLN NDARLIFIFRENIGLPYRQIGEIMEWKESNVRQVYNRSIQKITAFMNDTCPLYNPDGACK CRICKPVYSIDMDKEYAMVQRMMRLADLYRKFEKELPRKNYWEKFLQ >gi|225935317|gb|ACGA01000075.1| GENE 12 21536 - 22057 517 173 aa, chain + ## HITS:1 COG:no KEGG:Slin_0877 NR:ns ## KEGG: Slin_0877 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 1 163 1 160 174 96 33.0 4e-19 MSKIDFLKEQIVEARTFVNRLMSELPEDLWYVIPEGTDSNFIWQVGHLLVSQNFHTLTAI TGVNERVGRLVPIQKYNRMFNGMGTLHRSIEKGLISVAELREQMEIVHQICIENLETLND DVLAECLQPLPFEHPVAETKYEALSWSFKHEMWHSAEMEAIKRDLGYPIVWMK >gi|225935317|gb|ACGA01000075.1| GENE 13 22299 - 22592 166 97 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIVRSLPYDISVLILKCISLKVLYVSIFPRFFISLIAFVFSKIQNKKPFVKRFNRKADYS LNQSIFFEKYITIARMINSPKLPANNAKVSSWIRTKI >gi|225935317|gb|ACGA01000075.1| GENE 14 22472 - 23824 1163 450 aa, chain - ## HITS:1 COG:PA0465 KEGG:ns NR:ns ## COG: PA0465 COG4452 # Protein_GI_number: 15595662 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Pseudomonas aeruginosa # 47 448 27 438 452 222 34.0 1e-57 MDAFNENSNEQQAQQPIGCLHRFSKTIKVVIIGLLILLLMIPMFMIENLISERGRTQEEA 
IDEVSEKWSLAQTITGPYLNLQYPVVTENNGEKKVSIKDLILFPDELMVNGQLKTEILKR SIYEVNVYQSELTLKGSFSSEELKKSRIDMDQLQFDRAAICLNLTDMRGISEQISITLGD SVYIFEPGMDNRGIANTGVHAIANLSELKQNKKLPYEIKIKLKGSQSLNFIPLGKTTRVD LKANWNTPSFTGNYLPNNRNITEKEFSAQWQVLNLNRNYSQVMADYNTSNIKDIENSSFG VNFKIPVEQYQQSMRSAKYAILIILLTFGVIFFTEIMNKTRIHALQYLLVGLALCLFYSL LLSFSEHVGFNPAYLLSSALTIILVGGYMFGITKKKKPSLIMSGLLAILYIYIFVLIQLE TFALLAGSLGLFIILAIVMYFSKKIDWFNE >gi|225935317|gb|ACGA01000075.1| GENE 15 23851 - 24147 245 98 aa, chain - ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 8 96 10 98 103 82 46.0 3e-16 MIEAFQYINKAFESKVRLGIMAILMVNEEADFNFLKEQLSLTDGNLASHTRSLEELGYIV CNKSFVGRKPRTVFQATPQGREAFKSHIEALEKFLKSK >gi|225935317|gb|ACGA01000075.1| GENE 16 24148 - 24747 456 199 aa, chain - ## HITS:1 COG:no KEGG:BT_1252 NR:ns ## KEGG: BT_1252 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 199 1 183 183 264 81.0 1e-69 MDKDKALESVNEIKELMEKSSKFISLSGLAAIMAGIYALAGAYIATQVITPGTHLIVALE LMAIIASLVLVAAAVTACILSYYKSKKTGQKFFSRLTYRALWNFSLPMLTGGVLCISIMM HEYYDILASVMLLFYGLALVNVSKFTYSSIVWLGYAFICLGVVDCFWEGHSLLFWTIGFG GFHILYGILFYLHYERKRS >gi|225935317|gb|ACGA01000075.1| GENE 17 25000 - 25440 508 146 aa, chain + ## HITS:1 COG:no KEGG:BT_1251 NR:ns ## KEGG: BT_1251 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 2 147 147 240 91.0 9e-63 MREGAEFRELMLQIVRTRMAFRRSMQRTLRKNNAGITFEMLQVLSSLWHEQGISQQILAE RIAKDKACLTNLMNNLEKKGYVHRKEDLADRRNKLVFLTPEGEEFKEQIRPILDQVYVYA EKIVGIESIETMLSELKSVCDVLENV >gi|225935317|gb|ACGA01000075.1| GENE 18 25421 - 26740 883 439 aa, chain + ## HITS:1 COG:alr2887_2 KEGG:ns NR:ns ## COG: alr2887_2 COG1538 # Protein_GI_number: 17230379 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer 
membrane protein # Organism: Nostoc sp. PCC 7120 # 32 435 27 433 443 82 21.0 2e-15 MYLKMYKKYGAVLSSLFLAAQSINAQTDSLFLTIDQLFERGVQHSLQLQADVMKESMAQE RIRTARFAQLPELQIGLKGGFVGQPVVWERGLSDPSYPEAPDWSQNYAIDLAQPLYQGGK IRSAIHKADIEKQVAELQTLTDEAEIKLRLLNQYMNLFSLFKQHEILTRNIEESELRLRD IRRMKKEGLITNNDVLRSEMQLTNDRLSLEEAENSIALVSQQLNILLGQDENLLLQPDTT ILYQAVALQSYDDYIVQAYENDPAMRLLRAQTELARNEIRSAQSLSLPSVSLYASNTLAR PVSRTLADMYNNNWNVGVSVSYPLSSLYKNNHKVKESKLMVSLRKNEEEQKMQGIRMDVR SAFLRHREALQRVEALKLSVRQAEENYRIMQNRYLNQLAILTDLLDANSVRLNVELQLVT ARTRVIYTYYQLQKACGRL >gi|225935317|gb|ACGA01000075.1| GENE 19 26755 - 27825 1082 356 aa, chain + ## HITS:1 COG:mll0995 KEGG:ns NR:ns ## COG: mll0995 COG1566 # Protein_GI_number: 13471111 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Mesorhizobium loti # 9 350 47 381 417 160 33.0 3e-39 MAIEWKEKQKKLKKVRVRNIVLNLVCVCLAVSGLWWTANYFWRYVNYEVTNDAFVDQYVA PLNVRASGYIKEVRFKEHQYVHQGDTLLVLDNREYQIKVKEAEAALLDAHGSQDVLHSGI ETSHTNIAVQDANIAEAKAKLWQLEQDYHRFERLLKEESVPEQQYEQAKAAYEIAKARYQ ALVAQKQAALSQYAETSKKTTGTEAAILRKEADLDLAKLNLSYTVLTAPYDGYMGRRTLE PGQYVQTGQTISYLVRNKDKWITANYKETQIANIYIGQQVRIKVDAVPGKIFHGEVTAIS EATGSKYSLVPTDNSAGNFVKVQQRIPVRIDLEDVSPEEMAQLRAGMMVETEALRK >gi|225935317|gb|ACGA01000075.1| GENE 20 27831 - 29426 1040 531 aa, chain + ## HITS:1 COG:no KEGG:BT_1248 NR:ns ## KEGG: BT_1248 # Name: not_defined # Def: putative transport protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 836 88.0 0 MKELEAPVRQWVPDWLGLISIFIVILPVTMLNGSYTGSMLEVSNTLGTNSEDITMGYYAA SAGMAIAYPIIPKVLAAFSVKSLLLIDLVLQFFLSWVCARTQNADILIVCSFAVGFLKGF LMLWFIRYAQKIFSPKNVRSEFYSYFYPLVYGGGQASMLVTALLAYHYNWKYMYYFMMVL ILISILFVIICFRHNRPIKAVAWSDLHIREMFVISAGLLMLMYVINYGKVLNWMASGKIC AYIIIAPMLIALFIWYQSRSENPYVSLAPLFQPKAIVGYFYMMLVMFFSTSTTLLTNYLT VILKVDTTHTYSLYIYLLPGYVVGAFICFWWFRWQRWRFRFLIAGGMSCFVIFFAILYFG ISPDSTYEMLYLPIFFRGLGMLTLIIAFALFAVEDLNPKYLLSNAFFLIIFRSVLAPIMA TSFYSNVLYRLQQKYMYSLSETITAADPLSSARYSQSLNNAMTQGHPYDEAAQLATNSLY 
NTLQQQSLLLALKEILGYLLVISVVIAVISRFIPFHKTIRVTFKKTGDDMV >gi|225935317|gb|ACGA01000075.1| GENE 21 29491 - 30981 1502 496 aa, chain - ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 6 496 5 492 492 611 60.0 1e-175 METKYLRINPADNVAVAIVNLPAGEHLSIDGIEITLNEDIPAGHKFALKNFAEGENVIKY GYPIGHARMAKKQGDWMNETNIKTNLAGLLDYTYNPIQVSLDIPHKDLTFKGYRRKNGDV GVRNEIWIIPTVGCVNGIIGQLAEGLRRETEGKGVDAIVAFPHNYGCSQLGDDHENTKKI LRDMVLHPNAGAVLVVGLGCENNQPDIFREFLGEFDEDRVKFMVTQKVGDEYEEGMEILR DLYAKASKDERTDVPLSELRVGLKCGGSDGFSGITANPLLGMFSDFLIAQGGTSVLTEVP EMFGAETILMNRCKNEELFEQTVYLINDFKEYFLSHGEPVGENPSPGNKAGGISTLEEKA LGCTQKCGKSYVSGVMPYGERLQKKGLNLLSAPGNDLVAATTLAASGCHMVLFTTGRGTP FGTFVPTMKISTNSTLAKNKPGWIDFNAGVIVENEPMEKTCERFIDYIIKVASGEFVNNE KKGYREIAIFKTGVTL >gi|225935317|gb|ACGA01000075.1| GENE 22 31012 - 32076 977 354 aa, chain - ## HITS:1 COG:mll7623 KEGG:ns NR:ns ## COG: mll7623 COG1879 # Protein_GI_number: 13476333 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 6 309 5 305 345 84 26.0 2e-16 MNKLPERIRIKDIARLADVSVGTVDRVIHGRSGVSEASKKRVEEILKQLDYQPNMYASAL ASNKKYTFICLLPEHLEGEYWTAVEIGIHEAIATYSDFNTSVKINYYDPYDYHSFVNASE AILALQPDGVMVAPTAPQYTKGFTDQLQALDIPYIYIDSNIKEVPPLAFFGQNSRQSGYF AARMMMLLAREEKEIVIFRKIHEGIVGSNQQENREIGFRQYMKEHHPSCTILELDLHAER NDEDNEMLDEFFRTYPMVKNGITFNSKAYIVGEYLQSRGKKDFNLIGYDLLERNVTCLKK GSISFLIAQQPELQGANGIKALCDHLIFKKEVTRINYMPIDLLTVETIDYYHSK >gi|225935317|gb|ACGA01000075.1| GENE 23 32330 - 33355 1420 341 aa, chain + ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 4 341 2 339 339 365 54.0 1e-101 MGKKVVTLGEIMLRLSTPGNTRFVQSDSFDVVYGGGEANVAVSCANYGHDAYFVTKLPKH 
EIGQSAVNALRKYGVKTDFIARGGDRVGIYYLETGASMRPSKVIYDRAHSAIAEADAADF DFDAIMEGADWFHWSGITPAISDKAAELTRLACEAAKRHGVTVSVDLNFRKKLWTKEKAQ SIMKPLMQFVDVCIGNEEDAELCLGFKPDADVEAGHTDAEGYKGIFQQMMKEFGFKYVVS TLRESFSATHNGWKAMIYNGEEFYTSKRYDIDPIIDRVGGGDSFSGGIIHGLMTKPNQGA ALEFAVAASALKHTINGDFNLVSVEEVEALAGGDASGRVQR >gi|225935317|gb|ACGA01000075.1| GENE 24 33417 - 34085 934 222 aa, chain + ## HITS:1 COG:CC1495 KEGG:ns NR:ns ## COG: CC1495 COG0800 # Protein_GI_number: 16125742 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Caulobacter vibrioides # 5 221 4 221 224 228 48.0 9e-60 MAKFDKIAVLNKIGSTGMVPVFYHKDAEVAKKVVKACYDGGVRAFEFTNRGDFAQEVFAE IVKFAAKECPEMAIGIGSIVDPATAAMYLQLGANFVVGPLFNPEIAKVCNRRSVAYTPGC GSVSEVGFAQEVGCDLCKVFPGDVYGTNFVKGLMAPMPWSKLMVTGGVEPTKENLTAWIK AGVFCVGMGSKLFPNDKVAAEDWAYVTAKCKEALGYIAEARK >gi|225935317|gb|ACGA01000075.1| GENE 25 34264 - 34554 68 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVSSLSKPDKRHAHYRRNLLIICLLLGIVLFTFCFLFCQHDSQNMDCDCLRCLLSAPPQE GEYLYNRYLLLISVLCFLLIFLPIYIFEHRQHIKRK >gi|225935317|gb|ACGA01000075.1| GENE 26 34671 - 35009 336 112 aa, chain - ## HITS:1 COG:MA4425 KEGG:ns NR:ns ## COG: MA4425 COG4744 # Protein_GI_number: 20093211 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 2 112 5 114 123 87 47.0 5e-18 MKRNRRVSKFLKEEDTDPLSVVVNLFDVAMVFAVALMVAMVMHMNMTEVFGQEDFTIVKN PGKENMEIISKEGKKINTYKPSGQKEEDSGKRGRKVGIAYELDNGEIIYVPE >gi|225935317|gb|ACGA01000075.1| GENE 27 34996 - 35604 616 202 aa, chain - ## HITS:1 COG:MA4426 KEGG:ns NR:ns ## COG: MA4426 COG0811 # Protein_GI_number: 20093212 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Methanosarcina acetivorans str.C2A # 7 192 10 205 273 112 35.0 7e-25 MELISKLLFWVANSLLIPDIIILLCLFVRSLILIGGTYNMYMTKRKNDKALDQLIKDARI DDLKAALPDKDNSLYMGYLRDLLYHPASADYSDFLITKFENEAEKDVSLSKMLAKIGPVL 
GLIGTLISMSPALVGLSTGDISGMAYNMQVVFATTVVGLVISAVGLISMQFKQRWYAKDV NNLDYVSRVLNETTGEPNNEKK >gi|225935317|gb|ACGA01000075.1| GENE 28 35614 - 36303 396 229 aa, chain - ## HITS:1 COG:no KEGG:Coch_1188 NR:ns ## KEGG: Coch_1188 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 2 220 8 226 228 149 43.0 1e-34 MLFIVINCSFKLSFWKLWQTVIYSLIAGLFVTGTWQYAILQSKTQIADYLQNTEALQNMA IIITLESALCFGYCVAFLRGLYGKKNLWWAELLRWYPSLLLFPVLFYYLTEAIFRLPGVD FSVTAWSLAGIVVIAIPLLSRLMKYLVPEDDLRLEVHFLVSLFICILGLLTTVNGKTTYA ATEEALNWKAIILSIGLFVVLFLIGFAWNKLKWPLLQRRELKKRSKINQ >gi|225935317|gb|ACGA01000075.1| GENE 29 36334 - 40680 4399 1448 aa, chain - ## HITS:1 COG:MA4424 KEGG:ns NR:ns ## COG: MA4424 COG1429 # Protein_GI_number: 20093210 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobN and related Mg-chelatases # Organism: Methanosarcina acetivorans str.C2A # 906 1424 917 1441 1518 310 32.0 1e-83 MKKKSKIILGVCIVAAALIGMSVWNTWFSPTKIAFINFQTIQQGSISKANDNSSIKLSEV SLDNLDRLTGYDMVFINGMGLRIVEEQRQQIQQAADKGIPVYTSMATNPANNICNLDSVQ QNLIRGYLTNGGKTNYRNMLNYIRKAIDGKISSIPEVEDPAERPSDMLYHAGLTNPDDEL EFLTVADYEKFMKDNRLYKEGARKIMITGQMADATGLIEALEKEGYNVYPVQSMTKFMSF IDEVQPDAIINMAHGRMGDKMVDYLKAKNILLFAPLTINSLVDEWEKDPMGMAGGFMSQS IVTPEIDGAIRPFALFAQYEDEEGLRHSYAVPERLKTFVSTINNYLNLNTKPNNEKKVAI YYYKGPGQNALTAAGMEVVPSLYNLLVRMKQEGYNISGLPANAEELGKMIQAQGAVFNSY AEGAFNDFMQKGHPELITKDQYESWVKESLRPEKYQEVVDAFGEFPGNYMATNDGKLGIA RLQFGNVVLMPQNAAGSGDNSFQVIHGTNMAPPHTYIASYLWMQHGFKADALIHFGTHGS LEFTPRKQVALCSNDWPDRLVGAVPHFYIYSIGNVGEGMMAKRRSYATIQSYLTPPFLES SVRGIYRELMEKIKIYNNSHKENKDQASLAVKTLTVKMGIHRDLGLDSIANKPYTEDEIA RVENFAEELATEKITGQLYTMGVPYEAERITSSVYAMATEPIAYSLLALDKQRGKATDAV SKHRSLFTQQYLTPARHLVEKLLANPALATDELICRTAGITAQELAKARQIEADRNAPKG MMAMMMAASAKSEQSDEKKAGKGHPASVSKEGSPHGQMPESMKEAMKKMGANMDPEKAME MAKSMGASPEALKKMEAGMKANKEKPQGMAAMMAAMGKAPKEYSKEEIELALAIAEVERT IKNVGNYKNALLTSPEEELSSLMNALKGGYTAPTPGGDPIANPNALPTGRNMYAINAEAT PTESAWEKGIALAKQTIDTYKQRHNDSIPRKVSYTLWSSEFIETGGATIAQVLYMLGVEP 
VRDAFGRVSDLKLIPSAELGRPRIDVVVQTSGQLRDIAASRLFLINRAVEMAAAAKDDKF ENQVAASVVEAERVLTEKGVSPKDAREMASFRVFGGANGMYGTGIQGMVESGDRWESESE IADTYLNNMGAFYGDEKHWEVFQKFAFEAALNSTDVVVQPRQSNTWGALSLDHVYEFMGG MNLAVRNVTGKDPDAYLSDYRNRNNMKMQELKEAVGVESRTTILNPTYIKEKMKGGSSAA SEVAQTVTNTYGWNVMKPTAIDKELWDNIYDVYVKDEYKLNVKDFFEKQNPAALQEVTAV MMETARKGYWKASPEQLSNIAKLHTDLVRQFGPSGSGFTGDNAKLQQFIASQVDAQTAAN YNKELKQMKQATLDGEATKGGMVLKKQSSDAVQGAQEEQNSLNGGLIAGIVLVAFVVMLL ILKKKRKK >gi|225935317|gb|ACGA01000075.1| GENE 30 40719 - 42866 1995 715 aa, chain - ## HITS:1 COG:ECs3047 KEGG:ns NR:ns ## COG: ECs3047 COG4771 # Protein_GI_number: 15832301 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli O157:H7 # 107 681 29 632 659 149 26.0 1e-35 MKPTFIISLLLFSFIATCTTLAQHSTYHISGRVTDTEGEPLPGATISIKGKGTGTVTSVD GTYTLQLPGTGTYLITASYVGYQPQEKRLTVGKEKRLSFRLSEDQFDLGTVVVTGTRTPK LLKDAPIITRVITADDIKKVDATNVGQLLQSELPGIEFSYSMDQQVSINMQGFGGNSVLF LVDGERLAGETLNNVDYNRLNLDNVERVEIVKGAASTLYGSSAVGGVINIITRDSEDPWN LNVNSRFGEHNDQRYGGTFSFKAGKFNSQTNVQHTMSDSYDVPEGDVTTIFGNRTWNVKE RLVYRPIEQLKLTARGGYYFRERDKTGDSKDRYRDFSGGLKANYDFNIMSNLEVAYTFDQ YDKSDYLTSYKYDVRDYSNVQHSVRALYNYTFNDKNILTVGGDYMRDYLLSYQFTNNGSY TMHSTDAFGQFDWNPTEHFNVISGVRFDYFSQSNVRHISPHIGLMYKIANCSLRGSYSQG FRSPTLKEMYMNFDMANAMMIYGNPDLKSETSHNFSLSAEYTKNRYNFTVSGYYNLVNDR IDLTSFRDTDGRWAQLYINTEKVDIAGVDINASAKYPCGLGIRVSYGYLHEFMRDGQMKF SSTRPHSATARIEYGKSWDNYQFNLSLNGRALSKVETNKYVGNNPDEGTTGVTYPGYTMW NLTLTQGIWKGVSLNMAVNNIFNYVPSYYDANSPYTRGTNFSIGLALDVDRFFRK >gi|225935317|gb|ACGA01000075.1| GENE 31 42863 - 43546 524 227 aa, chain - ## HITS:1 COG:no KEGG:BT_0497 NR:ns ## KEGG: BT_0497 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 220 220 333 75.0 2e-90 MKRSRNAIFKYTIVLGAAMSLSACNGIFENIYDAPIETEMEIKENSFSQVKTVEYTEWAY INLSERTVTTVKIGEEYESQIPDKWDFAIHRYDIKTNEGAAYKTTYTSIDEFKATGKLPK AEDFVEDEWTTDKIAIDMSGMMDGNIIYTESYRNAVLSSWLDVNTATMPPVYTMSNQVFL 
IRLKDNTYAAIRFTNYMNARGLKGYIDFDFQYPLEFEDNNNETNQQE >gi|225935317|gb|ACGA01000075.1| GENE 32 43562 - 44458 586 298 aa, chain - ## HITS:1 COG:no KEGG:BT_0498 NR:ns ## KEGG: BT_0498 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 1 287 294 269 59.0 7e-71 MKILNFWMVALMALSLSSTFVSCSDNDDDEPNLAKEIAGAYKGYSVGACNMFSDYLLGDE STATITANEDGTINLVYKSGSGDFTLNNLKLSSNKSFSGSGEVAMSMGGSTGGSYDYTLE GSVDGSKVLSLKANVDIPVPMGKMEIDFIQGETPVGYYIKDTYEYQTNLSLSVTTPGGNT TESTEDCKVIIENPTGSTVDVTISGFNNTSGSMNYFEKFTVKGVNVTKANDGSYSLNIGS FEADTQKTTGEILAITGTSLSGTVKSDGTTTITVVFKPGSMPWDITADFTGSNTNSAQ >gi|225935317|gb|ACGA01000075.1| GENE 33 44655 - 45536 831 293 aa, chain + ## HITS:1 COG:no KEGG:BF2104 NR:ns ## KEGG: BF2104 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 289 15 304 306 341 61.0 2e-92 MKGGYSRKSVLTGITSFLIVLFTMPLGHALMIFMEHVLSSTALHYAAFTMGAVGLVMVII GVFAKGDTRQTLWGLFGGLLFWTGWVEFIYVYYAHRYEVQPLLNAAGEVVTKPEYLIMPS SFGFWVMFMLIYIFSIKSGCDFFTYLQKVFFRKSTTTIVVRPMTRHTSIVTFMELNLIMW TSYLVLLFCYDENFVGEHSPVTAIVAFGCLAGSFFMFKRLLKITQWGYAMRFSIATVVVF WTFVEVLGRWNIFHEIWVEPMAYTTEMITILLAFLALLVFLFYQSAKKKNSHN >gi|225935317|gb|ACGA01000075.1| GENE 34 45555 - 45935 443 126 aa, chain + ## HITS:1 COG:SPy1339 KEGG:ns NR:ns ## COG: SPy1339 COG5496 # Protein_GI_number: 15675277 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Streptococcus pyogenes M1 GAS # 15 124 11 120 121 87 46.0 6e-18 MLEKGLQFIKEVVVKPENLAVAMGSGDLPVLATPAMVAFMENAALSAVEDQLPEGNTTVG AMIQSTHLKPTALGDTVLVTATLLEVEGRKLTFSVVASDSQGKIGEGTHIRYVVDRNKFM EKLASR >gi|225935317|gb|ACGA01000075.1| GENE 35 46016 - 46918 705 300 aa, chain + ## HITS:1 COG:CC0303 KEGG:ns NR:ns ## COG: CC0303 COG1230 # Protein_GI_number: 16124558 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Caulobacter vibrioides # 18 287 75 345 361 245 49.0 1e-64 
MEPHHHHEHSHRLTSLNKAFIIGIALNITFVIVEFGVGFYYNSLGLLSDAGHNLGDVASL VLAMLAFRLEKVHPNSRYTYGYKKSTILVSLLNAVILLVAVGIIIAESIDKLFHPVSVDG SAIAWTAGVGVVINALTAWLFMKDKDKDLNVKGAYLHMAADTLVSVGVVASGIIITYTGW SIIDPIIGLGIAVIIIVSTWGLLHDSLRLSLDGVPVGIDAQKIQQLIMEQPGVENCHHLH IWALSTTETALTAHVVIDDVERMEEIKCSIKNKLEEAGIHHVTLEFEDKSISCETKNNCY >gi|225935317|gb|ACGA01000075.1| GENE 36 47019 - 50084 2216 1021 aa, chain + ## HITS:1 COG:no KEGG:BF3444 NR:ns ## KEGG: BF3444 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 1021 1 1016 1016 1636 79.0 0 MRAFIISFFVVIFISAISFQATYAQQKDIVLTGMVTDHQGEPLPGATVRIGNTQFGTVTD ANGRYQLRGRWKKGDMIIFSFIGMKDVREKYTGQNVQDAAMQQDAKSLDEVVVVARQNIN ELDIRAKSGVVQNVDMGRLNSKPMIDMSLALQGSVPGLIVTNTGDLGSKPEIRIRGNSSF REGDAANEPLYVMDGQVISSDAFMTLNPADIKEIKVLKDAVACALYGIKAANGVIEITSL RGNPDGRTTTSYSFNMGITTRGRRGVEMMDTDEKLELERRLQNRSTPGYRYSEDYYRKYF ANDPNLNQMIAEGQGILDSLRTIHTDWFDELIHLNTYQRHNLSVRGGSEKTSYYVSANYA QQGGRVPGNDTHRFTATMSLDQQLGRVGYLSLSANAGYSETDTPNGSSYSPTDLIYQLNP YETKTGKLVSFSNETSDYTYNDLLSQYNSKSTDKRGGVSGSLNLEPLKGLSIDAVAGIDM LLNEGMTLVPSTSISERNSGVDEAERGKLSKEKNVTTNVSSNVRITYNKIFAEKHDFTIG GNMDYYMTDADNISITGYGVGTQMSPSAINQSISGNRKPVVGSYKEKTAQLGLGLVMGYS FDNTYDLFATYKADASSILPESKRWNAAWAIGLGWTISQYPFLKDNNVISRLNLKASYGR MANLAGVSASSTIGTFSYSTDYYGNARLLQLLALYNTDLKAEQTASTDISLSVELFKRLT LDANIYRRETSDALLDVPVPLSNGFSTMKRNIGVLRNEGYELSASLKVLDTSDWRLSLRG SLAYNRNKVVDLYYADRLYASEEAIIPDYEVGKSYDMIYGLKSLGINPITGLPVFQGADG REIPATENPSKENIVALGHATPPYSGIFNLSFSYRDFDLDMDFYYVFGGVKSYSYSYVRS SDDANKNAIKGQLQNMWFEKGDEGKTYHSPFYSSSAIASLDYPNTETVGKSDYLKLSMVS LRYRVPHTFLEKNCNFIKYANVAFQASNLFMITPYKESDPETGSLAGTMQPVLTINLSLT F >gi|225935317|gb|ACGA01000075.1| GENE 37 50100 - 51542 1036 480 aa, chain + ## HITS:1 COG:no KEGG:BF3445 NR:ns ## KEGG: BF3445 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 11 479 14 481 482 580 61.0 1e-164 MIVKMINKSRIKTVIGSLFLTLCFTACSLNIPYENQFSDPDAITTPNTARELLAAAYKQL PSPEFDLSVLGDDFETTSWISRDASLNNLYKWQTQPIEDLAVSIWSDYYAAVAIANAVLE 
RVDGISVSTTEETKALQAVVSEAKLLKAYCYFNLLRLFAPDYEDGPEKDGIILKDQLELG FLYRSSIEECVTAIRELLKEALMVENNPSDVYWFSQYSGYYMWAELELYAHNYEQAAEYA QKVIDEKGGYDVLGETVYATLWSNSSCAERVFSLFTNSSYYTGINYDTHKGDYMTVNSSL ITLYADGDIRSGASVFTKEMTDDALGNTTMENCLGKYNKMNWEKIELQYINKYRVAGACF ILAEAYCRDGKGHDGQAVGVMNRYLEQRNATLLDTQLSGEPLLKAIFQEKWKEFVGEGER YFDLKRCRKGVLSDWNTSGTMATDKRVQADDYRWTFPIPRGEYLYNENVSQNEGWTKIEN >gi|225935317|gb|ACGA01000075.1| GENE 38 51571 - 52767 1134 398 aa, chain + ## HITS:1 COG:no KEGG:BF3446 NR:ns ## KEGG: BF3446 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 398 1 400 400 503 62.0 1e-141 MKKEMKKLFVYTMGMFLLSLVTSCGDDKEDTGYTGVNKIYLSAENPVIEEAEATPLTVNV DLTTACEQDVTLNFEVLDDEAGILKLVNNPVTIQAGQKKATFQVVSNQKNLLAVDTYFQI GISTIPVENMALNAVLSVRVKPDPQIPELSEEQLALIEGYKTKYNIDLTDWLGIVSCHTV VNSPADGYLTPFVQAFTKEYDGKTIITLSEEATAEMPVLKMVDNPMGLTEYLYWVFRAET VEDSEYWTESPASQTLMEQINWNKDTNESFSASLNGIKLKDITSSTATVDYIGEGTDSYG DPIKAVPFSYTFSAWTRQQELMKTDPTAQEIYEQGGRAEPDYYLCNTGIDEDGWGDPENF EESTGSIDFTAKKMTFKFLIDHTNAGGYTHVTVTYEKK >gi|225935317|gb|ACGA01000075.1| GENE 39 52754 - 55618 1930 954 aa, chain + ## HITS:1 COG:pqqL KEGG:ns NR:ns ## COG: pqqL COG0612 # Protein_GI_number: 16129453 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli K12 # 27 932 21 917 931 174 21.0 5e-43 MRRNNKNLYKYVSIAAILIAAMCVYIPGYAQSAPISLPKGTVEGYLPNGLHYLILKNAVP ASRVEFRLIMRVGSVQETENQKGCAHFLEHMAFGGTRYFPKRSLVSYLESKGVKYGIDIN AFTGYDRTIYMFAVPTDHGQEAVVDSSLLIIRDWLDGISFLPEKVENEKGIILEELRSYD LNDDFYQLKIGQGVFGNHIPLGTADDIRKVTPQVLKEYYNKWYTPSLATLVIVGDISPNE IEAKIKAGLSSLKKRVAKDFRIYPLEYAKGIQISEVRDSLRNHTKVELMIPHPCVVERTI NDAVKKETGRLLLRAITQRFRERHIKADLSDSWYLSDKNHLVLAVEGNDRSEILSIITTA VAELNCLIREGWDKEELADVRTQFCNQLSEGNPDDARSSIYFCDDFTDYVISGDRYMTAP AERKQLKEAMLHVQNNDLQLLLKEWLIYQKQAMLVACNSHPGLGMPLTKQEIAEAWAEGE RAACTPYTYVRQEEKKEVPVETPACLAIRPPFDATMIAHINVYNEMGVREVKLKNGIRLI MKPTQDADSTLYLTSFAPLGTSSLSDEEYPLLEGVGGYIDMGGIAKVDGQIFSEYLFQKE 
MSITVAMENHWHGFMGMSSTANAPEFFNLIYEKIFDPELKYEDFEEIRQGLMQNEGKETM LEKMLKRDSGRQLSARINELMGASITRPSIRSFAERLNLDSIAGFYKKLYTRPEGTTYVV CGSFNPDSLMRQFVSVFGRIPVSGNTVQYSYPNFELPDKTLIEGFPNDNETQTLFDYLLY GHYQPCQKNTLMLKLMRDVIRNRLISVLRERESLVYSPYISLFYEGIPWGTFYFDINASA DNRNMGQIDRLLKEILQRLKEEEVDIQELQTIKRSFLLAKREALNEKSPAAWRTTLVGLL KNGESLADFEQYEQCLDEITPVELRKAFKEWIDLDNYVLLYLSNNKLKNETTND >gi|225935317|gb|ACGA01000075.1| GENE 40 55599 - 56498 755 299 aa, chain + ## HITS:1 COG:VC2004 KEGG:ns NR:ns ## COG: VC2004 COG3016 # Protein_GI_number: 15642006 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated protein # Organism: Vibrio cholerae # 1 269 15 281 325 107 26.0 4e-23 MKQQMIRRLLVIFLGCFCLQMVTCAQVGVDKPAYVLYDNAGKAITYSQLIKQLGKYDVVF LGEMHNCPITHWMEFEITRSLYHIHKDKLMLGEEMMESDNQLILNEYLQRKISYDRFEEE ARLWPNYSTDYSPVVYFAKENKIPFIATNVPRRYANAVKNGGLEVLDSLSDEAKRYIAPL PISFNYDEKESEAAFSMMNMMGGQKSGDNYKLAQAQAVKDATMGWFIAHNMKDKFLHING SYHSDWKGGIIPYLLQYRPGTSVVTVTSVRQENTDKLDEENKGRADFYICVPEDMVTSY >gi|225935317|gb|ACGA01000075.1| GENE 41 56607 - 58205 1547 532 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 20 528 18 546 553 275 33.0 2e-73 MFTDLLHSSYFSLFLIVALGFMLGRIKIRGLSLDVSAVIFIALLFGHFGVIIPKELGNFG LVLFIFTIGIQAGPGFFDSFRSKGKTLILITMLIICSACLTAVGLKYAFDIDTPSIVGLI AGALTSTPGLAVAIDSTNSPLASIAYGIAYPFGVIGVILFVKLLPKIMRVDLDKEARRLE IERRGQFPELTTCIYRVTNSHVFNRNLMQINARGMTGAVISRLKHNDEISIPTAHTVLRE GDYIQAVGSEESLNQLAVLIGKREEGELPLDKTQEIESLLLTKKDMINKQLGDLNLQKNF GCTVTRVRRSGIDLSPSPDLALKFGDKLMVVGEKEGLKGVARLLGNNAKKLSDTDFFPIA MGIVLGVLFGKINISFSDSLSFSPGLTGGVLMVALVLSAIGKTGPIIWSMSGPANQLLRQ LGLLLFLAEVGTSAGKNLVATFQESGLLMFGVGAAITLVPMLVAAIVGRLVFKISLLDLL GTITGGMTSTPGLAAADSMVDSNIPSVAYATVYPIAMVFLILFIQIIASAVY >gi|225935317|gb|ACGA01000075.1| GENE 42 58252 - 58632 130 126 aa, chain + ## HITS:1 COG:no KEGG:BT_0501 NR:ns ## KEGG: BT_0501 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 126 2 110 110 148 82.0 6e-35 MPPQRKNEVKRLLLNITRYFLPILFVSYLVSFTFFAHVHVVNGVTIVHSHPFKKGAAHKH STVELLLIHFLSHLTADGAAVVFALSLFIPFLLYLLLGRSPYAHYHCPYHGVVGLRAPPA IRFSIL >gi|225935317|gb|ACGA01000075.1| GENE 43 58717 - 61038 1959 773 aa, chain + ## HITS:1 COG:ECs3047 KEGG:ns NR:ns ## COG: ECs3047 COG4771 # Protein_GI_number: 15832301 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli O157:H7 # 118 633 25 546 659 89 25.0 3e-17 MKKYIFSLVCLCCTLLPALEGQAHEYPNHPELRKSDANIVGHILDKNTKEHLPYITVALK GTTIGTVTDATGHYFLKNLPEGNFVLEVSSVGYKTVRRNVTLKKGCTLEEDFEIEEDAVA LDGVVVSANRNETTRRLAPTLVNVVDLKIFENTNSTTLAQGLSFQPGVRVESNCQNCGFQ QVRINGLDGPYTQILLDSRPIFSALSGVYGIEQIPASMIERVEVMRGGGSALFGSSAIAG TINIITKEPMRNSGMLSHTITGIGDGDAFDNSTALNASLVTDDQRAGLYIFGQNRHRSAY DHDGDGYSEIPKIHGQTIGFRSFLKTTTYSKLTFEYHHMEEFRRGGDLLNRPPHEANVAE QTEHSINGGGLKFDYFSPNEKHRFNVFASAQHINRDSYYGGGQDPNAYGNTTDLNWMAGS QYVYSFGKCIFMPADLTAGIEFNQDKLEDNMWGYHRTVDQKVNIGSAFFQNEWKNEHWGF LIGGRLDKHNLIDHVIFSPRANLRYNPTENINLRFSYSSGFRAPQAFDEDLHVENVGGNV AMVELADNLKEERSQSLSASADIYHRFGAFQVNFLVEGFYTKLSDVFALTDGEIVDGILT RTRYNAPGARVMGLTLEGKMAYLTKFQLQAGVTLQQSHYSEPHVWNPDAPAVKKMMRTPN TYGYFTATYTPIKPLSIALSGTYTGSMLVPHEPVPGFLEKPITVNTKDFFDIGLKAAYDF KLYKSMNLQVNAGVQNIFNAYQNDFDKGADRDSGYIYGPSLPRSFFAGVKISY >gi|225935317|gb|ACGA01000075.1| GENE 44 61187 - 61504 94 105 aa, chain + ## HITS:1 COG:no KEGG:BT_0503 NR:ns ## KEGG: BT_0503 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 9 112 113 153 78.0 2e-36 MKWFLPVLFISYMAGITLFTHSHVVNGVTIVHSHPFKKGSEHSHTTVEFQLIHLLSHVLV TDSGLIPTFSVAALSLLCILFVRPQIEQFYRSCPGVISLRAPPVA >gi|225935317|gb|ACGA01000075.1| GENE 45 61618 - 63957 2160 779 aa, chain + ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer 
membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 120 684 33 598 663 103 26.0 2e-21 MKKYLLVLVGLCCVLSYALAERPDYPELKKSDANIIGHVLDKKTKEHLPYITIALKGTTI GTVTDATGHYFLKNLPEGNFILEVSSVGYKTVTRNVTLKKGKTLEEDFEIEEDAIALDGV VVSANRSETTRRMAPTLVNVVDLKLFETTNSSTLSQGLNFQPGVRVETNCQNCGFQQVRI NGLDGPYTQILIDSRPVFSALSGVYGLEQIPASMIERVEVMRGGGSALFGSSAIAGTINI ITKEPLRNSGQLSHTITSLGGSSSFDNNTSLNVSLVTDDHRAGLYIFGQNRYRSGYDYDG DGFTELPKLKNQTVGFRSYLKTSTYSKLTFEYHHMQEFRRGGDMLNRPPHEAHIAEQLQH SIDGGSLKFDYFSPDEKNRLSVFASAANTDRDSYYGPGNDPLKAYGKTTDLTAMGGAQYV HTFDKCFFMPSDLTAGLEYNRDRLKDNMWGYNRHTDQTVNIYSAFLQNEWKNDRWGILIG GRLDKHNMVDNVIFSPRANLRFNPTQNINLRLSYSSGFRAPQAFDEDMHIENVGGTVAMI ERAKDLKEEKSQSFSASADMYHRFGAFQTNLLIEGFYTRLTDVFVLGEPYDRGDGILVKT RSNGPGAKVMGLTLEGKVAYLSILQIQAGLTLQRSRYDEPHKWHDDAPAEKKIFRTPDTY GYFTATYTPIKPLSIALSGTYTGRMLVQRMDITAENAELGEMPERKAEAIRTPRFFDLGV KLAYDFKLYKTVDLQLNGGVQNIFESYQKDFDRGANRDSGYIYGPSLPRSFFAGVKISY >gi|225935317|gb|ACGA01000075.1| GENE 46 64293 - 65207 558 304 aa, chain - ## HITS:1 COG:TM0876 KEGG:ns NR:ns ## COG: TM0876 COG0053 # Protein_GI_number: 15643638 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Thermotoga maritima # 1 301 1 302 306 226 37.0 6e-59 MSREKILITTSWISTIGNAILSASKIIIGLFAGSLAVVGDGIDSATDVVISIVMIFTARL INRPPSKKYVFGYEKAEGIATKILSLVIFYAGVQMLLSSTKSIFSDEVKEIPSAIAIYVT IFSIIGKLLLASYQYKQGKKINSSMLTANAINMRNDVVISTGVLLGLIFTFIFKLPILDS ITGLIISLFIIKSSISIFIDSNVELMDGVKDVNVYNKIFEAVEKVPGASNPHRVRSRMIG NLYMITLDIEVNPEITITQAHEIADSVEKSIINSIDNVYDILVHVEPAGKCQTGEKFGVD KDMV >gi|225935317|gb|ACGA01000075.1| GENE 47 65392 - 67698 1982 768 aa, chain + ## HITS:1 COG:XF0847 KEGG:ns NR:ns ## COG: XF0847 COG3525 # Protein_GI_number: 15837449 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Xylella fastidiosa 9a5c # 28 615 87 671 841 313 31.0 1e-84 MFKKLSSSLLIVSACVFSSCTSTVKQEIAILPTPVSLTEQSGSFVLKDGMKIGVSDQSLF 
PAAGYLQEILRNVISSSVEVTTDKSQVDMYFQLKDTVGKPGSYKLESTPEYIRVEATDYS GIISAITTIRQLLPATIEVQGEKQNYSIPVVQIEDAPRFEWRGFMLDASRHFWNKKEVKH VLDLMSLYKLNKFHWHLSDDQGWRIEIEKYPLLTEKGAWRKFNTQDRTCMARAKEEDNTD FLIPEDKIRIVEGDTLYGGYYTHDDIKEIVAYATQRGIDVIPEIDMPGHFLAAIGQYPEL ACDGLIGWGETFSSPICPGKDTTLEFCQNVFKEVFELFPYEYVHMGGDEVEKANWKKCPL CQKRIRTEKLGSVEELQAWFVRDMEKFFLANGKKLIGWDEVVSDGLSSDAAITWWRSWAK DALPTATAQKQKVIACPNEYFYFDYAQDQNSVKKILAYDPCADERLSPEQKKYIWGVQAN LWAEWIPTMKRIEYLIVPRMIALSEIAWAEPAAKPGLEEFYRQLVPQFKRMDVMRVNYRV PDLQGFYKVNAFIDETTVELTCPLPGTEIRYTTDGSMPTKESALYNGALEVGKTTDFAFR TFRPDGSPSDVVRTKYVKAPYAEAVTAPAALQSGLKAVWHDFRGNLCADMDAAPVKGEYV VESVSIPEEVKGNIGLVLTGYLEVPADGIYTFALLSDDGSTLMLDGELLGDNDGAHSPVE IIVQKALKAGLHPIEVRYFDCNGGVLQMELVNEKGEKEVLPSAWLKHE >gi|225935317|gb|ACGA01000075.1| GENE 48 68010 - 69782 1458 590 aa, chain + ## HITS:1 COG:SP1434 KEGG:ns NR:ns ## COG: SP1434 COG1132 # Protein_GI_number: 15901286 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Streptococcus pneumoniae TIGR4 # 1 586 3 583 586 397 38.0 1e-110 MSTLKKLQNYMGKRKVLLPAAILLSALSALAGMLPYILIWLIVRELLEHGEITSSGNVVM YAWWAAGMAVASIVLYFAALMSSHLAAFRVESNLRKEAMRQIVRMPLGFFDINTSGRIRK IIDDNAGVTHSFLAHQLPDLAATFLVPLVAAILIFVFDWILGLACIVPVIIAMLVMGFMM NAEGRQFMKSYMTSLEEMNTEAVEYVRGIPVVKVFQQTIYSFKNFHRCIMNYNKMVFGYT RMWEKPMSLYTVIINGFVFFLAPLAILLIGYSGNYASVLLNFFLFVLITPVFSQSIMKSM YLNQALGQASEAIGRLENLVAYEHLTVVEHPQPVKEFSIQFEKVSFSYPGANQKAVDDVS FTIPQGNTVALVGASGGGKTTIARLVPRFWEATEGKVLIGGINVREIAPEELMKYISFVF QSTKLFKTSLLENIKYGNPDATMEEVERAVDMAQCREIINKLPLGLNTKIGTEGTYLSGG EQQRIVLARAILKNAPIIVLDEATAFADPENEHLIQQALKELTKGKTVLMIAHRLSSITD ADNILVIDKGKIAEQGTHANLLGKQGIYYRMWNEYQQSVRWTIGKEVSND >gi|225935317|gb|ACGA01000075.1| GENE 49 69775 - 71511 228 578 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 338 559 279 507 563 92 30 9e-18 MIEVIKRRFALSTKGAKDFCKGVFFTTLLDIVLMLPAVFVFLFLEEYLRPVFQPSASVTH GILYYSILGIVFMIVMYIFAVLQYRSTYTTVYDESANRRISLAEKLRKLPLAFFGEKNLS DLTATIMDDCTDLEHTFSHAVPQLFASIISILLITVGMAFYNWQLTIALFWVVPLAAAIL LFSKKEIQKSNESNYLNKRMVTEHIQEGLDTIQEIKSYNQERDYLEKLDASIDTYEKVLT RNELVLGMLVNGSQSVLKLGLASVIIVGANLLASGTVDLFTYLIFMVIGSRVYAPVSEVM NNIAALFYLDVRINRMNEMEALPVQNGTTEFTPQGYDIEFRQVDFAYEQGKQILNNLSFI ARQGEKTALVGPSGSGKSTAARLAARFWDIQSGKITLGGQDISRIDPETLLKNYSVVFQD VVLFNASIMDNIRIGKRDATDEEVRRVARLAQCDEFVTKMPQGYQTIIGENGETLSGGER QRISIARALLKDAPIVLLDEATASLDVENETKIQAGISELVRNKTVLIIAHRMRTVANAD KIVVLENGSVAEMGTPEELKKKNGIFARMVNRQVTNMN >gi|225935317|gb|ACGA01000075.1| GENE 50 71529 - 72137 619 202 aa, chain + ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 139 1 136 200 70 30.0 3e-12 MQFLKGDIQEGILKAAEEVFLEKGYKDASMREIASRAGVTVSNIYHYFTNKDEIFRTILK PVLNDLYAMIYNHDADQMTIDVFMDSDYQKMSVREYIRLVSEHRDRLRLLLFQAQGSVLE NFRSEYTDLMTRTISVFFQGMKQKYPHINIAITNFFIHLNTVWLFALLEELVLHPVKKEE MEKFIAEYIAFETAGWKELMNA >gi|225935317|gb|ACGA01000075.1| GENE 51 72456 - 74555 970 699 aa, chain - ## HITS:1 COG:Cj1013c_2 KEGG:ns NR:ns ## COG: Cj1013c_2 COG0755 # Protein_GI_number: 15792340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Campylobacter jejuni # 465 697 22 252 287 182 43.0 3e-45 MKRLLIILYICLVGLLAVTTFVEQAYGTDFVERNIYHTCWFCCLWGTIATIALVALIRRA LWRRFPILLFHGSLLVILVGAMITFIGSKKGYVHLLPGATIDSFQESESGRKADLPFTIQ LDSFRIAYYPGTEAPADYISYITYSLPGQKNVLLHEQISMNRIFTSQGFRFYQSSFDDDG KGSWLTVNYDPWGTGVTYAGYILLGLSMIWLLFSRSSDFRRLLNHPLLKKGGVFILFIFC LAGNMQAQKKLLPALKRTQADSLAQEQVIYHDRVVPFNTLARDFIQKLTGEASYKGLTPE QVIGGWLLYPEVWRNEPLIYIKNTELQHLLNLQTPYARLTDLFDGSVYRLREHWQREQGQ QSKLAKAIQETDEKVGLILMLEKGTFIHPLPTDGSVQPLSELEVKAELLYNRIPFSKILF 
MINLSLGVLSFLLLLHNSLQRNILSPKAKTISRTAGTFFSVALYLAFIFHLAGYCLRWYI GGRIPLSNGYETMQFMALCILLIACLLHRRFSFILPFGFLLSGFALLVSYLGQMNPQITP LMPVLVSPWLSIHVSLIMMSYALLAFIMLNGILALCLRKKESENNVSGNDAIQDNRIEQL TLVSRLLLYPATFFLGAGIFLGAVWANVSWGRYWAWDPKEVWALITFLVYGVAFHSQSLR IFRKPLFFHIYMILAFLTVLMTYFGVNYVLGGMHSYANA >gi|225935317|gb|ACGA01000075.1| GENE 52 74576 - 75730 1062 384 aa, chain - ## HITS:1 COG:PAB0825 KEGG:ns NR:ns ## COG: PAB0825 COG0251 # Protein_GI_number: 14521450 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus abyssi # 278 375 28 127 127 60 32.0 5e-09 MNNKKQQNIQSEKTLTEIFKYNAGGEASEYHIIIHSTKPEDTYEEQLNAVLNTYDYLLTQ ELKGAVAILKRYFLSDAANQADTLLALTTEGSDCALSIVEQPPLDGTKIALWVYLLTDVQ TQVLHNGLFEVKHSAYRHFWGGSAFNRAANSEYQTRLLLNDYVMQLMEQGCKLADNCIRT WFFVQNVDVNYAGVVKARNEVFVTQNLTEKTHYISSTGIDGRHADPKVLVQMDTYAVAGL QPEQIHFLYAPTHLNPTYEYGVSFERGTYVDYSDRRQVFISGTASINNKGEVVYPGNIRK QTERMWENVETLLKEADCTFEDLGQMIVYLRDIADYAVVKAMYDKRFPHTPKVFVHAPVC RPGWLIEMECMGVKECKNKEYAPF >gi|225935317|gb|ACGA01000075.1| GENE 53 75752 - 76465 447 237 aa, chain - ## HITS:1 COG:no KEGG:BT_2476 NR:ns ## KEGG: BT_2476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 238 244 239 50.0 8e-62 MKKYILFGGYLLLLAYITSCDDGRIYEKTETLSEEGRTLKMSGKINGISKWPDGYSVVVA GFSDESEYAVVTKTIPAVEDDEIQVTMTGVSDKVTTIELCVINKLRKRVISFQSMDDLTA VDDTILMDVGTVNVGMYHGIQEKVFNTTCAHCHGGGSSAAANLYLTEGKSYEALVNRPSK KVDGMLLVKPGSAQESVLHTLLNTTISSTWGYDHSKEIVSSPILTLIDDWINNGAQE >gi|225935317|gb|ACGA01000075.1| GENE 54 76489 - 77703 866 404 aa, chain - ## HITS:1 COG:no KEGG:BT_2477 NR:ns ## KEGG: BT_2477 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 404 1 403 403 711 87.0 0 MKRFIIIAMLTAIMPLAAMAQHEEDTENGVVSLAGREGFTIETKKGDFVFKPYLLVQTSA NFNWYDDEGLDKAYNQDNIANSGFSVPYAVLGFTGKAFDKVSFNLSINAAASGGALLQQA WFDVQLKKQFAIRVGKFKTPFSHAYLTTLGETLLPSLPVSLTAPVILPYSLNAVTPNIGT 
GFDLGVEIHGLVADKFGYEIGLFNGTGAAVNTATKTFSDDWHIPSLLYAGRFTYMPKGVM PATQGNPNRLNEDKIMFGISTSINVESENESTNDYRAGLEFAMLKNKLYLGAEVYYMNVG FTKRQKISESYNYLGGYVQGGYFVAPRLQAAARYDFFNRNGMDTNGFLNMPAVGMNYFFK NCNLKLQAMYQYIARKGHDTQLDRDNDDLGLAVHSASILLQYTF >gi|225935317|gb|ACGA01000075.1| GENE 55 77731 - 79161 1224 476 aa, chain - ## HITS:1 COG:PA4371 KEGG:ns NR:ns ## COG: PA4371 COG3488 # Protein_GI_number: 15599567 # Func_class: C Energy production and conversion # Function: Predicted thiol oxidoreductase # Organism: Pseudomonas aeruginosa # 35 476 34 473 473 251 36.0 3e-66 MNHFLKYSFSFFCLLACSACDDDGIDILDIEIPEGYALSAGTSTIFLNSSVAYDTPADWI TGAYDVRFTRGDRLYDDVRTSNNGHGGGLGPVYAGYSCGSCHRNAGRTKPSLWTEGGSGS YGFSSMLVYISRKNGAFFQDYGRVLHDQAIYGVQPEGKLSVEYTYETFSFPDGEAYTLCK PNYTISEWYAEEIKPEDLFCTVRIPLRHVGMGQMMALDPVEIEALAAKSNYPEYGISGRC NYITERGVRSLGLSGNKAQHADLTVELGFSSDMGVTNSRYPEEICEGQIQVNQGSMMGLS YDQLDVSTEEMENVDLYMQSLGVPARRNVNDPQVIKGEQNFYKAKCHLCHVTTLHTKTRG SVLLNNTQLPWLGGQTIHPYSDYLLHDMGSEIMGVGLNDNYVSGLARGNEWRTTPLWGIG LQEKVNGHTYFLHDGRARNLTEAIMWHGGEGEASKNLFKNMSKEDRDALIAFLNSL >gi|225935317|gb|ACGA01000075.1| GENE 56 79272 - 80423 892 383 aa, chain - ## HITS:1 COG:no KEGG:BT_2479 NR:ns ## KEGG: BT_2479 # Name: not_defined # Def: iron-regulated protein A precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 383 3 372 372 365 56.0 2e-99 MKTKFFYVAALILGLAFTTTSCSSDDDNPTVDPANIDYTPENASSWHNYMRNVAALLKTD ATNLYNAWNSSYKGGESYASLFKAHSASPYASALSCVEEIVDKCAEIANEVGTAKIGDPY NLYKAGNTEEALYAVESWYSWHSRDDYTNNIYSIRNAYYGSLDGNINANSLSTVIAGVNS SLDTKIKNAIQKAAKAIQDIPQPFRNHIPSNETVAAMDACAELESILKNDLKSYIANNSN NINTDAVLNPVVTQYVDAVVVPTYKSLKEKNDALYNAVIALADNPSNSAFETACDAWITA REPWEKSEAFLFGPVDEMGLDPNMDSWPLDQNAIVQILNSQSWSDLEWSEGDDEAAVESA QNVRGFHTLEFLLYKNGEPRKVQ >gi|225935317|gb|ACGA01000075.1| GENE 57 80453 - 81793 912 446 aa, chain - ## HITS:1 COG:no KEGG:BT_2480 NR:ns ## KEGG: BT_2480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 446 1 456 456 595 66.0 1e-168 
MRKITAIVILSLLPSLTIRAQETKEISKSSNSSSTEEYTKFRFGGYGEMVANFKDYGINR FYGGNDGNPKKNRNTISIPRFVLAFDYKFNSKWILGAEIEFESGGTGTAFELENTENGEY ETEVEKGGEVAIEQFHITRLIHRSLNVRAGHIIVPVGLTNAHHEPINFFGTSRPEGETSL LPSTWHENGLEIFGSFGKGYASFDYQAMVVAGLNPNGFDRNTWVGSGKQGIFEEDNFTSP AYVFRLDYKGVPNLRVGASFYYCADAGANSDKEQTYASYGKIPIRIFTADAQYRNKYVTA RGNILYGNLGNSLGVSQANVKLSNKSPYSRLAPVAKNAVSYAAEAGINIRSVFGGNKKIP VIYPFARYEYYNPQEKGEKGQTMEKRCQVSMWTAGLNWYALPNLVIKADYATRQIGTNKV FGVSKSYNSENEFSIGIAYIGWFIKK >gi|225935317|gb|ACGA01000075.1| GENE 58 82184 - 82546 336 120 aa, chain - ## HITS:1 COG:no KEGG:BT_0516 NR:ns ## KEGG: BT_0516 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 222 95.0 3e-57 MATERIIPGEIRIFLNHIYEFKKGVRNMVLYTMSKEHEEFAIRRLKNQKISYMIQEVGTN KINLFFGKAECMEAMRHIIIRPLNQLTAEEDFILGAMLGYDLCQQCKRYCSKKEGIKMAV >gi|225935317|gb|ACGA01000075.1| GENE 59 82580 - 83086 548 168 aa, chain - ## HITS:1 COG:Cj1382c KEGG:ns NR:ns ## COG: Cj1382c COG0716 # Protein_GI_number: 15792705 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Campylobacter jejuni # 4 164 3 159 163 143 51.0 2e-34 MNKIGVFYGSTTGTTEDLARRIAEKLDVPSADVFDVSKLTEALVNEYDVLVLGSSTWGAG ELQDDWYDGVKVLKKCDLSHKSVALFGCGDSDSYSDTFCDAIGILYEDLKDTHCKFCGAT DTAGYTFDSSIAVVDGKFVGLPLDEVNEDSKTDERISAWAEQVKQEIS >gi|225935317|gb|ACGA01000075.1| GENE 60 83218 - 84393 1118 391 aa, chain - ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 4 391 5 391 391 537 64.0 1e-152 MANELELKYGCNPNQKPARIFMKEGELPIEVLNGRPGYINLLDAFNSWQLVKELKEATGL PAAASFKHVSPAGAAVAVEMSDTLKKIYFVDDMKLSPLATAYARARGADRMSSYGDFIAL SDTCDEETARIINREVSDGVIAPDYTPEALEILKNKRKGTYNVIKIDPAYRPAPIEHKDV FGVTFEQGRNELKIDESLLKEMPTQNKEIPADAKRDLIIALITLKYTQSNSVCYAKDGQA 
IGIGAGQQSRIHCTRLAGNKADIWYLRQHPKVMNLPWIEKIRRADRDNTIDVYISEDYDD VLADGIWQQFFTEKPEILTREEKRAWLDTMTGVALGSDAFFPFGDNIERAHKSGVSYIAQ PGGSVRDDHVIGTCDKYNMAMAFTGIRLFHH >gi|225935317|gb|ACGA01000075.1| GENE 61 84602 - 85306 618 234 aa, chain + ## HITS:1 COG:no KEGG:BT_0519 NR:ns ## KEGG: BT_0519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 234 234 446 95.0 1e-124 MDNLQKYKPTDKMIDLISDNYSLLQVMSRFGLSLGFGDKTVKEVCELNGVDCRTFLIVVN FMAEGFSRLDGDKDDISIPALIDYLRQAHIYFLDFSLPAIRRKLIEAIDCSQDDVAFLIL KFFDEYTREVRKHMDYEEKTVFKYVDSLIKGVAPKNYQISTFSKHHDQVGEKLTELKNII IKYCPAKANENLLNAALFDIYACEAGLESHCKVEDYIFVPAILNLERRIRENEK >gi|225935317|gb|ACGA01000075.1| GENE 62 85296 - 85907 503 203 aa, chain + ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 135 194 149 208 213 60 55.0 2e-09 MKNNEVVRIAIAETSVIIRGGLTAALKRLSNVKVQPIELLSVEALHDCVRTQCPEMLIVN PTFGDYFDVAKFREEISGKRIRLIALVTSFVDASLLGKYDESISIFDDLETLSKKIAGLL NVVSEEEEMDNQDTLSQREKEIVICVVKGMTNKEIAEKLFLSIHTVITHRRNISKKLQIH SAAGLTIYAIVNKLVALSDVKDL >gi|225935317|gb|ACGA01000075.1| GENE 63 86451 - 87098 429 215 aa, chain - ## HITS:1 COG:SMb21427 KEGG:ns NR:ns ## COG: SMb21427 COG0110 # Protein_GI_number: 16265003 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Sinorhizobium meliloti # 65 191 24 156 162 93 40.0 3e-19 MEKKTLKQRIKENPGLKQAVHRFIMHPVKTRPNWWIRLFDFIYLKRGKGSVIYRSVRKDL PPFNRFSLGKYSVVEDFSCLNNAVGDLTIGDYTRIGLRNTIIGPVNIGNHVNLAQNVTVT GLNHNYQDAEKMIDEQGVSTLPVVIEDDVWVGANSVILPGVTLGKHCVVAAGSVVSHSVP PYSICAGCPARIIKTYDFETKEWKKVEKTPATNHK >gi|225935317|gb|ACGA01000075.1| GENE 64 87050 - 87997 306 315 aa, chain - ## HITS:1 COG:no KEGG:BT_0522 NR:ns ## KEGG: BT_0522 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 305 1 304 304 442 70.0 1e-123 MIDSYIYIIDDLVFFCTGLLLLYLFVMAIASHFKHITYPKAQKEYGCAILVPEGSILPDM YKEEAYEFITYSDLHQTINSLDQERYDLVLFLSNTACALSPQFLNKIYNAYDAGVQAIQL HTIVENRKGICNRFRAIREEIKNSLCRAGNTQFGLSSNLLGTNMAIDLKWLQKNMKSSKT NIERKLFRQNIYIDYLPDVIVYCQSAPACPYRKRIRKTTSYLLPSIFEGNWSFCNRIVQQ LTPSPLKLCIFVSIWTSLITVYNWTLSFGWWIALFGLLITYSLAIPDYLVEDKKKKKHSI WRRKHLNSELKKTPA >gi|225935317|gb|ACGA01000075.1| GENE 65 88009 - 89166 790 385 aa, chain - ## HITS:1 COG:all4420 KEGG:ns NR:ns ## COG: all4420 COG2148 # Protein_GI_number: 17231912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. PCC 7120 # 142 382 250 441 445 141 33.0 3e-33 MQYFVYIGRDSKTIELLSRLSIGVFYAAPNCSKAVKVLEKIREKYDAALFFEQVNISKDI ADIQYMRKKYPGLYMVLVIDSLSKEEASEYLKAGINNTIKYETSQEALKDLSTFLKRRKD QKIKALQLKAQNINAFRLPLWKRTFDIFFSGMAILCLSPLLIFTALAIRIESKGPIIYKS KRVGSNYQIFDFLKFRSMYTDADKHLKDFNALNQYQQEDEDIWGEEPEAEVNEEIDEEEI LLISDDFVISEEDYINKKSKEKSNAFVKLENDPRITKIGRIIRKYSIDELPQLINILKGD MSIVGNRPLPLYEAELLTSDEHIDRFMGPAGLTGLWQVEKRGEAGKLSAEERKQLDITYA KTFSFWLDIKIILKTVTAFIQKENV >gi|225935317|gb|ACGA01000075.1| GENE 66 89184 - 89549 303 121 aa, chain - ## HITS:1 COG:ECs0449 KEGG:ns NR:ns ## COG: ECs0449 COG0745 # Protein_GI_number: 15829703 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 120 1 120 229 77 37.0 7e-15 MKKKILLVDDKSTIGKVAGVYLGKEYDFTYLEDPIKAIEWLNEGNVPDLIISDIRMPLMM GDEFLRYMKNNELFKSIPIVMLSSEESTTERIRLLEEGAEDYILKPFNPLELKIRIKKII D >gi|225935317|gb|ACGA01000075.1| GENE 67 89784 - 92168 1838 794 aa, chain - ## HITS:1 COG:no KEGG:BT_0525 NR:ns ## KEGG: BT_0525 # Name: not_defined # Def: outer membrane protein, function unknown # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 794 1 816 820 515 42.0 1e-144 
MMKKTIQFVPIIAIAMAGTMSSCVDSGKDLYDPSYETPNPMGDGFAAPDDIDWNMITTKN VSVEVKDEEGGLFAYLVEIYAEDPLTNESASVLAARTANKENNFKFTAAVSLLPTQKGIY VKQTDPKGREQVYQFDVPENSDNITCKLYYAESAAQNRALMSRGVATRSLAFEKPDYSSI PADAKEVTEMTGTTLLRNANYKITSDYNGTFKFDGYDGDIATRVYVDAQWTIPATFQFQN GIEIIVMNNAKIKASGTMTFIRNSMLTIMETGEVNADDVSFTNGAPAAFRNWGTLTVANK MTLHSGATLYNEGTITSKDISINSNTKIVNDNKIELKGELNLPSNFSLENNGEIYGEALI ANSDAVATNNNIMRFTTISLTNTTFNNACSMEATTSFYANGATFNFTQGYLKAPKMEFVN GTVNLSNGSMLDATVSIYMNTGHAKFYGKGENTSMIKSPVITGQGFTYDGNLVIECDNHV EKSPYWNNFYVQNGAYFTKMGESKVTIEVCTGKKNNGNEGGDPEDPKFPIIMDDNRNYAY LFEDQWPLYGDYDMNDLVLIIKERKISINKSNKAEEFTLSLDLSAAGATKSIGAAIMLDG VPASAITQPVEFSDNSLFKGFNVNSNLIENGQDYAVIPLFDDAHKALGRDRYEQINTIAG HSANTSPKNISFTIKFSNPISVDELNINKLNVFIFVEGNRNQRKEIHIVGYQPTKLANTD LFGGNNDDSSTSRKRYYISKDNLAWGIMVPTDFKWPLEYVNIKSAYSLFESWVTSGGTKN EEWWKTFDSSRVYK >gi|225935317|gb|ACGA01000075.1| GENE 68 92410 - 93450 794 346 aa, chain - ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 4 346 2 337 338 292 46.0 6e-79 MRAETPSVLLIYTGGTIGMIENPETGALENFNFDHLLKHVPELKRFNYRISSYQFDPPLD SSDMEPAYWAKLVKIINYNYDYFDGFVILHGTDTMAYTASALSFMLENLSKPVILTGSQL PIGTLRTDGKENLITAIEIAAAKNPDGTAIVPEVCIFFENHLMRGNRTTKINAENFNAFR SFNYPPLARVGIHIKYEPNLIRKPDPTKPLKPHYLFDTNVVILTLFPGIQESIVTSLLHV PGLKAVVMKTFGSGNAPQKEWFIRQLKEATDRGIIIVNITQCASGAVEMGRYETGMHLLE AGVISGYDSTPECAITKLMFLLGHGLPNKDIRYKMNSCLIGEITKS >gi|225935317|gb|ACGA01000075.1| GENE 69 93659 - 94015 241 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175339|ref|ZP_05761751.1| ## NR: gi|260175339|ref|ZP_05761751.1| hypothetical protein BacD2_26024 [Bacteroides sp. 
D2] # 1 118 1 118 118 222 100.0 7e-57 MKNDRYFLLEICFGADDNLSVMMYQIGGVQLVGIYVLLLYLCAKNDYSKSVVNSMQKPVE KSAVEKTSETPMDKGRKPNETQESRVEEKYYYCCKQQQQYWRGKEDCCCCCSDFGFGE >gi|225935317|gb|ACGA01000075.1| GENE 70 94200 - 94970 334 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 240 1 242 263 133 35 5e-30 MNRINQLFNSNKKDILSIYFCAGTPTLDGTADVIRTLEKHGVSMIEVGIPFSDPMADGIV IQNAATQALRNGMSLKLLFEQLRDIRKDVKIPLVLMGYLNPIMQFGFENFCRKCVECGID GVIIPDLPFRDYQEHYRIIAERYNIRVIMLITPETSEERVREIDTHTDGFIYMVSSAATT GAQQDFNEQKRAYFKKIEDMHLNNPLMVGFGISNKATFQAACEHASGAIIGSKFVTLLEE EKDPEKAITRLKEALK >gi|225935317|gb|ACGA01000075.1| GENE 71 94989 - 95612 573 207 aa, chain - ## HITS:1 COG:TM0139 KEGG:ns NR:ns ## COG: TM0139 COG0135 # Protein_GI_number: 15642913 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Thermotoga maritima # 7 206 4 203 205 120 38.0 1e-27 MINGKIIKVCGMREAENIQDVESIEGIDMLGFIFYPKSPRYVYELPAYLPIHARRVGVFV NEDKQTISMYADRFGLNYVQLHGNESPEYCRSLHSTGLKIIKAFSVDRPKDLRKVYDYEK VCDLFLFDTKCEQYGGSGNQFDWSILDMYNGHVPFLLSGGINSYSANALKEFKHPRLAGY DLNSRFELKPGEKDPERIRTFLNELKS >gi|225935317|gb|ACGA01000075.1| GENE 72 95639 - 96421 697 260 aa, chain - ## HITS:1 COG:XF0213 KEGG:ns NR:ns ## COG: XF0213 COG0134 # Protein_GI_number: 15836818 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Xylella fastidiosa 9a5c # 1 256 1 261 264 197 43.0 2e-50 MKDILSEIIANKRFEVDLQKQAISIEQLQEGISEVPTSRSMKQALASSASGIIAEFKRRS PSKGWIKEEACPEEIVPSYAAAGASALSILTDEKFFGGSLKDIRTARPLVEIPILRKDFI IDEYQLYQAKIVGADAVLLIAAALEPEKCNELAEKAHELGLEVLLEIHSSEELIYIDKKI DMIGINNRNLGTFFTDVENSFRLAGQLPQDAVLVSESGISDPEIVKRLRAAGFRGFLIGE TFMKTQQPGETLQNFLQAIQ >gi|225935317|gb|ACGA01000075.1| GENE 73 96470 - 97465 989 331 aa, chain - ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid 
transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 329 2 332 336 209 35.0 6e-54 MKQILYKLFEHQYLGRDEARTILQNIAQGKYNDVQVASLITVFLMRNISVEELCGFRDAL LEMRIPVDLSEFAPIDIVGTGGDGKNTFNISTASCFTVAGAGFPVVKHGNYGATSVSGAS NVMEQHGVKFTSDVDQLRRSMEKCNLAYLHAPLFNPALKAVAPVRKGLAVRTFFNMLGPL VNPVLPAYQLLGVYNLPLLRLYTYTYQESKTKFAVVHSLDGYDEISLTNEFKVATSDHEK IYAPESLGFSRYKETDLDGGQTPEDAAKIFDQIMNNTATEAQKNVVVVNSAFAIHVICPE KTIEECIALAKESLESGRALATLKKFIELNS >gi|225935317|gb|ACGA01000075.1| GENE 74 97470 - 98036 403 188 aa, chain - ## HITS:1 COG:PA0649 KEGG:ns NR:ns ## COG: PA0649 COG0512 # Protein_GI_number: 15595846 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Pseudomonas aeruginosa # 3 186 2 190 201 189 49.0 3e-48 MKILLLDNYDSFTYNLLHAVKELGTTDVEVVRNDQIELDEVDRFDKIILSPGPGIPEEAG LLLPIIKRYAPTKSILGVCLGHQAIGEAFGARLENLKEVYHGVQTPITILHQDLLFEGLG KEIPVGRYHSWVVSRKVFPDCLEITAESQEGQIMAIRHKTYNVHGIQFHPESVLTPQGKE IIKNFLND >gi|225935317|gb|ACGA01000075.1| GENE 75 98082 - 99488 1319 468 aa, chain - ## HITS:1 COG:TM0142 KEGG:ns NR:ns ## COG: TM0142 COG0147 # Protein_GI_number: 15642916 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Thermotoga maritima # 7 467 4 457 461 275 37.0 1e-73 MKTFNYTTHSKQVLGDMHTPVSIYLKVRDMYPQSALMESSDYHAGENSLSFIALCPLASI GINGGIVTANYPDNSRTEEPLTKTFHVEKAMNRFINQFQVTGDNKNVCGLYGYTTFNAVK YFEHIPVKESHDEQNDAPDLLYILYKYVIVFNHFKNELTLVEMLSEGEESGLPELEAAIE NRNYASYNFSVTGPVTSPITDEEHKANVRKGIAHCMRGDVFQIVLSRRFIQPYAGDDFKV YRALRSINPSPYLFYFDFGGYRIFGSSPETHCKIEDGRAYIDPIAGTTRRTGDVVKDREL TEALLADPKENAEHVMLVDLARNDLSRNCHDVRVLFYKEPQYYSHVIHLVSRVSGVLNEG ADKIKTFIDTFPAGTLSGAPKVRAMQLISEIEPHNRGAYGGCIGFIGLNGELNQAITIRT FVSRNNELWFQAGGGIVARSQDEYELQEVNNKLGALKKAIDLAVKLKN >gi|225935317|gb|ACGA01000075.1| GENE 76 99537 - 100724 1265 395 aa, chain - 
## HITS:1 COG:TM0138 KEGG:ns NR:ns ## COG: TM0138 COG0133 # Protein_GI_number: 15642912 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Thermotoga maritima # 10 390 3 380 389 449 60.0 1e-126 MKSFLVDQDGYYGEFGGAYVPEILHKCVEELTNKYLEVIESEEFKKEFEQLLRDYVGRPS PLYLAKRLSEKYGCKLYLKREDLNHTGAHKINNTIGQILLARRMGKKRIIAETGAGQHGV ATATVCALMDMECIVYMGKTDVERQHINVEKMKMLGATVIPVTSGNMTLKDATNEAIRDW CCHPADTYYIIGSTVGPHPYPDMVARLQSVISEEIKKQLQEKEGRDYPDYLIACVGGGSN AAGTIYHYINDERVGIILAEAGGKGIETGMTAATIQLGKMGIIHGARTYVIQNEDGQIEE PYSISAGLDYPGIGPIHANLAAQRRANVLAINDDEAIEAAYELTKLEGIIPALESAHALG ALRKLKFKPEDIVVLTVSGRGDKDIETYLSFNEQQ >gi|225935317|gb|ACGA01000075.1| GENE 77 101085 - 101225 152 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQRKVLVGIAIAIFIILLLYWLLVAEDMKPWLSAMVPSVAQQFFA >gi|225935317|gb|ACGA01000075.1| GENE 78 101305 - 102450 1157 381 aa, chain + ## HITS:1 COG:alr4566 KEGG:ns NR:ns ## COG: alr4566 COG1979 # Protein_GI_number: 17232058 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Nostoc sp. 
PCC 7120 # 1 377 1 380 384 422 54.0 1e-118 MENFIFQNPVKLIMGKGMIARLAKEIPSDKRIMITFGGGSVKKNGVYDQVKEALKNHFTV EFWGIEPNPAIETLRKAIALGKEEKVDYLLAVGGGSVIDGTKLISAGILYDGDAWDLVLA GRPVTHTVPLATVLTLPATGSEMNNGAVISRRETKEKYPFYANYPIFSILDPEVTFTLPP HQVACGLADTFVHVMEQYMTTPGQSRIMDRWAEGILQTLVEIAPKIRENQHDYQLMADFM LSATMALNGFIAMGVSQDWATHMIGHEITALHGLTHGHTLAIVLPATLQVLHEEKGDKLL QYGERVWGITSGTREERIDEAICHTEEFFRSLGLTTRLHEENIGQDTILEIERRFNERGA KYGENGNVTGAVARRILETAL >gi|225935317|gb|ACGA01000075.1| GENE 79 102527 - 104152 1105 541 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 279 533 11 270 280 184 38.0 3e-46 MANNYLLNYWINEVHWGYNYLLVVILLLVISILLYRIRKLQKTIKKTNHSYRFSFDILDN LPFPIFVKDITNDFRYYYWNKESAAQSGISSEEAIGHTDYEIYGEERGEKYRHIDKELIQ AGKVYRKEEKYTTPDGITHDTIAVKSIISWEGEKKWLLATRWDITQLKNYERELVAAKEE LEKALKKQKLALKSIDFGLIYIDKNYRVQWEETRQIASLVKGRRYIPGKICYQTSALRNE PCGQCAFKKAIEQGKIIRHTIRVDDVDFEVTATPVFGDEKETEIIGGLLRFENITEKLKM DKMLQEAKEKAEESNRLKSAFLANMSHEIRTPLNAIVGFSEMVCQTEEEEERKEFVKIIS SNNILLLQLIDDILDLSKIEAGTMEFTFAQTDINELMEGICRQMQEKNSSPDVQILFTEK ADQCMMYTDRIRLSQVIINFTNNALKFTPKGSIEMGYRIEEAKDEIYFYVKDTGIGIPAD KIDKVFERFVKLNSFIKGTGLGLAICRVIVERLGGVIGVESKEGEGSRFWFRIPRSEKIE K >gi|225935317|gb|ACGA01000075.1| GENE 80 104490 - 106349 1562 619 aa, chain + ## HITS:1 COG:ECs3176 KEGG:ns NR:ns ## COG: ECs3176 COG0471 # Protein_GI_number: 15832430 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 7 619 11 610 610 372 37.0 1e-103 MLITIIILVLSAVFFVNGKVRSDIVALCALIALLIFQILTPDEALSGFSNSVVIMMIGLF VVGGAIFQTGLAKMISSRILKLAGTSEIRLFLLVMLVTSAIGAFVSNTGTVALMLPIVVS LAMSAGMNPSRLLMPLAFASSMGGMMTLIGTPPNLVIQNTLTSAGLEPLSFFSFLPVGIV CVIVGTLVLMPLSKWFLSKKGQKDDNKRSGKSLKQLVNEYGLSSNLFRLQVIKDSRLLGK TIIDLDIRRKYGLNIMEVRRGDASQHRFLKTITQKFAAPDTMLQAEDILYVTGEFDKVQL 
FAEDFLLEILGDHATEETQSATNSLDFYDIGIAEIVLMPASNLINQTIKEAGFRDKFNVN VLGIRRKKEYLLQDLGNERIHSGDVLLVQGTWNNIARLSKEDSDWVVLGQPLAEAAKVTL DYKAPVAAAIMVLMVVMMVFDFIPVAPVTAVMIAGILMVLTGCFRNVEAAYKTINWESIV LIAAMLPMSLALEKTGASEYISNTLVNGLGSYGPVALMAGIYFTTSLMTMFISNTATAVL LAPIALQSAIQIGVSPVPFLFAVTVGASMCFASPFSTPPNALVMPAGQYTFMDYVKVGLP LQIIMGIVMIFVLPLIFPF >gi|225935317|gb|ACGA01000075.1| GENE 81 106710 - 109049 1948 779 aa, chain + ## HITS:1 COG:no KEGG:BT_0525 NR:ns ## KEGG: BT_0525 # Name: not_defined # Def: outer membrane protein, function unknown # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 778 20 816 820 119 24.0 6e-25 MKQNIAKVFTFSLLASSISFTSCVDNEKTLFNADQLKQTYEETFPVKNIDPNGDWTMSHK VTAHVSVNGDLGTDYKIQIFDADPLSSESTAKILAEGTANQSTTLNVVMDCATALNKVFV ACIDNHGHYMVQPVAIENGEVTAQLGHEKDVPTRSMSRAVTTTGIPAMAAPYTADDISSK KAIATDVQADWDLGAGFGWFEYAKLPVFKEKERWFKIQSGTFNKGFTTTGTSGGAQAVKV IVPQGSTWIIESSYQFSDITEIIVENGGKVEIAKNASLVLTNKSYLTVMPGGSITGKGTI QITNGSSGFKNYNAGTINCSVLDFNGGVGVFYNYGLLQLERYEASTNGMELVNHGTMEAE SINGNNNTNIKNGCYLKTGKFQFGTLVMGNTSEAICEELGYNGNDNDIVMEAQSMLTCTG KASLYRTVTGPTVGTALLRINEIANLSGLAQSNSKVTNNIICEITDQTYKGEAHYDWSPF AWLVNKGLQQGATYCNPGKADFILPADGECIKEGYNSDENPDDVEIRNAVYSYAFEDNYP QAGDYDFNDIVLNVNLPAAGNDVKELKYTVNLRAVGAVKQLGAGLRIRGIDKSNVEEVSF GAGATQRTNSLNSGIFENASYETNGNELVIPLFGDAHYVYGYTGSQRPMLNTGNASTPLT DIYTLEVNIKLKNAISIPSVTDGLDFFIAYQGGAQKRTEIHLNQFNSATANGQLADKEVL EVIKAVNNTWALCVPEKFAYPTETTVITNAYSKFADWAHDQSTNTDWYNTVSSNKVMKY >gi|225935317|gb|ACGA01000075.1| GENE 82 109178 - 109642 68 154 aa, chain + ## HITS:1 COG:no KEGG:BF2973 NR:ns ## KEGG: BF2973 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 12 143 6 134 142 99 42.0 3e-20 MKYYPPKKKPLNYFERQQKKLHPKKYHYYFFDYMYYCGEKWSEKGKGARTSGMIVLFSYW SFCIILPLCMYLSSVSIVSRTDGWHIIGLFFLCIIPPVVFCMLRYRTQRRSALMIHYRRS EWCKGIPVWLACSMWFPLCILELWVLIGMGWLST >gi|225935317|gb|ACGA01000075.1| GENE 83 110358 - 114323 2194 1321 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 792 1057 41 312 318 117 29.0 1e-25 MKRLLLLVLISYPVLLFANLMTSNQYYFRSWNINDGLSQNSIYAILQDHLGFMWFGTKDG LNRFDGTSFQVYCKANGMLGNNCVTALYEDKEGSIWIGTDTGVYIYSISRENSYFFDVKT ADGLGINRSVYMIGENKGCIYISSADGFFCFHPQQQKMVQLKIPWAGMKNFLFDQGTCWG AFYADNLYFTLDDFNTVYPFAAQDGTQPFKNEVVNKIIKGEENLLYVGLTKGLKEINLAT RQVRTLLSVDDKGDDIYVREIALYSENELWIGTESGIYTYNIKNGAITHIQSLVGTSYAL SDNVVYSLYKDKEGGMWIGSYFGGINYYPKQYTFFEKIYPHKGMEELGERVREIVEDEKG NLWIGTEDKGLFHYNPVTKELKPFHHQSIYHNVHGLCLENGFLWVGTFSGGLNRINLQTN TVKTYRKKDSCGLSSNDIFVIRRTASGKIYLGTADGLFEYHNENDRFSRVESVPPTLIYD LHEDLKGHLWLATFNGVYKLETEKRKCIHFYHKENDSLSLPYDKVLSIFEDSRHQLWFTT EGGGFCRLVSDSGTFEHYDSNSIGLPSDVVHQIVEDKRGIFWLTTNKGLVRFQPETKVIK NFTIADGLLSNQFNYRSSHQTVDGRIYFGCIAGLISFDPSTFIDNDYVSPLVITDFMIFN KEVVPNTKDSPLKQSITVSDSIELTASQNSFSFRFATLGYSTLDQNKLLYKLEGFDKEWY EARGSLISYSNLPYGSYTLYAKGVNSDGIWNDAPLKLHIYIRPPFYLTGWAYAVYVLLSL CIFYFIIRYSKRRMARKHMRQMEKFEQRKERELYSAKINFFTNVAHEIRTPLTLIKGPLE CILKDKPLADDVREDLVIMRQNSDRLLNLINQLLDFRKTANNSFTLILSECNIQEIVSGV YIRFTSLARQKNIKLSIDLPEEDISAAVDREALIKIISNLLSNAVKYAATYITITVKYNE ATGSFIIKVCNDGSVIPVNMREEIFQPFVQVQNNEEYASGTGLGLALARSLAELHHGTVY LEDMQGVNCFVLEIPVTHVEEVQELQENVSEKEDKISGIENPMAEEQEKGTHLPVILVVE DNQDMLTFIAKQLSSAYSVLKAKDGVEALEILEDANIDLLISDVMMPRMDGMELCHRLKT DLTYSHIPIVLLTAKTNLESKIEGLKLAADAYIEKPFSVEYLIATIESLLCNREKLRQAF CHSPFVFTATMAQTNADKIFLEELNKIVYDNLQDPDFNLDSMAQLLNMSRSSLNRKIRGI LDMTPNDYIRVERLKKAAVLLKDKSYKINEVCYAVGFSTPSYFAKCFQKQFGVLPKDFVE M >gi|225935317|gb|ACGA01000075.1| GENE 84 114501 - 116318 1500 605 aa, chain + ## HITS:1 COG:L0025 KEGG:ns NR:ns ## COG: L0025 COG3250 # Protein_GI_number: 15673962 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Lactococcus lactis # 110 486 109 458 996 85 26.0 3e-16 MKNFIRPILLFLFILIFNVAAFAQWSPAGDKIKTQWGEQIDPNNVLPEYPRPIMERTQWM NLNGCWDYAIVPKGMEAPAVYDGKILVPFAVESSLSGVGKRVGGEKEVWYNRVFNVPEKW 
NGMNVLLHFGAVDWKTDVWVNDMKVGEHTGGYTPFSFNISSALKKGGNKLTVRVWDPTDK GVQPRGKQVERPKGVWYTPVTGIWQTVWLEPVPVCHITNVRTTPDIDLRRINVTVTTSEN FTDRVRVKVFDGEQLVTEGLAVNKQAIELTMPENAKLWSPQSPFLYKTEVTLENNGKIVD KVNGYVAMRKYSTRRDKDGIVRLQLNNQDLFQFGPLDQGWWPDGLYTAPSDAALRYDVQK TKDFGFNMIRKHIKVEPARWYTHCDQLGIVVWQDMPSGDRNYEWQNRTYFDGIDFKRSPE SEAVYRKEWKEVIDYLYSYPCIGVWVPFNEAWGQFKTPEIAAWTKQYDPSRLVNPASGGN HYTCGDMLDLHNYPAPEMYLYDAQRANVLGEFGGIGMICKEHLWEPDKNWGYVQFNSSQE ATDEYQKYADMLYKLIPRGFSAAVYTQTSDVEIEINGLLTYDRKVVKLDEERLRKINEKI CNALR >gi|225935317|gb|ACGA01000075.1| GENE 85 116356 - 119568 2508 1070 aa, chain + ## HITS:1 COG:no KEGG:PRU_2517 NR:ns ## KEGG: PRU_2517 # Name: not_defined # Def: TonB dependent receptor # Organism: P.ruminicola # Pathway: not_defined # 1 1070 1 1057 1057 1022 50.0 0 MKKLRFLMACALLAFLWSEGSYANTTLENTGSSQQQQYWKIKGKVSDEKGEALIGATVKI KNATHGTVTDVNGNFEIVLPPGGVLIISYIGYREQELKINSQRTYDIVLKEETAGLDEVV VTALGIKRDRKALGYAVSDVKGKELEKAKETNVINALAGKIPGLVISQTAGGPAGSTRVI IRGNTSLTGSNQPLYVVDGVPLDNSNFGSAGAYGGYDLGDGISSINPDDIESMSVLKGPA ASALYGSRAAHGVILITTKKAQKNQNRRAIGVEVNSTVTFEKQLTEYDDVQTEFGQGWDG RINLNDDDAKSACASWGPRYDEGIAFKYFDGVTRPFIYRKDNIKGFFNTGVTATNSLILS SAKENNSIRLSYTNLTNKDIMPNAKISRNTIDLRTFFKLAEKLEVDFKVNYVNEYVKNRP ALADSRTNVANNLMNLAGNFDQVWLKDNYMHSDGTYYDWNRGDVWNINPYWVQYAMKNTT QKDQYFAVASFKYSVNKNLYFKLTGGGENVNFSFMDYIPYSTPSTLLGQLQKSTFENKSY NVEFIANYQNSFKRFNYGITLGGNVYHVDNKSHTITAKNMSMHETVALQSFLQKEITEGS WRKEINSLFGMVNLSWKDLLYADVTLRRDQSSTLPVNNNVYFYPSVGGSFLFSELIKPNP ILSFGKLRASWAQVGSDADPFMLDLNYIMTDKTFGNYSTGYISTGTIPNKDLKPSKTNSF EIGVDLKFLNNRIGLDFTYYKQNSNNQIMNVATSVTSGYGTKLINAGEIENSGVEIALNT TPVQTKDFSWDFNFNFSKNSNKVKSLSTGIESLELAAARWLGVKVLAVPGEEYGVIMGQD FLRNEQGDVIINADSGLPEITSDMKKLGKATWDWTGGLTTTFRYKQFTLSAIFDIKVGAD IYSMTARGLAKSGKAAYTVRGRDEWHKSEEQRLEAGVAEGNWQSTGGYVAEGVVKQVDAA GNVSWVKNTKAIAPYDYWGYICDKTALPFIYDNSYVKMRELTFGYTFPKKLIERYVESLS VSFVARNPFIIYKNVPNIDPDSNYNTSAIGLEYGSLPSRRSFGLNVNIKF >gi|225935317|gb|ACGA01000075.1| GENE 86 119580 - 121202 1072 540 aa, chain + ## HITS:1 COG:no KEGG:PRU_2518 NR:ns ## KEGG: PRU_2518 # Name: 
not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 8 538 8 573 580 274 31.0 7e-72 MKRILNIKWMHSLLLSVLLMGFSSCIDYEDFNKNPVVATTIDPNSQLSYIQLCMGGDWLM EQPFAYYYSGFVQQLQGDWSATHFGGEYLIDDAQFQQPWERIYTCHLKNLADILWRTEAD VEQKNINAVARILKCYFFLILTDMYGDIPYFEAEQGYITGNVTPKFDEQELIYKDMHKEL LEAEEQLDEKADKLTGDIIYSGDLKKWRKFANTLNFRIAMRLVKVAPELAEEWVTAICNI ESGLLGVGDDALIHYMDLLDWDETEFRRNGLAQLWRSRENAPMCYFCTTMWDKLKTTNDP RLLILGRCYAEDSTDPFLRTDLTDFIIEKAPKQLESIEAIKPGFYWWDNWPAGFMDGDTY YGKECRPQLNKGFIKGDSPAIFMSYAETELLLAEAKLRWPAINDSKTVETHYSNGVKAAM GLLKNFDLGEEVSNTEMASYLSRNPLPSDKDGQLQFINEQLWILHLINPSEAYSNWRRSG YPVLKPSTEYGAAVKDSRTIPRRLKYPLFEQTYNPTGYYEAVGRMGGEDNWNCRIWWDKE >gi|225935317|gb|ACGA01000075.1| GENE 87 121215 - 123038 1170 607 aa, chain + ## HITS:1 COG:TM1752 KEGG:ns NR:ns ## COG: TM1752 COG2730 # Protein_GI_number: 15644498 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 164 453 25 299 329 99 24.0 2e-20 MKTWNIIKSIFFISIYVLIAACNDDAAEESISADNYPRIIATSPTLPMKGENGELGKLNV KQGETITIKAIYTPFEYAEAVWYVDGVQEGTGDEFKYSNETPGIYHLKLVVSTTSYETTR EVNLNVYSTGTGGEFSGFEIHKGVNIGNWLSQSSARGEERKNKFKETDAIFLSQQGFDHL RLPVDEEQLFTESGEVDEETLNLIYKTADWCQLYNMRLIFDFHILRSHNYEADNKPLWTS KEEQDKFVRMWKTIHDKLQQYPDDMLAYELLNEPVAPSSDIWNTLVNRLITELREVAPAR MLIIGSNLWNSVSTISDLQIPSEDKNIMLTFHYYEPYLLTHYHASWTDFANLNLPLNYPG LLVRDEDFNNLPADQQEIVRPYKQEYTRESILNNINVAVTAARAKGVKLYCGEFGCLPYS NLTSRYAWYRDLVSIFNETEISYGAWEYNAIFGFCTPDGGLKDATLVDILLGREGAELTT IYTVEAEDCEKGGGTMIIEDNTAASGGKHVNHFWGDSNLKFDVTVEEAGVYYLKIRYVCG TDVWIYSQVNGGVKQMIPCPNSGGWWSEFTDAKTVVNLSAGKNTISFTPEPTGSPILDKF EVCKIKK >gi|225935317|gb|ACGA01000075.1| GENE 88 123047 - 124894 1112 615 aa, chain + ## HITS:1 COG:no KEGG:BT_3597 NR:ns ## KEGG: BT_3597 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 480 21 470 474 353 40.0 8e-96 MAKYIFLFFLWIAGFAYGCGDSETSVNDDIPVSKAPTSIKLPHLISDNMVLQQNSKVILW 
GEATPKCAVTISASWIKQDIVVNVDGKGQWQASIMTPAGGPETQTLSFDDGASAPLIVRN ILIGEVWICSGQSNMEMPLGGWEGCPVDNMEEAVTNADKHSEIRMLTVESNSATVSQNEL KGDWLVASSGQVKRFSAVAYYFALELQEKLNVPVGLVVAAWGGSDIESWLPGGDKYNGML YPCHKYAAKGFLWYQGESNVWKWYEYQKNMKELVKSWRSLWEGSLGTGENMPFYYAEIAP YALPEDNGATGIVSALLREVQYEVQKEIMPAGMVCTNDLVALSEENQIHPANKKGVGMRL AHLALNQTYEHGEIISASPSFEGMCMEDGRVLLTFKNVGGGWMDIDKNSVLQNFELSDGK TLVNGCYVFYPASDISFGDGDVVTVSSDKVAAPKYIRYCFRNFMLGYMRNKAGLPLIPFR GESNIGGTYMVEVESGTSSGTSVALSDNPYASGERLLTNFYGDARLNLNLVVEEEGLYEL SVYYMSEVDAAVRIQVNGGVQQNLNCSASGSWWNKLQCAGMTVTLLKGENTIVITPEANG GPNLDKVNVVKIERK >gi|225935317|gb|ACGA01000075.1| GENE 89 124948 - 126417 810 489 aa, chain + ## HITS:1 COG:TM1751 KEGG:ns NR:ns ## COG: TM1751 COG2730 # Protein_GI_number: 15644497 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 13 356 3 314 317 105 25.0 2e-22 MWTNHTIWGNVRMESLQSNFEITRGVNISHWLSQSKDRGELRKSKFGRDEVAFLKKMGFD HLRLPVDEEQLFTPSGDIDTETMELIHQAISWCREFQMRIIFDFHILRSHHFGNNERPLW TDPIEQDKFIHLWKLINRELEKYPESLLAYELLNEPVAPENGVWNELAKRLIKELREVAS KRMLILGSNSYNSVNTIKDLQIPKGDRNIMLTFHCYEPYLLTHYRASWTDFAKLDMALTY PGDLINEKDFEKLSDEEKKIVAPYRRMYSKRTIAHEIEQALKVAKDNGVRLYCGEFGCLP FGNHASRYNWYRDLISIFQKKKVAYACWDYKSIFGFCNTDKTIKDSVLLAILQGKDIRDW EQKSLIEAEKCEMIGSSLSIIANPYASNGEVVKDFFGNCSLIFNVVTENAGEYLFKIRYT SEEDVGLNLQIGNSRQVIRCPNTGGWYDKFEETEIYINLPKGQSVLCITPGMNGGPILDK FELFKEVTD >gi|225935317|gb|ACGA01000075.1| GENE 90 126450 - 126720 232 90 aa, chain + ## HITS:1 COG:no KEGG:BT_1872 NR:ns ## KEGG: BT_1872 # Name: not_defined # Def: periplasmic beta-glucosidase precursor # Organism: B.thetaiotaomicron # Pathway: Cyanoamino acid metabolism [PATH:bth00460]; Starch and sucrose metabolism [PATH:bth00500]; Biosynthesis of secondary metabolites [PATH:bth01110] # 29 90 27 88 759 114 85.0 1e-24 MMRRMKRCCLLVIYNILLIPTLLSQNHYENEMDCFITDLMERMTLREKLGQLNLPSGGDL VTGSVMNGELSDMIRKQEIGGFFNVKGIQK Prediction of potential genes in microbial genomes Time: Fri May 13 11:41:40 2011 Seq name: 
gi|225935316|gb|ACGA01000076.1| Bacteroides sp. D2 cont1.76, whole genome shotgun sequence Length of sequence - 126954 bp Number of predicted genes - 115, with homology - 115 Number of transcription units - 37, operones - 21 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 199 - 258 5.7 1 1 Op 1 . + CDS 369 - 1412 549 ## NT01CX_1069 hypothetical protein 2 1 Op 2 . + CDS 1457 - 1897 261 ## BVU_3800 two-component system response regulator + Prom 2057 - 2116 6.1 3 2 Op 1 . + CDS 2151 - 3935 1115 ## COG0464 ATPases of the AAA+ class 4 2 Op 2 . + CDS 3962 - 4342 440 ## gi|260175365|ref|ZP_05761777.1| hypothetical protein BacD2_26154 5 2 Op 3 . + CDS 4350 - 4874 472 ## wcw_1361 hypothetical protein + Prom 4879 - 4938 7.1 6 3 Op 1 . + CDS 5008 - 6300 786 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 + Prom 6503 - 6562 7.0 7 3 Op 2 . + CDS 6594 - 7277 411 ## BT_2762 TonB + Term 7299 - 7341 10.1 + Prom 7292 - 7351 3.3 8 4 Op 1 . + CDS 7385 - 8008 319 ## BT_2761 hypothetical protein 9 4 Op 2 . + CDS 8052 - 8615 383 ## COG2096 Uncharacterized conserved protein 10 4 Op 3 . + CDS 8689 - 8910 357 ## PGN_1678 hypothetical protein + Term 8933 - 8984 11.3 + Prom 8967 - 9026 3.9 11 5 Op 1 6/0.000 + CDS 9060 - 9629 344 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 9739 - 9798 2.7 12 5 Op 2 . + CDS 9830 - 11008 917 ## COG3712 Fe2+-dicitrate sensor, membrane component 13 5 Op 3 . + CDS 11053 - 14601 2697 ## BT_3279 hypothetical protein 14 5 Op 4 . + CDS 14616 - 16079 1040 ## BT_3280 hypothetical protein 15 5 Op 5 . + CDS 16090 - 17001 731 ## BT_3281 hypothetical protein 16 5 Op 6 . + CDS 17019 - 19568 2192 ## BT_3282 hypothetical protein + Term 19628 - 19675 11.2 - Term 19614 - 19664 2.2 17 6 Op 1 . 
- CDS 19675 - 20862 1160 ## BT_2758 hypothetical protein 18 6 Op 2 2/0.000 - CDS 20878 - 21936 1101 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 19 6 Op 3 . - CDS 21972 - 23282 1436 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 23437 - 23496 5.6 + Prom 23369 - 23428 5.9 20 7 Tu 1 . + CDS 23461 - 24894 1355 ## COG1027 Aspartate ammonia-lyase + Term 24934 - 24979 12.2 + Prom 24961 - 25020 4.9 21 8 Tu 1 . + CDS 25087 - 25905 467 ## gi|260175383|ref|ZP_05761795.1| hypothetical protein BacD2_26244 + Term 26145 - 26198 5.1 + Prom 26564 - 26623 6.2 22 9 Tu 1 . + CDS 26832 - 27263 340 ## gi|260175384|ref|ZP_05761796.1| hypothetical protein BacD2_26249 + Prom 27298 - 27357 2.8 23 10 Tu 1 . + CDS 27379 - 29397 1217 ## COG0642 Signal transduction histidine kinase + Term 29442 - 29482 -0.8 + Prom 29425 - 29484 8.9 24 11 Op 1 . + CDS 29578 - 30057 327 ## ECL_01936 hypothetical protein 25 11 Op 2 . + CDS 30050 - 30625 545 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 30716 - 30756 3.2 + Prom 30637 - 30696 5.3 26 12 Op 1 . + CDS 30797 - 33115 1758 ## BVU_1454 hypothetical protein 27 12 Op 2 . + CDS 33193 - 33768 760 ## BT_2753 hypothetical protein + Term 33790 - 33848 11.8 + Prom 33875 - 33934 7.8 28 13 Op 1 . + CDS 33960 - 36416 1835 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 29 13 Op 2 . + CDS 36469 - 38265 1641 ## BT_2751 hypothetical protein 30 13 Op 3 . + CDS 38283 - 38753 451 ## COG0394 Protein-tyrosine-phosphatase + Term 38849 - 38905 10.2 - Term 38566 - 38604 2.0 31 14 Tu 1 . - CDS 38728 - 40800 573 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 - Prom 40834 - 40893 4.6 + Prom 40810 - 40869 3.5 32 15 Op 1 . + CDS 40895 - 42412 1496 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 33 15 Op 2 . 
+ CDS 42426 - 43649 1183 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase + Term 43713 - 43773 16.8 - Term 43699 - 43760 17.0 34 16 Tu 1 . - CDS 43842 - 45053 688 ## BT_2745 thiol:disulfide interchange protein - Prom 45195 - 45254 8.3 - Term 45220 - 45267 6.2 35 17 Op 1 . - CDS 45300 - 45812 445 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 36 17 Op 2 . - CDS 45939 - 47720 1194 ## COG0006 Xaa-Pro aminopeptidase - Prom 47891 - 47950 5.4 + Prom 47830 - 47889 4.2 37 18 Op 1 . + CDS 47909 - 48100 310 ## PROTEIN SUPPORTED gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 38 18 Op 2 . + CDS 48174 - 49055 693 ## COG4974 Site-specific recombinase XerD 39 18 Op 3 . + CDS 49072 - 49371 193 ## PROTEIN SUPPORTED gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 + Term 49416 - 49465 4.3 + TRNA 49469 - 49545 73.6 # Thr TGT 0 0 + TRNA 49628 - 49713 65.6 # Tyr GTA 0 0 + TRNA 49739 - 49811 68.9 # Gly TCC 0 0 + TRNA 49822 - 49893 81.6 # Thr GGT 0 0 40 19 Tu 1 . + CDS 49945 - 51129 1409 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 51144 - 51183 8.4 + TRNA 51182 - 51257 81.1 # Trp CCA 0 0 + Prom 51184 - 51243 80.2 41 20 Op 1 . + CDS 51269 - 51460 140 ## BF4198 preprotein translocase SecE subunit 42 20 Op 2 45/0.000 + CDS 51477 - 52019 534 ## COG0250 Transcription antiterminator 43 20 Op 3 55/0.000 + CDS 52081 - 52524 743 ## PROTEIN SUPPORTED gi|160883077|ref|ZP_02064080.1| hypothetical protein BACOVA_01040 44 20 Op 4 . + CDS 52540 - 53238 1159 ## PROTEIN SUPPORTED gi|160883076|ref|ZP_02064079.1| hypothetical protein BACOVA_01039 45 20 Op 5 . 
+ CDS 53254 - 53775 852 ## PROTEIN SUPPORTED gi|160883075|ref|ZP_02064078.1| hypothetical protein BACOVA_01038 46 20 Op 6 28/0.000 + CDS 53866 - 54240 589 ## PROTEIN SUPPORTED gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 + Term 54266 - 54310 9.2 + Prom 54263 - 54322 4.3 47 21 Op 1 58/0.000 + CDS 54345 - 58157 4267 ## COG0085 DNA-directed RNA polymerase, beta subunit/140 kD subunit + Term 58175 - 58222 7.8 + Prom 58177 - 58236 2.6 48 21 Op 2 . + CDS 58262 - 62545 4666 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Term 62567 - 62618 14.2 + Prom 62585 - 62644 9.8 49 22 Tu 1 . + CDS 62692 - 63000 359 ## BT_2732 hypothetical protein + Prom 63053 - 63112 5.1 50 23 Op 1 56/0.000 + CDS 63137 - 63547 701 ## PROTEIN SUPPORTED gi|160883069|ref|ZP_02064072.1| hypothetical protein BACOVA_01032 + Prom 63570 - 63629 2.2 51 23 Op 2 51/0.000 + CDS 63691 - 64167 809 ## PROTEIN SUPPORTED gi|160883068|ref|ZP_02064071.1| hypothetical protein BACOVA_01031 52 23 Op 3 4/0.000 + CDS 64213 - 66330 1961 ## COG0480 Translation elongation factors (GTPases) + Term 66337 - 66390 8.4 53 23 Op 4 40/0.000 + CDS 66410 - 66715 495 ## PROTEIN SUPPORTED gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 54 23 Op 5 58/0.000 + CDS 66734 - 67351 1056 ## PROTEIN SUPPORTED gi|237717399|ref|ZP_04547880.1| 50S ribosomal protein L3 55 23 Op 6 61/0.000 + CDS 67351 - 67977 1036 ## PROTEIN SUPPORTED gi|160883063|ref|ZP_02064066.1| hypothetical protein BACOVA_01026 56 23 Op 7 61/0.000 + CDS 67994 - 68284 476 ## PROTEIN SUPPORTED gi|160883062|ref|ZP_02064065.1| hypothetical protein BACOVA_01025 57 23 Op 8 60/0.000 + CDS 68290 - 69114 1416 ## PROTEIN SUPPORTED gi|153805960|ref|ZP_01958628.1| hypothetical protein BACCAC_00205 58 23 Op 9 59/0.000 + CDS 69135 - 69404 465 ## PROTEIN SUPPORTED gi|153805961|ref|ZP_01958629.1| hypothetical protein BACCAC_00206 59 23 Op 10 61/0.000 + CDS 69441 - 69851 679 ## PROTEIN SUPPORTED gi|29348131|ref|NP_811634.1| 50S ribosomal 
protein L22 60 23 Op 11 50/0.000 + CDS 69857 - 70588 1248 ## PROTEIN SUPPORTED gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 61 23 Op 12 . + CDS 70612 - 71046 737 ## PROTEIN SUPPORTED gi|153805964|ref|ZP_01958632.1| hypothetical protein BACCAC_00209 62 23 Op 13 . + CDS 71052 - 71249 321 ## PROTEIN SUPPORTED gi|160883056|ref|ZP_02064059.1| hypothetical protein BACOVA_01019 63 23 Op 14 50/0.000 + CDS 71246 - 71515 456 ## PROTEIN SUPPORTED gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 64 23 Op 15 57/0.000 + CDS 71518 - 71883 596 ## PROTEIN SUPPORTED gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 65 23 Op 16 48/0.000 + CDS 71903 - 72223 534 ## PROTEIN SUPPORTED gi|160883053|ref|ZP_02064056.1| hypothetical protein BACOVA_01016 66 23 Op 17 50/0.000 + CDS 72223 - 72780 913 ## PROTEIN SUPPORTED gi|160883052|ref|ZP_02064055.1| hypothetical protein BACOVA_01015 67 23 Op 18 50/0.000 + CDS 72786 - 73085 503 ## PROTEIN SUPPORTED gi|160883051|ref|ZP_02064054.1| hypothetical protein BACOVA_01014 + Term 73103 - 73131 -0.9 68 23 Op 19 55/0.000 + CDS 73138 - 73533 664 ## PROTEIN SUPPORTED gi|160883050|ref|ZP_02064053.1| hypothetical protein BACOVA_01013 69 23 Op 20 46/0.000 + CDS 73549 - 74118 967 ## PROTEIN SUPPORTED gi|153805972|ref|ZP_01958640.1| hypothetical protein BACCAC_00217 70 23 Op 21 56/0.000 + CDS 74140 - 74484 538 ## PROTEIN SUPPORTED gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 71 23 Op 22 . + CDS 74490 - 75008 848 ## PROTEIN SUPPORTED gi|160883047|ref|ZP_02064050.1| hypothetical protein BACOVA_01010 72 23 Op 23 . 
+ CDS 75018 - 75194 281 ## PROTEIN SUPPORTED gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 73 23 Op 24 53/0.000 + CDS 75227 - 75673 734 ## PROTEIN SUPPORTED gi|160883045|ref|ZP_02064048.1| hypothetical protein BACOVA_01008 74 23 Op 25 2/0.000 + CDS 75678 - 77018 877 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 75 23 Op 26 9/0.000 + CDS 77034 - 77831 670 ## COG0024 Methionine aminopeptidase 76 23 Op 27 . + CDS 77833 - 78051 239 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 77 23 Op 28 . + CDS 78060 - 78176 198 ## PROTEIN SUPPORTED gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 78 23 Op 29 48/0.000 + CDS 78210 - 78590 637 ## PROTEIN SUPPORTED gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 79 23 Op 30 36/0.000 + CDS 78602 - 78991 665 ## PROTEIN SUPPORTED gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 + Prom 78995 - 79054 4.2 80 23 Op 31 26/0.000 + CDS 79110 - 79715 1028 ## PROTEIN SUPPORTED gi|160883039|ref|ZP_02064042.1| hypothetical protein BACOVA_01002 81 23 Op 32 50/0.000 + CDS 79727 - 80719 1015 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 82 23 Op 33 . + CDS 80723 - 81226 834 ## PROTEIN SUPPORTED gi|237717427|ref|ZP_04547908.1| 50S ribosomal protein L17 + Term 81244 - 81305 12.2 + Prom 81310 - 81369 6.5 83 24 Tu 1 . + CDS 81401 - 81769 176 ## BT_2699 hypothetical protein + Term 81797 - 81847 7.6 + Prom 81801 - 81860 5.8 84 25 Tu 1 . + CDS 81882 - 82424 399 ## BT_2698 hypothetical protein + Term 82571 - 82612 7.3 - Term 82338 - 82374 2.1 85 26 Tu 1 . - CDS 82576 - 84390 850 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 84522 - 84581 3.9 + Prom 84608 - 84667 5.7 86 27 Tu 1 . 
+ CDS 84709 - 85353 519 ## BT_2695 hypothetical protein + Prom 85622 - 85681 5.3 87 28 Op 1 1/0.000 + CDS 85731 - 87728 1302 ## COG2987 Urocanate hydratase + Prom 87813 - 87872 6.0 88 28 Op 2 1/0.000 + CDS 87894 - 88796 974 ## COG3643 Glutamate formiminotransferase + Prom 88800 - 88859 3.1 89 28 Op 3 1/0.000 + CDS 88907 - 90160 975 ## COG1228 Imidazolonepropionase and related amidohydrolases 90 28 Op 4 1/0.000 + CDS 90207 - 90836 687 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 91 28 Op 5 . + CDS 90833 - 92332 1225 ## COG2986 Histidine ammonia-lyase + Term 92381 - 92432 15.1 + Prom 92419 - 92478 3.7 92 29 Op 1 . + CDS 92682 - 93332 614 ## BT_2689 hypothetical protein 93 29 Op 2 13/0.000 + CDS 93388 - 94734 1257 ## COG1538 Outer membrane protein 94 29 Op 3 27/0.000 + CDS 94768 - 95793 1064 ## COG0845 Membrane-fusion protein 95 29 Op 4 . + CDS 95853 - 99011 3095 ## COG0841 Cation/multidrug efflux pump 96 29 Op 5 . + CDS 99024 - 99317 363 ## BT_2685 hypothetical protein + Term 99337 - 99401 15.3 + Prom 99321 - 99380 8.0 97 30 Op 1 . + CDS 99433 - 100824 1288 ## BT_2683 putative periplasmic protein 98 30 Op 2 . + CDS 100778 - 101728 625 ## BT_2682 putative periplasmic protein 99 30 Op 3 . + CDS 101715 - 103202 1115 ## COG1696 Predicted membrane protein involved in D-alanine export + Prom 103205 - 103264 1.8 100 30 Op 4 . + CDS 103289 - 104302 898 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 104315 - 104381 6.6 + Prom 104305 - 104364 5.2 101 31 Tu 1 . + CDS 104545 - 105594 582 ## BT_2654 transposase + Term 105641 - 105676 7.1 102 32 Tu 1 . - CDS 105800 - 106678 426 ## BF1920 putative transcription regulator - Prom 106832 - 106891 9.2 + Prom 106777 - 106836 3.6 103 33 Op 1 . + CDS 106856 - 107389 504 ## PG2130 hypothetical protein 104 33 Op 2 . + CDS 107403 - 109172 1579 ## BF3229 hypothetical protein 105 33 Op 3 . + CDS 109194 - 110081 720 ## BF1562 hypothetical protein 106 33 Op 4 . 
+ CDS 110116 - 111393 1123 ## BF1563 hypothetical protein + Term 111404 - 111436 1.6 + Prom 111456 - 111515 8.9 107 34 Op 1 . + CDS 111537 - 112598 763 ## gi|260175469|ref|ZP_05761881.1| hypothetical protein BacD2_26674 108 34 Op 2 . + CDS 112599 - 115811 2309 ## BDI_2883 hypothetical protein 109 34 Op 3 . + CDS 115839 - 116975 985 ## BF1563 hypothetical protein - Term 117213 - 117252 8.2 110 35 Tu 1 . - CDS 117352 - 118521 440 ## BF1784 clostripain-related protein - Prom 118578 - 118637 5.0 - Term 118575 - 118625 2.4 111 36 Op 1 . - CDS 118653 - 120722 2083 ## gi|260175473|ref|ZP_05761885.1| hypothetical protein BacD2_26694 112 36 Op 2 . - CDS 120784 - 121875 780 ## gi|260175474|ref|ZP_05761886.1| hypothetical protein BacD2_26699 113 36 Op 3 . - CDS 121903 - 123015 926 ## gi|260175475|ref|ZP_05761887.1| hypothetical protein BacD2_26704 114 36 Op 4 . - CDS 123035 - 124240 846 ## BT_2319 hypothetical protein - Prom 124472 - 124531 7.6 - Term 124352 - 124398 3.3 115 37 Tu 1 . - CDS 124621 - 125610 404 ## BT_4479 integrase protein - Prom 125795 - 125854 4.5 Predicted protein(s) >gi|225935316|gb|ACGA01000076.1| GENE 1 369 - 1412 549 347 aa, chain + ## HITS:1 COG:no KEGG:NT01CX_1069 NR:ns ## KEGG: NT01CX_1069 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 81 223 2 168 304 73 30.0 1e-11 MKALKKITFGRVLGFVAVLCLSWVTILITSESRLGRGSTEYLGDIYSSECHSYIKDGKYR IENMVTGKTTIKDIDWLLSAESCKDTLVVFSKNRKRGYFNRFTGEAVIPEQYTHAWVFSE GLAAVVSNEKVGFIDRQGKTVIDFHFPYLKDNAKSVAFVFRNGYSSMYDVSGKCGIIDKK GQWVLEPAYDCIHTPQYGKRIFDVGKFSGVLDDSLQILLSPEYKYIQLLEHYIIADKQDG SQVQIGYDGKVLNENIYSGVTILEYDSFEKTKEGGETIRKSTGLYSYSVYNSYGLMGQNG KPITPVLYDNITAINKDLFLCTLKDSYSSVIINRQGTVIRNEAADLH >gi|225935316|gb|ACGA01000076.1| GENE 2 1457 - 1897 261 146 aa, chain + ## HITS:1 COG:no KEGG:BVU_3800 NR:ns ## KEGG: BVU_3800 # Name: not_defined # Def: two-component system response regulator # Organism: B.vulgatus # Pathway: not_defined # 38 143 146 251 254 93 38.0 3e-18 
METNNRSQREKTMRNFISGSEFAAIRISSVVGDQNDIFLCTTATGNLQLIRLGEIGYFRY NSKGRQWEAMLYDGHVLKLKRSVNSQMILNYHPFLTQISQSFILNISYLVLIKENNCVLL PPFNTVETLQITPPFLRKLKEKYPSF >gi|225935316|gb|ACGA01000076.1| GENE 3 2151 - 3935 1115 594 aa, chain + ## HITS:1 COG:APE1367 KEGG:ns NR:ns ## COG: APE1367 COG0464 # Protein_GI_number: 14601365 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Aeropyrum pernix # 321 578 458 721 726 173 36.0 8e-43 MSALNVGQQIHDFRVDAFLKSMYGYEQTYLATSVQEEHKKVLLKLYDVMEFPSRQLSEGG TLMEGNIRNEMYHWFAPIHLYDTVLLDGKEYHCLVREYIESKRLSDLLAEGKRYSWDEAM PIIHQVLNGLSCLHRQERAIIHNDMTPRNILLYESREHETKVYIIGTGHLSYRRSGVASF PTDDLNPWYRAPETYTGMYDEQSDLFSVGALLYEMLTGKAPWQTELPDCSCDTRLYKKIV KKAREAELLFPEELMLTDEQQNTLKKALALDYEKRYYNVEAFKKGLDGEIAVVEKTPPIP VEQENASSWDEKIKKQTGNGFADVAGMEEVKQIFYKDILFLLKNKEKVERYKLKIPNGTL LYGPPGCGKTYIAEKFAQESGLNFMMVKASDLGSIYIHGMQGKIAELFDEAEKKAPTVIC FDEFDAMVPDRSRMDNVGQSGEVNEFLSQLNNCAERGIFVIGTSNRPDRIDPAVLRTGRI DKLIYIPLPDKEARKSLFVFQLRDRWCEETIDCEILADKTAGYVASDITFIVNETALAAA MKDIPISQELLMDEIGNARQSVNKEQIEVYESMYARLESGRIEERRRIGFAAYK >gi|225935316|gb|ACGA01000076.1| GENE 4 3962 - 4342 440 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175365|ref|ZP_05761777.1| ## NR: gi|260175365|ref|ZP_05761777.1| hypothetical protein BacD2_26154 [Bacteroides sp. 
D2] # 1 126 2 127 127 214 100.0 1e-54 MKTVEKVFEALKNEGLVPVMENFGISFKFQMTNYVYMEDENDESFFNLLIPNIYDVSEEN ELEVLRAINNVNNSMKVAKLVISNDSVWVCFENLLDKEHKMEDLVPVAVSTLYQARQRFY AALKEE >gi|225935316|gb|ACGA01000076.1| GENE 5 4350 - 4874 472 174 aa, chain + ## HITS:1 COG:no KEGG:wcw_1361 NR:ns ## KEGG: wcw_1361 # Name: not_defined # Def: hypothetical protein # Organism: W.chondrophila # Pathway: not_defined # 13 141 87 216 236 110 46.0 2e-23 MKRNDSPDFVGLEELKRKQREQLYNFECWAASGKWNEFHRHHYDWWMFPYNQPSSYGEAY TVYDYEVNLLKKNSIFVRRYLRGVELLLLSWGWKLKDHKMVDNPDLFQDWADWPIRLYKC ASSLLLFGFEKEFESVRTYALRLISEEKNFWYDGKDCSELFRMEILNMSELSEF >gi|225935316|gb|ACGA01000076.1| GENE 6 5008 - 6300 786 430 aa, chain + ## HITS:1 COG:PAE1111 KEGG:ns NR:ns ## COG: PAE1111 COG2110 # Protein_GI_number: 18312414 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Pyrobaculum aerophilum # 7 149 3 145 182 90 38.0 4e-18 MNDEKAYQFGKSRLIIKFGDLTSAVTDVIVSSDDAYLSMGGGVSASILRAGGDVIARDAR KNVPCQMGDVIVTSAGKLEAKYVFHAITIDWSQKDEFTVEKSINSIIKKSLNVLSVLGLK SIAFPAIGTGAARYSLEDVAHFMSMAISEFLSNSDEELEIYIYLMDRYGRRTAIDYIVFF EQFYRRMFTSGVDIAEVNPAETEAHAKQWDMKSMRLNHLNSYLIKLEEQRMNFEMKLIEA IEKNDNEQIKTLHYQLDENNKIRYSCMREIKEVECTDALNSSFKSVFLSSTFEDMKEYRK AIIDRIIKRRMVPICMENWGANANKVTSVITDEVKKADIYLGIFGTRYGYVDENTNMSMT EIEYREALASNKPILVYIAKNAKDDITTGDNSQKMLELLTEIEKERIVYYFNSIDQLGEQ VFADLERYIK >gi|225935316|gb|ACGA01000076.1| GENE 7 6594 - 7277 411 227 aa, chain + ## HITS:1 COG:no KEGG:BT_2762 NR:ns ## KEGG: BT_2762 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 319 82.0 5e-86 MEAKKSKKAAIENQRGSWLLMGLVVALAFMFVSFEWTQHDVRVAALSPDDESIFVTELVP ITFPDEKLEPPPPPETKVTELFEIVENNMEVTDNVSTVSEDMNAVRDVIWIPPVVETETV VEDIIHVSVEVMPEFIGGTAALMKYLSSNIKYPTISQETGSQGKVIVQFVVDKDGTISNP EVVRGVDPYLDKEAIRVISSMPKWTPGVQNGKKVRVKFTVPVSFRLQ >gi|225935316|gb|ACGA01000076.1| GENE 8 7385 - 8008 319 207 aa, chain + ## HITS:1 COG:no KEGG:BT_2761 NR:ns ## 
KEGG: BT_2761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 6 212 212 337 81.0 1e-91 MNLLLCIKRPFIWLSRFRYRCGYGVHSPFAFSLITDVIYEKMPYYAYSSLKKEQKKMIRE RGWTKGSQKVNRFLFRLVNKVQPATIIEVGRPSSTTLYLQSAKPSASYLFASDLSELFLD ADTSVDFLYLNDYRNPNLLEEVFWVCAHRTTPKSVFVVHGICYSKEMKALWKKLQADERI GITFDLYDVGLLFFDKTKIKQQYIVNF >gi|225935316|gb|ACGA01000076.1| GENE 9 8052 - 8615 383 187 aa, chain + ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 185 3 179 188 152 44.0 4e-37 MKKSLVYTKTGDKGTTSLVGGSRVPKTHIRLEAYGTVDELNSHLGWLNTYLQDESDRDFI LSIQHKLFAIGSHLATDQEKTQLKAASIITPENVENIEREIDKLDEQLPELCAFIIPGGS RGAAVCHVCRTICRRAERRILTLSETCTISPEVLAFVNRLSDYLFVLSRKINFDEQNNEI FWDNSWK >gi|225935316|gb|ACGA01000076.1| GENE 10 8689 - 8910 357 73 aa, chain + ## HITS:1 COG:no KEGG:PGN_1678 NR:ns ## KEGG: PGN_1678 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 73 13 85 85 117 98.0 2e-25 MYWTLELASKLEDAPWPATKDELIDYAMRSGAPLEVIENLQEMEDEGEIYESIEDIWPDY PSKEDFFFNEEEY >gi|225935316|gb|ACGA01000076.1| GENE 11 9060 - 9629 344 189 aa, chain + ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 3 177 13 189 194 66 30.0 3e-11 MSSNEISIEQINKLDAAAFRLLYKSYYKGLVCYAMRLIELSEPAEDIVQELFSNIWAKKM VFQSLVSFKVYLYNSVRNASLDYLKHKNIEGSYLQKMLDAHPVYRTGEEDEEGFFSEEVY RQLFETIDALPERCREVFLMYMEGKKNEEVATALHVSIETVKTQKKRAMSILRKKLGAYQ FLLLQLLLP >gi|225935316|gb|ACGA01000076.1| GENE 12 9830 - 11008 917 392 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: 
Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 199 384 134 317 331 82 30.0 2e-15 MMKRKRSINSIIHDQLLGSTSEEDREQLQRWLDSSQENRENYERLMRENCLADRYQEFAW VDEERAWKRFQEKHFPVRTFNWGKLVRYAAIFILPLIGLAVWLWGTYQMDSKPILSEETR VAMTRSEQMGKQKATLVLTNGQQMDLKSAPVSQDTVARLLAPPPAVVPAEKDQPVEAPVA ENNKLSTYDDSEFWMTFEDGTRVHLNYNTTLKYPPHFTSNSRTVYLDGEAYFEVAKDNKR PFRVVTANGVVRQYGTSFNVNTYMNGVTKVVLVDGSVSVLPNLGDEYKMKPGELAVLHAA TQEVQISEVNIEPYVAWNSGRFVFDNCSLESLMSVISRWYNKEIVFESEDIKKIRFTGDI DRYGNIAPILKAIQRVTHLEMEVRGKSIIIRK >gi|225935316|gb|ACGA01000076.1| GENE 13 11053 - 14601 2697 1182 aa, chain + ## HITS:1 COG:no KEGG:BT_3279 NR:ns ## KEGG: BT_3279 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1182 1 1182 1182 2135 89.0 0 MQKERARERIIMFLCFISLLLTPVLATSQSTAMRVSLNLKSATVKEFFDAVKVQTGLNFI YNTEQVKNIPHITVRSSSQPVNEVLDKVLANTGYTYEIEGNIVTVVYQQPKEKTRTATGV VVDEDGLPLPGAYIKLAKTDHSTIADRDGKFSISLLQQKDPVLQISYLGMMMEEVKAGGD RSLRVVMKSDVKSIDEVVVTGYQEIDRRKLASSISSIKGADLIGGEYLSIDRMLQGRLPG VAVMNMSSTPGAAPKIRIRGSSSITGNREPVWVVDGIILEEPVNISTEELNSPDKINLIG NAIGSVNPEDIERVDILKDASATAIYGIKAANGVIVVTTKQGKSQKPSISYTATLGITAP PTYDKMFRMNSADRIDMSMEMHERGLAFGSYKPSDIGYEGALQQLWNKDITYQDFLQQVQ TLKGLNTDWYDLLFRTAFTHQHSVSISGGNDRANYYMSVGYANNQSVTVGEKLERYNVLA KINTRINRNIRLGLKVSGSLSKADRPHSSIDVYEYAYNTSRAIPLRTASNELYFYANAAG YNGNLPYNVVNELNHSGNSNDNSVIDVAINLDWKPASWINFSSILGLSRSHVTQENWADE QSYYISSMRLTPYGTMLPNPLEDPTFAEEKCLLPFGGELLTSNTRHTSYTWRNSLSLMQS FGKHEVSESIGQEVRSSKYDGLQSTQYGYLPDRGKKFVDIDPTIWKRYGALLKNHPDVVT DTKNNVVSLYATAAYVYDSRYILNFNIRTDGSNKFGQSKSVRFLPIWSVSGRWNVINESF MKNADFLNDLAIRASYGIQGNVHPDQTPNLIASLGTLEALPQEYVSTLHKLPNNKLKWEK TRSYNIAIDWAFWNNRIYGSLDVYYKKGVDQVVTKNVAPSTGAMSVSINDGDVENKGWDL AISLVPIQTKNWMWNLSFNTGKNYNKVLNAGNSAVTWQDYISGTLVSNGNAVNSFYSYKF DKLDAGGYPTFKDINEKDEDGNAVVHSLQEMYDRSFVFSGKREPDLTGGFSTYLKYKNIT FNALFSFSFGNKIRLNDLYESSGQRLPYPDQNMSSEFVNRWREPGDEDRTNIPVLSDKSL TINDRDVTYRIADNGWDMYNKSDIRVVSGSFLRCRSMSVRYDLKREWLKPLYLKGASISF DAGNVFVIKDKALKGRDPEQIGLGARSIPPQRSYSLRFNITL 
>gi|225935316|gb|ACGA01000076.1| GENE 14 14616 - 16079 1040 487 aa, chain + ## HITS:1 COG:no KEGG:BT_3280 NR:ns ## KEGG: BT_3280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 486 3 487 489 772 76.0 0 MKKYSLYIWLLAGILASCGDFLEESSQDLMIPKSVKDYKEFFFGEVMKTKGREIPHPYLE YMTDDVKDQCYYGTRPTPLSNDFREAMWAYYTWQANPEVGMSNEFSTDVAWSSYYHKILM TNIILDQLPEMNGTDMERRDLAGEAYFMRALSYFMLANLYGQPYHPATAGQDLCVPVNDE VSLSDRMMKRATNAVVYAKMEGDVKRSIQCFKAVGGEKSIFRPNLPAAYLLASRIALFQE KYDQTILYSDSVLRATPHSLYKLREDDTHYFFSLVNKEILLSYGVTTFESYMGEDYRYAG QIVVSDDLLAQYPADDLRLAKYFKNTVGRQTSPSAKEYSIHTPFKWKNNSATVYSNAFRI SEAYLNRAEAYAVSGETNKALTDLNELRENRMKAGATPLVIDPEGIVATVRKERRRELAF EGFRWFDLRRYGCPPLQHTYSSKATEGAGDIFKLQDKRAYVLPIPKSERDRNTEIETFER PESEPVK >gi|225935316|gb|ACGA01000076.1| GENE 15 16090 - 17001 731 303 aa, chain + ## HITS:1 COG:no KEGG:BT_3281 NR:ns ## KEGG: BT_3281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 299 299 441 74.0 1e-122 MKKFILLICLLAGLSSCYNEDALSIPTQPDKYGVLTDDPSDPTRHFIYEFYQKYETVIIT NPTEADYKFNFTSDNGIKITASEQEQGVVDEGIDFLREVLLDLYPDDFLKKNLPFSIILA EEVRMDSYGETTVMNCYASGSFIALGNVTAGLKTMTQEEFRKIRADVNATFWARYMSEVR GLFTISDAFYAASEEIQPKIYDWFYFGYDATPYNTDFYHYGLISYDPDRSLVDEDEEDPE WSFYSLYAPSRQTDLSQWMNFVFEKTPAEIQQICNQYPVMKKKYDVIRESMLENGFDLSK LEL >gi|225935316|gb|ACGA01000076.1| GENE 16 17019 - 19568 2192 849 aa, chain + ## HITS:1 COG:no KEGG:BT_3282 NR:ns ## KEGG: BT_3282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 849 1 849 849 1465 83.0 0 MNKIISLLLVGVLAAGVTTAHAESPIFKKKKKKGKTATEQTVTKDSTGNGTTEYHKLLKG GKTIDGMFKIHIVKDKYYFEIPKDLMKRDFMISSRVSSTSNNKDVAAGQMPHNPVLVTFS ADDKKVYMHRKMVRNLCDTTSNMYEAFQRNFSNPIWEAYKIESLSPDSSAYVVNMSALFI TDVPEFSPFRSENIMDVLRKKKAMKGSLVAAKSSILGMKCFPLNINIKTLMSYTVDGGPF TVTMTRNIILLPEEMMRPRYADSRIGYFDESKRFYTEKKDGLQELSYINRWNLQPKPEDW ERYKKGELVEPQKPIIYYVDTAIPDKWRDYVKKGIEDWQVAFEEIGFKNAIIAKDYPKDD 
PDFDPDDIRYSCYRMITTPEQNSMGPSCADPRSGEIIQGDVLFYSNIVKLLHDWRFVQTA AVDPKVRKAVFDNETMGSSLRYVAAHEIGHTLGLMHNFGASYSFPVDSLRSASFTRKYGT TPSIMDYTRFNYVAQPGDKGVTLTPPLLGVYDKFAIKWGYQPIPDAASSEEERTTLNQWI KEKEHDPMYKYGPQPFISEIDPSCKAEDLGDNAVKAGMYGLKNLKIIVKNLPEWTNEDDY RYERLTDSYKEVIIQMQRYLLHAMVSVGGIYLDEPRRDNKQPIIRFVPKAEQKEALNFIL ETVMELPEWLLDKNVIQYTGPIYSPSSLQNMVINRLFFPYITSSLVLYEELYPQQAYRYS EFMDDIYNFVWKKTLSGAGLNMYDRNLQIAYVEKLLKEAGLAKQKASPFSFKAQNEVEEM FLTENVDWQKAGFELNPLSNTGMIREPAIHRKIMDSYKLLQSKANTGDATTRAHYQSLAF KIKQALKDQ >gi|225935316|gb|ACGA01000076.1| GENE 17 19675 - 20862 1160 395 aa, chain - ## HITS:1 COG:no KEGG:BT_2758 NR:ns ## KEGG: BT_2758 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 389 389 702 91.0 0 MKKWIFMLLLAGSIQGVYAQKAEKKEKFNPNTPLFEELTDVKKKTDKFNLYLNMQGSFDA HFQNGFQEGDFNMHQLRIEAKGNINNWLSYRYRQRLNRSNDANGMIDNLPTSIDYAGIGI KLNDQFSLFAGKQCAAYGGIEFDLNPIDIYQYSDMIDYMSNFMTGLNVGYNITPEQQLNL QILNSRNSSFDNTYGITEDAEGNLPDLKSGKMPLVYTLNWNGNFNNVFKTRWSASVMNEA KSHNMYYYALGNELNLGKWNAFVDFMYSKEDIDRKGIITSIVGRPGGHNAFDANYLSVVA KCNYRFLPKWNVFVKGMYETASVGKASEGIEKGNYRTSWGYLGGIEYYPMETNLHFFVTY VGRSYDFTSRAKVLGQENYSTNRISVGFIWQMPVF >gi|225935316|gb|ACGA01000076.1| GENE 18 20878 - 21936 1101 352 aa, chain - ## HITS:1 COG:ECs3833 KEGG:ns NR:ns ## COG: ECs3833 COG0252 # Protein_GI_number: 15833087 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 1 352 1 348 348 373 61.0 1e-103 MKQFKSFGLVVVTLLFSVTMAFAAKPNIHILATGGTIAGTGSSATGTSYTAGQVAIGALL DAVPEIKDIANVTGEQIVKIGSQDMNDQVWLTLAKKINELLKRPDIDGIVITHGTDTMEE TAYFLNLTVKSDKPVVLVGAMRPSTALSADGPLNLYNAVVTAAAKESKDKGVLVAMNGLI LGAQSTVKMNTVDVQTFQAPNSGALGYVLNGKVFYNQVTLKKHTTQSVFDVTHLNALPKV GIVYSYSNIEADMVTPMLSNGYKGIIHAGVGNGNIHQNIFPVLTDARQKGILVVRSSRVP TGPTTLDAEVDDAKYQFVASQELNPQKSRVLLMLALTKTTDWKQIQQYFNEY >gi|225935316|gb|ACGA01000076.1| GENE 19 21972 - 23282 1436 436 
aa, chain - ## HITS:1 COG:HI0746 KEGG:ns NR:ns ## COG: HI0746 COG2704 # Protein_GI_number: 16272687 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Haemophilus influenzae # 2 430 6 433 440 368 55.0 1e-101 MILQLAFVLTAIIIGARLGGIGLGVMGGVGLAILTFVFGLQPTAPPIDVMLMIAAVISAA SCMQAAGGLDYMVKLAEHLLRKNPSHVTLLSPLVTYLFTFVAGTGHVAYSVLPVIAEVAT ETKIRPERPLGIAVIASQQAITASPISAATVALLGLLAGFDITLFDILKITIPATIIGVL VGALFSMRIGKELVEDPEYQKRLKEGLFNNKKVEIKNVKNKRSAMISVIIFILATAFIVL FGSFEGMRPSFLIDGEIVTLGMSSIIEIVMLSAAAIILLVTKTDGIKATQGSVFPAGMQA VIAIFGIAWMGDTFLQGNMGQLTLSIQGIVQQMPWLFGVALFVMSILLYSQAATVRALVP LGIALGISPYMLIALFPAVNGYFFIPNYPTVVAAINFDRTGTTKIGKYVLNHSFMMPGLV STVVAIALGLLFIQIF >gi|225935316|gb|ACGA01000076.1| GENE 20 23461 - 24894 1355 477 aa, chain + ## HITS:1 COG:Cj0087 KEGG:ns NR:ns ## COG: Cj0087 COG1027 # Protein_GI_number: 15791475 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Campylobacter jejuni # 9 468 3 462 468 521 55.0 1e-147 MEQNLSKKTRTESDLIGSREVPESALYGVQTLRGIENFRISKFHLNEYPLFIQALAITKM GATVANRELDLLTEEQADAILRACKEILEGKHYEQFPVDMIQGGAGTTTNMNANEVIANR ALELMGHQRGEYQFCSPNDHVNRSQSTNDAYPTAIHIGLYYTHLKLVKHLEVLIESFRKK ANEFAHVIKMGRTQLEDAVPMTLGQTFNGFASILSHEVKNLDFAAQDFLAVNMGATAIGT GITAEPEYAEKCIAALRKITGLDIKLADDLVGATSDTSCMVGYSGAMRRIAVKMNKICND LRLLASGPRCGLGEINLPAMQPGSSIMPGKVNPVIPEVMNQVAYKVIGNDLCVAMSGEAA QMELNAMEPVMAQCCFESADLLMNGFDTLRTLCIDGITANEEKCRRDVHNSIGVVTALNP VIGYKNSTKIAKEALETGKGVYELVLEHNILSKEDLDTILKPENMIKPVKLDIHPNR >gi|225935316|gb|ACGA01000076.1| GENE 21 25087 - 25905 467 272 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175383|ref|ZP_05761795.1| ## NR: gi|260175383|ref|ZP_05761795.1| hypothetical protein BacD2_26244 [Bacteroides sp. 
D2] # 1 272 1 272 272 561 100.0 1e-158 MINRLLKLIGGGGTHTKEGLNSLTDNPYKVGFSFVLDSSAMARFKEEALKLPFRYLDMKA KLVYITSPDDADSVSFDIVFPAYPMDKYCYFYPSSDFPHYIPKGYTPDIRLAPKNADEKK EVDPKKLMEWISIIESDIKRQLNPDKFQYWLADKRDEMFADKEWSSGCSIYSDELELVGH VHTFCGNANFAGHEESDKADCARVTLYGDPDDEVYFIDLKPLESGAPVSCKKGTVFLKKE PVDYLRLEFCELARFMCRWGLAENGQLFIYLC >gi|225935316|gb|ACGA01000076.1| GENE 22 26832 - 27263 340 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175384|ref|ZP_05761796.1| ## NR: gi|260175384|ref|ZP_05761796.1| hypothetical protein BacD2_26249 [Bacteroides sp. D2] # 1 143 1 143 143 265 100.0 9e-70 MAQWISIEEAAAKYGFETEYIWVLIEMREITVDYERPGIDIINDDSVQEFMSRNEKGATL MYIEKLERYCIDKSKLCSIYLSLLGKQDKEITIYKGAGILYDTLQIKWLKQCDRVQKLEK ELVLGQTTCHTCWLKRICMKIRK >gi|225935316|gb|ACGA01000076.1| GENE 23 27379 - 29397 1217 672 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 439 659 77 304 328 117 28.0 1e-25 MKQSFGIFLLVVCYYTSAMALTINKDSLLRAAKATTDQSQKTHIYRDLADFYAKKPEETH YLKLLYELGKETKNKELVLDALTDLASSHTAKMDFDSVFYYINAIKKEVDAQTENKRLSY LRMKLFDARINKGKEEAKDALDYELDLFKSADKDNIYIQIERAYISGASLIEFQKYQEAN VHLEQAYQLANNLSDRDDMRYRIVTKWLYANNLRDQGRMDECAEALESVVELYKKNYEDN YQGKRIYYNVDCNYLQCYASLLMLAATLPKEKLQHYIDEIKKIGATVTDPRDRYSYLLAM NNYYLFKGDYPSSLAVINEQIGIIRTLMPKYLPYKFKLQSDIYEEMGDYQNALEKHRIYV HLKDSFASQERQEQLNTLQVEYEVDKLKYEKANLESKNKRMQLITLSIILLLTITACIYL YYHWKKEKQMKMKLFILHQKAGESERMKAEFINSMCHEIRTPLNSIVGFSEIILDEDCDD ESKAEFRELIKMNAGVLTSLVDDMLVVANLDSSRELLPCEVVNVSDICREEFGKFEQRNR KPIKYVLNVPEGEVVCSTNAKHLSIVLENLLSNAYKFTEEGCITLELQKSESPDGIRIQI CDTGCGIPLDKQEVVFERFTKLDSFTQGNGIGLFLCKLIISRLMGSIEIDSSYTGGTRFV IFLPAMEKLETK >gi|225935316|gb|ACGA01000076.1| GENE 24 29578 - 30057 327 159 aa, chain + ## HITS:1 COG:no KEGG:ECL_01936 NR:ns ## KEGG: ECL_01936 # Name: not_defined # Def: hypothetical protein # Organism: E.cloacae # Pathway: not_defined # 28 131 24 115 139 76 44.0 3e-13 
MATKSFNRRFHGRCGGKFSFFRNPFVFVILWLVVAAGIGAIVMLLWNWLMPVIFGLPALS LWQALGLLVLMRILFGSFGFGRRGGRPGGGMPGMPGRGMNPVREKWMKMTPEQRKEFINK RKKMGFGPWGRDDFFNRGDFGMDKDFEMDENVDSSKKDE >gi|225935316|gb|ACGA01000076.1| GENE 25 30050 - 30625 545 191 aa, chain + ## HITS:1 COG:ML1076 KEGG:ns NR:ns ## COG: ML1076 COG1595 # Protein_GI_number: 15827526 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium leprae # 12 181 89 244 263 69 28.0 4e-12 MNKQKGHEGDVERLIEEHQPQLKAFIHRRVSSKEDAEDILQDVFYQLVKAVSNTLNPIEQ VTSWLYRVTRNTIINKGKKKSEEEWPVSAYAEEESVLSEFSEVLFANDDLAPSPETEYMR SLVWQELEAALGELPAEQRTVFELMELEGLSVKEVAEATGASANTVLSRKHYAVKHLRKR LEGLYNDILTY >gi|225935316|gb|ACGA01000076.1| GENE 26 30797 - 33115 1758 772 aa, chain + ## HITS:1 COG:no KEGG:BVU_1454 NR:ns ## KEGG: BVU_1454 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 772 7 773 773 472 35.0 1e-131 MKTNYILLFALMLMVSLKSSAQIFSGRVIGENQLPVEYATVALLSVPDSALVGGAVTDSK GVFKITAQDAGPYLLSVSFVGYKKFVQRVASSVEENITLIPDDLMLDEVVITQRLPQFRL DNGGLTTKVENTILSKAGTAMNVVGLLPGVLKKQDGTLEVLGKGVPLIYINNRKVRNLDE LDRLNSENIKQVELITNPGAEYGASVGAVLKLKTTGNKNDGFGVGVRSVVDYAHKVGNND QINMEYHRKGLDVFGAFQYRLEHLKETGESVQDTYVDASWQQKAQTTDLGRNISYFSQIG FNYEINEKHSLGATYELTSAPRNKMNNDNRTEVYDGQALYDVWNTYTFAIEKEYPSHHSN VYYHGEVNSLEMDLNVDVLAGKNKEEEHISELSRNYDDFYITTLSRTDNKLYAGKLVLSY PLAGGRISGGSEYSYTHRTSDFSGFGDEIEGTSDKIREKNLAFFLDCVYRLGGVNASAGL RYEDVYYDFYENSLYQSDESKHYRNLFPSLSLEGAMGQVQWALGYHIQVARPHYEQLKNA IHYGNRFTYLGGAPNLQPTYIHAAEMRAIYRDLQLSVGYNRYNDDILFSIEQITSDPKIS LIKFRNVDHRDEMTASLVFAPSIGCWKPEWTASLNSQWFKIGYLNEMKNMSGTTFGIRWN NSFQLPAGYIFRLNGEYTGKGVYQNCYTRPVSCLSTSIYKSFFNGRVDCLLEGNDLFHSM RDANTQYDDRVKMFRETKRNTQEVKLTVHYKFNLQKSKYKGTGAGLGEQQRL >gi|225935316|gb|ACGA01000076.1| GENE 27 33193 - 33768 760 191 aa, chain + ## HITS:1 COG:no KEGG:BT_2753 NR:ns ## KEGG: BT_2753 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 191 1 196 196 306 
82.0 2e-82 MKKIFGALMIAICIAMAMPAQAQIHFGVKGGLNLSKASFSNVSENFKKDNFTGFFIGPMA EFNIPIVGLGVDASLLFAQRGIKVSEGNDDITVKQNGIDIPVNLKYTIGLGSLAGIYLAA GPDFYFDFEKKSGIDKKKAEVGINVGAGLKLLNHLQVGANYNIPLGNTADIEGTNASYKT KTWQVSVAYIF >gi|225935316|gb|ACGA01000076.1| GENE 28 33960 - 36416 1835 818 aa, chain + ## HITS:1 COG:BS_priA KEGG:ns NR:ns ## COG: BS_priA COG1198 # Protein_GI_number: 16078634 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Bacillus subtilis # 13 817 15 801 805 468 34.0 1e-131 MKKYVDVILPLPLPKSFTYSLPDECAEDVKIGCRVVVPFGRKKFYTAIVLNVHYCAPTEY EVKDISALLDASPILLPAQFKFWEWIADYYLCTQGDVYKAALPSGLKLESETIVEYNPDF EADAPLPEREQRILDLLAVDSQQCVTKLEKDSGIKNILTVIKSLLDKEAIFVKEELRRTY KPKTEARVRLAGTADEKRLHILFDILSRAPKQLALLMKYVECSGVLGAGTPKEVSKKELL QRANVAPSVLNGLVDKKIFEIYYHEIGRLDKQEKEIVELNPLNEFQQRAFDEVVQSFQEK NVCLLHGVTSSGKTEIYIHLIEEAIRQGKQVLYLLPEIALTTQITERLQRVFGARLGIYH SKFPDAERVEIWRKQLGENGYDIILGVRSSIFLPFRNLGLVIVDEEHENTYKQQDPAPRY HARSAAIVLAAMYGAKTLLGTATPSIESWQNAREGKYGFVQLKERYKEIQLPEIIPVDIK ELHRKKRMVGQFSPLLIQYMKEALEQKEQVILFQNRRGFAPMVECRTCGWVPKCKNCDVS LTYHKGINQLTCHYCGYTYQLPKSCPACEGTELVNRGFGTEKIEDDIKILFPEAAVSRMD LDTTRTRSAYEKIIADFEQGKTDILIGTQMVSKGLDFDHVSIVGILNADTMLNYPDFRSY ERAFQLMAQVAGRAGRKNKRGRVVLQTKSIDHPIIHQVIANDYEDMVGGQLVERQMFHYP PYYRLVYVYLKNHNEALLDQMAAVMADKLRAVFGNRVLGPDKPPVARIQTLFIKKIVVKI EQNAPMGRARELLLRIQREMLADERYKSLIVYYDVDPM >gi|225935316|gb|ACGA01000076.1| GENE 29 36469 - 38265 1641 598 aa, chain + ## HITS:1 COG:no KEGG:BT_2751 NR:ns ## KEGG: BT_2751 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 598 1 598 598 1128 87.0 0 MKRLTIFSLTCLFSMGAVFAQQGVTQCGVPTGQPKFPLLTYQELPDPTAPSDKEWAAVTS TQVSWGTTDVRYAKHQLPQLKKQQTVSLKGWRGERVNAQAVVWTGVELKDLNFSFGDFKD KKGNLLPKDAFTGGFVRYVMTDELNKDGRGACGHRKSIDYDSLLVADPIDTNLKTMALPA RTVQPVWVQCWIPQSATPGTYQGELLINDGSRLLQRLNLEITVSSRELPQPSEWAYHLDL WQSPYAVARYYQVPLWSQEHFDAMRPLMKMLADAGQKIITATLTHKPWNGQTEDYFDTMV 
TWIKRADGTWAFDYTIFDRWVEFMMSVGIDKQINCYSMVPWELSFQYYDQATNSLQFVKT APGEAAYEEMWGAMLASFSKHLKEKGWFDICAIAMDERPMEVMQKTLKVIRKADPDFKVS LAGNYHEEIEPDLYDYCIVIGQNFPEEVRLRRVAENKRTNYYTCCTEAHPNTFTFSDPAE AAWMSYYSSKKHLDGYLRWAYNSWPLEPLLDSRFRSWAGGDTYLVYPGARSCIRFERLIE GIQAHEKINILRQEFEKKGNKAGLKKIEKMLAPFNLGSMPEIPAAVTVNRANQILNSF >gi|225935316|gb|ACGA01000076.1| GENE 30 38283 - 38753 451 156 aa, chain + ## HITS:1 COG:alr5068 KEGG:ns NR:ns ## COG: alr5068 COG0394 # Protein_GI_number: 17232560 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Nostoc sp. PCC 7120 # 3 153 4 155 161 151 50.0 4e-37 MKKILFVCLGNICRSSTAEGVMLHLIEEAGLEKEFVIDSAGILSYHQGELPDSRMRAHAA RRGYQLVHRSRPVRTEDFYNFDLIIGMDDRNIDDLKDKAPSPEEWKKIHRMTEYCTRIPA DHVPDPYYGGAEGFEYVLDVLEDACAGLLTSLTQDN >gi|225935316|gb|ACGA01000076.1| GENE 31 38728 - 40800 573 690 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 196 687 243 732 750 225 29 9e-58 MEHLKKKKRFSWRDLLYKSLLFVGTVALIVYFLPRDGKFNYQFDINKPWKYGQLIATFDF PIYKEDAVVKREQDSLMAFFQPYYQLDKNIEKDAIAKLKENYHTNLKGILPSIDYLRYIE RTLKEIYQAGIVSTENIQQLHKDSTSSVMVIDDKLANPQATENLYTVKKAYEHLLSADST HFNREILRQCSLNEYITPNLTFDEERTQAAKEEILNNYSWANGLVVSGQKIIDRGEIISP HTYNILESLRKESIKRNESMGQNRLILGGQILFVGMLMLCFMLYLDLFRKDYYQRKGSLS LLFTLIVFYSVITAFMVTHNLFNVYIIPYAMLPIIIRVFLDSRTAFLTHVITILICSISL RFPHEFILTQLAAGLVAIFSLRELSQRSQLFRTALLVILTYAAIYFAFELMTENGLSTDF SKLNIRMYTYFIINGILLLFTYPLLFLLEKTFGFTSNVTLVELSNINNDLLRQMSETVPG TFQHSMQVANLAAEAAIRIGAKSQLVRTGALYHDIGKMENPAFFTENQSGGINPHKNLNY EQSAQVVISHVTDGLKLADKHNLPKAIKDFISTHHGRGKTKYFYISWKNEHPDEEPNEEL FTYPGPNPFTKEQAILMMADAVEAASRSLPEYTEETISNLVDKIIDSQVTEGYFKECPIT FKDIATVKAVFKEKLKIAYHTRISYPELKK >gi|225935316|gb|ACGA01000076.1| GENE 32 40895 - 42412 1496 505 aa, chain + ## HITS:1 COG:BB0372 KEGG:ns NR:ns ## COG: BB0372 COG0008 # Protein_GI_number: 15594717 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # 
Organism: Borrelia burgdorferi # 7 505 4 487 490 332 37.0 1e-90 MAERKVRVRFAPSPTGALHIGGVRTALYNYLFARQHGGDLIFRIEDTDSNRFVPGAEEYI LESFKWLGIHFDEGVSFGGEQGPYRQSERREIYKKYVQILLENGKAYIAFDTPEELDAKR AEIANFQYDASTRGMMRNSLTMPKEEVDALIAEGKQYVVRFKIEPNEDIHVNDLIRGEVV INSSILDDKVLYKSADELPTYHLANIVDDHLMEVSHVIRGEEWLPSAPLHVLLYRAFGWE DTMPEFAHLPLLLKPEGNGKLSKRDGDRLGFPVFPLEWHDPKSGEISSGYRESGYLPEAV INFLALLGWNPGNDLEVMSMDELIKLFDLHRCSKSGAKFDYKKGIWFNHQYIQQKPNEEI AELFLPFLKEHGVEAPFEKVVTVVGMMKDRVSFIKELWDVCSFFFVAPTEYDEKTVKKRW KEDSAKCMTELAEVIASIEDFSIEGQEKVVMDWIAEKGYHTGNIMNAFRLTLVGEGKGPH MFDISWVLGKEETIARMKRAVEVLK >gi|225935316|gb|ACGA01000076.1| GENE 33 42426 - 43649 1183 407 aa, chain + ## HITS:1 COG:YPO0055 KEGG:ns NR:ns ## COG: YPO0055 COG1519 # Protein_GI_number: 16120408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Yersinia pestis # 49 405 51 416 425 140 28.0 4e-33 MLYDLAIVIYDFIVHLAAPFSRKPRKMMKGHWVVYELLRQQVEKGEQYIWFHAASLGEFE QGRPLIEMIREKYPNYKILLTFFSPSGYEVRKHYRGADIVCYLPFDKPRNVKKFLDIANP CMAFFIKYEFWKNYLDELHKRRIPVYSVSSIFRREQIFFKWYGGTYRNVLKDFDHLFVQN EASKRYLSKIGISRVTVVGDTRFDRVLQIREEAKELPLVEKFKGNNSFTFVAGSSWGPDE DLFLEYFNNHLEMKLIIAPHVIDENHLVEIIGKLKRPYVRYTRADERNVLKADCLIIDCF GLLSSIYRYGEIAYIGGGFGVGIHNTLEAAVYGIPVIFGPKYQKFMEAVQLLEAKGAYSI KDYDELKTLLDRFLTDEVFLRETGTNAGYYVTSNAGATEKIMHMINF >gi|225935316|gb|ACGA01000076.1| GENE 34 43842 - 45053 688 403 aa, chain - ## HITS:1 COG:no KEGG:BT_2745 NR:ns ## KEGG: BT_2745 # Name: not_defined # Def: thiol:disulfide interchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 403 12 404 404 618 76.0 1e-175 MRIKQLLIIFFLTALPTIHAQNSGKEYLKKVLTKLEKIESATYNITNESWQHGDSTALSI LHGFVKEYNNPMDSTIGASYVSLDANDTTKMDFYYDGNIRALTYHDNKKLVIDDFTFRPL PFRPVSPPFFNHAKNIIQYALTTKDHTTLELKDEGNNYYFKLVIDENQQVEFFGKAYHLP LPPFDIGNTTSIYELWIDKSNDLPYKKRREMSHDISVSTCSDLQTNNLAINDFNTAHYFP EDYEIVKYSDLHKKDGDSSASELIGKKAPAWILENIEAQPVSLADCKSRIILINFTGIGC GACQAAIPFLKQLKDSFNSEEFDLIAIESWSRKTSAIRNYANRKELNYTILNATNEVIKE 
YQTGGAAPYFFIMDQERIIRKVIRGYSKKNTDKEIINAIKELL >gi|225935316|gb|ACGA01000076.1| GENE 35 45300 - 45812 445 170 aa, chain - ## HITS:1 COG:BS_ytoA KEGG:ns NR:ns ## COG: BS_ytoA COG0663 # Protein_GI_number: 16080104 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Bacillus subtilis # 3 170 1 166 171 150 50.0 1e-36 MALIKSVRGFTPEIGENCFLADNATIIGDVKIGNDCSIWFSTVLRGDVNSIRIGNGVNIQ DGSVLHTLYQKSTIEIGDHVSVGHNVTIHGATIKDYALVGMGSTILDHAVVGEGAIVAAG SLVLSNTIIEPGSIWGGVPAKFIKNVDPEQAKELNQKIAHNYLMYSQWYK >gi|225935316|gb|ACGA01000076.1| GENE 36 45939 - 47720 1194 593 aa, chain - ## HITS:1 COG:Cj0653c KEGG:ns NR:ns ## COG: Cj0653c COG0006 # Protein_GI_number: 15792013 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Campylobacter jejuni # 6 593 5 595 596 462 42.0 1e-129 MKQNIKERVHALRMTFHPNYIKAFIIPSTDPHLSEYVAPHWMSREWISGFTGSAGTAVIL MDKAGLWTDSRYFLQAAKELEGSGITLYKEMLPETPSITEFLCQHLKPGESVSIDGKMFS VQQVEQMKEELAAHQLQVDIFGDPLKNIWKDRPSIPDSPALIYDIKYAGKSCEEKISAIR AELKKKGVYALFISALDEIAWTLNLRGNDVHCNPVIVSYLLITQDEVTYFISPEKVTPEV ETYLKKQQIGIQKYDEVETFLNSFPGENILIDPRKTNYAIYSAINPKCSIIRGESPVTLL KAIRNEQEIAGIHAAMQRDGVALVRFLKWLEESVSAGKETELSIDKKLHEFRAAQPLYMG ESFDTIAGYKEHGAIVHYSATPESDVTLQPKGFLLLDSGAQYMDGTTDITRTIALGELTE EEKTDYTLILKGHIALAMAKFPAGTRGAQLDVLARMPIWNHRMNFLHGTGHGVGHFLSVH EGPQSIRMNENPVILQPGMVTSNEPGVYKAGSHGIRTENLTLVCKDGEGMFGEYLKFETI TLCPICKKGIIKEMLTKEEIEWLNNYHQTVYEKLSPDLNEEEKTWLQKATTSI >gi|225935316|gb|ACGA01000076.1| GENE 37 47909 - 48100 310 63 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 124 100 3e-27 MIVVPVKEGENIEKALKKFKRKFEKTGIVKELRSRQQFDKPSVTKRLKKERAVYVQQLQQ VED >gi|225935316|gb|ACGA01000076.1| GENE 38 48174 - 49055 693 293 aa, chain + ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination 
and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 2 293 4 295 295 194 35.0 1e-49 MLIESFLDYLQYERNYSEKTVLAYGEDIKQLQEFAQEEYGKFNPLEVEAELIREWIVSLM DKGYTSTSVNRKLSSLRTFYKYLIRQGKTTIDPLRKIKGPKNKKPLPVFLKENEMNRLLD DTDFGEGFKGCRDRLVIEVFYATGMRLSELIGLDDKDVDFSASLLKVTGKRNKQRLIPFG DELRELMLEYINVRNETIPERSEAFFIRENGERLYKNLVYNLVKRNLSKVATLKKRSPHV LRHTFATTMLNNEAELGAVKELLGHESITTTEIYTHATFEELKKVYKQAHPRA >gi|225935316|gb|ACGA01000076.1| GENE 39 49072 - 49371 193 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 [Kordia algicida OT-1] # 1 98 4 101 102 79 41 1e-13 MDVRIQSIHFDASEQLQAFIQKKVSKLEKYYEDIKKVEVSLKVVKPEVAENKEAGIKILI PNGEFYASKVCDTFEEAIDLDVEALVKQLVKYKEKQRSK >gi|225935316|gb|ACGA01000076.1| GENE 40 49945 - 51129 1409 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 394 1 407 407 547 66 1e-154 MAKEKFERTKPHVNIGTIGHVDHGKTTLTAAITTVLAKKGLSELRSFDSIDNAPEEKERG ITINTSHVEYQTANRHYAHVDCPGHADYVKNMVTGAAQMDGAIIVCAATDGPMPQTREHI LLARQVNVPRLVVFLNKCDMVDDEEMLELVEMEMRELLSFYDFDGDNTPIIRGSALGALN GVEKWEDKVMELMDAVDTWIPLPPRDVDKPFLMPVEDVFSITGRGTVATGRIETGIIHVG DEIEILGLGEDKKSVVTGVEMFRKLLDQGEAGDNVGLLLRGVDKNEIKRGMVLCKPGQIK PHSKFKAEVYILKKEEGGRHTPFHNKYRPQFYLRTMDCTGEITLPEGTEMVMPGDNVTIT VELIYPVALNPGLRFAIREGGRTVGAGQITEIID >gi|225935316|gb|ACGA01000076.1| GENE 41 51269 - 51460 140 63 aa, chain + ## HITS:1 COG:no KEGG:BF4198 NR:ns ## KEGG: BF4198 # Name: not_defined # Def: preprotein translocase SecE subunit # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060]; Bacterial secretion system [PATH:bfr03070] # 1 63 1 63 63 100 90.0 2e-20 MKKVIAYIKESYDELVHKVSWPTYSELANSAVVVLYASLLIALVVWGMDVCFQNFMEKIV YPH >gi|225935316|gb|ACGA01000076.1| GENE 42 51477 - 52019 534 180 aa, chain + ## HITS:1 COG:CC3205 KEGG:ns NR:ns ## COG: CC3205 COG0250 # Protein_GI_number: 16127435 # Func_class: K Transcription # Function: Transcription antiterminator # 
Organism: Caulobacter vibrioides # 7 179 12 183 185 150 47.0 9e-37 MAEIEKKWYVLRAISGKEAKVKEYLEADLKNSDLGEYVSQVLIPTEKVYQVRNGKKIVKE RSYLPGYVLVEAALVGEVAHHLRNTPNVIGFLGGSEKPVPLRQSEVNRILGTVDELQETG EELNIPYVVGETVKVTFGPFSGFSGIIEEVNSEKKKLKVMVKIFGRKTPLELGFMQVEKE >gi|225935316|gb|ACGA01000076.1| GENE 43 52081 - 52524 743 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883077|ref|ZP_02064080.1| hypothetical protein BACOVA_01040 [Bacteroides ovatus ATCC 8483] # 1 147 1 147 147 290 100 2e-77 MAKEVAGLIKLQIKGGAANPSPPVGPALGSKGINIMEFCKQFNARTQDKAGKILPVIITY YADKSFDFVIKTPPVAIQLLEVAKVKSGSAEPNRKKVAELTWEQVRTIAQDKMVDLNCFT VEAAMRMVAGTARSMGIAVKGEFPVNN >gi|225935316|gb|ACGA01000076.1| GENE 44 52540 - 53238 1159 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883076|ref|ZP_02064079.1| hypothetical protein BACOVA_01039 [Bacteroides ovatus ATCC 8483] # 1 232 1 232 232 451 100 1e-125 MGKLTKNQKLAAEKIEAGKAYSLKEAASLVKEITFSKFDASLDIDVRLGVDPRKANQMVR GVVSLPHGTGKEVRVLVLCTPDAEAAAKEAGADYVGLDEYIEKIKGGWTDIDVIITMPSI MGKIGALGRVLGPRGLMPNPKSGTVTMDVAKAVKEVKQGKIDFKVDKSGIVHTSIGKVSF SPDQIRDNAKEFISTLNKLKPTAAKGTYIKSIYLSSTMSAGIKIDPKSVDEI >gi|225935316|gb|ACGA01000076.1| GENE 45 53254 - 53775 852 173 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883075|ref|ZP_02064078.1| hypothetical protein BACOVA_01038 [Bacteroides ovatus ATCC 8483] # 1 173 1 173 173 332 100 4e-90 MRKEDKSTIIEQIAATVKEYGHFYLVDVTAMNAAATSALRRDCFKSDIKLMVVKNTLLHK ALESLEEDFSPLYGSLKGTTAVMFTNTANVPAKLIKDKAKDGIPGLKAAYAEESFYVGAD QLDALVAIKSKNEVIADIVALLQSPAKNVISALQSGGNTLHGVLKTLGERTEA >gi|225935316|gb|ACGA01000076.1| GENE 46 53866 - 54240 589 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 [Bacteroides caccae ATCC 43185] # 1 124 1 124 124 231 100 1e-59 MADLKAFAEQLVNLTVKEVNELATILKEEYGIEPAAAAVAVAAGPAAGAAAAEEKTSFDV VLKSAGAAKLQVVKAVKEACGLGLKEAKDLVDGAPSTVKEGLAKDEAESLKKTLEEAGAE VELK >gi|225935316|gb|ACGA01000076.1| GENE 47 54345 - 58157 4267 1270 aa, chain + ## HITS:1 COG:SMc01317 KEGG:ns NR:ns ## COG: 
SMc01317 COG0085 # Protein_GI_number: 15965101 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta subunit/140 kD subunit # Organism: Sinorhizobium meliloti # 9 1270 14 1362 1380 1135 46.0 0 MSSNTVNQRVNFASTKNPLEYPDFLEVQLKSFQDFLQLDTPPEKRKNEGLYKVFAENFPI ADTRNNFVLEFLDYYIDPPRYTIDDCIERGLTYSVPLKAKLKLYCTDPDHEDFDTVIQDV FLGPIPYMTDKATFVINGAERVVVSQLHRSPGVFFGQSVHANGTKLYSARIIPFKGSWIE FATDINNVMYAYIDRKKKLPVTTLLRAIGFENDKDILEIFNLAEDVKVNKTNLKKVLGRK LAARVLKTWIEDFVDEDTGEVVSIERNEVIIDRETVLEEVHIDEILESGVQNILLHKDEP NQSDFSIIYNTLQKDPSNSEKEAVLYIYRQLRNADPADDASAREVINNLFFSEKRYDLGD VGRYRINKKLNLTTDMDVRVLTKEDIIEIIKYLIELINSKADVDDIDHLSNRRVRTVGEQ LSNQFAVGLARMSRTIRERMNVRDNEVFTPIDLINAKTISSVINSFFGTNALSQFMDQTN PLAEITHKRRMSALGPGGLSRERAGFEVRDVHYTHYGRLCPIETPEGPNIGLISSLCVFA KINDLGFIETPYRKVENGKVDLSDNGLIYLTAEEEEEKVIAQGNAPLNDDGTFVRNKVKS RQDADFPVVEPTEVDLMDVSPQQIASIAASLIPFLEHDDANRALMGSNMMRQAVPLLRSE APIVGTGIERQLVRDSRTQITAEGDGVVDYVDATTIRILYDRTEDEEFVSFEPALKEYRI PKFRKTNQNMTIDLRPICDRGQRVKKGDILTEGYSTEKGELALGKNLLVAYMPWKGYNYE DAIVLNERVVREDLLTSVHVEEYSLEVRETKRGMEELTSDIPNVSEEATKDLDENGIVRI GARIEPGDIMIGKITPKGESDPSPEEKLLRAIFGDKAGDVKDASLKASPSLKGVVIDKKL FSRVIKNRSSKLADKALLPKIDDEFESKVADLKRILVKKLMTLTEGKVSQGVKDYLGAEV IAKGSKFSASDFDSLDFTSIQLSNWTSDDHINGMIRDLVMNFIKKYKELDAELKRKKFAI TIGDELPAGIIQMAKVYIAKKRKIGVGDKMAGRHGNKGIVSRVVRQEDMPFLSDGTPVDI VLNPLGVPSRMNIGQIFEAVLGRAGKTLGVKFATPIFDGATMEDLDQWTDKAGLPRYCKT YLCDGGTGEQFDQAATVGVTYMLKLGHMVEDKMHARSIGPYSLITQQPLGGKAQFGGQRF GEMEVWALEGFGAAHILQEILTIKSDDVVGRSKAYEAIVKGEPMPQPGIPESLNVLLHEL RGLGLSINLE >gi|225935316|gb|ACGA01000076.1| GENE 48 58262 - 62545 4666 1427 aa, chain + ## HITS:1 COG:mlr0277 KEGG:ns NR:ns ## COG: mlr0277 COG0086 # Protein_GI_number: 13470543 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Mesorhizobium loti # 13 1395 18 1356 1398 1325 50.0 0 MAFRKENKTKSNFSKISIGLASPEEILENSSGEVLKPETINYRTYKPERDGLFCERIFGP IKDYECHCGKYKRIRYKGIVCDRCGVEVTEKKVRRERMGHIQLVVPVAHIWYFRSLPNKI 
GYLLGLPTKKLDSIIYYERYVVIQPGVKAEDGVAEYDLLSEEEYLDILDTLPKDNQYLED NDPNKFVAKMGAEAIYDLLARLDLDALSYELRHRAGNDASQQRKNEALKRLQVVESFRAS RGRNKPEWMIVRIVPVIPPELRPLVPLDGGRFATSDLNDLYRRVIIRNNRLKRLIEIKAP EVILRNEKRMLQESVDSLFDNSRKSSAVKTDANRPLKSLSDSLKGKQGRFRQNLLGKRVD YSARSVIVVGPELRMGECGIPKLMAAELYKPFIIRKLIERGIVKTVKSAKKIVDRKEPVI WDILEHVMKGHPVLLNRAPTLHRLGIQAFQPKMIEGKAIQLHPLACTAFNADFDGDQMAV HLPLSNEAILEAQMLMLQSHNILNPANGAPITVPAQDMVLGLYYITKLRAGAKGEGLTFY GPEEALIAYNEGKVDIHAPVKVIVKDVDENGNIVDVMRETSVGRVIVNEIVPPEAGYINT IISKKSLRDIISDVIKVCGVAKAADFLDGIKNLGYQMAFKGGLSFNLGDIIIPKEKETLV QKGYDEVEQVVNNYNMGFITNNERYNQVIDIWTHVNSELSNILMKTISSDDQGFNSVYMM LDSGARGSKEQIRQLSGMRGLMAKPQKAGAEGGQIIENPILSNFKEGLSVLEYFISTHGA RKGLADTALKTADAGYLTRRLVDVSHDVIITEEDCGTLRGLVCTDLKNNDEVIATLYERI LGRVSVHDIIHPTTGELLVAGGEEITEEVAKKIQESPIESVEIRSVLTCEAKKGVCAKCY GRNLATSRMVQKGEAVGVIAAQSIGEPGTQLTLRTFHAGGTAANIAANASIVAKNNARLE FEELRTVDIVDEMGESAKVVVGRLAEVRFVDVNTGIVLSTHNVPYGSTLYVSDGDLVEKG KLIARWDPFNAVIITEATGKIEFEGVIENVTYKVESDEATGLREIIIIESKDKTKVPTAH ILTEDGDLIRTYNLPVGGHVIIENGQKVKAGEVIVKIPRAVGKAGDITGGLPRVTELFEA RNPSNPAVVSEIDGEVTMGKIKRGNREIIVTSKTGEVKKYLVALSKQILVQENDYVRAGT PLSDGATTPADILAIKGPTAVQEYIVNEVQDVYRLQGVKINDKHFEIIVRQMMRKVQIDE PGDTRFLEQQVVDKLEFMEENDRIWGKKVVVDAGDSQNMQAGQIVTARKLRDENSMLKRR DLKPVEVRDAVAATSTQILQGITRAALQTSSFMSAASFQETTKVLNEAAINGKIDKLEGM KENVICGHLIPAGTGQREFEKLIVGSKEEYDRILANKKTVLDYNEVE >gi|225935316|gb|ACGA01000076.1| GENE 49 62692 - 63000 359 102 aa, chain + ## HITS:1 COG:no KEGG:BT_2732 NR:ns ## KEGG: BT_2732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 102 1 101 101 164 89.0 1e-39 MEEQNNNGQLQIELREEVAQGTYANLAIITHSSSEFILDFVRVMPGIPKAGVQSRIIVAP EHAKRLLRALEDNIAKYERVFGPIRTSDEPPISPLTGVKGEA >gi|225935316|gb|ACGA01000076.1| GENE 50 63137 - 63547 701 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883069|ref|ZP_02064072.1| hypothetical protein BACOVA_01032 [Bacteroides ovatus ATCC 8483] # 1 136 1 136 136 274 100 1e-72 MPTIQQLVRKGREVLVEKSKSPALDSCPQRRGVCVRVYTTTPKKPNSAMRKVARVRLTNQ 
KEVNSYIPGEGHNLQEHSIVLVRGGRVKDLPGVRYHIVRGTLDTAGVAGRTQRRSKYGAK RPKPGQAAAPAKGKKK >gi|225935316|gb|ACGA01000076.1| GENE 51 63691 - 64167 809 158 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883068|ref|ZP_02064071.1| hypothetical protein BACOVA_01031 [Bacteroides ovatus ATCC 8483] # 1 158 1 158 158 316 100 4e-85 MRKAKPKKRLILPDPVFNDQKVSKFVNHLMYDGKKNTSYEIFYAALETVKAKLPNEEKSA LEIWKKALDNVTPQVEVKSRRVGGATFQVPTEIRPDRKESISMKNLILFARKRGGKSMAD KLAAEIMDAFNEQGGAYKRKEDMHRMAEANRAFAHFRF >gi|225935316|gb|ACGA01000076.1| GENE 52 64213 - 66330 1961 705 aa, chain + ## HITS:1 COG:HP1195 KEGG:ns NR:ns ## COG: HP1195 COG0480 # Protein_GI_number: 15645809 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Helicobacter pylori 26695 # 3 700 4 692 692 842 60.0 0 MAKHDLHLTRNIGIMAHIDAGKTTTSERILFYTGLTHKIGEVHDGAATMDWMEQEQERGI TITSAATTTRWKYAGDTYKINLIDTPGHVDFTAEVERSLRILDGAVAAYCAVGGVEPQSE TVWRQADKYNVPRIAYVNKMDRSGADFFEVVRQMKDVLGANPCPIVVPIGAEESFKGLVD LIKMKAIYWHDETMGADYSVEEIPAELVDEANEWRDKMLEKVAEFDDALMEKYFDDPSTI TEEEVLRALRNATVQMAVVPMLCGSSFKNKGVQTLLDYVCAFLPSPLDTENVIGTNPNTG AEEDRKPSDDEKTSALAFKIATDPYVGRLTFFRVYSGKIEAGSYIYNSRSGKKERVSRLF QMHSNKQNPVEVIGAGDIGAGVGFKDIHTGDTLCDENAPIVLESMDFPEPVIGIAVEPKT QKDMDKLSNGLAKLAEEDPTFTVKTDEQTGQTVISGMGELHLDIIIDRLKREFKVECNQG KPQVNYKEAITKTVNLREVYKKQSGGRGKFADIIVNIGPADADFTLGGLQFVDEVKGGNI PKEFIPAVQKGFTNAMKSGVLAGYPLDSLKVTLVDGSFHPVDSDQLSFEICAMQAYKNAC AKAGPVLMEPIMKLEVVTPEENMGDVIGDLNKRRGQVEGMESSRSGARIVKAMVPLAEMF GYVTALRTITSGRATSSMTYSHHAQLSSSIAKAVLEEVKGRADLL >gi|225935316|gb|ACGA01000076.1| GENE 53 66410 - 66715 495 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 [Bacteroides thetaiotaomicron VPI-5482] # 1 101 1 101 101 195 100 1e-48 MSQKIRIKLKSYDHNLVDKSAEKIVRTVKATGAIVSGPIPLPTHKRIFTVNRSTFVNKKS REQFELSSYKRLIDIYSSTAKTVDALMKLELPSGVEVEIKV >gi|225935316|gb|ACGA01000076.1| GENE 54 66734 - 67351 1056 205 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|237717399|ref|ZP_04547880.1| 50S ribosomal protein L3 [Bacteroides sp. D1] # 1 205 1 205 205 411 98 1e-114 MPGLLGKKIGMTSVFSADGKNVPCTVIEAGPCVVTQVKTVEKDGYAAVQLGFQDKKEKHT TKPLMGHFKRAGVTPKRHLAEFKEFGTDLNLGDTITVEMFNDANFVDVVGTSKGKGFQGV VKRHGFGGVGQATHGQHNRARKPGSIGACSYPAKVFKGMRMGGQLGGDRVTVQNLQVLKV IADHNLLLIKGSIPGCKGSIVLIEK >gi|225935316|gb|ACGA01000076.1| GENE 55 67351 - 67977 1036 208 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883063|ref|ZP_02064066.1| hypothetical protein BACOVA_01026 [Bacteroides ovatus ATCC 8483] # 1 208 1 208 208 403 99 1e-111 MEVNVYNIKGEDTGRKVTLNESIFGIEPNDHAIYLDVKQFMTNQRQGTHKSKERSEISGS TRKIGRQKGGGGARRGDMNSPVLVGGARVFGPKPRDYFFKLNKKVKTLARKSALSYKAQN DAIVVVEDFTFEAPKTKDFVAMTKNLKVSDKKLLVILPEANKNVYLSARNIESANVQTIS GLNTYRVLNAGVVVLTENSLKAIDNILI >gi|225935316|gb|ACGA01000076.1| GENE 56 67994 - 68284 476 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883062|ref|ZP_02064065.1| hypothetical protein BACOVA_01025 [Bacteroides ovatus ATCC 8483] # 1 96 1 96 96 187 98 2e-46 MGIIIKPLVTEKMTAITDKLNRFGFVVRPEANKLEIKKEIEALYNVTVVDVNTVKYAGKN KSRYTKAGIINGRTNAFKKAIVTLKEGDSIDFYSNI >gi|225935316|gb|ACGA01000076.1| GENE 57 68290 - 69114 1416 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805960|ref|ZP_01958628.1| hypothetical protein BACCAC_00205 [Bacteroides caccae ATCC 43185] # 1 274 1 274 274 550 100 1e-155 MAVRKFKPTTPGQRHKIIGTFEEITASVPEKSLVYGKKSSGGRNSEGKMTMRYIGGGHRK VIRIVDFKRNKDGVPAVVKTIEYDPNRSARIALLFYADGEKRYIIAPNGLQVGATLMSGE TAAPEIGNALPLQNIPVGTVIHNIELRPGQGAALVRSAGNFAQLTSREGKYCVIKLPSGE VRQILSTCKATIGSVGNSDHGLESSGKAGRSRWQGRRPRNRGVVMNPVDHPMGGGEGRAS GGHPRSRKGLYAKGLKTRAPKKQSSKYIIERRKK >gi|225935316|gb|ACGA01000076.1| GENE 58 69135 - 69404 465 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805961|ref|ZP_01958629.1| hypothetical protein BACCAC_00206 [Bacteroides caccae ATCC 43185] # 1 89 1 89 89 183 100 3e-45 MSRSLKKGPYINVKLEKRILAMNESGKKVVVKTWARASMISPDFVGHTVAVHNGNKFIPV YVTENMVGHKLGEFAPTRTFRGHAGNKKR >gi|225935316|gb|ACGA01000076.1| GENE 59 69441 - 69851 679 136 aa, chain + 
## PROTEIN SUPPORTED ## NR: gi|29348131|ref|NP_811634.1| 50S ribosomal protein L22 [Bacteroides thetaiotaomicron VPI-5482] # 1 136 1 136 136 266 100 5e-70 MGARKKISAEKRKEALKTMYFAKLQNVPTSPRKMRLVADMIRGMEVNRALGVLKFSSKEA AARVEKLLRSAIANWEQKNERKAESGELFVTQIFVDGGATLKRMRPAPQGRGYRIRKRSN HVTLFVGAKSNNEDQN >gi|225935316|gb|ACGA01000076.1| GENE 60 69857 - 70588 1248 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 [Bacteroides ovatus ATCC 8483] # 1 243 1 243 243 485 100 1e-136 MGQKVNPISNRLGIIRGWDSNWYGGNDYGDSLLEDSKIRKYLNARLAKASVSRIVIERTL KLVTITVCTARPGIIIGKGGQEVDKLKEELKKVTDKDIQINIFEVKRPELDAVIVANNIA RQVEGKIAYRRAIKMAIANTMRMGAEGIKIQISGRLNGAEMARSEMYKEGRTPLHTFRAD IDYCHAEALTKVGLLGIKVWICRGEVFGKKELAPNFTQSKESGRGNNSGNNGGKNFKRKK NNR >gi|225935316|gb|ACGA01000076.1| GENE 61 70612 - 71046 737 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805964|ref|ZP_01958632.1| hypothetical protein BACCAC_00209 [Bacteroides caccae ATCC 43185] # 1 144 1 144 144 288 100 9e-77 MLQPKKTKFRRQQKGRQKGNAQRGNQLAFGSFGIKALETKWITGRQIEAARIAVTRYMQR QGQIWIRIFPDKPITRKPADVRMGKGKGAPEGFVAPVTPGRIIIEAEGVSYEIAKEALRL AAQKLPITTKFVVRRDYDIQNQNA >gi|225935316|gb|ACGA01000076.1| GENE 62 71052 - 71249 321 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883056|ref|ZP_02064059.1| hypothetical protein BACOVA_01019 [Bacteroides ovatus ATCC 8483] # 1 65 1 65 65 128 100 2e-28 MKIAEIKEMTTNDLVERVEAETANYNQMVINHSISPLENPAQIKQLRRTIARMKTELRER ELNNK >gi|225935316|gb|ACGA01000076.1| GENE 63 71246 - 71515 456 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 [Bacteroides ovatus ATCC 8483] # 1 89 1 89 89 180 100 3e-44 MISLMEARNLRKERTGVVLSNKMDKTITVAAKFKEKHPIYGKFVSKTKKYHAHDEKNECN IGDTVSIMETRPLSKTKRWRLVEIIERAK >gi|225935316|gb|ACGA01000076.1| GENE 64 71518 - 71883 596 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 [Bacteroides thetaiotaomicron VPI-5482] # 1 121 1 121 121 234 100 2e-60 
MIQVESRLTVCDNSGAKEALCIRVLGGTGRRYASVGDVIVVSVKSVIPSSDVKKGAVSKA LIVRTKKEIRRPDGSYIRFDDNACVLLNNAGEIRGSRIFGPVARELRATNMKVVSLAPEV L >gi|225935316|gb|ACGA01000076.1| GENE 65 71903 - 72223 534 106 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883053|ref|ZP_02064056.1| hypothetical protein BACOVA_01016 [Bacteroides ovatus ATCC 8483] # 1 106 1 106 106 210 100 3e-53 MSKLHIKKGDTVYVNAGEDKGKTGRVLKVLVKEGRAFVEGINMVSKSTKPNAKNPQGGIV KQEASIHISNLNPVDPKTGKATRIGRKKSSEGTLVRYSKKSGEEIK >gi|225935316|gb|ACGA01000076.1| GENE 66 72223 - 72780 913 185 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883052|ref|ZP_02064055.1| hypothetical protein BACOVA_01015 [Bacteroides ovatus ATCC 8483] # 1 185 1 185 185 356 100 3e-97 MSNTASLKKEYAERIAPALKSQFQYSSTMQVPVLKKIVINQGLGMAVADKKIIEVAINEM TAITGQKAVATISRKDIANFKLRKKMPIGVMVTLRRERMYEFLEKLVRVALPRIRDFKGI ESKFDGKGNYTLGIQEQIIFPEINIDSITRILGMNITFVTSAETDEEGYALLKEFGLPFK NAKKD >gi|225935316|gb|ACGA01000076.1| GENE 67 72786 - 73085 503 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883051|ref|ZP_02064054.1| hypothetical protein BACOVA_01014 [Bacteroides ovatus ATCC 8483] # 1 99 1 99 99 198 100 1e-49 MAKESMKAREVKRAKLVAKYAERRAALKQIVRTGDPADAFEAAQKLQELPKNSNPIRMHN RCKLTGRPKGYIRQFGISRIQFREMASNGLIPGVKKASW >gi|225935316|gb|ACGA01000076.1| GENE 68 73138 - 73533 664 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883050|ref|ZP_02064053.1| hypothetical protein BACOVA_01013 [Bacteroides ovatus ATCC 8483] # 1 131 1 131 131 260 100 3e-68 MTDPIADYLTRLRNAIGAKHRVVEVPASNLKKEITKILFEKGYILNYKFVEDGPQGTIKV ALKYDSVNKVNAIKKLERISSPGMRKYTGYKDMPRVINGLGIAIISTSKGVMTNKEAAEL KIGGEVLCYVY >gi|225935316|gb|ACGA01000076.1| GENE 69 73549 - 74118 967 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153805972|ref|ZP_01958640.1| hypothetical protein BACCAC_00217 [Bacteroides caccae ATCC 43185] # 1 189 1 189 189 377 100 1e-103 MSRIGKLPISIPAGVTVTLKDNVVTVKGPKGEMSQYVNPAINVAIEDGHVTLTENDKEML DNPKQKHAFHGLYRSLVHNMVVGVSEGYKKELELVGVGYRASNQGNIIELALGYTHNIFI QLPAEVKVETKSERNKNPLIILESCDKQLLGQVCSKIRSFRKPEPYKGKGIKFVGEVIRR 
KSGKSAGAK >gi|225935316|gb|ACGA01000076.1| GENE 70 74140 - 74484 538 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 [Bacteroides thetaiotaomicron VPI-5482] # 1 114 1 114 114 211 95 1e-53 MTTKIERRVKIKYRVRNKISGTTECPRMSVFRSNKQIYVQIIDDLSGKTLAAASSLGLTD KVAKKEQAAKVGEMIAKKAQEAGITTVVFDRNGYLYHGRVKEVADAARNGGLKF >gi|225935316|gb|ACGA01000076.1| GENE 71 74490 - 75008 848 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883047|ref|ZP_02064050.1| hypothetical protein BACOVA_01010 [Bacteroides ovatus ATCC 8483] # 1 172 1 172 172 331 100 1e-89 MAGVNNRVKITNDIELKDRLVAINRVTKVTKGGRTFSFSAIVVVGNEEGIIGWGLGKAGE VTAAIAKGVESAKKNLVKVPILKGTVPHEQSARFGGAEVFIKPASHGTGVVAGGAMRAVL ESVGVTDVLAKSKGSSNPHNLVKATIEALSEMRDARMVAQNRGISVEKVFRG >gi|225935316|gb|ACGA01000076.1| GENE 72 75018 - 75194 281 58 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 [Bacteroides fragilis YCH46] # 1 58 1 58 58 112 100 7e-24 MSTIKIKQVKSRIGAPADQKRTLDALGLRKLNRVVEHESTPSILGMVDKVKHLVAIVK >gi|225935316|gb|ACGA01000076.1| GENE 73 75227 - 75673 734 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883045|ref|ZP_02064048.1| hypothetical protein BACOVA_01008 [Bacteroides ovatus ATCC 8483] # 1 148 1 148 148 287 99 2e-76 MNLSNLKPAEGSTKTRKRIGRGTGSGLGGTSTRGHKGAKSRSGYSKKIGFEGGQMPLQRR VPKFGFKNINRVEYKAINLDTIQKLAEAKSLAKIGVNDFIEAGFISSNQLVKVLGNGTLT SKLEVEAHAFSKTATAAIEAAGGTVVKL >gi|225935316|gb|ACGA01000076.1| GENE 74 75678 - 77018 877 446 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 13 442 19 447 447 342 41 5e-93 MRKAIETLKNIWKIEDLRQRILITILFVAIYRFGSYVVLPGINPAMLAKLHEQTSEGLLA LLNMFSGGAFSNASIFALGIMPYISASIVIQLLGIAVPYFQKLQREGESGRRKMNQYTRY LTIAILLVQAPSYLLNLKMQAGPSLNASLDWTLFMVTSTIILAAGSMFILWLGERITDKG IGNGISFIILIGIIARFPDALLQEVVSRVANKSGGLIMFIIEIVFLLLVIGAAILLVQGT RKIPVQYAKRIVGNKQYGGARQYIPLKVNAAGVMPIIFAQAIMFIPITFIGFSNVNDAGG 
FLHAFTDHTSFWYNFVFAVMIILFTYFYTAITINPTQMAEDMKRNNGFIPGIKPGKKTAE YIDDIMSRITLPGSFFLALVAIMPAFAGIFGVQAGFAQFFGGTSLLILVGVVLDTLQQVE SHLLMRHYDGLLKSGRIKGRAGVAAY >gi|225935316|gb|ACGA01000076.1| GENE 75 77034 - 77831 670 265 aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 251 1 246 248 265 50.0 6e-71 MIFLKTEDEIELLRQSNLLVGKTLAEVAKLVKPGVTTRELDKVAEEFIRDHGATPTFKGF PNQYGEPFPASLCTSVNEQVVHGIPGDIVLKDGDIVSVDCGTYMNGFCGDSAYTFCVGEV DEEIRNLLKVTKEALYIGIQNAVQGKRIGDIGYAIQQYCESHSYGVVREFVGHGIGKEMH EDPQVPNYGKRGYGPLMKRGLCIAIEPMITLGDRQVIMERDGWTVRTRDRKCAAHFEHTI AVGVGEADILSSFKFIEEVLGDKAI >gi|225935316|gb|ACGA01000076.1| GENE 76 77833 - 78051 239 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 96 61 5e-19 MAKQSAIEQDGVIVEALSNAMFRVELENGHEITAHISGKMRMHYIKILPGDKVRVEMSPY DLSKGRIVFRYK >gi|225935316|gb|ACGA01000076.1| GENE 77 78060 - 78176 198 38 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 [Bacteroides fragilis YCH46] # 1 38 1 38 38 80 100 3e-14 MKVRASLKKRTPECKIVRRNGRLYVINKKNPKYKQRQG >gi|225935316|gb|ACGA01000076.1| GENE 78 78210 - 78590 637 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 [Bacteroides thetaiotaomicron VPI-5482] # 1 126 1 126 126 249 100 3e-65 MAIRIVGVDLPQNKRGEIALTYVYGIGRSSSAKILDKAGVDKDLKVKDWTDDQAAKIREI IGAEYKVEGDLRSEIQLNIKRLMDIGCYRGVRHRIGLPVRGQSTKNNARTRKGRKKTVAN KKKATK >gi|225935316|gb|ACGA01000076.1| GENE 79 78602 - 78991 665 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 [Bacteroides thetaiotaomicron VPI-5482] # 1 129 1 129 129 260 100 2e-68 MAKKTVAAKKRNVKVDANGQLHVHSSFNNIIVSLANSEGQIISWSSAGKMGFRGSKKNTP 
YAAQMAAQDCAKIAFDLGLRKVKAYVKGPGNGRESAIRTIHGAGIEVTEIIDVTPLPHNG CRPPKRRRV >gi|225935316|gb|ACGA01000076.1| GENE 80 79110 - 79715 1028 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883039|ref|ZP_02064042.1| hypothetical protein BACOVA_01002 [Bacteroides ovatus ATCC 8483] # 1 201 1 201 201 400 100 1e-110 MARYTGPKSRIARKFGEGIFGADKVLSKKNYPPGQHGNSRKRKTSEYGVQLREKQKAKYT YGVLEKQFRNLFEKAATAKGITGEVLLQLLEGRLDNVVFRLGIAPTRAAARQLVSHKHIT VDGEVVNIPSFAVKPGQLIGVRERSKSLEVIANSLAGFNHSKYAWLEWDEASKVGKMLHI PERADIPENIKEHLIVELYSK >gi|225935316|gb|ACGA01000076.1| GENE 81 79727 - 80719 1015 330 aa, chain + ## HITS:1 COG:BMEI0781 KEGG:ns NR:ns ## COG: BMEI0781 COG0202 # Protein_GI_number: 17987064 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Brucella melitensis # 8 325 11 321 337 238 42.0 1e-62 MAILAFQKPDKVLMLEADSRFGKFEFRPLEPGFGITVGNALRRILLSSLEGFAITTIRID GVEHEFSSVPGVKEDVTNIILNLKQVRFKQVVEEFESEKVSITVENSSEFKAGDIGKYLT GFEVLNPELVICHLDSKSTMQIDITINKGRGYVPADENREYCTDVNVIPIDSIYTPIRNV KYAVENFRVEQKTDYEKLVLEISTDGSIHPKEALKEAAKILIYHFMLFSDEKITLESNDT DGNEEFDEEVLHMRQLLKTKLVDMDLSVRALNCLKAADVETLGDLVQFNKTDLLKFRNFG KKSLTELDDLLESLNLSFGTDISKYKLDKE >gi|225935316|gb|ACGA01000076.1| GENE 82 80723 - 81226 834 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237717427|ref|ZP_04547908.1| 50S ribosomal protein L17 [Bacteroides sp. 
D1] # 1 167 1 167 167 325 100 5e-88 MRHNKKFNHLGRTASHRSAMLSNMACSLIKHKRITTTVAKAKALKKFVEPLITKSKEDTT NSRRVVFSNLQDKIAVTELFKEISVKIADRPGGYTRIIKTGNRLGDNAEMCFIELVDYNE NMAKEKVAKKATRTRRSKKAAEAAPAAVEAPATEAPKAEEPKAESAE >gi|225935316|gb|ACGA01000076.1| GENE 83 81401 - 81769 176 122 aa, chain + ## HITS:1 COG:no KEGG:BT_2699 NR:ns ## KEGG: BT_2699 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 122 11 131 131 185 74.0 4e-46 MLLLTVAFHSIAGNVSTAKVVDEQERAITYAMRQHGEISAPDLPGKPVAELNNLQSHQIS VTRIQRVQLGEYFFSLKNALQCCADRDNSLSQHWGRIYGTTTSYYCQPSSEYYVYTLRRI II >gi|225935316|gb|ACGA01000076.1| GENE 84 81882 - 82424 399 180 aa, chain + ## HITS:1 COG:no KEGG:BT_2698 NR:ns ## KEGG: BT_2698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 179 3 178 179 258 82.0 6e-68 MATTQTIDATIFASSHPDIAKRTSVSGVLISSVMLLVGILAFASTFELEDKSSTVSMALM VLGTGLFLMGIFRLFWKSKEVIYLPTKSVAKEHSIFFDLKHMDALRNLVNSGSFSAGSNI KSEASGNIRMDVILSADKKFAAVQLFQFIPYTYQPITSVQYFTDEQASAVIAFLAKSKMQ >gi|225935316|gb|ACGA01000076.1| GENE 85 82576 - 84390 850 604 aa, chain - ## HITS:1 COG:CAC3034 KEGG:ns NR:ns ## COG: CAC3034 COG0249 # Protein_GI_number: 15896285 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 601 1 597 598 253 27.0 7e-67 MEKQEQIITTYQQIIQKTELELQNARKRIYYISLLRLILFVGAVANAIIFWSDGWLCLSA FAILPFILFIWLVKRHNFWFYRKDFLKKKIEINEQELRAMQYDFSDFDDGKEFVNPSHLY TLDLDVFGEHSLFQYINRTATPIGKLHLANWFNAHLENKEAIEHRQEAIHELSTDLEYRQ QIRLLGLLYKGKPADTTEIKEWANSPSYYRKRTLLRILPTAVTIINFLCIGLAILGIISA SIAGGVFVGFVLFSTIFSKGITKLQTTYGEKLQILSTYADQILLTEQKEMQSHILQKLKT ELTSQNQTASQAVRQLSKLMNALDQRSNLLISTILNGLIFWELRQVMRIEKWKEIHASDL PRWIETIGEIDAYCSLATFAYNHPDYIYPTIAAQSFHLQAEALGHPLMDRNKCVRNGIDI ERRPFFIIITGANMAGKSTYLRTVGVNYLLACIGAPVWAKRMEIYPARLVTSLRTSDSLA DNESYFFAELKRLKLIIDKLEAGEELFIILDEILKGTNSMDKQKGSFALIKQFMHMNTNG IIATHDLLLGTLVDSFPQNIRNYCFEADITNNELTFSYQMRNGVAQNMNACFLMKKMGIA VIND 
>gi|225935316|gb|ACGA01000076.1| GENE 86 84709 - 85353 519 214 aa, chain + ## HITS:1 COG:no KEGG:BT_2695 NR:ns ## KEGG: BT_2695 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 8 221 222 352 80.0 6e-96 MKKGCILCLLISVISLPLIAQNIIYSNLKELLAQHGDTTAVLRVEKRSRNQIVLTGGADY RITAGDDESMCRRLKKRCFAVRDEKGYLYLNCRKLRYKKLRFGAWYAPAVQLGKNIYFCA MPLGSVIGGNFVEEDDVKLGGNIGDALAASSLVTKRVCYELNGETGKVDFLGKDKMLQLL KNYPELKQAYLKDDSQEAKHTFKYLLELRNKQKK >gi|225935316|gb|ACGA01000076.1| GENE 87 85731 - 87728 1302 665 aa, chain + ## HITS:1 COG:SPy2082 KEGG:ns NR:ns ## COG: SPy2082 COG2987 # Protein_GI_number: 15675840 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Streptococcus pyogenes M1 GAS # 1 659 13 671 676 1021 70.0 0 MRITLGNTLPPYPDFVEGIRRASDRGYTLTPAQTITALKNALRYIPSEWHEQLAPEFMEE LRTRGRIYGYRFRPAGDLKAKPIDEYQGQCIEGKAFQVMIDNNLCFDIALYPYELVTYGE TGQVCQNWMQYRLIKQYLEELTQEQTLVIESGHPLGLFRSRPDAPRVIITNSMMIGQFDN QRDWHIAAQMGVANYGQMTAGGWMYIGPQGIVHGTFNTLLNAGRLKLGIPQDQDLRGHLF ISSGLGGMSGAQPKAAEIAGAVSIIAEVDRSRIETRYRQGWVGHVTVDIIEAYRMAAEAM RRREPCSIAYHGNIVDLLEYAEREKILIELLSDQTSCHAVYEGGYCPAGLTFEERTRLLH ESPEEFRHLVDISLHRHFEVIKKLVARGTYFFDYGNSFMKAIYDAGVKEISYNGVDEKDG FIWPSYVEDIMGPQLFDYGYGPFRWVCLSGKHEDLIKTDHAAMECIDVNRRGQDLDNYNW IRDAEKNQLVVGTQARILYQDAVGRMKIALRFNEMVRRGQVGPIMLGRDHHDVSGTDSPF RETSNIKDGSNVMADMAVQCFAGNCARGMSLVALHNGGGVGIGKAINGGFGMVCDGSKRV DEILRSAMLWDVMGGVARRSWARNPHAMETSEAFNESHAGDYQITMPYVADEELIKKIVP SIVGK >gi|225935316|gb|ACGA01000076.1| GENE 88 87894 - 88796 974 300 aa, chain + ## HITS:1 COG:SPy2083 KEGG:ns NR:ns ## COG: SPy2083 COG3643 # Protein_GI_number: 15675841 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Streptococcus pyogenes M1 GAS # 5 299 3 298 299 338 53.0 7e-93 MSWNKIIECVPNFSEGRDLEKIDQIVAPFRSKAGVKLLDYSNDEDHNRLVVTLVGEPEAL CEAVVEAVGVAVRLINLNRHTGQHPRMGAVDVIPFIPIKNTSMEEAIELSKKVAAKVAEL YNLPVFLYEKSATAPHRENLASVRKGEFEGMAEKIKLPEWQPDFGPAERHPTAGTVAIGA 
RMPLVAYNINLSTDNLEIATKIAKNIRHINGGLRYVKAMGVELKERNITQVSINMTDYTR TALYRAFELVRIEARRYGVTIVGSEIIGLVPMEALIDTASYYLGLENFTMQQVLEARIME >gi|225935316|gb|ACGA01000076.1| GENE 89 88907 - 90160 975 417 aa, chain + ## HITS:1 COG:BH1984 KEGG:ns NR:ns ## COG: BH1984 COG1228 # Protein_GI_number: 15614547 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 21 411 24 409 426 328 45.0 9e-90 MSENLIIFNARIVTPIGFSARKGEEMSQLQIIENGTVEVTKGIITYVGENRGEDRDGYYQ HYWHYNARGHCLLPGFVDSHTHFVFGGERSEEFSWRLKGESYMSIMERGGGIASTVKATR QMNYLKLRSAAESFLKKMSTMGVTTVEGKSGYGLDRDTELLQLKVMRSLNNAEHKRVDIV STFLGAHALPEEYVGRSDEYIDFLIREMLPLIHSGKWAECCDVFCEQGVFSIEQSRRLLQ AAKEYGFLLKLHADEIVSLGGAELAAELGALSADHLLHASDAGIRAMADAGVVATLLPLT AFALKEPYARGREMIDAGCAVALATDLNPGSCFSGSIPLTIALACIYMQMSIEETITALT LNGAAALQRADRIGSIEVGKQGDFVILNSDNYHILPYYIGMNCVIMTIKGGILYPVA >gi|225935316|gb|ACGA01000076.1| GENE 90 90207 - 90836 687 209 aa, chain + ## HITS:1 COG:FN0739 KEGG:ns NR:ns ## COG: FN0739 COG3404 # Protein_GI_number: 19704074 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 2 207 3 212 212 135 40.0 8e-32 MLADLTVKDFLDKVACSDPVPGGGSIAALNGALASSLSTMVARLTVGKKGYEVSEEVMQH AQTITLRLLDEFMALIDKDSAAYNEVFACFKLPKITDEVKAARSAAIQKATKQATLVPLE VARKALDMMSVIADVARLGNRNAITDACVAMMSARTAVLGALLNVRINLGSLKDRDFVLQ LQTEADAIEQTACRREKELLDAVNQDLRV >gi|225935316|gb|ACGA01000076.1| GENE 91 90833 - 92332 1225 499 aa, chain + ## HITS:1 COG:FN1406 KEGG:ns NR:ns ## COG: FN1406 COG2986 # Protein_GI_number: 19704738 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 3 493 2 495 511 413 45.0 1e-115 MSKNVYHIGSGALTFEIIERIINENLKLELAPEAKLRIQKCRDYLDQKIASSEEPLYGIT TGFGSLCTKNISPDELGTLQENLIKSHACSVGEEIRPVIIKLMMLLKAHALSLGHSGVQV ITVQRILDFFNNDVMPIVYDRGSLGASGDLAPLANLFLPLIGVGDVYYKGKKCEAISVLD 
EFGWEPVKLMSKEGLALLNGTQFMSANGVFAMLKAFRLSKKADLIAALSLEAFDGRIDPF MDCIQQIRPHQGQIETGEAFRKLLAGSELIERHKEHVQDPYSFRCIPQVHGATKDAIRYV ASVLLTEINSVTDNPTIFPDEDRIISGGNFHGQPLAISYDFLAIALAELGNISERRVSQL IMGLRGLPEFLVANPGLNSGFMIPQYAAASMVSQNKMYCYAASSDSIVSSNGQEDHVSMG ANAATKLYKVMDNLEHILAIELMNAAQGIDFRRPQKTSPVLERFLHAYRKEVPFVKDDIV MYKEIHKTVAFLKRTQIEY >gi|225935316|gb|ACGA01000076.1| GENE 92 92682 - 93332 614 216 aa, chain + ## HITS:1 COG:no KEGG:BT_2689 NR:ns ## KEGG: BT_2689 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 215 2 213 215 335 86.0 8e-91 MMTEHGKDASQRVELRERIITAATEAFTSKGIKSITMDDIAAALGISKRTLYEVFSDKES LLKECILKAQADRDKYLQKVFEQSHNVLEVILAVFQKSIEMFHQTNKRFFEDIKKYPKVY EMMKNRQDSDSMKTMSFFKTGVEQGIFRSDVNFAIVNLLVREQFDVLLNTDICNEYPFIE VYESIMFTYIRGISTEKGARVLEDFIQEYRKNRIED >gi|225935316|gb|ACGA01000076.1| GENE 93 93388 - 94734 1257 448 aa, chain + ## HITS:1 COG:PA4144 KEGG:ns NR:ns ## COG: PA4144 COG1538 # Protein_GI_number: 15599339 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 39 447 56 464 471 74 19.0 3e-13 MNKMKRLTGRKILLVAVAFCAFGFAKAQDTQTEKNTLTLTLDKALEIALDENPTMKVAAE EIALKKVASKEAWQSLLPEASIAGSLDHTIKAAEMKLNDMSFKMGQDGSNTANAGLSINL PLFVPGVYRAMSMTKTDIELAVEKSRASKLDLVNQVSKAYYQLMLAQDSYEVLQGSYKLA EDNYNVVNAKYQQGTVSEYDKISAEVQMRSIKPNLIAAANAVTLAKLQLKVLMGITADVE IKINDNLTNYETTMFANQLKEENVNLNNNTTMKQLDLNMKLLEKNVKSLKTNFMPILSMS FSYQYQSLYNPNINFFDYNWSNSSSLMFNISIPLYKASNFTKVKSARIQMRQLDWNRIDT ERQLNMQIVSCRNNMSASTEQVVSNKENVMQAKKAVVIAEKRYDVGKGTVLELNSSQVSL TQAQLTYNQSIYDYLVAKADLDQVLGKE >gi|225935316|gb|ACGA01000076.1| GENE 94 94768 - 95793 1064 341 aa, chain + ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 1 339 1 356 368 110 27.0 4e-24 
MKRSIQLVALLLTVFMGSCTGGKDKAAAEHVDAKPIVKLADVKARPVDQIQDYTATVEAE VKNNIAPSSPVRIDQIFVEVGDRVSKGQKLVQMDAANLKQTKLQLDNQEIEFNRIDELYK VGGASKSEWDASKMQLDVKKTAYKNLLENTSLQSPINGVVTARNYDNGDMYSGGEPVLVV EQITPVKLLINVSETYFTKVKKGTPVDVKLDVYGDEVFTGTINLIYPTIDATTRTFQVEI RLDNKDQRVRPGMFARATLNFGTAENVVVPDLAIVKQAGSGDRYVFVYKDGKVSYNKVEL GRRMGTEYELKSGVPDNSQVVVAGQSRLVNGMEVEVEASSK >gi|225935316|gb|ACGA01000076.1| GENE 95 95853 - 99011 3095 1052 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 7 1020 5 1005 1093 490 31.0 1e-138 MSLYEGAVKKPIMTSLCFLAVVIFGLFSLSKLPIDLYPDIDTNTIMVMTAYPGASASDIE NNVTRPLENTLNAVSNLKHITSRSSENMSLITLEFEFGNDIDVLTNDVRDKLDMVSSQLP DDVENPIIFKFSTDMIPIVLLSVQANESQSALYKILDDRVVNPLARIPGVGTVSISGAPQ REIQVYCDPNKLEAYHLTIETISSIIGAENKNIPGGNFDIGSETYALRVEGEFDDSRQLA DVVVGTHNGANVFLRDVARIVDTVEERAQETYNNGVQGAMIVVQKQSGANSVEISKKVAD ALPRLQKNLPSDVKIGVIVDTSDNILNTIDSLTETVVYALLFVVIVVFLFLGRWRATLII CITIPLSLIASFIYLAVSGNTINIISLSSLSIAIGMVVDDAIVVLENVTTHIERGSDPKQ AAVHGTNEVAISVIASTLTMIAVFFPLTMVSGMSGVLFKQLGWMMCAIMFISTVAALSLT PMLCSQLLRLQKKPSKMFKLFFTPIEKALDGLDTWYAKMLNWAVRHRPIVIVGCIVFFVI SLLCAKGIGTEFFPAQDNARIAVQLELPIGTRKEIAQELSQKLTNQWLTKYKDIMKVCNY TVGQADSDNTWASMQDNGSHIISFNISLVDPGDRNVSLETVCDEMREDLKAYPEFSKAQV ILGGSNTGMSAQASADFEVYGYDMTVTDSVAARLKRELLKVKGVTEVNISRSDYQPEYQV DFDREKLAMHGLNLSTAGNYLRNRVNGAVASKYREDGDEYDIKVRYAPEFRTSLESLENI LIYNAQGQSVRVKDVGKVVERFAPPTIERKDRERIVTVSAVISGAPLGDVVAAGNKVIDK MDLPGEVTIQISGSYEDQQDSFRDLGTLAILIVILVFIVMAAQFESLTYPFIIMFSLPFA FSGVLMALFFTGSTLSVMSLLGGIMLIGIVVKNGIVLIDYITLCRERGLAVINSVVTSGK SRLRPVLMTTATTVLGMIPMAIGGGQGSEMWSPMAIAVIGGLTVSTILTLILIPTLYCVF AGTGIKNRRRKLRHQRELDVYFQENKDSIIKK >gi|225935316|gb|ACGA01000076.1| GENE 96 99024 - 99317 363 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2685 NR:ns ## KEGG: BT_2685 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 199 95.0 2e-50 
MKSVLITFDQAYYERIMALLDRLGCRGFTYLEKVQGRGSKTGDPHFGSHAWPSMCSAILT VVDDSKVDPLLDTLHKMDLETQQLGLRAFVWNIERTI >gi|225935316|gb|ACGA01000076.1| GENE 97 99433 - 100824 1288 463 aa, chain + ## HITS:1 COG:no KEGG:BT_2683 NR:ns ## KEGG: BT_2683 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 463 1 464 464 866 88.0 0 MEIIKNYLKYSLWFVLIVFAALLGLHWLPVLTIDGHTMRRVDLLSDLRYPESETAAADSD SIPLPPVVKPAFVDTCRAGMTCIEDYSDSTLRGMTPFYEALNRVSSTDSDDKQVRIAVFG DSFIEADIFTADLREMLQKQFGGCGVGFVTITSMTSGYRPTVRHTFGGWSSHAVTDSIYF DKKKQGISGHYFVPRNGAYVELRGQNKYASLLDTCQRASIFFYNKDSVLLSARVNKGENK NYSLGPSDGLQQVQVDGRIGSIRWTVDRADSTLFYGLAMDGKKGIILDNFSLRGSSGLSL RGIPVQTLKQFNRQRPYDLIILEYGLNVATERGRNYDNYQKGLITAIEHLKECFPQAGIL LLSVGDRDYKNENGELRTMPGVKNLIRYQQNIAAESGIAFWNMFEAMGGEGSMAKLVHAK PSMANYDYTHINFRGGKHLAGLLYETMIYGKEQYDRRRAYEKE >gi|225935316|gb|ACGA01000076.1| GENE 98 100778 - 101728 625 316 aa, chain + ## HITS:1 COG:no KEGG:BT_2682 NR:ns ## KEGG: BT_2682 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 316 6 288 288 523 87.0 1e-147 MVRNNTTGGVLMRKSNLLSIFILILAGMLSVPCFLTETIAQDRMPVCSPLGKTAKRIKPL REMNWANDTINVQVAFPAAFRETGRNEIIDSIALLAPVFEHLRQVRAGLSEDTVRIVHIG DSHVRGHIYPQTTGARLAETFGAISYIDKGVNGATCLTFTHPDRIAEIAALKPELLILSF GTNESHNRRYNINVHYNQMDELVKLLRDSLPNIPILLTTPPGSYESFRQRRRRRTYAINP RTATAAETIHRYAKDHRLLVWDMYDVVGGKRRACTNWTEANLMRPDHVHYLPEGYILQGN LLYQALIQAYNDYVSH >gi|225935316|gb|ACGA01000076.1| GENE 99 101715 - 103202 1115 495 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 40 418 19 390 520 251 40.0 2e-66 MFPIDIDFSRLKEVLTYDPQAPMIFSSGIFLWLFAAFMVVYVLLQRKYTARILFVTLFSY YFYYKSSGTYFFLLAIVTVADFFLAQLMDRAEGYWKRKGLVVLSLSINLGLLVYFKYTNF LGGVIASLMGGEFTALDIFLPVGISFFTFQSLSYTIDVYRRDIKPLTNLLDYAFYVSFFP 
QLVAGPIVRARDFIPQIRKPLFVSQEMFGRGIFLIVSGLFKKAIISDYISINFVERIFDN PTLYSGVENLMGVYGYALQIYCDFSGYSDMAIGIALLLGFHFNLNFNSPYKSASITEFWR RWHISLSSWLRDYLYISLGGNRKGKFRQYLNLIITMFLGGLWHGASWNFVLWGTFHGVAL ALHKMWMTITGRKKGEQSHGWRRVFGVIITFHFVCFCWIFFRNADFQNSMDMLGQIFTTF RPQLFPQLLEGYWKVFALMLLGFLLHFAPDSWENAACRGVTRLPFVGKAVLMVALIYLVI QMKSSEIQPFIYFQF >gi|225935316|gb|ACGA01000076.1| GENE 100 103289 - 104302 898 337 aa, chain + ## HITS:1 COG:STM2406 KEGG:ns NR:ns ## COG: STM2406 COG0667 # Protein_GI_number: 16765732 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Salmonella typhimurium LT2 # 7 332 3 328 332 380 56.0 1e-105 MENSFSYQPAADRYEQMPYTYCGKSGLQLPLISLGLWHNFGSVDNFNVATDMIKYAFDHG VTHFDLANNYGPAPGSAEVNFGRILKENFQGYRDEMIISSKAGHDMWAGPYGGNSSRKNL MASIDQSLRRTGLEYFDIFYSHRYDGVTPVEETIQTLIDIVKQGKALYIGISKYPPEQAR VAYEMMAKAGVPCLISQYRYSMFDRVVEAESLPLAAEYGSGFIAFSPLAQGLLTDKYLNG IPEGSRAARSSGFLQQSQVTHEKVEAARQLNEIARRREQTLAEMALAWVLKDERMTSVIV GASSVNQLADNLKALEHLEFTTDELKEIEQILCMHGK >gi|225935316|gb|ACGA01000076.1| GENE 101 104545 - 105594 582 349 aa, chain + ## HITS:1 COG:no KEGG:BT_2654 NR:ns ## KEGG: BT_2654 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 41 339 108 403 403 239 42.0 8e-62 MKSIKINLRESDIKFHEGSRYIQIILDTEELIGGYKVAQLEYSFFSLLKERIIYLKKESK KSTASNYQCAFRSFKQFRQNEDVCLSEITSAMMKEYENYLKKKNICLNTISFYMRTLRAA YNYGVDEMELLSENRKPFRKVFTGEEKTIKRAVKENVIQELLSLNLTHEPILELARDMFM FSIYMRGMPFVDIAHLRKDNMINNNMSYQRQKTDQRLLVRIMSCAQDIIDKYSMVMRSSE YLFPLLYNPRREKNTTYETALRIHNRRLRTLSERLGLKTPLSSYVARHTWATLAKWSGIP DAIISEAMGHTSCETTKIYLDSFNEDVVDKANKTVVSVLSGRNKLIVDE >gi|225935316|gb|ACGA01000076.1| GENE 102 105800 - 106678 426 292 aa, chain - ## HITS:1 COG:no KEGG:BF1920 NR:ns ## KEGG: BF1920 # Name: not_defined # Def: putative transcription regulator # Organism: B.fragilis # Pathway: not_defined # 1 283 1 281 288 158 31.0 3e-37 MGIFYEDEHKTCYHYSTVELSVFKVFEVNETNNTVSEEINKSIILFILEGELYMNCNSFQ 
NRLIKAGEMVLLPKNCCFYGRALKKSVIISCAFIQDIQFCNKYSLENLTNDIPEGFTYDF TILPIRERLHEFLILLKNYLYDGLGCTHFHNLKKQELFILFRGYYSKKELTSFFFPIAGK DLDFKDFVFTNYPQVNSIKEFAELANMTVATFNRRFKEAFEVSTHKWIATRKAERVLRDI RVTNKTLECIAVEHDFSSTAYLATFCKQQYGKSPSELRQQELKNCTSFKEGS >gi|225935316|gb|ACGA01000076.1| GENE 103 106856 - 107389 504 177 aa, chain + ## HITS:1 COG:no KEGG:PG2130 NR:ns ## KEGG: PG2130 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 8 176 12 191 193 167 50.0 1e-40 MKQSLLFILLVSGLFPVLFAQNVAVKTNGLYWLTMTPNAGIEFALNKKVTLDLSAAYNPW TFKADKKMRFWFVQPEAKYWFCEKFEGHFVGLHLHGAQYFGGFKEKRYDGYLLGGGLTYG YDWILSPHWNLEASIGIGYAHLWYKESDRIPCLKCYENKHKNYVGPTKATLSLIYIF >gi|225935316|gb|ACGA01000076.1| GENE 104 107403 - 109172 1579 589 aa, chain + ## HITS:1 COG:no KEGG:BF3229 NR:ns ## KEGG: BF3229 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 55 559 123 633 662 117 26.0 9e-25 MKQSLQYISLVTIMMSVLVGCSTSNKQVGKVVTTTPAPNVLTPDTANQVNLDLLFHIPAR YLSERSRLVITPQLVRNDTVQDEYTPLVVDAPIYNKKINRKKVLDDYLDPYEKEAMKADK VSHSFELPYKETVQLPVGTEEGRFVAVISADGCGQCTGIDTIDIAVISNPITLLPDVKES LELSWIEPEFVVRPKILQGKGIANLQFVIDKFDINLSMGNNRVELEKMVRTLAPIMKDTL ATVTSLTITGMASADGPLAYNTTLARNRATAAKQWLAGELENGAGIQRLITVSSRPEGWE PVLAAMIADGNPDSVAVKEILNKYNTGNDDVQERYIRRLPCWNRIKSSYLQKERKVEYVY TYVMKSFTGDDELLDMYRKRPDAFNEEEFLRVATLMKSHNEKVEVYQTILKYFPQSQIAA NNLAVLYLREGRTEKAQEILSGLSQYSPEVFNSLAASYVYANNYEKAIELLQDVDLPEAR YNLGLIKAKKRYLNEAYELLRPFADLNSAIVALSINKNVEAKSILSILKDTRPVAEYARS LTAVRLYEDAVFYEHIANACNDMKLRERAVSEPDFHRFHADESFRTLIK >gi|225935316|gb|ACGA01000076.1| GENE 105 109194 - 110081 720 295 aa, chain + ## HITS:1 COG:no KEGG:BF1562 NR:ns ## KEGG: BF1562 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 294 3 286 287 137 29.0 4e-31 MKKITYILLLATSFSLGSCIRDKVEPCPPLQVNIVVKDKNYFNVDKVQWEERKNENLAFG EYVPTLYYTLRDAITGKVVEEQGVFDVVGNGTTFPVTFCDCIPHGKYVLTVWGGLKDNTP LGDNSLTAVLHAENKEGTDVYMTNDTLIYDAWNYNYTVDIERAKGKLIIEVEDLPDYVNY 
SDKTIGGLYKNLNSDFKYSEATSVYSQNEWEPSSDIVIKTLLAPSQVEKGSVVHLNFYDW ADRTMPALTPRDVEVTIKRNELTVLKYVYVDGKKDFFIYILVNDEWEMLHNMNID >gi|225935316|gb|ACGA01000076.1| GENE 106 110116 - 111393 1123 425 aa, chain + ## HITS:1 COG:no KEGG:BF1563 NR:ns ## KEGG: BF1563 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 425 29 389 389 119 29.0 3e-25 MKKSYLMVAVTASMLAFTACSNDEEATKMNEDSTTQTLIVQVASAGDGLTTRAGRPLYSS AAAQDIDNVKVVVYAAGGKIAYEKNLDDWKTISKAYTTNGHGLQYTLTLKGADKLVAGTY TVLAVGYANSSDYTYSPLLTSLAKDASYTAPITATIGTGKEVEEVFAGEAALTVPADNDQ AFNAPVTLHRQVAGGFGYFKNIPAAVDGNTAKTLRLVASAKNTAVEFKNFNGTFTEAPNG VQYIVNGTTPAAEEAGVSFSDATTKGYVVYSIDLSTWFPNGDTDNNGILNNSDNDWVNPT AGLNVVKGSVFAGKFIIPFGYVDGKNTMELQLLAGDGSIIKHWTVKVPAVAPDVTAPQVK DQSVNVYNVVRNHMYNIGVKVANPTDPTEPEEKPDVEVPEDLSKGQDLILKVNDNWELIH QMELE >gi|225935316|gb|ACGA01000076.1| GENE 107 111537 - 112598 763 353 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260175469|ref|ZP_05761881.1| ## NR: gi|260175469|ref|ZP_05761881.1| hypothetical protein BacD2_26674 [Bacteroides sp. 
D2] # 1 353 1 353 353 679 100.0 0 MRTIHYLLTFIILLSVFSCQDEDILPVQNNEIPPFTNSTTRAIGITSGLTQDANGYWVAS RRIPLVGKGRIVDNISDALVSVLGWKENVANMVDIDIANSTSFAGVANVDAVANQIASVR DMTRTYAGGQPAGFVYKVDNTGLLTLNVLKGFWIETRLKGVLQETKGGNTSATTLELNLL SAANDNGKQALSISTSFSKPFDEIRIGMRGISAEVLKALSLYYAFVGDNPMQPCTTDNTT YFPNGVEIHKNGLFDLGWTSLLNPDKIINADLTDGAGFGTVAGLLSDPHVTVNFKKEIPA GTEVGYCFTDATILDLSLLTGTVLETYDADNNKVDAVEITSLLGLSAIGGGQK >gi|225935316|gb|ACGA01000076.1| GENE 108 112599 - 115811 2309 1070 aa, chain + ## HITS:1 COG:no KEGG:BDI_2883 NR:ns ## KEGG: BDI_2883 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 628 1067 1446 1924 2009 126 26.0 6e-27 MSLITSAPCTQVRIKFTGANINLGATVVNYAFVREPVEVDASSFLSLSNTTITGNTYQFP SSKLGTMSCTLLDFPAGATPTIIENNRITGMTVDGDYTVLVTYTAEDGNTYTQTVVITRK TQGMTGEGCNTLITATGDKVKVSNPSGGGSLISISDIDGVENIVDENPDNYATYTKGITV AGNIGILGIETTDGSLLNTSGRKIRIGFTIQPMSSLLGVDALTYFRIRLKRNGEYIDSGV TDENNAVSAGLIGNDGSRMRLSITTDKEFDAIELWKSGLLNLTLSTFRIYNAFWEPASST CYSGSVGDACLELLTAANHGAEINYDATGSGSLISVGSSFNNLGNLLDDDRESAAEITNT TVAGGVTVAVKFNPIKTKAQVGFMLKGVTGLADVDLLKAIELFVYSAGVKDDNAKTSFGT LGAEVIGSGDYTYVATIPEVSEFDEVRIKFLGVAQALENVLLSGVFIRPDTDGDGIPDCA EENTDDNKNPIASAYALAEHVCAGDAVEIYVNGENLPQGNYQLTFTDVTSTNNIGEKTIA LTNGTLMVDNLPAGDYYINIKSLTNTNAYYNGIHVTVHPLATTWKPGTSSTSWNVWSNWT NGIPWDCTNVIIPSDSRNYPVLKSKEQNYCHYIHFEPGAEVINTHYLHYEKAWVELSLTP GRYYMLSAVLNNTMTGDMFIPASMDGTQNNAVFSDLDATSSPANRFNPRIFQRLWSNNAT GKKLTGDVTVTPDETNWTPPFNAVNQRYALGNGFSLMVDKGTVTKDKCTFRFPKEHTTYY YYNAAGASTGISESISRPGFSGRFIYETLSGTVPSFPLTVSITNQKEGTTFLVGNPFMAH IGIEEFFAANPQITSIKVFDGNTNNSLIKADGELLTNGTDYSHIAPMQSFFATVTTPATS LNVIFNEAMQNQPGSGGLLRSTRSGSSRSTRSASAVRSELRVTATVGQVSASTLLIMNPR ASVSYRAGEDAELLLDNEVPSAVTVFSEADDKALDIQQLNASASHIPLGFSLGKAARVTL TVSHQVGDAWSRWTLVDTQNGRRYPLTEYVVEVDAGIIGTHVGRFYLEKK >gi|225935316|gb|ACGA01000076.1| GENE 109 115839 - 116975 985 378 aa, chain + ## HITS:1 COG:no KEGG:BF1563 NR:ns ## KEGG: BF1563 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 373 24 385 389 69 
25.0 2e-10 MNKLINRIVRYCSCALALLVAGCSQSEDTDTTGMEDYGGVTLSFYASSGLAISTRTELGG SEDVQHVQDVQLYIFDKSGVCVASEDVKWKEYFESLGGLPANTADMTYKVKYKGFAVGDA YTFLAVGRDVESSPTYGFPDAITVGNDLANVKATLKAADWKEMHQSELFSGKTILTYNEP GMKGQIDLYRRVAGVMGWFINLPPTVNGITLTSIRITLYKKQNKSVPLLPLAVKPVFKDY IDDEVALDGGEVLVEIPVTAADITSDMVFSKGSYVLPVPAPPAISDDDYTLRVELVGGGN VLRYKRVKLGDGDVLDPSTGSGTGIIDTQGPYRFPIIANQFYGVGTVANPIDMGGKEPDI VTTLDPKWENVEDTLPLE >gi|225935316|gb|ACGA01000076.1| GENE 110 117352 - 118521 440 389 aa, chain - ## HITS:1 COG:no KEGG:BF1784 NR:ns ## KEGG: BF1784 # Name: not_defined # Def: clostripain-related protein # Organism: B.fragilis # Pathway: not_defined # 37 367 29 360 378 218 39.0 3e-55 MHILLFIRYRLFLFCCCHLLFASCDNEGYLHPIIPQQEKETQRILIAYLAGDNSLSPEIE QKIAAITIGFLATDCSQNRLFIYCDRKNASPQLMEISAHTANPRQLLKTYTAQNSASSNV MKQVLNDILNNYSASSYGLILFSHGSGWLPAAGLENPSRSTRSVCIDGEDEMELADFASA LPLPNHRKWDFILFEGCYMGSVEVAYELKDKTEAIIASPTEIVSPGMTEVYPSALSYLYQ PTPELERFAQSYFDAWNSKTGDYRSATISVVRTEYLEELALLARAAFLRWKPDTATISSL QCFNRNEWHLFFDLKETLLTANPALKTYVNELWKNIVTYSAATPSFLPGLAYGFTIHTHS GLSCYVPQNEFLYINQGYTQTTWCGTVYQ >gi|225935316|gb|ACGA01000076.1| GENE 111 118653 - 120722 2083 689 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175473|ref|ZP_05761885.1| ## NR: gi|260175473|ref|ZP_05761885.1| hypothetical protein BacD2_26694 [Bacteroides sp. 
D2] # 1 689 1 689 689 1241 100.0 0 MKKSLFWSLIAATAMMTACGDDAFVDNQQVVTGEETPVAFKINMGNAISTYAGTDFSKGG WSNWKDSDTEMIEEVNARATLQVFANGGTSPYTQVRTVLTEAPSGNDIELPQFRLPAGNS YTAVVWVDFVPKSVTSSSTESRNDYYWKTENLASVSELQFTDGLMPIVETPEARDAYTGL IKFTVNADGTYNITDGAEDQSISTAIPITATRKFAKISLILTDYDNKTEWADALNNLGSD NLLEYLGMEVTNLSTGYNALTQSPVSATPTTQSYFHRPFDVASATSTSDHVNGVKWYQDG DKFYPIFDLNYVIPEKAGKDATYSLSFKGYNVDGTPSFTADAAALTEINAEADGFKLVAI RNASNVPVRTNAHTIIKMNLYKAYSCTVTIVDEFTGGTENVVIDDDDKKQSTDDHFIPEG DENGAHIVVVRGNDGKAISIDITGINKDNFDLVIDDLVAMKNLSSTADNTKPVINMSGTD IPNPADFSDLEGNVTNDIKLTLTQDGQDIVLEGLTNNITLDLPSIQKFIVTSTADVTIEK VVSNSWPTSITAKNVTFTGTTYNSVSINVEETATFNGGTYNNEVAVIGSENDAEAVINGG VFNHNFKSSVDTTIKGGSLPYVDADTNGVADYMLIMNAGKTLTVEKADSAFGPLFATSGA DCKLVYDTQDIKDFYSGLEGSNWSNISVK >gi|225935316|gb|ACGA01000076.1| GENE 112 120784 - 121875 780 363 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175474|ref|ZP_05761886.1| ## NR: gi|260175474|ref|ZP_05761886.1| hypothetical protein BacD2_26699 [Bacteroides sp. D2] # 7 363 1 357 357 689 100.0 0 MKKHRLMNHQKYFRTLCFGLSFLTAACTNESPLSTEPAEWPKTSISIQATVDTKAANEPE TAADNGTITRAGGDKQVPAGYRLRCKLEVYSKTTDLRVAQAQLFIDDNSGLTGAITFPDV HLPPNDTYKAICWADYIAAGTETDLIYNTSKGLSDIRLIETIGTVHAETAANEADIEDAY AGQSETFTIEASGAVKEGDEEKLTSIKLRRPFVKLSLPWVKLTTEDGSVWTDALRNIRIV YETGNVLYTQFNAWTGEASGARTVSSALYSKEVSSFNKTFALHDYLLVPVPMPDAIPLSF HIEAETTVGTPVTFKRYGSTATDYSSISIELPNPNYMLRIKGATAAIEDDETIYPLTFKE YIP >gi|225935316|gb|ACGA01000076.1| GENE 113 121903 - 123015 926 370 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260175475|ref|ZP_05761887.1| ## NR: gi|260175475|ref|ZP_05761887.1| hypothetical protein BacD2_26704 [Bacteroides sp. 
D2] # 1 370 1 370 370 689 100.0 0 MRTEKDTETMDLYNKYNASEWLTRKEAIRRKRLVTIISASAAALLILILLLMSSCEKKDL CYLDAHPHICHTQLTLKFNTAWDNEPIYSNYTRSAGSLTVRYVLEFWTMSDDGKLETQLE RKIVNGGSLAEGNNRYQVSVDLPADRIAVLAWAEPLASGQTSNPYFDVSSLTSVKMKEPF GAGATKDAFSASTTWDYSGYGGPHSHSNDLDFTEELELLRPFGTYTVISNDMEDYFEKAG ADAPEPHTAKVNYQMWIPPMFDTFRQVASGSQSGATFNHTVSVHTEGKEYKLAEDLIFIG PGDGTDNYYNIIVETHAADGTSIHKSGNAEIRMQRNKHTLVYGAFLTVRKPSSPGINDEF EEIIEIVIPD >gi|225935316|gb|ACGA01000076.1| GENE 114 123035 - 124240 846 401 aa, chain - ## HITS:1 COG:no KEGG:BT_2319 NR:ns ## KEGG: BT_2319 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 41 399 31 416 430 123 30.0 1e-26 MEYLRRKLCLLAGSILLTCAAWGQTLPTDSITKDPEAWYIHFKTGKSNLDLDYNGNRGTL QRCIDRVQEIIDKNEYVIDHIRIIGYASPEGPLALNLRLSAARADVLKDYLVAKTGLSSN LFEVVAGGENWNELRVMVEKSNIKEKDRVLDILDHTPEGVDPEIALKKLPGGTYRYMLTN FYPKLRSASSVQLLRIVPVVAPPVVKEEIKVVEVDTVKRTPQKPVVQEPEPCRCDPPLMA IKTNLASWAALITPNIELEAYLGSRYSIAVEGAYRWLKDSKAKGNSYNVASVSPEIRMYV RDDRSFQGSYWGLYGLYGEYDVKFGSTGRQGNIRGLGISYGYIFKFNRFDCLYFDLGISA GYNRLRYDKYFWYDPCNAFKEHKGKNYWGPSKIKASLVWRF >gi|225935316|gb|ACGA01000076.1| GENE 115 124621 - 125610 404 329 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 313 1 298 305 263 47.0 6e-69 MNDLKKFMETVILRLENEGRYGTAHVYQSTLNAILRYWQIRQKGTIKLDKVFTPVLLQDF ERYLLENMLSMNTVSTYMRMLRAIYHRATKEKRIKWRQGLFDTVYTGVRADTKRSLTAKH MGNILVAQQSLPSLKEAQSWFTLLFLLRGMPFIDLARLRKCDLYGDTLIYKRQKTGRIIT VNVTPEAMRLIKQLANRNPQSPYLLSILPSNSKLQNGNCFGSKEEYRLYQAILRGFNRRL NELSRRMNLGEKLSSYTARHTWATTAYQKKCATGVICNALGHSSIKVTETYLKAFEQKEI DRTNRMIISYVKQEAEKANKNRNTLFNTL Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:11 2011 Seq name: gi|225935315|gb|ACGA01000077.1| Bacteroides sp. 
D2 cont1.77, whole genome shotgun sequence Length of sequence - 5169 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 156 - 2908 99.0 # FJ410383 [D:701..3455] # 23S ribosomal RNA # Bacteroides ovatus # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. - TRNA 3069 - 3145 93.1 # Ala TGC 0 0 - TRNA 3204 - 3280 86.4 # Ile GAT 0 0 - SSU_RRNA 3496 - 4981 98.0 # EF403471 [D:1..1488] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. Predicted protein(s) Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:12 2011 Seq name: gi|225935314|gb|ACGA01000078.1| Bacteroides sp. D2 cont1.78, whole genome shotgun sequence Length of sequence - 2342 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 775 290 ## COG3436 Transposase and inactivated derivatives 2 2 Op 1 . - CDS 904 - 1623 381 ## BVU_0483 transposase - Prom 1647 - 1706 2.4 3 2 Op 2 . - CDS 1710 - 2060 139 ## COG3436 Transposase and inactivated derivatives 4 2 Op 3 . 
- CDS 2048 - 2332 123 ## BVU_0481 hypothetical protein Predicted protein(s) >gi|225935314|gb|ACGA01000078.1| GENE 1 59 - 775 290 238 aa, chain - ## HITS:1 COG:ECs3866 KEGG:ns NR:ns ## COG: ECs3866 COG3436 # Protein_GI_number: 15833120 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 17 232 252 457 463 112 34.0 5e-25 MLYDDGSRSERVILNELGSCKGIIQSDGYSPYRKLESDAYPNITRIPCLQHIKRKFIDCG EKDPDAKRIVELINALYQNEHKHKVGVEGWTVEQNLMHRKKYAPDILGEIKDVLDEIEER GDLLPKSELQEAITYLRNEWNAVVDIFNYGDTYLDNNMVERMNRYISLSRKNSLFFGSHK GAERGAILYTIALTCRMHKVNLFEYLTDVINRTAEWQPNTPIEKYRELLPDRWEKAND >gi|225935314|gb|ACGA01000078.1| GENE 2 904 - 1623 381 239 aa, chain - ## HITS:1 COG:no KEGG:BVU_0483 NR:ns ## KEGG: BVU_0483 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 226 1 226 521 389 91.0 1e-107 MKKDEIIELLKEQIKGLRDDNNRLLDQIDALIKEVSSLKEALLQKGESLSKQQRLTKGLA KLVSNTSEQQQAPQSAISEEERQKIEAEKADKRKARKNNGAKRDMHYEMEEEEHVVYPDD PDFDINKARLFTTVPRICVRYECVPMRFIKHVYKIHTYTQEGRLFEGKTPASAFLNSSYD GSFIAGLMELRYIQSLPVERIINYFESHGFTLKKPTAHKLIEKASTSSKIFISASGRQL >gi|225935314|gb|ACGA01000078.1| GENE 3 1710 - 2060 139 116 aa, chain - ## HITS:1 COG:ECs3847 KEGG:ns NR:ns ## COG: ECs3847 COG3436 # Protein_GI_number: 15833101 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 113 1 110 115 64 33.0 4e-11 MYSLTSANRYYLYQGFVRMNLGIDGLFKIIRSEMKDLSPVSGDIFLFFGKNRQSVKILRW DGDGFLLYYKRLEGGSFELPTFNPNTGNYEISYQVLSFILNGVSLKSVRLRKRFRI >gi|225935314|gb|ACGA01000078.1| GENE 4 2048 - 2332 123 94 aa, chain - ## HITS:1 COG:no KEGG:BVU_0481 NR:ns ## KEGG: BVU_0481 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 94 41 134 134 182 95.0 3e-45 MSSWMSRRGYSVKQAKADVVRDYYGGVEPSQPTTSSPSFTQIAPAMLSEEEFSLAGITIT FNSGTTISVKRATPGGVIKMLRDYERKEGDPCIL Prediction of potential genes in microbial genomes 
Time: Fri May 13 11:46:18 2011 Seq name: gi|225935313|gb|ACGA01000079.1| Bacteroides sp. D2 cont1.79, whole genome shotgun sequence Length of sequence - 1820 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1819 1436 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|225935313|gb|ACGA01000079.1| GENE 1 1 - 1819 1436 606 aa, chain + ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 8 606 96 700 764 598 52.0 1e-170 NALQHLAVEESRLKIPLLVGADVIHGYETIFPIPLALSCSWDTLAVERMARISAIEASAD GINWTFSPMVDICRDARWGRIAEGSGEDPYLGSLMAKAYVRGYQGNNMQGNDEILACVKH FALYGASESGRDYNVVDMSRLRMYNEYLAPYKAAVDAGVGSVMSSFNIVDGIPATANKWL LTDLLRDEWGFGGLLVTDYNSIAEMSSHGVAPLKEASIRALQAGTDMDMVSCGFLNTLEE SLKEGKVTEEQINAACRRVLEAKYKLGLFSDPYKYCDTLRTREKLYTAEHRSSARTIATE TFVLLKNDHHLLPLDIKGKIALIGPMADARNNMCGMWSMTCTPSRHGTLLEGIRSAAGDK AEILYAKGSNIYYDAETEKAATGIRPLECGDNRQLLDEALRIAARADVIVAALGECAEMS GESVSRTNLEIPDAQQDLLKALVKTGKPVVLLLFTGRPLVLNWEATNVHSILNVWFGGSE TGDAVADVLFGKVAPSGKLTTTFPRSVGQLPLFYNHLNTGRPDPDNRIFNRYASNYLDES NEPLYPFGYGLSYTDFVYSDLQISSETLPKNGELTVSVTVTNKGNYDGYETVQLYLRDIY AEIARP Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:19 2011 Seq name: gi|225935312|gb|ACGA01000080.1| Bacteroides sp. D2 cont1.80, whole genome shotgun sequence Length of sequence - 1684 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 48 - 1683 1141 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|225935312|gb|ACGA01000080.1| GENE 1 48 - 1683 1141 545 aa, chain + ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 19 542 59 581 754 433 45.0 1e-121 MLWATYRADPWTKKTLANGLNPELSAKAGNALQKYVMENTRLGIPMFLAEEAPHGHMAIG ATVFPTGIGMAATWSPELVKEVGQVIAKEIRSQGGHISYGPVLDLTRDPRWSRVEETFGE DPVLSGILGASMVDGLGGGNLSQKYATIATLKHFLAYAVPEGGQNGNYASVGIRDLHQNF LPPFRKAIDSGALSVMTSYNSIDGIPCTSNHYLLTQLLRNEWKFRGFVVSDLYSIEGIHE SHFVALTKENAAIQSVTAGVDVDLGGDAYTNLCHAVQSGQMDKAVIDTAVCRVLRMKFEM GLFEHPYVDPKIAAKTVRRKEHIELARKIAQSSITLLKNENSILPLSKTINKVAVIGPNA DNRYNMLGDYTAPQEDSNVKTVLDGIITKLSPSRVEYVRGCAIRDTTVNEIEQAIEAARR SEVVIVVVGGSSARDFKTSYKETGAAVAEEGSVSDMECGEGFDRASLSLLGRQQELLESL QKTGKPLIVVYIEGRPLEKNWASEYADALLTAYYPGQEGGNAIADVLFGDYNPSGRLPIS VPRSV Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:20 2011 Seq name: gi|225935311|gb|ACGA01000081.1| Bacteroides sp. D2 cont1.81, whole genome shotgun sequence Length of sequence - 1593 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 212 236 ## BT_0825 altronate oxidoreductase (EC:1.1.1.58) + Term 226 - 283 19.8 - Term 213 - 271 20.0 2 2 Op 1 . - CDS 287 - 820 573 ## BT_0826 hypothetical protein 3 2 Op 2 . 
- CDS 798 - 1592 678 ## BT_0827 hypothetical protein Predicted protein(s) >gi|225935311|gb|ACGA01000081.1| GENE 1 3 - 212 236 69 aa, chain + ## HITS:1 COG:no KEGG:BT_0825 NR:ns ## KEGG: BT_0825 # Name: not_defined # Def: altronate oxidoreductase (EC:1.1.1.58) # Organism: B.thetaiotaomicron # Pathway: Pentose and glucuronate interconversions [PATH:bth00040]; Metabolic pathways [PATH:bth01100] # 1 69 411 479 479 134 94.0 1e-30 VPNDAPEIMNLLKELWATGCTKKVAEGVLAADFIWGEDLNKIPGLTEAVKADLDSIQEKG MLETVKGIL >gi|225935311|gb|ACGA01000081.1| GENE 2 287 - 820 573 177 aa, chain - ## HITS:1 COG:no KEGG:BT_0826 NR:ns ## KEGG: BT_0826 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 177 1 177 177 317 95.0 1e-85 MKRVLCPKCENYLYFDETKYSEGQSLVFECEHCGKQFSIRLGKSKVKALRKEENLEEEAE SHKEEFGYITVIENVFGFKQLLPLQEGDNVIGRRCVGTNINTPIESGDMSMDRRHCIINI KRNKQGKLVYTLRDAPSLTGTFLMNEILGDKDRVHIEDGAIVTIGATTFILHTAEQE >gi|225935311|gb|ACGA01000081.1| GENE 3 798 - 1592 678 264 aa, chain - ## HITS:1 COG:no KEGG:BT_0827 NR:ns ## KEGG: BT_0827 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 250 112 365 365 310 68.0 3e-83 GNYEFIPYDYKITTPSLYGLDSFEIHELSALQQKEKVLIPTYPEKEKKTFEISINRAYLR NAAAMIAAIVLFFAFSTPVENTDVQKNNYAQLLPSELFEQIEKQSVAITPVYVKNDAAQQ AKKFSASSASTKTSSAKKHTTDKAKTSKPIAVREVKVVKQETAAPAPAVKSQESANHPFH IIVAGGISLKDAEAIATQLKSKGFADAKALNTDGKVRVSISSFNNRDEATKQLLELRKNE TYKNAWLLAKIKKPFKKHETSSLS Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:28 2011 Seq name: gi|225935310|gb|ACGA01000082.1| Bacteroides sp. D2 cont1.82, whole genome shotgun sequence Length of sequence - 1437 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 31 - 90 1.7 1 1 Tu 1 . 
+ CDS 112 - 1341 526 ## COG3385 FOG: Transposase and inactivated derivatives Predicted protein(s) >gi|225935310|gb|ACGA01000082.1| GENE 1 112 - 1341 526 409 aa, chain + ## HITS:1 COG:SMb20766 KEGG:ns NR:ns ## COG: SMb20766 COG3385 # Protein_GI_number: 16265206 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 12 355 7 334 387 150 29.0 5e-36 MGKSTHFSGQPLYCQVIKLLDKSKVLNHSRSNGGERYVKRFDGWTHLVVMLYAVIMRFDS LREITASLQAEARKLCHLGISVMTSRSTLADANKRRPESVFESIYRDLYATYRNHLSSDS RSHKEPKWMKRLQIIDSTTITLFSNLLFKGVGRHPKTGKKKGGIKVHTVIHANEGVPSDI RFTSAATNDSFMLKPSTLNKGDIMAMDRAYIDYQKFEQMTQRGVIYVTKMKKGLKYSVLS DTMYQTPKGLMEVRIQQVTFTKQLKNGETINHQACVITYADEEKHKLISLLTNDRESDPS EIIAIYHKRWEIELLFKQMKQNFPLKYFYGESANAIKIQIWVTLIANLLLMVMQKGLTRS WSFSGLATMVRITLMYYVDFYSLFNHPERDWESILEAASESPHQTSLFD Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:29 2011 Seq name: gi|225935309|gb|ACGA01000083.1| Bacteroides sp. D2 cont1.83, whole genome shotgun sequence Length of sequence - 1352 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 52 - 111 4.3 1 1 Tu 1 . 
+ CDS 146 - 1339 464 ## COG5433 Transposase Predicted protein(s) >gi|225935309|gb|ACGA01000083.1| GENE 1 146 - 1339 464 397 aa, chain + ## HITS:1 COG:ydcC KEGG:ns NR:ns ## COG: ydcC COG5433 # Protein_GI_number: 16129419 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 22 389 12 368 378 211 37.0 3e-54 MKQETKRKIEISNLHEFADSLILIDNRIDRCKKHQASTIVLIAISAVICGADTWNSIEDF GKSKESFFAAKLSNFNGIPSHDTFNRFFSALDPLKFEESYRQWVQSILKCYSGHIAIDGK TIRGAYESEQDKRHRKQGVLPDSNTGKYKLHVISAFATELGISLGQLCTQEKENEIVVIP ELLDMLCIKDCIITIDALGCQRTIAEKVIKGEGDYIFIVKDNQPKLKEIVLSVTESIVSK GTTVRFDKYETHEEGHGRNESRICYCCNDPGFLGADIRKKWKNIQSFGYIENTRNTNKGT TVEKRCFISSLEPDAQKILKNSREHWEIENNLHWQLDVNFHEDNTRRRNISALNFSVLAK IALATLRNNKREIPINRKRLIAGWDNEFLWELILHDL Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:30 2011 Seq name: gi|225935308|gb|ACGA01000084.1| Bacteroides sp. D2 cont1.84, whole genome shotgun sequence Length of sequence - 878 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 878 749 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|225935308|gb|ACGA01000084.1| GENE 1 2 - 878 749 292 aa, chain + ## HITS:1 COG:TM0638 KEGG:ns NR:ns ## COG: TM0638 COG1596 # Protein_GI_number: 15643403 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Thermotoga maritima # 11 260 134 369 992 71 28.0 1e-12 GVSDIGSLRNVQLVRNGKNIATIDVYQFIMKGNIQDDIRLQEGDVVIVPAYDVLVKIDGK VKRPMRFEMKKDESLSTLISYAGGFEADAYTRSLRVVRQNGQEYEVNTVKDLDYSVYKMR NGDVVTAEAILNRFINKLEIRGAVYRPGIYQLNGKLNTVRELVNEAQGLTGDAFLNRAVL YRQREDLTTEVVPVDIKAIMDGTSQNIILMKNDILYIPSIHDLEDRGNVVIHGEVAKPDS YPYADNMTLEDLIIQAGGLREAASVVRVDVSRRIKNPRSTVDNDTIGQIYTF Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:30 2011 Seq name: gi|225935307|gb|ACGA01000085.1| Bacteroides sp. D2 cont1.85, whole genome shotgun sequence Length of sequence - 783 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 782 648 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|225935307|gb|ACGA01000085.1| GENE 1 17 - 782 648 255 aa, chain + ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 142 252 90 192 725 70 39.0 3e-12 MTTRRKFNVFLFILLLGVFSPLMAQSMSDSQVLEYVKDGIRQGKEQKQLASELARRGVTK EQATRVKQLYEQQNNVNASNATGTDVNESRLREEMKENTSDMLEDHPSTQDLARGNQVFG RNIFNTRNLTFEPSVNIATPLNYRLGPGDEVIIDIWGASQNTIRQQISPDGTINIQKIGP VNLNGLTIAEANDYLKKTLNKIYNGLNNANDPTSDIRLTLGSIRTIQINVMGEVVQPGTY SLSSFATVFHALYRA Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:31 2011 Seq name: gi|225935306|gb|ACGA01000086.1| Bacteroides sp. 
D2 cont1.86, whole genome shotgun sequence Length of sequence - 783 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 782 662 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|225935306|gb|ACGA01000086.1| GENE 1 17 - 782 662 255 aa, chain + ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 142 252 90 192 725 70 39.0 3e-12 MTTRRKFNVFLFILLLGVFSPLMAQSMSDSQVLEYVKDGIRQGKEQKQLASELARRGVTK EQATRVKQLYEQQNNVNASNATGTDVNESRLREEMKENTSDMLEDHPSTQDLARGNQVFG RNIFNTRNLTFEPSVNIATPLNYRLGPGDEVIIDIWGASQNTIRQQISPDGTINIQKIGP VNLNGLTIAEANDYLKKTLNKIYNGLNNANDPTSDIRLTLGSIRTIQINVMGEVVQPGTY SLSSFATVFHALYRA Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:31 2011 Seq name: gi|225935305|gb|ACGA01000087.1| Bacteroides sp. D2 cont1.87, whole genome shotgun sequence Length of sequence - 645 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri May 13 11:46:32 2011 Seq name: gi|225935304|gb|ACGA01000088.1| Bacteroides sp. D2 cont1.88, whole genome shotgun sequence Length of sequence - 598 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 41 - 596 367 ## BT_0596 putative transcriptional regulator Predicted protein(s) >gi|225935304|gb|ACGA01000088.1| GENE 1 41 - 596 367 185 aa, chain + ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 1 185 192 323 84.0 2e-87 MILTKPKSVNAGPSSGTGEGVAHSKRWYVALVRMHHEKKVAERLSKMGIDSFVPVQQQIH QWSDRRKMVDTVLLPMMVFVHVNPKERMEVLSFSTVSRYMVMRGESTPAVIPDEQMARFR FMLDYSEEAVCMNDTPLARGEKVRVIKGPLSGLVGELVTVGGKSKIAVRLNMLGCACVDM PIGYV